Organize catalog content¶
This topic describes how to create namespaces and tables for an internal catalog in Snowflake Open Catalog.
Important
To ensure that the access privileges defined for a catalog are enforced correctly, you must:
Ensure a directory only contains the data files that belong to a single table.
Create a directory hierarchy that matches the namespace hierarchy for the catalog.
For example, if a catalog includes:
Top-level namespace namespace1
Nested namespace namespace1a
A customers table, which is grouped under nested namespace namespace1a
An orders table, which is grouped under nested namespace namespace1a
The directory hierarchy for the catalog must be:
/namespace1/namespace1a/customers/<files for the customers table *only*>
/namespace1/namespace1a/orders/<files for the orders table *only*>
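The layout rule above can be expressed as a small path calculation. The following is a minimal sketch, not part of Open Catalog: `table_directory` is a hypothetical helper name, and it assumes that each namespace level maps to exactly one directory level and that paths are written relative to the catalog's base location.

```python
from pathlib import PurePosixPath


def table_directory(namespace_levels, table_name):
    """Return the directory expected to hold only this table's files.

    Hypothetical helper: one directory per namespace level, followed by
    one directory for the table itself.
    """
    return str(PurePosixPath("/", *namespace_levels, table_name))


print(table_directory(["namespace1", "namespace1a"], "customers"))
# /namespace1/namespace1a/customers
print(table_directory(["namespace1", "namespace1a"], "orders"))
# /namespace1/namespace1a/orders
```

Because each table resolves to its own directory, two tables in the same namespace never share data files under this scheme.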
A catalog admin can use Open Catalog or a third-party query engine to organize catalog content as follows:
Object | Use |
---|---|
Namespace | Open Catalog or third-party query engine |
Table | Third-party query engine |
Note
The tables and namespaces for an external catalog are read-only in Open Catalog. If you need to organize catalog content for an external catalog, you must use Snowflake. For more information, see Snowflake-managed Apache Iceberg™ tables.
The example code in this topic shows how to use Apache Spark to organize catalog content. The example code is in PySpark.
Create a namespace¶
This section provides instructions for creating top-level or nested namespaces.
Important
When you create a namespace, don’t use periods or spaces in the namespace name.
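A name can be checked against this rule before it is sent to the catalog. The following is a minimal sketch: `is_valid_namespace_name` is a hypothetical helper, and rejecting empty names is an added assumption rather than a documented rule.

```python
def is_valid_namespace_name(name):
    """Return True if the name is non-empty and has no periods or spaces.

    The non-empty check is an assumption; the documented rule only
    covers periods and spaces.
    """
    return bool(name) and "." not in name and " " not in name


print(is_valid_namespace_name("namespace1"))              # True
print(is_valid_namespace_name("name space"))              # False
print(is_valid_namespace_name("namespace1.namespace1a"))  # False
```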
Create a top-level namespace¶
To create a top-level namespace, you can use Apache Spark or Open Catalog.
Example: Create a top-level namespace by using Apache Spark¶
The following example code creates a top-level namespace named namespace1 in the catalog1 catalog.
spark.sql("use catalog1").show()
spark.sql("CREATE NAMESPACE namespace1")
Create a top-level namespace by using Open Catalog¶
To create a top-level namespace by using Open Catalog, follow these steps:
Sign in to Open Catalog.
From the menu on the left, select Catalogs.
From the list of catalogs, select the catalog where you want to create a top-level namespace.
Select + Namespace.
In the Name field, enter a name for the namespace and select Submit.
Create a nested namespace¶
To create a nested namespace, you can use Apache Spark or Open Catalog.
Example: Create a nested namespace by using Apache Spark¶
The following example code creates a nested namespace named namespace1a in the catalog1 catalog. This nested namespace is created under the existing top-level namespace namespace1.
spark.sql("use catalog1").show()
spark.sql("CREATE NAMESPACE namespace1.namespace1a")
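When the parent namespace might not exist yet, each level can be created in order, parents first. The following is a minimal sketch under stated assumptions: `namespace_statements` and `create_namespace_path` are hypothetical helper names, `spark` is an already configured SparkSession, and the catalog is assumed to honor Spark SQL's `IF NOT EXISTS` clause.

```python
def namespace_statements(levels):
    """Build one CREATE NAMESPACE statement per level, parents first."""
    return [
        "CREATE NAMESPACE IF NOT EXISTS " + ".".join(levels[:depth])
        for depth in range(1, len(levels) + 1)
    ]


def create_namespace_path(spark, catalog, levels):
    """Run the statements against an existing SparkSession (assumed)."""
    spark.sql(f"use {catalog}")
    for statement in namespace_statements(levels):
        spark.sql(statement)


for statement in namespace_statements(["namespace1", "namespace1a"]):
    print(statement)
# CREATE NAMESPACE IF NOT EXISTS namespace1
# CREATE NAMESPACE IF NOT EXISTS namespace1.namespace1a
```

With a configured session, `create_namespace_path(spark, "catalog1", ["namespace1", "namespace1a"])` would issue the same statements. The `IF NOT EXISTS` clause is intended to let the call be repeated without failing on levels that already exist.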
Create a nested namespace by using Open Catalog¶
To create a nested namespace by using Open Catalog, follow these steps:
Sign in to Open Catalog.
From the menu on the left, select Catalogs.
From the list of catalogs, select the catalog where you want to create a nested namespace.
Select the Namespaces tab and then select the parent namespace where you want to create the nested namespace.
If needed, repeat the previous step until you reach the parent namespace where you want to create the nested namespace.
Select + Namespace.
In the Name field, enter a name for the nested namespace and select Submit.
Create a table¶
This section provides instructions for creating tables by using Apache Spark.
Example: Create a table¶
The following example code creates a customers table under nested namespace namespace1a in the catalog1 catalog. It is created with id and custnum columns, and the data type for both columns is integer.
spark.sql("use catalog1").show()
spark.sql("use namespace1.namespace1a")
spark.sql("CREATE OR REPLACE TABLE customers (id int, custnum int) using iceberg")
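As an alternative to setting the current catalog and namespace first, Spark SQL generally accepts a multi-part identifier of the form catalog.namespace.table. The following is a minimal sketch that builds such a statement: `qualified_table_name` and `create_table_statement` are hypothetical helper names, and the sketch assumes the catalog is registered in the Spark session under the name catalog1.

```python
def qualified_table_name(catalog, namespace_levels, table):
    """Join catalog, namespace levels, and table into one identifier."""
    return ".".join([catalog, *namespace_levels, table])


def create_table_statement(catalog, namespace_levels, table, columns):
    """Build a CREATE OR REPLACE TABLE statement for an Iceberg table.

    `columns` is a list of (name, data_type) pairs.
    """
    column_list = ", ".join(f"{name} {data_type}" for name, data_type in columns)
    target = qualified_table_name(catalog, namespace_levels, table)
    return f"CREATE OR REPLACE TABLE {target} ({column_list}) using iceberg"


statement = create_table_statement(
    "catalog1",
    ["namespace1", "namespace1a"],
    "customers",
    [("id", "int"), ("custnum", "int")],
)
print(statement)
# CREATE OR REPLACE TABLE catalog1.namespace1.namespace1a.customers (id int, custnum int) using iceberg
```

The resulting string can be passed to `spark.sql(...)` in the same way as the statements in the example above.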
Example: Insert rows into a table¶
The following example code inserts a row into the customers table.
spark.sql("use catalog1").show()
spark.sql("use namespace1.namespace1a")
spark.sql("INSERT INTO customers VALUES (123,456)")
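Several rows can be inserted with a single statement by listing multiple value tuples. The following is a minimal sketch: `insert_statement` is a hypothetical helper, and it assumes a table with exactly two integer columns, such as the customers table in the example above.

```python
def insert_statement(table, rows):
    """Build a multi-row INSERT for a table with two integer columns."""
    if not rows:
        raise ValueError("at least one row is required")
    for row in rows:
        # Accept only plain integers so that values can be formatted safely.
        if len(row) != 2 or not all(type(value) is int for value in row):
            raise ValueError(f"expected two integers per row, got {row!r}")
    values = ", ".join(f"({first}, {second})" for first, second in rows)
    return f"INSERT INTO {table} VALUES {values}"


print(insert_statement("customers", [(123, 456), (124, 457)]))
# INSERT INTO customers VALUES (123, 456), (124, 457)
```

After the current catalog and namespace are set, the resulting string can be passed to `spark.sql(...)`, and `spark.sql("SELECT * FROM customers").show()` can be used to confirm that the rows were written.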