Organize catalog content

This topic describes how to create namespaces and tables for an internal catalog in Polaris Catalog™.

Important

To ensure that the access privileges defined for a catalog are enforced correctly, you must:

  • Ensure a directory only contains the data files that belong to a single table.

  • Create a directory hierarchy that matches the namespace hierarchy for the catalog.

For example, suppose a catalog includes the following objects:

  • Top-level namespace namespace1

  • Nested namespace namespace1a

  • A customers table, which is grouped under nested namespace namespace1a

  • An orders table, which is grouped under nested namespace namespace1a

The directory hierarchy for the catalog must be:

  • /namespace1/namespace1a/customers/<files for the customers table *only*>

  • /namespace1/namespace1a/orders/<files for the orders table *only*>
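The layout rule above can be sketched as a small helper that derives the expected directory for a table from its namespace hierarchy. This is only an illustrative sketch: the function names table_directory and misplaced_tables are hypothetical, and the paths are treated as plain strings relative to the catalog's storage root rather than as real storage locations.

```python
from pathlib import PurePosixPath


def table_directory(namespace_levels, table_name):
    """Return the directory that should hold only this table's files.

    namespace_levels lists the namespaces from the top level down,
    for example ("namespace1", "namespace1a").
    """
    return str(PurePosixPath("/", *namespace_levels, table_name))


def misplaced_tables(actual_dirs):
    """Return the names of tables whose directory breaks the layout rule.

    actual_dirs maps (namespace_levels, table_name) to the directory that
    the table currently uses. Both the mapping and its shape are
    assumptions made for this sketch.
    """
    return sorted(
        table_name
        for (namespace_levels, table_name), directory in actual_dirs.items()
        if directory.rstrip("/") != table_directory(namespace_levels, table_name)
    )


if __name__ == "__main__":
    levels = ("namespace1", "namespace1a")
    print(table_directory(levels, "customers"))
    print(table_directory(levels, "orders"))
```

A table that reports a directory belonging to another table, such as orders stored under /namespace1/namespace1a/customers, would show up in the list returned by misplaced_tables.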

A catalog admin can use Polaris Catalog or a third-party query engine to organize catalog content as follows:

Object       Use
Namespace    Polaris Catalog or a third-party query engine
Table        Third-party query engine

Note

The tables and namespaces for an external catalog are read-only in Polaris Catalog. If you need to organize catalog content for an external catalog, you must use Snowflake. For more information, see Snowflake-managed Apache Iceberg™ tables.

The example code in this topic shows how to use Apache Spark to organize catalog content. The examples are written in PySpark and assume that spark is an existing SparkSession that is connected to the catalog.

Create a namespace

This section provides instructions for creating top-level or nested namespaces.

Important

When you create a namespace, don’t use periods or spaces in the namespace name.
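The naming rule above can be checked before a namespace is created. The following is a minimal sketch, not part of Polaris Catalog: the function name is_valid_namespace_name is hypothetical, and rejecting an empty name is an extra assumption beyond the rule stated here.

```python
def is_valid_namespace_name(name):
    """Return True if the name contains no periods and no spaces.

    An empty name is also rejected; that check is an assumption of this
    sketch, not a documented Polaris Catalog rule.
    """
    return bool(name) and "." not in name and " " not in name


if __name__ == "__main__":
    for candidate in ["namespace1", "name space", "namespace1.namespace1a"]:
        print(candidate, is_valid_namespace_name(candidate))
```

Note that a period in a Spark SQL statement such as CREATE NAMESPACE namespace1.namespace1a separates namespace levels. The check applies to each individual level name, not to the full dotted path.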

Create a top-level namespace

To create a top-level namespace, you can use Apache Spark or Polaris Catalog.

Example: Create a top-level namespace by using Apache Spark

The following example code creates a top-level namespace named namespace1 in the catalog1 catalog.

spark.sql("use catalog1").show()
spark.sql("CREATE NAMESPACE namespace1")

Create a top-level namespace by using Polaris Catalog™

To create a top-level namespace by using Polaris Catalog, follow these steps:

  1. Sign in to Polaris Catalog.

  2. From the menu on the left, select Catalogs.

  3. From the list of catalogs, select the catalog where you want to create a top-level namespace.

  4. Select + Namespace.

  5. In the Name field, enter a name for the namespace and select Submit.

Create a nested namespace

To create a nested namespace, you can use Apache Spark or Polaris Catalog.

Example: Create a nested namespace by using Apache Spark

The following example code creates a nested namespace named namespace1a in the catalog1 catalog. This nested namespace is created under the existing top-level namespace namespace1.

# Set catalog1 as the current catalog.
spark.sql("use catalog1").show()
# Create namespace1a under the existing top-level namespace namespace1.
spark.sql("CREATE NAMESPACE namespace1.namespace1a")
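When a nested namespace sits several levels deep, each parent must exist before its child. The following sketch builds one statement per level, parents first. The helper names namespace_statements and create_namespace_path are hypothetical, and the sketch assumes that the catalog accepts Spark SQL's CREATE NAMESPACE IF NOT EXISTS form, which skips levels that already exist.

```python
def namespace_statements(levels):
    """Build one CREATE NAMESPACE statement per level, parents first."""
    statements = []
    for depth in range(1, len(levels) + 1):
        qualified = ".".join(levels[:depth])
        statements.append(f"CREATE NAMESPACE IF NOT EXISTS {qualified}")
    return statements


def create_namespace_path(spark, catalog, levels):
    """Run the statements against an existing SparkSession.

    spark is assumed to be a SparkSession that is already configured
    to reach the catalog.
    """
    spark.sql(f"USE {catalog}")
    for statement in namespace_statements(levels):
        spark.sql(statement)


if __name__ == "__main__":
    for statement in namespace_statements(["namespace1", "namespace1a"]):
        print(statement)
```

With a live session, create_namespace_path(spark, "catalog1", ["namespace1", "namespace1a"]) would issue the same two statements in order.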

Create a nested namespace by using Polaris Catalog

To create a nested namespace by using Polaris Catalog, follow these steps:

  1. Sign in to Polaris Catalog.

  2. From the menu on the left, select Catalogs.

  3. From the list of catalogs, select the catalog where you want to create a nested namespace.

  4. Select the Namespaces tab and then select the parent namespace where you want to create the nested namespace.

  5. If the parent namespace is itself nested, repeat the previous step until you reach the namespace under which you want to create the nested namespace.

  6. Select + Namespace.

  7. In the Name field, enter a name for the nested namespace and select Submit.

Create a table

This section provides instructions for creating tables by using Apache Spark.

Example: Create a table

The following example code creates a customers table under nested namespace namespace1a in the catalog1 catalog. The table has two integer columns, id and custnum. Because the statement uses CREATE OR REPLACE, an existing customers table in that namespace is replaced.

spark.sql("use catalog1").show()
spark.sql("use namespace1.namespace1a")
spark.sql("CREATE OR REPLACE TABLE customers (id int, custnum int) using iceberg")
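As an alternative to switching the current catalog and namespace first, the table can be addressed by its fully qualified name. The following sketch builds such a statement. The helper name create_table_statement is hypothetical, and the sketch uses CREATE TABLE IF NOT EXISTS instead of CREATE OR REPLACE so that an existing table is left untouched; that substitution is a choice made for this sketch.

```python
def create_table_statement(catalog, namespace_levels, table, columns):
    """Build a CREATE TABLE statement with a fully qualified table name.

    columns is a list of (column_name, data_type) pairs.
    """
    qualified = ".".join([catalog, *namespace_levels, table])
    column_sql = ", ".join(f"{name} {data_type}" for name, data_type in columns)
    return f"CREATE TABLE IF NOT EXISTS {qualified} ({column_sql}) USING iceberg"


if __name__ == "__main__":
    print(
        create_table_statement(
            "catalog1",
            ["namespace1", "namespace1a"],
            "customers",
            [("id", "int"), ("custnum", "int")],
        )
    )
```

The resulting string can be passed to spark.sql() in the same way as the statements in the example above.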

Example: Insert rows into a table

The following example code inserts a row into the customers table.

spark.sql("use catalog1").show()
spark.sql("use namespace1.namespace1a")
spark.sql("INSERT INTO customers VALUES (123,456)")
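To insert several rows at once, the same INSERT statement can carry more than one VALUES tuple. The following sketch builds that statement for the two-column customers table. The helper name insert_statement is hypothetical, and the second row is made-up illustration data. The helper accepts only integer values, which matches the column types of the table and avoids quoting concerns.

```python
def insert_statement(table, rows):
    """Build a multi-row INSERT statement for (id, custnum) integer pairs."""
    if not rows:
        raise ValueError("at least one row is required")
    for row in rows:
        is_int_pair = len(row) == 2 and all(
            isinstance(value, int) and not isinstance(value, bool) for value in row
        )
        if not is_int_pair:
            raise ValueError(f"expected a pair of integers, got {row!r}")
    values = ", ".join(f"({row_id}, {custnum})" for row_id, custnum in rows)
    return f"INSERT INTO {table} VALUES {values}"


if __name__ == "__main__":
    print(insert_statement("customers", [(123, 456), (124, 457)]))
```

After running the statement with spark.sql(), a query such as spark.sql("SELECT * FROM customers").show() can be used to confirm that the rows were written.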