Create external cloud storage for a catalog

This article describes how to create external cloud storage for Apache Iceberg™ tables for the following cloud storage providers:

  • Amazon S3

  • Google Cloud Storage (GCS)

  • Microsoft Azure

Before you can create an internal catalog in your Polaris Catalog™ account, you must first create and configure external cloud storage for it.

Create an Amazon S3 bucket

To create a storage bucket for Amazon S3, do the following:

  1. Log in to the AWS Management Console.

  2. From the home dashboard, search for and select S3.

  3. Select Create bucket.

  4. Enter a Bucket name.

  5. If needed, configure the settings for your storage bucket; otherwise, keep the default settings.

  6. Scroll to the bottom of the page and select Create bucket.

  7. Search for and select the storage bucket you created.

  8. To create a folder, select Create folder.

    Note

    As a best practice, we recommend creating this folder.

  9. Enter a Folder name where you want to store Apache Iceberg™ tables and select Create folder.

  10. Select the folder you created.

  11. To copy the S3 URI for the folder you created, select Copy S3 URI. Store the S3 URI, because you must specify it when you create a catalog in Polaris Catalog.

    Note

    When creating a catalog in Polaris Catalog, you enter the S3 URI in the Default base location field.
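The console steps above can also be scripted. The following is a minimal sketch, not an official procedure: it assumes a client that behaves like the one returned by `boto3.client("s3")`, and the function name, bucket name, and folder name are hypothetical. It creates the bucket, adds an empty folder marker, and returns a folder URI in the `s3://bucket/folder/` form.

```python
def create_bucket_with_folder(s3_client, bucket, folder, region="us-east-1"):
    """Create a bucket plus an empty folder marker and return the folder's S3 URI.

    `s3_client` is assumed to behave like boto3.client("s3").
    The function name and its parameters are illustrative only.
    """
    if region == "us-east-1":
        # For us-east-1, the bucket is created without a LocationConstraint.
        s3_client.create_bucket(Bucket=bucket)
    else:
        s3_client.create_bucket(
            Bucket=bucket,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
    # A console "folder" is a zero-byte object whose key ends with "/".
    key = folder.strip("/") + "/"
    s3_client.put_object(Bucket=bucket, Key=key)
    return f"s3://{bucket}/{key}"


# Example call (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   client = boto3.client("s3", region_name="us-west-2")
#   uri = create_bucket_with_folder(client, "example-bucket", "iceberg", "us-west-2")
```

Passing the client in as an argument keeps the sketch independent of how credentials are configured. Compare the returned value with what Copy S3 URI shows in the console before you use it in the Default base location field.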

Create a Google Cloud Storage (GCS) bucket

To create a GCS bucket, do the following:

  1. Log in to the Google Cloud Platform Console as a project editor.

  2. From the home dashboard, hover on Products & solutions and select All products.

  3. Scroll to the Storage section and select Cloud Storage.

  4. Select Create.

  5. In the Name your bucket field, enter a name for your GCS bucket.

  6. If needed, configure the settings for your storage bucket.

  7. Select CREATE FOLDER.

  8. Enter a folder name where you want to store Apache Iceberg™ tables and select Create.

  9. Select the folder you created.

  10. To copy the path to the folder you created, select the Copy icon. Store this folder path, because you must specify it when you create a catalog in Polaris Catalog.

    Note

    When creating a catalog in Polaris Catalog, you enter the path to your GCS folder in the Default base location field.
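As a rough scripted counterpart to the GCS steps above, the sketch below assumes a client that behaves like `google.cloud.storage.Client()`. The function name and the example names are hypothetical. It returns the folder location as a `gs://` URI, the standard Cloud Storage addressing form; the Copy icon in the console may produce the path in a different form, so treat the exact string as an assumption and compare it with the copied value.

```python
def create_gcs_bucket_with_folder(storage_client, bucket_name, folder):
    """Create a GCS bucket plus an empty folder marker and return a gs:// URI.

    `storage_client` is assumed to behave like google.cloud.storage.Client().
    The function name and its parameters are illustrative only.
    """
    bucket = storage_client.create_bucket(bucket_name)
    # In a flat-namespace bucket, a folder is simulated by a zero-byte
    # object whose name ends with "/".
    folder_name = folder.strip("/") + "/"
    bucket.blob(folder_name).upload_from_string("")
    return f"gs://{bucket_name}/{folder_name}"


# Example call (requires google-cloud-storage and credentials; placeholders):
#   from google.cloud import storage
#   uri = create_gcs_bucket_with_folder(storage.Client(), "example-bucket", "iceberg")
```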

Create a Microsoft Azure container

To create a Microsoft Azure container for your Apache Iceberg™ tables, use one of the following Azure cloud storage services:

  • Blob storage

  • Data Lake Storage Gen2

  • General-purpose v1

  • General-purpose v2

These services are the Azure cloud storage services that Polaris Catalog supports for storage integrations. A storage integration is a Polaris Catalog object that stores a generated identity and access management (IAM) entity for your external cloud storage and is created when you create a catalog.

Step 1: Create a storage account

  1. Log in to Azure.

  2. From the home dashboard, search for and select Storage account.

  3. Select + Create.

  4. In the Resource group field, select a resource group for your storage account, or select Create new to create a new resource group.

  5. In the Storage account name field, enter a name for your storage account.

  6. If needed, enable hierarchical namespace to use the storage account for Azure Data Lake Storage Gen2 workloads. For more information, see Create a storage account.

  7. If needed, configure the settings for your storage account.

  8. Select Review + create.

  9. Select Create.

Step 2: Create a container in your storage account

Follow these steps to create a container and copy the path to it:

  1. In Azure, navigate to the storage account you created.

  2. From the menu on the left, select Data storage.

  3. Under Data storage, select Containers.

  4. Select + Container.

  5. Enter a Name for your container and select Create.

  6. Copy and save the name of your container. You need to specify this name when you create a catalog in Polaris Catalog.

  7. If you’re using a hierarchical namespace and need to add a directory:

    a. Select the container you created.

    b. Select + Add Directory.

    c. Enter a Name for the directory and select Save.

    d. Copy and save the name of this directory. You need to specify this name when you create a catalog in Polaris Catalog.
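The storage account name from Step 1 and the container name from Step 2 must follow Azure's naming rules, and the portal rejects names that don't. The sketch below checks a name before you enter it. The function names are hypothetical, and the rules encoded are the general Azure ones: storage account names are 3 to 24 lowercase letters and digits; container names are 3 to 63 lowercase letters, digits, and hyphens, with no leading, trailing, or consecutive hyphens.

```python
import re

# 3-24 characters, lowercase letters and digits only.
_ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")
# Starts and ends with a letter or digit; every hyphen sits between
# two letters or digits, which rules out consecutive hyphens.
_CONTAINER_RE = re.compile(r"[a-z0-9](?:-?[a-z0-9])*")


def is_valid_storage_account_name(name):
    """Return True if `name` fits the storage account naming rules."""
    return _ACCOUNT_RE.fullmatch(name) is not None


def is_valid_container_name(name):
    """Return True if `name` fits the container naming rules."""
    return 3 <= len(name) <= 63 and _CONTAINER_RE.fullmatch(name) is not None
```

For example, `is_valid_container_name("iceberg-tables")` passes, while `"Iceberg_Tables"` and `"iceberg--tables"` do not.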

Step 3: Copy the endpoint path to your container

  1. In Azure, navigate to the storage account you created.

  2. From the menu on the left, select Settings.

  3. Under Settings, select Endpoints.

  4. Copy and store the path to the primary endpoint for your container:

    • If you’re using blob storage, under Blob service, select the Copy to clipboard icon for the Primary endpoint: Blob service field.

    • If you’re using Azure Data Lake Storage, under Data Lake Storage, select the Copy to clipboard icon for the Primary endpoint: Data Lake Storage field.

    Note

    When creating a catalog in Polaris Catalog, you enter the path to the primary endpoint for your container in the Default base location field. The steps for creating a catalog in Polaris Catalog include instructions for converting this path into the format that the Default base location field requires.
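To check that you copied the right endpoint, the following sketch splits a primary endpoint URL into its storage account name and service type. It assumes the standard public-cloud host pattern `<account>.<service>.core.windows.net`, where the service is `blob` for Blob service and `dfs` for Data Lake Storage. The function name is hypothetical, and the sketch does not produce the Default base location format, which is described in the steps for creating a catalog.

```python
from urllib.parse import urlparse


def describe_primary_endpoint(endpoint):
    """Split a primary endpoint URL into account name and service type.

    Assumes hosts of the form <account>.<service>.core.windows.net.
    Raises ValueError for any other shape.
    """
    host = (urlparse(endpoint).hostname or "").lower()
    parts = host.split(".")
    if len(parts) != 5 or parts[2:] != ["core", "windows", "net"]:
        raise ValueError(f"unexpected endpoint host: {host!r}")
    account, service = parts[0], parts[1]
    if service not in ("blob", "dfs"):
        raise ValueError(f"unexpected storage service: {service!r}")
    return {"account": account, "service": service}
```

Calling it with a placeholder such as `"https://icebergstore01.dfs.core.windows.net/"` yields the account `icebergstore01` and the service `dfs`, which indicates a Data Lake Storage endpoint.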