AWS PrivateLink and Snowflake Open Catalog¶
This topic describes how to configure AWS PrivateLink to directly connect your Snowflake Open Catalog account to your query engine by using inbound private connectivity.
Prerequisites¶
Your Snowflake Open Catalog account is hosted on AWS.
You have the necessary permissions to configure your AWS DNS service with the private connectivity URL for your Open Catalog account. For guidance, see How to configure the AWS DNS service (Route 53) to access Snowflake via a PrivateLink in the Snowflake Community.
Step 1: Enable AWS PrivateLink¶
In this procedure, you enable AWS PrivateLink for your Open Catalog account. This configuration allows the query engine to connect to Open Catalog through private connectivity. You will need the 12-digit identifier for your Amazon Web Services (AWS) account and the federated token value that contains access credentials for a federated user.
To obtain the federated token value, execute the following command by using the AWS CLI and copy the value into a text editor:
aws sts get-federation-token --name sam
Sign in to Snowflake Open Catalog.
In the navigation menu, select Settings.
Select Authorize.
In the Authorize Private Link dialog, enable private connectivity for your account:
In the ID field, enter the 12-digit identifier for your Amazon Web Services (AWS) account.
For Federated token, enter the federated token value that you copied to a text editor.
Select Save.
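As a rough illustration of the values this step asks for, the sketch below parses the JSON printed by `aws sts get-federation-token` and reads the 12-digit account ID from the federated user's ARN. The function name and the sample payload are hypothetical. The `FederatedUser` and `Arn` field names follow the AWS STS response shape. Whether the Federated token field expects the full JSON output or only part of it is an open assumption, so confirm against the dialog before pasting.

```python
import json
import re


def account_id_from_federation_output(cli_output: str) -> str:
    """Return the 12-digit AWS account ID found in get-federation-token output."""
    response = json.loads(cli_output)
    # ARN shape: arn:aws:sts::<account-id>:federated-user/<name>
    arn = response["FederatedUser"]["Arn"]
    account_id = arn.split(":")[4]
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError(f"unexpected account ID in ARN: {arn!r}")
    return account_id


# Hypothetical sample output; real output contains live credentials.
SAMPLE_OUTPUT = json.dumps(
    {
        "Credentials": {
            "AccessKeyId": "EXAMPLEKEYID",
            "SecretAccessKey": "example-secret",
            "SessionToken": "example-session-token",
            "Expiration": "2030-01-01T00:00:00+00:00",
        },
        "FederatedUser": {
            "FederatedUserId": "123456789012:sam",
            "Arn": "arn:aws:sts::123456789012:federated-user/sam",
        },
    }
)

print(account_id_from_federation_output(SAMPLE_OUTPUT))
```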
Step 3: Retrieve your Open Catalog account settings¶
Retrieve these settings, which you’ll need later to create and configure a VPC endpoint and your VPC network.
Sign in to Snowflake Open Catalog.
In the navigation menu, select Settings.
On the Settings page, copy the values for the following settings into a text editor:
PrivateLink Account URL
Regionless PrivateLink Account URL
PrivateLink OCSP URL
Regionless PrivateLink OCSP URL
VPCE Service ID
You paste these values when you create and configure a VPC endpoint (VPCE), configure your VPC network, and connect to Open Catalog through AWS PrivateLink.
For descriptions of each setting, see Return values for the SYSTEM$GET_PRIVATELINK_CONFIG system function in the Snowflake documentation. In that topic, the setting names appear as JSON keys.
Note
Remember that, where applicable, the description refers to a Snowflake account, but your value is actually for your Snowflake Open Catalog account. For example, the privatelink-account-url is the URL for your Snowflake Open Catalog account.
Optional: To retrieve these values in JSON format, create a Snowflake CLI connection for Open Catalog, and then call the SYSTEM$GET_PRIVATELINK_CONFIG system function.
In the Snowflake documentation, privatelink-vpce-id corresponds to the VPCE Service ID in Open Catalog.
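If you retrieve the settings as JSON, a small helper can pull out the values this procedure uses. The sketch below is illustrative only: the function name and the sample values are hypothetical, and it relies solely on the two key names mentioned in this topic (`privatelink-account-url` and `privatelink-vpce-id`).

```python
import json

# JSON key -> label on the Open Catalog Settings page.
# Only the keys named in this topic are mapped here.
KEY_TO_LABEL = {
    "privatelink-account-url": "PrivateLink Account URL",
    "privatelink-vpce-id": "VPCE Service ID",
}


def settings_from_config(config_json: str) -> dict:
    """Map the JSON config to the setting labels shown on the Settings page."""
    config = json.loads(config_json)
    missing = [key for key in KEY_TO_LABEL if key not in config]
    if missing:
        raise KeyError(f"missing keys in config: {missing}")
    # Keys that are not in KEY_TO_LABEL (for example Snowsight URLs) are ignored.
    return {label: config[key] for key, label in KEY_TO_LABEL.items()}


# Hypothetical sample values, for illustration only.
SAMPLE_CONFIG = json.dumps(
    {
        "privatelink-account-url": "example.privatelink.snowflakecomputing.com",
        "privatelink-vpce-id": "com.amazonaws.vpce.us-west-2.vpce-svc-0123456789abcdef0",
        "regionless-snowsight-privatelink-url": "ignored.example.com",
    }
)

for label, value in settings_from_config(SAMPLE_CONFIG).items():
    print(f"{label}: {value}")
```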
Step 4: Create and configure a VPC endpoint¶
In this procedure, you create and configure a corresponding VPC endpoint (VPCE) in your AWS VPC environment.
Note
If you already created a VPC endpoint for your Snowflake account, and the account is in the same deployment as your Open Catalog account, creating a new VPC endpoint for your Open Catalog account isn’t required. You can optionally skip this step.
For instructions, see Create and configure a VPC endpoint (VPCE) in the Snowflake documentation, starting with step 2.
Step 5: Configure your VPC network¶
To configure your VPC network, create CNAME records in your DNS service that resolve the private connectivity endpoint values from your Open Catalog account settings to the DNS name of your VPC endpoint.
For instructions, see Configure your VPC network in the Snowflake documentation. Remember that these instructions are for Snowflake, so some of the features mentioned in them don’t apply to Open Catalog. For example, regionless-snowsight-privatelink-url is for Snowsight, which isn’t supported in Open Catalog.
For additional help with DNS configuration, contact your internal AWS administrator.
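For readers who manage DNS in Route 53, the CNAME mapping in this step can be sketched as a change batch. This is an illustration under stated assumptions: the function name, hostnames, VPC endpoint DNS name, and TTL are hypothetical placeholders, and the dictionary follows the change-batch shape accepted by the Route 53 `ChangeResourceRecordSets` API. Other DNS services need a different format.

```python
import json


def build_cname_change_batch(hostnames, vpce_dns_name, ttl=300):
    """Build a Route 53 change batch pointing each hostname at the VPC endpoint."""
    changes = [
        {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": host,
                "Type": "CNAME",
                "TTL": ttl,
                "ResourceRecords": [{"Value": vpce_dns_name}],
            },
        }
        for host in hostnames
    ]
    return {"Comment": "Open Catalog PrivateLink CNAME records", "Changes": changes}


# Hypothetical values; substitute the URLs copied from your account settings
# and the DNS name of your own VPC endpoint.
PRIVATELINK_HOSTS = [
    "example.privatelink.snowflakecomputing.com",
    "ocsp.example.privatelink.snowflakecomputing.com",
]
VPCE_DNS_NAME = "vpce-0123456789abcdef0-example.vpce-svc-0123456789abcdef0.us-west-2.vpce.amazonaws.com"

batch = build_cname_change_batch(PRIVATELINK_HOSTS, VPCE_DNS_NAME)
print(json.dumps(batch, indent=2))
```

The printed JSON could be saved to a file and supplied to `aws route53 change-resource-record-sets` through its `--change-batch` option, assuming a private hosted zone already exists for the domain.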
Step 6: Connect to Open Catalog through AWS PrivateLink¶
To register a service connection and connect your query engine to Snowflake Open Catalog through AWS PrivateLink, use the following code:
import pyspark
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('iceberg_lab') \
    .config('spark.jars.packages', 'org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.4.1,<maven_coordinate>') \
    .config('spark.sql.extensions', 'org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions') \
    .config('spark.sql.defaultCatalog', 'opencatalog') \
    .config('spark.sql.catalog.opencatalog', 'org.apache.iceberg.spark.SparkCatalog') \
    .config('spark.sql.catalog.opencatalog.type', 'rest') \
    .config('spark.sql.catalog.opencatalog.uri','https://<open_catalog_privatelink_account_url>/polaris/api/catalog') \
    .config('spark.sql.catalog.opencatalog.header.X-Iceberg-Access-Delegation','vended-credentials') \
    .config('spark.sql.catalog.opencatalog.credential','<client_id>:<client_secret>') \
    .config('spark.sql.catalog.opencatalog.warehouse','<catalog_name>') \
    .config('spark.sql.catalog.opencatalog.scope','PRINCIPAL_ROLE:<principal_role_name>') \
    .getOrCreate()
Parameters¶
Note
Ensure that you set up your DNS service to match the value you specify for <open_catalog_privatelink_account_url>.
Parameter | Description |
---|---|
<catalog_name> | Specifies the name of the catalog to connect to. |
<maven_coordinate> | Specifies the Maven coordinate for your external cloud storage provider. |
<client_id> | Specifies the client ID for the service principal to use. |
<client_secret> | Specifies the client secret for the service principal to use. |
<open_catalog_privatelink_account_url> | Specifies the URL to connect to your Snowflake Open Catalog account using AWS PrivateLink or Azure Private Link. |
<principal_role_name> | Specifies the principal role that is granted to the service principal. |
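One way to keep the placeholder values in one place is to build the catalog settings as a dictionary before handing them to the Spark session builder. The sketch below is a convenience wrapper, not part of Open Catalog or Iceberg: `build_catalog_conf`, its `catalog_alias` argument, and the sample values are hypothetical, while the configuration keys are the ones used in the code for this step.

```python
def build_catalog_conf(
    privatelink_account_url,
    client_id,
    client_secret,
    catalog_name,
    principal_role_name,
    catalog_alias="opencatalog",
):
    """Return the Spark catalog settings for an Open Catalog REST connection."""
    host = privatelink_account_url.strip()
    # Tolerate a pasted scheme or trailing slash so the URI is not malformed.
    if host.startswith("https://"):
        host = host[len("https://"):]
    host = host.rstrip("/")
    if not host:
        raise ValueError("privatelink_account_url must not be empty")

    prefix = f"spark.sql.catalog.{catalog_alias}"
    return {
        "spark.sql.defaultCatalog": catalog_alias,
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": f"https://{host}/polaris/api/catalog",
        f"{prefix}.header.X-Iceberg-Access-Delegation": "vended-credentials",
        f"{prefix}.credential": f"{client_id}:{client_secret}",
        f"{prefix}.warehouse": catalog_name,
        f"{prefix}.scope": f"PRINCIPAL_ROLE:{principal_role_name}",
    }


# Hypothetical sample values, for illustration only.
conf = build_catalog_conf(
    privatelink_account_url="https://example.privatelink.snowflakecomputing.com/",
    client_id="my_client_id",
    client_secret="my_client_secret",
    catalog_name="my_catalog",
    principal_role_name="my_role",
)

for key, value in conf.items():
    print(f"{key} = {value}")
```

Each pair could then be applied to the builder with `builder.config(key, value)` in a loop. The package and extension settings are left out of the helper because they don't depend on the account values.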
Step 7 (Optional): Create a catalog integration for Snowflake¶
If you’re using Snowflake to query Open Catalog-managed tables, create a catalog integration for Snowflake that uses a private IP address. To create this catalog integration, your Snowflake account must be in the same deployment as your Open Catalog account.
For an example, see Example: Catalog integration that uses a private IP address in the Snowflake documentation.