Snowflake Data Clean Room: Accessing external data from an Amazon S3 bucket

Data analyzed in a Snowflake Data Clean Room can be native to Snowflake, reside externally in cloud provider storage, or both. Connectors allow collaborators to access external data in cloud provider storage from within the clean room.

Snowflake uses the following strategies to make external data available in a clean room:

  • If a collaborator has a Snowflake account, the data from external cloud storage is materialized in the Snowflake account as soon as the connector is authenticated.

  • If a collaborator is not a Snowflake customer and is using a managed account to join a clean room, the connector uses Snowflake external tables to make data available. Only the metadata associated with an external table is stored in Snowflake.

This topic describes how to use a connector so clean room analysts can access external data from an Amazon S3 bucket.

Connect to an S3 bucket

Allowing clean room collaborators to access data from Amazon S3 storage consists of the following steps:

  1. In AWS, create an IAM policy with the following permissions:

    • s3:GetBucketLocation

    • s3:GetObject

    • s3:GetObjectVersion

    • s3:ListBucket

  2. In AWS, create an IAM role that references the new IAM policy.

  3. In AWS, copy the identifiers of the S3 bucket and IAM role.

  4. In the clean room environment, create the connector.

  5. In AWS, update the IAM role with the service account identifiers from the clean room environment.

  6. In the clean room environment, authenticate the connector with AWS.

The following sections discuss these steps in more detail.

Create an IAM policy in AWS

Snowflake suggests creating a dedicated IAM policy for the connector that includes the necessary permissions to access the S3 bucket. In a subsequent step, you will add this policy to an IAM role that represents the identity of the connector.

To create an IAM policy that contains permissions to the S3 bucket:

  1. Sign in to the AWS Management Console.

  2. From the Console Home dashboard, select Identity and Access Management (IAM). You can use the search bar to locate it.

  3. In the left navigation, select Account settings.

  4. In the Security Token Service (STS) section, find the region of the account associated with the clean room environment, and toggle on Active.

    To find the region of the account associated with a clean room environment, sign in to the clean room, and select Connectors » Cleanrooms » Snowflake.

  5. In the left navigation, select Policies.

  6. Select Create policy.

  7. In the Policy editor section, select JSON.

  8. Copy and paste the following policy body into the policy editor, then edit the JSON to include your bucket name (<bucket>) and folder path prefix (<prefix>). Keep the three colons (:::) that precede the bucket name in each ARN. For example, if your S3 bucket URI is s3://sales/customers/, then the value of the first Resource JSON field is arn:aws:s3:::sales/customers/*.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:GetObject",
            "s3:GetObjectVersion"
          ],
          "Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
        },
        {
          "Effect": "Allow",
          "Action": [
            "s3:ListBucket",
            "s3:GetBucketLocation"
          ],
          "Resource": "arn:aws:s3:::<bucket>",
          "Condition": {
            "StringLike": {
              "s3:prefix": [
                "<prefix>/*"
              ]
            }
          }
        }
      ]
    }
    
  9. Select Next.

  10. Enter a policy name (for example, snowflake_cleanroom_access), and select Create policy.
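The placeholder substitution in the policy body can also be scripted, which helps avoid typos in the ARNs. The following is a minimal sketch, assuming the S3 URI has the form s3://<bucket>/<prefix>/ as in the example above. The function name build_connector_policy is hypothetical and is not part of any Snowflake or AWS tooling; the output is meant to be pasted into the JSON policy editor.

```python
import json
from urllib.parse import urlparse


def build_connector_policy(s3_uri: str) -> dict:
    """Return the read-only policy document for an S3 URI (hypothetical helper)."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    bucket = parsed.netloc
    # Drop the leading and trailing slashes so the prefix matches <prefix>.
    prefix = parsed.path.strip("/")
    if not prefix:
        raise ValueError("this sketch assumes a folder path prefix is present")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level read access, limited to the prefix.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                # Bucket-level access, with listing limited to the prefix.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_connector_policy("s3://sales/customers/"), indent=2))
```

Running the script with s3://sales/customers/ prints a policy whose first Resource value is arn:aws:s3:::sales/customers/*, which matches the example in step 8.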

Create an IAM role in AWS

You are now ready to create an AWS IAM role that represents the identity of the connector. During the creation process, you associate the role with the new IAM policy, which grants the permissions that the connector needs to access the S3 bucket.

To create a new IAM role:

  1. Sign in to the AWS Management Console.

  2. From the Console Home dashboard, select Identity and Access Management (IAM).

  3. In the left navigation, select Roles.

  4. Select Create role.

  5. In the Trusted entity type section, select AWS account.

  6. In the An AWS account section, select Another AWS account.

  7. In the Account ID field, enter a temporary placeholder value that contains 12 digits (for example, the account identifier of the current AWS account). You will replace this value in a subsequent step.

  8. Select Require external id, then enter a temporary placeholder value such as 0000. You will replace this value in a subsequent step.

  9. Select Next.

  10. In the Permissions policies section, find the policy that you created while completing the steps in Create an IAM policy in AWS, and select its check box.

  11. Select Next.

  12. Enter a role name (for example, snowflake_cleanroom_connector), and select Create role.

Copy S3 bucket and IAM role identifiers

When creating the connector in the clean room environment, you will need the identifiers of the S3 bucket and the IAM role. Before creating the connector, use the following steps to copy and save these identifiers.

To copy the IAM role identifier:

  1. Sign in to the AWS Management Console.

  2. From the Console Home dashboard, select Identity and Access Management (IAM).

  3. In the left navigation, select Roles.

  4. Find the role that you created while completing the steps in Create an IAM role in AWS, and select it to open it.

  5. In the Summary section, find the ARN and select the copy icon. Save this identifier for a subsequent step.

To copy the S3 bucket identifier:

  1. Sign in to the AWS Management Console.

  2. From the Console Home dashboard, select S3.

  3. Find the name of your S3 bucket and select it to open it.

  4. Navigate into the prefix of the bucket, then select Copy S3 URI. Don’t try to select the button in the Objects section. Save the S3 URI for a subsequent step.
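Before you move on, a quick format check of the two saved identifiers can catch copy errors such as a missing prefix or a truncated account ID. The following is a rough sketch only: the regular expressions are assumptions about the usual shape of a role ARN in the standard aws partition and of an S3 URI, and they do not confirm that the role or bucket exists. The function names are hypothetical.

```python
import re

# Assumed shape: arn:aws:iam::<12-digit account ID>:role/<optional path and role name>
ROLE_ARN_RE = re.compile(r"arn:aws:iam::\d{12}:role/[\w+=,.@/-]+")

# Assumed shape: s3://<bucket name of 3 to 63 characters>[/<prefix>]
S3_URI_RE = re.compile(r"s3://[a-z0-9][a-z0-9.-]{1,61}[a-z0-9](/.*)?")


def looks_like_role_arn(value: str) -> bool:
    """Return True if the value has the general shape of an IAM role ARN."""
    return ROLE_ARN_RE.fullmatch(value.strip()) is not None


def looks_like_s3_uri(value: str) -> bool:
    """Return True if the value has the general shape of an S3 URI."""
    return S3_URI_RE.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    print(looks_like_role_arn("arn:aws:iam::772412615275:role/mub00002_vhb71832_role"))
    print(looks_like_s3_uri("s3://sales/customer_data/"))
```

Both checks are deliberately loose. They are meant to flag obviously malformed values, not to replace the authentication step at the end of this topic.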

Create connector and copy service account details

You are now ready to create the connector in the clean room environment. After you create the connector, copy the details of its service account so that you can associate the service account with the IAM role in AWS.

To create the connector in your clean room environment:

  1. Navigate to the sign in page.

  2. Enter your email address, and select Continue.

  3. Enter your password.

  4. If you are associated with multiple clean room environments, select the Snowflake account you want to use.

  5. In the left navigation, select Connectors, then expand the Amazon Web Services section.

  6. In the AWS Role ARN field, enter the identifier of the IAM role that you copied from AWS. For example, it might be arn:aws:iam::772412615275:role/mub00002_vhb71832_role.

  7. In the S3 Bucket URI field, enter the identifier of the S3 bucket that you copied from AWS. For example, it might be s3://sales/customer_data/.

  8. Select Create. The clean room generates a service account that it uses to access AWS.

  9. Use the copy icon to copy the Principal and External ID identifiers of the connector’s service account, and save them for the next task.

Update IAM role with service account details

You are now ready to update the IAM role with the identifiers associated with the connector’s service account. To update the IAM role:

  1. Sign in to the AWS Management Console.

  2. From the Console Home dashboard, select Identity and Access Management (IAM).

  3. In the left navigation, select Roles.

  4. Find the role that you created while completing the steps in Create an IAM role in AWS, and select it to open it.

  5. Select the Trust relationships tab.

  6. Select Edit trust policy.

  7. Modify the JSON of the trust policy to include the identifiers from the connector’s service account. You copied these identifiers when you completed the steps in Create connector and copy service account details. Make the following changes to the JSON:

    • Replace the value of the AWS JSON field with the Principal value you copied from the clean room environment. In the following example, the value of Principal in the clean room environment is arn:aws:iam::115136555074:user/x4gy-s-p2345g38.

    • Replace the value of the sts:ExternalId JSON field with the External ID value you copied from the clean room environment. In the following example, the value of External ID in the clean room environment is UCA56729_SFCRole=4447_uht2344sdf3mrWLNRM0y3bE=.

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Principal": {
              "AWS": "arn:aws:iam::115136555074:user/x4gy-s-p2345g38"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
              "StringEquals": {
                "sts:ExternalId": "UCA56729_SFCRole=4447_uht2344sdf3mrWLNRM0y3bE="
              }
            }
          }
        ]
      }
      
  8. Select Update policy.
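If you prefer to generate the trust policy rather than edit it by hand, the two substitutions in step 7 can be sketched as follows. This is an illustration only: build_trust_policy is a hypothetical helper, and the Principal and External ID values are the example values shown above, not real credentials.

```python
import json


def build_trust_policy(principal_arn: str, external_id: str) -> dict:
    """Return a trust policy that lets one principal assume the role (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Statement1",
                "Effect": "Allow",
                # Principal value copied from the clean room environment.
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
                # External ID value copied from the clean room environment.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    policy = build_trust_policy(
        "arn:aws:iam::115136555074:user/x4gy-s-p2345g38",
        "UCA56729_SFCRole=4447_uht2344sdf3mrWLNRM0y3bE=",
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the trust policy editor in place of the placeholder values entered when the role was created.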

Authenticate the connector

You are now ready to authenticate the connector to make sure it can access the S3 bucket. To authenticate the connector:

  1. In the clean room environment, select Connectors and expand the Amazon Web Services section. If you are signed out of the clean room environment, see Sign in to the web app.

  2. Select the S3 bucket you are connecting to, and select Authenticate.

Remove access to external data on AWS

To remove access to an S3 bucket from a clean room environment:

  1. Navigate to the sign in page.

  2. Enter your email address, and select Continue.

  3. Enter your password.

  4. If you are associated with multiple clean room environments, select the Snowflake account you want to use.

  5. In the left navigation, select Connectors and expand the Amazon Web Services section.

  6. Find the S3 bucket that is currently connected, and select the trash can icon.