Data Providers: Sharing Data

In this Topic:

Why Am I Sharing My Data?

Data collaboration is critical for success. No matter the focus of your work, your organization collects specialized data that offers important insights and new perspectives. Your organization may already recognize the value of collaboration, sharing or selling its data. You may be new to the space, exploring for the first time the ways to make available your organization’s data trove. Share data because there are others out there who want and need that data, and recognize its real world value.

How Do I Share My Data?

The process of listing your data may involve several individuals or teams in your organization. You’ll need to:

  • Determine the scope of the data you’re sharing. Engage internally to understand how you want to segment the data to best serve your consumers while maximizing the benefit to your organization. You can have multiple listings in the Marketplace.

  • Consider how to market your data product. Snowflake provides space in the listing for details about the data and any pricing model you might want to apply. Your organization may want to do additional marketing to drive consumers to your listing.

  • Load the data you want to share into Snowflake. Snowflake provides detailed documentation for your operations and engineering teams to use to load data.

  • Create and publish the Marketplace listing. When your organization is ready to share its data, you’ll submit your Marketplace listing to Snowflake for review. This can be an iterative process, with Snowflake helping to ensure your data and listing are ready for consumption.

How Do I Load Data Into Snowflake?

Loading data into Snowflake is a technical process that may require engineering support. The methods you use are usually determined by where your data resides. The methods aren’t mutually exclusive; if your organization has data residing in more than one location, you may need to use multiple methods of loading data.

Snowflake supports loading data from many sources, including:

  • Amazon S3

  • Google Cloud Storage

  • Microsoft Azure

  • Local data sources

Snowflake provides detailed documentation about loading data. Working with your organization, you’ll need to:

After your data is loaded into Snowflake, you’ll need to create and manage the data shares that will fuel your Marketplace listings.

To learn more about loading data into Snowflake, see the following resources:

How Do I Manage My Shares?

Managing shares in Snowflake is a technical process that may require engineering support. Shares are curated collections of data from your databases, usually tables, views, and functions. A share can be included only in one listing at a time. A single database can have multiple shares.

Each listing requires a share. A share can be created before you configure your listing, or you can use the web interface to select database objects for your listing. If you use the latter method, Snowflake automatically creates a share for you.

Snowflake provides documentation that details the tasks related to data shares, including:

  • Creating the shares. In addition to your experts and engineers, you may want to involve individuals who are responsible for marketing and product development in the process. Considering the audience you are targeting with your listing will help to optimize the data you select for the share.

  • Managing the shares. As your product matures, your organization may need to make changes to your data shares.

Note: If objects in a share are dropped and later recreated, you must grant those objects to the share again. For example, suppose you refresh some data by dropping and recreating a table. If that table was previously granted to a share, you must grant that table to the share again.

Preparing Shares With Paid and Trial Data

Creating and managing shares that include paid data is similar to the instructions for shares that include trial (free or open) data. For instructions, see Working with Shares.

Create a single share for your paid listing. To differentiate paid data from trial data in the share, create secure views that filter access to paid data in the underlying tables using the SYSTEM$IS_LISTING_PURCHASED function. This SQL function can be implemented as a WHERE clause in the secure view definition.

Note

Currently, a share in a paid listing cannot include secure UDFs.

Creating Secure Views for Paid and Trial Data

Add views to your shares for paid listings that call the SYSTEM$IS_LISTING_PURCHASED SQL function. This function determines whether the account in which a query is executed has agreed to purchase paid data in the current listing.

This section provides examples for creating the secure views.

Note

  • To return the same rows regardless of the purchase status of the consumer account, omit the SYSTEM$IS_LISTING_PURCHASED function from a view.

Example 1: Return Data Based on the Purchase Status of the Account

Create a secure view that selects all columns in a table. The view returns rows only when queried within a consumer account that has agreed to purchase paid data:

CREATE SECURE VIEW paid_v
  AS
  SELECT
    *
  FROM
    paid_t
  WHERE
    SYSTEM$IS_LISTING_PURCHASED() = TRUE;

Example 2: Return a Subset of Rows Based on the Purchase Status of the Account

Create a secure view that returns a subset of rows identified as trial or paid (i.e. for purchase). The view relies on a Boolean column named is_free in the underlying table that contains TRUE for trial rows and FALSE for paid rows. The view only returns all rows only when queried within a consumer account that has agreed to purchase paid data; otherwise, the query only rows marked as trial.

Note that adding a column to identify paid or trial rows is not necessary. You could filter records conditionally based on other values in your existing data model. For one example, see Example 3: Return Only the Most Recent Rows Based on the Purchase Status of the Account (in this topic).

CREATE SECURE VIEW paid_v
  AS
  SELECT
    *
  FROM
    paid_t
  WHERE
    is_free
    OR
    SYSTEM$IS_LISTING_PURCHASED() = TRUE;

Example 3: Return Only the Most Recent Rows Based on the Purchase Status of the Account

Create a secure view that returns only the most recent rows to a consumer account that has not agreed to purchase paid data:

Note

A secure view can filter on column data types other than date or timestamp.

CREATE SECURE VIEW paid_v
  AS
  SELECT *
  FROM
    paid_t
  WHERE
    (timestamp > current_timestamp() - interval '7 days')
    OR
    SYSTEM$IS_LISTING_PURCHASED() = TRUE;

Validating Secure Views for Paid and Trial Data

After your secure views are prepared, you can validate that they have been set up correctly by simulating the experiences of paid and trial consumer accounts. While simulating a paid or trial consumer, you can run queries against the secure views to determine if the consumer has access to the expected data.

Important

This method does not validate whether consumers are able to securely access your data. This method only validates whether the share will work as expected for your consumers.

To validate your shares, execute a query against a secure view using share_context(system$is_listing_purchased):

EXECUTE USING share_context(system$is_listing_purchased=>{ 'TRUE' | 'FALSE' })
  AS <query>

Where:

  • SYSTEM$IS_LISTING_PURCHASED specifies whether you want to validate as a paid consumer, or as a trial or unpaid consumer. The valid values are:

    • TRUE, to validate the share as a paid consumer.

    • FALSE, to validate the share as a trial or unpaid consumer.

  • <query> is the SQL query that you want to run against the secure view.

When you use the command to run your query, the query is executed against the share as though you are a consumer.

For example, suppose you have a share that you want to validate. Your share includes a secure view named PURCHASED_VIEW, which protects all of the data from a table named SHARE_TABLE. You want to ensure that the data can be accessed only if it’s been purchased.

To check if unpaid or trial consumers will be able to access any data in the secure view, you use the following query:

EXECUTE USING share_context(system$is_listing_purchased=>'FALSE')
  AS
    SELECT
      *
    FROM
      example_database.example_schema.PURCHASED_VIEW

The secure view is working as expected and no data is accessible to unpaid and trial consumer, so your query returns the following response:

Query produced no results

To confirm that your paid consumers will have access to the data, you use the following query:

EXECUTE USING share_context(system$is_listing_purchased=>'TRUE')
  AS
    SELECT
      *
    FROM
      example_database.example_schema.PURCHASED_VIEW

Because the secure view is working as expected, your query returns all of the columns and rows in SHARE_TABLE, the desired outcome for paid consumers.