Cost Governance of the Snowflake Connector for Google Analytics Raw Data

The Snowflake connector for Google Analytics Raw Data is subject to the Connector Terms.

This topic provides best practices for cost governance and finding the optimal warehouse size for the Snowflake Connector for Google Analytics Raw Data.

Measuring Cost of the Connector

If the connector runs in a separate account used only for data ingestion and storage, and the account shows no other activity (such as users running queries on the ingested data), you can read the overall cost at the account level. To learn more, refer to Exploring Overall Cost.

If the account is not dedicated only to the connector, or you need to investigate the costs further, you should analyze the charged costs for each of the following components separately:

  • Compute cost

  • Storage cost

  • Data transfer cost

For an introduction to these three components of cost, refer to Understanding Overall Cost.

General Recommendations

To measure the cost generated by the connector, we recommend that you create a separate account used solely for the connector. A dedicated account lets you track the exact data transfer generated by the connector.

If you cannot use a separate account for the connector, consider the following:

  • Create a separate database for storing ingested data to make it easier to track storage cost.

  • Allocate a warehouse only for the connector to get the exact compute cost.

  • Use object tags on databases and a warehouse to build custom cost reports.
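
For example, a minimal sketch of tagging the connector objects might look like the following; the tag name, warehouse name, and database name are hypothetical:

-- Hypothetical tag and object names; adjust to your environment.
CREATE TAG IF NOT EXISTS cost_center;
ALTER WAREHOUSE ga_connector_wh SET TAG cost_center = 'ga_raw_data_connector';
ALTER DATABASE ga_raw_data SET TAG cost_center = 'ga_raw_data_connector';

-- Find all objects carrying the tag when building a custom cost report.
SELECT object_database, object_name, domain, tag_value
  FROM SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES
  WHERE tag_name = 'COST_CENTER';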

Compute Cost

We recommend that you create a dedicated warehouse used only by the connector. This configuration allows you to create resource monitors on the warehouse. You can use the monitors to send email alerts and suspend the warehouse, which stops the connector when the set credit quota is exceeded. The connector resumes automatically after the credit quota is renewed. Note that setting the credit quota too low in configurations that ingest large volumes of data may prevent the connector from ingesting all of the data. A major benefit of a dedicated warehouse is that its size can be adjusted to the data volume.
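
As an illustration, a resource monitor for a dedicated connector warehouse could be set up roughly as follows; the warehouse name, quota, and thresholds are assumptions to adapt to your account:

-- Hypothetical warehouse name, quota, and thresholds.
CREATE RESOURCE MONITOR ga_connector_rm WITH
  CREDIT_QUOTA = 100
  FREQUENCY = MONTHLY
  START_TIMESTAMP = IMMEDIATELY
  TRIGGERS
    ON 80 PERCENT DO NOTIFY
    ON 100 PERCENT DO SUSPEND;

ALTER WAREHOUSE ga_connector_wh SET RESOURCE_MONITOR = ga_connector_rm;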

For information on how to check credits consumed by the warehouse, refer to Exploring Compute Cost. You can also assign object tags to the warehouse and use the tags to create cost reports.

If the warehouse used by the connector is shared with other workloads, you can split the cost by role. To split usage by role, use the query for splitting warehouse usage and add the following WHERE clause on the QUERY_HISTORY view:

WAREHOUSE_NAME = '<connector warehouse name>' AND
ROLE_NAME = '<role created for the connector to ingest data>'

Note that the role name is the one created when the connector was installed, for example SNOWFLAKE_CONNECTOR_FOR_GOOGLE_ANALYTICS_RAW_DATA.

The query gives only an approximation of the cost.
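
As an illustration, a self-contained approximation in the same spirit could apportion the warehouse credits by each role's share of query execution time; the 30-day window and the placeholder names are assumptions:

-- Approximate per-role credit attribution (placeholder names, last 30 days).
WITH role_time AS (
  SELECT role_name, SUM(execution_time) AS total_ms
    FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
    WHERE warehouse_name = '<connector warehouse name>'
      AND start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
    GROUP BY role_name
),
wh_credits AS (
  SELECT SUM(credits_used) AS credits
    FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
    WHERE warehouse_name = '<connector warehouse name>'
      AND start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
)
SELECT rt.role_name,
       wc.credits * rt.total_ms / SUM(rt.total_ms) OVER () AS approx_credits
  FROM role_time rt, wh_credits wc;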

Storage Cost

The Snowflake Connector for Google Analytics Raw Data stores data in two places:

  • The connector database, which is created from the public share and holds the connector's internal state.

  • The user-specified schema where the ingested data is stored.

Data storage is also used by the Snowflake Fail-safe feature. The amount of data stored in Fail-safe depends on the table updates done by the connector.

If you want to check storage usage in Snowsight, we recommend that you use a separate database for storing ingested data. This way, you can filter the storage usage graphs by object, which shows usage for each individual database. You can also view storage usage by querying the DATABASE_STORAGE_USAGE_HISTORY view and filtering by the databases used by the connector.
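
For instance, a query along these lines could report daily storage for the databases used by the connector; the database names are placeholders:

-- Placeholder database names; list the databases used by the connector.
SELECT usage_date,
       database_name,
       average_database_bytes / POWER(1024, 3) AS database_gb,
       average_failsafe_bytes / POWER(1024, 3) AS failsafe_gb
  FROM SNOWFLAKE.ACCOUNT_USAGE.DATABASE_STORAGE_USAGE_HISTORY
  WHERE database_name IN ('<connector database>', '<destination database>')
  ORDER BY usage_date DESC;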

If the database contains other schemas not related to the connector, you can query storage usage of a specific schema that is dedicated to the data ingested from the connector. You can get the information from the TABLE_STORAGE_METRICS view after filtering by database and schema names and aggregating columns with storage usage.
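
A sketch of such a schema-level query, with placeholder database and schema names, might look like this:

-- Placeholder names for the database and schema holding ingested data.
SELECT SUM(active_bytes)      AS active_bytes,
       SUM(time_travel_bytes) AS time_travel_bytes,
       SUM(failsafe_bytes)    AS failsafe_bytes
  FROM SNOWFLAKE.ACCOUNT_USAGE.TABLE_STORAGE_METRICS
  WHERE table_catalog = '<destination database>'
    AND table_schema = '<destination schema>';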

Data Transfer Cost

Snowflake charges only for egress traffic generated by the connector, based on the size of the requests sent from the connector to Google Analytics Raw Data. The responses from Google Analytics Raw Data do not generate cost on the Snowflake side.

Information on data transfer usage is available only in aggregated form for all external functions at the account level. To access the number of transferred bytes, use the DATA_TRANSFER_HISTORY view and filter by the EXTERNAL_ACCESS transfer type.
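
For example, the aggregated egress generated by external access could be read with a query along these lines:

-- Aggregated egress for all external functions on the account,
-- including the connector's requests to Google Analytics.
SELECT DATE_TRUNC('day', start_time) AS transfer_day,
       SUM(bytes_transferred)        AS bytes_transferred
  FROM SNOWFLAKE.ACCOUNT_USAGE.DATA_TRANSFER_HISTORY
  WHERE transfer_type = 'EXTERNAL_ACCESS'
  GROUP BY 1
  ORDER BY 1 DESC;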

There can be additional fees related to data transfer on the BigQuery side: data storage and egress traffic. Specifically, the connector uses what is referred to as streaming reads (the Storage Read API).

Please review the associated documentation for details.

Healthcheck Task Cost

The connector creates an internal, serverless task that regularly inspects the health of the instance and sends a summary to Snowflake for monitoring purposes. The task is created after you complete the installation wizard or call CONFIGURE_CONNECTION in a worksheet. It generates a fixed compute cost of up to 0.5 credits per day, even when no properties are enabled for ingestion.

The task cannot be explicitly suspended or dropped; however, pausing the connector also disables the healthcheck.
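
To keep an eye on this fixed cost, a query similar to the following could sum the serverless task credits; filtering by the connector database name is an assumption about where the internal task lives:

-- Assumes the connector's internal task lives in the connector database.
SELECT DATE_TRUNC('day', start_time) AS usage_day,
       SUM(credits_used)             AS credits_used
  FROM SNOWFLAKE.ACCOUNT_USAGE.SERVERLESS_TASK_HISTORY
  WHERE database_name = '<connector database>'
  GROUP BY 1
  ORDER BY 1 DESC;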

Determining the Optimal Warehouse Size for the Connector Instance

To find the optimal warehouse size for the connector, you should consider the factors that affect the performance of the connector, such as:

  • Number of Google Analytics properties

  • Amount of data produced by each of the properties

  • Schedule of synchronizing properties

We recommend that you define a set of measurable expectations, such as time intervals in which all tables should be synchronized, and pick the smallest warehouse size that meets these expectations. To determine if you can downsize the warehouse, see Monitoring warehouse load.

For the Snowflake Connector for Google Analytics Raw Data, we recommend starting with an XSMALL warehouse and then experimenting with larger warehouse sizes to see whether they improve performance.

In addition, warehouse size requirements can differ widely between ingestion stages. For example:

  • During initial ingestion, when the connector loads historical data (possibly years' worth), a larger warehouse can be beneficial.

  • During normal daily ingestion, when only the current daily increments of data are loaded, the smallest warehouses suffice (see the resizing sketch after this list).
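
For example, the warehouse could be resized between these stages with simple statements like the following; the warehouse name and sizes are placeholders:

-- Larger size for the initial, historical ingestion (placeholder name).
ALTER WAREHOUSE ga_connector_wh SET WAREHOUSE_SIZE = 'MEDIUM';

-- Back to the smallest size for normal daily increments.
ALTER WAREHOUSE ga_connector_wh SET WAREHOUSE_SIZE = 'XSMALL';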

Similarly, if a large set of properties is enabled for ingestion, consider a larger warehouse so that the connector can keep up with the data flow.