Setting Up Data Ingestion for Your ServiceNow® Data

The Snowflake connector for ServiceNow® is subject to the Connector Terms.

This topic describes how to set up data ingestion for the Snowflake Connector for ServiceNow®.

Note

The Snowflake Connector for ServiceNow® ingests data from ServiceNow tables into Snowflake. Data ingestion depends on v2 of the ServiceNow Table API.

Strategies for Ingesting ServiceNow Tables

Note

  • The connector can only ingest tables with sys_id columns present.

  • ServiceNow views are not supported. Instead of ingesting these views, you should synchronize all tables for the underlying view and join the synchronized tables in Snowflake.

The connector uses different ingestion strategies, depending on the table schema. The connector uses three ingestion modes:

  • The initial load of data occurs for each table when the table is enabled for synchronization.

    In this mode, the table is ingested by iterating through the records identified by the IDs in the sys_id column. Once all records are ingested, the initial load phase is complete. For certain tables, you can also set a data range start time value, which restricts which records are ingested.

  • Incremental updates occur only for tables with sys_updated_on or sys_created_on columns.

    Incremental updates begin after the initial load is done and occur on a regular schedule that you can configure. In this mode, the connector ingests only the records that were added, updated, or deleted since the last synchronization.

  • For tables that do not have sys_updated_on or sys_created_on columns, the connector uses truncate and load mode.

    In this mode, the table is always ingested using the initial load approach, and newly ingested data replaces the old data. The connector replaces the data by running the INSERT OVERWRITE command.

Note

  • In the “incremental updates” mode, the connector uses the sys_updated_on column, if that column is present.

    If the column is not present, the connector uses the sys_created_on column instead.

  • For rotated tables, the connector always uses the sys_created_on column. If the table is rotated using a column other than sys_created_on, ingesting that table might cause performance issues.
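The mode selection described above can be sketched as a short helper. This is an illustrative model of the documented rules, not connector code:

```python
def choose_ingestion_strategy(columns):
    """Pick an ingestion strategy from the documented rules.

    `columns` is the set of column names on the ServiceNow table.
    Returns a (mode, time_column) pair.
    """
    if "sys_id" not in columns:
        # The connector can only ingest tables with a sys_id column.
        raise ValueError("table cannot be ingested: no sys_id column")
    if "sys_updated_on" in columns:
        # Incremental updates prefer sys_updated_on when present.
        return ("incremental", "sys_updated_on")
    if "sys_created_on" in columns:
        # Fall back to sys_created_on if sys_updated_on is missing.
        return ("incremental", "sys_created_on")
    # No time column: the whole table is re-ingested each run.
    return ("truncate_and_load", None)
```

For example, a table with only sys_id and custom columns falls into truncate and load mode, while any table with a time column is updated incrementally after the initial load.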

Parallel Ingestion of ServiceNow Tables

The connector ingests several tables in parallel, but the ingestion of each individual table is a synchronous process. This means that ingesting a large table might block the connector from updating other tables. This issue is more likely to occur during the initial load phase than in other phases.

Note

  • If the sys_updated_on or sys_created_on fields are not updated when the record is modified in ServiceNow, those modifications won’t be propagated to Snowflake, which results in data inconsistency. Snowflake recommends that you avoid disabling the update of system fields.

  • If a record deletion is not audited, information about deleted records won’t be propagated to Snowflake, resulting in a data inconsistency.

Note

Due to restrictions on the Snowflake and ServiceNow REST APIs, the connector cannot ingest a table if a single row exceeds 10 MB of data. In that case, the connector retries the ingestion with the frequency defined in the table's schedule. If a row exceeds the limit, the connector generates an error message and continues with the next table.

Setting Up Data Ingestion Using Snowsight

To set up data ingestion using Snowsight, do the following:

  1. Sign in to Snowsight as a user with the ACCOUNTADMIN role.

  2. In the navigation menu, select Data Products » Marketplace.

  3. Search for the Snowflake Connector for ServiceNow®, then select the tile for the connector.

  4. In the page for the Snowflake Connector for ServiceNow®, select Select Tables.

    This displays a list of all the ServiceNow tables.

    Note

    The connector can only ingest tables with sys_id columns present.

  5. Select the tables you want to ingest:

    1. Search for the table you want to ingest.

    2. Select the checkbox in the Status column next to the table you want to select.

    3. Under Sync Schedule, select how frequently you want to synchronize the table between Snowflake and ServiceNow.

    4. Repeat these steps for each table you want to ingest into Snowflake.

  6. Select the heading of the Status column to see the tables you have currently selected.

  7. Select Start Ingestion to begin ingesting data into your Snowflake account.

The connector status changes to Loading Data. When at least one of the tables is ingested successfully, the connector status changes to Last Successful Load: just now.

Refer to Monitoring the connector for information on how to view the contents of the tables in Snowflake.

Modifying Data Ingestion Using Snowsight

To modify the ServiceNow tables to be ingested or the synchronization schedule for the tables, do the following:

  1. Sign in to Snowsight as a user with the ACCOUNTADMIN role.

  2. In the navigation menu, select Data Products » Marketplace.

  3. Search for the Snowflake Connector for ServiceNow®, then select the tile for the connector.

  4. In the page for the Snowflake Connector for ServiceNow®, select Edit.

  5. Modify the tables you want to ingest:

    1. Search for the table you want to ingest.

    2. Select the checkbox in the Status column next to the table you want to select or deselect.

    3. Under Sync Schedule, select how frequently you want to synchronize the table between Snowflake and ServiceNow.

  6. Select Update.

Setting Up Data Ingestion Using SQL Statements

To set up data ingestion using SQL Statements, do the following:

Note

To configure these settings, you use stored procedures that are defined in the PUBLIC schema of the database that serves as an instance of the connector.

Before calling these stored procedures, select that database as the database to use for the session.

For example, if that database is named my_connector_servicenow, run the following command:

USE DATABASE my_connector_servicenow;

Specifying the Synchronization Schedule

The Snowflake Connector for ServiceNow® synchronizes data from all ServiceNow tables to Snowflake on a specified schedule. By default, all of the tables are synchronized once every hour (1h).

To change the default synchronization schedule for all tables, call the CONFIGURE_CONNECTOR stored procedure with the following arguments:

CALL CONFIGURE_CONNECTOR('data_ingestion_schedule', '<schedule>');

Where:

data_ingestion_schedule (the string literal)

Specifies that you want to configure the schedule for synchronization.

schedule

Specifies the frequency of the synchronization. You can specify one of the following string values:

  • '30m'

  • '1h'

  • '3h'

  • '6h'

  • '12h'

  • '1d'

The connector also allows you to specify a different schedule for each table that is enabled for synchronization. To change the synchronization schedule for a selected set of tables, call the CONFIGURE_TABLES_SCHEDULE stored procedure with the following arguments:

CALL CONFIGURE_TABLES_SCHEDULE(<table_names>, <schedule>);

Where:

table_names

Specifies an array of table names for which you want to configure the synchronization schedule.

schedule

Specifies the frequency of the synchronization. You can specify one of the following JSON values:

  • { 'type': 'interval', 'value': '<interval_value>' }, where interval_value is one of the following string values:

    • '30m'

    • '1h'

    • '3h'

    • '6h'

    • '12h'

    • '1d'

  • { 'type': 'custom', 'value': { 'hour': <hour>, 'dayOfWeek': '<day_of_week>' } }, where hour specifies the hour (in the UTC time zone) at which the ingestion should start, and day_of_week specifies the day of the week on which the ingestion should be performed. You can use special expressions for the day of week:

    • '*' to run the ingestion every day.

    • '1-3' to run the ingestion from Monday to Wednesday.

    • '0,5,6' to run the ingestion on Friday, Saturday, and Sunday.

    The possible values that can be used in a day_of_week expression are:

    • '0' - Sunday

    • '1' - Monday

    • '2' - Tuesday

    • '3' - Wednesday

    • '4' - Thursday

    • '5' - Friday

    • '6' - Saturday

    Other non-digit values, such as '5L' (indicating the last Friday of a month) or 'FRI-SUN' (indicating the range from Friday to Sunday), are not supported.

For example, to ingest tables table_1 and table_2 every Saturday and Sunday at 11:00 PM UTC, call the following stored procedure:

CALL CONFIGURE_TABLES_SCHEDULE(['table_1', 'table_2'], { 'type': 'custom', 'value': { 'hour': 23, 'dayOfWeek': '0,6' } });
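The dayOfWeek expressions accepted by a custom schedule can be modeled with a short helper. The following is an illustrative sketch of the documented rules, not connector code:

```python
def expand_day_of_week(expr):
    """Expand a dayOfWeek expression into a set of day numbers (0 = Sunday).

    Supports the three documented forms: '*', a range such as '1-3',
    and a comma-separated list such as '0,5,6'. Unsupported values
    like '5L' or 'FRI-SUN' raise an error.
    """
    if expr == "*":
        return set(range(7))
    days = set()
    for part in expr.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            days.update(range(int(lo), int(hi) + 1))
        else:
            days.add(int(part))
    if not days <= set(range(7)):
        raise ValueError(f"invalid dayOfWeek expression: {expr!r}")
    return days
```

For instance, '1-3' expands to Monday through Wednesday, and '0,6' (as in the example above) expands to Sunday and Saturday.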

By default, the connector tries to start the ingestion within a 3-hour time window from the scheduled start time. If it is not possible to start the ingestion within that time frame, for example, when the connector is ingesting other tables, the current scheduled run is not executed, and the connector attempts to run the ingestion at the next scheduled time frame. You can change the duration of the time window by calling the CONFIGURE_CONNECTOR stored procedure:

CALL CONFIGURE_CONNECTOR('custom_schedule_start_ingestion_window', <window_length>);

where window_length is the length of the window in ISO 8601 duration format. The duration must be a whole number of hours and must last at least 1 hour. For example, 'PT12H' specifies a window that lasts for 12 hours, and 'P2D' specifies a window that lasts for 2 days.
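A window_length value can be checked against these rules before calling the procedure. The following sketch handles only the day and hour designators mentioned in this documentation; it is an illustration, not part of the connector:

```python
import re

def window_length_hours(window_length):
    """Parse an ISO 8601 duration restricted to whole hours, as required
    for custom_schedule_start_ingestion_window ('PT12H', 'P2D', ...).

    Returns the window length in hours, or raises ValueError if the
    value is not a whole-hour duration of at least 1 hour.
    """
    m = re.fullmatch(r"P(?:(\d+)D)?(?:T(\d+)H)?", window_length)
    if not m or (m.group(1) is None and m.group(2) is None):
        raise ValueError(f"not a whole-hour duration: {window_length!r}")
    hours = int(m.group(1) or 0) * 24 + int(m.group(2) or 0)
    if hours < 1:
        raise ValueError("window must last at least 1 hour")
    return hours
```

With this reading, 'PT12H' yields 12 hours and 'P2D' yields 48 hours, while a value such as 'PT30M' is rejected because it is not a whole number of hours.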

If you only enable tables with custom schedules, this configuration affects only the time it takes to create and refresh flattened views for the configured tables. The flattened views are created in the first ingestion cycle after the following conditions are met:

  • Ingestion of metadata tables is finished.

  • Ingestion of the configured table has started.

If email alerts are enabled, Snowflake recommends changing the alert frequency to Once per day when using custom scheduling.

Additionally, the connector exposes the deprecated CONFIGURE_CONNECTOR_TABLES stored procedure with the following arguments:

CALL CONFIGURE_CONNECTOR_TABLES('schedule_interval', '<schedule>', '<table_names>');

Where:

schedule

Specifies the frequency of the synchronization. You can specify one of the following string values:

  • '30m'

  • '1h'

  • '3h'

  • '6h'

  • '12h'

  • '1d'

table_names

Specifies a comma-delimited list of the table names.

For these tables, the schedule that you specify in the schedule parameter overrides the default schedule that you set by calling the CONFIGURE_CONNECTOR('data_ingestion_schedule', 'schedule') stored procedure.

Specifying the data range start time

By default, the Snowflake Connector for ServiceNow® synchronizes all the records in the corresponding ServiceNow tables. For tables with a sys_updated_on or sys_created_on column (referred to here as time columns), you can restrict the range of synchronized data by setting a data range start time, that is, a lower bound for the value of the record's corresponding time column.

With such a configuration, records whose corresponding time column value is older than the data range start timestamp are not ingested. The corresponding time column used by this procedure is determined in the same way as for incremental updates.

To change the data range start time value, call the CONFIGURE_TABLES_RANGE_START stored procedure with the following arguments:

CALL CONFIGURE_TABLES_RANGE_START(<table_names>, <range_start>);

Where:

table_names

Specifies an array of table names for which you want to configure the data range start time.

range_start

Timestamp specifying the data range start time in TIMESTAMP_TZ format or NULL to unset the current value.

Note

You cannot set the data range start time for tables that have neither a sys_updated_on nor a sys_created_on column.

  • If the ingestion of the table has not started yet, the data range start time value is taken into account with the first ingestion.

  • If the ingestion of the table has already started (for example, a reload is in progress), the data range start time value is ignored, and another reload of the table is required to filter out records whose corresponding time column value is too old.

It is therefore recommended to set the data range start time before starting the first ingestion of a table (and therefore before enabling it).

For example, if tables table1 and table2 have the required time columns, to set the data range start time to 2022-11-23 07:00:00 UTC for these two tables, run the following command:

CALL CONFIGURE_TABLES_RANGE_START(['table1', 'table2'], TO_TIMESTAMP_TZ('2022-11-23 07:00:00 +00:00'));

Then:

  • For table table1, if its ingestion has not started yet, records with a corresponding time column value before 2022-11-23 07:00:00 are not ingested.

  • For table table2, if its ingestion has already started, the data range start time value is ignored in all data synchronizations until the table is reloaded. During the reload, records with a corresponding time column value before 2022-11-23 07:00:00 are not ingested.

It is also possible to unset the data range start time. For example, to unset it for table table1, run the following command:

CALL CONFIGURE_TABLES_RANGE_START(['table1'], NULL);

Again, if an ingestion of table table1 has already been started, reloading this table is required to ingest all the records back from ServiceNow.

Note

Loading data while respecting the data range start time may take longer than loading all historical data, due to the lower performance of incremental updates.
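The effect of a data range start time can be illustrated with a small sketch. The record shape and column names here are hypothetical, for illustration only:

```python
from datetime import datetime, timezone

def apply_range_start(records, time_column, range_start):
    """Keep only records whose time column is at or after range_start.

    If range_start is None (unset), all records are ingested, mirroring
    the default behavior of synchronizing every record.
    """
    if range_start is None:
        return list(records)
    return [r for r in records if r[time_column] >= range_start]

# Hypothetical records, keyed by the sys_updated_on time column.
records = [
    {"sys_id": "a1", "sys_updated_on": datetime(2022, 11, 22, tzinfo=timezone.utc)},
    {"sys_id": "b2", "sys_updated_on": datetime(2022, 11, 24, tzinfo=timezone.utc)},
]
start = datetime(2022, 11, 23, 7, tzinfo=timezone.utc)
kept = apply_range_start(records, "sys_updated_on", start)
```

Here only the record updated after 2022-11-23 07:00:00 UTC survives the filter; passing None restores the full set.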

Enabling or Disabling the Synchronization of a Table

To enable the synchronization of data for a specific table in ServiceNow, call the ENABLE_TABLES stored procedure with the following arguments:

CALL ENABLE_TABLES(<tables_to_enable>);

Where:

tables_to_enable

Specifies an array of ServiceNow table names.

Use the table name, not the label displayed in the ServiceNow UI. You can find the table name in the data dictionary tables in ServiceNow. In the ServiceNow UI, go to System Definition » Tables. The Name column displays the name of the table.

For example, to enable the synchronization of the tables named table1, table2, and table3, run the following command:

CALL ENABLE_TABLES(['table1', 'table2', 'table3']);

To disable the synchronization of data for a specific table in ServiceNow, call the DISABLE_TABLES stored procedure with the following arguments:

CALL DISABLE_TABLES(<tables_to_disable>);

Where:

tables_to_disable

Specifies an array of ServiceNow table names.

Use the table name, not the label displayed in the ServiceNow UI. You can find the table name in the data dictionary tables in ServiceNow. In the ServiceNow UI, go to System Definition » Tables. The Name column displays the name of the table.

For example, to disable the synchronization of the tables named table1 and table2, run the following command:

CALL DISABLE_TABLES(['table1', 'table2']);

Disabling the table stops its synchronization. When the table synchronization is re-enabled, ingestion resumes from where it was paused.

Note

Disabling all tables from synchronization does not mean that the Snowflake Connector for ServiceNow® stops generating costs. Some tasks may run in the background, such as those related to notifications.

The connector exposes a deprecated version of the ENABLE_TABLES procedure that takes two arguments:

CALL ENABLE_TABLES('<tables_to_configure>', <enable>);

Where:

tables_to_configure

Specifies a comma-delimited list of ServiceNow table names.

enable

Specifies whether synchronization should be enabled or disabled for these tables. Specify TRUE to enable or FALSE to disable.

This procedure is deprecated and will be removed in a future connector release. Snowflake recommends using ENABLE_TABLES and DISABLE_TABLES with a single argument.

The ENABLE_TABLES and DISABLE_TABLES procedures add the specified table names to the ENABLED_TABLES view.

If you want to add all tables available in ServiceNow to the ENABLED_TABLES view, call the PREFILL_CONFIG_TABLE stored procedure.

Note

In order for you to call this procedure, the ServiceNow user used by the connector must have access to the sys_db_object table.

To call this procedure, run the following command:

CALL PREFILL_CONFIG_TABLE();

This procedure adds all ServiceNow tables to the ENABLED_TABLES view and disables the tables for ingestion by default.

To enable these newly added tables for synchronization:

  1. Run the following command to produce a comma-delimited list of the tables in the ENABLED_TABLES view:

    SELECT LISTAGG(TABLE_NAME, ',') FROM ENABLED_TABLES;
    
  2. In the list returned by this command, remove any tables that should not be synchronized.

  3. Call the ENABLE_TABLES stored procedure, and pass in this list.

If new tables have been added recently to ServiceNow, you can identify the new tables and enable those tables for synchronization by using this same approach (i.e. generating the list of tables, editing the list, and passing the list to the ENABLE_TABLES stored procedure).

Note

The connector does not support rollbacks or delete recoveries in ServiceNow.

Using the rollback and delete recovery features may result in data inconsistencies. Records that are recovered in ServiceNow may still be marked as deleted in Snowflake. To resolve the inconsistency, you can use the RELOAD_TABLE stored procedure.

Enabling the Synchronization of a Table with Column Filtering

If you do not need all columns from a ServiceNow table in Snowflake, you can exclude them. For example, you might want to exclude columns if a single row exceeds the maximum row size of 10 MB.

To enable table ingestion with specified columns, run the following command:

CALL ENABLE_TABLE_WITH_COLUMNS('<table_to_enable>', <include_columns>, <exclude_columns>);

Where:

table_to_enable

Specifies a ServiceNow table name.

include_columns

Specifies an array of column names to be ingested. If sys_id, sys_created_on, and sys_updated_on exist, they are always included, even though they are not mentioned in this array.

exclude_columns

Specifies an array of column names to be excluded from ingestion. You cannot exclude sys_id, sys_created_on, or sys_updated_on columns, as the connector uses them in the ingestion process.

Note

Since columns in ServiceNow are written in lowercase and the API that the connector uses is case-sensitive, the values provided for specified columns must also be in lowercase.

Note

You shouldn’t provide both include_columns and exclude_columns. If you want to list include_columns, you should leave the exclude_columns empty, and vice versa. If both arrays are not empty and there aren’t any conflicting columns, include_columns takes precedence over exclude_columns.

If both include_columns and exclude_columns are empty arrays, all the available columns will be ingested.

For example, given a ServiceNow table named u_table with columns sys_id, sys_updated_on, col_1, and col_2, executing:

CALL ENABLE_TABLE_WITH_COLUMNS('u_table', ['sys_id', 'sys_updated_on'], []);

ingests only the sys_id and sys_updated_on columns for the given table, whereas calling:

CALL ENABLE_TABLE_WITH_COLUMNS('u_table', [], ['col_1']);

ingests sys_id, sys_updated_on, and also col_2.

Note

To use this procedure, the connector must use the ServiceNow user assigned the ServiceNow admin role. Without this role, the procedure will return an authorization error. For more information, see Preparing Your ServiceNow Instance.

The connector validates the provided columns and rejects the enablement request if any of the columns is not available in ServiceNow. Because the ServiceNow API supports only an include mode, the connector transforms the provided column arrays into a list of included columns and sends it with each request to ServiceNow. The resulting URL might be too long for ServiceNow to handle, so the connector validates this limitation when ENABLE_TABLE_WITH_COLUMNS is invoked.

The included-columns configuration for each table can be found in the INCLUDED_COLUMNS column of the ENABLED_TABLES view. To modify the list of ingested columns, you must first disable the table. If column filtering is configured for a table, you can re-enable it only by using the ENABLE_TABLE_WITH_COLUMNS procedure; you cannot use ENABLE_TABLES in this case.

The configuration change does not affect already ingested data. Column filtering applies only to newly ingested records.

Flattened views only include the columns specified when the table was enabled. They are updated every time the list of included columns changes. If no column filtering is configured, a view consists of all the available columns.

Reloading Data in a Table

To reload the data in a particular table, call the RELOAD_TABLE stored procedure:

CALL RELOAD_TABLE('<table_name>');

Where:

table_name

Specifies the name of the table to reload.

When you call the RELOAD_TABLE stored procedure, the connector performs the following steps:

  1. The connector temporarily suspends ingestion of the original table.

    Note

    While the table is being reloaded, you cannot re-enable the table for ingestion.

  2. The connector creates a separate temporary table for ingestion.

  3. The connector ingests the data into this new temporary table. This table is visible in the CONNECTOR_STATS view as a table with a __tmp suffix.

    Note

    Every reload takes the data range start time value into account, which can restrict which records are ingested.

  4. After the data is ingested, the connector replaces the data in the original table with the data in the temporary table.

  5. The connector deletes the temporary table.

  6. The connector re-enables the original table for ingestion.

During this process, you can continue to query the existing data in the original table. However, changes to the data in the ServiceNow table won’t be reflected in the Snowflake table until the ingestion process completes.

To avoid overloading your ServiceNow instance, reload only one table at a time.

Canceling Table Reload

To cancel the process of reloading the data in a table, use the CANCEL_RELOAD_TABLE stored procedure as shown in the following example:

CALL CANCEL_RELOAD_TABLE('<table_name>');

Where:

table_name

Specifies the name of the table whose reload you want to cancel.

When you cancel the reload, the connector drops all temporary objects created during the reload. The table is then available for ingestion as part of the normal synchronization schedule.

Configuring the Size of a Single Page Fetch for a Table

The connector fetches data from a table by dividing it into smaller chunks called pages. Each API request to ServiceNow fetches one page. Due to limitations on the Snowflake and ServiceNow REST APIs, the size of the response from the ServiceNow API cannot exceed 10 MB and the response should be returned in less than one minute.

To account for this, the connector limits the number of rows fetched within a single API request. This limit is the page size.

The connector uses the following process to determine the page size:

  1. Initially, the default page size is set to 10,000 rows.

  2. If a fetch request fails during ingestion because the response exceeds the size limit, the page size is gradually decreased, in steps of 1000, 100, 10, and 1, until the request succeeds or the page size reaches 1.

  3. The successful page size is saved in the connector state, and this value is used by subsequent requests.
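One plausible reading of this back-off, sketched for illustration (the connector's actual implementation may differ):

```python
def next_page_size(page_size):
    """Return the next smaller page size to try after a failed fetch,
    stepping down by 1000, then 100, then 10, then 1, with a floor of 1.

    Illustrative model of the documented back-off, not connector code.
    """
    for step in (1000, 100, 10, 1):
        if page_size > step:
            return page_size - step
    return 1
```

Starting from the default of 10,000 rows, a failing table would be retried at 9,000, 8,000, and so on; once below 1,000, the steps shrink to 100, then 10, then 1, until the request succeeds or the page size bottoms out at 1.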

The current page size for a table is available in the ENABLED_TABLES view. To see the page size, run the following command:

SELECT PAGE_SIZE FROM ENABLED_TABLES WHERE TABLE_NAME = '<table_name>';

Where:

table_name

Specifies the name of the ServiceNow table being ingested.

The process the connector uses to determine the page size can lead to inefficiencies because the process only ever reduces the page size; it never increases it. For example, a table with a single large row can cause the page size to be set to a low value that then persists.

To avoid this situation, you can reset the page size to the default value by calling the RESET_PAGE_SIZE stored procedure as shown in the following example:

CALL RESET_PAGE_SIZE('<table_name>');

Where:

table_name

Specifies the name of the ServiceNow table being ingested.

Ingestion run

Ingestion runs for a given table are triggered according to the configured schedule. A run downloads all the relevant rows from the source table in a loop, divided into the pages described in the previous section.

Initial load and updates

As soon as a page of data is fetched, it is inserted into the corresponding event log table. At this stage, the newly fetched changes are not yet available in the sync table or through flattened views. The connector then issues the next request with updated criteria, repeating this as long as any data is returned. When the ingestion run is complete and there is no more data to fetch from the source table, an asynchronous merge task is triggered. This task takes all the changes inserted into the event log since the last merge and applies them to the sync table. When the merge completes, the data becomes available in the sync table and flattened views.
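The merge step can be illustrated with a simplified in-memory model, where the sync table is keyed by sys_id. The event shape here (a plain dict with an optional 'deleted' flag) is hypothetical:

```python
def merge_into_sync_table(sync_table, event_log, last_merge_idx):
    """Apply all event-log entries added since the last merge to the
    sync table, mimicking the asynchronous merge task.

    `sync_table` maps sys_id to the record's latest values; each event
    in `event_log` carries a sys_id, and deletions are modeled as
    events with 'deleted' set to True. Returns the new merge position.
    """
    for event in event_log[last_merge_idx:]:
        if event.get("deleted"):
            sync_table.pop(event["sys_id"], None)
        else:
            sync_table[event["sys_id"]] = event
    return len(event_log)
```

Because later events for the same sys_id overwrite earlier ones, the sync table always ends up reflecting each record's most recent state, which matches why freshly fetched pages are not visible until the merge completes.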

Truncate and load

In truncate and load mode, a temporary table is created for each ingestion run. Each fetched page of rows is first inserted into this temporary table (the table exists in the internal connector schema and is not available to connector users). At this stage, the newly fetched changes are not yet available in the sync table or through flattened views; they still show the data fetched in the previous run. When the ingestion run is complete and there is no more data available in the source table, the data from the temporary table replaces the existing data in the sync table. All the fetched rows are also added to the event log. Finally, the temporary table is dropped.

Monitoring progress

To check the status of a current or past ingestion run, query the CONNECTOR_STATS view. The status is visible in the STATUS column. The status is set to DONE only after the data was successfully fetched and all the changes were applied to the sync table. While the ingestion is running, or while the merge into (or replacement of rows in) the sync table has not yet completed, the status is RUNNING.

Next steps

After configuring ingestion, perform the steps described in Accessing the ServiceNow® data in Snowflake to view and otherwise access ServiceNow data.