Setting up the Snowflake Connector for PostgreSQL Agent container

Note

The Snowflake Connector for PostgreSQL is subject to the Connector Terms.

The process of configuring the Snowflake Connector for PostgreSQL agent includes the following steps:

After installing the Snowflake Connector for PostgreSQL, you must configure and run the Docker container with the agent.

Prerequisites

Before installing the connector, you must set up the environment where the agent runs.

Configuring your firewall to access to Snowflake

Please see Configuring network requirements.

Installing an orchestration tool

The agent is distributed as a Docker image that you can run using orchestration tools and services like Docker, Kubernetes, or OpenShift.

To run the agent, you must have one of these tools installed. Your environment must have:

  • At least 6 GB of RAM available to the container running the agent. The agent is a memory-intensive application.

  • 4 CPUs available to handle the throughput requirements of the agent. Decreasing the number of CPUs decreases the throughput linearly. Having additional CPUs does not provide significant gains.

The connector requires exactly one instance of the agent application to be running at all times.

Preparing data sources configuration file

To set up the data sources on agent side:

  1. Copy the following json template for data sources configuration and paste it to file:

    {
        "<data_source_name_1>": {
            "url": "jdbc:postgresql://<host>:<port>/<databaseName>[?<key1>=<value1>[&<key2>=<value2>]",
            "username": "<postgresql_db_username>",
            "password": "<postgresql_db_password>",
            "publication": "<postgresql_db_publication>",
            "ssl": false
        },
        "<data_source_name_2>": {
            "url": "jdbc:postgresql://<host>:<port>/<databaseName>[?<key1>=<value1>[&<key2>=<value2>]",
            "username": "<postgresql_db_username>",
            "password": "<postgresql_db_password>",
            "publication": "<postgresql_db_publication>",
            "ssl": false
        },
        "<data_source_name_3>": {
            "url": "jdbc:postgresql://<host>:<port>/<databaseName>[?<key1>=<value1>[&<key2>=<value2>]",
            "username": "<postgresql_db_username>",
            "password": "<postgresql_db_password>",
            "publication": "<postgresql_db_publication>",
            "ssl": false
        }
    }
    
    Copy
  2. Fill the necessary parameters for each data source you want to be available for replication (Template contains randomly selected number of data sources for example purposes).

  3. Give each data source a name consisting only of UPPERCASE letters and numbers.

  4. Make a note of the data source names; you will need them when configuring replication.

  5. File is mounted by the following parameter of docker run command:

    --volume </path/to/datasources/json/file>:/home/agent/datasources.json
    
    Copy

Data source parameters

url

Specifies the PostgreSQL connection URL in the following syntax:

jdbc:postgresql://<host>:<port>/<databaseName>[?<key1>=<value1>[&<key2>=<value2>]
Copy

If databaseName is not specified, it is the same as the username used for replication.

For more information on PostgreSQL JDBC URL formats, see PostgreSQL JDBC documentation.

username

Specifies the user name for the PostgreSQL connection. This user should have REPLICATION permission or be a superuser. For more information on replication security, see PostgreSQL documentation.

password

Specifies the password of the PostgreSQL database user.

publication

Specifies publication name created for the replication of a particular database.

Each data source agent creates a unique replication slot with the name sf_db_conn_rs_KBMd_<data_source_name>. It keeps the WAL position so that the replication can be restored if the new agent continues the work of the previous agent instance. If you decide to permanently remove connector integration you also need to remove the replication slot created by the connector in the source database.

ssl

Specifies if the agent is connected to the PostgreSQL server using SSL. Possible values are: true or false and only those boolean values are acceptable.

If you decide to use SSL certificate you need to pass the PostgreSQL server root.crt to the container and mount them in /home/agent/.postgresql/root.crt path.

For example if you use docker you can add to your run command:

-v /path/to/source/root_server.crt:/home/agent/.postgresql/root.crt
Copy

If you use docker-compose add

volumes:
/path/to/source/root_server.crt:/home/agent/.postgresql/root.crt
Copy

If you want to configure the SSL connection in specific mode you can always pass all parameters to the url in the format described here. For example

jdbc:postgresql://localhost:5432/database?ssl=true&sslmode=verify-full&sslrootcert=/Users/username/.postgresql/root.crt
Copy

Running the Docker container of the agent

Note

When using Docker to manage the image for the agent application, consider using the --restart flag with the appropriate restart policy. This is important to ensure that the agent application is always running.

The following are the general procedures for running the agent using Docker:

  1. Prepare json file with snowflake connection configuration downloaded from wizard.

  2. Prepare json file with data sources configuration.

  3. Run the container:

    docker run -d \
       --restart unless-stopped \
       --name agent-app-instance \
       --volume </path/to/ssh/keys/directory>:/home/agent/.ssh \
       --volume </path/to/snowflake/json/file>:/home/agent/snowflake.json \
       --volume </path/to/datasources/json/file>:/home/agent/datasources.json \
       -m 6g \
       snowflakedb/database-connector-agent:latest
    
    Copy

    Attention

    If you already pulled the latest version, it is possible that you’ll need to pull it again by typing docker pull snowflakedb/database-connector-agent:latest to be sure you’re running the the latest version from dockerhub rather than the cached one.

Managing Key Generation and User Assignment

Whenever you download a configuration json file from the Snowflake wizard, the user is provided with a newly generated temporary key pair. On the initial agent run, a different set of keys is generated into the directory mounted using the -- volume parameter:

--volume </path/to/ssh/keys/directory>:/home/agent/.ssh
Copy

The Agent alters its own authentication method and sets it to the generated public key. On subsequent agent runs, the generated keys should be reused by the agent container by mounting the same directory specified as the prior --volume.

Optional agent container environment variables

Agent provides additional options that can be controlled via environment variables. They are all optional.

SNOWFLAKE_PRIVATEKEYPASSWORD

Specifies the password to the private key file. It can be added if you want the private key be generated with a password or if you are reusing keys that were generated with a password.

SNOWFLAKE_ENFORCEDURL

Specifies the URL to connect to Snowflake, which overrides the automatically specified URL. which can be used to connect the agent to Snowflake via privatelink.

JAVA_OPTS

Specifies the Java options or system properties passed to the container. Separate each property with a space.

Note the following:

  • Use the -Xmx option to set the maximum Java heap size. Snowflake recommends setting this value to the amount of memory available to the container minus 1GB.

    For example, if the container has 6GB available, set the following:

    JAVA_OPTS=-Xmx5g
    
    Copy
  • To configure the proxy that allows the agent to connect to Snowflake, set the following system properties:

    JAVA_OPTS=-Dhttp.useProxy=true -Dhttp.proxyHost=<proxy-host> -Dhttp.proxyPort=<proxy-port>
    
    Copy
  • To bypass the proxy for one or more IP addresses or hosts, set the http.nonProxyHosts system property to the list of hosts:

    • Use a pipe symbol (|) to separate the host names.

    • To specify hostnames that match a pattern, use an asterisk (*) as a wildcard character.

      JAVA_OPTS=-Dhttp.useProxy=true -Dhttp.proxyHost=<proxy-host> -Dhttp.proxyPort=<proxy-port>
      -Dhttp.nonProxyHosts='*.my_company.com|localhost|myorganization-myaccount.snowflakecomputing.com|192.168.91.*'
      
      Copy
  • To pass credentials to the proxy server, set the http.proxyUser and http.proxyPassword system properties.

    JAVA_OPTS=-Dhttp.useProxy=true -Dhttp.proxyHost=<proxy-host> -Dhttp.proxyPort=<proxy-port>
    -Dhttp.proxyUser=<proxy-user> -Dhttp.proxyPassword=<proxy-pass>
    
    Copy

Next steps

After completing these procedures, follow the steps in Configuring replication for the Snowflake Connector for PostgreSQL.