Setting up the Snowflake Connector for PostgreSQL Agent container¶
Note
The Snowflake Connector for PostgreSQL is subject to the Connector Terms.
After installing the Snowflake Connector for PostgreSQL, you must configure and run the Docker container with the agent. The process of configuring the agent includes the following steps:
Prerequisites¶
Before installing the connector, you must set up the environment where the agent runs.
Configuring your firewall to access Snowflake¶
See Configuring network requirements.
Installing an orchestration tool¶
The agent is distributed as a Docker image that you can run using orchestration tools and services like Docker, Kubernetes, or OpenShift.
To run the agent, you must have one of these tools installed. Your environment must have:
At least 6 GB of RAM available to the container running the agent. The agent is a memory-intensive application.
4 CPUs available to handle the throughput requirements of the agent. Decreasing the number of CPUs decreases the throughput linearly. Having additional CPUs does not provide significant gains.
The connector requires exactly one instance of the agent application to be running at all times.
Preparing data sources configuration file¶
To set up the data sources on the agent side:
Copy the following JSON template for the data sources configuration and paste it into a file:
{
  "<data_source_name_1>": {
    "url": "jdbc:postgresql://<host>:<port>/<databaseName>[?<key1>=<value1>[&<key2>=<value2>]]",
    "username": "<postgresql_db_username>",
    "password": "<postgresql_db_password>",
    "publication": "<postgresql_db_publication>",
    "ssl": false
  },
  "<data_source_name_2>": {
    "url": "jdbc:postgresql://<host>:<port>/<databaseName>[?<key1>=<value1>[&<key2>=<value2>]]",
    "username": "<postgresql_db_username>",
    "password": "<postgresql_db_password>",
    "publication": "<postgresql_db_publication>",
    "ssl": false
  },
  "<data_source_name_3>": {
    "url": "jdbc:postgresql://<host>:<port>/<databaseName>[?<key1>=<value1>[&<key2>=<value2>]]",
    "username": "<postgresql_db_username>",
    "password": "<postgresql_db_password>",
    "publication": "<postgresql_db_publication>",
    "ssl": false
  }
}
Fill in the required parameters for each data source that you want to make available for replication. (The template contains three data sources for example purposes; add or remove entries as needed.)
Give each data source a name consisting only of UPPERCASE letters and numbers.
Make a note of the data source names; you will need them when configuring replication.
The file is mounted into the container with the following parameter of the docker run command:
--volume </path/to/datasources/json/file>:/home/agent/datasources.json
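For example, a completed file for a single data source might look like the following. All values here (data source name, host, database, user, password, publication) are illustrative placeholders, not values from your environment:

```json
{
  "PSQLDS1": {
    "url": "jdbc:postgresql://example-host:5432/exampledb",
    "username": "replication_user",
    "password": "example-password",
    "publication": "example_publication",
    "ssl": false
  }
}
```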
Data source parameters¶
url
Specifies the PostgreSQL connection URL with the following syntax:
jdbc:postgresql://<host>:<port>/<databaseName>[?<key1>=<value1>[&<key2>=<value2>]]
If databaseName is not specified, it defaults to the username used for replication. For more information on PostgreSQL JDBC URL formats, see the PostgreSQL JDBC documentation.
username
Specifies the user name for the PostgreSQL connection. This user must have the REPLICATION permission or be a superuser. For more information on replication security, see the PostgreSQL documentation.
password
Specifies the password of the PostgreSQL database user.
publication
Specifies the name of the publication created for the replication of a particular database.
For each data source, the agent creates a unique replication slot named sf_db_conn_rs_KBMd_<data_source_name>. The slot keeps the WAL position so that replication can be resumed when a new agent instance continues the work of a previous one. If you decide to permanently remove the connector integration, you must also remove the replication slot created by the connector in the source database.
ssl
Specifies whether the agent connects to the PostgreSQL server using SSL. The only acceptable values are the boolean values true and false.
If you decide to use an SSL certificate, you must pass the PostgreSQL server root.crt to the container and mount it at the /home/agent/.postgresql/root.crt path. For example, if you use Docker, you can add the following to your run command:
-v /path/to/source/root_server.crt:/home/agent/.postgresql/root.crt
If you use docker-compose, add:
volumes: /path/to/source/root_server.crt:/home/agent/.postgresql/root.crt
If you want to configure the SSL connection in a specific mode, you can always pass all parameters in the url using the format described here. For example:
jdbc:postgresql://localhost:5432/database?ssl=true&sslmode=verify-full&sslrootcert=/Users/username/.postgresql/root.crt
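The parameter rules above lend themselves to a quick pre-flight check before you mount the file into the container. The following is a minimal sketch, not part of the connector: the validate_datasources helper and the example content are hypothetical, and in practice you would load the real file with json.load.

```python
import re

# Keys that every data source entry must contain, per the parameter list above.
REQUIRED_KEYS = {"url", "username", "password", "publication", "ssl"}

def validate_datasources(config: dict) -> list:
    """Return a list of human-readable problems found in the configuration."""
    problems = []
    for name, source in config.items():
        # Data source names may contain only uppercase letters and digits.
        if not re.fullmatch(r"[A-Z0-9]+", name):
            problems.append(f"data source name {name!r} must contain only uppercase letters and digits")
        missing = REQUIRED_KEYS - source.keys()
        if missing:
            problems.append(f"{name}: missing keys {sorted(missing)}")
        # Only the boolean values true and false are acceptable for "ssl".
        if not isinstance(source.get("ssl"), bool):
            problems.append(f"{name}: 'ssl' must be the boolean true or false")
        if not source.get("url", "").startswith("jdbc:postgresql://"):
            problems.append(f"{name}: 'url' must start with jdbc:postgresql://")
    return problems

# Hypothetical example; in practice: config = json.load(open("datasources.json"))
example = {
    "PSQLDS1": {
        "url": "jdbc:postgresql://example-host:5432/exampledb",
        "username": "replication_user",
        "password": "example-password",
        "publication": "example_publication",
        "ssl": False,
    }
}
assert validate_datasources(example) == []
```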
Running the Docker container of the agent¶
Note
When using Docker to manage the image for the agent application, consider using the --restart
flag
with the appropriate restart policy. This is important to ensure that the agent application
is always running.
The following are the general procedures for running the agent using Docker:
Prepare the JSON file with the Snowflake connection configuration downloaded from the wizard.
Prepare the JSON file with the data sources configuration.
Run the container:
docker run -d \
  --restart unless-stopped \
  --name agent-app-instance \
  --volume </path/to/ssh/keys/directory>:/home/agent/.ssh \
  --volume </path/to/snowflake/json/file>:/home/agent/snowflake.json \
  --volume </path/to/datasources/json/file>:/home/agent/datasources.json \
  -m 6g \
  snowflakedb/database-connector-agent:latest
Attention
If you previously pulled the image, you might need to pull it again by typing
docker pull snowflakedb/database-connector-agent:latest
to be sure you're running the latest version from Docker Hub rather than a cached one.
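If you manage the container with docker-compose instead, an equivalent minimal sketch might look like the following. The host paths are placeholders, and mem_limit assumes you run compose without swarm mode:

```yaml
services:
  agent:
    image: snowflakedb/database-connector-agent:latest
    container_name: agent-app-instance
    restart: unless-stopped
    mem_limit: 6g   # matches the -m 6g flag of the docker run command
    volumes:
      - /path/to/ssh/keys/directory:/home/agent/.ssh
      - /path/to/snowflake/json/file:/home/agent/snowflake.json
      - /path/to/datasources/json/file:/home/agent/datasources.json
```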
Managing Key Generation and User Assignment¶
Whenever you download a configuration JSON file from the Snowflake wizard, you are provided with a newly generated temporary key pair.
On the initial agent run, a different set of keys is generated in the directory mounted using the --volume parameter:
--volume </path/to/ssh/keys/directory>:/home/agent/.ssh
The agent alters its own authentication method and sets it to the generated public key.
On subsequent agent runs, the agent container should reuse the generated keys by mounting the same directory specified in the prior --volume parameter.
Optional agent container environment variables¶
The agent provides additional options that can be controlled via environment variables. They are all optional.
SNOWFLAKE_PRIVATEKEYPASSWORD
Specifies the password for the private key file. Add it if you want the private key to be generated with a password, or if you are reusing keys that were generated with a password.
SNOWFLAKE_ENFORCEDURL
Specifies the URL used to connect to Snowflake, overriding the automatically determined URL. This can be used to connect the agent to Snowflake via PrivateLink.
JAVA_OPTS
Specifies the Java options or system properties passed to the container. Separate each property with a space.
Note the following:
Use the -Xmx option to set the maximum Java heap size. Snowflake recommends setting this value to the amount of memory available to the container minus 1 GB. For example, if the container has 6 GB available, set the following:
JAVA_OPTS=-Xmx5g
To configure the proxy that allows the agent to connect to Snowflake, set the following system properties:
JAVA_OPTS=-Dhttp.useProxy=true -Dhttp.proxyHost=<proxy-host> -Dhttp.proxyPort=<proxy-port>
To bypass the proxy for one or more IP addresses or hosts, set the http.nonProxyHosts system property to the list of hosts:
Use a pipe symbol (|) to separate the host names.
To specify host names that match a pattern, use an asterisk (*) as a wildcard character.
JAVA_OPTS=-Dhttp.useProxy=true -Dhttp.proxyHost=<proxy-host> -Dhttp.proxyPort=<proxy-port> -Dhttp.nonProxyHosts='*.my_company.com|localhost|myorganization-myaccount.snowflakecomputing.com|192.168.91.*'
To pass credentials to the proxy server, set the http.proxyUser and http.proxyPassword system properties:
JAVA_OPTS=-Dhttp.useProxy=true -Dhttp.proxyHost=<proxy-host> -Dhttp.proxyPort=<proxy-port> -Dhttp.proxyUser=<proxy-user> -Dhttp.proxyPassword=<proxy-pass>
Using privatelink connection to Snowflake¶
By default, the agent does not use a PrivateLink connection. To connect the agent to Snowflake via PrivateLink, pass the SNOWFLAKE_ENFORCEDURL environment variable to the agent container, set to the PrivateLink URL.
For a complete list of all optional environment variables, see Optional agent container environment variables.
Next steps¶
After completing these procedures, follow the steps in Configuring replication for the Snowflake Connector for PostgreSQL.