How to Connect to Databricks SQL in Precog
This guide walks through setting up Databricks SQL as a destination in Precog. The destination supports Databricks deployments on AWS, Azure, Google Cloud, SAP, and other platforms.
Precog setup
- Log in to your Precog account.
- Add a new data pipeline and select Databricks SQL as the destination.
- Follow the directions in Precog and enter the required information in the configuration form.
Databricks SQL setup
Gather the following information from your Databricks instance, then enter it into the Precog destination configuration form.
- Find the Workspace Instance Name
Log in to the Databricks instance that you want to load data into. An instance name is assigned to each Databricks deployment, and customers typically have separate instances for development, staging, and production environments. The instance name is the hostname of the URL you use to log in to your Databricks deployment. For example, for the URL https://cust-success.cloud.databricks.com/, the instance name is cust-success.cloud.databricks.com. A sketch for extracting this value programmatically appears after this list.
You can find more information in the Get identifiers for workspace objects | Databricks documentation.
- Find the SQL Warehouse ID
While still logged in to your Databricks workspace, click SQL > SQL Warehouses in the sidebar, then click the target warehouse's name. On the Connection Details tab, find the HTTP path. The Warehouse ID is the string of letters and numbers following /sql/1.0/warehouses/ in the HTTP path field for your warehouse (see the sketch after this list).
You can find more information in the Get connection details for a Databricks compute resource documentation.
- Generate a Personal Access Token
Both user and service principal personal access tokens (PATs) are supported, though Precog recommends creating a service principal. You can find more information in the Service principals | Databricks documentation. To generate a service principal PAT, you need to be a workspace admin; follow the instructions in the Databricks personal access token authentication documentation. A connection-test sketch using a PAT appears after this list.
- Identify the catalog where schemas and tables should be created. If unspecified, the default catalog for the workspace is used.
Each workspace that uses Unity Catalog has a default catalog configured. If Unity Catalog was automatically enabled for your workspace, the pre-provisioned workspace catalog is set as the default. This default catalog allows you to run data operations without having to specify a catalog name. Workspace admins can change the default catalog if needed.
You can find more information in the What are catalogs | Databricks documentation.
- Enter all the above information into the Precog destination configuration form. Click on the “Create destination” button to connect Databricks SQL as a destination.
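The sketches below are optional helpers, not part of the Precog setup itself. First, since the Workspace Instance Name is simply the hostname of your workspace URL, this minimal Python sketch (using the example URL from above) extracts it with the standard library:

```python
from urllib.parse import urlparse

# Example workspace URL from the step above; replace with your own.
workspace_url = "https://cust-success.cloud.databricks.com/"

# The instance name is the hostname portion of the login URL.
instance_name = urlparse(workspace_url).hostname
print(instance_name)  # cust-success.cloud.databricks.com
```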
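Likewise, the Warehouse ID can be read off the HTTP path shown on the Connection Details tab. A minimal sketch, assuming the path follows the standard /sql/1.0/warehouses/<id> form (the ID below is a made-up placeholder):

```python
# HTTP path copied from the warehouse's Connection Details tab;
# the trailing ID is a placeholder, not a real warehouse.
http_path = "/sql/1.0/warehouses/abc123def456"

prefix = "/sql/1.0/warehouses/"
if not http_path.startswith(prefix):
    raise ValueError(f"Unexpected HTTP path format: {http_path}")

warehouse_id = http_path[len(prefix):]
print(warehouse_id)  # abc123def456
```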
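Finally, before clicking “Create destination” you can sanity-check that the instance name, HTTP path, PAT, and default catalog all line up. A minimal sketch, assuming the databricks-sql-connector package is installed (pip install databricks-sql-connector); all values are placeholders to replace with your own:

```python
from databricks import sql  # pip install databricks-sql-connector

# Placeholder values; substitute the identifiers gathered above.
SERVER_HOSTNAME = "cust-success.cloud.databricks.com"
HTTP_PATH = "/sql/1.0/warehouses/abc123def456"
ACCESS_TOKEN = "dapi..."  # your personal access token (keep it secret)

# Open a session on the SQL warehouse and report the catalog that
# queries will use when no catalog is specified.
with sql.connect(
    server_hostname=SERVER_HOSTNAME,
    http_path=HTTP_PATH,
    access_token=ACCESS_TOKEN,
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT current_catalog()")
        print("Default catalog:", cursor.fetchone()[0])
```

If this prints the catalog you expect, the same values should work in the Precog destination configuration form.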