Databricks

Euno's Databricks integration supports auto-discovery of:

  • Databricks workspaces

  • Unity Catalog databases (catalogs)

  • Unity Catalog schemas

  • Unity Catalog tables and views

  • Table and view SQL definitions

  • Tags and metadata at all levels

Setting up Databricks integration

Overview

Databricks is a unified analytics platform that combines data engineering, data science, and business analytics. With Unity Catalog, Databricks provides a unified governance solution for data and AI assets across clouds and platforms.

In the Unity Catalog data model, data is organized in a three-level namespace: Catalog → Schema → Table/View. Each catalog contains multiple schemas, and each schema contains multiple tables and views.
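The three-level namespace can be illustrated with a small helper that splits a fully qualified name into its parts. This is a minimal sketch: `TableRef` is a hypothetical name, not part of Euno or Databricks, and the parser assumes that no part of the name contains a dot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    """A fully qualified Unity Catalog name: catalog.schema.table (hypothetical helper)."""

    catalog: str
    schema: str
    name: str

    @classmethod
    def parse(cls, qualified: str) -> "TableRef":
        # Assumption: parts contain no dots, so a plain split is enough.
        parts = qualified.split(".")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected catalog.schema.table, got {qualified!r}")
        return cls(*parts)

    def quoted(self) -> str:
        # Backtick-quote each part, doubling any embedded backtick.
        parts = (self.catalog, self.schema, self.name)
        return ".".join("`" + p.replace("`", "``") + "`" for p in parts)


ref = TableRef.parse("main.sales.orders")
print(ref.catalog, ref.schema, ref.name)
print(ref.quoted())
```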

To discover Databricks resources, Euno connects to your Databricks SQL endpoint and queries the Unity Catalog system tables (system.information_schema.*). These system views provide comprehensive metadata about all catalogs, schemas, tables, views, tags, and governance information in your Databricks workspace.

Euno automatically excludes system catalogs (system, samples) and allows you to configure which user catalogs to discover using regular expression patterns.
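To give a feel for the kind of metadata query involved, the sketch below builds a query against `system.information_schema.tables` that skips the system catalogs. It is an illustration only, not Euno's actual query. The helper names are hypothetical, and `fetch_tables` accepts any DB-API style connection, for example one opened with the `databricks-sql-connector` package.

```python
SYSTEM_CATALOGS = ("system", "samples")


def build_tables_query(excluded=SYSTEM_CATALOGS) -> str:
    """Build a query listing tables and views outside the excluded catalogs."""
    for name in excluded:
        # Only simple names are inlined; anything else is rejected up front.
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unexpected catalog name: {name!r}")
    excluded_list = ", ".join(f"'{name}'" for name in excluded)
    return (
        "SELECT table_catalog, table_schema, table_name, table_type "
        "FROM system.information_schema.tables "
        f"WHERE table_catalog NOT IN ({excluded_list}) "
        "ORDER BY table_catalog, table_schema, table_name"
    )


def fetch_tables(connection):
    """Run the query on a DB-API style connection and return all rows."""
    with connection.cursor() as cursor:
        cursor.execute(build_tables_query())
        return cursor.fetchall()


print(build_tables_query())
```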

Step 1: Prepare Databricks Workspace

Ensure a Databricks SQL endpoint is available

  1. Verify that your Databricks workspace has Unity Catalog enabled

  2. Ensure you have a SQL endpoint or SQL warehouse running in your workspace

  3. Note the workspace hostname (e.g., dbc-xxxxxxxx-xxxx.cloud.databricks.com)

  4. Note the HTTP path of your SQL endpoint (e.g., /sql/1.0/warehouses/warehouse-id)
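Before entering the hostname and HTTP path in Euno, it can help to sanity-check them locally. The sketch below is one possible approach, with hypothetical helper names: it normalizes the hostname, checks that the HTTP path has the same shape as the example above, and optionally opens a test connection with the `databricks-sql-connector` package (assumed to be installed separately).

```python
import re

# Assumption: the HTTP path follows the shape of the example above.
WAREHOUSE_PATH = re.compile(r"^/sql/1\.0/warehouses/[^/\s]+$")


def normalize_hostname(value: str) -> str:
    """Strip a leading scheme and trailing slashes, leaving the bare hostname."""
    host = re.sub(r"^https?://", "", value.strip(), flags=re.IGNORECASE)
    return host.rstrip("/")


def looks_like_warehouse_path(value: str) -> bool:
    """Return True if the value looks like /sql/1.0/warehouses/<id>."""
    return WAREHOUSE_PATH.match(value.strip()) is not None


def check_connection(hostname: str, http_path: str, token: str) -> bool:
    """Open a connection and run a trivial query (needs network access)."""
    from databricks import sql  # pip install databricks-sql-connector

    with sql.connect(
        server_hostname=normalize_hostname(hostname),
        http_path=http_path,
        access_token=token,
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            return cursor.fetchone()[0] == 1


print(normalize_hostname("https://dbc-xxxxxxxx-xxxx.cloud.databricks.com/"))
print(looks_like_warehouse_path("/sql/1.0/warehouses/warehouse-id"))
```

`check_connection` is not called in the example because it requires a real workspace and token.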

Create a Databricks personal access token

For secure authentication, create a personal access token with appropriate permissions:

  1. In your Databricks workspace, go to Settings → Developer → Access tokens

  2. Click Generate new token

  3. Set an appropriate comment (e.g., "Euno integration")

  4. Set the token lifetime (or leave blank for no expiration)

  5. Click Generate

  6. Important: Copy the token immediately and store it securely; Databricks does not display it again after you close the dialog

Required Permissions: The user creating the token must have:

  • Access to the Databricks workspace

  • Permission to use the SQL endpoint/warehouse

  • SELECT permissions on Unity Catalog system tables

  • Access to the catalogs, schemas, and tables you want to discover
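A workspace administrator could grant catalog access with Unity Catalog `GRANT` statements along the lines sketched below. The exact set of privileges Euno needs is not specified here, so `USE CATALOG`, `USE SCHEMA` and `SELECT` are an assumption, as are the helper names and the example principal. Review the generated statements before running them.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote an identifier or principal, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def grant_statements(principal: str, catalogs: list) -> list:
    """Generate catalog-level grants; privileges are inherited by child objects."""
    # Assumption: these three privileges are sufficient for read-only discovery.
    privileges = ("USE CATALOG", "USE SCHEMA", "SELECT")
    return [
        f"GRANT {privilege} ON CATALOG {quote_ident(catalog)} TO {quote_ident(principal)}"
        for catalog in catalogs
        for privilege in privileges
    ]


for statement in grant_statements("euno-service@example.com", ["main"]):
    print(statement + ";")
```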

Step 2: Configure Euno's Databricks Integration

  • Server Hostname: The hostname of your Databricks workspace (e.g., dbc-xxxxxxxx-xxxx.cloud.databricks.com)

  • HTTP Path: The HTTP path of your Databricks SQL endpoint (e.g., /sql/1.0/warehouses/warehouse-id)

  • Access Token: Personal access token for authentication

  • Workspace Name: Custom name for the workspace in Euno (optional, defaults to hostname)

Step 3: Schedule

  • Enable the Schedule option.

  • Choose:

    1. Weekly: Set specific days and times.

    2. Hourly: Define the interval in hours (e.g., every 8 hours).
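As a rough illustration of the hourly option, the sketch below lists the run times produced by a fixed interval. How Euno anchors the first run is not documented here, so the start time and the `next_runs` helper are assumptions made for the example.

```python
from datetime import datetime, timedelta


def next_runs(start: datetime, interval_hours: int, count: int) -> list:
    """Return the next `count` run times, spaced `interval_hours` apart."""
    if interval_hours < 1:
        raise ValueError("interval_hours must be at least 1")
    return [start + timedelta(hours=interval_hours * i) for i in range(1, count + 1)]


# Every 8 hours, starting from midnight on an arbitrary example date.
for run in next_runs(datetime(2024, 1, 1, 0, 0), interval_hours=8, count=3):
    print(run.isoformat())
```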

Step 4: Resource Cleanup

To keep your data relevant and free of outdated resources, Euno provides automatic resource cleanup options. These settings determine when a resource is removed once a source integration no longer detects it:

  • Immediate Cleanup: Remove resources not detected in the most recent successful source integration run.

  • No Cleanup: Keep all resources indefinitely, even if they are no longer detected.

For a detailed explanation of Euno's cleanup strategies, see: Resource Sponsorship in Euno.
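The difference between the two options can be sketched with plain sets. This is a simplified model, not Euno's implementation: the function name, the strategy labels and the resource names are hypothetical, and the function is assumed to apply only after a successful run.

```python
def resources_after_run(known: set, detected: set, strategy: str) -> set:
    """Return the resources kept after a successful integration run."""
    if strategy == "immediate":
        # Anything missing from the latest successful run is removed.
        return set(detected)
    if strategy == "none":
        # Nothing is ever removed; newly detected resources are added.
        return set(known) | set(detected)
    raise ValueError(f"unknown cleanup strategy: {strategy!r}")


known = {"main.sales.orders", "main.sales.returns"}
detected = {"main.sales.orders", "main.sales.customers"}

print(sorted(resources_after_run(known, detected, "immediate")))
print(sorted(resources_after_run(known, detected, "none")))
```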

Step 5: Advanced Settings

Expand the 'Advanced' section to display the following additional settings.

  • Override Base URI: Custom base URI to use in resource URIs (defaults to server hostname)

  • Database Pattern: Regular expressions that select which databases (catalogs) to discover. An allow pattern includes matching databases and a deny pattern excludes them; the pattern ".*" matches every database. System databases (such as 'system' and 'samples') are always excluded.

Examples of Database Patterns:

  • .* - Include all user databases

  • production_.* - Include only databases starting with "production_"

  • .* (allow) and test_.* (deny) - Include all databases except those starting with "test_"
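The pattern examples above can be tried out with a short script. This sketch assumes that patterns must match the whole database name, that matching is case-sensitive, and that a deny match overrides an allow match. Euno's exact matching rules may differ, and `is_discovered` is a hypothetical helper.

```python
import re

SYSTEM_CATALOGS = frozenset({"system", "samples"})


def is_discovered(name: str, allow: str = ".*", deny=None) -> bool:
    """Decide whether a database passes the allow/deny patterns."""
    if name in SYSTEM_CATALOGS:
        return False  # system databases are always excluded
    if deny is not None and re.fullmatch(deny, name):
        return False  # assumption: deny takes precedence over allow
    return re.fullmatch(allow, name) is not None


databases = ["production_sales", "test_sales", "analytics", "system"]

print([d for d in databases if is_discovered(d)])
print([d for d in databases if is_discovered(d, allow=r"production_.*")])
print([d for d in databases if is_discovered(d, allow=r".*", deny=r"test_.*")])
```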

Step 6: Save Configuration

Click the Test & Save button to complete the setup. Euno will validate the connection and permissions before saving.

Discovered Resources

The Euno-Databricks integration discovers various resources including Databricks workspaces, Unity Catalog databases, schemas, tables, views, and their associated metadata. For detailed information about discovered resources and their properties, see the Databricks integration discovered resources page.
