dbt Core

Euno's dbt Core integration supports auto-discovery of dbt resources by processing artifacts uploaded from completed dbt runs. This enables seamless data synchronization and analysis based on the latest available dbt job runs.

How It Works

The integration follows these steps:

  1. Upload the Artifacts: Euno outputs a URL for uploading dbt artifacts. The following artifacts can then be uploaded to Euno's endpoint:

    • run_results.json

    • manifest.json

    • semantic_manifest.json

    • catalog.json

  2. Process the Artifacts

    • The integration processes the uploaded artifacts to extract relevant information and adds the discovered resources to Euno's data model.

Setting up Euno's dbt Core Integration

Configure a New dbt Core Source in Euno

Step 1: Access the Sources Page

  1. Navigate to the Sources page in the Euno application.

  2. Click on the Add New Source button.

Step 2: General Configuration

  1. Name: Enter a name for your dbt Core source (e.g., "dbt - Marketing Models").

  2. Configuration Details:

    1. Build target: The default build target to use when observing dbt resources. Consult the guidance below for the warehouse technology your dbt project runs against.

Snowflake

The build target should follow the pattern: <account_id>.<region>.<cloud_provider>.snowflakecomputing.com

For example:

  • foo-1234.us-east-1.aws.snowflakecomputing.com

  • bar-5678.us-west-2.gcp.snowflakecomputing.com

Trino

The build target should follow the pattern: trino.<region>.<cloud_provider>

For example:

  • trino.us-west-2.aws

  • trino.us-east-2.gcp

Databricks

The build target should be the hostname of the Databricks workspace.

For example:

  • 4200747832468935.5.gcp.databricks.com

  • dbc-50e7cad0-c2f9.cloud.databricks.com

  • adb-5555555555555555.19.azuredatabricks.net

BigQuery

The build target should simply be bigquery.
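As a sanity check, the build-target formats above can be sketched as regular expressions. These patterns are illustrative assumptions derived from the examples, not Euno's actual validation logic:

```python
import re

# Illustrative patterns inferred from the examples above; not an official validator.
BUILD_TARGET_PATTERNS = {
    "snowflake": re.compile(r"^[\w-]+\.[\w-]+\.(aws|gcp|azure)\.snowflakecomputing\.com$"),
    "trino": re.compile(r"^trino\.[\w-]+\.(aws|gcp|azure)$"),
    "databricks": re.compile(r"^[\w.-]+\.(databricks\.com|azuredatabricks\.net)$"),
    "bigquery": re.compile(r"^bigquery$"),
}

def is_valid_build_target(warehouse: str, target: str) -> bool:
    """Return True if `target` matches the expected shape for `warehouse`."""
    pattern = BUILD_TARGET_PATTERNS.get(warehouse.lower())
    return bool(pattern and pattern.match(target))
```

For example, `is_valid_build_target("snowflake", "foo-1234.us-east-1.aws.snowflakecomputing.com")` returns True, while a bare hostname in the wrong field would be rejected.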

Step 3: Resource Cleanup Options

To keep your data relevant and free of outdated resources, Euno provides automatic resource cleanup options. These settings determine when a resource should be removed if it is no longer detected by a source integration. For a detailed explanation see: Resource Sponsorship in Euno.

  • Time-Based Cleanup (default): Remove resources that were last detected X days before the most recent successful source integration run (user-defined X, default is 7 days).

  • Immediate Cleanup: Remove resources not detected in the most recent successful source integration run.

  • No Cleanup: Keep all resources indefinitely, even if they are no longer detected.
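The three cleanup policies can be sketched as a single decision function. The function name, signature, and policy labels here are assumptions for illustration, not Euno's actual implementation:

```python
from datetime import datetime, timedelta

def should_remove(last_detected: datetime, last_successful_run: datetime,
                  policy: str = "time_based", days: int = 7) -> bool:
    """Illustrative sketch of the cleanup policies described above."""
    if policy == "no_cleanup":
        # Keep all resources indefinitely
        return False
    if policy == "immediate":
        # Remove anything not seen in the most recent successful run
        return last_detected < last_successful_run
    # time_based (default): remove if last detected more than `days` days
    # before the most recent successful run
    return last_detected < last_successful_run - timedelta(days=days)
```

With the default 7-day window, a resource last detected 9 days before the latest successful run is removed, while one detected 5 days before is kept.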

Step 4: Advanced Settings (Optional)

Click on the 'Advanced' section to display these additional configurations.

The available configurations are described below.

Allow Processing Builds with a Partial Catalog

By default, Euno only observes dbt resources (e.g., models, sources, snapshots, and seeds) that have a corresponding entry in the catalog.json file. Checking this box will expand the scope to include all resources listed in the manifest.json file, even if they do not appear in the catalog.json file.

Note: Resources without a matching entry in the catalog.json file will not have schema information available, as this data is exclusively pulled from the catalog. By default, the integration processes only builds with a complete, error-free catalog.json.
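The catalog filter described above can be sketched as follows. The dictionary shapes are simplified assumptions about the manifest.json and catalog.json contents, which both keep resources under "nodes" and "sources" keys:

```python
def observed_resources(manifest: dict, catalog: dict,
                       allow_partial: bool = False) -> list[str]:
    """Illustrative sketch: which resource IDs would be observed."""
    manifest_ids = set(manifest.get("nodes", {})) | set(manifest.get("sources", {}))
    if allow_partial:
        # Partial-catalog mode: take everything listed in the manifest
        return sorted(manifest_ids)
    # Default mode: only resources that also have a catalog entry
    catalog_ids = set(catalog.get("nodes", {})) | set(catalog.get("sources", {}))
    return sorted(manifest_ids & catalog_ids)
```

In default mode a model missing from catalog.json is skipped entirely; in partial mode it is included, but without schema information.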

Source Repository URL

The URL of the git repository where the dbt project is stored

Source Repository branch

The branch of the git repository where the dbt project is stored

Relative directory of the dbt project

Subdirectory within the git repository where the dbt project is stored

Mapping

Euno ingests dbt resources into the database and schema stated in the manifest file, unless a database.schema mapping is added. In that case, the resource is ingested into the database and schema given by the mapping's target. For example:

  • source: analytics_v2, target: analytics maps `analytics_v2.analytics` to `analytics.analytics`

  • source: r'/^(hive_metastore\..+)_spark$/', target: analytics maps `hive_metastore.some_schema_spark` to `hive_metastore.some_schema`
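The two mapping styles can be sketched as below. The exact semantics (a literal source swapping the database component, and a regex source replacing the full name with its first capture group) are assumptions inferred from the examples, not Euno's actual implementation:

```python
import re

def apply_mapping(db_schema: str, source: str, target: str) -> str:
    """Illustrative sketch of applying a database.schema mapping."""
    if source.startswith("r'/"):
        # Regex form: strip the r'/.../' wrapper, then replace the
        # full name with the first capture group on a match
        pattern = source[3:].rstrip("/'")
        match = re.match(pattern, db_schema)
        return match.group(1) if match else db_schema
    # Literal form: swap the database component for the target
    database, _, schema = db_schema.partition(".")
    return f"{target}.{schema}" if database == source else db_schema
```

Resources that match no mapping pass through unchanged.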

Step 5: Save Configuration

Click the Save button, and Euno will generate an integration key. Copy the integration key and store it somewhere safe, as it will not be shown again.

Step 6: Add the Integration Key

Take the copied integration key and add it to the configuration of the application or webhook that will send the dbt artifacts to Euno. This ensures that the application or webhook can authenticate and securely transmit the artifacts to Euno.

Step 7: Get the Upload Endpoint

Click "Run", and Euno will provide an endpoint to upload the artifacts to; use the integration key from the previous step as a header. See the example script below for uploading dbt build artifacts.

dbt Core Source: Click "Run"
dbt Core Source: Run URL

Step 8: Uploading Artifacts

Files to upload from your dbt build are:

  • run_results.json

  • manifest.json

  • semantic_manifest.json

  • catalog.json

The following script assumes the artifact files are in the current working directory. Adjust the paths in the code if you use a different directory.

import os
import sys
from contextlib import ExitStack

import requests

# Endpoint URL - replace with the endpoint link from the integration itself
endpoint_url = "https://api.app.getdelphi.io/accounts/4/integrations/45/run"

# Headers with the authorization token
integration_key = "your_key_here"  # Replace this with the integration key
headers = {
    "authorization": f"Bearer {integration_key}"
}

# List of files to upload
file_names = ["catalog.json",
              "manifest.json",
              "run_results.json",
              "semantic_manifest.json"]

# Ensure all files exist before attempting upload
missing_files = [name for name in file_names if not os.path.exists(name)]
if missing_files:
    print(f"Error: The following files are missing: {', '.join(missing_files)}")
    sys.exit(1)

# Send the POST request, making sure every opened file is closed afterwards
try:
    with ExitStack() as stack:
        files = [("files", (name, stack.enter_context(open(name, "rb")), "application/json"))
                 for name in file_names]
        response = requests.post(
            url=endpoint_url,
            headers=headers,
            files=files
        )

    # Check the response
    if response.status_code == 200:
        print("Files uploaded successfully!")
        print("Response:", response.json())
    else:
        print("Failed to upload files.")
        print("Status Code:", response.status_code)
        print("Response:", response.text)
except requests.RequestException as e:
    print("An error occurred:", e)

Generating a new trigger URL: If you need to create a new integration key, go to the Sources page and click the three-dot menu to the right of the relevant source. In the dropdown menu, click Generate trigger URL. The generated URL will include the new integration key.
