Databricks Integration Discovered Resources

The Euno Databricks integration discovers and updates the following resource families:

  • Databricks workspace

  • Unity Catalog databases (catalogs)

  • Unity Catalog schemas

  • Databricks tables (including views, materialized views, and external tables)

  • Databricks table columns

  • Databricks notebooks

  • Usage and lineage properties across the resources above

Discovery inputs

The integration combines multiple Databricks data sources:

  • system.information_schema.* for metadata

  • system.query.history for usage and notebook execution evidence

  • system.access.table_lineage for table lineage evidence

  • system.access.column_lineage for regular column-level lineage (CLL) and notebook reporting evidence

  • Workspace API for notebook discovery

Resource hierarchy

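The Databricks integration follows this container hierarchy, with each level linked to its parent through parent_container:

  • Workspace -> Catalog -> Schema -> Table -> Column

  • Workspace -> Notebook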

Databricks workspace

Resource type: databricks_workspace

Property | Description | Example
name | Workspace display name | Analytics Workspace
type | Resource type | databricks_workspace
subtype | Resource subtype | databricks_workspace
container_type | Container classification | pure_container
database_technology | Platform | databricks

URI pattern

Example: platform.databricks.databricks_workspace.dbc-50e7cad0-c2f9.cloud.databricks.com

Unity Catalog database (catalog)

Resource type: database

Property | Description | Example
name | Catalog name | analytics_prod
type | Resource type | database
subtype | Resource subtype | database
parent_container | Parent workspace URI | platform.databricks...
database_technology | Platform | databricks
description | Catalog comment | Production analytics catalog
native_owners | Catalog owner(s) |
created_at, updated_at | Catalog timestamps | 2026-03-01T10:00:00Z
created_by, updated_by | Creator/modifier |
tags | Tag names with empty/null values | class.credit_card
meta | Tag key-values with non-empty values | {"department": "analytics"}
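The split between tags and meta described above can be sketched as follows. This is a minimal illustration, not Euno's implementation: the function name split_tags and the input shape (a mapping of tag name to optional value) are assumptions, as is treating only None and the empty string as "empty".

```python
from typing import Optional


def split_tags(raw_tags: dict[str, Optional[str]]) -> tuple[list[str], dict[str, str]]:
    """Split tag assignments into `tags` (names only) and `meta` (key-values).

    Hypothetical helper: the input shape is an assumption for illustration.
    """
    tags: list[str] = []
    meta: dict[str, str] = {}
    for name, value in raw_tags.items():
        if value is None or value == "":
            tags.append(name)   # empty/null value: keep the tag name only
        else:
            meta[name] = value  # non-empty value: keep the key-value pair
    return sorted(tags), meta


tags, meta = split_tags(
    {"class.credit_card": None, "department": "analytics", "pii": ""}
)
print(tags)  # ['class.credit_card', 'pii']
print(meta)  # {'department': 'analytics'}
```

The same rule applies to the tags and meta properties on schemas and tables.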

URI pattern

Unity Catalog schema

Resource type: database_schema

Property | Description | Example
name | Schema name | customer_data
type | Resource type | database_schema
subtype | Resource subtype | database_schema
parent_container | Parent catalog URI | databricks.<host>.<catalog>
database_technology | Platform | databricks
database_database | Catalog name | analytics_prod
description | Schema comment | Customer-related data
native_owners | Schema owner(s) |
created_at, updated_at | Schema timestamps | 2026-03-01T10:00:00Z
created_by, updated_by | Creator/modifier |
tags | Tag names with empty/null values | pii
meta | Tag key-values with non-empty values | {"domain": "marketing"}

URI pattern

Databricks table resources

Resource type: table

Databricks table resources include these subtypes:

  • base_table

  • external_table

  • view

  • materialized_view

Property | Description | Example
name | Table/view name | customers
type | Resource type | table
subtype | Table subtype | external_table
parent_container | Parent schema URI | databricks.<host>.<catalog>.<schema>
database_technology | Platform | databricks
database_schema | Schema name | customer_data
database_database | Catalog name | analytics_prod
description | Table/view comment | Customer profile table
native_owners | Owner(s) |
created_at, updated_at | Object timestamps | 2026-03-01T10:00:00Z
created_by, updated_by | Creator/modifier |
tags | Tag names with empty/null values | sensitive
meta | Tag key-values with non-empty values | {"team": "revops"}
sql_dialect | SQL dialect | databricks
platform_uri_prefix | Databricks URI prefix for this workspace | databricks.<hostname>
table_properties.materialized | Materialization flag derived from Databricks table type | true / false
table_schema | Structured schema for observed columns | list of column descriptors
table_dependencies | Upstream/downstream relationship targets (table or notebook URIs) | list of URIs

URI pattern

System exclusions

The crawler excludes system objects from discovery:

  • Catalogs: system, samples, hive_metastore

  • Schemas: information_schema
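The exclusion rule above amounts to a simple filter. The sketch below is illustrative only: the function name is_discoverable is hypothetical, and the case-insensitive comparison is an assumption rather than documented crawler behavior.

```python
from typing import Optional

# Names taken from the exclusion list above.
EXCLUDED_CATALOGS = frozenset({"system", "samples", "hive_metastore"})
EXCLUDED_SCHEMAS = frozenset({"information_schema"})


def is_discoverable(catalog: str, schema: Optional[str] = None) -> bool:
    """Return False for system catalogs and schemas (hypothetical helper).

    Assumption: names are compared case-insensitively.
    """
    if catalog.lower() in EXCLUDED_CATALOGS:
        return False
    if schema is not None and schema.lower() in EXCLUDED_SCHEMAS:
        return False
    return True


print(is_discoverable("analytics_prod", "customer_data"))       # True
print(is_discoverable("analytics_prod", "information_schema"))  # False
print(is_discoverable("samples"))                               # False
```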

Table usage properties

Usage windows are emitted for 14d, 30d, and 60d intervals.

Read usage properties

Applicable to table resources (base_table, external_table, view, materialized_view).

Property family | Meaning
total_read_queries_14d, total_read_queries_30d, total_read_queries_60d | Number of read queries
total_read_runtime_14d, total_read_runtime_30d, total_read_runtime_60d | Total active compute runtime (ms) for read queries
total_read_bytes_processed_14d, total_read_bytes_processed_30d, total_read_bytes_processed_60d | Total read bytes
distinct_users_14d, distinct_users_30d, distinct_users_60d | Distinct querying users
total_read_dbu_14d, total_read_dbu_30d, total_read_dbu_60d | Total Databricks read DBU-seconds (compute-time proxy)
average_read_dbu_14d, average_read_dbu_30d, average_read_dbu_60d | Average read DBU-seconds per read query

Read DBU values are derived from query-history active compute time.
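To make the window semantics concrete, here is a minimal sketch of how the read property families could be aggregated from per-query records. The ReadQuery record, its field names, and the read_usage function are assumptions for illustration. The sketch takes DBU-seconds as an input and does not model how they are derived from active compute time; reporting an average of 0.0 for an empty window is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

WINDOWS_DAYS = (14, 30, 60)


@dataclass
class ReadQuery:
    """Hypothetical per-query record; field names are assumptions."""
    executed_at: datetime
    user: str
    runtime_ms: int
    bytes_processed: int
    dbu_seconds: float


def read_usage(queries: list[ReadQuery], now: datetime) -> dict[str, float]:
    """Aggregate read queries into the 14d/30d/60d property families."""
    props: dict[str, float] = {}
    for days in WINDOWS_DAYS:
        cutoff = now - timedelta(days=days)
        in_window = [q for q in queries if cutoff <= q.executed_at <= now]
        total_dbu = sum(q.dbu_seconds for q in in_window)
        props[f"total_read_queries_{days}d"] = len(in_window)
        props[f"total_read_runtime_{days}d"] = sum(q.runtime_ms for q in in_window)
        props[f"total_read_bytes_processed_{days}d"] = sum(
            q.bytes_processed for q in in_window
        )
        props[f"distinct_users_{days}d"] = len({q.user for q in in_window})
        props[f"total_read_dbu_{days}d"] = total_dbu
        # Average per read query; 0.0 for an empty window is an assumption.
        props[f"average_read_dbu_{days}d"] = (
            total_dbu / len(in_window) if in_window else 0.0
        )
    return props


now = datetime(2026, 3, 10)
usage = read_usage(
    [
        ReadQuery(datetime(2026, 3, 5), "alice", 1200, 4096, 2.0),
        ReadQuery(datetime(2026, 2, 20), "bob", 800, 2048, 4.0),
        ReadQuery(datetime(2026, 1, 20), "alice", 500, 1024, 6.0),
    ],
    now,
)
print(usage["total_read_queries_14d"])  # 1
print(usage["total_read_queries_60d"])  # 3
print(usage["average_read_dbu_30d"])    # 3.0
print(usage["distinct_users_60d"])      # 2
```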

Write usage properties

Write-usage applicability is determined by Databricks table_type classification:

Observed subtype | Databricks table_type values | Write usage applicable? | Behavior
view | VIEW | No | Write metrics are emitted as explicit zeros each crawl (stale-value remediation).
materialized_view | MATERIALIZED_VIEW | Yes | Write metrics are measured from write statements and zero-filled when no write activity is seen in-window.
base_table | MANAGED, STREAMING_TABLE, MANAGED_SHALLOW_CLONE | Yes | Write metrics are measured from write statements and zero-filled when no write activity is seen in-window.
external_table | EXTERNAL, EXTERNAL_SHALLOW_CLONE, EXTERNAL_TABLE | Yes | Write metrics are measured from write statements and zero-filled when no write activity is seen in-window.
external_table | FOREIGN | No | Write metrics are not applicable (FOREIGN remains read-only for write-usage semantics).
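The applicability rules above can be read as a lookup from table_type to subtype plus a zero-fill policy. The following sketch encodes them under stated assumptions: the names classify and write_properties are hypothetical, unknown table_type values return nothing, and "not applicable" for FOREIGN is modeled as emitting no write properties at all.

```python
from typing import NamedTuple, Optional


class Classification(NamedTuple):
    subtype: str
    write_usage_applicable: bool


# Mapping taken from the applicability rules above.
_TABLE_TYPE_RULES = {
    "VIEW": Classification("view", False),
    "MATERIALIZED_VIEW": Classification("materialized_view", True),
    "MANAGED": Classification("base_table", True),
    "STREAMING_TABLE": Classification("base_table", True),
    "MANAGED_SHALLOW_CLONE": Classification("base_table", True),
    "EXTERNAL": Classification("external_table", True),
    "EXTERNAL_SHALLOW_CLONE": Classification("external_table", True),
    "EXTERNAL_TABLE": Classification("external_table", True),
    "FOREIGN": Classification("external_table", False),
}

WRITE_FAMILIES = (
    "total_write_queries",
    "total_write_runtime",
    "total_write_bytes_processed",
)
WINDOWS = ("14d", "30d", "60d")


def classify(table_type: str) -> Optional[Classification]:
    """Look up subtype and write applicability (None if unknown: an assumption)."""
    return _TABLE_TYPE_RULES.get(table_type.upper())


def write_properties(table_type: str, measured: dict[str, int]) -> dict[str, int]:
    """Apply the zero-fill policy to measured write metrics (hypothetical helper)."""
    rule = classify(table_type)
    if rule is None:
        return {}
    names = [f"{family}_{window}" for family in WRITE_FAMILIES for window in WINDOWS]
    if rule.subtype == "view":
        return {name: 0 for name in names}  # explicit zeros each crawl
    if not rule.write_usage_applicable:
        return {}  # FOREIGN: modeled here as "no write properties" (assumption)
    return {name: measured.get(name, 0) for name in names}  # zero-fill gaps


print(classify("STREAMING_TABLE"))            # Classification(subtype='base_table', write_usage_applicable=True)
print(write_properties("FOREIGN", {}))        # {}
managed = write_properties("MANAGED", {"total_write_queries_14d": 7})
print(managed["total_write_queries_14d"])     # 7
print(managed["total_write_queries_30d"])     # 0
```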

Property family | Meaning
total_write_queries_14d, total_write_queries_30d, total_write_queries_60d | Number of write queries
total_write_runtime_14d, total_write_runtime_30d, total_write_runtime_60d | Total active compute runtime (ms) for write queries
total_write_bytes_processed_14d, total_write_bytes_processed_30d, total_write_bytes_processed_60d | Total bytes processed by write queries

Databricks column resources

Resource type: column

Property | Description | Example
name | Column name | customer_id
type | Resource type | column
subtype | Resource subtype | column
description | Column comment | Primary key
parent_container | Parent table URI | databricks.<host>.<catalog>.<schema>.<table>
database_technology | Platform | databricks
database_schema | Schema name | customer_data
database_database | Catalog name | analytics_prod
native_data_type | Native Databricks type | STRING
normalized_data_type | Euno-normalized type | string
upstream_fields | Column lineage/reporting relationship targets | list of URIs

URI pattern

Column usage properties

Column usage is emitted for 14d, 30d, and 60d windows.

Property family | Meaning
total_read_queries_14d, total_read_queries_30d, total_read_queries_60d | Number of distinct statements that read the column
distinct_users_14d, distinct_users_30d, distinct_users_60d | Distinct users reading the column

Databricks notebook resources

Resource type: databricks_notebook

Property | Description | Example
name | Notebook display name (basename) | daily_sales_rollup
type | Resource type | databricks_notebook
subtype | Resource subtype | databricks_notebook
native_id | Databricks notebook object ID | 2249857805096087
parent_container | Parent workspace URI | platform.databricks...
created_at, updated_at | Notebook timestamps | 2026-03-01T10:00:00
description | Notebook description (if available) | Daily revenue aggregation
native_last_data_update | Latest observed execution time from query history evidence | 2026-03-03T22:22:44
defines | Tables defined by notebook execution evidence | list of table URIs

URI pattern

Relationships

The integration emits parent-child and lineage/usage relationships through observed properties.

Parent-child relationships

  • Workspace -> Catalog (parent_container)

  • Catalog -> Schema (parent_container)

  • Schema -> Table (parent_container)

  • Table -> Column (parent_container)

  • Workspace -> Notebook (parent_container)
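As a rough illustration of how parent_container links the hierarchy, the sketch below assembles URIs from the example values shown on this page (the workspace URI example and the parent_container examples for schemas, tables, and columns). The function names are hypothetical and the URI shapes are inferred from those examples only. Column and notebook URIs are left out because their patterns are not shown here.

```python
def workspace_uri(host: str) -> str:
    """Workspace URI, following the example shown for databricks_workspace."""
    return f"platform.databricks.databricks_workspace.{host}"


def parent_links(host: str, catalog: str, schema: str, table: str) -> list[tuple[str, str]]:
    """Return (resource URI, parent_container URI) pairs down to table level.

    Hypothetical helper: URI shapes are inferred from this page's examples.
    """
    prefix = f"databricks.{host}"           # matches platform_uri_prefix
    catalog_uri = f"{prefix}.{catalog}"
    schema_uri = f"{catalog_uri}.{schema}"
    table_uri = f"{schema_uri}.{table}"
    return [
        (catalog_uri, workspace_uri(host)),  # Workspace -> Catalog
        (schema_uri, catalog_uri),           # Catalog -> Schema
        (table_uri, schema_uri),             # Schema -> Table
    ]


for child, parent in parent_links(
    "dbc-50e7cad0-c2f9.cloud.databricks.com",
    "analytics_prod",
    "customer_data",
    "customers",
):
    print(f"{child}  parent_container={parent}")
```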

Table and view dependencies

  • For Databricks views, table_dependencies captures upstream table and view dependencies inferred from the view SQL.

  • Notebook evidence can also append notebook URIs to table_dependencies for reporting semantics (table -> notebook downstream pattern).

  • table_dependencies is a unified dependency field, and the target URI type indicates whether the dependency is warehouse lineage or notebook reporting lineage.

Notebook defines relationships

  • Notebook -> Table relationships (a notebook defines a table) are represented by the defines property on notebook resources.

Column lineage and notebook reporting field semantics

  • Regular Databricks CLL is represented as upstream_fields on target columns, pointing to source column URIs.

  • Notebook reporting field semantics are represented as upstream_fields on source columns, pointing to notebook URIs.

Statement-level column dependencies for views

For Databricks views, Euno also captures statement-level column dependencies and reflects them in table-level upstream_fields.

This means upstream_fields for a view can include columns used in SQL logic clauses, not only columns selected in the projection.

Included SQL logic clauses:

  • JOIN conditions

  • WHERE

  • GROUP BY

  • HAVING

  • QUALIFY

  • ORDER BY

Example:

  • SQL: SELECT order_id FROM orders_raw WHERE order_status = 'COMPLETE'

  • Expected table-level effect: the view can have an upstream_fields relationship to orders_raw.column.order_status, even though order_status is not in the SELECT output.
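The effect in the example above can be sketched as a union of projected columns and columns referenced in logic clauses. This is an illustration only: view_upstream_fields is a hypothetical helper, the column identifiers reuse the shorthand from the example rather than full URIs, and the SQL parsing that extracts the columns per clause is assumed to have happened already.

```python
def view_upstream_fields(
    projected: set[str], logic_columns: dict[str, set[str]]
) -> set[str]:
    """Combine projected columns with columns used in SQL logic clauses.

    `logic_columns` maps a clause name (JOIN, WHERE, GROUP BY, HAVING,
    QUALIFY, ORDER BY) to the columns it references. Hypothetical helper.
    """
    fields = set(projected)
    for clause_columns in logic_columns.values():
        fields |= clause_columns
    return fields


# SELECT order_id FROM orders_raw WHERE order_status = 'COMPLETE'
fields = view_upstream_fields(
    projected={"orders_raw.column.order_id"},
    logic_columns={"WHERE": {"orders_raw.column.order_status"}},
)
print(sorted(fields))
# ['orders_raw.column.order_id', 'orders_raw.column.order_status']
```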

Notes and caveats

  • Notebook execution evidence and notebook-derived relationships are evaluated from system.query.history, system.access.table_lineage, and system.access.column_lineage with a 30-day lookback window.

  • Regular Databricks CLL (upstream_fields on target columns) is evaluated from system.access.column_lineage with a 30-day lookback window.

  • Table and column usage metrics are emitted for 14d, 30d, and 60d windows.

  • Usage and lineage relationships are emitted only for resources that are observed in the current integration scope.
