Databricks Integration Discovered Resources
The Euno Databricks integration discovers and updates the following resource families:
Databricks workspace
Unity Catalog databases (catalogs)
Unity Catalog schemas
Databricks tables (including views, materialized views, and external tables)
Databricks table columns
Databricks notebooks
Usage and lineage properties across the resources above
Discovery inputs
The integration combines multiple Databricks data sources:
system.information_schema.* for metadata
system.query.history for usage and notebook execution evidence
system.access.table_lineage for table lineage evidence
system.access.column_lineage for regular column-level lineage (CLL) and notebook reporting evidence
Workspace API for notebook discovery
Resource hierarchy
The Databricks integration follows this container hierarchy: workspace, catalog, schema, table, column. Notebooks are parented directly by the workspace.
Databricks workspace
Resource type: databricks_workspace
name: Workspace display name (example: Analytics Workspace)
type: Resource type (example: databricks_workspace)
subtype: Resource subtype (example: databricks_workspace)
container_type: Container classification (example: pure_container)
database_technology: Platform (example: databricks)
URI pattern
Example: platform.databricks.databricks_workspace.dbc-50e7cad0-c2f9.cloud.databricks.com
Unity Catalog database (catalog)
Resource type: database
name: Catalog name (example: analytics_prod)
type: Resource type (example: database)
subtype: Resource subtype (example: database)
parent_container: Parent workspace URI (example: platform.databricks...)
database_technology: Platform (example: databricks)
description: Catalog comment (example: Production analytics catalog)
created_at, updated_at: Catalog timestamps (example: 2026-03-01T10:00:00Z)
tags: Tag names with empty/null values (example: class.credit_card)
meta: Tag key-values with non-empty values (example: {"department": "analytics"})
URI pattern
Unity Catalog schema
Resource type: database_schema
name: Schema name (example: customer_data)
type: Resource type (example: database_schema)
subtype: Resource subtype (example: database_schema)
parent_container: Parent catalog URI (example: databricks.<host>.<catalog>)
database_technology: Platform (example: databricks)
database_database: Catalog name (example: analytics_prod)
description: Schema comment (example: Customer-related data)
created_at, updated_at: Schema timestamps (example: 2026-03-01T10:00:00Z)
tags: Tag names with empty/null values (example: pii)
meta: Tag key-values with non-empty values (example: {"domain": "marketing"})
URI pattern
Databricks table resources
Resource type: table
Databricks table resources include these subtypes:
base_table
external_table
view
materialized_view
name: Table/view name (example: customers)
type: Resource type (example: table)
subtype: Table subtype (example: external_table)
parent_container: Parent schema URI (example: databricks.<host>.<catalog>.<schema>)
database_technology: Platform (example: databricks)
database_schema: Schema name (example: customer_data)
database_database: Catalog name (example: analytics_prod)
description: Table/view comment (example: Customer profile table)
created_at, updated_at: Object timestamps (example: 2026-03-01T10:00:00Z)
tags: Tag names with empty/null values (example: sensitive)
meta: Tag key-values with non-empty values (example: {"team": "revops"})
sql_dialect: SQL dialect (example: databricks)
platform_uri_prefix: Databricks URI prefix for this workspace (example: databricks.<hostname>)
table_properties.materialized: Materialization flag derived from Databricks table type (example: true / false)
table_schema: Structured schema for observed columns (example: list of column descriptors)
table_dependencies: Upstream/downstream relationship targets, either table or notebook URIs (example: list of URIs)
URI pattern
System exclusions
The crawler excludes system objects from discovery:
Catalogs: system, samples, hive_metastore
Schemas: information_schema
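The exclusion rule above can be sketched as a simple predicate. This is only an illustration, not the crawler's actual code: the function name `is_excluded` is hypothetical, and the case-insensitive comparison is an assumption.

```python
from typing import Optional

# Names taken from the exclusion list above.
EXCLUDED_CATALOGS = {"system", "samples", "hive_metastore"}
EXCLUDED_SCHEMAS = {"information_schema"}


def is_excluded(catalog: str, schema: Optional[str] = None) -> bool:
    """Return True when the catalog, or the schema inside it, is a system object.

    Assumption: names are compared case-insensitively.
    """
    if catalog.lower() in EXCLUDED_CATALOGS:
        return True
    return schema is not None and schema.lower() in EXCLUDED_SCHEMAS
```

For example, `is_excluded("analytics_prod", "information_schema")` is true, while `is_excluded("analytics_prod", "customer_data")` is false.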
Table usage properties
Usage windows are emitted for 14d, 30d, and 60d intervals.
Read usage properties
Applicable to table resources (base_table, external_table, view, materialized_view).
total_read_queries_14d, total_read_queries_30d, total_read_queries_60d: Number of read queries
total_read_runtime_14d, total_read_runtime_30d, total_read_runtime_60d: Total active compute runtime (ms) for read queries
total_read_bytes_processed_14d, total_read_bytes_processed_30d, total_read_bytes_processed_60d: Total read bytes
distinct_users_14d, distinct_users_30d, distinct_users_60d: Distinct querying users
total_read_dbu_14d, total_read_dbu_30d, total_read_dbu_60d: Total Databricks read DBU-seconds (compute-time proxy)
average_read_dbu_14d, average_read_dbu_30d, average_read_dbu_60d: Average read DBU-seconds per read query
Read DBU values are derived from query-history active compute time.
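As a rough illustration of how the per-window averages relate to the totals, the sketch below divides total read DBU-seconds by the read-query count for the same window. The helper name `average_read_dbu` is hypothetical, and reporting 0.0 when a window has no read queries is an assumption rather than documented behavior.

```python
WINDOWS = ("14d", "30d", "60d")


def average_read_dbu(props: dict) -> dict:
    """Compute average_read_dbu_<window> from the total properties.

    Assumption: a window with no read queries reports 0.0.
    """
    averages = {}
    for window in WINDOWS:
        total = props.get(f"total_read_dbu_{window}", 0.0)
        queries = props.get(f"total_read_queries_{window}", 0)
        averages[f"average_read_dbu_{window}"] = total / queries if queries else 0.0
    return averages
```

With `total_read_dbu_14d` of 120.0 and `total_read_queries_14d` of 4, the 14d average is 30.0 DBU-seconds per read query.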
Write usage properties
Write-usage applicability is determined by Databricks table_type classification:
Observed subtype | Databricks table_type values | Write usage applicable? | Behavior
view | VIEW | No | Write metrics are emitted as explicit zeros each crawl (stale-value remediation).
materialized_view | MATERIALIZED_VIEW | Yes | Write metrics are measured from write statements and zero-filled when no write activity is seen in-window.
base_table | MANAGED, STREAMING_TABLE, MANAGED_SHALLOW_CLONE | Yes | Write metrics are measured from write statements and zero-filled when no write activity is seen in-window.
external_table | EXTERNAL, EXTERNAL_SHALLOW_CLONE, EXTERNAL_TABLE | Yes | Write metrics are measured from write statements and zero-filled when no write activity is seen in-window.
external_table | FOREIGN | No | Write metrics are not applicable (FOREIGN remains read-only for write-usage semantics).
total_write_queries_14d, total_write_queries_30d, total_write_queries_60d: Number of write queries
total_write_runtime_14d, total_write_runtime_30d, total_write_runtime_60d: Total active compute runtime (ms) for write queries
total_write_bytes_processed_14d, total_write_bytes_processed_30d, total_write_bytes_processed_60d: Total bytes processed by write queries
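The table_type classification for write usage can be expressed as a lookup. The sketch below mirrors the classification rows above; the names `SUBTYPE_BY_TABLE_TYPE` and `write_usage_applicable` are hypothetical, and raising on an unknown table_type is an assumption.

```python
# Databricks table_type -> observed subtype, as listed in the classification above.
SUBTYPE_BY_TABLE_TYPE = {
    "VIEW": "view",
    "MATERIALIZED_VIEW": "materialized_view",
    "MANAGED": "base_table",
    "STREAMING_TABLE": "base_table",
    "MANAGED_SHALLOW_CLONE": "base_table",
    "EXTERNAL": "external_table",
    "EXTERNAL_SHALLOW_CLONE": "external_table",
    "EXTERNAL_TABLE": "external_table",
    "FOREIGN": "external_table",
}

# table_type values for which write usage is not applicable.
WRITE_NOT_APPLICABLE = {"VIEW", "FOREIGN"}


def write_usage_applicable(table_type: str) -> bool:
    """Return True when write metrics are measured for this table_type."""
    if table_type not in SUBTYPE_BY_TABLE_TYPE:
        # Assumption: unknown types are rejected rather than guessed at.
        raise ValueError(f"unknown table_type: {table_type}")
    return table_type not in WRITE_NOT_APPLICABLE
```

Note that the observed subtype alone is not enough to decide applicability: EXTERNAL and FOREIGN both map to external_table, but only EXTERNAL has write usage.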
Databricks column resources
Resource type: column
name: Column name (example: customer_id)
type: Resource type (example: column)
subtype: Resource subtype (example: column)
description: Column comment (example: Primary key)
parent_container: Parent table URI (example: databricks.<host>.<catalog>.<schema>.<table>)
database_technology: Platform (example: databricks)
database_schema: Schema name (example: customer_data)
database_database: Catalog name (example: analytics_prod)
native_data_type: Native Databricks type (example: STRING)
normalized_data_type: Euno-normalized type (example: string)
upstream_fields: Column lineage/reporting relationship targets (example: list of URIs)
URI pattern
Column usage properties
Column usage is emitted for 14d, 30d, and 60d windows.
total_read_queries_14d, total_read_queries_30d, total_read_queries_60d: Number of distinct statements that read the column
distinct_users_14d, distinct_users_30d, distinct_users_60d: Distinct users reading the column
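To make the "distinct statements" and "distinct users" semantics concrete, here is a minimal sketch that aggregates column read events for one window. The event shape (`column`, `statement_id`, `user`, `time`) and the function name `column_usage` are hypothetical and not the integration's actual data model.

```python
from datetime import datetime, timedelta


def column_usage(events: list, column_uri: str, now: datetime, days: int) -> dict:
    """Count distinct statements and distinct users that read a column in-window."""
    cutoff = now - timedelta(days=days)
    statements, users = set(), set()
    for event in events:
        if event["column"] == column_uri and cutoff <= event["time"] <= now:
            # A statement that touches the column several times counts once.
            statements.add(event["statement_id"])
            users.add(event["user"])
    return {
        f"total_read_queries_{days}d": len(statements),
        f"distinct_users_{days}d": len(users),
    }
```

Calling this once each with `days` set to 14, 30, and 60 would yield the three windows described above.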
Databricks notebook resources
Resource type: databricks_notebook
name: Notebook display name, basename (example: daily_sales_rollup)
type: Resource type (example: databricks_notebook)
subtype: Resource subtype (example: databricks_notebook)
native_id: Databricks notebook object ID (example: 2249857805096087)
parent_container: Parent workspace URI (example: platform.databricks...)
created_at, updated_at: Notebook timestamps (example: 2026-03-01T10:00:00)
description: Notebook description, if available (example: Daily revenue aggregation)
native_last_data_update: Latest observed execution time from query history evidence (example: 2026-03-03T22:22:44)
defines: Tables defined by notebook execution evidence (example: list of table URIs)
URI pattern
Relationships
The integration emits parent-child and lineage/usage relationships through observed properties.
Parent-child relationships
Workspace -> Catalog (parent_container)
Catalog -> Schema (parent_container)
Schema -> Table (parent_container)
Table -> Column (parent_container)
Workspace -> Notebook (parent_container)
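The parent-child pairs above imply a chain of resource types that can be walked from any resource up to the workspace. The sketch below encodes that chain using the resource type names from this page; the `PARENT_TYPE` mapping and `ancestry` helper are hypothetical illustrations.

```python
# Child resource type -> parent resource type, following parent_container.
PARENT_TYPE = {
    "database": "databricks_workspace",
    "database_schema": "database",
    "table": "database_schema",
    "column": "table",
    "databricks_notebook": "databricks_workspace",
}


def ancestry(resource_type: str) -> list:
    """Return the resource type followed by each of its container types."""
    chain = [resource_type]
    while chain[-1] in PARENT_TYPE:
        chain.append(PARENT_TYPE[chain[-1]])
    return chain
```

For a column this yields column, table, database_schema, database, databricks_workspace; for a notebook it yields only databricks_notebook and databricks_workspace.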
Table and view dependencies
View lineage remains supported: for Databricks views, table_dependencies captures upstream table/view dependencies inferred from view SQL.
Notebook evidence can also append notebook URIs to table_dependencies for reporting semantics (table -> notebook downstream pattern).
table_dependencies is a unified dependency field, and the target URI type indicates whether the dependency is warehouse lineage or notebook reporting lineage.
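Because table_dependencies mixes both kinds of target, a consumer has to inspect the target's resource type to tell them apart. The sketch below assumes a hypothetical lookup `type_by_uri` from URI to resource type; the URIs in the usage note are placeholders, not real Euno URIs.

```python
def split_dependencies(table_dependencies: list, type_by_uri: dict) -> dict:
    """Separate warehouse lineage targets from notebook reporting targets.

    Assumption: type_by_uri maps each target URI to its resource type.
    """
    warehouse, notebook = [], []
    for uri in table_dependencies:
        if type_by_uri.get(uri) == "databricks_notebook":
            notebook.append(uri)
        else:
            warehouse.append(uri)
    return {"warehouse_lineage": warehouse, "notebook_reporting": notebook}
```

For instance, given targets `["uri-of-orders-raw", "uri-of-sales-notebook"]` where only the second resolves to `databricks_notebook`, the first lands in `warehouse_lineage` and the second in `notebook_reporting`.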
Notebook defines relationships
Notebook -> Table define semantics are represented by defines on notebook resources.
Column lineage and notebook reporting field semantics
Regular Databricks CLL is represented as upstream_fields on target columns, pointing to source column URIs.
Notebook reporting field semantics are represented as upstream_fields on source columns, pointing to notebook URIs.
Statement-level column dependencies for views
For Databricks views, Euno also captures statement-level column dependencies and reflects them in table-level upstream_fields.
This means upstream_fields for a view can include columns used in SQL logic clauses, not only columns selected in the projection.
Included SQL logic clauses:
JOIN conditions
WHERE
GROUP BY
HAVING
QUALIFY
ORDER BY
Example:
SQL: SELECT order_id FROM orders_raw WHERE order_status = 'COMPLETE'
Expected table-level effect: the view can have an upstream_fields relationship to orders_raw.column.order_status, even though order_status is not in the SELECT output.
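The effect in the example can be sketched as a union of two column sets: those in the projection and those referenced only in logic clauses. How Euno extracts these sets from view SQL is not described here, so the inputs below are supplied by hand, and `view_upstream_fields` is a hypothetical helper.

```python
def view_upstream_fields(projected: set, logic: set) -> list:
    """Combine projected columns with columns used only in SQL logic clauses."""
    return sorted(projected | logic)


# Column sets for the example query above, written out manually.
projected_columns = {"orders_raw.column.order_id"}
logic_columns = {"orders_raw.column.order_status"}  # from the WHERE clause
fields = view_upstream_fields(projected_columns, logic_columns)
```

Here `fields` contains both `orders_raw.column.order_id` and `orders_raw.column.order_status`, matching the expected table-level effect.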
Notes and caveats
Notebook execution evidence and notebook-derived relationships are evaluated from system.query.history, system.access.table_lineage, and system.access.column_lineage with a 30-day lookback window.
Regular Databricks CLL (upstream_fields on target columns) is evaluated from system.access.column_lineage with a 30-day lookback window.
Table and column usage metrics are emitted for 14d, 30d, and 60d windows.
Usage and lineage relationships are emitted only for resources that are observed in the current integration scope.
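The 30-day lookback for lineage evidence amounts to a time filter on each evidence row. A minimal sketch follows; the function name `within_lookback` and the inclusive boundaries are assumptions, since the exact boundary handling is not documented here.

```python
from datetime import datetime, timedelta, timezone

LOOKBACK = timedelta(days=30)  # lookback window stated in the notes above


def within_lookback(event_time: datetime, now: datetime,
                    lookback: timedelta = LOOKBACK) -> bool:
    """Return True when event_time falls inside the lookback window ending at now.

    Assumption: both ends of the window are inclusive.
    """
    return now - lookback <= event_time <= now
```

Evidence observed 21 days before the crawl would pass this check, while evidence from 58 days earlier would be ignored for lineage even though it still counts toward the 60d usage window.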