OpenLineage

Euno's OpenLineage integration ingests data lineage events from any system that produces OpenLineage-compliant events. It automatically processes lineage metadata, table information, and tags to build a view of your data pipeline dependencies and transformations.

How It Works

The integration follows these steps:

  1. Provides a secure endpoint: Euno generates a unique integration key and endpoint URL for receiving OpenLineage events.

  2. Accepts OpenLineage events: The integration accepts both single events and arrays of events in standard OpenLineage format via HTTP POST.

  3. Processes lineage and metadata:

    • Extracts table information from input and output datasets

    • Creates lineage relationships between input and output tables

    • Processes tags and metadata from dataset facets

    • Validates naming conventions for supported data warehouses
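The first two processing steps can be pictured with a small Python sketch. This is only an illustration of the idea, not Euno's implementation: the function names `normalize_events` and `collect_datasets` are hypothetical, and the sketch assumes each dataset entry carries `namespace` and `name` keys as in the examples later on this page.

```python
import json

def normalize_events(payload):
    """Return a list of event dicts from a single event or an array of events."""
    # Accept raw JSON text as well as already-parsed Python objects.
    data = json.loads(payload) if isinstance(payload, (str, bytes)) else payload
    if isinstance(data, dict):
        return [data]
    if isinstance(data, list):
        return data
    raise ValueError("payload must be a JSON object or an array of objects")

def collect_datasets(events):
    """Collect unique (namespace, name) pairs from inputs and outputs, in order."""
    seen = []
    for event in events:
        for side in ("inputs", "outputs"):
            # START events may omit inputs/outputs entirely.
            for dataset in event.get(side, []):
                key = (dataset["namespace"], dataset["name"])
                if key not in seen:
                    seen.append(key)
    return seen
```

Treating a single event as a one-element list keeps the rest of the processing identical for both payload shapes.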

Supported Data Warehouses

Currently, Euno's OpenLineage integration supports:

  • Snowflake - with naming convention:

    • namespace: snowflake://org-account

    • name: database.schema.table

  • BigQuery - with naming convention:

    • namespace: bigquery

    • name: project.dataset.table

For detailed naming conventions, see the OpenLineage Naming Specification.
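To catch naming problems before sending events, a client-side check along the following lines may help. It is a rough sketch under stated assumptions: `check_dataset_naming` is a hypothetical helper, the regular expression only checks the general `snowflake://...` shape, and names are split on plain dots, so quoted identifiers that contain dots are not handled. Euno's own validation may be stricter or more lenient.

```python
import re

# Assumed shape of a Snowflake namespace: "snowflake://" followed by an account identifier.
SNOWFLAKE_NAMESPACE = re.compile(r"^snowflake://[^/\s]+$")

def check_dataset_naming(namespace, name):
    """Return 'snowflake' or 'bigquery' if the dataset fits a documented convention, else None."""
    parts = name.split(".")
    # Both conventions use a three-part name: database.schema.table or project.dataset.table.
    if len(parts) != 3 or not all(parts):
        return None
    if SNOWFLAKE_NAMESPACE.match(namespace):
        return "snowflake"
    if namespace == "bigquery":
        return "bigquery"
    return None
```

For example, `check_dataset_naming("snowflake://myorg-account123", "raw_data.public.customer_events")` returns `"snowflake"`, while a two-part name such as `"public.customer_events"` returns `None`.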

Setting up Euno's OpenLineage Integration

Step 1: Configure New OpenLineage Source in Euno

Access the Sources Page

  1. Navigate to the Sources page in the Euno application

  2. Click on the Add New Source button

  3. Select OpenLineage from the available integrations

Step 2: General Configuration

  1. Name: Enter a descriptive name for your OpenLineage source (e.g., "Data Pipeline Lineage")

  2. Configuration Details:

    • OpenLineage integration requires minimal configuration as it's a push-based integration

    • No schedule configuration is needed since events are pushed in real time

Step 3: Resource Cleanup Options

Configure automatic resource cleanup options to manage outdated resources:

  • Time-Based Cleanup (default): Remove resources not detected for X days (default: 7 days)

  • Immediate Cleanup: Remove resources not detected in the most recent run

  • No Cleanup: Keep all resources indefinitely

Step 4: Save Configuration

Click the Save button, and Euno will generate an integration key. Copy and save this key securely as it will not be displayed again.

Step 5: Get the Upload Endpoint

  1. Click "Reset Trigger Key" to get the endpoint URL

  2. Copy the provided endpoint URL where you'll send OpenLineage events

  3. Use the integration key from Step 4 as the Bearer token in your Authorization header
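For callers that prefer Python over cURL, the request can be assembled with the standard library. The sketch below is illustrative only: `build_request` and its parameter names are hypothetical, and the endpoint URL and key are whatever you copied in the steps above.

```python
import json
import urllib.request

def build_request(endpoint_url, integration_key, events):
    """Build a POST request carrying one event (dict) or several events (list)."""
    body = json.dumps(events).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The integration key is sent as a Bearer token.
            "Authorization": f"Bearer {integration_key}",
        },
    )
```

Pass the returned object to `urllib.request.urlopen(...)` to send it. Building the request separately from sending it makes the headers and body easy to inspect in tests without touching the network.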

Sending OpenLineage Events

Example: Single Event with Lineage and Tags

Here's a complete example of an OpenLineage event with input/output lineage and tags:

{
  "eventType": "COMPLETE",
  "eventTime": "2024-01-15T10:30:00.001Z",
  "run": {
    "runId": "my-etl-run-12345"
  },
  "job": {
    "namespace": "production-pipeline",
    "name": "customer-analytics-etl"
  },
  "inputs": [
    {
      "namespace": "snowflake://myorg-account123",
      "name": "raw_data.public.customer_events"
    },
    {
      "namespace": "bigquery", 
      "name": "external_data.staging.product_catalog"
    }
  ],
  "outputs": [
    {
      "namespace": "snowflake://myorg-account123",
      "name": "analytics.public.customer_analytics",
      "facets": {
        "tags": [
          {
            "key": "environment",
            "value": "production"
          },
          {
            "key": "team", 
            "value": "data-engineering"
          },
          {
            "key": "contains_pii"
          },
          {
            "key": "data_classification",
            "value": "sensitive"
          }
        ],
        "schema": {
          "fields": [
            {
              "name": "customer_id",
              "type": "BIGINT",
              "description": "Unique customer identifier"
            },
            {
              "name": "total_purchases",
              "type": "DECIMAL(10,2)",
              "description": "Total purchase amount"
            }
          ]
        }
      }
    }
  ],
  "producer": "https://my-etl-system.com/v1.2.0",
  "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
}
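If events are generated programmatically, a small builder can fill in the repetitive fields. The following is a sketch, not an official client: `make_event` and its parameters are hypothetical, datasets are passed as `(namespace, name)` tuples for brevity, and facets are left out. When no run ID is supplied it generates a UUID, which is the format the OpenLineage specification defines for `runId`.

```python
import uuid
from datetime import datetime, timezone

SCHEMA_URL = "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"

def make_event(job_namespace, job_name, inputs, outputs, producer,
               event_type="COMPLETE", run_id=None):
    """Build a minimal run event; inputs and outputs are lists of (namespace, name) tuples."""
    # UTC timestamp with millisecond precision and a trailing "Z", as in the example above.
    now = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return {
        "eventType": event_type,
        "eventTime": now.replace("+00:00", "Z"),
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [{"namespace": ns, "name": n} for ns, n in inputs],
        "outputs": [{"namespace": ns, "name": n} for ns, n in outputs],
        "producer": producer,
        "schemaURL": SCHEMA_URL,
    }
```

Facets such as `tags` or `schema` can be added to the returned output dataset dictionaries before the event is serialized.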

cURL Command Examples

Single Event Upload

curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_INTEGRATION_KEY_HERE" \
  -d '{
    "eventType": "COMPLETE",
    "eventTime": "2024-01-15T10:30:00.001Z",
    "run": {"runId": "simple-etl-123"},
    "job": {"namespace": "my-pipeline", "name": "daily-aggregation"},
    "inputs": [{
      "namespace": "snowflake://myorg-account123", 
      "name": "raw.public.events"
    }],
    "outputs": [{
      "namespace": "snowflake://myorg-account123",
      "name": "analytics.public.daily_stats",
      "facets": {
        "tags": [
          {"key": "environment", "value": "prod"},
          {"key": "automated"}
        ]
      }
    }],
    "producer": "my-pipeline-v1.0",
    "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
  }' \
  https://api.app.euno.ai/accounts/YOUR_ACCOUNT_ID/integrations/YOUR_INTEGRATION_ID/run

Multiple Events Upload

curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_INTEGRATION_KEY_HERE" \
  -d '[
    {
      "eventType": "START", 
      "eventTime": "2024-01-15T10:00:00.001Z",
      "run": {"runId": "batch-job-456"},
      "job": {"namespace": "etl", "name": "batch-processor"},
      "producer": "scheduler-v2.1",
      "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
    },
    {
      "eventType": "COMPLETE",
      "eventTime": "2024-01-15T10:15:00.001Z", 
      "run": {"runId": "batch-job-456"},
      "job": {"namespace": "etl", "name": "batch-processor"},
      "inputs": [{"namespace": "bigquery", "name": "raw.events.user_actions"}],
      "outputs": [{"namespace": "bigquery", "name": "processed.analytics.user_metrics"}],
      "producer": "scheduler-v2.1", 
      "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
    }
  ]' \
  https://api.app.euno.ai/accounts/YOUR_ACCOUNT_ID/integrations/YOUR_INTEGRATION_ID/run

Upload from File

# Save your event to a file
cat > event.json << 'EOF'
{
  "eventType": "COMPLETE",
  "eventTime": "2024-01-15T10:30:00.001Z",
  "run": {"runId": "file-upload-test"},
  "job": {"namespace": "testing", "name": "file-upload"},
  "outputs": [{
    "namespace": "snowflake://myorg-account123",
    "name": "test.public.sample_table"
  }],
  "producer": "test-script", 
  "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
}
EOF

# Upload the file
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_INTEGRATION_KEY_HERE" \
  -d @event.json \
  https://api.app.euno.ai/accounts/YOUR_ACCOUNT_ID/integrations/YOUR_INTEGRATION_ID/run

What Gets Observed in Euno

When OpenLineage events are processed, Euno observes:

Table Resources

  • Tables from both inputs and outputs with properties:

    • name: Table name

    • database: Database name

    • schema: Schema name

    • database_technology: snowflake or bigquery

    • type: table

    • meta: see Tags and Metadata below

    • tags: see Tags and Metadata below
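The mapping from a dataset reference to these properties can be sketched as follows. This is a guess at the shape of the result, not Euno's code: `table_properties` is a hypothetical helper, and for BigQuery it assumes the project maps to `database` and the dataset maps to `schema`, which this page does not state explicitly.

```python
def table_properties(namespace, name):
    """Derive table resource properties from an OpenLineage dataset reference."""
    # Three-part name: database.schema.table (Snowflake) or project.dataset.table (BigQuery).
    database, schema, table = name.split(".")
    if namespace.startswith("snowflake://"):
        technology = "snowflake"
    elif namespace == "bigquery":
        technology = "bigquery"
    else:
        raise ValueError(f"unsupported namespace: {namespace}")
    return {
        "name": table,
        "database": database,
        "schema": schema,
        "database_technology": technology,
        "type": "table",
    }
```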

Lineage Relationships

  • Lineage: Output tables get lineage pointing to input tables
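One way to picture this relationship is as a set of (output, input) pairs per event. The sketch below assumes that every output in an event depends on every input in the same event; that pairing rule and the name `lineage_edges` are assumptions made for illustration.

```python
def lineage_edges(event):
    """Return (output, input) pairs, each side given as a (namespace, name) tuple."""
    inputs = [(d["namespace"], d["name"]) for d in event.get("inputs", [])]
    outputs = [(d["namespace"], d["name"]) for d in event.get("outputs", [])]
    # Each output points back to every input of the same event.
    return [(out, inp) for out in outputs for inp in inputs]
```

Under this assumption, an event with no inputs or no outputs, such as the START event in the multiple-events example, yields no edges.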

Tags and Metadata

  • Tags with values (e.g., {"key": "environment", "value": "prod"}) become meta properties

  • Tags without values (e.g., {"key": "pii"}) become simple tags
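The rule above amounts to a simple partition of the `tags` facet entries. A minimal sketch, with `split_tags` as a hypothetical helper name and the entry shape taken from the examples on this page:

```python
def split_tags(tag_entries):
    """Split tag entries into (meta, tags): keyed values versus bare keys."""
    meta = {}
    tags = []
    for entry in tag_entries:
        if "value" in entry:
            # Key/value entries become meta properties.
            meta[entry["key"]] = entry["value"]
        else:
            # Entries with only a key become simple tags.
            tags.append(entry["key"])
    return meta, tags
```

For the cURL example earlier, `split_tags([{"key": "environment", "value": "prod"}, {"key": "automated"}])` gives `({"environment": "prod"}, ["automated"])`.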

Generating a new trigger key: If you need to create a new integration key, go to the Sources page and click on the three-dot menu next to your OpenLineage source. Select "Reset Trigger Key" to generate a new key and endpoint URL.
