# OpenLineage

Euno's OpenLineage integration ingests data lineage events from any system that produces [OpenLineage](https://openlineage.io/)-compliant events. It processes lineage metadata, table information, and tags to build a view of your data pipeline dependencies and transformations.

### How It Works

The integration follows these steps:

1. **Provides a secure endpoint**: Euno generates a unique trigger secret and endpoint URL for receiving OpenLineage events
2. **Accepts OpenLineage events**: The integration accepts both single events and arrays of events in standard OpenLineage format via HTTP POST
3. **Processes lineage and metadata**:
   * Extracts table information from input and output datasets
   * Creates lineage relationships between input and output tables
   * Processes tags and metadata from dataset facets
   * Validates naming conventions for supported data warehouses
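
The first two processing steps can be pictured with a short Python sketch. This is an illustration only, not Euno's implementation: the function names `normalize_events` and `collect_datasets` are hypothetical, and the sketch assumes that a request body is either one JSON object or a JSON array of objects, as described above.

```python
import json


def normalize_events(body: str) -> list[dict]:
    """Parse a request body that holds one event object or an array of events."""
    payload = json.loads(body)
    if isinstance(payload, dict):
        return [payload]
    if isinstance(payload, list) and all(isinstance(e, dict) for e in payload):
        return payload
    raise ValueError("expected a JSON object or an array of JSON objects")


def collect_datasets(events: list[dict]) -> set[tuple[str, str]]:
    """Gather the unique (namespace, name) pairs found in inputs and outputs."""
    seen = set()
    for event in events:
        for side in ("inputs", "outputs"):
            # Events such as START may omit inputs or outputs entirely.
            for dataset in event.get(side, []):
                seen.add((dataset["namespace"], dataset["name"]))
    return seen
```

Treating a single event as a one-element list lets the rest of the processing handle both payload shapes the same way.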

### Supported Data Warehouses

Euno's OpenLineage integration currently supports:

* **Snowflake** - with naming convention:
  * namespace: `snowflake://org-account`
  * name: `database.schema.table`
* **BigQuery** - with naming convention:
  * namespace: `bigquery`
  * name: `project.dataset.table`

For detailed naming conventions, see the [OpenLineage Naming Specification](https://openlineage.io/docs/spec/naming).
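
If you want to check dataset identifiers before sending them, a local pre-check along these lines may help. It is a minimal sketch based only on the two conventions listed above. The helper name `detect_warehouse` and the regular expressions are assumptions, and Euno's own validation may be stricter or more lenient (for example, quoted identifiers that contain dots are not handled here).

```python
import re
from typing import Optional

# Patterns mirror the conventions listed above; they are assumptions,
# not Euno's actual validation rules.
SNOWFLAKE_NAMESPACE = re.compile(r"^snowflake://[^/\s]+$")
THREE_PART_NAME = re.compile(r"^[^.\s]+\.[^.\s]+\.[^.\s]+$")


def detect_warehouse(namespace: str, name: str) -> Optional[str]:
    """Return 'snowflake' or 'bigquery' when the pair matches a convention, else None."""
    if not THREE_PART_NAME.match(name):
        return None  # name must look like a.b.c
    if namespace == "bigquery":
        return "bigquery"
    if SNOWFLAKE_NAMESPACE.match(namespace):
        return "snowflake"
    return None
```

For example, `detect_warehouse("snowflake://myorg-account123", "raw_data.public.customer_events")` returns `"snowflake"`, while a two-part name such as `dataset.table` returns `None`.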

## Setting up Euno's OpenLineage Integration

### Step 1: Configure New OpenLineage Source in Euno

#### Access the Sources Page

1. Navigate to the **Sources** page in the Euno application
2. Click on the **Add New Source** button
3. Select **OpenLineage** from the available integrations

### Step 2: General Configuration

1. **Name**: Enter a descriptive name for your OpenLineage source (e.g., "Data Pipeline Lineage")
2. **Configuration Details**:
   * The OpenLineage integration requires minimal configuration because it is push-based
   * No schedule configuration is needed since events are pushed in real time

### Step 3: Resource Cleanup Options

Configure automatic **resource cleanup** options to manage outdated resources:

* **Time-Based Cleanup (default)**: Remove resources not detected for X days (default: 7 days)
* **Immediate Cleanup**: Remove resources not detected in the most recent run
* **No Cleanup**: Keep all resources indefinitely
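
To make the time-based option concrete, the sketch below shows one way such a rule could be evaluated. It is purely illustrative: the function `stale_resources`, the `last_seen` mapping, and the idea that "not detected" is measured from a last-seen timestamp are all assumptions, not a description of Euno's internals.

```python
from datetime import datetime, timedelta, timezone


def stale_resources(last_seen: dict[str, datetime], now: datetime, max_age_days: int = 7) -> list[str]:
    """List resources whose last detection is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(name for name, seen in last_seen.items() if seen < cutoff)


# Hypothetical data: one table seen yesterday, one seen eight days ago.
now = datetime(2024, 1, 15, tzinfo=timezone.utc)
last_seen = {
    "analytics.public.daily_stats": now - timedelta(days=1),
    "raw.public.events": now - timedelta(days=8),
}
removable = stale_resources(last_seen, now)  # uses the 7-day default
```

With the 7-day default, only `raw.public.events` falls outside the window in this made-up data.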

### Step 4: Save Configuration

Click **Save**. Euno generates a trigger secret. **Copy and store this secret securely**, as it will not be displayed again.

### Step 5: Get the Upload Endpoint

1. Click **"Reset Trigger Key"** to get the endpoint URL
2. Copy the endpoint URL; this is where you'll send OpenLineage events
3. Send the trigger secret from Step 4 as a Bearer token in the `Authorization` header of each request
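
The endpoint URL and trigger secret combine into a single authenticated POST request. The following Python sketch builds such a request with the standard library only. The helper name `build_request` is hypothetical, and the URL and secret you pass in are placeholders for the values from your own source.

```python
import json
import urllib.request


def build_request(endpoint_url: str, trigger_secret: str, events) -> urllib.request.Request:
    """Build a POST request carrying one event (dict) or several events (list)."""
    body = json.dumps(events).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {trigger_secret}",
        },
    )


# Sending performs a real HTTP call, so it is left commented out here:
# with urllib.request.urlopen(build_request(url, secret, event)) as response:
#     print(response.status)
```

Keeping the secret out of source code, for instance by reading it from an environment variable of your choosing, avoids leaking it through version control.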

## Sending OpenLineage Events

### Example: Single Event with Lineage and Tags

Here's a complete example of an OpenLineage event with input/output lineage and tags:

```json
{
  "eventType": "COMPLETE",
  "eventTime": "2024-01-15T10:30:00.001Z",
  "run": {
    "runId": "my-etl-run-12345"
  },
  "job": {
    "namespace": "production-pipeline",
    "name": "customer-analytics-etl"
  },
  "inputs": [
    {
      "namespace": "snowflake://myorg-account123",
      "name": "raw_data.public.customer_events"
    },
    {
      "namespace": "bigquery", 
      "name": "external_data.staging.product_catalog"
    }
  ],
  "outputs": [
    {
      "namespace": "snowflake://myorg-account123",
      "name": "analytics.public.customer_analytics",
      "facets": {
        "tags": [
          {
            "key": "environment",
            "value": "production"
          },
          {
            "key": "team", 
            "value": "data-engineering"
          },
          {
            "key": "contains_pii"
          },
          {
            "key": "data_classification",
            "value": "sensitive"
          }
        ],
        "schema": {
          "fields": [
            {
              "name": "customer_id",
              "type": "BIGINT",
              "description": "Unique customer identifier"
            },
            {
              "name": "total_purchases",
              "type": "DECIMAL(10,2)",
              "description": "Total purchase amount"
            }
          ]
        }
      }
    }
  ],
  "producer": "https://my-etl-system.com/v1.2.0",
  "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
}
```
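
If your pipeline is written in Python, an event of this shape can be assembled programmatically. The sketch below is one possible approach; `make_complete_event` is a hypothetical helper. It generates a UUID for `runId` because the OpenLineage specification defines that field as a UUID. Whether Euno enforces this is not stated on this page.

```python
import uuid
from datetime import datetime, timezone

SCHEMA_URL = "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"


def make_complete_event(job_namespace, job_name, inputs, outputs, producer):
    """Build a COMPLETE event; inputs and outputs are (namespace, name) pairs."""
    now = datetime.now(timezone.utc)
    return {
        "eventType": "COMPLETE",
        # ISO 8601 timestamp in UTC with millisecond precision and a "Z" suffix
        "eventTime": now.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
        "producer": producer,
        "schemaURL": SCHEMA_URL,
    }
```

Facets such as `tags` or `schema` can then be attached to individual output entries before the event is serialized.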

### cURL Command Examples

#### Single Event Upload

```bash
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TRIGGER_SECRET_HERE" \
  -d '{
    "eventType": "COMPLETE",
    "eventTime": "2024-01-15T10:30:00.001Z",
    "run": {"runId": "simple-etl-123"},
    "job": {"namespace": "my-pipeline", "name": "daily-aggregation"},
    "inputs": [{
      "namespace": "snowflake://myorg-account123", 
      "name": "raw.public.events"
    }],
    "outputs": [{
      "namespace": "snowflake://myorg-account123",
      "name": "analytics.public.daily_stats",
      "facets": {
        "tags": [
          {"key": "environment", "value": "prod"},
          {"key": "automated"}
        ]
      }
    }],
    "producer": "my-pipeline-v1.0",
    "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
  }' \
  https://api.app.euno.ai/accounts/YOUR_ACCOUNT_ID/integrations/YOUR_INTEGRATION_ID/run
```

#### Multiple Events Upload

```bash
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TRIGGER_SECRET_HERE" \
  -d '[
    {
      "eventType": "START", 
      "eventTime": "2024-01-15T10:00:00.001Z",
      "run": {"runId": "batch-job-456"},
      "job": {"namespace": "etl", "name": "batch-processor"},
      "producer": "scheduler-v2.1",
      "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
    },
    {
      "eventType": "COMPLETE",
      "eventTime": "2024-01-15T10:15:00.001Z", 
      "run": {"runId": "batch-job-456"},
      "job": {"namespace": "etl", "name": "batch-processor"},
      "inputs": [{"namespace": "bigquery", "name": "raw.events.user_actions"}],
      "outputs": [{"namespace": "bigquery", "name": "processed.analytics.user_metrics"}],
      "producer": "scheduler-v2.1", 
      "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
    }
  ]' \
  https://api.app.euno.ai/accounts/YOUR_ACCOUNT_ID/integrations/YOUR_INTEGRATION_ID/run
```

#### Upload from File

```bash
# Save your event to a file
cat > event.json << 'EOF'
{
  "eventType": "COMPLETE",
  "eventTime": "2024-01-15T10:30:00.001Z",
  "run": {"runId": "file-upload-test"},
  "job": {"namespace": "testing", "name": "file-upload"},
  "outputs": [{
    "namespace": "snowflake://myorg-account123",
    "name": "test.public.sample_table"
  }],
  "producer": "test-script", 
  "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"
}
EOF

# Upload the file
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TRIGGER_SECRET_HERE" \
  -d @event.json \
  https://api.app.euno.ai/accounts/YOUR_ACCOUNT_ID/integrations/YOUR_INTEGRATION_ID/run
```

## What Gets Observed in Euno

When OpenLineage events are processed, Euno observes:

### Table Resources

* **Tables** from both inputs and outputs with properties:
  * `name`: Table name
  * `database`: Database name
  * `schema`: Schema name
  * `database_technology`: `snowflake` or `bigquery`
  * `type`: `table`
  * `meta`: see Tags and Metadata below
  * `tags`: see Tags and Metadata below
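
As a rough illustration of how a dataset entry might map onto these properties, consider the sketch below. The function `table_resource` is hypothetical. Splitting the three-part name into `database`, `schema`, and `name` follows the conventions listed earlier, and treating a BigQuery project as the database and a dataset as the schema is an assumption.

```python
def table_resource(namespace: str, name: str) -> dict:
    """Map an OpenLineage dataset identifier to table properties (illustrative)."""
    if namespace == "bigquery":
        technology = "bigquery"
    elif namespace.startswith("snowflake://"):
        technology = "snowflake"
    else:
        raise ValueError(f"unsupported namespace: {namespace!r}")

    parts = name.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected a three-part name, got: {name!r}")
    database, schema, table = parts

    return {
        "name": table,
        "database": database,  # assumed: BigQuery project for bigquery datasets
        "schema": schema,      # assumed: BigQuery dataset for bigquery datasets
        "database_technology": technology,
        "type": "table",
    }
```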

### Lineage Relationships

* **Lineage**: Output tables get lineage pointing to input tables
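
One way to picture this relationship is as a set of (output, input) pairs derived from each event. The sketch below assumes that every output of an event is linked to every input of the same event. That pairing rule and the function name `lineage_edges` are assumptions made for illustration.

```python
def lineage_edges(event: dict) -> list[tuple[tuple[str, str], tuple[str, str]]]:
    """Return (output, upstream_input) pairs, each dataset as (namespace, name)."""
    inputs = [(d["namespace"], d["name"]) for d in event.get("inputs", [])]
    outputs = [(d["namespace"], d["name"]) for d in event.get("outputs", [])]
    # Each output points back to each input of the same event.
    return [(out, inp) for out in outputs for inp in inputs]
```

Under this assumption, an event with two inputs and one output yields two pairs, and an event with no inputs yields none.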

### Tags and Metadata

* **Tags with values** (e.g., `{"key": "environment", "value": "prod"}`) become **meta properties**
* **Tags without values** (e.g., `{"key": "pii"}`) become **simple tags**
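
The split between the two kinds of tag entries can be sketched as follows. The helper `split_tags` is hypothetical, and treating "has a `value` key" as the deciding condition is an assumption. How Euno handles an explicit `null` value is not covered on this page.

```python
def split_tags(tag_entries: list[dict]) -> tuple[dict, list[str]]:
    """Separate tag entries into meta properties and simple tags."""
    meta: dict = {}
    simple_tags: list[str] = []
    for entry in tag_entries:
        if "value" in entry:
            meta[entry["key"]] = entry["value"]  # key/value pair -> meta property
        else:
            simple_tags.append(entry["key"])     # bare key -> simple tag
    return meta, simple_tags
```

Applied to the `tags` facet from the single-event example, this yields three meta properties (`environment`, `team`, `data_classification`) and one simple tag (`contains_pii`).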

**Generating a new trigger secret:** If you need to rotate the secret, go to the **Sources** page and click on the three-dot menu next to your OpenLineage source. Select **"Reset Trigger Key"** to generate a new trigger secret and endpoint URL.

