🖱️Pre-Aggregate Models

The pre-aggregates feature allows users to generate pre-aggregate models and tables based on dbt metrics. These pre-aggregate tables are summarized versions of data tables, created to reduce compute costs and optimize query times. By automating the generation and refreshing of pre-aggregate tables, this feature streamlines data processing and enhances performance, allowing for quicker data retrieval and analysis.

How to Generate Pre-aggregate Models

To generate a pre-aggregate model, start with the metric you would like to base the pre-aggregate model on. In the data model screen, navigate to the specific metric, whether in tabular or graph view. Once the user clicks on the metric's sidebar, near the top, there is a button called “Add pre-aggregate.” Once clicked, the review change draft screen will be displayed to the user, where the details about the pre-aggregate table can be defined.

Review Change Draft

Define Change Details

Change Title: By default, the system auto-generates a title using the structure: "Add aggregate model {model-name} to <dbt-project-name>." This title can be edited.
Change Description: By default, the system auto-generates a description using the structure: "An auto-generated aggregate model based on dbt metric(s); {metric(s)} being added to <dbt-project-name>." This description can be edited.

Define Fields for Pre-aggregate Table

Target Branch: Specify the target branch to add the change to.
Metrics: Specify the metrics on which the pre-aggregate table is based. By default, the metric that initiated the change will be included. Additional metrics can be added or removed.
Filter: Optional field. Users can add a filter using the free-text input.
Metric Time: By default, the addition of the metric time field is added with a grain of “Day.” This field is optional, and the user can also change the grain (e.g., day, week, month, quarter, year).
Group By: Define the columns of the table:
- Entities: Optional field. Select an entity from the list of associated entities, depending on the metrics being used. If there are multiple join paths, users can select the specific path.
- Dimensions: Optional field. Select dimensions from the list that are joinable to the selected metrics. If the group-by field is a timestamp-type dimension, you can select the grain.
Timeframe: Optional. If selected user must:
- Start Date: Specify a start date.
- End Date: Specify an end date.

Define Model Properties

Model Name: Enter the name of the model. This field is mandatory; the model name must be defined by the user to proceed with the change draft creation.
Model Description: Enter a description for the model.
Model Tags: Add tags to the model. By default, the tag "euno-auto-generated" is used.
Model Meta: Add metadata to the model.
Model Materialization: Choose how the model should be materialized (view, table, incremental).
Model Schema: Select the schema from the list of available schemas in the dbt project. By default, the schema will be taken from the underlying model of the metric the change was triggered from.

Define Column Properties

For each column within the table, the following information can be defined:

Name: The name of the column. This field is non-editable.
Alias: Provide an alias for each column.
Description: Provide a description for each column.
Tags: Add tags to each column.
Meta: Add metadata to each column.
Tests: Define tests for each column:
- Unique: By default, not filled out by Euno, but can be edited by the user.
- Not Null: By default, not filled out by Euno, but can be edited by the user.
- Accepted Values: A list of strings, which by default is empty, but can be edited by the user.
- Relationships: By default, not filled out by Euno, but can be edited by the user. The user will select from a dropdown list the table, and then from a second dropdown list the column.

Once satisfied with the proposal, clicking the “Create change draft” button will transform the proposal into a change draft. From there, the workflow and lifecycle are the same as any other change automation in Euno. For more details, see the Change Draft Screen under the Shift-left Proposals automation page.

Note: When reviewing the code diff for the model change, you may notice that inside the metadata of the change, Euno will automatically add additional system information to the metadata of the model. This system information will be utilized in a future automation that will allow users with these types of models to refresh the underlying SQL of the model if one of the underlying metrics changes. The structure of the system metadata is similar to that of dbt’s saved queries. This structure will also allow users to easily create a saved query for the model based on the model's metadata.

meta:
  euno.ai/start_time: 2023-01-01T00:00:00Z
  euno.ai/end_time: 2023-12-31T23:59:59Z
  euno.ai/saved_query: |
    metrics:
      - sales
      - revenue
    group_by:
      - region
      - product_category
    where: region = 'North America'

Refresh Automation for Pre-Aggregate Models

This feature automates the process of refreshing pre-materialized aggregate models based on changes in the underlying dbt metrics used in the model. The refresh operation ensures that the model’s SQL is up-to-date with the latest metric definitions.

Note: Only Euno-generated pre-aggregate models qualify for the automated refresh. These models include the Euno-inserted meta key euno.ai/saved_query.

Identification of Out-of-Sync Pre-Aggregate Models

To identify pre-aggregate models that are out of sync with the underlying data, either use the basic filtering on the data model screen by adding the filters "Is pre-aggregated" set to "True" and "Is outdated" set to "True," or use EQL filtering with out_of_sync_pre_aggregate_model set to true and is_automatic_pre_aggregated_dbt_model set to true. This will return a list of all models that are out of sync with the underlying metrics.

Triggering the Automated Refresh

If a resource qualifies for the refresh, the user can click on the metric's sidebar. Near the top, there is a button called “Refresh underlying SQL query.” Once clicked, a pop-up displaying the diff will appear, allowing the user to either continue or cancel the automation.

If "Continue" is clicked, another dialog box will open, allowing the user to edit the change title, description, and target branch.

Once the user clicks "Create change draft," they will be redirected to the "Model Changes" screen, where the first row in the "In Progress" tab will display the Type: “Refresh underlying SQL query.” Clicking on the specific change will redirect the user to the change draft page, where they can see the details of the change and the diff. To create the PR, the user simply needs to click the "Create Pull Request" button located in the top-right corner of the page.

PreviousShift-left Proposals NextReference

Last updated 5 months ago