Skip to main content

Dagster & Snowflake with components

The dagster-snowflake library provides both a BaseSnowflakeSqlComponent, which can be used to write your own Snowflake components, and a ready-to-use SnowflakeTemplatedSqlComponent which can be used to execute SQL queries from Dagster in order to rebuild data assets in Snowflake. This guide will walk you through how to use the SnowflakeTemplatedSqlComponent to create a component that will execute custom SQL.

1. Prepare a Dagster project

To begin, you'll need a Dagster project. You can use an existing components-ready project or create a new one:

create-dagster project my-project && cd my-project/src

Activate the project virtual environment:

source ../.venv/bin/activate

Finally, add the dagster-snowflake library to the project:

uv add dagster-snowflake

2. Scaffold a Snowflake SQL component

Now that you have a Dagster project, you can scaffold a Snowflake SQL component. You'll need to provide a name for your component. In this example, we'll create a component that will execute a SQL query to calculate the daily revenue from a table of sales transactions.

dg scaffold defs dagster_snowflake.SnowflakeTemplatedSqlComponent daily_revenue
Creating defs at /.../my-project/src/my_project/defs/daily_revenue.

The scaffold call will generate a defs.yaml file:

tree my_project/defs
my_project/defs
├── __init__.py
└── daily_revenue
└── defs.yaml

2 directories, 2 files

3. Configure Snowflake resource

You'll need to configure a Snowflake resource to enable your component to connect to your Snowflake instance. For more information on the Snowflake resource, see the Using Snowflake with Dagster guide.

Create a resources.py file in your defs directory:

my_project/defs/resources.py
from dagster_snowflake import SnowflakeResource

import dagster as dg

defs = dg.Definitions(
resources={
"snowflake": SnowflakeResource(
account=dg.EnvVar("SNOWFLAKE_ACCOUNT"),
user=dg.EnvVar("SNOWFLAKE_USER"),
password=dg.EnvVar("SNOWFLAKE_PASSWORD"),
database=dg.EnvVar("SNOWFLAKE_DATABASE"),
schema=dg.EnvVar("SNOWFLAKE_SCHEMA"),
)
}
)
tree my_project/defs
my_project/defs
├── __init__.py
├── daily_revenue
│   └── defs.yaml
└── resources.py

2 directories, 3 files

You will only need a single resource in your project for each Snowflake instance you'd like to connect to - this resource can be used by multiple components.

4. Write custom SQL

You can customize the SQL template and define the assets that will be created. Update your defs.yaml file with a SQL template and template variables. You can also specify properties for the asset in Dagster, such as a group name and kind tag:

my_project/defs/daily_revenue/defs.yaml
type: dagster_snowflake.SnowflakeTemplatedSqlComponent

attributes:
sql_template: |
SELECT
DATE_TRUNC('day', {{ date_column }}) as date,
SUM({{ amount_column }}) as daily_revenue
FROM {{ table_name }}
WHERE {{ date_column }} >= '{{ start_date }}'
GROUP BY DATE_TRUNC('day', {{ date_column }})
ORDER BY date

sql_template_vars:
table_name: SALES_TRANSACTIONS
date_column: TRANSACTION_DATE
amount_column: SALE_AMOUNT
start_date: "2024-01-01"

assets:
- key: ANALYTICS/DAILY_REVENUE
group_name: analytics
kinds: [snowflake]

You can run dg list defs to see the asset corresponding to your component:

dg list defs
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┓ │
│ │ ┃ Key ┃ Group ┃ Deps ┃ Kinds ┃ Description ┃ │
│ │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━┩ │
│ │ │ ANALYTICS/DAILY_REVENUE │ analytics │ │ snowflake │ │ │
│ │ └─────────────────────────┴───────────┴──────┴───────────┴─────────────┘ │
│ Resources │ ┏━━━━━━━━━━━┓ │
│ │ ┃ Key ┃ │
│ │ ┡━━━━━━━━━━━┩ │
│ │ │ snowflake │ │
│ │ └───────────┘ │
└───────────┴──────────────────────────────────────────────────────────────────────────┘

5. Launch your assets

Now, you can launch your asset using the UI or CLI to execute your SQL query and rebuild the table in Snowflake:

dg launch --assets '*'