dbt Meets High-Performance Analytics: The Exasol + dbt Integration
If you’ve been working in analytics engineering for any amount of time, you’ve probably noticed a tension. On one side, you’ve got tools like dbt that have completely transformed how teams manage transformation logic with version-controlled SQL, tested pipelines, and clear lineage. On the other side, most data platforms make you choose between developer experience and raw performance.
We don’t think you should have to choose. That’s why we’re excited about the dbt-exasol adapter, which brings dbt’s transformation framework to Exasol’s high-performance analytics engine.
Why This Matters
dbt has become the standard for analytics engineering. It’s how modern data teams define, test, and document their transformation logic. But dbt is only as fast as the engine running your SQL, and that’s where Exasol comes in.
Exasol is purpose-built for analytical workloads. It’s an in-memory, massively parallel processing (MPP) database that routinely outperforms platforms many times its size. When you pair dbt’s transformation framework with a database that can execute complex analytical queries in seconds rather than minutes, you get fast iteration cycles, reliable pipelines, and the headroom to scale into demanding workloads like AI feature engineering without hitting a performance wall.
Getting Started
Setting up the adapter takes about two minutes. Install it alongside dbt Core:
```bash
python -m pip install dbt-exasol
```
Then configure your profiles.yml:
```yaml
my_exasol_project:
  target: dev
  outputs:
    dev:
      type: exasol
      threads: 4
      dsn: HOST:PORT
      user: USERNAME
      password: PASSWORD
      dbname: db
      timestamp_format: "YYYY-MM-DD HH:MI:SS.FF6"
      schema: MY_SCHEMA
```
If you’re running Exasol SaaS, there’s built-in OpenID authentication support. Just swap your username and password for an access_token or refresh_token and enable encryption. It’s the kind of setup you’d expect from a modern cloud analytics stack.
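As a sketch, a SaaS profile might look like this (the endpoint is a placeholder, and the exact encryption option name is an assumption worth verifying against the adapter documentation):

```yaml
my_exasol_project:
  target: prod
  outputs:
    prod:
      type: exasol
      threads: 4
      dsn: your-cluster.clusters.exasol.com:8563  # placeholder SaaS endpoint
      access_token: YOUR_ACCESS_TOKEN             # instead of user/password
      dbname: db
      schema: MY_SCHEMA
      encryption: true                            # assumed option name; check the adapter docs
```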
Performance-Aware Transformations
The dbt-exasol adapter doesn’t just translate your dbt models into SQL and hope for the best. Starting with version 1.8.1, you can configure Exasol-specific optimizations directly in your model configs:
```sql
{{ config(
    materialized='table',
    partition_by_config='order_date',
    distribute_by_config='customer_id',
    primary_key_config=['order_id']
) }}

SELECT
    order_id,
    customer_id,
    order_date,
    total_amount
FROM {{ ref('stg_orders') }}
```
The partition_by_config and distribute_by_config parameters let you tell Exasol how to physically organize your data for optimal query performance. This matters when your downstream dashboards need sub-second response times or when your data scientists are running heavy aggregations across billions of rows.
Why dbt on Exasol Makes Sense
The config parameters above are useful, but the bigger story is what dbt brings to your data team and why Exasol is a particularly good engine to run it on.
Start with the economics: dbt Core is open-source and free. If you’ve been managing six-figure invoices for proprietary ETL tools while rationing licenses across your team, that matters.
Then there’s the architecture. dbt turns Exasol into your transformation engine directly. Data lands in Exasol, transformations run as SQL inside Exasol, and you skip the overhead of shuttling data through external tools. All transformation logic lives in version-controlled code rather than a proprietary GUI, so your team gets proper collaboration, code review, and automated testing as part of the standard workflow.
On top of that, dbt handles common pain points that traditionally eat up engineering time: reusable transformation patterns through its macro system, built-in historical tracking via snapshots, automated data quality checks, and consistent code standards through linting tools like sqlfluff.
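To make the data quality piece concrete, here's what a schema file with dbt's built-in generic tests might look like (illustrative model and column names):

```yaml
# models/marts/schema.yml -- illustrative names
version: 2

models:
  - name: orders
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: customer_id
        tests:
          - relationships:
              to: ref('stg_customers')
              field: customer_id
```

Every `dbt test` run then validates these constraints against the data in Exasol, so quality checks travel with the code instead of living in a separate tool.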
The result is tested, version-controlled transformation code running on an engine built for analytical speed.
Incremental Models and Microbatch Processing
Nobody wants to rebuild their entire warehouse every time new data arrives. The adapter supports dbt’s incremental materialization strategies, including the microbatch approach for time-series data (introduced in dbt-exasol 1.10, requires dbt-core 1.10+):
```sql
{{ config(
    materialized='incremental',
    incremental_strategy='microbatch',
    event_time='created_at',
    begin='2024-01-01',
    batch_size='day',
    lookback=2
) }}

SELECT * FROM {{ ref('raw_events') }}
```
Microbatch processes your data in discrete time windows using a DELETE + INSERT pattern, with each batch running as a separate transaction. Combined with Exasol's in-memory speed, your incremental pipelines aren't just correct, they're fast. There's even a sample mode (`dbt run --sample="2 days"`) for quick development iterations without processing your full dataset. Note that sample mode also requires dbt-core 1.10+.
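Conceptually, each batch boils down to something like the following (illustrative SQL with hypothetical schema and table names, not the adapter's exact generated statements):

```sql
-- One daily microbatch (here: 2024-01-15), executed as its own transaction
DELETE FROM analytics.events
WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16';

INSERT INTO analytics.events
SELECT *
FROM staging.raw_events
WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16';

COMMIT;
```

Because each window is independent, a failed batch can be retried without touching the rest of the table.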
Where AI Workflows Come In
Modern analytics isn’t just about dashboards and reports anymore. Teams are building feature stores, training ML models, and deploying AI agents that need to query analytical data in real time.
dbt on Exasol gives you a clean, tested, version-controlled foundation for all of this.
Feature engineering at scale. Use dbt to define and maintain your ML feature pipelines. Exasol’s MPP architecture means those heavy window functions, rolling aggregates, and cross-table joins that feature engineering demands actually run fast enough to iterate on during development, not just in overnight batch jobs.
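A feature model in this style might look like the following sketch (hypothetical columns, building on a staging model such as `stg_orders`):

```sql
-- models/features/customer_order_features.sql -- illustrative sketch
{{ config(materialized='table') }}

SELECT
    customer_id,
    order_id,
    order_date,
    -- lifetime order count per customer
    COUNT(*) OVER (PARTITION BY customer_id) AS lifetime_orders,
    -- rolling average over the customer's last 10 orders
    AVG(total_amount) OVER (
        PARTITION BY customer_id
        ORDER BY order_date
        ROWS BETWEEN 9 PRECEDING AND CURRENT ROW
    ) AS avg_amount_last_10_orders
FROM {{ ref('stg_orders') }}
```

Window frames like these are exactly the kind of work an MPP, in-memory engine parallelizes well, which is what makes interactive feature iteration feasible.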
AI-ready data layers. dbt’s modeling patterns (staging, intermediate, marts) give you a structured way to build the clean, well-documented data layers that AI applications need. One especially useful feature here: dbt can automatically push your model and column documentation into Exasol itself using persist_docs. Add this to your project config and every description you write in your YAML files gets propagated to the database as table and column comments with each build. That means tools like Exasol’s MCP server can read rich metadata directly from the database, giving AI agents real context about what each table and column means without any manual upkeep. When your LLM-powered agent needs to query your data, it’s working with tested, documented models rather than ad-hoc SQL sprawl.
Real-time analytical serving. Exasol’s in-memory architecture means your dbt-built models can serve as a high-performance backend for AI applications that need sub-second analytical queries. Teams are using this for everything from recommendation engines to anomaly detection systems that need to crunch large volumes of data on the fly.
Agentic AI patterns. If you’re exploring database agents (AI systems that can autonomously query and reason over your data), a well-structured dbt project on Exasol unlocks immediate value for your data consumers: model and column descriptions authored in dbt are propagated directly as table and view descriptions in Exasol, making your data warehouse self-documenting. This metadata is effortlessly picked up by Exasol’s MCP server, giving your LLM the context it needs to produce higher-quality, more accurate query results.
To enable this, add the following to your dbt project configuration:
```yaml
models:
  +persist_docs:
    relation: true
    columns: true
```
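With that setting in place, every description you author in your dbt schema files is written into Exasol as a table or column comment on each build. For instance (illustrative names):

```yaml
# models/marts/schema.yml
version: 2

models:
  - name: orders
    description: "One row per completed order; grain is order_id."
    columns:
      - name: total_amount
        description: "Order total including tax."
```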
On the engineering side, because your dbt project is code-based, analytics engineers gain a powerful advantage: an LLM can read the project files directly, understand the transformation logic, and actively assist in development by generating code changes, suggesting improvements, and running the project to close the build-test loop faster than ever.
Use Cases We’re Seeing
Financial services teams use dbt on Exasol to build compliant, auditable transformation pipelines that can handle the computational demands of risk modeling and regulatory reporting, without the multi-hour batch windows they’d need on other platforms.
Healthcare and pharma organizations manage complex clinical trial data pipelines where data integrity is non-negotiable and the ability to trace lineage from raw source to final analytical model matters for regulatory reasons.
Retail and e-commerce companies build customer analytics pipelines that combine transactional history with real-time behavioral data, using dbt’s incremental models on Exasol to keep their recommendation and personalization systems current.
Data platform teams consolidate fragmented transformation logic into a single dbt project backed by a database that can actually handle the consolidated workload without degrading performance.
The Ecosystem
The adapter supports the broader dbt ecosystem through the dbt-exasol-utils shim package (which users need to install separately), which ensures tools like dbt-utils and dbt_date work smoothly with Exasol’s SQL dialect. SSL/TLS encryption is enabled by default (and enforced starting from Exasol 8), and there’s proper certificate validation built in. Native embed adapter support is planned but not yet available (https://github.com/exasol/dbt-exasol/issues/157).
It’s maintained as open-source under the Apache 2.0 license, with active development from both the Exasol team and the community. The adapter supports dbt Core v1.8.0 and newer (latest release is v1.10.4), with compatibility for Exasol 7.x and above.
A couple of current limitations worth knowing about: Python models are not yet supported (work in progress), and materialized views are unavailable in Exasol.
What’s Next
We see the dbt integration as one piece of a larger story. Exasol is becoming a high-performance analytics platform where teams can run transformations, serve AI workloads, and build intelligent applications, all without stitching together a dozen different tools.
If you’re already using dbt and you’re hitting performance limits with your current data platform, give dbt-exasol a try. And if you’re already on Exasol but haven’t explored dbt yet, this is a good time to start bringing modern analytics engineering practices to your fastest analytical engine.
Get started with the dbt-exasol adapter on GitHub or check out the official setup documentation.
Have questions about running dbt on Exasol? We’d love to hear from you. Reach out to the Exasol community or open an issue on GitHub.