More teams are shifting their data quality checks out of dashboards and into the transformation layer itself. It’s obvious why: the transformation code is the first place that touches real data. If a check fails here, the pipeline stops before corrupted rows ever land downstream. No backfills, no detective work, no “how long has this been wrong?” scramble. And the value cuts both ways:
- You catch issues before they ever hit the data warehouse or lake, right at the ingestion and preprocessing layers.
- You catch issues after the data warehouse too, in the pipelines that stream data to downstream destinations, models, APIs, and operational systems.
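To make the "pipeline stops before corrupted rows land downstream" idea concrete, here is a minimal sketch in plain Python. The names `check_orders`, `transform_and_load`, and `DataQualityError`, as well as the `order_id` and `amount` fields, are hypothetical and chosen only for illustration; this does not use or represent the Elementary SDK.

```python
class DataQualityError(Exception):
    """Raised when a batch fails a check, so the load step never runs."""


def check_orders(rows):
    """Return a list of human-readable problems found in the batch."""
    problems = []
    for i, row in enumerate(rows):
        if row.get("order_id") is None:
            problems.append(f"row {i}: missing order_id")
        amount = row.get("amount")
        if amount is not None and amount < 0:
            problems.append(f"row {i}: negative amount")
    return problems


def transform_and_load(rows, sink):
    """Validate first; write to the sink only if every check passes."""
    problems = check_orders(rows)
    if problems:
        # Fail before any row is written downstream.
        raise DataQualityError("; ".join(problems))
    sink.extend(rows)
    return len(rows)
```

Because the check runs inside the same function as the write, a bad batch raises before `sink` is touched, so nothing needs to be backfilled afterwards.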
Python: The Backbone of Modern Data Engineering
Python has become the backbone of modern data engineering - especially in pipelines that go beyond SQL. It now drives:
- Ingestion and storage of unstructured data
- Vectorization and embedding pipelines for AI systems
- ML model training and feature generation
- Monitoring of model inputs and outputs
- Hybrid pipelines that mix structured, semi-structured, and free-form data
Wrapping Existing Tools Instead of Inventing New Ones
Engineers already have strong opinions about how they want to write tests. Some rely on Great Expectations, others on DQX, pytest-based workflows, or homegrown frameworks. Reinventing a new test engine or DSL would just fragment the landscape - so we didn’t. We focused on the simplest possible layer: a lightweight Python SDK that captures any Python test result, from any framework, and reports it to Elementary. You keep your code - we handle the metadata, structure, and visibility. This means full observability without dictating how you build.
Built for Teams That Treat Their Data Pipelines Like Software
Elementary has always leaned into engineering-first workflows. Our deep integration with dbt set that foundation. Extending this into Python is the natural continuation of that approach. As more transformations shift into Python (PySpark, SQL generation, AI/ML pipelines, unstructured data processing), teams want the same capabilities they rely on when using Elementary with dbt:
- Understand what ran
- Track when it ran
- Measure how long it took
- Identify which upstream assets fed it
- Trace which downstream assets it produced
- Run data quality checks on the product and see the results
- Get alerts on data issues as soon as they happen
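As a rough illustration of the metadata in the list above (what ran, when, how long, upstream and downstream assets, and test results), the sketch below collects it with a context manager using only the standard library. `RunRecord`, `tracked_run`, and the asset names are hypothetical; this is not the Elementary SDK's API, which is covered in the setup and usage guides.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RunRecord:
    """Hypothetical container for one pipeline run's metadata."""
    name: str                      # what ran
    upstream: list                 # assets that fed the run
    downstream: list               # assets the run produced
    started_at: Optional[datetime] = None
    duration_seconds: Optional[float] = None
    status: str = "running"
    test_results: list = field(default_factory=list)

    def add_test(self, test_name, passed):
        """Record the outcome of a check from any test framework."""
        self.test_results.append({"test": test_name, "passed": bool(passed)})


@contextmanager
def tracked_run(name, upstream, downstream):
    """Time a block of pipeline code and record how it finished."""
    record = RunRecord(name, list(upstream), list(downstream))
    record.started_at = datetime.now(timezone.utc)
    start = time.perf_counter()
    try:
        yield record
        failed = [t for t in record.test_results if not t["passed"]]
        record.status = "tests_failed" if failed else "success"
    except Exception:
        record.status = "failed"
        raise
    finally:
        record.duration_seconds = time.perf_counter() - start


# Example usage with made-up asset names.
with tracked_run("build_embeddings", ["raw.documents"], ["vectors.doc_embeddings"]) as run:
    vectors = [[0.1, 0.2], [0.3, 0.4]]
    run.add_test("no_empty_vectors", all(len(v) > 0 for v in vectors))
```

A record like this is the kind of payload a reporting layer could forward to an observability tool; alerting is left out of the sketch because it depends on the receiving system.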
What You’ll See in Elementary Once You Report Through the SDK
When a Python pipeline reports assets, test results, and execution metadata, everything shows up in Elementary unified with your dbt and cloud tests:
- All test results (Python validations, dbt tests, cloud tests) appear together in a single, consistent interface.
- Alerts fire through your existing channels (Slack, PagerDuty, email), ensuring that pipeline-level issues trigger the same operational flow as warehouse-level ones.
- Incidents are created automatically for detected issues, including opening Jira tickets. Elementary’s agentic tools then investigate root cause, assess downstream impact, and guide resolution.
- Lineage becomes fully connected, tying together Python assets, dbt models, warehouse tables, unstructured data, vectors, and ML outputs.
- Every table, view, file, or vector store entity produced by Python becomes discoverable through the Elementary catalog, data discovery agent, and MCP server, giving analysts, data science, and AI teams a shared understanding of the entire data ecosystem.
Next Steps
Setup Guide
Learn how to install and configure the Python SDK
Usage Examples
See how to report assets and test results from your Python pipelines

