- Static Threshold Alerts: Define specific time thresholds that, when exceeded, trigger an alert
- Anomaly Detection Alerts: Use Elementary’s anomaly detection to identify unusual increases in query execution time
Static Threshold Alerts
You can define tests that fail when model execution times exceed predefined thresholds. This approach is straightforward and ideal when you have clear performance requirements.
Implementation Steps
- Create a singular test SQL file in your dbt project (e.g., tests/test_models_run_under_30m.sql) that behaves as follows:
- The test monitors model runs over the past 24 hours
- It fails if any model takes longer than 30 minutes to run (1800 seconds)
- The test is tagged with “model_performance” for easy identification
- Results are ordered by execution time in descending order
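The behavior listed above can be sketched as a singular test along these lines. This is a minimal sketch, not necessarily the exact query Elementary recommends: it assumes the model_run_results view exposes name, execution_time (in seconds), and generated_at columns, and it uses dbt's cross-database macros for the time window, which you may need to replace with your warehouse's native date functions.

```sql
-- tests/test_models_run_under_30m.sql
-- Sketch of a singular dbt test: every row returned counts as a failure.
{{ config(tags=["model_performance"]) }}

select
    name,
    execution_time
from {{ ref('elementary', 'model_run_results') }}
-- Assumption: generated_at may be stored as a string, so it is cast before comparing.
where cast(generated_at as {{ dbt.type_timestamp() }})
        >= {{ dbt.dateadd(datepart="hour", interval=-24, from_date_or_timestamp=dbt.current_timestamp()) }}
  -- Threshold in seconds: 30 minutes = 1800 seconds.
  and execution_time > 30 * 60
order by execution_time desc
```

Because the test is tagged, it can be run on its own with a selector such as `dbt test --select tag:model_performance`.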
Anomaly Detection Alerts
Instead of using fixed thresholds, you can leverage Elementary’s anomaly detection to identify unusual increases in execution time. This approach is more dynamic and can adapt to your evolving data pipeline.
Implementation Steps
- Define a source on the model_run_results view in your schema.yml file (or another YAML file) and attach an anomaly test to it, configured as follows:
- Elementary monitors the execution_time column for anomalies
- Dimensions are set to package_name and name to analyze each model individually
- The test only detects spikes in execution time (anomaly_direction: spike)
- Small changes under 10% are ignored (spike_failure_percent_threshold: 10)
- The severity is set to “warn” but can be adjusted as needed
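A source definition matching the configuration described above might look like the following. Treat it as a hedged sketch: the source name, the schema placeholder, the monitored metric (max), the timestamp column, and the tag are assumptions made for illustration, and the nesting of spike_failure_percent_threshold under ignore_small_changes reflects Elementary's documented test parameters as best understood here, so verify it against the version you have installed.

```yaml
# schema.yml (sketch; names marked below are placeholders or assumptions)
version: 2

sources:
  - name: elementary_models            # hypothetical source name
    schema: your_elementary_schema     # placeholder: schema where Elementary's models are built
    tables:
      - name: model_run_results
        columns:
          - name: execution_time
            tests:
              - elementary.column_anomalies:
                  tags: ["model_performance"]       # assumption: same tag as the static test
                  column_anomalies:
                    - max                           # assumption: the metric to monitor is not stated
                  dimensions: ["package_name", "name"]
                  timestamp_column: generated_at    # assumption
                  anomaly_direction: spike
                  ignore_small_changes:
                    spike_failure_percent_threshold: 10
                  config:
                    severity: warn
```

On recent dbt versions the tests key can also be written as data_tests.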
Choosing the Right Approach
Both methods have their strengths:
- Static Threshold: Simple to implement and understand. Ideal when you have clear performance requirements (e.g., “models must run in under 30 minutes”).
- Anomaly Detection: More adaptive to your specific environment. Better at detecting relative changes in performance rather than absolute thresholds. Useful when normal execution times vary across different models.