freshness_anomalies
elementary.freshness_anomalies
Monitors the freshness of your table over time, measured as the expected time between data updates.
When the test runs, your data is split into time buckets (daily by default, configurable with the time_bucket field), and the maximum freshness value is computed per bucket for the last days_back days (14 by default).
The test then compares the freshness of each bucket within the detection period (the last 2 days by default, controlled by the backfill_days var) to the freshness of the previous time buckets.
If there were any anomalies during the detection period, the test will fail.
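To make the defaults described above concrete, the following sketch writes them out explicitly. The model name `orders` and the column `updated_at` are hypothetical placeholders. The values only restate the defaults mentioned in the text, so leaving them out should behave the same way.

```yaml
models:
  - name: orders                        # hypothetical model name
    tests:
      - elementary.freshness_anomalies:
          timestamp_column: updated_at  # hypothetical column; this field is required
          time_bucket:                  # daily buckets, the stated default
            period: day
            count: 1
          days_back: 14                 # days of buckets considered, the stated default
          backfill_days: 2              # detection period in days, the stated default
```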
Test configuration
Required configuration: timestamp_column
Default configuration: anomaly_direction: spike, to alert only on delays.
tests:
  - elementary.freshness_anomalies:
      timestamp_column: column name
      where_expression: sql expression
      anomaly_sensitivity: int
      days_back: int
      backfill_days: int
      min_training_set_size: int
      time_bucket:
        period: [hour | day | week | month]
        count: int
models:
  - name: < model name >
    tests:
      - elementary.freshness_anomalies:
          timestamp_column: < timestamp column > # Mandatory
          where_expression: < sql expression >
          time_bucket: # Daily by default
            period: < time period >
            count: < number of periods >
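As an illustration of the template filled in, here is a minimal sketch for a table that is expected to update every hour. The model name `events`, the column `loaded_at`, and the filter are hypothetical and should be adapted to your project.

```yaml
models:
  - name: events                        # hypothetical model name
    tests:
      - elementary.freshness_anomalies:
          timestamp_column: loaded_at   # hypothetical column; this field is required
          # Hypothetical filter: only rows matching this SQL expression are monitored
          where_expression: "event_type = 'purchase'"
          time_bucket:                  # hourly buckets instead of the daily default
            period: hour
            count: 1
```

Smaller buckets suit tables that update frequently, since with daily buckets the test computes only one maximum freshness value per day.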