Monitors the freshness of event data over time, measured as the time it takes each event to load: that is, the time between when the event actually occurs (the event timestamp) and when it is loaded to the database (the update timestamp).

This test complements the freshness_anomalies test and is primarily intended for data that is updated in a continuous / streaming fashion.

The test can work in one of two modes:

  • If only an event_timestamp_column is supplied, the test measures over time the difference between the current timestamp (“now”) and the most recent event timestamp.
  • If both an event_timestamp_column and an update_timestamp_column are provided, the test will measure over time the difference between these two columns.
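The two modes above can be illustrated with a minimal Python sketch. This is not Elementary's implementation (the test itself runs as SQL in the warehouse); the function names are hypothetical, and taking the largest delay in the second mode is an assumption made only for illustration.

```python
from datetime import datetime


def lag_from_now(event_timestamps, now):
    """Mode 1: only an event timestamp column is available.

    Returns the seconds between "now" and the most recent event timestamp.
    """
    return (now - max(event_timestamps)).total_seconds()


def lag_between_columns(rows):
    """Mode 2: both event and update timestamp columns are available.

    rows is a list of (event_timestamp, update_timestamp) pairs.
    Assumption: the per-row delays are summarized by their maximum.
    """
    return max((update - event).total_seconds() for event, update in rows)


# Hypothetical sample data for a single time bucket.
events = [datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)]
now = datetime(2024, 1, 1, 11, 0)
rows = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 5)),
    (datetime(2024, 1, 1, 10, 30), datetime(2024, 1, 1, 10, 50)),
]

mode_1_lag = lag_from_now(events, now)   # 1800.0 seconds
mode_2_lag = lag_between_columns(rows)   # 1200.0 seconds
```

In either mode, the resulting value is what gets tracked over time, so a sudden increase indicates that events are arriving later than usual.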

Test configuration

Required configuration: event_timestamp_column

Default configuration: anomaly_direction: spike, to alert only on delays.

tests:
  - elementary.event_freshness_anomalies:
      event_timestamp_column: column name
      update_timestamp_column: column name
      where_expression: sql expression
      anomaly_sensitivity: int
      days_back: int
      backfill_days: int
      min_training_set_size: int
      time_bucket:
        period: [hour | day | week | month]
        count: int
      seasonality: day_of_week

models:
  - name: < model name >
    tests:
      - elementary.event_freshness_anomalies:
          event_timestamp_column: < timestamp column > # Mandatory
          update_timestamp_column: < timestamp column > # Optional
          where_expression: < sql expression >
          time_bucket: # Daily by default
            period: < time period >
            count: < number of periods >