The training_period param only works for tests that have a timestamp_column configuration.
It works differently according to the table materialization:
- Tables and views (non-incremental): the training_period period is calculated on each run.
- Incremental models: the training_period period is calculated on the first test run and on full refresh. Subsequent test runs only calculate the values of the detection_period period.

training_period is also increased automatically in two cases:

- Full time buckets: training_period is increased automatically to ensure full time buckets. For example, if the time_bucket of the test is period: week and a 14-day training_period starts on a Tuesday, the test collects 2 more days back to complete a week (starting on Sunday).
- Seasonality: training_period is increased automatically to ensure there are enough training set values to calculate an anomaly. For example, if the seasonality of the test is day_of_week, training_period is increased to ensure enough Sundays, Mondays, Tuesdays, etc. to calculate an anomaly for each.
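The two automatic adjustments can be illustrated with a small Python sketch. This is not Elementary's implementation: the function names align_to_week_start and samples_per_weekday are hypothetical, and the run date is chosen only so that a 14-day window starts on a Tuesday. It shows why such a window needs 2 more days to start on a Sunday, and why 14 days give only two samples per weekday when seasonality is day_of_week.

```python
from collections import Counter
from datetime import date, timedelta


def align_to_week_start(start: date) -> date:
    """Move `start` back to the most recent Sunday (unchanged if already a Sunday)."""
    # date.weekday() returns Monday=0 ... Sunday=6,
    # so the number of days since the last Sunday is (weekday + 1) % 7.
    days_since_sunday = (start.weekday() + 1) % 7
    return start - timedelta(days=days_since_sunday)


def samples_per_weekday(end: date, days: int) -> Counter:
    """Count how often each weekday (0=Monday ... 6=Sunday) occurs in the `days` days before `end`."""
    return Counter((end - timedelta(days=i)).weekday() for i in range(1, days + 1))


# Illustrative values (assumptions, not taken from the documentation).
run_date = date(2024, 1, 16)                      # a Tuesday
raw_start = run_date - timedelta(days=14)         # 2024-01-02, also a Tuesday
aligned_start = align_to_week_start(raw_start)    # 2023-12-31, a Sunday
extra_days = (raw_start - aligned_start).days     # days added to complete the week

weekday_counts = samples_per_weekday(run_date, 14)

print("raw start:    ", raw_start)
print("aligned start:", aligned_start)
print("extra days:   ", extra_days)
print("samples per weekday:", dict(weekday_counts))
```

With only two values per weekday in a 14-day window, there is little data to build an expected range for each day, which is the motivation for the automatic increase. How far the period is actually extended is not stated here, so the sketch does not model it.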
If you increase training_period, your test training set will be larger. This means a larger sample size for calculating the expected range, which should make the test less sensitive to outliers. The result is a lower chance of false positive anomalies, but also lower sensitivity, since a value must deviate further from the norm to be flagged as an anomaly.
If you decrease training_period, your test training set will be smaller. This means a smaller sample size for calculating the expected range, which might make the test more sensitive to outliers. The result is a higher chance of false positive anomalies, but also higher sensitivity, since a smaller deviation is enough to be flagged as an anomaly.
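The effect of training set size can be sketched in Python. The example assumes, purely for illustration, that the expected range is the mean plus or minus a fixed number of standard deviations. The expected_range helper, the sensitivity value, and the row counts are all hypothetical and are not taken from Elementary. The same single outlier is added to a short and to a long training set to compare how much it distorts each one.

```python
from statistics import mean, stdev


def expected_range(values, sensitivity=3.0):
    """Return (low, high) as mean +/- sensitivity * sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return m - sensitivity * s, m + sensitivity * s


# Hypothetical daily row counts for one typical week.
baseline = [100, 102, 98, 101, 99, 100, 103]
outlier = 160

short_training = baseline + [outlier]       # about one week of history
long_training = baseline * 4 + [outlier]    # about four weeks of history

short_low, short_high = expected_range(short_training)
long_low, long_high = expected_range(long_training)

short_width = short_high - short_low
long_width = long_high - long_low

# How far the outlier pulls the mean away from the typical level.
short_shift = abs(mean(short_training) - mean(baseline))
long_shift = abs(mean(long_training) - mean(baseline))

print(f"short training set: range width {short_width:.1f}, mean shift {short_shift:.1f}")
print(f"long training set:  range width {long_width:.1f}, mean shift {long_shift:.1f}")
```

In this sketch the single outlier moves the mean and widens the range far more in the short training set than in the long one, which matches the claim that a larger training set is less affected by outliers.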