JSON schema
elementary.json_schema
Validates that a string column matches a given JSON schema. The test takes the schema as input; the schema must follow the JSON Schema standard and is written in YAML (see the example below).
This test, along with the relevant JSON schema, can be auto-generated (see details below).
This test relies on our Python tests capability and is currently supported only on Snowflake and BigQuery data warehouses.
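To make the idea of "a string column matches a JSON schema" concrete, here is a minimal sketch in plain Python. It is not Elementary's implementation: the helper names (find_violations, check_column_value) are hypothetical, and it handles only a small subset of JSON Schema keywords (type, properties, required, items). It is also stricter than the standard in places, for example it rejects 1.0 as an integer.

```python
import json

# Maps JSON Schema type names to Python types (small subset, for illustration).
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def find_violations(value, schema, path="$"):
    """Return a list of human-readable violations (empty if the value is valid)."""
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(value, bool) and expected in ("integer", "number")
        if not isinstance(value, _TYPES[expected]) or bool_as_number:
            return [f"{path}: expected {expected}"]

    errors = []
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}: missing required property '{key}'")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(find_violations(value[key], sub_schema, f"{path}.{key}"))
    elif isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(find_violations(item, schema["items"], f"{path}[{i}]"))
    return errors


def check_column_value(raw, schema):
    """Check one string cell: it must parse as JSON and match the schema."""
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        return ["$: not valid JSON"]
    return find_violations(parsed, schema)


# Hypothetical schema for a column holding login events.
SCHEMA = {
    "type": "object",
    "properties": {
        "event_id": {"type": "integer"},
        "event_name": {"type": "string"},
        "event_args": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["event_id", "event_name"],
}

print(check_column_value('{"event_id": 1, "event_name": "login"}', SCHEMA))
print(check_column_value('{"event_id": "1", "event_name": "login"}', SCHEMA))
```

The first call returns an empty list because the row conforms; the second reports that event_id is not an integer. A row counts as failing when its list of violations is non-empty.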
Auto-generate JSON schema tests
Elementary provides the generate_json_schema_test macro to auto-generate the JSON schema for a given column using existing data.
Example usage:
dbt run-operation elementary.generate_json_schema_test --args '{"node_name": "login_events", "column_name": "raw_event_data"}'
This will print:
Please add the following test to your model configuration:
----------------------------------------------------------
columns:
  - name: raw_event_data
    tests:
      - elementary.json_schema:
          type: object
          properties:
            event_id:
              type: integer
            event_name:
              type: string
            event_args:
              type: array
              items:
                type: string
          required:
            - event_id
            - event_name
Note: The generate_json_schema_test macro relies on a third-party Python library called genson. If you are using BigQuery, you will need to pre-install this library in your Dataproc cluster (see dbt's documentation on Python models for more details).
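As a rough illustration of how a schema can be derived from existing data, the sketch below infers one from a few sample rows using only the standard library. It is a hand-rolled simplification, not genson and not the macro's actual code: the infer_schema helper and the sample rows are hypothetical, and it assumes every value found at a given position has the same JSON type.

```python
import json


def infer_schema(samples):
    """Infer a schema from parsed JSON values that share the same JSON type."""
    first = samples[0]
    if isinstance(first, dict):
        # Collect every key seen in any sample, preserving first-seen order.
        keys = []
        for sample in samples:
            for key in sample:
                if key not in keys:
                    keys.append(key)
        schema = {
            "type": "object",
            "properties": {
                key: infer_schema([s[key] for s in samples if key in s])
                for key in keys
            },
        }
        # A property is required only if it appears in every sample.
        required = sorted(k for k in keys if all(k in s for s in samples))
        if required:
            schema["required"] = required
        return schema
    if isinstance(first, list):
        items = [item for sample in samples for item in sample]
        schema = {"type": "array"}
        if items:
            schema["items"] = infer_schema(items)
        return schema
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(first, bool):
        return {"type": "boolean"}
    if isinstance(first, int):
        return {"type": "integer"}
    if isinstance(first, float):
        return {"type": "number"}
    if isinstance(first, str):
        return {"type": "string"}
    return {"type": "null"}


# Hypothetical sample rows from a raw_event_data column.
rows = [
    '{"event_id": 1, "event_name": "login", "event_args": ["web"]}',
    '{"event_id": 2, "event_name": "logout"}',
]
schema = infer_schema([json.loads(row) for row in rows])
print(json.dumps(schema, indent=2))
```

Because event_args is missing from the second row, it appears under properties but not under required, which matches the shape of the generated test shown above.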