elementary.json_schema

Validates that a string column matches a given JSON schema. The test takes the schema as its input, written according to the JSON Schema standard and expressed in YAML (see the example below).

This test, along with the relevant JSON schema, can be auto-generated (see details below).

This test relies on our Python tests capability and is currently supported only for Snowflake and BigQuery data warehouses.
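To make concrete what "matches a given JSON schema" means for a single column value, here is a minimal sketch in plain Python. It is not Elementary's implementation: the function names (find_violations, column_value_violations) are hypothetical, and only a small subset of JSON Schema keywords (type, properties, required, items) is handled.

```python
import json

# Subset of JSON Schema type names mapped to Python types (illustrative only).
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "null": type(None),
}


def find_violations(value, schema, path="$"):
    """Return a list of violation messages; an empty list means the value is valid."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(value, bool) and expected in ("integer", "number")
        if not isinstance(value, _TYPES[expected]) or bool_as_number:
            return [f"{path}: expected {expected}"]
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}: missing required property '{key}'")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(find_violations(value[key], sub_schema, f"{path}.{key}"))
    if isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            errors.extend(find_violations(item, schema["items"], f"{path}[{index}]"))
    return errors


def column_value_violations(raw, schema):
    """Parse one string column value as JSON, then check it against the schema."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return ["$: not valid JSON"]
    return find_violations(parsed, schema)


# The same schema as the YAML example in this page, written as a Python dict.
SCHEMA = {
    "type": "object",
    "properties": {
        "event_id": {"type": "integer"},
        "event_name": {"type": "string"},
        "event_args": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["event_id", "event_name"],
}
```

A row whose value parses as JSON and yields no violations would pass under this sketch; any other row would count as a failure.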

Auto-generate JSON schema tests

Elementary provides the generate_json_schema_test macro to auto-generate the JSON schema for a given column from its existing data.

Example usage:

dbt run-operation elementary.generate_json_schema_test --args '{"node_name": "login_events", "column_name": "raw_event_data"}'

This will print:

Please add the following test to your model configuration:
----------------------------------------------------------

columns:
  - name: raw_event_data
    tests:
      - elementary.json_schema:
          type: object
          properties:
            event_id:
              type: integer
            event_name:
              type: string
            event_args:
              type: array
              items:
                type: string
          required:
            - event_id
            - event_name

Note: The generate_json_schema_test macro relies on a third-party Python library called genson. If you are using BigQuery, you will need to pre-install this library on your Dataproc cluster (see dbt’s documentation on Python models for more details).
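To illustrate the idea of deriving a schema from existing data, the sketch below infers a schema from a handful of sample values using only the Python standard library. It does not use genson and does not reproduce its algorithm or the macro's behavior; infer_schema and its helper are hypothetical names, and the handling of mixed types is a simplifying assumption.

```python
import json


def _type_name(value):
    """Map a parsed JSON value to its JSON Schema type name."""
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return "null"


def infer_schema(samples):
    """Infer a small schema dict from a list of already-parsed JSON values."""
    if not samples:
        return {}
    type_names = sorted({_type_name(sample) for sample in samples})
    if len(type_names) != 1:
        # Simplifying assumption: for mixed types, list them and stop descending.
        return {"type": type_names}
    schema = {"type": type_names[0]}
    if type_names[0] == "object":
        keys = sorted({key for sample in samples for key in sample})
        schema["properties"] = {
            key: infer_schema([sample[key] for sample in samples if key in sample])
            for key in keys
        }
        # A property is required only if every sample contains it.
        required = [key for key in keys if all(key in sample for sample in samples)]
        if required:
            schema["required"] = required
    elif type_names[0] == "array":
        items = [item for sample in samples for item in sample]
        if items:
            schema["items"] = infer_schema(items)
    return schema


# Hypothetical sample values for a raw_event_data column.
rows = [
    '{"event_id": 1, "event_name": "login", "event_args": ["web"]}',
    '{"event_id": 2, "event_name": "logout"}',
]
inferred = infer_schema([json.loads(row) for row in rows])
```

With these two sample rows, event_args appears in only one of them, so it is listed under properties but not under required, which mirrors the shape of the generated test shown above.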