ELEMENTARY CLOUD

The Data Health Dashboard is intended for your data consumers and stakeholders who want a summary of what is happening with the data in your organization.

The dashboard is based on the 6 pillars of data observability:

Accuracy

The degree to which data represents the real-world scenario and conforms to a verifiable source.

Consistency

The degree to which data is the same across all instances of the data.

Completeness

Whether the data is sufficient to support meaningful inferences and decisions.

Freshness

Whether the data is updated according to the expected SLA.

Uniqueness

Whether each record exists in the system only once, with no duplicates.

Validity

Whether the values conform to the data type, format, and range specified by a predefined set of rules.

The purpose of the dashboard is to give a high-level overview that doesn’t require deep technical knowledge or going into specific test results. The dashboard presents data health in a simple way, by giving a health score and using a color code to indicate whether that score is healthy. Filters are available at the top of the page, making it easy to see the data health in different contexts.

Data Health Dashboard

How is the data health score calculated?

Each test you run in either dbt or Elementary is mapped to one of these pillars, and given a score. The scoring method is very simple:

  • If the test passes, the score is 100
  • If the test is in warn status, the score is 50
  • If the test is in fail status, the score is 0

The results are aggregated to give a health score for each pillar. The total score is a weighted average of the 6 pillars, where the weight is configurable. The thresholds for the color coding (green, yellow and red) are also configurable.
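
The calculation described above can be sketched in a few lines of Python. This is only an illustration, not Elementary's implementation: it assumes a pillar's score is the plain average of its test scores, that pillars with no tests are left out of the total, and it uses made-up weights and color thresholds (80 and 50). The function names are hypothetical.

```python
# Per-test scores, as described in the scoring rules above.
STATUS_SCORES = {"pass": 100, "warn": 50, "fail": 0}


def pillar_score(statuses):
    """Score for one pillar: plain average of its test scores (assumption)."""
    return sum(STATUS_SCORES[s] for s in statuses) / len(statuses)


def total_score(pillar_scores, weights):
    """Weighted average over the pillars that have a score."""
    total_weight = sum(weights[p] for p in pillar_scores)
    return sum(pillar_scores[p] * weights[p] for p in pillar_scores) / total_weight


def color(score, green_min=80, yellow_min=50):
    """Map a score to a color. The default thresholds are hypothetical."""
    if score >= green_min:
        return "green"
    if score >= yellow_min:
        return "yellow"
    return "red"


# Hypothetical test results, grouped by the pillar each test is mapped to.
results = {
    "accuracy": ["pass", "warn"],    # (100 + 50) / 2 = 75
    "freshness": ["pass", "fail"],   # (100 + 0) / 2 = 50
    "uniqueness": ["pass", "pass"],  # (100 + 100) / 2 = 100
}
# Hypothetical weights; in the dashboard these are configurable.
weights = {"accuracy": 2, "freshness": 1, "uniqueness": 1}

scores = {pillar: pillar_score(s) for pillar, s in results.items()}
total = total_score(scores, weights)  # (75*2 + 50*1 + 100*1) / 4 = 75.0
print(scores, total, color(total))
```

With these sample inputs, doubling the weight of accuracy pulls the total toward the accuracy score, and the total of 75 falls in the yellow band under the assumed thresholds.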

Score weight and threshold configuration

Can I customize the quality dimension mapping of my tests?

Of course! Each test you run, whether generic or custom, can be mapped to one of the 6 quality dimensions. To do so, add quality_dimension to the test definition in your dbt project.

Next steps

  • Send a daily report of the data health to your stakeholders
  • Compare the data health of different domains
  • Set up alerts for when the data health is below a certain threshold