To connect to your data warehouse, the Elementary Data Lineage CLI requires connection details and credentials. These are provided to the tool in a file named profiles.yml. You can store as many profiles as you need in this file. Typically, you would have one profile for each warehouse you use.
If you use dbt, you don't need to create the file because you already have it. In general, we support the same profiles format as dbt for our supported integrations, with one exception: in BigQuery we require 'location', whereas in dbt it is optional.
How to create 'profiles.yml'?
Choose a directory that is not synced with any version control system.
Create a new file and name it profiles.yml.
Use the template for the relevant integration to create a profile in the file, and save it.
# example profiles.yml file
## SNOWFLAKE ##
## profile name, replace 'my_snowflake_profile' with a name of your choice ##
## User/password auth, other options (Keypair/SSO) require other configs ##
## Schema is used as filter by default, you can use the 'ignore schema'
## option to see cross schema lineage
## OPTIONAL - if you want to build the lineage from queries older than
## the last 7 days, or the query history pulled for the requested date range
## contains 10k or more queries, add this parameter (NOTE: account_usage requires more permissions, see note below).
## BIGQUERY ##
## profile name, replace 'my_bigquery_profile' with a name of your choice ##
## Service account auth, other options require other configs ##
keyfile: [full path to your keyfile]
## Dataset is used as filter by default, you can use the 'ignore schema'
## option to see cross dataset lineage (only in the same location)
## Location is mandatory, can be one of US or EU, or a regional location
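As a rough sketch of how the comments above map onto an actual file, the following assumes the dbt profile layout (`target`, `outputs`, and dbt's standard Snowflake and BigQuery keys), which this tool says it accepts. The output name `default` and the `threads` value are assumptions, every bracketed value is a placeholder to replace, and the optional Snowflake query-history parameter is left out because its key name is not given above. Check the relevant integration documentation before relying on it.

```yaml
# Hypothetical profiles.yml sketch following dbt's profile format.
# Replace every [bracketed] placeholder with your own value.

## SNOWFLAKE ##
my_snowflake_profile:          # replace with a name of your choice
  target: default              # assumed output name
  outputs:
    default:
      type: snowflake
      account: [account id]
      # User/password auth; Keypair/SSO require other configs
      user: [username]
      password: [password]
      role: [user role]
      database: [database name]
      warehouse: [warehouse name]
      # Used as the lineage filter by default
      schema: [schema name]
      threads: 1

## BIGQUERY ##
my_bigquery_profile:           # replace with a name of your choice
  target: default              # assumed output name
  outputs:
    default:
      type: bigquery
      # Service account auth; other methods require other configs
      method: service-account
      project: [project id]
      keyfile: [full path to your keyfile]
      # Used as the lineage filter by default
      dataset: [dataset name]
      # Mandatory for this tool (optional in dbt)
      location: [US, EU, or a regional location]
      threads: 1
```

Keeping one profile per warehouse in a single file, as in this sketch, matches the typical setup described at the top of this page.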
Connection details and permissions
Each warehouse schema you connect to requires a profile with connection details, and the provided credentials need permission to read the query history for this schema. If the credentials do not have access to the entire query history, a partial lineage graph will be generated based on the available queries (which can be exported using the export query history option).
The required connection details and permissions for each data warehouse are detailed in the relevant integration documentation under integrations.
Note: in BigQuery we require 'location', whereas in dbt it is optional.
Anonymous usage tracking
We want to keep building and improving, and for that, we need to understand how users work with Elementary (and data is fun!). To that end, we added anonymous event tracking using PostHog (open-source product analytics, highly recommended).
We only track start, end, platform, number of queries, and the size of the graph. No credentials, query contents, table names, or anything private (not now and not ever).
By default, this completely anonymous tracking is turned on. You can opt out at any time by adding the following to your profiles.yml file:
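A minimal sketch of what the opt-out might look like, assuming a top-level `config` block in the style of dbt's profiles.yml. The key name `anonymous_usage_tracking` is an assumption and should be confirmed against the Elementary documentation:

```yaml
# Assumed opt-out setting; verify the exact key name in the Elementary docs.
config:
  anonymous_usage_tracking: False
```

In this sketch the block sits at the top level of the file, alongside the profiles rather than inside any one of them.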