Metabase
Important Capabilities
| Capability | Status | Notes | 
|---|---|---|
| Platform Instance | ✅ | Enabled by default | 
This plugin extracts charts, dashboards, and their associated metadata. It is in beta and has only been tested against PostgreSQL and H2 databases.
Dashboard
The /api/dashboard endpoint is used to retrieve the following dashboard information:
- Title and description
- Last edited by
- Owner
- Link to the dashboard in Metabase
- Associated charts
Chart
The /api/card endpoint is used to retrieve the following chart information:
- Title and description
- Last edited by
- Owner
- Link to the chart in Metabase
- Datasource and lineage
The following chart properties are ingested into DataHub:
| Name | Description | 
|---|---|
| Dimensions | Column names | 
| Filters | Any filters applied to the chart | 
| Metrics | All columns that are being used for aggregation | 
CLI-based Ingestion
Install the Plugin
pip install 'acryl-datahub[metabase]'
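Once the plugin is installed, ingestion is driven by a YAML recipe. The following is a minimal sketch, not an official starter recipe. It assumes a Metabase instance reachable at http://localhost:3000 (the `http://` scheme is an assumption; the documented default is `localhost:3000`) and a DataHub REST sink at http://localhost:8080. The credentials and the sink address are placeholders.

```yaml
source:
  type: metabase
  config:
    # Metabase host URL; scheme and host are assumptions for a local setup
    connect_uri: "http://localhost:3000"
    # Placeholder credentials; replace with a real Metabase account
    username: metabase_user
    password: example_password

sink:
  # Assumed sink: a DataHub REST endpoint running locally
  type: datahub-rest
  config:
    server: "http://localhost:8080"
```

If the recipe is saved as `metabase_recipe.yaml` (a hypothetical file name), it can be run with `datahub ingest -c metabase_recipe.yaml`.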
Config Details
Note that a . is used to denote nested fields in the YAML recipe.
| Field | Type | Description |
|---|---|---|
| connect_uri | string | Metabase host URL. Default: localhost:3000 |
| database_alias_map | object | Database name map to use when constructing dataset URNs. |
| default_schema | string | Default schema name to use when a schema is not provided in an SQL query. Default: public |
| engine_platform_map | map(str,string) | Custom mappings between Metabase database engines and DataHub platforms. |
| password | string(password) | Metabase password. |
| platform_instance_map | map(str,string) | A holder for platform -> platform_instance mappings, used to generate correct dataset URNs. |
| username | string | Metabase username. |
| env | string | The environment that all assets produced by this connector belong to. Default: PROD |
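As an illustration of how the optional fields fit into a recipe, the sketch below sets the scalar options explicitly and writes a map-typed option as a nested YAML mapping. The instance name `analytics_cluster` and the credentials are hypothetical placeholders; `env` and `default_schema` are shown with their documented defaults.

```yaml
source:
  type: metabase
  config:
    connect_uri: "http://localhost:3000"   # assumed local Metabase
    username: metabase_user                # placeholder
    password: example_password             # placeholder
    env: PROD                              # documented default
    default_schema: public                 # documented default
    # platform -> platform_instance; the instance name is hypothetical
    platform_instance_map:
      postgres: analytics_cluster
```

Because the schema sets `additionalProperties` to false, a misspelled or unknown key in `config` should be expected to fail validation rather than be silently ignored.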
The JSONSchema for this configuration is inlined below.
{
  "title": "MetabaseConfig",
  "description": "Any non-Dataset source that produces lineage to Datasets should inherit this class.\ne.g. Orchestrators, Pipelines, BI Tools etc.",
  "type": "object",
  "properties": {
    "env": {
      "title": "Env",
      "description": "The environment that all assets produced by this connector belong to",
      "default": "PROD",
      "type": "string"
    },
    "platform_instance_map": {
      "title": "Platform Instance Map",
      "description": "A holder for platform -> platform_instance mappings to generate correct dataset urns",
      "type": "object",
      "additionalProperties": {
        "type": "string"
      }
    },
    "connect_uri": {
      "title": "Connect Uri",
      "description": "Metabase host URL.",
      "default": "localhost:3000",
      "type": "string"
    },
    "username": {
      "title": "Username",
      "description": "Metabase username.",
      "type": "string"
    },
    "password": {
      "title": "Password",
      "description": "Metabase password.",
      "type": "string",
      "writeOnly": true,
      "format": "password"
    },
    "database_alias_map": {
      "title": "Database Alias Map",
      "description": "Database name map to use when constructing dataset URN.",
      "type": "object"
    },
    "engine_platform_map": {
      "title": "Engine Platform Map",
      "description": "Custom mappings between metabase database engines and DataHub platforms",
      "type": "object",
      "additionalProperties": {
        "type": "string"
      }
    },
    "default_schema": {
      "title": "Default Schema",
      "description": "Default schema name to use when schema is not provided in an SQL query",
      "default": "public",
      "type": "string"
    }
  },
  "additionalProperties": false
}
Metabase databases are mapped to a DataHub platform based on the engine listed in the api/database response. This mapping can be customized with the engine_platform_map config option. For example, to map databases that use the athena engine to the underlying datasets in the glue platform, use the following snippet:
  engine_platform_map:
    athena: glue
DataHub tries to determine the database name from the Metabase api/database payload. The name can be overridden through database_alias_map for a given database connected to Metabase.
Compatibility
Metabase version v0.41.2
Code Coordinates
- Class Name: datahub.ingestion.source.metabase.MetabaseSource
- Browse on GitHub
Questions
If you've got any questions on configuring ingestion for Metabase, feel free to ping us on our Slack.