Use a registered community connector

Important

This feature is in Beta. Workspace admins can control access to this feature from the Previews page. See Manage Azure Databricks previews.

This page shows how to use a registered community connector to ingest data from a supported source into Azure Databricks. To create a custom connector for a source that isn't supported yet, see Create a custom connector.

Requirements

  • An Azure Databricks workspace with Unity Catalog enabled
  • A connection for the source you want to ingest, or permissions to create a connection
  • Write access to a catalog and schema for the ingested tables

Create an ingestion pipeline

To use a registered community connector:

  1. In the sidebar of your Azure Databricks workspace, click + New > Add or upload data, then select the source under Community connectors.

  2. Click + Create connection or select an existing connection, then click Next.

  3. For Pipeline name, enter a name for the pipeline.

  4. For Event log location, enter a catalog name and a schema name. Azure Databricks stores the pipeline event log here. Ingested tables are also written here by default.

  5. For Root path, enter your workspace path (for example, /Workspace/Users/<your-email>/connectors). Azure Databricks clones and stores the connector source code here.

  6. Click Create pipeline.

  7. In the pipeline editor, open ingest.py and update the objects field to include the tables you want to ingest. For example:

    from databricks.labs.community_connector.pipeline import ingest
    
    pipeline_spec = {
        "connection_name": "my_stripe_connection",  # Required: UC connection name
        "objects": [
            {"table": {"source_table": "charges"}},  # Written as "charges" by default
            {"table": {"source_table": "customers",
                       "destination_table": "stripe_customers"}},  # Custom destination name
        ],
    }
    
    ingest(spark, pipeline_spec)
    
  8. Run the pipeline manually or schedule it.

Pipeline configuration options

You can configure the following options in ingest.py:

  • connection_name: Required. The name of the connection that stores authentication credentials for the source.
  • objects: Required. A list of tables to ingest. Each entry has the format {"table": {"source_table": "..."}}. You can also specify an optional destination_table inside the table object.
  • destination_catalog: The catalog where ingested tables are written. Defaults to the catalog set during pipeline creation.
  • destination_schema: The schema where ingested tables are written. Defaults to the schema set during pipeline creation.
  • scd_type: The slowly changing dimension strategy: SCD_TYPE_1, SCD_TYPE_2, or APPEND_ONLY. Defaults to SCD_TYPE_1.
  • primary_keys: Override the default primary keys for a table. Provide a list of column names.
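The options above can be combined in a single pipeline_spec. The sketch below is illustrative: the connection, catalog, schema, and table names are placeholders, and because this page doesn't specify whether scd_type and primary_keys go at the top level or inside each table object, this example places scd_type at the top level and primary_keys per table, following the per-table wording of the primary_keys option.

```python
# A fuller pipeline_spec sketch exercising the optional fields.
# All names here (connection, catalog, schema, tables, columns) are
# illustrative placeholders, not values your workspace will have.
pipeline_spec = {
    "connection_name": "my_stripe_connection",  # Required: UC connection name
    "destination_catalog": "finance",           # Optional: overrides the pipeline default
    "destination_schema": "stripe_raw",         # Optional: overrides the pipeline default
    "scd_type": "SCD_TYPE_2",                   # Optional: keep full change history
    "objects": [
        # primary_keys overrides the connector's default keys for this table
        {"table": {"source_table": "invoices",
                   "destination_table": "stripe_invoices",
                   "primary_keys": ["id"]}},
        {"table": {"source_table": "charges"}},
    ],
}

# In ingest.py you would then call, as in the basic example above:
# ingest(spark, pipeline_spec)
```

As in the step-by-step example, the spec is just a Python dictionary, so you can build or validate it programmatically before passing it to ingest().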