Export Langfuse traces to Azure Databricks MLflow

Configure Langfuse to send OpenTelemetry-based trace spans to the Azure Databricks MLflow OTLP endpoint, so that traces are stored in Unity Catalog tables alongside your other MLflow traces.

Consolidating traces on Azure Databricks provides the following benefits:

  • Query and compare Langfuse-instrumented calls together with traces from other frameworks in a single place.
  • Use Azure Databricks SQL to analyze trace data at scale.
  • Apply Unity Catalog governance such as access controls and lineage to all your traces.

Requirements

Step 1: Install packages

Install the required packages in your Azure Databricks notebook:

%pip install "langfuse>=3.14.5" "mlflow[databricks]>=3.10.0" opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp-proto-http
%restart_python

Step 2: Disable Langfuse trace collection

Langfuse requires the LANGFUSE_HOST, LANGFUSE_PUBLIC_KEY, and LANGFUSE_SECRET_KEY environment variables to initialize its SDK. Set these variables to dummy values so that no traces are sent to a Langfuse server. Only the OTLP exporter configured in Step 5 receives the spans.

import os

os.environ["LANGFUSE_HOST"] = "localhost"
os.environ["LANGFUSE_PUBLIC_KEY"] = ""
os.environ["LANGFUSE_SECRET_KEY"] = ""

Note

Using dummy values is the simplest way to prevent Langfuse from sending traces to its own servers. Alternatively, you can set LANGFUSE_TRACING_ENABLED=False to disable Langfuse's built-in trace collection.
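A minimal sketch of the alternative, assuming your Langfuse version honors the LANGFUSE_TRACING_ENABLED flag; it must be set before the Langfuse client is initialized:

```python
import os

# Must be set before get_client() initializes the Langfuse SDK.
# Assumes a Langfuse version that honors LANGFUSE_TRACING_ENABLED.
os.environ["LANGFUSE_TRACING_ENABLED"] = "False"
```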

Step 3: Configure Azure Databricks connection

Retrieve the workspace host URL and an API token. In an Azure Databricks notebook, you can get these from the notebook context:

# If running outside a Databricks notebook, set DATABRICKS_HOST and DATABRICKS_TOKEN environment variables manually.
context = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
DATABRICKS_HOST = context.apiUrl().get().rstrip("/")
DATABRICKS_TOKEN = context.apiToken().get()
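Outside a notebook, where dbutils is unavailable, the same values can come from environment variables. A minimal sketch, assuming DATABRICKS_HOST and DATABRICKS_TOKEN are exported before the script runs:

```python
import os

# Fallback for scripts and jobs: read credentials from the environment.
# Assumes DATABRICKS_HOST (e.g. "https://adb-<id>.azuredatabricks.net")
# and DATABRICKS_TOKEN are set; empty strings signal missing config.
DATABRICKS_HOST = os.environ.get("DATABRICKS_HOST", "").rstrip("/")
DATABRICKS_TOKEN = os.environ.get("DATABRICKS_TOKEN", "")
```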

Step 4: Set the trace destination

Define your Unity Catalog catalog and schema, then link them to an MLflow experiment using set_experiment_trace_location. This tells Azure Databricks where to store the incoming traces.

import mlflow
from mlflow.entities import UCSchemaLocation
from mlflow.tracing.enablement import set_experiment_trace_location

CATALOG_NAME = "<UC_CATALOG_NAME>"
SCHEMA_NAME = "<UC_SCHEMA_NAME>"
EXPERIMENT_ID = "<EXPERIMENT_ID>"

trace_location = set_experiment_trace_location(
    location=UCSchemaLocation(catalog_name=CATALOG_NAME, schema_name=SCHEMA_NAME),
    experiment_id=EXPERIMENT_ID,
)

Step 5: Add the Azure Databricks OTLP exporter

Get the TracerProvider that Langfuse uses, then attach a BatchSpanProcessor with an OTLPSpanExporter that points to the Azure Databricks OTLP endpoint.

from langfuse import get_client
from opentelemetry import trace as otel_trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Initialize the Langfuse client. Langfuse registers its own TracerProvider as the global OpenTelemetry TracerProvider.
langfuse = get_client()

# Retrieve the global TracerProvider.
# Use this to attach an additional span processor that forwards Langfuse spans to Databricks.
provider = otel_trace.get_tracer_provider()

# Configure the OTLP exporter to send spans to the Databricks MLflow endpoint.
# The X-Databricks-UC-Table-Name header routes incoming spans to the Unity Catalog table defined by the trace location.
databricks_exporter = OTLPSpanExporter(
    endpoint=f"{DATABRICKS_HOST}/api/2.0/otel/v1/traces",
    headers={
        "content-type": "application/x-protobuf",
        "Authorization": f"Bearer {DATABRICKS_TOKEN}",
        "X-Databricks-UC-Table-Name": trace_location.full_otel_spans_table_name,
    },
)

# Attach the Databricks exporter to Langfuse's provider. Because the Langfuse
# env vars are set to dummy values, this processor is the only active exporter,
# so all spans are sent exclusively to Databricks.
provider.add_span_processor(BatchSpanProcessor(databricks_exporter))

Step 6: Run a traced function

Use Langfuse's @observe() decorator to instrument your functions. The decorator creates OTEL spans that the exporter sends to Azure Databricks.

from langfuse import observe

@observe()
def my_llm_call(prompt: str) -> str:
    # Replace with your LLM logic (OpenAI, Anthropic, etc.)
    return f"Response to: {prompt}"

@observe()
def my_pipeline(user_input: str) -> str:
    result = my_llm_call(user_input)
    return result

my_pipeline("Hello, world!")

Step 7: View traces

Open the MLflow experiment in your Azure Databricks workspace and click the Traces tab. The trace from the my_pipeline call should appear.

Next steps