Query Unity AI Gateway endpoints

Important

This feature is in Beta. Account admins can control access to this feature from the Previews page of the account console. See Manage Azure Databricks previews.

This page describes how to query Unity AI Gateway endpoints using the supported APIs.

Requirements

The Unity AI Gateway preview is enabled for your account. See Manage Azure Databricks previews.
An Azure Databricks workspace in a region where Unity AI Gateway is supported.
Unity Catalog enabled for your workspace. See Enable a workspace for Unity Catalog.

Supported APIs and integrations

Unity AI Gateway supports the following APIs and integrations:

Unified APIs: OpenAI-compatible interfaces for querying models on Azure Databricks. Switch between models from different providers without changing how you query each model.
Native APIs: Provider-specific interfaces for accessing the latest models and provider-specific features.
Coding agents: Integrate your coding agents with Unity AI Gateway to add centralized governance and monitoring to AI-assisted development workflows. See Integrate with coding agents.
Agents on Databricks Apps: Create AI agents that route LLM traffic through Unity AI Gateway and deploy them to Databricks Apps. See "Step 4. Govern LLM usage from agents on Databricks Apps with Unity AI Gateway".
ai_query: Use ai_query from SQL or Python to query Azure Databricks-provided Unity AI Gateway endpoints for batch inference. See "Query endpoints using ai_query".

Query endpoints using `ai_query`

You can use the ai_query function to query Azure Databricks-provided Unity AI Gateway endpoints directly from SQL or Python. Doing so captures usage tracking information for batch inference workloads.

Note

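ai_query support for Unity AI Gateway is available only for Azure Databricks-provided endpoints (for example, databricks-gpt-5-4 and databricks-claude-sonnet-4). Endpoints that you create in Unity AI Gateway are not yet supported.
Only usage tracking applies to ai_query batch inference workloads. Other Unity AI Gateway features, such as rate limits, guardrails, inference tables, and fallbacks, do not apply.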

To get started:

Enable the Unity AI Gateway preview for your account. See Manage Azure Databricks previews.
Use ai_query to query an Azure Databricks-provided endpoint:

SELECT ai_query(
  'databricks-gpt-5-4',
  'Summarize the following text: ' || text_column
) AS summary
FROM my_table
LIMIT 10

Requests made through ai_query to Azure Databricks-provided endpoints are captured in the usage tracking system table (system.ai_gateway.usage). These requests also appear in the built-in usage dashboard.

For the full ai_query syntax and parameter reference, see the ai_query function. For best practices and supported models, see "Use ai_query".
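Because ai_query can also be called from Python, the SQL example above can be issued from a notebook. The following is a minimal sketch, not a definitive implementation: `build_ai_query_sql` is a hypothetical helper introduced here for illustration, and the commented lines assume a Databricks notebook where a SparkSession named `spark` is already defined.

```python
def build_ai_query_sql(endpoint: str, table: str, text_column: str, limit: int = 10) -> str:
    """Build the SQL statement from the example above for a given endpoint and table.

    Hypothetical helper: names are interpolated directly into the statement,
    so pass only trusted endpoint, table, and column names.
    """
    return (
        "SELECT ai_query(\n"
        f"  '{endpoint}',\n"
        f"  'Summarize the following text: ' || {text_column}\n"
        ") AS summary\n"
        f"FROM {table}\n"
        f"LIMIT {limit}"
    )


sql = build_ai_query_sql("databricks-gpt-5-4", "my_table", "text_column")
print(sql)

# Assumption: running inside a Databricks notebook with a predefined `spark` session.
# summaries = spark.sql(sql)
# summaries.show(truncate=False)
```

Keeping the statement in a helper makes it easy to point the same batch job at a different Azure Databricks-provided endpoint.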

Query endpoints using the unified APIs

The unified APIs provide OpenAI-compatible interfaces for querying models on Azure Databricks. With the unified APIs, you can switch between models from different providers without changing your code.
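To illustrate what "without changing your code" means in practice, the sketch below builds the same chat request body for two endpoints. It is illustrative only: `build_chat_request` and both endpoint names are hypothetical placeholders, and nothing is sent over the network.

```python
import json


def build_chat_request(endpoint: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat completions body; only `model` depends on the endpoint."""
    return {
        "model": endpoint,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }


# Hypothetical endpoint names backed by different providers.
endpoints = ["<endpoint-for-provider-a>", "<endpoint-for-provider-b>"]

payloads = [build_chat_request(name, "What is Databricks?") for name in endpoints]
for payload in payloads:
    print(json.dumps(payload))
```

Switching providers is then a matter of changing the endpoint name passed as `model`; the rest of the request stays the same.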

MLflow Chat Completions API

Python

from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/mlflow/v1"
)

chat_completion = client.chat.completions.create(
  messages=[
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I assist you today?"},
    {"role": "user", "content": "What is Databricks?"},
  ],
  model="<ai-gateway-endpoint>",
  max_tokens=256
)

print(chat_completion.choices[0].message.content)

REST API

curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "Hello!"},
      {"role": "assistant", "content": "Hello! How can I assist you today?"},
      {"role": "user", "content": "What is Databricks?"}
    ]
  }' \
  https://<workspace-url>/ai-gateway/mlflow/v1/chat/completions

Replace <workspace-url> with your Azure Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.
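If you prefer not to depend on the OpenAI SDK, the same chat completions request can be assembled with the Python standard library. This is a sketch under stated assumptions: `build_chat_completions_request` is a hypothetical helper, the workspace URL is a made-up example, and the token is sent as a bearer token, which is how the OpenAI client transmits its API key. The request is only built here; the commented lines show how it could be sent.

```python
import json
import os
import urllib.request


def build_chat_completions_request(workspace_url: str, endpoint: str, token: str, prompt: str):
    """Build (but do not send) a POST request to the MLflow chat completions path."""
    url = f"{workspace_url.rstrip('/')}/ai-gateway/mlflow/v1/chat/completions"
    body = {
        "model": endpoint,
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical workspace URL; the token falls back to a placeholder when the variable is unset.
token = os.environ.get("DATABRICKS_TOKEN", "<token>")
request = build_chat_completions_request(
    "https://example.azuredatabricks.net", "<ai-gateway-endpoint>", token, "What is Databricks?"
)
print(request.get_method(), request.full_url)

# To send the request against a real workspace:
# with urllib.request.urlopen(request) as response:
#     reply = json.load(response)
#     print(reply["choices"][0]["message"]["content"])
```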

MLflow Embeddings API

Python

from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/mlflow/v1"
)

embeddings = client.embeddings.create(
  input="What is Databricks?",
  model="<ai-gateway-endpoint>"
)

print(embeddings.data[0].embedding)

REST API

curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "input": "What is Databricks?"
  }' \
  https://<workspace-url>/ai-gateway/mlflow/v1/embeddings

Replace <workspace-url> with your Azure Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Supervisor API

The Supervisor API (/mlflow/v1/responses) is an OpenResponses-compatible, provider-agnostic API for building agents. It is currently in Beta. Account admins can enable access from the Previews page. See Manage Azure Databricks previews. Use it to choose the best model for your agent use case across providers without changing your code.

Python

from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/mlflow/v1"
)

response = client.responses.create(
  model="<ai-gateway-endpoint>",
  input=[{"role": "user", "content": "What is Databricks?"}]
)

print(response.output_text)

REST API

curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "input": [
      {"role": "user", "content": "What is Databricks?"}
    ]
  }' \
  https://<workspace-url>/ai-gateway/mlflow/v1/responses

Replace <workspace-url> with your Azure Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Query endpoints using the native APIs

The native APIs provide provider-specific interfaces for querying models on Azure Databricks. Use the native APIs to access the latest provider-specific features.

OpenAI Responses API

Python

from openai import OpenAI
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/openai/v1"
)

response = client.responses.create(
  model="<ai-gateway-endpoint>",
  max_output_tokens=256,
  input=[
    {
      "role": "user",
      "content": [{"type": "input_text", "text": "Hello!"}]
    },
    {
      "role": "assistant",
      "content": [{"type": "output_text", "text": "Hello! How can I assist you today?"}]
    },
    {
      "role": "user",
      "content": [{"type": "input_text", "text": "What is Databricks?"}]
    }
  ]
)

print(response.output)

REST API

curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "max_output_tokens": 256,
    "input": [
      {
        "role": "user",
        "content": [{"type": "input_text", "text": "Hello!"}]
      },
      {
        "role": "assistant",
        "content": [{"type": "output_text", "text": "Hello! How can I assist you today?"}]
      },
      {
        "role": "user",
        "content": [{"type": "input_text", "text": "What is Databricks?"}]
      }
    ]
  }' \
  https://<workspace-url>/ai-gateway/openai/v1/responses

Replace <workspace-url> with your Azure Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Anthropic Messages API

Python

import anthropic
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = anthropic.Anthropic(
  api_key="unused",
  base_url="https://<workspace-url>/ai-gateway/anthropic",
  default_headers={
    "Authorization": f"Bearer {DATABRICKS_TOKEN}",
  },
)

message = client.messages.create(
  model="<ai-gateway-endpoint>",
  max_tokens=256,
  messages=[
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I assist you today?"},
    {"role": "user", "content": "What is Databricks?"},
  ],
)

print(message.content[0].text)

REST API

curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "Hello!"},
      {"role": "assistant", "content": "Hello! How can I assist you today?"},
      {"role": "user", "content": "What is Databricks?"}
    ]
  }' \
  https://<workspace-url>/ai-gateway/anthropic/v1/messages

Replace <workspace-url> with your Azure Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Google Gemini API

Python

from google import genai
from google.genai import types
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = genai.Client(
  api_key="databricks",
  http_options=types.HttpOptions(
    base_url="https://<workspace-url>/ai-gateway/gemini",
    headers={
      "Authorization": f"Bearer {DATABRICKS_TOKEN}",
    },
  ),
)

response = client.models.generate_content(
  model="<ai-gateway-endpoint>",
  contents=[
    types.Content(
      role="user",
      parts=[types.Part(text="Hello!")],
    ),
    types.Content(
      role="model",
      parts=[types.Part(text="Hello! How can I assist you today?")],
    ),
    types.Content(
      role="user",
      parts=[types.Part(text="What is Databricks?")],
    ),
  ],
  config=types.GenerateContentConfig(
    max_output_tokens=256,
  ),
)

print(response.text)

REST API

curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "Hello!"}]
      },
      {
        "role": "model",
        "parts": [{"text": "Hello! How can I assist you today?"}]
      },
      {
        "role": "user",
        "parts": [{"text": "What is Databricks?"}]
      }
    ],
    "generationConfig": {
      "maxOutputTokens": 256
    }
  }' \
  https://<workspace-url>/ai-gateway/gemini/v1beta/models/<ai-gateway-endpoint>:generateContent

Replace <workspace-url> with your Azure Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.
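The client examples on this page differ mainly in the base URL passed to each SDK. One way to keep those paths in a single place is sketched below. `gateway_base_url` and `API_BASE_PATHS` are hypothetical names introduced for illustration, the paths are copied from the Python examples on this page, and the workspace host is a made-up example.

```python
# Base paths as they appear in the SDK examples on this page.
API_BASE_PATHS = {
    "mlflow": "/ai-gateway/mlflow/v1",    # unified APIs (OpenAI SDK)
    "openai": "/ai-gateway/openai/v1",    # OpenAI Responses API
    "anthropic": "/ai-gateway/anthropic",  # Anthropic Messages API
    "gemini": "/ai-gateway/gemini",        # Google Gemini API
}


def gateway_base_url(workspace_url: str, api: str) -> str:
    """Return the SDK base URL for one API family on a given workspace."""
    if api not in API_BASE_PATHS:
        raise ValueError(f"unknown API family: {api!r}")
    host = workspace_url.strip().rstrip("/")
    if "://" not in host:
        # Assume HTTPS when the caller passes a bare host name.
        host = f"https://{host}"
    return host + API_BASE_PATHS[api]


# Hypothetical workspace host.
for api in API_BASE_PATHS:
    print(api, gateway_base_url("example.azuredatabricks.net", api))
```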

Tag requests for usage tracking

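You can attach custom key-value tags to individual requests using the Databricks-Ai-Gateway-Request-Tags HTTP header. Request tags are recorded in the request_tags column of both the usage tracking system table and inference tables, so you can track costs, attribute usage, and filter analytics by project, team, environment, or any other dimension.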

The header value must be a JSON object that maps string keys to string values. For example:

{ "project": "chatbot", "team": "ml-platform", "environment": "production" }
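Because the header value must map string keys to string values, it can help to validate tags before serializing them. The following is a minimal sketch: `request_tag_headers` is a hypothetical helper, not part of any SDK, and it only builds the header dictionary.

```python
import json

TAGS_HEADER = "Databricks-Ai-Gateway-Request-Tags"


def request_tag_headers(tags: dict) -> dict:
    """Validate that tags map strings to strings and return the header as a dict."""
    for key, value in tags.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError(f"tag {key!r} must map a string key to a string value")
    return {TAGS_HEADER: json.dumps(tags)}


headers = request_tag_headers(
    {"project": "chatbot", "team": "ml-platform", "environment": "production"}
)
print(headers)
```

The returned dictionary can be passed wherever a client accepts extra HTTP headers. A non-string value such as `{"retries": 3}` raises `TypeError` instead of producing a header the gateway might not accept.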

Attach tags to a request with the extra_headers parameter or the client's default_headers (Python), or by passing the header directly (REST API).

Python (OpenAI SDK)

from openai import OpenAI
import json
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="https://<workspace-url>/ai-gateway/mlflow/v1"
)

request_tags = {"project": "chatbot", "team": "ml-platform"}

chat_completion = client.chat.completions.create(
  messages=[
    {"role": "user", "content": "What is Databricks?"},
  ],
  model="<ai-gateway-endpoint>",
  max_tokens=256,
  extra_headers={
    "Databricks-Ai-Gateway-Request-Tags": json.dumps(request_tags)
  }
)

Python (Anthropic SDK)

import anthropic
import json
import os

DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

request_tags = {"project": "chatbot", "team": "ml-platform"}

client = anthropic.Anthropic(
  api_key="unused",
  base_url="https://<workspace-url>/ai-gateway/anthropic",
  default_headers={
    "Authorization": f"Bearer {DATABRICKS_TOKEN}",
    "Databricks-Ai-Gateway-Request-Tags": json.dumps(request_tags),
  },
)

message = client.messages.create(
  model="<ai-gateway-endpoint>",
  max_tokens=256,
  messages=[
    {"role": "user", "content": "What is Databricks?"},
  ],
)

REST API

curl \
  -u token:$DATABRICKS_TOKEN \
  -X POST \
  -H "Content-Type: application/json" \
  -H 'Databricks-Ai-Gateway-Request-Tags: {"project": "chatbot", "team": "ml-platform"}' \
  -d '{
    "model": "<ai-gateway-endpoint>",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "What is Databricks?"}
    ]
  }' \
  https://<workspace-url>/ai-gateway/mlflow/v1/chat/completions

Replace <workspace-url> with your Azure Databricks workspace URL and <ai-gateway-endpoint> with your Unity AI Gateway endpoint name.

Next steps

Supervisor API (Beta): Run multi-turn agent workflows with hosted tools through /mlflow/v1/responses.

Step 4. Govern LLM usage from agents on Databricks Apps with Unity AI Gateway: Route LLM calls from agents on Databricks Apps through Unity AI Gateway.
Monitor usage for Unity AI Gateway endpoints
Monitor models using inference tables
Configure rate limits for Unity AI Gateway endpoints

Last updated on 2026-06-01
