Azure Content Understanding in Foundry Tools uses your Foundry model deployments for all operations that require a generative AI model. This approach helps you maximize provisioned capacity and consolidate capacity into fewer deployments, if needed. You can also choose the model that best fits your scenario for price and latency.
You're billed for all tokens (input and output) processed by the connected deployment, and Content Understanding bills you only for Content Understanding-specific meters. See the pricing explainer to learn more about the billing model.
The service requires a chat completion model and an embedding model, and it supports a few different options for each.
Supported models
The service is periodically updated to add support for more models. The currently supported models are listed in Service limits - Supported generative models. Refer to the Model retirement schedule to track each Foundry model's lifecycle stage and retirement date.
Note
GPT-5.2 is now supported across all Content Understanding analyzers. Support for additional models will be added in future updates.
Check supported models per analyzer
Different analyzers support different sets of models. To check which models a specific analyzer supports, use the GET analyzers API:
GET /contentunderstanding/analyzers/{analyzerId}?api-version=2025-11-01
The response includes a supportedModels object that lists the valid completion and embedding models for that analyzer:
{
  "analyzerId": "prebuilt-invoice",
  // ...
  "supportedModels": {
    "completion": [
      "gpt-4.1",
      "gpt-5.2"
    ],
    "embedding": [
      "text-embedding-3-large"
    ]
  },
  "models": {
    "completion": "prebuilt-analyzer-completion",
    "embedding": "prebuilt-analyzer-embedding"
  }
}
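For instance, the following Python sketch retrieves an analyzer and prints the models it supports. The endpoint and key values are placeholders for your own Foundry resource, and key-based auth via the `Ocp-Apim-Subscription-Key` header is an assumption; Microsoft Entra ID tokens are another option.

```python
import requests

# Placeholders: substitute your own Foundry resource endpoint and key.
ENDPOINT = "https://<your-resource>.services.ai.azure.com"
API_KEY = "<your-api-key>"

resp = requests.get(
    f"{ENDPOINT}/contentunderstanding/analyzers/prebuilt-invoice",
    params={"api-version": "2025-11-01"},
    # Key-based auth header is an assumption; your setup might use Entra ID tokens.
    headers={"Ocp-Apim-Subscription-Key": API_KEY},
)
resp.raise_for_status()

analyzer = resp.json()
print("Completion models:", analyzer["supportedModels"]["completion"])
print("Embedding models:", analyzer["supportedModels"]["embedding"])
```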
Model selection for prebuilt analyzers
Prebuilt analyzers use model aliases instead of direct model names in their models section. This allows the service to support model upgrades without changing analyzer definitions.
Prebuilt analyzers reference the following model aliases:
| Model alias | Used by |
|---|---|
| `prebuilt-analyzer-completion` | Default for most prebuilt analyzers |
| `prebuilt-analyzer-completion-mini` | Default for selected prebuilt analyzers, for example `prebuilt-*Search` |
| `prebuilt-analyzer-embedding` | Prebuilt analyzers that require embeddings |
You map these aliases to your actual deployments in the modelDeployments configuration (see Set default deployments).
How model selection works
When you create a custom analyzer, you can specify which chat completion model and embedding model it uses.
{
  "analyzerId": "myInvoice",
  "models": {
    // Specify the completion and embedding models for this custom analyzer by referencing the model aliases
    "completion": "prebuilt-analyzer-completion",
    "embedding": "prebuilt-analyzer-embedding"
  },
  "config": {
  }
  // ...rest of the complete analyzer definition
}
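As a minimal sketch of putting this together, the request below creates the analyzer with a PUT to the same analyzers path used earlier. The endpoint, key, and auth header are placeholder assumptions, and the rest of the analyzer definition is elided.

```python
import requests

ENDPOINT = "https://<your-resource>.services.ai.azure.com"  # placeholder
API_KEY = "<your-api-key>"  # placeholder; key-based auth assumed

analyzer_definition = {
    "models": {
        # Reference the model aliases; deployments are resolved at analyze time.
        "completion": "prebuilt-analyzer-completion",
        "embedding": "prebuilt-analyzer-embedding",
    },
    "config": {},
    # ...rest of the analyzer definition goes here
}

resp = requests.put(
    f"{ENDPOINT}/contentunderstanding/analyzers/myInvoice",
    params={"api-version": "2025-11-01"},
    headers={"Ocp-Apim-Subscription-Key": API_KEY},
    json=analyzer_definition,
)
resp.raise_for_status()
print(resp.status_code)  # creation may complete asynchronously
```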
Tip
GPT-5.2 is a recommended model for use with Foundry and Content Understanding Studio, but you can use any supported chat completion model that fits your quality, latency, and cost goals. Embedding models come into play when you provide labeled samples or use in-context learning to improve analyzer quality.
Two ways to provide model deployments
To set model deployments, you have two options:
- Option 1: Set default model deployments at the resource level.
- Option 2: Pass model deployment pointers in every analyze request.
If you set resource defaults, you can still override those defaults for a single request by including modelDeployments in that request.
Option 1: Set default deployments at the resource level
You can directly connect Content Understanding to your model deployment when you call analyze via the API. However, to simplify management across a set of different analyzers, you can centrally manage default models for all analyzers under a given Foundry resource. To do so, configure defaults in Content Understanding Studio:
For the full onboarding flow, see Quickstart: Try out Content Understanding Studio.
1. Open Content Understanding Studio.
2. Select the Settings gear icon in the upper-right corner.
3. Select Add resource to open the Add new connected resource dialog.
4. To connect a resource, select the subscription, resource group, and Foundry resource in the dialog.
5. Optional: Select Enable auto-deployment for required models if no default deployment available.
6. Select Next, review mappings, and then save.
Studio can configure defaults for supported models such as gpt-5.2, gpt-4.1, gpt-4.1-mini, and text-embedding-3-large. If the selected resource doesn't already have the required deployments, Studio can deploy them when auto-deployment is enabled.
From here, you can go on to try out Content Understanding features in the web portal by following the steps in the quickstart.
Option 2: Pass model deployments in each analyze request
Use this option when you want each request to explicitly point to model deployments by passing a modelDeployments object in the analyze request. This approach gives you maximum flexibility to use different deployments for different requests and doesn't require resource defaults.
POST /contentunderstanding/analyzers/myInvoice:analyze
{
  "inputs": [
    {
      "url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/invoice.pdf"
    }
  ],
  "modelDeployments": {
    "prebuilt-analyzer-completion": "myGpt52Deployment",
    "prebuilt-analyzer-embedding": "myTextEmbedding3LargeDeployment"
  }
}
The modelDeployments values in this analyze request override defaults that you configured at the resource level.
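A Python version of this call might look like the sketch below. The deployment names match the example above, the endpoint and key are placeholders, and the Operation-Location polling pattern is an assumption borrowed from other asynchronous Azure AI operations.

```python
import time
import requests

ENDPOINT = "https://<your-resource>.services.ai.azure.com"  # placeholder
API_KEY = "<your-api-key>"  # placeholder; key-based auth assumed
headers = {"Ocp-Apim-Subscription-Key": API_KEY}

body = {
    "inputs": [
        {"url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/invoice.pdf"}
    ],
    # Per-request override: map each model alias to a specific deployment.
    "modelDeployments": {
        "prebuilt-analyzer-completion": "myGpt52Deployment",
        "prebuilt-analyzer-embedding": "myTextEmbedding3LargeDeployment",
    },
}

resp = requests.post(
    f"{ENDPOINT}/contentunderstanding/analyzers/myInvoice:analyze",
    params={"api-version": "2025-11-01"},
    headers=headers,
    json=body,
)
resp.raise_for_status()

# Analysis runs asynchronously; poll until the operation reaches a final state.
# (Header name and status values are assumptions based on similar Azure AI APIs.)
operation_url = resp.headers["Operation-Location"]
while True:
    result = requests.get(operation_url, headers=headers).json()
    if result.get("status") in ("Succeeded", "Failed"):
        break
    time.sleep(2)
print(result.get("status"))
```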
Usage and billing data
Analyze responses include a usage property. This property reports token usage for your connected deployment and other Content Understanding usage meters. You can compare these values with deployment usage data to correlate consumption from Content Understanding with your model deployment.
{
  "usage": {
    "documentPagesMinimal": 3,
    "documentPagesBasic": 2,
    "documentPagesStandard": 1,
    "audioHours": 0.234,
    "videoHours": 0.123,
    "contextualizationToken": 1000,
    "tokens": {
      "gpt-5.2-input": 1234, /* Input and output tokens consumed by the completion model */
      "gpt-5.2-output": 2345,
      "text-embedding-3-large": 3456 /* Embedding tokens consumed */
    }
  }
}
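For example, the following sketch folds the per-model token meters from this payload into one total per model, which makes them easier to line up with your deployment's own usage metrics; `analyze_result` is assumed to be the parsed analyze response.

```python
def total_tokens_by_model(analyze_result: dict) -> dict[str, int]:
    """Sum the token meters in an analyze response's usage property.

    Meters named like "gpt-5.2-input" and "gpt-5.2-output" are folded into
    a single per-model total for comparison with deployment usage data.
    """
    totals: dict[str, int] = {}
    for meter, count in analyze_result.get("usage", {}).get("tokens", {}).items():
        model = meter.removesuffix("-input").removesuffix("-output")
        totals[model] = totals.get(model, 0) + count
    return totals

# With the usage payload above, this returns:
# {"gpt-5.2": 3579, "text-embedding-3-large": 3456}
```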
For details on how billing works for Content Understanding, see the pricing explainer.
Content filtering and Guardrails
Each Foundry model deployment has an associated Guardrails instance that evaluates content for safety. Content Understanding surfaces the Guardrails output directly in the analyze response as a content_filters array. If a Guardrails instance blocks content, the analyze operation returns an error; if it annotates content, the result passes through with filter metadata attached.
To adjust content filter thresholds or switch from blocking to annotating, update the Guardrails configuration on the model deployment in your Azure AI Foundry project. For more information, see Content filtering and Guardrails and the content_filters response object reference.
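As an illustrative sketch only (the shape assumed for content_filters here is inferred from the description above, not taken from the response object reference), you might surface filter annotations like this:

```python
def report_content_filters(analyze_result: dict) -> None:
    """Print any Guardrails annotations attached to an analyze result.

    The content_filters shape is assumed from the description above; see
    the content_filters response object reference for the actual schema.
    """
    annotations = analyze_result.get("content_filters", [])
    if not annotations:
        print("No content filter annotations.")
        return
    for entry in annotations:
        print(entry)

# Blocked content never reaches this point: the analyze operation itself
# returns an error, so handle that case when you poll for the result.
```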