Document Layout skill

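The Document Layout skill uses the Azure Document Intelligence layout model in Foundry Tools to analyze a document, detect its structure and characteristics, and produce a syntactical representation in Markdown or text format. The skill supports text and image extraction. Image extraction includes location metadata that preserves the position of each image within the document. Keeping images close to their related content is useful in Retrieval Augmented Generation (RAG) and multimodal search scenarios.

For transactions that exceed 20 documents per indexer per day, this skill requires that you attach a billable Microsoft Foundry resource to your skillset. Execution of built-in skills is charged at the existing Foundry Tools Standard price.

This article is the reference documentation for the Document Layout skill. For usage information, see "How to chunk and vectorize by document layout".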

Tip

This skill is commonly used on content that has structure and images, such as PDFs. The multimodal tutorials demonstrate image verbalization using two different data chunking strategies.

Limitations

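This skill has the following limitations:

This skill isn't suitable for large documents that require more than 5 minutes of processing in the Azure Document Intelligence layout model. The skill times out, but charges still apply to the Foundry resource if it's attached to the skillset for billing purposes. To avoid unnecessary costs, make sure documents are optimized to stay within the processing limits.
Because this skill calls the Azure Document Intelligence layout model, all service behaviors documented for each document type apply to its output. For example, Word (DOCX) and PDF files can produce different results because images are handled differently. If you need consistent image behavior across DOCX and PDF, consider converting documents to PDF, or review the multimodal search documentation for alternative approaches.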

Supported regions

The Document Layout skill calls v4.0 (2024-11-30) of the Azure Document Intelligence REST API.

Supported regions vary by modality and by how the skill connects to the Azure Document Intelligence layout model. Currently, the implemented version of the layout model doesn't support 21Vianet regions.

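Approach	Requirement
Import data wizard	Create an Azure AI Search service and an Azure AI multi-service account in one of the following regions: East US, West Europe, or North Central US.
Programmatic, using a Microsoft Foundry resource key for billing	Create your Azure AI Search service and Microsoft Foundry resource in the same region. The region must support both Azure AI Search and Azure Document Intelligence.
Programmatic, using Microsoft Entra ID authentication (preview) for billing	No same-region requirement. Create your Azure AI Search service and Microsoft Foundry resource in any region where each service is available.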

Supported file formats

This skill recognizes the following file formats:

.PDF
.JPEG
.JPG
.PNG
.BMP
.TIFF
.DOCX
.XLSX
.PPTX
.HTML

Supported languages

For printed text, see the languages supported by the Azure Document Intelligence layout model.

@odata.type

Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill

Data limits

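For PDF and TIFF, up to 2,000 pages can be processed (with a Free tier subscription, only the first two pages are processed).
Although the file size limit for analyzing documents is 500 MB for the Azure Document Intelligence paid (S0) tier and 4 MB for the Azure Document Intelligence Free (F0) tier, indexing is still subject to the indexer limits of your search service tier.
Image dimensions must be between 50 pixels x 50 pixels and 10,000 pixels x 10,000 pixels.
If your PDFs are password-locked, remove the lock before running the indexer.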

Skill parameters

Parameters are case-sensitive. Several parameters were introduced in specific preview versions of the REST API. For full access to all parameters, we recommend using the generally available version (2025-09-01) or the latest preview (2025-11-01-preview).

Parameter name	Allowed values	Description
`outputMode`	`oneToMany`	Controls the cardinality of the output produced by the skill.
`markdownHeaderDepth`	`h1`, `h2`, `h3`, `h4`, `h5`, `h6` (default)	Only applies if `outputFormat` is set to `markdown`. This parameter describes the deepest nesting level that should be considered. For example, if `markdownHeaderDepth` is `h3`, any deeper sections, such as `h4`, are rolled into `h3`.
`outputFormat`	`markdown` (default), `text`	New. Controls the format of the output generated by the skill.
`extractionOptions`	`["images"]`, `["images", "locationMetadata"]`, `["locationMetadata"]`	New. Identifies any extra content extracted from the document. Define an array of enums that correspond to the content to include in the output. For example, if `extractionOptions` is `["images", "locationMetadata"]`, the output includes images and location metadata, which provides page location information about where the content was extracted, such as a page number or section. This parameter applies to both output formats.
`chunkingProperties`	See below.	New. Only applies if `outputFormat` is set to `text`. Options that encapsulate how to chunk text content while recomputing other metadata.

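chunkingProperties parameter	Allowed values	Description
`unit`	`Characters`. Currently the only allowed value. Chunk length is measured in characters, as opposed to words or tokens.	New. Controls the cardinality of the chunk unit.
`maximumLength`	Any integer between 300 and 50000	New. The maximum chunk length in characters, as measured by String.Length.
`overlapLength`	Integer. The value must be less than half of `maximumLength`.	New. The length of overlap provided between two text chunks.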
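
The constraints in the table above can be checked locally before a skillset is submitted. The following is a minimal sketch, not part of the service or any SDK: `validate_chunking_properties` is a hypothetical helper, and it assumes that all three properties are supplied, that the unit comparison can ignore case, and that a negative `overlapLength` is invalid. None of those three points is stated in the table.

```python
def validate_chunking_properties(props: dict) -> list[str]:
    """Return a list of problems found in a chunkingProperties dict (hypothetical helper)."""
    problems = []

    # The table lists characters as the only allowed unit. The comparison is
    # case-insensitive because the table shows "Characters" while the sample
    # definition uses "characters" (an assumption, not documented behavior).
    unit = props.get("unit")
    if not isinstance(unit, str) or unit.lower() != "characters":
        problems.append("unit must be 'characters'")

    # maximumLength: any integer between 300 and 50000 (bool is excluded
    # explicitly because Python treats it as a subclass of int).
    max_len = props.get("maximumLength")
    max_ok = (
        isinstance(max_len, int)
        and not isinstance(max_len, bool)
        and 300 <= max_len <= 50000
    )
    if not max_ok:
        problems.append("maximumLength must be an integer between 300 and 50000")

    # overlapLength: an integer that is less than half of maximumLength.
    overlap = props.get("overlapLength")
    if not isinstance(overlap, int) or isinstance(overlap, bool) or overlap < 0:
        problems.append("overlapLength must be a non-negative integer")
    elif max_ok and overlap >= max_len / 2:
        problems.append("overlapLength must be less than half of maximumLength")

    return problems


if __name__ == "__main__":
    sample = {"unit": "characters", "maximumLength": 2000, "overlapLength": 200}
    print(validate_chunking_properties(sample))
```

An empty list means the values satisfy the documented ranges; the service remains the authority on what it accepts.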

Skill inputs

Input name	Description
`file_data`	The file that content should be extracted from.

The "file_data" input must be an object defined as follows:

{
  "$type": "file",
  "data": "BASE64 encoded string of the file"
}

Alternatively, it can be defined as follows:

{
  "$type": "file",
  "url": "URL to download file",
  "sasToken": "OPTIONAL: SAS token for authentication if the URL provided is for a file in blob storage"
}

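The file reference object can be generated in one of the following ways:

Set the allowSkillsetToReadFileData parameter in your indexer definition to true. This setting creates the path /document/file_data, an object that represents the original file data downloaded from your blob data source. This parameter only applies to files in Azure Blob Storage.
Use a custom skill that returns a JSON object definition providing $type together with either data or url and sasToken. The $type parameter must be set to file, and data must be the base64-encoded byte array of the file content. The url parameter must be a valid URL with access for downloading the file at that location.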
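
As an illustration of the two object shapes described above, the following sketch builds a file reference from raw bytes or from a URL. The function names are hypothetical and the URL is a placeholder. How a custom skill wraps this object in its response is outside the scope of the sketch.

```python
import base64
import json
from typing import Optional


def file_reference_from_bytes(content: bytes) -> dict:
    """Build the inline form: $type plus the base64-encoded file content."""
    return {"$type": "file", "data": base64.b64encode(content).decode("ascii")}


def file_reference_from_url(url: str, sas_token: Optional[str] = None) -> dict:
    """Build the URL form; sasToken is added only when one is supplied."""
    reference = {"$type": "file", "url": url}
    if sas_token:
        reference["sasToken"] = sas_token
    return reference


if __name__ == "__main__":
    print(json.dumps(file_reference_from_bytes(b"Hello, World!")))
    # Placeholder URL, not a real storage location.
    print(json.dumps(file_reference_from_url("https://example.com/sample.pdf")))
```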

Skill outputs

Output name	Description
`markdown_document`	Only applies if `outputFormat` is set to `markdown`. A collection of "sections" objects, each representing a section in the Markdown document.
`text_sections`	Only applies if `outputFormat` is set to `text`. A collection of text chunk objects, each representing the text within the bounds of a page (factoring in any further chunking configured), including any section headers. A text chunk object includes `locationMetadata` if applicable.
`normalized_images`	Only applies if `outputFormat` is set to `text` and `extractionOptions` includes `images`. A collection of images extracted from the document, including `locationMetadata` if applicable.

Sample definition for markdown output mode

{
  "skills": [
    {
      "description": "Analyze a document",
      "@odata.type": "#Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill",
      "context": "/document",
      "outputMode": "oneToMany", 
      "markdownHeaderDepth": "h3", 
      "inputs": [
        {
          "name": "file_data",
          "source": "/document/file_data"
        }
      ],
      "outputs": [
        {
          "name": "markdown_document", 
          "targetName": "markdown_document" 
        }
      ]
    }
  ]
}

Sample output for markdown output mode

{
  "markdown_document": [
    { 
      "content": "Hi this is Jim \r\nHi this is Joe", 
      "sections": { 
        "h1": "Foo", 
        "h2": "Bar", 
        "h3": "" 
      },
      "ordinal_position": 0
    }, 
    { 
      "content": "Hi this is Lance",
      "sections": { 
         "h1": "Foo", 
         "h2": "Bar", 
         "h3": "Boo" 
      },
      "ordinal_position": 1
    } 
  ] 
}

The value of markdownHeaderDepth controls the number of keys in the "sections" dictionary. In the example skill definition, markdownHeaderDepth is "h3", so the "sections" dictionary has three keys: h1, h2, and h3.
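
The relationship between markdownHeaderDepth and the keys of the "sections" dictionary can be sketched as follows. `section_keys` is a hypothetical helper written for illustration; it only mirrors the rule described above and doesn't call the service.

```python
ALLOWED_DEPTHS = ("h1", "h2", "h3", "h4", "h5", "h6")


def section_keys(markdown_header_depth: str = "h6") -> list[str]:
    """Return the keys expected in each "sections" dictionary for a given depth."""
    if markdown_header_depth not in ALLOWED_DEPTHS:
        raise ValueError(f"markdownHeaderDepth must be one of {ALLOWED_DEPTHS}")
    depth = int(markdown_header_depth[1:])  # "h3" -> 3
    return [f"h{level}" for level in range(1, depth + 1)]


if __name__ == "__main__":
    print(section_keys("h3"))
```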

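Example for text output mode with image and metadata extraction

This example shows how to output text content in fixed-size chunks and extract images, along with location metadata, from the document.

Sample definition for text output mode with image and metadata extraction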

{
  "skills": [
    {
      "description": "Analyze a document",
      "@odata.type": "#Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill",
      "context": "/document",
      "outputMode": "oneToMany",
      "outputFormat": "text",
      "extractionOptions": ["images", "locationMetadata"],
      "chunkingProperties": {     
          "unit": "characters",
          "maximumLength": 2000, 
          "overlapLength": 200
      },
      "inputs": [
        {
          "name": "file_data",
          "source": "/document/file_data"
        }
      ],
      "outputs": [
        { 
          "name": "text_sections", 
          "targetName": "text_sections" 
        }, 
        { 
          "name": "normalized_images", 
          "targetName": "normalized_images" 
        } 
      ]
    }
  ]
}

Sample output for text output mode with image and metadata extraction

{
  "text_sections": [
      {
        "id": "1_7e6ef1f0-d2c0-479c-b11c-5d3c0fc88f56",
        "content": "the effects of analyzers using Analyze Text (REST). For more information about analyzers, see Analyzers for text processing.During indexing, an indexer only checks field names and types. There's no validation step that ensures incoming content is correct for the corresponding search field in the index.Create an indexerWhen you're ready to create an indexer on a remote search service, you need a search client. A search client can be the Azure portal, a REST client, or code that instantiates an indexer client. We recommend the Azure portal or REST APIs for early development and proof-of-concept testing.Azure portal1. Sign in to the Azure portal 2, then find your search service.2. On the search service Overview page, choose from two options:· Import data wizard: The wizard is unique in that it creates all of the required elements. Other approaches require a predefined data source and index.All services > Azure Al services | Al Search >demo-search-svc Search serviceSearchAdd indexImport dataImport and vectorize dataOverviewActivity logEssentialsAccess control (IAM)Get startedPropertiesUsageMonitoring· Add indexer: A visual editor for specifying an indexer definition.",
        "locationMetadata": {
          "pageNumber": 1,
          "ordinalPosition": 0,
          "boundingPolygons": "[[{\"x\":1.5548,\"y\":0.4036},{\"x\":6.9691,\"y\":0.4033},{\"x\":6.9691,\"y\":0.8577},{\"x\":1.5548,\"y\":0.8581}],[{\"x\":1.181,\"y\":1.0627},{\"x\":7.1393,\"y\":1.0626},{\"x\":7.1393,\"y\":1.7363},{\"x\":1.181,\"y\":1.7365}],[{\"x\":1.1923,\"y\":2.1466},{\"x\":3.4585,\"y\":2.1496},{\"x\":3.4582,\"y\":2.4251},{\"x\":1.1919,\"y\":2.4221}],[{\"x\":1.1813,\"y\":2.6518},{\"x\":7.2464,\"y\":2.6375},{\"x\":7.2486,\"y\":3.5913},{\"x\":1.1835,\"y\":3.6056}],[{\"x\":1.3349,\"y\":3.9489},{\"x\":2.1237,\"y\":3.9508},{\"x\":2.1233,\"y\":4.1128},{\"x\":1.3346,\"y\":4.111}],[{\"x\":1.5705,\"y\":4.5322},{\"x\":5.801,\"y\":4.5326},{\"x\":5.801,\"y\":4.7311},{\"x\":1.5704,\"y\":4.7307}]]"
        },
        "sections": []
      },
      {
        "id": "2_25134f52-04c3-415a-ab3d-80729bd58e67",
        "content": "All services > Azure Al services | Al Search >demo-search-svc | Indexers Search serviceSearch0«Add indexerRefreshDelete:selected: TagsFilter by name ...:selected: Diagnose and solve problemsSearch managementStatusNameIndexesIndexers*Data sourcesRun the indexerBy default, an indexer runs immediately when you create it on the search service. You can override this behavior by setting disabled to true in the indexer definition. Indexer execution is the moment of truth where you find out if there are problems with connections, field mappings, or skillset construction.There are several ways to run an indexer:· Run on indexer creation or update (default).. Run on demand when there are no changes to the definition, or precede with reset for full indexing. For more information, see Run or reset indexers.· Schedule indexer processing to invoke execution at regular intervals.Scheduled execution is usually implemented when you have a need for incremental indexing so that you can pick up the latest changes. As such, scheduling has a dependency on change detection.Indexers are one of the few subsystems that make overt outbound calls to other Azure resources. In terms of Azure roles, indexers don't have separate identities; a connection from the search engine to another Azure resource is made using the system or user- assigned managed identity of a search service. If the indexer connects to an Azure resource on a virtual network, you should create a shared private link for that connection. For more information about secure connections, see Security in Azure Al Search.Check results",
        "locationMetadata": {
          "pageNumber": 2,
          "ordinalPosition": 1,
          "boundingPolygons": "[[{\"x\":2.2041,\"y\":0.4109},{\"x\":4.3967,\"y\":0.4131},{\"x\":4.3966,\"y\":0.5505},{\"x\":2.204,\"y\":0.5482}],[{\"x\":2.5042,\"y\":0.6422},{\"x\":4.8539,\"y\":0.6506},{\"x\":4.8527,\"y\":0.993},{\"x\":2.5029,\"y\":0.9845}],[{\"x\":2.3705,\"y\":1.1496},{\"x\":2.6859,\"y\":1.15},{\"x\":2.6858,\"y\":1.2612},{\"x\":2.3704,\"y\":1.2608}],[{\"x\":3.7418,\"y\":1.1709},{\"x\":3.8082,\"y\":1.171},{\"x\":3.8081,\"y\":1.2508},{\"x\":3.7417,\"y\":1.2507}],[{\"x\":3.9692,\"y\":1.1445},{\"x\":4.0541,\"y\":1.1445},{\"x\":4.0542,\"y\":1.2621},{\"x\":3.9692,\"y\":1.2622}],[{\"x\":4.5326,\"y\":1.2263},{\"x\":5.1065,\"y\":1.229},{\"x\":5.106,\"y\":1.346},{\"x\":4.5321,\"y\":1.3433}],[{\"x\":5.5508,\"y\":1.2267},{\"x\":5.8992,\"y\":1.2268},{\"x\":5.8991,\"y\":1.3408},{\"x\":5.5508,\"y\":1.3408}]]"
        },
        "sections": []
       }
    ],
    "normalized_images": [ 
        { 
            "id": "1_550e8400-e29b-41d4-a716-446655440000", 
            "data": "SGVsbG8sIFdvcmxkIQ==", 
            "imagePath": "aHR0cHM6Ly9henNyb2xsaW5nLmJsb2IuY29yZS53aW5kb3dzLm5ldC9tdWx0aW1vZGFsaXR5L0NyZWF0ZUluZGV4ZXJwNnA3LnBkZg2/normalized_images_0.jpg",  
            "locationMetadata": {
              "pageNumber": 1,
              "ordinalPosition": 0,
              "boundingPolygons": "[[{\"x\":2.0834,\"y\":6.2245},{\"x\":7.1818,\"y\":6.2244},{\"x\":7.1816,\"y\":7.9375},{\"x\":2.0831,\"y\":7.9377}]]"
            }
        },
        { 
            "id": "2_123e4567-e89b-12d3-a456-426614174000", 
            "data": "U29tZSBtb3JlIGV4YW1wbGUgdGV4dA==", 
            "imagePath": "aHR0cHM6Ly9henNyb2xsaW5nLmJsb2IuY29yZS53aW5kb3dzLm5ldC9tdWx0aW1vZGFsaXR5L0NyZWF0ZUluZGV4ZXJwNnA3LnBkZg2/normalized_images_1.jpg",  
            "locationMetadata": {
              "pageNumber": 2,
              "ordinalPosition": 1,
              "boundingPolygons": "[[{\"x\":2.0784,\"y\":0.3734},{\"x\":7.1837,\"y\":0.3729},{\"x\":7.183,\"y\":2.8611},{\"x\":2.0775,\"y\":2.8615}]]"
            } 
        }
    ] 
}

Note that "sections" is empty in the sample output above. To populate it, add another skill configured with outputFormat set to markdown.

The skill uses Azure Document Intelligence to compute locationMetadata. For details about how pages and bounding polygon coordinates are defined, see the Azure Document Intelligence layout model.
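
In the sample output, boundingPolygons is a string that itself contains JSON, so it needs a second decoding step before the coordinates can be used. The following is a minimal sketch of that step; `parse_bounding_polygons` is a hypothetical helper, and the input is copied from the first image in the sample output.

```python
import json


def parse_bounding_polygons(location_metadata: dict) -> list[list[tuple[float, float]]]:
    """Decode the boundingPolygons string into polygons of (x, y) points."""
    polygons = json.loads(location_metadata["boundingPolygons"])
    return [[(point["x"], point["y"]) for point in polygon] for polygon in polygons]


if __name__ == "__main__":
    metadata = {
        "pageNumber": 1,
        "ordinalPosition": 0,
        "boundingPolygons": (
            '[[{"x":2.0834,"y":6.2245},{"x":7.1818,"y":6.2244},'
            '{"x":7.1816,"y":7.9375},{"x":2.0831,"y":7.9377}]]'
        ),
    }
    polygons = parse_bounding_polygons(metadata)
    print(len(polygons), polygons[0][0])
```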

imagePath represents the relative path of a stored image. If a knowledge store file projection is configured in the skillset, this path matches the relative path of the image stored in the knowledge store.

Last updated on 2026-04-30