Image Analysis cognitive skill

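The Image Analysis skill extracts a rich set of visual features based on the image content. For example, you can generate a caption from an image, generate tags, or identify celebrities and landmarks. This article is the reference documentation for the Image Analysis skill. For usage instructions, see "Extract text and information from images".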

This skill uses the machine learning models provided by Azure Vision in Foundry Tools. Image Analysis works on images that meet the following requirements:

The image must be presented in JPEG, PNG, GIF, or BMP format
The file size of the image must be less than 4 megabytes (MB)
The dimensions of the image must be at least 50 x 50 pixels
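
As a rough illustration, the three requirements above could be checked before an image is sent for enrichment. This is a minimal sketch, not part of the skill: the function name `image_requirement_problems` and the constants are hypothetical, and treating 1 MB as 1,048,576 bytes is an assumption.

```python
# Hypothetical pre-check that mirrors the image requirements listed above.
SUPPORTED_FORMATS = {"JPEG", "PNG", "GIF", "BMP"}
MAX_FILE_SIZE_BYTES = 4 * 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes
MIN_DIMENSION_PX = 50


def image_requirement_problems(image_format, size_bytes, width, height):
    """Return a list of problems; an empty list means the image qualifies."""
    problems = []
    if image_format.upper() not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {image_format}")
    if size_bytes >= MAX_FILE_SIZE_BYTES:
        problems.append("file size must be less than 4 MB")
    if width < MIN_DIMENSION_PX or height < MIN_DIMENSION_PX:
        problems.append("dimensions must be at least 50 x 50 pixels")
    return problems


print(image_requirement_problems("png", 120_000, 640, 480))
print(image_requirement_problems("tiff", 5_000_000, 40, 480))
```

The first call returns an empty list; the second reports all three problems.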

Supported data sources for OCR and image analysis are blobs in Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, and image content in Microsoft OneLake. Images can be standalone files or images embedded in a PDF or other files.

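This skill is implemented using the AI Image Analysis API version 3.2. If your solution requires calling a newer version of that service API (such as version 4.0), consider implementing it through a Web API custom skill or using the ImageAnalysisV4 power skill.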

Note

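This skill is bound to Foundry Tools and requires a billable resource for more than 20 transactions per indexer per day. Execution of built-in skills is charged at the existing Foundry Tools Standard price.

In addition, image extraction is billable by Azure AI Search.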

@odata.type

Microsoft.Skills.Vision.ImageAnalysisSkill

Skill parameters

Parameters are case-sensitive.

Parameter name	Description
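`defaultLanguageCode`	A string indicating the language to return. The service returns recognition results in the specified language. If this parameter isn't specified, the default value is "en". Supported languages include a subset of the generally available languages of Azure Vision. When a language is newly introduced with general availability status into Azure Vision, expect a delay before it's fully integrated within this skill.
`visualFeatures`	An array of strings indicating the visual feature types to return. Valid visual feature types include: `adult` - detects if the image is pornographic (depicts nudity or a sex act), gory (depicts extreme violence or blood), or suggestive (also known as racy content). `brands` - detects various brands within an image, including the approximate location. `categories` - categorizes image content according to a taxonomy defined by Foundry Tools. `description` - describes the image content with a complete sentence in supported languages. `faces` - detects if faces are present. If present, generates coordinates, gender, and age. `objects` - detects various objects within an image, including the approximate location. `tags` - tags the image with a detailed list of words related to the image content. Names of visual features are case-sensitive. The `color` and `imageType` visual features have been deprecated, but you can access this functionality through a custom skill. Refer to the Azure Vision Image Analysis documentation for which visual features are supported with each `defaultLanguageCode`.
`details`	An array of strings indicating which domain-specific details to return. Valid detail types include: `celebrities` - identifies celebrities if detected in the image. `landmarks` - identifies landmarks if detected in the image.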

Skill inputs

Input name	Description
`image`	Complex Type. Currently only works with the "/document/normalized_images" field, which the Azure blob indexer produces when `imageAction` is set to a value other than `none`.

Skill outputs

Output name	Description
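`adult`	Output is a single adult object of a complex type, consisting of boolean fields (`isAdultContent`, `isGoryContent`, `isRacyContent`) and double type scores (`adultScore`, `goreScore`, `racyScore`).
`brands`	Output is an array of brand objects, where each object is a complex type consisting of `name` (string) and a `confidence` score (double). It also returns a `rectangle` with four bounding box coordinates (`x`, `y`, `w`, `h`, in pixels) indicating placement inside the image. For the rectangle, `x` and `y` are the top left. Bottom left is `x`, `y+h`. Top right is `x+w`, `y`. Bottom right is `x+w`, `y+h`.
`categories`	Output is an array of category objects, where each category object is a complex type consisting of a `name` (string), `score` (double), and an optional `detail` that contains celebrity or landmark details. See the category taxonomy for the full list of category names. A detail is a nested complex type. A celebrity detail consists of a name, confidence score, and face bounding box. A landmark detail consists of a name and confidence score.
`description`	Output is a single description object of a complex type, consisting of a list of `tags` and a list of `captions` (an array of objects, each consisting of `text` (string) and `confidence` (double)).
`faces`	Complex types consisting of `age`, `gender`, and `faceBoundingBox`, which has four bounding box coordinates (in pixels) indicating placement inside the image. Coordinates are `top`, `left`, `width`, `height`.
`objects`	Output is an array of visual feature objects. Each object is a complex type consisting of `object` (string), `confidence` (double), `rectangle` (four bounding box coordinates indicating placement inside the image), and a `parent` that includes an object name and confidence.
`tags`	Output is an array of imageTag objects, where a tag object is a complex type consisting of `name` (string), `hint` (string), and `confidence` (double). The addition of a hint is rare. It's only generated if a tag is ambiguous. For example, an image tagged as "curling" might have a hint of "sports" to better indicate its content.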
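
The corner arithmetic described for `rectangle` (used by `brands` and `objects`) can be sketched as follows. The helper name `rectangle_corners` is hypothetical; the input values are taken from the `brands` entry in the sample output of this article.

```python
def rectangle_corners(rectangle):
    """Derive the four corner points from an x/y/w/h rectangle (pixels)."""
    x, y, w, h = (rectangle[key] for key in ("x", "y", "w", "h"))
    return {
        "top_left": (x, y),
        "top_right": (x + w, y),
        "bottom_left": (x, y + h),
        "bottom_right": (x + w, y + h),
    }


# Rectangle of the "Microsoft" brand from the sample output.
brand_rectangle = {"x": 20, "y": 97, "w": 62, "h": 52}
print(rectangle_corners(brand_rectangle))
```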

Sample definition

{
    "description": "Extract image analysis.",
    "@odata.type": "#Microsoft.Skills.Vision.ImageAnalysisSkill",
    "context": "/document/normalized_images/*",
    "defaultLanguageCode": "en",
    "visualFeatures": [
        "adult",
        "brands",
        "categories",
        "description",
        "faces",
        "objects",
        "tags"
    ],
    "inputs": [
        {
            "name": "image",
            "source": "/document/normalized_images/*"
        }
    ],
    "outputs": [
        {
            "name": "adult"
        },
        {
            "name": "brands"
        },
        {
            "name": "categories"
        },
        {
            "name": "description"
        },
        {
            "name": "faces"
        },
        {
            "name": "objects"
        },
        {
            "name": "tags"
        }
    ]
}
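
If you generate skillsets programmatically, a definition like the one above could be assembled from a list of feature names. This is a minimal sketch under assumptions: the helper `build_image_analysis_skill` is hypothetical, and pairing one output per requested visual feature simply follows the pattern of the sample rather than a documented rule. The membership check is case-sensitive, matching the parameter description.

```python
import json

# Feature names exactly as they appear in the sample definition (case-sensitive).
VALID_VISUAL_FEATURES = (
    "adult", "brands", "categories", "description", "faces", "objects", "tags",
)


def build_image_analysis_skill(visual_features, language="en"):
    """Build a skill definition dict with one output per requested feature."""
    unknown = [f for f in visual_features if f not in VALID_VISUAL_FEATURES]
    if unknown:
        raise ValueError(f"unsupported visual features: {unknown}")
    return {
        "description": "Extract image analysis.",
        "@odata.type": "#Microsoft.Skills.Vision.ImageAnalysisSkill",
        "context": "/document/normalized_images/*",
        "defaultLanguageCode": language,
        "visualFeatures": list(visual_features),
        "inputs": [{"name": "image", "source": "/document/normalized_images/*"}],
        "outputs": [{"name": f} for f in visual_features],
    }


print(json.dumps(build_image_analysis_skill(["tags", "description"]), indent=4))
```

Passing a name with different casing, such as "Tags", raises `ValueError` in this sketch.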

Sample index

For single objects (such as `adult` and `description`), you can structure them in the index as a Collection(Edm.ComplexType) to return `adult` and `description` output for all of them. For more information about mapping outputs to index fields, see "Flattening information from complex types".

{
    "fields": [
        {
            "name": "metadata_storage_name",
            "type": "Edm.String",
            "key": true,
            "searchable": true,
            "filterable": false,
            "facetable": false,
            "sortable": true
        },
        {
            "name": "metadata_storage_path",
            "type": "Edm.String",
            "searchable": true,
            "filterable": false,
            "facetable": false,
            "sortable": true
        },
        {
            "name": "content",
            "type": "Edm.String",
            "sortable": false,
            "searchable": true,
            "filterable": false,
            "facetable": false
        },
        {
            "name": "adult",
            "type": "Edm.ComplexType",
            "fields": [
                {
                    "name": "isAdultContent",
                    "type": "Edm.Boolean",
                    "searchable": false,
                    "filterable": true,
                    "facetable": true
                },
                {
                    "name": "isGoryContent",
                    "type": "Edm.Boolean",
                    "searchable": false,
                    "filterable": true,
                    "facetable": true
                },
                {
                    "name": "isRacyContent",
                    "type": "Edm.Boolean",
                    "searchable": false,
                    "filterable": true,
                    "facetable": true
                },
                {
                    "name": "adultScore",
                    "type": "Edm.Double",
                    "searchable": false,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "goreScore",
                    "type": "Edm.Double",
                    "searchable": false,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "racyScore",
                    "type": "Edm.Double",
                    "searchable": false,
                    "filterable": false,
                    "facetable": false
                }
            ]
        },
        {
            "name": "brands",
            "type": "Collection(Edm.ComplexType)",
            "fields": [
                {
                    "name": "name",
                    "type": "Edm.String",
                    "searchable": true,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "confidence",
                    "type": "Edm.Double",
                    "searchable": false,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "rectangle",
                    "type": "Edm.ComplexType",
                    "fields": [
                        {
                            "name": "x",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        },
                        {
                            "name": "y",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        },
                        {
                            "name": "w",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        },
                        {
                            "name": "h",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        }
                    ]
                }
            ]
        },
        {
            "name": "categories",
            "type": "Collection(Edm.ComplexType)",
            "fields": [
                {
                    "name": "name",
                    "type": "Edm.String",
                    "searchable": true,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "score",
                    "type": "Edm.Double",
                    "searchable": false,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "detail",
                    "type": "Edm.ComplexType",
                    "fields": [
                        {
                            "name": "celebrities",
                            "type": "Collection(Edm.ComplexType)",
                            "fields": [
                                {
                                    "name": "name",
                                    "type": "Edm.String",
                                    "searchable": true,
                                    "filterable": false,
                                    "facetable": false
                                },
                                {
                                    "name": "faceBoundingBox",
                                    "type": "Collection(Edm.ComplexType)",
                                    "fields": [
                                        {
                                            "name": "x",
                                            "type": "Edm.Int32",
                                            "searchable": false,
                                            "filterable": false,
                                            "facetable": false
                                        },
                                        {
                                            "name": "y",
                                            "type": "Edm.Int32",
                                            "searchable": false,
                                            "filterable": false,
                                            "facetable": false
                                        }
                                    ]
                                },
                                {
                                    "name": "confidence",
                                    "type": "Edm.Double",
                                    "searchable": false,
                                    "filterable": false,
                                    "facetable": false
                                }
                            ]
                        },
                        {
                            "name": "landmarks",
                            "type": "Collection(Edm.ComplexType)",
                            "fields": [
                                {
                                    "name": "name",
                                    "type": "Edm.String",
                                    "searchable": true,
                                    "filterable": false,
                                    "facetable": false
                                },
                                {
                                    "name": "confidence",
                                    "type": "Edm.Double",
                                    "searchable": false,
                                    "filterable": false,
                                    "facetable": false
                                }
                            ]
                        }
                    ]
                }
            ]
        },
        {
            "name": "description",
            "type": "Collection(Edm.ComplexType)",
            "fields": [
                {
                    "name": "tags",
                    "type": "Collection(Edm.String)",
                    "searchable": true,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "captions",
                    "type": "Collection(Edm.ComplexType)",
                    "fields": [
                        {
                            "name": "text",
                            "type": "Edm.String",
                            "searchable": true,
                            "filterable": false,
                            "facetable": false
                        },
                        {
                            "name": "confidence",
                            "type": "Edm.Double",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        }
                    ]
                }
            ]
        },
        {
            "name": "faces",
            "type": "Collection(Edm.ComplexType)",
            "fields": [
                {
                    "name": "age",
                    "type": "Edm.Int32",
                    "searchable": false,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "gender",
                    "type": "Edm.String",
                    "searchable": false,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "faceBoundingBox",
                    "type": "Collection(Edm.ComplexType)",
                    "fields": [
                        {
                            "name": "top",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        },
                        {
                            "name": "left",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        },
                        {
                            "name": "width",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        },
                        {
                            "name": "height",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        }
                    ]
                }
            ]
        },
        {
            "name": "objects",
            "type": "Collection(Edm.ComplexType)",
            "fields": [
                {
                    "name": "object",
                    "type": "Edm.String",
                    "searchable": true,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "confidence",
                    "type": "Edm.Double",
                    "searchable": false,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "rectangle",
                    "type": "Edm.ComplexType",
                    "fields": [
                        {
                            "name": "x",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        },
                        {
                            "name": "y",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        },
                        {
                            "name": "w",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        },
                        {
                            "name": "h",
                            "type": "Edm.Int32",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        }
                    ]
                },
                {
                    "name": "parent",
                    "type": "Edm.ComplexType",
                    "fields": [
                        {
                            "name": "object",
                            "type": "Edm.String",
                            "searchable": true,
                            "filterable": false,
                            "facetable": false
                        },
                        {
                            "name": "confidence",
                            "type": "Edm.Double",
                            "searchable": false,
                            "filterable": false,
                            "facetable": false
                        }
                    ]
                }
            ]
        },
        {
            "name": "tags",
            "type": "Collection(Edm.ComplexType)",
            "fields": [
                {
                    "name": "name",
                    "type": "Edm.String",
                    "searchable": true,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "hint",
                    "type": "Edm.String",
                    "searchable": true,
                    "filterable": false,
                    "facetable": false
                },
                {
                    "name": "confidence",
                    "type": "Edm.Double",
                    "searchable": false,
                    "filterable": false,
                    "facetable": false
                }
            ]
        }
    ]
}

Sample output field mapping

The target field can be a complex field or a collection. The index definition specifies any subfields.

"outputFieldMappings": [
    {
        "sourceFieldName": "/document/normalized_images/*/adult",
        "targetFieldName": "adult"
    },
    {
        "sourceFieldName": "/document/normalized_images/*/brands/*",
        "targetFieldName": "brands"
    },
    {
        "sourceFieldName": "/document/normalized_images/*/categories/*",
        "targetFieldName": "categories"
    },
    {
        "sourceFieldName": "/document/normalized_images/*/description",
        "targetFieldName": "description"
    },
    {
        "sourceFieldName": "/document/normalized_images/*/faces/*",
        "targetFieldName": "faces"
    },
    {
        "sourceFieldName": "/document/normalized_images/*/objects/*",
        "targetFieldName": "objects"
    },
    {
        "sourceFieldName": "/document/normalized_images/*/tags/*",
        "targetFieldName": "tags"
    }
]
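
The mapping above follows a regular pattern: array outputs are addressed with a trailing `/*`, while the single objects `adult` and `description` are not. A minimal sketch that generates the same entries, assuming a hypothetical helper named `build_output_field_mappings` and target fields named after the outputs:

```python
# Outputs that the skill returns as a single object rather than an array.
SINGLE_OBJECT_OUTPUTS = {"adult", "description"}


def build_output_field_mappings(output_names, context="/document/normalized_images/*"):
    """Create one mapping per output; array outputs get a trailing '/*'."""
    mappings = []
    for name in output_names:
        suffix = "" if name in SINGLE_OBJECT_OUTPUTS else "/*"
        mappings.append({
            "sourceFieldName": f"{context}/{name}{suffix}",
            "targetFieldName": name,
        })
    return mappings


for mapping in build_output_field_mappings(["adult", "brands", "tags"]):
    print(mapping)
```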

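Variation on output field mappings (nested properties)

You can define output field mappings on lower-level properties, such as just celebrities or landmarks. In this case, make sure your index schema has a field to contain each detail specifically.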

"outputFieldMappings": [
    {
        "sourceFieldName": "/document/normalized_images/*/categories/detail/celebrities/*",
        "targetFieldName": "celebrities"
    },
    {
        "sourceFieldName": "/document/normalized_images/*/categories/detail/landmarks/*",
        "targetFieldName": "landmarks"
    }
]
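
To show where these nested properties sit, the following sketch walks a `categories` array and gathers the celebrities and landmarks found under each category's `detail`. The function name `collect_details` is hypothetical, and the input is a trimmed copy of the sample output in this article.

```python
def collect_details(categories):
    """Gather celebrities and landmarks nested under each category's detail."""
    celebrities, landmarks = [], []
    for category in categories:
        detail = category.get("detail") or {}  # detail is optional
        celebrities.extend(detail.get("celebrities") or [])
        landmarks.extend(detail.get("landmarks") or [])
    return celebrities, landmarks


categories = [
    {"name": "abstract_", "score": 0.00390625},
    {
        "name": "people_",
        "score": 0.83984375,
        "detail": {
            "celebrities": [{"name": "Satya Nadella", "confidence": 0.999028444}],
            "landmarks": [],
        },
    },
]

celebrities, landmarks = collect_details(categories)
print([c["name"] for c in celebrities], landmarks)
```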

Sample input

{
    "values": [
        {
            "recordId": "1",
            "data": {
                "image": {
                    "data": "BASE64 ENCODED STRING OF A JPEG IMAGE",
                    "width": 500,
                    "height": 300,
                    "originalWidth": 5000,
                    "originalHeight": 3000,
                    "rotationFromOriginal": 90,
                    "contentOffset": 500,
                    "pageNumber": 2
                }
            }
        }
    ]
}

Sample output

{
  "values": [
    {
      "recordId": "1",
      "data": {
        "categories": [
          {
            "name": "abstract_",
            "score": 0.00390625
          },
          {
            "name": "people_",
            "score": 0.83984375,
            "detail": {
              "celebrities": [
                {
                  "name": "Satya Nadella",
                  "faceBoundingBox": [
                        {
                            "x": 273,
                            "y": 309
                        },
                        {
                            "x": 395,
                            "y": 309
                        },
                        {
                            "x": 395,
                            "y": 431
                        },
                        {
                            "x": 273,
                            "y": 431
                        }
                    ],
                  "confidence": 0.999028444
                }
              ],
              "landmarks": [ ]
            }
          }
        ],
        "adult": {
          "isAdultContent": false,
          "isRacyContent": false,
          "isGoryContent": false,
          "adultScore": 0.0934349000453949,
          "racyScore": 0.068613491952419281,
          "goreScore": 0.08928389008070282
        },
        "tags": [
          {
            "name": "person",
            "confidence": 0.98979085683822632
          },
          {
            "name": "man",
            "confidence": 0.94493889808654785
          },
          {
            "name": "outdoor",
            "confidence": 0.938492476940155
          },
          {
            "name": "window",
            "confidence": 0.89513939619064331
          }
        ],
        "description": {
          "tags": [
            "person",
            "man",
            "outdoor",
            "window",
            "glasses"
          ],
          "captions": [
            {
              "text": "Satya Nadella sitting on a bench",
              "confidence": 0.48293603002174407
            }
          ]
        },
        "faces": [
          {
            "age": 44,
            "gender": "Male",
            "faceBoundingBox": [
                {
                    "x": 1601,
                    "y": 395
                },
                {
                    "x": 1653,
                    "y": 395
                },
                {
                    "x": 1653,
                    "y": 447
                },
                {
                    "x": 1601,
                    "y": 447
                }
            ]
          }
        ],
        "objects": [
          {
            "rectangle": {
              "x": 25,
              "y": 43,
              "w": 172,
              "h": 140
            },
            "object": "person",
            "confidence": 0.931
          }
        ],
        "brands":[  
           {  
              "name":"Microsoft",
              "confidence": 0.903,
              "rectangle":{  
                 "x":20,
                 "y":97,
                 "w":62,
                 "h":52
              }
           }
        ]
      }
    }
  ]
}
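
A consumer of this output might pick the most confident caption, keep only high-confidence tags, and convert the four-point `faceBoundingBox` into an x/y/w/h rectangle. The sketch below uses a trimmed copy of the sample output; the helper names and the 0.9 threshold are hypothetical choices for illustration.

```python
def best_caption(description):
    """Return the caption text with the highest confidence, or None."""
    captions = description.get("captions", [])
    best = max(captions, key=lambda c: c["confidence"], default=None)
    return best["text"] if best else None


def confident_tags(tags, min_confidence):
    """Keep the names of tags whose confidence meets the threshold."""
    return [t["name"] for t in tags if t["confidence"] >= min_confidence]


def points_to_rectangle(points):
    """Convert a list of corner points into an x/y/w/h rectangle."""
    xs = [p["x"] for p in points]
    ys = [p["y"] for p in points]
    return {"x": min(xs), "y": min(ys), "w": max(xs) - min(xs), "h": max(ys) - min(ys)}


# Trimmed copy of the "data" object from the sample output.
data = {
    "tags": [
        {"name": "person", "confidence": 0.98979085683822632},
        {"name": "man", "confidence": 0.94493889808654785},
        {"name": "outdoor", "confidence": 0.938492476940155},
        {"name": "window", "confidence": 0.89513939619064331},
    ],
    "description": {
        "captions": [
            {"text": "Satya Nadella sitting on a bench", "confidence": 0.48293603002174407}
        ]
    },
    "faces": [
        {"faceBoundingBox": [
            {"x": 1601, "y": 395}, {"x": 1653, "y": 395},
            {"x": 1653, "y": 447}, {"x": 1601, "y": 447},
        ]}
    ],
}

print(best_caption(data["description"]))
print(confident_tags(data["tags"], min_confidence=0.9))
print(points_to_rectangle(data["faces"][0]["faceBoundingBox"]))
```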

Error cases

In the following error cases, no elements are extracted.

Error Code	Description
`NotSupportedLanguage`	The language provided isn't supported.
`InvalidImageUrl`	The image URL is badly formatted or not accessible.
`InvalidImageFormat`	The input data isn't a valid image.
`InvalidImageSize`	The input image is too large.
`NotSupportedVisualFeature`	The specified feature type isn't valid.
`NotSupportedImage`	Unsupported image, for example child pornography.
`InvalidDetails`	Unsupported domain-specific model.

If you get an error similar to "One or more skills are invalid. Details: Error in skill #<num>: Outputs are not supported by skill: Landmarks", check the path. Both celebrities and landmarks are properties under `detail`.

"categories":[  
      {  
         "name":"building_",
         "score":0.97265625,
         "detail":{  
            "landmarks":[  
               {  
                  "name":"Forbidden City",
                  "confidence":0.92013400793075562
               }
            ]
         }
      }
]

Last updated on 2026-04-30