
We are using the prebuilt-tax.us model to process W-2 forms. The field "LocalWagesTipsEtc" is present in the document, but the model is not extracting it in the result.

Lokesh Saini 0 Reputation points
2026-02-17T18:25:11.52+00:00

We are using the prebuilt-tax.us model to process W-2 forms.

The field "LocalWagesTipsEtc" is present in the document, but the model is not extracting it in the result. The same is true for the other fields of LocalTaxInfos.

Azure AI services

A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.


2 answers

  1. SRILAKSHMI C 15,030 Reputation points Microsoft External Staff Moderator
    2026-02-27T09:19:04.31+00:00

    Hello Lokesh Saini,

    Welcome to Microsoft Q&A, and thank you for reaching out.

    If you’re using the prebuilt-tax.us model to process W-2 forms and fields like LocalWagesTipsEtc (under LocalTaxInfos) are present in the document but not appearing in the result, this is typically due to one of the following reasons:

    1. Field Behavior & Confidence Threshold

    Local tax fields (W-2 Boxes 18–20) are:

    • Optional
    • Often repeated (multiple localities)
    • Returned only when the model has sufficient confidence

    If the OCR confidence is low (small font, light print, skewed scan, overlapping text), the model may detect the text but not populate the structured field.

    To verify this:

    • Check the raw OCR output (content, words) to confirm the value was read.
    • Review confidence scores for nearby fields.
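A minimal sketch of both checks, assuming you have the analyze result as parsed JSON (a Python dict) in the shape the REST API returns: a top-level `content` string and `documents[].fields` with `valueArray`, `valueObject`, and `confidence` keys. The helper names and the sample values are made up for illustration.

```python
from typing import Any, Iterator


def iter_fields(fields: dict[str, Any], prefix: str = "") -> Iterator[tuple[str, dict]]:
    """Yield (path, field) pairs, descending into array and object fields."""
    for name, field in fields.items():
        path = f"{prefix}{name}"
        yield path, field
        # Array fields hold a list of items, each usually wrapping a valueObject.
        for i, item in enumerate(field.get("valueArray", [])):
            yield from iter_fields(item.get("valueObject", {}), f"{path}[{i}].")
        # Object fields hold their sub-fields directly.
        if "valueObject" in field:
            yield from iter_fields(field["valueObject"], f"{path}.")


def text_was_read(analyze_result: dict[str, Any], expected: str) -> bool:
    """True if the expected text appears anywhere in the raw OCR content."""
    return expected in analyze_result.get("content", "")


def confidence_report(analyze_result: dict[str, Any]) -> dict[str, float]:
    """Map every field path that carries a confidence score to that score."""
    report = {}
    for d, doc in enumerate(analyze_result.get("documents", [])):
        for path, field in iter_fields(doc.get("fields", {}), f"doc{d}."):
            if "confidence" in field:
                report[path] = field["confidence"]
    return report


# Made-up sample that mimics the assumed response shape.
sample = {
    "content": "18 Local wages, tips, etc. 37160.56 19 Local income tax 51.00",
    "documents": [{
        "fields": {
            "WagesTipsAndOtherCompensation": {
                "type": "number", "content": "37160.56", "confidence": 0.98},
            "StateTaxInfos": {"type": "array", "valueArray": [
                {"type": "object", "valueObject": {
                    "State": {"type": "string", "content": "PA", "confidence": 0.41}}},
            ]},
        },
    }],
}

print(text_was_read(sample, "37160.56"))
for path, score in confidence_report(sample).items():
    print(f"{path}: {score}")
```

If the value shows up in `content` but no LocalTaxInfos path appears in the report, the text was read but not mapped to a structured field.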

    2. Array Handling

    LocalTaxInfos is returned as an array, not a single object.

    If your code expects a single object instead of iterating through the array, it may appear as if the field is missing.

    Make sure you are checking:

    documents[0].fields["LocalTaxInfos"]

    and iterating through all entries.
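As a rough sketch of that iteration, assuming the result is available as parsed JSON in the REST response shape (array fields under `valueArray`, each entry wrapping a `valueObject`). The function name and the sample data are hypothetical.

```python
from typing import Any


def local_tax_entries(analyze_result: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten every LocalTaxInfos entry of every document into plain dicts."""
    rows = []
    for doc in analyze_result.get("documents", []):
        field = doc.get("fields", {}).get("LocalTaxInfos")
        if not field:
            continue  # the model returned no LocalTaxInfos field at all
        for item in field.get("valueArray", []):
            sub_fields = item.get("valueObject", {})
            # Keep the raw text of each sub-field (LocalWagesTipsEtc, etc.).
            rows.append({name: f.get("content") for name, f in sub_fields.items()})
    return rows


# Made-up sample: two localities, the second one without LocalWagesTipsEtc.
sample = {"documents": [{"fields": {"LocalTaxInfos": {
    "type": "array",
    "valueArray": [
        {"type": "object", "valueObject": {
            "LocalWagesTipsEtc": {"type": "number", "content": "37160.56"},
            "LocalityName": {"type": "string", "content": "Locality A"}}},
        {"type": "object", "valueObject": {
            "LocalIncomeTax": {"type": "number", "content": "594.54"},
            "LocalityName": {"type": "string", "content": "Locality B"}}},
    ],
}}}]}

for i, row in enumerate(local_tax_entries(sample)):
    print(i, row.get("LocalityName"), row.get("LocalWagesTipsEtc"))
```

Using `.get()` for each sub-field matters here, because an entry can be present while one of its sub-fields is missing.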

    3. Tier Limitation

    If you are using the F0 (free) tier, only the first two pages are analyzed.

    If your W-2 content extends beyond those pages, the local tax section may not be processed. Upgrading to S0 removes this limitation.
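One quick way to spot this, sketched under the assumption that the parsed JSON result lists each analyzed page in `pages[]` with a `pageNumber` key. The function name is hypothetical, and the expected page count is something you supply from the source file yourself.

```python
from typing import Any


def unanalyzed_pages(analyze_result: dict[str, Any], expected_page_count: int) -> list[int]:
    """Return the 1-based page numbers that are missing from the result."""
    analyzed = {page["pageNumber"] for page in analyze_result.get("pages", [])}
    return [n for n in range(1, expected_page_count + 1) if n not in analyzed]


# Made-up example: a 3-page file where only the first two pages came back.
result = {"pages": [{"pageNumber": 1}, {"pageNumber": 2}]}
print(unanalyzed_pages(result, expected_page_count=3))
```

A non-empty list means part of the file was never analyzed, so any local tax boxes on those pages cannot appear in the output.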

    4. Supported Field Validation

    Confirm that LocalWagesTipsEtc is included in the supported schema for the W-2 model version you are using. While it is generally supported, schema support can vary by API version.

    Also ensure you are using the latest available Document Intelligence API version, as updates may improve extraction coverage.

    5. Layout Variations

    The prebuilt model is trained on common IRS W-2 layouts. Extraction may fail if:

    • The employer uses a non-standard template
    • The local tax section is shifted or compressed
    • There are alignment inconsistencies

    In such cases, text may appear in OCR but not be mapped to the structured field.

    6. Alternative Steps

    If the prebuilt model does not reliably extract the required fields:

    • Try the General Document model to extract key-value pairs.
    • Consider training a custom model if you frequently process a consistent W-2 layout that differs from standard formats.

    Custom models can improve accuracy for repeated document structures.

    Recommended Steps

    • Verify the document quality (clear scan, high resolution).
    • Confirm you are not hitting F0 tier page limits.
    • Check that you are iterating through LocalTaxInfos as an array.
    • Review the raw OCR output to confirm text detection.
    • Confirm that the API version and SDK are up to date.
    • If the issue persists with clean samples, test in Document Intelligence Studio for comparison.
    • If the issue is reproducible across multiple valid W-2s, open a support request that includes a redacted sample document, the API version, and the full JSON response.

    This will help determine whether it is a parsing issue, confidence threshold behavior, layout variation, or a model extraction limitation.

    I hope this helps. Do let me know if you have any further queries.

    Thank you!


  2. Alex Burlachenko 19,465 Reputation points Volunteer Moderator
    2026-02-19T12:50:42.0666667+00:00

    Hi Lokesh Saini,

    this is usually not a bug but a model coverage limitation. The prebuilt-tax.us model does not guarantee extraction of every possible W-2 field, especially local sections, because those fields vary heavily by state and employer formatting. The model is optimised for the most standard federal boxes, and local tax sections are often inconsistent in layout and labelling.

    Check the model version you are using. Are you on the latest API version and model release? Field coverage sometimes expands between versions. Then review the raw layout output by running the prebuilt-layout model on the same document. If the text for LocalWagesTipsEtc is detected correctly in layout but missing from the prebuilt-tax.us structured fields, that points to a schema coverage limitation rather than an OCR issue.
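A rough sketch of that comparison, assuming you have saved the JSON results from both prebuilt-layout and prebuilt-tax.us as Python dicts in the REST response shape. The function name, the three labels it returns, and the sample data are made up for illustration.

```python
from typing import Any


def triage(layout_result: dict[str, Any], tax_result: dict[str, Any], expected_text: str) -> str:
    """Classify why a LocalWagesTipsEtc value may be missing.

    Returns "ocr_issue" if layout never read the text, "extracted" if the
    tax model mapped it, and "not_mapped" if it was read but not mapped.
    """
    if expected_text not in layout_result.get("content", ""):
        return "ocr_issue"
    for doc in tax_result.get("documents", []):
        infos = doc.get("fields", {}).get("LocalTaxInfos", {})
        for item in infos.get("valueArray", []):
            field = item.get("valueObject", {}).get("LocalWagesTipsEtc")
            # Substring match, since the field text may carry "$" or spacing.
            if field and expected_text in field.get("content", ""):
                return "extracted"
    return "not_mapped"


# Made-up results: layout read the value, the tax model returned an empty array.
layout = {"content": "18 Local wages, tips, etc. 37160.56"}
tax = {"documents": [{"fields": {"LocalTaxInfos": {"type": "array", "valueArray": []}}}]}
print(triage(layout, tax, "37160.56"))
```

Running this over several W-2 samples should show whether the gap follows particular layouts or affects every document.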

    Also check whether the field is actually present in the JSON returned by the model. If LocalTaxInfos exists but is empty, the model likely did not reach sufficient confidence to populate it. In that case, test multiple W-2 samples to see whether extraction works for some formats but not others. If consistent extraction of local tax fields is critical, the recommended approach is to build a custom extraction model in Azure Document Intelligence, trained specifically on your W-2 variants. Prebuilt models are generalised, but custom models are far more reliable for structured but variable sections like local tax boxes.

    If needed, open a support request with sample redacted W-2s so Microsoft can confirm whether that specific field is currently supported in the prebuilt-tax.us schema.

    Regards,

    Alex
