Extract base64 content without using open() method

RahulK 20 Reputation points
2026-02-13T15:20:56.0466667+00:00

Problem statement: We have a collaborative document that contains custom XML mappings, content controls, custom properties, and so on, with changes made by multiple people. We need to extract the Base64 of this document after sanitizing it (removing content controls and custom properties, and accepting all revisions).

Current approach: We create a new document with createDocument() and populate it with the OOXML content of the original, sanitizing it along the way. To extract the Base64, we call tempDoc.open(), where tempDoc is the newly created document. However, open() causes the document to open in a new window.

Requirement: We want to fetch the Base64 of the sanitized document without opening it. We looked at the options and found third-party libraries that do this, but we want to do it natively, without relying on any third-party library or server-side processing.

Please suggest.

Microsoft 365 and Office | Development | Office JavaScript API

2 answers

  1. Austin-H 8,330 Reputation points Microsoft External Staff Moderator
    2026-02-14T02:28:23.21+00:00

    Hello RahulK

    Thank you for reaching out to Microsoft Q&A

    Based on my research, the Word.DocumentCreated object returned by context.application.createDocument() does not provide any method to extract the document as Base64 without opening it. Currently, tempDoc.open() is the only documented way to access a created document for Base64 extraction, and it triggers the unwanted new-window behavior.

    Given this, you can try sanitizing the document in place with the following workflow:

    • Disable change tracking: context.document.changeTrackingMode = Word.ChangeTrackingMode.off
    • Accept all revisions: context.document.body.getTrackedChanges().acceptAll()
    • Remove content controls: Iterate context.document.contentControls and call delete(true) to preserve inner content
    • Remove custom metadata: Delete items from context.document.properties.customProperties and context.document.customXmlParts
    • Extract Base64: Use Office.context.document.getFileAsync(Office.FileType.Compressed) and convert to Base64
    • Restore original: Either use context.document.undo() or save the original Base64 beforehand and restore via body.insertFileFromBase64(originalBase64, Word.InsertLocation.replace)

    Example for this approach:

    async function getSanitizedBase64() {
      return await Word.run(async (context) => {
        const doc = context.document;
        // Save original state
        const originalBase64 = await getDocumentAsBase64();
        // Sanitize
        doc.changeTrackingMode = Word.ChangeTrackingMode.off;
        doc.body.getTrackedChanges().acceptAll();
        await context.sync();
        const contentControls = doc.contentControls;
        contentControls.load("items");
        await context.sync();
        for (let i = contentControls.items.length - 1; i >= 0; i--) {
          contentControls.items[i].delete(true);
        }
        const customProps = doc.properties.customProperties;
        customProps.load("items");
        await context.sync();
        for (let i = customProps.items.length - 1; i >= 0; i--) {
          customProps.items[i].delete();
        }
        const customXmlParts = doc.customXmlParts;
        customXmlParts.load("items");
        await context.sync();
        for (let i = customXmlParts.items.length - 1; i >= 0; i--) {
          customXmlParts.items[i].delete();
        }
        await context.sync();
        // Extract sanitized Base64
        const sanitizedBase64 = await getDocumentAsBase64();
        // Restore original
        doc.body.insertFileFromBase64(originalBase64, Word.InsertLocation.replace);
        await context.sync();
        return sanitizedBase64;
      });
    }
     
    function getDocumentAsBase64() {
      return new Promise((resolve, reject) => {
        Office.context.document.getFileAsync(Office.FileType.Compressed, { sliceSize: 4194304 }, (result) => {
          if (result.status === Office.AsyncResultStatus.Succeeded) {
            const file = result.value;
            const sliceCount = file.sliceCount;
            let slicesReceived = 0;
            const docData = [];
            const getSlice = (index) => {
              file.getSliceAsync(index, (sliceResult) => {
                if (sliceResult.status === Office.AsyncResultStatus.Succeeded) {
                  docData.push(sliceResult.value.data);
                  slicesReceived++;
                  if (slicesReceived === sliceCount) {
                    file.closeAsync();
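                    // Concatenate the slices (each slice is an array of byte values)
                    const allBytes = new Uint8Array(docData.reduce((acc, arr) => acc + arr.length, 0));
                    let offset = 0;
                    docData.forEach(arr => {
                      allBytes.set(new Uint8Array(arr), offset);
                      offset += arr.length;
                    });
                    // Build the binary string in 32 KB chunks; spreading a multi-megabyte
                    // array into String.fromCharCode() can overflow the call stack
                    let binary = "";
                    for (let i = 0; i < allBytes.length; i += 0x8000) {
                      binary += String.fromCharCode.apply(null, allBytes.subarray(i, i + 0x8000));
                    }
                    resolve(btoa(binary));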
                  } else {
                    getSlice(index + 1);
                  }
                } else {
                  file.closeAsync();
                  reject(sliceResult.error);
                }
              });
            };
            getSlice(0);
          } else {
            reject(result.error);
          }
        });
      });
    }
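
    One gap worth noting in the example above: if any sanitization step throws, the restore step never runs and the document stays half-sanitized. The following is a minimal sketch of a try/finally wrapper that guards against this. The names withRestore, takeSnapshot, work, and restore are hypothetical and not part of the Office JavaScript API; the three callbacks stand in for getDocumentAsBase64(), the sanitize-and-extract steps, and the insertFileFromBase64 call.

```javascript
// Hypothetical helper (not an Office API): take a snapshot, run the work,
// and always run the restore step afterwards, even if the work throws.
async function withRestore(takeSnapshot, work, restore) {
  const snapshot = await takeSnapshot();
  try {
    return await work();
  } finally {
    // Runs whether work() resolved or rejected
    await restore(snapshot);
  }
}

// Possible use inside Word.run, assuming sanitize(context) wraps the
// deletion steps from the example (sanitize is a hypothetical name):
//
// const sanitized = await withRestore(
//   () => getDocumentAsBase64(),
//   async () => { await sanitize(context); return getDocumentAsBase64(); },
//   async (original) => {
//     context.document.body.insertFileFromBase64(original, Word.InsertLocation.replace);
//     await context.sync();
//   }
// );
```

    Replacing the body may not bring back custom properties or custom XML parts that were deleted, so it is worth testing the restore path on a copy of the document first.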
    

    For more information, see:

    Word.Document.changeTrackingMode

    Word.ContentControl.delete

    Office.Document.getFileAsync

    Hope my answer will help you.




  2. Q&A Assist
    2026-02-13T15:21:10.1+00:00
    AI generated content. This question contains an answer created with AI from Q&A Assist.

    To extract the Base64 content of a sanitized document without calling open(), you can use the getFileAsync method of the Office JavaScript API. Note that getFileAsync is available only on Office.context.document, that is, the document the add-in is currently running in; a document returned by createDocument() does not expose it. Its callback receives a File object whose contents are read in slices and can then be converted to a Base64 string without opening a new window.

    Here’s a general approach you can follow:

    1. Sanitize the current document in place, since getFileAsync cannot read a document created with createDocument().
    2. Call Office.context.document.getFileAsync with Office.FileType.Compressed to get the file in its binary (.docx) form.
    3. Read every slice with getSliceAsync, then convert the collected bytes to a Base64 string.
    4. Send the Base64 string to your desired endpoint or use it as needed.

    Here’s a simplified code snippet to illustrate the process:

    function getBase64FromDocument() {
        // getFileAsync exists only on the current document, not on a created tempDoc
        Office.context.document.getFileAsync(Office.FileType.Compressed, function (result) {
            if (result.status !== Office.AsyncResultStatus.Succeeded) {
                console.error(result.error.message);
                return;
            }
            const file = result.value; // an Office.File, read one slice at a time
            const byteArray = [];
            (function readSlice(index) {
                file.getSliceAsync(index, function (slice) {
                    if (slice.status !== Office.AsyncResultStatus.Succeeded) {
                        file.closeAsync();
                        console.error(slice.error.message);
                        return;
                    }
                    for (const b of slice.value.data) byteArray.push(b);
                    if (index + 1 < file.sliceCount) return readSlice(index + 1);
                    file.closeAsync();
                    const base64String = byteArrayToBase64(byteArray);
                    // Use the base64String as needed
                });
            })(0);
        });
    }
    
    function byteArrayToBase64(byteArray) {
        // Convert in 32 KB chunks; one huge apply() call can overflow the call stack
        let binary = "";
        for (let i = 0; i < byteArray.length; i += 0x8000) {
            binary += String.fromCharCode.apply(null, byteArray.slice(i, i + 0x8000));
        }
        return btoa(binary);
    }
    

    This approach lets you fetch the Base64 content of the sanitized document without opening it in a new window, with the trade-off that the sanitization happens on the current document rather than on a separate copy.
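
    Before sending the string anywhere, a quick sanity check can confirm that it decodes to something shaped like a .docx package. The sketch below assumes the file was requested with Office.FileType.Compressed, so the bytes are a ZIP archive, and that the archive starts with the usual ZIP local-file signature. The function name looksLikeDocx is hypothetical, and the check only inspects the header; it does not validate the document.

```javascript
// Hypothetical helper: returns true when the Base64 string decodes to data
// that begins with the ZIP local-file signature "PK\x03\x04".
function looksLikeDocx(base64) {
  if (typeof base64 !== "string" || base64.length < 8) return false;
  try {
    // 8 Base64 characters decode to at most 6 bytes, enough for the signature
    const head = atob(base64.slice(0, 8));
    return head.startsWith("PK\u0003\u0004");
  } catch (e) {
    // atob throws on characters outside the Base64 alphabet
    return false;
  }
}
```

    A false result usually means the slices were assembled incorrectly or the conversion step dropped bytes.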

