How to Programmatically Update Text in Google Docs

Satria Aji Putra

If you’ve ever tried to write a script or extension to interact with Google Docs, you probably hit a wall pretty fast. Unlike normal web apps where you can just grab a standard DOM element and update its text, Google Docs does things completely differently. They use a custom canvas-based architecture, which means direct text manipulation flies right out the window.

Here is a breakdown of what I learned while trying to programmatically update text in Google Docs, and the workarounds that actually do the job.

The Problem with Google Docs

Why is it so weird?

Normally, online text editors rely on contenteditable divs or <textarea> tags. If you want to change the text, it’s just a matter of touching the DOM. But Google Docs draws the entire document on an HTML5 <canvas>.

They didn’t do this just to make our lives harder. The canvas approach gives them:

  • Crazy performance even when a document has hundreds of pages.
  • Pixel-perfect rendering that looks exactly the same across Chrome, Firefox, Safari, etc.
  • Total control over layout, spacing, and typography.
  • A built-in defense mechanism against messy extensions that try to hijack the DOM.

But the downside for us developers? The text you’re staring at isn’t a text node. It’s literally just pixels painted on a screen.

The “Annotated Canvas” Trick

Luckily, Google didn’t completely shut the door. They built an accessibility feature known as the “annotated canvas” mode. When you turn this on, Google Docs overlays an invisible, structured DOM layer right on top of the canvas. It’s meant for screen readers, but we can totally abuse it for programmatic access.

You basically have to set this flag to enable it:

javascript
window._docs_annotate_canvas_by_ext = 'extension-id-here';

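The string value has to be a Chrome extension ID that Google recognizes. It’s basically an allowlist system to gatekeep access to this API, so an arbitrary ID won’t unlock it. The flag also has to be in place before the editor’s own scripts read it, which in practice means setting it as early as possible during page load.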
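
A minimal sketch of setting the flag defensively. The helper name `enableAnnotatedCanvas` is made up for illustration, and it assumes the code runs in the page’s own JavaScript context. It takes the window object as a parameter so it can be exercised outside a browser.

```javascript
// Hypothetical helper: sets the annotated-canvas flag on a window-like object.
// Assumption: `extensionId` is an ID that Google has allowlisted.
function enableAnnotatedCanvas(win, extensionId) {
    if (typeof extensionId !== 'string' || extensionId.trim() === '') {
        throw new TypeError('extensionId must be a non-empty string');
    }
    win._docs_annotate_canvas_by_ext = extensionId;
    return win._docs_annotate_canvas_by_ext;
}

// In a browser this would be called as:
// enableAnnotatedCanvas(window, 'extension-id-here');
```

Validating the ID up front makes a missing or empty value fail loudly instead of silently leaving the annotated layer off.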

How to Actually Swap Text

Since we can’t just run element.innerText = 'something', we have to fake user interactions. Google Docs still listens for standard browser events like keystrokes, clipboard actions, and mouse clicks. If we structure these events perfectly, Docs will accept them as real input.

1. Faking a Paste Event

If you want to insert a chunk of text or replace something, firing off a paste event is by far the most reliable method I’ve found.

javascript
function simulatePaste(targetElement, plainText, htmlContent) {
    // Fill the clipboard payload first, so it is complete
    // by the time the event is constructed
    const clipboardData = new DataTransfer();

    if (plainText) {
        clipboardData.setData('text/plain', plainText);
    }

    if (htmlContent) {
        clipboardData.setData('text/html', htmlContent);
    }

    // Create a ClipboardEvent that carries the custom data
    const pasteEvent = new ClipboardEvent('paste', {
        clipboardData: clipboardData,
        bubbles: true,
        cancelable: true
    });

    // Dispatch the event
    targetElement.dispatchEvent(pasteEvent);
}

Google Docs intercepts this paste event, reads the clipboard data we injected, and processes it exactly like a normal Ctrl+V or Cmd+V.
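
If you want to fill the htmlContent argument as well, the text has to be escaped first. Here is a rough sketch of deriving both payloads from one plain string. `buildPastePayload` is a hypothetical helper, and how Docs treats the HTML flavor compared to the plain one is something to verify yourself.

```javascript
// Escape the characters that have special meaning in HTML text content.
function escapeHtml(text) {
    return text
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Hypothetical helper: build the two payloads that simulatePaste() expects.
// Line breaks become <br> so the HTML flavor keeps them.
function buildPastePayload(text) {
    const htmlContent = escapeHtml(text).split('\n').join('<br>');
    return { plainText: text, htmlContent };
}
```

The result can be passed straight through: simulatePaste(target, payload.plainText, payload.htmlContent).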

2. Spoofing Keyboard Presses

Sometimes pasting is overkill, especially if you just need to type a single character or hit backspace. We can manually build and dispatch KeyboardEvent objects.

javascript
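function simulateKeyPress(targetElement, character) {
    const charCode = character.charCodeAt(0);
    const isLetter = /^[a-z]$/i.test(character);

    const keyPressEvent = new KeyboardEvent('keypress', {
        altKey: false,
        bubbles: true,
        cancelable: true,
        // For keypress, the legacy numeric fields all carry the character code
        charCode: charCode,
        // `code` names the physical key; only letters follow the KeyX pattern
        code: isLetter ? `Key${character.toUpperCase()}` : '',
        composed: true,
        // `key` is the character actually produced, case included
        key: character,
        ctrlKey: false,
        keyCode: charCode,
        // Simplification: Shift is reported only for uppercase letters
        shiftKey: isLetter && character === character.toUpperCase(),
        which: charCode,
        isComposing: false,
        repeat: false,
        metaKey: false
    });

    targetElement.dispatchEvent(keyPressEvent);
}

// Type a string one character at a time, e.g. typeText(target, 'Hello')
function typeText(targetElement, text) {
    for (const char of text) {
        simulateKeyPress(targetElement, char);
    }
}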
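
One detail worth spelling out: the `code` field names a physical key, and only letters follow the `KeyX` pattern. A rough mapping for a few other characters could look like this. `codeForCharacter` is a hypothetical helper and it assumes a US keyboard layout.

```javascript
// Hypothetical helper: map a single character to a KeyboardEvent `code` value.
// Assumes a US keyboard layout; unknown characters fall back to ''.
function codeForCharacter(character) {
    if (/^[a-z]$/i.test(character)) {
        return `Key${character.toUpperCase()}`;
    }
    if (/^[0-9]$/.test(character)) {
        return `Digit${character}`;
    }
    const named = new Map([
        [' ', 'Space'],
        ['\n', 'Enter'],
        ['\t', 'Tab'],
        ['.', 'Period'],
        [',', 'Comma']
    ]);
    return named.get(character) ?? '';
}
```

Falling back to an empty string is deliberate: a missing `code` is less misleading than a wrong one.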

And if you need to wipe out some existing text, you can simulate a Delete or Backspace:

javascript
function simulateDelete(targetElement) {
    const deleteEvent = new KeyboardEvent('keydown', {
        bubbles: true,
        cancelable: true,
        keyCode: 46,  // Delete key code
        key: 'Delete',
        code: 'Delete'
    });

    targetElement.dispatchEvent(deleteEvent);
}

function simulateBackspace(targetElement) {
    const backspaceEvent = new KeyboardEvent('keydown', {
        bubbles: true,
        cancelable: true,
        keyCode: 8,  // Backspace key code
        key: 'Backspace',
        code: 'Backspace'
    });

    targetElement.dispatchEvent(backspaceEvent);
}

3. The Modern Approach: InputEvent

If you want something a bit more semantic for replacing text, modern browsers support InputEvent. Docs reacts to beforeinput pretty well:

javascript
function insertReplacementText(targetElement, plainText, htmlContent) {
    // Fill the payload before building the event
    const dataTransfer = new DataTransfer();

    if (plainText) {
        dataTransfer.setData('text/plain', plainText);
    }

    if (htmlContent) {
        dataTransfer.setData('text/html', htmlContent);
    }

    const inputEvent = new InputEvent('beforeinput', {
        inputType: 'insertReplacementText',
        data: plainText,
        dataTransfer: dataTransfer,
        cancelable: true,
        bubbles: true
    });

    targetElement.dispatchEvent(inputEvent);
}

I usually prefer this when I specifically need to swap out highlighted text with something else right away.
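
Since not every environment supports beforeinput, a feature check helps when deciding between this and the paste approach. This is only a sketch: `pickInsertStrategy` is a hypothetical helper, and checking for `getTargetRanges` is one common way to detect Input Events support, not something taken from Docs itself.

```javascript
// Feature check against a window-like object so it can run outside a browser.
// getTargetRanges() is defined by the Input Events specification,
// the same specification that defines beforeinput.
function supportsBeforeInput(win) {
    return typeof win.InputEvent === 'function'
        && typeof win.InputEvent.prototype.getTargetRanges === 'function';
}

// Hypothetical chooser: returns the name of the strategy to use.
function pickInsertStrategy(win) {
    return supportsBeforeInput(win) ? 'beforeinput' : 'paste';
}
```

In a page you would call pickInsertStrategy(window) once and branch to insertReplacementText() or simulatePaste() accordingly.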

Controlling the Selection

None of the tricks above matter if you don’t know where the text will appear. Docs tracks its own cursor and selection state, which means we need to talk to the annotated canvas layer to move the cursor around.

Forcing a New Selection

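The selection wrapper you get from the annotated layer exposes a setSelection() method to highlight a specific range of text. This is not a standard DOM method, so treat its exact shape as something to verify in your own environment.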

javascript
function setSelection(selectionWrapper, startIndex, endIndex) {
    // The selection wrapper is obtained from the annotated canvas
    return selectionWrapper.setSelection(startIndex, endIndex);
}

What if it doesn’t work?

I’ve noticed that setSelection() likes to fail silently sometimes. When that happens, you have to “wake up” the cursor by sending a dummy keyboard event, like a Shift+ArrowRight press, before trying to select the text again.

javascript
function forceSelection(selectionWrapper, keyboardTarget, start, end) {
    // First, try direct selection
    let success = selectionWrapper.setSelection(start, end);

    if (!success) {
        // Dispatch arrow keys to activate selection
        keyboardTarget.dispatchEvent(new KeyboardEvent('keydown', {
            bubbles: true,
            cancelable: true,
            keyCode: 39,  // ArrowRight
            key: 'ArrowRight',
            code: 'ArrowRight',
            shiftKey: true
        }));

        // Check if selection is now available
        if (selectionWrapper.getSelectionRanges().length > 0) {
            // Try setting selection again
            success = selectionWrapper.setSelection(start, end);
        }
    }

    return success;
}
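
Because setSelection() and getSelectionRanges() are not standard DOM methods, a plain element won’t have them. A small guard, sketched here under the assumption that those two method names are all forceSelection() needs, turns a confusing TypeError into an explicit check. `hasSelectionApi` is a hypothetical name.

```javascript
// Returns true only if the object exposes the two methods
// that forceSelection() calls.
function hasSelectionApi(candidate) {
    return candidate != null
        && typeof candidate.setSelection === 'function'
        && typeof candidate.getSelectionRanges === 'function';
}
```

Calling this before forceSelection() lets you bail out early with a clear error message.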

Finding the Hidden Event Targets

Google Docs doesn’t attach its keyboard listeners to the main document body. Instead, they bury a specific iframe for handling input. If you fire your keyboard events at the wrong element, nothing happens.

javascript
function getKeyboardEventTarget() {
    // Google Docs uses this iframe for text input
    const iframe = document.querySelector('iframe.docs-texteventtarget-iframe');

    if (iframe && iframe.contentDocument) {
        // Find the contenteditable element inside the iframe
        const editableElement = iframe.contentDocument.querySelector('[contenteditable="true"]');
        return editableElement || iframe.contentDocument;
    }

    return null;
}

function getAnnotatedCanvasElement() {
    // The annotated canvas container
    return document.querySelector('.kix-canvas-tile-content');
}
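
Both lookups return null if the editor hasn’t finished building its DOM yet. A polling helper can cover that gap. This is a generic sketch: `waitForValue` and its default timings are my own choices, not anything Docs prescribes.

```javascript
// Poll `getter` until it returns a non-null value or the timeout elapses.
// Resolves with the value, or with null on timeout.
async function waitForValue(getter, { timeoutMs = 5000, intervalMs = 100 } = {}) {
    const deadline = Date.now() + timeoutMs;
    while (true) {
        const value = getter();
        if (value != null) {
            return value;
        }
        if (Date.now() >= deadline) {
            return null;
        }
        await new Promise(resolve => setTimeout(resolve, intervalMs));
    }
}
```

Usage would be along the lines of: const target = await waitForValue(getKeyboardEventTarget);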

Tying It All Together

If we put all the pieces together, we get a reliable flow for replacing text: grab the iframe, force the selection, fire a paste event, and give Docs time to process the change.

javascript
async function replaceTextInGoogleDocs(startIndex, endIndex, newText) {
    // Step 1: Get the target elements
    const keyboardTarget = getKeyboardEventTarget();
    // forceSelection() needs an object with setSelection() and
    // getSelectionRanges(). A plain DOM element has neither, so check below.
    const selectionWrapper = getAnnotatedCanvasElement();

    if (!keyboardTarget || !selectionWrapper) {
        throw new Error('Google Docs elements not found');
    }

    if (typeof selectionWrapper.setSelection !== 'function') {
        throw new Error('Selection wrapper does not expose setSelection()');
    }

    // Step 2: Set the selection to the range we want to replace
    const selectionSuccess = forceSelection(
        selectionWrapper,
        keyboardTarget,
        startIndex,
        endIndex
    );

    if (!selectionSuccess) {
        throw new Error('Failed to set selection');
    }

    // Step 3: Simulate paste with the new text
    simulatePaste(keyboardTarget, newText, null);

    // Step 4: Wait for Google Docs to process the change
    await new Promise(resolve => setTimeout(resolve, 100));

    // Step 5 (optional): Verify the change.
    // getCurrentDocumentText() is not defined in this article; plug in your
    // own way of reading the document text before enabling this check.
    // const currentText = getCurrentDocumentText();

    return true;
}
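
For the verification step, one simple option is to compare the text after the edit with what the replacement should have produced. The sketch below assumes you already have the document text as strings from before and after the edit, and that endIndex is exclusive. `isExpectedReplacement` is a hypothetical helper.

```javascript
// Returns true if `after` equals `before` with the range [start, end)
// swapped for newText.
function isExpectedReplacement(before, after, start, end, newText) {
    const expected = before.slice(0, start) + newText + before.slice(end);
    return after === expected;
}
```

If this returns false, either the edit did not land or someone else changed the document in the meantime, so it’s safer to stop than to keep editing.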

A Few Real-World Tips

1. Don’t Rush the Thread

Google Docs handles events asynchronously on its own timeline. If you spam events too fast, they get dropped. A tiny sleep function works wonders:

javascript
await new Promise(resolve => setTimeout(resolve, 50));

2. Always Double Check

Because things can desync, you should verify the state of the text before applying a blind change. You don’t want your script deleting the wrong paragraph.

javascript
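function validateText(expectedBefore, actualBefore, start, end) {
    // Everything up to `end` must be unchanged, otherwise the offsets
    // start..end no longer point at the text we meant to edit
    if (expectedBefore.slice(0, end) !== actualBefore.slice(0, end)) {
        throw new Error(
            `Document text changed before offset ${end}; range ${start}-${end} is stale`
        );
    }
}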

3. Dealing with Trusted Types

If your environment enforces Trusted Types (which is becoming standard for security), you might get blocked when injecting raw HTML strings. You have to pass the string through a policy first.

javascript
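// Create the policy once; creating a second policy with the same name
// can throw when the page's CSP does not allow duplicates
let docsEditorPolicy = null;

function createTrustedHTML(htmlString) {
    if (!window.trustedTypes) {
        return htmlString;
    }
    if (!docsEditorPolicy) {
        docsEditorPolicy = window.trustedTypes.createPolicy('docs-editor-policy', {
            // Pass-through: no sanitizing here, so only feed it strings you trust
            createHTML: (input) => input
        });
    }
    return docsEditorPolicy.createHTML(htmlString);
}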

4. Know Your Target

I can’t emphasize this enough: always look for the iframe.docs-texteventtarget-iframe. If Docs pushes an update and changes this class name, your entire script will break. Always inspect the DOM to make sure your selectors are up to date.

javascript
const target = document.querySelector('iframe.docs-texteventtarget-iframe')
    ?.contentDocument
    ?.querySelector('[contenteditable="true"]');
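
To soften the blow when a class name changes, you can try a list of selectors in order and log which one matched. This is a sketch only: `queryFirstMatch` is a hypothetical helper, and any fallback selector you add is a guess that needs checking against the live DOM.

```javascript
// Try each selector in order against `root` (anything with querySelector).
// Returns { element, selector } for the first match, or null.
function queryFirstMatch(root, selectors) {
    for (const selector of selectors) {
        const element = root.querySelector(selector);
        if (element) {
            return { element, selector };
        }
    }
    return null;
}

// Example call in a browser (the second selector is a hypothetical fallback):
// queryFirstMatch(document, [
//     'iframe.docs-texteventtarget-iframe',
//     'iframe[class*="texteventtarget"]'
// ]);
```

Returning the selector alongside the element makes it easy to notice when the primary selector has stopped matching.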

What Could Go Wrong?

Rate Limits

If you write a loop that types out 1,000 characters instantly, Google Docs will just ignore most of them. Throttling is mandatory here.

javascript
async function typeWithDelay(target, text, delayMs = 50) {
    for (const char of text) {
        simulateKeyPress(target, char);
        await new Promise(resolve => setTimeout(resolve, delayMs));
    }
}
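
For long text, another option is to split it into chunks and paste each chunk with a pause in between, instead of typing character by character. The splitting part is sketched below. `chunkText` is a hypothetical helper, and whether chunked pastes behave better than throttled typing in Docs is an assumption I haven’t measured.

```javascript
// Split text into chunks of at most `size` code points.
// Array.from iterates by code point, so surrogate pairs (e.g. emoji) stay intact.
function chunkText(text, size) {
    if (!Number.isInteger(size) || size <= 0) {
        throw new RangeError('size must be a positive integer');
    }
    const codePoints = Array.from(text);
    const chunks = [];
    for (let i = 0; i < codePoints.length; i += size) {
        chunks.push(codePoints.slice(i, i + size).join(''));
    }
    return chunks;
}
```

Each chunk could then go through simulatePaste(), followed by the same kind of delay used in typeWithDelay().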

Focus State

The editor generally won’t accept keyboard input if the window, or the input element inside it, isn’t focused.

javascript
function ensureFocus(editorElement) {
    if (document.activeElement !== editorElement) {
        editorElement.focus();
    }
}

Sync Issues

If you’re editing a document while five other people are typing at the same time, your character indexes are going to shift dynamically. If your script thinks it’s deleting word #5, but someone just hit enter twice above it, you’re going to delete the wrong thing. Always re-fetch the document state immediately before dispatching an edit.
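
One way to cope with shifting indexes is to find the target again by its content right before editing, rather than trusting a stored offset. A rough sketch, assuming you can read the current document text as a string; `relocateRange` is a hypothetical helper.

```javascript
// Find where `snippet` now lives in `currentText`, preferring the occurrence
// closest to where it used to start. Returns { start, end } or null.
function relocateRange(currentText, snippet, previousStart) {
    if (snippet === '') {
        return null;
    }
    let best = null;
    let index = currentText.indexOf(snippet);
    while (index !== -1) {
        if (best === null
            || Math.abs(index - previousStart) < Math.abs(best - previousStart)) {
            best = index;
        }
        index = currentText.indexOf(snippet, index + 1);
    }
    return best === null ? null : { start: best, end: best + snippet.length };
}
```

Picking the nearest occurrence is a heuristic. If the snippet appears many times, widen it with surrounding context before searching.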

Wrapping Up

Modifying Google Docs programmatically isn’t impossible, but it definitely feels like you’re fighting the editor. Because of the canvas architecture, the only way forward from inside the page is to spoof events: faking pastes, manually moving selections, and digging through hidden iframes.

The main things to remember are: rely on the annotated canvas, track down the exact event-handler iframe, and respect the asynchronous nature of the editor by pacing your events. It takes a bit of trial and error to get the timing right, but once you dial it in, the workaround is surprisingly stable.
