Detecting and Tracking Page Translations with Google Analytics 4

Tags: GA4, Google Tag Manager

I recently had a discussion with a client about adding a Google Translate feature to their website, something we then wanted to split test to see whether it could increase conversions. So before briefing the developers, we wanted to track how often users currently translate the website.

What is a page translation

Currently, some of the major browsers have built-in tools to translate a page into a given language. Both Safari and Edge automatically detect whether a page is in a language other than the user's default language.

When a “translatable” language is detected, these browsers will notify the user that the page can be translated.

In other browsers, such as Chrome and Firefox, users can translate a page with Google Translate, for example through its browser extension.

It’s the usage of these translation tools that we aim to track.

How translation tools work

Put simply, whenever a user activates a translation tool in a browser, the browser identifies all visible text on the page. It then sends each piece of text to a translation API, which returns the translated text. The translations are then inserted in place of the original content.
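
As a rough illustration of that process (not the actual implementation of any browser or extension), the sketch below walks a node tree, finds its text nodes and swaps each one for a translated string. The names translateTextNodes and fakeTranslate are made up for this example, and fakeTranslate is only a stand-in for a real translation API:

```javascript
// Hypothetical stand-in for a translation API: a tiny German-to-English lookup.
function fakeTranslate(text) {
    var dictionary = { 'Nachrichten': 'News', 'Hallo Welt': 'Hello world' };
    return Object.prototype.hasOwnProperty.call(dictionary, text) ? dictionary[text] : text;
}

// Walk the tree below `root` and replace the content of every text node.
// Works on real DOM nodes, or on any object exposing nodeType, nodeValue and childNodes.
// Returns the number of text nodes that were changed.
function translateTextNodes(root, translate) {
    var TEXT_NODE = 3; // same value as Node.TEXT_NODE in browsers
    var replaced = 0;
    var children = root.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        var node = children[i];
        if (node.nodeType === TEXT_NODE) {
            var translated = translate(node.nodeValue);
            if (translated !== node.nodeValue) {
                node.nodeValue = translated;
                replaced++;
            }
        } else {
            // Not a text node: look for text further down the tree
            replaced += translateTextNodes(node, translate);
        }
    }
    return replaced;
}
```

In a browser console this could be tried with translateTextNodes(document.body, fakeTranslate). Unlike a real translation tool, the sketch makes no attempt to skip invisible text such as the contents of script or style elements.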

For Google Translate, Edge’s translation and Safari’s translation, this all happens seamlessly. The page isn’t even refreshed.

So, how do we detect translations?

No matter which analytics tool we want to use, the first step is to detect such translations in the first place. For this particular client, the tracked event was sent to their Google Analytics 4 property, but the approach should work for any tool.

Working out how to detect translations turned out to be as simple as using the browsers' web inspectors. As each translation tool does its work, it makes some subtle changes to the DOM.

For example, when enabling a page translation in Microsoft Edge, this happens:

<!-- Original German title tag: -->
<title>DER SPIEGEL | Online-Nachrichten</title>

<!-- The title tag now translated into English: -->
<title _msthash="149916" _msttexthash="439634">DER SPIEGEL | Online News</title>

In fact, Microsoft Edge will add these two attributes (_msthash and _msttexthash) to all translated HTML elements on the page.

And in a similar fashion, Google Translate (whether used in Google Chrome, Mozilla Firefox etc.) will alter the <html> tag itself:

<!-- Google Translate adds a single class to the <html> tag -->
<html class="translated-ltr">

Safari does it subtly by either adding or altering the existing lang attribute on the <html> tag:

<!-- Safari modifies the lang attribute on the <html> tag -->
<html lang="en-US">

So it appears that various translation tools put a pretty visible mark on the pages they translate. The one remaining problem is that these changes to the HTML occur seamlessly without a page refresh or any other tangible event. This means that we can’t reliably just look for the presence of these attributes or values when a page loads.
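
To make those three marks concrete, here is a small sketch of a one-off check against the current state of the <html> and <title> elements. The function name findTranslationMarks and the originalLang parameter are assumptions made for this example, and the tool labels simply follow the observations described above:

```javascript
// Returns the translation tools whose marks are present right now.
// htmlEl / titleEl: the <html> and <title> elements (or null).
// originalLang: the lang value the page was served with, or null if it had none.
function findTranslationMarks(htmlEl, titleEl, originalLang) {
    var found = [];
    // Edge: _msthash / _msttexthash attributes on translated elements such as <title>
    if (titleEl && (titleEl.hasAttribute('_msthash') || titleEl.hasAttribute('_msttexthash'))) {
        found.push('edge');
    }
    // Google Translate: a class starting with "translated-" on <html>
    if (htmlEl && /(^|\s)translated-/.test(htmlEl.className || '')) {
        found.push('google');
    }
    // Safari: a lang attribute on <html> that was added or differs from the original
    var lang = htmlEl ? htmlEl.getAttribute('lang') : null;
    if (lang !== null && lang !== (originalLang || null)) {
        found.push('safari');
    }
    return found;
}
```

In a browser, a call could look like findTranslationMarks(document.documentElement, document.querySelector('title'), 'de'). A check like this only sees the page as it is at the moment it runs, so it misses translations that are switched on later.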

Instead, we need to monitor whether these changes occur at any time while the user is on a given page. Luckily, modern browsers implement the MutationObserver API, which is built for exactly that: monitoring changes to elements in real time.

I’ll spare you too much of the technical stuff, so please make do with this commented code. You might want to copy it to your own code editor for increased readability:

(function(){
    // Start by checking if the MutationObserver API is available
    if( typeof MutationObserver === 'function') {
        // Tell the observer to monitor for changes to HTML attributes
        var config = { attributes: true };
        // Build the function to run when a change is observed
        var callback = function(mutationList, observer) {
            // Loop through each observed change (using old school loop as ES6 is still not supported in GTM)
            for(var i = 0; i < mutationList.length; i++) {
                var mutation = mutationList[i];
                // Only do something if the change was on an attribute
                if (mutation.type === 'attributes') {
                    if(
                        // Check for Edge's attributes
                        mutation.attributeName === '_msthash' || mutation.attributeName === '_msttexthash' ||
                        // Check for Google Translate's class (e.g. translated-ltr), even if the element has other classes too
                        (mutation.attributeName === 'class' && mutation.target.className.indexOf('translated-') !== -1) ||
                        // Check for Safari's lang attribute
                        mutation.attributeName === 'lang'
                    ) {
                        // Stop observing to only track once per page
                        // On an SPA site, you might want to remove this line if you want to track the event on all pages
                        observer.disconnect();
                        // Send an event to the dataLayer (or do whatever you want)
                        window.dataLayer = window.dataLayer || [];
                        window.dataLayer.push({
                            'event': 'translate'
                        });
                        // Stop looping once a translation has been detected
                        break;
                    }
                }
            }
        };
        // Create the actual observer
        var observer = new MutationObserver(callback);
        // Attach the observer to the <title> tag (if the page has one)
        var titleTag = document.getElementsByTagName('title')[0];
        if (titleTag) {
            observer.observe(titleTag, config);
        }
        // Attach the observer to the <html> tag
        observer.observe(document.getElementsByTagName('html')[0], config);
    }
})();

This piece of JavaScript can be placed inline or in an external JS file, or included in your tag manager of choice. For Google Tag Manager, simply insert it on all pages using a Custom HTML tag (and remember to add <script> before the code, and </script> after the code).

Obviously, if your site has features or tools that alter the HTML in the same way as these translation tools do, those changes will be captured too. If that’s the case, you’d need more specificity in the code.
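
One possible way to add that specificity is sketched below, as an assumption rather than a tested production setup. It asks the observer for the previous attribute value (attributeOldValue) and limits it to the relevant attributes (attributeFilter), then only counts a change when the new value really differs from the old one. The names specificConfig, TRANSLATED_CLASS and classifyMutation are invented for this example:

```javascript
// Observer options for this sketch: only report the attributes we care about,
// and include the previous value so old and new can be compared.
var specificConfig = {
    attributes: true,
    attributeOldValue: true,
    attributeFilter: ['_msthash', '_msttexthash', 'class', 'lang']
};

// Matches Google Translate's translated-ltr / translated-rtl class among other classes
var TRANSLATED_CLASS = /(^|\s)translated-(ltr|rtl)(\s|$)/;

// Returns 'edge', 'google', 'safari' or null for a single MutationRecord.
function classifyMutation(mutation) {
    if (mutation.type !== 'attributes') {
        return null;
    }
    var name = mutation.attributeName;
    var oldValue = mutation.oldValue || '';
    var newValue = mutation.target.getAttribute(name);
    if (newValue === null) {
        // The attribute was removed, e.g. when a translation is reverted
        return null;
    }
    if (name === '_msthash' || name === '_msttexthash') {
        return 'edge';
    }
    // Only count the class if it was not already there before the change
    if (name === 'class' && TRANSLATED_CLASS.test(newValue) && !TRANSLATED_CLASS.test(oldValue)) {
        return 'google';
    }
    // Only count lang if it was set to a different, non-empty value
    if (name === 'lang' && newValue !== '' && newValue !== oldValue) {
        return 'safari';
    }
    return null;
}
```

Inside the observer callback, each entry of mutationList could be passed to classifyMutation(), and the event would only be sent when the result is not null. The observer would then need to be attached with specificConfig instead of config.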

Set up the GTM configuration

The rest is straightforward: create a new Custom Event trigger in GTM that fires on the translate event:

GTM trigger to capture translate event

And then configure a new GA4 Event tag:

Send a GA4 translation event

And that’s it. GA4 will now begin to collect events on any page where users enable translations.
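
For a site that loads gtag.js directly instead of GTM, the same event could be sent without the trigger and tag. The sketch below is one way to do that, under the assumption that a global gtag function is set up on the page. The function name sendTranslateEvent and the translation_tool parameter are hypothetical, and such a parameter would have to be registered as a custom dimension in GA4 before it shows up in reports:

```javascript
// Sends the translate event through gtag.js when available, otherwise through the dataLayer.
// `win` is normally the global window object; `tool` is an optional label such as 'edge'.
// Returns the name of the channel that was used.
function sendTranslateEvent(win, tool) {
    var toolName = tool || 'unknown';
    if (typeof win.gtag === 'function') {
        // gtag.js installed directly on the page: send the GA4 event without GTM
        win.gtag('event', 'translate', { translation_tool: toolName });
        return 'gtag';
    }
    // GTM setup: push to the dataLayer and let the trigger and GA4 Event tag do the rest
    win.dataLayer = win.dataLayer || [];
    win.dataLayer.push({ 'event': 'translate', 'translation_tool': toolName });
    return 'dataLayer';
}
```

In the observer callback, the dataLayer.push() call could then be replaced with sendTranslateEvent(window, 'edge'), or with no second argument if the tool is not known.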

For example, set up a free-form report to examine how often users enable translations. In this example (screenshot below), I have added a custom dimension called Site Language (on this particular multilingual site, we track the currently selected language). The report tells us that the language most often translated into something else is actually English, which came as a bit of a surprise.

GA4 translation free form report

And thanks for reading this far! :)