User-agents and Javascript files

Previously we’ve written a series of articles on Javascript Troubleshooting and SEO. I recently came across an issue involving Javascript Rendering and Google, and I’ll walk you through it. The issue itself isn’t new, but it was the first time I’ve seen it appear in this fashion, and I deal with a lot of Javascript Rendering issues for clients. Here’s the tl;dr: don’t accidentally cloak your content by withholding relevant Javascript files from bot user-agents.

While I believe it’s highly unlikely that you’ll experience this issue, you should understand how to identify it. Thankfully, identifying it is fairly simple and just requires a little work with the right tools.

User-Agents and Javascript Files

Here’s the background for this issue. A tech site was using a third-party service to provide landing pages, but Google wasn’t reading the content. The blame was placed on the third-party tool, but it looked like an implementation issue. In short, Google was not rendering any Javascript on the page. While it’s easy to assume the issue is purely one of implementation (and I did at first), you should always dig deeper to verify your assumptions.

The first clue was that the live test option in GSC’s URL Inspection Tool was also not rendering any Javascript. This is abnormal, as the Live Test lets you see the final version of the page that Google gets after it renders everything. With that in mind, multiple pages that did not use the third-party tool were tested. All that the Live Test crawl code (and the indexed crawl code) showed was a message that Javascript appeared to be disabled. Instead of showing you a blank view from URL Inspection, here’s what the cache showed:

And here is a text-only view:

Note: The cache is not always truthful, and it does not show the exact version of a page that Google has indexed. You may see a view like this even though the crawled code of the indexed page shows that all of the content has been rendered. The indexed crawl is gospel when it comes to what content is actually indexed.

And here’s what you would see if Google had rendered the page:

Clearly, there’s a rendering issue. In this case, the site was only delivering the full set of Javascript files to regular browser user-agents, while bot user-agents were only getting around half of them. The problem was that the Javascript files withheld from the bot user-agents were the ones that triggered all of the content on the page. This is essentially a case of accidental cloaking, except the only party it harms is you. It meant that Google was blocked from seeing everything on the page apart from the menu and footer items.
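To make the failure mode concrete, here’s a hypothetical sketch of one way a server can end up handing bots a stripped-down set of Javascript files. Nothing in it comes from the actual site: the Flask app, the user-agent tokens, and the bundle names are all invented purely to illustrate the pattern, and the real implementation could differ (for example, the server might instead refuse requests for the Javascript files themselves).

# Hypothetical illustration only: a server that gives bots fewer script tags.
# The user-agent tokens and bundle names below are made up.
from flask import Flask, request

app = Flask(__name__)

BOT_TOKENS = ("googlebot", "bingbot", "screaming frog")

# In this sketch, only the "content" bundles actually render the page content.
ALL_BUNDLES = ["/js/menu.js", "/js/footer.js", "/js/content.js", "/js/landing.js"]
BOT_BUNDLES = ["/js/menu.js", "/js/footer.js"]  # the content bundles never reach bots


@app.route("/landing-page")
def landing_page():
    ua = request.headers.get("User-Agent", "").lower()
    is_bot = any(token in ua for token in BOT_TOKENS)

    # Bots get fewer script tags, so the Javascript that triggers the content
    # never loads for them: accidental cloaking that only hurts the site itself.
    bundles = BOT_BUNDLES if is_bot else ALL_BUNDLES
    scripts = "\n".join(f'<script src="{src}"></script>' for src in bundles)
    return f'<html><body><div id="app"></div>\n{scripts}\n</body></html>'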

This was found by running Javascript crawls through Screaming Frog under different user-agents (which also made it possible to verify that robots.txt wasn’t blocking any Javascript files), and then checking the rendered code AND the loaded Javascript files. In the images below, you can see the difference between the two crawls in how many Javascript files were loaded. The Chrome user-agent crawl shows 60 JS files, compared to 34 from the Googlebot crawl.

Regular Browser User-Agent

Googlebot User-Agent
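If you want a quick sanity check outside of a crawler, you can also request one of the Javascript files directly with a browser user-agent and a bot user-agent and compare what comes back. This is only a sketch: the URL and user-agent strings below are placeholders, not values from the site in question.

# Sketch of a direct check: fetch the same JS file with two user-agents.
import requests

JS_URL = "https://example.com/js/content.js"  # placeholder URL

USER_AGENTS = {
    "browser": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
    "bot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

for label, ua in USER_AGENTS.items():
    resp = requests.get(JS_URL, headers={"User-Agent": ua}, timeout=10)
    # A different status code or a very different byte count between the two
    # responses suggests the server treats user-agents differently.
    print(f"{label:>7}: status={resp.status_code}, bytes={len(resp.content)}")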

How Do I Diagnose This Issue Myself?

With that in mind, let me take you through the process of identifying something like this.

You can refer to our previous article on how to check that everything is rendering correctly, so we’ll skip the steps that apply to any rendering troubleshooting. Here are the additional steps you can take to diagnose an issue with user-agents receiving different files.

Set up a Screaming Frog crawl (or whatever crawler you use) in list mode, since you only want to check a single URL. Have SF set up in Javascript Rendering mode (with rendered page screenshots enabled if you want), and have it grab all of the data as normal. The key here is to make sure you also store the source code and the rendered code. The settings are here:

Select a bot user-agent (Screaming Frog is fine, as is Googlebot if it’s not blocked), and set the robots.txt settings to obey robots.txt (you’ll want to compare against a crawl that ignores robots.txt later to make sure nothing important was blocked). Run the crawl and check the rendered code. Refer to our other Javascript article on how to emulate that code.

Now, set the user-agent to Chrome (or any standard browser user-agent) and run the crawl again. No other settings should change. Again, emulate the code.

If you emulate the code and compare the bot user-agent version with the browser user-agent version, you’ll see in a case like this one that the regular browser user-agent loaded all of the content, whereas the bot user-agent did not.
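Here’s a minimal sketch of one way to do that comparison, assuming you’ve saved each crawl’s rendered code to a local file. The filenames are placeholders for whatever you exported from your crawler.

# Compare two saved rendered-HTML dumps; the filenames are placeholders.
import difflib
from pathlib import Path

bot_html = Path("rendered_bot_ua.html").read_text(encoding="utf-8").splitlines()
browser_html = Path("rendered_browser_ua.html").read_text(encoding="utf-8").splitlines()

# A unified diff makes it obvious if the bot version is missing whole blocks
# of content that the browser version contains.
for line in difflib.unified_diff(
    bot_html, browser_html,
    fromfile="bot user-agent", tofile="browser user-agent", lineterm="",
):
    print(line)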

In my case, the bot user-agent only loaded the message that Javascript was not enabled. When you see something like this, it’s your first piece of evidence that there’s an issue with what different user-agents load. After emulating the code (or before, if you want), also check whether there’s a difference in the Javascript files found. You can see in the example above that there was. This is your second (or first) piece of evidence.
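A small sketch of that file comparison follows, assuming you’ve exported each crawl’s Javascript resource URLs to a plain text file with one URL per line. The filenames are placeholders.

# Compare the JS files found by each crawl; the filenames are placeholders.
from pathlib import Path

browser_js = set(Path("js_files_browser_ua.txt").read_text().split())
bot_js = set(Path("js_files_bot_ua.txt").read_text().split())

missing_for_bot = sorted(browser_js - bot_js)

print(f"Browser UA crawl found {len(browser_js)} JS files")
print(f"Bot UA crawl found {len(bot_js)} JS files")
print(f"{len(missing_for_bot)} JS files never loaded under the bot user-agent:")
for url in missing_for_bot:
    print(f"  {url}")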

At this point, you know that something odd is happening with the different user-agents. Unless you have full control over the site, the server, etc., you’ll need to go to the developers and ask them to look into the issue.

Conclusion

As stated before, it’s highly unlikely that you’ll run into an issue like this, but that doesn’t mean you won’t. Sites are typically set up to provide all user-agents with the same experience, but you never know when developers may start tinkering with things. It’s not inherently wrong to serve different types of files to different UAs, but it can lead to issues like the one above. Hopefully this leaves you with one more troubleshooting method at your disposal on the off chance that you run into something similar.
