7 Ways to Use Splunk for Technical SEO


To succeed at technical SEO, you have to develop an in-depth understanding of a website's architecture and know, in detail, how Google interacts with the URLs within the site. If you take the time to analyze site access logs closely, there's a lot you can learn about how Google (and others) see a website.

Many tools can be used for this task, and Splunk is one of them. Splunk is available either as a cloud service or as self-hosted software, and it lets you analyze large volumes of data quickly, which makes it easier to base decisions on that data.

For most sites, the free version of Splunk (Splunk Free) is adequate. It lets you index up to 500 MB of data per day, which is sufficient for analyzing the access logs of many sites.

The following are some ways you can use Splunk to further your technical SEO goals:

  1. Finding out whether URLs have been crawled by Google or other bots

Especially when you have just launched a new site, you'll be eager to find out whether a page has been crawled. One option is to check for the page in the Google cache. Unfortunately, the cached copy can take days, even weeks, to appear after Googlebot first crawls the page.

A quicker check is to search Google for the exact title of the page. If Google has crawled and indexed the page, it should appear in the results. Faster still, you can use Splunk to search the web logs for Googlebot requests for that URL.
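To make the log check concrete, here is a minimal Python sketch of the same filter a Splunk search would apply. It is not a Splunk query. It assumes the server writes the common "combined" access log format, the sample lines are invented, and the name `googlebot_hits` is purely illustrative.

```python
import re

# Assumed layout (combined log format):
# ip ident user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines, path):
    """Return timestamps of requests for `path` whose user agent mentions Googlebot."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and match["path"] == path and "Googlebot" in match["agent"]:
            hits.append(match["time"])
    return hits

# Invented sample lines: one Googlebot request and one ordinary browser request.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Mar/2024:06:25:01 +0000] "GET /new-page HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Mar/2024:06:26:12 +0000] "GET /new-page HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

print(googlebot_hits(SAMPLE_LOG, "/new-page"))  # ['10/Mar/2024:06:25:01 +0000']
```

The user-agent string can be spoofed, so a match only shows that something calling itself Googlebot requested the page.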

  2. Finding 404 pages

A 404 page is a wasted opportunity: every time a user lands on the error page, you have lost a chance to show them the content they were interested in, and it makes for a poor user experience too.

Crawling tools, such as Screaming Frog, can be used to find 404 pages, but a crawl only follows the links it can find on your site, so it may miss 404 URLs that are linked incorrectly from elsewhere. In such scenarios, you can use Splunk to parse the logs and discover which URLs return 404s and are accessed most often by users. Once you have identified them, you can either fix the page or set up a redirect to a working page.
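The ranking itself is simple to picture. The following Python sketch (an illustration of the logic, not a Splunk query) counts 404 responses per URL and lists the most requested ones first. It assumes a combined-format access log; the sample lines and the name `top_404_paths` are made up for the example.

```python
import re
from collections import Counter

# Picks the request path and status code out of a combined-format log line.
REQUEST_PATTERN = re.compile(r'"\S+ (?P<path>\S+)[^"]*" (?P<status>\d{3}) ')

def top_404_paths(lines, limit=10):
    """Return (path, hits) pairs for URLs that returned 404, most requested first."""
    counts = Counter()
    for line in lines:
        match = REQUEST_PATTERN.search(line)
        if match and match["status"] == "404":
            counts[match["path"]] += 1
    return counts.most_common(limit)

# Invented sample lines.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Mar/2024:09:00:01 +0000] "GET /old-page HTTP/1.1" 404 312 "-" "Mozilla/5.0"',
    '203.0.113.8 - - [10/Mar/2024:09:02:45 +0000] "GET /old-page HTTP/1.1" 404 312 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Mar/2024:09:03:10 +0000] "GET /missing HTTP/1.1" 404 312 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Mar/2024:09:03:30 +0000] "GET / HTTP/1.1" 200 8042 "-" "Mozilla/5.0"',
]

print(top_404_paths(SAMPLE_LOG))  # [('/old-page', 2), ('/missing', 1)]
```

The URLs at the top of this list are the ones where a fix or a redirect recovers the most lost visits.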

  3. Finding Googlebot 302s

302 redirects are temporary, unlike 301 redirects, which are permanent. 302s can arise in a site in different ways; for example, they can be inherent in a platform such as .NET. It is worth finding out where 302s exist so that you can make sure each one uses the correct status code (some of them should actually be 301 redirects). You can use Splunk to find the URLs that return a 302 to Googlebot for further analysis.
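As a rough illustration of that filter, written in Python rather than as a Splunk search, the sketch below keeps only requests that returned a 302 and came from a user agent mentioning Googlebot. The combined log format, the sample lines, and the name `googlebot_302_paths` are all assumptions made for the example.

```python
import re
from collections import Counter

# Request path, status code and user agent from a combined-format log line.
LOG_PATTERN = re.compile(
    r'"\S+ (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_302_paths(lines):
    """Count 302 responses per URL for requests whose user agent mentions Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and match["status"] == "302" and "Googlebot" in match["agent"]:
            counts[match["path"]] += 1
    return counts

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Invented sample lines: two Googlebot 302s, one Googlebot 301, one browser 302.
SAMPLE_LOG = [
    f'66.249.66.1 - - [10/Mar/2024:06:25:01 +0000] "GET /old-offer HTTP/1.1" 302 0 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Mar/2024:07:10:44 +0000] "GET /old-offer HTTP/1.1" 302 0 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Mar/2024:07:11:02 +0000] "GET /about HTTP/1.1" 301 0 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.7 - - [10/Mar/2024:08:00:00 +0000] "GET /login HTTP/1.1" 302 0 "-" "Mozilla/5.0"',
]

print(dict(googlebot_302_paths(SAMPLE_LOG)))  # {'/old-offer': 2}
```

Each URL in the result is a candidate to review: if the move is permanent, the redirect should probably be a 301.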

  4. Finding out how many pages are crawled daily

If you're using Google Webmaster Tools (now Google Search Console), you might have seen the screen showing the number of URLs Google has crawled in a day. This data is not 100% accurate, so a better approach is to use Splunk to scan your logs for the exact number of URLs. Once the query has run, simply switch to the Statistics tab to get the true figure. The Visualization tab shows how the number changes over time.
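The count behind that figure can be sketched in a few lines of Python (again as an illustration, not a Splunk query): group Googlebot requests by day and count the distinct URLs in each group. The combined log format, the invented sample lines, and the name `googlebot_urls_per_day` are assumptions for the example.

```python
import re
from collections import defaultdict

# Day (dd/Mon/yyyy), request path and user agent from a combined-format log line.
LOG_PATTERN = re.compile(
    r'\[(?P<day>[^:\]]+):[^\]]*\] "\S+ (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_urls_per_day(lines):
    """Return {day: number of distinct URLs requested by a Googlebot user agent}."""
    urls_by_day = defaultdict(set)
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match and "Googlebot" in match["agent"]:
            urls_by_day[match["day"]].add(match["path"])
    return {day: len(paths) for day, paths in urls_by_day.items()}

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Invented sample lines: /a is crawled twice on 10 March but counted once.
SAMPLE_LOG = [
    f'66.249.66.1 - - [10/Mar/2024:06:25:01 +0000] "GET /a HTTP/1.1" 200 512 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Mar/2024:06:30:15 +0000] "GET /b HTTP/1.1" 200 512 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Mar/2024:18:02:40 +0000] "GET /a HTTP/1.1" 200 512 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [11/Mar/2024:05:12:09 +0000] "GET /a HTTP/1.1" 200 512 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.7 - - [11/Mar/2024:09:00:00 +0000] "GET /c HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

print(googlebot_urls_per_day(SAMPLE_LOG))  # {'10/Mar/2024': 2, '11/Mar/2024': 1}
```

Counting distinct URLs rather than raw requests is a design choice here; if you want total crawl requests per day instead, replace the set with a simple counter.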

Author Bio

This article was written by Michael Bentos, a qualified SEO professional in London. Visit Paradox SEO for more information or to find an SEO agency in London.