Tracking Error 404 Pages and Broken Links in Google Analytics: Google Analytics Power User 11

GA can be used to detect errors, too.
Next up in the Google Analytics Power User series is something slightly different. Instead of the standard "out of the box" reporting, we're using Google Analytics to look at errors on your site, the most common of which is the classic "Error 404".
Whether you run an ecommerce site, a lead generation site, or a publishing site (even a blog), you will likely have the occasional technical problem. Sometimes your site may throw a 4XX (client error) or 5XX (server error). These errors are often the result of broken links within your site, so you will want to find them and fix them ASAP.

Setup: Placing the GATC in your Error 404's, 500's, etc.
To track error pages you will need to have the Google Analytics Tracking Code (GATC) added to your error page template(s) (contact your webmaster to complete this task). Without the tracking code on the error pages you will not be able to report on these pages or find their associated broken links.
As a reference, here is the code:
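<script type="text/javascript">
// Load ga.js from the secure host on https pages, the standard host otherwise.
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
// Replace UA-xxxxx-x with your own account ID, then record the pageview.
var pageTracker = _gat._getTracker("UA-xxxxx-x");
pageTracker._trackPageview();
</script>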
With the GATC on your error pages, they will now be tracked as pageviews. Since these error pages count as pageviews, you will want to create a filter that lets you view each type of error page in isolation.
One important aspect of setting up your error report properly is the naming convention that you use in the HTML "title" tag of your error pages. All of the title tags need to follow a consistent format so that a single filter can identify them. An example title tag would be: Error 404 Page.
For other errors you would just replace the '404' with the appropriate error number. An example of a custom filter that will track the above format would be:
Field A -> Extract A | Page Title | (Error [0-9][0-9][0-9].+)
Field B -> Extract B | Request URI | (.*)
Output To -> Constructor | Request URI | $A1$B1
Field A Required: Yes
Field B Required: Yes
Override Output Field: Yes
Case Sensitive: No
An alternative version of the Field A regular expression is: (Error [0-9]{3}.+)
Once you have the GATC on your error pages and have set up your filter to differentiate the types of error pages from each other, it is time to figure out where your error pages are and which links (broken links) sent your visitors to these nonexistent pages.
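To make the effect of the custom filter more concrete, here is a minimal sketch that imitates it in plain JavaScript. It is only an approximation for illustration, not how Google Analytics runs filters internally. The function name applyErrorFilter and the sample titles and URIs are made up for this example.

```javascript
// Rough stand-in for the advanced filter above; not Google Analytics' own code.
var FIELD_A = /(Error [0-9][0-9][0-9].+)/i; // Page Title pattern (Case Sensitive: No)
var FIELD_B = /(.*)/;                        // Request URI pattern

// Returns the Request URI a report would show for one hit.
// applyErrorFilter is a hypothetical helper name.
function applyErrorFilter(pageTitle, requestUri) {
  var a = FIELD_A.exec(pageTitle);
  var b = FIELD_B.exec(requestUri);
  // Both fields are required, so a miss leaves the hit untouched.
  if (!a || !b) {
    return requestUri;
  }
  // Constructor: $A1$B1
  return a[1] + b[1];
}

console.log(applyErrorFilter('Error 404 Page', '/old-product.html'));
// Error 404 Page/old-product.html
console.log(applyErrorFilter('Contact Us', '/contact.html'));
// /contact.html
```

Because the rewritten Request URI starts with the error title, the error hits can be found by searching for "error" in a content report, and each entry still shows the URL the visitor actually requested.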
Creating the Report in Top Content
To create your error pages report, go to the Top Content report and filter by the term "error" (assuming you used this word in your error page title tags as we did above). This brings up all of your error pages and shows their Content Performance metrics (pageviews, unique pageviews, bounce rate...).
Now that you have identified the error pages, you need to track down the broken links that led to them. To do this, click on an error page in the report to view the content details of that page. From the content details page, click on the "Navigation Summary" link on the right side. The Navigation Summary page shows the page your visitor was on prior to visiting your error page. A quick visit to these pages and you should be able to figure out where your broken links are. Sometimes the pages with the broken links will not be within your site, and you will need to either contact someone else and ask them to correct the link, or set up a 301 redirect on your site to send the inbound visitor to the correct URL.
This is a great report for your webmaster or IT team to review regularly so that you can fix broken links as quickly as possible and reduce the number of error pages that your visitors will see.
A question for you though: IE and Google Chrome come with their own cookie cutter 404 page. How do you measure those who view the actual custom 404 page?
Cheers,
Julien
I think the page you are discussing is the page related to when you try to access a domain that doesn't exist, not a 404 page on an active domain. When the domain is not registered and doesn't have a site associated with it, Chrome, IE, and Firefox (when the latest Google Toolbar is installed), will show a custom page with suggestions and the ability to perform a search.
For example, try going to: http://www.vkistudios.com/this-shows-an-error404.h...
You should see the VKI Studios error 404 page.
Versus going to a domain such as: http://asdfasdf135123.com/
Which would show you the browser specific pages you mentioned.
Tested this in the latest version of Chrome and IE8 and they seem to work as expected.
Dave
Your remark makes perfect sense but arguably your method only works if a 404 page is actually implemented ;-)
If I try to open http://analyseweb.fr/wheee.htm I'll get a vanilla 404 message from Chrome, IE et al. (I know, I'm too lazy to put a 404 up)
On your site, I actually see the default page pop up very briefly before being shown the default error page for the browser.
Whereas on one of our servers where the default 404 page is being used, and Google Analytics is not currently configured to track the 404 error, the default page shows up fine:
http://www.vkirealestate.com/wheee.html
I wonder if there is something in your response headers that is affecting this.
Dave
I expect this is happening for you because your default error 404 page is less than 512 bytes. Take a look at this article from Matt Cutts:
http://www.mattcutts.com/blog/404-pages-in-google-...
Dave
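As a quick way to check a custom error page against the 512-byte figure mentioned above, the sketch below measures the size of an HTML string. It assumes the page is served as UTF-8 and that the threshold applies to the size of the response body. The function names are hypothetical, and the 512 value is taken from the comment rather than verified here.

```javascript
// Threshold from the comment above; treat it as an assumption.
var FRIENDLY_ERROR_THRESHOLD = 512; // bytes

// Size of the page in bytes when encoded as UTF-8.
function errorPageByteLength(html) {
  return new TextEncoder().encode(html).length;
}

// True if the page is small enough that a browser might swap in its own error page.
function mayBeReplacedByBrowser(html) {
  return errorPageByteLength(html) < FRIENDLY_ERROR_THRESHOLD;
}

console.log(mayBeReplacedByBrowser('<h1>Not Found</h1>')); // true
```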
I have been working on a nice 404 page for a client website and it is insightful to check its visits on Google Analytics. Plus I've added some links so I can see which links are working well or not.
I'm sure that this great post will make many more people start to think about measuring 404 pages.
Thanks for your time, John.
Would be better as: ^\s*Error [0-9]{3}.*
use just (Error page +) for the Field A-> Extract A section.
And instead of just looking for 'Error' in the regex, we added in the numbers to the regular expression to try and ensure that we were only pulling in actual error pages. We wouldn't want blog posts or other pages that have 'Error' in their title to get pulled into the reporting.
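The difference between the patterns mentioned in this thread can be seen with a small JavaScript sketch. The page titles are invented for illustration, and JavaScript's regular expression engine is used as a stand-in, so treat the results as indicative rather than a guarantee of how a Google Analytics filter behaves.

```javascript
// Patterns discussed in this thread, written as JavaScript regular expressions.
var patterns = {
  wordOnly: /Error/i,                     // just the word "Error"
  post: /(Error [0-9][0-9][0-9].+)/i,     // pattern from the post
  anchored: /^\s*Error [0-9]{3}.*/i       // variant suggested in the comments
};

// Returns the names of the patterns that match a given page title.
function matchingPatterns(title) {
  var names = [];
  for (var name in patterns) {
    if (patterns[name].test(title)) {
      names.push(name);
    }
  }
  return names;
}

// Hypothetical page titles.
console.log(matchingPatterns('Error 404 Page'));
// [ 'wordOnly', 'post', 'anchored' ]
console.log(matchingPatterns('Error 500'));
// [ 'wordOnly', 'anchored' ]
console.log(matchingPatterns('Top 10 Error 404 Pages We Love'));
// [ 'wordOnly', 'post' ]
console.log(matchingPatterns('Fixing the "Error establishing a database connection" message'));
// [ 'wordOnly' ]
```

In this sketch the bare word matches every title, including the blog-style ones. The pattern from the post needs at least one character after the three digits, so a title of exactly "Error 500" is missed, while the anchored variant only accepts titles that begin with the error text.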
Would you please send us screenshots of the reports you are referring to, particularly the Navigation Summary?
Thanks in advance
Everything works the same, except for the code you put on the error page. Just update that to use the new async code. Instructions can be found here: http://code.google.com/apis/analytics/docs/trackin...
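For reference, here is a sketch of roughly what the asynchronous version looks like, written from memory of Google's published snippet; the linked instructions remain the authoritative source. The UA-xxxxx-x placeholder is carried over from the post, and the typeof document check is an addition so the sketch also runs outside a browser. On a real error page this would sit inside a script tag in the template.

```javascript
// Command queue; ga.js processes these entries once it has loaded.
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-xxxxx-x']); // placeholder account ID from the post
_gaq.push(['_trackPageview']);

// Load ga.js without blocking page rendering.
// The typeof guard is not part of Google's snippet; it is added for this sketch.
if (typeof document !== 'undefined') {
  (function () {
    var ga = document.createElement('script');
    ga.type = 'text/javascript';
    ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') +
      '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0];
    s.parentNode.insertBefore(ga, s);
  })();
}
```

The filter and the reports described in the post work from the page title and Request URI of the recorded pageview, so they should not need to change when the snippet does.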