Slicing and Dicing Google Cookies - Part 1

One of the questions we see regularly on Google Groups from consultants and end users in general regards the identity and structure of Google Analytics Cookies.

Some good articles exist (e.g.: Cookies Set By Google Analytics and Justin Cutroni's Google Analytics Short Cuts) which have been helpful in understanding GA cookies and we wanted to contribute something real that relied on that knowledge: a JavaScript class for slicing and dicing Google cookies to extract those juicy bits. We present it in the context of a project.

The problem:

Some sites are eCommerce sites (they have a shopping cart), others are not, and some are in between. And then there are those that earn revenue not through a shopping cart but by displaying ads; these are eCommerce sites on steroids. Each ad displayed is like an item sold, for which the advertiser pays. Shipping is free.

But in this context, what exactly constitutes a transaction and how are Order IDs assigned?

The solution:

It turns out that what constitutes a transaction is in the eye of the beholder. The requirement was to track revenue by visit and to identify the highest (and lowest!) earning pages and site sections. So:

Transaction=>Visit
Product Name=>Unique Ad Description
Product SKU=>Page
Product Category=>Site Section

To identify a Transaction (Order ID), we needed to identify the visit uniquely across all pageviews without creating our own tracking system.
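As a rough sketch of how that mapping might be sent to GA, the snippet below assumes the ga.js e-commerce calls _addTrans, _addItem and _trackTrans. The helper trackAdImpression, the shape of the ad object, the sample values and the stub tracker are all made up for illustration; how revenue totals should accumulate over a visit is left out.

```javascript
// Hypothetical helper: reports one displayed ad as an item in the visit's "transaction".
// `tracker` is assumed to expose the ga.js e-commerce methods
// _addTrans, _addItem and _trackTrans.
function trackAdImpression(tracker, visitId, ad) {
  // Transaction => Visit (the visit id plays the role of the Order ID).
  // Tax, shipping and location are placeholder values in this sketch.
  tracker._addTrans(String(visitId), '', ad.revenue, '0', '0', '', '', '');
  // Product SKU => Page, Product Name => Unique Ad Description,
  // Product Category => Site Section.
  tracker._addItem(String(visitId), ad.page, ad.description, ad.section, ad.revenue, '1');
  tracker._trackTrans();
}

// Stub tracker that only records calls, so the sketch runs outside a page.
var calls = [];
var stubTracker = {
  _addTrans: function () { calls.push(['_addTrans'].concat([].slice.call(arguments))); },
  _addItem: function () { calls.push(['_addItem'].concat([].slice.call(arguments))); },
  _trackTrans: function () { calls.push(['_trackTrans']); }
};

trackAdImpression(stubTracker, '1.0463', {
  page: '/news/today.html',          // made-up page
  description: 'Leaderboard - Acme', // made-up ad description
  section: 'News',                   // made-up site section
  revenue: '0.02'                    // made-up revenue for one impression
});
```

On a real page the stub would be replaced by the tracker object that ga.js provides.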

GA already identifies the session uniquely on the visitor's machine as the start time (to the nearest second) of the visitor's current session and stores it in the __utma cookie. However, different visitors' sessions could have the same start time. GA also identifies the visitor uniquely with an anonymized id. The two values together would be more than adequately unique.
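To show where those two values sit, here is a minimal sketch that splits a __utma value into its dot-separated fields. The function name parseUtma and the sample value are invented for this example, and it is not the code in gaVKIcookies.js. The field order it assumes is domain hash, visitor id, first-visit time, previous-visit time, current-visit time and visit count.

```javascript
// Hypothetical helper, not part of gaVKIcookies.js.
// Assumes the __utma layout:
// domainHash.visitorId.firstVisit.previousVisit.currentVisit.visitCount
function parseUtma(value) {
  var parts = String(value).split('.');
  if (parts.length !== 6) {
    return null; // not the layout this sketch expects
  }
  return {
    domainHash: parts[0],
    visitorId: Number(parts[1]),
    firstVisit: Number(parts[2]),    // Unix time in seconds
    previousVisit: Number(parts[3]), // Unix time in seconds
    currentVisit: Number(parts[4]),  // start time of the current session
    visitCount: Number(parts[5])
  };
}

// Made-up sample value, for illustration only.
var utma = parseUtma('171060324.1285510041.1226000000.1228500000.1228588800.3');
```

Here utma.visitorId and utma.currentVisit are the two values discussed above.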

However, we did not want the visitor id to be recorded in the reports. Although it does not personally identify visitors, it would allow revenue to be accumulated per individual visitor within GA. That would probably go against Google's Terms of Service and, more importantly, against its core principle "Don't be evil". By dividing the visitor id by the session start time, the value would be sufficiently unique but would change on each visit, making it impossible to track revenue by individual visitor.
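A small worked example of that division, using invented numbers: the same visitor id combined with two different session start times gives two different visit ids. The helper name uniqueVisitId is an assumption for this sketch.

```javascript
// Hypothetical helper illustrating the division described above.
function uniqueVisitId(visitorId, sessionStart) {
  return visitorId / sessionStart;
}

// Same made-up visitor id, two sessions one day apart (Unix time in seconds).
var visitA = uniqueVisitId(1285510041, 1228588800);
var visitB = uniqueVisitId(1285510041, 1228675200);

// The two results differ, so the reports see two unrelated Order IDs.
var sameVisitor = (visitA === visitB); // false
```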

The implementation:

There have been other scenarios where we wrote code to extract data from the cookies but it was time to write something more generic and enduring.

I had used Adam Vandenberg's Querystring class to dissect querystrings. Since both query strings and cookies are stored as a delimited series of name=value pairs, an adaptation of his code, with some GA cookie-specific enhancements, resulted in gaVKIcookies.js in fairly short order.
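The name=value idea can be sketched in a few lines. This is only an illustration of the general technique, not the contents of gaVKIcookies.js; the function name parseCookieString and the sample string are assumptions.

```javascript
// Hypothetical sketch; gaVKIcookies.js itself may be organised differently.
// Splits a cookie string such as document.cookie into a name -> value map.
function parseCookieString(cookieString) {
  var result = {};
  var pairs = cookieString ? cookieString.split(';') : [];
  for (var i = 0; i < pairs.length; i++) {
    var pair = pairs[i].replace(/^\s+|\s+$/g, ''); // trim surrounding whitespace
    var eq = pair.indexOf('=');
    if (eq < 1) {
      continue; // skip malformed or nameless entries
    }
    // Split on the first '=' only, because values can contain '=' themselves.
    result[pair.substring(0, eq)] = pair.substring(eq + 1);
  }
  return result;
}

// Made-up sample; in a browser you would pass document.cookie instead.
var cookies = parseCookieString('__utma=1.2.3.4.5.6; __utmz=1.2.1.1.utmcsr=google|utmcmd=organic');
```

Splitting on the first '=' only matters for cookies like __utmz, whose value contains further name=value pairs.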

Those and other elements of the GA cookies are shown in the following table which is populated in real time in an iframe using gaVKIcookies.js:

The calculation of a unique visit id is as follows:

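// Create a gaVKIcookies object. The instance gets its own name so that
// it does not overwrite the gaVKIcookies constructor.
var gaCookies = new gaVKIcookies();
// Construct the unique visit id: visitor id divided by session start time
var strUniqueVisitId = gaCookies.get('visitorId') / gaCookies.get('stime');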

Run the sample useGAcookies.html page, which shows gaVKIcookies.js in use; you are free to download and enjoy it.

Most of the GA cookies and their components, including the __utmz campaign/referrer cookie, are isolated in the code.
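For a feel of what isolating the __utmz components involves, here is a minimal sketch. It assumes the layout domainHash.timestamp.sessionCount.campaignCount followed by pipe-separated name=value campaign fields; parseUtmz and the sample value are invented for the example and are not taken from gaVKIcookies.js.

```javascript
// Hypothetical helper, not part of gaVKIcookies.js.
// Assumes: domainHash.timestamp.sessionCount.campaignCount.field=value|field=value...
function parseUtmz(value) {
  var parts = String(value).split('.');
  if (parts.length < 5) {
    return null; // not the layout this sketch expects
  }
  // Campaign values may contain dots (e.g. a referring host), so rejoin the rest.
  var fields = parts.slice(4).join('.').split('|');
  var campaign = {};
  for (var i = 0; i < fields.length; i++) {
    var eq = fields[i].indexOf('=');
    if (eq > 0) {
      campaign[fields[i].substring(0, eq)] = fields[i].substring(eq + 1);
    }
  }
  return {
    domainHash: parts[0],
    timestamp: Number(parts[1]), // Unix time in seconds
    sessionCount: Number(parts[2]),
    campaignCount: Number(parts[3]),
    campaign: campaign
  };
}

// Made-up sample value, for illustration only.
var utmz = parseUtmz('171060324.1228588800.3.2.utmcsr=example.com|utmccn=(referral)|utmcmd=referral|utmcct=/links.html');
```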

The detail of each component will be covered in a later post but, in the meantime, is apparent from gaVKIcookies.js.

The latest version of WASP (v0.73) now displays GA's cookies: WASP post and download link

An important note about Google Analytics and privacy: GA has a strict policy against tracking Personally Identifiable Information (PII). It enforces this not only in its Terms of Service (Google Analytics Privacy Policy) but also by basing the design of the reporting on that Privacy Policy. GA does not report on visitors. Any users preventing GA from tracking their use of web sites are doing themselves a disservice - but that's a different soap box for another post.

Comments
My plan (and yes, there actually is a plan) for this series is as follows:

Part 2 will explain the individual __utm* cookies and their sub-component values - I'm now writing the iframe source and aiming to publish on Wed Nov 10th ( 2008 :)

Part 3 will describe how GA cookies work as part of cookies in general.

Those posts will have answered and demonstrated what I've gleaned from forums and groups to be your most frequently asked questions and, more importantly, the questions that perhaps should have been asked.

I'd like to complete the series with a Part 4 covering questions asked and aspects suggested by our readers.

Please comment on what you would like to see covered in Part 4.
Please also comment on the methods of explanation used - e.g. is the interactive iframe helpful or confusing?

We look forward to hearing from you.

Happy measuring, testing and optimizing ..
Brian
# Posted By Brian Katz | 12/6/08 1:40 PM
Thanks for the detailed post. Just wondering how accurate web analytics will be when users start using InPrivate Browsing and InPrivate Blocking in browsers.
# Posted By Lakkineni | 12/29/08 12:52 PM
Hi Lakkineni

The most important "rule" of analytics is to look at context and trends.
Assuming the effect of InPrivate browsing is huge (which I doubt will be the case), it would affect the accuracy of visit numbers but trends would still be relevant.
Having said that, it would make it more difficult to compare periods before privacy features kicked in to those after they did.
Also, the users that use privacy features enough to make an impact would form a segment excluded from analytics. This has good and bad effects - our stats will not cover all user segments, but the segments that are tracked will be better understood since they exclude a particular type of user.
I don't expect the effects of these features to be substantial because users cannot have a worthwhile web experience without cookies - they may even regret not maintaining cookies between browsing sessions.
Brian
# Posted By Brian Katz | 1/12/09 2:23 PM
Thanks for your reply Brian. I have been using Google Chrome since release. When I click on the history I was shocked to see all the list of sites I have visited and others using my computer. But definitely I felt Google has invaded my personal space and I started using the Incognito option to browse. Sooner or later people will figure out how much information they are giving away and will consider using these options. Well..it's just my personal opinion though.
# Posted By Lakkineni | 1/17/09 11:38 PM
Hi Lakkineni
I'm shocked that you were shocked - storing history is what browsers do - it's been one of their explicit functions from the get-go.

I cannot feel an invasion of personal space. I would, however, if Google Chrome were transmitting the history, even against an anonymized ID, but apparently Chrome does not do that:
See http://www.google.com/chrome/intl/en/privacy.html - halfway down the page.

Google Toolbar may be another matter, but even then their readily available privacy policy seems fairly clear:

See: http://www.google.com/support/toolbar/bin/static.p...

Thanks for the interest and the interesting points you raised
# Posted By Brian Katz | 1/19/09 10:04 AM