Splunk: Real-time (web) analytics, powerful data mining and cost effective single customer view

Splunk is a fantastic monitoring and operational intelligence tool and now we are all trained up here at Datalicious with certificates to prove it (see end of post). The most frequent use case is for systems administrators, but we set out to play around with it and see how we could use it for web analytics. We realised that we could use its powerful, expressive search language and its intuitive charting & visualisation features to do analytics work that's more difficult, more expensive, or simply not possible, in other web analytics suites.

The big philosophy of Splunk is that you just throw all your data into it and worry about how to report on it and what to do with it later. This is great for us: it means we can focus on gathering as much data as possible in the implementation stage of a project, and there's no risk of getting to the reporting & insights stage only to realise we've overlooked something.

We have a setup where all our Google Analytics data is cloned and sent into Splunk. We hacked together a simple, scalable pixel server in Node.js which acts as an intermediary between Google Analytics and our Splunk installation. Our server can handle any pixel request, so we can supplement the data that Google Analytics gathers with anything we want to do in our tracking code - without having to set up Custom Variables in advance, and without being limited to 5 of them.
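
The server itself isn't reproduced in this post. The TypeScript below is only a rough sketch of the core idea, assuming the pixel's query string is flattened into one key="value" log line per request, a format Splunk can extract fields from at search time. The function names and the log layout are hypothetical, and the HTTP wiring is left out.

```typescript
// Hypothetical sketch: not the server described in the post.
// Turns the query string of a tracking-pixel request into one key="value"
// log line that can be written to a file or stream indexed by Splunk.

type PixelParams = Record<string, string>;

// decodeURIComponent throws on malformed input; a pixel server should not.
function safeDecode(text: string): string {
  try {
    return decodeURIComponent(text.replace(/\+/g, " "));
  } catch {
    return text;
  }
}

// Parse "?a=1&b=two%20words" into { a: "1", b: "two words" }.
function parsePixelQuery(query: string): PixelParams {
  const params: PixelParams = {};
  for (const pair of query.replace(/^\?/, "").split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    const rawKey = eq === -1 ? pair : pair.slice(0, eq);
    const rawValue = eq === -1 ? "" : pair.slice(eq + 1);
    params[safeDecode(rawKey)] = safeDecode(rawValue);
  }
  return params;
}

// Format one event as: <ISO timestamp> key="value" key="value" ...
function toLogLine(timestamp: Date, params: PixelParams): string {
  const fields = Object.keys(params).map((key) => {
    const value = params[key].replace(/"/g, "'"); // keep the quoting unambiguous
    return `${key}="${value}"`;
  });
  return [timestamp.toISOString(), ...fields].join(" ");
}
```

Because every parameter becomes a field, extra values added to the tracking code later need no change on the server side.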

Once the data is in Splunk, its search language lets us get right at the data and do whatever we want with it. For example, maybe we want to see how many page views our website gets on average per session, to see how our latest site design is performing. We can run this search:
eventtype=datalicious_GA earliest=-7d | stats avg(utms) AS avg | eval avg=round(avg, 2)
Broken down, it's pretty simple: we're looking at the event type called "datalicious_GA", which has been defined elsewhere. The earliest results we want are from 7 days ago. We "pipe" the output of that search to the "stats" command and take the average of "utms", which is Google Analytics' counter of requests within a session. We then round it to two decimal places so that it looks a bit nicer, and we get this:

average page views

Fairly simple. But what happens if we realise we want to break those results down by some kind of segmentation which we didn't plan for in the past? It's no problem. If at any time in the future we get some additional metadata about our visitors, we can apply that retrospectively to generate segmentations across their full history. For example, let's say some visitors eventually "convert", which for our website is simply clicking one of the links to contact us. We could run this more complex search query:
eventtype="datalicious_GA" | eval type="Non-Converter" | join type=outer datalicious [search eventtype="datalicious_GA" | join datalicious [search eventtype="datalicious_conversion"] | eval type="Converter"] | stats avg(utms) AS avg by type | eval avg=round(avg,2)
This just means we want to do a search for converters, join it to the search result for all visitors, and show the average per-session page views of each of those segments.
segmented average page views

It's trivial to look at something like conversions by channel:

conversions by channel

Of course, no one wants to look at ugly search strings all day. That's why we build visualisations:

individually segmented page views

It's important to emphasise that we can retrospectively apply a segmentation across the full history of all impressions, events and custom data at any time. In the above example, we built a little form and got people from around the office to fill in their name. We associated that with the unique cookie ID they have on our website, and suddenly we can track their individual behaviour over all time. This didn't have to be a name; it could have been any meaningful segmentation: annual household income, country, favourite musical genre, etc.
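
In code terms, retrospective segmentation is just a lookup applied to events that were stored long before the lookup existed. Here is a minimal TypeScript sketch, assuming each stored event carries the visitor's cookie ID; the StoredEvent shape, the function name and the "Unknown" fallback are hypothetical and not part of our Splunk setup.

```typescript
// Hypothetical shape of an already-stored page view event.
interface StoredEvent {
  cookieId: string;
  page: string;
}

// Count stored page views per segment, using a cookie-to-segment lookup
// that was collected after the events themselves were recorded.
function pageViewsBySegment(
  events: StoredEvent[],
  segmentByCookie: Record<string, string>,
  fallback: string = "Unknown"
): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const event of events) {
    const known = Object.prototype.hasOwnProperty.call(segmentByCookie, event.cookieId);
    const segment = known ? segmentByCookie[event.cookieId] : fallback;
    counts[segment] = (counts[segment] || 0) + 1;
  }
  return counts;
}
```

Swapping the lookup for one keyed on country or income band changes the segmentation without touching the stored events.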

And of course, we can apply all of those segmentations across data like search keywords:

segmented search keywords

Check out our fancy certification:

Check out the Datalicious Supertag: Container tag for smarter tag management

Chrome pre-rendering doesn't just speed up page load times but potentially also skews web analytics

Last month the Chrome developers announced a new feature, Instant Pages, to speed up page loads after Google searches, and elsewhere on the web. I hadn't paid much attention to it until I noticed Niall Kennedy's comments on it and nearly choked on my morning porridge. Here's how it works, including some detail on its impact on web analytics.

What's going on?

When you do a search in Google using the latest versions of Chrome, the search results page can include a special hint (a link element with rel="prerender") to tell Chrome to pre-load the top result pages. Chrome will then go ahead and load those pages in the background: it downloads the page and all the dependent stylesheets, images and JavaScript files, and executes the JavaScript. The user may or may not click through to the page.

What does this mean for web analytics?

That last bit is important. If your web analytics automatically fires on page load, it means you'll be recording a page view when potentially the user didn't actually see it. To fix this, Niall Kennedy suggests a fairly straightforward test to ensure your web analytics only loads when it should. Integrating this into your analytics shouldn't be too hard, though it will require some changes. At some point it's fairly likely that the analytics vendors will do this inside their own libraries, or at least create a standard way to do it.
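
Niall Kennedy's exact test isn't reproduced here, but a check along these lines can be sketched in TypeScript. The sketch assumes the Page Visibility API, where a prerendered page reports a visibility state of "prerender" and fires a "visibilitychange" event once it is actually shown. Early Chrome builds are understood to expose these under a webkit prefix (webkitVisibilityState, webkitvisibilitychange), so treat the names as assumptions to verify. The PageLike interface and the trackWhenShown function are made up for illustration.

```typescript
// Minimal shape of the document features the sketch relies on, so it can be
// exercised outside a browser. In a page this would be backed by `document`.
interface PageLike {
  visibilityState: string;
  addEventListener(type: string, listener: () => void): void;
  removeEventListener(type: string, listener: () => void): void;
}

// Calls `track` exactly once: immediately if the page is not being
// prerendered, otherwise as soon as the state leaves "prerender".
function trackWhenShown(page: PageLike, track: () => void): void {
  if (page.visibilityState !== "prerender") {
    track();
    return;
  }
  const onChange = (): void => {
    if (page.visibilityState === "prerender") return; // still in the background
    page.removeEventListener("visibilitychange", onChange);
    track();
  };
  page.addEventListener("visibilitychange", onChange);
}
```

The page view call of whichever analytics library is in use would go inside the track callback, so a prerendered page that is never shown records nothing.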

We're hiring! Looking for a Technical Director to join our team of data geeks in Sydney ($100-120k base)

THIS ROLE HAS BEEN FILLED!

If you are a senior web developer (or an analyst with exceptional technical skills) with at least 4 to 5 years of work experience, ready to step up and manage your own small team, and keen to get into analytics and data driven marketing in a fast paced agency environment, then you should apply.

Please note: Digital experts without serious JavaScript/PHP/SQL skills need not apply!

What you should bring to the table
+ Keen interest in all things web analytics and online optimisation
+ Basic understanding of marketing principles provides a head start
+ Entrepreneurial spirit, business or marketing degree would be good
+ Ability to identify client opportunities and turn them into projects 
+ Ability to translate business requirements into project deliverables
+ Demonstrated ability for lateral thinking and problem solving
+ Project and team management experience, large/mid scale projects
+ Exceptional organisational, presentation and communication skills
+ Expert HTML, JavaScript, PHP and MySQL development skills
+ Experience with other database related technologies a strong bonus 
+ Omniture SiteCatalyst implementation experience would be fantastic
+ Experience with CRM, ad serving, targeting or testing platforms useful
+ Solid understanding of HTTP and FTP protocols an absolute must
+ Risk management to maximise data quality and minimise data loss
+ Good grounding in statistics or mathematics would be very helpful
+ Demonstrated ability to manage client and partner relationships
+ Demonstrated ability for developing processes and documentation
+ Ability to see through vendor sales pitches and identify weaknesses
+ Attention to detail bordering on OCD absolutely crucial for success
+ Ability to respond to client deadlines when necessary

How we will reward your efforts
+ Exposure to a growing list of interesting blue chip clients
+ Highly flexible working hours in a dynamic team environment
+ Young start-up with a "work hard, play hard" company attitude
+ Training on industry leading analytics and marketing platforms
+ Freedom to experiment with emerging technologies and new tools
+ Potential for public speaking engagements to build personal profile 
+ Potential for a wider regional APAC role in the not too distant future
+ Attractive salary base plus performance based bonus and perks

If the above sounds interesting, please email us at jobs@datalicious.com so we can have a look at your resume and arrange a quick initial phone interview to ask you a few questions before we meet for a proper interview.

New ClickTale segmented heat maps show mouse data for prospects vs. existing customers

ClickTale launched two new heat map features today that are worth mentioning.

The Segmented Heat Maps (see screen shot below) allow analysts to show mouse movement, mouse click and page scrolling data for different segments to analyse differences in behaviour. The segmentation options include customer status, conversion status, media channels and any other custom segmentation variables such as age, gender or location, but I especially like the fact that we'll now be able to analyse website usage for new prospects vs. existing customers separately.

Ultra Scale Heat Maps, on the other hand, allow analysts to show aggregate mouse data from up to 100,000 visitors in a single image, enabling usability testing on a much larger scale than standard eye tracking methods.

"With an 84-88% correlation between our Mouse Move Heatmaps and expensive eye-tracking studies, website owners can now conduct incredibly accurate usability studies on a massive scale, and at a fraction of the cost."

segmented heat maps

ADMA survey reveals lack of web analytics data usage in media attribution and single customer view

Over the past few months the ADMA Data & Analytics Council conducted an online survey to establish how evolved the direct marketing industry in Australia really is in terms of data and analytics.

Key findings of the survey included:

  • Although 62% of respondents said that they tie sales data back to the campaigns and media channels driving them, 59% admitted that they were not actually using web analytics, which is interesting given the increased importance of online channels in driving sales.
  • A similar trend emerged when asking marketers whether they had a single customer view. A surprisingly large number of respondents (44%) said they had a single customer view, but interestingly only 41% of those companies were incorporating web analytics data into their single view. Given the growing number of online customer touch points, this raises the question of how complete these single customer views really are.
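
Multiplying the two figures in the second finding gives a rough sense of scale. The TypeScript below is just that arithmetic, assuming the 41% is a share of the 44% who reported a single customer view; the function name is illustrative.

```typescript
// Share of ALL respondents in a subgroup, given the size of the parent
// group and the subgroup's share within it, as a percentage to one decimal.
function shareOfAllRespondents(parentShare: number, shareWithinParent: number): number {
  return Math.round(parentShare * shareWithinParent * 1000) / 10;
}

// 44% have a single customer view; 41% of those include web analytics data.
const withWebAnalyticsInSingleView = shareOfAllRespondents(0.44, 0.41); // 18
```

In other words, only about 18% of all respondents have a single customer view that includes web analytics data.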

Although the survey was only a quick and dirty exercise, I think the results are quite interesting, and the council is now considering extending and refining the survey to shed some more light on the issues highlighted above.

Please subscribe to the ADMA Councils blog if you would like to hear about research like this in the future or email councils@adma.com.au if you would like to help shape similar future initiatives.
