Data mining and visualisation of raw social media data from BuzzNumbers in the Tableau BI platform


We recently did a bit of custom data mining for one of our clients to showcase the power of analysing raw social media data through business intelligence platforms such as Tableau. Given it was only a demo using public data, it should be OK to share the results with the rest of you.

The Australian-based social data provider BuzzNumbers (great platform, check it out) was kind enough to provide the raw data, a small sample of consumer generated content mentioning Zara fashion related terms. We then analysed and visualised the data using Tableau and the free word cloud service Tagxedo; the results can be seen below.

BuzzNumbers scans all social profiles for geographic keywords and, based on their frequency, allocates each user a home location. This allowed us to visualise not only where social buzz originated geographically but also which regions generated the most influential buzz. In Zara's case the majority of buzz volume came from the US, UK and Australia, but there were a few interesting outliers such as Thailand and Egypt that showed high buzz influence (smaller but darker red dots).
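To make the home location idea a little more concrete, here is a minimal JavaScript sketch of frequency-based allocation. It is our own illustration and not BuzzNumbers' actual implementation: the keyword table, the function name `guessHomeLocation` and the naive substring matching are all assumptions.

```javascript
// Hypothetical keyword-to-location table; a real gazetteer would be far larger.
const LOCATION_KEYWORDS = {
  sydney: "Australia",
  melbourne: "Australia",
  london: "UK",
  manchester: "UK",
  "new york": "US",
  chicago: "US",
};

// Count how often each location's keywords appear in the profile text and
// return the location with the highest count, or null if nothing matches.
// Matching is a naive substring search, so this is illustrative only.
function guessHomeLocation(profileText, keywords = LOCATION_KEYWORDS) {
  const text = profileText.toLowerCase();
  const counts = {};
  for (const [keyword, location] of Object.entries(keywords)) {
    // Number of non-overlapping occurrences of the keyword in the text.
    const matches = text.split(keyword).length - 1;
    if (matches > 0) {
      counts[location] = (counts[location] || 0) + matches;
    }
  }
  let best = null;
  for (const [location, count] of Object.entries(counts)) {
    if (best === null || count > counts[best]) best = location;
  }
  return best;
}
```

A profile that mentions Sydney twice and London once would be allocated to Australia under this scheme; a profile with no geographic keywords gets no location at all.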

Next we looked at which sites and categories of sites were generating the most buzz value, or media value. Interestingly, while Twitter generated the most buzz value, other sites such as YouTube were more influential (darker red bars). Once we broke the sites down by major categories and countries, a completely different picture emerged, hinting that different site categories perform better in certain countries; for example, forums seem to be big in Australia but negligible overseas.

And of course we had to visualise what people were actually talking about, hence the word cloud: the larger the font, the more mentions that term had. The BuzzNumbers guys don't believe in automatic sentiment analysis (and we kind of agree), so they offer a cost effective manual sentiment analysis, but unfortunately we didn't have enough time to set this up in this case.
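For anyone curious how mention counts turn into font sizes, the sketch below counts terms across posts and scales each count linearly to a pixel size. It is only an illustration: we don't know how Tagxedo scales its fonts, and the stop word list, size range and function names are our own assumptions.

```javascript
// Count how many times each term is mentioned across all posts.
function termFrequencies(
  posts,
  stopWords = new Set(["the", "a", "and", "at", "of", "in", "to", "is"])
) {
  const counts = new Map();
  for (const post of posts) {
    const words = post.toLowerCase().match(/[a-z0-9']+/g) || [];
    for (const word of words) {
      if (stopWords.has(word)) continue;
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  return counts;
}

// Map each count to a font size: the least mentioned term gets minSize,
// the most mentioned gets maxSize, everything else is scaled linearly.
function fontSizes(counts, minSize = 12, maxSize = 48) {
  const values = [...counts.values()];
  const lo = Math.min(...values);
  const hi = Math.max(...values);
  const sizes = new Map();
  for (const [term, count] of counts) {
    const scale = hi === lo ? 1 : (count - lo) / (hi - lo);
    sizes.set(term, Math.round(minSize + scale * (maxSize - minSize)));
  }
  return sizes;
}
```

With the posts "Zara dress sale", "Love the Zara dress" and "Zara", the term "zara" (3 mentions) gets the largest font, "dress" (2 mentions) lands in the middle and "sale" (1 mention) gets the smallest.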

BuzzNumbers metric definitions

  • BuzzValue: Equivalent online advertising value in AUD (based on banner ad CPMs)
  • BuzzRank: Website rankings based on traffic/visitors/page views (1 being highest)
  • BuzzInfluence: Influence based on search rankings & link popularity (for post not site)


Check out the Datalicious Supertag: Container tag for smarter tag management

Enhance Google Analytics with Super Cookies

Google Analytics mechanics are quite different to Omniture Site Catalyst. In Google Analytics, many of the calculations such as pages per visit, first visit, last visit, etc. are stored in cookies; they are not calculated on the server side. Additionally, the visitor ID is not necessarily the same for a given user and isn't used to tie information together. Instead, Google Analytics relies on your cookies telling the truth. The problem is that cookies inherently only ever tell partial truths and the attrition rates are huge, so how can you trust this information? The answer is that you can, to a point, but be aware of what you're looking at, because it's far from perfect. If you want to make it more accurate, use super cookies. This post touches on several areas where we think Flash can add major value to existing Google Analytics solutions. We've already covered the super cookie technology in a previous post, so we won't dwell on the basics; if you need some more background, please read the previous post.

Examples of Super Cookie additions to Google Analytics deployments
  1. Flash based persistent cookies work across multiple domains and multiple browsers - Use super cookies to re-set targeting and other custom variables across your network of domains, in any browser.
  2. Cookie deletion measurements - How often do your existing users delete their standard cookies?
  3. Browser switching measurements - Do your users switch between multiple browsers? How does this affect your analytics?
1. Super persistent cookies
For those wanting to get better accuracy from Google Analytics, or if you're using the Google Analytics custom variables for targeting and reporting, this is for you.
Persisting the profile - In order to keep the rich information on your users no matter whether they delete cookies or switch browsers, you have two main options:

a) Respawn their past GA cookies before loading GA - All their profile information is associated with their visitor ID. By resetting this to its original value (stored either prior to cookie deletion, or from a previous browser), their profile remains intact. This method can also help you keep more realistic figures on unique visitors, as long as you can replace the visitor ID prior to sending any requests to Google. Although this gives the smoothest operation, the privacy issues are obvious and must be addressed.
b) Keep a copy of targeting parameters in a super cookie - If you detect a cookie deletion, resend the parameters to Google Analytics so they can be re-bound to the new visitor ID. This is a little more privacy friendly, as you're allowing the user to remove the association to a specific ID, but their profile remains. You no longer know who they are, but you still know a little about them to help serve them better.
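Option (b) could look roughly like the sketch below. Treat it as a starting point under stated assumptions rather than a finished implementation: `superCookie` is a hypothetical wrapper around the Flash storage exposing `get(key)` and `set(key, value)`, the `"targeting"` key and the function names are ours, and `tracker` is a ga.js tracker object such as `pageTracker`. The only Google Analytics calls used are `_setCustomVar` and `_trackPageview`.

```javascript
// Save the current targeting parameters in the super cookie whenever they
// are set, e.g. [{ index: 1, name: "segment", value: "student" }].
function saveTargeting(superCookie, params) {
  superCookie.set("targeting", JSON.stringify(params));
}

// hasGaCookie: whether the standard GA visitor cookie (__utma) is present.
// In a browser this could be: document.cookie.indexOf("__utma=") !== -1,
// checked before GA has had a chance to write fresh cookies.
function restoreTargeting(tracker, superCookie, hasGaCookie) {
  const saved = JSON.parse(superCookie.get("targeting") || "[]");
  let restored = false;
  if (!hasGaCookie && saved.length > 0) {
    // Standard cookies are gone but the super cookie survived:
    // re-bind the saved parameters to the new visitor ID.
    for (const { index, name, value } of saved) {
      tracker._setCustomVar(index, name, value, 1); // 1 = visitor-level scope
    }
    restored = true;
  }
  // Custom variables must be set before the pageview request is sent.
  tracker._trackPageview();
  return restored;
}
```

Note that nothing here restores the old visitor ID, which is the point of option (b): the profile travels with the user, the identity does not.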

2. Cookie Deletion Measurements
If you grapple with privacy concerns but are still desperate to know how many of your users delete their cookies, then you can use this method to find out without fear of privacy invasion. This technique is useful for adjusting data inaccuracies caused by cookie deletion. 

Super cookies remain after users delete their standard cookies. Because Flash cookies are not currently dealt with by browser settings (Chrome has some functionality), or understood by consumers, they are rarely deleted (assume this will increase in the future). By comparing the super cookie value to the standard cookie value, you can quickly tell if a previous value existed and has since been deleted. The high level logic is found below (note: this is over-simplistic and does not allow for browser switching, see section 3!). The following pseudo code would actually be done in JavaScript:

IF standardCookie(a) is not equal to superCookie(a) AND superCookie(a) is not null THEN
{
    LOAD GA CODE
    Set custom variable to indicate a cookie deletion
    SEND GOOGLE REQUEST
} ELSE {
    LOAD GA CODE
    SEND GOOGLE REQUEST
}
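In JavaScript, the logic above might be sketched as follows. The assumptions: `superCookie` is a hypothetical Flash storage wrapper with a `get(key)` method that returns null when nothing is stored, `visitor_marker` is a made-up cookie name written to both stores on a user's first visit, and custom variable slot 1 with session scope is an arbitrary choice. `tracker` is a ga.js tracker such as `pageTracker`.

```javascript
// Read one value from a document.cookie-style string, or null if absent.
function readCookie(cookieString, name) {
  for (const part of cookieString.split(";")) {
    const [key, ...rest] = part.trim().split("=");
    if (key === name) return decodeURIComponent(rest.join("="));
  }
  return null;
}

// A deletion is assumed when the super cookie still holds a value
// that the standard cookie no longer matches.
function detectCookieDeletion(standardValue, superValue) {
  return superValue != null && standardValue !== superValue;
}

// In a browser, pass document.cookie as cookieString.
function trackPageWithDeletionCheck(tracker, cookieString, superCookie) {
  const standardValue = readCookie(cookieString, "visitor_marker");
  const superValue = superCookie.get("visitor_marker");
  const deleted = detectCookieDeletion(standardValue, superValue);
  if (deleted) {
    // Slot 1 and session scope (2) are assumptions; use any free slot.
    tracker._setCustomVar(1, "cookie_deleted", "yes", 2);
  }
  tracker._trackPageview();
  return deleted;
}
```

After a deletion has been flagged you would normally write the marker back into the standard cookie, otherwise every subsequent page view will be flagged as a new deletion.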

The above logic would enable you to see several things including:

a) The total number of cookie deletions (using the custom variable or an event)
b) Conversion rates of users who have deleted their cookies vs those that haven't (using the custom variable). Note: This is particularly useful for Targeting, where profiling enhances conversion. You can directly measure the uplift of normal users compared to users post cookie deletion.

3. Browser Switching Measurements
Many people now use multiple internet browsers for a variety of reasons: evaluation, different features, old bookmarks and, probably most importantly, technical issues. The problem for analysts is that traditional cookies are browser specific, so each browser appears as a different user. Super cookies can quantify this issue. They can also keep a cross browser profile that remains even if a user uninstalls a specific browser and switches to a completely new one, but for the purposes of this exercise we are only looking to quantify the issue.

To create this capability the following logic can be used. Again this would be written in JavaScript. 
IF current browser is not equal to superCookie(browser) THEN
{
    LOAD GA CODE
    set custom variable "browser A > browser B"
    SEND GOOGLE REQUEST
    set superCookie(browser) = "browser B"
} ELSE {
    LOAD GA CODE
    SEND GOOGLE REQUEST
}
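A rough JavaScript version of the browser switching logic is sketched below. Several things are assumptions on our part: the hypothetical `superCookie` wrapper with `get`/`set`, the crude user-agent sniffing in `browserName`, and the choice of custom variable slot 2. Unlike the pseudo code, the sketch does not report a switch when the super cookie is still empty, so a user's very first visit is not counted.

```javascript
// Very rough user-agent sniffing, for illustration only. Order matters
// because Chrome's user agent string also contains "Safari".
function browserName(userAgent) {
  if (/Firefox\//.test(userAgent)) return "Firefox";
  if (/Chrome\//.test(userAgent)) return "Chrome";
  if (/Safari\//.test(userAgent)) return "Safari";
  if (/MSIE |Trident\//.test(userAgent)) return "Internet Explorer";
  return "Other";
}

// In a browser, pass navigator.userAgent as userAgent.
// Returns "previous > current" when a switch was detected, otherwise null.
function trackWithBrowserSwitch(tracker, userAgent, superCookie) {
  const current = browserName(userAgent);
  const previous = superCookie.get("browser");
  let switched = null;
  if (previous != null && previous !== current) {
    switched = previous + " > " + current;
    // Scope 3 = page-level, so the switch is tied to the page it occurred on.
    tracker._setCustomVar(2, "browser_switch", switched, 3);
  }
  tracker._trackPageview();
  // Remember the browser used for this visit.
  superCookie.set("browser", current);
  return switched;
}
```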

The above logic would enable you to see:

a) Which browsers people are switching from/to. This can help you plan future testing resource allocations, etc.
b) Which pages browser switches are commonly associated with (above logic does not show a direct correlation to a specific page, but you can store the final session page in the super cookie and use that to see if the user has made a browser switch on the same page, which may indicate a browser issue).
c) How many browser switches have occurred (set the variable to be page specific)
d) How many users use multiple browsers (if you keep a common visitor ID across multiple browsers)

Hopefully this article has helped to show you how super cookies can be used to improve your Google Analytics deployment accuracy. For actual code examples, please see our original super cookie post or download the zip file below. For any questions or enquiries, please contact us at insights@datalicious.com

 

 


Use Google Analytics custom variables and simple JavaScript to target site content in real-time

I was reading this on another blog earlier today and thought it was worth passing on. In essence, Google Analytics has provided a function to read the custom variables for targeting purposes. Although this is essentially nothing more than being able to read a cookie and use it to segment and target page content, it's nice to be able to use the same variables used by Analytics, as the targeting immediately has context in reports.
 
If you already use the custom variables (index 1-5), you can now use the following function to read the value and switch out content using some simple JavaScript.
 
_getVisitorCustomVar(index)

Returns the visitor level custom variable assigned for the specified index.

    pageTracker._getVisitorCustomVar(1);

Parameters
    Int index - The index of the visitor level custom variable.

Returns
    String - The value of the visitor level custom variable. Returns undefined if unable to retrieve the variable for the specified index.
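As a simple illustration of switching content based on this function, the sketch below picks a piece of content for the segment stored in custom variable slot 1. The segment names, the content strings and the `pickContent` helper are made up for the example; only `_getVisitorCustomVar` comes from Google Analytics.

```javascript
// Return the content mapped to the visitor's segment, or a default when the
// variable is not set or the segment has no specific content.
function pickContent(tracker, contentBySegment, defaultContent) {
  const segment = tracker._getVisitorCustomVar(1);
  if (
    segment !== undefined &&
    Object.prototype.hasOwnProperty.call(contentBySegment, segment)
  ) {
    return contentBySegment[segment];
  }
  return defaultContent;
}

// Possible browser usage (the element id "hero" is hypothetical):
//   document.getElementById("hero").innerHTML = pickContent(
//     pageTracker,
//     { student: "Student discounts available", business: "Talk to our business team" },
//     "Welcome"
//   );
```

Because the segment comes from the same custom variable that Analytics reports on, the targeted content can be matched directly against segments in your reports.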
 
Read the original blog post from Michael Whitaker here or check out the code reference in Google's help section if you want to find out more.

Automated marketing dashboards with data from multiple channels including web, retail and call center

Are you and your team still spending more time compiling reports in Excel than actually analysing them and taking action? Would your company benefit from an automated dashboard solution that mashes up and visualises data from all your various channels including online, retail and call center?

Check out our interactive multi-channel marketing dashboard example on Tableau Public or the screenshots below and you will get an idea of how powerful single source reporting with data from multiple channels can be.

If your company still needs a high performance data warehouse solution to power its business intelligence platform, you might also find our recent article on open source column based databases interesting. 



Omniture Site Catalyst and Google Analytics custom variable comparison

Many online businesses don't really need the complex capabilities of Omniture Site Catalyst; Google Analytics is good enough to meet their requirements, or almost good enough. Omniture's custom variables are very powerful, but I would estimate the average utilisation I observe (when we're not setting it up!) at less than 10%, which is interesting considering the cost. I've experienced the frustration of Google Analytics limitations, but I can also appreciate that for free it is a pretty nice piece of software. Both have their rightful place, so I thought it would be appropriate to illustrate the differences you need to consider before deciding which to deploy.

One of the interesting features of Google Analytics is the custom variable, which can be set to any value and then later used for custom segmentation. The functionality of this variable is different to Omniture's though and, unlike Omniture's variables, it is not configurable. The key difference is the way the value sticks. With Google, although the variable can be changed during a session, the value associated with reports does not change for the rest of that session, so it can effectively only hold one value per session. If it's changed again during that session, subsequent sessions will reflect the new value, but the current session will not. For some things this is OK, but for other applications it is quite frustrating.

Summary of key characteristics

Omniture Variables

- Can be related to pageviews (i.e. they don't have to stick). These are called props
- Can be related to success events, like purchases, registrations. These are called eVars
- Can change during the session (or not if you prefer, first or last setting can be configured)
- Can be made to expire
- Can be configured to be stacked (this stores a sequence of values)
- Many variables available, so you can customise lots of different things at the same time

Google Analytics Variable
- Cannot change during a session (can change between sessions)
- Can be stacked using JavaScript
- Has nice custom segmentation tools which are easy and instant, unlike Omniture ASI segments
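Stacking in JavaScript simply means keeping a sequence of values in one delimited string before writing it to the custom variable. The sketch below shows one way to do it; the `>` separator, the function name and the default 64 character cap are assumptions (check the current Google Analytics documentation for the actual length limit on custom variables).

```javascript
// Append newValue to a delimited stack of earlier values. Consecutive
// duplicates are skipped, and the oldest values are dropped once the
// joined string would exceed maxLength.
function stackValue(existing, newValue, maxLength = 64, separator = ">") {
  const values = existing ? existing.split(separator) : [];
  if (values[values.length - 1] !== newValue) values.push(newValue);
  while (values.length > 1 && values.join(separator).length > maxLength) {
    values.shift();
  }
  return values.join(separator);
}

// Possible usage with a ga.js tracker (slot 1, visitor-level scope):
//   const stacked = stackValue(pageTracker._getVisitorCustomVar(1), "search");
//   pageTracker._setCustomVar(1, "channel_stack", stacked, 1);
```

This is the clunky part: the stacking logic, trimming and any later parsing of the sequence in reports all have to be handled by your own code.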

Which to choose?

If you have the cash and the resources to use Site Catalyst properly, its custom variables are far beyond Google Analytics; the two aren't even close. If many variables are important to your business, then Google will probably not get you where you need to go. But if you just want standard usage information and possibly only require one variable to perform some basic segmentation, then Google Analytics is for you. You can also stack the custom variable, which will buy you a little extra capability, but it's pretty clunky. If you need some advice, drop Hamish a line at hogilvy@datalicious.com.

