bob-o-rama.com

The native home of anything sufficiently 'Bob'.

Axis change

Simple data visualisation

7th July 2008 - 8:50 - bob

There is a very simple trick that is obvious once you know it, but may not be obvious at first.

Many reports, not just those from web analytics products, look like this:

analytics axis change chart before

In this case we are looking at a 'Most Frequent Referrers' report, with a classic bar chart. Don't worry about the data (it happens to be from this site); what I want you to pay attention to is the chart and how pointless it is.

All it's telling us is that the most frequent referrer (the leftmost bar) is sending us the most traffic (by definition - duh!), the second-most frequent referrer is sending us a bit less, and so on.

Whenever we see a chart like this we are looking at:

analytics axis change diagram before

...where the x-axis is showing 'rank' and thus volume and the y-axis is also showing volume (both 'Visits' in this case). With a little thought we will see that we are wasting an axis.

What we really ought to have is something more like:

analytics axis change diagram after

...where there is rank/volume on the x-axis and some form of 'quality' (not volume based) measurement on the y-axis.

In NetInsight we would use this button to change the dataset to be shown on the y-axis:

analytics axis change change axis

Here we are selecting the 'Bounce Rate' metric (the percentage of people who land on the site and only view one page - lower is usually better).

We then see:

analytics axis change chart after

Which is actually useful - we can instantly see the high volume referrers on the left, with not unreasonable bounce rates, but in ranks 3, 4 and 6 (wikipedia, iwantmymum and linusm) we see referrers that have even better (remember, lower is better) bounce rates.

If we start building the volume of traffic from these sources then we should hopefully see increasing volume while maintaining the higher quality.

This tactic should work with any volume-based x-axis, such as Views, Visits or Visitors, and any quality-based metric on the y-axis: bounce rate, conversion rate, average order value, cost per acquisition etc.
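For anyone wanting to try the same idea outside NetInsight, here is a minimal Python sketch. It is not how NetInsight does it; the `ReferrerStats` class, the field names and the `better_than_top` helper are all made up for illustration. It ranks referrers by volume (the x-axis) and attaches bounce rate as the quality measure (the y-axis), then picks out lower-ranked referrers whose bounce rate beats the top-volume ones.

```python
from dataclasses import dataclass


@dataclass
class ReferrerStats:
    """Hypothetical record for one referrer; field names are assumptions."""
    referrer: str
    visits: int               # volume measure, used for the x-axis rank
    single_page_visits: int   # visits that viewed only one page

    @property
    def bounce_rate(self) -> float:
        # Percentage of visits that bounced; 0.0 when there were no visits.
        if self.visits == 0:
            return 0.0
        return 100.0 * self.single_page_visits / self.visits


def rank_by_volume(stats):
    """Sort by visits (descending) and return (rank, referrer, visits, bounce rate)."""
    ordered = sorted(stats, key=lambda s: s.visits, reverse=True)
    return [
        (rank, s.referrer, s.visits, round(s.bounce_rate, 1))
        for rank, s in enumerate(ordered, start=1)
    ]


def better_than_top(ranked, top_n=2):
    """Referrers ranked below top_n whose bounce rate is lower than every top_n entry."""
    threshold = min(row[3] for row in ranked[:top_n])
    return [row[1] for row in ranked[top_n:] if row[3] < threshold]
```

Feeding `rank_by_volume` a list of `ReferrerStats` and passing the result to `better_than_top` gives the referrers worth building traffic from: lower volume, but better quality than the current leaders.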

I know this is useful for someone; I hope it's been useful for you.

bad day

Is bob having a bad day

24th June 2008 - 17:15 - bob

Occasionally people ask me if bob is having a bad day. Well, this page aims to answer that question.

On any given day bob may be having a bad day, but on the whole most days aren't too bad.

To answer the question fully you should look to the following for guidance:

  • Is Bob working on a project that is already running late?
  • Has Bob had sufficient tea/coffee today?
  • Has Bob had enough sleep? Lack of sleep may simply be due to too much work, being woken up by Linus or Gabriel, or some combination of the two
  • Does bob have to do something that he really doesn't want to do today? This may include, but not be limited to, answering stupid questions, writing documentation or filling-in expenses claims

If you have any doubt about the above, you may find it useful to ask and maintain some sort of score.

Having a bad day isn't limited to just the above; it could be that he just doesn't want to be nice to you.

At some point, this 'is bob having a bad day' page may be more useful - but not for now.

AVG Response

Initial thoughts and response to AVG linkscanner #wa

22nd June 2008 - 11:18 - bob

AVG is doing some interesting things. I think that my own perceptions of what they are up to are biased by my own interest in web analytics - after all, regular users of the software really don't care what their AV is doing.

Useragent filtering

The web analytics platform that I am most used to is Unica Affinium NetInsight, both in the on-premise (and possibly logfile-based) and on-demand (hosted, and thus most likely to be JavaScript pagetag-based) versions.

Due to the logfile-based nature (at least historically) of NetInsight it has always had the need to filter out Robots/Spiders, monitoring agents and all sorts of other garbage that litters the data. As such it's trivial to segment away or exclude the current AVG useragent, either in your own installation or, with a brief request to the on-demand team, from your hosted install.
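As a rough illustration of that kind of exclusion (this is not NetInsight's configuration, just a Python sketch over already-parsed hits), useragent filtering boils down to a list of patterns and a test. The patterns below are placeholders I have made up, not the real AVG string, and the `user_agent` key is an assumed field name.

```python
import re

# Hypothetical patterns: placeholders only, NOT the actual AVG useragent.
EXCLUDED_AGENT_PATTERNS = [
    re.compile(r"examplebot", re.IGNORECASE),
    re.compile(r"example-linkscanner", re.IGNORECASE),
]


def is_excluded(user_agent: str) -> bool:
    """True if the useragent matches any of the exclusion patterns."""
    return any(pattern.search(user_agent) for pattern in EXCLUDED_AGENT_PATTERNS)


def filter_hits(hits):
    """Keep only the hits whose useragent matches none of the patterns.

    Each hit is assumed to be a dict with a 'user_agent' key.
    """
    return [hit for hit in hits if not is_excluded(hit.get("user_agent", ""))]
```

This only works for as long as the useragent stays recognisable.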

Of course, this is already broken - AVG already seem to be altering the useragent string to something that looks completely real, and thus impossible to block all by itself.

Understanding AVG

The main thing that I would like to know about right now is the sort of environment that AVG presents to JavaScript - what sort of screen resolution, locale, plugin list, cookies etc.

If the above presents a recognisable fingerprint it would then be possible to filter based on these multiple criteria.

Of course, it may be the case that it presents the actual environment of the host, which would make things much harder to work with, although I don't think that this is likely to be the case.
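If a fingerprint does exist, filtering on multiple criteria might look something like the Python sketch below. Everything in it is an assumption: I don't know what environment AVG actually presents, so the fingerprint values and the field names (`screen`, `plugins`, `cookies_enabled`) are invented purely to show the shape of the check.

```python
# Hypothetical fingerprint: these values are invented for illustration,
# not observed from AVG.
SUSPECT_FINGERPRINT = {
    "screen": "0x0",
    "plugins": (),
    "cookies_enabled": False,
}


def matches_fingerprint(hit, fingerprint=SUSPECT_FINGERPRINT, min_matches=None):
    """True if the hit agrees with the fingerprint on enough criteria.

    By default every criterion must match; pass min_matches to allow a
    looser comparison.
    """
    required = len(fingerprint) if min_matches is None else min_matches
    matched = sum(1 for key, value in fingerprint.items() if hit.get(key) == value)
    return matched >= required


def filter_by_fingerprint(hits, **kwargs):
    """Drop hits that match the suspect fingerprint."""
    return [hit for hit in hits if not matches_fingerprint(hit, **kwargs)]
```

Requiring all criteria to match keeps false positives down; loosening `min_matches` catches more but risks throwing away real visitors.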

How AVG executes

JavaScript pagetags typically create the URL that they are going to request from a complex block of code. I propose (and I stand to be corrected, as AV isn't my thing) there are four main options for how AVG can function.

  • Static analysis of the JavaScript
  • Sandboxed execution of JavaScript
  • Sandboxed execution of JavaScript that allows the tag to 'fire' to the outside world
  • Actual execution of JavaScript

Now - I don't *think* it's doing static analysis, although I have colleagues that know about such things - I'll have a word on Monday.

I hope (for the sake of AVG) that it isn't executing the code for real - that would open-up the opportunity for malicious exploitation - although we may be able to exploit it ourselves. :-)

Which leaves some form of sandbox. This should be easy enough to implement as JavaScript runs in one anyway. AVG would just need a separate instance. The real question is what does the sandbox provide for an environment and how is it allowed to interact with the rest of the world - at least we know that it allows extra requests to be made.

References - further reading

http://www.grisoft.com/ww.72

http://www.grisoft.com/ww.faq.num-1066#faq_1066

http://www.grisoft.com/ww.faq.num-1188#faq_1188

Disclaimer

All this is pure speculation, but it almost makes me want to sign-up to see what it does.

All for now. Comments/thoughts via the usual channels.

Link Visualisation

Thoughts on a link visualisation tool

19th June 2008 - 14:08 - bob

I have been having a thought - something to do with visualising the relationships between sites/blogs/posts/pages.

Clearly others have gone before me. I rather like:

http://www.touchgraph.com/TGGoogleBrowser.html, http://www.aharef.info/static/htmlgraph/ and http://home.snafu.de/tilman/xenulink.html for various reasons.

None of these quite do the job that I need - so if I'm going to create something myself I need some:

  • network visualisation, including some de-cluttering algorithms
  • site indexer (perhaps using web analytics data)
  • source of link information for links going in the other direction
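A first stab at the data side of this could be as small as the Python sketch below. It assumes the indexer (whatever it ends up being) hands over plain `(source, target)` pairs, keeps the links in both directions, and does a very crude de-clutter by dropping barely-connected nodes. The function names are my own, and the pruning rule is a stand-in, not a proper layout algorithm.

```python
from collections import defaultdict


def build_link_graph(links):
    """Build outbound and inbound adjacency maps from (source, target) pairs."""
    outbound = defaultdict(set)
    inbound = defaultdict(set)
    for source, target in links:
        outbound[source].add(target)
        inbound[target].add(source)   # the link going in the other direction
    return outbound, inbound


def prune_low_degree(outbound, inbound, min_degree=2):
    """Crude de-cluttering: keep only nodes with at least min_degree links
    (inbound plus outbound)."""
    nodes = set(outbound) | set(inbound)
    return {
        node
        for node in nodes
        if len(outbound.get(node, ())) + len(inbound.get(node, ())) >= min_degree
    }
```

The surviving node set could then be handed to whatever drawing library gets picked.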

To be fancy this could all be done in 3D, but I'm not sure it would be any more useful than something in 2D.

And then I'll become fabulously rich.

Very Exciting

That post where I apologise for not posting

16th June 2008 - 21:16 - bob

This isn't really the post where I apologise for not posting, but as I was writing the subject I found myself thinking about that sort of thing.

twitter logo

Anyhow - I have been playing with Twitter this evening - figuring out if there is anything in it for me.

Up until this point I have followed (stalked) people via the RSS version of their feed - but now I am using the traditional route.

You will see at the bottom of the right-hand menu on my site a little 'Latest Twitter' thing, which in an eating-its-own-tail sense should also show the auto-added-to-twitter items whenever I post something.

Let's see how this plays-out.