Live testing for omnichannel experience design might be contentious, but from our recent experience, we know it offers insight far beyond what lab-based user testing can provide.
What is live testing?
Live user testing is user testing in the target live environment, with real people – i.e. the target users, who DON’T know they are being tested – to learn about the user experience we’re designing for. It seems to work especially well with cross-channel experiences: physical to digital. Unlike A/B testing, it’s for testing early prototypes of a solution. In the case of our government-funded R&D project, we tested quite complex demonstrators that capture real customer data – usage that could almost be considered an Alpha-level release. Although it is not.
It has become increasingly clear that we cannot exclude environmental factors, such as the user’s location, from the digital interactions we design. The two are now inextricably linked in our work.
By using rapidly built prototypes, live user testing gives us real-time user data and insight from something as close to the real thing as possible: a prototype that looks like the proposed product, acts like it, and even offers users genuinely useful interactions and content. By pairing these techniques with a meaningful value exchange – the same one the audience would engage with when the product, service or app goes live – we gain tremendous insight. Which is what it’s all about.
Any number of factors led me to conclude that ‘something had to change’ in the way we were user testing things. But looking back (which is largely what prompted me to write this post), there were probably four standout triggers that got me here. These were:
- A client’s request
- A/B testing
- An airport lounge guerrilla test
- Lean startup methods
About four years ago, I was running a user test in a lab for a travel company. We were doing the typical user testing thing of creating a test plan, agreeing the proposed tasks for the test and so on. However, the client wanted the first test to be completely open.
“Just use it, navigate around and tell us what you think.”
No task, no structure, completely open-ended. I dutifully included it, and while watching the test participants with the client, I realised this open brief was an excellent task – probably the most instructive across the whole test. In hindsight, it was probably this task that first planted the seed of distrust I have had in lab-based tests ever since, and got me thinking “there must be another way”.
As an aside, I have a similar distrust of focus groups – even though I’ve run them and sometimes really enjoyed the experience. Once, when user testing some ad concepts with a target group of 30–50 year old ladies, after a rubbish morning session I made sure there was plenty of alcohol flowing in the second session. It was hilarious in parts but, more importantly, very insightful. Which again is the point.
The second thing that probably changed things for me was A/B testing, which was becoming more prevalent in the UX industry. More and more people [especially developers I was working with] were turning to A/B or multivariate testing – which always seemed to make sense – even if it’s quite difficult to run with many of the tools that purport to make it possible. Some classic, well-documented examples include Google’s famous 41 shades of blue test (which several designers cited as a reason to leave Google), 37signals’ tagline tests and, not forgetting, Optimizely’s work on the Obama campaign. A/B testing seemed like a really good, sensible, almost obvious test to run. And yet, how often do you do it (ask yourself)?
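The mechanics of a basic A/B split are, at heart, simpler than the tooling suggests. A minimal sketch (hypothetical names – not any particular tool’s API) just buckets each visitor deterministically by hashing their id:

```python
import hashlib

def assign_variant(user_id: str, variants=("A", "B")) -> str:
    """Deterministically bucket a user into a test variant.

    Hashing the id means the same visitor always sees the same
    variant, and splits come out roughly even across many visitors.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# The same id always maps to the same bucket:
print(assign_variant("visitor-42") == assign_variant("visitor-42"))  # True
```

The hard part, of course, isn’t the bucketing – it’s having enough traffic and a clean enough metric to read the result.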
For the third factor, let’s skip forward to February 2013. I’d persuaded an agency’s client [and the design agency] to let us test the prototype I was building – for an airport car park booking service – in the airport departure lounge itself, with bored passengers waiting for flights. These passengers had probably just used the car park. Full marks go to the airport marketing manager for letting us do it. It’s not always easy to get clearance to go wandering around ‘airport side’ [the other side of passport security] to do research AND annoy real people. This I know from my time working at Flybe.com. This approach to research is guerrilla: it often involves looking like real staff and, I’ll be honest, it’s quite awkward to set up. But it was far more rewarding than the usual process of recruitment, labs and one-way mirrors that user testing often is. It was a revelation to me.
Donning a luminous vest will [a] get you into almost any location and [b] make people almost always trust you – staff and customers alike.
I had this view that qualitative and quantitative research approaches must differ: qual and quant don’t mix, aren’t the same, and you couldn’t run one at the same time as the other. I had also got it into my head that the quant numbers had to be very big. But what if we created a prototype, ran Google Analytics on it and got people who didn’t know they were in a test – i.e. in a live environment – to use it? What then? How many would “be enough”? Looking at Jakob Nielsen’s other benchmark report [and I don’t mean the ‘you only need 5 participants’ one], he states “my recommendation is to test 20 users in quantitative studies”. It seemed to me that we could do things differently.
Margin of error in quantitative studies (Jakob Nielsen study, 2006)
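For a rough sense of what 20 participants buys you, the textbook margin-of-error formula for a measured rate (worst case p = 0.5, 95% confidence) can be sketched in a few lines – Nielsen’s exact figures will differ with his assumptions, so treat this as ballpark arithmetic, not his method:

```python
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of an approximate 95% confidence interval
    for an observed rate p measured on n participants."""
    return z * math.sqrt(p * (1 - p) / n)

# With 20 participants the margin is roughly +/-22 percentage points;
# quadrupling the sample size only halves it.
print(round(margin_of_error(20), 3))   # 0.219
print(round(margin_of_error(80), 3))   # 0.11
```

The takeaway is that small-n quant is noisy but not meaningless – good enough to spot big effects, which is usually what a live test is after.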
And finally – the thing that really made it obvious to try things differently? Across all this time, lots of folks were working on and writing about start-ups releasing alpha and beta versions of their SaaS products to early adopters to get some much-needed feedback. Just look at the Lean Startup methodology. Why not take it a bit further and create a prototype of the actual product and release even earlier?! I guess these reasons [and probably many more] all contributed to me thinking: “there must be another way. A better way, something I’d trust more.” Then lo, the right project at the right time came along for us…
Fast-forward to January this year. JOYLAB had just won a Nesta and Arts Council funded R&D project, working with National Theatre Wales and NoFit State Circus to investigate ways to capture user data from people at a live event. The brief had many difficult variables and challenges, e.g. events were sometimes un-ticketed, so you wouldn’t know who was attending. The location could also throw up issues – it could be on the side of a mountain! But this time there was no client (to define the process), just us and our partners – who, God love them, gave us the space and support to just get on with it.
Screenshots of prototypes built for our R&D project
Over the last four months we’ve built multiple prototypes [both simple and complex] and tested interactions and processes – e.g. a print-to-text service, mobile websites and now the end-to-end experience. We’ve tested at five different events, sometimes across multiple evenings at each. Each time learning, tweaking, developing. We’ve run analytics (e.g. Google Analytics) across the prototypes and developed some in-built analytics to understand usage. Usage across the tests ranged from just 19 participants up to nearly 90, with over 370 test participants in total.
Only once (in the last test) did we also use recruited participants – 11 people, recruited from our earlier survey work and co-design sessions. We got them to come along to the show (incentivised with free tickets) and ran phone interviews with them in the days after the test.
Hook asset for a live test on our R&D project
We also have the luxury of academic oversight on the project from Prof. Hamish Fyfe and Dr. Daniel Cunliffe. The academic consortium ReDraw is attached to the project, looking at ways in which academia and industry can work together and “to create opportunities for new business models” (source). Being able to discuss and validate our project methodology with academics and gain some peer review has been hugely helpful.
The experience of running live tests like this has been very rewarding. The results have been humbling – remember you’re user testing your interaction design in a live environment. There is no safety net of a researcher “framing” the task for the test participants. These are real, unengaged people who are there to do something else. They are just using your prototype to get something done – even if you’ve incentivised it.
We learnt that running this type of test is viable. It’s not easy to set up, and it requires real work to engage people in the test, but the results we saw spoke for themselves. Not to mention, it forces you, the designer, to get into the real environment and observe real users.
It’s all about hooks! By doing live user testing in live environments, you learn that unless you’re part of the user’s existing reason for being there, you have to create a hook – something to incentivise the user to engage with your test. These hooks are mini advertising/marketing/comms tasks. They’re often part of the new flow you’re designing. It’s what we call doing ‘just enough’ to get the user to engage with the test. The question of fidelity is very pertinent here and needs some (considerable) consideration.
So that’s it. Whenever we get the chance now, we’ll do live user testing. We’ll persuade and fight for it with our clients. We have a method, a way – we’ve tested it and it works.
I know that many research processes have been hard won over the years, written down in books and used by many. But test processes aren’t holy rituals. We’re doing research, finding stuff out. Sometimes research seems to be considered a terribly careful and reverent thing. And yet really, it seems more important to just be curious. Prod and scratch and turn things over and out. Rather than follow something called a “usability test”.
Do some live user testing – learn some truths. If you want to find out more about how we’ve succeeded or just want to geek out with us, get in touch.