Analyzing User Test Data — The Devil is in the Detail

Loop11 · Jul 20, 2018

At Loop11 we like to practice dogfooding.

Recently we ran two usability studies to gather comparative benchmarking data on an existing design and then compare it with a new one.

Since the new design was yet to be released, we created two InVision prototypes, one for each design. We then created 4 tasks and 5 questions and generated two identical studies, one for each prototype.

Next, we set about running 100 participants through the prototypes, 50 on each, to see whether our new design had created a better overall experience for participants.

Disappointment Meets Confusion

An hour after launch we had the results back from the two studies, so I set about digesting the reports. It didn’t take me long to see that the average page views per task were higher for our new design.

As I’m sure many of you can attest, it hurts a little when something you’ve put a lot of effort into and believe is better proves to be worse. But in the interest of creating a better piece of software, I swallowed my pride and took a look at some of the highest page count participants to see how we’d failed.

I focussed on two participants who were large outliers, each having recorded roughly three times as many page views as the next nearest participant. These two participants alone were enough to elevate the page-view averages to the point where the new design was outperformed by the old one.
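To see how heavily a couple of outliers can weigh on a cohort of this size, here is a minimal sketch with made-up numbers (not our actual data):

```typescript
// Illustrative only: hypothetical page-view counts for a 50-person cohort
// showing how two outliers drag the average upwards.
const typicalViews = Array.from({ length: 48 }, () => 12); // assume ~12 views each
const outlierViews = [36, 36];                             // roughly 3x the rest
const allViews = [...typicalViews, ...outlierViews];

const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length;

console.log(mean(typicalViews)); // 12
console.log(mean(allViews));     // 12.96 – two participants lift the average by ~8%
```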

As I watched the videos of these two participants I only became more confused.

On the surface there was nothing out of the norm with their participation. Sure, they were having a bit of trouble, but nothing unusual, and they even managed to complete the tasks successfully.

Unexpected Insights

Having concluded the videos offered no clues, I wondered whether there was a bug in our software or something else that could explain the high page counts. The next data point I chose to investigate was the page path analysis for each of these participants. This is where I got my first clue.

Roughly every three pages, one of two strange URLs appeared. When you look at a path analysis, the URLs normally correspond to the website or prototype you are testing. In this instance I saw two completely different, unrelated websites.

My first thought: “This participant has been stuffing around!”

So I went straight back to the videos, thinking I must have somehow missed their frequent visits to these other websites. I was sure I’d catch them red-handed!

What did I find?

Bupkis.

I must be going crazy, right? That’s what I thought. So I loaded up the video a third time and watched it through again. This time I saw something. Here’s a snippet from the video:

Did you see it? Initially I had ignored it, but as I interrogated all parts of the frame, I became more interested in what was going on in the top left-hand corner. Here’s another look at where I focussed my attention:

What we’re showing here are two previously opened browser tabs that were automatically reloading themselves every 20 or 30 seconds. As it turns out, this participant had seven other tabs open in their browser, each on a different recruitment company website. The first two tabs were both participant recruitment websites that refreshed themselves periodically, presumably to ensure the participant was shown the latest available jobs.

This participant was using our browser extension to participate in the study, which enables our studies to run across different browser tabs. Every time one of the two hidden web pages reloaded, our code acknowledged the page load event and logged it against this participant.
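For anyone curious about the mechanics, here is a rough sketch of how an extension can end up logging loads from every tab. It uses the Chrome tabs API for illustration; the endpoint and payload are hypothetical and this is not Loop11’s actual code:

```typescript
// background.ts – illustrative sketch only, not Loop11's implementation.
// A listener like this fires whenever ANY tab finishes loading, so a hidden
// tab that auto-refreshes every 20–30 seconds still generates "page views".
chrome.tabs.onUpdated.addListener((tabId, changeInfo, tab) => {
  if (changeInfo.status === "complete" && tab.url) {
    // Hypothetical logging endpoint for the study recorder.
    fetch("https://example.com/api/page-views", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ tabId, url: tab.url, loggedAt: Date.now() }),
    });
  }
});
```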

I may be naive, but it had not occurred to me that this was something I might run up against. I went back into the Loop11 results and removed the visits to these pages, seeing as the participant had never actually viewed those tabs during our study. Once the page views for these two participants were corrected, to my great relief, the results for the new design started to look much better.
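As a rough illustration of that clean-up step, the sketch below filters out logged page views whose host isn’t the prototype under test before recomputing the averages. The host name and data shape are assumptions, not Loop11’s actual schema:

```typescript
// Illustrative clean-up pass over exported results – field names are assumed.
interface PageView {
  participantId: string;
  url: string;
}

// Hosts that legitimately belong to the study (e.g. the InVision prototype).
const PROTOTYPE_HOSTS = new Set(["projects.invisionapp.com"]);

function relevantViews(views: PageView[]): PageView[] {
  return views.filter((view) => {
    try {
      return PROTOTYPE_HOSTS.has(new URL(view.url).hostname);
    } catch {
      return false; // drop malformed URLs rather than counting them
    }
  });
}
```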

Making Use of a Gift

While in the middle of figuring out what was going on, I admit I was feeling pretty frustrated. After the fact, however, I realised this for what it was: a gift!

This unexpected insight into unusual behaviour had shown me how some admittedly niche participant behaviour could significantly, and incorrectly, impact the results our product produced. It also allowed me to schedule the work to ensure that automatically reloading pages that were not in focus would be ignored and not counted as page views within a study.
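Here is a minimal sketch of that fix as I would describe it, not the code we actually shipped: only count a load when the tab is active and its window has focus, so background tabs that refresh themselves are ignored.

```typescript
// Sketch of the fix – ignore reloads from tabs the participant isn't looking at.
chrome.tabs.onUpdated.addListener(async (tabId, changeInfo, tab) => {
  if (changeInfo.status !== "complete" || !tab.url || !tab.active) {
    return; // background tab: its automatic reloads shouldn't count as page views
  }
  const win = await chrome.windows.get(tab.windowId);
  if (!win.focused) {
    return; // tab is active, but its window is in the background
  }
  // ...log the page view as before
});
```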

So, in addition to gaining insights on the performance of the two designs themselves, I was also lucky enough to gain some valuable product insights. Without the combination of videos, page views and path analysis, this type of issue would have gone at best undiagnosed or at worst unnoticed.

If you don’t have comprehensive quantitative data, supported by the ability to drill down further with qualitative elements, like video or audio, then you are risking the success of your product.

I don’t say this enough, but it’s times like this that I really appreciate this sentiment and truly love our product.

Originally posted on the blog of Loop11 — User Testing Tool.

Written by Ben Newton, Product Lead and host of the True North design podcast.

