I attended a talk last week about some pretty cool big data analysis. Part of it included a demo about how the conclusions were drawn – again, very cool. Unfortunately, the speaker also repeated the same comment a couple of times: “Of course, we need to improve the user interface.” The interface was lots of words and a few boxes and highlights in what looked like a pretty standardized template, like the Mad Libs games we would play as kids: asking for a noun and a verb and a color and an animal and a food, filling them in, and reading aloud a story about “One day I went camping in the chandelier and we roped a red hamster and fed it pickles.” If you were familiar with what was being asked, you could get an understandable answer out. If not, much hilarity ensued.
Now, first and foremost, I am a software developer. I do a bunch of things, and I once wrote a post on another blog about how important it is to me that the project management pieces fit together as nicely as the actual software architecture, so I have a foot in both worlds. But fundamentally I take a technical view of things, and while I am a writer, my favorite medium for writing is a programming language, and my favorite entry mechanism is a programming editor.
So my favorite thing to do is to play with data, to build connections between different subsystems, to massage the data and its sources until the connections emerge. However the output looks. “We can fix that later.” Or I could’ve said, “Of course, we need to improve the user interface.” Because I’m an infrastructure kind of person.
At least I was. And then we tried a different way and got some pretty amazing results. And now I’m a convert.
The project was a manageability project, focused on applying data warehouse techniques to performance and manageability data: correlating log records, performance data, configuration data, health check output, and workload information to build a tool that allows at-a-glance management across the entire software stack, standardized across various data manageability frameworks.
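To make that concrete, here’s a minimal sketch of the kind of cross-source correlation involved, in Python with pandas. The table layouts and column names are my own invention for illustration, not the project’s actual schemas; the idea is simply to align each log event with the nearest preceding performance sample from the same host.

```python
import pandas as pd

# Hypothetical extracts: log events and performance samples, keyed by host and time.
logs = pd.DataFrame({
    "host": ["node1", "node2"],
    "ts": pd.to_datetime(["2015-03-01 02:14", "2015-03-01 02:16"]),
    "level": ["ERROR", "WARN"],
    "message": ["ingest failed", "disk latency high"],
})
perf = pd.DataFrame({
    "host": ["node1", "node2"],
    "ts": pd.to_datetime(["2015-03-01 02:13", "2015-03-01 02:15"]),
    "disk_io_ms": [480.0, 510.0],
})

# Align each log event with the nearest preceding perf sample on the same host.
logs = logs.sort_values("ts")
perf = perf.sort_values("ts")
correlated = pd.merge_asof(logs, perf, on="ts", by="host",
                           tolerance=pd.Timedelta("5min"))
print(correlated)
```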
We knew we had lots of data, but we needed to turn that data into information. Not much different from any other sensor analytics or BI problem, actually. But with lots of objects to manage in a big data world, lots of workloads, lots of processes, lots of entities, there was no shortage of things we could do with it. The real question was: what makes sense?
The answer that we came up with was that we wanted to provide the tools for the various big data administrators to tell their stories. To compare performance from last week, last month, last quarter, last year. To predict when they would run out of headroom in their clusters based on workloads. To see when workloads were outliers. To be able to tell why things failed last night, looking at the log records and health check results from their Hadoop distro. To handle the “Why is my query running long?” question when a user calls in after an ad hoc query – was it a disk failure or a crazy query plan?
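The headroom question, for instance, can start as something as simple as trend extrapolation. Here’s a sketch with made-up numbers (the real tool drew on much richer workload information) projecting when disk usage would cross a threshold:

```python
import numpy as np

# Hypothetical daily peak disk usage percentages for the last week.
daily_peak_pct = np.array([61.0, 61.8, 62.5, 63.9, 64.4, 65.2, 66.1])
days = np.arange(len(daily_peak_pct))

# Fit a linear trend and project forward to the day usage crosses 90%.
slope, intercept = np.polyfit(days, daily_peak_pct, 1)
days_to_limit = (90.0 - daily_peak_pct[-1]) / slope
print(f"~{days_to_limit:.0f} days of headroom at the current growth rate")
```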
That is, we had to focus on our users and what experience we wanted them to have. How to make sure the tool fit their needs like a perfectly balanced hammer or screwdriver or saw.
So we started talking and walking through some scenarios based on our own past experiences in this domain. Not individual use cases (though we sometimes did get to that level), but broader workload tasks and “playing” with the data, working through “If I were debugging a problem, where would I look?”
This is trickier than it sounds, because each person keys on different hints when they’re on a fishing expedition to figure out what’s going on. Dietrich Dörner calls these “supersignals” in one of my all-time favorite books, The Logic of Failure. Frankly, I read this book, devoured it, really, and it changed my life in lots of ways. Not just for the projects I was working on at the time (not the one I’m talking about here) but also for the team dynamics and ecosystem in which we operated. I still refer to my marked-up dead-tree copy, as well as the version on my tablet. Amazing book. HIGHLY RECOMMENDED.
Supersignals are each individual user’s way of dealing with complexity, “the existence of many different independent variables in a system.” As Dörner says,
Complexity is not an objective factor but a subjective one. Supersignals reduce complexity, collapsing a number of features into one. Consequently, complexity must be understood in terms of a specific individual and his or her supply of supersignals. We learn supersignals from experience, and our supply can differ greatly from another individual’s. Therefore, there can be no objective measure of complexity.
This is why one person looks at the log records first, then workload data for a particular job, and then the memory usage to discover an obscure problem with data ingest, while another person looks at transaction rates and disk IO rates before discovering the same thing. And a third person looks at CPU rates and disk space usage and takes much longer to arrive at the same conclusion – it’s the set of supersignals they’re used to.
For situations like this, you simply cannot prescribe a single precise workflow for arriving at an explanation of a particular scenario when you’re looking forward at an issue. (The answer is generally pretty clear with hindsight, though.) Instead, the UX goal is to make it easy for each person to access data his or her own way, in whatever order he or she prefers, to suss out the factors that help explain and resolve the current situation. Want to monitor logs by time? Sure, easy. Watching workload throughput over time and for a particular application? Fine! How about health checks at the server level? No problem – watch those too!
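A toy illustration of that flexibility, assuming a single denormalized store of management events (the field names here are invented for the example): every question is just a different combination of filters, so no particular entry point is privileged.

```python
from datetime import datetime

# Invented records standing in for logs, throughput samples, and health checks.
events = [
    {"kind": "log", "ts": datetime(2015, 3, 1, 2, 14), "server": "node1", "detail": "ingest failed"},
    {"kind": "throughput", "ts": datetime(2015, 3, 1, 2, 0), "app": "etl", "rows_per_s": 1200},
    {"kind": "healthcheck", "ts": datetime(2015, 3, 1, 3, 0), "server": "node1", "status": "degraded"},
]

def view(**filters):
    """Return events matching every given attribute; any combination, any order."""
    return [e for e in events
            if all(e.get(k) == v for k, v in filters.items())]

# Health checks per server, or workload throughput per app; the entry point is yours.
print(view(kind="healthcheck", server="node1"))
print(view(kind="throughput", app="etl"))
```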
But sniffing out the places where issues are making your antenna twitch is only one piece of the puzzle. The other piece is providing the tools that make it easy for you to tell your story (or, for this project, for the big data administrator to tell his or her story) to other people – other administrators, or managers, or even CxOs wanting to know whether a big change in IT budget is needed to handle upcoming trends, or when to expect the next quarterly rollup.
The UX for explaining these things to others is just as important as the UX for finding issues. Both of these sides of the tool have to fit naturally with the way you think, like a chisel in a sculptor’s hand.
So we took a UX-driven design (UXDD) strategy. We started with “What data is available, what tests could we do, what issues do people often see here?” But we didn’t stay at that point very long. Instead we jumped quickly to something very different: we asked, “Why would I care about this data? In what scenarios does this feed my need for information, and how do I use it there? Translated to different units? Grouped across all activity? Isolated into a discrete chunk? Over time? One-time? Associated with similar workloads I recognize because they look like this? And so on. What is that experience like?” That question drove us. Once we’d worked through it, the what of the actual implementation details was trivial.
This approach resulted in sketches and wireframes and mockups and “how do you connect the data?” and “when and how do you move from this topic to that?” and “what kinds of charting make sense for this?” and other similar questions for anyone we could find who had any interaction with the subject at hand. That meant lots of fast user review and lots of informal feedback. Short cycles and lots of prototyping with show and tell. And lots and lots of discussion about what might and might not work, where everyone’s feedback was critical.
As the wireframes and mockups solidified, UXDD led us next into demos and discussions that invited people in, talking about how the problems they see would manifest themselves in different, sometimes ambiguous, ways: “That hints that we should go look at the event log to see if there is anything from this component.” A radically different style of demo and discussion than what I’d typically used in the past, because suddenly it was real and personal and shared ownership, rather than instructions and canned scenarios and something akin to a scripted presentation spiel.
We learned a really important lesson here, one picked up from another great book, Sketching User Experiences: Getting the Design Right and the Right Design by Bill Buxton. The big insight was that we needed to show sketches when getting user input. Not final, fancy drawings and screens with color and fixed positioning and fonts that looked like we’d invested lots of time in them, but obvious sketches and rough outlines. If things looked too done, too formal, we found that we didn’t get real feedback and contribution. At best we got “OK, that would work,” or critiques of color choices, or maybe small screen layout hints. People didn’t want to criticize something that they thought was too “done” for their input to matter.
So we intentionally made things look sketchy for these discussions – no shading or detailed data, obviously fake-data graphs. And especially not anything that suggested we’d invested a lot of time and effort in it. We’d bring up a drawing tool and just start drawing circles and arrows and numbers and lines together on the fly, from a basic “nothing invested here, just to get us started” template. We didn’t want people to hold back, and with a sketch, it seemed they had no problem suggesting changes. Sketches converted our discussions into a shared development/discovery process, rather than “Can you comment on this?” This turned out to be a pretty critical breakthrough for getting great feedback to refine our various screens.
In effect, we refined our UX for having a UX design discussion based on the style and depth of feedback received.
Those discussions also drove the infrastructure design for additional data acquisition, frequency, retention knobs, and the like. We took the thinking and feedback and worked backwards to figure out what kinds of data we needed, how long to keep it for both decision making and retention, what it might be combined with, and on what dimensions (not just temporally, but also things like workflow, cluster, and business grouping). Then we worked to make sure that those associations were easy and natural to do. Like breathing.
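For flavor, here’s a hypothetical sketch of the kinds of knobs those discussions produced; the names and values are illustrative, not the project’s actual configuration:

```python
# Acquisition/retention knobs, worked backwards from the questions users asked.
ACQUISITION = {
    "log_records":   {"frequency": "continuous", "retention_days": 90,
                      "dimensions": ["time", "workflow", "cluster", "business_group"]},
    "perf_samples":  {"frequency": "15s", "retention_days": 400,  # enough for year-over-year comparison
                      "dimensions": ["time", "host", "workload"]},
    "health_checks": {"frequency": "5min", "retention_days": 30,
                      "dimensions": ["time", "host"]},
}
```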
While the final details for the user interface for types of data or new services or new connections were being tested and finalized, we were busy hardening the plumbing and data acquisition. Then, with it all integrated together, we could again examine the UX and play with the data to figure out whether there were additional ways to extract information from data, and talk more with our users to see what they could do now that they couldn’t do before.
This integration phase was the key, because it was only then that we’d discover disconnects between human thinking and what the cluster and its component entities did, everything from timescales to data that needed more translation before being human-friendly and query-ready. But the flexible architecture we chose made that level of change pretty easy.
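A trivial example of the kind of translation I mean, assuming a collector that emits epoch milliseconds and raw byte counts (the values are hypothetical):

```python
from datetime import datetime, timezone

# Raw collector output: machine-friendly, not human-friendly.
raw = {"ts_ms": 1425175500000, "bytes_read": 536870912}

# Translate into something a person can read at a glance.
friendly = {
    "time": datetime.fromtimestamp(raw["ts_ms"] / 1000, tz=timezone.utc).isoformat(),
    "read": f"{raw['bytes_read'] / 2**20:.0f} MiB",
}
print(friendly)  # {'time': '2015-03-01T02:05:00+00:00', 'read': '512 MiB'}
```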
In this way, the product advanced in leaps and bounds, with a breathtakingly fast turnaround. And it led to fast understanding and a sense of ownership from our user community too. In the space of a half hour or so, it was easy to take users from “what is this?” to “can I have this now because I need it to answer the questions my boss just asked me about our resource usage, even though you didn’t show that exact scenario…”
Had we followed the more traditional use-case-driven design, “A user wants to debug a problem with a disk,” well, the result would most likely have been much less flexible. Do A, then B, then C, which will prompt action D, and if that doesn’t work, step E. And displays that you can make sense of once you know what they mean, kind of like detailed weather models: lots of interpretation required rather than intuitive understanding. These kinds of things are great with hindsight and with high predictability. But many real-world problems aren’t like that, and it’s the hard ones that challenge you.
The technology in the talk I attended last week was pretty darn cool, but that coolness was lost in a sea of words and numbers, so the important information, the actual value from the system, wasn’t visible. It was technology without an immediately compelling story that invited people in. After the talk, I could see how I might use it conceptually, but I couldn’t detail it out and integrate it into my big data worldview. Why? Because the interface made it appear that my involvement was entirely an afterthought, rather than the purpose of the exercise.
If the technology had started with the UX, and its related emphasis on why and how before worrying about the what of the technology, I could’ve been sold. And for startups in this space, that’s the real key to traction.
Thanks to Gunnar Tapper for reviewing an early draft of this post.