Wizard of Oz Studies and the Application of Illusion for UX Research

How do you test something that doesn’t exist? The obvious answer is to build it. But what if it’s difficult, time-consuming, or expensive to build? You don’t want to risk wasting resources on an untested idea. A common approach that user experience (UX) designers use in such situations is the Wizard of Oz study.

If you’re familiar with the story then you probably remember that (spoiler alert) the great and powerful Oz turns out to be nothing more than an average man behind a curtain pulling levers and using smoke and mirrors to create the perception of a powerful being.

How Wizard of Oz Studies Work

The idea is pretty much the same for Wizard of Oz studies. During a study, a user is led to believe they are interacting with an automated piece of technology. In reality, designers have constructed a prototype with a façade that creates this impression. The brain of the technology is really just a researcher who reacts in the moment, supplying an appropriate response to each user interaction. In essence, the researcher is the wizard and the study participants are all Dorothy (Harwood, 2018).

This is why Nielsen Norman Group categorizes Wizard of Oz studies as a form of static prototyping that can be used for testing. Interactive (clickable) prototypes are built with predetermined responses that the computer provides to users automatically. Static prototypes give users responses provided by a person who knows the design of the product well. Another common type of static prototype used for testing is the paper prototype, in which a person acts as the computer, switching and arranging pieces of paper that represent a user interface in response to user interactions (Pernice, 2016).

The researcher (the wizard, left) replies as a smart speaker voice to a study participant in another room (right). Photo courtesy of Answer Lab
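For readers who like to see the mechanics, the setup described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a tool from any of the cited studies: the names WizardOfOzSession, send, and scripted_wizard are all hypothetical. The participant-facing code looks like an automated system, but every reply comes from whatever the "wizard" supplies, and each exchange is logged for later analysis.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class WizardOfOzSession:
    """Participant-facing facade whose replies secretly come from a human."""

    # Hypothetical hook: any callable that returns the wizard's reply.
    # In a live study this could wrap input() on the researcher's machine.
    wizard: Callable[[str], str]
    transcript: List[Tuple[str, str]] = field(default_factory=list)

    def send(self, participant_message: str) -> str:
        """Return a reply that appears automated to the participant."""
        reply = self.wizard(participant_message)
        # Keep every exchange so the behavioral data can be reviewed later.
        self.transcript.append((participant_message, reply))
        return reply


def scripted_wizard(message: str) -> str:
    """Stand-in for the human researcher, used here only for a dry run."""
    if "weather" in message.lower():
        return "It looks sunny today."
    return "Sorry, I didn't catch that."


if __name__ == "__main__":
    session = WizardOfOzSession(wizard=scripted_wizard)
    print(session.send("What's the weather like?"))
    print(session.transcript)
```

In a real session, the scripted stand-in would be swapped for a function that shows the message to the researcher and waits for a typed or spoken reply. The participant-facing side would not need to change.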

The term “Wizard of Oz study” is believed to have been coined by researcher John F. Kelley in the 1980s, when the technique was being used to test language recognition and voice technology. This field is still rather cutting edge today, so you can imagine how difficult and expensive testing it must have been in the 1980s. The Wizard of Oz technique allowed researchers to test their theories without first taking on the daunting task of building all the complex code necessary for a computer to understand spoken language.

Today the Wizard of Oz technique is often applied in other new areas of human-computer interaction, such as augmented reality and artificial intelligence, where building fully functional prototypes for testing is unrealistic and sometimes impossible. However, it can be a useful research technique for any type of design project (Geison, 2019).

Benefits of Wizard of Oz Studies

As mentioned before, this research technique can help provide insight into design ideas without the commitment of building actual functionality. What you learn from a Wizard of Oz study can help you feel more confident in pursuing an idea or decide an idea isn’t worth pursuing after all. You can also use it to identify and iterate through problems before dedicating resources to a full-on development process.

You can always just ask users in interviews, surveys, or other research methods whether a certain design idea would be useful for them. However, people are notoriously bad at expressing what they want, and asking them to imagine a hypothetical doesn’t tell you anything about how they’d behave with your design. Wizard of Oz studies help you collect behavioral data about how users actually interact with something, not just how they say they would (Welk, 2018).

Flexibility is also a big advantage of these studies. They can be very complex and formal or very basic, which means large and small businesses alike can take advantage of the benefits.

For example, I previously worked for a theatre that lacked the technological capability to take online subscription orders. Patrons and staff alike expressed that it would be better to have this option, but that alone isn’t enough data on which to base a big financial investment as a small nonprofit. Would a patron base of mostly older people actually use online ordering?

To find out, I created a very informal study using the form builder Formstack to offer patrons an approximation of the experience of ordering a subscription online, placed a link to this form on the website, and tracked what happened. Orders were processed by hand by box office staff (the wizards), but patrons didn’t know this. Any follow-up calls or emails about details just seemed like nice concierge services. Seeing how many people, both new and existing patrons, embraced and easily used this option told us that investing in this missing functionality would be worth it once resources were available. Interestingly, the prototype in this case actually became the temporary solution to the problem while we waited for the permanent one.

What You Need to Conduct a Wizard of Oz Study

The most important thing to remember when conducting a Wizard of Oz study is not to reveal to participants that responses are not computer-generated or “real.” Breaking the illusion can not only sow feelings of distrust and resentment toward researchers but also affect user behavior and thus your data.

What exactly a Wizard of Oz study looks like depends heavily on the design you’re testing and what your goals for testing are. No matter what, you must have some sort of prototype that users can interact with. This can be a specially designed shell without internal functionality, or you can use existing technology such as social media or other software to mimic the interactions users would have with your design. Finding the right solution can take a bit of creativity and ingenuity (Dam & Siang, 2020).

Another thing to consider is the level of involvement that will be required of your wizard. Make sure you design your study according to the human resources you will have available to conduct it. Is the researcher who has the right knowledge to be the wizard able to allocate the time required to respond to user interactions? If not, then you may need to adjust your study format. You should also plan for any other people you need: how many participants you want and who they should be, a moderator and notetaker if your test will be conducted in person, and data analysts.

Wizard of Oz Study Examples

Here are 3 examples of how Wizard of Oz studies have been used to test design ideas.


Photos courtesy of Sliced Bread

LifeMoves is a charitable organization in the San Francisco area that provides resources to families and individuals struggling with homelessness. The organization worked with the design firm Sliced Bread to research and design a proposed service to support LifeMoves clients in successfully transitioning out of LifeMoves programs and finding and maintaining their own housing.

Sliced Bread’s research revealed a few possible directions for design – chore coordination, discussion groups, motivational tips, reminders, surveys – that all used SMS and email functionality as a form of support. To quickly discover which ideas worked best for users, researchers set up a Wizard of Oz study in which they pretended to be intelligent messaging systems communicating with participants who were real LifeMoves clients.

The study helped Sliced Bread discover logistical and technical problems and led them to a new concept, LifeMoves CONNECT, which incorporates a combination of the design ideas tested. Sliced Bread is currently working with the organization to build a pilot program for the new service (LifeMoves, n.d.).


Photos courtesy of Sliced Bread

Sliced Bread also employed a Wizard of Oz study to design a collaborative digital parenting app called Rakkoon for Strajillion, a social media monitoring startup. The company wanted parents’ monitoring of their kids’ social media to involve the kids themselves, rather than just having parents act as spies in their kids’ lives.

The design team came up with the idea of alerting both kids and parents when kids interacted with questionable content. For the study, researchers recruited families and manually monitored the kids’ social media for a week. They used a Slack chat to deliver “automated” alerts about problematic content to everyone in a family. Then family members were able to discuss the possible problems with each other directly in the chat.

Parents in the study loved the experience, and about half of the kids actually liked it too. This research helped Sliced Bread develop a prototype and helped Strajillion learn about the viability of pursuing a new market opportunity. With new funding, they’ve started a private beta of Rakkoon and are so far receiving positive feedback (Rakkoon, n.d.).

UX Research for Human-Robot Interaction

In 2009 a group of researchers in Austria conducted a Wizard of Oz study to determine if Wizard of Oz studies could be useful for doing UX research for human-robot interaction. I know, it sounds weirdly circular, but just hang on because it does make sense.

Because designing and building humanoid robots is so complex, UX testing often cannot be conducted until late in the design process, when the robots are close to done and able to interact. By this point, UX problems tend to be expensive or extremely difficult to fix.

With human-robot interactions likely to grow, the researchers argued user experience needs to be considered more in design, and this delay in UX testing needs to be overcome.

They designed a Wizard of Oz study to simulate a person working with a humanoid robot to install a sheet of plasterboard at a construction site. They decided to use augmented reality to simulate the situation and created a 3D video game that included the robot and setting.

Before beginning the test, participants were briefed by a moderator on the verbal commands the robot accepted. During the test, they held and moved a physical piece of plasterboard and used the verbal commands to interact with the robot in the 3D game, which was projected life-size on a wall. The robot provided different cues in response – vocalizations, blinking lights. A researcher in another room watched and listened to the participants via cameras. When the participants spoke the predetermined commands correctly, the researcher used software to trigger the action in the robot.
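The wizard’s side of a setup like this can be pictured as a small lookup from a fixed command vocabulary to robot actions. The sketch below is only an illustration under assumptions: the study’s actual software and command list are not described here, so the command words, action descriptions, and the trigger function are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical command vocabulary; the study's real commands are not listed.
COMMANDS: Dict[str, str] = {
    "lift": "robot raises its end of the board",
    "lower": "robot lowers its end of the board",
    "stop": "robot holds its position",
}


def trigger(heard: str, log: List[str]) -> Optional[str]:
    """Called by the wizard after hearing what the participant said.

    Returns the action to play in the simulation, or None when the
    utterance is not one of the predetermined commands.
    """
    action = COMMANDS.get(heard.strip().lower())
    if action is None:
        # Unrecognized speech is logged but triggers nothing.
        log.append(f"ignored: {heard}")
        return None
    log.append(f"triggered: {action}")
    return action


if __name__ == "__main__":
    session_log: List[str] = []
    print(trigger("Lift", session_log))
    print(trigger("dance", session_log))
    print(session_log)
```

Keeping the vocabulary fixed means the human wizard only judges whether a command was spoken correctly, so the simulated robot behaves consistently from one participant to the next.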

After the tests, participants completed an AttrakDiff questionnaire to rate how well they felt different types of feedback from the robot helped them complete the task. The tests revealed that a combination of verbal and haptic feedback worked best for people. More importantly, though, this result showed that Wizard of Oz tests for human-robot interaction can produce helpful UX data, even though they may require more advanced technology (game design software) than Wizard of Oz studies typically use (Weiss, Bernhaupt, Schwaiger, Altmaninger, Buchner & Tscheligi, 2009).

Lions and Tigers and Robots, Oh My!

As you’ve seen, Wizard of Oz studies have a wide range of applications. They can even be used to test themselves! They can help you get answers more quickly than building a fully functioning prototype would, or break through to insights that would otherwise stay hidden behind resource or technology limitations. Just like in the story they’re named after, these studies show how a little bit of illusion can go a long way.


