conversational user interface

Teammates: Jia Liu, Xueting Zhang

My role: conversation modeling | personas | experience prototyping | video production


It can be intimidating to converse with fast-talking locals in a foreign country. In a future with driverless cabs, might the absence of humans present an opportunity to improve the user experience for tourists?


Rover is a conversational user interface for driverless cabs. It creates seamless ride experiences in a foreign land by assisting users with a suite of tasks related to reservation, navigation, and exploration.

Market Research

Our guerrilla research revealed that the conversational market for driverless cabs was ripe for innovation.

Our competitive analysis showed a fragmented landscape of products and services, comprising established car manufacturers and startups. While conversational user interfaces (CUIs) and autonomous vehicles (AVs) are both active areas of product development, the combination of CUIs within AVs is an emerging technology under private development, with limited officially published market data. As of October 2017, some potential players in the space include Google Assistant, Teslabot, Bot Studio, and Uber.

A table summarizing our competitive analysis.

Experience Prototyping

I advocated narrowing our problem space to constrain the speculative distractions of designing for the future.

Based on our analysis of the market for driverless cabs, we decided to focus our design on business travelers. The Global Business Travel Association reported an estimated $1.2 trillion spent on business travel and 488 million business trips in 2015 alone; we saw this as a market with immense opportunity. Accordingly, we developed a persona for a businessman from Latin America visiting an unfamiliar city in the U.S.

The Quirks of Conversation Design

Ubiquitous. Unforgiving. Short auditory attention spans.

Conversational user interfaces present a number of unique challenges to designers. For one, people tend to process visual information much better than auditory information. This means that a conversational equivalent of a data-rich graphical interface will never work: CUIs need to speak a simple and succinct language, with lots of repetition built in. CUIs must also be highly flexible in their journey from intent to fulfilment. Finally, CUIs must be exceptionally adept at error recovery, to account for all the times that a user forgets or fails to understand the next step.
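The error-recovery idea can be sketched in code. This is a minimal, hypothetical illustration (the prompts, the `understand` parser, and the destination vocabulary are all invented for the example): instead of repeating one question verbatim, the CUI cycles through progressively simpler phrasings before falling back.

```python
# Hypothetical sketch: error recovery via bounded reprompting.
# If the user's reply isn't understood, the CUI rephrases the
# question rather than repeating it, then falls back gracefully.

def ask_with_recovery(prompts, understand, get_reply):
    """Try each phrasing in `prompts` until `understand` parses a reply."""
    for prompt in prompts:
        reply = get_reply(prompt)
        parsed = understand(reply)
        if parsed is not None:
            return parsed
    return None  # hand off to a fallback, e.g. a human operator

# Example: confirming a destination with progressively simpler phrasings.
prompts = [
    "Where would you like to go?",
    "Sorry, I didn't catch that. What's your destination?",
    "Please say just the name of the place you're headed.",
]

def understand(reply):
    reply = reply.strip().lower()
    return reply if reply in {"airport", "downtown", "hotel"} else None

replies = iter(["umm", "the airport? no wait", "airport"])
result = ask_with_recovery(prompts, understand, lambda p: next(replies))
```

Bounding the number of reprompts matters: endless "I didn't understand" loops are one of the fastest ways to lose a user's trust in a voice interface.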

For my thoughts on the politics of conversational user interface design, read this article.

Conversation Modeling

We focused on a few key intents and modeled each of those conversations from beginning to end via happy as well as indirect paths.

We created a personality for our CUI:

Rover is male, with a full, deep voice. He stays with the same user across rides. He remembers past conversations and references recent ones. His overarching goal is to make cab riders feel more at ease as they order and ride cars. He is able to access maps, traffic information, local news, local dining options, local events, and transportation schedules.

We also created a list of common intents that business travelers have. These would translate to functions that the CUI is able to assist with, such as ordering a ride from the phone, setting the language, finding and entering the correct cab, updating the destination, prebooking a cab, finding a place to eat, booking tickets to events, and listening to podcasts and audiobooks.

To ground our work, we developed a conversational model that identified our target user’s intents and mapped a range of possible utterances to responses from the system. We also came up with a happy path for each intent. A happy path is a direct conversation that leads from utterance to fulfilment of an intent. Here's an example of a happy path:
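At its simplest, the mapping from utterance to intent to response can be thought of as a keyword lookup. The sketch below is a deliberately naive illustration of that idea, not our actual model — the intent names, keywords, and responses are invented for the example, and a production system would use an NLU service rather than substring matching.

```python
# Hypothetical sketch of the utterance -> intent -> response mapping.
# Keywords and responses are illustrative, not from the actual model.

INTENTS = {
    "order_ride": {
        "keywords": ["cab", "ride", "pick me up"],
        "response": "Sure. Where should I pick you up?",
    },
    "find_food": {
        "keywords": ["eat", "restaurant", "hungry"],
        "response": "I know a few places nearby. Any cuisine in mind?",
    },
}

def match_intent(utterance):
    """Return the first intent whose keywords appear in the utterance."""
    text = utterance.lower()
    for name, spec in INTENTS.items():
        if any(kw in text for kw in spec["keywords"]):
            return name
    return None

def respond(utterance):
    """Happy path: utterance matches an intent and gets its response."""
    intent = match_intent(utterance)
    if intent is None:
        return "Sorry, I didn't get that. Could you rephrase?"
    return INTENTS[intent]["response"]
```

The happy path runs straight through `match_intent` to a response; the final `return` in `respond` is where the indirect paths (clarifying questions, error recovery) would branch off.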

A model of a happy path from utterance to fulfilment. In this case, the intent is to order a cab from the user's phone.

A model of a complex conversation showing the pre-attentive and attentive states of the CUI.

Final Design

We ultimately incorporated graphic elements, in the form of text messages, to accentuate the power of our voice interface.

Our initial design involved a voice-only interface. We designed Rover to be able to talk to its users. We imagined that this would happen both inside the car as well as outside it. However, during our iteration process, we realised that voice was not appropriate for all the interactions that Rover would be expected to perform. For example, Rover would need to confirm pickup times with its users. These interactions involve contacting the users wherever they are. A voice-based app on the user’s mobile phone felt too intrusive for this purpose, so we came up with a text-based conversational interface. Rover texts its users using a chatbot, just like a human chauffeur might. In addition to a text-based option, we also decided to add a simple visual cue inside the car to indicate Rover’s pre-attentive and attentive states.

In our final CUI design, we decided to incorporate vocal, text-based, and visual elements to maximize the robustness of our design.

Voice is used for in-car conversations.

Text is used for out-of-car contact and media-rich conversations, such as restaurant recommendations.

Visuals are used as an in-car cue to indicate Rover's pre-attentive and attentive states.
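The pre-attentive/attentive distinction is essentially a two-state machine driven by the conversation. The sketch below is a hypothetical illustration: the wake word, the closing phrases, and the light behaviors in the comments are assumptions for the example, not part of our actual design.

```python
# Hypothetical sketch: the in-car visual cue as a two-state machine.
# A wake word moves Rover from pre-attentive (passively listening)
# to attentive (actively engaged); closing the exchange moves it back.

class AttentionCue:
    WAKE_WORD = "hey rover"  # assumed wake word, for illustration

    def __init__(self):
        self.state = "pre-attentive"  # e.g. a dim ambient light

    def hear(self, utterance):
        text = utterance.lower()
        if self.state == "pre-attentive":
            if self.WAKE_WORD in text:
                self.state = "attentive"  # e.g. a brighter, pulsing light
        elif text in {"thanks", "that's all"}:
            self.state = "pre-attentive"
        return self.state
```

Keeping the cue visible at all times tells riders at a glance whether Rover is merely listening for its wake word or actively processing what they say.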

If I had more time...

...I would model the conversations in Dialogflow and try them out with real users. I would also prototype, in parallel, a graphical interface that might accompany the CUI in the autonomous cabs of the future.