Designing for multimodal interaction in future vehicles

10 min read

Role

Research Assistant

Responsibilities

User Research, Interaction Design

Managers

Prof. MariAnne Karlsson

Company

Design and Human Factors and Volvo Cars Corporation

Timeframe

Nov 2014 - Aug 2015

Multimodal designs project cover image

Background

I contributed to three research studies: Automotive Integration of Multimodal Interaction Technologies in future vehicles (AIMMIT), Automotive UX, and HMI design for automated vehicles (HATric). The first and third studies focused on identifying natural interaction patterns through modality choices and on designing HMIs for automated driving, while the second examined the impact of prototype fidelity on UX evaluations. This is a summary of my contributions to the AIMMIT project.

Introduction

Infotainment systems, which combine navigation, communication, and entertainment, are significant sources of driver distraction. However, they also enhance user satisfaction and play a crucial role in maintaining competitiveness in the industry. Traditionally, usability and safety have been the priorities when designing and evaluating in-car interfaces, but these factors alone are not enough for competitiveness in the 21st century.

Studies suggest that natural interactions are key to maintaining safety in vehicles. A major challenge is integrating multimodal interfaces that reduce driver cognitive load, especially with the increasing use of mobile and connected technologies. Successes in domains like personal computing, gaming, and mobile devices have raised user expectations for seamless multimodal experiences in the automotive space.

A study by Razavi et al. (2013) demonstrated the use of alternative modalities through a car simulation, where users controlled the radio, windows, mirrors, and lights intuitively, eliminating the need for any touch interaction. AIMMIT leverages such insights to create seamless, user-friendly automotive interfaces for the future.

Snippet from a study by Razavi et al. (2013)
[↑] Illustrations of the gestures used in the study by Razavi et al. (2013) to control various functions in a car simulation. Pointing at a device or an object locks the object for selection. The hand gestures are performed depending on the requirement.

The AIMMIT project explores multimodal HMI concepts to enhance vehicle functionality, safety, and user experience. By integrating natural, contact-free interactions like speech and gestures, it aims to reduce visual demand and improve usability.

The Challenge

The challenge was not to design futuristic or experimental cockpit HMIs, but to develop practical interaction solutions that reduce the time spent on secondary tasks, promoting safer driving and higher user acceptance.

Minimize Visual Distraction

Support drivers in performing complex secondary tasks with minimal visual distraction and high user acceptance.

Enable Emerging Technologies

Create intuitive and user-friendly interfaces to integrate emerging technologies, such as autonomous driving and advanced active safety features.

A Volvo XC90 driver cockpit
[↑] A Volvo XC90 driver cockpit

Research Aim

I explored how participants, given a predefined choice of modality—sound (voice), gesture, or haptics—chose to execute basic interactions with a ‘device’ or function. Using these insights, I formulated design recommendations to guide the development of future multimodal automotive interfaces.

Everyday hand gestures
[↑] Some interactions feel natural. Certain everyday hand gestures feel so natural that we perform them effortlessly.

The goal was to extract natural interaction patterns from tasks that people perform effortlessly. Since these come naturally, the cognitive load needed to replicate them for a specific interaction in an automotive setting would be minimal, making interactions with infotainment systems more natural while offering a safer and more intuitive environment.

Methodology

Twenty participants aged 20 to 65 took part in the study. Theoretical representative sampling was applied, and data gathering involved both quantitative and qualitative methods. To begin with, an analysis of the Volvo S60's multipanel system revealed 242 distinct user operations. These were further organised into six fundamental operations:

On/Off Operations

Actions related to activating or deactivating a function or object.

Increase/Decrease

Adjusting properties of a function, such as volume or brightness, in two directions.

Make a Choice

Selecting an option from a list or predefined set of choices.

Search

Looking for specific information or content within the system.

Enter String/Number

Inputting numerical or textual data into the system.

Move Operations

Repositioning objects on the screen according to user intent.

The Volvo S60 Multi panel System
[↑] The Volvo S60 multipanel system that was taken as a reference to list the different operations performed by users. Functions are controlled via buttons on the steering wheel, in the centre console under the colour screen, or via a remote control.
An overview of the components in S60's multipanel system
[↑] An overview of the components in S60's multipanel system. There are about 242 operations a user can perform on the system.

To study how users would perform these six operations using different modalities, I derived 12 basic tasks that people generally perform in their day-to-day lives and that together covered all six operations. The participants were asked to demonstrate their spontaneous choice for executing the 12 tasks using the touch, speech, and gesture modalities, which were presented in randomised order.
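As a rough sketch of this randomised presentation, the per-task modality order can be shuffled with a fixed seed so a pilot run is reproducible. The task names and seed below are illustrative placeholders, not the study's actual protocol:

```python
import random

# Hypothetical subset of the 12 everyday tasks used in the study.
tasks = [
    "turn a device on",
    "increase the volume",
    "choose an item from a list",
    "enter a number",
]
modalities = ["touch", "speech", "gesture"]

rng = random.Random(42)  # fixed seed for a reproducible pilot schedule

# Each task is demonstrated in every modality, with the modality
# order randomised per task to reduce order effects.
schedule = {}
for task in tasks:
    order = modalities[:]
    rng.shuffle(order)
    schedule[task] = order

for task, order in schedule.items():
    print(f"{task}: {' -> '.join(order)}")
```

In a real study script, the schedule would also be randomised per participant rather than shared across the sample.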

After they demonstrated their choice, the participants were asked to rate how natural it felt to perform the particular task on a scale of 1 to 5 (1 = very unnatural, 2 = rather unnatural, 3 = neither, 4 = rather natural, 5 = very natural).

List of tasks and the rating scales used in the study
[↑] Document showing the list of tasks and the rating scales used in the study. A sample question is also shown.

Study Prototype

The setup included a physical prototype consisting of a dial knob, a slider, and a button (see image), in order to present a very basic haptic interaction space to the participants.

Study prototype for physical interaction
[↑] Study prototype showing a dial knob, slider, and button representing a basic physical interaction space. This gave an idea of the physical interaction space the participants were used to.

User Interviews

Each participant shared their spontaneous modality choices for the 12 tasks listed. For example: “If you were to turn a device ON (operation), what speech command (modality) would you use?” Participants then demonstrated their choice by speaking out loud.

To make the study more open-ended, we decided not to introduce a vehicular context in the tasks listed. The participants had the freedom to come up with their own choice of gesture, speech, or touch interaction by imagining any context they desired.

Screengrabs from the interview recordings
[↑] Screengrabs from the interview recordings. Participants suggesting gesture interactions for a task.

After demonstrating their choice, the participants again rated how natural the task felt on the same 1 to 5 naturalness scale.

Rating scales to rate naturalness

Analysis

I listed the responses from all 20 participants for all 12 tasks and modalities to identify any consistent patterns across the sample. The spontaneous choices tagged “very natural” or “rather natural” were further investigated and compared with how the corresponding operations are executed in the current multipanel system.
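This tallying step can be sketched in a few lines of Python. The response records below are hypothetical placeholders to show the shape of the analysis, not the study's data:

```python
from collections import Counter

# Hypothetical records: (participant, task, modality, choice, rating),
# where rating uses the study's 1-5 naturalness scale.
responses = [
    (1, "adjust volume", "gesture", "raise arm", 5),
    (2, "adjust volume", "gesture", "raise arm", 4),
    (3, "adjust volume", "speech", "say 'volume up'", 3),
    (1, "turn device on", "speech", "say 'turn on'", 5),
]

# Keep only choices rated "rather natural" (4) or "very natural" (5).
natural = [r for r in responses if r[4] >= 4]

# Count how often each (task, modality, choice) recurs across
# participants to surface consistent patterns in the sample.
patterns = Counter(
    (task, modality, choice) for _, task, modality, choice, _ in natural
)

for (task, modality, choice), n in patterns.most_common():
    print(f"{task}: {choice} ({modality}) suggested by {n} participant(s)")
```

Choices that recur across many participants are the candidates worth comparing against the existing multipanel implementation.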

Analysis of the study

To complete the analysis, I also summarised all the feedback received from the participants along with the top modality choices to assess the practicality of the choices they made. This helped me formulate design recommendations based on modalities that are likely to feel more natural to users.

A summary of all the feedback received from the participants

Lastly, I compared the suggested interactions with the existing setup in the Volvo S60 to evaluate their feasibility and compatibility. The current system in the Volvo S60 relies heavily on physical controls, such as buttons and knobs, as well as touch-based interfaces for various functions. This setup was assessed in light of the participants' preferences, such as a combination of natural modalities.

Feedback analysis

Results

Results indicated that participants preferred using multiple modalities for simple tasks, with gestures feeling natural in some cases and a combination of gesture and speech in others. While certain modalities were rated as less natural for specific tasks, common patterns emerged, such as raising an arm to adjust volume, zooming in or out, and using gestures for scrolling and selection.

Hand gesture sketches

Touch

Considered the easiest modality due to its familiarity.

Gestures

Often seen as the hardest due to participants' inexperience and the abstract nature of gestures. Without visual feedback, gestures felt unbounded and difficult to conceptualize naturally.

Speech

Easier than gestures for certain tasks since it requires no physical effort, but its effectiveness was context-dependent.

Visual & Audio Feedback

Gestures felt more intuitive when accompanied by visual or audio cues.

Abstract Gestures

Unstructured gestures without clear boundaries were harder to imagine and implement effectively.

Summary

Participants generally found touch to be the easiest, while gestures were the hardest due to lack of experience and feedback. Speech was rated between the two, depending on context.

Design recommendations

Using insights from the results, I revisited the multipanel system in the Volvo S60 and proposed a few design recommendations. One top-level requirement was that, in any alternative design for a specific operation, the driver must always remain in complete control. Three of these recommendations for different secondary tasks are presented below.

Selecting Media

The Volvo S60's menu navigation relies on buttons and knobs, requiring users to traverse multiple menu levels (up to five) for deeper functions such as file copying. Simplifying this with alternative modalities could improve usability.

Operations like TUNE/NAVIGATE, CONFIRM, and EXIT are used to traverse a hierarchical menu structure. Accessing deeper levels (e.g., level 4 or 5) requires navigating through levels 1–3, which can be cumbersome.

The study results suggested that users prefer a combination of touch and speech for selecting media. According to the study, there are multiple natural ways to select a media app and start listening to music: a speech interaction alone, or a combination of a single touch and a speech interaction, can help users complete the task more naturally.

Hierarchical task analysis for listening to music

Adjusting Volume

The study findings indicate that participants perceive gestures as a natural and intuitive method for this task. The existing touch interaction using a dial knob can be complemented with gestures to offer a natural way to adjust volume.

Hierarchical task analysis for adjusting volume

Entering a web address to search

The existing web browser presents an interaction challenge. Current methods for SEARCH and ENTER STRING operations (using a daisy wheel controlled by the TUNE-OK pad) were time-consuming for longer inputs. Additionally, browser scrolling controlled by number pads can be inefficient. This suggested a need for alternative input methods for improved in-vehicle web browsing.

The recommendation was to offer a speech interaction for both entering a web address and searching. This would allow users to speak the address directly into the system, reducing the need for manual input. Alternatively, a combination of touch and speech could be used for activating search and entering a web address.

Hierarchical task analysis for entering a web address

Wireframing

The next step was to evaluate a number of scenarios in a vehicular context. Three common scenarios were listed, and a prototype was designed to match these scenarios in accordance with the findings. The XC90 infotainment layout was used as a reference, as Volvo plans to upgrade the UI layout in future S60s.

A combination of the touch and gesture interactions recommended by the study was used as the new way of interacting. Some of the icons used are shown below. Colour coding for touch, speech, and gesture interactions was used to help our participants get accustomed to the interactions in the study.

Select the media app and start listening to music

The user decides to turn on music on Spotify. They use a combination of touch and speech to select the media app and start listening to music.

Gesture interaction cues are presented while the music is playing. The user can swipe left or right to change the track and raise their arm to increase or decrease the volume. An alternate scenario was also briefed, where the participant uses a "pointing finger gesture at the device" to start playing music.

Early wireframes for the user study

User Study

A total of 10 participants took part in the study, all of whom had taken part in the earlier interviews. The prototypes were presented to the participants along with the scenarios they had to complete and rate, one after the other. The participants were again asked to rate the interactions on the same 1 to 5 naturalness scale.

Findings from user study on updated wireframes

Analysis and Findings

The average ratings were calculated for each scenario, and the results were compared with the earlier study to evaluate the feasibility of the proposed interactions. On average, the interactions felt rather natural overall. The scenario with the best naturalness outcome was the one where the user had to search for an address in the browser. This can be attributed to the fact that touch inputs take more time and effort than speech inputs, making the speech interaction feel more natural.
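The scoring step amounts to a per-scenario mean on the 1 to 5 scale. The ratings below are illustrative placeholders, not the study's data:

```python
from statistics import mean

# Hypothetical naturalness ratings (1-5 scale) from ten participants,
# grouped by scenario; values are illustrative, not the study's data.
ratings = {
    "select media app": [4, 4, 3, 5, 4, 4, 3, 4, 5, 4],
    "adjust volume":    [4, 5, 4, 4, 3, 4, 4, 5, 4, 4],
    "search address":   [5, 4, 5, 4, 5, 4, 5, 4, 4, 5],
}

# Average per scenario; on this scale a mean near 4 reads as
# "rather natural".
averages = {scenario: mean(vals) for scenario, vals in ratings.items()}

best = max(averages, key=averages.get)
for scenario, avg in sorted(averages.items(), key=lambda kv: -kv[1]):
    print(f"{scenario}: {avg:.1f}")
```

With a sample this small, the means are only indicative; comparing rating distributions rather than single averages would be the next step with a larger sample.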

Participants reiterated that the gestures where they pointed at the device to play music and adjust volume felt more natural compared to the touch interactions.

The prototypes require further refinement and testing with a larger sample size to validate the results. However, the study strongly suggests that advancements in multimodal interaction research hold great promise for creating more natural and intuitive user experiences. The latest collaboration between RISE and Volvo Car Corporation, "Safe chauffeurs in safe and healthy multimodal driver information environments", marks a significant step forward in this area.

You can learn more about my study and the results by reading the full paper here, or reach out to me directly for more information.

My Takeaways

Research has always been a core part of my design process. During my time at the Design and Human Factors division, I met some of the best design researchers in industry and academia. It was always a joy working with them and, more importantly, learning from them. The work I performed has only made me realise that great design is not an accident; it is the details you tap into from your research that make your design truly usable. Long story short, here are some takeaways from my work.

  • Choose your literature wisely. The literature can be overwhelming if you don't choose carefully during the review stages, and it can eat up a large chunk of the time in your research plan. Try to find studies with a good number of citations.
  • Do not hurry your study design. I learnt the importance of being patient and taking small steps while preparing the study design: the choice of participants, the setup, and the methodologies for the study. Get enough feedback from your peers and your supervisors before you finalise your study design.
  • Manuscripts are underrated. I got better at preparing manuscripts and questionnaires by avoiding words that could bias user responses. This needed several reviews with my supervisor before we agreed on a final version, but it is really worth the time.

References

Kern, D., 2012. Supporting the development process of multimodal and natural automotive user interfaces. Available at: http://duepublico.uni-duisburg-essen.de/servlets/DocumentServlet?id=27825

Karlsson, M. & Nilsson, J., 2007. Visual and haptic cues and feedback in an integrated user interface.

Valli, A., 2008. Multimedia Tools and Applications, 38(3), pp. 295–305.

Kuppattu Kalladithodi, S., & Karlsson, M. (2015). Modality, Natural Interaction and User Interface Design: Exploring the Idea of Basic Operations and Preferred Modality. Available at: https://research.chalmers.se/en/publication/249258