IUI '22 Companion: 27th International Conference on Intelligent User Interfaces

Full Citation in the ACM Digital Library

SESSION: Workshops

First Workshop on Adaptive and Personalized Explainable User Interfaces (APEx-UI 2022)

Adaptation and personalization are crucial aspects of the design and development of successful Artificial Intelligence systems, from search engines and recommender systems to wearable devices. The increased desire for customization inevitably leads to the need for end-users to understand the rationale behind displaying that specific tailored content. User interfaces play a central role in providing the right explanations to end-users. While adaptive and personalized user interfaces are well-known and advanced research fields, a common issue we face in terms of explainability is finding intelligent user interfaces that follow a one-size-fits-all paradigm without considering the different peculiarities of individuals.

The 1st Workshop on Adaptive and Personalized Explainable User Interfaces (APEx-UI 2022) aims to foster a cross-disciplinary and interdisciplinary discussion between experts from different fields (e.g. computer science, psychology, sociology, law, medicine, business, etc.) in order to answer a precise research question: How can we adapt and personalize explainable user interfaces to the needs, demands and requirements of different end-users, considering their distinct knowledge, background and expertise?

HAI-GEN 2022: 3rd Workshop on Human-AI Co-Creation with Generative Models

Recent advances in generative AI have resulted in a rapid and dramatic increase in the fidelity of created artifacts, from realistic-looking images of faces [9] to antimicrobial peptide sequences that treat diseases [5] to faked videos of prominent business leaders [4, 10]. We believe that people skilled in their creative domain can realize great benefits by incorporating generative models into their own work: as a source of inspiration, as a tool for manipulation, or as a creative partner. Our workshop will bring together researchers and practitioners from both the HCI and AI disciplines to explore and better understand the opportunities and challenges in building, using, and evaluating human-AI co-creative systems.

HEALTHI: Workshop on Intelligent Healthy Interfaces

The second workshop on intelligent healthy interfaces (HEALTHI), collocated with the 2022 ACM Intelligent User Interfaces (IUI) conference, offers a forum that brings academic and industry researchers together and seeks submissions broadly related to the design of healthy user interfaces. The workshop will discuss intelligent user interfaces such as screens, wearables, voice assistants, and chatbots in the context of accessibly supporting health, health behavior, and wellbeing.

Sixth HUMANIZE Workshop on Transparency and Explainability in Adaptive Systems Through User Modeling Grounded in Psychological Theory: Summary

The sixth HUMANIZE workshop on Transparency and Explainability in Adaptive Systems through User Modeling Grounded in Psychological Theory took place in conjunction with the 27th annual meeting of the Intelligent User Interfaces (IUI) community, which was hosted virtually by the University of Helsinki (Finland) on March 22, 2022. The 2022 edition of the workshop was held together with TExSS (Transparency and Explanations in Smart Systems). The workshop provided a venue for researchers from different fields to interact by accepting contributions at the intersection of practical data mining methods and theoretical knowledge for personalization. A total of two papers were accepted for this edition of the workshop.

SOcial and Cultural IntegrAtion with PersonaLIZEd Interfaces (SOCIALIZE) 2022

This is the second edition of the SOcial and Cultural IntegrAtion with PersonaLIZEd Interfaces (SOCIALIZE) workshop. As in the first edition, our goal is to bring together researchers from all over the world interested in studying and developing new interactive techniques for fostering the social and cultural inclusion of individuals from different realities, with particular attention to vulnerable and at-risk groups. This year, considerable space is devoted to socially assistive robots and their use in specific contexts. The invited talk also addresses the development of intelligent robotic applications for human-robot interaction, including assisting children in hospitals and people with dementia.

TExSS 22: Transparency and Explanations in Smart Systems

Smart systems, such as decision support or recommender systems, continue to prove challenging for people to understand, but are nonetheless ever more pervasive based on the promise of harnessing the rich data sources that are becoming available in every domain. These systems tend to be opaque, raising important concerns about how to discover and account for fairness or bias issues. The workshop on Transparency and Explanations in Smart Systems (TExSS) welcomes researchers and practitioners interested in exchanging ideas for overcoming the design, development, and evaluation issues in intelligent user interfaces. Specifically, we will focus on barriers preventing better reliability, trainability, usability, trustworthiness, fairness, accountability, and transparency. This year's theme is “Responsible, Explainable AI for Inclusivity and Trust”, emphasizing the responsibility that the tech industry and developers have for the design, implementation, and evaluation of explainable, inclusive, and trustworthy human-AI interaction.

Advanced Techniques for Preventing Thermal Imaging Attacks

Thermal cameras can be used to detect user input on interfaces such as touchscreens, keyboards, and PIN pads by recording the heat traces left by users’ fingers after interaction (e.g., typing a message or entering a PIN) and using them to reconstruct the input. While previous work mitigated thermal attacks by complicating input or distorting heat traces, our research is the first to propose using deep learning (DL) techniques to prevent malicious use of thermal cameras. Our DL models detect interfaces in the thermal camera feed and then obfuscate the heat traces on them. Our preliminary findings show that the proposed framework can detect interfaces and eliminate authentication information from thermal images, while still revealing whether an interface has been interacted with. Thus, our approach improves security without impacting the utility of the thermal camera.
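
As a rough illustration of the obfuscation step, the sketch below blurs detected interface regions in a thermal frame; the detector, box format, and blur parameters are illustrative assumptions, not the authors' exact pipeline.

```python
# A minimal sketch: blur interface regions returned by a (hypothetical) DL detector
# so residual heat traces become unreadable, while the blurred region still shows
# that the surface was touched. Box format (x, y, w, h) is an assumption.
import cv2
import numpy as np

def obfuscate_heat_traces(thermal_frame: np.ndarray, boxes):
    out = thermal_frame.copy()
    for (x, y, w, h) in boxes:                        # boxes from the interface detector
        roi = out[y:y + h, x:x + w]
        out[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (31, 31), 0)
    return out

# Hypothetical usage with a stand-in thermal image and one detected PIN pad.
frame = np.random.rand(480, 640).astype(np.float32)
safe_frame = obfuscate_heat_traces(frame, [(100, 120, 200, 80)])
```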

SESSION: Posters and Demos

AF’fective Design: Supporting Atrial Fibrillation Post-treatment with Explainable AI

This paper reports the preliminary design results of AF’fective, an AI-driven patient monitoring system designed to facilitate the post-treatment of patients with Atrial Fibrillation (AF). In-depth interviews, co-design, and prototype testing sessions were carried out with 16 cardiologists to investigate the context surrounding AF treatment, evaluate different explainable AI strategies, and better understand how explainable AI could be designed and used to support AF post-treatment. Through the design process, we learnt key lessons such as the pitfalls of over-justification and how augmenting machine explanations with data sources that allow for self-interpretation could enhance perceived control over the decision-making process and increase user acceptance of the system.

An Intelligent Color Recommendation Tool for Landing Page Design

Color plays an important role in users’ attitudes and purchase intentions in the context of advertising. In landing page design, designers usually struggle to find appropriate color palettes for multiple design elements, such as buttons, texts, and icons. Therefore, we build a color recommendation system for landing page design. To learn different color palettes for each design element instead of a single palette for the overall design, we use a color sequence combining the color palettes of multiple design elements. We train a masked color model for color sequence completion and advertising performance prediction. Further, the system allows users to recolor a specified element in the image based on the recommended colors. We conduct a user study, collecting qualitative feedback from professional graphic designers through interviews, which validates the usability of the proposed color recommendation system for landing page design.
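
To make the masked-sequence idea concrete, the following is a minimal sketch of masked color-sequence completion with a small Transformer encoder; the color quantization, sequence layout, and the two prediction heads are assumptions for illustration, not the authors' model.

```python
# A minimal sketch of masked color-sequence completion (illustrative, not the paper's model).
# Assumptions: colors are quantized into a shared vocabulary; each design element
# (button, text, icon, ...) contributes a few tokens; a [MASK] token asks the model
# to propose colors for one element given the others.
import torch
import torch.nn as nn

VOCAB = 512          # number of quantized colors (assumption)
MASK_ID = VOCAB      # extra token id used for masked positions
SEQ_LEN = 12         # e.g. 4 elements x 3 colors each (assumption)

class MaskedColorModel(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(VOCAB + 1, d_model)           # +1 for [MASK]
        self.pos = nn.Parameter(torch.zeros(SEQ_LEN, d_model))  # learned positions
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, n_layers)
        self.to_color = nn.Linear(d_model, VOCAB)   # predict colors at masked positions
        self.to_perf = nn.Linear(d_model, 1)        # predict advertising performance

    def forward(self, tokens):
        h = self.embed(tokens) + self.pos             # (batch, seq, d_model)
        h = self.encoder(h)
        color_logits = self.to_color(h)               # per-position color scores
        perf = torch.sigmoid(self.to_perf(h.mean(1))) # pooled performance score
        return color_logits, perf

# Usage: mask one element's colors and ask for completions.
model = MaskedColorModel()
seq = torch.randint(0, VOCAB, (1, SEQ_LEN))
seq[0, 0:3] = MASK_ID                                 # hide e.g. the button palette
logits, predicted_perf = model(seq)
suggested_colors = logits[0, 0:3].argmax(-1)          # recommended color tokens
```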

BareTQL: An Interactive System for Searching and Extraction of Open Data Tables

There has been a plethora of research and commercial activity around extracting structured data from documents (e.g. web pages and scientific articles) and making it available to other applications. Many organizations and government bodies have also been making their data available to the public. Despite the progress in many different aspects of table extraction and publishing, querying incomplete data in tables with little or no schema has remained a challenge. This paper presents BareTQL, an interactive system for querying open data tables in the presence of the aforementioned challenges.

Caregiver: An Application for The First Step in Alzheimer's Disease Early Diagnosis

Alzheimer's Disease (AD) is an invasive neurodegenerative disorder that has no cure. The treatment of AD is based on drugs, which have to be administered in the early stages of the disease's development. Therefore, early diagnosis of the disease is of great significance for the efficiency of the treatment. AD is currently diagnosed by professionals conducting medical tests, which are not easily accessible. In this paper, we introduce Caregiver, an application that takes the initiative by administering the Self-Administered Gerocognitive Exam (SAGE) test on mobile devices and facilitates administration of the test in a fun and simple digital environment. Caregiver suggests that its users seek professional help for treatment of AD based on a score out of 22 points. By conducting the SAGE test step by step through gamification, Caregiver provides essential healthcare to people who do not have access to medical facilities. Its user-friendly interface enables users to actively interact with the application and obtain more accurate results than conducting the test by themselves. Caregiver evaluates visual inputs of the test using a weighted combination of two different classification models, which decreases the total time needed to conduct the test, hence supporting early diagnosis.

Distraction Detection in Automotive Environment using Appearance-based Gaze Estimation

Distraction detection systems are of great importance in the automotive domain due to the primacy of passenger safety. Earlier approaches were confined to indirect methods based on driving performance metrics to detect visual distraction. Recent methods have attempted to develop dedicated classification models for gaze zone estimation, but their cross-domain performance has not been investigated. We adopt a more generic appearance-based gaze estimation approach in which no assumption about the setting or participant is made. We propose MAGE-Net, which has fewer parameters while achieving performance on par with state-of-the-art techniques on the MPIIGaze dataset. Using the proposed MAGE-Net, we performed a cross-domain evaluation in an automotive setting with 10 participants. We observed that the gaze region error using MAGE-Net for the interior regions of the car is 15.61 cm and 15.13 cm in the x and y directions, respectively. We used these results to demonstrate the capability of the proposed system to detect visual distraction using a driving simulator.

Enabling Continuous Object Recognition in Mobile Augmented Reality

Mobile Augmented Reality (MAR) applications enable users to interact with physical environments by overlaying digital information on top of camera views. Detecting and classifying complex objects in the real world presents a critical challenge for enabling immersive user experiences in MAR applications. Aiming to provide continuous MAR experiences, we address the key challenge of continuous object recognition, which requires accommodating an increasing number of recognition requests on different types of images in MAR systems and possible new types of images in emerging applications. Inspired by the latest advances in continual learning approaches in computer vision, this paper presents a novel MAR system that enhances scalability with continual learning in realistic scenarios. Our experiments demonstrate that 1) the system enables objects to be recognised efficiently without retraining from scratch; and 2) edge computing further reduces latency for continual object recognition.

EV Life: A Counterfactual Dashboard Towards Reducing Carbon Emissions of Automotive Behaviors

Adopting electric vehicles (EVs) is an important step towards meeting climate change targets. Despite the increased availability of EVs, many individuals are unfamiliar with the environmental and cost savings and with how their driving behaviors might change (e.g., where and how to charge) when switching from a conventional fuel vehicle. While behavioral science research can identify which factors are barriers to EV adoption, it struggles to identify interventions that can help mitigate these barriers. We introduce EV Life, a mobile app that shows a counterfactual view of people’s automotive behaviors and introduces two functions. First, the app monitors a person’s driving trips in their current vehicle and provides a counterfactual dashboard that highlights what each trip would be like with an EV, including information about cost savings, reduction in carbon emissions, and charging locations. Second, the app provides a research platform for testing interventions for belief change using rule-based or machine-learning notification delivery.
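
As a rough illustration of how such a counterfactual trip summary could be computed, the sketch below compares a gasoline trip with an EV trip under illustrative assumptions for prices, fuel economy, and emission factors; the app's actual data sources and formulas are not specified in the abstract.

```python
# A minimal counterfactual trip comparison; all constants are illustrative assumptions
# except the ~8.9 kg CO2 per gallon of gasoline, which is a standard EPA figure.
GAS_PRICE = 4.00          # USD per gallon (assumption)
ELEC_PRICE = 0.15         # USD per kWh (assumption)
MPG = 30.0                # current vehicle fuel economy (assumption)
EV_KWH_PER_MILE = 0.30    # EV energy use (assumption)
CO2_PER_GALLON = 8.9      # kg CO2 per gallon of gasoline
CO2_PER_KWH = 0.4         # kg CO2 per kWh of grid electricity (assumption)

def counterfactual_trip(miles: float) -> dict:
    gas_cost = miles / MPG * GAS_PRICE
    ev_cost = miles * EV_KWH_PER_MILE * ELEC_PRICE
    gas_co2 = miles / MPG * CO2_PER_GALLON
    ev_co2 = miles * EV_KWH_PER_MILE * CO2_PER_KWH
    return {
        "cost_savings_usd": round(gas_cost - ev_cost, 2),
        "co2_reduction_kg": round(gas_co2 - ev_co2, 2),
    }

print(counterfactual_trip(25.0))   # e.g. a 25-mile commute
```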

GAB - Gestures for Artworks Browsing

Hands are an important tool in our daily communication with our peers and the world. They allow us to convey information through particular gestures that are either the product of social conventions or personal expressions. Thanks to the advances in sensing and computer vision technologies over the past decade, automated hand recognition can now be more easily used and integrated into simple web applications. In the context of digital artwork collections, this means that gestures can now be envisioned as a new browsing tool that goes beyond simple movements for navigating a 3D digital space. This paper presents Gestures for Artwork Browsing (GAB), a web application that uses hand motions to directly query pictorial hand gestures from the past. Based on materials from a digitized collection of Renaissance paintings, GAB enables users to record a sequence with the hand movement of their choice and outputs an animation reproducing that same sequence with painted hands. Fostering new research possibilities, the project is a novelty in terms of art database browsing and human-computer interaction, as it does not require traditional search tools such as text-based inputs over metadata, and allows direct interaction with the content of the artworks.

GazeSync: Eye Movement Transfer Using an Optical Eye Tracker and Monochrome Liquid Crystal Displays

Can we see the world through the eyes of somebody else? We present early work on transferring eye gaze from one person to another. Imagine being able to follow the eye gaze of an instructor while they explain a complex work step, or experiencing a painting like an expert would: your gaze is directed to the important parts and follows the appropriate steps. In this work, we explore the possibility of transmitting eye-gaze information in a subtle, unobtrusive fashion between two individuals. We present an early prototype consisting of an optical eye tracker for the leader (the person who shares their eye gaze) and two monochrome see-through displays for the follower (the person who follows the eye gaze of the leader). We report the results of an initial user test and discuss future work.

Handwriting Messenger by Which the User Can Feel the Presence of Communication Partners

Communication technologies such as instant messaging and e-mail allow for convenient, instantaneous communication between remote locations. However, since these technologies generally use fixed fonts, text messages are less expressive than handwritten letters. We hypothesize that adding tactile information to messages allows the recipient to better feel the social presence of the communicating party. Here, we introduce a haptic interface through which the message recipient can feel the handwriting and presence of the sender.

ImCasting: Nonverbal Behaviour Reinforcement Learning of Virtual Humans through Adaptive Immersive Game

This research focuses on the user experience of an alternative method for teaching nonverbal behaviour to Embodied Conversational Agents in immersive environments. We overcome the limitations of existing approaches by proposing an adaptive Virtual Reality game, called ImCasting, in which the player takes an active role in improving the learning models of the agents. Specifically, we base our approach on the human-in-the-loop framework with human preferences to teach nonverbal behaviour to the agents through the system. We introduce game mechanisms built around all the tasks of this machine learning framework, designing how a human should interact within this framework in real time. The study explores how game interaction in an immersive environment, where the player shares the same space with the learning agents, can improve the user experience of performing this interactive task. In particular, we focus on the involvement of the players as well as the usability of the system. We conducted a preliminary evaluation comparing our design with a baseline system that does not use any game mechanisms for teaching nonverbal behaviour to virtual agents. Results suggest that our design concept and the game story are more engaging and increase the satisfaction perceived by the participants, a key usability factor.

Inequity in Popular Speech Recognition Systems for Accented English Speech

Voice-enabled technology has become increasingly common in homes, businesses, and other parts of everyday life. The benefits of smart speakers, hands-free controllers, and digital assistants should be equally accessible to everyone, yet voice recognition performance can be frustratingly low for speakers with accents. In this work, we examine algorithmic bias in several voice recognition systems by measuring standard accuracy metrics on under-represented English accents.

Language Models Can Generate Human-Like Self-Reports of Emotion

Computational interaction and user modeling are presently limited in the domain of emotions. We investigate a potential new approach to computational modeling of emotional response behavior: using modern neural language models to generate synthetic self-report data and evaluating the human-likeness of the results. More specifically, we generate responses to the PANAS questionnaire with four different variants of the recent GPT-3 model. Based on both data visualizations and multiple quantitative metrics, the human-likeness of the responses increases with model size, with the largest Davinci model variant generating the most human-like data.
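
A minimal sketch of how such questionnaire responses could be elicited from a language model is shown below, assuming the legacy OpenAI Completion API (pre-1.0 openai-python); the prompt wording, persona, and parsing are illustrative and not the authors' exact setup.

```python
# A minimal prompting sketch for synthetic PANAS-style self-reports (illustrative only).
# Assumes openai.api_key has been set; model name, prompt, and parsing are assumptions.
import openai

PANAS_ITEMS = ["Interested", "Distressed", "Excited", "Upset", "Strong"]  # subset of PANAS

def synthetic_panas_rating(persona: str, item: str, model: str = "davinci") -> str:
    prompt = (
        f"{persona} fills in the PANAS questionnaire.\n"
        f"Indicate to what extent you feel '{item}' right now, "
        f"on a scale from 1 (very slightly or not at all) to 5 (extremely).\n"
        f"Answer: "
    )
    completion = openai.Completion.create(
        engine=model, prompt=prompt, max_tokens=1, temperature=0.7
    )
    return completion.choices[0].text.strip()

# Hypothetical usage: sample many synthetic "participants" per model size and
# compare the resulting rating distributions with human self-report data.
ratings = [synthetic_panas_rating("A 25-year-old student", item) for item in PANAS_ITEMS]
```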

Leveraging Generative Conversational AI to Develop a Creative Learning Environment for Computational Thinking

We explore how generative conversational AI can assist students’ learning, creative, and sensemaking processes in a visual programming environment where users can create comics from code. Visualizing code in terms of comics involves mapping programming language (code) to natural language (story) and then to the visual language of comics. While this process requires users to brainstorm code examples, metaphors, and story ideas, recent developments in generative models introduce an exciting opportunity for learners to harness their creative superpowers and for researchers to advance our understanding of how generative conversational AI can augment our intelligence in creative learning contexts. We provide an overview of our system and discuss interaction scenarios to demonstrate ways we can partner with generative conversational AI in the context of learning computer programming.

Neural Language Models as What If?-Engines for HCI Research

Collecting data is one of the bottlenecks of Human-Computer Interaction (HCI) and user experience (UX) research. In this poster paper, we explore and critically evaluate the potential of large-scale neural language models like GPT-3 in generating synthetic research data such as participant responses to interview questions. We observe that in the best case, GPT-3 can create plausible reflections of video game experiences and emotions, and adapt its responses to given demographic information. Compared to real participants, such synthetic data can be obtained faster and at a lower cost. On the other hand, the quality of generated data has high variance, and future work is needed to rigorously quantify the human-likeness, limitations, and biases of the models in the HCI domain.

PARKS-Gaze - A Precision-focused Gaze Estimation Dataset in the Wild under Extreme Head Poses

The performance of appearance-based gaze estimation systems that utilize machine learning depends on their training datasets. Most existing gaze estimation datasets were recorded in laboratory conditions, and the datasets recorded in in-the-wild conditions display limited head pose and intra-person variation. We propose PARKS-Gaze, a gaze estimation dataset with 570 minutes of video data from 18 participants. We captured a head pose range of ±50 degrees in yaw and [-40, 60] degrees in pitch. We captured multiple images for a single Point of Gaze (PoG), enabling precision analysis of gaze estimation models. Our cross-dataset experiments revealed that the model trained on the proposed dataset obtained lower mean test errors than models trained on existing datasets, indicating its utility for developing real-world interactive gaze-controlled applications.
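
The repeated captures per PoG are what enable precision analysis; as one common definition, the sketch below computes precision as the RMS deviation of per-frame gaze estimates from their mean, which may differ from the exact protocol used for the dataset.

```python
# A minimal precision computation for repeated gaze estimates of one target point.
import numpy as np

def precision_rms(estimates_cm: np.ndarray) -> float:
    """estimates_cm: (n_frames, 2) on-screen gaze estimates for one PoG, in cm."""
    center = estimates_cm.mean(axis=0)
    return float(np.sqrt(np.mean(np.sum((estimates_cm - center) ** 2, axis=1))))

# Hypothetical usage with four repeated estimates for a single target.
samples = np.array([[10.1, 5.2], [10.4, 5.0], [9.9, 5.3], [10.2, 5.1]])
print(f"precision: {precision_rms(samples):.2f} cm")
```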

Predicting Persuasiveness of Participants in Multiparty Conversations

Persuasiveness is an important aspect of communication skills. This study aims to estimate the persuasiveness of participants in group discussions. First, human annotators rated the level of persuasiveness of each of four participants in group discussions. Next, multimodal and multiparty models were created to estimate the persuasiveness of each participant using speech, language, and visual (head pose) features with a GRU-based neural network. The experimental results showed that multimodal and multiparty models performed better than unimodal and single-person models. The best performing multimodal multiparty model achieved 80% accuracy in predicting high/low persuasiveness and 77% accuracy in predicting the most persuasive participant in the group.
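
A minimal sketch of a GRU-based classifier of this kind is shown below, assuming per-time-step speech, language, and head-pose features are concatenated into one input vector; the authors' exact architecture and fusion strategy are not detailed in the abstract.

```python
# A minimal GRU classifier over fused multimodal feature sequences (illustrative only).
import torch
import torch.nn as nn

class PersuasivenessGRU(nn.Module):
    def __init__(self, feat_dim: int, hidden: int = 64):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)   # high vs. low persuasiveness

    def forward(self, x):                  # x: (batch, time, feat_dim)
        _, h_n = self.gru(x)               # final hidden state per sequence
        return torch.sigmoid(self.head(h_n[-1]))

# Hypothetical usage: 4 participants, 100 time steps, 32-dim fused features.
model = PersuasivenessGRU(feat_dim=32)
scores = model(torch.randn(4, 100, 32))    # persuasiveness score per participant
most_persuasive = scores.argmax().item()
```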

Shape-Flexible Underwater Display System with Wirelessly Powered and Controlled Smart LEDs

This paper proposes a 3D shape-flexible underwater display system that allows users to build a 3D display easily. To attain this flexibility, the paper proposes and evaluates an underwater wireless powering and communication method, which eliminates the cables connected to each node. Simulation results show that the proposed method can deliver 4.78 mW to an 8 mm × 8 mm × 8 mm node and that a communication speed of 9600 baud is feasible.

SummaryLens – A Smartphone App for Exploring Interactive Use of Automated Text Summarization in Everyday Life

We present SummaryLens, a concept and prototype for a mobile tool that leverages automated text summarization to enable users to quickly scan and summarize physical text documents. We further combine this with a text-to-speech system to read out the summary on demand. With this concept, we propose and explore a concrete application case of bringing ongoing progress in AI and Natural Language Processing to a broad audience with interactive use cases in everyday life. Based on our implemented features, we describe a set of potential usage scenarios and benefits, including support for low-vision, low-literate and dyslexic users. A first usability study shows that the interactive use of automated text summarization in everyday life has noteworthy potential. We make the prototype available as an open-source project to facilitate further research on such tools.

The Diversity of Music Recommender Systems

While the algorithms used by music streaming services to provide recommendations have often been studied in offline, isolated settings, little research has been conducted on the nature of their recommendations within the full context of the system itself. This work compares the level of diversity of the real-world recommendations provided by five of the most popular music streaming services, given the same lists of low-, medium- and high-diversity input items. We contextualized our results by examining the reviews for each of the five services on the Google Play Store, focusing on users’ perception of their recommender systems and the diversity of their output. We found that YouTube Music offered the most diverse recommendations, but the perception of the recommenders was similar across the five services. Consumers had multiple perspectives on the recommendations provided by their music service—ranging from not wanting any recommendations to applauding the algorithm for helping them find new music.
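
One common way to quantify the diversity of a recommendation list is its average pairwise distance in some item feature space; the sketch below uses cosine distance on hypothetical item embeddings, which may differ from the metric used in the paper.

```python
# A minimal intra-list diversity measure: mean pairwise cosine distance (illustrative).
import numpy as np
from itertools import combinations

def intra_list_diversity(item_features: np.ndarray) -> float:
    """item_features: (n_items, d) feature vectors for one recommendation list."""
    dists = []
    for i, j in combinations(range(len(item_features)), 2):
        a, b = item_features[i], item_features[j]
        cos_sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
        dists.append(1.0 - cos_sim)            # cosine distance
    return float(np.mean(dists))

# Hypothetical usage: compare the lists two services return for the same seed items.
list_a = np.random.rand(20, 16)
list_b = np.random.rand(20, 16)
print(intra_list_diversity(list_a), intra_list_diversity(list_b))
```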

ThermalDrive - Towards Situation Awareness over Thermal Feedback in Automated Driving Scenarios

We present ThermalDrive, a thermal interface that provides situational awareness information using thermal feedback on the driver's face. A prototype was built to simulate autonomous driving and the thermal interface in virtual reality. We conducted an experiment with 16 participants to investigate the impact of displaying system situation awareness information via thermal feedback in a VR driving simulation. The initial results indicate that the thermal interface might be a suitable feedback mechanism for conveying some information in autonomous driving. In particular, cold thermal feedback was effective in terms of notability and user preference.

Touch-behavioral Authentication on Smartphones using Machine Learning

Traditional authentication approaches for smartphones, such as PIN codes or pattern-based passwords, are vulnerable to password hacking in public through shoulder-surfing or smudge attacks. On the other hand, advanced authentication approaches, such as fingerprint or retina-based recognition, require specific hardware or high computational power. In this work, we propose using advancements in machine learning (ML) to provide an authentication mechanism without requiring any additional hardware. We propose using users’ touch interaction behavior on the smartphone screen to provide the required authentication mechanism. We propose the solution in two modes: a supervised ML technique, where the system is trained and then authorizes the legitimate user using a set of simple shapes, and an unsupervised ML technique, where the system is trained on a user’s free touch interaction with the underlying device. Moreover, we conducted a preliminary user study with our supervised-learning-based system.
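
A minimal sketch of the supervised mode is given below, assuming hand-crafted per-stroke features (e.g. duration, length, mean velocity, pressure) and an off-the-shelf classifier; the actual feature set and model are not specified in the abstract.

```python
# A minimal supervised touch-behavior authenticator (illustrative assumptions only).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# X: (n_strokes, n_features) touch features; y: 1 = legitimate owner, 0 = other users.
X = np.random.rand(400, 6)               # stand-in for extracted stroke features
y = np.random.randint(0, 2, 400)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# At unlock time, a new stroke is scored; a threshold on the predicted owner
# probability decides whether to authenticate.
owner_prob = clf.predict_proba(X_test[:1])[0, 1]
authenticated = owner_prob > 0.8
print(f"accuracy={clf.score(X_test, y_test):.2f}, owner_prob={owner_prob:.2f}")
```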

Using Wearables Data for Differentiating Between Injured and Non-Injured Athletes

Smartwatches nowadays are rich sensor platforms that may provide their wearers with various data about their physical performance and physiological status. In this paper, we explore the possibility of using this data to identify differences between athletes during a training program. We aim to distinguish between athletes who suffered musculoskeletal injuries and those who were not injured by considering the external load and the athletes’ heart rate (internal load). Comparing the two groups, we found significant differences in heart rate at rest and during sleep. In addition, the percentage of time in rapid eye movement (REM) and deep sleep differed significantly between the two groups, and the external load expressed by distance was significantly lower in the injured group. Our findings suggest that by tracing heart rate and sleep quality during a training program, we were able to characterize athletes who were at risk of injury. This may be a first step towards further analysis aimed at exploring the possibility of predicting the risk of injuries and adapting training loads accordingly to prevent them. In addition, based on such characteristics, user profiles can be built and used for personalized recommendations to avoid injuries during training.
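
As a simple illustration of such a group comparison, the sketch below runs a two-sample t-test on hypothetical per-athlete resting heart rates; the study's exact statistical procedure and data are not given in the abstract.

```python
# A minimal two-group comparison on one wearable-derived feature (illustrative data).
import numpy as np
from scipy import stats

# Hypothetical per-athlete mean resting heart rates for the two groups.
injured_rest_hr = np.array([58, 62, 61, 60, 63])
healthy_rest_hr = np.array([54, 55, 53, 56, 52])

t, p = stats.ttest_ind(injured_rest_hr, healthy_rest_hr, equal_var=False)
print(f"resting HR: t={t:.2f}, p={p:.3f}")   # repeat for sleep and training-load features
```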

Videos2Doc: Generating Documents from a Collection of Procedural Videos

The abundance of user-generated multi-modal content, including videos and images, makes it easy for users to use such content as a reference and source of information. However, several hours may be required to consume this large corpus of data. In particular, for authors and content creators, abstracting information from videos and then representing it in a textual format is a tedious task. The challenges are multiplied by the diversity and variety introduced when several videos are associated with a given query or topic of interest. We present Videos2Doc, a machine-learning-based framework for automated document generation from a collection of procedural videos. Videos2Doc enables author-guided document generation for those looking for authoring assistance and an easy consumption experience for those preferring text or document media over videos. Our proposed interface allows users to choose several visual and semantic preferences for the output document, allowing the generation of custom documents and webpage templates from a given set of inputs. Empirical and qualitative evaluations establish the utility of Videos2Doc as well as its superiority over current benchmarks. We believe Videos2Doc will ease the task of making multimedia accessible by automating conversion to alternate presentation modes.

Vietnamese Speech-based Question Answering over Car Manuals

This paper presents QA-CarManual, a novel Vietnamese speech-based question answering system that enables users to ask car-manual-related questions (e.g. how to properly operate devices and/or utilities within a car). Given a car manual written in Vietnamese as the main knowledge base, we develop QA-CarManual as a lightweight, real-time, interactive system that integrates state-of-the-art technologies in language and speech processing to (i) understand and interact with users via speech commands and (ii) automatically query the knowledge base and return answers as text, speech, and visualizations. To the best of our knowledge, QA-CarManual is the first Vietnamese question answering system that interacts with users via speech inputs and outputs. We performed a human evaluation to assess the quality of our QA-CarManual system and obtained promising results.

SESSION: Student Consortium

Explaining Artificial Intelligence with Tailored Interactive Visualisations

Artificial intelligence (AI) is becoming ubiquitous in the lives of both researchers and non-researchers, but AI models often lack transparency. To make well-informed and trustworthy decisions based on these models, people require explanations that indicate how to interpret the model outcomes. This paper presents our ongoing research in explainable AI, which investigates how visual analytics interfaces and visual explanations, tailored to the target audience and application domain, can make AI models more transparent and allow interactive steering based on domain expertise. First, we present our research questions and methods, contextualised by related work at the intersection of AI, human-computer interaction, and information visualisation. Then, we discuss our work so far in healthcare, agriculture, and education. Finally, we share our research ideas for additional studies in these domains.

How do Conversational Agents Transform Qualitative Interviews? Exploration and Support of Researchers’ Needs in Interviews at Scale

In recent years, conversational agents (CAs) have been receiving more attention as tools for collecting data through qualitative interviews. However, we know little about how CAs affect both interviewees and interviewers. This PhD project is dedicated to studying how to evaluate CA-mediated interviews and their effects on participants (both interviewees and interviewers). The findings of this project will allow us to support interview practitioners with tools for interview analytics and interview data analysis. This will be especially helpful in the large-scale settings that CA-mediated interviews enable. This proposal describes the state of the art on the topic and presents the motivation for a study, along with key research questions to answer.

Integrating in-hand physical objects in mixed reality interactions

With the launch of commercial mixed reality headsets, it has become increasingly important to find new interaction modalities that make use of their capabilities. Endowed with spatial understanding, these devices offer the possibility to interact with the physical environment around the user and to move beyond the traditional WIMP (Windows, Icons, Menus, Pointer) paradigm. In this context, this research project aims at designing and implementing interactions that are seamless for users who handle physical tools during assembly tasks while wearing a mixed reality headset that provides them with instructions. The goals of this project are to provide natural interactions despite having tools in hand: 1) by considering the tools in the hands while recognizing gestures, or by using them to directly interact with the simulated virtual physics; 2) by associating different computational features, such as mapped functionalities or allowed interactions, depending on the tool being used or its shape. To meet these goals, we are iterating over the implementation of prototypes involving object and gesture recognition and will evaluate the designed interactions in an assembly scenario.

Interactive Intelligent Tools for Creative Processes using Multimodal Information

Individual differences in human behavior, combined with the ubiquity and constant evolution of computing devices, have led to different forms of human-computer interaction. Keyboard, touch, voice, and gesture, among others, are used to provide input and receive feedback. However, applications that provide users with multimodal interaction for creative processes are usually agnostic about media content and user profile, and are unable to highlight significant elements that may be useful for fostering creativity. This work's primary goal is to design models and provide techniques for creating intelligent interactive tools that use rich multimodal information to support people in creative processes.

Leveraging Human-Agent Collaboration for Multimodal Task Guidance with Concurrent Authoring Capabilities

Instructive assistance systems assist users in performing a task by providing instructions. These systems are especially relevant for industrial workers and service technicians, who must cope with the increasing task variety and complexity resulting from the trend towards cyber-physical production systems and small lot sizes. While current research mainly focuses on assembly work and the application of augmented reality, there is a lack of research on adapting these systems to the challenging conditions of maintenance environments and on providing concurrent authoring capabilities. Within this PhD project, we aim to design a multimodal conversational agent as an instructive assistance system with authoring capabilities and to evaluate the system in three different scenarios to close this research gap.