Eye-tracking has been used extensively both in psychology, for understanding various aspects of human cognition, and in human-computer interaction (HCI), for evaluating interface designs or as a form of direct input. In recent years, eye-tracking has also been investigated as a source of information for machine learning models that predict relevant user states and traits (e.g., attention, confusion, learning, perceptual abilities). These predictions can then be leveraged by AI agents to model their users and personalize the interaction accordingly. In this talk, Dr. Conati will provide an overview of the research her lab has done in this area, including detecting and modeling user cognitive skills and affective states, with applications to user-adaptive visualizations, intelligent tutoring systems, and health.
In a world where the use of AI is growing and evolving, where will we be in 5 years? 10 years? 20 years? What role will AI play in our society, and how will humans and AI interact? While there will undoubtedly be scenarios where AI systems can outperform humans, there will also continue to be instances where humans are a critical part of the process. As researchers explore improvements to AI systems, we also need to explore the interplay between humans and AI, and continue to evolve our understanding of how humans and AI systems can work together, effectively harnessing the benefits of both.
Designing effective interaction between humans and AI systems is critical for the future use of Human-AI systems. Merely building an AI system that blindly sends recommendations to users has been shown, in some cases, to decrease human performance. Different models can also have a differential impact on users' trust in the model and adherence to its recommendations, and can affect bias in decision-making tasks. This talk will highlight important directions for Human-AI research.
This talk covers two main research directions based on the iCub humanoid robot. The iCub is a humanoid robot designed to support research in embodied AI. At 104 cm tall, the iCub has the size of a five-year-old child. It can crawl on all fours, walk, and sit up to manipulate objects. Its hands have been designed to support sophisticated manipulation skills. The iCub is distributed as Open Source under the GPL license and can now count on a worldwide community of enthusiastic developers. There are more than 40 robots available in laboratories across the globe. The iCub's sensory system allows it to see, hear, and feel physical contact; it is one of the few platforms in the world with a sensitive full-body skin. The iCub is being used at the Italian Institute of Technology as a model platform to develop the technology of future interactive service robots. In particular, I will describe our work in the field of physical and social interaction. For example, through extensive use of machine learning, we developed algorithms to interpret and use external contact information in a variety of tasks, as well as contactless cues (vision, sound), to ease interaction between the user and the robot.
Personalization is increasingly perceived as an important factor in the effectiveness of Recommender Systems (RS). This is especially true in the tourism domain, where travelling comprises emotionally charged experiences; therefore, the more that is known about the tourist, the better the recommendations that can be made. Incorporating psychological aspects, such as personality, into recommendation generation is a growing trend in RS, studied as a way to provide more personalized approaches. However, although many studies on the psychology of tourism exist, studies on predicting tourist preferences based on personality are limited. Therefore, we undertook a large-scale study to determine how the Big Five personality dimensions influence tourists' preferences for tourist attractions, gathering data from an online questionnaire sent to Portuguese individuals from the academic sector and their relatives/friends (n=508). Using Exploratory and Confirmatory Factor Analysis, we extracted 11 main categories of tourist attractions and analyzed which personality dimensions were predictors (or not) of preferences for those attractions. As a result, we propose the first model that relates the five personality dimensions to preferences for tourist attractions, intended to offer a base for RS researchers in tourism to automatically model tourist preferences based on personality.
Autism spectrum disorder (ASD) is a lifelong condition characterized by hindered mental growth and development for the majority of affected individuals. As of 2018, 2-3% of children in the USA had been diagnosed with autism. As these children move into adulthood, they have difficulty developing well-functioning motor skills. Some of these abnormalities, however, can be gradually improved if treated appropriately during adulthood. Studies have shown that people with ASD enjoy playing video games. Educational games, however, have primarily been developed for children with ASD and are too primitive for adults with ASD. We have developed a gaming and personalized recommender system that suggests therapeutic games to adults with ASD that can improve their social-interactive skills. The gaming system maintains the entertainment value of the games to ensure that adults remain interested in playing them, while the recommender system suggests appropriate games for adults with ASD to play. The effectiveness and merit of our gaming and recommender system are supported by an empirical study.
Digital pen signals have been shown to be predictive of cognitive states, cognitive load, and emotion in educational settings. We investigate whether low-level pen-based features can predict the difficulty of tasks in a cognitive test and the learner's performance on these tasks, which is inherently related to cognitive load, without a semantic content analysis. We record data for tasks of varying difficulty in a controlled study with children from elementary school. We include two versions of the Trail Making Test (TMT) and six drawing patterns from the Snijders-Oomen Non-verbal intelligence test (SON) as tasks featuring increasing levels of difficulty. We examine how accurately we can predict task difficulty and user performance as a measure of cognitive load, using support vector machines and gradient boosted decision trees with different feature selection strategies. The results show that our correlation-based feature selection is beneficial for model training, in particular when samples from TMT and SON are concatenated for joint modelling of difficulty and time. Our findings open up opportunities for technology-enhanced adaptive learning.
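As an illustration of correlation-based feature selection of the kind mentioned above, the following minimal Python sketch ranks feature columns by the absolute Pearson correlation with the target and keeps the top k. It is a simplified stand-in, not the authors' implementation; the function names and toy data are ours.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(X, y, k):
    """Rank the feature columns of X by |correlation| with y, keep the top k."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        col = [row[j] for row in X]
        scores.append((abs(pearson(col, y)), j))
    scores.sort(reverse=True)          # strongest correlations first
    return [j for _, j in scores[:k]]
```

In practice one would apply such a filter before training the SVM or gradient boosted trees, keeping only the features that carry signal about difficulty or performance.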
To develop a multi-turn, dialogue-based conversational recommender system (DCRS), it is important to predict users' intents behind their utterances and their satisfaction with the recommendations, so as to allow the system to incrementally refine its user preference model and adjust its dialogue strategy. However, little work has investigated these issues so far. In this paper, we first contribute two hierarchical taxonomies for classifying user intents and recommender actions, respectively, based on grounded theory. We then define various categories of features covering content, discourse, sentiment, and context to predict users' intents and satisfaction, comparing different machine learning methods. The experimental results for the user intent prediction task show that some models (such as XGBoost and SVM) perform well in predicting user intents, and that incorporating context features into the prediction model can significantly boost performance. Our empirical study also demonstrates that leveraging dialogue behavior features (i.e., both user intents and recommender actions) achieves good results in predicting user satisfaction.
Picture passwords, which require users to draw selections on images as their secret password, typically provide globalized solutions without taking into consideration that people across diverse cultures exhibit differences within interactive systems. Aiming to shed light on the effects of culture on users' interactions within picture password schemes, we conducted a between-subjects, cross-cultural (Eastern vs. Western) study (n=67). Users created a password on a picture illustrating content highly related to their daily-life experiences (culture-internal) vs. a picture illustrating the same daily-life experiences, but in a different cultural context (culture-external). Results revealed that people across cultures exhibited differences in visual processing, comprehension, and exploration of the picture content prior to making their password selections. The observed differences can be accounted for by sociocultural theories highlighting the holistic preference of Eastern populations compared to the analytic preference of Western populations. Qualitative data also triangulate the findings by exposing the likeability of, and users' engagement with, picture content familiar to the individual's culture. The findings underpin the necessity of considering cultural differences in the design of personalized picture passwords.
This paper contributes to the automatic estimation of the subjective emotional experience that audio-visual media content induces in individual viewers, e.g. to support affect-based recommendations. Making accurate predictions of these responses is a challenging task because of their highly person-dependent and situation-specific nature. Findings from psychology indicate that an important driver for the emotional impact of media is the triggering of personal memories in observers. However, existing research on automated predictions focuses on the isolated analysis of audiovisual content, ignoring such contextual influences. In a series of empirical investigations, we (1) quantify the impact of associated personal memories on viewers' emotional responses to music videos in-the-wild and (2) assess the potential value of information about triggered memories for personalizing automatic predictions in this setting. Our findings indicate that the occurrence of memories intensifies emotional responses to videos. Moreover, information about viewers' memory response explains more variation in video-induced emotions than either the identity of videos or relevant viewer-characteristics (e.g. personality or mood). We discuss the implications of these results for existing approaches to automated predictions and describe ways for progress towards developing memory-sensitive alternatives.
Recent years have seen a growing interest in block-based programming environments for computer science education. Although block-based programming offers a gentle introduction to coding for novice programmers, introductory computer science still presents significant challenges, so there is a great need for block-based programming environments to provide students with adaptive support. Predictive student modeling holds significant potential for adaptive support in block-based programming environments because it can identify early on when a student is struggling. However, predictive student models often make simplifying assumptions, such as a normal response distribution or homogeneous student characteristics, which, when invalid, can significantly reduce the models' predictive accuracy.
To address these issues, we introduce an approach to predictive student modeling that utilizes Bayesian hierarchical linear models. This approach explicitly accounts for individual student differences and programming activity differences by analyzing block-based programs created by students in a series of introductory programming activities. Evaluation results reveal that predictive student models that account for both the distributional and hierarchical factors outperform baseline models. These findings suggest that predictive student models based on Bayesian hierarchical modeling and representing individual differences in students can substantially improve models' accuracy for predicting student performance on post-tests. By improving the predictive performance of student models, this work holds substantial potential for improving adaptive support in block-based programming environments.
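The partial-pooling intuition behind hierarchical modeling can be conveyed with a much-simplified empirical-Bayes sketch: each student's estimate is shrunk toward the grand mean, with the amount of shrinkage governed by assumed within- and between-student variances. The paper fits full Bayesian hierarchical linear models; this toy version only illustrates the shrinkage idea, and all names and variance values are our own assumptions.

```python
def partial_pool(groups, tau2=1.0, sigma2=1.0):
    """Shrink each group's mean toward the grand mean, precision-weighted.
    groups: dict mapping group id (e.g., a student) -> list of observations.
    tau2: assumed between-group variance; sigma2: assumed within-group variance."""
    all_obs = [v for vs in groups.values() for v in vs]
    mu = sum(all_obs) / len(all_obs)                 # grand mean across everyone
    shrunk = {}
    for g, vs in groups.items():
        n = len(vs)
        ybar = sum(vs) / n
        w = (n / sigma2) / (n / sigma2 + 1 / tau2)   # weight on the group's own data
        shrunk[g] = w * ybar + (1 - w) * mu          # groups with little data shrink more
    return shrunk
```

A student with a single observation is pulled strongly toward the population mean, while a student with many observations keeps an estimate close to their own average, which is exactly the behavior that protects predictions against sparse, heterogeneous student data.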
Increasing aggregate diversity (or catalog coverage) is an important system-level objective in many recommendation domains where it may be desirable to mitigate the popularity bias and to improve the coverage of long-tail items in recommendations given to users. This is especially important in multistakeholder recommendation scenarios where it may be important to optimize utilities not just for the end user, but also for other stakeholders such as item sellers or producers who desire a fair representation of their items across recommendation lists produced by the system. Unfortunately, attempts to increase aggregate diversity often result in lower recommendation accuracy for end users. Thus, addressing this problem requires an approach that can effectively manage the trade-offs between accuracy and aggregate diversity. In this work, we propose a two-sided post-processing approach in which both user and item utilities are considered. Our goal is to maximize aggregate diversity while minimizing loss in recommendation accuracy. Our solution is a generalization of the Deferred Acceptance algorithm which was proposed as an efficient algorithm to solve the well-known stable matching problem. We prove that our algorithm results in a unique user-optimal stable match between items and users. Using three recommendation datasets, we empirically demonstrate the effectiveness of our approach in comparison to several baselines. In particular, our results show that the proposed solution is quite effective in increasing aggregate diversity and item-side utility while optimizing recommendation accuracy for end users.
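The Deferred Acceptance algorithm that the approach above generalizes can be sketched in its classic one-to-one (Gale-Shapley) form, where users propose to items in preference order; user-proposing DA yields the user-optimal stable matching. This is the textbook algorithm only, not the paper's two-sided post-processing extension.

```python
from collections import deque

def deferred_acceptance(user_prefs, item_prefs):
    """User-proposing Deferred Acceptance (Gale-Shapley), one-to-one.
    user_prefs: {user: [items in preference order]}
    item_prefs: {item: [users in preference order]}
    Returns the user-optimal stable matching {user: item}."""
    rank = {i: {u: r for r, u in enumerate(p)} for i, p in item_prefs.items()}
    next_choice = {u: 0 for u in user_prefs}   # index of each user's next proposal
    match = {}                                 # item -> user it currently holds
    free = deque(user_prefs)
    while free:
        u = free.popleft()
        if next_choice[u] >= len(user_prefs[u]):
            continue                           # u exhausted its list, stays unmatched
        i = user_prefs[u][next_choice[u]]
        next_choice[u] += 1
        held = match.get(i)
        if held is None:
            match[i] = u                       # item tentatively accepts
        elif rank[i][u] < rank[i][held]:
            match[i] = u                       # item prefers u; bump the old holder
            free.append(held)
        else:
            free.append(u)                     # proposal rejected, u tries its next item
    return {u: i for i, u in match.items()}
```

Because acceptances are only tentative ("deferred") until all proposals settle, no user-item pair is left that would both prefer each other over their assigned partners.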
E-commerce and online services are becoming increasingly ubiquitous. Like many other e-commerce paradigms, online grocery services can benefit greatly from recommender systems, especially when it comes to predicting users' shopping behavior. This specific scenario has distinctive characteristics, such as repetitiveness and loyalty, which make the task very different from standard recommendation. In this work, we present an efficient solution for computing the next basket recommendation, under a more general top-n recommendation framework. We propose a set of collaborative filtering techniques able to capture users' shopping patterns. Furthermore, we analyze how recency plays a key role in this particular task. We finally compare our method with state-of-the-art algorithms on two online grocery service datasets.
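To illustrate how recency can enter a next-basket recommender, the following sketch scores items by exponentially decayed purchase frequency over a user's past baskets, so recent repeat purchases dominate. The decay scheme and all names here are our own illustration under that assumption, not the paper's method.

```python
def next_basket_scores(baskets, decay=0.8):
    """Score items by exponentially decayed purchase frequency.
    baskets: one user's past baskets, oldest first (list of lists of item ids).
    decay in (0, 1]: more recent baskets contribute more."""
    scores = {}
    t_last = len(baskets) - 1
    for t, basket in enumerate(baskets):
        w = decay ** (t_last - t)          # age-based weight, 1.0 for the newest basket
        for item in basket:
            scores[item] = scores.get(item, 0.0) + w
    return scores

def recommend(baskets, n=3, decay=0.8):
    """Top-n items by decayed frequency; ties broken alphabetically."""
    scores = next_basket_scores(baskets, decay)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]
```

With decay = 1.0 this degenerates to plain purchase frequency, so the single parameter controls how strongly repetitiveness versus recency drives the ranking.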
In this paper, we analyze a wide range of physiological, behavioral, performance, and subjective measures to estimate cognitive load (CL) during e-learning. To the best of our knowledge, the analyzed sensor measures comprise the most diverse set of features from a variety of modalities that have to date been investigated in the e-learning domain. Our focus lies on predicting the subjectively reported CL and difficulty as well as intrinsic content difficulty based on the explored features. A study with 21 participants, who learned through videos and quizzes in a Moodle environment, shows that classifying intrinsic content difficulty works better for quizzes than for videos, where participants actively solve problems instead of passively consuming videos. Regression analysis for predicting the subjectively reported level of CL and difficulty also works with very low error within content topics. Among the explored feature modalities, eye-based features yield the best results, followed by heart-based and then skin-based measures. Furthermore, combining multiple modalities results in better performance compared to using a single modality. The presented results can guide researchers and developers of cognition-aware e-learning environments by suggesting modalities and features that work particularly well for estimating difficulty and CL.
Video summaries or highlights are a compelling alternative for exploring and contextualizing unprecedented amounts of video material. However, the summarization process is commonly automatic, non-transparent and potentially biased towards particular aspects depicted in the original video. Therefore, our aim is to help users like archivists or collection managers to quickly understand which summaries are the most representative for an original video. In this paper, we present empirical results on the utility of different types of visual explanations to achieve transparency for end users on how representative video summaries are, with respect to the original video. We consider four types of video summary explanations, which use in different ways the concepts extracted from the original video subtitles and the video stream, and their prominence. The explanations are generated to meet target user preferences and express different dimensions of transparency: concept prominence, semantic coverage, distance and quantity of coverage. In two user studies we evaluate the utility of the visual explanations for achieving transparency for end users. Our results show that explanations representing all of the dimensions have the highest utility for transparency, and consequently, for understanding the representativeness of video summaries.
Intelligent computer systems aim to provide user assistance for challenging tasks, such as decision-making, planning, or learning. To offer optimal assistance, it is essential for such systems to decide when to be reactive or proactive and how active system behaviour should be designed, especially as this decision may greatly influence the user's trust in the system. Therefore, we conducted a mixed-factorial study that examines how different levels of proactivity (none, notification, suggestion, and intervention) as well as timing strategies (fixed-timing and insecurity-based) are trusted by subjects while performing a planning task. The results showed that proactive system behaviour is perceived as trustworthy in insecure situations, independent of the timing. However, proactive dialogue showed strong effects on cognition-based trust (the system's perceived competence and reliability) depending on task difficulty. Furthermore, fully autonomous system behaviour fails to establish an adequate human-computer trust relationship, in contrast to conservative strategies.
Drowsiness is a major cause of fatal traffic accidents. Automated driving is intended to counteract this problem, but at the lower levels of automation, the driver is still responsible as a fallback. Current drowsiness detection methods are often based on driving behavior parameters. Since the automation of the driving task reduces the availability of these parameters, alternatives are necessary. Methods that include physiological signals seem particularly promising. However, inside a vehicle, only non- or minimally intrusive measurement techniques are viable. In this work, a machine learning-based driver drowsiness detection method is presented that relies solely on physiological data from non-intrusive, wrist-worn smart wearable devices. A user study (N=30) was conducted on a test track with SAE level-2 automated driving, in which heart rate data were recorded with three commercially available fitness trackers. Different machine learning models were tested in a 2- and 3-level classification of drowsiness. For both cases and with all tested devices, high accuracies (>90%) were achieved. The proposed methodology provides new options for the development of intelligent driver-vehicle interaction concepts and interfaces, especially for driver drowsiness detection on the way to fully automating the driving task.
News articles are increasingly consumed digitally, and recommender systems (RS) are widely used to personalize news feeds for their users. This raises particular concerns about possible biases. When RS filter news articles opaquely, they might "trap" their users in filter bubbles. Additionally, user preferences change frequently in the news domain, which is challenging for automated RS. We argue that both issues can be mitigated by depicting an interactive version of the user's preference profile within an overview of the entire domain of news articles. To this end, we introduce NewsViz, a RS that visualizes the domain space of online news as a treemap, which can be interactively manipulated to personalize a feed of suggested news articles. In a user study (N=63), we compared NewsViz to an interface based on sliders. While both prototypes scored highly in terms of transparency, recommendation quality, and user satisfaction, NewsViz outperformed its counterpart in the perceived degree of control. Structural equation modeling allowed us to further uncover hitherto underestimated influences among quality aspects of RS. For instance, we found that the degree of overview of the item domain influenced the perceived quality of recommendations.
In the classroom, children mainly use general search systems such as Google, Baidu, or Bing. For many years and from different perspectives, calls have been made for providing children in educational contexts with child-friendly search systems. Research responding to this call often focuses on the relevance, readability, and reliability of the retrieved documents. Instead, inspired by a recent study with adult users on the role emotions play in web search, we explore whether and how children searching in a school context react to the emotional content that is often part of Search Engine Result Pages. We do so by examining emotions inferred from queries and corresponding retrieved resources in query logs produced by children aged 9 to 11 in a classroom setting in 3 different countries. We also consider teachers' observations that contextualize this analysis.
With the uptake of algorithmic personalization in the news domain, news organizations increasingly trust automated systems with previously considered editorial responsibilities, e.g., prioritizing news to readers. In this paper we study an automated news recommender system in the context of a news organization's editorial values.
We conduct and present two online studies with a news recommender system, which span one and a half months and involve over 1,200 users. In our first study we explore how our news recommender steers reading behavior in the context of editorial values such as serendipity, dynamism, diversity, and coverage. Next, we present an intervention study where we extend our news recommender to steer our readers to more dynamic reading behavior.
We find that (i) our recommender system yields more diverse reading behavior and yields a higher coverage of articles compared to non-personalized editorial rankings, and (ii) we can successfully incorporate dynamism in our recommender system as a re-ranking method, effectively steering our readers to more dynamic articles without hurting our recommender system's accuracy.
Recommender systems are often biased toward popular items: few items are frequently recommended while the majority of items do not get proportionate attention. This leads to low coverage of items in recommendation lists across users (i.e., low aggregate diversity) and an unfair distribution of recommended items. In this paper, we introduce FairMatch, a general graph-based algorithm that works as a post-processing approach after recommendation generation to improve aggregate diversity. The algorithm iteratively finds items that are rarely recommended yet high-quality and adds them to the users' final recommendation lists. This is done by solving the maximum flow problem on the recommendation bipartite graph. While we focus on aggregate diversity and a fair distribution of recommended items, the algorithm can be adapted to other recommendation scenarios using different underlying definitions of fairness. A comprehensive set of experiments on two datasets and comparison with state-of-the-art baselines show that FairMatch, while significantly improving aggregate diversity, provides comparable recommendation accuracy.
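FairMatch's core machinery, maximum flow on a recommendation bipartite graph, can be illustrated in its unit-capacity special case: maximum bipartite matching via augmenting paths. The sketch below shows only that building block, not FairMatch's iterative procedure, and the edge structure is a toy example of ours.

```python
def max_bipartite_matching(edges, left):
    """Maximum matching on a bipartite graph via augmenting paths
    (the unit-capacity special case of max flow).
    edges: {left_node: list of right_nodes it may be matched to}
    left: the left-side nodes (e.g., users); right side inferred from edges."""
    match_r = {}                            # right node -> left node currently matched

    def try_assign(u, visited):
        # Try to match u, recursively displacing previous matches along
        # an augmenting path if that frees up a slot.
        for v in edges.get(u, ()):
            if v in visited:
                continue
            visited.add(v)
            if v not in match_r or try_assign(match_r[v], visited):
                match_r[v] = u
                return True
        return False

    for u in left:
        try_assign(u, set())
    return {u: v for v, u in match_r.items()}
```

In a FairMatch-like setting the left side would be users and the right side rarely recommended items; a larger matching means more long-tail items placed into someone's list.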
The suggestion of Points of Interest to people with Autism Spectrum Disorder (ASD) challenges recommender systems research because these users' perception of places is influenced by idiosyncratic sensory aversions that can undermine their experience by causing stress and anxiety. Therefore, managing individual preferences is not enough to provide these people with suitable recommendations. To address this issue, we propose a Top-N recommendation model that combines the user's idiosyncratic aversions with her/his preferences in a personalized way, in order to suggest the most compatible and likable Points of Interest. We are interested in finding a user-specific balance of compatibility and interest within a recommendation model that integrates heterogeneous evaluation criteria to take these aspects appropriately into account. We tested our model on both ASD and "neurotypical" people. The evaluation results show that, on both groups, our model outperforms, in accuracy and ranking capability, recommender systems based on item compatibility alone, on user preferences alone, or which integrate these two aspects by means of a uniform evaluation model.
We have become increasingly reliant on recommender systems to help us make decisions in our daily lives. As such, it is becoming essential to explain to users how these systems reason, to enable users to correct system assumptions and to trust the system. The advantages of explaining the recommendation process have been demonstrated by a vast amount of research. Additionally, previous studies have shown that personality affects users' attitudes, tastes, and information processing. However, it is still unclear whether personality has an impact on the way users process and perceive explanations. In this paper, we report the results of a study that investigated how personal characteristics relate to the perception of, and gaze patterns on, a music recommender interface in the presence and absence of explanations. We examined Need for Cognition, Musical Sophistication, and the Big Five personality traits. The results show empirical evidence of effects of Musical Sophistication and Openness on both perception and gaze patterns. We found that users with a high Musical Sophistication and a low Openness score benefit the most from explanations.
While online content is personalized to an increasing degree, e.g., using recommender systems (RS), the rationale behind the personalization and how users can adjust it typically remain opaque. This has often been observed to have negative effects on the user experience and the perceived quality of RS. As a result, research increasingly takes user-centric aspects such as transparency and control into account when assessing the quality of a RS. However, we argue that too little of this research has investigated users' perception and understanding of RS in their entirety. In this paper, we explore users' mental models of RS. More specifically, we followed the qualitative grounded theory methodology and conducted 10 semi-structured, face-to-face interviews with typical and regular Netflix users. During the interviews, participants expressed high levels of uncertainty and confusion about the RS in Netflix. Consequently, we found a broad range of different mental models. Nevertheless, we also identified a general structure underlying all of these models, consisting of four steps: data acquisition, inference of the user profile, comparison of user profiles or items, and generation of recommendations. Based on our findings, we discuss implications for designing more transparent, controllable, and user-friendly RS in the future.
Automated complex word identification (CWI) is a crucial task in several applications, from readability assessment to lexical simplification. So far, several works have modeled CWI with the goal of targeting the needs of non-native speakers. However, studies in language acquisition show that different native languages can create positive or negative interferences w.r.t. reading comprehension, favouring or hindering the understanding of a document in a foreign language. Therefore, we propose to modify CWI to address the specific difficulties connected to different native languages. In particular, we present a pipeline that, based on the user native language, identifies complex terms by automatically detecting cognates and false friends on the fly. The selection presented by the CWI module is adaptive in that it changes depending on the native language of the user. We implement and evaluate our approach for four different native languages (French, English, German and Spanish), in a setting where documents are written in Italian and should be read by language learners with low proficiency. We show that a personalised strategy based on false friend detection identifies complex terms that are different from those usually selected with standard approaches based on word frequency.
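A minimal way to flag candidate cognates (or, conversely, suspiciously similar false friends) is orthographic similarity via normalized edit distance. The sketch below is an illustrative baseline under that assumption, not the pipeline's actual detector; the threshold and example word pairs are ours.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (ca != cb))) # substitution / match
        prev = cur
    return prev[-1]

def looks_like_cognate(word, translation, threshold=0.5):
    """Flag a word pair as a likely cognate when the normalized
    edit distance between the two forms falls below the threshold."""
    dist = edit_distance(word.lower(), translation.lower())
    return dist / max(len(word), len(translation)) < threshold
```

For example, Italian "famiglia" against Spanish "familia" differs by a single letter and is flagged, while an unrelated pair such as "cane"/"dog" is not; distinguishing true cognates from false friends would then require the semantic checks described above.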
Popular approaches in learner modeling explore response time as observational data supplemental to response correctness, to enrich predictive models of learner knowledge. It has been argued that the relationship between response time and knowledge mastery is non-linear, and determining the degree of association (dependence structure) between these two observations remains an open question. To address this objective, we propose an approach based on copulas, i.e., a statistical tool suited to capturing the dependence structure between two variables. Copula models can estimate the dependence structure separately from the marginal distributions, allowing for the construction of more flexible joint distributions than existing multivariate distributions. This paper puts into practice a two-step pipeline for building the analytical models. Specifically, we propose a flexible copula-based approach that describes the dependence structure between students' response time and mastery, in learning and testing contexts, and apply the methodology to four datasets. Two of the datasets come from Intelligent Tutoring Systems and are shared via an online repository; the other two were collected during the validation of an (adaptive) assessment system. The results reveal five generic patterns of association across datasets, for various types of activities, domains, and learner characteristics (i.e., not across contexts). We elaborate on these findings and on the implications of our approach for adaptive systems.
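A common first step in copula modeling is to estimate the dependence structure non-parametrically via Kendall's tau and then map it to a copula parameter; for the Gaussian copula the standard relation is rho = sin(pi * tau / 2). The sketch below illustrates that step only (tau-a, assuming no ties) and is not the paper's full two-step pipeline.

```python
import math

def kendall_tau(xs, ys):
    """Kendall's rank correlation (tau-a; assumes no tied values)."""
    n = len(xs)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if s > 0:
                concordant += 1       # pair ordered the same way in both variables
            elif s < 0:
                discordant += 1       # pair ordered oppositely
    return (concordant - discordant) / (n * (n - 1) / 2)

def gaussian_copula_rho(tau):
    """Map Kendall's tau to the Gaussian copula correlation parameter."""
    return math.sin(math.pi * tau / 2)
```

Because tau depends only on ranks, the same estimate works whatever the marginal distributions of response time and mastery look like, which is precisely what makes the copula decomposition attractive here.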
Subjective well-being (SWB) is a well-studied, widely used construct that refers to how people feel and think about their lives as one of many comprehensive perspectives on well-being. Much research has analyzed the role and utilization of technologies to improve one's SWB; however, especially when it comes to user modeling, multifaceted and variational aspects of SWB are less frequently considered. This paper presents an analysis on identifying factors for smartphone-based data on SWB and modeling SWB changes, based on a four-month user study with 78 college students. Our regression analysis highlights the significance of user attributes (e.g., personality, self-esteem) on SWB and salient factors derived from smartphone data (e.g., time spent on campus, ratio of standing/sitting stationary, expenses) that significantly account for SWB. Our classification analysis shows the potential for detecting SWB changes with reasonable performance, as well as for improving a model to be more tailored to individuals.
Requirements engineering is one of the most critical phases in the context of software development. Unclear textual specifications of requirements, hidden dependencies between requirements, and suboptimal prioritizations and release plans represent the major reasons for project delays and even cancellation. In this paper, we show how group recommender user interfaces can help to improve the quality of requirements engineering processes. To that end, we developed a novel group recommendation approach that focuses on improving requirements prioritization by making preference elicitation processes more flexible and by introducing innovative user interfaces that foster information exchange among stakeholders. We conducted a large user study (N=313 participants) to evaluate our approach. The evaluation results indicate that argumentation-based user interfaces in a group setting trigger more rating and communication activity among the group members, which significantly improves the quality of the prioritization process. Our main contributions are twofold: (1) more flexibility in requirements evaluation by supporting the delegation of votes to experts and (2) increased engagement of the stakeholders responsible for the requirements.
Eliciting the preferences and needs of tourists is challenging, since people often have difficulty expressing them explicitly -- especially in the initial phase of travel planning. Recommender systems employed at this early stage of planning can therefore greatly benefit a user's overall satisfaction. Previous studies have explored pictures as a tool of communication and as a way to implicitly deduce a traveller's preferences and needs. In this paper, we conduct a user study to verify previous claims and conceptual work on the feasibility of modelling travel interests from a selection of a user's pictures. We utilize fine-tuned convolutional neural networks to compute a vector representation of a picture, where each dimension corresponds to a travel behavioural pattern from the traditional Seven-Factor model. In our study, we followed strict privacy principles and did not save uploaded pictures after computing their vector representation. We aggregate the representations of a user's pictures into a single user representation, i.e., a touristic profile, using different strategies. In our user study with 81 participants, we let users adjust the predicted touristic profile and confirm the usefulness of our approach. Our results show that, given a collection of pictures, the touristic profile of a user can be determined.
As recommender systems have become more widespread and moved into areas with greater social impact, such as employment and housing, researchers have begun to seek ways to ensure fairness in the results that such systems produce. This work has primarily focused on developing recommendation approaches in which fairness metrics are jointly optimized along with recommendation accuracy. However, previous work has largely ignored how individual preferences may limit the ability of an algorithm to produce fair recommendations. Furthermore, with few exceptions, researchers have only considered scenarios in which fairness is measured relative to a single sensitive feature or attribute (such as race or gender). In this paper, we present a re-ranking approach to fairness-aware recommendation that learns individual preferences across multiple fairness dimensions and uses them to enhance provider fairness in recommendation results. Specifically, we show that our opportunistic and metric-agnostic approach achieves a better trade-off between accuracy and fairness than prior re-ranking approaches, and does so across multiple fairness dimensions.
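As a hedged illustration of fairness-aware re-ranking (the function, its parameters, and the per-user tolerance weight below are hypothetical assumptions, not the paper's actual algorithm), a greedy re-ranker can trade relevance against provider-group exposure, modulated by how much fairness-driven deviation an individual user is estimated to tolerate:

```python
def rerank(candidates, scores, provider_groups, user_tolerance, k=3, lam=0.5):
    """Greedy fairness-aware re-ranking sketch.

    candidates: item ids ordered by relevance
    scores: item -> relevance score in [0, 1]
    provider_groups: item -> provider group (one fairness dimension)
    user_tolerance: estimated tolerance of this user for
        fairness-driven deviation from pure relevance (0..1)
    """
    selected, group_counts = [], {}
    pool = list(candidates)
    while pool and len(selected) < k:
        def utility(item):
            # Penalise items whose provider group is already well represented
            seen = group_counts.get(provider_groups[item], 0)
            fairness_bonus = 1.0 / (1 + seen)
            return (1 - lam * user_tolerance) * scores[item] \
                 + lam * user_tolerance * fairness_bonus
        best = max(pool, key=utility)
        pool.remove(best)
        selected.append(best)
        g = provider_groups[best]
        group_counts[g] = group_counts.get(g, 0) + 1
    return selected
```

With `user_tolerance=0` the ranking reduces to pure relevance order; as the tolerance grows, items from under-exposed provider groups are promoted earlier in the list.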
Smartphones utilize context signals, such as time and location, to predict app usage tailored to individual users. To be effective, such personalization relies on access to sufficient information about each user's behavioral habits. For new users, this behavior information may be sparse or non-existent. To handle these cases, app category usage prediction approaches can employ signals from users who are similar along one or more dimensions, i.e., those in the same cohort. In this paper, we describe a characterization and evaluation of the use of such cohort modeling to enhance app category usage prediction. We experiment with pre-defined cohorts from three taxonomies - demographics, psychographics, and behavioral patterns - independently and in combination. We also evaluate various approaches to assigning users to the corresponding cohorts. We show, through extensive experiments with large-scale mobile app usage logs from a mobile advertising company, that leveraging cohort behavior can yield significant prediction performance gains over using personalized signals at the individual prediction level. In addition, compared to the personalized model, the cohort-based approach can significantly alleviate the cold-start problem, achieving strong predictive performance even with a limited amount of user interaction.
Over the course of the last decade, online retailers have demonstrated that knowledge about customer preferences and shopping patterns is an important asset for running a successful business. For example, customer preferences and shopping histories are the foundation for recommender systems that support the search for relevant products to buy online. With the increasing adoption of modern technologies, traditional retailers are able to collect similar data about customer behavior in their stores. For example, smart fitting rooms make it possible to track customers' interactions with products beyond the scope of a traditional retail store. In this paper, we explore how customers of a large international fashion retailer buy products online and in brick-and-mortar stores, and uncover significant differences between the two domains. In particular, we find that online customers frequently focus on buying products from one specific category, whereas customers in brick-and-mortar stores often buy a more diverse range of product types. Further, we investigate products that customers take into fitting rooms, and we find that these frequently deviate from, and complement, purchases. Finally, we demonstrate how our findings impact practical applications, illustrated using recommender systems, and discuss how shopping baskets from different domains can be leveraged.
Serendipity-oriented recommender systems have increasingly been recognized as useful for overcoming the "filter bubble" problem of accuracy-oriented recommenders, by recommending unexpected yet relevant items to users. However, most existing systems are based on researchers' assumptions about the effect of item features on serendipity; few studies have examined, from the users' perspective, which item features and user characteristics might affect perceived serendipity. In this paper, we attempt to fill this gap based on the results of a large-scale user survey (involving over 10,000 users). We analyze the correlation of different types of features (i.e., numerical and categorical) with user perceptions, and furthermore identify interaction effects of user characteristics (such as personality traits and curiosity). We finally discuss the implications of our work for augmenting the effectiveness of current serendipity-oriented recommender systems.
Driving can occupy a considerable part of our daily lives and is often associated with high levels of stress. Motivated by the effectiveness of controlled breathing, this work studies the potential use of breathing interventions while driving to help manage stress. In particular, we implemented and evaluated a closed-loop system that monitored the breathing rate of drivers in real-time and delivered either a conscious or an unconscious personalized acoustic breathing guide whenever needed. In a study with 24 participants, we observed that conscious interventions more effectively reduced the breathing rate but also increased the number of driving mistakes. We observed that prior driving experience as well as personality are significantly associated with the effect of the interventions, which highlights the importance of considering user profiles for in-car stress management interventions.
Motivated by recent advances in reinforcement learning and the well-established Self-Determination Theory (SDT), we explored the impact of pedagogical policies induced by hierarchical reinforcement learning (HRL), and of data-driven explanations of those policies, on student experience in an Intelligent Tutoring System (ITS). We explored their impacts first independently and then jointly. Overall, our results showed that 1) the HRL-induced policies could significantly improve students' learning performance, and 2) explaining the tutor's decisions to students through data-driven explanations could improve the student-system interaction in terms of students' engagement and autonomy.
A central concern in an interactive intelligent system is the optimization of its actions, so as to be maximally helpful to its human user. In recommender systems, for instance, the action is to choose what to recommend, and the optimization task is to recommend items the user prefers. The optimization is done based on the user's earlier feedback (e.g. "likes" and "dislikes"), and the algorithms assume the feedback to be faithful. That is, when the user clicks "like," they actually prefer the item. We argue that this fundamental assumption can be extensively violated by human users, who are not passive feedback sources. Instead, they are in control, actively steering the system towards their goal. To verify this hypothesis -- that humans steer and are able to improve performance by steering -- we designed a function optimization task in which a human and an optimization algorithm collaborate to find the maximum of a 1-dimensional function. At each iteration, the optimization algorithm queries the user for the value of a hidden function f at a point x, and the user, who sees the hidden function, provides an answer about f(x). Our study with 21 participants shows that users who understand how the optimization works strategically provide biased answers (answers not equal to f(x)), which results in the algorithm finding the optimum significantly faster. Our work highlights that next-generation intelligent systems will need user models capable of helping users who steer systems to pursue their goals.
This paper investigates the user experience of visualizations of a machine learning (ML) system that recognizes objects in images. This is important since even good systems can fail in unexpected ways, as misclassifications on photo-sharing websites have shown. In our study, we exposed users with a background in ML to three visualizations of three systems with different levels of accuracy. In interviews, we explored how the visualizations helped users assess the accuracy of systems in use, and how the visualization and the accuracy of the system affected trust and reliance. We found that participants focus not only on accuracy when assessing ML systems: they also take the perceived plausibility and severity of misclassifications into account, and prefer seeing the probability of predictions. Semantically plausible errors are judged as less severe than implausible ones, which means that system accuracy could be communicated through the types of errors it makes.
Technology-assisted systems to monitor and assess rehabilitation exercises offer an opportunity to enhance rehabilitation practice by automatically collecting quantitative data on patients' performance. However, developing such a system remains challenging, because patients' physical conditions vary widely; moreover, a system built on a complex algorithm (e.g. a neural network) becomes a black box that cannot explain its predictions. To address these challenges, this paper presents a hybrid model that integrates a machine learning (ML) model with a rule-based (RB) model as an explainable artificial intelligence (AI) technique for the quantitative assessment of stroke rehabilitation exercises. For evaluation, we collected therapists' assessment knowledge as 15 rules from interviews with therapists, along with a dataset of three upper-limb stroke rehabilitation exercises from 15 post-stroke and 11 healthy subjects, recorded using a Kinect sensor. Experimental results show that the hybrid model not only achieves performance comparable to a neural-network ML model, but also provides explanations of its predictions through the RB model. The results indicate the potential of a hybrid model as an explainable AI technique to support the interpretation of a model and to fine-tune a model with user-specific rules for personalization.
Social collaborative platforms such as GitHub and Stack Overflow have been increasingly used to improve work productivity via collaborative efforts. To improve user experiences in these platforms, it is desirable to have a recommender system that can suggest not only items (e.g., a GitHub repository) to a user, but also activities to be performed on the suggested items (e.g., forking a repository). To this end, we propose a new approach dubbed Keen2Act, which decomposes the recommendation problem into two stages: the Keen and Act steps. The Keen step identifies, for a given user, a (sub)set of items in which he/she is likely to be interested. The Act step then recommends to the user which activities to perform on the identified set of items. This decomposition provides a practical approach to tackling complex activity recommendation tasks while producing higher recommendation quality. We evaluate our proposed approach using two real-world datasets and obtain promising results whereby Keen2Act outperforms several baseline models.
Past studies have shown that personality has a significant association with user behaviour and preferences, not least towards music. This makes personality information a promising aspect for user modelling in personalised recommender systems and similar domains. In contrast to existing studies, which investigate personality correlates of music preferences via genres or styles, we study such correlates by modelling music preferences at a finer-grained content level, using audio features of the music users listen to. Leveraging listening and personality information of more than 1,300 Last.fm users, we identify several significant medium and weak correlations between music audio features and personality traits, the latter defined by the five-factor model. Our results provide useful insights into the relationship between personality and music preference, which can be valuable for music recommender systems in terms of more personalised recommendations.
Item cold-start recommendation, which predicts user preference on new items that have no user interaction records, is an important problem in recommender systems. In this paper, we model the disparity between user preferences on warm items (those with interaction records) and those on cold-start items using the Wasserstein distance. On this basis, we propose Wasserstein Collaborative Filtering (WCF), which predicts user preference on cold-start items by minimizing the Wasserstein distance under a user embedding constraint. Our analysis shows that minimizing the Wasserstein distance ensures that users sharing similar tastes on warm items also have similar preferences on cold-start items. Experimental results show that WCF consistently outperforms state-of-the-art methods in recommendation quality, usually by a large margin.
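To illustrate the distance underlying WCF (this sketch only applies SciPy's 1-D Wasserstein distance to hypothetical preference scores; it is not the paper's full collaborative-filtering objective):

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Hypothetical predicted preference scores of one user over warm items
# versus two candidate preference distributions over cold-start items
warm_prefs = np.array([0.9, 0.7, 0.8, 0.6])
cold_prefs_similar = np.array([0.85, 0.75, 0.65])
cold_prefs_shifted = cold_prefs_similar - 0.5  # systematically lower

# The Wasserstein distance quantifies the disparity between the
# warm-item and cold-start-item preference distributions
d_similar = wasserstein_distance(warm_prefs, cold_prefs_similar)
d_shifted = wasserstein_distance(warm_prefs, cold_prefs_shifted)
```

A small distance (as with `cold_prefs_similar`) indicates that the user's predicted cold-start preferences are distributed much like their warm-item preferences, which is the property WCF's minimization seeks to enforce.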
An increasing body of research indicates that transparency in recommender systems affects users' trust. Additionally, a vast number of studies have shown that personality impacts the way users perceive a recommender system. However, only recently has research begun to investigate the effects of cognitive style on the perception of recommender systems. Furthermore, it is still unclear whether cognitive style also affects users' interaction strategies, and whether the reasons why and when users want transparency are affected by it. Additionally, despite the ubiquitous presence of recommender systems in mobile environments, no study has investigated the effect of transparency for mobile music recommender systems. In this paper, we report the results of a within-subject study (N=25) on a mobile music recommender system in which we investigated the effect of cognitive style on three different aspects: interaction strategies with the different applications, the reasons why and when users want transparency, and the effect of transparency on users' trust. The results show that users with a rational thinking style put more effort into seeking the best recommendations and want scrutable explanations in order to adjust the recommendations. In contrast, intuitive thinkers only need explanations when they search for a very specific kind of music.
The current study investigated the role of trust in students' attitudes towards personal data sharing in the context of e-assessment, and whether this differs for students with special educational needs and disabilities (SEND). SEND students were included as a special target group because they may feel more dependent on e-assessment technologies and thus consent to personal data sharing more easily. A mixed-methods research design was adopted, combining an online survey and a focus group interview to collect quantitative and qualitative data. The findings suggest that a considerable number of students trust e-assessment technology that does not require the physical presence of a supervisor. Students who trust the technology are more likely to perceive it as having no disadvantages, and are more willing to share their personal data for e-assessment purposes. The responses of SEND and non-SEND students do not differ significantly in terms of trust. However, the results diverge regarding the relation between trust and the perception of e-assessment technology as having no disadvantages. Practical implications for informed consent are discussed.
Food recommender systems typically rely on popularity, as well as on similarity between recipes, to generate personalized suggestions. However, this leaves little room for users to explore new preferences, such as adopting healthier eating habits.
In this short paper, we present a recommendation strategy based on knowledge about food and users' health-related characteristics to generate personalized recipe suggestions. Focusing on personal factors such as a user's BMI and dietary constraints, we exploited a holistic user model to re-rank a basic recommendation list of 4,671 recipes, and investigated in a web-based experiment (N=200) to what extent it generated satisfactory food recommendations. We found that some of the information encoded in a user's holistic profile affected their preferences, providing us with interesting findings for continuing this line of research.
Language provides a unique window into thought, enabling direct assessment of alterations in mental state. Due to their increasing popularity, online social media platforms have become a promising means of studying different mental disorders. However, the lack of available datasets can hinder the development of innovative diagnostic methods, and tools to assist health practitioners in screening and monitoring individuals at potential risk are essential.
In this paper, we present a new dataset to foster research on the automatic detection of depression. To this end, we describe a methodology for automatically collecting large samples of depression and non-depression posts from online social media. Furthermore, we perform a benchmark on the dataset to establish a point of reference for researchers interested in using it.
With estimates suggesting that half of the world's population learns or speaks at least two languages, Web information access systems such as Web search engines need to cater for an increasing variety of individual language proficiencies and preferences. However, while significant advances have been made regarding the handling, retrieval, and automatic translation of multilingual information, there has been a relative lack of user-centered research aiming to support individual users' multilingual abilities. To address this research gap, this paper presents a series of user studies and experiments that aim to inform novel search solutions that specifically support multilingual users. In particular, the experiments presented in this paper examine the extent to which a system can predict, for a given query, what language(s) a multilingual user would prefer the search results to be in. Results from our studies show that such predictions can statistically significantly outperform a baseline model, and that users' languages and proficiencies, their current location, as well as the search topic domain and type all influence the prediction results.
Information Visualization is a key technique to assist users in data analysis tasks, by creating visual representations of data to amplify human cognition. However, while human cognitive abilities and styles have been shown to differ significantly, Information Visualizations have traditionally been designed in a manner that does not consider such individual user differences. Recent research has started to address this issue, by identifying individual user characteristics that influence individual users' interactions with Information Visualizations, as well as developing novel Information Visualization systems that provide more personalized support. This paper presents a set of experiments aimed towards building such User-Adaptive Information Visualization systems, by studying the extent to which a user's cognitive style can be inferred from a user's interaction with an Information Visualization system. Results show that a user's eye gaze data can be used to infer a user's cognitive style during information visualization usage with up to 86% accuracy, and that the most informative features relate to a user's saccade angles and fixation durations.
Users from Location-Based Social Networks can be characterised by how and where they move. However, most of the works that exploit this type of information neglect either its sequential or its geographical properties. In this article, we focus on a specific family of recommender systems, those based on nearest neighbours; we define related users based on common check-ins and similar trajectories and analyse their effects on the recommendations. For this purpose, we use a real-world dataset and compare the performance on different dimensions against several state-of-the-art algorithms. The results show that better neighbours could be discovered with these approaches if we want to promote novel and diverse recommendations.
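A minimal sketch of the common-check-ins notion of neighbourhood (the similarity function and example data below are illustrative assumptions; the paper's actual neighbour definitions also exploit trajectory similarity):

```python
def checkin_jaccard(user_a_checkins, user_b_checkins):
    """Similarity between two users based on common check-in venues."""
    a, b = set(user_a_checkins), set(user_b_checkins)
    return len(a & b) / len(a | b) if a | b else 0.0

def nearest_neighbours(target, all_users, k=2):
    """Rank the other users by check-in overlap with the target user."""
    others = [(u, checkin_jaccard(all_users[target], cs))
              for u, cs in all_users.items() if u != target]
    return [u for u, _ in sorted(others, key=lambda t: -t[1])[:k]]
```

Neighbours found this way reflect geographical co-occurrence; a trajectory-based variant would instead compare the order in which venues are visited, capturing the sequential properties the article argues are often neglected.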
Market cannibalization is inevitable when two or more competing marketing approaches target the same customer base. The cannibalization problem has been discussed in the context of search advertising by individual advertisers, whereas in this paper we discuss the problem that advertising platform companies face in dealing with multiple advertisers. In online advertising, such platforms must properly serve ads with varying mass appeal to users with various interests. For them, it is important to maximize the value of the ads both for advertisers and for the platform. To do so, they deploy user models to serve ads. However, shortsighted models can decrease overall performance by improving certain ads' performance while slightly impairing the rest. We consider this phenomenon from the perspective of cannibalization and confirm the existence of a cannibalization problem in optimizing the delivery of ads in minor categories. To resolve this problem, we propose new methods, apply them to an ad delivery system, and conduct an A/B test. Our methods overcame the cannibalization problem and increased revenue by +0.6% compared with the baseline method.
Peer grading, in which students grade each other's work, can provide an educational opportunity for students and reduce grading effort for instructors. A variety of methods have been proposed for synthesizing peer-assigned grades into accurate submission grades. However, when the assumptions behind these methods are not met, they may underperform a simple baseline of averaging the peer grades. We introduce SABTXT, which improves over previous work through two mechanisms. First, SABTXT uses a limited amount of historical instructor ground truth to model and correct for each peer's grading bias. Second, SABTXT models the thoroughness of a peer review based on its textual content, and puts more weight on the more thorough peer reviews when computing submission grades. In our experiments with over ten thousand peer reviews collected over four courses, we show that SABTXT outperforms existing approaches on our collected data, achieving a mean squared error that is on average 6% lower than the strongest baseline.
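The two mechanisms can be sketched as follows (a simplified illustration under stated assumptions: an additive bias estimate and review word count as a crude thoroughness proxy; SABTXT's actual text-based model is more sophisticated):

```python
def estimate_bias(peer_grades_on_calibration, instructor_grades):
    """Average signed deviation of a peer's grades from the instructor's
    ground truth on calibration submissions (positive = lenient)."""
    diffs = [p - t for p, t in zip(peer_grades_on_calibration, instructor_grades)]
    return sum(diffs) / len(diffs)

def submission_grade(peer_grades, peer_biases, review_texts):
    """Bias-correct each peer grade, then weight by review thoroughness
    (naively proxied here by review word count)."""
    corrected = [g - b for g, b in zip(peer_grades, peer_biases)]
    weights = [len(t.split()) for t in review_texts]
    return sum(c * w for c, w in zip(corrected, weights)) / sum(weights)
```

For example, a peer who grades one point high on calibration work gets a bias of 1.0 subtracted from their future grades, and a one-word review counts far less than a detailed one when the corrected grades are averaged.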
Interactive dashboards enable viewing and interacting with complex underlying data using visualisations such as charts, tables, maps, or even text, typically on a single display. By bringing the most important information into a single place, dashboards enable performance monitoring and support decision making. Although dashboards are nowadays widely adopted in many domains, they involve challenges that prevent users from utilising them as intended. For example, a dashboard with too much data can negatively affect decision making and lead to misleading interpretations. Through this research, we identify and investigate the challenges associated with dashboards, what users do in response to those challenges, and what adaptations can be applied to mitigate them. Consequently, we aim to examine and evaluate a set of adaptation techniques that can improve the experience of users interacting with dashboards.
Driving behaviour is key to determining the safety of individuals on the road. It can be argued that understanding driving behaviour, and developing methods to improve it, will lead to fewer accidents and improved citizen safety. At present, most of the work associated with driving behaviour is carried out by insurance companies, who use mobile apps and telematic sensors to monitor driving behaviours. These companies mainly capture driving data to calculate annual premiums rather than to share that data with the drivers. On the academic side, the work focuses on feedback approaches and real-time warning systems. Neither commercial nor academic research considers the significant fact that not all drivers are the same: "one-size-fits-all" will not work.
This research investigates the scope of personalisation by factors such as age, gender, culture, country and type of driving (e.g. rural or urban) and its impact on driver behaviour. The aim is to improve the effectiveness of driving behaviour systems so that they can produce meaningful feedback for the driver. Our model suggests that through personalisation, user modelling and persuasive techniques, such as regular feedback reports showing drivers their poor driving behaviour, it is possible to improve driving styles and eventually create improved driving behaviour systems. Another positive outcome of this model would be safer roads. We have conducted surveys, focus groups and interviews to identify the types of driver and their preferences.
Educational Recommender Systems (EdRecSys) are different in nature from conventional Recommender Systems (RecSys) --mostly associated with e-commerce-- as the main goal of EdRecSys is supporting students' learning instead of maximizing users' satisfaction from consuming the recommended items. Thus, research on transparency for traditional RecSys is hard to transfer from e-commerce contexts to educational scenarios, as the level of knowledge of the end-user (i.e. the student) is crucial for generating recommendations and evaluating their impact on students' learning. In this paper I present the main idea of my thesis proposal, which aims to fill this gap by taking a user-centered approach that combines the design and evaluation of personalized recommender algorithms and explanatory interfaces with students in real learning contexts.
Using a robot to guide a non-medically skilled human helper through post-stroke neurorehabilitation therapies is a challenging task. Much information needs to be conveyed quickly and precisely to give the helper the feeling of "doing the right thing" during a therapy session. This doctoral research paper highlights current efforts to model the interaction in such a situation and presents the setup for its research. We propose a robot system setup to be used for the "arm basis training" (ABT), and present selected research questions for modelling both the users and the role of the robot. On the whole, we aim to make patient-helper interaction more engaging and easier, which could enable even non-medical helpers to perform this therapy and keep both participants' motivation high throughout.
Player modelling is an important task for almost any game creator, as it helps in understanding the player base. One of the major issues is that players often leave early, which makes modelling them challenging. In our research, we focus on the cold-start problem by utilizing information about a player from multiple games, or from other players in a given game. Although multiple studies focus on cross-game modelling, they often still require manual mapping of features or do not consider a player's behaviour specific to the given game. Our proposed method is based on transfer learning and unsupervised translation. In addition, we propose a combination of group-based and individual player models.
This tutorial provides a common ground for both researchers and practitioners interested in data and algorithmic bias in recommender systems. Guided by real-world examples in various domains, we introduce the problem space and the concepts underlying bias investigation in recommendation. Then, we practically show two use cases, addressing biases that lead to disparate exposure of items based on their popularity and to systematic discrimination against a legally protected class of users. Finally, we cover a range of techniques for evaluating and mitigating the impact of these biases on the recommended lists, including pre-, in-, and post-processing procedures. This tutorial is accompanied by Jupyter notebooks that put the core concepts into practice on data from real-world platforms.
Ethical considerations are receiving increased attention with regard to providing responsible personalization for robots and autonomous systems, partly as a result of the currently limited deployment of such systems in human support and interaction settings. The tutorial will give an overview of the most commonly expressed ethical challenges and of the ways being undertaken to reduce their impact, drawing on the findings of an earlier review supplemented with recent work and initiatives. The tutorial will exemplify the challenges related to privacy, security and safety through several examples from our own and others' work.
The objective of this tutorial is to give a structured overview of the conceptual frameworks behind current state-of-the-art recommender systems, explain their underlying assumptions, the resulting methods and their shortcomings, and introduce an exciting new class of approaches that frames the task of recommendation as a counterfactual policy learning problem. The tutorial is divided into two modules. In module 1, participants learn about current approaches for building real-world recommender systems, which mainly comprise two frameworks: recommendation as optimal auto-completion of user behaviour, and recommendation as reward modelling. In module 2, we present the framework of recommendation as a counterfactual policy learning problem and go over the theoretical guarantees that address the shortcomings of the previous frameworks. We then go over the associated algorithms and test them against classical methods in RecoGym, an open-source recommendation simulation environment.
Overall, we believe the subject of the course is highly topical: it fills a gap between established recommendation frameworks and cutting-edge research, and sets the stage for future advances in the field.
Adaptive systems are usually interactive systems. As such they stand to benefit considerably from a development lifecycle that ensures user involvement from the early design stages, and embraces evaluation, in both formative and summative forms. Evaluating an adaptive system involves a number of specific problems and pitfalls that need to be addressed by the selection of specific methods, techniques and criteria. This tutorial aims to introduce participants to the peculiarities that arise when evaluating adaptive interactive systems. A layered evaluation framework is used to separate the evaluation process into a number of different aspects which can be applied at different stages throughout the development life-cycle.
ACM PATCH 2020, organized in conjunction with the 28th International Conference on User Modeling, Adaptation and Personalization, is the latest event of the PATCH series, started in 2007 and held within the UMAP and IUI Conference series. We summarize the main ideas addressed in the papers accepted for publication in the workshop proceedings and for presentation at the event.
The Second International Workshop on Adaptive and Personalized Privacy and Security (APPS 2020) aims to bring together researchers and practitioners working on diverse topics related to understanding and improving the usability of privacy and security software and systems, by applying user modeling, adaptation and personalization principles. Our special focus in 2020 is on healthcare systems, more specifically on ensuring security and privacy of medical data in smart patient-centric healthcare systems. The second edition of the workshop includes interdisciplinary contributions from Austria, Canada, China, Cyprus, Denmark, Germany, Greece, Israel, the Netherlands, Turkey and the UK that introduce new and disruptive ideas, suggest novel solutions, and present research results about various aspects (theory, applications, tools) for bringing user modeling, adaptation and personalization principles into privacy and systems security. This summary gives a brief overview of APPS 2020, held online in conjunction with the 28th ACM Conference on User Modeling, Adaptation and Personalization (ACM UMAP 2020).
A wide range of tools and applications have been developed for supporting Computer Science Education, ranging from visual programming languages to web applications. In this setting it is crucial to model user needs and provide personalized support to improve the effectiveness and satisfaction of learning experiences. This summary gives a brief overview of the workshop on Adaptation and Personalization in Computer Science Education, organized at UMAP 2020 to bring together researchers, practitioners and education stakeholders interested in these topics. The workshop program consists of a keynote speech by Wolfgang Slany, head of the Catrobat Project, and three technical sessions offering different perspectives on the main themes of the workshop.
The profound digital transformation now underway has elevated the computational system into an intelligent, multidimensional communication medium that creates new opportunities, competencies, models and processes. The need for human-centered adaptation and personalization is increasingly recognized, since it can offer hybrid solutions that adequately support the growing multi-purpose goals, needs, requirements, activities and interactions of users. The HAAPIE workshop embraces the essence of "human-machine co-existence" and brings together researchers and practitioners from different disciplines to present and discuss a wide spectrum of related challenges, approaches and solutions. In this respect, the fifth edition of HAAPIE includes 5 long papers.
The 3rd FairUMAP workshop brings together researchers working at the intersection of user modeling, adaptation, and personalization on the one hand, and bias, fairness and transparency in algorithmic systems on the other hand.
Adaptive and personalized systems have become pervasive technologies that play an increasingly important role in our daily lives. Indeed, we are now used to interacting every day with algorithms that help us in many scenarios, ranging from services that suggest music to listen to or movies to watch, to personal assistants able to proactively support us in complex decision-making tasks.
As the importance of such technologies in our everyday lives grows, it is essential that the internal mechanisms guiding these algorithms be as clear as possible. Unfortunately, current research tends to go in the opposite direction, since most approaches try to maximize the effectiveness of the personalization strategy (e.g., recommendation accuracy) at the expense of the explainability and transparency of the model.
The main research question arising from this scenario is simple and straightforward: How can we deal with this dichotomy between the need for effective adaptive systems and the right to transparency and interpretability?
The workshop aims to provide a forum for discussing these problems, challenges and innovative research approaches by investigating the role of transparency and explainability in recent methodologies for building user models and for developing personalized and adaptive systems.