ETRA '18: Proceedings of the 2018 ACM Symposium on Eye Tracking Research & Applications

SESSION: Cognition

An investigation of the effects of n-gram length in scanpath analysis for eye-tracking research

Scanpath analysis is a controversial and important topic in eye tracking research. Previous work has shown the value of scanpath analysis in perceptual tasks; little research has examined its utility for understanding human reasoning in complex tasks. Here, we analyze n-grams, which are continuous ordered subsequences of participants' scanpaths. In particular we studied the length of n-grams that are most appropriate for this form of analysis. We reuse datasets from previous studies of human cognition, medical diagnosis and art, systematically analyzing the frequency of n-grams of increasing length, and compare this approach with a string alignment-based method. The results show that subsequences of four or more areas of interest may not be of value for finding patterns that distinguish between two groups. The study is the first to systematically define the parameters of the length of n-gram suitable for analysis, using an approach that holds across diverse domains.
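As a rough illustration of the n-gram idea described above, the following minimal sketch extracts contiguous ordered subsequences from scanpaths and counts them across a group. The AOI labels and the helper names `scanpath_ngrams` and `ngram_frequencies` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def scanpath_ngrams(scanpath, n):
    """Return every contiguous, ordered subsequence of length n."""
    return [tuple(scanpath[i:i + n]) for i in range(len(scanpath) - n + 1)]


def ngram_frequencies(scanpaths, n):
    """Count how often each n-gram occurs across a group of scanpaths."""
    counts = Counter()
    for path in scanpaths:
        counts.update(scanpath_ngrams(path, n))
    return counts


# Hypothetical AOI sequences for two participants in one group.
group = [list("ABCAB"), list("BCAB")]
bigrams = ngram_frequencies(group, 2)
trigrams = ngram_frequencies(group, 3)
```

Comparing such frequency tables between two groups for increasing values of `n` is one simple way to see how quickly longer subsequences become too rare to be informative.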

Evaluating gender difference on algorithmic problems using eye-tracker

Gender differences in programming comprehension have been a topic of discussion in recent years. We conducted an eye-tracking study on 51 (21 female, 30 male) computer science undergraduate university students to examine their cognitive processes in pseudocode comprehension. We aim to identify their reading strategies and eye gaze behavior in pseudocode comprehension in terms of performance and visual effort when solving algorithmic problems of varying difficulty levels. Each student completed a series of tasks requiring them to rearrange randomized pseudocode statements into the correct order for the problem presented. Our results indicated that male students analyzed the problems faster, although female students fixated longer while understanding the problem requirements. In addition, female students more commonly fixated on indicative verbs (i.e., prompt, print), while male students fixated more on operational statements (i.e., loops, variable calculations, file handling).

How many words is a picture worth?: attention allocation on thumbnails versus title text regions

Cognitive scientists and psychologists have long noted the "picture superiority effect", that is, pictorial content is more likely to be remembered and more likely to lead to an increased understanding of the material. We investigated the relative importance of pictorial regions versus textual regions on a website where pictures and text co-occur in a very structured manner: video content sharing websites. We tracked participants' eye movements as they performed a casual browsing task, that is, selecting a video to watch. We found that participants allocated almost twice as much attention to thumbnails as to title text regions. They also tended to look at the thumbnail images before the title text, as predicted by the picture superiority effect. These results have implications for both user experience designers and video content creators.

Cross-subject workload classification using pupil-related measures

Real-time evaluation of a person's cognitive load can be desirable in many situations. It can be employed to automatically assess or adjust the difficulty of a task, as a safety measure, or in psychological research. Eye-related measures, such as the pupil diameter or blink rate, provide a non-intrusive way to assess the cognitive load of a subject and have therefore been used in a variety of applications. Usually, workload classifiers trained on these measures are highly subject-dependent and transfer poorly to other subjects. We present a novel method to generalize from a set of trained classifiers to new and unknown subjects. We use normalized features and a similarity function to match a new subject with similar subjects, for which classifiers have been previously trained. These classifiers are then used in a weighted voting system to detect workload for an unknown subject. For real-time workload classification, our method performs at 70.4% accuracy. Higher accuracy of 76.8% can be achieved in an offline classification setting.
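The weighted voting idea can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the similarity function, so an inverse-distance similarity is assumed here, and the per-subject profiles and classifiers are hypothetical stand-ins.

```python
import math


def similarity(a, b):
    """Assumed similarity: inverse of (1 + Euclidean distance)."""
    return 1.0 / (1.0 + math.dist(a, b))


def weighted_vote(sample, new_profile, trained):
    """Combine per-subject classifiers, weighted by subject similarity.

    trained is a list of (profile, classifier) pairs, where each
    classifier maps a sample to 0 (low workload) or 1 (high workload).
    """
    score = 0.0
    total = 0.0
    for profile, classify in trained:
        weight = similarity(new_profile, profile)
        score += weight * classify(sample)
        total += weight
    return 1 if score / total >= 0.5 else 0


# Hypothetical classifiers trained on two known subjects.
trained = [
    ((0.0, 0.0), lambda s: 1 if s[0] > 0.5 else 0),
    ((5.0, 5.0), lambda s: 0),
]
new_profile = (0.1, 0.0)  # normalized features of the unknown subject
high = weighted_vote((0.8,), new_profile, trained)
low = weighted_vote((0.2,), new_profile, trained)
```

In this toy setup the unknown subject is much closer to the first known subject, so that subject's classifier dominates the vote.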

Correlation between gaze and hovers during decision-making interaction

Taps constitute only a small part of the manual input when interacting with touch-enabled surfaces. Indeed, how the hand behaves in the hovering space is informative of what the user intends to do. In this article, we present a data collection related to hand and eye motion. We tailored a kiosk-like system to record participants' gaze and hand movements. We specifically designed a memory game to detect the decision-making process users may face. Our data collection comprises 177 trials from 71 participants. Based on a hand movement classification, we extracted 16588 hovers. We study the gaze behaviour during hovers, and we found that the distance between gaze and hand depends on the target's location on the screen. We also showed how indecision can be deduced from this distance.

A system to determine if learners know the divisibility rules and apply them correctly

Mathematics teachers may find it challenging to manage the learning that takes place in learners' minds. Typical true/false or multiple choice assessments, whether in oral, written or electronic format, do not provide evidence that learners applied the correct principles.

A system was developed to analyse learners' gaze behaviour while they were determining whether a multi-digit dividend is divisible by a divisor. The system provides facilities for a teacher to set up tests and generate various types of quantitative and qualitative reports.

The system was tested with a group of 16 learners from Grade 7 to Grade 10 in a pre-post experiment to investigate the effect of revision on their performance. It was proven that, with tests that are carefully compiled according to a set of heuristics, eye tracking can be used to determine whether learners use the correct strategy when applying divisibility rules.

SESSION: Fundamental eye tracking

Supervised descent method (SDM) applied to accurate pupil detection in off-the-shelf eye tracking systems

The precise detection of the pupil/iris center is key to estimating gaze accurately. This becomes especially challenging in low-cost frameworks in which the algorithms employed for high performance systems fail. In recent years an outstanding effort has been made to apply training-based methods to low resolution images. In this paper, the Supervised Descent Method (SDM) is applied to the GI4E database. The 2D landmarks employed for training are the corners of the eyes and the pupil centers. In order to validate the proposed algorithm, a cross validation procedure is performed. The strategy employed for the training allows us to affirm that our method can potentially outperform the state of the art algorithms applied to the same dataset in terms of 2D accuracy. The promising results encourage further study of training-based methods for eye tracking.

CBF: circular binary features for robust and real-time pupil center detection

Modern eye tracking systems rely on fast and robust pupil detection, and several algorithms have been proposed for eye tracking under real world conditions. In this work, we propose a novel binary feature selection approach that is trained by computing conditional distributions. These features are scalable and rotatable, allowing for distinct image resolutions, and consist of simple intensity comparisons, making the approach robust to different illumination conditions as well as rapid illumination changes. The proposed method was evaluated on multiple publicly available data sets, considerably outperforming state-of-the-art methods, and being real-time capable for very high frame rates. Moreover, our method is designed to be able to sustain pupil center estimation even when typical edge-detection-based approaches fail - e.g., when the pupil outline is not visible due to occlusions from reflections or eye lids / lashes. As a consequence, it does not attempt to provide an estimate for the pupil outline. Nevertheless, the pupil center suffices for gaze estimation - e.g., by regressing the relationship between pupil center and gaze point during calibration.

A novel approach to single camera, glint-free 3D eye model fitting including corneal refraction

Model-based methods for glint-free gaze estimation typically infer eye pose using pupil contours extracted from eye images. Existing methods, however, either ignore or require complex hardware setups to deal with refraction effects occurring at the corneal interfaces. In this work we provide a detailed analysis of the effects of refraction in glint-free gaze estimation using a single near-eye camera, based on the method presented by [Świrski and Dodgson 2013]. We demonstrate systematic deviations in inferred eyeball positions and gaze directions with respect to synthetic ground-truth data and show that ignoring corneal refraction can result in angular errors of several degrees. Furthermore, we quantify gaze direction dependent errors in pupil radius estimates. We propose a novel approach to account for corneal refraction in 3D eye model fitting and by analyzing synthetic and real images show that our new method successfully captures refraction effects and helps to overcome the shortcomings of the state of the art approach.

Smooth-i: smart re-calibration using smooth pursuit eye movements

Eye gaze for interaction is dependent on calibration. However, gaze calibration can deteriorate over time affecting the usability of the system. We propose to use motion matching of smooth pursuit eye movements and known motion on the display to determine when there is a drift in accuracy and use it as input for re-calibration. To explore this idea we developed Smooth-i, an algorithm that stores calibration points and updates them incrementally when inaccuracies are identified. To validate the accuracy of Smooth-i, we conducted a study with five participants and a remote eye tracker. A baseline calibration profile was used by all participants to test the accuracy of the Smooth-i re-calibration following interaction with moving targets. Results show that Smooth-i is able to manage re-calibration efficiently, updating the calibration profile only when inaccurate data samples are detected.

Comparison of mapping algorithms for implicit calibration using probable fixation targets

With growing access to cheap low-end eye trackers using simple web cameras, there is also a growing demand for easy and fast usage of these devices by untrained and unsupervised end users. For such users the necessity to calibrate the eye tracker prior to its first usage is often perceived as obtrusive and inconvenient. At the same time, perfect accuracy is not necessary for many commercial applications. Therefore, the idea of implicit calibration attracts more and more attention. Algorithms for implicit calibration are able to calibrate the device without any active collaboration with users. In particular, real-time implicit calibration, which is able to calibrate a device on-the-fly while a person uses an eye tracker, seems to be a reasonable solution to the aforementioned problems.

The paper presents examples of implicit calibration algorithms (including their real time versions) based on the idea of probable fixation targets (PFT). The algorithms were tested during a free viewing experiment and compared to the state of the art PFT based algorithm and explicit calibration results.

Revisiting data normalization for appearance-based gaze estimation

Appearance-based gaze estimation is promising for unconstrained real-world settings, but the significant variability in head pose and user-camera distance poses significant challenges for training generic gaze estimators. Data normalization was proposed to cancel out this geometric variability by mapping input images and gaze labels to a normalized space. Although used successfully in prior works, the role and importance of data normalization remains unclear. To fill this gap, we study data normalization for the first time using principled evaluations on both simulated and real data. We propose a modification to the current data normalization formulation by removing the scaling factor and show that our new formulation performs significantly better (between 9.5% and 32.7%) in the different evaluation settings. Using images synthesized from a 3D face model, we demonstrate the benefit of data normalization for the efficiency of the model training. Experiments on real-world images confirm the advantages of data normalization in terms of gaze estimation performance.

SESSION: Digital interactions

Leveraging eye-gaze and time-series features to predict user interests and build a recommendation model for visual analysis

We developed a new concept to improve the efficiency of visual analysis through visual recommendations. It uses a novel eye-gaze based recommendation model that aids users in identifying interesting time-series patterns. Our model combines time-series features and eye-gaze interests, captured via an eye-tracker. Mouse selections are also considered. The system provides an overlay visualization with recommended patterns, and an eye-history graph, that supports the users in the data exploration process. We conducted an experiment with 5 tasks where 30 participants explored sensor data of a wind turbine. This work presents results on pre-attentive features, and discusses the precision/recall of our model in comparison to final selections made by users. Our model helps users to efficiently identify interesting time-series patterns.

Gaze and head pointing for hands-free text entry: applicability to ultra-small virtual keyboards

With the proliferation of small-screen computing devices, there has been a continuous trend in reducing the size of interface elements. In virtual keyboards, this allows for more characters in a layout and additional function widgets. However, vision-based interfaces (VBIs) have only been investigated with large (e.g., full-screen) keyboards. To understand how key size reduction affects the accuracy and speed performance of text entry VBIs, we evaluated gaze-controlled VBI (g-VBI) and head-controlled VBI (h-VBI) with unconventionally small (0.4°, 0.6°, 0.8° and 1°) keys. Novices (N = 26) yielded significantly more accurate and faster text production with h-VBI than with g-VBI, while the performance of experts (N = 12) for both VBIs was nearly equal when a 0.8-1° key size was used. We discuss advantages and limitations of the VBIs for typing with ultra-small keyboards and emphasize relevant factors for designing such systems.

Gaze typing in virtual reality: impact of keyboard design, selection method, and motion

Gaze tracking in virtual reality (VR) allows for hands-free text entry, but it has not yet been explored. We investigate how the keyboard design, selection method, and motion in the field of view may impact typing performance and user experience. We present two studies of people (n = 32) typing with gaze+dwell and gaze+click inputs in VR. In study 1, the typing keyboard was flat and within-view; in study 2, it was larger-than-view but curved. Both studies included a stationary and a dynamic motion condition in the user's field of view.

Our findings suggest that 1) gaze typing in VR is viable but constrained, 2) the users perform best (10.15 WPM) when the entire keyboard is within-view; the larger-than-view keyboard (9.15 WPM) induces physical strain due to increased head movements, 3) motion in the field of view impacts the user's performance: users perform better while stationary than when in motion, and 4) gaze+click is better than dwell only (fixed at 550 ms) interaction.

The eye of the typer: a benchmark and analysis of gaze behavior during typing

We examine the relationship between eye gaze and typing, focusing on the differences between touch and non-touch typists. To enable typing-based research, we created a 51-participant benchmark dataset for user input across multiple tasks, including user input data, screen recordings, webcam video of the participant's face, and eye tracking positions. There are patterns of eye movements that differ between the two types of typists, representing glances at the keyboard, which can be used to identify touch-typed strokes with 92% accuracy. Then, we relate eye gaze with cursor activity, aligning both pointing and typing to eye gaze. One demonstrative application of the work is in extending WebGazer, a real-time web-browser-based webcam eye tracker. We show that incorporating typing behavior as a secondary signal improves eye tracking accuracy by 16% for touch typists, and 8% for non-touch typists.

Towards gaze-based quantification of the security of graphical authentication schemes

In this paper, we introduce a two-step method for estimating the strength of user-created graphical passwords based on the eye-gaze behaviour during password composition. First, the individuals' gaze patterns, represented by the unique fixations on each area of interest (AOI) and the total fixation duration per AOI, are calculated. Second, the gaze-based entropy of the individual is calculated. To investigate whether the proposed metric is a credible predictor of the password strength, we conducted two feasibility studies. Results revealed a strong positive correlation between the strength of the created passwords and the gaze-based entropy. Hence, we argue that the proposed gaze-based metric allows for unobtrusive prediction of the strength of the password a user is going to create and enables intervention to the password composition for helping users create stronger passwords.
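To make the two steps more concrete, here is a minimal sketch that turns per-AOI fixation durations into an entropy value. The abstract does not give the exact formula, so Shannon entropy over the share of total fixation duration per AOI is assumed; the function name `gaze_entropy` and the AOI labels are hypothetical.

```python
import math


def gaze_entropy(durations_by_aoi):
    """Shannon entropy (in bits) of the fixation-duration share per AOI.

    This is an assumed formulation for illustration, not necessarily
    the metric defined in the paper.
    """
    total = sum(durations_by_aoi.values())
    if total <= 0:
        return 0.0
    entropy = 0.0
    for duration in durations_by_aoi.values():
        if duration > 0:
            p = duration / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical total fixation durations (seconds) per AOI.
spread_out = {"sky": 1.0, "tree": 1.0, "car": 1.0, "door": 1.0}
focused = {"sky": 4.0, "tree": 0.0, "car": 0.0, "door": 0.0}
```

Under this assumption, gaze spread evenly over more AOIs yields higher entropy than gaze concentrated on a single AOI.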

Enhanced representation of web pages for usability analysis with eye tracking

Eye tracking as a tool to quantify user attention plays a major role in research and application design. For Web page usability, it has become a prominent measure to assess which sections of a Web page are read, glanced at, or skipped. Such assessments primarily depend on the mapping of gaze data to a Web page representation. However, current representation methods, a virtual screenshot of the Web page or a video recording of the complete interaction session, suffer either from accuracy or scalability issues. We present a method that identifies fixed elements on Web pages and combines user viewport screenshots in relation to fixed elements for an enhanced representation of the page. We conducted an experiment with 10 participants and the results indicate that analysis with our method is more efficient than a video recording, which is an essential criterion for large scale Web studies.

SESSION: Mobile eye tracking

Predicting the gaze depth in head-mounted displays using multiple feature regression

Head-mounted displays (HMDs) with integrated eye trackers have opened up a new realm for gaze-contingent rendering. The accurate estimation of gaze depth is essential when modeling the optical capabilities of the eye. Most recently multifocal displays are gaining importance, requiring focus estimates to control displays or lenses. Deriving the gaze depth solely by sampling the scene's depth at the point-of-regard fails for complex or thin objects as eye tracking is suffering from inaccuracies. Gaze depth measures using the eye's vergence only provide an accurate depth estimate for the first meter. In this work, we combine vergence measures and multiple depth measures into feature sets. This data is used to train a regression model to deliver improved estimates. We present a study showing that using multiple features allows for an accurate estimation of the focused depth (MSE<0.1m) over a wide range (first 6m).

Capturing real-world gaze behaviour: live and unplugged

Understanding human gaze behaviour has benefits ranging from scientific understanding to many application domains. Current practices constrain possible use cases, requiring experimentation restricted to a lab setting or controlled environment. In this paper, we demonstrate a flexible unconstrained end-to-end solution that allows for collection and analysis of gaze data in real-world settings. To achieve these objectives, rich 3D models of the real world are derived along with strategies for associating experimental eye-tracking data with these models. In particular, we demonstrate the strength of photogrammetry in allowing these capabilities to be realized, and demonstrate the first complete solution for 3D gaze analysis in large-scale outdoor environments using standard camera technology without fiducial markers. The paper also presents techniques for quantitative analysis and visualization of 3D gaze data. As a whole, the body of techniques presented provides a foundation for future research, with new opportunities for experimental studies and computational modeling efforts.

Learning to find eye region landmarks for remote gaze estimation in unconstrained settings

Conventional feature-based and model-based gaze estimation methods have proven to perform well in settings with controlled illumination and specialized cameras. In unconstrained real-world settings, however, such methods are surpassed by recent appearance-based methods due to difficulties in modeling factors such as illumination changes and other visual artifacts. We present a novel learning-based method for eye region landmark localization that enables conventional methods to be competitive with the latest appearance-based methods. Despite having been trained exclusively on synthetic data, our method exceeds the state of the art for iris localization and eye shape registration on real-world imagery. We then use the detected landmarks as input to iterative model-fitting and lightweight learning-based gaze estimation methods. Our approach outperforms existing model-fitting and appearance-based methods in the context of person-independent and personalized gaze estimation.

Wearable eye tracker calibration at your fingertips

Common calibration techniques for head-mounted eye trackers rely on markers or an additional person to assist with the procedure. This is a tedious process and may even hinder some practical applications. We propose a novel calibration technique which simplifies the initial calibration step for mobile scenarios. To collect the calibration samples, users only have to point with a finger to various locations in the scene. Our vision-based algorithm detects the users' hand and fingertips which indicate the users' point of interest. This eliminates the need for additional assistance or specialized markers. Our approach achieves comparable accuracy to similar marker-based calibration techniques and is the preferred method by users from our study. The implementation is openly available as a plugin for the open-source Pupil eye tracking platform.

Fixation detection for head-mounted eye tracking based on visual similarity of gaze targets

Fixations are widely analysed in human vision, gaze-based interaction, and experimental psychology research. However, robust fixation detection in mobile settings is profoundly challenging given the prevalence of user and gaze target motion. These movements feign a shift in gaze estimates in the frame of reference defined by the eye tracker's scene camera. To address this challenge, we present a novel fixation detection method for head-mounted eye trackers. Our method exploits that, independent of user or gaze target motion, target appearance remains about the same during a fixation. It extracts image information from small regions around the current gaze position and analyses the appearance similarity of these gaze patches across video frames to detect fixations. We evaluate our method using fine-grained fixation annotations on a five-participant indoor dataset (MPIIEgoFixation) with more than 2,300 fixations in total. Our method outperforms commonly used velocity- and dispersion-based algorithms, which highlights its significant potential to analyse scene image information for eye movement detection.

Error-aware gaze-based interfaces for robust mobile gaze interaction

Gaze estimation error can severely hamper usability and performance of mobile gaze-based interfaces given that the error varies constantly for different interaction positions. In this work, we explore error-aware gaze-based interfaces that estimate and adapt to gaze estimation error on-the-fly. We implement a sample error-aware user interface for gaze-based selection and different error compensation methods: a naïve approach that increases component size directly proportional to the absolute error, a recent model by Feit et al. that is based on the two-dimensional error distribution, and a novel predictive model that shifts gaze by a directional error estimate. We evaluate these models in a 12-participant user study and show that our predictive model significantly outperforms the others in terms of selection rate, particularly for small gaze targets. These results underline both the feasibility and potential of next generation error-aware gaze-based user interfaces.

SESSION: Gaze-based interaction

Circular orbits detection for gaze interaction using 2D correlation and profile matching algorithms

Recently, interaction techniques in which the user selects screen targets by matching their movement with the input device have been gaining popularity, particularly in the context of gaze interaction (e.g. Pursuits, Orbits, AmbiGaze, etc.). However, though many algorithms for enabling such interaction techniques have been proposed, we still lack an understanding of how they compare to each other. In this paper, we introduce two new algorithms for matching eye movements: Profile Matching and 2D Correlation, and present a systematic comparison of these algorithms with two other state-of-the-art algorithms: the Basic Correlation algorithm used in Pursuits and the Rotated Correlation algorithm used in PathSync. We also examine the effects of two thresholding techniques and post-hoc filtering. We evaluated the algorithms on a user dataset and found the 2D Correlation with one-level thresholding and post-hoc filtering to be the best performing algorithm.
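For orientation, the following sketch shows a simple correlation-based matcher in the spirit of the basic correlation approach: it correlates the gaze and target coordinates separately on each axis and accepts the best target above a threshold. It is not the paper's Profile Matching or 2D Correlation algorithm, and the threshold value, function names, and sample trajectories are assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def match_target(gaze, targets, threshold=0.8):
    """Return the name of the target whose motion best matches the gaze.

    gaze is a list of (x, y) samples; targets maps a name to a list of
    (x, y) positions sampled at the same instants. The threshold of 0.8
    is an assumed value.
    """
    gx = [p[0] for p in gaze]
    gy = [p[1] for p in gaze]
    best_name, best_score = None, threshold
    for name, trajectory in targets.items():
        tx = [p[0] for p in trajectory]
        ty = [p[1] for p in trajectory]
        # Both axes must correlate, so take the weaker of the two.
        score = min(pearson(gx, tx), pearson(gy, ty))
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Two hypothetical orbiting targets moving in opposite directions.
angles = [2 * math.pi * t / 30 for t in range(30)]
targets = {
    "clockwise": [(math.cos(a), -math.sin(a)) for a in angles],
    "counterclockwise": [(math.cos(a), math.sin(a)) for a in angles],
}
# Gaze follows the counterclockwise target with a constant offset.
gaze = [(x + 0.1, y - 0.2) for x, y in targets["counterclockwise"]]
selected = match_target(gaze, targets)
```

Because correlation ignores constant offsets, this kind of matching tolerates a uniformly shifted gaze estimate, which is one reason such techniques are attractive for gaze interaction.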

Dwell time reduction technique using Fitts' law for gaze-based target acquisition

We present a dwell time reduction technique for gaze-based target acquisition. We adopt Fitts' Law to achieve the dwell time reduction. Our technique uses both the eye movement time for target acquisition estimated using Fitts' Law (Te) and the actual eye movement time (Ta) for target acquisition; a target is acquired when the difference between Te and Ta is small. First, we investigated the relation between the eye movement for target acquisition and Fitts' Law; the result indicated a correlation of 0.90 after error correction. Then we designed and implemented our technique. Finally, we conducted a user study to investigate the performance of our technique; an average dwell time of 86.7 ms was achieved, with a 10.0% Midas-touch rate.
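The comparison between Te and Ta can be sketched as below. The abstract does not state which formulation of Fitts' Law is used, so the common Shannon form Te = a + b * log2(D / W + 1) is assumed; the coefficients `a` and `b`, the tolerance, and the function names are hypothetical values chosen only for illustration.

```python
import math


def fitts_time(distance, width, a=0.05, b=0.1):
    """Estimated movement time Te in seconds (assumed Shannon form)."""
    return a + b * math.log2(distance / width + 1)


def is_acquired(distance, width, actual_time, tolerance=0.05):
    """Treat the target as acquired when |Te - Ta| is within tolerance."""
    return abs(fitts_time(distance, width) - actual_time) <= tolerance


# A target 300 px away and 100 px wide gives Te = 0.05 + 0.1 * log2(4).
estimated = fitts_time(300, 100)
deliberate = is_acquired(300, 100, actual_time=0.27)
incidental = is_acquired(300, 100, actual_time=0.60)
```

In practice the coefficients would be fitted per user or per device from recorded eye movements rather than fixed by hand as they are here.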

Hidden pursuits: evaluating gaze-selection via pursuits when the stimuli's trajectory is partially hidden

The idea behind gaze interaction using Pursuits is to leverage the human's smooth pursuit eye movements performed when following moving targets. However, humans can also anticipate where a moving target would reappear if it temporarily hides from their view. In this work, we investigate how well users can select targets using Pursuits in cases where the target's trajectory is partially invisible (HiddenPursuits): e.g., can users select a moving target that temporarily hides behind another object? Although HiddenPursuits was not studied in the context of interaction before, understanding how well users can perform HiddenPursuits presents numerous opportunities, particularly for small interfaces where a target's trajectory can cover area outside of the screen. We found that users can still select targets quickly via Pursuits even if their trajectory is up to 50% hidden, and at the expense of longer selection times when the hidden portion is larger. We discuss how gaze-based interfaces can leverage HiddenPursuits for an improved user experience.

Contour-guided gaze gestures: using object contours as visual guidance for triggering interactions

The eyes are an interesting modality for pervasive interactions, though their applicability for mobile scenarios has so far been restricted by several issues. In this paper, we propose the idea of contour-guided gaze gestures, which overcome former constraints, like the need for calibration, by relying on unnatural and relative eye movements, as users trace the contours of objects in order to trigger an interaction. The interaction concept and the system design are described, along with two user studies that demonstrate the method's applicability. It is shown that users were able to trace object contours to trigger actions from various positions on multiple different objects. It is further determined that the proposed method is an easy-to-learn, hands-free interaction technique that is robust against false positive activations. Results highlight low demand values and show that the method holds potential for further exploration, but also reveal areas for refinement.

Improving map reading with gaze-adaptive legends

Complex information visualizations, such as thematic maps, encode information using a particular symbology that often requires the use of a legend to explain its meaning. Traditional legends are placed at the edge of a visualization, which can be difficult to maintain visually while switching attention between content and legend.

Moreover, an extensive search may be required to extract relevant information from the legend. In this paper we propose to consider the user's visual attention to improve interaction with a map legend by adapting both the legend's placement and content to the user's gaze.

In a user study, we compared two novel adaptive legend behaviors to a traditional (non-adaptive) legend. We found that, with both of our approaches, participants spent significantly less task time looking at the legend than with the baseline approach. Furthermore, participants stated that they preferred the gaze-based approach of adapting the legend content (but not its placement).

Rapid alternating saccade training

While individual eye movement characteristics are remarkably stable, experiments on saccadic spatial adaptation indicate that oculomotor learning is possible. To further investigate saccadic learning, participants received veridical feedback on saccade rate while making sequential saccades as quickly as possible between two horizontal targets. Over the course of five days, with just ten minutes of training per day, participants were able to significantly increase the rate of sequential saccades. This occurred through both a reduction in dwell duration and changes in secondary saccade characteristics. There was no concomitant change in participants' accuracy or precision. The learning was retained following the training and generalized to saccades of different directions, and to reaction time measures during a delayed saccade task. The study provides evidence for a novel form of saccadic learning with applicability in a number of domains.

SESSION: Social and natural behaviors

Robust eye contact detection in natural multi-person interactions using gaze and speaking behaviour

Eye contact is one of the most important non-verbal social cues and fundamental to human interactions. However, detecting eye contact without specialised eye tracking equipment poses significant challenges, particularly for multiple people in real-world settings. We present a novel method to robustly detect eye contact in natural three- and four-person interactions using off-the-shelf ambient cameras. Our method exploits that, during conversations, people tend to look at the person who is currently speaking. Harnessing the correlation between people's gaze and speaking behaviour therefore allows our method to automatically acquire training data during deployment and adaptively train eye contact detectors for each target user. We empirically evaluate the performance of our method on a recent dataset of natural group interactions and demonstrate that it achieves a relative improvement over the state-of-the-art method of more than 60%, and also improves over a head pose based baseline.

I see what you see: gaze awareness in mobile video collaboration

An emerging use of mobile video telephony is to enable joint activities and collaboration on physical tasks. We conducted a controlled user study to understand whether seeing the gaze of a remote instructor is beneficial for mobile video collaboration and whether it is valuable for the instructor to be aware that the gaze is being shared. We compared three gaze sharing configurations, (a) Gaze_Visible, where the instructor is aware of the sharing and can view her own gaze point that is being shared, (b) Gaze_Invisible, where the instructor is aware of the shared gaze but cannot view her own gaze point, and (c) Gaze_Unaware, where the instructor is unaware of the gaze sharing, with a baseline of a shared mouse pointer. Our results suggest that naturally occurring gaze may not be as useful as explicitly produced eye movements. Further, instructors prefer using the mouse rather than gaze for remote gesturing, while the workers also find value in the transferred gaze information.

Gaze patterns during remote presentations while listening and speaking

Managing an audience's visual attention to presentation content is critical for effective communication in tele-conferences. This paper explores how audience and presenter coordinate visual and verbal information, and how consistent their gaze behavior is, to understand if their gaze behavior can be used for inferring and communicating attention in remote presentations. In a lab study, participants were asked first to view a short video presentation, and next, to rehearse and present to a remote viewer using the slides from the video presentation. We found that presenters coordinate their speech and gaze at visual regions of the slides in a timely manner (in 72% of all events analyzed), whereas the audience looked at what the presenter talked about in only 53% of all events. Rehearsing aloud and presenting resulted in similar scanpaths. To further explore whether it is possible to infer that what a presenter is looking at is also being talked about, we successfully trained models to detect an attention match between gaze and speech. These findings suggest that using the presenter's gaze has the potential to reliably communicate the presenter's focus on essential parts of the visual presentation material to help the audience better follow the presenter.

A visual comparison of gaze behavior from pedestrians and cyclists

In this paper, we contribute an eye tracking study conducted with pedestrians and cyclists. We apply a visual analytics-based method to inspect pedestrians' and cyclists' gaze behavior as well as video recordings and accelerometer data. This method using multi-modal data allows us to explore patterns and extract common eye movement strategies. Our results are that participants paid most attention to the path itself; advertisements do not distract participants; participants focus more on pedestrians than on cyclists; pedestrians perform more shoulder checks than cyclists do; and we extracted common gaze sequences. Such an experiment in a real-world traffic environment allows us to understand realistic behavior of pedestrians and cyclists better.

Scene perception while listening to music: an eye-tracking study

Previous studies have observed longer fixations and fewer saccades while viewing various outdoor scenes and listening to music compared to a no-music condition. There is also evidence that musical tempo can modulate the speed of eye movements. However, recent investigations from environmental psychology demonstrated differences in eye movement behavior while viewing natural and urban outdoor scenes. The first goal of this study was to replicate the observed effect of music listening while viewing outdoor scenes with different musical stimuli. Next, the effect of a fast and a slow musical tempo on eye movement speed was investigated. Finally, the effect of the type of outdoor scene (natural vs. urban scenes) was explored. The results revealed shorter fixation durations in the no-music condition compared to both music conditions, but these differences were non-significant. Moreover, we did not find differences in eye movements between music conditions with fast and slow tempo. Although significantly shorter fixations were found for viewing urban scenes compared with natural scenes, we did not find a significant interaction between the type of scene and music conditions.

Enabling unsupervised eye tracker calibration by school children through games

To use eye trackers in a school classroom, children need to be able to calibrate their own tracker unsupervised and on repeated occasions. A game designed specifically around the need to maintain their gaze in fixed locations was used to collect calibration and verification data. The data quality obtained was compared with a standard calibration procedure and another game, in two studies carried out in three elementary schools. One studied the effect on data quality over repeated occasions and the other studied the effect of age on data quality. The first showed that accuracy obtained from unsupervised calibration by children was twice as good after six occasions with the game requiring the fixed gaze location compared with the standard calibration, and as good as standard calibration by a group of supervised adults. In the second study, age was found to have no effect on performance in the groups of children studied.

SESSION: Clinical and emotional

Systematic shifts of fixation disparity accompanying brightness changes

Video-based gaze tracking is prone to brightness changes due to their effects on pupil size. Monocular observations indeed confirm variable fixation locations depending on brightness. In close viewing, pupil size is coupled with accommodation and vergence, the so-called near triad. Hence, systematic changes in fixation disparity might be expected to co-occur with varying pupil size. In the current experiment, fixation disparity was assessed. Calibration was conducted either on a dark or on a bright background, and text had to be read on both backgrounds, on a self-illuminating screen and on paper. When the calibration background matched the background during reading, mean fixation disparity did not differ from zero. In the non-calibrated conditions, however, a brighter stimulus went along with a dominance of crossed fixations and vice versa. The data demonstrate that systematic changes in fixation disparity occur as an effect of brightness changes, which calls for careful setting of calibration parameters.

Towards using the spatio-temporal properties of eye movements to classify visual field defects

Perimetry, the assessment of visual field defects (VFD), requires patients to be able to maintain a prolonged stable fixation, as well as to provide feedback through motor response. These aspects limit the testable population and often lead to inaccurate results. We hypothesized that different VFD would alter the eye movements in systematic ways, thus making it possible to infer the presence of VFD by quantifying the spatio-temporal properties of eye movements. We developed a tracking test to record participants' eye movements while we simulated different gaze-contingent VFD. We tested 50 visually healthy participants and simulated three common scotomas: peripheral loss, central loss and hemifield loss. We quantified spatio-temporal features using cross-correlogram analysis, then applied cross-validation to train a decision tree algorithm to classify the conditions. Our test is faster and more comfortable than standard perimetry and can achieve a classifying accuracy of ∼90% (True Positive Rate = ∼98%) with data acquired in less than 2 minutes.

Scanpath comparison in medical image reading skills of dental students: distinguishing stages of expertise development

A popular topic in eye tracking is the difference between novices and experts and their domain-specific eye movement behaviors. However, very little research has addressed how expertise develops, and more specifically, the developmental stages of eye movement behaviors. Our work compares the scanpaths of five semesters of dental students viewing orthopantomograms (OPTs) with classifiers to distinguish sixth semester through tenth semester students. We used the analysis algorithm SubsMatch 2.0 and the Needleman-Wunsch algorithm. Overall, both classifiers were able to distinguish the stages of expertise in medical image reading above chance level. Specifically, they were able to accurately determine sixth semester students with no prior training as well as sixth semester students after training. Ultimately, using scanpath models to recognize gaze patterns characteristic of learning stages, we can provide more adaptive, gaze-based training for students.
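As an illustration of the string alignment mentioned above, the following is a minimal sketch of Needleman-Wunsch global alignment scoring for two scanpaths encoded as strings of area-of-interest (AOI) labels. The function name, the one-letter AOI encoding, and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions chosen for illustration; the abstract does not state the parameters the authors used.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of two AOI strings (higher means more similar).

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix costs i gaps
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


# Hypothetical scanpaths over AOIs labelled A-D.
print(needleman_wunsch_score("ABCD", "ABD"))
```

A classifier could then use such pairwise scores as similarity features, although how the scores feed into classification here is not described in the abstract.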

Development of diagnostic performance & visual processing in different types of radiological expertise

The aim of this research was to compare visual patterns while examining radiographs in groups of people with different levels and different types of expertise. Introducing the latter comparative base is the original contribution of these studies. The residents and specialists were trained in medical diagnosing of X-Rays, and for these two groups it was possible to compare visual patterns between observers with different levels of the same expertise type. On the other hand, the radiographers who took part in the examination, due to the specifics of their daily work, had experience in reading and evaluating X-Ray quality and were not trained in diagnosing. Involving this group created a new opportunity in our research to explore eye movements obtained when examining X-Rays for both medical diagnosing and quality assessment purposes, which may be treated as different types of expertise.

We found that, despite their low diagnosing performance, the radiographers' eye movement characteristics were more similar to those of the specialists than were the eye movement characteristics of the residents. It may be inferred that people with different types of expertise, after gaining a certain level of experience (or practice), may develop similar visual patterns, which is the original conclusion of the research.

Implementing innovative gaze analytic methods in clinical psychology: a study on eye movements in antisocial violent offenders

A variety of psychological disorders like antisocial personality disorder have been linked to impairments in facial emotion recognition. Exploring eye movements during categorization of emotional faces is a promising approach with the potential to reveal possible differences in cognitive processes underlying these deficits. Based on this premise we investigated whether antisocial violent offenders exhibit different scan patterns compared to a matched healthy control group while categorizing emotional faces. Group differences were analyzed in terms of attention to the eyes, extent of exploration behavior and structure of switching patterns between Areas of Interest. While we were not able to show clear group differences, the present study is one of the first that demonstrates the feasibility and utility of incorporating recently developed eye movement metrics such as gaze transition entropy into clinical psychology.
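To make the gaze transition entropy metric mentioned above more concrete, here is a minimal sketch that computes a transition entropy from a sequence of AOI labels. It assumes a first-order model in which each source AOI is weighted by its empirical frequency as a transition source; the function name and this weighting are illustrative assumptions, and the exact formulation used in the study may differ.

```python
import math
from collections import Counter


def transition_entropy(aoi_sequence):
    """Entropy (in bits) of AOI-to-AOI transitions.

    Assumption: source AOIs are weighted by how often they start a transition,
    rather than by a separately estimated stationary distribution.
    """
    pairs = list(zip(aoi_sequence, aoi_sequence[1:]))
    if not pairs:
        return 0.0
    source_counts = Counter(src for src, _ in pairs)
    pair_counts = Counter(pairs)
    total = len(pairs)
    entropy = 0.0
    for (src, _dst), n in pair_counts.items():
        p_src = source_counts[src] / total   # empirical weight of the source AOI
        p_cond = n / source_counts[src]      # P(destination | source)
        entropy -= p_src * p_cond * math.log2(p_cond)
    return entropy


# Hypothetical AOI sequences: strictly alternating gaze is fully predictable.
print(transition_entropy("ABABAB"))
print(transition_entropy("AABA"))
```

Lower values indicate more predictable switching between AOIs; higher values indicate more evenly spread transitions.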

Ocular reactions in response to impressions of emotion-evoking pictures

Oculomotor indices in response to emotional stimuli were analysed chronologically in order to investigate the relationships between eye behaviour and emotional activity in human visual perception. Seven participants classified visual stimuli into two emotional groups using subjective ratings of images, such as "Pleasant" and "Unpleasant". Changes in both eye movements and pupil diameters between the two groups of images were compared. Both the mean saccade lengths and the cross power spectra of eye movements for "Unpleasant" ratings were significantly higher than for other ratings with regard to certain durations of the pictures shown. Also, both mean pupil diameters and their power spectrum densities were significantly higher when the durations of pictures presented were lengthened. When comparing the response latencies, pupil reactions followed the appearance of changes in the direction of eye movements. The results suggest that at specific latencies, "Unpleasant" images activate both eye movements and pupil dilations.

Dynamics of emotional facial expression recognition in individuals with social anxiety

This paper demonstrates the utility of ambient-focal attention and pupil dilation dynamics to describe visual processing of emotional facial expressions. Pupil dilation and focal eye movements reflect deeper cognitive processing and thus shed more light on the dynamics of emotional expression recognition. Socially anxious individuals (N = 24) and non-anxious controls (N = 24) were asked to recognize emotional facial expressions that gradually morphed from a neutral expression to one of happiness, sadness, or anger in 10-sec animations. Anxious cohorts exhibited more ambient face scanning than their non-anxious counterparts. We observed a positive relationship between focal fixations and pupil dilation, indicating deeper processing of viewed faces, but only by non-anxious participants, and only during the last phase of emotion recognition. Group differences in the dynamics of ambient-focal attention support the hypothesis of vigilance to emotional expression processing by socially anxious individuals. We discuss the results by referring to current literature on cognitive psychopathology.

SESSION: Notes (short papers)

An eye gaze model for seismic interpretation support

Designing systems to offer support to experts during cognitively intensive tasks at the right time is still a challenging endeavor, despite years of research progress in the area. This paper proposes a gaze model based on eye tracking empirical data to identify when a system should proactively interact with the expert during visual inspection tasks. The gaze model derives from the analyses of a user study where 11 seismic interpreters were asked to perform the visual inspection task of seismic images from known and unknown basins. The eye tracking fixation patterns were triangulated with pupil dilations and thinking-aloud data. Results show that cumulative saccadic distances allow identifying when additional information could be offered to support seismic interpreters, changing the visual search behavior from exploratory to goal-directed.

Anyorbit: orbital navigation in virtual environments with eye-tracking

Gaze-based interactions promise to be fast, intuitive and effective in controlling virtual and augmented environments. Yet, there is still a lack of usable 3D navigation and observation techniques. In this work: 1) We introduce a highly advantageous orbital navigation technique, AnyOrbit, providing an intuitive and hands-free method of observation in virtual environments that uses eye-tracking to control the orbital center of movement; 2) The versatility of the technique is demonstrated with several control schemes and use-cases in virtual/augmented reality head-mounted-display and desktop setups, including observation of 3D astronomical data and spectator sports.

Autopager: exploiting change blindness for gaze-assisted reading

A novel gaze-assisted reading technique uses the fact that in linear reading, the looking behavior of the reader is readily predicted. We introduce the AutoPager "page turning" technique, where the next bit of unread text is rendered in the periphery, ready to be read. This approach enables continuous gaze-assisted reading without requiring manual input to scroll: the reader merely saccades to the top of the page to read on. We demonstrate that when the new text is introduced with a gradual cross-fade effect, users are often unaware of the change: the user's impression is of reading the same page over and over again, yet the content changes. We present a user evaluation that compares AutoPager to previous gaze-assisted scrolling techniques. AutoPager may offer some advantages over previous gaze-assisted reading techniques, and is a rare example of exploiting "change blindness" in user interfaces.

Binocular model-based gaze estimation with a camera and a single infrared light source

We propose a binocular model-based method that only uses a single camera and an infrared light source. Most gaze estimation approaches are based on single-eye models, and binocular estimates are usually obtained by averaging the results from each eye. In this work, we propose a geometric model of both eyes for gaze estimation. The proposed model is implemented and evaluated in a simulated environment and is compared to a binocular model-based method and a polynomial regression-based method with one camera and two infrared lights that average the results from both eyes. The method performs on par with methods using multiple light sources while maintaining robustness to head movements. The study shows that when using both eyes in gaze estimation models it is possible to reduce the hardware requirements while maintaining robustness.

BORE: boosted-oriented edge optimization for robust, real time remote pupil center detection

Undoubtedly, eye movements contain an immense amount of information, especially when looking at fast eye movements, namely time to the fixation, saccade, and micro-saccade events. While modern cameras support recording of a few thousand frames per second, to date, the majority of studies use eye trackers with frame rates of about 120 Hz for head-mounted and 250 Hz for remote-based trackers. In this study, we aim to overcome the challenge of making pupil tracking algorithms perform in real time with high speed cameras for remote eye tracking applications. We propose an iterative pupil center detection algorithm formulated as an optimization problem. We evaluated our algorithm on more than 13,000 eye images, in which it outperforms earlier solutions both with regard to runtime and detection accuracy. Moreover, our system is capable of boosting its runtime in an unsupervised manner, thus we remove the need for manual annotation of pupil images.

Deepcomics: saliency estimation for comics

A key requirement for training deep learning saliency models is large training eye tracking datasets. Although the accessibility of eye tracking technology has greatly increased, collecting eye tracking data on a large scale is cumbersome for very specific content types such as comic images, which differ from natural images such as photographs because text and pictorial content are integrated. In this paper, we show that a deep network trained on visual categories where the gaze deployment is similar to comics outperforms existing models and models trained with visual categories for which the gaze deployment is dramatically different from comics. Further, we find that it is better to use a computationally generated dataset of a visual category close to comics than real eye tracking data of a visual category that has different gaze deployment. These findings hold implications for the transference of deep networks to different domains.

Development and evaluation of a gaze feedback system integrated into eyetrace

A growing field of studies in eye-tracking is the use of gaze data for realtime feedback to the subject. In this work, we present a software system for such experiments and validate it with a visual search task experiment. This system was integrated into an eye tracking analysis tool. Our aim was to improve subject performance in this task by employing saliency features for gaze guidance. This realtime feedback system can be applicable within many realms, such as learning interventions, computer entertainment, or virtual reality.

Evaluating similarity measures for gaze patterns in the context of representational competence in physics education

The competent handling of representations is required for understanding physics concepts, developing problem-solving skills, and achieving scientific expertise. Using eye-tracking methodology, we present the contributions of this paper as follows: We first investigated the preferences of students with different levels of knowledge (experts, intermediates, and novices) in representational competence in the domain of physics problem-solving. The results reveal that experts are more likely to prefer vector representations over other representations. Besides, a similar tendency of table representation usage was observed in all groups. Also, the diagram representation was used less than the others. Secondly, we evaluated three similarity measures: Levenshtein distance, transition entropy, and Jensen-Shannon divergence. A Recursive Feature Elimination technique suggests that Jensen-Shannon divergence is the best discriminating feature among the three. However, investigation of the mutual dependency of the features implies that transition entropy links the two other features: it has mutual information with Levenshtein distance (Maximal Information Coefficient = 0.44) and is correlated with Jensen-Shannon divergence (r(18313) = 0.70, p < .001).
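As a minimal sketch of one of the similarity measures named above, the following computes the Jensen-Shannon divergence (base 2) between two discrete distributions. The abstract does not specify which distributions were compared, so the example assumes, purely for illustration, each participant's proportion of gaze on three representation types (vector, table, diagram); the numbers are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric, bounded (0 to 1 in base 2) divergence between p and q."""
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical gaze proportions over (vector, table, diagram).
expert = [0.6, 0.3, 0.1]
novice = [0.3, 0.3, 0.4]
print(jensen_shannon_divergence(expert, novice))
```

Unlike the raw Kullback-Leibler divergence, this measure is symmetric and stays finite when one participant never looks at a representation.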

EyeMSA: exploring eye movement data with pairwise and multiple sequence alignment

Eye movement data can be regarded as a set of scan paths, each corresponding to one of the visual scanning strategies of a certain study participant. Finding common subsequences in those scan paths is a challenging task since they are typically not equally temporally long, do not consist of the same number of fixations, or do not lead along similar stimulus regions. In this paper we describe a technique based on pairwise and multiple sequence alignment to help a data analyst see the most important patterns in the data. To reach this goal the scan paths are first transformed into a sequence of characters based on metrics as well as spatial and temporal aggregations. The result of the algorithmic data transformation is used as input for an interactive consensus matrix visualization. We illustrate the usefulness of the concepts by applying them to formerly recorded eye movement data investigating route finding tasks in public transport maps.

Fixation-indices based correlation between text and image visual features of webpages

Web elements are associated with a set of visual features based on their data modality. For example, text is associated with font-size and font-family, whereas images are associated with intensity and color. The unavailability of methods to relate these heterogeneous visual features limits attention-based analyses of webpages. In this paper, we propose a novel approach to establish the correlation between text and image visual features that influence users' attention. We pair the visual features of text and images based on their associated fixation-indices obtained from eye-tracking. From paired data, a common subspace is learned using Canonical Correlation Analysis (CCA) to maximize the correlation between them. The performance of the proposed approach is analyzed through a controlled eye-tracking experiment conducted on 51 real-world webpages. A very high correlation of 99.48% is achieved between text and images, with text-related font families and image-related color features influencing the correlation.

Gazecode: open-source software for manual mapping of mobile eye-tracking data

Purpose: Eye movements recorded with mobile eye trackers generally have to be mapped to the visual stimulus manually. Manufacturer software usually has sub-optimal user interfaces. Here, we compare our in-house developed open-source alternative, called GazeCode, to the manufacturer software. Method: 330 seconds of eye movements were recorded with the Tobii Pro Glasses 2. Eight coders subsequently categorized fixations using both Tobii Pro Lab and GazeCode. Results: Average manual mapping speed was more than twice as fast when using GazeCode (0.649 events/s) compared with Tobii Pro Lab (0.292 events/s). Inter-rater reliability (Cohen's Kappa) was similar and satisfactory: 0.886 for Tobii Pro Lab and 0.871 for GazeCode. Conclusion: GazeCode is a faster alternative to Tobii Pro Lab for mapping eye movements to the visual stimulus. Moreover, it accepts eye-tracking data from manufacturers SMI, Positive Science, Tobii, and Pupil Labs.
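For readers unfamiliar with the reliability measure reported above, the following is a minimal sketch of Cohen's Kappa for two coders who each assign one category per fixation. The function name and the example labels are hypothetical, and the abstract does not say how the values for eight coders were aggregated; this sketch covers only a single pair of coders.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two coders' category labels.

    Undefined (division by zero) when both coders use one identical category
    for every item, because expected agreement is then 1.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if both coders labelled independently at their own rates.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical fixation categories assigned by two coders.
coder_1 = ["face", "hands", "face", "other", "hands", "face"]
coder_2 = ["face", "hands", "face", "hands", "hands", "face"]
print(cohens_kappa(coder_1, coder_2))
```

The speed comparison in the abstract is a simple ratio: 0.649 / 0.292 is roughly 2.2, consistent with the "more than twice as fast" statement.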

Image-based scanpath comparison with slit-scan visualization

The comparison of scanpaths between multiple participants is an important analysis task in eye tracking research. Established methods typically inspect recorded gaze sequences based on geometrical trajectory properties or strings derived from annotated areas of interest (AOIs). We propose a new approach based on image similarities of gaze-guided slit-scans: For each time step, a vertical slice is extracted from the stimulus at the gaze position. Placing the slices next to each other over time creates a compact representation of a scanpath in the context of the stimulus. These visual representations can be compared based on their image similarity, providing a new measure for scanpath comparison without the need for annotation. We demonstrate how comparative slit-scan visualization can be integrated into a visual analytics approach to support the interpretation of scanpath similarities in general.
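The slit-scan construction described above can be sketched as follows. This minimal example assumes the stimulus is given as a list of frames, each a list of pixel rows, with one horizontal gaze coordinate per frame; the function name, the `slit_width` parameter, and the clamping at the image border are illustrative assumptions rather than details from the paper. The image similarity measure itself is not specified in the abstract and is not shown.

```python
def slit_scan(frames, gaze_x, slit_width=1):
    """Build a slit-scan image: one vertical slice per frame at the gaze x-position.

    frames: list of frames, each a list of rows (lists of pixel values).
    gaze_x: horizontal gaze position (in pixels) for each frame.
    Returns a list of rows whose width grows by slit_width per frame.
    """
    height = len(frames[0])
    rows = [[] for _ in range(height)]
    for frame, x in zip(frames, gaze_x):
        width = len(frame[0])
        # Clamp the slice so it stays inside the stimulus.
        left = max(0, min(int(x) - slit_width // 2, width - slit_width))
        for y in range(height):
            rows[y].extend(frame[y][left:left + slit_width])
    return rows


# Two tiny hypothetical 2x3 frames with gaze at x=0 and x=2.
frames = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
print(slit_scan(frames, gaze_x=[0, 2]))
```

Because every participant's slit-scan is built from the same stimulus, two such images can be compared directly without annotating AOIs.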

Implicit user calibration for gaze-tracking systems using an averaged saliency map around the optical axis of the eye

A 3D gaze-tracking method that uses two cameras and two light sources can measure the optical axis of the eye without user calibration. The visual axis of the eye (line of sight) is estimated by conducting a single-point user calibration. This single-point user calibration estimates the angle κ (kappa), the offset between the optical and visual axes of the eye, which is a user-dependent parameter. We have proposed an implicit user calibration method for gaze-tracking systems using a saliency map around the optical axis of the eye. We assume that the peak of the average of the saliency maps indicates the visual axis of the eye in the eye coordinate system. We used both-eye restrictions effectively. The experimental result shows that the proposed system could estimate angle κ without explicit personal calibration.

Microsaccade detection using pupil and corneal reflection signals

In contemporary research, microsaccade detection is typically performed using the calibrated gaze-velocity signal acquired from a video-based eye tracker. To generate this signal, the pupil and corneal reflection (CR) signals are subtracted from each other and a differentiation filter is applied, both of which may prevent small microsaccades from being detected due to signal distortion and noise amplification. We propose a new algorithm where microsaccades are detected directly from uncalibrated pupil and CR signals. It is based on detrending followed by windowed correlation between pupil and CR signals. The proposed algorithm outperforms the most commonly used algorithm in the field (Engbert & Kliegl, 2003), in particular for small amplitude microsaccades that are difficult to see in the velocity signal even with the naked eye. We argue that it is advantageous to consider the most basic output of the eye tracker, i.e. pupil and CR signals, when detecting small microsaccades.

Predicting observer's task from eye movement patterns during motion image analysis

Predicting an observer's task from eye movements during several viewing tasks has been investigated by several authors. This contribution adds task prediction from eye movements for tasks occurring during motion image analysis: Explore, Observe, Search, and Track. For this purpose, gaze data was recorded from 30 human observers viewing a motion image sequence once under each task. For task decoding, the classification methods Random Forest, LDA, and QDA were used; features were fixation- or saccade-related measures. Best accuracy for prediction of the three tasks Observe, Search, Track from the 4-minute gaze data samples was 83.7% (chance level 33%) using Random Forest. Best accuracy for prediction of all four tasks from the gaze data samples containing the first 30 seconds of viewing was 59.3% (chance level 25%) using LDA. Accuracy decreased significantly for task prediction on small gaze data chunks of 5 and 3 seconds, being 45.3% and 38.0% (chance 25%) for the four tasks, and 52.3% and 47.7% (chance 33%) for the three tasks.

Pupil responses signal less inhibition for own relative to other names

Previous research suggests that self-relevant stimuli, such as one's own name, attract more attention than stimuli that are not self-relevant. In two experiments, we examined to what extent one's own name is also less prone to inhibition than other names using a Go/NoGo approach. The pupil diameter was employed as a psychophysiological indicator of attention. A total of 36 subjects performed various categorization tasks, with their own name and other names. Whereas in Go-trials, pupil dilation for own and other names did not differ, in NoGo-trials, significantly larger pupil dilations were obtained for subjects' own names compared to other names. This difference was especially pronounced at larger intervals after stimulus onset, suggesting that inhibitory processing was less effective with one's own name.

Pupil size as an indicator of visual-motor workload and expertise in microsurgical training tasks

Pupillary responses have long been linked to cognitive workload in numerous tasks. In this work, we investigate the role of pupil dilations in the context of microsurgical training, handling of microinstruments and the suturing act in particular. With an eye-tracker embedded on the surgical microscope oculars, eleven medical participants repeated 12 sutures of artificial skin under high magnification. A detailed analysis of pupillary dilations in suture segments revealed that pupillary responses indeed varied not only according to the main suture segments but also in relation to participants' expertise.

PuReST: robust pupil tracking for real-time pervasive eye tracking

Pervasive eye-tracking applications such as gaze-based human computer interaction and advanced driver assistance require real-time, accurate, and robust pupil detection. However, automated pupil detection has proved to be an intricate task in real-world scenarios due to a large mixture of challenges, for instance, quickly changing illumination and occlusions. In this work, we introduce the Pupil Reconstructor with Subsequent Tracking (PuReST), a novel method for fast and robust pupil tracking. The proposed method was evaluated on over 266,000 realistic and challenging images acquired with three distinct head-mounted eye tracking devices, increasing pupil detection rate by 5.44 and 29.92 percentage points while reducing average run time by factors of 2.74 and 1.1 w.r.t. state-of-the-art 1) pupil detectors and 2) vendor-provided pupil trackers, respectively. Overall, PuReST outperformed other methods in 81.82% of use cases.

Relating eye-tracking measures with changes in knowledge on search tasks

We conducted an eye-tracking study where 30 participants performed searches on the web. We measured their topical knowledge before and after each task. Their eye-fixations were labelled as "reading" or "scanning". The series of reading fixations in a line, called "reading-sequences" were characterized by their length in pixels, fixation duration, and the number of fixations making up the sequence. We hypothesize that differences in knowledge-change of participants are reflected in their eye-tracking measures related to reading. Our results show that the participants with higher change in knowledge differ significantly in terms of their total reading-sequence-length, reading-sequence-duration, and number of reading fixations, when compared to participants with lower knowledge-change.

Robustness of metrics used for scanpath comparison

In every quantitative eye tracking research study, researchers need to compare eye movements between subjects or conditions. For both static and dynamic tasks, there is a variety of metrics that could serve this purpose. It is important to explore the robustness of the metrics with respect to artificial noise. For dynamic tasks, where eye movement data is represented as scanpaths, there are currently no studies regarding the robustness of the metrics.

In this study, we explored properties of five metrics (Levenshtein distance, correlation distance, Fréchet distance, mean and median distance) used for comparison of scanpaths. We systematically added noise by applying three transformations to the scanpaths: translation, rotation, and scaling. For each metric, we computed baseline similarity for two random scanpaths and explored the metrics' sensitivity. Our results allow other researchers to convert results between studies.
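As an illustration of this kind of robustness check, the following minimal sketch computes the Levenshtein distance between two scanpaths and applies the three transformations named above. It is not the study's implementation: the grid encoding, the cell size of 100 pixels, and all function names are assumptions made for this example.

```python
import math


def levenshtein(a, b):
    """Edit distance between two sequences, e.g. scanpaths encoded as labels."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def to_grid_labels(points, cell=100):
    """Encode (x, y) fixations as grid-cell labels (cell size is hypothetical)."""
    return [(int(x // cell), int(y // cell)) for x, y in points]


def translate(points, dx, dy):
    """Shift every fixation by (dx, dy)."""
    return [(x + dx, y + dy) for x, y in points]


def rotate(points, angle_deg, center=(0.0, 0.0)):
    """Rotate every fixation counter-clockwise around center."""
    t = math.radians(angle_deg)
    cx, cy = center
    return [(cx + (x - cx) * math.cos(t) - (y - cy) * math.sin(t),
             cy + (x - cx) * math.sin(t) + (y - cy) * math.cos(t))
            for x, y in points]


def scale(points, factor, center=(0.0, 0.0)):
    """Scale every fixation relative to center."""
    cx, cy = center
    return [(cx + (x - cx) * factor, cy + (y - cy) * factor) for x, y in points]


# Compare a scanpath with a translated copy of itself.
path = [(50, 50), (150, 50)]
noisy = translate(path, 100, 0)
distance = levenshtein(to_grid_labels(path), to_grid_labels(noisy))
```

Sweeping the translation offset, rotation angle, or scale factor and recording the resulting distance gives a simple sensitivity curve for the metric under that transformation.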

Sensitivity to natural 3D image transformations during eye movements

The saccadic suppression effect, in which visual sensitivity is reduced significantly during saccades, has been suggested as a mechanism for masking graphic updates in a 3D virtual environment. In this study, we investigate whether the degree of saccadic suppression depends on the type of image change, particularly between different natural 3D scene transformations. The user observed 3D scenes and made a horizontal saccade in response to the displacement of a target object in the scene. During this saccade the entire scene translated or rotated. We studied six directions of transformation corresponding to the canonical directions for the six degrees of freedom. Following each trial, the user made a forced-choice indication of direction of the scene change. Results show that during horizontal saccades, the most recognizable changes were rotations along the roll axis.

SLAM-based localization of 3D gaze using a mobile eye tracker

Past work in eye tracking has focused on estimating gaze targets in two dimensions (2D), e.g. on a computer screen or scene camera image. Three-dimensional (3D) gaze estimates would be extremely useful when humans are mobile and interacting with the real 3D environment. We describe a system for estimating the 3D locations of gaze using a mobile eye tracker. The system integrates estimates of the user's gaze vector from a mobile eye tracker, estimates of the eye tracker pose from a visual-inertial simultaneous localization and mapping (SLAM) algorithm, and a 3D point cloud map of the environment from an RGB-D sensor. Experimental results indicate that our system produces accurate estimates of 3D gaze over a much larger range than remote eye trackers. Our system will enable applications such as the analysis of 3D human attention and more anticipative human robot interfaces.

Suitability of calibration polynomials for eye-tracking data with simulated fixation inaccuracies

Current video-based eye trackers are not suited for calibration of patients who cannot produce stable and accurate fixations. Reliable calibration is crucial in order to make repeatable recordings, which in turn are important to accurately measure the effects of a medical intervention. To test the suitability of different calibration polynomials for such patients, inaccurate calibration data were simulated using a geometric model of the EyeLink 1000 Plus desktop mode setup. This model is used to map eye position features to screen coordinates, creating screen data with known eye tracker data. This allows for objective evaluation of gaze estimation performance over the entire computer screen. Results show that the choice of calibration polynomial is crucial in order to ensure a high repeatability across measurements from patients who are hard to calibrate. Higher order calibration polynomials resulted in poor gaze estimation even for small simulated fixation inaccuracies.

Training facilitates cognitive control on pupil dilation

Physiological responses are generally involuntary; however, real-time feedback enables voluntary control of automatic processes, at least to a certain extent. Recently, it was demonstrated that even pupil dilation is subject to controlled interference. To address effects of training on the ability to exercise control on pupil dilation, the current study examines repeated exercise over seven successive days. Participants utilize self-induced changes in arousal to increase pupil diameter, and real-time feedback is applied to evaluate and improve individual performance. We observe inter-individual differences with regard to responsiveness of the pupillary response: six of eight participants considerably increase pupil diameter already during the first session, two exhibit only slight changes, and all show rather stable performance throughout training. There is a trend towards stronger peak amplitudes that tend to occur increasingly early across time. Hence, higher cognitive control on pupil dilations can be practiced by most users and may therefore provide an appropriate input mechanism in human-computer interaction.

Useful approaches to exploratory analysis of gaze data: enhanced heatmaps, cluster maps, and transition maps

Exploratory analysis of gaze data requires methods that make it possible to process large amounts of data while minimizing human labor. The conventional approach in exploring gaze data is to construct heatmap visualizations. While simple and intuitive, conventional heatmaps do not clearly indicate differences between groups of viewers or give estimates for the repeatability (i.e., which parts of the heatmap would look similar if the data were collected again). We discuss difference maps and significance maps that answer to these needs. In addition we describe methods based on automatic clustering that allow us to achieve similar results with cluster observation maps and transition maps. As demonstrated with our example data, these methods are effective in highlighting the strongest differences between groups more effectively than conventional heatmaps.

SESSION: ETRA doctoral symposium abstracts

A text entry interface using smooth pursuit movements and language model

With the development of eye tracking technology, gaze-interaction applications demonstrate great potential. Smooth pursuit based gaze typing is an intuitive text entry system with low learning effort. In this study, we provide a language-prediction function for a smooth-pursuit based gaze-typing system. Since state-of-the-art neural network models have been successfully applied in language modeling, this study uses a pretrained model based on convolutional neural networks (CNNs) and develops a prediction function that can predict both the next possible letters and the next word. The results of a pilot experiment show that the next possible letters or word can be well predicted and selected. The mean typing speed reached 4.5 words per minute. The participants consider the word prediction helpful for reducing the visual search time.

Asynchronous gaze sharing: towards a dynamic help system to support learners during program comprehension

To participate in a society of a rapidly changing world, learning fundamentals of programming is important. However, learning to program is challenging for many novices and reading source code is one major obstacle in this challenge. The primary research objective of my dissertation is developing a help system based on historical and interactive eye tracking data to help novices master program comprehension. Helping novices requires detecting problematic situations while solving programming tasks using a classifier to split novices into successful/unsuccessful participants based on the answers given to program comprehension tasks. One set of features of this classifier is the story reading and execution reading order. The first step in my dissertation is creating a classifier for the reading order problem. The current status of this step is analyzing eye tracking datasets of novices and experts.

Audio-visual interaction in emotion perception for communication: doctoral symposium, extended abstract

Information from multiple modalities contributes to recognizing emotions. While it is known interactions occur between modalities, it is unclear what characterizes these. These interactions, and changes in these interactions due to sensory impairments, are the main subject of this PhD project.

This extended abstract for the Doctoral Symposium of ETRA 2018 describes the project; its background, what I hope to achieve, and some preliminary results.

Automatic detection and inhibition of neutral and emotional stimuli in post-traumatic stress disorder: an eye-tracking study: eye-tracking data of an original antisaccade task

This research project addresses the understanding of attentional biases in post-traumatic stress disorder (PTSD). This psychiatric condition is mainly characterized by symptoms of intrusion (flashbacks), avoidance, alteration of arousal and reactivity (hypervigilance), and negative mood and cognitions persisting one month after exposure to a traumatic event [American Psychiatric Association 2013]. Clinical observations as well as empirical research have highlighted the symptom of hypervigilance as being central in the PTSD symptomatology, considering that other clinical features could be maintained by it [Ehlers and Clark 2000]. Attentional control theory has described hypervigilance in anxiety disorders as the co-occurrence of two cognitive processes: an enhanced detection of threatening information followed by difficulties in inhibiting its processing [Eysenck et al. 2007]. Nevertheless, attentional control theory has never been applied to PTSD. This project aims at providing cognitive evidence of hypervigilance symptoms in PTSD using eye-tracking during the realization of reliable Miyake tasks [Eysenck and Derakshan 2011]. Therefore, our first aim is to model the co-occurring processes of hypervigilance using eye-tracking technology. Indeed, behavioral measures (such as reaction time) do not allow a clear representation of cognitive processes occurring subconsciously in a few milliseconds [Felmingham 2016]. Therefore, eye-tracking technology is essential in our studies. Secondly, we aim to analyze the differential impact of trauma-related vs. negative stimuli on PTSD patients by analyzing scanpaths following the presentation of both types of stimuli. This research project is divided into four studies. The first one will be described in this doctoral symposium.

Eye-tracking measures in audiovisual stimuli in infants at high genetic risk for ASD: challenging issues

Individuals with autism spectrum disorder (ASD) have shown difficulties in integrating auditory and visual sensory modalities. Here we aim to explore whether very young infants at genetic risk of ASD show atypicalities in this ability early in development. We registered visual attention of 4-month-old infants in a task using audiovisual natural stimuli (speaking faces). The complexity of this information and the attentional features of this population, among others, pose many challenges regarding the quality of data obtained with an eye-tracker. Here we discuss some of them and outline possible solutions.

Intelligent cockpit: eye tracking integration to enhance the pilot-aircraft interaction

In this research, we use eye tracking to monitor the attentional behavior of pilots in the cockpit. We built a cockpit monitoring database that serves as a reference for real-time assessment of the pilot's monitoring strategies, based on numerous flight simulator sessions with eye-tracking recordings. Eye tracking may also be employed as a passive input for assistive systems; future studies will also explore the possibility of adapting the notifications' modality using gaze.

Investigating the multicausality of processing speed deficits across developmental disorders with eye tracking and EEG: extended abstract

Neuropsychological tests inform about performance differences in cognitive functions but they typically tell little about the causes for these differences. Here, we propose a project which builds upon a recently developed novel multimodal neuroscientific approach of simultaneous eye-tracking and EEG measurements to provide insights into diverse causes of performance differences in the Symbol Search Test (SST). Using a unique large dataset we plan to investigate the causes for performance differences in the SST in healthy and clinically diagnosed children and adolescents. Firstly, we aim to investigate how causes for differences in performance in the SST evolve over age in healthy, typically developing children. With this we plan to dissect aging effects from effects that are specific to developmental neuropsychiatric disorders. Secondly, we will include subjects with deficient performance to investigate different causes for bad performance to identify data-driven subgroups of poor performers.

Seeing in time: an investigation of entrainment and visual processing in toddlers

Recent neurophysiological and behavioral studies have provided strong evidence of rhythmic entrainment at the perceptual level in adults. The present study examines whether rhythmic auditory stimulation synchronized with visual stimuli and fast tempi could enhance visual processing in toddlers. Two groups of participants with different musical experiences will be recruited. A head-mounted camera will be used to investigate perceptual entrainment when participants perform visual search tasks.

Seeing into the music score: eye-tracking and sight-reading in a choral context

Musical sight-reading is a complex task which requires fluent use of multiple types of skills and knowledge. The ability to sight-read a score is typically described as one of the most challenging aims for beginners and finding ways of scaffolding their learning is, therefore, an important task for researchers in music education. The purpose of this study is to provide a deeper understanding of how an application of eye tracking technology can be utilized to improve choir singers' sight-reading ability. Collected data of novices' sight-reading patterns during choral rehearsal have helped identify problems that singers are facing. Analyzing corresponding patterns in sight-reading performed by expert singers may provide valuable information about helpful strategies developed with increasing experience. This project is expected to generate an approximate model, similar to the experts' eye movement path. The model will then be implemented in a training method for unskilled choral singers. Finally, as a summative result, we plan to evaluate how the training affects novices' competency in sight-reading and comprehension of the score.

Towards concise gaze sharing

Computer-supported collaboration changed the way we learn and work together, as co-location is no longer a necessity. While presence, pointing and actions belong to the established inventory of awareness functionality which aims to inform about peer activities, visual attention as a beneficial cue for successful collaboration does not. Several studies have shown that providing real-time gaze cues is advantageous, as it enables more efficient referencing by reducing deictic expressions and fosters joint attention by facilitating shared gaze. But the actual use is held back due to its inherent limitations: Real-time gaze display is often considered distracting, which is caused by its constant movement and an overall low signal-to-noise ratio. As a result, the transient nature makes it difficult to associate with a dynamic stimulus over time. While it is helpful when referencing or shared gaze is crucial, the application in common collaborative environments with a constant alternation between close and loose collaboration presents challenges. My dissertation work will explore a novel gaze sharing approach that aims to detect task-related gaze patterns which are displayed in concise representations. This work will contribute to our understanding of coordination in collaborative environments and propose algorithms and design recommendations for gaze sharing.

Training operational monitoring in future ATCOs using eye tracking: extended abstract

Improved technological possibilities continue to increase the significance of operational monitoring in air traffic control (ATC). The role of the air traffic controller (ATCO) will change in that they will have to monitor the operations of an automated system for failures. In order to take over control when automation fails, future ATCOs will need to be trained. While current ATC training is mainly based on performance indicators, this study will focus on the benefit of using eye tracking in future ATC training. Utilizing a low-fidelity operational monitoring task, a model of how attention should be allocated in case of malfunction will be derived. Based on this model, one group of ATC novices will receive training on how to allocate their attention appropriately (treatment). The other group will receive no training (control). Eye movements will be recorded to investigate how attention is allocated and if the training is successful. Performance measures will be used to evaluate the effectiveness of the training.

Using eye tracking to simplify screening for visual field defects and improve vision rehabilitation: extended abstract

My thesis will encompass two main research objectives:

(1) Evaluation of eye tracking as a method to screen for visual field defects.

(2) Investigating how vision rehabilitation therapy can be improved by employing eye tracking.

Virtual reality as a proxy for real-life social attention?

Previous studies found large amounts of overt attention allocated towards human faces when they were presented as images or videos, but a relative avoidance of gaze at conspecifics' faces in real-world situations. We measured gaze behavior in a complex virtual scenario in which a human face and an object were similarly exposed to the participants' view. Gaze at the face was avoided compared to gaze at the object, providing support for the hypothesis that virtual reality scenarios are capable of eliciting modes of information processing comparable to real-world situations.

SESSION: ETRA video presentation abstracts

Head and gaze control of a telepresence robot with an HMD

Gaze interaction with telerobots is a new opportunity for wheelchair users with severe motor disabilities. We present a video showing how head-mounted displays (HMD) with gaze tracking can be used to monitor a robot that carries a 360° video camera and a microphone. Our interface supports autonomous driving via way-points on a map, along with gaze-controlled steering and gaze typing. It is implemented with Unity, which communicates with the Robot Operating System (ROS).

Developing photo-sensor oculography (PS-OG) system for virtual reality headsets

Virtual reality (VR) is employed in a variety of different applications. It is our belief that eye-tracking is going to be a part of the majority of VR devices that will reduce computational burden via a technique called foveated rendering and will increase the immersion of the VR environment. A promising technique to achieve low energy, fast, and accurate eye-tracking is photo-sensor oculography (PS-OG). PS-OG technology enables tracking a user's gaze location at very fast rates - 1000Hz or more, and is expected to consume several orders of magnitude less power compared to a traditional video-oculography approach. In this demo we present a prototype of a PS-OG system that we started to develop. The long-term aim of our project is to develop a PS-OG system that is robust to sensor shifts. As a first step we have built a prototype that allows us to test different sensors and their configurations, as well as record and analyze eye-movement data.

Automatic mapping of gaze position coordinates of eye-tracking glasses video on a common static reference image

This paper describes a method for automatic semantic gaze mapping from video obtained by eye-tracking glasses to a common reference image. Image feature detection and description algorithms are utilized to find the position of subsequent video frames and map corresponding gaze coordinates on a common reference image. This process allows experiment results to be aggregated for further analysis and provides an alternative to manual semantic gaze mapping methods.

Substantiating reading teachers with scanpaths

We present a tool that allows reading teachers to record and replay students' voice and gaze behavior during reading. The tool replays scanpaths to reading professionals without prior gaze data experience. On the basis of test experiences with 147 students, we share our initial observations on how teachers make use of the tool to create a dialog with their students.

Tracing gaze-following behavior in virtual reality using wiener-granger causality

We modelled gaze following behavior in a naturalistic virtual reality environment using Wiener-Granger causality. Using this method, gaze following was statistically tangible throughout the experiment, but could not easily be pinpointed to precise moments in time.

Semantic fovea: real-time annotation of ego-centric videos with gaze context

Visual context plays a crucial role in understanding human visual attention in natural, unconstrained tasks - the objects we look at during everyday tasks provide an indicator of our ongoing attention. Collection, interpretation, and study of visual behaviour in unconstrained environments is therefore necessary, but presents many challenges and requires painstaking hand-coding. Here we demonstrate a proof-of-concept system that enables real-time annotation of objects in an egocentric video stream from head-mounted eye-tracking glasses. We concurrently obtain a live stream of user gaze vectors with respect to their own visual field. Even during dynamic, fast-paced interactions, our system was able to recognise all objects in the user's field-of-view with moderate accuracy. To validate our concept, our system was used to annotate an in-lab breakfast scenario in real time.

Hands-free web browsing: enriching the user experience with gaze and voice modality

Hands-free browsers provide an effective tool for Web interaction and accessibility, overcoming the need for conventional input sources. Current approaches to hands-free interaction are primarily categorized in either voice or gaze-based modality. In this work, we investigate how these two modalities could be integrated to provide a better hands-free experience for end-users. We demonstrate a multimodal browsing approach combining eye gaze and voice inputs for optimized interaction, and to suffice user preferences with unimodal benefits. The initial assessment with five participants indicates improved performance for the multimodal prototype in comparison to single modalities for hands-free Web browsing.

Mobile consumer shopping journey in fashion retail: eye tracking mobile apps and websites

Despite the rapid adoption of smartphones among fashion consumers, their dissatisfaction with retailers' mobile apps and websites also increases. This suggests that understanding how mobile consumers use smartphones for shopping is important in developing digital shopping platforms fulfilling consumers' expectations. Research to date has not focused on eye tracking consumer shopping behavior using smartphones. For this research, we employed mobile eye tracking experiments in order to develop unique shopping journeys for each fashion consumer accounting for differences and similarities in their behavior. Based on scan path visualizations and shopping journeys we developed a precise account about the areas the majority of fashion consumers look at when browsing and inspecting product pages. Based on the findings, we identified mobile consumers' behaviour patterns, usability issues of the mobile channel and established what features the mobile retail channel needs to have to satisfy fashion consumers' needs by offering pleasing customer user experiences.

EyeMR: low-cost eye-tracking for rapid-prototyping in head-mounted mixed reality

Mixed Reality devices can either augment reality (AR) or create completely virtual realities (VR). Combined with head-mounted devices and eye-tracking, they enable users to interact with these systems in novel ways. However, current eye-tracking systems are expensive and limited in the interaction with virtual content. In this paper, we present EyeMR, a low-cost system (below $100) that enables researchers to rapidly prototype new techniques for eye and gaze interactions. Our system supports mono- and binocular tracking (using Pupil Capture) and includes a Unity framework to support the fast development of new interaction techniques. We argue for the usefulness of EyeMR based on results of a user evaluation with HCI experts.

A gaze-contingent intention decoding engine for human augmentation

Humans process high volumes of visual information to perform everyday tasks. In a reaching task, the brain estimates the distance and position of the object of interest, to reach for it. Having a grasp intention in mind, human eye-movements produce specific relevant patterns. Our Gaze-Contingent Intention Decoding Engine uses eye-movement data and gaze-point position to indicate the hidden intention. We detect the object of interest using deep convolutional neural networks and estimate its position in a physical space using 3D gaze vectors. Then we trigger the possible actions from an action grammar database to perform an assistive movement of the robotic arm, improving action performance in physically disabled people. This document is a short report to accompany the Gaze-Contingent Intention Decoding Engine demonstrator, providing details of the setup used and results obtained.

Use of attentive information dashboards to support task resumption in working environments

Interruptions are known as one of the big challenges in working environments. Due to improper resumption of the primary task, such interruptions may result in task resumption failures and negatively influence the task performance. This phenomenon also occurs when users are working with information dashboards in working environments. To address this problem, an attentive dashboard issuing visual feedback is developed. This feedback supports the user in resuming the primary task after the interruption by guiding the visual attention. The attentive dashboard captures visual attention allocation of the user with a low-cost screen-based eye-tracker while they are monitoring the graphs. This dashboard is sensitive to the occurrence of external interruption by tracking the eye-movement data in real-time. Moreover, based on the collected eye-movement data, two types of visual feedback are designed which highlight the last fixated graph and unnoticed ones.

Eyemic: an eye tracker for surgical microscope

The concept of a hands-free surgical microscope has become increasingly popular in the domain of microsurgery. The higher the magnification, the smaller the field of view, which necessitates frequent interaction with the microscope during an operation. Researchers showed that manual (hand) interactions with a surgical microscope resulted in disruptive and hazardous situations. Previously, we proposed the idea of an eye-controlled microscope as a solution to this interaction problem. While gaze-contingent applications have been widely studied in the HCI and eye tracking domains, the lack of ocular-based eye trackers for microscopes is an important concern in this domain. To solve this critical problem and provide the opportunity to capture eye movements in microsurgery in real time, we present EyeMic, a binocular eye tracker that can be attached on top of any microscope ocular. Our eye tracker is only 5mm high to guarantee the same field of view, and it supports eye movement recording at up to 120 frames per second.

Real-time gaze transition entropy

In this video, we introduce a real-time algorithm that computes gaze transition entropy. This approach can be employed in detecting higher level cognitive states such as situation awareness. We first compute fixations using our real-time version of a well established velocity threshold based algorithm. We then compute the gaze transition entropy for a content independent grid of areas of interest in real-time using an update processing window approach. We test for the Markov property after each update to check whether the Markov assumption holds. Higher entropy corresponds to increased eye movement and more frequent monitoring of the visual field. In contrast, lower entropy corresponds to fewer eye movements and less frequent monitoring. Based on entropy levels, the system could then alert the user accordingly and plausibly offer an intervention. We developed an example application to demonstrate the use of the online calculation of gaze transition entropy in a practical scenario.
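A minimal sketch of the entropy computation over a processing window follows. It is not the authors' algorithm: it assumes fixations have already been mapped to grid-cell labels, it estimates the stationary distribution from the observed source-state frequencies, it uses a fixed-length sliding window as one possible reading of the "update processing window", and it omits the Markov property test. The names `transition_entropy` and `SlidingEntropy` and the window size of 50 are hypothetical.

```python
import math
from collections import Counter, deque


def transition_entropy(aoi_sequence):
    """First-order gaze transition entropy in bits.

    H = -sum_i p(i) * sum_j p(j|i) * log2 p(j|i), where p(i) is estimated
    from how often cell i appears as the source of a transition (assumption).
    """
    seq = list(aoi_sequence)
    pairs = list(zip(seq, seq[1:]))
    if not pairs:
        return 0.0
    pair_counts = Counter(pairs)
    source_counts = Counter(src for src, _ in pairs)
    total = len(pairs)
    entropy = 0.0
    for (src, _dst), n in pair_counts.items():
        p_src = source_counts[src] / total
        p_cond = n / source_counts[src]
        entropy -= p_src * p_cond * math.log2(p_cond)
    return entropy


class SlidingEntropy:
    """Recompute the entropy each time a new fixation label arrives."""

    def __init__(self, window_size=50):
        # Oldest labels drop out automatically once the window is full.
        self.window = deque(maxlen=window_size)

    def update(self, aoi_label):
        self.window.append(aoi_label)
        return transition_entropy(self.window)


# Feed a stream of grid-cell labels and read the latest entropy value.
tracker = SlidingEntropy(window_size=4)
latest = None
for label in "AABA":
    latest = tracker.update(label)
```

A strictly alternating sequence such as A, B, A, B yields zero entropy because every transition is predictable, whereas varied transitions between cells raise the value.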

Modeling corneal reflection for eye-tracking considering eyelid occlusion

Capturing Purkinje images is essential for wide-range and accurate eye-tracking. The range of eye rotation over which the Purkinje image is observable has so far been modeled by a cone shape called a gaze cone. In this study, we extended the gaze cone model to include occlusion by an eyelid. First, we developed a measurement device that has eight spider-like arms. Then, we proposed a novel model that considers eyeball movement. Using the device, we measured the range of corneal reflection, and we fitted the proposed model to the results.

Enhanced representation of web pages for usability analysis with eye tracking

Eye tracking as a tool to quantify user attention plays a major role in research and application design. For Web page usability, it has become a prominent measure to assess which sections of a Web page are read, glanced at, or skipped. Such assessments primarily depend on the mapping of gaze data to a Web page representation. However, current representation methods, a virtual screenshot of the Web page or a video recording of the complete interaction session, suffer either from accuracy or scalability issues. We present a method that identifies fixed elements on Web pages and combines user viewport screenshots in relation to fixed elements for an enhanced representation of the page. We conducted an experiment with 10 participants, and the results indicate that analysis with our method is more efficient than with a video recording, which is an essential criterion for large scale Web studies.

Self-made mobile gaze tracking for group studies

Mobile gaze tracking does not need to be expensive. We have a self-made mobile gaze tracking system, consisting of a glasses-like frame and software which computes the gaze point. As the total cost of the equipment is less than a thousand euros, we have prepared five devices which we use in group studies to simultaneously estimate multiple students' gaze in a classroom. The inexpensive mobile gaze tracking technology opens new possibilities for studying attentional processes on a group level.

SESSION: ETRA demo presentation abstracts

An implementation of eye movement-driven biometrics in virtual reality

As eye tracking can reduce the computational burden of virtual reality devices through a technique known as foveated rendering, we believe not only that eye tracking will be implemented in all virtual reality devices, but that eye tracking biometrics will become the standard method of authentication in virtual reality. Thus, we have created a real-time eye movement-driven authentication system for virtual reality devices. In this work, we describe the architecture of the system and provide a specific implementation that is done using the FOVE head-mounted display. We end with an exploration into future topics of research to spur thought and discussion.

Anyorbit: orbital navigation in virtual environments with eye-tracking

Gaze-based interactions promise to be fast, intuitive and effective in controlling virtual and augmented environments. Yet, there is still a lack of usable 3D navigation and observation techniques. In this work: 1) We introduce a highly advantageous orbital navigation technique, AnyOrbit, providing an intuitive and hands-free method of observation in virtual environments that uses eye-tracking to control the orbital center of movement; 2) The versatility of the technique is demonstrated with several control schemes and use-cases in virtual/augmented reality head-mounted-display and desktop setups, including observation of 3D astronomical data and spectator sports.

Robust marker tracking system for mapping mobile eye tracking data

One of the challenges of mobile eye tracking is mapping gaze data on a reference image of the stimulus. Here we present a marker-tracking system that relies on the scene-video, recorded by eye tracking glasses, to recognize and track markers and map gaze data on the reference image. Due to the simple nature of the markers employed, the current system works with low-quality videos and at long distances from the stimulus, allowing the use of mobile eye tracking in new situations.

Deep learning vs. manual annotation of eye movements

Deep learning models have already revolutionized many research fields. However, raw eye movement data is still typically processed into discrete events via threshold-based algorithms or manual labelling. In this work, we describe a compact 1D CNN model, which we combined with BLSTM to achieve end-to-end sequence-to-sequence learning. We discuss the acquisition process for the ground truth that we use, as well as the performance of our approach in comparison to various models from the literature and to manual raters. Our deep method demonstrates superior performance, which brings us closer to human-level labelling quality.

A gaze gesture-based paradigm for situational impairments, accessibility, and rich interactions

Gaze gesture-based interactions on a computer are promising, but the existing systems are limited by the number of supported gestures, recognition accuracy, need to remember the stroke order, lack of extensibility, and so on. We present a gaze gesture-based interaction framework where a user can design gestures and associate them with appropriate commands like minimize, maximize, scroll, and so on. This allows the user to interact with a wide range of applications using a common set of gestures. Furthermore, our gesture recognition algorithm is independent of screen size and resolution, and the user can draw the gesture anywhere on the target application. Results from a user study involving seven participants showed that the system recognizes a set of nine gestures with an accuracy of 93% and an F-measure of 0.96. We envision that this framework can be leveraged in developing solutions for situational impairments and accessibility, and also for implementing a rich interaction paradigm.

New features of scangraph: a tool for revealing participants' strategy from eye-movement data

The demo describes new features of ScanGraph, an application intended for finding participants with a similar stimulus reading strategy based on the sequences of visited Areas of Interest. The result is visualised using cliques of a simple graph. ScanGraph was initially introduced in 2016. Since the original publication, new features have been added. The first of these is the implementation of the Damerau-Levenshtein algorithm for similarity calculation. A heuristic algorithm for clique finding used in the original version was replaced by the Bron-Kerbosch algorithm. ScanGraph reads data from the open-source application OGAMA and, with the use of a conversion tool, also from SMI BeGaze, which allows analysing dynamic stimuli as well. The most prominent enhancement is the possibility of similarity calculation among participants not only for a single stimulus but for multiple files at once.
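
As a rough illustration of the two techniques named in this abstract, the sketch below scores pairs of AOI sequences with an edit distance and then lists maximal cliques of the resulting similarity graph. It is not ScanGraph's code: the function names, the normalisation by the longer sequence, the threshold parameter, and the sample data are all assumptions made for this example. The distance used is the restricted (optimal string alignment) variant of Damerau-Levenshtein, and the clique search is the basic Bron-Kerbosch recursion without pivoting.

```python
from typing import Dict, List, Set


def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent symbols
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalisation: 1 minus distance over the longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - osa_distance(a, b) / longest


def bron_kerbosch(r: Set[str], p: Set[str], x: Set[str],
                  adj: Dict[str, Set[str]], out: List[Set[str]]) -> None:
    """Basic Bron-Kerbosch: append every maximal clique to `out`."""
    if not p and not x:
        out.append(r)
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, out)
        p = p - {v}
        x = x | {v}


def similar_groups(scanpaths: Dict[str, str], threshold: float) -> List[Set[str]]:
    """Link participants whose similarity meets the threshold, then find cliques."""
    names = list(scanpaths)
    adj: Dict[str, Set[str]] = {n: set() for n in names}
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if similarity(scanpaths[u], scanpaths[v]) >= threshold:
                adj[u].add(v)
                adj[v].add(u)
    cliques: List[Set[str]] = []
    bron_kerbosch(set(), set(names), set(), adj, cliques)
    return cliques


# Hypothetical data: one letter per visited Area of Interest.
sample = {"p1": "ABCD", "p2": "ABDC", "p3": "DCBA"}
groups = similar_groups(sample, threshold=0.7)
```

With this sample, p1 and p2 differ by a single transposition and end up in one clique, while p3 forms a clique of its own. Each clique corresponds to a group of participants whose sequences are all pairwise similar at the chosen threshold.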

A visual comparison of gaze behavior from pedestrians and cyclists

In this paper, we contribute an eye tracking study conducted with pedestrians and cyclists. We apply a visual analytics-based method to inspect pedestrians' and cyclists' gaze behavior as well as video recordings and accelerometer data. This method using multi-modal data allows us to explore patterns and extract common eye movement strategies. Our results are that participants paid most attention to the path itself; advertisements do not distract participants; participants focus more on pedestrians than on cyclists; pedestrians perform more shoulder checks than cyclists do; and we extracted common gaze sequences. Such an experiment in a real-world traffic environment allows us to understand realistic behavior of pedestrians and cyclists better.

iTrace: eye tracking infrastructure for development environments

The paper presents iTrace, an eye tracking infrastructure that enables eye tracking in development environments such as Visual Studio and Eclipse. Software developers work with software that is composed of numerous source code files. This requires frequent switching between project artifacts during program understanding or debugging activities. Additionally, the amount of content contained within each artifact can be quite large and require scrolling or navigation of the content. Current approaches to eye tracking are meant for fixed stimuli and struggle to capture context during these activities. iTrace overcomes these limitations, allowing developers to work in realistic settings during an eye tracking study. The iTrace architecture is presented along with several use cases in which it can be used by researchers. A short video demonstration is available at

An inconspicuous and modular head-mounted eye tracker

State-of-the-art head-mounted eye trackers employ glasses-like frames, making their usage uncomfortable or even impossible for prescription eyewear users. Nonetheless, these users represent a notable portion of the population (e.g., the Prevent Blindness America organization reports that about half of the USA population uses corrective eyewear for refractive errors alone). Thus, making eye tracking accessible for eyewear users is not only paramount for improving usability but also key to the ecological validity of eye tracking studies. In this work, we report on a novel approach for eye tracker design in the form of a modular and inconspicuous device that can be easily attached to glasses; for users without glasses, we also provide a 3D-printable frame blueprint. Our prototypes include both low-cost Commercial Off-The-Shelf (COTS) and more expensive Original Equipment Manufacturer (OEM) cameras, with sampling rates ranging between 30 and 120 fps and multiple pixel resolutions.