This joint panel between UIST and CSCW brings together leading researchers at the intersection of the two conferences, systems researchers and researchers in collaborative and social computing, to engage in a discussion and retrospective. Pairs of panelists will represent each decade since the founding of the conferences, sharing a brief retrospective that surveys the most influential papers of that decade, the zeitgeist of the problems that were popular and why, and what each decade's work has to say to the decades that came before and after. The panel is intended as a space to celebrate advances in the field and to reflect on the burdens and opportunities that lie ahead.
Human communication mediated by computers and Augmented Reality devices will enable us to dynamically express, share, and explore new ideas with each other via live simulations as easily as we talk about the weather. This collaboration provides a "shared truth": what you see is exactly what I see, I see you perform an action as you do it, and we both see exactly the same dynamic transformation of this shared information space. When you express an idea, the computer, a full participant in this conversation, instantly makes it real for both of us, enabling us to critique and negotiate its meaning. This shared virtual world will be as live, dynamic, pervasive, and visceral as the physical one.
The development of new hardware can be split into two phases: prototyping and production. A wide variety of tools and techniques have empowered people to build prototypes during the first phase, but the transition to production is still complex, costly and prone to failure. This means the second phase often requires an up-front commitment to large volume production in order to be viable. I believe that new tools and techniques can democratize hardware production. Imagine "DevOps for hardware" - everything from circuit simulation tools to re-usable hardware test jig designs; and from test-driven development for hardware to telepresence for remote factory visits. Supporting low volume production and organic scaling in this way would spur innovation and increase consumer choice. I encourage the UIST community to join me in pursuit of this vision.
Knowledge work increasingly spans multiple computing surfaces. Yet in status quo user experiences, content as well as tools, behaviors, and workflows are largely bound to the current device, running the current application, for the current user, and at the current moment in time. SurfaceFleet is a system and toolkit that uses resilient distributed programming techniques to explore cross-device interactions that are unbounded in these four dimensions of device, application, user, and time. As a reference implementation, we describe an interface built using SurfaceFleet that employs lightweight, semi-transparent UI elements known as Applets. Applets appear always-on-top of the operating system, application windows, and (conceptually) above the device itself. But all connections and synchronized data are virtualized and made resilient through the cloud. For example, a sharing Applet known as a Portfolio allows a user to drag and drop unbound Interaction Promises into a document. Such promises can then be fulfilled with content asynchronously, at a later time (or multiple times), from another device, and by the same or a different user.
Knowledge management and sharing involves a variety of specialized but isolated software tools, tied together by the files that these tools use and produce. We interviewed 23 scientists and found that they all had difficulties using the file system to keep track of, re-find, and maintain consistency among related but distributed information. We introduce FileWeaver, a system that automatically detects dependencies among files without explicit user action, tracks their history, and lets users interact directly with the graphs representing these dependencies and version history. Changes to a file can trigger recipes, either automatically or under user control, to keep the file consistent with its dependents. Users can merge variants of a file, e.g. different output formats, into a polymorphic file, or morph, and automate the management of these variants. By making dependencies among files explicit and visible, FileWeaver facilitates the automation of workflows by scientists and other users who rely on the file system to manage their data.
Many computing tasks, such as comparison shopping, two-factor authentication, and checking movie reviews, require using multiple apps together. On large screens, "windows, icons, menus, pointer" (WIMP) graphical user interfaces (GUIs) support easy sharing of content and context between multiple apps. So, it is straightforward to see the content from one application and write something relevant in another application, such as looking at the map around a place and typing walking instructions into an email. However, although today's smartphones also use GUIs, they have small screens and limited windowing support, making it hard to switch contexts and exchange data between apps.
We introduce DoThisHere, a multimodal interaction technique that streamlines cross-app tasks and reduces the burden these tasks impose on users. Users can use voice to refer to information or app features that are off-screen and touch to specify where the relevant information should be inserted or is displayed. With DoThisHere, users can access information from or carry information to other apps with less context switching.
We conducted a survey to find out what cross-app tasks people are currently performing or wish to perform on their smartphones. Among the 125 tasks that we collected from 75 participants, we found that 59 of these tasks are not well supported currently. DoThisHere is helpful in completing 95% of these unsupported tasks. A user study, where users are shown the list of supported voice commands when performing a representative sample of such tasks, suggests that DoThisHere may reduce expert users' cognitive load; the Query action, in particular, can help users reduce task completion time.
As sensors and interactive devices become ubiquitous and transition outdoors and into the wild, we are met with the challenge of mass deployment and actuation. We present E-seed, a biomimetic platform that consumes little power to deploy, harvests energy from nature to install, and functions autonomously in the field. Each seed can individually self-drill into a substrate by harvesting moisture fluctuations in its ambient environment. As such, E-seed acts as a shape-changing interface to autonomously embed functional devices and interfaces into the soil, with the potential of aerial deployment in hard-to-reach locations. Our system is constructed primarily from wood veneer, making it lightweight, inexpensive, and biodegradable. In this paper, we detail our fabrication process and showcase demos that leverage the E-seed platform as a self-drilling interface. We envision that possible applications include soil sensors, sampling, and environmental monitoring for agriculture and reforestation.
Despite recent advancements in 3D printing technology, which allow users to rapidly produce 3D objects, printing tall and/or large objects still consumes considerable time and a large amount of support material. To address these problems, we propose Pop-up Print, a method to 3D print an object in a compact "folded" state and then unfold it after printing to achieve the final artifact. Using this method, we can reduce the object's print height and volume, which directly affect the printing time and support material consumption. In addition, thanks to the reversibility of folding/unfolding, we can minimize the printed object's volume when it is not in use, for storage or transportation, and expand it only when in use. To achieve Pop-up Print, we first conducted an experiment using selected printed sample objects with several parameters, in order to determine suitable crease patterns that make both the unfolded and folded states mechanically stable. Based on this result, we developed an interactive design tool to convert 3D models - such as a Stanford Bunny or a Huffman's cone - into the folded shape. Our design tool allows users to set non-intuitive parameters that may affect the form's mechanical stability, while maintaining both functional crease patterns and the object's original form factor. Finally, we demonstrate the feasibility of our method through several examples of folded objects.
Morphing materials allow us to create new modalities of interaction and fabrication by leveraging the materials' dynamic behaviors. Yet, despite the ongoing rapid growth of computational tools within this realm, current developments are bottlenecked by the lack of an effective simulation method. As a result, existing design tools must trade-off between speed and accuracy to support a real-time interactive design scenario. In response, we introduce SimuLearn, a data-driven method that combines finite element analysis and machine learning to create real-time (0.61 seconds) and truthful (97% accuracy) morphing material simulators. We use mesh-like 4D printed structures to contextualize this method and prototype design tools to exemplify the design workflows and spaces enabled by a fast and accurate simulation method. Situating this work among existing literature, we believe SimuLearn is a timely addition to the HCI CAD toolbox that can enable the proliferation of morphing materials.
The space around the body not only expands the interaction space of a mobile device beyond its small screen, but also enables users to utilize their kinesthetic sense. Therefore, body-centric peephole interaction has gained considerable attention. To support its practical implementation, we propose OddEyeCam, a vision-based method that tracks the 3D location of a mobile device in an absolute, wide, and continuous manner with respect to the body of a user in both static and mobile environments. OddEyeCam tracks the body of a user using a wide-view RGB camera and obtains precise depth information using a narrow-view depth camera from a smartphone close to the body. We quantitatively evaluated OddEyeCam through an accuracy test and two user studies. The accuracy test showed that the average tracking accuracy of OddEyeCam was 4.17 cm and 4.47 cm in 3D space when a participant was standing and walking, respectively. In the first user study, we implemented various interaction scenarios and observed that OddEyeCam was well received by the participants. In the second user study, we observed that the peephole target acquisition task performed using our system followed Fitts' law. We also analyzed the performance of OddEyeCam using the obtained measurements and observed that the participants completed the tasks with sufficient speed and accuracy.
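As background for the Fitts' law analysis mentioned above, the Shannon formulation MT = a + b * log2(D/W + 1) can be fitted to pointing data with a simple least-squares regression. The sketch below is generic illustration, not OddEyeCam's analysis code, and the function name is our own:

```python
import math

def fitts_fit(distances, widths, times):
    """Least-squares fit of Fitts' law MT = a + b * ID,
    with ID = log2(D/W + 1) (the Shannon formulation)."""
    ids = [math.log2(d / w + 1.0) for d, w in zip(distances, widths)]
    n = len(ids)
    mean_id = sum(ids) / n
    mean_mt = sum(times) / n
    # Closed-form simple linear regression: slope = cov/var
    cov = sum((i - mean_id) * (t - mean_mt) for i, t in zip(ids, times))
    var = sum((i - mean_id) ** 2 for i in ids)
    b = cov / var
    a = mean_mt - b * mean_id
    return a, b
```

Checking whether a task "follows Fitts' law" then amounts to how well this line explains the measured movement times across target distances and widths.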
We present MonoEye, a multimodal human motion capture system using a single RGB camera with an ultra-wide fisheye lens, mounted on the user's chest. Existing optical motion capture systems use multiple cameras, which are synchronized and require camera calibration. These systems also have usability constraints that limit the user's movement and operating space. Since the MonoEye system is based on a wearable single RGB camera, the wearer's 3D body pose can be captured without space and environment limitations. The body pose captured with our system is aware of the camera orientation, and therefore it is possible to recognize various motions that existing egocentric motion capture systems cannot recognize. Furthermore, the proposed system captures not only the wearer's body motion but also their viewport using head pose estimation and an ultra-wide image. To implement robust multimodal motion capture, we design three deep neural networks: BodyPoseNet, HeadPoseNet, and CameraPoseNet, which estimate 3D body pose, head pose, and camera pose in real-time, respectively. We train these networks with our new extensive synthetic dataset providing 680K frames of renderings of people with a wide range of body shapes, clothing, actions, backgrounds, and lighting conditions. To demonstrate the interactive potential of the MonoEye system, we present several application examples ranging from common body gestures to context-aware interactions.
C-Face (Contour-Face) is an ear-mounted wearable sensing technology that uses two miniature cameras to continuously reconstruct facial expressions by deep learning contours of the face. When facial muscles move, the contours of the face change from the point of view of the ear-mounted cameras. These subtle changes are fed into a deep learning model which continuously outputs 42 facial feature points representing the shapes and positions of the mouth, eyes and eyebrows. To evaluate C-Face, we embedded our technology into headphones and earphones. We conducted a user study with nine participants. In this study, we compared the output of our system to the feature points outputted by a state-of-the-art computer vision library (Dlib) from a front-facing camera. We found that the mean error of all 42 feature points was 0.77 mm for earphones and 0.74 mm for headphones. The mean error for 20 major feature points capturing the most active areas of the face was 1.43 mm for earphones and 1.39 mm for headphones. The ability to continuously reconstruct facial expressions introduces new opportunities in a variety of applications. As a demonstration, we implemented and evaluated C-Face for two applications: facial expression detection (outputting emojis) and silent speech recognition. We further discuss the opportunities and challenges of deploying C-Face in real-world applications.
Training a state-of-the-art deep neural network (DNN) is a computationally-expensive and time-consuming process, which incentivizes deep learning developers to debug their DNNs for computational performance. However, effectively performing this debugging requires intimate knowledge about the underlying software and hardware systems, something that the typical deep learning developer may not have. To help bridge this gap, we present Skyline: a new interactive tool for DNN training that supports in-editor computational performance profiling, visualization, and debugging. Skyline's key contribution is that it leverages special computational properties of DNN training to provide (i) interactive performance predictions and visualizations, and (ii) directly manipulatable visualizations that, when dragged, mutate the batch size in the code. As an in-editor tool, Skyline allows users to leverage these diagnostic features to debug the performance of their DNNs during development. An exploratory qualitative user study of Skyline produced promising results; all the participants found Skyline to be useful and easy to use.
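One computational property that makes interactive predictions of this kind possible is that DNN iteration time tends to grow roughly linearly with batch size, so a couple of profiled measurements suffice to extrapolate. The sketch below illustrates that idea under this simplifying linearity assumption; it is not Skyline's actual prediction procedure, and all names are illustrative:

```python
def fit_linear(batch1, time1, batch2, time2):
    """Fit time = fixed + slope * batch from two profiled iterations.
    Assumes per-iteration runtime is linear in batch size."""
    slope = (time2 - time1) / (batch2 - batch1)
    fixed = time1 - slope * batch1
    return fixed, slope

def predict_time(fixed, slope, batch):
    """Predict iteration time (same units as profiled) at a new batch size."""
    return fixed + slope * batch
```

For example, iterations profiled at 50 ms (batch 16) and 90 ms (batch 32) yield a predicted 170 ms at batch 64 under this model.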
We aim to increase the flexibility at which a data worker can choose the right tool for the job, regardless of whether the tool is a code library or an interactive graphical user interface (GUI). To achieve this flexibility, we extend computational notebooks with a new API, mage, which supports tools that can represent themselves as both code and GUI as needed. We discuss the design of mage as well as design opportunities in the space of flexible code/GUI tools for data work. To understand tooling needs, we conduct a study with nine professional practitioners and elicit their feedback on mage and potential areas for flexible code/GUI tooling. We then implement six client tools for mage that illustrate the main themes of our study findings. Finally, we discuss open challenges in providing flexible code/GUI interactions for data workers.
Data scientists have embraced computational notebooks to author analysis code and accompanying visualizations within a single document. Currently, although these media may be interleaved, they remain siloed: interactive visualizations must be manually specified as they are divorced from the analysis provenance expressed via dataframes, while code cells have no access to users' interactions with visualizations, and hence no way to operate on the results of interaction. To bridge this divide, we present B2, a set of techniques grounded in treating data queries as a shared representation between the code and interactive visualizations. B2 instruments data frames to track the queries expressed in code and synthesize corresponding visualizations. These visualizations are displayed in a dashboard to facilitate interactive analysis. When an interaction occurs, B2 reifies it as a data query and generates a history log in a new code cell. Subsequent cells can use this log to further analyze interaction results and, when marked as reactive, to ensure that code is automatically recomputed when new interaction occurs. In an evaluative study with data scientists, we find that B2 promotes a tighter feedback loop between coding and interacting with visualizations. All participants frequently moved from code to visualization and vice-versa, which facilitated their exploratory data analysis in the notebook.
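The core idea of treating data queries as a shared representation between code and visualization can be illustrated with a toy instrumented dataframe that logs filters and can reify them back into code, as B2 does when writing an interaction into a new cell. This is a minimal sketch with invented names, not B2's actual API:

```python
class TrackedFrame:
    """Toy 'dataframe' (a list of dicts) that records the queries applied
    to it, so a brush in a visualization and a filter in code reduce to
    the same query log."""
    def __init__(self, rows, name="df"):
        self.rows = rows
        self.name = name
        self.queries = []  # shared query representation

    def filter(self, column, op, value):
        ops = {"==": lambda a, b: a == b,
               ">": lambda a, b: a > b,
               "<": lambda a, b: a < b}
        kept = [r for r in self.rows if ops[op](r[column], value)]
        out = TrackedFrame(kept, self.name)
        out.queries = self.queries + [(column, op, value)]
        return out

    def to_code(self):
        """Reify the accumulated queries as a pandas-style code string,
        suitable for emitting into a new notebook cell."""
        expr = self.name
        for col, op, val in self.queries:
            expr = f"{expr}[{expr}[{col!r}] {op} {val!r}]"
        return expr
```

A brush selecting years after 2000 and the code `df.filter("year", ">", 2000)` would thus produce the same log entry, and `to_code()` turns that entry back into an executable expression.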
We present RealitySketch, an augmented reality interface for sketching interactive graphics and visualizations. In recent years, an increasing number of AR sketching tools enable users to draw and embed sketches in the real world. However, with the current tools, sketched contents are inherently static, floating in mid-air without responding to the real world. This paper introduces a new way to embed dynamic and responsive graphics in the real world. In RealitySketch, the user draws graphical elements on a mobile AR screen and binds them with physical objects in real-time and improvisational ways, so that the sketched elements dynamically move with the corresponding physical motion. The user can also quickly visualize and analyze real-world phenomena through responsive graph plots or interactive visualizations. This paper contributes a set of interaction techniques that enable capturing, parameterizing, and visualizing real-world motion without pre-defined programs and configurations. Finally, we demonstrate our tool with several application scenarios, including physics education, sports training, and in-situ tangible interfaces.
Virtual Reality (VR) users often need to work with other users, who observe them outside of VR using an external display. Communication between them is difficult; the VR user cannot see the external user's gestures, and the external user cannot see VR scene elements outside of the VR user's view. We carried out formative interviews with experts to understand these asymmetrical interactions and identify their goals and challenges. From this, we identify high-level system design goals to facilitate asymmetrical interactions and a corresponding space of implementation approaches based on the level of programmatic access to a VR application. We present TransceiVR, a system that utilizes VR platform APIs to enable asymmetric communication interfaces for third-party applications without requiring source code access. TransceiVR allows external users to explore the VR scene spatially or temporally, to annotate elements in the VR scene at correct depths, and to discuss via a shared static virtual display. An initial co-located user evaluation with 10 pairs shows that our system makes asymmetric collaborations in VR more effective and successful in terms of task time, error rate, and task load index. An informal evaluation with a remote expert gives additional insight into the utility of its features for real-world tasks.
Videos are a convenient platform to begin, maintain, or improve a fitness program or physical activity. Traditional video systems allow users to manipulate videos through specific user interface actions such as button clicks or mouse drags, but have no model of what the user is doing and are unable to adapt in useful ways. We present adaptive video playback, which seamlessly synchronises video playback with the user's movements, building upon the principle of direct manipulation video navigation. We implement adaptive video playback in Reactive Video, a vision-based system which supports users learning or practising a physical skill. The use of pre-existing videos removes the need to create bespoke content or specially authored videos, and the system can provide real-time guidance and feedback to better support users when learning new movements. Adaptive video playback using a discrete Bayes filter and a particle filter is evaluated on a data set collected from participants performing tai chi and radio exercises. Results show that both approaches can accurately adapt to the user's movements; however, reversing playback can be problematic.
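To give a flavor of the discrete Bayes approach, one update step of a filter over video frame indices can be sketched as below. The motion kernel here (favoring staying on a frame or advancing by one) is a simplifying assumption for illustration, not the paper's exact model:

```python
def bayes_step(belief, likelihoods, motion_kernel=(0.1, 0.6, 0.3)):
    """One update of a discrete Bayes filter over video frame indices.
    belief[i] is P(current frame == i); likelihoods[i] scores how well
    the user's observed pose matches frame i.
    motion_kernel = (P(back one), P(stay), P(advance one))."""
    n = len(belief)
    back, stay, fwd = motion_kernel
    # Predict: diffuse belief according to the motion model
    pred = [0.0] * n
    for i, p in enumerate(belief):
        pred[i] += p * stay
        if i + 1 < n:
            pred[i + 1] += p * fwd
        if i - 1 >= 0:
            pred[i - 1] += p * back
    # Correct: weight by observation likelihood and renormalize
    post = [p * l for p, l in zip(pred, likelihoods)]
    z = sum(post)
    return [p / z for p in post]
```

The playback position at each step would then be taken as the most probable frame, so playback follows the user rather than the clock.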
We present CoVR, a novel robotic interface providing strong kinesthetic feedback (100 N) in a room-scale VR arena. It consists of a physical column mounted on a 2D Cartesian ceiling robot (XY displacements) with the capacity of (1) resisting body-scaled user actions such as pushing or leaning; (2) acting on users by pulling or transporting them; and (3) carrying multiple, potentially heavy objects (up to 80 kg) that users can freely manipulate or make interact with each other. We describe its implementation and define a trajectory generation algorithm based on a novel user intention model to support non-deterministic scenarios, in which users are free to interact with any virtual object of interest regardless of the scenario's progress. A technical evaluation and a user study demonstrate the feasibility and usability of CoVR, as well as the relevance of whole-body interactions involving strong forces, such as being pulled or transported.
We focus on the problem of simulating the haptic infrastructure of a virtual environment (i.e. walls, doors). Our approach relies on multiple ZoomWalls---autonomous robotic encounter-type haptic wall-shaped props---that coordinate to provide haptic feedback for room-scale virtual reality. Based on a user's movement through the physical space, ZoomWall props are coordinated through a predict-and-dispatch architecture to provide just-in-time haptic feedback for objects the user is about to touch. To refine our system, we conducted simulation studies of different prediction algorithms, which helped us to refine our algorithmic approach to realize the physical ZoomWall prototype. Finally, we evaluated our system through a user experience study, which showed that participants found that ZoomWalls increased their sense of presence in the VR environment. ZoomWalls represents an instance of autonomous mobile reusable props, which we view as an important design direction for haptics in VR.
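The predict-and-dispatch idea can be sketched in two small steps: predict where the user will make contact, then send the prop that can get there soonest. The dead-reckoning predictor and greedy dispatcher below are purely illustrative (the paper's simulations compare more refined prediction algorithms):

```python
import math

def predict_contact(pos, vel, horizon=1.5):
    """Dead-reckoning prediction of where the user will be `horizon`
    seconds from now, from current 2D position and velocity."""
    return (pos[0] + vel[0] * horizon, pos[1] + vel[1] * horizon)

def dispatch(props, predicted_contact):
    """Greedy dispatch: choose the prop with the smallest estimated
    time of arrival at the predicted contact point.
    `props` maps prop id -> (x, y, speed)."""
    def eta(item):
        x, y, speed = item[1]
        return math.hypot(predicted_contact[0] - x,
                          predicted_contact[1] - y) / speed
    return min(props.items(), key=eta)[0]
```

In a real system this loop would run continuously, re-predicting as the user moves and re-dispatching idle props so a wall-shaped prop is in place just in time for the touch.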
Encountered-type haptic devices (EHDs) face a number of challenges when physically embodying content in a virtual environment, including workspace limits and device latency. To address these issues, we propose REACH+, a framework for dynamic visuo-haptic redirection to improve the perceived performance of EHDs during physical interaction in VR. Using this approach, we estimate the user's arrival time to their intended target and redirect their hand to a point within the EHD's spatio-temporally reachable space. We present an evaluation of this framework implemented with a desktop mobile robot in a 2D target selection task, tested at four robot speeds (20, 25, 30 and 35 cm/s). Results suggest that REACH+ can improve the performance of lower-speed EHDs, increasing their rate of on-time arrival to the point of contact by up to 25% and improving users' self-reported sense of realism.
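The geometric core of such redirection is clamping the intended contact point to the set the robot can actually reach before the hand arrives: a disk of radius speed x arrival-time around the robot. The sketch below shows only that clamping step; REACH+'s full pipeline also estimates arrival time and warps the rendered hand, which is omitted here:

```python
import math

def redirect_target(hand_target, robot_pos, robot_speed, t_arrive):
    """Project the intended 2D contact point onto the robot's
    spatio-temporally reachable set: the disk of radius
    robot_speed * t_arrive centered on the robot."""
    reach = robot_speed * t_arrive
    dx = hand_target[0] - robot_pos[0]
    dy = hand_target[1] - robot_pos[1]
    dist = math.hypot(dx, dy)
    if dist <= reach:
        return hand_target  # already reachable in time; no redirection
    s = reach / dist        # scale onto the boundary of the disk
    return (robot_pos[0] + dx * s, robot_pos[1] + dy * s)
```

A slow robot (0.25 m/s) with 2 s until contact can only cover 0.5 m, so a target 1 m away would be redirected to the nearest point it can reach on time.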
This paper introduces a Unified Model of Saliency and Importance (UMSI), which learns to predict visual importance in input graphic designs, and saliency in natural images, along with a new dataset and applications. Previous methods for predicting saliency or visual importance are trained individually on specialized datasets, making them limited in application and leading to poor generalization on novel image classes, while requiring a user to know which model to apply to which input. UMSI is a deep learning-based model simultaneously trained on images from different design classes, including posters, infographics, mobile UIs, as well as natural images, and includes an automatic classification module to classify the input. This allows the model to work more effectively without requiring a user to label the input. We also introduce Imp1k, a new dataset of designs annotated with importance information. We demonstrate two new design interfaces that use importance prediction, including a tool for adjusting the relative importance of design elements, and a tool for reflowing designs to new aspect ratios while preserving visual importance.
Many digital design tasks require a user to set a large number of parameters. Gallery-based interfaces provide a way to quickly evaluate examples and explore the space of potential designs, but require systems to predict which designs from a high-dimensional space are the right ones to present to the user. In this paper we present the design adjectives framework for building parameterized design tools in high dimensional design spaces. The framework allows users to create and edit design adjectives, machine-learned models of user intent, to guide exploration through high-dimensional design spaces. We provide a domain-agnostic implementation of the design adjectives framework based on Gaussian process regression, which is able to rapidly learn user intent from only a few examples. Learning and sampling of the design adjective occurs at interactive rates, making the system suitable for iterative design workflows. We demonstrate use of the design adjectives framework to create design tools for three domains: materials, fonts, and particle systems. We evaluate these tools in a user study showing that participants were able to easily explore the design space and find designs that they liked, and in professional case studies that demonstrate the framework's ability to support professional design concepting workflows.
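The ingredient that lets a "design adjective" be learned from only a few examples is Gaussian process regression: a handful of rated designs yields a posterior mean that scores any point in the design space. The pure-Python sketch below computes that posterior mean for 1D designs with an RBF kernel, using naive Gaussian elimination; it is a generic illustration, not the paper's implementation:

```python
import math

def gp_predict(X, y, x_star, length=1.0, noise=1e-6):
    """Posterior mean of GP regression with an RBF kernel:
    mean(x*) = k(x*, X) @ (K + noise*I)^-1 @ y."""
    def k(a, b):
        return math.exp(-((a - b) ** 2) / (2 * length ** 2))
    n = len(X)
    K = [[k(X[i], X[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    # Solve K @ alpha = y by Gaussian elimination with partial pivoting
    A = [row[:] + [y[i]] for i, row in enumerate(K)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    alpha = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = A[r][n] - sum(A[r][c] * alpha[c] for c in range(r + 1, n))
        alpha[r] = s / A[r][r]
    return sum(alpha[i] * k(X[i], x_star) for i in range(n))
```

With three rated examples the model already interpolates the ratings smoothly, which is why sampling high-scoring designs from it can run at interactive rates.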
Creating marketing videos from scratch can be challenging, especially when designing for multiple platforms with different viewing criteria. We present URL2Video, an automatic approach that converts a web page into a short video given temporal and visual constraints. URL2Video captures quality materials and design styles extracted from a web page, including fonts, colors, and layouts. Using constraint programming, URL2Video's design engine organizes the visual assets into a sequence of shots and renders to a video with user-specified aspect ratio and duration. Creators can review the video composition, modify constraints, and generate video variation through a user interface. We learned the design process from designers and compared our automatically generated results with their creation through interviews and an online survey. The evaluation shows that URL2Video effectively extracted design elements from a web page and supported designers by bootstrapping the video creation process.
Getting laser-cut mechanisms, such as those in microscopes, robots, vehicles, etc., to work requires all their components to be dimensioned precisely. This precision, however, tends to be lost when fabricating on a different laser cutter, as it is likely to remove more or less material (aka 'kerf'). We address this with what we call kerf-canceling mechanisms. Kerf-canceling mechanisms replace laser-cut bearings, sliders, gear pairs, etc. Unlike their traditional counterparts, however, they keep working when manufactured on a different laser cutter and/or with different kerf. Kerf-canceling mechanisms achieve this by adding an additional wedge element per mechanism. We have created a software tool, KerfCanceler, that locates traditional mechanisms in cutting plans and replaces them with their kerf-canceling counterparts. We evaluated our tool by converting 17 models found online to kerf-invariant models; we evaluated kerf-canceling bearings by testing with kerf values ranging from 0 mm to 0.5 mm and found that they perform reliably independent of kerf.
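For context, the per-machine compensation that kerf-canceling mechanisms are designed to make unnecessary looks like this: because the laser removes roughly kerf/2 of material on each side of the cut path, slots come out wider than drawn and tabs narrower, so the drawing must be adjusted per cutter. A minimal sketch of that standard adjustment (our own helper, not part of KerfCanceler):

```python
def drawn_dimension(nominal, kerf, feature):
    """Return the path dimension to draw so that the physical part
    comes out at `nominal` size on a cutter with the given kerf.
    Slots/holes lose material on their inside edges (come out wider),
    tabs/pins on their outside edges (come out narrower)."""
    if feature == "slot":
        return nominal - kerf
    if feature == "tab":
        return nominal + kerf
    raise ValueError(f"unknown feature type: {feature}")
```

The catch is that this correction bakes one machine's kerf into the cutting plan; a kerf-canceling mechanism instead absorbs the variation geometrically, e.g. via the wedge element.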
We present LamiFold, a novel design and fabrication workflow for making functional mechanical objects using a laser cutter. Objects fabricated with LamiFold embed advanced rotary, linear, and chained mechanisms, including linkages that support fine-tuning and locking position. Laser cutting such mechanisms without LamiFold requires designing for and embedding off-the-shelf parts such as springs, bolts, and axles for gears. The key to laser cutting our functional mechanisms is the selective cutting and gluing of stacks of sheet material. Designing mechanisms for this workflow is non-trivial, therefore we contribute a set of mechanical primitives that are compatible with our lamination workflow and can be combined to realize advanced mechanical systems. Our software design environment facilitates the process of inserting and composing our mechanical primitives and realizing functional laser-cut objects.
We present Tsugite - an interactive system for designing and fabricating wood joints for frame structures. Designing and manually crafting such joints is difficult and time-consuming. Our system facilitates the creation of custom joints via a modeling interface combined with computer numerical control (CNC) fabrication. The design space is a 3D grid of voxels that enables efficient geometrical analysis and combinatorial search. The interface has two modes: manual editing and gallery. In the manual editing mode, the user edits a joint while receiving real-time graphical feedback and suggestions provided based on performance metrics including slidability, fabricability, and durability with regard to the direction of fiber. In the gallery mode, the user views and selects feasible joints that have been pre-calculated. When a joint design is finalized, it can be manufactured with a 3-axis CNC milling machine using a specialized path planning algorithm that ensures joint assemblability by corner rounding. This system was evaluated via a user study and by designing and fabricating joint samples and functional furniture.
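One reason the voxel-grid design space enables efficient analysis is that metrics like slidability reduce to simple set operations: a joint part can slide out along a direction if repeatedly translating its voxels never overlaps the other part. The brute-force check below conveys the idea on a small grid; the paper's actual analysis is more involved:

```python
def slidable(part_a, part_b, direction):
    """Return True if part A's voxels can translate step by step along
    `direction` without ever overlapping part B's voxels.
    Parts are sets of (x, y, z) integer voxel coordinates."""
    a = set(part_a)
    b = set(part_b)
    dx, dy, dz = direction
    for _ in range(8):  # enough steps to clear a small joint grid
        a = {(x + dx, y + dy, z + dz) for (x, y, z) in a}
        if a & b:
            return False  # collision: this sliding direction is blocked
    return True
```

Running this check over the axis directions for every candidate voxel assignment is what makes a combinatorial search over feasible joints tractable.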
Recognition of human behavior plays an important role in context-aware applications. However, it is still a challenge for end-users to build personalized applications that accurately recognize their own activities. Therefore, we present CAPturAR, an in-situ programming tool that helps users rapidly author context-aware applications by referring to their previous activities. We customize an AR head-mounted device with multiple camera systems that allow for non-intrusive capturing of the user's daily activities. During authoring, we reconstruct the captured data in AR with an animated avatar and use virtual icons to represent the surrounding environment. With our visual programming interface, users create human-centered rules for the applications and experience them instantly in AR. We further demonstrate four use cases enabled by CAPturAR. Also, we verify the effectiveness of the AR-HMD and the authoring workflow with a system evaluation using our prototype. Moreover, we conduct a remote user study in an AR simulator to evaluate the usability.
Immersive authoring is a paradigm that makes Virtual Reality (VR) application development easier by allowing programmers to create VR content while immersed in the virtual environment. In this paradigm, programmers manipulate programming primitives through direct manipulation and get immediate feedback on their program's state and output. However, existing immersive authoring tools have a low ceiling; their programming primitives are intuitive but can only express a limited set of static relationships between elements in a scene. In this paper, we introduce FlowMatic, an immersive authoring tool that raises the ceiling of expressiveness by allowing programmers to specify reactive behaviors---behaviors that react to discrete events such as user actions, system timers, or collisions. FlowMatic also introduces primitives for programmatically creating and destroying new objects, for abstracting and re-using functionality, and for importing 3D models. Importantly, FlowMatic uses novel visual representations to allow these primitives to be represented directly in VR. We also describe the results of a user study that illustrates the usability advantages of FlowMatic relative to a 2D authoring tool and we demonstrate its expressiveness through several example applications that would be impossible to implement with existing immersive authoring tools. By combining a visual program representation with expressive programming primitives and a natural User Interface (UI) for authoring programs, FlowMatic shows how programmers can build fully interactive virtual experiences with immersive authoring.
"View-dependent effects" have parameters that change with the user's view and are rendered dynamically at runtime. They can be used to simulate physical phenomena such as exposure adaptation, as well as for dramatic purposes such as vignettes. We present a technique for adding view-dependent effects to 360 degree video, by interpolating spatial keyframes across an equirectangular video to control effect parameters during playback. An in-headset authoring tool is used to configure effect parameters and set keyframe positions. We evaluate the utility of view-dependent effects with expert 360 degree filmmakers and the perception of the effects with a general audience. Results show that experts find view-dependent effects desirable for their creative purposes and that these effects can evoke novel experiences in an audience.
The software behind online community platforms encodes a governance model that represents a strikingly narrow set of governance possibilities focused on moderators and administrators. When online communities desire other forms of government, such as ones that take many members' opinions into account or that distribute power in non-trivial ways, communities must resort to laborious manual effort. In this paper, we present PolicyKit, a software infrastructure that empowers online community members to concisely author a wide range of governance procedures and automatically carry out those procedures on their home platforms. We draw on political science theory to encode community governance into policies, or short imperative functions that specify a procedure for determining whether a user-initiated action can execute. Actions that can be governed by policies encompass everyday activities such as posting or moderating a message, but actions can also encompass changes to the policies themselves, enabling the evolution of governance over time. We demonstrate the expressivity of PolicyKit through implementations of governance models such as a random jury deliberation, a multi-stage caucus, a reputation system, and a promotion procedure inspired by Wikipedia's Request for Adminship (RfA) process.
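The policy abstraction described above, a short imperative function that decides whether a user-initiated action can execute, can be sketched as follows. The `Member`/`vote` API below is invented for illustration and is not PolicyKit's actual interface:

```python
import random

class Member:
    """Toy community member who always votes the same way."""
    def __init__(self, approves):
        self.approves = approves

    def vote(self, action):
        return self.approves

def jury_policy(action, community, jury_size=3, rng=None):
    """Randomly seat a jury; the action may execute only on majority approval."""
    rng = rng or random.Random(0)
    jury = rng.sample(community, jury_size)
    approvals = sum(1 for member in jury if member.vote(action))
    return approvals > jury_size // 2

community = [Member(True), Member(True), Member(False), Member(True), Member(False)]
allowed = jury_policy("post_message", community)
```

Because the policy is ordinary code, the same shape accommodates caucuses, reputation thresholds, or RfA-style promotion procedures by swapping the decision logic.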
As a typical part of the interface of web surveys, progress indicators show participants their degree of completion. As various studies suggest, these progress indicators influence dropout and response behavior. For this reason, the indicator should be chosen carefully. However, calculating the progress in adaptive surveys with many branches is often difficult. Recently, related work has provided algorithms for such surveys based on different prediction strategies and has identified the Root Mean Squared Error as a valuable measure for comparing different strategies. However, all previously mentioned strategies have shown poor predictions in some cases. In this paper, we present a new strategy that learns from historical data. A simulation study with 10k randomly generated surveys shows its benefits and limits. As an example application, we confirm the simulation results by comparing different prediction strategies on two large real-world surveys.
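The RMSE comparison of prediction strategies can be illustrated with a toy example. The progress traces below are fabricated for illustration; the actual strategies operate on real survey branch data:

```python
import math

def rmse(predicted, actual):
    """Root Mean Squared Error between two equal-length progress traces."""
    assert len(predicted) == len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual))
                     / len(predicted))

# True progress of a respondent who hit a short branch (5 pages seen).
actual = [i / 5 for i in range(1, 6)]      # 0.2, 0.4, ..., 1.0
# Naive strategy: always assume the longest possible path (10 pages).
naive = [i / 10 for i in range(1, 6)]      # 0.1, 0.2, ..., 0.5
# History-based strategy: expected remaining length from past respondents.
learned = [0.22, 0.45, 0.61, 0.83, 1.0]

print(rmse(naive, actual), rmse(learned, actual))
```

A lower RMSE means the displayed progress tracked the respondent's true progress more closely over the whole survey.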
While there is an enormous amount of information online for making decisions such as choosing a product, restaurant, or school, it can be costly for users to synthesize that information into confident decisions. Information for users' many different criteria needs to be gathered from many different sources into a structure where they can be compared and contrasted. The usefulness of each criterion for differentiating potential options can be opaque to users, and evidence such as reviews may be subjective and conflicting, requiring users to interpret each under their personal context. We introduce Mesh, which scaffolds users to iteratively build up a better understanding of both their criteria and options by evaluating evidence gathered across sources in the context of consumer decision-making. Mesh bridges the gap between decision support systems that typically have rigid structures and the fluid and dynamic process of exploratory search, changing the cost structure to provide increasing payoffs with greater user investment. Our lab and field deployment studies found evidence that Mesh significantly reduces the costs of gathering and evaluating evidence and scaffolds decision-making through personalized criteria enabling users to gain deeper insights from data.
In this paper, we present Acustico, a passive acoustic sensing approach that enables tap detection and 2D tap localization on uninstrumented surfaces using a wrist-worn device. Our technique uses a novel application of acoustic time difference of arrival (TDOA) analysis. We adopt a sensor fusion approach, analyzing both 'surface waves' (vibrations propagating through the surface) and 'sound waves' (vibrations propagating through the air) to improve sensing resolution. We carefully design a sensor configuration to meet the constraints of a wristband form factor. We built a wristband prototype with four acoustic sensors: two accelerometers and two microphones. Through a 20-participant study, we evaluated the performance of our proposed sensing technique for tap detection and localization. Results show that our system reliably detects taps with an F1-score of 0.9987 across different environmental noises and yields high localization accuracies with root-mean-square errors of 7.6 mm (X-axis) and 4.6 mm (Y-axis) across different surfaces and tapping techniques.
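The core TDOA idea can be sketched with a toy delay estimator: a brute-force cross-correlation over synthetic impulses, not Acustico's actual signal pipeline:

```python
def estimate_delay(sig_a, sig_b):
    """Return the lag (in samples) by which sig_b trails sig_a,
    found by brute-force cross-correlation."""
    n = len(sig_a)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-n + 1, n):
        score = sum(sig_a[i - lag] * sig_b[i]
                    for i in range(n) if 0 <= i - lag < n)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# A tap's impulse reaches sensor A at sample 10 and sensor B at sample 14.
a = [0.0] * 30; a[10] = 1.0
b = [0.0] * 30; b[14] = 1.0
lag = estimate_delay(a, b)   # samples; range difference = lag / fs * wave speed
```

Given the delays between several sensor pairs and the wave speed in the medium, the tap position follows from intersecting the resulting range-difference curves.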
Today's wearable and mobile devices typically use separate hardware components for sensing and actuation. In this work, we introduce new opportunities for the Linear Resonant Actuator (LRA), which is ubiquitous in such devices due to its capability for providing rich haptic feedback. By leveraging strategies to enable active and passive sensing capabilities with LRAs, we demonstrate their benefits and potential as self-contained I/O devices. Specifically, we use the back-EMF voltage to classify whether the LRA is tapped or touched, as well as how much pressure is being applied. Back-EMF sensing is already integrated into many motor and LRA drivers. We developed a passive low-power tap sensing method that uses just 37.7 uA. Furthermore, we developed active touch and pressure sensing, which is low-power, quiet (2 dB), and minimizes vibration. The sensing method works with many types of LRAs. We show applications, such as pressure-sensing side-buttons on a mobile phone. We have also implemented our technique directly on an existing mobile phone's LRA to detect whether the phone is handheld or placed on a soft or hard surface. Finally, we show that this method can be used for haptic devices to determine if the LRA makes good contact with the skin. Our approach can add rich sensing capabilities to the ubiquitous LRA actuators without requiring additional sensors or hardware.
In this paper we present MechanoBeat, a 3D printed mechanical tag that oscillates at a unique frequency upon user interaction. With the help of an ultra-wideband (UWB) radar array, MechanoBeat can unobtrusively monitor interactions with both stationary and mobile objects. MechanoBeat consists of small, scalable, and easy-to-install tags that do not require any batteries, silicon chips, or electronic components. Tags can be produced using commodity desktop 3D printers with cheap materials. We develop an efficient signal processing and deep learning method to locate and identify tags using only the signals reflected from the tag vibrations. MechanoBeat is capable of detecting simultaneous interactions with high accuracy, even in noisy environments. We leverage the high penetration property of UWB radar signals to sense interactions behind walls in a non-line-of-sight (NLOS) scenario. Finally, we explore a number of applications enabled by MechanoBeat and present the results in the paper.
Current wearable AR devices create an isolated experience with a limited field of view, vergence-accommodation conflicts, and difficulty communicating the virtual environment to observers. To address these issues and enable new ways to visualize, manipulate, and share virtual content, we introduce Augmented Augmented Reality (AAR) by combining a wearable AR display with a wearable spatial augmented reality projector. To explore this idea, a system is constructed to combine a head-mounted actuated pico projector with a Hololens AR headset. Projector calibration uses a modified structure from motion pipeline to reconstruct the geometric structure of the pan-tilt actuator axes and offsets. A toolkit encapsulates a set of high-level functionality to manage content placement relative to each augmented display and the physical environment. Demonstrations showcase ways to utilize the projected and head-mounted displays together, such as expanding field of view, distributing content across depth surfaces, and enabling bystander collaboration.
Head-Mounted Displays (HMDs) are the dominant form of enabling Virtual Reality (VR) and Augmented Reality (AR) for personal use. One of the biggest challenges of HMDs is the exclusion of people in the vicinity, such as friends or family. While recent research on asymmetric interaction for VR HMDs has contributed to solving this problem in the VR domain, AR HMDs come with similar but also different problems, such as conflicting information between content shown in the HMD and projected content. In this work, we propose ShARe, a modified AR HMD combined with a projector that can display augmented content onto planar surfaces to include outside users (non-HMD users). To combat the challenge of conflicting visualization between augmented and projected content, ShARe visually aligns the content presented through the AR HMD with the projected content using an internal calibration procedure and a servo motor. Using marker tracking, non-HMD users are able to interact with the projected content using touch and gestures. To further explore the arising design space, we implemented three types of applications (a collaborative game, a competitive game, and an external visualization). ShARe is a proof-of-concept system that showcases how AR HMDs can facilitate interaction with outside users to combat exclusion and instead foster rich, enjoyable social interactions.
We present HMD Light, a proof-of-concept Head-Mounted Display (HMD) implementation that reveals the Virtual Reality (VR) user's experience in the physical environment to facilitate communication between VR and external users in a mobile VR context. While previous work externalized the VR user's experience through an on-HMD display, HMD Light places the display into the physical environment to enable larger display and interaction area. This work explores the interaction design space of HMD Light and presents four applications to demonstrate its versatility. Our exploratory user study observed participant pairs experience applications with HMD Light and evaluated usability, accessibility and social presence between users. From the results, we distill design insights for HMD Light and asymmetric VR collaboration.
Correcting errors in entered text is a common task but usually difficult to perform on mobile devices due to tedious cursor navigation steps. In this paper, we present JustCorrect, an intelligent post hoc text correction technique for smartphones. To make a correction, the user simply types the correct text at the end of their current input, and JustCorrect will automatically detect the error and apply the correction in the form of an insertion or a substitution. In this way, manual navigation steps are bypassed, and the correction can be committed with a single tap. We solved two critical problems to support JustCorrect: (1) Correction Algorithm: we propose an algorithm that infers the user's correction intention from the last typed word. (2) Input Modalities: our study revealed that both tap and gesture were suitable input modalities for performing JustCorrect. Based on our findings, we integrated JustCorrect into a soft keyboard. Our user studies show that using JustCorrect reduces the text correction time by 12.8% over the stock Android keyboard and by 9.7% over the "Type, then Correct" text correction technique by Zhang et al. (2019). Overall, JustCorrect complements existing post hoc text correction techniques, making error correction more automatic and intelligent.
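The correction-inference step can be approximated with a simple edit-distance heuristic. This is an illustrative stand-in, not the paper's algorithm: the last typed word is matched against earlier words, and a close match is treated as a substitution, otherwise the word is kept as a plain insertion:

```python
def edit_distance(a, b):
    """Levenshtein distance via a single-row dynamic program."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[len(b)]

def just_correct(text, max_dist=2):
    """Treat the last typed word as a correction: substitute it for the
    closest earlier word, or keep it as a plain insertion."""
    *body, last = text.split()
    if not body:
        return text
    cand = min(body, key=lambda w: edit_distance(w, last))
    if cand != last and edit_distance(cand, last) <= max_dist:
        body[body.index(cand)] = last      # substitution
        return " ".join(body)
    return " ".join(body + [last])         # insertion

corrected = just_correct("I love the whether today weather")
print(corrected)   # I love the weather today
```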
Drawing boards offer a self-stable work surface that is continuously adjustable. On digital displays, such as the Microsoft Surface Studio, these properties open up a class of techniques that sense and respond to tilt adjustments. Each display posture, whether angled high, low, or somewhere in between, affords some activities, but not others. Because what is appropriate also depends on the application and task, we explore a range of app-specific transitions between reading vs. writing (annotation), public vs. personal, shared person-space vs. task-space, and other nuances of input and feedback, contingent on display angle. Continuous responses provide interactive transitions tailored to each use-case. We show how a variety of knowledge work scenarios can use sensed display adjustments to drive context-appropriate transitions, as well as technical software details of how to best realize these concepts. A preliminary remote user study suggests that techniques must balance the effort required to adjust tilt against the potential benefits of a sensed transition.
We study interaction interferences, situations where an unexpected change occurs in an interface immediately before the user performs an action, causing the corresponding input to be misinterpreted by the system. For example, a user tries to select an item in a list, but the list is automatically updated immediately before the click, causing the wrong item to be selected. First, we formally define interaction interferences and discuss their causes from behavioral and system-design perspectives. Then, we report the results of a survey examining users' perceptions of the frequency, frustration, and severity of interaction interferences. We also report a controlled experiment, based on state-of-the-art experimental protocols from neuroscience, that explores the minimum time interval, before clicking, below which participants could not refrain from completing their action. Finally, we discuss our findings and their implications for system design, paving the way for future work.
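One mitigation this line of work suggests can be sketched as a simple input guard. The threshold below is hypothetical, not the interval measured in the experiment:

```python
SAFETY_INTERVAL_MS = 300   # hypothetical threshold, not the measured value

def accept_click(click_time_ms, last_ui_change_ms):
    """Reject a click that lands too soon after the UI changed under it."""
    return click_time_ms - last_ui_change_ms >= SAFETY_INTERVAL_MS

print(accept_click(1000, 900))   # False: likely an interference victim
print(accept_click(1000, 500))   # True: enough time to react to the change
```

A real system would set the interval from the measured minimum reaction time below which users cannot refrain from completing their action.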
Mainstream board-level circuit design tools work at the lowest level of design --- schematics and individual components. While novel tools experiment with higher levels of design, abstraction often comes at the expense of the fine-grained control afforded by low-level tools. In this work, we propose a hardware description language (HDL) approach that supports users at multiple levels of abstraction from broad system architecture to subcircuits and component selection. We extend the familiar hierarchical block diagram with polymorphism to include abstract-typed blocks (e.g., generic resistor supertype) and electronics modeling (i.e., currents and voltages). Such an approach brings the advantages of reusability and encapsulation from object-oriented programming, while addressing the unique needs of electronics designers such as physical correctness verification. We discuss the system design, including fundamental abstractions, the block diagram construction HDL, and user interfaces to inspect and fine-tune the design; demonstrate example designs built with our system; and present feedback from intermediate-level engineers who have worked with our system.
MorphSensor is a 3D electronic design tool that enables designers to morph existing sensor modules of pre-defined two-dimensional shape into free-form electronic component arrangements that better integrate with the three-dimensional shape of a physical prototype. MorphSensor builds on existing sensor module schematics that already define the electronic components and the wiring required to build the sensor. Since MorphSensor maintains the wire connections throughout the editing process, the sensor remains fully functional even when designers change the electronic component layout on the prototype geometry. We detail the MorphSensor editor that supports designers in re-arranging the electronic components, and discuss a fabrication pipeline based on customized PCB footprints for making the resulting free-form sensor. We then demonstrate the capabilities of our system by morphing a range of sensor modules of different complexity and provide a technical evaluation of the quality of the resulting free-form sensors.
On-body electronics and sensors offer the opportunity to seamlessly augment the human with computing power. Accordingly, much previous work has investigated methods that exploit conductive materials and flexible substrates to fabricate circuits in the form of wearable devices, stretchable patches, and stickers that can be attached to the skin. For all these methods, the fabrication process involves several manual steps, such as designing the circuit in software, constructing conductive patches, and manually placing these physical patches on the body. In contrast, in this work, we propose to fabricate electronics directly on the skin. We present BodyPrinter, a wearable conductive-ink deposition machine that prints flexible electronics directly on the body using skin-safe conductive ink. The paper describes our system in detail and, through a series of examples and a technical evaluation, we show how direct on-body fabrication of electronic circuits and sensors can further enhance the human body.
We engineered an exoskeleton, which we call HandMorph, that approximates the experience of having a smaller grasping range. It uses mechanical links to transmit motion from the wearer's fingers to a smaller hand with five anatomically correct fingers. The result is that HandMorph miniaturizes a wearer's grasping range while transmitting haptic feedback.
Unlike other size-illusions based on virtual reality, HandMorph achieves this in the user's real environment, preserving the user's physical and social contexts. As such, our device can be integrated into the user's workflow, e.g., to allow product designers to momentarily change their grasping range into that of a child while evaluating a toy prototype.
In our first user study, we found that participants perceived objects as larger when wearing HandMorph, which suggests that their size perception was successfully transformed. In our second user study, we assessed the experience of using HandMorph in designing a simple toy trumpet for children. We found that participants felt more confident in their toy design when using HandMorph to validate its ergonomics.
Many features of materials can be experienced through tactile cues, even using one's feet. For example, one can easily distinguish between moss and stone without looking at the ground. However, this type of material experience is largely not supported in AR and VR applications. We present bARefoot, a prototype shoe providing tactile impulses tightly coupled to motor actions. This enables generating virtual material experiences such as compliance, elasticity, or friction. To explore the parameter space of such sensorimotor coupled vibrations, we present a design tool enabling rapid design of virtual materials. We report initial explorations to increase understanding of how parameters can be optimized for generating compliance, and to examine the effect of dynamic parameters on material experiences. Finally, we present a series of use cases that demonstrate the potential of bARefoot for VR and AR.
We present Omni, a self-contained 3D haptic feedback system that is capable of sensing and actuating an untethered, passive tool containing only a small embedded permanent magnet. Omni enriches AR, VR and desktop applications by providing an active haptic experience using a simple apparatus centered around an electromagnetic base. The spatial haptic capabilities of Omni are enabled by a novel gradient-based method to reconstruct the 3D position of the permanent magnet in midair using the measurements from eight off-the-shelf Hall sensors that are integrated into the base. Omni's 3 DoF spherical electromagnet simultaneously exerts dynamic and precise radial and tangential forces in a volumetric space around the device. Since our system is fully integrated, contains no moving parts and requires no external tracking, it is easy and affordable to fabricate. We describe Omni's hardware implementation, our 3D reconstruction algorithm, and evaluate the tracking and actuation performance in depth. Finally, we demonstrate its capabilities via a set of interactive usage scenarios.
Live programming is a paradigm in which the programmer can visualize the runtime values of the program each time the program changes. The promise of live programming depends on using test cases to run the program and thereby provide these runtime values. In this paper we show that in some situations test cases are insufficient in a fundamental way, in that there are no test inputs that can drive certain incomplete loops to produce useful data, a problem we call the loop-datavoid problem. The problem stems from the fact that useful data inside the loop might only be produced after the loop has been fully written. To solve this problem, we propose a paradigm called Focused Live Programming with Loop Seeds, in which the programmer provides hypothetical values to start a loop iteration, and then the programming environment focuses the live visualization on this hypothetical loop iteration. We introduce the loop-datavoid problem, present our proposed solution, explain it in detail, and then present the results of a user study.
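The loop-seed idea can be sketched in miniature: rather than waiting for input that drives the incomplete loop, the environment runs the partial loop body on programmer-supplied hypothetical values. This toy models the concept only, not the paper's environment:

```python
def visualize_iteration(loop_body, seed_locals):
    """Run one loop-body function on hypothetical locals and return them."""
    env = dict(seed_locals)      # copy so the seed itself is untouched
    loop_body(env)
    return env

# The programmer is mid-edit: only this loop body exists so far, and no
# real test input yet drives the loop with meaningful data.
def body(env):
    env["total"] = env["total"] + env["item"] * env["price"]

# Seed one hypothetical iteration instead of waiting for complete code.
snapshot = visualize_iteration(body, {"total": 10.0, "item": 2, "price": 3.5})
print(snapshot["total"])   # 17.0
```

The environment can then focus its live visualization on this hypothetical iteration, showing the programmer concrete in-loop values before the loop is complete.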
Live programming is a paradigm in which the programming environment continually displays runtime values. Program synthesis is a technique that can generate programs or program snippets from examples. Previous works that combine the two have taken a holistic approach to the way examples describe the behavior of functions and programs. This paper presents a new programming paradigm called Small-Step Live Programming by Example that lets the user apply Programming by Example locally. When using Small-Step Live Programming by Example, programmers can change the runtime values displayed by the live visualization to generate local program snippets. We implemented this new paradigm in a tool and performed a user study with 13 programmers. Our study finds that Small-Step Live Programming by Example helps users solve harder problems faster, and that for certain types of queries, users prefer it to searching the web. Additionally, we identify the user-synthesis gap, in which users' mental models of the tool do not match its abilities, and which needs to be taken into account in the design of future synthesis tools.
Programming-by-example (PBE) has become an increasingly popular component in software development tools, human-robot interaction, and end-user programming. A long-standing challenge in PBE is the inherent ambiguity in user-provided examples. This paper presents an interaction model to disambiguate user intent and reduce the cognitive load of understanding and validating synthesized programs. Our model provides two types of augmentations to user-given examples: 1) semantic augmentation, where a user can specify how different aspects of an example should be treated by a synthesizer via lightweight annotations, and 2) data augmentation, where the synthesizer generates additional examples to help the user understand and validate synthesized programs. We implement and demonstrate this interaction model in the domain of regular expressions, which is a popular mechanism for text processing and data wrangling and is often considered hard to master even for experienced programmers. A within-subjects user study with twelve participants shows that, compared with only inspecting and annotating synthesized programs, interacting with augmented examples significantly increases the success rate of finishing a programming task in less time and increases users' confidence in synthesized programs.
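Semantic augmentation can be sketched as mapping annotated spans of an example to regex fragments. The annotation vocabulary and `synthesize` function below are invented for illustration and are not the paper's system:

```python
import re

# Invented annotation vocabulary: map each span's label to a regex fragment.
FRAGMENTS = {"digits": r"\d+", "letters": r"[A-Za-z]+", "literal": None}

def synthesize(annotated_spans):
    """annotated_spans: ordered (text, label) pairs from the user's example."""
    parts = []
    for text, label in annotated_spans:
        frag = FRAGMENTS[label]
        parts.append(re.escape(text) if frag is None else frag)
    return "".join(parts)

# "INV-" must match literally; "2043" stands for any run of digits.
pattern = synthesize([("INV-", "literal"), ("2043", "digits")])
print(bool(re.fullmatch(pattern, "INV-977")))   # True
```

The annotation resolves the ambiguity that a bare example leaves open: without it, a synthesizer cannot tell whether "2043" means that exact string or any number.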
We present Capacitivo, a contact-based object recognition technique developed for interactive fabrics, using capacitive sensing. Unlike prior work that has focused on metallic objects, our technique recognizes non-metallic objects such as food, different types of fruits, liquids, and other types of objects that are often found around a home or in a workplace. To demonstrate our technique, we created a prototype composed of a 12 x 12 grid of electrodes, made from conductive fabric attached to a textile substrate. We designed the size and separation between the electrodes to maximize the sensing area and sensitivity. We then used a 10-person study to evaluate the performance of our sensing technique using 20 different objects, which yielded a 94.5% accuracy rate. We conclude this work by presenting several different application scenarios to demonstrate unique interactions that are enabled by our technique on fabrics.
ZebraSense is a novel dual-sided woven touch sensor that can recognize and differentiate interactions on the top and bottom surfaces of the sensor. ZebraSense is based on an industrial multi-layer textile weaving technique, yet it enables a novel capacitive sensing paradigm, where each sensing element contributes to touch detection on both surfaces of the sensor simultaneously. Unlike the common "sensor sandwich" approach used in previous work, ZebraSense inherently minimizes the number of sensing elements, which drastically simplifies both sensor construction and its integration into soft goods, while preserving maximum sensor resolution. The experimental evaluation confirmed the validity of our approach and demonstrated that ZebraSense is a reliable, efficient, and accurate solution for detecting user gestures in various dual-sided interaction scenarios, allowing for new use cases in smart apparel, home decoration, toys, and other textile objects.
We present Sonoflex, a thin-form, embroidered dynamic speaker made without a permanent magnet. Our design consists of two flat spiral coils, stacked on top of each other, and is based on an insulated, thin (0.15 mm) enameled copper wire. Our approach allows for thin, lightweight, textile speakers and does not require the high voltages used by electrostatic speakers. We show how the speaker can be designed and fabricated and evaluate its acoustic properties as a function of manufacturing parameters (size, turn count, turn spacing, and substrate materials). Our experiments show that an embroidered speaker with a diameter of 50 mm produces audible sound across a broad frequency range (1.5 kHz - 20 kHz). We conclude the paper by presenting several applications such as audible notifications and near-ultrasound communication.
We propose a novel text decoding method that enables touch typing on an uninstrumented flat surface. Rather than relying on physical keyboards or capacitive touch, our method takes as input hand motion of the typist, obtained through hand-tracking, and decodes this motion directly into text. We use a temporal convolutional network to represent a motion model that maps the hand motion, represented as a sequence of hand pose features, into text characters. To enable touch typing without the haptic feedback of a physical keyboard, we had to address more erratic typing motion due to drift of the fingers. Thus, we incorporate a language model as a text prior and use beam search to efficiently combine our motion and language models to decode text from erratic or ambiguous hand motion. We collected a dataset of 20 touch typists and evaluated our model against several baselines, including contact-based text decoding and typing on a physical keyboard. Our proposed method is able to leverage continuous hand pose information to decode text more accurately than contact-based methods, and an offline study shows parity (73 WPM, 2.38% UER) with typing on a physical keyboard. Our results show that hand-tracking has the potential to enable rapid text entry in mobile environments.
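The combination of motion and language models via beam search can be sketched in a few lines. The per-step character probabilities and bigram prior below are mocked for illustration, standing in for the learned models:

```python
import math

def beam_search(motion_probs, lm, beam_width=3, lm_weight=0.5):
    """motion_probs: per-step {char: probability}; lm(prev, ch) -> probability."""
    beams = [("", 0.0)]                      # (decoded text, log-score)
    for step in motion_probs:
        candidates = []
        for text, score in beams:
            prev = text[-1] if text else "^"
            for ch, p in step.items():
                s = score + math.log(p) + lm_weight * math.log(lm(prev, ch))
                candidates.append((text + ch, s))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]

# The motion model is unsure whether the second tap was 'a' or 's';
# the (mocked) bigram prior pulls the decode toward "cat".
steps = [{"c": 0.9, "v": 0.1}, {"a": 0.4, "s": 0.6}, {"t": 0.95, "r": 0.05}]
bigrams = {("c", "a"): 0.5, ("c", "s"): 0.05}
decoded = beam_search(steps, lambda prev, ch: bigrams.get((prev, ch), 0.2))
print(decoded)   # cat
```

Even though the motion model alone would decode "cst", the language prior outweighs the small per-character difference, which is how drift-induced ambiguity is resolved.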
Existing augmented reality (AR) applications often ignore the occlusion between real hands and virtual objects when incorporating virtual objects in the user's view. The challenge stems from the lack of accurate depth information and the mismatch between real and virtual depths. This paper presents GrabAR, a new approach that directly predicts the real-and-virtual occlusion and bypasses depth acquisition and inference. Our goal is to enhance AR applications with interactions between the hand (real) and grabbable objects (virtual). With paired images of hand and object as inputs, we formulate a compact deep neural network that learns to generate the occlusion mask. To train the network, we compile a large dataset that includes both synthetic and real data. We then embed the trained network in a prototype AR system to support real-time grabbing of virtual objects. Further, we demonstrate the performance of our method on various virtual objects, compare our method with others through two user studies, and showcase a rich variety of interaction scenarios in which we can use the bare hand to grab virtual objects and directly manipulate them.
Virtual Reality allows users to embody avatars that do not match their real bodies. Earlier work has selected changes to the avatar arbitrarily and it therefore remains unclear how to change avatars to improve users' performance. We propose a systematic approach for iteratively adapting the avatar to perform better for a given task based on users' performance. The approach is evaluated in a target selection task, where the forearms of the avatar are scaled to improve performance. A comparison between the optimised and real arm lengths shows a significant reduction in average tapping time by 18.7%, for forearms multiplied in length by 5.6. Additionally, with the adapted avatar, participants moved their real body and arms significantly less, and subjective measures show reduced physical demand and frustration. In a second study, we modify finger lengths for a linear tapping task to achieve a better performing avatar, which demonstrates the generalisability of the approach.
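The iterative adaptation loop can be sketched as a simple hill climb over the forearm scale. The cost function here is an invented stand-in for users' measured tapping times, which is what the study actually optimizes against:

```python
def adapt_scale(measure_time, scale=1.0, step=0.5, iterations=20):
    """Greedy hill climb: adjust the scale while doing so reduces task time."""
    best = measure_time(scale)
    for _ in range(iterations):
        for delta in (step, -step):
            candidate = scale + delta
            if candidate > 0 and measure_time(candidate) < best:
                scale, best = candidate, measure_time(candidate)
                break
        else:
            step /= 2          # no improvement in either direction: refine

    return scale

# Stand-in cost: pretend tapping time is minimized at a 5.6x forearm scale.
mock_time = lambda s: abs(s - 5.6)
optimal = adapt_scale(mock_time)   # converges near 5.6
```

In practice each `measure_time` call corresponds to a block of user trials with the candidate avatar, so the loop must converge in few evaluations.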
Despite the ubiquity of direct manipulation techniques available in computer-aided design applications, creating digital content remains a tedious and indirect task. This is because applications require users to perform numerous low-level editing operations rather than allowing them to directly indicate high-level design goals. Yet, the creation of graphic content, such as videos, animations, and presentations, often begins with a description of design goals in natural language, such as screenplays, scripts, and outlines. Therefore, there is an opportunity for language-oriented authoring, i.e., leveraging the information found in the structure of a language to facilitate the creation of graphic content. We present a systematic exploration of the identification of, graphic description of, and interaction with various linguistic structures to assist in the creation of visual content. The prototype system, Crosspower, and its proposed interaction techniques enable content creators to indicate and customize their desired visual content in a flexible and direct manner.
Audio travel podcasts are a valuable source of information for travelers. Yet, travel is, in many ways, a visual experience and the lack of visuals in travel podcasts can make it difficult for listeners to fully understand the places being discussed. We present Crosscast: a system for automatically adding visuals to audio travel podcasts. Given an audio travel podcast as input, Crosscast uses natural language processing and text mining to identify geographic locations and descriptive keywords within the podcast transcript. Crosscast then uses these locations and keywords to automatically select relevant photos from online repositories and synchronizes their display to align with the audio narration. In a user evaluation, we find that 85.7% of the participants preferred Crosscast generated audio-visual travel podcasts compared to audio-only travel podcasts.
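The location-and-keyword extraction step can be sketched with a toy gazetteer match. The word lists below are invented; the actual system uses full natural language processing and text mining:

```python
# Tiny stand-ins for a real gazetteer and a descriptive-keyword lexicon.
GAZETTEER = {"kyoto", "lisbon", "patagonia"}
DESCRIPTIVE = {"temple", "tram", "glacier", "coastal", "historic"}

def extract_queries(transcript, window=5):
    """Find place names and collect descriptive words within a token window."""
    tokens = transcript.lower().replace(",", " ").replace(".", " ").split()
    queries = []
    for i, tok in enumerate(tokens):
        if tok in GAZETTEER:
            nearby = [t for t in tokens[max(0, i - window): i + window + 1]
                      if t in DESCRIPTIVE]
            queries.append((tok, nearby))
    return queries

print(extract_queries("We wandered historic Kyoto, stopping at a quiet temple."))
```

Each (location, keywords) pair then serves as a photo-search query, and the display of the retrieved photos is synchronized to the narration's timestamps.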
Audio descriptions make videos accessible to those who cannot see them by describing visual content in audio. Producing audio descriptions is challenging because the description is synchronous: it must fit into the gaps between other video content. An experienced audio description author fits the narration necessary to understand, enjoy, or experience the video content into the time available, which can be especially tricky for novices to do well. In this paper, we introduce a tool, Rescribe, that helps authors create and refine their audio descriptions. Using Rescribe, authors first create a draft of all the content they would like to include in the audio description. Rescribe then uses a dynamic programming approach to optimize across the length of the audio description, available automatic approaches for shortening it, and approaches for lengthening the source track. Authors can iteratively visualize and refine the audio descriptions produced by Rescribe, working in concert with the tool. We evaluate the effectiveness of Rescribe through interviews with blind and visually impaired audio description users who gave feedback on Rescribe results. In addition, we invited novice users to create audio descriptions with Rescribe and another tool, finding that users produce audio descriptions with fewer placement errors using Rescribe.
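The optimization can be illustrated with a small dynamic program. This is our reconstruction from the abstract, not Rescribe's exact formulation: ordered description segments are placed into ordered silent gaps, each segment offering several variants (e.g., original and automatically shortened) with an associated cost.

```python
# Minimal DP sketch (our reconstruction, not Rescribe's exact formulation):
# place ordered description segments into ordered gaps. Each segment has
# several (duration, cost) variants; a variant fits a gap only if it is no
# longer than the gap. The DP minimizes total cost over all placements.
import math

def place_descriptions(segments, gaps):
    # segments: list of lists of (duration, cost) variants; gaps: durations
    n, m = len(segments), len(gaps)
    INF = math.inf
    dp = [[INF] * (m + 1) for _ in range(n + 1)]
    dp[0] = [0.0] * (m + 1)  # zero segments cost nothing
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = dp[i][j - 1]  # option: skip gap j entirely
            for dur, cost in segments[i - 1]:
                if dur <= gaps[j - 1] and dp[i - 1][j - 1] + cost < dp[i][j]:
                    dp[i][j] = dp[i - 1][j - 1] + cost
    return dp[n][m]  # math.inf if no feasible placement exists
```

A real implementation would also record back-pointers to recover which variant went into which gap, and could add source-track lengthening as extra variants.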
Keyframe-based sculpting provides unprecedented freedom to author animated organic models, which can be difficult to create with other methods such as simulation, scripting, and rigging. However, sculpting animated objects can require significant artistic skill and manual labor, even more so than sculpting static 3D shapes or drawing 2D animations, which are already quite challenging.
We present a keyframe-based animated sculpting system with the capability to autocomplete user editing under a simple and intuitive brushing interface. Similar to current desktop sculpting and VR brushing tools, users can brush surface details and volume structures. Meanwhile, our system analyzes their workflows and predicts what they might do in the future, both spatially and temporally. Users can accept or ignore these suggestions and thus maintain full control. We propose the first interactive suggestive keyframe sculpting system, specifically for spatio-temporal repetitive tasks, including low-level spatial details and high-level brushing structures across multiple frames. Our key ideas include a deformation-based optimization framework to analyze recorded workflows and synthesize predictions, and a semi-causal global similarity measurement to support flexible brushing stroke sequences and complex shape changes. Our system supports a variety of shape and motion styles, including those difficult to achieve via existing animation systems, such as topological changes that cannot be accomplished via simple rig-based deformations and stylized physically-implausible motions that cannot be simulated. We evaluate the system via a pilot user study that demonstrates its effectiveness.
Posing expressive 3D faces is extremely challenging. Typical facial rigs have upwards of 30 controllable parameters that, while anatomically meaningful, are hard to use due to redundancy of expression, unrealistic configurations, and many semantic and stylistic correlations between the parameters. We propose a novel interface for rapid exploration and refinement of static facial expressions, based on a data-driven face manifold of natural expressions. Rapidly explored face configurations are interactively projected onto this manifold of meaningful expressions. These expressions can then be refined using a 2D embedding of nearby faces, both on and off the manifold. Our validation is fourfold: we show expressive face creation using various devices; we verify that our learnt manifold transcends its training face, to expressively control very different faces; we perform a crowd-sourced study to evaluate the quality of manifold face expressions; and we report on a usability study that shows our approach is an effective interactive tool to author facial expressions.
Augmenting human action videos with visual effects often requires professional tools and skills. To make this more accessible to novice users, existing attempts have focused on automatically adding visual effects to faces and hands, or on letting virtual objects strictly track certain body parts, resulting in rigid-looking effects. We present PoseTween, an interactive system that allows novice users to easily add vivid virtual objects whose movement interacts with a moving subject in an input video. Our key idea is to leverage the motion of the subject to create pose-driven tween animations of virtual objects. With our tool, a user only needs to edit the properties of a virtual object with respect to the subject's movement at keyframes, and the object is associated with certain body parts automatically. The properties of the object at intermediate frames are then determined by both the body movement and the interpolated object keyframe properties, producing natural object movements and interactions with the subject. We design a user interface to facilitate editing of keyframes and previewing animation results. Our user study shows that, for novice users making pose-driven tween animations, PoseTween requires significantly less editing time and fewer keyframes than traditional tween animation.
This work presents Slice of Light, a visualization design created to enhance transparency and integrative transitions between the realities of Head-Mounted Display (HMD) users sharing the same physical environment. Targeted at reality-guests, Slice of Light's design enables guests to view other HMD users' interactions contextualized in their own virtual environments while allowing the guests to navigate among these virtual environments. In this paper, we detail our visualization design and its implementation. We demonstrate Slice of Light with a block-world construction scenario that involves a multi-HMD-user environment. VR developer and HCI expert participants were recruited to evaluate the scenario and responded positively to Slice of Light. We discuss their feedback, our design insights, and the limitations of this work.
Virtual reality has recently been adopted for use within the domain of visual analytics because it can provide users with an endless workspace within which they can be actively engaged and use their spatial reasoning skills for data analysis. However, virtual worlds need to utilize layouts and organizational schemes that are meaningful to the user and beneficial for data analysis. This paper presents DataHop, a novel visualization system that enables users to lay out their data analysis steps in a virtual environment. With a Filter, a user can specify the modification they wish to perform on one or more input data panels (i.e., containers of points), along with where output data panels should be placed in the virtual environment. Using this simple tool, highly intricate and useful visualizations may be generated and traversed by harnessing a user's spatial abilities. An exploratory study conducted with six virtual reality users evaluated the usability, affordances, and performance of DataHop for data analysis tasks, and found that spatially mapping one's workflow can be beneficial when exploring multidimensional datasets.
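The Filter concept can be sketched in a few lines. This is a toy with a hypothetical data model, not DataHop's API: a filter consumes input data panels, applies a modification, and emits an output panel at a user-chosen position in the virtual space.

```python
# Toy sketch of DataHop's Filter concept (hypothetical data model, not the
# system's API): a filter consumes input data panels, applies a modification,
# and emits an output panel at a user-chosen position in the virtual space.
def apply_filter(input_panels, predicate, output_position):
    points = [p for panel in input_panels
              for p in panel["points"] if predicate(p)]
    return {"position": output_position, "points": points}
```

Chaining such filters, with each output panel placed at a new spatial location, is what lets users walk through their analysis history in the environment.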
Mobile devices with passive depth sensing capabilities are ubiquitous, and recently active depth sensors have become available on some tablets and AR/VR devices. Although real-time depth data is accessible, its rich value to mainstream AR applications has been sorely under-explored. Adoption of depth-based UX has been impeded by the complexity of performing even simple operations with raw depth data, such as detecting intersections or constructing meshes. In this paper, we introduce DepthLab, a software library that encapsulates a variety of depth-based UI/UX paradigms, including geometry-aware rendering (occlusion, shadows), surface interaction behaviors (physics-based collisions, avatar path planning), and visual effects (relighting, 3D-anchored focus and aperture effects). We break down the usage of depth into localized depth, surface depth, and dense depth, and describe our real-time algorithms for interaction and rendering tasks. We present the design process, system, and components of DepthLab to streamline and centralize the development of interactive depth features. We have open-sourced our software at https://github.com/googlesamples/arcore-depth-lab to external developers, conducted a performance evaluation, and discussed how DepthLab can accelerate the workflow of mobile AR designers and developers. With DepthLab we aim to help mobile developers effortlessly integrate depth into their AR experiences and amplify the expression of their creative vision.
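Geometry-aware occlusion, one of the depth-based behaviors DepthLab encapsulates, reduces to a per-pixel depth comparison. The sketch below is a simplification and not the library's actual API:

```python
# Illustrative sketch of geometry-aware occlusion (simplified; not DepthLab's
# actual API): a virtual fragment is hidden when the real-world surface at
# its pixel is closer to the camera than the virtual object's depth.
def occluded(depth_map, x, y, virtual_depth_m, eps=0.02):
    real_depth_m = depth_map[y][x]  # metric depth from the sensor
    return real_depth_m + eps < virtual_depth_m  # eps absorbs sensor noise
```

A renderer would evaluate this test per fragment (typically in a shader) to clip virtual content behind real-world geometry.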
We propose a Servo-Gaussian model to predict success rates in continuous manual tracking tasks. Two tasks were conducted to validate this model: path steering and pursuit of a 1D moving target. We hypothesized that (1) hand movements follow the servo-mechanism model, (2) submovement endpoints form a bivariate Gaussian distribution, thus enabling us to predict the success rate at which a submovement endpoint falls inside the tolerance, and (3) the success rate for a whole trial can be predicted if the number of submovements is known. The cross-validation showed R^2>0.92 and MAE<4.9% for steering and R^2>0.95 and MAE<6.5% for pursuit tasks. These results demonstrate that our proposed model delivers high prediction accuracy even for unknown datasets.
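The model's core computation can be sketched as follows, reduced to one axis for clarity (the paper uses a bivariate Gaussian): if submovement endpoints are normally distributed about the target, the probability that one endpoint lands within a tolerance of width W follows from the normal CDF, and a trial succeeds only if every submovement stays inside.

```python
# One-dimensional sketch of the Servo-Gaussian success-rate computation
# (our reading of the abstract; the paper's model is bivariate).
import math

def endpoint_success(sigma, W):
    # P(|x| <= W/2) for endpoint deviation x ~ N(0, sigma^2)
    return math.erf((W / 2) / (sigma * math.sqrt(2)))

def trial_success(sigma, W, n_submovements):
    # a trial succeeds only if all n submovement endpoints stay in tolerance
    return endpoint_success(sigma, W) ** n_submovements
```

With sigma estimated from the servo-mechanism model of hand movement, the two functions together predict the success rate for a whole steering or pursuit trial.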
Modeling touch pointing is essential to touchscreen interface development and research, as pointing is one of the most basic and common touch actions users perform on touchscreen devices. Finger-Fitts law revised the conventional Fitts' law into a 1D (one-dimensional) pointing model for finger touch by explicitly accounting for the fat-finger ambiguity (absolute error) problem, which was unaccounted for in the original Fitts' law. We generalize Finger-Fitts law to 2D touch pointing by solving two critical problems. First, we extend two of the most successful 2D Fitts law forms to accommodate finger ambiguity. Second, we discovered that using nominal target width and height is a conceptually simple yet effective approach for defining amplitude and directional constraints for 2D touch pointing across different movement directions. The evaluation shows our derived 2D Finger-Fitts law models can be both principled and powerful. Specifically, they outperformed the existing 2D Fitts' laws, as measured by the regression coefficient and by model selection information criteria (e.g., the Akaike Information Criterion) that account for the number of parameters. Finally, the 2D Finger-Fitts laws also advance our understanding of touch pointing and thereby serve as the basis for touch interface designs.
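The flavor of such a model can be conveyed with a Fitts-style sketch. This is an illustrative form only, NOT the paper's derived 2D Finger-Fitts models: nominal width W and height H are each discounted by an absolute touch-ambiguity term, and the constants would be fit from pointing data.

```python
# Illustrative Fitts-style sketch in the spirit of the described models (NOT
# the paper's derived 2D Finger-Fitts forms): nominal width W and height H
# are each discounted by an absolute touch-ambiguity term sigma_a; the
# constants a, b, and sigma_a would be fit from empirical pointing data.
import math

def finger_fitts_mt(A, W, H, a, b, sigma_a):
    We = math.sqrt(max(W * W - sigma_a * sigma_a, 1e-9))
    He = math.sqrt(max(H * H - sigma_a * sigma_a, 1e-9))
    ID = math.log2(A / min(We, He) + 1)  # harder of the two constraints
    return a + b * ID  # predicted movement time
```

The key qualitative behavior is that as target size approaches the finger's absolute touch noise, the effective size collapses and predicted movement time grows sharply.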
Jitter in interactive systems occurs when visual feedback is perceived as unstable or trembling even though the input signal is smooth or stationary. It can have multiple causes, such as sensing noise or feedback calculations that introduce or exacerbate sensing imprecisions. Jitter can however occur even when each individual component of the pipeline works perfectly, as a result of the differences between the input frequency and the display refresh rate. This asynchronicity can introduce rapidly-shifting latencies between the rendered feedback and its display on screen, which can result in trembling cursors or viewports. This paper contributes a better understanding of this particular type of jitter. We first detail the problem from a mathematical standpoint, from which we develop a predictive model of jitter amplitude as a function of input and output frequencies, and a new metric to measure this spatial jitter. Using touch input data gathered in a study, we developed a simulator to validate this model and to assess the effects of different techniques and settings with any output frequency. The most promising approach, when the time of the next display refresh is known, is to estimate (interpolate or extrapolate) the user's position at a fixed time interval before that refresh. When input events occur at 125 Hz, as is common in touch screens, we show that an interval of 4 to 6 ms works well for a wide range of display refresh rates. This method effectively cancels most of the jitter introduced by input/output asynchronicity, while introducing minimal imprecision or latency.
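The recommended mitigation can be sketched as follows (our reading of the abstract): instead of drawing the latest raw input event, estimate the pointer position at a fixed interval before the next display refresh by interpolating or extrapolating over the two most recent input events.

```python
# Sketch of the recommended mitigation (as we read it): estimate the pointer
# position at a fixed interval `delta` before the next display refresh, by
# linear interpolation/extrapolation over the two most recent input events.
def estimate_position(events, refresh_time, delta=0.005):
    # events: list of (timestamp_s, position), oldest first
    (t0, p0), (t1, p1) = events[-2], events[-1]
    if t1 == t0:
        return p1
    target = refresh_time - delta
    alpha = (target - t0) / (t1 - t0)  # >1 extrapolates, <1 interpolates
    return p0 + alpha * (p1 - p0)
```

Because the estimation time is at a fixed offset from each refresh, the effective latency no longer shifts from frame to frame, which is what removes the trembling.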
We introduce HERMITS, a modular interaction architecture for self-propelled Tangible User Interfaces (TUIs) that incorporates physical add-ons, referred to as mechanical shells. The mechanical shell add-ons are intended to be dynamically reconfigured by utilizing the locomotion capability of self-propelled TUIs (e.g., wheeled TUIs, swarm UIs). We developed a proof-of-concept system that demonstrates this novel architecture using two-wheeled robots and a variety of mechanical shell examples. These mechanical shell add-ons are passive physical attachments that extend the primitive interactivities (e.g., shape, motion and light) of the self-propelled robots.
The paper proposes the architectural design and interactive functionality of HERMITS, as well as design primitives for mechanical shells. The paper also introduces the prototype implementation, which is based on an off-the-shelf robotic toy with a modified docking mechanism. A range of applications is demonstrated with the prototype to motivate the collective and dynamically reconfigurable capability of the modular architecture, such as an interactive mobility simulation, an adaptive home/desk environment, and a story-telling narrative. Lastly, we discuss future research opportunities for HERMITS to enrich the interactivity and adaptability of actuated and shape-changing TUIs.
Reconfiguring shapes of objects enables transforming existing passive objects with robotic functionalities, e.g., a transformable coffee cup holder can be attached to a chair's armrest, a piggy bank can reach out an arm to 'steal' coins. Despite the advance in end-user 3D design and fabrication, it remains challenging for non-experts to create such 'transformables' using existing tools due to the requirement of specific engineering knowledge such as mechanisms and robotic design.
We present Romeo -- a design tool for creating transformables to robotically augment objects' default functionalities. Romeo allows users to transform an object into a robotic arm by expressing at a high level what type of task is expected. Users can select which part of the object to be transformed, specify motion points in space for the transformed part to follow, and specify the corresponding action to be taken. Romeo then automatically generates a robotic arm embedded in the transformable part, ready for fabrication. A design session validated this tool, in which participants used Romeo to accomplish controlled design tasks and to open-endedly create coin-stealing piggy banks by transforming 3D objects of their own choice.
Despite the recent growth in popularity of personal mobility devices (e.g., e-scooters and e-skateboards), they still suffer from limited safety and narrow design form factors, due to their rigid structures. On the other hand, inflatable interfaces studied in human-computer interaction can achieve large volume change by simple inflation/deflation. Inflatable structure also offers soft and safe interaction owing to material compliance and diverse fabrication methods that lead to a wide range of forms and aesthetics. In this paper, we propose poimo, a new family of POrtable and Inflatable MObility devices, which consists of inflatable frames, inflatable wheels, and inflatable steering mechanisms made of a mass-manufacturable material called drop-stitch fabric. First, we defined the basic material properties of a drop-stitch inflatable structure that is sufficiently strong to carry a person while simultaneously allowing soft deformation and deflation for storage and portability. We then implemented an interactive design system that can scan the user's desired riding posture to generate a customized personal mobility device and can add the user's shape and color preferences. To demonstrate the custom-design capability and mobility, we designed several 3D models using our system and built physical samples for two basic templates: a motorcycle and a wheelchair. Finally, we conducted an online user study to examine the usability of the design system and share lessons learned for further improvements in the design and fabrication of poimo.
Physical buttons provide clear haptic feedback when pressed and released, but their responses are unvarying. Physical buttons can be powered by force actuators to produce unlimited click sensations, but the cost is substantial. An alternative is augmenting physical buttons with simple and inexpensive vibration actuators. When pushed, an augmented button generates a vibration overlaid on the button's original kinesthetic response, under the general framework of haptic augmented reality. We explore the design space of augmented buttons while changing vibration frequency, amplitude, duration, and envelope. We then visualize the perceptual structure of augmented buttons by estimating a perceptual space for 7 physical buttons and 40 augmented buttons. Their sensations are also assessed against adjectives, and the results are mapped into the perceptual space to identify meaningful perceptual dimensions. Our results contribute to understanding the benefits and limitations of programmable vibration-augmented physical buttons, with emphasis on how they feel.
Force feedback has not been fully explored in modern gaming environments where a gamepad is the main interface. We developed various game interaction scenarios where force feedback through the thumbstick of the gamepad can be effective, and categorized them into five themes. We built a haptic device and control system that can support all presented interactions. The resulting device, FS-Pad, has sufficient fidelity to be used as a haptic game interaction design tool. To verify the presented interactions and effectiveness of the FS-Pad, we conducted a user study with game players, developers, and designers. The subjects used an FS-Pad while playing a demo game and were then interviewed. Their feedback revealed the actual needs for the presented interactions as well as insight into the potential design of game interactions when applying FS-Pad.
We introduce an optimal control method for electromagnetic haptic guidance systems. Our real-time approach assists users in pen-based tasks such as drawing, sketching or designing. The key to our control method is that it guides users, yet does not take away agency. Existing approaches force the stylus to a continuously advancing setpoint on a target trajectory, leading to undesirable behavior such as loss of haptic guidance or unintended snapping. Our control approach, in contrast, gently pulls users towards the target trajectory, allowing them to always easily override the system to adapt their input spontaneously and draw at their own speed. To achieve this flexible guidance, our optimization iteratively predicts the motion of an input device such as a pen, and adjusts the position and strength of an underlying dynamic electromagnetic actuator accordingly. To enable real-time computation, we additionally introduce a novel and fast approximate model of an electromagnet. We demonstrate the applicability of our approach by implementing it on a prototypical hardware platform based on an electromagnet moving on a bi-axial linear stage, as well as a set of applications. Experimental results show that our approach is more accurate and preferred by users compared to open-loop and time-dependent closed-loop approaches.
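A much-simplified sketch conveys the guidance idea; the paper solves a real optimization against an electromagnet model, whereas the gains and geometry here are illustrative: predict the pen's next position, find the closest point on the target trajectory, and command the magnet toward it with bounded strength so the user can always overpower the guidance.

```python
# Much-simplified sketch of the guidance loop (illustrative gains; the paper
# uses an iterative optimization with a fast electromagnet model): predict
# the pen's next position, find the closest trajectory point, and pull
# toward it with bounded strength so the user keeps agency.
def control_step(pen_pos, pen_vel, trajectory, dt=0.01, gain=0.8, max_strength=1.0):
    pred = (pen_pos[0] + pen_vel[0] * dt, pen_pos[1] + pen_vel[1] * dt)
    closest = min(trajectory,
                  key=lambda q: (q[0] - pred[0]) ** 2 + (q[1] - pred[1]) ** 2)
    err = ((closest[0] - pred[0]) ** 2 + (closest[1] - pred[1]) ** 2) ** 0.5
    strength = min(gain * err, max_strength)  # gentle pull, never a hard snap
    return closest, strength  # magnet setpoint and coil strength
```

Pulling toward the *nearest* trajectory point, rather than a continuously advancing setpoint, is what lets users draw at their own speed without losing guidance.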
Natural language interaction has evolved as a useful modality to help users explore and interact with their data during visual analysis. However, little work has been done to explore how autocompletion can help with data discovery while helping users formulate analytical questions. We developed a prototype system as a design probe to better understand the usefulness of autocompletion for visual analysis. We ran three Mechanical Turk studies to evaluate user preferences for various text- and visualization-widget-based autocompletion design variants for helping with partial search queries. Our findings indicate that users found data previews in the suggestions to be useful. Widgets were preferred for previewing temporal, geospatial, and numerical data, while text autocompletion was preferred for categorical and hierarchical data. We conducted an exploratory analysis of our system, which implements this specific subset of preferred autocompletion variants. Our insights regarding the efficacy of these autocompletion suggestions can inform the future design of natural language interfaces supporting visual analysis.
People often seek help online while using complex software. Currently, information search takes users' attention away from the task at hand by creating a separate search task. This paper investigates how multimodal interaction can make in-task help-seeking easier and faster. We introduce ReMap, a multimodal search interface that helps users find video assistance while using desktop and web applications. Users can speak search queries, add application-specific terms deictically (e.g., "how to erase this"), and navigate search results via speech, all without taking their hands (or mouse) off their current task. Thirteen participants who used ReMap in the lab found that it helped them stay focused on their task while simultaneously searching for and using learning videos. Users' experiences with ReMap also raised a number of important challenges with implementing system-wide context-aware multimodal assistance.
Assembling circuits on breadboards using reference designs is a common activity among makers. While tools like Fritzing offer a simplified visualization of how components and wires are connected, such pictorial depictions of circuits are rare in formal educational materials and the vast bulk of online technical documentation. Electronic schematics are more common but are perceived as challenging and confusing by novice makers. To improve access to schematics, we propose SchemaBoard, a system for assisting makers in assembling and inspecting circuits on breadboards from schematic source materials. SchemaBoard uses an LED matrix integrated underneath a working breadboard to visualize via light patterns where and how components should be placed, or to highlight elements of circuit topology such as electrical nets and connected pins. This paper presents a formative study with 16 makers, the SchemaBoard system, and a summative evaluation with an additional 16 users. Results indicate that SchemaBoard is effective in reducing both the time and the number of errors associated with building a circuit based on a reference schematic, and for inspecting the circuit for correctness after its assembly.
Iterative artistic exploration, mechanism building, and interaction programming are essential processes in prototyping interactive kinetic art (IKA). However, scattered tools and interwoven workflows across the digital and physical worlds make the task difficult. We present WIKA, an integrated environment supporting the whole creation process of IKA, in the form of a layered picture frame, within a single workspace. A projected AR system using a mobile device efficiently turns the workspace into an interactive tabletop. Projected information connected with physical components (e.g., sensors and motors) enables programming and simulation directly on the workspace. Physical components are incorporated from the initial phase of prototyping using an AR plate, which bridges the workflow and supports the iterative trial-and-error process. A user study shows that WIKA enabled non-experts to create diverse IKA from their own ideas. Tangible interaction and projected information enable rapid, iterative creation. The method of integrating hardware and software in the physical environment can be applied to other prototyping tools that support the creation of interactive and kinetic elements.
We present a system for generating and visualizing interactive 3D Augmented Reality tutorials based on 2D video input, which allows viewpoint control at runtime. Inspired by assembly planning, we analyze the input video using a 3D CAD model of the object to determine an assembly graph that encodes blocking relationships between parts. Using an assembly graph enables us to detect assembly steps that are otherwise difficult to extract from the video, and generally improves object detection and tracking by providing prior knowledge about movable parts. To avoid information loss, we combine the 3D animation with relevant parts of the 2D video so that we can show detailed manipulations and tool usage that cannot be easily extracted from the video. To further support user orientation, we visually align the 3D animation with the real-world object by using texture information from the input video. We developed a presentation system that uses commonly available hardware to make our results accessible for home use and demonstrate the effectiveness of our approach by comparing it to traditional video tutorials.
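The assembly-graph idea maps naturally onto a topological sort: blocking relationships ("u must be placed before v") form a directed graph, and any topological order is a feasible assembly sequence. The part names below are hypothetical; the paper derives the edges from a 3D CAD model and the video.

```python
# Illustrative reading of the assembly-graph idea: blocking relationships
# form a directed graph; any topological order is a feasible assembly
# sequence (hypothetical part names; the paper derives edges from CAD data).
from collections import deque

def assembly_order(parts, blocks):
    indeg = {p: 0 for p in parts}
    adj = {p: [] for p in parts}
    for u, v in blocks:  # u must be assembled before v
        adj[u].append(v)
        indeg[v] += 1
    queue = deque(p for p in parts if indeg[p] == 0)
    order = []
    while queue:
        p = queue.popleft()
        order.append(p)
        for q in adj[p]:
            indeg[q] -= 1
            if indeg[q] == 0:
                queue.append(q)
    return order if len(order) == len(parts) else None  # None: cyclic input
```

Knowing which parts *could* move next at each step is also what gives the tracker its prior over movable parts.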
Force feedback is commonly used to enhance realism in virtual reality (VR). However, current works mainly focus on providing different force types or patterns, and do not investigate how the point of application of force (PAF), i.e., where the resultant force is applied, affects users' experience. For example, users perceive resistive force without torque when pulling a virtual bow, but with torque when pulling a virtual slingshot. Therefore, we propose a set of handheld controllers, ElastiLinks, to provide force feedback between controllers with dynamic PAFs. A rotatable track on each controller provides a dynamic PAF, and two common types of force feedback, resistive force and impact, are produced by two links, respectively. We performed a force perception study to ascertain how well users could distinguish resistive and impact force levels between controllers. Based on the results, we conducted another perception study to understand users' ability to distinguish PAF offset and rotation differences. Finally, we performed a VR experience study to show that force feedback with dynamic PAFs enhances the VR experience.
Haptic controllers have an important role in providing rich and immersive Virtual Reality (VR) experiences. While previous works have succeeded in creating handheld devices that simulate dynamic properties of rigid objects, such as weight, shape, and movement, recreating the behavior of flexible objects with different stiffness using ungrounded controllers remains an open challenge. In this paper, we present ElaStick, a variable-stiffness controller that simulates the dynamic response resulting from shaking or swinging flexible virtual objects. This is achieved by dynamically changing the stiffness of four custom elastic tendons along a joint that effectively increase and reduce the overall stiffness of a perceived object in 2-DoF. We show that with the proposed mechanism, we can render stiffness with high precision and granularity in a continuous range between 10.8 and 71.5 N·mm/degree. We estimate the threshold of the human perception of stiffness with a just-noticeable difference (JND) study and investigate the levels of immersion, realism and enjoyment using a VR application.
We present PIVOT, a wrist-worn haptic device that renders virtual objects into the user's hand on demand. Its simple design comprises a single actuated joint that pivots a haptic handle into and out of the user's hand, rendering the haptic sensations of grasping, catching, or throwing an object anywhere in space. Unlike existing hand-held haptic devices and haptic gloves, PIVOT leaves the user's palm free when not in use, allowing users to make unencumbered use of their hand. PIVOT also enables rendering forces acting on the held virtual objects, such as gravity, inertia, or air drag, by actively driving its motor while the user is firmly holding the handle. When users wear a PIVOT device on both hands, the devices can add haptic feedback to bimanual interaction, such as lifting larger objects. In our user study, participants (n=12) evaluated the realism of grabbing and releasing objects of different shapes and sizes with a mean score of 5.19 on a scale from 1 to 7, rated the ability to catch and throw balls in different directions with different velocities (mean=5.5), and verified the ability to render the comparative weight of held objects with 87% accuracy for ~100 g increments.
Brain-computer interfaces (BCIs) are increasingly used to perform simple operations such as moving a cursor, but have remained of limited use for more complex tasks. In our new approach to BCI, we use brain relevance feedback to control a generative adversarial network (GAN). We obtained EEG data from 31 participants who viewed face images while concentrating on particular facial features. An EEG relevance classifier was then trained and its output propagated as feedback on the latent image representation provided by the GAN. Estimates for individual vectors matching the relevant criteria were iteratively updated to optimize the image generation process towards mental targets. A double-blind evaluation showed high performance (86.26% accuracy) against random feedback (18.71%), and not significantly lower than explicit feedback (93.30%). Furthermore, we show the feasibility of the method with simultaneous task targets, demonstrating BCI operation beyond individual task constraints. Thus, brain relevance feedback can validly control a generative model, overcoming a critical limitation of current BCI approaches.
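The feedback loop can be sketched as follows. This is illustrative, not the authors' implementation: the EEG classifier labels each shown image as relevant or not, and the estimate of the mental target moves toward the relevance-weighted mean of the presented GAN latent vectors.

```python
# Illustrative sketch of the relevance-feedback update (not the authors'
# implementation): move the target latent estimate toward the mean of the
# latents whose images the EEG classifier labeled as relevant.
def update_target(latents, relevance, target, lr=0.5):
    # latents: list of latent vectors; relevance: 0/1 classifier outputs
    relevant = [z for z, r in zip(latents, relevance) if r]
    if not relevant:
        return target  # no relevant evidence this round; keep the estimate
    mean = [sum(dim) / len(relevant) for dim in zip(*relevant)]
    return [t + lr * (m - t) for t, m in zip(target, mean)]
```

Iterating this update and regenerating images from the new estimate is what steers the GAN's output toward the mental target.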
Neurophysiological laboratory studies are often constrained to their immediate geographical surroundings, and access to equipment may be temporally restricted. Limitations in the ecological validity, scalability, and generalizability of findings pose a significant challenge for the development of brain-computer interfaces (BCIs), which ultimately need to function in any context, on consumer-grade hardware. We introduce MYND: an open-source framework that couples consumer-grade recording hardware with an easy-to-use application for the unsupervised evaluation of BCI control strategies. Subjects are guided through experiment selection, hardware fitting, recording, and data upload in order to self-administer multi-day studies that include neurophysiological recordings and questionnaires at home. As a use case, thirty subjects evaluated two BCI control strategies, "Positive memories" and "Music imagery", using a four-channel electroencephalogram (EEG) with MYND. Neural activity in the two control strategies could be decoded with average offline accuracies of 68.5% and 64.0% across all days.
Aiming at the creation and development of taste media, this study developed a taste display that can reproduce tastes measured using taste sensors. By performing iontophoresis on five gels, which contain dissolved electrolytes that reproduce the five basic tastes, the quantity of ions that contact the tongue was controlled. A tasteless gel was added so that the sum of the currents flowing in the six gels could be kept constant, ensuring a uniform amount of stimulation on the tongue. The measured tastes could be successfully reproduced through calibration, in which the indicated taste levels were matched with the taste-sensor measurements. Furthermore, video-editing software was adapted to edit taste information alongside recorded audio and video. In addition, effector and equalizer prototypes were built that can not only reproduce the recorded tastes in their original states but also adjust the tastes to match individual preferences.
A major problem in task-oriented conversational agents is the lack of support for the repair of conversational breakdowns. Prior studies have shown that current repair strategies for these kinds of errors are often ineffective due to: (1) the lack of transparency about the state of the system's understanding of the user's utterance; and (2) the system's limited capabilities to understand the user's verbal attempts to repair natural language understanding errors. This paper introduces SOVITE, a new multi-modal speech plus direct manipulation interface that helps users discover, identify the causes of, and recover from conversational breakdowns using the resources of existing mobile app GUIs for grounding. SOVITE displays the system's understanding of user intents using GUI screenshots, allows users to refer to third-party apps and their GUI screens in conversations as inputs for intent disambiguation, and enables users to repair breakdowns using direct manipulation on these screenshots. The results from a remote user study with 10 users using SOVITE in 7 scenarios suggested that SOVITE's approach is usable and effective.
Mobile solutions can help transform speech and sound into visual representations for people who are deaf or hard-of-hearing (DHH). However, where handheld phones present challenges, head-worn displays (HWDs) could facilitate communication through privately transcribed text, hands-free use, improved mobility, and socially acceptable interactions.
Wearable Subtitles is a lightweight 3D-printed proof-of-concept HWD that explores augmenting communication through sound transcription for a full workday. Using a low-power microcontroller architecture, we enable up to 15 hours of continuous use. We describe a large survey (n=501) and three user studies with 24 deaf and hard-of-hearing participants that informed our development and helped us refine our prototypes. Our studies and prior research identify critical challenges for the adoption of HWDs, which we address through extended battery life, a lightweight and balanced mechanical design (54 g), fitting options, and form factors that are compatible with current social norms.
Future homes and offices will feature increasingly dense ecosystems of IoT devices, such as smart lighting, speakers, and domestic appliances. Voice input is a natural candidate for interacting with out-of-reach and often small devices that lack full-sized physical interfaces. However, at present, voice agents generally require wake-words and device names in order to specify the target of a spoken command (e.g., 'Hey Alexa, kitchen lights to full brightness'). In this research, we explore whether speech alone can be used as a directional communication channel, in much the same way visual gaze specifies a focus. Instead of a device's microphones simply receiving and processing spoken commands, we suggest they also infer the Direction of Voice (DoV). Our approach innately enables voice commands with addressability (i.e., devices know if a command was directed at them) in a natural and rapid manner. We quantify the accuracy of our implementation across users, rooms, spoken phrases, and other key factors that affect performance and usability. Taken together, we believe our DoV approach demonstrates feasibility and the promise of making distributed voice interactions much more intuitive and fluid.
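The intuition behind inferring whether speech was aimed at a device can be illustrated with a toy heuristic: speech directed at a microphone retains relatively more high-frequency energy than speech aimed away, because the head attenuates high frequencies off-axis. The sketch below is a crude stand-in, not the paper's model; the band split and threshold are invented values.

```python
import numpy as np

def facing_score(samples, sample_rate=16000, split_hz=2000):
    """Ratio of spectral energy above split_hz to energy below it.

    Higher ratios suggest the talker was facing the microphone.
    """
    spectrum = np.abs(np.fft.rfft(samples)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    high = spectrum[freqs >= split_hz].sum()
    low = spectrum[freqs < split_hz].sum()
    return high / (low + 1e-12)

def is_directed_at_device(samples, threshold=0.3):
    # Hypothetical threshold; a real system would learn this
    # per device, room, and user rather than hard-coding it.
    return facing_score(samples) > threshold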
Near-surface multi-finger tracking (NMFT) technology expands the input space of touchscreens by enabling novel interactions such as mid-air and finger-aware interactions. We present DeepFisheye, a practical NMFT solution for mobile devices that utilizes a fisheye camera attached at the bottom of a touchscreen. DeepFisheye acquires an image of the interacting hand positioned above the touchscreen using the camera and employs deep learning to estimate the 3D position of each fingertip. We created two new hand pose datasets comprising fisheye images, on which our network was trained. We evaluated DeepFisheye's performance for three device sizes. DeepFisheye showed average fingertip-tracking errors of approximately 20 mm across the different device sizes. Additionally, we created simple rule-based classifiers that estimate the contact finger and hand posture from DeepFisheye's output. The contact finger and hand posture classifiers showed accuracies of approximately 83% and 90%, respectively, across the device sizes.
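Why a fisheye lens helps here can be seen from the standard equidistant fisheye model, in which the image radius grows linearly with the viewing angle (r = f·θ), giving a very wide field of view above the screen. The sketch below maps a fisheye pixel to a 3D viewing ray under that standard model; it illustrates the camera geometry only and does not reproduce DeepFisheye's network or calibration.

```python
import math

def pixel_to_ray(u, v, cx, cy, f):
    """Map fisheye pixel (u, v) to a unit 3D viewing ray (x, y, z).

    (cx, cy) is the principal point, f the focal length in pixels,
    assuming the equidistant projection r = f * theta.
    """
    dx, dy = u - cx, v - cy
    r = math.hypot(dx, dy)
    if r == 0.0:
        return (0.0, 0.0, 1.0)   # pixel on the optical axis
    theta = r / f                 # equidistant model: angle grows with radius
    s = math.sin(theta) / r
    return (dx * s, dy * s, math.cos(theta))
```

A pixel at radius f from the principal point corresponds to a ray a full radian off-axis, which is why a single bottom-mounted camera can observe fingers hovering anywhere over the touchscreen.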
The automatic recognition of how people use their hands and fingers in natural settings -- without instrumenting the fingers -- can be useful for many mobile computing applications. To achieve such an interface, we propose a vision-based 3D hand pose estimation framework using a wrist-worn camera. The main challenge is the oblique angle of the wrist-worn camera, which makes the fingers scarcely visible. To address this, a special network that observes deformations on the back of the hand is required. We introduce DorsalNet, a two-stream convolutional neural network that regresses finger joint angles from spatio-temporal features of the dorsal hand region (the movement of bones, muscles, and tendons). This work is the first vision-based real-time 3D hand pose estimator using visual features from the dorsal hand region. Our system achieves a mean joint-angle error of 8.81 degrees for user-specific models and 9.77 degrees for a general model. Further evaluation shows that our system outperforms previous work with an average of 20% higher accuracy in recognizing dynamic gestures, and achieves 75% accuracy in detecting 11 different grasp types. We also demonstrate three applications that employ our system as a control device, an input device, and a grasped-object recognizer.
TelemetRing is a batteryless, wireless ring-shaped keyboard that supports command and text entry in daily life by detecting finger typing on various surfaces. The proposed inductive telemetry approach eliminates bulky batteries and capacitors from the ring. Each ring consists of a sensor coil (the ring itself), a 1-DoF piezoelectric accelerometer, and a varactor diode, and each ring has a distinct resonant frequency. Typing impacts slightly shift the resonant frequency, and these shifts are detected by a wrist-mounted readout coil. A 5-bit chorded keyboard is realized by attaching a sensor ring to each of the five fingers. Our evaluation shows that the prototype achieves a tiny (6 g, 3.5 cm^3) ring sensor and a typing detection rate of 89.7%.
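A 5-bit chorded keyboard maps each combination of pressed fingers to one symbol, giving 2^5 - 1 = 31 usable chords. The decoding step can be sketched as below; the particular chord-to-letter mapping is invented for illustration (TelemetRing's actual layout is not specified here).

```python
# Hypothetical chord layout: bit i set means finger i (thumb..pinky) typed.
CHORD_MAP = {0b00001: 'a', 0b00010: 'e', 0b00011: 't', 0b00100: 'o',
             0b00101: 'n', 0b00110: 's', 0b00111: ' '}

def decode_chord(fingers_down):
    """Decode a chord from an iterable of finger indices 0..4.

    Returns the mapped symbol, or None for an unmapped chord.
    """
    mask = 0
    for finger in fingers_down:
        if not 0 <= finger <= 4:
            raise ValueError("finger index out of range")
        mask |= 1 << finger
    return CHORD_MAP.get(mask)

assert decode_chord([0]) == 'a'
assert decode_chord([0, 1]) == 't'
```

In the hardware, "which fingers typed" falls out of which resonant frequencies shifted in the same time window, since every ring resonates at a different frequency.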
Supporting voice commands in applications presents significant benefits to users. However, adding such support to existing GUI-based web apps is labor-intensive with a high learning barrier, as shown in our formative study, due to the lack of unified support for creating multi-modal interfaces. We develop Geno---a developer tool for adding the voice input modality to existing web apps without requiring significant NLP expertise. Geno provides a unified workflow for developers to specify functionalities to support by voice (intents), create language models for detecting intents and the relevant information (parameters) from user utterances, and fulfill the intents by either programmatically invoking the corresponding functions or replaying GUI actions on the web app. Geno further supports references to GUI context in voice commands (e.g., "add this to the playlist"). In a study, developers with little NLP expertise were able to add multi-modal support to two existing web apps using Geno.
We present TileCode, a video game creation environment that runs on battery-powered microcontroller-based gaming handhelds. Our work is motivated by the popularity of retro video games, the availability of low-cost gaming handhelds loaded with many such games, and the concomitant lack of a means to create games on the same handhelds. With TileCode, we seek to close the gap between the consumers and creators of video games and to motivate more individuals to participate in the design and creation of their own games. The TileCode programming model is based on tile maps and provides a visual means for specifying the context around a sprite, how a sprite should move based on that context, and what should happen upon sprite collisions. We demonstrate that a variety of popular video games can be programmed with TileCode using 10-15 visual rules, and we compare and contrast them with block-based versions of the same games implemented using MakeCode Arcade.
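The rule model described above (a context pattern around a sprite plus a resulting move) can be sketched in a few lines. The representation below is invented for illustration, not TileCode's actual encoding: each rule pairs a 3x3 pattern of tile names (None meaning "don't care") with a move.

```python
def matches(pattern, grid, x, y):
    """Check a 3x3 pattern centered on sprite position (x, y)."""
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            want = pattern[dy + 1][dx + 1]
            if want is not None and grid[y + dy][x + dx] != want:
                return False
    return True

def step(rules, grid, x, y):
    """Apply the first matching rule; return the sprite's new position."""
    for pattern, (mx, my) in rules:
        if matches(pattern, grid, x, y):
            return x + mx, y + my
    return x, y

# Example rule: if the tile below the sprite is 'empty', fall one cell.
gravity = ([[None, None, None],
            [None, None, None],
            [None, 'empty', None]], (0, 1))

grid = [['empty'] * 3 for _ in range(3)]
assert step([gravity], grid, 1, 1) == (1, 2)
```

A dozen such rules, each editable on a small screen, is enough to express gravity, walls, pushing, and collisions, which is what makes the model fit on the handheld itself.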
Collaborative robots promise to transform work across many industries and promote human-robot teaming as a novel paradigm. However, realizing this promise requires an understanding of how existing tasks, developed for and performed by humans, can be effectively translated into tasks that robots can perform singularly or that human-robot teams can perform collaboratively. In the interest of developing tools that facilitate this process, we present Authr, an end-to-end task authoring environment that assists engineers at manufacturing facilities in translating existing manual tasks into plans applicable for human-robot teams and simulates these plans as they would be performed by the human and robot. We evaluated Authr with two user studies, which demonstrate the usability and effectiveness of Authr as an interface and the benefits of assistive task allocation methods for designing complex tasks for human-robot teams. We discuss the implications of these findings for the design of software tools for authoring human-robot collaborative plans.
From full-color objects to functional capacitive artifacts, multi-material 3D printing has become essential to broadening the application areas of digital fabrication. We present Programmable Filament, a novel technique that enables multi-material printing using a commodity FDM 3D printer, requiring no hardware upgrades. Our technique builds upon an existing printing technique in which multiple filament segments are printed and spliced into a single threaded filament. We propose an end-to-end pipeline for 3D printing an object in multiple materials, introducing design systems for end-users. Optimized for low-cost, single-nozzle FDM 3D printers, the system builds on our computational analysis and experiments to remain valid across various printers and materials when designing and producing a programmable filament. Finally, we discuss application examples and speculate on its future potential, such as on-demand custom filament manufacturing.
We present DefeXtiles, a rapid and low-cost technique to produce tulle-like fabrics on unmodified fused deposition modeling (FDM) printers. The under-extrusion of filament is a common cause of print failure, resulting in objects with periodic gap defects. In this paper, we demonstrate that these defects can be finely controlled to quickly print thinner, more flexible textiles than previous approaches allow. Our technique provides hierarchical control from micrometer structure to decameter form and is compatible with all common 3D printing materials.
In this paper, we introduce the mechanism of DefeXtiles, establish the design space through a set of primitives with detailed workflows, and characterize the mechanical properties of DefeXtiles printed with multiple materials and parameters. Finally, we demonstrate the interactive features and new use cases of our approach through a variety of applications, such as fashion design prototyping, interactive objects, aesthetic patterning, and single-print actuators.
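The deliberate under-extrusion at the heart of this approach can be sketched with standard G-code extrusion arithmetic: the filament fed for a move is the deposited volume divided by the filament cross-section, scaled by a flow multiplier. Dropping that multiplier well below 1.0 starves the line until it breaks into periodic strands. The numbers below (filament diameter, layer height, 0.3 flow) are illustrative assumptions, not DefeXtiles' actual parameters.

```python
import math

FILAMENT_DIAMETER = 1.75  # mm, common FDM filament size (assumed)
FILAMENT_AREA = math.pi * (FILAMENT_DIAMETER / 2) ** 2

def extrusion_mm(path_mm, layer_height, line_width, flow):
    """Filament length to feed for a path, scaled by a flow multiplier."""
    return path_mm * layer_height * line_width * flow / FILAMENT_AREA

def gcode_line(x, y, e_total):
    """Format a G1 move with a cumulative extrusion value."""
    return f"G1 X{x:.3f} Y{y:.3f} E{e_total:.5f}"

# Normal wall vs. heavily under-extruded "textile" wall for a 10 mm move:
normal = extrusion_mm(10, 0.2, 0.4, flow=1.0)
textile = extrusion_mm(10, 0.2, 0.4, flow=0.3)  # assumed multiplier
assert textile < normal
```

Because the starvation is driven entirely by the E values in otherwise ordinary G-code, the technique needs no printer modification, only a toolpath generator that emits the reduced extrusion on purpose.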
Automatic knitting machines are robust, digital fabrication devices that enable rapid and reliable production of attractive, functional objects by combining stitches to produce unique physical properties. However, no existing design tools support optimization for desirable physical and aesthetic knitted properties. We present KnitGIST (Generative Instantiation Synthesis Toolkit for knitting), a program synthesis pipeline and library for generating hand- and machine-knitting patterns by intuitively mapping objectives to tactics for texture design. KnitGIST generates a machine-knittable program in a domain-specific programming language.
Handheld Perspective-Corrected Displays (HPCDs) are physical objects that have a notable volume and that display a virtual 3D scene on their entire surface. Being handheld, they create the illusion of holding the scene in a physical container (the display). This has strong benefits for the intuitiveness of 3D interaction: manipulating objects of the virtual scene amounts to physical manipulations of the display. HPCDs have been limited so far to technical demonstrators and experimental tools to assess their merits. However, they show great potential as interactive systems for actual 3D applications. This requires that novel interactions be created to go beyond object manipulation and to offer general-purpose services such as menu command selection and continuous parameter control. Working with a two-handed spherical HPCD, we report on the design and informal evaluations of various interaction techniques for distant object selection, scene scaling, menu interaction, and continuous parameter control. In particular, our design leverages efficient two-handed control of the display's rotation. We demonstrate how some of these techniques can be assembled into a self-contained anatomy-learning application. Novice participants used the application in a qualitative user experiment; most used it effortlessly without any training or explanation.
Haptic simulation of hand tools such as wrenches, pliers, scissors, and syringes is beneficial for finely detailed skill training in VR, but designing for numerous hand tools usually requires expert-level knowledge of specific mechanisms and protocols. This paper presents HapLinkage, a prototyping framework based on linkage mechanisms that provides typical motion templates and haptic renderers to facilitate proxy design of virtual hand tools. The mechanical structures can be easily modified, for example, to scale the size or to change the range of motion by selectively changing linkage lengths. Resistance, stop, release, and restoration force feedback are generated by an actuating module that is part of the structure. Additional vibration feedback can be generated with a linear actuator. HapLinkage enables easy and quick prototyping of hand tools for diverse VR scenarios, embodying both their kinetic and haptic properties. Interviews with expert designers confirmed that HapLinkage is expressive for designing haptic proxies of hand tools that enhance VR experiences, and also identified potential extensions and future developments of the framework.
In this keynote talk, Dr. Costanza-Chock will explore the theory and practice of design justice, discuss how design affordances, disaffordances, and dysaffordances distribute benefits and burdens unequally according to users' location within the matrix of domination (white supremacy, heteropatriarchy, ableism, capitalism, and settler colonialism), and invite us to consider how user interface designers can intentionally contribute to building "a better world", a world where many worlds fit; linked worlds of collective liberation and ecological sustainability.