TVX '18: Proceedings of the 2018 ACM International Conference on Interactive Experiences for TV and Online Video


SESSION: Opening & Closing Keynotes

Towards Smarter Cinematographic Drones

With the advent of stable and powerful quadrotors, coupled with high quality camera lenses, quadrotor drones are becoming new cinematographic devices in the toolbox of both professional and amateur filmmakers. Many sequences in recent movies and TV shows are now shot by drones due to the simplicity of the process compared to installing cumbersome and expensive camera cranes, and because of the new creative shooting possibilities it opens.

However, mastering the control of such devices in order to create cinematographic sequences in possibly evolving environments requires a significant amount of time and practice. Indeed, synchronizing the quadrotor and camera motions while ensuring that the drone is in a safe position and satisfies desired visual properties remains challenging. Professional film crews actually rely on two operators who coordinate their actions: a pilot focuses on controlling the motion of the drone, and a cinematographer focuses on motion and focus of the camera.

Making Creative Content for Future TV: A Collaborative Ecosystem for Content-making in Korea & Asia

Content makers need to make a concerted effort to escape from the conventional, obsolete business model and look for new opportunities on new media platforms such as YouTube and Netflix. In this keynote talk, I will highlight an innovative and collaborative ecosystem that I am now setting up in the Korean and Asian markets. Motivation: In the Korean and Asian markets, television was the dominant platform for distributing films and TV content, so the traditional business model for big-budget movies and content was simply to sell them to traditional distributors (e.g., Korea Broadcasting System, IPTV, OTT) right after their wide release in cinemas. However, people now watch various kinds of video on mobile devices or computers rather than on TV, and want more personalized or lifestyle-matched content than ever. To lead this future market trend, a new ecosystem for both content creators and media consumers is urgently needed. Problem statement: Disney's attempt to acquire 21st Century Fox (its film and television content division) can be seen in this light. Disney wants to monopolize the content distribution market in the traditional model (e.g., IPTV, OTT in Korea) and to be a game-changer against emerging platforms such as YouTube and Netflix. Withholding content from Netflix or YouTube is key to understanding this new power game. A significant problem with this trend is that small- and medium-budget content makers (content creators) cannot gain a foothold in distribution markets monopolized by Disney, Netflix, and YouTube. This means that fewer content providers will survive over the next decade. Approach: I will introduce a new ecosystem that aims to benefit all stakeholders (producers, directors, screenwriters, CG technicians, etc.), rather than letting the best (i.e., big content providers) become the beast.
My talk is about how to build such an ecosystem for the future content industry in Korea, one through which the digital transformation of media consumption can be delivered soundly for all of us.

SESSION: Session 1: Interactivity and Immersion

Gestures for Controlling a Moveable TV

We investigate the effects of physical context on the preference and production of touchless (3D) gestures, focusing on what users consider to be natural and intuitive. Using an elicitation task, we asked for users' preferred gestures to control a "moving TV" display from a distance of 3-4m. We conducted three user studies (N=16 each) using the same premise but varying the physical conditions encountered, such as number of hands available or distance and orientation to the display. This is important to ensure the robustness of the gesture set. We observed two dominant strategies which we interpret as dependent on the user's mental model: hand-as-display and hand-moving-display. Across the varying conditions, users were found to be consistent with their preferred gesture strategy while varying the production (number of hands, orientation, extension of arms) of their gestures in order to match both their mental models and the physical context of use. From a technology perspective, this natural variation challenges the notion of identifying "the optimal gesture set" and should be taken into account when designing future systems with gesture control.

Frictional Realities: Enabling Immersion in Mixed-Reality Performances

This paper presents a case study of a Mixed-Reality Performance employing 360-degree video for a virtual reality experience. We repurpose the notions of friction to illustrate the different threads at which priming is enacted during the performance to create an immersive audience experience. We look at aspects of friction between the different layers of the Mixed-Reality Performance, namely: temporal friction, friction between the physical and virtual presence of the audience, and friction between realities. We argue that Mixed-Reality Performances that employ immersive technology do not need to rely on its presumed immersive nature to make the performance an engaging or coherent experience. Immersion, in such performances, emerges from the audience's transition towards a more active role, and the creation of various fictional realities through frictions.

SESSION: Session 2: Storytelling

Narrative Bytes: Data-Driven Content Production in Esports

Esports - video games played competitively that are broadcast to large audiences - are a rapidly growing new form of mainstream entertainment. Esports borrow from traditional TV, but are a qualitatively different genre, due to the high flexibility of content capture and availability of detailed gameplay data. Indeed, in esports, there is access to both real-time and historical data about any action taken in the virtual world. This aspect motivates the research presented here, the question asked being: can the information buried deep in such data, unavailable to the human eye, be unlocked and used to improve the live broadcast compilations of the events? In this paper, we present a large-scale case study of a production tool called Echo, which we developed in close collaboration with leading industry stakeholders. Echo uses live and historic match data to detect extraordinary player performances in the popular esport Dota 2, and dynamically translates interesting data points into audience-facing graphics. Echo was deployed at one of the largest yearly Dota 2 tournaments, which was watched by 25 million people. An analysis of 40 hours of video, over 46,000 live chat messages, and feedback of 98 audience members showed that Echo measurably affected the range and quality of storytelling, increased audience engagement, and invoked rich emotional response among viewers.

Facts, Interactivity and Videotape: Exploring the Design Space of Data in Interactive Video Storytelling

We live in a society that is increasingly data rich, with an unprecedented amount of information being captured, stored and analysed about our lives and the people we share them with. We explore the relationship between this new data and emergent forms of interactive video storytelling. In particular we ask: i) how can interactive video storytelling techniques be employed to provide accessible, informative and pleasurable ways for people to engage with data; and ii) how can data be used by the creators of interactive video stories to meet expressive goals and support new forms of experience? We present an analysis of 43 interactive videos that use data in a noteworthy fashion. This analysis reveals a design space comprising key techniques for telling engaging interactive video stories with and about data. To conclude, we discuss challenges relating to the production and consumption of such content and make recommendations for future research.

SESSION: Session 3: Understanding Users

How Users Perceive Delays in Synchronous Companion Screen Experiences - An Exploratory Study

Much work has focused on enabling accurately synchronised companion screen experiences. The challenge has been to ensure that the delays between the presentation of programme content on the TV and the delivery of the relevant companion screen content to a mobile device are kept to a minimum. This is mainly driven by the need to ensure that the integrity of the editorial design of companion screen experiences can be maintained at the users' end. This paper presents a 32-participant study which sought to explore the impact of delays between the presentation of programmes on a TV and the presentation of companion content on a tablet. Three types of experiences were tested across eight levels of delay: 1) video-to-slideshow using Factual content, 2) video-to-alt-video using Sports content, and 3) video-to-AD (audio description) using Drama content. Participant responses suggest that different factors influence their evaluation of the different types of experiences tested.

"I Can Watch What I Want": A Diary Study of On-Demand and Cross-Device Viewing

In recent years, on-demand video services, such as Netflix and Amazon Video, have become extremely popular. To understand how people use these services, we recruited 20 people from nine households to keep a viewing diary for 14 days. To better understand these household viewing diaries, in-depth interviews were conducted. We found that people took advantage of the freedom and choice that on-demand services offer, watching on different devices and in different locations, both in the home and outside. People often watched alone so they could watch what they wanted, rather than coming together to watch something of mutual interest. Despite this flexibility, the evening prime time continued to be the most popular time for people to watch on-demand content. Sometimes they watched for extended periods, and during interviews concerns were expressed about how on-demand services make it far too easy to watch too much and that this is often undesirable.

Utilitarian and Hedonic Motivations for Live Streaming Shopping

Watching live streams as part of the online shopping experience is a relatively new phenomenon. In this paper, we examine live streaming shopping, conceptualizing it as a type of online shopping that incorporates real-time social interaction. Live streaming shopping can happen in two ways: live streaming embedded in e-commerce, or e-commerce integrated into live streaming. Based on prior research related to live streaming and consumer motivation theories, we examined the relationships between hedonic and utilitarian motivations and shopping intention. We found that hedonic motivation is positively related to celebrity-based intention and utilitarian motivation is positively related to product-based intention. A content analysis of open-ended questions identified eight reasons why consumers prefer live streaming shopping over regular online shopping.

SESSION: Session 4: Data-Driven Approaches

A Data-driven Approach to Explore Television Viewing in the Household Environment

The rise of small, IoT-related devices and sensors has enabled us to sense and collect more data than ever before. In this study, we walk through our attempt at a data-driven approach to collecting behavioral data on television viewing, an activity thought of as passive and habitual. We conducted a 14-day experiment with 13 households in the wild using a data logger installed at each house. Television-related data (IR log data and IPTV packets) and contextual data (Bluetooth signal data and brightness data) were collected through the data logger. The data is supplemented by the qualitative situational information that participants provided via in-situ chatbot surveys. Our non-intrusive data logger has enabled behavioral data collection in a natural, comprehensive manner. Detailed television viewing behaviors recorded through IR data logs, volume of viewing sessions, and in-situ chatbot responses show that television viewing is more heavily context-dependent than previously thought.

Explicating the Challenges of Providing Novel Media Experiences Driven by User Personal Data

The turn towards personal data to drive novel media experiences has resulted in a shift in the priorities and challenges associated with media creation and dissemination. This paper takes up the challenge of explicating this novel and dynamic scenario through an interview study of employees delivering diverse personal-data-driven media services within a large U.K.-based media organisation. The results identify a need for better interactions in the user-data-service ecosystem where trust and value are prioritised and balanced. Being legally compliant and going beyond just the mandatory to further ensure social accountability and ethical responsibility as an organisation are unpacked as methods to achieve this balance in data-centric interactions. The work also presents how technology is seen and used as a solution for overcoming challenges and realising priorities to provide value while preserving trust within the personal data ecosystem.

SESSION: Session 5: Systems

A New Production Platform for Authoring Object-based Multiscreen TV Viewing Experiences

Multiscreen TV viewing refers to a spectrum of media productions that can be watched on TV screens and companion screens (e.g., smartphones and tablets). TV production companies are now promoting an interactive and engaging way of viewing TV by offering tailored applications for TV programs. However, viewers are reluctant to install dozens of applications and switch between them. This is one of the obstacles that hinder companion screen applications from reaching mass audiences. To solve this, TV production companies need a standard process for producing multiscreen content, allowing viewers to follow all kinds of programs in one single application. This paper proposes a new object-based production platform for authoring multiscreen programs. The platform consists of two parts: the preproduction tool and the live editing tool. To evaluate whether the proposed workflow is appropriate, validation interviews were conducted with professionals in the TV broadcasting industry. The professionals were positive about the proposed new workflow, indicating that the platform allows for preparations at the preproduction stage and reduces the workload during the live broadcast. They also see its potential to adapt to the current production workflow.

Digital Authoring of Interactive Public Display Applications

HbbTV (Hybrid broadcast broadband TV) is an emerging force in the entertainment industry, and proper standardisation of technologies would be hugely beneficial for the creation of content. HbbTV aims to realise this vision and has been widely successful thus far. This paper introduces MPAT (Multi Platform Application Toolkit), which is the result of the effort and dedication of multiple organisations to extend the capabilities and functionality of HbbTV, in order to ease the design and creation of interactive TV applications. The paper also showcases the versatility of MPAT by describing a series of case studies which provide digital storytelling and visual authoring of interactive applications which transcend traditional TV use cases, and instead provide a gripping interactive experience via integration with public displays.

Companion Screen Architecture for Bridging TV Experiences and Life Activities

The diversification of personal lifestyles has complicated the roles of media and associated service consumption. In our current era, when people start to use new services by transitioning from one service or device to another, bothersome operations can decrease their motivation to use the new services effectively. For example, even though companion screen services are now available on integrated broadcast-broadband systems, broadcast accessibility from mobile services remains suboptimal because existing architectures remain television (TV)-centric and cannot use these services effectively. In response to this issue, we propose a user-centric companion screen architecture (CSA) that can tune to a specified TV channel and launch broadcast-related TV applications from mobile and Internet of Things (IoT)-enabled devices. We confirmed the general versatility of this CSA by prototyping multiple use cases involving various broadcasters and by evaluating broadcast accessibility from mobile devices via user tests. The obtained results showed that 86% of the examinees expressed improved user satisfaction and that 78% of the examinees reported a potential increase in the number of broadcasts they would watch. Thus, we conclude that our proposed CSA improves broadcast accessibility from mobile and IoT services and can help bridge the gap between TV experiences and life activities.

SESSION: Session: Work-in-Progress

Twickle: Growing Twitch Streamer's Communities Through Gamification of Word-of-Mouth Referrals

Twitch has grown to be one of the largest streaming platforms worldwide, hosting over 2 million active streamers. Many of these streamers are using their Twitch stream to earn a living, turning their streams into a business. However, growing a community that supports this endeavor remains a central challenge amongst streamers. In this paper, we present Twickle: a web-based leaderboard tool that leverages the gamification of word-of-mouth referrals to grow a streamer's community. An initial feasibility study with four streamers reveals that Twickle increases the number of new viewers and is appreciated by the Twitch community. We address design opportunities for Twickle and outline future research.

Touchable Video Streams: Towards Multi-sensory and Multi-contact Experiences

Haptic feedback takes on an important role in providing spatial cues, which are difficult to convey solely by sight, as well as in increasing the immersion of contents. However, although a number of techniques and applications for haptic media have been proposed in this regard, live streaming of touchable video has yet to be actively deployed due to computational complexity and equipment limitations. In order to mitigate these issues, we introduce an approach to render haptic feedback directly from RGB-D video streams without surface reconstruction, and also describe how to superimpose virtual objects or haptic effects onto real-world scenes. Furthermore, we discuss possible improvements in software and appropriate device setups to extend the proposed system to support a practical solution for multi-sensory and multi-point interaction in streaming touchable media.

A Mediography of Virtual Reality Non-Fiction: Insights and Future Directions

The emergence in recent years of consumer-accessible virtual reality (VR) technologies such as the Google Daydream, Oculus Rift and HTC Vive has led to a renewal of commercial, academic and public interest in immersive interactive media. Virtual reality non-fiction (VRNF) (e.g. documentary) is an emergent and rapidly evolving new medium for filmmaking that draws from - and builds upon - traditional forms of non-fiction, as well as interactive media, gaming and immersive theatre. In this paper, we present our ongoing work to capture and present the first comprehensive record of VRNF - a Mediography of Virtual Reality Non-Fiction - to tell the story of where this new medium has come from, how it is evolving, and where it is heading.

Content Unification in iTV to Enhance User Experience: The UltraTV Project

Recent changes in TV viewers' consumption habits are pushing to a point where industry content providers and producers must create new technological solutions to retain customers. To cope with these changes, the UltraTV project consortium developed an iTV concept, with a focus on the unification of content from different sources. This brings together traditional TV and Over-the-Top content, aiming to provide an integrated solution that could foster audiovisual consumption and ease the discovery of content. This paper presents the implemented solution and reports on the results of its evaluation in a field trial. Results provide valuable insights for a market-oriented version of the UltraTV concept, demonstrating the feasibility of, and user demand for, a profile-based content unification solution for future iTV solutions.

Viewers' behaviors at home on TV and other screens: an online survey

In a context where audiovisual consumption habits are continually transforming, mostly driven by Video On Demand services, this paper aims to characterize the motivational factors and behaviors related to the use of multiple devices at home. The report is based on the results of an online survey carried out in Portugal, which collected information about online video and linear TV content consumption. Besides regular TV content, usually watched on a TV connected to a set-top box, the computer was the most chosen device for watching all other sources of content at home. Furthermore, 71.4% of respondents stated that they usually connect more than one device to the TV screen.

Personalising the TV Experience with Augmented Reality Technology: Synchronised Sign Language Interpretation

This paper explores the potential of augmented reality technology as a novel way to allow users to view a sign language interpreter through an optical head-mounted display while watching a TV programme. We address the potential of augmented reality for personalisation of TV access services as part of closed laboratory investigations. Based on guidelines of regulatory authorities and research on traditional sign language services on TV, as well as feedback from experts, we justify our two design proposals. We describe how we produced the content for the AR prototype applications and what we have learned during the process. Finally, we develop questions for our upcoming user studies.

Educational Online Video: Opportunities And Barriers to Integrate it in the Entertainment Consumption Routines

The general population, and teenagers in particular, are increasingly using mobile devices for video consumption instead of the regular TV set. Considering that the top motivation for video consumption is to seek entertainment, there is an opportunity to try to capture some of those moments for educational content enriched with some entertainment characteristics. This study aims to identify narrative and technical characteristics to incorporate in informal educational videos designed for new media platforms, by analysing the preferences of teenagers aged 12 to 16 who attend the Portuguese public school system. Furthermore, the research team expects to understand whether educational videos enriched with these characteristics can be included in the entertainment consumption routines of these viewers. Some of the most valued characteristics are the comic approach, the integration of animations, the relaxed yet clear presenter style and the low level of scientific detail in video explanations.

Understanding Blind or Visually Impaired People on YouTube through Qualitative Analysis of Videos

In this paper, we analyzed videos to explore blind or visually impaired (BVI) people on YouTube. While researchers have studied how BVI people interact with content and other people on social media platforms (e.g., Facebook), little is known about the experience of BVI people on video-based social media platforms (e.g., YouTube). To use videos as a means of identifying the needs of BVI people on YouTube, we collected and analyzed a specific type of video called the Visually Impaired People (VIP) Tag video. This Tag video has a set of structured questions about eye condition and experience as a BVI person. Based on the qualitative analysis of 24 VIP Tag videos created by BVI people, we found how they create videos and why they joined YouTube. In conclusion, we present how video-content analysis can be used to create an inclusive video-based social media platform.

Collecting Observational Data about Online Video Use in the Home Using Open-Source Broadcasting Software

Capturing contextual data about online media consumption in the home can be difficult, often requiring site visits and hardware installation in the field. In this paper, we present an exploratory study in which we use free, open-source broadcasting software and participants' existing computer hardware to capture remote, contextual video data inside the home. This method allows participants to simultaneously capture live recordings across multiple computer screens, as well as themselves and their home viewing environment, while watching long-form online video. We discuss the affordances and challenges of this method for researchers seeking to capture contextual data remotely.

A Study on User Experience Evaluation of Glasses-type Wearable Device with Built-in Bone Conduction Speaker: Focus on the Zungle Panther

Current HMD-oriented glasses-type wearable devices are inconvenient to use in daily life and need both miniaturization and weight reduction. Such wearable devices are expected to evolve into glasses-type devices. There is a lack of research on user experience evaluation and design guidelines for VR (virtual reality), AR (augmented reality), television and games using glasses-type devices. This research used the Zungle Panther, a glasses-type wearable device with a built-in bone conduction speaker, to investigate the user experience evaluation model needed for near-future AR and VR content used in daily life, and the design guidelines needed for glasses-type devices for near-future content including TV and online video.

Dynamic Subtitles in Cinematic Virtual Reality

Cinematic Virtual Reality has been increasing in popularity in recent years. Watching 360° movies with a Head Mounted Display, the viewer can freely choose the direction of view, and thus the visible section of the movie. Therefore, a new approach for the placement of subtitles is needed. There are three main issues which have to be considered: the position of the subtitles, the speaker identification and the influence on the VR experience. In our study we compared a static method, where the subtitles are placed at the bottom of the field of view, with dynamic subtitles, where the position of the subtitles depends on the scene and is close to the speaking person. This work-in-progress describes first results of the study which point out that dynamic subtitles can lead to a higher score of presence, less sickness and lower workload.

Augmenting the Radio Experience by Enhancing Interactions between Radio Editors and Listeners

Radio has a long history of being a one-way communication channel from radio station to listener. Recent technological advancements, such as online radio, enable the listener to interact more easily with radio stations, potentially augmenting the overall radio experience of the listener. In turn, the editorial teams of radio stations are challenged by the streams of incoming messages. In this paper, we report on the results of an initial, exploratory co-design process that aimed at mapping the needs and values of both types of end-users, i.e. listeners and radio editors, towards interaction. Specifically, we organized 6 co-design workshops at radio stations in 3 different countries. Results demonstrate how the needs of both types of end-users overlap. The paper concludes with 5 general points of attention, i.e. relevant feedback, co-creation of content, personal services, content on demand and being part of a community, which form the basis for the continuation of our work.

Taxonomies in DUI Design Patterns

Recently a library of design patterns was created to aid researchers and designers in specifying Distributed User Interfaces (DUIs). The patterns provide an overview of the solutions to common DUI design problems without requiring a significant amount of time to be spent on reading domain-specific literature and exploring existing DUI implementations. Among the main limitations of the library's current implementation is the significant overlap among design pattern descriptions and their relationships not being sufficiently clear. To address this, a systematic approach was undertaken to remove the overlaps among the design patterns, as well as to clarify their relationships by creating a taxonomic structure. The results of this study open several research directions to advance the current work on DUI design patterns.

Smartphone-like or TV-like Smart TV?: The Effect of False Memory Creation

False belief pertains to what users believe falsely in their mental model about remembering novel features with no prior experience. The current study investigated how the False Belief technique can be employed to extract a first-time smart TV user's mental model. Smart features formed by a group of users' false memories (n=41) were monitored to see how the users' mental model changed with retention intervals (immediate, short, and long delays). The findings showed that a gist trace formed in the first-time use cannot last long (1 month) because of the greater false belief effect. Practical implications of these findings should be furthered to improve the apparent adoption obstacles in smart-TV use.

Experiencing Virtual Reality Together: Social VR Use Case Study

As Virtual Reality (VR) applications have gained momentum recently, the social and communication aspects of VR experiences have become more relevant. In this paper, we present some initial results of understanding the type of applications and factors that users would find relevant for Social VR. We conducted a study involving 91 participants, and identified 4 key use cases for Social VR: video conferencing, education, gaming and watching movies. Further, we identified 2 important factors for such experiences: interacting within the experience, and enjoying the experience. Our results serve as an initial step before performing more detailed studies on the functional requirements for specific Social VR applications. We also discuss the necessary research to fill in current technological gaps in order to move Social VR experiences forward.

As Music Goes By in versions and movies along time

Music and movies have a significant impact on our lives and they have been playing together since the early days of the moving image. Music history on its own goes back much earlier, and music has been present in every known culture. It has also been common, since ancient times, for artists to perform and record music originally written and performed by other musicians. In this paper we address the relevance of, and the support for, accessing music in versions and movies along time, and introduce As Music Goes By, an interactive web application being designed and developed to contribute to this purpose, aiming at increased richness and flexibility, the chance to find unexpected meaningful information, and the support to create and experience music and movies that keep touching us.

ImAc: Enabling Immersive, Accessible and Personalized Media Experiences

The integration of immersive contents and consumption devices within the TV landscape brings new fascinating opportunities. However, the exploitation of these immersive TV services is still in its infancy and ground-breaking solutions need to be devised. A key challenge is to enable truly inclusive experiences, regardless of the sensorial and cognitive capacities of the users, their age and language. In this context, the ImAc project explores how accessibility services (subtitling, audio description and sign language) can be efficiently integrated with immersive media, such as omnidirectional and Virtual Reality (VR) contents, while keeping compatibility with current standards and technologies. This paper provides an overview of the project, by focusing on its motivation, the followed user-centered methodology and its key research objectives. The end-to-end system (from production to consumption) being specified, the envisioned scenarios and planned evaluations are also briefly described.

SESSION: Workshop Session Summaries

Care TVX: Challenges and Design to Improve TV in In-hospital Environment and for Visually Impaired People

The overarching goal of this workshop is to bring together practices and research in medical fields with media content developers, designers, and UI/UX/QoE researchers, as well as hospital practitioners such as ophthalmologists and psychiatrists. Discussions will relate to:

  • how new visual experiences and media content can improve the in-hospital experience of in-patients, caregivers, and medical staff;
  • how a better understanding of visual impairments (e.g. for the elderly) can help to design more inclusive TV experiences.

The starting point will be the cases and experiences of Care TVX, followed by a multidisciplinary discussion on "challenges and design considerations for adjusted or new hospital and care practices", led by workshop participants. The outcome of the workshop will be a collection of best practices in the form of position papers and online content.

360° Video Storytelling and Virtual Reality Workshop

The purpose of this joint workshop is to bring together a diverse group of researchers and practitioners for focused discussion and knowledge sharing in 360° video storytelling and virtual reality.