Facts, Interactivity and Videotape: Exploring the Design Space of Data in Interactive Video Storytelling

We live in a society that is increasingly data rich, with an unprecedented amount of information being captured, stored and analysed about our lives and the people we share them with. We explore the relationship between this new data and emergent forms of interactive video storytelling. In particular we ask: i) how can interactive video storytelling techniques be employed to provide accessible, informative and pleasurable ways for people to engage with data; and ii) how can data be used by the creators of interactive video stories to meet expressive goals and support new forms of experience? We present an analysis of 43 interactive videos that use data in a noteworthy fashion. This analysis reveals a design space comprising key techniques for telling engaging interactive video stories with and about data. To conclude, we discuss challenges relating to the production and consumption of such content and make recommendations for future research.


INTRODUCTION
The rise of personal informatics (e.g. health, fitness and sleep tracking), consumer Internet of Things (IoT) (e.g. smart thermostats and connected cars), social media and the open data initiative means people now have access to an unprecedented amount of data about their lives, environments and the people they share them with. This proliferation of data is set to continue, with wearable technology adoption forecasted to reach 28% by this year, two thirds of consumers planning to purchase an in-home IoT device by 2019 [1], global social media adoption reaching 37% [43] and the G8 countries signing a charter to make government data open by default [34].
In parallel with these developments, we have seen the emergence of new forms of interactive video. Key examples include: interactive documentaries (iDocs) [15], ShapeShifting TV [42] and, most recently, experiments in object-based [4] and perceptive media [11]. These new interactive forms enable the creation of video experiences where content is varied based on viewers' interactions, context and other data about them, and, in turn, reveal new possibilities for non-linearity, responsiveness and personalization not possible in traditional video storytelling.
In our research we ask whether these emergent forms of interactive video can be used as a platform to tell stories about, and with, the data that is proliferating in modern society. We hypothesize that the development of such data-driven interactive video experiences can offer two key, interconnected benefits: i) presenting data in interactive video stories can provide an alternative means to reveal, contextualise and explain data to those who are uninspired by, or unable to engage via, existing presentation forms; and ii) equipping filmmakers to exploit modern data sources as a new material in interactive videos can reveal new ways to meet expressive goals and support new viewer experiences.
Despite this potential, we know very little about the form that such data-driven interactive videos should take; what tools, processes and underpinning technologies are required to craft them; and what issues will affect viewer reception, perception and experience. In this paper, we aim to develop a foundational understanding of these issues by exploring how data has been employed in existing interactive video content. We present an analysis of 43 items of interactive video content that use data in a prominent or otherwise noteworthy fashion when telling a story. Based on this analysis, we present a design space that comprises key techniques that can be employed to tell interactive video stories with and about data, and challenges that will affect their production and reception.
In mapping such techniques and challenges, our findings can guide the creation of future data-driven video content; inform future research into tools and underpinning technologies for its production and delivery; and support the development of ethical production guidelines that ensure data use is appropriate and sensitive to issues of viewer perception, experience and privacy.

MOTIVATION AND RELATED WORK
The Challenge of Engaging the Public with Data
Empowering the public to understand and act upon the unprecedented amount of data that exists about us in modern society can provide wide-ranging personal and societal benefits, for example by transforming the way people engage with their diet, fitness, mental and physical health, transport and energy use [25,33] and by fostering civic awareness and participation [20]. However, motivating and facilitating mass, inclusive public engagement with data remains an unsolved challenge. Making data available (e.g. for download from open data portals or devices) isn't enough. Exploring raw data requires significant motivation, time and skill with tools, which most people do not have [16]. Rather, more inclusive methods are needed to motivate and enable a broader demographic to access, understand and take action in response to data about them [19].
The prevalent approach for making data accessible to the public is to provide interfaces that make the direct exploration of sources more manageable, usually by placing them in a structured graphical form. A substantial body of research in the field of data visualization has demonstrated that exploring data in these ways can be intuitive, visually compelling and, as a result, highly engaging for many users [e.g. 6]. For example, mobile applications displaying interactive graphs have proved a popular way for users to understand data from wearable health monitors [40] and other interactive visual representations have been shown to have great potential in enabling members of the public to explore open data released by governments [12].
Despite the success interactive data visualizations have had in engaging many users, research suggests that they may not be the right way to motivate and enable all people to find insight in data that exists about them. Concerns have been raised about whether the activity of seeking out and directly exploring data in graphical form, even with easy to use interfaces, will have sufficient mainstream appeal to stimulate mass engagement [24,32]. Moreover, previous research has shown that many people, in particular those from low-education backgrounds [10], may struggle to determine appropriate actions to take in response to data presented in primarily graphical form, as this can require an understanding of how low-level facts and trends, often represented across multiple different visualisations [23], relate to contextual factors and expert knowledge [8,31].

Story-based Approaches for Data Engagement
Previous work has explored how activities and media forms popular with the public can be used to broaden the appeal and accessibility of data. For example, art [17] and artistic installations [36], craft activities and products [3], gamification [38] and the fabrication of bespoke souvenirs [30] have been proposed as alternatives to reveal and explain insights within data. One approach with great potential for bringing data to people in a form that they can, and will want to, consume is storytelling [18,21]. This stands to reason: few forms of communication are as accessible, informative and engaging as a well-told story. Skilled storytellers employ established narrative techniques (e.g. genres, structures, devices, characters) in the context of their audience's knowledge and experience to provide easily understood and remembered paths through complex, expansive and conflicting sources of information [5,46]. Moreover, good stories don't just inform; they entertain. Forms like documentary can make engagement with information exciting, intriguing and challenging through the creative application of narrative techniques.
Data journalism has already demonstrated how storytelling techniques from established media forms can be applied to support data engagement, and that there is a public appetite to consume data in story form [14]. Data journalism articles allow people to gain individualised insight from complex data sources, by blending the presentation of interactive visualisations with storytelling techniques from text-based print and online journalism [39]. Moreover, the awards and critical acclaim received by such articles (including a Pulitzer Prize [28]) show how data-driven storytelling can open up new opportunities for creative expression and lead to content that is highly compelling in its own right.

Interactive Video for Data Engagement and Expression
We hypothesize that video-based storytelling forms have especially strong potential for facilitating broader public engagement with data. Few storytelling forms are consumed so regularly by such a large segment of the population. For example, 91% of adults in the UK watch television each week, viewing an average of 3 hours and 36 minutes per day, and the appeal of video content continues to grow in the online age, with consumption of paid video-on-demand services rising to 26% in 2015 [35]. Additionally, video offers a diverse range of genres (e.g. documentary, drama, comedy) with different storytelling techniques that can be applied to appeal to different groups.
Presenting data through stories is, of course, central to existing video genres, such as news and documentary [2]. However, traditional broadcast distribution, where the same content is transmitted to all viewers, has meant video stories must feature aggregate data relevant to large, homogenous audiences. Consequently, the insight that a person can receive about their data from current video storytelling approaches is significantly limited in terms of personal relevance and depth, compared to what is available to those motivated and able to use analytics tools and interactive visualisations. Recent developments in interactive video, such as web-based interactive documentaries [e.g. 15] and object-based media [e.g. 4] allow for the creation of video stories that are dynamically reconfigured in response to information about each viewer and their context. As a consequence, there are now opportunities to create interactive videos that present each viewer with an individually tailored perspective on their data.
We argue that such interactive data-driven videos can offer new ways to present data to the public that a large and diverse section of the population will be able to, and equally crucially, want to use. Moreover, we expect that equipping filmmakers to exploit modern data sources as a material in interactive video productions can reveal new ways to meet expressive goals and support the creation of new forms of media experience.

METHOD
While the coming together of data from modern sources with emergent forms of interactive video presents opportunities for increased public data engagement and new forms of expression, the form, production and reception of such data-driven interactive videos are not yet understood. In particular, the production of video content depends on established aesthetic techniques for telling effective and engaging stories. It isn't yet clear what form the equivalent techniques that incorporate data into video stories should take and how they should be tailored in response to different data sources, genres, production goals and audiences. In this paper, we seek to address this knowledge gap by developing a foundational understanding of the ways that data from modern sources has been incorporated into existing interactive video content.
To develop this understanding, we conducted an analysis of existing examples of data-driven video content. Our analysis followed a two-stage process. In the first stage, we identified a set of interactive videos that use data in a prominent or otherwise noteworthy fashion. In order to identify this set, we reviewed 300 interactive videos from the following sources: 262 from MIT's Docubase, 27 from i-docs.org, 6 from open web searches (e.g. with terms such as "data visualization video") and 5 items that the author was previously aware of. Each item was watched by a researcher and included or excluded based on the following criteria: 1) the content must be primarily video-based; 2) data must feature prominently in what is presented to the viewer; and 3) the presentation of data must extend beyond what is possible in a non-interactive video (e.g. content showing only static graphical visualizations similar to those seen on television news was excluded). This final criterion was chosen to focus our study on the relationship between data and interactive video content (see [2] for a review of data-driven storytelling in non-interactive videos).
Our first-stage analysis resulted in a set of 43 interactive videos. In the second stage, the examples in this set were subjected to a more detailed review. Each item was viewed again by a researcher in order to determine information in the following categories: the genre of video story; the type of data used; the approach employed for presenting the data in the story; any role that data played in supporting the story; any role that the story played in supporting the viewer in understanding data; any notable challenges for data-driven stories raised; and examples of best practice for addressing such challenges. It was not possible to access a working version of 4 examples. However, we were able to include these in the sample by reviewing secondary sources (e.g. available videos showing the experience). Where multiple items were part of a series, these were reviewed as individual items. This decision was made because we found that different episodes in a series often employed divergent approaches and posed different challenges. Where an item is an episode in a series, the naming convention "Series Title (Episode #: Title)" is used.
To conclude our review, we analyzed the information recorded for each category to identify, and classify the examples in terms of, a set of recurrent design features. These features were determined by grouping examples based on observed similarities, and by cross-referencing these emergent groups with storytelling literature [e.g. 13,45,46]. This process was iterative, with examples reconsidered in light of emergent features during multiple passes through the data.

CASE STUDIES
Our analysis revealed a design space of data-driven video stories that is presented in Table 1. This space maps the examples reviewed in terms of: i) design strategies that were employed to tell engaging interactive video stories about, and with, data; ii) presentation structures that were used to combine data and video together into cohesive narrative experiences; and iii) a set of further recurrent features observed (e.g. genre, form of data presentation, type of data). In presenting our design space, we draw inspiration from the method employed by Segel and Heer in their related review of data journalism content [39]. We first describe a set of detailed case studies, which each illustrate a selection of design space features. When describing each case study, we highlight the design space features described in bold. In a subsequent section, we present further analysis of the design space that reveals additional features and challenges spanning the sample.

Unspeak
In this web-based interactive documentary the viewer is introduced to Stephen Poole's concept of "Unspeak": the careful choice of language "to say something without saying it, without getting into an argument and so having to justify itself" [37]. The video portion of the experience is a traditional linear film, in which a narrator explains the concept of Unspeak drawing on a set of examples from politics and popular culture. While watching this video, the viewer has the option to pause playback and view a set of interactive data visualizations. These visualizations allow the viewer to explore data about the prevalence of different Unspeak terms on social media in the three years preceding the film's release. This data is presented in a variety of graphical forms. The data included is static (i.e. not dynamically sourced at the time of viewing) and the same for all viewers, having been recorded at production time.
"Unspeak" illustrates perhaps the most rudimentary presentation structure observed in the design space: loosely coupled data. In this structure (used in 5 examples) video content and data are presented across two distinct, companion experiences. While interacting with either experience the viewer is given the option to switch to the other, usually by clicking on a link or icon. The example also illustrates one of the most prevalent design strategies employed in the sample (30 examples), the use of video to provide contextualization for data. In "Unspeak", if the viewer were to interact with the data visualizations in isolation they may find them comparatively uninteresting. However, by accompanying these visualizations with a video that offers a rich explanation of the rhetorical power of words, the viewer is motivated to investigate how the otherwise innocuous terms shown in the graphs have been employed to frame major political events. This, in turn, gives the data a new meaning and heightened credence.

Frontline: Targeting the Electorate
"Frontline: Targeting the Electorate" is a web-based interactive documentary that explores how the electorate is targeted during political campaigns. The film provides each viewer with a personalized perspective on how they may have been targeted with information during the 2012 US presidential campaign. At the start of the experience, the viewer is informed that their location has been automatically detected from their browser. They are then asked to complete a short interactive quiz about their demographics, political views and lifestyle. This dynamic and individual data is used to assemble a bespoke video comprising interviews with experts explaining the particular strategies used to target the viewer during the campaign.
This case exemplifies the adaptive narrative presentation structure, which was observed in 3 examples. With this structure, decisions about the selection and order of video content shown to the viewer are made based on a data source. In "Frontline: Targeting the Electorate", this presentation structure is used to implement one of the most commonly observed design strategies (22 examples): the use of personal data as a device that supports and amplifies the filmmaker's intent. In the film, the intent is to make viewers aware of the concerning tactics that can be employed by politicians during elections. Adapting the narrative shown to each viewer in response to their individual data supports this in two ways. Firstly, the filmmaker creates a strong personal connection between the issue of targeting and the viewer by presenting videos that refer to the actual tactics that would be employed to target them. Secondly, by adapting the content of the film using the same data used to target voters during elections, the filmmaker is able to reinforce the authenticity of their argument by demonstrating that such targeting is viable.
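The selection logic behind an adaptive narrative can be illustrated with a minimal sketch. It is not the implementation used in "Frontline: Targeting the Electorate"; the clip names, tags and matching rule below are hypothetical and chosen only to show how a viewer profile (e.g. detected location plus quiz answers) could drive which clips are shown, while the authored order of the clips is preserved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    clip_id: str
    # Viewer attributes this clip is relevant to; an empty set means
    # the clip is generic and is shown to every viewer.
    tags: frozenset


def assemble_narrative(clips, viewer_profile):
    """Return the ids of the clips to play, in their authored order.

    A clip is selected if it is generic (no tags) or if it shares at
    least one tag with the viewer's profile.
    """
    profile = set(viewer_profile)
    return [c.clip_id for c in clips if not c.tags or c.tags & profile]


# Hypothetical clip library, listed in the order the filmmaker intends.
LIBRARY = [
    Clip("intro", frozenset()),
    Clip("swing_state_ads", frozenset({"swing_state"})),
    Clip("young_voter_social", frozenset({"age_18_29"})),
    Clip("retiree_mailers", frozenset({"age_65_plus"})),
    Clip("outro", frozenset()),
]
```

A viewer whose profile is `{"swing_state", "age_18_29"}` would receive the intro, the two matching interview clips and the outro, whereas a viewer with an empty profile would see only the generic clips.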

Coal: A Love Story
This web-based interactive documentary presents the viewer with a collection of video and other media that highlights America's dependence on coal. One section of the film, "Coal & You", allows the viewer to explore how their own lifestyle and behavior influences the issues discussed in the film, by calculating their own annual coal consumption. The viewer is presented with a linear video that shows people doing everyday activities, such as web browsing and cooking. These are overlaid with static text stating the amount of coal each activity depends on. At the end of the video, the viewer is told that the average American family burns 6,500 lbs. of coal per year, and asked how they compare. An interactive quiz then enquires about the viewer's location and everyday habits (e.g. number of hours spent on a mobile device). The answers are used to provide an individual perspective on static energy consumption data (classified as factual in our design space), which is presented as a personal coal use estimate shown via a combination of graphics and text.
"Coal: A Love Story" demonstrates a presentation structure that was observed in 18 examples in the design space: interleaved data. With this structure, the viewer is shown a series of videos in a linear sequence. At points between these videos the viewer is asked to interact with data in a format that is not video-based (e.g. a graphical interactive data visualization). This differs from the loosely coupled structure, because the viewer is not given a choice of when to interact with data, but is actively directed to explore it at defined points in the narrative. In the film, this presentation structure is used to support a design strategy found in 8 examples: the use of video to create intrigue about the content of a data source. In this case, video is not primarily used to present or explain data. Rather, the preceding video is used to heighten the viewer's interest in and, therefore, motivate their engagement with data that is subsequently presented in a different form. In this way, the strategy can be seen to leverage a crucial feature of video storytelling to engage the viewer with data: plot structure. By arranging the presentation of information into two phases of exposition (in the preceding video) and denouement (in the ensuing interactive quiz), the viewer's interest in exploring the data source is heightened. The film is also a further example of using video to provide contextualization and personal data as a device.

Do Not Track
"Do Not Track" is a 7 episode web-based interactive documentary series that aims to inform the viewer about the proliferation of web tracking and its implications. The third episode, "Like Mining", explores some of the problematic ways that companies might profile people based on the information they post on social networks. The episode is structured around a television advertisement for "Illuminus", a fictional company that uses social network profiling to make decisions about insurance policies and loans. Using an interleaved data presentation structure, the viewer is shown segments of the advert's linear video narrative. These are followed by interactive web pages that use text, graphics and a personalized image to demonstrate to the viewer that many features of the fictional product shown in the advert could actually be implemented by analyzing their individual social network data. This data is accessed dynamically from the Facebook API, if the viewer provides authorization.
This case illustrates a design strategy that was observed only in this example, which we refer to as fiction made real through data. In this strategy, video is used to illustrate a fictional scenario, which is then demonstrated to be a realistic proposition using a data source. The result in this case is that the authenticity of the fictional scenario illustrated in the video advert, and the conceivable risks it shows, is established more strongly than may be possible if it were shown on its own in a non-interactive video. While examples employing this approach could be subsumed into personal data as a device, we feel they should be marked as a distinct strategy in the design space. This is because they demonstrate how a key feature of video storytelling, the capability to richly depict fictional situations, can be exploited to increase a viewer's level of interest in data and its implications and, conversely, how individualized data can enhance the impact of fictional scenarios when included as a device in documentaries.

The Risk Taker's Survival Guide
This web-based interactive documentary explores the topic of risk. In one section of the film, a combination of adaptive narrative and interleaved data presentation structures is used to illustrate how poor many people are at assessing risk in everyday life. Half of the film's viewers are shown a video of a backpack in a busy crowd in New York's Grand Central Station, while the other half are shown a bag on an empty rural station platform. Both groups are asked to decide whether they would report this bag to station staff using an interactive quiz. Data about the percentage of viewers that chose to report each bag is then shown on the screen (Figure 1). At the time of viewing, this was much higher for the bag on the isolated platform than the busy one. In a following video segment, a narrator explains that it would be logical for more people to report the bag in the busy station, as if this contained an explosive device it could cause harm to a larger number of people.
This section of "The Risk Taker's Survival Guide" illustrates a design strategy that was observed in 8 examples in the design space: using peer data as a device. In such examples, the possibility of gathering data about, and from, viewers through their interactions with an interactive video is exploited to assemble an evidence base that supports the filmmaker's intent. In "The Risk Taker's Survival Guide" this viewer engagement data helps the filmmaker illustrate how poor people can be at assessing risk in two important ways. Firstly, by asking the viewer to participate in the creation of the data that evidences a point, the filmmaker is able to give the viewer a stronger personal connection to, and perspective on, the results than might be possible with the inclusion of pre-set statistics in a traditional video. Secondly, the use of actual data gathered from the viewers of the film to evidence arguments, rather than general statistics, has the potential to give that evidence a greater sense of authenticity.
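The mechanics of peer data as a device can be sketched as follows: each viewer is assigned to one of the video conditions, their quiz answer is recorded, and the running percentage per condition is what is later shown on screen. This is an illustrative sketch only; the class name, condition labels and the use of uniform random assignment are assumptions, not details reported for the film.

```python
import random
from collections import Counter


class PeerPoll:
    """Collects yes/no answers from viewers split across video conditions."""

    def __init__(self, conditions, seed=None):
        self._conditions = list(conditions)
        self._rng = random.Random(seed)
        self._shown = Counter()     # viewers who answered, per condition
        self._reported = Counter()  # viewers who answered "yes", per condition

    def assign(self):
        """Pick the condition (video variant) for a new viewer at random."""
        return self._rng.choice(self._conditions)

    def record(self, condition, reported):
        """Store one viewer's answer for the condition they were shown."""
        self._shown[condition] += 1
        if reported:
            self._reported[condition] += 1

    def report_rate(self, condition):
        """Percentage of viewers who chose to report, or None if no answers yet."""
        shown = self._shown[condition]
        if shown == 0:
            return None
        return 100.0 * self._reported[condition] / shown
```

Returning `None` when a condition has no answers makes the absence of peer data explicit, so that the presentation layer can decide what to show instead.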

Netwars
In this 5 episode web-based interactive documentary, the viewer is confronted with current and future risks posed by cyber warfare. All episodes in the series are narrated by an unnamed, unidentified and potentially untrustworthy character that introduces and explains a range of cyber security threats. For example, in the first episode, "Out of CTRL", the narrator introduces the risks posed by vulnerability scanning in a video segment. The viewer is then presented with an interactive page that uses text and graphics to show how vulnerable their computer configuration (e.g. their operating system) makes them to attack. This data is dynamic and individual to the viewer, having been detected automatically from their web browser.
In addition to further demonstrating the interleaved data presentation structure, personal data as a device and video to create intrigue strategies, this case also illustrates a further strategy evident in 16 of the examples reviewed: the use of character to support data engagement. The narrator in Netwars is not a neutral figure that simply introduces and explains the data presented. Rather, he is a mysterious, erratic and potentially untrustworthy figure, whose dramatic performance evokes a sense of unease and apprehension about the issues discussed. When combined with the presentation of individual data that illustrates how such issues personally affect the viewer, this application of character serves as a powerful device to grab the viewer's attention and emphasize the importance of the film's topic.

Here at Home
The topic of this web-based interactive documentary is an experiment designed to reduce homelessness in five Canadian cities. Viewers are presented with an interactive graphical visualization of data from the study, which is the same for all viewers. Scattered amongst this visualization float small pictures of people involved in the study. Clicking on a picture plays a short video interview about that person's experiences of the project.
"Here at Home" illustrates two recurrent, and closely linked, features evident in the design space that have not been seen in the case studies discussed so far. Firstly, the example illustrates a further presentation structure: augmented visualization. In the 6 examples employing this structure, the viewer experience most closely resembles exploring an interactive visualization. However, the viewer is presented with the option to view linear video content while exploring the data by, e.g., clicking on thumbnails. The result, in this case study, is that the videos connect otherwise abstract and sterile statistics to the lives and experiences of real people. This makes the case a further example of the use of character to support data engagement and video to provide contextualization.
The example also illustrates a design strategy that was exhibited in 6 examples in the design space: the use of data as a narrative structure. In "Here at Home", data doesn't just provide additional evidence and information about the individual stories shown in the videos. It also offers a meaningful route through them. Previous accounts of database documentaries have noted how the design of interfaces for exploring collections of video clips can provide an opportunity for filmmakers to "preserve narrative and argument in the documentary text" [29]. Examples employing this strategy demonstrate how interactive data visualizations can be used as the basis of such interfaces, providing a meaningful path through which a viewer can navigate and come to understand an otherwise loosely connected collection of films.

Holy Mountain
This web-based interactive documentary tells residents' stories of Montreal's Mount Royal. Using an elegantly illustrated web-based interface, the viewer can navigate between five locations on Mount Royal. At each, a selection of videos and photo essays, which each tell a Montrealer's story of the place, can be viewed. While the viewer explores each location, short messages about Mount Royal from the social network site Twitter are displayed in the corner of the screen. These messages are sourced dynamically using the Twitter API, by searching for the keywords "Mont Royal". Clicking on these messages takes the viewer to Twitter where they can interact further (e.g. by reading replies or by sending a response to the author). This case illustrates a design strategy that was observed 5 times in the design space: the use of data for liveness.
Here, the filmmaker not only creates a static record of past memories of Mount Royal recorded in video form, but also employs live data to create an experience that is more closely connected with current Montrealers' lives.

Take This Lollipop
This web-based interactive film sits in the genre of horror. The short film begins with a scene in which an angry and deranged character is stalking a victim on Facebook. The video incorporates social network data (e.g. images and text messages) gathered dynamically from the viewer's individual Facebook page into shots of the protagonist's computer screen. Additionally, the character is shown to look up directions on Google Maps, which are based on individual data about the viewer's location. As a consequence, it is made clear to the viewer that the potential victim of the character is, in fact, them. In the final scene, the character is seen to drive to his victim's location and get out of his car, where the film ends.
"Take This Lollipop" employs a number of the strategies discussed in previous case studies to support the filmmaker's intent to scare the viewer, including: personal data as a device and the use of strong character. In addition, it illustrates a further presentation structure not discussed so far: dynamic video. In this structure, the primary viewer experience resembles a traditional video (i.e. a story is told through the linear presentation of video frames). However, the data shown in this sequence varies depending on the viewer, through the inclusion of, e.g., dynamic graphics, text or images. The film also illustrates a further design strategy employed in 5 examples: the use of atmosphere to re-contextualize data. By employing established horror tropes including dramatic sounds, close-up framing, erratic cutting, hard lighting and prominent shadows, the filmmaker creates an atmosphere of unease and dread. When the viewer's own social network data is juxtaposed with this atmosphere it takes on a new meaning, frightening the viewer and, in turn, stimulating them to reflect on the data they share online.

After the Storm
"After the Storm" is a web-based interactive documentary that combines videos with interactive elements to tell one person's story of a devastating tornado. At two points in the story, viewers must scroll through lists of tornado deaths and locations that are shown as text (Figure 2). While video plays and the spoken narration continues as the viewer scrolls, the story will not progress until the viewer reaches the end of the list. In addition to providing another example of the dynamic video presentation structure, this case illustrates a strategy only seen in this example: the use of data interaction as a device. Scrolling through the lists takes time and effort, which, in turn, stresses the scale of the damage resulting from tornadoes and their frequency across the US. This supports the filmmaker in making the argument that such events "could happen to you". In this way, interacting with data is not just a means to view information, but a device supporting the telling of the story.
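The gating behaviour described here, in which the story advances only once the viewer has scrolled to the end of a list, can be sketched as a simple check on scroll position. This is an illustrative assumption about one way such a gate could work, not a description of the documentary's implementation; the function name `may_advance`, the pixel-based measurements and the tolerance value are hypothetical.

```python
def may_advance(
    scroll_top: float,
    viewport_height: float,
    content_height: float,
    tolerance: float = 1.0,
) -> bool:
    """Return True once the bottom of the viewport has reached the end of the list.

    All measurements are assumed to be in pixels. The tolerance absorbs
    sub-pixel rounding in reported scroll positions.
    """
    return scroll_top + viewport_height >= content_height - tolerance
```

Under this sketch, the effort required of the viewer grows with the length of the list, which is the property the filmmaker exploits.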

Types of Data and Level of Personalization
We classify the types of data featured into 8 categories. 58.1% of examples featured data that was sourced dynamically during the experience, from sources such as data APIs or from information available via the viewer's browser. In the majority of cases where data was dynamically included in an experience, we observed that little or no analysis was performed on it as part of its presentation in the story. Rather, in 68% of examples employing dynamic data, this was presented in its raw form (e.g. by showing a viewer's location on a map rather than drawing a meaningful inference from that location). Despite analysis of dynamic data being rare in the sample, its potential value as a storytelling strategy was illustrated in "Do Not Track (EP3: Like Mining)", where data sourced dynamically from the viewer's Facebook page was analysed to determine aspects of the viewer's personality.

Challenges and Emergent Best Practice
A number of the design strategies relied on the availability of data individual to the viewer to create a bespoke experience. If this data doesn't exist, is incomplete, or the viewer doesn't allow access, there is potential that the film may not be viewable as intended. For example, the personal impact of the fictional Illuminus reports in "Do Not Track (Episode 3: Like Mining)" is reduced if the viewer does not have, or is not willing to provide access to, Facebook data. In examples employing peer data as a device, the potential for a cold start problem [cf. 22] was noted, where arguments based on peer data cannot be made until that data has been amassed.
While some examples simply displayed an error message or made content unavailable when data was unavailable, others offered solutions to allow the viewer to engage. In "Frontline: Targeting the Electorate", if the viewer's location isn't available or isn't within the US, they are given the option to manually input their own, or an imaginary, location. This solution is likely to only be practical in cases where video stories are based on simple forms of data that can be entered easily by the viewer. A potentially more scalable solution was observed in "Do Not Track". Here, social network data from an alias, the film's narrator, is used to illustrate the fictional Illuminus advert. This means the film can be experienced by all viewers, but it is still problematic, as the impact of basing the film's fiction made real strategy on individual viewer data is lost.
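The solutions observed above can be read as an ordered fallback chain: use the viewer's own data where it is available, otherwise accept manually entered data, otherwise substitute data from an alias. The Python sketch below illustrates that ordering only; it is not drawn from any of the films reviewed, and the names `ResolvedData` and `resolve_story_data`, the source labels and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResolvedData:
    value: str
    source: str  # one of "viewer", "manual" or "alias"

    @property
    def is_personal(self) -> bool:
        # Only automatically sourced viewer data is known to be personal;
        # manual input may be imaginary and alias data belongs to someone else.
        return self.source == "viewer"


def resolve_story_data(
    viewer_value: Optional[str],
    manual_value: Optional[str],
    alias_value: str,
) -> ResolvedData:
    """Pick the first available data source, in order of preference."""
    if viewer_value:
        return ResolvedData(viewer_value, "viewer")
    if manual_value:
        return ResolvedData(manual_value, "manual")
    # The alias is always available, so every viewer can watch the film,
    # at the cost of losing the personal connection.
    return ResolvedData(alias_value, "alias")
```

Recording the source alongside the value would let a production adapt its framing, for instance by avoiding claims that the data shown belongs to the viewer when an alias has been substituted.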
The relevance of data to different viewers was also identified as a potential challenge. Multiple examples implemented the personal data as a device strategy by highlighting an individual connection between the viewer's personal data and a shared data set. Such cases rely on each viewer's individual data having sufficient relevance to the shared data for a meaningful connection to be made. This may not always be the case. For example, in "God's Lake Narrows" the viewer's location is correlated with First Nations reserves in Canada, to suggest that the problems experienced by such communities are 'closer to home' than the viewer might think. This message is undermined if the film is viewed outside of Canada, and the data shows the viewer is actually a long distance from the nearest reserve. In such cases, the use of data from an alias, which is more relevant, may provide a way to give all viewers a meaningful experience.
In examples where dynamically accessed data was used to create a sense of liveness, a further challenge was posed by the lack of control that the filmmaker has over that data. For example, in "Mount Royal", there is a risk the social network data included may be inappropriate or offensive. This could, for example, prove a barrier to the assignment of classifications to data-driven films. Where dynamic data is used as the basis of argumentation, as in examples employing the peer data as a device strategy, there is a further risk that the data will not support or otherwise conform to the filmmaker's intended message.
Offering viewers informed consent about how their data will be used in the telling of a story was also a crucial challenge revealed in the design space. In "In Limbo" the viewer provides authentication to the Gmail API at the start. Text from their emails is subsequently shown on screen with no prior warning. This unexpected display of private data could be problematic, especially if the film is viewed in a social situation. Providing viewers with a clear explanation of how their data will be used, so that they can make an informed choice about progressing, may be a way to avoid such problematic situations. However, this simple approach to consent may be incompatible with examples like "Take This Lollipop", "Netwars" and "Do Not Track", where feeling surprised about the unexpected, but creative, ways data is used is a compelling feature of the experience.

Interactive Video Can Tell Engaging Stories with Data
At the beginning of this paper we hypothesized that presenting data in interactive video stories could provide an alternative means to reveal, contextualise and explain data, to those who are uninspired or unable to engage via existing presentation forms. While empirical studies with viewers will be needed to fully explore this hypothesis, our findings provide initial evidence of the potential of this approach. Our analysis has demonstrated a range of structures and strategies (summarized in Table 2) through which video storytelling techniques can be employed to motivate and enable audiences to engage with data sources. The examples reviewed, in particular those using the contextualization strategy, show us how situating the presentation of data within the narrative of a video story can aid in communicating the meaning, importance and implications of that data to viewers. Moreover, further strategies revealed in our analysis demonstrate how the application of video-storytelling devices such as strong characters, plot structures of intrigue and resolution, genre tropes and cinematic atmosphere can make data sources and the insights they contain interesting, exciting and even unnerving for viewers. We argue that by transforming data presentation into something that is not just informative, but entertaining, these particular dramatic strategies could make video storytelling an especially effective way to inspire a large section of the population to engage.

Interactivity and Personalization Add Significant Value
The application of video storytelling techniques to make information attractive, engaging or otherwise enjoyable for [...] In examples such as "Netwars", "Do Not Track" and "Frontline: Targeting the Electorate", this allowed the filmmaker to draw on established video techniques to tell a compelling story about data, but without the restriction imposed by non-interactive video that this data should be the same for all viewers. The possibility of personalizing video stories in response to viewer data was observed to underpin a number of the design strategies, including personalization as a device and fiction made real. Additionally, it was seen to enhance the effect of other strategies including the application of atmosphere and character. For example, the effect of the erratic and unreliable narrator in "Netwars" is heightened by the use of viewers' personal data in the story.

Genre Tropes Were Central to the Design Strategies
All but two examples conformed to the norms and traditions of an established genre and three examples were seen to employ characteristics from a secondary genre when engaging the viewer with data (e.g. advertising in "Do Not Track (EP 3: Like Mining)"). This application of tropes and conventions from existing genres of video storytelling was observed to be a key way in which compelling viewer experiences could be built about and around data sources. Examples of the role played by genre ranged from the broader placement of personal data within the context of established narrative structures from documentary film to the combination of docudrama traditions [27] with a viewer's individual data in the fiction made real strategy.
The prevalent application of genre tropes demonstrates that the potential value of video storytelling doesn't just lie in the inherent features of the video form (e.g. moving image, structured presentation of information over time). Rather, our findings highlight the important role that the rich history of techniques for exploiting these features to tell compelling stories, and filmmakers' skilled application of them, can play in engaging people with data. We feel this is an important observation to make in the context of previous research on data storytelling, which has tended to focus on the value of visual storytelling and narrative structure [e.g. 2,39] and not what can be gained by building on a broader range of genre conventions and their skilled application.

Presentation Structures
Dynamic video: Different data presented to each viewer in an otherwise linear video.
Interleaved data: Data presented in interactive intervals between video segments.
Augmented data: Data visualization augmented with video.
Loosely coupled data: Video and data presented across companion experiences.
Adaptive narrative: Selection and order of video content changed based on data.

Design Strategies
Contextualization: Video contextualizes a data source.
Personal data as device: Personalization supports and amplifies the filmmaker's intent.
Video creates intrigue: Video creates intrigue about data.
Fiction made real: Fictional video scenario shown to be authentic using data.
Peer data as device: Data captured from other viewers supports the filmmaker's intent.
Use of character: Characters support the viewer's engagement with data.
Data as narrative structure: Data offers a path to navigate videos.
Data for liveness: Dynamic data provides contemporaneity for video.
Use of atmosphere: Cinematic atmosphere re-frames the viewer's experience of data.
Interaction as device: Interaction with data as a device that supports the filmmaker's intent.

The Potential of Adaptive Narrative is Underexplored
One of the most promising presentation structures revealed in the design space was that of adaptive narrative. The examples employing this structure demonstrated the rich and exciting possibility of creating data-driven video stories in which the fundamental narrative structure is changed by selecting and rearranging video content based on a data source. By changing the narrative structure of a video story in response to a data source the filmmaker may be empowered to profoundly tailor how it is contextualized and analyzed, and the conclusions and recommendations that are drawn. In doing so, the expressive and rhetorical power of the design strategies identified in this paper might be significantly enhanced. For instance, in "Coal: A Love Story" rather than presenting each viewer's personal coal use estimate as text and graphs, they might be shown a bespoke adaptive video that makes individualised behavior change suggestions based on aspects of their energy use.
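As a rough illustration of what selecting and rearranging video content based on a data source could involve, the Python sketch below assembles a playlist from a fixed introduction and ending plus topic segments chosen and ordered by a viewer's estimated energy use. It is a hypothetical sketch of the extension imagined above, not a description of "Coal: A Love Story" or any other example; the function name `assemble_narrative`, the topic labels, the file names and the `max_topics` parameter are all assumptions.

```python
from typing import Dict, List


def assemble_narrative(
    intro: str,
    outro: str,
    topic_segments: Dict[str, str],
    usage: Dict[str, float],
    max_topics: int = 2,
) -> List[str]:
    """Build a per-viewer playlist of segment identifiers.

    topic_segments maps a topic (e.g. "heating") to an authored video segment.
    usage maps a topic to the viewer's estimated energy use for that topic.
    Topics without an authored segment are ignored.
    """
    # Selection and order both depend on the data: highest usage first.
    ranked = sorted(
        (topic for topic in usage if topic in topic_segments),
        key=lambda topic: usage[topic],
        reverse=True,
    )
    chosen = [topic_segments[topic] for topic in ranked[:max_topics]]
    return [intro, *chosen, outro]
```

With usage estimates of 55 for heating, 30 for transport and 10 for lighting, this sketch would place the heating segment before the transport segment and omit lighting entirely, so two viewers with different data would see structurally different stories.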
While highly promising, the adaptive narrative presentation structure was the least prevalent in our analysis (7% of examples). The design space was instead dominated by examples where the extent of personalization offered to the viewer in response to their data was limited to including dynamic text, graphics and images within an otherwise static and linear narrative structure. In our future work, we are interested in exploring why this rich category of data-driven video stories is as yet so underexplored and how barriers to its production might be addressed by building on the wealth of prior research in non-linear content authoring, production and distribution offered by the TVX community.

The Variety of Data Sources Employed is Limited
While the design space illustrates a rich and diverse range of strategies for incorporating data into interactive video stories, the variety of data included in examples was quite limited. We were particularly surprised to find that only one example featured data from the IoT devices that account for an increasing amount of data produced in our society.
Where data individual to the viewer was incorporated into a video story the variety of sources was even more limited. Such examples only used data about viewers' social media, location, browser and viewing device. We posit that the reason for this limited data variety stems from the fact that some data sources are easier to include in an interactive video from a technical and viewer perspective. A viewer's social network data or location can be easily included using established APIs, with only a simple authentication step. In contrast, including data from a home IoT device would be technically complex and may require the viewer to conduct a complex process to configure the transfer of data from device to content. For this reason, we believe that the development of tools and infrastructure [e.g. 7] that make it easier for a broader variety of data to be included in video stories will be an important avenue for future research. This will, in turn, enable a richer variety of data-driven video content and, most crucially, allow for the design strategies identified in this paper to be used to help people understand a broader range of data. Developing tools that enable the creation of films that draw inferences from dynamic data sources, rather than just displaying data in raw form, may also open up a wealth of new production opportunities (e.g., for the personalization as a device strategy).

There is a Need to Establish Ethical Production Practice
The examples highlighted a number of ethical challenges posed by data-driven video stories. These include the lack of control over the appropriateness of content that includes dynamic data (e.g. as with the live social network data included in examples such as "Mount Royal") and the need to make viewers aware of how personal data will be used before they consent (e.g. as with the unexpected on-screen display of personal email data in "In Limbo"). Furthermore, the centrality of personalization to the design strategies suggests that data-driven videos, if not designed responsibly, may reinforce the harmful information filter bubble effects seen in other forms of personalized online media [9]. We hypothesize that research exploring how to negotiate these and other complex issues of ethics, privacy and data ownership will be crucial for creating positive audience perceptions and responses to interactive video content that is based on personal data. We argue that research establishing ethical production practices for data-driven video stories will also be particularly vital if we are to ensure that content is not damaging to audiences or society. This will be especially important if data-driven video stories extend into areas (e.g., advertising) where there is potential for the nefarious application of personalization based on viewer data.