VAST Challenge 2015: Mayhem at Dinofun World

A fictitious amusement park and a larger-than-life hometown football hero provided participants in the VAST Challenge 2015 with an engaging yet complex storyline and setting in which to analyze movement and communication patterns. The datasets for the 2015 challenge were large, averaging nearly 10 million records per day over a three-day period, with a simple, straightforward structured format. The simplicity of the format belied a complex wealth of features contained in the data that needed to be discovered and understood to solve the tasks and questions that were posed. Two Mini-Challenges and a Grand Challenge composed the 2015 competition. Mini-Challenge 1 contained structured location and date-time data for park visitors, against which participants were to discern groups and their activities. Mini-Challenge 2 contained structured communication data consisting of metadata about time-stamped text messages sent between park visitors. The Grand Challenge required participants to use both movement and communication data to hypothesize when a crime was committed and identify the most likely suspects from all the park visitors. The VAST Challenge 2015 received 71 submissions, and the datasets were downloaded, at least partially, from 26 countries.


INTRODUCTION
The Visual Analytics Science and Technology (VAST) Challenges [1] aim to advance visual analytics through a series of competitions. In the VAST Challenges, researchers and software developers put themselves in the role of analysts to determine if their tools, techniques and approaches can address the specified problems effectively. VAST Challenge problems provide both realistic tasks and synthetic data sets, which live on after the completion of each year's challenge and are used for education, software evaluation, and demonstration of new techniques.
The Challenge consisted of two Mini-Challenges and a Grand Challenge requiring integration and synthesis of information from the minis. The Mini-Challenges were tightly related, as they both involved analysis of human behavior within the fictitious amusement park. The scenario provided to the contestants was as follows: "DinoFun World is a typical modest-sized amusement park, sitting on about 215 hectares and hosting thousands of visitors each day. It has a small-town feel, but it is well known for its exciting rides and events. One event last year was a weekend tribute to Scott Jones, internationally renowned football ("soccer," in US terminology) star. Scott Jones is from a town nearby DinoFun World. He was a classic hometown hero, with thousands of fans who cheered his success as if he were a beloved family member. To celebrate his years of stardom in international play, DinoFun World declared "Scott Jones Weekend", where Scott was scheduled to appear in two stage shows each on Friday, Saturday, and Sunday to talk about his life and career. In addition, a show of memorabilia related to his illustrious career would be displayed in the park's Pavilion. However, the event did not go as planned. Scott's weekend was marred by crime and mayhem perpetrated by a poor, misguided, and disgruntled figure from Scott's past. While the crimes were rapidly solved, park officials and law enforcement figures are interested in understanding just what happened during that weekend to better prepare themselves for future events. They are interested in understanding how people move and communicate in the park, as well as how patterns change and evolve over time, and what can be understood about motivations for changing patterns." This year's challenge scenario is set in an amusement park similar to Hersheypark in Pennsylvania or Alton Towers in England.
The simulated park covers a large geographic space (approximately 500 × 500 m²) and is populated with ride attractions, restaurants and food stops, souvenir and game stores, an arcade, a show hall, and a performance stage. The rides can be categorized as kiddie rides, general rides, or thrill rides. This setting is the backdrop for individual and group movements and the establishment of patterns of life behaviors of visitors. Figure 1 shows the Dinofun World amusement park layout. The attractions are numbered, and contestants were provided with a list of the attraction names, which follow the dinosaur theme. The red line indicates the visitor pathway through the park, although dark green areas are also areas where people can move. For example, attractions 24 and 30 are log flume and water rapids rides for which spectators may be located "inside" of the ride boundaries. For other rides, people are not allowed inside ride boundaries. Attraction 63 is a show stage area, which will be populated during performances. The area is divided into a 100x100 grid to assist in specifying people's locations. Visitors carry a mobile device that enables location tracking through the park, records check-ins to rides, and logs text messages that are sent. Attractions have a grid point representing the visitor entry location for the purposes of the challenges.
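The grid description above implies a cell size of roughly 5 m on a side. The following minimal sketch works through that arithmetic. It assumes the 100x100 grid evenly covers the approximately 500 m by 500 m park and that grid indices start at 0; neither detail is stated in the challenge description, and the function names are hypothetical.

```python
# Assumptions (not stated in the challenge text): the 100x100 grid evenly
# covers the approximately 500 m x 500 m park, and grid indices start at 0.
PARK_SIDE_M = 500.0
GRID_CELLS = 100
CELL_SIZE_M = PARK_SIDE_M / GRID_CELLS  # about 5 m per cell


def grid_to_meters(gx, gy):
    """Return the approximate centre of a grid cell in meters."""
    return ((gx + 0.5) * CELL_SIZE_M, (gy + 0.5) * CELL_SIZE_M)


def grid_distance_m(a, b):
    """Approximate straight-line distance in meters between two grid cells."""
    dx = (a[0] - b[0]) * CELL_SIZE_M
    dy = (a[1] - b[1]) * CELL_SIZE_M
    return (dx * dx + dy * dy) ** 0.5
```

Under these assumptions, two visitors in adjacent cells are about 5 m apart, which gives a rough scale for deciding when people are "together".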
People and other park elements in this scenario were modelled by software agents. The agents received plans for moving around the park, according to several pre-defined people-types and group-types. The people-types primarily follow age characteristics, such as adult, teen, child, and infant. Accordingly, teens "raced" through the park, children stayed close to their adults, and so on. The challenge developers specified several group types based on their personal experiences with groups in amusement parks. As each person travelled through the park, their location was logged to a file every second, so contestants could track their journey using visualization tools. Groups allowed for more complex behavior patterns than individual agents. The developers defined a set of groups, their behaviors, and statistics for contestants to follow across all of the days of the park simulation. Each ride was modelled after a ride found in existing amusement parks around the world. Information about ride capacities and durations was gathered from various amusement park sales guides found on the web and from observing videos of ride behaviors on YouTube. People would essentially go "off-grid" when they were on a ride; no movement data was recorded. Visitors would also queue up for rides; however, if they saw a ride line requiring an overly long wait, e.g., an hour, they could pass it by and continue to their next scheduled ride.

SCOPE OF VAST CHALLENGE 2015
As mentioned above, the VAST Challenge 2015 consisted of two independent Mini-Challenges and a Grand Challenge. Teams were invited to participate and submit to one or both Mini-Challenges as well as the Grand Challenge. This year, we encouraged participants to create innovative visualizations to support their analyses of the data. There were many different features within the data sets that could use creative approaches to analyze; even if a particular approach didn't address the entire Challenge problem set, we encouraged teams to enter with their new ways of working with this data. As in previous years, entries required both a written response to challenge questions with supporting illustrations, and an explanatory video, which was useful for illustrating human interactions important to the solution.

Challenge Tasks
The two individual Mini-Challenge tasks and the Grand Challenge are summarized below. Descriptions of the tasks are posted at http://www.vacommunity.org/VAST+Challenge+2015. All Mini-Challenge materials are archived in the Visual Analytics Benchmark Repository [2].

Mini-Challenge 1: Visitor Movement
Mini-Challenge 1 focused on movement of people around the park. Participants were asked to characterize the movement of groups and individuals, with a special emphasis on what might be relevant to better understanding the incident that occurred during the "Scott Jones Weekend" event. Contestants had access to movement tracking information for all paying park visitors over the three days of the celebration. In contrast to previous years, we allowed participants to use data from both Mini-Challenges to complete a single Mini-Challenge. The datasets provided were .csv files for Friday, Saturday, and Sunday, containing a date-time stamp, a visitor ID, a tag as to whether the record referred to a movement within the park grid or a "check-in" to an amusement park ride, and a grid location (x,y coordinates).
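As an illustration of the record structure described above, the following sketch parses a few rows into typed records and separates check-ins from movement. The column names (`Timestamp`, `id`, `type`, `X`, `Y`), the timestamp format, the tag strings, and the sample rows are all assumptions made for this example rather than a specification of the released files.

```python
import csv
import io
from datetime import datetime
from typing import NamedTuple


class MovementRecord(NamedTuple):
    timestamp: datetime
    visitor_id: str
    tag: str  # assumed values: "movement" or "check-in"
    x: int
    y: int


def parse_movement_csv(text):
    """Parse movement rows from CSV text that starts with a header row.

    The column names and timestamp format are assumptions for this sketch.
    """
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        records.append(MovementRecord(
            timestamp=datetime.strptime(row["Timestamp"], "%Y-%m-%d %H:%M:%S"),
            visitor_id=row["id"],
            tag=row["type"],
            x=int(row["X"]),
            y=int(row["Y"]),
        ))
    return records


# Made-up sample rows; the IDs, dates, and coordinates are illustrative only.
SAMPLE = """Timestamp,id,type,X,Y
2014-06-06 08:00:16,1001,check-in,63,99
2014-06-06 08:00:17,1001,movement,63,98
2014-06-06 08:00:17,1002,movement,0,67
"""

records = parse_movement_csv(SAMPLE)
check_ins = [r for r in records if r.tag == "check-in"]
```

With roughly 10 million rows per day, a streaming or columnar reader would likely be preferable in practice; the list-based version is kept here only for readability.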
Questions asked of the participants were as follows:
1. Characterize the attendance at the park on this weekend. Describe up to twelve different types of groups at the park on this weekend.
2. How big is the group type?
   a. Where does this type of group like to go in the park?
   b. How common is this type of group?
   c. What are your other observations about this type of group?
   d. What can you infer about the group?
   e. If you were to make one improvement to the park to better meet this group's needs, what would it be?
   Please limit your response to no more than 12 images and 1000 words.
3. Are there notable differences in the patterns of activity in the park across the three days? Please describe the notable difference you see. Please limit your response to no more than 3 images and 300 words.
4. What anomalies or unusual patterns do you see? Describe no more than 10 anomalies, and prioritize those unusual patterns that you think are most likely to be relevant to the crime. Please limit your response to no more than 10 images and 500 words.
The definition of "group" was intentionally left to the contestants to determine, so that they could best formulate their response within the context of their working hypotheses and evidence. With respect to the data generation, groups were created with several specific characteristics that influenced their movement. For example, a large family group may have 1-3 adults and 1-5 children. An ambitious family group would move more quickly through the park, and spend more time on thrill rides than other ride types. They would visit shopping stalls in the evenings. They would arrive around 08:00 and exit around 23:00. There was also a possibility that at some point during the day, the group would split up according to people-types; the adults and children (e.g., independent teens) would travel around the park in different ways.
Park operations would impact groups. For example, when a thrill ride shut down for repairs, it would affect the agendas for the teens mentioned above, more than for parents and very young children focused on kiddie rides.
A major disruption to the movement patterns of park guests occurred on Sunday. As mentioned in the introduction, Scott Jones would have shows twice a day throughout the weekend. On Sunday, the afternoon show was cancelled. This meant that groups that came to see Scott were not allowed into the Stage area; some visitors still came to the Stage area at show time but were unable to check in.
There were several unusual patterns exhibited by groups and individuals throughout the weekend. A very large group represented a touring party that moved around and kept together the entire time they were in the park. A smaller group approached a specific thrill ride several times without checking in, before finally checking in after several iterations. The developers called this the "dare-you" group. A pattern involving several groups occurred when visitors stopped at a specific food stand, and then shortly after their visit, went to the first-aid building. Eventually, the food stand was closed for that day.
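One way a contestant might look for the "dare-you" behavior is to count how often a visitor reaches a ride's entry point and then leaves without a check-in. The sketch below is one possible heuristic, not the method used by the challenge developers or by any team. It assumes "approaching" can be simplified to standing exactly on the ride's entry grid cell, and that the tag strings are `"movement"` and `"check-in"`; the function name is hypothetical.

```python
def count_approaches_without_check_in(records, entry_cell):
    """Count visits to a ride's entry cell that end without a check-in.

    `records` is a time-ordered list of (tag, x, y) tuples for ONE visitor,
    where tag is "movement" or "check-in" (assumed tag strings). A visit
    starts when the visitor arrives at `entry_cell` and ends when they
    move to any other cell.
    """
    approaches = 0
    at_entry = False
    checked_in = False
    for tag, x, y in records:
        here = (x, y) == entry_cell
        if here:
            if not at_entry:
                # A new visit to the entry cell begins.
                at_entry, checked_in = True, False
            if tag == "check-in":
                checked_in = True
        elif at_entry:
            # The visitor has left the entry cell; close the visit.
            if not checked_in:
                approaches += 1
            at_entry = False
    # A visit still open at the end of the data counts if it had no check-in.
    if at_entry and not checked_in:
        approaches += 1
    return approaches
```

A visitor whose count is high for one thrill ride and who eventually checks in there would be a candidate for this pattern. A real analysis would probably relax the exact-cell test to a small radius around the entry point.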

Mini-Challenge 2: Visitor Communication
As mentioned in the web site pages for Dinofun World that were part of the auxiliary data provided to participants, the park provided an app that allowed guests to send text messages to members of their visiting group or friends that they made during their visit to the park. It also allowed "external" messages sent to people outside of the park. The web site also describes the "Cindysaurus Trivia Game" that allowed guests to play for prizes during their visit. These two methods of communication formed the basis of the analysis needed for this Mini-Challenge.
Participants were asked to characterize the communications traffic throughout the park. The data for Mini-Challenge 2 consisted of three days' worth of communications from Friday through Sunday. The data fields were a timestamp, the originator's ID, the recipient's ID, and the park area from which the message was sent. As can be seen in the shading of the map above, the park was broken up into five themed areas: the Entry Corridor, Kiddie Land, Tundra Land, Wet Land, and Coaster Alley. So, while these locations were not precise, they indicated general geo-coordinate information for the analyses.
Participants were asked to characterize dominant communication IDs, interesting communication patterns, and suspicious patterns that could contribute to the analysis of the crime. The specific questions were as follows:
1. Identify those IDs that stand out for their large volumes of communication. For each of these IDs:
   a. Characterize the communication patterns you see.
   b. Based on these patterns, what do you hypothesize about these IDs?
   Please limit your response to no more than 4 images and 300 words.
2. Describe up to 10 communications patterns in the data. Characterize who is communicating, with whom, when and where. If you have more than 10 patterns to report, please prioritize those patterns that are most likely to relate to the crime. Please limit your response to no more than 10 images and 1000 words.
3. From this data, can you hypothesize when the vandalism was discovered? Describe your rationale. Please limit your response to no more than 3 images and 300 words.
There were two huge users of the communications facility: the Cindysaurus Trivia Game service and the Park Help Desk. The two IDs used by these services were not specifically called out in the data or identified; however, each had revealing patterns. The Trivia Game sent out trivia questions every five minutes to all Park visitors. Park visitors who wanted to play the game would respond, which produced a large and diverse volume of communication to a single recipient. The Park Help Desk showed a large communication volume continuously throughout the day. The pattern would include messages to and from this single ID, but it was dominated by one-to-one communications. If a Challenge contestant provided a reasonable substitute hypothesis about the communications instead of the Trivia Game or Help Desk, that was accepted as a correct hypothesis.
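A simple starting point for finding such dominant IDs is to tally, for each ID, how many messages it sent and received and how many distinct partners it had. The sketch below is only an illustration under the assumption that each message has been reduced to a `(sender, recipient)` pair; the function names are hypothetical and this is not presented as any team's approach.

```python
from collections import Counter, defaultdict


def summarize_ids(messages):
    """Summarize per-ID traffic from (sender, recipient) pairs.

    Returns {id: (sent, received, distinct_recipients, distinct_senders)}.
    """
    sent, received = Counter(), Counter()
    recipients, senders = defaultdict(set), defaultdict(set)
    for sender, recipient in messages:
        sent[sender] += 1
        received[recipient] += 1
        recipients[sender].add(recipient)
        senders[recipient].add(sender)
    ids = set(sent) | set(received)
    return {
        i: (sent[i], received[i], len(recipients[i]), len(senders[i]))
        for i in ids
    }


def top_by_volume(summary, n=2):
    """Return the n IDs with the largest combined sent + received count."""
    return sorted(
        summary,
        key=lambda i: summary[i][0] + summary[i][1],
        reverse=True,
    )[:n]
```

An ID that sends to very many distinct recipients at regular intervals would be consistent with a broadcast service such as the trivia game, while an ID with balanced sent and received counts spread across many partners would be more consistent with a help desk.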
The second task asked for up to 10 communications patterns. We did not specify any characteristics for patterns we wanted the contestants to report, but patterns related to the incident were present in the data. Some innocuous patterns include group leaders sending bulk messages to their groups to request meetups or to communicate an interesting bit of information. Some of the message recipients responded. The pattern was repeated at other times during the day. When people were standing in a ride queue or at a food and drink attraction, they had an increased possibility of making new friends with people nearby who did not accompany them to the park. They could communicate with new friends during the rest of their time in the park. There was an increased likelihood that people would send external messages when Scott Jones was in the park (8:45-11:35 each day and 13:45-16:30 on Friday and Saturday), specifically if they were near Scott as he traveled through the park.
Some communications patterns were associated with the crime. For example, there was an increase in messages among group members, to the help desk, and to external contacts when the Scott Jones memorabilia vandalism was discovered.
Participants were asked to hypothesize when the vandalism was discovered. From the communication data, it was possible to see that people visiting the Pavilion shortly after the show increased their contact with external contacts, other group members, and the help desk, supporting the hypothesis that the vandalism was discovered immediately after the first show. Additional communication, particularly external communication, occurred as the police moved through the park to investigate shortly after 12 noon.
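The kind of surge described above can be looked for by counting messages in fixed time windows and finding the busiest one. The following sketch assumes message timestamps are already parsed into `datetime` objects and that the bucket width divides 60 evenly; both the function names and the five-minute default are illustrative choices, not part of the challenge.

```python
from collections import Counter
from datetime import datetime


def messages_per_bucket(timestamps, bucket_minutes=5):
    """Count messages per fixed-width time bucket, keyed by bucket start.

    `bucket_minutes` is assumed to divide 60 evenly (e.g. 1, 5, 10, 15).
    """
    counts = Counter()
    for ts in timestamps:
        minute = (ts.minute // bucket_minutes) * bucket_minutes
        counts[ts.replace(minute=minute, second=0, microsecond=0)] += 1
    return counts


def busiest_bucket(counts):
    """Return (bucket_start, count) for the bucket with the most messages."""
    return max(counts.items(), key=lambda kv: kv[1])
```

Restricting the input to messages sent from one park area, or to external recipients only, before bucketing would make a local surge easier to separate from the regular trivia traffic.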

Grand Challenge: Uncovering a Nefarious Plot
The Grand Challenge required contestants to blend knowledge obtained from the two Mini-Challenges to answer questions of interest to law enforcement officials. How was the crime executed and who was responsible?
The questions asked of Grand Challenge participants were as follows:
1. Scott is not a paying customer and does not have an ID. Describe Scott Jones' activities in the park during the three-day weekend. Who does he spend most of his time with? When does he arrive? When does he leave? What route does he follow? Limit your response to no more than 10 images and 1000 words.
2. Identify up to 8 issues with park operations during the three-day weekend. Provide a rationale for your answers. Limit your response to no more than 8 images and 800 words.
3. For the crime, describe the following, and provide a rationale for your answer.
   a. When did the crime occur?
   b. Where did the crime take place?
   c. Who are the most likely suspects in the crime?
   Limit your response to no more than 5 images and 500 words.
Scott Jones and his eight-person entourage were at the park all three days. Scott himself was not wearing a sensor, but Scott and his entourage were always together. Scott's movements could be tracked by identifying the movements of his entourage, who were tracked and had very distinctive movement profiles. Scott had six appearances scheduled (two each day). He performed in the first five shows, but the sixth was canceled due to concern about Scott's safety. Each day, Scott arrived and left at the same time for both the morning and afternoon shows. The movement patterns were identical, maximizing his visibility as he strolled through the park en route to the Stage area.
Park operations had some interesting impacts on park visitor behavior and movements. The Dinofun World park app had some problems. On Saturday, around 15:53, data loss occurred for some visitors. This happened again on Sunday, around 10:23. There was also a problem with the data reported for visitor ID 1983765 (this turned out to be the prime suspect). Starting at 20:18 on Saturday, he tampered with his app in a test of disabling the tracking feature. This created spurious duplicate entries of movement placing him in a different part of the park. The app was reinitialized on Sunday by Park staff.
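Duplicate entries of this kind could be surfaced by checking whether any visitor is reported at more than one location for the same timestamp. The sketch below assumes the spurious entries share a timestamp with genuine ones, which the description above does not state explicitly, and it takes records as `(timestamp, visitor_id, x, y)` tuples; the function name is hypothetical.

```python
from collections import defaultdict


def conflicting_positions(records):
    """Find (visitor_id, timestamp) pairs reported at more than one location.

    `records` is an iterable of (timestamp, visitor_id, x, y) tuples.
    Returns {(visitor_id, timestamp): sorted list of distinct (x, y)}.
    """
    seen = defaultdict(set)
    for timestamp, visitor_id, x, y in records:
        seen[(visitor_id, timestamp)].add((x, y))
    return {key: sorted(cells) for key, cells in seen.items() if len(cells) > 1}
```

Exact repeats of the same row are ignored because positions are stored in a set; only genuinely conflicting locations are reported.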
The crime occurred on Sunday, between 9:15 and 11:33 in Location 32, the Creighton Pavilion. As originally planned, the most likely suspects were to have been ID 1983765 (representing Eddie Smith, the prime suspect) and his accomplices, ID 1089132 and ID 1723967. The prime suspect and the two accomplices performed surveillance of the Creighton Pavilion and the park perimeter and exits on Friday and Saturday.
The Pavilion closed 9:30-11:30 and 14:30-16:30 every day during the Scott Jones shows. The park was short on security, so it had to close this exhibit in order to provide sufficient security for the Scott Jones show. On Sunday, the prime suspect was to have remained in the Pavilion while it was closed to the public, while his accomplices remained on watch outside. He moved early, which produced suspicious behaviors but also meant that he was not clearly identifiable as the prime suspect. Three other IDs remained in the Pavilion during its closed period and appeared to be likely prime suspects. As we always do with VAST Challenge analysis, all reasonable hypotheses, evidence, and explanations were fully credited as we reviewed the submissions.
The vandalism in the Pavilion was discovered by some of the first park visitors who went into the Pavilion after its 11:30 re-opening on Sunday. This is indicated by the increase in communications as the park visitors discovered the vandalism, reported it, and talked about it among their groups and with their friends and family outside the park.

REVIEW PROCESS
The VAST Challenge committee recruited reviewers with expertise in visual analytics, visualization, and related disciplines, as well as domain experts. Over 77 reviewers participated, each providing from 1 to 6 reviews. Each submission received 3 to 5 anonymous peer reviews. All reviewers were given the opportunity to recommend entries for award consideration.
Reviewers were asked to provide an overall rating, comments on that rating, and a review of how well task questions were answered and how well visual analytics were applied, including whether innovative tools were created for the challenge. Reviewers could comment on compelling features from either a visualization perspective or from an analyst's perspective.
The VAST Challenge Committee held two separate one-day meetings to determine awards for each of the Mini-Challenges and Grand Challenge. During each meeting, the committee considered the reviewer award recommendations and finalized the list of awards and honorable mentions based on all reviewer scores and comments. The committee also identified noteworthy aspects of submissions to be mentioned during their presentations at the VAST Challenge Workshop in October.

VAST CHALLENGE 2015 RESULTS
The submissions recognized for awards and honorable mentions in 2015 are listed in Table 1. Additional information about the Challenge entries can be found in the Challenge papers included in the VAST 2015 electronic proceedings, and shortly in IEEE Xplore and in the Visual Analytics Benchmark Repository.

Mini-Challenge 1 Awards
Mini-Challenge 1 required contestants to look at movement patterns both in the large and in detail. KU Leuven provided great detail about movement, ride, and dwell patterns along with an insightful analysis. Middlebury offered a useful integrated analysis environment (Figure 2) with a large display format that clearly showed detailed patterns about individuals and groups. James Skinner, an independent participant, and Gabriel Rossiter, from University College London, provided an insightful and entertaining video that played off this year's dinosaur theme. Peking University submitted a clear and concisely organized entry, which featured a collaborative analysis system. Konstanz employed a highly detailed, coordinated analysis space featuring rich visualizations.

Mini-Challenge 2 Awards
Analyzing communications metadata, either with or without movement data, made it difficult to determine what aspects of the Park app interactions were relevant to the crime. Zhejiang developed a custom tool to analyze the temporal relationships among the communications, as well as the network of communications that resulted. Purdue followed the large-scale interactions in the park, and identified mass interactions, plus details such as active "middlemen" involved in bridging groups. New York University received an award for the high degree of interaction in their analysis (Figure 3), which stood out in their submission video. The result was a compelling analysis and description of the park activities. Central South University was awarded for their application of advanced analytic techniques, including community detection algorithms, in their analysis.

Grand Challenge Awards
A record twelve teams submitted Grand Challenge entries, all of which were of high quality. This year, the combined team from Arizona State University and University of Stuttgart was recognized with an award for Outstanding Comprehensive Submission; their entries for Mini-Challenge 1, Mini-Challenge 2, and the Grand Challenge stood out to the reviewers and judges (Figure 4).
With an extremely clear and detailed analysis, Konstanz was given an honorable mention for their work in detection and reporting of subtle features of the datasets in their entry. TU-Darmstadt caught the reviewers' eyes with intuitive interaction and animation in their entry (Figure 5). Hong Kong University of Science and Technology also featured advanced cross-visualization interactions and a clear, straightforward analysis video.

DISCUSSION
The VAST Challenge 2015 repeated the format from last year with Mini-Challenges and an overarching Grand Challenge. However, the two Mini-Challenges were not as substantially dissimilar from each other as they were in VAST Challenge 2014. Presumably, this provided the opportunity for contestants to more conveniently consider Grand Challenge aspects as they proceeded through the Mini-Challenges. The following section includes observations made by the VAST Challenge committee about this year's competition.

GENERAL OBSERVATIONS
The Challenge received 71 submissions across the Mini-Challenges and Grand Challenge. This was rather surprising, since typically fewer Mini-Challenges result in fewer participants overall. The participation figures across all of the years of the VAST Challenge are shown in Table 2.
The committee was surprised to find contestants were very conservative in the use of their visualizations; many conventional visualization approaches were applied to this challenge. We encourage the application of innovative visualization techniques and experimental approaches to the VAST challenges, even at the expense of getting a correct or complete answer.
There were many different entities to analyze in this challenge, including people, rides, and messages. To distinguish between entity characteristics or groups, contestants often resorted to visualizations that used rainbow color maps or multiple conflicting color schemes within a single tool. With a large collection of entities being displayed in a single visualization, this practice results in the viewer becoming visually overloaded quite quickly and unable to determine information of significance within the display. We encourage contestants to follow established visualization best practices when applying a color map to a display, or consider visual encodings other than color when working with data that has similar characteristics to this year's challenge data.
There were very good analyses using either commercial tools or custom-built tools. This resulted in contestants finding errors, or, stated more gently, unintended features, in the datasets. This was quite encouraging from the committee's perspective, as the sharp eyes and insights of the community will encourage us to strive for a very high degree of quality in the VAST Challenge offerings. VAST Challenge datasets are traditionally developed to be very "clean". That is, contestants are not typically required to do extensive pre-processing to remove errors, conflicts, or other messiness that comes with real-world data; we leave those kinds of challenges to other venues and contests. When present, the types and extents of data problems are carefully gauged to highlight how the community would deal with a certain form of data issue relevant to a particular challenge (as opposed to the entire spectrum of what is possible).
Presentation matters! There were several submissions that were viewed favorably by the judges because contestants paid extra attention to communicating their findings visually in an effective and novel way in either their analysis summary or the presentation video. Communicating findings within the data and making them memorable was not explicitly called out as an integral part of the VAST Challenge 2015 concerns. However, effective briefing remains an important (and under-addressed) real-world consideration for visual analytic tools. These examples of briefing techniques suggest new ways that analysts might link the data and visual representations to share the story they found in the data with others. By addressing briefing, we not only consider our primary expert analyst needs, but also the needs of our fictitious park officials and law enforcement agents: we imagine new ways that data might be transformed, visualized, and consumed by this secondary audience and put to use in service of apprehending the suspects.
The videos provided with the submissions were of good quality this year. The explanations tended to be clear and reasonably well-structured. We encourage future participants to ensure that their video highlights how their tool was used to analyze the data and solve the challenge problems and not simply demonstrate the features and functionality of a custom tool.

VAST CHALLENGE AS A RESEARCH COMMUNITY RESOURCE
The VAST Challenge committee continues to receive requests for archived data and advice concerning its use in student research projects and in visual analytics-related classes. Researchers continue to use archived VAST Challenge datasets, in conjunction with the scenario ground truth, as a public domain resource for analytic experimentation and validation. We continue to appreciate the support from the University of Maryland in maintaining the Visual Analytics Benchmark Repository, where all previous VAST Challenge materials are freely available for download.

TOWARD VAST CHALLENGE 2016
The VAST Challenge 2015 was a great success: the number of participants, the number of data downloads, and the number of participating countries all set records. VAST Challenge 2016 will mark the tenth anniversary of the event, and the VAST Challenge committee plans to celebrate this milestone at the upcoming conference in Baltimore, Maryland.