Assignment 5 - Dino Fun World

This is a larger assignment in which you can use the tools you have learned about so far -- R, OpenRefine, Tableau or any other data analysis tools that you like. You have roughly one month to complete the assignment.

You should be working in teams of two people.

The Data

The simulated park covers a large geographic space (approx. 500 m x 500 m) and is populated with ride attractions, restaurants and food stops, souvenir and game stores, an arcade, a show hall, and a performance stage. The attractions are categorized into Thrill Rides, Kiddie Rides, Rides for Everyone, Food, Restrooms, Shopping, and Shows & Entertainment. Patterns can be found in the movement through the park and in the communications among visitors, including expected normal visit patterns and unexpected patterns. Your challenge for this year is to explore these various patterns.

A news article on a park incident:

Mayhem at DinoFun World June 10, 2014; By Mako Harrison, staff reporter

The crowd that gathered at DinoFun World this weekend to honor local star and international soccer celebrity Scott Jones were ready to hand out more than a red card to the individuals who vandalized a pavilion exhibiting Jones’s memorabilia and made off with an Olympic medal and possibly other irreplaceable items. “How could they do this to him?” sobbed one distraught fan, shortly after learning about the incident. “He means so much to all of us and our town!” The crime forced partial closure of DinoFun World and local police were on the scene shortly after the vandalism was discovered by park visitors. Security guards are being questioned to eliminate the possibility of an inside job. “Creighton Pavilion was closed and locked up tight before each show,” stated Park Chief of Security Barney Wojciehowicz. “We needed the extra security at the Stage area to ensure visitor safety since the crowds were so large there. We’ve just never had any problems at the Pavilion prior to this.” Scott Jones’s representatives were not available for comment.

Files

Questions to Answer

Your task is to look for interesting patterns in the data and to highlight those that may be related to the crime. During the analysis you are free to choose your own analysis questions. I highly encourage you to follow a logical train of thought. Think first: what data may be relevant? How can you filter the data? Do you want or need to do some pre-processing in R?

If you want to practice with the dataset first before starting an investigation into the crime, you could try to answer a few simple questions:

  • When do people arrive at and leave the park?
  • What attractions are the most popular? Are there temporal patterns about when certain places are visited?
  • How long do they stay at the park?
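
As a starting point for the arrival, departure, and length-of-stay questions, here is a minimal Python sketch. It assumes each record carries a timestamp and a visitor ID. The record layout, the timestamp format, the sample values, and the function name visit_spans are all assumptions made for illustration and are not taken from the assignment files, so adapt them to the data you actually receive.

```python
from datetime import datetime

# Hypothetical sample records: (timestamp, visitor_id). The values are made
# up; the real files may use other column names, formats, and extra fields.
SAMPLE = [
    ("2014-06-06 08:00:16", 101),
    ("2014-06-06 08:05:40", 102),
    ("2014-06-06 12:30:00", 101),
    ("2014-06-06 17:45:10", 102),
    ("2014-06-06 19:00:16", 101),
]


def visit_spans(records, fmt="%Y-%m-%d %H:%M:%S"):
    """Return {visitor_id: (first_seen, last_seen, stay)} for the records."""
    first, last = {}, {}
    for stamp, visitor in records:
        t = datetime.strptime(stamp, fmt)
        # Track the earliest and latest record seen for each visitor.
        if visitor not in first or t < first[visitor]:
            first[visitor] = t
        if visitor not in last or t > last[visitor]:
            last[visitor] = t
    return {v: (first[v], last[v], last[v] - first[v]) for v in first}


spans = visit_spans(SAMPLE)
for visitor, (arrive, leave, stay) in sorted(spans.items()):
    print(visitor, arrive.time(), leave.time(), stay)
```

The first and last record of a visitor only approximate arrival and departure; whether that approximation is good enough depends on how often positions are recorded.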

Once you get more proficient, you could look at more complicated questions:

  • Can you distinguish between people who work at and who visit the park? Who are they?
  • Can you identify people traveling together through the park?
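
One possible way to approach the question of people traveling together is to count how often two visitors are recorded at the same place at the same time. The sketch below is only one heuristic among many. The record layout (time, visitor ID, x, y), the sample values, the 0.8 threshold, and the function names co_location_counts and likely_groups are assumptions made for illustration, not part of the assignment data.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: (second_of_day, visitor_id, x, y). Values are made up.
SAMPLE = [
    (100, "a", 10, 10), (100, "b", 10, 10), (100, "c", 50, 50),
    (160, "a", 11, 10), (160, "b", 11, 10), (160, "c", 11, 10),
    (220, "a", 12, 10), (220, "b", 12, 10), (220, "c", 60, 40),
]


def co_location_counts(records):
    """Count how often each pair of visitors shares a time and position."""
    present = defaultdict(set)
    for t, visitor, x, y in records:
        present[(t, x, y)].add(visitor)
    counts = defaultdict(int)
    for visitors in present.values():
        for pair in combinations(sorted(visitors), 2):
            counts[pair] += 1
    return counts


def likely_groups(records, min_share=0.8):
    """Return pairs co-located in at least min_share of the records of the
    less frequently seen visitor of the pair."""
    seen = defaultdict(int)
    for _, visitor, _, _ in records:
        seen[visitor] += 1
    counts = co_location_counts(records)
    return sorted(
        pair for pair, n in counts.items()
        if n / min(seen[pair[0]], seen[pair[1]]) >= min_share
    )


counts = co_location_counts(SAMPLE)
pairs = likely_groups(SAMPLE)
print(pairs)
```

Exact matching on time and position is strict. With real data you may need to round timestamps and coordinates into coarser bins first, and crowded places such as queues will produce pairs that merely stand near each other.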

For your analysis report, you may want to try to answer some of the following questions:

  • At what time did the crime occur?
  • Do you have any suspicions about who could be involved in the crime?
  • Can you find the ID of Scott Jones and the IDs of the people working with him?

Generate a report

Your final report should include:

  • a title: summarizing your analysis findings in one sentence
  • identifying information such as your name, email address, and a date
  • an introduction: summarize your assignment and how you approached it. Do not yet give any details.
  • a methods section where you describe
    • what data you had available and which part of the data you used
    • which main statistical methods you used, if any
    • which data transformations you performed, if any, and whether you modified the data
    • which analysis tools you used
  • a results section: showing what you found and posing hypotheses related to the crime, if any. This should be the main part of your report. Give this section a logical structure (e.g. first we looked at this, this led us to investigate this, ... which led us to conclude that...)
  • conclusions: summarizing your work but also clearly describing potential problems and limitations of your analysis

Your report should:

  • tell a story
  • not include every analysis you performed
  • include references for (statistical) methods used and for any resources that helped you with the analysis

The report should not be overly long (fewer than 10 A4 pages). Do not include code, but do include figures and text.

Template

For your report, please use a LaTeX or Word template from this website: http://junctionpublishing.org/vgtc/Tasks/camera_tvcg.html

Grading

You will be graded on your report. Your grade will be based largely on creativity in the analysis, that is, whether you can tell an interesting story about the data involving the things you looked at and found. I will also look at how difficult the analyses you conducted were. You will get additional points for coming up with your own related analysis questions and going beyond the ones I outlined above.

Submitting the Assignment


WHAT - You should submit a single ZIP file called "YOUR_NAME-Project2.zip" via email. It should contain:

  1. A pdf file named "YOUR_NAME-Project-Report.pdf" containing the report as outlined above.

WHERE - You should email the file to petra.isenberg@inria.fr with the subject VA-Project.

WHEN - Remember that the report is due before 23:00 on Sunday, April 3rd.