Assignment 5 - Dino Fun World

This is a larger assignment in which you can use the tools you have learned about so far: R, OpenRefine, Tableau, or any other data analysis tools that you like. You have roughly one month to complete the assignment. Your report and slides are due before 23:00 on November 3rd, and you will present your results on November 4th, the final class of the Analysis Block of this course.

The Data

The simulated park covers a large geographic space (approx. 500 m x 500 m) and is populated with ride attractions, restaurants and food stops, souvenir and game stores, an arcade, a show hall, and a performance stage. The attractions are categorized into Thrill Rides, Kiddie Rides, Rides for Everyone, Food, Restrooms, Shopping, and Shows & Entertainment. Patterns can be found in the movement of visitors through the park and in the communications among them, including both expected, normal visit patterns and unexpected ones. Your challenge for this year is to explore these patterns.

A news article on a park incident:

Mayhem at DinoFun World June 10, 2014; By Mako Harrison, staff reporter

The crowd that gathered at DinoFun World this weekend to honor local star and international soccer celebrity Scott Jones were ready to hand out more than a red card to the individuals who vandalized a pavilion exhibiting Jones’s memorabilia and made off with an Olympic medal and possibly other irreplaceable items. “How could they do this to him?” sobbed one distraught fan, shortly after learning about the incident. “He means so much to all of us and our town!” The crime forced partial closure of DinoFun World and local police were on the scene shortly after the vandalism was discovered by park visitors. Security guards are being questioned to eliminate the possibility of an inside job. “Creighton Pavilion was closed and locked up tight before each show,” stated Park Chief of Security Barney Wojciehowicz. “We needed the extra security at the Stage area to ensure visitor safety since the crowds were so large there. We’ve just never had any problems at the Pavilion prior to this.” Scott Jones’s representatives were not available for comment.

Files

Questions to Answer

Your task is to look for interesting patterns in the data - and to highlight those that may be related to the crime. During the analysis you are free to choose your own analysis questions. You could start your analysis by looking at some simple things first, such as:

Simple questions

  • When do people arrive at and leave the park?
  • What attractions are the most popular? Are there temporal patterns about when certain places are visited?
  • How many attractions do people check in to during the day?
  • How long do they stay at the park?
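
As a starting point, the simple questions above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the record layout (timestamp, visitor id, event type, x, y), the event type label "check-in", and the sample rows are all made up for this example, so adapt them to whatever the actual data files contain. The function names summarize_visits and checkin_counts_by_location are likewise hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Made-up sample rows: (timestamp, visitor_id, event_type, x, y).
# ASSUMPTION: the real files may use different columns, labels, and formats.
SAMPLE_ROWS = [
    ("2014-06-07 08:00:10", "v1", "check-in", 63, 99),
    ("2014-06-07 08:05:00", "v1", "movement", 60, 90),
    ("2014-06-07 10:30:00", "v1", "check-in", 42, 37),
    ("2014-06-07 17:45:30", "v1", "movement", 63, 99),
    ("2014-06-07 09:15:00", "v2", "check-in", 63, 99),
    ("2014-06-07 12:00:00", "v2", "movement", 50, 57),
]


def summarize_visits(rows):
    """Per visitor: first record, last record, length of stay, number of check-ins."""
    times = defaultdict(list)
    checkins = defaultdict(int)
    for stamp, visitor, event_type, _x, _y in rows:
        times[visitor].append(datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"))
        if event_type == "check-in":
            checkins[visitor] += 1
    summary = {}
    for visitor, stamps in times.items():
        arrival, departure = min(stamps), max(stamps)
        summary[visitor] = {
            "arrival": arrival,
            "departure": departure,
            "stay_minutes": (departure - arrival).total_seconds() / 60,
            "checkins": checkins[visitor],
        }
    return summary


def checkin_counts_by_location(rows):
    """Count check-in records per (x, y) location as a rough popularity measure."""
    return Counter(
        (x, y) for _stamp, _visitor, event_type, x, y in rows
        if event_type == "check-in"
    )


if __name__ == "__main__":
    for visitor, info in sorted(summarize_visits(SAMPLE_ROWS).items()):
        print(visitor, info["arrival"].time(), info["departure"].time(),
              round(info["stay_minutes"], 1), info["checkins"])
    print(checkin_counts_by_location(SAMPLE_ROWS).most_common(3))
```

Taking the first and last record of each visitor as arrival and departure is a simplification. Check whether it holds for the real data, for example for visitors who appear on more than one day.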

And then look at:

More complicated questions

  • Can you distinguish between people who work at the park and people who visit it? Who are they?
  • Can you find the id of Scott Jones and the people working with him?
  • At what time did the crime occur?
  • Do you have any suspicions about who could be involved in the crime?
  • Can you identify people traveling together through the park?
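
For the last question in this list, one possible (and deliberately naive) approach is to compare the positions of every pair of visitors over time. The sketch below scores each pair by the Jaccard similarity of their (timestamp, x, y) records and reports pairs above a threshold. The record layout, the sample rows, the function name co_location_pairs, and the threshold of 0.5 are assumptions made for illustration; they are not part of the assignment data.

```python
from itertools import combinations

# Made-up sample rows: (timestamp, visitor_id, x, y).
# ASSUMPTION: the real data may be structured differently.
SAMPLE_ROWS = [
    ("08:00:00", "a", 0, 67), ("08:00:00", "b", 0, 67), ("08:00:00", "c", 99, 77),
    ("08:00:05", "a", 1, 67), ("08:00:05", "b", 1, 67), ("08:00:05", "c", 98, 77),
    ("08:00:10", "a", 2, 67), ("08:00:10", "b", 2, 67), ("08:00:10", "c", 97, 77),
    ("08:00:15", "a", 3, 67), ("08:00:15", "b", 3, 68), ("08:00:15", "c", 96, 77),
]


def co_location_pairs(rows, threshold=0.5):
    """Return (visitor_a, visitor_b, score) for pairs that share many positions.

    The score is the Jaccard similarity of the two visitors' sets of
    (timestamp, x, y) records: shared records divided by all distinct records.
    """
    positions = {}
    for stamp, visitor, x, y in rows:
        positions.setdefault(visitor, set()).add((stamp, x, y))
    pairs = []
    for a, b in combinations(sorted(positions), 2):
        shared = len(positions[a] & positions[b])
        total = len(positions[a] | positions[b])
        score = shared / total
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs


if __name__ == "__main__":
    for a, b, score in co_location_pairs(SAMPLE_ROWS):
        print(a, b, round(score, 2))
```

Exact matching on timestamp and position is strict, and comparing all pairs becomes slow with many visitors. On the real data you will probably need to bin timestamps, allow a small distance between positions, or pre-group visitors (for example by arrival time) before comparing them.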

Generate a report

Your final report can be produced with any editor you like.

  • It should include
    • a title: summarizing your analysis in one sentence
    • your name and identifying information such as email address and a date
    • an introduction: summarize your assignment and your contribution
    • a methods section: describe which data you considered and why, which statistics (if any) you used, and whether and how you transformed or modified the data
    • a results section: showing what you found - posing hypotheses related to the crime if any
    • conclusions: summarizing your work but also clearly describing potential problems and limitations of your analysis
  • It should tell a story
  • It should not include every analysis you performed
  • References should be included for (statistical) methods used and for any resources that helped you with the analysis

The report should not be overly long (<10 A4 pages). Do not include code - but do include figures and text.

Generate a presentation

The presentation should summarize your report (in fact - it may be useful to first create the presentation - as it will help you write the report). Follow the same structure as for your report. On Nov. 4th you will have 15 minutes to present the results of your analysis.

Grading

You will be graded on your presentation and report. Your grade will be based largely on creativity in the analysis, that is, on whether you can tell an interesting story about the data involving the things you looked at and found. I will also consider how difficult your analyses were. If you only provide very simple graphs (and don't attempt any of the more complicated questions outlined above), you will pass, but your grade will not be very high. You will get additional points for coming up with your own analysis questions and going beyond the ones I outlined above.

Submitting the Assignment


WHAT - You should submit a single ZIP file called "YOUR_NAME-Project2.zip" via email. It should contain:

  1. A pdf file named "YOUR_NAME-Project-Report.pdf" containing the report as outlined above.
  2. A pdf file named "YOUR_NAME-Project-Slides.pdf" containing the slides for your presentation.

WHERE - You should email the file to petra.isenberg@inria.fr with the subject VA-Project.

WHEN - Remember that the assignment is due before 23:00 on November 3rd.

Bring your presentation on your laptop to class on November 4th to present your results.