Progressive Data Analysis: Roadmap and Research Agenda

Jean-Daniel Fekete, Danyel Fisher, and Michael Sedlmair
Eurographics Association, 2024, ISBN: 978-3-03868-270-7, DOI: 10.2312/pda.20242707

We live in an era where data is abundant and growing rapidly; databases storing big data sprawl past memory, reach computation limits, and become increasingly distributed. Engineers are designing new hardware and software systems, with new storage management and predictive computation, to sustain this growth. Yet, while good for data at scale, these infrastructures do not support exploratory data analysis (EDA) effectively. EDA allows analysts to make sense of data with little or no known model. It is essential in many application domains, from network security and fraud detection to epidemiology and preventive medicine. Data exploration proceeds through an iterative loop: analysts run computations on the data, the results come back, usually as visualizations, and the analysts interact with those visualizations to drive the next computation. EDA therefore calls for highly responsive systems: at a latency of 500 ms, users change their querying behavior; past five or ten seconds, users abandon tasks or lose attention. To address this problem, a new computation paradigm has emerged in the last decade: Progressive Data Analysis.

This book is an introduction to this new paradigm. It explains the main scientific and technical benefits of performing complex data analysis progressively on large data. It also introduces the challenges that must be addressed before the paradigm becomes fully usable. These challenges span research fields that are traditionally separate in computer science: databases, scientific computing, machine learning, visualization, statistics, and human-computer interaction. The book ends with a research agenda to help the scientific community converge on key research questions.

Table of Contents

  1. Introduction
  2. Concepts and Definitions
  3. Data Management
  4. Data Structures and Algorithms
  5. Visualization
  6. Uncertainty and Quality
  7. Human Aspects
  8. Machine Learning
  9. Evaluation
  10. Challenges and Research Agenda

Authors

Marco Angelini (Sapienza Università di Roma)
Michaël Aupetit (Qatar Computing Research Institute)
Sriram Karthik Badam (Apple Inc.)
Carsten Binnig (Technical University of Darmstadt & DFKI)
Jean-Daniel Fekete (Inria & Université Paris-Saclay)
Danyel Fisher
Barbara Hammer (CITEC, Bielefeld University)
Jaemin Jo (Sungkyunkwan University)
Nicola Pezzotti (Philips Cardiologs, TU/e, AI4MR)
Gaëlle Richer (Inria & Université Paris-Saclay)
Florin Rusu (University of California Merced)
Giuseppe Santucci (Sapienza Università di Roma)
Hans-Jörg Schulz (Aarhus University)
Michael Sedlmair (University of Stuttgart)
Hendrik Strobelt (IBM Research, MIT-IBM Watson AI Lab)
Cagatay Turkay (University of Warwick)
Anna Vilanova (TU/e)
Chris Weaver (University of Oklahoma)

Printed Book

https://diglib.eg.org/handle/10.2312/3607057

Book

@book{PDABook,
  TITLE = {Progressive Data Analysis: Roadmap and Research Agenda},
  AUTHOR = {Fekete, Jean-Daniel and Fisher, Danyel and Sedlmair, Michael},
  PUBLISHER = {Eurographics},
  PAGES = {231},
  YEAR = {2024},
  MONTH = Nov,
  DOI = {10.2312/pda.20242707},
  ISBN = {978-3-03868-270-7},
  PDF = {https://diglib.eg.org/bitstreams/f57f92d0-df37-4569-9ca1-c15980c541a2/download},
  URL = {https://diglib.eg.org/handle/10.2312/3607057}
}