Progressive Data Analysis: Roadmap and Research Agenda
Jean-Daniel Fekete, Danyel Fisher, and Michael Sedlmair
Eurographics Association, 2024, ISBN: 978-3-03868-270-7, DOI: 10.2312/pda.20242707
We live in an era where data is abundant and growing rapidly; databases storing big data sprawl past memory, reach computation limits, and become increasingly distributed. Engineers are designing new hardware and software systems, with new storage management and predictive computation, to sustain this growth. Yet, while well suited to data at scale, these infrastructures do not support exploratory data analysis (EDA) effectively. EDA allows analysts to make sense of data with little or no known model. It is essential in many application domains, from network security and fraud detection to epidemiology and preventive medicine. Data exploration proceeds through an iterative loop: analysts interact with data through computations that return results, usually shown as visualizations, and then interact with those visualizations in turn. EDA calls for highly responsive systems: at a latency of 500 ms, users change their querying behavior; past five or ten seconds, users abandon tasks or lose attention. To address this problem, a new computation paradigm has emerged in the last decade: Progressive Data Analysis.
This book is an introduction to this new paradigm. It explains the main scientific and technical benefits of performing complex data analysis progressively on large data. It also introduces the challenges that must be addressed before the paradigm becomes fully usable. These challenges span research fields that are traditionally separate in computer science: databases, scientific computing, machine learning, visualization, statistics, and human-computer interaction. The book ends with a research agenda to help the scientific community converge on key research questions.
Table of Contents
- Introduction
- Concepts and Definitions
- Data Management
- Data Structures and Algorithms
- Visualization
- Uncertainty and Quality
- Human Aspects
- Machine Learning
- Evaluation
- Challenges and Research Agenda
Structure of the Book
Authors
Marco Angelini | Sapienza Università di Roma |
Michaël Aupetit | Qatar Computing Research Institute |
Sriram Karthik Badam | Apple Inc. |
Carsten Binnig | Technical University of Darmstadt & DFKI |
Jean-Daniel Fekete | Inria & Université Paris-Saclay |
Danyel Fisher | |
Barbara Hammer | CITEC, Bielefeld University |
Jaemin Jo | Sungkyunkwan University |
Nicola Pezzotti | Philips Cardiologs, TU/e, AI4MR |
Gaëlle Richer | Inria & Université Paris-Saclay |
Florin Rusu | University of California Merced |
Giuseppe Santucci | Sapienza Università di Roma |
Hans-Jörg Schulz | Aarhus University |
Michael Sedlmair | University of Stuttgart |
Hendrik Strobelt | IBM Research, MIT-IBM Watson AI Lab |
Cagatay Turkay | University of Warwick |
Anna Vilanova | TU/e |
Chris Weaver | University of Oklahoma |
Printed Book
https://diglib.eg.org/handle/10.2312/3607057
@book{PDABook,
  TITLE = {Progressive Data Analysis: Roadmap and Research Agenda},
  AUTHOR = {Fekete, Jean-Daniel and Fisher, Danyel and Sedlmair, Michael},
  PUBLISHER = {Eurographics},
  PAGES = {231},
  YEAR = {2024},
  MONTH = Nov,
  DOI = {10.2312/pda.20242707},
  ISBN = {978-3-03868-270-7},
  PDF = {https://diglib.eg.org/bitstreams/f57f92d0-df37-4569-9ca1-c15980c541a2/download},
  URL = {https://diglib.eg.org/handle/10.2312/3607057}
}