Dark Data: Why What You Don't Know Matters

Availability: in stock, ready to be shipped
Save 5%
Original price $19.95
Current price $18.99

A practical guide to making good decisions in a world of missing data

In the era of big data, it is easy to imagine that we have all the information we need to make good decisions. But in fact the data we have are never complete, and may be only the tip of the iceberg. Just as much of the universe is composed of dark matter, invisible to us but nonetheless present, the universe of information is full of dark data that we overlook at our peril. In Dark Data, data expert David Hand takes us on a fascinating and enlightening journey into the world of the data we don't see.

Dark Data explores the many ways in which we can be blind to missing data and how that can lead us to conclusions and actions that are mistaken, dangerous, or even disastrous. Examining a wealth of real-life examples, from the Challenger shuttle explosion to complex financial frauds, Hand gives us a practical taxonomy of the types of dark data that exist and the situations in which they can arise, so that we can learn to recognize and control for them. In doing so, he not only teaches us to be alert to the problems presented by the things we don't know, but also shows how dark data can be used to our advantage, leading to greater understanding and better decisions.

Today, we all make decisions using data. Dark Data shows us all how to reduce the risk of making bad ones.

ISBN-13: 9780691234465

Media Type: Paperback

Publisher: Princeton University Press

Publication Date: 02-15-2022

Pages: 344

Product Dimensions: 5.50(w) x 8.40(h) x 0.90(d)

David J. Hand is emeritus professor of mathematics and senior research investigator at Imperial College London, a former president of the Royal Statistical Society, and a fellow of the British Academy. His many previous books include The Improbability Principle, Measurement: A Very Short Introduction, Statistics: A Very Short Introduction, and Principles of Data Mining.

What People are Saying About This

From the Publisher

"When we make decisions in our personal and professional lives, we typically start with some form of data. The very word 'data' derives from the Latin meaning 'something given.' But who gave it? Where is it from? Should I accept it at face value? Opening our eyes to the pitfalls of taking 'something given' for granted, this insightful book should be required reading for everyone in an age when 'fake news' and the explosion of data go hand in hand."—Adrian Smith, director and chief executive of The Alan Turing Institute

"David Hand shines a bright light onto the dark corners of statistics. This is a learned book but a witty, readable, and important one. I learned a lot and so will you."—Tim Harford, author of Fifty Inventions That Shaped the Modern Economy and presenter of the BBC series More or Less

"It is hard to think of anyone having anything at all to do with data-driven decisions who couldn't benefit from reading this book. David Hand effortlessly guides the reader through the many pitfalls of dark data."—Arno Siebes, Universiteit Utrecht

"This unique and much-needed book provides an accessible guide to dark data at a time when general awareness of the phenomenon is declining."—Geert Molenberghs, Universiteit Hasselt and KU Leuven

Table of Contents

Preface

Part I Dark Data: Their Origins and Consequences

Chapter 1 Dark Data: What We Don't See Shapes Our World

The Ghost of Data

So You Think You Have All the Data?

Nothing Happened, So We Ignored It

The Power of Dark Data

All around Us

Chapter 2 Discovering Dark Data: What We Collect and What We Don't

Dark Data on All Sides

Data Exhaust, Selection, and Self-Selection

From the Few to the Many

Experimental Data

Beware Human Frailties

Chapter 3 Definitions and Dark Data: What Do You Want to Know?

Different Definitions and Measuring the Wrong Thing

You Can't Measure Everything

Screening

Selection on the Basis of Past Performance

Chapter 4 Unintentional Dark Data: Saying One Thing, Doing Another

The Big Picture

Summarizing

Human Error

Instrument Limitations

Linking Data Sets

Chapter 5 Strategic Dark Data: Gaming, Feedback, and Information Asymmetry

Gaming

Feedback

Information Asymmetry

Adverse Selection and Algorithms

Chapter 6 Intentional Dark Data: Fraud and Deception

Fraud

Identity Theft and Internet Fraud

Personal Financial Fraud

Financial Market Fraud and Insider Trading

Insurance Fraud

And More

Chapter 7 Science and Dark Data: The Nature of Discovery

The Nature of Science

If Only I'd Known That

Tripping over Dark Data

Dark Data and the Big Picture

Hiding the Facts

Retraction

Provenance and Trustworthiness: Who Told You That?

Part II Illuminating and Using Dark Data

Chapter 8 Dealing with Dark Data: Shining a Light

Hope!

Linking Observed and Missing Data

Identifying the Missing Data Mechanism

Working with the Data We Have

Going Beyond the Data: What If You Die First?

Going Beyond the Data: Imputation

Iteration

Wrong Number!

Chapter 9 Benefiting from Dark Data: Reframing the Question

Hiding Data

Hiding Data from Ourselves: Randomized Controlled Trials

What Might Have Been

Replicated Data

Imaginary Data: The Bayesian Prior

Privacy and Confidentiality Preservation

Collecting Data in the Dark

Chapter 10 Classifying Dark Data: A Route through the Maze

A Taxonomy of Dark Data

Illumination

Notes

Index