Beautiful Data

Beautiful Data: The Stories Behind Elegant Data Solutions





In this insightful book, you'll learn from the best data practitioners in the field just how wide-ranging - and beautiful - working with data can be. Join 39 contributors as they explain how they developed simple and elegant solutions on projects ranging from the Mars lander to a Radiohead video. With "Beautiful Data", you will:

  • Explore the opportunities and challenges involved in working with the vast number of datasets made available by the Web
  • Learn how to visualize trends in urban crime, using maps and data mashups
  • Discover the challenges of designing a data processing system that works within the constraints of space travel
  • Learn how crowdsourcing and transparency have combined to advance the state of drug research
  • Understand how new data can automatically trigger alerts when it matches or overlaps pre-existing data
  • Learn about the massive infrastructure required to create, capture, and process DNA data

That's only a small sample of what you'll find in "Beautiful Data". For anyone who handles data, this is a truly fascinating book.

Contributors include: Nathan Yau; Jonathan Follett and Matt Holm; J.M. Hughes; Raghu Ramakrishnan, Brian Cooper, and Utkarsh Srivastava; Jeff Hammerbacher; Jason Dykes and Jo Wood; Jeff Jonas and Lisa Sokol; Jud Valeski; Alon Halevy and Jayant Madhavan; Aaron Koblin and Valdean Klump; Michal Migurski; Jeff Heer; Coco Krumme; Peter Norvig; Matt Wood and Ben Blackburne; Jean-Claude Bradley, Rajarshi Guha, Andrew Lang, Pierre Lindenbaum, Cameron Neylon, Antony Williams, and Egon Willighagen; Lukas Biewald and Brendan O'Connor; Hadley Wickham, Deborah Swayne, and David Poole; Andrew Gelman, Jonathan P. Kastellec, and Yair Ghitza; and Toby Segaran.


Product details

  • Paperback | 382 pages
  • 177.8 x 261.62 x 27.94mm | 839.14g
  • O'Reilly Media, Inc, USA
  • Sebastopol, United States
  • English
  • Original
  • Illustrations (some col.), maps (some col.)
  • 0596157118
  • 9780596157111
  • 224,932

About Toby Segaran and Jeff Hammerbacher

Toby Segaran is the author of Programming Collective Intelligence, a very popular O'Reilly title. He was the founder of Incellico, a biotech software company later acquired by Genstruct. He currently holds the title of Data Magnate at Metaweb Technologies and is a frequent speaker at technology conferences.

Jeff Hammerbacher is Vice President of Products and Chief Scientist at Cloudera. Jeff was an Entrepreneur in Residence at Accel Partners immediately prior to co-founding Cloudera. Before Accel, he conceived, built, and led the Data team at Facebook. The Data team was responsible for driving many of the applications of statistics and machine learning at Facebook, as well as building out the infrastructure to support these tasks for massive data sets. The team produced two open source projects: Hive, a system for offline analysis built above Hadoop, and Cassandra, a structured storage system on a P2P network. Before joining Facebook, Jeff was a quantitative analyst on Wall Street. Jeff earned his Bachelor's Degree in Mathematics from Harvard University.


Table of contents

From the contents:

Chapter 1 Seeing Your Life in Data
  • Personal Environmental Impact Report (PEIR)
  • your.flowingdata (YFD)
  • Personal Data Collection
  • Data Storage
  • Data Processing
  • Data Visualization
  • The Point
  • How to Participate

Chapter 2 The Beautiful People: Keeping Users in Mind When Designing Data Collection Methods
  • Introduction: User Empathy Is the New Black
  • The Project: Surveying Customers About a New Luxury Product
  • Specific Challenges to Data Collection
  • Designing Our Solution
  • Results and Reflection

Chapter 3 Embedded Image Data Processing on Mars
  • Abstract
  • Introduction
  • Some Background
  • To Pack or Not to Pack
  • The Three Tasks
  • Slotting the Images
  • Passing the Image: Communication Among the Three Tasks
  • Getting the Picture: Image Download and Processing
  • Image Compression
  • Downlink, or, It's All Downhill from Here
  • Conclusion

Chapter 4 Cloud Storage Design in a PNUTShell
  • Introduction
  • Updating Data
  • Complex Queries
  • Comparison with Other Systems
  • Conclusion
  • Acknowledgments
  • References

Chapter 5 Information Platforms and the Rise of the Data Scientist
  • Libraries and Brains
  • Facebook Becomes Self-Aware
  • A Business Intelligence System
  • The Death and Rebirth of a Data Warehouse
  • Beyond the Data Warehouse
  • The Cheetah and the Elephant
  • The Unreasonable Effectiveness of Data
  • New Tools and Applied Research
  • MAD Skills and Cosmos
  • Information Platforms As Dataspaces
  • The Data Scientist
  • Conclusion

Chapter 6 The Geographic Beauty of a Photographic Archive
  • Beauty in Data: Geograph
  • Visualization, Beauty, and Treemaps
  • A Geographic Perspective on Geograph Term Use
  • Beauty in Discovery
  • Reflection and Conclusion
  • Acknowledgments
  • References

Chapter 7 Data Finds Data
  • Introduction
  • The Benefits of Just-in-Time Discovery
  • Corruption at the Roulette Wheel
  • Enterprise Discoverability
  • Federated Search Ain't All That
  • Directories: Priceless
  • Relevance: What Matters and to Whom?
  • Components and Special Considerations
  • Privacy Considerations
  • Conclusion

Chapter 8 Portable Data in Real Time
  • Introduction
  • The State of the Art
  • Social Data Normalization
  • Conclusion: Mediation via Gnip

Chapter 9 Surfacing the Deep Web
  • What Is the Deep Web?
  • Alternatives to Offering Deep-Web Access
  • Conclusion and Future Work
  • References

Chapter 10 Building Radiohead's House of Cards
  • How It All Started
  • The Data Capture Equipment
  • The Advantages of Two Data Capture Systems
  • The Data
  • Capturing the Data, aka "The Shoot"
  • Processing the Data
  • Post-Processing the Data
  • Launching the Video
  • Conclusion

Chapter 11 Visualizing Urban Data
  ...
