Data Smart

Data Smart : Using Data Science to Transform Information Into Insight

4.18 (687 ratings by Goodreads)
By (author) John W. Foreman



Data science gets thrown around in the press like it's magic. Major retailers are predicting everything from when their customers are pregnant to when they want a new pair of Chuck Taylors. It's a brave new world where seemingly meaningless data can be transformed into valuable insight to drive smart business decisions.

But how exactly does one do data science? Do you have to hire one of these priests of the dark arts, the "data scientist," to extract this gold from your data? Nope. Data science is little more than using straightforward steps to process raw data into actionable insight. And in Data Smart, author and data scientist John Foreman will show you how that's done within the familiar environment of a spreadsheet.

Why a spreadsheet? It's comfortable! You get to look at the data every step of the way, building confidence as you learn the tricks of the trade. Plus, spreadsheets are a vendor-neutral place to learn data science without the hype. But don't let the Excel sheets fool you. This is a book for those serious about learning the analytic techniques, the math and the magic, behind big data.

Each chapter will cover a different technique in a spreadsheet so you can follow along:

  • Mathematical optimization, including non-linear programming and genetic algorithms
  • Clustering via k-means, spherical k-means, and graph modularity
  • Data mining in graphs, such as outlier detection
  • Supervised AI through logistic regression, ensemble models, and bag-of-words models
  • Forecasting, seasonal adjustments, and prediction intervals through Monte Carlo simulation
  • Moving from spreadsheets into the R programming language

You get your hands dirty as you work alongside John through each technique. But never fear, the topics are readily applicable and the author laces humor throughout. You'll even learn what a dead squirrel has to do with optimization modeling, which you no doubt are dying to know.

Product details

  • Paperback | 432 pages
  • 185.42 x 231.14 x 22.86mm | 657.71g
  • New York, United States
  • English
  • 1st edition
  • ISBN-10: 111866146X
  • ISBN-13: 9781118661468
  • 28,813

Back cover copy

"Data Smart makes modern statistical methods and algorithms understandable and easy to implement. Slogging through textbooks and academic papers is no longer required!"
--Patrick Crosby, Founder of StatHat & first CTO at OkCupid

"When Mr. Foreman interviewed for a job at my company, he arrived dressed in a 'Kentucky Colonel' kind of suit and spoke about nonsensical things like barbecue, lasers, and orange juice pulp. Then, he explained how to de-mystify and solve just about any complex 'big data' problem in our company with simple spreadsheets. No server clusters, mainframes, or Hadoop-a-ma-jigs. Just Excel. I hired him on the spot. After reading this book, you too will learn how to use math and basic spreadsheet formulas to improve your business or, at the very least, how to trick senior executives into hiring you as their data scientist."
--Ben Chestnut, Founder & CEO of MailChimp

"You need a John Foreman on your analytics team. But if you can't have John, then reading this book is the next best thing."
--Patrick Lennon, Director of Analytics, The Coca-Cola Company

Most people are approaching data science all wrong. Here's how to do it right.

Not to disillusion you, but data scientists are not mystical practitioners of magical arts. Data science is something you can do. Really. This book shows you the significant data science techniques, how they work, how to use them, and how they benefit your business, large or small. It's not about coding or database technologies. It's about turning raw data into insight you can act upon, and doing it as quickly and painlessly as possible.

Roll up your sleeves and let's get going.

Relax -- it's just a spreadsheet

Visit the companion website to download spreadsheets for each chapter, and follow them as you learn about:

  • Artificial intelligence using the general linear model, ensemble methods, and naive Bayes
  • Clustering via k-means, spherical k-means, and graph modularity
  • Mathematical optimization, including non-linear programming and genetic algorithms
  • Working with time series data and forecasting with exponential smoothing
  • Using Monte Carlo simulation to quantify and address risk
  • Detecting outliers in single or multiple dimensions
  • Exploring the data-science-focused R language

Table of contents

Introduction
1 Everything You Ever Needed to Know about Spreadsheets but Were Too Afraid to Ask
2 Cluster Analysis Part I: Using K-Means to Segment Your Customer Base
3 Naive Bayes and the Incredible Lightness of Being an Idiot
4 Optimization Modeling: Because That "Fresh Squeezed" Orange Juice Ain't Gonna Blend Itself
5 Cluster Analysis Part II: Network Graphs and Community Detection
6 The Granddaddy of Supervised Artificial Intelligence: Regression
7 Ensemble Models: A Whole Lot of Bad Pizza
8 Forecasting: Breathe Easy; You Can't Win
9 Outlier Detection: Just Because They're Odd Doesn't Mean They're Unimportant
10 Moving from Spreadsheets into R
Conclusion
Index

About John W. Foreman

John W. Foreman is Chief Data Scientist for, where he leads a data science product development effort called the Email Genome Project. As an analytics consultant, John has created data science solutions for The Coca-Cola Company, Royal Caribbean International, Intercontinental Hotels Group, Dell, the Department of Defense, the IRS, and the FBI.

Rating details

687 ratings
4.18 out of 5 stars
5 42% (288)
4 39% (268)
3 15% (106)
2 3% (21)
1 1% (4)
Book ratings by Goodreads