Data Mining for Business Analytics: Concepts, Techniques, and Applications in R

4.5 (2 ratings by Goodreads)

Description

Data Mining for Business Analytics: Concepts, Techniques, and Applications in R presents an applied approach to data mining concepts and methods, using R software for illustration.


Readers will learn how to implement a variety of popular data mining algorithms in R (free and open-source software) to tackle business problems and opportunities.


This is the fifth version of this successful text, and the first using R. It covers both statistical and machine learning algorithms for prediction, classification, visualization, dimension reduction, recommender systems, clustering, text mining and network analysis. It also includes:

  • Two new co-authors, Inbal Yahav and Casey Lichtendahl, who bring both expertise in teaching business analytics courses using R and data mining consulting experience in business and government
  • Updates and new material based on feedback from instructors teaching MBA, undergraduate, diploma and executive courses, and from their students
  • More than a dozen case studies demonstrating applications for the data mining techniques described
  • End-of-chapter exercises that help readers gauge and expand their comprehension of, and competency with, the material presented
  • A companion website (www.dataminingbook.com) with more than two dozen data sets, and instructor materials including exercise solutions, PowerPoint slides, and case solutions

Data Mining for Business Analytics: Concepts, Techniques, and Applications in R is an ideal textbook for graduate and upper-undergraduate level courses in data mining, predictive analytics, and business analytics. This new edition is also an excellent reference for analysts, researchers, and practitioners working with quantitative methods in the fields of business, finance, marketing, computer science, and information technology.

Product details

  • Hardback | 576 pages
  • 185 x 254 x 32mm | 1,328g
  • New York, United States
  • English
  • 1st edition
  • ISBN-10: 1118879368
  • ISBN-13: 9781118879368
  • 897,331


Table of contents

Foreword by Gareth James
Foreword by Ravi Bapna
Preface to the R Edition
Acknowledgments

PART I PRELIMINARIES

CHAPTER 1 Introduction
1.1 What Is Business Analytics?
1.2 What Is Data Mining?
1.3 Data Mining and Related Terms
1.4 Big Data
1.5 Data Science
1.6 Why Are There So Many Different Methods?
1.7 Terminology and Notation
1.8 Road Maps to This Book
Order of Topics

CHAPTER 2 Overview of the Data Mining Process
2.1 Introduction
2.2 Core Ideas in Data Mining
2.3 The Steps in Data Mining
2.4 Preliminary Steps
2.5 Predictive Power and Overfitting
2.6 Building a Predictive Model
2.7 Using R for Data Mining on a Local Machine
2.8 Automating Data Mining Solutions

PART II DATA EXPLORATION AND DIMENSION REDUCTION

CHAPTER 3 Data Visualization
3.1 Uses of Data Visualization
3.2 Data Examples
3.3 Basic Charts: Bar Charts, Line Graphs, and Scatter Plots
3.4 Multidimensional Visualization
3.5 Specialized Visualizations
3.6 Summary: Major Visualizations and Operations, by Data Mining Goal

CHAPTER 4 Dimension Reduction
4.1 Introduction
4.2 Curse of Dimensionality
4.3 Practical Considerations
4.4 Data Summaries
4.5 Correlation Analysis
4.6 Reducing the Number of Categories in Categorical Variables
4.7 Converting a Categorical Variable to a Numerical Variable
4.8 Principal Components Analysis
4.9 Dimension Reduction Using Regression Models
4.10 Dimension Reduction Using Classification and Regression Trees

PART III PERFORMANCE EVALUATION

CHAPTER 5 Evaluating Predictive Performance
5.1 Introduction
5.2 Evaluating Predictive Performance
5.3 Judging Classifier Performance
5.4 Judging Ranking Performance
5.5 Oversampling

PART IV PREDICTION AND CLASSIFICATION METHODS

CHAPTER 6 Multiple Linear Regression
6.1 Introduction
6.2 Explanatory vs. Predictive Modeling
6.3 Estimating the Regression Equation and Prediction
6.4 Variable Selection in Linear Regression

CHAPTER 7 k-Nearest Neighbors (kNN)
7.1 The k-NN Classifier (Categorical Outcome)
7.2 k-NN for a Numerical Outcome
7.3 Advantages and Shortcomings of k-NN Algorithms

CHAPTER 8 The Naive Bayes Classifier
8.1 Introduction
8.2 Applying the Full (Exact) Bayesian Classifier
8.3 Advantages and Shortcomings of the Naive Bayes Classifier

CHAPTER 9 Classification and Regression Trees
9.1 Introduction
9.2 Classification Trees
9.3 Evaluating the Performance of a Classification Tree
9.4 Avoiding Overfitting
9.5 Classification Rules from Trees
9.6 Classification Trees for More Than Two Classes
9.7 Regression Trees
9.8 Improving Prediction: Random Forests and Boosted Trees
9.9 Advantages and Weaknesses of a Tree

CHAPTER 10 Logistic Regression
10.1 Introduction
10.2 The Logistic Regression Model
10.3 Example: Acceptance of Personal Loan
10.4 Evaluating Classification Performance
10.5 Example of Complete Analysis: Predicting Delayed Flights
10.6 Appendix: Logistic Regression for Profiling
Appendix A: Why Linear Regression Is Problematic for a Categorical Outcome
Appendix B: Evaluating Explanatory Power
Appendix C: Logistic Regression for More Than Two Classes

CHAPTER 11 Neural Nets
11.1 Introduction
11.2 Concept and Structure of a Neural Network
11.3 Fitting a Network to Data
11.4 Required User Input
11.5 Exploring the Relationship between Predictors and Outcome
11.6 Advantages and Weaknesses of Neural Networks

CHAPTER 12 Discriminant Analysis
12.1 Introduction
12.2 Distance of a Record from a Class
12.3 Fisher's Linear Classification Functions
12.4 Classification Performance of Discriminant Analysis
12.5 Prior Probabilities
12.6 Unequal Misclassification Costs
12.7 Classifying More Than Two Classes
12.8 Advantages and Weaknesses

CHAPTER 13 Combining Methods: Ensembles and Uplift Modeling
13.1 Ensembles
13.2 Uplift (Persuasion) Modeling
13.3 Summary

PART V MINING RELATIONSHIPS AMONG RECORDS

CHAPTER 14 Association Rules and Collaborative Filtering
14.1 Association Rules
14.2 Collaborative Filtering
14.3 Summary

CHAPTER 15 Cluster Analysis
15.1 Introduction
15.2 Measuring Distance between Two Records
15.3 Measuring Distance between Two Clusters
15.4 Hierarchical (Agglomerative) Clustering
15.5 Non-Hierarchical Clustering: The k-Means Algorithm

PART VI FORECASTING TIME SERIES

CHAPTER 16 Handling Time Series
16.1 Introduction
16.2 Descriptive vs. Predictive Modeling
16.3 Popular Forecasting Methods in Business
16.4 Time Series Components
16.5 Data-Partitioning and Performance Evaluation

CHAPTER 17 Regression-Based Forecasting
17.1 A Model with Trend
17.2 A Model with Seasonality
17.3 A Model with Trend and Seasonality
17.4 Autocorrelation and ARIMA Models

CHAPTER 18 Smoothing Methods
18.1 Introduction
18.2 Moving Average
18.3 Simple Exponential Smoothing
18.4 Advanced Exponential Smoothing

PART VII DATA ANALYTICS

CHAPTER 19 Social Network Analytics
19.1 Introduction
19.2 Directed vs. Undirected Networks
19.3 Visualizing and Analyzing Networks
19.4 Social Data Metrics and Taxonomy
19.5 Using Network Metrics in Prediction and Classification
19.6 Collecting Social Network Data with R
19.7 Advantages and Disadvantages

CHAPTER 20 Text Mining
20.1 Introduction
20.2 The Tabular Representation of Text: Term-Document Matrix and "Bag-of-Words"
20.3 Bag-of-Words vs. Meaning Extraction at Document Level
20.4 Preprocessing the Text
20.5 Implementing Data Mining Methods
20.6 Example: Online Discussions on Autos and Electronics
20.7 Summary

PART VIII CASES

CHAPTER 21 Cases
21.1 Charles Book Club
21.2 German Credit
21.3 Tayko Software Cataloger
21.4 Political Persuasion
21.5 Taxi Cancellations
21.6 Segmenting Consumers of Bath Soap
21.7 Direct-Mail Fundraising
21.8 Catalog Cross-Selling
21.9 Predicting Bankruptcy
21.10 Time Series Case: Forecasting Public Transportation Demand

Index

About the authors

Galit Shmueli, PhD, is Distinguished Professor at National Tsing Hua University's Institute of Service Science. She has designed and instructed data mining courses since 2004 at University of Maryland, Statistics.com, Indian School of Business, and National Tsing Hua University, Taiwan. Professor Shmueli is known for her research and teaching in business analytics, with a focus on statistical and data mining methods in information systems and healthcare. She has authored more than 70 publications, including books.

Peter C. Bruce is President and Founder of the Institute for Statistics Education at Statistics.com. He has written multiple journal articles and is the developer of Resampling Stats software. He is the author of Introductory Statistics and Analytics: A Resampling Perspective (Wiley) and co-author of Practical Statistics for Data Scientists: 50 Essential Concepts (O'Reilly).

Inbal Yahav, PhD, is Professor at the Graduate School of Business Administration at Bar-Ilan University, Israel. She teaches courses in social network analysis, advanced research methods, and software quality assurance. Dr. Yahav received her PhD in Operations Research and Data Mining from the University of Maryland, College Park.

Nitin R. Patel, PhD, is Chairman and cofounder of Cytel, Inc., based in Cambridge, Massachusetts. A Fellow of the American Statistical Association, Dr. Patel has also served as a Visiting Professor at the Massachusetts Institute of Technology and at Harvard University. He is a Fellow of the Computer Society of India and was a professor at the Indian Institute of Management, Ahmedabad, for 15 years.

Kenneth C. Lichtendahl, Jr., PhD, is Associate Professor at the University of Virginia. He is the Eleanor F. and Phillip G. Rust Professor of Business Administration and teaches MBA courses in decision analysis, data analysis and optimization, and managerial quantitative analysis. He also teaches executive education courses in strategic analysis and decision-making, and managing the corporate aviation function.
