• Mining of Massive Datasets

    Mining of Massive Datasets (Hardback) By (author) Anand Rajaraman, By (author) Jeffrey David Ullman

    $62.09 - Save $5.73 (8%) - RRP $67.82 Free delivery worldwide Available
    Dispatched in 2 business days

    Description
    The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and which can be used on even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. The PageRank idea and related tricks for organizing the Web are covered next. Other chapters cover the problems of finding frequent itemsets and clustering. The final chapters cover two applications: recommendation systems and Web advertising, each vital in e-commerce. Written by two authorities in database and Web technologies, this book is essential reading for students and practitioners alike.


  • Full bibliographic data for Mining of Massive Datasets

    Title
    Mining of Massive Datasets
    Authors and contributors
    By (author) Anand Rajaraman, By (author) Jeffrey David Ullman
    Physical properties
    Format: Hardback
    Number of pages: 326
    Width: 178 mm
    Height: 248 mm
    Thickness: 24 mm
    Weight: 798 g
    Language
    English
    ISBN
    ISBN 13: 9781107015357
    ISBN 10: 1107015359
    Classifications

    BIC E4L: COM
    Nielsen BookScan Product Class 3: S10.2
    B&T Book Type: NF
    B&T Modifier: Region of Publication: 03
    Warengruppen-Systematik des deutschen Buchhandels: 16320
    DC22: 006.312
    B&T Modifier: Academic Level: 02
    LC classification: QA
    BIC subject category V2: UMB
    B&T Modifier: Text Format: 06, 01
    Abridged Dewey: 005
    B&T Merchandise Category: UP
    Ingram Subject Code: XD
    LC subject heading:
    BISAC V2.8: COM021000, COM004000
    B&T General Subject: 228
    BISAC V2.8: COM021030
    BIC subject category V2: UYQM, UNF
    DC23: 006.312
    Ingram Theme: ASPT/SCITAS
    Thema V1.0: UYQM, UNF
    Edition statement
    New ed.
    Illustrations note
    90 b/w illus. 160 exercises
    Publisher
    CAMBRIDGE UNIVERSITY PRESS
    Imprint name
    CAMBRIDGE UNIVERSITY PRESS
    Publication date
    30 December 2011
    Publication City/Country
    Cambridge
    Author Information
    Anand Rajaraman is CEO of Kosmix Inc., a website which organizes the Internet by topic. He is also a consulting assistant professor in the Computer Science Department at Stanford University. In 1996, together with four other engineers, Rajaraman founded Junglee Corp., which pioneered Internet comparison shopping. It was acquired by Amazon.com Inc. in August 1998 for 1.6 million shares of stock valued at $250 million. Rajaraman went on to become Director of Technology at Amazon.com, where he was responsible for technology strategy. He helped launch the transformation of Amazon.com from a retailer into a retail platform, enabling third-party retailers to sell on Amazon.com's website. Third-party transactions now account for almost 25% of all US transactions, and represent Amazon's fastest-growing and most profitable business segment. Rajaraman was also an inventor of the concept underlying Amazon.com's Mechanical Turk. Rajaraman and his business partner, Venky Harinarayan, co-founded Cambrian Ventures, an early stage VC fund, in 2000. Cambrian went on to back several companies later acquired by Google and has funded companies like Mobissimo, Aster Data Systems and TheFind.com.

    Jeffrey David Ullman is the Stanford W. Ascherman Professor of Computer Science (Emeritus) at Stanford University. He is also the CEO of Gradiance. Ullman's research interests include database theory, data integration, data mining and education using the information infrastructure. He is one of the founders of the field of database theory and was the doctoral advisor of an entire generation of students who later became leading database theorists in their own right. He was also the Ph.D. advisor of Sergey Brin, one of the co-founders of Google, and served on Google's technical advisory board. In 1995 he was inducted as a Fellow of the Association for Computing Machinery and in 2000 he was awarded the Knuth Prize. Ullman is also the co-recipient (with John Hopcroft) of the 2010 IEEE John von Neumann Medal, for 'laying the foundations for the fields of automata and language theory and many seminal contributions to theoretical computer science'.
    Table of contents
    1. Data mining; 2. Large-scale file systems and map-reduce; 3. Finding similar items; 4. Mining data streams; 5. Link analysis; 6. Frequent itemsets; 7. Clustering; 8. Advertising on the Web; 9. Recommendation systems; Index.