OpenCL Programming Guide

3.93 (16 ratings by Goodreads)
By (author) Aaftab Munshi, Benedict R. Gaster, Timothy G. Mattson, James Fung, Dan Ginsburg


Using the new OpenCL (Open Computing Language) standard, you can write applications that access all available programming resources: CPUs, GPUs, and other processors such as DSPs and the Cell/B.E. processor. Already implemented by Apple, AMD, Intel, IBM, NVIDIA, and other leaders, OpenCL has outstanding potential for PCs, servers, handheld/embedded devices, high performance computing, and even cloud systems.

This is the first comprehensive, authoritative, and practical guide to OpenCL 1.1 specifically for working developers and software architects. Written by five leading OpenCL authorities, OpenCL Programming Guide covers the entire specification. It reviews key use cases, shows how OpenCL can express a wide range of parallel algorithms, and offers complete reference material on both the API and the OpenCL C programming language.

Through complete case studies and downloadable code examples, the authors show how to write complex parallel programs that decompose workloads across many different devices. They also present all the essentials of OpenCL software performance optimization, including probing and adapting to hardware.

Coverage includes

  • Understanding OpenCL's architecture, concepts, terminology, goals, and rationale
  • Programming with OpenCL C and the runtime API
  • Using buffers, sub-buffers, images, samplers, and events
  • Sharing and synchronizing data with OpenGL and Microsoft's Direct3D
  • Simplifying development with the C++ Wrapper API
  • Using OpenCL Embedded Profiles to support devices ranging from cellphones to supercomputer nodes
  • Case studies dealing with physics simulation; image and signal processing, such as image histograms, edge detection filters, Fast Fourier Transforms, and optical flow; math libraries, such as matrix multiplication and high-performance sparse matrix multiplication; and more
  • Source code for this book is available at ...

Product details

  • Paperback | 648 pages
  • 177.8 x 228.6 x 30.48mm | 952.54g
  • Pearson Education (US)
  • Addison-Wesley Educational Publishers Inc
  • New Jersey, United States
  • English
  • ISBN-10: 0321749642
  • ISBN-13: 9780321749642
  • 356,514

Review quote

"Welcome to the new world of heterogeneous parallel programming with this authoritative and accessible guide to the complete OpenCL Programming Model." - Professor Pat Hanrahan, Stanford University

About Aaftab Munshi

Aaftab Munshi is the spec editor for the OpenGL ES 1.1, OpenGL ES 2.0, and OpenCL specifications and coauthor of the book OpenGL ES 2.0 Programming Guide (with Dan Ginsburg and Dave Shreiner, published by Addison-Wesley, 2008). He currently works at Apple.

Benedict R. Gaster is a software architect working on programming models for next-generation heterogeneous processors, in particular looking at high-level abstractions for parallel programming on the emerging class of processors that contain both CPUs and accelerators such as GPUs. Benedict has contributed extensively to OpenCL's design and has represented AMD at the Khronos Group open standard consortium. Benedict has a Ph.D. in computer science for his work on type systems for extensible records and variants. He has been working at AMD since 2008.

Timothy G. Mattson is an old-fashioned parallel programmer, having started in the mid-eighties with the Caltech Cosmic Cube and continuing to the present. Along the way, he has worked with most classes of parallel computers (vector supercomputers, SMP, VLIW, NUMA, MPP, clusters, and many-core processors). Tim has published extensively, including the books Patterns for Parallel Programming (with Beverly Sanders and Berna Massingill, published by Addison-Wesley, 2004) and An Introduction to Concurrency in Programming Languages (with Matthew J. Sottile and Craig E. Rasmussen, published by CRC Press, 2009). Tim has a Ph.D. in chemistry for his work on molecular scattering theory. He has been working at Intel since 1993.

James Fung has been developing computer vision on the GPU as it progressed from graphics to general-purpose computation. James has a Ph.D. in electrical and computer engineering from the University of Toronto and numerous IEEE and ACM publications in the areas of parallel GPU computer vision and mediated reality. He is currently a Developer Technology Engineer at NVIDIA, where he examines computer vision and image processing on graphics hardware.

Dan Ginsburg currently works at Children's Hospital Boston as a Principal Software Architect in the Fetal-Neonatal Neuroimaging and Development Science Center, where he uses OpenCL for accelerating neuroimaging algorithms. Previously, he worked for Still River Systems developing GPU-accelerated image registration software for the Monarch 250 proton beam radiotherapy system. Dan was also Senior Member of Technical Staff at AMD, where he worked for over eight years in a variety of roles, including developing OpenGL drivers, creating desktop and hand-held 3D demos, and leading the development of handheld GPU developer tools. Dan holds a B.S. in computer science from Worcester Polytechnic Institute and an M.B.A. from Bentley ...

Table of contents

Figures
Tables
Listings
Foreword
Preface
Acknowledgments
About the Authors

Part I: The OpenCL 1.1 Language and API

Chapter 1: An Introduction to OpenCL
  • What Is OpenCL, or ... Why You Need This Book
  • Our Many-Core Future: Heterogeneous Platforms
  • Software in a Many-Core World
  • Conceptual Foundations of OpenCL
  • OpenCL and Graphics
  • The Contents of OpenCL
  • The Embedded Profile
  • Learning OpenCL

Chapter 2: HelloWorld: An OpenCL Example
  • Building the Examples
  • HelloWorld Example
  • Checking for Errors in OpenCL

Chapter 3: Platforms, Contexts, and Devices
  • OpenCL Platforms
  • OpenCL Devices
  • OpenCL Contexts

Chapter 4: Programming with OpenCL C
  • Writing a Data-Parallel Kernel Using OpenCL C
  • Scalar Data Types
  • Vector Data Types
  • Other Data Types
  • Derived Types
  • Implicit Type Conversions
  • Explicit Casts
  • Explicit Conversions
  • Reinterpreting Data as Another Type
  • Vector Operators
  • Qualifiers
  • Keywords
  • Preprocessor Directives and Macros
  • Restrictions

Chapter 5: OpenCL C Built-In Functions
  • Work-Item Functions
  • Math Functions
  • Integer Functions
  • Common Functions
  • Geometric Functions
  • Relational Functions
  • Vector Data Load and Store Functions
  • Synchronization Functions
  • Async Copy and Prefetch Functions
  • Atomic Functions
  • Miscellaneous Vector Functions
  • Image Read and Write Functions

Chapter 6: Programs and Kernels
  • Program and Kernel Object Overview
  • Program Objects
  • Kernel Objects

Chapter 7: Buffers and Sub-Buffers
  • Memory Objects, Buffers, and Sub-Buffers Overview
  • Creating Buffers and Sub-Buffers
  • Querying Buffers and Sub-Buffers
  • Reading, Writing, and Copying Buffers and Sub-Buffers
  • Mapping Buffers and Sub-Buffers

Chapter 8: Images and Samplers
  • Image and Sampler Object Overview
  • Creating Image Objects
  • Creating Sampler Objects
  • OpenCL C Functions for Working with Images
  • Transferring Image Objects

Chapter 9: Events
  • Commands, Queues, and Events Overview
  • Events and Command-Queues
  • Event Objects
  • Generating Events on the Host
  • Events Impacting Execution on the Host
  • Using Events for Profiling
  • Events Inside Kernels
  • Events from Outside OpenCL

Chapter 10: Interoperability with OpenGL
  • OpenCL/OpenGL Sharing Overview
  • Querying for the OpenGL Sharing Extension
  • Initializing an OpenCL Context for OpenGL Interoperability
  • Creating OpenCL Buffers from OpenGL Buffers
  • Creating OpenCL Image Objects from OpenGL Textures
  • Querying Information about OpenGL Objects
  • Synchronization between OpenGL and OpenCL

Chapter 11: Interoperability with Direct3D
  • Direct3D/OpenCL Sharing Overview
  • Initializing an OpenCL Context for Direct3D Interoperability
  • Creating OpenCL Memory Objects from Direct3D Buffers and Textures
  • Acquiring and Releasing Direct3D Objects in OpenCL
  • Processing a Direct3D Texture in OpenCL
  • Processing D3D Vertex Data in OpenCL

Chapter 12: C++ Wrapper API
  • C++ Wrapper API Overview
  • C++ Wrapper API Exceptions
  • Vector Add Example Using the C++ Wrapper API

Chapter 13: OpenCL Embedded Profile
  • OpenCL Profile Overview
  • 64-Bit Integers
  • Images
  • Built-In Atomic Functions
  • Mandated Minimum Single-Precision Floating-Point Capabilities
  • Determining the Profile Supported by a Device in an OpenCL C Program

Part II: OpenCL 1.1 Case Studies

Chapter 14: Image Histogram
  • Computing an Image Histogram
  • Parallelizing the Image Histogram
  • Additional Optimizations to the Parallel Image Histogram
  • Computing Histograms with Half-Float or Float Values for Each Channel

Chapter 15: Sobel Edge Detection Filter
  • What Is a Sobel Edge Detection Filter?
  • Implementing the Sobel Filter as an OpenCL Kernel

Chapter 16: Parallelizing Dijkstra's Single-Source Shortest-Path Graph Algorithm
  • Graph Data Structures
  • Kernels
  • Leveraging Multiple Compute Devices

Chapter 17: Cloth Simulation in the Bullet Physics SDK
  • An Introduction to Cloth Simulation
  • Simulating the Soft Body
  • Executing the Simulation on the CPU
  • Changes Necessary for Basic GPU Execution
  • Two-Layered Batching
  • Optimizing for SIMD Computation and Local Memory
  • Adding OpenGL Interoperation

Chapter 18: Simulating the Ocean with Fast Fourier Transform
  • An Overview of the Ocean Application
  • Phillips Spectrum Generation
  • An OpenCL Discrete Fourier Transform
  • A Closer Look at the FFT Kernel
  • A Closer Look at the Transpose Kernel

Chapter 19: Optical Flow
  • Optical Flow Problem Overview
  • Sub-Pixel Accuracy with Hardware Linear Interpolation
  • Application of the Texture Cache
  • Using Local Memory
  • Early Exit and Hardware Scheduling
  • Efficient Visualization with OpenGL Interop
  • Performance

Chapter 20: Using OpenCL with PyOpenCL
  • Introducing PyOpenCL
  • Running the PyImageFilter2D Example
  • PyImageFilter2D Code
  • Context and Command-Queue Creation
  • Loading to an Image Object
  • Creating and Building a Program
  • Setting Kernel Arguments and Executing a Kernel
  • Reading the Results

Chapter 21: Matrix Multiplication with OpenCL
  • The Basic Matrix Multiplication Algorithm
  • A Direct Translation into OpenCL
  • Increasing the Amount of Work per Kernel
  • Optimizing Memory Movement: Local Memory
  • Performance Results and Optimizing the Original CPU Code

Chapter 22: Sparse Matrix-Vector Multiplication
  • Sparse Matrix-Vector Multiplication (SpMV) Algorithm
  • Description of This Implementation
  • Tiled and Packetized Sparse Matrix Representation
  • Header Structure
  • Tiled and Packetized Sparse Matrix Design Considerations
  • Optional Team Information
  • Tested Hardware Devices and Results
  • Additional Areas of Optimization

Appendix: Summary of OpenCL 1.1
  • The OpenCL Platform Layer
  • The OpenCL Runtime
  • Buffer Objects
  • Program Objects
  • Kernel and Event Objects
  • Supported Data Types
  • Vector Component Addressing
  • Preprocessor Directives and Macros
  • Specify Type Attributes
  • Math Constants
  • Work-Item Built-In Functions
  • Integer Built-In Functions
  • Common Built-In Functions
  • Math Built-In Functions
  • Geometric Built-In Functions
  • Relational Built-In Functions
  • Vector Data Load/Store Functions
  • Atomic Functions
  • Async Copies and Prefetch Functions
  • Synchronization, Explicit Memory Fence
  • Miscellaneous Vector Built-In Functions
  • Image Read and Write Built-In Functions
  • Image Objects
  • Image Formats
  • Access Qualifiers
  • Sampler Objects
  • Sampler Declaration Fields
  • OpenCL Device Architecture Diagram
  • OpenCL/OpenGL Sharing APIs
  • OpenCL/Direct3D 10 Sharing APIs

Index

Rating details

16 ratings
3.93 out of 5 stars
  • 5 stars: 25% (4)
  • 4 stars: 50% (8)
  • 3 stars: 19% (3)
  • 2 stars: 6% (1)
  • 1 star: 0% (0)
Book ratings by Goodreads