Speculative Execution in High Performance Computer Architectures

Speculative Execution in High Performance Computer Architectures

Edited by  , Edited by 

Free delivery worldwide

Available. Dispatched from the UK in 2 business days
When will my order arrive?


Until now, there were few textbooks that focused on the dynamic subject of speculative execution, a topic that is crucial to the development of high performance computer architectures. Speculative Execution in High Performance Computer Architectures describes many recent advances in speculative execution techniques. It covers cutting-edge research projects, as well as numerous commercial implementations that demonstrate the value of this latency-hiding technique. The book begins with a review of control speculation techniques that use instruction cache prefetching, branch prediction and predication, and multi-path execution. It then examines dataflow speculation techniques including data cache prefetching, address value and data value speculation, pre-computation, and coherence speculation. This textbook also explores multithreaded approaches, emphasizing profile-guided speculation, speculative microarchitectures, and compiler techniques.show more

Product details

  • Hardback | 456 pages
  • 162.6 x 248.9 x 30.5mm | 793.8g
  • Taylor & Francis Inc
  • Chapman & Hall/CRC
  • Boca Raton, FL, United States
  • English
  • 94 black & white illustrations
  • 1584884479
  • 9781584884477

Table of contents

INTRODUCTION David R. Kaeli, Northeastern University and Pen C. Yew, University of Minnesota INSTRUCTION CACHE PREFETCHING Glenn Reinman, UCLA Computer Science Department Direct Mapped Cache Set Associative Cache Pseudo Associative Cache Way Prediction Cache Next Line Prefetching Target Prefetching Stream Buffers Nonblocking Instruction Caches and Out-Of-Order Fetch Fetch Directed Instruction Prefetching Integrated Prefetching Wrong-Path Prefetching Compiler Strategies BRANCH PREDICTION Philip G. Emma, IBM T.J. Watson Research Laboratory The von Neumann Programming Model vs. ENIAC Dataflow and Control Flow The Branch Instruction The IAS Machine: A Primitive Stored-Program Architecture Virtuality Branch Instruction Semantics General Instruction-Set Architectures and Extensions Memory Consistency and Observable Order Branches and Performance Pipelining Pipeline Disruptions and Their Penalties Superscalar Processing Multithreading Instruction Prefetching and Autonomy The Delayed Branch Instruction Branch Flow in a Pipeline: The "When" of Branch Prediction Predicting Branches at Decode Time Predicting Branches at Instruction-Prefetch Time Static Branch Prediction Dynamic Branch Prediction Branch Prediction With Counters Predicting by Profiling Branch Actions Group Behaviors vs. Predicting Individual Branches The Decode History Table (a.k.a. Branch History Table) Discriminators Using Multiple Discriminators: A Path-Based Approach Implementation A Timing Caveat Hybrid Predictors Instruction Prefetching The Branch History Table (a.k.a. Branch Target Buffer) Operation of the BTB Fetch Width and Branch Mispredictions The Subroutine Call and Return Structure Predicting Return Addresses by Using a Stack Recognizing Subroutine Calls and Returns Taking Advantage of the BTB Structure Eliminating the Stack Working Sets and Contexts The Size of a BTB Entry The BTB and the Instruction Cache: Economies of Size More Exotic Prediction for the More Difficult Branches Branches and the Operand Space Branches and the Operand-Address Space Tandem Branch Prediction Accuracy and the Updating of Tables Predictor Bandwidth and Anomalous Behaviors The Importance of Fast Prediction Mechanisms Superscalar Processing and the Monolithic Prediction of Branch Sequences Predicting Branches in a Multithreaded Environment Limitations Simplicity Complexity Two Saving Graces Implementing Real Branch Prediction Mechanisms TRACE CACHES Eric Rotenberg North Carolina State University Traces Core Fetch Unit Based on Instruction Cache Trace Cache Operation Path Associativity Indexing Strategy Partial Matching Coupling Branch Prediction with the Trace Cache Trace Selection Policy Multi-Phase Trace Construction Managing Overlap between Instruction Cache and Trace Cache Speculative vs. Non-Speculative Trace Cache Updates Powerful vs. Weak Core Fetch Unit Parallel vs. Serial Instruction Cache Access L1 vs. L2 Instruction Cache Loop Caches BRANCH PREDICATION David August, Princeton University Overcoming Branch Problems with Predication If-Conversion Predicate Optimization and Analysis The Predicated Intermediate Representation Hewlett-Packard Laboratories PD Cydrome Cydra 5 ARM Texas Instruments C6X Systems with Limited Predicated Execution Support Predication in the Itanium 2 Processor MULTIPATH EXECUTION Augustus K. Uht, University of Rhode Island Branch Tree Geometry Branch Path/Instruction ID Phases of Operation Granularity With Predication With Data Speculation Compiler-Assisted Hardware: Classically-Based Hardware: Non Classically-Based Multiprocessors Functional or Logic Language Machines Branch Prediction Confidence Estimation Pipeline Depth Implications of Amdahl's Law - ILP Version Memory Bandwidth Requirements DATA CACHE PREFETCHING Yan Solihin, North Carolina State University, and Donald Yeung, University of Maryland at College Park Architectural Support Array Prefetching Pointer Prefetching Relationship with Data Locality Optimizations Stride and Sequential Prefetching Correlation Prefetching Content-Based Prefetching ADDRESS PREDICTION Avi Mendelson, Intel Mobil Micro-Processor Architect Terminology and Definitions Non-Speculative Address Calculation Techniques Speculative Address Calculation Techniques Chapter Focus Characterization of Address Predictability Address Predictability vs. Value Predictability Combining Address Prediction with Prefetching Mechanism Basic Characterization Load Promotion Memory Bypassing Compiler Based Speculative Load Promotion DATA SPECULATION Yiannakis Sazeides, University of Cyprus; Pedro Marcuello, Intel-UPC Barcelona Research Center; James E. Smith, University of Wisconsin-Madison; and Antonio Gonzalez, Universitat Polit'ecnica de Catalunya Basic Value Predictors Value Predictor Alternatives Confidence Estimation Implementation Issues Data Dependence Predictors Verification Recovery Other Microarchitectural Implications of Data Value Speculation Related Work: Data Value Speculation Related Work: Data Dependence Speculation INSTRUCTION PRECOMPUTATION: DYNAMICALLY REMOVING REDUNDANT COMPUTATIONS USING PROFILING Joshua J. Yi, Freescale Semiconductor Inc.; Resit Sendag, University of Rhode Island; and David J. Lilja, University of Minnesota at Twin Cities A Comparison of Instruction Precomputation and Value Reuse Upper-Bound - Profile A, Run A Different Input Sets - Profile B, Run A Combination of Input Sets - Profile AB, Run A Frequency versus Frequency and Latency Product Performance of Instruction Precomputation versus Value Reuse PROFILE-BASED SPECULATION Youfeng Wu and Jesse Fang, Intel Microprocessor Technology Labs Control Flow Profile Memory Profile Value Profile Static Analysis Instrumentation Hardware Performance Monitoring Special Hardware Software-Hardware Collaborative Profiling Compile-time Profiling Runtime Profiling Continuous Profiling Trace Scheduling Hot-Cold Optimizations Code Layout Data Layout Stride Prefetching Hot Data Stream Prefetching Mississippi Delta Prefetching Java Runtime Parallelizing Machine Speculative Parallel Threading Speculative Computation Reuse Software-Based Speculative Precomputation Stability across Multiple Workloads Update When Program Changes Maintenance during optimizations Perturbation by Profiling Code COMPILATION AND SPECULATION Jin Lin, Wei-Chung Hsu and Pen-Chung Yew University of Minnesota, Minneapolis Alias Profiling Data Dependence Profiling Speculative Alias and Dataflow Analyses A Framework for Speculative Alias Analysis and Dataflow Analysis Overview F-Insertion Step Rename Step Downsafety Step CodeMotion Step Recovery Code Generation for General Speculative Optimizations Check Instructions and Recovery Code Representation for Multi-Level Speculation Interaction of the Early Introduced Recovery Code with Later Optimizations MULTITHREADING AND SPECULATION Pedro Marcuello, Jesus Sanchez and Antonio Gonzalez Intel-UPC Barcelona Research Center; Intel Labs; Universitat Politecnica de Catalunya; Barcelona (Spain) Building Helper Threads Microarchitectural Support for Helper Threads Thread Spawning Schemes Microarchitectural Support for Speculative Architectural Threads References Andreas Moshovos, University of Toronto EXPLOITING LOAD/STORE PARALLELISM VIA MEMORY DEPENDENCE PREDICTION Static Methods Hybrid Static/Dynamic Methods Dynamic Methods Working Example Multiple Dependences Per Static Load or Store Methodology Performance Potential of Load/Store Parallelism Performance with Naive Memory Dependence Speculation Using Address-Based Scheduling to Extract Load/Store Parallelism Speculation/Synchronization RESOURCE FLOW MICROARCHITECTURES David A. Morano and David R. Kaeli, Northeastern University The Operand as a First Class Entity Dynamic Dependency Ordering Handling Multipath Execution Names and Renaming The Active Station Idea Register and Memory Operand Storage Operand Forwarding and Snooping Result Forwarding Buses and Operand Filtering A Small Resource-Flow Microarchitecture A Distributed Scalable Resource-Flow Microarchitectureshow more