Scalable Parallel Computing

K. Hwang and Z. Xu, Scalable Parallel Computing: Technology, Architecture, Programming, McGraw-Hill, New York, NY, 1998. ISBN 0-07-031798-4.

Chapter 1: Scalable Computer Platforms and Models (p. 3-50)
- Evolution of Computer Architectures
  - Five generations of machines
- Scalable Computer Architectures
  - Functionality and Performance
  - Scaling in Cost
  - Compatibility
- System Architectures
  - Shared Nothing
  - Shared Disk
  - Shared Memory
- Macro-Architecture vs. Micro-Architecture
- Dimensions of Scalability
  - Resource Scalability
  - Application Scalability
  - Technology Scalability
- Parallel Computer Models
  - Semantic Attributes: Homogeneity, Synchrony, Interaction Mechanism, Address Space, Memory Model
  - Performance Attributes: machine size, clock rate, workload, sequential execution time, parallel execution time, speed, speedup, efficiency, utilization, startup time, asymptotic bandwidth
- Abstract Machine Models
  - PRAM: Tcomp and Tload imbalance; simple shared-variable model
  - Bulk Synchronous Parallel (BSP): Tcomp, Tload imbalance, Tcommunication, and Tsynchronization; includes interaction overhead; superstep execution: compute, interact, synchronize
  - Phase Parallel: Tcomp, Tload imbalance, Tcommunication, Tsynchronization, and Tparallel; includes all overhead; execution phases: Parallelism Phase, Computation Phase, and Interaction Phase
- Physical Machine Models
  - Parallel Vector Processor (PVP): UMA, shared memory, custom crossbar interconnection, "classic" supercomputers
  - Symmetric Multiprocessor (SMP): UMA, shared memory, crossbar or bus, hard to scale
  - Massively Parallel Processor (MPP): NORMA, message passing, custom interconnections
  - Distributed Shared Memory (DSM): NUMA or NORMA, shared memory (hardware- or software-based), possible cache directories
  - Cluster of Workstations (COW): NORMA, message passing, commodity processors and interconnection
- Basic Concept of Clustering
  - Cluster Nodes
  - Single System Image (SSI)
  - Internode Connection
  - Enhanced Availability
  - Better Performance
- Cluster benefits and difficulties: usability, availability, scalability, utilization, and performance/cost ratio; SSI remains a challenge
- Scalable Design Principles
  - Principle of Independence
  - Balanced Design
  - Design for Scalability
  - Latency Hiding

Chapter 2: Basics of Parallel Programming (p. 59-77)
- Comparison of parallel and sequential programming
- Programming Components and Considerations
  - Processes, Tasks, and Threads
  - Process State and State Table
  - Process Descriptor
  - Process Context
  - Execution Mode: kernel, user
- Parallelism Issues
  - Homogeneity in Processes
  - Language Constructs
  - Static versus Dynamic Parallelism
  - Process Grouping
  - Allocation Issues
    - DOP: degree of parallelism
    - Granularity: also called grain size
- Interaction/Communication Issues
  - Communication
  - Synchronization
  - Aggregation
- Data and Resource Dependence
  - Flow dependence
  - Anti-dependence
  - Output dependence
  - I/O dependence
  - Unknown dependence
- Bernstein Conditions (tasks i and j may execute in parallel): Ii ∩ Oj = ∅, Oi ∩ Ij = ∅, Oi ∩ Oj = ∅
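The Bernstein conditions above can be checked mechanically: two tasks may run in parallel only if neither reads a variable the other writes and they write no variable in common. A minimal sketch in Python (the statements S1-S3 and their read/write sets are made-up examples, not from the book):

    def can_run_in_parallel(in_i, out_i, in_j, out_j):
        """Bernstein's conditions; each argument is the set of variables a
        task reads (input set I) or writes (output set O)."""
        return (not (in_i & out_j)        # Ii ∩ Oj = ∅: i never reads what j writes
                and not (out_i & in_j)    # Oi ∩ Ij = ∅: j never reads what i writes
                and not (out_i & out_j))  # Oi ∩ Oj = ∅: no variable written by both

    # S1: a = b + c   reads {b, c}, writes {a}
    # S2: d = b * 2   reads {b},    writes {d}
    # S3: a = a - 1   reads {a},    writes {a}
    print(can_run_in_parallel({"b", "c"}, {"a"}, {"b"}, {"d"}))  # True: S1 and S2 are independent
    print(can_run_in_parallel({"b", "c"}, {"a"}, {"a"}, {"a"}))  # False: S1 and S3 conflict on a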
Chapter 3: Performance Metrics and Benchmarks (p. 91-154)
- Benchmarks have been defined to focus on specific machine characteristics
  - Micro benchmarks: specific functions or attributes
  - Macro benchmarks: functional programs representative of a class of applications
- Performance of Parallel Computers
  - Computations
  - Parallelism and Interaction Overhead
    - Parallelism Overhead: process management, grouping operations (creation/destruction of groups), process inquiry operations
    - Interaction Overhead: synchronization, communication, aggregation (broadcast, scatter, gather, total exchange)
- Performance Metrics
  - Sequential Time, Parallel Time, Critical Path Time
  - Speed, Speedup, Efficiency, Utilization
  - Total Overhead
- Scalability and Speedup Analysis (a numeric sketch of the first two laws follows the Chapter 6 outline)
  - Amdahl's Law: fixed problem size
  - Gustafson's Law: fixed time
  - Sun and Ni's Law: memory/resource bounding
  - Iso-performance Models

Chapter 4: Microprocessors as Building Blocks (p. 155-210)
- Instruction Pipeline Design Issues
  - Pipeline cycle or processor cycle
  - Instruction issue latency
  - Cycles per instruction (CPI)
  - Instruction issue rate
  - Simple operations
  - Complex operations
  - Resource conflicts
  - Instruction execution ordering
- From CISC to RISC and Beyond
  - Scalar
  - Superscalar
  - Superpipelined
  - Superscalar-Superpipelined
  - VLIW
  - Multimedia Extensions
- Future Microprocessors
  - Multiway Superscalar
  - Superspeculative Processor
  - Simultaneous Multithreaded Processor
  - Trace (multiscalar) Processor
  - Vector IRAM Processor
  - Single-chip Multiprocessors
  - Raw (configurable) Processors

Chapter 5: Distributed Memory and Latency Tolerance (p. 211-272)
- Memory Hierarchy
  - Inclusion Property
  - Coherence
  - Contention
  - Locality of Reference Properties: temporal, spatial, sequential
- Memory Planning
  - Capacity
  - Average Access Time
- Cache Coherency Protocols
  - Sources of incoherence: writes by different processors, process migration, I/O operations
  - Snoopy protocols or Cache Directories
    - Snoopy Coherency Protocols: must be able to observe memory transfers; write-update vs. write-invalidate; MESI
    - Cache Directories
- Shared Memory Consistency
  - Memory event ordering
  - Memory Consistency Models: Strict Consistency, Sequential Consistency, Processor Consistency, Weak Consistency, Release Consistency
- Distributed Cache/Memory Architectures
  - UMA, NUMA, COMA, NORMA
  - SMP: centralized memory architecture
  - Others: distributed memory architectures
- Cache Coherence Considerations
  - Cache coherent (cc)
  - Non cache coherent (ncc)
  - Software cache coherent (sc)
- Latency Tolerance Techniques
  - Latency avoidance, reduction, and hiding
  - Distributed Coherent Caches
  - Data Prefetching
  - Relaxed Memory Consistency
  - Multithreaded Latency Hiding

Chapter 6: System Interconnections and Gigabit Networks (p. 273-342)
- Basic Interconnection Network
  - Network Components
  - Network Characteristics
  - Network Properties
- Network Topologies
  - Node degree, network diameter, bisection width
  - Buses, Crossbar, and Multistage Interconnection Networks (MIN)
- Gigabit Network Technology
  - Ethernet
  - ATM
  - Scalable Coherent Interface (SCI)
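To make the first two speedup laws from Chapter 3 concrete (this is the numeric sketch referred to above; the 5% serial fraction is an arbitrary example value):

    def amdahl_speedup(serial_fraction, n):
        """Fixed problem size (Amdahl's Law): S = 1 / (f + (1 - f) / n)."""
        return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)

    def gustafson_speedup(serial_fraction, n):
        """Fixed execution time (Gustafson's Law): S = f + n * (1 - f),
        where f is the serial fraction of the scaled parallel run."""
        return serial_fraction + n * (1.0 - serial_fraction)

    # A 5% serial fraction on 64 processors:
    print(amdahl_speedup(0.05, 64))     # ≈ 15.4: fixed-size speedup is capped far below 64
    print(gustafson_speedup(0.05, 64))  # ≈ 60.8: scaling the workload keeps efficiency high

The contrast is the point of Gustafson's argument: when the problem grows with the machine, the serial fraction no longer dominates the way Amdahl's fixed-size bound suggests.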
Chapter 7: Threading, Synchronization, and Communication (p. 343-402)
- Software Multithreading: the thread concept
  - Threads: thread states and thread management
  - Lightweight Process (LWP): LWP states and LWP management
  - Heavyweight Process
  - Kernel vs. User-Level Processing
- Synchronization Mechanisms
  - Synchronization problems faced by users
  - Language constructs employed by the user to solve the synchronization problem (high-level constructs)
  - Synchronization primitives available in multiprocessor architectures (low-level constructs)
  - Algorithms used to implement the high-level constructs with the available low-level constructs
- The TCP/IP Communication Protocol Suite
  - OSI and Internet protocol stacks
  - Network Addressing
  - TCP, UDP, and IP
- Fast and Efficient Communications
  - Effective Bandwidth
  - Network Interface Circuitry
  - Software communication libraries

Chapter 8: Symmetric and CC-NUMA Multiprocessors (p. 407-452)
- SMP and CC-NUMA Technology
  - Availability
  - Bottleneck
  - Latency
  - Memory Bandwidth
  - I/O Bandwidth
  - Scalability
  - Programming Advantage
- Typical Applications: Commercial SMP Servers
- Comparison of CC-NUMA Architectures
  - Architecture
  - Shared Memory Access
  - Enhanced Scalability
  - Concerns

Chapter 9: Support of Clusters and Availability (p. 453-504)
- Challenges of Clustering
- Classification Attributes
  - Dedicated Cluster
  - Enterprise Cluster
- Cluster Design Issues
- Availability Support for Clustering
  - Reliability, Availability, Serviceability
  - Types of Failures
  - Availability Techniques: isolated redundancy; hot standby, mutual takeover, and fault-tolerant configurations
  - Failover and Recovery Schemes
  - Checkpointing and Failure Recovery Methods: overhead, what to checkpoint, consistent snapshot
- Support for Single System Image
  - Single System (application, above-kernel, kernel/hardware levels)
  - Single Control
  - Use from any entry point
  - Location Transparent
- Job Management in Clusters
  - Characteristics of Cluster Workload
  - Job Scheduling Issues
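As a rough, single-process illustration of the checkpointing ideas listed under Chapter 9 (what to checkpoint, overhead, recovery), the sketch below periodically saves job state and resumes from the last snapshot after a restart; the file name, checkpoint interval, and state layout are invented for the example:

    import os
    import pickle

    CHECKPOINT = "job.ckpt"  # hypothetical checkpoint file name

    def run_job(total_steps=1000):
        # Recover from the last snapshot if one exists, otherwise start fresh.
        if os.path.exists(CHECKPOINT):
            with open(CHECKPOINT, "rb") as f:
                state = pickle.load(f)
        else:
            state = {"step": 0, "partial_sum": 0}

        while state["step"] < total_steps:
            state["partial_sum"] += state["step"]  # stand-in for real work
            state["step"] += 1

            # Checkpoint every 100 steps: write to a temporary file and rename,
            # so a crash mid-write never corrupts the last good snapshot.
            if state["step"] % 100 == 0:
                with open(CHECKPOINT + ".tmp", "wb") as f:
                    pickle.dump(state, f)
                os.replace(CHECKPOINT + ".tmp", CHECKPOINT)

        return state["partial_sum"]

    if __name__ == "__main__":
        print(run_job())

A cluster-wide checkpoint has the extra problem the outline calls a consistent snapshot: the per-node checkpoints must be coordinated so that, taken together, they describe a state the whole system could actually have been in (no message recorded as received on one node that was never recorded as sent on another).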