Stream processor architecture

by Scott Rixner

Media processing applications, such as three-dimensional graphics, video compression, and image processing, currently demand 10-100 billion operations per second of sustained computation. Fortunately, hundreds of arithmetic units can easily fit on a modestly sized 1 cm² chip in modern VLSI. The challenge is to provide these arithmetic units with enough data to enable them to meet the computation demands of media processing applications. Conventional storage hierarchies, which frequently include caches, are unable to bridge the data bandwidth gap between modern DRAM and tens to hundreds of arithmetic units. A data bandwidth hierarchy, however, can bridge this gap by scaling the provided bandwidth across the levels of the storage hierarchy.

The stream programming model enables media processing applications to exploit a data bandwidth hierarchy effectively. Media processing applications can naturally be expressed as a sequence of computation kernels that operate on data streams. This programming model exposes the locality and concurrency inherent in these applications and enables them to be mapped efficiently to the data bandwidth hierarchy. Stream programs are able to utilize inexpensive local data bandwidth when possible and consume expensive global data bandwidth only when necessary.

Stream Processor Architecture presents the architecture of the Imagine streaming media processor, which delivers a peak performance of 20 billion floating-point operations per second. Imagine efficiently supports 48 arithmetic units with a three-tiered data bandwidth hierarchy. At the base of the hierarchy, the streaming memory system employs memory access scheduling to maximize the sustained bandwidth of external DRAM. At the center of the hierarchy, the global stream register file enables streams of data to be recirculated directly from one computation kernel to the next without returning data to memory. Finally, local distributed register files that directly feed the arithmetic units enable temporary data to be stored locally so that it does not need to consume costly global register bandwidth. The bandwidth hierarchy enables Imagine to achieve up to 96% of the performance of a stream processor with infinite bandwidth from memory and the global register file.
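The idea of expressing an application as a sequence of kernels over data streams can be illustrated with a minimal Python sketch. This is only an analogy and not Imagine's actual programming interface: the names `scale_kernel`, `offset_kernel`, and `run_pipeline` are hypothetical, and Python generators stand in loosely for streams that pass from one kernel to the next without the intermediate result being stored as a whole.

```python
from typing import Callable, Iterable, Iterator, List

# A "stream" is modelled here as a plain Python iterator of numbers.
Stream = Iterator[float]
Kernel = Callable[[Stream], Stream]


def scale_kernel(stream: Stream, gain: float) -> Stream:
    """Hypothetical kernel: multiply every stream element by a gain."""
    for x in stream:
        yield x * gain


def offset_kernel(stream: Stream, bias: float) -> Stream:
    """Hypothetical kernel: add a bias to every stream element."""
    for x in stream:
        yield x + bias


def run_pipeline(source: Iterable[float], kernels: List[Kernel]) -> List[float]:
    """Chain kernels so each consumes the previous kernel's output stream.

    Intermediate streams are produced lazily, element by element, and are
    never materialized as full lists; only the final result is collected.
    """
    stream: Stream = iter(source)
    for kernel in kernels:
        stream = kernel(stream)
    return list(stream)


result = run_pipeline(
    [1.0, 2.0, 3.0],
    [lambda s: scale_kernel(s, 2.0), lambda s: offset_kernel(s, 1.0)],
)
```

In this sketch the producer-consumer relationship between kernels is explicit in the program structure, which is the property the abstract credits with exposing locality and concurrency. How a real stream processor schedules and stores those streams is outside what this toy example shows.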

From Nielsen BookData

[Table of Contents]

  • Foreword.
  • Acknowledgements.
  • 1: Introduction. 1.1. Stream Architecture. 1.2. The Imagine Media Processor. 1.3. Contributions. 1.4. Overview.
  • 2: Background. 2.1. Special-purpose Media Processors. 2.2. Programmable Media Processors. 2.3. Vector Processors. 2.4. Stream Processors. 2.5. Storage Hierarchy. 2.6. DRAM Access Scheduling. 2.7. Summary.
  • 3: Media Processing Applications. 3.1. Media Processing. 3.2. Sample Applications. 3.3. Application Characteristics.
  • 4: The Imagine Stream Processor. 4.1. Stream Processing. 4.2. Architecture. 4.3. Programming Model. 4.4. Implementation. 4.5. Scalability and Extensibility.
  • 5: Data Bandwidth Hierarchy. 5.1. Overview. 5.2. Communication Bottlenecks. 5.3. Register Organization. 5.4. Evaluation. 5.5. Summary.
  • 6: Memory Access Scheduling. 6.1. Overview. 6.2. Modern DRAM. 6.3. Memory Access Scheduling. 6.4. Evaluation. 6.5. Summary.
  • 7: Conclusions. 7.1. Imagine Summary. 7.2. Future Architectures.
  • References.
  • Index.

From Nielsen BookData

About this book

Title: Stream processor architecture
Author(s): Rixner Scott
Series: The Kluwer international series in engineering and computer science
Publisher: Kluwer Academic Publishers
Publication date: c2002
Pages: xiv, 120 p.
Size: 24 cm
ISBN: 0792375459
NCID: BA5754572X
Language: English
Country of publication: United States