Sixty-nine people contributed to this book of high-performance programming examples. Optimization work on real-world examples from around the world is shown with code (which can also be downloaded) and explanations. The examples run on processors and Intel Xeon Phi coprocessors using the same code. An outstanding read for those interested in how to optimize code for today's machines.
(James Reinders and Jim Jeffers, editors and contributors, with the book.)
More information at:
I have a copy of my latest book (with 6 wonderful co-authors)! Based on the SIGGRAPH tutorial we did last year, it reviews successful techniques for parallel programming in applications doing visual effects (think: animated movies!). The most referenced technique is TBB, although other methods, including OpenCL, are discussed.
Nice write-up associated with its release at SIGGRAPH 2014.
Piper is an experimental prototype of Intel® Cilk™ Plus that provides library headers and runtime support for pipe-while loops. A pipe-while loop is a new parallel loop construct described in a recent paper, "On-the-fly pipeline parallelism," published in July 2013 in collaboration with researchers at MIT. A pipe-while loop is a generalization of an ordinary while loop that allows for pipeline parallelism between iterations.
You can learn more on the Cilk Plus website.
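To make the idea of pipeline parallelism between loop iterations concrete, here is a minimal sketch in standard C++. It does not use the Piper headers or any Cilk Plus syntax. The function name `pipeline_squares`, the `max_in_flight` parameter, and the three-stage split are all hypothetical, chosen only for illustration. Each iteration has a serial first stage (taking the next item), a middle stage that may overlap with other iterations, and a serial last stage that collects results in iteration order.

```cpp
#include <cstddef>
#include <deque>
#include <future>
#include <vector>

// Illustrative only: a while loop whose iterations overlap like a pipeline.
// This is NOT the Piper API; all names here are assumptions for the sketch.
std::vector<int> pipeline_squares(int n, std::size_t max_in_flight = 4) {
    if (max_in_flight == 0) max_in_flight = 1;  // need at least one slot

    std::vector<int> out;
    std::deque<std::future<int>> in_flight;  // iterations started, not yet finished
    int i = 0;

    while (i < n || !in_flight.empty()) {
        // Serial last stage: retire the oldest iteration first, so results
        // are appended in iteration order. Runs when the window is full or
        // when there is no more input to start.
        if (in_flight.size() == max_in_flight || i >= n) {
            out.push_back(in_flight.front().get());
            in_flight.pop_front();
        }
        if (i < n) {
            // Serial first stage: take the next item in order.
            int item = i++;
            // Middle stage: independent work that may run concurrently
            // with the middle stages of other iterations.
            in_flight.push_back(std::async(std::launch::async,
                                           [item] { return item * item; }));
        }
    }
    return out;
}
```

Calling `pipeline_squares(5)` yields the squares of 0 through 4 in order. The bound on iterations in flight keeps the loop from starting arbitrarily many iterations ahead of the oldest unfinished one. On some toolchains, code using `std::async` must be linked with thread support (for example `-pthread`).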
The latest compilers require some changes to the Cilk Plus examples in the book, so we have updated the source code download to match. You can download the revised/corrected Cilk Plus examples here (0.5MB) or on the download page.
The following details about the tutorial which used this presentation are on the SC13 website:
Structured Parallel Programming with Patterns
SESSION: Structured Parallel Programming with Patterns
Tutorial, 8:30am-5:00pm, Room 302, November 17, 2013 (SC13 - Denver)
Michael D. McCool - Intel Corporation
James R. Reinders - Intel Corporation
Arch Robison - Intel Corporation
Michael Hebenstreit - Intel Corporation
Parallel programming is important for performance, and developers need a comprehensive set of strategies and technologies for tackling it. This tutorial is intended for C++ programmers who want to better grasp how to envision, describe and write efficient parallel algorithms at the single shared-memory node level. This tutorial will present a set of algorithmic patterns for parallel programming. Patterns describe best known methods for solving recurring design problems. Algorithmic patterns in particular are the building blocks of algorithms. Using these patterns to develop parallel algorithms will lead to better structured, more scalable, and more maintainable programs. This course will discuss when and where to use a core set of parallel patterns, how to best implement them, and how to analyze the performance of algorithms built using them. Patterns to be presented include map, reduce, scan, pipeline, fork-join, stencil, tiling, and recurrence. Each pattern will be demonstrated using working code in one or more of Cilk Plus, Threading Building Blocks, OpenMP, or OpenCL. Attendees also will have the opportunity to test the provided examples themselves on an HPC cluster for the time of the SC13 conference.
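As a rough illustration of the first three patterns named in the abstract (map, reduce, and scan), here is a small sketch using only the C++ standard library. It is not taken from the tutorial materials, and it runs serially. It shows the shape of each pattern, not the Cilk Plus, TBB, OpenMP, or OpenCL versions the tutorial demonstrates. The function names are hypothetical.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Map: apply the same independent function to every element.
// No element depends on another, so all of them could run in parallel.
std::vector<int> map_square(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(),
                   [](int x) { return x * x; });
    return out;
}

// Reduce: combine all elements into one value with an associative
// operation (here, addition). Associativity is what lets a parallel
// version regroup the additions; std::accumulate itself is strictly
// left-to-right.
int reduce_sum(const std::vector<int>& in) {
    return std::accumulate(in.begin(), in.end(), 0);
}

// Scan (inclusive): out[i] is the reduction of in[0..i].
// std::partial_sum computes this serially.
std::vector<int> scan_sum(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    std::partial_sum(in.begin(), in.end(), out.begin());
    return out;
}
```

For the input `{1, 2, 3, 4}`, `map_square` gives `{1, 4, 9, 16}`, `reduce_sum` gives `10`, and `scan_sum` gives `{1, 3, 6, 10}`. C++17 adds `std::reduce` and `std::inclusive_scan`, which are allowed to reorder and regroup the operations. That is the property a parallel implementation of these patterns relies on.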