Parallelism In The SeqAn Library: Patterns for Parallelization

A brief summary of patterns for parallelization, based on [Massingill et al. 2000].

Finding Concurrency

Decomposition

Task decomposition

Identify functional units that can be computed independently/in parallel.

Tasks may map to function calls or independent loop iterations.

Data decomposition

Identify physically contiguous blocks of data to operate on, partition the data appropriately, and assign a task to each partition.

For example, decompose loops into independent tasks for each iteration, or partition loop/data into chunks and assign one task to each chunk.

Dependency analysis

Task grouping

Aggregate tasks to form task groups, thus coarsening the granularity.

“Tasks that correspond to a high-level operation naturally group together.”

Merge tasks that share the same constraints (shared data …).

Task ordering

Identify ordering constraints between task groups:

  • Tasks that need to execute concurrently.
  • Tasks that need to execute sequentially because the second depends on the results of the first.
  • Absence of any constraints between sets of tasks.

Data sharing

Identify which data is

  • Task local
  • Shared, but read-only
  • Effectively task local (i.e. partitioned into subsets corresponding to tasks)
  • Read/write
    • Accumulated
    • Scattered

Identify border regions where data sets of different tasks overlap.

Evaluation

Suitability for the target platform

  • How many PEs (processing elements) are available?
  • How are data structures shared among the PEs?
  • What does the target architecture imply about the number of UEs (units of execution) and how data structures are shared among them?
  • Will the time spent doing useful work in a task be significantly greater than the time taken to deal with dependencies?

Design quality

  • Flexibility
  • Efficiency
  • Simplicity

Preparation for the next phase

  • How regular are the tasks and their data dependencies?
  • Are interactions between tasks (groups) synchronous or asynchronous?
  • Are the tasks grouped in the best way?

Algorithm Structure

Organized by …

… tasks

  • Task parallelism
  • Divide and conquer

… data decomposition

  • Geometric decomposition
  • Recursive decomposition
  • Domain decomposition

… flow of data

  • Pipeline – Each stage should take approximately the same time. Successive stages execute in parallel, communicating via a queue. Stateless stages can be replicated to increase parallelism.
  • Event-based coordination

Supporting Structures

Program structures

  • Master/worker
    • static: each worker is assigned exactly one task, processes it, then returns
    • dynamic: workers stand by until they receive work via an input queue, process it, push the result into an output queue, then idle again.
  • Producer/consumer
    • Fixed-size queue
    • Consumers wait while queue empty; producers wait while queue full
    • Requires synchronized access to the queue (e.g. a mutex with condition variables as guards)
  • Loop parallelism
  • Fork/join
  • Leader/follower
  • Work stealing
    • Asynchronous round-robin
    • Global round-robin
    • Random polling

Literature

  • [Massingill et al., 2000] B. L. Massingill, T. G. Mattson, and B. A. Sanders. Patterns for finding concurrency for parallel application programs. In 7th Pattern Languages of Programs Conference (PLoP 2000), 2000.
  • [Kumar et al., 1994] V. Kumar, A. Grama, A. Gupta, and G. Karypis. Introduction to Parallel Computing: Design and Analysis of Algorithms. The Benjamin/Cummings Publishing Company, Redwood City, California, 1994.
Topic revision: r12 - 21 Jun 2010, KonradRudolph
 