3. Speedup & PRAMs

Jan 20, 2021

Speedup

on a uniprocessor: \(P_1 = T_1(N)\)
with multiple processors: \(P = T_P(N)\)

Speedup \(S = \frac{T_1(N)}{T_P(N)}\)

Goal is to achieve linear speedup

\(T_1^{*}\) is the best known sequential time
- when both upper (\(O\)) and lower (\(\Omega\)) bound, then we have \(\Theta\) bound
- for example: linear search of an array with \(N\) values is \(\Theta(N)\)
- \(T_1\) is always expressed as optimal time \(T_1^*\)

Example 1: PRAM find value

Start with \(P = N\) processors on (concurrent read) PRAM model and same size input

shared X[1...N]

for P = 1 ... PID parbegin:
    if X[PID] == value:
        print("found")
parend

Time: \(T_P = \Theta(1)\)

Speedup: \(S = \frac{T_1^*}{T_P} = \frac{\Theta(N)}{\Theta(1)} = \Theta(N) = \Theta(P)\)

Important! - never divide by \(O\) - always express speedup in terms of \(P\) - always compare to time-optimal \(T_1^*\)

Example 2: PRAM find value v. 2

Similar setting but assume CREW PRAM. Use this different algorithm, this time (try to) use variable to determine found state and index of where the match was found:

shared X[1...N]
found = false
index_where = None

for P = 1 ... PID parbegin:
    if X[PID] == value:
    then:
        found = true
        index_where = PID  # issue!
    else:
        noop, noop
parend

if found
    do Proc(A)
else
    do Proc(B)

What is the problem?

Simultaneous writes by multiple processors into same memory location → runtime errors

PRAM types

EREW PRAM = exclusive read, exclusive write: no two processors allowed to read/write to/from same shared memory location
CREW PRAM = concurrent read, exclusive write
CRCW PRAM = concurrent read and write
- the most popular model
- there are subcategories of CRCW PRAM:
  - Common CRCW PRAM: concurrent writes must write the same value
  - Priority CRCW PRAM: processor with higher priority gets to write
  - arbitrary CRCW PRAM: concurrent writes resolved arbitrarily