Skip to content

3. Speedup & PRAMs

Jan 20, 2021

Speedup

  • on a uniprocessor: P1=T1(N)
  • with multiple processors: P=TP(N)

Speedup S=T1(N)TP(N)

Goal is to achieve linear speedup

  • T1 is the best known sequential time
    • when both upper (O) and lower (Ω) bound, then we have Θ bound
    • for example: linear search of an array with N values is Θ(N)
    • T1 is always expressed as optimal time T1

Example 1: PRAM find value

Start with P=N processors on (concurrent read) PRAM model and same size input

1
2
3
4
5
6
shared X[1...N]

for P = 1 ... PID parbegin:
    if X[PID] == value:
        print("found")
parend

Time: TP=Θ(1)

Speedup: S=T1TP=Θ(N)Θ(1)=Θ(N)=Θ(P)

Important! - never divide by O - always express speedup in terms of P - always compare to time-optimal T1

Example 2: PRAM find value v. 2

Similar setting but assume CREW PRAM. Use this different algorithm, this time (try to) use variable to determine found state and index of where the match was found:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
shared X[1...N]
found = false
index_where = None

for P = 1 ... PID parbegin:
    if X[PID] == value:
    then:
        found = true
        index_where = PID  # issue!
    else:
        noop, noop
parend

if found
    do Proc(A)
else
    do Proc(B)

What is the problem?

Simultaneous writes by multiple processors into same memory location → runtime errors

PRAM types

  • EREW PRAM = exclusive read, exclusive write: no two processors allowed to read/write to/from same shared memory location
  • CREW PRAM = concurrent read, exclusive write
  • CRCW PRAM = concurrent read and write
    • the most popular model
    • there are subcategories of CRCW PRAM:
      • Common CRCW PRAM: concurrent writes must write the same value
      • Priority CRCW PRAM: processor with higher priority gets to write
      • arbitrary CRCW PRAM: concurrent writes resolved arbitrarily