Skip to content

24. Omega Networks

Apr 12, 2021

Omega Networks (cont.)

img

  • omega networks built using shuffle exchange
  • number of wires/nodes = 4
  • Hardware cost \(HW \$ = \Theta(P) + \# \text{stages} \cdot \text{cost of stage} = P + \log P \times P/2 = \Theta(P \lg P)\)
  • Latency \(\lt 2 \cdot \lg P\) (\(\log P\) == number of stages)
  • Work = \(T \cdot HW \$\)
  • everything is slowed down a log factor on this hardware
  • there are embedded binary trees in the hardware → we can do associative ops in same time as PRAM: \(W \cdot \log P\)

How to enforce EREW/CREW/CRCW?

  • EREW: runtime error if using the same wires
  • CRCW: use combining switch that does combining
    • remembers that outgoing message must follow along multiple wires
    • broadcast reads on both output wires

Writes

  • common: combine messages, do 1 write
  • arbitrary: randomly drop message
  • priority: compare sources and drop lower priority message

Not all permutations are possible → can have blocks

  • such network is called blocking network
  • combining does not help because the carried values may be different
  • need higher bandwidth to carry more values

Are there non-blocking networks?

  • yes, solution exists for non-blocking networks of depth \(\exists \Theta(\log P)\)
  • on practice there exists depth \(\Theta(\lg^2 P)\)
  • simplest: omega network repeated \(\log P\) times
  • routing of this network gets very complicated and requires an additional controller connected to all stages → controller computes route over \(\log P\) stages offline

Pipelining

The logarithmic overhead of omega network does not always apply, e.g.

\(z\) sets \(\sum\)
1 \(\lg P\)
2 +1
3 +1
  • \(HW \$ = T_z = \log P + (z - 1)\)
  • \(W = T_z \cdot (P \lg P) \lt P \lg^2 P + z P \lg P = \Theta z(P \lg P)\) (amortized)
  • make \(z \gt \lg P\) to amortize the \(HW \$\) overhead → good for pipelining

Recirculating Network

Idea is to use one stage to perform as if there were multiple stages.

img

Complications:

  • each switch must have a counter to know which action to perform on input; counter is mod \(\log P\)
  • need to store \(\log P\) routing info
  • pro: \(HW \$\) is now \(\Theta(P)\)
  • has same blocking issue, cannot pipeline
  • this is not a universal hardware → need to know what you are doing when using it!

Prefix sum with omega network

Read 2 inputs, add, send up

\(N = 8\), \(P = 7\), \(N \approx P\)

\(P = 2L - 1 = 2 \cdot \frac{N}{2} - 1 = \Theta(N)\)

img

  • \(T^{(1)} = \Theta(\lg N)\)
  • \(T^{(2)} = +1\)
  • \(\dots\)
  • \(T^{(7)} = +1\)
  • \(T = \Theta(\lg N) + 2 - 1\)
  • \(W = \Theta(N \lg N) + \Theta(P2)\)