Skip to content

10. PDAs and CFGs

Feb 8-10, 2022

PDA \(\Leftrightarrow\) CFG

Prove in two directions:

  1. PDA \(\subseteq\) CFG

    Convert any context-free grammar \(G\) to a PDA \(M\) with \(L(G) = L(M)\)

    Take any arbitrary CFG \(G\), conversion procedure:

    • for each production \(A \rightarrow w\) in \(G\), add transition: \(\epsilon, A \rightarrow w\)
    • for each terminal \(a\) in \(G\), add transition: \(a, a \rightarrow \epsilon\)



    Example:

    GRAMMAR                      ε,S → aSTb
    S → aSTb                     ε,S → b
    S → b                        ε,T → Ta
    T → Ta                       ε,T → ε
    T → ε                        a,a → ε
                                 b,b → ε
                   ε,ε → S          ⮏        ε,$ → $
     -→ ( q0 ) -----------------> ( q1 ) ---------------------> 《 q2 》

    In general we can show that grammar \(G\) generates string \(w\) \(S \overset{*}{\Rightarrow} w\) iff and only if PDA \(M\) accepts \(w\) \((q_0, w, \$) \overset{*}{\succ} (q_2, \epsilon, \$)\).


  2. PDA \(\supseteq\) CFG

    Convert any PDA \(M\) to a context-free grammar \(G\) with \(L(G) = L(M)\)

    Take any arbitrary PDA \(M\), modify PDA \(M\) so that:

    • the PDA has a single accept state
    • use new initial stack symbol # (not used in any transitions)
    • On acceptance the stack contains only stack symbol #
    • Each transition either pushes a symbol or pops a symbol but not both together

    Using this modified PDA, construct the grammar

    Denoting variables: \(A_{{q_i},{q_j}}\) where \(q_i, q_j\) are states of PDA

    for each state \(q\), grammar \(A_{qq} \rightarrow \epsilon\)
    for every 3 states \(p,q,r\) \(A_{pq} \rightarrow A_{pr}A_{rq}\)
    for every pair of transitions   \(p \overset{a,\epsilon \rightarrow t}{\longrightarrow} t\)     \(s \overset{b,t \rightarrow \epsilon}{\longrightarrow} q\) \(A_{pq} \rightarrow aA_{rs}b\)
    Initial state \(q_0\) and accept state \(q_f\) Start variable \(A_{{q_0}{q_f}}\)

Proving L(G) = L(M)

  1. \(L(G) \subseteq L(M)\)

    We need to show that if \(G\) has derivation \(A_{{q_0}{q_f}} \overset{*}{\Rightarrow} w\) there is an accepting computation in \(M\) \((q_0, w, \#) \overset{*}{\succ} (q_f, \epsilon, \#)\) with input string \(w\). Will actually show that if \(G\) has derivation \(A_{pq} \overset{*}{\Rightarrow} w\), then there is a computation in \(M\) \((p, w, \epsilon) \overset{*}{\succ} (q, \epsilon, \epsilon)\).

    Since there is no transition with # symbol:

    \(A_{{q_0}{q_f}} \overset{*}{\Rightarrow} w\)     =>     \((q_0, w, \epsilon) \overset{*}{\succ} (q_f, \epsilon, \epsilon)\)     =>     \((q_0, w, \#) \overset{*}{\succ} (q_f, \epsilon, \#)\)

    If \(A_{pq} \overset{*}{\Rightarrow} w\) (string of terminals) then there is a computation from state \(p\) to state \(q\) on string \(w\) which leaves the stack empty: \((q_0, w, \epsilon) \overset{*}{\succ} (q_f, \epsilon, \epsilon)\)

    Can formally prove this claim by induction on the number of steps in derivation: \(A_{pq} \Rightarrow \dots \Rightarrow w\) where "\(\Rightarrow \dots \Rightarrow\)" represents the number of steps.


    Base case: one derivation step \(A_{pq} \Rightarrow w\)

    means \(A_{qq} \rightarrow \epsilon\) kind production must have been used, therefore \(p=q\) and \(w = \epsilon\). This computation of PDA trivially exists: \((p, \epsilon, \epsilon) \overset{*}{\succ} (p, \epsilon, \epsilon)\).


    Inductive step: \(k\) derivation steps \(A_{pq} \Rightarrow \overset{k}{\dots} \Rightarrow w\)

    suppose for \(k\) steps \((p, w, \epsilon) \overset{*}{\succ} (q, \epsilon, \epsilon)\) holds. For \(k+1\) steps, show that \((p, w, \epsilon) \overset{*}{\succ} (q, \epsilon, \epsilon)\) holds.


    case 1:   \(A_{pq} \Rightarrow A_{pr}A_{rq} \Rightarrow \dots \Rightarrow w\), we can write \(w = yz\) then

        \(A_{pq} \Rightarrow \dots \Rightarrow y\) - at most \(k\) steps, and from IH \((p, y, \epsilon) \overset{*}{\succ} (r, \epsilon, \epsilon)\)

        \(A_{rq} \Rightarrow \dots \Rightarrow z\) - at most \(k\) steps, and from IH \((r, z, \epsilon) \overset{*}{\succ} (q, \epsilon, \epsilon)\)

        Then \((p, yz, \epsilon) \overset{*}{\succ} (r, z, \epsilon) \overset{*}{\succ} (q, \epsilon, \epsilon)\), and since \(w=yz\), \((p, w, \epsilon) \overset{*}{\succ} (q, \epsilon, \epsilon)\).


    case 2:   \(A_{pq} \Rightarrow aA_{rs}b \Rightarrow \dots \Rightarrow w\), we can write \(w = ayb\)

    \(A_{pq} \Rightarrow \dots \Rightarrow y\) at most \(k\) steps. From IH, PDA has computation \((r, y, \epsilon) \overset{*}{\succ} (s, \epsilon, \epsilon)\). Grammar contains production \(A_{pq} \Rightarrow aA_{rs}b\) and PDA contains transitions:

    • \(p \overset{a,\epsilon \rightarrow t}{\longrightarrow} t\)     =>   \((p, ayb, \epsilon) \succ (r, yb, t)\)
    • \(s \overset{b,t \rightarrow \epsilon}{\longrightarrow} q\)     =>   \((s, b, t) \succ (q, \epsilon, \epsilon)\)

    We know:   \((r, y, \epsilon) \overset{*}{\succ} (s, \epsilon, \epsilon)\)     =>     \((r, yb, t) \overset{*}{\succ} (s,b,t)\)

    We also know:   \((p, ayb, \epsilon) \succ (r, yb, t)\)   and   \((s, b, t) \succ (q, \epsilon, \epsilon)\)

    Therefore:   \((p, ayb, \epsilon) \succ (r, yb, t) \overset{*}{\succ} (s, b, t) \succ (q, \epsilon, \epsilon)\)

    Since \(w=ayb\),   \((p, w, \epsilon) \overset{*}{\succ} (q, \epsilon, \epsilon)\).


  2. \(L(G) \supseteq L(M)\)

    The proof for this direction works the same way as the previous inductive proof.

Therefore \(L(G) = L(M)\).