Skip to content

Context-Free Languages

Summarized notes from Introduction to Theory of Computation, Chapter 2

  • describable equivalently by context-free grammars and pushdown automata
    • grammar describes how to generate the language, PDA how to recognize it
  • can describe languages with recursive structures => particularly useful in parsers
  • also note: every regular language is context free

Context Free Grammars (CFG)

4-tuple (V,Σ,R,S)

V variables (finite set)
Σ terminals (finite set, disjoint from V)
R rules (finite set): variable and a string of variables and terminals
SV start variable (by convention left of topmost rule)
  • language of the grammar is {wΣ|Sw}
  • same information can be represented pictorially using a parse tree

Ambiguity

  • grammar does not capture precedence relations, e.g. * and +
  • leftmost derivation = at every step the leftmost remaining variable is the one replaced
  • for string w and context-free grammar G:

    • w is derived ambiguously in G if it has two or more different leftmost derivations
    • G is ambiguous if it generates some string ambiguously
  • sometimes resolvable if unambiguous version of the same grammar exists

  • some languages are inherently ambiguous, e.g. {aibjck|i=j or j=k}

Chomsky normal form

Any context-free language is generated by a context-free grammar in Chomsky normal form. Context-free grammar is in Chomsky normal form if every rule is of the form:

ABCAa

where a is any terminal and A, B, and C are any variables—except that B and C may not be the start variable. In addition, we permit the rule Sϵ, where S is the start variable.

Pushdown automaton (PDA)

  • FA + an unlimited stack: can push and pop symbols on a stack
  • the stack enables recognizing some non-regular languages, e.g. {0n1n|N0}
  • context-free grammars and pushdown automata are equivalent in power => always possible to transform one to the other

Definition

6-tuple (Q,Σ,Γ,δ,q0,F) (nondeterministic)

Q finite set of states
Σ finite input alphabet
Γ finite stack alphabet (may differ from Σ)
δ:Q×Σϵ×ΓϵP(Q×Γϵ) transition function
q0Q start state
FQ finite set of accept states
  • nondeterministic transition may have several legal next moves
  • in a state diagram label "a,bc" means: read a from input, pop b, push c
  • at each transition any processed symbol may be ϵ
  • special symbol e.g. $ maybe pushed to the stack initially, to know when stack is empty

Pumping lemma for context free languages

If A is a context-free language, then there is a number p (the pumping length) where, if s is any string in A of length at least p, then s may be divided into five pieces s=uvxyz satisfying the conditions:

  1. for each i0,uvixyizA
  2. |vy|>0, and
  3. |vxy|p

Deterministic context-free languages (DCFLs)

  • deterministic and nondeterministic PDAs are not equivalent in power: DPDANPDA
  • languages that are recognizable by deterministic pushdown automata (DPDAs) are called deterministic context-free languages (DCFLs)
  • automaton cannot make choices about how it proceeds because only a single possibility is available at every point
  • the parsing problem is generally easier for DCFLs than for CFLs
  • deterministic context-free grammars are equal to DPDAs, provided that we restrict to endmarked languages, where all strings include a terminating symbol (every string has a forced handle) => one can always be converted to the other

Definition

6-tuple (Q,Σ,Γ,δ,q0,F) (deterministic)

Q finite set of states
Σ finite input alphabet
Γ finite stack alphabet
δ:Q×Σϵ×Γϵ(Q×Γϵ){} transition function
q0Q start state
FQ finite set of accept states

The transition function δ must satisfy the following condition: for every qQ,aΣ, and xΓ, exactly one of the values: δ(q,a,x),δ(q,a,ϵ),δ(q,ϵ,x), and δ(q,ϵ,ϵ) is not .

ϵ-moves in the DPDA’s transition function are allowed (unlike in DFAs)

  • may read an input symbol without popping a stack symbol (ϵ-input move), and vice versa (ϵ-stack move)
  • ϵ-input move and ϵ-stack move maybe combined: δ(q,ϵ,ϵ)

Acceptable moves based on stack state:

  • When stack is nonempty, DPDA has exactly one legal move in every situation
  • If the stack is empty, a DPDA can move only if the transition function specifies a move that pops ϵ
  • Otherwise the DPDA has no legal move and it rejects without reading the rest of the input