DCFG

class neologism.DCFG[source]

A class that represents dynamically modifiable context-free grammar.

The API uses str for symbols and Rule for rules.

The grammar object. Build one from rules directly, or load one from a bison yacc file via from_yacc_file(). Mutate it freely; the sentences view always reflects the current state.

Construction

classmethod from_yacc_file(file_path: str | PathLike[str], bison_path: str | None = None) DCFG[source]

Construct a DCFG by loading rules from a bison-style yacc file.

Bison-specific cleanup is applied: the $end terminal is removed and start_symbol is set to $accept.

Note

This function needs bison to be installed and on PATH (or located via bison_path).

Parameters:
  • file_path (str | os.PathLike[str]) – The path of the yacc file.

  • bison_path (str | None) – Optional PATH override used to locate bison.

Raises:
copy() DCFG[source]
Returns:

A fully independent copy of the grammar.

Return type:

DCFG

Note

Mutations on the returned DCFG – including future changes to node or edge attributes on the underlying graph – never affect the original.

load_dcfg(dcfg: DCFG) None[source]

Loads rules from another DCFG.

Parameters:

dcfg (DCFG) – The other DCFG instance to load the rules from.

Rules

property rules: Set[Rule]
Returns:

The rules, that define the grammar.

Return type:

set[Rule]

>>> dcfg = DCFG()
>>> dcfg.add_rule(Rule("a", ("b", "c", "d")))
>>> dcfg.add_rule(Rule("c", ("x", "y", "z")))
>>> sorted(dcfg.rules, key=repr)
[Rule('a' => 'b' 'c' 'd'), Rule('c' => 'x' 'y' 'z')]
rules_containing(symbol: str) Set[Rule][source]

Can be used to filter rules, that contain the given symbol (terminal or nonterminal).

Parameters:

symbol (str) – The symbol, which the rules are filtered by.

Returns:

The rules, that define the grammar and contain the given symbol.

Return type:

set[Rule]

See also

rules

>>> dcfg = DCFG()
>>> dcfg.add_rule(Rule("a", ("b", "c", "d")))
>>> dcfg.add_rule(Rule("c", ("x", "y", "z")))
>>> sorted(dcfg.rules_containing("d"), key=repr)
[Rule('a' => 'b' 'c' 'd')]
>>> sorted(dcfg.rules_containing("c"), key=repr)
[Rule('a' => 'b' 'c' 'd'), Rule('c' => 'x' 'y' 'z')]
add_rule(rule: Rule) None[source]

Add a new rule to the grammar.

Note

The symbols in the rule will be automatically added if they are not already in the grammar.

Parameters:

rule (Rule) – The new rule to add.

>>> dcfg = DCFG()
>>> dcfg.add_rule(Rule("a", ("b", "c", "d")))
>>> dcfg.add_rule(Rule("c", ("x", "y", "z")))
>>> sorted(dcfg.rules, key=repr)
[Rule('a' => 'b' 'c' 'd'), Rule('c' => 'x' 'y' 'z')]
remove_rule(rule: Rule) None[source]

Remove a rule from the grammar.

Note

If there are symbols that are not used by any rule after the removal, they are removed.

Parameters:

rule (Rule) – The rule to remove.

Raises:

ValueError – If rule is not in the rules of the grammar.

>>> dcfg = DCFG()
>>> dcfg.add_rule(Rule("a", ("b", "c", "d")))
>>> dcfg.add_rule(Rule("c", ("x", "y", "z")))
>>> sorted(dcfg.rules, key=repr)
[Rule('a' => 'b' 'c' 'd'), Rule('c' => 'x' 'y' 'z')]
>>> sorted(dcfg.symbols)
['a', 'b', 'c', 'd', 'x', 'y', 'z']
>>> dcfg.remove_rule(Rule("c", ("x", "y", "z")))
>>> sorted(dcfg.rules, key=repr)
[Rule('a' => 'b' 'c' 'd')]
>>> sorted(dcfg.symbols)
['a', 'b', 'c', 'd']

Symbols

property symbols: Set[str]
Getter:

The symbols that are defined in the grammar.

Type:

set[str]

>>> dcfg = DCFG()
>>> dcfg.add_rule(Rule("a", ("b", "c", "d")))
>>> dcfg.add_rule(Rule("c", ("x", "y", "z")))
>>> sorted(dcfg.symbols)
['a', 'b', 'c', 'd', 'x', 'y', 'z']
property terminals: Set[str]
Getter:

The terminals that are defined in the grammar.

Type:

set[str]

Note

It is possible for a terminal to not be used by any of the rules.

>>> dcfg = DCFG()
>>> dcfg.add_rule(Rule("a", ("b", "c", "d")))
>>> dcfg.add_rule(Rule("c", ("x", "y", "z")))
>>> dcfg.add_rule(Rule("c", ("1", "2", "3")))
>>> sorted(dcfg.terminals)
['1', '2', '3', 'b', 'd', 'x', 'y', 'z']
property nonterminals: Set[str]
Getter:

The nonterminals that are defined in the grammar.

Type:

set[str]

Note

It is possible for a nonterminal to not be used by any of the rules.

>>> dcfg = DCFG()
>>> dcfg.add_rule(Rule("a", ("b", "c", "d")))
>>> dcfg.add_rule(Rule("c", ("x", "y", "z")))
>>> dcfg.add_rule(Rule("c", ("1", "2", "3")))
>>> sorted(dcfg.nonterminals)
['a', 'c']
is_symbol_terminal(symbol: str) bool[source]

Checks whether a symbol is terminal in the grammar.

Parameters:

symbol (str) – The symbol to check.

Raises:

ValueError – If symbol was not in the grammar.

>>> dcfg = DCFG()
>>> dcfg.add_rule(Rule("a", ("b", "c", "d")))
>>> dcfg.add_rule(Rule("c", ("x", "y", "z")))
>>> sorted(dcfg.terminals)
['b', 'd', 'x', 'y', 'z']
>>> sorted(dcfg.nonterminals)
['a', 'c']
>>> dcfg.is_symbol_terminal("a")
False
>>> dcfg.is_symbol_terminal("b")
True
make_symbol_terminal(symbol: str) None[source]

Make a symbol terminal in the grammar.

Parameters:

symbol (str) – The symbol to be made terminal.

Raises:

ValueError – If symbol was not in the grammar.

>>> dcfg = DCFG()
>>> dcfg.add_rule(Rule("a", ("b", "c", "d")))
>>> dcfg.add_rule(Rule("c", ("x", "y", "z")))
>>> sorted(dcfg.rules, key=repr)
[Rule('a' => 'b' 'c' 'd'), Rule('c' => 'x' 'y' 'z')]
>>> sorted(dcfg.symbols)
['a', 'b', 'c', 'd', 'x', 'y', 'z']
>>> dcfg.make_symbol_terminal("c")
>>> sorted(dcfg.rules, key=repr)
[Rule('a' => 'b' 'c' 'd')]
>>> sorted(dcfg.symbols)
['a', 'b', 'c', 'd']
remove_symbol(symbol: str) None[source]

Removes a symbol from the grammar.

Parameters:

symbol (str) – Symbol to remove.

Raises:

ValueError – If symbol was not in the grammar.

Note

If symbol was used by a rule in the grammar, it will be removed from the rule, too.

Note

If symbol was a nonterminal, the rules which have symbol as their lhs are also removed.

Note

If there are symbols that are not used by any rule after the removal, they are removed.

See also

symbols

>>> dcfg = DCFG()
>>> dcfg.add_rule(Rule("a", ("b", "c", "d")))
>>> sorted(dcfg.rules, key=repr)
[Rule('a' => 'b' 'c' 'd')]
>>> sorted(dcfg.symbols)
['a', 'b', 'c', 'd']
>>> dcfg.remove_symbol("c")
>>> sorted(dcfg.rules, key=repr)
[Rule('a' => 'b' 'd')]
>>> sorted(dcfg.symbols)
['a', 'b', 'd']

Start symbol

property start_symbol: str
Getter:

Returns the start symbol.

Setter:

Sets the start symbol.

Type:

str

Note

If it is not explicitly set, the first added rule’s lhs is used.

Raises:

ValueError – (setter only) If the new start symbol is not in the grammar.

>>> dcfg = DCFG()
>>> dcfg.add_rule(Rule("a", ("b",)))
>>> dcfg.add_rule(Rule("c", ("d",)))
>>> dcfg.start_symbol
'a'
>>> dcfg.start_symbol = "c"
>>> dcfg.start_symbol
'c'

Sentence enumeration

property sentences: Set[Tuple[str, ...]]
Getter:

Returns all possible sentences of the grammar, starting with start_symbol.

Type:

set[tuple]

Note

If the grammar is not finite, sentences are generated from a copy of the grammar, which has been made finite.

>>> dcfg = DCFG()
>>> dcfg.add_rule(Rule("a", ("b", "c", "d")))
>>> dcfg.add_rule(Rule("c", ("x", "y", "z")))
>>> dcfg.add_rule(Rule("c", ("1", "2", "3")))
>>> sorted(dcfg.sentences)
[('b', '1', '2', '3', 'd'), ('b', 'x', 'y', 'z', 'd')]
iter_sentences() Iterator[Tuple[str, ...]][source]

Yields all possible sentences of the grammar one at a time, starting with start_symbol.

Note

If the grammar is not finite, sentences are generated from a copy of the grammar, which has been made finite.

Note

Unlike sentences, the stream is not deduplicated. Ambiguous grammars may yield the same sentence more than once. Wrap in set if uniqueness matters.

>>> dcfg = DCFG()
>>> dcfg.add_rule(Rule("a", ("b", "c", "d")))
>>> dcfg.add_rule(Rule("c", ("x", "y", "z")))
>>> dcfg.add_rule(Rule("c", ("1", "2", "3")))
>>> sorted(dcfg.iter_sentences())
[('b', '1', '2', '3', 'd'), ('b', 'x', 'y', 'z', 'd')]
is_finite() bool[source]
Returns:

Returns whether there is at least one loop in the resolutions of the nonterminals.

Return type:

bool