In
formal grammar theory,
mildly context-sensitive languages are a class of
formal languages which can be efficiently parsed, but still possess enough
context sensitivity to allow the
parsing of
natural languages. The concept was first introduced by
Aravind Joshi in
1985.
Definition
The formal conditions imposed on the class are:
- The languages must be parsable in polynomial time.
- The language must have constant growth; this means that the distribution of string lengths should be linear rather than supralinear. This is often guaranteed by proving a pumping lemma for some class of mildly context-sensitive languages.
- The language should admit limited cross-serial dependencies, allowing grammatical agreement to be enforced between two arbitrarily long subphrases; this condition is not satisfied by context-free grammars. This is formally enforced by requiring that the language consisting of strings concatenated with themselves belong to the class of mildly context-sensitive languages.
Formalisms
Some attempts at creating mildly context-sensitive language formalisms include:
The first two of those grammar classes define the same set of languages, while the rest four define another single, strictly smaller class. The larger of those two classes may be parsed by thread automatons, while the other smaller one may be parsed by embedded pushdown automatons.
While all languages in both classes are mildly context-senstive and both classes support some cross-serial dependency, neither class exhausts the full set of mildly context-sensitive languages.
Control Language Hierarchy
A more precisely defined hierarchy of languages that correspond to the mildly context-sensitive class was defined by David J. Weir. Based on the work of Nabil A. Khabbaz, Weir's Control Language Hierarchy is a containment hierarchy of countable set of language classes where the
Level-1 is defined as context-free, and
Level-2 is the class of tree-adjoining and the other three grammars.
Following are some of the properties of Level-k languages in the hierarchy:
- Level-k languages are properly contained by Level-(k + 1) language class
- Level-k languages can be parsed in time
- Level-k contains the language , but not
- Level-k contains the language , but not
Those properties correspond well (at least for small k > 1) to the conditions of mildly context-sensitive languages imposed by Joshi, and as k gets bigger, the language class become, in a sense, less mildly context-sensitive.
See also
Further reading