Deterministic Regular Expressions with Back-References

(Dominik D. Freydenberger, Markus L. Schmid)

Supported by the project Taming Extended Regular Expressions

Abstract

Most modern libraries for regular expression matching allow back-references (i.e., repetition operators) that substantially increase expressive power, but also lead to intractability. In order to find a better balance between expressiveness and tractability, we combine these with the notion of determinism for regular expressions used in XML DTDs and XML Schema. This includes the definition of a suitable automaton model, and a generalization of the Glushkov construction.

Versions of the Paper