# DNA Expressions – A Formal Notation for DNA

## Rudy van Vliet

Date: *10 December, 2015, 12:30*

#### Summary

We describe a formal notation for DNA molecules that may contain nicks and gaps. The resulting DNA expressions denote formal DNA molecules. Different DNA expressions may denote the same molecule; such DNA expressions are called equivalent. We examine which DNA expressions are minimal, meaning that they have the shortest length among all equivalent DNA expressions. Among other things, we describe how to construct a minimal DNA expression for a given molecule. We also present an efficient, recursive algorithm to rewrite a given DNA expression into an equivalent, minimal DNA expression.

For many formal DNA molecules, there exists more than one minimal DNA expression. We define a minimal normal form, i.e., a set of properties such that for each formal DNA molecule, there is exactly one (minimal) DNA expression with these properties. Finally, we describe an efficient, two-step algorithm to rewrite an arbitrary DNA expression into this normal form.