DMS Software Reengineering Toolkit

The DMS Software Reengineering Toolkit is a proprietary set of program transformation tools available for automating custom source program analysis, modification, translation or generation of software systems for arbitrary mixtures of source languages for large scale software systems. DMS was originally motivated by a theory for maintaining designs of software called Design Maintenance Systems. DMS and "Design Maintenance System" are registered trademarks of Semantic Designs.

Usage
DMS has been used to implement domain-specific languages (such as code generation for factory control), test coverage and profiling tools, clone detection, language migration tools, C++ component reengineering., and for research into difficult topics such as refactoring C++ reliably.

Features
The toolkit provides means for defining language grammars and will produce parsers which automatically construct abstract syntax trees (ASTs), and prettyprinters to convert original or modified ASTs back into compilable source text. The parse trees capture, and the prettyprinters regenerate, complete detail about the original source program, including source position, comments, radix and format of numbers, etc., to ensure that regenerated source text is as recognizable to a programmer as the original text modulo any applied transformations.

DMS uses GLR parsing technology with semantic predicates. This enables it to handle all context-free grammars as well as most non-context-free language syntaxes, such as Fortran, which requires matching of multiple DO loops with shared CONTINUE statements by label to produce ASTs for correctly nested loops as it parses. DMS has a variety of predefined language front ends, covering most real dialects of C and C++ including C++0x, C#, Java, Python, PHP, EGL, Fortran, COBOL, Visual Basic, Verilog, VHDL and some 20 or more other languages. DMS can handle ASCII, ISO-8859, UTF-8, UTF-16, EBCDIC, Shift-JIS and a variety of Microsoft character encodings.

DMS provides attribute grammar evaluators for computing custom analyses over ASTs, such as metrics, and includes support for symbol table construction. Other program facts can be extracted by built-in control- and data- flow analysis engines, local and global pointer analysis, whole-program call graph extraction, and symbolic range analysis by abstract interpretation.

DMS is implemented in a parallel programming language, PARLANSE, which allows using symmetric multiprocessing to speed up large analyses and conversions.

Rewriting
Changes to ASTs can be accomplished by both procedural methods coded in PARLANSE and source-to-source tree transformations coded as rewrite rules using surface-syntax conditioned by any extracted program facts, using DMS's Rule Specification Language (RSL). The rewrite rule engine supporting RSL handles associative and commutative rules. A rewrite rule for C to replace a complex condition by the  operator be written as: Rewrite rules have names, e.g. simplify_conditional_assignment. Each rule has a "match this" and "replace by that" pattern pair separated by ->, in our example, on separate lines for readability. The patterns must correspond to language syntax categories; in this case, both patterns must be of syntax category statement also separated in sympathy with the patterns by ->. Target language (e.g., C) surface syntax is coded inside meta-quotes ", to separate rewrite-rule syntax from that of the target language. Backslashes inside meta-quotes represent domain escapes, to indicate pattern meta variables (e.g., \v, \e1, \e2) that match any language construct corresponding to the metavariable declaration in the signature line, e.g., e1 must be of syntactic category: (any) expression. If a metavariable is mentioned multiple times in the match pattern, it must match to identical subtrees; the same identically shaped v must occur in both assignments in the match pattern in this example. Metavariables in the replace pattern are replaced by the corresponding matches from the left side. A conditional clause if provides an additional condition that must be met for the rule to apply, e.g., that the matched metavariable v, being an arbitrary left-hand side, must not have a side effect (e.g., cannot be of the form of a[i++]; the no_side_effects predicate is defined by an analyzer built with other DMS mechanisms).

Achieving a complex transformation on code is accomplished by providing a number of rules that cooperate to achieve the desired effect. The ruleset is focused on portions of the program by metaprograms coded in PARLANSE.

A complete example of a language definition and source-to-source transformation rules defined and applied is shown using high school algebra and a bit of calculus as a domain-specific language.