W3C MMI

The Multimodal Interaction Activity is a W3C initiative that aims to provide means, mostly XML-based markup languages, for supporting multimodal interaction scenarios on the Web.

The activity was launched in 2002. The Multimodal Interaction Working Group has already produced:
 * the Multimodal Interaction Framework, which describes a general framework for multimodal interaction and the kinds of markup languages being considered.
 * A set of use cases.
 * A set of core requirements, describing the fundamental requirements that future specifications must address.

The devices considered include mobile phones, automotive telematics systems, and PCs connected to the Web.

Current work
The following XML specifications, currently in advanced Working Draft state, already address various parts of the core requirements:
 * EMMA (Extensible MultiModal Annotation): a data exchange format for the interface between input processors and interaction management systems. It defines the means for recognizers to annotate application-specific data with information such as confidence scores, time stamps, input mode (e.g. keystrokes, speech, or pen), alternative recognition hypotheses, and partial recognition results.
 * InkML (Ink Markup Language): an XML data exchange format for digital ink traces entered with an electronic pen or stylus as part of a multimodal system.
 * Multimodal architecture: A loosely coupled architecture for the multimodal interaction framework that focuses on providing a general means for components to communicate with each other, plus basic infrastructure for application control and platform services.
 * Emotion Markup Language (EmotionML): a markup language providing representations of emotions and related states for technological applications.
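To make the EMMA bullet above concrete, the following is an illustrative sketch in the spirit of the EMMA 1.0 drafts: a speech recognizer wraps its application-specific result (the origin and destination of a flight query, chosen here purely as an example) in an emma:interpretation element carrying annotations such as mode, confidence, and the recognized token string.

```xml
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- One recognition hypothesis, annotated by the input processor -->
  <emma:interpretation id="interp1"
      emma:medium="acoustic"
      emma:mode="voice"
      emma:confidence="0.75"
      emma:tokens="flights from boston to denver">
    <!-- Application-specific payload (hypothetical flight-query vocabulary) -->
    <origin>Boston</origin>
    <destination>Denver</destination>
  </emma:interpretation>
</emma:emma>
```

Alternative recognition hypotheses would be expressed by grouping several such interpretations, each with its own confidence score, inside an emma:one-of container.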
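An InkML document, as described in the bullet list above, records digitizer traces as sequences of coordinate points. A minimal sketch (coordinate values are invented for illustration):

```xml
<ink xmlns="http://www.w3.org/2003/InkML">
  <!-- One pen stroke: successive (x y) points separated by commas -->
  <trace>
    10 0, 9 14, 8 28, 7 42, 6 56, 6 70
  </trace>
</ink>
```

Richer documents add capture-device metadata and additional channels (e.g. pen pressure) alongside the x and y coordinates.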
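For EmotionML, a hedged sketch of the intended representation style, based on the EmotionML drafts: an emotion is annotated by naming a category from a declared emotion vocabulary, optionally with an intensity value (the vocabulary URI and values here are illustrative assumptions).

```xml
<emotionml version="1.0"
    xmlns="http://www.w3.org/2009/10/emotionml"
    category-set="http://www.w3.org/TR/emotion-voc/xml#big6">
  <emotion>
    <!-- Category drawn from the declared vocabulary; value is a 0..1 scale -->
    <category name="happiness" value="0.8"/>
  </emotion>
</emotionml>
```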

Useful Links

 * Multimodal Interaction Activity on W3C site
 * The W3C Multimodal Architecture, Part 1: Overview and challenges on IBM DeveloperWorks