Web annotation

Web annotation can refer to online annotations of web resources such as web pages or parts of them, or to a set of W3C standards developed for this purpose. The term can also refer to the creation of annotations on the World Wide Web, and it has been used in this sense for the annotation tool INCEpTION, formerly WebAnno. Creating annotations over the web is a general feature of several annotation tools in natural language processing and in the philologies.

Annotation of web resources
With a web annotation system, a user can add, modify or remove information from a Web resource without modifying the resource itself. The annotations can be thought of as a layer on top of the existing resource, and this annotation layer is usually visible to other users who share the same annotation system. In such cases, the web annotation tool is a type of social software tool. For Web-based text annotation systems, see Text annotation.

Web annotation can be used for the following purposes:


 * to rate a Web resource, for example by its usefulness, user-friendliness, or suitability for viewing by minors.
 * to improve or adapt its contents by adding/removing material (like wiki).
 * as a collaborative tool, e.g. to discuss the contents of a certain resource.
 * as a medium of artistic or social criticism, by allowing Web users to reinterpret, enrich or protest against institutions or ideas that appear on the Web.
 * to quantify transient relationships between information fragments.
 * to save, retain and synthesize selected information.

Annotations can be considered an additional layer with respect to comments. Comments are published by the same publisher who hosts the original document. Annotations are added on top of that, but may eventually become comments which, in turn, may be integrated into a later version of the document itself.

Web Annotation standard
In the Web Annotation standard, "[a]n annotation is considered to be a set of connected resources, typically including a body and target, and conveys that the body is related to the target. The exact nature of this relationship changes according to the intention of the annotation, but the body is most frequently somehow "about" the target. (...) The (...) model supports additional functionality, enabling content to be embedded within the annotation, selecting arbitrary segments of resources, choosing the appropriate representation of a resource and providing styling hints to help clients render the annotation appropriately."

- Robert Sanderson, Paolo Ciccarese, Benjamin Young (eds.)

The basic data structures of Web Annotation (Fig. 1) are
 * target (the element being annotated, e.g., a web document or a part of it),
 * body (the content of the annotation, e.g., a string value), and
 * annotation (the element that serves to relate body and target of an annotation).



The body can be a literal value or structured content (a URI). The target can be identified by a URI (e.g., with a fragment identifier) and/or by a selector that defines a domain-, resource- or application-specific access protocol, e.g., offset-based or XPath-based.
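
The body, target and selector structure described above can be illustrated with a minimal sketch. It builds an annotation as a plain Python dictionary that follows the JSON-LD serialization of the Web Annotation Data Model, using a TextualBody and an offset-based TextPositionSelector. The example.org URIs, the sample text and the helper names make_annotation and select_text are assumptions made for illustration only; they are not part of the standard.

```python
import json

def make_annotation(anno_id, body_text, source, start, end):
    """Build a minimal annotation dict: a literal body related to a text span."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        # Body: the content of the annotation, here a literal string value.
        "body": {"type": "TextualBody", "value": body_text, "format": "text/plain"},
        # Target: the annotated resource plus an offset-based selector.
        "target": {
            "source": source,
            "selector": {"type": "TextPositionSelector", "start": start, "end": end},
        },
    }

def select_text(annotation, document_text):
    """Resolve the offset-based selector against the text of the target."""
    selector = annotation["target"]["selector"]
    # The start offset is inclusive, the end offset is exclusive.
    return document_text[selector["start"]:selector["end"]]

# Hypothetical page content and URIs, for illustration only.
page_text = "Annotations form a layer on top of the resource."
start = page_text.index("layer")
annotation = make_annotation(
    "http://example.org/anno1",
    "Key metaphor of the article.",
    "http://example.org/page1",
    start,
    start + len("layer"),
)

print(json.dumps(annotation, indent=2))
print(select_text(annotation, page_text))
```

Note that the annotation is stored separately from the page text, which is never modified; only the offsets in the selector tie the two together.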

Web Annotation was standardized on February 23, 2017 with the release of three official Recommendations by the W3C Web Annotation Working Group:


 * Web Annotation Data Model
 * Web Annotation Vocabulary
 * Web Annotation Protocol

These recommendations were accompanied by additional working group notes that describe their application:


 * Embedding Web Annotations in HTML
 * Selectors and States

The Web Annotation data model is also provided in machine-readable form as the Web Annotation ontology. Note that this ontology defines the Web Annotation namespace, and that this namespace is conventionally abbreviated as oa. This is the abbreviation for Open Annotation, a W3C Community Group whose specifications formed the basis for the Web Annotation standard.
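
As a rough illustration of how a namespace abbreviation works in practice, the following sketch expands prefixed names into full IRIs and prints the core statements of one annotation in an N-Triples-like form. It assumes the prefix oa bound to the namespace http://www.w3.org/ns/oa# and uses the vocabulary terms oa:Annotation, oa:hasBody and oa:hasTarget. The helper names expand and annotation_triples and the example.org URIs are hypothetical.

```python
# Assumed prefix binding for the Web Annotation namespace.
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
PREFIXES = {"oa": OA}

def expand(prefixed_name):
    """Expand a prefixed name such as 'oa:hasBody' into a full IRI."""
    prefix, _, local_name = prefixed_name.partition(":")
    return PREFIXES[prefix] + local_name

def annotation_triples(annotation_iri, body_iri, target_iri):
    """Return the three core statements that relate a body to a target."""
    return [
        (annotation_iri, RDF_TYPE, expand("oa:Annotation")),
        (annotation_iri, expand("oa:hasBody"), body_iri),
        (annotation_iri, expand("oa:hasTarget"), target_iri),
    ]

# Hypothetical resources, for illustration only.
triples = annotation_triples(
    "http://example.org/anno1",
    "http://example.org/comment1",
    "http://example.org/page1",
)
for subject, predicate, obj in triples:
    print(f"<{subject}> <{predicate}> <{obj}> .")
```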

Web Annotation supersedes other standardization initiatives for annotations on the web within the W3C, e.g., the earlier Annotea project, which was discontinued after 2003.

Related specifications
Web Annotation can be used in conjunction with (or as an alternative to) fragment identifiers that describe how to address elements within a web document by means of URIs. These include


 * RFC 5147 (URI fragment identifiers for the text/plain media type)
 * RFC 7111 (URI fragment identifiers for the text/csv media type)
 * RFC 8118 (URI fragment identifiers for the application/pdf media type)
 * SVG fragment identifiers
 * XPointer (for addressing components of XML documents)
 * Media Fragments (for addressing components of media files)

Other, non-standardized fragment identifiers are also in use, e.g., within the NLP Interchange Format.
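
To make the idea of addressing part of a document by URI concrete, here is a minimal sketch that resolves a fragment identifier in the style of RFC 5147 against a plain-text string. It handles only the range forms char=start,end and line=start,end, with positions counted from zero between characters or lines, and it ignores any integrity checks that follow a semicolon. The function name resolve_text_fragment, the sample text and the example.org URI are assumptions for illustration; this is not a complete implementation of the RFC.

```python
import re
from urllib.parse import urlsplit

# Range forms only: "char=5,9", "line=1,2", with either bound optional.
_RANGE = re.compile(r"^(char|line)=(\d*),(\d*)$")

def resolve_text_fragment(uri, text):
    """Return the part of `text` addressed by a char or line range fragment."""
    fragment = urlsplit(uri).fragment
    # Drop optional integrity checks (e.g. ";length=...") in this sketch.
    scheme_part = fragment.split(";", 1)[0]
    match = _RANGE.match(scheme_part)
    if match is None:
        raise ValueError(f"unsupported fragment identifier: {fragment!r}")
    unit, start_text, end_text = match.groups()
    # Positions fall between characters, or between lines for the line scheme.
    units = text if unit == "char" else text.splitlines(keepends=True)
    start = int(start_text) if start_text else 0
    end = int(end_text) if end_text else len(units)
    selected = units[start:end]
    return selected if unit == "char" else "".join(selected)

# Hypothetical document content, for illustration only.
sample = "line zero\nline one\nline two\n"
print(repr(resolve_text_fragment("http://example.org/doc.txt#char=5,9", sample)))
print(repr(resolve_text_fragment("http://example.org/doc.txt#line=1,2", sample)))
```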

Independently of Web Annotation, more specialized data models for representing annotations on the web have been developed, e.g., the NLP Interchange Format (NIF) for applications in language technology. In early 2020, the W3C Community Group "Linked Data for Language Technology" launched an initiative to harmonize these vocabularies and to develop a consolidated RDF vocabulary for linguistic annotations on the web.

Comparison of web annotation systems
Many of these systems require software to be installed to enable some or all of the features below. This requirement is noted in footnotes only when the required software is additional software provided by a third party.