Korpusomat

Korpusomat is a tool for creating and searching electronic language corpora, developed at the Institute of Computer Science of the Polish Academy of Sciences.

Korpusomat is a fourth-generation corpus tool. It is a web application, which eliminates the need to store data sets on the user's own computer. A corpus is created either by adding text files from the local drive (in any language and format) or by specifying websites from which texts are to be downloaded. The corpus is then annotated automatically on several levels: morphosyntactic information, named entity recognition (e.g. geographical names or people), and partial syntactic information (which also allows dependency trees to be visualized). The finished corpus can be edited, shared with other users, and searched. A number of functions also provide statistical summaries of the collected texts.
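
The annotation levels described above can be pictured with a small Python sketch. It is only an illustration and does not reflect Korpusomat's actual data model or interface: the `Token` class, its field names, the tag labels, and the two helper functions are all hypothetical, and the sentence is annotated by hand.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Token:
    """One token with hypothetical fields for each annotation level."""
    form: str               # surface form as it appears in the text
    lemma: str              # morphosyntactic level: base form
    pos: str                # morphosyntactic level: part-of-speech tag
    entity: Optional[str]   # named-entity level: label, or None if not an entity
    head: int               # syntactic level: 1-based index of the head token, 0 for the root


def lemma_frequencies(tokens: List[Token], pos: Optional[str] = None) -> Counter:
    """Count lemmas, optionally restricted to a single part of speech."""
    return Counter(t.lemma for t in tokens if pos is None or t.pos == pos)


def entities(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Return (form, label) pairs for tokens that carry a named-entity label."""
    return [(t.form, t.entity) for t in tokens if t.entity is not None]


# Hand-annotated toy sentence: "Anna visited Warsaw"
sentence = [
    Token("Anna", "Anna", "PROPN", "person", 2),
    Token("visited", "visit", "VERB", None, 0),
    Token("Warsaw", "Warsaw", "PROPN", "place", 2),
]

print(lemma_frequencies(sentence, pos="PROPN"))
print(entities(sentence))
```

In this sketch, a statistical summary such as a lemma frequency list is a simple count over one annotation level, and the `head` field is the kind of information from which a dependency tree could be drawn.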