User:Danpaulsmith/SARES

What is SARES?
SARES is a web-based application that attempts to automatically "tag" images with semantic keywords.

Is there anything special about the tags?
Yep! Each tag SARES generates is a link to an "online resource" for that tag, which holds various information about it. For example, the resource "London" contains a wealth of information such as location, population size and historical facts.

What is an online resource?
The online resource database SARES uses is DBpedia (the blue circle which most of the arrows are pointing towards). It is like another version of Wikipedia, but converted into a special type of data (RDF) that makes it much more flexible, accessible and, most importantly, machine-readable. If you visit the "London" resource link above, you will see that DBpedia lets you view the human-friendly HTML version of the resource in your browser as a two-column table of properties and their values. At the bottom of the page you will see links to different machine-readable formats of the resource. These formats allow machines to link information together, a bit like turning the Internet into a database itself. Resources are not just HTML pages (IMDB, Wikipedia, Yahoo Answers); they are knowledge representations of things. This means SARES is tagging images with "concepts" of things, not just plain text or a link to another web page.
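
As a rough illustration of the difference between the human-friendly and machine-readable views, the sketch below builds the addresses for a single resource. The URL patterns (`/resource/`, `/page/` and `/data/` with a format extension) follow how DBpedia has commonly exposed its resources, but they are an assumption here and not something SARES defines; the function name `resource_urls` is hypothetical.

```python
from urllib.parse import quote

DBPEDIA = "http://dbpedia.org"  # assumed base address of the public DBpedia site


def resource_urls(label):
    """Return the concept URI plus assumed HTML and data URLs for a label."""
    # DBpedia resource names use underscores where Wikipedia titles have spaces.
    name = quote(label.replace(" ", "_"))
    return {
        "concept": f"{DBPEDIA}/resource/{name}",   # identifies the thing itself
        "html": f"{DBPEDIA}/page/{name}",          # table view for people
        "rdf_xml": f"{DBPEDIA}/data/{name}.rdf",   # machine-readable RDF/XML
        "json": f"{DBPEDIA}/data/{name}.json",     # machine-readable JSON
    }


if __name__ == "__main__":
    for kind, url in resource_urls("London").items():
        print(kind, url)
```

The point of the sketch is that a tag is the "concept" address, while the HTML and data addresses are just different views of that same concept.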

DBpedia is one circle in the Linking Open Data (LOD) cloud. There are resource databases for music, books, television programmes, medical information and so on, all of which are openly accessible. The LOD cloud was born around 2007 and is expected to keep growing rapidly, perhaps eventually leading us into the Semantic Web.

How does it work?
To collect data, SARES scans the BBC news feed twice a day for new stories. When it finds a new story, it extracts the main image and the story text, from which it then extracts keywords.
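
The scanning step can be sketched roughly as follows. This is a minimal illustration and not the SARES implementation: the sample feed, the `new_stories` function and the use of the feed's `description` and `media:thumbnail` elements as stand-ins for the story text and main image are all assumptions.

```python
import xml.etree.ElementTree as ET

MEDIA_NS = "{http://search.yahoo.com/mrss/}"  # Media RSS namespace

# Hypothetical feed content; a real run would download the feed instead.
SAMPLE_FEED = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Example story</title>
      <link>http://example.org/news/1</link>
      <description>Example text about London.</description>
      <media:thumbnail url="http://example.org/img/1.jpg"/>
    </item>
    <item>
      <title>Older story</title>
      <link>http://example.org/news/0</link>
      <description>Already seen.</description>
    </item>
  </channel>
</rss>"""


def new_stories(feed_xml, seen_links):
    """Return the stories in the feed whose link has not been seen before."""
    stories = []
    for item in ET.fromstring(feed_xml).iter("item"):
        link = item.findtext("link")
        if link in seen_links:
            continue  # processed on an earlier scan
        thumb = item.find(f"{MEDIA_NS}thumbnail")
        stories.append({
            "link": link,
            "text": item.findtext("description", default=""),
            "image": thumb.get("url") if thumb is not None else None,
        })
    return stories


if __name__ == "__main__":
    seen = {"http://example.org/news/0"}
    for story in new_stories(SAMPLE_FEED, seen):
        print(story["link"], story["image"])
```

Keeping a set of already-seen links is one simple way to make a twice-daily scan pick up only new stories.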

SARES makes use of two public keyword extraction services, Zemanta and OpenCalais. Their results are not combined; the two services are used side by side so that their output can be compared. SARES then uses the keywords each service returns to search DBpedia for matching resources using the DBpedia Lookup API.
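
A rough sketch of this step is shown below. It does not call Zemanta or OpenCalais; it only assumes each service has already returned a list of keywords. The Lookup endpoint and its `query` and `maxResults` parameters are assumptions based on the publicly hosted DBpedia Lookup service and may differ from what SARES used, and the function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed address of the public DBpedia Lookup service.
LOOKUP_ENDPOINT = "https://lookup.dbpedia.org/api/search"


def lookup_url(keyword, max_results=1):
    """Build a Lookup request URL for one keyword (parameter names assumed)."""
    return LOOKUP_ENDPOINT + "?" + urlencode(
        {"query": keyword, "maxResults": max_results}
    )


def compare_keywords(zemanta, opencalais):
    """Show which keywords the two extractors agree and disagree on."""
    a = {k.lower() for k in zemanta}
    b = {k.lower() for k in opencalais}
    return {
        "both": sorted(a & b),
        "zemanta_only": sorted(a - b),
        "opencalais_only": sorted(b - a),
    }


if __name__ == "__main__":
    # Hypothetical extractor output for one news story.
    result = compare_keywords(["London", "Tower Bridge"], ["london", "Thames"])
    print(result)
    print(lookup_url("Tower Bridge"))
```

Each keyword, whichever service it came from, would then be sent to the Lookup service to find its DBpedia resource.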

SARES stores the DBpedia resources (tags) and news story information (article link, image link and text) as RDF data in a Sesame RDF repository using the SARES ontology. SARES uses its own ontology to tell other machines or applications on the Internet how the tagging information is stored in the SARES databases. So when another machine tries to access the data from SARES, the SARES ontology acts like an info sheet saying "I am full of SaresResources. A SaresResource has some text (hasDescription), a SaresResource has an image (hasImageURI) and a SaresResource has some tags (hasSemanticAnnotation)".
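
To make the "info sheet" concrete, the sketch below writes one story as RDF in the N-Triples format. The class and property names come from the description above, but the namespace `http://example.org/sares#`, the choice of the article link as the subject and the helper functions are assumptions made for illustration; the real SARES ontology may differ.

```python
SARES = "http://example.org/sares#"  # hypothetical namespace for the ontology
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def uri(value):
    """Wrap an address in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text):
    """Quote plain text, escaping backslashes, quotes and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def story_triples(article, image, text, tags):
    """Return N-Triples lines describing one SaresResource."""
    subject = uri(article)  # assumption: the article link identifies the story
    lines = [
        f"{subject} {uri(RDF_TYPE)} {uri(SARES + 'SaresResource')} .",
        f"{subject} {uri(SARES + 'hasDescription')} {literal(text)} .",
        f"{subject} {uri(SARES + 'hasImageURI')} {uri(image)} .",
    ]
    # One hasSemanticAnnotation statement per DBpedia resource (tag).
    lines += [
        f"{subject} {uri(SARES + 'hasSemanticAnnotation')} {uri(tag)} ."
        for tag in tags
    ]
    return lines


if __name__ == "__main__":
    for line in story_triples(
        "http://example.org/news/1",
        "http://example.org/img/1.jpg",
        "Example text about London.",
        ["http://dbpedia.org/resource/London"],
    ):
        print(line)
```

Lines like these are the kind of data an RDF repository such as Sesame can load, although how SARES actually uploads its data is not covered here.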

Machines can now use this information to query the SARES databases for those specific types of data, for example "Find me any images that (hasSemanticAnnotation) 'London'".
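
Such a request would typically be written in SPARQL, the query language for RDF. The sketch below builds a query of that kind and, so that it can be run without a repository, also applies the same filter to a small in-memory list of triples. The namespace and function names are hypothetical, and the exact query SARES expects is not documented here.

```python
SARES = "http://example.org/sares#"  # hypothetical namespace for the ontology


def images_tagged_query(tag_uri):
    """Build a SPARQL query for the images of stories tagged with tag_uri."""
    return (
        f"PREFIX sares: <{SARES}>\n"
        "SELECT ?image WHERE {\n"
        "  ?story a sares:SaresResource ;\n"
        "         sares:hasImageURI ?image ;\n"
        f"         sares:hasSemanticAnnotation <{tag_uri}> .\n"
        "}"
    )


def images_tagged(triples, tag_uri):
    """In-memory stand-in for the query over (subject, predicate, object)."""
    stories = {
        s for s, p, o in triples
        if p == SARES + "hasSemanticAnnotation" and o == tag_uri
    }
    return sorted(
        o for s, p, o in triples
        if p == SARES + "hasImageURI" and s in stories
    )


if __name__ == "__main__":
    london = "http://dbpedia.org/resource/London"
    data = [
        ("http://example.org/news/1", SARES + "hasImageURI", "http://example.org/img/1.jpg"),
        ("http://example.org/news/1", SARES + "hasSemanticAnnotation", london),
        ("http://example.org/news/2", SARES + "hasImageURI", "http://example.org/img/2.jpg"),
    ]
    print(images_tagged_query(london))
    print(images_tagged(data, london))
```

Because the tag is a DBpedia resource and not a plain word, the query matches the concept of London itself and does not depend on how the word appears in the story text.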

SARES has three interfaces for you to explore and interact with. If you are wondering what you can do on each page, click the information icon at the bottom of the page.

Related Wiki pages
* Resource Description Framework
* Semantic Web
* Mashup (web application hybrid)
* Sesame (framework)
* OpenCalais
* Zemanta