Toloka

Toloka is a crowdsourcing and microtasking platform launched by Yandex in 2014 to quickly mark up large amounts of data, which are then used for machine learning and for improving search algorithms. The tasks are usually simple and do not require any special training from the performer. Most of them are designed to improve algorithms used in modern technologies such as self-driving vehicles, smart web search, advanced voice assistants, and e-commerce. Upon completion of each task, the performer receives a reward based on the volume of images, videos, and unstructured text processed. The service has mobile apps for Android and iOS.

Origin of the platform's name
A toloka was a traditional form of mutual assistance among villagers in Russia, Ukraine, Belarus, Estonia, Latvia, and Lithuania. It was organized to carry out urgent work requiring a large number of workers, such as harvesting, logging, or building houses. Sometimes a toloka was also used for community works, such as building churches, schools, and roads.

Types of tasks and scope of results
Data labeling helps to improve search quality and to tune the result-ranking algorithms of search engines effectively.

Machine learning
Training a machine learning algorithm requires large volumes of data labeled with positive and negative examples. Toloka performers receive tasks to determine the presence or absence of objects defined by a computer in a content item. In another type of task, the performer is given the context of a dialogue and a scale on which to assess whether a chatbot's answer in that context is appropriate, interesting, and so on. Another group of tasks in Toloka is translation verification, which is performed by collecting examples of translations from different performers.
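As an illustration only, the sketch below shows one common way to combine presence/absence answers given by several performers for the same item: a simple majority vote. The source does not describe how Toloka actually aggregates answers, so the function name `majority_vote`, the `(item_id, performer_id, label)` record layout, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """Aggregate labels per item by simple majority.

    `labels` is an iterable of (item_id, performer_id, label) tuples.
    Returns a dict mapping item_id to the winning label, or to None
    when the top two labels are tied.
    This is an assumed, generic scheme, not Toloka's documented method.
    """
    votes = defaultdict(Counter)
    for item_id, _performer_id, label in labels:
        votes[item_id][label] += 1

    result = {}
    for item_id, counter in votes.items():
        # most_common(2) yields the leading label and, if present, the runner-up
        (top_label, top_count), *rest = counter.most_common(2)
        if rest and rest[0][1] == top_count:
            result[item_id] = None  # tie: no majority
        else:
            result[item_id] = top_label
    return result


# Hypothetical answers to "is the object present in this image?"
sample_labels = [
    ("img1", "performer_a", True),
    ("img1", "performer_b", True),
    ("img1", "performer_c", False),
    ("img2", "performer_a", False),
    ("img2", "performer_b", True),
]
aggregated = majority_vote(sample_labels)
print(aggregated)
```

In this sample, `img1` resolves to `True` (two votes against one), while `img2` is left as `None` because its two answers disagree; a real pipeline might send such tied items to additional performers.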

Audit and marketing research
Tasks in this category include checking the quality of online stores and delivery services and writing reviews of products and services. Such audits make it possible to monitor service quality and to identify weaknesses that can then be addressed.

Users
Toloka users, also known as performers or tolokers, are people who earn money by completing system testing and improvement tasks on the Toloka crowdsourcing platform. In 2018, more than a million people participated in Toloka projects. Most performers are young people under 35, usually engineering students or mothers on maternity leave. Performers mainly see Toloka as an additional source of income, but many of them note that they like doing meaningful work and cleaning up the internet. As of March 2022, Toloka had 245,000 monthly active performers in 123 countries. Tolokers generate over 15 million labels per day.

Requesters
All tasks in Toloka are placed by requesters. The main uses of Toloka are data collection and processing for machine learning, speech technology, computer vision, smart search algorithms, and other projects, as well as content moderation, field tasks, and the optimization of internal business processes.

Toloka Research
In May 2019, the service's team started publishing datasets for non-commercial and academic purposes to support the scientific community and attract researchers to Toloka. These datasets are intended for researchers in fields such as linguistics, computer vision, testing of result aggregation models, and chatbot training. Toloka research has been showcased at a range of conferences, including the Conference on Neural Information Processing Systems (NeurIPS), the International Conference on Machine Learning (ICML), and the International Conference on Very Large Data Bases (VLDB).