Crawlers and crowdsourced data labeling

A bank approached us with the task of labeling products on non-bank platforms from scratch (cold labeling). We needed to collect a sample of items from publicly available sources and organize their labeling so that ML classification models could then be built on the result.

We organized data collection from competing platforms with preliminary labeling by LLMs, and set up a process for correcting those labels through Yandex Toloka. This produced both a golden set for quality measurement and a training set.
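
As an illustration only, the flow described above (LLM pre-labels, crowd correction, quality check against a golden set) might be sketched as follows. This is a minimal sketch under stated assumptions, not the project's actual pipeline: majority voting is an assumed aggregation rule, the function names are hypothetical, and the item IDs and labels are invented sample data. It does not call the Yandex Toloka API.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most common crowd label, or None on a tie or empty input."""
    counts = Counter(votes).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: no clear majority
    return counts[0][0]


def resolve_label(llm_label, votes):
    """Prefer the crowd majority; fall back to the LLM pre-label otherwise."""
    winner = majority_vote(votes)
    return winner if winner is not None else llm_label


def golden_accuracy(predicted, golden):
    """Share of golden-set items whose resolved label matches the reference."""
    shared = [item for item in golden if item in predicted]
    if not shared:
        return 0.0
    hits = sum(predicted[item] == golden[item] for item in shared)
    return hits / len(shared)


# Hypothetical sample data (not from the project).
llm_labels = {"item-1": "electronics", "item-2": "clothing", "item-3": "toys"}
crowd_votes = {
    "item-1": ["electronics", "electronics", "appliances"],
    "item-2": ["footwear", "footwear", "clothing"],
    "item-3": ["toys", "games"],  # tie, so the LLM pre-label is kept
}
golden = {"item-1": "electronics", "item-2": "footwear", "item-3": "games"}

final_labels = {
    item: resolve_label(label, crowd_votes.get(item, []))
    for item, label in llm_labels.items()
}
accuracy = golden_accuracy(final_labels, golden)
print(final_labels)
print(f"golden-set accuracy: {accuracy:.2f}")
```

In this sketch the golden set is used only for measurement and is kept apart from the training set, so the accuracy figure reflects labeling quality rather than model fit.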

Team strengths:
  • Collecting data from open sources
  • Building sample sets and task sets to the customer’s requirements
  • Preparing data marts for the target problem
Engineering NLP
