Request: A bank asked us to handle cold tagging of goods on non-bank platforms. This required collecting a sample of items from public sources and organizing their labeling so that ML classification models could then be built.
Solution: We set up data collection from competing platforms with preliminary labeling by LLMs, followed by correction of those labels through Yandex Toloka. We collected both a golden set for quality measurement and a training set.
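The label-correction and set-building steps can be illustrated with a minimal Python sketch. It is not the project's actual pipeline: the function names, the majority-vote rule, the 0.6 agreement threshold, and the random 20% golden split are all assumptions made for illustration, and no Yandex Toloka API calls are shown.

```python
import random
from collections import Counter


def aggregate_label(llm_label, crowd_votes, min_agreement=0.6):
    """Return the final label for one item (hypothetical rule).

    The crowd majority replaces the LLM pre-label only when the share of
    annotators behind it reaches min_agreement; otherwise the LLM
    pre-label is kept.
    """
    if not crowd_votes:
        return llm_label
    label, count = Counter(crowd_votes).most_common(1)[0]
    if count / len(crowd_votes) >= min_agreement:
        return label
    return llm_label


def split_golden_train(items, golden_share=0.2, seed=0):
    """Shuffle items reproducibly and split them into (golden, train)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_golden = int(len(shuffled) * golden_share)
    return shuffled[:n_golden], shuffled[n_golden:]
```

In practice a golden set is often labeled with stricter controls than the training set, so a plain random split like this one should be read only as a placeholder for that step.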
Team strengths:
Collecting data from open sources
Assembling sample sets and tasks to the customer’s requirements