Compression Group

We help implement advanced deep learning models and optimize the resources they consume:

Neural architecture search
We study methods for automatically searching and enumerating neural network architectures that belong to a given class, satisfy the constraints imposed by the task, and solve the target task with the best quality.
Pruning
We develop methods for sparsifying model weights and connections to reduce the resources the model consumes.
Distillation
We explore ways to train lightweight models on the outputs of their heavier counterparts without losing quality on the final task.
Quantization
We reduce the bit width of the operations and weights of neural network models so that they can run on low-bit processors and so that model computations run faster.
Estimating the potential quality of a model
We create methods for predicting the expected quality of a model on specific data samples to automate the selection of the best candidates.
Efficient model training methods
We apply algorithms for automated initialization and optimization, and we modify training approaches, to accelerate convergence to the best model configuration.
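To make two of the methods above concrete, here is a minimal sketch of magnitude pruning followed by symmetric 8-bit quantization on a flat list of weights. It is an illustration under simplifying assumptions, not the group's library: the function names (`magnitude_prune`, `quantize_int8`, `dequantize`) and the example weights are hypothetical, and real models would apply this per layer on tensors.

```python
# Illustrative sketch only: all names below are hypothetical.

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    # Indices of the k weights with the smallest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the integers back to approximate floating-point weights."""
    return [v * scale for v in q]


weights = [0.5, -0.02, 0.9, 0.01, -0.7, 0.03]
pruned = magnitude_prune(weights, sparsity=0.5)  # half of the weights become 0
q, scale = quantize_int8(pruned)                 # 8-bit integers plus one scale
restored = dequantize(q, scale)                  # approximation of `pruned`
```

After both steps, each weight is stored as a small integer plus one shared scale, and the zeros introduced by pruning can be skipped or stored sparsely. The rounding error per weight is at most half of `scale`.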
Compression Group clients set model optimization goals such as reducing OPEX for training and running neural network models.
We have identified the most frequent optimization requests:
  • Optimization at the model inference stage
    Our customers want to reduce the resources consumed by the model: RAM, CPU and GPU load, SSD space, and power. Lower resource requirements improve user-facing characteristics such as response speed and battery life.
  • Optimization of training processes
    Training complex neural network architectures takes a lot of time and requires a large amount of computing resources. To save money, the processes of training and selecting the best models need to be automated and optimized.
  • Preparing for on-device deployment
    Resources can also be saved by moving the computing load from centralized infrastructure to user devices. The model must be optimized so that the device has enough resources to run it.
  • Compatibility with new compute hardware
    The market now offers analog chips into which models can be integrated, low-bit processors that accelerate low-bit operations, and so on. Only models that satisfy the constraints of such hardware can run on it.
The library of compression methods we have built from our experience reduces the risk of missing target results and speeds up delivery of solutions to the customer.
Quality
on complex architectures, our methods outperform the off-the-shelf methods in PyTorch and TensorFlow
Measurability
the results of our methods are backed by fair comparisons of the resources consumed
Flexibility
when adapting the methods to the customer's tasks, we provide the solution architecture and the research approach
Guarantees
the compression team's results are backed by a track record of successful projects
All research results are handed over to the client
  • Software implementation
    An easy-to-use library with readable and reproducible code
  • Database of materials
    A collection of materials with reviews for getting up to speed in the field quickly, plus a technical report
  • Trained models
    Parameters of trained models packed in the format required by the client
  • Anything else
    We can prepare project artifacts in the format required by the customer
Compression Group optimized models for:
Scientific consultants of Compression Group
  • Konstantin Vorontsov
    Prof, MSU
    Specialisation: NLP
  • Vadim Strijov
    Prof, Grenoble
    Specialisation: Sensors
  • Mikhail Burtsev
    Prof, AIRI
    Specialisation: NLP
  • Radoslav Neychev
    Consult
    Specialisation: Sensors
  • Andrey Leonidov
    Prof, CERN
    Specialisation: ML
  • Andrey Raygorodsky
    Prof, Yandex
    Specialisation: ML, Graphs
  • Ilya Zharikov
    Consult
    Specialisation: DL & Sensors
  • Oleg Bakhteev
    Consult
    Specialisation: DL