Semantic video coding

Project objective:
Develop a neural network video compression method that takes high-level semantics into account.

Motivation:
Multimedia content plays a large part in modern life, and there is a growing need to reduce the cost of storing, transmitting, and streaming large volumes of such data. This creates demand for methods that compress images and video as efficiently as possible with minimal loss of quality. Neural network codecs already outperform traditional ones on this task, even in the general domain, and the advantage grows when a model is trained on more specific, homogeneous content. We therefore set out to test the hypothesis that giving the model additional data, such as the semantic content of the video, would preserve visual quality that is pleasing to human viewers even at high compression ratios.

Solution:
Our solution builds on best practices for neural network video codecs in low-latency scenarios and on our own work on extracting and using semantic information.

Results:
Under NDA. We developed original neural network models and approaches to image and video compression that perform at the level of current state-of-the-art (SOTA) solutions.

Customer: Under NDA.
Technology stack: Python, PyTorch.