VARS is a multi-view, multi-task video architecture that automatically identifies the type of a foul and its severity.

It encodes per-view video features (E), aggregates the view features (A), and classifies different properties of the foul action (C).
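The three stages can be illustrated with a minimal, dependency-free sketch. It is a toy under stated assumptions, not the VARS implementation: the encoder is a stand-in that averages frame vectors (the real model uses a video backbone), the choice of max or mean pooling across views is assumed, and the function names and weights are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def encode_view(clip: Sequence[Vector]) -> Vector:
    """Stand-in encoder (E): average the per-frame vectors of one view.

    Assumption: a real system would run a video backbone here instead.
    """
    n_frames = len(clip)
    return [sum(column) / n_frames for column in zip(*clip)]


def aggregate_views(features: Sequence[Vector], mode: str = "max") -> Vector:
    """Aggregator (A): pool element-wise across the per-view features."""
    if mode == "max":
        return [max(column) for column in zip(*features)]
    if mode == "mean":
        return [sum(column) / len(column) for column in zip(*features)]
    raise ValueError(f"unknown aggregation mode: {mode}")


def linear_scores(weights: Sequence[Vector], x: Vector) -> List[float]:
    """One score per class: dot product of each weight row with x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def classify(
    x: Vector,
    type_weights: Sequence[Vector],
    severity_weights: Sequence[Vector],
) -> Tuple[int, int]:
    """Classifier (C): two heads share one aggregated feature (multi-task)."""
    type_scores = linear_scores(type_weights, x)
    severity_scores = linear_scores(severity_weights, x)
    foul_type = type_scores.index(max(type_scores))
    severity = severity_scores.index(max(severity_scores))
    return foul_type, severity


def predict(views, type_weights, severity_weights, mode: str = "max"):
    """Full E -> A -> C pass over all camera views of one action."""
    per_view = [encode_view(clip) for clip in views]
    pooled = aggregate_views(per_view, mode)
    return classify(pooled, type_weights, severity_weights)
```

Because the pooling step is applied element-wise across views, the same heads work for any number of camera views per action.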

  • The goal was to improve the performance of VARS on the SoccerNet Challenge dataset. Researchers from many universities and companies took part in the challenge, and we finished third on the test set.
  • Apart from optimising the model, we reimplemented it in PyTorch Lightning and used Weights & Biases for logging and model tracking.

Tools used

  • PyTorch Lightning
  • Weights & Biases (wandb)
  • Python
  • MViTv2