Learning Unbiased Transformer for Long-Tail Sports Action Classification

Yijun Qian, Lijun Yu, Wenhe Liu, Alexander G. Hauptmann

Published in MediaEval, 2021

CEUR

The Sports Video Task in the MediaEval 2021 Challenge contains two subtasks: detection and classification. The classification subtask aims to classify different strokes in table tennis segments. These strokes are fine-grained actions and are difficult to distinguish. To solve this challenge, we, the INF Team, proposed a fine-grained action classification pipeline with SWIN-Transformer and a combination of optimization techniques. According to the evaluation results, our best submission ranks first with 74.21% accuracy and significantly outperforms the runner-up, which reached 68.78%.

@inproceedings{qian2021learning,
  title={Learning Unbiased Transformer for Long-Tail Sports Action Classification},
  author={Qian, Yijun and Yu, Lijun and Liu, Wenhe and Hauptmann, Alexander G},
  booktitle={MediaEval},
  year={2021}
}