Temporal Attention for Robust Multiple Object Pose Tracking

Zhongluo Li, Junichiro Yoshimoto, Kazushi Ikeda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Estimating the pose of multiple objects has improved substantially since deep learning became widely used. However, performance deteriorates when the objects are highly similar in appearance or when occlusions are present. This issue is usually addressed by leveraging temporal information, taking previous frames as priors to improve the robustness of estimation. Existing methods are either computationally expensive because they process multiple frames, or integrate temporal information inefficiently through ad hoc procedures. In this paper, we perform computationally efficient object association between two consecutive frames via attention across a video sequence. Furthermore, instead of a heatmap-based approach, we adopt a coordinate classification strategy that requires no post-processing, so the network is built in an end-to-end fashion. Experiments on real data show that our approach achieves state-of-the-art results on the PoseTrack datasets.
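The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' network: the function names (`associate`, `decode_coordinate`), the toy feature vectors, the greedy argmax matching, and the `bins_per_pixel` parameter are all assumptions made for illustration. The first function scores each current-frame embedding against the previous-frame embeddings with scaled dot-product attention; the second reads one coordinate out of per-bin classification logits instead of a heatmap.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def associate(curr_feats, prev_feats):
    # Hypothetical helper: for each current-frame embedding (query), compute
    # scaled dot-product attention weights over previous-frame embeddings
    # (keys) and return the index of the most-attended previous object.
    d = len(prev_feats[0])
    matches = []
    for q in curr_feats:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in prev_feats]
        weights = softmax(scores)
        matches.append(max(range(len(weights)), key=weights.__getitem__))
    return matches

def decode_coordinate(logits, bins_per_pixel=1.0):
    # Hypothetical helper: coordinate classification along one axis.
    # The most probable bin index is mapped back to a pixel position.
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best / bins_per_pixel

# Toy data (assumed): two objects in each of two consecutive frames.
prev_feats = [[1.0, 0.0], [0.0, 1.0]]
curr_feats = [[0.1, 0.9], [0.8, 0.2]]
matches = associate(curr_feats, prev_feats)

# Four bins at two bins per pixel; the largest logit is at bin 1.
x = decode_coordinate([0.1, 2.0, 0.3, -1.0], bins_per_pixel=2.0)
```

Greedy argmax matching does not guarantee a one-to-one assignment between frames; a full tracker would need to resolve conflicts, which this sketch leaves out.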

Original language: English
Title of host publication: Neural Information Processing - 30th International Conference, ICONIP 2023, Proceedings
Editors: Biao Luo, Long Cheng, Zheng-Guang Wu, Hongyi Li, Chaojie Li
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 551-561
Number of pages: 11
ISBN (Print): 9789819980697
DOIs
Publication status: Published - 2024
Event: 30th International Conference on Neural Information Processing, ICONIP 2023 - Changsha, China
Duration: 20-11-2023 to 23-11-2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14450 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 30th International Conference on Neural Information Processing, ICONIP 2023
Country/Territory: China
City: Changsha
Period: 20-11-23 to 23-11-23

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
