A Sentiment Pre-trained Text-Guided Multimodal Cross-Attention Transformer for Improved Depression Detection

Shiyu Teng, Shurong Chai, Jiaqing Liu, Tomoko Tateyama, Lanfen Lin, Yen Wei Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Depression is a widespread mental health issue requiring efficient automated detection methods. Traditional single-modality approaches are less effective due to the disorder's complexity, leading to a focus on multimodal analysis. Recent advancements include transformer-based fusion methods, yet their application in depression detection is often limited by the dominant text modality. To address this, we propose the Text-Guided Multimodal Cross-Attention Transformer, enhancing cross-modal interactions between text, audio, and video for more effective depression detection. Our approach uniquely pre-trains encoders on a large sentiment dataset to better capture emotion-related features crucial for identifying depression-related sentiment changes. Our method demonstrates superior performance on the AVEC2019 benchmark, outperforming current state-of-the-art depression detection techniques.
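The text-guided cross-modal interaction described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the architecture, so everything here is an assumption made for illustration: text features act as queries over audio and video features, attention is single-head scaled dot-product without learned projections, and the attended outputs are concatenated with the text features. The function names (`cross_attention`, `text_guided_fusion`) and the toy feature values are hypothetical and are not taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key rows."""
    scale = math.sqrt(len(queries[0]))
    dim = len(values[0])
    output = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(feature size).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted average of the value rows.
        output.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return output


def text_guided_fusion(text, audio, video):
    """Assumed fusion scheme: text queries attend to audio and to video,
    and the two attended results are concatenated with the text features."""
    text_audio = cross_attention(text, audio, audio)
    text_video = cross_attention(text, video, video)
    return [t + a + v for t, a, v in zip(text, text_audio, text_video)]


# Toy features (hypothetical): 2 text tokens, 3 audio frames, 2 video frames,
# each with 2 feature dimensions.
text = [[1.0, 0.0], [0.0, 1.0]]
audio = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
video = [[0.2, 0.8], [0.9, 0.1]]

fused = text_guided_fusion(text, audio, video)
```

In this sketch each text token yields one fused vector, so the output length follows the text sequence regardless of how many audio or video frames there are. A real model would add learned query, key, and value projections and multiple heads.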

Original language: English
Title of host publication: 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350371499
Publication status: Published - 2024
Event: 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2024 - Orlando, United States
Duration: 15-07-2024 to 19-07-2024

Publication series

Name: Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS
ISSN (Print): 1557-170X

Conference

Conference: 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2024
Country/Territory: United States
City: Orlando
Period: 15-07-24 to 19-07-24

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Biomedical Engineering
  • Computer Vision and Pattern Recognition
  • Health Informatics
