A Transformer-based Multimodal Network for Audiovisual Depression Prediction

Shiyu Teng, Shurong Chai, Jiaqing Liu, Tateyama Tomoko, Xinyin Huang, Yen Wei Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

Depression is a prevalent mental illness that causes many diseases all over the world. Identifying people with mental illness is challenging because there is no physiological difference between mentally ill people and normal people, so clinicians can only make a subjective diagnosis based on the relevant information about each patient. Hence, it has become imperative to develop automated methods for audiovisual depression prediction. Although many studies have been conducted in this field, one challenge remains: long-term temporal context information is difficult to extract from long sequences of aural and visual data. This study aims to construct a novel transformer-based multimodal network to distinguish depressed patients from normal people. We evaluate our approach on the Chinese Soochow University depressive severity dataset and demonstrate that our method outperforms the existing method.

Original language: English
Title of host publication: GCCE 2022 - 2022 IEEE 11th Global Conference on Consumer Electronics
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 761-764
Number of pages: 4
ISBN (Electronic): 9781665492324
DOIs
Publication status: Published - 2022
Event: 11th IEEE Global Conference on Consumer Electronics, GCCE 2022 - Osaka, Japan
Duration: 18-10-2022 to 21-10-2022

Publication series

Name: GCCE 2022 - 2022 IEEE 11th Global Conference on Consumer Electronics

Conference

Conference: 11th IEEE Global Conference on Consumer Electronics, GCCE 2022
Country/Territory: Japan
City: Osaka
Period: 18-10-22 to 21-10-22

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Information Systems and Management
  • Electrical and Electronic Engineering
  • Media Technology
  • Instrumentation
  • Social Psychology
