Publications

Depression Detection via Influence-based Relabeling for Resolving Training Set Noise

Published

IEEE International Conference on Systems, Man, and Cybernetics (SMC)

Date

2022.11.01

Abstract

Early depression detection research employs machine learning models trained on crowdsourced data. Such training data is prone to label noise because participants' self-perception is often weak and the collection process is hard to control.
In this work, we are the first to introduce influence-based relabeling to the depression detection task to revise noisy labels, and we go one step further by proposing a threshold ratio function that controls the number of relabeled samples.
Previous influence-function methods usually ignore the relabeling sample size, so the relabeling is sometimes excessive, which causes a large distribution shift in the training data and degrades model performance.
Our proposed method aims to avoid such drastic changes to the training data. To achieve this, we design an adjustable ratio threshold for the samples to be relabeled.
The ratio depends on the performance of the trained model: if the model performs well on the validation set, the relabeling ratio is mild; otherwise, the relabeling can be aggressive. In the experiments, we recruited about 205 participants
and collected usage data from smartphones and band trackers, along with participants’ responses to the DASS-21 questionnaire. We discuss several mainstream denoising methods and compare four recent methods on the depression detection
task. The proposed model achieves a best test F1 score of 86.3%.
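
The idea of a performance-dependent relabeling budget can be illustrated with a minimal sketch. The abstract does not give the form of the threshold ratio function, so the linear rule below, the names relabel_ratio and relabel, the max_ratio parameter, and the convention that a larger positive score marks a more harmful sample are all assumptions made for illustration, not the paper's method. Labels are assumed to be binary (0/1).

```python
import numpy as np


def relabel_ratio(val_score, max_ratio=0.3):
    """Hypothetical threshold ratio: shrinks linearly as validation score rises.

    val_score is a validation metric in [0, 1] (for example F1);
    max_ratio is an assumed upper bound on the fraction of relabeled samples.
    """
    val_score = float(np.clip(val_score, 0.0, 1.0))
    return max_ratio * (1.0 - val_score)


def relabel(labels, harm_scores, val_score, max_ratio=0.3):
    """Flip the binary labels of the most harmful samples, within the budget.

    harm_scores is assumed to hold one influence-based score per training
    sample, where a larger positive value means the sample hurts validation
    performance more. Returns the new labels and the flipped indices.
    """
    labels = np.asarray(labels).copy()
    harm_scores = np.asarray(harm_scores, dtype=float)

    # Budget: at most ratio * n samples may be relabeled.
    budget = int(np.floor(relabel_ratio(val_score, max_ratio) * len(labels)))

    # Rank samples from most to least harmful and keep the top `budget`.
    order = np.argsort(-harm_scores)[:budget]

    # Never flip samples that are not scored as harmful.
    chosen = order[harm_scores[order] > 0]

    labels[chosen] = 1 - labels[chosen]
    return labels, chosen


if __name__ == "__main__":
    y = [0, 1, 0, 1, 1, 0, 0, 1, 0, 1]
    s = [0.9, -0.2, 0.1, 0.8, -0.5, 0.0, 0.3, 0.7, -0.1, 0.2]
    for score in (0.0, 0.9):
        new_y, flipped = relabel(y, s, val_score=score)
        print(score, sorted(flipped.tolist()), new_y.tolist())
```

With a poor validation score the budget is large and several labels are flipped; with a strong validation score the budget shrinks toward zero and the training set is left nearly unchanged, which mirrors the mild versus aggressive behavior described in the abstract.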