Abstract:
Despite significant advancements in artificial intelligence (AI), most machine learning (ML) solutions remain black boxes that offer little to no explanation of how decisions are made. To build trust in AI applications in health care, it is crucial for practitioners and patients to understand the reasons behind decisions made by ML models. In particular, there is a need for explainable AI systems for mental health. While there has been significant progress in developing stress prediction models, those models provide no explanation of how they arrive at a prediction. In this work, we propose a new design for an explanatory AI report that presents the results of automated stress assessment based on wearable sensors. Because medical practitioners and patients are likely to be familiar with blood test reports, we modeled the look and feel of the explanatory AI report on that of a standard blood test report, in which the rows indicate the different physiological sources being tested, and the columns indicate the test results and associated parameters. The physiological measurements used by the AI model to generate the stress report include electrocardiogram, electromyography, electrodermal activity, respiration, and body temperature data. The test results, which reflect the AI explanation, include the following indicators: the predicted stress probability, reference intervals for the normal range of values for each physiological signal, warning flags that indicate results in the abnormal stress ranges, and the impact of each physiological signal on the overall stress prediction. The stress prediction and impact measures were derived using explainable ML models that show the contributions of individual features to the overall result of the model. The reference intervals and flags were then derived from those contributions. Historical studies in psychology were used to form ground truth explanations for the physiological signals.
The AI explanation reports were then evaluated for usefulness and effectiveness using real, documented stress and physiological study data from 14 users. The confidence in the predicted stress was reflected by the accuracy of the underlying ML prediction model, which achieved a binary F1 score of 0.78. The contributions of each physiological signal to the stress prediction were shown to correlate with the ground truth. The reference intervals for stress versus non-stress were clearly distinct, with little variation. In addition to these quantitative evaluations, a qualitative survey completed by an expert in psychiatry confirmed that the explanation report inspired confidence and was effective in conveying the different aspects of the AI system: the result of the stress prediction and which physiological (vital) signs were related to stressful episodes. The report also provided a source of additional medical insights into the patient’s mental health.
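As an illustration of the report layout described above (rows for physiological signals; columns for the measured value, reference interval, warning flag, and impact), the following is a minimal Python sketch. It is not the implementation used in this work: the names `ReportRow` and `build_report`, the rule of flagging values outside the reference interval, the normalization of impact as a share of total absolute contribution, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReportRow:
    """One physiological signal in the report (all fields are illustrative)."""
    signal: str
    value: float         # summary value of the signal for this assessment
    ref_low: float       # assumed lower bound of the non-stress reference interval
    ref_high: float      # assumed upper bound of the non-stress reference interval
    contribution: float  # signed feature contribution from an explainable ML model


def build_report(rows):
    """Return (signal, value, reference interval, flag, impact %) tuples.

    Assumption: impact is each signal's share of the total absolute
    contribution; the source does not specify this normalization.
    """
    total = sum(abs(r.contribution) for r in rows) or 1.0
    report = []
    for r in rows:
        if r.value > r.ref_high:
            flag = "HIGH"
        elif r.value < r.ref_low:
            flag = "LOW"
        else:
            flag = ""
        impact = round(100.0 * abs(r.contribution) / total, 1)
        report.append((r.signal, r.value, f"{r.ref_low}-{r.ref_high}", flag, impact))
    return report


# Hypothetical example values, not taken from the study data.
example = [
    ReportRow("Electrodermal activity", 8.0, 1.0, 5.0, 0.3),
    ReportRow("Respiration", 14.0, 12.0, 20.0, -0.1),
]
report = build_report(example)
for row in report:
    print(row)
```

In this sketch the flag depends only on whether the value falls inside the reference interval, while the impact column depends only on the model's feature contributions, which mirrors the separation between those two columns in the report design.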