Summary
- exploratory research project comparing numeric scale ratings of depression with how people describe depression in their own words
- worked within the Yale Rutledge Lab using data collected as part of a study aiming to understand decision-making and happiness
Data Analysis and Visualization
- analyzing text responses to prompts based on the Patient Health Questionnaire-9 by generating word embeddings (OpenAI embeddings)
- comparing the pattern of word embeddings across the 9 prompts to the pattern of numeric scale ratings
- specifically, computing the Euclidean distance between numeric responses to the 9 prompts and the cosine distance between embeddings for the 9 prompts, then computing the correlation between the numeric distance matrices and the cosine distance matrices
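
The distance-and-correlation step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: it assumes one 9x9 distance matrix per participant for each modality, compares only the upper triangles (each prompt pair counted once), and uses Pearson correlation, since the type of correlation is not specified above. The function names, the 0-3 rating range, and the synthetic data are all hypothetical.

```python
import numpy as np

def pairwise_euclidean(ratings):
    """Euclidean distance between every pair of prompt ratings.

    ratings: shape (n_prompts,) or (n_prompts, d). For scalar ratings this
    reduces to the absolute difference between two ratings.
    """
    x = np.asarray(ratings, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def pairwise_cosine_distance(embeddings):
    """Cosine distance (1 - cosine similarity) between every pair of rows."""
    e = np.asarray(embeddings, dtype=float)
    unit = e / np.linalg.norm(e, axis=1, keepdims=True)
    return 1.0 - unit @ unit.T

def matrix_correlation(a, b):
    """Pearson correlation between the upper triangles of two square matrices.

    Assumption: Pearson is used here for illustration only. The result is
    NaN if either matrix is constant (e.g. all nine ratings are identical).
    """
    iu = np.triu_indices_from(a, k=1)
    return float(np.corrcoef(a[iu], b[iu])[0, 1])

# Synthetic data for one hypothetical participant (not real study data).
rng = np.random.default_rng(0)
ratings = np.array([0, 1, 3, 2, 0, 1, 2, 3, 1])  # assumed 0-3 ratings, 9 prompts
embeddings = rng.normal(size=(9, 16))            # stand-in for text embeddings

numeric_dist = pairwise_euclidean(ratings)
cosine_dist = pairwise_cosine_distance(embeddings)
r = matrix_correlation(numeric_dist, cosine_dist)
```

Under these assumptions, `r` is a single number per participant, which could then be compared between depressed and non-depressed groups.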


Correlation between the numeric pattern of responding and the embedding pattern of responding, compared for depressed and non-depressed participants


Dimension Reduction Analyses on Word Embeddings
- to understand how text responses cluster together, with each response coloured by which of the 9 prompts it answers
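
As a rough sketch of what such a dimension reduction could look like, the example below projects embeddings to two dimensions with PCA. PCA is an assumption chosen for illustration; the method actually used is not named above and may have been something else (for example t-SNE or UMAP). The function name `pca_2d`, the array sizes, and the synthetic data are hypothetical.

```python
import numpy as np

def pca_2d(embeddings):
    """Project rows onto their first two principal components."""
    x = np.asarray(embeddings, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Synthetic stand-in: 20 hypothetical participants x 9 prompts, 32-dim embeddings.
rng = np.random.default_rng(0)
n_participants, n_prompts, dim = 20, 9, 32
embeddings = rng.normal(size=(n_participants * n_prompts, dim))
prompt_ids = np.tile(np.arange(n_prompts), n_participants)  # prompt label per row

coords = pca_2d(embeddings)

# Mean 2-D position of each prompt's responses, a simple summary of clustering.
centroids = np.array(
    [coords[prompt_ids == p].mean(axis=0) for p in range(n_prompts)]
)
```

The `coords` array could be passed to a scatter plot with `prompt_ids` as the colour variable to see whether responses to the same prompt sit close together.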

Conclusions
There are differences in how people talk about their symptoms of depression and how people rate their symptoms numerically.
Because of this, language use may be a useful added dimension to consider when trying to understand individuals’ experiences of depression.
Word embeddings provide a useful way to capture linguistic and semantic differences, which can then be combined with other modalities of data (e.g., numeric questionnaire responses, biological measures) and incorporated into computational models.