How Reliable is Artificial Empathy in Large Language Models?

We’ve seen artificial intelligence make strides in numerous sectors, from healthcare to finance, solving problems that seemed insurmountable just a few years ago. 

But what happens when we ask AI to understand us, not just solve our problems? 

A groundbreaking paper published in 2023 attempts to answer this question by examining Large Language Models (LLMs) and their capacity for empathy.

The Research in Focus

The study critically evaluates seven peer-reviewed publications that explore LLMs – primarily focusing on OpenAI’s ChatGPT-3.5 – in a medical context. 

Although the field is abuzz with a variety of large language models, the published evidence so far concentrates almost entirely on this single model.

The papers included in the review analyse how well this particular model recognizes emotions and provides emotionally supportive responses.

Subjectivity: A Double-Edged Sword

One of the most startling revelations from the review is the methodology behind evaluating empathy in LLMs. 

Six of the seven studies relied on subjective human evaluation to measure the AI’s emotional intelligence. ChatGPT-3.5 appears to understand and respond to emotional cues, but the criteria used to judge it may themselves be flawed.

We’re banking on human judgment, which is, unfortunately, far from a gold standard for measuring empathy: different raters can score the very same response quite differently.
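To see why subjective ratings are shaky ground, a common reliability check is inter-rater agreement. The sketch below is a hypothetical illustration (not data from the review): it computes Cohen’s kappa, which corrects raw agreement for chance, for two imagined judges scoring the same set of AI responses on a 1–5 empathy scale. Values near 1 mean strong agreement; values near 0 mean the judges agree little better than coin flips.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    assert len(rater_a) == len(rater_b) and rater_a, "need paired ratings"
    n = len(rater_a)
    # Observed agreement: fraction of items where the raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's marginal score frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Hypothetical empathy ratings (1-5) from two human judges of the same ten responses.
judge_1 = [5, 4, 4, 3, 5, 2, 4, 3, 5, 4]
judge_2 = [3, 4, 5, 3, 4, 2, 3, 4, 5, 3]
print(round(cohens_kappa(judge_1, judge_2), 2))  # → 0.18, barely above chance
```

In this made-up example the judges agree on 4 of 10 items, and after chance correction kappa lands near 0.18: both judges watched the same responses, yet their “measurements” of empathy barely agree. Studies that lean on such ratings without reporting agreement leave us unable to tell signal from rater noise.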

The Dangers of Subjective Metrics

Why does subjectivity matter? 

Consider the implications in a medical setting. Misjudging a machine’s level of empathy could result in inappropriate emotional support during sensitive consultations, leading to complications or decreased patient trust. 

The stakes are high, and relying solely on human interpretation could be a risky gamble.

The Need for Objective Measures

The research paper sounds a clarion call for more robust, objective metrics for evaluating AI empathy.

Although LLMs like ChatGPT-3.5 exhibit a promising ability to recognize emotional states and provide supportive responses, we need a more reliable way to measure these capabilities.

Until then, the role of AI in sensitive sectors like healthcare remains a grey area.

The Road Ahead: The Future of Empathy in AI

The research review, while not providing definitive answers, sets the stage for future inquiries. 

To integrate AI seamlessly into areas requiring emotional intelligence, the next phase of research must move beyond human subjectivity. 

This could involve creating standardised tests, using biometric data, or developing new frameworks that more accurately capture the nuances of empathy.


The 2023 systematic review serves as an intriguing starting point for anyone interested in the interplay between artificial intelligence and human emotion.

It uncovers the potential and pitfalls of using Large Language Models in emotionally charged settings like healthcare. As AI systems continue to infiltrate our daily lives, understanding their limitations isn’t just an intellectual exercise; it’s a societal imperative.

For those keen on unlocking the next frontier of AI, the paper presents both a cautionary tale and a roadmap. 

It’s time to push the boundaries and refine our approach.

Because understanding how AI understands us is crucial for a future where man and machine coexist.
