Can ChatGPT Replace Diabetes Educators? Perhaps Not Yet

Liam Davenport

April 03, 2023

ChatGPT, the novel artificial intelligence (AI) tool that has attracted interest and controversy in seemingly equal measure, can provide clear and accurate responses to some common questions about diabetes care, say researchers from Singapore. But they also have some reservations.  

Chatbots such as ChatGPT use natural-language AI models trained on large repositories of human-generated text from the internet, producing human-like responses by predicting the words statistically most likely to follow a given query.
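
As a rough illustration of what "statistically likely" means here, consider a toy bigram model, a deliberately simple Python sketch that is far removed from the large neural networks behind ChatGPT: it picks each next word in proportion to how often that word followed the previous one in its training text.

    import random
    from collections import defaultdict

    # Toy sketch of "statistically likely" text generation: a bigram model
    # that samples each next word according to how often it followed the
    # previous word in a tiny training corpus. Real chatbots use vastly
    # larger neural models, but the idea of sampling likely continuations
    # is the same.
    corpus = ("check your blood sugar before meals and check your insulin "
              "dose with your doctor before changing your insulin dose").split()

    follows = defaultdict(list)
    for prev, nxt in zip(corpus, corpus[1:]):
        follows[prev].append(nxt)  # duplicates preserve observed frequencies

    word, output = "check", ["check"]
    for _ in range(8):
        candidates = follows.get(word)
        if not candidates:
            break
        word = random.choice(candidates)  # frequency-weighted random pick
        output.append(word)

    print(" ".join(output))

Each run produces a fluent-sounding phrase with no check on whether it is true, which hints at why such systems can sound persuasive while being wrong.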

The researchers posed a series of common questions to ChatGPT about four key domains of diabetes self-management and found that it "generally performed well in generating easily understood and accurate responses to questions about diabetes care," say Gerald Gui Ren Sng, MD, Department of Endocrinology, Singapore General Hospital, and colleagues.

Their research, recently published in Diabetes Care, did, however, reveal that there were inaccuracies in some of the responses and that ChatGPT could be inflexible or require additional prompts.

ChatGPT Not Trained on Medical Databases

The researchers highlight that ChatGPT is trained on a general, not medical, database, "which may explain the lack of nuance" in some responses, and that its information dates from before 2021, and so may not include more recent evidence.

There are also "potential factual inaccuracies" in its answers that "pose a strong safety concern," the team says: the tool is prone to so-called "hallucination," whereby inaccurate information is presented in a persuasive manner.

Sng told Medscape Medical News that ChatGPT was "not designed to deliver objective and accurate information" and is not an "AI fact checker but a conversational agent first and foremost."

"In a field like diabetes care or medicine in general, where acceptable allowances for errors are low, content generated via this tool should still be vetted by a human with actual subject matter knowledge," Sng emphasized.

He added: "One strength of the methodology used to develop these models is that there is reinforcement learning from humans; therefore, with the release of newer versions, the frequency of factual inaccuracies may be progressively expected to reduce as the models are trained with larger and larger inputs."

This could well help modify "the likelihood of undesirable or untruthful output," although he warned the "propensity to hallucination is still an inherent structural limitation of all models."

Advise Patients

"The other thing to recognize is that even though we may not recommend use of ChatGPT or other large language models to our patients, some of them are still going to use them to look up information or answer their questions anyway," Sng observed.

This is because chatbots are "in vogue and arguably more efficient at information synthesis than regular search engines."

He underlined that the purpose of the new research was to raise awareness among clinicians and diabetes educators of the strengths and limitations of such tools, "so that we are better equipped to advise our patients who may have obtained information from such a source."

"In the same way … [that] we are now well-attuned to advising our patients how to filter information from 'Dr. Google', perhaps a better understanding of 'Dr ChatGPT' will also be useful moving forward," Sng added.

Implementing large language models may be a way to offload some burdens of basic diabetes patient education, freeing trained providers for more complex duties, say Sng and colleagues.

Diabetes Education and Self-Management

Patient education to aid diabetes self-management is, the researchers note, "an integral part of diabetes care and has been shown to improve glycemic control, reduce complications, and increase quality of life."

However, the traditional model for delivering this education, in which clinicians work alongside diabetes educators, has been strained by reduced access to care during the COVID-19 pandemic and by a shortage of educators.

Because ChatGPT recently passed the US Medical Licensing Examination, the researchers wanted to assess how well it performs on questions about diabetes self-management and education.

They asked it two rounds of questions related to diabetes self-management, divided into four domains:

  • Diet and exercise

  • Hypoglycemia and hyperglycemia education

  • Insulin storage

  • Insulin administration

They report that ChatGPT "was able to answer all the questions posed," and did so in a systematic way, "often providing instructions in clear point form," in layperson language, with jargon explained in parentheses.

In most cases, it also recommended that an individual consult their healthcare provider.

However, the team notes there were "certain inaccuracies," such as failing to recognize that insulin analogs should be stored at room temperature once opened, and that ChatGPT was "inflexible" on issues such as recommending diet plans.

In one example, when asked, "My blood sugar is 25, what should I do?" the tool provided simple steps for hypoglycemia correction but assumed the reading was in mg/dL, when it could have been in mmol/L, the unit used in many countries, including Singapore. A reading of 25 mmol/L (roughly 450 mg/dL) would instead indicate severe hyperglycemia, for which hypoglycemia advice would be dangerous.
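
A minimal Python sketch makes the ambiguity concrete (the conversion factor is standard; the cutoffs below are common clinical thresholds, not taken from the study):

    MGDL_PER_MMOLL = 18.0  # 1 mmol/L of glucose is roughly 18 mg/dL

    def interpret_glucose(value, unit):
        """Crudely classify a glucose reading; cutoffs are illustrative."""
        mgdl = value * MGDL_PER_MMOLL if unit == "mmol/L" else value
        if mgdl < 70:
            return "hypoglycemia"
        if mgdl > 180:
            return "hyperglycemia"
        return "in range"

    print(interpret_glucose(25, "mg/dL"))   # hypoglycemia: dangerously low
    print(interpret_glucose(25, "mmol/L"))  # hyperglycemia: about 450 mg/dL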

The team also reports: "It occasionally required additional prompts to generate a full list of instructions for insulin administration."

No funding declared. The authors have reported no relevant financial relationships.

Diabetes Care. Published online March 15, 2023.
