On the same day in May 2023, an experienced lymphoedema practitioner (AM) assessed the validity of responses from two generative AI chatbots: Bard (Alphabet Inc [Google’s parent company], California, Version 2.0.1) and ChatGPT (OpenAI, California, Version 3.01, 2023). Response validity was rated on a subjective three-point scale (not valid, potentially valid or valid), based on common clinical knowledge and the current evidence base. Both Bard and ChatGPT have been trained on large datasets that include medical information and can process text queries quickly. For the purpose of differential diagnosis, both systems can provide the practitioner with a list of possible diagnoses based on text input.
To allow comparison of responses between the two chatbots, the identical scenario and text queries that were first entered into Bard were then entered into ChatGPT. Because generative AI learns over time, the responses may change as the technology evolves. Screenshots of the responses were therefore taken.
Scenario
A 66-year-old female patient presents to a newly trained lymphoedema practitioner. The patient’s chief complaint is bilateral leg swelling with some pain and discomfort. There are some signs of chronic fatty tissue deposition in the lower legs. The practitioner asks the patient about their medical history, their family history and their lifestyle habits. The practitioner also performs a physical examination of the patient’s legs.
While suspecting lymphoedema, the practitioner is unsure of the diagnosis. The practitioner is concerned about the report of “some pain” and questions whether the patient could have lipoedema. The practitioner decides to consult with generative AI. Along with clinical reasoning and further tests, the practitioner then attempts to provide a possible diagnosis for the patient.
Results
A) Bard
The practitioner types the following initial query into Bard: “A 66 year of old female presents complaining of bilateral swelling. She has no family history of swelling. There is marked pitting oedema on physical examination. What could the diagnosis be?”
The initial Bard response does not include lymphoedema as a possible diagnosis [Figure 1]. The possible causes of swelling are proposed as:
- Heart failure.
- Kidney disease.
- Liver disease.
- Thyroid problems.
- Medication side-effects.
- Infection.
- Cancer.
Given the conditions proposed above, the clinician would need to use their own clinical reasoning to exclude each of them. A pathway for such clinical reasoning is proposed [Figure 4].
The practitioner types a second query into Bard: “The patient is medically well. The swelling is chronic. There is no shortness of breath at rest, no night pain. The patient has been to her GP who has ruled out cancer or an infection. What else could the swelling be from?”
The chatbot provides the following health conditions as possible diagnoses [Figure 2]:
- Chronic venous insufficiency (CVI).
- Lymphoedema.
- Nephrotic syndrome.
- Myxoedema.
- Oestrogen therapy.
The practitioner then types a third query into Bard: “There are no signs of venous disease. Kidney problems have been excluded. A thyroid problem is unlikely, but the patient will be encouraged to have tests from her GP. She is not on estrogen therapy. The lymphoedema diagnosis may be possible. Apart from swelling, the patient’s legs are painful to touch. What could be the cause?”
The chatbot confirms that lymphoedema is a possible diagnosis [Figure 3].
Having undertaken lymphoedema training and having a reasonable understanding of the differential diagnosis of chronic oedemas, the practitioner queries whether the patient could have lipoedema, given the symptom of pain sensitivity. The question posed is, “What about lipoedema? How could I tell if the patient has this condition when compared to lymphoedema?”
Bard then provides a response containing information that is not factually correct, including characteristics of lipoedema and lymphoedema [Figure 5]. The response also encourages referral to a qualified healthcare practitioner by stating, “The best way to tell the difference between lipoedema and lymphedema is to see a doctor or other healthcare professional who is experienced in diagnosing and treating these conditions. A doctor will perform a physical exam and may order imaging tests, such as an ultrasound or MRI, to help make a diagnosis.”
The characteristics of lipoedema that Bard proposed were:
- Painful, pitting swelling in the legs and sometimes the arms.
- Asymmetrical swelling.
- A ‘peau d’orange’ or orange-peel appearance to the skin.
- Asymmetrical fat distribution.
The responses were found not to be valid [Table 1]. Lipoedema most frequently affects the legs and is associated with non-pitting oedema. Asymmetrical swelling would point towards lymphoedema.
In terms of treatment, weight loss was proposed as a possible solution for lipoedema, which was also not valid. “There is no evidence that lipoedema leads to weight gain” (Bertsch and Erbacher, 2018). Lipoedema adiposity is resistant to weight loss diets (Wiedner et al, 2020; Keith et al, 2021).
The characteristics of lymphoedema that Bard proposed were:
- Swelling in the arms or legs.
- A feeling of heaviness or tightness in the affected area.
- Redness or warmth in the affected area.
- Pain in the affected area.
- Skin that is shiny or has a ‘woody’ texture.
The responses were found to be valid or potentially valid [Table 2]. Swelling is the main symptom in lymphoedema. It can be accompanied by a feeling of heaviness or tightness in the affected area.
B) ChatGPT
The initial query was entered into ChatGPT. The response suggested that “Based on the information provided, the diagnosis that could be considered in this case is congestive heart failure (CHF)”. The response also advised that a “thorough medical evaluation, including a detailed history, physical examination, and possibly additional tests, would be necessary to confirm the diagnosis”.
From the second query, ChatGPT provided the following possible diagnoses:
- Venous insufficiency.
- Chronic kidney disease.
- Liver disease.
- Hypothyroidism.
- Medications.
- Lymphoedema.
- Certain autoimmune disorders.
Notably, the medication explanation included a broad range of possible medications: “Certain medications, such as calcium channel blockers, nonsteroidal anti-inflammatory drugs (NSAIDs), or hormones, may cause fluid retention and edema”.
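The second-query lists from the two chatbots can be compared label by label. The following is a minimal sketch only, not part of the assessment method used in this case study. It assumes that "chronic venous insufficiency" and "venous insufficiency" refer to the same condition; the names `ALIASES`, `normalise` and `compare` are hypothetical and introduced purely for illustration.

```python
# Second-query differential lists as reported in the Results.
BARD_QUERY_2 = {
    "chronic venous insufficiency",
    "lymphoedema",
    "nephrotic syndrome",
    "myxoedema",
    "oestrogen therapy",
}
CHATGPT_QUERY_2 = {
    "venous insufficiency",
    "chronic kidney disease",
    "liver disease",
    "hypothyroidism",
    "medications",
    "lymphoedema",
    "certain autoimmune disorders",
}

# Assumption: these two labels are treated as the same condition.
ALIASES = {"chronic venous insufficiency": "venous insufficiency"}


def normalise(labels):
    """Map each label to its alias (if any) so equivalent labels match."""
    return {ALIASES.get(label, label) for label in labels}


def compare(first, second):
    """Return the labels shared by both lists and those unique to each."""
    a, b = normalise(first), normalise(second)
    return {
        "shared": sorted(a & b),
        "only_first": sorted(a - b),
        "only_second": sorted(b - a),
    }


result = compare(BARD_QUERY_2, CHATGPT_QUERY_2)
for key, labels in result.items():
    print(f"{key}: {', '.join(labels)}")
```

The sketch matches labels only. It does not judge clinical overlap (for example, between myxoedema and hypothyroidism), which still requires the practitioner's own reasoning.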
The third query produced the following likely causes:
- Cellulitis.
- Deep vein thrombosis (DVT).
- Arthritis.
- Peripheral neuropathy.
- Fibromyalgia.
- Chronic regional pain syndrome (CRPS).
A flow chart explaining how these conditions would be excluded by differential diagnosis is provided [Figure 6].
Conclusion
In this case study, neither Bard nor ChatGPT suggested lipoedema unprompted as a possible diagnosis for a presentation of chronic bilateral leg swelling with pain. Both chatbots listed lymphoedema as a possible diagnosis, but not in response to the initial query. ChatGPT provided more comprehensive responses than Bard. The generative artificial intelligence in this case study did not assist the clinician in differential diagnosis. At this stage, trained lymphoedema practitioners need to rely on their own training and have a high level of understanding of chronic oedemas in order to diagnose their patients correctly. Further research on AI is warranted as the technology continues to evolve and learn from responses.