This conversation is part of a series of interviews in which JAMA Network editors and expert guests explore issues surrounding the rapidly evolving intersection of artificial intelligence (AI) and medicine.
With the introduction of electronic health record (EHR) systems, patient-clinician communication has become far more direct and frequent. Although convenient for patients, this shift places a greater burden on health care professionals.
But Eleni Linos, MD, DrPH, director of the Stanford Center for Digital Health and a professor of dermatology and epidemiology at Stanford University, thinks AI could be part of the solution to this growing problem.
Linos and colleagues are among the first to study patient satisfaction with AI-generated responses to medical questions, rather than just the quality of the responses and their risk of causing harm. In a study published this past October in JAMA Network Open, the researchers assessed survey respondents’ satisfaction with responses to medical questions real patients had asked in their EHR. Satisfaction was consistently higher for AI responses than for clinician responses.
Linos recently spoke about the findings and the future of patient-clinician messaging in a conversation with Roy Perlis, MD, MSc, editor in chief of JAMA+ AI and director of the Center for Quantitative Health at Massachusetts General Hospital.
This interview has been edited for clarity and length.
Dr Perlis: In your paper, you compared the responses of clinicians and AI to actual patient messages. Can you say a bit more about what you looked at?
Dr Linos: We became very interested in looking at actual patient-doctor communications because that’s where the information on what patients are thinking, feeling, and wondering about really is. One of the most exciting parts of this work was trying to better understand not just how patients perceive their clinician’s responses to questions sent through the record, the messages we’re all familiar with, but also how they feel about those responses depending on whether they were generated by a real clinician, a real doctor, or were AI assisted.
Now, this was super interesting because other studies have shown that AI support for responses to patient messages can really increase the efficiency of a clinical team. Our hope is that it can also reduce physician burnout. So there are many potential advantages. The question we really had was, well, “How do the patients feel about it? What do they want to know, and what is their perspective here?”
Dr Perlis: In this study, you had both survey respondents, who evaluated the responses, and clinical raters, who assessed how good the responses were. What did you find?
Dr Linos: We found that the AI-generated responses were received more positively by patients, and that patients preferred and were more satisfied with the responses generated with the help of AI. Many of us worry about doctors not paying attention to problems or not being there for us, so the fact that patients themselves were more satisfied with these responses was surprising, but in a good way.
Dr Perlis: So the model overperformed [compared with] your expectations?
Dr Linos: Yes, I think so. It definitely gives me hope for the future. If we can create a win-win situation, where doctors get back some time for real in-person communication with their patients and the medical work they’re so good at, and patients get more in-depth, longer, more detailed, more accurate responses to their queries, then this type of technology is helping both the patient and the clinical team, increasing efficiency, and no one loses. That’s the reason this study, and many others in this field, gives me a deep sense of optimism about the potential of gen[erative] AI in health care.
Dr Perlis: You mentioned length. One of the things I thought was interesting was that the patients liked longer responses from the clinicians, but there didn’t seem to be a correlation with the length of the AI responses. Do you have a sense of why that might be?
Dr Linos: The bottom line is that many physician responses are very short, and that probably relates to a lack of time. Clinicians, physicians, and nurses in today’s busy, pressured health care delivery system are pushed to see more patients and to respond to more messages. Time is such a precious resource and a limiting factor that it often doesn’t allow them to write responses as long and as detailed as they would like.
That’s what we saw in these findings too. Essentially, we know that the responses generated by AI tend to be more detailed and longer on average, and that’s another advantage. AI assistants don’t get tired, and they also don’t have the same time limitations that a real human clinician has. I think patients do prefer more in-depth, longer responses to medical questions.
Dr Perlis: What about empathy? Was the AI able to generate sufficiently empathic responses?
Dr Linos: Absolutely. I think empathy is one of the strengths of these models, and one that is constantly improving. These tools will role-model empathy and teach us real humans, both clinicians and people living in our communities, to respond in more empathetic and caring ways.
As a next step, I am very curious about the generation of doctors who are trained with responses from these very empathetic, caring large language models that don’t get frustrated, don’t get angry, and have infinite patience, and what that will mean for how the next generation of medical students, who will become the next generation of doctors, communicate. I think about this a lot for my own kids as well, because there is a potential for this technology to elevate all of our real-life communication too.
But going back to our original study, the AI-drafted messages were rated very highly by patients on empathy.
Dr Perlis: If I’m a med student or a house officer reading this study, what’s the message? What’s the take-home in terms of how to use AI to write those kinds of responses?
Dr Linos: I don’t want anyone to get into trouble by uploading protected health information into regular ChatGPT. So one caveat is that these were anonymized and completely deidentified messages, and we used a version of ChatGPT that is secure and HIPAA [Health Insurance Portability and Accountability Act]–compliant. So one message is to be very aware of the privacy guardrails. For someone who’s on the wards as a practicing doctor, the first important thing is just to understand what resources your health system has right now. Do you have a secure, HIPAA-compliant version of one of these models that you can use?
The second thing is to familiarize yourself with it and learn how to use it. If you do have access to one of these private, secure models, I’d encourage you to use these tools as much as you can, in a way that improves not just the messaging to the patient but also your own time efficiency and mental health, and reduces burnout. Ultimately, my hope is that this technology can be a win-win: deliver better care for patients, make patients feel more comfortable and more heard, but also give those frontline doctors, those interns, those busy residents a little bit more time to do the work they went into this field to do in the first place.
Dr Perlis: I feel like this is where I get to say, “Don’t try this at home.” It does make me wonder, though: how do you think patients are going to react as this increasingly becomes part of the workflow, knowing that the responses they get may have incorporated AI in their generation?
Dr Linos: Well, I think our world is changing. And many patients, understandably, are going directly to these tools and asking their questions. Often, patients cannot get appointments with a real doctor for months. This is not unusual.
I’ll tell you a brief story. Clinically, I’m a dermatologist, and I had a friend reach out to me to say that she needed a mole checked, and she had managed to get an appointment on Friday to see one of my colleagues. I remember thinking, “Huh, how did she get an appointment on Friday? That’s 2 days away.” [I] assumed there was a cancellation. Then she texted me on Friday and said, “Gosh, I’m such an idiot. It’s a year from now, and I came a year early.” I remember just laughing out loud but at the same time wanting to cry, because it is unacceptable and ridiculous that the delays in getting specialty care are so long, even in areas you don’t traditionally think of as underserved or underresourced.
But for any specialty, even for primary care appointments, it’s hard to get the care you need when you need it. These AI tools are available 24/7 and are actually very good. We know from much of the work you’ve published in your journal, too, how accurate they are, how effective they are. I don’t think patients will be surprised that some of these responses have been generated with the support of AI. I think it’s always good to know what has been human generated and what has had the support of technology; I would personally want to know, for something I receive from a doctor, who it was coming from. But I hope many patients find this helpful and are not worried about it.
Dr Perlis: One of the things you’ve mentioned a couple of times is burnout and the potential for this to help in reducing clinician burnout. The worry I’ve heard expressed is, if this makes it more efficient to respond to patient messages, is the expectation going to be that instead of responding to 20 of them a day, people have to respond to 100 a day? In other words, is our workflow just going to adjust to increased volumes, where the doc is now in the position of waving the AI responses through rather than being able to take more time and think about them? Is that something we should be worried about, or do you think there’s a middle ground?
Dr Linos: All of these issues are definitely important to think about. The other challenge or concern I’ve heard is, as AI helps with the easier medical decisions, it’s not just that the volume will get higher; will the difficulty also increase? Will real clinicians be dealing with harder and harder medical decisions in addition to more of them? This is a really important topic to think about: how we decide as a society who benefits and who gains from these efficiencies. Whatever industry you’re in, as these industries become more efficient, we should keep at the forefront the idea that some of that benefit should go back to improving the lives and working conditions of the workers themselves, in this case the doctors. I don’t think we have any set answers on this yet, but I do think it’s a very important question we should be keeping at the forefront.
Dr Perlis: Your research letter makes a strong case for using these kinds of [AI] tools. I was curious as I was reading: there are a lot of allusions to the different models that you used and different prompts. Were there times you got responses back from the AI tools that were just so bad or so strange that you couldn’t use them in the study, where you had to go back and change your prompts or rearrange things, or were they fairly consistently usable?
Dr Linos: In this study we were able to use all of the responses. Now, as you know, the models are changing continuously. My sense is that they are constantly improving and heading in the direction of being more and more useful. Had this study been conducted a year ago, that may not have been the case. But as we move forward, most of the responses I see, even to queries I’ve asked since this article, are usable. Obviously, the prompt’s phrasing and engineering really matter. But I would say that, yes, as time goes by, these models are simply getting better.
Dr Perlis: That is encouraging to hear. My last question for you is, where do you go from here as far as this particular study? Where do you go next in terms of understanding how these sorts of tools work and where they fit into our practice?
Dr Linos: One very interesting next step is, what does this mean for the world? What does this mean for people around the world, in terms of both access to direct care through these [large language model–based] tools and training that improves the education and responses of frontline workers across the world? As I’m sure you think about deeply through your journal and other work, as technology grows, we always want to make sure we are not leaving anyone behind and [we are] thinking actively about who these technology improvements are helping. So for me, one of the interesting next steps is, how does this apply to answering medical questions globally?
That raises a follow-up question, which is relevant, again, across industries: what languages are available for these types of responses right now? Our study was English only, but even in the US, we know there are many people for whom English is not their first language. So how can these models be developed to be just as accurate and helpful in other languages? What does that mean for people living in other countries and for languages outside the top 5 most commonly spoken? I grew up in Greece, and Greece is one of those countries where there are only 10 million people in the whole country. Luckily, most Greeks do also understand English, but there are many other places in the world where we need to think not just about access but also about language.
There are many interesting next steps to be [taken] here relating to this issue of how these models are integrated safely into medical care. As clinicians, as doctors, one of our guiding principles is, first, do no harm. As much as we want to run forward to deliver exceptional care, empathetic responses, accurate responses, all it takes is one mistake, one medical error, to really put the brakes on something like this. And so safety is really, really important.
Developing a process of rating analogous [to] the way we would rate a new technology [or] a new medication, thinking of side effects or complications or negative effects and measuring those, [is important] because there’s no doubt in my mind that these models are going to transform health care very, very quickly. It’s our role as academics to make sure we have those guidelines in place to be able to measure and ensure that the safety of our patients, especially vulnerable patients, is constantly assessed, so we are heading in the right direction. I’ll say I’m optimistic but do see many, many follow-up studies and guidelines that need to be developed further.
Dr Perlis: So you’re optimistic, but cautious. That sounds about right to me.