A scoping review of how chatbots are evaluated in health research

6 February 2024

What is the research about?

Our study reviewed the methods used to evaluate chatbots (also known as conversational agents) in health research. Chatbots can be challenging to evaluate because they are often complex and unpredictable. We looked at the methods used and the outcomes assessed in previously published studies of chatbots in healthcare. With this information, we made a guide to help researchers understand how best to evaluate health chatbots.

What did the researchers do?

Our team searched five academic databases for publications on chatbots in healthcare. Using specific search terms, we identified all papers published before 13 January 2021 that examined a chatbot in a healthcare setting. We excluded chatbots that limited the interaction to selecting a response from a menu of predefined options. From each publication, we identified the design of the evaluation and the outcome measures assessed. Each paper was assessed for quality using the mHealth Evidence Reporting and Assessment (mERA) checklist.

What you need to know

This research is crucial for healthcare professionals, policymakers, and developers of digital health technologies. It supports healthcare professionals in selecting digital tools that are most suitable for patient care, influences policymakers in establishing standards for digital health technologies, and advises developers on creating chatbots that comprehensively address healthcare requirements. This knowledge is vital for advancing digital health technologies in a way that improves healthcare delivery and outcomes.

What did the researchers find?

We found that a wide variety of methods and outcomes are used to evaluate artificial intelligence chatbots in healthcare. Our analysis led to a structured framework that aligns with the World Health Organization's standards for digital health technologies. This framework provides clear guidelines for assessing the effectiveness, safety, and user satisfaction of health-related chatbots, offering a valuable tool for future research and development in this area.

How can you use this research?

Our research offers guidance for enhancing the development and evaluation of AI chatbots within the healthcare sector, focusing on their safety, effectiveness, and ability to satisfy patient needs. Future research, building upon this framework, is vital for pushing forward digital health innovations, particularly in areas such as engaging users, offering personalised care, and integrating these technologies within healthcare systems to improve patient outcomes and healthcare accessibility. This approach ensures that patients receive care that is both focused on their needs and rigorously evaluated for quality and reliability.

About the researchers

This work, led by Dr Hang Ding, was carried out by the members of RECOVER's Technology-enabled rehabilitation research team.


Ding, H., Simmich, J., Vaezipour, A., Andrews, N., & Russell, T. (2023). Evaluation framework for conversational agents with artificial intelligence in health interventions: a systematic scoping review. Journal of the American Medical Informatics Association, https://doi.org/10.1093/jamia/ocad222


AI Chatbots; Healthcare; Evaluation Framework; Patient Satisfaction; Digital Health Technologies

Contact information, acknowledgements

Joshua Simmich

RECOVER Injury Research Centre, The University of Queensland

Email: j.simmich@uq.edu.au

X: @JSimmich