Leveraging large language models for thematic analysis: a case study in the charity sector

Abstract

This study explores how large language models (LLMs) can support deductive and inductive thematic coding in real-life contexts, balancing AI-driven efficiency with essential human oversight. Using three datasets from Tearfund, a UK-based Christian charity, we propose a dual-role human–LLM collaborative framework in which the LLM functions as both an initial annotator and a validator. In the deductive phase, GPT-4o and GPT-4o-mini were compared against human coders. GPT-4o achieved substantial agreement in multi-label thematic categorization (κ = 0.61–0.65), while GPT-4o-mini showed moderate agreement (κ = 0.41–0.58). Both models excelled in sentiment analysis (κ = 0.91–0.95) but struggled with evaluating evidence of impact due to contextual complexity (κ ≤ 0.01). GPT-4o-mini exhibited greater output variability and instability than GPT-4o, but benefited more from few-shot learning to mitigate hallucinations. In the inductive phase, GPT-4o demonstrated strong semantic alignment with human-generated themes (cosine similarity = 0.76–0.79), though its tendency toward broad themes required human refinement. Despite their potential to streamline thematic analysis, LLMs also pose limitations and implementation challenges, including inconsistencies in excerpt extraction (precision = 0.41, recall = 0.53) and the trade-off between the time saved in coding and the time required for human validation. To facilitate practical implementation, we provide reusable prompt templates for four stages: context, instructions, data processing, and verification. Our findings underline the indispensable role of human expertise, from prompt engineering and managing hallucinations to final verification, in ensuring accurate and trustworthy AI-assisted analyses. While LLMs can enhance qualitative analysis, their full potential is only realized under skilled human guidance.
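
The agreement figures above are reported as Cohen's kappa (κ), which corrects raw agreement between two annotators for the agreement expected by chance. The following is a minimal sketch of that calculation for a single-label task such as sentiment, not the authors' implementation: the function names and the example labels are hypothetical, and the descriptive bands follow the widely used Landis and Koch convention, which is assumed here because it matches the abstract's wording ("moderate", "substantial"). How the study aggregated κ over multi-label categories is not stated in this record.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items (hypothetical helper)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: computed from each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def agreement_band(kappa):
    """Descriptive band under the Landis and Koch convention (an assumption here)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Made-up labels for illustration only; these are not data from the study.
human = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
llm = ["neg", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "pos", "neg"]
kappa = cohen_kappa(human, llm)
print(f"kappa = {kappa:.2f} ({agreement_band(kappa)})")
```

Under this convention, the reported ranges for GPT-4o's thematic categorization (0.61–0.65) fall in the "substantial" band and the sentiment results (0.91–0.95) in the "almost perfect" band.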

Metadata

Item Type:	Article
Authors/Creators:	Wen, C.; Clough, P. (ORCID: https://orcid.org/0000-0003-1739-175X); Paton, R.; Middleton, R.
Copyright, Publisher and Additional Information:	© 2025 The Authors. This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Keywords:	Large language models (LLMs); Generative AI (GenAI); GPT-4o; Prompt engineering; Thematic analysis
Dates:	Accepted: 3 July 2025; Published (online): 17 August 2025; Published: 17 August 2025
Institution:	The University of Sheffield
Academic Units:	The University of Sheffield > Faculty of Social Sciences (Sheffield) > Information School (Sheffield)
Depositing User:	Symplectic Sheffield
Date Deposited:	02 Sep 2025 08:31
Last Modified:	02 Sep 2025 08:31
Published Version:	https://doi.org/10.1007/s00146-025-02487-4
Status:	Published online
Publisher:	Springer Verlag
Refereed:	Yes
Identification Number:	10.1007/s00146-025-02487-4
Open Archives Initiative ID (OAI ID):	oai:eprints.whiterose.ac.uk:230962

Download

Published Version

Filename: s00146-025-02487-4.pdf

Licence: CC-BY 4.0
