Comparing a Human’s and a Multi-Agent System’s Thematic Analysis: Assessing Qualitative Coding Consistency

Simon, S.; Sankaranarayanan, S.; Tajik, E.; Borchers, C.; Shahrokhian, B.; Balzan, F.; Celik, B.

Date

2025

Publisher

Springer Science and Business Media Deutschland GmbH

Abstract

Large Language Models (LLMs) have demonstrated fluency in text generation and reasoning tasks. Consequently, the field has probed the ability of LLMs to automate qualitative analysis, including inductive thematic analysis (iTA), previously achieved through human reasoning only. Studies using LLMs for iTA have yielded mixed results so far. LLMs have successfully been used for isolated steps of iTA in hybrid setups. With recent advances in multi-agent systems (MAS) enabling complex reasoning and task execution through multiple, collaborating LLM agents, the first results point towards the possibility of automating sequences of the iTA process. However, previous work especially lacks methodological standards for assessing the reliability and validity of LLM-derived iTA. Thus, in this paper, we propose a method for assessing the quality of iTA systems based on consistency with human coding on a benchmark dataset. We present criteria for benchmark datasets and an expert blind review with this method on two iTA outputs: one iTA conducted by domain experts, and another fully automated with a MAS built on the Claude 3.5 Sonnet LLM. Results indicate a high level of consistency and contribute evidence that complex qualitative analysis methods common in AIED research can be carried out by MAS. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Description

Google, Gates Foundation, Hewlett Packard Enterprise, Eedi, VitalSource, Duolingo English Test, Springer.

Keywords

Agentic LLMs, Claude 3.5 Sonnet, Inductive Analysis, Large Language Models, Multi-Agent Systems, Qualitative Coding, Thematic Analysis

WoS Q

N/A

Scopus Q

Q3

Source

Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume

15879 LNAI

Start Page

60

End Page

73

URI

https://doi.org/10.1007/978-3-031-98420-4_5
https://hdl.handle.net/20.500.14720/28362

Collections

Scopus İndeksli Yayınlar Koleksiyonu
