Comparing a Human’s and a Multi-Agent System’s Thematic Analysis: Assessing Qualitative Coding Consistency

Simon, S.; Sankaranarayanan, S.; Tajik, E.; Borchers, C.; Shahrokhian, B.; Balzan, F.; Celik, B.

dc.authorscopusid	57889613800
dc.authorscopusid	57200651663
dc.authorscopusid	57210696892
dc.authorscopusid	57224723719
dc.authorscopusid	57844040700
dc.authorscopusid	59243988400
dc.authorscopusid	60018634300
dc.contributor.author	Simon, S.
dc.contributor.author	Sankaranarayanan, S.
dc.contributor.author	Tajik, E.
dc.contributor.author	Borchers, C.
dc.contributor.author	Shahrokhian, B.
dc.contributor.author	Balzan, F.
dc.contributor.author	Celik, B.
dc.date.accessioned	2025-09-03T16:38:42Z
dc.date.available	2025-09-03T16:38:42Z
dc.date.issued	2025
dc.department	T.C. Van Yüzüncü Yıl Üniversitesi	en_US
dc.department-temp	[Simon S.] Copenhagen University, Nørregade 10, København, Copenhagen, 1172, Denmark; [Sankaranarayanan S.] Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, 15213, PA, United States; [Tajik E.] Florida State University, 222 S. Copeland Street, Tallahassee, 32306, FL, United States; [Borchers C.] Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, 15213, PA, United States; [Shahrokhian B.] Arizona State University, 1151 S Forest Ave, Tempe, United States; [Balzan F.] University of Bologna, Via Zamboni, 33, BO, Bologna, 40126, Italy; [Strauß S.] Ruhr University Bochum, Universitätsstraße 150, Bochum, 44801, Germany; [Viswanathan S.A.] Arizona State University, 1151 S Forest Ave, Tempe, United States; [Ataş A.H.] Galatasaray University, Çırağan Cd. No:36, Beşiktaş/İstanbul, Ortaköy, 34349, Turkey; [Čarapina M.] Zagreb University of Applied Sciences, Vrbik 8, Zagreb, 10000, Croatia; [Liang L.] University of Sydney, Camperdown NSW 2050, Sydney, Australia; [Celik B.] Van Yuzuncu Yil University, Yüzüncü Yıl Kampüsü, Tuşba/Van, Bardakçı, 65090, Turkey	en_US
dc.description	Google, Gates Foundation, Hewlett Packard Enterprise, Eedi, VitalSource, Duolingo English Test, Springer.	en_US
dc.description.abstract	Large Language Models (LLMs) have demonstrated fluency in text generation and reasoning tasks. Consequently, the field has probed the ability of LLMs to automate qualitative analysis, including inductive thematic analysis (iTA), previously achieved through human reasoning only. Studies using LLMs for iTA have yielded mixed results so far. LLMs have successfully been used for isolated steps of iTA in hybrid setups. With recent advances in multi-agent systems (MAS) enabling complex reasoning and task execution through multiple, collaborating LLM agents, the first results point towards the possibility of automating sequences of the iTA process. However, previous work especially lacks methodological standards for assessing the reliability and validity of LLM-derived iTA. Thus, in this paper, we propose a method for assessing the quality of iTA systems based on consistency with human coding on a benchmark dataset. We present criteria for benchmark datasets and an expert blind review with this method on two iTA outputs: one iTA conducted by domain experts, and another fully automated with a MAS built on the Claude 3.5 Sonnet LLM. Results indicate a high level of consistency and contribute evidence that complex qualitative analysis methods common in AIED research can be carried out by MAS. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.	en_US
dc.identifier.doi	10.1007/978-3-031-98420-4_5
dc.identifier.endpage	73	en_US
dc.identifier.isbn	9783031984198
dc.identifier.issn	0302-9743
dc.identifier.scopus	2-s2.0-105011948858
dc.identifier.scopusquality	Q3
dc.identifier.startpage	60	en_US
dc.identifier.uri	https://doi.org/10.1007/978-3-031-98420-4_5
dc.identifier.uri	https://hdl.handle.net/20.500.14720/28362
dc.identifier.volume	15879 LNAI	en_US
dc.identifier.wosquality	N/A
dc.language.iso	en	en_US
dc.publisher	Springer Science and Business Media Deutschland GmbH	en_US
dc.relation.ispartof	Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)	en_US
dc.relation.publicationcategory	Conference Item - International - Institutional Faculty Member	en_US
dc.rights	info:eu-repo/semantics/closedAccess	en_US
dc.subject	Agentic LLMs	en_US
dc.subject	Claude 3.5 Sonnet	en_US
dc.subject	Inductive Analysis	en_US
dc.subject	Large Language Models	en_US
dc.subject	Multi-Agent Systems	en_US
dc.subject	Qualitative Coding	en_US
dc.subject	Thematic Analysis	en_US
dc.title	Comparing a Human’s and a Multi-Agent System’s Thematic Analysis: Assessing Qualitative Coding Consistency	en_US
dc.type	Conference Object	en_US
dspace.entity.type	Publication
gdc.coar.access	metadata only access
gdc.coar.type	text::conference output

Collections

Scopus Indexed Publications Collection
