Exploring the Potential of Topological Data Analysis for Explainable Large Language Models: A Scoping Review
Date Issued: 2026
Author(s): Sekuloski, Petar; Kitanovski, Dimitar
DOI: 10.5281/zenodo.17100816
Abstract
Large language models (LLMs) have become central to modern artificial intelligence, yet their internal decision-making processes remain difficult to interpret. As interest grows in making these models more transparent and reliable, topological data analysis (TDA) has emerged as a promising mathematical approach for exploring their structure. This scoping review maps the current landscape of research where TDA tools—such as persistent homology and Mapper—are used to examine LLM components like attention patterns, latent representations, and training dynamics. By analyzing topological features across layers and tasks, these methods provide new ways to understand how language models generalize, respond to unfamiliar inputs, and shift under fine-tuning. The review also considers how TDA-based techniques contribute to broader goals in interpretability and robustness, especially in detecting hallucinations, out-of-distribution behavior, and representational collapse. Overall, the findings suggest that TDA offers a rigorous and versatile framework for studying LLMs, helping researchers uncover deeper patterns in how these models learn and reason.
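To make the abstract's reference to persistent homology concrete, here is a minimal, self-contained sketch of the kind of computation it involves: 0-dimensional persistence (cluster births and deaths) over a distance matrix, of the sort one might build from token embeddings or attention-derived distances. This is an illustrative toy using only NumPy, not an implementation of any specific method covered in the review; real analyses would use a library such as Ripser or giotto-tda and higher homology dimensions.

```python
import numpy as np

def h0_persistence(dist):
    """0-dimensional persistent homology (connected components) of a
    point cloud given by a symmetric distance matrix. Returns a list of
    (birth, death) bars; the one infinite bar is omitted."""
    n = dist.shape[0]
    parent = list(range(n))

    def find(x):  # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Sweep edges in increasing order of length (Kruskal-style):
    # a component is "born" at scale 0 and "dies" when it merges.
    edges = sorted((dist[i, j], i, j)
                   for i in range(n) for j in range(i + 1, n))
    bars = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            bars.append((0.0, w))
    return bars

# Toy example: two well-separated clusters standing in for two groups
# of token representations. The single long bar reflects the gap
# between the clusters; the short bars are within-cluster noise.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0, 0.1, (5, 2)),
                 rng.normal(5, 0.1, (5, 2))])
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
bars = h0_persistence(dist)
longest = max(d - b for b, d in bars)
```

In TDA-based interpretability work, summaries of such bars (e.g., their lengths across layers or fine-tuning checkpoints) serve as the "topological features" the abstract describes.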
