Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12188/27397
Title: A Comprehensive Analysis of LayoutLM and Donut for Document Classification
Authors: Bajrami, Merxhan; Zdravevski, Eftim; Lameski, Petre; Stojkoska, Biljana
Keywords: document classification, layout analysis, OCR, intelligent document processing
Issue Date: Jul-2023
Publisher: Ss Cyril and Methodius University in Skopje, Faculty of Computer Science and Engineering, Republic of North Macedonia
Series/Report no.: CIIT 2023 papers; 22
Conference: 20th International Conference on Informatics and Information Technologies - CIIT 2023
Abstract: Document classification is important in everyday life because it allows efficient management and organization of vast amounts of digital documents, saving time and resources. The task is essential for businesses, organizations, and individuals who handle large volumes of data and need to quickly retrieve and analyze specific information. AI-based document classification can help organizations better manage and organize their digital assets, improve information retrieval, and make better business decisions based on insights derived from the classified documents. This paper compares the performance of two transformer-based models, LayoutLM and Donut, on image classification tasks using two different datasets. LayoutLM was initialized with pre-trained weights from Microsoft, while Donut used pre-trained weights from Hugging Face. Both models were fine-tuned for up to 100 epochs with early stopping, using the Adam optimizer and cross-entropy loss. Our results show that LayoutLM performs better than Donut on the first dataset, achieving an overall accuracy of 0.88, while Donut achieved an accuracy of 0.74. Our study demonstrates the importance of carefully selecting and evaluating different models for document classification tasks, based on the specific characteristics of the dataset and the task requirements. Additionally, we provide insights into the strengths and weaknesses of both LayoutLM and Donut for document classification on different datasets.
URI: http://hdl.handle.net/20.500.12188/27397
Appears in Collections: Faculty of Computer Science and Engineering: Conference papers
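
The abstract mentions fine-tuning for 100 epochs with early stopping. As a rough illustration only, the stopping rule could be sketched as below. The `EarlyStopping` class, the `train` helper, and the `patience` and `min_delta` values are hypothetical; the record does not state the criterion or settings the authors used, and the validation losses here are supplied by hand rather than produced by a model.

```python
from dataclasses import dataclass


@dataclass
class EarlyStopping:
    """Hypothetical helper: stop when validation loss stops improving."""
    patience: int = 5            # assumed value, not given in the abstract
    min_delta: float = 0.0       # assumed minimum improvement to count
    best_loss: float = float("inf")
    bad_epochs: int = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True to stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def train(val_losses, max_epochs=100, patience=5):
    """Consume per-epoch validation losses for at most max_epochs.

    Returns (epochs actually run, best validation loss seen).
    In a real run each loss would come from evaluating the model.
    """
    stopper = EarlyStopping(patience=patience)
    epochs_run = 0
    for epoch, val_loss in zip(range(max_epochs), val_losses):
        epochs_run = epoch + 1
        if stopper.step(val_loss):
            break
    return epochs_run, stopper.best_loss
```

For example, `train([1.0, 0.8, 0.7, 0.75, 0.72, 0.71], patience=2)` stops after the fifth epoch, because the loss has not improved on 0.7 for two consecutive epochs.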

Files in This Item:
File: CIIT2023_paper_22.pdf
Size: 9.19 MB
Format: Adobe PDF



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.