An ensemble novel architecture for Bangla Mathematical Entity Recognition (MER) using transformer based learning

2024 | Tanjim Taharat Aurpa, Md Shoaib Ahmed
This paper presents an ensemble architecture for Bangla Mathematical Entity Recognition (MER) using transformer-based learning. The method uses the Bidirectional Encoder Representations from Transformers (BERT) model to recognize mathematical entities in Bangla text. The authors created a new dataset of 13,717 observations for the task, each consisting of a mathematical statement, an entity, and the entity's type. The ensemble combines two BERT models that receive different input sequences built from the Bangla mathematical statements and entities. The ensemble BERT reaches an accuracy of 99.76%, compared with 97.98% for a single BERT model, so the ensemble improves on the single model. The study also applies the method to other transformer models, such as ELECTRA and XLNet, and finds that the ensemble technique improves accuracy there as well.
The research highlights the importance of mathematical entity recognition in Bangla for applications in education, research, and data processing. The method and dataset contribute to Bangla Natural Language Processing (NLP) and provide a resource for further research on mathematical entity recognition. The study concludes that an ensemble of BERT models is a promising approach for Bangla MER. Future work may focus on expanding the dataset, supporting multiple languages, and handling complex mathematical expressions.
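The idea of pairing two models with different input sequences can be illustrated with a minimal sketch. The summary does not say how the two inputs differ or how the two models' outputs are merged, so everything here is an assumption made for illustration: the two orderings (statement first versus entity first), the averaging of class probabilities (soft voting), the label names in `LABELS`, and the helper names `build_inputs` and `ensemble_predict`. The logits are stand-in numbers, not outputs of a trained BERT model.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical entity types; the source only says each observation has a "type".
LABELS = ["number", "operator", "variable"]


def build_inputs(statement: str, entity: str) -> Tuple[str, str]:
    """Build one input sequence per ensemble member.

    Assumption: the two models differ only in the order of the
    statement and entity segments.
    """
    seq_a = f"[CLS] {statement} [SEP] {entity} [SEP]"
    seq_b = f"[CLS] {entity} [SEP] {statement} [SEP]"
    return seq_a, seq_b


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(
    logits_a: Sequence[float], logits_b: Sequence[float]
) -> Tuple[str, List[float]]:
    """Average the two models' class probabilities and pick the top class.

    Soft voting is an assumed combination rule, not one confirmed by the source.
    """
    probs_a = softmax(logits_a)
    probs_b = softmax(logits_b)
    avg = [(pa + pb) / 2 for pa, pb in zip(probs_a, probs_b)]
    best = max(range(len(avg)), key=avg.__getitem__)
    return LABELS[best], avg


# Placeholder statement and entity (ASCII stand-ins for Bangla text).
seq_a, seq_b = build_inputs("x + 5 = 10", "5")

# Stand-in logits that the two fine-tuned models might emit for seq_a and seq_b.
label, avg_probs = ensemble_predict([2.0, 0.5, 0.1], [1.0, 1.2, 0.0])
print(seq_a)
print(seq_b)
print(label, [round(p, 3) for p in avg_probs])
```

In this toy case the second model on its own would favor `operator`, but the averaged probabilities favor `number`, which shows how a second view of the same statement and entity can change the final decision. In a real setup the logits would come from two separately fine-tuned classifiers.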