Augmented-Based Indonesian Abstractive Text Summarization using Pre-Trained Model mT5

 

© 2023 by IJETT Journal
Volume 71, Issue 11
Year of Publication: 2023
Authors: Andre Setiawan Wijaya, Abba Suganda Girsang
DOI: 10.14445/22315381/IJETT-V71I11P220

How to Cite?

Andre Setiawan Wijaya, Abba Suganda Girsang, "Augmented-Based Indonesian Abstractive Text Summarization using Pre-Trained Model mT5," International Journal of Engineering Trends and Technology, vol. 71, no. 11, pp. 190-200, 2023. Crossref, https://doi.org/10.14445/22315381/IJETT-V71I11P220

Abstract
Up-to-date information is generated endlessly by online users, yet reading it in full often demands more time than readers can spare; tools such as automatic text summarization are therefore especially important today. Although Indonesian is among the most widely used languages in the world, research on Indonesian abstractive text summarization remains very limited compared with languages such as English and Mandarin. Many pre-trained NLP models capable of generating abstractive summaries in English have recently been developed. Data augmentation for NLP has also attracted considerable interest; prior research shows that augmenting the training set can improve the performance of downstream NLP tasks such as aspect-based sentiment analysis and machine translation. This research therefore augments the Indonesian news dataset Liputan6 using the backtranslation method, uses the augmented data to train and fine-tune mT5, mBART, and IndoBART models for Indonesian abstractive text summarization, and then compares the resulting summaries and ROUGE scores against the same models trained and fine-tuned on the non-augmented Liputan6 dataset. The results show that the models trained on the augmented Liputan6 dataset achieve higher ROUGE-1, ROUGE-2, and ROUGE-L scores.
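
The abstract does not specify which translation system performs the backtranslation, so the following is only a minimal Python sketch of the idea, assuming the Hugging Face transformers library and the public Helsinki-NLP/opus-mt-id-en and Helsinki-NLP/opus-mt-en-id MarianMT checkpoints as illustrative stand-ins, not the authors' actual setup.

# Minimal backtranslation sketch: Indonesian -> English -> Indonesian.
# The translation checkpoints below are assumptions for illustration only.
from transformers import pipeline

id_to_en = pipeline("translation", model="Helsinki-NLP/opus-mt-id-en")
en_to_id = pipeline("translation", model="Helsinki-NLP/opus-mt-en-id")

def backtranslate(text: str) -> str:
    """Round-trip Indonesian text through English to obtain a paraphrase."""
    english = id_to_en(text, max_length=512)[0]["translation_text"]
    return en_to_id(english, max_length=512)[0]["translation_text"]

# Each backtranslated article keeps the original reference summary, so the
# augmented training set pairs (paraphrased article, original summary).
article = "Pemerintah mengumumkan kebijakan baru untuk sektor pendidikan."
print(backtranslate(article))

The round-tripped articles are added alongside the originals, enlarging the training set with lexically varied paraphrases while leaving the gold summaries untouched.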

Keywords
Backtranslation, Data augmentation, Fine-tune pre-trained model, Indonesian abstractive text summarization, Indonesian automatic text summarization.
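
The models are compared on ROUGE-1, ROUGE-2, and ROUGE-L. As a rough sketch of that evaluation, assuming the rouge-score Python package (the paper's exact evaluation pipeline is not specified), the F1 scores for a generated summary against a reference can be computed as follows; the example sentences are made up for illustration.

# Minimal ROUGE-1 / ROUGE-2 / ROUGE-L comparison (pip install rouge-score).
from rouge_score import rouge_scorer

# use_stemmer is left off because the package's stemmer targets English,
# not Indonesian.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=False)

reference = "pemerintah mengumumkan kebijakan pendidikan baru"
generated = "pemerintah umumkan kebijakan baru untuk pendidikan"

scores = scorer.score(reference, generated)  # precision/recall/F1 per metric
for name, s in scores.items():
    print(f"{name}: F1 = {s.fmeasure:.4f}")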
