Transcription of Informatics Final Project Seminar Recordings via Speech-to-Text

Trisna Gelar, Aprianti Nanda Sari, Djoko Cahyo Utomo Lieharyani, Fauza Naylassana, Nisrina Wafa Zakiya Hamdani

Abstract


Bandung State Polytechnic has adopted project-oriented problem-based learning as its new educational approach. Project 6, or the Final Project, is the course in which students present their learning results. Documenting seminar events is crucial because the records give students a resource for analyzing seminar outcomes and responding to questions from teachers. However, not all students proactively document or record these activities, and transcription becomes a distinct task when responsibility for it rests with the students. This study aims to develop and evaluate a speech-to-text model based on DeepSpeech for transcribing final-year project seminar presentations, addressing the difficulties posed by spontaneous speech patterns and specialized software engineering terminology. The model is trained and then assessed using Word Error Rate (WER) and Character Error Rate (CER). The study contributes to the development of speech-to-text systems for educational purposes, especially within project-based, student-centered learning. The resulting transcriptions could benefit both students and educators by offering a searchable and analyzable account of seminar presentations and by supporting better feedback.
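To make the two evaluation measures concrete, the following is a minimal sketch of how WER and CER are commonly computed: the edit distance between a reference transcript and the model output, divided by the reference length in words or characters. It is not the authors' evaluation code. The function names, the whitespace tokenization, and the choice to count spaces as characters in CER are assumptions made for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (ref item missing in hyp)
                curr[j - 1] + 1,     # insertion (extra item in hyp)
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edits divided by reference word count."""
    ref_words = reference.split()  # assumption: simple whitespace tokenization
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    # Assumption: spaces are counted as characters; some toolkits strip them.
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical reference/hypothesis pair, not taken from the study's data.
    reference = "the model is trained on seminar audio"
    hypothesis = "the model trained on seminar radio"
    print(f"WER: {wer(reference, hypothesis):.3f}")
    print(f"CER: {cer(reference, hypothesis):.3f}")
```

Both rates can exceed 1.0 when the hypothesis contains many insertions, which is why they are reported as error rates rather than accuracies. In practice, text is usually normalized (case, punctuation) before scoring; that step is omitted here.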


Keywords


Speech To Text; Final Project; DeepSpeech; Seminar Recording






DOI: https://doi.org/10.17509/edsence.v6i2.74660



Copyright (c) 2024 Jurnal Pendidikan Multimedia (Edsence)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Jurnal Pendidikan Multimedia (Edsence) ( p-ISSN:2685-2489 | e-ISSN:2685-2535) published by Universitas Pendidikan Indonesia (UPI)
