[1] Pavitra, A.R.R. et al. (2023) 'A Review on Intelligent Voice Assistant with Multilingual Support using Artificial Intelligence,' Journal of Emerging Technologies and Innovative Research (JETIR), 10(4). https://www.jetir.org/papers/JETIR2304137.pdf.
[2] Dong, Q. et al. (2023) PolyVoice: Language Models for Speech to Speech Translation. https://doi.org/10.48550/arXiv.2306.02982.
[3] ElevenLabs: Free Text to Speech & AI Voice Generator (2024). https://elevenlabs.io/.
[4] AI4Bharat (no date) GitHub - AI4Bharat/IndicWav2Vec: Pretraining, fine-tuning, and evaluation scripts for Indic-Wav2Vec2. https://github.com/AI4Bharat/IndicWav2Vec.
[5] Bark - a Hugging Face Space by Suno (no date). https://huggingface.co/spaces/suno/bark.
[6] Meta (2023) 'Introducing Voicebox: the most versatile AI for speech generation,' Meta, 16 June. https://about.fb.com/news/2023/06/introducing-voicebox-ai-for-speech-generation/.
[7] Dowding, J. et al. (1994) Gemini: A Natural Language System for Spoken-Language Understanding. https://doi.org/10.3115/981574.981582.
[8] Gupta, S.C. (2023) 'Indic Language Stack for voice assistants and conversational AI,' Towards Data Science (Medium), 6 August. https://towardsdatascience.com/vernacular-indic-language-bharat-bhasha-stack-for-conversational-ai-platform-and-voice-assistant-apps-6f8b9b4ad0a5.
[9] Dabre, R. et al. (2022) 'IndicBART: A Pre-trained Model for Indic Natural Language Generation,' Findings of the Association for Computational Linguistics: ACL 2022. https://doi.org/10.18653/v1/2022.findings-acl.145.
[10] Madhani, Y., Khapra, M.M. and Kunchukuttan, A. (2023) Bhasha-Abhijnaanam: Native-script and Romanized Language Identification for 22 Indic languages. https://arxiv.org/abs/2305.15814.
[11] Madhani, Y. et al. (2022) Aksharantar: Open Indic-language Transliteration datasets and models for the Next Billion Users. https://arxiv.org/abs/2205.03018.
[12] Bhogale, K.S. et al. (2023) Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR. https://arxiv.org/abs/2305.15386.
[13] Yadav, H. and Sitaram, S. (2022) A Survey of Multilingual Models for Automatic Speech Recognition. https://arxiv.org/abs/2202.12576.
[14] Rabiyath, S.S. et al. (2024) 'Bashini website and App – an overview,' Journal of Emerging Technologies and Innovative Research (JETIR), 11(1). https://www.jetir.org/papers/JETIR2401141.pdf.
[15] Mhaske, A. et al. (2022) Naamapadam: A Large-Scale Named Entity Annotated Data for Indic Languages. https://arxiv.org/abs/2212.10168.
[16] Create realistic Hindi Text to Speech | ElevenLabs (no date). https://elevenlabs.io/languages/hindi.
[17] Auto Generate Hindi Voiceover Online | Wavel AI (2022). https://wavel.ai/solutions/ai-voice-generator/hindi-voiceover.
[18] Xiong, W. et al. (2018) 'The Microsoft 2017 Conversational Speech Recognition System,' 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5934-5938. https://doi.org/10.1109/ICASSP.2018.8461870.
[19] Canbek, N.G. and Mutlu, M.E. (2016) 'On the track of Artificial Intelligence: Learning with Intelligent Personal Assistants,' Journal of Human Sciences, 13(1), p. 592. https://doi.org/10.14687/ijhs.v13i1.3549.
[20] Best Speech-to-Text APIs in 2024 (no date). https://www.edenai.co/post/best-speech-to-text-apis?referral=red-best-stt-apis-2Let.