VANI AI: A Multilingual Voice-Enabled Conversational Assistant with Offline Capability for Accessibility and Inclusive Communication

Dode, Prathit; Anand, Prince; Mandpe, Lakshita

DOI: https://doi.org/10.55041/ijcope.v2i5.684

Volume 02, Issue 05

Published on: May 2026

Indore Institute of Science and Technology, Indore MP

Abstract

VANI AI is a multilingual, voice-enabled conversational assistant designed to reduce digital communication barriers across India's diverse linguistic communities. The system combines speech recognition, natural language processing, and text-to-speech synthesis to support real-time interaction in multiple regional Indian languages as well as English. A core feature is offline processing, which keeps the assistant usable in the low-connectivity environments common in rural and semi-urban regions. The design prioritizes accessibility for users with varying levels of digital literacy, including elderly and differently-abled users. This paper presents the architectural design, system modules, experimental evaluation, and comparative analysis of VANI AI, and reports significant improvements in multilingual comprehension, response latency, and user accessibility metrics over existing solutions. Results indicate that VANI AI achieves over 91% speech recognition accuracy across Hindi, Marathi, Bengali, and English in offline mode, positioning it as a viable tool for inclusive digital communication.
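
The three stages named in the abstract (speech recognition, language processing, text-to-speech) and the offline mode can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the class VoicePipeline, the Turn record, the stub functions, the language codes, and the optional online recognizer with fallback are all hypothetical names and design choices introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: ISO 639-1 codes for the four languages named in the abstract.
SUPPORTED_LANGUAGES = {"hi", "mr", "bn", "en"}

Recognizer = Callable[[bytes, str], str]    # (audio, language) -> transcript
Responder = Callable[[str, str], str]       # (transcript, language) -> reply text
Synthesizer = Callable[[str, str], bytes]   # (reply text, language) -> audio


@dataclass
class Turn:
    language: str
    transcript: str
    reply_text: str
    reply_audio: bytes
    used_offline: bool


class VoicePipeline:
    """Hypothetical ASR -> NLP -> TTS pipeline that always has an offline path."""

    def __init__(self, offline_asr: Recognizer, respond: Responder,
                 synthesize: Synthesizer,
                 online_asr: Optional[Recognizer] = None,
                 is_online: Callable[[], bool] = lambda: False):
        self.offline_asr = offline_asr
        self.respond = respond
        self.synthesize = synthesize
        self.online_asr = online_asr
        self.is_online = is_online

    def handle(self, audio: bytes, language: str) -> Turn:
        if language not in SUPPORTED_LANGUAGES:
            raise ValueError(f"unsupported language: {language}")

        transcript = None
        used_offline = True
        # Assumed policy: try an online recognizer only when one is
        # configured and the device reports connectivity.
        if self.online_asr is not None and self.is_online():
            try:
                transcript = self.online_asr(audio, language)
                used_offline = False
            except ConnectionError:
                transcript = None  # fall through to the on-device model

        if transcript is None:
            transcript = self.offline_asr(audio, language)
            used_offline = True

        reply_text = self.respond(transcript, language)
        reply_audio = self.synthesize(reply_text, language)
        return Turn(language, transcript, reply_text, reply_audio, used_offline)


# Stand-in components so the sketch runs without any speech models.
def stub_offline_asr(audio: bytes, language: str) -> str:
    return audio.decode("utf-8")


def stub_respond(transcript: str, language: str) -> str:
    return f"[{language}] You said: {transcript}"


def stub_synthesize(text: str, language: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    pipeline = VoicePipeline(stub_offline_asr, stub_respond, stub_synthesize)
    turn = pipeline.handle("namaste".encode("utf-8"), "hi")
    print(turn.reply_text, turn.used_offline)
```

In this sketch the on-device recognizer is a required argument and the online one is optional, so the assistant still produces a reply when connectivity is absent or drops mid-request.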

Keywords: VANI AI; Multilingual NLP; Voice Interface; Offline Speech Recognition; Accessible AI; Conversational Agent; Indian Languages; Inclusive Technology; Text-to-Speech; Digital Literacy

How to Cite this Paper

Dode, P., Anand, P. & Mandpe, L. (2026). VANI AI: A Multilingual Voice-Enabled Conversational Assistant with Offline Capability for Accessibility and Inclusive Communication. International Journal of Creative and Open Research in Engineering and Management, 02(05). https://doi.org/10.55041/ijcope.v2i5.684

Dode, Prathit, et al. "VANI AI: A Multilingual Voice-Enabled Conversational Assistant with Offline Capability for Accessibility and Inclusive Communication." International Journal of Creative and Open Research in Engineering and Management, vol. 02, no. 05, 2026. https://doi.org/10.55041/ijcope.v2i5.684.

Dode, Prathit, Prince Anand, and Lakshita Mandpe. "VANI AI: A Multilingual Voice-Enabled Conversational Assistant with Offline Capability for Accessibility and Inclusive Communication." International Journal of Creative and Open Research in Engineering and Management 02, no. 05 (2026). https://doi.org/10.55041/ijcope.v2i5.684.

Ethical Compliance & Review Process

• All submissions are screened under plagiarism detection.
• Review follows editorial policy.
• Authors retain copyright.
• Peer Review Type: Double-Blind Peer Review
• Published on: May 22, 2026

This article is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. You are free to share and adapt this work for non-commercial purposes with proper attribution.
