Research
Academic research, publications, and experimental projects
PolySpeech-HS: Multilingual Non-Autoregressive Text-to-Speech Synthesis with Hidden-State Adapters
Speech Synthesis & Multilingual AI
Abstract
PolySpeech-HS is a non-autoregressive, multilingual text-to-speech (TTS) synthesis framework designed to address the linguistic diversity and real-time deployment challenges of Indian languages. A unified encoder-decoder architecture paired with lightweight hidden-state adapters enables efficient cross-lingual generalization while preserving language-specific prosodic nuances. The system achieves state-of-the-art performance, with a MOS of 4.30, an MCD of 4.7 dB, and an RTF of 0.13 across six Indian languages.
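A minimal sketch of how a per-language hidden-state adapter could sit on top of a shared encoder layer, assuming a residual bottleneck design (down-project, nonlinearity, up-project); the class names, dimensions, language codes, and PyTorch layer choices below are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class HiddenStateAdapter(nn.Module):
    """Residual bottleneck adapter over encoder hidden states (illustrative)."""
    def __init__(self, d_model: int = 256, bottleneck: int = 64):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # Only these few parameters specialize per language; the shared
        # encoder underneath can stay frozen during adaptation.
        return hidden + self.up(torch.relu(self.down(self.norm(hidden))))

class MultilingualEncoderBlock(nn.Module):
    """One shared transformer layer plus one lightweight adapter per language."""
    def __init__(self, d_model: int, languages: list[str]):
        super().__init__()
        self.shared = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.adapters = nn.ModuleDict(
            {lang: HiddenStateAdapter(d_model) for lang in languages}
        )

    def forward(self, x: torch.Tensor, lang: str) -> torch.Tensor:
        return self.adapters[lang](self.shared(x))

block = MultilingualEncoderBlock(d_model=256, languages=["as", "bn", "hi"])
x = torch.randn(2, 50, 256)   # (batch, phoneme sequence, features)
out = block(x, lang="hi")     # route through the Hindi adapter
```

Because each adapter adds only a small fraction of the shared model's parameters, this style of design is one plausible way a single model can cover six languages without full per-language retraining.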
Data-Centric Transformer Fine-Tuning: A Modular Framework for Rapid Domain Adaptation and Deployment
Large Language Models & Domain Adaptation
Abstract
This research demonstrates a data-centric, hardware-light workflow for fine-tuning transformers that sidesteps the recurring cost of commercial LLM APIs. By automatically scraping high-signal web content and converting it into Q&A pairs, we fine-tune a GPT-2-Medium model (355M parameters) in ≈7 minutes on a single RTX 3060. The resulting assistant achieves 67.3% accuracy (+34% over the base model) with 1.4 s median latency and zero per-call cost.
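A minimal sketch of the Q&A formatting and fine-tuning stage, assuming the scraper has already produced (question, answer) pairs; the prompt template, dataset class, and hyperparameters below are illustrative assumptions rather than the project's exact pipeline.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, Trainer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained("gpt2-medium")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained("gpt2-medium")  # 355M parameters

# Placeholder pairs; in the real workflow these come from the scraping stage.
pairs = [
    ("What model does the framework fine-tune?", "GPT-2-Medium, 355M parameters."),
    ("What hardware is required?", "A single consumer GPU such as an RTX 3060."),
]
texts = [f"Q: {q}\nA: {a}{tokenizer.eos_token}" for q, a in pairs]

class QADataset(torch.utils.data.Dataset):
    def __init__(self, texts):
        self.enc = tokenizer(texts, truncation=True, max_length=256,
                             padding="max_length")
    def __len__(self):
        return len(self.enc["input_ids"])
    def __getitem__(self, i):
        ids = torch.tensor(self.enc["input_ids"][i])
        mask = torch.tensor(self.enc["attention_mask"][i])
        labels = ids.clone()
        labels[mask == 0] = -100  # exclude padding from the language-model loss
        return {"input_ids": ids, "attention_mask": mask, "labels": labels}

args = TrainingArguments(
    output_dir="qa-gpt2",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    logging_steps=10,  # set fp16=True on a CUDA GPU for faster runs
)
Trainer(model=model, args=args, train_dataset=QADataset(texts)).train()
```

Training only a 355M-parameter model on a small, high-signal Q&A set is what keeps a run of this kind in the minutes range on a single consumer GPU.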
Fine-Tuning Mistral 22B: The First Large Language Model for Assamese Language Tasks
Low-Resource Language Processing
Abstract
We present the first fine-tuned Large Language Model engineered specifically for Assamese, a low-resource Indo-Aryan language spoken by approximately 15 million people. The work introduces the AssamText-750K dataset and a custom Assamese-specific Unicode mapping system, making this the only Assamese LLM backed by language-specific Unicode infrastructure. Fine-tuning yields a 20% average improvement across text generation fluency, sentiment analysis accuracy, and Assamese-to-English translation quality.
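A minimal sketch of what an Assamese-specific Unicode normalization pass might look like, assuming it targets the codepoints where Assamese diverges from Bengali inside the shared Bengali Unicode block (U+0980–U+09FF); the paper's actual mapping system is not reproduced here, and the function below is a hypothetical illustration.

```python
# Assamese uses RA (U+09F0) where Bengali text uses RA (U+09B0); Assamese
# also has a dedicated WA (U+09F1) with no Bengali counterpart. Scraped
# corpora often mix the two conventions, so one plausible preprocessing
# step canonicalizes the divergent codepoints before tokenization.
ASSAMESE_MAP = {
    "\u09B0": "\u09F0",  # Bengali র -> Assamese ৰ
}

def normalize_assamese(text: str) -> str:
    """Map Bengali-convention codepoints to their Assamese forms (illustrative)."""
    return text.translate(str.maketrans(ASSAMESE_MAP))

print(normalize_assamese("\u09AD\u09BE\u09B0\u09A4"))  # ভারত -> ভাৰত ("India")
```

Canonicalizing codepoints before building the tokenizer vocabulary prevents the same Assamese word from splitting into two distinct token sequences, which matters for a language this low-resource.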