Research

Academic research, publications, and experimental projects

PolySpeech-HS: Multilingual Non-Autoregressive Text-to-Speech Synthesis with Hidden-State Adapters

Speech Synthesis & Multilingual AI

Abstract

A non-autoregressive multilingual text-to-speech (TTS) synthesis framework designed to address the linguistic diversity and real-time deployment challenges of Indian languages. By pairing a unified encoder-decoder architecture with lightweight hidden-state adapters, PolySpeech-HS enables efficient cross-lingual generalization while preserving language-specific prosodic nuances. The system achieves state-of-the-art performance across six Indian languages, with a mean opinion score (MOS) of 4.30, a mel-cepstral distortion (MCD) of 4.7 dB, and a real-time factor (RTF) of 0.13.

IEEE Transactions on Audio, Speech and Language Processing
2025
Vellore Institute of Technology
TTS
Non-Autoregressive
Hidden-State Adapters
Multilingual AI
Indian Languages
AMO-HSA
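
The hidden-state adapter idea can be illustrated with a minimal sketch. The class name, dimensions, and the residual bottleneck design below are assumptions for illustration, not the paper's exact architecture: a small per-language bottleneck is applied to the shared encoder's hidden states and added back residually, so the shared backbone stays frozen while each language contributes only a few trainable parameters.

```python
import numpy as np

class HiddenStateAdapter:
    """Minimal bottleneck adapter: down-project, non-linearity,
    up-project, residual add. One instance per language keeps the
    shared encoder-decoder untouched."""

    def __init__(self, d_model: int, d_bottleneck: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(d_model)
        self.W_down = rng.normal(0.0, scale, size=(d_model, d_bottleneck))
        self.W_up = rng.normal(0.0, scale, size=(d_bottleneck, d_model))

    def __call__(self, h: np.ndarray) -> np.ndarray:
        # h: (seq_len, d_model) hidden states from the shared encoder
        z = np.maximum(h @ self.W_down, 0.0)   # ReLU bottleneck
        return h + z @ self.W_up               # residual connection

# One lightweight adapter per language; the backbone weights are shared.
adapters = {lang: HiddenStateAdapter(256, 32) for lang in ("hi", "ta", "bn")}
h = np.zeros((10, 256))                        # dummy encoder output
out = adapters["hi"](h)
```

Because the adapter is residual, it can start as a near-identity map and learn only the language-specific correction, which is what makes cross-lingual sharing cheap.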

A Novel Data-Centric Transformer Fine-Tuning: A Modular Framework for Rapid Domain Adaptation and Deployment

Large Language Models & Domain Adaptation

Abstract

This research demonstrates a data-centric, hardware-light workflow for fine-tuning transformers that sidesteps the drawbacks of costly LLM APIs. By automatically scraping high-signal web content and converting it into Q&A pairs, we fine-tune a GPT-2-Medium model (355M parameters) in ≈7 minutes on a single RTX 3060. The resulting assistant achieves 67.3% accuracy (+34% over the base model) with 1.4 s median latency at zero per-call cost.

IEEE Transactions on Computational Social Systems
2025
Vellore Institute of Technology
GPT-2
LoRA
8-bit Adam
Domain Adaptation
Next.js
Q&A Generation
Fine-tuning
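
The scrape-to-Q&A step of the workflow could look roughly like the following sketch. The paragraph-splitting heuristic, the templated question, and the prompt/completion JSONL format are illustrative assumptions, not the paper's actual pipeline:

```python
import json
import re

def paragraphs_to_qa(text: str, topic: str) -> list[dict]:
    """Turn scraped page text into prompt/completion pairs for
    supervised fine-tuning. A templated question per paragraph
    stands in for whatever Q&A generation the real pipeline uses."""
    pairs = []
    for i, para in enumerate(p.strip() for p in re.split(r"\n\s*\n", text)):
        if len(para) < 40:          # drop navigation/boilerplate fragments
            continue
        pairs.append({
            "prompt": f"Q: What does the {topic} documentation say (part {i + 1})?\nA:",
            "completion": " " + para,
        })
    return pairs

scraped = ("First useful paragraph about the domain, long enough to keep.\n\n"
           "ok\n\n"
           "Second useful paragraph with more high-signal content to keep.")
qa = paragraphs_to_qa(scraped, "billing API")
jsonl = "\n".join(json.dumps(p) for p in qa)   # fine-tuning corpus, one pair per line
```

A length filter like the one above is the cheapest form of the "high-signal" curation the abstract mentions; real pipelines would add deduplication and quality scoring on top.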

Fine-Tuning Mistral 22B: The First Large Language Model for Assamese Language Tasks

Low-Resource Language Processing

Abstract

This work presents the first fine-tuned Large Language Model specifically engineered for Assamese, a low-resource Indo-Aryan language spoken by approximately 15 million people. It introduces the AssamText-750K dataset and a custom Unicode mapping system built exclusively for Assamese, making it the only Assamese LLM backed by language-specific Unicode infrastructure. The model achieves a 20% average improvement across text-generation fluency, sentiment-analysis accuracy, and Assamese-to-English translation quality.

IEEE Transactions on Neural Networks and Learning Systems
2025
Vellore Institute of Technology
Mistral 22B
LoRA
Unicode Mapping
Assamese NLP
Low-Resource Languages
AssamText-750K
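
One concrete piece of Assamese-specific Unicode handling is worth illustrating: Assamese is written in the Bengali Unicode block (U+0980–U+09FF) but has its own letters ৰ (RA, U+09F0) and ৱ (WA, U+09F1), so text typed with a Bengali keyboard often carries the Bengali RA র (U+09B0) instead. The sketch below is a hedged illustration of what a normalisation layer in such a mapping system might do; it is not the actual AssamText-750K mapping:

```python
# Assamese shares the Bengali Unicode block (U+0980-U+09FF) but uses
# RA = U+09F0 (Bengali RA is U+09B0) and WA = U+09F1.
# This minimal table rewrites Bengali-typed RA onto the Assamese
# code point so the corpus uses one consistent representation.
ASSAMESE_NORMALISE = str.maketrans({
    "\u09b0": "\u09f0",   # Bengali RA -> Assamese RA
})

def to_assamese(text: str) -> str:
    """Normalise Bengali-script RA to the Assamese code point."""
    return text.translate(ASSAMESE_NORMALISE)

sample = "\u09b0\u09be"           # RA + AA vowel sign, typed as Bengali RA
normalised = to_assamese(sample)  # same syllable on the Assamese code point
```

Consistent code points matter for a low-resource LLM because a tokenizer otherwise learns two disjoint vocabularies for visually near-identical text, fragmenting an already small corpus.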