Exploring the frontiers of AI, language models, and multilingual technology
Dhrith is our next-generation ASR model that listens beyond words. It understands emotion, rhythm, and code-switched language — capturing not just what is said, but how it's said. Built for India's multilingual reality, Dhrith brings emotional intelligence to speech recognition.
We at Soket AI Labs are thrilled to unveil Pragna-1B, India's first open-source multilingual model, available in four languages: Hindi, Gujarati, Bangla, and English. The model is designed to serve India's rich linguistic diversity and to make AI more inclusive and accessible.
We are pleased to announce the Bhasha SFT dataset, an extensive collection curated by Soket AI Labs for supervised fine-tuning of multilingual large language models (LLMs), with a focus on Indic languages. The dataset features over 13 million instruction-response pairs in four languages: Hindi, Gujarati, Bengali, and English.
Soket AI Labs is pleased to announce the release of the "Bhasha" series, beginning with two datasets: "bhasha-wiki" and "bhasha-wiki-indic". These datasets are built to support the development of AI models attuned to the linguistic and cultural nuances of India, a step toward more diverse linguistic resources in computational linguistics.