As part of the IndiaAI Mission to build large language models (LLMs) that reflect the country’s linguistic diversity, the Indian IT Ministry has shortlisted several companies, including Gnani.ai.
Ganesh Gopalan, Co-Founder and CEO of Gnani.ai, highlights the need for an indigenous language model tailored to India, explains how it differs from global counterparts by capturing the country’s linguistic diversity, and explores AI’s transformative impact across key sectors.
Any recent progress or milestones achieved by Gnani.ai?
In the past year, there has been much greater adoption of AI in the industry. This tremendous demand is great news for companies like us, which have been deep-tech from birth. For many years it was unfashionable to be in AI, but that has changed. Specific to the IndiaAI Mission, we were selected last Friday to build voice-to-voice models. We are building a foundational model using a new architecture to handle real-time conversations and make them nearly instantaneous.

Typically, AI voice conversations suffer from problems with latency and accuracy, and the emotional context is often lost. When you have multiple models, errors tend to cascade across them. Our model fuses the different elements into a single architecture, allowing the software components to work closely together and thereby reducing latency. It also enables real-time communication and tracks the emotion behind it. Usually, the architectures built around the world are speech-to-text systems, followed by an LLM and a text-to-speech system. We are crunching many of these modules together and encoding the pitch, emotion, and tone behind the conversations, so the output will also depend on the emotions. We are excited about this making a huge difference not only to the industry but also to many government use cases.
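The latency and error-cascading argument against cascaded pipelines can be illustrated with toy numbers. The sketch below is purely illustrative: the per-stage latency and accuracy figures are invented for this example, not Gnani.ai's actual measurements.

```python
# Hypothetical per-stage figures for a cascaded speech-to-text -> LLM ->
# text-to-speech pipeline: (stage name, latency in ms, stage accuracy).
CASCADED = [
    ("speech_to_text", 300, 0.95),
    ("llm", 500, 0.97),
    ("text_to_speech", 250, 0.98),
]

# A fused voice-to-voice model is a single component handling audio in,
# audio out. Its latency figure here is equally hypothetical.
FUSED = [("voice_to_voice", 450, 0.96)]

def pipeline_latency(stages):
    # Stages run sequentially, so per-turn latency is the sum of
    # all stage latencies.
    return sum(latency for _, latency, _ in stages)

def pipeline_accuracy(stages):
    # Errors cascade: a mistake in any stage corrupts the final output,
    # so end-to-end accuracy is roughly the product of stage accuracies.
    accuracy = 1.0
    for _, _, stage_accuracy in stages:
        accuracy *= stage_accuracy
    return accuracy

print(pipeline_latency(CASCADED))             # 1050 ms per turn
print(round(pipeline_accuracy(CASCADED), 3))  # 0.903 end to end
print(pipeline_latency(FUSED))                # 450 ms per turn
```

With these toy numbers, the three-stage cascade more than doubles per-turn latency and loses nearly ten points of end-to-end accuracy relative to its best single stage, which is the motivation the interview gives for fusing the modules into one architecture.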
The indigenous foundational model seems to mirror efforts in the US and China. How will it differ from its global counterparts?
We are building these models to handle India-specific problems. A global model might work for English, and for Hindi to some extent, but it typically fails with other Indian languages, especially low-resource ones such as the languages spoken in the Northeast. The intention is to build something new that works for India’s diverse languages and dialects. We are also doing something foundational with voice-to-voice LLMs while catering to Indian languages and the diversity therein.
How can a government-led project like this boost India’s global standing in AI?
AI will be as important to the world as the emergence of computers or the internet. Whether it is government or enterprise services, AI will be the oil on which things run. For AI to run, you need GPUs, and that has been a key problem for startups and for any company in India. This core resource, without which you cannot train AI, is largely unavailable in India; we have old or sometimes unusable GPU components, and pricing has been prohibitive. We are trying to build AI systems, but often there are not enough GPUs to train them. The government, however, will address both the availability of GPUs and their pricing.
What are some potential use cases for this model across sectors? Will the model be democratised—open for broad access and use?
Any sector involving real-time voice conversations will benefit from this. For example, we did an intervention in Uttar Pradesh to address maternal health issues. The infant mortality rate is a huge embarrassment for the country, and one theory is that access to information can help improve the situation. We built an autonomous voice AI agent that spoke to pregnant mothers in their local dialects and reminded them of vaccinations, among other things. What we learned there inspired this effort: it was not only the information provided, but the emotion of the people speaking. Other benefits include citizen services and access to education. Apart from the obvious enterprise use cases, anything that requires real-time conversations with machines will be affected. This is a great technology challenge because not many companies in the world have done it. This model can be used by anyone in the industry to solve their specific problems. Voice tends to be a natural form of communication, and all of us gravitate towards it.
What data is being used to train the model? How many languages and dialects will it support?
In the initial few years, we are looking at 22 languages, with 14 in the first phase. We have been working in the voice AI space for a long time now. When we first started our company, we collected data across every language, district, industry, and noise environment. For example, I speak Tamil, but I come from Chembur, a small suburb of Mumbai. How I speak Tamil is different from somebody in Chennai, Tirunelveli, or Thanjavur. It is easy to say that a model understands and speaks Tamil with low latency, but does it understand every dialect?

We have collected a few million hours of unedited audio over the years. We have data in every language and are supplementing it with synthetic data for regions or languages that are resource-constrained. There is also a lot of open-source data available, and we are using a combination of all three to build these models. One advantage we bring to the table is the huge amount of data we have collected, especially from when we started the company, and we have only enhanced this over the years.
Start-ups seem to be spearheading AI innovation in India. What advantage do you have over bigger companies?
IT services companies fundamentally focus on services, and that has been their success. I am sure they are doing wonders with AI services and will continue to, because AI talent will be abundant in the country. The problem, however, is that if you are a listed company with huge profits or large shareholders, you may not invest in or take bets on future technology; irrespective of what technology comes in, you will leverage it to grow your existing business. Start-ups tend to bet on the future; trying to discover something new is part of our DNA. The challenge will come when companies like us become established and have our IPOs: we will have to maintain that level of innovation to do the next big thing. Today it is comparatively easier, since we have less to lose and can pursue the innovations we want.
Anurag Dhole is a seasoned journalist and content writer with a passion for delivering timely, accurate, and engaging stories. With over 8 years of experience in digital media, she covers a wide range of topics, from breaking news and politics to business insights and cultural trends. Her writing style blends clarity with depth, aiming to inform and inspire readers in a fast-paced media landscape. When she’s not chasing stories, she’s likely reading investigative features or exploring local cafés for her next writing spot.