Care coordination has always been one of the core problems in healthcare. A patient’s story is scattered across a handwritten prescription, a photographed lab report, an ambient consultation, an insurance claim, and a pharmacy receipt. Turning that scatter into timely, safe, connected care is what agentic AI is finally beginning to do at scale: not a chatbot layered on top of the system, but reasoning stitched through every step of the journey.
At EkaCare, we sit at that stitching point. As India’s fastest-growing connected healthcare platform, we serve millions of consumers managing their longitudinal health records, tens of thousands of doctors documenting care in clinics across the country, and tens of hospitals and clinic chains running their operations on our stack. Our job is to take the messy, multimodal reality of Indian healthcare and turn it into structured, interoperable, trustworthy clinical context, fast enough to be useful within a consultation that lasts only a few minutes.
We have been building that future on NVIDIA’s enterprise AI stack from the ground up: NVIDIA NeMo Curator, NVIDIA Nemotron pre-training data, and NVIDIA GPUs. This post is an early look at why we believe the next inflection is happening now, and why NVIDIA’s newly announced Nemotron 3 Nano Omni model, which we have had the privilege of evaluating ahead of its public release, is one of the most important pieces.
EkaCare is a full-stack healthcare platform, not a single product. Four pillars come together to simplify patient care:
• Eka EMR - a modern electronic medical record for clinics and hospitals, designed around the realities of Indian outpatient workflows.
• Eka Scribe - an ambient AI scribe that listens to the doctor–patient consultation and produces a structured, editable clinical note, so physicians look at patients, not screens.
• Eka PHR - a personal health record used by millions of Indians to consolidate prescriptions, labs, vaccinations, and discharge summaries into a single, portable, ABDM-compliant longitudinal record.
• Eka Platform - the developer and integrator rails that expose our AI, FHIR, and ABDM capabilities to the broader Indian health-tech ecosystem.
Together, these pillars are increasingly powered by agentic AI: agents that can interact with a patient before they walk into the clinic, agents that structure a doctor’s ambient dictation into a FHIR-ready note, agents that reason over a patient’s history to flag a missed follow-up or an adverse drug interaction, and agents that help a care team close the loop after the visit. Simplifying patient care, for us, is increasingly about orchestrating these agents well, and that orchestration is only as good as the multimodal model underneath it.
In Indian healthcare, almost nothing arrives as clean structured information. A single patient encounter can include a handwritten prescription on a notepad, a photocopied stack of lab reports, the ambient audio of a fast-paced Hindi–English consultation, or an old discharge summary from another hospital. Each of these has historically required its own pipeline: complex OCR and vision models for scanned documents, Automatic Speech Recognition (ASR) for the audio, and separate Large Language Models (LLMs) to stitch the resulting text back into clinical meaning.
That fragmented architecture comes with a steep tax. Every additional model is another set of GPUs to provision, another failure mode to monitor, and another boundary where context is lost. The real challenge of multimodal healthcare AI is not training any single modality well; it is making the modalities reason together. A patient’s spoken “sugar is up this month” only becomes safe clinical context when the model also sees the HbA1c on the lab report and the metformin on the prescription.
When you strip away the input modality, the cognitive task we are asking the AI to perform is almost identical across the board.
Whether the model is looking at a photograph of a degraded discharge summary, parsing a dense lab report, or listening to a code-mixed patient consultation, the objective is the same: extract the clinical truth and structure it into a standardised information architecture. Ultimately, all these paths must converge into a valid, interoperable FHIR document that downstream agents, be it clinical, administrative, or patient-facing can reason over.
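To make that convergence concrete, here is a minimal sketch of what a single structural target buys: one function that wraps any extracted clinical fact as a FHIR R4-shaped Observation, regardless of whether the fact came from audio, a scan, or text. The function name and the specific clinical values are our own illustration, not an EkaCare or NVIDIA API.

```python
# A minimal sketch of the converged target: whatever the input modality,
# the pipeline emits the same FHIR-shaped structure. The helper name and
# the clinical values below are illustrative, not a real API.

def to_fhir_observation(code, display, value, unit, system="http://loinc.org"):
    """Wrap one extracted clinical fact as a FHIR R4 Observation dict."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"coding": [{"system": system, "code": code, "display": display}]},
        "valueQuantity": {"value": value, "unit": unit},
    }

# The same structural target serves every modality once extraction is done:
from_audio = to_fhir_observation("4548-4", "Hemoglobin A1c", 8.2, "%")
from_scan = to_fhir_observation("4548-4", "Hemoglobin A1c", 8.2, "%")
assert from_audio == from_scan  # one schema, many input paths
```

Because every path lands on the same schema, downstream agents never need to know which modality a fact originated from.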
If the structural output is always the same, why should we keep maintaining isolated models that cannot share context?
NVIDIA’s newly announced Nemotron 3 Nano Omni is a unified multimodal model that natively ingests video, text, images, and audio, and reasons across all of them in a single forward pass. It is one of the most efficient and performant open multimodal models available today, and ideal for powering sub-agents that must quickly perceive and act on rich enterprise data in parallel.
Three properties make it especially compelling for healthcare at India’s scale:
• True any-to-any multimodality. Video, text, images, and audio as first-class inputs, with cross-modal reasoning: exactly the substrate that multimodal clinical agents need.
• Efficiency-first design. A hybrid Mixture-of-Experts architecture (30B-A3B): a large total parameter count, but only about 3B parameters active per forward pass, meaning we get heavyweight reasoning at lightweight inference cost, which is critical for deploying into public-sector and small-clinic environments.
• Openness. The model weights are open, so we can fine-tune, distil, and deploy the model inside our own stack, which is critical for clinical data in India. NVIDIA also releases the datasets and training techniques, giving us full transparency and control.
Indian healthcare requires systems that are deeply knowledgeable but ruthlessly efficient. Whether we are powering Digital Public Goods or running the stack for a small-town hospital, inference cost and deployment latency dictate a project’s viability as much as accuracy does.
This is exactly where Nemotron 3 Nano Omni’s Mixture-of-Experts architecture could shine. A large total parameter capacity gives the model the headroom to ingest complex clinical ontologies and maintain high-level reasoning across long, messy patient contexts. At inference time, only a small fraction of those parameters activate per forward pass, so we get the reasoning depth and cross-modal learning of a heavyweight model at the speed and compute cost of a lightweight one.
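A toy sketch makes the ratio concrete. In a Mixture-of-Experts layer, a small router picks a few experts per token, so compute scales with the active parameters, not the total capacity. The sizes below are illustrative, not Nemotron’s internals.

```python
# Toy MoE layer: compute touches only top_k of n_experts weight matrices,
# so cost scales with active parameters, not total capacity.
# All sizes here are illustrative, not Nemotron 3 Nano Omni's actual config.
import numpy as np

rng = np.random.default_rng(0)
n_experts, top_k, d = 8, 2, 16
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
router_w = rng.standard_normal((d, n_experts))

def moe_layer(x):
    """Route a token vector to its top-k experts and mix their outputs."""
    logits = x @ router_w
    top = np.argsort(logits)[-top_k:]                    # chosen experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_layer(rng.standard_normal(d))
# Per token, only top_k / n_experts of the expert weights were used,
# even though the full set remains available for other tokens.
```

In this toy configuration each token pays for 2 of 8 experts, the same kind of active-to-total ratio that lets a 30B-A3B model run at roughly 3B-parameter inference cost.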
For a platform serving millions of patients across thousands of clinics, that ratio is not a technical detail. It is the difference between a demo and a deployment.
With early access to the model, we ran Nemotron 3 Nano Omni against two of the most frequent multimodal workflows on the EkaCare platform:
• Prescription parsing and transcription structuring. The transcription of ambient audio from a fast-paced code-mixed consultation, or an image of a clinical note or prescription photographed under realistic clinic lighting, is converted into a FHIR resource with discrete entities as output.
• Lab report parsing. Multi-page, multi-lab, multi-format lab reports, often low-contrast scans or phone captures, parsed into structured DiagnosticReport resources with LOINC-mapped observations and reference ranges.
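As a hypothetical illustration of the second workflow’s target shape, the sketch below builds a FHIR R4 DiagnosticReport containing LOINC-mapped observations with reference ranges, then flags out-of-range values the way a downstream agent might. The helper, the patient values, and the flagging logic are all invented for illustration.

```python
# Hypothetical parsed lab report as a FHIR R4 DiagnosticReport with
# LOINC-mapped observations and reference ranges. All patient values
# and the helper function are invented for illustration.

def lab_observation(loinc, display, value, unit, low, high):
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"coding": [{"system": "http://loinc.org",
                             "code": loinc, "display": display}]},
        "valueQuantity": {"value": value, "unit": unit},
        "referenceRange": [{"low": {"value": low, "unit": unit},
                            "high": {"value": high, "unit": unit}}],
    }

report = {
    "resourceType": "DiagnosticReport",
    "status": "final",
    "code": {"text": "Complete Blood Count"},
    "contained": [
        lab_observation("718-7", "Hemoglobin", 11.2, "g/dL", 13.0, 17.0),
        lab_observation("777-3", "Platelets", 210, "10*3/uL", 150, 400),
    ],
}

# A downstream agent can flag out-of-range observations mechanically:
flags = [o["code"]["coding"][0]["display"] for o in report["contained"]
         if not (o["referenceRange"][0]["low"]["value"]
                 <= o["valueQuantity"]["value"]
                 <= o["referenceRange"][0]["high"]["value"])]
assert flags == ["Hemoglobin"]
```

Once a messy phone capture has been parsed into this shape, “what needs attention” becomes a trivial structural query rather than another perception problem.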
In both workflows, the out-of-the-box Nemotron 3 Nano Omni model, strictly zero-shot, with no domain-specific fine-tuning, already achieves higher structural-parsing accuracy than the open base models we originally used to build and ship the Parrotlet series. This suggests that, with fine-tuning analogous to what we applied to the Parrotlet models, Nemotron 3 Nano Omni could match or exceed their accuracy while consolidating the pipeline into a single model.
The Parrotlet series required deep, deliberate fine-tuning to handle the nuances of Indian healthcare settings. Seeing an untuned, multimodal baseline perform well on those foundational structural-parsing tasks strongly supports the hypothesis that unified architectures inherently possess stronger reasoning and data-abstraction capabilities, because modalities cross-pollinate during training. The clinical reasoning the model develops by parsing dense laboratory tables directly improves its ability to structure a doctor’s spoken diagnosis, and vice versa.
A closed model, however capable, cannot carry Indian healthcare on its back. We need the ability to fine-tune on sovereign clinical datasets, to distil variants for on-device deployment, and to co-evolve the model with our own agentic workflows. Nemotron 3 Nano Omni being open lets us do exactly that.
Building on our evaluation, we plan to push Nemotron 3 Nano Omni further in two high-leverage directions:
• Agentic clinical workflows. Fine-tuning Nemotron 3 Nano Omni to be the reasoning core of multi-step clinical agents: pre-consult intake, follow-up coordination, care-plan adherence, and safety checks, so that the same model that parses a lab report can also decide what to do about it.
• Vernacular ambient scribe. Fine-tuning Nemotron 3 Nano Omni on Indian code-mixed consultation audio across Hindi, Tamil, Telugu, Bengali, and more, to extend our Eka Scribe capabilities.
Both directions depend on the same underlying property: a single, open, efficient multimodal brain that we can shape to Indian clinical reality.
Agentic AI is simplifying patient care in ways that felt aspirational even eighteen months ago, and it is doing so most credibly where multimodal reasoning, efficient inference, and open ecosystems converge. Our bet at EkaCare is that India is the most interesting place in the world to build this, because nowhere else does the combination of scale, modality diversity, and linguistic complexity force the technology to grow so fast.
NVIDIA’s Nemotron 3 Nano Omni is the kind of foundation that makes this bet cheaper. We are excited to keep benchmarking it across our most demanding EkaCare workflows, to share our findings with the community as the model becomes publicly available, and to continue building agentic, multimodal, sovereign AI for Indian patients.
EkaCare is India’s fastest-growing connected healthcare platform, building the digital backbone for doctors, hospitals, and patients through Eka EMR, Eka Scribe, Eka PHR, and Eka DevPortal. EkaCare is used by millions of consumers managing their longitudinal health records, tens of thousands of doctors, and tens of hospitals and clinic chains across the country.
Note: Nemotron 3 Nano Omni has not yet been publicly released at the time of writing. The observations in this post are from our early-access evaluation and will be updated as we continue benchmarking on production EkaCare workloads.