Daffodil Software

Success Story

Developing India's First National AI Data Repository: AIKosh

Download PDF

About The Client

IndiaAI, a department under the Ministry of IT, has created a seven-pillar strategy to strengthen the country’s artificial intelligence initiatives. To implement this strategy, IndiaAI works with the National e-Governance Division (NeGD), which serves as the government’s execution arm for major digital projects. One of these pillars is AIKosh - India’s first national AI datasets platform. AIKosh is designed to bring together datasets, models, and resources from government ministries, private organizations, and research institutions, giving users a trusted source of data to develop and train AI solutions.

Country

India

Industry

Government

Services Used

Business Situation

Before AIKosh, datasets in India were siloed across ministries, universities, hospitals, and private organizations. Data engineers had to spend significant time sourcing, cleaning, and validating information before they could begin training AI models. Accessing private datasets was also expensive, and without consistent validation or licensing frameworks, the quality and reliability of data varied greatly. In addition, training required GPUs, which were costly to outsource and not always easily available.

To address these challenges, IndiaAI envisioned a single national repository that would consolidate datasets from government and private contributors into one trusted platform. The goal was to make data affordable, authentic, and compliant with privacy regulations, while also providing built-in tools for experimentation and model training. The platform also needed to be designed for long-term growth, ensuring that more ministries, organizations, and contributors could be onboarded over time.

To implement this vision, NeGD partnered with Daffodil Software to develop AIKosh. Daffodil Software provided the technical expertise required to build a platform capable of AI development and data management at a national level. To achieve this vision, NeGD required a solution that would:

Unify fragmented sources: Aggregate datasets, models, and resources from ministries, universities, hospitals, and private contributors into one searchable repository.

Enforce validation and approvals: Introduce a multi-level review process so only verified and trustworthy datasets were published.

Define usage rights clearly: Attach licenses to each dataset, making rules for access, sharing, and application transparent.

Control data access: Allow contributors to decide how their datasets are shared - whether openly downloadable, available only on request, or kept private for use within their own workspace.

Maintain data quality: Introduce a scoring system based on uniqueness, completeness, and regular updates, encouraging contributors to keep their datasets accurate and valuable.

Provide a sandbox for experimentation: Offer an in-platform GPU-enabled environment where data engineers could test datasets and train AI models directly.

Scale with adoption: Ensure the architecture could seamlessly onboard more ministries, organizations, and private contributors over time.

The Solution: A Unified Repository of 1500+ datasets across domains

The AIKosh platform was built with several key components working together to create a comprehensive AI data collection platform. Each feature was designed to address specific challenges in data accessibility and AI development.

iconUnified Repository of Artefacts

AIKosh serves as a single destination for datasets, AI models, toolkits, and use cases, eliminating the silos that previously existed across ministries and organizations. By offering a central hub, it allows researchers, startups, and enterprises to discover and leverage resources from multiple domains without duplication of effort. The repository is searchable by sector, organization, and usage tags, ensuring ease of discovery and efficient knowledge sharing.

Unified Repository of Artefacts
iconRole-Based Governance and Access Control

The platform introduced structured roles – Explorer, Contributor, Organization Admin, and Platform Admin—each with distinct privileges. Contributors could upload datasets, models, and use cases, while organization admins oversaw approvals and access requests. This model ensured that data sharing remained secure and accountable, while still giving contributors flexibility to decide whether their artefacts were open, request-based, or private.

Access Control
iconMulti-Level Approval Workflow

Every dataset or model published on AIKosh passed through a rigorous multi-step review process. Uploads were first validated by organization admins, followed by platform moderators, and finally the Ministry, before being visible on the platform. This ensured authenticity, compliance, and consistency across all artefacts, building trust for users who rely on verified resources.

Multi-Level Approval Workflow
iconFlexible Data Ingestion

AIKosh supported multiple methods of onboarding datasets, manual uploads, push and pull APIs, secure SFTP transfers, and direct integrations with popular platforms like GitHub, Kaggle, and Hugging Face. This flexibility reduced barriers for contributors, enabling both government ministries and private organizations to seamlessly bring their data into the ecosystem. API keys further allowed programmatic access, enabling developers to integrate datasets directly into their workflows.

Flexible Data Ingestion
iconData Quality and Validation

To encourage reliability, Daffodil implemented automated pipelines that scored datasets on factors such as uniqueness, completeness, frequency of updates, and absence of duplication. Contributors received a star-rating, motivating them to improve data quality over time. 

Data Quality and Validation
iconLicensing and Compliance

Each dataset was published under a clearly defined license, ranging from open-source to restricted or government-specific terms. This provided transparency on how data could be accessed, shared, or reused. Anonymization tools were also included to safeguard sensitive information, ensuring compliance with national data protection policies and ethical AI practices.

Licensing and Compliance
iconIntegrated Notebook Environment

One of AIKosh’s key impactful features was its GPU-enabled notebook, which gave users a ready-to-use sandbox for experimentation. Instead of outsourcing expensive GPUs, researchers and engineers could directly train and test AI models on the platform. This lowered costs and accelerated the development cycle, making AI research more accessible to a wider community.

Integrated Notebook Environment

The Impact

AIKosh has consolidated 1,447 datasets and 217 AI models from 34 organizations across 20+ sectors into a single platform, replacing the fragmented system where data engineers previously searched multiple disconnected government and private sources. The integrated notebook environment eliminated the need for teams to outsource GPU computing for AI model training and testing.

The platform established standardized access controls and automated quality scoring across government ministries and private contributors, enabling cross-sector data sharing that was previously limited by organizational silos. This created the foundation for India’s national AI development infrastructure, supporting broader AI advancement initiatives across the country.

icon

1447

Datasets

icon

217

AI Models

icon

34

Contributing Organizations

Read related case studies

Daffodil helps NeGD to maximize e-governance by unifying 1800+ government services on a single application: UMANG

NeGD advances the Digital India campaign by adding 1800+ government services to its UMANG app.

Poshan Tracker

Daffodil helps NeGD modernize a health tracker for rural child care centers as a part of the Digital India Initiative

Denso

Developing a compliance management software solution for DENSO

Awards & Accolades

Celebrating our awards and achievements in technology innovation, quality engineering, and growth-oriented culture

ZZ Award
Mobile Web Award 2020
WPI Award
Zinnov Zones Award
Economic Times Award
Best Tech Brands 2021
EITSEA Finalist 2019
ZZ Award
Mobile Web Award 2020
WPI Award
Zinnov Zones Award
Economic Times Award
Best Tech Brands 2021
EITSEA Finalist 2019
ZZ Award
Mobile Web Award 2020
WPI Award

Daffodil Unthinkable Software Corp. 2026 - All Rights Reserved