Over the past six months, the Cloud Native Computing Foundation’s (CNCF) Cloud Native AI (CNAI) Working Group has been at the forefront of integrating artificial intelligence (AI) and machine learning (ML) with cloud-native technologies. As we prepare for KubeCon + CloudNativeCon Europe 2025 in London, it is a good moment to review the working group’s initiatives and the opportunities they offer contributors and enthusiasts.
The group has been making strides in galvanizing the community around the needs of AI/ML workloads in cloud native environments. A significant deliverable was the publication of the “Cloud Native AI” whitepaper in March 2024.
The whitepaper gives an overview of the AI/ML lifecycle and discusses how cloud native technologies can support and improve the workloads in this lifecycle. It also examines existing challenges in the ecosystem, such as efficiently using GPUs, and offers additional insights about potential solutions.
Building on the Cloud Native AI whitepaper, the group has focused on the following initiatives:
- Educational Outreach: To demystify MLOps and AI engineering, the working group has hosted presentations from multiple projects and AI industry experts, guiding both newcomers and seasoned professionals.
- CN AI Landscape and Radar: Mapping the cloud native AI ecosystem, helping practitioners navigate the myriad available tools and open source projects.
- PyTorch and LF AI/Data Foundations Collaboration: The group aims to ensure cohesive progress in the open source ecosystem and avoid duplicated efforts in the AI domain.
- Performance Benchmarking: A Cloud Native AI leaderboard is being established to foster healthy competition and highlight aspects such as top-performing open source AI inference projects.
The group has also remained very active on new fronts. Here are some recent developments and project updates:
- AI Scheduling Challenges Whitepaper: The group is developing a whitepaper discussing best practices for efficiently scheduling AI workloads in cloud native environments.
- AI Security Whitepaper: Aimed at practitioners operating cloud native AI systems, this whitepaper focuses on identifying and mitigating security threats related to those systems.
- Best Practices and Benchmarks for AI Risks Whitepaper: This paper will examine the NIST AI risk management framework and develop best practices for AI risk assessment and compliance for AI workloads, platforms and models.
- HolmesGPT Project: A project that helps users solve alerts faster with an AI Agent and the Model Context Protocol (MCP).
- Llama Stack: An open source framework that standardizes the essential components to simplify AI application development. It establishes best practices throughout the Llama ecosystem.
- KitOps: An open source DevOps tool that packages and versions your AI and machine learning models, datasets, code, and configurations into a reproducible artifact called a ModelKit, ensuring compatibility with existing tools used by your data scientists and developers.
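As a rough illustration of the KitOps workflow described above, the sketch below shows what a minimal ModelKit manifest (a Kitfile) might look like. The specific field names, file paths, and project names here are illustrative assumptions based on the project's documented YAML format, not a verified example.

```yaml
# Hypothetical Kitfile sketch: all names and paths are illustrative assumptions
manifestVersion: "1.0"
package:
  name: sentiment-classifier      # illustrative package name
  version: 0.1.0
model:
  name: sentiment-classifier
  path: ./model.safetensors       # serialized model weights
code:
  - path: ./src                   # training and inference code
datasets:
  - name: training-data
    path: ./data/train.csv        # dataset bundled into the ModelKit
```

Assuming the standard KitOps tooling, packing and pushing this ModelKit would then be done with the `kit` CLI (for example, `kit pack` followed by `kit push` to an OCI-compatible registry), so the resulting artifact is versioned and distributed like any other container image.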
- CN AI Model Format Specification: A specification that aims to provide a standard way to package, distribute and run AI models in a cloud native environment.
- AIBrix: An open source initiative under the vLLM umbrella designed to provide essential building blocks for constructing scalable GenAI inference infrastructure. It delivers a cloud native solution optimized for deploying, managing, and scaling large language model (LLM) inference, explicitly tailored to enterprise needs.
- Event-Driven AI Systems: These systems were introduced to reduce the latency of retrieval-augmented generation (RAG) systems running on Kubernetes, making them more efficient to scale.
- Knative Eventing: The Knative project showcased EventTypes, demonstrating advancements in event-driven AI within cloud native environments.
- KAITO Project: The project has been admitted to the CNCF’s sandbox, highlighting its potential impact on the community.
- Video LLM Summarizer Project: A community-created tool that uses LLMs to summarize CNCF meetings published on the CNCF YouTube channel.
Furthermore, the group has participated in several community events, including AI_DEV 2024 Paris: Cloud Native Artificial Intelligence—Top to Bottom, Secure AI Summit 2024: Toward Zero Trust with AI, and KubeCon + CloudNativeCon North America 2024.
Looking Ahead to KubeCon + CloudNativeCon Europe 2025
There is a lot of excitement around the event, with several sessions featuring Cloud Native AI topics, including cutting-edge presentations and keynotes on Kubernetes, AI/ML, MLOps, platform engineering, and edge computing.
Cloud Native + Kubernetes AI Day
This KubeCon co-located event will showcase sessions on cloud native AI challenges and advancements. Topics include the rise of large language models (LLMs), the development of graph RAGs, and how ethical considerations in AI are reshaping how businesses innovate, scale, and move from development to production.
Other Co-Located Events
- CNCF Maintainer Summit: This event will allow open source maintainers to discuss the CNCF TAG reboot proposal, collaborate, and address common project challenges.
- OpenTofu Day Europe: This event connects the OpenTofu community as teams transition to this open source alternative.
- Security, Observability, and Edge Summits: Covering the latest in cloud native best practices.
Call to Action: Get Involved with the Cloud Native AI Working Group
New community members are always welcome to get involved, participate in ongoing or new initiatives, share ideas, collaborate, and stay current on the latest open source projects and news.
Zoom meetings are held on the second and fourth Thursday of each month from 8 to 9 am PT. We also communicate regularly on the CNCF Slack channel: #wg-artificial-intelligence
Conclusion
The Cloud Native AI Working Group has seen many developments in the last few months. These include new open source project presentations, community event participation, and whitepapers related to cloud native AI. We invite community members to participate; it’s a great time to get involved!