News
News about Lablup Inc and Backend.AI
Discover 'Backend.AI Continuum' and 'Backend.AI for Personal Supercomputer' at NVIDIA GTC 2025
By Lablup
Greetings from Lablup. We're excited to be a Silver Sponsor at NVIDIA GTC 2025, which runs March 17-21 in San Jose, California.
Lablup (booth #547) is unveiling two innovative products at GTC 2025 based on our AI Infrastructure Operating Platform, Backend.AI. Backend.AI is an accelerated workload hosting platform that maximizes the performance and improves the operability of GPU infrastructure, and it runs in a variety of environments, from the cloud to on-premises and air-gapped deployments. Backend.AI is NVIDIA DGX-Ready software certified, ensuring high compatibility and behavioral stability with the NVIDIA DGX platform.
#1. Backend.AI Continuum
Backend.AI Continuum is a solution that enables organizations using cloud API-based services to continue critical operations even in the case of network failures or service outages. The technology intelligently bridges the gap between the cloud and on-premises environments during normal operation and automatically switches to local resources in the event of a cloud outage, keeping API calls and processing uninterrupted. Backend.AI Continuum is an innovative solution that allows enterprises to have the flexibility of the cloud and the reliability of on-premises at the same time.
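To make the idea concrete, here is a minimal sketch of the failover pattern Continuum is built around: try the cloud endpoint first and fall back to a local replica when it is unreachable. The endpoint URLs and retry policy are illustrative assumptions, not Continuum's actual implementation.

```python
# Hypothetical failover sketch (illustrative only; not Backend.AI Continuum's
# actual code). Try the cloud API first, then fall back to a local replica.
import urllib.request
import urllib.error

ENDPOINTS = [
    "https://cloud.example.com/v1/completions",  # primary: cloud API (assumed URL)
    "http://localhost:8000/v1/completions",      # fallback: local model (assumed URL)
]

def complete(payload: bytes, timeout: float = 5.0) -> bytes:
    last_error: Exception | None = None
    for url in ENDPOINTS:
        req = urllib.request.Request(
            url, data=payload, headers={"Content-Type": "application/json"})
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                return resp.read()
        except (urllib.error.URLError, TimeoutError) as err:
            last_error = err  # endpoint unreachable: try the next one
    raise RuntimeError(f"all endpoints failed: {last_error}")
```

A production system would add health checks and request replay rather than retrying per call, but the routing principle is the same.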
Given the recent increase in global organizations' reliance on cloud services, Backend.AI Continuum is expected to be a key part of our North American expansion. Lablup is currently working on pilot projects with several potential customers in the US leveraging Backend.AI Continuum.
#2. Backend.AI for Personal Supercomputer
Backend.AI for Personal Supercomputer is a lightweight version of Backend.AI that runs on personal supercomputer products such as NVIDIA DGX Spark or edge computing devices based on Jetson Orin™ Nano series modules. It optimizes the core functionality of Backend.AI for use on consumer desktops, delivering high-density generative AI performance in edge AI computing environments. We believe that Backend.AI for Personal Supercomputer will make it easier for individuals interested in AI technology, as well as small and medium-sized enterprises, to adopt and manage AI.
Sessions
Learn about the sessions featuring Lablup's CEO Jeongkyu Shin and CTO Joongi Kim.
Talks & Panels | Universal NIM Acceleration With GPU-Sharing Containers (Presented by Lablup) [S74194]
Dive into the universal world of NIM. NVIDIA NIM envisions a path to multi-modal, multi-agent AI systems using optimized container templates. Lablup will explain how our GPU-native container engine further accelerates NIM to deliver such advanced multi-agent AI systems at low cost and high performance. It exploits a novel fractional GPU-sharing technology to accommodate multiple models with diverse performance bottlenecks on a single GPU, automate resource allocation and model combinations with memory-size estimation techniques, and auto-scale the NIM containers by incorporating inference runtime metrics. All these features are implemented on both air-gapped on-premises clusters and cloud-native setups. On top of that, we've also built a streamlined UI to import, fine-tune, and serve open models in just one click, effectively hiding all these technical details from the end user.
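As a rough illustration of the memory-size estimation idea mentioned above, the sketch below packs model containers onto GPUs by their estimated memory footprints using first-fit-decreasing bin packing. The class and field names are hypothetical, not Lablup's actual API.

```python
# Hypothetical sketch: packing several models onto shared GPUs by estimated
# memory footprint (names and numbers are illustrative assumptions).
from dataclasses import dataclass

@dataclass
class ModelSpec:
    name: str
    est_mem_gib: float  # estimated GPU memory footprint

def pack_models(models: list[ModelSpec], gpu_mem_gib: float) -> list[list[ModelSpec]]:
    """First-fit-decreasing bin packing of models onto GPUs."""
    gpus: list[tuple[float, list[ModelSpec]]] = []  # (free memory, placed models)
    for m in sorted(models, key=lambda m: m.est_mem_gib, reverse=True):
        for i, (free, placed) in enumerate(gpus):
            if m.est_mem_gib <= free:
                gpus[i] = (free - m.est_mem_gib, placed + [m])
                break
        else:  # no existing GPU has room: open a new one
            gpus.append((gpu_mem_gib - m.est_mem_gib, [m]))
    return [placed for _, placed in gpus]

# Example: three models sharing 80 GiB GPUs.
plan = pack_models([ModelSpec("llm", 40.0), ModelSpec("vision", 24.0),
                    ModelSpec("embedder", 12.0)], gpu_mem_gib=80.0)
```

In this example all three models fit on one GPU; fractional sharing then carves out a compute share for each container.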
Theater Talk | Resilient Edge-Cloud Hybrid AI Infrastructure: Orchestrating Multi-Modal Agents in Resource-Constrained Environments (Presented by Lablup) [EXS74187]
Explore new solutions for building resilient AI infrastructure that seamlessly integrates edge and cloud computing through intelligent orchestration. Lablup will demonstrate how an eight-node NVIDIA Jetson Nano cluster serves multiple LLMs (including Gemma 2 2B and Llama 3.2 3B) and supports multi-modal AI agents while maintaining cloud GPU integration. Learn how to orchestrate distributed AI systems that process text, images, and sensor data, enabling robust edge computing with cloud failover capabilities.
See you soon!
Our team is looking forward to connecting with you at GTC 2025. If you're attending, be sure to stop by our booth (#547) to see what's new in the AI market. In addition to live demos of Backend.AI Continuum and Backend.AI for Personal Supercomputer, we will be presenting examples of integrations within the NVIDIA ecosystem and hosting private meetings for customers, individuals, and partners interested in learning more about Lablup.
Earlier this year, Lablup opened a regional office in Silicon Valley and is taking on the challenge of global expansion. We hope GTC 2025 will be a great opportunity to connect with our team and learn more about Lablup and Backend.AI.
About GTC 2025
NVIDIA GTC is the largest technical conference in the AI field. With more than 1,000 sessions, 300+ exhibits, hands-on technical training, networking events, and a keynote from NVIDIA CEO Jensen Huang, it's the best opportunity to join thousands of developers, innovators, and business leaders to explore how AI and accelerated computing are helping solve humanity's complex problems.
18 March 2025
Lablup x Intel Announce Support for Intel® Gaudi® 2 & 3 Platforms in Backend.AI
By Lablup
Seoul — Lablup announces support for Intel® Gaudi® 2 & Intel® Gaudi® 3 AI Accelerators* in Backend.AI at Supercomputing 2024. By adding Intel to the list of AI accelerator vendors already supported by Backend.AI, including NVIDIA, Rebellions, FuriosaAI, AMD, and others, Lablup is able to offer its customers the widest range of AI accelerators and GPUs on the market, making the Backend.AI platform more competitive and giving customers more choice.
*As of November 2024, Backend.AI supports the Intel® Gaudi® 2 AI accelerator.
*Support for Intel® Gaudi® 3 AI Accelerators is planned for the first half of 2025.
Lablup and Intel have worked closely to unlock the power of the Intel® Gaudi® 2 & 3 Platform and make it available in Backend.AI. As a result of this collaboration, we are pleased to announce that Backend.AI now supports the Intel® Gaudi® 2 and Intel® Gaudi® 3 AI Accelerators.
Powerful container orchestration with Sokovan™ by Backend.AI
Sokovan™ is a standalone open-source container orchestrator highly optimized for the latest hardware acceleration technologies used in multi-tenant and multi-node scaling scenarios. With fully customizable job scheduling and node allocation policies, it accommodates a hybrid mix of interactive, batch, and service workloads in a single cluster without compromising AI performance.
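As a sketch of what a customizable scheduling policy can look like, the snippet below defines a minimal pluggable policy interface and one policy that favors interactive jobs; the interface is illustrative, not Sokovan's actual API.

```python
# Hypothetical pluggable scheduling-policy sketch (illustrative only;
# not Sokovan's actual interface).
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    kind: str   # "interactive", "batch", or "service"
    gpus: int

class SchedulingPolicy(ABC):
    @abstractmethod
    def pick(self, queue: list[Job]) -> Job | None:
        """Choose the next job to start, or None to wait."""

class InteractiveFirst(SchedulingPolicy):
    """Prefer interactive jobs so notebook users stay responsive."""
    def pick(self, queue: list[Job]) -> Job | None:
        pool = [j for j in queue if j.kind == "interactive"] or queue
        return min(pool, key=lambda j: j.gpus) if pool else None
```

Swapping in a different `SchedulingPolicy` subclass changes how the cluster trades off interactive latency against batch throughput.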
Get the most out of your AI accelerators and reach your maximum potential.
In complex business environments, dependable AI performance and manageability are key to success. The Intel® Gaudi® 3 AI accelerator, the latest release from Intel, offers powerful AI performance and features. Lablup Backend.AI, a Platform-as-a-Service, offers a wide range of features optimized for enterprise-grade AI environments.
Innovation deeply integrated with Intel® Gaudi® 2 & 3 Platform
Customers who have already adopted Intel® Gaudi® 2 or Intel® Gaudi® 3 AI Accelerators in their business environments, as well as those who will adopt the Intel® Gaudi® 2 & 3 Platform in the future, will benefit from the wide range of features that Lablup Backend.AI supports for Intel® Gaudi®. Check out the following examples made possible by Backend.AI on the Intel® Gaudi® 2 & 3 Platform.
Card-level accelerator allocation
Maximize Intel® Gaudi® 2 & 3 AI Accelerator cluster utilization by allocating to each user exactly the number of accelerators intended. For example, customers can run and train models on their existing preferred platform, then serve on the Intel® Gaudi® 2 & 3 Platform, or vice versa.
External storage allocation
Get the most out of integrated storage solutions in terms of performance. Utilize vendor-specific filesystem acceleration features without user intervention. Backend.AI supports major, widely used platforms such as Dell PowerScale, VAST Data, WEKA, and NetApp.
Multi-scale workloads
Whatever your environment, from single-card AI workloads that run small models to multi-node, multi-card AI workloads that run gigantic models, Backend.AI ensures the best performance. As of November 1, 2024, Backend.AI is ready to run single-card and single-node, multi-card AI workloads. Multi-node, multi-card AI workload support will be finalized this year.
Inference statistics management
Monitor up-to-date, detailed performance metrics provided by your AI framework. Backend.AI makes inference statistics management easy, surfacing information not only from the hardware but also from the software, so that administrators can deep-dive into the metrics.
Rule-based inference replica auto scaling
Let the system optimize resource usage on its own. As user traffic to inference workloads varies, Backend.AI scales replicas based on a combination of hardware and software performance metrics, so administrators do not need to manually adjust resources.
*Currently in development (Targeting Dec. 2024)
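For a sense of how rule-based scaling can work, here is a minimal threshold-driven autoscaler sketch; the metric names, thresholds, and signature are assumptions for illustration, not Backend.AI's actual implementation.

```python
# Hypothetical rule-based replica autoscaler sketch (illustrative only).
from dataclasses import dataclass

@dataclass
class ScalingRule:
    metric: str            # e.g. "gpu_util" or "requests_per_sec" (assumed names)
    scale_up_above: float
    scale_down_below: float

def desired_replicas(current: int, metrics: dict[str, float],
                     rules: list[ScalingRule],
                     min_replicas: int = 1, max_replicas: int = 8) -> int:
    """Return the replica count implied by the first matching rule."""
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is None:
            continue
        if value > rule.scale_up_above:
            return min(current + 1, max_replicas)
        if value < rule.scale_down_below:
            return max(current - 1, min_replicas)
    return current

# Example: scale up when GPU utilization exceeds 85%, down below 30%.
rules = [ScalingRule("gpu_util", scale_up_above=0.85, scale_down_below=0.30)]
print(desired_replicas(2, {"gpu_util": 0.92}, rules))  # -> 3
```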
NUMA-aware resource allocation
Achieve maximum bare-metal performance by eliminating inter-CPU and PCIe bus overheads within a single node when there are multiple CPU sockets with multiple accelerators attached to each socket.
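As background on the technique, the snippet below sketches one way to achieve NUMA locality on Linux: read an accelerator's NUMA node from sysfs and pin the worker process to the CPUs on that node. The PCI address is a placeholder, and this is not Backend.AI's actual allocation code.

```python
# Hypothetical NUMA-pinning sketch (Linux-only; the PCI address is a
# placeholder, not a real device in your system).
import os

def numa_node_of(pci_addr: str) -> int:
    """Read the NUMA node an accelerator is attached to from sysfs."""
    with open(f"/sys/bus/pci/devices/{pci_addr}/numa_node") as f:
        return int(f.read().strip())

def cpus_on_node(node: int) -> list[int]:
    """Parse the CPU list (e.g. '0-15,32-47') local to a NUMA node."""
    with open(f"/sys/devices/system/node/node{node}/cpulist") as f:
        parts = f.read().strip().split(",")
    cpus: list[int] = []
    for part in parts:
        if "-" in part:
            lo, hi = map(int, part.split("-"))
            cpus.extend(range(lo, hi + 1))
        else:
            cpus.append(int(part))
    return cpus

# Pin this process to the CPUs local to the accelerator at 0000:17:00.0,
# avoiding cross-socket memory traffic for host-device transfers.
node = numa_node_of("0000:17:00.0")
if node >= 0:  # sysfs reports -1 on non-NUMA systems
    os.sched_setaffinity(0, cpus_on_node(node))
```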
User-based, Project-based storage quota management
Budget-efficient, easy data-space management by limiting storage quotas per user or per project.
Hugepage memory allocation support
Minimize CPU overhead when using accelerators by using fewer, larger memory pages to reduce address-translation overhead. This support will be finalized this year.
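For reference, here is a minimal Linux example of allocating huge-page-backed memory with Python's mmap module (3.10+); the sizes are illustrative, and the system must have huge pages reserved beforehand.

```python
# Minimal huge-page allocation sketch (Linux-only; requires pre-reserved huge
# pages, e.g. `echo 128 | sudo tee /proc/sys/vm/nr_hugepages`).
import mmap

HUGE_2MIB = 2 * 1024 * 1024
# MAP_HUGETLB requests huge-page backing: fewer, larger pages mean fewer TLB
# entries and less address-translation overhead on hot buffers.
buf = mmap.mmap(-1, 64 * HUGE_2MIB,
                flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS | mmap.MAP_HUGETLB)
buf[:4] = b"test"
buf.close()
```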
... and much more
Lablup continues to work with Intel, expanding the possibilities of Backend.AI. Many more features are still in development and will be announced soon. Elevate what your cluster can do with Backend.AI and Intel® Gaudi® 3 AI Accelerators.
Getting the most out of your Intel® Gaudi® 3 AI Accelerators.
Backend.AI is designed to bring out the maximum performance that Intel® Gaudi® 3 AI Accelerators are capable of. Built on the high-efficiency Intel® Gaudi® platform with proven MLPerf benchmark performance, Intel® Gaudi® 3 AI accelerators are built to handle demanding training and inference. Support AI applications like Large Language Models, Multi-Modal Models, and Enterprise RAG in your data center or in the cloud—from node to mega cluster, all running on the Ethernet infrastructure you likely already own. Whether you need a single accelerator or thousands, Intel® Gaudi® 3 can play a pivotal role in your AI success.
Do these with superior user interface.
Unlike other systems, Backend.AI is designed to let system administrators control their systems as easily as possible. Our consumer-grade user interface lets administrators manage their systems in just a few clicks and keystrokes. Backend.AI WebUI is widely used by our customers, who love what they can do without opening a command-line interface.
Make your Intel® Gaudi® 2 & 3 Platform manageable with Backend.AI and unleash its performance.
We are making AI services efficient, scalable, and accessible to scientists, researchers, DevOps teams, enterprises, and AI enthusiasts. Lablup and Intel are working closely together to enable the success of the generative AI and deep learning-based services that are popular today. With our proven technology, Backend.AI provides hardware-level integration with the Intel® Gaudi® 2 & 3 Platform for the best results.
About Intel® Gaudi® 3 AI accelerator
Intel® Gaudi® 3 AI accelerator is driving improved deep learning price-performance and operational efficiency for training and running state-of-the-art models, from the largest language and multi-modal models to more basic computer vision and NLP models. Designed for efficient scalability—whether in the cloud or in your data center, Intel® Gaudi® 3 AI Accelerators bring the AI industry the choice it needs—now more than ever. To learn more about Intel® Gaudi® 3, visit intel.com/gaudi3
About Lablup Backend.AI
Backend.AI supports a wide range of GPUs and AI accelerators on the market, achieving maximum performance and efficiency, and provides a user interface that makes everything easy. This allows customers to efficiently build, train, and deliver AI models, from the smallest to the largest language models, significantly reducing the cost and complexity of developing and operating services. Backend.AI is the key to unlocking the full potential of Generative AI and Accelerated Computing, transforming your business with cutting-edge technology. To learn more about Backend.AI®, visit backend.ai
Download Whitepaper
Backend.AI & Intel Gaudi AI Accelerators Whitepaper Download
31 October 2024
Meet Lablup at NVIDIA GTC 2024: Pushing the Frontiers of AI Technology
By Lablup
Greetings from Lablup! We are thrilled to announce our participation in the upcoming NVIDIA GTC 2024 conference, taking place from March 18th to 21st in San Jose, California. As a Silver Sponsor, Lablup is gearing up to showcase our cutting-edge AI technologies and products at this premier event, which is making a comeback as an in-person gathering after a five-year hiatus.
About GTC 2024
GTC is the world's largest AI conference, hosted by NVIDIA. With over 300,000 attendees expected to join both online and in-person, this year's event promises an unparalleled opportunity to explore the latest AI tech trends. From the highly anticipated keynote by NVIDIA CEO Jensen Huang to more than 900 sessions, 300+ exhibits, and 20+ technical workshops covering generative AI and beyond, GTC 2024 is set to be a game-changer for anyone interested in the future of AI.
Lablup at GTC 2024
At GTC, Lablup will be running an exhibition booth (#1233) where we will demonstrate Backend.AI Enterprise, the only NVIDIA DGX-Ready software in the APAC region. Backend.AI is an AI infrastructure management platform that maximizes the performance of NVIDIA DGX systems and other GPU infrastructures while enhancing usability.
We will also be introducing FastTrack, our MLOps solution that streamlines and automates the entire development process for generative AI models. Prepare to be amazed by our demo showcasing how FastTrack can automatically fine-tune foundation models for various industries and transform them into chatbots and other practical applications.
Sessions at GTC
Lablup will be presenting two sessions at GTC.
The first session, titled "Idea to Crowd: Manipulating Local LLMs at Scale," will delve into the techniques and use cases for fine-tuning and operating local LLMs across various scales, from personal GPUs to large-scale data centers. We will share how we optimize resource usage through quantization and lightweight techniques, and illustrate the expansion process of personalized LLMs through concrete examples.
Our second session, "Personalized Generative AI," will explore how to effortlessly run and personalize generative AI models on small-scale hardware such as personal GPUs, PCs, or home servers. We will introduce automated methods for operating and fine-tuning generative AI in compact form factors, offering a glimpse into a future where personalized AI assistants become an integral part of our daily lives.
Hope to meet you soon!
We've given you a sneak peek into the exciting technologies and vision Lablup will be presenting at GTC 2024. If you're attending the event in San Jose this March, be sure to visit our booth (#1233) to experience the latest AI tech firsthand and engage with the Lablup team.
For those joining online, our session presentations will provide valuable insights into the present and future of local LLMs and personalized generative AI. Lablup remains committed to pushing the boundaries of AI technology, making it more accessible and user-friendly for businesses and individuals alike.
Don't miss this incredible opportunity to witness the power of AI and its potential to revolutionize our world. Join us at GTC 2024 and let's embark on this exciting journey together. See you there!
15 March 2024