Top Stories

  • How to save GPU memory in LLM serving: Principles and operating conditions of KV cache offloading

    By Kyujin Cho, Jinho Heo
    Learn how KV cache offloading works in LLM serving for agentic AI, covering architecture, data movement paths, and when offloading helps or hurts inference performance.

    27 April 2026

  • Building Production RAG Systems: Lessons from Tariff Support

    By Sergey Leksikov
    Over the past year, we have built two production RAG systems addressing completely different tasks. One is HSense, a multi-agent system for Korean customs item classification, and the other is the Backend.AI RAG Assistant, which processes customer support queries based on seven document projects.

    23 April 2026

  • Inside NVIDIA DGX Spark: Is DGX Spark Actually Blackwell?

    By Jeongkyu Shin, Kyujin Cho
    DGX Spark is a desktop AI supercomputer that packs 128GB of unified memory and 1 PFLOP-class Grace Blackwell (GB10) performance into a palm-sized box. However, its internal GPU belongs to the SM12x series, distinct from the data center-grade Blackwell (SM100). For the latest LLM stacks such as GLM-5, which rely heavily on MLA- and DSA-specific kernels, "Blackwell support" alone doesn't guarantee immediate compatibility. This creates a subtle architectural gap requiring separate code management for Hopper, data center Blackwell, and consumer Blackwell. The engineering team examines Spark, which is based on Blackwell but features a slightly different architecture.

    19 February 2026

Contact Us

Headquarters & HPC Lab

KR Office: 8F, 577, Seolleung-ro, Gangnam-gu, Seoul, Republic of Korea
US Office: 3003 N First St, Suite 221, San Jose, CA 95134

© Lablup Inc. All rights reserved.
