Mar 1, 2024

From Idea To Crowd: Manipulating Local LLMs At Scale

    Jeongkyu Shin

    Founder / Researcher / CEO

    Joongi Kim

    Co-Founder / CTO

Overview

Large language models (LLMs) are the pinnacle of generative AI. While cloud-based LLMs have enabled mass adoption, local on-premises LLMs are gaining attention for personalization, security, and air-gapped deployments. Both open-source foundation LLMs and fine-tuned models are used in fields ranging from personal hobbies to professional domains. We'll introduce the technology and use cases for fine-tuning and running LLMs at scales ranging from a single PC GPU to data centers serving large numbers of users. We combine resource-saving and model compression techniques like quantization and QLoRA with vLLM and TensorRT-LLM.

We also illustrate how to scale up such generative AI models through a fine-tuning pipeline, using concrete, empirical examples. You'll gain a deep understanding of how to operate and scale personalized LLMs, and inspiration for the possibilities this opens up.
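
As a rough illustration of why quantization matters when moving between PC GPUs and data centers, the sketch below estimates weight memory at different bit widths and round-trips a few values through symmetric (absmax) int8 quantization. It is a minimal sketch in plain Python, not the method used in the talk: the function names are hypothetical, only weight storage is counted (no activations, KV cache, or framework overhead), and real 4-bit schemes such as the one QLoRA uses work differently in detail.

```python
def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in GiB (assumption: no overhead)."""
    return num_params * bits_per_weight / 8 / 2**30


def quantize_absmax_int8(weights):
    """Symmetric (absmax) quantization of a list of floats to integer codes in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole list; guard against an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    # Hypothetical 7-billion-parameter model at three precisions.
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit weights: {weight_memory_gib(7_000_000_000, bits):.2f} GiB")

    # Illustrative values only, not real model weights.
    original = [0.31, -1.2, 0.05, 0.9]
    codes, scale = quantize_absmax_int8(original)
    restored = dequantize(codes, scale)
    print(codes, [round(v, 4) for v in restored])
```

Halving the bit width halves the weight footprint, which is what lets a model that needs data-center memory at 16 bits fit on a consumer GPU at 4 bits, at the cost of a bounded rounding error per weight.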

Headquarters & HPC Lab

8F, 577, Seolleung-ro, Gangnam-gu, Seoul, Republic of Korea

© Lablup Inc. All rights reserved.