Mar 1, 2024

From Idea To Crowd: Manipulating Local LLMs At Scale

    신정규

Founding Member / Researcher / CEO

    김준기

Founding Member / CTO

The talk video is available on an external page.

Overview

Large language models (LLMs) are the pinnacle of generative AI. While cloud-based LLMs have driven mass adoption, local, on-premises LLMs are drawing attention for personalization, security, and air-gapped deployments. From personal hobby projects to professional domains, both open-source foundation LLMs and fine-tuned models are used across diverse fields. We introduce the technology and use cases for fine-tuning and running LLMs, from a small scale such as a single PC GPU up to data-center deployments that serve mass user traffic. We combine resource-saving and model-compression techniques such as quantization and QLoRA with serving engines like vLLM and TensorRT-LLM. We also illustrate how such generative AI models scale up through a fine-tuning pipeline, with concrete, empirical examples. You will gain a solid understanding of how to operate and scale personalized LLMs, along with inspiration for the possibilities this opens up.
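To make the combination of techniques mentioned above more concrete, here is a minimal sketch of the two ends of that pipeline: QLoRA-style fine-tuning of a 4-bit quantized base model on a single GPU, followed by batched inference with vLLM. The model name, LoRA rank, target modules, and sampling parameters are illustrative assumptions, not the exact configuration presented in the talk.

```python
# QLoRA sketch: load the frozen base model in 4-bit (NF4) and attach low-rank
# adapters, so only a small fraction of parameters is trained on a PC GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-hf"  # assumed base model for illustration

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# The LoRA adapters are the only trainable weights; rank and target modules
# here are assumptions, tune them for the actual model architecture.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

On the serving side, a hedged sketch of offline batched generation with vLLM, which handles continuous batching of many concurrent requests:

```python
# vLLM sketch: load a model and generate completions for a batch of prompts.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-7b-hf")  # assumed model name
sampling = SamplingParams(temperature=0.7, max_tokens=256)

outputs = llm.generate(["Explain QLoRA in one sentence."], sampling)
for out in outputs:
    print(out.outputs[0].text)
```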
