
Background


While operating a vLLM server, I heard that an RCE vulnerability had been reported in versions below 0.14.0, so I upgraded to a patched version.
While doing that, I came across a write-up on an earlier vulnerability saying that RCE could be triggered just by loading a model, and I started digging into how that is even possible.
It turns out that with some distribution formats and frameworks, loading a model does not only read weights; it can also run Python code. I'm writing this post to organize what I learned.

 

 

CVE-2025-66448: vLLM Config Trust Bypass RCE | Miggo

The vulnerability lies in the __init__ method of the Nemotron_Nano_VL_Config class, located in the now-removed file vllm/transformers_utils/configs/nemotron_vl.py. The commit ffb08379d8870a1a81ba82b72797f196838d0c86 addresses the vulnerability by completel

www.miggo.io

 

Model distribution formats

When developing AI models, the point where more problems show up than in training itself is deployment. A trained model is not just code; it carries weight data ranging from hundreds of MB to tens of GB together with an execution structure. The question of what form to store and hand over a model in is the starting point of model distribution formats.

In the early days, models were used only inside the framework they were trained in, so in-memory objects were simply serialized as they were. As models grew and collaboration and external sharing increased, new requirements naturally appeared. The biggest one is that a model must load identically in a different environment. Building and training a model is not much of a problem unless you are setting up the whole pipeline, but portability matters a lot for inference. That led to exporting just the model file, and several model distribution formats emerged to meet these requirements.

PyTorch .pt / .pth

PyTorch saves a model by turning the Python object itself into a byte stream, which is called serialization. Like the other formats, this solves the requirement of reproducing a model, so it is convenient at the research level. Internally, however, it uses pickle, and a pickle stream can tell the loader which callables to invoke while objects are being rebuilt. Loading an untrusted file can therefore lead to RCE, which is a critical problem.

The official Python documentation describes pickle as a module for serializing and deserializing Python objects, and it warns that unpickling untrusted data can execute arbitrary code. The other places where pickle is used would be worth looking into some other time.

pickle — Python object serialization

For this reason, it is best to avoid pickle-based PyTorch model files in deployment environments.
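If a pickle-based file has to be loaded anyway, one mitigation described in the Python pickle documentation is to restrict which globals the unpickler is allowed to resolve. The following is a minimal sketch of that idea using only the standard library, not PyTorch's actual loader; the allow-list contents, the `restricted_loads` helper, and the `Payload` class are illustrative assumptions.

```python
import io
import os
import pickle
from collections import OrderedDict

# Assumption for this sketch: only OrderedDict is needed to rebuild the data.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks for a global (class or function).
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Hypothetical helper: unpickle bytes with the allow-list applied."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Payload:
    """Harmless stand-in for a malicious object: asks the loader to call os.getcwd()."""

    def __reduce__(self):
        return (os.getcwd, ())


# Plain weight-like data loads normally.
weights = OrderedDict(layer1=[0.1, 0.2], layer2=[0.3])
safe_result = restricted_loads(pickle.dumps(weights))

# A stream that references any other global is rejected before anything is called.
blocked = False
reason = ""
try:
    restricted_loads(pickle.dumps(Payload()))
except pickle.UnpicklingError as exc:
    blocked = True
    reason = str(exc)

print(safe_result == weights, blocked, reason)
```

An allow-list like this only narrows the attack surface; it is still safer to prefer a format that stores no callables at all.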

PyTorch serializes a model as shown below when saving it; it is also possible to save only the parameters.

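import torch

# Serialize the whole model object as-is
torch.save(model, 'model.pth')
# Loading a whole pickled object needs weights_only=False on PyTorch 2.6+;
# only do this with files you trust
model = torch.load('model.pth', weights_only=False)

# Serialize only the model parameters
torch.save(model.state_dict(), 'model.pth')
# weights_only=True restricts unpickling to tensors and basic containers
model.load_state_dict(torch.load('model.pth', weights_only=True))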

Possible vulnerabilities

import torch.nn as nn
import torch.nn.functional as F

# Define model
class TheModelClass(nn.Module):
    def __init__(self):
        super(TheModelClass, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# Initialize model
model = TheModelClass()

With a model like the one above, torch.save(model, ...) pickles the TheModelClass instance. Strictly speaking, pickle does not store the class's source code; it stores a reference to the class, the instance state, and instructions for rebuilding the object. Whoever crafts the file can make those instructions call an arbitrary function (for example through __reduce__), and that call runs at torch.load() time. This is the problem with saving and loading the whole object instead of model.state_dict(). PyTorch therefore recommends saving only the parameters with torch.save(model.state_dict(), 'model.pth'). Note that a state_dict file is still a pickle, so it should be loaded with torch.load(..., weights_only=True), which is the default from PyTorch 2.6.
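The mechanism can be reproduced with the standard pickle module alone. The sketch below is an illustration under stated assumptions, not an actual exploit: the `NotReallyAModel` class is hypothetical, and the harmless `os.getcwd` stands in for whatever function an attacker would choose.

```python
import os
import pickle


class NotReallyAModel:
    """Hypothetical stand-in for an object inside a tampered checkpoint."""

    def __reduce__(self):
        # pickle records "call os.getcwd() with no arguments" in the stream.
        # An attacker would name a dangerous callable here instead.
        return (os.getcwd, ())


blob = pickle.dumps(NotReallyAModel())

# Loading performs the recorded call; no NotReallyAModel instance comes back.
loaded = pickle.loads(blob)

print(type(loaded).__name__)          # str
print(loaded == os.getcwd())          # True
print(b"NotReallyAModel" in blob)     # False: the class itself is not in the stream
```

The last line is the point worth noticing: the stream carries a function reference and its arguments, so the loader never needs the original class to run the call.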

Hugging Face .safetensors

safetensors is a format for saving and loading weights quickly. It does not contain the things that cause vulnerabilities in other formats, in particular the stored Python objects and executable structures that come with pickle. A safetensors file consists of a header and a data block.

The header is JSON metadata describing the tensors, and the data block is the raw binary that holds the weights. You can open an actual safetensors file and check this yourself:

https://huggingface.co/Qwen/Qwen3-ASR-1.7B/tree/main

 

Qwen/Qwen3-ASR-1.7B at main

We're on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

 

The second safetensors file in this repository is the smallest one I found, so it is a good one to test with.

from safetensors import safe_open

# Placeholder path: point this at the downloaded .safetensors file
safetensors_file = "model.safetensors"
with safe_open(safetensors_file, framework="pt") as f:
  tensor_names = list(f.keys())
  print(f"tensor list {tensor_names}")

  for key in tensor_names:
    tensor = f.get_tensor(key)
    print(f"tensor name {key} dtype : {tensor.dtype}")
    print(f"tensor name {key} shape : {tensor.shape}")

tensor list ['thinker.model.layers.5.mlp.gate_proj.weight', 'thinker.model.layers.5.mlp.up_proj.weight', 'thinker.model.layers.5.post_attention_layernorm.weight', 'thinker.model.layers.5.self_attn.k_norm.weight', 'thinker.model.layers.5.self_attn.k_proj.weight', 'thinker.model.layers.5.self_attn.o_proj.weight', 'thinker.model.layers.5.self_attn.q_norm.weight', 'thinker.model.layers.5.self_attn.q_proj.weight', 'thinker.model.layers.5.self_attn.v_proj.weight', 'thinker.model.layers.6.input_layernorm.weight', 'thinker.model.layers.6.mlp.down_proj.weight', 'thinker.model.layers.6.mlp.gate_proj.weight', 'thinker.model.layers.6.mlp.up_proj.weight', 'thinker.model.layers.6.post_attention_layernorm.weight', 'thinker.model.layers.6.self_attn.k_norm.weight', 'thinker.model.layers.6.self_attn.k_proj.weight', 'thinker.model.layers.6.self_attn.o_proj.weight', 'thinker.model.layers.6.self_attn.q_norm.weight', 'thinker.model.layers.6.self_attn.q_proj.weight', 'thinker.model.layers.6.self_attn.v_proj.weight', 'thinker.model.layers.7.input_layernorm.weight', 'thinker.model.layers.7.mlp.down_proj.weight', 'thinker.model.layers.7.mlp.gate_proj.weight', 'thinker.model.layers.7.mlp.up_proj.weight', 'thinker.model.layers.7.post_attention_layernorm.weight', 'thinker.model.layers.7.self_attn.k_norm.weight', 'thinker.model.layers.7.self_attn.k_proj.weight', 'thinker.model.layers.7.self_attn.o_proj.weight', 'thinker.model.layers.7.self_attn.q_norm.weight', 'thinker.model.layers.7.self_attn.q_proj.weight', 'thinker.model.layers.7.self_attn.v_proj.weight', 'thinker.model.layers.8.input_layernorm.weight', 'thinker.model.layers.8.mlp.down_proj.weight', 'thinker.model.layers.8.mlp.gate_proj.weight', 'thinker.model.layers.8.mlp.up_proj.weight', 'thinker.model.layers.8.post_attention_layernorm.weight', 'thinker.model.layers.8.self_attn.k_norm.weight', 'thinker.model.layers.8.self_attn.k_proj.weight', 'thinker.model.layers.8.self_attn.o_proj.weight', 
'thinker.model.layers.8.self_attn.q_norm.weight', 'thinker.model.layers.8.self_attn.q_proj.weight', 'thinker.model.layers.8.self_attn.v_proj.weight', 'thinker.model.layers.9.input_layernorm.weight', 'thinker.model.layers.9.mlp.down_proj.weight', 'thinker.model.layers.9.mlp.gate_proj.weight', 'thinker.model.layers.9.mlp.up_proj.weight', 'thinker.model.layers.9.post_attention_layernorm.weight', 'thinker.model.layers.9.self_attn.k_norm.weight', 'thinker.model.layers.9.self_attn.k_proj.weight', 'thinker.model.layers.9.self_attn.o_proj.weight', 'thinker.model.layers.9.self_attn.q_norm.weight', 'thinker.model.layers.9.self_attn.q_proj.weight', 'thinker.model.layers.9.self_attn.v_proj.weight', 'thinker.model.norm.weight']
tensor name thinker.model.layers.5.mlp.gate_proj.weight dtype : torch.bfloat16
tensor name thinker.model.layers.5.mlp.gate_proj.weight shape : torch.Size([6144, 2048])
tensor name thinker.model.layers.5.mlp.up_proj.weight dtype : torch.bfloat16
tensor name thinker.model.layers.5.mlp.up_proj.weight shape : torch.Size([6144, 2048])
tensor name thinker.model.layers.5.post_attention_layernorm.weight dtype : torch.bfloat16
tensor name thinker.model.layers.5.post_attention_layernorm.weight shape : torch.Size([2048])
tensor name thinker.model.layers.5.self_attn.k_norm.weight dtype : torch.bfloat16
tensor name thinker.model.layers.5.self_attn.k_norm.weight shape : torch.Size([128])
tensor name thinker.model.layers.5.self_attn.k_proj.weight dtype : torch.bfloat16
tensor name thinker.model.layers.5.self_attn.k_proj.weight shape : torch.Size([1024, 2048])
tensor name thinker.model.layers.5.self_attn.o_proj.weight dtype : torch.bfloat16
tensor name thinker.model.layers.5.self_attn.o_proj.weight shape : torch.Size([2048, 2048])
tensor name thinker.model.layers.5.self_attn.q_norm.weight dtype : torch.bfloat16
tensor name thinker.model.layers.5.self_attn.q_norm.weight shape : torch.Size([128])
tensor name thinker.model.layers.5.self_attn.q_proj.weight dtype : torch.bfloat16
tensor name thinker.model.layers.5.self_attn.q_proj.weight shape : torch.Size([2048, 2048])
tensor name thinker.model.layers.5.self_attn.v_proj.weight dtype : torch.bfloat16
tensor name thinker.model.layers.5.self_attn.v_proj.weight shape : torch.Size([1024, 2048])
tensor name thinker.model.layers.6.input_layernorm.weight dtype : torch.bfloat16
tensor name thinker.model.layers.6.input_layernorm.weight shape : torch.Size([2048])
tensor name thinker.model.layers.6.mlp.down_proj.weight dtype : torch.bfloat16
tensor name thinker.model.layers.6.mlp.down_proj.weight shape : torch.Size([2048, 6144])
tensor name thinker.model.layers.6.mlp.gate_proj.weight dtype : torch.bfloat16
tensor name thinker.model.layers.6.mlp.gate_proj.weight shape : torch.Size([6144, 2048])
tensor name thinker.model.layers.6.mlp.up_proj.weight dtype : torch.bfloat16
tensor name thinker.model.layers.6.mlp.up_proj.weight shape : torch.Size([6144, 2048])
tensor name thinker.model.layers.6.post_attention_layernorm.weight dtype : torch.bfloat16

We can see that the file holds weight data. Because the format was designed to rule out RCE at the source, no vulnerabilities have been found in safetensors model files themselves.
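The header-plus-data-block layout described above can also be inspected without any library. The following sketch assumes the layout documented by the safetensors project (an 8-byte little-endian header length, then the JSON header, then raw tensor bytes); the two helper functions and the demo file are made up for illustration, and real files may carry extra header keys such as `__metadata__`.

```python
import json
import os
import struct
import tempfile


def write_minimal_safetensors(path, name, values):
    """Illustrative helper: write one float32 tensor in the safetensors layout."""
    data = struct.pack(f"<{len(values)}f", *values)
    header = {
        name: {"dtype": "F32", "shape": [len(values)], "data_offsets": [0, len(data)]}
    }
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte little-endian header size
        f.write(header_bytes)                          # JSON metadata
        f.write(data)                                  # raw tensor bytes


def read_safetensors_header(path):
    """Return the JSON header without reading the tensor bytes."""
    with open(path, "rb") as f:
        (header_size,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_size))


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.safetensors")
    write_minimal_safetensors(demo_path, "demo.weight", [1.0, 2.0, 3.0])
    header = read_safetensors_header(demo_path)

print(header)
# {'demo.weight': {'dtype': 'F32', 'shape': [3], 'data_offsets': [0, 12]}}
```

Everything the loader needs is plain data (names, dtypes, shapes, byte offsets), which is why there is nothing in the file for a loader to execute.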

Microsoft ONNX(Open Neural Network Exchange)

ONNX is an open-source format designed to let models move between many machine learning frameworks. With ONNX, developers who build a model in one framework such as PyTorch or TensorFlow can convert it and use it in another with little effort. It, too, was developed in the spirit of making deployment smoother.

import torch
import torchvision.models as models
import onnx

# Load a pretrained PyTorch model
# (the `pretrained=True` flag is deprecated in recent torchvision; `weights=` replaces it)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

# Create dummy input data
x = torch.randn(1, 3, 224, 224, requires_grad=True)

# Convert the model to the ONNX format
torch.onnx.export(model,               # model to run
                  x,                   # model input (a tuple is also allowed for multiple inputs)
                  "resnet18.onnx",     # name of the saved model file
                  export_params=True,  # whether to store the trained weights in the model file
                  opset_version=10,    # ONNX opset version to use for the conversion
                  do_constant_folding=True,  # optimization: whether to perform constant folding
                  input_names = ['input'],   # name for the model's input
                  output_names = ['output'], # name for the model's output
                  dynamic_axes={'input' : {0 : 'batch_size'},    # input dimension that varies with batch size
                                'output' : {0 : 'batch_size'}})  # output dimension that varies with batch size

Possible vulnerabilities in ONNX

Until recently, among the vulnerabilities reported for ONNX there were almost none in the ONNX format itself, and RCE was nowhere to be seen. Even the ones that existed were, strictly speaking, C/C++ runtime-type vulnerabilities. The recently disclosed path traversal vulnerability is likewise said to be a flaw in the library implementation that processes ONNX models rather than a problem with the ONNX format.

 

 

ONNX Path Traversal Vulnerability Exploited | Matt T. posted on the topic | LinkedIn

CVE-2025-51480 Path Traversal vulnerability in onnx.external_data_helper.save_external_data in ONNX 1.17.0 allows attackers to overwrite arbitrary files by supplying crafted external_data.location paths containing traversal sequences, bypassing intended di

www.linkedin.com
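The class of bug described in that advisory (an external-data location that escapes the model directory) can be guarded against with a simple containment check. This is a generic sketch of the idea, not ONNX's actual fix; `is_safe_external_location` and the sample paths are hypothetical.

```python
import tempfile
from pathlib import Path


def is_safe_external_location(model_dir, location):
    """Return True only if `location` resolves to a path inside `model_dir`."""
    base = Path(model_dir).resolve()
    # Joining with an absolute `location` discards `base`, so that case fails too.
    target = (base / location).resolve()
    return base in target.parents


with tempfile.TemporaryDirectory() as model_dir:
    candidates = [
        "weights.bin",          # ordinary sibling file
        "sub/../weights.bin",   # odd but still inside the directory
        "../../etc/passwd",     # traversal sequence
        "/etc/passwd",          # absolute path
    ]
    results = {loc: is_safe_external_location(model_dir, loc) for loc in candidates}

for loc, ok in results.items():
    print(f"{loc!r}: {'allowed' if ok else 'rejected'}")
```

Resolving both paths before comparing is the important part; comparing raw strings would miss inputs like `sub/../../x`.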

 

GGUF / GGML

GGML (Georgi Gerganov Machine Learning Format)

GGML is a lightweight machine learning library developed by Georgi Gerganov. It is a C/C++ project designed to run inference efficiently on CPUs for neural network models, including large language models. As Hugging Face's introduction also emphasizes, GGML was built with the goal of minimizing the complexity and heavy dependencies of conventional deep learning frameworks.

General-purpose machine learning frameworks such as PyTorch and TensorFlow are very powerful, but they require large library dependencies and complex build environments. That may not matter on a server, but it is a burden on personal PCs, internal networks, offline environments, and resource-constrained systems. To address this, GGML chose a structure with almost no external dependencies and a simple C-based implementation.

GGML's core philosophy is "small, simple, and predictable execution." GGML consists of only a handful of source files, and the compiled binary is very small as well. Because models can run without a separate Python runtime or a large framework, portability is excellent. It builds and runs relatively easily on Linux, macOS, and Windows, as well as on ARM architectures and Apple Silicon.

Another important characteristic is memory efficiency. GGML removes unnecessary overhead in tensor representation and computation and uses a CPU cache-friendly memory layout. One of the main reasons GGML drew wide attention is its strong quantization support: it is designed to compress float32-based models down to the int8, int5, or int4 level, greatly reducing memory usage while keeping inference performance at a practical level.
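To make the quantization idea concrete, here is a simplified block-wise 8-bit scheme in plain Python: each block of weights shares one float scale, and each weight is stored as a small integer code. It is a sketch in the spirit of block quantization, not GGML's actual implementation; the block size of 32 and the function names are assumptions for illustration.

```python
BLOCK_SIZE = 32  # assumed block length for this sketch


def quantize_block(values):
    """Map floats to int8-range codes plus one float scale shared by the block."""
    amax = max(abs(v) for v in values)
    scale = amax / 127.0
    if scale == 0.0:
        return 0.0, [0] * len(values)
    codes = [round(v / scale) for v in values]
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate floats from the codes."""
    return [scale * c for c in codes]


# Deterministic sample weights in roughly [-0.64, 0.62]
weights = [((i * 37) % 64 - 32) / 50.0 for i in range(BLOCK_SIZE)]

scale, codes = quantize_block(weights)
restored = dequantize_block(scale, codes)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"scale={scale:.6f}, max abs error={max_error:.6f}")
```

With this layout a block costs about one byte per weight plus one scale, instead of four bytes per weight for float32, and the rounding error per weight stays within half a scale step.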

Thanks to these characteristics, GGML is used as an inference-focused library rather than a training one. Its purpose is to run already-trained models quickly with as few resources as possible, and projects such as llama.cpp, whisper.cpp, GPT4All, LM Studio, and Ollama use GGML as their low-level compute engine. In that sense GGML is closer to a low-level runtime responsible for executing models than to a mere model format.

Structurally, GGML keeps a context that manages tensors and the computation graph, and it performs calculations based on that graph. It is also designed to support multiple backends such as CPU, CUDA, and Metal, managing memory allocation and compute scheduling separately per backend. This structure gives it flexibility beyond what its light weight might suggest.

GGML has limits along with these strengths. As a C/C++ library it is harder to use, which can be a barrier for users accustomed to Python-based frameworks. Its model metadata was also limited, and it was inconvenient to manage auxiliary information such as the tokenizer, special tokens, and RoPE settings together with the weights. These limits became more of a problem as models grew more complex.

Against this background, GGML gradually evolved into GGUF (GGML Unified Format). GGUF keeps GGML's philosophy while being designed to hold the metadata needed to run a model in a clearer and more extensible way. The llama.cpp ecosystem now recommends GGUF over GGML, and GGML is gradually moving into the position of a legacy format.

To sum up, GGML is less a distribution format concerned with "storing a model safely" and more a library devoted to "running a model lightly and efficiently." Because it has little to do with Python object serialization or loading executable code, it is structurally distant from vulnerabilities like RCE. As with every other execution engine, though, the final stability and security are determined by the runtime implementation and how it is operated.

GGUF (GGML Unified Format)

GGUF is an improved format based on GGML. As the name suggests, it aims to be a 'Unified' format, carrying more metadata and improving extensibility. It even defines a naming convention for files:

<BaseName><SizeLabel><FineTune><Version><Encoding><Type><Shard>.gguf

The file structure was also improved so that it can hold more metadata.

There is too much to say about GGUF, so I will cover it separately. In short, GGML can be thought of as a distribution format specialized for serving transformer models, and GGUF as a format that refines its management aspects.

Possibility of vulnerabilities in GGML / GGUF

Neither GGML nor GGUF contains Python objects, and by the same token neither contains pickle data or scripts. So vulnerabilities where the model file itself executes code do not arise from the format. One caveat: GGUF metadata can carry a chat template string (tokenizer.chat_template), and how safely that template is rendered depends on the runtime that loads it.

While looking into this I found that there are really a lot of frameworks, so I asked GPT to summarize them for me. It ended up including some that I have no idea where they are even used.

| Format / type | Main use | Contents | Code execution possible | Security risk | Pros | Cons | Recommended? |
| safetensors | HF, internal networks, secure environments | Pure tensor weights | ❌ None | ⭐ Very low | No pickle, fast mmap, safe | Stores weights only | ✅ Strongly recommended |
| PyTorch .pt / .pth | Research / development | Python objects + weights | 🔥 Possible | 🔥🔥🔥 | Flexible saving | pickle-based RCE | ❌ Do not distribute |
| HF .bin (pytorch_model.bin) | Older HF versions | Pickled weights | 🔥 Possible | 🔥🔥🔥 | Compatibility | Effectively .pt | ❌ |
| ONNX .onnx | Inference / serving | Static graph + weights | ❌ | ⭐ Low | Framework independent, fast | Limited dynamic structure | ✅ For inference |
| TorchScript .ts / .pt | PyTorch serving | IR graph + weights | ⚠️ Limited | ⚠️ Medium | Removes Python | Hard to debug | ⚠️ Limited |
| TensorFlow SavedModel | TF serving | Graph + weights | ❌ | ⭐ Low | Best fit for TF Serving | Tied to TF | ⚠️ |
| HDF5 .h5 | Keras | Weights + structure | ❌ | ⭐ Low | Simple | Limits with large models | ⚠️ |
| GGUF / GGML | llama.cpp | Quantized weights | ❌ | ⭐ Low | CPU friendly | Cannot train | ✅ Local use |
| MLflow model | MLOps | Model + metadata + code | 🔥 Possible | 🔥🔥 | Easy to manage | Includes code | ⚠️ Verification required |
| Triton model repo | NVIDIA Triton | Model + config | ❌ | ⭐ Low | High-performance serving | Complex configuration | ✅ |
| Docker image | Deployment | Model + code + OS | 🔥🔥🔥 | 🔥🔥🔥 | Reproducibility | Large attack surface | ⚠️ Internal verification |
| HF repo (full) | Sharing | Weights + Python | 🔥🔥🔥 | 🔥🔥🔥 | Convenience | trust_remote_code | ❌ If unverified |
| LoRA / Adapter | Fine-tuning | Weight delta | ❌ | ⭐ Low | Lightweight | Needs base model | ✅ |
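For the "HF repo (full)" row, one thing that can be checked before loading is whether a repository's config.json contains `auto_map` entries, the key Transformers uses to point its Auto classes at custom Python code. The sketch below walks a config, including nested sub-configs, and reports every `auto_map` it finds. The helper name, the sample config, and the repository names in it are hypothetical, and the `repo--module.Class` form is shown as I understand the convention for referencing code in another repository.

```python
import json


def find_auto_map(config, path="config"):
    """Yield (location, auto_map) pairs for every auto_map in a nested config."""
    if isinstance(config, dict):
        for key, value in config.items():
            here = f"{path}.{key}"
            if key == "auto_map":
                yield here, value
            else:
                yield from find_auto_map(value, here)
    elif isinstance(config, list):
        for i, item in enumerate(config):
            yield from find_auto_map(item, f"{path}[{i}]")


# Hypothetical config.json contents with an auto_map hidden in a sub-config
sample_config = json.loads("""
{
  "model_type": "example",
  "vision_config": {
    "auto_map": {"AutoConfig": "someone/other-repo--configuration_x.XConfig"}
  }
}
""")

findings = list(find_auto_map(sample_config))
for location, mapping in findings:
    print(f"{location}: {mapping}")
```

A non-empty result does not prove a repository is malicious; it only flags that loading may pull in Python code that deserves a review first.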

So the conclusion is that models came to rely on unified formats and frameworks to meet various requirements, and the vulnerabilities that arise there were mostly the problems you would expect from using pickle serialization.

As long as pickle serialization is not used, critical problems like RCE are unlikely to come from the model file itself. Vulnerabilities in the model runtime framework are an entirely different area, though, so keep that in mind when using them.

If anything here is wrong, please let me know!
