Batch Normalization

https://arxiv.org/pdf/1502.03167

 

 

Background

Batch normalization is an idea proposed in 2015 to reduce the Internal Covariate Shift (ICS) problem. Covariate shift refers to the observation that inference performance can suffer when the distribution of the data used for training differs from the distribution of the data seen at inference time. The term Internal Covariate Shift comes from the claim that the same thing also happens inside a neural network. The figure below gives the intuition.

 

The distribution of the data changes as it passes through the network, and the shift grows with the number of layers it passes through, so both training and inference performance are likely to suffer. Batch Normalization normalizes per mini-batch, using the mean and variance of each batch, instead of relying on a single normalization of training data whose distribution varies from sample to sample.

๋‚˜๋™๋นˆ๋‹˜์˜ ์˜์ƒ์„ ์ฐธ๊ณ ํ•˜์—ฌ ์•Œ๊ฒŒ ๋œ batch normalizaion๊ฐ€ ํ˜„์‹ค์—์„œ๋Š” ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ ์˜์กด๋„๋ฅผ ์ค„์˜€์œผ๋ฉฐ, ํ•™์Šต์†๋„๋ฅผ ํ–ฅ์ƒ์‹œํ‚ค๊ณ , ๋ชจ๋ธ์ด ์ผ๋ฐ˜์ ์œผ๋กœ ์ฆ‰, ํ•™์Šต๋ฐ์ดํ„ฐ์—๋งŒ ํƒœ์Šคํฌ๋ฅผ ์ž˜ ์ฒ˜๋ฆฌํ•˜๋„๋ก ํ•˜๋Š”๊ฒƒ์ด ์•„๋‹Œ ์‹ค์ œ ํ˜„์ƒ์„ ์ž˜ ๋ฐ˜์˜์‹œํ‚ค๊ฒŒ ๋œ ํšจ๊ณผ๊ฐ€ ์žˆ์—ˆ๋‹ค๊ณ  ํ•ฉ๋‹ˆ๋‹ค.

๊ทธ๋Ÿฐ๋ฐ ๋…ผ๋ฌธ์—์„œ๋Š” ics ๋ฅผ ๊ฐ์†Œ์‹œํ‚จ๋‹ค๊ณ  ์ฃผ์žฅํ•˜์˜€์œผ๋‚˜ ์‹ค์ œ๋กœ ์ฆ๋ช…ํ•˜์ง€๋Š” ๋ชปํ–ˆ๋‹ค๊ณ  ํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋ž˜์„œ ๊ทธ๊ฒƒ์„ ์ฆ๋ช…ํ•˜๊ธฐ ์œ„ํ•œ How Does Batch Normalization Help Optimization?  ๋ผ๋Š” ๋…ผ๋ฌธ์ด ๋‚˜์™”์Šต๋‹ˆ๋‹ค.

https://arxiv.org/pdf/1805.11604

 

 

First, the paper shows that networks with Batch Norm applied generally reach high accuracy along a much steeper curve.

 

 

The histograms on the right show the activation distribution at each layer. In the rightmost column, Standard + Noisy BatchNorm, the distribution changes abruptly from Layer 3 onward, which shows that ICS is occurring. Even so, the plot on the left shows that training performance remains strong.

In other words, even when noise is injected right after each Batch Norm layer to deliberately induce covariate shift, the network with BatchNorm still outperforms the standard network. This experimentally refutes the earlier paper's claim that Batch Norm works by resolving ICS, and it even shows that Batch Norm improves performance when ICS is large.

The paper also proposes a way to measure ICS from the gradients of the parameters, but that goes too far beyond the purpose of this post, so I will not cover it. If you are curious, please refer to the paper.

So why is performance better even though ICS is not resolved? The paper attributes it to the smoothing effect of Batch Norm.

 

The paper argues that the loss landscape becomes much smoother and more predictable, which makes training more effective.

 

 

Batch Normalization Layer

The layer normalizes its input using the mean and variance of the mini-batch, and then produces the actual output using gamma and beta. Gamma and beta are the parameters that are actually learned: during training they move in the direction that minimizes the loss.

The reason normalization uses learnable parameters lies in the behavior of the activation function. Take sigmoid as an example: it is almost linear in the region around 0, so if its input is normalized to zero mean and unit variance, most values land in that region and the activation behaves almost linearly. Gamma and beta let the network scale and shift the normalized values so that non-linearity is preserved and the normalization layer can emit an appropriate output. The takeaway is that when normalizing a layer's input, you have to watch out for linearity.

 

Batch Normalization Layer: Training vs. Inference Computation

The batch normalization layer plays a different role in the network during training and during inference. During training, the gamma and beta parameters have to be learned, but at inference time no further learning is needed. The parameters are therefore frozen, and the output is computed from the learned parameters together with fixed mean and variance estimates collected during training, rather than from the statistics of the current batch.

 

From step 7 onward, the BN layers that were in the network in training mode are switched to inference mode (by freezing the parameters).
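The switch between the two modes can be sketched as follows. This is a minimal pure-Python illustration, not the paper's exact procedure: the class name `BatchNorm1D`, the `momentum` value, and the use of an exponential moving average with biased variance are all assumptions made for the example.

```python
import math


class BatchNorm1D:
    """Minimal batch norm over a list of feature vectors (illustrative sketch)."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        self.gamma = [1.0] * num_features         # learnable scale (kept fixed here)
        self.beta = [0.0] * num_features          # learnable shift (kept fixed here)
        self.running_mean = [0.0] * num_features  # statistics reused at inference
        self.running_var = [1.0] * num_features
        self.momentum = momentum                  # assumed moving-average factor
        self.eps = eps
        self.training = True

    def forward(self, batch):
        n, d = len(batch), len(self.gamma)
        if self.training:
            # Training mode: use the statistics of the current mini-batch.
            mean = [sum(row[j] for row in batch) / n for j in range(d)]
            var = [sum((row[j] - mean[j]) ** 2 for row in batch) / n for j in range(d)]
            m = self.momentum
            self.running_mean = [(1 - m) * r + m * mu for r, mu in zip(self.running_mean, mean)]
            self.running_var = [(1 - m) * r + m * v for r, v in zip(self.running_var, var)]
        else:
            # Inference mode: use the frozen running statistics instead.
            mean, var = self.running_mean, self.running_var
        return [
            [self.gamma[j] * (row[j] - mean[j]) / math.sqrt(var[j] + self.eps) + self.beta[j]
             for j in range(d)]
            for row in batch
        ]


bn = BatchNorm1D(num_features=2)
train_out = bn.forward([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])  # batch statistics
bn.training = False
infer_out = bn.forward([[2.0, 4.0]])  # a single sample works: no batch statistics needed
print(train_out)
print(infer_out)
```

In inference mode the output for a sample no longer depends on which other samples share its batch, which is why a batch of size 1 is fine there.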

Batch Normalization Data Flow

Input data (X)

$$ X = \begin{bmatrix} 1 & 2 \\ 2 & 4 \\ 3 & 6 \end{bmatrix} $$

Data arriving as one batch

shape: (3, 2)

→ 3 samples, each a 2-dimensional vector


Passing through the Linear Layer

Let the weights and bias be:

$$ W = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad b = [0,\ 0] $$

That is, a linear layer that changes nothing:

$$ Z = XW + b = X $$

Result:

Z =
[
 [1, 2],
 [2, 4],
 [3, 6]
]

The shape stays (3, 2).


Batch Normalization

1. Batch Mean (μ)

Mean per feature:

$$ \mu = \left[ \frac{1+2+3}{3},\ \frac{2+4+6}{3} \right] = [2,\ 4] $$


2. Batch Variance (σ²)

$$ \sigma^2 = \left[ \frac{(1-2)^2+(2-2)^2+(3-2)^2}{3},\ \frac{(2-4)^2+(4-4)^2+(6-4)^2}{3} \right] = \left[ \frac{2}{3},\ \frac{8}{3} \right] $$


3. Normalize (x̂)

$$ \hat{x} = \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} \quad (\text{ignoring } \epsilon) $$

Per-sample computation

First sample

$$ [1,\ 2] \rightarrow [-1/\sqrt{2/3},\ -2/\sqrt{8/3}] \approx [-1.22,\ -1.22] $$

Second sample

$$ [2,\ 4] \rightarrow [0,\ 0] $$

Third sample

$$ [3,\ 6] \rightarrow [1.22,\ 1.22] $$

Result:

X_hat =
[
 [-1.22, -1.22],
 [ 0.00,  0.00],
 [ 1.22,  1.22]
]

The layer then applies gamma and beta to these values to produce its output. In this way, batch norm performs normalization by computing the mean and variance per feature over the mini-batch and applying them to the original data.
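The worked example above can be reproduced with a few lines of pure Python. This is only a sketch: the function name `batch_norm` is made up for illustration, and gamma and beta are assumed to be 1 and 0 so that the output equals the normalized values.

```python
import math


def batch_norm(batch, gamma, beta, eps=0.0):
    """Normalize each feature (column) with the mean/variance taken over the batch."""
    n, d = len(batch), len(batch[0])
    mean = [sum(row[j] for row in batch) / n for j in range(d)]
    var = [sum((row[j] - mean[j]) ** 2 for row in batch) / n for j in range(d)]
    x_hat = [[(row[j] - mean[j]) / math.sqrt(var[j] + eps) for j in range(d)] for row in batch]
    # Scale and shift with the learnable parameters.
    y = [[gamma[j] * x_hat[i][j] + beta[j] for j in range(d)] for i in range(n)]
    return mean, var, x_hat, y


X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mean, var, x_hat, y = batch_norm(X, gamma=[1.0, 1.0], beta=[0.0, 0.0])

print(mean)                                          # [2.0, 4.0]
print([round(v, 3) for v in var])                    # [0.667, 2.667]
print([[round(v, 2) for v in row] for row in x_hat]) # [[-1.22, -1.22], [0.0, 0.0], [1.22, 1.22]]
```

Both columns end up with the same normalized values because the second feature is exactly twice the first, and normalization removes that scale.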

Layer Normalization

arxiv.org

Layer Normalization was proposed to address the difficulty of applying Batch Norm to RNNs. An RNN computes step by step over time, so BN, which normalizes with per-feature statistics over the mini-batch, does not reflect the context of the sequence being processed.

The biggest problem is that for RNNs, NLP, or speech data, the sequence length differs from sample to sample within a batch.

Sample 1: "나는 밥을 먹었다" ("I ate a meal")        (length 4)
Sample 2: "오늘" ("Today")                    (length 1)
Sample 3: "어제 비가 와서 우산을 썼다" ("It rained yesterday, so I used an umbrella") (length 6)

If these samples are padded to a common length and fed through a layer that uses BN, the padded positions of the shorter samples are filled with 0. As a result, the meaning of the data is not fully reflected. The same problem applies to time-series data. Unlike images or grade statistics (Korean scores compared with Korean scores, math scores with math scores), when one feature influences other features or other parts of the data, the paper argues that LN performs better, because it is not affected by the batch size and reflects the meaning of the data well.

 

Differences from BN

Batch Normalization computes the mean and variance per mini-batch. Layer Normalization (LN), by contrast, normalizes per layer as the name suggests, or more precisely only over the features inside a single sample. In other words, the axis of normalization is completely different.

  • Batch Normalization
    • Axis for mean and variance: batch direction
    • Uses the same feature across multiple samples
  • Layer Normalization
    • Axis for mean and variance: feature direction
    • Computed within a single sample only

For a single sample x = [x₁, x₂, ..., x_d]:

$$ \mu = \frac{1}{d} \sum_{i=1}^{d} x_i $$

$$ \sigma^2 = \frac{1}{d} \sum_{i=1}^{d} (x_i - \mu)^2 $$

$$ \hat{x}_i = \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}} $$

Then, as in Batch Normalization, scale and shift parameters are applied:

$$ y_i = \gamma_i \hat{x}_i + \beta_i $$

The important point here is that γ and β exist only along the feature dimension and are independent of the batch size.

Let's run the same samples through the layer using the formulas above.

Layer Normalization Data Flow

Input data (X)

$$ X = \begin{bmatrix} 1 & 2 \\ 2 & 4 \\ 3 & 6 \end{bmatrix} $$

shape: (3, 2)

→ 3 samples, each a 2-dimensional vector


Passing through the Linear Layer

The weights and bias are set the same as before.

$$ Z = X $$


Applying Layer Normalization

Layer Normalization computes the mean and variance independently for each sample.

First sample [1, 2]

$$ \mu = (1 + 2) / 2 = 1.5 $$

$$ \sigma^2 = ((1 - 1.5)^2 + (2 - 1.5)^2) / 2 = 0.25 $$

Normalized result:

$$ [1, 2] \rightarrow [-1, 1] $$


Second sample [2, 4]

$$ \mu = 3,\quad \sigma^2 = 1 $$

Normalized result:

$$ [2, 4] \rightarrow [-1, 1] $$


Third sample [3, 6]

$$ \mu = 4.5,\quad \sigma^2 = 2.25 $$

Normalized result:

$$ [3, 6] \rightarrow [-1, 1] $$


Layer Normalization result

X_hat =
[
 [-1,  1],
 [-1,  1],
 [-1,  1]
]
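The same computation can be sketched in pure Python. The function name `layer_norm` is an illustrative assumption, ε is set to 0 to match the hand calculation, and γ and β are left out (equivalent to 1 and 0).

```python
import math


def layer_norm(batch, eps=0.0):
    """Normalize each sample (row) with the mean/variance taken over its own features."""
    out = []
    for row in batch:
        mu = sum(row) / len(row)
        var = sum((v - mu) ** 2 for v in row) / len(row)
        out.append([(v - mu) / math.sqrt(var + eps) for v in row])
    return out


X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
print(layer_norm(X))  # [[-1.0, 1.0], [-1.0, 1.0], [-1.0, 1.0]]
```

With only two features per sample, any sample with two distinct values normalizes to -1 and 1, so this toy input hides the differences between samples; real layers have many more features per token.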

Why Layer Normalization Fits the Transformer Architecture Better than Batch Normalization

1. Variable sequence length and the masking problem

Self-Attention in the Transformer has to handle variable-length sequences, because each input sentence has a different length. To deal with this, shorter sentences are padded and an attention mask is used.

Applying Batch Normalization to this structure causes a serious problem. BN computes the mean and variance across the whole batch and sequence dimensions, so, as seen above, the zero vectors of meaningless padding tokens are included in the statistics. As a result, the normalization statistics are distorted by sentence length, and even sentences with the same content can be normalized differently depending on the amount of padding.

In contrast, Layer Normalization normalizes only over the feature dimension of each token. Because the mean and variance are computed inside a single token, padding tokens and sequence length have no effect on the normalization statistics. Each token is normalized independently, so the meaning of the data is faithfully preserved and normalization is consistent regardless of the batch or sequence structure.

2. Autoregressive decoding and batch size mismatch

The Transformer Decoder works autoregressively at inference time so that it cannot look at future information. That is, it generates the next token one at a time based on the tokens generated so far. In this process the batch size is often 1. As the Layer Normalization paper shows, this is a critical problem for Batch Normalization.

Layer Normalization works stably regardless of the batch size. Whether the batch size is 1 or 32, the normalization result is consistent, and the performance observed during training carries over to inference. This property is decisive for the generation quality of the Transformer Decoder.

3. Structural mismatch with residual connections

Each Transformer block uses a residual connection: y = x + Sublayer(LN(x)). This structure matters because of gradient flow. In backpropagation, ∂y/∂x = 1 + ∂Sublayer(LN(x))/∂x, so there is always a path (the identity mapping) through which the gradient can flow directly. This is the key mechanism that mitigates vanishing gradients in deep networks.

If Batch Normalization were used on the residual path, its output would depend on batch statistics, so batch-dependent noise would be injected into the residual path. This makes gradient flow unstable and, especially in deep Transformers, can cause exploding or vanishing gradients. In fact, the Post-LN Transformer (LN applied after the residual addition) is known to become unstable to train as depth grows, while the Pre-LN Transformer (LN applied before the sublayer, inside the residual branch) trains more stably. BN fundamentally conflicts with these properties of residual connections.

Layer Normalization normalizes each sample independently, so it does not depend on the batch. It therefore does not disturb gradient flow on the residual path, and stable training is possible even in deep Transformers with dozens of layers. This structural fit is another important reason the Transformer uses Layer Normalization.


Background

While developing an in-house LLM service, we ran into a situation where vLLM was not processing requests in parallel. The vLLM logs showed that requests were arriving at the vLLM server and being processed one at a time. At first I assumed vLLM was failing to recognize multiple GPUs and was over-allocating VRAM, which prevented parallel processing.

However, vLLM had been started with the multi-GPU option, and the logs confirmed that both GPUs were recognized. While digging further, I found that the problem was the openai library used to send requests from FastAPI to vLLM. In the openai library, the OpenAI client makes synchronous requests, and AsyncOpenAI has to be used for asynchronous behavior.

To organize what I learned, this post covers the difference between making requests with requests and with httpx, FastAPI's sync/async handling, and how parallelism and asynchrony work.


1. How FastAPI Handles Sync / Async

FastAPI behaves in completely different ways depending on whether the endpoint function is declared with def or async def.

1.1 Synchronous endpoint (def)

from fastapi import FastAPI
import time

app = FastAPI()

@app.get("/sync")
def sync_endpoint():
    time.sleep(5)
    return {"msg": "done"}

For a synchronous endpoint, FastAPI handles the request in a worker thread taken from an internal thread pool.

That is, each request occupies one thread. The problem with this approach shows up when there is work with long I/O waits, such as calls to an external API. The thread stays occupied until the response arrives, so the number of requests that can be handled at the same time drops sharply. From vLLM's point of view, requests then appear to arrive sequentially, one at a time.


1.2 Asynchronous endpoint (async def)

from fastapi import FastAPI
import asyncio

app = FastAPI()

@app.get("/async")
async def async_endpoint():
    await asyncio.sleep(5)
    return {"msg": "done"}

An asynchronous endpoint runs on the event loop. While waiting for I/O, it hands control back to the event loop, which can then handle other requests. The important point, however, is that declaring a function with async def does not automatically make it asynchronous. Asynchronous processing is only meaningful when every I/O operation used inside the endpoint is itself asynchronous.

As explained further below, asynchronous work is not the same as parallel work. Asynchronous work is concurrent: tasks only appear to be processed at the same time.
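A minimal standard-library sketch of this behavior, independent of FastAPI: five simulated I/O waits of 0.2 seconds each finish in roughly 0.2 seconds in total, because each coroutine yields to the event loop while it waits. The name `fake_io` and the delay are illustrative assumptions.

```python
import asyncio
import time


async def fake_io(name, delay):
    """Stand-in for an awaitable I/O call such as an async HTTP request."""
    await asyncio.sleep(delay)  # yields control to the event loop while waiting
    return name


async def main():
    start = time.perf_counter()
    # All five coroutines wait concurrently on a single thread.
    results = await asyncio.gather(*(fake_io(f"req-{i}", 0.2) for i in range(5)))
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(main())
print(results)
print(f"{elapsed:.2f}s")  # close to 0.2s rather than 1.0s
```

Nothing here runs in parallel: a single thread interleaves the waits, which is exactly the concurrency the section describes.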


2. Concurrency and Parallelism in FastAPI

Concurrency and async / await - FastAPI (fastapi.tiangolo.com)

To understand this issue, the difference between concurrency and parallelism has to be clearly separated.

2.1 Concurrency

Concurrency means handling multiple tasks by switching between them.

The tasks do not actually run at the same instant, but they appear to be processed simultaneously. FastAPI's asynchronous processing falls into this category.


2.2 Parallelism

concurrent.futures - Launching parallel tasks (docs.python.org)

Parallelism means actually running multiple tasks at the same time.

The official FastAPI documentation illustrates both with a cute burger example:

1. Concurrency

 

2. Parallelism

 

For details, see the FastAPI link above.


3. Why the OpenAI Library Became the Bottleneck

3.1 Using OpenAI (the synchronous SDK)

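from fastapi import FastAPI
from openai import OpenAI

app = FastAPI()

# Synchronous client pointed at the vLLM OpenAI-compatible server
client = OpenAI(
    base_url="http://vllm:8000/v1",
    api_key="EMPTY"
)

@app.post("/chat")
def chat():
    # Blocks the calling thread until vLLM returns the full response
    response = client.chat.completions.create(
        model="qwen",
        messages=[{"role": "user", "content": "hello"}]
    )
    return response.choices[0].message.content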

The OpenAI class works synchronously.

That is, it fully occupies the FastAPI thread until the response comes back.

This led to the following symptoms.

  • FastAPI requests were serialized
  • Requests showed up in the vLLM server log one at a time
  • No batching happened even though there were enough GPUs

At first this was easy to mistake for a vLLM configuration problem.


3.2 Using AsyncOpenAI (the fix)

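from fastapi import FastAPI
from openai import AsyncOpenAI

app = FastAPI()

# Asynchronous client pointed at the vLLM OpenAI-compatible server
client = AsyncOpenAI(
    base_url="http://vllm:8000/v1",
    api_key="EMPTY"
)

@app.post("/chat")
async def chat():
    # Awaiting yields to the event loop, so other requests can be sent meanwhile
    response = await client.chat.completions.create(
        model="qwen",
        messages=[{"role": "user", "content": "hello"}]
    )
    return response.choices[0].message.content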

Switching to AsyncOpenAI solved the problem.

  • The FastAPI event loop is no longer blocked
  • Multiple requests reach vLLM at the same time
  • vLLM batching works as expected
  • Multi-GPU usage confirmed

In the end, the cause of what looked like a lack of parallel processing was the way requests were sent between FastAPI and vLLM.


4. Difference Between requests and httpx

4.1 requests

import requests

def call_vllm():
    r = requests.post(url, json=payload)
    return r.json()

  • Synchronous-only library
  • Blocks the event loop when used inside async def
  • Does not fit FastAPI's asynchronous structure

4.2 httpx (recommended for async)

import httpx

async def call_vllm():
    async with httpx.AsyncClient(timeout=60) as client:
        r = await client.post(url, json=payload)
        return r.json()

  • Supports asynchronous I/O
  • Provides connection pooling
  • Works very well with FastAPI

4.3 A bad example and a good example

❌ Bad example

@app.post("/bad")
async def bad():
    r = requests.post(url, json=payload)
    return r.json()

⭕ Good example

@app.post("/good")
async def good():
    async with httpx.AsyncClient() as client:
        r = await client.post(url, json=payload)
        return r.json()
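The cost of the bad pattern can be measured without FastAPI or a network, using only the standard library. In this sketch `time.sleep` stands in for a blocking `requests.post` call, and the handler names are hypothetical. `asyncio.to_thread` (Python 3.9+) is shown as one possible workaround when a blocking library cannot be replaced; it is not something the original setup used.

```python
import asyncio
import time


def blocking_call():
    """Stand-in for a synchronous HTTP call such as requests.post."""
    time.sleep(0.2)
    return "ok"


async def bad_handler():
    # Blocking call inside async def: the event loop is stuck until it returns.
    return blocking_call()


async def offloaded_handler():
    # Runs the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_call)


async def timed(handler, n=4):
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(n)))
    return time.perf_counter() - start


async def main():
    return await timed(bad_handler), await timed(offloaded_handler)


bad_seconds, offloaded_seconds = asyncio.run(main())
print(f"blocking inside async def: {bad_seconds:.2f}s")   # about 4 x 0.2s, serialized
print(f"offloaded to threads:      {offloaded_seconds:.2f}s")
```

A natively asynchronous client such as httpx.AsyncClient avoids the extra threads entirely, which is why it remains the preferred option in the good example above.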



Background

 

When we built a customer-facing chatbot, the bar for hallucination was strict, so it was designed to answer "I don't know" to questions it could not answer and hand the conversation over to a human agent.

As a result, whenever a customer did something slightly unexpected within the chain structure, the bot dodged the question (answering "I don't know" and then connecting an agent), and consultation satisfaction dropped. To respond more flexibly to questions, we decided to migrate from the chain structure to a ReAct agent.

Reaching a chain required following a fixed DTO, but by the time the call reached the Tool that wrapped the chain, the LLM had already mangled the DTO, so arguments could not be passed to the Tool. At the time, output was controlled by the prompt alone. The bot looked like it was answering well, but tracing the agent's tool calling with Langsmith showed that internally some data was being dropped and calls were being repeated. A more forceful prompt would probably have helped somewhat, but in the end the problem led to slower responses and higher token costs. Since customer feedback and actual business problem solving were not affected, it was deprioritized and left as technical debt.

After joining my current company I came across the concept of structured output, and I want to check whether it is reliable, specifically whether it is trustworthy enough to use in real customer-facing work.

 

How Structured Output Works

Structured output is a feature that lets you receive the LLM's output in a form such as JSON, a Pydantic model, or a dataclass. It also enables error handling: if the model responds outside the allowed range or matches the wrong data type, a validation error can be raised and turned into an error message.

Used well, this lets you raise an error only in specific cases (the format does not match, or a field receives an invalid value). Typically a retry follows, and on retry the output matches in most cases. The most critical caveat is that the model will fill in data that fits the structure, but nothing guarantees that the values themselves are actually correct.
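The validation behavior described above can be sketched with the standard library alone. This is an illustration, not how LangChain or Pydantic implement it: the `ReviewSummary` dataclass, the allowed sentiment labels, and `parse_review` are hypothetical names chosen for the example.

```python
import json
from dataclasses import dataclass

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}  # assumed label set


@dataclass
class ReviewSummary:
    title: str
    sentiment: str
    score: float

    def __post_init__(self):
        # Type and range checks: these are what structure validation can catch.
        if not isinstance(self.title, str):
            raise ValueError("title must be a string")
        if self.sentiment not in ALLOWED_SENTIMENTS:
            raise ValueError("sentiment must be positive, negative, or neutral")
        if isinstance(self.score, bool) or not isinstance(self.score, (int, float)):
            raise ValueError("score must be a number")
        if not 0.0 <= self.score <= 1.0:
            raise ValueError("score must be between 0 and 1")


def parse_review(raw):
    """Parse raw model output and validate it against the schema."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    try:
        return ReviewSummary(**data)
    except TypeError as exc:  # missing or unexpected fields
        raise ValueError(f"schema mismatch: {exc}") from exc


# Out-of-range value: rejected by validation.
try:
    parse_review('{"title": "Slow plot", "sentiment": "negative", "score": 1.7}')
    rejected = False
except ValueError:
    rejected = True

# Well-formed but questionable: a glowing title labeled negative still passes.
accepted = parse_review('{"title": "Best film of the year", "sentiment": "negative", "score": 0.9}')
print(rejected, accepted)
```

The second call is the critical caveat in practice: the object is structurally valid, so any check on whether the values are true has to come from somewhere else.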

Order of operations

  1. The model and the schema are taken as input.
  2. LangChain selects a strategy internally.
    1. Tool-calling strategy: used when the model does not support structured output. The answer comes back as tool-call JSON, which LangChain parses and converts into an object matching the schema. Because the tool call itself consumes extra tokens, cost and response time increase. https://platform.openai.com/docs/guides/structured-outputs
    2. Provider strategy: used when the model supports structured output.
  3. LangChain or the agent generates the response.
  4. The result is validated: a Pydantic- or JSON-based parser checks that it was parsed according to the schema.
  5. If parsing succeeds, the result is placed in structured_response and returned as the final result.

 

 

 

Schema Input / Strategy Selection

Let's start with the part that takes the schema as input, using the example below.

The example uses a Pydantic schema: the Pydantic model is passed as the schema argument of the with_structured_output method.

from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI


class ReviewSummary(BaseModel):
    title: str = Field(..., description="Review title")
    sentiment: str = Field(..., description="One of positive/negative/neutral")
    score: float = Field(..., description="Sentiment score between 0 and 1")


# Connects automatically to the OpenAI API or to a vLLM OpenAI server URL
model = ChatOpenAI(
    model="gpt-4o-mini",  # any model works
    temperature=0
)

structured_model = model.with_structured_output(ReviewSummary)

# user_input is the review text to summarize (defined elsewhere)
result = structured_model.invoke(user_input)

print(result)
print(type(result))
------------
title='Movie review summary'
sentiment='negative'
score=0.15
<class '__main__.ReviewSummary'>
------------

For some models that support structured output, the method simply binds a tool that converts the schema into the form the vendor supports, as shown below, and is done.

class ChatAnthropic(BaseChatModel):
    # ---------- omitted ----------
    def with_structured_output():
        # ---------- omitted ----------
        if method == "function_calling":
            formatted_tool = convert_to_anthropic_tool(schema)
            tool_name = formatted_tool["name"]
            if self.thinking is not None and self.thinking.get("type") == "enabled":
                llm = self._get_llm_for_structured_output_when_thinking_is_enabled(
                    schema,
                    formatted_tool,
                )
            else:
                llm = self.bind_tools(
                    [schema],
                    tool_choice=tool_name,
                    ls_structured_output_format={
                        "kwargs": {"method": "function_calling"},
                        "schema": formatted_tool,
                    },
                )

 

@dataclass(init=False)
class ProviderStrategy(Generic[SchemaT]):
    """Use the model provider's native structured output method."""

    schema: type[SchemaT]
    """Schema for native mode."""

    schema_spec: _SchemaSpec[SchemaT]
    """Schema spec for native mode."""

    def __init__(
        self,
        schema: type[SchemaT],
    ) -> None:
        """Initialize ProviderStrategy with schema."""
        self.schema = schema
        self.schema_spec = _SchemaSpec(schema)

 

Models that are not covered by a provider use ToolStrategy, which is generally the case for models served through a local serving framework such as vLLM.

class ChatOllama(BaseChatModel):
    # --- omitted ---
    def with_structured_output():
        # --- omitted ---
        if is_pydantic_schema:
            schema = cast("TypeBaseModel", schema)
            if issubclass(schema, BaseModelV1):
                response_format = schema.schema()
            else:
                response_format = schema.model_json_schema()
            llm = self.bind(
                format=response_format,
                ls_structured_output_format={
                    "kwargs": {"method": method},
                    "schema": schema,
                },
            )
@dataclass(init=False)
class ToolStrategy(Generic[SchemaT]):
    """Use a tool calling strategy for model responses."""

    schema: type[SchemaT]
    """Schema for the tool calls."""

    schema_specs: list[_SchemaSpec[SchemaT]]
    """Schema specs for the tool calls."""

    tool_message_content: str | None
    """The content of the tool message to be returned when the model calls
    an artificial structured output tool."""

    handle_errors: (
        bool | str | type[Exception] | tuple[type[Exception], ...] | Callable[[Exception], str]
    )
    

ToolStrategy uses the bind method to reach the runnable object and performs the tool calling at that point.

A developer can step in and call the vendor's tool explicitly, but the decisive difference in strategy selection is ultimately that the method chosen by default when calling with_structured_output differs.

 

When the model does not support structured output

def with_structured_output(
        self,
        schema: dict | type,
        *,
        method: Literal["function_calling", "json_mode", "json_schema"] = "json_schema",
        include_raw: bool = False,
        **kwargs: Any,
    ) -> Runnable[LanguageModelInput, dict | BaseModel]:
        r"""Model wrapper that returns outputs formatted to match the

 

When the model supports structured output

def with_structured_output(
        self,
        schema: dict | type,
        *,
        include_raw: bool = False,
        method: Literal["function_calling", "json_schema"] = "function_calling",
        **kwargs: Any,
    ) -> Runnable[LanguageModelInput, dict | BaseModel]:
        """Model wrapper that returns outputs formatted to match the given schema.

 

When structured output is supported, method defaults to function_calling, and the request is handled in the function-calling format of the API vendor:

if method == "function_calling":
    formatted_tool = convert_to_anthropic_tool(schema)
    tool_name = formatted_tool["name"]
    if self.thinking is not None and self.thinking.get("type") == "enabled":
        llm = self._get_llm_for_structured_output_when_thinking_is_enabled(
            schema,
            formatted_tool,
        )
    else:
        llm = self.bind_tools(
            [schema],
            tool_choice=tool_name,
            ls_structured_output_format={
                "kwargs": {"method": "function_calling"},
                "schema": formatted_tool,
            },
        )

This path uses the bind_tools method.

In the opposite case, json_schema is selected by default and the bind method is used: instead of a tool call, a new object is created in the runnable sequence that overrides the call options.

elif method == "json_schema":
            if schema is None:
                msg = (
                    "schema must be specified when method is not 'json_mode'. "
                    "Received None."
                )
                raise ValueError(msg)
            if is_pydantic_schema:
                schema = cast("TypeBaseModel", schema)
                if issubclass(schema, BaseModelV1):
                    response_format = schema.schema()
                else:
                    response_format = schema.model_json_schema()
                llm = self.bind(
                    format=response_format,
                    ls_structured_output_format={
                        "kwargs": {"method": method},
                        "schema": schema,
                    },
                )
                output_parser = PydanticOutputParser(pydantic_object=schema)  # type: ignore[arg-type]

##bind example
"""
        Example:
            ```python
            from langchain_ollama import ChatOllama
            from langchain_core.output_parsers import StrOutputParser

            model = ChatOllama(model="llama3.1")

            # Without bind
            chain = model | StrOutputParser()

            chain.invoke("Repeat quoted words exactly: 'One two three four five.'")
            # Output is 'One two three four five.'

            # With bind
            chain = model.bind(stop=["three"]) | StrOutputParser()

            chain.invoke("Repeat quoted words exactly: 'One two three four five.'")
            # Output is 'One two'
            
"""

You can see that it sets response_format on its own. This is the logic that takes the schema as input and selects a strategy.

Next, let's look at how each strategy produces structured output.

 

Response Generation per Strategy

  1. ToolcallingStrategy
    class ToolStrategy(Generic[SchemaT]):
        schema: type[SchemaT]
        schema_specs: list[_SchemaSpec[SchemaT]]
        tool_message_content: str | None
        handle_errors: bool | ...
    
    LangChain uses schema_spec to create a fake tool schema and passes it to the model under a fake tool name such as structured_output. The response is parsed as JSON and validated against the Pydantic model or dataclass; if validation fails, a validation error is raised and the model is asked again. The model responds in a form like the following.
    {
      "tool": "structured_output",
      "arguments": {
          "title": "some text",
          "score": 0.82
      }
    }
    When the model is asked again after such an error, a chain or node that carries the full context wastes a large number of tokens and delays the response. The tool-calling strategy is selected when the model does not natively support structured output.
  2. ProviderStrategy
    @dataclass(init=False)
    class ProviderStrategy(Generic[SchemaT]):
        """Use the model provider's native structured output method."""
    
        schema: type[SchemaT]
        """Schema for native mode."""
    
        schema_spec: _SchemaSpec[SchemaT]
        """Schema spec for native mode."""
    
    Here LangChain passes the schema to the model as-is and only parses the response. This is the case where the model itself supports structured output. Responses from OpenAI, Anthropic, and Gemini are stable, so the model rarely has to be asked again. LangChain converts the schema and parses the result in the form each vendor expects.
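
The parse, validate, and retry flow described for the tool-calling strategy can be sketched roughly as follows. This is a minimal stdlib-only illustration, not LangChain's implementation: `Article`, `parse_article`, `call_with_retry`, and the canned replies standing in for a model are all hypothetical names introduced here.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Article:
    title: str
    score: float


def parse_article(raw: str) -> Article:
    """Parse a tool-call payload and validate the argument types."""
    payload = json.loads(raw)
    args = payload["arguments"]
    if not isinstance(args.get("title"), str):
        raise ValueError("title must be a string")
    score = args.get("score")
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError("score must be a number")
    return Article(title=args["title"], score=float(score))


def call_with_retry(model: Callable[[str], str], prompt: str,
                    max_attempts: int = 3) -> tuple[Article, int]:
    """Call the model until its output validates; return result and attempt count."""
    message = prompt
    for attempt in range(1, max_attempts + 1):
        raw = model(message)
        try:
            return parse_article(raw), attempt
        except (ValueError, KeyError) as err:  # JSONDecodeError is a ValueError
            # The error is fed back, and the whole prompt is sent again.
            message = f"{prompt}\n\nPrevious output was invalid: {err}"
    raise RuntimeError(f"no valid output after {max_attempts} attempts")


# Hypothetical stand-in for a model: one invalid reply, then a valid one.
replies = iter([
    '{"tool": "structured_output", "arguments": {"title": "some text", "score": "high"}}',
    '{"tool": "structured_output", "arguments": {"title": "some text", "score": 0.82}}',
])
article, attempts = call_with_retry(lambda _msg: next(replies), "Summarize the post.")
print(article, attempts)
```

Each failed validation costs one more full model call, which is where the token waste and latency mentioned above come from.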

Structured Output test

OpenAI's structured output has the advantages below, and the third one stood out to me. The legacy chatbot code forced the output format through the prompt, perhaps because this feature was not known when it was written; with structured output there is no need for heavy-handed prompting just to keep the format.

 

Checking when structured output support was introduced: the tool-calling approach arrived around mid-to-late 2023, and for ProviderStrategy, OpenAI was first, starting with the gpt-4o model on August 6, 2024. Anthropic and Gemini then followed.

The LangChain stable release shipped in January 2024 and the legacy chatbot was developed from that point on, so after the initial build nobody tracked new features for close to a year and a half.

So let's check how much forcing the output through a prompt actually differs from shaping the output with structured output.

Test: forcing the output format with prompt engineering

system_prompt = """Your task is to generate only JSON that exactly matches the Pydantic model schema below.
You must reason step by step (Chain-of-Thought) internally to solve the problem.
However, never output that reasoning; the final output must be only JSON that fully matches the schema below.

Output format rules:
1. Output JSON only.
2. Never output any explanation, sentence, or extra text outside the JSON.
3. Every field must be included: name, age, address, phone_number
4. Field types must match the schema 100%.
   - name: string
   - age: integer
   - address: string
   - phone_number: string
5. Do not insert meaningless values such as null, None, or undefined; fill in actual values.
6. JSON key names must be identical to the schema; do not change case.
7. Do not output comments, markdown, or blank lines outside the JSON.
8. Never explain the examples; the final output must be JSON in the same format as the examples.

Pydantic model schema:

class Gender(str, Enum):
    male = "male"
    female = "female"
    other = "other"

class Address(BaseModel):
    street: str = Field(description="Street name and number")
    city: str = Field(description="City name")
    state: str = Field(description="State/Province")
    postal_code: str = Field(description="Postal/ZIP code")
    country: str = Field(description="Country name")

class UserProfile(BaseModel):
    name: str = Field(description="The user's full name")
    age: int = Field(description="The user's age")
    gender: Gender = Field(description="The user's gender")
    email: str = Field(description="The user's email address")
    phone_number: str = Field(description="The user's primary phone number")
    addresses: List[Address] = Field(description="List of user's addresses")
    date_of_birth: date = Field(description="The user's birth date")
    interests: List[str] = Field(default_factory=list, description="List of user's interests")
    is_active: bool = Field(default=True, description="Whether the user is active")
    bio: Optional[str] = Field(default=None, description="Short biography of the user")
    friends_ids: Optional[List[int]] = Field(default_factory=list, description="List of friend's user IDs")
    account_created: date = Field(description="Date when the user account was created")

[Input example 1]
The user is 27 years old and male.
The email address is taejung.park@example.com and the mobile number is 010-1234-5678.
There are two addresses: 123 Yeongdeungpo-ro, Yeongdeungpo-gu, Seoul and 456 Gangnam-daero, Gangnam-gu, Seoul.
The date of birth is May 14, 1996, and the interests are reading, movies, and hiking.
The active status is True, and the bio is "Hello, I work as a developer in Seoul."
The friend IDs are 101, 102, 103, and the account was created on August 1, 2020.

[Output example 1]
{
  "name": "Park Taejung",
  "age": 27,
  "gender": "male",
  "email": "taejung.park@example.com",
  "phone_number": "010-1234-5678",
  "addresses": [
    {
      "street": "123 Yeongdeungpo-ro",
      "city": "Seoul",
      "state": "Yeongdeungpo-gu",
      "postal_code": "07200",
      "country": "South Korea"
    },
    {
      "street": "456 Gangnam-daero",
      "city": "Seoul",
      "state": "Gangnam-gu",
      "postal_code": "06100",
      "country": "South Korea"
    }
  ],
  "date_of_birth": "1996-05-14",
  "interests": ["reading", "movies", "hiking"],
  "is_active": true,
  "bio": "Hello, I work as a developer in Seoul.",
  "friends_ids": [101, 102, 103],
  "account_created": "2020-08-01"
}


[Input example 2]
Hello. Here is the information for the user Kim Hana.
She is 30 years old and female.
Her email is kim.hana@example.com and her mobile number is 010-9876-5432.
There are two addresses: 11 Mia-ro, Gangbuk-gu, Seoul and 22 Sampyeong-dong, Bundang-gu, Seongnam-si, Gyeonggi.
The date of birth is September 10, 1993, and the interests are yoga, movies, and travel.
The active status is True, and the bio is "Hello, I am a freelance designer."
The friend IDs are 201, 202, 203, and the account was created on March 15, 2019.

[Output example 2]
{
  "name": "Kim Hana",
  "age": 30,
  "gender": "female",
  "email": "kim.hana@example.com",
  "phone_number": "010-9876-5432",
  "addresses": [
    {"street": "11 Mia-ro", "city": "Seoul", "state": "Gangbuk-gu", "postal_code": "01000", "country": "South Korea"},
    {"street": "22 Sampyeong-dong", "city": "Seongnam-si", "state": "Bundang-gu", "postal_code": "13500", "country": "South Korea"}
  ],
  "date_of_birth": "1993-09-10",
  "interests": ["yoga", "movies", "travel"],
  "is_active": true,
  "bio": "Hello, I am a freelance designer.",
  "friends_ids": [201, 202, 203],
  "account_created": "2019-03-15"
}



Referring to all the rules and examples above, from now on output only JSON that fully matches the Pydantic UserInfo schema, whatever the input.
Use the reasoning internally only and never expose it."""

 

Test: passing a Pydantic schema to structured output

Even the official documentation says that structured output "can make mistakes" and that you should "describe the schema as well as you can." So when the LLM has to classify something or its output must be forced into a format, I recommend using Pydantic.

For a simple prompt, both approaches produce good output.

Now let's test a scenario closer to real work. The data an LLM ingests can be more complex than expected. In particular, when several DTOs are nested, the overall DTO grows quickly. Using three DTOs as an example, let's check whether the data is parsed correctly when it is given as natural language rather than JSON.

from datetime import date
from enum import Enum
from typing import List, Optional

from pydantic import BaseModel, Field


class Gender(str, Enum):
    male = "male"
    female = "female"
    other = "other"


class Address(BaseModel):
    street: str = Field(description="Street name and number")
    city: str = Field(description="City name")
    state: str = Field(description="State/Province")
    postal_code: str = Field(description="Postal/ZIP code")
    country: str = Field(description="Country name")


class UserProfile(BaseModel):
    name: str = Field(description="The user's full name")
    age: int = Field(description="The user's age")
    gender: Gender = Field(description="The user's gender")
    email: str = Field(description="The user's email address")
    phone_number: str = Field(description="The user's primary phone number")
    addresses: List[Address] = Field(description="List of user's addresses")
    date_of_birth: date = Field(description="The user's birth date")
    interests: List[str] = Field(default_factory=list, description="List of user's interests")
    is_active: bool = Field(default=True, description="Whether the user is active")
    bio: Optional[str] = Field(default=None, description="Short biography of the user")
    friends_ids: Optional[List[int]] = Field(default_factory=list, description="List of friend's user IDs")
    account_created: date = Field(description="Date when the user account was created")

 

The input was as follows.

Please turn the information for a user named Park Junho into JSON.
He is 24 years old and male, his email is park.junho@example.com,
and his phone number is 010-1111-2222.
His addresses are 5 Marine City, Haeundae-gu, Busan and 88 Beomeo-ro, Suseong-gu, Daegu.
His birthday is December 1, 2000, and his interests are gaming, coding, and soccer.
The user is inactive (False), and the bio is: a university student who dreams of becoming a game developer.
The friend IDs are 301 and 302, and the account was created on June 20, 2021.

 

 

Forcing the shape of complex structured data through the prompt also parses mostly well. Looking at the result, though, postal_code contains data that was not in the input.

So what about the query that uses structured output?

It parses just as well. Even as the DTO gets more complex, a good model parses nearly all of it. There was one difference, however: with the with_structured_output method, postal_code is left empty, whereas with the prompt-forced approach it is filled with dummy data even though the input contains no postal code. That is probably not surprising: rule 5 of the prompt forbids empty values, and both few-shot examples include postal codes that do not appear in their inputs.
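
One way to catch this kind of filled-in value automatically is to check whether each string in the output actually occurs in the input text. The sketch below is a rough illustration under assumptions: `ungrounded_values` is a hypothetical helper, the two records are made-up stand-ins for the two outputs (the dummy code "48000" is invented for the example), and exact substring matching is deliberately crude. It would also flag legitimately normalized values such as an enum or a reformatted date.

```python
def ungrounded_values(source_text: str, record: dict, path: str = "") -> list[str]:
    """Return paths of non-empty string values that never appear in source_text."""
    missing = []
    for key, value in record.items():
        here = f"{path}.{key}" if path else key
        if isinstance(value, dict):
            missing += ungrounded_values(source_text, value, here)
        elif isinstance(value, list):
            for i, item in enumerate(value):
                if isinstance(item, dict):
                    missing += ungrounded_values(source_text, item, f"{here}[{i}]")
                elif isinstance(item, str) and item and item not in source_text:
                    missing.append(f"{here}[{i}]")
        elif isinstance(value, str) and value and value not in source_text:
            missing.append(here)
    return missing


source = "Junho Park lives at Marine City 5, Haeundae-gu, Busan."

# Hypothetical outputs: one with an invented postal code, one left empty.
prompt_forced = {"name": "Junho Park", "addresses": [
    {"street": "Marine City 5", "city": "Busan",
     "state": "Haeundae-gu", "postal_code": "48000"}]}
structured = {"name": "Junho Park", "addresses": [
    {"street": "Marine City 5", "city": "Busan",
     "state": "Haeundae-gu", "postal_code": ""}]}

flagged_prompt = ungrounded_values(source, prompt_forced)
flagged_structured = ungrounded_values(source, structured)
print(flagged_prompt)
print(flagged_structured)
```

A check like this does not replace schema validation; it only surfaces fields worth a second look.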

 

Can Structured Output be trusted?

From what we have seen so far, structured output parses slightly better. It is also much more concise, so using the structured output feature looks clearly preferable to forcing the format through the prompt.

As of December 2, 2025:

https://llm-stats.com/

 

AI Leaderboards 2025 - Compare All AI Models


 

Many local models now outperform the gpt-4o model used in this experiment, so I expect locally hosted models of around 30B parameters to work well too. From a pure performance standpoint, then, arguing that one strategy is superior does not mean much.

Even so, the tool-calling strategy can trigger retries often, so if you are in an environment where an API with structured output support is available, I think you should try to go the ProviderStrategy route.

So we now know that structured output is better than forcing the output structure through the prompt. The decisive question remains: can structured output be trusted? I recently read the article below.

https://www.philschmid.de/why-engineers-struggle-building-agents

 

Why (Senior) Engineers Struggle to Build AI Agents


 

The article starts from the claim that senior developers are slower than junior developers at building AI agents, and the reason behind it deserves to be taken somewhat philosophically.

The reason is that the philosophy and habits of traditional software engineering (strict control, determinism), that is, "right is right, wrong is wrong, and wrong must be fixed," get in the way of AI agent development. The author, Philipp Schmid, argues that the more senior engineers are, the more they try to remove the LLM's uncertainty with code, which makes them slower than juniors.

As I read it, forcing the context of text data into a structured form keeps the LLM from doing what it is good at, performance drops, and trying to code away the cause of that drop leads into a quagmire.

So the author suggests adopting the following mindset when building agents.

 

  1. Text is the new state
    • Trap: forcing natural-language input into structured data (e.g. true/false) loses context.
    • Fix: keep feedback (e.g. "Approved, focus on the US market") as text so behavior can be adjusted dynamically.
  2. Hand over control
    • Trap: hard-coding the flow (e.g. a subscription-cancellation route) fails on non-linear interactions.
    • Fix: trust the agent (LLM) to judge intent from context.
  3. Errors are just inputs
    • Trap: halting the program on an error (the traditional way) wastes an expensive run.
    • Fix: feed the error back so the agent can attempt to recover on its own.
  4. From unit tests to evals
    • Trap: binary tests (TDD) are meaningless for a probabilistic system with infinitely many valid answers.
    • Fix: manage variability through reliability (Pass@k), quality (LLM judge), and tracing (evals).
  5. Agents evolve, APIs don't
    • Trap: human-centric APIs (implicit context) cause agents to hallucinate.
    • Fix: make things explicit with detailed semantic typing (e.g. "user_email_address") and docstrings. Agents can adapt to changes in their tools.
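
The list mentions Pass@k as a reliability measure without saying how it is computed. A minimal sketch, assuming the commonly used estimator 1 - C(n-c, k) / C(n, k), where c of n recorded runs passed; the function name `pass_at_k` and the sample numbers are illustrative, not taken from the article.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k sampled runs passes,
    given that c of n recorded runs passed."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        return 1.0  # fewer than k failures: any k samples include a pass
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical eval: 10 runs of the same structured-output task, 3 valid.
single_try = pass_at_k(n=10, c=3, k=1)
five_tries = pass_at_k(n=10, c=3, k=5)
print(single_try, five_tries)
```

Tracking a number like this per schema over time says more about "how much can it be trusted" than a single pass/fail unit test.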

 

The conclusion is to accept that the output of this kind of engineering is probabilistic: rather than forcing every edge case in code, build a resilient system in which the LLM can give itself feedback on those cases too, and manage that process.

Coming back to the point, the answer to "can structured output be trusted?" seems to be close to "yes." More importantly, the question to focus on is not whether it can be trusted but how much.

The feature works well for the most part (with gpt-4o or better models), so it seems best to test it in your own production environment, determine whether the edge cases are manageable, and then use it.

Opinions on this will probably differ a lot from person to person. Please leave other views in the comments!

  1. Introduction

Python 3.12 introduces a number of new features and improvements that raise developer productivity and code quality. Notable additions include improved error messages, more powerful f-strings, faster execution, a dedicated type variable syntax, and support for the Linux perf profiler.
Real Python

  2. Installation steps

  2.1 Move to the /tmp directory

First, move to the /tmp directory.

cd /tmp/

  2.2 Download and extract Python 3.12.0

Download the Python 3.12.0 source and extract it.

wget https://www.python.org/ftp/python/3.12.0/Python-3.12.0.tgz
tar -xzvf Python-3.12.0.tgz
cd Python-3.12.0/

  2.3 Install the build dependencies

Install the packages needed to build Python.

sudo apt update
sudo apt install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev pkg-config

  2.4 Configure and build Python

Configure and build Python.

./configure --enable-optimizations
make -j $(nproc)

  2.5 Install Python

Install Python on the system. altinstall installs the python3.12 binary without overwriting the system python3.

sudo make altinstall

  2.6 Add a symbolic link

Add a symbolic link to the installed Python executable. /usr/local/bin is normally owned by root, so this needs sudo.

sudo ln -s /usr/local/bin/python3.12 /usr/local/bin/python
ls -al /usr/local/bin/python

  2.7 Verify the installation

Check the installed Python version.

python -V

  3. Wrap-up

The desired Python version is now installed and a symbolic link points to its executable. The Python 3.12 environment is ready.


In LLM (Large Language Model) application development with LangChain, the basic LLM chain is the most central concept. It is a simple yet powerful structure that takes the user's input (the prompt) and produces the desired response through an LLM. It underlies a wide range of LLM-based applications, from conversational AI to automatic document summarization.


1. Core components of a basic LLM chain

A basic LLM chain consists of two main elements.

  • Prompt: the instruction that tells the LLM what task to perform. It can take many forms, such as a question, a command, or a sentence that supplies specific context, and it is the most important factor in the quality of the LLM's response. An effective prompt steers the LLM to respond accurately in the intended direction.
  • LLM (Large Language Model): a large language model such as GPT-3.5, GPT-4, or Gemini. Trained on vast amounts of text, it can understand language and generate new text. Its role is to analyze the prompt and, drawing on what it has learned, carry out the requested task or provide relevant information.

2. How it works

A basic LLM chain works as follows.

  1. Prompt creation: write a prompt that defines the user's requirements or the task to perform. The prompt can be tuned to include clear instructions and context so the LLM responds more accurately.
  2. LLM processing: the prompt is passed to the LLM, which analyzes it and generates a response using its learned knowledge and patterns.
  3. Response return: the response generated by the LLM is delivered to the user. It can take many forms, such as a simple answer, summarized information, or generated text.

3. Hands-on examples: building an LLM chain with LangChain

Let's now look at real code for building a basic LLM chain in LangChain.

Example 1: a plain LLM call

The most basic approach: use the ChatOpenAI class to pass a prompt directly to an OpenAI model.

from langchain_openai import ChatOpenAI

# Create the LLM instance
llm = ChatOpenAI(model="gpt-4o-mini")

# Pass the prompt directly to the model and run it
llm.invoke("What is Earth's rotation period?")

When this code runs, the llm object receives the question "What is Earth's rotation period?", generates an answer, and returns it as an AIMessage object.

AIMessage(content="Earth's rotation period is about 23 hours 56 minutes 4 seconds. This is called a sidereal day. The 24-hour day we usually refer to is the solar day: because Earth orbits the Sun while it rotates, a day measured relative to the Sun comes out to 24 hours.")

Example 2: using a prompt template

This time, for a more systematic approach, let's use a prompt template. A prompt template defines the shape of the prompt in advance and fills in only the variable parts when it is used.

You can create one with the ChatPromptTemplate.from_template() method. The example below is a template that gives the LLM the role of an "astronomy expert" and instructs it to answer the question.

from langchain_core.prompts import ChatPromptTemplate

# Define the prompt template
prompt = ChatPromptTemplate.from_template(
    "You are an expert in astronomy. Answer the question. <Question>: {input}"
)

# Inspect the template object
prompt

The result shows that a prompt object was created that takes a variable named input, as indicated by input_variables=['input'].

ChatPromptTemplate(input_variables=['input'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='You are an expert in astronomy. Answer the question. <Question>: {input}'))])

Example 3: building a chain with LCEL

LangChain Expression Language (LCEL) uses the pipe (|) operator to connect a prompt, a model, and an output parser into a single chain.

The following code connects the prompt and llm defined earlier and finally uses StrOutputParser to convert the LLM's response into a plain string.

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Define the prompt, model, and output parser
prompt = ChatPromptTemplate.from_template("You are an expert in astronomy. Answer the question. <Question>: {input}")
llm = ChatOpenAI(model="gpt-4o-mini")
output_parser = StrOutputParser()

# Connect the chain with LCEL
chain = prompt | llm | output_parser

# Invoke the chain
chain.invoke({"input": "What is Earth's rotation period?"})

When this code runs, the prompt template first completes the question, the LLM generates an answer, and finally StrOutputParser converts that answer to plain text and returns it.

Earth's rotation period is about 24 hours. It plays an important role in determining the length of a day.
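
To make the pipe composition less magical, here is a toy sketch of how an `|` operator can chain steps by overloading `__or__`. It only illustrates the idea: `Step` and `fake_llm` are hypothetical names, no model is called, and this is not how LangChain's own classes are implemented.

```python
class Step:
    """A toy stand-in for a chain component; not LangChain's Runnable."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that runs a, then feeds its result to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


prompt = Step(lambda d: "You are an expert in astronomy. Answer the question. "
                        f"<Question>: {d['input']}")
fake_llm = Step(lambda text: {"content": f"(answer to: {text})"})  # hypothetical model
output_parser = Step(lambda message: message["content"])

chain = prompt | fake_llm | output_parser
result = chain.invoke({"input": "What is Earth's rotation period?"})
print(result)
```

The same shape explains the data flow in Example 3: a dict goes in, a prompt string reaches the model, a message comes back, and the parser extracts its text.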


Category: dictionary

On the surface this is a problem solved with a dictionary, but there are two more issues to deal with.

1. Finding a key by its value.

2. Fixing the time-limit-exceeded problem caused by input()


keypoint: Python input / finding a key by value


code

import sys

# Read every line with sys.stdin.readline; input() is too slow here.
n, m = map(int, sys.stdin.readline().split())

pokemon_by_number = {}  # "1" -> name
number_by_name = {}     # name -> "1" (reverse index for value -> key lookups)
for number in range(1, n + 1):
    name = sys.stdin.readline().strip()
    pokemon_by_number[str(number)] = name
    number_by_name[name] = str(number)

for _ in range(m):
    query = sys.stdin.readline().strip()
    if query.isdigit():
        print(pokemon_by_number[query])
    else:
        print(number_by_name[query])

Key points

1. Finding a key by value

[Python] How to find a key by value in a Python dictionary

star7sss.tistory.com

See the post above. The conclusion is that the only way to look up a key directly from a value is a full scan with a for loop, which is why the solution above keeps a second dictionary mapping names back to numbers.
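
The trade-off can be sketched as follows: a linear scan costs O(n) per query, while inverting the dictionary once gives average O(1) lookups afterwards. The sample names and the helper `key_for_value` are made up for illustration, and the inversion assumes the values are unique and hashable.

```python
names = ["Bulbasaur", "Ivysaur", "Venusaur"]  # hypothetical sample data
by_number = {str(i): name for i, name in enumerate(names, start=1)}


def key_for_value(mapping: dict, target):
    """Linear scan: O(n) per lookup."""
    for key, value in mapping.items():
        if value == target:
            return key
    return None


# Inverting once costs O(n); after that each lookup is O(1) on average.
by_name = {name: number for number, name in by_number.items()}

scanned = key_for_value(by_number, "Ivysaur")
indexed = by_name["Ivysaur"]
print(scanned, indexed)
```

With up to n names and m queries, the scan does on the order of n * m comparisons, while the reverse dictionary keeps the whole run roughly linear.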

 

2. Why is input() slower than sys.stdin.readline().strip()?

input() is Python's built-in function for user input, and it has the following characteristics.

1. It returns the entered value as a string and automatically strips the trailing newline.

2. It can take a prompt message as an argument.

This extra per-call work (handling the prompt and stripping the newline) adds up when it is repeated for many lines of input.

By contrast, readline() returns the string with the newline still attached, which accounts for part of the time difference; with readline you can call strip() to remove the newline yourself.

The following code shows the time difference between the two.

 

import sys
import time

# Expects at least 200,000 lines on stdin (100,000 per loop).

# Using sys.stdin.readline()
start = time.time()
for _ in range(100000):
    line = sys.stdin.readline().strip()
end = time.time()
print(f'sys.stdin.readline() elapsed: {end - start} s')

# Using input()
start = time.time()
for _ in range(100000):
    line = input()
end = time.time()
print(f'input() elapsed: {end - start} s')

 

 

Time to process 100000 lines of input:
input(): 12.3456 s
sys.stdin.readline(): 0.4567 s

The difference is enormous (exact figures depend on the environment). So when you hit a time limit in Python, try replacing input() with sys.stdin.readline().strip().

 


import sys


def input():
    # Read one line from stdin and strip surrounding whitespace
    return sys.stdin.readline().strip()

This way the existing input() calls keep working without any other code changes.
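
The replacement has to be a function that reads a new line on every call; binding the result of a single readline() call would keep returning the same line. A small sketch that makes this checkable without piped input, assuming a hypothetical helper `make_fast_input` and an io.StringIO object standing in for sys.stdin:

```python
import io


def make_fast_input(stream):
    """Return an input()-like function that reads one stripped line from stream."""
    readline = stream.readline  # look the bound method up once

    def fast_input() -> str:
        return readline().strip()

    return fast_input


# io.StringIO stands in for sys.stdin so the sketch runs on its own.
fake_stdin = io.StringIO("3 2\nBulbasaur\nIvysaur\n")
read = make_fast_input(fake_stdin)

n, m = map(int, read().split())
first = read()
print(n, m, first)
```

In an actual submission you would pass sys.stdin instead of the StringIO object. Unlike the built-in input(), this version returns an empty string at end of input rather than raising EOFError.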
