[Deep Learning for Everyone Season 2][Lab 5] Logistic Regression

Deep Learning for Everyone Season 2 2021. 10. 25. 19:43

In this post, let's look at Logistic Regression.

Binary Classification

- Classifying an input value into one of two categories. ex) true or false?

Logistic vs Linear

- Logistic: the data falls into clearly separated classes.

- Linear: continuous, numeric data. ex) body weight, temperature

Logistic_Y = [[0], [0], [0], [1], [1], [1]]    # binary class labels (0 or 1)
Linear_Y = [828.655, 833.450, 819.239, 828.349, 831.659]   # continuous numeric values

Hypothesis Representation

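Feeding in the data first produces the value of a linear function. The Logistic function then maps this linear value into the interval between 0 and 1, and a Decision Boundary classifies it as 0 or 1, which becomes the output. This is the overall flow of Logistic Regression.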
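The flow described above (linear value, then logistic function, then decision boundary) can be sketched in plain Python. This is only an illustration: the weight `w`, the bias `b`, the 0.5 threshold, and the function names are hypothetical choices, not values from the lecture.

```python
import math

def linear(x, w, b):
    # Step 1: linear function of the input
    return w * x + b

def sigmoid(z):
    # Step 2: squash the linear value into the interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b, threshold=0.5):
    # Step 3: apply the decision boundary to get a class label (0 or 1)
    return 1 if sigmoid(linear(x, w, b)) >= threshold else 0

# Hypothetical parameters chosen so that the boundary sits at x = 3.5
w, b = 2.0, -7.0
labels = [predict(x, w, b) for x in [1, 2, 3, 4, 5, 6]]
print(labels)  # [0, 0, 0, 1, 1, 1]
```

With these parameters the linear value changes sign between x = 3 and x = 4, so the first three inputs are classified as 0 and the last three as 1.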

Sigmoid(Logistic) function g(z)

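Let's look at the Sigmoid function, g(z) = 1 / (1 + e^-z), which is used to map a linear value into the interval between 0 and 1.

As z grows larger, e^-z converges to 0, so g(z) converges to 1.

As z grows smaller (more negative), e^-z diverges to ∞, so g(z) converges to 0.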

hypothesis = tf.sigmoid(z)
hypothesis = tf.divide(1., 1. + tf.exp(-z))  # the same function written out explicitly
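To see the two limits numerically, here is a small plain-Python sketch (not the TensorFlow call above). The helper name `sigmoid` and the branch on the sign of z, which keeps `exp` from overflowing for very negative z, are additions made for this example.

```python
import math

def sigmoid(z):
    # Numerically stable form: exp() is only ever called on a non-positive number
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# g(z) approaches 0 on the left, is exactly 0.5 at z = 0, and approaches 1 on the right
for z in [-1000.0, -10.0, 0.0, 10.0, 1000.0]:
    print(z, sigmoid(z))
```

The naive form 1 / (1 + exp(-z)) would raise an OverflowError at z = -1000 in plain Python, which is why the sketch branches on the sign.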

Decision Boundary

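The Decision Boundary splits the output into two regions, and the predicted class depends on which side of the boundary a value falls.

The Decision Boundary can be a specific value, a Linear function, or a non-Linear function.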
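As an illustration of a linear and a non-linear boundary, the sketch below uses two hypothetical classifiers: one whose boundary is the line x1 + x2 = 3, and one whose boundary is the circle x1^2 + x2^2 = 4. With a 0.5 threshold on the sigmoid output, predicting 1 is the same as z >= 0, so the shape of the boundary is determined entirely by the function that produces z. All coefficients and points here are made up for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def linear_z(x1, x2):
    # z = 0 on the straight line x1 + x2 = 3
    return x1 + x2 - 3.0

def circular_z(x1, x2):
    # z = 0 on the circle of radius 2 centred at the origin
    return x1 ** 2 + x2 ** 2 - 4.0

def classify(z, threshold=0.5):
    # sigmoid(z) >= 0.5 exactly when z >= 0
    return 1 if sigmoid(z) >= threshold else 0

points = [(0.0, 0.0), (2.5, 0.0), (2.0, 2.0)]
linear_labels = [classify(linear_z(x1, x2)) for x1, x2 in points]
circular_labels = [classify(circular_z(x1, x2)) for x1, x2 in points]
print(linear_labels)    # [0, 0, 1]
print(circular_labels)  # [0, 1, 1]
```

The point (2.5, 0) lies below the line but outside the circle, so the two boundaries disagree on it.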

Cost Function

The goal is to find the optimal parameter values for θ (the weights). In other words, we need to minimize the cost.

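The cost function is as follows, where h_θ(x) is the sigmoid hypothesis, y is the label (0 or 1), and the average is taken over the m training examples:

cost(θ) = -(1/m) Σ [ y log(h_θ(x)) + (1 - y) log(1 - h_θ(x)) ]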

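To reliably find the lowest point of the cost, the cost function needs to be convex. (See Lab 4)

Until the previous lab, we simply used the mean of the squared difference between the hypothesis and the actual value as the cost function. In Logistic Regression, however, passing the hypothesis through the sigmoid makes that cost a bumpy, non-convex curve. So we apply a log to make the cost function convex.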
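One way to check this claim numerically is the midpoint test: a convex function f satisfies f((a + b) / 2) <= (f(a) + f(b)) / 2 for any a and b. The sketch below applies the test to a single hypothetical training example (x = 1, y = 1) with one weight w and no bias. The points a = -6 and b = -2 are picked by hand to expose the problem with the squared-error cost.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def squared_error_cost(w, x=1.0, y=1.0):
    # Squared error applied to the sigmoid output
    return (sigmoid(w * x) - y) ** 2

def log_cost(w, x=1.0, y=1.0):
    # Log-based (cross-entropy) cost for the same example
    h = sigmoid(w * x)
    return -(y * math.log(h) + (1.0 - y) * math.log(1.0 - h))

def passes_midpoint_test(f, a, b):
    # A convex function never lies above its chord at the midpoint
    return f((a + b) / 2.0) <= (f(a) + f(b)) / 2.0

a, b = -6.0, -2.0
squared_ok = passes_midpoint_test(squared_error_cost, a, b)
log_ok = passes_midpoint_test(log_cost, a, b)
print(squared_ok, log_ok)  # False True
```

A single passing test does not prove convexity. It only shows that the squared-error cost fails at a place where the log cost does not.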

[TensorFlow Code]
import tensorflow as tf

def loss_fn(hypothesis, labels):
    # Mean binary cross-entropy (log loss) over all examples
    cost = -tf.reduce_mean(labels * tf.math.log(hypothesis) + (1 - labels) * tf.math.log(1 - hypothesis))
    return cost
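The same cost can be written in plain Python to see how it behaves. One practical detail not covered above: if the hypothesis is exactly 0 or 1, log(0) is undefined, so this sketch clips predictions to [eps, 1 - eps]. The clipping constant `eps` and the function name `binary_cross_entropy` are assumptions made for the example.

```python
import math

def binary_cross_entropy(hypothesis, labels, eps=1e-7):
    total = 0.0
    for h, y in zip(hypothesis, labels):
        h = min(max(h, eps), 1.0 - eps)  # keep log() away from 0
        total += y * math.log(h) + (1.0 - y) * math.log(1.0 - h)
    return -total / len(labels)

labels = [0.0, 0.0, 1.0, 1.0]
good = binary_cross_entropy([0.1, 0.2, 0.8, 0.9], labels)     # mostly right
bad = binary_cross_entropy([0.9, 0.8, 0.2, 0.1], labels)      # mostly wrong
extreme = binary_cross_entropy([1.0, 1.0, 0.0, 0.0], labels)  # confidently wrong
print(good, bad, extreme)
```

The cost grows as predictions move away from the labels, and thanks to the clipping it stays finite even for predictions that are confidently wrong.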

Optimization

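How can we minimize the cost?

As we learned in Lab 4, we keep moving θ a little at a time in the direction opposite to the gradient, so that the cost decreases and the magnitude of the slope approaches zero.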

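# Assumes W and b are tf.Variable objects and features/labels are float32 tensors defined elsewhere.
def grad(features, labels):
    with tf.GradientTape() as tape:
        # Compute the hypothesis inside the tape so the gradient can flow back to W and b
        hypothesis = tf.sigmoid(tf.matmul(features, W) + b)
        loss_value = loss_fn(hypothesis, labels)
    return tape.gradient(loss_value, [W, b])

optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)  # TF 2.x replacement for tf.train.GradientDescentOptimizer
grads = grad(features, labels)
optimizer.apply_gradients(grads_and_vars=zip(grads, [W, b]))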
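Putting the pieces together, here is a minimal gradient descent loop in plain Python, with the gradients written out by hand instead of using `tf.GradientTape`. For the log cost with a sigmoid hypothesis, the gradient with respect to w is the mean of (h - y) * x and the gradient with respect to b is the mean of (h - y). The inputs `xs`, the learning rate of 0.2, and the 20,000 steps are hypothetical choices for this toy problem (the rate is larger than the 0.01 used above so that the example converges quickly).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost(w, b, xs, ys):
    # Mean log (cross-entropy) cost over the data set
    total = 0.0
    for x, y in zip(xs, ys):
        h = sigmoid(w * x + b)
        total += y * math.log(h) + (1.0 - y) * math.log(1.0 - h)
    return -total / len(xs)

def gradients(w, b, xs, ys):
    # d(cost)/dw = mean((h - y) * x), d(cost)/db = mean(h - y)
    errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
    grad_w = sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = sum(errors) / len(xs)
    return grad_w, grad_b

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical inputs
ys = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]  # binary labels
w, b = 0.0, 0.0
learning_rate = 0.2

initial_cost = cost(w, b, xs, ys)
for step in range(20000):
    grad_w, grad_b = gradients(w, b, xs, ys)
    # Move each parameter a small step against its gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
final_cost = cost(w, b, xs, ys)

predictions = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in xs]
print(initial_cost, final_cost, predictions)
```

Starting from w = b = 0, every prediction is 0.5, so the initial cost is log 2. After training, the cost is lower and the thresholded predictions match the labels.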

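Understanding logistic regression up to this point also gives an overall picture of the components of an artificial neural network.

When x is fed in, the weight w produces the linear value wx. An activation function then turns that value into an output between 0 and 1, which is classified as 1 or 0.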

Source

[Deep Learning for Everyone Season 2] - YouTube link

☞ https://www.youtube.com/watch?v=qPMeuL2LIqY&list=PLQ28Nx3M4Jrguyuwg4xe9d9t2XE639e5C&index=2
