[DL] Keras 이용해서 DL 모델 만들기

대학원 합격 예측 딥러닝 모델

Posted Sep 27, 2022

By aijinsol

3 min read

Keras를 이용하면 딥러닝 모델을 손쉽게 만들 수 있다.

💡 (요약) 딥러닝 모델 만들기

데이터 전처리
모델 빌딩 & 학습
새로운 데이터 예측

💡 (참고) 딥러닝 모델 성능 향상

데이터 전처리
파라미터 튜닝 (layer 갯수, node 갯수, activation 함수 등)
데이터 양 늘리기

대학원 합격 예측 프로젝트

Code

  
import pandas as pd
import tensorflow as tf


# Pre-process
data = pd.read_csv('gpascore.csv')
data = data.dropna()
y = data['admit'].values

x = []
for i, rows in data.iterrows():
    x.append([rows['gre'], rows['gpa'], rows['rank']])

# Model
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='tanh'),
    tf.keras.layers.Dense(128, activation='tanh'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Train
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(np.array(x), y, epochs=1000)

# Predict
pred = model.predict( [ [800, 4.20, 1], [300, 1.2, 2] ] )

1. Dataset

column: 4 (admit, gre, gpa, rank)
row: 427
admit=0(불합격) / admit=1(합격)

(table) Dataset example

admit	gre	gpa	rank
0	380	3.21	3
1	660	3.67	3
1	800	4	1
…	…	…	…

2. Data Preprocessing

1) Null value

  
import pandas as pd

data = pd.read_csv('gpascore.csv')

# check null value
print(data.isnull().sum())

admit    0
gre      1
gpa      0
rank     0
dtype: int64

  
# drop null-included rows
data = data.dropna()

print(data.isnull().sum())

admit    0
gre      0
gpa      0
rank     0
dtype: int64

2) X, Y data

  
y = data['admit'].values
print(y)
print(type(y))

  
[0 1 1 1 0 1 1 ...1 1]
<class 'numpy.ndarray'>

  
x = []

for i, rows in data.iterrows():
    x.append([rows['gre'], rows['gpa'], rows['rank']])

print(x)
print(type(x))

  
[[380.0, 3.21, 3.0], [660.0, 3.67, 3.0], ..., [760.0, 3.76, 2.0], [710.0, 3.82, 3.0]]
<class 'list'>

3. Build model & train

  
import tensorflow as tf

# 1. model
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='tanh'),
    tf.keras.layers.Dense(128, activation='tanh'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# 2. model compile
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# 3. model training (weights updating)
model.fit(np.array(x), y, epochs=1000)

  
Epoch 1/1000
 9/14 [==================>...........] - ETA: 0s - loss: 0.7727 - accuracy: 0.4792
...
Epoch 999/1000
14/14 [==============================] - 0s 7ms/step - loss: 0.5708 - accuracy: 0.6965
Epoch 1000/1000
14/14 [==============================] - 0s 6ms/step - loss: 0.5710 - accuracy: 0.7082
<keras.callbacks.History at 0x17aab7280>

model.fit에 x, y 데이터를 넣을 때 list 형태가 아닌 numpy array 또는 tf.tensor 자료형으로 변환해서 넣어야한다. list 형태로 넣으면 에러가 뜬다.
- np.array(데이터) : list to array 변환 함수
y 데이터가 0과 1로 되어있으므로, 손실함수로 binary_crossentropy를 사용하고, 마지막 레이어에는 sigmoid 활성함수를 사용한다.

4. Predict

  
pred = model.predict( [ [800, 4.20, 1], [300, 1.2, 2] ] )
print(pred)

  
1/1 [==============================] - 0s 20ms/step
[[0.77883834]
 [0.00869057]]

Reference

코딩애플 - 문과여도 상관없다 Tensorflow 딥러닝 AI 기초부터 실무까지

DeepLearning

This post is licensed under CC BY 4.0 by the author.