
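[Korea Meteorological Administration (기상청) Contest] Developing a model to measure ship accident risk from sea-surface weather conditions - 11 (applying the TensorFlow softmax function)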

2018. 7. 25. 12:26

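The goal is to find out how likely an accident is for a given sea state, so I am trying every method that produces a probability as its output.

The trouble is that it is not going as well as I expected.

The first submission is due on the 26th, so I am almost out of time, but until then I plan to keep reworking the classification algorithms so that they output probabilities.

Today I used the softmax function. The source code is below.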

========================= Python =========================

import tensorflow as tf
import numpy as np

# The last column of each CSV is the label; the other 8 columns are features.
xy1 = np.loadtxt("D:/deep1/projectdata/sibala1.csv", delimiter=",")  # actual data
xy = np.loadtxt("D:/deep1/projectdata/sixx/West-Central2.csv", delimiter=",")  # training data

x_data = xy[:, 0:-1]    # training features, 8 columns
y_data = xy[:, [-1]]    # training labels, 1 column, 2 classes (0 or 1)
x1_data = xy1[:, 0:-1]  # actual features (taken from xy1 so they match y1_data)
y1_data = xy1[:, [-1]]  # actual labels

data_columns = 8
one_hot_classes = 2

# Hard-coded initial weights, shape [8, 2], and biases, shape [2].
Ww = [[-0.38062394, -1.1824702],
      [-0.58688945, -0.08159877],
      [0.0117368, 0.15746553],
      [-0.27620426, 1.167679],
      [0.35279995, 0.16170044],
      [-0.4119897, -0.20874338],
      [-1.8185207, 0.66683036],
      [0.6255578, 0.62470055]]
Bb = [0.4964638, 0.50644296]

X = tf.placeholder(tf.float32, shape=[None, data_columns])  # [n, 8]
Y = tf.placeholder(tf.int32, shape=[None, 1])               # [n, 1]

Y_one_hot = tf.one_hot(Y, one_hot_classes)                         # 3D array, shape [n, 1, 2]
Y_one_hot = tf.reshape(Y_one_hot, [-1, one_hot_classes])           # 2D array, shape [n, 2]

# Random initialisation, kept for reference; the fixed values above are used instead.
# W = tf.Variable(tf.random_normal([data_columns, one_hot_classes]), name='weight')
# b = tf.Variable(tf.random_normal([one_hot_classes]), name='bias')
W = tf.Variable(Ww, name='weight')  # [8, 2]
b = tf.Variable(Bb, name='bias')    # [2]

logits = tf.matmul(X, W) + b
hypothesis = tf.nn.softmax(logits)  # class probabilities, each row sums to 1

cost_i = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=Y_one_hot)
cost = tf.reduce_mean(cost_i)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)

prediction = tf.argmax(hypothesis, 1)
correct_prediction = tf.equal(prediction, tf.argmax(Y_one_hot, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for step in range(1001):
        sess.run(train, feed_dict={X: x_data, Y: y_data})
        if step % 1000 == 0:
            loss, acc, W_val, b_val = sess.run([cost, accuracy, W, b], feed_dict={X: x_data, Y: y_data})
            print("Step: {:5}\tLoss: {:.3f}\tAcc: {:.2%}".format(step, loss, acc))
            print(W_val, b_val)

    # Predicted class and class probabilities on the actual data.
    pred, hy_val = sess.run([prediction, hypothesis], feed_dict={X: x1_data, Y: y1_data})

total = 0    # number of rows (renamed from `all` to avoid shadowing the builtin)
acci = 0     # rows whose true label is 1 (accident)
predict = 0  # accident rows that were predicted correctly
test = 0     # all rows that were predicted correctly

for p, y, i in zip(pred, y1_data.flatten(), range(len(hy_val))):
    total += 1
    if int(y) == 1:
        acci += 1
        if p == int(y):
            predict += 1
    if p == int(y):
        test += 1
    print("[{}] Prediction: {} True Y: {}".format(p == int(y), p, int(y)), hy_val[i])

print("accu all : {:.3f}".format(test / total))
# Share of true accidents that the model caught; NaN if there are no accident rows.
print("accu predict : {:.3f}".format(predict / acci if acci else float("nan")))

========================= Python =========================
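The `hypothesis` line in the script is the step that turns the raw scores in `logits` into probabilities. As a rough illustration of what `tf.nn.softmax(tf.matmul(X, W) + b)` works out for a single row, here is a minimal sketch in plain Python. The names `softmax`, `predict_proba`, `W_demo`, and `b_demo` and the two-feature input are made up for the example; they are not the contest data or the weights above.

```python
import math

def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Subtracting the max keeps exp() from overflowing; the result is unchanged.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(x, W, b):
    """Compute softmax(x @ W + b) for one sample x given as a list of features."""
    n_classes = len(b)
    logits = [
        sum(x[i] * W[i][k] for i in range(len(x))) + b[k]
        for k in range(n_classes)
    ]
    return softmax(logits)

# Hypothetical 2-feature, 2-class weights (illustration only).
W_demo = [[1.0, -1.0],
          [0.5, 0.5]]
b_demo = [0.0, 0.0]

probs = predict_proba([2.0, 1.0], W_demo, b_demo)
print(probs)  # two numbers that sum to 1; the first is the larger one
```

With two classes, the second probability plays the role of the "accident probability" if class 1 is the accident label, which is how the labels are used in the script.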

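After optimization I got the loss down to 0.7, but that is still high for a cost function value, and when I applied the model to the actual accident data it correctly classified only about 10% of the accidents.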
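One way to put the 0.7 figure in context: with two classes, a model that answers 50/50 for every row has a cross-entropy of ln 2, about 0.693, whatever the labels are. A loss near 0.7 is therefore roughly what an uninformative model would score. The sketch below checks that arithmetic in plain Python; `mean_cross_entropy` and the sample labels are hypothetical and only serve the illustration.

```python
import math

def mean_cross_entropy(probs_per_sample, labels):
    """Average of -log(p[true class]) over all samples."""
    losses = [-math.log(p[y]) for p, y in zip(probs_per_sample, labels)]
    return sum(losses) / len(losses)

# Hypothetical labels; a model that always answers 50/50 ignores them anyway.
labels = [0, 0, 0, 1]
coin_flip = [[0.5, 0.5]] * len(labels)

baseline = mean_cross_entropy(coin_flip, labels)
print(round(baseline, 3))  # 0.693, i.e. ln 2
```

This only gives a reference point for the loss value; it does not by itself explain why so few accidents were caught.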
