Tensorflow Keras - 6 (Using the Self Attention Layer)
    Machine Learning/Tensorflow 2021. 3. 20. 12:59

    This post was written by modifying source code that was shared on Kaggle two years ago.

     

    Source:

    www.kaggle.com/arcisad/keras-bidirectional-lstm-self-attention?select=train.csv

     

    Keras Bidirectional LSTM + Self-Attention


    Data set:

    www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/data?select=test.csv

     

    Jigsaw Unintended Bias in Toxicity Classification

    Detect toxicity across a diverse range of conversations


    The data is text data, labeled according to whether the context of each comment is positive or negative.
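    For reference, here is a quick look at the two columns this post uses (a hypothetical check, not part of the original notebook), assuming train.csv from the competition:

    import pandas as pd

    df = pd.read_csv('./train.csv')
    # 'target' is a continuous toxicity score in [0, 1]; 'comment_text' is the raw comment
    print(df[['comment_text', 'target']].head())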

     

    Libraries

    #!pip install keras-self-attention
    
    #https://pypi.org/project/keras-self-attention/
    import pandas as pd
    import numpy as np
    
    from tensorflow import keras
    from tensorflow.keras.preprocessing.text import one_hot, Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Flatten, Dense
    
    from tensorflow.keras.models import Sequential
    from keras_self_attention import SeqSelfAttention

    The self-attention layer is a separate package that you need to install yourself.

     

    !pip install keras-self-attention

     

    Data load & preprocessing

    #load
    df = pd.read_csv('./train.csv')
    
    #preprocessing: sample 100,000 comments and binarize the target
    training_sample = df.sample(100000, random_state=0)
    X_train = training_sample['comment_text'].astype(str)
    X_train = X_train.fillna('DUMMY')
    y_train = training_sample['target']
    y_train = y_train.apply(lambda x: 1 if x > 0.5 else 0)

    X_train keeps only the comment_text column, and the target values in y_train are converted to 0 and 1.

    In other words, the comments are split into positive and negative classes.
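    As a quick sanity check on the binarized labels (not in the original post), you could look at the class balance:

    # fraction of comments labeled 1 vs. 0 after thresholding the target at 0.5
    print(y_train.value_counts(normalize=True))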

    def get_seqs(text):
        # convert raw text to integer sequences and pad them to a fixed length
        # note: `tokenizer` and `max_length` are defined in the Hyper parameters section below,
        # so the tokenizer must be fitted before this function is called
        sequences = tokenizer.texts_to_sequences(text)
        padded_sequences = pad_sequences(sequences, maxlen=max_length, padding='post')
        return padded_sequences

    As explained in the previous post, a tokenizer is essential for natural language processing. X_train is converted to padded sequences with get_seqs only after the tokenizer has been fitted in the next section.
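    To illustrate what get_seqs produces, here is a minimal sketch on a made-up toy corpus (the sentences and the short length of 8 are only for illustration):

    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    toy_texts = ['this comment is kind and helpful', 'this comment is rude']
    toy_tokenizer = Tokenizer(num_words=100)
    toy_tokenizer.fit_on_texts(toy_texts)

    toy_seqs = toy_tokenizer.texts_to_sequences(toy_texts)    # lists of word indices
    toy_padded = pad_sequences(toy_seqs, maxlen=8, padding='post')
    print(toy_padded)    # each row is one sentence, zero-padded at the end ('post') to length 8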

     

    Hyper parameters

    epochs = 2
    max_num_words = 20000  # vocabulary size kept by the tokenizer
    max_length = 128       # fixed sequence length after padding
    tokenizer = Tokenizer(num_words=max_num_words)
    tokenizer.fit_on_texts(X_train)
    word_index = tokenizer.word_index
    print('Found %s unique tokens.' % len(word_index))
    
    # now that the tokenizer is fitted, convert the training text to padded sequences
    X_train = get_seqs(X_train)

    Model creation

    model = Sequential()
    # map each word index to a 100-dimensional vector
    model.add(Embedding(max_num_words, 100, input_length=max_length))
    # bidirectional LSTM over the sequence (return_sequences=True keeps per-timestep outputs)
    model.add(Bidirectional(LSTM(units=128, return_sequences=True, dropout=0.2, recurrent_dropout=0.2)))
    # self-attention re-weights the timesteps by their attention scores
    model.add(SeqSelfAttention(attention_activation='sigmoid'))
    model.add(Bidirectional(LSTM(units=64, return_sequences=True, dropout=0.2, recurrent_dropout=0.2)))
    model.add(SeqSelfAttention(attention_activation='sigmoid'))
    # sigmoid output for the binary toxicity label
    model.add(Dense(1, activation='sigmoid'))

    The Embedding layer comes first and turns the padded word-index sequences into higher-dimensional vectors.

    Then a bidirectional LSTM is applied, and the self-attention layer that follows it gives higher weight to the more important words according to their attention scores.

    The same BiLSTM + self-attention pattern is repeated once more after that.
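    As a rough sketch of the idea behind this kind of additive self-attention (a simplified NumPy illustration, not the exact SeqSelfAttention implementation, which also uses bias terms and applies attention_activation to the scores):

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    rng = np.random.default_rng(0)
    T, d = 5, 4                     # 5 timesteps, 4 features (stand-in for the BiLSTM outputs)
    h = rng.normal(size=(T, d))

    # additive self-attention: e[t, s] = v . tanh(Wq @ h[t] + Wk @ h[s])
    Wq = rng.normal(size=(d, d))
    Wk = rng.normal(size=(d, d))
    v = rng.normal(size=(d,))

    e = np.tanh((h @ Wq)[:, None, :] + (h @ Wk)[None, :, :]) @ v   # (T, T) raw scores
    a = softmax(e, axis=-1)         # attention weights over the source timesteps
    out = a @ h                     # (T, d): each timestep becomes a weighted mix of all timesteps
    print(out.shape)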

     

    model.summary()

    Model: "sequential_5"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    embedding_5 (Embedding)      (None, 128, 100)          2000000   
    _________________________________________________________________
    bidirectional_6 (Bidirection (None, 128, 256)          234496    
    _________________________________________________________________
    seq_self_attention_4 (SeqSel (None, None, 256)         16449     
    _________________________________________________________________
    bidirectional_7 (Bidirection (None, None, 128)         164352    
    _________________________________________________________________
    seq_self_attention_5 (SeqSel (None, None, 128)         8257      
    _________________________________________________________________
    dense_4 (Dense)              (None, None, 1)           129       
    =================================================================
    Total params: 2,423,683
    Trainable params: 2,423,683
    Non-trainable params: 0
    _________________________________________________________________

    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

     

    The optimizer is 'adam', as before. Since this is a binary classification problem, binary_crossentropy is used as the loss, and accuracy is again the metric.
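    For reference, binary cross-entropy for one example is -(y*log(p) + (1-y)*log(1-p)), averaged over the batch. A quick NumPy check (illustrative only):

    import numpy as np

    def binary_crossentropy(y_true, y_pred, eps=1e-7):
        # average of -(y*log(p) + (1-y)*log(1-p)); clip p to avoid log(0)
        y_pred = np.clip(y_pred, eps, 1 - eps)
        return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

    print(binary_crossentropy(np.array([1, 0, 1]), np.array([0.9, 0.2, 0.7])))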

     

    Model fit

    model.fit(X_train, y_train, epochs=epochs)

    Epoch 1/2
    3125/3125 [==============================] - 2896s 927ms/step - loss: 0.1534 - accuracy: 0.9525
    Epoch 2/2
    3125/3125 [==============================] - 3001s 960ms/step - loss: 0.1028 - accuracy: 0.9638

     

    Model test

    # sample 500 rows from the same DataFrame used for training (rows may overlap with the training sample)
    validation_sample = df.sample(500, random_state=42)
    X_val = validation_sample['comment_text'].astype(str)
    X_val = X_val.fillna('DUMMY')
    y_val = validation_sample['target']
    y_val = y_val.apply(lambda x: 1 if x > 0.5 else 0)
    loss, accuracy = model.evaluate(get_seqs(X_val), y_val)
    print('Evaluation accuracy: {0}'.format(accuracy))

    16/16 [==============================] - 2s 102ms/step - loss: 0.0967 - accuracy: 0.9660
    Evaluation accuracy: 0.9660000205039978

     

    Here we evaluate on a sample drawn from the same data that X_train came from, so some of these comments may have been seen during training and this is not a strict held-out test.

     

    Test file load & predict

    test = pd.read_csv('./test.csv')
    X_test = test['comment_text'].astype(str)
    X_test = X_test.fillna('DUMMY')
    # the model returns one score per timestep; keep the score for the first timestep of each comment
    probs = model.predict(get_seqs(X_test), verbose=1)
    probs = [x[0] for x in probs]
    # save the trained model and write the Kaggle submission file
    model.save("attention_md.h5")
    submission = pd.DataFrame(test['id']).reset_index(drop=True)
    submission['prediction'] = pd.Series(probs, name='prediction')
    submission.to_csv('submission.csv', index=False)

     

    In the same way, the test file is converted to padded sequences, the model makes predictions, and the file to submit to Kaggle is created.
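    If you need to reload the saved model later, the custom attention layer has to be registered as a custom object (a sketch, reusing the filename saved above):

    from tensorflow import keras
    from keras_self_attention import SeqSelfAttention

    # SeqSelfAttention is a custom layer, so Keras needs it in order to deserialize the .h5 file
    restored = keras.models.load_model(
        'attention_md.h5',
        custom_objects=SeqSelfAttention.get_custom_objects()
    )
    restored.summary()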

     

    --

    Download the full source code

    github.com/Joonyeong97/Tensorflow-tutorial

     


     
