Using a generator with Keras model.fit_generator

Question

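I originally tried to use generator syntax when writing a custom generator for training a Keras model, so I yielded from __next__. However, when I tried to train my model with model.fit_generator, I got an error saying that my generator was not an iterator. The fix was to change yield to return, which also meant reworking the logic of __next__ to track state. That is quite cumbersome compared with letting yield do the work for me.

Is there a way to make this work with yield? If I have to use return statements, I will need to write several more iterators, all with very clunky logic.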

Jessica Alan · Answer

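I can't help debug your code since you didn't post it, but here is an abbreviated version of a custom data generator I wrote for a semantic segmentation project, which you can use as a template:

    import os
    import random

    import cv2
    import numpy as np

    # INPUT_SHAPE is assumed to be a (width, height) tuple defined elsewhere in the project.

    def generate_data(directory, batch_size):
        """Replaces Keras' native ImageDataGenerator."""
        i = 0
        file_list = os.listdir(directory)
        while True:
            image_batch = []
            for b in range(batch_size):
                # Reshuffle and start over once every file has been used.
                if i == len(file_list):
                    i = 0
                    random.shuffle(file_list)
                sample = file_list[i]
                i += 1
                # os.listdir returns bare file names, so join them with the directory.
                image = cv2.resize(cv2.imread(os.path.join(directory, sample)), INPUT_SHAPE)
                # Scale pixel values from [0, 255] to roughly [-1, 1].
                image_batch.append((image.astype(float) - 128) / 128)
            yield np.array(image_batch)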

Usage:

    data_dir = os.path.expanduser('~/my_data')  # os.listdir does not expand '~' itself
    model.fit_generator(
        generate_data(data_dir, batch_size),
        steps_per_epoch=len(os.listdir(data_dir)) // batch_size)
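As for why `yield` inside `__next__` fails: any function containing `yield` returns a new generator object when it is called, so each `next()` call on such an object hands back a generator instead of a batch. Here is a minimal sketch of the difference in plain Python. The class names `BrokenBatches` and `Batches` are hypothetical and do not come from the question.

```python
import types


class BrokenBatches:
    """Hypothetical example: `yield` inside __next__."""

    def __init__(self, data, batch_size):
        self.data = data
        self.batch_size = batch_size

    def __iter__(self):
        return self

    def __next__(self):
        # Because of the `yield`, calling __next__ only builds a generator object;
        # this loop body does not run at that point.
        for start in range(0, len(self.data), self.batch_size):
            yield self.data[start:start + self.batch_size]


class Batches:
    """Hypothetical example: the `yield` lives in __iter__, so no manual state tracking."""

    def __init__(self, data, batch_size):
        self.data = data
        self.batch_size = batch_size

    def __iter__(self):
        while True:  # loop over the data indefinitely
            for start in range(0, len(self.data), self.batch_size):
                yield self.data[start:start + self.batch_size]


broken = BrokenBatches([1, 2, 3, 4], 2)
print(isinstance(next(broken), types.GeneratorType))  # True: a generator, not a batch

good = iter(Batches([1, 2, 3, 4, 5], 2))  # iter() returns a generator that supports next()
print(next(good), next(good), next(good), next(good))  # [1, 2] [3, 4] [5] [1, 2]
```

Under this sketch, either a plain generator function like the template above or `iter(Batches(...))` would be the object to hand to `fit_generator`.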

Vaasha · Answer

I recently played with Keras generators and finally managed to prepare an example. It uses random data, so trying to teach a NN on it makes no sense, but it is a good illustration of using a Python generator with Keras.

Generate some data

    import numpy as np
    import pandas as pd

    data = np.random.rand(200, 2)
    expected = np.random.randint(2, size=200).reshape(-1, 1)

    dataFrame = pd.DataFrame(data, columns=['a', 'b'])
    expectedFrame = pd.DataFrame(expected, columns=['expected'])

    dataFrameTrain, dataFrameTest = dataFrame[:100], dataFrame[-100:]
    expectedFrameTrain, expectedFrameTest = expectedFrame[:100], expectedFrame[-100:]

Generator

    def generator(X_data, y_data, batch_size):
        samples_per_epoch = X_data.shape[0]
        number_of_batches = samples_per_epoch / batch_size
        counter = 0
        while 1:
            X_batch = np.array(X_data[batch_size*counter:batch_size*(counter+1)]).astype('float32')
            y_batch = np.array(y_data[batch_size*counter:batch_size*(counter+1)]).astype('float32')
            counter += 1
            yield X_batch, y_batch

            # restart counter to yield data in the next epoch as well
            if counter >= number_of_batches:
                counter = 0

Keras model

    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential()
    model.add(Dense(12, activation='relu', input_dim=dataFrame.shape[1]))
    model.add(Dense(1, activation='sigmoid'))

    model.compile(loss='binary_crossentropy', optimizer='adadelta', metrics=['accuracy'])

    # Train the model using generator vs using the full batch
    batch_size = 8

    model.fit_generator(
        generator(dataFrameTrain, expectedFrameTrain, batch_size),
        epochs=3,
        steps_per_epoch=dataFrame.shape[0] // batch_size,
        validation_data=generator(dataFrameTest, expectedFrameTest, batch_size*2),
        validation_steps=dataFrame.shape[0] // batch_size * 2)

    # without generator
    # model.fit(x=np.array(dataFrame), y=np.array(expected), batch_size=batch_size, epochs=3)

Output

    Epoch 1/3
    25/25 [==============================] - 3s - loss: 0.7297 - acc: 0.4750 - val_loss: 0.7183 - val_acc: 0.5000
    Epoch 2/3
    25/25 [==============================] - 0s - loss: 0.7213 - acc: 0.3750 - val_loss: 0.7117 - val_acc: 0.5000
    Epoch 3/3
    25/25 [==============================] - 0s - loss: 0.7132 - acc: 0.3750 - val_loss: 0.7065 - val_acc: 0.5000
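The step counts passed to fit_generator in this answer are plain arithmetic on the number of samples and the batch size. As a rough sketch (the helper name `steps_for` is hypothetical, not a Keras function), the number of batches needed to see every sample once, counting a final partial batch, is a ceiling division:

```python
import math


def steps_for(num_samples, batch_size):
    """Hypothetical helper: batches needed to cover every sample once."""
    return math.ceil(num_samples / batch_size)


print(steps_for(100, 8))   # 13: twelve full batches plus one batch of 4 rows
print(steps_for(100, 16))  # 7: six full batches plus one batch of 4 rows
print(200 // 8)            # 25: the step count behind "25/25" in the log above
```

Since the answer's code computes its steps from the full 200-row dataFrame while the training generator only holds 100 rows, each reported epoch appears to cycle through the training data nearly twice. That is harmless for random data, but worth checking when adapting the example.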

agcala · Answer

This is how I implemented it for reading files of any size, and it works like a charm.

    import pandas as pd

    # The data file does not have a header, so I need to provide one
    # for pd.read_csv by chunks to work.
    hdr = []
    for i in range(num_labels + num_features):
        hdr.append("Col-" + str(i))

    def tgen(filename):
        csvfile = open(filename)
        reader = pd.read_csv(csvfile, chunksize=batchz, names=hdr, header=None)
        while True:
            for chunk in reader:
                W = chunk.values       # labels and features
                Y = W[:, :num_labels]  # labels
                X = W[:, num_labels:]  # features
                X = X / 255            # any required transformation
                yield X, Y
            # Reopen the file so the next epoch starts from the first row.
            csvfile.close()
            csvfile = open(filename)
            reader = pd.read_csv(csvfile, chunksize=batchz, names=hdr, header=None)

Back in the main script I have:

    nval = number_of_validation_samples // batchz
    ntrain = number_of_training_samples // batchz
    ftgen = tgen("training.csv")
    fvgen = tgen("validation.csv")

    history = model.fit_generator(ftgen,
                                  steps_per_epoch=ntrain,
                                  validation_data=fvgen,
                                  validation_steps=nval,
                                  epochs=number_of_epochs,
                                  callbacks=[checkpointer, stopper],
                                  verbose=2)
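The restart-on-exhaustion pattern in tgen can be tried out without pandas or Keras. The sketch below swaps pd.read_csv(chunksize=...) for the standard library's csv module and itertools.islice. The function name `csv_batches` and the sample values are made up for illustration, and it assumes the file holds numeric columns with labels first and at least one row.

```python
import csv
import itertools
import os
import tempfile


def csv_batches(filename, batch_size, num_labels):
    """Yield (features, labels) forever, re-reading the file after each pass."""
    while True:
        with open(filename, newline="") as f:
            reader = csv.reader(f)
            while True:
                # Pull at most batch_size rows without loading the whole file.
                rows = [[float(v) for v in row]
                        for row in itertools.islice(reader, batch_size)]
                if not rows:
                    break  # end of file: leave the inner loop and reopen the file
                labels = [row[:num_labels] for row in rows]
                features = [row[num_labels:] for row in rows]
                yield features, labels


# Demo on a throwaway three-row file (made-up values).
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("1,10,20\n0,30,40\n1,50,60\n")

gen = csv_batches(tmp.name, batch_size=2, num_labels=1)
first = next(gen)
second = next(gen)
third = next(gen)  # the file is exhausted after `second`, so this is the first batch again
gen.close()        # closes the file still held open inside the generator
os.remove(tmp.name)

print(first)   # ([[10.0, 20.0], [30.0, 40.0]], [[1.0], [0.0]])
print(second)  # ([[50.0, 60.0]], [[1.0]])
print(third == first)  # True
```

As in tgen, the last batch of each pass can be smaller than the others, which is why the step counts above use floor division on the sample counts.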