How to implement multi-class semantic segmentation?

Question

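I am able to train a U-net with labeled images that have a binary classification.

But I am having a hard time figuring out how to configure the final layers in Keras/Theano for multi-class classification (4 classes).

I have 634 images and 634 corresponding masks, all uint8 and 64 x 64 pixels.

Instead of being black (0) and white (1), my masks contain color-labeled objects in 3 categories plus background, as follows: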

black (0), background
red (1), object class 1
green (2), object class 2
yellow (3), object class 3

Before training runs, the array containing the masks is one-hot encoded as follows:

mask_train = to_categorical(mask_train, 4)

This changes mask_train.shape from (634, 1, 64, 64) to (2596864, 4).
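The jump to (2596864, 4) is simply every pixel being flattened into its own row, since 634 × 1 × 64 × 64 = 2596864. A minimal NumPy sketch of that effect, assuming a randomly generated stand-in for the real masks (the `np.eye` indexing imitates the flattening one-hot encoding described above; it is not the Keras implementation):

```python
import numpy as np

# Stand-in for the real masks: 634 single-channel 64x64 label images
# with integer class ids 0..3 (random values, for illustration only).
rng = np.random.default_rng(0)
mask_train = rng.integers(0, 4, size=(634, 1, 64, 64), dtype=np.uint8)

# Flatten every pixel into one long vector, then pick one row of the
# 4x4 identity matrix per pixel: one one-hot row per pixel.
flat_labels = mask_train.ravel()
one_hot_flat = np.eye(4, dtype=np.uint8)[flat_labels]

print(mask_train.shape)    # (634, 1, 64, 64)
print(one_hot_flat.shape)  # (2596864, 4)
```

The spatial layout is not lost, only flattened, so it can be recovered with a reshape as long as the element order is respected.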

My model closely follows the Unet architecture, but the final layer seems to be the problem, because I am unable to flatten the structure so that it matches the one-hot encoded array.

[...]
up3 = concatenate([UpSampling2D(size=(2, 2))(conv7), conv2], axis=1)
conv8 = Conv2D(128, (3, 3), activation='relu', padding='same')(up3)
conv8 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv8)

up4 = concatenate([UpSampling2D(size=(2, 2))(conv8), conv1], axis=1)
conv9 = Conv2D(64, (3, 3), activation='relu', padding='same')(up4)
conv10 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv9)

# here I used number classes = number of filters and softmax although
# not sure if a dense layer should be here instead
conv11 = Conv2D(4, (1, 1), activation='softmax')(conv10)

model = Model(inputs=[inputs], outputs=[conv11])

# here categorical cross entropy is being used but may not be correct
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])

return model

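Do you have any suggestions on how to modify the final part of my model so that it trains successfully? I get a variety of shape mismatch errors, and the few times I managed to make it run, the loss did not change across epochs.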

Daniel Möller · Answer

Your target should have shape (634,4,64,64) if you are using channels_first,
or (634,64,64,4) if you are using channels_last.

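Each channel of your target should be one class. Each channel is an image of 0s and 1s, where 1 means the pixel belongs to that class and 0 means it does not.

Your target is then 634 groups, each containing four images of 64x64 pixels, in which a pixel value of 1 indicates the presence of the desired feature.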
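The target layout described above can be sketched in plain NumPy. This is a minimal illustration, assuming the masks hold integer class ids; the helper name `one_hot_mask` and the tiny 3x3 example masks are hypothetical and not part of the original code:

```python
import numpy as np

def one_hot_mask(labels, num_classes, channels_first=True):
    """Turn integer label maps (N, H, W) into one binary image per class."""
    # Comparing against 0..num_classes-1 broadcasts a new trailing class axis.
    target = (labels[..., None] == np.arange(num_classes)).astype(np.float32)
    # target is (N, H, W, C) at this point, i.e. channels_last.
    if channels_first:
        target = np.moveaxis(target, -1, 1)  # -> (N, C, H, W)
    return target

# Tiny hypothetical example: 2 masks of 3x3 pixels with class ids 0..3.
labels = np.array([[[0, 1, 1],
                    [2, 2, 0],
                    [3, 0, 0]],
                   [[0, 0, 0],
                    [1, 1, 1],
                    [3, 3, 2]]])

cf = one_hot_mask(labels, 4, channels_first=True)
cl = one_hot_mask(labels, 4, channels_first=False)
print(cf.shape)  # (2, 4, 3, 3)
print(cl.shape)  # (2, 3, 3, 4)
print(cf[0, 1])  # binary image for class 1 of the first mask
```

Every pixel has exactly one channel set to 1, so summing over the class axis gives an array of ones.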

I am not sure the result will be ordered correctly, but you can try:

import numpy as np

mask_train = to_categorical(mask_train, 4)
mask_train = mask_train.reshape((634, 64, 64, 4))
# I chose channels last here because to_categorical is outputting your classes last: (2596864, 4)

# moving the channel:
mask_train = np.moveaxis(mask_train, -1, 1)
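One way to check whether the ordering survives the reshape is a round trip: taking the argmax over the class axis should give back the original label map. A small NumPy sketch, assuming random stand-in masks, only 5 samples instead of 634, and `np.eye` indexing as a substitute for the flattened one-hot output:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in integer masks shaped like the question's array, with fewer samples.
mask_int = rng.integers(0, 4, size=(5, 1, 64, 64))

# Emulate a flattened one-hot result of shape (n_pixels, 4).
flat_one_hot = np.eye(4, dtype=np.float32)[mask_int.ravel()]

# Same reshape + moveaxis as in the answer (5 samples instead of 634).
target = flat_one_hot.reshape((5, 64, 64, 4))
target = np.moveaxis(target, -1, 1)  # (5, 4, 64, 64)

# argmax over the class axis should give back the original label map.
recovered = target.argmax(axis=1)
print(np.array_equal(recovered, mask_int[:, 0]))  # True
```

The check passes here because the flattening and the reshape both use row-major (C) order; running the same comparison on the real arrays is a cheap sanity test before training.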

If the ordering does not work out, you can do it manually:

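import numpy as np

# mask_train here is the original integer mask, shape (634, 1, 64, 64)
newMask = np.zeros((634, 4, 64, 64))

for samp in range(len(mask_train)):
    im = mask_train[samp, 0]
    for x in range(len(im)):
        row = im[x]
        for y in range(len(row)):
            y_val = row[y]  # class id of this pixel
            newMask[samp, y_val, x, y] = 1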
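The triple loop touches about 2.6 million pixels one at a time in Python. A vectorized alternative, sketched here with `np.put_along_axis` on a small random stand-in array (3 samples; the name `new_mask` and the sizes are illustrative assumptions), should produce the same result:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical integer masks, shaped like the original (n, 1, 64, 64) array.
mask_train = rng.integers(0, 4, size=(3, 1, 64, 64))

n, _, h, w = mask_train.shape
new_mask = np.zeros((n, 4, h, w))

# For every (sample, row, column), write a 1 into the channel whose index
# is the class id stored in mask_train at that position.
np.put_along_axis(new_mask, mask_train, 1, axis=1)

print(new_mask.shape)  # (3, 4, 64, 64)
```

Because `mask_train` keeps its singleton channel axis, it lines up with the class axis of `new_mask` without any reshaping.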

Daniel · Answer

A bit late, but you could try

mask_train = to_categorical(mask_train, num_classes=None)

This should result in a mask_train.shape of (634, 4, 64, 64), with a binary (one-hot encoded) mask for each individual class.

Your last conv layer, activation, and loss look fine for multi-class segmentation.
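To make concrete what a 4-filter softmax output and categorical cross-entropy compute for each pixel, here is a NumPy sketch (not Keras code). It assumes a channels_last layout and random stand-in values for the network output; the function names are illustrative. The key point is that softmax and the loss both operate along the class axis, so that axis has to match between prediction and target:

```python
import numpy as np

def softmax(logits, axis):
    # Subtract the max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def categorical_crossentropy(target, probs, axis, eps=1e-7):
    # Per-pixel loss: minus the sum over classes of target * log(prob).
    probs = np.clip(probs, eps, 1.0)
    return -(target * np.log(probs)).sum(axis=axis)

rng = np.random.default_rng(0)
# Hypothetical raw scores of a final 4-filter 1x1 convolution, channels_last.
logits = rng.normal(size=(2, 64, 64, 4))
labels = rng.integers(0, 4, size=(2, 64, 64))
target = np.eye(4)[labels]                 # one-hot target, (2, 64, 64, 4)

probs = softmax(logits, axis=-1)           # class probabilities per pixel
loss = categorical_crossentropy(target, probs, axis=-1)
pred = probs.argmax(axis=-1)               # predicted class id per pixel

print(loss.shape, pred.shape)              # (2, 64, 64) (2, 64, 64)
print(loss.mean())                         # scalar loss averaged over pixels
```

With a channels_first target, the same computation would use `axis=1` instead of `axis=-1`.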