Resizing an image and its bounding box

Question

I have an image with a bounding box, and I want to resize the image.

img = cv2.imread("img.jpg", 3)
x_ = img.shape[0]
y_ = img.shape[1]
img = cv2.resize(img, (416, 416))

Next, I calculate the scale factors:

x_scale = 416 / x_
y_scale = 416 / y_

Then I draw the box on the image. This is the code for the original bounding box:

# (xmin, ymin, xmax, ymax) = (128, 25, 447, 375)
x = int(np.round(128 * x_scale))
y = int(np.round(25 * y_scale))
xmax = int(np.round(447 * x_scale))
ymax = int(np.round(375 * y_scale))

However, using this I get:

[image: the resized image with the scaled box drawn on it]

while the original is:

[image: the original image with its bounding box]

I don't see any flaw in this logic. What is wrong?

Full code:

imageToPredict = cv2.imread("img.jpg", 3)
print(imageToPredict.shape)

x_ = imageToPredict.shape[0]
y_ = imageToPredict.shape[1]

x_scale = 416 / x_
y_scale = 416 / y_
print(x_scale, y_scale)
img = cv2.resize(imageToPredict, (416, 416))
img = np.array(img)

x = int(np.round(128 * x_scale))
y = int(np.round(25 * y_scale))
xmax = int(np.round(447 * x_scale))
ymax = int(np.round(375 * y_scale))
Box.drawBox([[1, 0, x, y, xmax, ymax]], img)

and drawBox:

def drawBox(boxes, image):
    for i in range(0, len(boxes)):
        cv2.rectangle(image, (boxes[i][2], boxes[i][3]), (boxes[i][4], boxes[i][5]), (0, 0, 120), 3)
    cv2.imshow("img", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

The image and the bounding box data are loaded separately. I draw the bounding box onto the image myself; the image file does not contain the box.

SergGr · Answer

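I believe there are two issues:

You need to swap x_ and y_, because shape[0] is actually the y dimension (height) and shape[1] is the x dimension (width).
You need to use the same coordinates on the original and the scaled image. On your original image the rectangle is (160, 35)-(555, 470), not the (128, 25)-(447, 375) that you use in the code.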

If I use the following code:

import cv2
import numpy as np


def drawBox(boxes, image):
    for i in range(0, len(boxes)):
        # changed color and width to make it visible
        cv2.rectangle(image, (boxes[i][2], boxes[i][3]), (boxes[i][4], boxes[i][5]), (255, 0, 0), 1)
    cv2.imshow("img", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()


def cvTest():
    # imageToPredict = cv2.imread("img.jpg", 3)
    # raw string so the backslash in the path is not read as an escape
    imageToPredict = cv2.imread(r"49466033\img.png", 3)
    print(imageToPredict.shape)

    # Note: flipped compared to your original code!
    # x_ = imageToPredict.shape[0]
    # y_ = imageToPredict.shape[1]
    y_ = imageToPredict.shape[0]
    x_ = imageToPredict.shape[1]

    targetSize = 416
    x_scale = targetSize / x_
    y_scale = targetSize / y_
    print(x_scale, y_scale)
    img = cv2.resize(imageToPredict, (targetSize, targetSize))
    print(img.shape)
    img = np.array(img)

    # original frame as named values
    (origLeft, origTop, origRight, origBottom) = (160, 35, 555, 470)

    x = int(np.round(origLeft * x_scale))
    y = int(np.round(origTop * y_scale))
    xmax = int(np.round(origRight * x_scale))
    ymax = int(np.round(origBottom * y_scale))
    # Box.drawBox([[1, 0, x, y, xmax, ymax]], img)
    drawBox([[1, 0, x, y, xmax, ymax]], img)


cvTest()

and use your "original" image saved as "49466033\img.png",

I get the following image:

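As you can see, my thin blue line lies exactly inside your original red line, and it stays there whatever targetSize you choose (so the scaling actually works correctly).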
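The two corrections in this answer can be isolated in a small helper that needs neither OpenCV nor NumPy. This is only a sketch: the name scale_box and the 500 x 800 image size in the example are hypothetical (the thread does not state the real image dimensions). It assumes the image shape is given in NumPy/OpenCV order (height, width, ...) and the target size in cv2.resize order (width, height).

```python
def scale_box(box, orig_shape, target_size):
    """Scale (xmin, ymin, xmax, ymax) from the original image to the resized one.

    orig_shape  -- image.shape, i.e. (height, width, ...): rows come first
    target_size -- (width, height), the order cv2.resize expects
    """
    orig_h, orig_w = orig_shape[:2]      # shape[0] is y (height), shape[1] is x (width)
    target_w, target_h = target_size

    x_scale = target_w / orig_w          # x coordinates scale with the width ratio
    y_scale = target_h / orig_h          # y coordinates scale with the height ratio

    xmin, ymin, xmax, ymax = box
    return (
        int(round(xmin * x_scale)),
        int(round(ymin * y_scale)),
        int(round(xmax * x_scale)),
        int(round(ymax * y_scale)),
    )


# Hypothetical example: an image 500 pixels high and 800 pixels wide,
# resized to 416 x 416, with the box coordinates used in the answer.
scaled = scale_box((160, 35, 555, 470), (500, 800, 3), (416, 416))
print(scaled)
```

Because each axis gets its own factor, the helper also covers non-square targets. Swapping the two shape entries, as in the question, silently applies the height ratio to x and the width ratio to y, which is why the box drifted.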

Italo José · Answer

You can use resize_dataset_pascalvoc.

It is easy to use:

python3 main.py -p <IMAGES_&_XML_PATH> --output <IMAGES_&_XML> --new_x <NEW_X_SIZE> --new_y <NEW_Y_SIZE> --save_box_images <FLAG>

It resizes the whole dataset and rewrites the annotation files to match the resized images.
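For readers who want to see what rewriting an annotation involves, here is a minimal sketch using only the standard library. It is not the implementation of resize_dataset_pascalvoc; the function name rescale_voc_annotation and the sample annotation are hypothetical. It assumes the usual Pascal VOC layout, with the image size under size/width and size/height and each box under object/bndbox.

```python
import xml.etree.ElementTree as ET


def rescale_voc_annotation(xml_text, new_w, new_h):
    """Return the annotation XML with size and all bndbox values rescaled."""
    root = ET.fromstring(xml_text)

    size = root.find("size")
    old_w = int(size.find("width").text)
    old_h = int(size.find("height").text)
    x_scale = new_w / old_w
    y_scale = new_h / old_h

    # Record the new image size in the annotation.
    size.find("width").text = str(new_w)
    size.find("height").text = str(new_h)

    # x values use the width ratio, y values use the height ratio.
    scales = (("xmin", x_scale), ("ymin", y_scale),
              ("xmax", x_scale), ("ymax", y_scale))
    for bndbox in root.iter("bndbox"):
        for tag, scale in scales:
            node = bndbox.find(tag)
            node.text = str(int(round(float(node.text) * scale)))

    return ET.tostring(root, encoding="unicode")


# Hypothetical annotation for an 800 x 500 image with one box.
SAMPLE = """<annotation>
  <size><width>800</width><height>500</height><depth>3</depth></size>
  <object>
    <name>thing</name>
    <bndbox><xmin>160</xmin><ymin>35</ymin><xmax>555</xmax><ymax>470</ymax></bndbox>
  </object>
</annotation>"""

print(rescale_voc_annotation(SAMPLE, 416, 416))
```

The image itself would still have to be resized separately, for example with cv2.resize, using the same target width and height.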