Detecting text regions in an image using OpenCV

Question

I have an image and want to detect the text regions in it.

I tried the TiRG_RAW_20110219 project, but the results are not satisfactory. For the input image http://imgur.com/yCxOvQS,Gd38rCa it produces http://imgur.com/yCxOvQS,Gd38rCa#1 as output.

Can anyone suggest some alternatives? I want to improve tesseract's output by sending it only the text regions as input.

Amit Kushwaha · Answer

    import cv2


    def captch_ex(file_name):
        img = cv2.imread(file_name)
        img_final = cv2.imread(file_name)  # untouched copy, used only for the optional crops
        img2gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        ret, mask = cv2.threshold(img2gray, 180, 255, cv2.THRESH_BINARY)
        image_final = cv2.bitwise_and(img2gray, img2gray, mask=mask)
        # for black text, use cv2.THRESH_BINARY_INV
        ret, new_img = cv2.threshold(image_final, 180, 255, cv2.THRESH_BINARY)

        # Remove noisy portions by dilating the thresholded image.
        # The kernel size controls the direction of dilation: a larger x
        # dilates more horizontally, a larger y dilates more vertically.
        kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
        # the more iterations, the more dilation
        dilated = cv2.dilate(new_img, kernel, iterations=9)

        # findContours returns 3 values in OpenCV 3.x and 2 values in 2.x and 4.x;
        # the contour list is the second-to-last value in every version.
        contours = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]

        for contour in contours:
            # get the rectangle bounding the contour
            x, y, w, h = cv2.boundingRect(contour)

            # don't plot small false positives that aren't text
            if w < 35 and h < 35:
                continue

            # draw a rectangle around the contour on the original image
            cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 255), 2)

            # You can crop the region and send it to OCR; false detections
            # will return no text. Initialize index = 0 before the loop.
            # cropped = img_final[y:y + h, x:x + w]
            # s = file_name + '/crop_' + str(index) + '.jpg'
            # cv2.imwrite(s, cropped)
            # index = index + 1

        # show the original image with the detected rectangles drawn on it
        cv2.imshow('captcha_result', img)
        cv2.waitKey()


    file_name = 'your_image.jpg'
    captch_ex(file_name)


Mike Sandford · Answer

If you don't mind getting your hands dirty, you could try growing those text regions into one bigger rectangular region and feeding that to tesseract all at once.
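One way to do that merging, as a minimal sketch: take the bounding boxes of the detected text regions and compute the smallest rectangle that encloses all of them. The function name merge_boxes and the pad parameter are my own illustrative choices, not part of OpenCV; the boxes are assumed to be (x, y, w, h) tuples in the format that cv2.boundingRect returns.

```python
def merge_boxes(boxes, pad=0):
    """Return the smallest (x, y, w, h) rectangle enclosing every box.

    boxes: iterable of (x, y, w, h) tuples (the cv2.boundingRect format).
    pad:   hypothetical extra margin in pixels added on every side.
    Returns None when there are no boxes.
    """
    boxes = list(boxes)
    if not boxes:
        return None

    left = min(x for x, y, w, h in boxes) - pad
    top = min(y for x, y, w, h in boxes) - pad
    right = max(x + w for x, y, w, h in boxes) + pad
    bottom = max(y + h for x, y, w, h in boxes) + pad

    # Clamp to the image origin so the result never has negative coordinates,
    # which would wrap around when used as array slice indices.
    left, top = max(left, 0), max(top, 0)
    return left, top, right - left, bottom - top
```

With the merged box in hand, the crop would be something like img[y:y + h, x:x + w], which can then be passed to tesseract as a single image. The right and bottom edges are not clamped here, since array slicing already stops at the image border.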

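I'd also suggest thresholding the image several times and feeding each result to tesseract separately to see whether that helps at all. You can compare the output against dictionary words to decide automatically whether a particular OCR result is good or not.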
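The dictionary check could look roughly like the sketch below. It assumes you have already collected the OCR text for each threshold value in a dict; the names dictionary_score and best_ocr_result, and the idea of scoring by the fraction of recognized words, are my own assumptions rather than anything tesseract provides.

```python
import re


def dictionary_score(text, vocabulary):
    """Fraction of alphabetic tokens in text that appear in vocabulary."""
    words = re.findall(r"[A-Za-z]+", text.lower())
    if not words:
        return 0.0
    return sum(word in vocabulary for word in words) / len(words)


def best_ocr_result(results, vocabulary):
    """Pick the (threshold, text) pair whose text scores highest.

    results:    dict mapping a threshold value to the OCR text it produced.
    vocabulary: set of lowercase dictionary words.
    """
    return max(results.items(),
               key=lambda item: dictionary_score(item[1], vocabulary))


# Made-up OCR outputs for three threshold values, for illustration only.
vocabulary = {"hello", "world"}
results = {100: "he11o wor1d", 150: "hello world", 200: "hello w0rld"}
best_threshold, best_text = best_ocr_result(results, vocabulary)
```

In practice the results dict would be filled by running cv2.threshold at each level and passing the output to tesseract, for example through the pytesseract wrapper if you use it. A ratio like this favors short outputs, so you may also want to require a minimum word count.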