[OpenCV][Python] Printing the position, height, and width of text in an image

螃蟹_crab

Published in [Python][OpenCV] Study Notes

Updated 2024/07/24, published 2024/07/24, about a 7-minute read

This article shows how to find the position, height, and width of each piece of text in an image.

Printed output

Test image

Code

import cv2

def read_posion(img):
    '''
    Input: a binary image with a black background and white objects.
    Returns a list of (x, y, w, h) boxes sorted by x.
    '''
    num_labels, labels, stats, _ = cv2.connectedComponentsWithStats(img, connectivity=8)
    components = []
    for i in range(1, num_labels):  # skip the background (label 0)
        x, y, w, h, _ = stats[i]
        components.append((x, y, w, h))

    if not components:  # nothing but background was found
        return []

    components.sort(key=lambda c: c[0])  # sort by x coordinate

    # Merge components whose left edges are within 5 pixels of each other
    merged_components = []
    current_component = list(components[0])

    for i in range(1, len(components)):
        x, y, w, h = components[i]
        if abs(x - current_component[0]) <= 5:
            # Take the union of the two boxes
            right = max(current_component[0] + current_component[2], x + w)
            bottom = max(current_component[1] + current_component[3], y + h)
            current_component[0] = min(current_component[0], x)  # leftmost x
            current_component[1] = min(current_component[1], y)  # topmost y
            current_component[2] = right - current_component[0]  # w reaches the rightmost edge
            current_component[3] = bottom - current_component[1]  # h reaches the lowest edge
        else:
            merged_components.append(tuple(current_component))
            current_component = list(components[i])

    # Append the last merged box
    merged_components.append(tuple(current_component))

    return merged_components
    
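img = cv2.imread('image path')  # replace with the path to your image
if img is None:
    raise FileNotFoundError('Could not read the image')
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Binarize first; the cutoff of 127 is an assumed value, adjust it for your image
_, binary_img = cv2.threshold(gray_img, 127, 255, cv2.THRESH_BINARY)
box = read_posion(binary_img)

for i, data in enumerate(box):
    x, y, w, h = data  # same order as the function returns: x, y, w, h
    # Print each character's position, height, and width
    print(f'Character {i}: x:{x}, y:{y}, h:{h}, w:{w}')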

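Detailed explanation of the function

Function definition and parameters:
- read_posion(img) takes a single parameter.
- img: the input binary image, with a black background and white objects.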

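Computing the connected components:

num_labels, labels, stats, _ = cv2.connectedComponentsWithStats(img, connectivity=8)

- Uses OpenCV's connectedComponentsWithStats function to find the connected components; connectivity=8 means diagonal neighbors count as connected.
- num_labels: the number of labels, including the background (label 0).
- labels: the label image, the same size as the input, in which every pixel holds the label of its component.
- stats: one row of statistics per component (x, y, w, h, area): the top-left corner of the bounding box, its width and height, and the pixel count.
- _: the centroids, which are ignored here.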
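To make the layout of stats concrete, here is a minimal pure-Python sketch that labels 8-connected regions in a small grid and returns one (x, y, w, h, area) tuple per region. It is only an illustration of the idea, not how OpenCV implements it; the name component_stats and the nested-list "image" are assumptions made for this example.

```python
from collections import deque

def component_stats(grid):
    """Label 8-connected regions of non-zero cells in a 2D list.

    Returns one (x, y, w, h, area) tuple per region, in the same
    column order as a row of `stats` (the background row is omitted).
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    stats = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel
            queue = deque([(r, c)])
            seen[r][c] = True
            min_r = max_r = r
            min_c = max_c = c
            area = 0
            while queue:
                cr, cc = queue.popleft()
                area += 1
                min_r, max_r = min(min_r, cr), max(max_r, cr)
                min_c, max_c = min(min_c, cc), max(max_c, cc)
                # Visit all 8 neighbors (diagonals included)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and grid[nr][nc] != 0 and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            stats.append((min_c, min_r, max_c - min_c + 1, max_r - min_r + 1, area))
    return stats

# A tiny "image": 0 is background, 1 is foreground
grid = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1],
]
print(component_stats(grid))  # [(1, 0, 2, 2, 2), (4, 1, 1, 2, 2)]
```

The two pixels at the top left touch only diagonally, so they form one region under 8-connectivity; with 4-connectivity they would be two separate regions.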

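Extracting the components into a list:

components = []
for i in range(1, num_labels):  # skip the background (label 0)
    x, y, w, h, _ = stats[i]
    components.append((x, y, w, h))

- Iterates over stats, skipping the background, and stores each component's position and size in the components list.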
Sorting by x coordinate:
```
components.sort(key=lambda c: c[0])
```
- Sorts components by x coordinate, so the boxes run from left to right.
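A small sketch of what this sort does with ties. Python's sort is stable, so boxes with the same x keep their original relative order; adding y as a tie-breaker is an assumption of this example and is not part of the original code. The sample boxes are made up.

```python
# Hypothetical (x, y, w, h) boxes, deliberately out of order
components = [(40, 12, 8, 20), (10, 30, 6, 6), (10, 5, 6, 20)]

# Same key as the article: x only. Ties keep their input order.
by_x = sorted(components, key=lambda c: c[0])

# Assumed variant: break ties on y so the upper box comes first.
by_x_then_y = sorted(components, key=lambda c: (c[0], c[1]))

print(by_x)         # [(10, 30, 6, 6), (10, 5, 6, 20), (40, 12, 8, 20)]
print(by_x_then_y)  # [(10, 5, 6, 20), (10, 30, 6, 6), (40, 12, 8, 20)]
```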

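Merging neighboring components:

merged_components = []
current_component = list(components[0])

for i in range(1, len(components)):
    x, y, w, h = components[i]
    if abs(x - current_component[0]) <= 5:
        # Take the union of the two boxes
        right = max(current_component[0] + current_component[2], x + w)
        bottom = max(current_component[1] + current_component[3], y + h)
        current_component[0] = min(current_component[0], x)  # leftmost x
        current_component[1] = min(current_component[1], y)  # topmost y
        current_component[2] = right - current_component[0]  # w reaches the rightmost edge
        current_component[3] = bottom - current_component[1]  # h reaches the lowest edge
    else:
        merged_components.append(tuple(current_component))
        current_component = list(components[i])

merged_components.append(tuple(current_component))

- Initializes the merged_components list and current_component.
- Iterates over components; if a component's x coordinate is within ±5 pixels of the x coordinate of the box being built, the two are merged into one box that covers both. This keeps vertically stacked parts of one character, such as the dot and stem of an "i", together.
- Otherwise the finished box is stored in merged_components and a new one is started; the last box is appended after the loop.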
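The merge step can also be tried on its own, without an image. The sketch below is a standalone version under the same rule (left edges within a tolerance, result is the union of the two boxes); the function name merge_by_x, the tolerance parameter, and the sample boxes are assumptions made for illustration.

```python
def merge_by_x(components, tolerance=5):
    """Merge (x, y, w, h) boxes whose left edges lie within `tolerance` pixels."""
    merged = []
    for x, y, w, h in sorted(components, key=lambda c: c[0]):
        if merged and abs(x - merged[-1][0]) <= tolerance:
            mx, my, mw, mh = merged[-1]
            # Union of the previous box and the current one
            left, top = min(mx, x), min(my, y)
            right = max(mx + mw, x + w)
            bottom = max(my + mh, y + h)
            merged[-1] = (left, top, right - left, bottom - top)
        else:
            merged.append((x, y, w, h))
    return merged

# The stem and the dot of an "i", followed by a separate character
boxes = [(12, 20, 4, 30), (11, 5, 6, 6), (40, 10, 20, 40)]
print(merge_by_x(boxes))  # [(11, 5, 6, 45), (40, 10, 20, 40)]
```

Computing the right and bottom edges before updating x and y is what makes the result independent of which of the two boxes happens to come first.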
Returning the merged components:
```
return merged_components
```
- Returns the merged boxes as a list of (x, y, w, h) tuples: the top-left corner of each box plus its width and height.
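If the boxes are going to be drawn, functions such as cv2.rectangle expect two corner points rather than a corner plus a size. A minimal sketch of that conversion follows; the helper name to_corners is hypothetical, and it uses the common convention of x + w and y + h for the second corner (subtract 1 if the last pixel inside the box is wanted instead).

```python
def to_corners(box):
    """Convert an (x, y, w, h) box to ((x1, y1), (x2, y2)) corner points."""
    x, y, w, h = box
    return (x, y), (x + w, y + h)

top_left, bottom_right = to_corners((11, 5, 6, 45))
print(top_left, bottom_right)  # (11, 5) (17, 50)
```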