import cv2
import numpy as np
import matplotlib.pyplot as plt
Vishal Bakshi
August 21, 2024
In this notebook, I’ll walk through a modified algorithm (suggested by Claude) to calculate the percentage of positive space in the letters of an image. I’ll define this percentage as:
\[\frac{\text{Area of Letter}}{\text{Area of Bounding Box Around Letter}}\]
This algorithm is part of my exploration of non-ML baselines to classify text images into various typeface categories (e.g., “humanist sans,” “grotesque sans,” “script,” “display,” etc.). Once the non-ML baseline is established, I’ll train a neural network for this task. This is one of many notebooks in my TypefaceClassifier project series.
As we do with all of these algorithms (thus far), we start by loading the image of text and binarizing it.
img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
binary
ndarray (512, 512)
array([[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]], dtype=uint8)
Then we findContours in the image. In order to find both the inner and outer contours (for letters with “holes” in them like o, d, e, p, q and so on) we use the cv2.RETR_TREE parameter.
This returns a hierarchy which has the shape (1, N, 4) where N is the number of contours. The 4 elements per contour are [Next, Previous, First_Child, Parent]. If a contour doesn’t have one of those, the corresponding value in hierarchy is -1.
Visualizing the contours: note how the inside shapes (holes) of letters like d are also identified as contours.
As an example, I’ll calculate the area inside the contours of the letter d (it has an outer contour, 67, and an inner contour, 68):
The bounding box (rectangle) around the contour is calculated with boundingRect:
The rectangle’s area is:
The area of the letter is a bit more involved as it’s the difference between the outer area of the d and the inner (hole) area of the d:
The hierarchy of this d’s outer contour tells us that it has no parent (-1) and that it has a child, 68.
Iterating through the children (it has 1 child) we calculate the inner contour’s area:
In this case, since there is only one child, we could calculate the contour directly:
The area of the letter is the outer area minus the inner area:
The percentage of positive space of the letter d is the area of the letter divided by the area of the bounding box (both measured in pixels):
If we had ignored the inner area of the d and used the outer contour area, this percentage would be significantly larger:
I’ll wrap all of this functionality into a set of functions (well, Claude did that for me to begin with) and then test it out on different images of text. I’m also calculating the outer contour area so I can illustrate the difference of areas for an image.
def get_letter_area(contour, hierarchy, contours, idx):
    # hierarchy is expected to be hierarchy[0]: one
    # [Next, Previous, First_Child, Parent] row per contour
    outer_area = cv2.contourArea(contour)
    inner_area = 0

    # Check if hierarchy is valid
    if hierarchy is None or len(hierarchy) < 3:
        return outer_area  # Return outer area if hierarchy is invalid

    child = hierarchy[idx][2]  # First child
    while child != -1 and child < len(contours):
        inner_area += cv2.contourArea(contours[child])
        # Note: index 2 is First_Child, so this steps down to a contour nested
        # inside the hole. Sibling holes (e.g. the second hole of a B) are
        # linked through index 0 (Next) and are not visited here.
        child = hierarchy[child][2]

    return outer_area - inner_area

def letter_area_ratio(image_path):
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

    area_ratios = []
    total_outer_area = 0
    total_box_area = 0
    total_letter_area = 0

    for i, contour in enumerate(contours):
        if hierarchy[0][i][3] == -1:  # This is an outer contour
            x, y, w, h = cv2.boundingRect(contour)
            rect_area = w * h
            outer_area = cv2.contourArea(contour)
            letter_area = get_letter_area(contour, hierarchy[0], contours, i)

            total_box_area += rect_area
            total_outer_area += outer_area
            total_letter_area += letter_area

            if rect_area > 0:
                ratio = letter_area / rect_area
                area_ratios.append(ratio)

    # Despite the name, this is the median of the per-letter ratios
    avg_ratio = np.median(area_ratios) if area_ratios else 0
    return avg_ratio, total_box_area, total_outer_area, total_letter_area
I also asked Claude for a function to visualize the original image, the bounding boxes, the outer contours, and the refined (outer - inner) contours.
def visualize_analysis(image_path):
    img = cv2.imread(image_path)
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Threshold the image
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Find contours with hierarchy
    contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

    avg_ratio, total_box_area, total_outer_area, total_letter_area = letter_area_ratio(image_path)

    img_boxes = img_rgb.copy()
    img_contours = img_rgb.copy()
    img_refined = img_rgb.copy()

    # Draw bounding boxes (for every contour, inner ones included)
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        cv2.rectangle(img_boxes, (x, y), (x+w, y+h), (250, 0, 92), 2)

    # Draw all contours, including inner ones
    cv2.drawContours(img_refined, contours, -1, (250, 0, 92), 2)

    # Draw only outer contours
    for i, contour in enumerate(contours):
        if hierarchy[0][i][3] == -1:  # This is an outer contour
            cv2.drawContours(img_contours, [contour], 0, (250, 0, 92), 2)

    fig, axs = plt.subplots(2, 2, figsize=(10, 8))
    fig.suptitle(f'Letter Analysis (Avg Area Ratio: {avg_ratio:.2f})', fontsize=16)

    axs[0, 0].imshow(img_rgb)
    axs[0, 0].set_title('Original Image')
    axs[0, 0].axis('off')

    axs[0, 1].imshow(img_boxes)
    axs[0, 1].set_title(f'Bounding Boxes\nTotal Area: {total_box_area:.0f} pixels')
    axs[0, 1].axis('off')

    axs[1, 0].imshow(img_contours)
    axs[1, 0].set_title(f'Outer Contours\nTotal Area: {total_outer_area:.0f} pixels')
    axs[1, 0].axis('off')

    axs[1, 1].imshow(img_refined)
    axs[1, 1].set_title(f'Outer - Inner Contours\nTotal Area: {total_letter_area:.0f} pixels')
    axs[1, 1].axis('off')

    plt.tight_layout()
    plt.show()
Across the entire image, the refined (outer - inner) contour area is about 10% less than the outer area.
For the same typeface (serif) the median percentage of positive space is (kind of?) consistent across font sizes, though it climbs as the font size grows.
for sz in [18, 24, 36, 76, 330]:
    avg_ratio, _, _, _ = letter_area_ratio(f'serif-{sz}px.png')
    print('serif', sz, avg_ratio)
serif 18 0.25
serif 24 0.30213903743315507
serif 36 0.32516339869281047
serif 76 0.37477598566308246
serif 330 0.425178283873936
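To put a number on "kind of consistent," here is a small sketch that takes the five values printed above and computes their range and whether they increase with font size. Nothing new is measured; it's only arithmetic on the printed output.

```python
# Median ratios copied from the printed output above, keyed by font size in px.
serif_ratios = {
    18: 0.25,
    24: 0.30213903743315507,
    36: 0.32516339869281047,
    76: 0.37477598566308246,
    330: 0.425178283873936,
}

values = [serif_ratios[size] for size in sorted(serif_ratios)]
spread = max(values) - min(values)
increasing = all(a < b for a, b in zip(values, values[1:]))

print(f"range: {spread:.3f}, strictly increasing with size: {increasing}")
# range: 0.175, strictly increasing with size: True
```

So within this one typeface the ratio drifts by about 0.18 between the smallest and largest sizes, which is worth keeping in mind when comparing against other typefaces.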
For a different typeface (display), the average ratio (of letter area to bounding box area) is considerably larger:
Given that the median letter area ratio for display is larger than for serif texts, I think this algorithm is a good candidate for distinguishing between different typefaces.
I’m continually impressed by the functionality offered in the OpenCV library. I’m also impressed by Claude’s ability to provide me simple, usable and understandable code for OpenCV.
I hope you enjoyed this blog post! Follow me on Twitter @vishal_learner.