import pandas as pd, numpy as np, matplotlib.pyplot as plt
import cv2
Vishal Bakshi
August 31, 2024
In this notebook, I’ll walk through an algorithm suggested by Claude to calculate the “squareness” of letters in a text image. This measure is calculated as:
\[\frac{\min(\text{width}, \text{height})}{\max(\text{width}, \text{height})}\]
Where \(\text{width}\) and \(\text{height}\) are the width and height of the rectangular bounding box around the letter.
This algorithm is part of my exploration of non-ML baselines to classify text images into various typeface categories (e.g., “humanist sans,” “grotesque sans,” “script,” “display,” etc.). Once the non-ML baseline is established, I’ll train a neural network for this task. This is one of many notebooks in my TypefaceClassifier project series.
This is a very straightforward algorithm (code-wise) compared to some of the others I’ve worked through. In fact, it’s just a slight modification of the aspect ratio algorithm: the aspect ratio is the ratio of bounding rectangle width to bounding rectangle height, while this “squareness” ratio is the minimum dimension divided by the maximum dimension.
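As a quick worked example (the 20px-by-10px box here is hypothetical, chosen only to show the difference between the two measures):

w, h = 20, 10                        # hypothetical wide bounding box
aspect = w / h                       # 2.0 (would be 0.5 for a tall 10x20 box)
squareness = min(w, h) / max(w, h)   # 0.5 for both boxes, always between 0 and 1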
We start by loading the image and calculating the binary threshold (as always).
path = 'serif-36px.png'
img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
binary
ndarray (512, 512)
array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=uint8)
We then find the contours in the image:
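A minimal sketch of this step, reusing the binary image from above and the same cv2.findContours call that appears in the wrapped-up function later in this post:

contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)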
For each contour, we create a bounding rectangle and calculate the squareness ratio:
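A sketch of the per-contour loop, using cv2.boundingRect as in the function below:

squareness_ratios = []
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)            # bounding rectangle around the letter
    squareness_ratios.append(min(w, h) / max(w, h))   # squareness ratio, between 0 and 1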
Then calculate the average ratio across all contours:
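A sketch of that step (np.mean here, since this walkthrough averages the ratios; the wrapped-up function below switches to the median):

np.mean(squareness_ratios)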
0.8333333333333334
And that’s it!
Squareness ratios that are close to 1 represent letter forms that are closer to a perfect square. Ratios closer to 0 represent letters that are closer to wide or tall rectangles.
I’ll wrap the squareness ratio algorithm in a function and then calculate the median ratio for display versus serif images.
def squareness_ratio(image_path):
    # Read the image
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

    # Threshold the image
    _, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Find contours
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    squareness_ratios = []
    for contour in contours:
        # Get bounding rectangle
        x, y, w, h = cv2.boundingRect(contour)

        # Calculate squareness (ratio of the smaller dimension to the larger one)
        # We use min/max to ensure the ratio is always between 0 and 1
        squareness = min(w, h) / max(w, h)
        squareness_ratios.append(squareness)

    # Calculate the median squareness across all contours
    return np.median(squareness_ratios) if squareness_ratios else 0
The display text is more rectangular than the serif text, so I expect it to have a lower squareness ratio.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 5))
# Display the first image
ax1.imshow(cv2.imread('display-76px.png', cv2.IMREAD_GRAYSCALE), cmap='gray')
ax1.axis('off') # Hide axes
ax1.set_title('display (narrower)')
# Display the second image
ax2.imshow(cv2.imread('serif-76px.png', cv2.IMREAD_GRAYSCALE), cmap='gray')
ax2.axis('off') # Hide axes
ax2.set_title('serif (squarish)')
# Adjust the layout and display the plot
plt.tight_layout()
plt.show()
Across 7 different font sizes, the median squareness ratio for display images is less than for serif images.
ts = ['display', 'serif']
szs = [18, 24, 36, 76, 240, 330, 420]

res = []
for t in ts:
    for sz in szs:
        image_path = f"{t}-{sz}px.png"
        sr = squareness_ratio(image_path)
        res.append([t, sz, sr])

res = pd.DataFrame(res, columns=['typeface', 'font-size', 'squareness-ratio'])
res.groupby('typeface')['squareness-ratio'].agg(['mean', 'median'])
| typeface | mean | median |
|---|---|---|
| display | 0.763897 | 0.769231 |
| serif | 0.749344 | 0.823529 |
If we limit the font sizes to 18-76px, the difference is larger:
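A sketch of that comparison, reusing the res DataFrame from above:

res[res['font-size'] <= 76].groupby('typeface')['squareness-ratio'].agg(['mean', 'median'])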
Similar to the aspect ratio, the squareness ratio is a good candidate for distinguishing between typefaces. Even though the two algorithms are similar, I’ll keep them both in play for now and see how they perform on a larger and more diverse dataset of text images, as they offer different insights.
I hope you enjoyed this blog post! Follow me on Twitter @vishal_learner.