import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
from google.colab.patches import cv2_imshow
from math import sqrt
from skimage import data, exposure
from skimage.feature import blob_dog, blob_log, blob_doh, hog, local_binary_pattern
from skimage.color import rgb2gray, label2rgb
from skimage.transform import rotate
from sklearn import cluster, decomposition
Exploring Different Feature Detection Algorithms in Computer Vision
Background
In this notebook, I’ll work through the first “Further Research” prompt from Chapter 13 (Convolutional Neural Networks) in the fastai book.
What features other than edge detectors have been used in computer vision (especially before deep learning became popular)?
To start this exploration, I prompted ChatGPT and got a list of feature detection algorithms. I’ll explore seven of them: the ones I could get working, that ran relatively quickly, and that didn’t crash my Google Colab kernel.
Corner Detectors
Harris Corner Detection
The Harris Corner Detection algorithm was developed in 1988 by Chris Harris & Mike Stephens in their paper “A Combined Corner and Edge Detector”. I’ll use the example code given in the OpenCV documentation on this algorithm. I found the following image to help understand the concept of this algorithm:
I created an image with simple shapes in Google Slides with varying rounded-ness of corners. I’ll wrap the OpenCV documentation code in a function so I can call it with different parameters to experiment.
def do_corner_harris(fname='/content/shapes1.png', blockSize=2, ksize=3, k=0.04):
    img = cv.imread(fname)
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    gray = np.float32(gray)
    dst = cv.cornerHarris(gray, blockSize, ksize, k)

    # result is dilated for marking the corners, not important
    dst = cv.dilate(dst, None)

    # Threshold for an optimal value, it may vary depending on the image.
    img[dst > 0.01*dst.max()] = [0, 0, 255]
    cv2_imshow(img)
Using the default settings, the perfect square’s corners are detected. I asked the new Claude 3.5 Sonnet model how k affects the algorithm and it said that:
The value of k influences the trade-off between detecting true corners and rejecting edge-like features:
If k is too small: The algorithm becomes more sensitive to edges.
If k is too large: The algorithm may miss some corners.
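To make that trade-off concrete: Harris scores each pixel with R = det(M) - k * trace(M)^2, where M is the local structure tensor whose eigenvalues measure intensity change in two directions. Here is a toy sketch of that formula in terms of the eigenvalues (my own illustration, not OpenCV’s internals):
def harris_response(lambda1, lambda2, k=0.04):
    det_M = lambda1 * lambda2      # small if either eigenvalue is small (flat region or edge)
    trace_M = lambda1 + lambda2    # large for both edges and corners
    return det_M - k * trace_M**2  # a larger k penalizes edge-like points more
A point is declared a corner when R is large and positive, which is why a smaller k lets more edge-like points through.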
do_corner_harris()
If I reduce the value of k to 0.005, it starts detecting corners in the rounded squares (which is fine) but also on the circle (which seems like noise).
do_corner_harris(k=0.005)
I’ll go back to k=0.04 and change blockSize (the size of the neighborhood considered for corner detection) to see how it changes corner detection: a larger blockSize (20 instead of 2) starts to detect the rounded corners of the rounded square.
do_corner_harris(blockSize=20)
Decreasing ksize (the aperture of the Sobel derivative, from the default of 3 to 1) seems to increase the sensitivity of the corner detection: the rounded squares’ corners are detected, as well as points on the circle.
do_corner_harris(ksize=1)
Shi-Tomasi Corner Detector
Next, I’ll experiment with the Shi-Tomasi Corner Detector, which is a modification of the Harris Corner Detection algorithm.
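Concretely, the modification is the scoring function: instead of det(M) - k * trace(M)^2, Shi-Tomasi scores a point by the smaller eigenvalue of the structure tensor, so a corner needs strong gradients in both directions. A toy sketch (again my own illustration, not the OpenCV implementation):
def shi_tomasi_response(lambda1, lambda2):
    # corner quality is the weaker of the two directional responses
    return min(lambda1, lambda2)
In cv.goodFeaturesToTrack, qualityLevel thresholds this score relative to the best corner found in the image.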
def do_shi_tomasi(maxCorners=25, qualityLevel=0.01, minDistance=10):
    img = cv.imread('/content/shapes1.png')
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

    corners = cv.goodFeaturesToTrack(gray, maxCorners, qualityLevel, minDistance)
    corners = np.intp(corners)  # np.int0 is a deprecated alias as of NumPy 1.24

    for i in corners:
        x, y = i.ravel()
        cv.circle(img, (x, y), 3, 255, -1)

    plt.imshow(img)
    plt.show()
The documentation example settings in this case are more sensitive to corner-like regions, even detecting 8 points on the circle as corners.
do_shi_tomasi()
That might be because the number of corners it’s looking for is 25. Let’s see what happens if I reduce maxCorners to 5:
do_shi_tomasi(maxCorners=5)
Interestingly, reducing maxCorners doesn’t avoid corner detection on the circle. I’ll increase the qualityLevel and see if that avoids detection of corners on the circle:
do_shi_tomasi(maxCorners=100, qualityLevel=0.04)
After quadrupling the qualityLevel, even if maxCorners is 100, the corners on the rounded squares are not detected. Let’s see how changing minDistance (from 10 to 100) affects corner detection:
do_shi_tomasi(minDistance=100)
Fewer corners are detected because of the larger minimum distance required, but the quality of corners detected doesn’t improve.
Blob Detectors
I’ll be running the same code as the scikit-image documentation with different images, going from slowest and most accurate (Laplacian of Gaussian) to fastest (Determinant of Hessian).
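To sanity-check that speed ordering on my image, here’s a rough wall-clock timing sketch (absolute numbers will vary by machine; the relative order is what matters):
import time

image_gray = rgb2gray(cv.imread('/content/shapes1.png'))
for fn in (blob_log, blob_dog, blob_doh):
    start = time.time()
    fn(image_gray, max_sigma=30)  # shared settings for a rough comparison
    print(f'{fn.__name__}: {time.time() - start:.2f}s')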
Laplacian of Gaussian
def do_blob_log(fname='/content/shapes1.png', max_sigma=30, num_sigma=10, threshold=0.1):
    image = cv.imread(fname)
    image_gray = rgb2gray(image)

    blobs_log = blob_log(image_gray, max_sigma=max_sigma, num_sigma=num_sigma, threshold=threshold)

    # Compute radii in the 3rd column.
    blobs_log[:, 2] = blobs_log[:, 2] * sqrt(2)

    fig, ax = plt.subplots(1, 1, figsize=(3, 3))
    ax.set_title(f'Laplacian of Gaussian: {len(blobs_log)} blobs')
    ax.imshow(image)
    for blob in blobs_log:
        y, x, r = blob
        c = plt.Circle((x, y), r, color='red', linewidth=2, fill=False)
        ax.add_patch(c)
    ax.set_axis_off()
    plt.tight_layout()
    plt.show()
do_blob_log()
That took about 10 seconds to run and did not result in what I was expecting: it seems to be identifying the white space as blobs? I’m not sure how to interpret this result. I’ll change the max_sigma value and see what that does.
A very small max_sigma results in many smaller blobs around the edges of the shapes. A large max_sigma results in fewer, much larger blobs.
do_blob_log(max_sigma=1)
do_blob_log(max_sigma=100)
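One way I make sense of this: per the scikit-image docs, a detected blob’s radius is approximately sqrt(2) * sigma for a 2-D image (which is why the code above scales the third column by sqrt(2)), so max_sigma directly caps the largest blob the detector can report:
print(sqrt(2) * 1)    # max_sigma=1:   radius cap of ~1.4 px, hence many tiny blobs
print(sqrt(2) * 100)  # max_sigma=100: radius cap of ~141 px, hence fewer, huge blobs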
Changing num_sigma next (from 10 to 1 and 100):
A smaller num_sigma has a similar effect to a small max_sigma (many small blobs).
do_blob_log(num_sigma=1)
A larger num_sigma results in fewer blobs but of greatly varying sizes.
do_blob_log(num_sigma=100)
Changing threshold next (from the default of 0.1 to 0.01 and 0.4):
do_blob_log(threshold=0.01)
As threshold gets larger, the number of blobs decreases significantly.
do_blob_log(threshold=0.4)
Difference of Gaussian
def do_blob_dog(fname='/content/shapes1.png', max_sigma=30, threshold=0.1):
    image = cv.imread(fname)
    image_gray = rgb2gray(image)

    blobs_dog = blob_dog(image_gray, max_sigma=max_sigma, threshold=threshold)
    blobs_dog[:, 2] = blobs_dog[:, 2] * sqrt(2)

    fig, ax = plt.subplots(1, 1, figsize=(3, 3))
    ax.set_title(f'Difference of Gaussian: {len(blobs_dog)} blobs')
    ax.imshow(image)
    for blob in blobs_dog:
        y, x, r = blob
        c = plt.Circle((x, y), r, color='red', linewidth=2, fill=False)
        ax.add_patch(c)
    ax.set_axis_off()
    plt.tight_layout()
    plt.show()
I still don’t understand why it’s focused on detecting the white space and not the black objects.
do_blob_dog()
I wonder: if I reverse the colors in the image (manually, in Google Slides), would that change the results?
do_blob_dog(fname='/content/shapes3.png')
Yup: it’s now focused on the bright (white) objects against the dark (black) background. It’s still doing a terrible job of detecting the shapes, as it produces way too many blobs.
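As an aside, the inversion doesn’t have to happen in Google Slides; OpenCV can invert the image programmatically. A quick sketch (the output filename is just my placeholder):
img = cv.imread('/content/shapes1.png')
inverted = cv.bitwise_not(img)  # white shapes on a black background
cv.imwrite('/content/shapes1_inverted.png', inverted)  # placeholder path
do_blob_dog(fname='/content/shapes1_inverted.png')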
A larger max_sigma (100 instead of 30) doesn’t improve the results (we get only marginally closer to 4 total blobs).
do_blob_dog(fname='/content/shapes3.png', max_sigma=100)
A larger threshold value (0.32 instead of 0.1) captures one blob per small shape but doesn’t detect the largest square as a single blob (instead, it captures each corner as its own blob).
do_blob_dog(fname='/content/shapes3.png', threshold=0.32)
Increasing max_sigma with a larger threshold gets the desired result: all four shapes are detected with one blob each.
do_blob_dog(fname='/content/shapes3.png', max_sigma=100, threshold=0.35)
I’ll retry Laplacian of Gaussian on this new image with white shapes on a black background. Laplacian of Gaussian requires a slightly larger threshold of 0.45 to get the same results. Increasing max_sigma (with num_sigma at 10 or 3 and threshold=0.35) picks up additional noise (the four corners of the square).
do_blob_log(fname='/content/shapes3.png', max_sigma=100, threshold=0.45)
do_blob_log(fname='/content/shapes3.png', max_sigma=300, threshold=0.35)
do_blob_log(fname='/content/shapes3.png', max_sigma=100, num_sigma=3, threshold=0.35)
Determinant of Hessian
def do_blob_doh(fname='/content/shapes3.png', max_sigma=30, threshold=0.01):
    image = cv.imread(fname)
    image_gray = rgb2gray(image)

    blobs_doh = blob_doh(image_gray, max_sigma=max_sigma, threshold=threshold)

    fig, ax = plt.subplots(1, 1, figsize=(3, 3))
    ax.set_title(f'Determinant of Hessian: {len(blobs_doh)} blobs')
    ax.imshow(image)
    for blob in blobs_doh:
        y, x, r = blob
        c = plt.Circle((x, y), r, color='red', linewidth=2, fill=False)
        ax.add_patch(c)
    ax.set_axis_off()
    plt.tight_layout()
    plt.show()
The documentation values of max_sigma=30 and threshold=0.01 result in 21 blobs for the Determinant of Hessian approach.
do_blob_doh()
Increasing max_sigma significantly gets the correct number of blobs (4), but they are erratically positioned and don’t match the target shapes.
do_blob_doh(max_sigma=200)
I can’t quite find the right combination of parameters with the Determinant of Hessian approach. The best I can do (with the limited manual combinations I tried) is to get the right number of blobs (4) with the wrong placement.
do_blob_doh(threshold=0.023)
do_blob_doh(max_sigma=200, threshold=0.01)
do_blob_doh(max_sigma=175, threshold=0.025)
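One knob I haven’t tried: per the scikit-image docs, blob_doh also accepts num_sigma and log_scale parameters, so a finer or log-spaced sigma sampling might position the blobs better (an untested sketch):
image_gray = rgb2gray(cv.imread('/content/shapes3.png'))
blobs = blob_doh(image_gray, max_sigma=200, num_sigma=30, log_scale=True, threshold=0.01)
print(f'{len(blobs)} blobs')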
Scale-Invariant Feature Transform (SIFT)
From the OpenCV docs:
A corner may not be a corner if the image is scaled.
In the image below, I created a small rounded square at the top left. I copied that square and scaled it up a few times to get the “zoomed-in” corner on the bottom right.
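As an aside, the scaled copy could also be produced programmatically instead of in Google Slides; a sketch with placeholder paths:
img = cv.imread('/content/shapes1.png')
big = cv.resize(img, None, fx=4, fy=4, interpolation=cv.INTER_LINEAR)  # 4x upscale
cv.imwrite('/content/shapes1_4x.png', big)  # placeholder output path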
When using the Harris Corner Detection algorithm with an increased blockSize, the rounded corners in both the small and the large versions of the square are detected; however, I’m not sure if such large blocks are recommended or ever used. There also seems to be some overlap between the bottom-right corner of the small square and the top-right corner of the large square.
do_corner_harris(fname='/content/shapes4.png', blockSize=80)
If I use SIFT with a larger edgeThreshold than the default, it seems to identify keypoints along the rounded corner of the large square and around the entire smaller square. I’m not entirely sure how to interpret these keypoints, so I can’t say if it’s better or worse than the Harris Corner Detector.
def do_sift(fname='/content/shapes4.png', edgeThreshold=10):
    img = cv.imread(fname)
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

    sift = cv.SIFT_create(edgeThreshold=edgeThreshold)
    kp = sift.detect(gray, mask=None)

    img = cv.drawKeypoints(gray, kp, img, flags=cv.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
    cv2_imshow(img)
do_sift()
do_sift(fname='/content/shapes3.png', edgeThreshold=50)
With a larger edgeThreshold than the default, SIFT places keypoints on all corners of the shapes in the image, including points on the circle.
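To help with interpreting them: each cv.KeyPoint carries a position, scale, and orientation, which is what DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS renders as the circle’s size and the line’s angle. A quick inspection sketch:
img = cv.imread('/content/shapes3.png')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
kp = cv.SIFT_create(edgeThreshold=50).detect(gray, None)
for p in kp[:5]:  # look at the first few keypoints
    print(f'pt=({p.pt[0]:.0f}, {p.pt[1]:.0f}), size={p.size:.1f}, angle={p.angle:.1f}')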
Histogram of Oriented Gradients (HoG)
def do_hog(fname='/content/shapes3.png'):
    image = cv.imread(fname)

    fd, hog_image = hog(
        image,
        orientations=8,
        pixels_per_cell=(16, 16),
        cells_per_block=(1, 1),
        visualize=True,
        channel_axis=-1,
    )

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4), sharex=True, sharey=True)

    ax1.axis('off')
    ax1.imshow(image, cmap=plt.cm.gray)
    ax1.set_title('Input image')

    # Rescale histogram for better display
    hog_image_rescaled = exposure.rescale_intensity(hog_image, in_range=(0, 10))

    ax2.axis('off')
    ax2.imshow(hog_image_rescaled, cmap=plt.cm.gray)
    ax2.set_title('Histogram of Oriented Gradients')
    plt.show()
The Histogram of Oriented Gradients algorithm generally seems to find the gradient at each point of the shape’s boundary.
do_hog()
I’ll give it a much simpler image (straight lines) to see how it transforms that:
do_hog(fname='/content/shapes5.png')
It’s interesting to note that the diagonal lines produce “thicker” gradients than the vertical and horizontal lines.
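One last note on HoG: the visualization is just a diagnostic. The returned fd is the flattened feature vector that was actually fed to classifiers before deep learning (most famously a linear SVM in Dalal and Triggs’ 2005 pedestrian detector). A quick check of its size for this cell/block geometry:
image = cv.imread('/content/shapes5.png')
fd = hog(image, orientations=8, pixels_per_cell=(16, 16),
         cells_per_block=(1, 1), channel_axis=-1)  # visualize=False returns only fd
print(fd.shape)  # one 8-bin histogram per 16x16 cell, flattened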
Final Thoughts
Computer Vision is obviously a vast field with a significant history of research and applications. I’m not very familiar with this field outside of what I’ve learned and applied from the fastai course and Kaggle competitions about image recognition, so this was a helpful exercise to slightly expand my understanding of the CV universe.
I hope you enjoyed this blog post! Follow me on Twitter @vishal_learner.