Computer Vision — Maria Namitha Nelson

5 CV Algorithms Implemented

80 COCO Object Classes

0.93 Top Confidence Score (YOLO)

5 Test Images Detected

Overview

A comprehensive computer vision assignment implementing five foundational algorithms: Harris Corner Detection, Image Pyramids, ORB feature detection, SIFT keypoint detection, and YOLOv5 real-time object detection on the COCO dataset.

Q1 — Harris Corner Detection

The Harris algorithm detects corners by analysing local intensity changes in multiple directions: at a corner the intensity changes sharply in every direction, whereas along an edge it changes in only one. It was applied to a Rubik's Cube image, a geometrically rich subject with many well-defined corners. All 27 visible faces and edge intersections were correctly detected.

harris corner detection

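import cv2
import numpy as np
gray = np.float32(cv2.cvtColor(image, cv2.COLOR_BGR2GRAY))  # cornerHarris expects a single-channel image (assumes `image` is BGR)
corners = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)
corners = cv2.dilate(corners, None)  # thicken the responses so the marks are visible
image_with_corners[corners > 0.01 * corners.max()] = [0, 0, 255]  # mark strong corners in red (BGR)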

Subject image — used for Harris, ORB & SIFT detection
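
To make the corner response concrete, here is a minimal NumPy sketch of the Harris score R = det(M) - k * trace(M)^2, where M holds the windowed sums of the gradient products. It is an illustration under simplifying assumptions, not what cv2.cornerHarris does internally: it uses central differences instead of Sobel filters and a fixed 3x3 box window, and the helper names (box_sum, harris_response) and the synthetic test image are hypothetical.

```python
import numpy as np

def box_sum(a, radius=1):
    """Sum each pixel's (2*radius+1)^2 neighbourhood, zero-padded at the borders."""
    padded = np.pad(a, radius)
    out = np.zeros_like(a)
    h, w = a.shape
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out

def harris_response(gray, k=0.04):
    """Harris score per pixel: det(M) - k * trace(M)^2."""
    iy, ix = np.gradient(gray.astype(np.float64))  # derivatives along rows, then columns
    sxx = box_sum(ix * ix)
    syy = box_sum(iy * iy)
    sxy = box_sum(ix * iy)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2

# Synthetic test image (an assumption for illustration): a white square on black.
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
response = harris_response(img)

print(response[5, 5] > 0)    # corner of the square: positive score
print(response[10, 5] < 0)   # middle of an edge: negative score
print(response[1, 1] == 0)   # flat background: zero score
```

The sign pattern is the point of the sketch: a threshold such as 0.01 * corners.max() keeps only the strongly positive scores, which is why edges and flat regions are not marked.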

Q2 — Image Pyramid

A Gaussian image pyramid was constructed over 6 downsampling levels using cv2.pyrDown(), which blurs the image and halves its width and height at each level. It was applied to a building photograph to demonstrate multi-scale image representation, which is foundational for scale-invariant detection and image blending.

Input image for Gaussian pyramid construction (6 levels)
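
The pyramid construction can be sketched without OpenCV to show what each level involves. The NumPy stand-in below blurs with a 5-tap binomial kernel and keeps every second row and column, which approximates the behaviour of cv2.pyrDown(); in the actual pipeline cv2.pyrDown() would take the place of the hypothetical pyr_down helper. The image size and constant grey value are assumptions chosen only to make the result easy to check.

```python
import numpy as np

# 5-tap binomial kernel, a common Gaussian approximation; the weights sum to 1.
KERNEL = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0

def blur_axis(img, axis):
    """Convolve one axis with KERNEL, reflecting the image at its borders."""
    pad = [(0, 0)] * img.ndim
    pad[axis] = (2, 2)
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros(img.shape, dtype=np.float64)
    n = img.shape[axis]
    for i, weight in enumerate(KERNEL):
        out += weight * np.take(padded, np.arange(i, i + n), axis=axis)
    return out

def pyr_down(img):
    """Blur, then drop every second row and column (one pyramid step)."""
    blurred = blur_axis(blur_axis(img.astype(np.float64), 0), 1)
    return blurred[::2, ::2]

def build_pyramid(img, levels=6):
    """Return the original image followed by `levels` downsampled copies."""
    pyramid = [img.astype(np.float64)]
    for _ in range(levels):
        pyramid.append(pyr_down(pyramid[-1]))
    return pyramid

image = np.full((256, 320), 100.0)  # hypothetical constant-grey input
pyramid = build_pyramid(image, levels=6)
shapes = [level.shape for level in pyramid]
print(shapes)
```

Six downsampling steps yield seven images in total, and each step roughly quarters the pixel count, so the whole pyramid costs only about a third more memory than the original image.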

Q3 — ORB Feature Detection

SURF is patent-encumbered and ships only in OpenCV's non-free contrib module, which the standard opencv-python package does not include, so ORB (Oriented FAST and Rotated BRIEF) was used as a free, high-performance alternative. ORB detects keypoints and computes binary descriptors suitable for fast matching.

orb detector

orb = cv2.ORB_create()
keypoints, descriptors = orb.detectAndCompute(gray, None)
img_with_keypoints = cv2.drawKeypoints(
    image, keypoints, None, (255, 0, 0),
    flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS
)
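
Binary descriptors are fast to match because two of them can be compared with an XOR followed by a bit count, i.e. the Hamming distance. The sketch below illustrates that idea in plain NumPy on made-up data; it is not part of the original assignment code, and the helper names are hypothetical. The descriptors are random 32-byte rows, matching ORB's default 256-bit descriptor length. In OpenCV the same comparison is normally delegated to cv2.BFMatcher with cv2.NORM_HAMMING.

```python
import numpy as np

def hamming_distances(desc_a, desc_b):
    """Pairwise Hamming distances between two sets of uint8 binary descriptors."""
    # XOR every pair of descriptors, then count the bits that differ.
    xor = desc_a[:, None, :] ^ desc_b[None, :, :]
    return np.unpackbits(xor, axis=2).sum(axis=2)

def match_nearest(desc_a, desc_b):
    """For each descriptor in desc_a, return (index_a, index_b, distance) of its nearest neighbour."""
    dist = hamming_distances(desc_a, desc_b)
    nearest = dist.argmin(axis=1)
    return [(i, int(j), int(dist[i, j])) for i, j in enumerate(nearest)]

rng = np.random.default_rng(0)
# Five made-up 256-bit descriptors standing in for ORB output.
desc_b = rng.integers(0, 256, size=(5, 32), dtype=np.uint8)
# Query set: copies of rows 3 and 0, with two bits flipped in the first copy.
desc_a = desc_b[[3, 0]].copy()
desc_a[0, 0] ^= 0b00000101

matches = match_nearest(desc_a, desc_b)
print(matches)
```

Unrelated random descriptors differ in roughly half of their 256 bits, so a distance of a few bits is a strong indication that two keypoints describe the same image patch.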

Q4 — SIFT Keypoint Detection

SIFT (Scale-Invariant Feature Transform) detects and describes local features that are invariant to scale and rotation and robust to illumination changes. The rich circular keypoint visualisation shows scale and orientation for each detected feature.

sift detector

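sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(gray, None)
# DRAW_RICH_KEYPOINTS draws each keypoint as a circle sized by scale, with an orientation line
img_with_keypoints = cv2.drawKeypoints(
    image, keypoints, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS
)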

Q5 — YOLOv5 Object Detection (COCO Dataset)

YOLOv5 (via Ultralytics) was loaded with pre-trained weights and run on 5 test images from the COCO dataset. YOLO performs single-pass detection — classifying and localising all objects in one forward pass.

yolo inference

from ultralytics import YOLO
model   = YOLO("yolov5s.pt")
results = model(img_rgb)
boxes   = results[0].boxes
# boxes.cls   → class IDs
# boxes.xyxy  → bounding box coords
# boxes.conf  → confidence scores

YOLOv5 detection — airplane at 0.89 & 0.90 confidence

YOLO Detection Results

Image 1 — 1 person, 1 skateboard detected (conf: 0.82, 0.71)
Image 2 — 3 persons, 3 elephants detected (conf up to 0.93)
Image 3 — 1 apple, 1 orange detected (conf: 0.85, 0.52)
Image 4 — 2 persons, 5 cows detected (conf up to 0.91)
Image 5 — 2 airplanes detected (conf: 0.90, 0.89)
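
Per-image summaries like the ones above can be derived from the class IDs and confidence scores of a single result. The following is a minimal sketch, not the code used for the assignment: summarise_detections is a hypothetical helper, and the inputs are the Image 5 values listed above written out by hand. With Ultralytics, the lists would presumably come from boxes.cls.tolist() and boxes.conf.tolist(), and the ID-to-label mapping from model.names.

```python
from collections import defaultdict

def summarise_detections(class_ids, confidences, names):
    """Return {label: (count, top_confidence)} for one image's detections."""
    summary = defaultdict(lambda: [0, 0.0])
    for cls_id, conf in zip(class_ids, confidences):
        entry = summary[names[int(cls_id)]]
        entry[0] += 1                          # one more detection of this class
        entry[1] = max(entry[1], float(conf))  # keep the highest confidence seen
    return {label: (count, top) for label, (count, top) in summary.items()}

# Image 5 from the results list: two airplanes at 0.90 and 0.89.
# Only the COCO entry needed here is included (ID 4 is "airplane").
names = {4: "airplane"}
summary = summarise_detections([4, 4], [0.90, 0.89], names)
print(summary)
```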

Tech Stack

opencv-python 4.10 — Harris, pyramids, ORB, SIFT
ultralytics 8.3 — YOLOv5 inference
torch 2.4 + torchvision — deep learning backend
COCO dataset — 80 object classes, industry-standard benchmark
Google Colab — execution environment

Computer Vision: Harris, SIFT & YOLO