Chapter: Phase 5 · Deep Learning for Vision
5.5 · 20 min read

YOLO Object Detection

Detection with YOLOv8.

🎬 Starting with a story
Object detection answers not only "what is this" but also "where is it". In 2016, YOLO (You Only Look Once) showed how to do both with a single network in a single pass over the image. As of 2025, Ultralytics YOLOv8 is a common choice for production use.

Install and first inference

bash
pip install ultralytics
python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")           # n = nano; larger variants: s/m/l/x
results = model("dhaka_street.jpg")  # or a video, webcam (0), or URL

for r in results:
    r.save(filename="out.jpg")       # image with bounding boxes drawn
    for box, cls, conf in zip(r.boxes.xyxy, r.boxes.cls, r.boxes.conf):
        print(model.names[int(cls)], float(conf), box.tolist())

Webcam real-time

python
from ultralytics import YOLO
import cv2

model = YOLO("yolov8n.pt")
cap = cv2.VideoCapture(0)
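while True:
    ok, frame = cap.read()
    if not ok:                        # camera closed or frame grab failed
        break
    res = model(frame, verbose=False, imgsz=640, conf=0.4)[0]
    cv2.imshow("yolo", res.plot())    # frame with boxes drawn
    if cv2.waitKey(1) & 0xFF == 27:   # Esc quits
        break
cap.release()                         # free the camera
cv2.destroyAllWindows()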

Custom dataset — YOLO format

text
data/
├── images/
│   ├── train/  001.jpg ...
│   └── val/
├── labels/
│   ├── train/  001.txt   (class cx cy w h — normalized 0-1)
│   └── val/
└── data.yaml
yaml
# data.yaml
path: ./data
train: images/train
val:   images/val
names:
  0: person
  1: rickshaw
  2: bus
  3: car
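
Each label line stores the box center and size as fractions of the image width and height. If your annotations are pixel corner coordinates instead, the conversion can be sketched as below. The function name `to_yolo_line` and the sample numbers are illustrative assumptions, not part of any library.

```python
def to_yolo_line(class_id, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-space corner box (x1, y1, x2, y2) to a YOLO label line.

    Hypothetical helper for illustration: output is "class cx cy w h",
    with all four box values normalized to the 0-1 range.
    """
    cx = (x1 + x2) / 2 / img_w   # box center x, as a fraction of image width
    cy = (y1 + y2) / 2 / img_h   # box center y, as a fraction of image height
    w = (x2 - x1) / img_w        # box width, normalized
    h = (y2 - y1) / img_h        # box height, normalized
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


# A rickshaw (class 1) in a 640x480 image, made-up coordinates
line = to_yolo_line(1, 100, 200, 300, 400, img_w=640, img_h=480)
print(line)  # 1 0.312500 0.625000 0.312500 0.416667
```

One such line per object goes into the `.txt` file that shares its name with the image.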

Training

python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")           # pre-trained start
model.train(
    data="data/data.yaml",
    epochs=50, imgsz=640, batch=16,
    device=0,                        # GPU
    name="dhaka-traffic",
)
metrics = model.val()                # mAP50, mAP50-95
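
Both metrics in the comment above are built on IoU (intersection over union). mAP50 counts a prediction as a match when its IoU with a ground-truth box of the same class is at least 0.5, while mAP50-95 averages over IoU thresholds from 0.5 to 0.95. A minimal sketch of the IoU calculation itself, with an illustrative function name and made-up boxes:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


pred = (0, 0, 10, 10)    # predicted box
truth = (5, 0, 15, 10)   # ground-truth box, shifted half a box to the right
score = iou(pred, truth)
print(round(score, 3), score >= 0.5)  # 0.333 False -> not a match at the 0.5 threshold
```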

Labeling tool
LabelImg, Roboflow, and CVAT can all export in YOLO format. Roboflow, which runs in the browser, is the easiest to start with.

Model size choice

Model     Params   Use case
yolov8n   3.2M     Edge, Jetson Nano, ~30 FPS CPU
yolov8s   11M      Decent GPU real-time
yolov8m   26M      Balanced production
yolov8l   44M      High accuracy, server
yolov8x   68M      Best accuracy, slow
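
Speed figures like these depend heavily on hardware, so it is worth measuring on your own machine before picking a size. Below is a rough timing sketch. `measure_fps` is a hypothetical helper, and the lambda is a stand-in workload so the snippet runs without a model.

```python
import time


def measure_fps(infer, warmup=3, runs=20):
    """Call `infer` repeatedly and return the average calls per second."""
    for _ in range(warmup):      # warm-up calls are not timed
        infer()
    start = time.perf_counter()
    for _ in range(runs):
        infer()
    elapsed = time.perf_counter() - start
    return runs / elapsed


# Stand-in workload; replace with a real inference call
fps = measure_fps(lambda: sum(range(10_000)))
print(f"{fps:.1f} FPS")
```

With a loaded model and a sample frame, you could pass something like `lambda: model(frame, verbose=False)` as `infer` and compare the n/s/m/l/x variants on the same input.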

Tracking + counting (built-in)

python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
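# stream=True yields results frame by frame instead of keeping the whole video in memory
for r in model.track(source="traffic.mp4", stream=True, persist=True,
                     tracker="bytetrack.yaml", conf=0.4):
    if r.boxes.id is None:           # no tracked objects in this frame
        continue
    for box, tid, cls in zip(r.boxes.xyxy, r.boxes.id, r.boxes.cls):
        print(int(tid), model.names[int(cls)])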
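
The loop above gives you a stable ID per object. The counting part can be layered on top by checking when a track's box center moves from one side of a line to the other. The sketch below is one possible approach, not an Ultralytics feature: `LineCounter` and `side_of_line` are hypothetical names, the line is treated as infinite, and the boxes are made-up values.

```python
def side_of_line(point, a, b):
    """Cross-product sign: positive on one side of line a->b, negative on the other."""
    return (b[0] - a[0]) * (point[1] - a[1]) - (b[1] - a[1]) * (point[0] - a[0])


class LineCounter:
    """Counts how many times tracked box centers cross the line through a and b."""

    def __init__(self, a, b):
        self.a, self.b = a, b
        self.last_side = {}   # track id -> side seen on the previous update
        self.count = 0

    def update(self, track_id, box):
        x1, y1, x2, y2 = box
        center = ((x1 + x2) / 2, (y1 + y2) / 2)
        side = side_of_line(center, self.a, self.b)
        if side == 0:         # exactly on the line: wait for the next frame
            return
        prev = self.last_side.get(track_id)
        if prev is not None and (prev > 0) != (side > 0):
            self.count += 1   # sign flipped, so the center crossed the line
        self.last_side[track_id] = side


counter = LineCounter(a=(0, 100), b=(200, 100))    # horizontal line at y = 100
for box in [(40, 70, 60, 90), (40, 85, 60, 105), (40, 100, 60, 120)]:
    counter.update(7, box)                         # track 7 moves downward across the line
counter.update(8, (100, 40, 120, 60))              # track 8 stays above the line
print(counter.count)  # 1
```

Inside the tracking loop you would call `counter.update(int(tid), box.tolist())` for every tracked box.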

Practice tasks
  1. Run detection on one of your own images with the pre-trained YOLOv8n.
  2. Label 50 images in Roboflow and train a rickshaw detector.
  3. Combine tracking with line-crossing counting to build a vehicle counter.

Summary

  • YOLO is a single-shot detector: fast and accurate.
  • With Ultralytics, inference takes 3 lines and custom training is straightforward.
  • Dataset format: YOLO txt (class cx cy w h).
  • n/s/m/l/x span the speed-accuracy spectrum.