Edge AI Systems

Jetson, Coral, Raspberry Pi.

🎬 Starting with a story

Sending every inference request to the cloud means high latency, expensive bandwidth, and privacy risk. The fix is to shrink the model and run it on the device itself. CCTV cameras, drones, factory lines, smart doorbells: all of these now run Edge AI.

Why Edge AI?

Latency: roughly 5–20 ms on-device vs. 200+ ms through the cloud.
Privacy: images never leave the device.
Cost: no recurring bandwidth or cloud GPU bill for inference.
Offline: works without an internet connection.
Reliability: a network drop does not take the system down.

Techniques for shrinking a model

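Quantization: FP32 → INT8 (4× smaller, typically 2–4× faster on hardware with INT8 support).
Pruning: removing weights that contribute little to the output.
Knowledge distillation: a large teacher model trains a small student model.
Architecture search: compact architectures such as MobileNet, EfficientNet, YOLO-Nano.
Operator fusion: Conv+BN+ReLU merged into a single op.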
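
To make the quantization entry concrete, here is a minimal sketch of affine INT8 quantization in plain Python. It is an illustration only: real toolchains quantize per tensor or per channel with calibrated ranges, and the function names, sample weights, and parameter count below are hypothetical.

```python
def quantize_int8(values):
    """Affine (asymmetric) quantization of a list of floats to INT8."""
    lo, hi = min(values), max(values)
    # Widen the range so that 0.0 is always exactly representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map INT8 codes back to approximate float values."""
    return [(x - zero_point) * scale for x in q]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # hypothetical FP32 weights
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print("codes:", q)
print("max round-trip error:", max_err, "(one step is", scale, ")")

# Storage: 4 bytes per FP32 weight vs 1 byte per INT8 weight.
n_params = 3_000_000  # hypothetical parameter count
print("FP32 MB:", 4 * n_params / 1e6, "INT8 MB:", 1 * n_params / 1e6)
```

The round-trip error stays within about one quantization step, which is why a well-calibrated INT8 model usually loses little accuracy. The 4× size reduction follows directly from the byte widths.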

Formats and runtimes

text

PyTorch (.pt) ─export─► ONNX (.onnx) ─convert─► target runtime
                                                 ├─ TensorRT (Jetson)
                                                 ├─ TFLite (Coral, Mobile)
                                                 ├─ OpenVINO (Intel)
                                                 └─ CoreML (Apple)
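
The diagram can be read as a lookup from deployment target to export format. The sketch below encodes it with the `format` strings that Ultralytics `export()` accepts. The target keys (`"jetson"`, `"coral"`, and so on) and the helper name are hypothetical, chosen only for this illustration.

```python
# Hardware target -> Ultralytics export `format` string.
EXPORT_FORMATS = {
    "jetson": "engine",    # TensorRT engine
    "coral": "edgetpu",    # TFLite compiled for the Edge TPU
    "mobile": "tflite",    # plain TFLite
    "intel": "openvino",   # OpenVINO IR
    "apple": "coreml",     # CoreML
}


def export_format_for(target: str) -> str:
    """Return the export format for a target, falling back to plain ONNX."""
    return EXPORT_FORMATS.get(target.lower(), "onnx")


print(export_format_for("Jetson"))   # engine
print(export_format_for("unknown"))  # onnx
```

With Ultralytics installed, the result would be passed along as `YOLO("yolov8n.pt").export(format=export_format_for("intel"))`.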

YOLOv8 → Jetson (TensorRT)

python

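from ultralytics import YOLO

model = YOLO("yolov8n.pt")
# Run the export on the Jetson itself: a TensorRT engine is tied to the GPU
# and TensorRT version it was built with.
model.export(format="engine", half=True, device=0)  # FP16 TensorRT engine

# Inference with the exported engine:
trt_model = YOLO("yolov8n.engine")
results = trt_model("test.jpg")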

Speed

On a Jetson Orin Nano, YOLOv8n in FP16 runs at roughly 60–80 FPS at 640×640, which is about 12.5–16.7 ms per frame.
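
A frame rate like this is worth re-measuring on your own device. Below is a minimal timing sketch, assuming `infer` is any callable that blocks until the result is ready. The `fake_infer` stand-in and the iteration counts are hypothetical; replace `fake_infer` with a call into your model.

```python
import time


def measure_fps(infer, warmup=10, iters=100):
    """Time `infer` and return (frames per second, mean latency in ms)."""
    for _ in range(warmup):  # warm-up runs are excluded from the timing
        infer()
    start = time.perf_counter()
    for _ in range(iters):
        infer()
    elapsed = time.perf_counter() - start
    latency_ms = 1000.0 * elapsed / iters
    return 1000.0 / latency_ms, latency_ms


def fake_infer():
    """Stand-in for a model call; sleeps about 2 ms."""
    time.sleep(0.002)


fps, latency_ms = measure_fps(fake_infer, warmup=2, iters=20)
print(f"{fps:.1f} FPS, {latency_ms:.2f} ms per frame")
```

The warm-up matters on real hardware: the first few calls often include engine loading and memory allocation, which would distort the average.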

TFLite + Edge TPU (Coral)

bash

# 1) Convert the model to TFLite with full-integer (INT8) quantization
# 2) Compile it with the Edge TPU compiler (writes model_int8_edgetpu.tflite)
edgetpu_compiler model_int8.tflite
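
Step 1 above has no code, so here is one way it might look with the TensorFlow Lite converter. This is a sketch under assumptions: the model is available as a TensorFlow SavedModel, and the function name, the input shape, and the random calibration data are placeholders. Real calibration needs preprocessed images from your own dataset.

```python
def convert_to_int8_tflite(saved_model_dir, out_path="model_int8.tflite",
                           input_shape=(1, 224, 224, 3), n_samples=100):
    """Convert a SavedModel to a full-integer TFLite file."""
    # Imported lazily: TensorFlow is only needed on the conversion machine.
    import tensorflow as tf

    def representative_dataset():
        # Placeholder calibration data. Random tensors only let the script
        # run; use real preprocessed images for usable accuracy.
        for _ in range(n_samples):
            yield [tf.random.uniform(input_shape, 0.0, 1.0)]

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Require INT8 kernels for every op so the Edge TPU compiler can map them.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8

    with open(out_path, "wb") as f:
        f.write(converter.convert())
    return out_path
```

Calling `convert_to_int8_tflite("saved_model/")` would write `model_int8.tflite`, the file passed to `edgetpu_compiler` in step 2.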

python

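from PIL import Image
from pycoral.adapters import common, detect
from pycoral.utils.edgetpu import make_interpreter

interpreter = make_interpreter("model_int8_edgetpu.tflite")  # output name from edgetpu_compiler
interpreter.allocate_tensors()

# Per frame: resize to the model input, run, read detections (~15 ms per frame at ~2 W)
image = Image.open("test.jpg").convert("RGB").resize(common.input_size(interpreter))
common.set_input(interpreter, image)
interpreter.invoke()
objects = detect.get_objects(interpreter, score_threshold=0.4)  # assumes an SSD-style detection model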

Deployment Checklist

Know your power budget (W) and when thermal throttling kicks in.
Benchmark the whole pipeline (camera → encode → inference → output), not just the model.
Plan an OTA strategy for model updates.
Logging and crash recovery (systemd, supervisor).
Security: encrypted model, signed firmware.
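
For the OTA and security items, a common building block is to verify a downloaded model before swapping it in. The sketch below checks a SHA-256 hash and then replaces the live file atomically. It is an integrity check only, not a signature scheme: a real deployment would also verify a signature over the manifest that carries the hash. All file names and function names here are hypothetical.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large models do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def install_model(downloaded_path, live_path, expected_sha256):
    """Swap in a downloaded model only if its hash matches the manifest."""
    actual = sha256_of(downloaded_path)
    if not hmac.compare_digest(actual, expected_sha256):
        os.remove(downloaded_path)  # discard the bad download, keep the old model
        return False
    os.replace(downloaded_path, live_path)  # atomic when both are on one filesystem
    return True


# Demo with throwaway files standing in for real model binaries.
workdir = tempfile.mkdtemp()
live = os.path.join(workdir, "model.tflite")
new = os.path.join(workdir, "model.tflite.download")
good_hash = hashlib.sha256(b"new-model").hexdigest()

with open(live, "wb") as f:
    f.write(b"old-model")
with open(new, "wb") as f:
    f.write(b"new-model")
ok = install_model(new, live, good_hash)

with open(new, "wb") as f:
    f.write(b"corrupted")
rejected = install_model(new, live, good_hash)
print(ok, rejected)
```

Because the swap is a single rename, a crash or power loss during the update leaves either the old model or the new one on disk, never a half-written file.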

End of Phase 6: what did you learn?

Vision Transformers and attention-based vision.
GANs: generator/discriminator, from DCGAN to StyleGAN.
Stable Diffusion: latent diffusion, ControlNet, LoRA.
CLIP, BLIP, VLMs: multimodal AI.
3D vision: stereo, monocular depth, point clouds.
Visual SLAM: localization + mapping.
Edge AI: Jetson, Coral, quantization, TensorRT.

Next: Phase 7, Production & Deployment

You have learned how to train models; next you will learn how to serve them: FastAPI, Docker, ONNX, TensorRT, GPU acceleration, and streaming pipelines.

Practice tasks

Quantize YOLOv8n from FP32 to INT8 and measure the accuracy drop.
Run MobileNet-V3 on a Raspberry Pi 5 and benchmark the FPS.
Deploy real-time person detection on a CCTV stream on a Jetson Nano.