Xceed Imagination
Computer Vision · Production
Technical · 6 min read

Running YOLO on Edge Devices: Lessons from the Factory Floor

NVIDIA Jetson, INT8 quantization, and the latency budget you actually need for real-time inspection.

Running computer vision models on edge devices in a factory is a different game from running them in a cloud notebook. Here's what we learned deploying YOLO-based inspection systems on NVIDIA Jetson devices.

The latency budget

Our production line moves at 60 parts per minute, which gives us 1 second per part. Fixed costs take 150 ms of that: camera capture (50 ms), image preprocessing (30 ms), network transfer (20 ms when using a separate compute node), and post-processing (50 ms). That leaves roughly 850 ms for inference. Comfortable for one model, tight for an ensemble.
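The budget arithmetic is worth keeping in code so it can be rerun whenever the line speed or a pipeline stage changes. The sketch below uses the figures quoted above; the function and dictionary names are ours for illustration.

```python
# Latency budget for one inspection station, in milliseconds.
PARTS_PER_MINUTE = 60

# Fixed per-part costs quoted in the text.
FIXED_COSTS_MS = {
    "camera_capture": 50,
    "preprocessing": 30,
    "network_transfer": 20,  # only when using a separate compute node
    "post_processing": 50,
}


def inference_budget_ms(parts_per_minute, fixed_costs_ms):
    """Return the time left for inference per part, in milliseconds."""
    cycle_ms = 60_000 / parts_per_minute
    return cycle_ms - sum(fixed_costs_ms.values())


budget = inference_budget_ms(PARTS_PER_MINUTE, FIXED_COSTS_MS)
print(f"Inference budget: {budget:.0f} ms")  # Inference budget: 850 ms
```

Doubling the line speed to 120 parts per minute halves the cycle to 500 ms, and the same fixed costs then leave only 350 ms for inference.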

Quantization: the free lunch

We went from FP32 (45 ms inference) to FP16 (25 ms) to INT8 (12 ms), a 3.75x speedup overall, with less than 1% accuracy loss. INT8 quantization on TensorRT is essentially free performance. The trick is using a representative calibration dataset: at least 500 images that cover all defect types and lighting conditions.
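One way to build a calibration set that covers every defect type and lighting condition is to sample round-robin across groups, so rare combinations get in before common ones fill the remaining slots. This is a minimal sketch, not our production tooling: the `(path, defect_type, lighting)` tuple format and the function name `pick_calibration_set` are assumptions made for illustration.

```python
import random
from collections import defaultdict


def pick_calibration_set(images, min_total=500, seed=0):
    """Pick calibration images covering every (defect, lighting) group.

    `images` is a list of (path, defect_type, lighting) tuples. This
    labelling scheme is hypothetical. If fewer than `min_total` images
    are available, all of them are returned.
    """
    groups = defaultdict(list)
    for path, defect, lighting in images:
        groups[(defect, lighting)].append(path)

    # Shuffle within each group so the pick is not biased by file order.
    rng = random.Random(seed)
    for paths in groups.values():
        rng.shuffle(paths)

    # Round-robin across groups: every group contributes one image per
    # pass until it runs out or the target size is reached.
    selected = []
    target = min(min_total, len(images))
    while len(selected) < target:
        for paths in groups.values():
            if paths and len(selected) < target:
                selected.append(paths.pop())
    return selected
```

The resulting list of paths would then be fed to whatever INT8 calibration step the toolchain provides.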

What actually fails in production

  1. Lighting changes. Sunlight through a factory window at 4pm creates shadows that didn't exist during training. Solution: controlled LED lighting with diffusers, plus data augmentation with brightness/contrast variations.
  2. Camera drift. Vibrations from the press line slowly shift the camera angle over weeks. Solution: a calibration check every shift using a reference pattern.
  3. New defect types. The model sees a defect it wasn't trained on and either ignores it or misclassifies it. Solution: an anomaly detection layer that flags anything the model is uncertain about.
  4. Temperature. Jetson devices throttle in hot factory environments. Solution: proper heatsinks, thermal monitoring, and a fallback to lower-res inference if temperature spikes.
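Three of these safeguards reduce to small checks that can run alongside inference. The sketch below is illustrative only: the thresholds (3 px drift tolerance, a 0.3 to 0.6 confidence band, an 80 °C limit, 640 and 416 px input sizes) are placeholder assumptions, and the confidence band is a simple stand-in for a real anomaly detection layer.

```python
def camera_has_drifted(reference_xy, measured_xy, tolerance_px=3.0):
    """Return True if the reference pattern moved more than the tolerance.

    `reference_xy` is where the pattern was at installation and
    `measured_xy` is where it is found during the per-shift check.
    """
    dx = measured_xy[0] - reference_xy[0]
    dy = measured_xy[1] - reference_xy[1]
    return (dx * dx + dy * dy) ** 0.5 > tolerance_px


def needs_review(confidences, low=0.3, high=0.6):
    """Flag a frame when any detection confidence is in the uncertain band.

    A stand-in for an anomaly detector: confident detections pass,
    borderline ones are routed to a human or a secondary check.
    """
    return any(low <= c < high for c in confidences)


def parse_sysfs_temp(raw):
    """Convert a Linux thermal-zone reading to degrees Celsius.

    Thermal zones under /sys/class/thermal report millidegrees,
    e.g. '45000' for 45 C. Which zone to read is device-specific.
    """
    return int(raw.strip()) / 1000.0


def pick_input_size(temp_c, normal=640, reduced=416, limit_c=80.0):
    """Fall back to a lower input resolution when the device runs hot."""
    return reduced if temp_c >= limit_c else normal
```

For example, `pick_input_size(parse_sysfs_temp("85000"))` selects the reduced size, while a reading of `"60000"` keeps the normal one.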

Our edge stack

NVIDIA Jetson Orin NX, YOLOv8 with TensorRT optimization, Docker containers for deployment, MQTT for result streaming, and a central dashboard for monitoring all stations.
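For the MQTT leg, each station only needs to publish a small JSON message per part. The sketch below builds such a message; the topic layout `inspection/<station>/results` and the payload fields are assumptions for illustration, not our actual schema. Publishing is left to whichever MQTT client library the station uses.

```python
import json
import time


def build_result_message(station_id, part_id, detections, inference_ms,
                         timestamp=None):
    """Return an (MQTT topic, JSON payload) pair for one inspected part.

    `detections` is a list of dicts such as
    {"label": "scratch", "confidence": 0.91}. Field names are hypothetical.
    """
    topic = f"inspection/{station_id}/results"
    payload = json.dumps({
        "station": station_id,
        "part": part_id,
        "timestamp": time.time() if timestamp is None else timestamp,
        "inference_ms": inference_ms,
        "defect_count": len(detections),
        "detections": detections,
    })
    return topic, payload


topic, payload = build_result_message(
    "station-07", 1234,
    [{"label": "scratch", "confidence": 0.91}],
    inference_ms=12.0,
)
# The pair is then handed to the MQTT client's publish call.
```

One topic per station lets the dashboard subscribe to all stations with a single wildcard subscription such as `inspection/+/results`.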

Written by the Xceed AI team.