Yolo

Introduction

current detection system repurpose classifiers to perform detection.

DPM: sliding window
R-CNN: region propoal + classifier

YOLO Detection System

1.resizes the input size to 488*488.
2.runs a single convolutional network on the image.
3.thresholds the resulting detection.

benefit

1.fast. 2.reason globally about the image when making predictions. 3.learns generalizable representations of object.

Unified Detection

divides the input image into an SS grid.
Each grid predicts B bounding boxes. Each bounding box consists of 5 predictions: x, y, w, h, confidence and C conditional class probabilities.
confidence: predicts the intersection over union between the predicted box and the ground truth.
S
S(B5 + C) tensor.
output: SSC tensor.

Network Design

overview fast-yolo uses a neural network with fewer convolutional layers and fewer fiflters in those layers.

Limitations of YOLO

1.struggles with small objects.
2.struggles to generalize to objects in new or unuasal aspect ratios or configurations.

  1. A small error in a small box has a greater effect on IOU compared to a large box.
    main source of error: incorrect localizations.

Conclusion

real-time object detection. simple and can trained directly on full images.
Unlike classifier-based approaches
1.YOLO is trained on a loss function that directly corresponds to detection performance
2.the entire model is trained jointly.

发布于: 2017年 03月 18日