Object Recognition (6) – YOLO

In previous posts, we covered the R-CNN series of models for object recognition and saw how it evolved from a multi-stage pipeline into a single network. Now that we know an object detection model can be reduced to a single network, can we simplify things further by dropping the RPN and using just a deep CNN? YOLO answers this question with a yes.


Object Recognition (5) – Mask R-CNN

Mask R-CNN is the last model of the R-CNN series.

As noted in the introduction, an object's location can be reported coarsely as a bounding box or finely as a set of pixels. With R-CNN optimized to good accuracy in both category and location detection, the next step is to report the object's location as pixels instead of a bounding box.

Continue reading Object Recognition (5) – Mask R-CNN

Object Recognition (4) – Faster R-CNN

Faster R-CNN is the fastest model of the R-CNN series. As the name suggests, it’s faster than Fast R-CNN.

Faster R-CNN solves the major remaining problem of Fast R-CNN. Selective Search is replaced with a Region Proposal Network (RPN), which is merged into the main network. Detection now requires only a single pass through the network.

Continue reading Object Recognition (4) – Faster R-CNN


Object Recognition (1) – Introduction

Object recognition is one of the hottest areas of Artificial Intelligence.

The ultimate goal is visual ability comparable to a human's: recognizing the scene, every object, and the background. For object recognition, then, a computer should report every object’s location and category. If an object is a known specific instance, its identity should also be reported.

Continue reading Object Recognition (1) – Introduction