
What is: YOLOP?

Source: YOLOP: You Only Look Once for Panoptic Driving Perception
Year: 2021
Data Source: CC BY-SA - https://paperswithcode.com

YOLOP is a panoptic driving perception network that handles traffic object detection, drivable area segmentation, and lane detection simultaneously. It is composed of one shared encoder for feature extraction and three decoders, one per task. It can be thought of as a lightweight version of Tesla's HydraNet model for self-driving cars.
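The one-encoder / three-decoder layout can be sketched as a module with a shared backbone feeding three task heads. This is a minimal illustrative skeleton, not the actual YOLOP implementation; the layer sizes, module names, and head designs here are assumptions chosen only to show the structure.

```python
# Sketch of the shared-encoder, three-decoder layout (illustrative, not YOLOP's real code).
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Stand-in backbone: downsamples the image and extracts shared features."""
    def __init__(self, out_channels=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, out_channels, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.features(x)

class SegHead(nn.Module):
    """Pixel-wise head: upsamples shared features back to input resolution."""
    def __init__(self, in_channels, num_classes=2):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(in_channels, num_classes, 1),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
        )
    def forward(self, feats):
        return self.head(feats)

class DetHead(nn.Module):
    """Grid-based head: one prediction vector per cell (boxes + objectness + classes)."""
    def __init__(self, in_channels, num_anchors=3, num_classes=1):
        super().__init__()
        self.pred = nn.Conv2d(in_channels, num_anchors * (5 + num_classes), 1)
    def forward(self, feats):
        return self.pred(feats)

class PanopticDrivingNet(nn.Module):
    """One shared encoder feeding three task-specific decoders."""
    def __init__(self):
        super().__init__()
        self.encoder = TinyEncoder()
        self.detect = DetHead(64)
        self.drivable_area = SegHead(64)
        self.lane_line = SegHead(64)
    def forward(self, x):
        feats = self.encoder(x)  # shared features, computed once per image
        return {
            "detection": self.detect(feats),
            "drivable_area": self.drivable_area(feats),
            "lane_line": self.lane_line(feats),
        }

x = torch.randn(1, 3, 256, 256)
outs = PanopticDrivingNet()(x)
print({k: tuple(v.shape) for k, v in outs.items()})
```

Because the encoder runs only once per image, the three tasks share most of the computation, which is what makes this design cheaper than running three separate networks.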

A lightweight CNN from Scaled-YOLOv4 is used as the encoder to extract features from the image. These feature maps are then fed to the three decoders to complete their respective tasks. The detection decoder is based on YOLOv4, a strong single-stage detection network, for two main reasons: (1) a single-stage detector is faster than a two-stage detector, and (2) the grid-based prediction mechanism of a single-stage detector is more closely related to the two semantic segmentation tasks, whereas instance segmentation is usually combined with a region-based detector, as in Mask R-CNN. The feature maps output by the encoder incorporate semantic features at different levels and scales, and the segmentation branches use these feature maps to produce pixel-wise predictions.
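To make the "grid-based prediction" point concrete, the sketch below decodes a raw detection tensor into per-cell box and score maps. It assumes a YOLO-style output layout of (x, y, w, h, objectness, class scores) per anchor per cell; the actual YOLOv4/YOLOP decoding differs in details such as anchor priors and coordinate scaling.

```python
# Illustrative decoding of a grid-based detection output (assumed YOLO-style layout).
import torch

def decode_grid(pred, num_anchors=3, num_classes=1):
    """pred: (B, A*(5+C), H, W) raw head output -> per-cell box/score tensors."""
    b, _, h, w = pred.shape
    pred = pred.view(b, num_anchors, 5 + num_classes, h, w)
    xy = torch.sigmoid(pred[:, :, 0:2])   # box centre offsets inside each grid cell
    wh = pred[:, :, 2:4].exp()            # box size, relative to anchor priors
    obj = torch.sigmoid(pred[:, :, 4:5])  # objectness score per cell/anchor
    cls = torch.sigmoid(pred[:, :, 5:])   # class scores per cell/anchor
    return xy, wh, obj, cls

raw = torch.randn(1, 3 * 6, 64, 64)       # matches the DetHead output shape above
xy, wh, obj, cls = decode_grid(raw)
print(xy.shape, wh.shape, obj.shape, cls.shape)
```

Like the segmentation heads, this detector produces a dense prediction over a spatial grid, which is why a single-stage head pairs naturally with pixel-wise segmentation decoders on the same shared features.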