What is: Grid Sensitive?

Source: YOLOv4: Optimal Speed and Accuracy of Object Detection
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

Grid Sensitive is a trick for object detection introduced by YOLOv4. In the original YOLOv3, the coordinates of the bounding box center, $x$ and $y$, are decoded as

$$\begin{aligned} &x = s \cdot \left(g_{x} + \sigma\left(p_{x}\right)\right) \\ &y = s \cdot \left(g_{y} + \sigma\left(p_{y}\right)\right) \end{aligned}$$

where $\sigma$ is the sigmoid function, $g_{x}$ and $g_{y}$ are integers, and $s$ is a scale factor. Because the sigmoid only takes values strictly between 0 and 1, $x$ can never be exactly equal to $s \cdot g_{x}$ or $s \cdot \left(g_{x}+1\right)$, and likewise for $y$. This makes it difficult to predict the centers of bounding boxes that lie exactly on a grid boundary. We can address this problem by changing the equation to

$$\begin{aligned} &x = s \cdot \left(g_{x} + \alpha \cdot \sigma\left(p_{x}\right) - (\alpha-1)/2\right) \\ &y = s \cdot \left(g_{y} + \alpha \cdot \sigma\left(p_{y}\right) - (\alpha-1)/2\right) \end{aligned}$$

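For $\alpha > 1$, the offset $\alpha \cdot \sigma(p) - (\alpha-1)/2$ ranges over the open interval $\left(-(\alpha-1)/2,\ 1+(\alpha-1)/2\right)$, which contains both 0 and 1. This makes it easier for the model to predict a bounding box center located exactly on a grid boundary. The FLOPs added by Grid Sensitive are negligible.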
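The two decoding formulas can be compared with a minimal Python sketch. It is an illustration under stated assumptions, not the YOLOv4 implementation: the function names, the grid index, the scale, and the value `ALPHA = 1.05` are all hypothetical choices made for this example.

```python
import math


def sigmoid(p: float) -> float:
    """Numerically stable logistic function."""
    if p >= 0:
        return 1.0 / (1.0 + math.exp(-p))
    e = math.exp(p)
    return e / (1.0 + e)


def decode_center(p: float, g: int, s: float, alpha: float = 1.0) -> float:
    """Decode one center coordinate from raw prediction p, grid index g, scale s.

    alpha = 1.0 reduces to the original YOLOv3 form s * (g + sigmoid(p)).
    """
    return s * (g + alpha * sigmoid(p) - (alpha - 1.0) / 2.0)


def raw_prediction_for_offset(offset: float, alpha: float) -> float:
    """Invert the decoding: the raw p that yields a given offset within a cell.

    Raises ValueError when no finite p reaches the offset, which is the case
    for offsets 0 and 1 when alpha = 1.0.
    """
    t = (offset + (alpha - 1.0) / 2.0) / alpha  # required sigmoid output
    if not 0.0 < t < 1.0:
        raise ValueError("offset not reachable with a finite prediction")
    return math.log(t / (1.0 - t))  # logit of t


ALPHA = 1.05      # assumed illustrative value, not taken from the text above
g_x, s = 3, 32.0  # hypothetical grid index and scale factor

# Grid Sensitive: a finite prediction lands on the cell boundary s * g_x.
p_edge = raw_prediction_for_offset(0.0, ALPHA)
x_edge = decode_center(p_edge, g_x, s, ALPHA)

# Original form: even a strongly negative prediction stays inside the cell.
x_plain = decode_center(-10.0, g_x, s)

print(p_edge, x_edge, x_plain)
```

With `alpha = 1.0`, `raw_prediction_for_offset(0.0, 1.0)` raises `ValueError`, which mirrors the point made above: the boundary would require a sigmoid output of exactly 0, and no finite input produces that.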