
What is: Balanced L1 Loss?

Source: Libra R-CNN: Towards Balanced Learning for Object Detection
Year: 2019
Data Source: CC BY-SA - https://paperswithcode.com

Balanced L1 Loss is a loss function used for the object detection task. Since Fast R-CNN, classification and localization problems have been solved simultaneously under the guidance of a multi-task loss, defined as:

$$L_{p,u,t^{u},v} = L_{cls}\left(p, u\right) + \lambda\left[u \geq 1\right]L_{loc}\left(t^{u}, v\right)$$

$L_{cls}$ and $L_{loc}$ are objective functions corresponding to recognition and localization, respectively. Predictions and targets in $L_{cls}$ are denoted as $p$ and $u$. $t^{u}$ is the corresponding regression result for class $u$, and $v$ is the regression target. $\lambda$ is used for tuning the loss weight under multi-task learning. Samples with a loss greater than or equal to 1.0 are called outliers; the other samples are called inliers.
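As a concrete illustration, here is a minimal PyTorch sketch of this multi-task objective. The function and variable names (`multi_task_loss`, `bbox_pred`, etc.) are illustrative assumptions, not from the paper, and the regression predictions are assumed to already be selected for the labelled class:

```python
import torch
import torch.nn.functional as F

def multi_task_loss(cls_logits, labels, bbox_pred, bbox_targets, lam=1.0):
    """Fast R-CNN style multi-task loss: L = L_cls + lambda * [u >= 1] * L_loc.

    cls_logits:   (N, num_classes) classification scores p
    labels:       (N,) ground-truth classes u (0 = background)
    bbox_pred:    (N, 4) regression results t^u for the labelled class
    bbox_targets: (N, 4) regression targets v
    """
    # Classification term L_cls(p, u)
    l_cls = F.cross_entropy(cls_logits, labels)

    # Localization term is only active for foreground samples ([u >= 1]);
    # smooth L1 stands in here, balanced L1 is defined further below
    fg = labels >= 1
    if fg.any():
        l_loc = F.smooth_l1_loss(bbox_pred[fg], bbox_targets[fg])
    else:
        l_loc = bbox_pred.sum() * 0.0  # keeps the graph connected

    return l_cls + lam * l_loc
```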

A natural solution for balancing the involved tasks is to tune their loss weights. However, because the regression targets are unbounded, directly raising the weight of the localization loss makes the model more sensitive to outliers. These outliers, which can be regarded as hard samples, produce excessively large gradients that are harmful to the training process. The inliers, which can be regarded as easy samples, contribute little to the overall gradient compared with the outliers: on average, an inlier contributes only about 30% of the gradient of an outlier per sample. Considering these issues, the authors introduced the balanced L1 loss, denoted as $L_{b}$.

Balanced L1 loss is derived from the conventional smooth L1 loss, in which an inflection point is set to separate inliers from outliers, and the large gradients produced by outliers are clipped at a maximum value of 1.0, as shown by the dashed lines in the figure from the paper. The key idea of balanced L1 loss is to promote the crucial regression gradients, i.e. gradients from inliers (accurate samples), in order to rebalance the involved samples and tasks, thus achieving more balanced training across classification, overall localization, and accurate localization. The localization loss $L_{loc}$ using balanced L1 loss is defined as:

$$L_{loc} = \sum_{i \in \{x,y,w,h\}} L_{b}\left(t^{u}_{i} - v_{i}\right)$$

The figure in the paper shows that the balanced L1 loss increases the gradients of inliers under the control of a factor denoted as $\alpha$. A small $\alpha$ increases the gradient of inliers more, while the gradients of outliers are not influenced. Besides, an overall promotion magnification controlled by $\gamma$ is also brought in for tuning the upper bound of regression errors, which helps the objective function better balance the involved tasks. The two factors, which control different aspects, are mutually enhanced to reach more balanced training. $b$ is used to ensure that $L_{b}\left(x = 1\right)$ has the same value for both formulations in the equation below.
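For reference, the gradient formulation from the Libra R-CNN paper, which the loss below integrates, is:

$$\frac{\partial L_{b}}{\partial x} = \alpha \ln\left(b|x| + 1\right) \text{ if } |x| < 1$$

$$\frac{\partial L_{b}}{\partial x} = \gamma \text{ otherwise}$$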

By integrating the gradient formulation above, we can get the balanced L1 loss as:

$$L_{b}\left(x\right) = \frac{\alpha}{b}\left(b|x| + 1\right)\ln\left(b|x| + 1\right) - \alpha|x| \quad \text{if } |x| < 1$$

$$L_{b}\left(x\right) = \gamma|x| + C \quad \text{otherwise}$$

in which the parameters $\gamma$, $\alpha$, and $b$ are constrained by $\alpha \ln\left(b + 1\right) = \gamma$. The default parameters are set as $\alpha = 0.5$ and $\gamma = 1.5$.
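Below is a minimal, self-contained PyTorch sketch of the balanced L1 loss under these defaults. The function name `balanced_l1_loss` and the `beta` parameter (the inflection point, 1.0 in the formulation above) are assumptions for illustration; the constant $C$ is chosen so that the two branches meet at $|x| = \beta$:

```python
import math

import torch

def balanced_l1_loss(pred, target, alpha=0.5, gamma=1.5, beta=1.0):
    """A sketch of balanced L1 loss from Libra R-CNN.

    pred, target: tensors of regression results t^u and targets v,
    e.g. shape (N, 4) for the (x, y, w, h) offsets.
    beta is the inflection point separating inliers from outliers.
    """
    # b follows from the constraint alpha * ln(b + 1) = gamma
    b = math.exp(gamma / alpha) - 1

    x = torch.abs(pred - target)

    # Inlier branch: (alpha / b) * (b|x| + 1) * ln(b|x| + 1) - alpha * |x|
    inlier = alpha / b * (b * x + 1) * torch.log(b * x / beta + 1) - alpha * x
    # Outlier branch: gamma * |x| + C, with C = gamma / b - alpha * beta
    # so that both branches have the same value at |x| = beta
    outlier = gamma * x + gamma / b - alpha * beta

    loss = torch.where(x < beta, inlier, outlier)
    # L_loc sums the per-coordinate losses over i in {x, y, w, h}
    return loss.sum(dim=-1)
```

With the defaults $\alpha = 0.5$ and $\gamma = 1.5$, the constraint gives $b = e^{3} - 1 \approx 19.09$; smaller $\alpha$ (hence larger $b$) boosts the inlier gradients more aggressively while the outlier branch stays linear with slope $\gamma$.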