I read Fang Wan's paper and your code. In your code, the loss is defined as:
loss = cls_det_loss / 20 + refine_loss_1*0.1 + refine_loss_2*0.1
I think cls_det_loss / 20 is the global entropy model, the two refine losses are the local entropy models, and 0.1 is the regularization weight. I also think refine_loss_1 and refine_loss_2 belong to different object localization branches, which corresponds to the "Accumulated Recurrent Learning" in the paper. Is this right?
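To make my reading of the weighting concrete, here is a minimal sketch in plain Python. The function name combine_losses and its parameters (cls_divisor, refine_weight) are my own placeholders, not names from your code, and the inputs are plain floats rather than tensors:

```python
def combine_losses(cls_det_loss, refine_losses, cls_divisor=20, refine_weight=0.1):
    """Scaled image-level loss plus equally weighted refinement losses.

    This mirrors the expression quoted above; the argument names are
    hypothetical and chosen only for illustration.
    """
    total = cls_det_loss / cls_divisor
    for refine_loss in refine_losses:
        # Each refinement branch contributes with the same weight.
        total += refine_weight * refine_loss
    return total


# Illustrative values only, not taken from a real training run.
loss = combine_losses(20.0, [1.0, 2.0])
print(loss)
```

If my understanding is correct, each refinement branch simply adds one more refine_weight * refine_loss term to the sum.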
By the way, I would also like to know whether bbox_pred is used in Train mode. I saw your explanation of the meaning of "bbox_pred = bbox_pred[:,:80]" in #9, but I am still a little confused: when I print bbox_pred during training, the values are all 0. So is bbox_pred only used in Test mode?
I look forward to your reply. Thank you.