Hello,
First of all, thank you for your great work.
I have a question about the implementation of the input projection before MaskHeadSmallConv in segmentation.py.
The implementation applies stride 2 to the features, which makes the finest available stride 8. For segmentation tasks, however, it is possible to get better results when stride-4 features are used for mask prediction. The original DETR segmentation head also works that way. What is the reason for using this stride in your implementation?
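To make the resolution difference concrete, here is a rough sketch of the arithmetic. The 800 x 1200 input size and the helper name `mask_resolution` are only illustrative assumptions on my side, not something taken from the repository:

```python
import math


def mask_resolution(height, width, stride):
    """Spatial size of a feature map at the given total stride (ceil division)."""
    return math.ceil(height / stride), math.ceil(width / stride)


# Hypothetical input size, chosen only for illustration.
h, w = 800, 1200

for stride in (4, 8):
    mh, mw = mask_resolution(h, w, stride)
    print(f"stride {stride}: {mh} x {mw} = {mh * mw} mask locations")
```

Under these assumptions, stride 4 gives four times as many mask locations as stride 8, which is why I would expect finer mask boundaries from it.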
Thanks in advance