Questions about single feature map training #6
For the first question: the default padding mode is
I am also curious about the author's response to the second question.
Hi, thanks for your interest! Regarding training with a single feature map: I think there is more than one way the network can find a shortcut solution. In general, different approaches to preventing the shortcut lead to qualitatively different (and, IMO, suboptimal) solutions. You can visualize this with the PCA feature visualization (one of the visualizations shown in
Using a modified architecture (less deep, wider, stride 1, no padding) can indeed mitigate some of these issues, as can training on higher-resolution images (e.g. 512x512) so that the receptive field does not cover the whole image. Another idea I considered is using the entropy of the transition distribution to weigh the loss (intuitively, a high-entropy transition distribution means there are many similar nodes in the graph, so the node is less likely to belong to a small object). But I haven't investigated these thoroughly. I would be interested to hear if you have any ideas! It's a bit mysterious, and resolving this issue would open up a lot of potential followups involving denser learning objectives.
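The entropy-weighting idea above could be sketched roughly as follows. This is only a minimal NumPy illustration of the intuition, not code from the repo; the function name, argument shapes, and the choice to normalize entropy by its maximum are all my own assumptions:

```python
import numpy as np

def entropy_weighted_loss(per_node_loss, transition_probs, eps=1e-12):
    """Weigh each node's loss by the (inverted) entropy of its
    transition distribution.

    per_node_loss:    shape (N,), the unweighted loss per graph node.
    transition_probs: shape (N, N), each row a probability distribution
                      over transitions to other nodes.

    Intuition: a high-entropy row means the node has many similar
    neighbors in the graph, so it is less likely to be a small,
    distinctive object; its loss is down-weighted.
    """
    # Shannon entropy of each row (eps guards log(0)).
    h = -np.sum(transition_probs * np.log(transition_probs + eps), axis=-1)
    # Normalize by the maximum possible entropy, then invert so that
    # high entropy -> low weight.
    h_max = np.log(transition_probs.shape[-1])
    weights = 1.0 - h / h_max
    return np.mean(weights * per_node_loss)
```

For example, a node with a one-hot transition row keeps full weight, while a node with a uniform row is weighted down to roughly zero.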
Thanks for your response!
Thanks for the great work and for sharing the code.
I have some questions about the single feature map training. I would appreciate it if you could share the answers.