
ASK FOR HELP #1

Open
chenyang126 opened this issue Mar 13, 2023 · 4 comments

Comments

@chenyang126

Dear Doctor:
Your work is excellent! I have some questions that I would like to ask for your help.
I added LCL to my UDA model. The way I did it was to take the L2 norm of the logits before passing them into the cross-entropy loss function. I changed eps from 1e-7 to 1e-3 because I am using AMP, but after adding LCL my loss curve keeps going up. Did I do anything wrong?

2-norm

        # L2-normalize the logits before the cross-entropy loss;
        # eps raised from 1e-7 to 1e-3 because of AMP (float16)
        norms = torch.norm(logits_src, p=2, dim=1, keepdim=True) + 1e-3
        normed_logit = torch.div(logits_src, norms)
        norms_hr = torch.norm(hr_logits_src, p=2, dim=1, keepdim=True) + 1e-3
        normed_logit_hr = torch.div(hr_logits_src, norms_hr)

cross-entropy loss

        # weighted sum of the full-resolution and high-resolution-crop losses
        loss_src = 0.9 * self.loss(normed_logit, gt_src) + \
            0.1 * self.loss(normed_logit_hr, cropped_gt_src)

Looking forward to your help! Thanks!
Best regards!

@KiwiXR
Collaborator

KiwiXR commented Mar 13, 2023

Hi chenyang, thanks for your interest in our work!
To be honest, I am not so familiar with AMP, but I will try my best to give some suggestions.
In LCL, we use epsilon (eps) to prevent a divide-by-zero error when dividing the logits by their norms. We therefore expect eps to be close to zero, or at least much smaller than the norms. In your case, 1e-3 is perhaps too large for this purpose and could ruin the training process. I would suggest starting by assigning eps the smallest floating-point number available for your dtype. Another option is to force full precision while performing the division with eps=1e-7, and then cast the result back to the intended precision.
Feel free to contact me if you have any further questions.
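As a rough, hedged sketch of these two options (reusing the `logits_src` variable from the snippet above; this is illustrative and not code from the VBLC repository):

        import torch

        # Option 1: use the smallest positive number representable in the
        # current (possibly float16) dtype as eps instead of a fixed 1e-3.
        eps = torch.finfo(logits_src.dtype).tiny
        norms = torch.norm(logits_src, p=2, dim=1, keepdim=True) + eps
        normed_logit = torch.div(logits_src, norms)

        # Option 2: leave autocast, do the normalization in float32 with the
        # original eps=1e-7, then cast back to the incoming dtype.
        with torch.cuda.amp.autocast(enabled=False):
            logits_fp32 = logits_src.float()
            norms = torch.norm(logits_fp32, p=2, dim=1, keepdim=True) + 1e-7
            normed_logit = torch.div(logits_fp32, norms).to(logits_src.dtype)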

@chenyang126
Author

Dear Doctor:
Thanks for your help! I tried using full precision locally, but the loss still keeps rising. I have two further questions:

  1. Does the pseudo-weight require additional processing after the logits are L2-normalized?
  2. Is the Thing-Class ImageNet Feature Distance (FD) loss from DAFormer not used in your work because it conflicts with the other loss functions?

Looking forward to your help! Thanks!
Best regards!

@KiwiXR
Collaborator

KiwiXR commented Mar 15, 2023

Hi chenyang,

  • For 1, as you can see from our implementation, the pseudo_weight is computed directly from the logits before normalization, so I think no extra processing is required (a rough sketch of this ordering is given below).

  • For 2, we haven't tried the FD loss in VBLC, so I can't say for certain what effect it has. Feedback is welcome if you would like to share the influence of this loss on our method!
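A minimal, hedged sketch of that ordering (DAFormer-style self-training; variable names such as `ema_logits` and `pseudo_threshold` are illustrative, not taken from the VBLC code):

        # Pseudo-label confidence comes from the raw (unnormalized) teacher logits.
        ema_softmax = torch.softmax(ema_logits.detach(), dim=1)
        pseudo_prob, pseudo_label = torch.max(ema_softmax, dim=1)
        # The fraction of pixels whose confidence exceeds the threshold acts as the weight.
        pseudo_weight = (pseudo_prob >= pseudo_threshold).float().mean().item()
        # Any LCL-style L2 normalization is applied only to the student logits afterwards.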

@chenyang126
Author

Thank you for your reply! I will experiment with adding/removing the FD loss later and give you feedback if there are useful results.
