
ASK FOR HELP #1

Open
chenyang126 opened this issue Mar 13, 2023 · 4 comments

Comments

@chenyang126

Dear Doctor:
Your work is excellent! I have some questions that I would like to ask for your help.
I added LCL to my UDA model. The way I did it was to take the L2 norm of the logits before passing them into the cross-entropy loss function. I changed eps from 1e-7 to 1e-3 because I am using AMP, but after adding LCL my loss curve keeps going up. Did I do anything wrong?

2-norm

        # L2-normalize the logits before the cross-entropy loss;
        # eps raised from 1e-7 to 1e-3 because of AMP (float16)
        norms = torch.norm(logits_src, p=2, dim=1, keepdim=True) + 1e-3
        normed_logit = torch.div(logits_src, norms)
        norms_hr = torch.norm(hr_logits_src, p=2, dim=1, keepdim=True) + 1e-3
        normed_logit_hr = torch.div(hr_logits_src, norms_hr)

cross-entropy loss

        # weighted sum of the full-resolution and high-resolution-crop losses
        loss_src = 0.9 * self.loss(normed_logit, gt_src) + \
            0.1 * self.loss(normed_logit_hr, cropped_gt_src)

Looking forward to your help! Thanks!
Best regards!

@KiwiXR
Collaborator

KiwiXR commented Mar 13, 2023

Hi chenyang, thanks for your interest in our work!
To be honest, I am not so familiar with AMP, but I will try my best to give some suggestions.
In LCL, we use epsilon (eps) to prevent a divide-by-zero error when dividing the logits by their norms. We therefore expect eps to be close to zero, or at least much smaller than the norms. In your case, 1e-3 is perhaps too large for this purpose and could ruin the training process. I would suggest starting by assigning eps the smallest floating-point number available for your dtype. Another option is to force full precision while performing the division with eps=1e-7, and then cast the result back to the intended precision.
Feel free to contact me if you have any further questions.
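As a rough, hedged sketch of these two options (reusing the `logits_src` variable from the snippet above; this is illustrative and not code from the VBLC repository):

        import torch

        # Option 1: use the smallest positive number representable in the
        # current (possibly float16) dtype as eps instead of a fixed 1e-3.
        eps = torch.finfo(logits_src.dtype).tiny
        norms = torch.norm(logits_src, p=2, dim=1, keepdim=True) + eps
        normed_logit = torch.div(logits_src, norms)

        # Option 2: leave autocast, do the normalization in float32 with the
        # original eps=1e-7, then cast back to the incoming dtype.
        with torch.cuda.amp.autocast(enabled=False):
            logits_fp32 = logits_src.float()
            norms = torch.norm(logits_fp32, p=2, dim=1, keepdim=True) + 1e-7
            normed_logit = torch.div(logits_fp32, norms).to(logits_src.dtype)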

@chenyang126
Author

Dear Doctor:
Thanks for your help! I tried using full precision locally, but the loss still keeps rising. I have two further questions:

  1. Does the pseudo-weight require additional processing after the logits are L2-normalized?
  2. Is the Thing-Class ImageNet Feature Distance (FD) loss from DAFormer not used in your work because it conflicts with the other loss functions?

Looking forward to your help! Thanks!
Best regards!

@KiwiXR
Collaborator

KiwiXR commented Mar 15, 2023

Hi chenyang,

  • For 1, as you can see from our implementation, the pseudo_weight is computed directly from the logits before normalization, so I think no extra processing is required (a rough sketch of this ordering is given below).

  • For 2, we haven't tried the FD loss in VBLC, so I can't say for certain what effect it has. Feedback is welcome if you would like to share the influence of this loss on our method!
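A minimal, hedged sketch of that ordering (DAFormer-style self-training; variable names such as `ema_logits` and `pseudo_threshold` are illustrative, not taken from the VBLC code):

        # Pseudo-label confidence comes from the raw (unnormalized) teacher logits.
        ema_softmax = torch.softmax(ema_logits.detach(), dim=1)
        pseudo_prob, pseudo_label = torch.max(ema_softmax, dim=1)
        # The fraction of pixels whose confidence exceeds the threshold acts as the weight.
        pseudo_weight = (pseudo_prob >= pseudo_threshold).float().mean().item()
        # Any LCL-style L2 normalization is applied only to the student logits afterwards.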

@chenyang126
Author

Thank you for your reply! I will experiment with adding/removing the FD loss later and give you feedback if there are useful results.
