Foreground Scale for Any V2 #255

math-artist · 2024-11-21T19:02:24Z

First, thank you for this super useful app. I only started using iw3 yesterday, and I have already converted over one hundred files (small ones, many of them test samples).

During my testing, I found it annoying that the Any_V2 model seemed truncated when the foreground scale is set to 1, so I plotted the curves to see what happens.

I use Any V2 in another project, where depth values close to 0 are the farthest and higher values are closer. So what foreground scale actually does is flatten the background for a very small gain in slope near 1, and that is exactly what I see when I use it. I think it is implemented backward.

I wrote this code, derived from a function you made. I think it has the correct transform for Any V2.

def inv_softplus01_edited(x, bias, scale):
    min_v = ((torch.zeros(1, dtype=x.dtype, device=x.device) - bias) * scale).expm1().clamp(min=1e-6).log()
    max_v = ((torch.ones(1, dtype=x.dtype, device=x.device) - bias) * scale).expm1().clamp(min=1e-6).log()
    v = ((1 - x - bias) * scale).expm1().clamp(min=1e-6).log()
    return 1 - (v - min_v) / (max_v - min_v)
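
For readers who want to inspect the shape of this curve without PyTorch, here is a minimal scalar sketch of the same formula using only the standard library. The name inv_softplus01_scalar is hypothetical, and the bias and scale values are borrowed from the "mul_1" preset posted later in this thread; none of this is iw3's own code.

```python
import math


def inv_softplus01_scalar(x, bias, scale):
    """Scalar re-expression of inv_softplus01_edited (hypothetical helper)."""

    def log_expm1(t):
        # Same clamp as the tensor version: expm1(t) is floored at 1e-6.
        return math.log(max(math.expm1(t), 1e-6))

    min_v = log_expm1((0.0 - bias) * scale)
    max_v = log_expm1((1.0 - bias) * scale)
    v = log_expm1((1.0 - x - bias) * scale)
    return 1.0 - (v - min_v) / (max_v - min_v)


# Sample the curve on [0, 1]; 0 is the farthest depth, 1 the closest.
xs = [i / 10 for i in range(11)]
ys = [inv_softplus01_scalar(x, bias=-0.15, scale=4) for x in xs]
for x, y in zip(xs, ys):
    print(f"{x:.1f} -> {y:.3f}")
```

With these values the curve keeps 0 and 1 fixed, stays monotonic, and is steeper near 1 than near 0, so the foreground gets more of the output range while the background is compressed rather than flattened.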


nagadomi · 2024-11-22T02:04:16Z

Thanks for the info.
I also thought the current conversion curve for Depth-Anything was not good, but since I don't use it myself, I left it alone for a long time.
As you say, the current expression is just a smooth version of x > 0.5 ? (x - 0.5) * 2 : 0.
I will take this opportunity to sort out that area.
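
As an illustration only, the piecewise expression quoted above can be written out in Python. The function name hard_foreground is hypothetical and this is the hard (non-smooth) limit, not the actual iw3 implementation. It shows why every depth value at or below 0.5 loses its variation.

```python
def hard_foreground(x):
    """Hard version of: x > 0.5 ? (x - 0.5) * 2 : 0 (hypothetical name)."""
    return (x - 0.5) * 2 if x > 0.5 else 0.0


# With 0 = farthest and 1 = closest, the whole far half collapses to 0.
samples = [0.0, 0.25, 0.5, 0.75, 1.0]
mapped = [hard_foreground(x) for x in samples]
for x, y in zip(samples, mapped):
    print(f"{x:.2f} -> {y:.2f}")
```

Only the near half of the depth range keeps any gradient, which matches the flattened background described in the first post.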

andy500 · 2024-12-19T12:33:05Z

Where do I put this code? Thank you.

math-artist · 2024-12-19T13:25:24Z

This one is tricky because it needs to be changed in several places.

The code above was just an example, but it is a replacement for softplus01 in mapper.py.

So, in mapper.py, comment out the old function and put this:

def softplus01(x, bias, scale):
    min_v = ((torch.zeros(1, dtype=x.dtype, device=x.device) - bias) * scale).expm1().clamp(min=1e-6).log()
    max_v = ((torch.ones(1, dtype=x.dtype, device=x.device) - bias) * scale).expm1().clamp(min=1e-6).log()
    v = ((1 - x - bias) * scale).expm1().clamp(min=1e-6).log()
    return 1 - (v - min_v) / (max_v - min_v)

In mapper.py again, inside the function resolve_mapper_function, you have to replace the bias and scale values. Look for the condition elif name in {"mul_1", "mul_2", "mul_3"}: and replace param:

    elif name in {"mul_1", "mul_2", "mul_3"}:
        param = {
            # none 1x
            "mul_1": {"bias": -0.15, "scale": 4},  # smooth 1.5x
            "mul_2": {"bias": -0.08, "scale": 5},  # smooth 2x
            "mul_3": {"bias": -0.04, "scale": 6},  # smooth 3x
        }[name]
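
To get a feel for how the three presets differ, the following sketch evaluates a scalar version of the replacement softplus01 at a few depth values. It uses only the standard library; softplus01_scalar and PRESETS are hypothetical names for illustration, with the bias and scale values copied from the snippet above.

```python
import math

# Values copied from the param dict above; the dict name is hypothetical.
PRESETS = {
    "mul_1": {"bias": -0.15, "scale": 4},
    "mul_2": {"bias": -0.08, "scale": 5},
    "mul_3": {"bias": -0.04, "scale": 6},
}


def softplus01_scalar(x, bias, scale):
    """Scalar sketch of the replacement softplus01 (no torch needed)."""

    def log_expm1(t):
        # Mirrors .expm1().clamp(min=1e-6).log() from the tensor code.
        return math.log(max(math.expm1(t), 1e-6))

    min_v = log_expm1((0.0 - bias) * scale)
    max_v = log_expm1((1.0 - bias) * scale)
    v = log_expm1((1.0 - x - bias) * scale)
    return 1.0 - (v - min_v) / (max_v - min_v)


# Output value at mid depth for each preset, plus a small table.
midpoints = {name: softplus01_scalar(0.5, **p) for name, p in PRESETS.items()}
for name, p in PRESETS.items():
    row = [softplus01_scalar(i / 4, **p) for i in range(5)]
    print(name, " ".join(f"{y:.3f}" for y in row))
```

A lower output at x = 0.5 means more of the output range is spent on the near half of the depth values, so the midpoints give a quick way to compare how strongly each preset favours the foreground.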

This is what I currently use after running some tests, but it could be improved. It is better than the former function, which flattened the background too much.

However, most of the time I run the app with foreground scale = 0, and sometimes even -1, and crank up the 3D strength instead. I did not see the flatter foreground with Depth Anything V2, but I know that the metric model and the original model behave quite differently. I don't use the metric model.

Regarding my method of using negative foreground scale and higher 3D strength:

The problems with high divergence are always caused by large pixel displacement, which always happens more on closer objects. This is how 3D vision works. Distant objects do not cause artefacts. Lowering the foreground scale to a negative value until the closest objects have the proper depth compensates for the high 3D strength: popping objects keep the proper depth, while backgrounds become deeper and farther away.

andy500 · 2024-12-19T22:08:44Z

Thank you, I will try this.

francdn · 2024-12-20T16:18:19Z

@math-artist What value do you use for 3D strength?

andy500 · 2024-12-20T16:51:53Z

1.5, and I use row_flow_v3_sym. It works very well.

francdn · 2024-12-20T17:05:01Z

I was asking @math-artist since he said "I crank up the 3D strength instead".

Reno-CZ · 2025-01-22T17:31:52Z

row_flow_v3_sym: I don't have this. Why?

nagadomi added the iw3 label Nov 22, 2024
