Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restoration of Symbolic Properties #1946

Open
philip-paul-mueller opened this issue Feb 21, 2025 · 4 comments
Open

Restoration of Symbolic Properties #1946

philip-paul-mueller opened this issue Feb 21, 2025 · 4 comments

Comments

@philip-paul-mueller
Copy link
Collaborator

philip-paul-mueller commented Feb 21, 2025

Restoration of symbolic, i.e. restoration from file/json, property essentially works by relying on pystr_to_symbolic() with simplify set to False.
See dace/properties.py:1150:

@staticmethod
def from_string(s):
    return pystr_to_symbolic(s, simplify=False)

@staticmethod
def to_string(obj):
    # Go through sympy once to reorder factors
    return str(pystr_to_symbolic(str(obj), simplify=False))

The idea of setting simplify to False is that the content of the symbol does not change.
However it seems that this is not the case, at least for certain expressions, which sometimes change.
This has sever implication, for example storing an SDFG to disc and restoring it right away will alter the SDFG's hash.

The issue is that expressions of the form -(a + b) * c are transformed into (b - a) * c, i.e. the minus is moved into the first parentheses, as far as I can tell during restoration, by looking at the content of serialized_a.
In my specific case it happens to the expression -(10*__coarse_i_K_gtx_vertical + vertical_start - Min(vertical_end, 10*__coarse_i_K_gtx_vertical + vertical_start + 10))*(4*Max(283876, horizontal_end) - 4*Min(0, horizontal_start) + 1).
This is an expression I found inside a Memlet, to be precise inside the volume expression.
In the reproducer below, it is important that simplify() is called on the final expression, without it would not have worked.
Note that the Memlet propagation uses simplify() during the calculation of the volume.

I wrote this reproducer:

import dace
@dace.properties.make_properties
class Test:
    sym = dace.properties.SymbolicProperty(default=0)

    def __init__(self, sym):
        self.sym = sym

p1 = dace.symbolic.pystr_to_symbolic("10*__coarse_i_K_gtx_vertical + vertical_start")
p2 = dace.symbolic.pystr_to_symbolic("Min(vertical_end, 10*__coarse_i_K_gtx_vertical + vertical_start + 10)")
p3 = dace.symbolic.pystr_to_symbolic("(4*Max(283876, horizontal_end) - 4*Min(0, horizontal_start) + 1)")
# Without `simplify()` it will not work!
pp = dace.symbolic.simplify((p2 - p1 ) * p3)

original_a = Test(pp)
serialized_a = dace.serialize.all_properties_to_json(original_a)
restored_a = Test(0)
dace.serialize.set_properties_from_json(restored_a, serialized_a)

assert str(original_a.sym) == str(restored_a.sym), f"`sym` has chnaged from '{str(original_a.sym)}' to '{str(restored_a.sym)}'"
@tim0s
Copy link
Contributor

tim0s commented Feb 21, 2025

I don't understand the text describing the issue, specifically

-(10*__coarse_i_K_gtx_vertical + vertical_start - Min(vertical_end, 10*__coarse_i_K_gtx_vertical + vertical_start + 10))(4Max(283876, horizontal_end) - 4Min(0, horizontal_start) + 1) is transformed into -(10__coarse_i_K_gtx_vertical + vertical_start - Min(vertical_end, 10*__coarse_i_K_gtx_vertical + vertical_start + 10))(4Max(283876, horizontal_end) - 4*Min(0, horizontal_start) + 1)

I had to copy that into a text file and compare it character by character, but I still do not see the difference.

The code does not run for me because a is used but never defined.

@tim0s
Copy link
Contributor

tim0s commented Feb 21, 2025

import dace

@dace.properties.make_properties
class Test:
    sym = dace.properties.SymbolicProperty(default=0)

    def __init__(self, sym):
        self.sym = sym

p1 = dace.symbolic.pystr_to_symbolic("A")
p2 = dace.symbolic.pystr_to_symbolic("M(A)")
p3 = dace.symbolic.pystr_to_symbolic("(M(A-1) - M(A-2)+1)")
pp = dace.symbolic.simplify((p2 - p1 ) * p3)

original_a = Test(pp)
serialized_a = dace.serialize.all_properties_to_json(original_a)
restored_a = Test(0)
dace.serialize.set_properties_from_json(restored_a, serialized_a)

assert str(original_a.sym) == str(restored_a.sym), f"`sym` has chnaged from '\n{str(original_a.sym)}' to '\n{str(restored_a.sym)}'"

gives

AssertionError: `sym` has chnaged from '
-(A - M(A))*(-M(A - 2) + M(A - 1) + 1)' to '
(-A + M(A))*(-M(A - 2) + M(A - 1) + 1)'

So yes, sympy either is not deterministic or we call it differently :)

@philip-paul-mueller
Copy link
Collaborator Author

I think there is at least another issue, this time regarding to connectors, especially out connectors of Maps.
Connectors are dicts that maps a name to a type, however, it is allowed that multiple Memlets starts at the same out connector of a Map.
For the sake of argument consider the SDFG that is generated by the following snipped:

def foo(A: dace.float64[10], B: dace.float64[10], C: dace.float64[10]):
    for i in dace.map[0:10]:
        B[i] = A[i] + 1.   # The Memlet connecting the MapEntry node with the Tasklet is `A[i]`
        C[i] = A[i] - 1.    # The Memlet connecting the MapEntry node with the Tasklet is `A[i]`

Simplify will ensure that both Memlets originate at the same outgoing connector at the MapEntry node.
However, this is not a problem because both Memlet transport a single float.

Now consider this code snipped:

def foo(A: dace.float64[10], idx: dace.int32[10], B: dace.float64[10], C: dace.float64[10]):
    for i in dace.map[0:10]:
        B[i] = A[i] + 10.   # This Tasklet has a single connected with the MapEntry node with the Memlet `A[i]`.
        C[i] = A[idx[i]]     # This Tasklet has two connections with the MapEntry node, the first Memlet is `Idx[i]`
                                     #  the second Memlet is `A[:]`.

Again simplify will make sure that both Memlets, that are associated to the data A, are connected to the same output connector at the MapEntry node.
As before the Memlet that goes to the first Tasklet carries a single float, but now, because of the indirect access in the second Tasklet, the Memlet that goes to the second Tasklet and is associated with A is now a pointer.

As I understand the code in dace/sdfg/infer_types.py the behaviour is, that it processes the edges in random order and if the connection type is already determined it will skip it.
This means:

  • The types of the input connectors of both Tasklets will be infered correctly.
  • The type of the output connector of the MapEntry node depends if first the edge to the first Tasklet or the second Tasklet is processed.

I am not sure, but I am not fully sure if this is an issue in the specific case.
But it makes the SDFG a bit unstable, because running type inferance is in itself not a stable process.

@philip-paul-mueller
Copy link
Collaborator Author

@tim0s
Thanks for spotting, it was a copy past error, yesterday was not my day, I also fixed the reproducer.
Sorry for the trouble I caused.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants