You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi! I'm currently working on a games-adjacent tool, for editing NDS ROM files. Working with these binary files, I often have to search a file's entire bytes for a small (or large) byte subsequence. I use this the most in the compression algorithms, which are fairly niche compression variants, not included in Godot's built-in compression types.
Describe the problem or limitation you are having in your project
While GDScript String's String::Find will search for a substring, GDScript Array's (and Array-Like classes) Array::Find will only search for a singular value, not a subarray.
So while GDScript can do: "ABCDEF".find("DE")
It cannot do [1, 2, 3, 4, 5, 6].find([1,2])
Making it more complex, and less performant to try and find a sub-array in a large array.
Describe the feature / enhancement and how it helps to overcome the problem or limitation
I suggest adding an Array::Find_Seq function to GDScript Arrays (and Array-Like classes) which will search for a Sub-Array. (As well as the Reverse variant, Array::RFind_Seq).
Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
I am prepared to implement this proposal, and have already implemented some of it in the arrayFindSeq Branch of my Godot Fork. I compiled Godot with my adjusted functions, and ran tests to confirm the intended behaviors, and confirm my intuitions about performance. Most of the relevant code updates are in core/variant/array.cpp, with necessary bindings in core/variant/array.h & core/variant/variant_call.cpp
[Edited for Brevity]
Across the 3 basic tests I performed, the new functions behaved as expected, dramatically out-performing a GDScript function (30x - 45x) to find the sequence, and performing on par (1x) or significantly better (4x) than using String::Find as a workaround.
This is not a complete implementation, a complete implementation would include:
Updates to the documentation
Implementations of Find_Seq in either Vector.h and/or CowData.cpp, as PackedByteArrays and other ArrayLike classes inherit from those rather than Array.cpp
Discussion of exact intended behaviors with Typed Arrays & Reverse Finds
Perhaps an implementation of Find_Custom_Sequence
Discussion of how to handle the exponential cost of searching an Array of Arrays for a subsequence of Arrays.
If this enhancement will not be used often, can it be worked around with a few lines of script?
[Edited for Brevity]
The workaround options are:
Write a GDScript function to perform the search
Perform an Array->String Conversion and use String::Find to find the subsequence
I think neither is satisfactory. Writing a function in GDScript produces dramatically worse performance than a native CPP search. Using a String Conversion can make the code rather confusing, if not extremely difficult to even produce a correct result. And in the intended use-cases, the String::Find is often less performant than a C++ Array:Find_Seq, due to having perform significantly more comparisons per array element.
Is there a reason why this should be core and not an add-on in the asset library?
I think it's important that GDScript have robust & performant Array capabilities. These functions will make complex Array interactions more approachable, and make it friendlier to users. I myself was fairly surprised that Strings (a nice wrapper around a form of Array) have a subsequence Find while actual Arrays do not.
The text was updated successfully, but these errors were encountered:
I am actually planning implement the backend of this function anyway! Stemming from a similar background :)
In short, String's find implementation could be abstracted to run on Span instead without loss of efficiency. Packed* arrays and Arrays could then use the same implementation without needing new code.
For me, this is part of a longer term refactor in trying to reduce duplicate code.
I am currently waiting for the Span PR to get merged (hoping for 4.5 dev 1). When that's done, find can be moved out to something like spans::find_subspan, and exposing that to GDScript users could be achieved with just a binding to that from Vector, without needing any new code. :)
Describe the project you are working on
Hi! I'm currently working on a games-adjacent tool, for editing NDS ROM files. Working with these binary files, I often have to search a file's entire bytes for a small (or large) byte subsequence. I use this the most in the compression algorithms, which are fairly niche compression variants, not included in Godot's built-in compression types.
Describe the problem or limitation you are having in your project
While GDScript String's String::Find will search for a substring, GDScript Array's (and Array-Like classes) Array::Find will only search for a singular value, not a subarray.
So while GDScript can do:
"ABCDEF".find("DE")
It cannot do
[1, 2, 3, 4, 5, 6].find([1,2])
Making it more complex, and less performant to try and find a sub-array in a large array.
Describe the feature / enhancement and how it helps to overcome the problem or limitation
I suggest adding an Array::Find_Seq function to GDScript Arrays (and Array-Like classes) which will search for a Sub-Array. (As well as the Reverse variant, Array::RFind_Seq).
Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
I am prepared to implement this proposal, and have already implemented some of it in the arrayFindSeq Branch of my Godot Fork. I compiled Godot with my adjusted functions, and ran tests to confirm the intended behaviors, and confirm my intuitions about performance. Most of the relevant code updates are in core/variant/array.cpp, with necessary bindings in core/variant/array.h & core/variant/variant_call.cpp
[Edited for Brevity]
Across the 3 basic tests I performed, the new functions behaved as expected, dramatically out-performing a GDScript function (30x - 45x) to find the sequence, and performing on par (1x) or significantly better (4x) than using String::Find as a workaround.
This is not a complete implementation, a complete implementation would include:
If this enhancement will not be used often, can it be worked around with a few lines of script?
[Edited for Brevity]
The workaround options are:
I think neither is satisfactory. Writing a function in GDScript produces dramatically worse performance than a native CPP search. Using a String Conversion can make the code rather confusing, if not extremely difficult to even produce a correct result. And in the intended use-cases, the String::Find is often less performant than a C++ Array:Find_Seq, due to having perform significantly more comparisons per array element.
Is there a reason why this should be core and not an add-on in the asset library?
I think it's important that GDScript have robust & performant Array capabilities. These functions will make complex Array interactions more approachable, and make it friendlier to users. I myself was fairly surprised that Strings (a nice wrapper around a form of Array) have a subsequence Find while actual Arrays do not.
The text was updated successfully, but these errors were encountered: