Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition during Python library loading #350

Merged

Conversation

andrewwhitehead
Copy link
Contributor

An error could be triggered when the library is loaded from multiple threads simultaneously (for example if a thread pool is used). This update adds a lock around the library initialization.

The cached_property dependency is also removed.

@swcurran
Copy link
Contributor

Can you provide some details on where this was found and so who might be interested in upgrading sooner than later?

@andrewwhitehead
Copy link
Contributor Author

This was found in did-webvh-py when verifying proofs on multiple threads. I haven't seen it previously.

@andrewwhitehead andrewwhitehead merged commit 5619bd0 into openwallet-foundation:main Jan 23, 2025
28 checks passed
@andrewwhitehead andrewwhitehead deleted the fix/init-race branch January 23, 2025 20:07
@ff137
Copy link
Contributor

ff137 commented Jan 27, 2025

This is a massive fix.

We had an issue with connecting many concurrent users to an issuer (using didexchange, via the issuer's public did).

If there were too many simultaneous requests, the automated flows just never completed, without any errors or exceptions being raised. I was investigating last week, and it seemed to potentially be related to many steps in acapy opening and closing an askar session, in short succession. But I really couldn't tell what was wrong.

We just tested with using askar 0.4.2 in our acapy fork, and the issue is resolved 🎉 thanks so much for finding/fixing it!

@andrewwhitehead
Copy link
Contributor Author

@ff137 The main error I was addressing would only occur the first time the library was loaded in a Python process, if the loading was kicked off on a second thread before the first could complete. I wouldn't expect it to occur in a running system except just as it was starting up. It's also possible that the cached_property change fixed something in the multi-threading though, as it's now using the built-in implementation which may have more appropriate locking behaviour.

@ff137
Copy link
Contributor

ff137 commented Jan 27, 2025

Ahh ... It very well might be the cached_property dependency that isn't used anymore.

Just found this in their readme: https://github.com/pydanny/cached-property?tab=readme-ov-file#working-with-threads
They do have threaded_cached_property instead, but the built-in functools.cached_property definitely has better support. Very confusing with the identical naming. In any case, this is definitely a big scalability fix -- Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants