-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated code formatting and lint (latest ruff
, mypy
)
#91
Updated code formatting and lint (latest ruff
, mypy
)
#91
Conversation
@alexander-beedie thanks for the pull request. Looks good. Out of curiosity, what type of things are you planning to do with surrealDB.py and polars? A lot of friends have switched from pandas to polars and are amazed at the speed increases they are getting. |
Nice to hear! Initially I'm just looking to allow use of a Experimenting on a local branch, I have the following test working already, which builds on the about-to-merge Polars support1 for async drivers that I committed yesterday: from surrealdb import Surreal
import polars as pl
import asyncio
async def get_frame(query: str) -> pl.DataFrame:
async with Surreal("ws://localhost:8000/rpc") as client:
await client.use("test", "test")
return pl.read_database(query=query, connection=client)
asyncio.run(get_frame("SELECT * FROM person LIMIT 1"))
# shape: (1, 5)
# ┌─────────────────────────────┬───────────┬────────┬─────────────────────────┬──────────────┐
# │ id ┆ marketing ┆ pass ┆ tags ┆ user │
# │ --- ┆ --- ┆ --- ┆ --- ┆ --- │
# │ str ┆ bool ┆ str ┆ list[str] ┆ str │
# ╞═════════════════════════════╪═══════════╪════════╪═════════════════════════╪══════════════╡
# │ person:fvmbtxcv79dveedch4vk ┆ false ┆ 123456 ┆ ["polars", "surrealdb"] ┆ Stroop Wafel │
# └─────────────────────────────┴───────────┴────────┴─────────────────────────┴──────────────┘ I added KuzuDB support2 to Polars (along with polars DataFrame export on their side) recently, and am interested generally in extending what we can accept beyond the standard relational DBs, in order to unlock some more interesting use-cases. Footnotes |
Hey! Looping @CarolineMorton who is working on a project with SurrealDB on creating realistic synthetic data for healthcare research. She has used polars a fair bit in the past, and I know she wants the output of the synthetic data to be a polars df. |
This is great. Thanks for the @ @maxwellflitton. Thank you for your efforts on integrating SurrealDB with Polars. My project involves importing significant reference data (SNOMED/ICD10) into SurrealDB to leverage its query functionality for navigating through a graph of related information. This includes asynchronously sampling from these nodes and compiling the results into a structure similar to a table, with a preference for exporting these as Polars DataFrames. The use of Arrow in Polars for data handling is a key factor for this preference. The work to allow a Surreal client to act as a connection in the Polars read_database function sounds really promising. The prospect of SurrealDB incorporating Arrow export capabilities could significantly benefit the "big data" community. Would you be open to discussing potential collaboration on this front? Cheers |
Thanks for the PR @alexander-beedie! 🎉 Adding Polars and Arrow support would be phenomenal 🚀 |
@CarolineMorton, FYI the initial commit just landed: pola-rs/polars#15269. I'd expect us to make a release that contains the new code sometime in the coming week; would be very interested in your feedback and/or feature requests (or bug reports, but lets hope there aren't any ;) as Polars support for async database drivers is also new (landing in 0.20.17).
Sure, though any Surreal → Arrow engineering work would realistically have to come from the Surreal folks - I'm more than happy to help test/integrate with Polars should such support arrive though! I think it would be a big win on many fronts 😎 |
@maxwellflitton: Anything else I need to do to help get this one merged? I already have another small PR pending that will deal with a minor Exception-related issue that I spotted while testing the new Polars |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed with @maxwellflitton that its ready to merge, so I'll do that 🙌
FYI: @CarolineMorton (and @AlexFrid / @maxwellflitton ;) - pola-rs/polars#15269 just merged. We'll have our initial SurrealDB support, validated locally against both both If the Surreal Python SDK/driver is eventually able to return Arrow data then we'll be able to take a significant step forward performance-wise (true zero-copy to Polars), but this should still be a good enabler for plenty of use-cases👌 Once it's released (likely in the next few days), let me know if you try it out and see any rough spots; async driver support as a whole is new to this release, and it will be good to get some "real world" testing (and feel free to tag me directly in any feedback). |
Fantastic @alexander-beedie - Will check out the polars changes. Excited about moving this all forward. @AlexFrid Shall we have a chat about adding arrow support some time |
What is the motivation?
Was taking a look at the SurrealDB Python API, with a view to possibly adding some degree of Polars1 support for the async client/driver (I am one of the Polars devs), and realised that the lint/formatting could be made faster & brought up to date. Seemed like a good way to both look through the code and make an initial contribution 😎👍
Type of Change
What does this change do?
ruff
from0.0.245
to0.3.3
and expands lint scope.mypy
version (and pins linter/formatter versions).black
in favour ofruff format
2 (nearly identical but much faster).pre-commit
hooks to use the newer versions (from the official "astral" repo).Formatting differences
The formatting is (as you'd hope!) essentially the same, with most differences being marginal (such as an initial line break after the leading
"""
inside multiline docstrings). Some additional lines longer than the 88-char limit were found/fixed. For consistency, applied the formatter to the "tests" and "surrealdb" dirs, and enabled code formatting in docstrings.Additional linting
Despite the substantial increase in the (potential) lint rules now covered, only a small handful of minor tidy-ups and Exception improvements were actually flagged.
Rules enabled before:
Rules enabled after:
Exception improvements:
Where a
SurrealDbError
is raised from within another Exception a "from None" designation3 is now added. The result is that the same error, with the same message, is raised, but without the unnecessary/misleading "During handling of the above exception, another exception occurred" context. This noticeably improves signal-to-noise when dealing with Exception traces.Also:
A few
super
calls were also automatically simplified:What is your testing strategy?
No functional changes were made, so no new tests were added.
Confirmed that the existing unit tests all run successfully.
Is this related to any issues?
ruff
for code formatting #43.Have you read the Contributing Guidelines?
Footnotes
Polars: https://github.com/pola-rs/polars ↩
Ruff formatter: https://docs.astral.sh/ruff/formatter/ ↩
PEP409: https://peps.python.org/pep-0409/#proposal ↩