[feature] document #1
Comments
+1 ;) |
The API looks much like the original ctypes-based one. |
Ok will add it soon. |
Just one question: why not maintain the ctypes version? The original, https://github.com/smartfile/python-librsync, is no longer maintained, and the ctypes approach doesn't require any dependencies. I think there is no big difference in performance between cffi and ctypes for this library. |
I'm not the maintainer of the original repo, so I don't know why they stopped maintaining it either. I wrote this wrapper with Cython, which should be much faster than the ctypes one. The cffi backend is just a port of the Cython code so that it can run on PyPy. Both the Cython and cffi backends expose the same functions and signatures. |
And here is a simple example |
Simple usage has been added. |
It seems to me that they stopped maintaining the ctypes version because they no longer use the module in their product. The ctypes option is interesting because it adds no dependency to the code. Moreover, I think offering all three solutions would really make this lib a good reference.
Regarding performance, I would be curious to compare, because with ctypes the call into the library is rather direct: there is very little code between Python and the C functions.
Regarding installation, it would be interesting to use pip's extras_require, which would allow installing this way:
I don't have time to test right now, but the compatible versions of librsync should be documented.
Regarding the ctypes code, I can provide the version that I modified to work with librsync 2.x. I still have to revalidate it with the latest version (2.3.x).
import ctypes
import ctypes.util
import tempfile
import functools
import sys

# NOTE: `is_windows` was not defined in the pasted snippet; this definition is an assumption.
is_windows = sys.platform.startswith('win')
paths = ['lib/librsync', '../lib/librsync', 'librsync', 'librsync1', 'rsync']
if is_windows:
    # find_library is unreliable on Windows, so try each candidate name directly.
    _librsync = None
    for p in paths:
        try:
            _librsync = ctypes.cdll.LoadLibrary(p)
        except EnvironmentError:
            continue
        else:
            break
    if not _librsync:
        raise ImportError('Could not find librsync, make sure it is installed')
else:
    path = next((ctypes.util.find_library(p) for p in paths if ctypes.util.find_library(p)), None)
    if path:
        try:
            _librsync = ctypes.cdll.LoadLibrary(path)
        except OSError:
            # LoadLibrary raises OSError, not ImportError, when loading fails.
            raise ImportError('Could not load librsync at "%s"' % path)
    else:
        raise ImportError('Could not find librsync, make sure it is installed')
VERSION = bytes(ctypes.cast(_librsync.rs_librsync_version, ctypes.c_char_p).value).decode().split()[1]
MAX_SPOOL = 1024 ** 2 * 5
TRACE_LEVELS = (0, 1, 2, 3, 4, 5, 6, 7)
RS_DONE = 0
RS_BLOCKED = 1
# Size of the input and output buffers passed to each job iteration, in bytes.
RS_JOB_BLOCKSIZE = 65536
# Default length of strong signatures, in bytes. The MD4 checksum is truncated to this size.
RS_DEFAULT_STRONG_LEN = 8
# Default block length, if not determined by any other factors.
RS_DEFAULT_BLOCK_LEN = 2048
RS_DELTA_MAGIC = 0x72730236 # r s \2 6
RS_MD4_SIG_MAGIC = 0x72730136 # r s \1 6
RS_BLAKE2_SIG_MAGIC = 0x72730137 # r s \1 7
# PREFERRED_MAGIC_HASH = RS_MD4_SIG_MAGIC if parse_version(VERSION) < parse_version('1.0.0') else RS_BLAKE2_SIG_MAGIC
#############################
# DEFINES FROM librsync.h #
#############################
# librsync.h: rs_buffers_s
class Buffer(ctypes.Structure):
    _fields_ = [
        ('next_in', ctypes.c_char_p),
        ('avail_in', ctypes.c_size_t),
        ('eof_in', ctypes.c_int),
        ('next_out', ctypes.c_char_p),
        ('avail_out', ctypes.c_size_t),
    ]
# char const *rs_strerror(rs_result r);
_librsync.rs_strerror.restype = ctypes.c_char_p
_librsync.rs_strerror.argtypes = (ctypes.c_int,)
# rs_job_t *rs_sig_begin(size_t block_len, size_t strong_len, rs_magic_number sig_magic);
_librsync.rs_sig_begin.restype = ctypes.c_void_p
_librsync.rs_sig_begin.argtypes = (ctypes.c_size_t, ctypes.c_size_t, ctypes.c_int,)
# rs_job_t *rs_loadsig_begin(rs_signature_t **);
_librsync.rs_loadsig_begin.restype = ctypes.c_void_p
_librsync.rs_loadsig_begin.argtypes = (ctypes.c_void_p,)
# rs_job_t *rs_delta_begin(rs_signature_t *);
_librsync.rs_delta_begin.restype = ctypes.c_void_p
_librsync.rs_delta_begin.argtypes = (ctypes.c_void_p,)
# rs_job_t *rs_patch_begin(rs_copy_cb *, void *copy_arg);
_librsync.rs_patch_begin.restype = ctypes.c_void_p
_librsync.rs_patch_begin.argtypes = (ctypes.c_void_p, ctypes.c_void_p,)
# rs_result rs_build_hash_table(rs_signature_t* sums);
_librsync.rs_build_hash_table.restype = ctypes.c_int
_librsync.rs_build_hash_table.argtypes = (ctypes.c_void_p,)
# rs_result rs_job_iter(rs_job_t *, rs_buffers_t *);
_librsync.rs_job_iter.restype = ctypes.c_int
_librsync.rs_job_iter.argtypes = (ctypes.c_void_p, ctypes.c_void_p,)
# void rs_trace_set_level(rs_loglevel level);
_librsync.rs_trace_set_level.restype = None
_librsync.rs_trace_set_level.argtypes = (ctypes.c_int,)
# void rs_free_sumset(rs_signature_t *);
_librsync.rs_free_sumset.restype = None
_librsync.rs_free_sumset.argtypes = (ctypes.c_void_p,)
# rs_result rs_job_free(rs_job_t *);
_librsync.rs_job_free.restype = ctypes.c_int
_librsync.rs_job_free.argtypes = (ctypes.c_void_p,)
# A function declaration for our read callback.
patch_callback = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_void_p, ctypes.c_longlong,
                                  ctypes.c_size_t, ctypes.POINTER(Buffer))
class LibrsyncError(Exception):
    def __init__(self, r):
        super(LibrsyncError, self).__init__(_librsync.rs_strerror(ctypes.c_int(r)))

def seekable(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        s = args[0]
        assert callable(getattr(s, 'seek', None)), 'Must provide seekable file-like object'
        return f(*args, **kwargs)
    return wrapper
def _execute(job, f, o=None):
    """
    Executes a librsync "job" by reading bytes from `f` and writing results to
    `o` if provided. If `o` is omitted, the output is ignored.
    """
    # Re-use the same buffer for output, we will read from it after each
    # iteration.
    out = ctypes.create_string_buffer(RS_JOB_BLOCKSIZE)
    while 1:
        block = f.read(RS_JOB_BLOCKSIZE)
        buff = Buffer()
        # provide the data block via input buffer.
        buff.next_in = ctypes.c_char_p(block)
        buff.avail_in = ctypes.c_size_t(len(block))
        buff.eof_in = ctypes.c_int(not block)
        # Set up our buffer for output.
        buff.next_out = ctypes.cast(out, ctypes.c_char_p)
        buff.avail_out = ctypes.c_size_t(RS_JOB_BLOCKSIZE)
        r = _librsync.rs_job_iter(job, ctypes.byref(buff))
        if o:
            o.write(out.raw[:RS_JOB_BLOCKSIZE - buff.avail_out])
        if r == RS_DONE:
            break
        elif r != RS_BLOCKED:
            raise LibrsyncError(r)
        if buff.avail_in > 0:
            # There is data left in the input buffer, librsync did not consume
            # all of it. Rewind the file a bit so we include that data in our
            # next read. It would be better to simply tack data to the end of
            # this buffer, but that is very difficult in Python.
            f.seek(f.tell() - buff.avail_in)
    if o and callable(getattr(o, 'seek', None)):
        # As a matter of convenience, rewind the output file.
        o.seek(0)
    return o
def debug(level=7):
    assert level in TRACE_LEVELS, "Invalid log level %i" % level
    _librsync.rs_trace_set_level(level)

@seekable
def signature(f,
              s=None,
              block_size=RS_DEFAULT_BLOCK_LEN,
              block_checksum=RS_DEFAULT_STRONG_LEN,
              magic=RS_MD4_SIG_MAGIC):
    """
    Generate a signature for the file `f`. The signature will be written to `s`.
    If `s` is omitted, a temporary file will be used. This function returns the
    signature file `s`. You can specify the size of the blocks using the
    optional `block_size` parameter.
    """
    if s is None:
        s = tempfile.SpooledTemporaryFile(max_size=MAX_SPOOL, mode='wb+')
    job = _librsync.rs_sig_begin(block_size, block_checksum, magic)
    try:
        _execute(job, f, s)
    finally:
        _librsync.rs_job_free(job)
    return s
@seekable
def delta(f, s, d=None):
    """
    Create a delta for the file `f` using the signature read from `s`. The delta
    will be written to `d`. If `d` is omitted, a temporary file will be used.
    This function returns the delta file `d`. All parameters must be file-like
    objects.
    """
    if d is None:
        d = tempfile.SpooledTemporaryFile(max_size=MAX_SPOOL, mode='wb+')
    sig = ctypes.c_void_p()
    try:
        job = _librsync.rs_loadsig_begin(ctypes.byref(sig))
        try:
            _execute(job, s)
        finally:
            _librsync.rs_job_free(job)
        r = _librsync.rs_build_hash_table(sig)
        if r != RS_DONE:
            raise LibrsyncError(r)
        job = _librsync.rs_delta_begin(sig)
        try:
            _execute(job, f, d)
        finally:
            _librsync.rs_job_free(job)
    finally:
        _librsync.rs_free_sumset(sig)
    return d
@seekable
def patch(f, d, o=None):
    """
    Patch the file `f` using the delta `d`. The patched file will be written to
    `o`. If `o` is omitted, a temporary file will be used. This function returns
    the patched file `o`. All parameters should be file-like objects. `f` is
    required to be seekable.
    """
    if o is None:
        o = tempfile.SpooledTemporaryFile(max_size=MAX_SPOOL, mode='wb+')

    @patch_callback
    def read_cb(opaque, pos, length, buff):
        # librsync asks for `length` bytes of the basis file starting at `pos`.
        f.seek(pos)
        size_p = ctypes.cast(length, ctypes.POINTER(ctypes.c_size_t)).contents
        size = size_p.value
        block = f.read(size)
        # Report how many bytes were actually read and hand back the data.
        size_p.value = len(block)
        buff_p = ctypes.cast(buff, ctypes.POINTER(ctypes.c_char_p)).contents
        buff_p.value = block
        return RS_DONE

    job = _librsync.rs_patch_begin(read_cb, None)
    try:
        _execute(job, d, o)
    finally:
        _librsync.rs_job_free(job)
    return o |
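The rewind step in `_execute` (seeking back over input that librsync did not consume) can be illustrated without librsync at all. The sketch below is only an illustration: `drain` and `at_most_five` are hypothetical names, and `consume` is a stand-in for `rs_job_iter` that reports how many bytes of each block it used.

```python
import io


def drain(f, consume, block_size=8):
    """Feed `f` to `consume` block by block, rewinding over unused bytes.

    `consume(block)` is a hypothetical stand-in for rs_job_iter: it returns
    the number of bytes of `block` it actually used (at least one).
    """
    taken = []
    while True:
        block = f.read(block_size)
        if not block:
            break
        used = consume(block)
        taken.append(block[:used])
        leftover = len(block) - used
        if leftover > 0:
            # Same idea as f.seek(f.tell() - buff.avail_in) in _execute:
            # step back so the next read starts at the first unused byte.
            f.seek(f.tell() - leftover)
    return b"".join(taken)


def at_most_five(block):
    """Pretend consumer that never takes more than five bytes per call."""
    return min(5, len(block))


data = b"abcdefghijklmnopqrstuvwxyz"
result = drain(io.BytesIO(data), at_most_five)
```

Because every leftover byte is re-read on the next iteration, no input is lost even though the consumer takes less than a full block each time. This is also why the input must be seekable.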
The By the way, it's necessary to point out that it is quite difficult to compile |
As for performance, the built-in ctypes module is itself built on the Python C API, and all type conversions are done in Python. It's just a wrapper around libffi, so how can it be faster than the native Python C API that Cython compiles against? As for cffi, I'm using API mode, which also compiles to native Python C API code. The ctypes module is like cffi's ABI mode, so with ffi.dlopen you would get something like ctypes. |
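To settle the performance question with numbers rather than arguments, a small backend-agnostic timing harness could be used. This is a sketch only: `compare` is a hypothetical helper, and the callables passed to it would be thin wrappers around each backend's `signature`, `delta`, or `patch` call on the same input, which are not shown here.

```python
import timeit


def compare(callables, number=1000, repeat=3):
    """Return {name: best seconds per call} for zero-argument callables.

    `callables` maps a label (for example a backend name) to a function
    that performs one unit of work with that backend.
    """
    results = {}
    for name, fn in callables.items():
        # Take the best of several runs to reduce noise from the system.
        best = min(timeit.repeat(fn, number=number, repeat=repeat))
        results[name] = best / number
    return results


# Placeholder workloads; real use would call each librsync backend instead.
timings = compare(
    {"sum": lambda: sum(range(100)), "noop": lambda: None},
    number=100,
    repeat=2,
)
```

For small inputs the per-call overhead of the binding dominates, while for large files the time spent inside librsync itself is likely to dominate, so it is worth timing both cases.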
Now it's very easy to compile librsync on Windows; it takes only two commands:
cmake -A Win32 -D BUILD_RDIFF=OFF -D BUILD_SHARED_LIBS=OFF .
-- Building for: Visual Studio 16 2019
-- Selecting Windows SDK version 10.0.19041.0 to target Windows 10.0.17763.
-- The C compiler identification is MSVC 19.29.30146.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/BuildTools/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x86/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- DO_RS_TRACE=0
.....
-- Could NOT find LIBB2 (missing: LIBB2_LIBRARY_RELEASE LIBB2_INCLUDE_DIR)
-- Using included blake2 implementation.
-- Could NOT find Doxygen (missing: DOXYGEN_EXECUTABLE)
-- CMAKE_C_FLAGS = /DWIN32 /D_WINDOWS /W3 /D_CRT_SECURE_NO_WARNINGS
-- Configuring done
-- Generating done
-- Build files have been written
cmake --build . --config Release
Checking Build System
Building Custom Rule CMakeLists.txt
checksum_test.c
checksum.c
rollsum.c
...
Génération de code en cours...
sumset_test.vcxproj -> librsync-2.3.2\Release\sumset_test.exe
Building Custom Rule librsync-2.3.2/CMakeLists.txt
When I get more time after the holidays, I will try to push a ctypes implementation, if you are OK with that. |
Looks good. So there may be a third backend like |
yes, that would be the idea :) |
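One way a package could pick among several backends is to try them in order of preference and fall back on import failure. The sketch below assumes nothing about this project's layout: the module names in `CANDIDATES` are hypothetical placeholders, and `load_backend` is an illustrative helper, not an existing API.

```python
import importlib

# Hypothetical module names, fastest backend first.
CANDIDATES = ("_backend_cython", "_backend_cffi", "_backend_ctypes")


def load_backend(candidates=CANDIDATES, importer=importlib.import_module):
    """Return (name, module) for the first candidate that imports cleanly."""
    errors = {}
    for name in candidates:
        try:
            return name, importer(name)
        except ImportError as exc:
            # Remember why this backend was skipped and try the next one.
            errors[name] = exc
    raise ImportError("no usable backend; tried: " + ", ".join(errors))


# Demonstration with a missing module followed by one from the stdlib.
chosen_name, chosen_module = load_backend(("no_such_backend_xyz", "json"))
```

Since all backends are meant to expose the same functions and signatures, the caller only needs the returned module and does not have to care which one was selected.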
Besides, just compiling for windows is not enough. You must also set the visibility of functions so that ctypes/cffi can load the module. I suffered a lot from this. |
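A quick way to see whether a built library exports what a ctypes or cffi ABI-mode loader needs is to probe for the symbols by name. The helper below is a hypothetical sketch: `missing_symbols` is not part of any library, and it relies only on the fact that a loaded `ctypes.CDLL` object raises `AttributeError` for a symbol it cannot resolve.

```python
import types


def missing_symbols(lib, names):
    """Return the names that cannot be resolved as attributes of `lib`.

    With a ctypes.CDLL object, an unresolved symbol raises AttributeError,
    which getattr() with a default turns into None.
    """
    return [name for name in names if getattr(lib, name, None) is None]


# Stand-in object for demonstration; real use would pass ctypes.CDLL(path).
fake_lib = types.SimpleNamespace(rs_sig_begin=object(), rs_job_iter=object())
not_exported = missing_symbols(
    fake_lib, ["rs_sig_begin", "rs_job_iter", "rs_patch_begin"]
)
```

Running such a check against the function names used by the binding right after the build would surface a visibility problem before any wrapper code is imported.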
Please add some documentation.