-
Notifications
You must be signed in to change notification settings - Fork 720
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix raxRemove crash at memcpy() due to key size exceeds max Rax size #1722
base: unstable
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## unstable #1722 +/- ##
============================================
+ Coverage 70.97% 71.13% +0.16%
============================================
Files 123 123
Lines 65536 65537 +1
============================================
+ Hits 46511 46618 +107
+ Misses 19025 18919 -106
|
Great! Can you add a test case that would fail without this fix? |
You also need to fix the DCO. |
Fix raxRemove crash at memcpy due to key size exceed RAX_NODE_MAX_SIZE. Note that this could happen when key size was more than 512MB if we allow it by increasing the default proto-max-bulk-len. The crash could happen when we recompress the rax after removing a key due to expiry or DEL while memcpy merge the key that exceed 512MB limit. While the counting phase has the size check, the actual compress logic is missing it which lead to this crash. Signed-off-by: Ram Prasad Voleti <[email protected]>
8ab5ffd
to
88d39ec
Compare
Add unit test to reproduce the crash in raxRemove. Signed-off-by: Ram Prasad Voleti <[email protected]>
88d39ec
to
e09cf46
Compare
Thank you for taking a look. @zuiderkwast @madolson
Please let me know if you need any adjustments to the change. Thank you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the test!
I have just a minor comment about the name and comment for the test case.
I have tried this test case without the fix and it indeed crashes. This is the stacktrace when running in GDB (with optimization so we can't see the args; sorry for that):
Program received signal SIGSEGV, Segmentation fault.
0x0000fffff7c94e80 in __memcpy_generic () from /lib64/libc.so.6
(gdb) bt
#0 0x0000fffff7c94e80 in __memcpy_generic () from /lib64/libc.so.6
#1 0x00000000004bc0fc in raxRemove (rax=0xfffff200c000, s=<optimized out>, len=<optimized out>, old=<optimized out>) at unit/../rax.c:1181
#2 0x00000000004c3540 in test_raxRemoveCrash (argc=<optimized out>, argv=<optimized out>, flags=<optimized out>) at unit/test_rax.c:1071
#3 0x00000000004762b4 in runTestSuite (test=test@entry=0x771a18 <unitTestSuite+176>, argc=argc@entry=4, argv=argv@entry=0xffffffffee08, flags=flags@entry=6) at unit/test_main.c:25
#4 0x000000000045a4cc in main (argc=4, argv=0xffffffffee08) at unit/test_main.c:61
Update the test name and description, addressing the comments of the PR Signed-off-by: Ram Prasad Voleti <[email protected]>
e605771
to
dff6d5e
Compare
Fix raxRemove crash at memcpy() (line 1181#) due to key size exceeds
RAX_NODE_MAX_SIZE
. Note that this could happen when key size was more than 512MB if we allow it by increasing the defaultproto-max-bulk-len
. The crash could happen when we recompress the rax after removing a key due to expiry or DEL while memcpy() merge the key that exceed 512MB limit. While the counting phase has the size check, the actual compress logic is missing it which lead to this crash.Crash explanation with example:
![Screenshot 2025-02-12 at 8 52 16 PM](https://private-user-images.githubusercontent.com/90425341/412742931-92774422-b945-4d19-9149-7071083be533.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk2ODkzNjksIm5iZiI6MTczOTY4OTA2OSwicGF0aCI6Ii85MDQyNTM0MS80MTI3NDI5MzEtOTI3NzQ0MjItYjk0NS00ZDE5LTkxNDktNzA3MTA4M2JlNTMzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE2VDA2NTc0OVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWFiYjc0ZDgwYjM3YmE4NjM4ODBlNzRiYjgwZTc1YTU3MDdhY2M0OWVlZTA1NzdmNjkwYWM2MDFmOGE1MzA5ZGMmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.52KonUXyfZzfY0LawGUZmec6aWv_Nuo8XJdRFxiWWJ0)
![Screenshot 2025-02-12 at 9 48 25 PM](https://private-user-images.githubusercontent.com/90425341/412743015-7791c729-db9b-4b23-b9e4-f0a3fffd3422.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk2ODkzNjksIm5iZiI6MTczOTY4OTA2OSwicGF0aCI6Ii85MDQyNTM0MS80MTI3NDMwMTUtNzc5MWM3MjktZGI5Yi00YjIzLWI5ZTQtZjBhM2ZmZmQzNDIyLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE2VDA2NTc0OVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWY1OGNjZGU2Y2FkNDJiZWExZDVhN2ZmYjBjZWVlZDc3N2U5ZmE4YmZhNTA4MDA3MTVjY2E5NDk0MzNmYjNiMDgmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.tGxhb0pzvy2dZFw2VZofZj0dbqgVPxSQnH4oSNumxlQ)
`