Fix raxRemove crash at memcpy() due to key size exceeds max Rax size #1722

Open: 3 commits to be merged into base branch unstable

Conversation

VoletiRam
Contributor

Fix a raxRemove crash at memcpy() (rax.c line 1181) that occurs when the key size exceeds RAX_NODE_MAX_SIZE. This can happen when a key is larger than 512MB, which is possible only if the default proto-max-bulk-len is increased. The crash occurs when the rax is recompressed after a key is removed by expiry or DEL, and memcpy() merges a key that exceeds the 512MB limit. The counting phase has the size check, but the actual compression logic is missing it, which leads to this crash.

Crash explanation with example:
[Image: Screenshot 2025-02-12 at 8 52 16 PM]
[Image: Screenshot 2025-02-12 at 9 48 25 PM]
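The failure mode described above can be sketched with a small standalone model. This is not the rax.c code: the function names, the tiny cap `MODEL_NODE_MAX_SIZE`, and the flat array of node sizes are all assumptions made for illustration. The model only tracks byte counts, so it shows how a copy loop without the size check writes more bytes than the counting loop allocated for, without actually overflowing a buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative cap only (assumption); the real RAX_NODE_MAX_SIZE is far larger. */
#define MODEL_NODE_MAX_SIZE 8

/* Phase 1 (counting): walk the chain and add up node sizes, stopping
 * before the merged node would exceed the cap. Returns the number of
 * bytes the merged node is allocated for. */
static size_t model_count_phase(const size_t *sizes, size_t n, size_t *nodes_out) {
    assert(n > 0);
    size_t total = sizes[0];
    size_t nodes = 1;
    for (size_t i = 1; i < n; i++) {
        if (total + sizes[i] > MODEL_NODE_MAX_SIZE) break; /* size check */
        total += sizes[i];
        nodes++;
    }
    *nodes_out = nodes;
    return total;
}

/* Phase 2 (copying): returns how many bytes the copy loop would write.
 * check_cap == 0 models a copy loop that lacks the size check;
 * check_cap != 0 models a copy loop that stops where phase 1 stopped. */
static size_t model_copy_phase(const size_t *sizes, size_t n, int check_cap) {
    assert(n > 0);
    size_t written = 0;
    for (size_t i = 0; i < n; i++) {
        if (check_cap && i > 0 && written + sizes[i] > MODEL_NODE_MAX_SIZE) break;
        written += sizes[i];
    }
    return written;
}
```

With node sizes {5, 2, 4} and a cap of 8, the counting phase stops after two nodes and reports 7 bytes. The unchecked copy phase would write 11 bytes into that 7-byte allocation, while the checked copy phase writes exactly 7.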

codecov bot commented Feb 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.13%. Comparing base (2f2b8d1) to head (dff6d5e).
Report is 3 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1722      +/-   ##
============================================
+ Coverage     70.97%   71.13%   +0.16%     
============================================
  Files           123      123              
  Lines         65536    65537       +1     
============================================
+ Hits          46511    46618     +107     
+ Misses        19025    18919     -106     
| Files with missing lines | Coverage Δ |
|---|---|
| src/rax.c | 82.78% <100.00%> (+0.02%) ⬆️ |

... and 11 files with indirect coverage changes

@zuiderkwast
Contributor

Great! Can you add a test case that would fail without this fix?

@madolson
Member

You also need to fix the DCO.

Fix a raxRemove crash at memcpy that occurs when the key size exceeds
RAX_NODE_MAX_SIZE. This can happen when a key is larger than 512MB,
which is possible only if the default proto-max-bulk-len is increased.
The crash occurs when the rax is recompressed after a key is removed
by expiry or DEL, and memcpy merges a key that exceeds the 512MB
limit. The counting phase has the size check, but the actual
compression logic is missing it, which leads to this crash.

Signed-off-by: Ram Prasad Voleti <[email protected]>

Add unit test to reproduce the crash in raxRemove.

Signed-off-by: Ram Prasad Voleti <[email protected]>
@VoletiRam
Contributor Author

Thank you for taking a look, @zuiderkwast @madolson.

  1. Fixed the missing DCO.
  2. Added a regression unit test that reproduces the crash scenario:
    • The test is in test_rax.c.
    • It's disabled by default as it is a large memory test.
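One way to keep a large-memory test disabled by default is to gate it on an opt-in flag, as in the minimal sketch below. The flag name `SKETCH_FLAG_LARGE_MEMORY`, its value, and the function `sketch_large_memory_test` are hypothetical; the actual test in test_rax.c may be gated differently.

```c
#include <stddef.h>

/* Hypothetical opt-in flag bit (assumption); the real harness may use
 * a different name or value. */
#define SKETCH_FLAG_LARGE_MEMORY (1 << 1)

/* Runs the expensive test body only when the caller opts in.
 * Sets *ran to 1 if the body executed, 0 if the test was skipped.
 * Returns 0 on success or skip. */
static int sketch_large_memory_test(int flags, int *ran) {
    *ran = 0;
    if (!(flags & SKETCH_FLAG_LARGE_MEMORY)) {
        return 0; /* skipped: large-memory tests are off by default */
    }
    *ran = 1;
    /* The real test would build the large keys here, insert them,
     * and then remove one to trigger recompression. */
    return 0;
}
```

Calling the function with `flags == 0` skips the body, and passing the flag bit runs it, so default test runs do not pay the memory cost.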

Please let me know if you need any adjustments to the change. Thank you.

Contributor

@zuiderkwast zuiderkwast left a comment

Thanks for the test!

I have just a minor comment about the name and comment for the test case.

I tried this test case without the fix and it does crash. This is the stack trace from running it in GDB (built with optimization, so the args are not visible; sorry for that):

Program received signal SIGSEGV, Segmentation fault.
0x0000fffff7c94e80 in __memcpy_generic () from /lib64/libc.so.6
(gdb) bt
#0  0x0000fffff7c94e80 in __memcpy_generic () from /lib64/libc.so.6
#1  0x00000000004bc0fc in raxRemove (rax=0xfffff200c000, s=<optimized out>, len=<optimized out>, old=<optimized out>) at unit/../rax.c:1181
#2  0x00000000004c3540 in test_raxRemoveCrash (argc=<optimized out>, argv=<optimized out>, flags=<optimized out>) at unit/test_rax.c:1071
#3  0x00000000004762b4 in runTestSuite (test=test@entry=0x771a18 <unitTestSuite+176>, argc=argc@entry=4, argv=argv@entry=0xffffffffee08, flags=flags@entry=6) at unit/test_main.c:25
#4  0x000000000045a4cc in main (argc=4, argv=0xffffffffee08) at unit/test_main.c:61

Review comment on src/unit/test_rax.c (outdated, resolved)
Update the test name and description, addressing the comments of the PR

Signed-off-by: Ram Prasad Voleti <[email protected]>