
shares shreds' payload between window-service and retransmit-stage #4803

Open

behzadnouri wants to merge 1 commit into master from share-shred-payload-sigverify

Conversation


@behzadnouri behzadnouri commented Feb 5, 2025

Problem

Shreds received from turbine are concurrently sent to window-service to be deserialized and inserted into blockstore, while their payload is sent to retransmit-stage. Using a shared payload between the two concurrent paths will reduce allocations and memcopies.
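As a rough illustration of the approach (the channel names and the `Payload` alias here are invented for the sketch, not the PR's actual types), wrapping the payload in an `Arc` means the second consumer clones a reference count rather than the buffer:

```rust
use std::sync::{mpsc, Arc};

// Illustrative alias: the PR's actual shred::Payload type may differ.
type Payload = Arc<Vec<u8>>;

// Hand the same backing buffer to both pipeline stages; the
// Arc::clone bumps a refcount instead of copying the bytes.
fn dispatch(
    payload: Payload,
    window_tx: &mpsc::Sender<Payload>,
    retransmit_tx: &mpsc::Sender<Payload>,
) {
    retransmit_tx.send(Arc::clone(&payload)).unwrap();
    window_tx.send(payload).unwrap();
}
```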

Summary of Changes

The commit shares shreds' payload between window-service and retransmit-stage.

@behzadnouri behzadnouri force-pushed the share-shred-payload-sigverify branch from c8f42a9 to a91c1d3 on February 5, 2025 17:01

steviez commented Feb 5, 2025

@vadorovsky has been looking into a possible change to back Packet with Bytes instead of the fixed-size array. I would have to let Michal comment more on the specifics, but such a change might allow us to accomplish the same thing as this PR without ever copying the payload out of the original Packet buffer.


behzadnouri commented Feb 5, 2025

> @vadorovsky has been looking into a possible change to back Packet with Bytes instead of the fixed-size array. I would have to let Michal comment more on the specifics, but such a change might allow us to accomplish the same thing as this PR without ever copying the payload out of the original Packet buffer.

I would not like to tie this to that other work (which I am not even confident is the right thing to do).
Replacing the inner buffer in Packet with Bytes is a huge change, and it is not necessarily a good one, because:

  • Bytes requires dynamic dispatch, which is pretty slow.
  • We already have a recycler for Packets. That will not work well with Bytes, or it will require making a lot of clones, which defeats the whole point of using Bytes anyway.
  • The gpu code will break, which is definitely a negative trade-off (we spend a lot more time on sigverify than on copying bytes). It is true that the gpu code is not used much today, but that can change in the future when demand goes up.
  • Bytes does not work with [u8; N] or Arc<Vec<u8>>. I am already using Arc<Vec<u8>> in the payload, and I have plans to use [u8; N] in the payload as well: allows using fixed size arrays inside shred::Payload #4792. (A rough sketch of this payload shape follows below.)
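For context, a hedged sketch of the payload shape those plans imply (illustrative only; the real `shred::Payload` in #4792 may be defined differently):

```rust
use std::sync::Arc;

// Illustrative: a payload that is either a refcounted shared buffer
// or a fixed-size inline array, which is the combination the Bytes
// type cannot represent directly.
enum Payload<const N: usize> {
    Shared(Arc<Vec<u8>>),
    Fixed(Box<[u8; N]>),
}

impl<const N: usize> AsRef<[u8]> for Payload<N> {
    fn as_ref(&self) -> &[u8] {
        match self {
            Self::Shared(bytes) => bytes,
            Self::Fixed(bytes) => &bytes[..],
        }
    }
}
```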


steviez commented Feb 5, 2025

> • We already have a recycler for Packets. That will not work well with Bytes, or it will require making a lot of clones, which defeats the whole point of using Bytes anyway.

The recyclers operate on PacketBatch, right? I think the idea would be one big Bytes allocation for the PacketBatch, and each Packet within that batch gets a slice of it. I think Bytes would handle the reference counting properly to ensure the PacketBatch allocation is dropped only after all of the individual Bytes references are dropped.
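A rough sketch of that shape using the bytes crate (the names here are invented for illustration; this is not how PacketBatch is actually laid out):

```rust
use bytes::Bytes;

// Hypothetical batch: one backing allocation shared by every packet.
struct BatchBuffer {
    data: Bytes,
}

impl BatchBuffer {
    // Each packet is a zero-copy, refcounted view into the shared
    // buffer; the allocation is freed only after every view drops.
    fn packet(&self, offset: usize, len: usize) -> Bytes {
        self.data.slice(offset..offset + len)
    }
}
```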

> • The gpu code will break, which is definitely a negative trade-off (we spend a lot more time on sigverify than on copying bytes). It is true that the gpu code is not used much today, but that can change in the future when demand goes up.

As an FYI, I think we're leaning pretty heavily towards ripping the GPU code out: #3817

Got it, we can discuss over there

> I would not like to tie this to that other work (which I am not even confident is the right thing to do).

Fair enough. We can optimize things as they are, and if Bytes ends up happening and being a good choice, we can refactor this if there are any gains to be had.

@behzadnouri (Author)

> The recyclers operate on PacketBatch, right? I think the idea would be one big Bytes allocation for the PacketBatch, and ...

That still does not sound to me like it will work with the recycler.

> As an FYI, I think we're leaning pretty heavily towards ripping the GPU code out: #3817

That is terrible. Sigverify is a bigger bottleneck than whatever #3817 is going to solve.

> We can optimize things as they are, and if Bytes

Not using Bytes is a good optimization. It requires dynamic dispatch, which is pretty slow.
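To make the dispatch contrast concrete (an illustrative sketch, not a benchmark result): cloning an `Arc<Vec<u8>>` is a statically dispatched refcount increment, while cloning a `Bytes` handle goes through its internal vtable, because the backing storage is only known at runtime:

```rust
use bytes::Bytes;
use std::sync::Arc;

// Statically dispatched and inlinable: one atomic increment.
fn clone_shared(payload: &Arc<Vec<u8>>) -> Arc<Vec<u8>> {
    Arc::clone(payload)
}

// Dispatches through Bytes' internal vtable, since the handle may be
// backed by static data, a promoted Vec, an Arc, and so on.
fn clone_bytes(payload: &Bytes) -> Bytes {
    payload.clone()
}
```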

Comment on lines -105 to -116
```rust
// Collect per-source packet counts and log the heaviest senders.
let mut addrs: Vec<_> = self.addrs.iter().collect();
let reverse_count = |(_addr, count): &_| Reverse(*count);
if addrs.len() > MAX_NUM_ADDRS {
    // Partition so the MAX_NUM_ADDRS largest counts come first,
    // then drop the rest.
    addrs.select_nth_unstable_by_key(MAX_NUM_ADDRS, reverse_count);
    addrs.truncate(MAX_NUM_ADDRS);
}
addrs.sort_unstable_by_key(reverse_count);
info!(
    "num addresses: {}, top packets by source: {:?}",
    self.addrs.len(),
    addrs
);
```
@behzadnouri (Author)

Removing this info! log (and the associated addrs bookkeeping) because it is pretty inefficient to collect and emit these logs here.
I will look into putting something similar elsewhere in the pipeline (maybe shred-fetch-stage or sigverify).

@behzadnouri behzadnouri force-pushed the share-shred-payload-sigverify branch from a91c1d3 to 3653ce0 on February 5, 2025 23:50
@behzadnouri behzadnouri force-pushed the share-shred-payload-sigverify branch from 3653ce0 to 631996b on February 5, 2025 23:54