Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dnsdist: add multi-stream dnstap sockets #14861

Closed
johnhtodd opened this issue Nov 22, 2024 · 13 comments · Fixed by #15123
Closed

dnsdist: add multi-stream dnstap sockets #14861

johnhtodd opened this issue Nov 22, 2024 · 13 comments · Fixed by #15123

Comments

@johnhtodd
Copy link

  • Program: dnsdist
  • Issue type: Feature request

Short description

Adding multiple streams of dnstap data emitted from dnsdist would allow better scaling for dnstap consumers.

Usecase

We're using Vector to consume dnstap streams from dnsdist. There apparently are bottlenecks with single-socket models of dnstap data transmission. This I'm sure would be exacerbated by slower latency on the LAN and larger traffic volumes - the ACK traffic will start to cause pileups with a single socket. Spreading the data load out across many dnstap sockets like pdns-rec does would make sense. There is discussion of this here: vectordotdev/vector#20744 (see comment from james-stevens)

Description

Having dnsdist open multiple simultaneous dnstap sockets to any named endpoint would be useful. The number of sockets could be configurable, or it could be based on threading, or dynamic based on number of messages - no opinion on that. How does pdns-rec do it?

@omoerbeek
Copy link
Member

The recursor opens a dnstap or protobuf logging stream per thread. Each thread logs to its associated socket.

@rgacogne
Copy link
Member

At the moment DNSdist creates one fstrm_writer per newFrameStreamUnixLogger or newFrameStreamTcpLogger object, so in theory it is already possible to create more than one connection, but it might be quite cumbersome to use them so.

@johnhtodd
Copy link
Author

More notes on this: When we had >60kqps flowing through a single TCP session to Vector, I was seeing what I would describe as high fluctuations in bandwidth between the transmitting server and the Vector instance. Meaning: for a few milliseconds, there would be many (20? I don't recall the number) megabits of throughput, which would then drop for a few milliseconds to 5 megabits of throughput, and then jump back to 20. This was observed via packet dumps with tshark. Sorry that I don't have the exact figures here; I was hunting a different problem and that behavior was a "Huh... interesting." moment, but I did not document it. I suspect this is a behavior that is made worse by the single-socket model in use. The same packet dump looking at data coming from pdns-rec (with less than 60kqps, admittedly) was extremely smooth - there were no fluctuations in the packet throughput between Vector and the recursive resolver on the dnstap data stream. Is this the "fault" of dnsdist? Not necessarily, but the single-socket model may make things worse than they need to be.

@esensar
Copy link
Contributor

esensar commented Jan 31, 2025

I am starting to work on this and my initial thought was to just add an additional option and open multiple connections inside of FrameStreamLogger and keep track of them there, but maybe adding a separate RemoteLoggerInterface that just acts as a group of other loggers would make more sense?

Not sure how to name it, but idea would be to manually build a number of loggers (or maybe provide a function that would do it based on a provided number parameter, or by number of available threads) and then just select a different one for each message. This would open up the possibility to easily providing different strategies, to achieve different goals, like passing same messages to all the different loggers or implementing the idea from this issue by selecting a different logger for each new message.

Does an approach like this make sense @rgacogne @omoerbeek ? I wanted to check before I head into implementation.

@rgacogne
Copy link
Member

rgacogne commented Feb 3, 2025

Yes, it would make a lot of sense to me to have a new RemoteLoggerInterface which is a pool of FrameStreamLogger objects (actually, it could perhaps even be a pool of RemoteLoggerInterface without being tied to FrameStreamLogger?). I would start with a simple round-robin strategy, but indeed being careful to abstract the exact strategy might be nice in the future.

@omoerbeek
Copy link
Member

Please also consider a strategy where we have a thread local object per thread and a very simple strategy: select the current thread local object. So far this has worked well in the recursor.

@rgacogne
Copy link
Member

rgacogne commented Feb 4, 2025

Would using a thread local pool work for the recursor? If the abstraction is well-designed you can use a pool of one FrameStreamLogger without any additional cost, and I'm pretty sure in some setups it will be useful to have more than one TCP connection to the FrameStream server per thread.

@omoerbeek
Copy link
Member

Would using a thread local pool work for the recursor? If the abstraction is well-designed you can use a pool of one FrameStreamLogger without any additional cost, and I'm pretty sure in some setups it will be useful to have more than one TCP connection to the FrameStream server per thread.

unless I'm mighty confused, Recursor already uses thread local loggers, each Recursor thread gets it's own.

@esensar
Copy link
Contributor

esensar commented Feb 4, 2025

Yes, looking at the code, it seems that recursor has a pool of loggers for each thread (each of the configured servers has a logger for each thread).

I will try to implement something similar, to have a strategy for that in this pool, but I am not sure if I will be able to. I am a bit confused by the fact that these can be created at any time, and in recursor they are created when threads get initialized. Either way, I will at least try to design it in a way that it could later be added as an additional strategy.

Also, I think that logger per thread might work better as an additional option for existing new...Logger functions, because I assumed that this new function for creating a pool would take an array of loggers, which might be confusing in combination with logger per thread. Maybe that is not a good idea?

@rgacogne
Copy link
Member

rgacogne commented Feb 4, 2025

I guess I'm confused, so let me try to clarify what I mean. Otto, my understanding is that, right now, you can configure a list of DNStap servers in the recursor via (dnstapFrameStreamServer or logging.dnstap_framestream_servers). All servers in that list will receive each DNStap messages exported by the recursor (queries, responses, or both depending on the per-server settings). To do that, the recursor keeps a per-thread vector of FrameStreamLogger objects. So if I'm correct it means that each recursor thread will have a single TCP connection to each DNStap server, which might at some point have the same scalability problem that DNSdist has. It's less likely to happen because the recursor only exports outgoing queries and incoming responses, but still it might happen. I don't see a way to get more than one TCP connection per thread to a given server, right?
Assuming I'm correct, would it make sense to have the possibility to have a per-thread vector of pools of FrameStreamLogger objects instead, so that each thread can distribute outgoing DNStap messages to a single server over a pool of TCP connections (probably in a round-robin way) to avoid saturating a single TCP connection?

@omoerbeek
Copy link
Member

Indeed recursor used single logging object per destination per thread, whcih has worked well in what we've seen.

I was under the impression that dnsdist uses a global single logging object and suggesting that moving to a per-thread logging object would already have a real impact. I might be mistaken though. Does dnsdist have per-thread logging objects or global ones?

@rgacogne
Copy link
Member

rgacogne commented Feb 4, 2025

Indeed recursor used single logging object per destination per thread, whcih has worked well in what we've seen.

OK, so let's not touch the recursor then, that's fine by me.

I was under the impression that dnsdist uses a global single logging object and suggesting that moving to a per-thread logging object would already have a real impact. I might be mistaken though. Does dnsdist have per-thread logging objects or global ones?

dnsdist uses a single FrameStreamLogger (so TCP connection) assigned to a rule, so in most cases this translates to a single logging object (and TCP connection), yes. Given the fact that a single DNSdist thread can deal with N*100k QPS, I don't expect a per-thread object to be enough.
In my mind, what we need is:

  • newFrameStreamTcpLogger, newFrameStreamUnixLogger and DnstapLoggerConfiguration (YAML) take a new parameter to specify the number of TCP connections to open to the server. Then internally DNStap messages are sent over the TCP connections in a round-robin way.

@omoerbeek
Copy link
Member

OK, i understand better now why a single thread local logging object (per thread) is likely not enough.

esensar added a commit to esensar/pdns that referenced this issue Feb 5, 2025
This adds a new kind of `RemoteLoggerInterface`: `RemoteLoggerPool`.
It can take multiple other `RemoteLoggerInterface`s and pass data to
them in round-robin order by default.

This also adds additional option to `newRemoteLogger`, `newFrameStreamTcpLogger`
and `newFrameStreamUnixLogger`: `connectionCount`, which can be used to
generate a pool with multiple connections.

Closes: PowerDNS#14861
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants