pipe: add flb_pipe_error #10017

Open: braydonk wants to merge 1 commit into master

@braydonk braydonk commented Feb 26, 2025

On Windows, the flb_pipe_r and flb_pipe_w macros do not set errno on failure, so calling flb_errno after a failed pipe read or write does not report the real error. This PR adds a new macro, flb_pipe_error, that reads the error from the correct place on Windows, WSAGetLastError, and prints a message in a similar format. On Linux, flb_pipe_error is still flb_errno, so messages there are unchanged, while Windows now gets accurate error messages.
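To illustrate the platform split described above, here is a minimal, self-contained sketch. It is not the macro added by this PR: `demo_pipe_error_format` and `demo_pipe_error` are hypothetical names, and writing the message into a caller-supplied buffer (instead of going through the Fluent Bit logger) is an assumption made so the example can stand alone. The message layout only approximates the log lines shown in the Testing section.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#include <winsock2.h>
#include <windows.h>
#endif

/*
 * Hypothetical helper (not part of Fluent Bit): formats the last pipe
 * error into `out`. On Windows the code comes from WSAGetLastError(),
 * elsewhere from errno.
 */
static int demo_pipe_error_format(char *out, size_t size,
                                  const char *file, int line)
{
#ifdef _WIN32
    int err = WSAGetLastError();
    char msg[256] = "";

    /* Ask the system for the text that belongs to the Winsock code. */
    FormatMessageA(FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
                   NULL, (DWORD) err, 0, msg, (DWORD) sizeof(msg), NULL);
    return snprintf(out, size, "[%s:%d WSAGetLastError=%d] %s",
                    file, line, err, msg);
#else
    /* Save errno first so later calls cannot overwrite it. */
    int err = errno;

    return snprintf(out, size, "[%s:%d errno=%d] %s",
                    file, line, err, strerror(err));
#endif
}

/* Hypothetical convenience macro that records the call site. */
#define demo_pipe_error(buf, size) \
    demo_pipe_error_format((buf), (size), __FILE__, __LINE__)
```

A call site would check the return value of the pipe read or write for -1 and then call `demo_pipe_error(buf, sizeof(buf))` before returning, the same shape as the test diffs in the Testing section.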

Fixes #3146


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change

I had trouble finding a scenario in which a genuine winsock.h failure is propagated. #3146 mentions sending to an Elasticsearch host that is not available, but the way the HTTP client handles that error appears to have changed, so a direct pipe read error is no longer propagated.

To test the log format, I made this local change on Windows:

--- a/src/flb_utils.c
+++ b/src/flb_utils.c
@@ -494,7 +494,9 @@ int flb_utils_timer_consume(flb_pipefd_t fd)
     int ret;
     uint64_t val;

-    ret = flb_pipe_r(fd, &val, sizeof(val));
+    // ret = flb_pipe_r(fd, &val, sizeof(val));
+    ret = -1;
+    WSASetLastError(WSAEADDRINUSE);
     if (ret == -1) {
         flb_pipe_error();
         return -1;

Resulting log:

[2025/02/27 18:01:58] [error] [C:\Users\braydonk\Git\fluent-bit\src\flb_utils.c:501 WSAGetLastError=10048] Only one usage of each socket address (protocol/network address/port) is normally permitted.

On Linux I made the following similar local change:

--- a/src/flb_utils.c
+++ b/src/flb_utils.c
@@ -494,7 +494,9 @@ int flb_utils_timer_consume(flb_pipefd_t fd)
     int ret;
     uint64_t val;
 
-    ret = flb_pipe_r(fd, &val, sizeof(val));
+    // ret = flb_pipe_r(fd, &val, sizeof(val));
+    ret = -1;
+    errno = EADDRINUSE;
     if (ret == -1) {
         flb_pipe_error();
         return -1;

And this is the resulting error message:

[2025/02/28 13:57:17] [error] [/usr/local/google/home/braydonk/Git/fluent-bit/src/flb_utils.c:501 errno=98] Address already in use
  • Attached Valgrind output that shows no leaks or memory corruption was found

Documentation

Docs PR should not be required.

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.


Signed-off-by: braydonk <[email protected]>
@braydonk

That s390x failure doesn't look related to this PR.

@edsiper edsiper added this to the Fluent Bit v4.0.0 milestone Feb 28, 2025
Successfully merging this pull request may close these issues.

flb_errno printing wrong error codes on windows