processor_sampling: new trace sampling processor #10029

Open: edsiper wants to merge 2 commits into master

@edsiper edsiper commented Feb 28, 2025

This patch introduces a new trace sampling processor designed with a pluggable architecture, allowing easy extension to support multiple sampling strategies and backends.

The initial implementation includes basic probabilistic sampling, with future patches planned to add additional sampling methods such as rate-limiting, latency-based, and tail-based sampling.

The probabilistic sampler can be configured as follows:

pipeline:
  inputs:
    - name: opentelemetry
      port: 4318

      processors:
        traces:
          - name: sampling
            type: probabilistic
            debug: true
            rules:
               sampling_percentage: 40

  outputs:
    - name: stdout
      match: '*'

In this configuration:

  • Debug mode (debug: true) is enabled, so the processor prints the traces it sees before and after the sampling decision.
  • sampling_percentage: 40 retains about 40% of traces and discards the rest.
  • Traces that pass sampling are forwarded to the stdout output for visibility.
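
To make the keep-or-drop decision concrete, here is a minimal sketch of one common way to do probabilistic sampling per trace. It is an illustration only: the PR does not describe how the processor draws its decision, so the hash-based approach, the `keep_trace` name, and the 10,000-bucket granularity are all assumptions rather than the plugin's actual logic.

```python
import hashlib


def keep_trace(trace_id_hex: str, sampling_percentage: float) -> bool:
    """Return True if the trace should be retained.

    Assumption: the decision is derived from a hash of the trace ID, so all
    spans that share a trace ID get the same verdict. This is a hypothetical
    helper, not code taken from the sampling processor.
    """
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be within [0, 100]")
    digest = hashlib.sha256(bytes.fromhex(trace_id_hex)).digest()
    # Map the first 8 bytes of the digest to a bucket in [0, 10000).
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    # 40 percent keeps buckets 0..3999, 100 percent keeps every bucket.
    return bucket < sampling_percentage * 100


# Trace IDs copied from the debug output in this PR description.
trace_ids = [
    "5b8efff798038103d269b633813fc60c",
    "6a9dfff798038103d269b633813fc60d",
    "7c8efff798038103d269b633813fc60e",
]
decisions = {tid: keep_trace(tid, 40) for tid in trace_ids}
```

Because the verdict depends only on the trace ID in this sketch, repeated runs give the same answer for the same trace, and the retained share only approaches 40% over a large number of traces.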

Fluent Bit v4.0.0
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___  _____
|  ___| |                | |   | ___ (_) |           /   ||  _  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| || |/' |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| ||  /| |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |\ |_/ /
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/

[2025/02/28 16:46:00] [ info] [fluent bit] version=4.0.0, commit=0e885e2d60, pid=778903
[2025/02/28 16:46:00] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/02/28 16:46:00] [ info] [simd    ] disabled
[2025/02/28 16:46:00] [ info] [cmetrics] version=0.9.9
[2025/02/28 16:46:00] [ info] [ctraces ] version=0.6.0
[2025/02/28 16:46:00] [ info] [input:opentelemetry:opentelemetry.0] initializing
[2025/02/28 16:46:00] [ info] [input:opentelemetry:opentelemetry.0] storage_strategy='memory' (memory only)
[2025/02/28 16:46:00] [ info] [input:opentelemetry:opentelemetry.0] listening on 0.0.0.0:4318
[2025/02/28 16:46:00] [ info] [processor:sampling:sampling.0] initializing probabilistic sampling processor
[2025/02/28 16:46:00] [ info] [sp] stream processor started
[2025/02/28 16:46:00] [ info] [output:stdout:stdout.0] worker #0 started

🔍 Debug sampling 'probabilistic' (0x779068027940): before
   ┌─────────────────────────────────────────────────────────────────┐
   │ trace_id=5b8efff798038103d269b633813fc60c                       │
   ├─────────────────────────────────────────────────────────────────┤
   │ spans:                                                          │
   │   ├── id=eee19b7ec3c1b174 name=I'm a server span                │
   │   ├── id=eee19b7ec3c1b175 name=Child span of server span        │
   │   ├── id=eee19b7ec3c1b176 name=Database query                   │
   └─────────────────────────────────────────────────────────────────┘

   ┌─────────────────────────────────────────────────────────────────┐
   │ trace_id=6a9dfff798038103d269b633813fc60d                       │
   ├─────────────────────────────────────────────────────────────────┤
   │ spans:                                                          │
   │   ├── id=fff19b7ec3c1b174 name=A span in another trace          │
   └─────────────────────────────────────────────────────────────────┘

   ┌─────────────────────────────────────────────────────────────────┐
   │ trace_id=7c8efff798038103d269b633813fc60e                       │
   ├─────────────────────────────────────────────────────────────────┤
   │ spans:                                                          │
   │   ├── id=0000000000000000 name=Slow request                     │
   └─────────────────────────────────────────────────────────────────┘

   ┌─────────────────────────────────────────────────────────────────┐
   │ trace_id=8d9efff798038103d269b633813fc60f                       │
   ├─────────────────────────────────────────────────────────────────┤
   │ spans:                                                          │
   │   ├── id=0000000000000000 name=High traffic span                │
   │   ├── id=0000000000000000 name=Load testing event               │
   └─────────────────────────────────────────────────────────────────┘

   ┌─────────────────────────────────────────────────────────────────┐
   │ trace_id=9a1bfff798038103d269b633813fc610                       │
   ├─────────────────────────────────────────────────────────────────┤
   │ spans:                                                          │
   │   ├── id=0000000000000000 name=Faulty transaction               │
   │   ├── id=0000000000000000 name=Database rollback                │
   └─────────────────────────────────────────────────────────────────┘

🔍 Debug sampling 'probabilistic' (0x779068027940): after
   ┌─────────────────────────────────────────────────────────────────┐
   │ trace_id=6a9dfff798038103d269b633813fc60d                       │
   ├─────────────────────────────────────────────────────────────────┤
   │ spans:                                                          │
   │   ├── id=fff19b7ec3c1b174 name=A span in another trace          │
   └─────────────────────────────────────────────────────────────────┘

   ┌─────────────────────────────────────────────────────────────────┐
   │ trace_id=7c8efff798038103d269b633813fc60e                       │
   ├─────────────────────────────────────────────────────────────────┤
   │ spans:                                                          │
   │   ├── id=0000000000000000 name=Slow request                     │
   └─────────────────────────────────────────────────────────────────┘

|-------------------- RESOURCE SPAN --------------------|
  resource:
     - attributes:
            - service.name: 'other.service'
     - dropped_attributes_count: 0
     - schema_url: ""
  [scope_span]
    instrumentation scope:
        - name                    : other.library
        - version                 : 2.0.0
        - dropped_attributes_count: 0
        - attributes: undefined
    schema_url: ""
    [spans]
         [span #0 'A span in another trace']
             - trace_id                : 6a9dfff798038103d269b633813fc60d
             - span_id                 : fff19b7ec3c1b174
             - parent_span_id          : undefined
             - kind                    : 2 (server)
             - start_time              : 1544712660000000000
             - end_time                : 1544712662000000000
             - dropped_attributes_count: 0
             - dropped_events_count    : 0
             - dropped_links_count     : 0
             - trace_state             : (null)
             - status:
                 - code    : 0
             - attributes: none
             - events: none
             - [links]
|-------------------- RESOURCE SPAN --------------------|
  resource:
     - attributes:
            - service.name: 'latency.test.service'
     - dropped_attributes_count: 0
     - schema_url: ""
  [scope_span]
    instrumentation scope:
        - name                    : latency.test.library
        - version                 : 3.0.0
        - dropped_attributes_count: 0
        - attributes: undefined
    schema_url: ""
    [spans]
         [span #0 'Slow request']
             - trace_id                : 7c8efff798038103d269b633813fc60e
             - span_id                 : 0000000000000000
             - parent_span_id          : undefined
             - kind                    : 2 (server)
             - start_time              : 1544712660000000000
             - end_time                : 1544712675000000000
             - dropped_attributes_count: 0
             - dropped_events_count    : 0
             - dropped_links_count     : 0
             - trace_state             : (null)
             - status:
                 - code    : 0
             - attributes: none
             - events: none
             - [links]

Manual test

Using the JSON file trace_sampling_extended.json, send a request with curl:

curl -X POST -H "Content-Type: application/json" -d @trace_sampling_extended.json -i localhost:4318/v1/traces
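
For a scripted variant of the same manual test, the sketch below builds a small OTLP/JSON payload and posts it to the endpoint used above. The contents of trace_sampling_extended.json are not shown in this PR, so the payload here is a hypothetical one-span stand-in assembled from values visible in the debug output; `build_payload` and `post_traces` are illustrative names.

```python
import json
import urllib.request


def build_payload() -> dict:
    """Build a one-span OTLP/JSON trace payload (hypothetical stand-in
    for trace_sampling_extended.json, whose contents are not shown)."""
    return {
        "resourceSpans": [{
            "resource": {
                "attributes": [
                    {"key": "service.name",
                     "value": {"stringValue": "other.service"}}
                ]
            },
            "scopeSpans": [{
                "scope": {"name": "other.library", "version": "2.0.0"},
                "spans": [{
                    "traceId": "6a9dfff798038103d269b633813fc60d",
                    "spanId": "fff19b7ec3c1b174",
                    "name": "A span in another trace",
                    "kind": 2,  # server
                    "startTimeUnixNano": "1544712660000000000",
                    "endTimeUnixNano": "1544712662000000000",
                }],
            }],
        }]
    }


def post_traces(payload: dict,
                url: str = "http://localhost:4318/v1/traces") -> int:
    """POST the payload as JSON and return the HTTP status code."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

With Fluent Bit running with the configuration above, calling `post_traces(build_payload())` sends a single trace; with sampling at 40%, it may or may not appear on stdout.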

Fluent Bit is licensed under Apache 2.0. By submitting this pull request, I understand that this code will be released under the terms of that license.

Signed-off-by: Eduardo Silva <[email protected]>
Labels
docs-required ok-package-test Run PR packaging tests