i#6354: Do not output zero-instruction (unfiltered) threads #6356

derekbruening · 2023-10-10T22:29:02Z

Tightens the normal-mode output rule to discard a thread with no instructions (keeping the no-data check for i-filtered modes) and expands the empty-thread behavior to include regular exit runs and not just detach.

Adds the scheduler_launcher CPU check from PR #6355.

It is not easy to create a test with a thread that reliably executes zero instructions yet still outputs some data. I added an invariant check for a zero-instruction thread and confirmed it fires on the trace in PR #6355:

$ bin64/drrun -t drcachesim -simulator_type invariant_checker -indir ~/tmp/drmemtrace.schedtest.x64
Trace invariant failure in T85300 at ref # 12 (0 instrs since timestamp 13341391214912820): An unfiltered thread should have at least 1 instruction

Issue: #6354

Tightens the normal-mode output rule to discard a thread with no instructions (keeping the no-data check for i-filtered modes). Adds the scheduler_launcher CPU check from PR #6355. It is not easy to create a test with a thread that reliably executes zero instructions yet still outputs some data. I added an invariant check for a zero-instruction thread and confirmed it fires on the trace in PR #6355: ``` $ bin64/drrun -t drcachesim -simulator_type invariant_checker -indir ~/tmp/drmemtrace.schedtest.x64 Trace invariant failure in T85300 at ref # 12 (0 instrs since timestamp 13341391214912820): An unfiltered thread should have at least 1 instruction ``` Issue: #6354

derekbruening · 2023-10-10T22:30:09Z

@prasun3 could you check whether with this branch in place you no longer get that no-instruction thread? I'm not sure how to create a test to recreate that: how to have a thread reliably execute the post-futex yet not execute any instructions.

derekbruening · 2023-10-10T23:06:09Z

Failure is coherence test invalid type 23 before a bundle #4167

prasun3 · 2023-10-11T07:48:06Z

I can reproduce this with the cmd line below. It happens when there is a long running test and we use -exit_after_tracing. The same issue occurs with -max_trace_size followed by manually killing the application. In both cases, I see an exception, so the exit syscall is probably not executed

build/bin64/drrun -t drcachesim -trace_after_instrs 9000M -exit_after_tracing 5M --offline -- ./threadsig 2 1000000000

<Stopping application ... threadsig (123613)>
<Application ... threadsig (123613).  DynamoRIO Cache Simulator Tracer internal crash at PC 0x00007efdcac8f603.  Please report this at http://dynamorio.org/issues.  Program aborted.
Received SIGSEGV at pc 0x00007efdcac8f603 in thread 123614
Base: 0x00007efdcb26b000
Registers:eax=0x0000000000000240 ebx=0x00007efb8716dd60 ecx=0x00007efd87081980 edx=0x7a2ab66beb93f336
        esi=0x0000000000000000 edi=0x00007efd872cf320 esp=0x00007efb872cf290 ebp=0x00007efdcb03c708
        r8 =0x0000000000000000 r9 =0x00007efd872cf320 r10=0x0000000000000003 r11=0x0000000000000246
        r12=0x00007efb8716db30 r13=0x00007efd872c90e0 r14=0x00007efb8716db40 r15=0x0000000000000072
        eflags=0x0000000000010246
version 10.0.19640, custom build

With -max_trace_size followed by kill

build/bin64/drrun -t drcachesim -trace_after_instrs 9000M -max_trace_size 5M --offline -- ./threadsig 2 1000000000

<Application ... threadsig (123671).  Application exception at PC 0x00007fd58a647d2b.
Signal 15 delivered to application as default action.
Callstack:
        0x00007fd58a647d2b   </lib/x86_64-linux-gnu/libpthread-2.27.so+0x8d2b>
        0x00007fd587eda0f0
        0xc48348c2440f48ff
>
<Stopping application ... threadsig (123671)>

prasun3 · 2023-10-11T07:58:33Z

@prasun3 could you check whether with this branch in place you no longer get that no-instruction thread? I'm not sure how to create a test to recreate that: how to have a thread reliably execute the post-futex yet not execute any instructions.

The same behaviour is seen even with this branch.

derekbruening · 2023-10-12T03:28:17Z

<Application ... threadsig (123613). DynamoRIO Cache Simulator Tracer internal crash at PC 0x00007efdcac8f603. Please report this at http://dynamorio.org/issues. Program aborted.

But the tool crashed! All bets are off. The trace might be corrupted or not finalized properly. Probably raw2trace or invariant_checker will fail/complain if so and apparently they did not for you so maybe it's ok but I would be wary of it. That crash should be fixed: was an issue filed on it?

derekbruening · 2023-10-12T03:30:45Z

The same behaviour is seen even with this branch.

Looks like -exit_after_tracing doesn't turn off tracing mode. We always detach which turns it off...probably we should have all exit methods do so, or otherwise enable the drop-empty-thread code. Will update the PR.

…exit, too

derekbruening · 2023-10-12T04:04:25Z

The same behaviour is seen even with this branch.

Looks like -exit_after_tracing doesn't turn off tracing mode. We always detach which turns it off...probably we should have all exit methods do so, or otherwise enable the drop-empty-thread code. Will update the PR.

Updated. The empty thread should no longer exist.

prasun3 · 2023-10-12T06:47:39Z

I'm not seeing zero-inst traces with this branch and scheduler_launcher also seems to pass on these threads.

Is detach only available with source instrumentation?

I'll file an issue about the crash with -exit_after_tracing.

prasun3 · 2023-10-12T06:50:08Z

I don't fully understand the impact of the changes in output.cpp - could maybe someone more familiar do the code review?

prasun3 · 2023-10-12T06:58:25Z

I'll file an issue about the crash with -exit_after_tracing.

Looks like we discussed it here #3346 (comment)

derekbruening · 2023-10-12T19:59:15Z

Still not sure how to make a reliable test. I tried the -exit_after_tracing command and it did result in one empty thread but it had just one cpuid: it did not have a 2nd and so would not have triggered the cpuid mismatch. We may have to just go with the manual testing from @prasun3.

clients/drcachesim/tests/scheduler_launcher.cpp

clients/drcachesim/tracer/output.cpp

Fixes un-escaped quote characters from PR #6335 which show up as CMake warnings. Issue: #3101

…xpand ifdef to logical coverage.

…-empty-threads

prasun3 · 2023-10-13T03:16:38Z

Still not sure how to make a reliable test. I tried the -exit_after_tracing command and it did result in one empty thread but it had just one cpuid: it did not have a 2nd and so would not have triggered the cpuid mismatch. We may have to just go with the manual testing from @prasun3.

This is the test that I ran

build/bin64/drrun -pidfile pid.txt -t drcachesim -trace_after_instrs 9000M -exit_after_tracing 5M --offline -- ./threadsig 2 1000000000 
build/clients/bin64/drraw2trace -indir drmemtrace.threadsig.$(cat pid.txt).*
build/clients/bin64/scheduler_launcher -trace_dir drmemtrace.threadsig.$(cat pid.txt).*/trace -cpu_schedule_file drmemtrace.threadsig.$(cat pid.txt).*/trace/cpu_schedule.bin.zip

prasun3 · 2023-10-20T10:17:51Z

@derekbruening I am seeing zero-insr threads with L0_filter. Is this because there are two INSTRUCTION_COUNT markers?

Cmd line

build/bin64/drrun -pidfile pid.txt -t drcachesim -trace_after_instrs 9000M \
    -L0_filter \
    -exit_after_tracing 100K --offline -- ./threadsig 2 1000000000

Raw file entries

EXTENDED: header
THREAD
PID
EXTENDED: marker: TRACE_MARKER_TYPE_CACHE_LINE_SIZE
EXTENDED: marker: TRACE_MARKER_TYPE_PAGE_SIZE
TIMESTAMP
EXTENDED: marker: TRACE_MARKER_TYPE_CPU_ID
EXTENDED: marker: TRACE_MARKER_TYPE_INSTRUCTION_COUNT
EXTENDED: marker: TRACE_MARKER_TYPE_INSTRUCTION_COUNT
EXTENDED: footer

Trace

Output format:
<--record#-> <--instr#->: <---tid---> <record details>
------------------------------------------------------------
           1           0:      210474 <marker: version 6>
           2           0:      210474 <marker: filetype 0xbc2>
           3           0:      210474 <marker: cache line size 64>
           4           0:      210474 <marker: chunk instruction count 10000000>
           5           0:      210474 <marker: page size 4096>
           6           0:      210474 <marker: timestamp 13342269712523323>
           7           0:      210474 <marker: tid 210474 on core 20696>
           8           0:      210474 <marker: instruction count 0>
           9           0:      210474 <marker: instruction count 0>
          10           0:      210474 <thread 210474 exited>
View tool results:
              0 : total instructions

derekbruening · 2023-10-20T14:25:01Z

For an instruction filter we can't really have the same standard as there could be zero instruction records but still some data records. I guess the check could look for data records too? But how is an instr-filtered trace being fed to the scheduler: it must be a warmup with no post-warmup? I had assumed the filtered part would not be asked to be scheduled.

prasun3 · 2023-10-25T11:11:31Z

I originally saw the scheduling issue with a bimodal trace, and since the issue was root-caused to empty traces, the example above reproduces empty traces with filtering enabled.

I thought running bimodal traces through the scheduler would be okay since they have cpu id markers, but sounds like that is not the case since it relies on instructions being present?

derekbruening · 2023-10-25T17:28:15Z

I thought running bimodal traces through the scheduler would be okay since they have cpu id markers, but sounds like that is not the case since it relies on instructions being present?

We didn't really think about this use case. The basics would work but preempting by instruction count would not (so the simulator/tool would have to pass in a time value) and skipping for regions of interest or recorded schedules would not. File a separate issue on scheduler support for i-filtered traces? Maybe the rep string proposed solution Proposal #E at #6354 (comment) would also solve this.

prasun3 · 2023-10-26T04:29:17Z

I am interested in just generating per-core streams so it is okay for me even if those other features you listed don't work. But I can file an issue in case it is useful in the future or to others.

derekbruening mentioned this pull request Oct 10, 2023

i#6354 test: Verify one CPU ID per stream with MAP_TO_RECORDED #6355

Open

derekbruening requested a review from prasun3 October 10, 2023 23:13

prasun3 mentioned this pull request Oct 11, 2023

seeing more than one cpu id on one output stream with MAP_TO_RECORDED_OUTPUT #6354

Open

Throw out empty threads when tracing is still active on (non-detach) …

cec2cb6

…exit, too

derekbruening requested a review from abhinav92003 October 12, 2023 19:57

abhinav92003 approved these changes Oct 12, 2023

View reviewed changes

clients/drcachesim/tests/scheduler_launcher.cpp Outdated Show resolved Hide resolved

clients/drcachesim/tracer/output.cpp Outdated Show resolved Hide resolved

clients/drcachesim/tracer/output.cpp Outdated Show resolved Hide resolved

derekbruening added 3 commits October 12, 2023 18:10

i#3101 drcachesim external: Fix CMake syntax errors (#6357)

1ca6f8e

Fixes un-escaped quote characters from PR #6335 which show up as CMake warnings. Issue: #3101

Review requests: split into separate bools with augmented comments; e…

ca45a66

…xpand ifdef to logical coverage.

Merge branch 'master' of github.com:DynamoRIO/dynamorio into i6354-no…

0d662cb

…-empty-threads

derekbruening merged commit a749004 into master Oct 12, 2023

derekbruening deleted the i6354-no-empty-threads branch October 12, 2023 22:53

prasun3 mentioned this pull request Oct 26, 2023

Support running instruction-filtered traces or portions of traces in the scheduler #6396

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

i#6354: Do not output zero-instruction (unfiltered) threads #6356

i#6354: Do not output zero-instruction (unfiltered) threads #6356

derekbruening commented Oct 10, 2023 •

edited

Loading

derekbruening commented Oct 10, 2023

derekbruening commented Oct 10, 2023

prasun3 commented Oct 11, 2023

prasun3 commented Oct 11, 2023

derekbruening commented Oct 12, 2023

derekbruening commented Oct 12, 2023

derekbruening commented Oct 12, 2023

prasun3 commented Oct 12, 2023

prasun3 commented Oct 12, 2023

prasun3 commented Oct 12, 2023

derekbruening commented Oct 12, 2023

prasun3 commented Oct 13, 2023

prasun3 commented Oct 20, 2023

derekbruening commented Oct 20, 2023

prasun3 commented Oct 25, 2023

derekbruening commented Oct 25, 2023

prasun3 commented Oct 26, 2023

i#6354: Do not output zero-instruction (unfiltered) threads #6356

i#6354: Do not output zero-instruction (unfiltered) threads #6356

Conversation

derekbruening commented Oct 10, 2023 • edited Loading

derekbruening commented Oct 10, 2023

derekbruening commented Oct 10, 2023

prasun3 commented Oct 11, 2023

prasun3 commented Oct 11, 2023

derekbruening commented Oct 12, 2023

derekbruening commented Oct 12, 2023

derekbruening commented Oct 12, 2023

prasun3 commented Oct 12, 2023

prasun3 commented Oct 12, 2023

prasun3 commented Oct 12, 2023

derekbruening commented Oct 12, 2023

prasun3 commented Oct 13, 2023

prasun3 commented Oct 20, 2023

derekbruening commented Oct 20, 2023

prasun3 commented Oct 25, 2023

derekbruening commented Oct 25, 2023

prasun3 commented Oct 26, 2023

derekbruening commented Oct 10, 2023 •

edited

Loading