-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inconsistent results with deferred/onto #151
Comments
I'd say it's an inconsistency. This stems from https://github.com/ztellman/manifold/blob/193f5f48972c8e20dd0a9fc41a1311566a9f7bdd/src/manifold/deferred.clj#L378 - when state of deferred is realized, the listener execution happens immediately on the thread which attaches the listener. |
Here's a more practical example of why this is a problem: (let [d (-> 100
(d/success-deferred)
(d/onto (ForkJoinPool/commonPool)))
t0 (System/currentTimeMillis)
_ (d/chain d #(Thread/sleep %)) ;; want to run in parallel with ↓
_ (d/chain d #(Thread/sleep %)) ;; want to run in parallel with ↑
t1 (System/currentTimeMillis)]
(println "Chaining took" (- t1 t0) "ms"))
; ==> Chaining took 209 ms I would have expected the calls to It's also a problem if the code in the chained callbacks requires thread-local context that is provided by the executor, for example MDC. |
@dm3 since I've explicitly set the executor with |
As @dm3 said, if you chain on a deferred which is already realized, it executes immediately and does not register any sort of callback. Since you are simply using |
@ztellman Is this the desired behavior, though? In my second example, code that I would expect not to block at all actually takes 200ms. Imagine that this is happening in a UI thread, and I want to ensure that the callbacks, which may block, are run in a separate thread to avoid hanging the UI. The initial deferred came from a library function, and it may (or may not) be realized by the time it gets to me. |
It's a fair question. This code is effectively using deferreds as a way to convey blocking work onto another thread, which I would accomplish with |
Thanks for considering. The actual problem that is biting me is that the code in my callbacks requires thread-local context that is automatically provided by the executor I specified, and the thread-local context isn't always there. I think the unexpected blocking case is more compelling to most users of the library, though. |
Wrapping the chain call in future-with will fix your immediate issue, fwiw.
I’ll update once I’ve mulled this over a bit.
…On Fri, Feb 23, 2018 at 2:40 PM Philip Garrett ***@***.***> wrote:
Thanks for considering. The actual problem that is biting me is that the
code in my callbacks requires thread-local context that is automatically
provided by the executor I specified, and the thread-local context isn't
always there. I think the unexpected blocking case is more compelling to
most users of the library, though.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#151 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAB6P0tVAcJ23ke213AXYO4dIdsHxyllks5tXxRGgaJpZM4SOehO>
.
|
@ztellman any update on this? |
I've stumbled over this as well, trying to use an executor to set up a thread-local context. On my machine, this snippet constantly yields the REPL thread, so the executor passed to (-> (d/future 1)
(d/onto (executor/fixed-thread-executor 2))
(d/chain (fn [_] (.getName (Thread/currentThread))))) This means that the I guess this is not really an issue if all one cares about is fast execution, but this means that one cannot use custom executors for other purposes. @KingMob -- if you decide this is not a bug, maybe we should document this, stating that there is no guarantee? The current docstring |
Yes, if the |
(-> (d/future 1)
(d/onto (e/fixed-thread-executor 2))
(d/chain (fn [_] (prn (.getName (Thread/currentThread))))))
=> "nREPL-session-8ba7d6cf-90f9-428b-9505-bba48850ff05"
(or sometimes if unlucky) => "manifold-execute-3" (-> (d/future (Thread/sleep 100) 1)
(d/onto (e/fixed-thread-executor 2))
(d/chain (fn [_] (prn (.getName (Thread/currentThread))))))
=> "manifold-pool-37-1" |
Yep, tried the same with I'm not saying this is necessarily wrong behaviour, but certainly surprising and misleading. I lost two days trying to use custom executors to trace my async program. :) |
At a bare minimum, we should improve the docs, since someone asks this question at least twice a year. Would you be willing to contribute a PR improving them? For both the docstrings and the markdown files? Can you document the current behavior of I'm not in love with the current state of things, but it's risky to try and make it more consistent. Let's ignore the discussion in clj-commons/aleph#427 about trying to improve performance by detecting when you wish to schedule on the same executor, and consider only whether we can safely alter the
I think we can skip using bound-fn ourselves, but we need to measure the executor overhead before we can implement this. Anyone with a lot of Thoughts? Anyone have an Aleph server in a staging env they can volunteer as a guinea pig? |
Sure, happy to give this a try!
Unfortunately I cannot volunteer as a guinea pig right now. In general, I think it is a good idea to not use |
I don't think it's a good idea... It means that every deferred will be rescheduled on the Executor using Aleph [1], right? [1] : https://github.com/exoscale/aleph/blob/master/src/aleph/http/server.clj#L595-L598 |
@arnaudgeiser I'm not clear how your linked code sample relates. What I'm contemplating is this: We alter the code in In terms of performance, I don't think it will be as bad as we fear. For starters, it won't change the perf of any chained fn where the previous link in the chain is usually unrealized at call time, because those will already be scheduled on an executor anyway (or same thread if executor is nil). And for slower fns in a chain, the executor overhead is probably not that bad. This will mostly show up as an issue for tiny fns, but Aleph doesn't chain too many of those. We can also mitigate any performance impact by using FWIW, I took a stab at converting Regardless, I'm not convinced it'll be worth all the work in the end, as compared to just documenting the existing behavior. One issue is, |
Sorry for the delay. Today, if the deferred is realized, you continue on the current thread regardless of the presence of that executor on the thread local. Which means you can blindly [1] : https://github.com/clj-commons/manifold/blob/master/src/manifold/executor.clj#L69 |
On a side note, I was having a look at how @mpenet's auspex library implemented the manifold support for And it seems he just ignored the parameter and calls
(-> (d/future 1)
(d/chain' (fn [_] (prn (.getName (Thread/currentThread))))) ;; manifold-pool-1-1
(->> (d/future-with (manifold.executor/fixed-thread-executor 2)))
(d/chain' (fn [_] (prn (.getName (Thread/currentThread)))))) ;; manifold-pool-1-1 How the following (non-working code) should be expressed to make the second (-> (d/future 1)
(d/chain' (fn [_] (prn (.getName (Thread/currentThread))))) ;; current-thread
(d/onto (manifold.executor/fixed-thread-executor 2))
(d/chain' (fn [_] (prn (.getName (Thread/currentThread)))))) ;; manifold-pool-1-1 [1] : https://github.com/mpenet/auspex/blob/master/src/qbits/auspex/manifold.clj#L55-L56 |
Hi, just to weigh in on this, the behavior in Manifold is shaped by two considerations:
I don't think the current behavior properly balances these two considerations, for the reasons already demonstrated in this thread. Luckily, I think the behavior here can be pretty easily addressed by some minor changes to the |
Hello Zach (and welcome back)! What you propose kind of match what Matthew had in mind initially when we were talking about this on Aleph [1] :
Considering that If this can be easily addressed on the [1] : clj-commons/aleph#427 (comment) |
Zach, thanks for chiming in! You're always welcome here. I think the proposed change of checking the executor and continuing to run in the current thread if it's already in that executor's pool is good, but in the past, we've always held off out of caution, and uncertainty if the benefits are worth it. Is anything different? To recap what it will entail: The first major change would be how The next major change is that callbacks on already-realized deferreds would switch from running on the current thread (no matter where it is) to only running on the current thread if it's in the same pool (or nil). This has always been unexpected behavior, and someone complains about it at least twice a year. Generally, I would be most happy if people's first guess as to which pools things are running on is correct. I would love to say "If you specify a pool, it will always run on that pool. If you're already in that pool, it will continue on the current thread. If you need to submit a job to the pool you're already on, use @arnaudgeiser FYI, @alexander-yakushev suggested the following for determining the executor in clj-commons/aleph#427 (comment):
I didn't see the APIs required to do this with Netty pools, but it should work just fine for determining if a thread is in one of our pools. |
This is where I'm a bit confused. Regarding :
You cannot get the Pretty sure I'm missing the whole thing there. I understand what we are trying to achieve but not how. I was initially thinking about that slight change on (if (inside-executor? (.executor d))
~success-clause
~unrealized-clause) EDIT :
I tend to concur, this is where the decision is took to call [1] : https://github.com/clj-commons/manifold/blob/master/src/manifold/executor.clj#L71 |
Maybe I need to take a closer look, but I think the general point is, with
our own pools, we can alter Dirigiste/Manifold/Aleph if needed to achieve
this. To the extent that we initialize netty, maybe we can set this up
there, too.
But for any random Java executor in the wild, there’s no guarantees.
Hopefully we won’t encounter many of those, though I know some people have
replaced Dirigiste with their own pools.
…On Thu, Mar 2, 2023 at 4:22 AM Arnaud Geiser ***@***.***> wrote:
I suppose, if you compare Thread.currentThread().getThreadGroup() to the
executor's ThreadGroup from its ThreadFactory, you can avoid redundant
rescheduling in Manifold itself, and thus keep with-executor as is.
This is where I'm a bit confused.
For starters, the threads we are creating don't have any ThreadGroup
attached to them [1]. So I guess it will require modifications, even on our
own pools. This is the same for all Executors created directly from the JDK
[2]. Manifold users might not want to use the Dirigiste Executors but plain
JDK Executors.
Regarding :
to the executor's ThreadGroup from its ThreadFactory
You cannot get the ThreadFactory from an ExecutorService, but from a
ThreadPoolExecutor [3]. Dirigiste executor doesn't give you access to its
internal ThreadFactory [4].
Even so... having access to the ThreadFactory is not giving us access to
the ThreadGroup the threads are created with.
Pretty sure I'm missing the whole thing there. I understand what we are
trying to achieve but not how.
If you can shed some light, that would be great.
I was initially thinking about that slight chance on success-error-unrealized
:
(if (inside-executor? (.executor d))
~success-clause
~unrealized-clause)
[1] :
https://github.com/clj-commons/manifold/blob/master/src/manifold/executor.clj#L71
[2] :
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/concurrent/Executors.java#L687-L689
[3] :
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/concurrent/ThreadPoolExecutor.java#L1519
[4] :
https://github.com/clj-commons/dirigiste/blob/master/src/io/aleph/dirigiste/Executor.java#L121
—
Reply to this email directly, view it on GitHub
<#151 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAHHB5KUHMX5DWKAY6VNEIDWZ64ZFANCNFSM4ERZ5BHA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I stand corrected on Having taken a longer look at everything, here's what I believe needs to change:
Is there anything I'm overlooking? |
Zach, IIUC, you're suggesting the second thread local contain the executor the thread is actually running in, and that it will never change (Which is unlike how the Let's call it something like We'll wrap non-Manifold/dirigiste executors so their Then, we need to check the code to find out where it'll have to change. A search for calls to
I think the If we change the above locations, I don't think we also have to fool with the Is this a good summary? |
I think we are overlooking people using Aleph/Manifold/Dirigiste might also use others kind of asynchronous clients which use JDK Executors. The behavior of the I'm scared about the following question:
This kind of behavior is something you can experience today using Aleph. If you ever decide to change the Dirigiste Executor (the thread-factory to be precise) [1] with a JDK Executor, you will lose the And then... you realize that some defaults of Aleph is actually preventing you from shooting yourself in the foot. (def manifold-executor (ex/fixed-thread-executor 4 {:onto? true}))
(.submit manifold-executor
#(-> (d/future (Thread/sleep 100))
(d/chain' (fn [_] (prn (.getName (Thread/currentThread)))))))
;; => "manifold-pool-13-2"
(def executor (java.util.concurrent.Executors/newFixedThreadPool 4))
(.submit executor
#(-> (d/future (Thread/sleep 100))
(d/chain' (fn [_] (prn (.getName (Thread/currentThread)))))))
;; => "manifold-execute-9" Do we really want more of this? Anyway, just wanted to share my concerns. But implementation wise, what you propose will probably work the way we expect. I would propose to explore another solution to not have this split between the Manifold/JDK world. (defn chain-with
([x executor f]
...)
([x executor f g]
...))
([x executor f g & fs]
...)) And then, we "just" force the application of (-> (d/future 1)
(d/chain' (fn [_] (prn (.getName (Thread/currentThread))))) ;; current-thread
(d/chain-with' (manifold.executor/fixed-thread-executor 2) (fn [_] (prn (.getName (Thread/currentThread)))))) ;; manifold-pool-1-1 With that,
And
I will be happier with this proposition. [1] : https://github.com/clj-commons/aleph/blob/master/src/aleph/http/server.clj#L612 |
FYI, bit busy at the moment, I won't be able to get back to this until the following week. |
I'm trying to ensure that my chained callbacks are always run on a specific thread pool, so I'm using
d/onto
followed byd/chain
to set up the callbacks. What I'm seeing, though, is that sometimes the callbacks will be run directly on the derefing thread. I'm able to reproduce this behavior reliably (clojure 1.8, Java 1.8, manifold 0.1.6).When I run
chained-onto-threads
with anf
that produces a manifold future, sometimes the chained function runs on an nREPL thread, not a ForkJoinPool thread as expected:With an
f
that produces asuccess-deferred
, the chained function always runs on an nREPL thread, not on the executor that was specified:Can you shed any light on what's happening? Am I doing something wrong or is this a bug?
The text was updated successfully, but these errors were encountered: