-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DRIVERS-2868 Adjust getMore maxTimeMS Calculation for tailable awaitData Cursors #1749
base: master
Are you sure you want to change the base?
Conversation
source/client-side-operations-timeout/client-side-operations-timeout.md
Outdated
Show resolved
Hide resolved
source/client-side-operations-timeout/client-side-operations-timeout.md
Outdated
Show resolved
Hide resolved
source/client-side-operations-timeout/client-side-operations-timeout.md
Outdated
Show resolved
Hide resolved
"name": "iterateOnce", | ||
"object": "tailableCursor", | ||
"arguments": { | ||
"timeoutMS": 50 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In timeoutMS is refreshed for getMore if maxAwaitTimeMS is set
test case, timeoutMS
is set when the cursor is created. I suggest we follow the same approach here and the other test since not all drivers allow setting a timeout for each next()
call as an argument - it gets refreshed under the hood and i believe is supposed to behave the same in the drivers which additionally allow to override timeouMS
per next
() call.
For example, we could set timeoutMS to 150ms and maxAwaitTimeMS to 100ms like this:
{
"name": "createFindCursor",
"object": "collection",
"arguments": {
"filter": {},
"cursorType": "tailableAwait",
"batchSize": 1,
"timeoutMS": 150
"maxAwaitTimeMS": 100
},
"saveResultAsEntity": "tailableCursor"
},
Also, we configure a failpoint to block the find command for 60ms
and call iterateOnce()
. At this point, maxTimeMS
on the command should be ≤ 90ms, which is ≤ maxAwaitTimeMS
."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vbabanin It's not idiomatic for the Go Driver to use the timeoutMS applied to constructing a cursor to each next()
call. This would violate user expectations of context.Context. For example,
findCtx, findCancel := context.WithTimeout(context.Background(), 1*time.Second)
defer findCancel()
cur, _ := coll.Find(findCtx, bson.D{}) // User is applying a 1s timeout
// User does not apply a timeoutMS to the context passed to Next, internally
// applying a timeout here would violate expectations.
for cursor.Next(context.Background()) {
// ...
}
Tests such as timeoutMS is refreshed for getMore if maxAwaitTimeMS is set
are skipped in the Go Driver specifically because timeoutMS is not being applied at either the client or operation level. More than likely we will create a DRIVERS ticket to address this, see GODRIVER-3480 for more info. IIUC, we are likely to suggest putting the same timeout on both constructor and next. Would this work for Java?
operations:
- name: createFindCursor
object: *collection
arguments:
filter: {}
cursorType: tailableAwait
batchSize: 1
maxAwaitTimeMS: 50
timeoutMS: 100
saveResultAsEntity: &tailableCursor tailableCursor
# Iterate twice to force a getMore.
- name: iterateOnce
object: *tailableCursor
arguments:
timeoutMS: 100
- name: iterateOnce
object: *tailableCursor
arguments:
timeoutMS: 100
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
summary of discussion with @prestonvasquez:
Currently, the spec mandates that drivers reuse the same timeoutMS
specified during cursor creation:
Change stream: Drivers MUST also apply the original timeoutMS value to each next call on the change stream."
Cursors: After the operation has executed, the original timeoutMS value MUST also be applied to each next call on the created cursor."
However, since Python and Go allow timeoutMS overrides per call, we should update the spec to clarify that overriding timeoutMS
per next call is permitted, as it is idiomatic to their language. We should create a separate DRIVERS ticket to address that and update other unified tests for consistency to cover a use-case with override per getMore
.
Some drivers might not currently expect timeoutMS: 100
argument in iterateOnce
, that could lead to failed tests - this is the case for Java.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vbabanin For now, I think we should update the unified spec tests to pass per the specifications (i.e. not applying a timeoutMS iteratively). Can you verify that the updated tests pass for Java?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I ran the tests and one of them applying the remaining timeoutMS if it's less than maxAwaitTimeMS
fails with state should be: maxAwaitTimeMS must be less than timeoutMS
during cursor creation as expected by the spec. I’ve posted a suggestion for handling this in a separate comment: #1749 (comment).
…imeout.md Co-authored-by: Viacheslav Babanin <[email protected]>
"maxAwaitTimeMS": 100, | ||
"timeoutMS": 50 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By spec, maxAwaitTimeMS
must be less than timeoutMS when passed to createFindCursor.
"maxAwaitTimeMS": 100, | |
"timeoutMS": 50 | |
"maxAwaitTimeMS": 50, | |
"timeoutMS": 51 |
If we use this config, timeoutMS
will likely be less than maxAwaitTimeMS
by the time getMore executes— since there is just a 1ms difference.
Alternatively, we could configure a failpoint to delay the find command execution to lower exhaust timeout so that it becomes less then maxAwaitTimeMS
, making the test more robust. This could prevent potential edge cases where fast-executing drivers issue a getMore at 50ms.
Example with the failpoint:
{
"description": "apply remaining timeoutMS if less than maxAwaitTimeMS",
"operations": [
{
"name": "failPoint",
"object": "testRunner",
"arguments": {
"client": "failPointClient",
"failPoint": {
"configureFailPoint": "failCommand",
"mode": {
"times": 1
},
"data": {
"failCommands": [
"find"
],
"blockConnection": true,
"blockTimeMS": 60
}
}
}
},
{
"name": "createFindCursor",
"object": "collection",
"arguments": {
"filter": {},
"cursorType": "tailableAwait",
"batchSize": 1,
"maxAwaitTimeMS": 50,
"timeoutMS": 100
},
"saveResultAsEntity": "tailableCursor"
},
{
"name": "iterateOnce",
"object": "tailableCursor"
},
{
"name": "iterateOnce",
"object": "tailableCursor"
}
],
"expectEvents": [
{
"client": "client",
"events": [
{
"commandStartedEvent": {
"commandName": "find",
"databaseName": "test"
}
},
{
"commandStartedEvent": {
"commandName": "getMore",
"databaseName": "test",
"command": {
"maxTimeMS": {
"$$lte": 50
}
}
}
}
]
}
]
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vbabanin It's unclear to me how blocking find
will reduce the remaining timeout on a call to Next
.
# Each next call starts a fresh timeout_ms.
def next(cursor):
timeout_ms = cursor['timeout_ms']
# ...
while get_more(remaining_timeout_ms):
pass # ...
Here is a gist with the updated JSON: https://gist.github.com/prestonvasquez/498ccc00c59a72b367de88272f0a777b
Does this pass in the Java driver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vbabanin I've updated the unified spec tests with the changes discussed offline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@vbabanin can you confirm this test passes in .net on both linux and windows? |
DRIVERS-2868
Please complete the following before merging:
clusters, and serverless).