-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OLS-39: Add streaming_query endpoint #2014
base: main
Are you sure you want to change the base?
Conversation
@onmete: This pull request references OLS-39 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/hold |
@onmete: This pull request references OLS-39 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@onmete: This pull request references OLS-39 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@@ -184,6 +145,64 @@ def conversation_request( | |||
) | |||
|
|||
|
|||
def process_request(auth: Any, llm_request: LLMRequest): | |||
"""Process incoming request.""" | |||
timestamps = {"start": time.time()} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change will open the type from float
to any
if I'm reading it correctly. Is there a reason to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just removes the type hinting for this internal variable. Linter is happy (or can't understand the missing type) so I find it unnecessary for this "debug" type of variable. It can be timestamps: dict[str: float] = {"start": time.time()}
ofc.
/unhold |
f2115b6
to
8396db7
Compare
8396db7
to
5f242a7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd change all those if media_type==TEXT ... else ...JSON
into switch over all media types. Also by using StrEnum
instead of string constant, the code will be expandable and less direct checks will be needed in model code. Look into constants.py
for examples please.
|
||
timestamps["validate question"] = time.time() | ||
|
||
return ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the return type is precise and known in advance, IMHO should be used in function header
ols/app/endpoints/ols.py
Outdated
response = docs_summarizer.create_response( | ||
llm_request.query, config.rag_index, history | ||
) | ||
logger.debug(f"{conversation_id} Generated response: {response}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please don't use f-string in anything related to logger
ols/app/endpoints/streaming_ols.py
Outdated
ref_docs_string = "\n".join( | ||
f"{title}: {url}" | ||
for title, url in { | ||
rag_chunk.doc_title: rag_chunk.doc_url for rag_chunk in rag_chunks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does not it work differently than JSON output? I mean there is just one yield, not yield per doc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is one yield for text type and multiple yields for json (per doc)
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2014 +/- ##
==========================================
+ Coverage 96.89% 96.90% +0.01%
==========================================
Files 72 73 +1
Lines 2932 3040 +108
==========================================
+ Hits 2841 2946 +105
- Misses 91 94 +3
|
756e359
to
a8085de
Compare
ec28a50
to
a964e45
Compare
0d6f479
to
c576edd
Compare
e06ba43
to
f45c711
Compare
/test 4.17-e2e-ols-cluster |
Can we expect this will be ported to road-core/service eventually? |
@TamiTakamiya hopefully yes. |
dce5295
to
f62aeb6
Compare
400da31
to
adba5b3
Compare
adba5b3
to
9a7d90d
Compare
@@ -132,7 +98,7 @@ def conversation_request( | |||
else: | |||
summarizer_response = generate_response( | |||
conversation_id, llm_request, previous_input | |||
) | |||
) # type: ignore[assignment] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need those "ignores"? Is it because of Mypy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
generated_content = "" | ||
|
||
async for item in summary_gen: | ||
if isinstance(item, str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just curious - does the async gen return something else than string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SummarizerResponse
I'll do it ASAP |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: tisnik The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@onmete: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Description
Add streaming_query endpoint
Type of change
TODO
Preview
Non-streaming
Streaming