OLS-39: Add streaming_query endpoint #2014

onmete · 2024-12-05T10:52:31Z

Description

Add streaming_query endpoint

Type of change

New feature

TODO

decide if this is how we want to implement it
additional minor code org
docs/schema
fix tests

Preview

Non-streaming

curl -X POST http://localhost:8080/v1/query -H "Content-Type: application/json" -d '{"query": "write a short poem about running pods in openshift?"}'

{"conversation_id":"6adc636b-22e3-45cc-8f01-f73fdc2490bc","response":"In OpenShift's realm, where pods take flight,  \nContainers unite, in harmony bright.  \nWith labels and volumes, they dance and they play,  \nOn nodes they reside, come night or come day.  \n\nImmutable spirits, they rise and they fall,  \nManaged by controllers, they heed the call.  \nFrom dev to production, they scale with great ease,  \nIn this orchestration, they flow like the breeze.  \n\nSo here's to the pods, in their vibrant array,  \nIn OpenShift's embrace, they thrive and they sway.  \nWith each deployment, a story unfolds,  \nIn the world of containers, their magic beholds.","referenced_documents":[{"docs_url":"https://docs.openshift.com/container-platform/4.15/nodes/pods/nodes-pods-using.html","title":"Using pods"},{"docs_url":"https://docs.openshift.com/container-platform/4.15/rest_api/workloads_apis/pod-v1.html","title":"Pod [v1]"},{"docs_url":"https://docs.openshift.com/container-platform/4.15/nodes/index.html","title":"Overview of nodes"}],"truncated":false}

Streaming

curl -X POST http://localhost:8080/v1/streaming_query -H "Content-Type: application/json" -d '{"query": "write a short poem about running pods in openshift?"}'

In OpenShift's realm, where pods take flight,
Containers unite, in the day and the night.
With labels and volumes, they dance in a line,
Scaling up swiftly, like stars that align.

Immutable they stand, in their defined space,
Resilient and strong, they embrace every race.
From nodes they emerge, with IPs to claim,
In the orchestration of Kubernetes, they play the game.

So here's to the pods, in their vibrant array,
In OpenShift's embrace, they thrive every day.
With each deployment, a story unfolds,
In the world of containers, their magic beholds.

---

Using pods: https://docs.openshift.com/container-platform/4.15/nodes/pods/nodes-pods-using.html
Pod [v1]: https://docs.openshift.com/container-platform/4.15/rest_api/workloads_apis/pod-v1.html
Overview of nodes: https://docs.openshift.com/container-platform/4.15/nodes/index.html
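
For readers who prefer a script over curl, here is a minimal Python sketch of a client for the streaming endpoint in the preview above. The URL path and request body come from the curl command; the helper names build_streaming_request and stream_query, the chunk size, and the assumption that the endpoint streams plain UTF-8 text by default are illustrative and not part of this PR.

```python
import codecs
import json
import urllib.request
from typing import Iterator


def build_streaming_request(base_url: str, query: str) -> urllib.request.Request:
    """Build the same POST request as the curl example (hypothetical helper)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/streaming_query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stream_query(base_url: str, query: str, chunk_size: int = 64) -> Iterator[str]:
    """Yield decoded text as it arrives instead of waiting for the full body."""
    # An incremental decoder copes with multi-byte characters split across chunks.
    decoder = codecs.getincrementaldecoder("utf-8")()
    with urllib.request.urlopen(build_streaming_request(base_url, query)) as resp:
        while True:
            chunk = resp.read1(chunk_size)  # returns whatever is available, b"" at EOF
            if not chunk:
                break
            text = decoder.decode(chunk)
            if text:
                yield text


if __name__ == "__main__":
    # Assumes a service is listening locally, as in the curl preview.
    for piece in stream_query("http://localhost:8080", "write a short poem about running pods in openshift?"):
        print(piece, end="", flush=True)
```
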

openshift-ci-robot · 2024-12-05T10:52:36Z

@onmete: This pull request references OLS-39 which is a valid jira issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

onmete · 2024-12-05T10:52:49Z

/hold

ols/app/endpoints/streaming_ols.py

joshuawilson · 2024-12-10T20:37:07Z

ols/app/endpoints/ols.py

@@ -184,6 +145,64 @@ def conversation_request(
    )


+def process_request(auth: Any, llm_request: LLMRequest):
+    """Process incoming request."""
+    timestamps = {"start": time.time()}


If I'm reading this correctly, the change widens the value type from float to Any. Is there a reason to do this?

It just removes the type hint for this internal variable. The linter is happy (or cannot infer the missing type), so I find the annotation unnecessary for this debug-style variable. It can of course be timestamps: dict[str, float] = {"start": time.time()}.
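
A minimal sketch of the annotated form mentioned in the reply, assuming nothing beyond the two keys visible in the diff. Type checkers expect a comma between the key and value types (dict[str, float]), not a colon. The elapsed variable is only there for illustration.

```python
import time

# Explicit annotation: keys are step names, values are Unix timestamps.
timestamps: dict[str, float] = {"start": time.time()}

# Later steps add their own entries, as in the diff under review.
timestamps["validate question"] = time.time()

# Illustrative use of the recorded values: duration of the first step.
elapsed = timestamps["validate question"] - timestamps["start"]
```
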

onmete · 2024-12-11T07:35:08Z

/unhold

tisnik

I'd change all those if media_type == TEXT ... else ... JSON checks into a switch over all media types. Also, by using StrEnum instead of string constants, the code will be extensible and fewer direct checks will be needed in the model code. Please look into constants.py for examples.

README.md

tisnik · 2024-12-11T14:52:17Z

ols/app/endpoints/ols.py

+
+    timestamps["validate question"] = time.time()
+
+    return (


The return type is precise and known in advance; IMHO it should be declared in the function header.
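
A sketch of what declaring the return type in the header could look like. The real process_request takes auth and llm_request and returns a tuple whose members are not shown in this excerpt, so the simplified signature and the two tuple members below are assumptions for illustration.

```python
import time


def process_request(query: str) -> tuple[str, dict[str, float]]:
    """Validate the query and return it together with timing data.

    Hypothetical, simplified stand-in for the function in the diff.
    """
    timestamps: dict[str, float] = {"start": time.time()}
    cleaned = query.strip()
    if not cleaned:
        raise ValueError("query must not be empty")
    timestamps["validate question"] = time.time()
    # The annotation in the header tells callers and type checkers
    # exactly what is unpacked here.
    return cleaned, timestamps
```
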

tisnik · 2024-12-11T14:53:18Z

ols/app/endpoints/ols.py

+            response = docs_summarizer.create_response(
+                llm_request.query, config.rag_index, history
+            )
+            logger.debug(f"{conversation_id} Generated response: {response}")


Please don't use f-strings in anything related to the logger.
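
A sketch of the logger call from the diff rewritten with %-style arguments, which the standard logging module formats only if the record is actually emitted. The logger name and the wrapper function are hypothetical.

```python
import logging

logger = logging.getLogger("ols_example")  # hypothetical logger name


def log_generated_response(conversation_id: str, response: str) -> None:
    """Log the response without building the message eagerly."""
    # The arguments are interpolated by logging only when DEBUG is enabled.
    logger.debug("%s Generated response: %s", conversation_id, response)
```
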

tisnik · 2024-12-11T14:59:14Z

ols/app/endpoints/streaming_ols.py

+        ref_docs_string = "\n".join(
+            f"{title}: {url}"
+            for title, url in {
+                rag_chunk.doc_title: rag_chunk.doc_url for rag_chunk in rag_chunks


Doesn't it work differently from the JSON output? I mean there is just one yield, not one yield per doc.

There is one yield for the text type and multiple yields for JSON (one per doc).
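
The difference described in the reply can be sketched as follows. This is not the code from streaming_ols.py: the function name, the media type strings, and the JSON event shape are assumptions. The text layout follows the streaming preview in the PR description.

```python
import json
from typing import Iterator


def stream_referenced_docs(docs: dict[str, str], media_type: str) -> Iterator[str]:
    """Yield referenced documents at the end of a streamed answer."""
    if media_type == "text/plain":
        # Text: every "title: url" line is joined into a single chunk.
        lines = "\n".join(f"{title}: {url}" for title, url in docs.items())
        yield f"\n\n---\n\n{lines}"
    else:
        # JSON: one event per document (hypothetical event shape).
        for title, url in docs.items():
            yield json.dumps({"event": "doc", "data": {"doc_title": title, "doc_url": url}})
```
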

codecov-commenter · 2024-12-12T07:35:20Z

Codecov Report

Attention: Patch coverage is 96.15385% with 6 lines in your changes missing coverage. Please review.

Project coverage is 96.90%. Comparing base (8757bf9) to head (b6413b1).
Report is 51 commits behind head on main.

Files with missing lines	Patch %	Lines
ols/app/endpoints/streaming_ols.py	93.97%	5 Missing ⚠️
ols/src/query_helpers/docs_summarizer.py	96.96%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2014      +/-   ##
==========================================
+ Coverage   96.89%   96.90%   +0.01%     
==========================================
  Files          72       73       +1     
  Lines        2932     3040     +108     
==========================================
+ Hits         2841     2946     +105     
- Misses         91       94       +3

Files with missing lines	Coverage Δ
ols/app/endpoints/ols.py	`99.54% <100.00%> (+0.01%)`	⬆️
ols/app/models/models.py	`100.00% <100.00%> (ø)`
ols/app/routers.py	`100.00% <100.00%> (ø)`
ols/constants.py	`100.00% <100.00%> (ø)`
ols/src/query_helpers/docs_summarizer.py	`98.50% <96.96%> (-1.50%)`	⬇️
ols/app/endpoints/streaming_ols.py	`93.97% <93.97%> (ø)`

... and 1 file with indirect coverage changes

onmete · 2025-01-08T12:58:04Z

/test 4.17-e2e-ols-cluster

TamiTakamiya · 2025-01-08T20:56:22Z

Can we expect this will be ported to road-core/service eventually?

onmete · 2025-01-09T07:32:50Z

@TamiTakamiya hopefully yes.

tisnik · 2025-01-10T09:47:39Z

ols/app/endpoints/ols.py

@@ -132,7 +98,7 @@ def conversation_request(
    else:
        summarizer_response = generate_response(
            conversation_id, llm_request, previous_input
-        )
+        )  # type: ignore[assignment]


Do we need those "ignore" comments? Is it because of Mypy?

tisnik · 2025-01-10T09:48:45Z

tests/unit/query_helpers/test_docs_summarizer.py

+    generated_content = ""
+
+    async for item in summary_gen:
+        if isinstance(item, str):


Just curious: does the async generator yield anything other than a string?

Yes, SummarizerResponse.
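
A sketch of why the test filters with isinstance: the generator can yield both text tokens and a SummarizerResponse object. The dataclass below is a hypothetical stand-in for the project's class, and the position and contents of the non-string item are assumptions.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, Optional, Union


@dataclass
class SummarizerResponse:
    """Hypothetical stand-in for the project's SummarizerResponse."""

    response: str
    rag_chunks: list = field(default_factory=list)


async def fake_summary_gen() -> AsyncIterator[Union[str, SummarizerResponse]]:
    """Yield text tokens, then one structured object (assumed ordering)."""
    for token in ("Hello", ", ", "world"):
        yield token
    yield SummarizerResponse(response="Hello, world")


async def collect(gen: AsyncIterator[Union[str, SummarizerResponse]]) -> tuple[str, Optional[SummarizerResponse]]:
    """Concatenate string items and keep the last non-string item."""
    generated_content = ""
    final: Optional[SummarizerResponse] = None
    async for item in gen:
        if isinstance(item, str):
            generated_content += item
        else:
            final = item
    return generated_content, final
```
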

tisnik · 2025-01-10T11:36:55Z

Can we expect this will be ported to road-core/service eventually?

I'll do it ASAP

tisnik · 2025-01-10T11:37:01Z

/approve

openshift-ci · 2025-01-10T11:37:54Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tisnik

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [tisnik]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

README.md

openshift-ci · 2025-01-10T13:59:21Z

@onmete: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Dec 5, 2024

onmete marked this pull request as draft December 5, 2024 10:52

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 5, 2024

openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 5, 2024

openshift-ci bot requested review from bparees and tisnik December 5, 2024 10:53

tisnik reviewed Dec 9, 2024

ols/app/endpoints/streaming_ols.py (outdated, resolved)

xrajesh reviewed Dec 10, 2024

ols/app/endpoints/streaming_ols.py (resolved)

xrajesh reviewed Dec 10, 2024

ols/app/endpoints/streaming_ols.py (outdated, resolved)

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 10, 2024

joshuawilson reviewed Dec 10, 2024

openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 11, 2024

onmete force-pushed the streaming-response branch from f2115b6 to 8396db7 Compare December 11, 2024 07:56

openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 11, 2024

onmete marked this pull request as ready for review December 11, 2024 08:06

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 11, 2024

openshift-ci bot requested a review from xrajesh December 11, 2024 08:06

onmete force-pushed the streaming-response branch from 8396db7 to 5f242a7 Compare December 11, 2024 09:19

tisnik suggested changes Dec 11, 2024

onmete force-pushed the streaming-response branch from 756e359 to a8085de Compare December 12, 2024 10:40

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 20, 2024

onmete force-pushed the streaming-response branch from ec28a50 to a964e45 Compare January 5, 2025 08:37

openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 5, 2025

onmete force-pushed the streaming-response branch from 0d6f479 to c576edd Compare January 6, 2025 07:33

onmete force-pushed the streaming-response branch 3 times, most recently from e06ba43 to f45c711 Compare January 8, 2025 09:22

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 9, 2025

Implement streaming response endpoint

3e27816

onmete force-pushed the streaming-response branch from dce5295 to f62aeb6 Compare January 9, 2025 12:34

openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 9, 2025

onmete force-pushed the streaming-response branch from 400da31 to adba5b3 Compare January 9, 2025 15:05

Add e2e tests for streming endpoint

9a7d90d

onmete force-pushed the streaming-response branch from adba5b3 to 9a7d90d Compare January 9, 2025 16:35

tisnik reviewed Jan 10, 2025

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 10, 2025

JoaoFula reviewed Jan 10, 2025

README.md (outdated, resolved)

Fix readme part about streaming_query

b6413b1
