Recipes for balancing agent response time and completeness/accuracy #87

Open
hinthornw opened this issue Sep 17, 2023 · 0 comments
@hinthornw
Collection of recipes.

Probably a few tactics people would want to employ:

  • Streaming to optimize time to first token
  • Callbacks for tools to respond with progress
  • Callbacks in runnable chains (or whatever other cognitive architecture we propose)
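
A minimal sketch of the first two tactics, using only the standard library. `fake_llm_stream`, `slow_lookup`, `answer`, and the `emit` callback are all hypothetical stand-ins invented for illustration, not the API of any real framework. The idea is that the tool reports progress through a callback while it works, and model tokens are forwarded as soon as they arrive so the time to first token can be measured.

```python
import asyncio
import time
from typing import AsyncIterator, Callable


# Hypothetical stand-in for a streaming model call (assumption, not a real API).
async def fake_llm_stream(prompt: str) -> AsyncIterator[str]:
    for token in ["Streaming ", "keeps ", "users ", "engaged."]:
        await asyncio.sleep(0.01)  # simulated per-token latency
        yield token


# Hypothetical tool that reports progress through a caller-supplied callback.
async def slow_lookup(query: str, on_progress: Callable[[str], None]) -> str:
    for step in ("searching", "ranking", "summarizing"):
        on_progress(f"{step}: {query}")
        await asyncio.sleep(0.01)  # simulated work
    return f"results for {query}"


async def answer(prompt: str, emit: Callable[[str], None]) -> dict:
    """Run the tool, then stream the answer, sending every update to `emit`."""
    start = time.perf_counter()
    context = await slow_lookup(
        prompt, on_progress=lambda msg: emit(f"[tool] {msg}\n")
    )
    first_token_at = None
    parts = []
    async for token in fake_llm_stream(f"{prompt}\n{context}"):
        if first_token_at is None:
            # Seconds from request start until the first model token.
            first_token_at = time.perf_counter() - start
        emit(token)  # forward immediately instead of waiting for the full text
        parts.append(token)
    return {"text": "".join(parts), "time_to_first_token": first_token_at}
```

In a chat app, `emit` could write to a websocket or server-sent event stream; calling `asyncio.run(answer("latency", print))` shows the interleaving on a terminal.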

There may be a couple of common "runtimes" too:

  • Chat apps where you can stream and control the UX
  • WhatsApp/Discord/SMS/Twilio etc. bots that are bound by webhook timeouts and can only send in batch
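
For the webhook-bound runtime, one possible pattern is to race the agent against the platform deadline: reply with the full answer if it finishes in time, otherwise acknowledge right away and deliver the answer as a follow-up message. The sketch below assumes this pattern; `handle_webhook`, `run_agent`, `send_followup`, and `INTERIM_REPLY` are hypothetical names, and the real deadline depends on the platform.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Tuple

INTERIM_REPLY = "Working on it, full answer to follow."


async def handle_webhook(
    run_agent: Callable[[], Awaitable[str]],
    send_followup: Callable[[str], Awaitable[None]],
    deadline_s: float,
) -> Tuple[str, Optional["asyncio.Future[None]"]]:
    """Return (reply, background_task).

    background_task is None when the agent finished before the deadline.
    """
    task = asyncio.ensure_future(run_agent())
    # Wait up to the deadline without cancelling the agent on timeout.
    done, _pending = await asyncio.wait({task}, timeout=deadline_s)
    if task in done:
        return task.result(), None

    async def deliver_later() -> None:
        # The agent keeps running; send its answer as one batch message.
        await send_followup(await task)

    # The caller should keep a reference so the task is not garbage collected.
    background = asyncio.ensure_future(deliver_later())
    return INTERIM_REPLY, background
```

The trade-off is that the user gets a fast acknowledgement at the cost of a second message, which suits channels where streaming is not available.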

Use cases:

  • Retrieval Q&A
    • Perplexity clone
    • Perplexity + more lookup to finalize answer
    • More + various sources
  • Code tutor/copilots
    • Like the "help me build this app" type of thing
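
For the retrieval use cases, the "more lookup to finalize answer" variants could be framed as spending a time budget on extra sources. This is only a sketch of that framing: `gather_sources`, `lookups`, and `budget_s` are hypothetical names, and each lookup is assumed to be a plain callable that returns a list of source strings.

```python
import time
from typing import Callable, Iterable, List


def gather_sources(
    lookups: Iterable[Callable[[], List[str]]],
    budget_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Run lookups in priority order until the time budget is spent.

    The first lookup always runs, so there is something to answer from.
    """
    deadline = clock() + budget_s
    sources: List[str] = []
    for lookup in lookups:
        if sources and clock() >= deadline:
            break  # budget spent: answer with what has been collected
        sources.extend(lookup())
    return sources
```

A small budget behaves like the quick single-pass answer, while a larger one allows more lookups across various sources before finalizing.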
@hinthornw hinthornw self-assigned this Sep 17, 2023