Chat needs an ability to attach images/files #1617

Open
kodjima33 opened this issue Dec 31, 2024 · 13 comments
@kodjima33
Collaborator

kodjima33 commented Dec 31, 2024

Implementation

(Thinh's note)

Backend (2):

  1. New API route POST /files to upload files. Input: file. Logic: create a new doc at users>uid>files {id, name, thumbnail, mime_type, openai_file_id} and upload the file directly to OpenAI (/files). Output: {id, name, thumbnail}.
  2. POST /messages with a new body param file_ids: [str]. Logic: chat with files using OpenAI threads (https://platform.openai.com/docs/api-reference/threads); use the best OpenAI model, o1 (or gpt-4o if o1 does not work with files yet).
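
A minimal sketch of the two backend steps above, under stated assumptions: the in-memory `FILES` dict stands in for the users>uid>files documents, `upload_to_openai` is a hypothetical callable standing in for the real upload call, and the function names are illustrative rather than the project's actual code.

```python
import uuid
from dataclasses import dataclass, asdict
from typing import Callable, Dict, List


@dataclass
class FileRecord:
    # Fields mirror the document shape proposed in the issue.
    id: str
    name: str
    thumbnail: str
    mime_type: str
    openai_file_id: str


# Hypothetical in-memory stand-in for users>uid>files.
FILES: Dict[str, Dict[str, FileRecord]] = {}


def handle_post_files(uid: str, name: str, mime_type: str, data: bytes,
                      upload_to_openai: Callable[[str, bytes], str]) -> dict:
    """Sketch of POST /files: upload, store the record, return public fields."""
    openai_file_id = upload_to_openai(name, data)
    record = FileRecord(
        id=uuid.uuid4().hex,
        name=name,
        thumbnail="",  # thumbnail generation is out of scope for this sketch
        mime_type=mime_type,
        openai_file_id=openai_file_id,
    )
    FILES.setdefault(uid, {})[record.id] = record
    # Only id, name, and thumbnail go back to the client, as in the proposal.
    public = asdict(record)
    return {key: public[key] for key in ("id", "name", "thumbnail")}


def resolve_file_ids(uid: str, file_ids: List[str]) -> List[str]:
    """Sketch of the POST /messages lookup: our file ids -> OpenAI file ids."""
    user_files = FILES.get(uid, {})
    missing = [fid for fid in file_ids if fid not in user_files]
    if missing:
        raise KeyError(f"unknown file ids: {missing}")
    return [user_files[fid].openai_file_id for fid in file_ids]
```

Keeping `openai_file_id` server-side means the client only ever handles the backend's own ids, which is why `resolve_file_ids` does the mapping when a message arrives.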

App (4):

  1. Chat > Message box: add options to attach photos, take a photo, and attach files.
  2. Chat > Upload photos and files to /files before submitting messages
  3. Chat > Submit message to /messages with the new body field file_ids
  4. Chat > Message list: render the message with its attachments (photo, file)
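
The ordering in steps 2 and 3 can be sketched as follows. This is written in Python only to illustrate the order of calls, not as app code; `upload_file` and `post_message` are hypothetical wrappers around POST /files and POST /messages, and the `text` field name is an assumption.

```python
import json
from typing import Callable, List


def send_message_with_attachments(
    text: str,
    file_paths: List[str],
    upload_file: Callable[[str], dict],
    post_message: Callable[[str], dict],
) -> dict:
    """Upload every attachment first, then submit the message with file_ids."""
    file_ids = []
    for path in file_paths:
        uploaded = upload_file(path)  # hypothetical wrapper around POST /files
        file_ids.append(uploaded["id"])
    # "text" is an assumed field name; file_ids is the new field from the plan.
    body = json.dumps({"text": text, "file_ids": file_ids})
    return post_message(body)  # hypothetical wrapper around POST /messages
```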

Please note:

Current chat feature: ensure the new chat works seamlessly with the current chat feature.

Keep the UI simple: we can use the OpenAI app as the reference product.

Thread and end-thread option: the best implementation would detect whether the user is asking a question that needs context from a file (and which file). If that's too complicated for now, let's go with either:

  1. having an option to end the thread
  2. or just using Clear chat to force-end the thread
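
A rough sketch of the two fallback options, assuming one active file thread per chat. `ChatSession` and `create_thread` are hypothetical names used only for illustration.

```python
from typing import Callable, Optional


class ChatSession:
    """Keeps one file thread alive until the user ends it or clears the chat."""

    def __init__(self, create_thread: Callable[[], str]):
        self._create_thread = create_thread  # hypothetical thread factory
        self.thread_id: Optional[str] = None

    def thread_for_message(self, has_files: bool) -> Optional[str]:
        # Start a thread when files are attached; reuse it for follow-ups.
        if has_files and self.thread_id is None:
            self.thread_id = self._create_thread()
        return self.thread_id

    def end_thread(self) -> None:
        # Option 1: an explicit "end thread" action.
        self.thread_id = None

    def clear_chat(self) -> None:
        # Option 2: Clear chat also force-ends the thread.
        self.end_thread()
```

While `thread_id` is set, follow-up messages without attachments still go to the file thread; once it is cleared, messages fall back to the regular chat path.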

Maybe useful

@kodjima33 kodjima33 converted this from a draft issue Dec 31, 2024
@beastoin
Collaborator

beastoin commented Jan 2, 2025

#1573

@beastoin
Collaborator

beastoin commented Jan 2, 2025

import os
from dotenv import load_dotenv
import openai

class FileChat:
    def __init__(self):
        load_dotenv()
        openai.api_key = os.getenv("OPENAI_API_KEY")
        self.thread = None
        self.file_id = None
        self.assistant = None

    def load_document(self, file_path):
        """Upload a document to OpenAI and create a thread"""
        # Upload the file to OpenAI
        with open(file_path, 'rb') as file:
            response = openai.files.create(
                file=file,
                purpose='assistants'
            )
            self.file_id = response.id

        # Create an assistant with file search capability
        self.assistant = openai.beta.assistants.create(
            name="File Reader",
            instructions="You are a helpful assistant that answers questions about the provided file. Use the file_search tool to search the file contents when needed.",
            model="gpt-4o",
            tools=[{"type": "file_search"}]
        )
        
        # Create a thread and attach the file
        self.thread = openai.beta.threads.create()
        openai.beta.threads.messages.create(
            thread_id=self.thread.id,
            role="user",
            content="Please help me answer questions about the attached file.",
            attachments=[{
                "file_id": self.file_id,
                "tools": [{"type": "file_search"}]
            }]
        )

    def ask(self, question):
        """Ask a question about the loaded document"""
        if not self.thread or not self.file_id:
            return "Please load a document first using load_document(file_path)"

        # Add the question to the thread
        openai.beta.threads.messages.create(
            thread_id=self.thread.id,
            role="user",
            content=question
        )

        # Create a run and poll until it reaches a terminal state.
        # create_and_poll waits between status checks instead of busy-looping,
        # and it returns on failed/cancelled/expired runs as well.
        run = openai.beta.threads.runs.create_and_poll(
            thread_id=self.thread.id,
            assistant_id=self.assistant.id
        )
        if run.status != 'completed':
            return f"Run ended with status: {run.status}"

        # Get the messages
        messages = openai.beta.threads.messages.list(
            thread_id=self.thread.id
        )

        # Return the latest assistant response
        return messages.data[0].content[0].text.value

    def cleanup(self):
        """Clean up resources"""
        if self.file_id:
            # Delete the file from OpenAI
            openai.files.delete(self.file_id)
            self.file_id = None
        if self.assistant:
            # Delete the assistant
            openai.beta.assistants.delete(self.assistant.id)
            self.assistant = None
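        if self.thread:
            # Delete the thread so it does not linger on OpenAI's side
            openai.beta.threads.delete(self.thread.id)
            self.thread = None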

def main():
    # Initialize the chat system
    chat = FileChat()

    print("Welcome to File Chat!")
    print("First, please provide the path to your text file.")

    try:
        while True:
            file_path = input("\nEnter file path (or 'quit' to exit): ")

            if file_path.lower() == 'quit':
                break

            try:
                chat.load_document(file_path)
                print(f"\nFile loaded successfully! You can now ask questions about {file_path}")

                while True:
                    question = input("\nAsk a question (or 'new' for new file, 'quit' to exit): ")

                    if question.lower() == 'quit':
                        chat.cleanup()
                        return
                    elif question.lower() == 'new':
                        chat.cleanup()
                        break

                    answer = chat.ask(question)
                    print("\nAnswer:", answer)

            except Exception as e:
                print(f"Error: {str(e)}")
    finally:
        # Ensure cleanup happens even if there's an error
        chat.cleanup()

if __name__ == "__main__":
    main()

@beastoin
Collaborator

beastoin commented Jan 2, 2025

@mdmohsin7 man, pls read this ticket's description and feel free to ask me anything. if everything is clear and you're excited about this feature, drop your UI/UX proposal then go ahead.

@nquang29 said he is also excited about this ticket, so you can ask him whether he can help on the backend side. Quang's Discord: @windtran_

@mdmohsin7
Collaborator

Got it! As mentioned we can simply follow the UX of ChatGPT or even iMessage

Screenshot_20250102-182331~2

Screenshot_20250102-182350~2

Chat > Upload photos and files to /files before submitting messages
Chat > Submit message to /messages with the new body field file_ids

What if we upload the file right after the user selects it? Similar to how the ChatGPT app does it

So if I understand correctly, @nquang29 will be working on the backend and I'll have to make the app side changes?

@beastoin

@beastoin
Collaborator

beastoin commented Jan 2, 2025

yes i mean uploading right after selecting the image / file.

use our figma and draft the design pls sir

you can do both, or just ask Quang to see if he could help so that we can speed up the progress.

@mdmohsin7

@mdmohsin7
Collaborator

The designs in our figma are very old and are not the ones that are being followed currently. I'll quickly code the design without the functionality and will share the image with you

Alright I'll message Quang on discord

@beastoin

@mdmohsin7
Collaborator

Progress:

IMG_AF3DDE7D6CF4-1

IMG_0984249CA139-1

@mdmohsin7
Collaborator

Are we going to allow multiple file uploads?

@beastoin
Collaborator

beastoin commented Jan 3, 2025

multiple file uploads - yes
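
For multiple files, one possible approach is to build one attachment entry per file, reusing the attachment shape from the FileChat snippet earlier in this thread. `build_attachments` is a hypothetical helper, and any limit on the number of files is left out of this sketch.

```python
from typing import Dict, List


def build_attachments(openai_file_ids: List[str]) -> List[Dict]:
    """One attachment entry per file, each enabled for file_search."""
    return [
        {"file_id": fid, "tools": [{"type": "file_search"}]}
        for fid in openai_file_ids
    ]
```

The returned list would be passed as the `attachments` argument when creating the thread message.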

at the time you use figma, your mind focuses completely on design (ui/ux) - not code. that's the reason why if you want to create great ux, you need to draft your ideas somewhere - away from your code editor.

@mdmohsin7

@mdmohsin7
Collaborator

What is the max limit on the number of files? And also any max limit on the file size?

Since we don't have the current UI designs in Figma, it would have taken more time to design the new UI so I just went with code itself for now. Pls check the video in #1629, that should give you an idea of how the UI will look. The app side part is almost done (will have to modify it a bit to support multiple files), just need to connect to the backend

@beastoin

@beastoin
Collaborator

beastoin commented Jan 4, 2025

just follow what chatgpt did

@mdmohsin7 ^

@beastoin beastoin moved this from To do to In progress in omi TODO Jan 5, 2025
@beastoin beastoin removed the status in omi TODO Jan 7, 2025
@mdmohsin7
Collaborator

just follow what chatgpt did

ChatGPT only allows 3-4 files on free plan

I've asked Quang on Discord for help with the backend. He seems interested, and I'm waiting for his response

@kodjima33 kodjima33 moved this to To do in omi TODO Jan 14, 2025
@beastoin beastoin moved this from This week to In progress in omi TODO Jan 25, 2025
Projects
Status: In progress
3 participants