- OpenDevin{" "}
-
- is an autonomous software engineer that can solve software engineering
- and web-browsing tasks end-to-end. It can perform data science queries, such
- as "Find the number of pull requests to the OpenDevin repository in the last
- month," and software engineering tasks, such as "Please add tests to this
- file and verify that all the tests pass. If they don't, fix the file."
-
-
-
-
- At the same time, OpenDevin is a platform and community for agent developers
- to test out and evaluate new agents.
-
-
-
-
-
- Support
-
-
-
- {`Please file a bug on {githubLink} if you notice a problem that likely affects others. If you're having trouble installing, or have general questions, reach out on {discordLink} or {slackLink}.`}
-
-
-
-
-
- How to fix a GitHub issue with OpenDevin?
-
-
-
- To fix an issue on GitHub using OpenDevin, send a prompt to OpenDevin asking it to follow
- steps like the following:
-
-
-
Read the issue https://github.com/OpenDevin/OpenDevin/issues/1611
-
Clone the repository and check out a new branch
-
Based on the instructions in the issue description, modify files to fix the issue
-
Push the resulting output to GitHub using the GITHUB_TOKEN environment variable
-
Tell me the link that I need to go to to send a pull request
-
-
- Before you run OpenDevin, you can do:
-
-
- export SANDBOX_ENV_GITHUB_TOKEN=XXX
-
-
- where XXX is a GitHub token you created with permission to push to the OpenDevin repo. If you don't have write permission to the OpenDevin repo, you might need to change that step to:
-
-
- Push the resulting output to my fork at https://github.com/USERNAME/OpenDevin/ using the GITHUB_TOKEN environment variable
-
-
- where USERNAME is your GitHub username.
-
-
-
-
-
- How is OpenDevin different from Devin?
-
-
- Devin
-
- is a commercial product from Cognition Inc. that served as the initial
- inspiration for OpenDevin. Both aim to solve software engineering tasks well,
- but you can download, use, and modify OpenDevin, while Devin is available only
- through the Cognition site. In addition, OpenDevin has evolved beyond its initial
- inspiration and now serves as a community-driven ecosystem for agent
- development in general, and we'd love to have you join and
-
- contribute!
-
-
-
-
- How is OpenDevin different from ChatGPT?
-
-
-
- You can access ChatGPT online, but it does not interface with local files, and
- its ability to execute code is limited: it can write code, but testing and
- executing that code is not easy.
-
-
-
-
-
- );
-}
diff --git a/docs/src/pages/index.tsx b/docs/src/pages/index.tsx
index 7a2b9f0cc032..8f1605b0c6ee 100644
--- a/docs/src/pages/index.tsx
+++ b/docs/src/pages/index.tsx
@@ -4,12 +4,11 @@ import { HomepageHeader } from "../components/HomepageHeader/HomepageHeader";
import { Welcome } from "../components/Welcome/Welcome";
import { translate } from '@docusaurus/Translate';
-export function Header({ title, summary, description }): JSX.Element {
+export function Header({ title, summary }): JSX.Element {
return (
)}
{
expect(confirmationModeInput).toHaveAttribute("data-selected", "true");
});
- it("should display the existing values if it they are present", () => {
+ it("should display the existing values if they are present", () => {
renderSettingsForm({
LLM_MODEL: "model2",
AGENT: "agent2",
@@ -119,17 +119,18 @@ describe("SettingsForm", () => {
});
it("should call the onAgentChange handler when the agent changes", async () => {
+ const user = userEvent.setup();
renderSettingsForm();
+ // We need to enable the agent select
+ const agentSwitch = screen.getByTestId("enableagentselect");
+ await user.click(agentSwitch);
+
const agentInput = screen.getByRole("combobox", { name: "agent" });
- await act(async () => {
- await userEvent.click(agentInput);
- });
+ await user.click(agentInput);
const agent3 = screen.getByText("agent3");
- await act(async () => {
- await userEvent.click(agent3);
- });
+ await user.click(agent3);
expect(onAgentChangeMock).toHaveBeenCalledWith("agent3");
});
diff --git a/frontend/src/components/modals/settings/SettingsForm.tsx b/frontend/src/components/modals/settings/SettingsForm.tsx
index 927db98883c2..4a3ab2c87b9c 100644
--- a/frontend/src/components/modals/settings/SettingsForm.tsx
+++ b/frontend/src/components/modals/settings/SettingsForm.tsx
@@ -33,17 +33,10 @@ function SettingsForm({
}: SettingsFormProps) {
const { t } = useTranslation();
const { isOpen: isVisible, onOpenChange: onVisibleChange } = useDisclosure();
+ const [isAgentSelectEnabled, setIsAgentSelectEnabled] = React.useState(false);
return (
<>
- ({ value: agent, label: agent }))}
- defaultKey={settings.AGENT}
- onChange={onAgentChange}
- tooltip={t(I18nKey.SETTINGS$AGENT_TOOLTIP)}
- disabled={disabled}
- />
({ value: model, label: model }))}
@@ -88,6 +81,23 @@ function SettingsForm({
tooltip={t(I18nKey.SETTINGS$LANGUAGE_TOOLTIP)}
disabled={disabled}
/>
+ ({ value: agent, label: agent }))}
+ defaultKey={settings.AGENT}
+ onChange={onAgentChange}
+ tooltip={t(I18nKey.SETTINGS$AGENT_TOOLTIP)}
+ disabled={disabled || !isAgentSelectEnabled}
+ />
+
+ {t(I18nKey.SETTINGS$AGENT_SELECT_ENABLED)}
+ {
),
);
+ // We need to enable the agent select first
+ const agentSwitch = screen.getByTestId("enableagentselect");
+ await user.click(agentSwitch);
+
const resetButton = screen.getByRole("button", {
name: /MODAL_RESET_BUTTON_LABEL/i,
});
diff --git a/frontend/src/components/terminal/Terminal.test.tsx b/frontend/src/components/terminal/Terminal.test.tsx
index c8a583dca45c..254bb6290cca 100644
--- a/frontend/src/components/terminal/Terminal.test.tsx
+++ b/frontend/src/components/terminal/Terminal.test.tsx
@@ -44,7 +44,9 @@ describe("Terminal", () => {
expect(screen.getByText("Terminal")).toBeInTheDocument();
expect(mockTerminal.open).toHaveBeenCalledTimes(1);
- expect(mockTerminal.write).toHaveBeenCalledWith("$ ");
+ expect(mockTerminal.write).toHaveBeenCalledWith(
+ "opendevin@docker-desktop:/workspace $ ",
+ );
});
it("should load commands to the terminal", () => {
@@ -54,7 +56,7 @@ describe("Terminal", () => {
]);
expect(mockTerminal.writeln).toHaveBeenNthCalledWith(1, "INPUT");
- expect(mockTerminal.writeln).toHaveBeenNthCalledWith(2, "OUTPUT");
+ expect(mockTerminal.write).toHaveBeenNthCalledWith(2, "OUTPUT");
});
it("should write commands to the terminal", () => {
@@ -66,13 +68,13 @@ describe("Terminal", () => {
});
expect(mockTerminal.writeln).toHaveBeenNthCalledWith(1, "echo Hello");
- expect(mockTerminal.writeln).toHaveBeenNthCalledWith(2, "Hello");
+ expect(mockTerminal.write).toHaveBeenNthCalledWith(2, "Hello");
act(() => {
store.dispatch(appendInput("echo World"));
});
- expect(mockTerminal.writeln).toHaveBeenNthCalledWith(3, "echo World");
+ expect(mockTerminal.writeln).toHaveBeenNthCalledWith(2, "echo World");
});
it("should load and write commands to the terminal", () => {
@@ -82,13 +84,13 @@ describe("Terminal", () => {
]);
expect(mockTerminal.writeln).toHaveBeenNthCalledWith(1, "echo Hello");
- expect(mockTerminal.writeln).toHaveBeenNthCalledWith(2, "Hello");
+ expect(mockTerminal.write).toHaveBeenNthCalledWith(2, "Hello");
act(() => {
store.dispatch(appendInput("echo Hello"));
});
- expect(mockTerminal.writeln).toHaveBeenNthCalledWith(3, "echo Hello");
+ expect(mockTerminal.writeln).toHaveBeenNthCalledWith(2, "echo Hello");
});
it("should end the line with a dollar sign after writing a command", () => {
@@ -99,7 +101,9 @@ describe("Terminal", () => {
});
expect(mockTerminal.writeln).toHaveBeenCalledWith("echo Hello");
- expect(mockTerminal.write).toHaveBeenCalledWith("$ ");
+ expect(mockTerminal.write).toHaveBeenCalledWith(
+ "opendevin@docker-desktop:/workspace $ ",
+ );
});
// This test fails because it expects `disposeMock` to have been called before the component is unmounted.
diff --git a/frontend/src/hooks/useTerminal.ts b/frontend/src/hooks/useTerminal.ts
index d2665d33b543..587055fe1d06 100644
--- a/frontend/src/hooks/useTerminal.ts
+++ b/frontend/src/hooks/useTerminal.ts
@@ -34,7 +34,7 @@ export const useTerminal = (commands: Command[] = []) => {
terminal.current.loadAddon(fitAddon.current);
terminal.current.open(ref.current);
- terminal.current.write("$ ");
+ terminal.current.write("opendevin@docker-desktop:/workspace $ ");
terminal.current.onKey(({ key, domEvent }) => {
if (domEvent.key === "Enter") {
terminal.current?.write("\r\n");
@@ -93,15 +93,15 @@ export const useTerminal = (commands: Command[] = []) => {
// Start writing commands from the last command index
for (let i = lastCommandIndex.current; i < commands.length; i += 1) {
const command = commands[i];
- const lines = command.content.split("\n");
+ const lines = command.content.split("\r\n");
- lines.forEach((line: string) => {
- terminal.current?.writeln(line);
+ lines.forEach((line, index) => {
+ if (index < lines.length - 1 || command.type === "input") {
+ terminal.current?.writeln(line);
+ } else {
+ terminal.current?.write(line);
+ }
});
-
- if (command.type === "output") {
- terminal.current.write("\n$ ");
- }
}
lastCommandIndex.current = commands.length; // Update the position of the last command
diff --git a/frontend/src/i18n/translation.json b/frontend/src/i18n/translation.json
index 4ec0d05db7e9..46c1dd46954d 100644
--- a/frontend/src/i18n/translation.json
+++ b/frontend/src/i18n/translation.json
@@ -601,6 +601,12 @@
"zh-CN": "็ปง็ปญ",
"de": "Fortfahren"
},
+ "CHAT_INTERFACE$AUTO_MESSAGE": {
+ "en": "Please continue working on the task on whatever approach you think is suitable.\nIf you think you have solved the task, you can give to end the interaction.\nIMPORTANT: YOU SHOULD NEVER ASK FOR HUMAN HELP.\n"
+ },
+ "CHAT_INTERFACE$INPUT_AUTO_MESSAGE":{
+ "en": "Sent the default Auto Msg ๐ค"
+ },
"CHAT_INTERFACE$USER_ASK_CONFIRMATION": {
"en": "Do you want to continue with this action?",
"de": "Mรถchten Sie mit dieser Aktion fortfahren?",
@@ -649,6 +655,11 @@
"zh-CN": "ๅ้ๆถๆฏ",
"de": "Nachricht senden"
},
+ "CHAT_INTERFACE$TOOLTIP_UPLOAD_IMAGE": {
+ "en": "Upload image",
+ "zh-CN": "ไธไผ ๅพ็",
+ "de": "Bild hochladen"
+ },
"CHAT_INTERFACE$INITIAL_MESSAGE": {
"en": "Hi! I'm OpenDevin, an AI Software Engineer. What would you like to build with me today?",
"zh-CN": "ไฝ ๅฅฝ๏ผๆๆฏ OpenDevin๏ผไธๅ AI ่ฝฏไปถๅทฅ็จๅธใไปๅคฉๆณๅๆไธ่ตท็ผๅไปไน็จๅบๅข?",
@@ -721,6 +732,9 @@
"de": "Wartet auf die Bestรคtigung des Benutzers, bevor der Code ausgefรผhrt wird.",
"zh-CN": "ๅจๆง่กไปฃ็ ไนๅ็ญๅพ ็จๆท็กฎ่ฎคใ"
},
+ "SETTINGS$AGENT_SELECT_ENABLED": {
+ "en": "Enable Agent Selection - Advanced Users"
+ },
"BROWSER$EMPTY_MESSAGE": {
"en": "No page loaded.",
"zh-CN": "้กต้ขๆชๅ ่ฝฝ",
diff --git a/frontend/src/index.css b/frontend/src/index.css
index 6e3a605ba0ce..a981d6b3411b 100644
--- a/frontend/src/index.css
+++ b/frontend/src/index.css
@@ -40,3 +40,12 @@ code {
padding: 0;
background-color: inherit;
}
+
+.markdown-body {
+ white-space: pre-wrap; /* Handles line breaks */
+}
+
+.markdown-body ul {
+ list-style-type: disc; /* Handles bullet points */
+ margin-left: 20px;
+}
diff --git a/frontend/src/services/actions.ts b/frontend/src/services/actions.ts
index b1d91f5485fc..c55b52b3258b 100644
--- a/frontend/src/services/actions.ts
+++ b/frontend/src/services/actions.ts
@@ -28,7 +28,9 @@ const messageActions = {
},
[ActionType.MESSAGE]: (message: ActionMessage) => {
if (message.source === "user") {
- store.dispatch(addUserMessage(message.args.content));
+ store.dispatch(
+ addUserMessage({ content: message.args.content, imageUrls: [] }),
+ );
} else {
store.dispatch(addAssistantMessage(message.args.content));
}
diff --git a/frontend/src/services/browseService.ts b/frontend/src/services/browseService.ts
new file mode 100644
index 000000000000..41525f1d30e7
--- /dev/null
+++ b/frontend/src/services/browseService.ts
@@ -0,0 +1,8 @@
+import ActionType from "#/types/ActionType";
+import Session from "./session";
+
+export function updateBrowserTabUrl(newUrl: string): void {
+ const event = { action: ActionType.BROWSE, args: { url: newUrl } };
+ const eventString = JSON.stringify(event);
+ Session.send(eventString);
+}
diff --git a/frontend/src/services/chatService.ts b/frontend/src/services/chatService.ts
index af1ab45ce86b..d857fb603f44 100644
--- a/frontend/src/services/chatService.ts
+++ b/frontend/src/services/chatService.ts
@@ -1,8 +1,11 @@
import ActionType from "#/types/ActionType";
import Session from "./session";
-export function sendChatMessage(message: string): void {
- const event = { action: ActionType.MESSAGE, args: { content: message } };
+export function sendChatMessage(message: string, images_urls: string[]): void {
+ const event = {
+ action: ActionType.MESSAGE,
+ args: { content: message, images_urls },
+ };
const eventString = JSON.stringify(event);
Session.send(eventString);
}
diff --git a/frontend/src/services/fileService.ts b/frontend/src/services/fileService.ts
index bdbae8991ebf..77751c43f411 100644
--- a/frontend/src/services/fileService.ts
+++ b/frontend/src/services/fileService.ts
@@ -67,10 +67,14 @@ export async function uploadFiles(files: FileList): Promise {
};
}
-export async function listFiles(path: string = "/"): Promise {
- const data = await request(
- `/api/list-files?path=${encodeURIComponent(path)}`,
- );
+export async function listFiles(
+ path: string | undefined = undefined,
+): Promise {
+ let url = "/api/list-files";
+ if (path) {
+ url = `/api/list-files?path=${encodeURIComponent(path)}`;
+ }
+ const data = await request(url);
if (!Array.isArray(data)) {
throw new Error("Invalid response format: data is not an array");
}
diff --git a/frontend/src/state/chatSlice.ts b/frontend/src/state/chatSlice.ts
index 757806e1e048..a1b01fa77682 100644
--- a/frontend/src/state/chatSlice.ts
+++ b/frontend/src/state/chatSlice.ts
@@ -10,12 +10,15 @@ export const chatSlice = createSlice({
name: "chat",
initialState,
reducers: {
- addUserMessage(state, action: PayloadAction) {
+ addUserMessage(
+ state,
+ action: PayloadAction<{ content: string; imageUrls: string[] }>,
+ ) {
const message: Message = {
sender: "user",
- content: action.payload,
+ content: action.payload.content,
+ imageUrls: action.payload.imageUrls,
};
-
state.messages.push(message);
},
@@ -23,8 +26,8 @@ export const chatSlice = createSlice({
const message: Message = {
sender: "assistant",
content: action.payload,
+ imageUrls: [],
};
-
state.messages.push(message);
},
diff --git a/opendevin/README.md b/opendevin/README.md
index f977abbdeb10..11c52dcab2e8 100644
--- a/opendevin/README.md
+++ b/opendevin/README.md
@@ -3,11 +3,7 @@
This directory contains the core components of OpenDevin.
This diagram provides an overview of the roles of each component and how they communicate and collaborate.
-
-
-
-
OpenDevin System Architecture Diagram (July 4, 2024)
-
+
## Classes
The key classes in OpenDevin are:
diff --git a/opendevin/condenser/condenser.py b/opendevin/condenser/condenser.py
new file mode 100644
index 000000000000..d1fef59117fb
--- /dev/null
+++ b/opendevin/condenser/condenser.py
@@ -0,0 +1,147 @@
+from opendevin.core.exceptions import (
+ SummarizeError,
+)
+from opendevin.core.logger import opendevin_logger as logger
+from opendevin.core.message import Message, TextContent
+from opendevin.events.action import (
+ AgentSummarizeAction,
+)
+
+from .prompts import (
+ MESSAGE_SUMMARY_WARNING_FRAC,
+ SUMMARY_PROMPT_SYSTEM,
+ parse_summary_response,
+)
+
+
+class CondenserMixin:
+ """Condenses a group of condensable messages as done by MemGPT."""
+
+ def condense(
+ self,
+ messages: list[Message],
+ ):
+ # Start past the system message and the example messages,
+ # and collect messages for summarization until we reach the desired truncation token fraction (e.g., 50%)
+ # Do not allow truncation for in-context examples of function calling
+ token_counts = [
+ self.get_token_count([message]) # type: ignore
+ for message in messages
+ if message.condensable
+ ]
+ message_buffer_token_count = sum(token_counts) # no system and example message
+
+ desired_token_count_to_summarize = int(
+ message_buffer_token_count * self.config.message_summary_trunc_tokens_frac # type: ignore
+ )
+
+ candidate_messages_to_summarize = []
+ tokens_so_far = 0
+ for message in messages:
+ if message.condensable:
+ candidate_messages_to_summarize.append(message)
+ tokens_so_far += self.get_token_count([message]) # type: ignore
+ if tokens_so_far > desired_token_count_to_summarize:
+ last_summarized_event_id = message.event_id
+ break
+
+ # TODO: Add functionality for preserving last N messages
+ # MESSAGE_SUMMARY_TRUNC_KEEP_N_LAST = 3
+ # if preserve_last_N_messages:
+ # candidate_messages_to_summarize = candidate_messages_to_summarize[:-MESSAGE_SUMMARY_TRUNC_KEEP_N_LAST]
+ # token_counts = token_counts[:-MESSAGE_SUMMARY_TRUNC_KEEP_N_LAST]
+
+ logger.debug(
+ f'message_summary_trunc_tokens_frac={self.config.message_summary_trunc_tokens_frac}' # type: ignore
+ )
+ # logger.debug(f'MESSAGE_SUMMARY_TRUNC_KEEP_N_LAST={MESSAGE_SUMMARY_TRUNC_KEEP_N_LAST}')
+ logger.debug(f'token_counts={token_counts}')
+ logger.debug(f'message_buffer_token_count={message_buffer_token_count}')
+ logger.debug(
+ f'desired_token_count_to_summarize={desired_token_count_to_summarize}'
+ )
+ logger.debug(
+ f'len(candidate_messages_to_summarize)={len(candidate_messages_to_summarize)}'
+ )
+
+ if len(candidate_messages_to_summarize) == 0:
+ raise SummarizeError(
+ f"Summarize error: tried to run summarize, but couldn't find enough messages to compress [len={len(messages)}]"
+ )
+
+ # TODO: Try to make an assistant message come after the cutoff
+
+ message_sequence_to_summarize = candidate_messages_to_summarize
+
+ if len(message_sequence_to_summarize) <= 1:
+ # This prevents a potential infinite loop of summarizing the same message over and over
+ raise SummarizeError(
+ f"Summarize error: tried to run summarize, but couldn't find enough messages to compress [len={len(message_sequence_to_summarize)} <= 1]"
+ )
+ else:
+ print(
+ f'Attempting to summarize with last summarized event id = {last_summarized_event_id}'
+ )
+
+ action_response = self.summarize_messages(
+ message_sequence_to_summarize=message_sequence_to_summarize
+ )
+ summary_action: AgentSummarizeAction = parse_summary_response(action_response)
+ summary_action.last_summarized_event_id = (
+ last_summarized_event_id if last_summarized_event_id else -1
+ )
+ return summary_action
+
+ def _format_summary_history(self, message_history: list[dict]) -> str:
+ # TODO use existing prompt formatters for this (eg ChatML)
+ return '\n'.join([f'{m["role"]}: {m["content"]}' for m in message_history])
+
+ def summarize_messages(self, message_sequence_to_summarize: list[Message]):
+ """Summarize a message sequence using LLM"""
+ context_window = self.config.max_input_tokens # type: ignore
+ summary_prompt = SUMMARY_PROMPT_SYSTEM
+ summary_input = self._format_summary_history(
+ self.get_text_messages(message_sequence_to_summarize) # type: ignore
+ )
+ summary_input_tkns = self.get_token_count(text=summary_input) # type: ignore
+ if context_window is None:
+ raise ValueError('context_window should not be None')
+ if summary_input_tkns > MESSAGE_SUMMARY_WARNING_FRAC * context_window:
+ trunc_ratio = (
+ MESSAGE_SUMMARY_WARNING_FRAC * context_window / summary_input_tkns
+ ) * 0.8 # For good measure...
+ cutoff = int(len(message_sequence_to_summarize) * trunc_ratio)
+ curr_summary = self.summarize_messages(
+ message_sequence_to_summarize=message_sequence_to_summarize[:cutoff]
+ )
+ curr_summary = parse_summary_response(curr_summary)
+ curr_summary_message = (
+ 'Summary of all Action and Observations till now. \n'
+ f'Action: {curr_summary.summarized_actions}\n'
+ f'Observation: {curr_summary.summarized_observations}'
+ )
+ input = [
+ Message(
+ role='assistant', content=[TextContent(text=curr_summary_message)]
+ )
+ ] + message_sequence_to_summarize[cutoff:]
+ summary_input = self._format_summary_history(self.get_text_messages(input)) # type: ignore
+
+ message_sequence = []
+ message_sequence.append(
+ Message(role='system', content=[TextContent(text=summary_prompt)])
+ )
+ message_sequence.append(
+ Message(role='user', content=[TextContent(text=summary_input)])
+ )
+
+ response = self.completion( # type: ignore
+ messages=message_sequence,
+ temperature=0.0,
+ condense=True,
+ )
+
+ print(f'summarize_messages gpt reply: {response.choices[0].message.content}')
+
+ action_response = response['choices'][0]['message']['content']
+ return action_response
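For readers following the patch, here is a minimal, standalone sketch of just the selection step in `CondenserMixin.condense` above: walk the condensable messages in order and stop once roughly the configured fraction (`message_summary_trunc_tokens_frac`, defaulting to 0.75 in this diff) of the buffer's tokens has been collected. The `Msg` class and the word-count "tokenizer" are illustrative stand-ins, not part of the patch.

```python
from dataclasses import dataclass


@dataclass
class Msg:
    text: str
    condensable: bool = True


def token_count(m: Msg) -> int:
    # crude word-count stand-in for LLM.get_token_count
    return len(m.text.split())


def pick_messages_to_summarize(messages: list[Msg], frac: float = 0.75) -> list[Msg]:
    """Collect the oldest condensable messages until ~frac of the buffer's tokens is reached."""
    condensable = [m for m in messages if m.condensable]
    budget = int(sum(token_count(m) for m in condensable) * frac)
    picked, so_far = [], 0
    for m in condensable:
        picked.append(m)
        so_far += token_count(m)
        if so_far > budget:
            break
    return picked


history = [Msg('system prompt', condensable=False)] + [
    Msg('word ' * n) for n in (5, 20, 40, 10)
]
print(len(pick_messages_to_summarize(history)))  # 3: the oldest messages crossing the budget
```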
diff --git a/opendevin/condenser/prompts.py b/opendevin/condenser/prompts.py
new file mode 100644
index 000000000000..a1de3ae9fb30
--- /dev/null
+++ b/opendevin/condenser/prompts.py
@@ -0,0 +1,74 @@
+import re
+
+from opendevin.core.exceptions import (
+ InvalidSummaryResponseError,
+ LLMMalformedActionError,
+ LLMResponseError,
+)
+from opendevin.core.logger import opendevin_logger as logger
+from opendevin.core.utils import json
+from opendevin.events.action.agent import AgentSummarizeAction
+from opendevin.events.event import EventSource
+from opendevin.events.serialization.action import action_from_dict
+
+WORD_LIMIT = 200
+MESSAGE_SUMMARY_WARNING_FRAC = 0.75
+
+SUMMARY_PROMPT_SYSTEM = """
+Your job is to summarize a history of previous messages in a conversation between an AI persona and a human. The conversation you are given is from a fixed context window and may not be complete. Keep your summary less than {WORD_LIMIT} words, do NOT exceed this word limit.
+Only output the summary, do NOT include anything else in your output.
+Given the following actions and observations, create a JSON response with:
+ - "action": "summarize"
+ - args:
+ - "summarized_actions": A precise sentence summarizing all the provided actions, written in the first person.
+ - "summarized_observations": A few precise sentences summarizing all the provided observations, written in the third person.
+Example:
+{
+ "action": "summarize",
+ "args": {
+ "summarized_actions": "I located the UML specification PDF, parsed its content, and searched for information about sequence diagrams.",
+ "summarized_observations": "The agent encountered a UnicodeDecodeError when initially searching the PDF text, but was able to resolve this by installing the PyPDF2 library and successfully extracting relevant information about sequence diagrams."
+ }
+}
+Make sure to include in observations any relevant information that the agent needs to remember.
+%(events)s
+"""
+
+
+def parse_summary_response(response: str) -> AgentSummarizeAction:
+ """
+ Parses a JSON summary of events.
+
+ Parameters:
+ - response: The response string to be parsed
+ Returns:
+ - The summary action output by the model
+ """
+ try:
+ try:
+ action_dict = json.loads(response)
+ except LLMResponseError:
+ pattern = r'(?<=Action:\s).+?(?=\s*Observation:)|(?<=Observation:\s).+'
+ matches = re.findall(pattern, response.replace('**', ''), re.DOTALL)
+ action_summary = matches[0].strip() if matches else None
+ observation_summary = matches[1].strip() if len(matches) > 1 else None
+ action_dict = {
+ 'action': 'summarize',
+ 'args': {
+ 'summarized_actions': action_summary,
+ 'summarized_observations': observation_summary,
+ },
+ }
+
+ action = action_from_dict(action_dict)
+ if action is None or not isinstance(action, AgentSummarizeAction):
+ error_message = f'Expected a summarize action, but the response got {str(type(action)) if action else None}'
+ logger.error(error_message)
+ raise InvalidSummaryResponseError(error_message)
+ action._source = EventSource.AGENT # type: ignore
+ except (LLMResponseError, LLMMalformedActionError) as e:
+ logger.error(f'Failed to parse summary response: {str(e)}')
+ raise InvalidSummaryResponseError(
+ f'Failed to parse the response: {str(e)}'
+ ) from e
+ return action
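As a hedged illustration of the fallback path in `parse_summary_response` above: when the model replies in plain "Action: ... Observation: ..." prose instead of JSON, the regex pulls out the two summary fields (after stripping `**` markers, as the code does). The sample response below is invented for demonstration.

```python
import re

# An invented plain-text reply that the JSON parse would fail on:
response = (
    'Action: I ran the test suite and fixed the failing assertion. '
    'Observation: All tests now pass.'
)
pattern = r'(?<=Action:\s).+?(?=\s*Observation:)|(?<=Observation:\s).+'
matches = re.findall(pattern, response.replace('**', ''), re.DOTALL)
summarized_actions = matches[0].strip() if matches else None
summarized_observations = matches[1].strip() if len(matches) > 1 else None
print(summarized_actions)       # "I ran the test suite and fixed the failing assertion."
print(summarized_observations)  # "All tests now pass."
```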
diff --git a/opendevin/controller/agent_controller.py b/opendevin/controller/agent_controller.py
index bbb0e11cf9fd..533322d5ff8d 100644
--- a/opendevin/controller/agent_controller.py
+++ b/opendevin/controller/agent_controller.py
@@ -21,6 +21,7 @@
AgentDelegateAction,
AgentFinishAction,
AgentRejectAction,
+ AgentSummarizeAction,
ChangeAgentStateAction,
CmdRunAction,
IPythonRunCellAction,
@@ -36,6 +37,7 @@
ErrorObservation,
Observation,
)
+from opendevin.events.observation.browse import BrowserOutputObservation
from opendevin.llm.llm import LLM
# note: RESUME is only available on web GUI
@@ -131,9 +133,9 @@ async def report_error(self, message: str, exception: Exception | None = None):
- a user-friendly message, which will be shown in the chat box. This should not be a raw exception message.
- an ErrorObservation that can be sent to the LLM by the agent, with the exception message, so it can self-correct next time.
"""
- self.state.last_error = message
if exception:
- self.state.last_error += f': {exception}'
+ message += f': {exception}'
+ self.state.last_error = message
self.event_stream.add_event(ErrorObservation(message), EventSource.AGENT)
async def _start_step_loop(self):
@@ -145,7 +147,6 @@ async def _start_step_loop(self):
logger.info('AgentController task was cancelled')
break
except Exception as e:
- traceback.print_exc()
logger.error(f'Error while running the agent: {e}')
logger.error(traceback.format_exc())
await self.report_error(
@@ -161,6 +162,10 @@ async def on_event(self, event: Event):
await self.set_agent_state_to(event.agent_state) # type: ignore
elif isinstance(event, MessageAction):
if event.source == EventSource.USER:
+ logger.info(
+ event,
+ extra={'msg_type': 'ACTION', 'event_source': EventSource.USER},
+ )
if self.get_agent_state() != AgentState.RUNNING:
await self.set_agent_state_to(AgentState.RUNNING)
elif event.source == EventSource.AGENT and event.wait_for_response:
@@ -196,6 +201,8 @@ async def on_event(self, event: Event):
logger.info(event, extra={'msg_type': 'OBSERVATION'})
elif isinstance(event, CmdOutputObservation):
logger.info(event, extra={'msg_type': 'OBSERVATION'})
+ elif isinstance(event, BrowserOutputObservation):
+ logger.info(event, extra={'msg_type': 'OBSERVATION'})
elif isinstance(event, AgentDelegateObservation):
self.state.history.on_event(event)
logger.info(event, extra={'msg_type': 'OBSERVATION'})
@@ -376,16 +383,27 @@ async def _step(self) -> None:
self.state.traffic_control_state = TrafficControlState.NORMAL
else:
self.state.traffic_control_state = TrafficControlState.THROTTLING
- await self.report_error(
- f'Task budget exceeded. Current cost: {current_cost:.2f}, Max budget: {self.max_budget_per_task:.2f}, task paused. {TRAFFIC_CONTROL_REMINDER}'
- )
- await self.set_agent_state_to(AgentState.PAUSED)
+ if self.headless_mode:
+ # set to ERROR state if running in headless mode
+ # there is no way to resume
+ await self.report_error(
+ f'Task budget exceeded. Current cost: {current_cost:.2f}, max budget: {self.max_budget_per_task:.2f}, task stopped.'
+ )
+ await self.set_agent_state_to(AgentState.ERROR)
+ else:
+ await self.report_error(
+ f'Task budget exceeded. Current cost: {current_cost:.2f}, Max budget: {self.max_budget_per_task:.2f}, task paused. {TRAFFIC_CONTROL_REMINDER}'
+ )
+ await self.set_agent_state_to(AgentState.PAUSED)
return
self.update_state_before_step()
action: Action = NullAction()
try:
action = self.agent.step(self.state)
+ if isinstance(action, AgentSummarizeAction):
+ self.state.history.add_summary(action)
+ return
if action is None:
raise LLMNoActionError('No action was returned')
except (LLMMalformedActionError, LLMNoActionError, LLMResponseError) as e:
diff --git a/opendevin/controller/state/state.py b/opendevin/controller/state/state.py
index a8dd2ed33372..372a66767930 100644
--- a/opendevin/controller/state/state.py
+++ b/opendevin/controller/state/state.py
@@ -2,6 +2,7 @@
import pickle
from dataclasses import dataclass, field
from enum import Enum
+from typing import Any
from opendevin.controller.state.task import RootTask
from opendevin.core.logger import opendevin_logger as logger
@@ -106,6 +107,9 @@ class State:
start_id: int = -1
end_id: int = -1
almost_stuck: int = 0
+ # NOTE: This will never be used by the controller, but it can be used by different
+ # evaluation tasks to store extra data needed to track the progress/state of the task.
+ extra_data: dict[str, Any] = field(default_factory=dict)
def save_to_session(self, sid: str, file_store: FileStore):
pickled = pickle.dumps(self)
@@ -126,10 +130,17 @@ def restore_from_session(sid: str, file_store: FileStore) -> 'State':
except Exception as e:
logger.error(f'Failed to restore state from session: {e}')
raise e
+
+ # update state
if state.agent_state in RESUMABLE_STATES:
state.resume_state = state.agent_state
else:
state.resume_state = None
+
+ # don't carry last_error anymore after restore
+ state.last_error = None
+
+ # first state after restore
state.agent_state = AgentState.LOADING
return state
@@ -160,13 +171,15 @@ def __setstate__(self, state):
# remove the restored data from the state if any
def get_current_user_intent(self):
- """Returns the latest user message that appears after a FinishAction, or the first (the task) if nothing was finished yet."""
+ """Returns the latest user message and image(if provided) that appears after a FinishAction, or the first (the task) if nothing was finished yet."""
last_user_message = None
+ last_user_message_image_urls: list[str] | None = []
for event in self.history.get_events(reverse=True):
if isinstance(event, MessageAction) and event.source == 'user':
last_user_message = event.content
+ last_user_message_image_urls = event.images_urls
elif isinstance(event, AgentFinishAction):
if last_user_message is not None:
return last_user_message
- return last_user_message
+ return last_user_message, last_user_message_image_urls
diff --git a/opendevin/core/config.py b/opendevin/core/config.py
index 4fff99216a99..70cd269528b1 100644
--- a/opendevin/core/config.py
+++ b/opendevin/core/config.py
@@ -78,6 +78,8 @@ class LLMConfig:
input_cost_per_token: float | None = None
output_cost_per_token: float | None = None
ollama_base_url: str | None = None
+ message_summary_trunc_tokens_frac: float = 0.75
+ attempts_to_condense: int = 2
drop_params: bool | None = None
def defaults_to_dict(self) -> dict:
@@ -111,6 +113,12 @@ def to_safe_dict(self):
ret[k] = '******' if v else None
return ret
+ def set_missing_attributes(self):
+ """Set any missing attributes to their default values."""
+ for field_name, field_obj in self.__dataclass_fields__.items():
+ if not hasattr(self, field_name):
+ setattr(self, field_name, field_obj.default)
+
@dataclass
class AgentConfig:
@@ -139,23 +147,31 @@ class SandboxConfig(metaclass=Singleton):
"""Configuration for the sandbox.
Attributes:
- box_type: The type of sandbox to use. Options are: ssh, e2b, local.
+ api_hostname: The hostname for the EventStream Runtime API.
container_image: The container image to use for the sandbox.
user_id: The user ID for the sandbox.
timeout: The timeout for the sandbox.
enable_auto_lint: Whether to enable auto-lint.
use_host_network: Whether to use the host network.
initialize_plugins: Whether to initialize plugins.
- update_source_code: Whether to update the source code in the EventStreamRuntime.
- Used for development of EventStreamRuntime.
+ od_runtime_extra_deps: The extra dependencies to install in the runtime image (typically used for evaluation).
+ This will be rendered into the end of the Dockerfile that builds the runtime image.
+ It can contain any valid shell commands (e.g., pip install numpy).
+ The path to the interpreter is available as $OD_INTERPRETER_PATH,
+ which can be used to install dependencies for the OD-specific Python interpreter.
+ od_runtime_startup_env_vars: The environment variables to set at the launch of the runtime.
+ This is a dictionary of key-value pairs.
+ This is useful for setting environment variables that are needed by the runtime.
+ For example, specifying the base URL of the website used for BrowserGym evaluation.
+ browsergym_eval_env: The BrowserGym environment to use for evaluation.
+ Default is None for general purpose browsing. Check evaluation/miniwob and evaluation/webarena for examples.
+ persist_sandbox: Whether to persist the sandbox after the task is done.
+ fast_boot: Whether to use a fast boot mode for the sandbox.
+ port: The port to use for the sandbox.
"""
- box_type: str = 'ssh'
- container_image: str = 'ghcr.io/opendevin/sandbox' + (
- f':{os.getenv("OPEN_DEVIN_BUILD_VERSION")}'
- if os.getenv('OPEN_DEVIN_BUILD_VERSION')
- else ':main'
- )
+ api_hostname: str = 'localhost'
+ container_image: str = 'nikolaik/python-nodejs:python3.11-nodejs22' # default to nikolaik/python-nodejs:python3.11-nodejs22 for eventstream runtime
user_id: int = os.getuid() if hasattr(os, 'getuid') else 1000
timeout: int = 120
enable_auto_lint: bool = (
@@ -163,7 +179,12 @@ class SandboxConfig(metaclass=Singleton):
)
use_host_network: bool = False
initialize_plugins: bool = True
- update_source_code: bool = False
+ od_runtime_extra_deps: str | None = None
+ od_runtime_startup_env_vars: dict[str, str] = field(default_factory=dict)
+ browsergym_eval_env: str | None = None
+ persist_sandbox: bool = False
+ fast_boot: bool = False
+ port: int = 63710
def defaults_to_dict(self) -> dict:
"""Serialize fields to a dict for the frontend, including type hints, defaults, and whether it's optional."""
@@ -211,7 +232,6 @@ class AppConfig(metaclass=Singleton):
max_iterations: The maximum number of iterations.
max_budget_per_task: The maximum budget allowed per task, beyond which the agent will stop.
e2b_api_key: The E2B API key.
- ssh_hostname: The SSH hostname.
disable_color: Whether to disable color. For terminals that don't support color.
debug: Whether to enable debugging.
enable_cli_session: Whether to enable saving and restoring the session when run from CLI.
@@ -224,13 +244,14 @@ class AppConfig(metaclass=Singleton):
agents: dict = field(default_factory=dict)
default_agent: str = _DEFAULT_AGENT
sandbox: SandboxConfig = field(default_factory=SandboxConfig)
- runtime: str = 'server'
+ runtime: str = 'eventstream'
file_store: str = 'memory'
file_store_path: str = '/tmp/file_store'
+ # TODO: clean up workspace path after the removal of ServerRuntime
workspace_base: str = os.path.join(os.getcwd(), 'workspace')
- workspace_mount_path: str = (
+ workspace_mount_path: str | None = (
UndefinedString.UNDEFINED # this path should always be set when config is fully loaded
- )
+ ) # when set to None, do not mount the workspace
workspace_mount_path_in_sandbox: str = '/workspace'
workspace_mount_rewrite: str | None = None
cache_dir: str = '/tmp/cache'
@@ -239,11 +260,7 @@ class AppConfig(metaclass=Singleton):
max_iterations: int = _MAX_ITERATIONS
max_budget_per_task: float | None = None
e2b_api_key: str = ''
- ssh_hostname: str = 'localhost'
disable_color: bool = False
- persist_sandbox: bool = False
- ssh_port: int = 63710
- ssh_password: str | None = None
jwt_secret: str = uuid.uuid4().hex
debug: bool = False
enable_cli_session: bool = False
@@ -315,7 +332,6 @@ def __str__(self):
'e2b_api_key',
'github_token',
'jwt_secret',
- 'ssh_password',
]:
attr_value = '******' if attr_value else None
@@ -390,6 +406,11 @@ def set_attr_from_env(sub_config: Any, prefix=''):
elif env_var_name in env_or_toml_dict:
# convert the env var to the correct type and set it
value = env_or_toml_dict[env_var_name]
+
+ # skip empty config values (fall back to default)
+ if not value:
+ continue
+
try:
# if it's an optional type, get the non-None type
if get_origin(field_type) is UnionType:
@@ -406,11 +427,6 @@ def set_attr_from_env(sub_config: Any, prefix=''):
f'Error setting env var {env_var_name}={value}: check that the value is of the right type'
)
- if 'SANDBOX_TYPE' in env_or_toml_dict:
- logger.opendevin_logger.error(
- 'SANDBOX_TYPE is deprecated. Please use SANDBOX_BOX_TYPE instead.'
- )
- env_or_toml_dict['SANDBOX_BOX_TYPE'] = env_or_toml_dict.pop('SANDBOX_TYPE')
# Start processing from the root of the config object
set_attr_from_env(cfg)
@@ -433,8 +449,7 @@ def load_from_toml(cfg: AppConfig, toml_file: str = 'config.toml'):
try:
with open(toml_file, 'r', encoding='utf-8') as toml_contents:
toml_config = toml.load(toml_contents)
- except FileNotFoundError as e:
- logger.opendevin_logger.info(f'Config file not found: {e}')
+ except FileNotFoundError:
return
except toml.TomlDecodeError as e:
logger.opendevin_logger.warning(
@@ -501,8 +516,6 @@ def load_from_toml(cfg: AppConfig, toml_file: str = 'config.toml'):
keys_to_migrate = [key for key in core_config if key.startswith('sandbox_')]
for key in keys_to_migrate:
new_key = key.replace('sandbox_', '')
- if new_key == 'type':
- new_key = 'box_type'
if new_key in sandbox_config.__annotations__:
# read the key in sandbox and remove it from core
setattr(sandbox_config, new_key, core_config.pop(key))
@@ -529,10 +542,6 @@ def finalize_config(cfg: AppConfig):
cfg.workspace_mount_path = os.path.abspath(cfg.workspace_base)
cfg.workspace_base = os.path.abspath(cfg.workspace_base)
- # In local there is no sandbox, the workspace will have the same pwd as the host
- if cfg.sandbox.box_type == 'local':
- cfg.workspace_mount_path_in_sandbox = cfg.workspace_mount_path
-
if cfg.workspace_mount_rewrite: # and not config.workspace_mount_path:
# TODO why do we need to check if workspace_mount_path is None?
base = cfg.workspace_base or os.getcwd()
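A minimal sketch of how the new `SandboxConfig` fields documented above might be set for an evaluation run, based on the dataclass fields in this diff; the concrete values (extra deps, env vars, URL) are placeholders, not recommendations.

```python
from opendevin.core.config import AppConfig, SandboxConfig

config = AppConfig(
    runtime='eventstream',  # the new default runtime in this diff
    sandbox=SandboxConfig(
        container_image='nikolaik/python-nodejs:python3.11-nodejs22',
        # rendered into the end of the Dockerfile that builds the runtime image
        od_runtime_extra_deps='$OD_INTERPRETER_PATH -m pip install numpy',
        # placeholder env vars passed to the runtime at launch
        od_runtime_startup_env_vars={'BASE_URL': 'http://localhost:3000'},
        browsergym_eval_env=None,  # set to a BrowserGym task id for browsing evals
    ),
)
print(config.sandbox.container_image)
```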
diff --git a/opendevin/core/exceptions.py b/opendevin/core/exceptions.py
index fe973083962c..ca130dbb9d13 100644
--- a/opendevin/core/exceptions.py
+++ b/opendevin/core/exceptions.py
@@ -67,3 +67,34 @@ def __init__(self, message='Agent must return an action'):
class LLMResponseError(Exception):
def __init__(self, message='Failed to retrieve action from LLM response'):
super().__init__(message)
+
+
+class TokenLimitExceededError(Exception):
+ """Exception raised when the user-defined max_input_tokens limit is exceeded."""
+
+ def __init__(self, message='User-defined token limit exceeded. Condensing memory.'):
+ super().__init__(message)
+
+
+class ContextWindowLimitExceededError(Exception):
+ def __init__(
+ self, message='Context window limit exceeded. Unable to condense memory.'
+ ):
+ super().__init__(message)
+
+
+class SummarizeError(Exception):
+ """Exception raised when message can't be Summarized."""
+
+ def __init__(self, message='Error Summarizing The Memory'):
+ super().__init__(message)
+
+
+class InvalidSummaryResponseError(Exception):
+ def __init__(self, message='Invalid summary response'):
+ super().__init__(message)
+
+
+class UserCancelledError(Exception):
+ def __init__(self, message='User cancelled the request'):
+ super().__init__(message)
diff --git a/opendevin/core/logger.py b/opendevin/core/logger.py
index 5bde01847504..f765cdff9958 100644
--- a/opendevin/core/logger.py
+++ b/opendevin/core/logger.py
@@ -9,7 +9,7 @@
from termcolor import colored
DISABLE_COLOR_PRINTING = False
-DEBUG = False
+DEBUG = os.getenv('DEBUG', 'False').lower() in ['true', '1', 'yes']
ColorType = Literal[
'red',
@@ -87,7 +87,6 @@ def filter(self, record):
'e2b_api_key',
'github_token',
'jwt_secret',
- 'ssh_password',
]
# add env var names
@@ -123,9 +122,8 @@ def get_console_handler():
return console_handler
-def get_file_handler(log_dir=None):
+def get_file_handler(log_dir):
"""Returns a file handler for logging."""
- log_dir = os.path.join(os.getcwd(), 'logs') if log_dir is None else log_dir
os.makedirs(log_dir, exist_ok=True)
timestamp = datetime.now().strftime('%Y-%m-%d')
file_name = f'opendevin_{timestamp}.log'
@@ -159,16 +157,21 @@ def log_uncaught_exceptions(ex_cls, ex, tb):
opendevin_logger = logging.getLogger('opendevin')
opendevin_logger.setLevel(logging.INFO)
+LOG_DIR = os.path.join(
+ # parent dir of opendevin/core (i.e., root of the repo)
+ os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))),
+ 'logs',
+)
if DEBUG:
opendevin_logger.setLevel(logging.DEBUG)
-opendevin_logger.addHandler(get_file_handler())
+ # default log to project root
+ opendevin_logger.info('DEBUG logging is enabled. Logging to %s', LOG_DIR)
+opendevin_logger.addHandler(get_file_handler(LOG_DIR))
opendevin_logger.addHandler(get_console_handler())
opendevin_logger.addFilter(SensitiveDataFilter(opendevin_logger.name))
opendevin_logger.propagate = False
opendevin_logger.debug('Logging initialized')
-opendevin_logger.debug(
- 'Logging to %s', os.path.join(os.getcwd(), 'logs', 'opendevin.log')
-)
+
# Exclude LiteLLM from logging output
logging.getLogger('LiteLLM').disabled = True
@@ -194,7 +197,7 @@ def __init__(self, filename, mode='a', encoding='utf-8', delay=False):
self.session = datetime.now().strftime('%y-%m-%d_%H-%M')
else:
self.session = 'default'
- self.log_directory = os.path.join(os.getcwd(), 'logs', 'llm', self.session)
+ self.log_directory = os.path.join(LOG_DIR, 'llm', self.session)
os.makedirs(self.log_directory, exist_ok=True)
if not DEBUG:
# Clear the log directory if not in debug mode
diff --git a/opendevin/core/main.py b/opendevin/core/main.py
index 34f58082979c..9e89d5385bad 100644
--- a/opendevin/core/main.py
+++ b/opendevin/core/main.py
@@ -1,13 +1,18 @@
import asyncio
-import os
import sys
+import uuid
from typing import Callable, Type
import agenthub # noqa F401 (we import this to get the agents registered)
from opendevin.controller import AgentController
from opendevin.controller.agent import Agent
from opendevin.controller.state.state import State
-from opendevin.core.config import get_llm_config_arg, load_app_config, parse_arguments
+from opendevin.core.config import (
+ AppConfig,
+ get_llm_config_arg,
+ load_app_config,
+ parse_arguments,
+)
from opendevin.core.logger import opendevin_logger as logger
from opendevin.core.schema import AgentState
from opendevin.events import EventSource, EventStream, EventStreamSubscriber
@@ -16,11 +21,9 @@
from opendevin.events.observation import AgentStateChangedObservation
from opendevin.llm.llm import LLM
from opendevin.runtime import get_runtime_cls
-from opendevin.runtime.sandbox import Sandbox
+from opendevin.runtime.runtime import Runtime
from opendevin.storage import get_file_store
-config = load_app_config()
-
def read_task_from_file(file_path: str) -> str:
"""Read task from the specified file."""
@@ -33,79 +36,99 @@ def read_task_from_stdin() -> str:
return sys.stdin.read()
-async def run_agent_controller(
- agent: Agent,
+async def create_runtime(
+ config: AppConfig,
+ sid: str | None = None,
+ runtime_tools_config: dict | None = None,
+) -> Runtime:
+ """Create a runtime for the agent to run on.
+
+ config: The app config.
+ sid: The session id.
+ runtime_tools_config: (will be deprecated) The runtime tools config.
+ """
+ # set up the event stream
+ file_store = get_file_store(config.file_store, config.file_store_path)
+ session_id = 'main' + ('_' + sid if sid else str(uuid.uuid4()))
+ event_stream = EventStream(session_id, file_store)
+
+ # agent class
+ agent_cls = agenthub.Agent.get_cls(config.default_agent)
+
+ # runtime and tools
+ runtime_cls = get_runtime_cls(config.runtime)
+ logger.info(f'Initializing runtime: {runtime_cls}')
+ runtime: Runtime = runtime_cls(
+ config=config,
+ event_stream=event_stream,
+ sid=session_id,
+ plugins=agent_cls.sandbox_plugins,
+ )
+ await runtime.ainit()
+
+ return runtime
+
+
+async def run_controller(
+ config: AppConfig,
task_str: str,
- max_iterations: int,
- max_budget_per_task: float | None = None,
+ runtime: Runtime | None = None,
+ agent: Agent | None = None,
exit_on_message: bool = False,
fake_user_response_fn: Callable[[State | None], str] | None = None,
- sandbox: Sandbox | None = None,
- runtime_tools_config: dict | None = None,
- sid: str | None = None,
headless_mode: bool = True,
) -> State | None:
"""Main coroutine to run the agent controller with task input flexibility.
It's only used when you launch opendevin backend directly via cmdline.
Args:
- task_str: The task to run.
+ config: The app config.
+ task_str: The task to run, as a string.
+ runtime: (optional) A runtime for the agent to run on.
+ agent: (optional) An agent to run.
exit_on_message: quit if agent asks for a message from user (optional)
fake_user_response_fn: An optional function that receives the current state (could be None) and returns a fake user response.
- sandbox: An optional sandbox to run the agent in.
headless_mode: Whether the agent is run in headless mode.
"""
- # Logging
- logger.info(
- f'Running agent {agent.name}, model {agent.llm.config.model}, with task: "{task_str}"'
- )
+ # Create the agent
+ if agent is None:
+ agent_cls: Type[Agent] = Agent.get_cls(config.default_agent)
+ agent = agent_cls(
+ llm=LLM(config=config.get_llm_config_from_agent(config.default_agent))
+ )
- # set up the event stream
- file_store = get_file_store(config.file_store, config.file_store_path)
- cli_session = 'main' + ('_' + sid if sid else '')
- event_stream = EventStream(cli_session, file_store)
+ if runtime is None:
+ runtime = await create_runtime(config)
+ event_stream = runtime.event_stream
# restore cli session if enabled
initial_state = None
if config.enable_cli_session:
try:
logger.info('Restoring agent state from cli session')
- initial_state = State.restore_from_session(cli_session, file_store)
+ initial_state = State.restore_from_session(
+ event_stream.sid, event_stream.file_store
+ )
except Exception as e:
- print('Error restoring state', e)
+ logger.info('Error restoring state', e)
# init controller with this initial state
controller = AgentController(
agent=agent,
- max_iterations=max_iterations,
- max_budget_per_task=max_budget_per_task,
+ max_iterations=config.max_iterations,
+ max_budget_per_task=config.max_budget_per_task,
agent_to_llm_config=config.get_agent_to_llm_config_map(),
event_stream=event_stream,
initial_state=initial_state,
headless_mode=headless_mode,
)
- # runtime and tools
- runtime_cls = get_runtime_cls(config.runtime)
- runtime = runtime_cls(config=config, event_stream=event_stream, sandbox=sandbox)
- await runtime.ainit()
- runtime.init_sandbox_plugins(controller.agent.sandbox_plugins)
- runtime.init_runtime_tools(
- controller.agent.runtime_tools,
- is_async=False,
- runtime_tools_config=runtime_tools_config,
+ assert isinstance(task_str, str), f'task_str must be a string, got {type(task_str)}'
+ # Logging
+ logger.info(
+ f'Agent Controller Initialized: Running agent {agent.name}, model {agent.llm.config.model}, with task: "{task_str}"'
)
- # browser eval specific
- # TODO: move to a better place
- if runtime.browser and runtime.browser.eval_dir:
- logger.info(f'Evaluation directory: {runtime.browser.eval_dir}')
- with open(
- os.path.join(runtime.browser.eval_dir, 'goal.txt'), 'r', encoding='utf-8'
- ) as f:
- task_str = f.read()
- logger.info(f'Dynamic Eval task: {task_str}')
-
# start event is a MessageAction with the task, either resumed or new
if config.enable_cli_session and initial_state is not None:
# we're resuming the previous session
@@ -144,12 +167,13 @@ async def on_event(event: Event):
# save session when we're about to close
if config.enable_cli_session:
end_state = controller.get_state()
- end_state.save_to_session(cli_session, file_store)
+ end_state.save_to_session(event_stream.sid, event_stream.file_store)
# close when done
await controller.close()
- await runtime.close()
- return controller.get_state()
+ state = controller.get_state()
+
+ return state
if __name__ == '__main__':
@@ -165,23 +189,30 @@ async def on_event(event: Event):
else:
raise ValueError('No task provided. Please specify a task through -t, -f.')
+ # Load the app config
+ # this will load config from config.toml in the current directory
+ # as well as from the environment variables
+ config = load_app_config()
+
# Override default LLM configs ([llm] section in config.toml)
if args.llm_config:
llm_config = get_llm_config_arg(args.llm_config)
if llm_config is None:
raise ValueError(f'Invalid toml file, cannot read {args.llm_config}')
config.set_llm_config(llm_config)
- llm = LLM(config=config.get_llm_config_from_agent(args.agent_cls))
- # Create the agent
- AgentCls: Type[Agent] = Agent.get_cls(args.agent_cls)
- agent = AgentCls(llm=llm)
+ # Set default agent
+ config.default_agent = args.agent_cls
+
+ # if max budget per task is not sent on the command line, use the config value
+ if args.max_budget_per_task is not None:
+ config.max_budget_per_task = args.max_budget_per_task
+ if args.max_iterations is not None:
+ config.max_iterations = args.max_iterations
asyncio.run(
- run_agent_controller(
- agent=agent,
+ run_controller(
+ config=config,
task_str=task_str,
- max_iterations=args.max_iterations,
- max_budget_per_task=args.max_budget_per_task,
)
)
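For reference, a hedged usage sketch of the new entry points (`create_runtime` / `run_controller`) defined above, mirroring the `__main__` path: the config comes from `config.toml` plus environment variables, and the task is passed as a plain string. The task text and session id are placeholders.

```python
import asyncio

from opendevin.core.config import load_app_config
from opendevin.core.main import create_runtime, run_controller


async def main() -> None:
    config = load_app_config()  # reads config.toml plus environment variables
    runtime = await create_runtime(config, sid='demo')  # optional; run_controller can create one
    state = await run_controller(
        config=config,
        task_str='Add a unit test for parse_summary_response and make it pass.',
        runtime=runtime,
        exit_on_message=True,  # quit if the agent asks the user a question
    )
    print(state.agent_state if state else 'no final state')


asyncio.run(main())
```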
diff --git a/opendevin/core/message.py b/opendevin/core/message.py
new file mode 100644
index 000000000000..6c8a81653a43
--- /dev/null
+++ b/opendevin/core/message.py
@@ -0,0 +1,69 @@
+from enum import Enum
+
+from pydantic import BaseModel, Field, model_serializer
+from typing_extensions import Literal
+
+from opendevin.core.config import load_app_config
+
+config = load_app_config()
+
+
+class ContentType(Enum):
+ TEXT = 'text'
+ IMAGE_URL = 'image_url'
+
+
+class Content(BaseModel):
+ type: ContentType
+
+ @model_serializer
+ def serialize_model(self):
+ raise NotImplementedError('Subclasses should implement this method.')
+
+
+class TextContent(Content):
+ text: str
+ type: ContentType = ContentType.TEXT
+
+ @model_serializer
+ def serialize_model(self):
+ return {'type': self.type.value, 'text': self.text}
+
+
+class ImageContent(Content):
+ image_urls: list[str]
+ type: ContentType = ContentType.IMAGE_URL
+
+ @model_serializer
+ def serialize_model(self):
+ images: list[dict[str, str | dict[str, str]]] = []
+ for url in self.image_urls:
+ images.append({'type': self.type.value, 'image_url': {'url': url}})
+ return images
+
+
+class Message(BaseModel):
+ role: Literal['user', 'system', 'assistant']
+ content: list[TextContent | ImageContent] = Field(default=list)
+ condensable: bool = True
+ event_id: int = -1
+
+ @property
+ def contains_image(self) -> bool:
+ return any(isinstance(content, ImageContent) for content in self.content)
+
+ @model_serializer
+ def serialize_model(self) -> dict:
+ content: list[dict[str, str | dict[str, str]]] = []
+ # check model provider is groq
+ if 'groq/' in config.get_llm_config().model:
+ if self.role in ['system', 'assistant']:
+ return {'role': self.role, 'content': self.content[0].text}
+
+ for item in self.content:
+ if isinstance(item, TextContent):
+ content.append(item.model_dump())
+ elif isinstance(item, ImageContent):
+ content.extend(item.model_dump())
+
+ return {'role': self.role, 'content': content}
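A short, hedged example of the `Message` model above: a user turn carrying both text and an image URL serializes through the custom `model_serializer` into the multi-part content list. The URL is a placeholder, and the printed shape is approximate.

```python
from opendevin.core.message import ImageContent, Message, TextContent

msg = Message(
    role='user',
    content=[
        TextContent(text='What does this screenshot show?'),
        ImageContent(image_urls=['https://example.com/screenshot.png']),  # placeholder URL
    ],
)
print(msg.contains_image)  # True
print(msg.model_dump())
# roughly: {'role': 'user', 'content': [{'type': 'text', 'text': ...},
#                                       {'type': 'image_url', 'image_url': {'url': ...}}]}
```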
diff --git a/opendevin/core/schema/config.py b/opendevin/core/schema/config.py
index b10ebe7ad069..f7c4f25b55d3 100644
--- a/opendevin/core/schema/config.py
+++ b/opendevin/core/schema/config.py
@@ -36,11 +36,9 @@ class ConfigType(str, Enum):
MAX_ITERATIONS = 'MAX_ITERATIONS'
AGENT = 'AGENT'
E2B_API_KEY = 'E2B_API_KEY'
- SANDBOX_BOX_TYPE = 'SANDBOX_BOX_TYPE'
SANDBOX_USER_ID = 'SANDBOX_USER_ID'
SANDBOX_TIMEOUT = 'SANDBOX_TIMEOUT'
USE_HOST_NETWORK = 'USE_HOST_NETWORK'
- SSH_HOSTNAME = 'SSH_HOSTNAME'
DISABLE_COLOR = 'DISABLE_COLOR'
DEBUG = 'DEBUG'
FILE_UPLOADS_MAX_FILE_SIZE_MB = 'FILE_UPLOADS_MAX_FILE_SIZE_MB'
diff --git a/opendevin/events/action/agent.py b/opendevin/events/action/agent.py
index c9d4ec6a3edd..fd5462bb7a88 100644
--- a/opendevin/events/action/agent.py
+++ b/opendevin/events/action/agent.py
@@ -20,16 +20,27 @@ def message(self) -> str:
@dataclass
class AgentSummarizeAction(Action):
- summary: str
+ """
+ Action to summarize a list of events.
+
+ Attributes:
+ - summarized_actions: A sentence summarizing all the actions.
+ - summarized_observations: A few sentences summarizing all the observations.
+ """
+
+ summarized_actions: str = ''
+ summarized_observations: str = ''
action: str = ActionType.SUMMARIZE
+ last_summarized_event_id = -1
@property
def message(self) -> str:
- return self.summary
+ return self.summarized_observations
def __str__(self) -> str:
ret = '**AgentSummarizeAction**\n'
- ret += f'SUMMARY: {self.summary}'
+ ret += f'SUMMARIZED ACTIONS: {self.summarized_actions}\n'
+ ret += f'SUMMARIZED OBSERVATIONS: {self.summarized_observations}\n'
return ret
diff --git a/opendevin/events/action/commands.py b/opendevin/events/action/commands.py
index 0edccf82982e..e1381c7f0944 100644
--- a/opendevin/events/action/commands.py
+++ b/opendevin/events/action/commands.py
@@ -10,6 +10,14 @@
class CmdRunAction(Action):
command: str
thought: str = ''
+ keep_prompt: bool = True
+ # if True, the command prompt will be kept in the command output observation
+ # Example of command output:
+ # root@sandbox:~# ls
+ # file1.txt
+ # file2.txt
+ # root@sandbox:~# <-- this is the command prompt
+
action: str = ActionType.RUN
runnable: ClassVar[bool] = True
is_confirmed: ActionConfirmationStatus = ActionConfirmationStatus.CONFIRMED
@@ -33,7 +41,6 @@ class IPythonRunCellAction(Action):
action: str = ActionType.RUN_IPYTHON
runnable: ClassVar[bool] = True
is_confirmed: ActionConfirmationStatus = ActionConfirmationStatus.CONFIRMED
- kernel_init_code: str = '' # code to run in the kernel (if the kernel is restarted)
def __str__(self) -> str:
ret = '**IPythonRunCellAction**\n'
diff --git a/opendevin/events/action/message.py b/opendevin/events/action/message.py
index 724821c24337..b235dd8687b6 100644
--- a/opendevin/events/action/message.py
+++ b/opendevin/events/action/message.py
@@ -8,6 +8,7 @@
@dataclass
class MessageAction(Action):
content: str
+ images_urls: list | None = None
wait_for_response: bool = False
action: str = ActionType.MESSAGE
@@ -18,4 +19,7 @@ def message(self) -> str:
def __str__(self) -> str:
ret = f'**MessageAction** (source={self.source})\n'
ret += f'CONTENT: {self.content}'
+ if self.images_urls:
+ for url in self.images_urls:
+ ret += f'\nIMAGE_URL: {url}'
return ret
diff --git a/opendevin/events/event.py b/opendevin/events/event.py
index 7cf6d4accd3a..d4a7f915e02f 100644
--- a/opendevin/events/event.py
+++ b/opendevin/events/event.py
@@ -39,3 +39,13 @@ def cause(self) -> int | None:
if hasattr(self, '_cause'):
return self._cause # type: ignore[attr-defined]
return None
+
+ @property
+ def timeout(self) -> int | None:
+ if hasattr(self, '_timeout'):
+ return self._timeout # type: ignore[attr-defined]
+ return None
+
+ @timeout.setter
+ def timeout(self, value: int | None) -> None:
+ self._timeout = value
diff --git a/opendevin/events/observation/__init__.py b/opendevin/events/observation/__init__.py
index a6c7405ec3c0..fb498fcc3825 100644
--- a/opendevin/events/observation/__init__.py
+++ b/opendevin/events/observation/__init__.py
@@ -6,7 +6,7 @@
from .error import ErrorObservation
from .files import FileReadObservation, FileWriteObservation
from .observation import Observation
-from .reject import RejectObservation
+from .reject import UserRejectObservation
from .success import SuccessObservation
__all__ = [
@@ -21,5 +21,5 @@
'AgentStateChangedObservation',
'AgentDelegateObservation',
'SuccessObservation',
- 'RejectObservation',
+ 'UserRejectObservation',
]
diff --git a/opendevin/events/observation/browse.py b/opendevin/events/observation/browse.py
index eaf44ac101ed..3ac00c8539b7 100644
--- a/opendevin/events/observation/browse.py
+++ b/opendevin/events/observation/browse.py
@@ -11,17 +11,12 @@ class BrowserOutputObservation(Observation):
url: str
screenshot: str = field(repr=False) # don't show in repr
- status_code: int = 200
error: bool = False
observation: str = ObservationType.BROWSE
# do not include in the memory
open_pages_urls: list = field(default_factory=list)
active_page_index: int = -1
- dom_object: dict = field(default_factory=dict, repr=False) # don't show in repr
- axtree_object: dict = field(default_factory=dict, repr=False) # don't show in repr
- extra_element_properties: dict = field(
- default_factory=dict, repr=False
- ) # don't show in repr
+ axtree_txt: str = ''
last_browser_action: str = ''
last_browser_action_error: str = ''
focused_element_bid: str = ''
@@ -34,7 +29,6 @@ def __str__(self) -> str:
return (
'**BrowserOutputObservation**\n'
f'URL: {self.url}\n'
- f'Status code: {self.status_code}\n'
f'Error: {self.error}\n'
f'Open pages: {self.open_pages_urls}\n'
f'Active page index: {self.active_page_index}\n'
diff --git a/opendevin/events/observation/reject.py b/opendevin/events/observation/reject.py
index d337889f7de6..2bb34ba48fa3 100644
--- a/opendevin/events/observation/reject.py
+++ b/opendevin/events/observation/reject.py
@@ -6,7 +6,7 @@
@dataclass
-class RejectObservation(Observation):
+class UserRejectObservation(Observation):
"""This data class represents the result of a successful action."""
observation: str = ObservationType.USER_REJECTED
diff --git a/opendevin/events/serialization/action.py b/opendevin/events/serialization/action.py
index f8051842eeb2..3f7a8265af3c 100644
--- a/opendevin/events/serialization/action.py
+++ b/opendevin/events/serialization/action.py
@@ -4,6 +4,7 @@
AgentDelegateAction,
AgentFinishAction,
AgentRejectAction,
+ AgentSummarizeAction,
ChangeAgentStateAction,
)
from opendevin.events.action.browse import BrowseInteractiveAction, BrowseURLAction
@@ -31,6 +32,7 @@
ModifyTaskAction,
ChangeAgentStateAction,
MessageAction,
+ AgentSummarizeAction,
)
ACTION_TYPE_TO_CLASS = {action_class.action: action_class for action_class in actions} # type: ignore[attr-defined]
@@ -54,6 +56,10 @@ def action_from_dict(action: dict) -> Action:
args = action.get('args', {})
try:
decoded_action = action_class(**args)
- except TypeError:
- raise LLMMalformedActionError(f'action={action} has the wrong arguments')
+ if 'timeout' in action:
+ decoded_action.timeout = action['timeout']
+ except TypeError as e:
+ raise LLMMalformedActionError(
+ f'Error creating {action_class} from {action=}: {e}'
+ )
return decoded_action
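The deserialization change above keeps `timeout` out of the constructor arguments: it sits next to `action` and `args` in the wire format and is applied to the instance after construction, while any `TypeError` is re-raised with the offending payload in the message. A hedged sketch of that flow with hypothetical stand-in classes (the real registry is built from the imported action classes):

```python
from dataclasses import dataclass


class DemoMalformedActionError(Exception):
    """Hypothetical stand-in for LLMMalformedActionError."""


@dataclass
class DemoRunAction:
    command: str = ''
    action: str = 'run'
    timeout: int | None = None


DEMO_ACTION_TYPE_TO_CLASS = {'run': DemoRunAction}


def demo_action_from_dict(action: dict):
    action_class = DEMO_ACTION_TYPE_TO_CLASS[action['action']]
    args = action.get('args', {})
    try:
        decoded_action = action_class(**args)
        # 'timeout' travels next to 'action'/'args' in the wire format,
        # so it is applied after construction rather than passed as an arg
        if 'timeout' in action:
            decoded_action.timeout = action['timeout']
    except TypeError as e:
        raise DemoMalformedActionError(
            f'Error creating {action_class} from {action=}: {e}'
        )
    return decoded_action


print(demo_action_from_dict({'action': 'run', 'args': {'command': 'ls'}, 'timeout': 30}))
```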
diff --git a/opendevin/events/serialization/event.py b/opendevin/events/serialization/event.py
index d50c1f06eb4c..aaa9f4b23572 100644
--- a/opendevin/events/serialization/event.py
+++ b/opendevin/events/serialization/event.py
@@ -14,14 +14,12 @@
DELETE_FROM_MEMORY_EXTRAS = {
'screenshot',
- 'dom_object',
- 'axtree_object',
+ 'axtree_txt',
'open_pages_urls',
'active_page_index',
'last_browser_action',
'last_browser_action_error',
'focused_element_bid',
- 'extra_element_properties',
}
@@ -61,6 +59,8 @@ def event_to_dict(event: 'Event') -> dict:
props.pop(key, None)
if 'action' in d:
d['args'] = props
+ if event.timeout is not None:
+ d['timeout'] = event.timeout
elif 'observation' in d:
d['content'] = props.pop('content', '')
d['extras'] = props
@@ -75,6 +75,7 @@ def event_to_memory(event: 'Event', max_message_chars: int) -> dict:
d.pop('cause', None)
d.pop('timestamp', None)
d.pop('message', None)
+ d.pop('images_urls', None)
if 'extras' in d:
remove_fields(d['extras'], DELETE_FROM_MEMORY_EXTRAS)
if isinstance(event, Observation) and 'content' in d:
diff --git a/opendevin/events/serialization/observation.py b/opendevin/events/serialization/observation.py
index 1ff31f37fdc8..37183e9a21e3 100644
--- a/opendevin/events/serialization/observation.py
+++ b/opendevin/events/serialization/observation.py
@@ -9,7 +9,7 @@
from opendevin.events.observation.error import ErrorObservation
from opendevin.events.observation.files import FileReadObservation, FileWriteObservation
from opendevin.events.observation.observation import Observation
-from opendevin.events.observation.reject import RejectObservation
+from opendevin.events.observation.reject import UserRejectObservation
from opendevin.events.observation.success import SuccessObservation
observations = (
@@ -23,7 +23,7 @@
SuccessObservation,
ErrorObservation,
AgentStateChangedObservation,
- RejectObservation,
+ UserRejectObservation,
)
OBSERVATION_TYPE_TO_CLASS = {
diff --git a/opendevin/events/stream.py b/opendevin/events/stream.py
index 5a6f90998b98..054ca40af288 100644
--- a/opendevin/events/stream.py
+++ b/opendevin/events/stream.py
@@ -22,16 +22,16 @@ class EventStreamSubscriber(str, Enum):
class EventStream:
sid: str
+ file_store: FileStore
# For each subscriber ID, there is a stack of callback functions - useful
# when there are agent delegates
_subscribers: dict[str, list[Callable]]
_cur_id: int
_lock: threading.Lock
- _file_store: FileStore
def __init__(self, sid: str, file_store: FileStore):
self.sid = sid
- self._file_store = file_store
+ self.file_store = file_store
self._subscribers = {}
self._cur_id = 0
self._lock = threading.Lock()
@@ -39,7 +39,7 @@ def __init__(self, sid: str, file_store: FileStore):
def _reinitialize_from_file_store(self) -> None:
try:
- events = self._file_store.list(f'sessions/{self.sid}/events')
+ events = self.file_store.list(f'sessions/{self.sid}/events')
except FileNotFoundError:
logger.debug(f'No events found for session {self.sid}')
self._cur_id = 0
@@ -100,7 +100,7 @@ def get_events(
def get_event(self, id: int) -> Event:
filename = self._get_filename_for_id(id)
- content = self._file_store.read(filename)
+ content = self.file_store.read(filename)
data = json.loads(content)
return event_from_dict(data)
@@ -136,9 +136,7 @@ def add_event(self, event: Event, source: EventSource):
event._source = source # type: ignore [attr-defined]
data = event_to_dict(event)
if event.id is not None:
- self._file_store.write(
- self._get_filename_for_id(event.id), json.dumps(data)
- )
+ self.file_store.write(self._get_filename_for_id(event.id), json.dumps(data))
for stack in self._subscribers.values():
callback = stack[-1]
asyncio.create_task(callback(event))
@@ -149,7 +147,7 @@ def filtered_events_by_source(self, source: EventSource):
yield event
def clear(self):
- self._file_store.delete(f'sessions/{self.sid}')
+ self.file_store.delete(f'sessions/{self.sid}')
self._cur_id = 0
# self._subscribers = {}
self._reinitialize_from_file_store()
diff --git a/opendevin/llm/llm.py b/opendevin/llm/llm.py
index 5d4e8b55ab0d..d54244538b67 100644
--- a/opendevin/llm/llm.py
+++ b/opendevin/llm/llm.py
@@ -1,8 +1,11 @@
+import asyncio
import copy
import warnings
from functools import partial
+from typing import Optional
from opendevin.core.config import LLMConfig
+from opendevin.core.message import Message
with warnings.catch_warnings():
warnings.simplefilter('ignore')
@@ -13,6 +16,7 @@
APIConnectionError,
ContentPolicyViolationError,
InternalServerError,
+ OpenAIError,
RateLimitError,
ServiceUnavailableError,
)
@@ -24,16 +28,20 @@
wait_random_exponential,
)
+from opendevin.condenser.condenser import CondenserMixin
+from opendevin.core.exceptions import (
+ ContextWindowLimitExceededError,
+ TokenLimitExceededError,
+ UserCancelledError,
+)
from opendevin.core.logger import llm_prompt_logger, llm_response_logger
from opendevin.core.logger import opendevin_logger as logger
from opendevin.core.metrics import Metrics
-__all__ = ['LLM']
-
message_separator = '\n\n----------\n\n'
-class LLM:
+class LLM(CondenserMixin):
"""The LLM class represents a Language Model instance.
Attributes:
@@ -56,6 +64,9 @@ def __init__(
self.metrics = metrics if metrics is not None else Metrics()
self.cost_metric_supported = True
+ # Set up config attributes with default values to prevent AttributeError
+ LLMConfig.set_missing_attributes(self.config)
+
# litellm actually uses base Exception here for unknown model
self.model_info = None
try:
@@ -80,7 +91,6 @@ def __init__(
else:
# Max input tokens for gpt3.5, so this is a safe fallback for any potentially viable model
self.config.max_input_tokens = 4096
-
if config.max_output_tokens is None:
if (
self.model_info is not None
@@ -95,6 +105,12 @@ def __init__(
if self.config.drop_params:
litellm.drop_params = self.config.drop_params
+ if self.config.model.startswith('ollama'):
+ max_input_tokens = self.config.max_input_tokens
+ max_output_tokens = self.config.max_output_tokens
+ if max_input_tokens and max_output_tokens:
+ litellm.OllamaConfig.num_ctx = max_input_tokens + max_output_tokens
+
self._completion = partial(
litellm_completion,
model=self.config.model,
@@ -108,7 +124,7 @@ def __init__(
top_p=self.config.top_p,
)
- completion_unwrapped = self._completion
+ self.completion_unwrapped = self._completion
def attempt_on_error(retry_state):
logger.error(
@@ -144,14 +160,56 @@ def wrapper(*args, **kwargs):
else:
messages = args[1]
+ try:
+ if self.is_over_token_limit(messages):
+ raise TokenLimitExceededError()
+ except TokenLimitExceededError:
+ # If we are over the token limit, run the condenser to produce a summary action
+ if kwargs['condense'] and self.is_over_token_limit(messages):
+ # A separate call to run a summarizer
+ summary_action = self.condense(messages=messages)
+ return summary_action
+ else:
+ print('step() failed with an unrecognized exception:')
+ raise ContextWindowLimitExceededError()
+
+ kwargs.pop('condense', None)
+ if isinstance(messages[0], Message):
+ messages = [message.model_dump() for message in messages]
+ kwargs['messages'] = messages
+
# log the prompt
debug_message = ''
for message in messages:
- debug_message += message_separator + message['content']
+ content = message['content']
+
+ if isinstance(content, list):
+ for element in content:
+ if isinstance(element, dict):
+ if 'text' in element:
+ content_str = element['text'].strip()
+ elif (
+ 'image_url' in element and 'url' in element['image_url']
+ ):
+ content_str = element['image_url']['url']
+ else:
+ content_str = str(element)
+ else:
+ content_str = str(element)
+
+ debug_message += message_separator + content_str
+ else:
+ content_str = str(content)
+
+ debug_message += message_separator + content_str
+
llm_prompt_logger.debug(debug_message)
- # call the completion function
- resp = completion_unwrapped(*args, **kwargs)
+ # skip if messages is empty (thus debug_message is empty)
+ if debug_message:
+ resp = self.completion_unwrapped(*args, **kwargs)
+ else:
+ resp = {'choices': [{'message': {'content': ''}}]}
# log the response
message_back = resp['choices'][0]['message']['content']
@@ -163,6 +221,207 @@ def wrapper(*args, **kwargs):
self._completion = wrapper # type: ignore
+ # Async version
+ self._async_completion = partial(
+ self._call_acompletion,
+ model=self.config.model,
+ api_key=self.config.api_key,
+ base_url=self.config.base_url,
+ api_version=self.config.api_version,
+ custom_llm_provider=self.config.custom_llm_provider,
+ max_tokens=self.config.max_output_tokens,
+ timeout=self.config.timeout,
+ temperature=self.config.temperature,
+ top_p=self.config.top_p,
+ drop_params=True,
+ )
+
+ async_completion_unwrapped = self._async_completion
+
+ @retry(
+ reraise=True,
+ stop=stop_after_attempt(self.config.num_retries),
+ wait=wait_random_exponential(
+ multiplier=self.config.retry_multiplier,
+ min=self.config.retry_min_wait,
+ max=self.config.retry_max_wait,
+ ),
+ retry=retry_if_exception_type(
+ (
+ RateLimitError,
+ APIConnectionError,
+ ServiceUnavailableError,
+ InternalServerError,
+ ContentPolicyViolationError,
+ )
+ ),
+ after=attempt_on_error,
+ )
+ async def async_completion_wrapper(*args, **kwargs):
+ """Async wrapper for the litellm acompletion function."""
+ # some callers might just send the messages directly
+ if 'messages' in kwargs:
+ messages = kwargs['messages']
+ else:
+ messages = args[1]
+
+ # log the prompt
+ debug_message = ''
+ for message in messages:
+ content = message['content']
+
+ if isinstance(content, list):
+ for element in content:
+ if isinstance(element, dict):
+ if 'text' in element:
+ content_str = element['text']
+ elif (
+ 'image_url' in element and 'url' in element['image_url']
+ ):
+ content_str = element['image_url']['url']
+ else:
+ content_str = str(element)
+ else:
+ content_str = str(element)
+
+ debug_message += message_separator + content_str
+ else:
+ content_str = str(content)
+
+ debug_message += message_separator + content_str
+
+ llm_prompt_logger.debug(debug_message)
+
+ async def check_stopped():
+ while True:
+ if (
+ hasattr(self.config, 'on_cancel_requested_fn')
+ and self.config.on_cancel_requested_fn is not None
+ and await self.config.on_cancel_requested_fn()
+ ):
+ raise UserCancelledError('LLM request cancelled by user')
+ await asyncio.sleep(0.1)
+
+ stop_check_task = asyncio.create_task(check_stopped())
+
+ try:
+ # Directly call and await litellm_acompletion
+ resp = await async_completion_unwrapped(*args, **kwargs)
+
+ # skip if messages is empty (thus debug_message is empty)
+ if debug_message:
+ message_back = resp['choices'][0]['message']['content']
+ llm_response_logger.debug(message_back)
+ else:
+ resp = {'choices': [{'message': {'content': ''}}]}
+ self._post_completion(resp)
+
+ # We do not support streaming in this method, thus return resp
+ return resp
+
+ except UserCancelledError:
+ logger.info('LLM request cancelled by user.')
+ raise
+ except OpenAIError as e:
+ logger.error(f'OpenAIError occurred:\n{e}')
+ raise
+ except (
+ RateLimitError,
+ APIConnectionError,
+ ServiceUnavailableError,
+ InternalServerError,
+ ) as e:
+ logger.error(f'Completion Error occurred:\n{e}')
+ raise
+
+ finally:
+ await asyncio.sleep(0.1)
+ stop_check_task.cancel()
+ try:
+ await stop_check_task
+ except asyncio.CancelledError:
+ pass
+
+ @retry(
+ reraise=True,
+ stop=stop_after_attempt(self.config.num_retries),
+ wait=wait_random_exponential(
+ multiplier=self.config.retry_multiplier,
+ min=self.config.retry_min_wait,
+ max=self.config.retry_max_wait,
+ ),
+ retry=retry_if_exception_type(
+ (
+ RateLimitError,
+ APIConnectionError,
+ ServiceUnavailableError,
+ InternalServerError,
+ ContentPolicyViolationError,
+ )
+ ),
+ after=attempt_on_error,
+ )
+ async def async_acompletion_stream_wrapper(*args, **kwargs):
+ """Async wrapper for the litellm acompletion with streaming function."""
+ # some callers might just send the messages directly
+ if 'messages' in kwargs:
+ messages = kwargs['messages']
+ else:
+ messages = args[1]
+
+ # log the prompt
+ debug_message = ''
+ for message in messages:
+ debug_message += message_separator + message['content']
+ llm_prompt_logger.debug(debug_message)
+
+ try:
+ # Directly call and await litellm_acompletion
+ resp = await async_completion_unwrapped(*args, **kwargs)
+
+ # For streaming we iterate over the chunks
+ async for chunk in resp:
+ # Check for cancellation before yielding the chunk
+ if (
+ hasattr(self.config, 'on_cancel_requested_fn')
+ and self.config.on_cancel_requested_fn is not None
+ and await self.config.on_cancel_requested_fn()
+ ):
+ raise UserCancelledError(
+ 'LLM request cancelled due to CANCELLED state'
+ )
+ # with streaming, it is "delta", not "message"!
+ message_back = chunk['choices'][0]['delta']['content']
+ llm_response_logger.debug(message_back)
+ self._post_completion(chunk)
+
+ yield chunk
+
+ except UserCancelledError:
+ logger.info('LLM request cancelled by user.')
+ raise
+ except OpenAIError as e:
+ logger.error(f'OpenAIError occurred:\n{e}')
+ raise
+ except (
+ RateLimitError,
+ APIConnectionError,
+ ServiceUnavailableError,
+ InternalServerError,
+ ) as e:
+ logger.error(f'Completion Error occurred:\n{e}')
+ raise
+
+ finally:
+ if kwargs.get('stream', False):
+ await asyncio.sleep(0.1)
+
+ self._async_completion = async_completion_wrapper # type: ignore
+ self._async_streaming_completion = async_acompletion_stream_wrapper # type: ignore
+
+ async def _call_acompletion(self, *args, **kwargs):
+ return await litellm.acompletion(*args, **kwargs)
+
@property
def completion(self):
"""Decorator for the litellm completion function.
@@ -171,6 +430,25 @@ def completion(self):
"""
return self._completion
+ @property
+ def async_completion(self):
+ """Decorator for the async litellm acompletion function.
+
+ Check the complete documentation at https://litellm.vercel.app/docs/providers/ollama#example-usage---streaming--acompletion
+ """
+ return self._async_completion
+
+ @property
+ def async_streaming_completion(self):
+ """Decorator for the async litellm acompletion function with streaming.
+
+ Check the complete documentation at https://litellm.vercel.app/docs/providers/ollama#example-usage---streaming--acompletion
+ """
+ return self._async_streaming_completion
+
+ def supports_vision(self):
+ return litellm.supports_vision(self.config.model)
+
def _post_completion(self, response: str) -> None:
"""Post-process the completion response."""
try:
@@ -184,7 +462,9 @@ def _post_completion(self, response: str) -> None:
self.metrics.accumulated_cost,
)
- def get_token_count(self, messages):
+ def get_token_count(
+ self, messages: Optional[list[Message]] = None, text: Optional[str] = None
+ ) -> int:
"""Get the number of tokens in a list of messages.
Args:
@@ -193,7 +473,11 @@ def get_token_count(self, messages):
Returns:
int: The number of tokens.
"""
- return litellm.token_counter(model=self.config.model, messages=messages)
+ if messages and isinstance(messages[0], Message):
+ messages = [m.model_dump() for m in messages]
+ return litellm.token_counter(
+ model=self.config.model, messages=messages, text=text
+ )
def is_local(self):
"""Determines if the system is using a locally running LLM.
@@ -259,3 +543,28 @@ def __repr__(self):
def reset(self):
self.metrics = Metrics()
+
+ def is_over_token_limit(self, messages: list[Message]) -> bool:
+ """
+ Estimates the token count of the given messages using the litellm tokenizer and returns True if it reaches the max_input_tokens value.
+
+ Parameters:
+ - messages: List of messages to estimate the token count for.
+
+ Returns:
+ - True if the padded token estimate reaches max_input_tokens, False otherwise.
+ """
+ # max_input_tokens will always be set in init to some sensible default
+ # 0 in config.llm disables the check
+ MAX_TOKEN_COUNT_PADDING = 512
+ if not self.config.max_input_tokens:
+ return False
+ token_count = self.get_token_count(messages=messages) + MAX_TOKEN_COUNT_PADDING
+ logger.debug(f'Token count: {token_count}')
+ return token_count >= self.config.max_input_tokens
+
+ def get_text_messages(self, messages: list[Message]) -> list[dict]:
+ text_messages = []
+ for message in messages:
+ text_messages.append(message.model_dump())
+ return text_messages
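As a rough illustration of the condensation trigger: `is_over_token_limit` pads the counted tokens and compares against `max_input_tokens`, with a falsy limit disabling the check. The sketch below is self-contained and uses a naive characters-per-token estimate in place of `litellm.token_counter`, which is only an assumption for the demo:

```python
MAX_TOKEN_COUNT_PADDING = 512  # same padding constant as above


def approx_token_count(messages: list[dict]) -> int:
    # crude ~4 characters per token heuristic, purely for the demo
    return sum(len(str(m.get('content', ''))) // 4 for m in messages)


def is_over_token_limit(messages: list[dict], max_input_tokens: int | None) -> bool:
    if not max_input_tokens:  # None or 0 disables the check
        return False
    token_count = approx_token_count(messages) + MAX_TOKEN_COUNT_PADDING
    return token_count >= max_input_tokens


messages = [{'role': 'user', 'content': 'summarize this repository ' * 1000}]
print(is_over_token_limit(messages, 4096))   # True: padded estimate exceeds the limit
print(is_over_token_limit(messages, 0))      # False: a limit of 0 disables the check
```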
diff --git a/opendevin/memory/__init__.py b/opendevin/memory/__init__.py
index 9a705a24ac77..4c1a84f59f0c 100644
--- a/opendevin/memory/__init__.py
+++ b/opendevin/memory/__init__.py
@@ -1,5 +1,4 @@
-from .condenser import MemoryCondenser
from .history import ShortTermHistory
from .memory import LongTermMemory
-__all__ = ['LongTermMemory', 'ShortTermHistory', 'MemoryCondenser']
+__all__ = ['LongTermMemory', 'ShortTermHistory']
diff --git a/opendevin/memory/condenser.py b/opendevin/memory/condenser.py
deleted file mode 100644
index e0cef4ac3815..000000000000
--- a/opendevin/memory/condenser.py
+++ /dev/null
@@ -1,24 +0,0 @@
-from opendevin.core.logger import opendevin_logger as logger
-from opendevin.llm.llm import LLM
-
-
-class MemoryCondenser:
- def condense(self, summarize_prompt: str, llm: LLM):
- """Attempts to condense the memory by using the llm
-
- Parameters:
- - llm (LLM): llm to be used for summarization
-
- Raises:
- - Exception: the same exception as it got from the llm or processing the response
- """
- try:
- messages = [{'content': summarize_prompt, 'role': 'user'}]
- resp = llm.completion(messages=messages)
- summary_response = resp['choices'][0]['message']['content']
- return summary_response
- except Exception as e:
- logger.error('Error condensing thoughts: %s', str(e), exc_info=False)
-
- # TODO If the llm fails with ContextWindowExceededError, we can try to condense the memory chunk by chunk
- raise
diff --git a/opendevin/memory/history.py b/opendevin/memory/history.py
index cbe186f387b6..beb0fb38bcf5 100644
--- a/opendevin/memory/history.py
+++ b/opendevin/memory/history.py
@@ -4,6 +4,7 @@
from opendevin.events.action.action import Action
from opendevin.events.action.agent import (
AgentDelegateAction,
+ AgentSummarizeAction,
ChangeAgentStateAction,
)
from opendevin.events.action.empty import NullAction
@@ -40,6 +41,12 @@ def __init__(self):
self.start_id = -1
self.end_id = -1
self.delegates = {}
+ self.summary = None
+ self.last_summarized_event_id = -1
+
+ def add_summary(self, summary_action: AgentSummarizeAction):
+ self.summary = summary_action
+ self.last_summarized_event_id = summary_action.last_summarized_event_id
def set_event_stream(self, event_stream: EventStream):
self._event_stream = event_stream
@@ -60,24 +67,29 @@ def get_events(self, reverse: bool = False) -> Iterable[Event]:
if self.end_id != -1
else self._event_stream.get_latest_event_id()
)
-
+ summary_yielded = False
for event in self._event_stream.get_events(
start_id=start_id,
end_id=end_id,
reverse=reverse,
filter_out_type=self.filter_out,
):
- # TODO add summaries
- # and filter out events that were included in a summary
-
# filter out the events from a delegate of the current agent
- if not any(
+ if (
+ self.summary is not None
+ and event.id <= self.last_summarized_event_id
+ and not summary_yielded
+ ):
+ summary_action = self.summary
+ summary_yielded = True
+ yield summary_action
+ elif not any(
# except for the delegate action and observation themselves, currently
# AgentDelegateAction has id = delegate_start
# AgentDelegateObservation has id = delegate_end
delegate_start < event.id < delegate_end
for delegate_start, delegate_end in self.delegates.keys()
- ):
+ ) and (event.id > self.last_summarized_event_id):
yield event
def get_last_action(self, end_id: int = -1) -> Action | None:
diff --git a/opendevin/memory/memory.py b/opendevin/memory/memory.py
index a175c36d3f95..f000060cc088 100644
--- a/opendevin/memory/memory.py
+++ b/opendevin/memory/memory.py
@@ -1,10 +1,5 @@
import threading
-import chromadb
-import llama_index.embeddings.openai.base as llama_openai
-from llama_index.core import Document, VectorStoreIndex
-from llama_index.core.retrievers import VectorIndexRetriever
-from llama_index.vector_stores.chroma import ChromaVectorStore
from openai._exceptions import APIConnectionError, InternalServerError, RateLimitError
from tenacity import (
retry,
@@ -17,94 +12,102 @@
from opendevin.core.logger import opendevin_logger as logger
from opendevin.core.utils import json
-# TODO: this could be made configurable
-num_retries: int = 10
-retry_min_wait: int = 3
-retry_max_wait: int = 300
-
-# llama-index includes a retry decorator around openai.get_embeddings() function
-# it is initialized with hard-coded values and errors
-# this non-customizable behavior is creating issues when it's retrying faster than providers' rate limits
-# this block attempts to banish it and replace it with our decorator, to allow users to set their own limits
-
-if hasattr(llama_openai.get_embeddings, '__wrapped__'):
- original_get_embeddings = llama_openai.get_embeddings.__wrapped__
-else:
- logger.warning('Cannot set custom retry limits.')
- num_retries = 1
- original_get_embeddings = llama_openai.get_embeddings
-
-
-def attempt_on_error(retry_state):
- logger.error(
- f'{retry_state.outcome.exception()}. Attempt #{retry_state.attempt_number} | You can customize these settings in the configuration.',
- exc_info=False,
+try:
+ import chromadb
+ import llama_index.embeddings.openai.base as llama_openai
+ from llama_index.core import Document, VectorStoreIndex
+ from llama_index.core.retrievers import VectorIndexRetriever
+ from llama_index.vector_stores.chroma import ChromaVectorStore
+
+ LLAMA_INDEX_AVAILABLE = True
+except ImportError:
+ LLAMA_INDEX_AVAILABLE = False
+
+if LLAMA_INDEX_AVAILABLE:
+ # TODO: this could be made configurable
+ num_retries: int = 10
+ retry_min_wait: int = 3
+ retry_max_wait: int = 300
+
+ # llama-index includes a retry decorator around openai.get_embeddings() function
+ # it is initialized with hard-coded values and errors
+ # this non-customizable behavior is creating issues when it's retrying faster than providers' rate limits
+ # this block attempts to banish it and replace it with our decorator, to allow users to set their own limits
+
+ if hasattr(llama_openai.get_embeddings, '__wrapped__'):
+ original_get_embeddings = llama_openai.get_embeddings.__wrapped__
+ else:
+ logger.warning('Cannot set custom retry limits.')
+ num_retries = 1
+ original_get_embeddings = llama_openai.get_embeddings
+
+ def attempt_on_error(retry_state):
+ logger.error(
+ f'{retry_state.outcome.exception()}. Attempt #{retry_state.attempt_number} | You can customize these settings in the configuration.',
+ exc_info=False,
+ )
+ return None
+
+ @retry(
+ reraise=True,
+ stop=stop_after_attempt(num_retries),
+ wait=wait_random_exponential(min=retry_min_wait, max=retry_max_wait),
+ retry=retry_if_exception_type(
+ (RateLimitError, APIConnectionError, InternalServerError)
+ ),
+ after=attempt_on_error,
)
- return None
-
-
-@retry(
- reraise=True,
- stop=stop_after_attempt(num_retries),
- wait=wait_random_exponential(min=retry_min_wait, max=retry_max_wait),
- retry=retry_if_exception_type(
- (RateLimitError, APIConnectionError, InternalServerError)
- ),
- after=attempt_on_error,
-)
-def wrapper_get_embeddings(*args, **kwargs):
- return original_get_embeddings(*args, **kwargs)
-
-
-llama_openai.get_embeddings = wrapper_get_embeddings
-
-
-class EmbeddingsLoader:
- """Loader for embedding model initialization."""
-
- @staticmethod
- def get_embedding_model(strategy: str, llm_config: LLMConfig):
- supported_ollama_embed_models = [
- 'llama2',
- 'mxbai-embed-large',
- 'nomic-embed-text',
- 'all-minilm',
- 'stable-code',
- ]
- if strategy in supported_ollama_embed_models:
- from llama_index.embeddings.ollama import OllamaEmbedding
-
- return OllamaEmbedding(
- model_name=strategy,
- base_url=llm_config.embedding_base_url,
- ollama_additional_kwargs={'mirostat': 0},
- )
- elif strategy == 'openai':
- from llama_index.embeddings.openai import OpenAIEmbedding
-
- return OpenAIEmbedding(
- model='text-embedding-ada-002',
- api_key=llm_config.api_key,
- )
- elif strategy == 'azureopenai':
- from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
-
- return AzureOpenAIEmbedding(
- model='text-embedding-ada-002',
- deployment_name=llm_config.embedding_deployment_name,
- api_key=llm_config.api_key,
- azure_endpoint=llm_config.base_url,
- api_version=llm_config.api_version,
- )
- elif (strategy is not None) and (strategy.lower() == 'none'):
- # TODO: this works but is not elegant enough. The incentive is when
- # an agent using embeddings is not used, there is no reason we need to
- # initialize an embedding model
- return None
- else:
- from llama_index.embeddings.huggingface import HuggingFaceEmbedding
-
- return HuggingFaceEmbedding(model_name='BAAI/bge-small-en-v1.5')
+ def wrapper_get_embeddings(*args, **kwargs):
+ return original_get_embeddings(*args, **kwargs)
+
+ llama_openai.get_embeddings = wrapper_get_embeddings
+
+ class EmbeddingsLoader:
+ """Loader for embedding model initialization."""
+
+ @staticmethod
+ def get_embedding_model(strategy: str, llm_config: LLMConfig):
+ supported_ollama_embed_models = [
+ 'llama2',
+ 'mxbai-embed-large',
+ 'nomic-embed-text',
+ 'all-minilm',
+ 'stable-code',
+ ]
+ if strategy in supported_ollama_embed_models:
+ from llama_index.embeddings.ollama import OllamaEmbedding
+
+ return OllamaEmbedding(
+ model_name=strategy,
+ base_url=llm_config.embedding_base_url,
+ ollama_additional_kwargs={'mirostat': 0},
+ )
+ elif strategy == 'openai':
+ from llama_index.embeddings.openai import OpenAIEmbedding
+
+ return OpenAIEmbedding(
+ model='text-embedding-ada-002',
+ api_key=llm_config.api_key,
+ )
+ elif strategy == 'azureopenai':
+ from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
+
+ return AzureOpenAIEmbedding(
+ model='text-embedding-ada-002',
+ deployment_name=llm_config.embedding_deployment_name,
+ api_key=llm_config.api_key,
+ azure_endpoint=llm_config.base_url,
+ api_version=llm_config.api_version,
+ )
+ elif (strategy is not None) and (strategy.lower() == 'none'):
+ # TODO: this works but is not elegant enough. The incentive is when
+ # an agent using embeddings is not used, there is no reason we need to
+ # initialize an embedding model
+ return None
+ else:
+ from llama_index.embeddings.huggingface import HuggingFaceEmbedding
+
+ return HuggingFaceEmbedding(model_name='BAAI/bge-small-en-v1.5')
class LongTermMemory:
@@ -112,6 +115,12 @@ class LongTermMemory:
def __init__(self, llm_config: LLMConfig, memory_max_threads: int = 1):
"""Initialize the chromadb and set up ChromaVectorStore for later use."""
+ if not LLAMA_INDEX_AVAILABLE:
+ raise ImportError(
+ 'llama_index and its dependencies are not installed. '
+ 'To use LongTermMemory, please run: poetry install --with llama-index'
+ )
+
db = chromadb.Client(chromadb.Settings(anonymized_telemetry=False))
self.collection = db.get_or_create_collection(name='memories')
vector_store = ChromaVectorStore(chroma_collection=self.collection)
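The memory module now follows a common optional-dependency pattern: attempt the heavy imports once, record whether they succeeded, and raise a clear, actionable error only when the feature is actually constructed. A generic sketch of that pattern (the stand-in import is purely illustrative, not llama-index):

```python
try:
    import sqlite3  # stand-in for an optional heavy dependency such as llama-index
    OPTIONAL_DEP_AVAILABLE = True
except ImportError:
    OPTIONAL_DEP_AVAILABLE = False


class DemoLongTermMemory:
    """Hypothetical feature that only works when the optional dependency is present."""

    def __init__(self) -> None:
        if not OPTIONAL_DEP_AVAILABLE:
            raise ImportError(
                'The optional dependency is not installed. '
                'Install the relevant extras group to use this feature.'
            )
        self.conn = sqlite3.connect(':memory:')


if OPTIONAL_DEP_AVAILABLE:
    print('feature available:', DemoLongTermMemory().conn is not None)
else:
    print('optional feature disabled; the rest of the module still imports fine')
```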
diff --git a/opendevin/runtime/__init__.py b/opendevin/runtime/__init__.py
index f11157e8f0b3..26d690826f2b 100644
--- a/opendevin/runtime/__init__.py
+++ b/opendevin/runtime/__init__.py
@@ -1,16 +1,10 @@
-from .docker.local_box import LocalBox
-from .docker.ssh_box import DockerSSHBox
from .e2b.sandbox import E2BBox
from .sandbox import Sandbox
def get_runtime_cls(name: str):
# Local imports to avoid circular imports
- if name == 'server':
- from .server.runtime import ServerRuntime
-
- return ServerRuntime
- elif name == 'client':
+ if name == 'eventstream':
from .client.runtime import EventStreamRuntime
return EventStreamRuntime
diff --git a/opendevin/runtime/browser/browser_env.py b/opendevin/runtime/browser/browser_env.py
index ed0611f6b21a..7d060580bd9e 100644
--- a/opendevin/runtime/browser/browser_env.py
+++ b/opendevin/runtime/browser/browser_env.py
@@ -3,8 +3,6 @@
import io
import json
import multiprocessing
-import os
-import threading
import time
import uuid
@@ -12,55 +10,32 @@
import gymnasium as gym
import html2text
import numpy as np
+import tenacity
from browsergym.utils.obs import flatten_dom_to_str
from PIL import Image
from opendevin.core.exceptions import BrowserInitException
from opendevin.core.logger import opendevin_logger as logger
+BROWSER_EVAL_GET_GOAL_ACTION = 'GET_EVAL_GOAL'
+BROWSER_EVAL_GET_REWARDS_ACTION = 'GET_EVAL_REWARDS'
+
class BrowserEnv:
- def __init__(
- self,
- is_async: bool = True,
- browsergym_eval: str = '',
- browsergym_eval_save_dir: str = '',
- ):
+ def __init__(self, browsergym_eval_env: str | None = None):
self.html_text_converter = self.get_html_text_converter()
self.eval_mode = False
self.eval_dir = ''
- # EVAL only: browsergym_eval and browsergym_eval_save_dir must be provided for evaluation
- self.browsergym_eval = browsergym_eval
- self.browsergym_eval_save_dir = browsergym_eval_save_dir
- if self.browsergym_eval:
- assert (
- self.browsergym_eval_save_dir
- ), 'browsergym_eval_save_dir must be provided for evaluation.'
- self.eval_mode = True
- self.eval_dir = os.path.join(
- self.browsergym_eval_save_dir, self.browsergym_eval.split('/')[1]
- )
- os.makedirs(self.eval_dir, exist_ok=True)
+
+ # EVAL only: browsergym_eval_env must be provided for evaluation
+ self.browsergym_eval_env = browsergym_eval_env
+ self.eval_mode = bool(browsergym_eval_env)
+
# Initialize browser environment process
multiprocessing.set_start_method('spawn', force=True)
self.browser_side, self.agent_side = multiprocessing.Pipe()
- self.process = multiprocessing.Process(
- target=self.browser_process,
- )
- try:
- self.original_cwd = os.getcwd()
- except FileNotFoundError:
- logger.warning(
- 'Current working directory does not exist. Using /tmp as fallback.'
- )
- self.original_cwd = '/tmp'
- os.chdir('/tmp')
-
- if is_async:
- threading.Thread(target=self.init_browser).start()
- else:
- self.init_browser()
+ self.init_browser()
atexit.register(self.close)
def get_html_text_converter(self):
@@ -74,20 +49,15 @@ def get_html_text_converter(self):
html_text_converter.body_width = 0
return html_text_converter
+ @tenacity.retry(
+ wait=tenacity.wait_fixed(1),
+ stop=tenacity.stop_after_attempt(5),
+ retry=tenacity.retry_if_exception_type(BrowserInitException),
+ )
def init_browser(self):
logger.info('Starting browser env...')
-
- # Ensure we're in a valid directory before starting the process
- try:
- os.chdir(self.original_cwd)
- logger.debug(f'Changed back to original directory: {self.original_cwd}')
- except Exception as e:
- logger.error(f'Failed to change to original directory: {e}')
- # If we can't change to the original directory, try to use a known valid directory
- os.chdir('/tmp')
- logger.debug('Changed to /tmp directory as fallback')
-
try:
+ self.process = multiprocessing.Process(target=self.browser_process)
self.process.start()
except Exception as e:
logger.error(f'Failed to start browser process: {e}')
@@ -99,8 +69,17 @@ def init_browser(self):
def browser_process(self):
if self.eval_mode:
- logger.info('Creating browser env for evaluation purpose.')
- env = gym.make(self.browsergym_eval)
+ assert self.browsergym_eval_env is not None
+ logger.info('Initializing browser env for web browsing evaluation.')
+ if 'webarena' in self.browsergym_eval_env:
+ import browsergym.webarena # noqa F401 register webarena tasks as gym environments
+ elif 'miniwob' in self.browsergym_eval_env:
+ import browsergym.miniwob # noqa F401 register miniwob tasks as gym environments
+ else:
+ raise ValueError(
+ f'Unsupported browsergym eval env: {self.browsergym_eval_env}'
+ )
+ env = gym.make(self.browsergym_eval_env)
else:
env = gym.make(
'browsergym/openended',
@@ -109,20 +88,22 @@ def browser_process(self):
headless=True,
disable_env_checker=True,
)
+
obs, info = env.reset()
- # EVAL only: save the goal into file for evaluation
+
+ # EVAL ONLY: keep the goal in memory so it can be returned on request
+ self.eval_goal = None
+ self.eval_rewards: list[float] = []
if self.eval_mode:
- rewards = [] # store rewards if in eval mode
- logger.info(obs['goal'])
- with open(
- os.path.join(self.eval_dir, 'goal.txt'), 'w', encoding='utf-8'
- ) as f:
- f.write(obs['goal'])
+ logger.info(f"Browsing goal: {obs['goal']}")
+ self.eval_goal = obs['goal']
+
logger.info('Browser env started.')
while True:
try:
if self.browser_side.poll(timeout=0.01):
unique_request_id, action_data = self.browser_side.recv()
+
# shutdown the browser environment
if unique_request_id == 'SHUTDOWN':
logger.info('SHUTDOWN recv, shutting down browser env...')
@@ -131,17 +112,29 @@ def browser_process(self):
elif unique_request_id == 'IS_ALIVE':
self.browser_side.send(('ALIVE', None))
continue
+
+ # EVAL ONLY: Get evaluation info
+ if action_data['action'] == BROWSER_EVAL_GET_GOAL_ACTION:
+ self.browser_side.send(
+ (unique_request_id, {'text_content': self.eval_goal})
+ )
+ continue
+ elif action_data['action'] == BROWSER_EVAL_GET_REWARDS_ACTION:
+ self.browser_side.send(
+ (
+ unique_request_id,
+ {'text_content': json.dumps(self.eval_rewards)},
+ )
+ )
+ continue
+
action = action_data['action']
obs, reward, terminated, truncated, info = env.step(action)
- # EVAL only: save the rewards into file for evaluation
+
+ # EVAL ONLY: Accumulate the rewards in memory for evaluation
if self.eval_mode:
- rewards.append(reward)
- with open(
- os.path.join(self.eval_dir, 'rewards.json'),
- 'w',
- encoding='utf-8',
- ) as f:
- f.write(json.dumps(rewards))
+ self.eval_rewards.append(reward)
+
# add text content of the page
html_str = flatten_dom_to_str(obs['dom_object'])
obs['text_content'] = self.html_text_converter.handle(html_str)
@@ -159,6 +152,7 @@ def browser_process(self):
return
def step(self, action_str: str, timeout: float = 30) -> dict:
+ """Execute an action in the browser environment and return the observation."""
unique_request_id = str(uuid.uuid4())
self.agent_side.send((unique_request_id, {'action': action_str}))
start_time = time.time()
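The browser environment talks to its subprocess over a `multiprocessing.Pipe`, tagging each request with a UUID so stale replies can be ignored and reserving sentinel ids such as `SHUTDOWN`. A small, self-contained sketch of that request/response pattern (not the real `BrowserEnv`):

```python
import multiprocessing
import time
import uuid


def worker(side):
    # subprocess loop: answer each tagged request, stop on the SHUTDOWN sentinel
    while True:
        if side.poll(timeout=0.01):
            request_id, data = side.recv()
            if request_id == 'SHUTDOWN':
                return
            side.send((request_id, {'text_content': f"echo of {data['action']}"}))


if __name__ == '__main__':
    multiprocessing.set_start_method('spawn', force=True)
    browser_side, agent_side = multiprocessing.Pipe()
    process = multiprocessing.Process(target=worker, args=(browser_side,))
    process.start()

    request_id = str(uuid.uuid4())
    agent_side.send((request_id, {'action': 'noop()'}))
    start_time = time.time()
    while time.time() - start_time < 5:        # crude timeout loop, as in step()
        if agent_side.poll(timeout=0.01):
            response_id, obs = agent_side.recv()
            if response_id == request_id:      # ignore anything with a stale id
                print(obs)
                break

    agent_side.send(('SHUTDOWN', None))
    process.join()
```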
diff --git a/opendevin/runtime/browser/utils.py b/opendevin/runtime/browser/utils.py
index 8ca73eab0cfe..f7a629520d26 100644
--- a/opendevin/runtime/browser/utils.py
+++ b/opendevin/runtime/browser/utils.py
@@ -1,6 +1,9 @@
import os
+from browsergym.utils.obs import flatten_axtree_to_str
+
from opendevin.core.exceptions import BrowserUnavailableException
+from opendevin.core.logger import opendevin_logger as logger
from opendevin.core.schema import ActionType
from opendevin.events.action import BrowseInteractiveAction, BrowseURLAction
from opendevin.events.observation import BrowserOutputObservation
@@ -28,25 +31,41 @@ async def browse(
raise ValueError(f'Invalid action type: {action.action}')
try:
- # obs provided by BrowserGym: see https://github.com/ServiceNow/BrowserGym/blob/main/core/src/browsergym/core/env.py#L396
+ # obs provided by BrowserGym:
+ # https://github.com/ServiceNow/BrowserGym/blob/main/browsergym/core/src/browsergym/core/env.py#L521
+ # https://github.com/ServiceNow/BrowserGym/blob/418421abdc5da4d77dc71d3b82a9e5e931be0c4f/browsergym/core/src/browsergym/core/env.py#L521
obs = browser.step(action_str)
+ try:
+ axtree_txt = flatten_axtree_to_str(
+ obs['axtree_object'], # accessibility tree object
+ extra_properties=obs[
+ 'extra_element_properties'
+ ], # extra element properties
+ with_clickable=True,
+ filter_visible_only=True,
+ )
+ except Exception as e:
+ logger.error(
+ f'Error when trying to process the accessibility tree: {e}, obs: {obs}'
+ )
+ axtree_txt = f'AX Error: {e}'
return BrowserOutputObservation(
content=obs['text_content'], # text content of the page
- open_pages_urls=obs['open_pages_urls'], # list of open pages
- active_page_index=obs['active_page_index'], # index of the active page
- dom_object=obs['dom_object'], # DOM object
- axtree_object=obs['axtree_object'], # accessibility tree object
- extra_element_properties=obs[
- 'extra_element_properties'
- ], # extra element properties
- last_browser_action=obs['last_action'], # last browser env action performed
- focused_element_bid=obs['focused_element_bid'], # focused element bid
- screenshot=obs['screenshot'], # base64-encoded screenshot, png
- url=obs['url'], # URL of the page
- error=True if obs['last_action_error'] else False, # error flag
- last_browser_action_error=obs[
- 'last_action_error'
- ], # last browser env action error
+ url=obs.get('url', ''), # URL of the page
+ screenshot=obs.get('screenshot', ''), # base64-encoded screenshot, png
+ open_pages_urls=obs.get('open_pages_urls', []), # list of open pages
+ active_page_index=obs.get(
+ 'active_page_index', -1
+ ), # index of the active page
+ axtree_txt=axtree_txt, # accessibility tree text
+ focused_element_bid=obs.get(
+ 'focused_element_bid', ''
+ ), # focused element bid
+ last_browser_action=obs.get(
+ 'last_action', ''
+ ), # last browser env action performed
+ last_browser_action_error=obs.get('last_action_error', ''),
+ error=True if obs.get('last_action_error') else False, # error flag
)
except Exception as e:
return BrowserOutputObservation(
diff --git a/opendevin/runtime/client/README.md b/opendevin/runtime/client/README.md
deleted file mode 100644
index 2040ab66599a..000000000000
--- a/opendevin/runtime/client/README.md
+++ /dev/null
@@ -1,110 +0,0 @@
-# OpenDevin Runtime
-
-This README provides an overview of the OpenDevin Runtime, a crucial component of the OpenDevin system. It covers two main aspects:
-
-1. How the Runtime Image is Built: Explains the layered approach to creating Docker images for both production and development environments.
-2. How the Runtime Client Works: Details the functionality and architecture of the Runtime Client, which executes actions within the Docker sandbox.
-
-The following sections dive deeper into these topics, providing a comprehensive understanding of the OpenDevin Runtime system.
-
-## How the Runtime Image is Built
-
-The OpenDevin runtime uses a layered approach for building Docker images:
-
-1. **Original Image**: `ubuntu:22.04`
- - This is the base image used for all subsequent layers.
-
-2. **Runtime Image**: `od_runtime:ubuntu__22.04`
- - Built from the stable release of OpenDevin.
- - This is the primary runtime image that users will interact with.
- - Created by copying all OpenDevin code into the original image and installing dependencies using Poetry.
-
-3. **Dev Runtime Image**: `od_runtime_dev:ubuntu__22.04`
- - Built from local source code for development purposes.
-
-### Build Process
-
-#### Production Build (DEBUG=false)
-By default, when DEBUG is set to false, the build process only needs to run once:
-- The Runtime Image (`od_runtime:ubuntu__22.04`) is created by copying OpenDevin code into the original Ubuntu image and installing all dependencies.
-- This pre-built image is then used for running the OpenDevin environment.
-
-#### Development Build (DEBUG=true)
-When developing or modifying code that runs inside the container, you can set DEBUG=true to enable a more dynamic build process:
-- Every time you run the code, the existing image will be updated with the latest changes.
-- The Dev Runtime Image (`od_runtime_dev:ubuntu__22.04`) is rebuilt from the Runtime Image (`od_runtime:ubuntu__22.04`).
-- Most dependencies are already installed in the Runtime Image, so this process mainly updates the code and any new dependencies.
-- The rebuild process typically takes around 10 seconds, allowing for quick iterations during development.
-
-This approach allows developers to easily test changes to the OpenDevin codebase, including modifications to files like client.py, without needing to rebuild the entire image from scratch each time.
-
-## How the Runtime Client Works
-
-The Runtime Client is a crucial component of the OpenDevin system, responsible for executing actions within the Docker sandbox environment and producing observations. Here's an overview of its functionality:
-
-1. **Initialization**:
- - The `EventStreamRuntime` class in `runtime.py` initializes the Docker container and sets up the runtime environment.
-
-2. **Communication**:
- - The Runtime Client uses FastAPI to create a web server inside the Docker container.
- - It listens for incoming action requests from the OpenDevin backend.
-
-3. **Action Execution**:
- - When an action is received, the Runtime Client processes it based on its type:
- - `CmdRunAction`: Executes shell commands using a pexpect-spawned bash shell.
- - `FileReadAction` and `FileWriteAction`: Perform file operations within the sandbox.
- - `IPythonRunCellAction`: Executes Python code in an IPython environment.
- - `BrowseURLAction` and `BrowseInteractiveAction`: Handle web browsing tasks using a browser environment.
-
-4. **Plugin System**:
- - The Runtime Client supports a plugin system for extending functionality.
- - Plugins like JupyterPlugin can be loaded to provide additional features.
-
-5. **Observation Generation**:
- - After executing an action, the Runtime Client generates an appropriate observation.
- - Observations include command outputs, file contents, error messages, etc.
-
-6. **Asynchronous Operation**:
- - The Runtime Client uses asyncio for avoid concurrent requests.
- - It ensures that only one action is executed at a time using a semaphore.
-
-7. **Security**:
- - All actions are executed within the confined Docker environment, providing a sandbox for safe execution.
-
-8. **Flexibility**:
- - The system supports both production (DEBUG=false) and development (DEBUG=true) modes.
- - In development mode, the runtime image can be updated with the latest code changes for testing and debugging.
-
-
-
-## Architecture Diagram
-
-```
-+-------------------+ +-------------------+
-| OpenDevin | | Docker Host |
-| Backend | | |
-| | | +-------------+ |
-| +-------------+ | | | Runtime | |
-| | EventStream | | | | Container | |
-| | Runtime |<-|-----|->| | |
-| +-------------+ | | | +-------+ | |
-| | | | |Runtime| | |
-| | | | |Client | | |
-| | | | +-------+ | |
-| | | | | | |
-| | | | +-------+ | |
-| | | | |Plugins| | |
-| | | | +-------+ | |
-| | | +-------------+ |
-+-------------------+ +-------------------+
-```
-
-This diagram illustrates the high-level architecture of the OpenDevin Runtime system:
-
-1. The OpenDevin Backend communicates with the Docker Host through the EventStreamRuntime.
-2. The Docker Host runs a Runtime Container, which includes:
- - The Runtime Client: Handles incoming actions and generates observations.
- - Plugins: Extend the functionality of the Runtime Client.
-3. The Runtime Client executes actions within the sandboxed environment of the Docker container.
-
-This architecture ensures a secure and flexible environment for executing AI-driven development tasks, allowing OpenDevin to execute a wide range of actions safely and efficiently.
diff --git a/opendevin/runtime/client/client.py b/opendevin/runtime/client/client.py
index c875ad494eab..4919bcc8a4db 100644
--- a/opendevin/runtime/client/client.py
+++ b/opendevin/runtime/client/client.py
@@ -3,20 +3,23 @@
It is responsible for executing actions received from OpenDevin backend and producing observations.
NOTE: this will be executed inside the docker sandbox.
-
-If you already have pre-build docker image yet you changed the code in this file OR dependencies, you need to rebuild the docker image to update the source code.
-
-You should add SANDBOX_UPDATE_SOURCE_CODE=True to any `python XXX.py` command you run to update the source code.
"""
import argparse
import asyncio
import os
import re
+import shutil
+import subprocess
+import sys
+from contextlib import asynccontextmanager
from pathlib import Path
import pexpect
-from fastapi import FastAPI, HTTPException, Request
+from fastapi import FastAPI, HTTPException, Request, UploadFile
+from fastapi.responses import JSONResponse
+from pathspec import PathSpec
+from pathspec.patterns import GitWildMatchPattern
from pexpect import EOF, TIMEOUT, ExceptionPexpect
from pydantic import BaseModel
from uvicorn import run
@@ -36,6 +39,7 @@
ErrorObservation,
FileReadObservation,
FileWriteObservation,
+ IPythonRunCellObservation,
Observation,
)
from opendevin.events.serialization import event_from_dict, event_to_dict
@@ -46,33 +50,128 @@
JupyterPlugin,
Plugin,
)
-from opendevin.runtime.server.files import insert_lines, read_lines
-
-app = FastAPI()
+from opendevin.runtime.utils import split_bash_commands
+from opendevin.runtime.utils.files import insert_lines, read_lines
class ActionRequest(BaseModel):
action: dict
+ROOT_GID = 0
+INIT_COMMANDS = [
+ 'git config --global user.name "opendevin"',
+ 'git config --global user.email "opendevin@all-hands.dev"',
+ "alias git='git --no-pager'",
+ 'export TERM=xterm-256color',
+ "export PATH=/opendevin/poetry/$(ls /opendevin/poetry | sed -n '2p')/bin:$PATH",
+]
+
+
class RuntimeClient:
"""RuntimeClient is running inside docker sandbox.
It is responsible for executing actions received from OpenDevin backend and producing observations.
"""
- def __init__(self, plugins_to_load: list[Plugin], work_dir: str) -> None:
- self._init_bash_shell(work_dir)
+ def __init__(
+ self,
+ plugins_to_load: list[Plugin],
+ work_dir: str,
+ username: str,
+ user_id: int,
+ browsergym_eval_env: str | None,
+ ) -> None:
+ self.plugins_to_load = plugins_to_load
+ self.username = username
+ self.user_id = user_id
+ self.pwd = work_dir # current PWD
+ self._initial_pwd = work_dir
+ self._init_user(self.username, self.user_id)
+ self._init_bash_shell(self.pwd, self.username)
self.lock = asyncio.Lock()
self.plugins: dict[str, Plugin] = {}
- self.browser = BrowserEnv()
+ self.browser = BrowserEnv(browsergym_eval_env)
+ self._initial_pwd = work_dir
- for plugin in plugins_to_load:
- plugin.initialize()
+ @property
+ def initial_pwd(self):
+ return self._initial_pwd
+
+ async def ainit(self):
+ for plugin in self.plugins_to_load:
+ await plugin.initialize(self.username)
self.plugins[plugin.name] = plugin
logger.info(f'Initializing plugin: {plugin.name}')
- def _init_bash_shell(self, work_dir: str) -> None:
- self.shell = pexpect.spawn('/bin/bash', encoding='utf-8', echo=False)
+ if isinstance(plugin, JupyterPlugin):
+ await self.run_ipython(
+ IPythonRunCellAction(code=f'import os; os.chdir("{self.pwd}")')
+ )
+
+ # This is a temporary workaround
+ # TODO: refactor AgentSkills to be part of JupyterPlugin
+ # AFTER ServerRuntime is deprecated
+ if 'agent_skills' in self.plugins and 'jupyter' in self.plugins:
+ obs = await self.run_ipython(
+ IPythonRunCellAction(code='from agentskills import *')
+ )
+ logger.info(f'AgentSkills initialized: {obs}')
+
+ await self._init_bash_commands()
+
+ def _init_user(self, username: str, user_id: int) -> None:
+ """Create user if not exists."""
+ # Skip root since it is already created
+ if username == 'root':
+ return
+
+ try:
+ subprocess.run(['id', username], stdout=subprocess.DEVNULL, check=True)
+ return
+ except subprocess.CalledProcessError:
+ pass
+ # Add sudoer
+ sudoer_line = r"echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers"
+ output = subprocess.run(sudoer_line, shell=True, capture_output=True)
+ if output.returncode != 0:
+ raise RuntimeError(f'Failed to add sudoer: {output.stderr.decode()}')
+ logger.debug(f'Added sudoer successfully. Output: [{output.stdout.decode()}]')
+
+ # Add the user; create the initial working directory if missing and give the user ownership of it
+ command = (
+ f'useradd -rm -d /home/{username} -s /bin/bash '
+ f'-g root -G sudo -o -u {user_id} {username}'
+ )
+
+ if not os.path.exists(self.initial_pwd):
+ command += f' && mkdir -p {self.initial_pwd}'
+ command += f' && chown -R {username}:root {self.initial_pwd}'
+ command += f' && chmod g+s {self.initial_pwd}'
+
+ output = subprocess.run(
+ command,
+ shell=True,
+ capture_output=True,
+ )
+ if output.returncode != 0:
+ raise RuntimeError(
+ f'Failed to create user {username}: {output.stderr.decode()}'
+ )
+
+ logger.debug(
+ f'Added user {username} successfully. Output: [{output.stdout.decode()}]'
+ )
+
+ command = 'deluser pn'
+ output = subprocess.run(command, shell=True, capture_output=True)
+ logger.debug(f'Output: {output.stdout.decode()}')
+
+ def _init_bash_shell(self, work_dir: str, username: str) -> None:
+ self.shell = pexpect.spawn(
+ f'su - {username}',
+ encoding='utf-8',
+ echo=False,
+ )
self.__bash_PS1 = r'[PEXPECT_BEGIN] \u@\h:\w [PEXPECT_END]'
# This should NOT match "PS1=\u@\h:\w [PEXPECT]$" when `env` is executed
@@ -80,20 +179,47 @@ def _init_bash_shell(self, work_dir: str) -> None:
r'\[PEXPECT_BEGIN\] ([a-z0-9_-]*)@([a-zA-Z0-9.-]*):(.+) \[PEXPECT_END\]'
)
- self.shell.sendline(f'export PS1="{self.__bash_PS1}"')
+ self.shell.sendline(f'export PS1="{self.__bash_PS1}"; export PS2=""')
self.shell.expect(self.__bash_expect_regex)
self.shell.sendline(f'cd {work_dir}')
self.shell.expect(self.__bash_expect_regex)
+ logger.debug(
+ f'Bash initialized. Working directory: {work_dir}. Output: {self.shell.before}'
+ )
+
+ async def _init_bash_commands(self):
+ logger.info(f'Initializing by running {len(INIT_COMMANDS)} bash commands...')
+ for command in INIT_COMMANDS:
+ action = CmdRunAction(command=command)
+ action.timeout = 300
+ logger.debug(f'Executing init command: {command}')
+ obs: CmdOutputObservation = await self.run(action)
+ logger.debug(
+ f'Init command outputs (exit code: {obs.exit_code}): {obs.content}'
+ )
+ assert obs.exit_code == 0
- def _get_bash_prompt(self):
+ logger.info('Bash init commands completed')
+
+ def _get_bash_prompt_and_update_pwd(self):
ps1 = self.shell.after
+
+ # begin at the last occurrence of '[PEXPECT_BEGIN]'.
+ # In multi-line bash commands, the prompt will be repeated
+ # and the matched regex captures all of them
+ # - we only want the last one (newest prompt)
+ _begin_pos = ps1.rfind('[PEXPECT_BEGIN]')
+ if _begin_pos != -1:
+ ps1 = ps1[_begin_pos:]
+
# parse the ps1 to get username, hostname, and working directory
matched = re.match(self.__bash_expect_regex, ps1)
assert (
matched is not None
), f'Failed to parse bash prompt: {ps1}. This should not happen.'
username, hostname, working_dir = matched.groups()
+ self.pwd = os.path.expanduser(working_dir)
# re-assemble the prompt
prompt = f'{username}@{hostname}:{working_dir} '
@@ -106,24 +232,23 @@ def _get_bash_prompt(self):
def _send_interrupt(
self,
command: str,
- prev_output: str = '',
- ignore_last_output: bool = False,
+ timeout: int | None,
) -> tuple[str, int]:
logger.exception(
f'Command "{command}" timed out, killing process...', exc_info=False
)
# send a SIGINT to the process
self.shell.sendintr()
- self.shell.prompt()
- command_output = prev_output
- if not ignore_last_output:
- command_output += '\n' + self.shell.before
+ self.shell.expect(self.__bash_expect_regex, timeout=timeout)
+ command_output = self.shell.before
return (
f'Command: "{command}" timed out. Sent SIGINT to the process: {command_output}',
- -1,
+ 130,
)
- def _execute_bash(self, command, keep_prompt: bool = True) -> tuple[str, int]:
+ def _execute_bash(
+ self, command: str, timeout: int | None, keep_prompt: bool = True
+ ) -> tuple[str, int]:
logger.debug(f'Executing command: {command}')
self.shell.sendline(command)
@@ -156,54 +281,147 @@ def _execute_bash(self, command, keep_prompt: bool = True) -> tuple[str, int]:
logger.exception(
'Command timed out, killing process...', exc_info=False
)
- return self._send_interrupt(command)
+ return self._send_interrupt(command, timeout=timeout)
except ExceptionPexpect as e:
logger.exception(f'Unexpected exception: {e}')
break
output = command_output.strip()
if keep_prompt:
- output += '\r\n' + self._get_bash_prompt()
- logger.debug(f'Command output: {output}')
+ output += '\r\n' + self._get_bash_prompt_and_update_pwd()
# Get exit code
self.shell.sendline('echo $?')
logger.debug(f'Executing command for exit code: {command}')
- self.shell.expect(self.__bash_expect_regex)
+ self.shell.expect(self.__bash_expect_regex, timeout=timeout)
_exit_code_output = self.shell.before
logger.debug(f'Exit code Output: {_exit_code_output}')
exit_code = int(_exit_code_output.strip())
+ logger.debug(f'Command output: {output}')
+
return output, exit_code
async def run_action(self, action) -> Observation:
action_type = action.action
observation = await getattr(self, action_type)(action)
- observation._parent = action.id
return observation
async def run(self, action: CmdRunAction) -> CmdOutputObservation:
try:
- output, exit_code = self._execute_bash(action.command)
+ assert (
+ action.timeout is not None
+ ), f'Timeout argument is required for CmdRunAction: {action}'
+ commands = split_bash_commands(action.command)
+ all_output = ''
+ for command in commands:
+ output, exit_code = self._execute_bash(
+ command,
+ timeout=action.timeout,
+ keep_prompt=action.keep_prompt,
+ )
+ if command.startswith('pip install'):
+ output = await self.parse_pip_output(command, output)
+ if all_output:
+ # previous output already exists with the prompt "user@hostname:working_dir #"
+ # we need to add the command to the previous output,
+ # so the model knows the following is the output of another action
+ all_output = all_output.rstrip() + ' ' + command + '\r\n'
+
+ all_output += str(output) + '\r\n'
+ if exit_code != 0:
+ break
+
return CmdOutputObservation(
command_id=-1,
- content=str(output),
+ content=all_output.rstrip('\r\n'),
command=action.command,
exit_code=exit_code,
)
except UnicodeDecodeError:
raise RuntimeError('Command output could not be decoded as utf-8')
+ async def restart_kernel(self) -> str:
+ if 'agent_skills' in self.plugins:
+ kernel_init_code = 'from agentskills import *'
+ else:
+ return ''
+
+ jupyter_plugin: JupyterPlugin = self.plugins['jupyter'] # type: ignore
+ restart_kernel_code = (
+ 'import IPython\nIPython.Application.instance().kernel.do_shutdown(True)'
+ )
+ act = IPythonRunCellAction(code=restart_kernel_code)
+ obs = await jupyter_plugin.run(act)
+ output = obs.content
+ if "{'status': 'ok', 'restart': True}" != output.strip():
+ print(output)
+ output = '\n[Failed to restart the kernel]'
+ else:
+ output = '\n[Kernel restarted successfully]'
+
+ # re-init the kernel after restart
+ act = IPythonRunCellAction(code=kernel_init_code)
+ await jupyter_plugin.run(act)
+ return output
+
+ async def parse_pip_output(self, code, output) -> str:
+ print(output)
+ package_names = code.split(' ', 2)[-1]
+ is_single_package = ' ' not in package_names
+ parsed_output = output
+ if 'Successfully installed' in output:
+ parsed_output = '[Package installed successfully]'
+ if (
+ 'Note: you may need to restart the kernel to use updated packages.'
+ in output
+ ):
+ parsed_output += await self.restart_kernel()
+ else:
+ # restart kernel if installed via bash too
+ await self.restart_kernel()
+ elif (
+ is_single_package
+ and f'Requirement already satisfied: {package_names}' in output
+ ):
+ parsed_output = '[Package already installed]'
+
+ prompt_output = self._get_bash_prompt_and_update_pwd()
+ return parsed_output + '\r\n' + prompt_output
+
async def run_ipython(self, action: IPythonRunCellAction) -> Observation:
if 'jupyter' in self.plugins:
_jupyter_plugin: JupyterPlugin = self.plugins['jupyter'] # type: ignore
- return await _jupyter_plugin.run(action)
+ # This is used to make AgentSkills in Jupyter aware of the
+ # current working directory in Bash
+ if self.pwd != getattr(self, '_jupyter_pwd', None):
+ logger.debug(
+ f"{self.pwd} != {getattr(self, '_jupyter_pwd', None)} -> reset Jupyter PWD"
+ )
+ reset_jupyter_pwd_code = f'import os; os.environ["JUPYTER_PWD"] = os.path.abspath("{self.pwd}")'
+ _aux_action = IPythonRunCellAction(code=reset_jupyter_pwd_code)
+ _reset_obs = await _jupyter_plugin.run(_aux_action)
+ logger.debug(
+ f'Changed working directory in IPython to: {self.pwd}. Output: {_reset_obs}'
+ )
+ self._jupyter_pwd = self.pwd
+
+ if 'app.run' in action.code.strip().split('\n')[-1]:
+ return ErrorObservation(
+ "Don't run Flask app in Jupyter notebook. Save the code to a file and run it in the terminal."
+ )
+ action.code = action.code.replace('!pip', '%pip')
+ obs: IPythonRunCellObservation = await _jupyter_plugin.run(action)
+ if 'pip install' in action.code:
+ obs.content = await self.parse_pip_output(action.code, obs.content)
+ return obs
else:
raise RuntimeError(
'JupyterRequirement not found. Unable to run IPython action.'
)
- def get_working_directory(self):
- result, exit_code = self._execute_bash('pwd', keep_prompt=False)
+ def _get_working_directory(self):
+ # NOTE: this is part of initialization, so we hard code the timeout
+ result, exit_code = self._execute_bash('pwd', timeout=60, keep_prompt=False)
if exit_code != 0:
raise RuntimeError('Failed to get working directory')
return result.strip()
@@ -217,7 +435,7 @@ def _resolve_path(self, path: str, working_dir: str) -> str:
async def read(self, action: FileReadAction) -> Observation:
# NOTE: the client code is running inside the sandbox,
# so there's no need to check permission
- working_dir = self.get_working_directory()
+ working_dir = self._get_working_directory()
filepath = self._resolve_path(action.path, working_dir)
try:
with open(filepath, 'r', encoding='utf-8') as file:
@@ -237,14 +455,21 @@ async def read(self, action: FileReadAction) -> Observation:
return FileReadObservation(path=filepath, content=code_view)
async def write(self, action: FileWriteAction) -> Observation:
- working_dir = self.get_working_directory()
+ working_dir = self._get_working_directory()
filepath = self._resolve_path(action.path, working_dir)
insert = action.content.split('\n')
try:
if not os.path.exists(os.path.dirname(filepath)):
os.makedirs(os.path.dirname(filepath))
- mode = 'w' if not os.path.exists(filepath) else 'r+'
+
+ file_exists = os.path.exists(filepath)
+ if file_exists:
+ file_stat = os.stat(filepath)
+ else:
+ file_stat = None
+
+ mode = 'w' if not file_exists else 'r+'
try:
with open(filepath, mode, encoding='utf-8') as file:
if mode != 'w':
@@ -258,6 +483,19 @@ async def write(self, action: FileWriteAction) -> Observation:
file.seek(0)
file.writelines(new_file)
file.truncate()
+
+ # Handle file permissions
+ if sys.platform != 'win32':
+ if file_exists:
+ assert file_stat is not None
+ # restore the original file permissions if the file already exists
+ os.chmod(filepath, file_stat.st_mode)
+ os.chown(filepath, file_stat.st_uid, file_stat.st_gid)
+ else:
+ # set the new file permissions if the file is new
+ os.chmod(filepath, 0o644)
+ os.chown(filepath, self.user_id, self.user_id)
+
except FileNotFoundError:
return ErrorObservation(f'File not found: {filepath}')
except IsADirectoryError:
@@ -283,27 +521,21 @@ def close(self):
self.browser.close()
-# def test_run_commond():
-# client = RuntimeClient()
-# command = CmdRunAction(command='ls -l')
-# obs = client.run_action(command)
-# print(obs)
-
-# def test_shell(message):
-# shell = pexpect.spawn('/bin/bash', encoding='utf-8')
-# shell.expect(r'[$#] ')
-# print(f'Received command: {message}')
-# shell.sendline(message)
-# shell.expect(r'[$#] ')
-# output = shell.before.strip().split('\r\n', 1)[1].strip()
-# print(f'Output: {output}')
-# shell.close()
-
if __name__ == '__main__':
parser = argparse.ArgumentParser()
parser.add_argument('port', type=int, help='Port to listen on')
parser.add_argument('--working-dir', type=str, help='Working directory')
parser.add_argument('--plugins', type=str, help='Plugins to initialize', nargs='+')
+ parser.add_argument(
+ '--username', type=str, help='User to run as', default='opendevin'
+ )
+ parser.add_argument('--user-id', type=int, help='User ID to run as', default=1000)
+ parser.add_argument(
+ '--browsergym-eval-env',
+ type=str,
+ help='BrowserGym environment used for browser evaluation',
+ default=None,
+ )
# example: python client.py 8000 --working-dir /workspace --plugins JupyterRequirement
args = parser.parse_args()
@@ -314,16 +546,35 @@ def close(self):
raise ValueError(f'Plugin {plugin} not found')
plugins_to_load.append(ALL_PLUGINS[plugin]()) # type: ignore
- client = RuntimeClient(plugins_to_load, work_dir=args.working_dir)
+ client: RuntimeClient | None = None
+
+ @asynccontextmanager
+ async def lifespan(app: FastAPI):
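+        # Create and initialize the RuntimeClient when the server starts, and
+        # close it to release its resources on shutdown.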
+ global client
+ client = RuntimeClient(
+ plugins_to_load,
+ work_dir=args.working_dir,
+ username=args.username,
+ user_id=args.user_id,
+ browsergym_eval_env=args.browsergym_eval_env,
+ )
+ await client.ainit()
+ yield
+ # Clean up & release the resources
+ client.close()
+
+ app = FastAPI(lifespan=lifespan)
@app.middleware('http')
async def one_request_at_a_time(request: Request, call_next):
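+        # Serialize request handling: the client's lock ensures only one action
+        # runs inside the sandbox at a time.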
+ assert client is not None
async with client.lock:
response = await call_next(request)
return response
@app.post('/execute_action')
async def execute_action(action_request: ActionRequest):
+ assert client is not None
try:
action = event_from_dict(action_request.action)
if not isinstance(action, Action):
@@ -334,10 +585,188 @@ async def execute_action(action_request: ActionRequest):
logger.error(f'Error processing command: {str(e)}')
raise HTTPException(status_code=500, detail=str(e))
+ @app.post('/upload_file')
+ async def upload_file(
+ file: UploadFile, destination: str = '/', recursive: bool = False
+ ):
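+        """Save an uploaded file under `destination`; when `recursive` is true, the
+        upload must be a zip archive, which is extracted in place."""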
+ assert client is not None
+
+ try:
+ # Ensure the destination directory exists
+ if not os.path.isabs(destination):
+ raise HTTPException(
+ status_code=400, detail='Destination must be an absolute path'
+ )
+
+ full_dest_path = destination
+ if not os.path.exists(full_dest_path):
+ os.makedirs(full_dest_path, exist_ok=True)
+
+ if recursive:
+ # For recursive uploads, we expect a zip file
+ if not file.filename.endswith('.zip'):
+ raise HTTPException(
+ status_code=400, detail='Recursive uploads must be zip files'
+ )
+
+ zip_path = os.path.join(full_dest_path, file.filename)
+ with open(zip_path, 'wb') as buffer:
+ shutil.copyfileobj(file.file, buffer)
+
+ # Extract the zip file
+ shutil.unpack_archive(zip_path, full_dest_path)
+ os.remove(zip_path) # Remove the zip file after extraction
+
+ logger.info(
+ f'Uploaded file {file.filename} and extracted to {destination}'
+ )
+ else:
+ # For single file uploads
+ file_path = os.path.join(full_dest_path, file.filename)
+ with open(file_path, 'wb') as buffer:
+ shutil.copyfileobj(file.file, buffer)
+ logger.info(f'Uploaded file {file.filename} to {destination}')
+
+ return JSONResponse(
+ content={
+ 'filename': file.filename,
+ 'destination': destination,
+ 'recursive': recursive,
+ },
+ status_code=200,
+ )
+
+ except Exception as e:
+ raise HTTPException(status_code=500, detail=str(e))
+
@app.get('/alive')
async def alive():
return {'status': 'ok'}
+ # ================================
+ # File-specific operations for UI
+ # ================================
+
+ @app.post('/list_files')
+ async def list_files(request: Request):
+ """List files in the specified path.
+
+ This function retrieves a list of files from the agent's runtime file store,
+ excluding certain system and hidden files/directories.
+
+ To list files:
+ ```sh
+ curl http://localhost:3000/api/list-files
+ ```
+
+ Args:
+ request (Request): The incoming request object.
+        path (str, optional): The path to list files from, read from the request
+            body. Defaults to the client's initial working directory.
+
+ Returns:
+ list: A list of file names in the specified path.
+
+ Raises:
+ HTTPException: If there's an error listing the files.
+ """
+ assert client is not None
+
+ # get request as dict
+ request_dict = await request.json()
+ path = request_dict.get('path', None)
+
+ # Get the full path of the requested directory
+ if path is None:
+ full_path = client.initial_pwd
+ elif os.path.isabs(path):
+ full_path = path
+ else:
+ full_path = os.path.join(client.initial_pwd, path)
+
+ if not os.path.exists(full_path):
+ return JSONResponse(
+ content={'error': f'Directory {full_path} does not exist'},
+ status_code=400,
+ )
+
+ try:
+ # Check if the directory exists
+ if not os.path.exists(full_path) or not os.path.isdir(full_path):
+ return []
+
+ # Check if .gitignore exists
+ gitignore_path = os.path.join(full_path, '.gitignore')
+ if os.path.exists(gitignore_path):
+ # Use PathSpec to parse .gitignore
+ with open(gitignore_path, 'r') as f:
+ spec = PathSpec.from_lines(GitWildMatchPattern, f.readlines())
+ else:
+ # Fallback to default exclude list if .gitignore doesn't exist
+ default_exclude = [
+ '.git',
+ '.DS_Store',
+ '.svn',
+ '.hg',
+ '.idea',
+ '.vscode',
+ '.settings',
+ '.pytest_cache',
+ '__pycache__',
+ 'node_modules',
+ 'vendor',
+ 'build',
+ 'dist',
+ 'bin',
+ 'logs',
+ 'log',
+ 'tmp',
+ 'temp',
+ 'coverage',
+ 'venv',
+ 'env',
+ ]
+ spec = PathSpec.from_lines(GitWildMatchPattern, default_exclude)
+
+ entries = os.listdir(full_path)
+
+ # Filter entries using PathSpec
+ filtered_entries = [
+ os.path.join(full_path, entry)
+ for entry in entries
+ if not spec.match_file(os.path.relpath(entry, str(full_path)))
+ ]
+
+ # Separate directories and files
+ directories = []
+ files = []
+ for entry in filtered_entries:
+ # Remove leading slash and any parent directory components
+ entry_relative = entry.lstrip('/').split('/')[-1]
+
+ # Construct the full path by joining the base path with the relative entry path
+ full_entry_path = os.path.join(full_path, entry_relative)
+ if os.path.exists(full_entry_path):
+ is_dir = os.path.isdir(full_entry_path)
+ if is_dir:
+                    # add a trailing slash to directories, which the frontend
+                    # requires to differentiate directories from files
+ entry = entry.rstrip('/') + '/'
+ directories.append(entry)
+ else:
+ files.append(entry)
+
+ # Sort directories and files separately
+ directories.sort(key=lambda s: s.lower())
+ files.sort(key=lambda s: s.lower())
+
+ # Combine sorted directories and files
+ sorted_entries = directories + files
+ return sorted_entries
+
+ except Exception as e:
+ logger.error(f'Error listing files: {e}', exc_info=True)
+ return []
+
logger.info(f'Starting action execution API on port {args.port}')
print(f'Starting action execution API on port {args.port}')
run(app, host='0.0.0.0', port=args.port)
diff --git a/opendevin/runtime/client/runtime.py b/opendevin/runtime/client/runtime.py
index 100d18df9036..c90dfcc4b92b 100644
--- a/opendevin/runtime/client/runtime.py
+++ b/opendevin/runtime/client/runtime.py
@@ -1,6 +1,9 @@
import asyncio
+import os
+import tempfile
import uuid
from typing import Optional
+from zipfile import ZipFile
import aiohttp
import docker
@@ -8,7 +11,7 @@
from opendevin.core.config import AppConfig
from opendevin.core.logger import opendevin_logger as logger
-from opendevin.events import EventSource, EventStream
+from opendevin.events import EventStream
from opendevin.events.action import (
BrowseInteractiveAction,
BrowseURLAction,
@@ -18,7 +21,6 @@
IPythonRunCellAction,
)
from opendevin.events.action.action import Action
-from opendevin.events.event import Event
from opendevin.events.observation import (
ErrorObservation,
NullObservation,
@@ -44,47 +46,76 @@ def __init__(
config: AppConfig,
event_stream: EventStream,
sid: str = 'default',
- container_image: str | None = None,
plugins: list[PluginRequirement] | None = None,
+ container_image: str | None = None,
):
- super().__init__(config, event_stream, sid) # will initialize the event stream
- self._port = find_available_tcp_port()
+ super().__init__(
+ config, event_stream, sid, plugins
+ ) # will initialize the event stream
+ self.persist_sandbox = self.config.sandbox.persist_sandbox
+ self.fast_boot = self.config.sandbox.fast_boot
+ if self.persist_sandbox:
+ user = 'od' if self.config.run_as_devin else 'root'
+ path = config.workspace_mount_path
+ path = ''.join(c if c.isalnum() else '_' for c in path) # type: ignore
+ self.instance_id = f'persisted-{user}-{path}'
+ self._port = self.config.sandbox.port
+ else:
+ self.instance_id = (sid or '') + str(uuid.uuid4())
+ self._port = find_available_tcp_port()
self.api_url = f'http://localhost:{self._port}'
+ self.api_url = f'http://{self.config.sandbox.api_hostname}:{self._port}'
self.session: Optional[aiohttp.ClientSession] = None
- self.instance_id = (
- sid + str(uuid.uuid4()) if sid is not None else str(uuid.uuid4())
- )
# TODO: We can switch to aiodocker when `get_od_sandbox_image` is updated to use aiodocker
self.docker_client: docker.DockerClient = self._init_docker_client()
self.container_image = (
- config.sandbox.container_image
+ self.config.sandbox.container_image
if container_image is None
else container_image
)
self.container_name = self.container_name_prefix + self.instance_id
- self.plugins = plugins if plugins is not None else []
self.container = None
self.action_semaphore = asyncio.Semaphore(1) # Ensure one action at a time
+ logger.debug(f'EventStreamRuntime `{sid}` config:\n{self.config}')
async def ainit(self, env_vars: dict[str, str] | None = None):
- self.container_image = build_runtime_image(
- self.container_image,
- self.docker_client,
- # NOTE: You can need set DEBUG=true to update the source code
- # inside the container. This is useful when you want to test/debug the
- # latest code in the runtime docker container.
- update_source_code=self.config.sandbox.update_source_code,
- )
- self.container = await self._init_container(
- self.sandbox_workspace_dir,
- mount_dir=self.config.workspace_mount_path,
- plugins=self.plugins,
- )
- # MUST call super().ainit() to initialize both default env vars
- # AND the ones in env vars!
- await super().ainit(env_vars)
+ if self.config.sandbox.od_runtime_extra_deps:
+ logger.info(
+ f'Installing extra user-provided dependencies in the runtime image: {self.config.sandbox.od_runtime_extra_deps}'
+ )
+ try:
+ docker.DockerClient().containers.get(self.container_name)
+ self.is_initial_session = False
+ except docker.errors.NotFound:
+ self.is_initial_session = True
+
+ if self.is_initial_session:
+ logger.info('Creating new Docker container')
+ self.container_image = build_runtime_image(
+ self.container_image,
+ self.docker_client,
+ extra_deps=self.config.sandbox.od_runtime_extra_deps,
+ )
+ self.container = await self._init_container(
+ self.sandbox_workspace_dir,
+ mount_dir=self.config.workspace_mount_path,
+ plugins=self.plugins,
+ )
+ # MUST call super().ainit() to initialize both default env vars
+ # AND the ones in env vars!
+ await super().ainit(env_vars)
+
+ logger.info(
+ f'Container initialized with plugins: {[plugin.name for plugin in self.plugins]}'
+ )
+ logger.info(f'Container initialized with env vars: {env_vars}')
+
+ else:
+ logger.info('Using existing Docker container')
+ self.container = self.docker_client.containers.get(self.container_name)
+ await self.start_docker_container()
@staticmethod
def _init_docker_client() -> docker.DockerClient:
@@ -103,16 +134,18 @@ def _init_docker_client() -> docker.DockerClient:
async def _init_container(
self,
sandbox_workspace_dir: str,
- mount_dir: str,
+ mount_dir: str | None = None,
plugins: list[PluginRequirement] | None = None,
):
try:
logger.info(
- f'Starting container with image: {self.container_image} and name: {self.container_name}'
+ f'Starting container with image: {self.container_image} and name: {self.container_name} with port: {self._port}'
)
- if plugins is None:
- plugins = []
- plugin_names = ' '.join([plugin.name for plugin in plugins])
+ plugin_arg = ''
+ if plugins is not None and len(plugins) > 0:
+ plugin_arg = (
+ f'--plugins {" ".join([plugin.name for plugin in plugins])} '
+ )
network_mode: str | None = None
port_mapping: dict[str, int] | None = None
@@ -124,6 +157,21 @@ async def _init_container(
else:
port_mapping = {f'{self._port}/tcp': self._port}
+ if mount_dir is not None:
+ volumes = {mount_dir: {'bind': sandbox_workspace_dir, 'mode': 'rw'}}
+ logger.info(f'Mount dir: {sandbox_workspace_dir}')
+ else:
+            logger.warning(
+ 'Mount dir is not set, will not mount the workspace directory to the container.'
+ )
+ volumes = None
+
+ if self.config.sandbox.browsergym_eval_env is not None:
+ browsergym_arg = (
+ f'--browsergym-eval-env {self.config.sandbox.browsergym_eval_env}'
+ )
+ else:
+ browsergym_arg = ''
container = self.docker_client.containers.run(
self.container_image,
command=(
@@ -131,7 +179,10 @@ async def _init_container(
'PYTHONUNBUFFERED=1 poetry run '
f'python -u -m opendevin.runtime.client.client {self._port} '
f'--working-dir {sandbox_workspace_dir} '
- f'--plugins {plugin_names}'
+ f'{plugin_arg}'
+ f'--username {"opendevin" if self.config.run_as_devin else "root"} '
+ f'--user-id {self.config.sandbox.user_id} '
+ f'{browsergym_arg}'
),
network_mode=network_mode,
ports=port_mapping,
@@ -139,7 +190,7 @@ async def _init_container(
name=self.container_name,
detach=True,
environment={'DEBUG': 'true'} if self.config.debug else None,
- volumes={mount_dir: {'bind': sandbox_workspace_dir, 'mode': 'rw'}},
+ volumes=volumes,
)
logger.info(f'Container started. Server url: {self.api_url}')
return container
@@ -150,31 +201,61 @@ async def _init_container(
raise e
async def _ensure_session(self):
+ await asyncio.sleep(1)
if self.session is None or self.session.closed:
self.session = aiohttp.ClientSession()
return self.session
@tenacity.retry(
stop=tenacity.stop_after_attempt(10),
- wait=tenacity.wait_exponential(multiplier=2, min=4, max=600),
+ wait=tenacity.wait_exponential(multiplier=2, min=4, max=60),
)
async def _wait_until_alive(self):
+ logger.info('Reconnecting session')
+ container = self.docker_client.containers.get(self.container_name)
+ # print logs
+ _logs = container.logs(tail=10).decode('utf-8').split('\n')
+ # add indent
+ _logs = '\n'.join([f' |{log}' for log in _logs])
+ logger.info(
+ '\n'
+ + '-' * 30
+ + 'Container logs (last 10 lines):'
+ + '-' * 30
+ + f'\n{_logs}'
+ + '\n'
+ + '-' * 90
+ )
async with aiohttp.ClientSession() as session:
async with session.get(f'{self.api_url}/alive') as response:
if response.status == 200:
return
else:
- logger.error(
- f'Action execution API is not alive. Response: {response}'
- )
- raise RuntimeError(
- f'Action execution API is not alive. Response: {response}'
- )
+ msg = f'Action execution API is not alive. Response: {response}'
+ logger.error(msg)
+ raise RuntimeError(msg)
@property
def sandbox_workspace_dir(self):
return self.config.workspace_mount_path_in_sandbox
+ async def start_docker_container(self):
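+        """Start the container if it is not already running, polling its status for
+        up to the configured sandbox timeout."""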
+ try:
+ container = self.docker_client.containers.get(self.container_name)
+ logger.info('Container status: %s', container.status)
+ if container.status != 'running':
+ container.start()
+ logger.info('Container started')
+ elapsed = 0
+ while container.status != 'running':
+ await asyncio.sleep(1)
+ elapsed += 1
+ if elapsed > self.config.sandbox.timeout:
+ break
+ container = self.docker_client.containers.get(self.container_name)
+ except Exception:
+ logger.exception('Failed to start container')
+
async def close(self, close_client: bool = True):
if self.session is not None and not self.session.closed:
await self.session.close()
@@ -182,28 +263,29 @@ async def close(self, close_client: bool = True):
containers = self.docker_client.containers.list(all=True)
for container in containers:
try:
- if container.name.startswith(self.container_name_prefix):
+                # only remove the container this runtime created;
+                # otherwise all other containers with the same prefix would be removed,
+                # which would interfere with parallel evaluation
+ if container.name.startswith(self.container_name):
logs = container.logs(tail=1000).decode('utf-8')
logger.debug(
f'==== Container logs ====\n{logs}\n==== End of container logs ===='
)
- container.remove(force=True)
+ if self.persist_sandbox:
+ if not self.fast_boot:
+ container.stop()
+ else:
+ container.remove(force=True)
except docker.errors.NotFound:
pass
if close_client:
self.docker_client.close()
- async def on_event(self, event: Event) -> None:
- logger.info(f'EventStreamRuntime: on_event triggered: {event}')
- if isinstance(event, Action):
- logger.info(event, extra={'msg_type': 'ACTION'})
- observation = await self.run_action(event)
- # observation._cause = event.id # type: ignore[attr-defined]
- logger.info(observation, extra={'msg_type': 'OBSERVATION'})
- source = event.source if event.source else EventSource.AGENT
- await self.event_stream.add_event(observation, source)
-
- async def run_action(self, action: Action, timeout: int = 600) -> Observation:
+ async def run_action(self, action: Action) -> Observation:
+ # set timeout to default if not set
+ if action.timeout is None:
+ action.timeout = self.config.sandbox.timeout
+
async with self.action_semaphore:
if not action.runnable:
return NullObservation('')
@@ -215,13 +297,18 @@ async def run_action(self, action: Action, timeout: int = 600) -> Observation:
f'Action {action_type} is not supported in the current runtime.'
)
+ logger.info('Awaiting session')
session = await self._ensure_session()
await self._wait_until_alive()
+
+ assert action.timeout is not None
+
try:
+ logger.info(f'Executing action {action}')
async with session.post(
f'{self.api_url}/execute_action',
json={'action': event_to_dict(action)},
- timeout=timeout,
+ timeout=action.timeout,
) as response:
if response.status == 200:
output = await response.json()
@@ -240,7 +327,6 @@ async def run_action(self, action: Action, timeout: int = 600) -> Observation:
except Exception as e:
logger.error(f'Error during command execution: {e}')
obs = ErrorObservation(f'Command execution failed: {str(e)}')
- obs._parent = action.id # type: ignore[attr-defined]
return obs
async def run(self, action: CmdRunAction) -> Observation:
@@ -261,22 +347,83 @@ async def browse(self, action: BrowseURLAction) -> Observation:
async def browse_interactive(self, action: BrowseInteractiveAction) -> Observation:
return await self.run_action(action)
- ############################################################################
- # Keep the same with other runtimes
- ############################################################################
+ # ====================================================================
+ # Implement these methods (for file operations) in the subclass
+ # ====================================================================
- def get_working_directory(self):
- raise NotImplementedError(
- 'This method is not implemented in the runtime client.'
- )
+ async def copy_to(
+ self, host_src: str, sandbox_dest: str, recursive: bool = False
+ ) -> None:
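+        """Copy `host_src` from the host into the sandbox at `sandbox_dest` via the
+        runtime client's /upload_file endpoint; directories are zipped first when
+        `recursive` is true."""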
+ if not os.path.exists(host_src):
+ raise FileNotFoundError(f'Source file {host_src} does not exist')
+
+ session = await self._ensure_session()
+ await self._wait_until_alive()
+ try:
+ if recursive:
+ # For recursive copy, create a zip file
+ with tempfile.NamedTemporaryFile(
+ suffix='.zip', delete=False
+ ) as temp_zip:
+ temp_zip_path = temp_zip.name
+
+ with ZipFile(temp_zip_path, 'w') as zipf:
+ for root, _, files in os.walk(host_src):
+ for file in files:
+ file_path = os.path.join(root, file)
+ arcname = os.path.relpath(
+ file_path, os.path.dirname(host_src)
+ )
+ zipf.write(file_path, arcname)
+
+ upload_data = {'file': open(temp_zip_path, 'rb')}
+ else:
+ # For single file copy
+ upload_data = {'file': open(host_src, 'rb')}
- ############################################################################
- # Initialization work inside sandbox image
- ############################################################################
+ params = {'destination': sandbox_dest, 'recursive': str(recursive).lower()}
- # init_runtime_tools direcctly do as what Runtime do
+ async with session.post(
+ f'{self.api_url}/upload_file', data=upload_data, params=params
+ ) as response:
+ if response.status == 200:
+ return
+ else:
+ error_message = await response.text()
+ raise Exception(f'Copy operation failed: {error_message}')
- # Do in the od_runtime_client
- # Overwrite the init_sandbox_plugins
- def init_sandbox_plugins(self, plugins: list[PluginRequirement]) -> None:
- pass
+ except asyncio.TimeoutError:
+ raise TimeoutError('Copy operation timed out')
+ except Exception as e:
+ raise RuntimeError(f'Copy operation failed: {str(e)}')
+ finally:
+ if recursive:
+ os.unlink(temp_zip_path)
+ logger.info(f'Copy completed: host:{host_src} -> runtime:{sandbox_dest}')
+
+ async def list_files(self, path: str | None = None) -> list[str]:
+ """List files in the sandbox.
+
+ If path is None, list files in the sandbox's initial working directory (e.g., /workspace).
+ """
+ session = await self._ensure_session()
+ await self._wait_until_alive()
+ try:
+ data = {}
+ if path is not None:
+ data['path'] = path
+
+ async with session.post(
+ f'{self.api_url}/list_files', json=data
+ ) as response:
+ if response.status == 200:
+ response_json = await response.json()
+ assert isinstance(response_json, list)
+ return response_json
+ else:
+ error_message = await response.text()
+ raise Exception(f'List files operation failed: {error_message}')
+ except asyncio.TimeoutError:
+ raise TimeoutError('List files operation timed out')
+ except Exception as e:
+ raise RuntimeError(f'List files operation failed: {str(e)}')
diff --git a/opendevin/runtime/docker/__init__.py b/opendevin/runtime/docker/__init__.py
deleted file mode 100644
index e69de29bb2d1..000000000000
diff --git a/opendevin/runtime/docker/local_box.py b/opendevin/runtime/docker/local_box.py
deleted file mode 100644
index 5da1707939f8..000000000000
--- a/opendevin/runtime/docker/local_box.py
+++ /dev/null
@@ -1,122 +0,0 @@
-import atexit
-import os
-import subprocess
-import sys
-
-from opendevin.core.config import SandboxConfig
-from opendevin.core.logger import opendevin_logger as logger
-from opendevin.core.schema import CancellableStream
-from opendevin.runtime.sandbox import Sandbox
-
-# ===============================================================================
-# ** WARNING **
-#
-# This sandbox should only be used when OpenDevin is running inside a container
-#
-# Sandboxes are generally isolated so that they cannot affect the host machine.
-# This Sandbox implementation does not provide isolation, and can inadvertently
-# run dangerous commands on the host machine, potentially rendering the host
-# machine unusable.
-#
-# This sandbox is meant for use with OpenDevin Quickstart
-#
-# DO NOT USE THIS SANDBOX IN A PRODUCTION ENVIRONMENT
-# ===============================================================================
-
-
-class LocalBox(Sandbox):
- def __init__(
- self,
- config: SandboxConfig,
- workspace_base: str,
- ):
- self.config = config
- os.makedirs(workspace_base, exist_ok=True)
- self.workspace_base = workspace_base
- atexit.register(self.cleanup)
- super().__init__(config)
-
- def execute(
- self, cmd: str, stream: bool = False, timeout: int | None = None
- ) -> tuple[int, str | CancellableStream]:
- try:
- completed_process = subprocess.run(
- cmd,
- shell=True,
- text=True,
- capture_output=True,
- timeout=self.config.timeout,
- cwd=self.workspace_base,
- env=self._env,
- )
- return completed_process.returncode, completed_process.stdout.strip()
- except subprocess.TimeoutExpired:
- return -1, 'Command timed out'
-
- def copy_to(self, host_src: str, sandbox_dest: str, recursive: bool = False):
- # mkdir -p sandbox_dest if it doesn't exist
- res = subprocess.run(
- f'mkdir -p {sandbox_dest}',
- shell=True,
- text=True,
- cwd=self.workspace_base,
- env=self._env,
- )
- if res.returncode != 0:
- raise RuntimeError(f'Failed to create directory {sandbox_dest} in sandbox')
-
- if recursive:
- res = subprocess.run(
- f'cp -r {host_src} {sandbox_dest}',
- shell=True,
- text=True,
- cwd=self.workspace_base,
- env=self._env,
- )
- if res.returncode != 0:
- raise RuntimeError(
- f'Failed to copy {host_src} to {sandbox_dest} in sandbox'
- )
- else:
- res = subprocess.run(
- f'cp {host_src} {sandbox_dest}',
- shell=True,
- text=True,
- cwd=self.workspace_base,
- env=self._env,
- )
- if res.returncode != 0:
- raise RuntimeError(
- f'Failed to copy {host_src} to {sandbox_dest} in sandbox'
- )
-
- def close(self):
- pass
-
- def cleanup(self):
- self.close()
-
- def get_working_directory(self):
- return self.workspace_base
-
-
-if __name__ == '__main__':
- local_box = LocalBox(SandboxConfig(), '/tmp/opendevin')
- sys.stdout.flush()
- try:
- while True:
- try:
- user_input = input('>>> ')
- except EOFError:
- logger.info('Exiting...')
- break
- if user_input.lower() == 'exit':
- logger.info('Exiting...')
- break
- exit_code, output = local_box.execute(user_input)
- logger.info('exit code: %d', exit_code)
- logger.info(output)
- sys.stdout.flush()
- except KeyboardInterrupt:
- logger.info('Exiting...')
- local_box.close()
diff --git a/opendevin/runtime/docker/ssh_box.py b/opendevin/runtime/docker/ssh_box.py
deleted file mode 100644
index d7650cc76fe5..000000000000
--- a/opendevin/runtime/docker/ssh_box.py
+++ /dev/null
@@ -1,683 +0,0 @@
-import atexit
-import os
-import re
-import sys
-import tarfile
-import tempfile
-import time
-import uuid
-from glob import glob
-
-import docker
-from pexpect import exceptions, pxssh
-from tenacity import retry, stop_after_attempt, wait_fixed
-
-from opendevin.core.config import SandboxConfig
-from opendevin.core.const.guide_url import TROUBLESHOOTING_URL
-from opendevin.core.logger import opendevin_logger as logger
-from opendevin.core.schema import CancellableStream
-from opendevin.runtime.plugins import AgentSkillsRequirement, JupyterRequirement
-from opendevin.runtime.plugins.requirement import PluginRequirement
-from opendevin.runtime.sandbox import Sandbox
-from opendevin.runtime.utils import find_available_tcp_port, split_bash_commands
-from opendevin.runtime.utils.image_agnostic import get_od_sandbox_image
-
-
-class SSHExecCancellableStream(CancellableStream):
- def __init__(self, ssh, cmd, timeout):
- super().__init__(self.read_output())
- self.ssh = ssh
- self.cmd = cmd
- self.timeout = timeout
-
- def close(self):
- self.closed = True
-
- def exit_code(self):
- marker = f'EXIT_CODE_MARKER_{uuid.uuid4().hex}'
- self.ssh.sendline(f'echo "{marker}$?{marker}"')
-
- if not self.ssh.prompt(timeout=self.timeout):
- return None # Timeout occurred
-
- output = self.ssh.before
- match = re.search(f'{marker}(\\d+){marker}', output)
-
- if match:
- try:
- return int(match.group(1))
- except ValueError:
- # Log the unexpected format
- logger.error(f'Unexpected exit code format: {match.group(1)}')
- return None
- else:
- # If we can't find our marked exit code, log the output and return None
- logger.error(f'Could not find exit code in output: {output}')
- return None
-
- def read_output(self):
- st = time.time()
- buf = ''
- crlf = '\r\n'
- lf = '\n'
- prompt_len = len(self.ssh.PROMPT)
- while True:
- try:
- if self.closed:
- break
- _output = self.ssh.read_nonblocking(timeout=1)
- if not _output:
- continue
-
- buf += _output
-
- if len(buf) < prompt_len:
- continue
-
- match = re.search(self.ssh.PROMPT, buf)
- if match:
- idx, _ = match.span()
- yield buf[:idx].replace(crlf, lf)
- buf = ''
- break
-
- res = buf[:-prompt_len]
- if len(res) == 0 or res.find(crlf) == -1:
- continue
- buf = buf[-prompt_len:]
- yield res.replace(crlf, lf)
- except exceptions.TIMEOUT:
- if time.time() - st < self.timeout:
- match = re.search(self.ssh.PROMPT, buf)
- if match:
- idx, _ = match.span()
- yield buf[:idx].replace(crlf, lf)
- break
- continue
- else:
- yield buf.replace(crlf, lf)
- break
- except exceptions.EOF:
- break
-
-
-class DockerSSHBox(Sandbox):
- instance_id: str
- container_image: str
- container_name_prefix = 'opendevin-sandbox-'
- container_name: str
- container: docker.models.containers.Container
- docker_client: docker.DockerClient
-
- _ssh_password: str
- _ssh_port: int
- ssh: pxssh.pxssh | None = None
-
- def __init__(
- self,
- config: SandboxConfig,
- persist_sandbox: bool,
- workspace_mount_path: str,
- sandbox_workspace_dir: str,
- cache_dir: str,
- run_as_devin: bool,
- ssh_hostname: str = 'host.docker.internal',
- ssh_password: str | None = None,
- ssh_port: int = 22,
- sid: str | None = None,
- ):
- self.config = config
- self.workspace_mount_path = workspace_mount_path
- self.sandbox_workspace_dir = sandbox_workspace_dir
- self.cache_dir = cache_dir
- self.use_host_network = config.use_host_network
- self.run_as_devin = run_as_devin
- logger.info(
- f'SSHBox is running as {"opendevin" if self.run_as_devin else "root"} user with USER_ID={config.user_id} in the sandbox'
- )
- # Initialize docker client. Throws an exception if Docker is not reachable.
- try:
- self.docker_client = docker.from_env()
- except Exception as ex:
- logger.exception(
- f'Error creating controller. Please check Docker is running and visit `{TROUBLESHOOTING_URL}` for more debugging information.',
- exc_info=False,
- )
- raise ex
-
- if persist_sandbox:
- if not self.run_as_devin:
- raise Exception(
- 'Persistent sandbox is currently designed for opendevin user only. Please set run_as_devin=True in your config.toml'
- )
- self.instance_id = 'persisted'
- else:
- self.instance_id = (sid or '') + str(uuid.uuid4())
-
- self.container_image = get_od_sandbox_image(
- config.container_image, self.docker_client
- )
- self.container_name = self.container_name_prefix + self.instance_id
-
- # set up random user password
- self.persist_sandbox = persist_sandbox
- self.ssh_hostname = ssh_hostname
- if persist_sandbox:
- if not ssh_password:
- raise ValueError('ssh_password is required for persistent sandbox')
- self._ssh_password = ssh_password
- self._ssh_port = ssh_port
- else:
- self._ssh_password = str(uuid.uuid4())
- self._ssh_port = find_available_tcp_port()
- try:
- docker.DockerClient().containers.get(self.container_name)
- self.is_initial_session = False
- except docker.errors.NotFound:
- self.is_initial_session = True
- logger.info('Detected initial session.')
- if not persist_sandbox or self.is_initial_session:
- logger.info('Creating new Docker container')
- n_tries = 5
- while n_tries > 0:
- try:
- self.restart_docker_container()
- break
- except Exception as e:
- logger.exception(
- 'Failed to start Docker container, retrying...', exc_info=False
- )
- n_tries -= 1
- if n_tries == 0:
- raise e
- time.sleep(5)
- self.setup_user()
- else:
- self.container = self.docker_client.containers.get(self.container_name)
- logger.info('Using existing Docker container')
- self.start_docker_container()
- try:
- self.start_ssh_session()
- except Exception as e:
- self.close()
- raise e
- time.sleep(1)
-
- # make sure /tmp always exists
- self.execute('mkdir -p /tmp')
- # set git config
- self.execute('git config --global user.name "OpenDevin"')
- self.execute('git config --global user.email "opendevin@all-hands.dev"')
- atexit.register(self.close)
- super().__init__(config)
-
- def setup_user(self):
- # Make users sudoers passwordless
- # TODO(sandbox): add this line in the Dockerfile for next minor version of docker image
- exit_code, logs = self.container.exec_run(
- ['/bin/bash', '-c', r"echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers"],
- workdir=self.sandbox_workspace_dir,
- environment=self._env,
- )
- if exit_code != 0:
- raise Exception(
- f'Failed to make all users passwordless sudoers in sandbox: {logs}'
- )
-
- # Check if the opendevin user exists
- exit_code, logs = self.container.exec_run(
- ['/bin/bash', '-c', 'id -u opendevin'],
- workdir=self.sandbox_workspace_dir,
- environment=self._env,
- )
- if exit_code == 0:
- # User exists, delete it
- exit_code, logs = self.container.exec_run(
- ['/bin/bash', '-c', 'userdel -r opendevin'],
- workdir=self.sandbox_workspace_dir,
- environment=self._env,
- )
- if exit_code != 0:
- raise Exception(f'Failed to remove opendevin user in sandbox: {logs}')
-
- if self.run_as_devin:
- # Create the opendevin user
- exit_code, logs = self.container.exec_run(
- [
- '/bin/bash',
- '-c',
- f'useradd -rm -d /home/opendevin -s /bin/bash -g root -G sudo -u {self.config.user_id} opendevin',
- ],
- workdir=self.sandbox_workspace_dir,
- environment=self._env,
- )
- if exit_code != 0:
- raise Exception(f'Failed to create opendevin user in sandbox: {logs}')
- exit_code, logs = self.container.exec_run(
- [
- '/bin/bash',
- '-c',
- f"echo 'opendevin:{self._ssh_password}' | chpasswd",
- ],
- workdir=self.sandbox_workspace_dir,
- environment=self._env,
- )
- if exit_code != 0:
- raise Exception(f'Failed to set password in sandbox: {logs}')
-
- # chown the home directory
- exit_code, logs = self.container.exec_run(
- ['/bin/bash', '-c', 'chown opendevin:root /home/opendevin'],
- workdir=self.sandbox_workspace_dir,
- environment=self._env,
- )
- if exit_code != 0:
- raise Exception(
- f'Failed to chown home directory for opendevin in sandbox: {logs}'
- )
- # check the miniforge3 directory exist
- exit_code, logs = self.container.exec_run(
- [
- '/bin/bash',
- '-c',
- '[ -d "/opendevin/miniforge3" ] && exit 0 || exit 1',
- ],
- workdir=self.sandbox_workspace_dir,
- environment=self._env,
- )
- if exit_code != 0:
- if exit_code == 1:
- raise Exception(
- 'OPENDEVIN_PYTHON_INTERPRETER is not usable. Please pull the latest Docker image: docker pull ghcr.io/opendevin/sandbox:main'
- )
- else:
- raise Exception(
- f'An error occurred while checking if miniforge3 directory exists: {logs}'
- )
- exit_code, logs = self.container.exec_run(
- [
- '/bin/bash',
- '-c',
- f'chown opendevin:root {self.sandbox_workspace_dir}',
- ],
- workdir=self.sandbox_workspace_dir,
- environment=self._env,
- )
- if exit_code != 0:
- # This is not a fatal error, just a warning
- logger.warning(
- f'Failed to chown workspace directory for opendevin in sandbox: {logs}. But this should be fine if the {self.sandbox_workspace_dir=} is mounted by the app docker container.'
- )
- else:
- exit_code, logs = self.container.exec_run(
- # change password for root
- ['/bin/bash', '-c', f"echo 'root:{self._ssh_password}' | chpasswd"],
- workdir=self.sandbox_workspace_dir,
- environment=self._env,
- )
- if exit_code != 0:
- raise Exception(f'Failed to set password for root in sandbox: {logs}')
- exit_code, logs = self.container.exec_run(
- ['/bin/bash', '-c', "echo 'opendevin-sandbox' > /etc/hostname"],
- workdir=self.sandbox_workspace_dir,
- environment=self._env,
- )
-
- # Use the retry decorator, with a maximum of 5 attempts and a fixed wait time of 5 seconds between attempts
- @retry(stop=stop_after_attempt(5), wait=wait_fixed(5))
- def __ssh_login(self):
- try:
- self.ssh = pxssh.pxssh(
- echo=False,
- timeout=self.config.timeout,
- encoding='utf-8',
- codec_errors='replace',
- )
- hostname = self.ssh_hostname
- username = 'opendevin' if self.run_as_devin else 'root'
- if self.persist_sandbox:
- password_msg = 'using your SSH password'
- else:
- password_msg = f"using the password '{self._ssh_password}'"
- logger.info('Connecting to SSH session...')
- hostname_to_log = hostname.replace('host.docker.internal', 'localhost')
- ssh_cmd = f'`ssh -v -p {self._ssh_port} {username}@{hostname_to_log}`'
- logger.info(
- f'You can debug the SSH connection by running: {ssh_cmd} {password_msg}'
- )
- self.ssh.login(hostname, username, self._ssh_password, port=self._ssh_port)
- logger.info('Connected to SSH session')
- except pxssh.ExceptionPxssh as e:
- logger.exception(
- 'Failed to login to SSH session, retrying...', exc_info=False
- )
- raise e
-
- def start_ssh_session(self):
- time.sleep(1)
- self.__ssh_login()
- assert self.ssh is not None
-
- # Fix: https://github.com/pexpect/pexpect/issues/669
- self.ssh.sendline("bind 'set enable-bracketed-paste off'")
- self.ssh.prompt()
- time.sleep(1)
-
- # cd to workspace
- self.ssh.sendline(f'cd {self.sandbox_workspace_dir}')
- self.ssh.prompt()
-
- def get_exec_cmd(self, cmd: str) -> list[str]:
- if self.run_as_devin:
- return ['su', 'opendevin', '-c', cmd]
- else:
- return ['/bin/bash', '-c', cmd]
-
- def _send_interrupt(
- self,
- cmd: str,
- prev_output: str = '',
- ignore_last_output: bool = False,
- ) -> tuple[int, str]:
- assert self.ssh is not None
- logger.exception(
- f'Command "{cmd}" timed out, killing process...', exc_info=False
- )
- # send a SIGINT to the process
- self.ssh.sendintr()
- self.ssh.prompt()
- command_output = prev_output
- if not ignore_last_output:
- command_output += '\n' + self.ssh.before
- return (
- -1,
- f'Command: "{cmd}" timed out. Sent SIGINT to the process: {command_output}',
- )
-
- def execute(
- self, cmd: str, stream: bool = False, timeout: int | None = None
- ) -> tuple[int, str | CancellableStream]:
- assert self.ssh is not None
- timeout = timeout or self.config.timeout
- commands = split_bash_commands(cmd)
- if len(commands) > 1:
- all_output = ''
- for command in commands:
- exit_code, output = self.execute(command)
- if all_output:
- all_output += '\r\n'
- all_output += str(output)
- if exit_code != 0:
- return exit_code, all_output
- return 0, all_output
-
- self.ssh.sendline(cmd)
- if stream:
- return 0, SSHExecCancellableStream(self.ssh, cmd, self.config.timeout)
- success = self.ssh.prompt(timeout=timeout)
- if not success:
- return self._send_interrupt(cmd)
- command_output = self.ssh.before
-
- # once out, make sure that we have *every* output, we while loop until we get an empty output
- while True:
- self.ssh.sendline('\n')
- timeout_not_reached = self.ssh.prompt(timeout=1)
- if not timeout_not_reached:
- logger.debug('TIMEOUT REACHED')
- break
- output = self.ssh.before
- if isinstance(output, str) and output.strip() == '':
- break
- command_output += output
- command_output = command_output.removesuffix('\r\n')
-
- # get the exit code
- self.ssh.sendline('echo $?')
- self.ssh.prompt()
- exit_code_str = self.ssh.before.strip()
- _start_time = time.time()
- while not exit_code_str:
- self.ssh.prompt(timeout=1)
- exit_code_str = self.ssh.before.strip()
- if time.time() - _start_time > timeout:
- return self._send_interrupt(
- cmd, command_output, ignore_last_output=True
- )
- cleaned_exit_code_str = exit_code_str.replace('echo $?', '').strip()
-
- try:
- exit_code = int(cleaned_exit_code_str)
- except ValueError:
- logger.error(f'Invalid exit code: {cleaned_exit_code_str}')
- # Handle the invalid exit code appropriately (e.g., raise an exception or set a default value)
- exit_code = -1 # or some other appropriate default value
-
- return exit_code, command_output
-
- def copy_to(self, host_src: str, sandbox_dest: str, recursive: bool = False):
- # mkdir -p sandbox_dest if it doesn't exist
- exit_code, logs = self.container.exec_run(
- ['/bin/bash', '-c', f'mkdir -p {sandbox_dest}'],
- workdir=self.sandbox_workspace_dir,
- environment=self._env,
- )
- if exit_code != 0:
- raise Exception(
- f'Failed to create directory {sandbox_dest} in sandbox: {logs}'
- )
-
- # use temp directory to store the tar file to avoid
- # conflict of filename when running multi-processes
- with tempfile.TemporaryDirectory() as tmp_dir:
- if recursive:
- assert os.path.isdir(
- host_src
- ), 'Source must be a directory when recursive is True'
- files = glob(host_src + '/**/*', recursive=True)
- srcname = os.path.basename(host_src)
- tar_filename = os.path.join(tmp_dir, srcname + '.tar')
- with tarfile.open(tar_filename, mode='w') as tar:
- for file in files:
- tar.add(
- file,
- arcname=os.path.relpath(file, os.path.dirname(host_src)),
- )
- else:
- assert os.path.isfile(
- host_src
- ), 'Source must be a file when recursive is False'
- srcname = os.path.basename(host_src)
- tar_filename = os.path.join(tmp_dir, srcname + '.tar')
- with tarfile.open(tar_filename, mode='w') as tar:
- tar.add(host_src, arcname=srcname)
-
- with open(tar_filename, 'rb') as f:
- data = f.read()
- self.container.put_archive(os.path.dirname(sandbox_dest), data)
-
- def start_docker_container(self):
- try:
- container = self.docker_client.containers.get(self.container_name)
- logger.info('Container status: %s', container.status)
- if container.status != 'running':
- container.start()
- logger.info('Container started')
- elapsed = 0
- while container.status != 'running':
- time.sleep(1)
- elapsed += 1
- if elapsed > self.config.timeout:
- break
- container = self.docker_client.containers.get(self.container_name)
- except Exception:
- logger.exception('Failed to start container')
-
- def remove_docker_container(self):
- try:
- container = self.docker_client.containers.get(self.container_name)
- container.stop()
- logger.info('Container stopped')
- container.remove()
- logger.info('Container removed')
- elapsed = 0
- while container.status != 'exited':
- time.sleep(1)
- elapsed += 1
- if elapsed > self.config.timeout:
- break
- container = self.docker_client.containers.get(self.container_name)
- except docker.errors.NotFound:
- pass
-
- def get_working_directory(self):
- exit_code, result = self.execute('pwd')
- if exit_code != 0:
- raise Exception('Failed to get working directory')
- return str(result).strip()
-
- def is_container_running(self):
- try:
- container = self.docker_client.containers.get(self.container_name)
- if container.status == 'running':
- self.container = container
- return True
- return False
- except docker.errors.NotFound:
- return False
-
- @property
- def volumes(self):
- mount_dir = self.workspace_mount_path
- return {
- mount_dir: {'bind': self.sandbox_workspace_dir, 'mode': 'rw'},
- # mount cache directory to /home/opendevin/.cache for pip cache reuse
- self.cache_dir: {
- 'bind': (
- '/home/opendevin/.cache' if self.run_as_devin else '/root/.cache'
- ),
- 'mode': 'rw',
- },
- }
-
- def restart_docker_container(self):
- try:
- self.remove_docker_container()
- except docker.errors.DockerException as ex:
- logger.exception('Failed to remove container', exc_info=False)
- raise ex
-
- try:
- network_kwargs: dict[str, str | dict[str, int]] = {}
- if self.use_host_network:
- network_kwargs['network_mode'] = 'host'
- else:
- # FIXME: This is a temporary workaround for Windows where host network mode has bugs.
- # FIXME: Docker Desktop for Mac OS has experimental support for host network mode
- network_kwargs['ports'] = {f'{self._ssh_port}/tcp': self._ssh_port}
- logger.warning(
- (
- 'Using port forwarding till the enable host network mode of Docker is out of experimental mode.'
- 'Check the 897th issue on https://github.com/OpenDevin/OpenDevin/issues/ for more information.'
- )
- )
-
- # start the container
- logger.info(f'Mounting volumes: {self.volumes}')
- self.container = self.docker_client.containers.run(
- self.container_image,
- # allow root login
- command=f"/usr/sbin/sshd -D -p {self._ssh_port} -o 'PermitRootLogin=yes'",
- **network_kwargs,
- working_dir=self.sandbox_workspace_dir,
- name=self.container_name,
- detach=True,
- volumes=self.volumes,
- )
- logger.info('Container started')
- except Exception as ex:
- logger.exception('Failed to start container: ' + str(ex), exc_info=False)
- raise ex
-
- # wait for container to be ready
- elapsed = 0
- while self.container.status != 'running':
- if self.container.status == 'exited':
- logger.info('container exited')
- logger.info('container logs:')
- logger.info(self.container.logs())
- break
- time.sleep(1)
- elapsed += 1
- self.container = self.docker_client.containers.get(self.container_name)
- logger.info(
- f'waiting for container to start: {elapsed}, container status: {self.container.status}'
- )
- if elapsed > self.config.timeout:
- break
- if self.container.status != 'running':
- raise Exception('Failed to start container')
-
- # clean up the container, cannot do it in __del__ because the python interpreter is already shutting down
- def close(self):
- containers = self.docker_client.containers.list(all=True)
- for container in containers:
- try:
- if container.name.startswith(self.container_name):
- if self.persist_sandbox:
- container.stop()
- else:
- # only remove the container we created
- # otherwise all other containers with the same prefix will be removed
- # which will mess up with parallel evaluation
- container.remove(force=True)
- except docker.errors.NotFound:
- pass
- self.docker_client.close()
-
-
-if __name__ == '__main__':
- try:
- ssh_box = DockerSSHBox(
- config=SandboxConfig(),
- run_as_devin=False,
- workspace_mount_path='/path/to/workspace',
- cache_dir='/path/to/cache',
- sandbox_workspace_dir='/sandbox',
- persist_sandbox=False,
- )
- except Exception as e:
- logger.exception('Failed to start Docker container: %s', e)
- sys.exit(1)
-
- logger.info(
- "Interactive Docker container started. Type 'exit' or use Ctrl+C to exit."
- )
-
- # Initialize required plugins
- plugins: list[PluginRequirement] = [AgentSkillsRequirement(), JupyterRequirement()]
- ssh_box.init_plugins(plugins)
- logger.info(
- '--- AgentSkills COMMAND DOCUMENTATION ---\n'
- f'{AgentSkillsRequirement().documentation}\n'
- '---'
- )
-
- sys.stdout.flush()
- try:
- while True:
- try:
- user_input = input('$ ')
- except EOFError:
- logger.info('Exiting...')
- break
- if user_input.lower() == 'exit':
- logger.info('Exiting...')
- break
- exit_code, output = ssh_box.execute(user_input)
- logger.info('exit code: %d', exit_code)
- logger.info(output)
- sys.stdout.flush()
- except KeyboardInterrupt:
- logger.info('Exiting...')
- ssh_box.close()
diff --git a/opendevin/runtime/e2b/runtime.py b/opendevin/runtime/e2b/runtime.py
index 36d162b3d8d8..5d90408b5545 100644
--- a/opendevin/runtime/e2b/runtime.py
+++ b/opendevin/runtime/e2b/runtime.py
@@ -11,22 +11,26 @@
)
from opendevin.events.stream import EventStream
from opendevin.runtime import Sandbox
-from opendevin.runtime.server.files import insert_lines, read_lines
-from opendevin.runtime.server.runtime import ServerRuntime
+from opendevin.runtime.plugins import PluginRequirement
+from opendevin.runtime.runtime import Runtime
+from ..utils.files import insert_lines, read_lines
from .filestore import E2BFileStore
from .sandbox import E2BSandbox
-class E2BRuntime(ServerRuntime):
+class E2BRuntime(Runtime):
def __init__(
self,
config: AppConfig,
event_stream: EventStream,
sid: str = 'default',
+ plugins: list[PluginRequirement] | None = None,
sandbox: Sandbox | None = None,
):
- super().__init__(config, event_stream, sid, sandbox)
+ super().__init__(config, event_stream, sid, plugins)
+ if sandbox is None:
+ self.sandbox = E2BSandbox()
if not isinstance(self.sandbox, E2BSandbox):
raise ValueError('E2BRuntime requires an E2BSandbox')
self.file_store = E2BFileStore(self.sandbox.filesystem)
diff --git a/opendevin/runtime/plugins/__init__.py b/opendevin/runtime/plugins/__init__.py
index 9c4e01cc8a33..fac44a362544 100644
--- a/opendevin/runtime/plugins/__init__.py
+++ b/opendevin/runtime/plugins/__init__.py
@@ -1,19 +1,15 @@
# Requirements
from .agent_skills import AgentSkillsPlugin, AgentSkillsRequirement
from .jupyter import JupyterPlugin, JupyterRequirement
-from .mixin import PluginMixin
from .requirement import Plugin, PluginRequirement
-from .swe_agent_commands import SWEAgentCommandsRequirement
__all__ = [
'Plugin',
- 'PluginMixin',
'PluginRequirement',
'AgentSkillsRequirement',
'AgentSkillsPlugin',
'JupyterRequirement',
'JupyterPlugin',
- 'SWEAgentCommandsRequirement',
]
ALL_PLUGINS = {
diff --git a/opendevin/runtime/plugins/agent_skills/__init__.py b/opendevin/runtime/plugins/agent_skills/__init__.py
index 8b1a3c7335d5..e331e16096ae 100644
--- a/opendevin/runtime/plugins/agent_skills/__init__.py
+++ b/opendevin/runtime/plugins/agent_skills/__init__.py
@@ -1,4 +1,3 @@
-import os
from dataclasses import dataclass
from opendevin.runtime.plugins.agent_skills.agentskills import DOCUMENTATION
@@ -8,11 +7,6 @@
@dataclass
class AgentSkillsRequirement(PluginRequirement):
name: str = 'agent_skills'
- host_src: str = os.path.dirname(
- os.path.abspath(__file__)
- ) # The directory of this file (opendevin/runtime/plugins/agent_skills)
- sandbox_dest: str = '/opendevin/plugins/agent_skills'
- bash_script_path: str = 'setup.sh'
documentation: str = DOCUMENTATION
diff --git a/opendevin/runtime/plugins/agent_skills/agentskills.py b/opendevin/runtime/plugins/agent_skills/agentskills.py
index d6fdea9084eb..db45daef580c 100644
--- a/opendevin/runtime/plugins/agent_skills/agentskills.py
+++ b/opendevin/runtime/plugins/agent_skills/agentskills.py
@@ -21,6 +21,7 @@
import os
import re
import shutil
+import sys
import tempfile
from inspect import signature
from typing import Optional
@@ -41,30 +42,77 @@
WINDOW = 100
-ENABLE_AUTO_LINT = os.getenv('ENABLE_AUTO_LINT', 'false').lower() == 'true'
-
# This is also used in unit tests!
MSG_FILE_UPDATED = '[File updated (edited at line {line_number}). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]'
+
+# ==================================================================================================
# OPENAI
-OPENAI_API_KEY = os.getenv(
- 'OPENAI_API_KEY', os.getenv('SANDBOX_ENV_OPENAI_API_KEY', '')
-)
-OPENAI_BASE_URL = os.getenv('OPENAI_BASE_URL', 'https://api.openai.com/v1')
-OPENAI_MODEL = os.getenv('OPENAI_MODEL', 'gpt-4o-2024-05-13')
-MAX_TOKEN = os.getenv('MAX_TOKEN', 500)
+# TODO: Move this to EventStream Actions when EventStreamRuntime is fully implemented
+# NOTE: we need to read env vars inside these functions because they are set in
+# IPython AFTER agentskills is imported (as is the case for EventStreamRuntime)
+# ==================================================================================================
+def _get_openai_api_key():
+ return os.getenv('OPENAI_API_KEY', os.getenv('SANDBOX_ENV_OPENAI_API_KEY', ''))
+
+
+def _get_openai_base_url():
+ return os.getenv('OPENAI_BASE_URL', 'https://api.openai.com/v1')
+
+
+def _get_openai_model():
+ return os.getenv('OPENAI_MODEL', 'gpt-4o-2024-05-13')
-OPENAI_PROXY = f'{OPENAI_BASE_URL}/chat/completions'
-client = OpenAI(api_key=OPENAI_API_KEY, base_url=OPENAI_BASE_URL)
+def _get_max_token():
+ return os.getenv('MAX_TOKEN', 500)
+
+
+def _get_openai_client():
+ client = OpenAI(api_key=_get_openai_api_key(), base_url=_get_openai_base_url())
+ return client
+
+
+# ==================================================================================================
# Define the decorator using the functionality of UpdatePwd
def update_pwd_decorator(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
- old_pwd = os.getcwd()
jupyter_pwd = os.environ.get('JUPYTER_PWD', None)
+ try:
+ old_pwd = os.getcwd()
+ except FileNotFoundError:
+ import json
+ import subprocess
+
+ print(
+ f'DEBUGGING Environment variables: {json.dumps(dict(os.environ), indent=2)}'
+ )
+ if sys.platform != 'win32':
+ print(f'DEBUGGING User ID: {os.getuid()}, Group ID: {os.getgid()}')
+
+ out = subprocess.run(['pwd'], capture_output=True)
+ old_pwd = out.stdout.decode('utf-8').strip()
+ os.chdir(old_pwd)
+ print(f'DEBUGGING Change to working directory: {old_pwd}')
+
+ import tempfile
+
+ try:
+ tempfile.TemporaryFile(dir=old_pwd)
+ print(f'DEBUGGING Directory {old_pwd} is writable')
+ except Exception as e:
+ print(f'DEBUGGING Directory {old_pwd} is not writable: {str(e)}')
+
+ # ls -alh
+ out = subprocess.run(['ls', '-alh', old_pwd], capture_output=True)
+ print(
+ f'DEBUGGING OLD working directory contents: {out.stdout.decode("utf-8")}'
+ )
+ print(f'DEBUGGING Target JUPYTER pwd: {jupyter_pwd}')
+
if jupyter_pwd:
os.chdir(jupyter_pwd)
try:
@@ -506,7 +554,10 @@ def _edit_file_impl(
shutil.move(temp_file_path, src_abs_path)
# Handle linting
- if ENABLE_AUTO_LINT:
+    # NOTE: we need to read the env var inside this function
+    # because it is set AFTER agentskills is imported
+ enable_auto_lint = os.getenv('ENABLE_AUTO_LINT', 'false').lower() == 'true'
+ if enable_auto_lint:
# BACKUP the original file
original_file_backup_path = os.path.join(
os.path.dirname(file_name),
@@ -954,7 +1005,9 @@ def parse_audio(file_path: str, model: str = 'whisper-1') -> None:
try:
# TODO: record the COST of the API call
with open(file_path, 'rb') as audio_file:
- transcript = client.audio.translations.create(model=model, file=audio_file)
+ transcript = _get_openai_client().audio.translations.create(
+ model=model, file=audio_file
+ )
print(transcript.text)
except Exception as e:
@@ -975,10 +1028,10 @@ def parse_image(
# TODO: record the COST of the API call
try:
base64_image = _base64_img(file_path)
- response = client.chat.completions.create(
- model=OPENAI_MODEL,
+ response = _get_openai_client().chat.completions.create(
+ model=_get_openai_model(),
messages=_prepare_image_messages(task, base64_image),
- max_tokens=MAX_TOKEN,
+ max_tokens=_get_max_token(),
)
content = response.choices[0].message.content
print(content)
@@ -1021,10 +1074,10 @@ def parse_video(
print(f'Process the {file_path}, current No. {idx * frame_interval} frame...')
# TODO: record the COST of the API call
try:
- response = client.chat.completions.create(
- model=OPENAI_MODEL,
+ response = _get_openai_client().chat.completions.create(
+ model=_get_openai_model(),
messages=_prepare_image_messages(task, base64_frame),
- max_tokens=MAX_TOKEN,
+ max_tokens=_get_max_token(),
)
content = response.choices[0].message.content
@@ -1077,7 +1130,9 @@ def parse_pptx(file_path: str) -> None:
'parse_pptx',
]
-if OPENAI_API_KEY and OPENAI_BASE_URL:
+# This is called from OpenDevin's side
+# If SANDBOX_ENV_OPENAI_API_KEY is set, we will be able to use these tools in the sandbox environment
+if _get_openai_api_key() and _get_openai_base_url():
__all__ += ['parse_audio', 'parse_video', 'parse_image']
DOCUMENTATION = ''
diff --git a/opendevin/runtime/plugins/agent_skills/setup.sh b/opendevin/runtime/plugins/agent_skills/setup.sh
deleted file mode 100755
index 53e5cfe7df65..000000000000
--- a/opendevin/runtime/plugins/agent_skills/setup.sh
+++ /dev/null
@@ -1,21 +0,0 @@
-#!/bin/bash
-
-set -e
-
-OPENDEVIN_PYTHON_INTERPRETER=/opendevin/miniforge3/bin/python
-# check if OPENDEVIN_PYTHON_INTERPRETER exists and it is usable
-if [ -z "$OPENDEVIN_PYTHON_INTERPRETER" ] || [ ! -x "$OPENDEVIN_PYTHON_INTERPRETER" ]; then
- echo "OPENDEVIN_PYTHON_INTERPRETER is not usable. Please pull the latest Docker image!"
- exit 1
-fi
-
-# add agent_skills to PATH
-echo 'export PATH=/opendevin/plugins/agent_skills:$PATH' >> ~/.bashrc
-
-# add agent_skills to PYTHONPATH
-echo 'export PYTHONPATH=/opendevin/plugins/agent_skills:$PYTHONPATH' >> ~/.bashrc
-
-source ~/.bashrc
-
-$OPENDEVIN_PYTHON_INTERPRETER -m pip install flake8 python-docx PyPDF2 python-pptx pylatexenc openai opencv-python
-$OPENDEVIN_PYTHON_INTERPRETER -m pip install diskcache==5.6.3 grep-ast==0.3.2 tree-sitter==0.21.3 tree-sitter-languages==1.10.2
diff --git a/opendevin/runtime/plugins/jupyter/__init__.py b/opendevin/runtime/plugins/jupyter/__init__.py
index 1178a95fd0da..f9c33bfa19cc 100644
--- a/opendevin/runtime/plugins/jupyter/__init__.py
+++ b/opendevin/runtime/plugins/jupyter/__init__.py
@@ -1,10 +1,10 @@
-import os
import subprocess
import time
from dataclasses import dataclass
+from opendevin.core.logger import opendevin_logger as logger
from opendevin.events.action import Action, IPythonRunCellAction
-from opendevin.events.observation import IPythonRunCellObservation, Observation
+from opendevin.events.observation import IPythonRunCellObservation
from opendevin.runtime.plugins.requirement import Plugin, PluginRequirement
from opendevin.runtime.utils import find_available_tcp_port
@@ -14,48 +14,45 @@
@dataclass
class JupyterRequirement(PluginRequirement):
name: str = 'jupyter'
- host_src: str = os.path.dirname(
- os.path.abspath(__file__)
- ) # The directory of this file (opendevin/runtime/plugins/jupyter)
- sandbox_dest: str = '/opendevin/plugins/jupyter'
- bash_script_path: str = 'setup.sh'
-
- # ================================================================
- # Plugin methods, which will ONLY be used in the runtime client
- # running inside docker
- # ================================================================
class JupyterPlugin(Plugin):
name: str = 'jupyter'
- def initialize(self, kernel_id: str = 'opendevin-default'):
+ async def initialize(self, username: str, kernel_id: str = 'opendevin-default'):
self.kernel_gateway_port = find_available_tcp_port()
self.kernel_id = kernel_id
self.gateway_process = subprocess.Popen(
- [
- '/opendevin/miniforge3/bin/mamba',
- 'run',
- '-n',
- 'base',
- 'poetry',
- 'run',
- 'jupyter',
- 'kernelgateway',
- '--KernelGatewayApp.ip=0.0.0.0',
- f'--KernelGatewayApp.port={self.kernel_gateway_port}',
- ],
+ (
+ f"su - {username} -s /bin/bash << 'EOF'\n"
+ 'cd /opendevin/code\n'
+ 'export POETRY_VIRTUALENVS_PATH=/opendevin/poetry;\n'
+ 'export PYTHONPATH=/opendevin/code/opendevin/runtime/plugins/agent_skills:$PYTHONPATH;\n'
+ '/opendevin/miniforge3/bin/mamba run -n base '
+ 'poetry run jupyter kernelgateway '
+ '--KernelGatewayApp.ip=0.0.0.0 '
+ f'--KernelGatewayApp.port={self.kernel_gateway_port}\n'
+ 'EOF'
+ ),
stderr=subprocess.STDOUT,
+ shell=True,
)
# read stdout until the kernel gateway is ready
+ output = ''
while True and self.gateway_process.stdout is not None:
line = self.gateway_process.stdout.readline().decode('utf-8')
+ output += line
if 'at' in line:
break
time.sleep(1)
- print('Waiting for jupyter kernel gateway to start...')
+ logger.debug('Waiting for jupyter kernel gateway to start...')
+
+ logger.info(
+ f'Jupyter kernel gateway started at port {self.kernel_gateway_port}. Output: {output}'
+ )
- async def run(self, action: Action) -> Observation:
+ async def _run(self, action: Action) -> IPythonRunCellObservation:
+ """Internal method to run a code cell in the jupyter kernel."""
if not isinstance(action, IPythonRunCellAction):
raise ValueError(
f'Jupyter plugin only supports IPythonRunCellAction, but got {action}'
@@ -68,8 +65,12 @@ async def run(self, action: Action) -> Observation:
if not self.kernel.initialized:
await self.kernel.initialize()
- output = await self.kernel.execute(action.code)
+ output = await self.kernel.execute(action.code, timeout=action.timeout)
return IPythonRunCellObservation(
content=output,
code=action.code,
)
+
+ async def run(self, action: Action) -> IPythonRunCellObservation:
+ obs = await self._run(action)
+ return obs
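The plugin above now launches the kernel gateway as a specific user by piping a heredoc through `su`, then polls the process's stdout until the gateway prints the address it is listening at. An illustrative, self-contained version of that launch-and-wait pattern (with made-up username, port, and a simplified command, not the exact production invocation) looks roughly like this:

```python
# Launch a long-running process as another user via `su` + a heredoc,
# then block until its stdout contains a readiness marker.
import subprocess

username = 'opendevin'  # assumed example user
port = 12345            # assumed example port

proc = subprocess.Popen(
    (
        f"su - {username} -s /bin/bash << 'EOF'\n"
        'jupyter kernelgateway --KernelGatewayApp.ip=0.0.0.0 '
        f'--KernelGatewayApp.port={port}\n'
        'EOF'
    ),
    shell=True,  # required: the heredoc is interpreted by the shell
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)

assert proc.stdout is not None
while True:
    line = proc.stdout.readline().decode('utf-8')
    if not line:  # process exited before printing the marker
        break
    if 'at' in line:  # gateway prints the URL it is listening at
        break
```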
diff --git a/opendevin/runtime/plugins/jupyter/execute_cli b/opendevin/runtime/plugins/jupyter/execute_cli
deleted file mode 100755
index 9637290e3887..000000000000
--- a/opendevin/runtime/plugins/jupyter/execute_cli
+++ /dev/null
@@ -1,4 +0,0 @@
-#!/bin/bash
-# Run the Python script with the specified interpreter
-export JUPYTER_PWD=$(pwd)
-$OPENDEVIN_PYTHON_INTERPRETER /opendevin/plugins/jupyter/execute_cli.py
diff --git a/opendevin/runtime/plugins/jupyter/execute_cli.py b/opendevin/runtime/plugins/jupyter/execute_cli.py
deleted file mode 100755
index 0b77653e12b5..000000000000
--- a/opendevin/runtime/plugins/jupyter/execute_cli.py
+++ /dev/null
@@ -1,45 +0,0 @@
-import os
-import sys
-import time
-import traceback
-
-import requests
-
-# Read the Python code from STDIN
-code = sys.stdin.read()
-
-
-def execute_code(code, print_output=True):
- PORT = os.environ.get('JUPYTER_EXEC_SERVER_PORT')
- POST_URL = f'http://localhost:{PORT}/execute'
-
- # Set the default kernel ID
- kernel_id = 'default'
- output = ''
- for i in range(3):
- try:
- response = requests.post(
- POST_URL, json={'kernel_id': kernel_id, 'code': code}
- )
- output = response.text
- if '500: Internal Server Error' not in output:
- if print_output:
- print(output)
- break
- except requests.exceptions.ConnectionError:
- if i == 2:
- traceback.print_exc()
- time.sleep(2)
- else:
- if not output:
- with open('/opendevin/logs/jupyter_execute_server.log', 'r') as f:
- output = f.read()
- print('Failed to connect to the Jupyter server', output)
-
-
-if jupyter_pwd := os.environ.get('JUPYTER_PWD'):
- execute_code(
- f'import os\nos.environ["JUPYTER_PWD"] = "{jupyter_pwd}"\n', print_output=False
- )
-
-execute_code(code)
diff --git a/opendevin/runtime/plugins/jupyter/execute_server.py b/opendevin/runtime/plugins/jupyter/execute_server.py
index 1388875c9b5a..35824257479f 100755
--- a/opendevin/runtime/plugins/jupyter/execute_server.py
+++ b/opendevin/runtime/plugins/jupyter/execute_server.py
@@ -7,6 +7,7 @@
from uuid import uuid4
import tornado
+from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_fixed
from tornado.escape import json_decode, json_encode, url_escape
from tornado.httpclient import AsyncHTTPClient, HTTPRequest
from tornado.ioloop import PeriodicCallback
@@ -73,8 +74,8 @@ async def initialize(self):
if os.path.exists('/opendevin/plugins/agent_skills/agentskills.py'):
self.tools_to_run.append('from agentskills import *')
for tool in self.tools_to_run:
- # logging.info(f'Tool initialized:\n{tool}')
- await self.execute(tool)
+ res = await self.execute(tool)
+ logging.info(f'Tool [{tool}] initialized:\n{res}')
self.initialized = True
async def _send_heartbeat(self):
@@ -134,13 +135,18 @@ async def _connect(self):
)
self.heartbeat_callback.start()
+ @retry(
+ retry=retry_if_exception_type(ConnectionRefusedError),
+ stop=stop_after_attempt(3),
+ wait=wait_fixed(2),
+ )
async def execute(self, code, timeout=120):
if not self.ws:
await self._connect()
msg_id = uuid4().hex
assert self.ws is not None
- self.ws.write_message(
+ res = await self.ws.write_message(
json_encode(
{
'header': {
@@ -164,6 +170,7 @@ async def execute(self, code, timeout=120):
}
)
)
+ logging.info(f'Executed code in jupyter kernel:\n{res}')
outputs = []
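The `@retry` decorator added to `execute()` above comes from the `tenacity` library: retry only on `ConnectionRefusedError`, give up after three attempts, and wait two seconds between tries. A tiny standalone usage sketch of that exact combination, with a deliberately failing placeholder function, is:

```python
from tenacity import (
    RetryError,
    retry,
    retry_if_exception_type,
    stop_after_attempt,
    wait_fixed,
)


@retry(
    retry=retry_if_exception_type(ConnectionRefusedError),
    stop=stop_after_attempt(3),
    wait=wait_fixed(2),
)
def connect_once() -> None:
    # Placeholder standing in for the websocket connect/execute path.
    raise ConnectionRefusedError('kernel gateway not up yet')


try:
    connect_once()
except RetryError:
    # tenacity wraps the final failure in RetryError unless reraise=True is set.
    print('gave up after 3 attempts, 2 seconds apart')
```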
diff --git a/opendevin/runtime/plugins/jupyter/setup.sh b/opendevin/runtime/plugins/jupyter/setup.sh
deleted file mode 100755
index 765351880f11..000000000000
--- a/opendevin/runtime/plugins/jupyter/setup.sh
+++ /dev/null
@@ -1,87 +0,0 @@
-#!/bin/bash
-
-set -e
-
-# Hardcoded to use the Python interpreter from the OpenDevin runtime client
-OPENDEVIN_PYTHON_INTERPRETER=/opendevin/miniforge3/bin/python
-# check if OPENDEVIN_PYTHON_INTERPRETER exists and it is usable
-if [ -z "$OPENDEVIN_PYTHON_INTERPRETER" ] || [ ! -x "$OPENDEVIN_PYTHON_INTERPRETER" ]; then
- echo "OPENDEVIN_PYTHON_INTERPRETER is not usable. Please pull the latest Docker image!"
- exit 1
-fi
-
-# Install dependencies
-$OPENDEVIN_PYTHON_INTERPRETER -m pip install jupyterlab notebook jupyter_kernel_gateway
-
-source ~/.bashrc
-# ADD /opendevin/plugins to PATH to make `jupyter_cli` available
-echo 'export PATH=$PATH:/opendevin/plugins/jupyter' >> ~/.bashrc
-export PATH=/opendevin/plugins/jupyter:$PATH
-
-# Temporarily add /opendevin/miniforge3/bin to PATH
-# will not persist after the end of the script
-# This fixes https://github.com/OpenDevin/OpenDevin/pull/2489#issuecomment-2223088169
-export PATH=/opendevin/miniforge3/bin:$PATH
-
-# if user name is `opendevin`, add '/home/opendevin/.local/bin' to PATH
-if [ "$USER" = "opendevin" ]; then
- echo 'export PATH=$PATH:/home/opendevin/.local/bin' >> ~/.bashrc
- echo "export OPENDEVIN_PYTHON_INTERPRETER=$OPENDEVIN_PYTHON_INTERPRETER" >> ~/.bashrc
- export PATH=$PATH:/home/opendevin/.local/bin
- export PIP_CACHE_DIR=$HOME/.cache/pip
-fi
-# if user name is `root`, add '/root/.local/bin' to PATH
-if [ "$USER" = "root" ]; then
- echo 'export PATH=$PATH:/root/.local/bin' >> ~/.bashrc
- echo "export OPENDEVIN_PYTHON_INTERPRETER=$OPENDEVIN_PYTHON_INTERPRETER" >> ~/.bashrc
- export PATH=$PATH:/root/.local/bin
- export PIP_CACHE_DIR=$HOME/.cache/pip
-
-fi
-
-# Run background process to start jupyter kernel gateway
-# write a bash function that finds a free port
-find_free_port() {
- local start_port="${1:-20000}"
- local end_port="${2:-65535}"
-
- for port in $(seq $start_port $end_port); do
- if ! ss -tuln | awk '{print $5}' | grep -q ":$port$"; then
- echo $port
- return
- fi
- done
-
- echo "No free ports found in the range $start_port to $end_port" >&2
- return 1
-}
-
-export JUPYTER_GATEWAY_PORT=$(find_free_port 20000 30000)
-$OPENDEVIN_PYTHON_INTERPRETER -m \
- jupyter kernelgateway --KernelGatewayApp.ip=0.0.0.0 --KernelGatewayApp.port=$JUPYTER_GATEWAY_PORT > /opendevin/logs/jupyter_kernel_gateway.log 2>&1 &
-
-export JUPYTER_GATEWAY_PID=$!
-echo "export JUPYTER_GATEWAY_PID=$JUPYTER_GATEWAY_PID" >> ~/.bashrc
-export JUPYTER_GATEWAY_KERNEL_ID="default"
-echo "export JUPYTER_GATEWAY_KERNEL_ID=$JUPYTER_GATEWAY_KERNEL_ID" >> ~/.bashrc
-echo "JupyterKernelGateway started with PID: $JUPYTER_GATEWAY_PID"
-
-# Start the jupyter_server
-export JUPYTER_EXEC_SERVER_PORT=$(find_free_port 30000 40000)
-echo "export JUPYTER_EXEC_SERVER_PORT=$JUPYTER_EXEC_SERVER_PORT" >> ~/.bashrc
-$OPENDEVIN_PYTHON_INTERPRETER /opendevin/plugins/jupyter/execute_server.py > /opendevin/logs/jupyter_execute_server.log 2>&1 &
-export JUPYTER_EXEC_SERVER_PID=$!
-echo "export JUPYTER_EXEC_SERVER_PID=$JUPYTER_EXEC_SERVER_PID" >> ~/.bashrc
-echo "Execution server started with PID: $JUPYTER_EXEC_SERVER_PID"
-
-# Wait until /opendevin/logs/jupyter_kernel_gateway.log contains "is available"
-while ! grep -q "at" /opendevin/logs/jupyter_kernel_gateway.log; do
- echo "Waiting for Jupyter kernel gateway to be available..."
- sleep 1
-done
-# Wait until /opendevin/logs/jupyter_execute_server.log contains "Jupyter kernel created for conversation"
-while ! grep -q "kernel created" /opendevin/logs/jupyter_execute_server.log; do
- echo "Waiting for Jupyter kernel to be created..."
- sleep 1
-done
-echo "Jupyter kernel ready."
diff --git a/opendevin/runtime/plugins/mixin.py b/opendevin/runtime/plugins/mixin.py
deleted file mode 100644
index 14088eb493fa..000000000000
--- a/opendevin/runtime/plugins/mixin.py
+++ /dev/null
@@ -1,98 +0,0 @@
-import os
-from typing import Protocol
-
-from opendevin.core.logger import opendevin_logger as logger
-from opendevin.core.schema import CancellableStream
-from opendevin.runtime.plugins.requirement import PluginRequirement
-
-
-class SandboxProtocol(Protocol):
- # https://stackoverflow.com/questions/51930339/how-do-i-correctly-add-type-hints-to-mixin-classes
-
- @property
- def initialize_plugins(self) -> bool: ...
-
- def execute(
- self, cmd: str, stream: bool = False
- ) -> tuple[int, str | CancellableStream]: ...
-
- def copy_to(self, host_src: str, sandbox_dest: str, recursive: bool = False): ...
-
-
-def _source_bashrc(sandbox: SandboxProtocol):
- exit_code, output = sandbox.execute(
- 'source /opendevin/bash.bashrc && source ~/.bashrc'
- )
- if exit_code != 0:
- raise RuntimeError(
- f'Failed to source /opendevin/bash.bashrc and ~/.bashrc with exit code {exit_code} and output: {output}'
- )
- logger.info('Sourced /opendevin/bash.bashrc and ~/.bashrc successfully')
-
-
-class PluginMixin:
- """Mixin for Sandbox to support plugins."""
-
- def init_plugins(self: SandboxProtocol, requirements: list[PluginRequirement]):
- """Load a plugin into the sandbox."""
- if hasattr(self, 'plugin_initialized') and self.plugin_initialized:
- return
-
- if self.initialize_plugins:
- logger.info('Initializing plugins in the sandbox')
-
- # clean-up ~/.bashrc and touch ~/.bashrc
- exit_code, output = self.execute('rm -f ~/.bashrc && touch ~/.bashrc')
- if exit_code != 0:
- logger.warning(
- f'Failed to clean-up ~/.bashrc with exit code {exit_code} and output: {output}'
- )
-
- for requirement in requirements:
- # source bashrc file when plugin loads
- _source_bashrc(self)
-
- # copy over the files
- self.copy_to(
- requirement.host_src, requirement.sandbox_dest, recursive=True
- )
- logger.info(
- f'Copied files from [{requirement.host_src}] to [{requirement.sandbox_dest}] inside sandbox.'
- )
-
- # Execute the bash script
- abs_path_to_bash_script = os.path.join(
- requirement.sandbox_dest, requirement.bash_script_path
- )
- logger.info(
- f'Initializing plugin [{requirement.name}] by executing [{abs_path_to_bash_script}] in the sandbox.'
- )
- exit_code, output = self.execute(abs_path_to_bash_script, stream=True)
- if isinstance(output, CancellableStream):
- total_output = ''
- for line in output:
- # Removes any trailing whitespace, including \n and \r\n
- line = line.rstrip()
- # logger.debug(line)
- # Avoid text from lines running into each other
- total_output += line + ' '
- _exit_code = output.exit_code()
- output.close()
- if _exit_code != 0:
- raise RuntimeError(
- f'Failed to initialize plugin {requirement.name} with exit code {_exit_code} and output: {total_output.strip()}'
- )
- logger.info(f'Plugin {requirement.name} initialized successfully')
- else:
- if exit_code != 0:
- raise RuntimeError(
- f'Failed to initialize plugin {requirement.name} with exit code {exit_code} and output: {output}'
- )
- logger.info(f'Plugin {requirement.name} initialized successfully.')
- else:
- logger.info('Skipping plugin initialization in the sandbox')
-
- if len(requirements) > 0:
- _source_bashrc(self)
-
- self.plugin_initialized = True
diff --git a/opendevin/runtime/plugins/requirement.py b/opendevin/runtime/plugins/requirement.py
index 3971dad8c184..b6713fe2a3a5 100644
--- a/opendevin/runtime/plugins/requirement.py
+++ b/opendevin/runtime/plugins/requirement.py
@@ -14,7 +14,7 @@ class Plugin:
name: str
@abstractmethod
- def initialize(self):
+ async def initialize(self, username: str):
"""Initialize the plugin."""
pass
@@ -29,8 +29,3 @@ class PluginRequirement:
"""Requirement for a plugin."""
name: str
- # FOLDER/FILES to be copied to the sandbox
- host_src: str
- sandbox_dest: str
- # NOTE: bash_script_path should be relative to the `sandbox_dest` path
- bash_script_path: str
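With `host_src`, `sandbox_dest`, and `bash_script_path` removed, a plugin is now just a name plus an async `initialize(username)` that runs inside the runtime client. A hypothetical minimal plugin against this new interface might look like the following (the `Echo*` names are invented for illustration):

```python
# Toy plugin written against the new async interface shown above.
from dataclasses import dataclass

from opendevin.runtime.plugins.requirement import Plugin, PluginRequirement


@dataclass
class EchoRequirement(PluginRequirement):
    name: str = 'echo'


class EchoPlugin(Plugin):
    name: str = 'echo'

    async def initialize(self, username: str):
        # Nothing to set up for this toy plugin; real plugins start
        # processes here (see JupyterPlugin above).
        self.username = username
```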
diff --git a/opendevin/runtime/plugins/swe_agent_commands/__init__.py b/opendevin/runtime/plugins/swe_agent_commands/__init__.py
deleted file mode 100644
index 66316f36b20c..000000000000
--- a/opendevin/runtime/plugins/swe_agent_commands/__init__.py
+++ /dev/null
@@ -1,71 +0,0 @@
-import os
-from dataclasses import dataclass, field
-
-from opendevin.runtime.plugins.requirement import PluginRequirement
-from opendevin.runtime.plugins.swe_agent_commands.parse_commands import (
- parse_command_file,
-)
-
-
-def _resolve_to_cur_dir(filename):
- return os.path.join(os.path.dirname(os.path.abspath(__file__)), filename)
-
-
-def check_and_parse_command_file(filepath) -> str:
- if filepath is None:
- raise FileNotFoundError(f'File not found: {filepath}')
- return parse_command_file(filepath)
-
-
-DEFAULT_SCRIPT_FILEPATHS = [
- _resolve_to_cur_dir('defaults.sh'),
- _resolve_to_cur_dir('search.sh'),
- _resolve_to_cur_dir('edit_linting.sh'),
-]
-DEFAULT_DOCUMENTATION = ''.join(
- [
- check_and_parse_command_file(filepath)
- for filepath in DEFAULT_SCRIPT_FILEPATHS
- if filepath is not None
- ]
-)
-
-
-@dataclass
-class SWEAgentCommandsRequirement(PluginRequirement):
- name: str = 'swe_agent_commands'
- host_src: str = os.path.dirname(os.path.abspath(__file__))
- sandbox_dest: str = '/opendevin/plugins/swe_agent_commands'
- bash_script_path: str = 'setup_default.sh'
-
- scripts_filepaths: list[str | None] = field(
- default_factory=lambda: DEFAULT_SCRIPT_FILEPATHS
- )
- documentation: str = DEFAULT_DOCUMENTATION
-
-
-CURSOR_SCRIPT_FILEPATHS = [
- _resolve_to_cur_dir('cursors_defaults.sh'),
- _resolve_to_cur_dir('cursors_edit_linting.sh'),
- _resolve_to_cur_dir('search.sh'),
-]
-CURSOR_DOCUMENTATION = ''.join(
- [
- check_and_parse_command_file(filepath)
- for filepath in CURSOR_SCRIPT_FILEPATHS
- if filepath is not None
- ]
-)
-
-
-@dataclass
-class SWEAgentCursorCommandsRequirement(PluginRequirement):
- name: str = 'swe_agent_commands'
- host_src: str = os.path.dirname(os.path.abspath(__file__))
- sandbox_dest: str = '/opendevin/plugins/swe_agent_commands'
- bash_script_path: str = 'setup_cursor_mode.sh'
-
- scripts_filepaths: list[str | None] = field(
- default_factory=lambda: CURSOR_SCRIPT_FILEPATHS
- )
- documentation: str = CURSOR_DOCUMENTATION
diff --git a/opendevin/runtime/plugins/swe_agent_commands/_setup_cursor_mode_env.sh b/opendevin/runtime/plugins/swe_agent_commands/_setup_cursor_mode_env.sh
deleted file mode 100755
index 3f4e8bf9e020..000000000000
--- a/opendevin/runtime/plugins/swe_agent_commands/_setup_cursor_mode_env.sh
+++ /dev/null
@@ -1,24 +0,0 @@
-# Cursor Mode from SWE-Bench
-# https://github.com/princeton-nlp/SWE-agent/blob/ca54d5556b9db4f4f2be21f09530ce69a72c0305/config/configs/default_sys-env_cursors_window100-detailed_cmd_format-last_5_history-1_demos.yaml
-export WINDOW=200;
-export OVERLAP=2;
-export CURRENT_LINE=0;
-export CURRENT_FILE='';
-export SEARCH_RESULTS=();
-export SEARCH_FILES=();
-export SEARCH_INDEX=0;
-export START_INDEX=0;
-export END_INDEX=0;
-export START_CURSOR=0;
-export END_CURSOR=0;
-export START_CURSOR_MARK='"<<<<< START CURSOR >>>>>"'; # these have to use double quotes
-export END_CURSOR_MARK='"<<<<< END CURSOR >>>>>"'; # these have to use double quotes
-
-state() {
- local working_dir="$PWD";
- if [ -z $CURRENT_FILE ]; then
- echo '{"open_file": "n/a", "working_dir": "'$working_dir'"}';
- else
- echo '{"open_file": "'$(realpath $CURRENT_FILE)'", "working_dir": "'$working_dir'"}';
- fi
-};
diff --git a/opendevin/runtime/plugins/swe_agent_commands/_setup_default_env.sh b/opendevin/runtime/plugins/swe_agent_commands/_setup_default_env.sh
deleted file mode 100755
index fc0dbad7b870..000000000000
--- a/opendevin/runtime/plugins/swe_agent_commands/_setup_default_env.sh
+++ /dev/null
@@ -1,18 +0,0 @@
-# Default Mode from SWE-Bench
-# https://github.com/princeton-nlp/SWE-agent/blob/ca54d5556b9db4f4f2be21f09530ce69a72c0305/config/configs/default_sys-env_window100-detailed_cmd_format-last_5_history-1_demos.yaml
-export WINDOW=100;
-export OVERLAP=2;
-export CURRENT_LINE=0;
-export CURRENT_FILE='';
-export SEARCH_RESULTS=();
-export SEARCH_FILES=();
-export SEARCH_INDEX=0;
-
-state() {
- local working_dir="$PWD";
- if [ -z $CURRENT_FILE ]; then
- echo '{"open_file": "n/a", "working_dir": "'$working_dir'"}';
- else
- echo '{"open_file": "'$(realpath $CURRENT_FILE)'", "working_dir": "'$working_dir'"}';
- fi
-};
diff --git a/opendevin/runtime/plugins/swe_agent_commands/_split_string b/opendevin/runtime/plugins/swe_agent_commands/_split_string
deleted file mode 100755
index 7525fdd2419e..000000000000
--- a/opendevin/runtime/plugins/swe_agent_commands/_split_string
+++ /dev/null
@@ -1,17 +0,0 @@
-#!/usr/bin/env python3
-import sys
-
-
-def print_flake8_output(input_string, show_line_numbers=False):
- for value in input_string.split('\n'):
- parts = value.split()
- if not show_line_numbers:
- print(f"- {' '.join(parts[1:])}")
- else:
- line_nums = ':'.join(parts[0].split(':')[1:])
- print(f"- {line_nums} {' '.join(parts[1:])}")
-
-
-if __name__ == '__main__':
- lint_output = sys.argv[1]
- print_flake8_output(lint_output)
diff --git a/opendevin/runtime/plugins/swe_agent_commands/cursors_defaults.sh b/opendevin/runtime/plugins/swe_agent_commands/cursors_defaults.sh
deleted file mode 100644
index 561900371b1e..000000000000
--- a/opendevin/runtime/plugins/swe_agent_commands/cursors_defaults.sh
+++ /dev/null
@@ -1,306 +0,0 @@
-_reset_cursors() {
- export START_CURSOR=1
- export END_CURSOR=1
-}
-
-_constrain_cursors() {
- # constrain the cursors to be within the bounds of the file [0, total_lines+1]
- local total_lines=$(awk 'END {print NR}' "$CURRENT_FILE")
- total_lines=$((total_lines < 1 ? 1 : total_lines)) # if the file is empty, set total_lines to 1
- local start_line=$((CURRENT_LINE - WINDOW / 2))
- local end_line=$((CURRENT_LINE + WINDOW / 2))
- start_line=$((start_line < 1 ? 1 : start_line))
- end_line=$((end_line > total_lines ? total_lines : end_line))
- local warning_string=""
- if [ "$START_CURSOR" -lt "$start_line" ]; then
- warning_string+="START_CURSOR moved to $start_line\n"
- START_CURSOR=$start_line
- elif [ "$START_CURSOR" -gt "$end_line" ]; then
- START_CURSOR=$end_line
- warning_string+="START_CURSOR moved to $end_line\n"
- fi
- if [ "$END_CURSOR" -lt "$start_line" ]; then
- warning_string+="END_CURSOR moved to $start_line\n"
- END_CURSOR=$start_line
- elif [ "$END_CURSOR" -gt "$end_line" ]; then
- warning_string+="END_CURSOR moved to $end_line\n"
- END_CURSOR=$end_line
- fi
- export START_CURSOR END_CURSOR
- echo "$warning_string"
- echo $START_CURSOR $END_CURSOR
-}
-
-_print() {
- local cursor_warning=$(_constrain_cursors)
- local cursor_values=$(echo "$cursor_warning" | tail -n 1)
- cursor_warning=$(echo "$cursor_warning" | head -n -1)
- export START_CURSOR=$(echo "$cursor_values" | awk '{print $1}')
- export END_CURSOR=$(echo "$cursor_values" | awk '{print $2}')
- local total_lines=$(awk 'END {print NR}' $CURRENT_FILE)
- echo "[File: $(realpath "$CURRENT_FILE") ($total_lines lines total)]"
- local start_line=$((CURRENT_LINE - WINDOW / 2))
- local end_line=$((CURRENT_LINE + WINDOW / 2))
- start_line=$((start_line < 1 ? 1 : start_line))
- end_line=$((end_line > total_lines ? total_lines : end_line))
- local lines=()
- local i=0
- while IFS= read -r line; do
- lines[i++]="$line"
- done < <(awk -v start="$start_line" -v end="$end_line" 'NR>=start && NR<=end {print}' "$CURRENT_FILE")
- local num_lines=${#lines[@]}
- if [ $start_line -gt 1 ]; then
- echo "($((start_line - 1)) more lines above)"
- fi
- for ((i=0; i
-# @yaml
-# signature: set_cursors <start_line> <end_line>
-# docstring: sets the start and end cursors to the given line numbers
-# arguments:
-# start_line:
-# type: integer
-# description: the line number to set the start cursor to
-# required: true
-# end_line:
-# type: integer
-# description: the line number to set the end cursor to
-# required: true
-set_cursors() {
- if [ -z "$CURRENT_FILE" ]
- then
- echo "No file open. Use the open command first."
- return
- fi
- if [ $# -lt 2 ]
- then
- echo "Usage: set_cursors "
- return
- fi
- local start_line=$1
- local end_line=$2
- local re='^[0-9]+$'
- if ! [[ $start_line =~ $re ]]
- then
- echo "Usage: set_cursors "
- echo "Error: start_line must be a number"
- return
- fi
- if ! [[ $end_line =~ $re ]]
- then
- echo "Usage: set_cursors "
- echo "Error: end_line must be a number"
- return
- fi
- if [ $start_line -gt $end_line ]
- then
- echo "Usage: set_cursors "
- echo "Error: start_line must be less than or equal to end_line"
- return
- fi
- export START_CURSOR=$start_line
- export END_CURSOR=$end_line
- _print
-}
-
-# @yaml
-# signature: open <path> [<line_number>]
-# docstring: opens the file at the given path in the editor. If line_number is provided, the window will be centered on that line
-# arguments:
-# path:
-# type: string
-# description: the path to the file to open
-# required: true
-# line_number:
-# type: integer
-# description: the line number to move the window to (if not provided, the window will start at the top of the file)
-# required: false
-open() {
- if [ -z "$1" ]
- then
- echo "Usage: open "
- return
- fi
- # Check if the second argument is provided
- if [ -n "$2" ]; then
- # Check if the provided argument is a valid number
- if ! [[ $2 =~ ^[0-9]+$ ]]; then
- echo "Usage: open []"
- echo "Error: must be a number"
- return # Exit if the line number is not valid
- fi
- local max_line=$(awk 'END {print NR}' $1)
- if [ $2 -gt $max_line ]; then
- echo "Warning: ($2) is greater than the number of lines in the file ($max_line)"
- echo "Warning: Setting to $max_line"
- local line_number=$(jq -n "$max_line") # Set line number to max if greater than max
- elif [ $2 -lt 1 ]; then
- echo "Warning: ($2) is less than 1"
- echo "Warning: Setting to 1"
- local line_number=$(jq -n "1") # Set line number to 1 if less than 1
- else
- local line_number=$(jq -n "$2") # Set line number if valid
- fi
- else
- local line_number=$(jq -n "$WINDOW/2") # Set default line number if not provided
- fi
-
- if [ -f "$1" ]; then
- export CURRENT_FILE=$(realpath $1)
- export CURRENT_LINE=$line_number
- _constrain_line
- _print
- else
- echo "File $1 not found"
- fi
-}
-
-# @yaml
-# signature: scroll_down
-# docstring: moves the window down {WINDOW} lines
-scroll_down() {
- if [ -z "$CURRENT_FILE" ]
- then
- echo "No file open. Use the open command first."
- return
- fi
- export CURRENT_LINE=$(jq -n "$CURRENT_LINE + $WINDOW - $OVERLAP")
- _constrain_line
- _print
-}
-
-# @yaml
-# signature: scroll_up
-# docstring: moves the window up {WINDOW} lines
-scroll_up() {
- if [ -z "$CURRENT_FILE" ]
- then
- echo "No file open. Use the open command first."
- return
- fi
- export CURRENT_LINE=$(jq -n "$CURRENT_LINE - $WINDOW + $OVERLAP")
- _constrain_line
- _print
-}
-
-# @yaml
-# signature: goto <line_number>
-# docstring: moves the window to show <line_number>
-# arguments:
-# line_number:
-# type: integer
-# description: the line number to move the window to
-# required: true
-goto() {
- if [ $# -gt 1 ]; then
- echo "goto allows only one line number at a time."
- return
- fi
- if [ -z "$CURRENT_FILE" ]
- then
- echo "No file open. Use the open command first."
- return
- fi
- if [ -z "$1" ]
- then
- echo "Usage: goto "
- return
- fi
- if ! [[ $1 =~ ^[0-9]+$ ]]
- then
- echo "Usage: goto "
- echo "Error: must be a number"
- return
- fi
- local max_line=$(awk 'END {print NR}' $CURRENT_FILE)
- if [ $1 -gt $max_line ]
- then
- echo "Error: must be less than or equal to $max_line"
- return
- fi
- local OFFSET=$(jq -n "$WINDOW/6" | jq 'floor')
- export CURRENT_LINE=$(jq -n "[$1 + $WINDOW/2 - $OFFSET, 1] | max | floor")
- _constrain_line
- _print
-}
-
-# @yaml
-# signature: create <filename>
-# docstring: creates and opens a new file with the given name
-# arguments:
-# filename:
-# type: string
-# description: the name of the file to create
-# required: true
-create() {
- if [ -z "$1" ]; then
- echo "Usage: create "
- return
- fi
-
- # Check if the file already exists
- if [ -e "$1" ]; then
- echo "Error: File '$1' already exists."
- open "$1"
- return
- fi
-
- # Create the file with an empty new line
- printf "\n" > "$1"
- # Use the existing open command to open the created file
- open "$1"
-}
-
-# @yaml
-# signature: submit
-# docstring: submits your current code and terminates the session
-submit() {
- cd $ROOT
-
- # Check if the patch file exists and is non-empty
- if [ -s "$SWE_CMD_WORK_DIR/test.patch" ]; then
- # Apply the patch in reverse
- git apply -R < "$SWE_CMD_WORK_DIR/test.patch"
- fi
-
- git add -A
- git diff --cached > model.patch
- echo "<>"
-}
diff --git a/opendevin/runtime/plugins/swe_agent_commands/cursors_edit_linting.sh b/opendevin/runtime/plugins/swe_agent_commands/cursors_edit_linting.sh
deleted file mode 100644
index a11f82f5aadd..000000000000
--- a/opendevin/runtime/plugins/swe_agent_commands/cursors_edit_linting.sh
+++ /dev/null
@@ -1,100 +0,0 @@
-# @yaml
-# signature: |-
-# edit <<EOF
-# <replacement_text>
-# EOF
-# docstring: replaces *all* of the text between the START CURSOR and the END CURSOR with the replacement_text. The replacement text is delineated using heredoc syntax. All of the <replacement_text> will be entered, so make sure your indentation is formatted properly. To enter text at the beginning of the file, set START CURSOR and END CURSOR to 0. Use set_cursors to move the cursors around. Python files will be checked for syntax errors after the edit.
-# arguments:
-# replacement_text:
-# type: string
-# description: the text to replace the current selection with
-# required: true
-edit() {
- if [ -z "$CURRENT_FILE" ]
- then
- echo 'No file is opened. Use the `open` command first.'
- return
- fi
- local start_line=$((START_CURSOR - 1))
- start_line=$((start_line < 0 ? 0 : start_line))
- local end_line=$((END_CURSOR))
- end_line=$((end_line < 0 ? 0 : end_line))
-
- local replacement=()
- while IFS= read -r line
- do
- replacement+=("$line")
- done
-
- local num_lines=${#replacement[@]}
- # Create a backup of the current file
- cp "$CURRENT_FILE" "$SWE_CMD_WORK_DIR/$(basename "$CURRENT_FILE")_backup"
- # Read the file line by line into an array
- mapfile -t lines < "$CURRENT_FILE"
- local new_lines=("${lines[@]:0:$start_line}" "${replacement[@]}" "${lines[@]:$((end_line))}")
- # Write the new stuff directly back into the original file
- printf "%s\n" "${new_lines[@]}" >| "$CURRENT_FILE"
- # Run linter if enabled
- if [[ $CURRENT_FILE == *.py && -n "$ENABLE_AUTO_LINT" ]]; then
- lint_output=$(flake8 --isolated --select=F821,F822,F831,E111,E112,E113,E999,E902 "$CURRENT_FILE" 2>&1)
- else
- # do nothing
- lint_output=""
- fi
- # if there is no output, then the file is good
- if [ -z "$lint_output" ]; then
- _constrain_line
- # set to START + num_lines - 1, unless num_lines is 0, then set to START
- export END_CURSOR=$((num_lines == 0 ? START_CURSOR : START_CURSOR + num_lines - 1))
- export START_CURSOR=$START_CURSOR
- _print
- echo "File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary."
- else
- echo "Your proposed edit has introduced new syntax error(s). Please understand the fixes and retry your edit command."
- echo ""
- echo "ERRORS:"
- _split_string "$lint_output"
- echo ""
-
- # Save original values
- original_current_line=$CURRENT_LINE
- original_window=$WINDOW
- original_end_cursor=$END_CURSOR
-
- # Update values
- export CURRENT_LINE=$(( (num_lines / 2) + start_line )) # Set to "center" of edit
- export WINDOW=$((num_lines + 10)) # Show +/- 5 lines around edit
- export END_CURSOR=$((num_lines == 0 ? START_CURSOR : START_CURSOR + num_lines - 1))
-
- echo "This is how your edit would have looked if applied"
- echo "-------------------------------------------------"
- _constrain_line
- _print
- echo "-------------------------------------------------"
- echo ""
-
- # Restoring CURRENT_FILE to original contents.
- cp "$SWE_CMD_WORK_DIR/$(basename "$CURRENT_FILE")_backup" "$CURRENT_FILE"
-
- export CURRENT_LINE=$(( ((end_line - start_line) / 2) + start_line )) # Set to "center" of edit
- export WINDOW=$((end_line - start_line + 10))
- export END_CURSOR=$original_end_cursor
-
- echo "This is the original code before your edit"
- echo "-------------------------------------------------"
- _constrain_line
- _print
- echo "-------------------------------------------------"
-
- # Restore original values
- export CURRENT_LINE=$original_current_line
- export WINDOW=$original_window
- export END_CURSOR=$original_end_cursor
-
- echo "Your changes have NOT been applied. Please fix your edit command and try again."
- echo "You either need to 1) Specify the correct start/end line arguments or 2) Correct your edit code."
- echo "DO NOT re-run the same failed edit command. Running it again will lead to the same error."
- fi
- # Remove backup file
- rm -f "$SWE_CMD_WORK_DIR/$(basename "$CURRENT_FILE")_backup"
-}
diff --git a/opendevin/runtime/plugins/swe_agent_commands/defaults.sh b/opendevin/runtime/plugins/swe_agent_commands/defaults.sh
deleted file mode 100644
index 706368d72c5d..000000000000
--- a/opendevin/runtime/plugins/swe_agent_commands/defaults.sh
+++ /dev/null
@@ -1,195 +0,0 @@
-_print() {
- local total_lines=$(awk 'END {print NR}' $CURRENT_FILE)
- echo "[File: $(realpath $CURRENT_FILE) ($total_lines lines total)]"
- lines_above=$(jq -n "$CURRENT_LINE - $WINDOW/2" | jq '[0, .] | max | floor')
- lines_below=$(jq -n "$total_lines - $CURRENT_LINE - $WINDOW/2" | jq '[0, .] | max | round')
- if [ $lines_above -gt 0 ]; then
- echo "($lines_above more lines above)"
- fi
- cat $CURRENT_FILE | grep -n $ | head -n $(jq -n "[$CURRENT_LINE + $WINDOW/2, $WINDOW/2] | max | floor") | tail -n $(jq -n "$WINDOW")
- if [ $lines_below -gt 0 ]; then
- echo "($lines_below more lines below)"
- fi
-}
-
-_constrain_line() {
- if [ -z "$CURRENT_FILE" ]
- then
- echo "No file open. Use the open command first."
- return
- fi
- local max_line=$(awk 'END {print NR}' $CURRENT_FILE)
- local half_window=$(jq -n "$WINDOW/2" | jq 'floor')
- export CURRENT_LINE=$(jq -n "[$CURRENT_LINE, $max_line - $half_window] | min")
- export CURRENT_LINE=$(jq -n "[$CURRENT_LINE, $half_window] | max")
-}
-
-# @yaml
-# signature: open <path> [<line_number>]
-# docstring: opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line
-# arguments:
-# path:
-# type: string
-# description: the path to the file to open
-# required: true
-# line_number:
-# type: integer
-# description: the line number to move the window to (if not provided, the window will start at the top of the file)
-# required: false
-open() {
- if [ -z "$1" ]
- then
- echo "Usage: open "
- return
- fi
- # Check if the second argument is provided
- if [ -n "$2" ]; then
- # Check if the provided argument is a valid number
- if ! [[ $2 =~ ^[0-9]+$ ]]; then
- echo "Usage: open []"
- echo "Error: must be a number"
- return # Exit if the line number is not valid
- fi
- local max_line=$(awk 'END {print NR}' $1)
- if [ $2 -gt $max_line ]; then
- echo "Warning: ($2) is greater than the number of lines in the file ($max_line)"
- echo "Warning: Setting to $max_line"
- local line_number=$(jq -n "$max_line") # Set line number to max if greater than max
- elif [ $2 -lt 1 ]; then
- echo "Warning: ($2) is less than 1"
- echo "Warning: Setting to 1"
- local line_number=$(jq -n "1") # Set line number to 1 if less than 1
- else
- local OFFSET=$(jq -n "$WINDOW/6" | jq 'floor')
- local line_number=$(jq -n "[$2 + $WINDOW/2 - $OFFSET, 1] | max | floor")
- fi
- else
- local line_number=$(jq -n "$WINDOW/2") # Set default line number if not provided
- fi
-
- if [ -f "$1" ]; then
- export CURRENT_FILE=$(realpath $1)
- export CURRENT_LINE=$line_number
- _constrain_line
- _print
- elif [ -d "$1" ]; then
- echo "Error: $1 is a directory. You can only open files. Use cd or ls to navigate directories."
- else
- echo "File $1 not found"
- fi
-}
-
-# @yaml
-# signature: goto <line_number>
-# docstring: moves the window to show <line_number>
-# arguments:
-# line_number:
-# type: integer
-# description: the line number to move the window to
-# required: true
-goto() {
- if [ $# -gt 1 ]; then
- echo "goto allows only one line number at a time."
- return
- fi
- if [ -z "$CURRENT_FILE" ]
- then
- echo "No file open. Use the open command first."
- return
- fi
- if [ -z "$1" ]
- then
- echo "Usage: goto "
- return
- fi
- if ! [[ $1 =~ ^[0-9]+$ ]]
- then
- echo "Usage: goto "
- echo "Error: must be a number"
- return
- fi
- local max_line=$(awk 'END {print NR}' $CURRENT_FILE)
- if [ $1 -gt $max_line ]
- then
- echo "Error: must be less than or equal to $max_line"
- return
- fi
- local OFFSET=$(jq -n "$WINDOW/6" | jq 'floor')
- export CURRENT_LINE=$(jq -n "[$1 + $WINDOW/2 - $OFFSET, 1] | max | floor")
- _constrain_line
- _print
-}
-
-# @yaml
-# signature: scroll_down
-# docstring: moves the window down {WINDOW} lines
-scroll_down() {
- if [ -z "$CURRENT_FILE" ]
- then
- echo "No file open. Use the open command first."
- return
- fi
- export CURRENT_LINE=$(jq -n "$CURRENT_LINE + $WINDOW - $OVERLAP")
- _constrain_line
- _print
-}
-
-# @yaml
-# signature: scroll_up
-# docstring: moves the window up {WINDOW} lines
-scroll_up() {
- if [ -z "$CURRENT_FILE" ]
- then
- echo "No file open. Use the open command first."
- return
- fi
- export CURRENT_LINE=$(jq -n "$CURRENT_LINE - $WINDOW + $OVERLAP")
- _constrain_line
- _print
-}
-
-# @yaml
-# signature: create <filename>
-# docstring: creates and opens a new file with the given name
-# arguments:
-# filename:
-# type: string
-# description: the name of the file to create
-# required: true
-create() {
- if [ -z "$1" ]; then
- echo "Usage: create "
- return
- fi
-
- # Check if the file already exists
- if [ -e "$1" ]; then
- echo "Error: File '$1' already exists."
- open "$1"
- return
- fi
-
- # Create the file with an empty new line
- printf "\n" > "$1"
- # Use the existing open command to open the created file
- open "$1"
-}
-
-# @yaml
-# signature: submit
-# docstring: submits your current code and terminates the session
-submit() {
- cd $ROOT
-
- # Check if the patch file exists and is non-empty
- if [ -s "$SWE_CMD_WORK_DIR/test.patch" ]; then
- # Apply the patch in reverse
- git apply -R < "$SWE_CMD_WORK_DIR/test.patch"
- fi
-
- git add -A
- git diff --cached > model.patch
- echo "<>"
-}
diff --git a/opendevin/runtime/plugins/swe_agent_commands/edit_linting.sh b/opendevin/runtime/plugins/swe_agent_commands/edit_linting.sh
deleted file mode 100644
index 8341c3cd8c8e..000000000000
--- a/opendevin/runtime/plugins/swe_agent_commands/edit_linting.sh
+++ /dev/null
@@ -1,129 +0,0 @@
-# @yaml
-# signature: |-
-# edit <start_line>:<end_line> <<EOF
-# <replacement_text>
-# EOF
-# docstring: replaces lines <start_line> through <end_line> (inclusive) with the given text in the open file. The replacement text is delineated using heredoc syntax. All of the <replacement_text> will be entered, so make sure your indentation is formatted properly. Python files will be checked for syntax errors after the edit. If the system detects a syntax error, the edit will not be executed. Simply try to edit the file again, but make sure to read the error message and modify the edit command you issue accordingly. Issuing the same command a second time will just lead to the same error message again. Remember, the file must be open before editing.
-# arguments:
-# start_line:
-# type: integer
-# description: the line number to start the edit at
-# required: true
-# end_line:
-# type: integer
-# description: the line number to end the edit at (inclusive)
-# required: true
-# replacement_text:
-# type: string
-# description: the text to replace the current selection with
-# required: true
-edit() {
- if [ -z "$CURRENT_FILE" ]
- then
- echo 'No file open. Use the `open` command first.'
- return
- fi
-
- local start_line="$(echo $1: | cut -d: -f1)"
- local end_line="$(echo $1: | cut -d: -f2)"
-
- if [ -z "$start_line" ] || [ -z "$end_line" ]
- then
- echo "Usage: edit :"
- return
- fi
-
- local re='^[0-9]+$'
- if ! [[ $start_line =~ $re ]]; then
- echo "Usage: edit :"
- echo "Error: start_line must be a number"
- return
- fi
- if ! [[ $end_line =~ $re ]]; then
- echo "Usage: edit :"
- echo "Error: end_line must be a number"
- return
- fi
-
- # Bash array starts at 0, so let's adjust
- local start_line=$((start_line - 1))
- local end_line=$((end_line))
-
- local line_count=0
- local replacement=()
- while IFS= read -r line
- do
- replacement+=("$line")
- ((line_count++))
- done
-
- # Create a backup of the current file
- cp "$CURRENT_FILE" "$SWE_CMD_WORK_DIR/$(basename "$CURRENT_FILE")_backup"
-
- # Read the file line by line into an array
- mapfile -t lines < "$CURRENT_FILE"
- local new_lines=("${lines[@]:0:$start_line}" "${replacement[@]}" "${lines[@]:$((end_line))}")
- # Write the new stuff directly back into the original file
- printf "%s\n" "${new_lines[@]}" >| "$CURRENT_FILE"
-
- # Run linter if enabled
- if [[ $CURRENT_FILE == *.py && -n "$ENABLE_AUTO_LINT" ]]; then
- lint_output=$(flake8 --isolated --select=F821,F822,F831,E111,E112,E113,E999,E902 "$CURRENT_FILE" 2>&1)
- else
- # do nothing
- lint_output=""
- fi
-
- # if there is no output, then the file is good
- if [ -z "$lint_output" ]; then
- export CURRENT_LINE=$start_line
- _constrain_line
- _print
-
- echo "File updated. Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary."
- else
- echo "Your proposed edit has introduced new syntax error(s). Please understand the fixes and retry your edit command."
- echo ""
- echo "ERRORS:"
- _split_string "$lint_output"
- echo ""
-
- # Save original values
- original_current_line=$CURRENT_LINE
- original_window=$WINDOW
-
- # Update values
- export CURRENT_LINE=$(( (line_count / 2) + start_line )) # Set to "center" of edit
- export WINDOW=$((line_count + 10)) # Show +/- 5 lines around edit
-
- echo "This is how your edit would have looked if applied"
- echo "-------------------------------------------------"
- _constrain_line
- _print
- echo "-------------------------------------------------"
- echo ""
-
- # Restoring CURRENT_FILE to original contents.
- cp "$SWE_CMD_WORK_DIR/$(basename "$CURRENT_FILE")_backup" "$CURRENT_FILE"
-
- export CURRENT_LINE=$(( ((end_line - start_line + 1) / 2) + start_line ))
- export WINDOW=$((end_line - start_line + 10))
-
- echo "This is the original code before your edit"
- echo "-------------------------------------------------"
- _constrain_line
- _print
- echo "-------------------------------------------------"
-
- # Restore original values
- export CURRENT_LINE=$original_current_line
- export WINDOW=$original_window
-
- echo "Your changes have NOT been applied. Please fix your edit command and try again."
- echo "You either need to 1) Specify the correct start/end line arguments or 2) Correct your edit code."
- echo "DO NOT re-run the same failed edit command. Running it again will lead to the same error."
- fi
-
- # Remove backup file
- rm -f "$SWE_CMD_WORK_DIR/$(basename "$CURRENT_FILE")_backup"
-}
diff --git a/opendevin/runtime/plugins/swe_agent_commands/parse_commands.py b/opendevin/runtime/plugins/swe_agent_commands/parse_commands.py
deleted file mode 100644
index 49aa9489646c..000000000000
--- a/opendevin/runtime/plugins/swe_agent_commands/parse_commands.py
+++ /dev/null
@@ -1,62 +0,0 @@
-from dataclasses import dataclass
-
-import yaml
-
-
-@dataclass()
-class Command:
- name: str
- docstring: str | None = None
- signature: str | None = None
-
-
-def parse_command_file(filepath: str) -> str:
- content = open(filepath, 'r').read()
- lines = content.split('\n')
- commands: list[Command] = []
- idx = 0
- docs: list[str] = []
- while idx < len(lines):
- line = lines[idx]
- idx += 1
- if line.startswith('# '):
- docs.append(line[2:])
- elif line.strip().endswith('() {'):
- name = line.split()[0][:-2]
- while lines[idx].strip() != '}':
- idx += 1
- docstring, signature = None, name
- docs_dict = yaml.safe_load('\n'.join(docs).replace('@yaml', ''))
- if docs_dict is not None:
- docstring = docs_dict.get('docstring')
- arguments = docs_dict.get('arguments', None)
- if 'signature' in docs_dict:
- signature = docs_dict['signature']
- else:
- if arguments is not None:
- for param, settings in arguments.items():
- if 'required' in settings:
- signature += f' <{param}>'
- else:
- signature += f' [<{param}>]'
- command = Command(name, docstring, signature)
- commands.append(command)
- docs = []
- function_docs = ''
- for cmd in commands:
- if cmd.docstring is not None:
- function_docs += f'{cmd.signature or cmd.name} - {cmd.docstring}\n'
- return function_docs
-
-
-if __name__ == '__main__':
- import sys
-
- if len(sys.argv) < 2:
- print('Usage: python parse_commands.py <filepath>')
- sys.exit(1)
- filepath = sys.argv[1]
- filepaths = filepath.split(',')
- for filepath in filepaths:
- docs = parse_command_file(filepath)
- print(docs)
diff --git a/opendevin/runtime/plugins/swe_agent_commands/search.sh b/opendevin/runtime/plugins/swe_agent_commands/search.sh
deleted file mode 100644
index c09566bdcb1c..000000000000
--- a/opendevin/runtime/plugins/swe_agent_commands/search.sh
+++ /dev/null
@@ -1,155 +0,0 @@
-# @yaml
-# signature: search_dir <search_term> [<dir>]
-# docstring: searches for search_term in all files in dir. If dir is not provided, searches in the current directory
-# arguments:
-# search_term:
-# type: string
-# description: the term to search for
-# required: true
-# dir:
-# type: string
-# description: the directory to search in (if not provided, searches in the current directory)
-# required: false
-search_dir() {
- if [ $# -eq 1 ]; then
- local search_term="$1"
- local dir="./"
- elif [ $# -eq 2 ]; then
- local search_term="$1"
- if [ -d "$2" ]; then
- local dir="$2"
- else
- echo "Directory $2 not found"
- return
- fi
- else
- echo "Usage: search_dir []"
- return
- fi
- dir=$(realpath "$dir")
- local matches=$(find "$dir" -type f ! -path '*/.*' -exec grep -nIH -- "$search_term" {} + | cut -d: -f1 | sort | uniq -c)
- # if no matches, return
- if [ -z "$matches" ]; then
- echo "No matches found for \"$search_term\" in $dir"
- return
- fi
- # Calculate total number of matches
- local num_matches=$(echo "$matches" | awk '{sum+=$1} END {print sum}')
- # calculate total number of files matched
- local num_files=$(echo "$matches" | wc -l | awk '{$1=$1; print $0}')
- # if num_files is > 100, print an error
- if [ $num_files -gt 100 ]; then
- echo "More than $num_files files matched for \"$search_term\" in $dir. Please narrow your search."
- return
- fi
-
- echo "Found $num_matches matches for \"$search_term\" in $dir:"
- echo "$matches" | awk '{$2=$2; gsub(/^\.+\/+/, "./", $2); print $2 " ("$1" matches)"}'
- echo "End of matches for \"$search_term\" in $dir"
-}
-
-# @yaml
-# signature: search_file <search_term> [<file>]
-# docstring: searches for search_term in file. If file is not provided, searches in the current open file
-# arguments:
-# search_term:
-# type: string
-# description: the term to search for
-# required: true
-# file:
-# type: string
-# description: the file to search in (if not provided, searches in the current open file)
-# required: false
-search_file() {
- # Check if the first argument is provided
- if [ -z "$1" ]; then
- echo "Usage: search_file []"
- return
- fi
- # Check if the second argument is provided
- if [ -n "$2" ]; then
- # Check if the provided argument is a valid file
- if [ -f "$2" ]; then
- local file="$2" # Set file if valid
- else
- echo "Usage: search_file []"
- echo "Error: File name $2 not found. Please provide a valid file name."
- return # Exit if the file is not valid
- fi
- else
- # Check if a file is open
- if [ -z "$CURRENT_FILE" ]; then
- echo "No file open. Use the open command first."
- return # Exit if no file is open
- fi
- local file="$CURRENT_FILE" # Set file to the current open file
- fi
- local search_term="$1"
- file=$(realpath "$file")
- # Use grep to directly get the desired formatted output
- local matches=$(grep -nH -- "$search_term" "$file")
- # Check if no matches were found
- if [ -z "$matches" ]; then
- echo "No matches found for \"$search_term\" in $file"
- return
- fi
- # Calculate total number of matches
- local num_matches=$(echo "$matches" | wc -l | awk '{$1=$1; print $0}')
-
- # calculate total number of lines matched
- local num_lines=$(echo "$matches" | cut -d: -f1 | sort | uniq | wc -l | awk '{$1=$1; print $0}')
- # if num_lines is > 100, print an error
- if [ $num_lines -gt 100 ]; then
- echo "More than $num_lines lines matched for \"$search_term\" in $file. Please narrow your search."
- return
- fi
-
- # Print the total number of matches and the matches themselves
- echo "Found $num_matches matches for \"$search_term\" in $file:"
- echo "$matches" | cut -d: -f1-2 | sort -u -t: -k2,2n | while IFS=: read -r filename line_number; do
- echo "Line $line_number:$(sed -n "${line_number}p" "$file")"
- done
- echo "End of matches for \"$search_term\" in $file"
-}
-
-# @yaml
-# signature: find_file <file_name> [<dir>]
-# docstring: finds all files with the given name in dir. If dir is not provided, searches in the current directory
-# arguments:
-# file_name:
-# type: string
-# description: the name of the file to search for
-# required: true
-# dir:
-# type: string
-# description: the directory to search in (if not provided, searches in the current directory)
-# required: false
-find_file() {
- if [ $# -eq 1 ]; then
- local file_name="$1"
- local dir="./"
- elif [ $# -eq 2 ]; then
- local file_name="$1"
- if [ -d "$2" ]; then
- local dir="$2"
- else
- echo "Directory $2 not found"
- return
- fi
- else
- echo "Usage: find_file []"
- return
- fi
-
- dir=$(realpath "$dir")
- local matches=$(find "$dir" -type f -name "$file_name")
- # if no matches, return
- if [ -z "$matches" ]; then
- echo "No matches found for \"$file_name\" in $dir"
- return
- fi
- # Calculate total number of matches
- local num_matches=$(echo "$matches" | wc -l | awk '{$1=$1; print $0}')
- echo "Found $num_matches matches for \"$file_name\" in $dir:"
- echo "$matches" | awk '{print $0}'
-}
diff --git a/opendevin/runtime/plugins/swe_agent_commands/setup_cursor_mode.sh b/opendevin/runtime/plugins/swe_agent_commands/setup_cursor_mode.sh
deleted file mode 100755
index 8788e3947d73..000000000000
--- a/opendevin/runtime/plugins/swe_agent_commands/setup_cursor_mode.sh
+++ /dev/null
@@ -1,19 +0,0 @@
-#!/bin/bash
-
-export PIP_CACHE_DIR=$HOME/.cache/pip
-pip install flake8
-
-# Cursor Mode from SWE-Bench
-# https://github.com/princeton-nlp/SWE-agent/blob/ca54d5556b9db4f4f2be21f09530ce69a72c0305/config/configs/default_sys-env_cursors_window100-detailed_cmd_format-last_5_history-1_demos.yaml#L108-L111
-echo 'source /opendevin/plugins/swe_agent_commands/_setup_cursor_mode_env.sh' >> ~/.bashrc
-
-# make _split_string (py) available
-echo 'export PATH=$PATH:/opendevin/plugins/swe_agent_commands' >> ~/.bashrc
-
-echo 'source /opendevin/plugins/swe_agent_commands/cursors_defaults.sh' >> ~/.bashrc
-echo 'source /opendevin/plugins/swe_agent_commands/cursors_edit_linting.sh' >> ~/.bashrc
-echo 'source /opendevin/plugins/swe_agent_commands/search.sh' >> ~/.bashrc
-
-echo 'export SWE_CMD_WORK_DIR="/opendevin/plugins/swe_agent_commands/workdir"' >> ~/.bashrc
-sudo mkdir -p /opendevin/plugins/swe_agent_commands/workdir
-sudo chmod 777 /opendevin/plugins/swe_agent_commands/workdir
diff --git a/opendevin/runtime/plugins/swe_agent_commands/setup_default.sh b/opendevin/runtime/plugins/swe_agent_commands/setup_default.sh
deleted file mode 100755
index 465a83e12f9a..000000000000
--- a/opendevin/runtime/plugins/swe_agent_commands/setup_default.sh
+++ /dev/null
@@ -1,19 +0,0 @@
-#!/bin/bash
-
-export PIP_CACHE_DIR=$HOME/.cache/pip
-pip install flake8
-
-# Default Mode from SWE-Bench
-# https://github.com/princeton-nlp/SWE-agent/blob/ca54d5556b9db4f4f2be21f09530ce69a72c0305/config/configs/default_sys-env_window100-detailed_cmd_format-last_5_history-1_demos.yaml#L103-L106
-echo 'source /opendevin/plugins/swe_agent_commands/_setup_default_env.sh' >> ~/.bashrc
-
-# make _split_string (py) available
-echo 'export PATH=$PATH:/opendevin/plugins/swe_agent_commands' >> ~/.bashrc
-
-echo 'source /opendevin/plugins/swe_agent_commands/defaults.sh' >> ~/.bashrc
-echo 'source /opendevin/plugins/swe_agent_commands/search.sh' >> ~/.bashrc
-echo 'source /opendevin/plugins/swe_agent_commands/edit_linting.sh' >> ~/.bashrc
-
-echo 'export SWE_CMD_WORK_DIR="/opendevin/plugins/swe_agent_commands/workdir"' >> ~/.bashrc
-sudo mkdir -p /opendevin/plugins/swe_agent_commands/workdir
-sudo chmod 777 /opendevin/plugins/swe_agent_commands/workdir
diff --git a/opendevin/runtime/runtime.py b/opendevin/runtime/runtime.py
index 0be6d97d0af8..688227cb1bab 100644
--- a/opendevin/runtime/runtime.py
+++ b/opendevin/runtime/runtime.py
@@ -4,11 +4,10 @@
import json
import os
from abc import abstractmethod
-from typing import Any, Optional
from opendevin.core.config import AppConfig, SandboxConfig
from opendevin.core.logger import opendevin_logger as logger
-from opendevin.events import EventStream, EventStreamSubscriber
+from opendevin.events import EventSource, EventStream, EventStreamSubscriber
from opendevin.events.action import (
Action,
ActionConfirmationStatus,
@@ -25,21 +24,19 @@
ErrorObservation,
NullObservation,
Observation,
- RejectObservation,
+ UserRejectObservation,
)
from opendevin.events.serialization.action import ACTION_TYPE_TO_CLASS
-from opendevin.runtime.plugins import PluginRequirement
-from opendevin.runtime.tools import RuntimeTool
-from opendevin.storage import FileStore
+from opendevin.runtime.plugins import JupyterRequirement, PluginRequirement
-def _default_env_vars(config: SandboxConfig) -> dict[str, str]:
+def _default_env_vars(sandbox_config: SandboxConfig) -> dict[str, str]:
ret = {}
for key in os.environ:
if key.startswith('SANDBOX_ENV_'):
sandbox_key = key.removeprefix('SANDBOX_ENV_')
ret[sandbox_key] = os.environ[key]
- if config.enable_auto_lint:
+ if sandbox_config.enable_auto_lint:
ret['ENABLE_AUTO_LINT'] = 'true'
return ret
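As `_default_env_vars` above shows, any host environment variable prefixed with `SANDBOX_ENV_` is forwarded into the sandbox with the prefix stripped (plus `ENABLE_AUTO_LINT` when auto-linting is on). A quick standalone illustration of that mapping, using made-up values:

```python
# Forward SANDBOX_ENV_* variables into the sandbox without the prefix.
import os

os.environ['SANDBOX_ENV_GITHUB_TOKEN'] = 'ghp_example'    # hypothetical value
os.environ['SANDBOX_ENV_OPENAI_API_KEY'] = 'sk-example'   # hypothetical value

forwarded = {
    key.removeprefix('SANDBOX_ENV_'): value
    for key, value in os.environ.items()
    if key.startswith('SANDBOX_ENV_')
}
print(forwarded)  # {'GITHUB_TOKEN': 'ghp_example', 'OPENAI_API_KEY': 'sk-example'}
```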
@@ -52,7 +49,7 @@ class Runtime:
"""
sid: str
- file_store: FileStore
+ config: AppConfig
DEFAULT_ENV_VARS: dict[str, str]
def __init__(
@@ -60,13 +57,17 @@ def __init__(
config: AppConfig,
event_stream: EventStream,
sid: str = 'default',
+ plugins: list[PluginRequirement] | None = None,
):
self.sid = sid
self.event_stream = event_stream
self.event_stream.subscribe(EventStreamSubscriber.RUNTIME, self.on_event)
+ self.plugins = plugins if plugins is not None and len(plugins) > 0 else []
+
self.config = copy.deepcopy(config)
self.DEFAULT_ENV_VARS = _default_env_vars(config.sandbox)
atexit.register(self.close_sync)
+ logger.debug(f'Runtime `{sid}` config:\n{self.config}')
async def ainit(self, env_vars: dict[str, str] | None = None) -> None:
"""
@@ -97,26 +98,20 @@ def close_sync(self) -> None:
else:
loop.run_until_complete(self.close())
- # ====================================================================
- # Methods we plan to deprecate when we move to new EventStreamRuntime
- # ====================================================================
-
- def init_sandbox_plugins(self, plugins: list[PluginRequirement]) -> None:
- # TODO: deprecate this method when we move to the new EventStreamRuntime
- raise NotImplementedError('This method is not implemented in the base class.')
-
- def init_runtime_tools(
- self,
- runtime_tools: list[RuntimeTool],
- runtime_tools_config: Optional[dict[RuntimeTool, Any]] = None,
- is_async: bool = True,
- ) -> None:
- # TODO: deprecate this method when we move to the new EventStreamRuntime
- raise NotImplementedError('This method is not implemented in the base class.')
-
# ====================================================================
async def add_env_vars(self, env_vars: dict[str, str]) -> None:
+ # Add env vars to the IPython shell (if Jupyter is used)
+ if any(isinstance(plugin, JupyterRequirement) for plugin in self.plugins):
+ code = 'import os\n'
+ for key, value in env_vars.items():
+ # Note: json.dumps gives us nice escaping for free
+ code += f'os.environ["{key}"] = {json.dumps(value)}\n'
+ code += '\n'
+ obs = await self.run_ipython(IPythonRunCellAction(code))
+ logger.info(f'Added env vars to IPython: code={code}, obs={obs}')
+
+ # Add env vars to the Bash shell
cmd = ''
for key, value in env_vars.items():
# Note: json.dumps gives us nice escaping for free
@@ -125,7 +120,7 @@ async def add_env_vars(self, env_vars: dict[str, str]) -> None:
return
cmd = cmd.strip()
logger.debug(f'Adding env var: {cmd}')
- obs: Observation = await self.run(CmdRunAction(cmd))
+ obs = await self.run(CmdRunAction(cmd))
if not isinstance(obs, CmdOutputObservation) or obs.exit_code != 0:
raise RuntimeError(
f'Failed to add env vars [{env_vars}] to environment: {obs.content}'
@@ -133,9 +128,14 @@ async def add_env_vars(self, env_vars: dict[str, str]) -> None:
async def on_event(self, event: Event) -> None:
if isinstance(event, Action):
+ # set timeout to default if not set
+ if event.timeout is None:
+ event.timeout = self.config.sandbox.timeout
+ assert event.timeout is not None
observation = await self.run_action(event)
observation._cause = event.id # type: ignore[attr-defined]
- self.event_stream.add_event(observation, event.source) # type: ignore[arg-type]
+ source = event.source if event.source else EventSource.AGENT
+ self.event_stream.add_event(observation, source) # type: ignore[arg-type]
async def run_action(self, action: Action) -> Observation:
"""Run an action and return the resulting observation.
@@ -160,15 +160,14 @@ async def run_action(self, action: Action) -> Observation:
hasattr(action, 'is_confirmed')
and action.is_confirmed == ActionConfirmationStatus.REJECTED
):
- return RejectObservation(
+ return UserRejectObservation(
'Action has been rejected by the user! Waiting for further user input.'
)
observation = await getattr(self, action_type)(action)
- observation._parent = action.id # type: ignore[attr-defined]
return observation
# ====================================================================
- # Implement these methods in the subclass
+ # Action execution
# ====================================================================
@abstractmethod
@@ -194,3 +193,19 @@ async def browse(self, action: BrowseURLAction) -> Observation:
@abstractmethod
async def browse_interactive(self, action: BrowseInteractiveAction) -> Observation:
pass
+
+ # ====================================================================
+ # File operations
+ # ====================================================================
+
+ @abstractmethod
+ async def copy_to(self, host_src: str, sandbox_dest: str, recursive: bool = False):
+ raise NotImplementedError('This method is not implemented in the base class.')
+
+ @abstractmethod
+ async def list_files(self, path: str | None = None) -> list[str]:
+ """List files in the sandbox.
+
+ If path is None, list files in the sandbox's initial working directory (e.g., /workspace).
+ """
+ raise NotImplementedError('This method is not implemented in the base class.')
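
A note on the env-var plumbing above: `add_env_vars` now mirrors each variable into the IPython kernel (when the Jupyter plugin is present) and into the bash shell, leaning on `json.dumps` for quoting. The sketch below only illustrates that quoting idea; the helper name and the exact export format are illustrative, not the upstream command string.

```python
import json


def build_export_cmd(env_vars: dict[str, str]) -> str:
    """Sketch of the quoting trick used above: json.dumps yields a double-quoted,
    escaped literal that is safe to splice into a bash `export` statement.
    The real Runtime.add_env_vars may format the command differently."""
    return ' && '.join(
        f'export {key}={json.dumps(value)}' for key, value in env_vars.items()
    )


print(build_export_cmd({'GITHUB_TOKEN': 'abc"123', 'ENABLE_AUTO_LINT': 'true'}))
# export GITHUB_TOKEN="abc\"123" && export ENABLE_AUTO_LINT="true"
```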
diff --git a/opendevin/runtime/sandbox.py b/opendevin/runtime/sandbox.py
index 5009308a9276..c8ebd87d3783 100644
--- a/opendevin/runtime/sandbox.py
+++ b/opendevin/runtime/sandbox.py
@@ -3,10 +3,9 @@
from opendevin.core.config import SandboxConfig
from opendevin.core.schema import CancellableStream
-from opendevin.runtime.plugins.mixin import PluginMixin
-class Sandbox(ABC, PluginMixin):
+class Sandbox(ABC):
_env: dict[str, str] = {}
is_initial_session: bool = True
diff --git a/opendevin/runtime/server/runtime.py b/opendevin/runtime/server/runtime.py
deleted file mode 100644
index 04b6ee57fe05..000000000000
--- a/opendevin/runtime/server/runtime.py
+++ /dev/null
@@ -1,224 +0,0 @@
-from typing import Any, Optional
-
-from opendevin.core.config import AppConfig
-from opendevin.core.exceptions import BrowserInitException
-from opendevin.core.logger import opendevin_logger as logger
-from opendevin.events.action import (
- BrowseInteractiveAction,
- BrowseURLAction,
- CmdRunAction,
- FileReadAction,
- FileWriteAction,
- IPythonRunCellAction,
-)
-from opendevin.events.observation import (
- CmdOutputObservation,
- ErrorObservation,
- IPythonRunCellObservation,
- Observation,
-)
-from opendevin.events.stream import EventStream
-from opendevin.runtime import (
- DockerSSHBox,
- E2BBox,
- LocalBox,
- Sandbox,
-)
-from opendevin.runtime.browser.browser_env import BrowserEnv
-from opendevin.runtime.plugins import PluginRequirement
-from opendevin.runtime.runtime import Runtime
-from opendevin.runtime.tools import RuntimeTool
-from opendevin.storage.local import LocalFileStore
-
-from ..browser import browse
-from .files import read_file, write_file
-
-
-class ServerRuntime(Runtime):
- def __init__(
- self,
- config: AppConfig,
- event_stream: EventStream,
- sid: str = 'default',
- sandbox: Sandbox | None = None,
- ):
- super().__init__(config, event_stream, sid)
- self.file_store = LocalFileStore(config.workspace_base)
- if sandbox is None:
- self.sandbox = self.create_sandbox(sid, config.sandbox.box_type)
- self._is_external_sandbox = False
- else:
- self.sandbox = sandbox
- self._is_external_sandbox = True
- self.browser: BrowserEnv | None = None
-
- def create_sandbox(self, sid: str = 'default', box_type: str = 'ssh') -> Sandbox:
- if box_type == 'local':
- return LocalBox(
- config=self.config.sandbox, workspace_base=self.config.workspace_base
- )
- elif box_type == 'ssh':
- return DockerSSHBox(
- config=self.config.sandbox,
- persist_sandbox=self.config.persist_sandbox,
- workspace_mount_path=self.config.workspace_mount_path,
- sandbox_workspace_dir=self.config.workspace_mount_path_in_sandbox,
- cache_dir=self.config.cache_dir,
- run_as_devin=self.config.run_as_devin,
- ssh_hostname=self.config.ssh_hostname,
- ssh_password=self.config.ssh_password,
- ssh_port=self.config.ssh_port,
- sid=sid,
- )
- elif box_type == 'e2b':
- return E2BBox(
- config=self.config.sandbox,
- e2b_api_key=self.config.e2b_api_key,
- )
- else:
- raise ValueError(f'Invalid sandbox type: {box_type}')
-
- async def ainit(self, env_vars: dict[str, str] | None = None):
- # MUST call super().ainit() to initialize both default env vars
- # AND the ones in env vars!
- await super().ainit(env_vars)
-
- async def close(self):
- if hasattr(self, '_is_external_sandbox') and not self._is_external_sandbox:
- self.sandbox.close()
- if hasattr(self, 'browser') and self.browser is not None:
- self.browser.close()
-
- def init_sandbox_plugins(self, plugins: list[PluginRequirement]) -> None:
- self.sandbox.init_plugins(plugins)
-
- def init_runtime_tools(
- self,
- runtime_tools: list[RuntimeTool],
- runtime_tools_config: Optional[dict[RuntimeTool, Any]] = None,
- is_async: bool = True,
- ) -> None:
- # if browser in runtime_tools, init it
- if RuntimeTool.BROWSER in runtime_tools:
- if runtime_tools_config is None:
- runtime_tools_config = {}
- browser_env_config = runtime_tools_config.get(RuntimeTool.BROWSER, {})
- try:
- self.browser = BrowserEnv(is_async=is_async, **browser_env_config)
- except BrowserInitException:
- logger.warn(
- 'Failed to start browser environment, web browsing functionality will not work'
- )
-
- async def run(self, action: CmdRunAction) -> Observation:
- return self._run_command(action.command)
-
- async def run_ipython(self, action: IPythonRunCellAction) -> Observation:
- self._run_command(
- ("cat > /tmp/opendevin_jupyter_temp.py <<'EOL'\n" f'{action.code}\n' 'EOL'),
- )
-
- # run the code
- obs = self._run_command('cat /tmp/opendevin_jupyter_temp.py | execute_cli')
- output = obs.content
- if 'pip install' in action.code:
- print(output)
- package_names = action.code.split(' ', 2)[-1]
- is_single_package = ' ' not in package_names
-
- if 'Successfully installed' in output:
- restart_kernel = 'import IPython\nIPython.Application.instance().kernel.do_shutdown(True)'
- if (
- 'Note: you may need to restart the kernel to use updated packages.'
- in output
- ):
- self._run_command(
- (
- "cat > /tmp/opendevin_jupyter_temp.py <<'EOL'\n"
- f'{restart_kernel}\n'
- 'EOL'
- )
- )
- obs = self._run_command(
- 'cat /tmp/opendevin_jupyter_temp.py | execute_cli'
- )
- output = '[Package installed successfully]'
- if "{'status': 'ok', 'restart': True}" != obs.content.strip():
- print(obs.content)
- output += (
- '\n[But failed to restart the kernel to load the package]'
- )
- else:
- output += (
- '\n[Kernel restarted successfully to load the package]'
- )
-
- # re-init the kernel after restart
- if action.kernel_init_code:
- self._run_command(
- (
- f"cat > /tmp/opendevin_jupyter_init.py <<'EOL'\n"
- f'{action.kernel_init_code}\n'
- 'EOL'
- ),
- )
- obs = self._run_command(
- 'cat /tmp/opendevin_jupyter_init.py | execute_cli',
- )
- elif (
- is_single_package
- and f'Requirement already satisfied: {package_names}' in output
- ):
- output = '[Package already installed]'
- return IPythonRunCellObservation(content=output, code=action.code)
-
- async def read(self, action: FileReadAction) -> Observation:
- # TODO: use self.file_store
- working_dir = self.sandbox.get_working_directory()
- return await read_file(
- action.path,
- working_dir,
- self.config.workspace_base,
- self.config.workspace_mount_path_in_sandbox,
- action.start,
- action.end,
- )
-
- async def write(self, action: FileWriteAction) -> Observation:
- # TODO: use self.file_store
- working_dir = self.sandbox.get_working_directory()
- return await write_file(
- action.path,
- working_dir,
- self.config.workspace_base,
- self.config.workspace_mount_path_in_sandbox,
- action.content,
- action.start,
- action.end,
- )
-
- async def browse(self, action: BrowseURLAction) -> Observation:
- return await browse(action, self.browser)
-
- async def browse_interactive(self, action: BrowseInteractiveAction) -> Observation:
- return await browse(action, self.browser)
-
- def _run_command(self, command: str) -> Observation:
- try:
- exit_code, output = self.sandbox.execute(command)
- if 'pip install' in command:
- package_names = command.split(' ', 2)[-1]
- is_single_package = ' ' not in package_names
- print(output)
- if 'Successfully installed' in output:
- output = '[Package installed successfully]'
- elif (
- is_single_package
- and f'Requirement already satisfied: {package_names}' in output
- ):
- output = '[Package already installed]'
- return CmdOutputObservation(
- command_id=-1, content=str(output), command=command, exit_code=exit_code
- )
- except UnicodeDecodeError:
- return ErrorObservation('Command output could not be decoded as utf-8')
diff --git a/opendevin/runtime/utils/bash.py b/opendevin/runtime/utils/bash.py
index 9dd7a60422c5..6de80c38840a 100644
--- a/opendevin/runtime/utils/bash.py
+++ b/opendevin/runtime/utils/bash.py
@@ -1,87 +1,52 @@
-def split_bash_commands(commands):
- # States
- NORMAL = 0
- IN_SINGLE_QUOTE = 1
- IN_DOUBLE_QUOTE = 2
- IN_HEREDOC = 3
-
- state = NORMAL
- heredoc_trigger = None
- result = []
- current_command: list[str] = []
-
- i = 0
- while i < len(commands):
- char = commands[i]
-
- if state == NORMAL:
- if char == "'":
- state = IN_SINGLE_QUOTE
- elif char == '"':
- state = IN_DOUBLE_QUOTE
- elif char == '\\':
- # Check if this is escaping a newline
- if i + 1 < len(commands) and commands[i + 1] == '\n':
- i += 1 # Skip the newline
- # Continue with the next line as part of the same command
- i += 1 # Move to the first character of the next line
- continue
- elif char == '\n':
- if not heredoc_trigger and current_command:
- result.append(''.join(current_command).strip())
- current_command = []
- elif char == '<' and commands[i : i + 2] == '<<':
- # Detect heredoc
- state = IN_HEREDOC
- i += 2 # Skip '<<'
- while commands[i] == ' ':
- i += 1
- start = i
- while commands[i] not in [' ', '\n']:
- i += 1
- heredoc_trigger = commands[start:i]
- current_command.append(commands[start - 2 : i]) # Include '<<'
- continue # Skip incrementing i at the end of the loop
- current_command.append(char)
-
- elif state == IN_SINGLE_QUOTE:
- current_command.append(char)
- if char == "'" and commands[i - 1] != '\\':
- state = NORMAL
+import bashlex
- elif state == IN_DOUBLE_QUOTE:
- current_command.append(char)
- if char == '"' and commands[i - 1] != '\\':
- state = NORMAL
+from opendevin.core.logger import opendevin_logger as logger
- elif state == IN_HEREDOC:
- current_command.append(char)
- if (
- char == '\n'
- and heredoc_trigger
- and commands[i + 1 : i + 1 + len(heredoc_trigger) + 1]
- == heredoc_trigger + '\n'
- ):
- # Check if the next line starts with the heredoc trigger followed by a newline
- i += (
- len(heredoc_trigger) + 1
- ) # Move past the heredoc trigger and newline
- current_command.append(
- heredoc_trigger + '\n'
- ) # Include the heredoc trigger and newline
- result.append(''.join(current_command).strip())
- current_command = []
- heredoc_trigger = None
- state = NORMAL
- continue
-
- i += 1
-
- # Add the last command if any
- if current_command:
- result.append(''.join(current_command).strip())
-
- # Remove any empty strings from the result
- result = [cmd for cmd in result if cmd]
+def split_bash_commands(commands):
+ try:
+ parsed = bashlex.parse(commands)
+ except bashlex.errors.ParsingError as e:
+ logger.debug(
+ f'Failed to parse bash commands\n'
+ f'[input]: {commands}\n'
+ f'[warning]: {e}\n'
+ f'The original command will be returned as is.'
+ )
+ # If parsing fails, return the original commands
+ return [commands]
+
+ result: list[str] = []
+ last_end = 0
+
+ for node in parsed:
+ start, end = node.pos
+
+ # Include any text between the last command and this one
+ if start > last_end:
+ between = commands[last_end:start]
+ logger.debug(f'BASH PARSING between: {between}')
+ if result:
+ result[-1] += between.rstrip()
+ elif between.strip():
+ # THIS SHOULD NOT HAPPEN
+ result.append(between.rstrip())
+
+ # Extract the command, preserving original formatting
+ command = commands[start:end].rstrip()
+ logger.debug(f'BASH PARSING command: {command}')
+ result.append(command)
+
+ last_end = end
+
+ # Add any remaining text after the last command to the last command
+ remaining = commands[last_end:].rstrip()
+ logger.debug(f'BASH PARSING remaining: {remaining}')
+ if last_end < len(commands) and result:
+ result[-1] += remaining
+ logger.debug(f'BASH PARSING result[-1] += remaining: {result[-1]}')
+ elif last_end < len(commands):
+ if remaining:
+ result.append(remaining)
+ logger.debug(f'BASH PARSING result.append(remaining): {result[-1]}')
return result
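
With the hand-rolled state machine replaced by `bashlex`, `split_bash_commands` now splits only at top-level statement boundaries and falls back to returning the input unchanged on a parse error. A small usage sketch (the expected output is shown as a comment and assumes `bashlex` parses the script into two top-level nodes):

```python
from opendevin.runtime.utils.bash import split_bash_commands

# Two top-level statements separated by a newline should come back as two
# entries, while the `&&` compound stays intact; on a bashlex parse error the
# whole input is returned as a single-element list.
commands = 'echo hello && ls -la\npwd'
print(split_bash_commands(commands))
# expected: ['echo hello && ls -la', 'pwd']
```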
diff --git a/opendevin/runtime/server/files.py b/opendevin/runtime/utils/files.py
similarity index 100%
rename from opendevin/runtime/server/files.py
rename to opendevin/runtime/utils/files.py
diff --git a/opendevin/runtime/utils/image_agnostic.py b/opendevin/runtime/utils/image_agnostic.py
deleted file mode 100644
index d44c9c4ebb39..000000000000
--- a/opendevin/runtime/utils/image_agnostic.py
+++ /dev/null
@@ -1,113 +0,0 @@
-"""This module contains functions for building and managing the agnostic sandbox image.
-
-This WILL BE DEPRECATED when EventStreamRuntime is fully implemented and adopted.
-"""
-
-import tempfile
-
-import docker
-
-from opendevin.core.logger import opendevin_logger as logger
-
-
-def generate_dockerfile(base_image: str) -> str:
- """Generate the Dockerfile content for the agnostic sandbox image based on user-provided base image.
-
- NOTE: This is only tested on debian yet.
- """
- # FIXME: Remove the requirement of ssh in future version
- dockerfile_content = (
- f'FROM {base_image}\n'
- 'RUN apt update && apt install -y openssh-server wget sudo\n'
- 'RUN mkdir -p -m0755 /var/run/sshd\n'
- 'RUN mkdir -p /opendevin && mkdir -p /opendevin/logs && chmod 777 /opendevin/logs\n'
- 'RUN echo "" > /opendevin/bash.bashrc\n'
- 'RUN if [ ! -d /opendevin/miniforge3 ]; then \\\n'
- ' wget --progress=bar:force -O Miniforge3.sh "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh" && \\\n'
- ' bash Miniforge3.sh -b -p /opendevin/miniforge3 && \\\n'
- ' rm Miniforge3.sh && \\\n'
- ' chmod -R g+w /opendevin/miniforge3 && \\\n'
- ' bash -c ". /opendevin/miniforge3/etc/profile.d/conda.sh && conda config --set changeps1 False && conda config --append channels conda-forge"; \\\n'
- ' fi\n'
- 'RUN /opendevin/miniforge3/bin/pip install --upgrade pip\n'
- 'RUN /opendevin/miniforge3/bin/pip install jupyterlab notebook jupyter_kernel_gateway flake8\n'
- 'RUN /opendevin/miniforge3/bin/pip install python-docx PyPDF2 python-pptx pylatexenc openai\n'
- ).strip()
- return dockerfile_content
-
-
-def _build_sandbox_image(
- base_image: str, target_image_name: str, docker_client: docker.DockerClient
-):
- try:
- with tempfile.TemporaryDirectory() as temp_dir:
- dockerfile_content = generate_dockerfile(base_image)
-
- logger.info(f'Building agnostic sandbox image: {target_image_name}')
- logger.info(
- (
- f'===== Dockerfile content =====\n'
- f'{dockerfile_content}\n'
- f'==============================='
- )
- )
- with open(f'{temp_dir}/Dockerfile', 'w') as file:
- file.write(dockerfile_content)
-
- api_client = docker_client.api
- build_logs = api_client.build(
- path=temp_dir, tag=target_image_name, rm=True, decode=True
- )
-
- for log in build_logs:
- if 'stream' in log:
- print(log['stream'].strip())
- elif 'error' in log:
- logger.error(log['error'].strip())
- else:
- logger.info(str(log))
-
- logger.info(f'Image {target_image_name} built successfully')
- except docker.errors.BuildError as e:
- logger.error(f'Sandbox image build failed: {e}')
- raise e
- except Exception as e:
- logger.error(f'An error occurred during sandbox image build: {e}')
- raise e
-
-
-def _get_new_image_name(base_image: str) -> str:
- prefix = 'od_sandbox'
- if ':' not in base_image:
- base_image = base_image + ':latest'
-
- [repo, tag] = base_image.split(':')
- repo = repo.replace('/', '___')
- return f'{prefix}:{repo}__{tag}'
-
-
-def get_od_sandbox_image(base_image: str, docker_client: docker.DockerClient) -> str:
- """Return the sandbox image name based on user-provided base image.
-
- The returned sandbox image is assumed to contains all the required dependencies for OpenDevin.
- If the sandbox image is not found, it will be built.
- """
- # OpenDevin's offcial sandbox already contains the required dependencies for OpenDevin.
- if 'ghcr.io/opendevin/sandbox' in base_image:
- return base_image
-
- new_image_name = _get_new_image_name(base_image)
-
- # Detect if the sandbox image is built
- images = docker_client.images.list()
- for image in images:
- if new_image_name in image.tags:
- logger.info('Found existing od_sandbox image, reuse:' + new_image_name)
- return new_image_name
-
- # If the sandbox image is not found, build it
- logger.info(
- f'od_sandbox image is not found for {base_image}, will build: {new_image_name}'
- )
- _build_sandbox_image(base_image, new_image_name, docker_client)
- return new_image_name
diff --git a/opendevin/runtime/utils/runtime_build.py b/opendevin/runtime/utils/runtime_build.py
index acb4ac6d02fa..9d3596eb669e 100644
--- a/opendevin/runtime/utils/runtime_build.py
+++ b/opendevin/runtime/utils/runtime_build.py
@@ -6,11 +6,16 @@
import docker
import toml
+from dirhash import dirhash
from jinja2 import Environment, FileSystemLoader
import opendevin
from opendevin.core.logger import opendevin_logger as logger
+RUNTIME_IMAGE_REPO = os.getenv(
+ 'OD_RUNTIME_RUNTIME_IMAGE_REPO', 'ghcr.io/opendevin/od_runtime'
+)
+
def _get_package_version():
"""Read the version from pyproject.toml as the other one may be outdated."""
@@ -29,6 +34,8 @@ def _create_project_source_dist():
logger.info(f'Using project root: {project_root}')
# run "python -m build -s" on project_root
+ # ensure the `build` package is available

+ subprocess.run(['python', '-m', 'pip', 'install', 'build'])
result = subprocess.run(['python', '-m', 'build', '-s', project_root])
if result.returncode != 0:
logger.error(f'Build failed: {result}')
@@ -47,7 +54,8 @@ def _create_project_source_dist():
return tarball_path
-def _put_source_code_to_dir(temp_dir: str) -> str:
+def _put_source_code_to_dir(temp_dir: str):
+ """Put the source code of OpenDevin to the temp_dir/code."""
tarball_path = _create_project_source_dist()
filename = os.path.basename(tarball_path)
filename = filename.removesuffix('.tar.gz')
@@ -59,11 +67,20 @@ def _put_source_code_to_dir(temp_dir: str) -> str:
logger.info(
f'Source distribution moved to {os.path.join(temp_dir, "project.tar.gz")}'
)
- return filename
+
+ # unzip the tarball
+ shutil.unpack_archive(os.path.join(temp_dir, 'project.tar.gz'), temp_dir)
+ # remove the tarball
+ os.remove(os.path.join(temp_dir, 'project.tar.gz'))
+ # rename the directory to 'code'
+ os.rename(os.path.join(temp_dir, filename), os.path.join(temp_dir, 'code'))
+ logger.info(f'Unpacked source code directory: {os.path.join(temp_dir, "code")}')
def _generate_dockerfile(
- base_image: str, source_code_dirname: str, skip_init: bool = False
+ base_image: str,
+ skip_init: bool = False,
+ extra_deps: str | None = None,
) -> str:
"""Generate the Dockerfile content for the eventstream runtime image based on user-provided base image."""
env = Environment(
@@ -74,8 +91,8 @@ def _generate_dockerfile(
template = env.get_template('Dockerfile.j2')
dockerfile_content = template.render(
base_image=base_image,
- source_code_dirname=source_code_dirname,
skip_init=skip_init,
+ extra_deps=extra_deps if extra_deps is not None else '',
)
return dockerfile_content
@@ -84,11 +101,17 @@ def prep_docker_build_folder(
dir_path: str,
base_image: str,
skip_init: bool = False,
-):
- """Prepares the docker build folder by copying the source code and generating the Dockerfile."""
- source_code_dirname = _put_source_code_to_dir(dir_path)
+ extra_deps: str | None = None,
+) -> str:
+ """Prepares the docker build folder by copying the source code and generating the Dockerfile.
+
+ Return the MD5 hash of the directory.
+ """
+ _put_source_code_to_dir(dir_path)
dockerfile_content = _generate_dockerfile(
- base_image, source_code_dirname, skip_init=skip_init
+ base_image,
+ skip_init=skip_init,
+ extra_deps=extra_deps,
)
logger.info(
(
@@ -100,78 +123,111 @@ def prep_docker_build_folder(
with open(os.path.join(dir_path, 'Dockerfile'), 'w') as file:
file.write(dockerfile_content)
+ hash = dirhash(dir_path, 'md5')
+ logger.info(
+ f'Input base image: {base_image}\n'
+ f'Skip init: {skip_init}\n'
+ f'Extra deps: {extra_deps}\n'
+ f'Hash for docker build directory [{dir_path}] (contents: {os.listdir(dir_path)}): {hash}\n'
+ )
+ return hash
+
def _build_sandbox_image(
- base_image: str,
- target_image_name: str,
+ docker_folder: str,
docker_client: docker.DockerClient,
- skip_init: bool = False,
-):
+ target_image_repo: str,
+ target_image_hash_tag: str,
+ target_image_tag: str,
+) -> str:
+ """Build the sandbox image.
+
+ The image will be tagged as both:
+ - target_image_repo:target_image_hash_tag
+ - target_image_repo:target_image_tag
+
+ Args:
+ docker_folder: str: the path to the docker build folder
+ docker_client: docker.DockerClient: the docker client
+ target_image_repo: str: the repository name for the target image
+ target_image_hash_tag: str: the *hash* tag for the target image that is calculated based
+ on the contents of the docker build folder (source code and Dockerfile)
+ e.g., ubuntu:latest -> od_runtime:1234567890abcdef
+ target_image_tag: str: the tag for the target image that's generic and based on the base image name
+ e.g., ubuntu:latest -> od_runtime:ubuntu_tag_latest
+ """
+ # 1. Always directly build and tag using the dir_hash
+ target_image_hash_name = f'{target_image_repo}:{target_image_hash_tag}'
try:
- with tempfile.TemporaryDirectory() as temp_dir:
- if skip_init:
- logger.info(
- f'Reusing existing od_sandbox image [{target_image_name}] but will update the source code in it.'
- )
- else:
- logger.info(f'Building agnostic sandbox image: {target_image_name}')
- prep_docker_build_folder(temp_dir, base_image, skip_init=skip_init)
- api_client = docker_client.api
- build_logs = api_client.build(
- path=temp_dir,
- tag=target_image_name,
- rm=True,
- decode=True,
- # do not use cache when skip_init is True (i.e., when we want to update the source code in the existing image)
- nocache=skip_init,
- )
-
- if skip_init:
- logger.info(
- f'Rebuilding existing od_sandbox image [{target_image_name}] to update the source code.'
- )
- for log in build_logs:
- if 'stream' in log:
- print(log['stream'].strip())
- elif 'error' in log:
- logger.error(log['error'].strip())
- else:
- logger.info(str(log))
-
- # check if the image is built successfully
- image = docker_client.images.get(target_image_name)
- if image is None:
- raise RuntimeError(f'Build failed: Image {target_image_name} not found')
- logger.info(f'Image {target_image_name} built successfully')
+ build_logs = docker_client.api.build(
+ path=docker_folder,
+ tag=target_image_hash_name,
+ rm=True,
+ decode=True,
+ )
except docker.errors.BuildError as e:
logger.error(f'Sandbox image build failed: {e}')
raise e
+ for log in build_logs:
+ if 'stream' in log:
+ print(log['stream'].strip())
+ elif 'error' in log:
+ logger.error(log['error'].strip())
+ else:
+ logger.info(str(log))
+
+ # 2. Re-tag the image with a more generic tag (roughly a "latest" tag)
+ logger.info(f'Image [{target_image_hash_name}] build finished.')
+ image = docker_client.images.get(target_image_hash_name)
+ image.tag(target_image_repo, target_image_tag)
+ logger.info(
+ f'Re-tagged image [{target_image_hash_name}] with more generic tag [{target_image_tag}]'
+ )
-def get_new_image_name(base_image: str, dev_mode: bool = False) -> str:
- if dev_mode:
- if 'od_runtime' not in base_image:
- raise ValueError(
- f'Base image {base_image} must be a valid od_runtime image to be used for dev mode.'
- )
- # remove the 'od_runtime' prefix from the base_image
- return base_image.replace('od_runtime', 'od_runtime_dev')
- elif 'od_runtime' in base_image:
- # if the base image is a valid od_runtime image, we will use it as is
- logger.info(f'Using existing od_runtime image [{base_image}]')
- return base_image
+ # check if the image is built successfully
+ image = docker_client.images.get(target_image_hash_name)
+ if image is None:
+ raise RuntimeError(
+ f'Build failed: Image [{target_image_repo}:{target_image_hash_tag}] not found'
+ )
+ logger.info(
+ f'Image [{target_image_repo}:{target_image_hash_tag}] (also tagged [{target_image_tag}]) built successfully'
+ )
+ return target_image_hash_name
+
+
+def get_runtime_image_repo_and_tag(base_image: str) -> tuple[str, str]:
+ if RUNTIME_IMAGE_REPO in base_image:
+ logger.info(
+ f'The provided image [{base_image}] is already a valid od_runtime image.\n'
+ f'Will try to reuse it as is.'
+ )
+ if ':' not in base_image:
+ base_image = base_image + ':latest'
+ repo, tag = base_image.split(':')
+ return repo, tag
else:
- prefix = 'od_runtime'
if ':' not in base_image:
base_image = base_image + ':latest'
[repo, tag] = base_image.split(':')
repo = repo.replace('/', '___')
-
od_version = _get_package_version()
- return f'{prefix}:od_v{od_version}_image_{repo}_tag_{tag}'
+ return RUNTIME_IMAGE_REPO, f'od_v{od_version}_image_{repo}_tag_{tag}'
def _check_image_exists(image_name: str, docker_client: docker.DockerClient) -> bool:
+ """Check if the image exists in the registry (try to pull it first) AND in the local store.
+
+ image_name is f'{repo}:{tag}'
+ """
+ # Try to pull the new image from the registry
+ try:
+ logger.info(f'Pulling image {image_name} directly...')
+ docker_client.images.pull(image_name)
+ except Exception:
+ logger.info(f'Cannot pull image {image_name} directly')
+
images = docker_client.images.list()
if images:
for image in images:
@@ -183,92 +239,135 @@ def _check_image_exists(image_name: str, docker_client: docker.DockerClient) ->
def build_runtime_image(
base_image: str,
docker_client: docker.DockerClient,
- update_source_code: bool = False,
- save_to_local_store: bool = False, # New parameter to control saving to local store
+ extra_deps: str | None = None,
+ docker_build_folder: str | None = None,
+ dry_run: bool = False,
+ force_rebuild: bool = False,
) -> str:
"""Build the runtime image for the OpenDevin runtime.
- This is only used for **eventstream runtime**.
+ See https://docs.all-hands.dev/modules/usage/runtime for more details.
"""
- new_image_name = get_new_image_name(base_image)
- if base_image == new_image_name:
- logger.info(
- f'Using existing od_runtime image [{base_image}]. Will NOT build a new image.'
+ runtime_image_repo, runtime_image_tag = get_runtime_image_repo_and_tag(base_image)
+
+ # Calculate the hash for the docker build folder (source code and Dockerfile)
+ with tempfile.TemporaryDirectory() as temp_dir:
+ from_scratch_hash = prep_docker_build_folder(
+ temp_dir,
+ base_image=base_image,
+ skip_init=False,
+ extra_deps=extra_deps,
)
- else:
- logger.info(f'New image name: {new_image_name}')
- # Ensure new_image_name contains a colon
- if ':' not in new_image_name:
- raise ValueError(
- f'Invalid image name: {new_image_name}. Expected format "repository:tag".'
- )
+ # hash-based image name: if the hash matches, the image was already built
+ # from scratch with the *exact same* source code and Dockerfile
+ hash_runtime_image_name = f'{runtime_image_repo}:{from_scratch_hash}'
- # Try to pull the new image from the registry
- try:
- docker_client.images.pull(new_image_name)
- except Exception:
- logger.info(f'Cannot pull image {new_image_name} directly')
-
- # Detect if the sandbox image is built
- image_exists = _check_image_exists(new_image_name, docker_client)
- if image_exists:
- logger.info(f'Image {new_image_name} exists')
- else:
- logger.info(f'Image {new_image_name} does not exist')
-
- skip_init = False
- if image_exists and not update_source_code:
- # If (1) Image exists & we are not updating the source code, we can reuse the existing production image
- logger.info('No image build done (not updating source code)')
- return new_image_name
- elif image_exists and update_source_code:
- # If (2) Image exists & we plan to update the source code (in dev mode), we need to rebuild the image
- # and give it a special name
- # e.g., od_runtime:ubuntu_tag_latest -> od_runtime_dev:ubuntu_tag_latest
- logger.info('Image exists, but updating source code requested')
- base_image = new_image_name
- new_image_name = get_new_image_name(base_image, dev_mode=True)
-
- skip_init = True # since we only need to update the source code
- else:
- # If (3) Image does not exist, we need to build it from scratch
- # e.g., ubuntu:latest -> od_runtime:ubuntu_tag_latest
- # This snippet would allow to load from archive:
- # tar_path = f'{new_image_name.replace(":", "_")}.tar'
- # if os.path.exists(tar_path):
- # logger.info(f'Loading image from {tar_path}')
- # load_command = ['docker', 'load', '-i', tar_path]
- # subprocess.run(load_command, check=True)
- # logger.info(f'Image {new_image_name} loaded from {tar_path}')
- # return new_image_name
- skip_init = False
+ # non-hash generic image name; it may contain *similar* dependencies
+ # but *might* not exactly match the state of the source code.
+ # It resembles the "latest" tag in the Docker image naming convention for
+ # a particular {repo}:{tag} pair (e.g., ubuntu:latest -> od_runtime:ubuntu_tag_latest).
+ # We will build from IT to save time if the `from_scratch_hash` is not found
+ generic_runtime_image_name = f'{runtime_image_repo}:{runtime_image_tag}'
- if not skip_init:
- logger.info(f'Building image [{new_image_name}] from scratch')
+ # 1. If the image exists with the same hash, we will reuse it as is
+ if _check_image_exists(hash_runtime_image_name, docker_client):
+ logger.info(
+ f'Image [{hash_runtime_image_name}] exists with matched hash for Docker build folder.\n'
+ 'Will reuse it as is.'
+ )
+ return hash_runtime_image_name
+
+ # 2. If the exact hash is not found, we will FIRST try to re-build it
+ # from the non-hash `generic_runtime_image_name` to avoid re-installing
+ # the dependencies (e.g., poetry install, apt install)
+ elif (
+ _check_image_exists(generic_runtime_image_name, docker_client)
+ and not force_rebuild
+ ):
+ logger.info(
+ f'Cannot find matched hash for image [{hash_runtime_image_name}]\n'
+ f'Will try to re-build it from the latest [{generic_runtime_image_name}] image to potentially save '
+ f'time on dependency installation.\n'
+ )
- _build_sandbox_image(base_image, new_image_name, docker_client, skip_init=skip_init)
+ cur_docker_build_folder = docker_build_folder or tempfile.mkdtemp()
+ _skip_init_hash = prep_docker_build_folder(
+ cur_docker_build_folder,
+ # we want to use the existing generic image as base
+ # so that we can leverage existing dependencies already installed in the image
+ base_image=generic_runtime_image_name,
+ skip_init=True, # skip init since we are re-using the existing image
+ extra_deps=extra_deps,
+ )
+ assert (
+ _skip_init_hash != from_scratch_hash
+ ), f'The skip_init hash [{_skip_init_hash}] should not match the existing hash [{from_scratch_hash}]'
+
+ if not dry_run:
+ _build_sandbox_image(
+ docker_folder=cur_docker_build_folder,
+ docker_client=docker_client,
+ target_image_repo=runtime_image_repo,
+ # NOTE: WE ALWAYS use the "from_scratch_hash" tag for the target image
+ # otherwise, even if the source code is exactly the same, the image *might* be re-built
+ # because the same source code will generate a different hash when skip_init=True/False
+ # since the Dockerfile is slightly different
+ target_image_hash_tag=from_scratch_hash,
+ target_image_tag=runtime_image_tag,
+ )
+ else:
+ logger.info(
+ f'Dry run: Skipping image build for [{generic_runtime_image_name}]'
+ )
+ if docker_build_folder is None:
+ shutil.rmtree(cur_docker_build_folder)
- # Only for development: allow to save image as archive:
- if not image_exists and save_to_local_store:
- tar_path = f'{new_image_name.replace(":", "_")}.tar'
- save_command = ['docker', 'save', '-o', tar_path, new_image_name]
- subprocess.run(save_command, check=True)
- logger.info(f'Image saved to {tar_path}')
+ # 3. If the image is not found AND we cannot re-use the latest relevant non-hash image,
+ # we will build it completely from scratch
+ else:
+ if force_rebuild:
+ logger.info(
+ f'Force re-build: Will try to re-build image [{generic_runtime_image_name}] from scratch.\n'
+ )
+ cur_docker_build_folder = docker_build_folder or tempfile.mkdtemp()
+ _new_from_scratch_hash = prep_docker_build_folder(
+ cur_docker_build_folder,
+ base_image,
+ skip_init=False,
+ extra_deps=extra_deps,
+ )
+ assert (
+ _new_from_scratch_hash == from_scratch_hash
+ ), f'The new from scratch hash [{_new_from_scratch_hash}] does not match the existing hash [{from_scratch_hash}]'
+
+ if not dry_run:
+ _build_sandbox_image(
+ docker_folder=cur_docker_build_folder,
+ docker_client=docker_client,
+ target_image_repo=runtime_image_repo,
+ # NOTE: WE ALWAYS use the "from_scratch_hash" tag for the target image
+ target_image_hash_tag=from_scratch_hash,
+ target_image_tag=runtime_image_tag,
+ )
+ else:
+ logger.info(
+ f'Dry run: Skipping image build for [{generic_runtime_image_name}]'
+ )
- load_command = ['docker', 'load', '-i', tar_path]
- subprocess.run(load_command, check=True)
- logger.info(f'Image {new_image_name} loaded back into Docker from {tar_path}')
+ if docker_build_folder is None:
+ shutil.rmtree(cur_docker_build_folder)
- return new_image_name
+ return f'{runtime_image_repo}:{from_scratch_hash}'
if __name__ == '__main__':
parser = argparse.ArgumentParser()
- parser.add_argument('--base_image', type=str, default='ubuntu:22.04')
- parser.add_argument('--update_source_code', action='store_true')
- parser.add_argument('--save_to_local_store', action='store_true')
+ parser.add_argument(
+ '--base_image', type=str, default='nikolaik/python-nodejs:python3.11-nodejs22'
+ )
parser.add_argument('--build_folder', type=str, default=None)
+ parser.add_argument('--force_rebuild', action='store_true', default=False)
args = parser.parse_args()
if args.build_folder is not None:
@@ -279,29 +378,41 @@ def build_runtime_image(
logger.info(
f'Will prepare a build folder by copying the source code and generating the Dockerfile: {build_folder}'
)
- new_image_path = get_new_image_name(args.base_image)
- prep_docker_build_folder(
- build_folder, args.base_image, skip_init=args.update_source_code
+ runtime_image_repo, runtime_image_tag = get_runtime_image_repo_and_tag(
+ args.base_image
+ )
+ with tempfile.TemporaryDirectory() as temp_dir:
+ runtime_image_hash_name = build_runtime_image(
+ args.base_image,
+ docker_client=docker.from_env(),
+ docker_build_folder=temp_dir,
+ dry_run=True,
+ force_rebuild=args.force_rebuild,
+ )
+ _runtime_image_repo, runtime_image_hash_tag = runtime_image_hash_name.split(
+ ':'
+ )
+ # Move contents of temp_dir to build_folder
+ shutil.copytree(temp_dir, build_folder, dirs_exist_ok=True)
+ logger.info(
+ f'Build folder [{build_folder}] is ready: {os.listdir(build_folder)}'
)
- new_image_name, new_image_tag = new_image_path.split(':')
+
with open(os.path.join(build_folder, 'config.sh'), 'a') as file:
file.write(
(
- f'DOCKER_IMAGE={new_image_name}\n'
- f'DOCKER_IMAGE_TAG={new_image_tag}\n'
+ f'\n'
+ f'DOCKER_IMAGE={runtime_image_repo}\n'
+ f'DOCKER_IMAGE_TAG={runtime_image_tag}\n'
+ f'DOCKER_IMAGE_HASH_TAG={runtime_image_hash_tag}\n'
)
)
logger.info(
- f'`config.sh` is updated with the new image name [{new_image_name}] and tag [{new_image_tag}]'
+ f'`config.sh` is updated with the new image name [{runtime_image_repo}] and tags [{runtime_image_tag}, {runtime_image_hash_tag}]'
)
logger.info(f'Dockerfile and source distribution are ready in {build_folder}')
else:
logger.info('Building image in a temporary folder')
client = docker.from_env()
- image_name = build_runtime_image(
- args.base_image,
- client,
- update_source_code=args.update_source_code,
- save_to_local_store=args.save_to_local_store,
- )
+ image_name = build_runtime_image(args.base_image, client)
print(f'\nBUILT Image: {image_name}\n')
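
The rebuild logic above keys images on a `dirhash` of the prepared build folder: an exact hash match is reused as is, an existing generic tag is used as a faster base for a `skip_init` rebuild, and only otherwise is the image built from scratch. For reference, `get_runtime_image_repo_and_tag('ubuntu:22.04')` would return the repo from `OD_RUNTIME_RUNTIME_IMAGE_REPO` (default `ghcr.io/opendevin/od_runtime`) and a tag of the form `od_v<version>_image_ubuntu_tag_22.04`. Below is a minimal dry-run sketch, assuming Docker is available and you are inside an OpenDevin checkout (the source tree is packaged to compute the content hash even on a dry run):

```python
import docker

from opendevin.runtime.utils.runtime_build import build_runtime_image

# Dry run: prepares the build folder, computes its content hash, and reports
# the image name that a real build would produce, without building anything.
client = docker.from_env()
image_name = build_runtime_image(
    'nikolaik/python-nodejs:python3.11-nodejs22',  # default base image of the CLI above
    client,
    dry_run=True,
)
print(image_name)  # e.g. 'ghcr.io/opendevin/od_runtime:<md5-of-build-folder>'
```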
diff --git a/opendevin/runtime/utils/runtime_templates/Dockerfile.j2 b/opendevin/runtime/utils/runtime_templates/Dockerfile.j2
index 2502518b88e8..b7d04d246529 100644
--- a/opendevin/runtime/utils/runtime_templates/Dockerfile.j2
+++ b/opendevin/runtime/utils/runtime_templates/Dockerfile.j2
@@ -5,6 +5,7 @@ FROM {{ base_image }}
# START: Build Runtime Image from Scratch
# ================================================================
FROM {{ base_image }}
+
{% if 'ubuntu' in base_image and (base_image.endswith(':latest') or base_image.endswith(':24.04')) %}
{% set LIBGL_MESA = 'libgl1' %}
{% else %}
@@ -13,15 +14,16 @@ FROM {{ base_image }}
# Install necessary packages and clean up in one layer
RUN apt-get update && \
- apt-get install -y wget sudo apt-utils {{ LIBGL_MESA }} libasound2-plugins && \
- apt-get clean \
- && rm -rf /var/lib/apt/lists/*
+ apt-get install -y wget sudo apt-utils {{ LIBGL_MESA }} libasound2-plugins git && \
+ apt-get clean && \
+ rm -rf /var/lib/apt/lists/*
# Create necessary directories
RUN mkdir -p /opendevin && \
mkdir -p /opendevin/logs && \
- chmod 777 /opendevin/logs && \
- echo "" > /opendevin/bash.bashrc
+ mkdir -p /opendevin/poetry
+
+ENV POETRY_VIRTUALENVS_PATH=/opendevin/poetry
RUN if [ ! -d /opendevin/miniforge3 ]; then \
wget --progress=bar:force -O Miniforge3.sh "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh" && \
@@ -32,8 +34,7 @@ RUN if [ ! -d /opendevin/miniforge3 ]; then \
fi
# Install Python and Poetry
-RUN /opendevin/miniforge3/bin/mamba install python=3.11 -y
-RUN /opendevin/miniforge3/bin/mamba install conda-forge::poetry -y
+RUN /opendevin/miniforge3/bin/mamba install conda-forge::poetry python=3.11 -y
# ================================================================
# END: Build Runtime Image from Scratch
# ================================================================
@@ -42,10 +43,8 @@ RUN /opendevin/miniforge3/bin/mamba install conda-forge::poetry -y
# ================================================================
# START: Copy Project and Install/Update Dependencies
# ================================================================
-COPY project.tar.gz /opendevin
RUN if [ -d /opendevin/code ]; then rm -rf /opendevin/code; fi
-RUN cd /opendevin && tar -xzvf project.tar.gz && rm project.tar.gz
-RUN mv /opendevin/{{ source_code_dirname }} /opendevin/code
+COPY ./code /opendevin/code
# Install/Update Dependencies
# 1. Install pyproject.toml via poetry
@@ -53,11 +52,14 @@ RUN mv /opendevin/{{ source_code_dirname }} /opendevin/code
# 3. Clear poetry, apt, mamba caches
RUN cd /opendevin/code && \
/opendevin/miniforge3/bin/mamba run -n base poetry env use python3.11 && \
- /opendevin/miniforge3/bin/mamba run -n base poetry install --no-interaction --no-root && \
+ /opendevin/miniforge3/bin/mamba run -n base poetry install --only main,runtime --no-interaction --no-root && \
apt-get update && \
/opendevin/miniforge3/bin/mamba run -n base poetry run pip install playwright && \
/opendevin/miniforge3/bin/mamba run -n base poetry run playwright install --with-deps chromium && \
+ export OD_INTERPRETER_PATH=$(/opendevin/miniforge3/bin/mamba run -n base poetry run python -c "import sys; print(sys.executable)") && \
+ {{ extra_deps }} {% if extra_deps %} && {% endif %} \
/opendevin/miniforge3/bin/mamba run -n base poetry cache clear --all . && \
+ {% if not skip_init %}chmod -R g+rws /opendevin/poetry && {% endif %} \
apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
/opendevin/miniforge3/bin/mamba clean --all
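
Since the template no longer takes `source_code_dirname` and instead receives `skip_init` and `extra_deps`, it can be rendered directly for inspection. A sketch, with the template directory inferred from the file path in this diff and illustrative parameter values:

```python
import os

from jinja2 import Environment, FileSystemLoader

import opendevin

# Render the runtime Dockerfile template the same way _generate_dockerfile does.
templates_dir = os.path.join(
    os.path.dirname(opendevin.__file__), 'runtime', 'utils', 'runtime_templates'
)
env = Environment(loader=FileSystemLoader(searchpath=templates_dir))
dockerfile = env.get_template('Dockerfile.j2').render(
    base_image='nikolaik/python-nodejs:python3.11-nodejs22',  # illustrative
    skip_init=False,
    extra_deps='',
)
print(dockerfile[:300])  # inspect the generated Dockerfile head
```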
diff --git a/opendevin/server/README.md b/opendevin/server/README.md
index fa4d6249e662..991f0fe7c03b 100644
--- a/opendevin/server/README.md
+++ b/opendevin/server/README.md
@@ -2,9 +2,23 @@
This is a WebSocket server that executes tasks using an agent.
+## Recommended Prerequisites
+
+- [Initialize the frontend code](../../frontend/README.md)
+- Install Python 3.12 (`brew install python` if you use Homebrew)
+- Install pipx (`brew install pipx`, then `pipx ensurepath`)
+- Install Poetry (`pipx install poetry`)
+
## Install
-Follow the instructions in the base README.md to install dependencies and set up.
+First, build a distribution of the frontend code (from the project root directory):
+```
+cd frontend
+npm install
+npm run build
+cd ..
+```
+Next, run `poetry shell` so you don't have to prefix every command with `poetry run`.
## Start the Server
diff --git a/opendevin/server/listen.py b/opendevin/server/listen.py
index 3bbba95cd1f9..4453047702ba 100644
--- a/opendevin/server/listen.py
+++ b/opendevin/server/listen.py
@@ -1,11 +1,10 @@
import os
import re
+import tempfile
import uuid
import warnings
import requests
-from pathspec import PathSpec
-from pathspec.patterns import GitWildMatchPattern
from opendevin.server.data_models.feedback import FeedbackDataModel, store_feedback
from opendevin.storage import get_file_store
@@ -33,13 +32,22 @@
from opendevin.core.config import LLMConfig, load_app_config
from opendevin.core.logger import opendevin_logger as logger
from opendevin.core.schema import AgentState # Add this import
-from opendevin.events.action import ChangeAgentStateAction, NullAction
+from opendevin.events.action import (
+ ChangeAgentStateAction,
+ FileReadAction,
+ FileWriteAction,
+ NullAction,
+)
from opendevin.events.observation import (
AgentStateChangedObservation,
+ ErrorObservation,
+ FileReadObservation,
+ FileWriteObservation,
NullObservation,
)
from opendevin.events.serialization import event_to_dict
from opendevin.llm import bedrock
+from opendevin.runtime.runtime import Runtime
from opendevin.server.auth import get_sid_from_token, sign_token
from opendevin.server.session import SessionManager
@@ -207,7 +215,7 @@ async def websocket_endpoint(websocket: WebSocket):
```
- Send a message:
```json
- {"action": "message", "args": {"content": "Hello, how are you?"}}
+ {"action": "message", "args": {"content": "Hello, how are you?", "images_urls": ["base64_url1", "base64_url2"]}}
```
- Write contents to a file:
```json
@@ -355,7 +363,7 @@ async def get_agents():
@app.get('/api/list-files')
-def list_files(request: Request, path: str = '/'):
+async def list_files(request: Request, path: str | None = None):
"""List files in the specified path.
This function retrieves a list of files from the agent's runtime file store,
@@ -368,7 +376,7 @@ def list_files(request: Request, path: str = '/'):
Args:
request (Request): The incoming request object.
- path (str, optional): The path to list files from. Defaults to '/'.
+ path (str, optional): The path to list files from. Defaults to None.
Returns:
list: A list of file names in the specified path.
@@ -381,90 +389,13 @@ def list_files(request: Request, path: str = '/'):
status_code=status.HTTP_404_NOT_FOUND,
content={'error': 'Runtime not yet initialized'},
)
-
- try:
- # Get the full path of the requested directory
- full_path = (
- request.state.session.agent_session.runtime.file_store.get_full_path(path)
- )
-
- # Check if the directory exists
- if not os.path.exists(full_path) or not os.path.isdir(full_path):
- return []
-
- # Check if .gitignore exists
- gitignore_path = os.path.join(full_path, '.gitignore')
- if os.path.exists(gitignore_path):
- # Use PathSpec to parse .gitignore
- with open(gitignore_path, 'r') as f:
- spec = PathSpec.from_lines(GitWildMatchPattern, f.readlines())
- else:
- # Fallback to default exclude list if .gitignore doesn't exist
- default_exclude = [
- '.git',
- '.DS_Store',
- '.svn',
- '.hg',
- '.idea',
- '.vscode',
- '.settings',
- '.pytest_cache',
- '__pycache__',
- 'node_modules',
- 'vendor',
- 'build',
- 'dist',
- 'bin',
- 'logs',
- 'log',
- 'tmp',
- 'temp',
- 'coverage',
- 'venv',
- 'env',
- ]
- spec = PathSpec.from_lines(GitWildMatchPattern, default_exclude)
-
- entries = request.state.session.agent_session.runtime.file_store.list(path)
-
- # Filter entries using PathSpec
- filtered_entries = [
- entry
- for entry in entries
- if not spec.match_file(os.path.relpath(entry, str(full_path)))
- ]
-
- # Separate directories and files
- directories = []
- files = []
- for entry in filtered_entries:
- # Remove leading slash and any parent directory components
- entry_relative = entry.lstrip('/').split('/')[-1]
-
- # Construct the full path by joining the base path with the relative entry path
- full_entry_path = os.path.join(full_path, entry_relative)
- if os.path.exists(full_entry_path):
- is_dir = os.path.isdir(full_entry_path)
- if is_dir:
- directories.append(entry)
- else:
- files.append(entry)
-
- # Sort directories and files separately
- directories.sort(key=lambda s: s.lower())
- files.sort(key=lambda s: s.lower())
-
- # Combine sorted directories and files
- sorted_entries = directories + files
- return sorted_entries
-
- except Exception as e:
- logger.error(f'Error listing files: {e}', exc_info=True)
- return []
+ runtime: Runtime = request.state.session.agent_session.runtime
+ file_list = await runtime.list_files(path)
+ return file_list
@app.get('/api/select-file')
-def select_file(file: str, request: Request):
+async def select_file(file: str, request: Request):
"""Retrieve the content of a specified file.
To select a file:
@@ -474,6 +405,7 @@ def select_file(file: str, request: Request):
Args:
file (str): The path of the file to be retrieved.
+ The path must be absolute inside the runtime.
request (Request): The incoming request object.
Returns:
@@ -482,16 +414,27 @@ def select_file(file: str, request: Request):
Raises:
HTTPException: If there's an error opening the file.
"""
- try:
- content = request.state.session.agent_session.runtime.file_store.read(file)
- except Exception as e:
- logger.error(f'Error opening file {file}: {e}', exc_info=False)
- error_msg = f'Error opening file: {e}'
+ runtime: Runtime = request.state.session.agent_session.runtime
+
+ # the file path must be absolute inside the runtime
+ if not os.path.isabs(file):
+ return JSONResponse(
+ status_code=status.HTTP_400_BAD_REQUEST,
+ content={'error': 'File path must be absolute'},
+ )
+
+ read_action = FileReadAction(file)
+ observation = await runtime.run_action(read_action)
+
+ if isinstance(observation, FileReadObservation):
+ content = observation.content
+ return {'code': content}
+ elif isinstance(observation, ErrorObservation):
+ logger.error(f'Error opening file {file}: {observation}', exc_info=False)
return JSONResponse(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
- content={'error': error_msg},
+ content={'error': f'Error opening file: {observation}'},
)
- return {'code': content}
def sanitize_filename(filename):
@@ -552,9 +495,17 @@ async def upload_file(request: Request, files: list[UploadFile]):
)
continue
- request.state.session.agent_session.runtime.file_store.write(
- safe_filename, file_contents
- )
+ # copy the file to the runtime
+ with tempfile.TemporaryDirectory() as tmp_dir:
+ tmp_file_path = os.path.join(tmp_dir, safe_filename)
+ with open(tmp_file_path, 'wb') as tmp_file:
+ tmp_file.write(file_contents)
+ tmp_file.flush()
+
+ runtime: Runtime = request.state.session.agent_session.runtime
+ await runtime.copy_to(
+ tmp_file_path, runtime.config.workspace_mount_path_in_sandbox
+ )
uploaded_files.append(safe_filename)
response_content = {
@@ -709,13 +660,32 @@ async def save_file(request: Request):
if not file_path or content is None:
raise HTTPException(status_code=400, detail='Missing filePath or content')
+ # Make sure file_path is absolute
+ if not os.path.isabs(file_path):
+ return JSONResponse(
+ status_code=status.HTTP_400_BAD_REQUEST,
+ content={'error': 'File path must be absolute'},
+ )
+
# Save the file to the agent's runtime file store
- request.state.session.agent_session.runtime.file_store.write(file_path, content)
+ runtime: Runtime = request.state.session.agent_session.runtime
+ write_action = FileWriteAction(file_path, content)
+ observation = await runtime.run_action(write_action)
- # Return a success response
- return JSONResponse(
- status_code=200, content={'message': 'File saved successfully'}
- )
+ if isinstance(observation, FileWriteObservation):
+ return JSONResponse(
+ status_code=200, content={'message': 'File saved successfully'}
+ )
+ elif isinstance(observation, ErrorObservation):
+ return JSONResponse(
+ status_code=500,
+ content={'error': f'Failed to save file: {observation}'},
+ )
+ else:
+ return JSONResponse(
+ status_code=500,
+ content={'error': f'Unexpected observation: {observation}'},
+ )
except Exception as e:
# Log the error and return a 500 response
logger.error(f'Error saving file: {e}', exc_info=True)
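
The file endpoints above now route through the runtime's own action execution instead of a local file store, and they require absolute paths inside the sandbox. A rough client-side sketch follows; the host/port and the auth header are assumptions, while the routes, the `file` query parameter, and the absolute-path rule come from this diff:

```python
import requests

BASE = 'http://localhost:3000'  # assumed host/port
HEADERS = {'Authorization': 'Bearer <session-token>'}  # hypothetical auth scheme

# List files in the sandbox's working directory.
files = requests.get(f'{BASE}/api/list-files', headers=HEADERS).json()
print(files)

# Fetch a single file; the path must be absolute inside the runtime.
resp = requests.get(
    f'{BASE}/api/select-file',
    params={'file': '/workspace/app.py'},
    headers=HEADERS,
)
print(resp.json().get('code'))  # file contents on success, {'error': ...} otherwise
```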
diff --git a/opendevin/server/session/agent.py b/opendevin/server/session/agent.py
index 616a99501bff..bbe5d5e98a4f 100644
--- a/opendevin/server/session/agent.py
+++ b/opendevin/server/session/agent.py
@@ -1,15 +1,13 @@
from typing import Optional
-from agenthub.codeact_agent.codeact_agent import CodeActAgent
from opendevin.controller import AgentController
from opendevin.controller.agent import Agent
from opendevin.controller.state.state import State
from opendevin.core.config import AppConfig, LLMConfig
from opendevin.core.logger import opendevin_logger as logger
from opendevin.events.stream import EventStream
-from opendevin.runtime import DockerSSHBox, get_runtime_cls
+from opendevin.runtime import get_runtime_cls
from opendevin.runtime.runtime import Runtime
-from opendevin.runtime.server.runtime import ServerRuntime
from opendevin.storage.files import FileStore
@@ -22,6 +20,7 @@ class AgentSession:
sid: str
event_stream: EventStream
+ file_store: FileStore
controller: Optional[AgentController] = None
runtime: Optional[Runtime] = None
_closed: bool = False
@@ -51,7 +50,7 @@ async def start(
raise Exception(
'Session already started. You need to close this session and start a new one.'
)
- await self._create_runtime(runtime_name, config)
+ await self._create_runtime(runtime_name, config, agent)
await self._create_controller(
agent,
confirmation_mode,
@@ -71,7 +70,7 @@ async def close(self):
await self.runtime.close()
self._closed = True
- async def _create_runtime(self, runtime_name: str, config: AppConfig):
+ async def _create_runtime(self, runtime_name: str, config: AppConfig, agent: Agent):
"""Creates a runtime instance."""
if self.runtime is not None:
raise Exception('Runtime already created')
@@ -79,7 +78,10 @@ async def _create_runtime(self, runtime_name: str, config: AppConfig):
logger.info(f'Using runtime: {runtime_name}')
runtime_cls = get_runtime_cls(runtime_name)
self.runtime = runtime_cls(
- config=config, event_stream=self.event_stream, sid=self.sid
+ config=config,
+ event_stream=self.event_stream,
+ sid=self.sid,
+ plugins=agent.sandbox_plugins,
)
await self.runtime.ainit()
@@ -98,17 +100,6 @@ async def _create_controller(
raise Exception('Runtime must be initialized before the agent controller')
logger.info(f'Creating agent {agent.name} using LLM {agent.llm.config.model}')
- if isinstance(agent, CodeActAgent):
- if not self.runtime or not (
- isinstance(self.runtime, ServerRuntime)
- and isinstance(self.runtime.sandbox, DockerSSHBox)
- ):
- logger.warning(
- 'CodeActAgent requires DockerSSHBox as sandbox! Using other sandbox that are not stateful'
- ' LocalBox will not work properly.'
- )
- self.runtime.init_sandbox_plugins(agent.sandbox_plugins)
- self.runtime.init_runtime_tools(agent.runtime_tools)
self.controller = AgentController(
sid=self.sid,
diff --git a/opendevin/server/session/session.py b/opendevin/server/session/session.py
index d8434bffc7c8..7d0b850d6f89 100644
--- a/opendevin/server/session/session.py
+++ b/opendevin/server/session/session.py
@@ -10,13 +10,14 @@
from opendevin.core.schema import AgentState
from opendevin.core.schema.action import ActionType
from opendevin.core.schema.config import ConfigType
-from opendevin.events.action import Action, ChangeAgentStateAction, NullAction
+from opendevin.events.action import ChangeAgentStateAction, MessageAction, NullAction
from opendevin.events.event import Event, EventSource
from opendevin.events.observation import (
AgentStateChangedObservation,
CmdOutputObservation,
NullObservation,
)
+from opendevin.events.observation.browse import BrowserOutputObservation
from opendevin.events.serialization import event_from_dict, event_to_dict
from opendevin.events.stream import EventStreamSubscriber
from opendevin.llm.llm import LLM
@@ -135,7 +136,7 @@ async def on_event(self, event: Event):
if event.source == EventSource.AGENT:
await self.send(event_to_dict(event))
elif event.source == EventSource.USER and isinstance(
- event, CmdOutputObservation
+ event, (CmdOutputObservation, BrowserOutputObservation)
):
await self.send(event_to_dict(event))
@@ -145,11 +146,15 @@ async def dispatch(self, data: dict):
await self._initialize_agent(data)
return
event = event_from_dict(data.copy())
+ # This checks if the model supports images
+ if isinstance(event, MessageAction) and event.images_urls:
+ controller = self.agent_session.controller
+ if controller and not controller.agent.llm.supports_vision():
+ await self.send_error(
+ 'Model does not support image upload; change to a different model or try without an image.'
+ )
+ return
self.agent_session.event_stream.add_event(event, EventSource.USER)
- if isinstance(event, Action):
- logger.info(
- event, extra={'msg_type': 'ACTION', 'event_source': EventSource.USER}
- )
async def send(self, data: dict[str, object]) -> bool:
try:
diff --git a/poetry.lock b/poetry.lock
index 19b60549653b..65a38908108f 100644
--- a/poetry.lock
+++ b/poetry.lock
@@ -398,6 +398,17 @@ files = [
{file = "backoff-2.2.1.tar.gz", hash = "sha256:03f829f5bb1923180821643f8753b0502c3b682293992485b0eef2807afa5cba"},
]
+[[package]]
+name = "bashlex"
+version = "0.18"
+description = "Python parser for bash"
+optional = false
+python-versions = ">=2.7, !=3.0, !=3.1, !=3.2, !=3.3, !=3.4"
+files = [
+ {file = "bashlex-0.18-py2.py3-none-any.whl", hash = "sha256:91d73a23a3e51711919c1c899083890cdecffc91d8c088942725ac13e9dcfffa"},
+ {file = "bashlex-0.18.tar.gz", hash = "sha256:5bb03a01c6d5676338c36fd1028009c8ad07e7d61d8a1ce3f513b7fff52796ee"},
+]
+
[[package]]
name = "bcrypt"
version = "4.1.3"
@@ -508,17 +519,17 @@ files = [
[[package]]
name = "boto3"
-version = "1.34.149"
+version = "1.34.157"
description = "The AWS SDK for Python"
optional = false
python-versions = ">=3.8"
files = [
- {file = "boto3-1.34.149-py3-none-any.whl", hash = "sha256:11edeeacdd517bda3b7615b754d8440820cdc9ddd66794cc995a9693ddeaa3be"},
- {file = "boto3-1.34.149.tar.gz", hash = "sha256:f4e6489ba9dc7fb37d53e0e82dbc97f2cb0a4969ef3970e2c88b8f94023ae81a"},
+ {file = "boto3-1.34.157-py3-none-any.whl", hash = "sha256:3cc357156df5482154a016f138d1953061a181b4c594f8b6302c9d6c024bd950"},
+ {file = "boto3-1.34.157.tar.gz", hash = "sha256:7ef19ed38cba9863b58430fb4a66a72a5c250304f234bd1c16b860f9bf25677b"},
]
[package.dependencies]
-botocore = ">=1.34.149,<1.35.0"
+botocore = ">=1.34.157,<1.35.0"
jmespath = ">=0.7.1,<2.0.0"
s3transfer = ">=0.10.0,<0.11.0"
@@ -527,13 +538,13 @@ crt = ["botocore[crt] (>=1.21.0,<2.0a0)"]
[[package]]
name = "botocore"
-version = "1.34.149"
+version = "1.34.157"
description = "Low-level, data-driven core of boto 3."
optional = false
python-versions = ">=3.8"
files = [
- {file = "botocore-1.34.149-py3-none-any.whl", hash = "sha256:ae6c4be52eeee96f68c116b27d252bab069cd046d61a17cfe8e9da411cf22906"},
- {file = "botocore-1.34.149.tar.gz", hash = "sha256:2e1eb5ef40102a3d796bb3dd05f2ac5e8fb43fe1ff114b4f6d33153437f5a372"},
+ {file = "botocore-1.34.157-py3-none-any.whl", hash = "sha256:c6cba6de8eb86ca4d2f934e009b37adbe1e7fdcfa52fbab74783f4c30676e07d"},
+ {file = "botocore-1.34.157.tar.gz", hash = "sha256:5628a36cec123cdc8c1158d05a7b06aa5e53649ad73796c50ef3fb51199785fb"},
]
[package.dependencies]
@@ -542,7 +553,7 @@ python-dateutil = ">=2.1,<3.0.0"
urllib3 = {version = ">=1.25.4,<2.2.0 || >2.2.0,<3", markers = "python_version >= \"3.10\""}
[package.extras]
-crt = ["awscrt (==0.20.11)"]
+crt = ["awscrt (==0.21.2)"]
[[package]]
name = "browsergym"
@@ -1367,6 +1378,20 @@ files = [
graph = ["objgraph (>=1.7.2)"]
profile = ["gprof2dot (>=2022.7.29)"]
+[[package]]
+name = "dirhash"
+version = "0.5.0"
+description = "Python module and CLI for hashing of file system directories."
+optional = false
+python-versions = ">=3.8"
+files = [
+ {file = "dirhash-0.5.0-py3-none-any.whl", hash = "sha256:523dfd6b058c64f45b31604376926c6e2bd2ea301d0df23095d4055674e38b09"},
+ {file = "dirhash-0.5.0.tar.gz", hash = "sha256:e60760f0ab2e935d8cb088923ea2c6492398dca42cec785df778985fd4cd5386"},
+]
+
+[package.dependencies]
+scantree = ">=0.0.4"
+
[[package]]
name = "dirtyjson"
version = "1.0.8"
@@ -1400,26 +1425,6 @@ files = [
{file = "distro-1.9.0.tar.gz", hash = "sha256:2fa77c6fd8940f116ee1d6b94a2f90b13b5ea8d019b98bc8bafdcabcdd9bdbed"},
]
-[[package]]
-name = "dnspython"
-version = "2.6.1"
-description = "DNS toolkit"
-optional = false
-python-versions = ">=3.8"
-files = [
- {file = "dnspython-2.6.1-py3-none-any.whl", hash = "sha256:5ef3b9680161f6fa89daf8ad451b5f1a33b18ae8a1c6778cdf4b43f08c0a6e50"},
- {file = "dnspython-2.6.1.tar.gz", hash = "sha256:e8f0f9c23a7b7cb99ded64e6c3a6f3e701d78f50c55e002b839dea7225cff7cc"},
-]
-
-[package.extras]
-dev = ["black (>=23.1.0)", "coverage (>=7.0)", "flake8 (>=7)", "mypy (>=1.8)", "pylint (>=3)", "pytest (>=7.4)", "pytest-cov (>=4.1.0)", "sphinx (>=7.2.0)", "twine (>=4.0.0)", "wheel (>=0.42.0)"]
-dnssec = ["cryptography (>=41)"]
-doh = ["h2 (>=4.1.0)", "httpcore (>=1.0.0)", "httpx (>=0.26.0)"]
-doq = ["aioquic (>=0.9.25)"]
-idna = ["idna (>=3.6)"]
-trio = ["trio (>=0.23)"]
-wmi = ["wmi (>=1.5.1)"]
-
[[package]]
name = "docker"
version = "7.1.0"
@@ -1475,21 +1480,6 @@ typing-extensions = ">=4.8.0"
urllib3 = ">=1.25.3"
websockets = ">=11.0.3"
-[[package]]
-name = "email-validator"
-version = "2.2.0"
-description = "A robust email address syntax and deliverability validation library."
-optional = false
-python-versions = ">=3.8"
-files = [
- {file = "email_validator-2.2.0-py3-none-any.whl", hash = "sha256:561977c2d73ce3611850a06fa56b414621e0c8faa9d66f2611407d87465da631"},
- {file = "email_validator-2.2.0.tar.gz", hash = "sha256:cb690f344c617a714f22e66ae771445a1ceb46821152df8e165c5f9a364582b7"},
-]
-
-[package.dependencies]
-dnspython = ">=2.0.0"
-idna = ">=2.0.0"
-
[[package]]
name = "english-words"
version = "2.0.1"
@@ -1576,45 +1566,23 @@ files = [
[[package]]
name = "fastapi"
-version = "0.111.1"
+version = "0.112.0"
description = "FastAPI framework, high performance, easy to learn, fast to code, ready for production"
optional = false
python-versions = ">=3.8"
files = [
- {file = "fastapi-0.111.1-py3-none-any.whl", hash = "sha256:4f51cfa25d72f9fbc3280832e84b32494cf186f50158d364a8765aabf22587bf"},
- {file = "fastapi-0.111.1.tar.gz", hash = "sha256:ddd1ac34cb1f76c2e2d7f8545a4bcb5463bce4834e81abf0b189e0c359ab2413"},
+ {file = "fastapi-0.112.0-py3-none-any.whl", hash = "sha256:3487ded9778006a45834b8c816ec4a48d522e2631ca9e75ec5a774f1b052f821"},
+ {file = "fastapi-0.112.0.tar.gz", hash = "sha256:d262bc56b7d101d1f4e8fc0ad2ac75bb9935fec504d2b7117686cec50710cf05"},
]
[package.dependencies]
-email_validator = ">=2.0.0"
-fastapi-cli = ">=0.0.2"
-httpx = ">=0.23.0"
-jinja2 = ">=2.11.2"
pydantic = ">=1.7.4,<1.8 || >1.8,<1.8.1 || >1.8.1,<2.0.0 || >2.0.0,<2.0.1 || >2.0.1,<2.1.0 || >2.1.0,<3.0.0"
-python-multipart = ">=0.0.7"
starlette = ">=0.37.2,<0.38.0"
typing-extensions = ">=4.8.0"
-uvicorn = {version = ">=0.12.0", extras = ["standard"]}
-
-[package.extras]
-all = ["email_validator (>=2.0.0)", "httpx (>=0.23.0)", "itsdangerous (>=1.1.0)", "jinja2 (>=2.11.2)", "orjson (>=3.2.1)", "pydantic-extra-types (>=2.0.0)", "pydantic-settings (>=2.0.0)", "python-multipart (>=0.0.7)", "pyyaml (>=5.3.1)", "ujson (>=4.0.1,!=4.0.2,!=4.1.0,!=4.2.0,!=4.3.0,!=5.0.0,!=5.1.0)", "uvicorn[standard] (>=0.12.0)"]
-
-[[package]]
-name = "fastapi-cli"
-version = "0.0.4"
-description = "Run and manage FastAPI apps from the command line with FastAPI CLI. ๐"
-optional = false
-python-versions = ">=3.8"
-files = [
- {file = "fastapi_cli-0.0.4-py3-none-any.whl", hash = "sha256:a2552f3a7ae64058cdbb530be6fa6dbfc975dc165e4fa66d224c3d396e25e809"},
- {file = "fastapi_cli-0.0.4.tar.gz", hash = "sha256:e2e9ffaffc1f7767f488d6da34b6f5a377751c996f397902eb6abb99a67bde32"},
-]
-
-[package.dependencies]
-typer = ">=0.12.3"
[package.extras]
-standard = ["fastapi", "uvicorn[standard] (>=0.15.0)"]
+all = ["email_validator (>=2.0.0)", "fastapi-cli[standard] (>=0.0.5)", "httpx (>=0.23.0)", "itsdangerous (>=1.1.0)", "jinja2 (>=2.11.2)", "orjson (>=3.2.1)", "pydantic-extra-types (>=2.0.0)", "pydantic-settings (>=2.0.0)", "python-multipart (>=0.0.7)", "pyyaml (>=5.3.1)", "ujson (>=4.0.1,!=4.0.2,!=4.1.0,!=4.2.0,!=4.3.0,!=5.0.0,!=5.1.0)", "uvicorn[standard] (>=0.12.0)"]
+standard = ["email_validator (>=2.0.0)", "fastapi-cli[standard] (>=0.0.5)", "httpx (>=0.23.0)", "jinja2 (>=2.11.2)", "python-multipart (>=0.0.7)", "uvicorn[standard] (>=0.12.0)"]
[[package]]
name = "fastcore"
@@ -1665,13 +1633,13 @@ typing = ["typing-extensions (>=4.8)"]
[[package]]
name = "flake8"
-version = "7.1.0"
+version = "7.1.1"
description = "the modular source code checker: pep8 pyflakes and co"
optional = false
python-versions = ">=3.8.1"
files = [
- {file = "flake8-7.1.0-py2.py3-none-any.whl", hash = "sha256:2e416edcc62471a64cea09353f4e7bdba32aeb079b6e360554c659a122b1bc6a"},
- {file = "flake8-7.1.0.tar.gz", hash = "sha256:48a07b626b55236e0fb4784ee69a465fbf59d79eec1f5b4785c3d3bc57d17aa5"},
+ {file = "flake8-7.1.1-py2.py3-none-any.whl", hash = "sha256:597477df7860daa5aa0fdd84bf5208a043ab96b8e96ab708770ae0364dd03213"},
+ {file = "flake8-7.1.1.tar.gz", hash = "sha256:049d058491e228e03e67b390f311bbf88fce2dbaa8fa673e7aea87b7198b8d38"},
]
[package.dependencies]
@@ -1915,6 +1883,36 @@ test-downstream = ["aiobotocore (>=2.5.4,<3.0.0)", "dask-expr", "dask[dataframe,
test-full = ["adlfs", "aiohttp (!=4.0.0a0,!=4.0.0a1)", "cloudpickle", "dask", "distributed", "dropbox", "dropboxdrivefs", "fastparquet", "fusepy", "gcsfs", "jinja2", "kerchunk", "libarchive-c", "lz4", "notebook", "numpy", "ocifs", "pandas", "panel", "paramiko", "pyarrow", "pyarrow (>=1)", "pyftpdlib", "pygit2", "pytest", "pytest-asyncio (!=0.22.0)", "pytest-benchmark", "pytest-cov", "pytest-mock", "pytest-recording", "pytest-rerunfailures", "python-snappy", "requests", "smbprotocol", "tqdm", "urllib3", "zarr", "zstandard"]
tqdm = ["tqdm"]
+[[package]]
+name = "func-timeout"
+version = "4.3.5"
+description = "Python module which allows you to specify timeouts when calling any existing function. Also provides support for stoppable-threads"
+optional = false
+python-versions = "*"
+files = [
+ {file = "func_timeout-4.3.5.tar.gz", hash = "sha256:74cd3c428ec94f4edfba81f9b2f14904846d5ffccc27c92433b8b5939b5575dd"},
+]
+
+[[package]]
+name = "gdown"
+version = "5.2.0"
+description = "Google Drive Public File/Folder Downloader"
+optional = false
+python-versions = ">=3.8"
+files = [
+ {file = "gdown-5.2.0-py3-none-any.whl", hash = "sha256:33083832d82b1101bdd0e9df3edd0fbc0e1c5f14c9d8c38d2a35bf1683b526d6"},
+ {file = "gdown-5.2.0.tar.gz", hash = "sha256:2145165062d85520a3cd98b356c9ed522c5e7984d408535409fd46f94defc787"},
+]
+
+[package.dependencies]
+beautifulsoup4 = "*"
+filelock = "*"
+requests = {version = "*", extras = ["socks"]}
+tqdm = "*"
+
+[package.extras]
+test = ["build", "mypy", "pytest", "pytest-xdist", "ruff", "twine", "types-requests", "types-setuptools"]
+
[[package]]
name = "gevent"
version = "24.2.1"
@@ -2129,13 +2127,13 @@ httplib2 = ">=0.19.0"
[[package]]
name = "google-cloud-aiplatform"
-version = "1.60.0"
+version = "1.61.0"
description = "Vertex AI API client library"
optional = false
python-versions = ">=3.8"
files = [
- {file = "google-cloud-aiplatform-1.60.0.tar.gz", hash = "sha256:782c7f1ec0e77a7c7daabef3b65bfd506ed2b4b1dc2186753c43cd6faf8dd04e"},
- {file = "google_cloud_aiplatform-1.60.0-py2.py3-none-any.whl", hash = "sha256:5f14159c9575f4b46335027e3ceb8fa57bd5eaa76a07f858105b8c6c034ec0d6"},
+ {file = "google-cloud-aiplatform-1.61.0.tar.gz", hash = "sha256:648e3cd7bb75be706d3c31d852a3d4d8a2e616ad4db4cf520ef4430615cf8ad9"},
+ {file = "google_cloud_aiplatform-1.61.0-py2.py3-none-any.whl", hash = "sha256:57b36d5fa085e68197e9fc576c43263a7cad320483aa3b166bcd1fdc7e8f49e7"},
]
[package.dependencies]
@@ -2147,7 +2145,7 @@ google-cloud-resource-manager = ">=1.3.3,<3.0.0dev"
google-cloud-storage = ">=1.32.0,<3.0.0dev"
packaging = ">=14.3"
proto-plus = ">=1.22.3,<2.0.0dev"
-protobuf = ">=3.19.5,<3.20.0 || >3.20.0,<3.20.1 || >3.20.1,<4.21.0 || >4.21.0,<4.21.1 || >4.21.1,<4.21.2 || >4.21.2,<4.21.3 || >4.21.3,<4.21.4 || >4.21.4,<4.21.5 || >4.21.5,<5.0.0dev"
+protobuf = ">=3.20.2,<4.21.0 || >4.21.0,<4.21.1 || >4.21.1,<4.21.2 || >4.21.2,<4.21.3 || >4.21.3,<4.21.4 || >4.21.4,<4.21.5 || >4.21.5,<6.0.0dev"
pydantic = "<3"
shapely = "<3.0.0dev"
@@ -2480,13 +2478,13 @@ test = ["objgraph", "psutil"]
[[package]]
name = "grep-ast"
-version = "0.3.2"
+version = "0.3.3"
description = "A tool to grep through the AST of a source file"
optional = false
python-versions = "*"
files = [
- {file = "grep_ast-0.3.2-py3-none-any.whl", hash = "sha256:b7ceb84743983c3f4f5bca82f3374534cd9dbd759792d0dedf5648fedbb6f3fc"},
- {file = "grep_ast-0.3.2.tar.gz", hash = "sha256:d53bc7d25dfefafe77643fec189ab38e3cbd839d546c070a950ebedad82ee164"},
+ {file = "grep_ast-0.3.3-py3-none-any.whl", hash = "sha256:515cb889bffefefa26c4ab1377b9a75b3fc678aa5fa02bf9aa4f8f20999a83ad"},
+ {file = "grep_ast-0.3.3.tar.gz", hash = "sha256:42b8887d57301dc55634368f8d549e9c49c913dafb4d19c9b54c3ddb604fccf4"},
]
[package.dependencies]
@@ -2992,6 +2990,76 @@ MarkupSafe = ">=2.0"
[package.extras]
i18n = ["Babel (>=2.7)"]
+[[package]]
+name = "jiter"
+version = "0.5.0"
+description = "Fast iterable JSON parser."
+optional = false
+python-versions = ">=3.8"
+files = [
+ {file = "jiter-0.5.0-cp310-cp310-macosx_10_12_x86_64.whl", hash = "sha256:b599f4e89b3def9a94091e6ee52e1d7ad7bc33e238ebb9c4c63f211d74822c3f"},
+ {file = "jiter-0.5.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:2a063f71c4b06225543dddadbe09d203dc0c95ba352d8b85f1221173480a71d5"},
+ {file = "jiter-0.5.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:acc0d5b8b3dd12e91dd184b87273f864b363dfabc90ef29a1092d269f18c7e28"},
+ {file = "jiter-0.5.0-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:c22541f0b672f4d741382a97c65609332a783501551445ab2df137ada01e019e"},
+ {file = "jiter-0.5.0-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:63314832e302cc10d8dfbda0333a384bf4bcfce80d65fe99b0f3c0da8945a91a"},
+ {file = "jiter-0.5.0-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:a25fbd8a5a58061e433d6fae6d5298777c0814a8bcefa1e5ecfff20c594bd749"},
+ {file = "jiter-0.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:503b2c27d87dfff5ab717a8200fbbcf4714516c9d85558048b1fc14d2de7d8dc"},
+ {file = "jiter-0.5.0-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:6d1f3d27cce923713933a844872d213d244e09b53ec99b7a7fdf73d543529d6d"},
+ {file = "jiter-0.5.0-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:c95980207b3998f2c3b3098f357994d3fd7661121f30669ca7cb945f09510a87"},
+ {file = "jiter-0.5.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:afa66939d834b0ce063f57d9895e8036ffc41c4bd90e4a99631e5f261d9b518e"},
+ {file = "jiter-0.5.0-cp310-none-win32.whl", hash = "sha256:f16ca8f10e62f25fd81d5310e852df6649af17824146ca74647a018424ddeccf"},
+ {file = "jiter-0.5.0-cp310-none-win_amd64.whl", hash = "sha256:b2950e4798e82dd9176935ef6a55cf6a448b5c71515a556da3f6b811a7844f1e"},
+ {file = "jiter-0.5.0-cp311-cp311-macosx_10_12_x86_64.whl", hash = "sha256:d4c8e1ed0ef31ad29cae5ea16b9e41529eb50a7fba70600008e9f8de6376d553"},
+ {file = "jiter-0.5.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:c6f16e21276074a12d8421692515b3fd6d2ea9c94fd0734c39a12960a20e85f3"},
+ {file = "jiter-0.5.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5280e68e7740c8c128d3ae5ab63335ce6d1fb6603d3b809637b11713487af9e6"},
+ {file = "jiter-0.5.0-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:583c57fc30cc1fec360e66323aadd7fc3edeec01289bfafc35d3b9dcb29495e4"},
+ {file = "jiter-0.5.0-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:26351cc14507bdf466b5f99aba3df3143a59da75799bf64a53a3ad3155ecded9"},
+ {file = "jiter-0.5.0-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:4829df14d656b3fb87e50ae8b48253a8851c707da9f30d45aacab2aa2ba2d614"},
+ {file = "jiter-0.5.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a42a4bdcf7307b86cb863b2fb9bb55029b422d8f86276a50487982d99eed7c6e"},
+ {file = "jiter-0.5.0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:04d461ad0aebf696f8da13c99bc1b3e06f66ecf6cfd56254cc402f6385231c06"},
+ {file = "jiter-0.5.0-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:e6375923c5f19888c9226582a124b77b622f8fd0018b843c45eeb19d9701c403"},
+ {file = "jiter-0.5.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:2cec323a853c24fd0472517113768c92ae0be8f8c384ef4441d3632da8baa646"},
+ {file = "jiter-0.5.0-cp311-none-win32.whl", hash = "sha256:aa1db0967130b5cab63dfe4d6ff547c88b2a394c3410db64744d491df7f069bb"},
+ {file = "jiter-0.5.0-cp311-none-win_amd64.whl", hash = "sha256:aa9d2b85b2ed7dc7697597dcfaac66e63c1b3028652f751c81c65a9f220899ae"},
+ {file = "jiter-0.5.0-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:9f664e7351604f91dcdd557603c57fc0d551bc65cc0a732fdacbf73ad335049a"},
+ {file = "jiter-0.5.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:044f2f1148b5248ad2c8c3afb43430dccf676c5a5834d2f5089a4e6c5bbd64df"},
+ {file = "jiter-0.5.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:702e3520384c88b6e270c55c772d4bd6d7b150608dcc94dea87ceba1b6391248"},
+ {file = "jiter-0.5.0-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:528d742dcde73fad9d63e8242c036ab4a84389a56e04efd854062b660f559544"},
+ {file = "jiter-0.5.0-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:8cf80e5fe6ab582c82f0c3331df27a7e1565e2dcf06265afd5173d809cdbf9ba"},
+ {file = "jiter-0.5.0-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:44dfc9ddfb9b51a5626568ef4e55ada462b7328996294fe4d36de02fce42721f"},
+ {file = "jiter-0.5.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c451f7922992751a936b96c5f5b9bb9312243d9b754c34b33d0cb72c84669f4e"},
+ {file = "jiter-0.5.0-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:308fce789a2f093dca1ff91ac391f11a9f99c35369117ad5a5c6c4903e1b3e3a"},
+ {file = "jiter-0.5.0-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:7f5ad4a7c6b0d90776fdefa294f662e8a86871e601309643de30bf94bb93a64e"},
+ {file = "jiter-0.5.0-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:ea189db75f8eca08807d02ae27929e890c7d47599ce3d0a6a5d41f2419ecf338"},
+ {file = "jiter-0.5.0-cp312-none-win32.whl", hash = "sha256:e3bbe3910c724b877846186c25fe3c802e105a2c1fc2b57d6688b9f8772026e4"},
+ {file = "jiter-0.5.0-cp312-none-win_amd64.whl", hash = "sha256:a586832f70c3f1481732919215f36d41c59ca080fa27a65cf23d9490e75b2ef5"},
+ {file = "jiter-0.5.0-cp38-cp38-macosx_10_12_x86_64.whl", hash = "sha256:f04bc2fc50dc77be9d10f73fcc4e39346402ffe21726ff41028f36e179b587e6"},
+ {file = "jiter-0.5.0-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:6f433a4169ad22fcb550b11179bb2b4fd405de9b982601914ef448390b2954f3"},
+ {file = "jiter-0.5.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ad4a6398c85d3a20067e6c69890ca01f68659da94d74c800298581724e426c7e"},
+ {file = "jiter-0.5.0-cp38-cp38-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:6baa88334e7af3f4d7a5c66c3a63808e5efbc3698a1c57626541ddd22f8e4fbf"},
+ {file = "jiter-0.5.0-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:1ece0a115c05efca597c6d938f88c9357c843f8c245dbbb53361a1c01afd7148"},
+ {file = "jiter-0.5.0-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:335942557162ad372cc367ffaf93217117401bf930483b4b3ebdb1223dbddfa7"},
+ {file = "jiter-0.5.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:649b0ee97a6e6da174bffcb3c8c051a5935d7d4f2f52ea1583b5b3e7822fbf14"},
+ {file = "jiter-0.5.0-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:f4be354c5de82157886ca7f5925dbda369b77344b4b4adf2723079715f823989"},
+ {file = "jiter-0.5.0-cp38-cp38-musllinux_1_1_aarch64.whl", hash = "sha256:5206144578831a6de278a38896864ded4ed96af66e1e63ec5dd7f4a1fce38a3a"},
+ {file = "jiter-0.5.0-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:8120c60f8121ac3d6f072b97ef0e71770cc72b3c23084c72c4189428b1b1d3b6"},
+ {file = "jiter-0.5.0-cp38-none-win32.whl", hash = "sha256:6f1223f88b6d76b519cb033a4d3687ca157c272ec5d6015c322fc5b3074d8a5e"},
+ {file = "jiter-0.5.0-cp38-none-win_amd64.whl", hash = "sha256:c59614b225d9f434ea8fc0d0bec51ef5fa8c83679afedc0433905994fb36d631"},
+ {file = "jiter-0.5.0-cp39-cp39-macosx_10_12_x86_64.whl", hash = "sha256:0af3838cfb7e6afee3f00dc66fa24695199e20ba87df26e942820345b0afc566"},
+ {file = "jiter-0.5.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:550b11d669600dbc342364fd4adbe987f14d0bbedaf06feb1b983383dcc4b961"},
+ {file = "jiter-0.5.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:489875bf1a0ffb3cb38a727b01e6673f0f2e395b2aad3c9387f94187cb214bbf"},
+ {file = "jiter-0.5.0-cp39-cp39-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:b250ca2594f5599ca82ba7e68785a669b352156260c5362ea1b4e04a0f3e2389"},
+ {file = "jiter-0.5.0-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:8ea18e01f785c6667ca15407cd6dabbe029d77474d53595a189bdc813347218e"},
+ {file = "jiter-0.5.0-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:462a52be85b53cd9bffd94e2d788a09984274fe6cebb893d6287e1c296d50653"},
+ {file = "jiter-0.5.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:92cc68b48d50fa472c79c93965e19bd48f40f207cb557a8346daa020d6ba973b"},
+ {file = "jiter-0.5.0-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:1c834133e59a8521bc87ebcad773608c6fa6ab5c7a022df24a45030826cf10bc"},
+ {file = "jiter-0.5.0-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:ab3a71ff31cf2d45cb216dc37af522d335211f3a972d2fe14ea99073de6cb104"},
+ {file = "jiter-0.5.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:cccd3af9c48ac500c95e1bcbc498020c87e1781ff0345dd371462d67b76643eb"},
+ {file = "jiter-0.5.0-cp39-none-win32.whl", hash = "sha256:368084d8d5c4fc40ff7c3cc513c4f73e02c85f6009217922d0823a48ee7adf61"},
+ {file = "jiter-0.5.0-cp39-none-win_amd64.whl", hash = "sha256:ce03f7b4129eb72f1687fa11300fbf677b02990618428934662406d2a76742a1"},
+ {file = "jiter-0.5.0.tar.gz", hash = "sha256:1d916ba875bcab5c5f7d927df998c4cb694d27dceddf3392e58beaf10563368a"},
+]
+
[[package]]
name = "jmespath"
version = "1.0.1"
@@ -3016,13 +3084,13 @@ files = [
[[package]]
name = "json-repair"
-version = "0.25.3"
+version = "0.27.0"
description = "A package to repair broken json strings"
optional = false
-python-versions = ">=3.7"
+python-versions = ">=3.8"
files = [
- {file = "json_repair-0.25.3-py3-none-any.whl", hash = "sha256:f00b510dd21b31ebe72581bdb07e66381df2883d6f640c89605e482882c12b17"},
- {file = "json_repair-0.25.3.tar.gz", hash = "sha256:4ee970581a05b0b258b749eb8bcac21de380edda97c3717a4edfafc519ec21a4"},
+ {file = "json_repair-0.27.0-py3-none-any.whl", hash = "sha256:20763c8cf1c3096e33ce7c09c2b8e6c471a4acdce468688c96052fc7cccbad7f"},
+ {file = "json_repair-0.27.0.tar.gz", hash = "sha256:f4e14c5ad2b3f17290a361c3c90915536b462c36f69989e915867e81663dd467"},
]
[[package]]
@@ -3498,13 +3566,13 @@ types-tqdm = "*"
[[package]]
name = "litellm"
-version = "1.42.3"
+version = "1.43.4"
description = "Library to easily interface with LLM API providers"
optional = false
python-versions = "!=2.7.*,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,!=3.6.*,!=3.7.*,>=3.8"
files = [
- {file = "litellm-1.42.3-py3-none-any.whl", hash = "sha256:08060c35dd2261a7955a9da30aca294f921adad4074157876be9a1e901bd107c"},
- {file = "litellm-1.42.3.tar.gz", hash = "sha256:69b4a4c748d9dfd21016f7738b8ad31768271c8c37491d7a1a3c3ffae0e47ac2"},
+ {file = "litellm-1.43.4-py3-none-any.whl", hash = "sha256:619cfaab189f921f66ff50c2b7a0e965e562c2a95b17c2ee24649826ba35da11"},
+ {file = "litellm-1.43.4.tar.gz", hash = "sha256:949c51ad494b935d80da1cd18c3567e4ed181f8eb531ef4706e3be72afb0c43c"},
]
[package.dependencies]
@@ -3513,7 +3581,7 @@ click = "*"
importlib-metadata = ">=6.8.0"
jinja2 = ">=3.1.2,<4.0.0"
jsonschema = ">=4.22.0,<5.0.0"
-openai = ">=1.27.0"
+openai = ">=1.40.0"
pydantic = ">=2.0.0,<3.0.0"
python-dotenv = ">=0.2.0"
requests = ">=2.31.0,<3.0.0"
@@ -3666,13 +3734,13 @@ sentence-transformers = ">=2.6.1"
[[package]]
name = "llama-index-embeddings-ollama"
-version = "0.1.2"
+version = "0.1.3"
description = "llama-index embeddings ollama integration"
optional = false
-python-versions = ">=3.8.1,<4.0"
+python-versions = "<4.0,>=3.8.1"
files = [
- {file = "llama_index_embeddings_ollama-0.1.2-py3-none-any.whl", hash = "sha256:ac7afabfa1134059af351b021e05e256bf86dd15e5176ffa5ab0305bcf03b33f"},
- {file = "llama_index_embeddings_ollama-0.1.2.tar.gz", hash = "sha256:a9e0809bddd2e4ad888f249519edc7e3d339c74e4e03fc5a40c3060dc41d47a9"},
+ {file = "llama_index_embeddings_ollama-0.1.3-py3-none-any.whl", hash = "sha256:b960a8c744e2e56ce1fd75a34753614fed3ad81558570ae9958b90b9062afb6a"},
+ {file = "llama_index_embeddings_ollama-0.1.3.tar.gz", hash = "sha256:4bd1dd3230c9be04cfa45b28c3a8066e46c1654d4360fcbecdc1718ac9013eca"},
]
[package.dependencies]
@@ -4571,38 +4639,38 @@ dill = ">=0.3.8"
[[package]]
name = "mypy"
-version = "1.11.0"
+version = "1.11.1"
description = "Optional static typing for Python"
optional = false
python-versions = ">=3.8"
files = [
- {file = "mypy-1.11.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:a3824187c99b893f90c845bab405a585d1ced4ff55421fdf5c84cb7710995229"},
- {file = "mypy-1.11.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:96f8dbc2c85046c81bcddc246232d500ad729cb720da4e20fce3b542cab91287"},
- {file = "mypy-1.11.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:1a5d8d8dd8613a3e2be3eae829ee891b6b2de6302f24766ff06cb2875f5be9c6"},
- {file = "mypy-1.11.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:72596a79bbfb195fd41405cffa18210af3811beb91ff946dbcb7368240eed6be"},
- {file = "mypy-1.11.0-cp310-cp310-win_amd64.whl", hash = "sha256:35ce88b8ed3a759634cb4eb646d002c4cef0a38f20565ee82b5023558eb90c00"},
- {file = "mypy-1.11.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:98790025861cb2c3db8c2f5ad10fc8c336ed2a55f4daf1b8b3f877826b6ff2eb"},
- {file = "mypy-1.11.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:25bcfa75b9b5a5f8d67147a54ea97ed63a653995a82798221cca2a315c0238c1"},
- {file = "mypy-1.11.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:0bea2a0e71c2a375c9fa0ede3d98324214d67b3cbbfcbd55ac8f750f85a414e3"},
- {file = "mypy-1.11.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:d2b3d36baac48e40e3064d2901f2fbd2a2d6880ec6ce6358825c85031d7c0d4d"},
- {file = "mypy-1.11.0-cp311-cp311-win_amd64.whl", hash = "sha256:d8e2e43977f0e09f149ea69fd0556623919f816764e26d74da0c8a7b48f3e18a"},
- {file = "mypy-1.11.0-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:1d44c1e44a8be986b54b09f15f2c1a66368eb43861b4e82573026e04c48a9e20"},
- {file = "mypy-1.11.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:cea3d0fb69637944dd321f41bc896e11d0fb0b0aa531d887a6da70f6e7473aba"},
- {file = "mypy-1.11.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:a83ec98ae12d51c252be61521aa5731f5512231d0b738b4cb2498344f0b840cd"},
- {file = "mypy-1.11.0-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:c7b73a856522417beb78e0fb6d33ef89474e7a622db2653bc1285af36e2e3e3d"},
- {file = "mypy-1.11.0-cp312-cp312-win_amd64.whl", hash = "sha256:f2268d9fcd9686b61ab64f077be7ffbc6fbcdfb4103e5dd0cc5eaab53a8886c2"},
- {file = "mypy-1.11.0-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:940bfff7283c267ae6522ef926a7887305945f716a7704d3344d6d07f02df850"},
- {file = "mypy-1.11.0-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:14f9294528b5f5cf96c721f231c9f5b2733164e02c1c018ed1a0eff8a18005ac"},
- {file = "mypy-1.11.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d7b54c27783991399046837df5c7c9d325d921394757d09dbcbf96aee4649fe9"},
- {file = "mypy-1.11.0-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:65f190a6349dec29c8d1a1cd4aa71284177aee5949e0502e6379b42873eddbe7"},
- {file = "mypy-1.11.0-cp38-cp38-win_amd64.whl", hash = "sha256:dbe286303241fea8c2ea5466f6e0e6a046a135a7e7609167b07fd4e7baf151bf"},
- {file = "mypy-1.11.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:104e9c1620c2675420abd1f6c44bab7dd33cc85aea751c985006e83dcd001095"},
- {file = "mypy-1.11.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:f006e955718ecd8d159cee9932b64fba8f86ee6f7728ca3ac66c3a54b0062abe"},
- {file = "mypy-1.11.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:becc9111ca572b04e7e77131bc708480cc88a911adf3d0239f974c034b78085c"},
- {file = "mypy-1.11.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:6801319fe76c3f3a3833f2b5af7bd2c17bb93c00026a2a1b924e6762f5b19e13"},
- {file = "mypy-1.11.0-cp39-cp39-win_amd64.whl", hash = "sha256:c1a184c64521dc549324ec6ef7cbaa6b351912be9cb5edb803c2808a0d7e85ac"},
- {file = "mypy-1.11.0-py3-none-any.whl", hash = "sha256:56913ec8c7638b0091ef4da6fcc9136896914a9d60d54670a75880c3e5b99ace"},
- {file = "mypy-1.11.0.tar.gz", hash = "sha256:93743608c7348772fdc717af4aeee1997293a1ad04bc0ea6efa15bf65385c538"},
+ {file = "mypy-1.11.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:a32fc80b63de4b5b3e65f4be82b4cfa362a46702672aa6a0f443b4689af7008c"},
+ {file = "mypy-1.11.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:c1952f5ea8a5a959b05ed5f16452fddadbaae48b5d39235ab4c3fc444d5fd411"},
+ {file = "mypy-1.11.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e1e30dc3bfa4e157e53c1d17a0dad20f89dc433393e7702b813c10e200843b03"},
+ {file = "mypy-1.11.1-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:2c63350af88f43a66d3dfeeeb8d77af34a4f07d760b9eb3a8697f0386c7590b4"},
+ {file = "mypy-1.11.1-cp310-cp310-win_amd64.whl", hash = "sha256:a831671bad47186603872a3abc19634f3011d7f83b083762c942442d51c58d58"},
+ {file = "mypy-1.11.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:7b6343d338390bb946d449677726edf60102a1c96079b4f002dedff375953fc5"},
+ {file = "mypy-1.11.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:e4fe9f4e5e521b458d8feb52547f4bade7ef8c93238dfb5bbc790d9ff2d770ca"},
+ {file = "mypy-1.11.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:886c9dbecc87b9516eff294541bf7f3655722bf22bb898ee06985cd7269898de"},
+ {file = "mypy-1.11.1-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:fca4a60e1dd9fd0193ae0067eaeeb962f2d79e0d9f0f66223a0682f26ffcc809"},
+ {file = "mypy-1.11.1-cp311-cp311-win_amd64.whl", hash = "sha256:0bd53faf56de9643336aeea1c925012837432b5faf1701ccca7fde70166ccf72"},
+ {file = "mypy-1.11.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:f39918a50f74dc5969807dcfaecafa804fa7f90c9d60506835036cc1bc891dc8"},
+ {file = "mypy-1.11.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:0bc71d1fb27a428139dd78621953effe0d208aed9857cb08d002280b0422003a"},
+ {file = "mypy-1.11.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:b868d3bcff720dd7217c383474008ddabaf048fad8d78ed948bb4b624870a417"},
+ {file = "mypy-1.11.1-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:a707ec1527ffcdd1c784d0924bf5cb15cd7f22683b919668a04d2b9c34549d2e"},
+ {file = "mypy-1.11.1-cp312-cp312-win_amd64.whl", hash = "sha256:64f4a90e3ea07f590c5bcf9029035cf0efeae5ba8be511a8caada1a4893f5525"},
+ {file = "mypy-1.11.1-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:749fd3213916f1751fff995fccf20c6195cae941dc968f3aaadf9bb4e430e5a2"},
+ {file = "mypy-1.11.1-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:b639dce63a0b19085213ec5fdd8cffd1d81988f47a2dec7100e93564f3e8fb3b"},
+ {file = "mypy-1.11.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4c956b49c5d865394d62941b109728c5c596a415e9c5b2be663dd26a1ff07bc0"},
+ {file = "mypy-1.11.1-cp38-cp38-musllinux_1_1_x86_64.whl", hash = "sha256:45df906e8b6804ef4b666af29a87ad9f5921aad091c79cc38e12198e220beabd"},
+ {file = "mypy-1.11.1-cp38-cp38-win_amd64.whl", hash = "sha256:d44be7551689d9d47b7abc27c71257adfdb53f03880841a5db15ddb22dc63edb"},
+ {file = "mypy-1.11.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:2684d3f693073ab89d76da8e3921883019ea8a3ec20fa5d8ecca6a2db4c54bbe"},
+ {file = "mypy-1.11.1-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:79c07eb282cb457473add5052b63925e5cc97dfab9812ee65a7c7ab5e3cb551c"},
+ {file = "mypy-1.11.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:11965c2f571ded6239977b14deebd3f4c3abd9a92398712d6da3a772974fad69"},
+ {file = "mypy-1.11.1-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:a2b43895a0f8154df6519706d9bca8280cda52d3d9d1514b2d9c3e26792a0b74"},
+ {file = "mypy-1.11.1-cp39-cp39-win_amd64.whl", hash = "sha256:1a81cf05975fd61aec5ae16501a091cfb9f605dc3e3c878c0da32f250b74760b"},
+ {file = "mypy-1.11.1-py3-none-any.whl", hash = "sha256:0624bdb940255d2dd24e829d99a13cfeb72e4e9031f9492148f410ed30bcab54"},
+ {file = "mypy-1.11.1.tar.gz", hash = "sha256:f404a0b069709f18bbdb702eb3dcfe51910602995de00bd39cea3050b5772d08"},
]
[package.dependencies]
@@ -5058,23 +5126,24 @@ sympy = "*"
[[package]]
name = "openai"
-version = "1.37.1"
+version = "1.40.2"
description = "The official Python library for the openai API"
optional = false
python-versions = ">=3.7.1"
files = [
- {file = "openai-1.37.1-py3-none-any.whl", hash = "sha256:9a6adda0d6ae8fce02d235c5671c399cfa40d6a281b3628914c7ebf244888ee3"},
- {file = "openai-1.37.1.tar.gz", hash = "sha256:faf87206785a6b5d9e34555d6a3242482a6852bc802e453e2a891f68ee04ce55"},
+ {file = "openai-1.40.2-py3-none-any.whl", hash = "sha256:38068f858f310b4fd4b0ea8734c3efcfde3c15a2978311e1453bd84817231b96"},
+ {file = "openai-1.40.2.tar.gz", hash = "sha256:2180e9070bd36084328248b3ce668964e8ddd2e9019e1d426e31dc54cc117bb5"},
]
[package.dependencies]
anyio = ">=3.5.0,<5"
distro = ">=1.7.0,<2"
httpx = ">=0.23.0,<1"
+jiter = ">=0.4.0,<1"
pydantic = ">=1.9.0,<3"
sniffio = "*"
tqdm = ">4"
-typing-extensions = ">=4.7,<5"
+typing-extensions = ">=4.11,<5"
[package.extras]
datalib = ["numpy (>=1)", "pandas (>=1.2.3)", "pandas-stubs (>=1.1.0.11)"]
@@ -5551,13 +5620,13 @@ xmp = ["defusedxml"]
[[package]]
name = "pip"
-version = "24.1.1"
+version = "24.2"
description = "The PyPA recommended tool for installing Python packages."
optional = false
python-versions = ">=3.8"
files = [
- {file = "pip-24.1.1-py3-none-any.whl", hash = "sha256:efca15145a95e95c00608afeab66311d40bfb73bb2266a855befd705e6bb15a0"},
- {file = "pip-24.1.1.tar.gz", hash = "sha256:5aa64f65e1952733ee0a9a9b1f52496ebdb3f3077cc46f80a16d983b58d1180a"},
+ {file = "pip-24.2-py3-none-any.whl", hash = "sha256:2cd581cf58ab7fcfca4ce8efa6dcacd0de5bf8d0a3eb9ec927e07405f4d9e2a2"},
+ {file = "pip-24.2.tar.gz", hash = "sha256:5b5e490b5e9cb275c879595064adce9ebd31b854e3e803740b72f9ccf34a45b8"},
]
[[package]]
@@ -5655,13 +5724,13 @@ test = ["coverage", "flake8", "freezegun (==0.3.15)", "mock (>=2.0.0)", "pylint"
[[package]]
name = "pre-commit"
-version = "3.7.1"
+version = "3.8.0"
description = "A framework for managing and maintaining multi-language pre-commit hooks."
optional = false
python-versions = ">=3.9"
files = [
- {file = "pre_commit-3.7.1-py2.py3-none-any.whl", hash = "sha256:fae36fd1d7ad7d6a5a1c0b0d5adb2ed1a3bda5a21bf6c3e5372073d7a11cd4c5"},
- {file = "pre_commit-3.7.1.tar.gz", hash = "sha256:8ca3ad567bc78a4972a3f1a477e94a79d4597e8140a6e0b651c5e33899c3654a"},
+ {file = "pre_commit-3.8.0-py2.py3-none-any.whl", hash = "sha256:9a90a53bf82fdd8778d58085faf8d83df56e40dfe18f45b19446e26bf1b3a63f"},
+ {file = "pre_commit-3.8.0.tar.gz", hash = "sha256:8bb6494d4a20423842e198980c9ecf9f96607a07ea29549e180eef9ae80fe7af"},
]
[package.dependencies]
@@ -6124,13 +6193,13 @@ windows-terminal = ["colorama (>=0.4.6)"]
[[package]]
name = "pyjwt"
-version = "2.8.0"
+version = "2.9.0"
description = "JSON Web Token implementation in Python"
optional = false
-python-versions = ">=3.7"
+python-versions = ">=3.8"
files = [
- {file = "PyJWT-2.8.0-py3-none-any.whl", hash = "sha256:59127c392cc44c2da5bb3192169a91f429924e17aff6534d70fdc02ab3e04320"},
- {file = "PyJWT-2.8.0.tar.gz", hash = "sha256:57e28d156e3d5c10088e0c68abb90bfac3df82b40a71bd0daa20c65ccd5c23de"},
+ {file = "PyJWT-2.9.0-py3-none-any.whl", hash = "sha256:3b02fb0f44517787776cf48f2ae25d8e14f300e6d7545a4315cee571a415e850"},
+ {file = "pyjwt-2.9.0.tar.gz", hash = "sha256:7e1e5b56cc735432a7369cbfa0efe50fa113ebecdc04ae6922deba8b84582d0c"},
]
[package.dependencies]
@@ -6138,8 +6207,8 @@ cryptography = {version = ">=3.4.0", optional = true, markers = "extra == \"cryp
[package.extras]
crypto = ["cryptography (>=3.4.0)"]
-dev = ["coverage[toml] (==5.0.4)", "cryptography (>=3.4.0)", "pre-commit", "pytest (>=6.0.0,<7.0.0)", "sphinx (>=4.5.0,<5.0.0)", "sphinx-rtd-theme", "zope.interface"]
-docs = ["sphinx (>=4.5.0,<5.0.0)", "sphinx-rtd-theme", "zope.interface"]
+dev = ["coverage[toml] (==5.0.4)", "cryptography (>=3.4.0)", "pre-commit", "pytest (>=6.0.0,<7.0.0)", "sphinx", "sphinx-rtd-theme", "zope.interface"]
+docs = ["sphinx", "sphinx-rtd-theme", "zope.interface"]
tests = ["coverage[toml] (==5.0.4)", "pytest (>=6.0.0,<7.0.0)"]
[[package]]
@@ -6234,6 +6303,18 @@ files = [
{file = "pyreadline3-3.4.1.tar.gz", hash = "sha256:6f3d1f7b8a31ba32b73917cefc1f28cc660562f39aea8646d30bd6eff21f7bae"},
]
+[[package]]
+name = "pysocks"
+version = "1.7.1"
+description = "A Python SOCKS client module. See https://github.com/Anorov/PySocks for more information."
+optional = false
+python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*"
+files = [
+ {file = "PySocks-1.7.1-py27-none-any.whl", hash = "sha256:08e69f092cc6dbe92a0fdd16eeb9b9ffbc13cadfe5ca4c7bd92ffb078b293299"},
+ {file = "PySocks-1.7.1-py3-none-any.whl", hash = "sha256:2725bd0a9925919b9b51739eea5f9e2bae91e83288108a9ad338b2e3a4435ee5"},
+ {file = "PySocks-1.7.1.tar.gz", hash = "sha256:3f8804571ebe159c380ac6de37643bb4685970655d3bba243530d6558b799aa0"},
+]
+
[[package]]
name = "pytest"
version = "8.3.2"
@@ -6375,18 +6456,19 @@ dev = ["atomicwrites (==1.4.1)", "attrs (==23.2.0)", "coverage (==7.4.1)", "hatc
[[package]]
name = "python-pptx"
-version = "0.6.23"
-description = "Generate and manipulate Open XML PowerPoint (.pptx) files"
+version = "1.0.2"
+description = "Create, read, and update PowerPoint 2007+ (.pptx) files."
optional = false
-python-versions = "*"
+python-versions = ">=3.8"
files = [
- {file = "python-pptx-0.6.23.tar.gz", hash = "sha256:587497ff28e779ab18dbb074f6d4052893c85dedc95ed75df319364f331fedee"},
- {file = "python_pptx-0.6.23-py3-none-any.whl", hash = "sha256:dd0527194627a2b7cc05f3ba23ecaa2d9a0d5ac9b6193a28ed1b7a716f4217d4"},
+ {file = "python_pptx-1.0.2-py3-none-any.whl", hash = "sha256:160838e0b8565a8b1f67947675886e9fea18aa5e795db7ae531606d68e785cba"},
+ {file = "python_pptx-1.0.2.tar.gz", hash = "sha256:479a8af0eaf0f0d76b6f00b0887732874ad2e3188230315290cd1f9dd9cc7095"},
]
[package.dependencies]
lxml = ">=3.1.0"
Pillow = ">=3.3.2"
+typing-extensions = ">=4.9.0"
XlsxWriter = ">=0.5.7"
[[package]]
@@ -6736,6 +6818,7 @@ files = [
certifi = ">=2017.4.17"
charset-normalizer = ">=2,<4"
idna = ">=2.5,<4"
+PySocks = {version = ">=1.5.6,<1.5.7 || >1.5.7", optional = true, markers = "extra == \"socks\""}
urllib3 = ">=1.21.1,<3"
[package.extras]
@@ -6942,29 +7025,29 @@ pyasn1 = ">=0.1.3"
[[package]]
name = "ruff"
-version = "0.5.5"
+version = "0.5.7"
description = "An extremely fast Python linter and code formatter, written in Rust."
optional = false
python-versions = ">=3.7"
files = [
- {file = "ruff-0.5.5-py3-none-linux_armv6l.whl", hash = "sha256:605d589ec35d1da9213a9d4d7e7a9c761d90bba78fc8790d1c5e65026c1b9eaf"},
- {file = "ruff-0.5.5-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:00817603822a3e42b80f7c3298c8269e09f889ee94640cd1fc7f9329788d7bf8"},
- {file = "ruff-0.5.5-py3-none-macosx_11_0_arm64.whl", hash = "sha256:187a60f555e9f865a2ff2c6984b9afeffa7158ba6e1eab56cb830404c942b0f3"},
- {file = "ruff-0.5.5-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:fe26fc46fa8c6e0ae3f47ddccfbb136253c831c3289bba044befe68f467bfb16"},
- {file = "ruff-0.5.5-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:4ad25dd9c5faac95c8e9efb13e15803cd8bbf7f4600645a60ffe17c73f60779b"},
- {file = "ruff-0.5.5-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:f70737c157d7edf749bcb952d13854e8f745cec695a01bdc6e29c29c288fc36e"},
- {file = "ruff-0.5.5-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:cfd7de17cef6ab559e9f5ab859f0d3296393bc78f69030967ca4d87a541b97a0"},
- {file = "ruff-0.5.5-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:a09b43e02f76ac0145f86a08e045e2ea452066f7ba064fd6b0cdccb486f7c3e7"},
- {file = "ruff-0.5.5-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:d0b856cb19c60cd40198be5d8d4b556228e3dcd545b4f423d1ad812bfdca5884"},
- {file = "ruff-0.5.5-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3687d002f911e8a5faf977e619a034d159a8373514a587249cc00f211c67a091"},
- {file = "ruff-0.5.5-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:ac9dc814e510436e30d0ba535f435a7f3dc97f895f844f5b3f347ec8c228a523"},
- {file = "ruff-0.5.5-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:af9bdf6c389b5add40d89b201425b531e0a5cceb3cfdcc69f04d3d531c6be74f"},
- {file = "ruff-0.5.5-py3-none-musllinux_1_2_i686.whl", hash = "sha256:d40a8533ed545390ef8315b8e25c4bb85739b90bd0f3fe1280a29ae364cc55d8"},
- {file = "ruff-0.5.5-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:cab904683bf9e2ecbbe9ff235bfe056f0eba754d0168ad5407832928d579e7ab"},
- {file = "ruff-0.5.5-py3-none-win32.whl", hash = "sha256:696f18463b47a94575db635ebb4c178188645636f05e934fdf361b74edf1bb2d"},
- {file = "ruff-0.5.5-py3-none-win_amd64.whl", hash = "sha256:50f36d77f52d4c9c2f1361ccbfbd09099a1b2ea5d2b2222c586ab08885cf3445"},
- {file = "ruff-0.5.5-py3-none-win_arm64.whl", hash = "sha256:3191317d967af701f1b73a31ed5788795936e423b7acce82a2b63e26eb3e89d6"},
- {file = "ruff-0.5.5.tar.gz", hash = "sha256:cc5516bdb4858d972fbc31d246bdb390eab8df1a26e2353be2dbc0c2d7f5421a"},
+ {file = "ruff-0.5.7-py3-none-linux_armv6l.whl", hash = "sha256:548992d342fc404ee2e15a242cdbea4f8e39a52f2e7752d0e4cbe88d2d2f416a"},
+ {file = "ruff-0.5.7-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:00cc8872331055ee017c4f1071a8a31ca0809ccc0657da1d154a1d2abac5c0be"},
+ {file = "ruff-0.5.7-py3-none-macosx_11_0_arm64.whl", hash = "sha256:eaf3d86a1fdac1aec8a3417a63587d93f906c678bb9ed0b796da7b59c1114a1e"},
+ {file = "ruff-0.5.7-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a01c34400097b06cf8a6e61b35d6d456d5bd1ae6961542de18ec81eaf33b4cb8"},
+ {file = "ruff-0.5.7-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:fcc8054f1a717e2213500edaddcf1dbb0abad40d98e1bd9d0ad364f75c763eea"},
+ {file = "ruff-0.5.7-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:7f70284e73f36558ef51602254451e50dd6cc479f8b6f8413a95fcb5db4a55fc"},
+ {file = "ruff-0.5.7-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:a78ad870ae3c460394fc95437d43deb5c04b5c29297815a2a1de028903f19692"},
+ {file = "ruff-0.5.7-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:9ccd078c66a8e419475174bfe60a69adb36ce04f8d4e91b006f1329d5cd44bcf"},
+ {file = "ruff-0.5.7-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:7e31c9bad4ebf8fdb77b59cae75814440731060a09a0e0077d559a556453acbb"},
+ {file = "ruff-0.5.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:8d796327eed8e168164346b769dd9a27a70e0298d667b4ecee6877ce8095ec8e"},
+ {file = "ruff-0.5.7-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:4a09ea2c3f7778cc635e7f6edf57d566a8ee8f485f3c4454db7771efb692c499"},
+ {file = "ruff-0.5.7-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:a36d8dcf55b3a3bc353270d544fb170d75d2dff41eba5df57b4e0b67a95bb64e"},
+ {file = "ruff-0.5.7-py3-none-musllinux_1_2_i686.whl", hash = "sha256:9369c218f789eefbd1b8d82a8cf25017b523ac47d96b2f531eba73770971c9e5"},
+ {file = "ruff-0.5.7-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:b88ca3db7eb377eb24fb7c82840546fb7acef75af4a74bd36e9ceb37a890257e"},
+ {file = "ruff-0.5.7-py3-none-win32.whl", hash = "sha256:33d61fc0e902198a3e55719f4be6b375b28f860b09c281e4bdbf783c0566576a"},
+ {file = "ruff-0.5.7-py3-none-win_amd64.whl", hash = "sha256:083bbcbe6fadb93cd86709037acc510f86eed5a314203079df174c40bbbca6b3"},
+ {file = "ruff-0.5.7-py3-none-win_arm64.whl", hash = "sha256:2dca26154ff9571995107221d0aeaad0e75a77b5a682d6236cf89a58c70b76f4"},
+ {file = "ruff-0.5.7.tar.gz", hash = "sha256:8dfc0a458797f5d9fb622dd0efc52d796f23f0a1493a9527f4e49a550ae9a7e5"},
]
[[package]]
@@ -7106,6 +7189,21 @@ tensorflow = ["safetensors[numpy]", "tensorflow (>=2.11.0)"]
testing = ["h5py (>=3.7.0)", "huggingface-hub (>=0.12.1)", "hypothesis (>=6.70.2)", "pytest (>=7.2.0)", "pytest-benchmark (>=4.0.0)", "safetensors[numpy]", "setuptools-rust (>=1.5.2)"]
torch = ["safetensors[numpy]", "torch (>=1.10)"]
+[[package]]
+name = "scantree"
+version = "0.0.4"
+description = "Flexible recursive directory iterator: scandir meets glob(\"**\", recursive=True)"
+optional = false
+python-versions = ">=3.8"
+files = [
+ {file = "scantree-0.0.4-py3-none-any.whl", hash = "sha256:7616ab65aa6b7f16fcf8e6fa1d9afaa99a27ab72bba05c61b691853b96763174"},
+ {file = "scantree-0.0.4.tar.gz", hash = "sha256:15bd5cb24483b04db2c70653604e8ea3522e98087db7e38ab8482f053984c0ac"},
+]
+
+[package.dependencies]
+attrs = ">=18.0.0"
+pathspec = ">=0.10.1"
+
[[package]]
name = "scikit-learn"
version = "1.5.0"
@@ -7507,13 +7605,13 @@ full = ["httpx (>=0.22.0)", "itsdangerous", "jinja2", "python-multipart (>=0.0.7
[[package]]
name = "streamlit"
-version = "1.37.0"
+version = "1.37.1"
description = "A faster way to build and share data apps"
optional = false
python-versions = "!=3.9.7,>=3.8"
files = [
- {file = "streamlit-1.37.0-py2.py3-none-any.whl", hash = "sha256:d17e2d32b075a270a97f134ab5d22bbb98b4e474fa261ff49dc4a2b380386c84"},
- {file = "streamlit-1.37.0.tar.gz", hash = "sha256:463ef728ba21e74e05122e3704e8af644a7bdbb5822e281b8daf4a0a48761879"},
+ {file = "streamlit-1.37.1-py2.py3-none-any.whl", hash = "sha256:0651240fccc569900cc9450390b0a67473fda55be65f317e46285f99e2bddf04"},
+ {file = "streamlit-1.37.1.tar.gz", hash = "sha256:bc7e3813d94a39dda56f15678437eb37830973c601e8e574f2225a7bf188ea5a"},
]
[package.dependencies]
@@ -7553,7 +7651,7 @@ files = [
[[package]]
name = "swebench"
-version = "2.0.2"
+version = "2.0.12"
description = "The official SWE-bench package - a benchmark for evaluating LMs on software engineering"
optional = false
python-versions = ">=3.8"
@@ -7572,26 +7670,33 @@ python-dotenv = "*"
requests = "*"
rich = "*"
tqdm = "*"
+unidiff = "*"
+
+[package.extras]
+inference = ["anthropic", "flash_attn", "jedi", "openai", "peft", "protobuf", "sentencepiece", "tenacity", "tiktoken", "torch", "transformers", "triton"]
[package.source]
type = "git"
url = "https://github.com/OpenDevin/SWE-bench.git"
reference = "HEAD"
-resolved_reference = "4498af933679c518df6d1f8485864f72afc21500"
+resolved_reference = "c2b3cefd4a5af0b248966a773650a39046072975"
[[package]]
name = "sympy"
-version = "1.12.1"
+version = "1.13.1"
description = "Computer algebra system (CAS) in Python"
optional = false
python-versions = ">=3.8"
files = [
- {file = "sympy-1.12.1-py3-none-any.whl", hash = "sha256:9b2cbc7f1a640289430e13d2a56f02f867a1da0190f2f99d8968c2f74da0e515"},
- {file = "sympy-1.12.1.tar.gz", hash = "sha256:2877b03f998cd8c08f07cd0de5b767119cd3ef40d09f41c30d722f6686b0fb88"},
+ {file = "sympy-1.13.1-py3-none-any.whl", hash = "sha256:db36cdc64bf61b9b24578b6f7bab1ecdd2452cf008f34faa33776680c26d66f8"},
+ {file = "sympy-1.13.1.tar.gz", hash = "sha256:9cebf7e04ff162015ce31c9c6c9144daa34a93bd082f54fd8f12deca4f47515f"},
]
[package.dependencies]
-mpmath = ">=1.1.0,<1.4.0"
+mpmath = ">=1.1.0,<1.4"
+
+[package.extras]
+dev = ["hypothesis (>=6.70.0)", "pytest (>=7.1.0)"]
[[package]]
name = "tenacity"
@@ -7881,36 +7986,36 @@ files = [
[[package]]
name = "torch"
-version = "2.2.0"
+version = "2.2.2"
description = "Tensors and Dynamic neural networks in Python with strong GPU acceleration"
optional = false
python-versions = ">=3.8.0"
files = [
- {file = "torch-2.2.0-cp310-cp310-manylinux1_x86_64.whl", hash = "sha256:d366158d6503a3447e67f8c0ad1328d54e6c181d88572d688a625fac61b13a97"},
- {file = "torch-2.2.0-cp310-cp310-manylinux2014_aarch64.whl", hash = "sha256:707f2f80402981e9f90d0038d7d481678586251e6642a7a6ef67fc93511cb446"},
- {file = "torch-2.2.0-cp310-cp310-win_amd64.whl", hash = "sha256:15c8f0a105c66b28496092fca1520346082e734095f8eaf47b5786bac24b8a31"},
- {file = "torch-2.2.0-cp310-none-macosx_10_9_x86_64.whl", hash = "sha256:0ca4df4b728515ad009b79f5107b00bcb2c63dc202d991412b9eb3b6a4f24349"},
- {file = "torch-2.2.0-cp310-none-macosx_11_0_arm64.whl", hash = "sha256:3d3eea2d5969b9a1c9401429ca79efc668120314d443d3463edc3289d7f003c7"},
- {file = "torch-2.2.0-cp311-cp311-manylinux1_x86_64.whl", hash = "sha256:0d1c580e379c0d48f0f0a08ea28d8e373295aa254de4f9ad0631f9ed8bc04c24"},
- {file = "torch-2.2.0-cp311-cp311-manylinux2014_aarch64.whl", hash = "sha256:9328e3c1ce628a281d2707526b4d1080eae7c4afab4f81cea75bde1f9441dc78"},
- {file = "torch-2.2.0-cp311-cp311-win_amd64.whl", hash = "sha256:03c8e660907ac1b8ee07f6d929c4e15cd95be2fb764368799cca02c725a212b8"},
- {file = "torch-2.2.0-cp311-none-macosx_10_9_x86_64.whl", hash = "sha256:da0cefe7f84ece3e3b56c11c773b59d1cb2c0fd83ddf6b5f7f1fd1a987b15c3e"},
- {file = "torch-2.2.0-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:f81d23227034221a4a4ff8ef24cc6cec7901edd98d9e64e32822778ff01be85e"},
- {file = "torch-2.2.0-cp312-cp312-manylinux1_x86_64.whl", hash = "sha256:dcbfb2192ac41ca93c756ebe9e2af29df0a4c14ee0e7a0dd78f82c67a63d91d4"},
- {file = "torch-2.2.0-cp312-cp312-manylinux2014_aarch64.whl", hash = "sha256:9eeb42971619e24392c9088b5b6d387d896e267889d41d267b1fec334f5227c5"},
- {file = "torch-2.2.0-cp312-cp312-win_amd64.whl", hash = "sha256:c718b2ca69a6cac28baa36d86d8c0ec708b102cebd1ceb1b6488e404cd9be1d1"},
- {file = "torch-2.2.0-cp312-none-macosx_10_9_x86_64.whl", hash = "sha256:f11d18fceb4f9ecb1ac680dde7c463c120ed29056225d75469c19637e9f98d12"},
- {file = "torch-2.2.0-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:ee1da852bfd4a7e674135a446d6074c2da7194c1b08549e31eae0b3138c6b4d2"},
- {file = "torch-2.2.0-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:0d819399819d0862268ac531cf12a501c253007df4f9e6709ede8a0148f1a7b8"},
- {file = "torch-2.2.0-cp38-cp38-manylinux2014_aarch64.whl", hash = "sha256:08f53ccc38c49d839bc703ea1b20769cc8a429e0c4b20b56921a9f64949bf325"},
- {file = "torch-2.2.0-cp38-cp38-win_amd64.whl", hash = "sha256:93bffe3779965a71dab25fc29787538c37c5d54298fd2f2369e372b6fb137d41"},
- {file = "torch-2.2.0-cp38-none-macosx_10_9_x86_64.whl", hash = "sha256:c17ec323da778efe8dad49d8fb534381479ca37af1bfc58efdbb8607a9d263a3"},
- {file = "torch-2.2.0-cp38-none-macosx_11_0_arm64.whl", hash = "sha256:c02685118008834e878f676f81eab3a952b7936fa31f474ef8a5ff4b5c78b36d"},
- {file = "torch-2.2.0-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:d9f39d6f53cec240a0e3baa82cb697593340f9d4554cee6d3d6ca07925c2fac0"},
- {file = "torch-2.2.0-cp39-cp39-manylinux2014_aarch64.whl", hash = "sha256:51770c065206250dc1222ea7c0eff3f88ab317d3e931cca2aee461b85fbc2472"},
- {file = "torch-2.2.0-cp39-cp39-win_amd64.whl", hash = "sha256:008e4c6ad703de55af760c73bf937ecdd61a109f9b08f2bbb9c17e7c7017f194"},
- {file = "torch-2.2.0-cp39-none-macosx_10_9_x86_64.whl", hash = "sha256:de8680472dd14e316f42ceef2a18a301461a9058cd6e99a1f1b20f78f11412f1"},
- {file = "torch-2.2.0-cp39-none-macosx_11_0_arm64.whl", hash = "sha256:99e1dcecb488e3fd25bcaac56e48cdb3539842904bdc8588b0b255fde03a254c"},
+ {file = "torch-2.2.2-cp310-cp310-manylinux1_x86_64.whl", hash = "sha256:bc889d311a855dd2dfd164daf8cc903a6b7273a747189cebafdd89106e4ad585"},
+ {file = "torch-2.2.2-cp310-cp310-manylinux2014_aarch64.whl", hash = "sha256:15dffa4cc3261fa73d02f0ed25f5fa49ecc9e12bf1ae0a4c1e7a88bbfaad9030"},
+ {file = "torch-2.2.2-cp310-cp310-win_amd64.whl", hash = "sha256:11e8fe261233aeabd67696d6b993eeb0896faa175c6b41b9a6c9f0334bdad1c5"},
+ {file = "torch-2.2.2-cp310-none-macosx_10_9_x86_64.whl", hash = "sha256:b2e2200b245bd9f263a0d41b6a2dab69c4aca635a01b30cca78064b0ef5b109e"},
+ {file = "torch-2.2.2-cp310-none-macosx_11_0_arm64.whl", hash = "sha256:877b3e6593b5e00b35bbe111b7057464e76a7dd186a287280d941b564b0563c2"},
+ {file = "torch-2.2.2-cp311-cp311-manylinux1_x86_64.whl", hash = "sha256:ad4c03b786e074f46606f4151c0a1e3740268bcf29fbd2fdf6666d66341c1dcb"},
+ {file = "torch-2.2.2-cp311-cp311-manylinux2014_aarch64.whl", hash = "sha256:32827fa1fbe5da8851686256b4cd94cc7b11be962862c2293811c94eea9457bf"},
+ {file = "torch-2.2.2-cp311-cp311-win_amd64.whl", hash = "sha256:f9ef0a648310435511e76905f9b89612e45ef2c8b023bee294f5e6f7e73a3e7c"},
+ {file = "torch-2.2.2-cp311-none-macosx_10_9_x86_64.whl", hash = "sha256:95b9b44f3bcebd8b6cd8d37ec802048c872d9c567ba52c894bba90863a439059"},
+ {file = "torch-2.2.2-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:49aa4126ede714c5aeef7ae92969b4b0bbe67f19665106463c39f22e0a1860d1"},
+ {file = "torch-2.2.2-cp312-cp312-manylinux1_x86_64.whl", hash = "sha256:cf12cdb66c9c940227ad647bc9cf5dba7e8640772ae10dfe7569a0c1e2a28aca"},
+ {file = "torch-2.2.2-cp312-cp312-manylinux2014_aarch64.whl", hash = "sha256:89ddac2a8c1fb6569b90890955de0c34e1724f87431cacff4c1979b5f769203c"},
+ {file = "torch-2.2.2-cp312-cp312-win_amd64.whl", hash = "sha256:451331406b760f4b1ab298ddd536486ab3cfb1312614cfe0532133535be60bea"},
+ {file = "torch-2.2.2-cp312-none-macosx_10_9_x86_64.whl", hash = "sha256:eb4d6e9d3663e26cd27dc3ad266b34445a16b54908e74725adb241aa56987533"},
+ {file = "torch-2.2.2-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:bf9558da7d2bf7463390b3b2a61a6a3dbb0b45b161ee1dd5ec640bf579d479fc"},
+ {file = "torch-2.2.2-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:cd2bf7697c9e95fb5d97cc1d525486d8cf11a084c6af1345c2c2c22a6b0029d0"},
+ {file = "torch-2.2.2-cp38-cp38-manylinux2014_aarch64.whl", hash = "sha256:b421448d194496e1114d87a8b8d6506bce949544e513742b097e2ab8f7efef32"},
+ {file = "torch-2.2.2-cp38-cp38-win_amd64.whl", hash = "sha256:3dbcd563a9b792161640c0cffe17e3270d85e8f4243b1f1ed19cca43d28d235b"},
+ {file = "torch-2.2.2-cp38-none-macosx_10_9_x86_64.whl", hash = "sha256:31f4310210e7dda49f1fb52b0ec9e59382cfcb938693f6d5378f25b43d7c1d29"},
+ {file = "torch-2.2.2-cp38-none-macosx_11_0_arm64.whl", hash = "sha256:c795feb7e8ce2e0ef63f75f8e1ab52e7fd5e1a4d7d0c31367ade1e3de35c9e95"},
+ {file = "torch-2.2.2-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:a6e5770d68158d07456bfcb5318b173886f579fdfbf747543901ce718ea94782"},
+ {file = "torch-2.2.2-cp39-cp39-manylinux2014_aarch64.whl", hash = "sha256:67dcd726edff108e2cd6c51ff0e416fd260c869904de95750e80051358680d24"},
+ {file = "torch-2.2.2-cp39-cp39-win_amd64.whl", hash = "sha256:539d5ef6c4ce15bd3bd47a7b4a6e7c10d49d4d21c0baaa87c7d2ef8698632dfb"},
+ {file = "torch-2.2.2-cp39-none-macosx_10_9_x86_64.whl", hash = "sha256:dff696de90d6f6d1e8200e9892861fd4677306d0ef604cb18f2134186f719f82"},
+ {file = "torch-2.2.2-cp39-none-macosx_11_0_arm64.whl", hash = "sha256:3a4dd910663fd7a124c056c878a52c2b0be4a5a424188058fe97109d4436ee42"},
]
[package.dependencies]
@@ -7930,7 +8035,7 @@ nvidia-cusparse-cu12 = {version = "12.1.0.106", markers = "platform_system == \"
nvidia-nccl-cu12 = {version = "2.19.3", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
nvidia-nvtx-cu12 = {version = "12.1.105", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
sympy = "*"
-triton = {version = "2.2.0", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\""}
+triton = {version = "2.2.0", markers = "platform_system == \"Linux\" and platform_machine == \"x86_64\" and python_version < \"3.12\""}
typing-extensions = ">=4.8.0"
[package.extras]
@@ -8286,6 +8391,17 @@ files = [
{file = "tzdata-2024.1.tar.gz", hash = "sha256:2674120f8d891909751c38abcdfd386ac0a5a1127954fbc332af6b5ceae07efd"},
]
+[[package]]
+name = "unidiff"
+version = "0.7.5"
+description = "Unified diff parsing/metadata extraction library."
+optional = false
+python-versions = "*"
+files = [
+ {file = "unidiff-0.7.5-py2.py3-none-any.whl", hash = "sha256:c93bf2265cc1ba2a520e415ab05da587370bc2a3ae9e0414329f54f0c2fc09e8"},
+ {file = "unidiff-0.7.5.tar.gz", hash = "sha256:2e5f0162052248946b9f0970a40e9e124236bf86c82b70821143a6fc1dea2574"},
+]
+
[[package]]
name = "uri-template"
version = "1.3.0"
@@ -8330,13 +8446,13 @@ zstd = ["zstandard (>=0.18.0)"]
[[package]]
name = "uvicorn"
-version = "0.30.3"
+version = "0.30.5"
description = "The lightning-fast ASGI server."
optional = false
python-versions = ">=3.8"
files = [
- {file = "uvicorn-0.30.3-py3-none-any.whl", hash = "sha256:94a3608da0e530cea8f69683aa4126364ac18e3826b6630d1a65f4638aade503"},
- {file = "uvicorn-0.30.3.tar.gz", hash = "sha256:0d114d0831ff1adbf231d358cbf42f17333413042552a624ea6a9b4c33dcfd81"},
+ {file = "uvicorn-0.30.5-py3-none-any.whl", hash = "sha256:b2d86de274726e9878188fa07576c9ceeff90a839e2b6e25c917fe05f5a6c835"},
+ {file = "uvicorn-0.30.5.tar.gz", hash = "sha256:ac6fdbd4425c5fd17a9fe39daf4d4d075da6fdc80f653e5894cdc2fd98752bee"},
]
[package.dependencies]
@@ -9055,47 +9171,45 @@ test = ["zope.testrunner"]
[[package]]
name = "zope-interface"
-version = "6.4.post2"
+version = "7.0.1"
description = "Interfaces for Python"
optional = false
-python-versions = ">=3.7"
+python-versions = ">=3.8"
files = [
- {file = "zope.interface-6.4.post2-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:2eccd5bef45883802848f821d940367c1d0ad588de71e5cabe3813175444202c"},
- {file = "zope.interface-6.4.post2-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:762e616199f6319bb98e7f4f27d254c84c5fb1c25c908c2a9d0f92b92fb27530"},
- {file = "zope.interface-6.4.post2-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:5ef8356f16b1a83609f7a992a6e33d792bb5eff2370712c9eaae0d02e1924341"},
- {file = "zope.interface-6.4.post2-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:0e4fa5d34d7973e6b0efa46fe4405090f3b406f64b6290facbb19dcbf642ad6b"},
- {file = "zope.interface-6.4.post2-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d22fce0b0f5715cdac082e35a9e735a1752dc8585f005d045abb1a7c20e197f9"},
- {file = "zope.interface-6.4.post2-cp310-cp310-win_amd64.whl", hash = "sha256:97e615eab34bd8477c3f34197a17ce08c648d38467489359cb9eb7394f1083f7"},
- {file = "zope.interface-6.4.post2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:599f3b07bde2627e163ce484d5497a54a0a8437779362395c6b25e68c6590ede"},
- {file = "zope.interface-6.4.post2-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:136cacdde1a2c5e5bc3d0b2a1beed733f97e2dad8c2ad3c2e17116f6590a3827"},
- {file = "zope.interface-6.4.post2-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:47937cf2e7ed4e0e37f7851c76edeb8543ec9b0eae149b36ecd26176ff1ca874"},
- {file = "zope.interface-6.4.post2-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:6f0a6be264afb094975b5ef55c911379d6989caa87c4e558814ec4f5125cfa2e"},
- {file = "zope.interface-6.4.post2-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:47654177e675bafdf4e4738ce58cdc5c6d6ee2157ac0a78a3fa460942b9d64a8"},
- {file = "zope.interface-6.4.post2-cp311-cp311-win_amd64.whl", hash = "sha256:e2fb8e8158306567a3a9a41670c1ff99d0567d7fc96fa93b7abf8b519a46b250"},
- {file = "zope.interface-6.4.post2-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:b912750b13d76af8aac45ddf4679535def304b2a48a07989ec736508d0bbfbde"},
- {file = "zope.interface-6.4.post2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:4ac46298e0143d91e4644a27a769d1388d5d89e82ee0cf37bf2b0b001b9712a4"},
- {file = "zope.interface-6.4.post2-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:86a94af4a88110ed4bb8961f5ac72edf782958e665d5bfceaab6bf388420a78b"},
- {file = "zope.interface-6.4.post2-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:73f9752cf3596771c7726f7eea5b9e634ad47c6d863043589a1c3bb31325c7eb"},
- {file = "zope.interface-6.4.post2-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:00b5c3e9744dcdc9e84c24ed6646d5cf0cf66551347b310b3ffd70f056535854"},
- {file = "zope.interface-6.4.post2-cp312-cp312-win_amd64.whl", hash = "sha256:551db2fe892fcbefb38f6f81ffa62de11090c8119fd4e66a60f3adff70751ec7"},
- {file = "zope.interface-6.4.post2-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e96ac6b3169940a8cd57b4f2b8edcad8f5213b60efcd197d59fbe52f0accd66e"},
- {file = "zope.interface-6.4.post2-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:cebff2fe5dc82cb22122e4e1225e00a4a506b1a16fafa911142ee124febf2c9e"},
- {file = "zope.interface-6.4.post2-cp313-cp313-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:33ee982237cffaf946db365c3a6ebaa37855d8e3ca5800f6f48890209c1cfefc"},
- {file = "zope.interface-6.4.post2-cp37-cp37m-macosx_11_0_x86_64.whl", hash = "sha256:fbf649bc77510ef2521cf797700b96167bb77838c40780da7ea3edd8b78044d1"},
- {file = "zope.interface-6.4.post2-cp37-cp37m-win_amd64.whl", hash = "sha256:4c0b208a5d6c81434bdfa0f06d9b667e5de15af84d8cae5723c3a33ba6611b82"},
- {file = "zope.interface-6.4.post2-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:d3fe667935e9562407c2511570dca14604a654988a13d8725667e95161d92e9b"},
- {file = "zope.interface-6.4.post2-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:a96e6d4074db29b152222c34d7eec2e2db2f92638d2b2b2c704f9e8db3ae0edc"},
- {file = "zope.interface-6.4.post2-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:866a0f583be79f0def667a5d2c60b7b4cc68f0c0a470f227e1122691b443c934"},
- {file = "zope.interface-6.4.post2-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:5fe919027f29b12f7a2562ba0daf3e045cb388f844e022552a5674fcdf5d21f1"},
- {file = "zope.interface-6.4.post2-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:8e0343a6e06d94f6b6ac52fbc75269b41dd3c57066541a6c76517f69fe67cb43"},
- {file = "zope.interface-6.4.post2-cp38-cp38-win_amd64.whl", hash = "sha256:dabb70a6e3d9c22df50e08dc55b14ca2a99da95a2d941954255ac76fd6982bc5"},
- {file = "zope.interface-6.4.post2-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:706efc19f9679a1b425d6fa2b4bc770d976d0984335eaea0869bd32f627591d2"},
- {file = "zope.interface-6.4.post2-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:3d136e5b8821073e1a09dde3eb076ea9988e7010c54ffe4d39701adf0c303438"},
- {file = "zope.interface-6.4.post2-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1730c93a38b5a18d24549bc81613223962a19d457cfda9bdc66e542f475a36f4"},
- {file = "zope.interface-6.4.post2-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:bc2676312cc3468a25aac001ec727168994ea3b69b48914944a44c6a0b251e79"},
- {file = "zope.interface-6.4.post2-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:1a62fd6cd518693568e23e02f41816adedfca637f26716837681c90b36af3671"},
- {file = "zope.interface-6.4.post2-cp39-cp39-win_amd64.whl", hash = "sha256:d3f7e001328bd6466b3414215f66dde3c7c13d8025a9c160a75d7b2687090d15"},
- {file = "zope.interface-6.4.post2.tar.gz", hash = "sha256:1c207e6f6dfd5749a26f5a5fd966602d6b824ec00d2df84a7e9a924e8933654e"},
+ {file = "zope.interface-7.0.1-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:ec4e87e6fdc511a535254daa122c20e11959ce043b4e3425494b237692a34f1c"},
+ {file = "zope.interface-7.0.1-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:51d5713e8e38f2d3ec26e0dfdca398ed0c20abda2eb49ffc15a15a23eb8e5f6d"},
+ {file = "zope.interface-7.0.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ea8d51e5eb29e57d34744369cd08267637aa5a0fefc9b5d33775ab7ff2ebf2e3"},
+ {file = "zope.interface-7.0.1-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:55bbcc74dc0c7ab489c315c28b61d7a1d03cf938cc99cc58092eb065f120c3a5"},
+ {file = "zope.interface-7.0.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:10ebac566dd0cec66f942dc759d46a994a2b3ba7179420f0e2130f88f8a5f400"},
+ {file = "zope.interface-7.0.1-cp310-cp310-win_amd64.whl", hash = "sha256:7039e624bcb820f77cc2ff3d1adcce531932990eee16121077eb51d9c76b6c14"},
+ {file = "zope.interface-7.0.1-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:03bd5c0db82237bbc47833a8b25f1cc090646e212f86b601903d79d7e6b37031"},
+ {file = "zope.interface-7.0.1-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:3f52050c6a10d4a039ec6f2c58e5b3ade5cc570d16cf9d102711e6b8413c90e6"},
+ {file = "zope.interface-7.0.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:af0b33f04677b57843d529b9257a475d2865403300b48c67654c40abac2f9f24"},
+ {file = "zope.interface-7.0.1-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:696c2a381fc7876b3056711717dba5eddd07c2c9e5ccd50da54029a1293b6e43"},
+ {file = "zope.interface-7.0.1-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f89a420cf5a6f2aa7849dd59e1ff0e477f562d97cf8d6a1ee03461e1eec39887"},
+ {file = "zope.interface-7.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:b59deb0ddc7b431e41d720c00f99d68b52cb9bd1d5605a085dc18f502fe9c47f"},
+ {file = "zope.interface-7.0.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:52f5253cca1b35eaeefa51abd366b87f48f8714097c99b131ba61f3fdbbb58e7"},
+ {file = "zope.interface-7.0.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:88d108d004e0df25224de77ce349a7e73494ea2cb194031f7c9687e68a88ec9b"},
+ {file = "zope.interface-7.0.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c203d82069ba31e1f3bc7ba530b2461ec86366cd4bfc9b95ec6ce58b1b559c34"},
+ {file = "zope.interface-7.0.1-cp312-cp312-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:3f3495462bc0438b76536a0e10d765b168ae636092082531b88340dc40dcd118"},
+ {file = "zope.interface-7.0.1-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:192b7a792e3145ed880ff6b1a206fdb783697cfdb4915083bfca7065ec845e60"},
+ {file = "zope.interface-7.0.1-cp312-cp312-win_amd64.whl", hash = "sha256:400d06c9ec8dbcc96f56e79376297e7be07a315605c9a2208720da263d44d76f"},
+ {file = "zope.interface-7.0.1-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8c1dff87b30fd150c61367d0e2cdc49bb55f8b9fd2a303560bbc24b951573ae1"},
+ {file = "zope.interface-7.0.1-cp313-cp313-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:f749ca804648d00eda62fe1098f229b082dfca930d8bad8386e572a6eafa7525"},
+ {file = "zope.interface-7.0.1-cp313-cp313-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4ec212037becf6d2f705b7ed4538d56980b1e7bba237df0d8995cbbed29961dc"},
+ {file = "zope.interface-7.0.1-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:d33cb526efdc235a2531433fc1287fcb80d807d5b401f9b801b78bf22df560dd"},
+ {file = "zope.interface-7.0.1-cp38-cp38-macosx_11_0_arm64.whl", hash = "sha256:b419f2144e1762ab845f20316f1df36b15431f2622ebae8a6d5f7e8e712b413c"},
+ {file = "zope.interface-7.0.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:03f1452d5d1f279184d5bdb663a3dc39902d9320eceb63276240791e849054b6"},
+ {file = "zope.interface-7.0.1-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:6ba4b3638d014918b918aa90a9c8370bd74a03abf8fcf9deb353b3a461a59a84"},
+ {file = "zope.interface-7.0.1-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bc0615351221926a36a0fbcb2520fb52e0b23e8c22a43754d9cb8f21358c33c0"},
+ {file = "zope.interface-7.0.1-cp38-cp38-win_amd64.whl", hash = "sha256:ce6cbb852fb8f2f9bb7b9cdca44e2e37bce783b5f4c167ff82cb5f5128163c8f"},
+ {file = "zope.interface-7.0.1-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:5566fd9271c89ad03d81b0831c37d46ae5e2ed211122c998637130159a120cf1"},
+ {file = "zope.interface-7.0.1-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:da0cef4d7e3f19c3bd1d71658d6900321af0492fee36ec01b550a10924cffb9c"},
+ {file = "zope.interface-7.0.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f32ca483e6ade23c7caaee9d5ee5d550cf4146e9b68d2fb6c68bac183aa41c37"},
+ {file = "zope.interface-7.0.1-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:da21e7eec49252df34d426c2ee9cf0361c923026d37c24728b0fa4cc0599fd03"},
+ {file = "zope.interface-7.0.1-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9a8195b99e650e6f329ce4e5eb22d448bdfef0406404080812bc96e2a05674cb"},
+ {file = "zope.interface-7.0.1-cp39-cp39-win_amd64.whl", hash = "sha256:19c829d52e921b9fe0b2c0c6a8f9a2508c49678ee1be598f87d143335b6a35dc"},
+ {file = "zope.interface-7.0.1.tar.gz", hash = "sha256:f0f5fda7cbf890371a59ab1d06512da4f2c89a6ea194e595808123c863c38eff"},
]
[package.dependencies]
@@ -9109,4 +9223,4 @@ testing = ["coverage (>=5.0.3)", "zope.event", "zope.testing"]
[metadata]
lock-version = "2.0"
python-versions = "^3.11"
-content-hash = "e1520f1342ab527bc3bb2619f8909cbdddeb227c14614eb3d82e133961f1f4d2"
+content-hash = "613a56e7dc5551be660388fb8603f6139dbc5d440ea39f3ba931870dc3234bb4"
diff --git a/pyproject.toml b/pyproject.toml
index 5496eddec6de..7e7d7276b6d1 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "opendevin"
-version = "0.8.1"
+version = "0.8.3"
description = "OpenDevin: Code Less, Make More"
authors = ["OpenDevin"]
license = "MIT"
@@ -34,25 +34,29 @@ minio = "^7.2.7"
gevent = "^24.2.1"
pyarrow = "17.0.0" # transitive dependency, pinned here to avoid conflicts
tenacity = "^8.5.0"
-zope-interface = "6.4.post2"
+zope-interface = "7.0.1"
pathspec = "^0.12.1"
google-cloud-aiplatform = "*"
-grep-ast = "0.3.2"
+grep-ast = "0.3.3"
tree-sitter = "0.21.3"
+bashlex = "^0.18"
+pyjwt = "^2.9.0"
+dirhash = "*"
[tool.poetry.group.llama-index.dependencies]
llama-index = "*"
llama-index-vector-stores-chroma = "*"
chromadb = "*"
llama-index-embeddings-huggingface = "*"
-torch = "2.2.0"
+torch = "2.2.2"
llama-index-embeddings-azure-openai = "*"
llama-index-embeddings-ollama = "*"
[tool.poetry.group.dev.dependencies]
-ruff = "0.5.5"
-mypy = "1.11.0"
-pre-commit = "3.7.1"
+ruff = "0.5.7"
+mypy = "1.11.1"
+pre-commit = "3.8.0"
+build = "*"
[tool.poetry.group.test.dependencies]
pytest = "*"
@@ -72,6 +76,7 @@ reportlab = "*"
[tool.coverage.run]
concurrency = ["gevent"]
+
[tool.poetry.group.runtime.dependencies]
jupyterlab = "*"
notebook = "*"
@@ -83,6 +88,7 @@ python-pptx = "*"
pylatexenc = "*"
opencv-python = "*"
+
[build-system]
build-backend = "poetry.core.masonry.api"
requires = [
@@ -105,9 +111,13 @@ ignore = ["D1"]
[tool.ruff.lint.pydocstyle]
convention = "google"
+
[tool.poetry.group.evaluation.dependencies]
streamlit = "*"
whatthepatch = "*"
retry = "*"
evaluate = "*"
swebench = { git = "https://github.com/OpenDevin/SWE-bench.git" }
+func_timeout = "*"
+sympy = "*"
+gdown = "*"
diff --git a/tests/integration/README.md b/tests/integration/README.md
index 292153611a88..390192e9738e 100644
--- a/tests/integration/README.md
+++ b/tests/integration/README.md
@@ -36,10 +36,11 @@ The folder is organised as follows:
├── README.md
├── conftest.py
├── mock
-│   └── [AgentName]
-│       └── [TestName]
-│           ├── prompt_*.log
-│           └── response_*.log
+│   └── [RuntimeType]
+│       └── [AgentName]
+│           └── [TestName]
+│               ├── prompt_*.log
+│               └── response_*.log
└── [TestFiles].py
```
@@ -47,15 +48,19 @@ where `conftest.py` defines the infrastructure needed to load real-world LLM pro
and responses for mocking purpose. Prompts and responses generated during real runs
of agents with real LLMs are stored under `mock/AgentName/TestName` folders.
-**Note:** Set PERSIST_SANDBOX=false to use a clean sandbox for each test.
## Run Integration Tests
Take a look at `ghcr.yml` (in the `.github/workflow` folder) to learn
-how integration tests are launched in a CI environment. You can also simply run:
+how integration tests are launched in a CI environment.
+
+You can run:
```bash
-TEST_ONLY=true ./tests/integration/regenerate.sh
+# for server runtime
+TEST_RUNTIME=server TEST_ONLY=true ./tests/integration/regenerate.sh
+# for event stream
+TEST_RUNTIME=eventstream TEST_ONLY=true ./tests/integration/regenerate.sh
```
to run all integration tests until the first failure occurs.
@@ -75,7 +80,8 @@ When you make changes to an agent's prompt, the integration tests will fail. You
by running the following command from OpenDevin's project root directory:
```bash
-./tests/integration/regenerate.sh
+TEST_RUNTIME=server ./tests/integration/regenerate.sh
+TEST_RUNTIME=eventstream ./tests/integration/regenerate.sh
```
Please note that this will:
diff --git a/tests/integration/conftest.py b/tests/integration/conftest.py
index 0d9d84a38b57..1624e7a4f6ef 100644
--- a/tests/integration/conftest.py
+++ b/tests/integration/conftest.py
@@ -4,6 +4,7 @@
import shutil
import subprocess
import tempfile
+import time
from functools import partial
from http.server import HTTPServer, SimpleHTTPRequestHandler
from threading import Thread
@@ -11,15 +12,24 @@
import pytest
from litellm import completion
+from opendevin.core.message import Message
from opendevin.llm.llm import message_separator
script_dir = os.environ.get('SCRIPT_DIR')
project_root = os.environ.get('PROJECT_ROOT')
workspace_path = os.environ.get('WORKSPACE_BASE')
+test_runtime = os.environ.get('TEST_RUNTIME')
+MOCK_ROOT_DIR = os.path.join(
+ script_dir,
+ 'mock',
+ f'{test_runtime}_runtime',
+ os.environ.get('DEFAULT_AGENT'),
+)
assert script_dir is not None, 'SCRIPT_DIR environment variable is not set'
assert project_root is not None, 'PROJECT_ROOT environment variable is not set'
assert workspace_path is not None, 'WORKSPACE_BASE environment variable is not set'
+assert test_runtime is not None, 'TEST_RUNTIME environment variable is not set'
class SecretExit(Exception):
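
With this change, mock prompts and responses are resolved per runtime as well as per agent. A minimal sketch of how `MOCK_ROOT_DIR` ends up being computed, assuming `TEST_RUNTIME=eventstream` and `DEFAULT_AGENT=CodeActAgent` (these concrete values are illustrative, not part of the patch):

```python
import os

# Illustrative values only; in conftest.py these come from environment variables.
script_dir = 'tests/integration'   # assumed SCRIPT_DIR
test_runtime = 'eventstream'       # assumed TEST_RUNTIME
default_agent = 'CodeActAgent'     # assumed DEFAULT_AGENT

mock_root_dir = os.path.join(
    script_dir, 'mock', f'{test_runtime}_runtime', default_agent
)
print(mock_root_dir)  # tests/integration/mock/eventstream_runtime/CodeActAgent
```

This matches the renamed mock folders later in this patch, e.g. `tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/`.
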
@@ -37,7 +47,19 @@ def pytest_exception_interact(node, call, report):
def filter_out_symbols(input):
+ # remove shell hostname patterns (e.g., will change between each run)
+ # opendevin@379c7fce40b4:/workspace $
+ input = re.sub(r'(opendevin|root)@.*(:/.*)', r'\1[DUMMY_HOSTNAME]\2', input)
+
+ # handle sha256 hashes
+ # sha256=4ecf8be428f55981e2a188f510ba5f9022bed88f5fb404d7d949f44382201e3d
+ input = re.sub(r'sha256=[a-z0-9]+', 'sha256=[DUMMY_HASH]', input)
+
+ # remove newlines and whitespace
input = re.sub(r'\\n|\\r\\n|\\r|\s+', '', input)
+
+ # remove all non-alphanumeric characters
+ input = re.sub(r'[^a-zA-Z0-9]', '', input)
return input
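
As a rough usage sketch (not part of the patch), the normalization above should make two prompts that differ only in hostname, wheel hash, or whitespace compare equal; the function below mirrors the `filter_out_symbols` defined in `conftest.py`, and the sample strings are made up for illustration:

```python
import re


def filter_out_symbols(text: str) -> str:
    # Mirrors conftest.py: mask hostnames and sha256 hashes, then strip
    # whitespace and all non-alphanumeric characters before comparison.
    text = re.sub(r'(opendevin|root)@.*(:/.*)', r'\1[DUMMY_HOSTNAME]\2', text)
    text = re.sub(r'sha256=[a-z0-9]+', 'sha256=[DUMMY_HASH]', text)
    text = re.sub(r'\\n|\\r\\n|\\r|\s+', '', text)
    return re.sub(r'[^a-zA-Z0-9]', '', text)


a = 'opendevin@379c7fce40b4:/workspace $ pip wheel\nsha256=4ecf8be428f55981'
b = 'opendevin@0123456789ab:/workspace $ pip wheel\nsha256=deadbeefdeadbeef'
assert filter_out_symbols(a) == filter_out_symbols(b)
```
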
@@ -54,9 +76,7 @@ def apply_prompt_and_get_mock_response(test_name: str, messages: str, id: int) -
Note: this function blindly replaces existing prompt file with the given
input without checking the contents.
"""
- mock_dir = os.path.join(
- script_dir, 'mock', os.environ.get('DEFAULT_AGENT'), test_name
- )
+ mock_dir = os.path.join(MOCK_ROOT_DIR, test_name)
prompt_file_path = os.path.join(mock_dir, f'prompt_{"{0:03}".format(id)}.log')
resp_file_path = os.path.join(mock_dir, f'response_{"{0:03}".format(id)}.log')
try:
@@ -88,47 +108,48 @@ def get_mock_response(test_name: str, messages: str, id: int) -> str:
we start from the end of the file, but again, that is unnecessary and only
makes test code harder to understand.
"""
+ mock_dir = os.path.join(MOCK_ROOT_DIR, test_name)
prompt = filter_out_symbols(messages)
- mock_dir = os.path.join(
- script_dir, 'mock', os.environ.get('DEFAULT_AGENT'), test_name
- )
prompt_file_path = os.path.join(mock_dir, f'prompt_{"{0:03}".format(id)}.log')
resp_file_path = os.path.join(mock_dir, f'response_{"{0:03}".format(id)}.log')
# Open the prompt file and compare its contents
- with open(prompt_file_path, 'r') as f:
- file_content = filter_out_symbols(f.read())
- if file_content == prompt:
- # Read the response file and return its content
- with open(resp_file_path, 'r') as resp_file:
- return resp_file.read()
- else:
- # print the mismatched lines
- print('Mismatched Prompt File path', prompt_file_path)
- print('---' * 10)
- # Create a temporary file to store messages
- with tempfile.NamedTemporaryFile(
- delete=False, mode='w', encoding='utf-8'
- ) as tmp_file:
- tmp_file_path = tmp_file.name
- tmp_file.write(messages)
-
- try:
- # Use diff command to compare files and capture the output
- result = subprocess.run(
- ['diff', '-u', prompt_file_path, tmp_file_path],
- capture_output=True,
- text=True,
- )
- if result.returncode != 0:
- print('Diff:')
- print(result.stdout)
- else:
- print('No differences found.')
- finally:
- # Clean up the temporary file
- os.remove(tmp_file_path)
-
- print('---' * 10)
+ from test_patch import test_patces
+
+ for key, value in test_patces.items():
+ with open(prompt_file_path, 'r') as f:
+ file_content = filter_out_symbols(f.read().replace(key, value))
+ if file_content.strip() == prompt.strip():
+ # Read the response file and return its content
+ with open(resp_file_path, 'r') as resp_file:
+ return resp_file.read()
+ else:
+ # print the mismatched lines
+ print('Mismatched Prompt File path', prompt_file_path)
+ print('---' * 10)
+ # Create a temporary file to store messages
+ with tempfile.NamedTemporaryFile(
+ delete=False, mode='w', encoding='utf-8'
+ ) as tmp_file:
+ tmp_file_path = tmp_file.name
+ tmp_file.write(messages)
+
+ try:
+ # Use diff command to compare files and capture the output
+ result = subprocess.run(
+ ['diff', '-u', prompt_file_path, tmp_file_path],
+ capture_output=True,
+ text=True,
+ )
+ if result.returncode != 0:
+ print('Diff:')
+ print(result.stdout)
+ else:
+ print('No differences found.')
+ finally:
+ # Clean up the temporary file
+ os.remove(tmp_file_path)
+
+ print('---' * 10)
def mock_user_response(*args, test_name, **kwargs):
@@ -157,8 +178,12 @@ def mock_completion(*args, test_name, **kwargs):
global cur_id
messages = kwargs['messages']
message_str = ''
+ if isinstance(messages[0], Message):
+ messages = [message.model_dump() for message in messages]
for message in messages:
- message_str += message_separator + message['content']
+ for m in message['content']:
+ if m['type'] == 'text':
+ message_str += message_separator + m['text']
# this assumes all response_(*).log filenames are in numerical order, starting from one
cur_id += 1
if os.environ.get('FORCE_APPLY_PROMPTS') == 'true':
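
For reference, here is a minimal sketch (not part of the patch) of the flattening that `mock_completion` now performs when messages carry structured content. In the real code `message_separator` is imported from `opendevin.llm.llm`; it is stubbed here with an assumed value:

```python
# Hypothetical separator value, stubbed for illustration only.
message_separator = '\n\n----------\n\n'

# A structured message as produced after Message.model_dump(): the content is
# a list of typed parts rather than a plain string.
messages = [
    {'role': 'user', 'content': [{'type': 'text', 'text': 'Browse localhost:8000.'}]},
]

message_str = ''
for message in messages:
    for part in message['content']:
        if part['type'] == 'text':
            message_str += message_separator + part['text']

print(message_str)
```

Only the `text` parts contribute to the prompt string that is matched against the recorded `prompt_*.log` files.
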
@@ -211,7 +236,9 @@ def http_server():
thread = Thread(target=server.serve_forever)
thread.setDaemon(True)
thread.start()
+ time.sleep(1)
+ print('HTTP server started...')
yield server
# Stop the server
diff --git a/tests/integration/mock/CodeActSWEAgent/test_ipython_module/response_001.log b/tests/integration/mock/CodeActSWEAgent/test_ipython_module/response_001.log
deleted file mode 100644
index 541e772d8bee..000000000000
--- a/tests/integration/mock/CodeActSWEAgent/test_ipython_module/response_001.log
+++ /dev/null
@@ -1,4 +0,0 @@
-Let's start by installing the `pymsgbox` package.
-
-
-%pip install pymsgbox==1.0.9
diff --git a/tests/integration/mock/CodeActSWEAgent/test_ipython_module/response_002.log b/tests/integration/mock/CodeActSWEAgent/test_ipython_module/response_002.log
deleted file mode 100644
index 57d9d054a25e..000000000000
--- a/tests/integration/mock/CodeActSWEAgent/test_ipython_module/response_002.log
+++ /dev/null
@@ -1,15 +0,0 @@
-Next, let's import `pymsgbox` and print its version to `/workspace/test.txt`.
-
-
-import pymsgbox
-
-version = pymsgbox.__version__
-
-with open("/workspace/test.txt", "w") as file:
- file.write(f"pymsgbox version: {version}\n")
-
-# Verify the content of the file
-with open("/workspace/test.txt", "r") as file:
- content = file.read()
-
-content
diff --git a/tests/integration/mock/BrowsingAgent/test_browse_internet/prompt_001.log b/tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/prompt_001.log
similarity index 100%
rename from tests/integration/mock/BrowsingAgent/test_browse_internet/prompt_001.log
rename to tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/prompt_001.log
diff --git a/tests/integration/mock/BrowsingAgent/test_browse_internet/prompt_002.log b/tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/prompt_002.log
similarity index 100%
rename from tests/integration/mock/BrowsingAgent/test_browse_internet/prompt_002.log
rename to tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/prompt_002.log
diff --git a/tests/integration/mock/BrowsingAgent/test_browse_internet/prompt_003.log b/tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/prompt_003.log
similarity index 100%
rename from tests/integration/mock/BrowsingAgent/test_browse_internet/prompt_003.log
rename to tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/prompt_003.log
diff --git a/tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/response_001.log b/tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/response_001.log
new file mode 100644
index 000000000000..8bdf5efd6c72
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/response_001.log
@@ -0,0 +1,2 @@
+In order to accomplish my goal, I need to navigate to the localhost page.
+```goto('http://localhost:8000'
diff --git a/tests/integration/mock/CodeActAgent/test_browse_internet/response_003.log b/tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/response_002.log
similarity index 100%
rename from tests/integration/mock/CodeActAgent/test_browse_internet/response_003.log
rename to tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/response_002.log
diff --git a/tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/response_003.log b/tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/response_003.log
new file mode 100644
index 000000000000..2eeae38382de
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/BrowsingAgent/test_browse_internet/response_003.log
@@ -0,0 +1,3 @@
+In order to accomplish my goal, I need to read the static text that reveals the answer to life, the universe, and everything.
+
+```send_msg_to_user('The answer is OpenDevin is all you need!'
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_001.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_001.log
new file mode 100644
index 000000000000..62ee7a52b602
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_001.log
@@ -0,0 +1,401 @@
+
+
+----------
+
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
+
+print("Hello World!")
+
+The assistant can execute bash commands on behalf of the user by wrapping them with and .
+
+For example, you can list the files in the current directory by ls .
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
+Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+ git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
+If $GITHUB_TOKEN is not set, ask the user to set it.
+
+
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
+open_file(path: str, line_number: int | None = 1, context_lines: int | None = 100) -> None:
+ Opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line.
+ It only shows the first 100 lines by default! Max `context_lines` supported is 2000, use `scroll up/down`
+ to view the file if you want to see more.
+ Args:
+ path: str: The path to the file to open, preferred absolute path.
+ line_number: int | None = 1: The line number to move to. Defaults to 1.
+ context_lines: int | None = 100: Only shows this number of lines in the context window (usually from line 1), with line_number as the center (if possible). Defaults to 100.
+
+goto_line(line_number: int) -> None:
+ Moves the window to show the specified line number.
+ Args:
+ line_number: int: The line number to move to.
+
+scroll_down() -> None:
+ Moves the window down by 100 lines.
+ Args:
+ None
+
+scroll_up() -> None:
+ Moves the window up by 100 lines.
+ Args:
+ None
+
+create_file(filename: str) -> None:
+ Creates and opens a new file with the given name.
+ Args:
+ filename: str: The name of the file to create.
+
+edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
+ Edit a file. This will search for `to_replace` in the given file and replace it with `new_content`.
+ Every *to_replace* must *EXACTLY MATCH* the existing source code, character for character, including all comments, docstrings, etc.
+ Include enough lines to make code in `to_replace` unique. `to_replace` should NOT be empty.
+ For example, given a file "/workspace/example.txt" with the following content:
+ ```
+ line 1
+ line 2
+ line 2
+ line 3
+ ```
+ EDITING: If you want to replace the second occurrence of "line 2", you can make `to_replace` unique:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='new line
+ line 3',
+ )
+ This will replace only the second "line 2" with "new line". The first "line 2" will remain unchanged.
+ The resulting file will be:
+ ```
+ line 1
+ line 2
+ new line
+ line 3
+ ```
+ REMOVAL: If you want to remove "line 2" and "line 3", you can set `new_content` to an empty string:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='',
+ )
+ Args:
+ file_name: str: The name of the file to edit.
+ to_replace: str: The content to search for and replace.
+ new_content: str: The new content to replace the old content with.
+
+insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
+ Insert content at the given line number in a file.
+ This will NOT modify the content of the lines before OR after the given line number.
+ For example, if the file has the following content:
+ ```
+ line 1
+ line 2
+ line 3
+ ```
+ and you call `insert_content_at_line('file.txt', 2, 'new line')`, the file will be updated to:
+ ```
+ line 1
+ new line
+ line 2
+ line 3
+ ```
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+append_file(file_name: str, content: str) -> None:
+ Append content to the given file.
+ It appends text `content` to the end of the specified file.
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+search_dir(search_term: str, dir_path: str = './') -> None:
+ Searches for search_term in all files in dir. If dir is not provided, searches in the current directory.
+ Args:
+ search_term: str: The term to search for.
+ dir_path: Optional[str]: The path to the directory to search.
+
+search_file(search_term: str, file_path: Optional[str] = None) -> None:
+ Searches for search_term in file. If file is not provided, searches in the current open file.
+ Args:
+ search_term: str: The term to search for.
+ file_path: Optional[str]: The path to the file to search.
+
+find_file(file_name: str, dir_path: str = './') -> None:
+ Finds all files with the given name in the specified directory.
+ Args:
+ file_name: str: The name of the file to find.
+ dir_path: Optional[str]: The path to the directory to search.
+
+parse_pdf(file_path: str) -> None:
+ Parses the content of a PDF file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_docx(file_path: str) -> None:
+ Parses the content of a DOCX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_latex(file_path: str) -> None:
+ Parses the content of a LaTex file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_pptx(file_path: str) -> None:
+ Parses the content of a pptx file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include in your response.
+IMPORTANT: Execute code using , , or whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
+
+
+----------
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+
+create_file('app.py')
+
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+
+EDITED_CODE="""from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+ numbers = list(range(1, 11))
+ return str(numbers)
+
+if __name__ == '__main__':
+ app.run(port=5000)"""
+
+insert_content_at_line(
+ 'app.py',
+ 1,
+ EDITED_CODE,
+)
+
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 121[1]+ Exit 1 python3 app.py > server.log 2>&1
+
+ASSISTANT: Looks like the server is running with PID 121 then crashed. Let me check the server log:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+ File "/workspace/app.py", line 2, in
+ from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+
+pip install flask
+
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+ Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+ Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+ Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+ Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+ Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+
+Get the content on "http://127.0.0.1:5000"
+
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+
+kill 124
+
+
+USER:
+OBSERVATION:
+[1]+ Terminated python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+
+open_file('app.py')
+
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+
+edit_file_by_replace(
+ 'app.py',
+ to_replace=" return str(numbers)",
+ new_content=" return '
' + ''.join([f'
{i}
' for i in numbers]) + '
'",
+)
+
+
+USER:
+Observation:
+[File: /workspace/app.py (10 lines total after edit)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7|    return '' + ''.join([f'{i}' for i in numbers]) + ''
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+Running the updated file:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+Observation:
+[1] 126
+
+ASSISTANT:
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
+
+
+--- END OF EXAMPLE ---
+
+
+NOW, LET'S START!
+
+Browse localhost:8000, and tell me the ultimate answer to life. Do not ask me for confirmation at any point.
+
+ENVIRONMENT REMINDER: You have 14 turns left to complete the task. When finished reply with .
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_002.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_002.log
new file mode 100644
index 000000000000..292bb8cfdc66
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_002.log
@@ -0,0 +1,126 @@
+
+
+----------
+
+# Instructions
+Review the current state of the page and all other information to find the best
+possible next action to accomplish your goal. Your answer will be interpreted
+and executed by a program, make sure to follow the formatting instructions.
+
+# Goal:
+Certainly! I'll browse localhost:8000 and retrieve the ultimate answer to life for you.. I should start with: Get the content on "http://localhost:8000"
+
+# Action Space
+
+16 different types of actions are available.
+
+noop(wait_ms: float = 1000)
+ Examples:
+ noop()
+
+ noop(500)
+
+send_msg_to_user(text: str)
+ Examples:
+ send_msg_to_user('Based on the results of my search, the city was built in 1751.')
+
+scroll(delta_x: float, delta_y: float)
+ Examples:
+ scroll(0, 200)
+
+ scroll(-50.2, -100.5)
+
+fill(bid: str, value: str)
+ Examples:
+ fill('237', 'example value')
+
+ fill('45', 'multi-line\nexample')
+
+ fill('a12', 'example with "quotes"')
+
+select_option(bid: str, options: str | list[str])
+ Examples:
+ select_option('48', 'blue')
+
+ select_option('48', ['red', 'green', 'blue'])
+
+click(bid: str, button: Literal['left', 'middle', 'right'] = 'left', modifiers: list[typing.Literal['Alt', 'Control', 'Meta', 'Shift']] = [])
+ Examples:
+ click('51')
+
+ click('b22', button='right')
+
+ click('48', button='middle', modifiers=['Shift'])
+
+dblclick(bid: str, button: Literal['left', 'middle', 'right'] = 'left', modifiers: list[typing.Literal['Alt', 'Control', 'Meta', 'Shift']] = [])
+ Examples:
+ dblclick('12')
+
+ dblclick('ca42', button='right')
+
+ dblclick('178', button='middle', modifiers=['Shift'])
+
+hover(bid: str)
+ Examples:
+ hover('b8')
+
+press(bid: str, key_comb: str)
+ Examples:
+ press('88', 'Backspace')
+
+ press('a26', 'Control+a')
+
+ press('a61', 'Meta+Shift+t')
+
+focus(bid: str)
+ Examples:
+ focus('b455')
+
+clear(bid: str)
+ Examples:
+ clear('996')
+
+drag_and_drop(from_bid: str, to_bid: str)
+ Examples:
+ drag_and_drop('56', '498')
+
+upload_file(bid: str, file: str | list[str])
+ Examples:
+ upload_file('572', 'my_receipt.pdf')
+
+ upload_file('63', ['/home/bob/Documents/image.jpg', '/home/bob/Documents/file.zip'])
+
+go_back()
+ Examples:
+ go_back()
+
+go_forward()
+ Examples:
+ go_forward()
+
+goto(url: str)
+ Examples:
+ goto('http://www.example.com')
+
+Multiple actions can be provided at once. Example:
+fill('a12', 'example with "quotes"')
+click('51')
+click('48', button='middle', modifiers=['Shift'])
+Multiple actions are meant to be executed sequentially without any feedback from the page.
+Don't execute multiple actions at once if you need feedback from the page.
+
+
+
+----------
+
+# Current Accessibility Tree:
+
+
+# Previous Actions
+
+
+Here is an example with chain of thought of a valid action when clicking on a button:
+"
+In order to accomplish my goal I need to click on the button with bid 12
+```click("12")```
+"
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_003.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_003.log
new file mode 100644
index 000000000000..991b96738722
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_003.log
@@ -0,0 +1,130 @@
+
+
+----------
+
+# Instructions
+Review the current state of the page and all other information to find the best
+possible next action to accomplish your goal. Your answer will be interpreted
+and executed by a program, make sure to follow the formatting instructions.
+
+# Goal:
+Certainly! I'll browse localhost:8000 and retrieve the ultimate answer to life for you.. I should start with: Get the content on "http://localhost:8000"
+
+# Action Space
+
+16 different types of actions are available.
+
+noop(wait_ms: float = 1000)
+ Examples:
+ noop()
+
+ noop(500)
+
+send_msg_to_user(text: str)
+ Examples:
+ send_msg_to_user('Based on the results of my search, the city was built in 1751.')
+
+scroll(delta_x: float, delta_y: float)
+ Examples:
+ scroll(0, 200)
+
+ scroll(-50.2, -100.5)
+
+fill(bid: str, value: str)
+ Examples:
+ fill('237', 'example value')
+
+ fill('45', 'multi-line\nexample')
+
+ fill('a12', 'example with "quotes"')
+
+select_option(bid: str, options: str | list[str])
+ Examples:
+ select_option('48', 'blue')
+
+ select_option('48', ['red', 'green', 'blue'])
+
+click(bid: str, button: Literal['left', 'middle', 'right'] = 'left', modifiers: list[typing.Literal['Alt', 'Control', 'Meta', 'Shift']] = [])
+ Examples:
+ click('51')
+
+ click('b22', button='right')
+
+ click('48', button='middle', modifiers=['Shift'])
+
+dblclick(bid: str, button: Literal['left', 'middle', 'right'] = 'left', modifiers: list[typing.Literal['Alt', 'Control', 'Meta', 'Shift']] = [])
+ Examples:
+ dblclick('12')
+
+ dblclick('ca42', button='right')
+
+ dblclick('178', button='middle', modifiers=['Shift'])
+
+hover(bid: str)
+ Examples:
+ hover('b8')
+
+press(bid: str, key_comb: str)
+ Examples:
+ press('88', 'Backspace')
+
+ press('a26', 'Control+a')
+
+ press('a61', 'Meta+Shift+t')
+
+focus(bid: str)
+ Examples:
+ focus('b455')
+
+clear(bid: str)
+ Examples:
+ clear('996')
+
+drag_and_drop(from_bid: str, to_bid: str)
+ Examples:
+ drag_and_drop('56', '498')
+
+upload_file(bid: str, file: str | list[str])
+ Examples:
+ upload_file('572', 'my_receipt.pdf')
+
+ upload_file('63', ['/home/bob/Documents/image.jpg', '/home/bob/Documents/file.zip'])
+
+go_back()
+ Examples:
+ go_back()
+
+go_forward()
+ Examples:
+ go_forward()
+
+goto(url: str)
+ Examples:
+ goto('http://www.example.com')
+
+Multiple actions can be provided at once. Example:
+fill('a12', 'example with "quotes"')
+click('51')
+click('48', button='middle', modifiers=['Shift'])
+Multiple actions are meant to be executed sequentially without any feedback from the page.
+Don't execute multiple actions at once if you need feedback from the page.
+
+
+
+----------
+
+# Current Accessibility Tree:
+RootWebArea 'The Ultimate Answer', focused
+ [8] heading 'The Ultimate Answer'
+ [9] paragraph ''
+ StaticText 'Click the button to reveal the answer to life, the universe, and everything.'
+ [10] button 'Click me', clickable
+
+# Previous Actions
+goto("http://localhost:8000")
+
+Here is an example with chain of thought of a valid action when clicking on a button:
+"
+In order to accomplish my goal I need to click on the button with bid 12
+```click("12")```
+"
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_004.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_004.log
new file mode 100644
index 000000000000..566df42a71a6
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_004.log
@@ -0,0 +1,132 @@
+
+
+----------
+
+# Instructions
+Review the current state of the page and all other information to find the best
+possible next action to accomplish your goal. Your answer will be interpreted
+and executed by a program, make sure to follow the formatting instructions.
+
+# Goal:
+Certainly! I'll browse localhost:8000 and retrieve the ultimate answer to life for you.. I should start with: Get the content on "http://localhost:8000"
+
+# Action Space
+
+16 different types of actions are available.
+
+noop(wait_ms: float = 1000)
+ Examples:
+ noop()
+
+ noop(500)
+
+send_msg_to_user(text: str)
+ Examples:
+ send_msg_to_user('Based on the results of my search, the city was built in 1751.')
+
+scroll(delta_x: float, delta_y: float)
+ Examples:
+ scroll(0, 200)
+
+ scroll(-50.2, -100.5)
+
+fill(bid: str, value: str)
+ Examples:
+ fill('237', 'example value')
+
+ fill('45', 'multi-line\nexample')
+
+ fill('a12', 'example with "quotes"')
+
+select_option(bid: str, options: str | list[str])
+ Examples:
+ select_option('48', 'blue')
+
+ select_option('48', ['red', 'green', 'blue'])
+
+click(bid: str, button: Literal['left', 'middle', 'right'] = 'left', modifiers: list[typing.Literal['Alt', 'Control', 'Meta', 'Shift']] = [])
+ Examples:
+ click('51')
+
+ click('b22', button='right')
+
+ click('48', button='middle', modifiers=['Shift'])
+
+dblclick(bid: str, button: Literal['left', 'middle', 'right'] = 'left', modifiers: list[typing.Literal['Alt', 'Control', 'Meta', 'Shift']] = [])
+ Examples:
+ dblclick('12')
+
+ dblclick('ca42', button='right')
+
+ dblclick('178', button='middle', modifiers=['Shift'])
+
+hover(bid: str)
+ Examples:
+ hover('b8')
+
+press(bid: str, key_comb: str)
+ Examples:
+ press('88', 'Backspace')
+
+ press('a26', 'Control+a')
+
+ press('a61', 'Meta+Shift+t')
+
+focus(bid: str)
+ Examples:
+ focus('b455')
+
+clear(bid: str)
+ Examples:
+ clear('996')
+
+drag_and_drop(from_bid: str, to_bid: str)
+ Examples:
+ drag_and_drop('56', '498')
+
+upload_file(bid: str, file: str | list[str])
+ Examples:
+ upload_file('572', 'my_receipt.pdf')
+
+ upload_file('63', ['/home/bob/Documents/image.jpg', '/home/bob/Documents/file.zip'])
+
+go_back()
+ Examples:
+ go_back()
+
+go_forward()
+ Examples:
+ go_forward()
+
+goto(url: str)
+ Examples:
+ goto('http://www.example.com')
+
+Multiple actions can be provided at once. Example:
+fill('a12', 'example with "quotes"')
+click('51')
+click('48', button='middle', modifiers=['Shift'])
+Multiple actions are meant to be executed sequentially without any feedback from the page.
+Don't execute multiple actions at once if you need feedback from the page.
+
+
+
+----------
+
+# Current Accessibility Tree:
+RootWebArea 'The Ultimate Answer', focused
+ [8] heading 'The Ultimate Answer'
+ [9] paragraph ''
+ StaticText 'Click the button to reveal the answer to life, the universe, and everything.'
+ [10] button 'Click me', clickable, focused
+ StaticText 'The answer is OpenDevin is all you need!'
+
+# Previous Actions
+goto("http://localhost:8000")
+click("10")
+
+Here is an example with chain of thought of a valid action when clicking on a button:
+"
+In order to accomplish my goal I need to click on the button with bid 12
+```click("12")```
+"
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_005.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_005.log
new file mode 100644
index 000000000000..48ebcb32b81e
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/prompt_005.log
@@ -0,0 +1,415 @@
+
+
+----------
+
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
+
+print("Hello World!")
+
+The assistant can execute bash commands on behalf of the user by wrapping them with and .
+
+For example, you can list the files in the current directory by ls .
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
+Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+ git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
+If $GITHUB_TOKEN is not set, ask the user to set it.
+
+
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
+open_file(path: str, line_number: int | None = 1, context_lines: int | None = 100) -> None:
+ Opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line.
+ It only shows the first 100 lines by default! Max `context_lines` supported is 2000, use `scroll up/down`
+ to view the file if you want to see more.
+ Args:
+ path: str: The path to the file to open, preferred absolute path.
+ line_number: int | None = 1: The line number to move to. Defaults to 1.
+ context_lines: int | None = 100: Only shows this number of lines in the context window (usually from line 1), with line_number as the center (if possible). Defaults to 100.
+
+goto_line(line_number: int) -> None:
+ Moves the window to show the specified line number.
+ Args:
+ line_number: int: The line number to move to.
+
+scroll_down() -> None:
+ Moves the window down by 100 lines.
+ Args:
+ None
+
+scroll_up() -> None:
+ Moves the window up by 100 lines.
+ Args:
+ None
+
+create_file(filename: str) -> None:
+ Creates and opens a new file with the given name.
+ Args:
+ filename: str: The name of the file to create.
+
+edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
+ Edit a file. This will search for `to_replace` in the given file and replace it with `new_content`.
+ Every *to_replace* must *EXACTLY MATCH* the existing source code, character for character, including all comments, docstrings, etc.
+ Include enough lines to make code in `to_replace` unique. `to_replace` should NOT be empty.
+ For example, given a file "/workspace/example.txt" with the following content:
+ ```
+ line 1
+ line 2
+ line 2
+ line 3
+ ```
+ EDITING: If you want to replace the second occurrence of "line 2", you can make `to_replace` unique:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='new line
+ line 3',
+ )
+ This will replace only the second "line 2" with "new line". The first "line 2" will remain unchanged.
+ The resulting file will be:
+ ```
+ line 1
+ line 2
+ new line
+ line 3
+ ```
+ REMOVAL: If you want to remove "line 2" and "line 3", you can set `new_content` to an empty string:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='',
+ )
+ Args:
+ file_name: str: The name of the file to edit.
+ to_replace: str: The content to search for and replace.
+ new_content: str: The new content to replace the old content with.
+
+insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
+ Insert content at the given line number in a file.
+ This will NOT modify the content of the lines before OR after the given line number.
+ For example, if the file has the following content:
+ ```
+ line 1
+ line 2
+ line 3
+ ```
+ and you call `insert_content_at_line('file.txt', 2, 'new line')`, the file will be updated to:
+ ```
+ line 1
+ new line
+ line 2
+ line 3
+ ```
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+append_file(file_name: str, content: str) -> None:
+ Append content to the given file.
+ It appends text `content` to the end of the specified file.
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+search_dir(search_term: str, dir_path: str = './') -> None:
+ Searches for search_term in all files in dir. If dir is not provided, searches in the current directory.
+ Args:
+ search_term: str: The term to search for.
+ dir_path: Optional[str]: The path to the directory to search.
+
+search_file(search_term: str, file_path: Optional[str] = None) -> None:
+ Searches for search_term in file. If file is not provided, searches in the current open file.
+ Args:
+ search_term: str: The term to search for.
+ file_path: Optional[str]: The path to the file to search.
+
+find_file(file_name: str, dir_path: str = './') -> None:
+ Finds all files with the given name in the specified directory.
+ Args:
+ file_name: str: The name of the file to find.
+ dir_path: Optional[str]: The path to the directory to search.
+
+parse_pdf(file_path: str) -> None:
+ Parses the content of a PDF file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_docx(file_path: str) -> None:
+ Parses the content of a DOCX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_latex(file_path: str) -> None:
+ Parses the content of a LaTex file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_pptx(file_path: str) -> None:
+ Parses the content of a pptx file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include in your response.
+IMPORTANT: Execute code using , , or whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
+
+
+----------
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+
+create_file('app.py')
+
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+
+EDITED_CODE="""from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+ numbers = list(range(1, 11))
+ return str(numbers)
+
+if __name__ == '__main__':
+ app.run(port=5000)"""
+
+insert_content_at_line(
+ 'app.py',
+ 1,
+ EDITED_CODE,
+)
+
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 121[1]+ Exit 1 python3 app.py > server.log 2>&1
+
+ASSISTANT: Looks like the server started with PID 121 and then crashed. Let me check the server log:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+ File "/workspace/app.py", line 2, in
+ from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+
+pip install flask
+
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+ Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+ Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+ Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+ Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+ Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+
+Get the content on "http://127.0.0.1:5000"
+
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+
+kill 124
+
+
+USER:
+OBSERVATION:
+[1]+ Terminated python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+
+open_file('app.py')
+
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+
+edit_file_by_replace(
+ 'app.py',
+ to_replace=" return str(numbers)",
+ new_content=" return '
' + ''.join([f'
{i}
' for i in numbers]) + '
'",
+)
+
+
+USER:
+Observation:
+[File: /workspace/app.py (10 lines total after edit)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+Running the updated file:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+Observation:
+[1] 126
+
+ASSISTANT:
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
+
+
+--- END OF EXAMPLE ---
+
+
+NOW, LET'S START!
+
+----------
+
+Browse localhost:8000, and tell me the ultimate answer to life. Do not ask me for confirmation at any point.
+
+----------
+
+
+
+Certainly! I'll browse localhost:8000 and retrieve the ultimate answer to life for you. I should start with: Get the content on "http://localhost:8000"
+
+
+----------
+
+OBSERVATION:
+{'content': 'The ultimate answer to life, the universe, and everything is: OpenDevin is all you need!'}
+
+ENVIRONMENT REMINDER: You have 8 turns left to complete the task. When finished reply with
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_001.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_001.log
new file mode 100644
index 000000000000..25975d7d1366
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_001.log
@@ -0,0 +1,4 @@
+Certainly! I'll browse localhost:8000 and retrieve the ultimate answer to life for you.
+
+
+Get the content on "http://localhost:8000"
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_002.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_002.log
new file mode 100644
index 000000000000..05dffd72f414
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_002.log
@@ -0,0 +1,5 @@
+To accomplish our goal of retrieving the ultimate answer to life from localhost:8000, I need to start by accessing the content on that page. Here's my thought process and action:
+
+In order to accomplish my goal, I need to navigate to the specified URL using the goto action.
+
+```goto("http://localhost:8000"
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_003.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_003.log
new file mode 100644
index 000000000000..66e87a3550dc
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_003.log
@@ -0,0 +1,6 @@
+To accomplish our goal of retrieving the ultimate answer to life, we need to click on the "Click me" button. The button has a bid of 10, so we should click on it.
+
+Here's the action I'll take:
+
+In order to accomplish my goal, I need to click on the button with bid 10
+```click("10"
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_004.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_004.log
new file mode 100644
index 000000000000..ae88605a499b
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_004.log
@@ -0,0 +1,6 @@
+Based on the current Accessibility Tree and the previous actions taken, it appears that we have already successfully navigated to the webpage and clicked the button to reveal the answer. The answer is now visible on the page.
+
+To accomplish our goal of retrieving the ultimate answer to life, I will now send this information to the user:
+
+```
+send_msg_to_user("The ultimate answer to life, the universe, and everything is: OpenDevin is all you need!"
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_005.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_005.log
new file mode 100644
index 000000000000..13e8a93af051
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet/response_005.log
@@ -0,0 +1,11 @@
+Thank you for providing the content from localhost:8000. Based on the information retrieved, I can now answer your question about the ultimate answer to life.
+
+The ultimate answer to life, according to the content on localhost:8000, is:
+
+"OpenDevin is all you need!"
+
+This appears to be a playful reference to the popular science fiction series "The Hitchhiker's Guide to the Galaxy" by Douglas Adams, where the answer to the ultimate question of life, the universe, and everything is famously "42". In this case, it's been adapted to promote OpenDevin as the ultimate solution.
+
+Since I've successfully retrieved and reported the answer from the specified localhost, I believe I've completed the task you've given me.
+
+
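The response logs above pair one-to-one with the numbered prompt logs in the same fixture directory: for each LLM call the integration test triggers, the mock backend is meant to return the recorded completion instead of hitting a real model. The test harness itself is not part of this diff, so the following is only a minimal, hypothetical Python sketch of that replay idea; the class name, file layout, and method signature are assumptions for illustration, not the actual OpenDevin test code.

```python
from pathlib import Path


class ReplayMock:
    """Toy stand-in for an LLM: returns pre-recorded completions in call order.

    Assumes a directory containing response_001.log, response_002.log, ...
    matching the prompts the agent will send, mirroring the layout of the
    mock logs added in this diff (not the real harness).
    """

    def __init__(self, log_dir: str):
        # Zero-padded names sort lexicographically in call order.
        self.responses = sorted(Path(log_dir).glob("response_*.log"))
        self.calls = 0

    def completion(self, prompt: str) -> str:
        # The prompt is ignored here; a real harness might also assert that it
        # matches the corresponding prompt_NNN.log fixture.
        text = self.responses[self.calls].read_text()
        self.calls += 1
        return text


# Hypothetical usage:
# mock = ReplayMock("tests/integration/mock/eventstream_runtime/CodeActAgent/test_browse_internet")
# print(mock.completion("<first prompt>")[:80])
```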
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/prompt_001.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/prompt_001.log
new file mode 100644
index 000000000000..8f33d2e2f2bc
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/prompt_001.log
@@ -0,0 +1,401 @@
+
+
+----------
+
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
+
+print("Hello World!")
+
+The assistant can execute bash commands on behalf of the user by wrapping them with and .
+
+For example, you can list the files in the current directory by ls .
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
+Also, if a command execution result says something like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry by running the command in the background.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+ git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
+If $GITHUB_TOKEN is not set, ask the user to set it.
+
+
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
+open_file(path: str, line_number: int | None = 1, context_lines: int | None = 100) -> None:
+ Opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line.
+ It only shows the first 100 lines by default! Max `context_lines` supported is 2000, use `scroll up/down`
+ to view the file if you want to see more.
+ Args:
+ path: str: The path to the file to open, preferred absolute path.
+ line_number: int | None = 1: The line number to move to. Defaults to 1.
+ context_lines: int | None = 100: Only shows this number of lines in the context window (usually from line 1), with line_number as the center (if possible). Defaults to 100.
+
+goto_line(line_number: int) -> None:
+ Moves the window to show the specified line number.
+ Args:
+ line_number: int: The line number to move to.
+
+scroll_down() -> None:
+ Moves the window down by 100 lines.
+ Args:
+ None
+
+scroll_up() -> None:
+ Moves the window up by 100 lines.
+ Args:
+ None
+
+create_file(filename: str) -> None:
+ Creates and opens a new file with the given name.
+ Args:
+ filename: str: The name of the file to create.
+
+edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
+ Edit a file. This will search for `to_replace` in the given file and replace it with `new_content`.
+ Every *to_replace* must *EXACTLY MATCH* the existing source code, character for character, including all comments, docstrings, etc.
+ Include enough lines to make code in `to_replace` unique. `to_replace` should NOT be empty.
+ For example, given a file "/workspace/example.txt" with the following content:
+ ```
+ line 1
+ line 2
+ line 2
+ line 3
+ ```
+ EDITING: If you want to replace the second occurrence of "line 2", you can make `to_replace` unique:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='new line
+ line 3',
+ )
+ This will replace only the second "line 2" with "new line". The first "line 2" will remain unchanged.
+ The resulting file will be:
+ ```
+ line 1
+ line 2
+ new line
+ line 3
+ ```
+ REMOVAL: If you want to remove "line 2" and "line 3", you can set `new_content` to an empty string:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='',
+ )
+ Args:
+ file_name: str: The name of the file to edit.
+ to_replace: str: The content to search for and replace.
+ new_content: str: The new content to replace the old content with.
+
+insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
+ Insert content at the given line number in a file.
+ This will NOT modify the content of the lines before OR after the given line number.
+ For example, if the file has the following content:
+ ```
+ line 1
+ line 2
+ line 3
+ ```
+ and you call `insert_content_at_line('file.txt', 2, 'new line')`, the file will be updated to:
+ ```
+ line 1
+ new line
+ line 2
+ line 3
+ ```
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+append_file(file_name: str, content: str) -> None:
+ Append content to the given file.
+ It appends text `content` to the end of the specified file.
+ Args:
+ file_name: str: The name of the file to edit.
+ content: str: The content to append.
+
+search_dir(search_term: str, dir_path: str = './') -> None:
+ Searches for search_term in all files in dir. If dir is not provided, searches in the current directory.
+ Args:
+ search_term: str: The term to search for.
+ dir_path: Optional[str]: The path to the directory to search.
+
+search_file(search_term: str, file_path: Optional[str] = None) -> None:
+ Searches for search_term in file. If file is not provided, searches in the current open file.
+ Args:
+ search_term: str: The term to search for.
+ file_path: Optional[str]: The path to the file to search.
+
+find_file(file_name: str, dir_path: str = './') -> None:
+ Finds all files with the given name in the specified directory.
+ Args:
+ file_name: str: The name of the file to find.
+ dir_path: Optional[str]: The path to the directory to search.
+
+parse_pdf(file_path: str) -> None:
+ Parses the content of a PDF file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_docx(file_path: str) -> None:
+ Parses the content of a DOCX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_latex(file_path: str) -> None:
+ Parses the content of a LaTeX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_pptx(file_path: str) -> None:
+ Parses the content of a pptx file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include in your response.
+IMPORTANT: Execute code using , , or whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
+
+
+----------
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+
+create_file('app.py')
+
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+
+EDITED_CODE="""from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+ numbers = list(range(1, 11))
+ return str(numbers)
+
+if __name__ == '__main__':
+ app.run(port=5000)"""
+
+insert_content_at_line(
+ 'app.py',
+ 1,
+ EDITED_CODE,
+)
+
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 121[1]+ Exit 1 python3 app.py > server.log 2>&1
+
+ASSISTANT: Looks like the server started with PID 121 and then crashed. Let me check the server log:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+ File "/workspace/app.py", line 2, in
+ from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+
+pip install flask
+
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+ Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+ Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+ Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+ Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+ Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+
+Get the content on "http://127.0.0.1:5000"
+
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+
+kill 124
+
+
+USER:
+OBSERVATION:
+[1]+ Terminated python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+
+open_file('app.py')
+
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+
+edit_file_by_replace(
+ 'app.py',
+ to_replace=" return str(numbers)",
+ new_content=" return '
' + ''.join([f'
{i}
' for i in numbers]) + '
'",
+)
+
+
+USER:
+Observation:
+[File: /workspace/app.py (10 lines total after edit)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+Running the updated file:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+Observation:
+[1] 126
+
+ASSISTANT:
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
+
+
+--- END OF EXAMPLE ---
+
+
+NOW, LET'S START!
+
+Fix typos in bad.txt. Do not ask me for confirmation at any point.
+
+ENVIRONMENT REMINDER: You have 14 turns left to complete the task. When finished reply with .
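The system prompt captured above spends a lot of space on `edit_file_by_replace`, in particular the rule that `to_replace` must match the existing text exactly and should be made unique. As a rough illustration of that documented rule only, and not OpenDevin's actual agentskills implementation, a minimal Python sketch might look like this:

```python
def edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
    """Replace the first exact occurrence of `to_replace` with `new_content`.

    Illustrative sketch of the behaviour described in the prompt: the needle
    must match character-for-character and should be unique within the file.
    """
    if not to_replace:
        raise ValueError("`to_replace` should NOT be empty")

    with open(file_name, encoding="utf-8") as f:
        text = f.read()

    occurrences = text.count(to_replace)
    if occurrences == 0:
        raise ValueError(f"`to_replace` not found verbatim in {file_name}")
    if occurrences > 1:
        # The prompt asks the agent to include enough surrounding lines to make
        # the needle unique; fail loudly rather than guess which match was meant.
        raise ValueError("`to_replace` is not unique; include more surrounding lines")

    with open(file_name, "w", encoding="utf-8") as f:
        f.write(text.replace(to_replace, new_content, 1))
```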
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/prompt_002.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/prompt_002.log
new file mode 100644
index 000000000000..f33b77e59535
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/prompt_002.log
@@ -0,0 +1,422 @@
+
+
+----------
+
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
+
+print("Hello World!")
+
+The assistant can execute bash commands on behalf of the user by wrapping them with and .
+
+For example, you can list the files in the current directory by ls .
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
+Also, if a command execution result says something like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry by running the command in the background.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+ git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
+If $GITHUB_TOKEN is not set, ask the user to set it.
+
+
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
+open_file(path: str, line_number: int | None = 1, context_lines: int | None = 100) -> None:
+ Opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line.
+ It only shows the first 100 lines by default! Max `context_lines` supported is 2000, use `scroll up/down`
+ to view the file if you want to see more.
+ Args:
+ path: str: The path to the file to open, preferred absolute path.
+ line_number: int | None = 1: The line number to move to. Defaults to 1.
+ context_lines: int | None = 100: Only shows this number of lines in the context window (usually from line 1), with line_number as the center (if possible). Defaults to 100.
+
+goto_line(line_number: int) -> None:
+ Moves the window to show the specified line number.
+ Args:
+ line_number: int: The line number to move to.
+
+scroll_down() -> None:
+ Moves the window down by 100 lines.
+ Args:
+ None
+
+scroll_up() -> None:
+ Moves the window up by 100 lines.
+ Args:
+ None
+
+create_file(filename: str) -> None:
+ Creates and opens a new file with the given name.
+ Args:
+ filename: str: The name of the file to create.
+
+edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
+ Edit a file. This will search for `to_replace` in the given file and replace it with `new_content`.
+ Every *to_replace* must *EXACTLY MATCH* the existing source code, character for character, including all comments, docstrings, etc.
+ Include enough lines to make code in `to_replace` unique. `to_replace` should NOT be empty.
+ For example, given a file "/workspace/example.txt" with the following content:
+ ```
+ line 1
+ line 2
+ line 2
+ line 3
+ ```
+ EDITING: If you want to replace the second occurrence of "line 2", you can make `to_replace` unique:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='new line
+ line 3',
+ )
+ This will replace only the second "line 2" with "new line". The first "line 2" will remain unchanged.
+ The resulting file will be:
+ ```
+ line 1
+ line 2
+ new line
+ line 3
+ ```
+ REMOVAL: If you want to remove "line 2" and "line 3", you can set `new_content` to an empty string:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='',
+ )
+ Args:
+ file_name: str: The name of the file to edit.
+ to_replace: str: The content to search for and replace.
+ new_content: str: The new content to replace the old content with.
+
+insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
+ Insert content at the given line number in a file.
+ This will NOT modify the content of the lines before OR after the given line number.
+ For example, if the file has the following content:
+ ```
+ line 1
+ line 2
+ line 3
+ ```
+ and you call `insert_content_at_line('file.txt', 2, 'new line')`, the file will be updated to:
+ ```
+ line 1
+ new line
+ line 2
+ line 3
+ ```
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+append_file(file_name: str, content: str) -> None:
+ Append content to the given file.
+ It appends text `content` to the end of the specified file.
+ Args:
+ file_name: str: The name of the file to edit.
+ content: str: The content to append.
+
+search_dir(search_term: str, dir_path: str = './') -> None:
+ Searches for search_term in all files in dir. If dir is not provided, searches in the current directory.
+ Args:
+ search_term: str: The term to search for.
+ dir_path: Optional[str]: The path to the directory to search.
+
+search_file(search_term: str, file_path: Optional[str] = None) -> None:
+ Searches for search_term in file. If file is not provided, searches in the current open file.
+ Args:
+ search_term: str: The term to search for.
+ file_path: Optional[str]: The path to the file to search.
+
+find_file(file_name: str, dir_path: str = './') -> None:
+ Finds all files with the given name in the specified directory.
+ Args:
+ file_name: str: The name of the file to find.
+ dir_path: Optional[str]: The path to the directory to search.
+
+parse_pdf(file_path: str) -> None:
+ Parses the content of a PDF file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_docx(file_path: str) -> None:
+ Parses the content of a DOCX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_latex(file_path: str) -> None:
+ Parses the content of a LaTeX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_pptx(file_path: str) -> None:
+ Parses the content of a pptx file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include in your response.
+IMPORTANT: Execute code using , , or whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
+
+
+----------
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+
+create_file('app.py')
+
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+
+EDITED_CODE="""from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+ numbers = list(range(1, 11))
+ return str(numbers)
+
+if __name__ == '__main__':
+ app.run(port=5000)"""
+
+insert_content_at_line(
+ 'app.py',
+ 1,
+ EDITED_CODE,
+)
+
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 121[1]+ Exit 1 python3 app.py > server.log 2>&1
+
+ASSISTANT: Looks like the server started with PID 121 and then crashed. Let me check the server log:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+ File "/workspace/app.py", line 2, in
+ from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+
+pip install flask
+
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+ Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+ Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+ Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+ Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+ Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+
+Get the content on "http://127.0.0.1:5000"
+
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+
+kill 124
+
+
+USER:
+OBSERVATION:
+[1]+ Terminated python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+
+open_file('app.py')
+
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+
+edit_file_by_replace(
+ 'app.py',
+ to_replace=" return str(numbers)",
+ new_content=" return '
' + ''.join([f'
{i}
' for i in numbers]) + '
'",
+)
+
+
+USER:
+Observation:
+[File: /workspace/app.py (10 lines total after edit)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+Running the updated file:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+Observation:
+[1] 126
+
+ASSISTANT:
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
+
+
+--- END OF EXAMPLE ---
+
+
+NOW, LET'S START!
+
+----------
+
+Fix typos in bad.txt. Do not ask me for confirmation at any point.
+
+----------
+
+Sure! Let's start by opening the `bad.txt` file to identify the typos.
+
+open_file('bad.txt')
+
+
+----------
+
+OBSERVATION:
+[File: /workspace/bad.txt (4 lines total)]
+(this is the beginning of the file)
+1|This is a stupid typoo.
+2|Really?
+3|No mor typos!
+4|Enjoy!
+(this is the end of the file)
+
+
+ENVIRONMENT REMINDER: You have 13 turns left to complete the task. When finished reply with
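Both `insert_content_at_line` and `append_file` are documented in the prompt above mostly by example, and the corrected `append_file` docstring takes only a file name and the content to append. For readers skimming these fixtures, here is a minimal, hypothetical sketch of the described behaviour (again not the real agentskills code): insertion happens at the given 1-based line number, pushing existing lines down, while append simply adds to the end of the file.

```python
def insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
    """Insert `content` so that it starts at 1-based `line_number` (sketch)."""
    with open(file_name, encoding="utf-8") as f:
        lines = f.readlines()
    new_lines = [line + "\n" for line in content.splitlines()]
    index = max(line_number - 1, 0)   # clamp and convert 1-based to 0-based
    lines[index:index] = new_lines    # existing lines shift down, none are modified
    with open(file_name, "w", encoding="utf-8") as f:
        f.writelines(lines)


def append_file(file_name: str, content: str) -> None:
    """Append `content` to the end of the file (sketch)."""
    with open(file_name, "a", encoding="utf-8") as f:
        f.write(content if content.endswith("\n") else content + "\n")
```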
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/prompt_003.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/prompt_003.log
new file mode 100644
index 000000000000..14ed92579529
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/prompt_003.log
@@ -0,0 +1,451 @@
+
+
+----------
+
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
+
+print("Hello World!")
+
+The assistant can execute bash commands on behalf of the user by wrapping them with and .
+
+For example, you can list the files in the current directory by ls .
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
+Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+ git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
+If $GITHUB_TOKEN is not set, ask the user to set it.
+
+
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
+open_file(path: str, line_number: int | None = 1, context_lines: int | None = 100) -> None:
+ Opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line.
+ It only shows the first 100 lines by default! Max `context_lines` supported is 2000, use `scroll up/down`
+ to view the file if you want to see more.
+ Args:
+ path: str: The path to the file to open, preferred absolute path.
+ line_number: int | None = 1: The line number to move to. Defaults to 1.
+ context_lines: int | None = 100: Only shows this number of lines in the context window (usually from line 1), with line_number as the center (if possible). Defaults to 100.
+
+goto_line(line_number: int) -> None:
+ Moves the window to show the specified line number.
+ Args:
+ line_number: int: The line number to move to.
+
+scroll_down() -> None:
+ Moves the window down by 100 lines.
+ Args:
+ None
+
+scroll_up() -> None:
+ Moves the window up by 100 lines.
+ Args:
+ None
+
+create_file(filename: str) -> None:
+ Creates and opens a new file with the given name.
+ Args:
+ filename: str: The name of the file to create.
+
+edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
+ Edit a file. This will search for `to_replace` in the given file and replace it with `new_content`.
+ Every *to_replace* must *EXACTLY MATCH* the existing source code, character for character, including all comments, docstrings, etc.
+ Include enough lines to make code in `to_replace` unique. `to_replace` should NOT be empty.
+ For example, given a file "/workspace/example.txt" with the following content:
+ ```
+ line 1
+ line 2
+ line 2
+ line 3
+ ```
+ EDITING: If you want to replace the second occurrence of "line 2", you can make `to_replace` unique:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='new line
+ line 3',
+ )
+ This will replace only the second "line 2" with "new line". The first "line 2" will remain unchanged.
+ The resulting file will be:
+ ```
+ line 1
+ line 2
+ new line
+ line 3
+ ```
+ REMOVAL: If you want to remove "line 2" and "line 3", you can set `new_content` to an empty string:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='',
+ )
+ Args:
+ file_name: str: The name of the file to edit.
+ to_replace: str: The content to search for and replace.
+ new_content: str: The new content to replace the old content with.
+
+insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
+ Insert content at the given line number in a file.
+ This will NOT modify the content of the lines before OR after the given line number.
+ For example, if the file has the following content:
+ ```
+ line 1
+ line 2
+ line 3
+ ```
+ and you call `insert_content_at_line('file.txt', 2, 'new line')`, the file will be updated to:
+ ```
+ line 1
+ new line
+ line 2
+ line 3
+ ```
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+append_file(file_name: str, content: str) -> None:
+ Append content to the given file.
+ It appends text `content` to the end of the specified file.
+ Args:
+ file_name: str: The name of the file to edit.
+ content: str: The content to append.
+
+search_dir(search_term: str, dir_path: str = './') -> None:
+ Searches for search_term in all files in dir. If dir is not provided, searches in the current directory.
+ Args:
+ search_term: str: The term to search for.
+ dir_path: Optional[str]: The path to the directory to search.
+
+search_file(search_term: str, file_path: Optional[str] = None) -> None:
+ Searches for search_term in file. If file is not provided, searches in the current open file.
+ Args:
+ search_term: str: The term to search for.
+ file_path: Optional[str]: The path to the file to search.
+
+find_file(file_name: str, dir_path: str = './') -> None:
+ Finds all files with the given name in the specified directory.
+ Args:
+ file_name: str: The name of the file to find.
+ dir_path: Optional[str]: The path to the directory to search.
+
+parse_pdf(file_path: str) -> None:
+ Parses the content of a PDF file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_docx(file_path: str) -> None:
+ Parses the content of a DOCX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_latex(file_path: str) -> None:
+ Parses the content of a LaTeX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_pptx(file_path: str) -> None:
+ Parses the content of a pptx file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include in your response.
+IMPORTANT: Execute code using , , or whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
+
+
+----------
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+
+create_file('app.py')
+
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+
+EDITED_CODE="""from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+ numbers = list(range(1, 11))
+ return str(numbers)
+
+if __name__ == '__main__':
+ app.run(port=5000)"""
+
+insert_content_at_line(
+ 'app.py',
+ 1,
+ EDITED_CODE,
+)
+
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 121[1]+ Exit 1 python3 app.py > server.log 2>&1
+
+ASSISTANT: Looks like the server started with PID 121 and then crashed. Let me check the server log:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+ File "/workspace/app.py", line 2, in
+ from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+
+pip install flask
+
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+ Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+ Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+ Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+ Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+ Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+
+Get the content on "http://127.0.0.1:5000"
+
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+
+kill 124
+
+
+USER:
+OBSERVATION:
+[1]+ Terminated python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+
+open_file('app.py')
+
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+
+edit_file_by_replace(
+ 'app.py',
+ to_replace=" return str(numbers)",
+ new_content=" return '
' + ''.join([f'
{i}
' for i in numbers]) + '
'",
+)
+
+
+USER:
+Observation:
+[File: /workspace/app.py (10 lines total after edit)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+Running the updated file:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+Observation:
+[1] 126
+
+ASSISTANT:
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
+
+
+--- END OF EXAMPLE ---
+
+
+NOW, LET'S START!
+
+----------
+
+Fix typos in bad.txt. Do not ask me for confirmation at any point.
+
+----------
+
+Sure! Let's start by opening the `bad.txt` file to identify the typos.
+
+open_file('bad.txt')
+
+
+----------
+
+OBSERVATION:
+[File: /workspace/bad.txt (4 lines total)]
+(this is the beginning of the file)
+1|This is a stupid typoo.
+2|Really?
+3|No mor typos!
+4|Enjoy!
+(this is the end of the file)
+
+
+----------
+
+I will correct the typos in the `bad.txt` file.
+
+1. "stupid typoo" -> "stupid typo"
+2. "No mor typos!" -> "No more typos!"
+
+Let's make these changes.
+
+edit_file_by_replace(
+ 'bad.txt',
+ to_replace="This is a stupid typoo.",
+ new_content="This is a stupid typo."
+)
+
+
+----------
+
+OBSERVATION:
+[File: /workspace/bad.txt (4 lines total after edit)]
+(this is the beginning of the file)
+1|This is a stupid typo.
+2|Really?
+3|No mor typos!
+4|Enjoy!
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+
+ENVIRONMENT REMINDER: You have 12 turns left to complete the task. When finished reply with
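The prompt's warning about indentation (if the agent wants to add the line `    print(x)`, it must write out the leading spaces) is easy to miss inside these long fixtures. The following small demo is purely illustrative and uses plain `compile` rather than the editing helpers; it only shows why content spliced in verbatim must carry its own indentation.

```python
# Hypothetical demo of the indentation rule described in the prompt: content
# passed to the editing helpers is spliced into the file verbatim, so the
# agent has to write out leading spaces itself.
well_indented = "def show(x):\n    print(x)\n"   # '    print(x)' keeps its four spaces
badly_indented = "def show(x):\nprint(x)\n"      # indentation dropped

compile(well_indented, "<demo>", "exec")          # compiles fine

try:
    compile(badly_indented, "<demo>", "exec")
except IndentationError as err:
    print("badly indented edit would fail:", err)
```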
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/prompt_004.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/prompt_004.log
new file mode 100644
index 000000000000..80b43b7dc699
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/prompt_004.log
@@ -0,0 +1,479 @@
+
+
+----------
+
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
+
+print("Hello World!")
+
+The assistant can execute bash commands on behalf of the user by wrapping them with and .
+
+For example, you can list the files in the current directory by ls .
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
+Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+ git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
+If $GITHUB_TOKEN is not set, ask the user to set it.
+
+
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
+open_file(path: str, line_number: int | None = 1, context_lines: int | None = 100) -> None:
+ Opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line.
+ It only shows the first 100 lines by default! Max `context_lines` supported is 2000, use `scroll up/down`
+ to view the file if you want to see more.
+ Args:
+ path: str: The path to the file to open, preferred absolute path.
+ line_number: int | None = 1: The line number to move to. Defaults to 1.
+ context_lines: int | None = 100: Only shows this number of lines in the context window (usually from line 1), with line_number as the center (if possible). Defaults to 100.
+
+goto_line(line_number: int) -> None:
+ Moves the window to show the specified line number.
+ Args:
+ line_number: int: The line number to move to.
+
+scroll_down() -> None:
+ Moves the window down by 100 lines.
+ Args:
+ None
+
+scroll_up() -> None:
+ Moves the window up by 100 lines.
+ Args:
+ None
+
+create_file(filename: str) -> None:
+ Creates and opens a new file with the given name.
+ Args:
+ filename: str: The name of the file to create.
+
+edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
+ Edit a file. This will search for `to_replace` in the given file and replace it with `new_content`.
+ Every *to_replace* must *EXACTLY MATCH* the existing source code, character for character, including all comments, docstrings, etc.
+ Include enough lines to make code in `to_replace` unique. `to_replace` should NOT be empty.
+ For example, given a file "/workspace/example.txt" with the following content:
+ ```
+ line 1
+ line 2
+ line 2
+ line 3
+ ```
+ EDITING: If you want to replace the second occurrence of "line 2", you can make `to_replace` unique:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='new line
+ line 3',
+ )
+ This will replace only the second "line 2" with "new line". The first "line 2" will remain unchanged.
+ The resulting file will be:
+ ```
+ line 1
+ line 2
+ new line
+ line 3
+ ```
+ REMOVAL: If you want to remove "line 2" and "line 3", you can set `new_content` to an empty string:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='',
+ )
+ Args:
+ file_name: str: The name of the file to edit.
+ to_replace: str: The content to search for and replace.
+ new_content: str: The new content to replace the old content with.
+
+insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
+ Insert content at the given line number in a file.
+ This will NOT modify the content of the lines before OR after the given line number.
+ For example, if the file has the following content:
+ ```
+ line 1
+ line 2
+ line 3
+ ```
+ and you call `insert_content_at_line('file.txt', 2, 'new line')`, the file will be updated to:
+ ```
+ line 1
+ new line
+ line 2
+ line 3
+ ```
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+append_file(file_name: str, content: str) -> None:
+ Append content to the given file.
+ It appends text `content` to the end of the specified file.
+ Args:
+ file_name: str: The name of the file to edit.
+ content: str: The content to append.
+
+search_dir(search_term: str, dir_path: str = './') -> None:
+ Searches for search_term in all files in dir. If dir is not provided, searches in the current directory.
+ Args:
+ search_term: str: The term to search for.
+ dir_path: Optional[str]: The path to the directory to search.
+
+search_file(search_term: str, file_path: Optional[str] = None) -> None:
+ Searches for search_term in file. If file is not provided, searches in the current open file.
+ Args:
+ search_term: str: The term to search for.
+ file_path: Optional[str]: The path to the file to search.
+
+find_file(file_name: str, dir_path: str = './') -> None:
+ Finds all files with the given name in the specified directory.
+ Args:
+ file_name: str: The name of the file to find.
+ dir_path: Optional[str]: The path to the directory to search.
+
+parse_pdf(file_path: str) -> None:
+ Parses the content of a PDF file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_docx(file_path: str) -> None:
+ Parses the content of a DOCX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_latex(file_path: str) -> None:
+ Parses the content of a LaTeX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_pptx(file_path: str) -> None:
+ Parses the content of a pptx file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include in your response.
+IMPORTANT: Execute code using , , or whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
+
+
+----------
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+
+create_file('app.py')
+
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+
+EDITED_CODE="""from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+ numbers = list(range(1, 11))
+ return str(numbers)
+
+if __name__ == '__main__':
+ app.run(port=5000)"""
+
+insert_content_at_line(
+ 'app.py',
+ 1,
+ EDITED_CODE,
+)
+
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 121[1]+ Exit 1 python3 app.py > server.log 2>&1
+
+ASSISTANT: Looks like the server is running with PID 121 then crashed. Let me check the server log:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+ File "/workspace/app.py", line 2, in
+ from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+
+pip install flask
+
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+ Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+ Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+ Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+ Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+ Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+
+Get the content on "http://127.0.0.1:5000"
+
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+
+kill 124
+
+
+USER:
+OBSERVATION:
+[1]+ Terminated python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+
+open_file('app.py')
+
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+
+edit_file_by_replace(
+ 'app.py',
+ to_replace=" return str(numbers)",
+ new_content=" return '
' + ''.join([f'
{i}
' for i in numbers]) + '
'",
+)
+
+
+USER:
+Observation:
+[File: /workspace/app.py (10 lines total after edit)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7|    return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+Running the updated file:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+Observation:
+[1] 126
+
+ASSISTANT:
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
+
+
+--- END OF EXAMPLE ---
+
+
+NOW, LET'S START!
+
+----------
+
+Fix typos in bad.txt. Do not ask me for confirmation at any point.
+
+----------
+
+Sure! Let's start by opening the `bad.txt` file to identify the typos.
+
+open_file('bad.txt')
+
+
+----------
+
+OBSERVATION:
+[File: /workspace/bad.txt (4 lines total)]
+(this is the beginning of the file)
+1|This is a stupid typoo.
+2|Really?
+3|No mor typos!
+4|Enjoy!
+(this is the end of the file)
+
+
+----------
+
+I will correct the typos in the `bad.txt` file.
+
+1. "stupid typoo" -> "stupid typo"
+2. "No mor typos!" -> "No more typos!"
+
+Let's make these changes.
+
+edit_file_by_replace(
+ 'bad.txt',
+ to_replace="This is a stupid typoo.",
+ new_content="This is a stupid typo."
+)
+
+
+----------
+
+OBSERVATION:
+[File: /workspace/bad.txt (4 lines total after edit)]
+(this is the beginning of the file)
+1|This is a stupid typo.
+2|Really?
+3|No mor typos!
+4|Enjoy!
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+
+----------
+
+Let's correct the remaining typo in the file.
+
+1. "No mor typos!" -> "No more typos!"
+
+I will make this change now.
+
+edit_file_by_replace(
+ 'bad.txt',
+ to_replace="No mor typos!",
+ new_content="No more typos!"
+)
+
+
+----------
+
+OBSERVATION:
+[File: /workspace/bad.txt (4 lines total after edit)]
+(this is the beginning of the file)
+1|This is a stupid typo.
+2|Really?
+3|No more typos!
+4|Enjoy!
+(this is the end of the file)
+[File updated (edited at line 3). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+
+ENVIRONMENT REMINDER: You have 11 turns left to complete the task. When finished reply with
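The transcript above drives `edit_file_by_replace`, which requires the search text to match the file exactly and to identify a single location. As a rough illustration of that contract (a minimal sketch with illustrative names, not the function used by the agent runtime), an exact-match, single-occurrence replacement can be written as:

```
from pathlib import Path

def replace_exact(path: str, to_replace: str, new_content: str) -> None:
    """Replace exactly one occurrence of `to_replace` in the file at `path`."""
    if not to_replace:
        raise ValueError("to_replace must not be empty")
    text = Path(path).read_text()
    if text.count(to_replace) != 1:
        # Zero or multiple matches: widen the snippet (e.g. include the next
        # line, as the system prompt suggests) until it is unique.
        raise ValueError("to_replace must match exactly one location")
    Path(path).write_text(text.replace(to_replace, new_content, 1))

# e.g. replace_exact('/workspace/bad.txt', 'This is a stupid typoo.', 'This is a stupid typo.')
```

Requiring a unique match is what keeps the edit unambiguous when the same line appears more than once in a file.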
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/response_001.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/response_001.log
new file mode 100644
index 000000000000..58fb6048ba95
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/response_001.log
@@ -0,0 +1,4 @@
+Sure! Let's start by opening the `bad.txt` file to identify the typos.
+
+
+open_file('bad.txt')
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/response_002.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/response_002.log
new file mode 100644
index 000000000000..b83d070ecd9f
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/response_002.log
@@ -0,0 +1,13 @@
+I will correct the typos in the `bad.txt` file.
+
+1. "stupid typoo" -> "stupid typo"
+2. "No mor typos!" -> "No more typos!"
+
+Let's make these changes.
+
+
+edit_file_by_replace(
+ 'bad.txt',
+ to_replace="This is a stupid typoo.",
+ new_content="This is a stupid typo."
+)
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/response_003.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/response_003.log
new file mode 100644
index 000000000000..c5096a8785dd
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/response_003.log
@@ -0,0 +1,11 @@
+Let's correct the remaining typo in the file.
+
+1. "No mor typos!" -> "No more typos!"
+
+I will make this change now.
+
+edit_file_by_replace(
+ 'bad.txt',
+ to_replace="No mor typos!",
+ new_content="No more typos!"
+)
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/response_004.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/response_004.log
new file mode 100644
index 000000000000..4fee4faa5e15
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_edits/response_004.log
@@ -0,0 +1,11 @@
+The typos in `bad.txt` have been corrected. Here is the final content of the file:
+
+```
+1|This is a stupid typo.
+2|Really?
+3|No more typos!
+4|Enjoy!
+```
+
+If you have any further requests, feel free to ask!
+
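The response above records the corrected contents of `bad.txt`. To spot-check that outcome outside the agent loop, the file can be read back and compared line by line (a sketch only; the path and expected lines are taken from the transcript, and this check is not part of the fixtures):

```
from pathlib import Path

expected = ["This is a stupid typo.", "Really?", "No more typos!", "Enjoy!"]
lines = Path("/workspace/bad.txt").read_text().splitlines()
assert lines == expected, f"unexpected content: {lines!r}"
```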
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/prompt_001.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/prompt_001.log
new file mode 100644
index 000000000000..c620c8918435
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/prompt_001.log
@@ -0,0 +1,401 @@
+
+
+----------
+
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
+
+print("Hello World!")
+
+The assistant can execute bash commands on behalf of the user by wrapping them with and .
+
+For example, you can list the files in the current directory by ls .
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
+Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+ git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
+If $GITHUB_TOKEN is not set, ask the user to set it.
+
+
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
+open_file(path: str, line_number: int | None = 1, context_lines: int | None = 100) -> None:
+ Opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line.
+ It only shows the first 100 lines by default! Max `context_lines` supported is 2000, use `scroll up/down`
+ to view the file if you want to see more.
+ Args:
+ path: str: The path to the file to open, preferred absolute path.
+ line_number: int | None = 1: The line number to move to. Defaults to 1.
+ context_lines: int | None = 100: Only shows this number of lines in the context window (usually from line 1), with line_number as the center (if possible). Defaults to 100.
+
+goto_line(line_number: int) -> None:
+ Moves the window to show the specified line number.
+ Args:
+ line_number: int: The line number to move to.
+
+scroll_down() -> None:
+ Moves the window down by 100 lines.
+ Args:
+ None
+
+scroll_up() -> None:
+ Moves the window up by 100 lines.
+ Args:
+ None
+
+create_file(filename: str) -> None:
+ Creates and opens a new file with the given name.
+ Args:
+ filename: str: The name of the file to create.
+
+edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
+ Edit a file. This will search for `to_replace` in the given file and replace it with `new_content`.
+ Every *to_replace* must *EXACTLY MATCH* the existing source code, character for character, including all comments, docstrings, etc.
+ Include enough lines to make code in `to_replace` unique. `to_replace` should NOT be empty.
+ For example, given a file "/workspace/example.txt" with the following content:
+ ```
+ line 1
+ line 2
+ line 2
+ line 3
+ ```
+ EDITING: If you want to replace the second occurrence of "line 2", you can make `to_replace` unique:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='new line
+ line 3',
+ )
+ This will replace only the second "line 2" with "new line". The first "line 2" will remain unchanged.
+ The resulting file will be:
+ ```
+ line 1
+ line 2
+ new line
+ line 3
+ ```
+ REMOVAL: If you want to remove "line 2" and "line 3", you can set `new_content` to an empty string:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='',
+ )
+ Args:
+ file_name: str: The name of the file to edit.
+ to_replace: str: The content to search for and replace.
+ new_content: str: The new content to replace the old content with.
+
+insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
+ Insert content at the given line number in a file.
+ This will NOT modify the content of the lines before OR after the given line number.
+ For example, if the file has the following content:
+ ```
+ line 1
+ line 2
+ line 3
+ ```
+ and you call `insert_content_at_line('file.txt', 2, 'new line')`, the file will be updated to:
+ ```
+ line 1
+ new line
+ line 2
+ line 3
+ ```
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+append_file(file_name: str, content: str) -> None:
+ Append content to the given file.
+ It appends text `content` to the end of the specified file.
+ Args:
+ file_name: str: The name of the file to edit.
+ content: str: The content to append.
+
+search_dir(search_term: str, dir_path: str = './') -> None:
+ Searches for search_term in all files in dir. If dir is not provided, searches in the current directory.
+ Args:
+ search_term: str: The term to search for.
+ dir_path: Optional[str]: The path to the directory to search.
+
+search_file(search_term: str, file_path: Optional[str] = None) -> None:
+ Searches for search_term in file. If file is not provided, searches in the current open file.
+ Args:
+ search_term: str: The term to search for.
+ file_path: Optional[str]: The path to the file to search.
+
+find_file(file_name: str, dir_path: str = './') -> None:
+ Finds all files with the given name in the specified directory.
+ Args:
+ file_name: str: The name of the file to find.
+ dir_path: Optional[str]: The path to the directory to search.
+
+parse_pdf(file_path: str) -> None:
+ Parses the content of a PDF file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_docx(file_path: str) -> None:
+ Parses the content of a DOCX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_latex(file_path: str) -> None:
+ Parses the content of a LaTeX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_pptx(file_path: str) -> None:
+ Parses the content of a pptx file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include in your response.
+IMPORTANT: Execute code using , , or whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
+
+
+----------
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+
+create_file('app.py')
+
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+
+EDITED_CODE="""from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+ numbers = list(range(1, 11))
+ return str(numbers)
+
+if __name__ == '__main__':
+ app.run(port=5000)"""
+
+insert_content_at_line(
+ 'app.py',
+ 1,
+ EDITED_CODE,
+)
+
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 121[1]+ Exit 1 python3 app.py > server.log 2>&1
+
+ASSISTANT: Looks like the server is running with PID 121 then crashed. Let me check the server log:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+ File "/workspace/app.py", line 2, in
+ from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+
+pip install flask
+
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+ Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+ Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+ Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+ Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+ Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+
+Get the content on "http://127.0.0.1:5000"
+
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+
+kill 124
+
+
+USER:
+OBSERVATION:
+[1]+ Terminated python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+
+open_file('app.py')
+
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+
+edit_file_by_replace(
+ 'app.py',
+ to_replace=" return str(numbers)",
+ new_content=" return '
' + ''.join([f'
{i}
' for i in numbers]) + '
'",
+)
+
+
+USER:
+Observation:
+[File: /workspace/app.py (10 lines total after edit)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7|    return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+Running the updated file:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+Observation:
+[1] 126
+
+ASSISTANT:
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
+
+
+--- END OF EXAMPLE ---
+
+
+NOW, LET'S START!
+
+Use Jupyter IPython to write a text file containing 'hello world' to '/workspace/test.txt'. Do not ask me for confirmation at any point.
+
+ENVIRONMENT REMINDER: You have 14 turns left to complete the task. When finished reply with .
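The system prompt above tells the agent to redirect output to a log and background any command that may not return (`python3 app.py > server.log 2>&1 &`). The same pattern can be expressed without a shell job control operator, for instance via the standard library (an illustrative sketch only, not something the fixtures or the runtime use):

```
import subprocess

def start_in_background(cmd: list[str], log_path: str) -> subprocess.Popen:
    """Launch `cmd` detached from stdin, appending stdout/stderr to `log_path`."""
    log = open(log_path, "ab")  # kept open for the lifetime of the child process
    return subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT,
                            stdin=subprocess.DEVNULL)

# e.g. proc = start_in_background(["python3", "app.py"], "server.log")
```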
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/prompt_002.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/prompt_002.log
new file mode 100644
index 000000000000..013180481a26
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/prompt_002.log
@@ -0,0 +1,416 @@
+
+
+----------
+
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
+
+print("Hello World!")
+
+The assistant can execute bash commands on behalf of the user by wrapping them with and .
+
+For example, you can list the files in the current directory by ls .
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
+Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+ git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
+If $GITHUB_TOKEN is not set, ask the user to set it.
+
+
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
+open_file(path: str, line_number: int | None = 1, context_lines: int | None = 100) -> None:
+ Opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line.
+ It only shows the first 100 lines by default! Max `context_lines` supported is 2000, use `scroll up/down`
+ to view the file if you want to see more.
+ Args:
+ path: str: The path to the file to open, preferred absolute path.
+ line_number: int | None = 1: The line number to move to. Defaults to 1.
+ context_lines: int | None = 100: Only shows this number of lines in the context window (usually from line 1), with line_number as the center (if possible). Defaults to 100.
+
+goto_line(line_number: int) -> None:
+ Moves the window to show the specified line number.
+ Args:
+ line_number: int: The line number to move to.
+
+scroll_down() -> None:
+ Moves the window down by 100 lines.
+ Args:
+ None
+
+scroll_up() -> None:
+ Moves the window up by 100 lines.
+ Args:
+ None
+
+create_file(filename: str) -> None:
+ Creates and opens a new file with the given name.
+ Args:
+ filename: str: The name of the file to create.
+
+edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
+ Edit a file. This will search for `to_replace` in the given file and replace it with `new_content`.
+ Every *to_replace* must *EXACTLY MATCH* the existing source code, character for character, including all comments, docstrings, etc.
+ Include enough lines to make code in `to_replace` unique. `to_replace` should NOT be empty.
+ For example, given a file "/workspace/example.txt" with the following content:
+ ```
+ line 1
+ line 2
+ line 2
+ line 3
+ ```
+ EDITING: If you want to replace the second occurrence of "line 2", you can make `to_replace` unique:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='new line
+ line 3',
+ )
+ This will replace only the second "line 2" with "new line". The first "line 2" will remain unchanged.
+ The resulting file will be:
+ ```
+ line 1
+ line 2
+ new line
+ line 3
+ ```
+ REMOVAL: If you want to remove "line 2" and "line 3", you can set `new_content` to an empty string:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='',
+ )
+ Args:
+ file_name: str: The name of the file to edit.
+ to_replace: str: The content to search for and replace.
+ new_content: str: The new content to replace the old content with.
+
+insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
+ Insert content at the given line number in a file.
+ This will NOT modify the content of the lines before OR after the given line number.
+ For example, if the file has the following content:
+ ```
+ line 1
+ line 2
+ line 3
+ ```
+ and you call `insert_content_at_line('file.txt', 2, 'new line')`, the file will be updated to:
+ ```
+ line 1
+ new line
+ line 2
+ line 3
+ ```
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+append_file(file_name: str, content: str) -> None:
+ Append content to the given file.
+ It appends text `content` to the end of the specified file.
+ Args:
+ file_name: str: The name of the file to edit.
+ content: str: The content to append.
+
+search_dir(search_term: str, dir_path: str = './') -> None:
+ Searches for search_term in all files in dir. If dir is not provided, searches in the current directory.
+ Args:
+ search_term: str: The term to search for.
+ dir_path: Optional[str]: The path to the directory to search.
+
+search_file(search_term: str, file_path: Optional[str] = None) -> None:
+ Searches for search_term in file. If file is not provided, searches in the current open file.
+ Args:
+ search_term: str: The term to search for.
+ file_path: Optional[str]: The path to the file to search.
+
+find_file(file_name: str, dir_path: str = './') -> None:
+ Finds all files with the given name in the specified directory.
+ Args:
+ file_name: str: The name of the file to find.
+ dir_path: Optional[str]: The path to the directory to search.
+
+parse_pdf(file_path: str) -> None:
+ Parses the content of a PDF file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_docx(file_path: str) -> None:
+ Parses the content of a DOCX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_latex(file_path: str) -> None:
+ Parses the content of a LaTeX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_pptx(file_path: str) -> None:
+ Parses the content of a pptx file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include in your response.
+IMPORTANT: Execute code using , , or whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
+
+
+----------
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+
+create_file('app.py')
+
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+
+EDITED_CODE="""from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+ numbers = list(range(1, 11))
+ return str(numbers)
+
+if __name__ == '__main__':
+ app.run(port=5000)"""
+
+insert_content_at_line(
+ 'app.py',
+ 1,
+ EDITED_CODE,
+)
+
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 121[1]+ Exit 1 python3 app.py > server.log 2>&1
+
+ASSISTANT: Looks like the server is running with PID 121 then crashed. Let me check the server log:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+ File "/workspace/app.py", line 2, in
+ from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+
+pip install flask
+
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+ Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+ Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+ Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+ Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+ Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+
+Get the content on "http://127.0.0.1:5000"
+
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+
+kill 124
+
+
+USER:
+OBSERVATION:
+[1]+ Terminated python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+
+open_file('app.py')
+
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+
+edit_file_by_replace(
+ 'app.py',
+ to_replace=" return str(numbers)",
+ new_content=" return '
' + ''.join([f'
{i}
' for i in numbers]) + '
'",
+)
+
+
+USER:
+Observation:
+[File: /workspace/app.py (10 lines total after edit)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7|    return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+Running the updated file:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+Observation:
+[1] 126
+
+ASSISTANT:
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
+
+
+--- END OF EXAMPLE ---
+
+
+NOW, LET'S START!
+
+----------
+
+Use Jupyter IPython to write a text file containing 'hello world' to '/workspace/test.txt'. Do not ask me for confirmation at any point.
+
+----------
+
+Sure! I will create a text file named `test.txt` in the `/workspace` directory and write 'hello world' to it.
+
+with open('/workspace/test.txt', 'w') as file:
+ file.write('hello world')
+
+
+----------
+
+OBSERVATION:
+[Code executed successfully with no output]
+
+ENVIRONMENT REMINDER: You have 13 turns left to complete the task. When finished reply with
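A quick way to confirm the result recorded above is to read the file back (a sketch of the kind of check the surrounding integration test presumably makes; the actual assertion is not part of this diff):

```
from pathlib import Path

content = Path("/workspace/test.txt").read_text()
assert content == "hello world", f"unexpected content: {content!r}"
```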
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/response_001.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/response_001.log
new file mode 100644
index 000000000000..ea1ad9705196
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/response_001.log
@@ -0,0 +1,5 @@
+Sure! I will create a text file named `test.txt` in the `/workspace` directory and write 'hello world' to it.
+
+
+with open('/workspace/test.txt', 'w') as file:
+ file.write('hello world')
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/response_002.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/response_002.log
new file mode 100644
index 000000000000..42790c2d4a56
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython/response_002.log
@@ -0,0 +1,3 @@
+The file `test.txt` has been successfully created in the `/workspace` directory with the content 'hello world'. If you have any further requests, please let me know!
+
+
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/prompt_001.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/prompt_001.log
new file mode 100644
index 000000000000..3ecc273f588d
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/prompt_001.log
@@ -0,0 +1,401 @@
+
+
+----------
+
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
+
+print("Hello World!")
+
+The assistant can execute bash commands on behalf of the user by wrapping them with and .
+
+For example, you can list the files in the current directory by ls .
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
+Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+ git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
+If $GITHUB_TOKEN is not set, ask the user to set it.
+
+
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
+open_file(path: str, line_number: int | None = 1, context_lines: int | None = 100) -> None:
+ Opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line.
+ It only shows the first 100 lines by default! Max `context_lines` supported is 2000, use `scroll up/down`
+ to view the file if you want to see more.
+ Args:
+ path: str: The path to the file to open, preferred absolute path.
+ line_number: int | None = 1: The line number to move to. Defaults to 1.
+ context_lines: int | None = 100: Only shows this number of lines in the context window (usually from line 1), with line_number as the center (if possible). Defaults to 100.
+
+goto_line(line_number: int) -> None:
+ Moves the window to show the specified line number.
+ Args:
+ line_number: int: The line number to move to.
+
+scroll_down() -> None:
+ Moves the window down by 100 lines.
+ Args:
+ None
+
+scroll_up() -> None:
+ Moves the window up by 100 lines.
+ Args:
+ None
+
+create_file(filename: str) -> None:
+ Creates and opens a new file with the given name.
+ Args:
+ filename: str: The name of the file to create.
+
+edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
+ Edit a file. This will search for `to_replace` in the given file and replace it with `new_content`.
+ Every *to_replace* must *EXACTLY MATCH* the existing source code, character for character, including all comments, docstrings, etc.
+ Include enough lines to make code in `to_replace` unique. `to_replace` should NOT be empty.
+ For example, given a file "/workspace/example.txt" with the following content:
+ ```
+ line 1
+ line 2
+ line 2
+ line 3
+ ```
+ EDITING: If you want to replace the second occurrence of "line 2", you can make `to_replace` unique:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='new line
+ line 3',
+ )
+ This will replace only the second "line 2" with "new line". The first "line 2" will remain unchanged.
+ The resulting file will be:
+ ```
+ line 1
+ line 2
+ new line
+ line 3
+ ```
+ REMOVAL: If you want to remove "line 2" and "line 3", you can set `new_content` to an empty string:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='',
+ )
+ Args:
+ file_name: str: The name of the file to edit.
+ to_replace: str: The content to search for and replace.
+ new_content: str: The new content to replace the old content with.
+
+insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
+ Insert content at the given line number in a file.
+ This will NOT modify the content of the lines before OR after the given line number.
+ For example, if the file has the following content:
+ ```
+ line 1
+ line 2
+ line 3
+ ```
+ and you call `insert_content_at_line('file.txt', 2, 'new line')`, the file will be updated to:
+ ```
+ line 1
+ new line
+ line 2
+ line 3
+ ```
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+append_file(file_name: str, content: str) -> None:
+ Append content to the given file.
+ It appends text `content` to the end of the specified file.
+ Args:
+ file_name: str: The name of the file to edit.
+ content: str: The content to append.
+
+search_dir(search_term: str, dir_path: str = './') -> None:
+ Searches for search_term in all files in dir. If dir is not provided, searches in the current directory.
+ Args:
+ search_term: str: The term to search for.
+ dir_path: Optional[str]: The path to the directory to search.
+
+search_file(search_term: str, file_path: Optional[str] = None) -> None:
+ Searches for search_term in file. If file is not provided, searches in the current open file.
+ Args:
+ search_term: str: The term to search for.
+ file_path: Optional[str]: The path to the file to search.
+
+find_file(file_name: str, dir_path: str = './') -> None:
+ Finds all files with the given name in the specified directory.
+ Args:
+ file_name: str: The name of the file to find.
+ dir_path: Optional[str]: The path to the directory to search.
+
+parse_pdf(file_path: str) -> None:
+ Parses the content of a PDF file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_docx(file_path: str) -> None:
+ Parses the content of a DOCX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_latex(file_path: str) -> None:
+ Parses the content of a LaTeX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_pptx(file_path: str) -> None:
+ Parses the content of a pptx file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include in your response.
+IMPORTANT: Execute code using , , or whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
+
+
+----------
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+
+create_file('app.py')
+
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+
+EDITED_CODE="""from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+ numbers = list(range(1, 11))
+ return str(numbers)
+
+if __name__ == '__main__':
+ app.run(port=5000)"""
+
+insert_content_at_line(
+ 'app.py',
+ 1,
+ EDITED_CODE,
+)
+
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 121[1]+ Exit 1 python3 app.py > server.log 2>&1
+
+ASSISTANT: Looks like the server is running with PID 121 then crashed. Let me check the server log:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+ File "/workspace/app.py", line 2, in
+ from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+
+pip install flask
+
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+ Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+ Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+ Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+ Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+ Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+
+Get the content on "http://127.0.0.1:5000"
+
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+
+kill 124
+
+
+USER:
+OBSERVATION:
+[1]+ Terminated python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+
+open_file('app.py')
+
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+
+edit_file_by_replace(
+ 'app.py',
+ to_replace=" return str(numbers)",
+ new_content=" return '
' + ''.join([f'
{i}
' for i in numbers]) + '
'",
+)
+
+
+USER:
+Observation:
+[File: /workspace/app.py (10 lines total after edit)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7|    return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+Running the updated file:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+Observation:
+[1] 126
+
+ASSISTANT:
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
+
+
+--- END OF EXAMPLE ---
+
+
+NOW, LET'S START!
+
+Install and import pymsgbox==1.0.9 and print its version in /workspace/test.txt. Do not ask me for confirmation at any point.
+
+ENVIRONMENT REMINDER: You have 14 turns left to complete the task. When finished reply with .
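(For readers tracing this fixture: the recorded turns that follow boil down to the sketch below. It assumes pymsgbox exposes __version__, which the observations later in this directory confirm; it is illustrative only and not part of the recorded prompt.)

import subprocess
import sys

# Install the pinned version non-interactively, as the task requires.
subprocess.check_call([sys.executable, "-m", "pip", "install", "pymsgbox==1.0.9"])

import pymsgbox

# Write the installed version to the file named in the task.
with open("/workspace/test.txt", "w") as f:
    f.write(f"pymsgbox version: {pymsgbox.__version__}\n")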
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/prompt_002.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/prompt_002.log
new file mode 100644
index 000000000000..54f39c635ddb
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/prompt_002.log
@@ -0,0 +1,431 @@
+
+
+----------
+
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
+
+print("Hello World!")
+
+The assistant can execute bash commands on behalf of the user by wrapping them with and .
+
+For example, you can list the files in the current directory by ls .
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
+Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+ git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
+If $GITHUB_TOKEN is not set, ask the user to set it.
+
+
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
+open_file(path: str, line_number: int | None = 1, context_lines: int | None = 100) -> None:
+ Opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line.
+ It only shows the first 100 lines by default! Max `context_lines` supported is 2000, use `scroll up/down`
+ to view the file if you want to see more.
+ Args:
+ path: str: The path to the file to open, preferred absolute path.
+ line_number: int | None = 1: The line number to move to. Defaults to 1.
+ context_lines: int | None = 100: Only shows this number of lines in the context window (usually from line 1), with line_number as the center (if possible). Defaults to 100.
+
+goto_line(line_number: int) -> None:
+ Moves the window to show the specified line number.
+ Args:
+ line_number: int: The line number to move to.
+
+scroll_down() -> None:
+ Moves the window down by 100 lines.
+ Args:
+ None
+
+scroll_up() -> None:
+ Moves the window up by 100 lines.
+ Args:
+ None
+
+create_file(filename: str) -> None:
+ Creates and opens a new file with the given name.
+ Args:
+ filename: str: The name of the file to create.
+
+edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
+ Edit a file. This will search for `to_replace` in the given file and replace it with `new_content`.
+ Every *to_replace* must *EXACTLY MATCH* the existing source code, character for character, including all comments, docstrings, etc.
+ Include enough lines to make code in `to_replace` unique. `to_replace` should NOT be empty.
+ For example, given a file "/workspace/example.txt" with the following content:
+ ```
+ line 1
+ line 2
+ line 2
+ line 3
+ ```
+ EDITING: If you want to replace the second occurrence of "line 2", you can make `to_replace` unique:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='new line
+ line 3',
+ )
+ This will replace only the second "line 2" with "new line". The first "line 2" will remain unchanged.
+ The resulting file will be:
+ ```
+ line 1
+ line 2
+ new line
+ line 3
+ ```
+ REMOVAL: If you want to remove "line 2" and "line 3", you can set `new_content` to an empty string:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='',
+ )
+ Args:
+ file_name: str: The name of the file to edit.
+ to_replace: str: The content to search for and replace.
+ new_content: str: The new content to replace the old content with.
+
+insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
+ Insert content at the given line number in a file.
+ This will NOT modify the content of the lines before OR after the given line number.
+ For example, if the file has the following content:
+ ```
+ line 1
+ line 2
+ line 3
+ ```
+ and you call `insert_content_at_line('file.txt', 2, 'new line')`, the file will be updated to:
+ ```
+ line 1
+ new line
+ line 2
+ line 3
+ ```
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+append_file(file_name: str, content: str) -> None:
+ Append content to the given file.
+ It appends text `content` to the end of the specified file.
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+search_dir(search_term: str, dir_path: str = './') -> None:
+ Searches for search_term in all files in dir. If dir is not provided, searches in the current directory.
+ Args:
+ search_term: str: The term to search for.
+ dir_path: Optional[str]: The path to the directory to search.
+
+search_file(search_term: str, file_path: Optional[str] = None) -> None:
+ Searches for search_term in file. If file is not provided, searches in the current open file.
+ Args:
+ search_term: str: The term to search for.
+ file_path: Optional[str]: The path to the file to search.
+
+find_file(file_name: str, dir_path: str = './') -> None:
+ Finds all files with the given name in the specified directory.
+ Args:
+ file_name: str: The name of the file to find.
+ dir_path: Optional[str]: The path to the directory to search.
+
+parse_pdf(file_path: str) -> None:
+ Parses the content of a PDF file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_docx(file_path: str) -> None:
+ Parses the content of a DOCX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_latex(file_path: str) -> None:
+ Parses the content of a LaTex file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_pptx(file_path: str) -> None:
+ Parses the content of a pptx file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include in your response.
+IMPORTANT: Execute code using , , or whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
+
+
+----------
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+
+create_file('app.py')
+
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+
+EDITED_CODE="""from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+ numbers = list(range(1, 11))
+ return str(numbers)
+
+if __name__ == '__main__':
+ app.run(port=5000)"""
+
+insert_content_at_line(
+ 'app.py',
+ 1,
+ EDITED_CODE,
+)
+
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 121[1]+ Exit 1 python3 app.py > server.log 2>&1
+
+ASSISTANT: Looks like the server is running with PID 121 then crashed. Let me check the server log:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+ File "/workspace/app.py", line 2, in
+ from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+
+pip install flask
+
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+ Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+ Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+ Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+ Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+ Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+
+Get the content on "http://127.0.0.1:5000"
+
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+
+kill 124
+
+
+USER:
+OBSERVATION:
+[1]+ Terminated python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+
+open_file('app.py')
+
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+
+edit_file_by_replace(
+ 'app.py',
+ to_replace=" return str(numbers)",
+ new_content=" return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'",
+)
+
+
+USER:
+Observation:
+[File: /workspace/app.py (10 lines total after edit)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+Running the updated file:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+Observation:
+[1] 126
+
+ASSISTANT:
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
+
+
+--- END OF EXAMPLE ---
+
+
+NOW, LET'S START!
+
+----------
+
+Install and import pymsgbox==1.0.9 and print its version in /workspace/test.txt. Do not ask me for confirmation at any point.
+
+----------
+
+Sure! Let's start by installing the `pymsgbox` package.
+
+%pip install pymsgbox==1.0.9
+
+
+----------
+
+OBSERVATION:
+Collecting pymsgbox==1.0.9
+ Downloading PyMsgBox-1.0.9.tar.gz (18 kB)
+ Installing build dependencies ... [?25l- \ | / done
+[?25h Getting requirements to build wheel ... [?25l- done
+[?25h Preparing metadata (pyproject.toml) ... [?25l- done
+[?25hBuilding wheels for collected packages: pymsgbox
+ Building wheel for pymsgbox (pyproject.toml) ... [?25l- done
+[?25h Created wheel for pymsgbox: filename=PyMsgBox-1.0.9-py3-none-any.whl size=7406 sha256=7cc4ff18827e3c2e2bd0780a92ca70fd625dc7645b994b6f3191dfb0660e606e
+ Stored in directory: /home/opendevin/.cache/pip/wheels/85/92/63/e126ee5f33d8f2ed04f96e43ef5df7270a2f331848752e8662
+Successfully built pymsgbox
+Installing collected packages: pymsgbox
+Successfully installed pymsgbox-1.0.9
+
+[notice] A new release of pip is available: 24.1 -> 24.2
+[notice] To update, run: pip install --upgrade pip
+Note: you may need to restart the kernel to use updated packages.
+
+
+ENVIRONMENT REMINDER: You have 13 turns left to complete the task. When finished reply with
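(Aside, for the worked example repeated in these prompts: after the edit_file_by_replace call shown there, the full app.py reads roughly as below. This is reassembled from the example for reference and is not additional fixture content.)

from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
    numbers = list(range(1, 11))
    # Each number becomes its own table row, matching the example edit.
    return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'

if __name__ == '__main__':
    app.run(port=5000)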
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/prompt_003.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/prompt_003.log
new file mode 100644
index 000000000000..2d2e45921813
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/prompt_003.log
@@ -0,0 +1,448 @@
+
+
+----------
+
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
+
+print("Hello World!")
+
+The assistant can execute bash commands on behalf of the user by wrapping them with and .
+
+For example, you can list the files in the current directory by ls .
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
+Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+ git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
+If $GITHUB_TOKEN is not set, ask the user to set it.
+
+
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
+open_file(path: str, line_number: int | None = 1, context_lines: int | None = 100) -> None:
+ Opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line.
+ It only shows the first 100 lines by default! Max `context_lines` supported is 2000, use `scroll up/down`
+ to view the file if you want to see more.
+ Args:
+ path: str: The path to the file to open, preferred absolute path.
+ line_number: int | None = 1: The line number to move to. Defaults to 1.
+ context_lines: int | None = 100: Only shows this number of lines in the context window (usually from line 1), with line_number as the center (if possible). Defaults to 100.
+
+goto_line(line_number: int) -> None:
+ Moves the window to show the specified line number.
+ Args:
+ line_number: int: The line number to move to.
+
+scroll_down() -> None:
+ Moves the window down by 100 lines.
+ Args:
+ None
+
+scroll_up() -> None:
+ Moves the window up by 100 lines.
+ Args:
+ None
+
+create_file(filename: str) -> None:
+ Creates and opens a new file with the given name.
+ Args:
+ filename: str: The name of the file to create.
+
+edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
+ Edit a file. This will search for `to_replace` in the given file and replace it with `new_content`.
+ Every *to_replace* must *EXACTLY MATCH* the existing source code, character for character, including all comments, docstrings, etc.
+ Include enough lines to make code in `to_replace` unique. `to_replace` should NOT be empty.
+ For example, given a file "/workspace/example.txt" with the following content:
+ ```
+ line 1
+ line 2
+ line 2
+ line 3
+ ```
+ EDITING: If you want to replace the second occurrence of "line 2", you can make `to_replace` unique:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='new line
+ line 3',
+ )
+ This will replace only the second "line 2" with "new line". The first "line 2" will remain unchanged.
+ The resulting file will be:
+ ```
+ line 1
+ line 2
+ new line
+ line 3
+ ```
+ REMOVAL: If you want to remove "line 2" and "line 3", you can set `new_content` to an empty string:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='',
+ )
+ Args:
+ file_name: str: The name of the file to edit.
+ to_replace: str: The content to search for and replace.
+ new_content: str: The new content to replace the old content with.
+
+insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
+ Insert content at the given line number in a file.
+ This will NOT modify the content of the lines before OR after the given line number.
+ For example, if the file has the following content:
+ ```
+ line 1
+ line 2
+ line 3
+ ```
+ and you call `insert_content_at_line('file.txt', 2, 'new line')`, the file will be updated to:
+ ```
+ line 1
+ new line
+ line 2
+ line 3
+ ```
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+append_file(file_name: str, content: str) -> None:
+ Append content to the given file.
+ It appends text `content` to the end of the specified file.
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+search_dir(search_term: str, dir_path: str = './') -> None:
+ Searches for search_term in all files in dir. If dir is not provided, searches in the current directory.
+ Args:
+ search_term: str: The term to search for.
+ dir_path: Optional[str]: The path to the directory to search.
+
+search_file(search_term: str, file_path: Optional[str] = None) -> None:
+ Searches for search_term in file. If file is not provided, searches in the current open file.
+ Args:
+ search_term: str: The term to search for.
+ file_path: Optional[str]: The path to the file to search.
+
+find_file(file_name: str, dir_path: str = './') -> None:
+ Finds all files with the given name in the specified directory.
+ Args:
+ file_name: str: The name of the file to find.
+ dir_path: Optional[str]: The path to the directory to search.
+
+parse_pdf(file_path: str) -> None:
+ Parses the content of a PDF file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_docx(file_path: str) -> None:
+ Parses the content of a DOCX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_latex(file_path: str) -> None:
+ Parses the content of a LaTex file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_pptx(file_path: str) -> None:
+ Parses the content of a pptx file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include in your response.
+IMPORTANT: Execute code using , , or whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
+
+
+----------
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+
+create_file('app.py')
+
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+
+EDITED_CODE="""from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+ numbers = list(range(1, 11))
+ return str(numbers)
+
+if __name__ == '__main__':
+ app.run(port=5000)"""
+
+insert_content_at_line(
+ 'app.py',
+ 1,
+ EDITED_CODE,
+)
+
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 121[1]+ Exit 1 python3 app.py > server.log 2>&1
+
+ASSISTANT: Looks like the server is running with PID 121 then crashed. Let me check the server log:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+ File "/workspace/app.py", line 2, in
+ from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+
+pip install flask
+
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+ Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+ Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+ Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+ Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+ Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+
+Get the content on "http://127.0.0.1:5000"
+
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+
+kill 124
+
+
+USER:
+OBSERVATION:
+[1]+ Terminated python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+
+open_file('app.py')
+
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+
+edit_file_by_replace(
+ 'app.py',
+ to_replace=" return str(numbers)",
+ new_content=" return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'",
+)
+
+
+USER:
+Observation:
+[File: /workspace/app.py (10 lines total after edit)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+Running the updated file:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+Observation:
+[1] 126
+
+ASSISTANT:
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
+
+
+--- END OF EXAMPLE ---
+
+
+NOW, LET'S START!
+
+----------
+
+Install and import pymsgbox==1.0.9 and print its version in /workspace/test.txt. Do not ask me for confirmation at any point.
+
+----------
+
+Sure! Let's start by installing the `pymsgbox` package.
+
+%pip install pymsgbox==1.0.9
+
+
+----------
+
+OBSERVATION:
+Collecting pymsgbox==1.0.9
+ Downloading PyMsgBox-1.0.9.tar.gz (18 kB)
+ Installing build dependencies ... [?25l- \ | / done
+[?25h Getting requirements to build wheel ... [?25l- done
+[?25h Preparing metadata (pyproject.toml) ... [?25l- done
+[?25hBuilding wheels for collected packages: pymsgbox
+ Building wheel for pymsgbox (pyproject.toml) ... [?25l- done
+[?25h Created wheel for pymsgbox: filename=PyMsgBox-1.0.9-py3-none-any.whl size=7406 sha256=7cc4ff18827e3c2e2bd0780a92ca70fd625dc7645b994b6f3191dfb0660e606e
+ Stored in directory: /home/opendevin/.cache/pip/wheels/85/92/63/e126ee5f33d8f2ed04f96e43ef5df7270a2f331848752e8662
+Successfully built pymsgbox
+Installing collected packages: pymsgbox
+Successfully installed pymsgbox-1.0.9
+
+[notice] A new release of pip is available: 24.1 -> 24.2
+[notice] To update, run: pip install --upgrade pip
+Note: you may need to restart the kernel to use updated packages.
+
+
+----------
+
+Now, let's import `pymsgbox`, get its version, and write it to `/workspace/test.txt`.
+
+import pymsgbox
+
+version = pymsgbox.__version__
+
+with open('/workspace/test.txt', 'w') as file:
+ file.write(f"pymsgbox version: {version}\n")
+
+
+----------
+
+OBSERVATION:
+[Code executed successfully with no output]
+
+ENVIRONMENT REMINDER: You have 12 turns left to complete the task. When finished reply with
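(The next recorded turn verifies the write with cat. An equivalent Python read-back check, shown only as a sketch, would be the following; the expected content comes from the observation recorded in prompt_004.)

# Read the file back and confirm the version string was written as expected.
with open("/workspace/test.txt") as f:
    content = f.read()

# "pymsgbox version: 1.0.9" is what the recorded run reports.
assert content.strip() == "pymsgbox version: 1.0.9", content
print(content, end="")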
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/prompt_004.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/prompt_004.log
new file mode 100644
index 000000000000..043c096f6117
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/prompt_004.log
@@ -0,0 +1,463 @@
+
+
+----------
+
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
+
+print("Hello World!")
+
+The assistant can execute bash commands on behalf of the user by wrapping them with and .
+
+For example, you can list the files in the current directory by ls .
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
+Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+ git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
+If $GITHUB_TOKEN is not set, ask the user to set it.
+
+
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
+open_file(path: str, line_number: int | None = 1, context_lines: int | None = 100) -> None:
+ Opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line.
+ It only shows the first 100 lines by default! Max `context_lines` supported is 2000, use `scroll up/down`
+ to view the file if you want to see more.
+ Args:
+ path: str: The path to the file to open, preferred absolute path.
+ line_number: int | None = 1: The line number to move to. Defaults to 1.
+ context_lines: int | None = 100: Only shows this number of lines in the context window (usually from line 1), with line_number as the center (if possible). Defaults to 100.
+
+goto_line(line_number: int) -> None:
+ Moves the window to show the specified line number.
+ Args:
+ line_number: int: The line number to move to.
+
+scroll_down() -> None:
+ Moves the window down by 100 lines.
+ Args:
+ None
+
+scroll_up() -> None:
+ Moves the window up by 100 lines.
+ Args:
+ None
+
+create_file(filename: str) -> None:
+ Creates and opens a new file with the given name.
+ Args:
+ filename: str: The name of the file to create.
+
+edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
+ Edit a file. This will search for `to_replace` in the given file and replace it with `new_content`.
+ Every *to_replace* must *EXACTLY MATCH* the existing source code, character for character, including all comments, docstrings, etc.
+ Include enough lines to make code in `to_replace` unique. `to_replace` should NOT be empty.
+ For example, given a file "/workspace/example.txt" with the following content:
+ ```
+ line 1
+ line 2
+ line 2
+ line 3
+ ```
+ EDITING: If you want to replace the second occurrence of "line 2", you can make `to_replace` unique:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='new line
+ line 3',
+ )
+ This will replace only the second "line 2" with "new line". The first "line 2" will remain unchanged.
+ The resulting file will be:
+ ```
+ line 1
+ line 2
+ new line
+ line 3
+ ```
+ REMOVAL: If you want to remove "line 2" and "line 3", you can set `new_content` to an empty string:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='',
+ )
+ Args:
+ file_name: str: The name of the file to edit.
+ to_replace: str: The content to search for and replace.
+ new_content: str: The new content to replace the old content with.
+
+insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
+ Insert content at the given line number in a file.
+ This will NOT modify the content of the lines before OR after the given line number.
+ For example, if the file has the following content:
+ ```
+ line 1
+ line 2
+ line 3
+ ```
+ and you call `insert_content_at_line('file.txt', 2, 'new line')`, the file will be updated to:
+ ```
+ line 1
+ new line
+ line 2
+ line 3
+ ```
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+append_file(file_name: str, content: str) -> None:
+ Append content to the given file.
+ It appends text `content` to the end of the specified file.
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+search_dir(search_term: str, dir_path: str = './') -> None:
+ Searches for search_term in all files in dir. If dir is not provided, searches in the current directory.
+ Args:
+ search_term: str: The term to search for.
+ dir_path: Optional[str]: The path to the directory to search.
+
+search_file(search_term: str, file_path: Optional[str] = None) -> None:
+ Searches for search_term in file. If file is not provided, searches in the current open file.
+ Args:
+ search_term: str: The term to search for.
+ file_path: Optional[str]: The path to the file to search.
+
+find_file(file_name: str, dir_path: str = './') -> None:
+ Finds all files with the given name in the specified directory.
+ Args:
+ file_name: str: The name of the file to find.
+ dir_path: Optional[str]: The path to the directory to search.
+
+parse_pdf(file_path: str) -> None:
+ Parses the content of a PDF file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_docx(file_path: str) -> None:
+ Parses the content of a DOCX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_latex(file_path: str) -> None:
+ Parses the content of a LaTex file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_pptx(file_path: str) -> None:
+ Parses the content of a pptx file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE , , or per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include in your response.
+IMPORTANT: Execute code using , , or whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
+
+
+----------
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+
+create_file('app.py')
+
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+
+EDITED_CODE="""from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+ numbers = list(range(1, 11))
+ return str(numbers)
+
+if __name__ == '__main__':
+ app.run(port=5000)"""
+
+insert_content_at_line(
+ 'app.py',
+ 1,
+ EDITED_CODE,
+)
+
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 121[1]+ Exit 1 python3 app.py > server.log 2>&1
+
+ASSISTANT: Looks like the server is running with PID 121 then crashed. Let me check the server log:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+ File "/workspace/app.py", line 2, in
+ from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+
+pip install flask
+
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+ Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+ Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+ Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+ Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+ Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+
+cat server.log
+
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+
+Get the content on "http://127.0.0.1:5000"
+
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+
+kill 124
+
+
+USER:
+OBSERVATION:
+[1]+ Terminated python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+
+open_file('app.py')
+
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+
+edit_file_by_replace(
+ 'app.py',
+ to_replace=" return str(numbers)",
+ new_content=" return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'",
+)
+
+
+USER:
+Observation:
+[File: /workspace/app.py (10 lines total after edit)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 7). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+Running the updated file:
+
+python3 app.py > server.log 2>&1 &
+
+
+USER:
+Observation:
+[1] 126
+
+ASSISTANT:
+The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
+
+
+--- END OF EXAMPLE ---
+
+
+NOW, LET'S START!
+
+----------
+
+Install and import pymsgbox==1.0.9 and print its version in /workspace/test.txt. Do not ask me for confirmation at any point.
+
+----------
+
+Sure! Let's start by installing the `pymsgbox` package.
+
+%pip install pymsgbox==1.0.9
+
+
+----------
+
+OBSERVATION:
+Collecting pymsgbox==1.0.9
+ Downloading PyMsgBox-1.0.9.tar.gz (18 kB)
+ Installing build dependencies ... [?25l- \ | / done
+[?25h Getting requirements to build wheel ... [?25l- done
+[?25h Preparing metadata (pyproject.toml) ... [?25l- done
+[?25hBuilding wheels for collected packages: pymsgbox
+ Building wheel for pymsgbox (pyproject.toml) ... [?25l- done
+[?25h Created wheel for pymsgbox: filename=PyMsgBox-1.0.9-py3-none-any.whl size=7406 sha256=7cc4ff18827e3c2e2bd0780a92ca70fd625dc7645b994b6f3191dfb0660e606e
+ Stored in directory: /home/opendevin/.cache/pip/wheels/85/92/63/e126ee5f33d8f2ed04f96e43ef5df7270a2f331848752e8662
+Successfully built pymsgbox
+Installing collected packages: pymsgbox
+Successfully installed pymsgbox-1.0.9
+
+[notice] A new release of pip is available: 24.1 -> 24.2
+[notice] To update, run: pip install --upgrade pip
+Note: you may need to restart the kernel to use updated packages.
+
+
+----------
+
+Now, let's import `pymsgbox`, get its version, and write it to `/workspace/test.txt`.
+
+import pymsgbox
+
+version = pymsgbox.__version__
+
+with open('/workspace/test.txt', 'w') as file:
+ file.write(f"pymsgbox version: {version}\n")
+
+
+----------
+
+OBSERVATION:
+[Code executed successfully with no output]
+
+----------
+
+Let's verify the content of `/workspace/test.txt` to ensure the version information was written correctly.
+
+cat /workspace/test.txt
+
+
+----------
+
+OBSERVATION:
+pymsgbox version: 1.0.9
+
+opendevin@docker-desktop:/workspace $
+[Command -1 finished with exit code 0]
+
+ENVIRONMENT REMINDER: You have 11 turns left to complete the task. When finished reply with
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/response_001.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/response_001.log
new file mode 100644
index 000000000000..3d96c1fdb215
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/response_001.log
@@ -0,0 +1,4 @@
+Sure! Let's start by installing the `pymsgbox` package.
+
+
+%pip install pymsgbox==1.0.9
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/response_002.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/response_002.log
new file mode 100644
index 000000000000..1886ba712d23
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/response_002.log
@@ -0,0 +1,9 @@
+Now, let's import `pymsgbox`, get its version, and write it to `/workspace/test.txt`.
+
+
+import pymsgbox
+
+version = pymsgbox.__version__
+
+with open('/workspace/test.txt', 'w') as file:
+ file.write(f"pymsgbox version: {version}\n")
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/response_003.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/response_003.log
new file mode 100644
index 000000000000..26fa94b072ac
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/response_003.log
@@ -0,0 +1,4 @@
+Let's verify the content of `/workspace/test.txt` to ensure the version information was written correctly.
+
+
+cat /workspace/test.txt
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/response_004.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/response_004.log
new file mode 100644
index 000000000000..bbebebe77ef9
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_ipython_module/response_004.log
@@ -0,0 +1,3 @@
+The version of `pymsgbox` has been successfully written to `/workspace/test.txt`. If you have any further tasks or questions, feel free to ask!
+
+
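(The prompt_NNN.log / response_NNN.log pairs above are replayed by the integration tests in place of a live LLM. As a rough illustration of that pattern only, not OpenDevin's actual mock harness, a replay stub could look like this.)

from pathlib import Path

class ReplayLLM:
    """Toy stand-in that returns recorded completions in order.

    Illustration of the fixture pattern only; the real harness lives
    under tests/integration and matches prompts more carefully.
    """

    def __init__(self, fixture_dir: str):
        # response_001.log, response_002.log, ... sorted into replay order.
        self.responses = [
            p.read_text() for p in sorted(Path(fixture_dir).glob("response_*.log"))
        ]
        self.turn = 0

    def complete(self, prompt: str) -> str:
        # Each agent turn consumes the next recorded completion.
        response = self.responses[self.turn]
        self.turn += 1
        return response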
diff --git a/tests/integration/mock/eventstream_runtime/CodeActAgent/test_write_simple_script/prompt_001.log b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_write_simple_script/prompt_001.log
new file mode 100644
index 000000000000..ddb157a1b007
--- /dev/null
+++ b/tests/integration/mock/eventstream_runtime/CodeActAgent/test_write_simple_script/prompt_001.log
@@ -0,0 +1,401 @@
+
+
+----------
+
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
+The assistant can use an interactive Python (Jupyter Notebook) environment, executing code with .
+
+print("Hello World!")
+
+The assistant can execute bash commands on behalf of the user by wrapping them with and .
+
+For example, you can list the files in the current directory by ls .
+Important, however: do not run interactive commands. You do not have access to stdin.
+Also, you need to handle commands that may run indefinitely and not return a result. For such cases, you should redirect the output to a file and run the command in the background to avoid blocking the execution.
+For example, to run a Python script that might run indefinitely without returning immediately, you can use the following format: python3 app.py > server.log 2>&1 &
+Also, if a command execution result saying like: Command: "npm start" timed out. Sending SIGINT to the process, you should also retry with running the command in the background.
+The assistant can browse the Internet with and .
+For example, Tell me the usa's president using google search .
+Or Tell me what is in http://example.com .
+The assistant can install Python packages using the %pip magic command in an IPython environment by using the following syntax: %pip install [package needed] and should always import packages and define variables before starting to use them.
+To interact with GitHub, use the $GITHUB_TOKEN environment variable.
+For example, to push a branch `my_branch` to the GitHub repo `owner/repo`:
+ git push https://$GITHUB_TOKEN@github.com/owner/repo.git my_branch
+If $GITHUB_TOKEN is not set, ask the user to set it.
+
+
+Apart from the standard Python library, the assistant can also use the following functions (already imported) in environment:
+open_file(path: str, line_number: int | None = 1, context_lines: int | None = 100) -> None:
+ Opens the file at the given path in the editor. If line_number is provided, the window will be moved to include that line.
+ It only shows the first 100 lines by default! Max `context_lines` supported is 2000, use `scroll up/down`
+ to view the file if you want to see more.
+ Args:
+ path: str: The path to the file to open, preferred absolute path.
+ line_number: int | None = 1: The line number to move to. Defaults to 1.
+ context_lines: int | None = 100: Only shows this number of lines in the context window (usually from line 1), with line_number as the center (if possible). Defaults to 100.
+
+goto_line(line_number: int) -> None:
+ Moves the window to show the specified line number.
+ Args:
+ line_number: int: The line number to move to.
+
+scroll_down() -> None:
+ Moves the window down by 100 lines.
+ Args:
+ None
+
+scroll_up() -> None:
+ Moves the window up by 100 lines.
+ Args:
+ None
+
+create_file(filename: str) -> None:
+ Creates and opens a new file with the given name.
+ Args:
+ filename: str: The name of the file to create.
+
+edit_file_by_replace(file_name: str, to_replace: str, new_content: str) -> None:
+ Edit a file. This will search for `to_replace` in the given file and replace it with `new_content`.
+ Every *to_replace* must *EXACTLY MATCH* the existing source code, character for character, including all comments, docstrings, etc.
+ Include enough lines to make code in `to_replace` unique. `to_replace` should NOT be empty.
+ For example, given a file "/workspace/example.txt" with the following content:
+ ```
+ line 1
+ line 2
+ line 2
+ line 3
+ ```
+ EDITING: If you want to replace the second occurrence of "line 2", you can make `to_replace` unique:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='new line
+ line 3',
+ )
+ This will replace only the second "line 2" with "new line". The first "line 2" will remain unchanged.
+ The resulting file will be:
+ ```
+ line 1
+ line 2
+ new line
+ line 3
+ ```
+ REMOVAL: If you want to remove "line 2" and "line 3", you can set `new_content` to an empty string:
+ edit_file_by_replace(
+ '/workspace/example.txt',
+ to_replace='line 2
+ line 3',
+ new_content='',
+ )
+ Args:
+ file_name: str: The name of the file to edit.
+ to_replace: str: The content to search for and replace.
+ new_content: str: The new content to replace the old content with.
+
+insert_content_at_line(file_name: str, line_number: int, content: str) -> None:
+ Insert content at the given line number in a file.
+ This will NOT modify the content of the lines before OR after the given line number.
+ For example, if the file has the following content:
+ ```
+ line 1
+ line 2
+ line 3
+ ```
+ and you call `insert_content_at_line('file.txt', 2, 'new line')`, the file will be updated to:
+ ```
+ line 1
+ new line
+ line 2
+ line 3
+ ```
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+append_file(file_name: str, content: str) -> None:
+ Append content to the given file.
+ It appends text `content` to the end of the specified file.
+ Args:
+ file_name: str: The name of the file to edit.
+ line_number: int: The line number (starting from 1) to insert the content after.
+ content: str: The content to insert.
+
+search_dir(search_term: str, dir_path: str = './') -> None:
+ Searches for search_term in all files in dir. If dir is not provided, searches in the current directory.
+ Args:
+ search_term: str: The term to search for.
+ dir_path: Optional[str]: The path to the directory to search.
+
+search_file(search_term: str, file_path: Optional[str] = None) -> None:
+ Searches for search_term in file. If file is not provided, searches in the current open file.
+ Args:
+ search_term: str: The term to search for.
+ file_path: Optional[str]: The path to the file to search.
+
+find_file(file_name: str, dir_path: str = './') -> None:
+ Finds all files with the given name in the specified directory.
+ Args:
+ file_name: str: The name of the file to find.
+ dir_path: Optional[str]: The path to the directory to search.
+
+parse_pdf(file_path: str) -> None:
+ Parses the content of a PDF file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_docx(file_path: str) -> None:
+ Parses the content of a DOCX file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_latex(file_path: str) -> None:
+ Parses the content of a LaTex file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+parse_pptx(file_path: str) -> None:
+ Parses the content of a pptx file and prints it.
+ Args:
+ file_path: str: The path to the file to open.
+
+Please note that THE `edit_file_by_replace`, `append_file` and `insert_content_at_line` FUNCTIONS REQUIRE PROPER INDENTATION. If the assistant would like to add the line ' print(x)', it must fully write that out, with all those spaces before the code! Indentation is important and code that is not indented correctly will fail and require fixing before it can be run.
+
+Responses should be concise.
+The assistant should attempt fewer things at a time instead of putting too many commands OR too much code in one "execute" block.
+Include ONLY ONE <execute_ipython>, <execute_bash>, or <execute_browse> per response, unless the assistant is finished with the task or needs more input or action from the user in order to proceed.
+If the assistant is finished with the task you MUST include <finish></finish> in your response.
+IMPORTANT: Execute code using <execute_ipython>, <execute_bash>, or <execute_browse> whenever possible.
+When handling files, try to use full paths and pwd to avoid errors.
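+For example (a sketch; the path is a hypothetical example):
+pwd  # run in a bash block; suppose it prints /workspace
+open_file('/workspace/app.py')  # then refer to the file by its full path rather than 'app.py'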
+
+
+----------
+
+Here is an example of how you can interact with the environment for task solving:
+
+--- START OF EXAMPLE ---
+
+USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
+
+ASSISTANT:
+Sure! Let me create a Python file `app.py`:
+<execute_ipython>
+create_file('app.py')
+</execute_ipython>
+
+USER:
+OBSERVATION:
+[File: /workspace/app.py (1 lines total)]
+(this is the beginning of the file)
+1|
+(this is the end of the file)
+[File app.py created.]
+
+ASSISTANT:
+Now I will write the Python code for starting a web server and save it to the file `app.py`:
+<execute_ipython>
+EDITED_CODE="""from flask import Flask
+app = Flask(__name__)
+
+@app.route('/')
+def index():
+ numbers = list(range(1, 11))
+ return str(numbers)
+
+if __name__ == '__main__':
+ app.run(port=5000)"""
+
+insert_content_at_line(
+ 'app.py',
+ 1,
+ EDITED_CODE,
+)
+</execute_ipython>
+
+USER:
+OBSERVATION:
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
+
+ASSISTANT:
+I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
+<execute_bash>
+python3 app.py > server.log 2>&1 &
+</execute_bash>
+
+USER:
+OBSERVATION:
+[1] 121[1]+ Exit 1 python3 app.py > server.log 2>&1
+
+ASSISTANT: Looks like the server started with PID 121 and then crashed. Let me check the server log:
+<execute_bash>
+cat server.log
+</execute_bash>
+
+USER:
+OBSERVATION:
+Traceback (most recent call last):
+ File "/workspace/app.py", line 2, in <module>
+ from flask import Flask
+ModuleNotFoundError: No module named 'flask'
+
+ASSISTANT:
+It seems that Flask is not installed. Let me install Flask for you:
+<execute_bash>
+pip install flask
+</execute_bash>
+
+USER:
+OBSERVATION:
+Defaulting to user installation because normal site-packages is not writeable
+Collecting flask
+ Using cached flask-3.0.3-py3-none-any.whl (101 kB)
+Collecting blinker>=1.6.2
+ Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
+Collecting Werkzeug>=3.0.0
+ Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
+Collecting click>=8.1.3
+ Using cached click-8.1.7-py3-none-any.whl (97 kB)
+Collecting itsdangerous>=2.1.2
+ Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
+Requirement already satisfied: Jinja2>=3.1.2 in /home/opendevin/.local/lib/python3.10/site-packages (from flask) (3.1.3)
+Requirement already satisfied: MarkupSafe>=2.0 in /home/opendevin/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
+Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
+Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
+
+ASSISTANT:
+Now that Flask is installed, let me run the Python file again:
+<execute_bash>
+python3 app.py > server.log 2>&1 &
+</execute_bash>
+
+USER:
+OBSERVATION:
+[1] 124
+
+ASSISTANT:
+Let me check the server log again:
+<execute_bash>
+cat server.log
+</execute_bash>
+
+USER:
+OBSERVATION:
+* Serving Flask app 'app'
+ * Debug mode: off
+WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
+ * Running on http://127.0.0.1:5000
+Press CTRL+C to quit
+
+ASSISTANT:
+The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
+
+USER: Now browse the newly started server's homepage and show me the content.
+
+ASSISTANT:
+Sure! Let me browse the server's homepage at http://127.0.0.1:5000:
+<execute_browse>
+Get the content on "http://127.0.0.1:5000"
+</execute_browse>
+
+USER:
+OBSERVATION:
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
+
+ASSISTANT:
+The content of the server's homepage is "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]". If you have any further questions, feel free to ask!
+
+USER: Now kill the server, make it display the numbers in a table format.
+
+ASSISTANT:
+Sure! Let me stop the server first:
+<execute_bash>
+kill 124
+</execute_bash>
+
+USER:
+OBSERVATION:
+[1]+ Terminated python3 app.py > server.log 2>&1
+
+ASSISTANT:
+The server has been stopped. Let me open the Python file and modify it.
+<execute_ipython>
+open_file('app.py')
+</execute_ipython>
+
+USER:
+[File: /workspace/app.py (10 lines total)]
+(this is the beginning of the file)
+1|from flask import Flask
+2|app = Flask(__name__)
+3|
+4|@app.route('/')
+5|def index():
+6| numbers = list(range(1, 11))
+7| return str(numbers)
+8|
+9|if __name__ == '__main__':
+10| app.run(port=5000)
+(this is the end of the file)
+
+ASSISTANT:
+I should edit the file to display the numbers in a table format. I should include correct indentation. Let me update the file:
+<execute_ipython>
+edit_file_by_replace(
+ 'app.py',
+ to_replace=" return str(numbers)",
+ new_content=" return '