Commit

update notebooks (#309)
Replace .run -> .invoke
Update .prompt -> .get_prompts()[0]
Update to use gpt-4o everywhere
eyurtsev authored Jul 20, 2024
1 parent 8c7e38d commit 67f3332
Showing 11 changed files with 611 additions and 563 deletions.
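
The first two renames in the commit message are mechanical, so they can be illustrated with a small text rewrite. The following is a minimal sketch, not the tooling used for this commit: `migrate_source` and its two regular expressions are hypothetical names introduced here, and they only cover the call shapes that appear in the hunks of this diff (`chain.run(` and `chain.prompt`) on a variable literally named `chain`. The model switch to gpt-4o is not handled.

```python
import re

# Hypothetical helper, not part of Kor or LangChain. It rewrites the two
# call patterns named in the commit message inside a plain source string.
_RUN_CALL = re.compile(r"\bchain\.run\(")
_PROMPT_ATTR = re.compile(r"\bchain\.prompt\b")


def migrate_source(source: str) -> str:
    """Return `source` with `chain.run(` and `chain.prompt` rewritten."""
    # `.run(...)` becomes `.invoke(...)`; the arguments are left untouched.
    source = _RUN_CALL.sub("chain.invoke(", source)
    # `.prompt` becomes `.get_prompts()[0]`; the trailing word boundary
    # keeps longer attribute names such as `chain.prompts` unchanged.
    source = _PROMPT_ATTR.sub("chain.get_prompts()[0]", source)
    return source


if __name__ == "__main__":
    old = 'print(chain.prompt.format_prompt(text="[user input]").to_string())'
    print(migrate_source(old))
```

The sketch operates on unescaped cell source. Applying it to a raw `.ipynb` file would presumably mean loading the JSON first and rewriting each code cell's `source` strings, then reviewing the result by hand.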
206 changes: 159 additions & 47 deletions docs/source/apis.ipynb

Large diffs are not rendered by default.

693 changes: 338 additions & 355 deletions docs/source/document_extraction.ipynb

Large diffs are not rendered by default.

9 changes: 7 additions & 2 deletions docs/source/guidelines.ipynb
@@ -9,14 +9,19 @@
"\n",
"`Kor` is a wrapper around LLMs to help with information extraction.\n",
"\n",
"*Kor* is best used with LLMs that do **NOT** natively support function calling.\n",
"\n",
"If you're working with a chat model that **does** support native function calling, please read through\n",
"this guide first (https://python.langchain.com/v0.2/docs/how_to/tool_calling/).\n",
"\n",
"The quality of the results depends on a lot of factors. \n",
"\n",
"Here are a few things to experiment with to improve quality:\n",
"\n",
"* Add more examples. Diverse examples can help, including examples where nothing should be extracted.\n",
"* Improve the descriptions of the attributes.\n",
"* If working with multi-paragraph text, specify an `input_formatter` of `\"triple_quotes\"` when creating the chain.\n",
"* Try a better model (e.g., text-davinci-003, gpt-4).\n",
"* Try a better model.\n",
"* Break the schema into a few smaller schemas, run separate extractions, and merge the results.\n",
"* If possible to flatten the schema, and use a CSV encoding instead of a JSON encoding.\n",
"* Add verification/correction steps (ask an LLM to correct or verify the results of the extraction).\n",
@@ -44,7 +49,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.1"
"version": "3.11.4"
}
},
"nbformat": 4,
73 changes: 30 additions & 43 deletions docs/source/nested_objects.ipynb
@@ -14,24 +14,15 @@
},
{
"cell_type": "code",
"execution_count": 24,
"execution_count": 1,
"id": "0b4597b2-2a43-4491-8830-bf9f79428074",
"metadata": {
"nbsphinx": "hidden",
"tags": [
"remove-cell"
]
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The autoreload extension is already loaded. To reload it, use:\n",
" %reload_ext autoreload\n"
]
}
],
"outputs": [],
"source": [
"%load_ext autoreload\n",
"%autoreload 2\n",
@@ -43,7 +34,7 @@
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 2,
"id": "718c66a7-6186-4ed8-87e9-5ed28e3f209e",
"metadata": {
"tags": []
@@ -57,7 +48,7 @@
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 3,
"id": "9bc98f35-ea5f-4b74-a32e-a300a22c0c89",
"metadata": {
"tags": []
@@ -83,7 +74,7 @@
},
{
"cell_type": "code",
"execution_count": 27,
"execution_count": 4,
"id": "f75990e6-5973-4618-9f15-f3b60a14bfa5",
"metadata": {
"tags": []
@@ -147,7 +138,7 @@
},
{
"cell_type": "code",
"execution_count": 28,
"execution_count": 5,
"id": "54a199a5-24b4-442c-8907-1449e437a880",
"metadata": {
"tags": []
@@ -161,7 +152,7 @@
},
{
"cell_type": "code",
"execution_count": 29,
"execution_count": 6,
"id": "193e257b-df01-45ec-af77-076d2070533b",
"metadata": {
"tags": []
@@ -178,20 +169,20 @@
" 'to_address': {'city': 'New York', 'state': 'NY', 'country': 'USA'}}]}"
]
},
"execution_count": 29,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\n",
"chain.invoke(\n",
" \"Alice Doe moved from New York to Boston, MA while Bob Smith did the opposite.\"\n",
")[\"data\"]"
]
},
{
"cell_type": "code",
"execution_count": 30,
"execution_count": 8,
"id": "c8295f36-f986-4db2-97bc-ef2e6cdbcc87",
"metadata": {
"tags": []
@@ -201,29 +192,24 @@
"data": {
"text/plain": [
"{'information': [{'person_name': 'Alice Doe',\n",
" 'from_address': {'city': 'New York', 'state': 'NY', 'country': 'USA'},\n",
" 'to_address': {'city': 'Boston', 'state': 'MA', 'country': 'USA'}},\n",
" 'from_address': {'city': 'New York', 'country': 'USA'},\n",
" 'to_address': {'city': 'Boston', 'country': 'USA'}},\n",
" {'person_name': 'Bob Smith',\n",
" 'from_address': {'city': 'New York', 'state': 'NY', 'country': 'USA'},\n",
" 'to_address': {'city': 'Boston', 'state': 'MA', 'country': 'USA'}},\n",
" 'from_address': {'city': 'New York', 'country': 'USA'},\n",
" 'to_address': {'city': 'Boston', 'country': 'USA'}},\n",
" {'person_name': 'Andrew',\n",
" 'to_address': {'city': 'Boston', 'state': 'MA', 'country': 'USA'}},\n",
" {'person_name': 'Joana',\n",
" 'to_address': {'city': 'Boston', 'state': 'MA', 'country': 'USA'}},\n",
" {'person_name': 'Paul',\n",
" 'to_address': {'city': 'Boston', 'state': 'MA', 'country': 'USA'}},\n",
" {'person_name': 'Betty',\n",
" 'from_address': {'city': 'Boston', 'state': 'MA', 'country': 'USA'},\n",
" 'to_address': {'city': 'New York', 'state': 'NY', 'country': 'USA'}}]}"
" 'to_address': {'city': 'Boston', 'country': 'USA'}},\n",
" {'person_name': 'Joana', 'to_address': {'city': 'Boston', 'country': 'USA'}},\n",
" {'person_name': 'Paul', 'to_address': {'city': 'Boston', 'country': 'USA'}}]}"
]
},
"execution_count": 30,
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\n",
"chain.invoke(\n",
" \"Alice Doe and Bob Smith moved from New York to Boston. Andrew was 12 years\"\n",
" \" old. He also moved to Boston. So did Joana and Paul. Betty did the opposite.\"\n",
")[\"data\"]"
@@ -247,7 +233,7 @@
},
{
"cell_type": "code",
"execution_count": 31,
"execution_count": 9,
"id": "e528f20c-46d3-40b6-b1ba-11024002deb8",
"metadata": {
"tags": []
@@ -300,7 +286,7 @@
},
{
"cell_type": "code",
"execution_count": 32,
"execution_count": 10,
"id": "23b81b06-118a-4ebe-9e20-5df1bf269ce3",
"metadata": {
"tags": []
@@ -312,7 +298,7 @@
},
{
"cell_type": "code",
"execution_count": 33,
"execution_count": 11,
"id": "29219fae-41cb-4235-92fa-07b16ded2296",
"metadata": {
"tags": []
@@ -325,19 +311,20 @@
" 'from_address': [{'city': 'New York', 'state': 'NY', 'country': 'USA'}],\n",
" 'to_address': [{'city': 'Boston', 'state': 'MA', 'country': 'USA'}]},\n",
" {'person_name': 'Bob Smith',\n",
" 'from_address': [{'city': 'New York', 'state': 'NY', 'country': 'USA'},\n",
" {'city': 'Boston', 'state': 'MA', 'country': 'USA'}],\n",
" 'to_address': [{'city': 'Boston', 'state': 'MA', 'country': 'USA'},\n",
" {'city': 'LA', 'state': 'CA', 'country': 'USA'}]}]}"
" 'from_address': [{'city': 'New York', 'state': 'NY', 'country': 'USA'}],\n",
" 'to_address': [{'city': 'Boston', 'state': 'MA', 'country': 'USA'}]},\n",
" {'person_name': 'Bob Smith',\n",
" 'from_address': [{'city': 'Boston', 'state': 'MA', 'country': 'USA'}],\n",
" 'to_address': [{'city': 'LA', 'state': 'CA', 'country': 'USA'}]}]}"
]
},
"execution_count": 33,
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\n",
"chain.invoke(\n",
" \"Alice Doe and Bob Smith moved from New York to Boston. Bob later moved to LA.\"\n",
")[\"data\"]"
]
@@ -359,7 +346,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
"version": "3.11.4"
}
},
"nbformat": 4,
28 changes: 10 additions & 18 deletions docs/source/objects.ipynb
@@ -171,7 +171,7 @@
}
],
"source": [
"print(chain.prompt.format_prompt(text=\"[user input]\").to_string())"
"print(chain.get_prompts()[0].format_prompt(text=\"[user input]\").to_string())"
]
},
{
@@ -194,14 +194,6 @@
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/eugene/.pyenv/versions/3.9.6/envs/kor/lib/python3.9/site-packages/langchain_core/_api/deprecation.py:119: LangChainDeprecationWarning: The method `Chain.run` was deprecated in langchain 0.1.0 and will be removed in 0.2.0. Use invoke instead.\n",
" warn_deprecated(\n"
]
},
{
"data": {
"text/plain": [
@@ -214,7 +206,7 @@
}
],
"source": [
"chain.run(\"Eugene was 18 years old a long time ago.\")[\"data\"]"
"chain.invoke(\"Eugene was 18 years old a long time ago.\")[\"data\"]"
]
},
{
@@ -236,7 +228,7 @@
"source": [
"chain = create_extraction_chain(llm, schema)\n",
"print(\n",
" chain.run(\n",
" chain.invoke(\n",
" \"My name is Bob Alice and my phone number is (123)-444-9999. I found my true love one\"\n",
" \" on a blue sunday. Her number was (333)1232832. Her name was Moana Sunrise and she was 10 years old.\"\n",
" )[\"data\"]\n",
@@ -271,7 +263,7 @@
}
],
"source": [
"chain.run(\n",
"chain.invoke(\n",
" \"My phone number is (123)-444-9999. I found my true love one on a blue sunday.\"\n",
" \" Her number was (333)1232832\"\n",
")[\"data\"]"
@@ -333,7 +325,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 10,
"id": "5c694d79-e72c-4712-b891-111bc0279032",
"metadata": {
"tags": []
@@ -347,14 +339,14 @@
" 'age': '20'}]}"
]
},
"execution_count": 12,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain = create_extraction_chain(llm, schema)\n",
"chain.run(\n",
"chain.invoke(\n",
" \"My name is Bob Alice and my phone number is (123)-444-9999. I found my true love one\"\n",
" \" on a blue sunday. Her number was (333)1232832. Her name was Moana Sunrise and she was 20 years old.\"\n",
")[\"data\"]"
@@ -370,7 +362,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 11,
"id": "a2944e8c-4630-4b29-b505-b2ca6fceba01",
"metadata": {
"tags": []
@@ -408,7 +400,7 @@
}
],
"source": [
"print(chain.prompt.format_prompt(text=\"[user input]\").to_string())"
"print(chain.get_prompts()[0].format_prompt(text=\"[user input]\").to_string())"
]
}
],
@@ -428,7 +420,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
"version": "3.11.4"
}
},
"nbformat": 4,
6 changes: 3 additions & 3 deletions docs/source/prompt.ipynb
@@ -154,7 +154,7 @@
"\n",
"chain = create_extraction_chain(llm, schema, instruction_template=instruction_template)\n",
"\n",
"print(chain.prompt.format_prompt(text=\"hello\").to_string())"
"print(chain.get_prompts()[0].format_prompt(text=\"hello\").to_string())"
]
},
{
@@ -259,7 +259,7 @@
" type_descriptor=CatType(),\n",
")\n",
"\n",
"print(chain.prompt.format_prompt(text=\"hello\").to_string())"
"print(chain.get_prompts()[0].format_prompt(text=\"hello\").to_string())"
]
}
],
@@ -279,7 +279,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
"version": "3.11.4"
}
},
"nbformat": 4,