Merge pull request #81 from mgax/image-description-openai
Image description with OpenAI
tomusher authored Mar 26, 2024
2 parents ded87e6 + 7cdf140 commit 49eada0
Showing 37 changed files with 1,170 additions and 2,620 deletions.
1 change: 1 addition & 0 deletions docs/.pages
@@ -1,6 +1,7 @@
nav:
- installation.md
- editor-integration.md
- images-integration.md
- ai-backends.md
- text-splitting.md
- contributing.md
41 changes: 39 additions & 2 deletions docs/ai-backends.md
@@ -2,11 +2,11 @@

Wagtail AI can be configured to use different backends to support different AI services.

Currently the only (and default) backend available in Wagtail AI is the ["LLM" backend](#the-llm-backend).
The default backend for text completion available in Wagtail AI is the ["LLM" backend](#the-llm-backend). To enable [image description](../images-integration/), you can use the ["OpenAI" backend](#the-openai-backend).

## The "LLM" backend

This backend uses the ["LLM" library](https://llm.datasette.io/en/stable/) which offers support for many AI services through plugins.
This backend uses the ["LLM" library](https://llm.datasette.io/en/stable/) which offers support for many AI services through plugins. At the moment it only supports [text completion](../editor-integration/).

By default, it is configured to use OpenAI's `gpt-3.5-turbo` model.

@@ -155,3 +155,40 @@ You can find the "LLM" library specific instructions at: https://llm.datasette.i
}
}
```

## The "OpenAI" backend

Wagtail AI includes a backend for OpenAI that supports both [text completion](../editor-integration/) and [image description](../images-integration/).

To use the OpenAI backend, you need an API key, which must be set in the `OPENAI_API_KEY` environment variable. Then, configure it in your Django project settings:

```python
WAGTAIL_AI = {
    "BACKENDS": {
        "default": {
            "CLASS": "wagtail_ai.ai.openai.OpenAIBackend",
            "CONFIG": {
                "MODEL_ID": "gpt-4",
            },
        },
    },
}
```
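
Since the key is read from the environment rather than from `WAGTAIL_AI`, it can be handy to confirm that the variable is actually present before relying on the backend. A minimal sketch of such a check; the helper name `openai_key_is_set` is hypothetical and is not part of Wagtail AI:

```python
import os


def openai_key_is_set(environ=None):
    """Return True when OPENAI_API_KEY is present and not blank.

    `environ` defaults to the process environment; passing a plain dict
    makes the check easy to test. Hypothetical helper, not a Wagtail AI API.
    """
    env = os.environ if environ is None else environ
    return bool(env.get("OPENAI_API_KEY", "").strip())
```

You could call this from a deployment script or a custom Django system check and log a warning when it returns `False`.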

### Specifying another OpenAI model

The OpenAI backend supports the use of custom models. For newer models that are not known to Wagtail AI, you must also specify a token limit:

```python
WAGTAIL_AI = {
    "BACKENDS": {
        "vision": {
            "CLASS": "wagtail_ai.ai.openai.OpenAIBackend",
            "CONFIG": {
                "MODEL_ID": "gpt-4-vision-preview",
                "TOKEN_LIMIT": 300,
            },
        },
    },
}
```
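
The rule above (an unknown model needs a `TOKEN_LIMIT`) can be checked against a settings dictionary with a few lines of Python. This is only a sketch: which models Wagtail AI already knows is not listed here, so `known_models` is a placeholder set that the caller supplies, and `backends_missing_token_limit` is a hypothetical helper name.

```python
def backends_missing_token_limit(wagtail_ai, known_models):
    """List backend names that use a model outside `known_models`
    without setting TOKEN_LIMIT in their CONFIG.

    `known_models` is an assumption supplied by the caller; it does not
    come from Wagtail AI itself.
    """
    missing = []
    for name, backend in wagtail_ai.get("BACKENDS", {}).items():
        config = backend.get("CONFIG", {})
        unknown_model = config.get("MODEL_ID") not in known_models
        if unknown_model and "TOKEN_LIMIT" not in config:
            missing.append(name)
    return missing
```

An empty list means every backend either uses a model from `known_models` or declares its own limit.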
18 changes: 18 additions & 0 deletions docs/editor-integration.md
@@ -15,3 +15,21 @@ When creating prompts you can provide a label and description to help describe t

- 'Append after existing content' - keep your existing content intact and add the response from the AI to the end (useful for completions/suggestions).
- 'Replace content' - replace the content in the editor with the response from the AI (useful for corrections, rewrites and translations).

### Configuring the AI backend

By default, the `"default"` model will be used for text operations in the editor. To use a different model, set `TEXT_COMPLETION_BACKEND` to the name of another backend:

```python
WAGTAIL_AI = {
    "BACKENDS": {
        "gpt4": {
            "CLASS": "wagtail_ai.ai.openai.OpenAIBackend",
            "CONFIG": {
                "MODEL_ID": "gpt-4",
            },
        },
    },
    "TEXT_COMPLETION_BACKEND": "gpt4",
}
```
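
The fallback described above can be pictured as a single dictionary lookup. The function below is an illustration of that behaviour as documented here, not Wagtail AI's actual implementation, and `text_completion_backend_name` is a hypothetical name:

```python
def text_completion_backend_name(wagtail_ai):
    """Return the backend name used for editor text operations.

    Uses TEXT_COMPLETION_BACKEND when it is set and falls back to
    "default" otherwise. Illustrative only; not Wagtail AI's own code.
    """
    return wagtail_ai.get("TEXT_COMPLETION_BACKEND", "default")
```

With the settings shown above this returns `"gpt4"`; with the key removed it returns `"default"`.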
73 changes: 73 additions & 0 deletions docs/images-integration.md
@@ -0,0 +1,73 @@
# Images Integration

Wagtail AI integrates with the image edit form to provide AI-generated descriptions for images. The integration requires a backend that supports image descriptions, such as [the OpenAI backend](../ai-backends/#the-openai-backend).

## Configuration

1. In the Django project settings, configure an AI backend, and a model, that support images. Set `IMAGE_DESCRIPTION_BACKEND` to the name of the backend:
    ```python
    WAGTAIL_AI = {
        "BACKENDS": {
            "vision": {
                "CLASS": "wagtail_ai.ai.openai.OpenAIBackend",
                "CONFIG": {
                    "MODEL_ID": "gpt-4-vision-preview",
                    "TOKEN_LIMIT": 300,
                },
            },
        },
        "IMAGE_DESCRIPTION_BACKEND": "vision",
    }
    ```
2. In the Django project settings, configure a [custom Wagtail image base form](https://docs.wagtail.org/en/stable/reference/settings.html#wagtailimages-image-form-base):
    ```python
    WAGTAILIMAGES_IMAGE_FORM_BASE = "wagtail_ai.forms.DescribeImageForm"
    ```

Now, when you upload or edit an image, a magic wand icon should appear next to the _title_ field. Clicking on the icon will invoke the AI backend to generate an image description.

## Separate backends for text completion and image description

Multi-modal models are fairly new, so you may want to configure two different backends for text completion and image description. The `default` model will be used for text completion:

```python
WAGTAIL_AI = {
    "BACKENDS": {
        "default": {
            "CLASS": "wagtail_ai.ai.llm.LLMBackend",
            "CONFIG": {
                "MODEL_ID": "gpt-3.5-turbo",
            },
        },
        "vision": {
            "CLASS": "wagtail_ai.ai.openai.OpenAIBackend",
            "CONFIG": {
                "MODEL_ID": "gpt-4-vision-preview",
                "TOKEN_LIMIT": 300,
            },
        },
    },
    "IMAGE_DESCRIPTION_BACKEND": "vision",
}
```
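
With more than one backend configured, a mistyped name in `IMAGE_DESCRIPTION_BACKEND` or `TEXT_COMPLETION_BACKEND` is an easy mistake to make. The sketch below reports settings that point at a backend missing from `BACKENDS`. It is a hypothetical helper, not something Wagtail AI ships, and it skips keys that are not set rather than guessing what the library does in that case:

```python
def undefined_backend_references(wagtail_ai):
    """Return the setting keys that name a backend absent from BACKENDS.

    Keys that are not set at all are skipped. Hypothetical validation
    helper, not part of Wagtail AI.
    """
    backends = wagtail_ai.get("BACKENDS", {})
    problems = []
    for key in ("TEXT_COMPLETION_BACKEND", "IMAGE_DESCRIPTION_BACKEND"):
        name = wagtail_ai.get(key)
        if name is not None and name not in backends:
            problems.append(key)
    return problems
```

For the configuration above the result is an empty list, because `"vision"` is defined under `BACKENDS`.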

## Custom prompt

Wagtail AI includes a simple prompt to ask the AI to generate an image description:

> Describe this image. Make the description suitable for use as an alt-text.

If you want to use a different prompt, override the `IMAGE_DESCRIPTION_PROMPT` value:

```python
WAGTAIL_AI = {
    "BACKENDS": {
        # ...
    },
    "IMAGE_DESCRIPTION_PROMPT": "Describe this image in the voice of Sir David Attenborough.",
}
```
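
The override works like a lookup with a fallback: the built-in prompt quoted above applies unless `IMAGE_DESCRIPTION_PROMPT` is set. A minimal sketch of that behaviour; the constant and function names are hypothetical and this is not Wagtail AI's own code:

```python
# Built-in prompt as quoted in this document; the constant name is hypothetical.
DEFAULT_IMAGE_DESCRIPTION_PROMPT = (
    "Describe this image. Make the description suitable for use as an alt-text."
)


def image_description_prompt(wagtail_ai):
    """Return the custom prompt when one is configured, else the default."""
    return wagtail_ai.get("IMAGE_DESCRIPTION_PROMPT", DEFAULT_IMAGE_DESCRIPTION_PROMPT)
```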

## Custom form

Wagtail AI includes an image form that enhances the `title` field with an AI button. If you are using a [custom image model](https://docs.wagtail.org/en/stable/advanced_topics/images/custom_image_model.html), you can provide your own form to target another field. Check out the implementation of `DescribeImageForm` in [`forms.py`](https://github.com/wagtail/wagtail-ai/blob/main/src/wagtail_ai/forms.py), adapt it to your needs, and set it as `WAGTAILIMAGES_IMAGE_FORM_BASE`.
7 changes: 4 additions & 3 deletions docs/index.md
@@ -4,6 +4,7 @@ Wagtail AI integrates Wagtail with OpenAI's APIs (think ChatGPT) to help you wri

Right now, it can:

* Finish what you've started - write some text and tell Wagtail AI to finish it off for you
* Correct your spelling/grammar
* Let you add your own custom prompts
* Finish what you've started - write some text and tell Wagtail AI to finish it off for you.
* Correct your spelling/grammar.
* Generate image descriptions - useful for [image alt text](https://developer.mozilla.org/en-US/docs/Web/API/HTMLImageElement/alt).
* Let you add your own custom prompts.