Add text-prompt support for SamGeo2 (#341)
Showing 3 changed files with 480 additions and 25 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,366 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Segmenting remote sensing imagery with text prompts and the Segment Anything Model 2 (SAM 2)\n", | ||
"\n", | ||
"[![image](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/opengeos/segment-geospatial/blob/main/docs/examples/sam2_text_prompts.ipynb)\n", | ||
"[![image](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/opengeos/segment-geospatial/blob/main/docs/examples/sam2_text_prompts.ipynb)\n", | ||
"\n", | ||
"This notebook shows how to generate object masks from text prompts with the Segment Anything Model (SAM). \n", | ||
"\n", | ||
"Make sure you use GPU runtime for this notebook. For Google Colab, go to `Runtime` -> `Change runtime type` and select `GPU` as the hardware accelerator. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Install dependencies\n", | ||
"\n", | ||
"Uncomment and run the following cell to install the required dependencies." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# %pip install segment-geospatial groundingdino-py leafmap localtileserver" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import leafmap\n", | ||
"from samgeo.text_sam import LangSAM" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Create an interactive map" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"m = leafmap.Map(center=[-22.17615, -51.253043], zoom=18, height=\"800px\")\n", | ||
"m.add_basemap(\"SATELLITE\")\n", | ||
"m" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Download a sample image\n", | ||
"\n", | ||
"Pan and zoom the map to select the area of interest. Use the draw tools to draw a polygon or rectangle on the map" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"bbox = m.user_roi_bounds()\n", | ||
"if bbox is None:\n", | ||
" bbox = [-51.2565, -22.1777, -51.2512, -22.175]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"image = \"Image.tif\"\n", | ||
"leafmap.map_tiles_to_geotiff(\n", | ||
" output=image, bbox=bbox, zoom=19, source=\"Satellite\", overwrite=True\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"You can also use your own image. Uncomment and run the following cell to use your own image." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# image = '/path/to/your/own/image.tif'" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Display the downloaded image on the map." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"m.layers[-1].visible = False\n", | ||
"m.add_raster(image, layer_name=\"Image\")\n", | ||
"m" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Initialize LangSAM class\n", | ||
"\n", | ||
"The initialization of the LangSAM class might take a few minutes. The initialization downloads the model weights and sets up the model for inference." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"sam = LangSAM(model_type=\"sam2-hiera-large\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Specify text prompts" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"text_prompt = \"tree\"" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Segment the image\n", | ||
"\n", | ||
"Part of the model prediction includes setting appropriate thresholds for object detection and text association with the detected objects. These threshold values range from 0 to 1 and are set while calling the predict method of the LangSAM class.\n", | ||
"\n", | ||
"`box_threshold`: This value is used for object detection in the image. A higher value makes the model more selective, identifying only the most confident object instances, leading to fewer overall detections. A lower value, conversely, makes the model more tolerant, leading to increased detections, including potentially less confident ones.\n", | ||
"\n", | ||
"`text_threshold`: This value is used to associate the detected objects with the provided text prompt. A higher value requires a stronger association between the object and the text prompt, leading to more precise but potentially fewer associations. A lower value allows for looser associations, which could increase the number of associations but also introduce less precise matches.\n", | ||
"\n", | ||
"Remember to test different threshold values on your specific data. The optimal threshold can vary depending on the quality and nature of your images, as well as the specificity of your text prompts. Make sure to choose a balance that suits your requirements, whether that's precision or recall." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"sam.predict(image, text_prompt, box_threshold=0.24, text_threshold=0.24)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Visualize the results\n", | ||
"\n", | ||
"Show the result with bounding boxes on the map." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"sam.show_anns(\n", | ||
" cmap=\"Greens\",\n", | ||
" box_color=\"red\",\n", | ||
" title=\"Automatic Segmentation of Trees\",\n", | ||
" blend=True,\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"![image](https://github.com/user-attachments/assets/fd1a6a46-7fc6-45f5-8408-d648f2b5bbfe)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Show the result without bounding boxes on the map." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"sam.show_anns(\n", | ||
" cmap=\"Greens\",\n", | ||
" add_boxes=False,\n", | ||
" alpha=0.5,\n", | ||
" title=\"Automatic Segmentation of Trees\",\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"![image](https://github.com/user-attachments/assets/11843d0f-9caa-4e71-905f-17d363640cef)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Show the result as a grayscale image." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"sam.show_anns(\n", | ||
" cmap=\"Greys_r\",\n", | ||
" add_boxes=False,\n", | ||
" alpha=1,\n", | ||
" title=\"Automatic Segmentation of Trees\",\n", | ||
" blend=False,\n", | ||
" output=\"trees.tif\",\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"![image](https://github.com/user-attachments/assets/2fb80bbf-4d07-401e-8a57-ccde74ae3115)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Convert the result to a vector format. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"da, gdf = sam.region_groups(\n", | ||
" image=\"trees.tif\",\n", | ||
" min_size=100,\n", | ||
" out_csv=\"objects.csv\",\n", | ||
" out_image=\"objects.tif\",\n", | ||
" out_vector=\"objects.gpkg\",\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Show the results on the interactive map." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"m.add_raster(\"objects.tif\", layer_name=\"Trees\", palette=\"Greens\", opacity=0.5, nodata=0)\n", | ||
"style = {\n", | ||
" \"color\": \"#3388ff\",\n", | ||
" \"weight\": 2,\n", | ||
" \"fillColor\": \"#7c4185\",\n", | ||
" \"fillOpacity\": 0.5,\n", | ||
"}\n", | ||
"m.add_vector(\"objects.gpkg\", layer_name=\"Vector\", style=style)\n", | ||
"m" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Interactive segmentation" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"sam.show_map()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"![](https://i.imgur.com/wydt5Xt.gif)" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.11.8" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 4 | ||
} |
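The "Segment the image" cell in the notebook above describes `box_threshold` and `text_threshold` as cut-offs between 0 and 1. A minimal sketch of that idea follows, assuming each candidate detection carries one box score and one text score. The `Detection` record, the `filter_detections` helper, and the sample scores are hypothetical illustrations and do not reproduce the internals of `LangSAM.predict`.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detection record; not a samgeo or GroundingDINO type."""

    label: str
    box_score: float  # confidence that the box contains an object, 0 to 1
    text_score: float  # strength of the match to the text prompt, 0 to 1


def filter_detections(detections, box_threshold, text_threshold):
    """Keep only detections whose two scores both exceed their thresholds."""
    return [
        d
        for d in detections
        if d.box_score > box_threshold and d.text_score > text_threshold
    ]


# Made-up scores for illustration only.
candidates = [
    Detection("tree", 0.80, 0.75),
    Detection("tree", 0.30, 0.28),
    Detection("tree", 0.26, 0.10),
    Detection("tree", 0.15, 0.40),
]

# Lower thresholds keep more candidates; higher thresholds keep fewer.
loose = filter_detections(candidates, box_threshold=0.24, text_threshold=0.24)
strict = filter_detections(candidates, box_threshold=0.50, text_threshold=0.50)
print(len(loose), len(strict))
```

With these sample values, raising both thresholds from 0.24 to 0.50 shrinks the kept set, which is the precision versus recall trade-off the notebook text refers to.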