Exception when processing some images that detect 0-height tables, and a possible fix #601

bash99 opened this issue Mar 7, 2025 · 0 comments

bash99 commented Mar 7, 2025

First of all, I would like to express my gratitude for the two outstanding open-source projects, marker and surya_ocr.

When I process a specific single image, for example, by running the following command:

marker_single --output_format markdown --disable_image_extraction data/out/bad_9.jpg

I encounter the following error:

  File "...lib/python3.11/site-packages/surya/input/processing.py", line 43, in slice_polys_from_image
    lines.append(slice_and_pad_poly(image_array, poly))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...lib/python3.11/site-packages/surya/input/processing.py", line 58, in slice_and_pad_poly
    cv2.fillPoly(mask, [np.int32(coordinates)], 1)
cv2.error: OpenCV(4.11.0) :-1: error: (-5:Bad argument) in function 'fillPoly'
> Overload resolution failed:
>  - Layout of the output array img is incompatible with cv::Mat
>  - Expected Ptr<cv::UMat> for argument 'img'

I attempted to debug the parameters passed to fillPoly, and the relevant context is as follows:

    # Pad the area outside the polygon with the pad value
    mask = np.zeros(cropped_polygon.shape[:2], dtype=np.uint8)
    print(coordinates)
    print(cropped_polygon.shape[:2])
    cv2.fillPoly(mask, [np.int32(coordinates)], 1)

The output is:

coordinates = [(0, 0), (367, 0), (367, 78), (0, 78)]
cropped_polygon.shape[:2] = (0, 367)

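The cropped region has a height of 0 (shape (0, 367)) even though the polygon coordinates span 78 rows. The mask built from that shape is therefore empty, which appears to be what makes fillPoly throw the error.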
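
One way a crop can end up with zero height while the polygon itself is 78 rows tall is plain NumPy slicing with a bounding box that lies outside the image. The sketch below only illustrates that slicing behavior. It assumes the crop is taken with `image_array[y0:y1, x0:x1]`, which I have not verified against the surya source, and `crop_bbox` is a hypothetical helper name, not a surya function.

```python
import numpy as np


def crop_bbox(image_array, coordinates):
    """Crop the polygon's axis-aligned bounding box with plain slicing.

    Hypothetical helper for illustration; not part of surya.
    """
    xs = [int(x) for x, _ in coordinates]
    ys = [int(y) for _, y in coordinates]
    return image_array[min(ys):max(ys), min(xs):max(xs)]


# A 100-row image, 367 pixels wide (sizes chosen for illustration only).
image = np.zeros((100, 367, 3), dtype=np.uint8)

# A 78-row polygon fully inside the image.
inside = [(0, 10), (367, 10), (367, 88), (0, 88)]
# The same polygon shifted below the bottom edge of the image.
below = [(0, 120), (367, 120), (367, 198), (0, 198)]

inside_shape = crop_bbox(image, inside).shape[:2]
below_shape = crop_bbox(image, below).shape[:2]

print(inside_shape)  # (78, 367)
print(below_shape)   # (0, 367)
```

NumPy does not raise on an out-of-range slice. It silently returns an empty array, so the problem only surfaces later when the empty mask reaches OpenCV.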

To address this, I modified the code as follows:

    # Pad the area outside the polygon with the pad value
    mask_shape = cropped_polygon.shape[:2]
    # If either mask dimension is 0, the polygon is outside the image, so return the full image
    if mask_shape[0] == 0 or mask_shape[1] == 0:
        return Image.fromarray(image_array)
    mask = np.zeros(mask_shape, dtype=np.uint8)

This modification avoids the error and still returns a basically correct result. However, since I am not very familiar with the overall structure of marker, I am unsure whether this fix is reasonable. If it is acceptable, could it be merged upstream? If needed, I can also create a pull request.
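
For anyone who wants to check the empty-crop condition without the problematic image, the guard can be exercised on its own with only NumPy. This is a minimal sketch: `is_empty_crop` and `build_mask_or_none` are hypothetical names, not surya functions, and returning `None` here is just a convenient signal for the test. The change proposed above returns the full image instead.

```python
import numpy as np


def is_empty_crop(cropped):
    """Return True when the crop has zero height or zero width."""
    height, width = cropped.shape[:2]
    return height == 0 or width == 0


def build_mask_or_none(cropped):
    """Return a zeroed uint8 mask matching the crop, or None for an empty crop.

    Hypothetical helper for illustration; the caller decides what to do
    when None comes back (skip the polygon, fall back to the full image, ...).
    """
    if is_empty_crop(cropped):
        return None
    return np.zeros(cropped.shape[:2], dtype=np.uint8)


# Shapes taken from the debug output in this issue.
empty = np.zeros((0, 367, 3), dtype=np.uint8)
normal = np.zeros((78, 367, 3), dtype=np.uint8)

empty_mask = build_mask_or_none(empty)
normal_mask = build_mask_or_none(normal)
```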

Unfortunately, I am unable to upload the image that triggers the error. It contains private information, and after I spent dozens of minutes redacting it, the exception no longer occurred with the redacted version. If the image is absolutely necessary, I may need to obtain authorization and then consider sending it via email.

Thank you for your understanding and support!
