Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bulk scan error handling #1126

Merged
merged 3 commits into from
May 29, 2024
Merged

Bulk scan error handling #1126

merged 3 commits into from
May 29, 2024

Conversation

chigby
Copy link
Contributor

@chigby chigby commented May 29, 2024

Description

Fixes #1125

Changes proposed in this pull request:

This PR adds some error handling to the scan process in two ways:

  1. During the asset extraction phase, if we cannot fetch an asset's contents, skip over that asset and continue.
  2. During a bulk scan, if any error is caught during the scan of a single landing page, skip that landing page and continue.

See commit messages for additional technical details.

Type of change

  • Bug fix
  • New feature
  • Vulnerabilities update
  • Config changes
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires an admin update after deploy
  • Includes a database migration removing or renaming a field

Testing

I've added unit tests for both changes, but it might be useful to try this locally with a bulk scan on the problematic landing page that inspired this issue.

Post-deployment actions

None.

Checklist

General checks

  • Linting and tests pass locally
  • The website and the changes are functional in Tor Browser
  • There is no conflicting migrations
  • Any CSP related changes required has been updated (check at least both firefox & chrome)
  • The changes are accessible using keyboard and screenreader

If you made changes to scanner

  • Verify that the directory scan result page in admin interface loads as expected
  • Verify that the API at /api/v1/directory returns directory entries and scan results as expected

chigby added 3 commits May 29, 2024 10:46
See documentation:

https://docs.docker.com/compose/compose-file/04-version-and-name/#version-top-level-element-obsolete

> The top-level version property is defined by the Compose
> Specification for backward compatibility. It is only informative and
> you'll receive a warning message that it is obsolete if used.

I've removed this to prevent us from getting the following warning
when docker compose commands are run:

```
WARN[0000] docker-compose.yaml: `version` is obsolete
```
Updates the asset scanner/fetcher to catch any exceptions the
`requests` library raises when attempting to obtain the contents of
various asset files.  The previous behavior was to let those be
raised.

I've changed the `fetch_asset` function, the only place where we
invoke `requests`, to return a string containing the text content of
the response, rather than the response itself.  This is a good change
anyway, since it means the callers of `fetch_asset`, all of which only
wanted the text content anyway, don't have to worry about the details
of the `requests` library.

One thing to note: if an error is caught, then the function returns
the empty string.  This is the simplest thing to do, but it means
we're kind of passing over the error.  If we want to flag it in the
results somehow, then we will have to add some additional code and
make decisions about how exactly we want that to work.
If an error occurs while conducting a bulk scan, skip the landing page
currently being scanned and continue with the rest of the ones in the
queue.
try:
current_result = perform_scan(entry.landing_page_url, permitted_domains)
except Exception:
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am slightly worried about this failing too silently. Should we maybe have a warning of some sorts logging the error, in case there is some real error that we may need to debug?

Copy link
Contributor

@SaptakS SaptakS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly good to me. Left a small comment, but I don't think that is a blocker for getting this PR merged and deployed. Let's test it on staging.

@SaptakS SaptakS merged commit 553196e into develop May 29, 2024
2 checks passed
@SaptakS SaptakS deleted the bulk-scan-error-handling branch May 29, 2024 18:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scan results not consistently updated
2 participants