From bcc28ac50fbf8423a4e91050e467727fc0bbd0e4 Mon Sep 17 00:00:00 2001
From: Cameron Higby-Naquin
Date: Wed, 29 May 2024 10:46:19 -0400
Subject: [PATCH 1/3] docker compose: Remove obsolete top-level version key

See documentation:
https://docs.docker.com/compose/compose-file/04-version-and-name/#version-top-level-element-obsolete

> The top-level version property is defined by the Compose
> Specification for backward compatibility. It is only informative and
> you'll receive a warning message that it is obsolete if used.

I've removed this to prevent us from getting the following warning
when docker compose commands are run:

```
WARN[0000] docker-compose.yaml: `version` is obsolete
```
---
 docker-compose.yaml | 1 -
 1 file changed, 1 deletion(-)

diff --git a/docker-compose.yaml b/docker-compose.yaml
index e939347a..8a237431 100644
--- a/docker-compose.yaml
+++ b/docker-compose.yaml
@@ -1,5 +1,4 @@
 ---
-version: "3"
 networks:
   app:
 services:

From 0f980d60e67d370d58ee4e2b643c9c617443bf4a Mon Sep 17 00:00:00 2001
From: Cameron Higby-Naquin
Date: Wed, 29 May 2024 10:47:27 -0400
Subject: [PATCH 2/3] asset scanner: Handle request exceptions when fetching assets

Updates the asset scanner/fetcher to catch any exceptions the
`requests` library raises when attempting to obtain the contents of
various asset files. The previous behavior was to let those be raised.

I've changed the `fetch_asset` function, the only place where we invoke
`requests`, to return a string containing the text content of the
response, rather than the response itself. This is a good change
anyway, since it means the callers of `fetch_asset`, all of which only
wanted the text content anyway, don't have to worry about the details
of the `requests` library.

One thing to note: if an error is caught, then the function returns the
empty string. This is the simplest thing to do, but it means we're kind
of passing over the error.
If we want to flag it in the results somehow, then we will have to add
some additional code and make decisions about how exactly we want that
to work.
---
 scanner/assets.py                   | 16 ++++++++++------
 scanner/tests/test_asset_scraper.py | 16 ++++++++++++++++
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/scanner/assets.py b/scanner/assets.py
index 63d50d6b..bff10e80 100644
--- a/scanner/assets.py
+++ b/scanner/assets.py
@@ -81,9 +81,9 @@ def extract_assets(soup: BeautifulSoup, site_url: str) -> List[Asset]:
                 )
             )
 
-            response = fetch_asset(script.attrs['src'], site_url)
+            asset_text = fetch_asset(script.attrs['src'], site_url)
             # assets in content from external js
-            for text in extract_strings(response.text):
+            for text in extract_strings(asset_text):
                 for url in extract_urls(text):
                     assets.append(Asset(resource=url, kind='script-resource', initiator=script.attrs['src']))
         # js embedded in