Automated Creation and Export of Customized FLARE-VM Builds #660

Open
wants to merge 10 commits into
base: main

Ana06
Member

@Ana06 Ana06 commented Feb 12, 2025

vbox-build-flare-vm.py: Automated Creation and Export of Customized FLARE-VM Builds

This PR enhances vbox-build-flare-vm.py (previously named vbox-build-vm.py) to fully automate the creation and export of customized FLARE-VM builds. The script orchestrates the entire process, including:

  • Restoring a pre-existing BUILD-READY snapshot of a clean Windows installation (with UAC disabled).
  • Copying required installation files (such as the IDA Pro installer, the FLARE-VM configuration file, and legal notices) into the guest VM.
  • Installing FLARE-VM and creating a base snapshot.
  • Generating subsequent snapshots and exporting OVA images based on a YAML configuration file.

Configuration File

The YAML configuration file specifies the VM name, exported VM name, and details for each snapshot. Snapshot configurations support custom commands, legal notices, and file/folder exclusions for automated cleanup. See virtualbox/configs/win10_flare-vm.yaml for an example to export three OVAs:

  • .dynamic: Installs IDA Pro and Microsoft Office 2024 in addition to the FLARE-VM default configuration
  • .full.dynamic: Downloads Windows Symbols and installs IDA Pro, Microsoft Office 2024 and Visual Studio in addition to the FLARE-VM default configuration. This OVA is significantly larger than .dynamic.
  • .EDU: Unzips a ZIP archive containing educational materials (labs and demos) and installs Microsoft Office 2016.

--do-not-install-flare-vm Flag

When the --do-not-install-flare-vm flag is provided, vbox-build-flare-vm.py skips the initial FLARE-VM installation and assumes the existence of a pre-existing base snapshot. This enables modification of the base snapshot (e.g., to reinstall failed packages) created by a prior run and resumption of the build process from the adjusted base snapshot.

Handling the VirtualBox VERR_NO_LOW_MEMORY Bug

VirtualBox encounters a VERR_NO_LOW_MEMORY error due to a bug (https://www.virtualbox.org/ticket/22185) that causes incorrect calculation of available memory. To prevent this issue from disrupting the script, when the bug is detected within run_vboxmanage, fix instructions are printed, and the command is retried after a brief delay.

More Robust ensure_vm_running Implementation

Previously, ensure_vm_running only verified that the VM was in the running state, which is insufficient because the guest OS might still be booting. This PR introduces the get_num_logged_in_users function and uses it in ensure_vm_running to wait until at least one user is logged in. Even this is not always enough in slower environments, as the VirtualBox guest additions can take a while to load after user login. Therefore, a retry mechanism with a 2-minute delay has also been implemented for VBoxManage guestcontrol commands that fail.

Handling the aborted-saved State in ensure_vm_shutdown

ensure_vm_shutdown now avoids attempting to shut down the VM if its state is aborted-saved (e.g. after a VM crash), as this operation fails because the VM is not running.

Preventing OVA Export Conflicts

Export failures due to existing OVA files are prevented by renaming the old OVA with a timestamp.

Remaining Tasks

Completely Unattended Process TODOs

For a fully automated, unattended process that requires no manual intervention, some issues encountered during testing still need to be addressed (these are not part of this PR).

Package Installation Failures

Occasionally, packages fail to install, and a simple retry would resolve the issue. Currently, this is handled manually using the --do-not-install-flare-vm flag explained above. To eliminate manual intervention, package installation retries on failure during FLARE-VM installation are necessary: mandiant/VM-Packages#1219

VM Crashes

I have experienced VM crashes occasionally during FLARE-VM installation (likely due to VM restarts), particularly with newer Debian kernels. The current implementation of vbox-build-flare-vm.py takes snapshots every ~20 minutes, which can be restored in case of a crash. However, this restoration is currently manual. Until the VM is manually restored to a pre-crash snapshot, the script continues to take snapshots of the crashed VM. The script needs to be extended to:

  • Detect VM crashes.
  • Copy the VirtualBox log to LOGS_DIR for crash debugging.
  • Restore a snapshot taken before the crash to resume installation.

I'll create an issue to track this.

Other Proposed Enhancements

The following proposed changes would further enhance script execution:

  • Create subfolders within LOGS_DIR with installation date and time in their names instead of deleting previous logs, facilitating debugging of past runs.
  • Reduce noisy output from certain commands within the build VM script, such as VM-Clean-Up, to improve readability of vbox-build-flare-vm.py output.

I'll create issues to track these enhancement proposals.

Ensuring failed_packages.txt Existence

The script relies on the failed_packages.txt file to determine FLARE-VM installation completion. Currently, this file is only created if any package installation fails. malware-jail.vm currently fails, although the tool is installed (see mandiant/VM-Packages#1130). This results in at least one failing package and the existence of failed_packages.txt. To ensure continued reliance on this file after the malware-jail.vm package is fixed, it must always be created. This is implemented in mandiant/VM-Packages#1274.

vbox-export-snapshots.py Simplification for Single Snapshot Export

Initially, vbox-export-snapshots.py was intended to export FLARE-VM snapshots built by vbox-build-flare-vm.py. However, this proved problematic because vbox-build-flare-vm.py requires executing VM-Set-Legal-Notice, which prevents the VM from starting without manual interaction (to accept the legal notice). Starting the VM is necessary after setting the network interface to hostonly for the internet detector to update the VM status. Therefore, hostonly setup and snapshot export were moved into vbox-build-flare-vm.py. With vbox-export-snapshots.py now decoupled from vbox-build-flare-vm.py, vbox-export-snapshots.py has been simplified to export single snapshots, eliminating the complexity of managing multiple snapshots and their associated configuration files. Shared code was moved to vboxcommon.py to avoid duplication.

Ana06 added 2 commits February 7, 2025 10:24
Enhance the `ensure_vm_running` function to provide more reliable guest
control. Previously, it only checked if the VM was in the `running`
state, which is insufficient because the guest OS might still be booting
and unavailable to `VBoxManage guestcontrol`. This could lead to
subsequent `guestcontrol` commands failing.

To address this, introduce the `get_num_logged_in_users` function. This
new function parses the output of the following command to determine how
many users are currently logged into the VM.
```
VBoxManage guestproperty get <VM_UUID> "/VirtualBox/GuestInfo/OS/LoggedInUsers"
```

Enhance `ensure_vm_running` by waiting until at least one user is logged
in using `get_num_logged_in_users`.
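
A minimal sketch of how such a function might parse the property, not the code from this PR. It assumes `VBoxManage` prints `Value: <n>` when the property is set and `No value set!` otherwise, and it treats unparsable output as zero users. The helper name `parse_logged_in_users` is hypothetical.

```python
import re
import subprocess

GUEST_PROPERTY = "/VirtualBox/GuestInfo/OS/LoggedInUsers"


def parse_logged_in_users(output: str) -> int:
    """Extract the user count from `VBoxManage guestproperty get` output.

    Assumption: the output is `Value: <n>` when the property is set and
    `No value set!` otherwise. Anything unparsable counts as 0 users.
    """
    match = re.search(r"^Value:\s*(\d+)\s*$", output, re.MULTILINE)
    return int(match.group(1)) if match else 0


def get_num_logged_in_users(vm_uuid: str) -> int:
    """Query the guest property of a VM (requires VBoxManage on PATH)."""
    result = subprocess.run(
        ["VBoxManage", "guestproperty", "get", vm_uuid, GUEST_PROPERTY],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_logged_in_users(result.stdout)
```

Keeping the parsing separate from the `VBoxManage` call makes the logic testable without a running VM.
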

Replace the `wait_until_vm_state` function with the more generic
`wait_until` function. `wait_until` now accepts a condition that is
evaluated instead of a VM state. This makes the function more flexible
allowing us to continue to use it in both `ensure_vm_running` and
`ensure_vm_shutdown` to avoid code duplication.
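
A condition-based wait of this kind could look roughly like the sketch below. The signature, default timeout and polling interval are assumptions for illustration and may differ from the function in this PR. The injectable `sleep` parameter is only there to make the helper easy to test.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met and False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Both callers can then pass their own check, for example a lambda comparing the VM state to `running`, or one checking that the number of logged-in users is at least one.
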

Remove the power cycle in `vbox-export-snapshots.py` as it is not needed
with the new `ensure_vm_running` function.
This commit simplifies `vbox-export-snapshots.py` to support exporting
only a single snapshot at a time. This removes the complexity of
managing multiple snapshots and configuration files.

The original plan was to use `vbox-export-snapshots.py` to export
several snapshots with FLARE-VM installed using `vbox-build-vm.py`.
However, this proved problematic because `vbox-build-vm.py` requires
executing `VM-Set-Legal-Notice`, which prevents the VM from starting
without manual interaction (to accept the legal notice).  Starting the
VM is necessary after setting the network interface to `hostonly` for
the internet detector to update the VM status.

To address this, the intended future approach is to handle both setting
the interface to `hostonly` and exporting the snapshot directly within
`vbox-build-vm.py`.  Because `vbox-export-snapshots.py` is now
independent of `vbox-build-vm.py`, it can be significantly simplified as
done in this commit.  Both scripts will share code that will be moved to
`vboxcommon.py` to avoid code duplication.
@Ana06 Ana06 self-assigned this Feb 12, 2025
@Ana06 Ana06 changed the title vbox-build-flare-vm.py: Automated Creation and Export of Customized FLARE-VM Builds Automated Creation and Export of Customized FLARE-VM Builds Feb 12, 2025
@Ana06 Ana06 force-pushed the automate-build-vm branch from ca31c98 to 84867ac Compare February 13, 2025 07:18
This commit extends `vbox-build-flare-vm.py` to provide fully automated,
unattended creation and export of customized FLARE-VM builds,
eliminating manual interaction. The script orchestrates the entire
process, including:

* Restoring a pre-existing `BUILD-READY` snapshot of a clean Windows
  installation (with UAC disabled).
* Copying required installation files (such as the IDA Pro installer,
  the FLARE-VM configuration file, and legal notices) into the guest VM.
* Installing FLARE-VM and creating a `base` snapshot.
* Generating subsequent snapshots and exporting OVA images based on a
  YAML configuration file.

The YAML configuration file specifies the VM name, exported VM name, and
details for each snapshot. Snapshot configurations support custom
commands, legal notices, and file/folder exclusions for automated
cleanup.

This commit also includes an example configuration file that has been
tested to work with the script.

The script has been renamed from `vbox-build-vm.py` to
`vbox-build-flare-vm.py` to reflect its FLARE-VM-specific functionality.

Common functionality shared by `vbox-export-snapshots.py` and
`vbox-build-flare-vm.py` has been moved to `vboxcommon.py` to reduce code
duplication and improve maintainability.
The `vbox-build-flare-vm.py` script could restore the wrong snapshot if
a snapshot with the same name already existed. This would break the
intended logic, as the wrong snapshot would be used for further
processing.

This commit introduces a new function `rename_old_snapshot` which
renames existing snapshots matching a given name by appending ` OLD` to
them. The `rename_old_snapshot` function is called before taking the
base snapshot and before taking each of the final export snapshots. This
prevents ambiguity and ensures the correct snapshot is used throughout
the script's execution.
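
One way the renaming might be approached is sketched below. It is not the PR's implementation. It assumes the `SnapshotName...="..."` and `SnapshotUUID...="..."` line format printed by `VBoxManage snapshot <vm> list --machinereadable`, and it assumes `VBoxManage snapshot <vm> edit <uuid> --name <new name>` accepts the snapshot UUID. The helper `build_rename_commands` is hypothetical and only builds the commands rather than running them.

```python
import re
from typing import Dict, List

# Assumed line format: SnapshotName="base", SnapshotName-1="child", ...
_NAME_RE = re.compile(r'^SnapshotName(-[\d-]+)?="(.*)"$')
_UUID_RE = re.compile(r'^SnapshotUUID(-[\d-]+)?="(.*)"$')


def build_rename_commands(vm_uuid: str, snapshot_name: str, listing: str) -> List[List[str]]:
    """Build one VBoxManage command per snapshot called `snapshot_name`.

    Each command appends " OLD" to the name. Snapshots are addressed by
    UUID so that duplicate names stay unambiguous.
    """
    names: Dict[str, str] = {}
    uuids: Dict[str, str] = {}
    for line in listing.splitlines():
        line = line.strip()
        name_match = _NAME_RE.match(line)
        if name_match:
            names[name_match.group(1) or ""] = name_match.group(2)
            continue
        uuid_match = _UUID_RE.match(line)
        if uuid_match:
            uuids[uuid_match.group(1) or ""] = uuid_match.group(2)

    commands = []
    for key, name in names.items():
        if name == snapshot_name and key in uuids:
            commands.append(
                ["VBoxManage", "snapshot", vm_uuid, "edit", uuids[key],
                 "--name", f"{name} OLD"]
            )
    return commands
```
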
This commit works around the VirtualBox VERR_NO_LOW_MEMORY error, a known bug
(https://www.virtualbox.org/ticket/22185) where VirtualBox incorrectly
calculates available memory. Specifically, VirtualBox fails to properly
account for reclaimable cached memory, which leads to the error even
when sufficient RAM is actually available.

To mitigate this, the commit implements a retry mechanism within the
`run_vboxmanage` function. If the VERR_NO_LOW_MEMORY error occurs, the
command is re-run after a short delay. The user is also prompted with
instructions to clear the system's page cache, which resolves the
underlying memory accounting issue in VirtualBox.

The `run_vboxmanage` function is refactored to use a private
`__run_vboxmanage` function for the actual execution, allowing the error
handling logic to re-run the command after the cache is cleared.
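
The retry logic described above could be structured roughly as follows. This is a sketch under assumptions, not the code in this commit: the retry count, the delay, the helper name `_run_vboxmanage_once` and the check of stderr for the error string are all illustrative. The `drop_caches` hint is the usual Linux way to clear the page cache and may not match the instructions the script prints.

```python
import subprocess
import time
from typing import Callable, Sequence

LOW_MEMORY_ERROR = "VERR_NO_LOW_MEMORY"


def _run_vboxmanage_once(args: Sequence[str]) -> subprocess.CompletedProcess:
    """Run VBoxManage a single time and capture its output."""
    return subprocess.run(
        ["VBoxManage", *args], capture_output=True, text=True, check=False
    )


def run_vboxmanage(
    args: Sequence[str],
    retries: int = 3,
    delay: float = 30.0,
    runner: Callable[[Sequence[str]], subprocess.CompletedProcess] = _run_vboxmanage_once,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the command's stdout, retrying only on VERR_NO_LOW_MEMORY."""
    attempt = 0
    while True:
        result = runner(args)
        if result.returncode == 0:
            return result.stdout
        # Any other failure, or running out of retries, is raised to the caller.
        if LOW_MEMORY_ERROR not in result.stderr or attempt >= retries:
            raise RuntimeError(
                f"VBoxManage {' '.join(args)} failed: {result.stderr.strip()}"
            )
        attempt += 1
        print(
            f"{LOW_MEMORY_ERROR} detected. Clear the host page cache "
            "(on Linux: echo 3 | sudo tee /proc/sys/vm/drop_caches). "
            f"Retrying in {delay} seconds."
        )
        sleep(delay)
```

Separating the single execution from the retry wrapper mirrors the split between `run_vboxmanage` and its private helper described in the commit message.
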
@Ana06 Ana06 force-pushed the automate-build-vm branch from 84867ac to 1f392c9 Compare February 13, 2025 07:46
Prevents export failures due to existing OVA files. Existing files are
now renamed with a timestamp before a new OVA is exported, ensuring the
export completes successfully even if the script is run multiple times.
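
A sketch of the renaming step, assuming a timestamp inserted before the `.ova` extension. The function name `rename_existing_ova` and the timestamp format are hypothetical and may differ from what the script uses.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def rename_existing_ova(ova_path: Path, now: Optional[datetime] = None) -> Optional[Path]:
    """Move an existing OVA out of the way before a new export.

    Returns the new path of the old file, or None if nothing had to be
    renamed. `now` can be injected to get a predictable timestamp.
    """
    if not ova_path.exists():
        return None
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    backup = ova_path.with_name(f"{ova_path.stem}.{stamp}{ova_path.suffix}")
    ova_path.rename(backup)
    return backup
```

Calling this immediately before the export means the export never finds a file at the target path, whether or not a previous run left one behind.
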
The VirtualBox guest additions can take a while to load after user
login, especially in slower environments. This can cause the `VBoxManage
guestcontrol` command to fail. This commit adds a retry mechanism with a
2-minute delay to address this issue.
This commit introduces the `--do-not-install-flare-vm` flag to
`vbox-build-flare-vm.py`. When this flag is used, the script skips the
initial FLARE-VM installation and assumes a pre-existing base snapshot.
This allows users to modify the base snapshot (e.g., to reinstall failed
packages) created by a previous run and resume the build process from
the modified base snapshot, enabling incremental builds and reducing
build times.
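
The flag could be wired up along these lines. This is an illustrative sketch: the positional `config_path` argument, the `plan_steps` helper and the step names are hypothetical and only show how the flag changes the control flow.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Build and export FLARE-VM images (sketch)."
    )
    # Hypothetical positional argument for the YAML configuration file.
    parser.add_argument("config_path", help="path to the YAML configuration file")
    parser.add_argument(
        "--do-not-install-flare-vm",
        action="store_true",
        help="skip the FLARE-VM installation and reuse the existing base snapshot",
    )
    return parser


def plan_steps(args: argparse.Namespace) -> List[str]:
    """List the high-level steps the build would perform for these arguments."""
    steps = []
    if not args.do_not_install_flare_vm:
        steps += ["restore BUILD-READY", "install FLARE-VM", "take base snapshot"]
    steps += ["restore base snapshot", "build and export configured snapshots"]
    return steps
```

argparse exposes the flag as `args.do_not_install_flare_vm`, with dashes converted to underscores.
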
VMs in the `aborted-saved` state are not running, so trying to shut them down
fails. This commit adds a check for this state in `ensure_vm_shutdown`
and returns early, preventing attempts to power off such VMs. It also
logs the VM's state.
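
The early return might be based on a state check like the one sketched here. It assumes the `VMState="..."` line printed by `VBoxManage showvminfo <vm> --machinereadable`. The helper names and the set of states that skip the shutdown are assumptions; the commit itself only mentions `aborted-saved`.

```python
import re
from typing import Optional

# Assumption: states in which the VM is not running, so no power-off is attempted.
SKIP_SHUTDOWN_STATES = {"poweroff", "aborted-saved"}


def parse_vm_state(showvminfo_output: str) -> Optional[str]:
    """Extract the VM state from machine-readable `showvminfo` output."""
    match = re.search(r'^VMState="([^"]*)"$', showvminfo_output, re.MULTILINE)
    return match.group(1) if match else None


def needs_shutdown(state: Optional[str]) -> bool:
    """Return False for states in which a shutdown attempt would fail."""
    return state not in SKIP_SHUTDOWN_STATES
```
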
The packages `pdbs.pdbresym.vm` and `visualstudio.vm` take a long time
to install and we need a longer timeout than the default to ensure they
are installed correctly.
@Ana06 Ana06 force-pushed the automate-build-vm branch from 1f392c9 to 24d3dcc Compare February 13, 2025 09:51
@Ana06
Member Author

Ana06 commented Feb 13, 2025

I have identified another issue with the script.

Running choco install visualstudio.vm --execution-timeout 10000 using the script (via the cmd in the config) creates a broken shortcut for Visual Studio in the tools directory:

image

But running the exact same command manually (opening PowerShell in the VM) in the base snapshot creates the shortcut correctly:
image

The shortcuts of all other packages (the ones from the default config like capa and the ones installed as part of the cmd in the config like the Microsoft Office tools) are ok. I also do not see anything strange in the code in the Visual Studio package: https://github.com/mandiant/VM-Packages/blob/e9d0d073a1759890db3221c509940917fda42ca9/packages/visualstudio.vm/tools/chocolateyinstall.ps1#L17

So I am very confused. @mandiant/vms @MalwareMechanic @stevemk14ebr @williballenthin @d35ha @sara-rn any ideas of what's going on?

@stevemk14ebr
Contributor

What is the target path of the broken link? Does it break after the first subpath with a space? Do you have a path escape issue?

@Ana06
Member Author

Ana06 commented Feb 13, 2025

What is the target path of the broken link? Does it break after the first subpath with a space? Do you have a path escape issue?

There is no target path in the broken link. The package code that creates the shortcut is the same whether I run the PS command from the host with the script or from the guest: https://github.com/mandiant/VM-Packages/blob/e9d0d073a1759890db3221c509940917fda42ca9/packages/visualstudio.vm/tools/chocolateyinstall.ps1#L17 So it should be created in exactly the same way.

@Ana06
Member Author

Ana06 commented Feb 14, 2025

I have also experienced the issue with Office tools, where the shortcuts are either missing or broken.
https://github.com/mandiant/VM-Packages/blob/e9d0d073a1759890db3221c509940917fda42ca9/packages/microsoft-office.vm/tools/chocolateyinstall.ps1#L24
