Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

move osquery history to knapsack #2104

Merged
merged 13 commits into from
Feb 14, 2025
Merged

Conversation

zackattack01
Copy link
Contributor

@zackattack01 zackattack01 commented Feb 14, 2025

There were a few aspects of our history manipulation patterns that were encouraging racy behavior that was going to become even more difficult to maintain while adding additional osquery instances for tracking.

  • as a package level variable, our parallel tests were pooling into (and mutex fighting) the same history
  • updates to individual instances were done through history instance pointers, maintained by our osquery instance structs. There was nothing inherently wrong with this but any attempts to test hard corners of our osquery lifecycle (e.g. restarting quickly after start) resulted in data races all around the setting of stats.

A few changes here should help clear this up a lot:

  • history is now stored in knapsack
  • history now adheres to a new OsqueryHistorian interface which takes responsibility for all history manipulation and reading.
    • after observing our history usage there was no real need to expose the Instances directly (we typically translate into map[string]string for any reporting anyway). exposing copies like this removes the possibility of reaching directly into instances like that going forward, and avoids overcomplicating the interfaces with pointer receivers on the instances
    • the interface adds SetConnected and SetExited methods for callers to invoke by instance run id for any updates

Not in Scope Here/future work

  • we should be able to update history.Instance to be internal only now (not exported). I did not do that here because I forgot until the end, the changeset was already big and I had already tested this as is. It can be a quick follow on improvement though
  • I noticed we don't currently initialize history for standalone flares but should be able to. I did not fix that here but that is something we could improve

Test Updates

  • all of the history and history instance tests can now safely be run in parallel, so they are updated accordingly
  • all of the new methods added to history have new tests for coverage
  • everything with a dependency on history (mostly runtime and extensions tests) needed to be updated to initialize this as part of its individual setup. helper methods are added where applicable
  • everything with a dependency on history now tests only through the methods exposed by history (can no longer reach directly into stats references)

Manual testing performed

  • in situ flares correctly gather and report history
  • LQ launcher_info table correctly reports latest instance id from history
  • LQ kolide_launcher_osquery_instance_history table correctly reports the expected history
  • killing the running active osquery process and rerunning LQ against kolide_launcher_osquery_instance_history shows that exit and errors are correctly being set, as well as connect time for new instances
    please note if there are other things you think i've missed and should test

@zackattack01 zackattack01 marked this pull request as ready for review February 14, 2025 18:26
Copy link
Contributor

@James-Pickett James-Pickett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice Work!

@zackattack01 zackattack01 added this pull request to the merge queue Feb 14, 2025
Merged via the queue into main with commit 0c28980 Feb 14, 2025
32 checks passed
@zackattack01 zackattack01 deleted the zack/move_history_to_knapsack branch February 14, 2025 22:20
@@ -375,14 +375,21 @@ func (i *OsqueryInstance) Launch() error {
return fmt.Errorf("starting osqueryd process: %w", err)
}

stats, err := history.NewInstance(i.registrationId, i.runId)
if err != nil {
osqHistory := i.knapsack.OsqueryHistory()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we use i.history here instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oop because I was waffling on whether to keep i.history at all and then never updated, I will get this cleaned up next. thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants