
Utilize a parallel gzip implementation #25


@USA-RedDragon commented Mar 8, 2022

What is this?

This PR would allow parallel compression and improved decompression for slugs. I used a few cloned Git repos from HashiCorp to test the performance difference across varying repository sizes, mostly because TFE utilizes this code and VCS repos are typically larger than 1MB.

This is based on some older work here (https://github.com/hashicorp/go-service/pull/29), which should help with VCS ingress speeds, as this code is still utilized by the slug ingress container, per https://github.com/hashicorp/slug-ingress/blob/main/worker.go#L170-L180 (and the import therein), unless I'm looking at old code. If these patches, as well as #21 (great work!), were merged, significant speedups might be had in TFE/TFC/Agents.

This could considerably reduce the time taken for source code ingress on TFE, especially for customers with large monorepos.

The library that provides the parallel gzip drop-in replacement is https://github.com/klauspost/pgzip.

It is important to note that this library creates and reads standard gzip files. You do not have to match the compressor/decompressor to get the described speedups, and the gzip files are fully compatible with other gzip readers/writers.
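
For illustration, here's a minimal sketch of what the drop-in swap looks like. The compress helper and the tuning values below are hypothetical, not the actual diff in this PR:

package example

import (
	"io"

	// Swapping the stdlib's "compress/gzip" import for this one is
	// essentially the whole change: pgzip exposes the same
	// NewWriter/NewReader surface and emits standard gzip streams.
	gzip "github.com/klauspost/pgzip"
)

// compress copies src into dst as a gzip stream, compressed in parallel.
func compress(dst io.Writer, src io.Reader) error {
	zw := gzip.NewWriter(dst)
	// Optional tuning: compress 1MB blocks on up to 8 goroutines
	// (illustrative values, not benchmarked here).
	if err := zw.SetConcurrency(1<<20, 8); err != nil {
		zw.Close()
		return err
	}
	if _, err := io.Copy(zw, src); err != nil {
		zw.Close()
		return err
	}
	return zw.Close()
}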

As you can see from the results below, utilizing parallel gzip to compress these slugs can reduce the time it takes by up to 10x in some cases.

Decompression sees a speed-up as well, even though gzip decompression is single-threaded. Here's an excerpt from the library's README explaining why it claims a 104% decompression speedup over golang's implementation:

But wait, since gzip decompression is inherently singlethreaded (aside from CRC calculation) how can it be more than 100% faster? Because pgzip due to its design also acts as a buffer. When using unbuffered gzip, you are also waiting for io when you are decompressing. If the gzip decoder can keep up, it will always have data ready for your reader, and you will not be waiting for input to the gzip decompressor to complete.
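
To make the buffering point concrete, here's a small sketch (not from this PR) using pgzip's NewReaderN, which controls how far the decoder reads ahead of the consumer. The file name, block size, and block count are illustrative assumptions:

package main

import (
	"io"
	"log"
	"os"

	"github.com/klauspost/pgzip"
)

func main() {
	f, err := os.Open("slug.tar.gz") // hypothetical input
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// NewReaderN keeps up to `blocks` blocks of `blockSize` bytes read and
	// decompressed ahead of the caller, which is the buffering effect the
	// README describes: the consumer rarely stalls waiting on input.
	zr, err := pgzip.NewReaderN(f, 1<<20, 8)
	if err != nil {
		log.Fatal(err)
	}
	defer zr.Close()

	if _, err := io.Copy(io.Discard, zr); err != nil {
		log.Fatal(err)
	}
}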

Testing environment

HashiCorp-provided Lenovo X1 Carbon on Arch Linux. 4c/8t i7-8665U.

File sizes via du -sh. I pulled a few random repos from our GitHub, plus a large Android source repository (platform_frameworks_base), just to show the speedups in large workloads:

279M    atlas
437M    consul
580K    go-service
324K    is-immutable-aws-vault-consul
11G     platform_frameworks_base

Compression code

package main

import (
	"fmt"
	"log"
	"os"
	"time"

	"github.com/USA-RedDragon/go-slug"
)

func main() {
	if len(os.Args[1:]) != 2 {
		fmt.Printf("Usage: %v <input_dir> <output_file>\n", os.Args[0])
		os.Exit(1)
	}
	f, err := os.Create(os.Args[2])
	if err != nil {
		log.Fatal(err)
	}
	// Defers run last-in-first-out, so Close is registered first
	// to ensure Sync runs while the file is still open.
	defer f.Close()
	defer f.Sync()

	// Then call the Pack function with a directory path containing the
	// configuration files and an io.Writer to write the slug to.
	defer duration(track("slug-pack"))
	if _, err := slug.Pack(os.Args[1], f, false); err != nil {
		log.Fatal(err)
	}
}

func track(msg string) (string, time.Time) {
	return msg, time.Now()
}

func duration(msg string, start time.Time) {
	fmt.Printf("%v: %v\n", msg, time.Since(start))
}

Decompression code

package main

import (
	"bufio"
	"fmt"
	"os"
	"time"

	"github.com/hashicorp/go-slug"
)

func main() {
	if len(os.Args[1:]) != 2 {
		fmt.Printf("Usage: %v <input_filer> <output_dir>\n", os.Args[0])
		os.Exit(1)
	}
	err := os.Mkdir(os.Args[2], 0755)
	if err != nil {
		fmt.Printf("Failed to create output directory: %s: %s\n", os.Args[2], err)
		os.Exit(1)
	}
	file, err := os.Open(os.Args[1])
	if err != nil {
		fmt.Printf("Failed to open input slug: %s: %s\n", os.Args[1], err)
		os.Exit(1)
	}
	defer file.Close()
	reader := bufio.NewReader(file)
	defer duration(track("slug-unpack"))
	err = slug.Unpack(reader, os.Args[2])
	if err != nil {
		fmt.Printf("Failed to unpackage slug: %s", err)
		os.Exit(1)
	}
}

func track(msg string) (string, time.Time) {
	return msg, time.Now()
}

func duration(msg string, start time.Time) {
	fmt.Printf("%v: %v\n", msg, time.Since(start))
}

Compression Results

Atlas

Implementation | Time to compress (5 runs) | File size
golang gzip | 1.3s, 1.47s, 1.39s, 1.38s, 1.26s | 7.6M
pgzip | 602.62ms, 610.72ms, 634.77ms, 606.53ms, 585.98ms | 8M

Consul

Implementation | Time to compress (5 runs) | File size
golang gzip | 2.99s, 2.53s, 2.64s, 2.97s, 2.67s | 23M
pgzip | 525.3ms, 499.16ms, 546.67ms, 522.06ms, 529.79ms | 24M

go-service

Implementation | Time to compress (5 runs) | File size
golang gzip | 11.65ms, 10.16ms, 9.6ms, 9.67ms, 11.57ms | 35K
pgzip | 9.52ms, 8.13ms, 7.77ms, 11.35ms, 8.66ms | 36K

is-immutable-aws-vault-consul

Here we see a great example of the <1MB case being slightly slower. On average, pgzip takes about 1ms longer (3.87ms vs 2.81ms), roughly 38% more time (equivalently, golang gzip takes 1 - (2.81/3.87) ≈ 27% less).

Implementation | Time to compress (5 runs) | File size
golang gzip | 3.92ms, 2.54ms, 2.51ms, 2.47ms, 2.61ms | 437B
pgzip | 4.43ms, 5.57ms, 3.12ms, 2.96ms, 3.26ms | 441B

platform_frameworks_base

Here we see a great example of the speedups this patch is capable of. This is an extreme, 11-gigabyte example, but it serves to show how well this method scales.

Implementation | Time to compress (5 runs) | File size
golang gzip | 1m1.47s, 1m3.24s, 58.23s, 57.22s, 55.81s | 949M
pgzip | 6.38s, 5.94s, 6.30s, 7.4s, 7.1s | 971M

Decompression Results

Decompression shows a slight improvement on average; not much to talk about.

Atlas

Implementation | Time to decompress (5 runs)
golang gzip | 874.65ms, 749.19ms, 874.72ms, 1.08s, 779.53ms
pgzip | 649.99ms, 694.72ms, 1.04s, 539.59ms, 610.86ms

Consul

Implementation | Time to decompress (5 runs)
golang gzip | 1.47s, 1.11s, 1.52s, 1.16s, 1.21s
pgzip | 507.07ms, 799.67ms, 680.70ms, 873.84ms, 731.31ms

go-service

Here we see the decompression speed gap close for such a small amount of data.

Implementation | Time to decompress (5 runs)
golang gzip | 3.68ms, 3.15ms, 4.08ms, 2.89ms, 4.12ms
pgzip | 3.31ms, 4.62ms, 4.14ms, 4.49ms, 2.84ms

is-immutable-aws-vault-consul

Here we see decompression is much slower on pgzip for such a small amount of data. This is likely due to the overhead of spinning up goroutines.

Implementation | Time to decompress (5 runs)
golang gzip | 171ns, 197ns, 280ns, 106ns, 121ns
pgzip | 762ns, 455ns, 639ns, 551ns, 700ns

platform_frameworks_base

Implementation | Time to decompress (5 runs)
golang gzip | 18.52s, 21.15s, 22.92s, 22.07s, 23.05s
pgzip | 13.33s, 16.02s, 18.21s, 17.14s, 14.69s

@hashicorp-cla commented Mar 12, 2022

CLA assistant check
All committers have signed the CLA.

@brandonc (Contributor) commented:

@USA-RedDragon I experimented with this version of go-slug in Terraform with a working directory that contained about 800MB of incompressible junk. Using just "terraform plan" with cloud config, my runs realized 20 to 40 seconds of speedup on the initial, local pack, but this was a fraction of the total time it takes to upload the slug, dequeue the job, download it, execute terraform plan, pack it again, and upload the new slug (2m40s in my baseline).

BUT, as you alluded to in the PR description, since there are two places we use go-slug in the pipeline (Preparing to plan + Waiting to apply) this optimization could potentially pay double and really move the needle on total planning time.

I also PR'd an alternative that allows a configurable compression level with compress/gzip. "BestSpeed" (level 1) achieves a 3x speedup at the cost of ~10% larger output than the default (level 6). We could apply both optimizations, but if there were apprehension about replacing compress/gzip, this could be a viable alternative.
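
For reference, the stdlib alternative described above is just a matter of passing a level to gzip.NewWriterLevel. A minimal sketch under that assumption (the file name and stdin source are illustrative, not the actual PR):

package main

import (
	"compress/gzip"
	"io"
	"log"
	"os"
)

func main() {
	out, err := os.Create("slug.tar.gz") // hypothetical output
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	// gzip.BestSpeed (level 1) trades roughly 10% larger output for about a
	// 3x compression speedup versus gzip.DefaultCompression (level 6), per
	// the numbers quoted in the comment above.
	zw, err := gzip.NewWriterLevel(out, gzip.BestSpeed)
	if err != nil {
		log.Fatal(err)
	}
	defer zw.Close()

	if _, err := io.Copy(zw, os.Stdin); err != nil {
		log.Fatal(err)
	}
}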

@USA-RedDragon (Author) commented:

Hey, I forgot about submitting this! 😄

Glad to see it get some activity. I'm not pushing for this change or anything, but it does seem to be some low hanging fruit.

@brandonc (Contributor) commented:

Even though this change makes go-slug compress slugs faster, I think there is an apprehension about replacing compress/gzip with klauspost/pgzip because of the added dependency on something other than the standard library. But there is also an acknowledgement that the compression step is slow. Therefore, if we're stuck with single-threaded execution, I've changed the default compression level to BestSpeed, which requires much less compute time (at the expense of a less favorable compression ratio).

I think this is a much better default for TFE, and it changes the impact of multithreaded compression and the calculus of whether or not to adopt it. We'll continue to monitor the performance of this step in the pipeline to see whether this was adequate. Thanks for your patience!

@brandonc brandonc closed this Nov 23, 2022