benchmarking http.jl vs download #25

Open
drizk1 opened this issue Jan 11, 2025 · 0 comments

drizk1 commented Jan 11, 2025

Was briefly thinking about switching to download instead of HTTP.jl to drop a dependency, but did a little benchmarking, and it looks like HTTP.jl is actually faster for remote files (median ~48 ms vs ~76 ms below), while local reads come out about the same. So I think we might as well leave it.

julia> url_file_path = "https://vincentarelbundock.github.io/Rdatasets/csv/openintro/ucla_f18.csv"
"https://vincentarelbundock.github.io/Rdatasets/csv/openintro/ucla_f18.csv"

julia> @benchmark CSV.read(download(url_file_path), DataFrame)
BenchmarkTools.Trial: 66 samples with 1 evaluation per sample.
 Range (min … max):  58.517 ms … 100.189 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     75.821 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   75.524 ms ±   9.110 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

               ▂   ▅        ▅ ▂    █  ▂                         
  ▅▅▁▅▁▅▅▅█▅▁█▁██▁▅██▅██▅█▁▅█████▁▅██▁██▁▁▁▅▁▁▁▁▁▅█▅▁▁▁▁▅▁▁▁▁▅ ▁
  58.5 ms         Histogram: frequency by time         98.1 ms <

 Memory estimate: 2.01 MiB, allocs estimate: 24063.

julia> @benchmark  read_csv(url_file_path)
BenchmarkTools.Trial: 101 samples with 1 evaluation per sample.
 Range (min … max):  37.511 ms … 78.963 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     48.330 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   49.855 ms ±  9.051 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▃ █ ▃▃▁▆█  ▃▃ ▃▆▃▃ █ ▃  ▁          ▃                       
  ▄▇█▇█▁█████▇▁██▁████▇█▁█▄▁█▇▁▁▄▄▁▁▇▄▄█▄▁▁▁▄▁▁▁▁▇▁▁▄▁▁▄▁▁▄▁▄ ▄
  37.5 ms         Histogram: frequency by time        75.8 ms <

 Memory estimate: 4.19 MiB, allocs estimate: 23570.

julia> @benchmark CSV.read(local_file_path, DataFrame)
BenchmarkTools.Trial: 527 samples with 1 evaluation per sample.
 Range (min … max):  7.186 ms … 329.770 ms  ┊ GC (min … max): 0.00% … 93.91%
 Time  (median):     8.608 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   9.507 ms ±  14.238 ms  ┊ GC (mean ± σ):  7.04% ±  5.91%

   █▆    ▂▁                                                    
  ▅██▆▅▅▆██▇▆▅▄▅▄▃▃▃▃▂▃▂▂▂▂▂▂▂▁▁▁▁▁▁▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▂ ▃
  7.19 ms         Histogram: frequency by time        19.2 ms <

 Memory estimate: 1.46 MiB, allocs estimate: 22579

julia> @benchmark read_csv(local_file_path)
BenchmarkTools.Trial: 527 samples with 1 evaluation per sample.
 Range (min … max):  7.199 ms … 284.456 ms  ┊ GC (min … max): 0.00% … 94.96%
 Time  (median):     8.905 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   9.476 ms ±  12.137 ms  ┊ GC (mean ± σ):  6.49% ±  6.37%

    ▅█▃       ▁▂▂▅                                             
  ▅▇███▆▇▄▃▅▄▅████▇▇▇▄▅▅▄▄▄▁▃▃▃▃▃▂▃▃▂▂▃▁▃▁▁▁▃▃▁▁▁▁▂▁▂▁▁▂▂▁▁▁▂ ▃
  7.2 ms          Histogram: frequency by time          15 ms <

 Memory estimate: 1.69 MiB, allocs estimate: 22677.
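
For reference, a minimal sketch of the two remote-read paths compared above. The issue does not show how `read_csv` is implemented, so this assumes it fetches the body with `HTTP.get` and passes the bytes to `CSV.read`; `read_via_download` and `read_via_http` are hypothetical names used only for illustration.

```julia
using CSV, DataFrames, HTTP

# Hypothetical helper: download the URL to a temporary file, then parse it from disk.
read_via_download(url::AbstractString) = CSV.read(download(url), DataFrame)

# Hypothetical helper: fetch the response body into memory and parse the bytes directly.
# Assumption: roughly what `read_csv` does for URLs; not confirmed by this issue.
function read_via_http(url::AbstractString)
    response = HTTP.get(url)  # throws on non-2xx status by default
    return CSV.read(response.body, DataFrame)
end
```

If the gap is real, one plausible reason is that the HTTP.jl path skips the temporary file on disk, though the numbers above don't isolate that. When re-running the comparison, interpolating the global into the benchmark (e.g. `@benchmark read_via_http($url_file_path)`) keeps global-variable lookup out of the measurement, although for a network-bound call the effect is likely negligible.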