Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add inbounds to src/Filters/design.jl #541

Closed

Conversation

Jafagervik
Copy link

@Jafagervik Jafagervik commented Feb 20, 2024

Avoid unnecessary boundary checking

Fixes #540

Copy link

codecov bot commented Feb 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.56%. Comparing base (4213e9a) to head (a318e1d).
Report is 72 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #541   +/-   ##
=======================================
  Coverage   97.56%   97.56%           
=======================================
  Files          18       18           
  Lines        3125     3125           
=======================================
  Hits         3049     3049           
  Misses         76       76           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ViralBShah
Copy link
Contributor

Is there a meaningful performance benefit?

@Jafagervik
Copy link
Author

Yes. I can try to profile where the compiler turns of blundary checking, but this macro is indeed powerful, and since I'm already using my own version of DSP.jl for my master thesis on ai and signal processing in julia, I thought I'd might as well contribute! I will do some more benches as soon as possible, but I'm quite busy these days

@ViralBShah
Copy link
Contributor

Thank you for the contribution, and good to know that you are already using this in your code.

A quick benchmark in this PR will help serve as documentation for the change. Generally, the preference is not to add @inbounds because people have used it incorrectly and aggressively in the past. But if the benefit is clear, it is definitely worthwhile to use it.

@ViralBShah
Copy link
Contributor

@wheeheee The windows failures seem codecov upload problems. I think the change to fail on codecov errors is perhaps aggressive. Would reverting it fix the error here (which is not in the package at all)?

@wheeheee
Copy link
Member

I think it's fine actually; I always expand the failed checks and prefer false positives to silent failures. Didn't comment before since I wanted to benchmark first, but that's for tomorrow...

It's weird, though, that Codecov says their server errors are fixed.

@wheeheee
Copy link
Member

Did some benchmarks for Butterworth, Chebyshev1, Chebyshev2, and Elliptic, and digitalfilter.
For these functions, I don't see any significant performance differences on Julia 1.10, and similarly on 1.6.

T = Float64, n = 5
julia> T = Float64; n = 5;

julia> @benchmark Butterworth($T, $n)   # PR
BenchmarkTools.Trial: 10000 samples with 651 evaluations.
 Range (min  max):  189.401 ns    3.459 μs  ┊ GC (min  max): 0.00%  91.17%
 Time  (median):     194.009 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   207.057 ns ± 164.544 ns  ┊ GC (mean ± σ):  4.54% ±  5.34%

  ▃▆█▇▄▂                                                        ▁
  ███████▇▆▆▇▇▅▆▇▇▇▇▆▅▄▅▆▇▇██▇▆▆▇▆▆▆▅▆▆▄▄▅▄▄▅▅▅▄▂▄▅▄▅▄▅▄▄▄▄▂▄▃▃ █
  189 ns        Histogram: log(frequency) by time        290 ns <

 Memory estimate: 240 bytes, allocs estimate: 3.

julia> @benchmark Butterworth($T, $n)   # master
BenchmarkTools.Trial: 10000 samples with 651 evaluations.
 Range (min  max):  189.401 ns    4.607 μs  ┊ GC (min  max): 0.00%  87.49%
 Time  (median):     193.241 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   207.237 ns ± 165.130 ns  ┊ GC (mean ± σ):  4.50% ±  5.32%

  ▄██▅▃▂▁      ▁                                                ▁
  ██████████▆▇███▇█▇▇▇▇▇▇▇█▆▆▆▇▅▆▅▅▅▅▅▅▅▅▄▄▄▆▅▅▅▄▄▅▅▄▅▄▄▄▄▄▄▃▃▄ █
  189 ns        Histogram: log(frequency) by time        299 ns <

 Memory estimate: 240 bytes, allocs estimate: 3.

julia> @benchmark Chebyshev1($T, $n, $2.1)  # PR
BenchmarkTools.Trial: 10000 samples with 259 evaluations.
 Range (min  max):  298.842 ns    8.293 μs  ┊ GC (min  max): 0.00%  94.97%
 Time  (median):     303.089 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   320.355 ns ± 283.496 ns  ┊ GC (mean ± σ):  3.24% ±  3.53%

  ▅█▇▃▁▁        ▁ ▁▂▂▂                                          ▂
  ███████▇▆▇▆▇▇██████████▇▇▆▆▆▆▇▆▇▇▇▅▆▆▆▆▆▅▆▆▅▅▅▄▅▄▅▄▄▃▃▁▄▄▄▁▄▄ █
  299 ns        Histogram: log(frequency) by time        450 ns <

 Memory estimate: 256 bytes, allocs estimate: 4.

julia> @benchmark Chebyshev1($T, $n, $2.1)  # master
BenchmarkTools.Trial: 10000 samples with 264 evaluations.
 Range (min  max):  296.970 ns    8.127 μs  ┊ GC (min  max): 0.00%  93.98%
 Time  (median):     301.894 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   317.789 ns ± 265.608 ns  ┊ GC (mean ± σ):  3.06% ±  3.52%

  ▂▇█▆▃▁            ▁    ▁                                      ▁
  ████████▇▅▅▄▄▅▃▅▄▇██▇███████▇▆▇▆▅▆▅▅▆▄▅▅▅▅▅▅▅▆▅▆▄▅▄▃▃▅▄▄▃▃▂▃▃ █
  297 ns        Histogram: log(frequency) by time        416 ns <

 Memory estimate: 256 bytes, allocs estimate: 4.

julia> @benchmark Chebyshev2($T, $n, $2.1)  # PR
BenchmarkTools.Trial: 10000 samples with 214 evaluations.
 Range (min  max):  351.402 ns    5.741 μs  ┊ GC (min  max): 0.00%  90.91%
 Time  (median):     356.542 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   371.667 ns ± 192.131 ns  ┊ GC (mean ± σ):  1.90% ±  3.49%

  ▅█▆▃▁                   ▁▁                                    ▁
  ███████▇▆▆▇▆▇█▆▆▆▆▅▆▆▆▇█████▇▆▆▅▆▅▆▆▅▆▅▄▅▃▄▄▄▄▄▄▄▁▃▄▄▃▃▄▅▅▄▅▅ █
  351 ns        Histogram: log(frequency) by time        545 ns <

 Memory estimate: 320 bytes, allocs estimate: 4.

julia> @benchmark Chebyshev2($T, $n, $2.1)  # master
BenchmarkTools.Trial: 10000 samples with 214 evaluations.
 Range (min  max):  352.804 ns    5.445 μs  ┊ GC (min  max): 0.00%  90.65%
 Time  (median):     357.477 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   372.132 ns ± 185.588 ns  ┊ GC (mean ± σ):  1.86% ±  3.49%

  ▅█▇▄▂▁▁▁         ▁▂                                           ▁
  ██████████▆▆▅▆▅▅▆███▆▇▆▆▆▆▆▅▅▅▆▆▇██▇█▇▆▆▅▄▅▄▅▅▅▅▅▄▅▅▃▅▄▅▄▃▅▄▄ █
  353 ns        Histogram: log(frequency) by time        493 ns <

 Memory estimate: 320 bytes, allocs estimate: 4.

julia> @benchmark Elliptic($T, $n, $1, $2)  # PR
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min  max):  1.380 μs  121.990 μs  ┊ GC (min  max): 0.00%  95.83%
 Time  (median):     1.560 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.658 μs ±   1.235 μs  ┊ GC (mean ± σ):  0.71% ±  0.96%

  ▄      ▃█
  █▅▂▁▂▅▂██▅▃▂▃▆▃▂▃▃▄▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  1.38 μs         Histogram: frequency by time        2.57 μs <

 Memory estimate: 528 bytes, allocs estimate: 5.

julia> @benchmark Elliptic($T, $n, $1, $2)  # master
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min  max):  1.370 μs  130.560 μs  ┊ GC (min  max): 0.00%  96.74%
 Time  (median):     1.410 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.496 μs ±   1.302 μs  ┊ GC (mean ± σ):  0.84% ±  0.97%

  ██
  ███▄▂▂▁▂▄▆▄▂▁▁▃▂▃▄▃▂▁▁▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  1.37 μs         Histogram: frequency by time        2.04 μs <

 Memory estimate: 528 bytes, allocs estimate: 5.
T = Float64, n = 20
julia> T = Float64; n = 20;

julia> @benchmark Butterworth($T, $n)   # PR
BenchmarkTools.Trial: 10000 samples with 223 evaluations.
 Range (min  max):  332.287 ns    5.996 μs  ┊ GC (min  max): 0.00%  91.17%
 Time  (median):     337.220 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   358.970 ns ± 249.445 ns  ┊ GC (mean ± σ):  3.28% ±  4.45%

  ▆█▆▃▂▃▄▁    ▁▁                                                ▁
  ████████▇▆▇████▇▆▆▇▆▆▅▆▆▄▆▆▅▅▆▆▅▄▅▅▅▆▆▆▆▃▅▅▄▅▂▄▄▃▅▆▇▆▇▆▄▄▄▃▄▄ █
  332 ns        Histogram: log(frequency) by time        523 ns <

 Memory estimate: 496 bytes, allocs estimate: 3.

julia> @benchmark Butterworth($T, $n)   # master
BenchmarkTools.Trial: 10000 samples with 235 evaluations.
 Range (min  max):  322.979 ns    6.572 μs  ┊ GC (min  max): 0.00%  89.39%
 Time  (median):     327.234 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   348.547 ns ± 237.311 ns  ┊ GC (mean ± σ):  3.26% ±  4.52%

  ▇█▅▁▁▂▃▁   ▁▂▁                                                ▁
  ████████▇▇█████▇▇█▇▆▆▆▇▆▆▄▆▆▆▅▆▅▅▆▅▇█▇▆▆▆▅▆▅▄▅▄▄▃▄▄▃▃▄▃▄▃▃▅▄▃ █
  323 ns        Histogram: log(frequency) by time        507 ns <

 Memory estimate: 496 bytes, allocs estimate: 3.

julia> @benchmark Chebyshev1($T, $n, $2.1)  # PR
BenchmarkTools.Trial: 10000 samples with 199 evaluations.
 Range (min  max):  426.131 ns    7.531 μs  ┊ GC (min  max): 0.00%  91.70%
 Time  (median):     433.166 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   458.110 ns ± 299.829 ns  ┊ GC (mean ± σ):  2.96% ±  4.26%

  ▅█▆▄▃▂▁                     ▁                                 ▁
  ███████▇▇▇▆▇██▇▇▇▇▇▇▆▆▆▄▆▆▅▇██▇▇▅▆▅▄▃▄▅▄▄▄▂▂▃▄▃▃▃▃▄▄▃▄▂▄▄▃▃▃▅ █
  426 ns        Histogram: log(frequency) by time        675 ns <

 Memory estimate: 512 bytes, allocs estimate: 4.

julia> @benchmark Chebyshev1($T, $n, $2.1)  # master
BenchmarkTools.Trial: 10000 samples with 199 evaluations.
 Range (min  max):  432.663 ns    7.446 μs  ┊ GC (min  max): 0.00%  90.91%
 Time  (median):     439.698 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   463.778 ns ± 284.656 ns  ┊ GC (mean ± σ):  2.77% ±  4.23%

  ▅█▇▄▄▃▂▁     ▁▂                ▂▁                             ▂
  ████████████▇████▇▇▇▇▇▇▇▇▆▅▆▆▇█████▇▇▇▅▅▅▄▆▅▆▅▅▄▁▅▅▃▃▄▄▅▃▄▄▃▄ █
  433 ns        Histogram: log(frequency) by time        656 ns <

 Memory estimate: 512 bytes, allocs estimate: 4.

julia> @benchmark Chebyshev2($T, $n, $2.1)  # PR
BenchmarkTools.Trial: 10000 samples with 171 evaluations.
 Range (min  max):  628.655 ns    5.420 μs  ┊ GC (min  max): 0.00%  81.78%
 Time  (median):     638.596 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   666.492 ns ± 216.780 ns  ┊ GC (mean ± σ):  1.62% ±  4.45%

  ▆█▆▅▃▂▁▁ ▁  ▁▁                                                ▁
  ██████████████▇▇▇▇▇▆▆▆▆▅▆▇▇▆▆▆▆▅▅▅▆▆▆▆▇▆▄▄▃▄▃▃▄▃▂▃▃▄▃▄▄▃▃▃▃▄▃ █
  629 ns        Histogram: log(frequency) by time        987 ns <

 Memory estimate: 848 bytes, allocs estimate: 4.

julia> @benchmark Chebyshev2($T, $n, $2.1)  # master
BenchmarkTools.Trial: 10000 samples with 176 evaluations.
 Range (min  max):  606.818 ns    4.291 μs  ┊ GC (min  max): 0.00%  80.83%
 Time  (median):     616.477 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   637.933 ns ± 181.670 ns  ┊ GC (mean ± σ):  1.46% ±  4.49%

  ▄█▇▄▅▂▁       ▁▁                                              ▂
  ████████▇▇▆▆▆▆███▇▇▆▇▆▅▆▁▄▄▅▆▆▆▆▆▅▆▄▄▅▅▅▄▆▆▆▇█▆▅▄▅▄▄▅▄▄▁▅▃▃▅▄ █
  607 ns        Histogram: log(frequency) by time        905 ns <

 Memory estimate: 848 bytes, allocs estimate: 4.

julia> @benchmark Elliptic($T, $n, $1, $2)  # PR
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
 Range (min  max):  3.362 μs  127.925 μs  ┊ GC (min  max): 0.00%  93.63%
 Time  (median):     3.500 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.640 μs ±   1.300 μs  ┊ GC (mean ± σ):  0.33% ±  0.94%

  ██▇▁▄▃  ▄▃▁▄▆▇▆▄▃▂▁▁      ▁     ▁                           ▂
  ██████▆▆██████████████▇█▇▇█▇▇▇▇▇█▇▇▇▇▇▆▄▅▅▆▅▅▄▅▆▄▅▅▃▃▅▄▃▄▄▅ █
  3.36 μs      Histogram: log(frequency) by time      5.09 μs <

 Memory estimate: 1.03 KiB, allocs estimate: 5.

julia> @benchmark Elliptic($T, $n, $1, $2)  # master
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
 Range (min  max):  3.350 μs   94.287 μs  ┊ GC (min  max): 0.00%  92.06%
 Time  (median):     3.400 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.502 μs ± 975.978 ns  ┊ GC (mean ± σ):  0.25% ±  0.92%

  ▃▇██▄▃▁  ▂▁  ▂▂▃▃▃  ▃▃▃▂▃  ▂▁                               ▂
  ███████▆███▅▄█████▇▇██████████▇▇▆▆▇▄▆▄▄▆▄▅▄▄▅▆▅▅▆▄▅▄▅▅▄▅▄▂▄ █
  3.35 μs      Histogram: log(frequency) by time       4.3 μs <

 Memory estimate: 1.03 KiB, allocs estimate: 5.
T = Float64, n = 10_000
julia> T = Float64; n = 10_000;

julia> @benchmark Butterworth($T, $n)   # PR
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):   91.600 μs   3.293 ms  ┊ GC (min  max): 0.00%  94.99%
 Time  (median):      94.700 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   112.243 μs ± 82.190 μs  ┊ GC (mean ± σ):  3.83% ±  5.13%

  ▂█▅   ▁▃             ▃▅▄▂▁    ▁▁                             ▁
  ███▇█▇██▇▅▅▅▅▆▅▄▄▅▅▄▆████████▇██▇▆▆▆▇▆▆▇▆▆▆▆██▆▆▄▅▅▆▅▅▄▄▅▅▄▃ █
  91.6 μs       Histogram: log(frequency) by time       187 μs <

 Memory estimate: 156.41 KiB, allocs estimate: 5.

julia> @benchmark Butterworth($T, $n)   # master
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):   92.300 μs   3.223 ms  ┊ GC (min  max): 0.00%  94.91%
 Time  (median):      94.700 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   105.481 μs ± 68.349 μs  ┊ GC (mean ± σ):  3.03% ±  4.88%

  ▅█▃    ▂              ▂▃▂                                    ▁
  █████████▅▆▆▄▅▄▃▅▄▅▄▄▅██████▇▆▇▆▇█▆▆▆▅▆▆▆▆▅▅▄▄▇▅▅▅▅▄▄▄▄▃▄▃▄▄ █
  92.3 μs       Histogram: log(frequency) by time       185 μs <

 Memory estimate: 156.41 KiB, allocs estimate: 5.

julia> @benchmark Chebyshev1($T, $n, $2.1)  # PR
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):   98.300 μs   5.296 ms  ┊ GC (min  max): 0.00%  95.52%
 Time  (median):     100.600 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   111.360 μs ± 74.718 μs  ┊ GC (mean ± σ):  2.85% ±  4.82%

  ▆█▁▁▁ ▁            ▃▃▂▁▁    ▁▁                               ▁
  ████████▆▅▄▆▆▅▅▆▅▄▆███████▇███▆▆▆▆▇▇▆▆▆▆▅▆▅▅▅▅▅▅▅▅▃▁▄▄▄▅▇▆▆▆ █
  98.3 μs       Histogram: log(frequency) by time       205 μs <

 Memory estimate: 156.42 KiB, allocs estimate: 6.

julia> @benchmark Chebyshev1($T, $n, $2.1)  # master
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):   98.200 μs   3.129 ms  ┊ GC (min  max): 0.00%  94.34%
 Time  (median):     100.900 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   111.306 μs ± 67.067 μs  ┊ GC (mean ± σ):  2.83% ±  4.84%

  ▂█▇  ▁  ▁▁               ▄▃▂                                 ▁
  ██████████▇▅▆▅▅▅▄▅▄▅▅▃▄▄█████▇▇▇▇▇▆▆▆▆▅▅▄▅▄▄▆▆▆▅▃▄▄▅▅▇▇▄▅▄▃▅ █
  98.2 μs       Histogram: log(frequency) by time       181 μs <

 Memory estimate: 156.42 KiB, allocs estimate: 6.

julia> @benchmark Chebyshev2($T, $n, $2.1)  # PR
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  183.000 μs   3.321 ms  ┊ GC (min  max): 0.00%  92.13%
 Time  (median):     187.000 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   210.085 μs ± 97.362 μs  ┊ GC (mean ± σ):  3.16% ±  6.36%

  ▅█▂▂▁  ▂               ▄▄▁                                   ▁
  █████▇██▇▇▆▆▇▇▆▆▅▄▄▃▅▅▇█████▇▇▆▇▇██▆▆▆▆▆▆▆▅▅▅▆▅▆▆▅▅▅▄▃▄▃▄▄▃▄ █
  183 μs        Histogram: log(frequency) by time       361 μs <

 Memory estimate: 312.66 KiB, allocs estimate: 7.

julia> @benchmark Chebyshev2($T, $n, $2.1)  # master
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  180.600 μs    3.430 ms  ┊ GC (min  max): 0.00%  91.64%
 Time  (median):     187.900 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   218.443 μs ± 108.970 μs  ┊ GC (mean ± σ):  3.50% ±  6.58%

   █▇▃▂ ▂▃▁           ▄▅▃▂    ▁▁                                ▁
  ▆█████████▇▇▆▇▆▇▅▆▂▆████████████▆▇█▇▇▆▇▆▄▆█▆▅▅▃▄▄▅▄▅▄▃▄▃▃▃▄▄▃ █
  181 μs        Histogram: log(frequency) by time        393 μs <

 Memory estimate: 312.66 KiB, allocs estimate: 7.
julia> @benchmark Elliptic($T, $n, $1, $2)  # PR
BenchmarkTools.Trial: 3613 samples with 1 evaluation.
 Range (min  max):  1.319 ms    4.440 ms  ┊ GC (min  max): 0.00%  67.69%
 Time  (median):     1.335 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.381 ms ± 132.695 μs  ┊ GC (mean ± σ):  0.54% ±  3.49%

  ▃▁█▄▂▁▁      ▄▃ ▁        ▂
  ████████████▇██████▇▇▇▇█▇██▇▆▇▆▄▇▅▅▆▅▇█▆▄▄▆▅▅▄▅▂▅▂▃▂▂▄▃▂▂▂▃ █
  1.32 ms      Histogram: log(frequency) by time      1.68 ms <

 Memory estimate: 312.86 KiB, allocs estimate: 8.

julia> @benchmark Elliptic($T, $n, $1, $2)  # master
BenchmarkTools.Trial: 3644 samples with 1 evaluation.
 Range (min  max):  1.330 ms    4.476 ms  ┊ GC (min  max): 0.00%  67.69%
 Time  (median):     1.336 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.369 ms ± 117.290 μs  ┊ GC (mean ± σ):  0.56% ±  3.55%

  ▇█▄▂▂▂▁▁▁ ▁    ▄▄▂▂▁           ▁                            ▁
  █████████████▇█████████▆█▇█▇▇▅▇██▇▇▅▄▄▅▇▁▅▅▅▃▄▄▅▄▄▅▁▁▄▃▄▃▁▃ █
  1.33 ms      Histogram: log(frequency) by time       1.6 ms <

 Memory estimate: 312.86 KiB, allocs estimate: 8.
digitalfilter, Butterworth(3)
julia> hp = Highpass(0.5); bs = Bandstop(0.35, 0.4); bp = Bandpass(0.25, 0.75);

julia> butt = Butterworth(3);

julia> @benchmark digitalfilter($hp, $butt)     # PR
BenchmarkTools.Trial: 10000 samples with 451 evaluations.
 Range (min  max):  230.599 ns   1.262 μs  ┊ GC (min  max): 0.00%  70.83%
 Time  (median):     234.812 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   247.525 ns ± 59.115 ns  ┊ GC (mean ± σ):  1.25% ±  4.70%

  ▇█▅▃▂    ▃▂                                                  ▁
  ██████▇█████▇▇▇▇▇▇▆██▇▆▆▅▅▅▅▅▆▅▆▅▅▅▅▅▄▅▄▄▄▅▅▄▃▄▄▄▂▂▂▄▆███▇▆▅ █
  231 ns        Histogram: log(frequency) by time       405 ns <

 Memory estimate: 448 bytes, allocs estimate: 4.

julia> @benchmark digitalfilter($hp, $butt)     # master
BenchmarkTools.Trial: 10000 samples with 425 evaluations.
 Range (min  max):  236.235 ns   1.230 μs  ┊ GC (min  max): 0.00%  66.54%
 Time  (median):     241.176 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   251.295 ns ± 54.503 ns  ┊ GC (mean ± σ):  1.12% ±  4.49%

  ▆█▆▅▄▁   ▁▁                                                  ▁
  ███████▇▇██▆▆▆▆▅▆▅▅▆▆▆▇▆▅▄▅▄▄▅▅▄▄▄▃▅▃▃▃▄▃▅▄▃▇█▇▆▆▅▄▅▅▆▅▅▅▄▄▅ █
  236 ns        Histogram: log(frequency) by time       406 ns <

 Memory estimate: 448 bytes, allocs estimate: 4.

julia> @benchmark digitalfilter($bs, $butt)     # PR
BenchmarkTools.Trial: 10000 samples with 200 evaluations.
 Range (min  max):  408.500 ns   2.172 μs  ┊ GC (min  max): 0.00%  74.48%
 Time  (median):     417.500 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   435.752 ns ± 90.772 ns  ┊ GC (mean ± σ):  0.83% ±  3.73%

  ▁█▆▃      ▂                           ▁                      ▁
  ████▇█▇▇▆███▆▇▆▄▆▅▅▅▅▅▅▄▄▅▆▆▅▅▆▅▃▄▄▅▆███▆▅▅▅▅▅▄▄▄▆▆▅▅▄▄▄▄▅▄▅ █
  408 ns        Histogram: log(frequency) by time       695 ns <

 Memory estimate: 640 bytes, allocs estimate: 4.

julia> @benchmark digitalfilter($bs, $butt)     # master
BenchmarkTools.Trial: 10000 samples with 199 evaluations.
 Range (min  max):  416.080 ns   2.311 μs  ┊ GC (min  max): 0.00%  70.99%
 Time  (median):     426.131 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   444.156 ns ± 91.548 ns  ┊ GC (mean ± σ):  0.82% ±  3.73%

   ██▅▁▁▁     ▁                              ▁                 ▁
  ████████▆▆▆██▇▇▆▆▆▆▆▇▆▆▄▅▅▅▅▇▆▆▅▅▄▅▃▅▄▅▅▇███▇▆▅▄▂▄▄▅▃▇▆▆▄▃▃▅ █
  416 ns        Histogram: log(frequency) by time       681 ns <

 Memory estimate: 640 bytes, allocs estimate: 4.

julia> @benchmark digitalfilter($bp, $butt)     # PR
BenchmarkTools.Trial: 10000 samples with 222 evaluations.
 Range (min  max):  334.685 ns    2.760 μs  ┊ GC (min  max): 0.00%  78.93%
 Time  (median):     340.090 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   355.924 ns ± 101.835 ns  ┊ GC (mean ± σ):  1.31% ±  4.14%

  ▆█▆▃▁▁▂▁   ▂▂▁                                                ▁
  ███████████████▇▇▇▆▆▆▆▆▆▆▆▆▆▆▅▅▅▅▅▇▇▇▆▆▅▅▄▅▄▄▅▂▄▆▇▇▇▅▄▅▄▄▄▄▂▄ █
  335 ns        Histogram: log(frequency) by time        536 ns <

 Memory estimate: 592 bytes, allocs estimate: 4.

julia> @benchmark digitalfilter($bp, $butt)     # master
BenchmarkTools.Trial: 10000 samples with 217 evaluations.
 Range (min  max):  336.866 ns   2.486 μs  ┊ GC (min  max): 0.00%  60.49%
 Time  (median):     345.161 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   358.933 ns ± 97.166 ns  ┊ GC (mean ± σ):  1.19% ±  4.06%

  ▃██▅▂▂▂   ▁                                                  ▁
  ███████▇▆▆██▇▇▆▆▆▅▅▆▆▅▅▅▅▅▅▅▅▅▅▅▅▄▄▅▄▄▄▅▇█▆▆▄▅▄▄▅▅▅▅▅▅▅▄▂▄▆▅ █
  337 ns        Histogram: log(frequency) by time       576 ns <

 Memory estimate: 592 bytes, allocs estimate: 4.
digitalfilter, Butterworth(10_000)
julia> butt = Butterworth(10_000);

julia> @benchmark digitalfilter($hp, $butt)     # PR
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  412.000 μs    4.995 ms  ┊ GC (min  max): 0.00%  86.93%
 Time  (median):     419.000 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   493.169 μs ± 195.996 μs  ┊ GC (mean ± σ):  3.50% ±  7.83%

  █▃▃▁▁  ▁▆▄▄▃▂▁▁                                               ▁
  ██████▇██████████▇▇▆▆▅▅▄▅▄▄▄▁▁▁▁▁▁▁▃▁▄▁▁▄▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▃▅▆▆▆ █
  412 μs        Histogram: log(frequency) by time       1.45 ms <

 Memory estimate: 625.19 KiB, allocs estimate: 8.

julia> @benchmark digitalfilter($hp, $butt)     # master
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  418.300 μs    4.185 ms  ┊ GC (min  max): 0.00%  82.11%
 Time  (median):     430.400 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   476.807 μs ± 156.636 μs  ┊ GC (mean ± σ):  2.67% ±  6.93%

  █▆▃▂▂▁▁   ▃▄▂▂▂▁                                              ▁
  ███████████████████▇▆▇▆▇▅▅▅▆▆▆▅▄▁▃▄▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▃▄▁▅▃▆▆▆ █
  418 μs        Histogram: log(frequency) by time       1.23 ms <

 Memory estimate: 625.19 KiB, allocs estimate: 8.

julia> @benchmark digitalfilter($bs, $butt)     # PR
BenchmarkTools.Trial: 4535 samples with 1 evaluation.
 Range (min  max):  995.800 μs    4.446 ms  ┊ GC (min  max): 0.00%  65.81%
 Time  (median):       1.006 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):     1.097 ms ± 227.020 μs  ┊ GC (mean ± σ):  1.97% ±  5.96%

  █▄▃▂▂▂▁        ▁▃▂▂                                           ▁
  ████████▇▅▆▆▄▄▅██████▇▇██▇▆▆▇▆▅▅▅▅▄▄▅▄▄▃▃▁▄▁▁▁▃▄▁▃▃▄▁▅▄▆█▆▆▆▅ █
  996 μs        Histogram: log(frequency) by time       2.05 ms <

 Memory estimate: 1.22 MiB, allocs estimate: 8.

julia> @benchmark digitalfilter($bs, $butt)     # master
BenchmarkTools.Trial: 4463 samples with 1 evaluation.
 Range (min  max):  990.200 μs    4.490 ms  ┊ GC (min  max): 0.00%  67.51%
 Time  (median):       1.013 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):     1.118 ms ± 250.837 μs  ┊ GC (mean ± σ):  2.27% ±  6.52%

  ██▄▃▃▂▁       ▃▅▃▃▁▁▁  ▁                                      ▁
  ██████████▇▆▄▅██████████▇█▇▇▇▇▅▅▄▅▄▅▃▃▃▃▁▁▅▁▃▃▁▃▁▃▁▃▁▄▄▅▃▇██▇ █
  990 μs        Histogram: log(frequency) by time       2.13 ms <

 Memory estimate: 1.22 MiB, allocs estimate: 8.

julia> @benchmark digitalfilter($bp, $butt)     # PR
BenchmarkTools.Trial: 6012 samples with 1 evaluation.
 Range (min  max):  753.900 μs    4.050 ms  ┊ GC (min  max): 0.00%  73.92%
 Time  (median):     761.600 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   827.660 μs ± 186.852 μs  ┊ GC (mean ± σ):  1.99% ±  5.90%

  █▃▃▂▁▂▁        ▃▁▃▁▁                                          ▁
  ████████▇█▇▇▆▅▅██████▇▇▇▇▇█▇▅▇▅▅▄▄▄▃▁▅▄▅▄▃▅▁▃▁▃▃▃▁▃▄▄▃▅▅▅▇█▆▅ █
  754 μs        Histogram: log(frequency) by time       1.61 ms <

 Memory estimate: 1.07 MiB, allocs estimate: 8.

julia> @benchmark digitalfilter($bp, $butt)     # master
BenchmarkTools.Trial: 6154 samples with 1 evaluation.
 Range (min  max):  756.500 μs    4.047 ms  ┊ GC (min  max): 0.00%  73.83%
 Time  (median):     776.400 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   809.302 μs ± 132.723 μs  ┊ GC (mean ± σ):  1.50% ±  5.52%

  ▃▇█▆▃▃▂▁▁▁▁                      ▁▁▁                          ▁
  ███████████████▇▇▇▆▇▆▅▆▅▅▄▅▄▅▃▁▃▅███▇▇█▇▆▇▇▅▄▄▅▆▅▄▄▅▆▇▇▇▇▇▇▇▇ █
  756 μs        Histogram: log(frequency) by time        1.2 ms <

 Memory estimate: 1.07 MiB, allocs estimate: 8.

@martinholters
Copy link
Member

FWIW, the @inbounds annotations here do look correct to me, but if the benchmarks don't support that there is a performance improvement, I don't think we should be doing this. It will increase the mental load and the risk of introducing bugs in the future.

@wheeheee
Copy link
Member

Actually, I think there's still a potential missed optimisation here. It would probably be fine to not zero out the Vectors in Butterworth etc., which could be helpful to @Jafagervik

@Jafagervik
Copy link
Author

Actually, I think there's still a potential missed optimisation here. It would probably be fine to not zero out the Vectors in Butterworth etc., which could be helpful to @Jafagervik

This would help me a lot at least. I'm on julia 1.10.1 right now and I use Butterworth quite a lot

@wheeheee
Copy link
Member

I also experimented with using sincospi, inlining cospi, sincospi, cde, and sne, and managed to squeeze out, for larger n, a nearly 100% increase in performance for Butterworth, Chebyshev1, and Chebyshev2, and about 50% for Elliptic, with more modest improvements for small n.
However, as these functions are nearly always called with small numbers in the tests, I'm not sure how valuable these are.

julia> T = Float64; n = 10_000;

julia> @benchmark Butterworth($T, $n)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  42.800 μs   2.819 ms  ┊ GC (min  max): 0.00%  96.49%
 Time  (median):     43.600 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   58.370 μs ± 67.545 μs  ┊ GC (mean ± σ):  5.95% ±  5.27%

  █▃  ▂                         ▁    ▃▄▃▂▁▁                   ▁
  ██▆███▇▆▇█▇▇▅▄▆▅▆▆▅▄▄▅▄▅▄▅▃▃▄▇██▆▆▆█████████▇▇▇▆▅▆▅▆▄▆▆▆▅▄▄ █
  42.8 μs      Histogram: log(frequency) by time       110 μs <

 Memory estimate: 156.41 KiB, allocs estimate: 5.

julia> @benchmark Chebyshev1($T, $n, $2.1)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  50.200 μs   2.944 ms  ┊ GC (min  max): 0.00%  96.18%
 Time  (median):     51.100 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   58.743 μs ± 50.223 μs  ┊ GC (mean ± σ):  4.01% ±  5.00%

  █▇  ▁▂▄                           ▁▂▁      ▁                ▁
  ███▇████▇▇▇▇▇▆▆▄▅▆▆▅▅▆▄▃▃▃▆▅▆▃▅▄▂▇████▇▇▇▇████▇▇▇▇▆▆▅▆▅▅▄▃▄ █
  50.2 μs      Histogram: log(frequency) by time       108 μs <

 Memory estimate: 156.42 KiB, allocs estimate: 6.

julia> @benchmark Chebyshev2($T, $n, $2.1)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min  max):  103.800 μs   3.278 ms  ┊ GC (min  max): 0.00%  93.43%
 Time  (median):     105.400 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   124.029 μs ± 85.721 μs  ┊ GC (mean ± σ):  4.34% ±  6.55%

  █▃▂▂▂▁                   ▁   ▂▃▂▁                            ▁
  ████████▇█▇▇▆▆▆▆▇▇▆▆▅▅▅▆███▇▇█████▇▇▆▆▇▆▆▆▆▆▆▅▅▅▅▅▅▄▄▃▄▁▄▁▄▅ █
  104 μs        Histogram: log(frequency) by time       268 μs <

 Memory estimate: 312.66 KiB, allocs estimate: 7.

julia> @benchmark Elliptic($T, $n, $1, $2)
BenchmarkTools.Trial: 5771 samples with 1 evaluation.
 Range (min  max):  841.200 μs   3.936 ms  ┊ GC (min  max): 0.00%  75.26%
 Time  (median):     845.200 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   863.471 μs ± 83.093 μs  ┊ GC (mean ± σ):  0.60% ±  3.62%

  █▆▃▃▂▁▁    ▁    ▃▂                                           ▁
  █████████████▇▇▇████▇█▇▇▇▅▇▆▆▆▆▆▅▅▄▄▅▃▄▅▅▁▆▄▄▅▄▁▃▃▃▄▁▁▁▃▁▁▁▃ █
  841 μs        Histogram: log(frequency) by time       1.1 ms <

 Memory estimate: 312.86 KiB, allocs estimate: 8.

I pushed it to the cleanup PR, as it looks like a minor optimization. If it should be in its own PR, let me know.

@wheeheee wheeheee mentioned this pull request Feb 24, 2024
@ViralBShah
Copy link
Contributor

Close (if this is included in the cleanups)?

@wheeheee
Copy link
Member

wheeheee commented Mar 1, 2024

Yeah, if @Jafagervik comes back with benchmarks that show @inbounds is needed, it can be reopened.

@wheeheee wheeheee closed this Mar 1, 2024
@wheeheee wheeheee added the no changelog Prevent TagBot from inclusion in release notes label Nov 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no changelog Prevent TagBot from inclusion in release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add inbounds to methods in design.jl
4 participants