Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add mul_hi function for bit integers #57276

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

sumiya11
Copy link
Contributor

@sumiya11 sumiya11 commented Feb 5, 2025

Move the _mul_high function from base/multinverses.jl to base/int.jl. Rename it to mul_hi.

Addresses #53855.

I chose to reuse existing implementation over using the one proposed in #53855 by @LilithHafner because their performance is similar on my PC.

# existing
julia> @btime Base.MultiplicativeInverses._mul_high(x, y) setup=((x,y)=(rand(UInt128),rand(UInt128)));
  1.808 ns (0 allocations: 0 bytes)

# 53855
julia> @btime mul_hi_li2(x, y) setup=((x,y)=(rand(UInt128),rand(UInt128)));
  1.800 ns (0 allocations: 0 bytes)

Move the _mul_high function from base/multinverses.jl to base/int.jl.
Rename it to mul_hi.
@oscardssmith oscardssmith added performance Must go faster maths Mathematical functions labels Feb 5, 2025
Copy link
Member

@oscardssmith oscardssmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! while we're at it, do we want to use this to make widemul of UInt128/Int128 faster (by doing a mul, a mulhi and building the BigInt manually)? I guess that can always go in a followup PR if we care.

@sumiya11
Copy link
Contributor Author

sumiya11 commented Feb 5, 2025

while we're at it, do we want to use this to make widemul of UInt128/Int128 faster (by doing a mul, a mulhi and building the BigInt manually)?

I am not sure: I guess dynamic allocation would dominate the cost.

@sumiya11
Copy link
Contributor Author

sumiya11 commented Feb 5, 2025

Should mul_hi be exported?

@oscardssmith
Copy link
Member

I wouldn't for now. Exporting is easy if we want to in the future.

Copy link
Member

@LilithHafner LilithHafner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lovely, thanks!

Good choice re-using the existing implementation.

This should also be marked public in base/public.jl, even if not exported.

base/int.jl Outdated Show resolved Hide resolved
@martinholters
Copy link
Member

I'd suspect that on a 32 bit platform, the performance of mul_hi(::UInt64, ::UInt64) might be quite bad without a dedicated method. But do we even care?

@oscardssmith
Copy link
Member

LLVM should be able to figure something decent out

base/int.jl Outdated Show resolved Hide resolved
@giordano
Copy link
Contributor

giordano commented Feb 7, 2025

I'd suspect that on a 32 bit platform, the performance of mul_hi(::UInt64, ::UInt64) might be quite bad without a dedicated method.

julia> using BenchmarkTools

julia> @benchmark Base.mul_hi(x, y) setup=(x=rand(UInt64); y=rand(UInt64))
BenchmarkTools.Trial: 10000 samples with 999 evaluations per sample.
 Range (min … max):  7.999 ns … 64.661 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     8.106 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   8.534 ns ±  2.421 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▂▁                                                        ▁
  ███▃▃▁▁▁▁▁▁▁▁▁▁▆▆▄▅▅▄▁▁▄▃▄▁▁▁▄▅▁▄▁▁▃▁▁▅█▆█▇▇▇▃▄▄▃▁▄▄▄▅▅▆▄▆ █
  8 ns         Histogram: log(frequency) by time     21.6 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> versioninfo()
Julia Version 1.13.0-DEV.11
Commit a1dcb907faf (2025-02-06 14:44 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: Linux (i686-linux-gnu)
  CPU: 4 × Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz
  WORD_SIZE: 32
  LLVM: libLLVM-18.1.7 (ORCJIT, haswell)
  GC: Built with stock GC
Threads: 1 default, 1 interactive, 1 GC (on 4 virtual cores)

I don't know if this counts as bad enough, although the native code is a bit overcomplicated:

julia> code_native(Base.mul_hi, NTuple{2,UInt64}; debuginfo=:none)
        .text
        .file   "mul_hi"
        .globl  julia_mul_hi_5686               # -- Begin function julia_mul_hi_5686
        .p2align        4, 0x90
        .type   julia_mul_hi_5686,@function
julia_mul_hi_5686:                      # @julia_mul_hi_5686
; Function Signature: mul_hi(UInt64, UInt64)
# %bb.0:                                # %top
        push    ebp
        mov     ebp, esp
        push    ebx
        push    edi
        push    esi
        #DEBUG_VALUE: mul_hi:a <- [DW_OP_plus_uconst 12] [$ebp+0]
        #DEBUG_VALUE: mul_hi:b <- [DW_OP_plus_uconst 20] [$ebp+0]
        mov     ecx, dword ptr [ebp + 20]
        mov     eax, dword ptr [ebp + 24]
        mov     esi, dword ptr [ebp + 12]
        mov     edx, ecx
        mulx    ebx, ebx, esi
        mov     edx, eax
        mulx    esi, edi, esi
        add     edi, ebx
        mov     ebx, dword ptr [ebp + 16]
        adc     esi, 0
        mov     edx, ecx
        mulx    ecx, eax, ebx
        mov     edx, dword ptr [ebp + 24]
        mulx    ebx, edx, ebx
        add     eax, edi
        adc     ecx, esi
        setb    al
        add     edx, ecx
        movzx   eax, al
        adc     ebx, eax
        mov     eax, dword ptr [ebp + 8]
        mov     dword ptr [eax], edx
        mov     dword ptr [eax + 4], ebx
        pop     esi
        pop     edi
        pop     ebx
        pop     ebp
        ret     4
.Lfunc_end0:
        .size   julia_mul_hi_5686, .Lfunc_end0-julia_mul_hi_5686
                                        # -- End function
        .section        ".note.GNU-stack","",@progbits

@@ -71,6 +71,9 @@ public
isoperator,
isunaryoperator,

# Integer math
mul_hi,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is going to be public, it'd be nice for it to have a more descriptive name (and ideally one that is stylistically consistent, i.e. germanic case rather than snake case). Perhaps mulhighhalf or something like that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maths Mathematical functions performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants