Fix Backend Compatibility #17

Open · wants to merge 16 commits into base: master
10 changes: 9 additions & 1 deletion Project.toml
@@ -2,15 +2,23 @@ name = "DiffTests"
uuid = "de460e47-3fe3-5279-bb4a-814414816d5d"
keywords = ["automatic differentiation", "test"]
license = "MIT"
version = "0.1.2"
version = "0.1.3"

[deps]
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"

[weakdeps]
MKLSparse = "0c723cd3-b8cd-5d40-b370-ba682dde9aae"

[extensions]
# some operations (ldiv!) are only supported by MKLSparse
DiffTestsMKLSparseExt = ["MKLSparse"]

[compat]
julia = "1"
MKLSparse = "1.2"

[extras]
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
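For reference, this manifest wiring makes the extension load lazily: on Julia 1.9+, DiffTestsMKLSparseExt is compiled and loaded automatically once both DiffTests and the weak dependency MKLSparse are present in the session. A minimal sketch of the activation flow (session code, not part of the PR):

```julia
# Loading order does not matter; the extension activates once both are loaded.
using DiffTests        # extension not loaded yet
using MKLSparse        # now DiffTestsMKLSparseExt gets loaded

# Verify that the extension module was picked up (Julia >= 1.9):
ext = Base.get_extension(DiffTests, :DiffTestsMKLSparseExt)
@assert ext !== nothing "DiffTestsMKLSparseExt did not load"
```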
25 changes: 25 additions & 0 deletions ext/DiffTestsMKLSparseExt.jl
@@ -0,0 +1,25 @@
module DiffTestsMKLSparseExt

using DiffTests: DiffTests, test_matrix
using MKLSparse: MKLSparse
using LinearAlgebra: UpperTriangular, LowerTriangular
using SparseArrays: sparse
Member: BTW, this should not be done: extensions should only import the package itself and the weak dependency, not even standard libraries.

Contributor Author: Why? SparseArrays and LinearAlgebra are direct dependencies of both DiffTests itself and MKLSparse. The latter provides more specific definitions of matrix multiplication and division for sparse matrices that are used by the test functions.

Member: You should import them indirectly via DiffTests and MKLSparse (see e.g. JuliaStats/LogExpFunctions.jl#63).

Contributor Author: Thank you for the info. I've updated how the import is done. However, the changes that the extension makes to the function lists are not visible from the user session (I guess it's a different world age of DiffTests.jl or something like that).


sparse_ldiv(x::AbstractVecOrMat) = sparse(test_matrix(x)) \ x
sp_utriag_ldiv(x::AbstractVecOrMat) = UpperTriangular(sparse(test_matrix(x))) \ x
sp_ltriag_ldiv(x::AbstractVecOrMat) = LowerTriangular(sparse(test_matrix(x))) \ x

pushifmissing!(v::AbstractVector, el) =
isnothing(findfirst(==(el), v)) && push!(v, el)

# enable test functions that require ldiv!(::Matrix, ::SparseMatrixCSC, ::Matrix)
for f in [sparse_ldiv, sp_utriag_ldiv, sp_ltriag_ldiv]

pushifmissing!(DiffTests.VECTOR_TO_VECTOR_FUNCS, f)

pushifmissing!(DiffTests.MATRIX_TO_MATRIX_FUNCS, f)
end

end
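A quick usage note on the `pushifmissing!` helper defined above: it keeps the lists idempotent, so re-loading the extension (or several extensions adding the same function) cannot create duplicates:

```julia
funcs = Function[sin]
pushifmissing!(funcs, cos)   # funcs == [sin, cos]
pushifmissing!(funcs, cos)   # no-op: cos is already present
```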
149 changes: 78 additions & 71 deletions src/DiffTests.jl
@@ -1,6 +1,6 @@
module DiffTests

using LinearAlgebra: det, norm, dot, tr, Diagonal, LowerTriangular, UpperTriangular
using LinearAlgebra: det, norm, dot, tr, lu, Diagonal, LowerTriangular, UpperTriangular
using SparseArrays: sparse
using Statistics: mean

@@ -25,15 +25,15 @@ names, one should not expect these functions to be "correct" for their original
num2num_1(x) = sin(x)^2 / cos(x)^2
num2num_2(x) = 2*x + sqrt(x*x*x)
num2num_3(x) = 10.31^(x + x) - x
num2num_4(x) = 1
num2num_4(x) = one(x)
alyst marked this conversation as resolved.
num2num_5(x) = 1. / (1. + exp(-x))

const NUMBER_TO_NUMBER_FUNCS = (num2num_1, num2num_2, num2num_3,
num2num_4, num2num_5, identity)
const NUMBER_TO_NUMBER_FUNCS = [num2num_1, num2num_2, num2num_3,
Member: Can you explain why you switched the constant tuples to arrays?

Contributor Author: For consistency with the matrix-to-matrix functions, which are extended in DiffTestsMKLSparseExt. I can revert num2num to tuples again.

But out of curiosity: why were they tuples in the first place? What kind of optimisation does that trigger?

Member (@devmotion, Jun 22, 2023): I assume because using tuples reduces allocations.

I don't think it matters much but I'd suggest reverting these changes in the PR because 1) they seem unrelated and 2) it is not clear that they improve anything (on the contrary, they increase allocations).
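A rough way to quantify that claim (a sketch using BenchmarkTools; names and numbers are illustrative):

```julia
using BenchmarkTools

const FUNCS_TUPLE  = (sin, cos, tan)   # concrete heterogeneous element types
const FUNCS_VECTOR = [sin, cos, tan]   # Vector{Function}: abstract element type

apply_all(fs, x) = sum(f(x) for f in fs)

@btime apply_all($FUNCS_TUPLE, 0.5)    # typically unrolled, non-allocating
@btime apply_all($FUNCS_VECTOR, 0.5)   # dynamic dispatch per element, may allocate
```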

Contributor Author: I can revert it for the sets that are not modified in the extension. BTW, how much would it reduce the memory overhead (considering that in return Julia would have to construct a tuple type)?

Member: 🤷

Member: I'm a fan of keeping changes minimal and avoiding unrelated changes, so I'd suggest reverting these changes regardless of whether they can or should be made. For consistency, I think it would make sense to just use a tuple in the extension as well, and to open an issue or separate PR if you would like to change these tuples to arrays.

Contributor Author: It is possible to replace a tuple with the extended one (although I would then have to drop const from the declaration); the code would look a little more convoluted. But if you mean that the extension should define its own set of sparse functions to test, that would require adding support for that set of functions in all AD backend test suites.

I am also a fan of minimal changes, but there are also consistency and maintainability considerations. For this PR I thought it would be strange to change some function sets into arrays and keep others as tuples, or to rewrite tuples from the extension. But if you, as a maintainer, prefer minimal changes, I can minimize them.

Member: Oh, I missed that the extension modifies arrays in the package. That does not seem OK to me; I don't think extensions are supposed to do that. For instance, it means that the resulting state depends on the order in which (possibly additional) extensions are loaded, etc.

So in case you want to add an extension, I think a different design is needed anyway.

And yes, I would very much prefer more minimal changes.

Contributor Author:

> That does not seem OK to me, I don't think extensions are supposed to do that. For instance, this means that the resulting state depends on the order in which (possibly additional) extensions are loaded etc.

What is the specific scenario that you think would be problematic? The extensions would not remove functions from the lists, and they would check whether a function is already in the list. As for the order: I don't think the order of functions in a list has any noticeable effect. One thing to keep in mind is Revise.jl, but at the moment DiffTests.jl declares the function lists as const, so Revise.jl cannot update them on the fly; there is no regression here.

Also, a vector-based approach is already used (see below) to dynamically populate the lists based on the Julia version (e.g. because some linear algebra routines are not available in earlier versions). DiffTests.jl may drop support for Julia 1.0, but in the long run it might be beneficial to have a straightforward mechanism for supporting multiple Julia versions.

And let me refer to the original issue that I am trying to solve: I've created JuliaDiff/ForwardDiff.jl#589 to address what is IMO a very relevant practical problem, namely improving autodiff efficiency for linear models and multivariate normal distributions with dense or sparse matrices. So the test functions that I am adding here via the extension should be relevant for all autodiff backends. Vanilla SparseArrays.jl does not, at the moment, support ldiv!() for triangular sparse matrices, hence the MKLSparse.jl weak dependency.

Member: Extensions are just not designed for such a use case; their sole purpose is to extend existing functions.

A problematic scenario here would be another extension that also adds functions to the same array. Then the order of the functions in the array after loading the extensions can be completely arbitrary, and even worse, if precompilation and loading of the extensions happen in multiple parallel tasks, two push!es might interfere.
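A hypothetical illustration of the load-order dependence (ExtA/ExtB and the function names are made up here):

```julia
# Stand-in for a shared function list that two extensions both extend.
shared = Symbol[]
load_ext_a() = push!(shared, :func_a)   # what a hypothetical ExtA would do
load_ext_b() = push!(shared, :func_b)   # what a hypothetical ExtB would do

load_ext_a(); load_ext_b()   # user loads PackageA first: [:func_a, :func_b]
# load_ext_b(); load_ext_a() # opposite order: [:func_b, :func_a]
# With parallel precompilation, the two push!es could even interleave.
```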

num2num_4, num2num_5, identity]

#######################
# f(x::Number)::Array #
#######################
###############################
# f(x::Number)::AbstractArray #
###############################

function num2arr_1(x)
return reshape([num2num_1(x),
@@ -46,11 +46,11 @@ function num2arr_1(x)
num2num_3(x)], 2, 2, 2)
end

const NUMBER_TO_ARRAY_FUNCS = (num2arr_1,)
const NUMBER_TO_ARRAY_FUNCS = [num2arr_1,]

####################################
# f!(y::Array, x::Number)::Nothing #
####################################
############################################
# f!(y::AbstractArray, x::Number)::Nothing #
############################################

function num2arr_1!(y, x)
fill!(y, zero(x))
@@ -60,18 +60,18 @@ function num2arr_1!(y, x)
return nothing
end

const INPLACE_NUMBER_TO_ARRAY_FUNCS = (num2arr_1!,)
const INPLACE_NUMBER_TO_ARRAY_FUNCS = [num2arr_1!,]

########################
# f(x::Vector)::Number #
########################
################################
# f(x::AbstractVector)::Number #
################################

vec2num_1(x) = (exp(x[1]) + log(x[3]) * x[4]) / x[5]
vec2num_2(x) = x[1]*x[2] + sin(x[1])
vec2num_3(x) = norm(x' .* x)
vec2num_4(x) = ((sum(x) + prod(x)); 1)
vec2num_5(x) = sum((-x).^3)
vec2num_6(x) = sum([ifelse(i > 0, i, 0) for i in x])
vec2num_6(x) = sum([ifelse(i > 0, i, zero(eltype(x))) for i in x])
vec2num_7(x) = sum(map(y -> x[1] * y, x))

function rosenbrock_1(x)
@@ -114,16 +114,16 @@ end

self_weighted_logit(x) = inv(1.0 + exp(-dot(x, x)))

nested_array_mul(x) = sum(sum(x[1] * [[x[2], x[3]]]))
nested_array_mul(x)::eltype(x) = sum(sum(x[1] * [[x[2], x[3]]]))
alyst marked this conversation as resolved.

const VECTOR_TO_NUMBER_FUNCS = (vec2num_1, vec2num_2, vec2num_3, vec2num_4, vec2num_5,
const VECTOR_TO_NUMBER_FUNCS = [vec2num_1, vec2num_2, vec2num_3, vec2num_4, vec2num_5,
vec2num_6, vec2num_7, rosenbrock_1, rosenbrock_2,
rosenbrock_3, rosenbrock_4, ackley, self_weighted_logit,
nested_array_mul, first)
nested_array_mul, first]

########################
# f(x::Matrix)::Number #
########################
################################
# f(x::AbstractMatrix)::Number #
################################

mat2num_1(x) = det(first(x) * inv(x * x) + x)

@@ -144,7 +144,7 @@ mat2num_4(x) = mean(sum(sin.(x) * x, dims=2))

softmax(x) = sum(exp.(x) ./ sum(exp.(x), dims=2))

const MATRIX_TO_NUMBER_FUNCS = (det, mat2num_1, mat2num_2, mat2num_3, mat2num_4, softmax)
const MATRIX_TO_NUMBER_FUNCS = [det, mat2num_1, mat2num_2, mat2num_3, mat2num_4, softmax]

####################
# binary broadcast #
@@ -157,28 +157,33 @@ const BINARY_BROADCAST_OPS = ((a, b) -> broadcast(+, a, b),
(a, b) -> broadcast(\, a, b),
(a, b) -> broadcast(^, a, b))

#################################
# f(::Matrix, ::Matrix)::Number #
#################################
#########################################################
# f(::AbstractMatrix, ::AbstractMatrix)::AbstractMatrix #
#########################################################

# Julia LinearAlgebra does not support matrix\matrix division directly;
# one needs to compute a factorization of A first
ldiv_lu(A::AbstractMatrix, B::AbstractArray) = lu(A) \ B
rdiv_lu(A::AbstractArray, B::AbstractMatrix) = A / lu(B)

const BINARY_MATRIX_TO_MATRIX_FUNCS = (+, -, *, /, \,
const BINARY_MATRIX_TO_MATRIX_FUNCS = [+, -, *, rdiv_lu, ldiv_lu,
BINARY_BROADCAST_OPS...,
(a, b) -> a * transpose(b), (a, b) -> transpose(a) * b, (a, b) -> transpose(a) * transpose(b),
(a, b) -> a * adjoint(b), (a, b) -> adjoint(a) * b, (a, b) -> adjoint(a) * adjoint(b))
(a, b) -> a * adjoint(b), (a, b) -> adjoint(a) * b, (a, b) -> adjoint(a) * adjoint(b)]

###########################################
# f(::Matrix, ::Matrix, ::Matrix)::Number #
###########################################
###################################################################
# f(::AbstractMatrix, ::AbstractMatrix, ::AbstractMatrix)::Number #
###################################################################

relu(x) = log.(1.0 .+ exp.(x))
sigmoid(n) = 1. / (1. + exp.(-n))
neural_step(x1, w1, w2) = sigmoid(dot(w2[1:size(w1, 2)], relu(w1 * x1[1:size(w1, 2)])))

const TERNARY_MATRIX_TO_NUMBER_FUNCS = (neural_step,)
const TERNARY_MATRIX_TO_NUMBER_FUNCS = [neural_step,]

###################################
# f!(y::Array, x::Array)::Nothing #
###################################
###################################################
# f!(y::AbstractArray, x::AbstractArray)::Nothing #
###################################################
# Credit for `chebyquad!`, `brown_almost_linear!`, and `trigonometric!` goes to
# Kristoffer Carlsson (@KristofferC).

@@ -247,49 +252,46 @@ function mutation_test_2!(y, x)
return nothing
end

const INPLACE_ARRAY_TO_ARRAY_FUNCS = (chebyquad!, brown_almost_linear!, trigonometric!,
mutation_test_1!, mutation_test_2!)
const INPLACE_ARRAY_TO_ARRAY_FUNCS = [chebyquad!, brown_almost_linear!, trigonometric!,
mutation_test_1!, mutation_test_2!]

############################
# f(x::VecOrMat)::VecOrMat #
############################
############################################
# f(x::AbstractVecOrMat)::AbstractVecOrMat #
############################################

diag_matrix(::Type{T}, n::Integer) where T<:Real =
Diagonal(LinRange(convert(T, 0.01), convert(T, 100.0), n))
diag_matrix(x::VecOrMat) = diag_matrix(Float64, size(x, 1))
diag_matrix(x::AbstractVecOrMat) = diag_matrix(Float64, size(x, 1))

lehmer_matrix(::Type{T}, n::Integer) where T<:Real =
[convert(T, min(i, j)/max(i, j)) for i in 1:n, j in 1:n]
lehmer_matrix(x::VecOrMat) = lehmer_matrix(Float64, size(x, 1))
lehmer_matrix(x::AbstractVecOrMat)::Matrix{Float64} = lehmer_matrix(Float64, size(x, 1))

test_matrix = lehmer_matrix

# left multiplication by a constant matrix
diag_lmul(x::VecOrMat) = diag_matrix(x) * x
dense_lmul(x::VecOrMat) = test_matrix(x) * x
utriag_lmul(x::VecOrMat) = UpperTriangular(test_matrix(x)) * x
ltriag_lmul(x::VecOrMat) = LowerTriangular(test_matrix(x)) * x
sparse_lmul(x::VecOrMat) = sparse(test_matrix(x)) * x
sp_utriag_lmul(x::VecOrMat) = UpperTriangular(sparse(test_matrix(x))) * x
sp_ltriag_lmul(x::VecOrMat) = LowerTriangular(sparse(test_matrix(x))) * x
diag_lmul(x::AbstractVecOrMat) = diag_matrix(x) * x
dense_lmul(x::AbstractVecOrMat)::typeof(x) = test_matrix(x) * x
alyst marked this conversation as resolved.
utriag_lmul(x::AbstractVecOrMat)::typeof(x) = UpperTriangular(test_matrix(x)) * x
ltriag_lmul(x::AbstractVecOrMat)::typeof(x) = LowerTriangular(test_matrix(x)) * x

sparse_lmul(x::AbstractVecOrMat)::typeof(x) = sparse(test_matrix(x)) * x
sp_utriag_lmul(x::AbstractVecOrMat)::typeof(x) = UpperTriangular(sparse(test_matrix(x))) * x
sp_ltriag_lmul(x::AbstractVecOrMat)::typeof(x) = LowerTriangular(sparse(test_matrix(x))) * x

# left division by a constant matrix
diag_ldiv(x::VecOrMat) = diag_matrix(x) \ x
dense_ldiv(x::VecOrMat) = test_matrix(x) \ x
utriag_ldiv(x::VecOrMat) = UpperTriangular(test_matrix(x)) \ x
ltriag_ldiv(x::VecOrMat) = LowerTriangular(test_matrix(x)) \ x
sparse_ldiv(x::VecOrMat) = sparse(test_matrix(x)) \ x
sp_utriag_ldiv(x::VecOrMat) = UpperTriangular(sparse(test_matrix(x))) \ x
sp_ltriag_ldiv(x::VecOrMat) = LowerTriangular(sparse(test_matrix(x))) \ x

const VECTOR_TO_VECTOR_FUNCS = (diag_lmul, dense_lmul, utriag_lmul, ltriag_lmul,
diag_ldiv(x::AbstractVecOrMat) = diag_matrix(x) \ x
dense_ldiv(x::AbstractVecOrMat) = test_matrix(x) \ x
utriag_ldiv(x::AbstractVecOrMat)::typeof(x) = UpperTriangular(test_matrix(x)) \ x
ltriag_ldiv(x::AbstractVecOrMat)::typeof(x) = LowerTriangular(test_matrix(x)) \ x

const VECTOR_TO_VECTOR_FUNCS = [diag_lmul, dense_lmul, utriag_lmul, ltriag_lmul,
sparse_lmul, sp_utriag_lmul, sp_ltriag_lmul,
diag_ldiv, utriag_ldiv, ltriag_ldiv,
sparse_ldiv, sp_utriag_ldiv, sp_ltriag_ldiv,)
diag_ldiv,]

######################
# f(x::Array)::Array #
######################
######################################
# f(x::AbstractArray)::AbstractArray #
######################################

chebyquad(x) = (y = fill(zero(eltype(x)), size(x)); chebyquad!(y, x); return y)

@@ -305,17 +307,22 @@ arr2arr_1(x) = (sum(x .* x); fill(zero(eltype(x)), size(x)))

arr2arr_2(x) = x[1, :] .+ x[1, :] .+ first(x)

const ARRAY_TO_ARRAY_FUNCS = (-, chebyquad, brown_almost_linear, trigonometric, arr2arr_1,
arr2arr_2, mutation_test_1, mutation_test_2, identity)
const ARRAY_TO_ARRAY_FUNCS = [-, chebyquad, brown_almost_linear, trigonometric, arr2arr_1,
arr2arr_2, mutation_test_1, mutation_test_2, identity]

#######################
# f(::Matrix)::Matrix #
#######################
#######################################
# f(::AbstractMatrix)::AbstractMatrix #
#######################################

const MATRIX_TO_MATRIX_FUNCS = (inv,
const MATRIX_TO_MATRIX_FUNCS = [inv,
diag_lmul, dense_lmul, utriag_lmul, ltriag_lmul,
sparse_lmul, sp_utriag_lmul, sp_ltriag_lmul,
diag_ldiv, utriag_ldiv, ltriag_ldiv,
sparse_ldiv, sp_utriag_ldiv, sp_ltriag_ldiv,)
diag_ldiv,]

if VERSION >= v"1.1"
# these require ldiv!(triag, adjoint), which is not implemented in 1.0
append!(VECTOR_TO_VECTOR_FUNCS, [utriag_ldiv, ltriag_ldiv])
append!(MATRIX_TO_MATRIX_FUNCS, [utriag_ldiv, ltriag_ldiv])
end

end # module
31 changes: 18 additions & 13 deletions test/runtests.jl
@@ -2,56 +2,61 @@ using DiffTests
using Test

n = rand()
x, y = rand(5, 5), rand(26)
x, y = rand(4, 6), rand(26)
A, B = rand(5, 5), rand(5, 5)

# f returns Number

for f in DiffTests.NUMBER_TO_NUMBER_FUNCS
@test isa(f(n), Number)
@test isa(@inferred(f(n)), Number)
end

for f in DiffTests.VECTOR_TO_NUMBER_FUNCS
@test isa(f(y), Number)
@test isa(@inferred(f(y)), Number)
end

for f in DiffTests.MATRIX_TO_NUMBER_FUNCS
@test isa(f(x), Number)
@test isa(@inferred(f(A)), Number)
end

for f in DiffTests.TERNARY_MATRIX_TO_NUMBER_FUNCS
@test isa(f(A, B, x), Number)
@test isa(@inferred(f(A, B, x)), Number)
end

# f returns Array

for f in DiffTests.NUMBER_TO_ARRAY_FUNCS
@test isa(f(n), Array)
@test isa(@inferred(f(n)), Array)
end

for f in DiffTests.ARRAY_TO_ARRAY_FUNCS
@test isa(f(A), Array)
@test isa(f(y), Array)
@test isa(@inferred(f(x)), Array)
@test isa(@inferred(f(y)), Array)
end

for f in DiffTests.VECTOR_TO_VECTOR_FUNCS
@test isa(f(y), Vector)
@test isa(@inferred(f(y)), Vector)
end

for f in DiffTests.MATRIX_TO_MATRIX_FUNCS
@test isa(f(A), Matrix)
@test isa(@inferred(f(A)), Matrix)
end

for f in DiffTests.BINARY_MATRIX_TO_MATRIX_FUNCS
@test isa(f(A, B), Matrix)
@test isa(@inferred(f(A, B)), Matrix)
end

# f! returns Nothing

for f! in DiffTests.INPLACE_ARRAY_TO_ARRAY_FUNCS
@test isa(f!(y, x), Nothing)
z = similar(x)
@test isa(@inferred(f!(z, x)), Nothing)
end

for f! in DiffTests.INPLACE_NUMBER_TO_ARRAY_FUNCS
@test isa(f!(y, n), Nothing)
zx = similar(x)
@test isa(@inferred(f!(zx, n)), Nothing)

zy = similar(y)
@test isa(@inferred(f!(zy, n)), Nothing)
end