Fix Backend Compatibility #17
base: master
ext/DiffTestsMKLSparseExt.jl (new file)

```diff
@@ -0,0 +1,25 @@
+module DiffTestsMKLSparseExt
+
+using DiffTests: DiffTests, test_matrix
+using MKLSparse  # weak dependency: provides ldiv! for triangular sparse matrices
+using LinearAlgebra: UpperTriangular, LowerTriangular
+using SparseArrays: sparse
+
+sparse_ldiv(x::AbstractVecOrMat) = sparse(test_matrix(x)) \ x
+sp_utriag_ldiv(x::AbstractVecOrMat) = UpperTriangular(sparse(test_matrix(x))) \ x
+sp_ltriag_ldiv(x::AbstractVecOrMat) = LowerTriangular(sparse(test_matrix(x))) \ x
+
+pushifmissing!(v::AbstractVector, el) =
+    isnothing(findfirst(==(el), v)) && push!(v, el)
+
+# enable test functions that require ldiv!(::Matrix, ::SparseMatrixCSC, ::Matrix)
+for f in [sparse_ldiv, sp_utriag_ldiv, sp_ltriag_ldiv]
+    pushifmissing!(DiffTests.VECTOR_TO_VECTOR_FUNCS, f)
+    pushifmissing!(DiffTests.MATRIX_TO_MATRIX_FUNCS, f)
+end
+
+end # module
```
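For context, a minimal sketch of how this extension is expected to activate, assuming DiffTests' Project.toml declares MKLSparse under `[weakdeps]` and maps it to `DiffTestsMKLSparseExt` under `[extensions]`:

```julia
using DiffTests
n = length(DiffTests.MATRIX_TO_MATRIX_FUNCS)  # extension not loaded yet

using MKLSparse  # loading the weak dependency triggers DiffTestsMKLSparseExt
# the three sparse ldiv test functions should now have been appended:
@assert length(DiffTests.MATRIX_TO_MATRIX_FUNCS) == n + 3
```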
src/DiffTests.jl

```diff
@@ -1,6 +1,6 @@
 module DiffTests

-using LinearAlgebra: det, norm, dot, tr, Diagonal, LowerTriangular, UpperTriangular
+using LinearAlgebra: det, norm, dot, tr, lu, Diagonal, LowerTriangular, UpperTriangular
 using SparseArrays: sparse
 using Statistics: mean
```
```diff
@@ -25,15 +25,15 @@ names, one should not expect these functions to be "correct" for their original
 num2num_1(x) = sin(x)^2 / cos(x)^2
 num2num_2(x) = 2*x + sqrt(x*x*x)
 num2num_3(x) = 10.31^(x + x) - x
-num2num_4(x) = 1
+num2num_4(x) = one(x)
 num2num_5(x) = 1. / (1. + exp(-x))

-const NUMBER_TO_NUMBER_FUNCS = (num2num_1, num2num_2, num2num_3,
-                                num2num_4, num2num_5, identity)
+const NUMBER_TO_NUMBER_FUNCS = [num2num_1, num2num_2, num2num_3,
+                                num2num_4, num2num_5, identity]
```

Review thread on the tuple-to-array change:

Reviewer: Can you explain why you switched the constant tuples to arrays?

alyst (author): For consistency with the matrix2matrix functions, which are extended in DiffTestsMKLSparseExt. I can revert num2num to be tuples again. But out of curiosity: why were they tuples in the first place? What kind of optimisation does it trigger?

Reviewer: I assume because using tuples reduces allocations. I don't think it matters much, but I'd suggest reverting these changes in the PR because 1) they seem unrelated and 2) it is not clear that they improve anything (on the contrary, they increase allocations).

alyst: I can revert it for the sets that are not modified in the extension. BTW, how much would it reduce the memory overhead (considering that in return Julia would have to construct a tuple type)?

Reviewer: 🤷

Reviewer: I'm a fan of keeping changes minimal and avoiding unrelated changes, so I'd suggest reverting these changes regardless of whether they can or should be changed. For consistency, I think it would make sense to just use a tuple in the extension as well, and to open an issue or a separate PR if you would like to change these tuples to arrays.

alyst: It is possible to replace a tuple with the extended one (although I would then have to drop the `const`). I am also a fan of minimal changes, but there are also consistency and maintainability considerations. For this PR I thought it would be strange to change some function sets into arrays and keep others as tuples, or to rewrite tuples from the extension. But if you, as a maintainer, prefer minimal changes, I can minimize them.

Reviewer: Oh, I missed that the extension modifies arrays in the package. That does not seem OK to me; I don't think extensions are supposed to do that. For instance, this means that the resulting state depends on the order in which (possibly additional) extensions are loaded. So in case you want to add an extension, I think a different design is needed anyway. And yes, I very much would prefer more minimal changes.
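To make the reviewer's allocation point concrete, a rough sketch of the difference (hypothetical micro-benchmark, not part of the PR; assumes BenchmarkTools is installed):

```julia
using BenchmarkTools

const FUNCS_TUPLE = (sin, cos, exp)
const FUNCS_VECTOR = [sin, cos, exp]

# with a tuple, each element's concrete type is part of the container type,
# so the reduction below can be fully inferred; with a Vector{Function},
# every call goes through dynamic dispatch and may allocate
@btime sum(f -> f(0.5), $FUNCS_TUPLE)
@btime sum(f -> f(0.5), $FUNCS_VECTOR)
```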
alyst: What is the specific scenario that you think would be problematic? Also, the vector-based approach is used (see below) to dynamically populate the lists based on the Julia version (e.g. because some linear algebra routines are not available in the earlier versions). DiffTests.jl may drop support for Julia 1.0, but in the long run it might be beneficial to have a straightforward mechanism for supporting multiple Julia versions. And let me refer to the original issue that I am trying to solve: I've created JuliaDiff/ForwardDiff.jl#589 to address what is, IMO, a very relevant practical problem, namely improving autodiff efficiency for linear models and multivariate normal distributions with dense or sparse matrices. So the test functions that I am adding here via the extension should be relevant for all autodiff backends. Vanilla SparseArrays.jl does not, at the moment, support ldiv!() for triangular sparse matrices, hence the MKLSparse.jl weak dependency.

Reviewer: Extensions are just not designed for such a use case; their sole purpose is to extend existing functions. A problematic scenario here would be another extension that also adds functions to the same array: the order of the functions in the array after loading the extensions can be completely arbitrary, and even worse, if precompilation and loading of the extensions happens with multiple tasks in parallel, two tasks could mutate the same array concurrently.
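A hypothetical illustration of the load-order problem described above (all names invented):

```julia
# two extensions that both append to a shared list in the host package
const SHARED_FUNCS = Function[]

ext_a_init() = push!(SHARED_FUNCS, sin)  # stands in for extension A loading
ext_b_init() = push!(SHARED_FUNCS, cos)  # stands in for extension B loading

# depending on which weak dependency the user happens to load first,
# SHARED_FUNCS ends up as [sin, cos] or [cos, sin]; any test code that
# relies on a fixed order (or indexes into the list) silently breaks
ext_a_init(); ext_b_init()
```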
```diff
-#######################
-# f(x::Number)::Array #
-#######################
+###############################
+# f(x::Number)::AbstractArray #
+###############################

 function num2arr_1(x)
     return reshape([num2num_1(x),

@@ -46,11 +46,11 @@ function num2arr_1(x)
                     num2num_3(x)], 2, 2, 2)
 end

-const NUMBER_TO_ARRAY_FUNCS = (num2arr_1,)
+const NUMBER_TO_ARRAY_FUNCS = [num2arr_1,]

-####################################
-# f!(y::Array, x::Number)::Nothing #
-####################################
+############################################
+# f!(y::AbstractArray, x::Number)::Nothing #
+############################################

 function num2arr_1!(y, x)
     fill!(y, zero(x))

@@ -60,18 +60,18 @@ function num2arr_1!(y, x)
     return nothing
 end

-const INPLACE_NUMBER_TO_ARRAY_FUNCS = (num2arr_1!,)
+const INPLACE_NUMBER_TO_ARRAY_FUNCS = [num2arr_1!,]

-########################
-# f(x::Vector)::Number #
-########################
+################################
+# f(x::AbstractVector)::Number #
+################################

 vec2num_1(x) = (exp(x[1]) + log(x[3]) * x[4]) / x[5]
 vec2num_2(x) = x[1]*x[2] + sin(x[1])
 vec2num_3(x) = norm(x' .* x)
 vec2num_4(x) = ((sum(x) + prod(x)); 1)
 vec2num_5(x) = sum((-x).^3)
-vec2num_6(x) = sum([ifelse(i > 0, i, 0) for i in x])
+vec2num_6(x) = sum([ifelse(i > 0, i, zero(eltype(x))) for i in x])
 vec2num_7(x) = sum(map(y -> x[1] * y, x))

 function rosenbrock_1(x)

@@ -114,16 +114,16 @@ end

 self_weighted_logit(x) = inv(1.0 + exp(-dot(x, x)))

-nested_array_mul(x) = sum(sum(x[1] * [[x[2], x[3]]]))
+nested_array_mul(x)::eltype(x) = sum(sum(x[1] * [[x[2], x[3]]]))

-const VECTOR_TO_NUMBER_FUNCS = (vec2num_1, vec2num_2, vec2num_3, vec2num_4, vec2num_5,
-                                vec2num_6, vec2num_7, rosenbrock_1, rosenbrock_2,
-                                rosenbrock_3, rosenbrock_4, ackley, self_weighted_logit,
-                                nested_array_mul, first)
+const VECTOR_TO_NUMBER_FUNCS = [vec2num_1, vec2num_2, vec2num_3, vec2num_4, vec2num_5,
+                                vec2num_6, vec2num_7, rosenbrock_1, rosenbrock_2,
+                                rosenbrock_3, rosenbrock_4, ackley, self_weighted_logit,
+                                nested_array_mul, first]

-########################
-# f(x::Matrix)::Number #
-########################
+################################
+# f(x::AbstractMatrix)::Number #
+################################

 mat2num_1(x) = det(first(x) * inv(x * x) + x)

@@ -144,7 +144,7 @@ mat2num_4(x) = mean(sum(sin.(x) * x, dims=2))

 softmax(x) = sum(exp.(x) ./ sum(exp.(x), dims=2))

-const MATRIX_TO_NUMBER_FUNCS = (det, mat2num_1, mat2num_2, mat2num_3, mat2num_4, softmax)
+const MATRIX_TO_NUMBER_FUNCS = [det, mat2num_1, mat2num_2, mat2num_3, mat2num_4, softmax]

 ####################
 # binary broadcast #

@@ -157,28 +157,33 @@ const BINARY_BROADCAST_OPS = ((a, b) -> broadcast(+, a, b),
                               (a, b) -> broadcast(\, a, b),
                               (a, b) -> broadcast(^, a, b))

-#################################
-# f(::Matrix, ::Matrix)::Number #
-#################################
+#########################################################
+# f(::AbstractMatrix, ::AbstractMatrix)::AbstractMatrix #
+#########################################################
+
+# Julia LinearAlgebra does not support matrix\matrix,
+# one needs to compute A factorization first
+ldiv_lu(A::AbstractMatrix, B::AbstractArray) = lu(A) \ B
+rdiv_lu(A::AbstractArray, B::AbstractMatrix) = A / lu(B)

-const BINARY_MATRIX_TO_MATRIX_FUNCS = (+, -, *, /, \,
+const BINARY_MATRIX_TO_MATRIX_FUNCS = [+, -, *, rdiv_lu, ldiv_lu,
                                        BINARY_BROADCAST_OPS...,
                                        (a, b) -> a * transpose(b), (a, b) -> transpose(a) * b, (a, b) -> transpose(a) * transpose(b),
-                                       (a, b) -> a * adjoint(b), (a, b) -> adjoint(a) * b, (a, b) -> adjoint(a) * adjoint(b))
+                                       (a, b) -> a * adjoint(b), (a, b) -> adjoint(a) * b, (a, b) -> adjoint(a) * adjoint(b)]

-###########################################
-# f(::Matrix, ::Matrix, ::Matrix)::Number #
-###########################################
+###################################################################
+# f(::AbstractMatrix, ::AbstractMatrix, ::AbstractMatrix)::Number #
+###################################################################

 relu(x) = log.(1.0 .+ exp.(x))
 sigmoid(n) = 1. / (1. + exp.(-n))
 neural_step(x1, w1, w2) = sigmoid(dot(w2[1:size(w1, 2)], relu(w1 * x1[1:size(w1, 2)])))

-const TERNARY_MATRIX_TO_NUMBER_FUNCS = (neural_step,)
+const TERNARY_MATRIX_TO_NUMBER_FUNCS = [neural_step,]

-###################################
-# f!(y::Array, x::Array)::Nothing #
-###################################
+###################################################
+# f!(y::AbstractArray, x::AbstractArray)::Nothing #
+###################################################
 # Credit for `chebyquad!`, `brown_almost_linear!`, and `trigonometric!` goes to
 # Kristoffer Carlsson (@KristofferC).
```
```diff
@@ -247,49 +252,46 @@ function mutation_test_2!(y, x)
     return nothing
 end

-const INPLACE_ARRAY_TO_ARRAY_FUNCS = (chebyquad!, brown_almost_linear!, trigonometric!,
-                                      mutation_test_1!, mutation_test_2!)
+const INPLACE_ARRAY_TO_ARRAY_FUNCS = [chebyquad!, brown_almost_linear!, trigonometric!,
+                                      mutation_test_1!, mutation_test_2!]

-############################
-# f(x::VecOrMat)::VecOrMat #
-############################
+############################################
+# f(x::AbstractVecOrMat)::AbstractVecOrMat #
+############################################

 diag_matrix(::Type{T}, n::Integer) where T<:Real =
     Diagonal(LinRange(convert(T, 0.01), convert(T, 100.0), n))
-diag_matrix(x::VecOrMat) = diag_matrix(Float64, size(x, 1))
+diag_matrix(x::AbstractVecOrMat) = diag_matrix(Float64, size(x, 1))

 lehmer_matrix(::Type{T}, n::Integer) where T<:Real =
     [convert(T, min(i, j)/max(i, j)) for i in 1:n, j in 1:n]
-lehmer_matrix(x::VecOrMat) = lehmer_matrix(Float64, size(x, 1))
+lehmer_matrix(x::AbstractVecOrMat)::Matrix{Float64} = lehmer_matrix(Float64, size(x, 1))

 test_matrix = lehmer_matrix

 # left multiplication by a constant matrix
-diag_lmul(x::VecOrMat) = diag_matrix(x) * x
-dense_lmul(x::VecOrMat) = test_matrix(x) * x
-utriag_lmul(x::VecOrMat) = UpperTriangular(test_matrix(x)) * x
-ltriag_lmul(x::VecOrMat) = LowerTriangular(test_matrix(x)) * x
-sparse_lmul(x::VecOrMat) = sparse(test_matrix(x)) * x
-sp_utriag_lmul(x::VecOrMat) = UpperTriangular(sparse(test_matrix(x))) * x
-sp_ltriag_lmul(x::VecOrMat) = LowerTriangular(sparse(test_matrix(x))) * x
+diag_lmul(x::AbstractVecOrMat) = diag_matrix(x) * x
+dense_lmul(x::AbstractVecOrMat)::typeof(x) = test_matrix(x) * x
+utriag_lmul(x::AbstractVecOrMat)::typeof(x) = UpperTriangular(test_matrix(x)) * x
+ltriag_lmul(x::AbstractVecOrMat)::typeof(x) = LowerTriangular(test_matrix(x)) * x
+sparse_lmul(x::AbstractVecOrMat)::typeof(x) = sparse(test_matrix(x)) * x
+sp_utriag_lmul(x::AbstractVecOrMat)::typeof(x) = UpperTriangular(sparse(test_matrix(x))) * x
+sp_ltriag_lmul(x::AbstractVecOrMat)::typeof(x) = LowerTriangular(sparse(test_matrix(x))) * x

 # left division by a constant matrix
-diag_ldiv(x::VecOrMat) = diag_matrix(x) \ x
-dense_ldiv(x::VecOrMat) = test_matrix(x) \ x
-utriag_ldiv(x::VecOrMat) = UpperTriangular(test_matrix(x)) \ x
-ltriag_ldiv(x::VecOrMat) = LowerTriangular(test_matrix(x)) \ x
-sparse_ldiv(x::VecOrMat) = sparse(test_matrix(x)) \ x
-sp_utriag_ldiv(x::VecOrMat) = UpperTriangular(sparse(test_matrix(x))) \ x
-sp_ltriag_ldiv(x::VecOrMat) = LowerTriangular(sparse(test_matrix(x))) \ x
+diag_ldiv(x::AbstractVecOrMat) = diag_matrix(x) \ x
+dense_ldiv(x::AbstractVecOrMat) = test_matrix(x) \ x
+utriag_ldiv(x::AbstractVecOrMat)::typeof(x) = UpperTriangular(test_matrix(x)) \ x
+ltriag_ldiv(x::AbstractVecOrMat)::typeof(x) = LowerTriangular(test_matrix(x)) \ x

-const VECTOR_TO_VECTOR_FUNCS = (diag_lmul, dense_lmul, utriag_lmul, ltriag_lmul,
-                                sparse_lmul, sp_utriag_lmul, sp_ltriag_lmul,
-                                diag_ldiv, utriag_ldiv, ltriag_ldiv,
-                                sparse_ldiv, sp_utriag_ldiv, sp_ltriag_ldiv,)
+const VECTOR_TO_VECTOR_FUNCS = [diag_lmul, dense_lmul, utriag_lmul, ltriag_lmul,
+                                sparse_lmul, sp_utriag_lmul, sp_ltriag_lmul,
+                                diag_ldiv,]

-######################
-# f(x::Array)::Array #
-######################
+######################################
+# f(x::AbstractArray)::AbstractArray #
+######################################

 chebyquad(x) = (y = fill(zero(eltype(x)), size(x)); chebyquad!(y, x); return y)

@@ -305,17 +307,22 @@ arr2arr_1(x) = (sum(x .* x); fill(zero(eltype(x)), size(x)))

 arr2arr_2(x) = x[1, :] .+ x[1, :] .+ first(x)

-const ARRAY_TO_ARRAY_FUNCS = (-, chebyquad, brown_almost_linear, trigonometric, arr2arr_1,
-                              arr2arr_2, mutation_test_1, mutation_test_2, identity)
+const ARRAY_TO_ARRAY_FUNCS = [-, chebyquad, brown_almost_linear, trigonometric, arr2arr_1,
+                              arr2arr_2, mutation_test_1, mutation_test_2, identity]

-#######################
-# f(::Matrix)::Matrix #
-#######################
+#######################################
+# f(::AbstractMatrix)::AbstractMatrix #
+#######################################

-const MATRIX_TO_MATRIX_FUNCS = (inv,
-                                diag_lmul, dense_lmul, utriag_lmul, ltriag_lmul,
-                                sparse_lmul, sp_utriag_lmul, sp_ltriag_lmul,
-                                diag_ldiv, utriag_ldiv, ltriag_ldiv,
-                                sparse_ldiv, sp_utriag_ldiv, sp_ltriag_ldiv,)
+const MATRIX_TO_MATRIX_FUNCS = [inv,
+                                diag_lmul, dense_lmul, utriag_lmul, ltriag_lmul,
+                                sparse_lmul, sp_utriag_lmul, sp_ltriag_lmul,
+                                diag_ldiv,]
+
+if VERSION >= v"1.1"
+    # required ldiv!(triag, adjoint) that is not implemented in 1.0
+    append!(VECTOR_TO_VECTOR_FUNCS, [utriag_ldiv, ltriag_ldiv])
+    append!(MATRIX_TO_MATRIX_FUNCS, [utriag_ldiv, ltriag_ldiv])
+end

 end # module
```
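Worth noting: the `::typeof(x)` return annotations added above both convert and assert, so a method whose result drifts to a different container or element type fails loudly in the test suite. A minimal sketch of the mechanics (hypothetical functions, not from the PR):

```julia
widen_silently(x::AbstractVector) = Float64.(x)            # result type drifts
widen_checked(x::AbstractVector)::typeof(x) = Float64.(x)  # convert(typeof(x), result)

@assert widen_silently(Float32[1, 2]) isa Vector{Float64}  # silent widening
@assert widen_checked(Float32[1, 2]) isa Vector{Float32}   # forced back to the input type
```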
Review thread on the extension's imports:

Reviewer: BTW, this should not be done: extensions should only import the package itself and the weak dependency, not even standard libraries.
alyst: Why? SparseArrays and LinearAlgebra are direct dependencies of both DiffTests itself and MKLSparse. The latter provides more specific definitions of matrix multiplication and division for sparse matrices that are utilized by the test functions.
Reviewer: You should import them indirectly via DiffTests and MKLSparse (see e.g. JuliaStats/LogExpFunctions.jl#63).
alyst: Thank you for the info. I've updated how the import is done. However, the changes that the extension makes to the function lists are not visible from the user session (I guess it's a different world age of DiffTests.jl or something like that).
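One way to check from a user session whether the extension actually ran (hypothetical session, Julia 1.9 or later):

```julia
using DiffTests, MKLSparse

# returns the extension module if it was loaded, `nothing` otherwise
ext = Base.get_extension(DiffTests, :DiffTestsMKLSparseExt)
@show ext

# if the extension ran, the mutated lists contain the sparse ldiv functions
@show length(DiffTests.VECTOR_TO_VECTOR_FUNCS)
@show length(DiffTests.MATRIX_TO_MATRIX_FUNCS)
```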