This project provides a native Swift interface to CUDA with the following modules:
- CUDA Driver API
import CUDADriver
- CUDA Runtime API
import CUDARuntime
- NVRTC - CUDA Runtime Compiler
import NVRTC
- cuBLAS - CUDA Basic Linear Algebra Subprograms
import CuBLAS
- Warp - GPU Acceleration Library
import Warp
(Thrust counterpart)
Any machine with CUDA 7.0+ and a CUDA-capable GPU is supported. Xcode Playground is supported as well. Please refer to Usage and Components.
CUDA Driver, Runtime, cuBLAS, and NVRTC (real-time compiler) are wrapped in
native Swift types. Warp provides higher level value types, DeviceArray
and
DeviceValue
, with copy-on-write semantics.
import Warp
/// Initialize two arrays on device
var x: DeviceArray<Float> = [1.0, 2.0, 3.0, 4.0, 5.0]
let y: DeviceArray<Float> = [1.0, 2.0, 3.0, 4.0, 5.0]
/// Scalar map operations
x.incrementElements(by: 2) // x => [2.0, 3.0, 4.0, 5.0, 6.0] on device
x.multiplyElements(by: 2) // x => [2.0, 4.0, 6.0, 8.0, 10.0] on device
/// Addition
x.formElementwise(.addition, with: y) // x => [3.0, 6.0, 9.0, 12.0, 15.0] on device
/// Dot product
x • y // => 165.0
/// Sum
x.sum() // => 15
/// Absolute sum
x.sumOfAbsoluteValues() // => 15
/// Transform by 1-place math functions
x.transform(by: .sin)
x.transform(by: .tanh)
x.transform(by: .ceil)
/// Elementwise operation
x.formElementwise(.addition, with: y)
x.formElementwise(.subtraction, with: y)
x.formElementwise(.multiplication, with: y)
x.formElementwise(.division, with: y)
/// Fill with the same value
var z = y
z.fill(with: 10.0)
/// Composite assignment
x.assign(from: .subtraction, left: y, multipliedBy: 100.0, right: z)
import NVRTC
import CUDADriver
import Warp
let source: String =
+ "extern \"C\" __global__ void saxpy(float a, float *x, float *y, float *out, int n) {"
+ " size_t tid = blockIdx.x * blockDim.x + threadIdx.x;"
+ " if (tid < n) out[tid] = a * x[tid] + y[tid];"
+ "}";
let ptx = try Compiler.compile(source)
try Device.main.withContext { context in
let module = try Module(ptx: ptx)
let function = module.function(named: "saxpy")!
let x: DeviceArray<Float> = [1, 2, 3, 4, 5, 6, 7, 8]
let y: DeviceArray<Float> = [2, 3, 4, 5, 6, 7, 8, 9]
var result = DeviceArray<Float>(capacity: 8)
try function<<<(1, 8)>>>[.float(1.0), .constPointer(to: x), .constPointer(to: y), .pointer(to: &result), .int(8)]
/// result => [3, 5, 7, 9, 11, 13, 15, 17] on device
}
Add a dependency:
.Package(url: "https://github.com/rxwei/cuda-swift", majorVersion: 1)
You may use the Makefile
in this repository for you own project. No extra path
configuration is needed.
Otherwise, specify the path to your CUDA headers and library at swift build
.
swift build -Xcc -I/usr/local/cuda/include -Xlinker -L/usr/local/cuda/lib
swift build -Xcc -I/usr/local/cuda/include -Xlinker -L/usr/local/cuda/lib64
- CUDADriver - CUDA Driver API
-
Context
-
Device
-
Function
-
PTX
-
Module
-
Stream
-
Unsafe(Mutable)DevicePointer<T>
-
DriverError
(all error codes from CUDA C API)
-
- CUDARuntime - CUDA Runtime API
-
Unsafe(Mutable)DevicePointer<T>
-
Device
-
Stream
-
RuntimeError
(all error codes from CUDA C API)
-
- NVRTC - CUDA Runtime Compiler
-
Compiler
-
- CuBLAS - GPU Basic Linear Algebra Subprograms (in-progress)
- Level 1 BLAS operations
- Level 2 BLAS operations (GEMV)
- Level 3 BLAS operations (GEMM)
- Warp - GPU Acceleration Library (Thrust counterpart)
-
DeviceArray<T>
(generic array in device memory) -
DeviceValue<T>
(generic value in device memory) - Acclerated vector operations
- Type-safe kernel argument helpers
-
- Swift Playground
- CUDADriver works in the playground. But other modules cause the "couldn't lookup symbols" problem for which we don't have a solution until Xcode is fixed.
- To use the playground, open the Xcode workspace file, and add a library for
every modulemap under
Frameworks
.
MIT License
CUDA is a registered trademark of NVIDIA Corporation.