Extension for CUDA.jl
This is an extension to support GPU (CUDA.jl
) acceleration for solving the time evolution and spectra. This improves the execution time and memory usage especially when the HEOMLS matrix is super large.
The functions of calculating time evolution (only supports ODE method with time-independent system Hamiltonian) and spectra will automatically choose to solve on CPU or GPU depend on the type of the sparse matrix in M::AbstractHEOMLSMatrix
objects (i.e., the type of the field M.data
).
typeof(M.data) <: SparseMatrixCSC # solve on CPU
typeof(M.data) <: CuSparseMatrixCSC # solve on GPU
Therefore, we wrapped several functions in CUDA
and CUDA.CUSPARSE
in order to not only converting a HEOMLS-matrix-type object into GPU arrays, but also changing the element type and word size (32
and 64
) since some of the GPUs perform better in 32
-bit. The functions are listed as follows (where input M
is a AbstractHEOMLSMatrix
):
cu(M, word_size=64)
: TransformM.data
intoCUDA
sparse arrays with specifiedword_size
(default to64
).CuSparseMatrixCSC{T}(M)
: TranslateM.data
into the typeCuSparseMatrixCSC{T, Int32}
Demonstration
The extension will be automatically loaded if user imports the package CUDA.jl
:
using HierarchicalEOM
using CUDA
CUDA.allowscalar(false) # Avoid unexpected scalar indexing
Setup
Here, we demonstrate this extension by using the example of the single-impurity Anderson model.
ϵ = -5
U = 10
Γ = 2
μ = 0
W = 10
kT = 0.5
N = 5
tier = 3
tlist = 0:0.1:10
ωlist = -10:1:10
σm = sigmam()
σz = sigmaz()
II = qeye(2)
d_up = tensor( σm, II)
d_dn = tensor(-1 * σz, σm)
ψ0 = tensor(basis(2, 0), basis(2, 0))
Hsys = ϵ * (d_up' * d_up + d_dn' * d_dn) + U * (d_up' * d_up * d_dn' * d_dn)
bath_up = Fermion_Lorentz_Pade(d_up, Γ, μ, W, kT, N)
bath_dn = Fermion_Lorentz_Pade(d_dn, Γ, μ, W, kT, N)
bath_list = [bath_up, bath_dn]
# even HEOMLS matrix
M_even_cpu = M_Fermion(Hsys, tier, bath_list)
# odd HEOMLS matrix
M_odd_cpu = M_Fermion(Hsys, tier, bath_list, ODD)
Generate HEOMLS matrix for GPU computation
M_even_gpu = cu(M_even_cpu)
M_odd_gpu = cu(M_odd_cpu)
Solving time evolution with CPU
ados_list = HEOMsolve(M_even_cpu, ψ0, tlist)
Solving time evolution with GPU
ados_list = HEOMsolve(M_even_gpu, ψ0, tlist)
Solving steady state with CPU using linear-solve method
ados_ss = steadystate(M_even_cpu);
Solving steady state with GPU using linear-solve method
ados_ss = steadystate(M_even_gpu);
Solving steady state with CPU using ODE method
ados_ss = steadystate(M_even_cpu, ψ0);
Solving steady state with GPU using ODE method
ados_ss = steadystate(M_even_gpu, ψ0);
Solving spectrum with CPU
dos = DensityOfStates(M_odd_cpu, ados_ss, d_up, ωlist)
Solving spectrum with GPU
dos = DensityOfStates(M_odd_gpu, ados_ss, d_up, ωlist)