Extension for CUDA.jl
This is an extension to support GPU (CUDA.jl) acceleration for solving the time evolution and spectra. This improves the execution time and memory usage especially when the HEOMLS matrix is super large.
The described feature requires Julia 1.9+.
The functions of calculating time evolution (only supports ODE method with time-independent system Hamiltonian) and spectra will automatically choose to solve on CPU or GPU depend on the type of the sparse matrix in M::AbstractHEOMLSMatrix objects (i.e., the type of the field M.data).
typeof(M.data) <: SparseMatrixCSC # solve on CPU
typeof(M.data) <: CuSparseMatrixCSC # solve on GPUTherefore, we wrapped several functions in CUDA and CUDA.CUSPARSE in order to return a new HEOMLS-matrix-type object with M.data is in the type of CuSparseMatrix, and also change the element type into ComplexF32 and Int32 (since GPU performs better in this type). The functions are listed as follows:
cu(M::AbstractHEOMLSMatrix): TranslateM.datainto the typeCuSparseMatrixCSC{ComplexF32, Int32}CuSparseMatrixCSC(M::AbstractHEOMLSMatrix): TranslateM.datainto the typeCuSparseMatrixCSC{ComplexF32, Int32}
Demonstration
The extension will be automatically loaded if user imports the package CUDA.jl :
using CUDA
using HierarchicalEOM
using LinearSolve # to change the solver for better GPU performanceSetup
Here, we demonstrate this extension by using the example of the single-impurity Anderson model.
ϵ = -5
U = 10
Γ = 2
μ = 0
W = 10
kT = 0.5
N = 5
tier = 3
tlist = 0:0.1:10
ωlist = -10:1:10
σm = [0 1; 0 0]
σz = [1 0; 0 -1]
II = [1 0; 0 1]
d_up = kron( σm, II)
d_dn = kron(-1 * σz, σm)
ρ0 = kron([1 0; 0 0], [1 0; 0 0])
Hsys = ϵ * (d_up' * d_up + d_dn' * d_dn) + U * (d_up' * d_up * d_dn' * d_dn)
bath_up = Fermion_Lorentz_Pade(d_up, Γ, μ, W, kT, N)
bath_dn = Fermion_Lorentz_Pade(d_dn, Γ, μ, W, kT, N)
bath_list = [bath_up, bath_dn]
# even HEOMLS matrix
M_even_cpu = M_Fermion(Hsys, tier, bath_list)
M_even_gpu = cu(M_even_cpu)
# odd HEOMLS matrix
M_odd_cpu = M_Fermion(Hsys, tier, bath_list, ODD)
M_odd_gpu = cu(M_odd_cpu)
# solve steady state with CPU
ados_ss = SteadyState(M_even_cpu);This extension does not support for solving SteadyState on GPU since it is not efficient and might get wrong solutions. If you really want to obtain the stationary state with GPU, you can repeatedly solve the evolution until you find it.
Solving time evolution with CPU
ados_list_cpu = evolution(M_even_cpu, ρ0, tlist)Solving time evolution with GPU
ados_list_gpu = evolution(M_even_gpu, ρ0, tlist)Solving Spectrum with CPU
dos_cpu = DensityOfStates(M_odd_cpu, ados_ss, d_up, ωlist)Solving Spectrum with GPU
dos_gpu = DensityOfStates(M_odd_gpu, ados_ss, d_up, ωlist; solver=KrylovJL_BICGSTAB(rtol=1f-10, atol=1f-12))