Releases: ROCm/rocSPARSE
rocSPARSE 4.2.0 for ROCm 7.2.0
Added
- Added sliced ELL format support to the `rocsparse_spmv` routine.
- Added the `rocsparse_sptrsv` and `rocsparse_sptrsm` routines for triangular solve.
- Added the `--clients-only` option to the `install.sh` and `rmake.py` scripts to only build the clients for a version of rocSPARSE that is already installed.
- Added the nnz split algorithm `rocsparse_spmv_alg_csr_nnzsplit` to `rocsparse_spmv`. This algorithm might be superior to the existing adaptive algorithm `rocsparse_spmv_alg_csr_adaptive` when running the computation a small number of times because it avoids paying the analysis cost of the adaptive algorithm.
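The trade-off between `rocsparse_spmv_alg_csr_nnzsplit` and `rocsparse_spmv_alg_csr_adaptive` noted above can be reasoned about with a simple cost model: one algorithm pays a one-time analysis cost and is then (possibly) cheaper per call. The sketch below is only an illustration under that assumption. The `AlgCost` type, the `total_cost` and `break_even_runs` helpers, and any timings passed to them are hypothetical and are not part of rocSPARSE; real costs have to be measured on the target matrix and GPU.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical cost model (not rocSPARSE code): an algorithm pays a one-time
// analysis cost and then a fixed cost for every SpMV call.
struct AlgCost {
    double analysis; // one-time analysis cost, in arbitrary time units
    double per_run;  // cost of a single SpMV call, in the same units
};

// Total cost of running the computation `runs` times.
double total_cost(const AlgCost& alg, std::int64_t runs) {
    return alg.analysis + alg.per_run * static_cast<double>(runs);
}

// Smallest number of runs at which `with_analysis` becomes strictly cheaper
// than `without_analysis`, or -1 if it never does.
std::int64_t break_even_runs(const AlgCost& with_analysis,
                             const AlgCost& without_analysis) {
    const double extra_upfront = with_analysis.analysis - without_analysis.analysis;
    const double saved_per_run = without_analysis.per_run - with_analysis.per_run;
    if (extra_upfront < 0.0) return 0;   // already cheaper before the first run
    if (saved_per_run <= 0.0) return -1; // the upfront cost is never recovered
    return static_cast<std::int64_t>(std::floor(extra_upfront / saved_per_run)) + 1;
}
```

With made-up costs of 100 units of analysis plus 1 unit per call versus 0 units of analysis plus 1.5 units per call, the analysis-based variant only wins from the 201st call onward, which is the kind of situation where skipping the analysis pays off.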
Changed
- rocBLAS is now a requirement when it is requested while building from source. Previously, rocBLAS was not used if it could not be found. To opt out of using rocBLAS when building from source, use the `--no-rocblas` option with the `install.sh` or `rmake.py` build scripts.
Optimized
- Significantly improved the performance of the `rocsparse_sddmm` routine when using CSR format, especially as the number of columns in the dense `A` matrix (or rows in the dense `B` matrix) increases.
- Improved the user documentation.
Resolved issues
- Fixed the `rmake.py` build script to properly handle `auto` and all options when selecting offload targets.
- Fixed building rocSPARSE with the install script on CentOS 9.
- Fixed `std::fma` casting in host routines to properly deduce types. This could have previously caused compilation failures when building from source.
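To illustrate the kind of problem the `std::fma` item refers to: when the three operands of `std::fma` have different types, the overload that is selected (and therefore the result type) depends on promotion rules, and operands that are not plain arithmetic types may not compile at all. One common remedy is to cast every operand to a single compute type first. The sketch below shows that idea only; `fma_as` is a hypothetical helper name, and the actual change made in rocSPARSE may look different.

```cpp
#include <cmath>
#include <type_traits>

// Illustrative helper (hypothetical, not rocSPARSE code): cast all operands to
// one floating-point compute type T so that exactly one std::fma overload
// matches and the result type is T.
template <typename T, typename A, typename X, typename Y>
T fma_as(A a, X x, Y y) {
    static_assert(std::is_floating_point<T>::value,
                  "T must be a floating-point type");
    return std::fma(static_cast<T>(a), static_cast<T>(x), static_cast<T>(y));
}
```

For example, `fma_as<float>(2, 3.0, 4.0f)` computes `2 * 3 + 4` in `float`, whereas calling `std::fma(2, 3.0, 4.0f)` directly would promote the operands and return a `double`.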
rocSPARSE 4.1.0 for ROCm 7.1.1
rocSPARSE code for ROCm 7.1.1 did not change. The library was rebuilt for the updated ROCm 7.1.1 stack.
rocSPARSE 4.1.0 for ROCm 7.1.0
Added
- Added brain half float mixed precision to `rocsparse_axpby` where X and Y use bfloat16 and the result and the compute type use float.
- Added brain half float mixed precision to `rocsparse_spvv` where X and Y use bfloat16 and the result and the compute type use float.
- Added brain half float mixed precision to `rocsparse_spmv` where A and X use bfloat16 and Y and the compute type use float.
- Added brain half float mixed precision to `rocsparse_spmm` where A and B use bfloat16 and C and the compute type use float.
- Added brain half float mixed precision to `rocsparse_sddmm` where A and B use bfloat16 and C and the compute type use float.
- Added brain half float mixed precision to `rocsparse_sddmm` where A, B, and C use bfloat16 and the compute type uses float.
- Added half float mixed precision to `rocsparse_sddmm` where A, B, and C use float16 and the compute type uses float.
- Added brain half float uniform precision to the `rocsparse_scatter` and `rocsparse_gather` routines.
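The mixed precision configurations listed above share one pattern: inputs are stored in a 16-bit type, while arithmetic and the result use `float`. The host-side sketch below illustrates that pattern for a sparse-dense dot product, the operation `rocsparse_spvv` computes. It is an assumption-laden illustration, not rocSPARSE code: the `bf16` struct is a minimal stand-in for a bfloat16 type, the conversion truncates instead of rounding, and `spvv_bf16` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Minimal stand-in for bfloat16 (illustrative only): the upper 16 bits of an
// IEEE-754 binary32 value.
struct bf16 {
    std::uint16_t bits;
};

// float -> bfloat16 by truncation. Production conversions usually round to
// nearest even; truncation keeps the sketch short.
bf16 to_bf16(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    return bf16{static_cast<std::uint16_t>(u >> 16)};
}

// bfloat16 -> float is exact: the 16 stored bits become the upper half.
float to_float(bf16 h) {
    const std::uint32_t u = static_cast<std::uint32_t>(h.bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof(f));
    return f;
}

// Sparse-dense dot product: x is sparse (values plus indices), y is dense.
// Inputs are bfloat16; every product is accumulated in float and the result
// is float.
float spvv_bf16(const std::vector<bf16>& x_val,
                const std::vector<int>& x_ind,
                const std::vector<bf16>& y) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < x_val.size(); ++i) {
        acc += to_float(x_val[i]) * to_float(y[static_cast<std::size_t>(x_ind[i])]);
    }
    return acc;
}
```

Accumulating in `float` rather than in bfloat16 avoids losing most of the sum's precision to the 8-bit significand of the storage type.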
Optimized
- Improved the user documentation.
Upcoming changes
- Deprecate trace, debug, and bench logging using the environment variable `ROCSPARSE_LAYER`.
rocSPARSE 4.0.3 for ROCm 7.0.2
Resolved issues
- Resolved an issue causing premature deallocation of internal buffers still in use.
rocSPARSE 4.0.2 for ROCm 7.0.1
rocSPARSE code for ROCm 7.0.1 did not change. The library was rebuilt for the updated ROCm 7.0.1 stack.
rocSPARSE 4.0.2 for ROCm 7.0.0
Added
- Added the `SpGEAM` generic routine for computing sparse matrix addition in CSR format.
- Added the `v2_SpMV` generic routine for computing sparse matrix-vector multiplication. Unlike the deprecated `rocsparse_spmv` routine, this routine does not use a fallback algorithm if a non-implemented configuration is encountered; it returns an error instead. For the deprecated `rocsparse_spmv` routine, the user can enable warning messages for situations where a fallback algorithm is used, either by calling the `rocsparse_enable_debug` routine up front or by exporting the `ROCSPARSE_DEBUG` variable (with the shell command `export ROCSPARSE_DEBUG=1`).
- Added half float mixed precision to `rocsparse_axpby` where X and Y use float16 and the result and the compute type use float.
- Added half float mixed precision to `rocsparse_spvv` where X and Y use float16 and the result and the compute type use float.
- Added half float mixed precision to `rocsparse_spmv` where A and X use float16 and Y and the compute type use float.
- Added half float mixed precision to `rocsparse_spmm` where A and B use float16 and C and the compute type use float.
- Added half float mixed precision to `rocsparse_sddmm` where A and B use float16 and C and the compute type use float.
- Added half float uniform precision to the `rocsparse_scatter` and `rocsparse_gather` routines.
- Added half float uniform precision to the `rocsparse_sddmm` routine.
- Added the `rocsparse_spmv_alg_csr_rowsplit` algorithm.
- Added support for gfx950.
- Added ROC-TX instrumentation support in rocSPARSE (not available on Windows or in the static library version on Linux).
- Added the `almalinux` OS name to correct the gfortran dependency.
Changed
- Switch to defaulting to C++17 when building rocSPARSE from source. Previously rocSPARSE was using C++14 by default.
Optimized
- Reduced the number of template instantiations in the library to further reduce the shared library binary size and improve compile times
- Allow SpGEMM routines to use more shared memory when available. This can speed up performance for matrices with a large number of intermediate products.
- Using the `rocsparse_spmv_alg_csr_adaptive` or `rocsparse_spmv_alg_csr_default` algorithms in `rocsparse_spmv` to perform transposed sparse matrix-vector multiplication (`y = alpha * A^T * x + beta * y`) resulted in unnecessary analysis on A and a needless slowdown during the analysis phase. This has been fixed by skipping the analysis when performing the transposed sparse matrix-vector multiplication.
- Improved the user documentation
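For reference, the transposed operation mentioned above, y = alpha * A^T * x + beta * y, can be written on the host for a CSR matrix as shown below. This is a plain CPU sketch with a hypothetical function name (`csr_spmv_transpose`); it only documents what is being computed and does not reflect how rocSPARSE implements the operation on the GPU.

```cpp
#include <cstddef>
#include <vector>

// Reference computation of y = alpha * A^T * x + beta * y for an m x n matrix A
// stored in CSR format with zero-based indices. x has length m, y has length n.
void csr_spmv_transpose(std::size_t m,
                        const std::vector<int>& row_ptr,
                        const std::vector<int>& col_ind,
                        const std::vector<double>& val,
                        double alpha,
                        const std::vector<double>& x,
                        double beta,
                        std::vector<double>& y) {
    // Scale the existing contents of y first.
    for (double& yi : y) {
        yi *= beta;
    }
    // Row i of A is column i of A^T, so each nonzero A(i, j) contributes
    // alpha * A(i, j) * x[i] to y[j].
    for (std::size_t i = 0; i < m; ++i) {
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            y[static_cast<std::size_t>(col_ind[k])] += alpha * val[k] * x[i];
        }
    }
}
```

Note that the transposed product scatters each row's contributions across `y` instead of reducing each row into a single output entry, so its access pattern differs from the non-transposed case.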
Resolved issues
- Fixed an issue in the public headers where `extern "C"` was not wrapped by `#ifdef __cplusplus`, which caused failures when building C programs with rocSPARSE.
- Fixed a memory access fault in the `rocsparse_Xbsrilu0` routines.
- Fixed failures that could occur in `rocsparse_Xbsrsm_solve` or `rocsparse_spsm` with BSR format when using host pointer mode.
- Fixed ASAN compilation failures.
- Fixed a failure that occurred when using the const descriptor `rocsparse_create_const_csr_descr` with the generic routine `rocsparse_sparse_to_sparse`. The issue was not observed when using the non-const descriptor `rocsparse_create_csr_descr` with `rocsparse_sparse_to_sparse`.
- Fixed a memory leak in the rocSPARSE handle.
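The header issue in the first item above concerns a standard idiom: `extern "C"` is C++ syntax, so a header that is meant to be included from both C and C++ has to hide it from C compilers. A minimal sketch of the guarded form follows. The header name and the `example_add` function are hypothetical and used only for illustration; they are not taken from the rocSPARSE headers.

```cpp
// example_header.h (hypothetical header, for illustration only)
#ifndef EXAMPLE_HEADER_H
#define EXAMPLE_HEADER_H

// Only C++ compilers define __cplusplus, so a C compiler never sees the
// extern "C" block and the declarations still parse as plain C.
#ifdef __cplusplus
extern "C" {
#endif

int example_add(int a, int b);

#ifdef __cplusplus
}
#endif

#endif // EXAMPLE_HEADER_H

// Definition. In a real project this would live in a separate source file.
// In C++ it keeps the C linkage given by the declaration above.
int example_add(int a, int b) {
    return a + b;
}
```

Without the `#ifdef __cplusplus` guards, a C compiler would reject the `extern "C" {` line, which matches the build failure described in the release note.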
Removed
- The deprecated `rocsparse_spmv_ex` routine
- The deprecated `rocsparse_sbsrmv_ex`, `rocsparse_dbsrmv_ex`, `rocsparse_cbsrmv_ex`, and `rocsparse_zbsrmv_ex` routines
- The deprecated `rocsparse_sbsrmv_ex_analysis`, `rocsparse_dbsrmv_ex_analysis`, `rocsparse_cbsrmv_ex_analysis`, and `rocsparse_zbsrmv_ex_analysis` routines
Upcoming changes
- Deprecated the `rocsparse_spmv` routine. Users should use the `rocsparse_v2_spmv` routine going forward.
- Deprecated the `rocsparse_spmv_alg_csr_stream` algorithm. Users should use the `rocsparse_spmv_alg_csr_rowsplit` algorithm going forward.
- Deprecated the `rocsparse_itilu0_alg_sync_split_fusion` algorithm. Users should use one of `rocsparse_itilu0_alg_async_inplace`, `rocsparse_itilu0_alg_async_split`, or `rocsparse_itilu0_alg_sync_split` going forward.
rocSPARSE 3.4.0 for ROCm 6.4.4
rocSPARSE code for ROCm 6.4.4 did not change. The library was rebuilt for the updated ROCm 6.4.4 stack.
rocSPARSE 3.4.0 for ROCm 6.4.3
rocSPARSE code for ROCm 6.4.3 did not change. The library was rebuilt for the updated ROCm 6.4.3 stack.
rocSPARSE 3.4.0 for ROCm 6.4.2
rocSPARSE code for ROCm 6.4.2 did not change. The library was rebuilt for the updated ROCm 6.4.2 stack.
rocSPARSE 3.4.0 for ROCm 6.4.1
rocSPARSE code for ROCm 6.4.1 did not change. The library was rebuilt for the updated ROCm 6.4.1 stack.