Releases: JuliaGPU/KernelAbstractions.jl
v0.9.8
v0.9.7
v0.9.6
KernelAbstractions v0.9.6
Merged pull requests:
v0.9.5
KernelAbstractions v0.9.5
Closed issues:
- Defining timing infrastructure that works with events. (#15)
- Kernels fail on CPU when waiting on kernels that allocate shared memory (#55)
- Use macros in nested functions (#377)
- CPU(static=true) option (#387)
- Need for @inline when using GPU backend (#392)
- KA seems to be broken for CUDA (#400)
Merged pull requests:
- Add kernel cpu=false and context accessor (#389) (@vchuravy)
- Add reverse CI for oneAPI and AMDGPU (#391) (@vchuravy)
- Update readme (#393) (@vchuravy)
- Improve clarity of numa_aware example (#397) (@carstenbauer)
- Docs: numa aware saxpy example (#398) (@carstenbauer)
- Add implementation notes to host functionality (#401) (@vchuravy)
v0.9.4
KernelAbstractions v0.9.4
Merged pull requests:
- Add CPU(static=true) (#388) (@vchuravy)
- Update index.md (#390) (@Ruibin-Liu)
v0.9.3
KernelAbstractions v0.9.3
Merged pull requests:
v0.9.2
KernelAbstractions v0.9.2
Closed issues:
- Use occupancy API for autotuning (#19)
- Allow user to turn off contract (#20)
- Assigning ::ROCDevice to ::KA.GPU (#321)
- ROCKernels: using queue pool causes performance regression (#344)
- KernelAbstractions.jl is blocked to v0.8.6 by CUDAKernels (#380)
Merged pull requests:
v0.9.1
KernelAbstractions v0.9.1
Closed issues:
- Can't run the example in quickstart (#371)
Merged pull requests:
- Add Metal to list of excluded backends (#368) (@maxwindiff)
- Add queries for atomics and float64 support (#369) (@maxwindiff)
- Fix typos (#370) (@tomchor)
- Add reverse CI for Metal PR (#372) (@vchuravy)
- Update reverse CI for CUDA (#373) (@vchuravy)
- Make unit tests skippable (#374) (@maxwindiff)
- Update CUDA to master (#375) (@vchuravy)
v0.9.0
KernelAbstractions v0.9.0
Closed issues:
Merged pull requests:
- Start removing event system (#317) (@vchuravy)
- Add Metal support (#337) (@tgymnich)
- Prefer blocks over threads (#341) (@vchuravy)
- ROCKernels: Add occupancy API (#342) (@pxl-th)
- [CUDAKernels] add always_inline as device parameter (#343) (@vchuravy)
- [CUDAKernels] Update compat (#345) (@vchuravy)
- Update CI (#346) (@vchuravy)
- ROCKernels: Adapt to AMDGPU changes (#348) (@jpsamaroo)
- [ROCKernels] Fix addrspacecast (#349) (@vchuravy)
- [ROCKernels] Import LLVM (#352) (@pxl-th)
- Update compat for oneAPIKernels.jl (#355) (@utkarsh530)
- Bump oneAPI to 1.0 (#356) (@michel2323)
- Rename device to backend (#359) (@vchuravy)
- Let Event(MtlDevice) actually be a barrier (#360) (@vchuravy)
- Fix Metal workgroup size (#361) (@tgymnich)
- Update docs (#362) (@vchuravy)
- Add optional priority feature (#363) (@vchuravy)
- Backends are adaptors (#364) (@vchuravy)
- Only skip histogram tests on CPU (#365) (@vchuravy)