Tracking issue for expanding BF16 support. See https://github.com/google/XNNPACK/issues/9599 for initial discussion.

* [ ] Support bf16 <-> fp32 convert ops.
  * [x] Scalar: https://github.com/google/XNNPACK/pull/9727
  * [x] Neon (w/o FEAT_BF16): https://github.com/google/XNNPACK/pull/9922
  * [x] Neon (w/ FEAT_BF16): https://github.com/google/XNNPACK/pull/9922
  * [ ] SSE/AVX (target TBD)
  * [ ] AVX512_BF16
* [ ] Support bf16 <-> qs/qu8 convert ops.
  * [ ] Scalar
    * [x] Minmax: https://github.com/google/XNNPACK/pull/9858
  * [ ] Neon (w/o FEAT_BF16)
  * [ ] Neon (w/ FEAT_BF16)
  * [ ] SSE/AVX (target TBD)
  * [ ] AVX512_BF16
* [ ] Update src/qs8-gemm/ to add qd8-bf16 variants.
  * [ ] Scalar: https://github.com/google/XNNPACK/pull/9960
  * [ ] SSE2
  * [ ] AVX2
  * [ ] AVX512
  * [ ] AVX512_BF16
  * [ ] Neon (w/o FEAT_BF16)
  * [ ] Neon (w/ FEAT_BF16)
* [ ] Wire up bf16 in the subgraph layer for supported operators.
* [ ] Support subgraph rewrite (bf16 -> fp32) for ops without native bf16 support.