Apparently I was asking the same question Jake did 6 years ago, but since then there have occasionally been issues with npy/npz files. So I think it would make sense to keep the original data format when the file sizes are not significantly different and not much derived work has been done on the data.
E.g. for the quasar dataset we could keep the gzipped text file as everything happens under the hood anyway.
We follow this approach in a few cases already (e.g. the RR Lyr template).
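
To illustrate the "under the hood" point, here is a minimal sketch of reading a gzipped text table directly with NumPy. It assumes a plain whitespace-delimited numeric file; the file name, column layout, and the `load_gzipped_table` helper are hypothetical stand-ins, not the actual quasar file or the existing loader.

```python
import gzip
import os
import tempfile

import numpy as np


def load_gzipped_table(path):
    """Hypothetical helper: read a whitespace-delimited numeric table.

    np.loadtxt decompresses files whose name ends in .gz on its own,
    so no explicit gzip handling is needed here.
    """
    return np.loadtxt(path)


# Write a tiny stand-in file (made-up values, not real data).
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "example_table.txt.gz")
with gzip.open(path, "wt") as f:
    f.write("# col_a col_b\n1.0 2.0\n3.0 4.0\n")

data = load_gzipped_table(path)
```

With something like this the user never sees the compression step, so there is little to gain from converting to npy/npz first.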