Could we adopt general-purpose identifiers according to UAX #31 for the dataset names? In that case, the dataset names would also be valid Python identifiers.
E.g., the dataset name 7zip is invalid because it starts with a digit. One can also argue that the name 7zip is a bit misleading, since we don't include the application itself. Changing it to logo_7zip would improve the name of the "dataset" and also make it a valid Python identifier.
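
A minimal sketch of how such a rule could be checked, assuming we rely on Python's own identifier definition (which is based on UAX #31, with a few additions such as allowing a leading underscore). The helper name `is_valid_dataset_name` is hypothetical, and rejecting reserved keywords is an extra assumption on my part, not something UAX #31 requires:

```python
import keyword


def is_valid_dataset_name(name: str) -> bool:
    """Hypothetical check: True if `name` could be used as a Python identifier."""
    # str.isidentifier() applies Python's identifier rules;
    # keywords such as "class" pass that test, so exclude them separately.
    return name.isidentifier() and not keyword.iskeyword(name)


for name in ["7zip", "logo_7zip"]:
    print(name, is_valid_dataset_name(name))
# 7zip False
# logo_7zip True
```

A check like this could run in the test suite over all dataset names, so that new names which are not valid identifiers get caught early.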