Description
I've had this error consistently with any and all feature types in the Contact, Components, SurfaceArea, and IRC modules. No combination of standardization and a non-None transform fixes it. I went back through all my data to check for NaN or non-numeric values, and there are none.
No matter what, if I add any transform to any of my features, even something as simple as `lambda t: t.astype(np.float32)`, I get an ASCII decode error with the traceback below. The position and exact byte change from run to run, and the error typically occurs a couple of epochs into a training loop.
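(For context, a possible workaround sketch, not a confirmed fix: lambdas cannot be pickled by name, so serializers like dill must capture them by value as raw bytecode, which is where byte-level corruption can creep in. A module-level named function is serialized by reference instead. The name `to_float32` below is hypothetical, standing in for the lambda transform above.)

```python
import numpy as np

# Hypothetical replacement for `lambda t: t.astype(np.float32)`:
# a module-level named function can be pickled by qualified name,
# whereas a lambda must be serialized by value (raw bytecode).
def to_float32(t):
    """Cast a feature array to float32."""
    return t.astype(np.float32)

# e.g. pass `transform=to_float32` instead of `transform=lambda t: ...`
arr = np.array([1, 2, 3], dtype=np.float64)
print(to_float32(arr).dtype)  # float32
```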
```
self.df.to_hdf(

Traceback (most recent call last):
  File "/home/bizon/deepranking/deepranker_script.py", line 60, in <module>
    func(**args)
  File "/home/bizon/deepranking/training_wrap.py", line 200, in train_gnn_classifier
    model.train(
  File "/home/bizon/deepranking/deeprank2/deeprank2/trainer.py", line 629, in train
    checkpoint_model = self._save_model()
  File "/home/bizon/deepranking/deeprank2/deeprank2/trainer.py", line 921, in _save_model
    deserialized_func = dill.loads(serialized_func)  # noqa: S301
  File "/home/bizon/anaconda3/envs/dr2/lib/python3.10/site-packages/dill/_dill.py", line 311, in loads
    return load(file, ignore, **kwds)
  File "/home/bizon/anaconda3/envs/dr2/lib/python3.10/site-packages/dill/_dill.py", line 297, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
  File "/home/bizon/anaconda3/envs/dr2/lib/python3.10/site-packages/dill/_dill.py", line 452, in load
    obj = StockUnpickler.load(self)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x83 in position 16: ordinal not in range(128)
```
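(A minimal sketch of the failure mode I suspect, using stdlib `pickle` rather than the actual DeepRank2 code path, which is an assumption on my part: pickle/dill output is binary and virtually always contains bytes >= 0x80, since protocol-2+ streams begin with `b'\x80'`. If those bytes are ever round-tripped through a text/ASCII codec, e.g. stored as a string column in HDF5, decoding fails exactly like the traceback above.)

```python
import pickle

# Pickled payloads are binary: protocol-2+ streams start with b'\x80',
# so they cannot survive an ASCII round-trip.
payload = pickle.dumps({"transform": "placeholder"})
assert payload[0] == 0x80

# Decoding the raw pickle bytes as ASCII reproduces the error class
# seen in the traceback above (different byte/position, same mechanism).
try:
    payload.decode("ascii")
except UnicodeDecodeError as e:
    print(e)
```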
Additionally, I consistently get the following warning, whether or not the error above occurs; I'm not sure whether it's related:
```
PerformanceWarning:
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed,key->block1_values] [items->Index(['phase', 'entry', 'output'], dtype='object')]
```
I'd like some help either circumventing or fixing whatever is going on, as this is preventing me from tuning my training.
OS is Ubuntu 22.04 x86, DeepRank2 v3.1.0 with PyG 2.4.0 and torch 2.1.1 on Python 3.10.0.
Installation is for GPU, running on an A100 80 GB with CUDA 12.1, not containerized.
Thanks