Bug in dataset splits?

While reading the dataset code, I think I found a bug here:

https://github.com/ml-jku/hopular/blob/3e0c39fdc59568349373af573ee52c03305ca105/hopular/auxiliary/data.py#L597-L602

After that, you are still using the old indices (computed for the original row order) to index into the reordered/concatenated array. So the splits you actually get differ from the intended ones (they are no longer stratified, for example):

https://github.com/ml-jku/hopular/blob/3e0c39fdc59568349373af573ee52c03305ca105/hopular/auxiliary/data.py#L637-L638

	self.__splits = (split_training, split_validation, split_test)

	# Sort dataset according to splits.
	self.__data = np.concatenate((
		self.__data[split_training], self.__data[split_validation], self.__data[split_test]
	), axis=0)
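
To illustrate the concern, here is a minimal, self-contained sketch in NumPy. It is not the repository's code: the toy array and the hand-picked index arrays are hypothetical, chosen so that each row's value equals its original index. It shows that after reordering the data as in the snippet above, the old index arrays no longer select the intended rows, and that one possible fix (an assumption on my side, not necessarily what the authors intend) is to replace the splits with contiguous ranges.

```python
import numpy as np

# Hypothetical toy data: row i holds the value i, so a row's content
# reveals which original index it came from.
data = np.arange(10)

# Hypothetical, hand-picked split indices (deterministic on purpose).
split_training = np.array([9, 7, 5, 3, 1, 0])
split_validation = np.array([2, 4])
split_test = np.array([6, 8])

# Reorder the data so that the splits become contiguous blocks.
reordered = np.concatenate(
    (data[split_training], data[split_validation], data[split_test]), axis=0
)

# Stale indices: they refer to positions in the ORIGINAL array,
# but are applied to the reordered one.
stale_training = reordered[split_training]

# Possible fix: after reordering, each split is simply a contiguous range.
offsets = np.cumsum([0, len(split_training), len(split_validation), len(split_test)])
new_training = np.arange(offsets[0], offsets[1])
new_validation = np.arange(offsets[1], offsets[2])
new_test = np.arange(offsets[2], offsets[3])

intended_training = data[split_training]
print("remapped matches intended:", np.array_equal(reordered[new_training], intended_training))
print("stale matches intended:", np.array_equal(stale_training, intended_training))
print("test rows leaked into stale training set:", sorted(set(stale_training.tolist()) & set(split_test.tolist())))
```

In this toy case the stale training selection contains rows that were assigned to the validation and test splits, which is the kind of mismatch I suspect happens in the dataset class.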


Bug in dataset splits? #6


	def split_train(self) -> torch.Tensor:
		return self.__splits[0]
