Load Dataset for Image Classification

The examples use the MNIST dataset from Keras, which is already loaded as a numpy.ndarray. I would like to load my own RGB image dataset into a Spark DataFrame instead. PySpark provides this method:

spark.read.format("image").option("dropInvalid", True).load(path)

which loads all the images found under the given path into a DataFrame. The DataFrame has one row per image, and each row contains the raw binary data of the corresponding image. That binary data can be converted to RGB matrices with NumPy, but how do you store a tensor in each row and then pass the DataFrame as input to a convolutional network in Keras?
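To make the conversion step concrete, here is a minimal sketch of what I have in mind. It assumes the `image` column follows Spark's image schema (fields `height`, `width`, `nChannels`, `data`), that pixels are 8-bit, and that the bytes are stored row-wise in BGR order as Spark documents for most images. The function names `image_struct_to_rgb` and `collect_rgb_arrays` are my own, and the `collect()` call is only reasonable for a dataset small enough to fit on the driver.

```python
import numpy as np


def image_struct_to_rgb(height, width, n_channels, data):
    """Turn the fields of one Spark image struct into an (H, W, C) uint8 array.

    Assumption: 8-bit pixels stored row-wise, BGR(A) channel order.
    """
    arr = np.frombuffer(bytes(data), dtype=np.uint8)
    arr = arr.reshape(height, width, n_channels)
    if n_channels >= 3:
        # Swap the first three channels from BGR to RGB, keep alpha if present.
        order = [2, 1, 0] + list(range(3, n_channels))
        return arr[..., order]
    # Grayscale (or other) images: return a writable copy unchanged.
    return arr.copy()


def collect_rgb_arrays(spark, path):
    """Hypothetical helper: load images with Spark and decode them on the driver."""
    df = spark.read.format("image").option("dropInvalid", True).load(path)
    rows = df.select(
        "image.height", "image.width", "image.nChannels", "image.data"
    ).collect()
    return [
        image_struct_to_rgb(r["height"], r["width"], r["nChannels"], r["data"])
        for r in rows
    ]
```

If all images share the same size, the resulting list could be stacked with `np.stack` into one `(N, H, W, 3)` array, which is the shape a Keras convolutional model normally expects. What I am missing is the equivalent when the data has to stay in the DataFrame.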

Alternatively, is there a way to avoid providing three matrices (R, G, B) for each image and instead provide a single long vector of pixels?
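For the single-vector idea, this is roughly what I imagine, as a sketch under my own assumptions: every image has the same known height and width, pixels are scaled to the range 0 to 1, and the names `flatten_image` and `unflatten_image` are just illustrative.

```python
import numpy as np


def flatten_image(rgb):
    """Flatten an (H, W, C) uint8 image into a 1-D float32 vector scaled to [0, 1]."""
    return rgb.astype(np.float32).reshape(-1) / 255.0


def unflatten_image(vec, height, width, n_channels=3):
    """Restore the (H, W, C) layout from a flat pixel vector."""
    return np.asarray(vec, dtype=np.float32).reshape(height, width, n_channels)
```

A flat vector like this could probably be stored in a single DataFrame column, and the original shape restored on the model side, for example with a Keras `Reshape` layer as the first layer of the network. I have not verified that this works end to end.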

Load Dataset for Image Classification #180
