Build multi-modal feature matrices from text, images, categorical and numeric data in one line.
Use it to train powerful ML models that learn from all your data simultaneously.
Extracts features from:
- Text using BERT / any HuggingFace model
- Images using ResNet / EfficientNet
- Categorical & numeric data using sklearn pipelines
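As one illustration of the tabular side, categorical and numeric columns can be encoded with standard sklearn pipelines. This is a hedged sketch of that idea, not this package's actual API: the DataFrame, column names, and transformer choices below are all assumptions for demonstration.

```python
# Sketch: building tabular features with sklearn, as the README describes.
# The data and column names here are illustrative, not part of the package.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame({
    "color": ["red", "blue", "red"],  # categorical column
    "price": [10.0, 20.0, 30.0],      # numeric column
})

# One-hot encode the categorical column, standardize the numeric one;
# sparse_threshold=0.0 forces a dense numpy output.
tabular = ColumnTransformer(
    [
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["color"]),
        ("num", StandardScaler(), ["price"]),
    ],
    sparse_threshold=0.0,
)

X_tab = tabular.fit_transform(df)  # 2 one-hot columns + 1 scaled column
print(X_tab.shape)  # (3, 3)
```

Feature blocks like `X_tab` would then be concatenated with the text and image embeddings to form the combined matrix.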
Fully customizable:
- Choose your text model: "bert-base-uncased", "distilbert-base-uncased", "roberta-base", etc.
- Choose your image model: "resnet18", "resnet50", "efficientnet_b0"
Returns:
- Combined X: a numpy array of multi-modal features
- Target y: ready for training
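Conceptually, the combined `X` is the column-wise stack of per-modality feature blocks, paired with the target `y`. The sketch below uses random stand-in arrays (with typical BERT and ResNet18 feature widths) to show the shape of the result; it is not the package's actual output.

```python
# Conceptual sketch: per-modality feature blocks stacked into one matrix X.
# All arrays here are random stand-ins, not real extracted features.
import numpy as np

rng = np.random.default_rng(0)
n = 4                                     # number of samples
text_feats = rng.normal(size=(n, 768))    # e.g. BERT sentence embeddings
image_feats = rng.normal(size=(n, 512))   # e.g. ResNet18 pooled features
tab_feats = rng.normal(size=(n, 3))       # encoded categorical + scaled numeric

X = np.hstack([text_feats, image_feats, tab_feats])  # shape (4, 1283)
y = np.array([0, 1, 0, 1])                # target vector, ready for training
print(X.shape)
```

Any estimator that accepts a 2-D numpy array (e.g. an sklearn classifier) can then train on `X` and `y` directly.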
Install from PyPI:
pip install multimodal_feature_builder