kylekam/RGBLangGrounding

Sneak peek of RGBLanguageGrounding! [Work in progress]

RGBLanguageGrounding is a deep learning model for unified 3D reconstruction and queryable segmentation from posed RGB images.

Disclaimer: This repository does not contain the most recent work; it is intended to showcase some of the functionality.

Test-time inference

Setup

Create the environment and install the dependencies.

conda create -n rgblg python==3.10.14
conda activate rgblg

First, follow PyTorch Setup to install PyTorch with CUDA support. Then install the remaining dependencies:

pip install \
  matplotlib \
  pillow \
  numpy \
  scikit-image \
  scipy \
  timm \
  "tqdm>=4.65" \
  trimesh \
  pytorch_lightning==1.8 \
  pyyaml \
  opencv-python-headless \
  python-box \
  tensorboard \
  open_clip_torch
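
After installing, it can help to confirm that every dependency is importable from the active environment. The snippet below is a minimal sketch and is not part of this repository: the helper name `find_missing` is hypothetical, and the list maps each pip package above to its import name (for example, `pillow` imports as `PIL` and `opencv-python-headless` as `cv2`).

```python
import importlib.util

# Import names for the packages installed above (plus torch).
REQUIRED_MODULES = [
    "torch", "matplotlib", "PIL", "numpy", "skimage", "scipy", "timm",
    "tqdm", "trimesh", "pytorch_lightning", "yaml", "cv2", "box",
    "tensorboard", "open_clip",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

This only checks that the modules can be found; it does not verify versions or that PyTorch was built with CUDA support.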

For additional usage, follow the instructions of the parent project, FineRecon.

Samples

pyramid_test.py tiles patches of several resolutions over the input image to assist with fine-tuning.

python language_grounding/pyramid_test.py \
--src-img ./media/sample_scene.jpg \
--output-dir ./output \
--query "a chair"
