Skip to content

avinashpaliwal/PanoDreamer

Repository files navigation

PanoDreamer

PanoDreamer: Optimization-Based Single Image to 360° 3D Scene With Diffusion
Avinash Paliwal, Xilong Zhou, Andrii Tsarov, Nima Khademi Kalantari

arXiv ACM Project Page Video

demo

Overview

This repository implements panorama generation and 3D scene creation:

  • multicondiffusion.py: Extends an image horizontally in perspective space
  • multicondiffusion_panorama.py: Generates a 360° cylindrical panorama
  • depth_estimation.py: Estimates consistent depth maps for wide/panoramic images
  • ldi_generation.py: Creates Layered Depth Images with background inpainting
  • train_gsplat.py: Optimizes 3DGS scene from panorama LDI
  • render_gsplat.py: Renders 3DGS scenes using gsplat (for visualization)

Implementation Status

  • MultiConDiffusion (wide image generation)
  • Cylindrical panorama generation (360°)
  • Depth estimation
  • LDI generation (layered depth images)
  • 3DGS scene optimization (training from LDI)
  • 3DGS rendering (visualization with gsplat)

Example

Wide image example
Wide image generated with MultiConDiffusion

Depth estimation
Depth estimation with view stitching

Setup

# Clone repository with submodules
git clone --recursive https://github.com/yourusername/panodreamer.git
cd panodreamer

# Or if already cloned, initialize submodules
git submodule update --init --recursive

# Create environment
uv venv
source .venv/bin/activate
uv pip install -e .

# Clone Depth Anything V2 (for depth estimation)
git clone https://github.com/DepthAnything/Depth-Anything-V2.git

# Download depth model checkpoint
mkdir -p checkpoints
wget -P checkpoints https://huggingface.co/depth-anything/Depth-Anything-V2-Large/resolve/main/depth_anything_v2_vitl.pth

# Download inpainting checkpoints for LDI generation
python download_inpainting_ckpts.py

# (Optional) Install MoGe V2 for improved metric depth
uv pip install "git+https://github.com/microsoft/MoGe.git" --no-deps

Usage

1. Wide Image Generation

Extends the input image horizontally in perspective space.

python multicondiffusion.py \
  --prompt_file examples/29_real_campus_3.txt \
  --input_image examples/29_real_campus_3.png \
  --output_dir output

2. Cylindrical Panorama (360°)

Generates a full 360° cylindrical panorama from the input image.

python multicondiffusion_panorama.py \
  --prompt_file examples/29_real_campus_3.txt \
  --input_image examples/29_real_campus_3.png \
  --output_dir output

3. Depth Estimation

Estimates depth for wide images or cylindrical panoramas. Multiple methods are available:

# DA V2 relative depth + DA V2 metric calibration (default)
python depth_estimation.py \
  --input_image output/final_output.png \
  --output_dir output_depth \
  --mode panorama

# DA V2 relative depth + MoGe V2 metric calibration (recommended, sharper edges)
python depth_estimation.py \
  --input_image output/final_output.png \
  --output_dir output_depth \
  --mode panorama \
  --method dav2+moge

# MoGe V2 direct metric + Poisson gradient merge (best depth range)
python depth_estimation.py \
  --input_image output/final_output.png \
  --output_dir output_depth \
  --mode panorama \
  --method moge+poisson

4. LDI Generation

Creates layered depth images by splitting depth into layers and inpainting occluded backgrounds.

python ldi_generation.py \
  --input_image output/final_output.png \
  --input_depth output_depth/depth.npy \
  --output_dir output_ldi \
  --num_layers 4

5. 3DGS Scene Optimization

Optimizes a 3D Gaussian Splatting scene from panorama LDI layers.

python train_gsplat.py \
  --ldi_dir output_ldi \
  --output scene_optimized.ply \
  --num_iterations 3000 \
  --init_opacity 0.5

6. 3DGS Rendering

Renders 3D Gaussian Splatting scenes using gsplat.

# Panorama (rotate in place, matches training)
python render_gsplat.py \
  --ply scene_optimized.ply \
  --output renders \
  --panorama

# Orbit (novel view, camera circles the scene)
python render_gsplat.py \
  --ply scene_optimized.ply \
  --output renders \
  --radius 5.0 \
  --focal 1250 \
  --camera_y 0

Arguments

Panorama generation (multicondiffusion.py, multicondiffusion_panorama.py):

  • --prompt_file: Text file with scene description
  • --input_image: Input image (placed in center)
  • --steps: Denoising steps per iteration (default: 50)
  • --iterations: Number of refinement iterations (default: 15)
  • --H, --W: Output dimensions (default: 512x2048)
  • --guidance: Guidance scale (default: 7.5)
  • --seed: Random seed (default: 0)
  • --debug: Save debug visualizations

Depth estimation (depth_estimation.py):

  • --input_image: Input wide/panoramic image
  • --output_dir: Output directory
  • --mode: wide or panorama
  • --method: dav2 (default), dav2+moge, moge, or moge+poisson
  • --iterations: Number of alignment iterations (default: 15)

LDI generation (ldi_generation.py):

  • --input_image: Input panorama image
  • --input_depth: Depth map (.npy file)
  • --output_dir: Output directory
  • --num_layers: Number of depth layers (default: 4)

3DGS training (train_gsplat.py):

  • --ldi_dir: Path to panorama LDI directory
  • --output: Output PLY file path
  • --num_iterations: Number of optimization iterations (default: 3000)
  • --num_views: Number of training views (default: 240)
  • --init_opacity: Initial Gaussian opacity (default: 0.5)
  • --depth_weight: Depth loss weight (default: 0.005)
  • --fov: Field of view in degrees (default: 44.702)

3DGS rendering (render_gsplat.py):

  • --ply: Path to 3DGS PLY file
  • --output: Output directory
  • --panorama: Rotate-in-place mode (matches training cameras)
  • --radius: Camera orbit radius (default: 2.0)
  • --focal: Focal length (default: 622.61)
  • --camera_y: Camera height override for orbit mode
  • --num_frames: Number of frames (default: 720)
  • --fps: Video frame rate (default: 60)

Acknowledgements

This codebase builds upon several excellent open-source projects:

  • MultiDiffusion - Fusing diffusion paths for controlled image generation
  • LucidDreamer - Domain-free generation of 3D Gaussian Splatting scenes
  • 3d-moments - Inpainting networks for layered depth images
  • Depth-Anything-V2 - Monocular depth estimation
  • gsplat - Python library for 3D Gaussian Splatting
  • MoGe - Metric depth estimation (optional, for improved depth calibration)

We thank the authors for making their code publicly available.

Citation

@inproceedings{paliwal2024panodreamer,
    author = {Paliwal, Avinash and Zhou, Xilong and Tsarov, Andrii and Kalantari, Nima},
    title = {PanoDreamer: Optimization-Based Single Image to 360° 3D Scene With Diffusion},
    year = {2025},
    booktitle = {Proceedings of the SIGGRAPH Asia 2025 Conference Papers},
    articleno = {112},
    numpages = {10},
    doi = {10.1145/3757377.3763883},
    url = {https://doi.org/10.1145/3757377.3763883}
}