
Neural Radiance Fields (NeRF)

2024

Novel view synthesis from 2D images using volumetric neural rendering

Python · PyTorch · Computer Vision
[Pipeline diagram: Training Images + Camera Poses → Ray Generation → Coarse Network (uniform sampling; positional encoding γ: xyz + θφ → Fourier features; coarse MLP → σ + color) → Fine Network (importance sampling guided by coarse output; positional encoding γ; fine MLP → σ + color) → Volume Rendering ∫ σ · c · T dt → Novel View Output]

Hierarchical sampling: coarse network guides fine network sample placement

NeRF is one of those papers that genuinely changed how I think about 3D representation. Instead of explicitly storing geometry, you train a neural network to implicitly encode a scene as a continuous volumetric function — then render novel views by marching rays through it.

The core idea

A NeRF takes a 5D input — 3D spatial coordinates (x, y, z) plus 2D viewing direction (θ, φ) — and outputs color and density at that point. To render a pixel, you shoot a ray from the camera, sample points along it, query the network at each point, and composite the results using the classical volume rendering equation.
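A minimal sketch of that per-pixel loop in PyTorch — sample depths, query the network, then alpha-composite. The function name, the near/far bounds, and the `network(pts, dirs) → (σ, rgb)` interface are illustrative; a real implementation batches many rays at once:

```python
import torch

def render_ray(network, ray_o, ray_d, near=2.0, far=6.0, n_samples=64):
    """Render one pixel: sample points along the ray, query the network,
    and composite via the discrete volume rendering sum."""
    # Uniformly spaced sample depths along the ray
    t = torch.linspace(near, far, n_samples)
    pts = ray_o + t[:, None] * ray_d               # (n_samples, 3)
    dirs = ray_d.expand(n_samples, 3)

    sigma, rgb = network(pts, dirs)                # density (n_samples,), color (n_samples, 3)

    # Distances between adjacent samples; last interval treated as effectively infinite
    deltas = torch.cat([t[1:] - t[:-1], torch.tensor([1e10])])
    alpha = 1.0 - torch.exp(-sigma * deltas)       # opacity of each segment
    # Transmittance T_i: probability the ray reaches sample i unoccluded
    T = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
    weights = T * alpha                            # per-sample contribution
    return (weights[:, None] * rgb).sum(dim=0)     # composited pixel color
```

The `weights` here are exactly the σ · T terms from the rendering integral, discretized over the sample intervals — they reappear later as the guide for fine-network sampling.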

What makes it work is positional encoding: raw coordinates are mapped to a higher-frequency Fourier feature space before being fed to the MLP. Without this, the network tends to learn overly smooth functions and misses fine detail.
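The encoding itself is only a few lines in PyTorch. This sketch prepends the raw input to the sin/cos features, which is a common implementation choice; the number of frequency bands is configurable:

```python
import torch

def positional_encoding(x, n_freqs=10):
    """Map coordinates to Fourier features:
    [x, sin(2^0 πx), cos(2^0 πx), ..., sin(2^(L-1) πx), cos(2^(L-1) πx)]."""
    feats = [x]
    for i in range(n_freqs):
        freq = (2.0 ** i) * torch.pi
        feats.append(torch.sin(freq * x))
        feats.append(torch.cos(freq * x))
    return torch.cat(feats, dim=-1)
```

With L = 10 bands, a 3D position becomes a 3 + 3·2·10 = 63-dimensional feature vector, giving the MLP high-frequency basis functions to fit fine detail with.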

Implementation details

  • Built the full volumetric rendering pipeline from scratch in PyTorch
  • Implemented hierarchical sampling: a coarse network proposes sample locations, a fine network refines them
  • Used positional encoding with configurable frequency bands
  • Trained on the standard Blender synthetic dataset to validate against published benchmarks
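The importance-sampling step from the second bullet boils down to inverse transform sampling over the coarse network's compositing weights. A sketch of that idea, with my own function name and argument conventions (`bins` are the M+1 coarse depth edges, `weights` the M per-interval contributions); the official implementation handles a few more edge cases:

```python
import torch

def sample_pdf(bins, weights, n_fine=128):
    """Draw fine-network sample depths from the piecewise-constant PDF
    defined by the coarse network's weights (inverse transform sampling)."""
    weights = weights + 1e-5                       # avoid all-zero weights
    pdf = weights / weights.sum(dim=-1, keepdim=True)
    cdf = torch.cumsum(pdf, dim=-1)
    cdf = torch.cat([torch.zeros_like(cdf[..., :1]), cdf], dim=-1)  # prepend 0 → (..., M+1)

    # Uniform draws, then find which CDF bin each one lands in
    u = torch.rand(*cdf.shape[:-1], n_fine)
    idx = torch.searchsorted(cdf, u, right=True).clamp(1, cdf.shape[-1] - 1)
    below, above = idx - 1, idx

    cdf_b = torch.gather(cdf, -1, below)
    cdf_a = torch.gather(cdf, -1, above)
    bins_b = torch.gather(bins, -1, below)
    bins_a = torch.gather(bins, -1, above)

    # Linearly interpolate a depth within the selected bin
    frac = (u - cdf_b) / (cdf_a - cdf_b).clamp(min=1e-5)
    return bins_b + frac * (bins_a - bins_b)
```

The effect is that fine samples cluster where the coarse pass found density, so the second network spends its capacity near surfaces instead of in empty space.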

Training a single scene takes hours even on GPU, which makes you appreciate just how much computation is hiding behind those silky smooth novel view videos.