GeoBench

TL;DR

A comprehensive Monocular Geometry Benchmark for evaluating SOTA discriminative and generative depth and surface normal estimation foundation models. The conclusions are:
1. Discriminative models pretrained on large-scale data (e.g. DINOv2) can outperform generative models built on Stable Diffusion pretraining when trained with a small amount of synthetic data under the same training configuration.
2. Synthetic data is critical for fine-grained depth estimation. Data quality matters more than model architecture or data scale.
3. Inductive bias is critical for surface normal estimation.

Citation

@article{ge2024geobench,
    title={GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models},
    author={Ge, Yongtao and Xu, Guangkai and Zhao, Zhiyue and Huang, Zheng and Sun, Libo and Sun, Yanlong and Chen, Hao and Shen, Chunhua},
    journal={arXiv preprint arXiv:2406.12671},
    year={2024}
}