cherubicxn committed
Commit 5abd584 · verified · 1 Parent(s): 72d3304

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +98 -3
  2. model.pt +3 -0
README.md CHANGED
@@ -1,3 +1,98 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - en
+ tags:
+ - depth-estimation
+ - depth-completion
+ - rgb-d
+ - computer-vision
+ - robotics
+ - 3d-vision
+ - pytorch
+ - vision-transformer
+ datasets:
+ - custom
+ metrics:
+ - rmse
+ - mae
+ library_name: pytorch
+ pipeline_tag: depth-estimation
+ ---
+
+ # LingBot-Depth-DC (Depth Completion)
+
+ **LingBot-Depth-DC** is a post-trained variant of LingBot-Depth, specifically **optimized for sparse depth completion** tasks. This model excels at recovering dense depth maps from highly sparse inputs such as SfM/SLAM point clouds.
+
+ ## Model Details
+
+ ### Model Description
+
+ This model builds upon the LingBot-Depth pretrained checkpoint with additional post-training focused on sparse depth completion scenarios. It is particularly effective for:
+ - Recovering complete depth from sparse SfM/SLAM observations
+ - Handling extremely sparse depth inputs (e.g., <5% valid pixels)
+ - Scenarios where depth sensors are unavailable and only sparse geometric cues exist
+
+ - **Developed by:** Bin Tan, Changjiang Sun, Xiage Qin, Hanat Adai, Zelin Fu, Tianxiang Zhou, Han Zhang, Yinghao Xu, Xing Zhu, Yujun Shen, Nan Xue
+ - **Model type:** Vision Transformer for sparse depth completion
+ - **License:** Apache 2.0
+ - **Finetuned from model:** LingBot-Depth (pretrained)
+
+ ### Model Sources
+
+ - **Repository:** https://github.com/robbyant/lingbot-depth
+ - **Paper:** [Masked Depth Modeling for Spatial Perception](https://arxiv.org/abs/2601.xxxxx)
+ - **Project Page:** https://technology.robbyant.com/lingbot-depth
+
+ ### Related Models
+
+ | Model | Repository | Description |
+ |-------|------------|-------------|
+ | LingBot-Depth | [robbyant/lingbot-depth-pretrain-vitl-14](https://huggingface.co/robbyant/lingbot-depth-pretrain-vitl-14) | General-purpose depth refinement |
+ | LingBot-Depth-DC | [robbyant/lingbot-depth-postrain-dc-vitl14](https://huggingface.co/robbyant/lingbot-depth-postrain-dc-vitl14) | Optimized for sparse depth completion (this model) |
+
+ ## Uses
+
+ ### Direct Use
+
+ - **Sparse Depth Completion**: Recovering dense depth from SfM/SLAM sparse point clouds
+ - **Extreme Sparsity Handling**: Working with <5% valid depth pixels
+ - **RGB-guided Depth Densification**: Using visual context to fill large missing regions
+
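The "<5% valid depth pixels" regime listed under Direct Use can be checked on an input before running completion. The following is a minimal sketch in plain Python, not code from the LingBot-Depth repository. The function name `valid_fraction` is hypothetical, and the validity rule (finite and strictly positive depth) is an assumption, since the README does not say how missing depth is encoded.

```python
import math


def valid_fraction(depth_rows):
    """Return the fraction of pixels that carry a usable depth value.

    Assumption: a pixel counts as valid when its depth is finite and > 0.
    The README does not specify the encoding of missing depth.
    """
    total = 0
    valid = 0
    for row in depth_rows:
        for d in row:
            total += 1
            if math.isfinite(d) and d > 0:
                valid += 1
    return valid / total if total else 0.0


# Hypothetical 10x10 depth map (metres) with a few SfM-style samples.
depth = [[0.0] * 10 for _ in range(10)]
depth[2][3] = 1.8
depth[5][5] = 2.4
depth[7][1] = float("nan")  # NaN is treated as missing under the assumption above
depth[8][8] = 3.1

frac = valid_fraction(depth)
print(f"valid pixels: {frac:.1%}")
print("extreme-sparsity regime (<5%):", frac < 0.05)
```

For real image-sized inputs the same count would normally be vectorized with an array library; the nested loop here only keeps the sketch dependency-free.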
+ ### Downstream Use
+
+ - **SLAM Enhancement**: Densifying sparse SLAM outputs for better scene understanding
+ - **Novel View Synthesis**: Providing dense geometry for view synthesis pipelines
+ - **3D Reconstruction**: Completing sparse depth for mesh reconstruction
+ - **Robotics Navigation**: Dense depth from sparse sensor observations
+
+ ## Technical Specifications
+
+ ### Model Architecture
+
+ - **Encoder:** ViT-Large/14 (24 layers) with separated patch embeddings for RGB and depth
+ - **Decoder:** ConvStack decoder with hierarchical upsampling
+ - **Objective:** Masked depth modeling optimized for sparse inputs
+ - **Model size:** ~300M parameters
+
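The "/14" in ViT-Large/14 denotes a 14-pixel patch size, so the encoder sees an image as a grid of 14x14 patches. The sketch below works out that grid for a given resolution. The helper names `round_down_to_patch` and `patch_grid` are hypothetical, and the snippet assumes inputs are cropped or resized to multiples of 14 beforehand; the README does not describe the actual preprocessing, nor how the RGB and depth patch embeddings are combined.

```python
PATCH_SIZE = 14  # taken from "ViT-Large/14" in the architecture list


def round_down_to_patch(size, patch=PATCH_SIZE):
    """Largest multiple of the patch size that does not exceed `size`."""
    return (size // patch) * patch


def patch_grid(height, width, patch=PATCH_SIZE):
    """Return (rows, cols, patches) for a non-overlapping patch grid.

    Assumption: the input is already a multiple of the patch size in both
    dimensions; anything else raises instead of silently cropping.
    """
    if height % patch or width % patch:
        raise ValueError(f"{height}x{width} is not a multiple of {patch}")
    rows, cols = height // patch, width // patch
    return rows, cols, rows * cols


# A 480x640 frame does not divide evenly by 14, so shrink it first.
h, w = round_down_to_patch(480), round_down_to_patch(640)
print((h, w), patch_grid(h, w))
```

The patch count is per modality. Whether RGB and depth patches are summed, concatenated, or masked separately is not stated in the README, so the sequence length the encoder actually processes may differ from this number.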
+ ### Software Requirements
+
+ - Python >= 3.9
+ - PyTorch >= 2.0.0
+ - xformers
+
+ ## Citation
+
+ ```bibtex
+ @article{lingbot-depth2026,
+ title={Masked Depth Modeling for Spatial Perception},
+ author={Tan, Bin and Sun, Changjiang and Qin, Xiage and Adai, Hanat and Fu, Zelin and Zhou, Tianxiang and Zhang, Han and Xu, Yinghao and Zhu, Xing and Shen, Yujun and Xue, Nan},
+ journal={arXiv preprint arXiv:2601.xxxxx},
+ year={2026}
+ }
+ ```
+
+ ## Model Card Contact
+
+ - **Email:** tanbin.tan@antgroup.com, xuenan.xue@antgroup.com
+ - **Issues:** https://github.com/robbyant/lingbot-depth/issues
model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3a4512e05445857d1404fc00227285bd8fa4abdd97250dc65f9636aa9cc71325
+ size 1284841739
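
The three added lines are a Git LFS pointer, not the checkpoint itself: each line is a `key value` pair giving the spec version, the SHA-256 object ID, and the size in bytes. As a rough sanity check, the sketch below parses the pointer and estimates how many parameters the file could hold. The function name `parse_lfs_pointer` is hypothetical, and the 4-bytes-per-parameter figure assumes the file stores little besides fp32 weights, which the commit does not confirm.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the diff above, without the "+" markers.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:3a4512e05445857d1404fc00227285bd8fa4abdd97250dc65f9636aa9cc71325
size 1284841739
"""

info = parse_lfs_pointer(POINTER)
size_gib = info["size"] / 2**30

# Assumption: fp32 weights (4 bytes each) and negligible non-weight data.
implied_fp32_params = info["size"] / 4

print(f"{info['hash_algo']} digest length: {len(info['digest'])} hex chars")
print(f"size: {size_gib:.2f} GiB")
print(f"implied fp32 parameters: {implied_fp32_params / 1e6:.0f}M")
```

Under that assumption the file size works out to a little over 320M parameters, in the same range as the "~300M parameters" stated in the README. A half-precision checkpoint of this size would imply roughly twice as many.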