
Commit 10daad1

[Fix] InternNav Doc update for v0.2.0 (#3)
* update doc for refactor
* rename env section
* add aliyun dlc bash
* add evaluation metrics
1 parent 149c33a commit 10daad1

4 files changed, +69 -18 lines changed


source/en/user_guide/internnav/quick_start/installation.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -190,7 +190,7 @@ pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 \
     --index-url https://download.pytorch.org/whl/cu118
 
 # install InternNav with model dependencies
-pip install -e .[model]
+pip install -e .[model] --no-build-isolation
 
 ```
````
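For context, `--no-build-isolation` makes pip build the package's extensions against the torch already installed in the environment rather than in a fresh isolated build environment, which is why torch is installed first. A quick post-install sanity check (a generic snippet, not part of the InternNav docs):

```python
# Confirm the CUDA-enabled torch build is importable and sees the GPU.
import torch

print(torch.__version__)           # expect something like 2.5.1+cu118
print(torch.cuda.is_available())   # expect True on a CUDA machine
```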

source/en/user_guide/internnav/quick_start/train_eval.md

Lines changed: 45 additions & 5 deletions
````diff
@@ -11,20 +11,39 @@ The training pipeline is currently under preparation and will be open-sourced soon.
 Before evaluation, we should download the robot assets from [InternUTopiaAssets](https://huggingface.co/datasets/InternRobotics/Embodiments) and move them to the `data/` directory. Model weights of InternVLA-N1 can be downloaded from [InternVLA-N1](https://huggingface.co/InternRobotics/InternVLA-N1).
 
 #### Evaluation on Isaac Sim
+[UPDATE] We now support running the local model and Isaac Sim in a single process. Evaluate on a single GPU:
+
+```bash
+python scripts/eval/eval.py --config scripts/eval/configs/h1_internvla_n1_async_cfg.py
+```
+
+For multi-GPU inference, we currently support environments that expose a torchrun-compatible runtime (e.g., torchrun or Aliyun DLC).
+
+```bash
+# for torchrun
+./scripts/eval/bash/torchrun_eval.sh \
+    --config scripts/eval/configs/h1_internvla_n1_async_cfg.py
+
+# for Aliyun DLC
+./scripts/eval/bash/eval_vln_distributed.sh \
+    internutopia \
+    --config scripts/eval/configs/h1_internvla_n1_async_cfg.py
+```
+
 The main architecture of the whole-system evaluation adopts a client-server model. In the client, we specify the corresponding configuration (*.cfg), which includes settings such as the scenarios to be evaluated, robots, models, and parallelization parameters. The client sends requests to the server, which then submits tasks to the Ray distributed framework based on the corresponding cfg file, enabling the entire evaluation process to run.
 
 First, change the `model_path` in the cfg file to the path of the InternVLA-N1 weights. Start the evaluation server:
 ```bash
 # from one process
 conda activate <model_env>
-python scripts/eval/start_server.py --config scripts/eval/configs/h1_internvla_n1_cfg.py
+python scripts/eval/start_server.py --config scripts/eval/configs/h1_internvla_n1_async_cfg.py
 ```
 
 Then, start the client to run evaluation:
 ```bash
 # from another process
 conda activate <internutopia>
-MESA_GL_VERSION_OVERRIDE=4.6 python scripts/eval/eval.py --config scripts/eval/configs/h1_internvla_n1_cfg.py
+MESA_GL_VERSION_OVERRIDE=4.6 python scripts/eval/eval.py --config scripts/eval/configs/h1_internvla_n1_async_cfg.py
 ```
 
 The evaluation results will be saved in the `eval_results.log` file in the `output_dir` of the config file. The whole evaluation process takes about 10 hours on an RTX 4090.
````
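As background on what "torchrun-compatible runtime" means here: torchrun launches one process per GPU and passes rank information through environment variables, which distributed eval scripts typically read. A minimal sketch of that contract (standard torchrun behavior, illustrative only, not InternNav code):

```python
# Sketch: the environment variables a torchrun-compatible runtime provides.
# torchrun (and DLC jobs configured the same way) sets these per process.
import os

rank = int(os.environ.get("RANK", 0))              # global process index
world_size = int(os.environ.get("WORLD_SIZE", 1))  # total number of processes
local_rank = int(os.environ.get("LOCAL_RANK", 0))  # GPU index on this node

# A common pattern: shard the evaluation episodes across processes.
episodes = [f"episode_{i}" for i in range(100)]    # hypothetical episode ids
my_share = episodes[rank::world_size]
print(f"rank {rank}/{world_size} (GPU {local_rank}) evaluates {len(my_share)} episodes")
```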
````diff
@@ -36,13 +55,23 @@ The simulation can be visualized by setting `vis_output=True` in eval_cfg.
 Evaluate on Single-GPU:
 
 ```bash
-python scripts/eval/eval_habitat.py --model_path checkpoints/InternVLA-N1 --continuous_traj --output_path result/InternVLA-N1/val_unseen_32traj_8steps
+python scripts/eval/eval.py --config scripts/eval/configs/habitat_dual_system_cfg.py
 ```
 
-For multi-gpu inference, currently we only support inference on SLURM.
+For multi-GPU inference, we currently support SLURM as well as environments that expose a torchrun-compatible runtime (e.g., Aliyun DLC).
 
 ```bash
+# for slurm
 ./scripts/eval/bash/eval_dual_system.sh
+
+# for torchrun
+./scripts/eval/bash/torchrun_eval.sh \
+    --config scripts/eval/configs/habitat_dual_system_cfg.py
+
+# for Aliyun DLC
+./scripts/eval/bash/eval_vln_distributed.sh \
+    habitat \
+    --config scripts/eval/configs/habitat_dual_system_cfg.py
 ```
````
````diff
@@ -125,7 +154,18 @@ Currently we only support evaluating a single System2 on Habitat:
 Evaluate on Single-GPU:
 
 ```bash
-python scripts/eval/eval_habitat.py --model_path checkpoints/InternVLA-N1-S2 --mode system2 --output_path results/InternVLA-N1-S2/val_unseen \
+python scripts/eval/eval.py --config scripts/eval/configs/habitat_s2_cfg.py
+
+# set the config with the following fields
+eval_cfg = EvalCfg(
+    agent=AgentCfg(
+        model_name='internvla_n1',
+        model_settings={
+            "mode": "system2",  # inference mode: dual_system or system2
+            "model_path": "checkpoints/<s2_checkpoint>",  # path to model checkpoint
+        }
+    )
+)
 ```
 
 For multi-gpu inference, currently we only support inference on SLURM.
````
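The `--config` arguments above point to plain Python files that define an `eval_cfg` object like the one shown in the diff. A generic sketch of how such a file-based config can be loaded (illustrative; InternNav's actual loader may differ):

```python
# Load a Python-file config and pull out its eval_cfg object.
import importlib.util

def load_cfg(path: str, attr: str = "eval_cfg"):
    spec = importlib.util.spec_from_file_location("eval_config", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # executes the config file as a module
    return getattr(module, attr)     # e.g. the EvalCfg instance defined inside

# Usage (assuming the repo layout from the commands above):
# cfg = load_cfg("scripts/eval/configs/habitat_s2_cfg.py")
```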

source/en/user_guide/internnav/tutorials/env.md

Lines changed: 23 additions & 11 deletions
````diff
@@ -1,4 +1,4 @@
-# Customizing Environments and Tasks in InternNav
+# Environment Design in InternNav
 
 This tutorial provides a step-by-step guide to defining a new environment and a new navigation task within the InternNav framework.
 
@@ -17,26 +17,24 @@ Because of this separation:
 
 - We can run the same agent in simulation (Isaac / InternUtopia) or on a real robot, as long as both environments implement the same API.
 
-- We can benchmark different tasks (VLN, PointGoalNav, etc.) in different worlds without rewriting the agent.
+- We can benchmark different tasks in different worlds without rewriting the agent.
 
-InternNav already ships with two major environment backends:
+![img.png](../../../_static/image/internnav_process.png)
+
+InternNav already ships with three major environment backends:
 
 - **InternUtopiaEnv**:
   Simulated environment built on top of InternUtopia / Isaac Sim. This supports complex indoor scenes, object semantics, RGB-D sensing, and scripted evaluation loops.
-- **HabitatEnv** (WIP): Simulated environment built on top of Habitat Sim.
+
+- **HabitatEnv**: Simulated environment built on top of Habitat Sim. This supports a gym-style workflow and handles distributed episode setup.
 
 - **RealWorldEnv**:
   Wrapper around an actual robot platform and its sensors (e.g. RGB camera, depth, odometry). This lets you deploy the same agent logic in the physical world.
 
 All of these are children of the same base [`Env`](https://github.com/InternRobotics/InternNav/blob/main/internnav/env/base.py) class.
 
-## Evaluation Task (WIP)
-For the vlnpe benchmark, we build the task based on internutopia. Here is a diagram.
-
-![img.png](../../../_static/image/agent_definition.png)
 
-
-## Evaluation Metrics (WIP)
+### Evaluation Metrics in VLN-PE
 For the VLN-PE benchmark in internutopia, InternNav provides comprehensive evaluation metrics:
 - **Success Rate (SR)**: The proportion of episodes in which the agent successfully reaches the goal location within a 3-meter radius.
 - **Success Rate weighted by Path Length (SPL)**: Measures both efficiency and success. It is defined as the ratio of the shortest-path distance to the actual trajectory length, weighted by whether the agent successfully reaches the goal.
@@ -47,4 +45,18 @@ A higher SPL indicates that the agent not only succeeds but does so efficiently, without taking unnecessarily long routes.
 - **Fall Rate (FR)**: The frequency at which the agent falls or loses balance during navigation.
 - **Stuck Rate (StR)**: The frequency at which the agent becomes immobile or trapped (e.g., blocked by obstacles or unable to proceed).
 
-The implementation is under `internnav/env/utils/internutopia_extensions`, we highly suggested follow the guide of [InternUtopia](../../internutopia).
+### Evaluation Metrics in VLN-CE
+For the VLN-CE benchmark in Habitat, InternNav keeps the original Habitat evaluation configuration and registers the following metrics:
+
+- **Distance to Goal (DistanceToGoal)**: The geodesic distance from the agent's current position to the goal location.
+
+- **Success (Success)**: A binary indicator of whether the agent stops within **3 meters** of the goal.
+
+- **Success weighted by Path Length (SPL)**: Measures both success and navigation efficiency. It is defined as the ratio of the shortest-path distance to the actual trajectory length, weighted by whether the agent successfully reaches the goal.
+  A higher SPL indicates that the agent not only succeeds but does so efficiently, without taking unnecessarily long routes.
+
+- **Oracle Success Rate (OracleSuccess)**: The proportion of episodes in which **any point** along the agent's trajectory comes within **3 meters** of the goal, representing potential success if the agent were to stop optimally.
+
+- **Oracle Navigation Error (OracleNavigationError)**: The minimum geodesic distance between the agent and the goal over the entire trajectory.
+
+- **Normalized Dynamic Time Warping (nDTW)**: Measures how closely the agent's trajectory follows the ground-truth demonstration path. Registered only for the RxR benchmarks.
````
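For reference, the SPL and nDTW descriptions in these added docs match the standard definitions from the VLN literature; a minimal sketch follows, using straight-line distance where the simulator would use geodesic distance (illustrative, not InternNav's registered Habitat measures):

```python
# Standard SPL and nDTW definitions (straight-line distance stands in for
# the geodesic distance a simulator would provide).
import math

def spl(success: bool, shortest_dist: float, path_length: float) -> float:
    """Success weighted by Path Length for one episode."""
    if not success:
        return 0.0
    return shortest_dist / max(path_length, shortest_dist)

def ndtw(path, ref, d_th: float = 3.0) -> float:
    """Normalized Dynamic Time Warping between agent path and reference path."""
    n, m = len(path), len(ref)
    dtw = [[math.inf] * (m + 1) for _ in range(n + 1)]
    dtw[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(path[i - 1], ref[j - 1])
            dtw[i][j] = cost + min(dtw[i - 1][j], dtw[i][j - 1], dtw[i - 1][j - 1])
    # Normalize by reference length and the 3 m success threshold.
    return math.exp(-dtw[n][m] / (m * d_th))

# A trajectory that hugs the reference path scores near 1.0:
print(ndtw([(0, 0), (1, 0.1), (2, 0)], [(0, 0), (1, 0), (2, 0)]))
```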

source/en/user_guide/internnav/tutorials/index.md

Lines changed: 0 additions & 1 deletion
````diff
@@ -12,7 +12,6 @@ myst:
 :caption: Tutorials
 :maxdepth: 2
 
-core
 dataset
 model
 training
````
