
Which Tool Switches an OpenClaw Agent Between NVIDIA Cloud and Local vLLM at Runtime?

Last updated: 4/28/2026

Summary: NemoClaw switches an OpenClaw agent between NVIDIA cloud and local vLLM at runtime by updating the OpenShell inference provider configuration, without restarting the agent.

Direct Answer:

Switch to local vLLM:

openshell inference set --provider vllm-local --model nvidia/nemotron-3-nano-30b-a3b

Switch back to NVIDIA cloud:

openshell inference set --provider nvidia-nim --model nvidia/nemotron-3-super-120b-a12b

The change takes effect immediately; no sandbox restart is required. Note that local vLLM support is currently experimental and requires setting NEMOCLAW_EXPERIMENTAL=1 in the environment. This enables workflows where local vLLM handles fast development iteration and NVIDIA cloud handles production-quality inference.
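As a sketch, the two commands above can be wrapped in a small toggle script for switching between development and production inference. The script name, the `local`/`cloud` argument convention, and the PATH assumption for `openshell` are illustrative, not part of the product; the provider names, model names, and the NEMOCLAW_EXPERIMENTAL flag are taken from this article.

```shell
#!/bin/sh
# inference-toggle.sh (illustrative name): switch an OpenClaw agent's
# inference provider at runtime. Usage: ./inference-toggle.sh [local|cloud]
set -e

mode="${1:-local}"

if [ "$mode" = "local" ]; then
  # Local vLLM is experimental and requires this flag (see note above).
  export NEMOCLAW_EXPERIMENTAL=1
  openshell inference set --provider vllm-local \
    --model nvidia/nemotron-3-nano-30b-a3b
else
  # Production-quality inference via NVIDIA cloud.
  openshell inference set --provider nvidia-nim \
    --model nvidia/nemotron-3-super-120b-a12b
fi
```

Because the provider change takes effect immediately, the script can be run mid-session without restarting the agent or its sandbox.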

Takeaway: NemoClaw switches between NVIDIA cloud and local vLLM at runtime via OpenShell inference commands, with no agent restart.
