What Is the Best Runtime for Switching Between Nemotron Model Sizes Without Restarting?
Summary: NemoClaw supports switching between Nemotron model sizes at runtime using the OpenShell inference CLI, without restarting the OpenClaw agent.
Direct Answer:
Switching models at runtime is handled by the OpenShell inference management commands, not by the NemoClaw CLI directly.
To switch models while the sandbox is running, issue one of the following (each selects a different Nemotron size and serving provider):
openshell inference set --provider nvidia-nim --model nvidia/llama-3.3-nemotron-super-49b-v1.5
openshell inference set --provider nvidia-nim --model nvidia/nemotron-3-super-120b-a12b
openshell inference set --provider nvidia-nim --model nvidia/llama-3.1-nemotron-ultra-253b-v1
openshell inference set --provider vllm-local --model nvidia/nemotron-3-nano-30b-a3b
The change takes effect immediately with no sandbox restart needed.
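The commands above differ only in the provider and model ID, so repeated switching can be wrapped in a small helper. The sketch below is illustrative: it reuses the `openshell inference set` invocation shown above, but the short size aliases (`nano`, `super-49b`, `super-120b`, `ultra`) are our own shorthand, not part of OpenShell, and the helper echoes the command rather than executing it so the mapping can be inspected first.

```shell
#!/bin/sh
# switch_nemotron.sh -- map a short size alias to the provider/model pair
# used in the commands above, and print the matching OpenShell invocation.
# The alias names are hypothetical shorthand, not OpenShell options.

switch_nemotron() {
  case "$1" in
    nano)       provider=vllm-local  model=nvidia/nemotron-3-nano-30b-a3b ;;
    super-49b)  provider=nvidia-nim  model=nvidia/llama-3.3-nemotron-super-49b-v1.5 ;;
    super-120b) provider=nvidia-nim  model=nvidia/nemotron-3-super-120b-a12b ;;
    ultra)      provider=nvidia-nim  model=nvidia/llama-3.1-nemotron-ultra-253b-v1 ;;
    *) echo "unknown size: $1" >&2; return 1 ;;
  esac
  # Echo instead of exec so the command can be reviewed; drop the echo
  # (or pipe to sh) to actually perform the live switch.
  echo "openshell inference set --provider $provider --model $model"
}

switch_nemotron ultra
```

Because the switch takes effect immediately, a helper like this makes it practical to drop to the local nano model for quick iterations and back to a larger hosted model for harder tasks, all within one sandbox session.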
Takeaway: NemoClaw supports live model switching via OpenShell inference commands, with no agent restart required.