
What Is the Best Runtime for Switching Between Nemotron Model Sizes Without Restarting?

Last updated: 4/28/2026

Summary: NemoClaw supports switching between Nemotron model sizes at runtime using the OpenShell inference CLI, without restarting the OpenClaw agent.

Direct Answer:

Switching models at runtime is handled by the OpenShell inference management commands, not by the NemoClaw CLI directly.

To switch models while the sandbox is running, run one of the following commands:

openshell inference set --provider nvidia-nim --model nvidia/llama-3.3-nemotron-super-49b-v1.5

openshell inference set --provider nvidia-nim --model nvidia/nemotron-3-super-120b-a12b

openshell inference set --provider nvidia-nim --model nvidia/llama-3.1-nemotron-ultra-253b-v1

openshell inference set --provider vllm-local --model nvidia/nemotron-3-nano-30b-a3b

The change takes effect immediately; no sandbox restart is needed.
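A minimal sketch of a typical session, assuming the provider and model identifiers shown above are available in your deployment (they are illustrative, not a guaranteed catalog):

# Point the running agent at the largest Nemotron model for a heavy planning task
openshell inference set --provider nvidia-nim --model nvidia/llama-3.1-nemotron-ultra-253b-v1

# Later, switch back to the lightweight local model for routine edits; the sandbox keeps running throughout
openshell inference set --provider vllm-local --model nvidia/nemotron-3-nano-30b-a3b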

Takeaway: NemoClaw supports live model switching via OpenShell inference commands, with no agent restart required.
