This guide explains how to route Claude Code requests through CLIProxyAPI to NVIDIA NIM’s OpenAI-compatible endpoint, so you can use Chinese foundation models such as GLM, Kimi, and MiniMax at no cost. It addresses the high cost and protocol mismatch of the official API path. Keywords: Claude Code, NVIDIA NIM, CLIProxyAPI.
| Technical Specification | Details |
|---|---|
| Language/Environment | Primarily Windows, Claude Code CLI |
| Protocol | Anthropic Messages → OpenAI Compatible |
| Core Tools | CLIProxyAPI, CC Switch |
| Target Models | GLM-4.7, GLM-5, Kimi K2.5, MiniMax M2.5 |
| Default Port | 8317 |
| Cost | NVIDIA NIM free-tier quota available |
This setup replaces expensive direct integration with protocol translation
Claude Code stands out for its complete in-terminal experience for code understanding, generation, and debugging. However, its default request path depends on the Anthropic API, which often creates cost and accessibility constraints when used directly.
NVIDIA NIM offers several Chinese models that are available for free trial use, and its API follows an OpenAI-compatible style. The real blocker is not model capability, but the protocol mismatch between Claude Code and the NVIDIA API.
The proxy flow is straightforward
Claude Code
-> Anthropic Messages request
-> CLIProxyAPI protocol translation
-> NVIDIA NIM OpenAI-compatible endpoint
-> Chinese model returns the result
This flow decouples the frontend tool from the backend model, so Claude Code no longer depends on a single model provider.
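To make the mismatch concrete, here is a minimal sketch of the kind of mapping the translation layer performs. This is illustrative only, not CLIProxyAPI's actual code: the real proxy also handles streaming, tool calls, and response translation. The main structural difference it bridges is that Anthropic Messages carries the system prompt as a separate top-level field, while OpenAI-style APIs put it in the messages array.

```python
# Illustrative sketch of Anthropic -> OpenAI request translation.
# Not CLIProxyAPI's real implementation; field handling is simplified.

def anthropic_to_openai(body: dict) -> dict:
    """Map a minimal Anthropic Messages request to an OpenAI-style one."""
    messages = []
    if "system" in body:
        # Anthropic keeps the system prompt outside the messages array;
        # OpenAI-compatible APIs expect it as the first message.
        messages.append({"role": "system", "content": body["system"]})
    messages.extend(body.get("messages", []))
    return {
        "model": body["model"],
        "messages": messages,
        "max_tokens": body.get("max_tokens", 1024),
    }

request = {
    "model": "z-ai/glm4.7",
    "system": "You are a coding assistant.",
    "max_tokens": 512,
    "messages": [{"role": "user", "content": "Which model are you?"}],
}
print(anthropic_to_openai(request)["messages"][0]["role"])  # → system
```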
Getting an NVIDIA API key is the prerequisite for the entire setup
First, visit https://build.nvidia.com/ and register a developer account. For users in mainland China, use the +86 prefix when verifying your phone number, for example +86138xxxxxxxx. Otherwise, you may get stuck at the verification step.
After signing in, go to the API Keys page and generate a new key. This key is usually shown only once, so save it immediately in a password manager or a local encrypted file.
Run through this checklist before moving on
# Core checklist
# 1. The account has completed phone verification
# 2. The API key has been copied and saved
# 3. This key will be added to the proxy service later
There is no command to run in this step, but completing this checklist directly affects everything that follows.
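If you want a scripted way to confirm the saved key works, a hedged option is to build a request against the standard OpenAI-compatible /v1/models route on the NVIDIA endpoint. The `NVIDIA_API_KEY` environment variable name is an assumption for this sketch; a 200 response from the commented-out call would confirm the key authenticates.

```python
import os
import urllib.request

# Sanity check for the saved key against the OpenAI-compatible /v1/models
# route. NVIDIA_API_KEY is an assumed variable name; adjust to your setup.

BASE_URL = "https://integrate.api.nvidia.com/v1"

def build_models_request(api_key: str) -> urllib.request.Request:
    """List-models request; a 200 response confirms the key is valid."""
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )

req = build_models_request(os.environ.get("NVIDIA_API_KEY", "nvapi-xxxxxx"))
print(req.full_url)  # → https://integrate.api.nvidia.com/v1/models
# To actually send it:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```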
Deploying CLIProxyAPI is the key step for protocol adaptation
CLIProxyAPI converts Anthropic-style requests into a format that NVIDIA NIM can accept. On Windows, download windows_amd64.zip, extract it, and it is ready to use.
Next, copy config.example.yaml to config.yaml, and modify only the three essential fields: the admin password, the remote access switch, and the NVIDIA API key.
The minimum working configuration should stay simple
# Admin password used to sign in to the management page later
secret-key: "your-secret"
# Allow local or LAN access as needed
allow-remote: true
# Enter the NVIDIA API key you just created
api-keys: "nvapi-xxxxxx"
This configuration defines the admin entry point and upstream authentication, which form the minimum required setup to start the proxy service.
After you run cli-proxy-api.exe on Windows, the service listens on port 8317 by default. Do not close this window, or all Claude Code requests will fail.
Then open http://localhost:8317/management.html, sign in with secret-key, and add a new NVIDIA provider under the AI Providers section.
Use a provider configuration that prioritizes stability
Provider Name: nvidia
Base URL: https://integrate.api.nvidia.com/v1
API Key: your NVIDIA API key
Model List: z-ai/glm4.7
It is best to start with z-ai/glm4.7, because the original test results indicate that it offers the best balance of stability and availability as the default model.
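Before wiring Claude Code to the proxy, an optional direct check of the upstream model can rule out provider-side problems early. The sketch below builds an OpenAI-style chat completion against the same Base URL as the provider configuration above; the key placeholder is illustrative and the network call is left commented.

```python
import json
import urllib.request

# Optional direct check of the upstream model, bypassing the proxy.
# Uses the OpenAI-compatible chat completions route on the NVIDIA endpoint;
# the key placeholder is illustrative.

NIM_URL = "https://integrate.api.nvidia.com/v1/chat/completions"

def build_chat_request(api_key: str, model: str = "z-ai/glm4.7") -> urllib.request.Request:
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": "Say OK."}],
        "max_tokens": 16,
    }
    return urllib.request.Request(
        NIM_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("nvapi-xxxxxx")
print(json.loads(req.data)["model"])  # → z-ai/glm4.7
# To send: urllib.request.urlopen(req), then read choices[0].message.content
```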
Claude Code can connect to the proxy through a GUI tool or manual configuration
If you want to minimize configuration errors, use CC Switch. It provides visual management for provider settings and helps you avoid formatting mistakes that often happen when editing config files directly.
When adding a provider in CC Switch, enter the API key, set the request URL to http://localhost:8317, choose the API format Anthropic Messages (Native), and select z-ai/glm4.7 as the primary model.
Manual configuration is better for users who know the directory layout
{
  "apiBaseUrl": "http://localhost:8317",
  "apiKey": "your NVIDIA API key"
}
This settings.json configuration rewrites Claude Code’s base request URL to the local proxy.
You can also use environment variables, which are more convenient for multi-environment switching or scripted deployment.
{
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "nvapi-xxxx",
    "ANTHROPIC_BASE_URL": "http://localhost:8317",
    "ANTHROPIC_MODEL": "z-ai/glm4.7"
  }
}
These variables are well suited for automation scenarios. In essence, they still point Claude Code to the local protocol translation layer.
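The scripted-switching idea can be sketched as a small wrapper that overrides the same variables per profile before launching Claude Code. The profile names here are illustrative; the variable names and model IDs come from the examples above.

```python
import os

# Sketch of scripted backend switching using the same environment variables
# as the settings.json example above. Profile names are illustrative.

PROFILES = {
    "glm": {"ANTHROPIC_BASE_URL": "http://localhost:8317",
            "ANTHROPIC_MODEL": "z-ai/glm4.7"},
    "kimi": {"ANTHROPIC_BASE_URL": "http://localhost:8317",
             "ANTHROPIC_MODEL": "moonshotai/kimi-k2.5"},
}

def env_for(profile: str) -> dict:
    """Return the current environment plus the chosen profile's overrides."""
    env = dict(os.environ)
    env.update(PROFILES[profile])
    return env

print(env_for("glm")["ANTHROPIC_MODEL"])  # → z-ai/glm4.7
# Then launch Claude Code with it, e.g.:
# subprocess.run(["claude"], env=env_for("glm"))
```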
Switch models only after you verify that the request chain works
After configuration is complete, restart your terminal and run claude. If the first session opens successfully, enter a simple test prompt such as “Which model are you?” If the reply includes GLM-4.7-related identifiers, the integration is usually working.
The minimum verification command is very simple
claude
# Start the Claude Code interactive interface
# After startup, send a test prompt to verify the actual backend model
The goal here is not to benchmark capability, but to confirm that proxying, authentication, and model routing all work correctly.
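The same smoke test can be scripted without opening the interactive interface. The sketch below builds one Anthropic-style Messages request against the local proxy; the /v1/messages path and headers follow the Anthropic Messages convention and are an assumption about how the proxy exposes that API, so adjust if yours differs. The network call is left commented.

```python
import json
import urllib.request

# Scripted version of the smoke test: one Anthropic-style Messages request
# through the local proxy. The /v1/messages path and headers follow the
# Anthropic convention and are assumptions about the proxy's surface.

PROXY = "http://localhost:8317"

def build_smoke_test(api_key: str, model: str = "z-ai/glm4.7") -> urllib.request.Request:
    payload = {
        "model": model,
        "max_tokens": 64,
        "messages": [{"role": "user", "content": "Which model are you?"}],
    }
    return urllib.request.Request(
        f"{PROXY}/v1/messages",
        data=json.dumps(payload).encode(),
        headers={"x-api-key": api_key, "content-type": "application/json"},
        method="POST",
    )

req = build_smoke_test("nvapi-xxxxxx")
print(req.full_url)  # → http://localhost:8317/v1/messages
# With the proxy running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["content"][0]["text"])
```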
Free model selection should prioritize stability over novelty
| Model ID | Model Name | Characteristics |
|---|---|---|
| z-ai/glm4.7 | GLM-4.7 | Stable, strong reasoning, recommended default |
| z-ai/glm5 | GLM-5 | Stronger overall capability |
| minimaxai/minimax-m2.5 | MiniMax M2.5 | Good performance in code editing and repair |
| moonshotai/kimi-k2.5 | Kimi K2.5 | Excellent long-context support |
The original source specifically notes that GLM-5 and some newer models may occasionally behave unstably. If your main use case is daily coding, ensure reliability first, then experiment with more aggressive model choices.
Most common failures come from four simple but frequent configuration issues
1. Omitting the +86 prefix causes NVIDIA phone verification to fail.
2. Port 8317 may already be in use, which prevents the proxy service from starting.
3. Claude Code may point to the wrong address, or the proxy window may have been closed.
4. Some newer models may fail intermittently; switching back to GLM-4.7 often restores normal operation.
Troubleshoot in order instead of reinstalling blindly
# Example: check port usage on Windows
netstat -ano | findstr 8317
# Check whether port 8317 is already occupied by another process
This kind of basic troubleshooting is often more effective than repeatedly changing configurations, especially in first-time deployments.
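For a quick cross-check of the netstat result, a small connection probe can tell you whether anything is listening on the proxy port. This only shows that a listener exists, not that it is CLIProxyAPI specifically.

```python
import socket

# Cross-check the netstat result: try to connect to the proxy port.
# A successful connection means some process is listening on 8317.

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print("proxy listening:", port_open("localhost", 8317))
```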
This integration path turns Claude Code into an open model gateway
This approach is not a workaround or exploit. It uses a proxy layer for protocol adaptation, combining Claude Code’s interaction experience with NVIDIA NIM’s free Chinese models.
For developers who want to control costs, keep a terminal-first workflow, and use GLM, Kimi, or MiniMax, this is a practical and high-value setup path.
FAQ
1. Why can’t Claude Code call NVIDIA NIM directly?
Because Claude Code expects the Anthropic protocol by default, while NVIDIA NIM provides an OpenAI-compatible API. The request formats differ, so you need CLIProxyAPI to translate between them.
2. Why is GLM-4.7 recommended as the default model?
Based on the original implementation notes, GLM-4.7 is more stable among the free available models. That makes it a better long-term default backend for Claude Code, especially for code generation and day-to-day debugging.
3. What should I check first if the connection fails?
Check these three items first: whether CLIProxyAPI is still running, whether Claude Code’s apiBaseUrl points to http://localhost:8317, and whether the NVIDIA API key is correct. These three checks cover most failure cases.
AI Readability Summary
This guide reconstructs the complete process for connecting Claude Code to NVIDIA NIM’s free Chinese large language models through CLIProxyAPI. It covers NVIDIA API key creation, deployment of the protocol translation proxy, setup through CC Switch or manual configuration, model verification, and common troubleshooting steps. It is well suited for developers who want a low-cost terminal AI coding assistant workflow.