This article explains how VS Code extensions can natively integrate with GitHub Copilot. It covers chat participant registration, model invocation, and intelligent code review. It helps extension developers solve three common problems: the complexity of building AI workflows from scratch, the difficulty of context management, and the lack of clarity around publishing requirements. Keywords: VS Code extensions, GitHub Copilot, Language Model API.
The technical specification snapshot is straightforward
| Parameter | Description |
|---|---|
| Development language | TypeScript / Node.js |
| Runtime environment | VS Code Extension Host |
| Core protocols | VS Code Chat API, Language Model API |
| AI provider | GitHub Copilot |
| Typical models | GPT-4, GPT-3.5 series |
| Core dependencies | vscode, @vscode/vsce |
| Common use cases | AI chat extensions, code explanation, code review, tool calling |
VS Code has opened Copilot capabilities to extension developers
Starting in mid-2024, VS Code began allowing extensions to directly reuse GitHub Copilot conversation and model capabilities. Developers no longer need to build their own prompt gateway, model routing layer, or streaming output framework to embed AI into an existing toolchain.
This integration generally follows two paths: the Chat Participant API for the Copilot Chat panel, and the Language Model API for internal extension logic. The former works best for visible user interactions, while the latter fits background reasoning and automation.
The Chat Participant API turns an extension into a conversational entity
The core value of the Chat Participant API is that it lets an extension appear in Copilot Chat as @your-extension. Users can trigger extension capabilities directly through natural language or a /command, which shortens the interaction path and lowers the learning curve.
An implementation usually has three key parts: participant registration, slash command definitions, and follow-up question suggestions. This lets the extension answer the current question while also guiding the user toward the next step, creating a continuous task flow.
```typescript
import * as vscode from 'vscode';

export function activate(context: vscode.ExtensionContext) {
  // Register a chat participant so users can invoke it with @myextension in Copilot Chat
  const participant = vscode.chat.createChatParticipant(
    'myextension.helper',
    async (request, chatContext, stream, token) => {
      const userMessage = request.prompt; // Read the user input
      // Stream the result back as Markdown
      stream.markdown(`You asked: ${userMessage}\n\n`);
      stream.markdown('This is the response returned by the extension.');
      return {}; // The ChatResult; metadata for later turns can be attached here
    }
  );
  // Follow-up suggestions are supplied through followupProvider, not the handler's return value
  participant.followupProvider = {
    provideFollowups(result, context, token) {
      return [
        { prompt: 'Tell me more details', label: '📖 Detailed explanation' },
        { prompt: 'Show me an example', label: '💡 Sample code' }
      ];
    }
  };
  context.subscriptions.push(participant);
}
```
This code registers a minimal but functional chat participant and supports both streaming output and follow-up prompts.
The Language Model API lets an extension invoke models internally
If your goal is not to expose capabilities in the chat panel, but instead to run AI tasks silently in commands, Code Actions, sidebars, or file-save hooks, you should prioritize the Language Model API.
It supports model selection, message construction, streaming consumption, and context injection. Common scenarios include code generation, document summarization, test case generation, selected code explanation, and automated review.
```typescript
import * as vscode from 'vscode';

async function generateCode(prompt: string) {
  // Prefer the GPT-4 family of models provided by Copilot
  const models = await vscode.lm.selectChatModels({
    vendor: 'copilot',
    family: 'gpt-4'
  });
  if (models.length === 0) {
    throw new Error('No Copilot model available'); // Throw an explicit error when no model is available
  }
  const model = models[0];
  const messages = [
    vscode.LanguageModelChatMessage.User(prompt) // Build the user message
  ];
  // In real code, prefer a CancellationToken passed in by the caller
  const response = await model.sendRequest(
    messages,
    {},
    new vscode.CancellationTokenSource().token
  );
  let fullText = '';
  for await (const chunk of response.text) {
    fullText += chunk; // Concatenate the streamed output chunk by chunk
  }
  return fullText;
}
```
This code demonstrates the most basic flow for model selection, request submission, and streaming result aggregation.
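As a usage sketch, assuming a hypothetical command id `myextension.generateCode` (not from the source), the helper can be wired into a command from inside activate():

```typescript
// Illustrative usage inside activate(): expose generateCode through a command
context.subscriptions.push(
  vscode.commands.registerCommand('myextension.generateCode', async () => {
    const editor = vscode.window.activeTextEditor;
    if (!editor) {
      return;
    }
    const selection = editor.document.getText(editor.selection); // Use the current selection as the prompt
    const generated = await generateCode(`Generate code based on:\n${selection}`);
    await editor.edit(edit => edit.insert(editor.selection.end, `\n${generated}`)); // Insert below the selection
  })
);
```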
Combining both APIs is the only way to build a truly usable AI extension
If you only use the Chat Participant API, you get an extension that can chat. If you only use the Language Model API, you get an extension that can reason. You need both to build a complete human-in-the-loop workflow.
A classic example is a code review assistant: the user enters /review in the Chat panel, the extension reads the current file, then calls a Copilot model to review it and continuously streams the results back into the chat response.
Register a code review participant that supports command routing
```typescript
const reviewer = vscode.chat.createChatParticipant(
  'codereview.assistant',
  async (request, context, stream, token) => {
    const command = request.command; // Read the slash command
    if (command === 'review') {
      await reviewCurrentFile(stream, token); // Review the current file
    } else if (command === 'explain') {
      await explainSelection(stream, token); // Explain the current selection
    } else {
      stream.markdown('Supported commands:\n- `/review` Review the current file\n- `/explain` Explain the selected code');
    }
  }
);

reviewer.followupProvider = {
  provideFollowups(result, context, token) {
    return [
      { prompt: 'Fix these issues', label: '🔧 Auto-fix' },
      { prompt: 'Explain the first issue', label: '❓ More details' }
    ];
  }
};
```
This code implements command routing and follow-up suggestions, which form the core interaction skeleton of an AI assistant.
Reviewing the current file with the Language Model API better matches real engineering workflows
```typescript
import * as vscode from 'vscode';

async function reviewCurrentFile(
  stream: vscode.ChatResponseStream,
  token: vscode.CancellationToken
) {
  const editor = vscode.window.activeTextEditor;
  if (!editor) {
    stream.markdown('❌ Please open a file first');
    return;
  }
  const code = editor.document.getText(); // Get the full content of the current file
  const language = editor.document.languageId; // Get the language type
  stream.progress('Analyzing code...');
  const models = await vscode.lm.selectChatModels({ vendor: 'copilot' });
  if (models.length === 0) {
    stream.markdown('⚠️ No Copilot model available'); // Guard against an empty model list
    return;
  }
  const model = models[0];
  const messages = [
    vscode.LanguageModelChatMessage.User(
      `Please review the following ${language} code and identify potential issues:\n\n\`\`\`${language}\n${code}\n\`\`\``
    )
  ];
  const response = await model.sendRequest(messages, {}, token);
  stream.markdown('### Review results\n\n');
  for await (const chunk of response.text) {
    stream.markdown(chunk); // Continuously write model output into the chat stream
  }
}
```
This code connects editor context, prompt packaging, and streaming output into a complete review pipeline.
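The `/explain` branch above calls an `explainSelection` helper that is not spelled out in the source; a minimal sketch following the same pattern as `reviewCurrentFile` could look like this:

```typescript
// A possible shape for the explainSelection helper referenced above (illustrative)
async function explainSelection(
  stream: vscode.ChatResponseStream,
  token: vscode.CancellationToken
) {
  const editor = vscode.window.activeTextEditor;
  if (!editor || editor.selection.isEmpty) {
    stream.markdown('❌ Please select some code first');
    return;
  }
  const code = editor.document.getText(editor.selection); // Only send the selected range
  const language = editor.document.languageId;
  stream.progress('Explaining selection...');
  const models = await vscode.lm.selectChatModels({ vendor: 'copilot' });
  if (models.length === 0) {
    stream.markdown('⚠️ No Copilot model available');
    return;
  }
  const response = await models[0].sendRequest(
    [vscode.LanguageModelChatMessage.User(
      `Explain the following ${language} code:\n\n\`\`\`${language}\n${code}\n\`\`\``
    )],
    {},
    token
  );
  for await (const chunk of response.text) {
    stream.markdown(chunk); // Stream the explanation into the chat response
  }
}
```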
Permission declarations and error handling must be designed early
Many extension integrations fail not because the model invocation code is wrong, but because required API proposals were never declared, or because edge cases such as Copilot not being installed, the user not being signed in, or no model being available were never handled.
In package.json, you need to declare any proposed APIs you rely on and contribute the chat participant metadata. Otherwise, even if the code compiles, the corresponding capabilities may fail to activate at runtime. (On recent VS Code releases the core Chat and Language Model APIs are finalized; enabledApiProposals is only required for capabilities that are still proposals.)
```json
{
  "enabledApiProposals": ["chatParticipant", "languageModels"],
  "contributes": {
    "chatParticipants": [
      {
        "id": "codereview.assistant",
        "name": "Code Reviewer",
        "description": "Intelligent code review assistant",
        "commands": [
          { "name": "review", "description": "Review the current file" },
          { "name": "explain", "description": "Explain the selected code" }
        ]
      }
    ]
  }
}
```
This configuration declares the extension’s AI capability entry points and serves as the prerequisite for feature visibility.
Robust error handling directly affects the user experience
```typescript
try {
  const models = await vscode.lm.selectChatModels({ vendor: 'copilot' });
  if (models.length === 0) {
    stream.markdown('⚠️ GitHub Copilot was not detected. Please install it and sign in first.');
    return;
  }
  // Continue with model invocation logic here
} catch (error) {
  if (error instanceof vscode.LanguageModelError) {
    stream.markdown(`❌ AI invocation failed: ${error.message}`); // Display a readable error
  } else {
    throw error; // Re-throw unexpected errors instead of swallowing them silently
  }
}
```
This code handles unavailable models and API exceptions, which is a baseline requirement for production-grade extensions.
Context management and tool calling define the extension’s upper bound
Basic capabilities can only answer single-turn questions, but high-value AI extensions must understand conversational history. By using the conversation records in ChatContext, you can reconstruct a multi-turn interaction into a model message sequence and enable continuous task execution.
```typescript
async function handleWithHistory(request: vscode.ChatRequest, context: vscode.ChatContext, model: vscode.LanguageModelChat) {
  const messages = context.history.map(item => {
    if (item instanceof vscode.ChatRequestTurn) {
      return vscode.LanguageModelChatMessage.User(item.prompt); // Historical user question
    }
    // A ChatResponseTurn stores its response as a list of parts; join the markdown parts back into text
    const text = item.response
      .filter(part => part instanceof vscode.ChatResponseMarkdownPart)
      .map(part => (part as vscode.ChatResponseMarkdownPart).value.value)
      .join('');
    return vscode.LanguageModelChatMessage.Assistant(text); // Historical assistant response
  });
  messages.push(vscode.LanguageModelChatMessage.User(request.prompt)); // Append the current question
  return model.sendRequest(messages, {}, new vscode.CancellationTokenSource().token);
}
```
This code converts chat history into model-readable input and works well for complex Q&A and iterative fix workflows.
Tool calling upgrades an extension from answering questions to executing tasks
When the model can do more than generate text and can also request workspace search, read configuration, or execute commands, the extension evolves from a Q&A tool into an intelligent agent. This is a major direction for AI-native IDE capabilities.
```typescript
const tools: vscode.LanguageModelChatTool[] = [
  {
    name: 'searchCode',
    description: 'Search code in the workspace',
    inputSchema: {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'Search keyword' }
      }
    }
  }
];

// Send a request with tool definitions and allow the model to initiate tool calls
const response = await model.sendRequest(messages, { tools }, token);
```
This code injects callable tools into the model and provides a key entry point for making an extension more capable and autonomous.
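The snippet above only advertises the tool; the extension still has to detect the model's tool calls and feed results back. The following continuation is a minimal sketch of that round trip: it reuses `model`, `messages`, `tools`, `token`, and `stream` from the surrounding examples, and `runSearch` is a hypothetical local helper that returns its matches as a string. Verify the part classes against the `vscode.lm` API version your extension targets.

```typescript
// Sketch of a tool-call round trip (runSearch is a hypothetical helper)
const result = await model.sendRequest(messages, { tools }, token);
for await (const part of result.stream) {
  if (part instanceof vscode.LanguageModelTextPart) {
    stream.markdown(part.value); // Plain text: forward it to the user
  } else if (part instanceof vscode.LanguageModelToolCallPart && part.name === 'searchCode') {
    const matches = await runSearch((part.input as { query: string }).query); // Execute the tool locally
    // Echo the tool call and its result back so the model can continue with the data
    messages.push(vscode.LanguageModelChatMessage.Assistant([part]));
    messages.push(vscode.LanguageModelChatMessage.User([
      new vscode.LanguageModelToolResultPart(part.callId, [new vscode.LanguageModelTextPart(matches)])
    ]));
    const followUp = await model.sendRequest(messages, { tools }, token);
    for await (const chunk of followUp.text) {
      stream.markdown(chunk); // Stream the model's final answer
    }
  }
}
```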
Debugging, publishing, and fallback strategies determine whether the project can ship
For local debugging, it is best to start the Extension Development Host with F5, open Copilot Chat in the new window, verify whether @your-extension registration succeeded, and then test command behavior and streaming responses.
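Projects scaffolded with the Yeoman generator (`yo code`) get this launch setup for free; otherwise, a minimal `.vscode/launch.json` for the Extension Development Host looks roughly like this:

```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Run Extension",
      "type": "extensionHost",
      "request": "launch",
      "args": ["--extensionDevelopmentPath=${workspaceFolder}"]
    }
  ]
}
```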
To troubleshoot prompts, context construction, and model responses, create a dedicated output channel that logs user input, model output, and error stacks instead of relying only on the visible UI when diagnosing issues.
```typescript
const outputChannel = vscode.window.createOutputChannel('My Extension');
outputChannel.appendLine(`User prompt: ${request.prompt}`); // Log user input
outputChannel.appendLine(`Model response: ${fullText}`); // Log model output
```
This code creates auditable logs that make it easier to diagnose model behavior issues and prompt drift.
During publishing, developers typically use vsce to package and upload the extension to the Marketplace. You should also explicitly state in the README that the extension depends on GitHub Copilot. If Copilot is unavailable, provide a rules-based fallback mode or disable related entry points instead of failing hard.
```bash
# Install the VS Code extension publishing tool
npm install -g @vscode/vsce
# Package the extension
vsce package
# Publish to the Marketplace
vsce publish
```
These commands complete extension packaging and distribution and represent the standard delivery workflow.
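One lightweight shape for the fallback mentioned above, sketched here with an illustrative context key name `myext.copilotAvailable`, is to gate the AI entry points behind a `when` clause so menus and commands disappear cleanly when Copilot is unavailable:

```typescript
import * as vscode from 'vscode';

// Graceful degradation sketch: gate AI entry points behind a context key
async function updateCopilotAvailability() {
  const models = await vscode.lm.selectChatModels({ vendor: 'copilot' });
  // Menu and command "when" clauses can reference myext.copilotAvailable to hide AI entry points
  await vscode.commands.executeCommand('setContext', 'myext.copilotAvailable', models.length > 0);
}
```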
FAQ
1. How should I choose between the Chat Participant API and the Language Model API?
If the capability needs to appear in Copilot Chat and be triggered through @extension-name or /command, choose the Chat Participant API. If the capability needs to run silently inside commands, editor actions, or background workflows, choose the Language Model API. In real projects, you usually combine both.
2. Why can’t my extension call a Copilot model?
The three most common causes are: you did not enable the chatParticipant or languageModels proposals in package.json; the user has not installed or signed in to GitHub Copilot; or there is no selectable model in the runtime environment. Check the model list length first, then emit a readable error.
3. What are the most important engineering concerns before launching an AI code review extension?
Focus on three things first: permission and dependency declarations, error handling and fallback behavior, and context boundary control. Without this infrastructure, even a strong model will be difficult to deploy reliably. In production, you should also add logging, rate limiting, and prompt version management.
[AI Readability Summary]
This article systematically reconstructs the development path for integrating GitHub Copilot into VS Code extensions. It covers the Chat Participant API, the Language Model API, a practical code review assistant, permission declarations, error handling, context management, tool calling, and the publishing workflow. The goal is to help developers quickly build usable, AI-native extensions.