DeepSeek V4 API: The Complete Developer Guide
SitePoint

Getting Started: API Key and Environment Setup

Creating Your DeepSeek Account and API Key

Registration begins at platform.deepseek.com. After creating an account, navigate to the API Keys section in the dashboard and generate a new key. Copy the key immediately; it will not be shown again.

Store the key in an environment variable. Never hardcode API keys in source files, commit them to version control, or expose them in client-side code.

Project Initialization

Set up a new Node.js project and install the required dependencies:

mkdir deepseek-v3-demo
cd deepseek-v3-demo
npm init -y
npm install openai@4 dotenv@16

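This guide was tested with openai@4.x and dotenv@16.x. By default, npm records these as caret ranges in package.json, so commit package-lock.json (or install with --save-exact) to pin the exact versions and avoid breaking changes.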

All example files use the .mjs extension to enable ES module syntax (including top-level await). Alternatively, add "type": "module" to package.json to use import in .js files.

Create a .env file in the project root:

DEEPSEEK_API_KEY=your_api_key_here
DEEPSEEK_BASE_URL=https://api.deepseek.com

Add .env to .gitignore to prevent accidental exposure:

echo ".env" >> .gitignore

Your First DeepSeek V3 API Call

Configuring the OpenAI SDK for DeepSeek

The OpenAI Node.js SDK accepts a baseURL constructor parameter. Pointing it at https://api.deepseek.com routes all requests to DeepSeek's servers while keeping the same method signatures, request formats, and response shapes for chat completions. You don't need a wrapper library or adapter.

Create a file named basic.mjs:

import "dotenv/config";
import OpenAI from "openai";

const apiKey = process.env.DEEPSEEK_API_KEY;
const baseURL = process.env.DEEPSEEK_BASE_URL;

if (!apiKey || apiKey.trim() === "") {
  console.error("Error: DEEPSEEK_API_KEY is not set or is empty in your .env file.");
  process.exit(1);
}

if (!baseURL || baseURL.trim() === "") {
  console.error("Error: DEEPSEEK_BASE_URL is not set or is empty in your .env file.");
  process.exit(1);
}

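const client = new OpenAI({
  baseURL, // send requests to DeepSeek instead of api.openai.com
  apiKey,
  timeout: 60_000, // per-request timeout in milliseconds
  maxRetries: 0, // disable the SDK's automatic retries so failures surface immediately
});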

const response = await client.chat.completions.create({
  model: "deepseek-chat",
  messages: [
    { role: "system", content: "You are a helpful programming assistant." },
    { role: "user", content: "Explain the difference between map and flatMap in JavaScript." },
  ],
});

const choice = response.choices?.[0];
if (!choice || choice.finish_reason === "content_filter") {
  console.error(
    "No valid completion returned. finish_reason:",
    choice?.finish_reason ?? "no choices"
  );
  process.exit(1);
}

const content = choice.message?.content;
if (typeof content !== "string") {
  console.error("Unexpected response shape: missing message content.");
  process.exit(1);
}

console.log(content);
console.log("Token usage:", response.usage);

Run with node basic.mjs.

Understanding the Response Object

The response follows the OpenAI chat completion schema. response.choices is an array where each entry contains a message object with role and content fields. The finish_reason field indicates why generation stopped:

  • "stop" for natural completion
  • "length" if the response hit the max_tokens cap
  • "tool_calls" if the model invoked a function
  • "content_filter" if content filtering blocked the response

The usage object reports prompt_tokens, completion_tokens, and total_tokens, which map directly to billing. Monitoring these values is essential for cost tracking in production.
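
As a rough illustration of how finish_reason and usage might be consumed together, the sketch below maps each finish_reason to a loggable message and estimates the cost of a request. The prices in ASSUMED_PRICES are placeholders invented for this example, and describeFinishReason and estimateCostUSD are hypothetical helper names; substitute the rates from DeepSeek's pricing page before relying on the numbers.

```javascript
// Placeholder per-million-token prices in USD (assumptions, not real rates).
const ASSUMED_PRICES = { inputPerMillion: 0.5, outputPerMillion: 1.5 };

// Translate a finish_reason value into a short, loggable explanation.
function describeFinishReason(reason) {
  switch (reason) {
    case "stop":
      return "completed naturally";
    case "length":
      return "truncated at the max_tokens cap";
    case "tool_calls":
      return "model requested a tool call";
    case "content_filter":
      return "blocked by content filtering";
    default:
      return `unrecognized finish_reason: ${String(reason)}`;
  }
}

// Estimate the cost of one request from its usage object.
function estimateCostUSD(usage, prices = ASSUMED_PRICES) {
  const inputCost = ((usage?.prompt_tokens ?? 0) / 1_000_000) * prices.inputPerMillion;
  const outputCost = ((usage?.completion_tokens ?? 0) / 1_000_000) * prices.outputPerMillion;
  return inputCost + outputCost;
}

// Hand-written usage object, so the example runs without a network call.
const sampleUsage = { prompt_tokens: 1200, completion_tokens: 800, total_tokens: 2000 };
console.log(describeFinishReason("length"));
console.log(estimateCostUSD(sampleUsage).toFixed(6));
```

In a real script, pass response.choices[0].finish_reason and response.usage to these helpers after each call and accumulate the estimates per user or per job.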

Core API Features and Parameters

System Prompts and Multi-Turn Conversations

The messages array uses three main roles:

  • system - sets behavior and constraints
  • user - end-user input
  • assistant - model responses from previous turns

The API is stateless: it does not remember earlier requests. Multi-turn conversations therefore require you to keep this array yourself and resend it, with each new message appended, on every call.

Create multiturn.mjs:

import "dotenv/config";
import OpenAI from "openai";

const apiKey = process.env.DEEPSEEK_API_KEY;
const baseURL = process.env.DEEPSEEK_BASE_URL;

if (!apiKey || apiKey.trim() === "") {
  console.error("Error: DEEPSEEK_API_KEY is not set or is empty in your .env file.");
  process.exit(1);
}

if (!baseURL || baseURL.trim() === "") {
  console.error("Error: DEEPSEEK_BASE_URL is not set or is empty in your .env file.");
  process.exit(1);
}

const client = new OpenAI({
  baseURL,
  apiKey,
  timeout: 60_000,
  maxRetries: 0,
});

const conversationHistory = [
  { role: "system", content: "You are a senior JavaScript developer. Be concise and precise." },
];

const MAX_HISTORY_TURNS = 10; // window size: system prompt + the 20 most recent messages

function appendAndTrim(history, role, content) {
  history.push({ role, content });
  // Keep the system prompt (if present at index 0) outside the trimmed window
  const systemPrompt = history[0].role === "system" ? [history[0]] : [];
  const turns = history.slice(systemPrompt.length);
  const maxTurnMessages = MAX_HISTORY_TURNS * 2; // user + assistant per turn
  // Keep only the newest messages; note the window can begin with an assistant
  // reply when a new user message pushes out the oldest one
  const trimmed = turns.slice(Math.max(0, turns.length - maxTurnMessages));
  // Rewrite the array in place so every holder of the reference sees the update
  history.length = 0;
  history.push(...systemPrompt, ...trimmed);
}

async function chat(userMessage) {
  appendAndTrim(conversationHistory, "user", userMessage);
  const response = await client.chat.completions.create({
    model: "deepseek-chat",
    messages: conversationHistory,
  });
  const choice = response.choices?.[0];
  if (!choice || choice.finish_reason === "content_filter") {
    throw new Error(
      `No valid completion returned. finish_reason: ${choice?.finish_reason ?? "no choices"}`
    );
  }
  const assistantMessage = choice.message?.content;
  if (typeof assistantMessage !== "string") {
    throw new Error("Unexpected response shape: missing message content.");
  }
  appendAndTrim(conversationHistory, "assistant", assistantMessage);
  return assistantMessage;
}

console.log(await chat("What is a closure in JavaScript?"));
console.log(await chat("Can you give me a practical example of one?"));
console.log(await chat("How does that relate to the module pattern?"));

Each call sends the accumulated history, trimmed to a sliding window. The model can still reference recent turns, while the request size (and the prompt tokens billed for it) stays bounded.
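
Because appendAndTrim counts messages rather than pairs, its window can start with an assistant reply once the history is full. The following sketch shows one possible variant, a hypothetical trimHistory function that returns a new array and drops any leading non-user message so the window always opens on a user turn. Whether the endpoint cares about this ordering is not stated in this guide, so treat it as a defensive assumption.

```javascript
// Return a new array: the system prompt (if any) plus the most recent
// messages, at most maxTurns * 2 of them, starting at a user message.
// The input array is not modified.
function trimHistory(history, maxTurns) {
  const hasSystem = history[0]?.role === "system";
  const system = hasSystem ? [history[0]] : [];
  const turns = hasSystem ? history.slice(1) : history.slice();
  let recent = turns.slice(Math.max(0, turns.length - maxTurns * 2));
  // Drop leading assistant messages left over from a pair cut in half.
  while (recent.length > 0 && recent[0].role !== "user") {
    recent = recent.slice(1);
  }
  return [...system, ...recent];
}

// Offline demonstration: three full turns plus a new user message.
const demoHistory = [
  { role: "system", content: "Be concise." },
  { role: "user", content: "q1" },
  { role: "assistant", content: "a1" },
  { role: "user", content: "q2" },
  { role: "assistant", content: "a2" },
  { role: "user", content: "q3" },
  { role: "assistant", content: "a3" },
  { role: "user", content: "q4" },
];
const trimmedHistory = trimHistory(demoHistory, 2);
console.log(trimmedHistory.map((m) => m.content).join(" | "));
```

With maxTurns set to 2, the raw window would begin at a2; the loop discards it, leaving the system prompt followed by q3, a3, and q4.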

Key Parameters for Controlling Output

The API accepts several parameters for shaping generation behavior:

  • temperature (0 to 2) - controls randomness; lower values produce more deterministic output. Check the DeepSeek API docs for the current default.
  • top_p (0 to 1) - nucleus sampling; restricts sampling to the smallest set of tokens whose cumulative probability reaches top_p
  • max_tokens - caps the number of tokens generated in the response
  • frequency_penalty and presence_penalty (both -2 to 2 in the OpenAI-compatible spec) - discourage repetition and encourage topic diversity, respectively. Verify that the DeepSeek endpoint honors these parameters, as behavior may differ from OpenAI.
  • stop - an array of delimiter strings; generation halts when the model produces any of them
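
One way to keep these parameters within the ranges listed above is to build them through a small helper. The sketch below uses a hypothetical buildSamplingOptions function that clamps numeric values and omits anything the caller did not supply, so the endpoint's own defaults apply. The ranges are those given in the list; confirm them against the DeepSeek API docs.

```javascript
// Clamp a number into [min, max].
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Build the sampling portion of a chat completion request.
// Only parameters the caller supplied are included in the result.
function buildSamplingOptions(opts = {}) {
  const out = {};
  if (Number.isFinite(opts.temperature)) out.temperature = clamp(opts.temperature, 0, 2);
  if (Number.isFinite(opts.top_p)) out.top_p = clamp(opts.top_p, 0, 1);
  if (Number.isInteger(opts.max_tokens) && opts.max_tokens > 0) out.max_tokens = opts.max_tokens;
  if (Number.isFinite(opts.frequency_penalty)) {
    out.frequency_penalty = clamp(opts.frequency_penalty, -2, 2);
  }
  if (Number.isFinite(opts.presence_penalty)) {
    out.presence_penalty = clamp(opts.presence_penalty, -2, 2);
  }
  if (Array.isArray(opts.stop) && opts.stop.length > 0) out.stop = opts.stop;
  return out;
}

const samplingOptions = buildSamplingOptions({
  temperature: 2.5, // out of range; clamped to 2
  top_p: 0.9,
  max_tokens: 512,
  stop: ["\n\n"],
});
console.log(samplingOptions);
```

The result can be spread into a request body, for example { model: "deepseek-chat", messages, ...samplingOptions }.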

For structured output, set response_format: { type: "json_object" } and instruct the model in the system or user prompt to produce JSON. Where the endpoint supports it, this mode increases the likelihood of valid JSON output. Verify support in the DeepSeek API docs and always wrap JSON.parse() in a try/catch block.

Create jsonmode.mjs:

import "dotenv/config";
import OpenAI from "openai";

const apiKey = process.env.DEEPSEEK_API_KEY;
const baseURL = process.env.DEEPSEEK_BASE_URL;

if (!apiKey || apiKey.trim() === "") {
  console.error("Error: DEEPSEEK_API_KEY is not set or is empty in your .env file.");
  process.exit(1);
}

if (!baseURL || baseURL.trim() === "") {
  console.error("Error: DEEPSEEK_BASE_URL is not set or is empty in your .env file.");
  process.exit(1);
}

const client = new OpenAI({
  baseURL,
  apiKey,
  timeout: 60_000,
  maxRetries: 0,
});

const response = await client.chat.completions.create({
  model: "deepseek-chat",
  response_format: { type: "json_object" },
  messages: [
    {
      role: "system",
      content: "You are an API that returns JSON. Always respond with a valid JSON object.",
    },
    {
      role: "user",
      content: "List three common JavaScript array methods with their descriptions and return types.",
    },
  ],
});

const choice = response.choices?.[0];
if (!choice || choice.finish_reason === "content_filter") {
  console.error(
    "No valid completion returned. finish_reason:",
    choice?.finish_reason ?? "no choices"
  );
  process.exit(1);
}

const rawContent = choice.message?.content;
if (typeof rawContent !== "string") {
  console.error("Unexpected response shape: missing message content.");
  process.exit(1);
}

let parsed;
try {
  parsed = JSON.parse(rawContent);
} catch (e) {
  console.error("Failed to parse model response as JSON:", e.message);
  console.error("Raw response:", rawContent);
  process.exit(1);
}

const isValidObject = parsed !== null && typeof parsed === "object" && !Array.isArray(parsed);
console.log(isValidObject ? "Valid JSON object received" : "Unexpected format");
console.log(JSON.stringify(parsed, null, 2));
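
The script above exits the process on a bad response, which suits a CLI demo but not a server or library. As a sketch of an alternative, the hypothetical parseJsonObject helper below performs the same checks and returns a result object instead of exiting, leaving the caller to decide whether to retry or report the error.

```javascript
// Parse a model reply that is expected to be a JSON object.
// Returns { ok: true, value } or { ok: false, error } and never throws.
function parseJsonObject(raw) {
  if (typeof raw !== "string" || raw.trim() === "") {
    return { ok: false, error: "empty or non-string content" };
  }
  let value;
  try {
    value = JSON.parse(raw);
  } catch (e) {
    return { ok: false, error: `invalid JSON: ${e.message}` };
  }
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    return { ok: false, error: "top-level value is not a JSON object" };
  }
  return { ok: true, value };
}

// Hand-written inputs standing in for choice.message.content.
const goodReply = parseJsonObject('{"methods": ["map", "filter", "reduce"]}');
const arrayReply = parseJsonObject('["map", "filter"]');
const brokenReply = parseJsonObject('{"methods": ["map", ');
console.log(goodReply.ok, arrayReply.ok, brokenReply.ok);
```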

Streaming Responses

Streaming reduces perceived latency by delivering tokens as they are generated. This matters most in user-facing applications, where time-to-first-token (TTFT), the time between sending a request and receiving the first token of the response, shapes the experience more than total generation time.

Create streaming.mjs:

import "dotenv/config";
import OpenAI from "openai";

const apiKey = process.env.DEEPSEEK_API_KEY;
const baseURL = process.env.DEEPSEEK_BASE_URL;

if (!apiKey || apiKey.trim() === "") {
  console.error("Error: DEEPSEEK_API_KEY is not set or is empty in your .env file.");
  process.exit(1);
}

if (!baseURL || baseURL.trim() === "") {
  console.error("Error: DEEPSEEK_BASE_URL is not set or is empty in your .env file.");
  process.exit(1);
}

const client = new OpenAI({
  baseURL,
  apiKey,
  timeout: 60_000,
  maxRetries: 0,
});

const stream = await client.chat.completions.create({
  model: "deepseek-chat",
  messages: [
    { role: "user", content: "Write a brief explanation of event-driven architecture." },
  ],
  stream: true,
});

let fullResponse = "";
for await (const chunk of stream) {
  const content = chunk.choices?.[0]?.delta?.content ?? "";
  process.stdout.write(content);
  fullResponse += content;
}

console.log("\nFull response length:", fullResponse.length, "characters");

Each chunk contains a delta object with incremental content. The loop assembles the complete response while simultaneously writing to stdout.
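
To make the chunk structure concrete without calling the API, the sketch below folds a hand-written array of chunks into the final text. The mock objects follow the OpenAI streaming shape (choices[0].delta.content plus a finish_reason on the last chunk) and are assumptions for illustration, not captured DeepSeek output; assembleChunks is a hypothetical helper name.

```javascript
// Fold a sequence of streamed chunks into the final text and finish_reason.
function assembleChunks(chunks) {
  let text = "";
  let finishReason = null;
  for (const chunk of chunks) {
    const choice = chunk.choices?.[0];
    if (!choice) continue; // tolerate chunks that carry no choices
    text += choice.delta?.content ?? "";
    if (choice.finish_reason) finishReason = choice.finish_reason;
  }
  return { text, finishReason };
}

// Hand-written chunks standing in for a real stream.
const mockChunks = [
  { choices: [{ delta: { role: "assistant", content: "" }, finish_reason: null }] },
  { choices: [{ delta: { content: "Event-driven " }, finish_reason: null }] },
  { choices: [{ delta: { content: "systems react to events." }, finish_reason: null }] },
  { choices: [{ delta: {}, finish_reason: "stop" }] },
];
const assembled = assembleChunks(mockChunks);
console.log(assembled.text);
console.log(assembled.finishReason);
```

Tracking finish_reason while streaming lets the caller detect a response cut off by the max_tokens cap ("length") even though no single response object is returned.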

Building a Complete Application: AI-Powered Code Reviewer

Application Architecture

This CLI tool reads a JavaScript file from disk, sends its contents to DeepSeek with a detailed code review system prompt, and requests structured JSON feedback. The application exercises DeepSeek V3's code understanding, reasoning, and structured output capabilities in a single cohesive workflow.
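
The request-building step of such a tool might look like the sketch below. The system prompt wording, the JSON field names (summary, issues, line, severity, message), and the buildReviewRequest helper are all assumptions made for illustration. They are not a schema defined by DeepSeek.

```javascript
// Hypothetical review prompt; the JSON shape it asks for is illustrative only.
const REVIEW_SYSTEM_PROMPT = [
  "You are a meticulous JavaScript code reviewer.",
  "Respond with a JSON object of the form:",
  '{ "summary": string, "issues": [{ "line": number, "severity": "low" | "medium" | "high", "message": string }] }',
].join("\n");

// Build the chat completion request body for one source file.
// Pure function: it performs no disk or network access.
function buildReviewRequest(fileName, source, model = "deepseek-chat") {
  if (typeof source !== "string" || source.trim() === "") {
    throw new Error(`Nothing to review in ${fileName}`);
  }
  return {
    model,
    response_format: { type: "json_object" },
    messages: [
      { role: "system", content: REVIEW_SYSTEM_PROMPT },
      { role: "user", content: `Review the file ${fileName}:\n\n${source}` },
    ],
  };
}

const reviewRequest = buildReviewRequest("example.js", "const x = 1;\nconsole.log(x)\n");
console.log(reviewRequest.messages[1].content);
```

In the CLI, the source string would come from reading the target file from disk, and the returned object would be passed to client.chat.completions.create. Keeping the builder free of I/O makes it easy to test without an API key.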
