---
title: "Kling AI Multi-Elements API Reference"
aliases: [kling-multi-elements, kling-video-element-editing, kling-add-swap-remove]
tags: [kling, api, video-editing, multi-elements, ai-video]
sources: ["raw/Kling AI Next-Gen AI Video & AI Image Generator 5.md"]
created: 2026-04-26
updated: 2026-04-26
---

## Overview

The Multi-Elements API lets you **add**, **swap**, or **remove** specific objects within an existing video using AI segmentation. You mark elements by clicking coordinates, then submit an editing task.

Base URL: `https://api-singapore.klingai.com`

Supported model: `kling-v1-6` only.

---

## Workflow (6 Steps)

1. **Init** (`POST /v1/videos/multi-elements/init-selection`): parse the video and get a `session_id`
2. **Add selection** (`POST /v1/videos/multi-elements/add-selection`): click points on a frame to mark an object
3. **Delete / clear selection**, optional (`POST /v1/videos/multi-elements/delete-selection` or `POST /v1/videos/multi-elements/clear-selection`): remove specific points or wipe all of them
4. **Preview**, optional (`POST /v1/videos/multi-elements/preview-selection`): see the masked overlay before committing
5. **Create task** (`POST /v1/videos/multi-elements`): choose `edit_mode`, a prompt, and optional reference images
6. **Query** (`GET /v1/videos/multi-elements/{task_id}`): poll until `succeed` or `failed`

---

## Step 1 — Init Selection

`POST /v1/videos/multi-elements/init-selection`

| Field | Type | Notes |
|-------|------|-------|
| `video_id` | string (optional) | Kling-generated video, last 30 days only |
| `video_url` | string (optional) | Public `.mp4` / `.mov` URL |

**Video constraints:**
- Duration: 2–5 s or 7–10 s
- Resolution: 720–2160 px (both dimensions)
- Frame rate: 24, 30, or 60 fps
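
A client-side pre-check can catch videos that would be rejected before a session is opened. The sketch below encodes the three constraints listed above; `SourceVideoInfo` and `checkVideoConstraints` are hypothetical names, and treating the range limits as inclusive is an assumption.

```typescript
// Hypothetical local description of a video; not an API type.
interface SourceVideoInfo {
  durationSec: number
  width: number
  height: number
  fps: number
}

// Returns human-readable problems; an empty array means all checks passed.
// Assumption: the documented range limits are inclusive.
function checkVideoConstraints(v: SourceVideoInfo): string[] {
  const problems: string[] = []
  const inShortBracket = v.durationSec >= 2 && v.durationSec <= 5
  const inLongBracket = v.durationSec >= 7 && v.durationSec <= 10
  if (!inShortBracket && !inLongBracket) {
    problems.push(`duration ${v.durationSec} s is outside 2-5 s and 7-10 s`)
  }
  const sides: Array<[string, number]> = [["width", v.width], ["height", v.height]]
  for (const [label, px] of sides) {
    if (px < 720 || px > 2160) problems.push(`${label} ${px} px is outside 720-2160 px`)
  }
  if ([24, 30, 60].indexOf(v.fps) === -1) {
    problems.push(`frame rate ${v.fps} fps is not 24, 30, or 60`)
  }
  return problems
}
```

For example, a 6-second clip at 1280x720 and 30 fps yields one problem (the duration falls in the gap between the two brackets).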

**Response fields:**
- `session_id` — valid for **24 hours**, used in all subsequent calls
- `fps`, `original_duration`, `total_frame` — required when creating the task
- `normalized_video` — URL of the processed video
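
The init call might be assembled as follows. This is a minimal sketch, not the official client: the `Authorization: Bearer` header, the either/or treatment of `video_id` and `video_url`, and the field types in `InitResult` are assumptions that this note does not confirm. `HttpRequest` and `buildInitRequest` are hypothetical names.

```typescript
const BASE_URL = "https://api-singapore.klingai.com"

// Plain description of a request, so it can be handed to any HTTP client.
interface HttpRequest {
  method: "GET" | "POST"
  url: string
  headers: Record<string, string>
  body?: string
}

// Assumption: exactly one of the two optional fields is supplied.
type InitSource = { video_id: string } | { video_url: string }

function buildInitRequest(source: InitSource, apiToken: string): HttpRequest {
  return {
    method: "POST",
    url: `${BASE_URL}/v1/videos/multi-elements/init-selection`,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiToken}`, // assumption: bearer-token auth
    },
    body: JSON.stringify(source),
  }
}

// Fields named in the response list above; the types are guesses.
interface InitResult {
  session_id: string
  fps: number
  original_duration: number
  total_frame: number
  normalized_video: string
}
```

The returned object maps directly onto `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Keeping the builder free of I/O makes it easy to unit-test.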

---

## Step 2 — Add Selection

`POST /v1/videos/multi-elements/add-selection`

| Field | Type | Notes |
|-------|------|-------|
| `session_id` | string | Required |
| `frame_index` | int | Which frame to mark (max 10 frames total) |
| `points` | array | `{x, y}` in `[0,1]` range (top-left = 0,0); up to 10 points per frame |
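
Clicks usually arrive in pixel coordinates, so they need normalizing before they go into `points`. The sketch below converts a click and builds the request body with the limits from the table; `ClickPoint`, `normalizeClick`, and `buildAddSelectionBody` are hypothetical helper names, and requiring `frame_index` to be a non-negative integer is an assumption.

```typescript
interface ClickPoint { x: number; y: number }

// Convert a pixel click to the normalized [0,1] range, (0,0) = top-left.
function normalizeClick(px: number, py: number, frameWidth: number, frameHeight: number): ClickPoint {
  const clamp = (v: number) => Math.min(1, Math.max(0, v))
  return { x: clamp(px / frameWidth), y: clamp(py / frameHeight) }
}

function buildAddSelectionBody(sessionId: string, frameIndex: number, points: ClickPoint[]) {
  // Assumption: frame_index is a non-negative integer.
  if (frameIndex < 0 || Math.floor(frameIndex) !== frameIndex) {
    throw new RangeError("frame_index must be a non-negative integer")
  }
  if (points.length < 1 || points.length > 10) {
    throw new RangeError("a frame takes between 1 and 10 points")
  }
  for (const p of points) {
    if (p.x < 0 || p.x > 1 || p.y < 0 || p.y > 1) {
      throw new RangeError(`point (${p.x}, ${p.y}) is outside [0,1]`)
    }
  }
  return { session_id: sessionId, frame_index: frameIndex, points }
}
```

A click at pixel (640, 180) on a 1280x720 frame becomes `{ x: 0.5, y: 0.25 }`. The 10-frame limit spans several calls, so it would have to be tracked by the caller.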

**Response:** returns `rle_mask_list` — RLE-encoded segmentation masks + PNG base64 per object.

### Decoding the RLE Mask (TypeScript)

```typescript
export type RLEObject = { size: [h: number, w: number]; counts: string }

export function decode(rleObj: RLEObject): Uint8Array {
  // Returns flat Uint8Array (row-major): 1 = masked pixel, 0 = background
  // ... see full implementation in source article
  throw new Error('decode: body omitted here; see the source article')
}
```

### Rendering the Mask Overlay (Canvas)

```typescript
function drawMask(rleMask: string, height: number, width: number) {
  const decodeData = decode({ counts: rleMask, size: [height, width] })
  // Paint pixels with RGBA (116, 255, 82, 163) where decodeData[y * width + x] === 1
}
```
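
The painting step that `drawMask` leaves as a comment can be sketched without a canvas. The helper below (a hypothetical name, `maskToRgba`) turns an already decoded, row-major 0/1 mask into an RGBA pixel buffer using the overlay color quoted above. It assumes the mask has exactly `width * height` entries.

```typescript
// Overlay color from the drawMask comment: RGBA (116, 255, 82, 163).
const MASK_RGBA: [number, number, number, number] = [116, 255, 82, 163]

// Convert a flat, row-major 0/1 mask into an RGBA buffer of width*height*4 bytes.
// Unmasked pixels stay fully transparent (all four channels 0).
function maskToRgba(mask: Uint8Array, width: number, height: number): Uint8ClampedArray {
  if (mask.length !== width * height) {
    throw new Error(`mask has ${mask.length} entries, expected ${width * height}`)
  }
  const rgba = new Uint8ClampedArray(width * height * 4)
  for (let i = 0; i < mask.length; i++) {
    if (mask[i] === 1) {
      const offset = i * 4
      rgba[offset] = MASK_RGBA[0]
      rgba[offset + 1] = MASK_RGBA[1]
      rgba[offset + 2] = MASK_RGBA[2]
      rgba[offset + 3] = MASK_RGBA[3]
    }
  }
  return rgba
}
```

In a browser, the buffer can be wrapped in an `ImageData` of the same dimensions and drawn with `putImageData` on a 2D canvas context layered over the video frame.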

---

## Step 3 — Delete / Clear Selection (optional)

`POST /v1/videos/multi-elements/delete-selection`
- Same body as add-selection; `points` must **exactly match** coordinates used when adding.

`POST /v1/videos/multi-elements/clear-selection`
- Only requires `session_id` — wipes all marked areas.
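
Because `delete-selection` needs the exact coordinates that were added, it can help to record every point as it is sent rather than recomputing it from pixels later. The class below is one possible bookkeeping sketch; `SelectionLedger` and its methods are hypothetical and not part of the API.

```typescript
interface ClickPoint { x: number; y: number }

class SelectionLedger {
  private byFrame: Record<number, ClickPoint[]> = {}

  // Remember points exactly as they were sent to add-selection.
  recordAdd(frameIndex: number, points: ClickPoint[]): void {
    const existing = this.byFrame[frameIndex] || []
    this.byFrame[frameIndex] = existing.concat(points.map(p => ({ x: p.x, y: p.y })))
  }

  // Build a delete-selection body that reuses the stored coordinates verbatim.
  buildDeleteBody(sessionId: string, frameIndex: number) {
    const points = this.byFrame[frameIndex] || []
    if (points.length === 0) throw new Error(`no recorded points for frame ${frameIndex}`)
    return { session_id: sessionId, frame_index: frameIndex, points }
  }

  // Call after the delete request succeeds.
  forget(frameIndex: number): void {
    delete this.byFrame[frameIndex]
  }
}
```

Storing the numbers instead of re-deriving them avoids mismatches when a click is normalized against a slightly different frame size the second time.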

---

## Step 4 — Preview Selection (optional)

`POST /v1/videos/multi-elements/preview-selection`

Returns: `video` (masked overlay), `video_cover`, `tracking_output` (per-frame mask).

---

## Step 5 — Create Task

`POST /v1/videos/multi-elements`

| Field | Type | Notes |
|-------|------|-------|
| `model_name` | enum | `kling-v1-6` |
| `session_id` | string | Required |
| `edit_mode` | enum | `addition` / `swap` / `removal` |
| `image_list` | array | Required for `addition` / `swap`; omit for `removal` |
| `prompt` | string | Use `<<<video_1>>>` / `<<<image_1>>>` references; max 2500 chars |
| `negative_prompt` | string | Optional, max 2500 chars |
| `mode` | enum | `std` (cost-effective) / `pro` (high-quality) |
| `duration` | enum | `5` or `10` seconds |
| `callback_url` | string | Optional webhook |
| `external_task_id` | string | Optional custom ID |

### edit_mode Details

| Mode | image_list | Prompt template |
|------|-----------|-----------------|
| `addition` | 1–2 images (pre-cropped) | `Using the context of <<<video_1>>>, seamlessly add [x] from <<<image_1>>>` |
| `swap` | 1 image only | `swap [x] from <<<image_1>>> for [x] from <<<video_1>>>` |
| `removal` | not required | `Delete [x] from <<<video_1>>>` |
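
The per-mode rules in the two tables above can be enforced before the request is sent. The following is a sketch under stated assumptions: the shape of each `image_list` entry (`{ image: string }`) and sending `duration` as a string are guesses, and all function names are hypothetical.

```typescript
type EditMode = "addition" | "swap" | "removal"

// Prompt templates from the table above; the arguments fill the [x] slots.
const additionPrompt = (x: string) =>
  `Using the context of <<<video_1>>>, seamlessly add ${x} from <<<image_1>>>`
const swapPrompt = (fromImage: string, fromVideo: string) =>
  `swap ${fromImage} from <<<image_1>>> for ${fromVideo} from <<<video_1>>>`
const removalPrompt = (x: string) => `Delete ${x} from <<<video_1>>>`

interface CreateTaskInput {
  sessionId: string
  editMode: EditMode
  prompt: string
  images?: string[]
  mode?: "std" | "pro"
  duration?: "5" | "10" // assumption: sent as a string enum
}

function buildCreateTaskBody(input: CreateTaskInput): Record<string, unknown> {
  const images = input.images || []
  const n = images.length
  if (input.editMode === "addition" && (n < 1 || n > 2)) throw new Error("addition needs 1 or 2 images")
  if (input.editMode === "swap" && n !== 1) throw new Error("swap needs exactly 1 image")
  if (input.editMode === "removal" && n !== 0) throw new Error("removal takes no images")
  if (input.prompt.length > 2500) throw new Error("prompt exceeds 2500 characters")

  const body: Record<string, unknown> = {
    model_name: "kling-v1-6",
    session_id: input.sessionId,
    edit_mode: input.editMode,
    prompt: input.prompt,
  }
  // Assumption: each image_list entry is an object with an `image` field.
  if (n > 0) body["image_list"] = images.map(image => ({ image }))
  if (input.mode) body["mode"] = input.mode
  if (input.duration) body["duration"] = input.duration
  return body
}
```

For a removal task, `buildCreateTaskBody({ sessionId, editMode: "removal", prompt: removalPrompt("the red car") })` produces a body with no `image_list` key at all.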

**Image requirements (for `addition` / `swap`):**
- Formats: `.jpg` / `.jpeg` / `.png`
- Max 10 MB; min 300 px; aspect ratio 1:2.5–2.5:1
- Base64: raw string only — **no** `data:image/png;base64,` prefix
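
Browser APIs such as `FileReader.readAsDataURL` return a data URI, so the prefix usually has to be stripped by hand. The sketch below does that and also checks the size and shape limits listed above. Reading "10 MB" as 10 x 1024 x 1024 bytes and applying the 300 px minimum to both sides are assumptions; `toRawBase64` and `checkReferenceImage` are hypothetical names.

```typescript
// Remove a leading data-URI header such as "data:image/png;base64," if present.
function toRawBase64(input: string): string {
  return input.replace(/^data:[^;,]+;base64,/, "")
}

// Returns human-readable problems; an empty array means all checks passed.
function checkReferenceImage(widthPx: number, heightPx: number, sizeBytes: number): string[] {
  const problems: string[] = []
  // Assumption: 10 MB means 10 * 1024 * 1024 bytes.
  if (sizeBytes > 10 * 1024 * 1024) problems.push("file is larger than 10 MB")
  // Assumption: the 300 px minimum applies to both width and height.
  if (Math.min(widthPx, heightPx) < 300) problems.push("shorter side is under 300 px")
  const ratio = widthPx / heightPx
  if (ratio < 1 / 2.5 || ratio > 2.5) problems.push("aspect ratio is outside 1:2.5 to 2.5:1")
  return problems
}
```

Strings that are already raw base64 pass through `toRawBase64` unchanged, so it is safe to apply unconditionally.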

---

## Step 6 — Query Task

`GET /v1/videos/multi-elements/{task_id}` (single task)  
`GET /v1/videos/multi-elements?pageNum=1&pageSize=30` (paginated task list)

Task statuses: `submitted` → `processing` → `succeed` / `failed`

Result video URL expires after **30 days** — download and store promptly.
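
When no `callback_url` is set, the status progression above suggests a simple polling loop. In this sketch the status fetcher and the sleep function are injected, so the loop itself makes no assumption about the HTTP client or response envelope. The 5-second interval and the attempt cap are arbitrary choices, not documented values.

```typescript
type TaskStatus = "submitted" | "processing" | "succeed" | "failed"

function isTerminal(status: TaskStatus): boolean {
  return status === "succeed" || status === "failed"
}

// getStatus: caller-supplied function that queries the task and returns its status.
// sleep: caller-supplied delay, e.g. ms => new Promise(r => setTimeout(r, ms)).
async function pollUntilDone(
  getStatus: () => Promise<TaskStatus>,
  sleep: (ms: number) => Promise<void>,
  intervalMs = 5000,
  maxAttempts = 120,
): Promise<TaskStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await getStatus()
    if (isTerminal(status)) return status
    await sleep(intervalMs)
  }
  throw new Error(`task still running after ${maxAttempts} polls`)
}
```

The caller still has to handle a `failed` result, since the loop returns on either terminal status.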

---

## Key Takeaways

- Session-based workflow: init once → mark objects → edit → query. Session lives 24 h.
- Three edit modes: `addition` (needs 1–2 reference images), `swap` (1 reference image), `removal` (no images needed).
- Object selection uses normalized `[0,1]` click coordinates on specific frame indices — up to 10 frames, 10 points each.
- Response masks are RLE-encoded; decode to `Uint8Array` for canvas rendering.
- Only `kling-v1-6` supports multi-elements; `duration` must be `5` or `10`, matching the length bracket of the source video (2 to 5 s or 7 to 10 s).
- Generated videos auto-delete after 30 days; use `callback_url` for async workflows.
- Base64 images must be raw (no data-URI prefix).

---

## Related

- [[wiki/web-agency/kling-text-to-video-api|Kling Text-to-Video API]] — generate new videos from prompts
- [[wiki/web-agency/kling-image-to-video-api|Kling Image-to-Video API]] — animate still images
- [[wiki/web-agency/kling-multi-image-to-video-api|Kling Multi-Image-to-Video API]] — composite 2–4 reference images
- [[wiki/web-agency/kling-motion-control-api|Kling Motion Control API]] — pose/motion transfer
- [[wiki/web-agency/claude-code-nanobanana-website-workflow|Claude Code + Nanobanana 2 Workflow]] — end-to-end agency workflow using Kling

---

## Sources

- Raw: `raw/Kling AI Next-Gen AI Video & AI Image Generator 5.md`
- Origin: Kling AI API docs — `/v1/videos/multi-elements`