---
title: "Kling AI Multi-Elements API Reference"
aliases: [kling-multi-elements, kling-video-element-editing, kling-add-swap-remove]
tags: [kling, api, video-editing, multi-elements, ai-video]
sources: ["raw/Kling AI Next-Gen AI Video & AI Image Generator 5.md"]
created: 2026-04-26
updated: 2026-04-26
---

## Overview

The Multi-Elements API lets you **add**, **swap**, or **remove** specific objects within an existing video using AI segmentation. You mark elements by clicking coordinates, then submit an editing task.

Base URL: `https://api-singapore.klingai.com`

Supported model: `kling-v1-6` only.

---

## Workflow (5 Steps)

1. **Init**: `POST /v1/videos/multi-elements/init-selection` parses the video and returns a `session_id`
2. **Add selection**: `POST /v1/videos/multi-elements/add-selection` marks an object by clicking a point on a frame
3. **Preview**: `POST /v1/videos/multi-elements/preview-selection` shows the masked overlay before committing
4. **Create task**: `POST /v1/videos/multi-elements` takes an `edit_mode`, a prompt, and optional reference images
5. **Query**: `GET /v1/videos/multi-elements/{task_id}` is polled until the status is `succeed`

Optional cleanup: `delete-selection` (remove specific points) or `clear-selection` (wipe all). The step-by-step sections below count this cleanup as Step 3, so Preview, Create task, and Query appear there as Steps 4, 5, and 6.
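The endpoint list above can be captured in a small lookup helper. This is a minimal sketch: the base URL and paths are taken from the workflow list, while `WorkflowStep`, `Endpoint`, and `endpointFor` are hypothetical names introduced for illustration and are not part of the Kling API. Authentication is not covered by this note, so it is left out.

```typescript
// Sketch only: paths and base URL come from the workflow list above.
// WorkflowStep, Endpoint and endpointFor are hypothetical helper names.
const BASE_URL = 'https://api-singapore.klingai.com'
const ROOT = '/v1/videos/multi-elements'

type WorkflowStep =
  | 'init-selection'
  | 'add-selection'
  | 'delete-selection'
  | 'clear-selection'
  | 'preview-selection'
  | 'create-task'
  | 'query-task'

interface Endpoint {
  method: 'GET' | 'POST'
  url: string
}

function endpointFor(step: WorkflowStep, taskId?: string): Endpoint {
  switch (step) {
    case 'create-task':
      // Task creation posts to the root path itself.
      return { method: 'POST', url: `${BASE_URL}${ROOT}` }
    case 'query-task':
      if (!taskId) throw new Error('query-task needs a task_id')
      return { method: 'GET', url: `${BASE_URL}${ROOT}/${encodeURIComponent(taskId)}` }
    default:
      // Every selection endpoint follows the pattern POST <root>/<step-name>.
      return { method: 'POST', url: `${BASE_URL}${ROOT}/${step}` }
  }
}

// The five core steps in call order; the cleanup endpoints are optional.
const CORE_ORDER: WorkflowStep[] = [
  'init-selection',
  'add-selection',
  'preview-selection',
  'create-task',
  'query-task',
]
```

Keeping the paths in one place means a later change to the base URL or root path touches a single constant rather than every call site.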
---

## Step 1: Init Selection

`POST /v1/videos/multi-elements/init-selection`

| Field | Type | Notes |
|-------|------|-------|
| `video_id` | string (optional) | Kling-generated video, last 30 days only |
| `video_url` | string (optional) | Public `.mp4` / `.mov` URL |

**Video constraints:**

- Duration: 2–5 s or 7–10 s
- Resolution: 720–2160 px (both dimensions)
- Frame rate: 24, 30, or 60 fps

**Response fields:**

- `session_id`: valid for **24 hours**, used in all subsequent calls
- `fps`, `original_duration`, `total_frame`: required when creating the task
- `normalized_video`: URL of the processed video

---

## Step 2: Add Selection

`POST /v1/videos/multi-elements/add-selection`

| Field | Type | Notes |
|-------|------|-------|
| `session_id` | string | Required |
| `frame_index` | int | Which frame to mark (max 10 frames total) |
| `points` | array | `{x, y}` in `[0,1]` range (top-left = 0,0); up to 10 points per frame |

**Response:** returns `rle_mask_list`, which holds the RLE-encoded segmentation masks plus a base64 PNG per object.

### Decoding the RLE Mask (TypeScript)

```typescript
export type RLEObject = { size: [h: number, w: number]; counts: string }

export function decode(rleObj: RLEObject): Uint8Array {
  // Returns a flat Uint8Array (row-major): 1 = masked pixel, 0 = background.
  // The body is elided in this note; see the full implementation in the source article.
  throw new Error('decode() body elided; see the source article')
}
```

### Rendering the Mask Overlay (Canvas)

```typescript
function drawMask(rleMask: string, height: number, width: number) {
  const decodeData = decode({ counts: rleMask, size: [height, width] })
  // Paint pixels with RGBA (116, 255, 82, 163) where decodeData[y * width + x] === 1
}
```

---

## Step 3: Delete / Clear Selection (optional)

`POST /v1/videos/multi-elements/delete-selection`

- Same body as add-selection; `points` must **exactly match** the coordinates used when adding.

`POST /v1/videos/multi-elements/clear-selection`

- Only requires `session_id`; wipes all marked areas.
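Because `delete-selection` needs the exact coordinates that were sent to `add-selection`, one approach is to normalize each click once, store the result, and reuse the stored value for deletion instead of recomputing it. The sketch below illustrates that idea. `normalizeClick` and `createSelectionTracker` are hypothetical helpers, not part of the Kling API, and the sketch assumes each request body carries only the point being added or removed, which the source does not confirm.

```typescript
// Sketch only: normalizeClick and createSelectionTracker are hypothetical names.
// Limits (10 frames, 10 points per frame, [0,1] coordinates) come from the Step 2 table.
// Assumption: each request body carries only the single point being added or removed.
interface Point { x: number; y: number }
interface SelectionBody { session_id: string; frame_index: number; points: Point[] }

const MAX_FRAMES = 10
const MAX_POINTS_PER_FRAME = 10

// Convert a pixel click to normalized coordinates (top-left = 0,0), clamped to [0,1].
function normalizeClick(px: number, py: number, width: number, height: number): Point {
  if (width <= 0 || height <= 0) throw new RangeError('frame size must be positive')
  const clamp = (v: number) => Math.min(1, Math.max(0, v))
  return { x: clamp(px / width), y: clamp(py / height) }
}

function createSelectionTracker(sessionId: string) {
  const frames = new Map<number, Point[]>()

  return {
    // Record a point and return a body for add-selection.
    add(frameIndex: number, point: Point): SelectionBody {
      const existing = frames.get(frameIndex)
      if (!existing && frames.size >= MAX_FRAMES) {
        throw new RangeError(`at most ${MAX_FRAMES} frames can be marked`)
      }
      if (existing && existing.length >= MAX_POINTS_PER_FRAME) {
        throw new RangeError(`at most ${MAX_POINTS_PER_FRAME} points per frame`)
      }
      const points = existing ?? []
      points.push({ ...point })
      frames.set(frameIndex, points)
      return { session_id: sessionId, frame_index: frameIndex, points: [{ ...point }] }
    },

    // Return a body for delete-selection that reuses the stored values verbatim.
    remove(frameIndex: number, pointIndex: number): SelectionBody {
      const points = frames.get(frameIndex)
      if (!points || pointIndex < 0 || pointIndex >= points.length) {
        throw new RangeError('no such stored point')
      }
      const removed = points.splice(pointIndex, 1)
      if (points.length === 0) frames.delete(frameIndex)
      return { session_id: sessionId, frame_index: frameIndex, points: removed }
    },
  }
}
```

A caller would pass the body returned by `add` to `add-selection` and the body returned by `remove` to `delete-selection`; since both are built from the same stored numbers, floating-point drift from re-normalizing a click cannot cause a mismatch.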
---

## Step 4: Preview Selection (optional)

`POST /v1/videos/multi-elements/preview-selection`

Returns: `video` (masked overlay), `video_cover`, `tracking_output` (per-frame mask).

---

## Step 5: Create Task

`POST /v1/videos/multi-elements`

| Field | Type | Notes |
|-------|------|-------|
| `model_name` | enum | `kling-v1-6` |
| `session_id` | string | Required |
| `edit_mode` | enum | `addition` / `swap` / `removal` |
| `image_list` | array | Required for add/swap; omit for removal |
| `prompt` | string | Use `<<>>` / `<<>>` references; max 2500 chars |
| `negative_prompt` | string | Optional, max 2500 chars |
| `mode` | enum | `std` (cost-effective) / `pro` (high-quality) |
| `duration` | enum | `5` or `10` seconds |
| `callback_url` | string | Optional webhook |
| `external_task_id` | string | Optional custom ID |

### edit_mode Details

| Mode | image_list | Prompt template |
|------|-----------|-----------------|
| `addition` | 1–2 images (pre-cropped) | `Using the context of <<>>, seamlessly add [x] from <<>>` |
| `swap` | 1 image only | `swap [x] from <<>> for [x] from <<>>` |
| `removal` | not required | `Delete [x] from <<>>` |

**Image requirements (for add/swap):**

- Formats: `.jpg` / `.jpeg` / `.png`
- Max 10 MB; min 300 px; aspect ratio 1:2.5–2.5:1
- Base64: raw string only, with **no** `data:image/png;base64,` prefix

---

## Step 6: Query Task

- Single task: `GET /v1/videos/multi-elements/{task_id}`
- Task list (paginated): `GET /v1/videos/multi-elements?pageNum=1&pageSize=30`

Task statuses: `submitted` → `processing` → `succeed` / `failed`

Result video URL expires after **30 days**, so download and store the result promptly.

---

## Key Takeaways

- Session-based workflow: init once → mark objects → edit → query. Session lives 24 h.
- Three edit modes: **add** (needs 1–2 ref images), **swap** (1 ref image), **remove** (no images needed).
- Object selection uses normalized `[0,1]` click coordinates on specific frame indices (up to 10 frames, 10 points each).
- Response masks are RLE-encoded; decode to `Uint8Array` for canvas rendering.
- Only `kling-v1-6` supports multi-elements; `duration` must be `5` or `10`, matching the source video's length bracket (2–5 s or 7–10 s).
- Generated videos auto-delete after 30 days; use `callback_url` for async workflows.
- Base64 images must be raw (no data-URI prefix).

---

## Related

- [[wiki/web-agency/kling-text-to-video-api|Kling Text-to-Video API]]: generate new videos from prompts
- [[wiki/web-agency/kling-image-to-video-api|Kling Image-to-Video API]]: animate still images
- [[wiki/web-agency/kling-multi-image-to-video-api|Kling Multi-Image-to-Video API]]: composite 2–4 reference images
- [[wiki/web-agency/kling-motion-control-api|Kling Motion Control API]]: pose/motion transfer
- [[wiki/web-agency/claude-code-nanobanana-website-workflow|Claude Code + Nanobanana 2 Workflow]]: end-to-end agency workflow using Kling

---

## Sources

- Raw: `raw/Kling AI Next-Gen AI Video & AI Image Generator 5.md`
- Origin: Kling AI API docs, `/v1/videos/multi-elements`