obsidian/wiki/web-agency/kling-multi-elements-api.md
2026-04-26 21:17:25 +01:00

title: Kling AI Multi-Elements API Reference
aliases: kling-multi-elements, kling-video-element-editing, kling-add-swap-remove
tags: kling, api, video-editing, multi-elements, ai-video
sources: raw/Kling AI Next-Gen AI Video & AI Image Generator 5.md
created: 2026-04-26
updated: 2026-04-26

Overview

The Multi-Elements API lets you add, swap, or remove specific objects within an existing video using AI segmentation. You mark elements by clicking coordinates, then submit an editing task.

Base URL: https://api-singapore.klingai.com

Supported model: kling-v1-6 only.


Workflow (5 Steps)

  1. Init: POST /v1/videos/multi-elements/init-selection (parse the video, get a session_id)
  2. Add selection: POST /v1/videos/multi-elements/add-selection (click a point on a frame to mark an object)
  3. Preview: POST /v1/videos/multi-elements/preview-selection (see the masked overlay before committing)
  4. Create task: POST /v1/videos/multi-elements (choose edit_mode + prompt + optional reference images)
  5. Query: GET /v1/videos/multi-elements/{task_id} (poll until the status is succeed)

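Optional cleanup calls: delete-selection (remove specific points) or clear-selection (wipe all). The sections below count these as Step 3, so Preview, Create Task, and Query appear there as Steps 4 to 6.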
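
The endpoints above share one base URL and one path root, so they can be described as plain request descriptors before any HTTP client is involved. The following is a minimal sketch, not the official client: the names `planPost`, `planTaskQuery`, and `PlannedRequest` are hypothetical, and authentication is deliberately left out because this page does not describe it.

```typescript
// Base URL and paths are taken from this page; all type and function names are illustrative.
const KLING_BASE_URL = "https://api-singapore.klingai.com";
const MULTI_ELEMENTS_ROOT = "/v1/videos/multi-elements";

type MultiElementsCall =
  | "init-selection"
  | "add-selection"
  | "delete-selection"
  | "clear-selection"
  | "preview-selection"
  | "create-task";

interface PlannedRequest {
  method: "GET" | "POST";
  url: string;
  body?: Record<string, unknown>;
}

// Build a POST descriptor. "create-task" posts to the root path; the others append their name.
function planPost(call: MultiElementsCall, body: Record<string, unknown>): PlannedRequest {
  const suffix = call === "create-task" ? "" : `/${call}`;
  return { method: "POST", url: `${KLING_BASE_URL}${MULTI_ELEMENTS_ROOT}${suffix}`, body };
}

// Build the GET descriptor used to query a single task by id.
function planTaskQuery(taskId: string): PlannedRequest {
  return {
    method: "GET",
    url: `${KLING_BASE_URL}${MULTI_ELEMENTS_ROOT}/${encodeURIComponent(taskId)}`,
  };
}
```

A descriptor like `planPost("init-selection", { video_url })` can then be handed to whatever HTTP layer the project already uses.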


Step 1 — Init Selection

POST /v1/videos/multi-elements/init-selection

| Field | Type | Notes |
|---|---|---|
| video_id | string (optional) | Kling-generated video, last 30 days only |
| video_url | string (optional) | Public .mp4 / .mov URL |

Video constraints:

  • Duration: 2–5 s or 7–10 s
  • Resolution: 720–2160 px (both dimensions)
  • Frame rate: 24, 30, or 60 fps
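
Checking these limits locally before calling init-selection avoids a wasted round trip. The sketch below assumes the caller already knows the video's duration, dimensions, and frame rate (for example from a media probe); `SourceVideoInfo` and `checkSourceVideo` are hypothetical names, and the duration brackets are read as 2 to 5 s and 7 to 10 s inclusive.

```typescript
interface SourceVideoInfo {
  durationSec: number;
  width: number;
  height: number;
  fps: number;
}

// Returns human-readable violations; an empty list means the video fits the stated limits.
function checkSourceVideo(v: SourceVideoInfo): string[] {
  const problems: string[] = [];

  // Duration must fall in one of two brackets (assumed inclusive at both ends).
  const inShortBracket = v.durationSec >= 2 && v.durationSec <= 5;
  const inLongBracket = v.durationSec >= 7 && v.durationSec <= 10;
  if (!inShortBracket && !inLongBracket) {
    problems.push(`duration ${v.durationSec}s is outside 2-5s and 7-10s`);
  }

  // Both dimensions must be within 720 to 2160 px.
  const dims: Array<[string, number]> = [["width", v.width], ["height", v.height]];
  for (const [label, px] of dims) {
    if (px < 720 || px > 2160) {
      problems.push(`${label} ${px}px is outside 720-2160px`);
    }
  }

  // Only three frame rates are listed as supported.
  if ([24, 30, 60].indexOf(v.fps) === -1) {
    problems.push(`frame rate ${v.fps} is not 24, 30, or 60`);
  }
  return problems;
}
```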

Response fields:

  • session_id — valid for 24 hours, used in all subsequent calls
  • fps, original_duration, total_frame — required when creating the task
  • normalized_video — URL of the processed video
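
Because every later call depends on the session and it expires after 24 hours, it can help to store the init result together with a local timestamp. This is a rough sketch: the field names come from the list above, but their TypeScript types are assumptions, and `trackSession` and `sessionIsUsable` are hypothetical helpers.

```typescript
// Field names are from this page; the types are assumed, not confirmed by the source.
interface InitSelectionData {
  session_id: string;
  fps: number;
  original_duration: number;
  total_frame: number;
  normalized_video: string;
}

interface TrackedSession {
  data: InitSelectionData;
  createdAtMs: number;
}

const SESSION_TTL_MS = 24 * 60 * 60 * 1000; // session validity stated on this page

// Remember when the session was created, using the caller's clock.
function trackSession(data: InitSelectionData, nowMs: number): TrackedSession {
  return { data, createdAtMs: nowMs };
}

// True while the session is younger than 24 hours by the local clock.
function sessionIsUsable(session: TrackedSession, nowMs: number): boolean {
  return nowMs - session.createdAtMs < SESSION_TTL_MS;
}
```

The local clock only approximates the server's expiry, so a real client would still handle an expired-session error from the API.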

Step 2 — Add Selection

POST /v1/videos/multi-elements/add-selection

| Field | Type | Notes |
|---|---|---|
| session_id | string | Required |
| frame_index | int | Which frame to mark (max 10 frames total) |
| points | array | {x, y} in [0,1] range (top-left = 0,0); up to 10 points per frame |
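
Clicks usually arrive in pixel coordinates, so they need to be divided by the frame size before being sent. The sketch below shows one way to do that and to assemble the request body; `toNormalizedPoint` and `buildAddSelectionBody` are hypothetical names, and treating frame_index as a non-negative integer is an assumption.

```typescript
interface NormalizedPoint {
  x: number;
  y: number;
}

const MAX_POINTS_PER_FRAME = 10;

// Convert a pixel click to the [0,1] range, with (0,0) at the top-left corner.
function toNormalizedPoint(pxX: number, pxY: number, frameWidth: number, frameHeight: number): NormalizedPoint {
  const clamp01 = (n: number): number => Math.min(1, Math.max(0, n));
  return { x: clamp01(pxX / frameWidth), y: clamp01(pxY / frameHeight) };
}

// Assemble the add-selection body, rejecting inputs that break the documented limits.
function buildAddSelectionBody(sessionId: string, frameIndex: number, points: NormalizedPoint[]) {
  // Assumption: frame indices are whole numbers starting at or above zero.
  if (Math.floor(frameIndex) !== frameIndex || frameIndex < 0) {
    throw new RangeError("frame_index must be a non-negative integer");
  }
  if (points.length < 1 || points.length > MAX_POINTS_PER_FRAME) {
    throw new RangeError(`expected 1 to ${MAX_POINTS_PER_FRAME} points per frame`);
  }
  for (const p of points) {
    if (p.x < 0 || p.x > 1 || p.y < 0 || p.y > 1) {
      throw new RangeError("point coordinates must lie in [0,1]");
    }
  }
  return { session_id: sessionId, frame_index: frameIndex, points };
}
```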

Response: returns rle_mask_list — RLE-encoded segmentation masks + PNG base64 per object.

Decoding the RLE Mask (TypeScript)

export type RLEObject = { size: [h: number, w: number]; counts: string }

export function decode(rleObj: RLEObject): Uint8Array {
  // Returns flat Uint8Array (row-major): 1 = masked pixel, 0 = background
  // ... see full implementation in source article
}

Rendering the Mask Overlay (Canvas)

function drawMask(rleMask: string, height: number, width: number) {
  const decodeData = decode({ counts: rleMask, size: [height, width] })
  // Paint pixels with RGBA (116, 255, 82, 163) where decodeData[y*w+x] === 1
}
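
The painting step in the comment above can be separated from the canvas so that it is easy to test. The sketch below assumes the mask has already been decoded into the row-major 0/1 array described earlier; `maskToRgba` is a hypothetical helper, not part of the source article.

```typescript
// Overlay colour from the comment above: RGBA (116, 255, 82, 163).
const OVERLAY_RGBA: [number, number, number, number] = [116, 255, 82, 163];

// Expand a row-major 0/1 mask into an RGBA pixel buffer.
// Background pixels stay at (0, 0, 0, 0), i.e. fully transparent.
function maskToRgba(mask: Uint8Array, width: number, height: number): Uint8ClampedArray {
  if (mask.length !== width * height) {
    throw new RangeError("mask length does not match width * height");
  }
  const rgba = new Uint8ClampedArray(width * height * 4); // zero-filled by default
  for (let i = 0; i < mask.length; i++) {
    if (mask[i] === 1) {
      rgba.set(OVERLAY_RGBA, i * 4); // four bytes per pixel
    }
  }
  return rgba;
}
```

In a browser, the returned buffer can be wrapped in `new ImageData(rgba, width, height)` and drawn with `putImageData`. Since `putImageData` overwrites pixels rather than blending them, one option is to draw onto a separate canvas layered over the video.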

Step 3 — Delete / Clear Selection (optional)

POST /v1/videos/multi-elements/delete-selection

  • Same body as add-selection; points must exactly match coordinates used when adding.

POST /v1/videos/multi-elements/clear-selection

  • Only requires session_id — wipes all marked areas.
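
Since delete-selection only works when the points exactly match what was sent earlier, it is safer to replay stored values than to recompute them from pixel clicks. The following is a small sketch of that idea; `SelectionLedger` is a hypothetical class, and it also enforces the 10-frame limit mentioned under Add Selection.

```typescript
interface MarkedPoint {
  x: number;
  y: number;
}

const MAX_MARKED_FRAMES = 10;

// Remembers the exact point values sent with add-selection, keyed by frame index.
class SelectionLedger {
  private frames: Record<number, MarkedPoint[]> = {};

  // Call after a successful add-selection request.
  recordAdd(frameIndex: number, points: MarkedPoint[]): void {
    const known = this.frames[frameIndex];
    if (!known && Object.keys(this.frames).length >= MAX_MARKED_FRAMES) {
      throw new RangeError(`at most ${MAX_MARKED_FRAMES} frames can be marked`);
    }
    const copies = points.map((p) => ({ x: p.x, y: p.y }));
    this.frames[frameIndex] = (known || []).concat(copies);
  }

  // Body for delete-selection that reuses the stored values verbatim.
  deleteBody(sessionId: string, frameIndex: number) {
    const points = this.frames[frameIndex] || [];
    return { session_id: sessionId, frame_index: frameIndex, points };
  }

  // Call after a successful clear-selection request.
  recordClear(): void {
    this.frames = {};
  }
}
```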

Step 4 — Preview Selection (optional)

POST /v1/videos/multi-elements/preview-selection

Returns: video (masked overlay), video_cover, tracking_output (per-frame mask).


Step 5 — Create Task

POST /v1/videos/multi-elements

| Field | Type | Notes |
|---|---|---|
| model_name | enum | kling-v1-6 |
| session_id | string | Required |
| edit_mode | enum | addition / swap / removal |
| image_list | array | Required for addition/swap; omit for removal |
| prompt | string | Use <<<video_1>>> / <<<image_1>>> references; max 2500 chars |
| negative_prompt | string | Optional, max 2500 chars |
| mode | enum | std (cost-effective) / pro (high-quality) |
| duration | enum | 5 or 10 seconds |
| callback_url | string | Optional webhook |
| external_task_id | string | Optional custom ID |

edit_mode Details

| Mode | image_list | Prompt template |
|---|---|---|
| addition | 1–2 images (pre-cropped) | Using the context of <<<video_1>>>, seamlessly add [x] from <<<image_1>>> |
| swap | 1 image only | swap [x] from <<<image_1>>> for [x] from <<<video_1>>> |
| removal | not required | Delete [x] from <<<video_1>>> |
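
The three templates differ only in wording, so they can be filled in by a small helper that also enforces the 2500-character prompt limit. This is a sketch under stated assumptions: `buildEditPrompt` is a hypothetical name, and for swap the first [x] is taken to be the object in the image and the second the object in the video.

```typescript
type EditMode = "addition" | "swap" | "removal";

const MAX_PROMPT_CHARS = 2500;

// Fill the prompt template for the given mode.
// `subject` is the object being added, removed, or replaced in the video;
// `replacement` (swap only) is the object taken from the image and defaults to `subject`.
function buildEditPrompt(mode: EditMode, subject: string, replacement?: string): string {
  let prompt: string;
  if (mode === "addition") {
    prompt = `Using the context of <<<video_1>>>, seamlessly add ${subject} from <<<image_1>>>`;
  } else if (mode === "swap") {
    prompt = `swap ${replacement ?? subject} from <<<image_1>>> for ${subject} from <<<video_1>>>`;
  } else {
    prompt = `Delete ${subject} from <<<video_1>>>`;
  }
  if (prompt.length > MAX_PROMPT_CHARS) {
    throw new RangeError(`prompt exceeds ${MAX_PROMPT_CHARS} characters`);
  }
  return prompt;
}
```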

Image requirements (for add/swap):

  • Formats: .jpg / .jpeg / .png
  • Max 10 MB; min 300 px; aspect ratio between 1:2.5 and 2.5:1
  • Base64: raw string only — no data:image/png;base64, prefix
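
These image rules are easy to get wrong, especially the data-URI prefix that browser APIs add to Base64 output. The helpers below are a sketch with hypothetical names; they assume "10 MB" means 10 × 1024 × 1024 bytes and that the 300 px minimum applies to both width and height, neither of which this page spells out.

```typescript
type TaskEditMode = "addition" | "swap" | "removal";

// Strip a leading data-URI header such as "data:image/png;base64," if one is present.
function toRawBase64(input: string): string {
  return input.replace(/^data:[^;,]+;base64,/, "");
}

// Check the number of reference images against the edit mode; null means the count is fine.
function checkImageCount(mode: TaskEditMode, count: number): string | null {
  if (mode === "addition" && (count < 1 || count > 2)) return "addition expects 1 or 2 images";
  if (mode === "swap" && count !== 1) return "swap expects exactly 1 image";
  if (mode === "removal" && count !== 0) return "removal expects no images";
  return null;
}

// Check size, minimum dimension, and aspect ratio; an empty list means no violations.
function checkImageFile(width: number, height: number, byteSize: number): string[] {
  const problems: string[] = [];
  if (byteSize > 10 * 1024 * 1024) problems.push("file is larger than 10 MB"); // assumed binary MB
  if (Math.min(width, height) < 300) problems.push("both dimensions must be at least 300 px"); // assumed
  // Aspect ratio must stay between 1:2.5 and 2.5:1 (compared without division).
  if (width * 2.5 < height || width > height * 2.5) problems.push("aspect ratio is outside 1:2.5 to 2.5:1");
  return problems;
}
```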

Step 6 — Query Task

GET /v1/videos/multi-elements/{task_id}
GET /v1/videos/multi-elements?pageNum=1&pageSize=30

Task statuses: submitted → processing → succeed / failed

Result video URL expires after 30 days — download and store promptly.
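
When polling instead of using callback_url, the loop only has to decide whether a status is terminal and how long to wait before the next query. The sketch below keeps that decision pure so that it works with any HTTP client; `pollDecision` is a hypothetical helper, and the backoff numbers (2 s doubling up to 30 s) are illustrative choices, not values from the Kling docs.

```typescript
type TaskStatus = "submitted" | "processing" | "succeed" | "failed";

interface PollDecision {
  done: boolean;   // true once the task reached a terminal status
  ok: boolean;     // true only for "succeed"
  waitMs: number;  // delay before the next query; 0 when done
}

const FIRST_WAIT_MS = 2000;  // illustrative starting delay
const MAX_WAIT_MS = 30000;   // illustrative cap

// Decide what to do after the poll number `attempt` (0-based) returned `status`.
function pollDecision(status: TaskStatus, attempt: number): PollDecision {
  if (status === "succeed") return { done: true, ok: true, waitMs: 0 };
  if (status === "failed") return { done: true, ok: false, waitMs: 0 };
  // Still submitted or processing: back off exponentially up to the cap.
  const waitMs = Math.min(MAX_WAIT_MS, FIRST_WAIT_MS * Math.pow(2, attempt));
  return { done: false, ok: false, waitMs };
}
```

A caller would query the task, pass the returned status to `pollDecision`, sleep for `waitMs` while `done` is false, and download the result video as soon as `ok` is true.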


Key Takeaways

  • Session-based workflow: init once → mark objects → edit → query. Session lives 24 h.
  • Three edit modes: add (needs 1–2 ref images), swap (1 ref image), remove (no images needed).
  • Object selection uses normalized [0,1] click coordinates on specific frame indices — up to 10 frames, 10 points each.
  • Response masks are RLE-encoded; decode to Uint8Array for canvas rendering.
  • Only kling-v1-6 supports multi-elements; duration must be 5 s or 10 s, matching the source video's length bracket (2–5 s or 7–10 s).
  • Generated videos auto-delete after 30 days; use callback_url for async workflows.
  • Base64 images must be raw (no data-URI prefix).


Sources

  • Raw: raw/Kling AI Next-Gen AI Video & AI Image Generator 5.md
  • Origin: Kling AI API docs — /v1/videos/multi-elements