# Generation tools & models

This page is the catalog of **generation tools** — the values you pass as the
`tool_name` argument to [`zencreator_create_task`](./tools/generation.md) — and the
**models** each generation tool offers.

Keep two layers distinct (see [Concepts](./concepts.md)):

- **MCP tools** are the `zencreator_*` tools your AI client sees. `zencreator_create_task`
  is one of them.
- **Generation tools** are the `tool_name` values you pass *into* `zencreator_create_task`
  — e.g. `by_prompt`, `image_editor`, `videogen`, `faceswap`, `upscaler`, `lipsync`.
  Each generation tool exposes its own set of **models** (the `model` field).

Plainly: `zencreator_create_task` is an MCP tool; `by_prompt` is a generation tool you
pass to it; `SDXL_NSFW` is a model you pass to `by_prompt`.
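
As a minimal sketch of how the three layers nest, the arguments for one call might be assembled as follows. `tool_name`, the `inputs` array, and the `model` field come from this page; placing `model` and `prompt` inside each input object is an assumption, and the helper name is hypothetical, so confirm the real shape with `zencreator_get_tool_schema`.

```python
import json

# Layer 1: the MCP tool your AI client calls.
MCP_TOOL = "zencreator_create_task"


def build_create_task_args(tool_name: str, model: str, prompt: str) -> dict:
    """Assemble arguments for zencreator_create_task (hypothetical helper).

    Assumption: `model` and `prompt` live inside each input object.
    """
    return {
        "tool_name": tool_name,  # Layer 2: the generation tool
        "inputs": [
            {"model": model, "prompt": prompt},  # Layer 3: the model
        ],
    }


args = build_create_task_args("by_prompt", "SDXL_NSFW", "portrait, soft window light")
print(MCP_TOOL)
print(json.dumps(args, indent=2))
```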

## How to use this catalog

1. Pick a generation tool below for the task (text→image, edit, video, faceswap, …).
2. Pick a model from that tool's list. Models marked **(trusted)** require a trusted
   account for NSFW use.
3. Call [`zencreator_get_tool_schema`](./tools/generation.md) for the exact input/output
   JSON Schema, a prompt-writing guide, and model-selection guidance before submitting.
4. For per-model prompt conventions (Seedream prose vs. Qwen layout-first vs. Wan
   structured blocks, text-in-image rules, NSFW phrasing), call
   `zencreator_get_model_prompt_guide`.

## NSFW and trusted gating

ZenCreator supports uncensored / NSFW generation. Two account gates apply:

- **`nsfw_allowed`** — adult content enabled on the account (toggle in ZenCreator account
  settings). If `false`, enable adult content and retry — do not silently fall back to SFW.
- **`is_trusted`** — whether the account has **Trusted Status**. It unlocks ZenCreator's extended
  capabilities (uncensored NSFW generation, 18+ templates and LoRAs, Face Swap tools, more flexible
  generation) and gates the **(trusted)** models below plus the trusted-only tools `male_undresser`,
  `flux_klein_lora`, and `text_to_video`. Trusted status is granted **automatically after the
  account's first successful payment** (buying any credit pack in Billing) and is permanent; until
  then, submitting a trusted-only task fails and wastes credits.

Check both proactively with [`zencreator_get_me`](./tools/account.md) before an NSFW
workflow. See [Workflows](./workflows.md) for the full NSFW preflight.
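
A minimal sketch of that proactive check, assuming the `zencreator_get_me` result has already been parsed into a dict that exposes `nsfw_allowed` and `is_trusted` as top-level booleans (the exact response shape is an assumption). The helper name `nsfw_preflight` is hypothetical; the trusted-only tool names are the ones listed above.

```python
# Trusted-only generation tools named on this page.
TRUSTED_ONLY_TOOLS = {"male_undresser", "flux_klein_lora", "text_to_video"}


def nsfw_preflight(me: dict, tool_name: str, model_is_trusted: bool = False) -> list:
    """Return a list of blocking problems; an empty list means it is safe to submit.

    `me` is assumed to be the parsed zencreator_get_me result.
    `model_is_trusted` should be True for models marked (trusted) in the catalog.
    """
    problems = []
    if not me.get("nsfw_allowed", False):
        # Do not silently fall back to SFW: surface the problem instead.
        problems.append("Adult content is disabled: enable it in account settings, then retry.")
    needs_trust = tool_name in TRUSTED_ONLY_TOOLS or model_is_trusted
    if needs_trust and not me.get("is_trusted", False):
        problems.append("Trusted Status required: it is granted after the first successful payment.")
    return problems


account = {"nsfw_allowed": True, "is_trusted": False}
print(nsfw_preflight(account, "by_prompt", model_is_trusted=True))
```

Running the check before submission avoids the failed, credit-wasting task described above.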

> The optional `zencreator_craft_prompt` sidecar (present on deployments that configure it)
> can author NSFW / model-specific prompts when the orchestrator itself declines to — it
> spends no ZenCreator credits.

## Choosing & comparing

Credit cost varies by generation tool, model, resolution, and duration — and it is
backend-driven, so it drifts. **Never hardcode credit numbers.**

- [`zencreator_compare_prices`](./tools/generation.md) — shop across all models of a
  generation tool, cheapest-first (sweeps resolutions when relevant). Use this whenever
  the user wants the cheapest option.
- [`zencreator_estimate_price`](./tools/generation.md) — get the exact credit cost of one
  candidate input. Call this and state the cost to the user before submitting.
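
The estimate-then-state step could look like the sketch below. Everything about the wire format is an assumption: `call_tool` stands in for whatever MCP client function you use, the argument layout mirrors `zencreator_create_task`, and the `credits` response field is hypothetical. The stub returns a placeholder number, not a real price.

```python
from typing import Callable


def quote_before_submit(call_tool: Callable[[str, dict], dict],
                        tool_name: str, candidate_input: dict) -> str:
    """Fetch a live estimate and phrase it for the user; the number is never cached."""
    result = call_tool(
        "zencreator_estimate_price",
        {"tool_name": tool_name, "inputs": [candidate_input]},  # assumed argument shape
    )
    credits = result["credits"]  # assumed response field
    return f"This {tool_name} task will cost {credits} credits. Proceed?"


def fake_call_tool(name: str, arguments: dict) -> dict:
    """Stand-in for a real MCP client; returns a placeholder value for the demo."""
    return {"credits": 4}


message = quote_before_submit(fake_call_tool, "by_prompt", {"prompt": "a lighthouse at dusk"})
print(message)
```

Because the cost is fetched on every call, the sketch stays correct when backend pricing drifts.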

---

# Image generation tools

## by_prompt — Text-to-image

Generate an image from a text prompt, with no input image. This is the entry point for
creation: use it when the user has no source image and describes the picture in words, for
example to build a character or content from scratch, sketch concepts or backgrounds,
produce NSFW from a description, or go from quick drafts to final hero shots. Supports
fast/quality modes, batches, aspect ratios and body-shape LoRAs.

The models fall into three content groups; pick by what you need:

- **Top cloud models, censored** (`GENERAL_NSFW`, `NANO_BANANA`) — highest quality and
  realism, but block explicit content.
- **Uncensored but not built for porn** (`WAN_2_7_IMAGE`, `QWEN_IMAGE`, `SEEDREAM_5`) —
  high quality and won't block, but won't create explicit content from scratch; they
  accurately **transform NSFW references** you provide.
- **Local, built for explicit NSFW** (`SDXL_NSFW`, `FLUX_KLEIN_NSFW`) — slightly lower
  quality and more artifacts, but real explicit capability.

**Models:**

- `GENERAL_NSFW` *(default, trusted)* — General-purpose workhorse with a good
  quality/speed/price balance and strong facial likeness; the NSFW version is uncensored.
  Does not produce explicit anatomy from text alone (it covers it up). Older model —
  occasional hand/limb artifacts.
- `GENERAL_SFW` — The same pipeline, SFW only.
- `SDXL_NSFW` *(trusted)* — Best choice for explicit NSFW anatomy from text alone (it knows
  anatomy from training). Local model: slightly lower quality, more artifacts. Text-only —
  does **not** accept reference images. Renders a fixed ≈2:3 portrait (~1248×1824) and
  ignores `ratio`/`width`/`height`; pick `FLUX_KLEIN_NSFW` or `GENERAL_NSFW` when a
  specific aspect ratio matters.
- `WAN` — Legacy WAN image model; prefer `WAN_2_7_IMAGE`.
- `WAN_2_7_IMAGE` / `WAN_2_7_IMAGE_PRO` — Modern model with higher quality and detail
  (Pro = top consistency). Renders bodies, scenes and composition more aesthetically with
  fewer hand/limb artifacts. Weaker at in-image text. Uncensored, but transforms your NSFW
  references rather than inventing explicit content.
- `QWEN_IMAGE` / `QWEN_IMAGE_PRO` — Aesthetic results with good facial likeness and few
  artifacts; great for stylized / illustrative / anime subjects. Pro adds realism.
  Uncensored; transforms NSFW references.
- `SEEDREAM_5` — Newer generation: better prompt understanding, stronger stylization,
  better likeness, fewer artifacts. Uncensored; transforms NSFW references.
- `NANO_BANANA` — Among the best for realism, and the only model that reliably renders
  legible in-image text (posters, signage, captions); strong real-world knowledge. Heavily
  censored — won't produce even mildly suggestive content. Weaker facial likeness.
- `FLUX_KLEIN_NSFW` *(trusted)* — The most advanced local NSFW model: produces explicit
  content and also works with references — bring a character's face and create an action.
  Slightly lower quality, occasional artifacts.

> `by_prompt` generates exactly one image per input. For N variants, pass N input objects in
> the `inputs` array of one task (do not raise `batch_size`).
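
The note above can be sketched as a small helper that repeats one input object N times. The per-input keys `prompt` and `model` are assumptions about the input shape, and `variant_inputs` is a hypothetical name.

```python
def variant_inputs(prompt: str, model: str, n: int) -> list:
    """Build N independent input objects, one per desired image."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A fresh dict per variant, so later per-variant tweaks do not leak across inputs.
    return [{"prompt": prompt, "model": model} for _ in range(n)]


task_args = {
    "tool_name": "by_prompt",
    "inputs": variant_inputs("neon street at night, rain", "SEEDREAM_5", 3),
}
print(len(task_args["inputs"]))
```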

## image_editor — The main, most flexible image tool

Edit and composite existing images by prompt. Bring references, edit and combine them;
bring your character and dress them from a reference photo. It offers both SFW and NSFW
models, and LoRA presets that extend NSFW capability. Use it to keep a product or object
**exactly** (fabric, pattern, shape) while changing the scene. This is the most capable
image tool — and the **default for all reference-based generation** (use it instead of
`by_ref`).

**Models:**

- `GENERAL_NSFW` *(default, trusted)* — Universal default, uncensored, good facial
  likeness; general NSFW edits such as outfit or pose changes. Older model — occasional
  limb artifacts.
- `NANO_BANANA` — High realism and the most precise prompt-driven edits; **required for any
  edit involving in-image text**. Heavily censored (no NSFW); weaker likeness.
- `QWEN_IMAGE` / `QWEN_IMAGE_PRO` — Aesthetic, good likeness, few artifacts; Pro adds
  realism. Uncensored; transforms your NSFW references rather than creating explicit
  content from scratch.
- `SEEDREAM_5` — Newer than the default: better prompt understanding, stronger
  stylization, better likeness, fewer artifacts. Uncensored; transforms NSFW references.
- `WAN_2_7_IMAGE` / `WAN_2_7_IMAGE_PRO` — Aesthetic bodies and composition, few artifacts,
  precise editing; Pro = higher quality. Uncensored; transforms NSFW references. Weaker at
  in-image text.
- `FLUX_KLEIN_NSFW` *(trusted)* — Local flagship for explicit NSFW with references: it
  knows anatomy and accepts a face reference. Slightly lower quality, occasional artifacts.
- `FLUX_KLEIN_LORA` *(trusted)* — LoRA presets that extend NSFW capability (including undress
  presets) and style templates; pass a `lora_id`.

> `SDXL_NSFW` is intentionally **not** offered here — it is text-only and cannot accept
> references. For explicit anatomy on a reference, use `FLUX_KLEIN_NSFW`.
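
One possible reading of the model notes above, written as a decision helper. The function name and its boolean flags are hypothetical, and the ordering of the rules is a judgment call rather than documented behavior; it only encodes the constraints stated in this section.

```python
from typing import Optional


def pick_image_editor_model(edits_in_image_text: bool = False,
                            explicit_nsfw: bool = False,
                            lora_id: Optional[str] = None) -> dict:
    """Suggest a `model` (and `lora_id` when relevant) for an image_editor input."""
    if edits_in_image_text:
        # NANO_BANANA is required for in-image text, but it is heavily censored.
        if explicit_nsfw:
            raise ValueError("in-image text needs NANO_BANANA, which cannot produce NSFW")
        return {"model": "NANO_BANANA"}
    if lora_id:
        # LoRA presets (including undress presets) run on FLUX_KLEIN_LORA.
        return {"model": "FLUX_KLEIN_LORA", "lora_id": lora_id}
    if explicit_nsfw:
        # Explicit anatomy on a reference: SDXL_NSFW is not offered here.
        return {"model": "FLUX_KLEIN_NSFW"}
    return {"model": "GENERAL_NSFW"}  # the tool's default


print(pick_image_editor_model(explicit_nsfw=True))
```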

## by_ref — Generate a similar image from a reference

Bring a reference photo and get a similar one. With `GENERAL` you can bring a character's
face plus a reference photo and get a similar shot featuring **your** character. With
`SDXL` you bring only a photo and get a similar one.

> *Legacy / explicit-request-only.* For most reference-based work `image_editor` is more
> flexible (identity carryover, native aspect-ratio control, multiple references) and is the
> recommended choice — use `by_ref` only when explicitly asked for it by name.

**Models:**

- `SDXL` *(default)* — Local, NSFW-capable. Input is a photo only. Fixed ≈4:5 portrait
  output (~1392×1752); `by_ref` has no `ratio`/`width`/`height` inputs.
- `GENERAL` — Higher quality and realism; can carry a character's face into a reference-like
  shot. Fixed ≈1:1 (square) output.

## facegen — Create a face from scratch

Generate a brand-new face from structured attributes: gender, age, origin/ethnicity, body
type, eye/hair/beard color, hairstyle, beard and makeup. No reference image; returns
several variants per request. Strength: full parametric control over appearance. There is
no free-form prompt — required fields are `gender`, `age`, `origin`; the rest are optional
appearance fields. Use it to **mint a new persona reference** (for likeness of a specific
person, use `faceswap` or `by_ref`). Niche — used occasionally.

## photoshoot — Photoshoot from face + body references

Bring a photo of the face and a photo of the body (without a face) of your character; the
tool runs them through prepared prompt presets and returns a batch of images. Presets are
grouped by type, so you can produce a set in a given style or action. The `prompt`
describes the **scene** (wardrobe, location, pose, lighting, mood) — not the subject, which
the references encode. Strength: reproducible, identity-preserving results with no manual
prompting — and, unlike `by_ref`, it honors a hard aspect ratio (pass `ratio` together with
matching `width`/`height`).
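
A sketch of what "matching" could mean in practice: derive `width`/`height` from a `W:H` ratio string and a chosen long side, then verify the pair. The helper names, the `W:H` string format, and the one-pixel rounding tolerance are assumptions; the accepted ratios and sizes come from `zencreator_get_tool_schema`.

```python
def dims_for_ratio(ratio: str, long_side: int) -> tuple:
    """Return (width, height) whose proportions follow `ratio`, e.g. "2:3"."""
    w, h = (int(part) for part in ratio.split(":"))
    if w >= h:
        return long_side, long_side * h // w
    return long_side * w // h, long_side


def dims_match_ratio(ratio: str, width: int, height: int) -> bool:
    """True when width/height equals the ratio, allowing for integer rounding."""
    w, h = (int(part) for part in ratio.split(":"))
    # Cross-multiply to avoid floats; the slack covers one pixel of rounding.
    return abs(width * h - height * w) < max(w, h)


width, height = dims_for_ratio("2:3", 1536)
print(width, height, dims_match_ratio("2:3", width, height))
```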

## carousel — Multiple camera angles of one subject

Bring an image and get the same subject from different camera angles (up to 10). Use it for
social-media carousels and a "3D" / product overview of an object or character. There is no
prompt — angle variation is automatic; the main dial is the number of images. Niche — used
occasionally.

## collaber — Two characters in one frame

Bring two characters and an optional background/location photo; the tool combines them into
a single scene (1–4 images). The `prompt` describes their **interaction** and the joint
scene, not their individual identities (those come from the two references). Strength: keeps
both characters' likeness — a convenient preset for collabs and duets.

## faceswap — Swap a face on a photo

Bring the photo where the face should be replaced plus a face photo, and the tool swaps the
character. **Image only — there is no video face swap.**

**Models:**

- `SDXL` *(default)* — Lowest likeness of the set; fast/cheap baseline.
- `GENERAL` — Better likeness, but not always stable.
- `GENERAL_ADVANCED` — Improved general swap with the strongest identity preservation.
- `FULL_HEAD` — Replaces the entire head, not just the face — use when the target's
  hairstyle or head shape differs strongly from the source.

## undress / male_undresser — Remove clothing

> `male_undresser` is a **🔒 trusted-only tool** — available only to trusted accounts (granted
> automatically after your first credit purchase). `undress` needs only `nsfw_allowed`.

Fully removes clothing from a character. Two variants: `undress` (default, tuned for female
subjects) and `male_undresser` (Flux Klein edit-LoRA tuned for male anatomy). Both are fully
automatic — single input image, no prompt, no parameters. These are convenience presets
built on Flux Klein LoRA — the same result is available directly through `image_editor`
with the Flux Klein LoRA presets, including presets that handle paired photos.

## flux_klein_lora — Flux Klein with LoRA templates

> **🔒 Trusted account required** — available only to trusted accounts (granted automatically
> after your first credit purchase).

Generate or edit images with Flux Klein driven by a LoRA style/undress template. Inputs:
`image_assets` (1–3 reference asset_ids), `lora_id` (**required** — the LoRA template id;
browse via the templates catalog), an optional short `prompt` (the LoRA owns the style, so
keep tweaks to scene/pose), and an optional `ratio` (1:1, 16:9, 9:16, 4:3, 3:4, 2:3, 3:2,
21:9). Base price 3 credits.

> Standalone equivalent of `image_editor` with `model=FLUX_KLEIN_LORA`; prefer `image_editor`
> unless you specifically want the dedicated entry point. The `undress` / `male_undresser`
> presets are convenience wrappers over the same Flux Klein LoRA pipeline.
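
The input constraints listed in this section can be checked client-side before submitting, as in this sketch. The field names and allowed ratios are taken from the text above; the helper name and the example asset and LoRA ids are hypothetical placeholders.

```python
from typing import Optional, Sequence

ALLOWED_RATIOS = {"1:1", "16:9", "9:16", "4:3", "3:4", "2:3", "3:2", "21:9"}


def build_flux_klein_lora_input(image_assets: Sequence[str], lora_id: str,
                                prompt: Optional[str] = None,
                                ratio: Optional[str] = None) -> dict:
    """Validate and assemble one flux_klein_lora input object."""
    if not 1 <= len(image_assets) <= 3:
        raise ValueError("image_assets must hold 1 to 3 asset ids")
    if not lora_id:
        raise ValueError("lora_id is required")
    if ratio is not None and ratio not in ALLOWED_RATIOS:
        raise ValueError(f"unsupported ratio: {ratio}")
    payload = {"image_assets": list(image_assets), "lora_id": lora_id}
    if prompt:
        payload["prompt"] = prompt  # keep it short: the LoRA owns the style
    if ratio:
        payload["ratio"] = ratio
    return payload


# Placeholder ids for illustration only.
example = build_flux_klein_lora_input(["asset_face_001"], "lora_example_id", ratio="3:4")
print(example)
```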

## upscaler — Increase image resolution

Brings an image up to the resolution you choose and restores detail. Best for
**low-resolution sources** that need to be cleaned up and improved; gains are limited on an
already-sharp 4K image. There is no prompt — the only meaningful choice is the `version`.

**Versions:**

- `basic` / `basic_safe_face` — Baseline upscale; `*_safe_face` preserves the face.
- `natural_clarity` — Cheap and natural-looking, any size.
- `premium_realism` / `premium_safe_face` — Photorealistic detail; `*_safe_face` preserves
  the face.
- `ultra_clarity` — Maximum detail.

> Use a `*_safe_face` version whenever the image contains a face you need kept faithful; the
> other versions can subtly alter facial features while sharpening.
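
The version list and the face rule above can be folded into a lookup, as sketched here. The `quality` labels (`basic`, `natural`, `premium`, `ultra`) and the helper name are hypothetical; only the returned `version` strings come from this page.

```python
# (quality label, keep_face) -> upscaler `version`
VERSIONS = {
    ("basic", False): "basic",
    ("basic", True): "basic_safe_face",
    ("natural", False): "natural_clarity",
    ("premium", False): "premium_realism",
    ("premium", True): "premium_safe_face",
    ("ultra", False): "ultra_clarity",
}


def pick_upscaler_version(quality: str, keep_face: bool) -> str:
    """Return a `version`, refusing combinations that have no *_safe_face variant."""
    try:
        return VERSIONS[(quality, keep_face)]
    except KeyError:
        raise ValueError(
            f"no version for quality={quality!r} with keep_face={keep_face}; "
            "use 'basic' or 'premium' when the face must stay faithful"
        ) from None


print(pick_upscaler_version("premium", keep_face=True))
```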

## upscaler_faceswap — Face swap + upscale (legacy)

Swaps the face from `face_asset` onto the person in `ref_asset`, then upscales the result.
Inputs: `ref_asset` + `face_asset` (both required) and an optional `upscaler_version`
(`basic` / `premium_realism`; default `basic`). Output: one image. Base price 2 credits.

> *Legacy combo — prefer `image_editor`*: the shared face-swap-then-upscale pipeline gives
> insufficient face similarity. For better likeness, use `faceswap` (or `image_editor` with a
> face reference) and chain `upscaler` for a real resolution bump.

---

# Video generation tools

## videogen — Generate video

Animate a photo into video (image→video). Cost depends on the model, duration (up to 15 s; the allowed range is model-specific) and
resolution (480p–1080p). Supports prompt enhancement, fixed camera, LoRAs and optional
audio. `videogen` is the **image-to-video** tool — a starting frame (`ref_asset`) is
**required on every call**; for pure text-to-video (no starting image) use `text_to_video`.

**Quick guide:** general content (best price/quality) → **Seedance**; NSFW / explicit
content → `wan@2.7-nsfw` *(trusted)* or `wan@2.2-lora` (needs only `nsfw_allowed`);
tasteful content with complex actions → **Kling** (censored).

Each model declares its own duration and resolution capability — filter against the user's
request before suggesting one, and price every candidate with `zencreator_compare_prices`
(which can sweep resolutions cheapest-first).

**Wan** — best prompt understanding and first-frame animation:

- `wan@2.7` — Latest line; top prompt understanding and first-frame animation. Continuous
  2–15 s, 720p/1080p. (Does **not** accept a `last_frame` keyframe — for that use `kling@2.1`,
  `seedance_pro`, or `seedance_v1_5_pro`; to go beyond 15 s, chain clips via `ref_asset`.)
- `wan@2.7-nsfw` *(trusted)* — Wan 2.7 for NSFW; the best choice when you have a first frame (or a
  frame with an action) to animate. Uncensored. **Trusted-only** — requires `is_trusted`.
- `wan@2.6` / `wan@2.6-flash` — Cheaper and older (`flash` is even cheaper and faster).
  Duration 5, 10, or 15 s; 720p/1080p.
- `wan@2.5` — Sharper motion than 2.2; duration 5 or 10 s; 480p/720p/1080p.
- `wan@2.2` — Frame-based duration; very flexible; uncensored NSFW base. **This is the
  backend's fallback default** when no model is passed (a factual fallback, not a
  recommendation — prefer Seedance for unspecified general content).
- `wan@2.2-lora` — Presets with action-trained LoRAs that turn a first frame into a complex
  action. Includes "Blink" LoRAs: bring any photo of your character and the frame morphs
  into the desired NSFW action. The **easiest option for beginners** — no prompt needed,
  just a photo similar to the example's first frame.

**Kling** — censored, but animates a first frame well; newer versions cost more and
understand prompts and complex actions better. Duration 5 or 10 s, 1080p:

- `kling@2.6` — Latest Kling, top motion and physics, plus native audio (dialogue, effects).
- `kling@2.5` — High quality, cheaper, consistent at volume.
- `kling@2.1` — Stable motion. Supports a start+end keyframe (`last_frame`).
- `kling@1.6` — Legacy, lowest cost.

**Seedance** — uncensored; best price/quality balance for content:

- `seedance_pro_fast` — Faster and cheaper, less "smart". Any integer duration 2–12 s.
- `seedance_pro` — Pricier and smarter. Any integer duration 2–12 s. Supports a start+end
  keyframe (`last_frame`).
- `seedance_v1_5_pro` — Best quality and result; joint audio+video, micro-expressions,
  first+last frame. Continuous 4–12 s.

**Grok:**

- `grok@4.1` — Censored; animates a first frame, top image-to-video, always emits native
  audio. Continuous 1–10 s.

> Native audio: models that support it accept `generate_audio: true`; `grok@4.1` always
> emits audio. Check each model's capabilities via `zencreator_get_tool_schema`.
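
Filtering models against a request, as recommended above, might be sketched like this. The table only transcribes durations, resolutions, and `last_frame` support that this page states; `None` marks a resolution the page does not mention, `last_frame` is `True` only where the page says so, and models without stated durations are left out. The `Cap` type and function names are hypothetical, and the live schema remains the source of truth.

```python
from typing import NamedTuple, Optional


class Cap(NamedTuple):
    discrete: Optional[tuple]     # fixed set of allowed durations (seconds), if any
    span: Optional[tuple]         # (min, max) seconds when a range is allowed
    resolutions: Optional[tuple]  # None = not stated on this page
    last_frame: bool              # True only where this page says so


CAPS = {
    "wan@2.7":           Cap(None, (2, 15), ("720p", "1080p"), False),
    "wan@2.6":           Cap((5, 10, 15), None, ("720p", "1080p"), False),
    "wan@2.6-flash":     Cap((5, 10, 15), None, ("720p", "1080p"), False),
    "wan@2.5":           Cap((5, 10), None, ("480p", "720p", "1080p"), False),
    "kling@2.6":         Cap((5, 10), None, ("1080p",), False),
    "kling@2.5":         Cap((5, 10), None, ("1080p",), False),
    "kling@2.1":         Cap((5, 10), None, ("1080p",), True),
    "kling@1.6":         Cap((5, 10), None, ("1080p",), False),
    "seedance_pro_fast": Cap(None, (2, 12), None, False),
    "seedance_pro":      Cap(None, (2, 12), None, True),
    "seedance_v1_5_pro": Cap(None, (4, 12), None, True),
    "grok@4.1":          Cap(None, (1, 10), None, False),
}


def supports(cap: Cap, duration_s: int, resolution: str, need_last_frame: bool = False) -> bool:
    """Check one model against the requested duration, resolution, and keyframe need."""
    if cap.discrete is not None:
        ok_duration = duration_s in cap.discrete
    else:
        low, high = cap.span
        ok_duration = low <= duration_s <= high
    # Unknown resolutions pass here and must be confirmed against the schema.
    ok_resolution = cap.resolutions is None or resolution in cap.resolutions
    ok_keyframe = cap.last_frame or not need_last_frame
    return ok_duration and ok_resolution and ok_keyframe


def candidates(duration_s: int, resolution: str, need_last_frame: bool = False) -> list:
    """Models from CAPS that can serve the request, in catalog order."""
    return [name for name, cap in CAPS.items()
            if supports(cap, duration_s, resolution, need_last_frame)]


print(candidates(12, "1080p"))
print(candidates(5, "1080p", need_last_frame=True))
```

The surviving candidates are the ones worth passing to `zencreator_compare_prices`.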

## text_to_video — Text-to-video

> **🔒 Trusted account required** — available only to trusted accounts (granted automatically
> after your first credit purchase), regardless of whether the request is SFW or NSFW.

Generates a first frame (on Flux Klein, Wan 2.2 or Wan 2.7) and then animates it. Use this
when there is no starting image; if you already have a frame, use `videogen`.

Durations are model-specific — an out-of-set duration bills a 1-credit no-op, so match them
exactly:

- `wan@2.7` *(default)* — Top quality. Durations **5 / 10 / 15 s** (not 8). 720p or 1080p
  (price ≈ 2.6 credits/s at 720p, ≈ 3.4 credits/s at 1080p — so 5 s = 13/17, 10 s = 26/34,
  15 s = 39/51).
- `wan@2.2` — Budget. Durations **5 / 8 s** (not 10/15). Resolution is ignored — flat 10
  credits (5 s) / 13 credits (8 s).
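
A sketch of guarding against the out-of-set no-op, plus a check of the `wan@2.7` arithmetic quoted above. The rates are this page's approximate figures, stored in tenths of a credit to keep the math in integers; they illustrate the calculation only, and the live cost should still come from `zencreator_estimate_price`. Helper names are hypothetical.

```python
# Allowed durations (seconds) per text_to_video model, as listed above.
ALLOWED_DURATIONS = {"wan@2.7": (5, 10, 15), "wan@2.2": (5, 8)}

# Approximate wan@2.7 rates from this page, in tenths of a credit per second.
APPROX_RATE_TENTHS = {"720p": 26, "1080p": 34}


def check_duration(model: str, duration_s: int) -> int:
    """Return the duration unchanged, or raise if the model would treat it as a no-op."""
    allowed = ALLOWED_DURATIONS[model]
    if duration_s not in allowed:
        raise ValueError(f"{model} accepts only {allowed} seconds, got {duration_s}")
    return duration_s


def approx_wan27_credits(duration_s: int, resolution: str) -> int:
    """Illustrative only: rate x duration for wan@2.7, using the page's approximate rates."""
    return APPROX_RATE_TENTHS[resolution] * check_duration("wan@2.7", duration_s) // 10


for seconds in ALLOWED_DURATIONS["wan@2.7"]:
    print(seconds, approx_wan27_credits(seconds, "720p"), approx_wan27_credits(seconds, "1080p"))
```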

## video2video — Replace a character in a video

Transfer motion / replace a character in a video using a reference video **or an
Instagram/TikTok URL** (passed directly — no upload needed); the original soundtrack can be
kept. SFW and NSFW variants. `resolution` (480p / 720p / 1080p) is required.

**Modes:**

- `kling_2_6_sfw` — Handles character replacement best (censored); billed per second.
- `replace_sfw` / `replace_nsfw` — Same character-replacement logic; `replace_nsfw` is
  uncensored.
- `animate_sfw` / `animate_nsfw` — Motion transfer / animation; `animate_nsfw` is
  uncensored.
- `dreamactor_m2` *(trusted)* — Same character-replacement logic, uncensored. **Trusted-only** —
  requires `is_trusted`.

> The uncensored modes give up some quality and prompt understanding, so you may need to
> adjust the input (source video or character) before you get a good result. Modes marked
> **(trusted)** are available only to trusted accounts.

## lipsync — Talking head

Bring an audio file and a first frame, and get a video in which the character speaks your
audio. Up to 35 seconds; the first frame must be a JPG/PNG under 5 MB. Use it to voice a character or avatar. Niche —
used occasionally.

**Models:**

- `GENERAL_NSFW` *(default, trusted)* — Specialized lipsync pipeline. Note: this
  `GENERAL_NSFW` refers to a different underlying model than `GENERAL_NSFW` under
  `by_prompt` / `image_editor`; the shared name is a backend artifact. `lipsync` takes
  audio + a first-frame portrait image (no video source) and has no text prompt.

## video_upscaler — Upscale video

Bring a medium-quality video and get a sharper, higher-resolution result. Use it as a final
polish or to restore low-quality footage. There is no prompt. Niche — used occasionally.

## video_merger — Concatenate clips

Stitch 2–5 video clips into a single video, with a transition between each. No prompt, no
model selection — it is the final assembly step after generating individual clips with
`videogen` / `text_to_video`. Inputs: `clips` (2–5 items, each an uploaded video `asset_id`
plus its `source_duration_sec`, with optional `trim_start_sec` / `trim_end_sec` to cut the
clip), `transition` (`cut` / `dissolve` / `fade` / `slide`; default `cut`), `keep_audio`
(default true), `fps` (24 or 30; default 30), and `width` / `height` (default 1280×720).
Base price 1 credit.
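
The inputs described above could be assembled and validated as in this sketch. Field names, limits, and defaults are transcribed from the paragraph; the helper names and the example asset ids are hypothetical, and nesting everything in one input object is an assumption.

```python
from typing import Optional

TRANSITIONS = ("cut", "dissolve", "fade", "slide")


def clip(asset_id: str, source_duration_sec: float,
         trim_start_sec: Optional[float] = None,
         trim_end_sec: Optional[float] = None) -> dict:
    """One entry of `clips`; trims are included only when given."""
    item = {"asset_id": asset_id, "source_duration_sec": source_duration_sec}
    if trim_start_sec is not None:
        item["trim_start_sec"] = trim_start_sec
    if trim_end_sec is not None:
        item["trim_end_sec"] = trim_end_sec
    return item


def build_merger_input(clips: list, transition: str = "cut", keep_audio: bool = True,
                       fps: int = 30, width: int = 1280, height: int = 720) -> dict:
    """Validate the documented limits and return one video_merger input object."""
    if not 2 <= len(clips) <= 5:
        raise ValueError("video_merger needs 2 to 5 clips")
    if transition not in TRANSITIONS:
        raise ValueError(f"transition must be one of {TRANSITIONS}")
    if fps not in (24, 30):
        raise ValueError("fps must be 24 or 30")
    return {"clips": clips, "transition": transition, "keep_audio": keep_audio,
            "fps": fps, "width": width, "height": height}


# Placeholder asset ids for illustration only.
merged = build_merger_input(
    [clip("asset_clip_a", 5.0), clip("asset_clip_b", 10.0, trim_end_sec=8.0)],
    transition="dissolve",
)
print(merged)
```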

---

## See also

- [Generation MCP tools](./tools/generation.md) — `zencreator_create_task`,
  `zencreator_run_and_wait`, `zencreator_get_tool_schema`, `zencreator_estimate_price`,
  `zencreator_compare_prices`, and the rest of the task lifecycle.
- [Concepts](./concepts.md) — tasks, calls, assets, the tool-vs-model split.
- [Workflows](./workflows.md) — end-to-end recipes (image this turn, video, NSFW preflight).
