Tips and tricks

How to set up local AI video generation

AnimateDiff works as an extension on top of Stable Diffusion pipelines.
The two most common setups are:

1️⃣ ComfyUI + AnimateDiff → most powerful (recommended)
2️⃣ AUTOMATIC1111 WebUI + AnimateDiff → easier but less flexible

If you’re running a capable GPU (e.g. an RTX 5070 Ti) and you want character consistency and advanced control, the ComfyUI method is the best.

Below is the full setup.


1. Install Python (if not already installed)

AnimateDiff runs in Python environments.

Check first:

python3 --version

You want something like:

Python 3.10.x

If it’s missing, install it. On macOS:

brew install python@3.10

On Windows or Linux, download the installer from:

https://python.org
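
Either way, it’s worth confirming that pip and the venv module came along with the interpreter (a quick sanity check, nothing more):

python3 -m pip --version
python3 -m venv --help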

2. Install ComfyUI

Clone the repo:

git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI

Create a virtual environment:

python3 -m venv venv
source venv/bin/activate

On Windows (CMD.EXE), activate with:

venv\Scripts\activate.bat

Install dependencies:

pip install -r requirements.txt

Start ComfyUI:

python main.py --enable-manager

Open browser:

http://127.0.0.1:8188

You now have the base AI generation interface.
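
By default ComfyUI binds to localhost only. Two stock flags worth knowing (run python main.py --help for the full list): --listen to expose the UI to your LAN, and --port to change the port:

python main.py --listen 0.0.0.0 --port 8188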


3. Install the ComfyUI Manager (very important)

This allows installing extensions easily.

Go inside:

ComfyUI/custom_nodes

Clone:

git clone https://github.com/ltdrdata/ComfyUI-Manager

Restart ComfyUI.

You will now see a Manager button in the UI.
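
The Manager can update itself and other custom nodes from the UI, but since it is just a git checkout, you can also update it by hand:

cd ComfyUI/custom_nodes/ComfyUI-Manager
git pull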


4. Install AnimateDiff Extension

Using the Manager:

Manager
→ Install Custom Nodes
→ Search: AnimateDiff
→ Install

Or manually:

cd ComfyUI/custom_nodes
git clone https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved

Restart ComfyUI again.


5. Download AnimateDiff Motion Models

These control the animation.

Create this folder:

ComfyUI/models/animatediff

Download one of these motion modules:

mm_sd_v15_v2.ckpt (v2, for SD 1.5 checkpoints; recommended)

or

mm_sdxl_v10_beta.ckpt (beta, for SDXL checkpoints)

Example download:

wget https://huggingface.co/guoyww/animatediff/resolve/main/mm_sd_v15_v2.ckpt

Place inside:

ComfyUI/models/animatediff/
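
The folder creation and download collapse into two commands (same URL as above), run from the directory that contains ComfyUI:

mkdir -p ComfyUI/models/animatediff
wget -P ComfyUI/models/animatediff https://huggingface.co/guoyww/animatediff/resolve/main/mm_sd_v15_v2.ckpt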

6. Install Base Image Model

AnimateDiff needs a base Stable Diffusion model.

Create folder:

ComfyUI/models/checkpoints

Download a model such as:

Realistic Vision (SD 1.5)

or

Juggernaut XL (SDXL)

The checkpoint family must match your motion module: mm_sd_v15_v2.ckpt needs an SD 1.5 checkpoint, while mm_sdxl_v10_beta.ckpt (and the SDXL base below) needs an SDXL one.

Example:

wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors

Put it in:

ComfyUI/models/checkpoints/

7. Load an AnimateDiff Workflow

Inside ComfyUI:

Load
→ AnimateDiff workflow JSON

A basic pipeline looks like:

Checkpoint Loader → CLIP Text Encode → AnimateDiff Loader → KSampler → VAE Decode → Save Video

Prompt example:

a cinematic shot of a dancer in a modern studio, neon lights, slow motion movement

Frames: 16–32

FPS: 8–12

8. Generate Your First Video

Typical settings:

Setting       Value
frames        16
fps           8
steps         20
cfg           6
resolution    768×512

Click Queue Prompt.

You will get a GIF / MP4 output.


9. Improve Quality

Common improvements:

Interpolate frames

Use a frame interpolator such as Flowframes to raise the frame rate, for example 8 fps → 60 fps. Motion becomes noticeably smoother.
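
If you would rather stay on the command line, ffmpeg’s minterpolate filter does basic motion-compensated interpolation (a minimal sketch; the filenames are placeholders, and RIFE-based tools like Flowframes give smoother results):

ffmpeg -i animatediff_out.mp4 -vf "minterpolate=fps=60:mi_mode=mci" -c:v libx264 -crf 18 smooth_60fps.mp4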


Upscale

Tools:

  • Topaz Video AI
  • Video2X
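
For a quick non-AI resize while you evaluate those tools, plain ffmpeg scaling works as a stopgap (a sketch with placeholder filenames; AI upscalers recover far more detail):

ffmpeg -i smooth_60fps.mp4 -vf "scale=1536:1024:flags=lanczos" -c:v libx264 -crf 18 upscaled.mp4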

10. Character Consistency (Next Step)

Once AnimateDiff works, the next step is adding:

  • IPAdapter (reference face)
  • ControlNet Pose
  • Character LoRA

This is how people create consistent AI characters in videos.


11. Typical AI Video Pipeline (Creators Use This)

Character LoRA + AnimateDiff + ControlNet Pose + Flowframes interpolation + Topaz upscale

Result → high-quality AI videos.

12. Verify that CUDA is enabled
python -c "import torch; print(torch.__version__); print(torch.version.cuda); print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'NO CUDA')"
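
If is_available() prints False, you most likely have a CPU-only torch build. Reinstalling a CUDA build inside the venv usually fixes it (a sketch assuming a CUDA 12.x driver; check pytorch.org for the index URL matching your card, as very new GPUs may need a nightly build):

pip uninstall -y torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121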
13. Add other folders as model paths

ComfyUI doesn’t automatically see models from AUTOMATIC1111 (stable-diffusion-webui) unless you tell it to.


✅ Reuse your existing models via extra_model_paths.yaml (recommended)

This avoids duplicating 20GB+ models.


Step 1 — Locate the config file

Go to:

F:\ai\ComfyUI\extra_model_paths.yaml.example

Rename it to:

extra_model_paths.yaml

Step 2 — Edit it

Open it and add your paths:

a111:
    base_path: F:/ai/stable-diffusion-webui/models/
    checkpoints: Stable-diffusion
    vae: VAE
    loras: Lora
    embeddings: embeddings
    controlnet: ControlNet

Note: AUTOMATIC1111 keeps embeddings in the webui root rather than under models/, so point that entry elsewhere if you actually use embeddings.

Step 3 — Restart ComfyUI
python main.py

Now ComfyUI will see your existing models.

If you are having problems with float precision (black frames, NaN errors):

python main.py --force-fp32 --enable-manager
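
Two other stock flags help on constrained machines (see python main.py --help for the full list):

python main.py --lowvram
python main.py --cpu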

Wan 2.1 Models

Wan 2.1 is a family of video models.

For Wan 2.2 see: Wan 2.2

Files to Download

You will first need:

Text encoder and VAE:

umt5_xxl_fp8_e4m3fn_scaled.safetensors goes in: ComfyUI/models/text_encoders/

wan_2.1_vae.safetensors goes in: ComfyUI/models/vae/
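
A hedged download sketch, assuming these files still live in the Comfy-Org repackaged Wan 2.1 repo on Hugging Face (run from the directory containing ComfyUI):

wget -P ComfyUI/models/text_encoders https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors
wget -P ComfyUI/models/vae https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors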

Video Models

The diffusion models can be found here

Note: The fp16 versions are recommended over the bf16 versions as they will give better results.

Quality rank (highest to lowest): fp16 > bf16 > fp8_scaled > fp8_e4m3fn

These files go in: ComfyUI/models/diffusion_models/

These examples use the 16 bit files but you can use the fp8 ones instead if you don’t have enough memory.
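
For example, the small 1.3B text-to-video model used in the first workflow below (same repo assumption as above):

wget -P ComfyUI/models/diffusion_models https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/diffusion_models/wan2.1_t2v_1.3B_fp16.safetensors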

Workflows

Text to Video

This workflow requires the wan2.1_t2v_1.3B_fp16.safetensors file (put it in: ComfyUI/models/diffusion_models/). You can also use it with the 14B model.

Example

Workflow in JSON format

Image to Video

This workflow requires the wan2.1_i2v_480p_14B_fp16.safetensors file (put it in: ComfyUI/models/diffusion_models/) and clip_vision_h.safetensors which goes in: ComfyUI/models/clip_vision/
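
If you don’t have clip_vision_h.safetensors yet, the same repackaged repo is assumed to host it:

wget -P ComfyUI/models/clip_vision https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors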

Note: this example only generates 33 frames at 512×512 to keep it accessible; the model can do more than that. The 720p model is pretty good if you have the hardware/patience to run it.

Workflow in JSON format

The input image can be found on the flux page.

Here’s the same example with the 720p model:

VACE reference Image to Video

This workflow requires the wan2.1_vace_14B_fp16.safetensors file (put it in: ComfyUI/models/diffusion_models/)

This example generates a video from a reference image, this is different from generating a video from a start image. You’ll notice that the video does not actually contain the reference image but is clearly derived from it.

Workflow in JSON format

You can find the input image here; that image contains a Chroma workflow, if you are interested in how it was generated.

Image Camera to Video

This workflow requires the wan2.1_fun_camera_v1.1_1.3B_bf16.safetensors file (put it in: ComfyUI/models/diffusion_models/) and clip_vision_h.safetensors which goes in: ComfyUI/models/clip_vision/ if you don’t have it already.

Workflow in JSON format

The input image can be found on the flux page.






Basic AnimateDiff Workflow (copy this JSON)


Go to ComfyUI → Load → Paste JSON

Note: this is a simplified skeleton meant to show the wiring (links are written as [from_node, from_slot, to_node, to_slot]); a workflow exported from ComfyUI carries extra metadata such as node sizes and slot definitions, so if it refuses to load, recreate the eight nodes by hand in the same order.

{
  "last_node_id": 8,
  "last_link_id": 10,
  "nodes": [
    {
      "id": 1,
      "type": "CheckpointLoaderSimple",
      "pos": [100, 100],
      "widgets_values": ["realisticVision.safetensors"]
    },
    {
      "id": 2,
      "type": "CLIPTextEncode",
      "pos": [100, 250],
      "widgets_values": ["a cinematic shot of a fit young woman dancing in a modern gym, realistic lighting, detailed skin texture, 4k"]
    },
    {
      "id": 3,
      "type": "CLIPTextEncode",
      "pos": [100, 350],
      "widgets_values": ["blurry, low quality, deformed, bad anatomy"]
    },
    {
      "id": 4,
      "type": "AnimateDiffLoader",
      "pos": [300, 100],
      "widgets_values": ["mm_sd_v15_v2.ckpt"]
    },
    {
      "id": 5,
      "type": "EmptyLatentImage",
      "pos": [300, 250],
      "widgets_values": [512, 768, 16]
    },
    {
      "id": 6,
      "type": "KSampler",
      "pos": [500, 200],
      "widgets_values": [12345, 20, 6, "euler", "normal", 1]
    },
    {
      "id": 7,
      "type": "VAEDecode",
      "pos": [700, 200]
    },
    {
      "id": 8,
      "type": "SaveAnimatedWEBP",
      "pos": [900, 200],
      "widgets_values": ["AnimateDiff"]
    }
  ],
  "links": [
    [1, 0, 4, 0],
    [1, 1, 2, 0],
    [1, 1, 3, 0],
    [2, 0, 6, 1],
    [3, 0, 6, 2],
    [4, 0, 6, 3],
    [5, 0, 6, 0],
    [6, 0, 7, 0],
    [7, 0, 8, 0]
  ]
}
