The world of AI video generation is rapidly evolving, and a new contender has just entered the arena. We’re thrilled to announce the release of LTXV 13B, a groundbreaking open-source model. This model represents a significant leap forward, offering an exceptional blend of high-quality output and astonishing speed. The LTXV 13B model is set to redefine expectations for what’s possible in AI-driven video creation.
Many users will be surprised by its efficiency despite its 13 billion parameters. Let’s dive into what makes the LTXV 13B so special.
What is LTXV 13B? The Next Leap in AI Video
LTXV 13B is more than just an incremental update; it’s a meticulously engineered AI model designed for superior video generation. It builds upon the successes of previous LTXVideo versions, scaling up capabilities while optimizing for performance. This model empowers creators with tools previously out of reach for many, democratizing high-end video production. The release of LTXV 13B marks a pivotal moment for the open-source community.
Key Features That Make LTXV 13B Stand Out
The LTXV 13B model is packed with innovative features designed to provide users with an unparalleled video generation experience. These features combine to deliver both top-tier quality and remarkable efficiency.
Multiscale Rendering and Keyframe for Efficiency and Realism
One of the signature features of LTXV 13B is its multiscale rendering capability. The model first generates a low-resolution layout, then progressively refines it to high resolution. It also supports multiple keyframe generation points for better-guided generation. Together, these innovations enable super-efficient rendering and significantly enhance physical realism in the generated videos. Users will notice a tangible difference when utilizing this feature.

Blazing Fast Performance
Despite the increase to 13 billion parameters, speed remains a core strength of LTXV 13B. Benchmarks show it performing up to 30 times faster than other models of similar size. This incredible speed means creators can iterate faster and produce content more efficiently without compromising on the quality LTXV 13B delivers.
Advanced Creative Controls
LTXV 13B offers a suite of advanced controls, putting immense creative power in the hands of users. These include keyframe conditioning for precise animation. Camera motion control allows for dynamic shot composition. Character and scene motion adjustments, along with multi-shot sequencing, provide granular control over the final output.
Local Deployment with Quantized Model
Accessibility is key, which is why a quantized version of LTXV 13B is also available. This allows users to run the model directly on their own GPUs. The quantized model is optimized for both memory and speed, making high-quality video generation feasible on consumer-grade hardware.
Full Commercial Use License
LTXV 13B is released with a license that permits full commercial use. This opens up a world of possibilities for creators and businesses alike. For major enterprises, customized API solutions are available by contacting the developers directly.
Easy to Finetune for Customization
Customization is straightforward with LTXV 13B. Users can visit the official LTX-Video-Trainer on GitHub to easily create their own LoRA (Low-Rank Adaptation). This allows for fine-tuning the model to specific styles or content needs, further expanding its versatility.
What’s New in the LTXV 13B 0.9.7 Release?
The latest LTXV 13B 0.9.7 release, announced on May 6th, 2025, brings several exciting enhancements and new components. These updates solidify its position as a leading solution for AI video generation. This release focuses on delivering cinematic quality at unprecedented speeds.
Cinematic Quality and Unprecedented Speed
The LTXV 13B 0.9.7 version delivers truly cinematic-quality videos. It achieves this while maintaining the breakthrough prompt adherence and physical understanding introduced in earlier releases. The speed of generation remains a key highlight, making it a practical tool for demanding projects.
Quantized Version for Consumer GPUs (LTXV 13B Quantized 0.9.7)
A significant part of this release is the LTXV 13B Quantized 0.9.7 model. This version offers reduced memory requirements, enabling even faster inference speeds. It is ideal for consumer-grade GPUs like the NVIDIA 4090 and 5090, delivering outstanding quality with improved performance. To run this quantized version, users need to install the LTXVideo-Q8-Kernels package and use a dedicated ComfyUI flow, as loading via a standard LoadCheckpoint node won’t work.
Latent Upscaling Models for Enhanced Quality
This release introduces new latent upscaling models. These enable inference across multiple scales by upscaling latent tensors without decoding and encoding them repeatedly. This multiscale inference approach delivers high-quality results in a fraction of the time compared to similar models. The spatial and temporal upscaling models should be placed in the models/upscale_models folder.
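The saving comes from operating on latent tensors directly instead of round-tripping through the VAE. Here is a minimal sketch of the idea using a nearest-neighbor upscale in NumPy; the real upscalers apply trained weights, and the helper name and shapes below are illustrative, not the actual LTXV implementation:

```python
import numpy as np

def upscale_latent_spatial(latent: np.ndarray, factor: int = 2) -> np.ndarray:
    """Nearest-neighbor spatial upscale of a latent tensor (C, T, H, W).

    Illustrative stand-in for a learned latent upscaler: the tensor is
    enlarged in latent space with no VAE decode/encode round-trip.
    """
    return latent.repeat(factor, axis=2).repeat(factor, axis=3)

# A toy latent: 128 channels, 4 latent frames, a 16x24 spatial grid.
lat = np.zeros((128, 4, 16, 24), dtype=np.float32)
up = upscale_latent_spatial(lat, factor=2)
print(up.shape)  # (128, 4, 32, 48)
```

Because the tensor never leaves latent space, a multiscale pass pays only for the cheap resize plus a few refinement steps at the higher scale.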
Simplified ComfyUI Flows and Nodes
User experience is continually being improved. The 0.9.7 release includes new simplified flows and nodes for ComfyUI. Examples include simplified image-to-video, image-to-video with extension, and image-to-video with keyframes, making it easier to get started and achieve desired results.

How to Install LTXV 13B
Getting LTXV 13B set up involves a few key steps, primarily revolving around ComfyUI, which is the recommended environment for interacting with the model. Here’s a guide to get you started:
Preferred Installation: ComfyUI Manager
The easiest way to install the necessary components for LTXV 13B within ComfyUI is through the ComfyUI-Manager.
- Search for “ComfyUI-LTXVideo” in the ComfyUI-Manager’s list of custom nodes.
- Follow the installation instructions provided by the manager.
Manual Installation Steps for ComfyUI-LTXVideo
If you prefer a manual setup or need more control:
- Install ComfyUI: Ensure you have a working installation of ComfyUI.
- Clone Repository: Clone the ComfyUI-LTXVideo repository into the custom_nodes folder within your ComfyUI installation directory:
git clone https://github.com/Lightricks/ComfyUI-LTXVideo.git
- Install Packages: Navigate to the cloned directory and install the required Python packages:
cd custom_nodes/ComfyUI-LTXVideo
pip install -r requirements.txt
- For portable ComfyUI installations, use:
.\python_embeded\python.exe -m pip install -r .\ComfyUI\custom_nodes\ComfyUI-LTXVideo\requirements.txt
Downloading and Placing Models
- Main LTXV 13B Models: Download the ltxv-13b-0.9.7-dev.safetensors (and the quantized ltxv-13b-0.9.7-dev-fp8.safetensors if desired) from Hugging Face. Place these files in your ComfyUI/models/checkpoints directory.
- Text Encoder: You’ll need a T5 text encoder. One example is google_t5-v1_1-xxl_encoderonly. You can install this using the ComfyUI Model Manager.
- Latent Upscaling Models: Download the spatial and temporal upscaling models (e.g., ltxv-spatial-upscaler-0.9.7 and ltxv-temporal-upscaler-0.9.7). Place these in your ComfyUI/models/upscale_models folder.
- Quantized Model Kernel: If you plan to use the LTXV 13B Quantized 0.9.7 model, you must install the LTXVideo-Q8-Kernels package.
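The placement rules above can be captured in a small helper. Note the upscaler filenames are assumed here to carry a .safetensors extension; check the Hugging Face repository for the exact names before relying on this mapping:

```python
from pathlib import Path

# Where each LTXV 13B 0.9.7 file belongs inside a ComfyUI install.
# Upscaler filenames below are assumed; verify them on Hugging Face.
MODEL_LAYOUT = {
    "ltxv-13b-0.9.7-dev.safetensors": "models/checkpoints",
    "ltxv-13b-0.9.7-dev-fp8.safetensors": "models/checkpoints",
    "ltxv-spatial-upscaler-0.9.7.safetensors": "models/upscale_models",
    "ltxv-temporal-upscaler-0.9.7.safetensors": "models/upscale_models",
}

def target_dir(comfyui_root: str, filename: str) -> Path:
    """Return the directory a downloaded model file should be placed in."""
    return Path(comfyui_root) / MODEL_LAYOUT[filename]

print(target_dir("ComfyUI", "ltxv-temporal-upscaler-0.9.7.safetensors"))
```

A download script can iterate over this dictionary and move each file into place after fetching it from Hugging Face.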
Additional Custom Nodes
To run the example workflows, you might need additional custom nodes like ComfyUI-VideoHelperSuite. The ComfyUI Manager can help you identify and install any missing custom nodes when you try to load a workflow.

How to Use LTXV 13B
Once installed, LTXV 13B offers several ways to generate videos, with ComfyUI being the most feature-rich and recommended method.
Using LTXV 13B with ComfyUI
ComfyUI provides a flexible, node-based interface for LTXV 13B.
- Load Workflows: The ComfyUI-LTXVideo GitHub repository provides example workflows (JSON files). Load these into ComfyUI to get started. Examples include:
- Simplified image-to-video
- Image-to-video with keyframes
- Image-to-video with duration extension
- Image-to-video using the 8-bit quantized model
- Configure Nodes: Adjust the parameters in the nodes to control your video generation. This includes setting prompts, input images/videos, frame counts, resolutions, and LTXV 13B specific settings.
- Run Generation: Queue your prompt in ComfyUI to start the video generation process.
Using the inference.py Script (Local Runs)
For users who prefer command-line interfaces or want to integrate LTXV 13B into custom scripts, the inference.py script in the LTX-Video GitHub repository is available.
- Note: The developers recommend using ComfyUI workflows for the best results and output fidelity, as the inference.py script may not always match ComfyUI’s quality.
Text-to-Video:
python inference.py --prompt "YOUR DETAILED PROMPT" --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED --pipeline_config configs/ltxv-13b-0.9.7-dev.yaml
Image-to-Video:
python inference.py --prompt "YOUR PROMPT" --conditioning_media_paths IMAGE_PATH --conditioning_start_frames 0 --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED --pipeline_config configs/ltxv-13b-0.9.7-dev.yaml
Extending a Video:
Input video segments must contain a multiple of 8 frames plus 1 (e.g., 9, 17). The target frame number should be a multiple of 8.
python inference.py --prompt "YOUR PROMPT" --conditioning_media_paths VIDEO_PATH --conditioning_start_frames START_FRAME --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED --pipeline_config configs/ltxv-13b-0.9.7-dev.yaml
Multi-Condition Video Generation:
Provide paths to images or video segments and their target frames.
python inference.py --prompt "YOUR PROMPT" --conditioning_media_paths PATH_1 PATH_2 --conditioning_start_frames FRAME_1 FRAME_2 --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED --pipeline_config configs/ltxv-13b-0.9.7-dev.yaml
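For scripted batch runs, the commands above can be assembled programmatically. This is a minimal sketch with hypothetical prompt and file names; the default dimensions follow the constraints stated below (divisible by 32, frame count a multiple of 8 plus 1):

```python
import subprocess

def build_ltxv_cmd(prompt, media_paths, start_frames,
                   height=704, width=1216, num_frames=121, seed=42):
    """Assemble the multi-condition inference.py invocation as an argv list."""
    cmd = ["python", "inference.py", "--prompt", prompt]
    cmd += ["--conditioning_media_paths", *media_paths]
    cmd += ["--conditioning_start_frames", *map(str, start_frames)]
    cmd += ["--height", str(height), "--width", str(width),
            "--num_frames", str(num_frames), "--seed", str(seed),
            "--pipeline_config", "configs/ltxv-13b-0.9.7-dev.yaml"]
    return cmd

# Hypothetical inputs: two conditioning images at frames 0 and 64.
cmd = build_ltxv_cmd("a sailboat gliding across a calm bay at dawn",
                     ["start.png", "mid.png"], [0, 64])
# subprocess.run(cmd, check=True)  # uncomment to launch an actual run
print(" ".join(cmd))
```

Building the argv list (rather than a shell string) avoids quoting problems when prompts contain spaces or punctuation.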
Diffusers Integration
LTXV 13B can also be used with the Hugging Face Diffusers library. Refer to the official Diffusers documentation for details on how to integrate and use LTXV models within their pipelines.
Prompt Engineering for Best Results
Effective prompting is key to unlocking LTXV 13B’s potential:
- Focus on detailed, chronological descriptions of actions and scenes.
- Include specific movements, appearances, camera angles, and environmental details in a single paragraph.
- Start directly with the action; keep descriptions literal and precise.
- Aim for around 200 words for your prompts.
- Structure: Main action -> specific movements/gestures -> character/object appearance -> background/environment -> camera angles/movements -> lighting/colors -> changes/sudden events.
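The structure above can be enforced with a small helper that joins the pieces into one paragraph in the recommended order. This is a sketch with an invented example prompt, not an official tool:

```python
def build_prompt(main_action, movements, appearance, environment,
                 camera, lighting, changes=""):
    """Join prompt pieces in the recommended order:
    action -> movements -> appearance -> environment -> camera -> lighting -> changes."""
    parts = [main_action, movements, appearance, environment, camera, lighting, changes]
    return " ".join(p.strip() for p in parts if p.strip())

p = build_prompt(
    "A woman runs along a rain-soaked boardwalk,",
    "her arms pumping steadily as she leaps over a shallow puddle.",
    "She wears a yellow raincoat with the hood down.",
    "Weathered wooden planks stretch toward a grey, restless sea.",
    "The camera tracks alongside her at waist height.",
    "Soft overcast light mutes the colors.",
)
print(len(p.split()))  # check the word count; aim for roughly 200 in practice
```

Keeping each field to one or two literal sentences makes it easy to iterate on a single aspect (say, the camera move) without rewriting the whole prompt.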
Understanding Key Parameters
- Resolution Preset: Higher resolutions for detail, lower for speed. Width and height should be divisible by 32, and the number of frames should be a multiple of 8 plus 1 (e.g., 121).
- Seed: Save and reuse seed values to recreate styles.
- Guidance Scale (CFG): Recommended values are typically 3-3.5.
- Inference Steps: More steps (e.g., 40+) for higher quality, fewer (20-30) for faster generation.
- Automatic Prompt Enhancement: When using inference.py, short prompts can be automatically enhanced by a language model. This can also be enabled in the LTXVideoPipeline by setting enhance_prompt=True.
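The dimension rules in the list above are easy to get wrong by hand. A small sketch of helpers that snap arbitrary values down to the nearest valid ones (these are illustrative utilities, not part of the LTX-Video codebase):

```python
def snap_dimension(x: int) -> int:
    """Round a width or height down to the nearest multiple of 32."""
    return max(32, (x // 32) * 32)

def snap_num_frames(n: int) -> int:
    """Round a frame count down to the nearest valid value: a multiple of 8, plus 1."""
    return max(9, ((n - 1) // 8) * 8 + 1)

print(snap_dimension(720), snap_num_frames(120))  # 704 113
```

For example, a requested 720p width snaps to 704, and 120 frames snaps to 113, both of which the pipeline will accept.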

Evolution of LTXVideo: Building Up to LTXV 13B
The LTXV 13B model didn’t appear overnight. It’s the culmination of continuous development and refinement, building on previous versions of LTXVideo. Each release has introduced new features and improvements, paving the way for this powerful 13 billion parameter model.
From LTXVideo 0.9.5 and 0.9.6
Earlier versions like LTXVideo 0.9.5 (released March 5th, 2025) brought improved quality, support for higher resolutions, frame conditioning, and enhanced prompt understanding. It also introduced a commercial license.
The LTXVideo 0.9.6 release (April 17th, 2025) further pushed quality and speed, introducing a distilled model for rapid iteration (LTXV 0.9.6 Distilled). This version also saw the introduction of the STGGuiderAdvanced node, optimizing CFG and STG parameters across diffusion steps for superior quality. The default resolution and FPS were also increased to 1216 × 704 pixels at 30 FPS.
Key Technical Updates Along the Way
Several technical advancements have been crucial. The STGGuiderAdvanced node allows for nuanced control over generation parameters. Frame and sequence conditioning, introduced in 0.9.5, enabled interpolation between frames and video extension from various points. A Prompt Enhancer node was also added to help users generate prompts optimized for the model’s best performance.
The integration of LTXTricks code into the main ComfyUI-LTXVideo repository consolidated tools and ensured continued maintenance. These iterative improvements in prompt understanding, motion quality, and artifact reduction have all contributed to the capabilities of the model.
Training and Fine-tuning LTXV 13B for Custom Needs
One of the most powerful aspects of the LTXV ecosystem is the ability to fine-tune models for specific purposes. The LTX-Video-Trainer, available on GitHub, provides comprehensive tools for training LoRAs or even fine-tuning the entire LTXV 13B model on custom datasets. This allows for unparalleled customization.
The LTX-Video-Trainer on GitHub
The LTX-Video-Trainer repository is your go-to resource for custom training. It supports training LoRAs on top of LTXV 13B and fine-tuning the full model. It also includes essential utilities for dataset preparation, such as video captioning and scene splitting. Importantly, when training with LTXV 13B, gradient checkpointing must be enabled due to its size.
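The trainer is driven by a Pydantic-based configuration. The shape of such a config can be sketched roughly as below; the field names here are hypothetical stand-ins, so consult the example configurations in the LTX-Video-Trainer repository for the real schema:

```python
from dataclasses import dataclass

# Illustrative stand-in for the trainer's Pydantic config; every field
# name here is hypothetical -- see the repository's example configs.
@dataclass
class LoraTrainingConfig:
    model_path: str = "ltxv-13b-0.9.7-dev.safetensors"
    lora_rank: int = 64
    learning_rate: float = 1e-4
    gradient_checkpointing: bool = True  # must stay enabled for the 13B model

cfg = LoraTrainingConfig()
assert cfg.gradient_checkpointing, "13B training requires gradient checkpointing"
print(cfg)
```

Whatever the exact schema, the key constraint from the trainer's documentation carries over: with the 13B model, gradient checkpointing must be switched on or training will not fit in memory.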
Preparing Your Dataset
Proper dataset preparation is crucial for successful training. The workflow typically involves:
- Splitting Scenes: Long videos can be split into shorter, coherent scenes using split_scenes.py.
- Captioning Videos: If your videos lack captions, caption_videos.py can generate them using vision-language models.
- Preprocessing Dataset: preprocess_dataset.py computes and caches video latents and text embeddings, significantly speeding up training and reducing GPU memory usage.
Resolution buckets are used to organize videos, though currently, the trainer supports a single resolution bucket. Dimensions must adhere to LTX-Video’s VAE architecture (spatial dimensions divisible by 32, frames multiple of 8 plus 1).
Running the Trainer and Using LoRAs
Once your dataset is prepped and your training configuration is set (using Pydantic models), you can run the trainer. Example configurations for LTXV 13B LoRA training are provided. The run_pipeline.py script offers a streamlined way to automate the entire workflow from raw videos to a trained LoRA.
After training, your LoRA weights can be converted to ComfyUI format using scripts/convert_checkpoint.py. These can then be loaded in ComfyUI using the “Load LoRA” node to apply your custom effects or styles to LTXV 13B generations. Example LoRAs like “Cakeify” and “Squish” showcase the potential of this fine-tuning capability.
The Future is Bright with LTXV 13B
The release of LTXV 13B is a landmark event in AI video generation. Its combination of superior quality, remarkable speed, advanced controls, and open-source accessibility (including commercial use) positions it as a transformative tool. For creators, developers, and businesses, LTXV 13B opens up new frontiers for visual storytelling and content creation.
With ongoing development, community contributions, and the ease of fine-tuning, the LTXV 13B model is not just a powerful tool today but a platform for future innovation. We encourage everyone to explore its capabilities, contribute to its growth, and see how LTXV 13B can elevate their video projects. The journey of AI video is accelerating, and LTXV 13B is undoubtedly in the driver’s seat for many exciting developments to come.