V 4mp4

The model can generate 204-frame videos (roughly 6.8 seconds at 30 fps) with realistic textures and motion.
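The clip length quoted above follows directly from the frame count and the frame rate. A minimal sketch of that arithmetic, where the helper name `clip_duration_seconds` is made up for illustration and is not part of any model tooling:

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Return the playback length of a clip in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# 204 frames played back at 30 frames per second.
duration = clip_duration_seconds(204, 30)
print(f"{duration:.1f} s")  # 6.8 s
```

At a lower playback rate the same 204 frames would last longer, so the 6-7 second figure only holds for 30 fps.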

The model is built on a 30-billion-parameter architecture designed for deep understanding of text prompts and for visual generation.
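To get a feel for what 30 billion parameters means in practice, the weights alone can be sized with a back-of-the-envelope calculation. The sketch below is an estimate under stated assumptions: the source does not say which numeric precision the model ships in, so the data types listed are illustrative, and the figure ignores activations, caches, and optimizer state.

```python
# Bytes needed to store one parameter at common precisions.
# Which precision the model actually uses is an assumption, not stated in the source.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Estimate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


params = 30_000_000_000  # 30 billion, as stated in the text
for dtype in ("fp32", "bf16", "int8"):
    print(f"{dtype}: {weight_memory_gb(params, dtype):.0f} GB")
```

Even at 2 bytes per parameter the weights come to about 60 GB, which is consistent with the claim that running the model needs substantial hardware.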

It uses bilingual text encoders, which give it strong performance on both English and Chinese prompts.

According to Neurohive, deploying or training this model requires substantial resources. The listed software environment is Linux as the operating system, Python 3.10.0+ with PyTorch 2.3-cu121, and the CUDA Toolkit and FFmpeg as dependencies.
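Before attempting an install, the software requirements listed above can be checked with a short script. This is a rough sketch that uses only the Python standard library; the function name `check_environment` is hypothetical, and looking for `nvcc` and `ffmpeg` on the `PATH` is a heuristic rather than an official check. It does not verify the exact PyTorch or CUDA versions.

```python
import importlib.util
import platform
import shutil
import sys


def check_environment() -> dict[str, bool]:
    """Report which of the listed prerequisites appear to be present."""
    return {
        "linux": platform.system() == "Linux",
        "python_3_10_plus": sys.version_info >= (3, 10),
        # find_spec locates the package without importing it.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        # Heuristic: the CUDA Toolkit normally puts the nvcc compiler on PATH.
        "nvcc_on_path": shutil.which("nvcc") is not None,
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
    }


for name, ok in check_environment().items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```

A "missing" entry only means the tool was not found in the usual place; for example, a CUDA Toolkit installed outside the `PATH` would be reported as missing.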

Step-Video-T2V (v 4mp4) is a state-of-the-art text-to-video AI model developed by Stepfun AI. As of early 2025, it has garnered attention for its ability to generate high-quality, long-duration videos, producing 204-frame clips with a high degree of fidelity.
