Edit Existing Video with Text Prompts — AI Video-to-Video Editor
Re-shooting is expensive. On Kling AI Video, Runway Gen-4 Aleph solves the problem differently: upload your existing clip, type a text instruction, and the model transforms the footage while preserving the original camera motion and subject movement across every frame. This is video-to-video editing — not a filter, not a color grade. Gen-4 Aleph builds a spatial map of your scene before applying any transformation, so a prompt like 'heavy snowfall, night time' doesn't just tint the frame — it shifts light direction, adds surface accumulation, and makes roads reflective across all frames simultaneously. Input: MP4 or WebM up to 16 MB; only the first 5 seconds are processed. Output: 6 aspect ratios (16:9, 9:16, 4:3, 3:4, 1:1, 21:9). Premium plan required.
What Makes This an AI Video Editor — Not a Filter
Traditional video editing software — Premiere Pro, DaVinci Resolve, Final Cut — requires manual operation at the clip or pixel level: masks, keyframes, compositing layers, LUT application. Runway Gen-4 Aleph inverts that workflow entirely. Instead of operating on pixels, it builds an internal representation of the scene: object boundaries, depth layers, surface normals, light source positions, and the camera trajectory across all frames. Your text prompt then specifies what changes and what stays. The model rewrites the relevant elements and regenerates every frame in a single pass, maintaining temporal coherence without manual keyframing.
The distinction from text-to-video generation is fundamental: text-to-video creates footage from nothing using only a text prompt. This video editor takes your existing footage as input and modifies it according to your instruction — subject movement, original camera path, and scene structure all carry over into the output. The first 5 seconds of your clip (approximately 120 to 150 frames at standard frame rates) are analyzed and transformed. Files must be MP4 or WebM, maximum 16 MB. An optional reference image can anchor a specific color palette or art style when text description alone is ambiguous.
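The 5-second processing window translates directly into a frame budget. A minimal sketch of that arithmetic — the 5-second limit comes from the platform spec above, while the frame rates listed are common standards used for illustration, not a documented list of supported rates:

```python
# Frame budget inside Gen-4 Aleph's 5-second processing window.
# PROCESSING_WINDOW_S reflects the platform spec; the fps values
# below are common standards, not platform guarantees.
PROCESSING_WINDOW_S = 5

def processed_frames(fps: float) -> int:
    """Number of frames analyzed for a clip recorded at `fps`."""
    return int(PROCESSING_WINDOW_S * fps)

for fps in (24, 25, 30):
    print(f"{fps} fps -> {processed_frames(fps)} frames")
```

At 24 fps that is 120 frames and at 30 fps it is 150, matching the "approximately 120 to 150 frames" range stated above.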
What Gen-4 Aleph Can Edit
Four transformation domains — each applied with full temporal consistency across all processed frames.
Camera Angle and Viewpoint Synthesis
Generate camera angles that were never physically filmed. Describe a low-angle push-in, a reverse shot, or a close-up, and Gen-4 Aleph synthesizes that viewpoint while keeping subject lighting and motion intact. You can also extend a shot past its original cut point, or transplant the camera movement from one clip onto a different scene entirely.
- Synthesize alternate viewpoints from a single input angle — reverse shots, low angles, or close-ups that were never filmed
- Extend shots beyond their original cut point with motion-matched continuity for timeline edits
- Transfer camera movement from one clip to a completely different scene
Object Insertion, Removal, and Replacement
Add objects that were not in the original scene with matched lighting direction, shadow angle, and surface reflections. Remove objects and get a temporally stable background fill — not a static clone stamp but a fill that holds consistent across all frames. Retexture existing surfaces: swap wood for marble, matte for glossy, bare wall for branded signage.
- Insert objects with matched lighting direction, shadow angle, reflections, and perspective
- Remove objects and fill backgrounds with temporal consistency across every processed frame
- Retexture existing surfaces, or separate foreground subjects from complex backgrounds without a green screen
Scene and Atmosphere Transformation
Swap the season, weather, time of day, or location backdrop without touching the foreground. The model identifies ground planes, sky regions, and architectural surfaces before applying the environment change, so 'midnight with street lamps' recalculates shadow direction and adds artificial light blooms rather than simply darkening the frame. Surface interactions are physically grounded: rain makes roads reflective; snow accumulates on horizontal ledges.
- Change season and weather — rain, snow, fog, sun — with physically grounded surface interactions
- Relight scenes for different times of day with recalculated directional shadows
- Replace the background environment while preserving foreground motion, silhouette, and parallax
Visual Style and Artistic Transfer
Shift the footage into a different visual register — film noir, anime line-art, oil painting, 1970s grain — while preserving subject edges and motion paths. Attach a reference image (film still, painting, brand mood board) and Gen-4 Aleph extracts its color palette, texture density, and tonal range, applying those characteristics across every frame more precisely than a written style description alone.
- Transfer a reference image's color palette, texture, and grain frame by frame
- Apply artistic looks such as anime, oil painting, sketch, or film-stock emulation while preserving motion
- Use prompt-based relighting and selective style changes without rebuilding the scene geometry
Technical Specifications
Input requirements, output formats, and platform constraints for Runway Gen-4 Aleph video editing.
- Input formats: MP4 and WebM — maximum file size 16 MB per clip
- Processing window: first 5 seconds of the uploaded clip only (trim to target segment before uploading)
- Output aspect ratios: 16:9, 9:16, 4:3, 3:4, 1:1, 21:9 — chosen at generation time, independent of input ratio
- Optional reference image (JPEG or PNG) for precise color palette and style anchoring
- Optional seed value for reproducible output across iterated prompt variations
- Requires an active Premium plan account to generate edited output
How to Edit Video Online with Gen-4 Aleph
Three steps — upload a clip, write the transformation, generate — no software to install.
1. Upload Your Source Clip
Select an MP4 or WebM file under 16 MB. Only the first 5 seconds are processed, so trim the clip to the exact segment you want transformed before uploading. Footage with stable composition, consistent lighting, and a clear foreground subject produces the most precise transformations.
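The format and size limits above can be checked locally before uploading. A minimal sketch in Python — the 16 MB cap and the MP4/WebM whitelist come from the spec on this page, while the function name, the MiB interpretation of "16 MB", and the error messages are my own assumptions:

```python
import os

MAX_BYTES = 16 * 1024 * 1024            # 16 MB cap (interpreted here as MiB)
ALLOWED_EXTENSIONS = {".mp4", ".webm"}  # MP4 and WebM only

def validate_clip(path: str) -> list[str]:
    """Return a list of problems; an empty list means the clip should upload cleanly."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format {ext!r}; use MP4 or WebM")
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file exceeds the 16 MB upload limit")
    return problems
```

Trimming to the segment you want processed can be done with ffmpeg before running the check, e.g. `ffmpeg -i in.mp4 -t 5 -c copy out.mp4` to keep only the first 5 seconds without re-encoding.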
2. Write Your Transformation Prompt
Type a text instruction describing what should change: 'switch the background to a rainy Tokyo street at night,' 'remove the trash can near the fence,' or 'convert to vintage 16mm film grain with warm orange tones.' Optionally attach a reference image to lock in a visual target. Choose your output aspect ratio from the six available options.
3. Generate and Refine
Processing typically finishes within minutes. Review the result and adjust the prompt incrementally — changing one variable at a time produces more predictable refinements than rewriting the whole prompt. Lock the seed parameter when you want to isolate the effect of a single word change.
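The one-variable-at-a-time refinement loop can be made mechanical: generate one request per prompt variant while pinning the seed, so any output difference is attributable to the single phrase that changed. A sketch only — the request fields mirror the options described on this page (prompt, seed, aspect ratio), not an actual Runway or Kling API schema:

```python
def build_refinements(base_prompt: str, variants: dict[str, str],
                      seed: int, aspect_ratio: str = "16:9") -> list[dict]:
    """Build one generation request per single-phrase substitution.

    `variants` maps a phrase in `base_prompt` to its replacement; each
    request changes exactly one phrase while holding the seed constant,
    isolating the effect of that change on the output.
    """
    requests = []
    for old, new in variants.items():
        requests.append({
            "prompt": base_prompt.replace(old, new),
            "seed": seed,              # fixed seed: isolates the prompt change
            "aspect_ratio": aspect_ratio,
        })
    return requests

reqs = build_refinements(
    "heavy snowfall, night time",
    {"night time": "dusk", "heavy": "light"},
    seed=1234,
)
```

Each dict in `reqs` differs from the base prompt by exactly one phrase, so comparing its output against the baseline shows what that phrase alone contributed.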
How to Write Effective Editing Prompts
The text prompt is the only editing interface — there are no sliders, brush tools, or masks. Prompt structure directly determines output quality, so investing a few seconds in phrasing pays off at generation time.
- Address one transformation per prompt: 'change to dusk lighting' and 'add fog rolling in' produce sharper results as two separate generations than one combined instruction
- Identify objects with spatial specificity: 'remove the red neon sign mounted above the entrance door' resolves more reliably than 'take out the sign'
- Describe the visual outcome, not the edit operation: write 'late afternoon sun casting long diagonal shadows' rather than 'apply golden hour color grade'
- Use a reference image when style is hard to name: attaching a film still anchors color temperature and grain more reliably than adjective-heavy text descriptions
- Hold the seed constant across prompt variations when you want to isolate what a single word change does to the output
Video Editing Use Cases
Edit footage once — generate as many variants as the project demands, without re-shooting.
Footage Style Conversion
Transform a clean digital capture into vintage super-8, hand-drawn animation, or gritty black-and-white documentary style. Gen-4 Aleph separates scene structure (edges, depth, motion) from surface appearance (color, texture, grain) and replaces the latter while keeping the former. Use a reference image from a specific film or artwork to anchor an exact aesthetic rather than describing it in words.
Weather and Atmosphere Changes
Convert a clear daytime exterior into a storm-soaked night scene for dramatic effect, or invert that — take stormy raw footage and clear the sky for a brand-appropriate sunny version. Surface interaction is physically grounded: puddles appear on horizontal surfaces, snow sits on ledges, fog attenuates background contrast. No particle overlays, no manual compositing.
Marketing Asset Versioning
Take a single approved product video and generate seasonal variants — the same clip in a summer beach setting, an autumn forest, and a winter interior — without re-booking talent, locations, or production crew. Each variant takes minutes. Run A/B tests across different visual moods by changing one prompt element per generation, measuring engagement before committing to the final creative.
Pre-Production Scene Prototyping
Directors and cinematographers can test visual concepts against reference footage before a live shoot. Upload a rough test clip and apply the intended grading style, weather conditions, and atmosphere to confirm the look holds up before committing crew and equipment. Changes that would otherwise require a second shoot become a prompt edit.
Background and Location Replacement
Shoot in a convenient indoor location, then swap the background for a mountain vista, a city rooftop, or a minimalist white studio — without a green screen. The model preserves the foreground subject's silhouette, motion path, and lighting while rebuilding the environment behind them. Useful for product videos, interviews, and travel content filmed in constrained locations.
In-Scene Object Manipulation
Add a prop that was forgotten on set, erase a logo from the background to clear usage rights, or replace generic furniture with branded products — all through a text description. Gen-4 Aleph matches inserted objects to the scene's existing lighting direction and perspective, eliminating the compositing workflow that would normally be required for this class of edit.
Limitations to Know Before Generating
Gen-4 Aleph processes only the first 5 seconds of a clip and requires MP4 or WebM format under 16 MB — trim your source footage to the relevant segment before uploading. Scenes with many independently moving objects in the same frame reduce transformation fidelity: the model allocates attention across all detected elements, so compositions with fewer competing subjects yield sharper results. When the initial output is directionally correct but not precise, refine the prompt incrementally rather than replacing it entirely.
Certain transformation types are outside the model's reliable range: small on-screen text and specific brand logos often blur or distort, physics-violating instructions produce inconsistent results, and combining two distinct edits in a single prompt (e.g., 'add a dog and change the weather') frequently degrades both. Reference images resolve style ambiguity more reliably than dense text descriptions — if a written prompt produces an unexpected visual direction, attach a reference image to anchor the target aesthetic.
Video Editor FAQ
Specific answers about Gen-4 Aleph capabilities, input requirements, and prompt strategy.