Zeqi Xiao¹, Wenqi Ouyang¹, Yifan Zhou¹, Shuai Yang², Lei Yang³, Jianlou Si³, Xingang Pan¹

¹S-Lab, Nanyang Technological University
²Wangxuan Institute of Computer Technology, Peking University
³SenseTime Research
*(Demo video: demo.mp4)*
## Installation

- Create a conda environment

  This codebase is tested with PyTorch 2.1.0+cu121.

  ```bash
  conda create -n trajattn python=3.10
  conda activate trajattn
  pip install -r requirements.txt
  ```
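  To sanity-check the environment, a quick one-liner (not part of the original setup) prints the installed PyTorch build and whether CUDA is visible:

  ```bash
  # Should print a 2.1.0+cu121 build and "True" on a CUDA-capable machine
  python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
  ```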
- Download model weights

  Download the model weights from Hugging Face.
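  A minimal sketch of fetching the weights with the `huggingface_hub` CLI; the repository ID below is a placeholder, so substitute the one linked on the project page:

  ```bash
  # Hypothetical repo ID -- replace with the actual weights repository
  huggingface-cli download ORG/trajectory-attention --local-dir checkpoints/
  ```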
- Clone relevant repositories and download checkpoints

  ```bash
  # Clone the Depth-Anything-V2 repository
  git clone https://github.com/DepthAnything/Depth-Anything-V2
  # Download the Depth-Anything-V2-Large checkpoint into checkpoints/
  mkdir -p checkpoints
  wget -O checkpoints/depth_anything_v2_vitl.pth "https://huggingface.co/depth-anything/Depth-Anything-V2-Large/resolve/main/depth_anything_v2_vitl.pth?download=true"
  # Overwrite run.py with this repository's version
  cp depth_anything/run.py Depth-Anything-V2/run.py
  ```

  Save the checkpoints to the `checkpoints/` directory. You can also modify the checkpoint path in the run scripts if needed.
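  After these steps, the layout should look roughly like this (assuming the default paths above):

  ```text
  .
  ├── checkpoints/
  │   └── depth_anything_v2_vitl.pth
  ├── depth_anything/
  │   └── run.py
  └── Depth-Anything-V2/
      └── run.py   # overwritten by depth_anything/run.py
  ```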
## Usage

To control camera motion on images, run:

```bash
sh image_control.sh
```

To control camera motion on videos, run:

```bash
sh video_control.sh
```

To perform video editing, run:

```bash
sh video_editing.sh
```
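These scripts are thin shell wrappers around the inference entry point. As a purely hypothetical sketch of what such a wrapper might contain (the actual entry point, flags, and paths may differ; inspect the `.sh` files for the real ones):

```bash
# Hypothetical wrapper -- the real image_control.sh may use different flags
python inference.py \
    --input examples/image.png \
    --checkpoint checkpoints/model.pth \
    --output outputs/
```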
## TODO List

- Release models and weights
- Release pipelines for single-image camera motion control
- Release pipelines for video camera motion control
- Release pipelines for video editing
## Citation

If you find our work helpful, please cite:

```bibtex
@inproceedings{xiao2025trajectory,
  title={Trajectory attention for fine-grained video motion control},
  author={Zeqi Xiao and Wenqi Ouyang and Yifan Zhou and Shuai Yang and Lei Yang and Jianlou Si and Xingang Pan},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=2z1HT5lw5M}
}
```
## Acknowledgements

- SVD: Our model is fine-tuned from SVD.
- MiraData: We use data collected by MiraData.
- Depth-Anything-V2: We estimate depth maps with Depth-Anything-V2.
- Unimatch: We estimate optical flow with Unimatch.
- CoTracker: We estimate point trajectories with CoTracker.
- NVS_Solver: Our camera rendering code is based on NVS_Solver.