Note: Prof. Hung-yi Lee's lecture notes on the fundamentals of diffusion models have been added. They give useful background for the AI By Hand (aibyhand) walkthrough of Sora’s Diffusion Transformer.

SORA Diffusion Transformer AI By Hand Step by Step

Objective: Generate a video from a text prompt and a series of diffusion steps.

[1] Inputs:

[2] Video Patching:
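
As a rough illustration of what patching means here, the sketch below cuts a small video tensor into non-overlapping spacetime patches and flattens each patch into a vector. The (T, H, W, C) layout, the function name `patchify`, and the patch sizes are assumptions chosen for the example, not values taken from the walkthrough.

```python
import numpy as np

def patchify(video, pt, ph, pw):
    """Split a (T, H, W, C) video into flattened spacetime patches.

    pt, ph, pw are the (assumed) patch sizes along time, height, and width.
    Returns an array of shape (num_patches, pt * ph * pw * C).
    """
    T, H, W, C = video.shape
    assert T % pt == 0 and H % ph == 0 and W % pw == 0, "sizes must divide evenly"
    # Separate each axis into (number of patches, size within a patch).
    x = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    # Move the patch-index axes to the front and the within-patch axes to the back.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    # One row per patch, with all values inside the patch flattened.
    return x.reshape(-1, pt * ph * pw * C)

# Toy video: 4 frames of 4x4 pixels with a single channel.
video = np.arange(4 * 4 * 4, dtype=np.float32).reshape(4, 4, 4, 1)
patches = patchify(video, pt=2, ph=2, pw=2)
print(patches.shape)  # (8, 8)
```

Each row of `patches` can then be treated as one token for the later steps.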

[3] Visual Encoder:

[4] Noise Addition:
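
For reference, noise addition in a standard DDPM-style diffusion model can be sketched as below. This is a minimal example under assumptions: the linear beta schedule, its default values, and the helper name `add_noise` are illustrative and are not taken from the walkthrough, which may add noise in a simpler hand-calculated way.

```python
import numpy as np

def add_noise(latent, t, num_steps=1000, beta_start=1e-4, beta_end=0.02, rng=None):
    """Noise a latent at diffusion step t (0-indexed), DDPM forward process.

    Assumes a linear beta schedule; returns (noised_latent, noise).
    """
    rng = np.random.default_rng() if rng is None else rng
    betas = np.linspace(beta_start, beta_end, num_steps)
    # Cumulative product of (1 - beta) gives the fraction of signal kept at step t.
    alpha_bar = np.cumprod(1.0 - betas)[t]
    noise = rng.standard_normal(latent.shape)
    noised = np.sqrt(alpha_bar) * latent + np.sqrt(1.0 - alpha_bar) * noise
    return noised, noise

# Toy latent: 8 patch tokens with 4 features each (made-up sizes).
latent = np.ones((8, 4))
noised, noise = add_noise(latent, t=500, rng=np.random.default_rng(0))
print(noised.shape)  # (8, 4)
```

A larger `t` keeps less of the original latent and more of the sampled noise, which is why the model is also told which diffusion step it is denoising.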