Skywork UniPic 2.0: Building Kontext Model with Online RL for Unified Multimodal Model • Paper • 2509.04548 • Published Sep 4 • 4
Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model • Paper • 2508.13009 • Published Aug 18 • 25
Transition Models: Rethinking the Generative Learning Objective • Paper • 2509.04394 • Published Sep 4 • 28
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models • Paper • 2506.21356 • Published Jun 26 • 22
MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs • Paper • 2503.23022 • Published Mar 29 • 6
SparseFlex: High-Resolution and Arbitrary-Topology 3D Shape Modeling • Paper • 2503.21732 • Published Mar 27 • 9
UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation • Paper • 2312.08754 • Published Dec 14, 2023 • 11
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models • Paper • 2502.06608 • Published Feb 10 • 40
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model • Paper • 2410.13925 • Published Oct 17, 2024 • 24
GVGEN: Text-to-3D Generation with Volumetric Representation • Paper • 2403.12957 • Published Mar 19, 2024 • 6