[NeurIPS 2025] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
-
Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations
Paper • 2506.18898 • Published • 34
-
Tar
48 • Unified MLLM with Text-Aligned Representations
-
Tar
3 • Unified MLLM with Text-Aligned Representations
-
Tar
60 • Unified MLLM with Text-Aligned Representations