WNQzhu
Qlisp
AI & ML interests
None yet
Organizations
None yet
RL
- RL makes MLLMs see better than SFT
  Paper • 2510.16333 • Published • 49
- Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback
  Paper • 2510.16888 • Published • 22
- Reasoning with Sampling: Your Base Model is Smarter Than You Think
  Paper • 2510.14901 • Published • 48
- Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation
  Paper • 2510.21583 • Published • 31
attn
Long