VLM with text-driven GRPO training for vision-grounded decision making (https://arxiv.org/pdf/2503.16965, NeurIPS 2025)
- zhehuderek/textual_decisionmaking_data
  Viewer • 11k • 57 • 1
- zhehuderek/praxis_vlm_7b_decisionmaking
  Image-Text-to-Text • 8B • 1
- zhehuderek/praxis_vlm_3b_decisionmaking
  Image-Text-to-Text • 4B • 1
- zhehuderek/qwen2_5_vl_3b_GEOQA_8K_hf
  Image-Text-to-Text • 4B • 1