Submitted by akhaliq 41 Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 · 10 authors 2
Submitted by xhyandwyy 33 mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models · 9 authors 2
Submitted by akhaliq 24 UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling · 6 authors 222 2
Submitted by akhaliq 18 ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities · 12 authors 244 4
Submitted by akhaliq 9 Kalman-Inspired Feature Propagation for Video Face Super-Resolution · 3 authors 3
Submitted by IAMJB 8 BRAT: Bonus oRthogonAl Token for Architecture Agnostic Textual Inversion · 1 author 3 2
Submitted by IAMJB 7 MooER: LLM-based Speech Recognition and Translation Models from Moore Threads · 8 authors 218 2
Submitted by IAMJB 5 Generating novel experimental hypotheses from language models: A case study on cross-dative generalization · 2 authors 1 1