Daily Papers

by AK and the research community

Aug 12

Submitted by akhaliq

VITA: Towards Open-Source Interactive Omni Multimodal LLM

15 authors

Submitted by akhaliq

Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2

10 authors

Submitted by xhyandwyy

mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models

9 authors

Submitted by akhaliq

UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling

6 authors

Submitted by akhaliq

ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities

12 authors

Submitted by akhaliq

Kalman-Inspired Feature Propagation for Video Face Super-Resolution

3 authors

Submitted by IAMJB

BRAT: Bonus oRthogonAl Token for Architecture Agnostic Textual Inversion

1 author

Submitted by akhaliq

MulliVC: Multi-lingual Voice Conversion With Cycle Consistency

9 authors

Submitted by IAMJB

MooER: LLM-based Speech Recognition and Translation Models from Moore Threads

8 authors

Submitted by IAMJB

Generating novel experimental hypotheses from language models: A case study on cross-dative generalization

2 authors