Profiling Large Language Model Inference on Apple Silicon: A Quantization Perspective Paper • 2508.08531 • Published Aug 12, 2025 • 1
huihui-ai/Huihui-Qwen3.5-35B-A3B-abliterated Image-Text-to-Text • 36B • Updated 6 days ago • 21.7k • 196
Article IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST 18 days ago • 18
Article Building a Healthcare Robot from Simulation to Deployment with NVIDIA Isaac Oct 29, 2025 • 32
Optimizing LLMs Using Quantization for Mobile Execution Paper • 2512.06490 • Published Dec 6, 2025 • 1
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 16 days ago • 479
Article Performant local mixture-of-experts CPU inference with GPU acceleration in llama.cpp Jan 30 • 13