sudoingX

Sudo su ✓

29.4K followers
5 tweets
Communities: x/LocalLLaMA, Hermes Agent
| # | Tweet | Community Topic | Views | Ratio | Engagement | Posted |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | [image] okay let me say this out loud again. if you want to run local models on a single RTX 3090, your best option right now is qwen 3.5 27B dense Q4_K_M. 35 tok/s, flat from 4K to 300K+ context, zero speed degradation. thinking mode works. 262K native context on 24GB. slower than MoE | x/LocalLLaMA | 27.9K | 1.4x | 650 | Mar 28 |
| 2 | [text] hermes agent is already the best on local models. but i'm working on more edges to make it fly even harder. before that, if your agent keeps crashing on local inference here's what to check: > max_turns: default is tuned for fast frontier models. bump from 30 to 50. slow local | Hermes Agent | 21.0K | 0.7x | 420 | May 16 |
| 3 | [image] are you on v0.5.0 too? | Hermes Agent | 13.0K | 0.6x | 287 | Mar 29 |
| 4 | [image] teknium just shipped 7 pluggable memory providers. this is massive. your agent can now remember you across sessions with the backend YOU choose. run 'hermes update' right now and then 'hermes memory setup' to pick your provider. if you're on local only, holographic uses SQLite | Hermes Agent | 11.3K | 0.5x | 109 | Apr 3 |