Local LLM quick performance test

==============================
write me a thousand word story
==============================

LM Studio 0.3.5

A: $400   HP Envy x360, Ryzen 5 8640HS, 16GB 6400MT/s, iGPU 760M
B: $700   ASUS TUF A16, Ryzen 7 7735HS, 16GB, iGPU 680M + AMD RX 7700S 8GB
C: $600   Mac mini M4, 16GB, 256GB
D: $800   HP Victus 16.1", Ryzen 7 8845HS, 16GB (48GB) 5600MT/s, NVIDIA 4070 8GB 105-120W
E: $1500  MacBook Pro 14, M3 Pro 11 core, 18GB, 512GB

============
Llama 3.1 8B
============
                     A       B       C       D       E       E (mlx 4bit)
tok/sec              5.93    41.54   18.68   35.01   24.04   28.62
tokens               1466    1238    1493    1472    1530    1321
time to first token  5.08s   0.28s   0.39s   0.27s   0.40s   1.19s

============
Llama 3.2 3B
============
                     A       B       C       D
tok/sec              9.92    53.18   25.61   45.39
tokens               1616    1508    1336    1482
time to first token  0.88s   0.46s   0.13s   0.41s

=====================
Mistral Nemo 2407 12B
=====================
A B C D
11.45 tok/sec          10.35 tok/sec
963 tokens             861 tokens
0.46s to first token   0.43s
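
The Llama 3.1 8B figures above can be turned into a rough wall-clock estimate per run (time to first token plus tokens divided by tok/sec) and a speed ratio against machine A. This is only a sketch: it assumes the reported tok/sec does not include the time to first token, which the notes do not state, and the names `Run`, `estimated_seconds` and `speedup_vs` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Run:
    machine: str        # machine label from the list above
    tok_per_sec: float  # reported generation speed
    tokens: int         # reported tokens generated
    ttft_s: float       # reported time to first token, in seconds


# Llama 3.1 8B results copied from the table above.
LLAMA_31_8B = [
    Run("A", 5.93, 1466, 5.08),
    Run("B", 41.54, 1238, 0.28),
    Run("C", 18.68, 1493, 0.39),
    Run("D", 35.01, 1472, 0.27),
    Run("E", 24.04, 1530, 0.40),
    Run("E (mlx 4bit)", 28.62, 1321, 1.19),
]


def estimated_seconds(run: Run) -> float:
    """Estimate total time, assuming tok/sec excludes the first-token wait."""
    return run.ttft_s + run.tokens / run.tok_per_sec


def speedup_vs(runs: list[Run], baseline: str) -> dict[str, float]:
    """Ratio of each machine's tok/sec to the baseline machine's tok/sec."""
    base = next(r for r in runs if r.machine == baseline)
    return {r.machine: r.tok_per_sec / base.tok_per_sec for r in runs}


if __name__ == "__main__":
    ratios = speedup_vs(LLAMA_31_8B, "A")
    for run in LLAMA_31_8B:
        print(
            f"{run.machine:<14} ~{estimated_seconds(run):6.1f}s total, "
            f"{ratios[run.machine]:4.1f}x the speed of A"
        )
```

Because each run produced a different number of tokens, the tok/sec ratio is the fairer comparison; the total-time estimate mostly shows what the wait feels like for a story of this length.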