The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance on a wide variety of hardware - locally and in the cloud.

Typical run using LLaMA v2 13B on M2 ...
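The example output itself is elided above, but a minimal sketch of such an invocation is shown below. The model path and prompt are placeholders: substitute whatever quantized GGUF file you have locally.

```sh
# Run inference with a local quantized GGUF model (path is a placeholder)
./llama-cli -m ./models/llama-2-13b.Q4_K_M.gguf \
    -p "Building a website can be done in 10 simple steps:" \
    -n 128
```

Here `-m` selects the model file, `-p` supplies the prompt, and `-n` caps the number of tokens to generate.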