Category Archives: Uncategorized

Open Source LLMs gain ground on proprietary models

Several recent posts have noted that open-source models such as Llama 3 are catching up to the performance of some proprietary models. Andrew Reed from Hugging Face created a progress tracker that visually compares various models. It clearly shows that open-source models are gaining ground. […]

Best Llama 3 Inference Endpoint – Part 2

Considerations, Testing Scenario, Startup Commands, Token/Sec Results

vLLM
4xA6000: 14.7, 14.7, 15.2, 15.0, 15.0 tokens/sec (average 14.92 tokens/sec)
2xH100: 20.3, 20.5, 20.3, 21.0, 20.7 tokens/sec (average 20.56 tokens/sec)

Hugging Face TGI
4xA6000: 12.38, 12.53, 12.60, 12.55, 12.33 tokens/sec (average 12.48 tokens/sec)
2xH100: 21.29, 21.40, 21.50, 21.60, 21.41 tokens/sec (average 21.44 tokens/sec)

Purely looking at the tokens/sec results, Hugging Face TGI produces the most tokens/sec on […]
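
The averages quoted in the results can be checked with a few lines of Python. This is only a sketch of the arithmetic: the per-run numbers are copied from the excerpt, while the names `runs` and `averages` are illustrative and not taken from the original post.

```python
from statistics import mean

# Per-run tokens/sec readings copied from the results in the excerpt.
# The dictionary layout and variable names are assumptions for illustration.
runs = {
    ("vLLM", "4xA6000"): [14.7, 14.7, 15.2, 15.0, 15.0],
    ("vLLM", "2xH100"): [20.3, 20.5, 20.3, 21.0, 20.7],
    ("Hugging Face TGI", "4xA6000"): [12.38, 12.53, 12.60, 12.55, 12.33],
    ("Hugging Face TGI", "2xH100"): [21.29, 21.40, 21.50, 21.60, 21.41],
}

# Arithmetic mean of the five runs for each server/hardware pair.
averages = {key: mean(values) for key, values in runs.items()}

for (server, hardware), avg in averages.items():
    print(f"{server:<16} {hardware:<8} {avg:.2f} tokens/sec")
```

Rounded to two decimals, these means should match the averages listed for each configuration.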