LLM Archives

Llama 3 Benchmark Across Various GPU Types

Update: Looking for Llama 3.1 70B GPU Benchmarks? Check out our blog post on Llama 3.1 70B Benchmarks. On April 18, 2024, the AI community…

Open Source LLMs gain ground on proprietary models

Recently, there have been a few posts about how open-source models like Llama 3 are catching up to the performance level of some proprietary models.…

Best Llama 3 Inference Endpoint – Part 2

In Part 1, we looked at how tools like Ollama, LM Studio, and Text Generation WebUI perform as an inference endpoint for Llama 3 –…

Best Llama 3 Inference Endpoint – Part 1

With the exciting launch of Meta’s Llama 3 LLM, we were curious about which application would be the best to serve Llama 3 as an…

Leverage Hugging Face’s TGI to Create Large Language Models (LLMs) Inference APIs – Part 2

Introduction – Multiple LLM APIs: If you haven’t already, go back and read Part 1 of this series. In this guide we take a look…

Leverage Hugging Face’s TGI to Create Large Language Models (LLMs) Inference APIs – Part 1

Introduction: Are you interested in setting up an inference endpoint for one of your favorite models? Have you been wanting to leverage the full unquantized…