Impact of updated NVIDIA drivers on vLLM & HuggingFace TGI

If you are building a service that relies on LLM inference performance, you want to [...]

Want to build a custom ChatGPT? The top cloud computing platforms to create your own LLM

Want to Build a Custom ChatGPT? Here’s How to Pick the Best Cloud Computing [...]

What is Hacktoberfest and is it good for beginners?

Hacktoberfest is an annual event that runs throughout October, celebrating open-source projects and the community [...]

What is AI inference vs. AI training?

When it comes to Artificial Intelligence (AI) models, there are two key processes that allow [...]

What is generative AI and how can I use it?

While it may seem that AI is a recent phenomenon, the field of artificial intelligence [...]

Advantages of Cloud GPUs for AI Development

Artificial Intelligence (AI) development has become a cornerstone of innovation across numerous industries, from healthcare [...]

Llama 3.1 Benchmark Across Various GPU Types

Figure: Generated from our Art VM Image using Invoke AI. Previously we performed some benchmarks [...]

Maximizing AI efficiency: Insights into model merging techniques

What’s a great piece of advice when you venture on a creative path that requires [...]

NVIDIA Research predicts what’s next in AI, from better weather predictions to digital humans

NVIDIA’s CEO, Jensen Huang, revealed some of the most exciting technological innovations during his keynote [...]

How do I start learning about LLM? A beginner’s guide to large language models

In the era of Artificial Intelligence (AI), Large Language Models (LLMs) are redefining our interaction [...]

Llama 3 Benchmark Across Various GPU Types

Update: Looking for Llama 3.1 70B GPU Benchmarks? Check out our blog post on Llama [...]

GPU vs CPU

Often considered the “brain” of a computer, processors interpret and execute programs and tasks. In [...]

One-Click Easy Install of ComfyUI

ComfyUI provides users with a simple yet effective graph/nodes interface that streamlines the creation and [...]

Open Source LLMs gain ground on proprietary models

Recently, there have been a few posts about how open-source models like Llama 3 are [...]

Best Llama 3 Inference Endpoint – Part 2

Considerations · Testing Scenario · Startup Commands · Token/Sec Results. vLLM on 4x A6000: 14.7, 14.7, 15.2, 15.0, and 15.0 tokens/sec across five runs, average token/sec 14.92 [...]
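
As a quick sanity check on the figures above, the quoted average follows directly from the five per-run numbers; the short sketch below simply reproduces that arithmetic (the values are copied from the excerpt, nothing else is assumed):

```python
# Reproduce the average token/sec quoted for vLLM on 4x A6000.
# The five per-run throughput figures are taken from the excerpt above.
runs = [14.7, 14.7, 15.2, 15.0, 15.0]  # tokens/sec per test run
average = sum(runs) / len(runs)
print(f"Average: {average:.2f} tokens/sec")  # prints 14.92
```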

Best Llama 3 Inference Endpoint – Part 1

Considerations · Testing Scenario · Results · Conclusion

Leverage Hugging Face’s TGI to Create Large Language Models (LLMs) Inference APIs – Part 2

Introduction – Multiple LLM APIs. If you haven’t already, go back and read Part 1 [...]

Leverage Hugging Face’s TGI to Create Large Language Models (LLMs) Inference APIs – Part 1

Introduction. Are you interested in setting up an inference endpoint for one of your favorite [...]
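
To give a sense of what such an endpoint looks like once TGI is running, here is a minimal sketch of a client request; the host, port, prompt, and generation parameters are illustrative assumptions, while the /generate route and JSON shape follow TGI's documented API:

```python
# Minimal sketch: query a running Text Generation Inference (TGI) endpoint.
# Assumes a TGI container is already serving a model at 127.0.0.1:8080
# (host, port, prompt, and parameters below are illustrative, not from the post).
import requests

payload = {
    "inputs": "Explain the difference between AI training and AI inference.",
    "parameters": {"max_new_tokens": 100, "temperature": 0.7},
}
resp = requests.post("http://127.0.0.1:8080/generate", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["generated_text"])
```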

AutoGen with Ollama/LiteLLM – Setup on Linux VM

In the ever-evolving landscape of AI technology, Microsoft continues to push the boundaries with groundbreaking [...]