GPU vs CPU

Often considered the “brain” of a computer, processors interpret and execute programs and tasks. In [...]

One-Click Easy Install of ComfyUI

ComfyUI provides users with a simple yet effective graph/nodes interface that streamlines the creation and [...]

Open Source LLMs gain ground on proprietary models

Recently, there have been a few posts about how open-source models like Llama 3 are [...]

Best Llama 3 Inference Endpoint – Part 2

Considerations Testing Scenario Startup Commands Token/Sec Results vLLM4xA600014.7 tokens/sec14.7 tokens/sec15.2 tokens/sec15.0 tokens/sec15.0 tokens/secAverage token/sec 14.92 [...]

Best Llama 3 Inference Endpoint – Part 1

Considerations Testing Scenario Results Conclusion

Leverage Hugging Face’s TGI to Create Large Language Models (LLMs) Inference APIs – Part 2

Introduction – Multiple LLM APIs If you haven’t already, go back and read Part 1 [...]

Leverage Hugging Face’s TGI to Create Large Language Models (LLMs) Inference APIs – Part 1

Introduction Are you interested in setting up an inference endpoint for one of your favorite [...]

AutoGen with Ollama/LiteLLM – Setup on Linux VM

In the ever-evolving landscape of AI technology, Microsoft continues to push the boundaries with groundbreaking [...]