Category Archives: Uncategorized

GPU vs CPU

Often considered the “brain” of a computer, processors interpret and execute programs and tasks. In an ever-evolving tech landscape, it’s crucial to understand the instances in which various types of processors perform best. Below, let’s break down GPUs and CPUs, what they do and when to use one over the other for various workloads. What’s […]

One-Click Easy Install of ComfyUI

ComfyUI provides users with a simple yet effective graph/nodes interface that streamlines the creation and modification of image generation tasks. The nodes in this interface correspond to different components of the image generation process, including text prompts, image inputs, and various AI-powered filters and augmentations. By connecting these nodes, users can create complex and dynamic […]

Open Source LLMs gain ground on proprietary models

Recently, there have been a few posts about how open-source models like Llama 3 are catching up to the performance level of some proprietary models. Andrew Reed from Hugging Face created a visual progress tracker to compare various models. It clearly shows that open-source models are gaining ground. […]

Best Llama 3 Inference Endpoint – Part 2

Considerations / Testing Scenario / Startup Commands / Token/Sec Results

vLLM, 4xA6000: 14.7, 14.7, 15.2, 15.0, 15.0 tokens/sec (average 14.92 tokens/sec)
vLLM, 2xH100: 20.3, 20.5, 20.3, 21.0, 20.7 tokens/sec (average 20.56 tokens/sec)
Hugging Face TGI, 4xA6000: 12.38, 12.53, 12.60, 12.55, 12.33 tokens/sec (average 12.48 tokens/sec)
Hugging Face TGI, 2xH100: 21.29, 21.40, 21.50, 21.60, 21.41 tokens/sec (average 21.44 tokens/sec)

Purely looking at a token/sec result, Hugging Face TGI produces the most tokens/sec on […]
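For readers who want to check the averages, here is a minimal Python sketch that recomputes them from the five per-run figures listed in the excerpt. The dictionary layout and the names `runs` and `averages` are illustrative assumptions, not part of the original post or of either serving framework.

```python
from statistics import mean

# Per-run throughput in tokens/sec, copied from the excerpt.
# The (engine, hardware) key layout is an assumption made for illustration.
runs = {
    ("vLLM", "4xA6000"): [14.7, 14.7, 15.2, 15.0, 15.0],
    ("vLLM", "2xH100"): [20.3, 20.5, 20.3, 21.0, 20.7],
    ("Hugging Face TGI", "4xA6000"): [12.38, 12.53, 12.60, 12.55, 12.33],
    ("Hugging Face TGI", "2xH100"): [21.29, 21.40, 21.50, 21.60, 21.41],
}

# Average each configuration and round to two decimals, as the post reports them.
averages = {key: round(mean(values), 2) for key, values in runs.items()}

for (engine, hardware), avg in averages.items():
    print(f"{engine} on {hardware}: {avg:.2f} tokens/sec")
```

Keeping the raw runs next to the averages makes it easy to add further runs later or to compare the spread between runs, not just the mean.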