ComfyUI provides users with a simple yet effective graph/nodes interface that streamlines the creation and modification of image generation tasks. The nodes in this interface correspond to different components of the image generation process, including text prompts, image inputs, and various AI-powered filters and augmentations. By connecting these nodes, users can create complex and dynamic […]
Author Archives: Massed Compute
Recently, there have been a few posts about how open-source models like Llama 3 are catching up to the performance of some proprietary models. Andrew Reed from Hugging Face created a visual progress tracker to compare various models. It clearly shows that open-source models are gaining ground. […]
Considerations, Testing Scenario, Startup Commands, Token/Sec Results
vLLM, 4xA6000: 14.7, 14.7, 15.2, 15.0, 15.0 tokens/sec (average 14.92 tokens/sec)
vLLM, 2xH100: 20.3, 20.5, 20.3, 21.0, 20.7 tokens/sec (average 20.56 tokens/sec)
Hugging Face TGI, 4xA6000: 12.38, 12.53, 12.60, 12.55, 12.33 tokens/sec (average 12.48 tokens/sec)
Hugging Face TGI, 2xH100: 21.29, 21.40, 21.50, 21.60, 21.41 tokens/sec (average 21.44 tokens/sec)
Purely looking at a token/sec result, Hugging Face TGI produces the most tokens/sec on […]
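As a quick check, the reported averages can be recomputed from the five runs listed for each server and GPU configuration. This is a minimal sketch: the names `runs` and `averages` are illustrative and not from the original post, and the numbers are copied from the excerpt above.

```python
from statistics import mean

# Tokens/sec for the five runs of each configuration, copied from the excerpt.
# The dictionary layout and variable names are assumptions for illustration.
runs = {
    ("vLLM", "4xA6000"): [14.7, 14.7, 15.2, 15.0, 15.0],
    ("vLLM", "2xH100"): [20.3, 20.5, 20.3, 21.0, 20.7],
    ("Hugging Face TGI", "4xA6000"): [12.38, 12.53, 12.60, 12.55, 12.33],
    ("Hugging Face TGI", "2xH100"): [21.29, 21.40, 21.50, 21.60, 21.41],
}

# Average each configuration and round to two decimals, as in the excerpt.
averages = {key: round(mean(values), 2) for key, values in runs.items()}

for (server, gpus), avg in averages.items():
    print(f"{server:<17} {gpus:<8} {avg:.2f} tokens/sec")
```

The rounded values can then be compared against the averages quoted in the excerpt.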
Considerations, Testing Scenario, Results, Conclusion
Introduction – Multiple LLM APIs
If you haven’t already, go back and read Part 1 of this series. In this guide we take a look at how you can serve multiple models in the same VM. As you start to decide how you want to serve models as an inference endpoint you have a few […]
Introduction
Are you interested in setting up an inference endpoint for one of your favorite models? Have you been wanting to leverage the full unquantized version of models but found the process too complex or time-consuming? Do you wish there was a simple and efficient way to deploy full models for your own projects or […]
In the ever-evolving landscape of AI technology, Microsoft continues to push the boundaries with groundbreaking projects. Among these innovative endeavors is their AutoGen project. AutoGen provides a multi-agent conversation framework as a high-level abstraction. With this framework, one can conveniently build LLM workflows. As developers grapple with the increasing complexity of modern software applications, AutoGen offers […]