How To Archives - Massed Compute

How To

How RAG Unlocks Search for AI Models

Posted on March 24, 2025 (updated March 27, 2025) by Massed Compute

We’ve all experienced AI hallucinations—those moments when a chatbot or AI assistant confidently provides an answer that is completely wrong. AI models rely on pre-trained data and often lack access to real-time, company-specific information. When forced to answer, they may fabricate responses rather than admit gaps in their knowledge. RAG solves this problem by allowing […]

How To, Hugging Face, LLM

Leverage Hugging Face’s TGI to Create Large Language Models (LLMs) Inference APIs – Part 2

Posted on March 4, 2024 by Massed Compute

Introduction – Multiple LLM APIs. If you haven’t already, go back and read Part 1 of this series. In this guide we look at how you can serve multiple models on the same VM. As you start to decide how you want to serve models as an inference endpoint, you have a few […]

How To, Hugging Face, LLM

Leverage Hugging Face’s TGI to Create Large Language Models (LLMs) Inference APIs – Part 1

Posted on February 26, 2024 (updated March 4, 2024) by Massed Compute

Introduction. Are you interested in setting up an inference endpoint for one of your favorite models? Have you been wanting to leverage the full unquantized version of models but found the process too complex or time-consuming? Do you wish there was a simple and efficient way to deploy full models for your own projects or […]

AutoGen, How To, Ollama

AutoGen with Ollama/LiteLLM – Setup on Linux VM

Posted on November 29, 2023 (updated February 26, 2024) by Massed Compute

In the ever-evolving landscape of AI technology, Microsoft continues to push the boundaries with groundbreaking projects. Among these innovative endeavors is their AutoGen project. AutoGen provides a multi-agent conversation framework as a high-level abstraction. With this framework, one can conveniently build LLM workflows. As developers grapple with the increasing complexity of modern software applications, AutoGen offers […]

Think it. Build it. Scale it.
