Artificial intelligence is taking over almost every aspect of the digital world. Whether it's getting answers, generating content, writing code, or building custom AI-powered chatbots, Large Language Models (LLMs) have revolutionized the way we process and consume vast amounts of information. Generally, we use these models through cloud-based services like ChatGPT or Gemini. But with recent advancements, highly optimized LLMs have appeared that can be hosted on a home PC too. Here's a curated list of some of the best LLMs you can host on your home PC. Let's get started!

But why would you want to run LLMs locally? The two main reasons are privacy and experimentation: developers want to customize these models to fit their use cases, and many users simply don't want their prompts leaving their machine.
If you are a casual user who mostly asks general-purpose questions, sticking to cloud-based LLM services is the best option. But if you love to tinker with apps and code, read on.
Things to Know Before Hosting an LLM
Before you select an LLM for local hosting, there are some important factors you must take into account to get the best results. Knowing this critical information beforehand ensures you choose the right model and install it correctly.
Key Considerations for Hosting an LLM Locally
The following four factors should determine your model selection.
- Hardware Requirements: Remember, low-end PCs struggle to run even the smallest LLMs. Local inference needs an ample amount of both VRAM and RAM. The minimum hardware configuration I'd recommend is:
- 16GB RAM
- A good GPU (at least an RTX 3060)
- A modern multi-core processor (at least four cores and eight threads)
- High-speed SSD storage
- Model Size: The next important factor is the size of the model. LLMs come in different sizes, e.g., 8B or 40B, where "B" denotes billions of parameters. A larger model requires more processing power and system resources than a relatively smaller one (see the quick memory estimate after this list).
- Use Case: Depending on what exactly you want to do with your LLM, decide which one to choose from the available options. Some are good with code generation, some are known for their general reasoning power, and so on. So, choose wisely.
- Ease of Deployment: And lastly, how easy or difficult a specific LLM is to install can also determine whether you want to use it. Some can be set up through one-click installers, while others require many steps and some level of technical expertise.
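To put model size in concrete terms, here's a rough back-of-the-envelope estimate in Python. The 20% overhead figure is my own assumption for the KV cache and runtime buffers, not an official formula, but it's a useful sanity check before you download anything.

```python
# Rough rule of thumb (an assumption, not an official formula):
# memory needed ≈ parameter count × bytes per parameter, plus ~20%
# overhead for the KV cache and runtime buffers.
def estimated_memory_gb(params_billions: float, bits_per_param: int = 4) -> float:
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return round(bytes_total * 1.2 / 1e9, 1)

for size_b in (2, 7, 13, 40):
    print(f"{size_b}B model at 4-bit quantization: ~{estimated_memory_gb(size_b)} GB")
```

At 4-bit quantization this works out to roughly 4 GB for a 7B model and 8 GB for a 13B model, which is why those sizes line up with the 8GB and 16GB memory tiers quoted throughout this list.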
Best Local LLMs for Home PCs
Now that we've covered the factors to weigh when choosing an LLM, let's finally check out the available options. Here we go!
1. Llama 2 (Meta AI)
It's a free, open-source LLM developed by Meta that delivers results competitive with ChatGPT on many tasks. It comes in three variants with 7B, 13B, and 70B parameters. It's best suited for creating chatbots, text generation, and summarization tasks.
The 70B variant is not recommended for home use as it requires fairly expensive hardware.
- Hardware Requirements:
- For the 7B Model: 8GB VRAM (or 16GB RAM for CPU inference)
- For the 13B Model: 16GB VRAM (or 32GB RAM for CPU inference)
- How to Run:
- Download Llama 2.
- Deploy it easily on your home PC through Ollama, LM Studio, or llama.cpp (see the sketch below).
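Here's a minimal Python sketch that queries a Llama 2 model served by Ollama over its local REST API. It assumes you've already run `ollama pull llama2` and that the Ollama server is listening on its default port (11434).

```python
# Minimal sketch: querying a local Llama 2 model through Ollama's REST API.
import json
import urllib.request

def ask(prompt: str, model: str = "llama2") -> str:
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one complete response instead of chunks
    }).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(ask("Summarize the plot of Hamlet in two sentences."))
```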
2. Mistral 7B
It's a 7.3B-parameter model that outperforms Llama 2 13B on several benchmarks, making it one of the most powerful LLMs you can run on a home PC. Fine-tuning this model is comparatively easy, which gives it an edge for custom use cases.
It's best suited for both English-language text generation and code generation.
- Hardware Requirements:
- 8GB+ VRAM or 16GB RAM
- How to Run:
- You can use llama.cpp, Ollama, or text-generation-webui to install this powerful model (see the llama.cpp sketch below).
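As a sketch of the llama.cpp route, the snippet below uses the llama-cpp-python bindings (`pip install llama-cpp-python`). The GGUF file name and path are assumptions; substitute whichever quantized Mistral build you actually downloaded.

```python
# Minimal sketch: running a quantized Mistral 7B model with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="./mistral-7b-instruct-v0.2.Q4_K_M.gguf",  # hypothetical local file
    n_ctx=4096,       # context window size
    n_gpu_layers=-1,  # offload all layers to the GPU if VRAM allows
)

output = llm(
    "Write a Python function that reverses a string.",
    max_tokens=256,
    echo=False,  # don't repeat the prompt in the output
)
print(output["choices"][0]["text"])
```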
3. Gemma (Google DeepMind)
The Gemma 2B and 7B models are made by Google and are optimized to run efficiently even on lower-end computers. There are several models in the family, but for home use, these two are the best options.
You can use them for NLP research, creating chatbots, and general text generation.
- Hardware Requirements:
- For the 2B Model: 4GB+ VRAM / 8GB RAM
- For the 7B Model: 8GB+ VRAM / 16GB RAM
- How to Run:
- You can install it through LM Studio, Ollama, or Hugging Face Transformers (see the sketch below).
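Here's a minimal Hugging Face Transformers sketch for the instruction-tuned 2B variant. It assumes you've accepted Google's license on the Hugging Face Hub, logged in with `huggingface-cli login`, and have the `accelerate` package installed for `device_map="auto"`.

```python
# Minimal sketch: running Gemma 2B with Hugging Face Transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"  # the instruction-tuned 2B variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # halves memory use vs. float32
    device_map="auto",          # place layers on GPU/CPU automatically
)

inputs = tokenizer("Explain what an LLM is in one paragraph.",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```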
4. Phi-2 (Microsoft)
It's a 2.7B-parameter model developed by Microsoft, known for delivering impressive results with a small footprint on system resources, which makes it ideal for home PCs. There are several variants of this model that can be fine-tuned for your specific needs.
It runs smoothly on a low-end computer and is best suited for code generation.
- Hardware Requirements:
- 4GB+ VRAM / 8GB RAM
- How to Run:
- You can install and set it up through Hugging Face, llama.cpp, or Ollama (see the sketch below).
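Since code generation is Phi-2's strong suit, here's a quick completion sketch using the Transformers pipeline API; `microsoft/phi-2` is the official model ID, and `device_map="auto"` again assumes the `accelerate` package is installed.

```python
# Minimal sketch: code completion with Phi-2 via the Transformers pipeline.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/phi-2",
    torch_dtype="auto",  # pick an efficient dtype automatically
    device_map="auto",   # requires the `accelerate` package
)

completion = generator("def fibonacci(n):", max_new_tokens=100)
print(completion[0]["generated_text"])
```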
5. GPT4All
If you struggle with all things technical yet still want to run a locally hosted LLM, this platform is your best bet. It gives you an easy-to-install desktop app for trying out several popular models. Depending on your computer's hardware capabilities, you can select an appropriate model from its catalog.
It's best suited for beginners and hobbyists who want a quick way to set up a local LLM engine.
- Hardware Requirements:
- From 4GB to 16GB RAM (depending on the model you've chosen)
- How to Run:
- Download and install the GPT4All client, then choose your preferred model from within the application's interface. It's available for Windows, Mac, and Linux.
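The desktop app needs no code at all, but if you'd rather script it, GPT4All also ships Python bindings (`pip install gpt4all`). The model file name below is one example from the catalog, so swap in whichever model fits your hardware.

```python
# Minimal sketch using the gpt4all Python bindings.
from gpt4all import GPT4All

# Example catalog entry; downloaded automatically if not already present.
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")
with model.chat_session():
    print(model.generate("What is the capital of France?", max_tokens=64))
```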
6. Vicuna
This model is based on LLaMA and is fine-tuned to work on regular home PCs. It's primarily known for its chatbot-like capabilities: in its own GPT-4-judged evaluation, the project reported reaching roughly 90% of ChatGPT's quality. It comes in two variants with 7B and 13B parameters.
It's best suited for chatbot apps and dialogue-driven experiences.
- Hardware Requirements:
- For the 7B Model: 8GB+ VRAM
- For the 13B Model: 16GB+ VRAM
- How to Run:
- To deploy it, use text-generation-webui or Ollama (see the chat-loop sketch below).
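To show the dialogue angle, here's a tiny chat loop using the ollama Python client (`pip install ollama`); it assumes you've already run `ollama pull vicuna`. The full history is passed back on every turn so the model keeps conversational context.

```python
# Minimal sketch: a chatbot loop against a local Vicuna model via ollama.
import ollama

history = []
while True:
    user_input = input("You: ")
    if user_input.strip().lower() in {"quit", "exit"}:
        break
    history.append({"role": "user", "content": user_input})
    reply = ollama.chat(model="vicuna", messages=history)
    answer = reply["message"]["content"]
    history.append({"role": "assistant", "content": answer})
    print("Bot:", answer)
```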
Tools for Running Local LLMs
And finally, here are the tools you can use to quickly deploy these LLMs on your home PC. They make the process easy for users who are new to local model installation.
- Ollama: The easiest one to use; a lightweight command-line tool that lets you pull and run a model with a single command.
- LM Studio: A GUI-based application available for all major platforms. Even a non-technical user can easily install LLMs through this tool.
- llama.cpp: A C/C++ inference engine that runs many kinds of LLMs efficiently, even on modest hardware.
- text-generation-webui: If you prefer a web-based UI to get the job done, this is the best option. It lets you set up many types of LLMs easily.
Conclusion
Hosting an LLM locally on your home PC is now more feasible than ever. Whether you're looking for a chatbot, AI-assisted writing, or code completion, there's an LLM for your needs.
With the right hardware and tools like Ollama or LM Studio, you can enjoy privacy, low-latency AI responses, and full control over your AI assistant.