16 Tools to Run Your Local LLMs With Privacy Support

This article has become a yearly favorite, so I plan to add extra value by publishing two editions this year.

All of the tools below are free, many are open-source, and there is a wide range of LLMs, SLMs, and LMMs to run with them.

For the uninitiated:

  1. LLMs - Large Language Models, which work only with text.
  2. SLMs - Small Language Models, which typically have fewer than 10B parameters.
  3. LMMs - Large Multimodal Models, which work with text, images, audio, and video.

Use perplexity.ai to learn new terms.

I prefer Perplexity over Google for practically everything these days.

I might do a follow-up article about the best models to use with these tools.

There are 16 tools to run and interact with your local LLMs, all listed below.

Explore as many of them as you can.

All of them are useful in their own way.

And, finally: Enjoy!

16 Tools to Run LLMs Locally

1. H2O LLM Studio

https://venturebeat.com/ai/h2o-ai-launches-h2ogpt-llm-studio/?embedable=true

2. LM Studio

https://www.linkedin.com/pulse/discovering-lm-studio-gateway-private-locally-hosted-ai-thyagarajan-j0off/?embedable=true

3. Ollama

https://medium.com/@mauryaanoop3/ollama-a-deep-dive-into-running-large-language-models-locally-part-1-0a4b70b30982?embedable=true
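As a taste of how simple Ollama is to use, here is a minimal sketch of calling its local REST API with only the Python standard library. It assumes Ollama is running on its default port (11434) and that a model has already been pulled; the model name `llama3.2` is just an example, so substitute one you have downloaded.

```python
import json
import urllib.request

# Ollama's default local endpoint for single-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    """Build a request body for Ollama's /api/generate endpoint.

    stream=False asks for one complete JSON response instead of a token stream.
    """
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a prompt to a locally running Ollama server and return the reply."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Assumes `ollama pull llama3.2` was run beforehand; the model name is an example.
    print(generate("llama3.2", "In one sentence, what is a local LLM?"))
```

Because everything stays on localhost, no prompt or response ever leaves your machine.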

4. GPT4All

https://hackernoon.com/gpt4all-an-ecosystem-of-open-source-compressed-language-models?embedable=true

5. LocalAI

https://thenewstack.io/how-to-run-a-local-llm-via-localai-an-open-source-project/?embedable=true
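One of LocalAI's selling points is that it speaks the OpenAI API, so existing OpenAI client code works by pointing it at your own machine (several other tools on this list, such as vLLM and llamafile, offer the same compatibility). Below is a minimal stdlib-only sketch, assuming LocalAI is running on its default port (8080); the model name `local-model` is a placeholder and must match a model configured in your instance.

```python
import json
import urllib.request

# LocalAI's default local address, exposing OpenAI-compatible routes under /v1.
BASE_URL = "http://localhost:8080/v1"

def build_chat_request(model: str, user_message: str) -> dict:
    """Build an OpenAI-style /chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

def chat(model: str, user_message: str) -> str:
    """POST a chat request to an OpenAI-compatible local server and return the answer."""
    body = json.dumps(build_chat_request(model, user_message)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # "local-model" is a placeholder for whatever model your LocalAI config exposes.
    print(chat("local-model", "Summarize why local inference matters for privacy."))
```

Swapping the base URL is all it takes to move an app from a cloud API to a fully local one.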

6. Jan

https://medium.com/mr-plan-publication/discover-jan-ai-the-open-source-assistant-transforming-local-ais-for-everyone-19d2e5544b38?embedable=true

7. text-generation-webui (oobabooga)

https://pyimagesearch.com/2024/07/01/exploring-oobabooga-text-generation-web-ui-installation-features-and-fine-tuning-llama-model-with-lora/?embedable=true

8. PrivateGPT

https://litslink.com/blog/what-is-private-gpt?embedable=true

9. vLLM

https://ai.gopubby.com/running-llms-locally-on-the-mac-using-vllm-b128e06d5dbd?embedable=true

10. MLC LLM

https://www.restack.io/p/mlc-llm-answer-local-llm-android-cat-ai?embedable=true

11. llama.cpp

https://medium.com/@jankammerath/the-resurgence-of-c-through-llama-cpp-cuda-metal-8d2322cd8ded?embedable=true

12. ExLlamaV2

https://medium.com/data-science/exllamav2-the-fastest-library-to-run-llms-32aeda294d26?embedable=true

13. llamafile

https://simonw.substack.com/p/llamafile-is-the-new-best-way-to?utm_source=profile&utm_medium=reader2&embedable=true

14. WebLLM

https://techhub.iodigital.com/articles/what-is-webllm?embedable=true

15. Hugging Face Transformers

https://www.datacamp.com/tutorial/what-is-hugging-face?embedable=true

16. Hugging Face App Market (Spaces)

https://coda.io/@peter-sigurdson/hugging-face-spaces?embedable=true

Conclusion

There is no sector that LLMs will not disrupt.

With the correct guidance, Generative AI will reshape the world as we know it.

Often, you may find yourself in a situation where you do not want your data to leave your system.

This is especially true for enterprises and governments.

At such times, these tools will be invaluable.

Cutting-edge research is another area where you do not want your data to leave your enterprise.

You can deploy the tool of your choice to a centralized server hosted by your company.

The server should be air-gapped from the public internet and run one of the tools on this list.

I sincerely hope you change the world with your Generative AI research.

All the best for your journey!

References

  1. Unite:

    https://www.unite.ai/best-llm-tools-to-run-models-locally/ - Unite.AI article reviewing top tools for running LLMs locally, updated for 2025.

  2. DataCamp:

    https://www.datacamp.com/tutorial/run-llms-locally-tutorial - DataCamp tutorial on methods and tools for running LLMs locally, with practical guidance.

  3. Getstream:

    https://getstream.io/blog/best-local-llm-tools - GetStream blog listing the best tools for local LLM execution, with detailed insights.

  4. H2O LLM Studio:

    https://h2o.ai/products/h2o-llm-studio/ - Official product page for H2O LLM Studio, a no-code GUI for LLM fine-tuning and deployment.

    https://github.com/h2oai/h2ogpt - GitHub repository for H2OGPT, H2O.ai's open-source large language model.

  5. LM Studio:

    https://lmstudio.ai/ - Official website for LM Studio, a user-friendly desktop application for running LLMs locally.

  6. Ollama:

    https://ollama.ai/ - Official website for Ollama, designed for simple command-line and GUI-based local LLM serving.

  7. GPT4All:

    https://gpt4all.io/ - Official website for GPT4All, providing a free and open-source ecosystem for local LLMs.

  8. LocalAI:

    https://localai.io/ - Official website for LocalAI, a self-hosted, community-driven local AI server compatible with OpenAI API.

  9. text-generation-webui (oobabooga):

    https://github.com/oobabooga/text-generation-webui - GitHub repository for text-generation-webui (oobabooga), a feature-rich web UI for local LLMs.

  10. Jan:

    https://jan.ai/ - Official website for Jan, a cross-platform AI client application with local LLM support.

  11. PrivateGPT:

    https://github.com/imartinez/privateGPT - GitHub repository for PrivateGPT, a privacy-focused tool for local document Q&A using LLMs.

  12. FastChat:

    https://github.com/lm-sys/FastChat - GitHub repository for FastChat, a research platform for training, serving, and evaluating LLMs.

  13. vLLM:

    https://vllm.ai/ - Official website for vLLM, a high-throughput and efficient LLM inference server.

  14. MLC LLM:

    https://mlc.ai/mlc-llm/ - Official website for MLC LLM, focusing on machine learning compilation for efficient LLM execution.

    https://github.com/mlc-ai/mlc-llm - GitHub repository for MLC LLM, containing code and examples for local execution.

  15. llama.cpp:

    https://github.com/ggerganov/llama.cpp - GitHub repository for llama.cpp, a project focused on efficient C++ inference of Llama models.

  16. ExLlamaV2:

    https://github.com/turboderp/exllamav2 - GitHub repository for ExLlamaV2, known for fast inference of quantized LLMs.

  17. WebLLM:

    https://webllm.mlc.ai/ - Official website for WebLLM, enabling in-browser LLM execution using WebGPU.

  18. llamafile:

    https://github.com/Mozilla-Ocho/llamafile - GitHub repository for llamafile, packaging LLMs into single executable files for easy deployment.

  19. Hugging Face Transformers:

    https://huggingface.co/docs/transformers/index - Documentation for Hugging Face Transformers library, a core Python library for NLP models.

  20. Hugging Face App Market (Spaces):

    https://huggingface.co/spaces - Hugging Face Spaces, a platform for hosting and discovering AI application demos.

Google AI Studio was used in this article. It is available at this link: https://ai.google.dev/aistudio

All images created by the Flux AI Art Generation Models at Night Cafe Studio: https://creator.nightcafe.studio/explore

While I do not monetize my writing directly, your support helps me continue putting articles like this one out without a paywall or a paid subscription.

If you want ghostwritten articles like this one appearing under your name online, you can get them!

Contact me at:

https://linkedin.com/in/thomascherickal

For your own ghostwritten article! (Prices are negotiable and I offer country-wise parity pricing.)

If you want to support my writing, consider a contribution at Patreon on this link:

https://patreon.com/c/thomascherickal/membership

Alternatively, you could buy me a coffee on this link:

https://ko-fi.com/thomascherickal

Cheers!