In the era of large language models, the Chinese model DeepSeek-R1 has become a hot topic—not least because it can be run on conventional hardware right at home. This article explains how to install and run DeepSeek-R1 locally without relying on remote servers. We’ll discuss the model’s unique features, present three different desktop tools for local deployment, and review performance tests as well as the practical limits of the current distilled versions.
What is DeepSeek-R1?
DeepSeek-R1 is a recent language model from the Chinese company DeepSeek, designed to perform on par with ChatGPT while requiring considerably less hardware to train. As a reasoning model, it transparently displays its thought process, which is intended to reduce errors (or “hallucinations”). The developers have released the model’s weights under the MIT License, so you can experiment on your own hardware without being tied to remote servers. While the full model, with its 671 billion parameters, is over 700 GB in size and only runs on high-end systems, there are distilled versions (e.g., 7B or 8B) that run on average hardware such as a MacBook Pro M1 or MacBook Air M1 with 16 GB of RAM.
Advantages of Local Deployment
Running DeepSeek-R1 on your own computer offers several benefits:
• Data Privacy and Control: Since the model runs locally, your data never leaves your machine—avoiding external censorship or filtering mechanisms.
• Independence: You’re not dependent on external servers or API services, so outages and long response times elsewhere don’t affect you.
• Flexibility: Local execution allows you to tweak parameters, customize prompt templates, and integrate the model into your own applications.
• Cost Efficiency: A local setup eliminates API fees, which makes it particularly attractive for frequent use.
• Offline Availability: Once the model is downloaded, you can operate without an Internet connection.
Tools for Local Installation and Usage
Modern desktop applications have made it increasingly easy, even for non-experts, to run chatbots like DeepSeek-R1. Here are three popular tools:
Jan: The Open-Source GUI
Jan provides a minimalistic graphical interface that simplifies downloading and managing LLMs.
• Installation and Models: Through its integrated hub, you can download popular models such as Llama and Qwen, and a distilled DeepSeek-R1 variant can be imported via trusted external links (see the download sketch after this list).
• Usage: After the model is downloaded, simply select it under the “Threads” menu and start chatting.
• Pros and Cons: Jan offers a straightforward, cross-platform solution; however, users may encounter formatting glitches, and settings are not saved between sessions.
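If you prefer to fetch the model file yourself, quantized GGUF builds of the distilled variants can be downloaded from Hugging Face and then imported into Jan. Below is a minimal sketch using the huggingface-cli tool; the repository and file names are illustrative assumptions, so verify them on huggingface.co before downloading:

# Install the Hugging Face CLI (ships with the huggingface_hub Python package)
pip install -U huggingface_hub

# Download a quantized GGUF build of a distilled DeepSeek-R1 variant.
# Repository and file names are assumptions; check the hub for current builds.
huggingface-cli download unsloth/DeepSeek-R1-Distill-Qwen-7B-GGUF \
  DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf --local-dir ./models

The resulting .gguf file can then be imported into Jan.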
LM Studio: For Beginners and Professionals
LM Studio offers a comprehensive interface with a wide range of configuration options—ideal if you want to delve deeper into model parameter tuning.
• Model Browser: With its built-in browser, you can easily download various models (including the 7B and 8B variants of DeepSeek-R1) directly from Hugging Face.
• Performance Monitoring: The application displays real-time CPU and RAM usage and warns you if resources are nearly exhausted.
• Pros and Cons: Although LM Studio is feature-rich and user-friendly, it is a closed-source solution, which might limit usage in some professional environments.
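LM Studio can also expose whatever model you have loaded through a local, OpenAI-compatible server (by default on port 1234), which makes it easy to call the model from your own scripts. Here is a minimal sketch using curl; the exact model identifier is an assumption and depends on which build you load in the app:

# Query LM Studio's local OpenAI-compatible endpoint
# (start the local server in the app first; the model name is an assumption)
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1-distill-qwen-7b",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}]
  }'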
Ollama: Command Line Chatbots
For those who prefer the classic command line, Ollama offers a simple and effective solution.
• Installation and Operation: Ollama supports Windows, macOS, and Linux. With a command like
ollama run deepseek-r1:7b
the model is automatically downloaded (if not already available) and launched.
• Flexibility: It can also work with local text files (for example, by piping their contents into a prompt) and it displays the model’s “reasoning” output (marked by <think> tags), which appeals to experimental users.
• Pros and Cons: The command-line interface is straightforward and open source, but the lack of a graphical interface may be a hurdle for less technical users.
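Ollama additionally runs a local REST API (on port 11434 by default), so the same model can be wired into your own applications. A minimal sketch with curl, assuming the 7B model from above has already been downloaded:

# Ask the locally running model a question via Ollama's REST API;
# "stream": false returns the whole answer as a single JSON response.
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Explain in one paragraph what a distilled model is.",
  "stream": false
}'

The JSON response contains the generated text, including the reasoning marked by the <think> tags mentioned above.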
Test Results and Practical Limits
On a typical MacBook Pro M1 with 16 GB of RAM, the 7B and 8B variants of DeepSeek-R1 run fairly smoothly at around 25 tokens per second; a MacBook Air M1 manages about 13 tokens per second. Larger models (such as the 14B or even 70B variants), however, can overwhelm the hardware, leading to long loading times and even crashes. Moreover, while highly distilled models are resource-friendly, they sometimes produce inconsistent or lower-quality responses compared to the full model.
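If you want to reproduce such measurements on your own machine, Ollama can report generation statistics itself. A quick sketch; the --verbose flag prints timing details after the answer, including the evaluation rate in tokens per second:

# Run a single prompt and print timing statistics afterwards
ollama run deepseek-r1:7b --verbose "Write a haiku about local AI."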
Conclusion
Running DeepSeek-R1 locally offers an exciting opportunity to explore one of the most talked-about chatbots without the influence of Chinese censorship or external dependencies. Tools like Jan, LM Studio, and Ollama make it possible—even for those without deep technical expertise—to deploy distilled versions of the model on their own machines. While these versions may hit hardware limits and show some quality compromises, they still provide a private, flexible, and cost-effective access point to modern AI technology. For those in search of a robust and high-performing language model, alternatives like Meta’s Llama might also be worth considering.