# Local LLM Setup
Run AI comment generation entirely on your own machine — no API keys, no costs,
complete privacy.
## Supported Services
| Service | Default Endpoint | Difficulty | Notes |
|---|---|---|---|
| Ollama | `http://localhost:11434/api/chat` | Easy | Recommended |
| LM Studio | `http://localhost:1234/v1/chat/completions` | Easy | GUI app |
| oobabooga | `http://localhost:5000/api/chat` | Medium | Most features |
| GPT4All | `http://localhost:4891/v1/chat/completions` | Easy | Simplest |
## Ollama (Recommended)
### Install
- macOS / Windows: download from ollama.ai
- Linux:

  ```bash
  curl -fsSL https://ollama.ai/install.sh | sh
  ```
### Pull a Model and Start
```bash
ollama pull mistral   # Good balance of speed and quality
ollama serve
```
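Before pointing the extension at it, it's worth confirming the server actually answers. A minimal sanity check against Ollama's chat endpoint, assuming the `mistral` model pulled above:

```bash
# One-off, non-streaming completion; a JSON reply with a "message"
# field means the model is loaded and serving.
curl http://localhost:11434/api/chat \
  -d '{
    "model": "mistral",
    "messages": [{"role": "user", "content": "Say hello in five words."}],
    "stream": false
  }'
```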
Recommended models for LinkedIn comments:
| Model | Size | Speed | Quality |
|---|---|---|---|
| `mistral` | 4 GB | Fast | ⭐⭐⭐⭐⭐ |
| `neural-chat` | 4 GB | Fast | ⭐⭐⭐⭐ |
| `llama2` | 4–7 GB | Medium | ⭐⭐⭐⭐ |
| `dolphin-2.6` | 2 GB | Very fast | ⭐⭐⭐ |
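Any of these can be swapped in with `ollama pull`; a quick sketch, assuming the names in the table match the model tags in your Ollama library:

```bash
ollama pull neural-chat   # swap in any model from the table above
ollama list               # confirm what's installed locally
```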
### Fix CORS (Required for Chrome Extensions)
Ollama blocks browser-origin requests by default. Restart it with `OLLAMA_ORIGINS` set:
#### macOS / Linux

```bash
OLLAMA_ORIGINS=* ollama serve
```

#### Windows (PowerShell)

```powershell
$env:OLLAMA_ORIGINS="*"; ollama serve
```

#### Windows (CMD)

```cmd
set OLLAMA_ORIGINS=*
ollama serve
```
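To verify the change took effect, send a request with a browser-style `Origin` header and check for `Access-Control-Allow-Origin` in the response headers (the extension ID below is a placeholder, not Commently's real ID):

```bash
# -i prints response headers; look for Access-Control-Allow-Origin.
curl -i -H "Origin: chrome-extension://placeholder-id" \
  http://localhost:11434/api/tags
```

Note that the variable only lasts for that shell session; if Ollama runs as a background service, set `OLLAMA_ORIGINS` in the service's environment instead (e.g. via `systemctl edit ollama` on Linux).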
### Configure in Commently
- Settings → Use Local LLM
- Endpoint: `http://localhost:11434/api/chat`
- Click 🔄 Fetch Models → select your model
- 💾 Save Settings
## LM Studio
- Download from lmstudio.ai
- Open LM Studio → browse and download a model (e.g. `mistral-7b-instruct`)
- Go to the Local Server tab → select the model → Start Server
- Configure in Commently with the endpoint: `http://localhost:1234/v1/chat/completions`
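LM Studio's local server speaks the OpenAI chat-completions format, so you can smoke-test it before configuring the extension. The model name here is an assumption; use whichever model you loaded:

```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-7b-instruct",
    "messages": [{"role": "user", "content": "Write one short sentence."}]
  }'
```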
## oobabooga (Text Generation WebUI)
```bash
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt
python server.py --api --listen
```
Configure in Commently:
- Endpoint: `http://localhost:5000/api/chat`
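The API surface has shifted across text-generation-webui versions: newer builds started with `--api` expose an OpenAI-compatible route on the same port. If the endpoint above returns a 404, this variant may work instead:

```bash
# OpenAI-compatible fallback on newer text-generation-webui builds;
# the currently loaded model is used if "model" is omitted.
curl http://localhost:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'
```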
## GPT4All
- Download from nomic.ai/gpt4all
- Install it, open the app, and download a model
- Enable the API server in GPT4All's settings
- Configure in Commently with the endpoint: `http://localhost:4891/v1/chat/completions`
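GPT4All's local API server is also OpenAI-compatible, so the same style of check applies. The model name below is an assumption and should match a model you downloaded in the app:

```bash
curl http://localhost:4891/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-7b-instruct",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```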
## Troubleshooting
"403 Forbidden"
Ollama is blocking the request. See the CORS fix above.
"Failed to fetch" / "Connection refused"
Your LLM service isn't running. Start it and try again.
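A quick probe from the terminal separates the two failure modes (Ollama's port shown; substitute your service's):

```bash
# "Connection refused" means nothing is listening on the port;
# an HTTP status like 403 means the server is up but rejecting the request.
curl -i http://localhost:11434/api/tags
```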
### Comments are slow

- Use a smaller model (e.g. `dolphin-2.6` instead of `llama2-13b`)
- Enable GPU acceleration in your LLM service settings
### Poor comment quality

- Use a larger model if you have the RAM
- Use an instruction-tuned variant (models ending in `-instruct` or `-chat`)
- `mistral` and `neural-chat` are optimised for conversation tasks
## Hardware Requirements
| Setup | RAM | Model |
|---|---|---|
| Minimum | 8 GB | Dolphin 3B or Mistral 7B (Q4) |
| Recommended | 16 GB | Mistral 7B or Neural-Chat 7B |
| Best quality | 32 GB+ | Llama2-13B or Hermes-13B |
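As a rough way to check where your machine lands, compare total system RAM against what a loaded model actually occupies; note that `ollama ps` is only available in newer Ollama releases:

```bash
free -h      # total RAM on Linux (macOS: sysctl hw.memsize)
ollama ps    # loaded models and their memory footprint (newer Ollama)
```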