Ollama - rf.blog

Ollama Local LLM Mac Studio Production AI Self-Hosted Privacy llama qwen Apple Silicon AI Infrastructure

Ollama in Production: Running 70B Locally

A Mac Studio M4 Pro with 48 GB of unified memory runs llama3.3:70b for reasoning tasks. Real latency numbers, model selection logic, and where local inference actually beats cloud.

Rene Fichtmueller / 2026-05-22 / ~2 min read

LLM Gateway AI Circuit Breaker Failover Production AI TypeScript Fastify Confidence Scoring Ollama Architecture

LLM Gateway Patterns: What We Learned After 50,000 Requests

Circuit breakers, confidence scoring, failover chains — an LLM gateway isn't a proxy. After 50,000 production requests through our internal gateway, here's what the patterns actually look like.

Rene Fichtmueller / 2026-05-21 / ~2 min read

Open Source AI Document Intelligence MCP Paperless-ngx OCR RAG Ollama TypeScript Knowledge Management

PaperCortex: Adding a Brain to Your Document Archive

Paperless-ngx is great at storing documents. It's terrible at understanding them. PaperCortex fixes that.

Rene Fichtmueller / 2026-04-05 / ~1 min read