Most AI-powered apps force you onto a single provider. You're stuck with their model, their pricing, their rate limits. We think that's wrong.
In yoinko, you choose your LLM. Open Settings, pick a provider, paste your API key, and you're done. We support OpenAI, Google Gemini, Anthropic Claude, and any OpenAI-compatible endpoint out of the box.
That last one is the interesting part. 'OpenAI-compatible' means you can point yoinko at Ollama running on your laptop, LM Studio serving a local model, OpenRouter for model aggregation, or any custom endpoint that speaks the OpenAI chat completions format.
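To make "speaks the OpenAI chat completions format" concrete, here is a minimal sketch of how the same request shape targets different providers, with only the base URL and model name changing. The base URLs are the providers' documented defaults; the function and variable names are illustrative, not yoinko's actual code.

```python
# Any OpenAI-compatible endpoint accepts the same request body;
# only the base URL and model name differ between providers.
ENDPOINTS = {
    "openai": "https://api.openai.com/v1",
    "ollama": "http://localhost:11434/v1",        # local Ollama default
    "openrouter": "https://openrouter.ai/api/v1",
}

def build_chat_request(provider: str, model: str, prompt: str) -> tuple[str, dict]:
    """Return (url, body) for an OpenAI-style chat completions call."""
    url = f"{ENDPOINTS[provider]}/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, body

url, body = build_chat_request("ollama", "llama3.1", "Hello!")
print(url)  # http://localhost:11434/v1/chat/completions
```

Swap `"ollama"` for `"openrouter"` or `"openai"` and nothing else about the call changes, which is exactly why one compatible client can cover all of them.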
Under the hood, yoinko's server proxies your requests. Your browser talks to yoinko, yoinko talks to the provider. Your API key never reaches the browser — it stays server-side, stored in your local SQLite database.
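A rough sketch of that server-side flow, assuming a simple key table: the browser's request carries no credentials, and the server looks the key up in its local SQLite database before forwarding upstream. The table and column names here are hypothetical, not yoinko's actual schema.

```python
import sqlite3

# Illustrative stand-in for yoinko's local SQLite database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE provider_keys (provider TEXT PRIMARY KEY, api_key TEXT)")
db.execute("INSERT INTO provider_keys VALUES ('openai', 'sk-example')")

def upstream_headers(provider: str) -> dict:
    """Attach the stored key on the server; it never reaches the browser."""
    (key,) = db.execute(
        "SELECT api_key FROM provider_keys WHERE provider = ?", (provider,)
    ).fetchone()
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }

print(upstream_headers("openai")["Authorization"])  # Bearer sk-example
```

The point of the design is visible in the shape of the code: the key lookup and the `Authorization` header both live on the server side of the proxy, so the client only ever sees yoinko's own endpoint.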
Why does this matter? Because you control the cost, the speed, the privacy, and the model. Want the cheapest option? Use gpt-4o-mini. Want full privacy? Run Llama locally through Ollama. Want the best output? Use Claude 3.5 Sonnet. It's your call.
We built yoinko to be the best interface for any model, not a middleman skimming tokens.