Manage Ollama Prompt Templates Externally
Ollama runs models on your hardware. Stop baking system prompts into Modelfiles and docker-compose files. Version them in PromptForge and fetch over HTTPS while inference stays on localhost.
PromptForge + Ollama in one file
Version, template, and serve Ollama prompts via REST API. Manage system prompts for locally running Llama, Qwen, DeepSeek, and Gemma models in PromptForge, and update them without rebuilding Modelfiles or restarting containers.
1. Fetch your versioned, interpolated prompt from the PromptForge REST API with a single fetch() call.
2. Pass the returned content string directly to Ollama as the system message; no transformation is needed.
3. Update the prompt in the PromptForge dashboard at any time. Running applications pick up the change on the next request, with no redeployment required.
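// Ollama exposes an OpenAI-compatible API at localhost:11434.
// Works the same for any model you have pulled: llama3.3, qwen2.5, deepseek-r1, etc.

// Fetch a versioned prompt from PromptForge with the variables already interpolated.
async function fetchPrompt(persona: string, language: string): Promise<string> {
  const res = await fetch(
    "https://www.promptforge-app.com/api/v1/prompts/your-prompt-id",
    {
      method: "POST",
      headers: {
        Authorization: "Bearer pfk_your_api_key",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        version: "latest",
        variables: { persona, language },
      }),
    }
  );
  // Fail loudly rather than passing an undefined system prompt to the model.
  if (!res.ok) {
    throw new Error(`PromptForge request failed: ${res.status} ${res.statusText}`);
  }
  const { content } = (await res.json()) as { content: string };
  return content;
}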
// Prompt template in PromptForge (e.g.):
// You are a {{persona}} assistant. Always respond in {{language}}.

// 1. Fetch your versioned system prompt (requires outbound HTTPS)
const content = await fetchPrompt("technical_writer", "English");

// 2. Send to Ollama running locally
// Swap the model string for whatever you have pulled
const response = await fetch("http://localhost:11434/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3.3",
    messages: [
      { role: "system", content },
      { role: "user", content: "Explain vector databases in plain language." },
    ],
    stream: false,
  }),
});
const { choices } = await response.json();
console.log(choices[0].message.content);

Ollama + PromptForge in Three Steps
Add a prompt management layer to your Ollama integration without refactoring your application.
Write your Ollama system prompt template
Ollama's /v1/chat/completions endpoint accepts the same system/user/assistant format as OpenAI. Write your system prompt in PromptForge with {{persona}}, {{language}}, or {{task}} placeholders. Reuse it across every model you pull locally.
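To make the placeholder syntax concrete, here is a minimal local sketch of what {{variable}} substitution does to a template. It is illustrative only: PromptForge performs the interpolation for you when you pass variables to the API, and renderTemplate is a hypothetical helper, not part of any SDK. How PromptForge treats a variable that is missing from the request is not documented on this page, so leaving the placeholder untouched is an assumption of this sketch.

```typescript
// Hypothetical local helper that mimics {{name}} substitution for illustration.
function renderTemplate(template: string, variables: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) => {
    // Only use keys the caller actually supplied (ignores inherited properties).
    const value = Object.prototype.hasOwnProperty.call(variables, name)
      ? variables[name]
      : undefined;
    // Assumption: unknown placeholders are left as-is so they are easy to spot.
    return value ?? match;
  });
}

const template = "You are a {{persona}} assistant. Always respond in {{language}}.";
const rendered = renderTemplate(template, {
  persona: "technical_writer",
  language: "English",
});
console.log(rendered);
```

The same template can then be reused across models by changing only the variables, which is the point of keeping the placeholders out of application code.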
Replace Modelfile prompt drift with version history
Modelfiles bake instructions into a model tag. Changing them means rebuilding the tag with ollama create and redistributing it to every machine that runs it. PromptForge tracks every edit with line-by-line diffs and one-click rollback. Your Ollama model tag stays stable; the instructional text evolves in the dashboard.
Fetch from PromptForge, infer on localhost
Your app needs outbound HTTPS to api.promptforge-app.com for the prompt fetch. Inference stays on http://localhost:11434 (or your LAN). Pass the returned content as the system message. No Ollama plugin or sidecar required.
Powerful Prompt Management Features for AI Developers
From simple prompt storage to production-ready APIs with version control, dynamic variables, rollback, and a public gallery.
Dynamic Variables
Use {{variable}} syntax to create reusable prompt templates. Pass different values via API for endless customization across any LLM.
Instant Prompt API
RESTful API endpoint ready in seconds. Fetch any prompt with version pinning and variable interpolation. No redeployment needed.
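A rough sketch of how a pinned request might be built, reusing the endpoint and body shape from the sample code on this page. The assumption here is that a pinned version goes in the same version field that the sample sets to "latest"; whether the API expects a number or a string there is not confirmed by this page, and buildPromptRequest is a hypothetical helper.

```typescript
// Assumption: "latest" comes from the sample code; a numeric pin is a guess at the format.
type PromptVersion = "latest" | number;

interface PromptRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper that builds the request without sending it.
function buildPromptRequest(
  promptId: string,
  apiKey: string,
  variables: Record<string, string>,
  version: PromptVersion = "latest"
): PromptRequest {
  return {
    url: `https://www.promptforge-app.com/api/v1/prompts/${encodeURIComponent(promptId)}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ version, variables }),
    },
  };
}

// Pin production to version 3 (placeholder number) while development tracks "latest".
const pinned = buildPromptRequest(
  "your-prompt-id",
  "pfk_your_api_key",
  { persona: "technical_writer", language: "English" },
  3
);
console.log(pinned.url, pinned.init.body);
// Send it with: const res = await fetch(pinned.url, pinned.init);
```

Keeping request construction separate from the network call makes it straightforward to log or unit-test exactly which version an environment asks for.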
Rollback with Diff Checker
View a line-by-line diff of every change and roll back to any previous version in one click. Never lose a working prompt again.
Publish to Gallery
Share your best prompts with the community in the public gallery. Get discovered by other developers and grow your personal library.
Start for free, upgrade anytime
No credit card required to get started. Paid plans include a 14-day free trial.
- 1 Production Prompt
- 1,000 API Requests/mo
- Stable/Latest Channel Routing
- No Credit Card Required
No charge until your trial ends
- 10k API requests/month
- 5 prompts
- Unlimited versions
- Dynamic variables
- Version pinning
- API key management
No charge until your trial ends
- 100k API requests/month
- 25 prompts
- Unlimited versions
- Dynamic variables
- Version pinning
- API key management
No charge until your trial ends
- 500k API requests/month
- 100 prompts
- Unlimited versions
- Dynamic variables
- Version pinning
- API key management
Questions? Contact us
Ollama + PromptForge: Common Questions
Specific answers for developers integrating PromptForge with Ollama.
How is PromptForge different from putting the system prompt in an Ollama Modelfile?
A Modelfile SYSTEM directive is static until you rebuild the model tag with ollama create. That works for one-off local experiments. Production teams hit limits fast: no diff history, no rollback without rebuilding, and every environment (dev laptop, CI, staging server) drifts to a different Modelfile. PromptForge centralises the text, versions every change, and serves the current prompt via API. Your Modelfile only needs FROM and PARAMETER lines; the instructional content lives elsewhere.
Does PromptForge work when Ollama runs air-gapped?
The PromptForge API requires outbound HTTPS from the machine that fetches prompts. Pure air-gapped environments cannot call it per request. Two workarounds: pre-fetch prompts at deploy time and write them to a local file or Redis your app reads, or run a scheduled sync job on a bastion host that can reach both PromptForge and your private network. Inference itself never leaves your network.
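The first workaround could look roughly like the sketch below: a deploy-time step writes the fetched prompt to a JSON file, and the air-gapped application reads that file at runtime instead of calling the API. The snapshot shape, file name, and function names are hypothetical; only the content field corresponds to the API response shown in the sample code. The demo writes to a temporary directory and uses a hard-coded string in place of a real API response.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical on-disk format for a pre-fetched prompt.
interface PromptSnapshot {
  promptId: string;
  content: string;
  fetchedAt: string; // ISO timestamp, handy when auditing stale prompts
}

// Deploy-time step: run where outbound HTTPS is available, after fetching `content`.
function writeSnapshot(path: string, promptId: string, content: string): void {
  const snapshot: PromptSnapshot = {
    promptId,
    content,
    fetchedAt: new Date().toISOString(),
  };
  writeFileSync(path, JSON.stringify(snapshot, null, 2), "utf8");
}

// Runtime step: the air-gapped app reads the file and validates it before use.
function readSnapshot(path: string): PromptSnapshot {
  const parsed = JSON.parse(readFileSync(path, "utf8")) as Partial<PromptSnapshot>;
  if (typeof parsed.content !== "string" || parsed.content.length === 0) {
    throw new Error(`Prompt snapshot at ${path} has no content`);
  }
  return parsed as PromptSnapshot;
}

// Demo: a temp directory and a hard-coded prompt stand in for the real deploy step.
const snapshotPath = join(mkdtempSync(join(tmpdir(), "prompts-")), "system-prompt.json");
writeSnapshot(
  snapshotPath,
  "your-prompt-id",
  "You are a technical_writer assistant. Always respond in English."
);
const systemPrompt = readSnapshot(snapshotPath).content;
console.log(systemPrompt);
```

Validating the file on read means a truncated or empty snapshot fails at startup rather than silently sending an empty system message to the model.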
Can I use the same PromptForge prompt across multiple Ollama models?
Often yes for models in the same family, but not always. A concise instruction set that works on llama3.2:3b may be too terse for llama3.3:70b or too verbose for a 1B quant. Create separate PromptForge prompts per model tier when output quality diverges. Pin each to a version number in production so that the prompt stays fixed when you upgrade a model with ollama pull, and any change in behaviour can be traced to the model rather than to a prompt edit.
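One way to organise per-tier prompts is a small lookup keyed on the size suffix of the Ollama tag. Everything specific in this sketch is an assumption to replace with your own values: the prompt IDs, the version numbers, the two-tier split, and the 30B threshold. It also relies on the tag carrying a size suffix such as :3b or :70b; tags without one fall back to the large-tier prompt.

```typescript
// Hypothetical prompt IDs and pinned versions; substitute your own.
interface PromptRef {
  promptId: string;
  version: number; // pinned so a prompt edit never ships unreviewed
}

const PROMPTS_BY_TIER: Record<"small" | "large", PromptRef> = {
  small: { promptId: "support-agent-small", version: 4 },
  large: { promptId: "support-agent-large", version: 7 },
};

// Read the parameter count (in billions) from tags such as "llama3.2:3b".
// Returns undefined when the tag has no size suffix, e.g. "llama3.3".
function parameterBillions(model: string): number | undefined {
  const match = /:(\d+(?:\.\d+)?)b\b/i.exec(model);
  return match ? Number(match[1]) : undefined;
}

// Assumption: models under the threshold get the small-tier prompt,
// and unknown sizes default to the large-tier prompt.
function promptForModel(model: string, largeThresholdB = 30): PromptRef {
  const size = parameterBillions(model);
  return size !== undefined && size < largeThresholdB
    ? PROMPTS_BY_TIER.small
    : PROMPTS_BY_TIER.large;
}

console.log(promptForModel("llama3.2:3b"), promptForModel("llama3.3:70b"));
```

Because each entry carries its own pinned version, a tier can be moved to a new prompt version independently of the others.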
How do I use PromptForge with Ollama in Docker or Kubernetes?
Mount nothing special. Your application container calls PromptForge over HTTPS (store pfk_your_api_key as a secret), then calls the Ollama service at http://ollama:11434/v1/chat/completions on your internal network. PromptForge and Ollama are separate hops. Cache the prompt response in-process with a 30 to 120 second TTL if you want to avoid hitting PromptForge on every user message.
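The in-process cache mentioned above might be sketched as follows. cachedPrompt is a hypothetical wrapper, not a PromptForge feature: it takes any async loader (for example the fetchPrompt function from the sample code), keeps the last result for ttlMs milliseconds, and shares a single in-flight request between concurrent callers. The clock is injectable purely so the expiry logic can be tested without waiting.

```typescript
type Loader = () => Promise<string>;

// Wrap an async prompt loader with a simple in-process TTL cache.
function cachedPrompt(load: Loader, ttlMs: number, now: () => number = Date.now): Loader {
  let cached: { value: string; expiresAt: number } | undefined;
  let inFlight: Promise<string> | undefined;

  return async () => {
    // Serve from memory while the entry is still fresh.
    if (cached && now() < cached.expiresAt) return cached.value;

    // Otherwise start one refresh and let concurrent callers share it.
    if (!inFlight) {
      inFlight = load().then(
        (value) => {
          cached = { value, expiresAt: now() + ttlMs };
          inFlight = undefined;
          return value;
        },
        (err) => {
          // Clear the slot so the next call retries instead of reusing the failure.
          inFlight = undefined;
          throw err;
        }
      );
    }
    return inFlight;
  };
}

// Usage sketch with a 60 second TTL, inside the 30 to 120 second range:
// const getSystemPrompt = cachedPrompt(() => fetchPrompt("technical_writer", "English"), 60_000);
// const content = await getSystemPrompt();
```

A failed refresh is not cached, so a transient PromptForge error affects only the requests waiting on that refresh. Whether to keep serving the previous value after a failure is a separate design choice this sketch does not make.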