Together AI Prompt Management
Together AI gives you access to 200+ open-source models. PromptForge gives you a single versioned prompt layer that works across all of them.
PromptForge + Together AI in one file
Version, template, and serve prompts for Together AI via REST API. Manage system prompts for Llama, Qwen, DeepSeek, and every other model on Together's serverless platform, and update them live without touching your code.
1. Fetch your versioned, interpolated prompt from the PromptForge REST API with a single `fetch()` call.
2. Pass the returned `content` string directly to the Together AI SDK as the system prompt; no transformation is needed.
3. Update the prompt in the PromptForge dashboard at any time. Running applications pick up the change on the next request, with no redeployment required.
import Together from "together-ai";

const together = new Together({ apiKey: process.env.TOGETHER_API_KEY });

// Fetch the versioned prompt, with its variables interpolated by PromptForge.
async function fetchPrompt(task: string, length: string): Promise<string> {
  const res = await fetch(
    "https://www.promptforge-app.com/api/v1/prompts/your-prompt-id",
    {
      method: "POST",
      headers: {
        Authorization: "Bearer pfk_your_api_key",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        version: "latest",
        variables: { task, length },
      }),
    }
  );
  // Fail loudly rather than passing an error payload on to the model.
  if (!res.ok) {
    throw new Error(`PromptForge request failed with status ${res.status}`);
  }
  const { content } = await res.json();
  return content as string;
}

// Prompt template in PromptForge (e.g.):
// {{task}} the following. Keep the response {{length}}.

// 1. Fetch your prompt template with dynamic variables
const content = await fetchPrompt("Summarise", "short");

// 2. Use the versioned prompt with Llama 4 Scout on Together AI.
// To switch models, update the model string (confirm the exact ID in
// Together AI's model list); how you manage prompts stays the same.
const response = await together.chat.completions.create({
  model: "meta-llama/Llama4-Scout",
  messages: [
    { role: "system", content },
    { role: "user", content: "Summarise the following article: ..." },
  ],
});

console.log(response.choices[0]?.message?.content);
Together AI + PromptForge in Three Steps
Add a prompt management layer to your Together AI integration without refactoring your application.
Write one prompt template for all Together AI models
Together AI's chat completions endpoint accepts the same system/user/assistant format across all hosted models. Write your prompt once in PromptForge with {{task}} or {{context}} variables and reuse it across Llama, Qwen, Mistral, and DeepSeek.
Version as you switch between models
Switching models on Together AI often requires prompt tuning. PromptForge's version history lets you maintain separate prompt versions per model experiment, compare them, and promote the best-performing version to production.
Fetch the prompt and pass to Together AI
Call the PromptForge API once to retrieve the current system prompt, then pass it to `together.chat.completions.create`. Together AI's OpenAI-compatible endpoint means the integration code is identical regardless of which model you are running.
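To keep a separate prompt version per model experiment, the request can pin a version instead of asking for `latest`. The sketch below builds the `fetch()` options for such a request. The `version` and `variables` fields come from the example on this page; the helper name `buildPromptRequest` and the pinned value `"3"` are assumptions for illustration, so check the PromptForge API documentation for the actual version identifier format.

```typescript
// Shape of the request body used by the example on this page.
interface PromptRequestBody {
  version: string;
  variables: Record<string, string>;
}

// Build the fetch() options for a PromptForge prompt request.
// `version` defaults to "latest"; any other value (such as "3") is a
// hypothetical pinned-version identifier used only for illustration.
function buildPromptRequest(
  apiKey: string,
  variables: Record<string, string>,
  version: string = "latest",
) {
  const body: PromptRequestBody = { version, variables };
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  };
}
```

A call such as `buildPromptRequest(apiKey, { task: "Summarise", length: "short" }, "3")` returns options that can be passed as the second argument to `fetch()`, which lets each model experiment stay on its own prompt version while production keeps using `latest`.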
Powerful Prompt Management Features for AI Developers
From simple prompt storage to production-ready APIs with version control, dynamic variables, rollback, and a public gallery.
Dynamic Variables
Use {{variable}} syntax to create reusable prompt templates. Pass different values via API for endless customization across any LLM.
Instant Prompt API
RESTful API endpoint ready in seconds. Fetch any prompt with version pinning and variable interpolation. No redeployment needed.
Rollback with Diff Checker
View a line-by-line diff of every change and roll back to any previous version in one click. Never lose a working prompt again.
Publish to Gallery
Share your best prompts with the community in the public gallery. Get discovered by other developers and grow your personal library.
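The `{{variable}}` syntax described under Dynamic Variables can be pictured with a small local function. This is an illustrative sketch only, not PromptForge's implementation: interpolation happens on the PromptForge side when you call the API, and the function name `interpolate` and its handling of unknown placeholders are assumptions.

```typescript
// Replace each {{name}} placeholder with the matching value.
// Placeholders without a matching variable are left untouched,
// which makes a missing variable easy to spot in the output.
function interpolate(template: string, variables: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match: string, name: string) => {
    const value = Object.prototype.hasOwnProperty.call(variables, name)
      ? variables[name]
      : undefined;
    return value !== undefined ? value : match;
  });
}

const rendered = interpolate(
  "{{task}} the following. Keep the response {{length}}.",
  { task: "Summarise", length: "short" },
);
// rendered === "Summarise the following. Keep the response short."
```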
Start for free, upgrade anytime
No credit card required to get started. Paid plans include a 14-day free trial.
- 1k API requests/month
- 1 prompt
- Unlimited versions
- Dynamic variables
- Version pinning
- API key management
No charge until your trial ends
- 10k API requests/month
- 5 prompts
- Unlimited versions
- Dynamic variables
- Version pinning
- API key management
No charge until your trial ends
- 100k API requests/month
- 25 prompts
- Unlimited versions
- Dynamic variables
- Version pinning
- API key management
No charge until your trial ends
- 500k API requests/month
- 100 prompts
- Unlimited versions
- Dynamic variables
- Version pinning
- API key management
Questions? Contact us
Together AI + PromptForge: Common Questions
Specific answers for developers integrating PromptForge with Together AI.
How does PromptForge work with Together AI's serverless inference?
Together AI's serverless inference is billed per token, with no cold start management on your side. The PromptForge fetch is a single HTTPS call that adds less than 50 ms to your total latency before the Together AI request begins. The two calls are independent: PromptForge returns your prompt text, then your code sends the full chat completion request to Together AI's endpoint. You can also pre-fetch and cache the prompt server-side for even lower overhead.
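Server-side caching of the prompt could look like the sketch below: a minimal in-memory cache with a time-to-live, wrapped around whatever function fetches the prompt. The names `cachedPrompt` and `PromptFetcher` and the TTL approach are assumptions for illustration, not part of PromptForge.

```typescript
// Any function that resolves to the prompt text, for example a wrapper
// around the fetchPrompt() call from the example on this page.
type PromptFetcher = () => Promise<string>;

// Wrap a prompt fetcher in a small in-memory cache with a time-to-live.
// `now` is injectable so the expiry logic can be exercised without waiting.
function cachedPrompt(
  fetchPrompt: PromptFetcher,
  ttlMs: number,
  now: () => number = Date.now,
): PromptFetcher {
  let cached: { value: string; expiresAt: number } | null = null;
  return async () => {
    const current = now();
    if (cached !== null && current < cached.expiresAt) {
      return cached.value; // still fresh: skip the network call
    }
    const value = await fetchPrompt();
    cached = { value, expiresAt: current + ttlMs };
    return value;
  };
}
```

The trade-off is freshness: with a TTL of 60 seconds, an edit made in the dashboard can take up to 60 seconds to reach a running application, rather than arriving on the next request.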
Can I use PromptForge with Together AI's fine-tuned models?
Yes. Fine-tuned models on Together AI are called via the same chat completions endpoint with your fine-tune model ID. Manage the system prompt in PromptForge just as you would for a base model: the fine-tune affects model weights, not the message format. If your fine-tune was trained on specific prompt structures, store those structures as PromptForge templates and version them alongside your fine-tuning experiments.
Does PromptForge support Together AI's streaming inference?
Yes. The PromptForge fetch is completed before the Together AI streaming call begins, so it does not interfere with streaming. Fetch the prompt text once, then start the Together AI streaming request with `stream: true`. The streamed tokens arrive from Together AI's endpoint directly; PromptForge is only involved in the prompt retrieval step.
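The order of operations described here (fetch the prompt first, then stream) can be sketched as follows. The interfaces are minimal structural types written to resemble the chat completions call used on this page; they are assumptions for this sketch, not the Together AI SDK's own type definitions, and the helper name `streamCompletion` is hypothetical.

```typescript
// Minimal structural types for the parts of a streaming chat client used here.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}
interface StreamChunk {
  choices: Array<{ delta?: { content?: string | null } }>;
}
interface StreamingChatClient {
  chat: {
    completions: {
      create(params: {
        model: string;
        messages: ChatMessage[];
        stream: true;
      }): Promise<AsyncIterable<StreamChunk>>;
    };
  };
}

// Start a streaming completion with an already-fetched system prompt,
// forwarding each token to `onToken` and returning the full text at the end.
async function streamCompletion(
  client: StreamingChatClient,
  model: string,
  systemPrompt: string,
  userMessage: string,
  onToken: (token: string) => void,
): Promise<string> {
  const stream = await client.chat.completions.create({
    model,
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: userMessage },
    ],
    stream: true,
  });
  let full = "";
  for await (const chunk of stream) {
    const token = chunk.choices[0]?.delta?.content ?? "";
    if (token) {
      onToken(token);
      full += token;
    }
  }
  return full;
}
```

In an application, the prompt text returned by PromptForge would be passed as `systemPrompt` and the `together` client as `client`, assuming the SDK's streaming response is an async iterable of chunks shaped like `StreamChunk`.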
How do I manage prompts when running different open-source models on Together AI?
Create a PromptForge prompt for each distinct model role or capability tier, for example `together-llama-chat-agent`, `together-qwen-analyst`, and `together-deepseek-coder`. Each model may respond best to different instructional styles or output format instructions. Versioning the prompts separately lets you optimise each one for its model without affecting the others, and promote the best version of each to production independently.
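One way to keep this mapping in application code is a small registry from role to prompt and model, sketched below. The prompt names are the examples given above; the role keys, the helper name `resolvePromptConfig`, and the model strings are placeholders to be replaced with your own values.

```typescript
interface ModelPromptConfig {
  promptId: string; // PromptForge prompt name
  model: string; // Together AI model string (placeholder, not a real ID)
}

// Role -> prompt and model. Editing one entry leaves the others untouched.
const promptRegistry = new Map<string, ModelPromptConfig>([
  ["chat-agent", { promptId: "together-llama-chat-agent", model: "your-llama-model-id" }],
  ["analyst", { promptId: "together-qwen-analyst", model: "your-qwen-model-id" }],
  ["coder", { promptId: "together-deepseek-coder", model: "your-deepseek-model-id" }],
]);

// Look up the configuration for a role, failing early on an unknown role
// so a typo does not silently fall back to the wrong prompt.
function resolvePromptConfig(role: string): ModelPromptConfig {
  const config = promptRegistry.get(role);
  if (config === undefined) {
    throw new Error(`No prompt configured for role "${role}"`);
  }
  return config;
}
```

The resolved `promptId` goes into the PromptForge request URL and the `model` into the Together AI call, so swapping the model behind a role is a one-line change.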