API Integration Guide
Learn how to integrate LLM APIs into your applications, with best practices, error handling, and production-ready code examples.
Getting Started
Integrating an LLM API into your application is surprisingly simple. Most providers offer SDKs for popular languages, and the APIs follow similar patterns.
This guide focuses on OpenAI and Anthropic since they are the most popular, but the principles apply to all providers.
OpenAI API
Setup
1. Create an account at platform.openai.com
2. Generate an API key in the API keys section
3. Install the SDK
# Python
pip install openai

# Node.js
npm install openai

# Set environment variable
export OPENAI_API_KEY='sk-...'
Basic Chat Completion
Python
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant"
        },
        {
            "role": "user",
            "content": "What is the capital of Denmark?"
        }
    ],
    temperature=0.7,
    max_tokens=150
)

print(response.choices[0].message.content)

Node.js
import OpenAI from 'openai';
const client = new OpenAI();
const response = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
        {
            role: "system",
            content: "You are a helpful assistant"
        },
        {
            role: "user",
            content: "What is the capital of Denmark?"
        }
    ],
    temperature: 0.7,
    max_tokens: 150
});

console.log(response.choices[0].message.content);

Streaming Responses
For better UX, stream the response so users see output in real time:
# Python
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a short story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
# Node.js
const stream = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{role: "user", content: "Write a short story"}],
    stream: true
});

for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

Structured Output (JSON Mode)
Guarantee JSON output with response_format:
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": "Extract product info as JSON with fields: name, price, category"
        },
        {
            "role": "user",
            "content": "Smart Watch Pro costs 299 kr and is electronics"
        }
    ],
    response_format={"type": "json_object"}
)

# Output: {"name": "Smart Watch Pro", "price": 299, "category": "electronics"}

Anthropic API (Claude)
Setup
# Python
pip install anthropic

# Node.js
npm install @anthropic-ai/sdk

# Set environment variable
export ANTHROPIC_API_KEY='sk-ant-...'
Basic Message
Python
from anthropic import Anthropic
client = Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system="You are a helpful assistant",
    messages=[
        {
            "role": "user",
            "content": "What is the capital of Denmark?"
        }
    ]
)

print(message.content[0].text)

Node.js
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
const message = await client.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 1024,
    system: "You are a helpful assistant",
    messages: [
        {
            role: "user",
            content: "What is the capital of Denmark?"
        }
    ]
});

console.log(message.content[0].text);

Prompt Caching
Save money by caching long prompts:
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are an expert in Danish law...",  # Long system prompt
            "cache_control": {"type": "ephemeral"}  # Cache this!
        }
    ],
    messages=[
        {"role": "user", "content": "What does the law say about..."}
    ]
)

# Subsequent calls with the same system prompt hit the cache (90% discount!)

Best Practices
1. Error Handling
Always handle errors gracefully. API calls can fail for many reasons:
from openai import OpenAI, APIError, RateLimitError, APIConnectionError
import time
client = OpenAI()
def call_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages
            )
            return response.choices[0].message.content
        except RateLimitError:
            # Rate limit hit - wait and retry
            wait_time = (2 ** attempt) * 2  # Exponential backoff
            print(f"Rate limited. Waiting {wait_time}s...")
            time.sleep(wait_time)
        except APIConnectionError:
            # Network error
            print(f"Connection error. Attempt {attempt + 1}/{max_retries}")
            time.sleep(2)
        except APIError as e:
            # Other API errors
            print(f"API error: {e}")
            raise
    raise Exception("Max retries reached")

2. Rate Limiting
Implement rate limiting to avoid exceeding API limits:
from ratelimit import limits, sleep_and_retry
# Max 50 requests per minute
@sleep_and_retry
@limits(calls=50, period=60)
def call_api(prompt):
    return client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )

3. Timeout Handling
Set timeouts to avoid hanging on long requests:
# Python
client = OpenAI(timeout=30.0)  # 30 second timeout

# Node.js
const client = new OpenAI({
    timeout: 30 * 1000  // 30 seconds
});

4. Secure API Key Handling
✅ Do this:
- Use environment variables (.env files) - see the sketch after these lists
- Never commit API keys to git
- Use secrets management (AWS Secrets Manager, etc.)
- Rotate keys regularly
- Use separate keys for dev/staging/prod
❌ Avoid this:
- Hardcoding API keys directly in the code
- Committing .env files to version control
- Sharing keys in Slack or email
- Using production keys in development
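A minimal sketch of the environment-variable approach, assuming the python-dotenv package and a local .env file (both are assumptions on top of the guide, not requirements of the SDKs):
# .env (add this file to .gitignore - never commit it)
# OPENAI_API_KEY=sk-...

# Python (assumes: pip install python-dotenv)
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()        # reads .env into the process environment
client = OpenAI()    # the SDK picks up OPENAI_API_KEY automatically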
5. Logging & Monitoring
import logging
logger = logging.getLogger(__name__)
def call_llm(prompt):
    logger.info(f"LLM call started - tokens: ~{len(prompt) // 4}")  # rough estimate: ~4 chars per token
    try:
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}]
        )
        # Log usage for cost tracking
        usage = response.usage
        logger.info(f"LLM call success - Input: {usage.prompt_tokens}, Output: {usage.completion_tokens}")
        return response.choices[0].message.content
    except Exception as e:
        logger.error(f"LLM call failed: {e}")
        raise

Production-Ready Patterns
Queue-Based Processing
For high-volume applications, use a queue to handle requests:
Benefits:
- Handles rate limits gracefully
- Automatic retry logic
- Better control over concurrency
- Request prioritization
# Example with Celery (Python)
from celery import Celery
from openai import OpenAI, RateLimitError

app = Celery('tasks', broker='redis://localhost:6379')
client = OpenAI()

@app.task(bind=True, max_retries=3)
def process_llm_request(self, prompt):
    try:
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content
    except RateLimitError as e:
        # Retry after 60 seconds
        raise self.retry(exc=e, countdown=60)

Fallback Strategy
Implement fallbacks in case the primary model fails (a sketch of the helper functions used here follows the example):
def call_with_fallback(prompt):
    try:
        # Try the primary model (GPT-4o)
        return call_openai(prompt, model="gpt-4o")
    except RateLimitError:
        # Fall back to a cheaper model
        logger.warning("GPT-4o rate limited, falling back to GPT-4o-mini")
        return call_openai(prompt, model="gpt-4o-mini")
    except APIError:
        # Fall back to an alternative provider
        logger.warning("OpenAI failed, falling back to Claude")
        return call_anthropic(prompt)
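The call_openai and call_anthropic helpers above are not part of either SDK; a minimal sketch of what they might look like, following the earlier examples (the names and signatures are assumptions):
from openai import OpenAI
from anthropic import Anthropic

openai_client = OpenAI()
anthropic_client = Anthropic()

def call_openai(prompt, model="gpt-4o-mini"):
    # Thin wrapper around the OpenAI chat completions call
    response = openai_client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

def call_anthropic(prompt, model="claude-sonnet-4-20250514"):
    # Thin wrapper around the Anthropic messages call
    message = anthropic_client.messages.create(
        model=model,
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}]
    )
    return message.content[0].text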
Response Validation
Always validate output before you use it:
import json
from pydantic import BaseModel, ValidationError
class ProductInfo(BaseModel):
    name: str
    price: float
    category: str

def extract_product_info(text):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Extract product info as JSON"},
            {"role": "user", "content": text}
        ],
        response_format={"type": "json_object"}
    )
    try:
        data = json.loads(response.choices[0].message.content)
        product = ProductInfo(**data)  # Validate with Pydantic
        return product
    except (json.JSONDecodeError, ValidationError) as e:
        logger.error(f"Invalid response: {e}")
        raise

💰 Cost Tracking
Implement cost tracking to keep track of spend (a short usage example follows the class):
class CostTracker:
    PRICES = {
        "gpt-4o-mini": {"input": 0.15, "output": 0.60},  # USD per 1M tokens
        "gpt-4o": {"input": 2.50, "output": 10.00}
    }

    def __init__(self):
        self.total_cost = 0

    def track_usage(self, model, input_tokens, output_tokens):
        input_cost = (input_tokens / 1_000_000) * self.PRICES[model]["input"]
        output_cost = (output_tokens / 1_000_000) * self.PRICES[model]["output"]
        call_cost = input_cost + output_cost
        self.total_cost += call_cost
        logger.info(f"Call cost: ${call_cost:.6f} | Total: ${self.total_cost:.6f}")
        # Send to monitoring (Datadog, CloudWatch, etc.) - `metrics` is a placeholder for your monitoring client
        metrics.gauge("llm.cost.total", self.total_cost)
        return call_cost
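A short usage example, feeding the usage object returned by a chat completion (as in the logging example above) into the tracker:
tracker = CostTracker()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}]
)

usage = response.usage
tracker.track_usage("gpt-4o-mini", usage.prompt_tokens, usage.completion_tokens)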