41% of all code written in 2025 was AI-generated, and the global AI API market reached $85 billion. Yet most developers use AI only through ChatGPT or Copilot, never directly via an API. The API gives you full control: your own prompts, your own workflow, your own costs. This guide takes you from zero to your first API call across OpenAI, Anthropic, and Google Gemini, with prices, code, and practical tips.
TL;DR — 30 Second Version
- AI API = direct access to models (Claude, GPT, Gemini) without intermediaries, with full control
- You pay per token (chunks of words): input + output, with prices from $0.10 to $25 per million tokens
- Anthropic Claude Sonnet 4.6: best price/performance ($3/$15 per 1M tokens)
- OpenAI GPT-4.1: $2/$8, 1M context — production standard
- Google Gemini 2.5 Flash-Lite: $0.10/$0.40 — cheapest for high volume
- Prompt caching + batch API = up to 95% cost savings
Why Use AI via API (and Not Just ChatGPT)
If you only use AI through web interfaces — ChatGPT, Claude.ai, Gemini — you're missing three things that make AI truly powerful for developers.
1. Full control over prompts and parameters. Chat interfaces give you only basic controls: you can't set temperature, top_p, max_tokens, or request structured output. The API exposes all of these, and they dramatically affect response quality and consistency.
2. Integration into your own applications. API lets you embed AI into anything — chatbots for your company, automatic email processing, data analysis, report generation, code review pipelines. ChatGPT is for people; API is for software.
3. Cost control. ChatGPT Plus costs $20/month regardless of usage. Via API you pay only for what you use. For light usage that might be $2/month. For heavy usage it could be more — but you have visibility and can optimize.
How AI API Works: The Basics
If you've used REST APIs before, AI APIs are familiar. The key concepts are specific to AI:
Tokens — The Currency of AI API
AI models don't work with words, but with tokens. A token is roughly ¾ of an English word or ½ of a non-English word (non-English languages consume more tokens due to longer words and diacritics). You pay for input tokens (your prompt + context) and output tokens (the response).
Practical example: A 500-word prompt ≈ 700 tokens (input). A 1000-word response ≈ 1400 tokens (output). At Claude Sonnet 4.6 prices ($3/1M input, $15/1M output): 700 × $0.000003 + 1400 × $0.000015 = $0.023 per request (about 0.53 CZK). A thousand such requests cost about $23 (roughly 530 CZK).
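The arithmetic above is easy to wrap in a small helper. A minimal sketch, with the Sonnet rates quoted in this article as defaults:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float = 3.0,
                 output_price_per_m: float = 15.0) -> float:
    """Estimate the USD cost of one request at per-million-token rates."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# The worked example from above: 700 input + 1400 output tokens
print(round(request_cost(700, 1400), 4))  # → 0.0231
```

Swap in the price arguments for whichever model you're comparing; the formula is the same for every provider.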
Models — Different Capabilities, Different Prices
Each provider offers a range of models tiered by capability and cost. Rule of thumb: most capable = most expensive. But "most capable" doesn't always mean "best for your use case" — for simple tasks a cheap, fast model works fine.
Context Window — How Much You Can Tell the Model
Context window is the maximum tokens the model can process in one request (input + output together). In 2026, standards moved dramatically: Claude Opus/Sonnet 4.6 and GPT-4.1 offer 1 million tokens of context — that's an entire book or large codebase in one prompt.
Temperature and Parameters
Temperature controls "creativity." Temperature 0 = deterministic, consistent answers (ideal for code, data extraction). Temperature 1 = creative, variable (ideal for copywriting, brainstorming). Most developers start at 0.3–0.7 and tune per use case.
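Under the hood, temperature rescales the model's token probabilities before sampling: raw scores (logits) are divided by the temperature, so low values sharpen the distribution toward the top token and high values flatten it. A toy illustration of that mechanism (not any provider's actual code; temperature 0 is typically treated as greedy argmax rather than division by zero):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                      # scores for three candidate tokens
low = softmax_with_temperature(logits, 0.2)   # near-deterministic
high = softmax_with_temperature(logits, 2.0)  # much flatter
print(low[0] > high[0])  # → True: low temperature concentrates mass on the top token
```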
The Big Three: OpenAI, Anthropic, Google
In 2026 three providers dominate. Each has strengths, pricing, and best use cases.
Anthropic (Claude API)
Best for: Coding, reasoning, long contexts
Models:
- Haiku 4.5: $1/$5 per 1M tokens — fast, lightweight tasks
- Sonnet 4.6: $3/$15 — ideal for coding and analysis
- Opus 4.6: $5/$25 — complex reasoning, agents
Unique features: Extended thinking (model reasons aloud), prompt caching (90% savings on repeated context), PDF processing, tool use, OpenAI SDK compatibility.
SDKs: Python (pip install anthropic) and TypeScript (npm install @anthropic-ai/sdk)
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-sonnet-4-6-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain REST API like I'm a junior developer."}
    ]
)
print(message.content[0].text)
OpenAI (GPT API)
Best for: General purpose, speed, breadth
Models:
- GPT-4.1 mini: $0.40/$1.60 per 1M tokens — fast, cheap tasks
- GPT-4.1: $2/$8 — production standard
- GPT-4o: $2.50/$10 — multimodal (text + images)
Unique features: Largest ecosystem, DALL-E images, Whisper audio, function calling, Assistants API for stateful chat.
SDKs: Python (pip install openai) and TypeScript
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": "You are an experienced software architect."},
        {"role": "user", "content": "Design an e-commerce system with AI recommendations."}
    ],
    temperature=0.3
)
print(response.choices[0].message.content)
Google (Gemini API)
Best for: Cost-sensitive, high volume, multimodal
Models:
- Gemini 2.5 Flash-Lite: $0.10/$0.40 — ultra-cheap, high volume
- Gemini 2.5 Flash: $0.15/$0.60 — fast production
- Gemini 3.1 Pro: $2/$12 — complex tasks, multimodal
Unique features: Most aggressive pricing, best multimodal (text + images + video + audio natively), Grounding (verifying facts via Google Search), video generation.
SDKs: Python (pip install google-genai) and TypeScript
from google import genai

# Prefer an environment variable over hardcoding the key (see Security below)
client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Analyze this CSV data and find trends."
)
print(response.text)
Your First AI API Project: Step by Step
Let's build something practical — an automatic code reviewer that analyzes Python code and returns structured feedback. We'll use Claude API (best for code).
Step 1: Setup Account and API Key
Go to console.anthropic.com, create an account (email or Google), and add a payment method. Create an API key in the "API Keys" section. Store it in an environment variable — never hardcode it.
# .env file (add to .gitignore!)
ANTHROPIC_API_KEY=sk-ant-api03-XXXXXXXXX
Step 2: Install SDK
# Python
pip install anthropic python-dotenv
# TypeScript
npm install @anthropic-ai/sdk dotenv
Step 3: Create Structured Prompt
The key to good API results is a specific system prompt:
SYSTEM_PROMPT = """You are a senior Python code reviewer. Analyze the code and return:
## Summary
1-2 sentences on overall quality.
## Issues
For each problem:
- **Severity:** CRITICAL / WARNING / INFO
- **Line:** line number
- **Description:** what's wrong and why
- **Fix:** proposed solution
## Positives
What's good about this code (find at least one positive thing).
Write in English. Be specific and constructive. Don't be lenient on security issues."""
Step 4: API Call
import anthropic
from dotenv import load_dotenv

load_dotenv()
client = anthropic.Anthropic()

def review_code(code: str) -> str:
    message = client.messages.create(
        model="claude-sonnet-4-6-20250514",
        max_tokens=2048,
        temperature=0.2,  # Low = consistent
        system=SYSTEM_PROMPT,
        messages=[
            {"role": "user", "content": f"Review this code:\n\n```python\n{code}\n```"}
        ]
    )
    return message.content[0].text

# Usage
code = open("app.py").read()
review = review_code(code)
print(review)
Step 5: Streaming for Better UX
For long responses, use streaming — the user sees output as it arrives:
with client.messages.stream(
    model="claude-sonnet-4-6-20250514",
    max_tokens=2048,
    messages=[{"role": "user", "content": prompt}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
Cost Optimization: Practical Strategies
AI APIs can be expensive if unoptimized. Here are strategies that save 50–95%:
1. Prompt Caching (Up to 90% Savings)
If you send the same system prompt or context repeatedly, caching means you pay full price for those tokens only once; subsequent requests read them from the cache at a fraction of the rate. Perfect for company documentation, rules, or reference data.
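With the Anthropic SDK, caching is opt-in per content block: you attach a cache_control marker to the large, stable part of the prompt. A sketch of how that payload is shaped (the prompt text is a placeholder):

```python
def cached_system_block(text: str) -> list:
    """Wrap a large, stable system prompt so the API can cache it."""
    return [{
        "type": "text",
        "text": text,
        "cache_control": {"type": "ephemeral"},  # marks this block as cacheable
    }]

system = cached_system_block("Your company documentation or rules go here.")
print(system[0]["cache_control"]["type"])  # → ephemeral
```

Pass the result as `system=system` to `client.messages.create(...)`: the first request writes the cache, and later requests within the cache lifetime read it at the discounted rate.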
2. Batch API (50% Discount)
For non-real-time processing (analyzing 1000 documents overnight), Batch API processes asynchronously at half price. Both Anthropic and OpenAI offer batch endpoints.
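With Anthropic's Message Batches endpoint, you submit a list of requests, each tagged with a custom_id so you can match results to inputs later. A sketch of building that list (the model name and prompts are placeholders; submission itself goes through `client.messages.batches.create`):

```python
def build_batch(prompts: list, model: str = "claude-haiku-4-5") -> list:
    """Turn a list of prompts into batch request entries with stable IDs."""
    return [
        {
            "custom_id": f"doc-{i}",  # used to match results back to inputs
            "params": {
                "model": model,
                "max_tokens": 512,
                "messages": [{"role": "user", "content": p}],
            },
        }
        for i, p in enumerate(prompts)
    ]

batch = build_batch(["Summarize document A.", "Summarize document B."])
print(len(batch), batch[0]["custom_id"])  # → 2 doc-0
```

Results arrive asynchronously (typically within 24 hours), which is exactly the trade-off that buys the 50% discount.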
3. Right Model for the Task
Don't use Opus/GPT-4o for everything. For classification, extraction, simple formatting: Haiku/GPT-4.1-mini costs a fraction. Start cheap, upgrade only if quality isn't sufficient.
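A simple way to apply this rule in code is a router that maps task types to model tiers. The mapping below is illustrative (the model ID strings are placeholders based on the tiers in this article), but the pattern keeps model choice in one place:

```python
# Illustrative tiering: cheap models for simple tasks, capable ones for hard tasks
MODEL_BY_TASK = {
    "classification": "claude-haiku-4-5",
    "extraction": "claude-haiku-4-5",
    "coding": "claude-sonnet-4-6",
    "agent": "claude-opus-4-6",
}

def pick_model(task: str) -> str:
    """Default to the mid-tier model for unknown task types."""
    return MODEL_BY_TASK.get(task, "claude-sonnet-4-6")

print(pick_model("classification"))  # → claude-haiku-4-5
print(pick_model("something-new"))   # → claude-sonnet-4-6
```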
4. Combination: Caching + Batch + Small Model
Combined, these strategies can reach roughly 95% savings. Example: 10,000 requests with a fixed 1,000-token system prompt and a 200-token response cost about $100 on Opus with no optimization, but about $5 on Haiku with caching and batch.
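The combined-savings claim can be sanity-checked with a quick calculation. Assumptions (consistent with the prices in this article, and ignoring the one-time cache write): Opus at $5/$25 as the unoptimized baseline, Haiku at $1/$5, cache reads at 10% of the input rate, and a flat 50% batch discount:

```python
def workload_cost(n, in_tok, out_tok, in_price, out_price,
                  cache_discount=1.0, batch_discount=1.0):
    """Total USD cost for n requests at per-million-token prices."""
    input_cost = n * in_tok * in_price / 1e6 * cache_discount
    output_cost = n * out_tok * out_price / 1e6
    return (input_cost + output_cost) * batch_discount

# 10,000 requests, 1,000-token cached system prompt, 200-token response
baseline = workload_cost(10_000, 1_000, 200, 5, 25)        # Opus, no optimization
optimized = workload_cost(10_000, 1_000, 200, 1, 5,
                          cache_discount=0.1,              # cached input at 10%
                          batch_discount=0.5)              # batch halves the bill
print(round(baseline), round(optimized, 2))  # → 100 5.5
```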
Security and Best Practices
Working with AI APIs requires specific security considerations:
10 AI API Security Rules
- API key never in code — use environment variables or secret managers (Vault, AWS Secrets Manager)
- API key never in Git — add .env to .gitignore, use git-secrets pre-commit hook
- Rotate keys — change API keys every 3 months
- Set spending limits — all providers let you set monthly cap
- Don't send PII — anonymize personal data before sending to API
- Validate AI output — never trust AI output without checking, especially for code
- Rate limiting — implement exponential backoff for retries
- Log requests — for debugging and cost tracking (don't log API keys)
- Test prompts — prompt injection is real; test adversarial inputs
- GDPR compliance — check data processing agreements with providers
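The rate-limiting rule above is usually implemented as exponential backoff with jitter: wait roughly 1s, 2s, 4s between retries, with some randomness so many clients don't retry in lockstep. A self-contained sketch:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` on failure, doubling the delay (with jitter) each attempt."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
            time.sleep(delay)

# Usage: with_backoff(lambda: client.messages.create(...))
```

In production, catch only the provider's retryable errors (HTTP 429 and 5xx) rather than bare Exception, so genuine bugs still fail fast.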
5 Practical Use Cases
1. Automatic code review in CI/CD pipeline. Attach Claude API to GitHub Actions. On every PR, auto-review code for security, bugs, and performance. Cost: ~$0.05 per PR. Savings: 15–30 minutes per PR.
2. Company chatbot over your documentation. RAG architecture: index your docs, retrieve relevant chunks, send them to the API with the question. Customer support 24/7 for a fraction of live-agent cost. Cost: ~$0.02–0.10 per conversation.
3. Automated reports. Fetch data (SQL, CSV, API), send it to the API for analysis and a narrative summary. Reports that took an hour now take minutes. Cost: ~$0.10 per report.
4. Translation and localization. AI APIs translate UI text, docs, and marketing copy. Claude/GPT excel at CZ↔EN translation. Cost: ~$0.01–0.05 per page.
5. Test generation. From a function description, generate unit tests, including edge cases. Saves 30–60 minutes of manual test writing.
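The retrieval step in use case 2 can start much simpler than a vector database: score chunks by keyword overlap with the question and send the winners to the API as context. A toy sketch (real systems use embeddings, but the shape of the pipeline is the same):

```python
def retrieve(question: str, chunks: list, k: int = 2) -> list:
    """Return the k chunks sharing the most words with the question."""
    q_words = set(question.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q_words & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

docs = [
    "Refunds are processed within 14 days of a return request.",
    "Our office is open Monday through Friday.",
    "Shipping takes 3-5 business days within the EU.",
]
context = retrieve("How long do refunds take?", docs, k=1)
print(context[0])  # → the refunds chunk
```

The retrieved `context` then goes into the prompt alongside the user's question, keeping input tokens (and cost) bounded no matter how large your documentation grows.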
Getting Started: Your Checklist
Day 1:
- Create account (console.anthropic.com or openai.com)
- Get $5 free credit
- Install SDK
- Make first API call (copy example from this article)
Day 2:
- Write system prompt for your use case
- Implement streaming for better UX
Days 3–4:
- Try tool use — connect model to your functions
- Compare providers — same prompt on Claude, GPT, Gemini
Week 2:
- Optimize costs — caching, batch API, right model
Week 3–4:
- Deploy first production integration
Bottom Line
AI API isn't just for startups and enterprise. One developer with an API key and a good system prompt can automate work that previously required a team. Investing time in learning AI API is the best career investment in 2026.
Ready to Put This Into Practice?
Building with AI APIs is about more than integration — it's about understanding how to leverage these powerful tools safely and cost-effectively.
At White Veil Industries, we help teams design, build, and deploy AI integrations that actually deliver ROI. We've worked with companies across industries to architect solutions using Claude, OpenAI, and Google APIs.
Book a Discovery Call → and let's discuss how AI APIs can solve specific problems in your business.
Sources: Anthropic API Docs 2026, OpenAI API Pricing 2026, Google Gemini API Docs 2026