41% of all code written in 2025 was AI-generated, and the global AI API market reached $85 billion. Yet most developers use AI only through ChatGPT or Copilot, never directly via an API. The API gives you full control: your own prompts, your own workflow, your own costs. This guide takes you from zero to your first API call across OpenAI, Anthropic, and Google Gemini, with prices, code, and practical tips.
TL;DR — 30 Second Version
- AI API = direct access to models (Claude, GPT, Gemini) without intermediaries, with full control
- You pay per token (chunks of words): input + output, with prices from $0.10 to $25 per million tokens
- Anthropic Claude Sonnet 4.6: best price/performance ($3/$15 per 1M tokens)
- OpenAI GPT-4.1: $2/$8, 1M context — production standard
- Google Gemini 2.5 Flash-Lite: $0.10/$0.40 — cheapest for high volume
- Prompt caching + batch API = up to 95% cost savings
Why Use AI via API (and Not Just ChatGPT)
If you only use AI through web interfaces — ChatGPT, Claude.ai, Gemini — you're missing three things that make AI truly powerful for developers.
1. Full control over prompts and parameters. Chat interfaces give you only basic controls: you can't set temperature, top_p, max_tokens, or request structured output. The API exposes all of these, and they dramatically affect response quality and consistency.
2. Integration into your own applications. API lets you embed AI into anything — chatbots for your company, automatic email processing, data analysis, report generation, code review pipelines. ChatGPT is for people; API is for software.
3. Cost control. ChatGPT Plus costs $20/month regardless of usage. Via API you pay only for what you use. For light usage that might be $2/month. For heavy usage it could be more — but you have visibility and can optimize.
How AI API Works: The Basics
If you've used REST APIs before, AI APIs are familiar. The key concepts are specific to AI:
Tokens — The Currency of AI API
AI models don't work with words, but with tokens. A token is roughly ¾ of an English word or ½ of a non-English word (non-English languages consume more tokens due to longer words and diacritics). You pay for input tokens (your prompt + context) and output tokens (the response).
Practical example: A 500-word prompt ≈ 700 tokens (input). A 1000-word response ≈ 1400 tokens (output). At Claude Sonnet 4.6 prices ($3/1M input, $15/1M output): 700 × $0.000003 + 1400 × $0.000015 = $0.023 per request (about 0.53 CZK). A thousand such requests cost about $23 (roughly 530 CZK).
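The arithmetic above is easy to wrap in a small helper. A minimal sketch, with the Sonnet rates quoted in this article as defaults:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float = 3.0,
                 output_price_per_m: float = 15.0) -> float:
    """Estimate the USD cost of one request at per-million-token rates."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# The worked example from above: 700 input + 1400 output tokens
print(round(request_cost(700, 1400), 4))  # → 0.0231
```

Swap in the price arguments for whichever model you're comparing; the formula is the same for every provider.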
Models — Different Capabilities, Different Prices
Each provider offers a range of models tiered by capability and cost. Rule of thumb: most capable = most expensive. But "most capable" doesn't always mean "best for your use case" — for simple tasks a cheap, fast model works fine.
Context Window — How Much You Can Tell the Model
Context window is the maximum tokens the model can process in one request (input + output together). In 2026, standards moved dramatically: Claude Opus/Sonnet 4.6 and GPT-4.1 offer 1 million tokens of context — that's an entire book or large codebase in one prompt.
Temperature and Parameters
Temperature controls "creativity." Temperature 0 = deterministic, consistent answers (ideal for code, data extraction). Temperature 1 = creative, variable (ideal for copywriting, brainstorming). Most developers start at 0.3–0.7 and tune per use case.
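Under the hood, temperature rescales the model's token probabilities before sampling: raw scores (logits) are divided by the temperature, so low values sharpen the distribution toward the top token and high values flatten it. A toy illustration of that mechanism (not any provider's actual code; temperature 0 is typically treated as greedy argmax rather than division by zero):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                      # scores for three candidate tokens
low = softmax_with_temperature(logits, 0.2)   # near-deterministic
high = softmax_with_temperature(logits, 2.0)  # much flatter
print(low[0] > high[0])  # → True: low temperature concentrates mass on the top token
```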
The Big Three: OpenAI, Anthropic, Google
In 2026 three providers dominate. Each has strengths, pricing, and best use cases.
Anthropic (Claude API)
Best for: Coding, reasoning, long contexts
Models:
- Haiku 4.5: $1/$5 per 1M tokens — fast, lightweight tasks
- Sonnet 4.6: $3/$15 — ideal for coding and analysis
- Opus 4.6: $5/$25 — complex reasoning, agents
Unique features: Extended thinking (model reasons aloud), prompt caching (90% savings on repeated context), PDF processing, tool use, OpenAI SDK compatibility.
SDKs: Python (pip install anthropic) and TypeScript (npm install @anthropic-ai/sdk)
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-sonnet-4-6-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain REST API like I'm a junior developer."}
    ]
)
print(message.content[0].text)
OpenAI (GPT API)
Best for: General purpose, speed, breadth
Models:
- GPT-4.1 mini: $0.40/$1.60 per 1M tokens — fast, cheap tasks
- GPT-4.1: $2/$8 — production standard
- GPT-4o: $2.50/$10 — multimodal (text + images)
Unique features: Largest ecosystem, DALL-E images, Whisper audio, function calling, Assistants API for stateful chat.
SDKs: Python (pip install openai) and TypeScript
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": "You are an experienced software architect."},
        {"role": "user", "content": "Design an e-commerce system with AI recommendations."}
    ],
    temperature=0.3
)
print(response.choices[0].message.content)
Google (Gemini API)
Best for: Cost-sensitive, high volume, multimodal
Models:
- Gemini 2.5 Flash-Lite: $0.10/$0.40 — ultra-cheap, high volume
- Gemini 2.5 Flash: $0.15/$0.60 — fast production
- Gemini 3.1 Pro: $2/$12 — complex tasks, multimodal
Unique features: Most aggressive pricing, best multimodal (text + images + video + audio natively), Grounding (verifying facts via Google Search), video generation.
SDKs: Python (pip install google-genai) and TypeScript
from google import genai

# Prefer an environment variable over hardcoding the key (see Security below)
client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Analyze this CSV data and find trends."
)
print(response.text)
Your First AI API Project: Step by Step
Let's build something practical — an automatic code reviewer that analyzes Python code and returns structured feedback. We'll use Claude API (best for code).
Step 1: Setup Account and API Key
Go to console.anthropic.com, create an account (email or Google), and add a payment method. Create an API key in the "API Keys" section. Store it in an environment variable — never hardcode it.
# .env file (add to .gitignore!)
ANTHROPIC_API_KEY=sk-ant-api03-XXXXXXXXX
Step 2: Install SDK
# Python
pip install anthropic python-dotenv
# TypeScript
npm install @anthropic-ai/sdk dotenv
Step 3: Create Structured Prompt
The key to good API results is a specific system prompt:
SYSTEM_PROMPT = """You are a senior Python code reviewer. Analyze the code and return:
## Summary
1-2 sentences on overall quality.
## Issues
For each problem:
- **Severity:** CRITICAL / WARNING / INFO
- **Line:** line number
- **Description:** what's wrong and why
- **Fix:** proposed solution
## Positives
What's good about this code (find at least one positive thing).
Write in English. Be specific and constructive. Don't be lenient on security issues."""
Step 4: API Call
import anthropic
from dotenv import load_dotenv

load_dotenv()
client = anthropic.Anthropic()

def review_code(code: str) -> str:
    message = client.messages.create(
        model="claude-sonnet-4-6-20250514",
        max_tokens=2048,
        temperature=0.2,  # Low = consistent
        system=SYSTEM_PROMPT,
        messages=[
            {"role": "user", "content": f"Review this code:\n\n```python\n{code}\n```"}
        ]
    )
    return message.content[0].text

# Usage
code = open("app.py").read()
review = review_code(code)
print(review)
Step 5: Streaming for Better UX
For long responses, use streaming — the user sees output as it arrives:
with client.messages.stream(
    model="claude-sonnet-4-6-20250514",
    max_tokens=2048,
    messages=[{"role": "user", "content": prompt}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
Cost Optimization: Practical Strategies
AI APIs can be expensive if unoptimized. Here are strategies that save 50–95%:
1. Prompt Caching (Up to 90% Savings)
If you send the same system prompt or context repeatedly, caching means you pay full price for those tokens only once; subsequent requests read them from the cache at a fraction of the rate. Perfect for company documentation, rules, or reference data.
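With the Anthropic SDK, caching is opt-in per content block: you attach a cache_control marker to the large, stable part of the prompt. A sketch of how that payload is shaped (the prompt text is a placeholder):

```python
def cached_system_block(text: str) -> list:
    """Wrap a large, stable system prompt so the API can cache it."""
    return [{
        "type": "text",
        "text": text,
        "cache_control": {"type": "ephemeral"},  # marks this block as cacheable
    }]

system = cached_system_block("Your company documentation or rules go here.")
print(system[0]["cache_control"]["type"])  # → ephemeral
```

Pass the result as `system=system` to `client.messages.create(...)`: the first request writes the cache, and later requests within the cache lifetime read it at the discounted rate.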
2. Batch API (50% Discount)
For non-real-time processing (analyzing 1000 documents overnight), Batch API processes asynchronously at half price. Both Anthropic and OpenAI offer batch endpoints.
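With Anthropic's Message Batches endpoint, you submit a list of requests, each tagged with a custom_id so you can match results to inputs later. A sketch of building that list (the model name and prompts are placeholders; submission itself goes through `client.messages.batches.create`):

```python
def build_batch(prompts: list, model: str = "claude-haiku-4-5") -> list:
    """Turn a list of prompts into batch request entries with stable IDs."""
    return [
        {
            "custom_id": f"doc-{i}",  # used to match results back to inputs
            "params": {
                "model": model,
                "max_tokens": 512,
                "messages": [{"role": "user", "content": p}],
            },
        }
        for i, p in enumerate(prompts)
    ]

batch = build_batch(["Summarize document A.", "Summarize document B."])
print(len(batch), batch[0]["custom_id"])  # → 2 doc-0
```

Results arrive asynchronously (typically within 24 hours), which is exactly the trade-off that buys the 50% discount.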
3. Right Model for the Task
Don't use Opus/GPT-4o for everything. For classification, extraction, simple formatting: Haiku/GPT-4.1-mini costs a fraction. Start cheap, upgrade only if quality isn't sufficient.
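A simple way to apply this rule in code is a router that maps task types to model tiers. The mapping below is illustrative (the model ID strings are placeholders based on the tiers in this article), but the pattern keeps model choice in one place:

```python
# Illustrative tiering: cheap models for simple tasks, capable ones for hard tasks
MODEL_BY_TASK = {
    "classification": "claude-haiku-4-5",
    "extraction": "claude-haiku-4-5",
    "coding": "claude-sonnet-4-6",
    "agent": "claude-opus-4-6",
}

def pick_model(task: str) -> str:
    """Default to the mid-tier model for unknown task types."""
    return MODEL_BY_TASK.get(task, "claude-sonnet-4-6")

print(pick_model("classification"))  # → claude-haiku-4-5
print(pick_model("something-new"))   # → claude-sonnet-4-6
```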
4. Combination: Caching + Batch + Small Model
Combined, these strategies can reach roughly 95% savings. Example: 10,000 requests with a fixed 1,000-token system prompt and a 200-token response cost about $100 on Opus with no optimization, but about $5 on Haiku with caching and batch.
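The combined-savings claim can be sanity-checked with a quick calculation. Assumptions (consistent with the prices in this article, and ignoring the one-time cache write): Opus at $5/$25 as the unoptimized baseline, Haiku at $1/$5, cache reads at 10% of the input rate, and a flat 50% batch discount:

```python
def workload_cost(n, in_tok, out_tok, in_price, out_price,
                  cache_discount=1.0, batch_discount=1.0):
    """Total USD cost for n requests at per-million-token prices."""
    input_cost = n * in_tok * in_price / 1e6 * cache_discount
    output_cost = n * out_tok * out_price / 1e6
    return (input_cost + output_cost) * batch_discount

# 10,000 requests, 1,000-token cached system prompt, 200-token response
baseline = workload_cost(10_000, 1_000, 200, 5, 25)        # Opus, no optimization
optimized = workload_cost(10_000, 1_000, 200, 1, 5,
                          cache_discount=0.1,              # cached input at 10%
                          batch_discount=0.5)              # batch halves the bill
print(round(baseline), round(optimized, 2))  # → 100 5.5
```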
Security and Best Practices
Working with AI APIs requires specific security considerations:
10 AI API Security Rules
- API key never in code — use environment variables or secret managers (Vault, AWS Secrets Manager)
- API key never in Git — add .env to .gitignore, use git-secrets pre-commit hook
- Rotate keys — change API keys every 3 months
- Set spending limits — all providers let you set monthly cap
- Don't send PII — anonymize personal data before sending to API
- Validate AI output — never trust AI output without checking, especially for code
- Rate limiting — implement exponential backoff for retries
- Log requests — for debugging and cost tracking (don't log API keys)
- Test prompts — prompt injection is real; test adversarial inputs
- GDPR compliance — check data processing agreements with providers
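The rate-limiting rule above is usually implemented as exponential backoff with jitter: wait roughly 1s, 2s, 4s between retries, with some randomness so many clients don't retry in lockstep. A self-contained sketch:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` on failure, doubling the delay (with jitter) each attempt."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
            time.sleep(delay)

# Usage: with_backoff(lambda: client.messages.create(...))
```

In production, catch only the provider's retryable errors (HTTP 429 and 5xx) rather than bare Exception, so genuine bugs still fail fast.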
5 Practical Use Cases
1. Automatic code review in CI/CD pipeline. Attach Claude API to GitHub Actions. On every PR, auto-review code for security, bugs, and performance. Cost: ~$0.05 per PR. Savings: 15–30 minutes per PR.
2. Company chatbot over your documentation. RAG architecture: index your docs, retrieve relevant chunks, send them to the API with the question. Customer support 24/7 for a fraction of live-agent cost. Cost: ~$0.02–0.10 per conversation.
3. Automated reports. Fetch data (SQL, CSV, API), send it to the API for analysis and a narrative summary. Reports that took an hour now take minutes. Cost: ~$0.10 per report.
4. Translation and localization. AI APIs translate UI text, docs, and marketing copy. Claude/GPT excel at CZ↔EN translation. Cost: ~$0.01–0.05 per page.
5. Test generation. From a function description, generate unit tests, including edge cases. Saves 30–60 minutes of manual test writing.
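The retrieval step in use case 2 can start much simpler than a vector database: score chunks by keyword overlap with the question and send the winners to the API as context. A toy sketch (real systems use embeddings, but the shape of the pipeline is the same):

```python
def retrieve(question: str, chunks: list, k: int = 2) -> list:
    """Return the k chunks sharing the most words with the question."""
    q_words = set(question.lower().split())
    scored = sorted(chunks,
                    key=lambda c: len(q_words & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

docs = [
    "Refunds are processed within 14 days of a return request.",
    "Our office is open Monday through Friday.",
    "Shipping takes 3-5 business days within the EU.",
]
context = retrieve("How long do refunds take?", docs, k=1)
print(context[0])  # → the refunds chunk
```

The retrieved `context` then goes into the prompt alongside the user's question, keeping input tokens (and cost) bounded no matter how large your documentation grows.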
Getting Started: Your Checklist
Day 1:
- Create account (console.anthropic.com or openai.com)
- Get $5 free credit
- Install SDK
- Make first API call (copy example from this article)
Day 2:
- Write system prompt for your use case
- Implement streaming for better UX
Days 3–4:
- Try tool use — connect model to your functions
- Compare providers — same prompt on Claude, GPT, Gemini
Week 2:
- Optimize costs — caching, batch API, right model
Week 3–4:
- Deploy first production integration
Bottom Line
AI API isn't just for startups and enterprise. One developer with an API key and a good system prompt can automate work that previously required a team. Investing time in learning AI API is the best career investment in 2026.
Ready to Put This Into Practice?
Building with AI APIs is about more than integration — it's about understanding how to leverage these powerful tools safely and cost-effectively.
At White Veil Industries, we help teams design, build, and deploy AI integrations that actually deliver ROI. We've worked with companies across industries to architect solutions using Claude, OpenAI, and Google APIs.
Book a Discovery Call → and let's discuss how AI APIs can solve specific problems in your business.
Sources: Anthropic API Docs 2026, OpenAI API Pricing 2026, Google Gemini API Docs 2026