Lesson 4: Families of Models

There are several major families of models available today. Each has different strengths, pricing, and context-window sizes. The choice depends on the use case.

Google (Gemini)

  • Models: Gemini Pro, Gemini Ultra, Gemini Flash
  • Strengths: Multimodal (text, images, video, audio), large context windows (up to 1M+ tokens), strong reasoning
  • Best for: Tasks involving mixed media, long-document analysis

Anthropic (Claude)

  • Models: Haiku (fast, cheap), Sonnet (balanced), Opus (most capable)
  • Strengths: Strong instruction-following, long context windows (200K tokens), careful and safety-conscious outputs
  • Best for: Code generation, analysis tasks, structured outputs

OpenAI (GPT and o-series)

  • Models: GPT-4, GPT-4o, o1, o3
  • Strengths: Broad general knowledge, strong coding ability, large ecosystem of tools and integrations
  • Best for: General-purpose tasks, creative writing, code completion

Meta (LLaMA)

  • Models: LLaMA 3, LLaMA 4 (open-weight)
  • Strengths: Open-weight (can self-host), no per-token API fees when self-hosted, customisable through fine-tuning
  • Best for: Organisations that need data sovereignty or offline operation

Mistral AI

  • Models: Mistral Large, Mistral Medium, Codestral
  • Strengths: European-made, open-weight options, competitive performance at lower cost
  • Best for: European regulatory compliance, cost-sensitive deployments
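
The context-window figures quoted in the list can be turned into a quick feasibility check. The sketch below is only an approximation: the 4-characters-per-token ratio is a common rule of thumb for English text, not an exact tokenizer, and the function names and the output reserve are hypothetical. The two window sizes are the ones stated above; check each provider's documentation for current limits.

```python
# Rough rule of thumb (an assumption): about 4 characters per token for English text.
CHARS_PER_TOKEN = 4

# Context windows as quoted in the list above; verify against provider docs.
CONTEXT_WINDOWS = {
    "claude": 200_000,
    "gemini": 1_000_000,
}


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a text from its character length."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def fits(text: str, family: str, reserve_for_output: int = 4_000) -> bool:
    """Return True if the text plus an output reserve fits the family's window."""
    return estimate_tokens(text) + reserve_for_output <= CONTEXT_WINDOWS[family]


document = "x" * 2_000_000  # stand-in for a document of roughly 500K tokens
print(fits(document, "claude"))  # False
print(fits(document, "gemini"))  # True
```

For real workloads, a provider's own tokenizer or token-counting endpoint gives a more reliable number than a character-based estimate.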

Factor          Question to ask
Task type       Is this code, text, or multimodal?
Context size    How much input does the model need to process?
Cost            What is the per-token price? What is the expected volume?
Privacy         Can data leave your infrastructure?
Speed           Is latency critical for this use case?
Accuracy        How important is factual precision vs. creative output?
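
The cost question comes down to simple arithmetic: tokens per request, times price per token, times volume. A minimal sketch follows, assuming prices are quoted per million tokens with separate input and output rates. The function name, workload, and prices are hypothetical placeholders, not any provider's real rates.

```python
def monthly_cost(requests_per_day: int,
                 input_tokens: int,
                 output_tokens: int,
                 price_in_per_million: float,
                 price_out_per_million: float,
                 days: int = 30) -> float:
    """Estimate monthly spend from per-request token counts and per-million-token prices."""
    per_request = (input_tokens * price_in_per_million
                   + output_tokens * price_out_per_million) / 1_000_000
    return per_request * requests_per_day * days


# Hypothetical workload and prices, for illustration only.
cost = monthly_cost(requests_per_day=1_000, input_tokens=2_000, output_tokens=500,
                    price_in_per_million=3.0, price_out_per_million=15.0)
print(f"${cost:.2f}")  # $405.00
```

Running the same calculation with each candidate model's published prices shows how much the expected volume changes the comparison.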

ReArch is model-agnostic — you can configure different models for different agents and switch between providers without changing your workflows.
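
One way to picture model-agnostic configuration is a lookup table from agent name to provider and model, kept separate from the workflow itself. The sketch below is not ReArch's actual configuration format or API. The agent names, model identifiers, and helper functions are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    provider: str
    model: str


# Hypothetical agent names and model identifiers; not ReArch's real configuration.
AGENT_MODELS = {
    "planner": ModelChoice("anthropic", "opus"),
    "coder": ModelChoice("anthropic", "sonnet"),
    "summarizer": ModelChoice("google", "gemini-flash"),
}
DEFAULT = ModelChoice("anthropic", "sonnet")


def model_for(agent: str) -> ModelChoice:
    """Look up the model configured for an agent, falling back to a default."""
    return AGENT_MODELS.get(agent, DEFAULT)


def run_workflow(steps: list[str]) -> list[str]:
    """The workflow names agents only; the model is resolved from configuration."""
    resolved = []
    for agent in steps:
        choice = model_for(agent)
        resolved.append(f"{agent} -> {choice.provider}/{choice.model}")
    return resolved


print(run_workflow(["planner", "coder"]))

# Switching provider for one agent is a configuration change; the workflow is untouched.
AGENT_MODELS["coder"] = ModelChoice("mistral", "codestral")
print(run_workflow(["planner", "coder"]))
```

Because the workflow refers to agents rather than models, swapping a provider touches one configuration entry and nothing else.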