Built as an interactive training module by TARAhut AI Labs
Generative AI & Prompt Engineering · Session 04 · tarahutailabs.com
© 2026 TARAhut AI Labs. All rights reserved.
GPT-4o vs Claude Sonnet vs Gemini Pro vs Llama. Blind test. YOU judge. Is your favourite model actually the best, or just a habit?
"Which one is your favourite?" Let's poll first, then challenge your assumptions with data. No model is universally best; the right model depends on the task.
The most widely used LLM. Fast, multimodal (text + image + voice), massive ecosystem. Context: 128K tokens. Strengths: speed, coding, creative writing, plugins. Weaknesses: can be verbose; knowledge is limited by its training cutoff.
Best for: General tasks, coding, rapid prototyping
Known for nuanced, thoughtful responses. Best at following complex instructions faithfully. Context: 200K tokens. Strengths: Long-form writing, analysis, instruction following, safety. Weakness: Can be overly cautious.
Best for: Long documents, research, careful analysis
Google's model with massive context and deep search integration. Context: 1M tokens. Strengths: Huge context, Google ecosystem, real-time web, multimodal. Weakness: Can be less precise on nuance.
Best for: Research, long documents, Google workflow
Open-source, runs locally. No data sent to the cloud. Customizable and free. Context: 8K-128K tokens, depending on the variant. Strengths: privacy, customization, no API cost, offline use. Weakness: smaller and generally less capable than the frontier cloud models.
Best for: Privacy, local deployment, customization
These models are closer in quality than most people think. The difference between a great prompt on a "weaker" model and a lazy prompt on a "stronger" model? The great prompt wins almost every time. Prompting skill matters more than model choice. "Prompt quality > model quality."
Next: The blind test — judge without brand bias.
Same prompt sent to all 4 models. Responses labelled A, B, C, D; model names hidden. Score each response, then reveal. "Don't look at the brand, look at the output."
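The blind-test setup above can be sketched in a few lines of Python. The response strings here are placeholders; in practice you would paste in the real outputs each model gave for the same prompt.

```python
import random

# Placeholder outputs -- replace with the actual responses to your prompt.
responses = {
    "GPT-4o": "response text 1",
    "Claude Sonnet": "response text 2",
    "Gemini Pro": "response text 3",
    "Llama": "response text 4",
}

def anonymize(responses):
    """Shuffle the responses and assign neutral labels A-D."""
    items = list(responses.items())
    random.shuffle(items)
    key = {}    # label -> model name (kept hidden until the reveal)
    blind = {}  # label -> response text (all the judge sees)
    for label, (model, text) in zip("ABCD", items):
        key[label] = model
        blind[label] = text
    return blind, key

blind, key = anonymize(responses)
# Judges score `blind` without seeing model names, then reveal `key`.
```

Shuffling before labelling is what removes brand bias: the judge cannot infer the model from the label position.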
Now use YOUR hardest prompt (from homework) on all 4 models. Score blind. Which model wins for YOUR specific use case?
For each response, rate 1-10 on these criteria:
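The 1-10 ratings can be rolled up into a per-response average like this. The criterion names and scores below are illustrative examples only, not the official rubric.

```python
# Illustrative blind-test scores: label -> {criterion: score out of 10}.
scores = {
    "A": {"accuracy": 8, "clarity": 7, "completeness": 9},
    "B": {"accuracy": 6, "clarity": 9, "completeness": 7},
    "C": {"accuracy": 9, "clarity": 8, "completeness": 8},
    "D": {"accuracy": 7, "clarity": 6, "completeness": 8},
}

def average(criteria):
    """Mean score across all criteria for one response."""
    return sum(criteria.values()) / len(criteria)

# Rank labels from best to worst average score.
ranked = sorted(scores, key=lambda label: average(scores[label]), reverse=True)
# ranked[0] is the winning label; only now do you reveal which model it was.
```

Averaging per criterion (rather than giving one gut-feel number) keeps the comparison honest when one response is, say, accurate but poorly written.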
Next: Mapping each model's specific strengths.
Based on industry testing and your own experiments, here's when to use which model. "Which model fits which task."
Professional prompt engineers don't pick ONE model; they use the right model for the right task, like a carpenter choosing different tools. You don't use a hammer for everything. Build your personal model-selection framework.
Next: Build your personal decision framework.
Create personal rules: "If I need [X], I use [model] because [reason]." This goes in your portfolio. "Your personal framework: which model for which job."
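The "If I need [X], I use [model] because [reason]" pattern can be written directly as code. The rules below are example entries drawn from the model cards earlier in this session; replace them with your own conclusions from the blind test.

```python
# Personal model-selection framework: (task keyword, model, reason).
# Example rules only -- fill in your own from your blind-test results.
RULES = [
    ("long document analysis", "Claude Sonnet", "200K context, faithful instruction following"),
    ("huge context / live web research", "Gemini Pro", "1M context, Google search integration"),
    ("privacy / offline work", "Llama", "runs locally, no data leaves the machine"),
    ("general coding and prototyping", "GPT-4o", "fast, strong at coding, large ecosystem"),
]

def pick_model(need):
    """Return (model, reason) for the first rule whose task mentions `need`."""
    for task, model, reason in RULES:
        if need.lower() in task:
            return model, reason
    return "GPT-4o", "default general-purpose choice"

model, reason = pick_model("privacy")
```

Keeping the framework as an ordered list makes your priorities explicit: the first matching rule wins, so put your most important constraints (e.g. privacy) near the top.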
Next: Quiz time! Test your Week 1 knowledge.
8 questions from a pool of 18.
4 sessions. Architecture, tokens, parameters, model comparison. You now understand HOW LLMs work. "In four sessions, you've seen how AI works under the hood."
✅ Session 1: Transformer architecture — assembly line, attention mechanism
✅ Session 2: Tokens & context windows — AI's currency and memory
✅ Session 3: Temperature & parameters — controlling AI output
✅ Session 4: Model comparison — data-driven model selection
Key insight: Understanding the engine makes you a better driver. Next week, you learn the advanced driving techniques.
"Agle hafte assi frameworks sikhange: CRISP, chain-of-thought, few-shot, system prompts. Tuhade prompts good ton exceptional ho jaane ne. Ih professional prompt engineering hai." Your prompts are about to transform.