Session 02 · Prompt Engineering · 2026

Tokens & Context Windows

Generative AI & Prompt Engineering — TARAhut AI Labs

Tokens are the currency of AI. Every word costs tokens. Every response costs tokens. Today you learn to count them, optimize them, and understand why AI "forgets" things in long conversations. "Tokens are the AI's currency. Today you will learn to calculate their cost."


Section 01

What Are Tokens?


"How much does this prompt cost?" That's the question we answer today. Tokens are the basic units AI processes — roughly 3/4 of a word in English. But Hindi and Punjabi? That's a different story. "One English word ≈ 1 token. One Punjabi word = 3-5 tokens. This is essential to know."

💰

The Token = Currency Analogy

Think of tokens like money. Everything in AI costs tokens. Your prompt uses input tokens. The response uses output tokens. Longer prompts = more expensive. Longer responses = more expensive. Different languages = different costs. "Just as money gets spent, so do tokens."

💬
"Hello" = 1 token
Common words are single tokens
🔬
"ChatGPT" = 3 tokens
Uncommon words get split
🇮🇳
"पंजाब" = 5+ tokens
Hindi/Punjabi uses more tokens per word
🔢
Spaces & punctuation
Yes, even spaces cost tokens

🧪 Lab: Token Counting

Open the OpenAI Tokenizer and test these:

📋 Click to copy: Token counting exercises — English vs Punjabi comparison
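Before reaching for the real tokenizer, a rough rule of thumb helps for planning. The sketch below uses the common ~4-characters-per-token heuristic for English text — an assumption, not the model's actual tokenizer, so counts from the OpenAI Tokenizer will differ:

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate for English text: ~1 token per 4 characters
    (equivalently, ~3/4 of a word per token). This is a planning
    heuristic only; real counts come from the model's tokenizer."""
    return max(1, round(len(text) / 4))

prompt = "Write a professional email to my boss requesting leave next week."
print(estimate_tokens(prompt))  # heuristic estimate, not an exact count
```

Compare this estimate against the OpenAI Tokenizer's exact count for the same text — the gap shows why the heuristic only works for ballpark budgeting, and why it fails badly for Punjabi or Hindi script.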

🧪 Lab: Tokenization Surprises

📋 Click to copy: Tokenization deep dive — spaces, numbers, code, and multilingual
🔥

Why This Matters for Punjab

If you write prompts in Punjabi, they use 3-5x more tokens than the same prompt in English. This means: (1) You hit context window limits faster. (2) API costs are higher. (3) You get shorter responses before hitting token limits. Pro tip: Write prompts in English, ask for responses in Punjabi. "Write the prompt in English, ask for the answer in Punjabi."
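The 3-5x claim can be sanity-checked with back-of-envelope arithmetic. The per-word rates below are illustrative assumptions, not measured values — verify them with the tokenizer lab above:

```python
# Back-of-envelope: the same 50-word instruction in English vs Punjabi.
# Assumed averages (illustrative): ~1.3 tokens/word for English,
# ~4 tokens/word for Punjabi script under common BPE vocabularies.
words = 50
english_tokens = round(words * 1.3)   # ~65
punjabi_tokens = round(words * 4.0)   # ~200

print(f"English: ~{english_tokens} tokens, Punjabi: ~{punjabi_tokens} tokens")
print(f"Punjabi costs ~{punjabi_tokens / english_tokens:.1f}x more")
```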

You can now estimate token costs for any prompt — a skill most AI users never develop.

Next: Context windows — why AI "forgets" in long conversations.

TARAhut AI Labs · tarahutailabs.com · +91 92008-82008
✅ Section 1 complete! Tokens decoded. Now: context windows.
Section 02

Context Windows — AI's Working Memory


The context window is AI's short-term memory. Everything — your prompts AND AI's responses — must fit inside this window. When it overflows, AI starts forgetting. "When the window fills up, the AI forgets the older things."

📈 Context Window Comparison — 2026
🟩
GPT-4o
128K tokens (~300 pages)
🟣
Claude Sonnet
200K tokens (~500 pages)
🔵
Gemini Pro
1M tokens (~2,300 pages)
🦙
Llama 3
8K-128K tokens
Claude can hold a 500-page book in memory; Gemini, roughly 2,300 pages. But bigger isn't always better — attention quality can degrade in very long contexts.
💡

The Sliding Window Analogy

Imagine reading a book through a window that only shows 300 pages. As you read page 301, page 1 slides off the left side and is forgotten forever. That's the context window. Your entire conversation — every prompt you've sent and every response AI has given — must fit in this window. "Like a window onto a book — a new page comes in, the old one is forgotten."

🧪 Experiment: Context Window Stress Test

Test when AI starts "forgetting" your instructions:

📋 Click to copy: "PINEAPPLE" instruction persistence test — see when AI forgets
After pasting this, have a normal 20+ exchange conversation. Check: Does it still say PINEAPPLE after message 10? 15? 20? Document the exact point where it forgets. Then try the same in Claude — compare.

🧪 Experiment: Long Document Analysis

📋 Click to copy: Long context retrieval test — can AI find info buried in long text?

Context Window Calculator

Build a quick reference: if your average prompt is ~200 tokens and your average response is ~500 tokens, each exchange uses ~700 tokens. In GPT-4o's 128K window, that's about 182 full exchanges before the window fills up. Calculate for YOUR typical usage.
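The calculation above can be sketched as a small helper. Note it ignores system prompts and formatting overhead, so treat the result as an upper bound:

```python
def exchanges_before_full(window_tokens: int, prompt_tokens: int = 200,
                          response_tokens: int = 500) -> int:
    """How many complete prompt+response exchanges fit before the
    context window overflows (ignoring system-prompt overhead)."""
    return window_tokens // (prompt_tokens + response_tokens)

print(exchanges_before_full(128_000))   # GPT-4o-sized window  -> 182
print(exchanges_before_full(200_000))   # Claude Sonnet-sized  -> 285
```

Plug in your own average prompt and response sizes to see when your conversations will start "forgetting."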

You now understand why long conversations "break" and how to calculate your limits.

Next: The money side — how much does AI actually cost?

💪 50% through! Now let's talk money.
Section 03

Token Economics

If you're building anything with AI APIs — chatbots, automation, business tools — you MUST understand token pricing. The difference between smart and wasteful prompting can be ₹10,000/month. "Understand token economics — save money."

GPT-4o💲

~$2.50 / 1M input tokens

~$10 / 1M output tokens. A 1000-token prompt + 2000-token response costs ~$0.0225. Cheap for one call. But 10,000 calls/day for a business ≈ $225/day ≈ ₹19,000/day.

Claude Sonnet💲

~$3 / 1M input tokens

~$15 / 1M output tokens. Slightly pricier but larger context window. A customer service bot doing 1000 conversations/day: ~$270/day = ~₹22,500/day.

Optimization Savings💰

30-50% cost reduction

Optimized prompts use fewer tokens for the same output quality. If you save 40% on tokens, that ₹22,500/day becomes ₹13,500/day. ₹2.7 lakh/month saved just from better prompts.

⚠️

Input vs Output Token Pricing

Output tokens cost 3-5x more than input tokens. This means: (1) Long AI responses are expensive. (2) Asking for "be concise" saves money. (3) Specifying word count controls output cost. Pro tip for Punjab businesses: "Ask for short answers — it saves money."

🧪 Lab: Build a Cost Calculator

📋 Click to copy: Build a token cost calculator spreadsheet
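A minimal version of that calculator, using the approximate list prices quoted in this section — verify current pricing before budgeting on it:

```python
# Illustrative price table (USD per 1M tokens), taken from the approximate
# rates quoted above. Prices change; check the providers' pricing pages.
PRICES = {
    "gpt-4o":        {"input": 2.50, "output": 10.00},
    "claude-sonnet": {"input": 3.00, "output": 15.00},
}

def call_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of a single API call, given token counts for each direction."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# The worked example from the text: 1000-token prompt + 2000-token response.
cost = call_cost_usd("gpt-4o", 1000, 2000)
print(f"${cost:.4f} per call")                            # -> $0.0225 per call
print(f"${cost * 10_000:,.0f}/day at 10,000 calls/day")   # -> $225/day
```

Extending the same function to Claude Sonnet gives the cross-model comparison the lab asks for, and multiplying by your daily call volume turns it into a monthly budget.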

🧪 Lab: Cost Comparison Across Models

📋 Click to copy: Cross-model cost comparison for a Punjab business

"I only use free tools." — That's fine for personal use. But if you're building a business tool or automating anything at scale, token economics determines profitability. The difference between a profitable AI business and a money-losing one is token optimization.

You can now calculate the EXACT cost of any AI operation — a business-critical skill.

Next: Token optimization lab — write the same prompt in 40% fewer tokens.

🔥 67% done! Now optimize like a pro.
Section 04

Token Optimization Lab

Same meaning. Fewer tokens. Better results. This is the skill that separates casual AI users from professionals. "The same work in fewer tokens — that is professional prompting."

💴

Verbose: 67 tokens

"I would really appreciate it if you could please help me with the task of writing a professional email to my boss in which I explain to him that I would like to request some time off from work next week because I have a medical appointment."

✕ Wasteful

Optimized: 28 tokens

"Write a professional email to my boss requesting leave next week for a medical appointment. Formal tone, 100 words max."

✓ 58% fewer tokens

🧪 Challenge: Optimize These Prompts

📋 Click to copy: Optimize 3 verbose prompts — same meaning, 40% fewer tokens
💡

Token Optimization Rules

1. Cut filler words: "please help me with" → just state the task.
2. Remove redundancy: "detailed and comprehensive" → "detailed" (they mean the same).
3. Use abbreviations: "social media platforms like Instagram and Facebook" → "Instagram + Facebook marketing".
4. Specify format up front: "bullet points, 200 words" prevents rambling responses (saves output tokens).
5. English prompts, target-language output: Write instructions in English, ask for Punjabi output.
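A one-liner makes the savings from the email example measurable — useful for scoring your own attempts at the optimization challenge:

```python
def token_savings(before: int, after: int) -> float:
    """Percent reduction in token count after optimizing a prompt."""
    return (before - after) / before * 100

# The email example above: 67 tokens verbose vs 28 tokens optimized.
print(f"{token_savings(67, 28):.0f}% fewer tokens")  # -> 58% fewer tokens
```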

✅ Session 2 Mastery Checklist

You can now count, calculate, and optimize tokens — a skill most AI users never develop.

Next: Quiz — test your token knowledge.

🧠 Quiz time — tokens, context, and cost!
Section 05

Test Your Token Knowledge

8 questions from a pool of 18. "Every question tests your practical understanding."

Section 06

Session 2 Complete!


You now understand the economics and mechanics of AI at the token level.

🎓

Key Takeaways

✅ Tokens = AI's currency (~3/4 of a word in English)
✅ Context window = AI's working memory (128K-1M tokens)
✅ Punjabi/Hindi uses 3-5x more tokens than English
✅ Output tokens cost 3-5x more than input tokens
✅ Optimization can save 30-50% on token costs
✅ Longer is NOT better — optimize your prompts

Homework Before Session 3

🔮

Preview: Session 3 — Temperature & Parameters

"Tomorrow we learn about temperature. Same prompt, wildly different outputs. Why? Because parameters control HOW the model generates text. You'll run experiments at temp 0, 0.5, and 1.0 and see the dramatic difference."

"In Session 1 you learned the architecture; in Session 2, the economics. Tomorrow, parameters — then you'll have a complete picture of how AI works on the inside."
Session 3: Temperature & Parameters →