Session 02 · Prompt Engineering · 2026

Tokens & Context Windows

Generative AI & Prompt Engineering — TARAhut AI Labs

Tokens are the currency of AI. Every word costs tokens. Every response costs tokens. Today you learn to count them, optimize them, and understand why AI "forgets" things in long conversations. "Tokens are the AI's currency. Today you will learn to calculate their cost."


Section 01

What Are Tokens?


"How much does this prompt cost?" That's the question we answer today. Tokens are the basic units AI processes — roughly 3/4 of a word in English. But Hindi and Punjabi? That's a different story. "One English word ≈ 1 token. One Punjabi word = 3-5 tokens. This is essential to know."

💰

The Token = Currency Analogy

Think of tokens like money. Everything in AI costs tokens. Your prompt uses input tokens. The response uses output tokens. Longer prompts = more expensive. Longer responses = more expensive. Different languages = different costs. "Just as money gets spent, so do tokens."

💬
"Hello" = 1 token
Common words are single tokens
🔬
"ChatGPT" = 3 tokens
Uncommon words get split
🇮🇳
"पंजाब" = 5+ tokens
Hindi/Punjabi uses more tokens per word
🔢
Spaces & punctuation
Yes, even spaces cost tokens

🧪 Lab: Token Counting

Open the OpenAI Tokenizer and test these:

📋 Click to copy: Token counting exercises — English vs Punjabi comparison
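Before reaching for the real tokenizer, a rough rule of thumb helps for planning. The sketch below uses the common ~4-characters-per-token heuristic for English text — an assumption, not the model's actual tokenizer, so counts from the OpenAI Tokenizer will differ:

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate for English text: ~1 token per 4 characters
    (equivalently, ~3/4 of a word per token). This is a planning
    heuristic only; real counts come from the model's tokenizer."""
    return max(1, round(len(text) / 4))

prompt = "Write a professional email to my boss requesting leave next week."
print(estimate_tokens(prompt))  # heuristic estimate, not an exact count
```

Compare this estimate against the OpenAI Tokenizer's exact count for the same text — the gap shows why the heuristic only works for ballpark budgeting, and why it fails badly for Punjabi or Hindi script.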

🧪 Lab: Tokenization Surprises

📋 Click to copy: Tokenization deep dive — spaces, numbers, code, and multilingual
🔥

Why This Matters for Punjab

If you write prompts in Punjabi, they use 3-5x more tokens than the same prompt in English. This means: (1) You hit context window limits faster. (2) API costs are higher. (3) You get shorter responses before hitting token limits. Pro tip: Write prompts in English, ask for responses in Punjabi. "Write the prompt in English, ask for the answer in Punjabi."
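The 3-5x claim can be sanity-checked with back-of-envelope arithmetic. The per-word rates below are illustrative assumptions, not measured values — verify them with the tokenizer lab above:

```python
# Back-of-envelope: the same 50-word instruction in English vs Punjabi.
# Assumed averages (illustrative): ~1.3 tokens/word for English,
# ~4 tokens/word for Punjabi script under common BPE vocabularies.
words = 50
english_tokens = round(words * 1.3)   # ~65
punjabi_tokens = round(words * 4.0)   # ~200

print(f"English: ~{english_tokens} tokens, Punjabi: ~{punjabi_tokens} tokens")
print(f"Punjabi costs ~{punjabi_tokens / english_tokens:.1f}x more")
```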

You can now estimate token costs for any prompt — a skill most AI users never develop.

Next: Context windows — why AI "forgets" in long conversations.

TARAhut AI Labs · tarahutailabs.com · +91 92008-82008
✅ Section 1 complete! Tokens decoded. Now: context windows.
Section 02

Context Windows — AI's Working Memory


The context window is AI's short-term memory. Everything — your prompts AND AI's responses — must fit inside this window. When it overflows, AI starts forgetting. "When the window fills up, the AI forgets the older things."

📈 Context Window Comparison — 2026
🟩
GPT-4o
128K tokens (~300 pages)
🟣
Claude Sonnet
200K tokens (~500 pages)
🔵
Gemini Pro
1M tokens (~2,300 pages)
🦙
Llama 3
8K-128K tokens
Claude can hold a 500-page book in memory; Gemini, roughly 2,300 pages. But bigger isn't always better — attention quality can degrade in very long contexts.
💡

The Sliding Window Analogy

Imagine reading a book through a window that only shows 300 pages. As you read page 301, page 1 slides off the left side and is forgotten forever. That's the context window. Your entire conversation — every prompt you've sent and every response AI has given — must fit in this window. "Like a window onto a book — a new page comes in, the old one is forgotten."

🧪 Experiment: Context Window Stress Test

Test when AI starts "forgetting" your instructions:

📋 Click to copy: "PINEAPPLE" instruction persistence test — see when AI forgets
After pasting this, have a normal 20+ exchange conversation. Check: Does it still say PINEAPPLE after message 10? 15? 20? Document the exact point where it forgets. Then try the same in Claude — compare.

🧪 Experiment: Long Document Analysis

📋 Click to copy: Long context retrieval test — can AI find info buried in long text?

Context Window Calculator

Build a quick reference: if your average prompt is ~200 tokens and your average response is ~500 tokens, each exchange uses ~700 tokens. In GPT-4o's 128K window, that's about 182 full exchanges before the window fills up. Calculate for YOUR typical usage.
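The calculation above can be sketched as a small helper. Note it ignores system prompts and formatting overhead, so treat the result as an upper bound:

```python
def exchanges_before_full(window_tokens: int, prompt_tokens: int = 200,
                          response_tokens: int = 500) -> int:
    """How many complete prompt+response exchanges fit before the
    context window overflows (ignoring system-prompt overhead)."""
    return window_tokens // (prompt_tokens + response_tokens)

print(exchanges_before_full(128_000))   # GPT-4o-sized window  -> 182
print(exchanges_before_full(200_000))   # Claude Sonnet-sized  -> 285
```

Plug in your own average prompt and response sizes to see when your conversations will start "forgetting."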

You now understand why long conversations "break" and how to calculate your limits.

Next: The money side — how much does AI actually cost?

💪 50% through! Now let's talk money.
Section 03

Token Economics

If you're building anything with AI APIs — chatbots, automation, business tools — you MUST understand token pricing. The difference between smart and wasteful prompting can be ₹10,000/month. "Understand token economics — save money."

GPT-4o💲

~$2.50 / 1M input tokens

~$10 / 1M output tokens. A 1000-token prompt + 2000-token response costs ~$0.0225. Cheap for one call. But 10,000 calls/day for a business ≈ $225/day ≈ ₹19,000/day.

Claude Sonnet💲

~$3 / 1M input tokens

~$15 / 1M output tokens. Slightly pricier but larger context window. A customer service bot doing 1000 conversations/day: ~$270/day = ~₹22,500/day.

Optimization Savings💰

30-50% cost reduction

Optimized prompts use fewer tokens for the same output quality. If you save 40% on tokens, that ₹22,500/day becomes ₹13,500/day. ₹2.7 lakh/month saved just from better prompts.

⚠️

Input vs Output Token Pricing

Output tokens cost 3-5x more than input tokens. This means: (1) Long AI responses are expensive. (2) Asking for "be concise" saves money. (3) Specifying word count controls output cost. Pro tip for Punjab businesses: "Ask for short answers — it saves money."

🧪 Lab: Build a Cost Calculator

📋 Click to copy: Build a token cost calculator spreadsheet
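A minimal version of that calculator, using the approximate list prices quoted in this section — verify current pricing before budgeting on it:

```python
# Illustrative price table (USD per 1M tokens), taken from the approximate
# rates quoted above. Prices change; check the providers' pricing pages.
PRICES = {
    "gpt-4o":        {"input": 2.50, "output": 10.00},
    "claude-sonnet": {"input": 3.00, "output": 15.00},
}

def call_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of a single API call, given token counts for each direction."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# The worked example from the text: 1000-token prompt + 2000-token response.
cost = call_cost_usd("gpt-4o", 1000, 2000)
print(f"${cost:.4f} per call")                            # -> $0.0225 per call
print(f"${cost * 10_000:,.0f}/day at 10,000 calls/day")   # -> $225/day
```

Extending the same function to Claude Sonnet gives the cross-model comparison the lab asks for, and multiplying by your daily call volume turns it into a monthly budget.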

🧪 Lab: Cost Comparison Across Models

📋 Click to copy: Cross-model cost comparison for a Punjab business

"I only use free tools." — That's fine for personal use. But if you're building a business tool or automating anything at scale, token economics determines profitability. The difference between a profitable AI business and a money-losing one is token optimization.

You can now calculate the EXACT cost of any AI operation — a business-critical skill.

Next: Token optimization lab — write the same prompt in 40% fewer tokens.

🔥 67% done! Now optimize like a pro.
Section 04

Token Optimization Lab

Same meaning. Fewer tokens. Better results. This is the skill that separates casual AI users from professionals. "The same work in fewer tokens — that is professional prompting."

💴

Verbose: 67 tokens

"I would really appreciate it if you could please help me with the task of writing a professional email to my boss in which I explain to him that I would like to request some time off from work next week because I have a medical appointment."

✕ Wasteful

Optimized: 28 tokens

"Write a professional email to my boss requesting leave next week for a medical appointment. Formal tone, 100 words max."

✓ 58% fewer tokens

🧪 Challenge: Optimize These Prompts

📋 Click to copy: Optimize 3 verbose prompts — same meaning, 40% fewer tokens
💡

Token Optimization Rules

1. Cut filler words: "please help me with" → just state the task.
2. Remove redundancy: "detailed and comprehensive" → "detailed" (they mean the same).
3. Use abbreviations: "social media platforms like Instagram and Facebook" → "Instagram + Facebook marketing".
4. Specify format up front: "bullet points, 200 words" prevents rambling responses (saves output tokens).
5. English prompts, target-language output: Write instructions in English, ask for Punjabi output.
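A one-liner makes the savings from the email example measurable — useful for scoring your own attempts at the optimization challenge:

```python
def token_savings(before: int, after: int) -> float:
    """Percent reduction in token count after optimizing a prompt."""
    return (before - after) / before * 100

# The email example above: 67 tokens verbose vs 28 tokens optimized.
print(f"{token_savings(67, 28):.0f}% fewer tokens")  # -> 58% fewer tokens
```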

✅ Session 2 Mastery Checklist

You can now count, calculate, and optimize tokens — a skill most AI users never develop.

Next: Quiz — test your token knowledge.

🧠 Quiz time — tokens, context, and cost!
Section 05

Test Your Token Knowledge

8 questions from a pool of 18. "Every question tests your practical understanding."

Section 06

Session 2 Complete!


You now understand the economics and mechanics of AI at the token level.

🎓

Key Takeaways

✅ Tokens = AI's currency (~3/4 of a word in English)
✅ Context window = AI's working memory (128K-1M tokens)
✅ Punjabi/Hindi uses 3-5x more tokens than English
✅ Output tokens cost 3-5x more than input tokens
✅ Optimization can save 30-50% on token costs
✅ Longer is NOT better — optimize your prompts

Homework Before Session 3

🔮

Preview: Session 3 — Temperature & Parameters

"Tomorrow we learn about temperature. Same prompt, wildly different outputs. Why? Because parameters control HOW the model generates text. You'll run experiments at temp 0, 0.5, and 1.0 and see the dramatic difference."

"In Session 1 you learned the architecture; in Session 2, the economics. Tomorrow, parameters — then you'll have a complete picture of how AI works on the inside."
Session 3: Temperature & Parameters →