Built as an interactive training module by TARAhut AI Labs
Generative AI & Prompt Engineering · Session 02 · tarahutailabs.com
© 2026 TARAhut AI Labs. All rights reserved.
Tokens are the currency of AI. Every word costs tokens. Every response costs tokens. Today you learn to count them, optimize them, and understand why AI "forgets" things in long conversations. "Token = AI di currency. Aaj tusi cost calculate karna sikhoge." (Tokens are AI's currency. Today you will learn to calculate cost.)
"How much does this prompt cost?" That's the question we answer today. Tokens are the basic units AI processes — one token is roughly 3/4 of a word in English. But Hindi and Punjabi? That's a different story. "Ik English word = ~1 token. Ik Punjabi word = 3-5 tokens. Ih jaanana zaroori hai." (One English word = ~1 token. One Punjabi word = 3-5 tokens. This is essential to know.)
Think of tokens like money. Everything in AI costs tokens. Your prompt uses input tokens. The response uses output tokens. Longer prompts = more expensive. Longer responses = more expensive. Different languages = different costs. "Jive paisa kharcha hunda hai, tive token kharcha hunde ne." (Just as money gets spent, tokens get spent.)
Open the OpenAI Tokenizer and compare token counts for the same sentence written in English, Hindi, and Punjabi.
If you write prompts in Punjabi, they cost 3-5x more tokens than the same prompt in English. This means: (1) You hit context window limits faster. (2) API costs are higher. (3) You get shorter responses before hitting token limits. Pro tip: Write prompts in English, ask for responses in Punjabi. "Prompt English vich likho, jawab Punjabi vich mangvo." (Write the prompt in English, ask for the answer in Punjabi.)
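The ~3/4-word rule can be turned into a quick back-of-envelope estimator. This is a heuristic sketch only — it counts words, not real tokens; for exact counts use the OpenAI Tokenizer or the tiktoken library. The 1.33 tokens/word figure applies to English text; native-script Punjabi/Hindi runs roughly 3-5x higher per word.

```python
def estimate_tokens(text: str, tokens_per_word: float = 1.33) -> int:
    """Rough token estimate for English text: ~1.33 tokens per word
    (i.e. one token is about 3/4 of a word). Heuristic only --
    for Gurmukhi/Devanagari text, pass tokens_per_word between 3 and 5."""
    return round(len(text.split()) * tokens_per_word)

english = estimate_tokens("Write a professional email to my boss")  # 7 words -> ~9 tokens
```

Useful for quick budgeting before you paste a prompt into the real tokenizer.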
Next: Context windows — why AI "forgets" in long conversations.
The context window is AI's short-term memory. Everything — your prompts AND AI's responses — must fit inside this window. When it overflows, AI starts forgetting. "Jado window bharr jaandi hai, AI puraaniyaan gallan bhul jaanda hai." (When the window fills up, AI forgets the older things.)
Imagine reading a book through a window that only shows 300 pages. As you read page 301, page 1 slides off the left side and is forgotten forever. That's the context window. Your entire conversation — every prompt you've sent and every response AI has given — must fit in this window. "Jive book di ik window — navan page aunda hai, purana bhul jaanda hai." (Like a window on a book: a new page comes in, an old one is forgotten.)
Test when AI starts "forgetting" your instructions in a long conversation.
Build a quick reference: If your average prompt is ~200 tokens and average response is ~500 tokens, each exchange uses ~700 tokens. In GPT-4o's 128K window, that's about 183 exchanges before the window fills up. Calculate for YOUR typical usage.
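The page-sliding behaviour above can be sketched as the kind of trim step a chat client runs before each API call. A minimal illustration under assumptions: messages carry a precomputed flat token count, whereas real clients count tokens with an actual tokenizer.

```python
def trim_to_window(messages, window_tokens=128_000):
    """Keep only the most recent messages that fit in the context window.
    Older messages fall off the front -- this is why the model 'forgets'.
    `messages` is a list of (text, token_count) pairs, oldest first."""
    kept, used = [], 0
    for text, tokens in reversed(messages):   # walk newest-first
        if used + tokens > window_tokens:
            break                             # window full: older messages dropped
        kept.append((text, tokens))
        used += tokens
    return list(reversed(kept))               # restore oldest-first order

# With ~700 tokens per exchange, a 128K window holds about
# 128_000 / 700 = ~183 exchanges before trimming begins.
```

Try it with a tiny window to see the oldest message disappear first.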
Next: The money side — how much does AI actually cost?
If you're building anything with AI APIs — chatbots, automation, business tools — you MUST understand token pricing. The difference between smart and wasteful prompting can be ₹10,000/month. "Token economics samjho — paisa bachao." (Understand token economics; save money.)
GPT-4o: ~$2.50 / 1M input tokens, ~$10 / 1M output tokens. A 1000-token prompt + 2000-token response costs ~$0.023 per call. Cheap for one call. But 10,000 calls/day for a business ≈ $225/day ≈ ₹19,000/day.
Claude (Sonnet-class): ~$3 / 1M input tokens, ~$15 / 1M output tokens. Slightly pricier but with a larger (200K) context window. A customer service bot doing 1000 conversations/day: ~$270/day ≈ ₹22,500/day.
Optimized prompts use fewer tokens for the same output quality. Cut token use by 40% and that ₹22,500/day becomes ₹13,500/day: ₹9,000/day, or ~₹2.7 lakh/month, saved just from better prompts.
Output tokens cost 3-5x more than input tokens. This means: (1) Long AI responses are expensive. (2) Asking for "be concise" saves money. (3) Specifying word count controls output cost. Pro tip for Punjab businesses: "Jawab chota mangvo — paisa bachda hai." (Ask for short answers; it saves money.)
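The pricing arithmetic above can be checked with a small calculator. A sketch under assumptions: the per-1M rates are the GPT-4o-class figures quoted in this section (the input rate follows from the ~$0.023-per-call example), and the ₹83/$ conversion is an assumed rate.

```python
USD_TO_INR = 83  # assumed exchange rate for illustration

def api_cost_usd(input_tokens: int, output_tokens: int,
                 in_rate: float = 2.50, out_rate: float = 10.00) -> float:
    """Cost of one API call in USD. Rates are $ per 1M tokens.
    Note the output rate is ~4x the input rate, so long responses
    dominate the bill -- asking for concise answers really does save money."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

per_call    = api_cost_usd(1_000, 2_000)   # $0.0225 (~$0.023)
per_day_usd = per_call * 10_000            # 10,000 calls/day -> $225
per_day_inr = per_day_usd * USD_TO_INR     # ~Rs 18,675 (~Rs 19,000/day)
```

Swap in your own rates and call volumes to budget for your typical usage.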
"Main sirf free tools use karda haan." (I only use free tools.) — That's fine for personal use. But if you're building a business tool or automating anything at scale, token economics determines profitability. The difference between a profitable AI business and a money-losing one is token optimization.
Next: Token optimization lab — write the same prompt in 40% fewer tokens.
Same meaning. Fewer tokens. Better results. This is the skill that separates casual AI users from professionals. "Kam tokens vich ohhi kaam — ih professional prompting hai." (The same work in fewer tokens; that is professional prompting.)
"I would really appreciate it if you could please help me with the task of writing a professional email to my boss in which I explain to him that I would like to request some time off from work next week because I have a medical appointment."
"Write a professional email to my boss requesting leave next week for a medical appointment. Formal tone, 100 words max."
1. Cut filler words: "please help me with" → just state the task.
2. Remove redundancy: "detailed and comprehensive" → "detailed" (they mean the same).
3. Compress phrases: "social media platforms like Instagram and Facebook" → "Instagram + Facebook marketing".
4. Specify format up front: "bullet points, 200 words" prevents rambling responses (saves output tokens).
5. English prompts, target-language output: Write instructions in English, ask for Punjabi output.
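Applying the ~1.33 tokens-per-word heuristic (a rough word-count estimate, not a real tokenizer) to the two leave-request prompts above shows the kind of savings rules 1-5 deliver:

```python
verbose = ("I would really appreciate it if you could please help me with the "
           "task of writing a professional email to my boss in which I explain "
           "to him that I would like to request some time off from work next "
           "week because I have a medical appointment.")
concise = ("Write a professional email to my boss requesting leave next week "
           "for a medical appointment. Formal tone, 100 words max.")

def estimate_tokens(text: str, tokens_per_word: float = 1.33) -> int:
    """Rough English estimate: ~1.33 tokens per word."""
    return round(len(text.split()) * tokens_per_word)

saving = 1 - estimate_tokens(concise) / estimate_tokens(verbose)
# 47 words (~63 tokens) vs 20 words (~27 tokens): ~57% fewer input tokens
```

By this estimate the rewrite beats the 40% target comfortably — and the concise version also caps the response at 100 words, trimming the (pricier) output tokens too.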
Next: Quiz — test your token knowledge.
8 questions from a pool of 18. "Har sawaal tuhadi practical understanding test karda hai." (Every question tests your practical understanding.)
You now understand the economics and mechanics of AI at the token level.
✅ Tokens = AI's currency (~3/4 of a word in English)
✅ Context window = AI's working memory (128K-1M tokens)
✅ Punjabi/Hindi uses 3-5x more tokens than English
✅ Output tokens cost 3-5x more than input tokens
✅ Optimization can save 30-50% on token costs
✅ Longer is NOT better — optimize your prompts
"Kal assi temperature bare sikhange." (Tomorrow we will learn about temperature.) Same prompt, wildly different outputs. Why? Because sampling parameters control HOW the model generates text. You'll run experiments at temperature 0, 0.5, and 1.0 and see the dramatic difference.