Prompt Engineer
Overview
Prompt Engineers design, evaluate, and ship LLM-powered features — system prompts, RAG flows, agent orchestration, structured-output schemas, and the eval harnesses that prove a prompt is actually better. The role sits between product, applied ML, and software engineering: you write prompts the way other engineers write code, run cost-quality-latency trade-off experiments, instrument grader pipelines, and own the part of the product where the LLM actually 'speaks.' In India through 2026, the role is one of the fastest-growing AI hires — concentrated at AI-native startups (Sarvam AI, Ola Krutrim, Atlan, Yellow.ai), product SaaS shops with a serious AI feature surface (Freshworks, Postman, Chargebee, Whatfix, Zoho's Zia), fintechs (Razorpay, Cred, Paytm), and the GCCs of Microsoft, Google, Adobe, and Salesforce. The salary band is unusually wide because the title is new and JDs vary from 'wrote one ChatGPT integration' to 'owns the eval harness for a frontier model.' Sarvam AI made several public crore-level offers to senior prompt and LLM engineers in 2025.
A Day in the Life
Open laptop, scan Slack for any overnight customer-quality complaints about the AI features. Triage which ones need a same-day prompt fix vs which need an eval rerun first.
Check the daily eval-cron dashboard for pass rates on each AI surface (search, support bot, summarization). If any slice dropped more than 3 points, note it for investigation.
Standup with the AI product team — 6 engineers, PM, designer. Three-minute update on what shipped yesterday and what's blocking today. Mention the evening eval delta you noticed.
Deep work block 1. Iterate on the system prompt for the new contract-review feature. Run 3 prompt variants against the 80-case eval set, analyze where each fails, and edit the few-shot examples.
Run a head-to-head eval batch — Claude 4.6 Sonnet vs GPT-5.1 vs Gemini Pro on the new feature. Generate the cost-per-call vs quality table. Write up a 1-page recommendation.
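The cost side of that comparison table reduces to simple per-1K-token arithmetic. A minimal sketch; the model names, prices, token counts, and pass rates below are placeholders, not real provider pricing:

```python
def cost_per_call(in_tokens: int, out_tokens: int, price_in: float, price_out: float) -> float:
    """USD cost of one call, given per-1K-token input and output prices."""
    return in_tokens / 1000 * price_in + out_tokens / 1000 * price_out

# (name, eval pass rate %, avg input tokens, avg output tokens, $/1K in, $/1K out)
models = [
    ("model-a", 92.5, 1800, 400, 0.003, 0.015),
    ("model-b", 90.0, 1800, 400, 0.001, 0.004),
]

rows = [(name, rate, round(cost_per_call(tin, tout, pin, pout), 4))
        for name, rate, tin, tout, pin, pout in models]
for name, rate, cost in rows:
    print(f"{name:10s} pass={rate:5.1f}%  cost/call=${cost}")
```

A table like this makes the 1-page recommendation concrete: here the cheaper model trades 2.5 points of quality for roughly a 3x cost saving, and the memo argues which side of that trade the feature can afford.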
Lunch break. Walk away from screens for 40 minutes. Get tea / dabba / Swiggy.
Pair with a backend engineer on the agent loop for the new workflow-automation feature. Debug a brittle multi-step flow where the agent loses tool-use schema adherence on step 4.
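Schema-adherence bugs like that one are usually caught by validating each tool call before executing it. A minimal sketch, with a hypothetical `create_ticket` tool and hand-rolled checks standing in for a real JSON-Schema validator:

```python
import json

# Hypothetical tool schema: one tool, two required string arguments.
TOOL_SCHEMA = {"name": "create_ticket", "required": {"title": str, "priority": str}}

def validate_tool_call(raw: str) -> list:
    """Return a list of schema violations for one tool-call payload (empty = OK)."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return ["payload is not valid JSON"]
    errors = []
    if call.get("name") != TOOL_SCHEMA["name"]:
        errors.append(f"unexpected tool: {call.get('name')}")
    args = call.get("arguments", {})
    for field, ftype in TOOL_SCHEMA["required"].items():
        if field not in args:
            errors.append(f"missing argument: {field}")
        elif not isinstance(args[field], ftype):
            errors.append(f"wrong type for {field}")
    return errors

# A step-4 payload that drifted from the schema: 'priority' was dropped.
bad = '{"name": "create_ticket", "arguments": {"title": "Refund bug"}}'
print(validate_tool_call(bad))  # ['missing argument: priority']
```

Running a validator like this on every agent step turns "the flow is brittle" into a precise error at the exact step where adherence was lost, which is what makes the debugging session tractable.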
Investigate a hallucination report from CX. Reproduce in the playground, narrow down to a retrieval miss (the right doc wasn't in top-5), file a ticket on the indexing side, draft a stop-gap prompt fix.
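The retrieval-miss triage step reduces to one question: was the grounding document in the retriever's top-k? A sketch, with hypothetical document IDs:

```python
def retrieval_hit(retrieved_ids: list, gold_id: str, k: int = 5) -> bool:
    """True if the document that grounds the answer appeared in the top-k results."""
    return gold_id in retrieved_ids[:k]

# Hypothetical trace: the grounding doc ranked 6th, so top-5 retrieval missed it.
retrieved = ["doc_812", "doc_101", "doc_455", "doc_090", "doc_230", "doc_777"]
if not retrieval_hit(retrieved, "doc_777"):
    print("retrieval miss: file an indexing ticket, ship a stop-gap prompt fix")
```

If the gold doc *is* in the top-k, the triage moves on to the prompt (is the model ignoring the context?) or the model itself (capability gap).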
Code review for two teammates' PRs — a structured-output schema change and an eval-set addition. Push back on missing edge cases (empty input, multi-turn context, adversarial framing).
Architecture review meeting — the team is debating whether to fine-tune a small model for the classification step or keep prompting Claude. You walk through the cost / quality / latency trade-off.
Read 30 minutes — a new Anthropic engineering blog post, an arXiv paper on agent eval, or a Latent Space podcast episode. Forward one useful idea to the team channel.
End-of-day Slack update — what shipped, what's queued. Close laptop. On launch weeks the day extends to 9-10 PM; during model-provider migrations the team rotates eval coverage across timezones.
Common Mistakes
- ⚠️ Taking a 'Prompt Engineer' title at a company where the actual scope is one ChatGPT integration. Why: many Indian JDs list 'Prompt Engineer' for what is actually a marketing-intern task. Two years there leaves you with no eval portfolio, no RAG experience, no agent work — and you struggle to switch to a serious AI role. Instead: in interviews, ask: 'Walk me through your eval harness. What's your retrieval stack? Show me an agent trace from production. How do you handle silent regressions?' Vague answers mean it's not a real prompt-engineering role.
- ⚠️ Skipping eval design because writing prompts feels more fun. Why: eval is 50%+ of the senior craft. Engineers who avoid it stay perpetually at 'I write prompts that feel good' — which doesn't compound and isn't what frontier labs hire for. Instead: force yourself to write the eval before the prompt. Build a 50-100 case eval set for every feature you ship. Read the DeepEval / Promptfoo / OpenAI Evals docs. The career compounds for engineers who like the eval side.
- ⚠️ Becoming a framework specialist (LangChain expert / LlamaIndex expert) instead of a fundamentals specialist. Why: frameworks churn every 6-12 months. Engineers known for deep LangChain knowledge become irrelevant when teams switch to a different agent framework or build in-house. Instead: be deep on the API primitives (Anthropic / OpenAI / Gemini SDKs), token economics, sampling, structured outputs, and eval methodology. Learn frameworks as conveniences, not identities.
- ⚠️ Treating one model as the answer. Why: engineers who only know GPT or only know Claude get stuck when the team switches models for cost or quality reasons. Frontier-model leadership rotates every 6-9 months. Instead: build with at least 2-3 frontier models. Maintain eval harnesses that run across models. Develop intuition for which model handles which task class best.
- ⚠️ Not pinning model versions in production. Why: when the underlying model silently updates (which closed-API providers do), prompts that worked yesterday quietly degrade. Without version pinning, you're flying blind. Instead: always pin model versions explicitly in production. Add a daily eval cron that pages on >5% pass-rate drops. Gate any version upgrade on a fresh eval pass.
- ⚠️ Staying narrowly 'I only do prompts' as the role matures. Why: the standalone Prompt Engineer title is likely to consolidate into AI Engineer / ML Engineer over 3-5 years. Engineers who don't broaden into evals, RAG infra, fine-tuning, and policy will see their value compress as the title fades. Instead: by year 2-3, expand into RAG infra (vector stores, retrieval evaluation), fine-tuning (PEFT, DPO basics), eval infrastructure, and AI safety. Reframe yourself as an AI Engineer who is exceptional at prompts.
- ⚠️ Underestimating the value of strong written English. Why: prompt engineering is largely writing craft. Engineers who can't write a clear, structured paragraph also can't write a clear, structured system prompt — and they lose to engineers who can. Instead: read on writing craft (William Zinsser, Strunk & White). Edit your prompts the way an editor edits prose. The best prompt engineers in India are also good writers.
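The version-pinning advice above can be made concrete. A minimal sketch of an upgrade gate, assuming a dated model-ID string (illustrative, not a real provider ID) and interpreting the >5% rule as percentage points:

```python
# Always pin an explicit, dated model version in production config.
PINNED_MODEL = "provider-model-2025-06-01"  # illustrative ID, not a real one

def safe_to_upgrade(current_pass_rate: float, candidate_pass_rate: float,
                    max_drop: float = 5.0) -> bool:
    """Gate an upgrade: allow it only if the candidate model's eval pass rate
    hasn't dropped more than max_drop percentage points versus the pinned one."""
    return candidate_pass_rate >= current_pass_rate - max_drop

print(safe_to_upgrade(91.0, 88.0))  # True: a 3-point drop is within tolerance
print(safe_to_upgrade(91.0, 84.0))  # False: a 7-point drop blocks the upgrade
```

The same check, run daily against the pinned version itself, is what catches a silent provider-side model update before customers do.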
Salary by Indian City (Mid-level total cash comp)
| City | Range |
|---|---|
| Bangalore | ₹20-32L |
| Hyderabad | ₹19-30L |
| Pune | ₹16-26L |
| NCR (Gurugram + Noida) | ₹17-28L |
| Mumbai | ₹16-26L |
| Remote / international | ₹35-90L |
Communities + forums
- Latent Space Discord (Discord): swyx and Alessio's AI-engineering community; weekly podcast plus a deeply active Discord with channels for prompt engineering, RAG, agents, and evals.
- The largest production-ML community Slack; dedicated channels for LLM ops, evals, and prompt engineering.
- Anthropic + OpenAI Discord servers (Discord): official developer communities run by the model providers; the first place to hear about API changes, prompt patterns, and model-specific quirks.
- Hugging Face community (Web + Discord): the largest open-weights and prompt-research community; Spaces, model cards, and the Open LLM Leaderboard are reference points for the field.
- AI4Bharat community (Discord + GitHub): India-first NLP and LLM research community at IIT Madras; central to Indic-language prompt and eval work.
- Bangalore AI Engineer Meetup (in-person + Meetup): quarterly Bangalore meetup focused on production AI engineering; talks from Sarvam, Krutrim, Yellow.ai, and Razorpay AI engineers. Strong networking.
- r/LocalLLaMA and r/LangChain (Reddit): active subreddits — r/LocalLLaMA in particular has the deepest discussion on open-weight model prompting and self-hosted inference.
What to read / watch / follow
- Anthropic's Prompt Engineering Tutorial (tutorial/course, Anthropic team): free, official, dense. The single best starting point for serious prompt engineering. Read it once cover-to-cover, then again after 6 months of work.
- OpenAI Cookbook (code repository/docs, OpenAI team + community): reference patterns for prompts, RAG, agents, evals, and structured outputs. The most-used practical resource for working prompt engineers globally.
- Andrej Karpathy's 'Zero to Hero' lecture series (YouTube, Andrej Karpathy): free 10-hour series on building LLMs from scratch. Foundational for understanding what prompts are actually doing inside the model.
- Latent Space podcast (podcast, swyx + Alessio): weekly interviews with AI engineers at Anthropic, OpenAI, Cohere, and Indian AI startups. The single best podcast for tracking the AI-engineering field.
- Simon Willison's blog (blog, Simon Willison): the most-read independent prompt-engineering and LLM blog globally. Daily posts on new models, prompt patterns, and tooling.
- Eugene Yan's blog (blog, Eugene Yan): his 'Patterns for building LLM-based systems' post is reference reading for production LLM engineering — RAG, evals, defensive UX.
- Anthropic and OpenAI engineering blogs (blogs, Anthropic + OpenAI engineering teams): official engineering blogs from frontier labs; first-source content on safety, evaluation, and applied research that defines best practice.
- Chip Huyen's 'Building LLM-powered Applications' (blog post + book chapters, Chip Huyen): comprehensive overview of production LLM application architecture; widely referenced in Indian AI-engineering interviews.
- Lilian Weng's blog (lilianweng.github.io) (blog, Lilian Weng, OpenAI): deep technical posts on prompt engineering, agents, hallucinations, and evals. The gold standard for technical-depth writing in the field.
- AI4Bharat blog and papers (blog, India, AI4Bharat team): first-hand writing from India's leading Indic-language LLM team; essential for prompt engineers working on Hindi, Tamil, Bengali, and other Indic-language features.
Daily Responsibilities
- Write a new system prompt for a feature, draft 6-12 few-shot examples, run them against the eval set, and analyze where the model fails. Iterate prompt + examples until the eval pass rate hits the target.
- Investigate a hallucination report from product or support: reproduce the failure, isolate whether it's a retrieval miss, a prompt issue, or a model-capability gap, and draft the fix.
- Run a head-to-head eval batch on a new model (Claude vs GPT vs an open-weight variant) — analyze quality, latency, and per-1K-token cost, then write a 1-page recommendation memo.
- Review 2-3 PRs from teammates: prompt diffs, eval-set additions, structured-output schema changes. Push back on missing test cases or unclear refusal handling.
- Pair with a backend or applied-ML teammate on the agent or RAG layer: design tool-use schemas, debug a brittle multi-step flow, or rewrite the retrieval prompt for better grounding.
- Attend a 15-30 min standup, plus 1-2 ad-hoc syncs (with PM, designer, or model-policy reviewer) about a new AI feature, eval results, or a customer-reported quality issue.
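At its simplest, the eval loop behind these responsibilities is a grader run over a case set. A toy sketch: the cases and the keyword grader below are illustrative stand-ins for a real harness (Promptfoo, DeepEval, or model-graded rubrics):

```python
# Minimal eval harness: run a rule-based grader over a case set, report pass rate.
def keyword_grader(output: str, must_contain: list) -> bool:
    """Pass if the model output mentions every required keyword (case-insensitive)."""
    return all(kw.lower() in output.lower() for kw in must_contain)

# Illustrative cases: stored model outputs paired with grading criteria.
cases = [
    {"output": "The contract's liability cap is ₹50L.", "must_contain": ["liability cap"]},
    {"output": "I'm not sure about that clause.", "must_contain": ["indemnity"]},
]

passed = sum(keyword_grader(c["output"], c["must_contain"]) for c in cases)
print(f"pass rate: {passed}/{len(cases)} = {passed / len(cases):.0%}")  # 1/2 = 50%
```

Real harnesses add many more grader types (exact match, JSON-schema validity, model-graded rubrics), but the shape (cases in, per-case verdicts out, one aggregate pass rate) stays the same.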
Advantages
- The role is genuinely new and rewards taste — there's no 30-year canonical curriculum, so a careful self-taught practitioner with a public portfolio can compete with M.Tech grads on equal footing for many openings.
- The work is unusually creative for a technical role — you're shaping how an AI 'speaks' to users, choosing tone, constraints, and failure modes. Few engineering jobs are this much like writing.
- Salary ceiling at AI-native Indian startups is high — Sarvam AI, Krutrim, and a handful of YC-backed Indian AI companies have made public crore-level offers for senior prompt engineers; well above generalist backend bands.
- Remote and global mobility is excellent — Sarvam, Krutrim, Yellow.ai, OpenAI's India org, Anthropic's APAC team, and most AI-native startups are remote-friendly; switching to US/EU companies is realistic after 3-4 years.
- Your work is the most user-visible part of the AI stack — when you nail the prompt, the entire product 'feels right' to users. The feedback loop on craft is fast and concrete.
Challenges
- Job-title inflation is severe — many Indian companies advertise 'Prompt Engineer' for what is actually a marketing intern's task list. Read JDs hard for eval, RAG, and agent specifics; ask in interviews about the team's eval harness.
- The salary floor is genuinely low — without a strong portfolio, entry-level offers can be ₹5-7L because the title is fashionable and the supply of self-taught candidates is high. The premium kicks in only at mid-level and above.
- Tooling and best practices churn faster than any other AI role — agent frameworks, RAG patterns, eval methodologies, and base models change every 3-6 months. Last year's expert is this year's mid-level if they don't keep up.
- Eval design is the unglamorous 50%+ of the job — measuring quality, writing graders, debugging silent regressions — but most candidates only want to talk about the 30% that involves writing the prompt itself. Career compounds for those who like the eval side.
- Long-term role legitimacy is debated — a real risk over 3-5 years is that 'Prompt Engineer' folds into 'AI Engineer' or 'ML Engineer' as the work matures, and standalone prompt-only roles compress. Engineers who go deeper on infra, evals, and policy compound; those who stay 'I write prompts' compress.
Education
- Required (most common): a Bachelor's in any technical or quantitative field — CS, IT, Statistics, Mathematics, Physics, Linguistics, or Cognitive Science. The role is genuinely degree-flexible because the work is part craft, part rigor.
- Strong alternatives: a humanities or arts degree paired with a serious technical portfolio — a public RAG app, a published prompt library, a Hugging Face Space, or a research blog with reproducible experiments. Indian AI-native startups hire this profile regularly.
- Premium signal: M.Tech / M.S. in NLP, AI, or CS from IIT, IIIT-H, IIIT-B, IISc, ISI Kolkata, or top-50 global programs — opens doors to research-leaning prompt and LLM teams at MSR India, Google Research India, Sarvam AI, and AI4Bharat.
- Self-taught + portfolio: a documented body of work (a public prompt-engineering blog, a maintained eval harness on GitHub, contributions to popular agent or RAG frameworks) is an accepted route at remote-first AI startups. Several senior prompt engineers in the Indian AI scene came from non-CS backgrounds.
- Certifications that matter: DeepLearning.AI's 'ChatGPT Prompt Engineering for Developers', Anthropic's prompt-engineering tutorial, working through the OpenAI Cookbook, and Hugging Face's NLP course — most useful in the first 1-2 years for switchers.