About This Role
Demand for RAG architects and LLM fine-tuning engineers has surged 300% since 2024. India faces a 53% skills deficit in this space, and roughly 70% of qualified senior GenAI engineers are not actively looking. This role is for an engineer who has moved beyond tutorials and prototypes and can design, build, and operate reliable RAG systems and LLM pipelines in regulated enterprise environments where accuracy, latency, and explainability all matter.
What You Will Do
- Design and build end-to-end RAG (Retrieval-Augmented Generation) systems: document processing, chunking strategies, embedding pipelines, vector search, and LLM generation layers
- Implement LLM orchestration using LangChain, LlamaIndex, or custom frameworks: chains, agents, tool use, and function calling
- Own vector database infrastructure (Pinecone, Weaviate, Qdrant, pgvector, or Chroma): index design, hybrid search, and metadata filtering
- Build LLM evaluation frameworks: automated quality assessment, hallucination detection, RAG precision/recall measurement, and human feedback loops
- Implement prompt engineering and prompt management systems: versioning, A/B testing, and systematic improvement workflows
- Fine-tune and adapt foundation models (GPT, Claude, Llama, Mistral) for domain-specific enterprise use cases
- Build production monitoring for LLM systems: latency tracking, cost monitoring, and quality drift detection
- Design multi-modal pipelines where required, combining text, image, and document understanding
- Implement responsible AI guardrails: content filtering, PII detection, toxicity screening, and output validation
What You Need to Succeed
- 4+ years of software engineering experience, including 2+ years specifically on LLM/GenAI systems in production
- Deep RAG architecture expertise: chunking strategies, embedding models, hybrid search, and re-ranking
- LLM orchestration proficiency: LangChain, LlamaIndex, or equivalent at production scale
- Vector database experience: at least one production deployment (Pinecone, Weaviate, pgvector, or Chroma)
- Strong Python engineering skills: async programming, API design, testing, and type annotations
- LLM evaluation and quality measurement experience that goes beyond manual spot-checking
- Prompt engineering and systematic prompt optimisation experience
- Production deployment experience: REST API serving, streaming responses, and latency optimisation
What Will Give You an Edge
- LLM fine-tuning experience: LoRA, QLoRA, instruction tuning, or RLHF
- Multi-modal system experience combining text, image, and document understanding
- Agentic system design: LLM agents with tool use, code execution, and external API integration
- Experience building for regulated industries such as BFSI and Healthcare, where explainability and audit trails matter
What Qfyre Offers
- RAG and LLM engineering role with genuine production ownership, not prototyping
- Remote-first flexibility reflecting the distributed nature of GenAI talent
- Market-premium compensation: GenAI engineers command 1.7x the standard engineering salary hike in 2026
- Exposure to enterprise-scale LLM deployments across BFSI, Healthcare, and Hi-Tech domains
Apply for RAG and LLM Systems Engineer, Enterprise AI
Complete the form below. Our team reviews every application personally, with no automated filtering or keyword matching. We will be in touch within two business days.