About
AI Engineer & Data Scientist specializing in Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and Machine Learning. Proven expertise in fine-tuning and deploying models using LoRA, QLoRA, and PEFT for efficient adaptation. Skilled in vector databases, dense retrieval, and adaptive re-ranking to enhance AI-driven search. Proficient in Python, SQL, PySpark, LangChain, and cloud platforms (GCP, Azure, Databricks), with a strong focus on developing scalable and impactful AI solutions.
Work
→
Summary
Led the development and deployment of advanced AI solutions, focusing on automation, security, and scalable architecture for enterprise-grade applications.
Highlights
Engineered and deployed a serverless Twitter bot on Modal, automating tweet scheduling and AI-generated responses every 5 minutes.
Strengthened API security by storing credentials in Modal Secrets, preventing unauthorized access and ensuring over 60% uptime for interactions.
Led the migration to a modular architecture built on Modal cron jobs, reducing API latency by 15% and enabling scalable, on-demand AI task execution with minimal overhead.
Contributed to enterprise-grade LLM solutions, focusing on secure, precision-focused generative AI projects in the cybersecurity domain.
Designed and developed domain-specific AI agents within multi-agent frameworks, significantly enhancing research and compliance capabilities.
Improved real-time insights by 24% by integrating advanced retrieval techniques, including RAPTOR (recursive tree-organized retrieval), dense vector retrieval, and adaptive document re-ranking.
Implemented continuous evaluation pipelines (TruLens, RAGAS, ARES), accelerating model retraining by 3x while maintaining accuracy improvements.
→
Summary
Specialized in large-scale data processing and migration, focusing on optimizing data workflows and ensuring data integrity.
Highlights
Optimized PySpark and Spark SQL workflows to speed up query execution, supporting the migration from Talend to Databricks for large-scale data processing.
Implemented a Medallion Architecture with Bronze, Silver, and Gold data layers, reducing data inconsistencies by 30% and improving data quality through automated validation.
Built a PySpark-based data validation framework that identified 37% of data anomalies, protecting data integrity across downstream applications.
Languages
English
Fluent
Hindi
Native
Skills
Programming Languages
Python, SQL, PySpark.
Machine Learning & AI
Machine Learning Algorithms, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), Fine-Tuning, Prompt Engineering, Multi-Agent AI Systems, Parameter-Efficient Fine-Tuning (LoRA, QLoRA, PEFT), Generative AI, Model Deployment.
Data Engineering & Cloud Platforms
Databricks, Google Cloud Platform (GCP), BigQuery, Vertex AI, Azure Data Factory.
Frameworks & Libraries
LangChain, Streamlit, Hugging Face, TensorFlow, PyTorch, Pandas, NumPy, Postman.
Databases
MySQL, BigQuery, NoSQL, MongoDB, Vector Databases, Pinecone.
Data Visualization
Looker Studio.
Soft Skills
Problem-Solving, Stakeholder Collaboration, Technical Leadership, Research & Innovation.