We're seeking an exceptional AI Engineer with deep expertise in TensorFlow model training to design and build next-generation AI systems. This role focuses on developing sophisticated machine learning models, particularly Large Language Models and NLP solutions, while leveraging AWS cloud infrastructure for scalable deployment.
Key Responsibilities:
Design and architect enterprise-scale AI/ML solutions with emphasis on custom model development and training
Build, train, and optimize deep learning models using TensorFlow and TensorFlow Extended (TFX)
Develop and fine-tune Large Language Models for domain-specific applications
Implement advanced NLP pipelines including text classification, named entity recognition, sentiment analysis, and language generation
Lead model training infrastructure design, including distributed training strategies and GPU optimization
Deploy and manage ML models on AWS SageMaker and AWS Bedrock platforms
Establish MLOps practices for model versioning, experiment tracking, and continuous training
Optimize model architectures for performance, accuracy, and computational efficiency
Conduct thorough model evaluation, validation, and performance benchmarking
Collaborate with data engineering teams to build robust training data pipelines
Mentor ML engineers and data scientists on TensorFlow best practices and model training techniques
Required Qualifications:
3+ years of hands-on experience in machine learning engineering and AI architecture
Expert-level proficiency in TensorFlow 2.x for model development and training
Deep understanding of neural network architectures (Transformers, CNNs, RNNs, attention mechanisms)
Proven track record of training large-scale models, including experience with LLMs
Strong expertise in Natural Language Processing and modern NLP techniques
Extensive experience with AWS cloud services, particularly SageMaker and Bedrock
Solid understanding of training optimization techniques (learning rate scheduling, regularization, gradient accumulation)
Experience with distributed training frameworks and multi-GPU/TPU training
Strong Python programming skills and experience with NumPy, Pandas, and scikit-learn
Knowledge of model compression techniques (quantization, pruning, distillation)
Preferred Skills:
Experience with Hugging Face Transformers, LangChain, or similar LLM frameworks
Familiarity with PyTorch or JAX in addition to TensorFlow
Knowledge of reinforcement learning from human feedback (RLHF) techniques
Experience with vector databases (Pinecone, Weaviate, ChromaDB) for RAG applications
Understanding of prompt engineering and few-shot learning strategies
Experience with Kubernetes and containerization (Docker) for ML workloads
Publications or contributions to open-source ML projects
Technical Skills:
Frameworks: TensorFlow, Keras, TensorFlow Serving, TFX
Cloud: AWS SageMaker, AWS Bedrock, EC2, S3, Lambda
Languages: Python, SQL
MLOps: MLflow, Weights & Biases, Kubeflow
Tools: Jupyter, Git, Docker, TensorBoard
AI Engineer • Narela, Delhi, India