We're seeking an exceptional AI Engineer with deep expertise in TensorFlow model training to design and build next-generation AI systems. This role focuses on developing sophisticated machine learning models, particularly Large Language Models and NLP solutions, while leveraging AWS cloud infrastructure for scalable deployment.

Key Responsibilities:
- Design and architect enterprise-scale AI/ML solutions with emphasis on custom model development and training
- Build, train, and optimize deep learning models using TensorFlow and TensorFlow Extended (TFX)
- Develop and fine-tune Large Language Models for domain-specific applications
- Implement advanced NLP pipelines including text classification, named entity recognition, sentiment analysis, and language generation
- Lead model training infrastructure design, including distributed training strategies and GPU optimization
- Deploy and manage ML models on AWS SageMaker and AWS Bedrock platforms
- Establish MLOps practices for model versioning, experiment tracking, and continuous training
- Optimize model architectures for performance, accuracy, and computational efficiency
- Conduct thorough model evaluation, validation, and performance benchmarking
- Collaborate with data engineering teams to build robust training data pipelines
- Mentor ML engineers and data scientists on TensorFlow best practices and model training techniques

Required Qualifications:
- 3+ years of hands-on experience in machine learning engineering and AI architecture
- Expert-level proficiency in TensorFlow 2.x for model development and training
- Deep understanding of neural network architectures (Transformers, CNNs, RNNs, attention mechanisms)
- Proven track record training large-scale models, including experience with LLMs
- Strong expertise in Natural Language Processing and modern NLP techniques
- Extensive experience with AWS cloud services, particularly SageMaker and Bedrock
- Solid understanding of training optimization techniques (learning rate scheduling, regularization, gradient accumulation)
- Experience with distributed training frameworks and multi-GPU/TPU training
- Strong Python programming skills and experience with NumPy, Pandas, and scikit-learn
- Knowledge of model compression techniques (quantization, pruning, distillation)

Preferred Skills:
- Experience with Hugging Face Transformers, LangChain, or similar LLM frameworks
- Familiarity with PyTorch or JAX in addition to TensorFlow
- Knowledge of reinforcement learning from human feedback (RLHF) techniques
- Experience with vector databases (Pinecone, Weaviate, ChromaDB) for RAG applications
- Understanding of prompt engineering and few-shot learning strategies
- Experience with Kubernetes and containerization (Docker) for ML workloads
- Publications or contributions to open-source ML projects

Technical Skills:
- Frameworks: TensorFlow, Keras, TensorFlow Serving, TFX
- Cloud: AWS SageMaker, AWS Bedrock, EC2, S3, Lambda
- Languages: Python, SQL
- MLOps: MLflow, Weights & Biases, Kubeflow
- Tools: Jupyter, Git, Docker, TensorBoard
AI Engineer • Belgaum, Karnataka, India