ROLE OVERVIEW
We are seeking a highly skilled Senior Generative AI Engineer to drive the development, optimization, and deployment of cutting-edge image and video generative models. You will work across the full AI lifecycle (data preparation, model training, experimentation, optimization, and evaluation) using state-of-the-art deep learning frameworks and large-scale GPU clusters. This role requires expertise in advanced generative architectures, distributed training, and building production-ready visual AI systems.
KEY ROLES
- Design and implement advanced image and video generative architectures including Diffusion Models, GANs, VAEs, latent video models, and transformer-based visual generation systems.
- Architect and optimize large-scale distributed training pipelines across GPU clusters for training state-of-the-art visual generative models.
- Research and prototype next-generation architectures such as Sora-like models, video LDMs, Flow Matching, and autoregressive vision models.
- Develop and enhance data engineering pipelines for large-scale image, video, and multimodal dataset processing with advanced filtering and quality control.
- Build comprehensive model evaluation frameworks for visual quality assessment, temporal consistency, and safety compliance.
- Optimize training efficiency using advanced techniques like mixed precision, gradient checkpointing, and memory-efficient attention mechanisms.
- Conduct frontier AI research focusing on fidelity improvements, temporal consistency, and long-duration video generation capabilities.
- Collaborate with cross-functional teams to integrate generative models into production visual AI applications.
RESPONSIBILITIES
- Train large-scale generative models using PyTorch or TensorFlow with distributed and tensor parallelism (DDP, FSDP, DeepSpeed) across A100 / H100 / L40S GPU clusters.
- Build automated data cleaning, preprocessing, and filtering pipelines for images, videos, captions, and multimodal datasets with quality, NSFW, object, face, and temporal consistency filters.
- Develop evaluation metrics and benchmarking systems for image / video quality (FID, IS), temporal performance, realism assessment, and safety compliance validation.
- Work with large-scale data lake systems and distributed storage architectures for handling massive visual datasets.
- Run comprehensive ablations, architecture comparisons, and produce scientific evaluation reports for model performance analysis.
- Implement and optimize memory-efficient training techniques for large visual models including gradient accumulation and checkpoint strategies.
- Integrate visual generative models with backend systems using optimized inference pipelines and real-time serving architectures.
- Maintain experiment tracking, model versioning, and reproducibility standards for large-scale visual AI research and development.
REQUIRED QUALIFICATIONS
- 4–7+ years of experience in computer vision, generative AI, deep learning for visual models, or large-scale model training.
- Proven expertise with generative model architectures including Diffusion Models (DDPM, DDIM, LDM), Transformer-based image / video models, GANs, autoencoders, and video diffusion systems.
- Hands-on experience with large-scale distributed training on multi-GPU clusters using PyTorch (preferred) or TensorFlow with advanced parallelization techniques.
- Strong knowledge of visual AI frameworks including PyTorch Lightning, Hugging Face Diffusers / Transformers, DeepSpeed, FSDP, and Megatron-LM.
- Expert-level experience in building data cleaning and preprocessing pipelines for visual datasets, including image / video annotation tools and metadata extraction.
- Solid understanding of GPU cluster management, CUDA optimization, model parallelism, and cloud / on-premises infrastructure for large compute training.
- Experience with experiment tracking tools (Weights & Biases), model evaluation metrics for visual generation, and scientific experimentation methodologies.
- Strong programming skills in Python with deep learning best practices and proficiency in visual data processing libraries.
- Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field with a focus on Computer Vision or Machine Learning.
NOTE: We also accept international applicants.