AI Engineer — Image‑to‑Video (Mid‑Level)
Location: Mumbai (on‑site / hybrid)
Contract: 6 months, extendable
Start: ASAP
What you'll do
- Build, fine‑tune, and ship image‑to‑video generation pipelines (prompt‑to‑video, storyboard‑to‑video, identity‑preserving headshots).
- Integrate and iterate on SOTA components (Stable Video Diffusion, AnimateDiff, LTX‑Video / 13B variants, CogVideoX, ControlNet‑style conditioning).
- Optimize inference for throughput and latency (TorchScript / ONNX, TensorRT, CUDA kernels, xFormers / Flash‑Attention, mixed precision).
- Handle multi‑GPU training / inference (DDP, gradient checkpointing, sharded weights, efficient sampling).
- Own dataset curation and augmentation for faces / motion; enforce consent, licensing, and privacy.
- Build evaluation loops and dashboards (FVD, CLIP / ID‑similarity, temporal consistency, face‑ID retention).
- Productionize with Docker and CI/CD; wire up experiment tracking (W&B / ClearML) and ensure reproducibility.
- Collaborate with design and product to convert creative briefs into deployable features and A/B tests.
Must‑have
- 3–5 years total software / ML experience with 1–2+ years in generative video or diffusion work.
- Strong Python + PyTorch, Diffusers, and CV fundamentals (spatiotemporal models, sampling).
- Proven experience with multi‑GPU (DDP / NCCL) and performance profiling on Linux.
- Solid grasp of FFmpeg, video codecs / bitrates, and post‑processing pipelines.
- Portfolio: repo(s), demo links, or a short reel showing your image‑to‑video work.
Nice‑to‑have
- Experience with ComfyUI nodes / graphs, LoRA / ControlNet training, face‑ID preservation, or lip‑sync.
- Triton kernels, custom schedulers / samplers, quantization (INT8 / FP8) for fast inference.
- MLOps on AWS / GCP / Azure, Kubernetes, vector stores, prompt orchestration.
Tools you'll touch
- PyTorch, Diffusers, CUDA, TensorRT / ONNX, xFormers / Flash‑Attention, FFmpeg, Docker, W&B / ClearML, ComfyUI, GitHub Actions.
What we offer
- Competitive contract compensation (INR, market‑aligned) with extension potential.
- High‑impact ownership on production creative pipelines.
- Modern GPU stack and a fast path from prototype → production.
How to apply
Email [HIDDEN TEXT] with subject 'AI Engineer — Image‑to‑Video (Mumbai)' and include:
- Resume / CV, links to GitHub and any demos / reels.
- 3–5 bullet points on your most relevant image→video work.
- Earliest start date and work authorization status for India.
Skills Required
PyTorch, Docker, FFmpeg, Python, CUDA