Role: Multimodal Embedding Engineer
Job Type: Full-time
Location: Bangalore; remote possible for strong candidates
Core Technical Work:
Design and implement multi-modal embedding models that create unified vector representations across text, images, video, audio, and other modalities.
Build and optimize contrastive learning frameworks (e.g., CLIP-style architectures) for cross-modal alignment (see the loss sketch after this list).
Develop embedding spaces that preserve semantic relationships within and across modalities.
Create efficient indexing and retrieval systems for multi-modal data at scale.
Fine-tune and adapt foundation models for domain-specific multi-modal applications.
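For context on the contrastive-alignment work above, here is a minimal sketch of a CLIP-style symmetric InfoNCE loss, assuming PyTorch and a batch of paired image/text embeddings already produced by the two encoders; the function name and temperature value are illustrative, not a prescribed implementation:

```python
import torch
import torch.nn.functional as F

def clip_style_loss(image_emb: torch.Tensor,
                    text_emb: torch.Tensor,
                    temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over paired image/text embeddings.

    image_emb, text_emb: (batch, dim) encoder outputs; matching pairs
    share the same row index.
    """
    # Cosine similarity via dot products of L2-normalised vectors.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.T / temperature   # (batch, batch)

    # The i-th image should retrieve the i-th text and vice versa.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.T, targets)
    return (loss_i2t + loss_t2i) / 2
```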
Data & Infrastructure:
Build data pipelines for processing and aligning multi-modal datasets.
Implement quality control mechanisms for multi-modal training data.
Design evaluation frameworks to measure embedding quality across modalities.
Optimize inference performance for real-time multi-modal search and retrieval (see the retrieval sketch after this list).
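As an illustration of the large-scale retrieval side, a minimal sketch of approximate nearest-neighbour search over pre-computed embeddings, assuming the FAISS library; the embedding dimension, corpus size, and IVF parameters are placeholders:

```python
import numpy as np
import faiss

dim, nlist = 512, 1024                   # embedding size and number of IVF cells (placeholders)
corpus = np.random.rand(100_000, dim).astype("float32")
faiss.normalize_L2(corpus)               # unit-norm vectors so inner product == cosine similarity

quantizer = faiss.IndexFlatIP(dim)       # coarse quantizer for the IVF index
index = faiss.IndexIVFFlat(quantizer, dim, nlist, faiss.METRIC_INNER_PRODUCT)
index.train(corpus)                      # learn coarse centroids from the corpus
index.add(corpus)

query = np.random.rand(1, dim).astype("float32")
faiss.normalize_L2(query)
index.nprobe = 16                        # cells probed per query: recall vs. latency trade-off
scores, ids = index.search(query, 10)    # top-10 nearest embeddings for the query
```

At larger scales, a product-quantised variant such as IndexIVFPQ can be swapped in to trade some accuracy for a much smaller memory footprint.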
Research & Innovation:
Stay current with the latest research in multi-modal learning, vision-language models, and embedding techniques.
Experiment with novel architectures for improved cross-modal understanding.
Benchmark different approaches and conduct ablation studies.