About the Role
Our client is seeking a Data Engineer to build and maintain data pipelines supporting AI agents for real estate and construction applications. You’ll play a key role in ensuring reliable data flows, integrations, and preprocessing frameworks that power advanced GenAI systems.
This is a high-impact opportunity within an early-stage, fast-growing AI startup, ideal for engineers who thrive in 0→1 environments, work with high ownership, and ship production-ready solutions quickly.
Key Responsibilities
1. Data Pipeline Development
Design, build, and optimize data pipelines for GenAI and LLM-powered systems.
Ensure data availability, integrity, and consistency across AI and backend services.
2. Integration & Automation
Connect diverse data sources (structured and unstructured) to feed AI agents.
Automate ingestion, transformation, and validation processes for continuous data flow.
3. Collaboration
Partner with AI and backend teams to align data models with product requirements.
Work closely with client-facing teams to troubleshoot and optimize data operations.
4. Scalability & Monitoring
Implement data monitoring, versioning, and error tracking systems.
Design scalable solutions suitable for multi-market enterprise environments.
Required Skills & Experience
4–6 years of data engineering experience, including at least 1 year in GenAI systems.
Strong proficiency in Python, SQL, and data pipeline tools (Airflow, Prefect, etc.).
Hands-on experience with ETL processes, data wrangling, and data modeling.
Cloud deployment experience (preferably Azure, AWS, or GCP).
Solid understanding of APIs, integration workflows, and real-time data processing.
Excellent communication and collaboration skills (client-facing experience preferred).
Previous startup or fast-paced product environment experience is a plus.
Nice to Have
Exposure to machine learning or LLM data preprocessing pipelines.
Experience with vector databases, data observability tools, or data lineage systems.
Familiarity with AI application data workflows (RAG pipelines, embeddings, etc.).
Working Hours
Monday to Saturday
SEA / India / UAE time zones
Why Join
Be part of a fast-scaling AI startup shaping data infrastructure for next-gen applications.
Work fully remote with global exposure and direct impact on product delivery.
Competitive pay and ESOP opportunities after 3 months.
High autonomy, steep learning curve, and opportunities to grow with the company.
Pro5 is a global platform helping thousands of vetted professionals get hired by top employers.
Data Engineer • Mohali, Punjab, India