Job Description: Data Engineer (Manager Level – Individual Contributor)
Experience: 5–8 years
Location: Noida, UP
Role Type: Individual Contributor (IC)
About the Role
We are seeking a highly skilled Data Engineer with strong Python development expertise and proven experience in building and scaling cloud-based data management platforms. This role requires hands-on expertise in data pipelines, data lakehouse architectures, and Apache Spark, with a strong foundation in metadata and master data management. The ideal candidate will also bring experience in AI-driven analytics and demonstrate the ability to design, optimize, and manage data solutions with a focus on cost efficiency.
Key Responsibilities
- Design, build, and maintain scalable data pipelines and ETL/ELT processes across Azure-based ecosystems.
- Develop and optimize data lakehouse solutions leveraging Azure Synapse Analytics, Microsoft Fabric, and Databricks.
- Estimate and manage cloud resource utilization and costs for data pipelines, ensuring efficiency and cost-effectiveness.
- Monitor and fine-tune pipelines to balance performance and cost optimization across compute, storage, and data movement.
- Collaborate with analytics teams to deliver business-ready datasets for reporting and AI-driven use cases.
- Implement best practices for metadata and master data management, ensuring data lineage, quality, and governance.
- Develop and support real-time and batch processing frameworks using Apache Spark.
- Integrate and support visualization solutions using Power BI for business stakeholders.
- Partner with data science and AI teams to enable AI/ML-powered analytics solutions.
- Ensure adherence to data security, compliance, and governance standards.
Required Qualifications
- 5–8 years of hands-on experience as a Data Engineer or in a similar role.
- Strong Python coding expertise for data processing and automation.
- Proven experience with Azure Synapse, Microsoft Fabric, and Databricks in enterprise environments.
- Hands-on expertise with Apache Spark, distributed data processing, and performance optimization.
- Experience in cost estimation, monitoring, and optimization of cloud-based pipelines.
- Proficiency in Power BI and data visualization best practices.
- Strong knowledge of metadata management, master data management, and data governance frameworks.
- Exposure to AI-driven analytics and integration with ML/GenAI workflows.
- Solid understanding of data modeling, data quality, and data integration principles.
Preferred Skills
- Familiarity with CI/CD pipelines for data engineering (Azure DevOps, GitHub Actions, etc.).
- Experience with API-driven data ingestion and workflow orchestration tools.
- Knowledge of responsible AI practices and explainability frameworks.
- Strong problem-solving, communication, and stakeholder management skills.