Job Title: Data Engineer (Azure)
Responsibilities:
- Build scalable real-time and batch processing workflows using Azure Databricks, PySpark, and Apache Spark.
- Perform data pre-processing (cleaning, transformation, deduplication, normalization, encoding, scaling) to ensure high-quality inputs for analytics and AI/ML (a minimal bronze-to-silver sketch follows this list).
- Design and maintain cloud-based data architectures (data lakes, lakehouses, warehouses) following Medallion Architecture.
- Deploy and optimize data solutions on Azure (preferred), AWS, or GCP, focusing on performance, security, and scalability.
- Develop and optimize ETL/ELT pipelines for structured/unstructured data from IoT, MES, SCADA, LIMS, and ERP systems.
- Automate data workflows using CI/CD and DevOps best practices, ensuring compliance with security standards.
- Monitor, troubleshoot, and enhance pipelines for high availability and reliability.
- Utilize Docker and Kubernetes for scalable data processing environments.
- Collaborate with automation teams, data scientists, and engineers to provide clean, structured data for AI/ML.
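As a minimal, illustrative sketch only (the lake paths, column names, and cleaning rules below are assumptions, not part of this posting), a bronze-to-silver Medallion step of the kind described in the pre-processing and architecture responsibilities might look like:

```
# Illustrative bronze -> silver step: deduplicate, clean, and normalize raw IoT readings.
# All paths and column names are hypothetical examples.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("bronze-to-silver").getOrCreate()

bronze = spark.read.format("delta").load("/mnt/lake/bronze/iot_readings")  # assumed path

silver = (
    bronze
    .dropDuplicates(["device_id", "event_ts"])             # deduplication
    .filter(F.col("temperature_c").isNotNull())            # cleaning
    .withColumn("event_ts", F.to_timestamp("event_ts"))    # type normalization
    .withColumn("temperature_c", F.col("temperature_c").cast("double"))
)

(silver.write
    .format("delta")
    .mode("overwrite")
    .save("/mnt/lake/silver/iot_readings"))                 # assumed path
```

On Databricks a `spark` session is already provided; the explicit builder call is only there to keep the snippet self-contained.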
Skills (Must-Have):
- Programming: PySpark, Apache Spark
- Platforms: Azure Databricks (preferred), AWS, or GCP
- Data Management: SQL, ETL/ELT development, data lakes, lakehouses
- DevOps/Automation: CI/CD pipelines, Docker, Kubernetes
- Architecture: Medallion architecture, scalable data systems
- Experience with IoT, MES, SCADA, LIMS, and ERP data (an illustrative streaming ingest sketch follows this list)
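Purely as a hedged illustration of how the streaming, platform, and IoT-data skills above are commonly combined (the Kafka broker, topic, and lake paths are assumptions for the sketch), a real-time ingest into a bronze Delta table could look like:

```
# Illustrative real-time ingest into the bronze layer with Structured Streaming.
# Broker address, topic, and storage paths are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("iot-stream-ingest").getOrCreate()

raw = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # assumed broker
    .option("subscribe", "iot-telemetry")                # assumed topic
    .load())

bronze = raw.select(
    F.col("key").cast("string").alias("device_id"),
    F.col("value").cast("string").alias("payload_json"),
    F.col("timestamp").alias("ingest_ts"),
)

query = (bronze.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/lake/_checkpoints/iot_bronze")
    .outputMode("append")
    .start("/mnt/lake/bronze/iot_telemetry"))
```

The checkpoint location is what lets the stream restart safely, which is the kind of high-availability behavior the monitoring responsibility above calls for.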
Qualifications:
- Bachelor's/Master's degree in Computer Science, Data Engineering, or a related field (IIT graduates only).
- Proven 4+ years of experience as a Data Engineer in enterprise-scale environments.
- Strong problem-solving and communication skills with a collaborative mindset.
(ref: hirist.tech)