Position Description :
Azure Data Engineer :
Location : Bangalore / Chennai
Employment Type : Full Time
Experience : 5+ Years
Job Summary :
We are seeking a highly experienced Senior Azure Data Engineer to join our growing data team. In this role, you will be responsible for designing, developing, and operationalizing large-scale data processing systems on the Azure cloud platform. The ideal candidate will have a deep understanding of modern data architecture, a passion for building efficient and reliable data pipelines, and extensive hands-on experience with Azure Databricks, Azure Data Lake, PySpark, and Python.
Your future duties and responsibilities :
- Design, build, and maintain scalable and robust data pipelines for ingesting, processing, and transforming large volumes of structured and unstructured data.
- Develop and optimize high-performance Spark jobs using PySpark and Spark SQL within Azure Databricks.
- Implement data storage solutions using Azure Data Lake Storage (Gen2) following medallion architecture (Bronze, Silver, Gold layers) and best practices for data organization.
- Collaborate with data architects, analysts, and business stakeholders to understand data requirements and translate them into technical solutions.
- Implement data security and compliance measures, including access controls, encryption, and data masking within the Azure ecosystem.
- Perform data modeling to create efficient, scalable schemas for both batch and real-time analytics.
- Monitor, troubleshoot, and tune data pipelines and Databricks jobs for performance and cost-effectiveness.
- Automate deployment and management of data solutions using CI / CD pipelines (Azure DevOps / GitHub Actions) and Infrastructure-as-Code (IaC) tools like Terraform or Bicep.
- Establish and enforce data quality checks and data governance standards across the data platform.
- Mentor junior data engineers and promote best practices in software development and data engineering.
Mandatory Skills & Qualifications :
- 5+ years of professional experience in data engineering, with a proven track record of building enterprise-grade data solutions.
- 3+ years of hands-on experience with Microsoft Azure data services, specifically :
  - Azure Databricks : Expert-level proficiency in developing, tuning, and debugging Spark clusters and notebooks.
  - Azure Data Lake Storage (Gen2) : Deep experience in managing data lakes, including directory structure, security (RBAC & ACLs), and performance optimization.
- Expert programming skills in Python for data engineering tasks (e.g., Pandas, API interactions, unit testing).
- Expert-level proficiency in PySpark for large-scale data processing, including DataFrame API, Spark SQL, and understanding of Catalyst Optimizer.
- Strong experience in writing and optimizing complex SQL.
- Solid understanding of data modeling concepts (e.g., star schema, snowflake schema, data vault).
- Experience with data pipeline orchestration tools such as Azure Data Factory or Apache Airflow.
- Experience with version control systems (Git) and collaborative development practices.
Required qualifications to be successful in this role :
Preferred Qualifications :
- Microsoft Azure Data Engineer Associate (DP-) certification or similar.
- Experience with Delta Lake (format and features like ACID transactions, time travel, schema enforcement).
- Experience with real-time data processing using Azure Stream Analytics or Spark Streaming.
- Knowledge of other Azure services (Synapse Analytics, Event Hubs, Azure SQL DB, Cosmos DB, Purview).
- Experience with DevOps / DataOps principles and tools (CI / CD, Terraform, Azure DevOps).
- Familiarity with data visualization tools like Power BI.
Personal Attributes :
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration skills, with the ability to explain complex technical concepts to non-technical stakeholders.
- A proactive, self-motivated attitude with a strong sense of ownership and accountability.
- A continuous learner who stays updated with the latest trends in cloud data technologies.
Skills :
- Azure Data Lake
- MS SQL Server
- Python