Role : Data Engineer
Location : Chennai (Hybrid).
Job Description :
Key Responsibilities :
- Lead the design, development, and deployment of scalable data pipelines and ETL processes using Scala and Apache Spark (Databricks).
- Develop and maintain data processing workflows using PySpark and Python.
- Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver robust data solutions.
- Architect and implement data ingestion, transformation, and integration solutions for structured and unstructured data.
- Optimize Spark jobs for performance and cost efficiency on cloud platforms.
- Lead, mentor, and guide a team of data engineers, ensuring best practices in coding, testing, and deployment.
- Establish and enforce data engineering standards, code reviews, and documentation.
- Troubleshoot and resolve complex data pipeline issues.
- Stay current with emerging technologies and recommend improvements to existing data infrastructure.
- Work closely with DevOps and cloud teams to ensure smooth deployment and monitoring of data pipelines.
Required Qualifications :
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 10+ years of professional experience in data engineering.
- Strong hands-on programming experience in Scala, with a deep understanding of functional programming concepts.
- Extensive experience with Apache Spark, including Spark SQL, Spark Streaming, and optimization techniques.
- Proficiency in Python and PySpark for data processing tasks.
- Experience working with Databricks or similar cloud-based Spark platforms.
- Solid understanding of distributed computing, big data architectures, and data modeling.
- Experience with cloud platforms such as AWS, Azure, or GCP.
- Familiarity with data storage technologies like HDFS, S3, Delta Lake, and relational / non-relational databases.
- Strong leadership skills with experience leading and mentoring engineering teams.
- Excellent problem-solving, communication, and collaboration skills.
(ref : hirist.tech)