Job Description
Profile Summary
- Data Engineer with 5 years of experience in Big Data technologies.
- Proven expertise in designing and developing scalable data pipelines using Apache Spark and PySpark.
- Proficient in Python programming.
- Hands-on experience with Databricks, MongoDB, Docker, and Kubernetes.
- Solid understanding of DevOps practices and tools.
Primary Skills
- Design, develop, and maintain scalable and reliable data pipelines.
- Translate business requirements and data models into efficient ETL code.
- Work independently on Spark-based data processing workflows.
- Skilled in writing complex business logic using PySpark and Spark SQL (a short sketch follows this list).
- Strong understanding of Spark clusters, distributed computing, and parallel processing.
- Capable of writing unit test cases to validate ETL logic.
- Experience in building reusable functions and applying modular programming approaches.
- Worked with AWS Athena for table creation, partitioning, and writing complex SQL queries.
- Experience working with both relational databases (e.g., Oracle, SQL Server) and NoSQL databases (e.g., MongoDB) for data extraction and loading.
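To ground the PySpark and Spark SQL expectations above, here is a minimal sketch, not code from any actual pipeline behind this role: the table, column, and path names (orders, order_amount, region, s3://example-bucket/...) are hypothetical assumptions. It shows a reusable, modular transform, aggregate business logic expressed in Spark SQL, and a small unit test validating the ETL logic with a local SparkSession.

```python
# Minimal illustrative sketch only. Table, column, and path names
# (orders, order_amount, region, s3://example-bucket/...) are hypothetical
# assumptions, not taken from any real pipeline behind this role.
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F


def add_order_tier(df: DataFrame) -> DataFrame:
    """Reusable, modular transform: classify each order by amount."""
    return df.withColumn(
        "order_tier",
        F.when(F.col("order_amount") >= 1000, "large")
        .when(F.col("order_amount") >= 100, "medium")
        .otherwise("small"),
    )


def run_pipeline(spark: SparkSession) -> None:
    # Extract: read a hypothetical source table.
    orders = spark.read.parquet("s3://example-bucket/orders/")

    # Transform: reusable function plus Spark SQL for the aggregate logic.
    add_order_tier(orders).createOrReplaceTempView("tiered_orders")
    summary = spark.sql("""
        SELECT region, order_tier,
               COUNT(*)          AS order_count,
               SUM(order_amount) AS total_amount
        FROM tiered_orders
        GROUP BY region, order_tier
    """)

    # Load: write the result to a hypothetical target location.
    summary.write.mode("overwrite").parquet("s3://example-bucket/order_summary/")


def test_add_order_tier() -> None:
    """Unit test validating the ETL logic with a local SparkSession."""
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.createDataFrame([(50.0,), (500.0,), (5000.0,)], ["order_amount"])
    tiers = [row["order_tier"] for row in add_order_tier(df).collect()]
    assert tiers == ["small", "medium", "large"]
```

Keeping transforms as pure DataFrame-in, DataFrame-out functions is what makes the unit test runnable against a local SparkSession without any external infrastructure.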
Secondary Skills
- Experience in creating AWS IAM roles and defining access policies (see the sketch after this list).
- Familiar with Docker image creation and customization.
- Experience with Camunda for workflow orchestration and automation.
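As a minimal sketch of the IAM expectation, the following creates a role and attaches an inline access policy with boto3. The role name, policy name, and bucket ARN are hypothetical placeholders, and the trust principal assumes the ETL jobs run on EC2; adjust these for your actual setup.

```python
# Illustrative sketch only: role name, policy name, and bucket ARN are
# hypothetical placeholders. Requires AWS credentials with IAM permissions.
import json

import boto3

iam = boto3.client("iam")

# Trust policy letting EC2 instances assume the role (assumption: the ETL
# jobs run on EC2; swap the principal for Glue, Lambda, etc. as needed).
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ec2.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(
    RoleName="etl-pipeline-role",  # hypothetical name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
    Description="Role assumed by ETL jobs to read the raw-data bucket",
)

# Inline access policy scoped to a single hypothetical S3 bucket.
access_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::example-raw-data",
            "arn:aws:s3:::example-raw-data/*",
        ],
    }],
}

iam.put_role_policy(
    RoleName="etl-pipeline-role",
    PolicyName="etl-raw-data-read",
    PolicyDocument=json.dumps(access_policy),
)
```

Scoping the policy to the specific bucket rather than granting broad S3 access keeps the role aligned with least-privilege practice.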
Additional Expectations
- Strong communication skills.
- Proficient in troubleshooting and problem-solving.
- Demonstrates a continuous learning mindset.
- Focused on automating manual tasks through coding wherever feasible.
Technical Skills
Mandatory Skills:
Python for Data, Apache Spark, Java, Scala, Spark SQL, Databricks