Job Overview
This role involves designing, building, and optimizing large-scale data pipelines and workflows.
Key Responsibilities
- Architecting scalable data infrastructure using cloud-based services such as AWS
- Building, testing, and deploying data pipelines using Python and PySpark
- Collaborating with cross-functional teams to deliver high-quality data products
- Managing and maintaining large datasets, ensuring data integrity and security
- Maintaining a deep understanding of database systems such as MySQL, including writing efficient SQL
Requirements
- Data Engineering Expertise: 3+ years of experience in data engineering, with a strong background in ETL pipeline design and implementation
- Cloud Services: Hands-on experience with AWS services such as EC2, Athena, Lambda, and Step Functions
- Programming Languages: Strong proficiency in Python, with expertise in libraries like PySpark and SQLAlchemy
- Databases: Proficiency in MySQL and knowledge of other database systems
- Orchestration Tools: Experience with tools like Airflow for workflow management
About Us
We are a dynamic organization that values innovation and teamwork. Our ideal candidate will thrive in a fast-paced environment, take ownership of their work, and be motivated by impact.