Job Title: Data Engineer
Location: Remote
Position: Full-time
Experience: 8+ years (required)
Position Overview
We are currently seeking a Data Engineer to join our technology team. In this role, you will build data-driven products on top of cloud-based data infrastructure. The ideal candidate is passionate about technology, software design, and data engineering, and is excited to blend these disciplines to create innovative solutions.
You will be responsible for one or more software and data products, participating in architectural decisions, design, development, and evaluation of new technical solutions. This role also involves leading a small team of developers.
Key Responsibilities
- Manage and optimize data and compute environments to enable efficient use of data assets
- Design, build, test, and deploy scalable and reusable systems capable of handling large volumes of data
- Lead a small team of developers; conduct code reviews and mentor junior engineers
- Stay current with emerging technologies and help teach them to the team
- Collaborate with cross-functional teams to integrate data into broader applications
Required Skills
- Proven experience designing and managing data flows
- Expertise in designing systems and APIs for data integration
- 8+ years of hands-on experience with Linux, Bash, Python, and SQL
- 4+ years working with Spark and other components of the Hadoop ecosystem
- 4+ years of experience using AWS cloud services, particularly EMR, Glue, Athena, and Redshift
- 4+ years of experience managing a development team
- Deep passion for technology and enthusiasm for solving complex, real-world problems with cutting-edge tools
Additional Skills (Preferred)
- BS, MS, or PhD in Computer Science, Engineering, or equivalent practical experience
- Strong experience with Python, C++, or other widely used languages
- Experience working with petabyte-scale data infrastructure
- Solid understanding of:
  - Data organization: partitioning, clustering, file sizes, file formats
  - Data cataloging: Hive/Hive Metastore, AWS Glue, or similar
- Background working with relational databases
- Proficiency with Hadoop, Hive, Spark, or similar tools
- Exposure to modern data stack technologies such as dbt and Airbyte
- Experience building scalable data pipelines (e.g., Airflow is a plus)
- Strong background with AWS and/or Google Cloud Platform (GCP)
- Demonstrated ability to independently lead projects from concept through launch and ongoing support