Your Role:
- Develop and maintain data pipelines while achieving high reliability and efficiency
- Help guide data infrastructure design and develop proof of concepts for recommended solutions
- Liaise with engineering and product teams to help the Research team develop new data products
- Maintain documentation for the Data Warehouse and other data products
- Design, develop, QA and maintain code related to data engineering
- Provide thought-leadership and dependable execution on diverse projects
- Guide other data engineers on your team to apply best practices and meet their deliverables
Required Skills:
- Master's in Computer Science or equivalent work experience
- Strong knowledge of SQL and relational databases; non-relational database exposure is good to have
- Deep understanding of and prior experience with Spark / PySpark
- Understanding of data warehouse table design, star schemas, etc.
- Previous experience architecting and building data pipelines from first-party and third-party data sources
- Strong problem-solving skills; adaptable, proactive, and willing to take ownership
- Strong commitment to quality, architecture, and documentation
- Experience using AWS cloud services (S3, Athena, EMR, Kinesis)
- Experience with data pipeline technologies a plus (Airflow, Luigi)
- Experience with Kafka streaming a plus
- Experience with business intelligence tools a plus (Tableau, Qlik)
- Experience with Databricks / dbt a plus
Skills Required:
PySpark, AWS Cloud, Big Data, Relational Databases, Data Warehousing, SQL