- Design, develop, and maintain data solutions for data generation, collection, and processing
- Be a key team member that assists in the design and development of the data pipeline
- Create data pipelines and ensure data quality by implementing ETL processes to migrate and deploy data across systems
- Contribute to the design, development, and implementation of data pipelines, ETL / ELT processes, and data integration solutions
- Take ownership of data pipeline projects from inception to deployment; manage scope, timelines, and risks
- Collaborate with cross-functional teams to understand data requirements and design solutions that meet business needs
- Develop and maintain data models, data dictionaries, and other documentation to ensure data accuracy and consistency
- Implement data security and privacy measures to protect sensitive data
- Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions
- Collaborate and communicate effectively with product teams
- Collaborate with Data Architects, Business SMEs, and Data Scientists to design and develop end-to-end data pipelines that meet fast-paced business needs across geographic regions
- Adhere to best practices for coding, testing, and designing reusable code / components
- Explore new tools and technologies that will help improve ETL platform performance
- Participate in sprint planning meetings and provide estimates for technical implementation
What we expect of you
We are all different, yet we all use our unique contributions to serve patients.
Basic Qualifications:
- Master's degree and 1 to 3 years of Computer Science, IT, or related field experience OR
- Bachelor's degree and 3 to 5 years of Computer Science, IT, or related field experience OR
- Diploma and 7 to 9 years of Computer Science, IT, or related field experience
- Proficiency in Python, PySpark, and Scala for data processing and ETL (Extract, Transform, Load) workflows, with hands-on experience using Databricks for building ETL pipelines and handling big data processing (a minimal illustrative sketch follows this list)
- Experience with data warehousing platforms such as Amazon Redshift or Snowflake
- Strong knowledge of SQL and experience with relational databases (e.g., PostgreSQL, MySQL)
- Familiarity with big data frameworks like Apache Hadoop, Spark, and Kafka for handling large datasets
- Experience with software engineering best practices, including but not limited to version control (GitLab, Subversion, etc.), CI / CD (Jenkins, GitLab, etc.), automated unit testing, and DevOps
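For illustration only, not part of the role requirements: a minimal PySpark sketch of the kind of read-transform-write ETL pipeline referenced in the qualifications above. The S3 paths, column names, and application name are hypothetical placeholders.

# Minimal PySpark ETL sketch: read raw CSV, clean and type the data, write partitioned Parquet.
# All paths, columns, and names below are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl_example").getOrCreate()

# Extract: load raw order events (header row, schema inferred for brevity)
raw = spark.read.option("header", True).csv("s3://example-bucket/raw/orders/")

# Transform: deduplicate, drop null keys, cast types, derive a partition column
orders = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("order_id").isNotNull())
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write curated data partitioned by date for downstream SQL / warehouse use
(orders.write
       .mode("overwrite")
       .partitionBy("order_date")
       .parquet("s3://example-bucket/curated/orders/"))

spark.stop()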
Preferred Qualifications:
- Experience with cloud platforms such as AWS, particularly data services (e.g., EKS, EC2, S3, EMR, RDS, Redshift / Spectrum, Lambda, Glue, Athena)
- Experience with the Anaplan platform, including building, managing, and optimizing models and workflows, including scalable data integrations
- Understanding of machine learning pipelines and frameworks for ML / AI models
Professional Certifications:
- AWS Certified Data Engineer (preferred)
- Databricks Certified (preferred)
Soft Skills:
- Excellent critical-thinking and problem-solving skills
- Strong communication and collaboration skills
- Demonstrated awareness of how to function in a team setting
- Demonstrated presentation skills
Skills Required
ML, Data Services, DevOps, Frameworks, AI, SQL, AWS, ETL