Develop cutting-edge data science and engineering solutions that help shape and drive business innovation.
Think outside the box and identify opportunities where ML or AI can solve problems in new and innovative ways.
Build a robust and scalable data architecture capable of handling large data volumes.
Analyze complex data requirements and design data models that meet organizational needs.
Design and lead development of data pipelines that efficiently process and transform raw data from various sources.
Implement backend microservices for large-scale distributed systems using gRPC or REST.
Apply data integration, validation, and transformation techniques to ensure data quality and consistency.
Optimize data processing pipelines and implement performance tuning techniques.
Ensure the security and privacy of data by implementing access controls, encryption mechanisms, and data anonymization techniques.
Establish monitoring and auditing mechanisms to proactively identify and rectify data issues, ensuring a high level of data integrity.
Create and maintain visualization dashboards and reports for actionable insights.
Work in an Agile environment focused on collaboration and teamwork.
Develop roadmaps and requirements, identify risks, and create contingency plans.
Communicate project status to relevant partners.
About You
8+ years of professional experience in data engineering, software development, or MLOps involving large-scale, real-time, cloud-based distributed systems.
Bachelor's degree in Computer Science (or a related field) or equivalent work experience.
Strong foundation in data engineering concepts, including data modeling, database design, ETL (Extract, Transform, Load) processes, and data warehousing.
Proficiency in Python, SQL, and PySpark, with a focus on applying these to data engineering tasks.
Familiarity with various machine learning models, such as regression, classification, clustering, and deep learning.
Knowledge of tools like Git, MLflow, DVC (Data Version Control), or Weights & Biases for tracking experiments and model versions.
Proficiency with technologies such as Snowflake and Databricks.
Expertise with data visualization tools such as Tableau, Power BI, Sigma, or similar.
Familiarity with code repository and project tools such as GitHub, JIRA, and Confluence.
Strong communication skills and the ability to liaise with various partners, including data scientists, business analysts, and executives, to understand requirements and translate them into technical solutions.