This role involves collaborating with cross-functional teams to ensure data reliability, scalability, and performance. The candidate will work closely with data scientists, analysts and software engineers to deliver efficient data flow and storage, enabling data-driven decision-making across the organisation.
Responsibilities
- Software Engineering Excellence: Write clean, efficient, and maintainable code using JavaScript or Python while adhering to best practices and design patterns.
- Design, Build, and Maintain Systems: Develop robust software solutions and implement RESTful APIs that handle high volumes of data in real time, leveraging message queues (Google Cloud Pub/Sub, Kafka, RabbitMQ) and event-driven architectures.
- Data Pipeline Development: Design, develop and maintain data pipelines (ETL/ELT) to process structured and unstructured data from various sources.
- Data Storage & Warehousing: Build and optimize databases, data lakes and data warehouses (e.g. Snowflake) for high-performance querying.
- Data Integration: Work with APIs, batch and streaming data sources to ingest and transform data.
- Performance Optimization: Optimize queries, indexing and partitioning for efficient data retrieval.
- Collaboration: Work with data analysts, data scientists, software developers and product teams to understand requirements and deliver scalable solutions.
- Monitoring & Debugging: Set up logging, monitoring, and alerting to ensure data pipelines run reliably.
- Ownership & Problem-Solving: Proactively identify issues or bottlenecks and propose innovative solutions to address them.
Qualifications
- 4+ years of experience in software development
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
- Strong Problem-Solving Skills: Ability to debug and optimize data processing workflows.
- Programming Fundamentals: Solid understanding of data structures, algorithms, and software design patterns.
- Software Engineering Experience: Demonstrated experience (SDE II/III level) in designing, developing, and delivering software solutions using modern languages and frameworks (Node.js, JavaScript, Python, TypeScript, SQL, Scala or Java).
- ETL Tools & Frameworks: Experience with Airflow, dbt, Apache Spark, Kafka, Flink or similar technologies.
- Cloud Platforms: Hands-on experience with GCP (Pub/Sub, Dataflow, Cloud Storage) or AWS (S3, Glue, Redshift).
- Databases & Warehousing: Strong experience with PostgreSQL, MySQL, Snowflake, and NoSQL databases (MongoDB, Firestore, Elasticsearch).
- Version Control & CI/CD: Familiarity with Git, Jenkins, Docker, Kubernetes, and CI/CD pipelines for deployment.
- Communication: Excellent verbal and written communication skills, with the ability to work effectively in a collaborative environment.
- Experience with data visualization tools (e.g. Superset, Tableau), Terraform, IaC, ML/AI data pipelines, and DevOps practices is a plus.