Design, build, and maintain robust data pipelines (batch or streaming) that process and transform data from diverse sources.
Ensure data quality, reliability, and availability across the pipeline lifecycle.
Collaborate with product managers, architects, and engineering leads to define technical strategy.
Participate in code reviews, testing, and deployment processes to maintain high standards.
Own smaller components of the data platform or pipelines and take end-to-end responsibility.
Continuously identify and resolve performance bottlenecks in data pipelines.
Take initiative, show the drive to proactively pick up new challenges, and work as a senior individual contributor across our multiple products and features.
Required qualifications:
5 to 7 years of experience in Big Data or data engineering roles.
Proficiency in a JVM-based language such as Java or Scala is preferred. Python is also acceptable for candidates with solid Big Data experience.
Proven experience with distributed Big Data tools and processing frameworks such as Apache Spark or equivalent (for processing), Kafka or Flink (for streaming), and Airflow or equivalent (for orchestration).
Familiarity with cloud platforms (e.g., AWS, GCP, or Azure), including services like S3, Glue, BigQuery, or EMR.
Ability to write clean, efficient, and maintainable code.
Good understanding of data structures, algorithms, and object-oriented programming.