Data Engineer - Python / Scala

IT Firm, Delhi, IN
Job description

We are looking for an experienced GCP Data Engineer with 5+ years of professional experience in data engineering, cloud-based data solutions, and large-scale distributed systems. This role is fully remote and requires a hands-on professional who can design, build, and optimize data pipelines and solutions on Google Cloud Platform (GCP).

Key Responsibilities:

  • Architect, design, and implement highly scalable data pipelines and ETL workflows leveraging GCP services.
  • Develop and optimize data ingestion, transformation, and storage frameworks to support analytical and operational workloads.
  • Work extensively with BigQuery, Dataflow, Pub/Sub, Dataproc, Data Fusion, Cloud Composer, and Cloud Storage to design robust data solutions.
  • Create and maintain efficient data models and schemas for analytical reporting, machine learning pipelines, and real-time processing.
  • Collaborate closely with data scientists, analysts, and business stakeholders to understand requirements and convert them into technical data solutions.
  • Implement best practices for data governance, security, privacy, and compliance across the entire data lifecycle.
  • Monitor, debug, and optimize pipeline performance, ensuring minimal latency and high throughput.
  • Design and maintain APIs and microservices for data integration across platforms.
  • Perform advanced data quality checks, anomaly detection, and validation to ensure data accuracy and consistency.
  • Implement CI/CD for data engineering projects using GCP-native DevOps tools.
  • Stay updated with emerging GCP services and industry trends to continuously improve existing solutions.
  • Create detailed documentation for data processes, workflows, and standards to enable smooth knowledge transfer.
  • Support the migration of on-premise data systems to GCP, ensuring zero downtime and efficient cutover.
  • Automate repetitive workflows, deployment processes, and monitoring systems using Python, Shell scripting, or Terraform.
  • Provide mentoring and technical guidance to junior data engineers in the team.

Required Skills & Experience:

  • 5+ years of experience in data engineering with a strong focus on cloud-based data solutions.
  • Hands-on expertise with Google Cloud Platform (GCP) and services including BigQuery, Dataflow, Pub/Sub, Dataproc, Data Fusion, Cloud Composer, and Cloud Storage.
  • Strong proficiency in SQL, including query optimization, performance tuning, and working with large datasets.
  • Advanced programming skills in Python, Java, or Scala for building data pipelines.
  • Experience with real-time data streaming frameworks such as Apache Kafka or Google Pub/Sub.
  • Solid knowledge of ETL/ELT processes, data modeling (star/snowflake), and schema design for both batch and streaming use cases.
  • Proven track record of building data lakes, warehouses, and pipelines that scale with enterprise-level workloads.
  • Experience integrating diverse data sources including APIs, relational databases, flat files, and unstructured data.
  • Knowledge of Terraform, Infrastructure as Code (IaC), and automation practices in cloud environments.
  • Understanding of CI/CD pipelines for data engineering workflows and integration with Git, Jenkins, or Cloud Build.
  • Strong background in data governance, lineage, and cataloging tools.
  • Familiarity with machine learning workflows and enabling ML pipelines using GCP services is an advantage.
  • Good grasp of Linux/Unix environments and shell scripting.
  • Exposure to DevOps practices and monitoring tools such as Stackdriver or Cloud Monitoring.
  • Excellent problem-solving, debugging, and analytical skills with the ability to handle complex technical challenges.
  • Strong communication skills with the ability to work independently in a remote-first team.

Preferred Skills:

  • Experience with multi-cloud or hybrid environments (AWS/Azure alongside GCP).
  • Familiarity with data visualization platforms such as Looker, Tableau, or Power BI.
  • Exposure to containerization technologies such as Docker and Kubernetes.
  • Understanding of big data processing frameworks like Spark, Hadoop, or Flink.
  • Prior experience in industries with high data volume such as finance, retail, or healthcare.

Educational Background:

  • Bachelor's or Master's degree in Computer Science, Information Technology, Data Engineering, or a related field.
  • Relevant GCP certifications (e.g., Professional Data Engineer, Professional Cloud Architect) are highly preferred.

Why Join Us?

  • Opportunity to work on cutting-edge cloud data projects at scale.
  • Fully remote working environment with flexible schedules.
  • Exposure to innovative data engineering practices and advanced GCP tools.
  • Collaborative team culture that values continuous learning, innovation, and career growth.
(ref: hirist.tech)
