Data Engineer - Python / Scala

IT Firm, Delhi, IN
Job description

We are looking for an experienced GCP Data Engineer with 5+ years of professional experience in data engineering, cloud-based data solutions, and large-scale distributed systems. This role is fully remote and requires a hands-on professional who can design, build, and optimize data pipelines and solutions on Google Cloud Platform (GCP).

Key Responsibilities:

  • Architect, design, and implement highly scalable data pipelines and ETL workflows leveraging GCP services.
  • Develop and optimize data ingestion, transformation, and storage frameworks to support analytical and operational workloads.
  • Work extensively with BigQuery, Dataflow, Pub/Sub, Dataproc, Data Fusion, Cloud Composer, and Cloud Storage to design robust data solutions.
  • Create and maintain efficient data models and schemas for analytical reporting, machine learning pipelines, and real-time processing.
  • Collaborate closely with data scientists, analysts, and business stakeholders to understand requirements and convert them into technical data solutions.
  • Implement best practices for data governance, security, privacy, and compliance across the entire data lifecycle.
  • Monitor, debug, and optimize pipeline performance, ensuring minimal latency and high throughput.
  • Design and maintain APIs and microservices for data integration across platforms.
  • Perform advanced data quality checks, anomaly detection, and validation to ensure data accuracy and consistency.
  • Implement CI/CD for data engineering projects using GCP-native DevOps tools.
  • Stay updated with emerging GCP services and industry trends to continuously improve existing solutions.
  • Create detailed documentation for data processes, workflows, and standards to enable smooth knowledge transfer.
  • Support the migration of on-premise data systems to GCP, ensuring zero downtime and efficient cutover.
  • Automate repetitive workflows, deployment processes, and monitoring systems using Python, Shell scripting, or Terraform.
  • Provide mentoring and technical guidance to junior data engineers in the team.

Required Skills & Experience:

  • 5+ years of experience in data engineering with a strong focus on cloud-based data solutions.
  • Hands-on expertise with Google Cloud Platform (GCP) and services including BigQuery, Dataflow, Pub/Sub, Dataproc, Data Fusion, Cloud Composer, and Cloud Storage.
  • Strong proficiency in SQL, including query optimization, performance tuning, and working with large datasets.
  • Advanced programming skills in Python, Java, or Scala for building data pipelines.
  • Experience with real-time data streaming frameworks such as Apache Kafka or Google Pub/Sub.
  • Solid knowledge of ETL/ELT processes, data modeling (star/snowflake), and schema design for both batch and streaming use cases.
  • Proven track record of building data lakes, warehouses, and pipelines that scale with enterprise-level workloads.
  • Experience integrating diverse data sources including APIs, relational databases, flat files, and unstructured data.
  • Knowledge of Terraform, Infrastructure as Code (IaC), and automation practices in cloud environments.
  • Understanding of CI/CD pipelines for data engineering workflows and integration with Git, Jenkins, or Cloud Build.
  • Strong background in data governance, lineage, and cataloging tools.
  • Familiarity with machine learning workflows and enabling ML pipelines using GCP services is an advantage.
  • Good grasp of Linux/Unix environments and shell scripting.
  • Exposure to DevOps practices and monitoring tools such as Stackdriver or Cloud Monitoring.
  • Excellent problem-solving, debugging, and analytical skills with the ability to handle complex technical challenges.
  • Strong communication skills with the ability to work independently in a remote-first team.

Preferred Skills:

  • Experience with multi-cloud or hybrid environments (AWS/Azure alongside GCP).
  • Familiarity with data visualization platforms such as Looker, Tableau, or Power BI.
  • Exposure to containerization technologies such as Docker and Kubernetes.
  • Understanding of big data processing frameworks like Spark, Hadoop, or Flink.
  • Prior experience in industries with high data volume such as finance, retail, or healthcare.

Educational Background:

  • Bachelor's or Master's degree in Computer Science, Information Technology, Data Engineering, or a related field.
  • Relevant GCP certifications (e.g., Professional Data Engineer, Professional Cloud Architect) are highly preferred.

Why Join Us?

  • Opportunity to work on cutting-edge cloud data projects at scale.
  • Fully remote working environment with flexible schedules.
  • Exposure to innovative data engineering practices and advanced GCP tools.
  • Collaborative team culture that values continuous learning, innovation, and career growth.
(ref: hirist.tech)
