Lead Data Engineer – Python & GCP
Job Overview:
We are seeking an experienced Lead Data Engineer with strong expertise in Python and Google Cloud Platform (GCP). The ideal candidate will lead the end-to-end data engineering lifecycle, from requirement gathering and solution design through development, deployment, and post-delivery support. This role involves designing scalable ETL/ELT pipelines, architecting cloud-native solutions, implementing data ingestion & transformation processes, and ensuring data quality across systems.
Experience Level:
10+ years of relevant IT experience in data engineering and backend development.
Key Responsibilities:
- Design, develop, test, and maintain scalable ETL/ELT data pipelines using Python (a minimal pipeline sketch follows this list).
- Architect enterprise-grade data solutions using technologies such as Kafka, GKE, multi-cloud services, load balancers, Apigee, dbt, LLMs, and DLP tools.
- Work extensively with GCP services, including:
- Dataflow – real-time & batch processing
- Cloud Functions – serverless compute
- BigQuery – data warehousing & analytics
- Cloud Composer – workflow orchestration (Airflow; a minimal DAG sketch follows this list)
- GCS – scalable storage
- IAM – access control & security
- Cloud Run – containerized workloads
- Build APIs using Python and FastAPI (see the FastAPI sketch after this list).
- Work with Big Data and processing technologies:
- Apache Spark, Kafka, Airflow, MongoDB, Redis/Bigtable
- Perform data ingestion, transformation, cleansing, and validation to ensure high data quality.
- Implement and enforce data quality checks, monitoring, and validation rules.
- Collaborate with data scientists, analysts, and engineering teams to understand data needs and deliver solutions.
- Use GitHub for version control and support CI/CD deployments.
- Write complex SQL queries for relational databases such as SQL Server, Oracle, and PostgreSQL.
- Document data pipeline designs, architecture diagrams, and operational procedures.
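
To make the pipeline and data-quality responsibilities above concrete, here is a minimal, self-contained Python ETL sketch. It is illustrative only: the orders.csv source, its column names, and the print-based load step are hypothetical stand-ins, not part of this role description.

```python
"""Minimal ETL sketch with inline data-quality checks (illustrative only)."""
import csv
from datetime import datetime

REQUIRED_FIELDS = ("order_id", "amount", "created_at")  # hypothetical schema


def extract(path):
    """Yield raw rows from a CSV source (stand-in for GCS, Kafka, or an API)."""
    with open(path, newline="") as fh:
        yield from csv.DictReader(fh)


def transform(row):
    """Cleanse and validate one row; return None to reject it."""
    if any(not row.get(field) for field in REQUIRED_FIELDS):
        return None  # quality check: required fields must be present
    try:
        return {
            "order_id": row["order_id"].strip(),
            "amount": float(row["amount"]),  # coerce and validate types
            "created_at": datetime.fromisoformat(row["created_at"]),
        }
    except ValueError:
        return None  # quality check: reject malformed values


def load(rows):
    """Stand-in for a warehouse write (e.g. a BigQuery load job)."""
    for row in rows:
        print(row)


if __name__ == "__main__":
    cleaned = (t for r in extract("orders.csv") if (t := transform(r)) is not None)
    load(cleaned)
```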
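For the Cloud Composer item above, here is a minimal Airflow DAG sketch, assuming Airflow 2.4+ (for the `schedule` argument) as bundled with recent Composer images; the DAG id and the extract/load callables are hypothetical placeholders.

```python
"""Minimal Airflow DAG sketch for Cloud Composer (illustrative only)."""
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    pass  # placeholder: pull data from a source system


def load():
    pass  # placeholder: write results to the warehouse


with DAG(
    dag_id="orders_daily",  # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task  # run extract before load
```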
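For the FastAPI item above, a minimal endpoint sketch; the Order model and the in-memory dict are hypothetical stand-ins for a real backing store such as Bigtable, Redis, or MongoDB.

```python
"""Minimal FastAPI sketch (illustrative only)."""
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="data-service")


class Order(BaseModel):
    order_id: str
    amount: float


STORE: dict[str, Order] = {}  # hypothetical stand-in for a real database


@app.post("/orders", status_code=201)
def create_order(order: Order) -> Order:
    STORE[order.order_id] = order
    return order


@app.get("/orders/{order_id}")
def read_order(order_id: str) -> Order:
    if order_id not in STORE:
        raise HTTPException(status_code=404, detail="order not found")
    return STORE[order_id]
```

This can be run locally with `uvicorn main:app --reload`, assuming the file is saved as main.py.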
Required Skills:
- 10+ years of hands-on experience with Python in data engineering or backend development.
- Strong working knowledge of GCP services (Dataflow, BigQuery, Cloud Functions, Cloud Composer, GCS, Cloud Run).
- Deep understanding of data pipeline architecture, ETL/ELT processes, and data integration patterns.
- Experience with Apache Spark, Kafka, Airflow, FastAPI, Redis/Bigtable.
- Strong SQL skills with at least one enterprise RDBMS (SQL Server, Oracle, PostgreSQL).
- Experience in on-prem to cloud data migrations.
- Knowledge of GitHub and CI/CD best practices.
Good to Have:
- Experience with Snowflake.
- Hands-on knowledge of Databricks.
- Familiarity with Azure Data Factory (ADF) or other Azure data tools.