Job Summary:
As a GCP Data Engineer, you will be responsible for designing, building, and maintaining scalable data processing systems on Google Cloud Platform (GCP). You will collaborate with cross-functional teams to ensure data is accessible, reliable, and optimized for analytics and machine learning initiatives.
Key Responsibilities:
- Data Pipeline Development:
- Design, develop, and maintain ETL/ELT pipelines using tools such as Cloud Dataflow, Apache Beam, and Cloud Dataprep (a minimal pipeline sketch appears after this list).
- Implement real-time data streaming solutions using Cloud Pub/Sub or Apache Kafka.
- Data Storage & Warehousing:
- Manage and optimize storage solutions such as BigQuery, Cloud Storage, and Cloud Spanner.
- Design efficient data models and schemas for structured and unstructured data.
- Data Integration & Transformation:
- Integrate data from multiple sources into unified data lakes or warehouses.
- Cleanse, enrich, and transform raw data into analytics-ready formats.
- Performance Optimization:
- Monitor and troubleshoot data pipelines.
- Optimize workflows for scalability, performance, and cost-efficiency.
- Security & Compliance:
- Implement data governance policies.
- Ensure compliance with data protection regulations (e.g., GDPR, HIPAA).
- Collaboration & Communication:
- Work closely with data scientists, analysts, and business stakeholders.
- Translate technical work into insights that business stakeholders can act on.
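For context, the sketch below illustrates the kind of ETL pipeline this role builds: an Apache Beam job, runnable locally or on Cloud Dataflow, that reads CSV files from Cloud Storage, applies a simple transformation, and loads the rows into BigQuery. It is a minimal illustration only; the bucket, project, dataset, table, and column names are hypothetical placeholders, not systems referenced by this posting.

```python
"""Minimal Apache Beam ETL sketch (illustrative only).

Assumes hypothetical resources: CSV order files under
gs://example-bucket/orders/ and a BigQuery table
my_project:analytics.orders.
"""
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_order(line: str) -> dict:
    """Turn one CSV line into a BigQuery-ready row (the cleanse/transform step)."""
    order_id, customer_id, amount = line.split(",")
    return {
        "order_id": order_id,
        "customer_id": customer_id,
        "amount": float(amount),
    }


def run() -> None:
    # Runs on the local DirectRunner by default; pass --runner=DataflowRunner
    # (plus project, region, temp_location) to execute on Cloud Dataflow.
    options = PipelineOptions()
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Read CSV from GCS" >> beam.io.ReadFromText(
                "gs://example-bucket/orders/*.csv", skip_header_lines=1)
            | "Parse rows" >> beam.Map(parse_order)
            | "Load to BigQuery" >> beam.io.WriteToBigQuery(
                "my_project:analytics.orders",
                schema="order_id:STRING,customer_id:STRING,amount:FLOAT",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED)
        )


if __name__ == "__main__":
    run()
```

The same pipeline graph runs unchanged as a managed Dataflow job when launched with the DataflowRunner, which is typically how such pipelines would be scheduled and orchestrated (for example from Cloud Composer).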
Required Skills:
- Proficiency in Python and SQL; Java or Scala is a plus.
- Experience with GCP services such as BigQuery, Cloud Dataflow, Cloud Storage, and Cloud Composer.
- Strong understanding of data warehousing, data modeling, and ETL processes.
- Familiarity with real-time data processing and streaming architectures.
- Knowledge of data governance, security, and compliance standards.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- GCP Professional Data Engineer certification (preferred).
- 2+ years of experience in data engineering, preferably in cloud environments.