Talent.com
Senior Data Engineer

Insight Global · Amritsar, Punjab, IN
1 day ago
Job description

Summary

The Senior Data Engineer is responsible for building and optimizing ETL / ELT pipelines that process terabytes of data daily across 186 data assets, implementing BigQuery datasets with enterprise-scale performance optimization, and creating the data quality monitoring and transparency dashboards that enable data owner self-service.

Required Qualifications

Google Cloud Platform Data Engineering

  • 5+ years of data engineering experience with at least 2+ years focused on Google Cloud Platform
  • Strong proficiency with BigQuery, including:
      • Advanced SQL for analytical queries (window functions, CTEs, complex joins)
      • Partitioning and clustering strategies for performance optimization
      • Materialized views, authorized views, and query optimization techniques
      • Cost optimization through efficient query design and storage management
      • Understanding of BigQuery architecture (slot allocation, shuffle operations, distributed execution)
  • Hands-on experience with Google Cloud Dataflow and Apache Beam:
      • Pipeline development in Python or Java
      • Batch and streaming data processing patterns
      • Performance tuning and resource optimization
      • Error handling and pipeline monitoring
  • Proficiency with Cloud Composer (Apache Airflow):
      • DAG development and dependency management
      • Airflow operators (BigQueryOperator, DataflowOperator, custom operators)
      • Workflow orchestration for complex multi-step processes
      • Monitoring and troubleshooting failed workflows
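
The Dataflow / Apache Beam items above revolve around a few core batch patterns: element-wise transforms, a shuffle, and per-key aggregation. A minimal sketch of those patterns in plain Python — Beam itself is not assumed to be installed here, so the function names only mirror the Beam SDK's vocabulary (ParDo, GroupByKey, CombinePerKey), and the toy records are illustrative:

```python
from collections import defaultdict

def par_do(records, fn):
    """Element-wise transform, analogous to beam.Map / beam.ParDo."""
    return [fn(r) for r in records]

def group_by_key(pairs):
    """Shuffle step, analogous to beam.GroupByKey."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

def combine_per_key(grouped, combiner):
    """Per-key aggregation, analogous to beam.CombinePerKey."""
    return {k: combiner(vs) for k, vs in grouped.items()}

# Toy batch run: total bytes processed per dataset.
events = [
    {"dataset": "claims", "bytes": 120},
    {"dataset": "labs", "bytes": 300},
    {"dataset": "claims", "bytes": 80},
]
pairs = par_do(events, lambda e: (e["dataset"], e["bytes"]))
totals = combine_per_key(group_by_key(pairs), sum)
# totals == {"claims": 200, "labs": 300}
```

In a real Beam pipeline the same stages would be chained with the `|` operator on a PCollection and executed on the Dataflow runner.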

ETL / ELT & Data Integration

  • Strong experience building production-grade ETL / ELT pipelines processing terabyte-scale data
  • Knowledge of data integration patterns (full refresh, incremental load, change data capture)
  • Experience with data transformation techniques (normalization, denormalization, aggregation)
  • Understanding of data quality frameworks and validation strategies
  • Proficiency with handling schema evolution without breaking downstream systems
  • Experience with data lineage tracking from source to consumption
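
Of the integration patterns listed above, incremental load with change data capture is the one most commonly sketched out. A hypothetical pure-Python illustration of the upsert logic — the table shapes and the `updated_at` column are assumptions for the example, and a production pipeline would express this as a BigQuery MERGE statement instead:

```python
def merge_incremental(target, changes, key="id", ts="updated_at"):
    """Upsert changed rows into target, keeping the newest version of each key.

    target and changes are lists of dicts (stand-ins for table rows).
    """
    merged = {row[key]: row for row in target}
    for row in changes:
        existing = merged.get(row[key])
        # Apply the change only if the key is new or the row is fresher.
        if existing is None or row[ts] > existing[ts]:
            merged[row[key]] = row
    return list(merged.values())

target = [{"id": 1, "name": "a", "updated_at": 1},
          {"id": 2, "name": "b", "updated_at": 1}]
changes = [{"id": 2, "name": "b2", "updated_at": 2},   # update
           {"id": 3, "name": "c",  "updated_at": 2}]   # insert
result = merge_incremental(target, changes)
```

A full refresh would instead replace `target` wholesale; the incremental pattern only touches rows that changed, which is what keeps terabyte-scale loads affordable.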

SQL & Database Technologies

  • Expert-level SQL skills including advanced analytics functions and query optimization
  • Understanding of database performance tuning (indexing, partitioning, query plans)
  • Experience with relational databases (PostgreSQL, MySQL, SQL Server) for source system integration
  • Familiarity with NoSQL databases (Firestore, Bigtable) for specialized use cases
  • Knowledge of data warehousing concepts (fact tables, dimension tables, slowly changing dimensions)
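
The "advanced analytics functions" above typically means window functions combined with CTEs. A minimal, runnable illustration using the standard library's in-memory SQLite as a stand-in for BigQuery (the `visits` table is invented for the example; the CTE and window-function syntax shown is standard SQL, though BigQuery adds its own extensions, and SQLite only supports window functions from version 3.25):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (patient_id INT, visit_date TEXT, cost REAL);
INSERT INTO visits VALUES
  (1, '2024-01-05', 100.0),
  (1, '2024-02-10', 150.0),
  (2, '2024-01-20', 200.0);
""")
rows = conn.execute("""
WITH ranked AS (                        -- CTE
  SELECT patient_id,
         visit_date,
         cost,
         ROW_NUMBER() OVER (            -- window function
           PARTITION BY patient_id
           ORDER BY visit_date DESC
         ) AS rn
  FROM visits
)
SELECT patient_id, visit_date, cost
FROM ranked
WHERE rn = 1                            -- latest visit per patient
ORDER BY patient_id
""").fetchall()
```

The same shape of query (deduplicating to the latest row per key) is a staple of incremental-load validation in a warehouse.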

Programming & Scripting

  • Strong Python proficiency for data pipeline development and scripting
  • Experience with Apache Beam SDK for Dataflow pipeline development
  • Proficiency with Pandas, NumPy for data manipulation and analysis
  • Understanding of object-oriented programming and software engineering best practices
  • Experience with Git for version control and collaborative development
  • Basic Shell scripting for automation and operational tasks

Data Security & Compliance

  • Understanding of row-level security implementation patterns in BigQuery
  • Experience with PHI / PII data handling and healthcare compliance requirements (HIPAA preferred)
  • Knowledge of data masking and de-identification techniques
  • Understanding of audit logging and compliance reporting requirements
  • Familiarity with least privilege principles and data access controls
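
De-identification of direct identifiers is often implemented as deterministic keyed hashing, so joins across tables still work after masking. A minimal standard-library sketch — the field names and salt handling are illustrative assumptions, not a compliance-grade HIPAA design (real deployments would use a managed service such as the Cloud DLP API):

```python
import hashlib
import hmac

def pseudonymize(value: str, salt: bytes) -> str:
    """Deterministic keyed hash (HMAC-SHA256) of a PII value.

    The salt must stay secret; otherwise common values (names, MRNs)
    can be recovered by brute force.
    """
    return hmac.new(salt, value.encode("utf-8"), hashlib.sha256).hexdigest()

def mask_record(record: dict, pii_fields: set, salt: bytes) -> dict:
    """Replace PII fields with pseudonyms; pass everything else through."""
    return {k: pseudonymize(v, salt) if k in pii_fields else v
            for k, v in record.items()}

# Hypothetical row: only the medical record number is treated as PII.
row = {"mrn": "123-45-678", "state": "PB", "age": 41}
masked = mask_record(row, {"mrn"}, salt=b"demo-secret")
```

Because the hash is deterministic for a given salt, the masked `mrn` can still serve as a join key across de-identified tables.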

Preferred Qualifications

  • Google Cloud Professional Data Engineer certification
  • Healthcare industry experience with understanding of clinical and administrative data
  • Experience with Google Cloud Storage lifecycle policies and storage class optimization
  • Knowledge of Cloud Spanner for transactional workloads
  • Familiarity with Cloud DLP API for automated data classification
  • Experience with dbt (data build tool) for analytics engineering
  • Understanding of data mesh or data fabric architectural patterns
  • Background in DevOps practices and CI / CD for data pipelines
  • Experience with Terraform for infrastructure as code
  • Knowledge of data visualization tools (Looker, Tableau, Power BI)
  • Familiarity with machine learning workflows on GCP (Vertex AI)
  • Experience with Docker and Kubernetes (GKE) for containerized workloads