Talent.com
No longer accepting applications
Senior Data Engineer

Insight Global · Tirupati, Andhra Pradesh, IN
10 days ago
Job description

Summary

The Senior Data Engineer is responsible for building and optimizing ETL / ELT pipelines that process terabytes of data daily across 186 data assets, implementing BigQuery datasets with enterprise-scale performance optimization, and creating the data quality monitoring and transparency dashboards that enable data owner self-service.

Required Qualifications

Google Cloud Platform Data Engineering

  • 5+ years of data engineering experience, with at least 2 years focused on Google Cloud Platform
  • Strong proficiency with BigQuery, including:
      • Advanced SQL for analytical queries (window functions, CTEs, complex joins)
      • Partitioning and clustering strategies for performance optimization
      • Materialized views, authorized views, and query optimization techniques
      • Cost optimization through efficient query design and storage management
      • Understanding of BigQuery architecture (slot allocation, shuffle operations, distributed execution)
  • Hands-on experience with Google Cloud Dataflow and Apache Beam:
      • Pipeline development in Python or Java
      • Batch and streaming data processing patterns
      • Performance tuning and resource optimization
      • Error handling and pipeline monitoring
  • Proficiency with Cloud Composer (Apache Airflow):
      • DAG development and dependency management
      • Airflow operators (BigQueryOperator, DataflowOperator, custom operators)
      • Workflow orchestration for complex multi-step processes
      • Monitoring and troubleshooting failed workflows
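One recurring pattern behind the BigQuery SQL bullets above is window-function deduplication: keep only the latest row per key, typically written as `ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ... DESC)`. A minimal sketch of the same logic in plain Python, with illustrative field names (`id`, `updated_at`) that are assumptions, not part of the posting:

```python
# Hypothetical sketch: the "keep the latest row per key" dedup that
# ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) ... WHERE rn = 1
# expresses in BigQuery SQL, mirrored in plain Python for illustration.
from datetime import datetime

def latest_per_key(rows, key="id", ts="updated_at"):
    """Keep only the most recent row for each key (window-function dedup)."""
    best = {}
    for row in rows:
        k = row[key]
        if k not in best or row[ts] > best[k][ts]:
            best[k] = row
    return list(best.values())

rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1), "status": "new"},
    {"id": 1, "updated_at": datetime(2024, 2, 1), "status": "active"},
    {"id": 2, "updated_at": datetime(2024, 1, 15), "status": "new"},
]
deduped = latest_per_key(rows)  # one row per id, the most recent each
```

In BigQuery itself the SQL version is usually preferable, since it pushes the work into the engine's distributed execution rather than client code.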

ETL / ELT & Data Integration

  • Strong experience building production-grade ETL / ELT pipelines processing terabyte-scale data
  • Knowledge of data integration patterns (full refresh, incremental load, change data capture)
  • Experience with data transformation techniques (normalization, denormalization, aggregation)
  • Understanding of data quality frameworks and validation strategies
  • Proficiency with schema evolution: handling changes without breaking downstream systems
  • Experience with data lineage tracking from source to consumption
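The integration patterns listed above (incremental load, change data capture) boil down to applying a batch of change records to a target table image. A minimal sketch, where the record shape and the op names ("upsert" / "delete") are illustrative assumptions rather than a real CDC format:

```python
# Hypothetical sketch of a CDC apply step: merge a batch of change records
# into a target keyed by primary key. "upsert"/"delete" op names and the
# record layout are assumptions for illustration only.
def apply_changes(target, changes, key="id"):
    """Apply CDC records in order to a dict keyed by primary key."""
    for change in changes:
        if change["op"] == "delete":
            target.pop(change[key], None)        # row removed at source
        else:  # "upsert": insert new rows, overwrite changed ones
            target[change[key]] = {k: v for k, v in change.items() if k != "op"}
    return target

target = {1: {"id": 1, "name": "alice"}}
changes = [
    {"op": "upsert", "id": 1, "name": "alicia"},  # update existing row
    {"op": "upsert", "id": 2, "name": "bob"},     # insert new row
    {"op": "delete", "id": 1},                    # later deletion wins
]
state = apply_changes(target, changes)
```

Applying changes strictly in source order is what makes the result deterministic; in BigQuery the same step is commonly expressed as a `MERGE` statement.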

SQL & Database Technologies

  • Expert-level SQL skills including advanced analytics functions and query optimization
  • Understanding of database performance tuning (indexing, partitioning, query plans)
  • Experience with relational databases (PostgreSQL, MySQL, SQL Server) for source system integration
  • Familiarity with NoSQL databases (Firestore, Bigtable) for specialized use cases
  • Knowledge of data warehousing concepts (fact tables, dimension tables, slowly changing dimensions)
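Of the warehousing concepts above, slowly changing dimensions are the one that most often needs working code: a Type 2 update closes out the current version of a changed row and appends a new one. A minimal sketch; the column names (`valid_from`, `valid_to`, `is_current`) are common conventions assumed for illustration:

```python
# Hypothetical sketch of a Type 2 slowly changing dimension upsert:
# expire the current version of a changed row, append the new version.
# Column names are conventional assumptions, not from the posting.
from datetime import date

def scd2_upsert(dim_rows, incoming, key, today):
    """Append-only history: close the old version, insert the new one."""
    for row in dim_rows:
        if row[key] == incoming[key] and row["is_current"]:
            if all(row.get(c) == incoming.get(c) for c in incoming):
                return dim_rows                  # no change: nothing to do
            row["valid_to"] = today              # close the old version
            row["is_current"] = False
    dim_rows.append(dict(incoming, valid_from=today, valid_to=None,
                         is_current=True))
    return dim_rows

dim = [{"cust_id": 7, "city": "Tirupati", "valid_from": date(2023, 1, 1),
        "valid_to": None, "is_current": True}]
dim = scd2_upsert(dim, {"cust_id": 7, "city": "Chennai"},
                  key="cust_id", today=date(2024, 6, 1))
```

The history row keeps its original `valid_from`, so point-in-time joins against the fact table still resolve to the version that was current at that date.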

Programming & Scripting

  • Strong Python proficiency for data pipeline development and scripting
  • Experience with Apache Beam SDK for Dataflow pipeline development
  • Proficiency with Pandas, NumPy for data manipulation and analysis
  • Understanding of object-oriented programming and software engineering best practices
  • Experience with Git for version control and collaborative development
  • Basic Shell scripting for automation and operational tasks
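The software-engineering bullets above amount to writing pipelines as small, pure, individually testable steps rather than one monolithic script. A minimal sketch of that structure; the step names are illustrative only:

```python
# Hypothetical sketch of a pipeline composed of small, pure transform steps,
# each a list -> list function that can be unit-tested in isolation.
from functools import reduce

def clean_nulls(rows):
    """Drop rows missing the required key field."""
    return [r for r in rows if r.get("id") is not None]

def normalize_names(rows):
    """Trim and lowercase the name field."""
    return [dict(r, name=r["name"].strip().lower()) for r in rows]

def pipeline(rows, steps):
    """Apply each step in order to the row list."""
    return reduce(lambda acc, step: step(acc), steps, rows)

raw = [{"id": 1, "name": "  Alice "}, {"id": None, "name": "ghost"}]
clean = pipeline(raw, [clean_nulls, normalize_names])
```

The same decomposition maps directly onto Apache Beam, where each step becomes a `ParDo` or `Map` transform in the pipeline graph.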

Data Security & Compliance

  • Understanding of row-level security implementation patterns in BigQuery
  • Experience with PHI / PII data handling and healthcare compliance requirements (HIPAA preferred)
  • Knowledge of data masking and de-identification techniques
  • Understanding of audit logging and compliance reporting requirements
  • Familiarity with least privilege principles and data access controls
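Two of the de-identification techniques named above can be sketched briefly: deterministic pseudonymization (a salted hash, so the token is stable and joins across tables still work) and partial masking for display. Field names and the salt value are illustrative assumptions:

```python
# Hypothetical sketch of two common de-identification techniques:
# deterministic pseudonymization and partial masking. Not a substitute
# for a vetted tool such as Cloud DLP in a real HIPAA environment.
import hashlib

def pseudonymize(value, salt):
    """Deterministic, irreversible token: same input -> same token."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:16]

def mask_email(email):
    """Keep the first character and the domain, mask the rest."""
    local, _, domain = email.partition("@")
    return local[0] + "***@" + domain

token = pseudonymize("patient-123", salt="per-dataset-secret")
masked = mask_email("jane.doe@example.com")  # -> "j***@example.com"
```

Keeping the salt per dataset (and out of version control) is what prevents tokens from being linked across unrelated datasets.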

Preferred Qualifications

  • Google Cloud Professional Data Engineer certification
  • Healthcare industry experience with understanding of clinical and administrative data
  • Experience with Google Cloud Storage lifecycle policies and storage class optimization
  • Knowledge of Cloud Spanner for transactional workloads
  • Familiarity with Cloud DLP API for automated data classification
  • Experience with dbt (data build tool) for analytics engineering
  • Understanding of data mesh or data fabric architectural patterns
  • Background in DevOps practices and CI / CD for data pipelines
  • Experience with Terraform for infrastructure as code
  • Knowledge of data visualization tools (Looker, Tableau, Power BI)
  • Familiarity with machine learning workflows on GCP (Vertex AI)
  • Experience with Docker and Kubernetes (GKE) for containerized workloads