Job Title: Senior Data Engineer
Experience: 5–8 years
Industry: Pharmaceutical / Biotechnology
Location: Bangalore
Employment Type: Full-Time
Overview
Our client – one of the largest data science companies – is seeking to hire a Senior Data Engineer with 5–8 years of hands-on experience in designing, building, and maintaining scalable, production-grade data platforms and pipelines. The ideal candidate will have a proven track record of delivering robust data solutions on cloud-native ecosystems, along with strong client-facing skills to gather requirements, present technical solutions, and drive stakeholder alignment. Demonstrated expertise across the end-to-end data engineering lifecycle, cloud infrastructure, DevOps practices, and clear communication with both business and technical audiences is essential. Experience with pharmaceutical, biotechnology, or medical device data environments is a strong advantage but not mandatory.
The ideal candidate will embrace the Company's Decision Sciences Lifecycle and ways of working, acting as a trusted client-facing partner to enable long-term business, financial, and operational outcomes. They will design, implement, and support secure, performant data solutions using modern cloud tools, CI/CD pipelines, and a global delivery model. The role involves requirements elicitation, solution architecture presentation, pipeline development, performance optimization, and structured problem-solving in client engagements. Candidates must deliver high-quality, monitored, and well-documented solutions while leading offshore/nearshore team execution through mentoring, code reviews, and day-to-day delivery coordination.
Candidates must have hands-on, recent experience and strong proficiency in the following core data engineering domains:
Cloud-Native Data Pipeline Development & Orchestration
Design and development of scalable ETL/ELT pipelines using Apache Spark (Databricks, Azure Synapse, AWS EMR), Apache Airflow, dbt, Prefect, or equivalent orchestration platforms
Hands-on implementation of batch and streaming ingestion from diverse sources (databases, APIs, flat files, message queues, SaaS platforms, etc.)
Advanced proficiency in Python (PySpark, pandas), SQL, and version-controlled pipeline development (Git) with CI/CD integration (GitHub Actions, Azure DevOps, Jenkins)
Expertise with lakehouse formats (Delta Lake, Apache Iceberg, Hudi), schema evolution, partitioning, incremental processing, and performance tuning
Implementation of data quality frameworks (Great Expectations, Monte Carlo, Deequ), lineage tracking, and automated testing
Client-facing experience translating business requirements into technical designs and presenting pipeline architecture, runbooks, and monitoring dashboards
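To make the incremental-processing and data-quality bullets above concrete, here is a minimal, framework-free Python sketch of the pattern they describe (watermark-based incremental filtering plus a fail-fast quality gate). All names are illustrative; in production this logic would live in PySpark/dbt jobs with a framework such as Great Expectations or Deequ handling the checks.

```python
from datetime import date

def incremental_filter(records, last_watermark):
    """Keep only records newer than the stored watermark (incremental load)."""
    return [r for r in records if r["updated_at"] > last_watermark]

def quality_gate(records, required_keys=("order_id", "updated_at")):
    """Minimal data-quality gate: fail fast on missing required fields.
    A lightweight stand-in for frameworks like Great Expectations or Deequ."""
    bad = [r for r in records if any(r.get(k) is None for k in required_keys)]
    if bad:
        raise ValueError(f"{len(bad)} records failed quality checks")
    return records

# Illustrative batch: one fresh record, one already processed in a prior run.
batch = [
    {"order_id": 1, "updated_at": date(2024, 1, 2)},
    {"order_id": 2, "updated_at": date(2023, 12, 30)},
]
fresh = quality_gate(incremental_filter(batch, date(2024, 1, 1)))
# only the 2024-01-02 record survives the watermark filter
```

The same shape scales up directly: the watermark comes from a state/metadata table, the filter becomes a partition-pruned predicate, and the gate becomes an expectation suite run before publishing to the lakehouse.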
Cloud Infrastructure Management & Data Platform Engineering
Deep hands-on experience with at least one major cloud provider (AWS, Azure, or GCP) and its core data services: S3/ADLS Gen2/GCS, Glue/Athena, Snowflake, Databricks, Synapse, Redshift (including Redshift Serverless)
Infrastructure-as-Code proficiency using Terraform, CloudFormation, or Azure ARM/Bicep for provisioning data lakes, warehouses, networking, IAM, encryption, and logging
Implementation of security and governance controls (VPC peering, KMS encryption, access policies, audit trails) suitable for sensitive data environments
Setup of observability, alerting, and cost governance using CloudWatch, Azure Monitor, Prometheus + Grafana, or equivalent
Containerization (Docker) and orchestration (Kubernetes, ECS/EKS/AKS/GKE) for data services when required
Ability to explain infrastructure decisions and cost implications to non-technical client stakeholders
Additional competencies considered strong advantages
Experience handling pharma, biotech, or medical device data (clinical, RWE, safety, manufacturing, IoT/device telemetry)
Exposure to Agentic AI or LLM-powered implementations in data engineering workflows (e.g., automated metadata enrichment, smart data cataloging, self-healing pipelines)
LLM modeling and GenAI capability (e.g., prompt engineering for data transformation, RAG patterns for documentation, or GenAI-assisted code generation)
Familiarity with Agile methodologies, JIRA, Confluence, and leading client-facing sprint ceremonies