Job Description: Data Engineer
As a Data Engineer, you will own the end-to-end lifecycle of our data infrastructure. You will
design and implement robust, scalable data pipelines and architect modern data solutions
using a best-in-class technology stack. Your work will transform raw, messy data into clean,
reliable, and actionable data products that power decision-making across the business.
You’ll collaborate cross-functionally with product managers, data analysts, data scientists,
and software engineers to understand data needs and deliver high-performance data
solutions. Your impact will be measured by how effectively data is delivered, modeled, and
leveraged to drive business outcomes.
Key Responsibilities:
- Architect & Build: Design, implement, and manage a cloud-based data platform using a modern ELT (Extract, Load, Transform) approach.
- Data Ingestion: Develop and maintain robust data ingestion pipelines from a variety of sources, including operational databases (MongoDB, RDS), real-time IoT streams, and third-party APIs, using services like AWS Kinesis / Lambda or Azure Event Hubs / Functions.
- Data Lake Management: Build and manage a scalable and cost-effective data lake on AWS S3 or Azure Data Lake Storage (ADLS Gen2), using open table formats like Apache Iceberg or Delta Lake.
- Data Transformation: Develop, test, and maintain complex data transformation models using dbt. Champion a software engineering mindset by applying principles of version control (Git), CI / CD, and automated testing to all data logic.
- Orchestration: Implement and manage data pipeline orchestration using modern tools like Dagster, Apache Airflow, or Azure Data Factory.
- Data Quality & Governance: Establish and enforce data quality standards. Implement automated testing and monitoring to ensure the reliability and integrity of all data assets.
- Performance & Cost Optimization: Continuously monitor and optimize the performance and cost of the data platform, ensuring our serverless query engines and storage layers are used efficiently.
- Collaboration: Work closely with data analysts and business stakeholders to understand their needs, model data effectively, and deliver datasets that power our BI tools (Metabase, Power BI).
Required Skills & Experience (Must-Haves):
- 3+ years of professional experience in a data engineering role.
- Expert-level proficiency in SQL and the ability to write complex, highly performant queries.
- Proven experience building and maintaining data pipelines in Python, including proficiency with Python-based data-cleaning packages and tools.
- Hands-on experience building data solutions on a major cloud provider (AWS or Azure), utilizing core services like AWS S3 / Glue / Athena or Azure ADLS / Data Factory / Synapse.
- Experience with NoSQL databases like MongoDB, including an understanding of its data modeling, aggregation framework, and query patterns.
- Deep understanding of data warehousing concepts, including dimensional modeling, star / snowflake schemas, and data modeling best practices.
- Hands-on experience with modern data transformation tools, specifically dbt.
- Familiarity with data orchestration tools like Apache Airflow, Dagster, or Prefect.
- Proficiency with Git and experience working with CI / CD pipelines for data projects.

Preferred Skills & Experience (Nice-to-Haves):
- Experience with real-time data streaming technologies, specifically AWS Kinesis or Azure Event Hubs.
- Experience with data cataloging and governance tools (e.g., OpenMetadata, DataHub, Microsoft Purview).
- Knowledge of infrastructure-as-code tools like Terraform or CloudFormation.
- Experience with containerization technologies (Docker, Kubernetes).