About the Role
As a Data Engineer, you will design, build, and optimize data pipelines and real-time systems that power AI-driven decisioning and analytics.
Responsibilities
- Develop and maintain scalable ETL/ELT pipelines using Python, Airflow, and dbt (see the sketch after this list)
- Build and optimize real-time streaming pipelines using Kafka, RabbitMQ, Spark, and event-driven microservices
- Leverage AI coding agents (Claude Code, Cursor AI, GitHub Copilot) to accelerate data pipeline development, data transformation, and optimization
- Design and manage data storage solutions with PostgreSQL, ClickHouse, and Elasticsearch
- Define and implement data modeling best practices for efficient and scalable data storage
- Ensure high availability, performance, and security of data infrastructure on AWS
- Design and implement BI-ready datasets in ClickHouse optimized for OLAP workloads and reporting use cases
- Develop and manage denormalized schemas and data models to support analytics and amCharts-based reporting
- Optimize query performance, partitioning, and indexing strategies in ClickHouse to handle large-scale analytical workloads
- Implement best practices for data quality, validation, and observability
- Work on data ingestion using Airbyte or other connector frameworks
- Implement data transformation pipelines supporting analytical and ML workloads
- Collaborate with cloud and DevOps teams on containerized deployments (Docker, Kubernetes, AWS)
- Support integration of ML workflows and model APIs into production pipelines
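For a concrete sense of the first responsibility, here is a minimal sketch of a daily ETL DAG, assuming Airflow 2.4+ with the TaskFlow API; the DAG name, task bodies, and sample data are hypothetical placeholders, not this team's actual pipeline.

```python
# Minimal sketch of a daily ETL DAG (assumes Airflow 2.4+ TaskFlow API).
# All names and data below are hypothetical placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_events_etl():
    @task
    def extract() -> list[dict]:
        # Pull raw records from a source system (stubbed here).
        return [{"user_id": 1, "event_type": "click", "value": 1.0}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # Drop incomplete records before loading.
        return [r for r in rows if r.get("value") is not None]

    @task
    def load(rows: list[dict]) -> None:
        # A real pipeline would write to Postgres or ClickHouse here.
        print(f"loaded {len(rows)} rows")

    load(transform(extract()))


daily_events_etl()
```

The TaskFlow style keeps the extract-transform-load dependency chain explicit in plain Python while Airflow handles scheduling, retries, and backfills.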
Qualifications
- 6+ years of hands-on experience in data engineering and distributed systems
Required Skills
- Experience with AI-assisted coding tools (Claude Code, GitHub Copilot, Cursor AI)
- Strong expertise in ClickHouse for analytics, including data modeling, partitioning, indexing, and query optimization (see the sketch below)
- Proven ability to design and manage denormalized OLAP datasets to support BI and reporting workloads
- Solid understanding of ETL/ELT pipelines and experience with tools such as Airbyte, dbt, Airflow, or Python-based transformations
- Proficiency in SQL (advanced queries, performance tuning, window functions, aggregations)
- Hands-on experience with reporting/visualization frameworks (e.g., amCharts or similar libraries)
- Strong foundation in data quality, governance, and validation to ensure reliable analytics
- Strong knowledge of real-time data streaming (Kafka, Spark)
- Experience with relational and analytical databases (Postgres, ClickHouse, Elasticsearch)
- Strong problem-solving, debugging, and cross-team collaboration skills
Preferred Skills
- Exposure to ML model integration and MLOps
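As an illustration of the ClickHouse expertise called out under Required Skills, here is a hedged sketch of a BI-ready, denormalized table with explicit partitioning and sort-key choices, assuming the clickhouse-driver Python client; the database, table, and columns are hypothetical examples, not a real schema from this role.

```python
# Illustrative only: a denormalized, BI-ready ClickHouse table with monthly
# partitions and a sort key matched to common reporting filters.
# Assumes the clickhouse-driver package; all names are hypothetical.
from clickhouse_driver import Client

client = Client(host="localhost")

client.execute("CREATE DATABASE IF NOT EXISTS analytics")
client.execute("""
    CREATE TABLE IF NOT EXISTS analytics.daily_events
    (
        event_date Date,
        user_id    UInt64,
        event_type LowCardinality(String),
        value      Float64
    )
    ENGINE = MergeTree
    PARTITION BY toYYYYMM(event_date)          -- prunes whole months at query time
    ORDER BY (event_type, event_date, user_id) -- matches frequent filter/group-by columns
""")
```

With this layout, a report filtering on event_type over a date range scans only the relevant partitions and sorted ranges rather than the full table, which is the kind of partitioning and indexing trade-off this role would own.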