About the Role :
We are seeking a skilled Data Engineer to join our team and contribute to our cutting-edge data platform built on a medallion data lake architecture. You will design, develop, and maintain robust data pipelines while collaborating on innovative AI engineering projects. This role combines traditional data engineering responsibilities with emerging AI capabilities, leveraging modern tools and cloud technologies to drive impactful solutions.
Key Responsibilities :
- Design, build, and optimize scalable data pipelines using Airflow, dbt, and SQL to process and transform data in our medallion architecture (bronze, silver, gold layers).
- Manage and maintain data storage solutions using Snowflake and AWS S3, ensuring performance, reliability, and cost efficiency.
- Implement infrastructure as code using Terraform to provision and manage AWS resources such as Lambda, S3, and other cloud services.
- Work with Delta tables and Apache Spark to handle large-scale data processing and ensure data quality and consistency.
- Collaborate with AI/ML teams to integrate data pipelines with AI engineering workflows, supporting common LLM patterns and architectures (e.g., RAG, fine-tuning, prompt engineering).
- Monitor and troubleshoot data pipelines, ensuring high availability, data integrity, and efficient performance.
- Contribute to the evolution of our data architecture, incorporating best practices for scalability, security, and governance.
- Stay updated on industry trends and emerging technologies in data engineering and AI to drive continuous improvement.
Skills and Qualifications :
- Proficiency in SQL and experience with Snowflake for data warehousing and querying.
- Hands-on experience with Airflow for workflow orchestration and dbt for data transformation.
- Strong knowledge of AWS services (e.g., S3, Lambda) and infrastructure management using Terraform.
- Familiarity with Delta tables and Apache Spark for large-scale data processing.
- Familiarity with medallion data lake architecture and best practices for data modeling and pipeline design.
- Basic understanding of AI engineering concepts, including embeddings, vector stores, and common LLM patterns and architectures (e.g., retrieval-augmented generation, model fine-tuning).
- Strong problem-solving skills and the ability to work in a fast-paced, collaborative environment.
- Excellent communication skills to partner with cross-functional teams, including data scientists and AI engineers.
(ref : hirist.tech)