Department: Data Engineering & AI Solutions
Reports To: Lead Data Solutions Architect
Travel: International travel required (approximately 30–40%)
Position Summary:
We are hiring a senior-level Data Engineer to lead the design, development, and optimization of high-performance data infrastructure that underpins mission-critical AI systems. With 12+ years of experience, you’ll be responsible for implementing robust data pipelines, enforcing governance frameworks, and ensuring end-to-end data accessibility for advanced machine learning and analytics use cases.
This is a strategic and hands-on role that sits at the heart of AI delivery, demanding a combination of deep technical acumen, system architecture expertise, and the ability to collaborate across business, data science, and engineering teams. The ideal candidate combines technical mastery with a consultative mindset and is capable of aligning technical workstreams with business outcomes.
Key Responsibilities:
🔹 Data Pipeline Engineering
- Design, build, and maintain cloud-native, scalable ETL/ELT pipelines for structured and unstructured data ingestion, transformation, and delivery.
- Leverage tools such as Apache Airflow, dbt, Spark, Kafka, and native cloud data services to streamline data flow and reduce processing latency.
- Implement event-driven architectures and real-time data streaming solutions where applicable.
🔹 Data Infrastructure Architecture
- Architect and manage data infrastructure components across cloud environments (AWS, Azure, GCP), including storage, compute, orchestration, and security layers.
- Enable containerized deployment of data services using Docker and Kubernetes, ensuring high availability and scalability of infrastructure.
- Ensure data systems are optimized for AI workloads, including support for large-scale model training and real-time inference.
🔹 Data Modeling & Governance
- Design and implement enterprise-grade data models, schema definitions, and metadata management practices.
- Enforce data governance policies, lineage tracking, access control, and compliance standards (e.g., GDPR, ISO 27001).
- Establish data quality frameworks, including anomaly detection, validation rules, and automated monitoring mechanisms.
🔹 Collaboration & Stakeholder Engagement
- Act as a technical liaison between Data Scientists, ML Engineers, Business Analysts, and senior stakeholders to ensure alignment of data architecture with AI goals.
- Lead technical workshops and provide mentorship to junior engineers and project teams.
- Translate high-level business requirements into detailed technical execution plans.
🔹 Optimization & Monitoring
- Proactively monitor pipeline performance, identify system bottlenecks, and implement enhancements for throughput, latency, and cost-efficiency.
- Set up automated alerts, logging, and dashboarding using observability tools (e.g., Prometheus, Grafana, CloudWatch).
🔹 Documentation & Knowledge Sharing
- Develop and maintain comprehensive documentation covering data flows, infrastructure configuration, architectural decisions, and operational procedures.
- Deliver internal training sessions and contribute to reusable libraries, templates, and engineering standards.
🔹 International Implementation Support
- Travel internationally to work directly with clients and partners, conducting technical assessments, supporting deployments, and providing hands-on engineering leadership in the field.
Required Qualifications:
🔹 Education:
- Bachelor’s degree in Computer Science, Data Engineering, or a related technical discipline is required.
- Master’s degree in Data Science, Information Systems, or Software Engineering is highly preferred.
🔹 Technical Expertise:
- Data Warehousing & Lakehouse: Advanced experience with Snowflake, Redshift, BigQuery, or Delta Lake architectures.
- Cloud Platforms: Deep hands-on expertise in at least one major cloud provider (AWS, Azure, GCP).
- ETL/ELT Tools: Proficient with dbt, Apache Airflow, Informatica, or similar.
- Programming Languages: Strong coding skills in Python (essential), SQL (advanced), and optionally Scala or Java.
- Containerization & Orchestration: Experience with Docker, Kubernetes, and Helm.
- CI/CD Pipelines: Familiarity with tools like GitLab CI, Jenkins, or cloud-native DevOps pipelines.
- Security & Compliance: Knowledge of role-based access control, encryption, data masking, and regulatory compliance frameworks.
🔹 Soft Skills & Leadership:
- Strong interpersonal and communication skills, capable of working across cultures, geographies, and organizational levels.
- Proven ability to lead technical conversations with business context and explain abstract data architecture in understandable terms.
- Demonstrated success in client-facing roles or cross-functional teams.
- High degree of ownership, autonomy, and problem-solving capability.
Preferred/Bonus Qualifications:
- Certifications in AWS/Azure/GCP data or ML specialties.
- Experience in MLOps and AI/ML model deployment lifecycles.
- Background in sectors such as government, energy, or finance.
Company Description
Generative AI solutions are reshaping how we work, and AI Agents are the future. Data-Hat AI helps Enterprises navigate the AI landscape and build profitable, scalable Enterprise AI solutions. As transformation leaders explore the AI landscape, they seek experts to assist in developing solutions and building strategies. And that’s where Data-Hat AI comes in! Guided by industry veteran Kshitij Kumar (KK), who has over two decades of experience introducing and implementing Data and AI solutions at large Enterprises across the US, UK, and Europe, our global team of AI and ML experts designs Enterprise-level AI, GenAI, and AI Agent solutions. We go beyond product development: we collaborate with stakeholders and technology leaders to build a Data and AI strategy, develop a Minimum Viable Product (MVP), and establish ROI. We help Enterprises develop impactful AI solutions.