Role Overview:
We are seeking a seasoned Databricks Senior Platform Engineer who can implement scalable, end-to-end data solutions on Databricks on AWS. This role is pivotal in building the foundational infrastructure for data products, ensuring data quality, and enabling machine learning capabilities.
Key Responsibilities:
- Design and implement end-to-end ETL pipelines in Databricks, sourcing data from Amazon S3, Amazon RDS, and other sources.
- Design and deploy Databricks workspaces and clusters optimized for performance, scalability, and cost-efficiency.
- Set up data quality frameworks and anomaly detection mechanisms to ensure reliability and trust in data products.
- Enable machine learning capabilities within Databricks by configuring necessary services, libraries, and integrations.
- Collaborate with data scientists, engineers, and DevOps teams to ensure seamless platform operations.
- Automate deployment and monitoring using GitHub and CI/CD pipelines.
- Document architecture, workflows, and operational procedures for platform governance.
Requirements:
Core Platform & Cloud:
- 5+ years of experience in platform engineering roles
- Strong expertise in Databricks on AWS
- Proficient with Amazon S3, Amazon RDS, and IAM roles/policies
- Experience with Delta Lake, Structured Streaming, and Databricks Workflows
ETL & Data Quality:
- Hands-on experience building ETL pipelines using Spark
- Familiarity with data validation, profiling, and anomaly detection frameworks
- Knowledge of tools like Great Expectations or custom validation scripts
ML Platform Enablement:
- Experience setting up ML infrastructure, including:
  - MLflow for experiment tracking and model registry
  - Feature Store for reusable feature pipelines
  - Databricks Runtime for ML
  - Integration with AWS SageMaker or custom ML libraries
- Understanding of model lifecycle management, though model development is out of scope
DevOps & Automation:
- Proficiency with GitHub, CI/CD pipelines, and Terraform or CloudFormation
Preferred Qualifications:
- Experience working with healthcare data
- Exposure to data mesh or data product architecture principles
(ref: hirist.tech)