Job Description:
We're currently hiring female candidates for this role, based in Bangalore.
Required Skills:
- 8+ years of hands-on experience in data engineering, with at least 4 years in a lead or architect-level role.
- Deep expertise in Apache Spark, with proven experience developing large-scale distributed data processing pipelines.
- Strong experience with the Databricks platform and its ecosystem (e.g., Delta Lake, Unity Catalog, MLflow, Jobs orchestration, Workspaces, Clusters, Lakehouse architecture).
- Extensive experience with workflow orchestration using Apache Airflow.
- Proficiency in both SQL and NoSQL databases (e.g., Postgres, DynamoDB, MongoDB, Cassandra) with a deep understanding of schema design, query tuning, and data partitioning.
- Proven background in building data warehouse/data mart architectures using AWS services like Redshift, Athena, Glue, Lambda, DMS, and S3.
- Strong programming and scripting ability in Python (preferred) or another language well supported on AWS.
- Solid understanding of data modeling techniques, versioned datasets, and performance tuning strategies.
- Hands-on experience implementing data governance, lineage tracking, data cataloging, and compliance frameworks (GDPR, HIPAA, etc.).
- Experience with real-time data streaming using tools like Kafka, Kinesis, or Flink.
- Working knowledge of MLOps tooling and workflows, including automated model deployment, monitoring, and ML pipeline orchestration.
- Familiarity with MLflow, Feature Store, and Databricks-native ML tooling is a plus.
- Strong grasp of CI/CD for data and ML pipelines, automated testing, and infrastructure-as-code (Terraform, CDK, etc.).
- Excellent communication, leadership, and mentoring skills with a collaborative mindset and the ability to influence across functions.
(ref: hirist.tech)