About the job:

Location: Remote
Project start: ASAP
Workload:

Responsibilities and Qualifications:
- Advise in architectural discussions and workshops with customers; understand their business and technical requirements to create the desired technical architectures and solutions on the cloud around data engineering, data lakes, data lakehouses, BI and ML/AI.
- Perform requirements scoping exercises with project and use case stakeholders.
- Translate requirements into desired technical solution design.
- Carry out PoCs, build prototypes and MVPs for innovative new solutions, and perform technology scouting on cloud (Azure) and big data technologies.
- Participate in hands-on technical project work, including actual project implementation tasks and some production support tasks (e.g. monitoring) as required.
- Evaluate and implement platform cost optimizations.
- Create and maintain technical documentation for the use cases, solutions and data platform.
- Perform analysis of best practices and emerging concepts in cloud-based technologies, with a special focus on Data and Analytics in the cloud.

Musts:
- In-depth knowledge of Apache Spark and experience in optimizing and performance-tuning Apache Spark data processing jobs (a sketch of this kind of work follows this list)
- In-depth knowledge of Delta Lake
- In-depth knowledge of Data Engineering on (Azure) Databricks
- Strong hands-on experience in Python programming and in writing complex SQL queries
- Strong hands-on experience in building complex data pipelines in Azure Data Factory
- Strong hands-on experience in the following Azure Data Services: ADLS Gen2, Synapse Serverless
- Experience in architecting and building enterprise-grade data platforms on the cloud and developing big data solution architectures, preferably on Azure, incl. gathering requirements and mapping them to technical architectures
- Awareness of best practices for selecting a component mix of cloud services (e.g. ADF, Databricks, Synapse)
- Ability to assess pros and cons of architecture variations (e.g. Databricks vs. Snowflake vs. MS Fabric, Synapse vs. MS Fabric Lakehouse, Databricks vs. open-source Spark)
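The following is a minimal, illustrative sketch (not part of the posting) of the Spark/Delta work these must-haves describe: column pruning, a broadcast join, an explicit shuffle setting, a window-function SQL query, and a partitioned Delta write. All paths, table and column names are hypothetical.

```python
# Illustrative only: hypothetical paths, table and column names; assumes the
# delta-spark package is available (it is pre-installed on Databricks, where a
# SparkSession named `spark` also already exists).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()
spark.conf.set("spark.sql.shuffle.partitions", "64")  # size shuffles to the data volume

# Hypothetical Delta tables on ADLS Gen2.
base = "abfss://lake@acct.dfs.core.windows.net"
orders = spark.read.format("delta").load(f"{base}/silver/orders")
customers = spark.read.format("delta").load(f"{base}/silver/customers")

daily = (
    orders.select("order_id", "customer_id", "amount", "order_date")  # prune columns early
    # Small dimension table -> broadcast join, avoiding a full shuffle.
    .join(F.broadcast(customers.select("customer_id", "segment")), "customer_id")
    .groupBy("segment", "order_date")
    .agg(F.sum("amount").alias("revenue"))
)

# "Complex SQL" in practice often means window functions, e.g. top segments per day.
daily.createOrReplaceTempView("daily_revenue")
top3 = spark.sql("""
    SELECT segment, order_date, revenue,
           RANK() OVER (PARTITION BY order_date ORDER BY revenue DESC) AS rnk
    FROM daily_revenue
""").where("rnk <= 3")

# Partition the Delta output so downstream readers can prune files by date.
top3.write.format("delta").mode("overwrite").partitionBy("order_date") \
    .save(f"{base}/gold/top_segments_daily")
```

On Databricks, such a table would typically also be registered in a catalog and compacted (e.g. with OPTIMIZE); that is omitted here for brevity.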
Required:

Multi-year (>6 years) experience working in a data engineer role, incl.:
- Expertise in designing, building and maintaining large-scale data pipelines, as well as processing (transforming, aggregating, wrangling) data.
- Proficiency with streaming technologies like Kafka, Spark Structured Streaming or equivalent cloud services (a sketch follows below).
- Good know-how of cloud computing concepts and of the Azure cloud platform, including its networking, security and monitoring aspects.
- Hands-on experience in using and applying IaC, CI/CD and DevOps practices in real data analytics projects, preferably using Azure DevOps and Terraform.
- Knowledge of Microsoft Fabric is an added advantage.

Skills: Azure, Cloud, Spark
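Below is a minimal sketch, not taken from the posting, of the streaming requirement above: consuming JSON events from Kafka with Spark Structured Streaming and landing them in a Delta table. The broker address, topic name, schema and paths are all assumptions.

```python
# Illustrative only: hypothetical broker, topic and paths; the Kafka source
# requires the spark-sql-kafka-0-10 package on the cluster, and the Delta sink
# requires delta-spark (both available on Databricks).
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

# Assumed event schema for JSON payloads on the topic.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("ts", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "orders")                     # hypothetical topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Checkpointing makes the stream restartable with exactly-once delivery into Delta.
query = (
    events.writeStream.format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/orders")
    .outputMode("append")
    .start("/tmp/delta/orders")
)
query.awaitTermination()
```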
(ref: hirist.tech)