Position Description :
We are seeking a highly skilled and versatile Cloud Data Specialist to join our Data Operations team. Reporting to the Team Lead, Data Operations, the Cloud Data Specialist plays a key role in the development, administration, and support of our Azure-based data platform, with a particular focus on Databricks, data pipeline orchestration using tools like Azure Data Factory (ADF), and environment management using Unity Catalog. A strong foundation in data engineering, cloud data administration, and data governance is essential. Development experience using SQL and Python is required. Knowledge of or experience with Azure API Management (APIM) is nice to have.
Job Responsibilities :
Data Engineering and Platform Management :
- Design, develop, and optimize scalable data pipelines using Azure Databricks and ADF.
- Administer Databricks environments, including user access, clusters, and Unity Catalog for data lineage, governance, and security.
- Support the deployment, scheduling, and monitoring of data workflows and jobs in Databricks and ADF.
- Implement best practices for CI / CD, version control, and operational monitoring for pipeline deployments.
- Implement and manage Delta Lake to ensure reliable, performant, and ACID-compliant data operations (see the sketch below).
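For the Delta Lake item above, a minimal upsert sketch; the table and column names are hypothetical, and it assumes a Databricks context where spark is predefined and Delta is the default table format:

```python
from delta.tables import DeltaTable

# Hypothetical incremental batch of changed records.
updates = spark.read.table("bronze.customers_updates")

# MERGE performs an atomic (ACID) upsert into the target Delta table.
target = DeltaTable.forName(spark, "silver.customers")
(
    target.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```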
Data Modeling and Integration :
- Collaborate with business and data engineering teams to design data models that support analytics and reporting use cases.
- Support integration of data from multiple sources into the enterprise data lake and data warehouse.
- Configure API calls to utilize our Azure APIM platform (see the sketch below).
- Maintain and enhance data quality, structure, and performance within the Lakehouse and warehouse architecture.
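For the APIM item above, a minimal sketch of calling an APIM-fronted endpoint; the gateway URL and API path are hypothetical, and it assumes the gateway authenticates via APIM's conventional subscription-key header:

```python
import os
import requests

# Hypothetical APIM-fronted endpoint; the real gateway URL and path will differ.
APIM_URL = "https://example-gw.azure-api.net/orders/v1/orders"

resp = requests.get(
    APIM_URL,
    # APIM's standard subscription-key header; key sourced from the environment.
    headers={"Ocp-Apim-Subscription-Key": os.environ["APIM_SUBSCRIPTION_KEY"]},
    timeout=30,
)
resp.raise_for_status()
orders = resp.json()
```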
Collaboration and Stakeholder Engagement :
- Work cross-functionally with business units, data scientists, BI analysts, and other stakeholders to understand data requirements.
- Translate technical solutions into business-friendly language and deliver clear documentation and training when required.
Required Technical Expertise :
Apache Spark (on Databricks) :
- Proficient in PySpark and Spark SQL
- Spark optimization techniques (caching, partitioning, broadcast joins; see the sketch after this list)
- Writing and scheduling notebooks / jobs in Databricks
- Understanding of Delta Lake architecture and features
- Working with Databricks Workflows (pipelines and job orchestration)
SQL / Python Programming :
- Handling JSON, XML, and other semi-structured formats
- Experience with API integration using requests, http, and similar libraries
- Error handling and logging
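For the broadcast-join item above, a minimal illustration; the table and column names are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.getOrCreate()

# Hypothetical tables: a large fact table and a small dimension table.
fact = spark.read.table("sales.fact_orders")
dim = spark.read.table("sales.dim_region")

# Broadcasting the small dimension table avoids shuffling the large fact
# table by shipping the dimension to every executor.
joined = fact.join(broadcast(dim), on="region_id", how="left")

# Caching pays off when the joined result feeds multiple downstream actions.
joined.cache()
```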
API Ingestion :
- Designing and implementing ingestion pipelines for RESTful APIs
- Transforming and loading JSON responses into Spark tables (see the sketch below)
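A minimal sketch of the JSON-to-table step; the endpoint and target table are hypothetical, and it assumes the API returns a JSON array of flat records:

```python
import requests
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical REST endpoint returning a JSON array of order records.
resp = requests.get("https://api.example.com/v1/orders", timeout=30)
resp.raise_for_status()

# Parse the JSON payload and let Spark infer the schema from the records.
df = spark.createDataFrame(resp.json())

# Land the raw records in a (hypothetical) bronze table for downstream transforms.
df.write.mode("append").saveAsTable("bronze.orders_raw")
```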
Cloud & Data Platform Skills :
Databricks on Azure :
- Cluster configuration and management
- Unity Catalog features (optional but good to have)
Azure Data Factory :
- Creating and managing pipelines for orchestration
- Linked services and datasets for ADLS, Databricks, and SQL Server
- Parameterized and dynamic ADF pipelines
- Triggering Databricks notebooks from ADF (see the sketch below)
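When ADF triggers a notebook, pipeline parameters typically arrive as notebook widgets (via the Notebook activity's baseParameters). A minimal sketch of the receiving side, assuming a Databricks notebook where dbutils and spark are predefined; the parameter and table names are hypothetical:

```python
# Declare a widget with a default so the notebook also runs interactively;
# when ADF triggers the notebook, its baseParameters override the default.
dbutils.widgets.text("run_date", "2024-01-01")
run_date = dbutils.widgets.get("run_date")

df = spark.read.table("bronze.orders_raw")  # hypothetical source table
daily = df.filter(df.order_date == run_date)
daily.write.mode("overwrite").saveAsTable("silver.orders_daily")  # hypothetical target
```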
Data Engineering Foundations :
- Data modeling and warehousing concepts
- ETL / ELT design patterns
- Data validation and quality checks (see the sketch below)
- Working with structured and semi-structured data (JSON, Parquet, Avro)
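For the data validation item above, a minimal sketch of a pipeline-level quality gate; the path and key column are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical bronze dataset landed as Parquet.
df = spark.read.parquet("/mnt/lake/bronze/customers")

# Two basic checks on the (hypothetical) business key: no nulls, no duplicates.
null_keys = df.filter(F.col("customer_id").isNull()).count()
dup_keys = df.groupBy("customer_id").count().filter(F.col("count") > 1).count()

# Fail the pipeline loudly rather than propagating bad data downstream.
if null_keys or dup_keys:
    raise ValueError(f"Quality check failed: {null_keys} null keys, {dup_keys} duplicated keys")
```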
DevOps & CI / CD :
- Git / GitHub for version control
- CI / CD using Azure DevOps or GitHub Actions for Databricks jobs (see the sketch below)
- Infrastructure-as-code (Terraform for Databricks or ADF)
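As one illustration of a CI / CD step, a pipeline can trigger a deployed Databricks job through the Jobs 2.1 REST API; the job ID is hypothetical, and the workspace URL and token are assumed to be injected as pipeline secrets:

```python
import os
import requests

# Workspace URL and PAT injected by the CI system (e.g., as pipeline secrets).
host = os.environ["DATABRICKS_HOST"]   # e.g. https://adb-<workspace-id>.azuredatabricks.net
token = os.environ["DATABRICKS_TOKEN"]

# Trigger an existing job by ID via the Jobs 2.1 run-now endpoint.
resp = requests.post(
    f"{host}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {token}"},
    json={"job_id": 123},  # hypothetical job id
    timeout=30,
)
resp.raise_for_status()
print("Started run:", resp.json()["run_id"])
```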
Additional Requirements :
- Bachelor's degree in computer science, information systems, or a related field.
- 4+ years of experience in a cloud data engineering, data platform, or analytics engineering role.
- Familiarity with data governance, security principles, and data quality best practices.
- Excellent analytical thinking and problem-solving skills.
- Strong communication skills and the ability to work collaboratively with technical and non-technical stakeholders.
- Microsoft certification in Azure Data Engineering, Power Platform, or a related field is desired.
- Experience with Azure APIM is nice to have.
- Knowledge of enterprise data architecture and data warehouse principles (e.g., dimensional modeling) is an asset.
Job Demands and / or Physical Requirements :
As Seaspan is a global company, occasional work outside of regular office hours may be required.