About the Company :
Everest DX is a Digital Platform Services company headquartered in Stamford. Our platform and solutions include orchestration, intelligent operations with bots, and AI-powered analytics for enterprise IT. Our vision is to enable digital transformation for enterprises, delivering seamless customer experience, business efficiency, and actionable insights through an integrated set of digital technologies.
Digital Transformation Services - We specialize in designing, building, integrating, and managing cloud solutions; modernizing data centers; building cloud-native applications; and migrating existing applications into secure, multi-cloud environments to support digital transformation. Our Digital Platform Services enable organizations to reduce IT resource requirements and improve productivity, while lowering costs and speeding digital transformation.
Digital Platform - Cloud Intelligent Management (CiM) - An autonomous hybrid cloud management platform that works across multi-cloud environments. It helps enterprises get the most out of their cloud strategy while reducing cost and risk and improving speed.
To learn more, please visit: http://www.everestdx.com
Position Overview :
We are seeking a Cloud Databricks Administrator to join our Cloud Engineering team. In this role, you will monitor and maintain Databricks jobs, troubleshoot issues, ensure platform reliability, and manage access controls. You will play a key part in operational support, defect management, and proactive issue resolution to keep our data pipelines running smoothly.
Key Responsibilities :
- Monitor hourly and daily Databricks jobs, investigate failures, and implement fixes to minimize downtime.
- Identify, log, and track defects/bugs through ticketing systems, ensuring timely resolution.
- Manage Databricks access via Azure AD groups with Admin, Edit, and Read permissions.
- Provide production support for Databricks environments, including cluster operations, job failures, and notebook troubleshooting.
- Collaborate with data engineers and platform teams to resolve platform-related incidents and performance bottlenecks.
- Proactively monitor system health, resource utilization, and performance metrics.
- Implement and enforce archival/retention policies for Databricks storage to optimize costs and performance.
- Support CI/CD pipelines (Jenkins, Azure Automation) and automate repetitive operational tasks.
- Maintain technical documentation, SOPs, and runbooks for Databricks operations.
- Ensure security compliance with RBAC, MFA, and encryption best practices.
Preferred Qualifications :
- 3+ years of hands-on experience with Databricks and Apache Spark in production environments.
- Strong knowledge of Azure (AWS/GCP acceptable) and cloud-native services.
- Experience in SRE or production support environments with SLAs and ticketing systems (ServiceNow, Jira).
- Proficiency in Python or Scala for data processing and automation.
- Familiarity with Power BI/Tableau for building monitoring and cost dashboards.
- Knowledge of CI/CD tools, version control (Git), and scripting languages (PowerShell, Bash).
- Understanding of cloud cost optimization and usage tracking.
- Excellent problem-solving, communication, and cross-team collaboration skills.
Nice to Have :
- Experience with Databricks Lakehouse / Medallion architecture.
- Background in monitoring, logging, and incident response for data platforms.
- Exposure to Kubernetes, Docker, and Terraform.
Required Skills :
- Bachelor’s degree in Computer Science, IT, or equivalent professional experience.
- 5+ years in data engineering, cloud operations, or database administration.
- Proven ability to troubleshoot, communicate effectively, and collaborate across teams in a fast-paced environment.