At HG Insights, we lead the way in technology intelligence, delivering AI-driven insights through advanced data science and scalable big data architecture. We're searching for a strong DevOps engineer to support and manage complex cloud operations and database systems. With the recent acquisitions of MadKudu and TrustRadius, we’ve created an agentic GTM ecosystem that eliminates manual handoffs, guesswork, and siloed signals, and we need a strategic seller to take it to market.
What You Will Do
- Manage all aspects of our Google Cloud hosted environment, including Kubernetes, databases, permissions, memory stores, and other external components of our applications, supporting 20+ microservices across development, staging, and production environments
- Design and maintain CI/CD pipelines using GitHub Actions for automated testing, building, and deployment of our monorepo applications, coordinating with the engineering team
- Implement monitoring and alerting systems using tools like Prometheus and custom dashboards for application performance and infrastructure health
- Automate infrastructure provisioning using Terraform and Helm charts for consistent, repeatable deployments across environments
- Manage database infrastructure including PostgreSQL clusters, MongoDB Atlas, Redis, and Elasticsearch with backup, scaling, and disaster recovery strategies
- Ensure security and compliance by implementing secrets management, network policies, and audit logging for SOC2 requirements
What You Will Be Responsible For
- Platform reliability maintaining 99.9% uptime for revenue-critical systems serving enterprise customers
- Infrastructure scalability supporting rapid growth in data processing (10M+ events/day) and user traffic
- Security posture ensuring compliance with enterprise security requirements and data protection standards
- Cost optimization managing cloud spend efficiency while maintaining performance and reliability standards
- Incident response and disaster recovery procedures with minimal downtime during critical failures
What You Will Need
- 8+ years DevOps/SRE experience with Kubernetes, Docker, and cloud infrastructure management
- Strong scripting and automation skills with Python, Bash, and infrastructure-as-code tools
- CI/CD pipeline expertise with GitHub Actions, Jenkins, or similar automation platforms
- Cloud platform proficiency with GCP services including GKE, Cloud SQL, BigQuery, and networking
- Monitoring and observability experience with metrics, logging, alerting, and performance optimization
Nice to Have
- Multi-cloud experience with AWS or Azure for disaster recovery and hybrid deployments
- Database administration experience with PostgreSQL, MongoDB, and managed database services
- Security frameworks and compliance experience with SOC2, GDPR, and enterprise audit requirements
- Data pipeline infrastructure supporting Apache Airflow and large-scale data processing workloads
- GitOps and advanced deployment strategies including blue-green deployments and canary releases