DevOps Engineer

Octopos, Aligarh, IN
Job description

We're seeking an experienced DevOps / Site Reliability Engineer to join our team and take ownership of testing, deployment, and infrastructure operations for Octopos, our multi-platform point-of-sale SaaS solution. You'll be responsible for building robust CI/CD pipelines, managing our database infrastructure, and ensuring high availability for our retail customers, who depend on us 24/7. This is a fully remote position.

CI/CD & Deployment Pipeline

  • Design and implement comprehensive CI/CD pipelines for our diverse tech stack (React, Laravel, Node.js, React Native)
  • Manage multi-platform deployments, including web, Android (Capacitor), and Windows (Electron)
  • Manage Google Play Store releases, including APK/AAB uploads, versioning, and staged rollouts
  • Handle App Store submissions and TestFlight distributions
  • Create and maintain staging environments that accurately mirror production
  • Implement automated testing strategies across all applications
  • Establish deployment rollback procedures and blue-green deployment strategies

Infrastructure & Monitoring

  • Implement and maintain comprehensive monitoring using Grafana dashboards and alerting
  • Set up centralized logging infrastructure (ELK stack or similar) for all applications
  • Monitor and maintain production servers ensuring 99.9% uptime for POS operations
  • Design custom metrics and KPIs specific to POS operations (transaction success rates, hardware connectivity)
  • Manage incident response and on-call rotations
  • Optimize application performance and resource utilization
  • Ensure infrastructure security and meet PCI compliance requirements
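
For context, the 99.9% uptime target in the list above implies a concrete downtime budget. The following is a minimal Python sketch of that arithmetic. The function name `downtime_budget_minutes` and the 30-day and 365-day windows are illustrative assumptions and are not specified anywhere in this posting.

```python
def downtime_budget_minutes(availability_pct: float, window_days: float) -> float:
    """Return the downtime (in minutes) allowed by an availability target.

    Both the function name and the choice of window are hypothetical;
    the posting only states the 99.9% figure.
    """
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Assumed reporting windows: a 30-day month and a 365-day year.
    print(round(downtime_budget_minutes(99.9, 30), 1))   # 43.2 minutes
    print(round(downtime_budget_minutes(99.9, 365), 1))  # 525.6 minutes
```

Under these assumptions, 99.9% leaves roughly 43 minutes of downtime per 30-day month, or just under 9 hours per year.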

Database Management

  • Design and implement a multi-node MySQL cluster for high availability
  • Create and manage automated backup strategies with point-in-time recovery
  • Monitor database performance and implement optimization strategies
  • Plan and execute database migrations with zero downtime
  • Implement disaster recovery procedures

Testing & Quality Assurance

  • Build automated testing frameworks for React, Laravel, and Node.js applications
  • Implement E2E testing for critical POS workflows including payment processing
  • Create testing strategies for hardware integration (payment terminals, printers, scanners)
  • Establish code quality gates and coverage requirements

Documentation & Knowledge Transfer

  • Create and maintain comprehensive documentation for all infrastructure, deployment processes, and runbooks
  • Develop disaster recovery playbooks and incident response procedures
  • Document monitoring alerts, thresholds, and escalation procedures
  • Maintain architectural diagrams and system dependencies documentation
  • Create video tutorials and guides for common operational tasks

Required Qualifications

Technical Skills

  • 3+ years of DevOps/SRE experience with production systems
  • Strong experience with CI/CD tools (GitHub Actions, GitLab CI, Jenkins)
  • Hands-on experience with Grafana, Prometheus, and alerting systems
  • Experience with centralized logging solutions (ELK, Splunk, or similar)
  • Proficiency in containerization (Docker) and orchestration (Kubernetes or Docker Compose)
  • Expertise in MySQL administration including replication and clustering
  • Experience with Infrastructure as Code (Terraform, Ansible, or similar)
  • Solid understanding of Linux system administration
  • Proficiency in scripting (Bash, Python, or similar)

Application-Specific Experience

  • Experience deploying React and Node.js applications at scale
  • Familiarity with Laravel deployment and optimization
  • Experience managing mobile app releases and versioning strategies
  • Understanding of Electron app packaging and distribution
  • Knowledge of WebSocket implementations and real-time systems

Soft Skills

  • Excellent technical writing and documentation skills
  • Experience training and mentoring junior engineers
  • Strong communication skills for cross-functional collaboration
  • Ability to explain complex technical concepts to non-technical stakeholders

Work Schedule & On-Call Requirements

Core Hours

  • Must be available during US Pacific Time business hours (9 AM to 5 PM PST/PDT)
  • This is a fully remote position

On-Call Responsibilities

As our POS platform serves retail businesses operating 7 days a week, this role includes participation in an on-call rotation to ensure 24/7 system reliability.

On-Call Structure:

  • Participate in a rotating on-call schedule
  • Response time: 15-minute acknowledgment, 30-minute engagement during on-call periods
  • Average incident volume: one incident every two months
  • Severity-based response (P1: immediate, P2: 30 minutes, P3: next business day)
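
The severity-based response targets above could be encoded along the following lines. This is only a sketch under stated assumptions: the names `FIXED_TARGETS`, `next_business_day`, and `response_due` are hypothetical, "next business day" is interpreted as the start of the next Monday to Friday day, and public holidays and time zones are ignored. None of these details come from the posting.

```python
from datetime import datetime, timedelta

# Targets taken from the list above. P3 is handled separately because
# "next business day" depends on the calendar rather than a fixed delay.
FIXED_TARGETS = {
    "P1": timedelta(0),           # immediate
    "P2": timedelta(minutes=30),  # within 30 minutes
}


def next_business_day(opened_at: datetime) -> datetime:
    """Return midnight at the start of the next Monday-Friday day.

    Assumption: holidays are ignored and the weekend is Saturday/Sunday.
    """
    day = opened_at + timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day += timedelta(days=1)
    return day.replace(hour=0, minute=0, second=0, microsecond=0)


def response_due(severity: str, opened_at: datetime) -> datetime:
    """Return the time by which (P1/P2) or from which (P3) a response is expected."""
    if severity in FIXED_TARGETS:
        return opened_at + FIXED_TARGETS[severity]
    if severity == "P3":
        return next_business_day(opened_at)
    raise ValueError(f"unknown severity: {severity!r}")


if __name__ == "__main__":
    opened = datetime(2024, 3, 1, 22, 0)  # a Friday evening
    for sev in ("P1", "P2", "P3"):
        print(sev, response_due(sev, opened))
```

For an incident opened on a Friday evening, the P3 case skips the weekend and lands on the following Monday.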

On-Call Compensation:

  • Standby Pay: Additional compensation for on-call availability (paid whether or not incidents occur)
  • Incident Response Pay: 1.5x hourly rate for incident response during nights and weekends
  • Compensatory Time: Time off provided after significant weekend incidents
  • Company-provided phone and laptop dedicated for on-call use
  • Post-incident review process to minimize repeat issues and alert fatigue

Support Structure:

  • Comprehensive runbooks and automated remediation for common issues
  • Clear escalation procedures to senior leadership and vendor support
  • Robust monitoring to minimize false positives
  • Regular rotation reviews to ensure fair distribution

What We Offer

  • Opportunity to architect infrastructure for a growing SaaS platform
  • Work with a diverse, modern technology stack
  • Direct impact on system reliability affecting thousands of daily transactions
  • Competitive on-call compensation package
  • Professional development budget for certifications and training
  • Salary of 12 LPA (lakhs per annum) or more

Requirements

  • Strong written and verbal communication skills
  • Demonstrated experience in creating technical documentation
  • Ability to work during US Pacific Time business hours
  • Willingness to participate in compensated on-call rotation
  • Self-motivated with excellent troubleshooting skills
  • Experience working in fast-paced, agile environments
  • Commitment to knowledge sharing and team development