IT Reliability & Incident Lead [T500-24989]
Costco IT • Delhi, India
No longer accepting applications
24 days ago
Job description
About Costco Wholesale: Costco Wholesale is a multi-billion-dollar global retailer with warehouse club operations in eleven countries. They provide a wide selection of quality merchandise, plus the convenience of specialty departments and exclusive member services, all designed to make shopping a pleasurable experience for their members.

About Costco Wholesale India: At Costco Wholesale India, we foster a collaborative space, working to support Costco Wholesale in developing innovative solutions that improve members’ experiences and make employees’ jobs easier. Our employees play a key role in driving and delivering innovation to establish IT as a core competitive advantage for Costco Wholesale.

Position Title: IT Ops Analyst - Incident Commander

Role Summary: IT Ops Analysts - Technical Incident Commanders are responsible for defining technology needs, prioritizing them in the global planning process, analyzing, submitting, and approving new initiatives. IT Ops Analysts lead efforts to define and manage technology solutions, project initiatives, business architectures, technology objectives and strategies, and represent these to business areas and IT leadership as needed.

The IT Operations Analyst is a dedicated resource assigned to the coordination, escalation, communication, and resolution of high priority eCommerce incidents/problems, as well as trend identification, metric definition, and the related efforts to decrease service disruption. Additionally, IT Operations Analysts will define processes and metrics to refine current and define future state eBusiness operations support processes. The individuals in this role own the successful end-to-end tracking and resolution of incidents, which includes managing resolution action plans, measuring service targets, and escalating incidents as required when resolution targets are missed. This role will also run Problem Management and drive Root Cause Investigations for the eCommerce environment.

Job Description: Roles & Responsibilities:
- Defines, captures, and validates IT requirements and other artifacts, ensuring appropriate stakeholders are involved.
- Develops key team deliverables and dashboards.
- Documents and manages risks, issues, assumptions, and constraints impacting operational support efforts.
- Develops and coordinates legal/compliance, operational controls, and associated metrics to measure success.
- Develops and implements standards, processes, and procedures for new technology solutions; ensures newer solutions will not negatively impact current service commitments.
- Manages the incident and problem management process and the team members involved in resolving incidents and problems.
- Responds to a reported incident and initiates the incident management process.
- Remediates deviations from the current incident management process.
- Acts as the point of contact for all major incidents.
- Analyzes internal IT customer needs and priorities while initiating operational support and delivery efforts.
- Participates in periodic audits for solutions, planning, and delivery functions.
- Ensures incidents that are not immediately resolved are appropriately escalated according to defined service level agreements (SLAs).
- Drives toward key performance indicators (KPIs), improving metrics and services for our members and stakeholders.
- Identifies and reports incident and problem trends and progress.
- Ensures timely, clear communication regarding high priority issues with the appropriate stakeholders.
- Works closely with the incident owner to ensure incident escalation processes are in line with the overall incident management processes.
- Manages and tracks supplier performance; leverages approved contractual terms for accountability.
- Develops and conducts presentations as needed.
- Represents the first stage of escalation for incidents.
- Monitors and analyzes the incidents reported to ensure that SLAs are respected, RCAs are prepared, and preventive actions are in place.
- Identifies, initiates, schedules, and conducts incident reviews.
- Ensures users and leadership are informed about the incidents’ status at regular intervals.
- Ensures closure of all incident records that are resolved and confirmed by end users.
- Ensures that process performance, activities, roles and responsibilities, and procedures are continuously reviewed and enhanced wherever applicable.
- Facilitates collaboration with problem management to ensure successful transition of incidents into problem investigations.
- Ensures an RCA is prepared and schedules RCA reviews with the teams that worked on the incident.
- Records all details and the timeline of key elements during incident management bridge calls.
- Undertakes continual service improvement activities.
- Creates, maintains, and reports SLAs and KPIs.
- Identifies and reports incident trends and progress.
- Ensures the team and other stakeholders on the call understand the business impact.
- Collaborates with appropriate business and IT stakeholders to determine root cause and problem identification and, as appropriate, enhancement identification for future development work.
- Supports eCommerce releases for both pre- and post-release activities.
- Maintains regular and reliable workplace attendance at the assigned location.

Experience Required: Minimum Qualifications:
- Excellent verbal and written communication skills.
- Ability to create accurate, concise correspondence.
- Ability to develop and conduct presentations.
- Strong, proven interpersonal skills and the ability to work well with people at all levels.
- Ability to conduct monthly meetings with stakeholders to drive increased availability in identified trends.
- Detail-oriented, with strong problem-solving skills and the ability to analyze a situation for potential future problems.
- Organized and thorough, with a dedication to follow-through.
- Intellectually inquisitive, with the ability to be open-minded to varying opinions.
- Responsible and conscientious, with a passion for excellence and a positive “can do” attitude.
- Innovative, creative, and extremely responsive with respect to service quality and ways in which it can be improved.
- Highly responsive and available to support business needs, flexing as needed.
- Good understanding of corporate IT policies, procedures, and standards.

Knowledge of:
- Incident, Problem, Change, and Knowledge Management practices.
- IT strategies, customers, and services provided.
- Costco’s core business environment related to eCommerce, Merchandising, Warehouse Operations, and company philosophies.
- Service analysis and other tools: CARTS+, Google Apps, Smartsheet.

Available for 24x7 on-call coverage and off-hours work as required, including weekends and holidays; coverage fluctuates with staffing. Develops and presents a business case document and/or presentation to management. Must be familiar with a broad set of technologies and solutions currently in use at Costco. Demonstrates ability to work independently and with limited supervision. Strong abstraction skills: ability to derive general rules and concepts from the usage and classification of specific examples, literal signifiers, and first principles. Strong communication skills: able to speak to large audiences and to leaders at all levels of the organization. Able to adapt vocabulary and style for each situation; able to represent complex ideas with effective documents and visuals and to adapt presentations to the expectations and background of the audience. Extremely responsive, with a strong sense of urgency.

Must Have Skills:
- Familiarity with ServiceNow.
- Experience with statistical analysis and reporting.
- Familiarity with multiple Costco business areas from an IT perspective.
- Knowledge of Service Desk or Call Center business processes.
- IT Infrastructure Library (ITIL) V3 Foundation certification.
- Prior experience with IT Service Management software.
- Proficient in Google Workspace applications, including Sheets, Docs, Slides, and Gmail.
- Successful internal candidates will have spent one year or more on their current team.
