JOB DESCRIPTION
Are you ready to make an impact at DTCC?
Do you want to work on innovative projects, collaborate with a dynamic and supportive team, and receive investment in your professional development? At DTCC, we are at the forefront of innovation in the financial markets. We are committed to helping our employees grow and succeed. We believe that you have the skills and drive to make a real impact. We foster a thriving internal community and are committed to creating a workplace that looks like the world that we serve.
The Information Technology group delivers secure, reliable technology solutions that enable DTCC to be the trusted infrastructure of the global capital markets. The team delivers high-quality information through activities that include essential development work, building infrastructure capabilities to meet client needs, and implementing data standards and governance.
Pay and Benefits:
- Competitive compensation, including base pay and annual incentive
- Comprehensive health and life insurance and well-being benefits, based on location
- Pension / Retirement benefits
- Paid Time Off and Personal / Family Care, and other leaves of absence when needed to support your physical, financial, and emotional well-being.
- DTCC offers a flexible / hybrid model of 3 days onsite and 2 days remote (onsite Tuesdays, Wednesdays and a third day unique to each team or employee).
The Impact you will have in this role:
We are seeking a highly motivated Observability Engineer to join our Observability Engineering & Product Delivery team. This role is critical in enhancing our enterprise observability capabilities by designing, implementing, and maintaining monitoring solutions using tools such as Grafana, Splunk, and Dynatrace. The ideal candidate will have a strong background in telemetry (logs, metrics, traces, events), performance monitoring, and dashboard visualization.
This role sits in the Observability Engineering & Product Delivery team. The team maintains the firm's monitoring and observability tools and infrastructure, and this position focuses primarily on Splunk, Grafana, and observability.
Your Primary Responsibilities:
- Working on engineering and development focused projects from start to finish with minimal supervision
- Providing technical and operational support for our customer base as well as other technical areas within the company that utilize our tools
- Risk management functions such as reconciliation of vulnerabilities, security baselines, and other risk and audit related objectives
- Administrative functions for our tools, such as keeping the tool documentation current and handling service requests
- Design and implement observability solutions across distributed systems using Grafana, Splunk ITSI, and Dynatrace.
- Develop and maintain custom dashboards and visualizations tailored to business and operational needs.
- Integrate observability tools with various data sources (e.g., Prometheus, CloudWatch, ServiceNow, Snowflake).
- Collaborate with application and infrastructure teams to define SLIs / SLOs and improve system reliability.
- Troubleshoot and resolve issues related to monitoring gaps, alert noise, and data ingestion.
- 24x7 on-call L3 support on a rotational schedule with other team members
- Participating in user training to increase awareness of Splunk, Grafana, and observability
- Ensuring incident, problem, and change tickets are addressed in a timely fashion, and escalating technical and managerial issues
- Following DTCC's ITIL process for incident, change, and problem resolution.
- Good knowledge of TCP / IP and networking fundamentals
- Good knowledge of engineering, configuring, deploying, and supporting Splunk Enterprise, Splunk Cloud, ITSI, Grafana, and observability tooling
- Proficiency in the Splunk Search Processing Language (SPL), including the ability to create and optimize Big Data correlations
- Proficient in Grafana queries, dashboard creation, and other development and administration tasks.
- Optimize / tune logging source streams
- Develop Splunk reports to meet requirements of key stakeholders.
- Good knowledge of AWS products and services such as EC2, Lambda, VPC, Route 53, Amazon FW, API Gateway, ELB, and CloudTrail.
Qualifications:
- Minimum of 5 years of related experience
- Bachelor's degree preferred or equivalent experience
Talents Needed for Success:
- 5+ years' experience of Splunk / Grafana engineering / support in a production environment. This includes all phases of lifecycle management: planning, design, deployment, upkeep, and retirement
- Should have developed competency with both Splunk and Grafana in a production environment
- Hands-on experience with Grafana, Splunk (including ITSI), and Dynatrace.
- Strong understanding of telemetry data types and observability architecture.
- Experience with scripting (Python, Bash, PowerShell) and automation tools.
- Familiarity with cloud platforms (AWS and Azure) and containerized environments (Kubernetes).
- Excellent problem-solving skills and ability to work in a fast-paced, collaborative environment.
- Strong communication skills.
- Working knowledge of OpenTelemetry.
Preferred Qualifications:
- Experience with integrating observability tools into CI / CD pipelines.
- Knowledge of ITSM tools like ServiceNow and incident response platforms like PagerDuty.
- Exposure to AIOps, anomaly detection, and predictive analytics use cases.
Skills Required
PowerShell, Prometheus, Bash, Grafana, ServiceNow, CloudWatch, Snowflake, Dynatrace, Splunk, Python, Kubernetes, AWS