Talent.com
Observability Engineer - Splunk / Kafka

Jobhedge Consultancy, Pune
1 day ago
Job description

Job Description : AI-Driven Observability Engineer

Experience : 10+ Years

About the Role :

We are seeking a highly skilled AI-Driven Observability Engineer to design, implement, and maintain end-to-end observability solutions for infrastructure and applications. You will play a key role in ensuring the reliability, performance, and scalability of our distributed systems by developing monitoring, logging, and tracing capabilities. The ideal candidate will have expertise in ETL, Data Science, and Machine Learning, along with hands-on experience with OpenTelemetry, Splunk, and Kafka.

Key Responsibilities :

  • Design & Develop Observability Solutions : Build and enhance telemetry pipelines for logs, metrics, and traces using industry-standard tools (Kafka, OpenTelemetry, Splunk).
  • Instrument Applications : Implement observability best practices in infrastructure, applications, and platforms.
  • Design and implement machine learning models to analyze logs, metrics, and traces for anomaly detection, predictive failure analysis, and root cause analysis.
  • Monitor & Analyze System Performance : Build real-time data visualization dashboards and alerts to track system health, detect anomalies, and support real-time troubleshooting.
  • Work with Event-Driven Architectures : Integrate observability with messaging systems like Kafka, RabbitMQ, or Pulsar for real-time monitoring.
  • Collaborate Across Teams : Work closely with SREs, DevOps, and development teams to improve system reliability and incident response.
  • Security & Compliance : Ensure observability data is securely stored and compliant with relevant regulations (GDPR, HIPAA, etc.).
  • Optimize Performance : Conduct root cause analysis and improve system observability to reduce downtime and improve response times.

Required Skills & Experience :

  • Data Science & Machine Learning experience : Hands-on proficiency in Python, TensorFlow, PyTorch, Scikit-learn, Pandas, NumPy.
  • Extensive knowledge of ETL techniques : Data extraction, transformation, and loading using Apache Airflow, Apache NiFi, Spark, or similar tools.
  • Observability Stack : Hands-on experience with Prometheus, Grafana, ELK Stack, Loki, OpenTelemetry, Jaeger, or Zipkin.
  • Experience with Time-Series Analysis, Predictive Analytics and AI-driven Observability.
  • Cloud & Infrastructure : Experience with AWS, Azure, or GCP observability services (e.g., CloudWatch, Azure Monitor).
  • Distributed Systems & Microservices : Understanding of Kubernetes, Docker, and Service Mesh technologies (Istio, Linkerd).
  • Event-Driven Architectures : Experience with Kafka, RabbitMQ, or other message brokers.
  • Database & Storage : Familiarity with time-series databases (InfluxDB, VictoriaMetrics) and NoSQL / SQL databases.

Preferred Qualifications :

  • Experience in AIOps and intelligent observability or anomaly detection.
  • Knowledge of Chaos Engineering for resilience testing.
  • Certifications in AWS, Azure, Kubernetes, or Observability tools.
  • Knowledge of data engineering and big data technologies like Hadoop, Spark and Flink.
  • Experience with machine learning models for predictive observability.

Why Join Us ?

  • Work on cutting-edge observability solutions in a high-scale production environment.
  • Opportunity to automate infrastructure monitoring and enhance system resilience.
  • Collaborate with cross-functional teams to improve reliability engineering.
  • Competitive salary, benefits, and growth opportunities in a fast-paced environment.

(ref : hirist.tech)