Observability Automation Engineer

Job description

  • Design and develop an observability platform comprising monitoring, metrics, and logging systems.
  • Conceptualize and implement early anomaly detection (to reduce the mean time to identify issues), pattern analysis, self-healing, infrastructure resizing, noise reduction, and outage prediction.
  • Develop visualizations in Azure Monitor and GCP Looker that provide single-pane views of end-user experience, application, infrastructure, and security.
  • Design and develop end-user device and network connectivity insights.


  • Minimum 3 years of experience with:
    • Monitoring tools (e.g., Datadog, Instana, Azure Monitor, Google Operations, Prometheus)
    • Programming and scripting in Python and/or Bash
    • Distributed tracing and debugging in production (e.g., Jaeger, Zipkin, OpenTracing, Application Insights, Google Cloud Trace/Stackdriver)
    • Visualization tools (e.g., GCP Looker, Azure Monitor, Grafana)
    • Agents and collectors (e.g., Fluent Bit, Telegraf, BindPlane, OpenTelemetry, Splunk)
    • Cloud platforms (Azure, GCP, AWS)

If this job isn't quite right for you but you know someone who would be great in this role, why not take advantage of our referral scheme? We offer MYR500 in shopping vouchers for every referred candidate we place in a role. Terms & Conditions apply.