Course Outline

Introduction to AIOps

  • Defining AIOps and its significance.
  • Contrasting traditional monitoring with AIOps-driven observability.
  • Exploring AIOps architecture and key components.

Collecting and Normalizing Operational Data

  • Types of observability data: metrics, logs, and traces.
  • Ingesting data from diverse sources, including servers, containers, and cloud environments.
  • Utilizing agents and exporters such as Prometheus exporters, Beats, and Fluentd.

Data Correlation and Anomaly Detection

  • Employing time series correlation and statistical methods.
  • Applying ML models for anomaly detection.
  • Detecting incidents across distributed systems.
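
As a rough illustration of the statistical methods named above, the sketch below flags points in a metric series that deviate sharply from a trailing window. It is a minimal example under stated assumptions, not the course's own material: the function name `zscore_anomalies`, the window size, the threshold, and the sample CPU values are all hypothetical.

```python
from statistics import mean, stdev


def zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` points."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU utilisation samples (percent); index 6 is a spike.
cpu = [50, 52, 51, 49, 50, 51, 95, 50, 52]
print(zscore_anomalies(cpu))  # [6]
```

A trailing window keeps the baseline local, so slow drifts are tolerated while sudden spikes stand out. Note that a spike inflates the window's standard deviation for the next few points, which is one reason production systems often prefer more robust statistics or learned models.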

Alerting and Noise Reduction

  • Designing intelligent alert rules and thresholds.
  • Implementing suppression, deduplication, and alert grouping.
  • Integrating with platforms like Alertmanager, Slack, PagerDuty, or Opsgenie.
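
To make deduplication and alert grouping concrete, here is a small sketch under assumed conventions. The alert fields (`service`, `name`, `ts`), the function name `dedupe_and_group`, and the 300-second window are hypothetical and do not correspond to any particular platform's schema.

```python
from collections import defaultdict


def dedupe_and_group(alerts, window_seconds=300):
    """Drop repeats of the same (service, name) alert that arrive within
    `window_seconds` of the last kept one, then group survivors by service."""
    last_kept = {}               # (service, name) -> timestamp of last kept alert
    grouped = defaultdict(list)  # service -> kept alerts, in time order
    for alert in sorted(alerts, key=lambda a: a["ts"]):
        key = (alert["service"], alert["name"])
        previous = last_kept.get(key)
        if previous is not None and alert["ts"] - previous < window_seconds:
            continue             # duplicate inside the suppression window
        last_kept[key] = alert["ts"]
        grouped[alert["service"]].append(alert)
    return dict(grouped)


# Hypothetical alert stream; timestamps are seconds from an arbitrary start.
alerts = [
    {"service": "api", "name": "HighLatency", "ts": 0},
    {"service": "api", "name": "HighLatency", "ts": 60},
    {"service": "api", "name": "ErrorRate", "ts": 90},
    {"service": "db", "name": "DiskFull", "ts": 120},
    {"service": "api", "name": "HighLatency", "ts": 400},
]
result = dedupe_and_group(alerts)
for service, items in result.items():
    print(service, [(a["name"], a["ts"]) for a in items])
```

Here the repeat at 60 seconds is suppressed, while the one at 400 seconds is kept because it falls outside the window measured from the last kept alert. Grouping by service means a responder would receive one notification per affected service rather than one per alert.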

Root Cause Analysis and Visualization

  • Using dashboards to visualize metrics and identify trends.
  • Exploring events and timelines for Root Cause Analysis (RCA).
  • Tracing issues across layers using distributed tracing tools.

Automation and Remediation

  • Triggering automated scripts or workflows in response to incidents.
  • Integrating with ITSM systems such as ServiceNow and Jira.
  • Examining use cases: self-healing, scaling, and traffic rerouting.

Open Source and Commercial AIOps Platforms

  • Overview of tools including Prometheus, Grafana, ELK, Moogsoft, and Dynatrace.
  • Establishing evaluation criteria for selecting an AIOps platform.
  • Demo and hands-on session with a selected stack.

Summary and Next Steps

Requirements

  • A solid understanding of IT operations and system monitoring concepts.
  • Prior experience with monitoring tools or dashboards.
  • Familiarity with fundamental log and metric formats.

Audience

  • Operations teams responsible for infrastructure and applications.
  • Site Reliability Engineers (SREs).
  • IT monitoring and observability teams.

Duration

  • 14 hours
