Create an OCI Alarm for Compute CPU Utilization
Monitoring is an important part of cloud operations. After
deploying compute instances, we need visibility into resource usage such as
CPU, memory, disk, and network. CPU utilization is one of the most common
metrics monitored for compute instances because high CPU usage may indicate
application load, inefficient processes, or capacity limitations. Sustained
high CPU usage also degrades workload performance and can eventually cause
service disruptions.
Oracle Cloud Infrastructure provides the Monitoring service
to collect metrics from OCI resources and create alarms when metric values
cross defined thresholds. In this article, we will create an OCI alarm for
compute CPU utilization. We will also simulate high CPU usage on a compute
instance and validate that OCI sends an email notification when the alarm is
triggered.
This test case uses four OCI components: OCI Compute, OCI Monitoring, OCI
Notifications, and Oracle Cloud Agent. Oracle Cloud Agent on the compute
instance emits CPU utilization metrics, and OCI Monitoring collects them. An
alarm is created on the CpuUtilization metric. When CPU utilization crosses the
configured threshold, the alarm moves to the firing state and publishes a
message to an OCI Notifications topic. The email subscription on that topic
receives the alert.
Prerequisites
Before starting this setup, the following items should be
available:
Required compartment
Running compute instance
Oracle Cloud Agent enabled on the instance
Compute Instance Monitoring plugin enabled
Permission to create alarms
Permission to create notification topics and subscriptions
Valid email address for receiving notifications
Validate Compute Instance Monitoring Plugin
Before creating the alarm, the compute instance should be
able to emit monitoring metrics. From the compute instance details page, we can
review the Oracle Cloud Agent section and validate that the Compute
Instance Monitoring plugin is enabled and running.
If the plugin is disabled, metrics may not appear in OCI
Monitoring.
The metric used for this article is: CpuUtilization
Create Notification Topic
First, we will create a Notification topic. The topic will receive the alarm message and forward it to the email subscription.
From OCI Console, navigate to:
Developer Services → Application Integration → Notifications
Create a topic.
Example topic name:
compute-cpu-alarm-topic
After the topic is created, we will add an email
subscription.
Create Email Subscription
Open the Notification topic and create a subscription.
Use the protocol:
Email
Provide the email address that should receive the alarm
notification.
After the subscription is created, OCI sends a confirmation
email to that address. The subscription must be confirmed before it can
receive alarm notifications. After confirmation, the subscription status
changes to Active. As Oracle’s Notifications documentation explains, a message
published to a topic is delivered to every subscription on that topic.
More details on this topic are available in the article
“Send Email Notification When Object Is Uploaded to OCI Object Storage”.
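The topic and subscription steps above can also be scripted. The following is a minimal sketch, assuming the OCI CLI is installed and configured on the machine where it runs. The function names and the OCIDs and email address passed to them are placeholders for illustration and are not values from this article.

```shell
# Sketch only: assumes the OCI CLI ("oci") is installed and authenticated.
# All OCIDs and the email address are placeholders supplied by the caller.

# create_topic COMPARTMENT_OCID TOPIC_NAME
# Creates the Notifications topic; the CLI prints the new topic as JSON.
create_topic() {
  oci ons topic create \
    --compartment-id "$1" \
    --name "$2"
}

# create_email_subscription COMPARTMENT_OCID TOPIC_OCID EMAIL_ADDRESS
# Adds an email subscription to the topic. The recipient still has to
# confirm it from the confirmation email before it becomes active.
create_email_subscription() {
  oci ons subscription create \
    --compartment-id "$1" \
    --topic-id "$2" \
    --protocol EMAIL \
    --subscription-endpoint "$3"
}

# Example usage (placeholder values):
#   create_topic "ocid1.compartment.oc1..example" "compute-cpu-alarm-topic"
#   create_email_subscription "ocid1.compartment.oc1..example" \
#     "ocid1.onstopic.oc1..example" "ops@example.com"
```

The topic OCID needed by the second function appears in the JSON output of the first command and on the topic details page in the console.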
Create CPU Utilization Alarm
Now we will create the alarm.
From OCI Console, navigate to:
Observability & Management → Monitoring → Alarm Definitions
Create a new alarm.
Example alarm name:
High CPU Utilization Alarm
Select the metric details:
Metric namespace: oci_computeagent
Metric name: CpuUtilization
Interval: 1 minute
Statistic: Mean
Select the required compartment and compute instance
dimension. For the trigger rule, we can use a low threshold for testing.
Example:
CpuUtilization greater than 50%
For a production environment, the threshold can be higher,
such as 80% or 85%, based on workload behavior and operational standards.
For notification destination, select the Notification topic:
compute-cpu-alarm-topic
Save the alarm.
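For readers who prefer the command line, the console steps above can be approximated with the OCI CLI. This is a sketch under assumptions: the CLI is installed and configured, the alarm and the metric live in the same compartment, and the CRITICAL severity is chosen only for illustration because the article does not specify one. The function name and OCIDs are placeholders.

```shell
# Sketch only: assumes the OCI CLI ("oci") is installed and authenticated.

# create_cpu_alarm COMPARTMENT_OCID TOPIC_OCID
# Creates an alarm on CpuUtilization that publishes to the given topic.
create_cpu_alarm() {
  oci monitoring alarm create \
    --compartment-id "$1" \
    --metric-compartment-id "$1" \
    --display-name "High CPU Utilization Alarm" \
    --namespace oci_computeagent \
    --query-text 'CpuUtilization[1m].mean() > 50' \
    --severity CRITICAL \
    --destinations "[\"$2\"]" \
    --is-enabled true
}

# Example usage (placeholder OCIDs):
#   create_cpu_alarm "ocid1.compartment.oc1..example" "ocid1.onstopic.oc1..example"
```

The severity is an assumption here; pick the value that matches your operational standards. As written, the query is not restricted to a single instance, so it evaluates every instance in the compartment that reports the metric.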
Alarm Query Example
The alarm query may look similar to the following:
CpuUtilization[1m].mean() > 50
This means OCI Monitoring evaluates the average CPU
utilization over a one-minute interval. If the value is greater than 50, the
alarm condition is met.
For testing, a lower threshold makes it easier to trigger
the alarm. After validation, the threshold can be adjusted to a
production-friendly value.
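To make the arithmetic behind the query concrete, the sketch below averages a few CPU samples and applies the same strict greater-than comparison. It only illustrates the calculation locally; it is not how OCI Monitoring is implemented, and the function name and sample values are hypothetical.

```shell
# mean_exceeds THRESHOLD SAMPLE...
# Prints the mean of the samples and returns 0 when the mean is strictly
# greater than THRESHOLD, 1 when it is not, and 2 when no samples are given.
mean_exceeds() {
  threshold=$1
  shift
  [ "$#" -gt 0 ] || return 2
  printf '%s\n' "$@" | awk -v t="$threshold" '
    { sum += $1; n++ }
    END {
      mean = sum / n
      printf "mean=%.1f\n", mean
      exit (mean > t) ? 0 : 1
    }'
}

# Three hypothetical samples within one interval: (20 + 95 + 98) / 3 = 71.0
if mean_exceeds 50 20 95 98; then
  echo "alarm condition met"
fi
```

Because the comparison is strict, a mean of exactly 50 does not satisfy the condition.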
Simulate High CPU Usage
To validate the alarm, we need to generate CPU load on the
compute instance.
Connect to the compute instance using SSH.
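On Oracle Linux, the stress tool is the easiest way to do this.
Install stress if it is not already present, using yum or any other
available package manager. On some Oracle Linux releases the package is
provided by the EPEL repository, which may need to be enabled first.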
sudo yum install -y stress
Generate CPU load:
stress --cpu 2 --timeout 300
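This command generates CPU load using two CPU workers for
300 seconds. CpuUtilization reflects the whole instance, so if the instance
has more than two vCPUs, increase the worker count until overall utilization
exceeds the alarm threshold.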
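If the stress package cannot be installed, a similar load can be produced with plain shell loops. The following is a minimal sketch, assuming the timeout command from GNU coreutils is available on the instance; the function name burn_cpu is hypothetical.

```shell
# burn_cpu WORKERS SECONDS
# Starts WORKERS busy loops in the background, each limited to SECONDS
# by timeout, then waits until all of them have finished.
burn_cpu() {
  workers=$1
  seconds=$2
  i=0
  while [ "$i" -lt "$workers" ]; do
    # Each worker spins in an empty loop and keeps one CPU thread busy.
    timeout "$seconds" sh -c 'while :; do :; done' &
    i=$((i + 1))
  done
  wait
}

# Example usage, equivalent in spirit to "stress --cpu 2 --timeout 300":
#   burn_cpu 2 300
# To load every CPU thread on the instance instead:
#   burn_cpu "$(nproc)" 300
```

Each worker stops on its own when the timeout expires, so no cleanup is needed even if the SSH session drops.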
Wait for Alarm Evaluation
After CPU load is generated, OCI Monitoring needs a few
minutes to collect metrics and evaluate the alarm condition.
In the alarm details page, the alarm should move from:
OK
to:
Firing
Once the alarm enters firing state, OCI Notifications sends
an email to the confirmed subscription.
A small delay is normal because metrics are collected and
evaluated based on the selected interval and alarm configuration.
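While waiting, it can help to confirm that the metric itself is arriving. The sketch below queries the same metric through the OCI CLI, assuming the CLI is installed and configured; the function name and OCIDs are placeholders, and the resourceId dimension is used to restrict the query to one instance.

```shell
# Sketch only: assumes the OCI CLI ("oci") is installed and authenticated.

# query_cpu_metric COMPARTMENT_OCID INSTANCE_OCID
# Returns the one-minute mean CpuUtilization data points for one instance.
query_cpu_metric() {
  oci monitoring metric-data summarize-metrics-data \
    --compartment-id "$1" \
    --namespace oci_computeagent \
    --query-text "CpuUtilization[1m]{resourceId = \"$2\"}.mean()"
}

# Example usage (placeholder OCIDs):
#   query_cpu_metric "ocid1.compartment.oc1..example" "ocid1.instance.oc1..example"
```

If the output contains no data points, recheck that the Compute Instance Monitoring plugin is enabled and running before troubleshooting the alarm.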
Validate Email Notification
After the alarm is triggered, check the email inbox.
The notification email should include details such as:
Alarm name
Alarm state
Alarm Type
Alarm Severity
Notification type
Alarm Summary
Body
Time
Query
Number of metrics breaching threshold
Dimensions
Metric values, ordered by dimension
This confirms that the CPU alarm was triggered successfully
and the notification flow is working.
Stop CPU Load
After validation, stop the CPU load.
The stress command stops automatically when the timeout
expires. To stop it earlier, press Ctrl+C in the SSH session.
After CPU load stops, CPU utilization should decrease.
After the next alarm evaluation cycle, the alarm should
return to:
OK
Depending on the alarm configuration, a notification may
also be sent when the alarm clears.
In this article, we created an OCI alarm for compute CPU
utilization and connected it with OCI Notifications. We also simulated high CPU
usage on the compute instance to validate that the alarm moves to firing state
and sends an email notification.
This is a useful monitoring pattern for OCI compute
workloads. By using Monitoring alarms and Notifications, we can detect high CPU
usage early and improve operational response without using custom scripts or
external monitoring tools.