
CloudWatch networking alarms

A summary of current alarms

  • nacl-changes
  • network-gateway-changes
  • NoVPCAttachmentTraffic
  • route-table-changes
  • vpc-changes

nacl-changes

This alarm triggers when an API call is made to create, delete, or update a network access control list (NACL). Our NACLs are fairly static, but can be amended when customers feed in values for east/west traffic, or private access to other locations such as PSN endpoints.

If this alarm triggers, confirm the NACL in question and find the cause of the alarm. Check our GitHub Actions to see whether any recent runs correspond to the triggered alarm. You can also investigate the CloudTrail logs for the relevant account.

As we make all durable changes through code, this should show you the cause of the nacl-changes alert. It ought to be legitimate in all cases, but can sometimes be related to alarms such as NoVPCAttachmentTraffic where a NACL blocks traffic from exiting a VPC.
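When working through the CloudTrail logs for the relevant account, it can help to narrow the records down to NACL-related calls. The following is a minimal sketch rather than platform tooling. The function name `nacl_change_events` is hypothetical, the event-name list is an assumption based on the EC2 API actions commonly used in NACL change filters (our own metric filter may differ), and the input is assumed to be a list of already-parsed CloudTrail records.

```python
# Assumption: EC2 API actions commonly matched by NACL change filters.
# The platform's actual metric filter may use a different list.
NACL_EVENT_NAMES = frozenset({
    "CreateNetworkAcl",
    "CreateNetworkAclEntry",
    "DeleteNetworkAcl",
    "DeleteNetworkAclEntry",
    "ReplaceNetworkAclEntry",
    "ReplaceNetworkAclAssociation",
})


def nacl_change_events(records):
    """Return a time-ordered summary of NACL-related CloudTrail records.

    `records` is a list of dicts in the CloudTrail record format
    (eventTime, eventName, userIdentity, requestParameters).
    """
    matches = []
    for record in records:
        if record.get("eventName") not in NACL_EVENT_NAMES:
            continue
        identity = record.get("userIdentity") or {}
        params = record.get("requestParameters") or {}
        matches.append({
            "time": record.get("eventTime"),
            "event": record["eventName"],
            # Who made the call: compare this with recent GitHub Actions runs.
            "actor": identity.get("arn", "unknown"),
            # Not every NACL call carries the ACL id in its request.
            "network_acl_id": params.get("networkAclId"),
        })
    # ISO 8601 timestamps sort correctly as strings.
    return sorted(matches, key=lambda match: match["time"] or "")
```

If the `actor` on a matching record is the role used by our pipelines, the change most likely came from a workflow run; any other principal deserves a closer look.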

It is unlikely to cause traffic to be blocked from entering a VPC as the transit gateway endpoint subnets have fixed permissive NACLs.

Outputs to #modernisation-platform-low-priority-alarms.

network-gateway-changes

This alarm triggers when an API call is made to create, delete, or attach an Internet Gateway or Customer Gateway. Because our VPC configuration is largely static, it is unlikely that this alarm will trigger unintentionally.

If this alarm triggers, confirm the gateway in question and find the cause of the alarm. Check our GitHub Actions to see whether any recent runs correspond to the triggered alarm. You can also investigate the CloudTrail logs for the relevant account.

As we make all durable changes through code, this should show you the cause of the network-gateway-changes alert. As with nacl-changes, it ought to be legitimate in all cases. A change to an Internet Gateway is likely to affect customer traffic coming into a service from the internet. A change to a Customer Gateway will be seen alongside a VPN tunnel moving to a down state.

Outputs to #modernisation-platform-low-priority-alarms.

NoVPCAttachmentTraffic

This alarm exists on a per-VPC basis and triggers when no traffic has been seen traversing that VPC's Transit Gateway attachment. It is intended to alert us when a change or event causes traffic from a VPC to the Modernisation Platform Transit Gateway to stop.

If this alarm triggers, confirm the VPC in question and assess the behaviour of traffic over the alarm period, extending the time range beyond the alarm period to see what normal behaviour looks like. This alarm can trigger during ordinary operation when the polling window is short; five minutes without traffic is not uncommon.

If this alarm triggers and has been deemed legitimate, your next step should be to confirm that there are no outstanding issues with the AWS VPC service, and that no changes have been made to the relevant VPC through the core-vpc GitHub Actions.

A change to route tables, network access control lists, or the Transit Gateway attachment can cause a legitimate triggering of this alarm. In short, anything that is involved in traffic leaving the VPC via the Transit Gateway is in scope for investigation.

An example of an illegitimate alert would be a dip in traffic over the alarm period that resolves itself without any remedial action from either the Modernisation Platform team or a customer team. An intended cessation of traffic, such as a team taking a service down for maintenance, could cause this.

An example of a legitimate alert would be a dip in traffic over the alarm period that does not resolve itself, and can be traced back to an unanticipated consequence of an action; restricting traffic through an access control list, for example.
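To make the distinction between a dip that resolves itself and one that does not more concrete, here is a small sketch that summarises gaps in a series of per-period traffic totals. It is illustrative only: the function name `summarise_traffic_gaps`, the input shape (a list of byte counts, oldest first), and the five-minute default period are assumptions, and how you obtain the datapoints is left out.

```python
def summarise_traffic_gaps(byte_counts, period_minutes=5):
    """Summarise zero-traffic gaps in per-period byte totals (oldest first).

    Returns the longest gap seen, the gap still in progress at the end of
    the series, and whether traffic resumed after at least one gap.
    """
    longest = 0   # longest run of consecutive zero-traffic periods
    current = 0   # run of zero-traffic periods ending at the latest datapoint
    for count in byte_counts:
        if count == 0:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return {
        "longest_gap_minutes": longest * period_minutes,
        "ongoing_gap_minutes": current * period_minutes,
        "traffic_resumed": longest > 0 and current == 0,
    }
```

A short gap followed by `traffic_resumed` being true points towards the illegitimate case described above. A growing `ongoing_gap_minutes` that lines up with a recent change points towards the legitimate one. Compare the result against a wider time range before drawing conclusions, since what counts as a normal gap varies by VPC.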

Outputs to #modernisation-platform-high-priority-alarms.

route-table-changes

This alarm triggers when an API call is made to carry out a route-table related action; creating, replacing, or deleting a route table or a route will match the filter for this alarm. It is likely that this alarm will be seen in conjunction with a valid pull request.

If this alarm triggers, confirm the route table in question and find the cause of the alarm. Check our GitHub Actions to see whether any recent runs correspond to the triggered alarm. You can also investigate the CloudTrail logs for the relevant account.

Changes to route tables will affect the flow of traffic. As routes are selected on a most-specific-route basis, a route change may have only a limited impact: it affects only the traffic for which the changed route is the most specific match.
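The most-specific-route behaviour can be illustrated with Python's standard `ipaddress` module. This is a simplified sketch that considers prefix length only. The function name `select_route`, the CIDR ranges, and the target names are made up for the example and do not reflect our real route tables.

```python
import ipaddress


def select_route(routes, destination_ip):
    """Pick the most specific (longest prefix) route matching an address.

    `routes` maps a destination CIDR string to a target name.
    Returns (cidr, target), or None when no route matches.
    """
    destination = ipaddress.ip_address(destination_ip)
    matching = []
    for cidr, target in routes.items():
        network = ipaddress.ip_network(cidr)
        if destination in network:
            matching.append((network, target))
    if not matching:
        return None
    # The longest prefix is the most specific route.
    network, target = max(matching, key=lambda item: item[0].prefixlen)
    return str(network), target


# Hypothetical route table for illustration.
example_routes = {
    "0.0.0.0/0": "tgw-example",
    "10.0.0.0/8": "tgw-example",
    "10.20.0.0/16": "local",
}
```

With these example routes, `select_route(example_routes, "10.20.1.5")` returns `('10.20.0.0/16', 'local')`. Changing the `0.0.0.0/0` route would not alter that result, because the more specific `/16` still wins. Traffic to an address such as `8.8.8.8` matches only the default route, so it would be affected.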

Outputs to #modernisation-platform-low-priority-alarms.

vpc-changes

This alarm triggers when an API call is made to create, delete, or update a VPC. As we have existing VPCs created on a per-business-unit and per-environment basis, we do not expect to see this alarm trigger.

If this alarm triggers, confirm the VPC in question and find the cause of the alarm. Check our GitHub Actions to see whether any recent runs correspond to the triggered alarm.

As we make all durable changes through code, this should show you the cause of the vpc-changes alert.

Outputs to #modernisation-platform-low-priority-alarms.

Viewing metric filters

The metrics that inform some of these alarms are based on CloudTrail filters. You can find them here.
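For orientation when reading those filters: CloudTrail-based metric filters are typically written as a CloudWatch Logs JSON filter pattern that matches on `$.eventName`. The sketch below shows how such a pattern is put together. The helper name `build_filter_pattern` is hypothetical, and the event names in the usage comment are examples only; the filters linked above remain the source of truth for what each alarm matches.

```python
def build_filter_pattern(event_names):
    """Build a CloudWatch Logs JSON filter pattern matching any given event name."""
    if not event_names:
        raise ValueError("at least one event name is required")
    clauses = [f"($.eventName = {name})" for name in event_names]
    # Clauses are OR-ed together inside a single pair of braces.
    return "{ " + " || ".join(clauses) + " }"


# Example with two illustrative event names:
# build_filter_pattern(["CreateVpc", "DeleteVpc"])
# returns "{ ($.eventName = CreateVpc) || ($.eventName = DeleteVpc) }"
```

Reading a filter this way makes it easier to work out which API calls can trigger a given alarm and to search CloudTrail for the same event names.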

This page was last reviewed on 17 November 2023. It needs to be reviewed again on 17 May 2024 by the page owner #modernisation-platform.