Environments (AWS accounts)

What we’re trying to fix

Within the Ministry of Justice, AWS accounts and the applications that sit within them tend to be split by business unit and their software development lifecycle phase (SDLC).

This means that for most business units, they will have a production AWS account, which holds all of their production applications. This tends to span across other SDLC phases such as development, staging, test, and so on.

Whilst this works for business units working on their own applications with a finite number of cloud resources, scaling it into a platform for anyone within the Ministry of Justice to use raises some issues.

For example, if we had an account for fictional-business-unit-production and some example applications:

example-a, which is an application with a database that holds sensitive dataset A

example-b, which is an application with a database that holds sensitive dataset B

Issue 1: Blast radius

One of the risks of splitting AWS accounts at the granularity of business unit and their SDLC is the potential size of the blast radius - the radius in which damage could occur should something go wrong. At this level of granularity, the blast radius has a much wider impact on the business and has a high probability of affecting other resources and applications that sit within an AWS account.

The blast radius can be affected by anything, such as security, or an Availability Zone going offline.

Example: Compromise of security

Access keys are leaked for an IAM user in fictional-business-unit-production. The IAM user has an AdministratorAccess policy attached. A malicious attacker with these keys now has the ability to access both sensitive datasets: dataset A and dataset B.

Example: Loss of availability

fictional-business-unit-production had only ever configured their EC2 instances to run in one AZ. If this AZ goes down, all of their applications do too.

Issue 2: Team isolation

Similar to Issue 1, team isolation becomes extremely difficult to maintain if all applications in their SDLC stage are within one account.

Imagine team-a works on example-a and team-b works on example-b, both of which fall under fictional-business-unit.

Everything team-a does can be seen by team-b even if you use a well-defined attribute-based access control (ABAC). Using ABAC requires resources to be tagged, and it has no effect on untagged resources.ABAC also doesn’t work for all resources in AWS.

If team-a forgets to tag something, or creates a resource that doesn’t support tagging during creation, anyone who has access to that AWS account will be able to interact with it.

Example: Human error (misclick)

team-a notices an issue with one of their production EC2 instances. They try to terminate it, and notice that they’ve just terminated a production instance that belonged to team-b, which disrupts team-b‘s workflow whilst they redeploy their application. It takes time and money to reprovision these resources, and may affect application users.

Issue 3: Billing

In a shared AWS account for a business unit, untaggable billable items such as Data Transfer cannot easily be attributed to a particular application.

Billing granularity requires resource tagging, and it is useful to know how much an application costs to run to help prioritise the refactoring, replacement, or retirement of each application.

Further to this, having more granular billing can expose opportunities for optimisation: expensive operations or misconfigured resources such as instances that require right-sizing.

Example: Unbalanced untaggable billing

example-a stores 100TB of data in an S3 bucket and replicates it to another region, costing $2,048.00.

example-b stores 1TB of data in an S3 bucket and replicates it to another region, costing $20.48.

Since Data Transfer isn’t taggable, you can’t easily trace it back to the originating S3 bucket.

Issue 4: Cloud waste

When holding all applications in a SDLC account, it becomes difficult to retire application infrastructure, even if they’re covered by a good tagging policy. When deploying to the cloud, all infrastructure should be considered ephemeral and everything should be rebuildable from code.

If you destroy the wrong infrastructure when retiring an application, it becomes costly to recreate. If you accidently miss some infrastructure during de-provisioning, you inadvertently create ongoing cloud waste which is also costly.

Until you’ve retired all applications within a shared AWS account, you can’t close it, so there’s always a chance there are some resources left unaccounted for after an application has been refactored, replaced, or retired.

Example: Cloud waste from uncertainty

example-a has been retired, and is no longer of use. There are untagged resources that are thought to be part of example-a, but team-a is unsure, so they just leave it.

Example of destroying the wrong infrastructure

See Example: Human error (misclick).

What we investigated

Option A: Let teams decide

In the Modernisation Platform, we want to empower teams and grant them the autonomy to do whatever they need to support their applications and environments. Some complexities that can arise from delegating how infrastructure is separated are:

people tend to go for the easiest option, which creates technical debt further down the line
infrastructure quickly becomes hard to track
it can become tangled and messy
doesn’t support the central alignment we’re trying to achieve

However, raw autonomy does give some positives:

teams truly have autonomy and empowerment to run their infrastructure in the way they believe is best
teams can understand their own configuration as they are the ones who built it

Option B: One account, strong attribute-based access controls

Having one account is by far the easiest setup to have. You can use VPCs out of the box to enable cross-resource communication, you only have to set up IAM users once, and you can easily see all resources in one place.

Early on, we explored using strong attribute-based access control (ABAC) so teams can only view their own resources in an account, alongside strict IAM policies to stop cross-team resource interaction. Some of the issues with this is that:

ABAC isn’t supported on all AWS resources, and requires resources to be tagged (it doesn’t work on untagged resources)
finding the cost of an application becomes extremely hard to do, where billable items like Data Transfer aren’t easily attributable directly to a source
deprovisioning infrastructure or applications can become tricky if there are thousands of resources in one account
resources aren’t truly isolated, as they’re all in one account
the blast radius is huge

Option C: Accounts separated by business unit and SDLC stage

This is how most multi-account architectures are set up and it is recommended by most cloud providers to go to at least this level of granularity. Some issues with this are listed above in What we’re trying to fix.

It’s not incorrect or wrong to do things this way, but the Modernisation Platform would like to improve on the Ministry of Justice’s current working practices.

It does have its benefits:

there are a set of SDLC stages a business unit will typically use
it’s easy to work with, since everything for each SDLC stage is in one place
it logically separates business units from each other

Option D: Accounts separated by application and SDLC stage

Separating accounts by application is a more granular middle-ground of either having one account, or separating everything. It’s very similar to separating by business unit and SDLC stage but goes a step further to provide granularity at an application level.

It has some issues:

it’s not simple to do
it’s an uncommon pattern
users probably need some degree of centralisation, such as for networking
it has a higher risk of cloud waste, but is easier to track

Option E: Separate everything

Theoretically and technically, you can separate everything into different accounts; i.e. application, application component/layer, and SDLC stage. You could have something like this for the example A application:

example-a-api-production
example-a-backend-production
example-a-databases-production
example-a-frontend-production
example-a-api-development
example-a-backend-development
example-a-databases-development
example-a-frontend-development
example-a-landing-zone

Some issues with this are:

it becomes incredibly complex to maintain
it becomes difficult to track
it causes exponential growth in the number of AWS accounts
it becomes hard to sensibly group things, i.e. what makes up the frontend group, what needs to go into backend group
it is impossible to sustain as an approach as more teams are on-boarded onto the Modernisation Platform
most legacy applications aren’t microservices, so splitting things out is hard
huge room for error in cross-account sharing, if required

The benefits of this are:

it becomes easy to track costs per application and layer (frontend, backend)
each account has a tiny blast radius

What we decided

Overview

We decided to use separate AWS accounts per application as a middle-ground between separating everything and using one account with strong attribute-based access controls.

Whilst there is a trade-off to more complexity, we feel it’s outweighed by the benefits of doing this.

Some of the biggest benefits are:

granular isolation at an application level
better cost management
increased autonomy for teams
reduced cloud waste

Logically separate applications

The Modernisation Platform hosts a number of applications, each built in different ways. One of our goals is to facilitate modernisation of these applications through the onboarding process. That could mean: - moving away from on-premise hosted databases into managed services such as AWS Relational Database Service (RDS) - moving away from bastion hosts to agent-based instance management

Whilst app modernisation is our goal, we’re also aware that it won’t happen straight away, and may not be possible for some legacy applications.

By separating applications out into their own AWS accounts we can:

allow autonomy for all environments, including production, with more confidence due to a reduced blast radius
isolate different team workloads from each other
manage and control costs at an application level
reduce cloud waste by building test suites for application infrastructure, rather than for a whole platform
retire whole applications easily, by closing their AWS account
sensibly group applications without having to worry about which layer (frontend, backend) it sits in
separate legacy infrastructure from more modern infrastructure, and move them into the Cloud Platform when suitable
use Service Control Policies attached to single accounts to restrict centrally-managed actions on an application-by-application basis
utilise independently mapped Availability Zones so if one AZ goes down, it will have less of an impact in other accounts

More accounts aren’t always better

The risk of separating everything, down to layer (frontend, backend, database, etc) is too complex to maintain sensibly. It is easy to get carried away with splitting applications down into layers, and most of our hosted applications will be legacy, which are unlikely to be split into microservices. This makes it difficult or impossible to be cost efficient with migrations, especially if they are likely to be retired or rebuilt in the near future.

Whilst more accounts come at a free monetary cost, the time cost of defining, and maintaining them, becomes high.

The trade-off for complexity is (probably) worth it

Logically separating applications into their own AWS accounts isn’t an easy task, and can become complex.

The trade-off of having a smaller blast radius for legacy applications, better cost analysis, less cloud waste, and greater team isolation is worthwhile for the confidence we will have as a team to make infrastructure changes.

We’re always learning

Where there’s complexity, there’s an opportunity to learn and that will help us make the right decisions with our infrastructure.

We might find issues with this structure or find that it doesn’t work for our scale, and that’s OK, because we’ll have learnt exactly why by trying it.

Architecture

You can view our architecture for Environments (AWS accounts) on the dedicated Environments (AWS accounts) architecture page.

This page was last reviewed on 16 July 2025. It needs to be reviewed again on 16 July 2026 by the page owner #modernisation-platform .