SCPs at scale
by José Cegarra, Cloud Engineer
This blog post aims to show how we can get the most out of Service Control Policies (SCPs) in AWS. We will review some tips and tricks that hopefully make your SCPs better and cover some use cases you might find helpful. This blog post is based on SEC201 at AWS re:Invent 2021.
Introduction
First, let’s start with a basic definition of what an SCP is and what it isn’t. According to AWS documentation:
Service control policies (SCPs) are a type of organization policy that you can use to manage permissions in your organization. SCPs offer central control over the maximum available permissions for all accounts in your organization. SCPs help you to ensure your accounts stay within your organization’s access control guidelines. SCPs are available only in an organization that has all features enabled. SCPs aren’t available if your organization has enabled only the consolidated billing features.
So, long story short:
- They can limit the max allowed permissions, but they can’t grant access.
- They work at the principal level.
- They can be applied at Account level.
- They can be applied at OU level.
- They can be applied at the organization root level.
- They should be used to enforce security invariants: even the root user of a member account is restricted by SCPs.
How to manage the SCP maximum size limit
It is not likely to happen, but if your organization has very fine-grained SCPs that have grown over time and one of them reaches the maximum size limit, you will need to sort that out. A possible way to work around this limit is to add a ghost OU and attach another SCP to it; that way you get 5 KB + 5 KB of total SCP size.
Also, if you save the policy using the AWS Management Console, extra white space (such as spaces and line breaks) between JSON elements and outside of quotation marks is removed and not counted. If you save the policy using an SDK operation or the AWS CLI, the policy is saved exactly as you provide it and no characters are removed automatically.
Take the following scenario as an example: we have two accounts under the production OU, but because we manage everything in a granular way, we ran out of space. We then add a new OU under the production OU, and move the accounts into it, just to increase the SCP size available to them. The following diagram shows what this solution would look like.
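Because white space counts when a policy is saved through the AWS CLI or an SDK, it can help to minify the JSON and check its length before uploading it. The following Python snippet is a minimal sketch of that idea, not an official tool: the helper names are hypothetical, and the 5,120-character limit is an assumption taken from the AWS Organizations quotas, so verify it against the current documentation.

```python
import json

# Assumption: AWS documents a maximum SCP size of 5,120 characters.
SCP_MAX_CHARS = 5120


def minify_policy(policy: dict) -> str:
    """Serialize a policy without optional white space between JSON elements."""
    return json.dumps(policy, separators=(",", ":"))


def fits_in_scp(policy: dict) -> bool:
    """Return True if the minified policy is within the assumed size limit."""
    return len(minify_policy(policy)) <= SCP_MAX_CHARS


# Small illustrative policy (hypothetical, not taken from a real organization).
policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Deny", "Action": ["s3:*"], "Resource": "*"}],
}
minified = minify_policy(policy)
print(len(minified), fits_in_scp(policy))
```

Running a check like this in a deployment pipeline gives an early warning when a policy is getting close to the limit, before the ghost OU workaround becomes necessary.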
Be careful with global services/regional services
If you want to deny all regions except eu-west-1, for example, you could write an SCP that explicitly denies everything in the other regions. Be careful with that, though, because the global services are located in us-east-1: if you apply that policy, you would also be denying global services. What you can do instead is write a policy that uses a NotAction element to deny everything except the global services, with an added condition on the requested region.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyAllOutsideRequestedRegions",
      "Effect": "Deny",
      "NotAction": [
        "cloudfront:*",
        "iam:*",
        "route53:*",
        "support:*"
      ],
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:RequestedRegion": [
            "eu-west-1"
          ]
        }
      }
    }
  ]
}
In other words, this policy means:
[“Deny”] except [“cloudfront:*”,”iam:*”,”route53:*”,”support:*”] for [“all”] the resources, when [“requested region”] is not [“eu-west-1”].
The policy above uses the Deny > Condition Not Like structure, which denies every request except those that match the value in the condition.
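If you maintain this kind of policy for several OUs, generating it from two short lists can be easier than editing the JSON by hand. The following Python snippet is a minimal sketch of that approach; the function name build_region_deny_scp and its parameters are hypothetical, and the list of exempted services is simply the one used in the policy above, not a complete list of AWS global services.

```python
import json


def build_region_deny_scp(allowed_regions, global_service_actions):
    """Build a Deny + NotAction policy that only permits the given regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAllOutsideRequestedRegions",
                "Effect": "Deny",
                # Actions listed here are exempt from the regional deny.
                "NotAction": sorted(global_service_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": list(allowed_regions)
                    }
                },
            }
        ],
    }


scp = build_region_deny_scp(
    allowed_regions=["eu-west-1"],
    global_service_actions=["cloudfront:*", "iam:*", "route53:*", "support:*"],
)
print(json.dumps(scp, indent=2))
```

With the inputs shown, the generated document has the same content as the policy above, so adding a region or exempting another global service becomes a one-line change.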
How do I test my SCPs?
When testing SCPs, create a staging OU and always work bottom-up:
Account -> Staging/Test OU -> Production OU -> organization root (if needed). Bad configurations in SCPs can be very disruptive, so try to keep the number of affected accounts as small as possible at each step.
How do I add exceptions to SCPs?
Ideally, we would have no exceptions, but in real-world scenarios exceptions are common. A workaround is to have an exceptions OU with a more flexible SCP, where you place all the accounts that require that extra flexibility.
Let’s take the following example: our company only works in eu-west-1 with some global services, but we need third-party services that are deployed in a separate account and must run across several regions. In this case, instead of having a single SCP with thousands of lines under the production OU, which can be difficult to maintain, we just create another OU called Exceptions with a less restrictive SCP, while the other accounts stay secured in a separate OU.
Another workaround for exceptions in SCPs is to use tags to allow or deny actions for specific principals. This is more useful when the focus is on who can do X.
For example, add the tag allow-s3 = true to the principal. Then, in the SCP condition, check whether the tag has the desired value, and deny the action if it does not.
{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Deny",
    "Action": [
      "s3:*"
    ],
    "Resource": "*",
    "Condition": {
      "StringNotEquals": {
        "aws:PrincipalTag/allow-s3": "true"
      }
    }
  }
}
In other words, this policy means:
[“Deny s3”] for [“all”] the resources, when [“the principal allow-s3 tag”] is not [“true”]
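To reason about who this statement affects, the following Python snippet models it in a deliberately simplified way. It is only a sketch for illustration, not the real IAM policy evaluation logic, and the function name is_denied_by_tag_scp is hypothetical. It assumes the documented behavior of negated condition operators such as StringNotEquals, which also match when the key is missing, so principals without the tag are denied too.

```python
def is_denied_by_tag_scp(action: str, principal_tags: dict) -> bool:
    """Simplified model of the statement above: deny s3:* unless allow-s3 is "true"."""
    # The statement only covers S3 actions (action names are case-insensitive).
    if not action.lower().startswith("s3:"):
        return False
    # StringNotEquals is case-sensitive, and a missing tag never equals "true",
    # so untagged principals and principals tagged "True" are both denied.
    return principal_tags.get("allow-s3") != "true"


print(is_denied_by_tag_scp("s3:GetObject", {"allow-s3": "true"}))
print(is_denied_by_tag_scp("s3:GetObject", {}))
print(is_denied_by_tag_scp("ec2:DescribeInstances", {}))
```

In this model, only principals that carry the exact tag value are exempt from the deny. The SCP still grants nothing by itself, so those principals also need an identity-based policy that allows the S3 actions.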
Conclusion
SCPs are a helpful tool for governance in your organization, but they can be very disruptive if they are not configured correctly. The tricks shown in this blog post are an example of the KISS (keep it simple, stupid) principle applied to SCPs, along with some workarounds for when you can’t keep things that simple. Remember to:
- test your SCPs bottom-up;
- avoid denying global services in the us-east-1 region;
- add ghost OUs in case your SCP has reached the size limit; and
- use a separate OU for accounts that need exceptions.
At Edrans we had a rough time writing SCPs, but by following these tips and tricks we have left that behind. More specifically, we had trouble restricting workloads to one specific region without blocking global services (CloudFront, to be precise). Another real-world problem was maintaining a large SCP that contained only “Allow” statements: for each new service released by AWS, we had to add it to the SCP. We then realized that the Deny > Condition Not Like structure shown earlier covers pretty much everything with far fewer characters. After all, it is all about keeping it simple.