Struggling with Cloud Security?

Aman Bansal
Jun 4
7 min read

Updated: Jun 7

If you're reading this in 2025, it's likely your organization relies heavily on cloud providers like AWS, GCP, or Azure for agility, scalability, and cost-efficiency. But this shift often introduces new complexities and a perceived loss of control - especially in the realm of security. If you're struggling with cloud security, you're not alone. In fact, it's one of the top concerns for Information Security Teams today.

Securing traditional infrastructure was already challenging. The cloud introduces a new layer of complexity with the shared responsibility model - not just between you and your cloud provider, but also across internal teams like DevOps or Infrastructure.

After managing cloud security for several years, I’ve realized that a well-defined process is critical. Cloud security isn't solely the responsibility of the security team—it must also involve those building infrastructure and deploying code. Defining a collaborative program with clear expectations and stakeholder alignment is the key to long-term success.

Note that this article covers only the basics of a cloud security program; you can extend it as far as your environment requires.

One piece of advice from a mentor that stuck with me:

‘Make others part of your problem—it leads to better collaboration.’

In security, you're often your own project manager.

This article uses AWS examples because most of my experience is with AWS, but the program discussed here applies to any cloud provider, be it Azure or GCP.

Let's get into discussing the key areas to focus on for any Cloud Security Program:

Preventive Controls

Security Architecture:

Cloud architecture isn't just about deploying scalable apps - it's the foundation for how secure (or vulnerable) your environment is. A well-known line on this: "A well-architected cloud environment should bake in security from day one."

Any new service should go through security design review (there should be a formal technical design review process in the organization) to incorporate defence-in-depth at each step.

TIP:

Always filter your inbound and outbound traffic with a combination of tools such as network segmentation, Security Groups, Network Firewall, or DNS Firewall.
A WAF should be the only entry point for inbound traffic.

IAM:

Let's step back and look at what Identity and Access Management consists of. An identity can be a human account or a non-human account, and Access Management controls what these identities can access and which permissions they have.

In the AWS cloud:

Identity -> IAM User
Role -> IAM Role
Access Management -> IAM Policy attached to the IAM User/Role. The IAM Policy states which actions can be performed.

In my experience, certain aspects of IAM need to be handled with caution:

Handling of root accounts: Who has access to the root accounts?
IAM Users vs. IAM Roles: Are you relying on IAM Users or IAM Roles?
- Use SSO through your IdP (e.g., Okta) and assign IAM Roles to the IdP identities, instead of having IAM Users access cloud resources directly.
- If IAM Users are necessary, enforce access key rotation every 90 to 120 days. Access keys are a hassle to manage compared to IAM Roles, which is why I strongly suggest never designing your access system around IAM Users.
IAM Roles: Roles should be tightly restricted and follow the least privilege principle. I suggest starting from provider-managed policies (in AWS, AWS managed policies), which carry a specific set of permissions scoped to a single service.
If needed, use SCPs (Service Control Policies, discussed in detail below) to restrict the actions IAM Users/Roles can perform in your cloud accounts.
There are also newer solutions, such as https://sonraisecurity.com/, that aim to automatically protect every identity, human and non-human.
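To make the access key rotation point concrete, here is a minimal Python sketch that flags active keys older than the rotation window. It assumes the input is a list of dicts shaped like the AccessKeyMetadata entries returned by IAM's ListAccessKeys call (AccessKeyId, Status, CreateDate as a timezone-aware datetime). The function name and the 90-day default are my own choices for illustration.

```python
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE_DAYS = 90  # lower end of the 90 to 120 day rotation window


def stale_access_keys(key_metadata, now=None, max_age_days=MAX_KEY_AGE_DAYS):
    """Return the IDs of active access keys older than max_age_days.

    key_metadata is assumed to look like the AccessKeyMetadata list from
    IAM's ListAccessKeys response (CreateDate must be timezone-aware).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        key["AccessKeyId"]
        for key in key_metadata
        if key["Status"] == "Active" and key["CreateDate"] < cutoff
    ]
```

You could feed this from whatever inventory job already lists your IAM Users and open a ticket for each key it returns.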

TIP: Put a TPAM (Temporary Privileged Access Management) solution in place to grant temporary, well-audited permissions through IAM Roles. This is a must-have for any successful IAM program.
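One way to picture temporary permissions, if you do not have a dedicated tool yet, is a policy that simply stops matching after a deadline. The sketch below builds such a policy document in Python using the aws:CurrentTime condition key with DateLessThan. The function name and parameters are hypothetical, and a real TPAM product will do much more (approvals, audit trail, revocation).

```python
from datetime import datetime, timedelta, timezone


def time_boxed_policy(actions, resources, granted_at, duration_minutes):
    """Build an IAM policy document that only allows the actions until it expires.

    granted_at must be a timezone-aware datetime; it is converted to UTC
    because IAM date conditions are written in ISO 8601 UTC format.
    """
    expires_at = granted_at.astimezone(timezone.utc) + timedelta(minutes=duration_minutes)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": list(resources),
                "Condition": {
                    "DateLessThan": {
                        "aws:CurrentTime": expires_at.strftime("%Y-%m-%dT%H:%M:%SZ")
                    }
                },
            }
        ],
    }
```

The design choice here is that access expires on its own, so nobody has to remember to remove it; you still want the grant itself logged somewhere auditable.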

Baselining:

If you are using the cloud, you are most likely using infrastructure as code (IaC) to define and manage your infrastructure rather than configuring it manually. If you are not, that is a separate problem to deal with, and I would not recommend building resources without IaC. The most popular tool is Terraform, but Chef, Puppet, AWS CloudFormation, and Ansible are also options.

Baselining defines the default configuration a resource gets when it is created. If you are using Terraform modules to create AWS resources, each module should be written so that it creates the resource with secure settings by default. You can use the AWS Foundational Security Best Practices standard as the reference for the baseline of all your resources.

Ex: S3 buckets are critical from a security-controls perspective. Define your module so that any S3 bucket is created with default configurations such as the following:

S3 general-purpose buckets should have block public access settings enabled: The public access block should be enabled by default. If needed, define it explicitly in the module.
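A baseline is only useful if you can check resources against it. As a rough sketch, the Python below compares a bucket's public access block settings with the baseline. The four setting names are the ones S3 uses in its PublicAccessBlockConfiguration; the function name and the idea of feeding it a plain dict are assumptions for illustration.

```python
# Baseline: all four S3 public access block settings must be on.
REQUIRED_PUBLIC_ACCESS_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def baseline_violations(public_access_block):
    """Return the setting names that deviate from the baseline.

    A setting that is missing from the input is treated as disabled.
    """
    return sorted(
        name
        for name, required in REQUIRED_PUBLIC_ACCESS_BLOCK.items()
        if public_access_block.get(name, False) != required
    )
```

An empty list means the bucket matches the baseline; anything else is a candidate for a fix in the module rather than a one-off manual change.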

SCP (Service Control Policies):

In AWS, SCPs offer central control over the maximum available permissions for the IAM users and IAM roles in your organization. An SCP defines a permission guardrail, or sets limits, on the actions that the IAM users and IAM roles in your organization can perform.

Ex: KMS keys should be rotated at least every 365 days. SCPs can act as a guardrail for this, for example by denying kms:DisableKeyRotation so that no one can turn off key rotation once it is enabled.
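As a hedged sketch of a rotation-related guardrail: an SCP cannot switch rotation on by itself, but it can deny the kms:DisableKeyRotation action so rotation stays on once it is enabled. The Python below only builds the policy JSON; the Sid and function name are made up for illustration, and attaching the policy to an OU is left to your own tooling.

```python
import json


def deny_disable_kms_rotation_scp():
    """Build an SCP document that blocks turning off KMS key rotation."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyDisablingKmsKeyRotation",  # illustrative name
                "Effect": "Deny",
                "Action": ["kms:DisableKeyRotation"],
                "Resource": "*",
            }
        ],
    }


# Serialized form, ready to paste into an SCP or hand to your IaC tooling.
scp_document = json.dumps(deny_disable_kms_rotation_scp(), indent=2)
```

Keeping SCPs in code like this also makes it easier to review them and to roll them out to test accounts first.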

These are some of the Best Practices to follow for SCPs: https://aws.amazon.com/blogs/industries/best-practices-for-aws-organizations-service-control-policies-in-a-multi-account-environment/

Note: Test your SCPs thoroughly in lower (non-production) accounts before rolling them out to production accounts.

Security Hub:

In AWS, Security Hub provides a comprehensive view of your security state and helps you assess your AWS environment against security industry standards and best practices.

Security Hub collects security data across AWS accounts, AWS services, and supported third-party products, helping you analyze your security trends and identify the highest-priority security issues. I would recommend enabling standards like AWS Foundational Security Best Practices and the CIS AWS Foundations Benchmark to review your AWS accounts.

Some of the Benefits:

Reduced effort to collect and prioritize findings.
Automatic security checks against best practices and standards.
Consolidated view of findings across accounts and providers.
Ability to automate finding updates and remediation.
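To illustrate the prioritization benefit, here is a small Python sketch that keeps only open findings and orders them by severity. It assumes findings are dicts in the AWS Security Finding Format, using the Severity.Label and Workflow.Status fields; the function name is my own.

```python
# Lower number means higher priority.
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3, "INFORMATIONAL": 4}
OPEN_WORKFLOW_STATUSES = ("NEW", "NOTIFIED")


def prioritize(findings):
    """Return open findings sorted from most to least severe.

    Findings with an unknown or missing severity label sort last.
    """
    open_findings = [
        f for f in findings
        if f.get("Workflow", {}).get("Status") in OPEN_WORKFLOW_STATUSES
    ]
    return sorted(
        open_findings,
        key=lambda f: SEVERITY_ORDER.get(
            f.get("Severity", {}).get("Label"), len(SEVERITY_ORDER)
        ),
    )
```

The same ordering works well as the default sort for a triage queue or a weekly report.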

Auto-Patching:

How are you taking care of your EC2 instances? Are you patching vulnerabilities in your EC2 AMIs, including kernel vulnerabilities?

I am a big fan of AWS Systems Manager Patch Manager, which makes patching automated. You can use Patch Manager to apply patches for both operating systems and applications. You can scan instances to see only a report of missing patches, or you can scan and automatically install all missing patches.
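Once scans are running, you will want to know which instances are behind. As a minimal sketch, the Python below picks those out from patch state records. It assumes dicts shaped like the InstancePatchStates entries that Systems Manager returns (InstanceId, MissingCount, FailedCount); the function name is hypothetical.

```python
def instances_needing_patches(patch_states):
    """Return the sorted IDs of instances with missing or failed patches.

    patch_states is assumed to look like the InstancePatchStates list
    from Systems Manager; absent counters are treated as zero.
    """
    return sorted(
        state["InstanceId"]
        for state in patch_states
        if state.get("MissingCount", 0) > 0 or state.get("FailedCount", 0) > 0
    )
```

A list like this is a handy metric to track over time: it should trend toward empty after each patch window.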

Moreover, if you run your application in Linux-based containers, AWS Bottlerocket is a good fit and spares you much of the hassle of regularly patching security issues. Since it includes only the essential software needed to run containers, it improves resource utilization and reduces the attack surface compared to general-purpose operating systems. Bottlerocket has also been certified by the Center for Internet Security (CIS) as hardened to the CIS Bottlerocket Benchmark v1.0.0, so organizations that use Bottlerocket can be assured that it will run in a CIS-hardened environment.

Install EDR:

Make sure you install an EDR agent (CrowdStrike, SentinelOne, etc.) across all your EC2 instances. It is very important to monitor, detect, and respond to threats on your compute resources. EDR tools provide real-time visibility into endpoint behavior, enabling rapid incident response and remediation. There are various ways to install an EDR agent on your EC2 instances, such as a user data script. If you are using EKS clusters, try using a Helm chart to deploy the EDR across your EKS nodes.

More on this: https://aws.amazon.com/marketplace/solutions/media-entertainment/edr/
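Installing the agent is one half; knowing where it is missing is the other. The Python sketch below is a simple coverage check. It assumes you can export two lists, the running EC2 instance IDs and the instance IDs your EDR console reports, and both inputs and the function name are hypothetical.

```python
def edr_coverage_gaps(running_instance_ids, edr_reported_instance_ids):
    """Return the sorted IDs of running instances that the EDR does not report."""
    return sorted(set(running_instance_ids) - set(edr_reported_instance_ids))
```

Running a check like this on a schedule catches instances launched outside your standard AMIs or user data.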

Centralized Logging and Monitoring:

In AWS, without unified logging of CloudTrail and CloudWatch, you’re flying blind. Set up centralized CloudTrail logging and send it to your SIEM so that every API action in your AWS accounts is logged.
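Before relying on CloudTrail as your central log source, it is worth checking how the trail itself is configured. This minimal Python sketch inspects a trail description, assumed to be a dict shaped like the entries CloudTrail returns when you describe trails (IsMultiRegionTrail, IsOrganizationTrail, LogFileValidationEnabled). Which checks matter is my own suggestion, not a complete list.

```python
# Field name -> message reported when the field is missing or false.
TRAIL_CHECKS = {
    "IsMultiRegionTrail": "trail does not cover all regions",
    "IsOrganizationTrail": "trail is not organization-wide",
    "LogFileValidationEnabled": "log file validation is off",
}


def trail_gaps(trail):
    """Return a list of human-readable gaps for one trail description."""
    return [
        message
        for field, message in TRAIL_CHECKS.items()
        if not trail.get(field, False)
    ]
```

An empty result suggests the trail is a reasonable candidate to forward to your SIEM as the single source of API activity.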

Responsive Controls

CSPM (Cloud Security Posture Management):

With all of the above preventive controls in place, you still need a CSPM solution to catch any misconfigurations and vulnerabilities that slipped through.

I am a big fan of Wiz for CSPM, as it provides visibility into your cloud like no other. A CSPM is not built just for the security team; it is also very useful for DevOps/Infra teams. It helps you:

Maintain a centralized inventory of resources.
Detect misconfigurations across your cloud resources.
Detect vulnerabilities in your cloud resources, mapped to CVE data.
Triage misconfigurations and vulnerabilities.
Detect threats at container runtime in your EKS clusters.

Moreover, Wiz also provides visibility into your data storage systems: where your PII lives in the cloud, what controls you have to secure that data, which gaps you need to fill to get DSPM (Data Security Posture Management) right, and much more.

SOC Controls:

SOC involvement in cloud security is often limited because of a lack of understanding of cloud resources and of how to set up detections or alerts that find anomalies in logs such as AWS CloudTrail. That's where the cloud security team and the SOC need to work closely to put relevant detections in place.

Moreover, please have your SOC analysts learn cloud to make this program successful. A pattern I have noticed is that SOC teams have still not built up knowledge around securing cloud services. Please support your SOC analysts in spending time on learning paths; AWS certifications are pretty robust.

CloudTrail logs are a goldmine for building your detections in the cloud.

Some must-have detections/alerts:

Enable Amazon GuardDuty and triage its findings together with the cloud security team.
Use of root accounts.
Creation of security groups with a rule that allows 0.0.0.0/0.
Security group rules that expose sensitive ports such as 22.
If you have enabled Network Firewall/DNS Firewall, set up alerts for any suspicious outbound traffic.
Build dashboards/metrics in your SIEM to continuously monitor your posture and functions.
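Two of these detections (root usage and open security group rules) can be sketched directly against CloudTrail events. The Python below assumes each event is a parsed CloudTrail record and that AuthorizeSecurityGroupIngress events carry rules under requestParameters.ipPermissions.items; verify that layout against your own logs. The function names and the port list are mine, and IPv6 ranges are not covered.

```python
OPEN_CIDR = "0.0.0.0/0"
SENSITIVE_PORTS = {22}  # SSH, as in the list above; extend for your environment


def is_root_activity(event):
    """True when the CloudTrail event was performed by the account root user."""
    return event.get("userIdentity", {}).get("type") == "Root"


def open_ingress_rules(event):
    """Return the rules in an AuthorizeSecurityGroupIngress event that allow 0.0.0.0/0.

    Each hit records the group, the port range, and whether a sensitive
    port falls inside that range.
    """
    if event.get("eventName") != "AuthorizeSecurityGroupIngress":
        return []
    params = event.get("requestParameters") or {}
    hits = []
    for perm in (params.get("ipPermissions") or {}).get("items", []):
        ranges = (perm.get("ipRanges") or {}).get("items", [])
        if not any(r.get("cidrIp") == OPEN_CIDR for r in ranges):
            continue
        from_port, to_port = perm.get("fromPort"), perm.get("toPort")
        # Protocol "-1" or a rule without ports covers every port.
        all_ports = (
            perm.get("ipProtocol") == "-1" or from_port is None or to_port is None
        )
        sensitive = all_ports or any(
            from_port <= port <= to_port for port in SENSITIVE_PORTS
        )
        hits.append(
            {
                "groupId": params.get("groupId"),
                "fromPort": from_port,
                "toPort": to_port,
                "sensitive": sensitive,
            }
        )
    return hits
```

In practice you would express the same logic in your SIEM's query language; the sketch is mainly useful for agreeing with the SOC on what exactly should fire.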

Alerting setups vary with the cloud environment and architecture. Always brainstorm what else is worth monitoring.

Triage Process:

Defining and documenting a triage process is as important as anything else in a security program. I have seen many security programs fail even with all the security controls in place.

Different teams own different parts of the cloud. The Infrastructure/DevOps team creates the cloud resources, and the Engineering team deploys code into them. Hence, collaboration becomes the biggest part of the job when triaging any cloud incident. Build your triage process, present it to all the necessary stakeholders, and set the right expectations from the beginning; otherwise you will always struggle and end up chasing teams to fix issues and vulnerabilities.

I cannot emphasize enough that everything will fall apart if you don't have a well-documented triage process in place.

Summary:

Cloud security is a continuous journey. Regularly assess and evolve your security controls. Audit everything, detect misconfigurations, investigate anomalies, and visualize your security posture through clear metrics. Most importantly—collaborate. Security is everyone’s responsibility.

Please write back if you find I missed anything.
