Management and Governance
AWS CloudFormation
AWS CloudFormation is a service that provides Infrastructure as Code.
CloudFormation templates
are used to provision the infrastructure.
Templates can be written in JSON or YAML.
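As a rough illustration of the template format, the sketch below builds a minimal template as a Python dictionary and serializes it to JSON. The logical ID `NotesBucket` and the description text are hypothetical names chosen for this example; `AWS::S3::Bucket` is a standard resource type.

```python
import json

# A minimal CloudFormation template expressed as a Python dict.
# "NotesBucket" is a hypothetical logical ID chosen for this example.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal example: a single S3 bucket",
    "Resources": {
        "NotesBucket": {
            "Type": "AWS::S3::Bucket",
        }
    },
}

# Serialize to JSON, one of the two supported template formats.
template_json = json.dumps(template, indent=2)
print(template_json)
```

The same structure could be written in YAML; only the serialization differs.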
Stack
is a collection of AWS resources that is created, updated, and deleted as a single unit.
StackSets
allow users to create, update, or delete stacks across multiple accounts and regions with a single operation.
Change Sets
allow users to preview how proposed changes to a stack might impact their running resources before implementation.
Instead of updating a stack directly from an edited template, you can create a change set, review how the changes will affect your resources, and then execute it.
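The idea of previewing changes before applying them can be approximated locally. The sketch below is not the CloudFormation API; it is a simplified comparison of the `Resources` sections of two templates, and the function name `preview_changes` and the sample resources are assumptions made for illustration. The real service computes change sets server-side and reports more detail, such as whether a resource must be replaced.

```python
def preview_changes(current, proposed):
    """Compare the Resources of two templates and list Add/Modify/Remove actions."""
    old = current.get("Resources", {})
    new = proposed.get("Resources", {})
    changes = []
    for name in sorted(set(old) | set(new)):
        if name not in old:
            changes.append((name, "Add"))
        elif name not in new:
            changes.append((name, "Remove"))
        elif old[name] != new[name]:
            changes.append((name, "Modify"))
    return changes

# Hypothetical templates: the queue is dropped, the bucket gains versioning,
# and a topic is introduced.
current = {"Resources": {
    "Queue": {"Type": "AWS::SQS::Queue"},
    "Bucket": {"Type": "AWS::S3::Bucket"},
}}
proposed = {"Resources": {
    "Bucket": {"Type": "AWS::S3::Bucket",
               "Properties": {"VersioningConfiguration": {"Status": "Enabled"}}},
    "Topic": {"Type": "AWS::SNS::Topic"},
}}

changes = preview_changes(current, proposed)
print(changes)
```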
Templates can be stored in CodeCommit, and CodePipeline and CodeBuild can be used for CI/CD.
CloudFormation has no built-in scheduler; to run stack operations at specific times, trigger them from a scheduling service such as EventBridge.
Deleting a stack deletes all of its resources, except those whose DeletionPolicy is set to Retain.
AWS Cloud Development Kit (CDK)
CDK lets you define your infrastructure in general-purpose programming languages such as TypeScript, Python, and Java.
cdk synth
synthesizes the CloudFormation templates
cdk deploy
deploys the infrastructure
Amazon CloudWatch
Amazon CloudWatch monitors your AWS resources and applications in near real time.
Metrics are recorded and alarms can be generated on thresholds. Logs of all services can be collated.
Events can be configured in EventBridge to respond to state changes in AWS resources.
Dashboards are available, including a customizable home page.
A common use-case is for EC2 instances to auto-scale based on CloudWatch alarms.
CloudWatch Metric Streams
allow CloudWatch metrics to be sent to Kinesis Data Firehose and third-party service providers such as Datadog, Dynatrace, New Relic, Splunk, and Sumo Logic.
CloudWatch Logs
CloudWatch Logs are organized into Log groups
which have arbitrary names and usually represent an application.
Log stream
is a sequence of log events from a single source, such as one instance of an application.
Log expiration policies can be set.
CloudWatch Logs can send logs to S3, Kinesis Data Streams, Firehose, Lambda, and OpenSearch.
CloudWatch Logs Insights
can query CloudWatch Logs.
Subscription filters can be set on the CloudWatch Logs.
By default, EC2 does not send logs to CloudWatch. The instance needs the CloudWatch Agent and an IAM role with the necessary permissions.
CloudWatch Agent
can also collect system-level metrics from on-premises servers so that they can be viewed alongside AWS metrics for comprehensive monitoring.
CloudWatch Alarms
CloudWatch Alarms can monitor metrics.
They can be used to trigger stop, terminate, reboot, or recover actions on an EC2 instance.
They can be configured to work with autoscaling groups or to send a message to SNS.
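To make the threshold behaviour concrete, here is a minimal sketch of how an alarm state could be derived from recent datapoints. It is a simplification and not CloudWatch's actual evaluation logic (which also has configurable handling of missing data); the function name `alarm_state` and the sample CPU values are assumptions. The state names `OK`, `ALARM`, and `INSUFFICIENT_DATA` are the ones CloudWatch uses.

```python
def alarm_state(datapoints, threshold, evaluation_periods, datapoints_to_alarm):
    """Return the alarm state for the most recent evaluation window.

    The alarm fires when at least `datapoints_to_alarm` of the last
    `evaluation_periods` datapoints are above `threshold`.
    """
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    window = datapoints[-evaluation_periods:]
    breaching = sum(1 for value in window if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"

# Hypothetical CPU utilization samples, oldest first.
cpu_percent = [35.0, 42.0, 81.0, 88.0, 93.0]
state = alarm_state(cpu_percent, threshold=80.0,
                    evaluation_periods=3, datapoints_to_alarm=3)
print(state)
```

An Auto Scaling policy or an SNS notification would then be attached as the action taken when the state becomes `ALARM`.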
Composite Alarms
can combine more than one alarm into a single alarm using AND, OR, and NOT conditions.
CloudWatch Container Insights
collects, aggregates, and summarizes metrics and logs from containers. It works for ECS, EKS, and Kubernetes on EC2.
CloudWatch Lambda Insights
aggregates metrics and diagnostic information, such as cold starts, for Lambda functions.
CloudWatch Contributor Insights
analyzes log data to build time series that show the top contributors and their usage, e.g. the IP addresses generating the highest traffic.
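The "top contributors" idea can be sketched with a simple counter. This is only a local illustration of the concept, not how Contributor Insights is configured; the log line layout (client IP as the first field) and the function name `top_contributors` are assumptions.

```python
from collections import Counter

def top_contributors(log_lines, n=2):
    """Count requests per client IP (first field of each line) and return the top n."""
    counts = Counter(line.split()[0] for line in log_lines if line.strip())
    return counts.most_common(n)

# Hypothetical access log lines: "<client-ip> <method> <path>".
access_log = [
    "10.0.0.5 GET /index.html",
    "10.0.0.9 GET /login",
    "10.0.0.5 GET /cart",
    "10.0.0.7 GET /index.html",
    "10.0.0.5 POST /checkout",
    "10.0.0.9 GET /index.html",
]

top = top_contributors(access_log)
print(top)
```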
CloudWatch Application Insights
provides automatic dashboards to troubleshoot your application and related AWS services.
Amazon EventBridge
Amazon EventBridge facilitates decoupling and scalability of applications. It provides event processing capabilities at scale, event routing and filtering.
It can run scheduled jobs (e.g. invoke a script every hour, like a cron job) and trigger code in reaction to something a service does.
Every account has a default Event Bus that receives events from AWS services. A Partner Event Bus receives events from 3rd party services.
Rules can be set up to filter events and send them to targets.
Pipes
route events from one source to one target.
Scheduler
invokes targets on a one-time or recurring schedule, defined with cron or rate expressions.
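Recurring schedules in EventBridge are commonly written as rate expressions such as `rate(1 hour)`. The sketch below converts such an expression into a Python `timedelta`. The helper name `rate_to_timedelta` is an assumption, and the parser is slightly more lenient than the service (it does not check that singular and plural units agree with the value).

```python
import re
from datetime import timedelta

# rate(<positive integer> <unit>), e.g. "rate(1 hour)" or "rate(5 minutes)".
_RATE = re.compile(r"^rate\((\d+) (minute|minutes|hour|hours|day|days)\)$")

def rate_to_timedelta(expression):
    """Convert a rate expression into the interval between invocations."""
    match = _RATE.match(expression)
    if not match or int(match.group(1)) < 1:
        raise ValueError(f"not a valid rate expression: {expression!r}")
    value, unit = int(match.group(1)), match.group(2)
    # timedelta expects plural keyword names: minutes, hours, days.
    if not unit.endswith("s"):
        unit += "s"
    return timedelta(**{unit: value})

interval = rate_to_timedelta("rate(1 hour)")
print(interval)
```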
Some sources include an EC2 instance change, CodeBuild build failures, S3 object uploads etc.
Filters can be set to determine which events to process.
Each event is a JSON document, which is delivered to the targets of the rules it matches.
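Filtering can be pictured as matching an event pattern against the JSON event. The matcher below is a deliberately simplified sketch: it supports only exact values (a list in the pattern means "any of these") and nested objects, and omits the richer operators that EventBridge patterns offer. The function name `matches` and the instance ID are assumptions; the `source` and `detail-type` values are those of EC2 state-change events.

```python
def matches(pattern, event):
    """Return True if every field in the pattern matches the event.

    Simplified rules: a list in the pattern means "any of these values";
    a nested dict is matched recursively against the same key in the event.
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True

# Pattern: EC2 instances that were stopped or terminated.
pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}

# Hypothetical event; the instance ID is made up.
event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}

print(matches(pattern, event))
```

Fields that appear in the event but not in the pattern (here `instance-id`) are ignored, so a pattern only needs to name the fields it cares about.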
EventBridge Archive and Replay
lets you store EventBridge events and replay them later; it is the most efficient and cost-effective way to do so.
AWS CloudTrail
AWS CloudTrail provides governance, compliance and audit for your AWS accounts.
It allows you to get a history of events or API calls made within your AWS Account.
Actions taken through the SDK, CLI, or Console, including those of IAM users, are audited.
CloudTrail keeps 90 days of event history. If events are needed for longer, a trail can deliver logs to CloudWatch Logs or an S3 bucket, and Athena can be used to query the data in S3.
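Once logs are delivered to S3, each log file is a JSON object with a top-level `Records` list. As a small sketch of what auditing that data looks like, the code below filters records by `eventName`. The sample records, user names, and the helper `who_called` are made up for illustration; at scale this kind of query would typically be run with Athena instead.

```python
import json

# A made-up log file in the shape CloudTrail delivers to S3.
raw_log = json.dumps({"Records": [
    {"eventTime": "2024-01-05T10:15:00Z", "eventSource": "ec2.amazonaws.com",
     "eventName": "TerminateInstances",
     "userIdentity": {"type": "IAMUser", "userName": "alice"}},
    {"eventTime": "2024-01-05T10:20:00Z", "eventSource": "s3.amazonaws.com",
     "eventName": "CreateBucket",
     "userIdentity": {"type": "IAMUser", "userName": "bob"}},
    {"eventTime": "2024-01-05T11:02:00Z", "eventSource": "ec2.amazonaws.com",
     "eventName": "TerminateInstances",
     "userIdentity": {"type": "AssumedRole"}},
]})

def who_called(raw, event_name):
    """Return (time, caller) pairs for every record with the given eventName.

    Falls back to the identity type when no userName is present.
    """
    records = json.loads(raw)["Records"]
    return [
        (r["eventTime"], r["userIdentity"].get("userName", r["userIdentity"]["type"]))
        for r in records
        if r["eventName"] == event_name
    ]

calls = who_called(raw_log, "TerminateInstances")
print(calls)
```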
CloudTrail Insights
detects unusual activity in your account.
AWS Config
AWS Config helps with auditing and recording compliance of your AWS resources.
E.g. it can identify security groups that allow unrestricted SSH access to all IP addresses.
We can create rules for compliance and AWS Config can show all the resources which violate compliance.
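The unrestricted-SSH example can be sketched as a compliance check over security group descriptions. This is an illustrative approximation, not the managed AWS Config rule itself: the function names and sample group IDs are assumptions, the input mimics the shape of EC2 security group descriptions, and IPv6 ranges are ignored for brevity.

```python
def allows_open_ssh(security_group):
    """Return True if any inbound rule opens port 22 to 0.0.0.0/0."""
    for permission in security_group.get("IpPermissions", []):
        protocol = permission.get("IpProtocol")
        if protocol == "-1":  # "-1" means all protocols and ports
            covers_ssh = True
        elif protocol == "tcp":
            covers_ssh = permission.get("FromPort", 0) <= 22 <= permission.get("ToPort", 65535)
        else:
            covers_ssh = False
        open_to_world = any(
            r.get("CidrIp") == "0.0.0.0/0" for r in permission.get("IpRanges", [])
        )
        if covers_ssh and open_to_world:
            return True
    return False

def evaluate(groups):
    """Map each group ID to a compliance result."""
    return {
        g["GroupId"]: "NON_COMPLIANT" if allows_open_ssh(g) else "COMPLIANT"
        for g in groups
    }

# Hypothetical security groups.
groups = [
    {"GroupId": "sg-open", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
    {"GroupId": "sg-office", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "203.0.113.0/24"}]}]},
    {"GroupId": "sg-web", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
]

results = evaluate(groups)
print(results)
```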
AWS Config Remediations
can automatically bring non-compliant resources back to a compliant state, e.g. re-configure your Security Groups to their correct state.
AWS Config Notifications
notify you (for example by email through an SNS topic) when a resource changes, e.g. when someone modifies your EC2 instances' security group.
AWS X-Ray
AWS X-Ray is a tracing tool that receives traces from applications and AWS services.
Traces are a collection of Segments.
AWS X-Ray helps developers analyze and debug production and distributed applications by providing insights into the performance and errors.
AWS Health Dashboard
AWS Health Dashboard provides alerts and guidance for changes that can affect your AWS environment.
E.g. AWS maintenance activities
A public event
is one that affects all customers.
A private event
is one that affects only your account or a region that you use.
Amazon Managed Service for Prometheus
Prometheus is a time series database and an open source monitoring solution.
Amazon Managed Service for Prometheus is Amazon’s managed offering for Prometheus. It collects metrics from your applications.
Metrics are queried with the PromQL language.
Targets
are services from which Prometheus should collect data.
Retriever
collects the metrics from the target.
It works well with dashboard visualization tools like Grafana.
Prometheus uses service discovery to identify all services to collect metrics from.
Prometheus can integrate with CloudWatch for alerts.
Alert Manager
is the Prometheus component that handles alerts; it can publish to an SNS topic, which then delivers the alert.
Amazon Managed Grafana
Grafana is an open source project that provides high quality dashboards to visualize data.
It offers richer visualization capabilities than CloudWatch dashboards.
Grafana can query metrics stored in data sources such as Prometheus and display them in custom dashboards.
Amazon Managed Grafana is Amazon’s managed offering for Grafana.
AWS Trusted Advisor
AWS Trusted Advisor is a tool that helps you follow AWS best practices.
It can advise on the following:
- Cost Optimization: helps reduce costs by identifying idle resources
- Performance: reviews usage and configurations
- Security: recommends best practices
- Fault tolerance: checks autoscaling groups and backups
- Service Quotas: monitors maximum allowed resources and alerts when usage reaches 80% of a quota
AWS Launch Wizard
AWS Launch Wizard simplifies the process of deploying well known 3rd party applications through pre-configured templates.
Launch Wizard itself comes at no extra charge.
Compute Optimizer
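AWS Compute Optimizer analyzes utilization metrics and recommends right-sized configurations for resources such as EC2 instances, EBS volumes, ECS services on Fargate, and Lambda functions.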
Its features include:
- Performance Risk Analysis
- Cost-saving Recommendations
- EC2 Instance Type Recommendations
- EBS Volume Recommendations
- Optimization for Fargate
AWS Organizations
AWS Organizations is a global service.
It simplifies the process of managing multiple AWS accounts.
It offers centralized billing across all accounts.
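Every organization has a single Root, the top-level container for all of its accounts and organizational units.
Organizational units (OUs)
help group accounts so that they can be managed and governed together.
The management account
creates the organization and handles management activities like adding and removing accounts.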
Member accounts
can only be part of one organization.
Reserved Instance and Savings Plans discounts can be shared across accounts.
Service Control Policies (SCPs)
are a type of organization policy that you can use to manage permissions in your organization.
SCPs do not apply to the management account.
FullAWSAccess
SCP is attached to the Root, every OU, and every account by default.
An explicit Deny in an SCP attached to an OU cannot be overridden at the account level.
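The way SCPs combine down the hierarchy can be sketched as follows: an action is within the permitted boundary only if every level from the Root to the account allows it and no level explicitly denies it. This is a simplified model under stated assumptions: the function names are hypothetical, conditions and resources are ignored, and action matching here is case-sensitive. SCPs only set the maximum available permissions; IAM policies are still needed to grant access.

```python
from fnmatch import fnmatchcase

def _covers(statement, action):
    """Return True if the statement's Action (string or list) matches the action."""
    actions = statement["Action"]
    if isinstance(actions, str):
        actions = [actions]
    return any(fnmatchcase(action, pattern) for pattern in actions)

def scp_permits(levels, action):
    """levels: SCP statements per level, ordered Root -> OU(s) -> account."""
    for statements in levels:
        if any(s["Effect"] == "Deny" and _covers(s, action) for s in statements):
            return False  # an explicit Deny anywhere wins
        if not any(s["Effect"] == "Allow" and _covers(s, action) for s in statements):
            return False  # every level must allow the action
    return True

# Statement equivalent to the FullAWSAccess policy, plus a hypothetical Deny.
full_access = {"Effect": "Allow", "Action": "*", "Resource": "*"}
deny_bucket_delete = {"Effect": "Deny", "Action": "s3:DeleteBucket", "Resource": "*"}

root = [full_access]
ou = [full_access, deny_bucket_delete]
account = [full_access]

print(scp_permits([root, ou, account], "s3:DeleteBucket"))
print(scp_permits([root, ou, account], "s3:GetObject"))
```

Even though the account level allows everything, the Deny attached at the OU still blocks `s3:DeleteBucket`.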
It integrates with other services like IAM.
It is provided at no additional charge.
AWS Systems Manager
AWS Systems Manager is used to manage a large number of servers in AWS and on premises.
It requires the Systems Manager Agent (SSM Agent), software that must be installed on every managed server.
AWS Control Tower
AWS Control Tower helps you manage an environment with multiple accounts. Think of it as an AWS Organizations orchestrator. It offers automated provisioning and governance.
Preventive Guardrails
block actions that would violate policy (implemented with SCPs), whereas Detective Guardrails
detect and report non-compliant resources (implemented with AWS Config rules).
Account Factory
helps automate setting up the new accounts.
AWS Service Catalog
AWS Service Catalog lets an organization create and manage a curated catalog of IT services that are approved for use on AWS.
An IT administrator manages the catalog.
Roles define what a user is allowed to use.
Some challenges that occur without a Service Catalog are:
- Inconsistent deployments
- Lack of Governance
- Uncontrolled spending
- Complexity in managing accounts
- Slow deployment of resources
Products
can be thought of as templates to deploy a set of resources.
A Product is typically defined by a CloudFormation template, and launching it provisions a CloudFormation stack.
Portfolios
manage who can access which Product.
A Portfolio is a group of Products.
Catalog Administrator
manages the catalogs and products.
End users
use the products.
AWS License Manager
AWS License Manager manages software licenses from various vendors, both in the cloud and on premises.
License Manager can prevent software from launching if it does not conform to a valid license.
It prevents over-use of a license, works across AWS accounts and enforces license rules.
AWS Resource Groups and Tag Editor
AWS Resource Groups and Tag Editor are used to group resources by tag and to manage those tags.
One good idea is to tag resources per environment.
AWS Proton
AWS Proton allows a platform team to define environments as Infrastructure as Code (IaC) templates. It automates deployments and supports flexible definitions.
AWS Resilience Hub
AWS Resilience Hub helps you set up your disaster recovery process by continuously tracking the application.
It can alert you during an outage.
It provides SOP (Standard Operating Procedure) for recovery.
Its features include:
-
Define the Resilience Policies
-
Run Assessments
-
Review Assessments
-
Implement Recommendations
-
Setup Alarms
-
Review SOPs
AWS Resource Explorer
AWS Resource Explorer simplifies the search and discovery of your AWS resources across AWS regions.
It has two types of users, an administrator and a normal user.
Index
is a collection of information about AWS resources in a specific region.
Local index
is specific to a region whereas an Aggregator index
collects data from all regions.
Every region replicates its local index to the aggregator index.
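The relationship between local indexes and the aggregator index can be pictured with a small data-structure sketch. This is a conceptual model only, not the Resource Explorer API; the ARNs, account ID, type labels, and function names are all assumptions made for illustration.

```python
# Hypothetical local indexes: one list of resources per region.
local_indexes = {
    "us-east-1": [
        {"arn": "arn:aws:ec2:us-east-1:111122223333:instance/i-0aaa", "type": "ec2:instance"},
    ],
    "eu-west-1": [
        {"arn": "arn:aws:ec2:eu-west-1:111122223333:instance/i-0bbb", "type": "ec2:instance"},
        {"arn": "arn:aws:sqs:eu-west-1:111122223333:orders", "type": "sqs:queue"},
    ],
}

def build_aggregator(indexes):
    """Replicate every local index into one cross-region list."""
    aggregated = []
    for region, resources in sorted(indexes.items()):
        for resource in resources:
            aggregated.append({**resource, "region": region})
    return aggregated

def search(index, resource_type):
    """Return the ARNs of all resources of the given type."""
    return [r["arn"] for r in index if r["type"] == resource_type]

aggregator = build_aggregator(local_indexes)
print(search(aggregator, "ec2:instance"))
```

Searching a single local index would return only that region's resources, whereas the aggregator answers cross-region queries from one place.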