MFA (multi-factor authentication) forces users to generate a code on a device (usually a mobile phone or hardware) before doing important operations on S3
To use MFA-Delete, enable versioning on the S3 bucket
You will need MFA to
permanently delete an object version
suspend versioning on the bucket
You won’t need MFA for
enabling versioning
listing deleted versions
Only the bucket owner (root account) can enable/disable MFA-Delete
MFA-Delete can currently only be enabled using the CLI
S3 Default Encryption vs Bucket Policies
One way to “force encryption” is to use a bucket policy and refuse any API call to PUT an S3 object without encryption headers
Another way is to use the “default encryption” option in S3
Note: Bucket Policies are evaluated before “default encryption”
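As a rough illustration of the bucket-policy approach, the sketch below builds such a policy as a Python dictionary. The bucket name is a placeholder, and the statement shown (deny PutObject when the s3:x-amz-server-side-encryption header is absent) is one common way to express the rule, not necessarily the exact policy these notes originally displayed.

```python
import json


def deny_unencrypted_put_policy(bucket_name: str) -> dict:
    """Build a bucket policy that denies PutObject calls sent without an encryption header."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedObjectUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                # "Null": "true" matches requests where the header is missing.
                "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
            }
        ],
    }


# "my-example-bucket" is a hypothetical bucket name.
policy = deny_unencrypted_put_policy("my-example-bucket")
print(json.dumps(policy, indent=2))
```

Because bucket policies are evaluated before default encryption, a request rejected by this statement never reaches the default-encryption step.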
S3 Access Logs
For audit purposes, you may want to log all access to S3 buckets
Any request made to S3, from any account, authorized or denied, will be logged into another S3 bucket
That data can be analyzed using data analysis tools…
Or Amazon Athena as we’ll see later in this section
S3 Access Logs: Warning
Do not set your logging bucket to be the monitored bucket
It will create a logging loop, and your bucket will grow in size exponentially
S3 Replication (CRR & SRR)
Must enable versioning in source and destination
Cross Region Replication (CRR)
Same Region Replication (SRR)
Buckets can be in different accounts
Copying is asynchronous
Must give proper IAM permissions to S3
CRR - Use cases: compliance, lower latency access, replication across accounts
SRR - Use cases: log aggregation, live replication between production and test accounts
S3 Replication - Notes
After activating, only new objects are replicated
Optionally, you can replicate existing objects using S3 Batch Replication
Replicates existing objects and objects that failed replication
For DELETE operations:
Can replicate delete markers from source to target (optional setting)
Deletions with a version ID are not replicated (to avoid malicious deletes)
There is no “chaining” of replication
If bucket 1 has replication into bucket 2, which has replication into bucket 3
Then objects created in bucket 1 are not replicated to bucket 3
S3 Pre-Signed URLs
Can generate pre-signed URLs using the SDK or CLI
For downloads (easy, can use the CLI)
For uploads (harder, must use the SDK)
Valid for a default of 3600 seconds; change the timeout with the --expires-in [TIME_BY_SECONDS] argument
Users given a pre-signed URL inherit the permissions of the person who generated the URL for GET / PUT
Examples:
Allow only logged-in users to download a premium video on your S3 bucket
Allow an ever-changing list of users to download files by generating URLs dynamically
Allow a user to temporarily upload a file to a precise location in our bucket
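For the download case, a minimal sketch of assembling the CLI call might look like the following. It only builds the argument list for `aws s3 presign` and does not run it; the bucket and key names are placeholders, and the helper name is made up for illustration.

```python
import shlex
from typing import List


def presign_command(bucket: str, key: str, expires_in: int = 3600) -> List[str]:
    """Return the argv for `aws s3 presign` with an explicit lifetime in seconds."""
    if expires_in <= 0:
        raise ValueError("expires_in must be a positive number of seconds")
    return [
        "aws", "s3", "presign",
        f"s3://{bucket}/{key}",
        "--expires-in", str(expires_in),
    ]


# Hypothetical bucket and key; 300 seconds keeps the link short-lived.
cmd = presign_command("my-example-bucket", "videos/premium.mp4", expires_in=300)
print(shlex.join(cmd))
# aws s3 presign s3://my-example-bucket/videos/premium.mp4 --expires-in 300
```

The resulting list can be passed to subprocess.run on a machine with configured credentials. For uploads, the SDK route applies instead (for example boto3's generate_presigned_url with the put_object operation).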
S3 Storage Classes
Amazon S3 Standard - General Purpose
Amazon S3 Standard-Infrequent Access (IA)
Amazon S3 One Zone-Infrequent Access
Amazon S3 Glacier Instant Retrieval
Amazon S3 Glacier Flexible Retrieval
Amazon S3 Glacier Deep Archive
Amazon S3 Intelligent-Tiering
Can move between classes manually or using S3 lifecycle configurations
S3 Durability and Availability
Durability
High durability (99.999999999%, 11 9’s) of objects across multiple AZs
If you store 10,000,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000 years
Same for all storage classes
Availability
Measures how readily available a service is
Varies depending on storage class
Example: S3 Standard has 99.99% availability = not available 53 minutes a year
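The two figures above follow from simple arithmetic, sketched here. The function names are illustrative, and the durability figure is treated as an annual loss probability of 10^-11 per object, which is an assumption about how the "11 9's" number is meant to be read.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Expected unavailable minutes per year for a given availability percentage."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


def years_per_lost_object(object_count: int, nines: int = 11) -> float:
    """Average number of years between single-object losses."""
    annual_loss_probability = 10.0 ** -nines
    expected_losses_per_year = object_count * annual_loss_probability
    return 1 / expected_losses_per_year


# About 52.56 minutes, which the notes round to 53.
print(round(downtime_minutes_per_year(99.99), 2))
# About 10,000 years for ten million objects.
print(round(years_per_lost_object(10_000_000)))
```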
S3 Standard - General Purpose
99.99% Availability
Used for frequently accessed data
Low latency and high throughput
Sustain 2 concurrent facility failures
Use cases: Big Data analytics, mobile & gaming applications, content distribution…
S3 Storage Classes - Infrequent Access
For data that is less frequently accessed, but requires rapid access when needed
Amazon S3 Glacier Flexible Retrieval - Expedited (1 to 5 minutes), Standard (3 to 5 hours), Bulk (5 to 12 hours, free)
Amazon S3 Glacier Deep Archive - for long term storage
Standard (12 hours), Bulk (48 hours)
Minimum storage duration of 180 days
S3 Intelligent-Tiering
Small monthly monitoring and auto-tiering fee
Moves objects automatically between Access Tiers based on usage
There are no retrieval charges in S3 Intelligent-Tiering
Frequent Access tier (automatic): default tier
Infrequent Access tier (automatic): objects not accessed for 30 days
Archive Instant Access tier (automatic): objects not accessed for 90 days
Archive Access tier (optional): configurable from 90 days to 700+ days
Deep Archive Access tier (optional): configurable from 180 days to 700+ days
S3 - Moving between storage classes
You can transition objects between storage classes
For infrequently accessed objects, move them to STANDARD_IA
For archive objects you don’t need in real-time, GLACIER or DEEP_ARCHIVE
Moving objects can be automated using a lifecycle configuration
S3 Lifecycle rules
Transition actions: define when objects are transitioned to another storage class
Move objects to Standard IA class 60 days after creation
Move to Glacier for archiving after 6 months
Expiration actions: configure objects to expire (delete) after some time
Access log files can be set to delete after 365 days
Can be used to delete old versions of files (if versioning is enabled)
Rules can be created for a certain prefix (ex: s3://mybucket/mp3/*)
Rules can be created for certain object tags (ex: Department: Finance)
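The rules listed above could be written down roughly as follows, using the dictionary shape that the S3 lifecycle API accepts. The rule IDs, the logs/ prefix, the 30-day noncurrent window, and the reading of "6 months" as 180 days are all assumptions made for the example.

```python
import json

lifecycle_configuration = {
    "Rules": [
        {
            # Transition actions, scoped to a prefix.
            "ID": "tier-down-mp3",  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "mp3/"},
            "Transitions": [
                {"Days": 60, "StorageClass": "STANDARD_IA"},
                {"Days": 180, "StorageClass": "GLACIER"},  # roughly 6 months
            ],
        },
        {
            # Expiration action: delete access logs after a year.
            "ID": "expire-access-logs",  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},  # assumed log prefix
            "Expiration": {"Days": 365},
        },
        {
            # Tag-scoped rule that removes old (noncurrent) versions.
            "ID": "clean-old-finance-versions",  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Tag": {"Key": "Department", "Value": "Finance"}},
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},  # assumed window
        },
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```

With boto3 installed and credentials configured, a dictionary like this is what would be passed as LifecycleConfiguration to put_bucket_lifecycle_configuration.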
S3 Lifecycle Rules - Scenario 1
Your application on EC2 creates image thumbnails after profile photos are uploaded to Amazon S3. These thumbnails can be easily recreated, and only need to be kept for 45 days. The source images should be able to be immediately retrieved for these 45 days, and afterwards, the user can wait up to 6 hours. How would you design this?
S3 source images can be on STANDARD, with a lifecycle configuration to transition them to GLACIER after 45 days.
S3 thumbnails can be on ONEZONE_IA, with a lifecycle configuration to expire them (delete them) after 45 days.
S3 Lifecycle Rules - Scenario 2
A rule in your company states that you should be able to recover your deleted S3 objects immediately for 15 days, although this happens rarely. After this time, and for up to 365 days, deleted objects should be recoverable within 48 hours.
You need to enable S3 versioning in order to have object versions, so that “deleted objects” are in fact hidden by a “delete marker” and can be recovered
You can transition these “noncurrent versions” of the object to STANDARD_IA
You can transition afterwards these “noncurrent versions” to DEEP_ARCHIVE
S3 Analytics - Storage class analysis
You can set up S3 Analytics to help determine when to transition objects from Standard to Standard_IA
Does not work for ONEZONE_IA or GLACIER
Report is updated daily
Takes about 24 to 48 hours to first start
Good first step to put together lifecycle rules (or improve them)!
S3 Baseline Performance
Amazon S3 automatically scales to high request rates, latency 100-200 ms
Your application can achieve at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix in a bucket.
There are no limits to the number of prefixes in a bucket
Example (object path => prefix):
bucket/folder1/sub1/file => /folder1/sub1/
If you spread reads evenly across four prefixes, you can achieve 22,000 requests per second for GET and HEAD
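To make the prefix idea concrete, here is a small sketch that derives the prefix from an object path and multiplies out the per-prefix read limit. The helper names are invented for the example, and the path format (bucket name first, then the key) mirrors the example line above.

```python
GET_HEAD_PER_PREFIX = 5500  # baseline GET/HEAD requests per second per prefix


def prefix_of(object_path: str) -> str:
    """Return everything between the bucket name and the object name."""
    _bucket, _, key = object_path.partition("/")
    folder, separator, _name = key.rpartition("/")
    return f"/{folder}/" if separator else "/"


def max_read_rate(prefix_count: int) -> int:
    """Aggregate GET/HEAD rate when reads are spread evenly across prefixes."""
    return prefix_count * GET_HEAD_PER_PREFIX


print(prefix_of("bucket/folder1/sub1/file"))  # /folder1/sub1/
print(max_read_rate(4))  # 22000
```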
S3 KMS Limitation
If you use SSE-KMS, you may be impacted by the KMS limits
When you upload, it calls the GenerateDataKey KMS API
When you download, it calls the Decrypt KMS API
These calls count towards the KMS quota per second
You can request a quota increase using the Service Quotas console
S3 Performance
Multipart upload
recommended for files > 100 MB, must use for files > 5 GB
can help parallelize uploads (speed up transfers)
S3 Transfer Acceleration
Increase transfer speed by transferring the file to an AWS edge location, which will forward the data to the S3 bucket in the target region
Compatible with multipart upload
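The multipart thresholds above can be turned into a small planning helper. This is only a sketch: it treats MB and GB as binary units and uses a 100 MB part size, both of which are assumptions, and it plans the parts without uploading anything.

```python
from typing import List, Tuple

MB = 1024 ** 2
GB = 1024 ** 3


def upload_strategy(size_bytes: int) -> str:
    """Apply the 100 MB / 5 GB rules of thumb from the notes."""
    if size_bytes > 5 * GB:
        return "multipart required"
    if size_bytes > 100 * MB:
        return "multipart recommended"
    return "single PUT is fine"


def split_into_parts(size_bytes: int, part_size: int = 100 * MB) -> List[Tuple[int, int, int]]:
    """Return (part_number, offset, length) tuples; part numbers start at 1."""
    parts = []
    offset = 0
    number = 1
    while offset < size_bytes:
        length = min(part_size, size_bytes - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


print(upload_strategy(250 * MB))  # multipart recommended
for part in split_into_parts(250 * MB):
    print(part)
```

Each tuple describes an independent slice of the file, which is what allows the parts to be uploaded in parallel.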
S3 Performance - S3 Byte-Range Fetches
Parallelize GETs by requesting specific byte ranges
Better resilience in case of failures
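A byte-range fetch boils down to sending an HTTP Range header per chunk. The sketch below computes those header values for an object of known size; the function name and the chunk size are illustrative, and no request is actually sent.

```python
from typing import List


def byte_ranges(total_size: int, chunk_size: int) -> List[str]:
    """Return HTTP Range header values that together cover the whole object."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ranges = []
    for start in range(0, total_size, chunk_size):
        # Range ends are inclusive, hence the -1.
        end = min(start + chunk_size, total_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


print(byte_ranges(10, 4))  # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

If one range fails, only that range needs to be retried, which is where the better resilience comes from.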
S3 Select & Glacier Select
Retrieve less data using SQL by performing server-side filtering
Can filter by rows & columns (simple SQL statements)
S3 Requester Pays
In general, bucket owners pay for all Amazon S3 storage and data transfer costs associated with their bucket
With Requester Pays buckets, the requester instead of the bucket owner pays the cost of the request and the data download from the bucket
Helpful when you want to share large datasets with other accounts
The requester must be authenticated in AWS (cannot be anonymous)
Amazon Athena
Serverless query service to perform analytics against S3 objects
Uses standard SQL language to query files
Supports CSV, JSON, ORC, Avro, and Parquet (built on Presto)
Pricing: $5.00 per TB of data scanned
Use compressed or columnar data for cost savings (less scan)
Use cases: Business intelligence / analytics / reporting, analyze & query VPC Flow Logs, ELB Logs, CloudTrail trails, etc…
Exam Tip: to analyze data in S3 using serverless SQL, use Athena
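The pricing line explains why compression helps: cost tracks bytes scanned. A back-of-the-envelope sketch follows, assuming 1 TB means 10^12 bytes and ignoring any minimum charge per query; the 3 TB dataset and the 3:1 compression ratio are made-up inputs.

```python
PRICE_PER_TB_USD = 5.00
BYTES_PER_TB = 10 ** 12  # assumption: decimal terabytes


def query_cost_usd(bytes_scanned: int) -> float:
    """Estimated query cost from the amount of data scanned."""
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB_USD


raw_csv_bytes = 3 * BYTES_PER_TB        # hypothetical uncompressed dataset
compressed_bytes = raw_csv_bytes // 3   # hypothetical 3:1 compression

print(query_cost_usd(raw_csv_bytes))     # 15.0
print(query_cost_usd(compressed_bytes))  # 5.0
```

Columnar formats reduce the scanned bytes further because a query reads only the columns it references.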
Glacier Vault Lock
Adopt a WORM (Write Once Read Many) model
Lock the policy for future edits (can no longer be changed)
Helpful for compliance and data retention
S3 Object Lock (versioning must be enabled)
Adopt a WORM (Write Once Read Many) model
Block an object version deletion for a specified amount of time
Object retention:
Retention Period: specifies a fixed period
Legal Hold: same protection, no expiry date
Modes:
Governance mode: users can’t overwrite or delete an object version or alter its lock settings unless they have special permissions
Compliance mode: a protected object version can’t be overwritten or deleted by any user, including the root user; its retention mode can’t be changed, and its retention period can’t be shortened.
AWS CloudFront
Content Delivery Network (CDN)
Improves read performance, content is cached at the edge
216 points of presence globally (edge locations)
DDoS protection, integration with Shield, AWS Web Application Firewall
Can expose external HTTPS and can talk to internal HTTPS backends
CloudFront - Origins
S3 bucket
For distributing files and caching them at the edge
Enhanced security with CloudFront Origin Access Identity (OAI)
CloudFront can be used as an ingress (to upload files to S3)
Custom Origin (HTTP)
Application Load Balancer
EC2 instance
S3 website (must first enable the bucket as a static S3 website)
Any HTTP backend you want
CloudFront - S3 as an Origin
Allow Public IP of edge locations
Allow security group of load balancer
OAI + S3 bucket policy
CloudFront Geo Restriction
whitelist and blacklist
CloudFront vs S3 Cross Region Replication
CloudFront
Global edge network
Files are cached for a TTL (maybe a day)
Great for static content that must be available everywhere
S3 Cross Region Replication
Must be set up for each region you want replication to happen in
Files are updated in near real-time
Read only
Great for dynamic content that needs to be available at low latency in a few regions
AWS CloudFront Hands On
We’ll create an S3 bucket
We’ll create a CloudFront distribution
We’ll create an Origin Access Identity
We’ll limit the S3 bucket to be accessed only using this identity
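The last step, restricting the bucket to the Origin Access Identity, is done with a bucket policy. A minimal sketch of what that policy might contain is shown below; the bucket name and the OAI ID are placeholders, and the console can generate an equivalent policy for you.

```python
import json


def oai_read_policy(bucket_name: str, oai_id: str) -> dict:
    """Build a bucket policy that lets one CloudFront OAI read objects."""
    principal_arn = (
        "arn:aws:iam::cloudfront:user/"
        f"CloudFront Origin Access Identity {oai_id}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontOAIRead",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# Both arguments are hypothetical values.
oai_policy = oai_read_policy("my-example-bucket", "E1EXAMPLE12345")
print(json.dumps(oai_policy, indent=2))
```

As long as the bucket has no public access and no other Allow statements, objects are then reachable only through the distribution.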
CloudFront Signed URL / Signed Cookies
You want to distribute paid shared content to premium users all over the world
We can use CloudFront Signed URL/Cookies. We attach a policy with:
Include URL expiration
Include IP ranges to access the data from
Trusted signers (which AWS accounts can create signed URLs)
How long should the URL be valid for?
Shared content (movie, music): make it short (a few minutes)
Private content (private to the user): you can make it last for years
Signed URL = access to individual files (one signed URL per file)
Signed cookies = access to multiple files (one signed cookie for many files)
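The policy attached to a signed URL or cookie is a small JSON document. The sketch below builds a custom policy with an expiration and an optional IP range; it deliberately stops before the signing step, which needs the private key of a trusted signer. The URL, date, and CIDR range are placeholders.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def custom_policy(resource_url: str, expires_at: datetime,
                  source_ip_cidr: Optional[str] = None) -> dict:
    """Build a CloudFront custom policy with expiry and optional IP restriction."""
    condition = {"DateLessThan": {"AWS:EpochTime": int(expires_at.timestamp())}}
    if source_ip_cidr is not None:
        condition["IpAddress"] = {"AWS:SourceIp": source_ip_cidr}
    return {"Statement": [{"Resource": resource_url, "Condition": condition}]}


# Hypothetical distribution URL, expiry date, and client IP range.
expires = datetime(2030, 1, 1, tzinfo=timezone.utc)
signed_policy = custom_policy(
    "https://d111111abcdef8.cloudfront.net/premium/*",
    expires,
    source_ip_cidr="203.0.113.0/24",
)

# The policy is serialized without whitespace before it is signed.
print(json.dumps(signed_policy, separators=(",", ":")))
```

The wildcard in the resource is what lets one policy (and so one signed cookie) cover many files.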
CloudFront Signed URL Diagram
CloudFront Signed URL vs S3 Pre-Signed URL
CloudFront Signed URL
Allow access to a path, no matter the origin
Account wide key-pair, only the root can manage it
Can filter by IP, path, date, expiration
Can leverage caching features
S3 Pre-Signed URL
Issues a request as the person who pre-signed the URL
Uses the IAM key of the signing IAM principal
Limited lifetime
CloudFront Price Class
Price class: All, 200, 100
CloudFront Origin
Based on path pattern:
• /images/*
• /api/*
• /*
Origin group: provides high availability and failover
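Routing by path pattern can be pictured as a first-match lookup over an ordered list, with the catch-all pattern last. The sketch below imitates that with fnmatch; the origin names are hypothetical, and fnmatch's wildcard rules are only an approximation of CloudFront's own pattern matching.

```python
from fnmatch import fnmatchcase

# Ordered from most specific to the default; origin names are made up.
BEHAVIORS = [
    ("/images/*", "s3-images-origin"),
    ("/api/*", "alb-api-origin"),
    ("/*", "default-origin"),
]


def pick_origin(path: str) -> str:
    """Return the origin of the first behavior whose pattern matches the path."""
    for pattern, origin in BEHAVIORS:
        if fnmatchcase(path, pattern):
            return origin
    raise LookupError(f"no behavior matches {path!r}")


print(pick_origin("/images/cat.png"))  # s3-images-origin
print(pick_origin("/api/v1/users"))    # alb-api-origin
print(pick_origin("/index.html"))      # default-origin
```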
CloudFront – Field Level Encryption
Protects user-sensitive information through the application stack, using HTTPS and asymmetric encryption
Adds an additional layer of security along with HTTPS
Usage:
Specify the set of fields in POST requests that you want to be encrypted (up to 10 fields)
Specify the public key to encrypt them
AWS Global Accelerator
Anycast IP: all servers hold the same IP address and the client is routed to the nearest one
Unicast IP: one server holds one IP address
Leverage the AWS internal network to route to your application
2 anycast IPs are created for your application. The anycast IPs send traffic directly to edge locations. The edge locations send traffic to your application
Performance
Intelligent routing to lowest latency and fast regional failover
No issue with client cache
Internal AWS network
Health Checks
Global Accelerator performs a health check
Helps make your application global (failover in less than 1 minute)
Great for disaster recovery
Security
Only 2 external IPs need to be whitelisted
DDoS protection thanks to AWS Shield
AWS Global Accelerator vs CloudFront
They both use the AWS global network and its edge locations around the world
Both services integrate with AWS Shield for DDoS protection
CloudFront
Improves performance for both cacheable content (such as images and videos)
And dynamic content (such as API acceleration and dynamic site delivery)
Content is served at the edge
Global Accelerator
Improves performance for a wide range of applications over TCP and UDP
Proxying packets at the edge to applications running in one or more AWS regions
Good fit for non-HTTP use cases, such as gaming (UDP), IoT (MQTT), or Voice over IP
Good for HTTP use cases that require static IP addresses
Good for HTTP use cases that require deterministic, fast regional failover
AWS Snow Family
Highly secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS
Data migration: Snowcone, Snowball Edge, Snowmobile
Edge computing: Snowcone, Snowball Edge
Data migrations with AWS Snow Family
High speeds: 100 Mbps, 1 Gbps, 10 Gbps
Limited connectivity
Limited bandwidth
High network cost
Shared bandwidth
Connection stability
Snow Family devices work offline: if a transfer would take more than one week over the network, use them instead
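The "more than one week" rule of thumb can be checked with a quick estimate. This sketch assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and a link that is fully available to the transfer unless a lower utilization is passed in; real transfers will be slower.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_tb: float, link_mbps: float, utilization: float = 1.0) -> float:
    """Idealized number of days needed to move data_tb over a network link."""
    if link_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("link speed must be positive and utilization in (0, 1]")
    bits = data_tb * 10 ** 12 * 8
    bits_per_second = link_mbps * 10 ** 6 * utilization
    return bits / bits_per_second / SECONDS_PER_DAY


def prefer_snow_device(data_tb: float, link_mbps: float) -> bool:
    """Apply the one-week rule of thumb from the notes."""
    return transfer_days(data_tb, link_mbps) > 7


for speed_mbps in (100, 1_000, 10_000):
    days = transfer_days(100, speed_mbps)
    print(f"100 TB at {speed_mbps} Mbps: {days:.1f} days")
```

Under these assumptions, 100 TB over a 1 Gbps link takes a little over nine days, so it would already fall on the Snow Family side of the rule.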
Snowball Edge
Physical data transport solution: move TBs or PBs of data in or out of AWS
Alternative to moving data over the network (and paying network fees); pay per data transfer job
80 TB of HDD capacity for block volume and S3-compatible object storage (Storage Optimized)
42 TB of HDD capacity for block volume and S3-compatible object storage (Compute Optimized)
up to 10 nodes
Use cases: large data cloud migrations, DC decommissions, disaster recovery