SlideShare a Scribd company logo
1 of 31
S3 & Glacier
Amazon S3 & Glacier - the only backup solution you’ll ever need
Matthew Boeckman, VP - DevOps, Craftsy @matthewboeckman
http://enginerds.craftsy.com http://devops.com/author/matthewboeckman
Who is Craftsy
● Instructor led training videos for passionate hobbyists
● #19 on Forbes’ Most Promising Companies 2014
Why do backups matter?
… they don’t, until they do.
At small scale, backup can and is as easy as icloud, or google drive, or
github.
As you grow, the situation gets more complex.
In a content creation business,
backups are vital.
RESTORE
Define your exposure
daily backups - protect
against accidental
deletion (users) and
system
corruption/crash
DR - your building
catches on fire. an
employee goes rogue.
the SAN suffers a total
existence failure.
permanent archive - data at
rest, legal/business
permanence
what are my options
LTO - hope you like lifecycle hardware
refreshes! hope your data rests on a 1.5 or
3tb boundry. hope you enjoy swapping
tapes… 4-17year durability
~$.008/GB
** does not include operational costs **
DISK - good luck offsiting, hope you enjoy hardware
maintenance, hope you’ve got a big checkbook
~$5/GB
** does not include operational costs **
hobo as a service
● not scalable
● expensive
● not addressable in code
● requires software systems
welcome to system
adminsitrationville,
population: you
why did we look for alternatives?
What are the alternatives?
s3
why s3?
● $0.03/GB
● 99.999999999% durable
● easy interface with glacier
● addressable with code
“If you store 10,000 objects with us, on average we
may lose one of them every 10 million years or so.
This storage is designed in such a way that we can
sustain the concurrent loss of data in two separate
storage facilities.”
except...
this is a busy space
free tools, one thread, small files
http://s3tools.org/s3cmd
● great sync support
● supports multipart
● Does not support multithread
● tons of other features (du, cp, rm, setpolicy, cloudfront…)
https://github.com/boto
● the aws python library
● includes basic cli s3put, great reference point
great, but not fast
free tools, multipass, big files
https://github.com/mumrah/s3-multipart
● zomg multithreaded, multipart up and down
● saturated 100M pipe for a single file transfer
● does not handle small files
https://github.com/aws/aws-cli
● zomg multithreaded, multipart up and down
● saturated 500M pipe for a single file transfer
● perfect in every imaginable way
problems
approach single archive
(tar)
lots o files, some
big some small
le good ● fast transfer
● simple
● no filename
pain
● restore &
navigation
granularity
le bad ● restore all
● can’t nav your
backup
● processing
overhead
● users name
things...
our first backup script ... testy.sh
*shitty speed, restore costs, let’s just get to the next slide
to mp or not to mp...
size=$(stat -f %z "$x")
if [ "$size" -gt "100000000" ]; then
s3-mp-upload.py ${S3MPT_OW} -np 10 -s 100 "$x"
"s3://${BUCKET}/$CID/${x}" >> $LOG 2>&1
else
s3put ${S3PUT_OW} -s $AWS_ACCESS_SECRET -a
$AWS_ACCESS_KEY -b $BUCKET -p "$CPATH" "$x" >> $LOG
2>&1
fi
aws s3 cli
aws s3 sync "$CID" "s3://${BUCKET}/${CID}" --delete
aws s3 cp "$SRC" "s3://${BUCKET}/${CID}" --recursive
simple, multipart, multithreaded and saturates 500mbps both
ways
hooray! now what
why glacier?
● $0.01/GB
● durable like s3
● data sent to glacier via s3 policies cannot be
accidentally deleted
s3->glacier
prefix, delete
restore from glacier
restore, our UI
restore from glacier as code
foreach ($objects as $flerbs) {
$key = $flerbs->Key;
$restore = $s3->restore_archived_object ($bucket, $key, $days );
//var_dump($restore);
$http_code = $restore->status;
$xml_resp = $restore->body;
$message = $xml_resp->Message;
$header = $restore->header;
$req_id = $header['x-amz-request-id'];
$amz_id2 = $header['x-amz-id-2'];
$req_date = $header['date'];
}
oooh, a progress gui
restore status as code
$header = $s3->get_object_headers($bucket, $key)->header;
$restore = explode(',', $header['x-amz-restore']);
$progress = strpos($restore[0], 'true');
…
if ($sclass == 'GLACIER') {
if ($header['x-amz-restore'] == '') {
$style = '"background-color: #0ce3ff;"'; // not restored!
$note = '';
} elseif ($progress != false) {
$style = '"background-color: #ffa391;"'; // restore in progress!
$note = 'Restoration in progress';
} else {
$style = '"background-color: #95fed0;"'; // restored!
$note = "Restored until $expirydate";
}
versioning, another approach
● versioning operates on the bucket level
● each version of an object counts as storage
● versioning happens automatically with any PUT
● deletion of an object does not explicitly delete versions (!!)
GET /my-image.jpg?versionId=L4kqtJlcpXroDTDmpUMLUo
GET /my-image.jpg?versions HTTP/1.1
DELETE /my-image.jpg?versionId=UIORUnfnd89493jJFJ
restoring a version
Restore to the most recent version?
1. delete the current version!
Restore to an older version?
1. GET the desired version
2. PUT it back on top of the object you’re restoring
where does Craftsy live?
1. Bi-weeklies of ACTIVE content to a non-versioned, non-
glaciered bucket (~200TB total, +2TB/wk)
2. Weekly new finished content to glacier-backed s3 bucket
(~500TB, +45TB/mo)
3. All code artifacts (dev, staging, prod) to a versioned s3
bucket
4. Postgres backup segments to an non-glaciered bucket
most important, we do all of it in code. we don’t shuffle tapes,
upgrade hardware, or system administrate!
Thank you
QUESTIONS!
Matthew Boeckman
@matthewboeckman
http://enginerds.craftsy.com
(deck will be there & slideshare)
http://devops.com/author/matthewboeckman

More Related Content

What's hot

Boosting MongoDB performance
Boosting MongoDB performanceBoosting MongoDB performance
Boosting MongoDB performanceAlexei Panin
 
CHI - YAPC NA 2012
CHI - YAPC NA 2012CHI - YAPC NA 2012
CHI - YAPC NA 2012jonswar
 
CHI-YAPC-2009
CHI-YAPC-2009CHI-YAPC-2009
CHI-YAPC-2009jonswar
 
Microsoft SQL Server with Linux and Docker
Microsoft SQL Server with Linux and DockerMicrosoft SQL Server with Linux and Docker
Microsoft SQL Server with Linux and Dockerroskakori
 
Drupal 8 Theme System: The Backend of Frontend
Drupal 8 Theme System: The Backend of FrontendDrupal 8 Theme System: The Backend of Frontend
Drupal 8 Theme System: The Backend of FrontendAcquia
 
MongoDB: How it Works
MongoDB: How it WorksMongoDB: How it Works
MongoDB: How it WorksMike Dirolf
 
Persistence patterns for containers
Persistence patterns for containersPersistence patterns for containers
Persistence patterns for containersStephen Watt
 
Configuration surgery with Augeas (OggCamp 12)
Configuration surgery with Augeas (OggCamp 12)Configuration surgery with Augeas (OggCamp 12)
Configuration surgery with Augeas (OggCamp 12)Dominic Cleal
 
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...
Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...IndicThreads
 
Responsive Design with WordPress
Responsive Design with WordPressResponsive Design with WordPress
Responsive Design with WordPressJoe Casabona
 
Responsive Design with WordPress (WCPHX)
Responsive Design with WordPress (WCPHX)Responsive Design with WordPress (WCPHX)
Responsive Design with WordPress (WCPHX)Joe Casabona
 
Python-CouchDB Training at PyCon PL 2012
Python-CouchDB Training at PyCon PL 2012Python-CouchDB Training at PyCon PL 2012
Python-CouchDB Training at PyCon PL 2012Stefan Kögl
 
Device Synchronization with Javascript and PouchDB
Device Synchronization with Javascript and PouchDBDevice Synchronization with Javascript and PouchDB
Device Synchronization with Javascript and PouchDBFrank Rousseau
 

What's hot (19)

Boosting MongoDB performance
Boosting MongoDB performanceBoosting MongoDB performance
Boosting MongoDB performance
 
CHI - YAPC NA 2012
CHI - YAPC NA 2012CHI - YAPC NA 2012
CHI - YAPC NA 2012
 
CHI-YAPC-2009
CHI-YAPC-2009CHI-YAPC-2009
CHI-YAPC-2009
 
Day 2-some fun coding
Day 2-some fun codingDay 2-some fun coding
Day 2-some fun coding
 
Mongodb
MongodbMongodb
Mongodb
 
Mysqlnd uh
Mysqlnd uhMysqlnd uh
Mysqlnd uh
 
Microsoft SQL Server with Linux and Docker
Microsoft SQL Server with Linux and DockerMicrosoft SQL Server with Linux and Docker
Microsoft SQL Server with Linux and Docker
 
Drupal 8 Theme System: The Backend of Frontend
Drupal 8 Theme System: The Backend of FrontendDrupal 8 Theme System: The Backend of Frontend
Drupal 8 Theme System: The Backend of Frontend
 
MongoDB: How it Works
MongoDB: How it WorksMongoDB: How it Works
MongoDB: How it Works
 
Persistence patterns for containers
Persistence patterns for containersPersistence patterns for containers
Persistence patterns for containers
 
Cookies
CookiesCookies
Cookies
 
Configuration surgery with Augeas (OggCamp 12)
Configuration surgery with Augeas (OggCamp 12)Configuration surgery with Augeas (OggCamp 12)
Configuration surgery with Augeas (OggCamp 12)
 
Memory Management in WordPress
Memory Management in WordPressMemory Management in WordPress
Memory Management in WordPress
 
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...
Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...Putting rails and couch db on the cloud -  Indicthreads cloud computing confe...
Putting rails and couch db on the cloud - Indicthreads cloud computing confe...
 
Responsive Design with WordPress
Responsive Design with WordPressResponsive Design with WordPress
Responsive Design with WordPress
 
Tutorial Puppet
Tutorial PuppetTutorial Puppet
Tutorial Puppet
 
Responsive Design with WordPress (WCPHX)
Responsive Design with WordPress (WCPHX)Responsive Design with WordPress (WCPHX)
Responsive Design with WordPress (WCPHX)
 
Python-CouchDB Training at PyCon PL 2012
Python-CouchDB Training at PyCon PL 2012Python-CouchDB Training at PyCon PL 2012
Python-CouchDB Training at PyCon PL 2012
 
Device Synchronization with Javascript and PouchDB
Device Synchronization with Javascript and PouchDBDevice Synchronization with Javascript and PouchDB
Device Synchronization with Javascript and PouchDB
 

Viewers also liked

JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門
JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門
JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門Kazuki Ueki
 
S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...
S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...
S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...Scality
 
(STG203) Simplified Storage Management & Backup Using S3 & Glacier
(STG203) Simplified Storage Management & Backup Using S3 & Glacier(STG203) Simplified Storage Management & Backup Using S3 & Glacier
(STG203) Simplified Storage Management & Backup Using S3 & GlacierAmazon Web Services
 
AWS re:Invent 2016 - Scality's Open Source AWS S3 Server
AWS re:Invent 2016 - Scality's Open Source AWS S3 ServerAWS re:Invent 2016 - Scality's Open Source AWS S3 Server
AWS re:Invent 2016 - Scality's Open Source AWS S3 ServerScality
 
IBM Cloud Storage - Cleversafe
IBM Cloud Storage - CleversafeIBM Cloud Storage - Cleversafe
IBM Cloud Storage - CleversafeMichael Beatty
 

Viewers also liked (6)

JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門
JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門
JAWS-UG北陸第5回勉強会 クラウド破産しないためのEBS入門
 
S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...
S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...
S3 Server Hackathon Presented by S3 Server, a Scality Product, Seagate and Ho...
 
(STG203) Simplified Storage Management & Backup Using S3 & Glacier
(STG203) Simplified Storage Management & Backup Using S3 & Glacier(STG203) Simplified Storage Management & Backup Using S3 & Glacier
(STG203) Simplified Storage Management & Backup Using S3 & Glacier
 
AWS re:Invent 2016 - Scality's Open Source AWS S3 Server
AWS re:Invent 2016 - Scality's Open Source AWS S3 ServerAWS re:Invent 2016 - Scality's Open Source AWS S3 Server
AWS re:Invent 2016 - Scality's Open Source AWS S3 Server
 
IBM Cloud Storage - Cleversafe
IBM Cloud Storage - CleversafeIBM Cloud Storage - Cleversafe
IBM Cloud Storage - Cleversafe
 
Black Belt Online Seminar AWS Amazon S3
Black Belt Online Seminar AWS Amazon S3Black Belt Online Seminar AWS Amazon S3
Black Belt Online Seminar AWS Amazon S3
 

Similar to Amazon S3 & Glacier - The Only Backup Solution You'll Ever Need

Think_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptx
Think_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptxThink_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptx
Think_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptxPayal Singh
 
Challenges when building high profile editorial sites
Challenges when building high profile editorial sitesChallenges when building high profile editorial sites
Challenges when building high profile editorial sitesYann Malet
 
PHP CLI: A Cinderella Story
PHP CLI: A Cinderella StoryPHP CLI: A Cinderella Story
PHP CLI: A Cinderella StoryMike Lively
 
Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2PgTraining
 
Pure Speed Drupal 4 Gov talk
Pure Speed Drupal 4 Gov talkPure Speed Drupal 4 Gov talk
Pure Speed Drupal 4 Gov talkBryan Ollendyke
 
Embulk, an open-source plugin-based parallel bulk data loader
Embulk, an open-source plugin-based parallel bulk data loaderEmbulk, an open-source plugin-based parallel bulk data loader
Embulk, an open-source plugin-based parallel bulk data loaderSadayuki Furuhashi
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsCommand Prompt., Inc
 
Backup, Restore, and Disaster Recovery
Backup, Restore, and Disaster RecoveryBackup, Restore, and Disaster Recovery
Backup, Restore, and Disaster RecoveryMongoDB
 
Introduction to docker
Introduction to dockerIntroduction to docker
Introduction to dockerJustyna Ilczuk
 
Using filesystem capabilities with rsync
Using filesystem capabilities with rsyncUsing filesystem capabilities with rsync
Using filesystem capabilities with rsyncHazel Smith
 
Source Code Management systems
Source Code Management systemsSource Code Management systems
Source Code Management systemsxSawyer
 
Gdb basics for my sql db as (percona live europe 2019)
Gdb basics for my sql db as (percona live europe 2019)Gdb basics for my sql db as (percona live europe 2019)
Gdb basics for my sql db as (percona live europe 2019)Valerii Kravchuk
 
10 Problems with your RMAN backup script
10 Problems with your RMAN backup script10 Problems with your RMAN backup script
10 Problems with your RMAN backup scriptYury Velikanov
 
Drupal Deployment Troubles and Problems
Drupal Deployment Troubles and ProblemsDrupal Deployment Troubles and Problems
Drupal Deployment Troubles and ProblemsAndrii Lundiak
 
Mobile & Desktop Cache 2.0: How To Create A Scriptable Cache
Mobile & Desktop Cache 2.0: How To Create A Scriptable CacheMobile & Desktop Cache 2.0: How To Create A Scriptable Cache
Mobile & Desktop Cache 2.0: How To Create A Scriptable CacheBlaze Software Inc.
 
What's New and Newer in Apache httpd-24
What's New and Newer in Apache httpd-24What's New and Newer in Apache httpd-24
What's New and Newer in Apache httpd-24Jim Jagielski
 

Similar to Amazon S3 & Glacier - The Only Backup Solution You'll Ever Need (20)

Think_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptx
Think_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptxThink_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptx
Think_your_Postgres_backups_and_recovery_are_safe_lets_talk.pptx
 
Challenges when building high profile editorial sites
Challenges when building high profile editorial sitesChallenges when building high profile editorial sites
Challenges when building high profile editorial sites
 
PHP CLI: A Cinderella Story
PHP CLI: A Cinderella StoryPHP CLI: A Cinderella Story
PHP CLI: A Cinderella Story
 
Scaling PHP apps
Scaling PHP appsScaling PHP apps
Scaling PHP apps
 
Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2Oracle to Postgres Migration - part 2
Oracle to Postgres Migration - part 2
 
Pure Speed Drupal 4 Gov talk
Pure Speed Drupal 4 Gov talkPure Speed Drupal 4 Gov talk
Pure Speed Drupal 4 Gov talk
 
Embulk, an open-source plugin-based parallel bulk data loader
Embulk, an open-source plugin-based parallel bulk data loaderEmbulk, an open-source plugin-based parallel bulk data loader
Embulk, an open-source plugin-based parallel bulk data loader
 
Mojolicious
MojoliciousMojolicious
Mojolicious
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System Administrators
 
Backup, Restore, and Disaster Recovery
Backup, Restore, and Disaster RecoveryBackup, Restore, and Disaster Recovery
Backup, Restore, and Disaster Recovery
 
Introduction to docker
Introduction to dockerIntroduction to docker
Introduction to docker
 
Using filesystem capabilities with rsync
Using filesystem capabilities with rsyncUsing filesystem capabilities with rsync
Using filesystem capabilities with rsync
 
Source Code Management systems
Source Code Management systemsSource Code Management systems
Source Code Management systems
 
Malcon2017
Malcon2017Malcon2017
Malcon2017
 
Gdb basics for my sql db as (percona live europe 2019)
Gdb basics for my sql db as (percona live europe 2019)Gdb basics for my sql db as (percona live europe 2019)
Gdb basics for my sql db as (percona live europe 2019)
 
10 Problems with your RMAN backup script
10 Problems with your RMAN backup script10 Problems with your RMAN backup script
10 Problems with your RMAN backup script
 
Drupal Deployment Troubles and Problems
Drupal Deployment Troubles and ProblemsDrupal Deployment Troubles and Problems
Drupal Deployment Troubles and Problems
 
Mobile & Desktop Cache 2.0: How To Create A Scriptable Cache
Mobile & Desktop Cache 2.0: How To Create A Scriptable CacheMobile & Desktop Cache 2.0: How To Create A Scriptable Cache
Mobile & Desktop Cache 2.0: How To Create A Scriptable Cache
 
03 pig intro
03 pig intro03 pig intro
03 pig intro
 
What's New and Newer in Apache httpd-24
What's New and Newer in Apache httpd-24What's New and Newer in Apache httpd-24
What's New and Newer in Apache httpd-24
 

More from Matthew Boeckman

Useful flakes - The Value of Common Tools
Useful flakes - The Value of Common ToolsUseful flakes - The Value of Common Tools
Useful flakes - The Value of Common ToolsMatthew Boeckman
 
All Day DevOps 2017 - There is No Root Cause
All Day DevOps 2017 - There is No Root CauseAll Day DevOps 2017 - There is No Root Cause
All Day DevOps 2017 - There is No Root CauseMatthew Boeckman
 
Rewriting DevOps - Lessons from a 15 month software rewrite
Rewriting DevOps - Lessons from a 15 month software rewriteRewriting DevOps - Lessons from a 15 month software rewrite
Rewriting DevOps - Lessons from a 15 month software rewriteMatthew Boeckman
 
Top 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsTop 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsMatthew Boeckman
 
Many hands make light work
Many hands make light workMany hands make light work
Many hands make light workMatthew Boeckman
 
Sandstorm or Significant? The evolving role of situational context in inciden...
Sandstorm or Significant? The evolving role of situational context in inciden...Sandstorm or Significant? The evolving role of situational context in inciden...
Sandstorm or Significant? The evolving role of situational context in inciden...Matthew Boeckman
 
Go Rin no Show - DevOpsDays Rockies
Go Rin no Show - DevOpsDays RockiesGo Rin no Show - DevOpsDays Rockies
Go Rin no Show - DevOpsDays RockiesMatthew Boeckman
 
Ops, DevOps, NoOps and AWS Lambda
Ops, DevOps, NoOps and AWS LambdaOps, DevOps, NoOps and AWS Lambda
Ops, DevOps, NoOps and AWS LambdaMatthew Boeckman
 

More from Matthew Boeckman (11)

Useful flakes - The Value of Common Tools
Useful flakes - The Value of Common ToolsUseful flakes - The Value of Common Tools
Useful flakes - The Value of Common Tools
 
All Day DevOps 2017 - There is No Root Cause
All Day DevOps 2017 - There is No Root CauseAll Day DevOps 2017 - There is No Root Cause
All Day DevOps 2017 - There is No Root Cause
 
Rewriting DevOps - Lessons from a 15 month software rewrite
Rewriting DevOps - Lessons from a 15 month software rewriteRewriting DevOps - Lessons from a 15 month software rewrite
Rewriting DevOps - Lessons from a 15 month software rewrite
 
Top 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management TeamsTop 10 Practices of Highly Successful DevOps Incident Management Teams
Top 10 Practices of Highly Successful DevOps Incident Management Teams
 
Many hands make light work
Many hands make light workMany hands make light work
Many hands make light work
 
Sandstorm or Significant? The evolving role of situational context in inciden...
Sandstorm or Significant? The evolving role of situational context in inciden...Sandstorm or Significant? The evolving role of situational context in inciden...
Sandstorm or Significant? The evolving role of situational context in inciden...
 
Rewriting DevOps
Rewriting DevOpsRewriting DevOps
Rewriting DevOps
 
Go Rin no Show - DevOpsDays Rockies
Go Rin no Show - DevOpsDays RockiesGo Rin no Show - DevOpsDays Rockies
Go Rin no Show - DevOpsDays Rockies
 
The promise of NoOps
The promise of NoOpsThe promise of NoOps
The promise of NoOps
 
Ops, DevOps, NoOps and AWS Lambda
Ops, DevOps, NoOps and AWS LambdaOps, DevOps, NoOps and AWS Lambda
Ops, DevOps, NoOps and AWS Lambda
 
Vpc aws meetup
Vpc   aws meetupVpc   aws meetup
Vpc aws meetup
 

Recently uploaded

"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 

Recently uploaded (20)

"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 

Amazon S3 & Glacier - The Only Backup Solution You'll Ever Need

  • 1. S3 & Glacier Amazon S3 & Glacier - the only backup solution you’ll ever need Matthew Boeckman, VP - DevOps, Craftsy @matthewboeckman http://enginerds.craftsy.com http://devops.com/author/matthewboeckman
  • 2. Who is Craftsy ● Instructor led training videos for passionate hobbyists ● #19 on Forbes’ Most Promising Companies 2014
  • 3. Why do backups matter? … they don’t, until they do. At small scale, backup can and is as easy as icloud, or google drive, or github. As you grow, the situation gets more complex. In a content creation business, backups are vital.
  • 5. Define your exposure daily backups - protect against accidental deletion (users) and system corruption/crash DR - your building catches on fire. an employee goes rogue. the SAN suffers a total existence failure. permanent archive - data at rest, legal/business permanence
  • 6. what are my options LTO - hope you like lifecycle hardware refreshes! hope your data rests on a 1.5 or 3tb boundry. hope you enjoy swapping tapes… 4-17year durability ~$.008/GB ** does not include operational costs ** DISK - good luck offsiting, hope you enjoy hardware maintenance, hope you’ve got a big checkbook ~$5/GB ** does not include operational costs **
  • 7. hobo as a service ● not scalable ● expensive ● not addressable in code ● requires software systems welcome to system adminsitrationville, population: you
  • 8. why did we look for alternatives?
  • 9. What are the alternatives? s3
  • 10. why s3? ● $0.03/GB ● 99.999999999% durable ● easy interface with glacier ● addressable with code “If you store 10,000 objects with us, on average we may lose one of them every 10 million years or so. This storage is designed in such a way that we can sustain the concurrent loss of data in two separate storage facilities.”
  • 12. this is a busy space
  • 13. free tools, one thread, small files http://s3tools.org/s3cmd ● great sync support ● supports multipart ● Does not support multithread ● tons of other features (du, cp, rm, setpolicy, cloudfront…) https://github.com/boto ● the aws python library ● includes basic cli s3put, great reference point great, but not fast
  • 14. free tools, multipass, big files https://github.com/mumrah/s3-multipart ● zomg multithreaded, multipart up and down ● saturated 100M pipe for a single file transfer ● does not handle small files https://github.com/aws/aws-cli ● zomg multithreaded, multipart up and down ● saturated 500M pipe for a single file transfer ● perfect in every imaginable way
  • 15. problems approach single archive (tar) lots o files, some big some small le good ● fast transfer ● simple ● no filename pain ● restore & navigation granularity le bad ● restore all ● can’t nav your backup ● processing overhead ● users name things...
  • 16. our first backup script ... testy.sh *shitty speed, restore costs, let’s just get to the next slide
  • 17. to mp or not to mp... size=$(stat -f %z "$x") if [ "$size" -gt "100000000" ]; then s3-mp-upload.py ${S3MPT_OW} -np 10 -s 100 "$x" "s3://${BUCKET}/$CID/${x}" >> $LOG 2>&1 else s3put ${S3PUT_OW} -s $AWS_ACCESS_SECRET -a $AWS_ACCESS_KEY -b $BUCKET -p "$CPATH" "$x" >> $LOG 2>&1 fi
  • 18. aws s3 cli aws s3 sync "$CID" "s3://${BUCKET}/${CID}" --delete aws s3 cp "$SRC" "s3://${BUCKET}/${CID}" --recursive simple, multipart, multithreaded and saturates 500mbps both ways
  • 20. why glacier? ● $0.01/GB ● durable like s3 ● data sent to glacier via s3 policies cannot be accidentally deleted
  • 25. restore from glacier as code foreach ($objects as $flerbs) { $key = $flerbs->Key; $restore = $s3->restore_archived_object ($bucket, $key, $days ); //var_dump($restore); $http_code = $restore->status; $xml_resp = $restore->body; $message = $xml_resp->Message; $header = $restore->header; $req_id = $header['x-amz-request-id']; $amz_id2 = $header['x-amz-id-2']; $req_date = $header['date']; }
  • 27. restore status as code $header = $s3->get_object_headers($bucket, $key)->header; $restore = explode(',', $header['x-amz-restore']); $progress = strpos($restore[0], 'true'); … if ($sclass == 'GLACIER') { if ($header['x-amz-restore'] == '') { $style = '"background-color: #0ce3ff;"'; // not restored! $note = ''; } elseif ($progress != false) { $style = '"background-color: #ffa391;"'; // restore in progress! $note = 'Restoration in progress'; } else { $style = '"background-color: #95fed0;"'; // restored! $note = "Restored until $expirydate"; }
  • 28. versioning, another approach ● versioning operates on the bucket level ● each version of an object counts as storage ● versioning happens automatically with any PUT ● deletion of an object does not explicitly delete versions (!!) GET /my-image.jpg?versionId=L4kqtJlcpXroDTDmpUMLUo GET /my-image.jpg?versions HTTP/1.1 DELETE /my-image.jpg?versionId=UIORUnfnd89493jJFJ
  • 29. restoring a version Restore to the most recent version? 1. delete the current version! Restore to an older version? 1. GET the desired version 2. PUT it back on top of the object you’re restoring
  • 30. where does Craftsy live? 1. Bi-weeklies of ACTIVE content to a non-versioned, non- glaciered bucket (~200TB total, +2TB/wk) 2. Weekly new finished content to glacier-backed s3 bucket (~500TB, +45TB/mo) 3. All code artifacts (dev, staging, prod) to a versioned s3 bucket 4. Postgres backup segments to an non-glaciered bucket most important, we do all of it in code. we don’t shuffle tapes, upgrade hardware, or system administrate!
  • 31. Thank you QUESTIONS! Matthew Boeckman @matthewboeckman http://enginerds.craftsy.com (deck will be there & slideshare) http://devops.com/author/matthewboeckman