Bhaskar Sunkara, VP Product Experience at AppDynamics, presents "Performance on Amazon AWS".
The video of the presentation is available here: http://vimeo.com/46604224
The Xebia Cloud Day 2012 is a free Cloud Computing conference focused on the Java ecosystem.
http://blog.xebia.fr/22-mai-2012-cloud-day-chez-xebia/
1. PERFORMANCE ON AMAZON AWS
BHASKAR SUNKARA AND PETER ABRAMS
2. INTRODUCTION
• Founded in April 2008 in San Francisco – Venture Funded
• Founding Principles
• The move to the cloud presents a new set of challenges
• New world - Constant Change (infrastructure, architecture, code)
• Existing management solutions not designed for constant change
• AppDynamics Value - Enable teams to operate business critical applications in clouds and guarantee service performance
• Working with Netflix since October 2009
• Oct. 2009 – 150 servers in private data center
• May 2012 – 50 servers in data center, 8,000 servers in EC2
• AppDynamics is Netflix's primary SLA management tool
5. EVERYTHING IS SHARED
• Shared/Virtualized Infrastructure
• Shared services
• S3
• SQS
• SDB
• EBS
• EMR
• …
The biggest public cloud !!
6. INFRASTRUCTURE
• Machines come and go
• High rate of change
• Capacity is much cheaper
• Capacity can be both increased and decreased
• In minutes
• Cannot use physical dependencies anymore
• E.g. static IP mapping between services
7. PERFORMANCE MONITORING
• Traditional monitoring : Measure
• CPU and other hardware metrics
• Code metrics – individual methods etc.
• Scrape logs for errors etc.
• Configured by hand
• Cloud Monitoring - Datacenter tools are a big pain !
• You were measuring CPU metrics for a bunch of machines
• Now those are gone, and the new ones are up
• Who is going to refresh your dashboards?
• Who is going to clean up the dead instance data?
8. GOOD PERFORMANCE ON AWS?
(Re)architect your app to
• Work on Amazon !
• Take advantage of all that it provides
• Careful with shared services !
Pick the right performance monitoring tools !
Let's not forget managing capacity/cost !!
12. IF YOU ARE USING SHARED SERVICES
• Measure service performance in isolation
• Stress test the hell out of shared service calls
• At a minimum, double your peak load !
• Look for common patterns out there
• e.g. SimpleDB needs a cache frontend
• Avoid badly performing shared services
• EBS?
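The "cache frontend" pattern named on this slide could be sketched roughly as follows. This is a minimal illustration, not the speakers' implementation: `ReadThroughCache` and the `fetch` callable are hypothetical names, and `fetch` merely stands in for a read against a slow shared service such as SimpleDB. No real AWS API is called.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Tiny TTL cache placed in front of a slow shared-service read."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # hypothetical call to the shared service
        self._ttl = ttl_seconds  # assumed freshness window, in seconds
        self._clock = clock      # injectable so the sketch can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.misses = 0          # how many calls reached the shared service

    def get(self, key: str) -> str:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]        # fresh entry: no shared-service call
        self.misses += 1
        value = self._fetch(key)  # absent or stale: go to the service
        self._entries[key] = (now, value)
        return value
```

Counting `misses` against total reads gives a rough idea of how much load the cache keeps off the shared service, which is useful when stress testing at double the peak load.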
13. PERFORMANCE MONITORING
ESTABLISH CRITERIA TO PICK THE RIGHT TOOLS
14. 1. HAS TO BE SERVICE ORIENTED
• Primarily monitor Services not Infrastructure !
• Focus on the application SLAs
• Focus on the end user experience
Process Service Order
§ Response times
§ Load
§ Error rates
§ Trends
15. 2. HAS TO BE DISTRIBUTED
• Tools need to measure health of tiers
• Measuring individual servers does not make sense
• Services are horizontally scalable
[Diagram: tiers of EC2 instances, labeled ec2-1 to ec2-5]
You need to know how the cluster/tier performs
in terms of average utilization
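Measuring the tier rather than the individual server might look like the sketch below. The `(tier, instance_id, utilization)` sample format and the function name `tier_average_utilization` are assumptions made for illustration; they are not part of any particular monitoring tool.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def tier_average_utilization(
    samples: Iterable[Tuple[str, str, float]]
) -> Dict[str, float]:
    """Average utilization per tier from (tier, instance_id, utilization) samples."""
    by_tier = defaultdict(list)
    for tier, _instance_id, utilization in samples:
        # The instance ID is deliberately ignored: only the tier matters here.
        by_tier[tier].append(utilization)
    return {tier: mean(values) for tier, values in by_tier.items()}
```

Because the result is keyed by tier, instances can join or leave between samples without changing what the dashboard has to display.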
16. 3. HAS TO KEEP UP WITH RATE OF CHANGE
• Keep up with machines going up/down
• Nodes are transient
• Provide a clean view of the current state
• Clean up dead instances/services
• Maintain a baseline of how the overall tier does
[Diagram: a tier of EC2 instances, labeled ec2-1, ec2-2, ec2-3, ec2-22, ec2-23, ec2-24]
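One way to keep a clean view of the current state is to treat an instance as alive only while it keeps reporting in. The sketch below assumes a heartbeat model with a timeout. `TierView`, `heartbeat` and `stale_after` are hypothetical names, not an AppDynamics API.

```python
from typing import Dict, List


class TierView:
    """Tracks which instances of a tier are currently alive."""

    def __init__(self, stale_after: float) -> None:
        self._stale_after = stale_after  # assumed heartbeat timeout, seconds
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, instance_id: str, now: float) -> None:
        # New instances register themselves simply by reporting in.
        self._last_seen[instance_id] = now

    def live_instances(self, now: float) -> List[str]:
        # Drop instances that stopped reporting, then return the current set.
        self._last_seen = {
            instance_id: seen
            for instance_id, seen in self._last_seen.items()
            if now - seen <= self._stale_after
        }
        return sorted(self._last_seen)
```

With this shape nobody has to refresh a dashboard by hand: new nodes appear on their first heartbeat and dead ones fall out once the timeout passes.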
17. 4. CROSS SERVICE TRACING
• Becomes absolutely necessary for truly distributed
apps
• Should be able to drill down across services within
the context of a single user request
• Should be able to analyze code in every service
• Should be able to point out impact of using shared
services
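Tracing a single user request across services generally relies on passing a shared identifier from one call to the next. The sketch below illustrates that idea only. The header name `X-Request-Id` and the helpers `with_trace`, `record_span` and `slowest_service` are assumptions for illustration, not the mechanism any specific tool uses.

```python
import uuid
from typing import Dict, List, Optional

TRACE_HEADER = "X-Request-Id"  # hypothetical header name


def with_trace(headers: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return headers carrying a request ID, reusing the caller's if present."""
    out = dict(headers or {})
    out.setdefault(TRACE_HEADER, uuid.uuid4().hex)
    return out


def record_span(log: List[dict], service: str, headers: Dict[str, str],
                elapsed_ms: float) -> None:
    """Append one timing record tagged with the request ID."""
    log.append({"request_id": headers[TRACE_HEADER],
                "service": service, "elapsed_ms": elapsed_ms})


def slowest_service(log: List[dict], request_id: str) -> str:
    """Name the service that took longest for one user request."""
    spans = [s for s in log if s["request_id"] == request_id]
    return max(spans, key=lambda s: s["elapsed_ms"])["service"]
```

Grouping timing records by request ID is what makes it possible to point at a shared service as the slow hop for a given request.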
20. 5. AUTODISCOVERY/LOW CONFIGURATION MAINTENANCE
• Cannot have configuration based discovery of new
instances/services
• Baking into AMIs etc.
• Should auto-discover new tiers/services
• Cannot have code level configuration
• Difficult to maintain with agile development
22. MANAGING COST - EISMANN
• Managing Capacity == Cost
• The cloud isn’t free !
• Eismann
• Frozen food delivery vendor in Germany
• In-production on AWS
• Has variable capacity based on usage hours
• Use application level SLAs to determine capacity
• E.g. Process Order Volume == capacity of services on AWS
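Deriving capacity from an application-level measure such as order volume can be sketched as below. The per-instance throughput figure and the function name `instances_needed` are assumptions for illustration; the slides do not say how Eismann computes its capacity.

```python
import math


def instances_needed(orders_per_minute: float,
                     orders_per_instance_per_minute: float,
                     min_instances: int = 1) -> int:
    """Instances required to serve the current order volume.

    orders_per_instance_per_minute is an assumed, separately measured
    throughput that one instance sustains while still meeting the SLA.
    """
    if orders_per_instance_per_minute <= 0:
        raise ValueError("per-instance throughput must be positive")
    needed = math.ceil(orders_per_minute / orders_per_instance_per_minute)
    # Never scale below the floor, even when there is no traffic.
    return max(min_instances, needed)
```

For example, `instances_needed(950, 100)` rounds 9.5 up to 10 instances, while a quiet period with no orders falls back to `min_instances`.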
23. WHAT IS APPDYNAMICS?
• Fundamentally built for the Cloud
• Handles constant change of infrastructure
• Service oriented SLA management
• Detailed – actionable information on service
performance for engineers, architects and
operations
• Zero to low configuration
• No code configuration needed for visibility
Did I mention Eismann is fully deployed on AppDynamics and
uses us for automatically managing capacity and SLAs !!