Metrics that “talk” on Cloud using Ganglia
How do we monitor performance on Megam Cloud
We'll Cover
● Our Experience Using Ganglia
● How does it work in our Platform
● Chef cookbooks for metering and setup.
● Dashboard integration

© 2012-2013 Megam Systems
What is Ganglia
Scalable distributed monitoring system for high-performance
computing systems.
Sends information about your cloud instance.
Can be used as a live cloud monitor.
Can be extended using Python plugins

What have we accomplished ?
Oh Yeah - <flip to next page>

[Architecture diagram: Dash #2 (AngularJs Client), Metrics API, gmetad collectors, gmond senders]

Would you like to set one up?
Yes you can
(or) http://www.megam.co

Few facts on gmetad/gmond
gmetad can run standalone or alongside gmond.
gmetad can be configured to collect metrics from gmond servers in the same cluster or in different clusters.
gmetad stores data at
➔ /var/lib/ganglia/rrds/CLUSTER_NAME/GMOND_SERVER_NAME
GMOND_SERVER_NAME can be changed in gmond.conf
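
A minimal sketch of how the storage path above is put together, assuming the RRD root shown on this slide. The helper name rrd_dir is hypothetical and only for illustration; gmetad builds this path on its own.

```python
from pathlib import PurePosixPath

# RRD root as shown on the slide; an assumption if your gmetad.conf overrides it.
RRD_ROOT = PurePosixPath("/var/lib/ganglia/rrds")


def rrd_dir(cluster_name: str, gmond_server_name: str) -> PurePosixPath:
    """Return the directory where gmetad keeps the RRD files of one gmond host.

    rrd_dir is a hypothetical helper name, not part of Ganglia.
    """
    return RRD_ROOT / cluster_name / gmond_server_name


if __name__ == "__main__":
    print(rrd_dir("megcluster", "gmond1.megam.co"))
```

The second argument is whatever name gmond reports for the host, which is why changing it in gmond.conf changes the directory name.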

gmetad is the metrics collector
gmond is the metrics sender
Ok. Got it.

What is our setup
#1 gmetad : monitor1.megam.co
#2 gmetad : monitor2.megam.co
Several gmonds (Cloud Apps) pumping data to gmetad

What are Cloud Apps
Any app
for lang := range ProgLanguages {
Java
Scala
Go
….
meteor

}
&&

DB, Queue

Do you need Graphite?
No
Why?
It needs RRD-formatted metric files
It copies the RRD files from gmetad
Twice the storage
Is this the only solution? Eager to hear feedback.

Let us set up gmetad 3.3.8-1
Ubuntu (raring): package is gmetad
Ubuntu (saucy) has 3.6.0

sudo apt-get install gmetad
sudo apt-get install ganglia-webfrontend   # optional

We used an Opscode cookbook for the setup => Link

Configure gmetad
nano /etc/ganglia/gmetad.conf
data_source "megcluster" <gmond1>.megam.co:8649 <gmond2>.megam.co:8649

➔ The above says "megcluster" collects metrics from <gmond1>.megam.co and <gmond2>.megam.co
➔ Which is like monitoring a Java App in <gmond1>.megam.co (or) your favorite App in <gmond2>.megam.co
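
To make the shape of a data_source line concrete, here is a small sketch that splits one into its parts. It assumes the usual layout of a quoted cluster name, an optional polling interval in seconds, and a list of host[:port] entries, with 8649 as the port when none is given. The names DataSource and parse_data_source are hypothetical helpers, not part of Ganglia, and the hostnames are placeholders.

```python
import shlex
from typing import List, NamedTuple, Optional, Tuple

DEFAULT_GMOND_PORT = 8649  # port used on the slides


class DataSource(NamedTuple):
    cluster: str
    interval: Optional[int]        # polling interval in seconds, if given
    hosts: List[Tuple[str, int]]   # (hostname, port) pairs


def parse_data_source(line: str) -> DataSource:
    """Split one gmetad.conf data_source line into its parts (illustrative only)."""
    tokens = shlex.split(line)  # handles the quoted cluster name
    if len(tokens) < 3 or tokens[0] != "data_source":
        raise ValueError("not a data_source line: %r" % line)
    cluster, rest = tokens[1], tokens[2:]
    interval = None
    if rest[0].isdigit():  # optional polling interval before the hosts
        interval = int(rest[0])
        rest = rest[1:]
    if not rest:
        raise ValueError("data_source needs at least one host")
    hosts = []
    for item in rest:
        host, _, port = item.partition(":")
        hosts.append((host, int(port) if port else DEFAULT_GMOND_PORT))
    return DataSource(cluster, interval, hosts)


if __name__ == "__main__":
    print(parse_data_source(
        'data_source "megcluster" gmond1.megam.co:8649 gmond2.megam.co:8649'
    ))
```

Reading the line this way shows that one gmetad entry names one cluster and any number of gmond hosts it polls.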

gmetad - start/stop.
Start :
sudo gmetad

Stop : good old kill
ps -ef | grep gmetad

sudo kill -9 <pid>

Cool gmetad - monitor1.megam.co is running

gmond
The ganglia-monitor-python package is installed on each server that is to be monitored.
➔ The package collects basic metrics (cpu...) using Python scripts in /usr/lib/ganglia/
➔ It can be extended by enabling additional Python scripts in /usr/lib/ganglia/python_modules.
➔ For an exhaustive list: https://github.com/ganglia/gmond_python_modules
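
As an illustration of what such a Python script looks like, here is a minimal sketch of a gmond Python module. It follows the module interface used by the modules in the repository above: a metric_init(params) function that returns a list of metric descriptors, a callback per metric, and metric_cleanup(). The metric name megam_module_uptime and the group name are made up for this example.

```python
import time

# Time at which gmond loaded this module.
_START = time.time()


def read_uptime(name):
    """Callback gmond invokes to collect the metric; `name` is the metric name."""
    return float(time.time() - _START)


def metric_init(params):
    """Called once by gmond; returns the list of metric descriptors."""
    descriptor = {
        "name": "megam_module_uptime",  # hypothetical metric name
        "call_back": read_uptime,
        "time_max": 90,
        "value_type": "float",
        "units": "seconds",
        "slope": "both",
        "format": "%f",
        "description": "Seconds since this module was loaded",
        "groups": "megam_example",      # hypothetical group name
    }
    return [descriptor]


def metric_cleanup():
    """Called by gmond on shutdown; nothing to release here."""
    pass


if __name__ == "__main__":
    # Lets you try the module by hand, outside gmond.
    for d in metric_init({}):
        print(d["name"], d["call_back"](d["name"]))
```

gmond also needs a matching .pyconf entry before it loads a module; where that file lives depends on the package layout, so check your installation.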

Let us set up gmond 3.3.8-1
Ubuntu (raring): package is ganglia-monitor-python
Ubuntu (saucy) has 3.6.0

sudo apt-get install ganglia-monitor-python

We used an Opscode cookbook for the setup => Link

Configure gmond
nano /etc/ganglia/gmond.conf
globals {
override_hostname = <gmond1>.megam.co
override_ip = 127.0.0.1
}
udp_send_channel {
host = monitor1.megam.co
port = 8649
ttl = 1
}
cluster {
name = "megcluster"
owner = "unspecified"
}

What did we configure?
➔ In globals we set the host name of the server running our Java App: <gmond1>.megam.co
➔ We provide the UDP channel of the gmetad (monitor1.megam.co)
➔ We need to specify the gmetad cluster (megcluster)
➔ The cluster attribute groups all gmonds under a gmetad <CLUSTER>, in our case megcluster.
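
The three settings above can be sketched as a small template, which is roughly what a configuration tool does when it writes gmond.conf. This is only a sketch under assumptions: render_gmond_conf and GMOND_TEMPLATE are hypothetical names, and the output is a fragment covering the settings discussed here, not a complete gmond.conf.

```python
# Fragment of gmond.conf with only the settings discussed on this slide.
GMOND_TEMPLATE = """\
globals {{
  override_hostname = "{hostname}"
}}
cluster {{
  name = "{cluster}"
}}
udp_send_channel {{
  host = {gmetad_host}
  port = {port}
  ttl = 1
}}
"""


def render_gmond_conf(hostname: str, cluster: str, gmetad_host: str, port: int = 8649) -> str:
    """Fill in the host name, cluster name and gmetad address (illustrative helper)."""
    return GMOND_TEMPLATE.format(
        hostname=hostname, cluster=cluster, gmetad_host=gmetad_host, port=port
    )


if __name__ == "__main__":
    print(render_gmond_conf("gmond1.megam.co", "megcluster", "monitor1.megam.co"))
```

Keeping the cluster name as a single parameter makes it harder for the gmond side and the gmetad data_source entry to drift apart.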

gmond - start/stop.
Start :
sudo gmond

Stop : good old kill
ps -ef | grep gmond

sudo kill -9 <pid>

Cool gmond - <gmond1>.megam.co is running & pumping to monitor1.megam.co

We customized Chef cookbooks

How do we use the Chef cookbooks

Tweak cookbook for gmetad

https://github.com/indykish/chef-repo/tree/master/cookbo
➔ Attributes
default[:ganglia][:cluster_name] = "megcluster"
default[:ganglia][:unicast] = true
default[:ganglia][:hostname] = "monitoring1.megam.co"

Chef Run : gmetad
Run chef :
runlist 'recipe[megam_ganglia::gmetad]'

Tweak recipes for gmond
For Any App
➔ Default: installs and configures ganglia-monitor-python. It collects basic metrics like cpu_usage, memory_usage etc.
➔ Nginx: collects nginx status details.
➔ Rabbit: collects rabbitmq metrics.
➔ Redis: collects redis metrics.
➔ Riak: collects riak metrics.

Chef Run : gmond
To monitor an app
include_recipe "megam_ganglia"
Nginx-fronted apps
include_recipe "megam_ganglia::nginx"
RabbitMQ apps
include_recipe "megam_ganglia::rabbit"
Riak apps
include_recipe "megam_ganglia::riak"
Redis apps
include_recipe "megam_ganglia::redis"

Configure gmond
nano /etc/ganglia/gmond.conf
globals {
daemonize = yes
setuid = yes
user = nobody
debug_level = 0
max_udp_msg_len = 1472
mute = no
deaf = no
host_dmax = 86400 /*secs */
cleanup_threshold = 300 /*secs */
gexec = no
send_metadata_interval = 30
override_hostname = "<gmond1>.megam.co"
override_ip = 127.0.0.1
}
cluster {
name = "megcluster"
owner = "unspecified"
latlong = "unspecified"
url = "unspecified"
}
/* The host section describes attributes of the host, like the location */
host {
location = "unspecified"
}
/* Feel free to specify as many udp_send_channels as you like. Gmond
used to only support having a single channel */
udp_send_channel {
host = monitoring1.megam.co
port = 8649
ttl = 1
}

Verifying gmond data

Open http://gmond1.megam.co:8649 in your browser.
➔ It will list the metrics of the gmond instance.
➔ Not recommended for prod.
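
The same check can be done from a script. gmond answers a plain TCP connection on its port with an XML dump of the metrics it knows about, so a sketch like the one below can read and summarise it. The function names are hypothetical, the port is the one used on the slides, and only the HOST and METRIC elements with their NAME and VAL attributes are relied on.

```python
import socket
import xml.etree.ElementTree as ET
from typing import Dict


def fetch_gmond_xml(host: str, port: int = 8649, timeout: float = 5.0) -> bytes:
    """Connect to gmond and read the XML dump until gmond closes the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def metrics_by_host(xml_bytes: bytes) -> Dict[str, Dict[str, str]]:
    """Map each HOST name to a {metric name: value} dict (values kept as strings)."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for host in root.iter("HOST"):
        result[host.get("NAME")] = {
            metric.get("NAME"): metric.get("VAL") for metric in host.iter("METRIC")
        }
    return result
```

Calling metrics_by_host(fetch_gmond_xml("gmond1.megam.co")) from a machine that is allowed to reach the port gives a quick look at what that gmond is reporting. As the slide says, leaving this port open to the world is not something to do in production.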

Sample gmond data (for redis server)

Checking gmetad data

The number of data files in the directory below increases as metrics arrive
– Get metrics data at /var/lib/ganglia/rrds/megcluster/gmond1.megam.co
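
A quick way to watch that directory from a script is to list the .rrd files gmetad has written for a host. This is a minimal sketch; list_rrd_files is a hypothetical helper and the directory you pass in depends on your cluster and host names.

```python
from pathlib import Path
from typing import List


def list_rrd_files(host_dir: Path) -> List[str]:
    """Return the sorted names of the .rrd files in one host directory.

    Returns an empty list if gmetad has not created the directory yet.
    """
    if not host_dir.is_dir():
        return []
    return sorted(p.name for p in host_dir.glob("*.rrd"))


if __name__ == "__main__":
    # Path from the slide; adjust to your own cluster and host names.
    files = list_rrd_files(Path("/var/lib/ganglia/rrds/megcluster/gmond1.megam.co"))
    print(len(files), "rrd files")
```

Running it a few minutes apart shows whether new metric files are appearing, which is the check described on this slide.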

Sample gmetad data (for thomas.work.local)

Dash Integration in rails

Built on: (logos shown on the original slide)

We'll cover it in detail in a separate SlideShare.
If you are hungry “Code is the design” :)
For questions on this area: rajthilak@megam.co.in

References
Ganglia Wiki
megam chef-repo

Our Organization(Megam Systems)
Beta Launch of Megam Cloud (Polyglot PaaS)
Our PaaS design => Link
Register http://www.megam.co for an invite
Twitter : @indykish

Screencast illustrating the Cloud API
Servers working live

Thank you for watching
