Introduction to ggplot2
                              Elegant Graphics for Data Analysis
                                      Maik Röder
                                      15.12.2011
                            RUGBCN and Barcelona Code Meetup




Friday 16 December 2011
Data Analysis Steps
       • Prepare data
        • e.g. using the reshape framework for restructuring
                    data
       • Plot data
        • e.g. using ggplot2 instead of base graphics and
                    lattice
       • Summarize the data and refine the plots
        • Iterative process
ggplot2
                 grammar of graphics




Grammar
                •       Oxford English Dictionary:

                      •     The fundamental principles or rules of an art or
                            science

                      •     A book presenting these in methodical form.
                            (Now rare; formerly common in the titles of
                            books.)

                •       System of rules underlying a given language

                •       An abstraction which facilitates thinking, reasoning
                        and communicating


The grammar of graphics
               •      Move beyond named graphics (e.g. “scatterplot”)

                    •       gain insight into the deep structure that underlies
                            statistical graphics

               •      Powerful and flexible system for

                    •       constructing abstract graphs (set of points)
                            mathematically

                    •       Realizing physical representations as graphics by
                            mapping aesthetic attributes (size, colour) to graphs

               •      Lacking openly available implementation

Specification
               Concise description of components of a graphic

           • DATA - data operations that create variables
                   from datasets. Reshaping using an Algebra with
                   operations
           • TRANS - variable transformations
           • SCALE - scale transformations
           • ELEMENT - graphs and their aesthetic attributes
           • COORD - a coordinate system
           • GUIDE - one or more guides
Birth/Death Rate




         Source: http://www.scalloway.org.uk/popu6.htm



Excess birth (vs. death) rates in selected countries




                                                    Source: The grammar of Graphics, p.13
Grammar of Graphics
       Specification can be run in GPL implemented in SPSS

              DATA: source("demographics")
              DATA: longitude,
                    latitude = map(source("World"))
              TRANS: bd = max(birth - death, 0)
              COORD: project.mercator()
              ELEMENT: point(position(lon * lat),
                             size(bd),
                             color(color.red))
              ELEMENT: polygon(position(longitude * latitude))
                                            Source: The grammar of Graphics, p.13
Rearrangement of Components
  Grammar of Graphics: Data, Trans, Element, Scale, Guide, Coord

  Layered Grammar of Graphics:
    Defaults: Data, Mapping
    Layer: Data, Mapping, Geom, Stat, Position
    Scale
    Coord
    Facet
Layered Grammar of Graphics
                  Implementation embedded in R using ggplot2

      w <- world
      d <- demographics
      d <- transform(d,
                     bd = pmax(birth - death, 0))
      p <- ggplot(d, aes(lon, lat))
      p <- p + geom_polygon(data = w)
      p <- p + geom_point(aes(size = bd),
                              colour = "red")
      p <- p + coord_map(projection = "mercator")
      p
ggplot2
                   •        Author: Hadley Wickham

                   •        Open Source implementation of the layered
                            grammar of graphics

                   •        High-level R package for creating publication-
                            quality statistical graphics

                            •   Carefully chosen defaults following basic
                                graphical design rules

                   •        Flexible set of components for creating any type of
                            graphics
ggplot2 installation
           • In R console:
                   install.packages("ggplot2")
                   library(ggplot2)




qplot
                   • Quickly plot something with qplot
                    • for exploring ideas interactively
                   • Takes many of the same options as base plot(), translated to ggplot2
                            qplot(carat, price,
                                  data=diamonds,
                                  main = "Diamonds",
                                  asp = 1)

Exploring with qplot
                 First try:

                       qplot(carat, price,
                             data=diamonds)
                 Log transform using functions on the variables:
                            qplot(log(carat),
                                  log(price),
                                  data=diamonds)

from qplot to ggplot
qplot(carat, price,
      data=diamonds,
      main = "Diamonds",
      asp = 1)

p <- ggplot(diamonds, aes(carat, price))
p <- p + geom_point()
# ggtitle() and theme() replace the older opts() call
p <- p + ggtitle("Diamonds") +
         theme(aspect.ratio = 1)
p
Data and mapping

                   • If you need to flexibly restructure and
                            aggregate data beforehand, use Reshape

                            • data is considered an independent concern
                   • Need a mapping of what variables are
                            mapped to what aesthetic
                            • weight => x, height => y, age => size
                            • Mappings are defined in scales
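The weight/height/age mapping above might be written as follows. This is only a sketch: the `people` data frame and its values are invented for illustration and are not part of the talk.

```r
library(ggplot2)

# Hypothetical data, invented for this sketch
people <- data.frame(
  weight = c(60, 72, 85, 90),
  height = c(165, 170, 180, 175),
  age    = c(23, 35, 47, 52)
)

# aes() declares which variable drives which aesthetic:
# weight => x, height => y, age => size
p <- ggplot(people, aes(x = weight, y = height, size = age)) +
  geom_point()

# layer_data() shows the data after the mapping has been applied
mapped <- layer_data(p)
```

The data frame stays untouched; only the `aes()` call changes when a different variable should drive an aesthetic.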
Statistical Transformations
                            • a stat transforms data
                            • can add new variables to a dataset
                             • that can be used in aesthetic mappings
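As a small illustration of a stat adding variables, the sketch below bins the `diamonds` data and inspects what the stat computed. The bin width of 0.2 is an arbitrary choice for the example.

```r
library(ggplot2)

# geom_histogram() uses the binning stat, which computes per-bin variables
p <- ggplot(diamonds, aes(carat)) +
  geom_histogram(binwidth = 0.2)

# layer_data() returns the data after the stat has run;
# count and density are variables the stat added
binned <- layer_data(p)
head(binned[, c("x", "count", "density")])
```

Computed variables such as `density` can then be used in an aesthetic mapping, as the faceting example later in the deck does.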



stat_smooth
          • Fits a smoother to the data
          • Displays a smooth and its standard error
    ggplot(diamonds, aes(carat, price)) +
    geom_point() + geom_smooth()
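To see the smooth and its standard error as numbers rather than a picture, one option is to fit with an explicit method and inspect the layer data. The choice of a linear model here is an assumption made for the sketch, not the default used on the slide.

```r
library(ggplot2)

# Same plot, but with an explicit linear smoother and error band
p <- ggplot(diamonds, aes(carat, price)) +
  geom_point(alpha = 0.1) +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)

# Layer 2 is the smoother: y is the fit, ymin/ymax bound the band,
# and se is the standard error at each evaluated x
smooth <- layer_data(p, 2)
head(smooth[, c("x", "y", "ymin", "ymax", "se")])
```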




Geometric Object
                   • Control the type of plot
                   • A geom can only display certain aesthetics
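One way to check which aesthetics a geom understands is to ask the geom object itself. The sketch below compares points and lines; treat it as an exploratory aid rather than a documented workflow.

```r
library(ggplot2)

# Each geom lists the aesthetics it can display
point_aes <- GeomPoint$aesthetics()
line_aes  <- GeomLine$aesthetics()

# Points can vary in shape; lines have no shape aesthetic
"shape" %in% point_aes
"shape" %in% line_aes

# The same data and mapping drawn with two different geoms
base   <- ggplot(diamonds, aes(cut, price))
p_box  <- base + geom_boxplot()
p_dots <- base + geom_point(alpha = 0.1)
```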




geom_histogram

      • Distribution of carats shown in a histogram

     ggplot(diamonds, aes(carat)) +
     geom_histogram()




Position adjustments
                   • Tweak positioning of geometric objects
                   • Avoid overlaps
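Besides jittering, bars are a common case for position adjustments. The sketch below, which uses the `diamonds` data as an assumed example, contrasts the default stacking with dodging.

```r
library(ggplot2)

base <- ggplot(diamonds, aes(cut, fill = clarity))

# Default for bars: segments are stacked on top of each other
p_stack <- base + geom_bar()

# Dodging places the bars side by side instead
p_dodge <- base + geom_bar(position = "dodge")

# Stacked bars start above zero; dodged bars all start at zero
stack_data <- layer_data(p_stack)
dodge_data <- layer_data(p_dodge)
```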




position_jitter

         • Avoid overplotting by jittering points
         x <- c(0, 0, 0, 0, 0)
         y <- c(0, 0, 0, 0, 0)
         overplotted <- data.frame(x, y)
         ggplot(overplotted, aes(x, y)) +
           geom_point(position = position_jitter(width = 0.1, height = 0.1))
Scales
                   • Control mapping from data to aesthetic
                            attributes
                   • One scale per aesthetic
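A sketch of one scale per aesthetic: x, y and colour each get their own scale. The log scales and the "Blues" palette are arbitrary choices for illustration.

```r
library(ggplot2)

p <- ggplot(diamonds, aes(carat, price, colour = cut)) +
  geom_point(alpha = 0.3) +
  scale_x_log10() +                        # scale for the x aesthetic
  scale_y_log10() +                        # scale for the y aesthetic
  scale_colour_brewer(palette = "Blues")   # scale for the colour aesthetic

# After scaling, x holds log10(carat) rather than the raw values
scaled <- layer_data(p)
range(scaled$x)
```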




scale_x_continuous
                            scale_y_continuous
       x <- c(0, 0, 0, 0, 0)
       y <- c(0, 0, 0, 0, 0)
       overplotted <- data.frame(x, y)
       ggplot(overplotted, aes(x, y)) +
         geom_point(position = position_jitter(width = 0.1, height = 0.1)) +
         scale_x_continuous(limits = c(-1, 1)) +
         scale_y_continuous(limits = c(-1, 1))

Coordinate System
                   • Maps the position of objects into the plane
                   • Affects all position variables simultaneously
                   • Changes the appearance of geoms (unlike scales)
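To illustrate how a coordinate system changes the appearance of a geom, the sketch below draws one stacked bar and then the same layer in polar coordinates, where it appears as a pie chart. The example data and the use of `coord_polar()` are choices made for this sketch.

```r
library(ggplot2)

# A single stacked bar in the default Cartesian coordinates
bars <- ggplot(diamonds, aes(x = factor(1), fill = cut)) +
  geom_bar(width = 1)

# Same data, same geom; only the coordinate system changes
pie <- bars + coord_polar(theta = "y")

# The underlying layer data is the same in both plots
bar_data <- layer_data(bars)
pie_data <- layer_data(pie)
```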



coord_map
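library(ggplot2)
library(maps)
# Outline of New Zealand as x/y (longitude/latitude) coordinates
nz <- map("nz", plot = FALSE)[c("x", "y")]
m <- data.frame(nz)
n <- qplot(x, y, data = m, geom = "path")
n
# Columns must be named x and y to match the plot's mapping
d <- data.frame(x = 0, y = 0)
n + geom_point(data = d, colour = "red")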

Faceting
                       • lay out multiple plots on a page
                        • split data into subsets
                        • plot subsets into different panels




Facet Types
          2D grid of panels:       1D ribbon of panels
                                    wrapped into 2D:
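The two facet types might be compared like this. The choice of variables and `ncol = 3` are assumptions made for the sketch.

```r
library(ggplot2)

base <- ggplot(diamonds, aes(carat, price)) +
  geom_point(alpha = 0.1)

# 2D grid: one row per clarity level, one column per cut
grid_plot <- base + facet_grid(clarity ~ cut)

# 1D ribbon of panels, wrapped into rows of three
wrap_plot <- base + facet_wrap(~ cut, ncol = 3)

# Number of panels that receive data in each layout
grid_panels <- length(unique(layer_data(grid_plot)$PANEL))
wrap_panels <- length(unique(layer_data(wrap_plot)$PANEL))
```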




Faceting

 # after_stat(density) supersedes the older ..density.. notation
 aesthetics <- aes(carat, after_stat(density))
 p <- ggplot(diamonds, aesthetics)
 p <- p + geom_histogram(binwidth = 0.2)
 p + facet_grid(clarity ~ cut)




Faceting Formula
        no faceting                                   . ~ .
        single row, multiple columns                  . ~ a
        single column, multiple rows                  b ~ .
        multiple rows and columns                     a ~ b
        multiple variables in rows and/or columns     . ~ a + b
                                                      a + b ~ .
                                                      a + b ~ c + d

Scales in Facets
        facet_grid(. ~ cyl, scales="free_x")


        scales value     free axes
        fixed            -
        free             x, y
        free_x           x
        free_y           y
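A runnable version of the `facet_grid()` call above might look like the following. The `mtcars` data set and the `mpg`/`wt` mapping are assumptions, since the slide shows only the facet call.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point()

# Default: every panel shares the same x and y ranges
fixed_plot <- base + facet_grid(. ~ cyl)

# free_x: each column gets an x range fitted to its own data
free_x_plot <- base + facet_grid(. ~ cyl, scales = "free_x")

# One panel per distinct cyl value
n_panels <- length(unique(layer_data(free_x_plot)$PANEL))
```

Printing `fixed_plot` and `free_x_plot` side by side shows the difference in the x axes.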
Layers
                   • Iteratively update a plot
                    • change a single feature at a time
                   • Think about the high level aspects of the
                            plot in isolation
                   • Instead of choosing a static type of plot,
                            create new types of plots on the fly
                   • Cure against immobility
                    • Developers can easily develop new layers
                              without affecting other layers
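Iterative updating can be sketched as a sequence of small steps, each changing a single feature. The particular layers and scales chosen here are only an example.

```r
library(ggplot2)

# Start with data and mapping only; nothing is drawn yet
p <- ggplot(diamonds, aes(carat, price))

# Add one feature at a time
p <- p + geom_point(alpha = 0.1)                      # raw data
p <- p + geom_smooth(method = "lm", formula = y ~ x)  # summary layer
p <- p + scale_x_log10() + scale_y_log10()            # rescale both axes

# The plot now holds two layers
n_layers <- length(p$layers)
```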
Hierarchy of defaults
    Omitted layer     Default chosen by layer
    Stat              Geom
    Geom              Stat
    Mapping           Plot default
    Coord             Cartesian coordinates
    Scale             Chosen depending on aesthetic and type of variable
    Position          Linear scaling for continuous variables;
                      integers for categorical variables
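The first two rows of the table can be checked by inspecting layer objects directly. This is a sketch that relies on the layer's `stat` and `geom` fields, which is an exploratory technique rather than something shown in the talk.

```r
library(ggplot2)

# Stat omitted: the geom supplies a default stat
hist_layer <- geom_histogram()
class(hist_layer$stat)[1]

# Geom omitted: the stat supplies a default geom
bin_layer <- stat_bin()
class(bin_layer$geom)[1]
```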


Thanks!
                   • Visit the ggplot2 homepage:
                    • http://had.co.nz/ggplot2/
                   • Get the ggplot2 book:
                    • http://amzn.com/0387981403
                   • Get the Grammar of Graphics book from
                            Leland Wilkinson:
                            • http://amzn.com/0387245448

 
Production of Erythromycin microbiology.pptx
Production of Erythromycin microbiology.pptxProduction of Erythromycin microbiology.pptx
Production of Erythromycin microbiology.pptx
 
Construction Documents Checklist before Construction
Construction Documents Checklist before ConstructionConstruction Documents Checklist before Construction
Construction Documents Checklist before Construction
 
Cold War Tensions Increase - 1945-1952.pptx
Cold War Tensions Increase - 1945-1952.pptxCold War Tensions Increase - 1945-1952.pptx
Cold War Tensions Increase - 1945-1952.pptx
 
Design mental models for managing large-scale dbt projects. March 21, 2024 in...
Design mental models for managing large-scale dbt projects. March 21, 2024 in...Design mental models for managing large-scale dbt projects. March 21, 2024 in...
Design mental models for managing large-scale dbt projects. March 21, 2024 in...
 
Best-NO1 Pakistani Amil Baba Real Amil baba In Pakistan Najoomi Baba in Pakis...
Best-NO1 Pakistani Amil Baba Real Amil baba In Pakistan Najoomi Baba in Pakis...Best-NO1 Pakistani Amil Baba Real Amil baba In Pakistan Najoomi Baba in Pakis...
Best-NO1 Pakistani Amil Baba Real Amil baba In Pakistan Najoomi Baba in Pakis...
 
Math Group 3 Presentation OLOLOLOLILOOLLOLOL
Math Group 3 Presentation OLOLOLOLILOOLLOLOLMath Group 3 Presentation OLOLOLOLILOOLLOLOL
Math Group 3 Presentation OLOLOLOLILOOLLOLOL
 
Designing for privacy: 3 essential UX habits for product teams
Designing for privacy: 3 essential UX habits for product teamsDesigning for privacy: 3 essential UX habits for product teams
Designing for privacy: 3 essential UX habits for product teams
 
Khushi sharma undergraduate portfolio...
Khushi sharma undergraduate portfolio...Khushi sharma undergraduate portfolio...
Khushi sharma undergraduate portfolio...
 

Introduction to ggplot2

  • 1. Introduction to ggplot2: Elegant Graphics for Data Analysis. Maik Röder, 15.12.2011, RUGBCN and Barcelona Code Meetup
  • 2. Data Analysis Steps
      • Prepare data
          • e.g. using the reshape framework for restructuring data
      • Plot data
          • e.g. using ggplot2 instead of base graphics and lattice
      • Summarize the data and refine the plots
          • Iterative process
  • 3. ggplot2: grammar of graphics
  • 4. Grammar
      • Oxford English Dictionary:
          • The fundamental principles or rules of an art or science
          • A book presenting these in methodical form. (Now rare; formerly common in the titles of books.)
      • System of rules underlying a given language
      • An abstraction which facilitates thinking, reasoning and communicating
  • 5. The grammar of graphics
      • Move beyond named graphics (e.g. “scatterplot”)
          • Gain insight into the deep structure that underlies statistical graphics
      • Powerful and flexible system for
          • Constructing abstract graphs (sets of points) mathematically
          • Realizing physical representations as graphics by mapping aesthetic attributes (size, colour) to graphs
      • Lacking openly available implementation
  • 6. Specification
      Concise description of the components of a graphic:
      • DATA - data operations that create variables from datasets. Reshaping using an Algebra with operations
      • TRANS - variable transformations
      • SCALE - scale transformations
      • ELEMENT - graphs and their aesthetic attributes
      • COORD - a coordinate system
      • GUIDE - one or more guides
  • 7. Birth/Death Rate (figure). Source: http://www.scalloway.org.uk/popu6.htm
  • 8. Excess birth (vs. death) rates in selected countries (figure). Source: The Grammar of Graphics, p. 13
  • 9. Grammar of Graphics
      Specification can be run in GPL, implemented in SPSS:
          DATA: source("demographics")
          DATA: longitude, latitude = map(source("World"))
          TRANS: bd = max(birth - death, 0)
          COORD: project.mercator()
          ELEMENT: point(position(lon * lat), size(bd), color(color.red))
          ELEMENT: polygon(position(longitude * latitude))
      Source: The Grammar of Graphics, p. 13
  • 10. Rearrangement of Components
      • Grammar of Graphics: Data, Trans, Element, Scale, Guide, Coord
      • Layered Grammar of Graphics:
          • Defaults: Data, Mapping
          • Layer: Data, Mapping, Geom, Stat, Position
          • Scale
          • Coord
          • Facet
  • 11. Layered Grammar of Graphics
      Implementation embedded in R using ggplot2:
          w <- world
          d <- demographics
          d <- transform(d, bd = pmax(birth - death, 0))
          p <- ggplot(d, aes(lon, lat))
          p <- p + geom_polygon(data = w)
          p <- p + geom_point(aes(size = bd), colour = "red")
          p <- p + coord_map(projection = "mercator")
          p
  • 12. ggplot2
      • Author: Hadley Wickham
      • Open Source implementation of the layered grammar of graphics
      • High-level R package for creating publication-quality statistical graphics
      • Carefully chosen defaults following basic graphical design rules
      • Flexible set of components for creating any type of graphics
  • 13. ggplot2 installation
      • In R console:
          install.packages("ggplot2")
          library(ggplot2)
  • 14. qplot
      • Quickly plot something with qplot
      • for exploring ideas interactively
      • Same options as plot converted to ggplot2
          qplot(carat, price, data=diamonds, main = "Diamonds", asp = 1)
  • 16. Exploring with qplot
      First try:
          qplot(carat, price, data=diamonds)
      Log transform using functions on the variables:
          qplot(log(carat), log(price), data=diamonds)
  • 18. from qplot to ggplot
          qplot(carat, price, data=diamonds, main = "Diamonds", asp = 1)
      is equivalent to:
          p <- ggplot(diamonds, aes(carat, price))
          p <- p + geom_point()
          p <- p + opts(title = "Diamonds", aspect.ratio = 1)
          p
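The `opts()` call on this slide reflects the ggplot2 API of 2011; it was later deprecated and removed. A minimal sketch of the same plot, assuming a current ggplot2 release in which titles are set with `labs()` and the aspect ratio with `theme()`:

```r
library(ggplot2)

# Same plot as on the slide, written with labs() and theme() instead of opts().
p <- ggplot(diamonds, aes(carat, price)) +
  geom_point() +
  labs(title = "Diamonds") +   # plot title
  theme(aspect.ratio = 1)      # square plotting panel
p
```

The layer-by-layer structure is unchanged; only the function that carries the title and aspect ratio differs.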
  • 19. Data and mapping
      • If you need to flexibly restructure and aggregate data beforehand, use Reshape
      • Data is considered an independent concern
      • Need a mapping of what variables are mapped to what aesthetic
      • weight => x, height => y, age => size
      • Mappings are defined in scales
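The mapping named on this slide (weight => x, height => y, age => size) can be sketched as follows. The `people` data frame and its values are hypothetical and exist only to make the example self-contained:

```r
library(ggplot2)

# Hypothetical data; the column names follow the mapping on the slide.
people <- data.frame(
  weight = c(55, 62, 70, 81, 90),
  height = c(160, 168, 172, 180, 185),
  age    = c(21, 35, 42, 50, 63)
)

# weight => x, height => y, age => size
p <- ggplot(people, aes(x = weight, y = height, size = age)) +
  geom_point()
p
```

The data frame itself is untouched by the plot; `aes()` only declares which column feeds which aesthetic, and the scales decide how values become positions and point sizes.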
  • 20. Statistical Transformations
      • A stat transforms data
      • Can add new variables to a dataset
      • These can be used in aesthetic mappings
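As an illustration of a stat adding variables that can then be mapped, here is a minimal sketch using the binning stat behind `geom_histogram()`. It assumes ggplot2 3.3.0 or later, where `after_stat()` is the way to refer to a computed variable; `layer_data()` is used only to inspect the result:

```r
library(ggplot2)

# The binning stat computes new variables such as `count` and `density`;
# after_stat() maps one of them to the y aesthetic.
p <- ggplot(diamonds, aes(carat)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 0.2)

# Inspect the variables the stat added to the data.
computed <- layer_data(p)
head(computed[, c("x", "count", "density")])
```

Neither `count` nor `density` exists in `diamonds`; both are produced by the stat and only then become available to the mapping.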
  • 21. stat_smooth
      • Fits a smoother to the data
      • Displays a smooth and its standard error
          ggplot(diamonds, aes(carat, price)) + geom_point() + geom_smooth()
  • 23. Geometric Object
      • Control the type of plot
      • A geom can only display certain aesthetics
  • 24. geom_histogram
      • Distribution of carats shown in a histogram
          ggplot(diamonds, aes(carat)) + geom_histogram()
  • 26. Position adjustments
      • Tweak positioning of geometric objects
      • Avoid overlaps
  • 27. position_jitter
      • Avoid overplotting by jittering points
          x <- c(0, 0, 0, 0, 0)
          y <- c(0, 0, 0, 0, 0)
          overplotted <- data.frame(x, y)
          ggplot(overplotted, aes(x,y)) +
            geom_point(position=position_jitter(w=0.1, h=0.1))
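Jitter is random, so the plot above changes on every run. A small sketch of a repeatable variant, assuming a ggplot2 version in which `position_jitter()` accepts a `seed` argument (the seed value 42 is arbitrary):

```r
library(ggplot2)

overplotted <- data.frame(x = rep(0, 5), y = rep(0, 5))

# `seed` makes the random offsets repeatable between runs.
jitter <- position_jitter(width = 0.1, height = 0.1, seed = 42)
p <- ggplot(overplotted, aes(x, y)) +
  geom_point(position = jitter)

# Jittered coordinates as computed by ggplot2.
jittered <- layer_data(p)[, c("x", "y")]
jittered
```

Each point is moved by at most `width` horizontally and `height` vertically, so the five identical points spread out inside a small square around the origin.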
  • 29. Scales
      • Control mapping from data to aesthetic attributes
      • One scale per aesthetic
  • 30. scale_x_continuous, scale_y_continuous
          x <- c(0, 0, 0, 0, 0)
          y <- c(0, 0, 0, 0, 0)
          overplotted <- data.frame(x, y)
          ggplot(overplotted, aes(x,y)) +
            geom_point(position=position_jitter(w=0.1, h=0.1)) +
            scale_x_continuous(limits=c(-1,1)) +
            scale_y_continuous(limits=c(-1,1))
  • 32. Coordinate System
      • Maps the position of objects into the plane
      • Affects all position variables simultaneously
      • Changes appearance of geoms (unlike scales)
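To illustrate how a coordinate system changes the appearance of a geom while data, scales and layers stay the same, here is a minimal sketch that draws one stacked bar in Cartesian and in polar coordinates. The choice of `coord_polar()` is mine and is not taken from the slides:

```r
library(ggplot2)

# One stacked bar of diamond counts per cut, in Cartesian coordinates.
bars <- ggplot(diamonds, aes(x = factor(1), fill = cut)) +
  geom_bar(width = 1)

# Same data, scales and geom; only the coordinate system changes.
# In polar coordinates the stacked bar is drawn as a pie chart.
pie <- bars + coord_polar(theta = "y")

bars
pie
```

The bar geom is never replaced; the coordinate system bends its rectangles into wedges when the plot is drawn.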
  • 33. coord_map
          library("maps")
          map <- map("nz", plot=FALSE)[c("x","y")]
          m <- data.frame(map)
          n <- qplot(x, y, data=m, geom="path")
          n
          d <- data.frame(c(0), c(0))
          n + geom_point(data = d, colour = "red")
  • 35. Faceting
      • Lay out multiple plots on a page
      • Split data into subsets
      • Plot subsets into different panels
  • 36. Facet Types
      • 2D grid of panels
      • 1D ribbon of panels wrapped into 2D
  • 37. Faceting
          aesthetics <- aes(carat, ..density..)
          p <- ggplot(diamonds, aesthetics)
          p <- p + geom_histogram(binwidth = 0.2)
          p + facet_grid(clarity ~ cut)
  • 39. Faceting Formula
      . ~ .            no faceting
      . ~ a            single row, multiple columns
      b ~ .            single column, multiple rows
      a ~ b            multiple rows and columns
      . ~ a + b        multiple variables in rows and/or columns
      a + b ~ .
      a + b ~ c + d
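The formula forms in this table can be tried on the `diamonds` data used elsewhere in the deck. A minimal sketch; the helper `n_panels()` is hypothetical and only counts the panels that received data:

```r
library(ggplot2)

base <- ggplot(diamonds, aes(carat)) +
  geom_histogram(binwidth = 0.2)

one_row    <- base + facet_grid(. ~ cut)        # single row, one column per cut
one_column <- base + facet_grid(clarity ~ .)    # single column, one row per clarity
grid       <- base + facet_grid(clarity ~ cut)  # rows by clarity, columns by cut

# Hypothetical helper: number of panels that contain data.
n_panels <- function(p) length(unique(layer_data(p)$PANEL))
n_panels(one_row)
n_panels(one_column)
```

The left-hand side of the formula always names the row variable and the right-hand side the column variable; a dot means no split in that direction.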
  • 40. Scales in Facets
          facet_grid(. ~ cyl, scales="free_x")
      scales value     free
      fixed            -
      free             x, y
      free_x           x
      free_y           y
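A short sketch of the `scales` argument in use. The slide does not say which dataset its `cyl` variable comes from; the built-in `mtcars` data is assumed here because it has a `cyl` column:

```r
library(ggplot2)

scatter <- ggplot(mtcars, aes(mpg, wt)) + geom_point()

fixed  <- scatter + facet_grid(. ~ cyl)                     # shared x and y ranges
free_x <- scatter + facet_grid(. ~ cyl, scales = "free_x")  # x range set per column

fixed
free_x
```

With `free_x`, each column gets an x axis fitted to its own data, while the y axis stays shared across the row.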
  • 41. Layers
      • Iteratively update a plot
          • Change a single feature at a time
      • Think about the high-level aspects of the plot in isolation
      • Instead of choosing a static type of plot, create new types of plots on the fly
      • Cure against immobility
      • Developers can easily develop new layers without affecting other layers
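The iterative workflow described on this slide can be sketched by changing one feature per step. The dataset (`mtcars`), the linear smoother and the axis labels are illustrative choices, not taken from the slides:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg))                    # data and mapping only, no layers yet
p <- p + geom_point()                                # first layer: raw observations
p <- p + geom_smooth(method = "lm", se = FALSE)      # second layer: linear fit
p <- p + labs(x = "Weight", y = "Miles per gallon")  # refine the labels last
p

# Number of layers added so far.
length(p$layers)
```

Each step returns a complete plot object, so any intermediate `p` can be printed and inspected before the next feature is added.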
  • 42. Hierarchy of defaults
      Omitted layer    Default chosen by layer
      Stat             Geom
      Geom             Stat
      Mapping          Plot default
      Coord            Cartesian coordinates
      Scale            Chosen depending on aesthetic and type of variable
      Position         Linear scaling for continuous variables; integers for categorical variables
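The first two rows of this table can be checked with a small sketch: a histogram layer specified only by its geom, and the same layer specified only by its stat. The comparison through `layer_data()` is my own addition:

```r
library(ggplot2)

base <- ggplot(diamonds, aes(carat))

by_geom <- base + geom_histogram(binwidth = 0.2)  # stat omitted: the geom picks "bin"
by_stat <- base + stat_bin(binwidth = 0.2)        # geom omitted: the stat picks "bar"

# Both layers should compute the same bin counts.
all.equal(layer_data(by_geom)$count, layer_data(by_stat)$count)
```

In both cases the mapping is inherited from the plot default and the coordinate system falls back to Cartesian, matching the remaining rows of the table.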
  • 43. Thanks!
      • Visit the ggplot2 homepage:
          • http://had.co.nz/ggplot2/
      • Get the ggplot2 book:
          • http://amzn.com/0387981403
      • Get the Grammar of Graphics book from Leland Wilkinson:
          • http://amzn.com/0387245448