IRRI Galaxy: bioinformatics for rice scientists

                Ramil P. Mauleon
      Scientist – Bioinformatics Specialist
      TT Chang Genetic Resources Center
     International Rice Research Institute
Presented on behalf of my co-authors &
the development team @ IRRI
 Scientists/product/theme leaders
 • Michael Thomson
 • Kenneth L. McNally
 • Hei Leung
 Laboratory, software team
 • Venice Margaret Juanillas
 • Christine Jade Dilla-Ermita
Outline

 • Overview of IRRI & its research agenda
 • Bioinformatics activities at IRRI
 • IRRI Galaxy: current state, future
   developments
International Rice Research Institute:
  part of the Consultative Group on
 International Agricultural Research
                CGIAR
CGIAR - global partnership that unites
organizations engaged in research for a food-
secure future
 •   International Rice Research Institute (IRRI)
 •   Africa Rice Center
 •   International Center for Tropical Agriculture (CIAT)
 •   International Crops Research Institute for the Semi-Arid Tropics (ICRISAT)
 •   International Maize and Wheat Improvement Center (CIMMYT)
 •   International Potato Center (CIP)
 •   International Center for Agricultural Research in the Dry Areas (ICARDA)
 •   International Institute of Tropical Agriculture (IITA)
 •   International Livestock Research Institute (ILRI)
 •   International Water Management Institute (IWMI)
INTERNATIONAL RICE RESEARCH INSTITUTE
                    Los Baños, Philippines
         Mission:

Reduce poverty and
hunger,

Improve the health of
rice farmers and
consumers,

Ensure environmental
sustainability

Through research,                            Home of the Green Revolution
partnerships                                      Established 1960
                                                     www.irri.org
Aims to help rice farmers improve the yield and quality of their rice by developing..
    •New rice varieties
    •Rice crop management techniques
Global Rice Science Partnership : GRiSP
 • A single strategic and work plan for global rice research
 • Streamlines current research for development activities of
   the CGIAR, aligns it with numerous partners, and
 • Adds new activities of high priority, in areas where science
   is expected to make significant contributions.



              IRRI                                    +++
6 GRiSP Research Themes (2 are rice research per se)
 1. Harnessing genetic diversity to chart new productivity,
    quality, and health horizons
        1.1. Ex situ conservation and dissemination of rice germplasm
        1.2. Characterizing genetic diversity and creating novel gene
          pools (SNP genotypes, whole genome sequencing,
          phenotypes)
        1.3. Genes and allelic diversity conferring stress tolerance and
          enhanced nutrition (candidate genes)
        1.4. C4 rice (Converted from C3 photosynthesis)
 2. Accelerating the development, delivery, and adoption of
    improved rice varieties
       2.1. Breeding informatics, high-throughput marker applications,
            and multi-environment testing
IRGC – the International Rice Genebank Collection
World’s largest collection of rice germplasm (located at IRRI) held in
trust for the world community and source countries

                                 • Over 117,000 accessions from 117
                                   countries
                                 • Two cultivated species
                                          Oryza sativa
                                          Oryza glaberrima
                                 • 22 wild species
                                 • Relatively few accessions have
                                   donated alleles to current, high-
                                   yielding varieties
                                 • http://www.irri.org/GRC
Rice is morphologically very diverse
Structure of O. sativa
45 SSR loci on 2,252 lines (DARwin5, unweighted NJ, SM coef.).
The color represents group assignment for K = 9 with a minimum allele
frequency of 0.65 for model-based structure analysis.
[Figure: population structure tree, with groups labeled IRRI and CORNELL]
Rice exhibits deep population structure.
A high-quality reference genome is available
 • HQ BAC-by-BAC Nipponbare (< 1 error in 10K bases)
 • IRGSP 2005, Nature 436:793-800
Research themes, Bioinformatics &
Galaxy
 • Leveraging the reference genome, datasets are
   sequencing technology-based
    o Requires bioinformatics knowledge
    o Small bioinformatics team at IRRI =
 • We need to
    o enable field/bench researchers for bioinformatics
    o share bioinformatics solutions across GRiSP partners
    o share solutions with rice research community as a
      whole
 • Galaxy bioinformatics workbench
   (http://galaxyproject.org/) an easy choice
Galaxy features that fit our needs
 Open, web-based platform for accessible, reproducible, and
   transparent computational biomedical research.
 • Accessible: Users w/o programming experience can easily
   specify parameters and run tools and workflows.
 • Reproducible: Galaxy captures info so that any user can
   repeat and understand a complete computational
   analysis.
 • Transparent: Users share and publish analyses via the
   web and create interactive, web-based documents that
   describe a complete analysis.
GRiSP 1.2.1: Rice SNP Consortium for enabling genome-wide association studies
GRiSP 2.1.3: High-throughput SNP genotyping platform for breeding applications
  • Data from high-density genotyping using 44K, 700k Affymetrix SNP arrays and
    Illumina BeadStudio, Fluidigm medium-density platforms
  • Bioinformatics needs
     • Genotype data management system: SNP calling, storage, integration,
       retrieval, formatting for analysis
     • Analysis: GWAS pipelines, genetic analysis tools
       (for standard & specialized populations)
     • Genome browser: integrating published datasets & visualizing
Our 1st Galaxy: SNP calling workflow at IRRI
[Workflow diagram, boxes as labeled on the slide: BeadXpress scan results
(384 SNPs); GenomeStudio + Alchemy plug-in; allele calling with ALCHEMY]
Why ALCHEMY SNP calling
 • GenomeStudio’s genotype calling algorithm is designed
   for human applications
    o does not consider inbred samples or populations
       deficient in heterozygotes
 • ALCHEMY: open source, developed at Cornell University
   by Mark “Koni” Wright et al. (2010)
    o addresses the poor performance of the vendor’s
       software on inbred sample sets
    o can estimate and incorporate inbreeding
       information on a per-sample basis
    o written in C; compiles neatly under the GNU/Linux
       environment
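To see why inbreeding information matters for genotype calling, the sketch below computes the textbook expected genotype frequencies for a biallelic SNP under an inbreeding coefficient F. This is not ALCHEMY's code or its statistical model; the function name and the example values are assumptions made for illustration only.

```python
def genotype_priors(p, f):
    """Expected genotype frequencies for a biallelic SNP (alleles A and B).

    p: frequency of allele A, between 0 and 1
    f: inbreeding coefficient (0 = random mating, 1 = fully inbred)
    """
    if not (0.0 <= p <= 1.0 and 0.0 <= f <= 1.0):
        raise ValueError("p and f must both lie in [0, 1]")
    q = 1.0 - p
    return {
        "AA": p * p + f * p * q,        # homozygotes gain frequency with inbreeding
        "AB": 2.0 * p * q * (1.0 - f),  # heterozygotes are depleted by a factor (1 - F)
        "BB": q * q + f * p * q,
    }


# Hypothetical example values: an outbred sample versus a highly inbred line.
outbred = genotype_priors(0.5, 0.0)
inbred = genotype_priors(0.5, 0.95)
```

With these assumed values, the heterozygote prior falls from one half in the outbred case to a few percent in the inbred case, which is the situation a caller tuned for human samples does not expect.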
GRiSP 1.2.3: The Rice 3,000 Genomes Project: Sequencing for Crop Improvement
        Kenneth McNally, Ramil Mauleon, Chengzhi Liang, Ruaraidh Sackville Hamilton,
        Zhikang Li, Ren Wang, Hongliang Chen, Gengyun Zhang, Hongsheng Liang,
        Hei Leung, Achim Dobermann, Robert Zeigler




                                                                   CAAS

                  + Many Analysis Partners
                NIAS                    Cornell                     TGAC
                MIPS                     Cirad                       IRD
                CAS                      CAAS                        BGI
           Academia Sinica                MPI                        KZI
              EMBRAPA                     AGI                    Wageningen
                CSHL                   Gramene                   Plant Onto
                 …                  Uni Queensland                    …
Bioinformatics challenges of the project…

   • Efficient database system that allows the
     integration of the genebank information with
     phenotypic, breeding, genomic, and IPR data for
     enhanced utilization
   • Development of toolkits/workbenches to enable
     gene/genotype->phenotype predictions by
     research scientists and rice breeders
   • Make these databases, tools, & analyses results
     available (& updated) along with the rice gene
     bank
Focus of bioinformatics developments
in 3k project
 • Sequence/genotype data management,
   manipulation system
    o include primary data visualization (SNPs,
      genome)
 • Data analysis workbench
    o Analysis tools, w/ workflow management
    o Results visualization (haplotypes, population
      structures, GWAS results)
    o Highly efficient sequence/analysis results data
      storage model & phenotype database
Objective 1 : Sequence primary analysis

 • Milestone 1: Construction of new variety group
   reference genomes for the representative clades
    o   Quick draft genomes: SOAPdenovo-based assembly
        (Assembl, V.J. Ulat - IRRI)
         • Velvet fails with our dataset (legitimate out-of-memory error,
           likely due to repeats)
    o   New strategies (adapt/optimize/create algorithms) for
        high-quality assembly of new references, through
        collaborations with the partners mentioned earlier
Assembl
[Flow diagram; boxes as labeled on the slide]
 • Input: short reads data
 • QC trim/filter (fastx toolkit)
 • Automatically generate SOAP denovo config files
 • SOAP denovo assembly: contig, scaffold, gap closer
 • New k-mer size iteration
 • Align scaffolds to reference genome(s) (nucmer): bin to chromosomes;
   segregate per chromosome into unique and multi-hits
 • Output: draft genome with tiling path; multi-mapped and unmapped scaffolds
Objective 1 : Sequence primary analysis
(contd)

 Milestone 2: SNP genotypes construction & diversity
  analysis: Haplotype structure & local (genome-
  block) diversity analysis
    o   Main problem:
         • Number of samples (3,042 varieties) overwhelms
           existing software & computers (for SNP discovery, a big
           problem)
    o   One proposed solution: PANATI
         • Koni Wright PhD thesis, Cornell University: very fast
           SNP discovery and genotype calling using Smith-Waterman (SW) alignment
PANATI (http://panati.sourceforge.net)
 • No hard limits on the number of mismatches and in/dels
   imposed by the algorithm
 • Designed for and best suited for analysis of population
   samples with high diversity or for the use of a divergent
   proxy reference sequence for species which have no
   adequate reference of their own
 • Fast execution even when there is high divergence
   between the sample and the reference sequence
 • free for academic use
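For readers unfamiliar with SW alignment, here is a minimal textbook Smith-Waterman local alignment score with a linear gap penalty. It is not PANATI's implementation, which the slides do not show; the scoring parameters and example sequences are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero, so an alignment
            # can start anywhere and tolerate any number of mismatches or gaps
            # as long as the running score stays positive.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical read and reference fragments sharing the core "ACGT".
score_shared_core = smith_waterman_score("TTACGTTT", "GGACGTCC")
```

Because the score is local, the shared core is found regardless of the divergent flanks, which is the property that suits divergent samples or a proxy reference.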
PANATI technical features
 • Read lengths of any size
    o Input can be mixes of different read lengths and single-
      end or paired-end formats
 • Flexible trade-offs between speed and memory usage
 • Multithreaded parallel execution of mapping and alignment, with performance
   scaling linearly up to 64 CPUs (more has not been tested)
 • Ability to read compressed FASTQ files in bzip2 or gzip formats
   directly
     o will automatically use pbzip2 for parallel decompression of pbzip2
       compressed files if the program is available
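Reading compressed FASTQ directly, with reads of mixed lengths, can be illustrated with Python's standard library. This sketch only mirrors the feature described above; it is not PANATI code, it does not use pbzip2, and it picks the decompressor from the file extension, which is an assumption.

```python
import bz2
import gzip


def open_fastq(path):
    """Open a FASTQ file as text, decompressing gzip or bzip2 on the fly."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    if path.endswith(".bz2"):
        return bz2.open(path, "rt")
    return open(path, "r")


def read_fastq(path):
    """Yield (read_id, sequence, quality) tuples; read lengths may differ."""
    with open_fastq(path) as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break  # end of file
            seq = handle.readline().rstrip("\n")
            handle.readline()  # skip the '+' separator line
            qual = handle.readline().rstrip("\n")
            yield header[1:], seq, qual  # drop the leading '@' from the header
```

A caller would simply iterate, for example `for read_id, seq, qual in read_fastq("reads.fastq.gz")`, where the file name is hypothetical.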
Objective 1 : Sequence primary analysis
(contd)
 • Milestone 3: Annotation of constructed variety reference
   genomes, genotypes/haplotypes of the 10k genomes, &
   diversity analyses results
    o Intersection of results from various annotation
      pipelines
        • RAP pipeline (NIAS, T. Itoh et al.)
        • PASA (TIGR)
        • Gramene evidence-based method
        • Maker (GMOD)
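Intersecting results from several annotation pipelines can be sketched as a support count per gene model. The slides do not say how the project compares models, so the exact-coordinate matching, the function name, and the coordinates below are all assumptions; a real comparison would likely allow overlapping rather than identical coordinates.

```python
from collections import Counter


def consensus_models(calls_by_pipeline, min_support=None):
    """Return gene models reported by at least min_support pipelines.

    calls_by_pipeline maps a pipeline name to a list of
    (chromosome, start, end, strand) tuples. By default a model must be
    reported by every pipeline, i.e. a strict intersection.
    """
    if min_support is None:
        min_support = len(calls_by_pipeline)
    support = Counter()
    for models in calls_by_pipeline.values():
        support.update(set(models))  # count each pipeline at most once per model
    return sorted(model for model, n in support.items() if n >= min_support)


# Hypothetical gene models; coordinates are made up for illustration.
calls = {
    "RAP": [("chr01", 100, 500, "+"), ("chr01", 900, 1500, "-")],
    "PASA": [("chr01", 100, 500, "+")],
    "MAKER": [("chr01", 100, 500, "+"), ("chr01", 900, 1500, "-")],
}
strict = consensus_models(calls)
majority = consensus_models(calls, min_support=2)
```

Lowering `min_support` trades stringency for sensitivity: the strict intersection keeps only models every pipeline agrees on, while a majority rule also keeps models missed by one pipeline.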
Objective 2 : Build database & visualization tools
for the genomes / genotypes / haplotype/diversity
analysis results

 Milestone 1. Building the project genome browser; some
   issues:
    o Multiple reference genomes to display & call SNPs
       from
         • Per reference view, several at a time
         • Super (“pan”) genome view
    o So many varieties to display
         • Pick & show subsets? Global Display?
         • Regional/global genome comparisons between
           varieties
Option 1: UCSC Genome Browser
 • Good
    o Fast even for large datasets
    o Funded, with large community support base
    o Nice integration with Galaxy
       • Pick & choose varieties in Galaxy -> UCSC gbrowser
         visualization
 • Not so good
    o Painful installation
    o Steep learning curve (esp. for customizations)
    o Lack of comparative genome view
UCSC Browser hosted @ CU, mirror @ IRRI
Option 2: GMOD Gbrowse
 • Good
    o “Comfort zone” genome browser – installation,
      customization
    o Simple DB schema (basic install)
    o Funded, with large community support base
    o Comparative genome view supported
    o Integrates with Galaxy (similar to UCSC Gbrowser)
 • Not so good
    o Slow for large datasets
GMOD Gbrowse with draft genome assembly anchored to the rice reference genome
Objective 2 : Build database & visualization tools
for the genomes / genotypes / haplotype/diversity
analysis results (contd)

 Milestone 2: Build data analysis application tools coupled to
   the sequence database
 • Some existing tools (input from collaborating institutes)
    o EU- transPLANT project: computational infrastructures
       for plant genomics
    o Haplophyle @ CIRAD
 • Build Galaxy for tools developed/adopted by project
    o Sequence/genotype management
    o Novel data analysis methods, workflows
Objective 3 : Genotype - > Phenotype analysis/
breeders’ toolkit

 • Milestone 1 – Create an integrated phenotype database
 • Milestone 2 - Association (GWAS) & genetic analysis tools
     o   TASSEL , java web start in IRRI GALAXY
     o   R packages integrated into IRRI GALAXY
          • R-GENETICS
          • GAPIT – Buckler, et al., Cornell University
 • Milestone 3 – The breeders’ toolkit
     o   Major project.. Putting all these tools together in a target user-friendly
         package
     o  Breeder’s use cases captured as workflows
     Is GALAXY up to this task??
     Will breeders use it??
IRRI Galaxy: Current status

 • Deployed in the cloud (Amazon Web Services
   Large instance – Singapore region)
 • Streamlined to contain rice-specific tools and
   genotyping data
 • NO NGS assembly tools in public site
Standard Galaxy release
IRRI GALAXY (current)
Workflows for rice data analysis
already available
IRRI Galaxy Toolshed is under
development (1)
IRRI Galaxy Toolshed is under
development (2)
Share data, import into current analysis
(upon publication of studies..)
Solving the data mining issue for large
data/results sets
 • BIO HDF5 technology (Hierarchical Data Format) -
   http://www.hdfgroup.org/projects/biohdf/
 • Bottom line:
      o   very fast data mining of alignments (SAM/BAM), sequences when
          the data model/file organization & tools (C APIs & libraries) are
          used
      o   Pilot ongoing now for 2,000 samples genotype data

[Diagram: BAM/SAM files, SNP data, sequences, and annotation go through a
loader into HDF5 (C API over the file system); queries return sequence
analyses results]
Source: www.hdfgroup.org/pubs/presentations/BIOHDF-BOF-SC09-final.pdf
Projects in IRRI Galaxy bioinformatics
workbench
 • SNP data pre-processing & calling (Alchemy, PANATI - M. Wright)
 • Data format manipulation for downstream analysis tools
 • Population analysis tools
      o   Structure (Pritchard et al.)
      o   Ade4 R package (Chessel et al.) for Analysis of Molecular Variance
 • Downstream sequence analysis tools e.g. unique primer design
   (Triplett et al, Colorado State University, in prep)
 • Interfaces for SNPs data management & analysis
      o   GWAS: TASSEL (Bradbury et al.), GAPIT
      o   GBS analysis pipeline
 •   Pick & choose data to visualize: Varieties -> Genome browser
Summary
[Summary diagram; boxes as labeled on the slide]
 • Bioinformatics and database to integrate sequence-phenotype data
 • Rice SNP Consortium: 700k Affymetrix genotyping chip, 2000 lines
 • BGI de novo and re-sequencing: initial 5-10X coverage
 • 10,000 GeneBank accessions (1): cultivated + close wild relatives
 • Phenotyping network: 2000+ lines
 • Association genetics and QTL mapping: predict genotype-phenotype
   relationships at kb resolution
 • Specialized genetic stocks: MAGIC populations, biparental RILs, CSSL
 • Use in breeding programs
 • Genebank as a reverse genetics system
    o Select accessions based on QTL prediction for targeted phenotyping of
      specific traits
    o Discover novel phenotypes
 • IRRI & GRiSP, CAAS
(1) Including publicly accessible germplasm from IRRI, CIRAD, AfricaRice, CIAT and regional collections
THANKS FROM OUR CUSTOMERS

More Related Content

What's hot

Genome projects and their Contributions
Genome projects and their ContributionsGenome projects and their Contributions
Genome projects and their ContributionsAlbertPaul18
 
Rice genome india"s role
Rice genome india"s  role Rice genome india"s  role
Rice genome india"s role deepakrai26
 
Overview on arabidopsis and rice genome
Overview on arabidopsis and rice genomeOverview on arabidopsis and rice genome
Overview on arabidopsis and rice genomeGopal Singh
 
Yeast genome project
Yeast genome projectYeast genome project
Yeast genome projectNazish_Nehal
 
The wheat genome sequence: a foundation for accelerating improvment of bread ...
The wheat genome sequence: a foundation for accelerating improvment of bread ...The wheat genome sequence: a foundation for accelerating improvment of bread ...
The wheat genome sequence: a foundation for accelerating improvment of bread ...Borlaug Global Rust Initiative
 
Techniques of-biotechnology-mcclean-good
Techniques of-biotechnology-mcclean-goodTechniques of-biotechnology-mcclean-good
Techniques of-biotechnology-mcclean-goodrcolatru
 
The future of Rice Genomics: sequencing the collective Oryza Genome
The future of Rice Genomics: sequencing the collective Oryza GenomeThe future of Rice Genomics: sequencing the collective Oryza Genome
The future of Rice Genomics: sequencing the collective Oryza GenomeFOODCROPS
 
Applied genomic research in rice genetic improvement (2)
Applied genomic research in rice genetic improvement (2)Applied genomic research in rice genetic improvement (2)
Applied genomic research in rice genetic improvement (2)Lokesh Gour
 
Plant genome project (COBAM, UOP, Peshawar)
Plant genome project (COBAM, UOP, Peshawar)Plant genome project (COBAM, UOP, Peshawar)
Plant genome project (COBAM, UOP, Peshawar)Qaisar Khan
 
Genome sequencing and the development of our current information library
Genome sequencing and the development of our current information libraryGenome sequencing and the development of our current information library
Genome sequencing and the development of our current information libraryZarlishAttique1
 
L14 human genome
L14 human genomeL14 human genome
L14 human genomeMUBOSScz
 
Genome sequencing in vegetable crops
Genome sequencing in vegetable cropsGenome sequencing in vegetable crops
Genome sequencing in vegetable cropsBommesh
 
Genomics 101 jun 15 2012
Genomics 101 jun 15 2012Genomics 101 jun 15 2012
Genomics 101 jun 15 2012Genome Alberta
 
The Human Genome Project - Part I
The Human Genome Project - Part IThe Human Genome Project - Part I
The Human Genome Project - Part Ihhalhaddad
 
Human genome project
Human genome projectHuman genome project
Human genome projectShital Pal
 
When is a genome finished?
When is a genome finished? When is a genome finished?
When is a genome finished? Keith Bradnam
 

What's hot (20)

Genome projects and their Contributions
Genome projects and their ContributionsGenome projects and their Contributions
Genome projects and their Contributions
 
Rice genome india"s role
Rice genome india"s  role Rice genome india"s  role
Rice genome india"s role
 
Overview on arabidopsis and rice genome
Overview on arabidopsis and rice genomeOverview on arabidopsis and rice genome
Overview on arabidopsis and rice genome
 
Plant genome project(aribidopsis)
Plant genome project(aribidopsis)Plant genome project(aribidopsis)
Plant genome project(aribidopsis)
 
Yeast genome project
Yeast genome projectYeast genome project
Yeast genome project
 
The wheat genome sequence: a foundation for accelerating improvment of bread ...
The wheat genome sequence: a foundation for accelerating improvment of bread ...The wheat genome sequence: a foundation for accelerating improvment of bread ...
The wheat genome sequence: a foundation for accelerating improvment of bread ...
 
Techniques of-biotechnology-mcclean-good
Techniques of-biotechnology-mcclean-goodTechniques of-biotechnology-mcclean-good
Techniques of-biotechnology-mcclean-good
 
The future of Rice Genomics: sequencing the collective Oryza Genome
The future of Rice Genomics: sequencing the collective Oryza GenomeThe future of Rice Genomics: sequencing the collective Oryza Genome
The future of Rice Genomics: sequencing the collective Oryza Genome
 
Applied genomic research in rice genetic improvement (2)
Applied genomic research in rice genetic improvement (2)Applied genomic research in rice genetic improvement (2)
Applied genomic research in rice genetic improvement (2)
 
Genomics seminar copy
Genomics seminar   copyGenomics seminar   copy
Genomics seminar copy
 
Plant genome project (COBAM, UOP, Peshawar)
Plant genome project (COBAM, UOP, Peshawar)Plant genome project (COBAM, UOP, Peshawar)
Plant genome project (COBAM, UOP, Peshawar)
 
Genome sequencing and the development of our current information library
Genome sequencing and the development of our current information libraryGenome sequencing and the development of our current information library
Genome sequencing and the development of our current information library
 
Yeast Genome
Yeast Genome Yeast Genome
Yeast Genome
 
Genomics and Plant Genomics
Genomics and Plant GenomicsGenomics and Plant Genomics
Genomics and Plant Genomics
 
L14 human genome
L14 human genomeL14 human genome
L14 human genome
 
Genome sequencing in vegetable crops
Genome sequencing in vegetable cropsGenome sequencing in vegetable crops
Genome sequencing in vegetable crops
 
Genomics 101 jun 15 2012
Genomics 101 jun 15 2012Genomics 101 jun 15 2012
Genomics 101 jun 15 2012
 
The Human Genome Project - Part I
The Human Genome Project - Part IThe Human Genome Project - Part I
The Human Genome Project - Part I
 
Human genome project
Human genome projectHuman genome project
Human genome project
 
When is a genome finished?
When is a genome finished? When is a genome finished?
When is a genome finished?
 

Viewers also liked

Writing Galaxy Tools
Writing Galaxy ToolsWriting Galaxy Tools
Writing Galaxy Toolspjacock
 
HPC Forum: a space for technical collaboration amongst HPC administrators
HPC Forum: a space for technical collaboration amongst HPC administratorsHPC Forum: a space for technical collaboration amongst HPC administrators
HPC Forum: a space for technical collaboration amongst HPC administratorsPeter van Heusden
 
Scientific Workflow Systems for accessible, reproducible research
Scientific Workflow Systems for accessible, reproducible researchScientific Workflow Systems for accessible, reproducible research
Scientific Workflow Systems for accessible, reproducible researchPeter van Heusden
 
Building a cluster filesystem using distributed, directly-attached storage
Building a cluster filesystem using distributed, directly-attached storageBuilding a cluster filesystem using distributed, directly-attached storage
Building a cluster filesystem using distributed, directly-attached storagePeter van Heusden
 
Introduction to Galaxy (UEB-UAT Bioinformatics Course - Session 2.2 - VHIR, B...
Introduction to Galaxy (UEB-UAT Bioinformatics Course - Session 2.2 - VHIR, B...Introduction to Galaxy (UEB-UAT Bioinformatics Course - Session 2.2 - VHIR, B...
Introduction to Galaxy (UEB-UAT Bioinformatics Course - Session 2.2 - VHIR, B...VHIR Vall d’Hebron Institut de Recerca
 
2015. M. S. Swaminathan. Next Generation Genomics and the zero hunger challenge
2015. M. S. Swaminathan. Next Generation Genomics and the zero hunger challenge2015. M. S. Swaminathan. Next Generation Genomics and the zero hunger challenge
2015. M. S. Swaminathan. Next Generation Genomics and the zero hunger challengeFOODCROPS
 
Rice breeding at Irga
Rice breeding at IrgaRice breeding at Irga
Rice breeding at IrgaCIAT
 
L.p.yuan. progress in breeding of super hybrid rice
L.p.yuan. progress in breeding of super hybrid riceL.p.yuan. progress in breeding of super hybrid rice
L.p.yuan. progress in breeding of super hybrid riceFOODCROPS
 
Bioinformatics of TB: A case study in big data
Bioinformatics of TB: A case study in big dataBioinformatics of TB: A case study in big data
Bioinformatics of TB: A case study in big dataPeter van Heusden
 
2015. Pegadaraju Venkatramana. Array Tape Platform and its appliccation in ge...
2015. Pegadaraju Venkatramana. Array Tape Platform and its appliccation in ge...2015. Pegadaraju Venkatramana. Array Tape Platform and its appliccation in ge...
2015. Pegadaraju Venkatramana. Array Tape Platform and its appliccation in ge...FOODCROPS
 
2015. Petr Smykal. Study domestication and to broaden genetic diversity of w...
2015. Petr Smykal.  Study domestication and to broaden genetic diversity of w...2015. Petr Smykal.  Study domestication and to broaden genetic diversity of w...
2015. Petr Smykal. Study domestication and to broaden genetic diversity of w...FOODCROPS
 
Ramil Mauleon: Galaxy: bioinformatics for rice scientists
Ramil Mauleon: Galaxy: bioinformatics for rice scientistsRamil Mauleon: Galaxy: bioinformatics for rice scientists
Ramil Mauleon: Galaxy: bioinformatics for rice scientistsGigaScience, BGI Hong Kong
 
R.K. Singh .Breeding for salt tolerance in rice
 R.K. Singh .Breeding for salt tolerance in rice R.K. Singh .Breeding for salt tolerance in rice
R.K. Singh .Breeding for salt tolerance in riceFOODCROPS
 
" Developing rice varieties with enhanced adaptation to lowland farming syste...
" Developing rice varieties with enhanced adaptation to lowland farming syste..." Developing rice varieties with enhanced adaptation to lowland farming syste...
" Developing rice varieties with enhanced adaptation to lowland farming syste...ExternalEvents
 
2015. Robert L Thompson. Essential Roles of Agricultural Technology and Inter...
2015. Robert L Thompson. Essential Roles of Agricultural Technology and Inter...2015. Robert L Thompson. Essential Roles of Agricultural Technology and Inter...
2015. Robert L Thompson. Essential Roles of Agricultural Technology and Inter...FOODCROPS
 
2015. SarahHearne. From genebank to field- leveraging genomics to identify an...
2015. SarahHearne. From genebank to field- leveraging genomics to identify an...2015. SarahHearne. From genebank to field- leveraging genomics to identify an...
2015. SarahHearne. From genebank to field- leveraging genomics to identify an...FOODCROPS
 
GWAS of Resistance to Stem and Sheath Diseases of Uruguayan Advanced Rice Bre...
GWAS of Resistance to Stem and Sheath Diseases of Uruguayan Advanced Rice Bre...GWAS of Resistance to Stem and Sheath Diseases of Uruguayan Advanced Rice Bre...
GWAS of Resistance to Stem and Sheath Diseases of Uruguayan Advanced Rice Bre...CIAT
 
" Resource use efficiency in crops: “Green super rice” to increase water and ...
" Resource use efficiency in crops: “Green super rice” to increase water and ..." Resource use efficiency in crops: “Green super rice” to increase water and ...
" Resource use efficiency in crops: “Green super rice” to increase water and ...ExternalEvents
 
2012 GSR - breeding technology
2012 GSR - breeding technology2012 GSR - breeding technology
2012 GSR - breeding technologyFOODCROPS
 
Drought molecular breeding in rice, 19 november, 2012 swamy
Drought molecular breeding in rice, 19 november, 2012  swamyDrought molecular breeding in rice, 19 november, 2012  swamy
Drought molecular breeding in rice, 19 november, 2012 swamyarjunmanju
 

 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 

IRRI Galaxy: Bioinformatics for Rice Scientists

  • 1. IRRI Galaxy: bioinformatics for rice scientists. Ramil P. Mauleon, Scientist – Bioinformatics Specialist, TT Chang Genetic Resources Center, International Rice Research Institute
  • 2. Presented on behalf of my co-authors & the development team @ IRRI. Scientists/product/theme leaders: • Michael Thomson • Kenneth L. McNally • Hei Leung. Laboratory, software team: • Venice Margaret Juanillas • Christine Jade Dilla-Ermita
  • 3. Outline • Overview of IRRI & its research agenda • Bioinformatics activities at IRRI • IRRI Galaxy: current state, future developments
  • 4. International Rice Research Institute: part of the Consultative Group on International Agricultural Research CGIAR
  • 5. CGIAR - global partnership that unites organizations engaged in research for a food-secure future • International Rice Research Institute (IRRI) • Africa Rice Center • International Center for Tropical Agriculture (CIAT) • International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) • International Maize and Wheat Improvement Center (CIMMYT) • International Potato Center (CIP) • International Center for Agricultural Research in the Dry Areas (ICARDA) • International Institute of Tropical Agriculture (IITA) • International Livestock Research Institute (ILRI) • International Water Management Institute (IWMI)
  • 6. INTERNATIONAL RICE RESEARCH INSTITUTE, Los Baños, Philippines. Home of the Green Revolution. Established 1960. www.irri.org. Mission: reduce poverty and hunger, improve the health of rice farmers and consumers, and ensure environmental sustainability through research and partnerships. Aims to help rice farmers improve the yield and quality of their rice by developing: • New rice varieties • Rice crop management techniques
  • 7. Global Rice Science Partnership: GRiSP • A single strategic and work plan for global rice research • Streamlines current research-for-development activities of the CGIAR and aligns them with numerous partners • Adds new activities of high priority, in areas where science is expected to make significant contributions. IRRI +++
  • 8. 6 GRiSP Research Themes (2 are rice – research, per se) 1. Harnessing genetic diversity to chart new productivity, quality, and health horizons 1.1. Ex situ conservation and dissemination of rice germplasm 1.2. Characterizing genetic diversity and creating novel gene pools (SNP genotypes, whole genome sequencing, phenotypes) 1.3. Genes and allelic diversity conferring stress tolerance and enhanced nutrition (candidate genes) 1.4. C4 rice (Converted from C3 photosynthesis) 2. Accelerating the development, delivery, and adoption of improved rice varieties 2.1. Breeding informatics, high-throughput marker applications, and multi-environment testing
  • 9. IRGC – the International Rice Genebank Collection: World’s largest collection of rice germplasm (located at IRRI), held in trust for the world community and source countries • Over 117,000 accessions from 117 countries • Two cultivated species: Oryza sativa, Oryza glaberrima • 22 wild species • Relatively few accessions have donated alleles to current, high-yielding varieties • http://www.irri.org/GRC
  • 10. Rice is morphologically very diverse
  • 11. Structure of O. sativa: 45 SSR loci on 2,252 lines (DARwin5, unweighted NJ, SM coef.). The color represents group assignment for K = 9 with a minimum allele frequency of 0.65 for model-based structure analysis (IRRI, Cornell). Rice exhibits deep population structure.
  • 12. A high-quality reference genome is available: HQ BAC-by-BAC Nipponbare (< 1 error in 10K bases). IRGSP 2005, Nature 436:793-800
  • 13. Research themes, Bioinformatics & Galaxy • Datasets that leverage the reference genome are sequencing technology-based o Requires bioinformatics knowledge o Small bioinformatics team at IRRI • We need to o enable field/bench researchers to do bioinformatics o share bioinformatics solutions across GRiSP partners o share solutions with the rice research community as a whole • Galaxy bioinformatics workbench (http://galaxyproject.org/): an easy choice
  • 14. Galaxy features that fit our needs Open, web-based platform for accessible, reproducible, and transparent computational biomedical research. • Accessible: Users w/o programming experience can easily specify parameters and run tools and workflows. • Reproducible: Galaxy captures info so that any user can repeat and understand a complete computational analysis. • Transparent: Users share and publish analyses via the web and create interactive, web-based documents that describe a complete analysis.
  • 15. GRiSP 1.2.1: Rice SNP Consortium for enabling genome-wide association studies. GRiSP 2.1.3: High-throughput SNP genotyping platform for breeding applications • Data from high-density genotyping using 44K, 700k Affymetrix SNP arrays and Illumina Beadstudio, Fluidigm medium density platforms • Bioinformatics needs o Genotype data management system: SNP calling, storage, integration, retrieval, formatting for analysis o Analysis: GWAS pipelines, genetic analysis tools (for standard & specialized populations) o Genome browser: integrating published datasets & visualizing
  • 16. Our 1st Galaxy: SNP calling workflow at IRRI. Diagram labels: GenomeStudio + BeadXpress Scan; Alchemy plug-in; Allele calling with ALCHEMY; Results (384 SNPs)
  • 17. Why ALCHEMY SNP calling • GenomeStudio’s genotype calling algorithm is designed for human applications o does not consider inbred samples or populations deficient in heterozygotes • ALCHEMY: open source, developed at Cornell University by Mark “Koni” Wright et al. (2010) o addresses the poor performance of the vendor’s software on inbred sample sets o can estimate and incorporate inbreeding information on a per-sample basis o written in C; compiles neatly under the GNU/Linux environment
  • 18. GRiSP 1.2.3: The Rice 3,000 Genomes Project: Sequencing for Crop Improvement. Kenneth McNally, Ramil Mauleon, Chengzhi Liang, Ruaraidh Sackville Hamilton, Zhikang Li, Ren Wang, Hongliang Chen, Gengyun Zhang, Hongsheng Liang, Hei Leung, Achim Dobermann, Robert Zeigler. CAAS + many analysis partners: NIAS, Cornell, TGAC, MIPS, Cirad, IRD, CAS, CAAS, BGI, Academia Sinica, MPI, KZI, EMBRAPA, AGI, Wageningen, CSHL, Gramene, Plant Onto …, Uni Queensland, …
  • 19. Bioinformatics challenges of the project… • Efficient database system that allows the integration of the genebank information with phenotypic, breeding, genomic, and IPR data for enhanced utilization • Development of toolkits/workbenches to enable gene/genotype->phenotype predictions by research scientists and rice breeders • Make these databases, tools, & analyses results available (& updated) along with the rice gene bank
  • 20. Focus of bioinformatics developments in 3k project • Sequence/genotype data management, manipulation system o include primary data visualization (SNPs, genome) • Data analysis workbench o Analysis tools, w/ workflow management o Results visualization (haplotypes, population structures, GWAS results) o Highly efficient sequence/analysis results data storage model & phenotype database
  • 21. Objective 1: Sequence primary analysis • Milestone 1: Construction of new variety group reference genomes for the representative clades o Quick draft genomes: SOAPdenovo-based assembly (Assembl, V.J. Ulat - IRRI) • Velvet fails with our dataset (legitimate out-of-memory error, likely due to repeats) o New strategies (adapt/optimize/create algorithms) for high-quality assembly of new references, through collaborations with the partners mentioned earlier
  • 22. Assembl pipeline (flow diagram; labels as recovered from the slide): Short reads → QC trim/filter (fastx toolkit) → Assembl iteration with a new k-mer size: automatically generate SOAPdenovo config files → SOAPdenovo assembly (contig, scaffold, gap closer) → draft genome scaffolds → align scaffolds to reference genome(s) (nucmer): bin to chromosomes; segregate per chromosome into unique and multi-hits → draft genome with tiling path, plus multi-mapped and unmapped data
  • 23. Objective 1: Sequence primary analysis (contd) Milestone 2: SNP genotypes construction & diversity analysis: haplotype structure & local (genome-block) diversity analysis o Main problem: • Number of samples (3,042 varieties) overwhelms existing software & computers (for SNP discovery, a big problem) o One proposed solution: PANATI • Koni Wright PhD thesis, Cornell University: very fast SNP discovery and genotype calling using SW alignment
  • 24. PANATI (http://panati.sourceforge.net) • No hard limits on the number of mismatches and in/dels imposed by the algorithm • Designed for and best suited for analysis of population samples with high diversity, or for the use of a divergent proxy reference sequence for species which have no adequate reference of their own • Fast execution even when there is high divergence between the sample and the reference sequence • Free for academic use
  • 25. PANATI technical features • Read lengths of any size o Input can be mixes of different read lengths and single-end or paired-end formats • Flexible trade-offs between speed and memory usage • Multithreaded parallel execution of mapping and alignment, with performance scaling linearly up to 64 CPUs (higher has not been tested) • Ability to read compressed FASTQ files in bzip2 or gzip formats directly o will automatically use pbzip2 for parallel decompression of pbzip2-compressed files if the program is available
  • 26. Objective 1: Sequence primary analysis (contd) • Milestone 3: Annotation of constructed variety reference genomes, genotypes/haplotypes of the 10k genomes, & diversity analysis results o Intersection of results from various annotation pipelines • RAP pipeline (NIAS, T. Itoh et al.) • PASA (TIGR) • Gramene evidence-based method • Maker (GMOD)
  • 27. Objective 2 : Build database & visualization tools for the genomes / genotypes / haplotype/diversity analysis results Milestone 1. Building the project genome browser; some issues: o Multiple reference genomes to display & call SNPs from • Per reference view, several at a time • Super (“pan”) genome view o So many varieties to display • Pick & show subsets? Global Display? • Regional/global genome comparisons between varieties
  • 28. Option 1: UCSC Genome Browser • Good o Fast even for large datasets o Funded, with large community support base o Nice integration with Galaxy • Pick & choose varieties in Galaxy → UCSC genome browser visualization • Not so good o Painful installation o Steep learning curve (esp. for customizations) o Lack of comparative genome view
  • 29. UCSC Browser hosted @ CU, mirror @ IRRI
  • 30. Option 2: GMOD GBrowse • Good o “Comfort zone” genome browser – installation, customization o Simple DB schema (basic install) o Funded, with large community support base o Comparative genome view supported o Integrates with Galaxy (similar to the UCSC browser) • Not so good o Slow for large datasets
  • 31. GMOD GBrowse with draft genome assembly anchored to the rice reference genome
  • 32. Objective 2: Build database & visualization tools for the genomes / genotypes / haplotype/diversity analysis results (contd) Milestone 2: Build data analysis application tools coupled to the sequence database • Some existing tools (input from collaborating institutes) o EU transPLANT project: computational infrastructures for plant genomics o Haplophyle @ CIRAD • Build Galaxy for tools developed/adopted by the project o Sequence/genotype management o Novel data analysis methods, workflows
  • 33. Objective 3: Genotype → Phenotype analysis / breeders’ toolkit • Milestone 1 – Create an integrated phenotype database • Milestone 2 – Association (GWAS) & genetic analysis tools o TASSEL, Java Web Start in IRRI Galaxy o R packages integrated into IRRI Galaxy • R-GENETICS • GAPIT – Buckler, et al., Cornell University • Milestone 3 – The breeders’ toolkit o Major project: putting all these tools together in a package that is friendly to its target users o Breeders’ use cases captured as workflows. Is Galaxy up to this task? Will breeders use it?
  • 34. IRRI Galaxy: Current status • Deployed in the cloud (Amazon Web Services Large instance – Singapore region) • Streamlined to contain rice-specific tools and genotyping data • NO NGS assembly tools in public site
  • 37. Workflows for rice data analysis already available
  • 38. IRRI Galaxy Toolshed is under development (1)
  • 39. IRRI Galaxy Toolshed is under development (2)
  • 40. Share data, import into current analysis (upon publication of studies..)
  • 41. Solving the data mining issue for large data/results sets • BioHDF technology (HDF5, Hierarchical Data Format) - http://www.hdfgroup.org/projects/biohdf/ • Bottom line: o very fast data mining of alignments (SAM/BAM) and sequences when the data model/file organization & tools (C APIs & libraries) are used o Pilot now ongoing for genotype data of 2,000 samples. Diagram labels: BAM/SAM, sequence files, SNP data; loader; HDF5; C API; queries; analyses results; file system (sequences, annotation)
  • 43. Projects in IRRI Galaxy bioinformatics workbench • SNP data pre-processing & calling (Alchemy, PANATI - M. Wright) • Data format manipulation for downstream analysis tools • Population analysis tools o Structure (Pritchard et al.) o Ade4 R package (Chessel et al.) for Analysis of Molecular Variance • Downstream sequence analysis tools, e.g. unique primer design (Triplett et al., Colorado State University, in prep) • Interfaces for SNP data management & analysis o GWAS: TASSEL (Bradbury et al.), GAPIT o GBS analysis pipeline • Pick & choose data to visualize: Varieties → Genome browser
  • 44. Summary (diagram; boxes as recovered from the slide) • Bioinformatics and database to integrate sequence-phenotype data • Use in breeding programs • BGI de novo and re-sequencing: initial 5-10X coverage, 10,000 GeneBank accessions (1), cultivated + close wild relatives • Rice SNP Consortium: 700k Affymetrix genotyping chip, 2000 lines • Genebank as a reverse genetics system • Phenotyping network: 2000+ lines • Association genetics and QTL mapping: predict genotype-phenotype relationships at kb resolution • Select accessions based on QTL prediction for targeted phenotyping of specific traits • Discover novel phenotypes • Specialized genetic stocks: MAGIC populations, biparental RILs, CSSL (IRRI & GRiSP, CAAS). (1) Including publicly accessible germplasm from IRRI, CIRAD, AfricaRice, CIAT and regional collections
  • 45. THANKS FROM OUR CUSTOMERS 
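
Note on slide 17: the remark that a caller tuned for human samples "does not consider inbred samples or populations deficient in heterozygotes" can be made concrete with the textbook genotype-frequency formulas under an inbreeding coefficient F. The sketch below is only an illustration of that general idea, not ALCHEMY's actual model; the function name `genotype_priors` and the example allele frequency are hypothetical.

```python
def genotype_priors(p, f):
    """Expected genotype frequencies for a biallelic SNP with alleles A and B.

    p: frequency of allele A, in [0, 1]
    f: inbreeding coefficient F (0 = random mating, 1 = fully inbred)

    Textbook formulas (an illustration, NOT the ALCHEMY implementation):
        P(AA) = p^2 + F*p*q
        P(AB) = 2*p*q*(1 - F)
        P(BB) = q^2 + F*p*q
    """
    if not 0.0 <= p <= 1.0 or not 0.0 <= f <= 1.0:
        raise ValueError("p and f must lie in [0, 1]")
    q = 1.0 - p
    return {
        "AA": p * p + f * p * q,
        "AB": 2.0 * p * q * (1.0 - f),
        "BB": q * q + f * p * q,
    }


if __name__ == "__main__":
    # Hypothetical allele frequency; compare an outbred population (F = 0)
    # with increasingly inbred ones, such as rice lines.
    for f in (0.0, 0.5, 0.95):
        priors = genotype_priors(0.3, f)
        print(f, {g: round(v, 4) for g, v in priors.items()})
```

As F approaches 1 the expected share of heterozygous calls shrinks toward zero, so a caller that assumes F = 0 will expect far more heterozygotes than an inbred sample set actually contains.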