noSQL @ QCon SP

Wednesday, September 8, 2010
hype




history...




models
                           • Hierarchical (IMS): late 1960’s and 1970’s
                           • Directed graph (CODASYL): 1970’s
                           • Relational: 1970’s and early 1980’s
                           • Entity-Relationship: 1970’s
                           • Extended Relational: 1980’s
                           • Semantic: late 1970’s and 1980’s
                           • Object-oriented: late 1980’s and early 1990’s
                           • Object-relational: late 1980’s and early 1990’s
                           • Semi-structured (XML): late 1990’s to late 2000’s
                           • The next big thing: ???




                                      ref: What Goes Around Comes Around, by Michael Stonebraker and Joey Hellerstein
next big thing?




definition...




down with the
relational database!

down with the
relational database!

as a silver
bullet!



historic
   moment...

solve
specific
problems

which
     problems?




Architectural Anti Patterns
                        Notes on Data Distribution and Handling Failures




Required Listening: Frank Zappa - One size fits all




Anti Patterns

    •    Evolution from SQL Anti Patterns (NoSQL:br May 2010)
    •    More than just RDBMS
    •    Large volumes of data
    •    Distribution
    •    Architecture
    •    Research on other tools
    •    Message Queues, DHT, Job Schedulers, NoSQL
    •    Indexing, Map/Reduce




RDBMS Anti Patterns
   Not everything fits in a relational database, single or distributed

    •    The eternal table-as-a-tree
    •    Dynamic table creation
    •    Table as cache
    •    Table as queue
    •    Table as log file
    •    Stoned Procedures
    •    Row Alignment
    •    Extreme JOINs
    •    Your schema must be printed on an A3 sheet
    •    Your ORM issues full queries for dataset iterations




Doing it wrong, Junior!
The eternal tree
   Problem: Most threaded discussion examples use something
   like a single table which contains all threads and answers, related to
   each other by an id. Usually the developer will come up with their
   own binary-tree version to manage this mess.

   id - parent_id - author - text
   1 - 0 - gleicon - hello world
   2 - 1 - elvis - shout !

   Alternative: Document storage:
   { "thread_id": 1, "title": "the meeting", "author": "gleicon", "replies": [
        {
          "author": "elvis", "text": "shout", "replies": [{...}]
        }
      ]
   }
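
   A minimal sketch of the document alternative, using plain Python dicts as a stand-in for a document store. The helper names `count_replies` and `add_reply` are hypothetical, and no particular database API is implied; the point is only that the whole thread travels as one nested document, so no parent_id bookkeeping is needed.

```python
# Hypothetical thread document, mirroring the example on the slide.
thread = {
    "thread_id": 1,
    "title": "the meeting",
    "author": "gleicon",
    "replies": [
        {"author": "elvis", "text": "shout", "replies": []},
    ],
}


def count_replies(node):
    """Count every reply nested under a thread or reply document."""
    replies = node.get("replies", [])
    return len(replies) + sum(count_replies(reply) for reply in replies)


def add_reply(node, author, text):
    """Append a reply in place and return the new reply document."""
    reply = {"author": author, "text": text, "replies": []}
    node.setdefault("replies", []).append(reply)
    return reply


# Answer elvis directly: the reply lands inside his reply document.
add_reply(thread["replies"][0], "gleicon", "not so loud")
print(count_replies(thread))
```

   Reading or writing a thread touches a single document, which is the trade being made here: simpler access in exchange for larger documents as discussions grow.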

Dynamic table creation
   Problem: To avoid huge tables, one must come up with a "dynamic
   schema". For example, let's think about a document
   management company which is adding new facilities across the
   country. For each storage facility, a new table is created:

   item_id - row - column - stuff
   1 - 10 - 20 - cat food
   2 - 12 - 32 - trout

   Now you have to come up with "dynamic queries", which will
   probably query a "central storage" table and issue a huge join to
   check whether you have enough cat food across the country.

   Alternatives:
   - Document storage, modeling a facility as a document
   - Key/Value, modeling each facility as a SET
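
   As a rough illustration of the first alternative, the sketch below keeps one document per facility in an in-memory dict instead of one table per facility. The facility keys and the `stock_by_item` helper are made up for the example; a real document or key/value store would replace the dict.

```python
from collections import Counter

# Hypothetical data: one document (a list of stored items) per facility.
facilities = {
    "facility:sao_paulo": [
        {"item_id": 1, "row": 10, "column": 20, "stuff": "cat food"},
        {"item_id": 2, "row": 12, "column": 32, "stuff": "trout"},
    ],
    "facility:recife": [
        {"item_id": 1, "row": 3, "column": 7, "stuff": "cat food"},
    ],
}


def stock_by_item(facilities):
    """Count stored items per kind across every facility document."""
    totals = Counter()
    for items in facilities.values():
        for item in items:
            totals[item["stuff"]] += 1
    return totals


totals = stock_by_item(facilities)
print(totals["cat food"])
```

   Adding a facility means adding a key, not creating a table, and the "enough cat food across the country" question becomes a loop over documents rather than a dynamically generated join.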

Table as cache
   Problem: Complex queries demand that a result be stored in a
   separate table, so it can be queried quickly. Worse than views.


   Alternatives:

   - Really?

   - Memcached

   - Redis + AOF + EXPIRE

   - De-normalization
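
   The idea behind the Memcached and Redis + EXPIRE alternatives can be sketched with an in-memory cache that expires keys after a time to live. This is only a stand-in under stated assumptions: `TTLCache` and `cached_query` are hypothetical names, and no Memcached or Redis client is used.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache with per-key expiry (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl_seconds):
        self._store[key] = (self._clock() + ttl_seconds, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # lazily drop expired entries
            return None
        return value


def cached_query(cache, sql, run_query, ttl_seconds=60):
    """Return the cached result for `sql`, running the query only on a miss."""
    result = cache.get(sql)
    if result is None:
        result = run_query(sql)
        cache.set(sql, result, ttl_seconds)
    return result
```

   The query text is the key and the result set is the value, so the expensive query runs once per expiry window instead of being materialized in a second table. Note that this sketch treats a `None` result as a miss.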




Table as queue
   Problem: A table which holds messages to be processed.
   Worse, they must be ordered by
   time of creation.

   Corollary: Job Scheduler table

   Alternatives:
   - RestMQ, Resque

   - Any other message broker

   - Redis (LISTS - LPUSH + RPOP)

   - Use the right tool
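
   The LPUSH + RPOP pairing gives first-in, first-out ordering: new messages go in at the head of the list and consumers take from the tail. A small sketch with `collections.deque` shows the same behaviour in memory; `ListQueue` is a hypothetical class that only mimics the two list operations and does not talk to Redis.

```python
from collections import deque


class ListQueue:
    """In-memory analogue of a list used as a queue via LPUSH + RPOP."""

    def __init__(self):
        self._items = deque()

    def lpush(self, message):
        """Insert at the head of the list."""
        self._items.appendleft(message)

    def rpop(self):
        """Remove and return the tail element, or None when empty."""
        return self._items.pop() if self._items else None


queue = ListQueue()
for job in ("job-1", "job-2", "job-3"):
    queue.lpush(job)

print(queue.rpop())  # the oldest message comes out first
```

   Creation order falls out of the data structure itself, so there is no ORDER BY on a timestamp column and no polling query against a table.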



Table as log file
   Problem: A table to which data gets written as if it were a log file. From
   time to time it needs to be purged. Truncating this table once a
   day is usually the first task assigned to new DBAs.

   Alternatives:

   - MongoDB capped collection

   - Redis with an RRD (round-robin database) pattern

   - Riak
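
   What a capped collection or an RRD-style structure buys you is a fixed-size log that discards its oldest entries by itself. The sketch below imitates that with `deque(maxlen=...)`; `CappedLog` is a hypothetical name, and the cap here is an entry count chosen for illustration rather than any store's actual sizing rule.

```python
from collections import deque


class CappedLog:
    """Fixed-size log: once full, each append evicts the oldest entry."""

    def __init__(self, max_entries):
        self._entries = deque(maxlen=max_entries)

    def append(self, entry):
        self._entries.append(entry)

    def tail(self, n):
        """Return up to the last n entries, oldest first."""
        return list(self._entries)[-n:] if n > 0 else []

    def __len__(self):
        return len(self._entries)


log = CappedLog(max_entries=3)
for i in range(1, 6):
    log.append(f"event {i}")

print(log.tail(3))
```

   Nothing ever needs to be truncated: the structure stays at its maximum size and the daily purge job disappears.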




Stoned procedures
   Problem: Stored procedures hold most of your application's
   logic. Also, some triggers are used to - well - trigger important
   data events.

   Stored procedures and triggers have the magic property of vanishing
   from our memories and being impossible to keep versioned.

   Alternatives:
   - Be careful not to use map/reduce as modern
   stoned procedures. It is unfit for real-time search/processing.

   - Use your preferred language for business logic, and leave event
   handling to pub/sub or message queues.
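
   A rough sketch of the second alternative: event handling expressed as publish/subscribe in application code instead of triggers. `EventBus`, the "order.created" topic and the handlers are all hypothetical; a real system would put a message broker where this in-process dispatcher sits.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-process publish/subscribe hub (hypothetical helper)."""

    def __init__(self):
        self._handlers = defaultdict(list)  # topic -> list of callables

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, payload):
        """Call every handler subscribed to `topic`; return how many ran."""
        handlers = list(self._handlers.get(topic, ()))
        for handler in handlers:
            handler(payload)
        return len(handlers)


bus = EventBus()
audit_log = []

# What a trigger would do silently now lives in versioned application code.
bus.subscribe("order.created", lambda order: audit_log.append(("audit", order["id"])))
bus.subscribe("order.created", lambda order: audit_log.append(("email", order["id"])))

bus.publish("order.created", {"id": 42})
print(audit_log)
```

   The handlers sit in the same repository as the rest of the business logic, so they are reviewed, tested and versioned like any other code.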




Row Alignment
   Problem: Extra columns are created but not used, just in case.
   They are usually named a1, a2, a3, a4 and called padding.

   There's good will behind that, especially when version 1 of the
   software needed an extra column in a 150M-row database and
   it took 2 days to run an ALTER TABLE. But that's no excuse.

   Alternatives:

   - Quit being cheap. Quit feeling 'hacker' about padding.

   - Document-based databases such as MongoDB and CouchDB have
   no schema. New attributes are local to the document and can be
   added easily.



Extreme JOINs
   Problem: Business entities modeled as tables. Table inheritance
   (Product -> SubProduct_A). To find the complete data for a user
   plan, one must issue gigantic queries with lots of JOINs.

   Alternatives:

   - Document storage, such as MongoDB,
     might help keep important
     information together.

   - De-normalization

   - Serialized objects




Your schema fits on an A3 sheet
   Problem: Huge data schemas are difficult to manage. Extreme
   specialization creates tables which converge to a key/value
   model. Normal forms get priority over common sense.

   Product_A                          Product_B
   id - desc                          id - desc

   Alternatives:

   - De-normalization
   - Another schema?
   - Document store for flattening the model
   - Key/Value
   - See 'Extreme JOINs'



Your ORM ...
   Problem: Your ORM issues full queries for dataset iterations,
   your ORM maps and creates tables which mimic your classes,
   even the inheritance, and the performance is bad because the
   queries are huge, etc.

   Alternatives:

   - Apart from denormalization and good old common sense, remember that
   ORMs are trying to bridge two things with distinct impedance.

   - There is nothing in relational models which maps cleanly to
   classes and objects. Not even the basic unit, which is the
   domain (set) of each column. Black magic?
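
   One common reading of "full queries for dataset iterations" is the pattern where iterating a result set fires one extra query per row. The sketch below reproduces it by hand with the standard `sqlite3` module and compares it with a single JOIN; the `author` and `post` tables and both function names are invented for the example, and no specific ORM is being described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
INSERT INTO author VALUES (1, 'gleicon'), (2, 'elvis');
INSERT INTO post VALUES (1, 1, 'hello world'), (2, 2, 'shout'), (3, 1, 'again');
""")


def titles_per_row_queries(conn):
    """One query for the posts, then one more query per post for its author."""
    queries = 1
    posts = conn.execute("SELECT author_id, title FROM post ORDER BY id").fetchall()
    result = []
    for author_id, title in posts:
        row = conn.execute("SELECT name FROM author WHERE id = ?", (author_id,)).fetchone()
        queries += 1
        result.append((row[0], title))
    return result, queries


def titles_single_join(conn):
    """Same data fetched with a single JOIN."""
    rows = conn.execute(
        "SELECT a.name, p.title FROM post p "
        "JOIN author a ON a.id = p.author_id ORDER BY p.id"
    ).fetchall()
    return rows, 1


print(titles_per_row_queries(conn)[1], titles_single_join(conn)[1])
```

   Both functions return the same rows; the difference is the number of round trips, which grows with the size of the dataset in the first version and stays constant in the second.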




No silver bullet
   - Think about data
     handling and your
     system architecture

   - Think outside the norm

   - De-normalize

   - Simplify

   - Know stuff (Message
     queues, NoSQL, DHT)




Cycle of changes - Product A
    1. There was the database model.
    2. Then, the cache was needed. Performance was no good.
    3. Cache key: query, value: resultset
    4. High or nonexistent expiration time [w00t]

   (Now there's a turning point. Data didn't need to change often.
   Denormalization was a given with the cache.)

    5. The cache needs to be warmed or the app won't work.
    6. Key/Value storage was a natural choice. No data on MySQL
       anymore.




Cycle of changes - Product B
    1. Postgres DB storing crawler results.
    2. There was a counter in each row, and updating this counter
       caused contention errors.
    3. Memcached for reads. Performance is better.
    4. First MongoDB test, no more deadlocks from counter updates.
    5. Data model was simplified, the entire crawled doc was
       stored.




Stuff to think about
   Check whether the data you use isn't already de-normalized somewhere
   (cached).

   Most of the anti-patterns signal that there are architectural
   issues instead of only database issues.

   The NoSQL route (or at least a partial NoSQL route) may
   simplify things.

   Are you dependent on cache? Does your application fail when
   there is no cache? Does it just slow down?

   Think about how you put your data into, and get it back from, the
   database (be it SQL or NoSQL).


architecture

data storage
    has NOT
[for a long time]
   been considered
      part of
   architecture




every choice
                           means giving
                           something up

patterns




how-to




acid




                                      (
ACID NoSQL
                                 does exist



                                      )
CAP




                                      ref: The CAP Theorem, by Seth Gilbert & Nancy Lynch
Consistency
    Availability
    Partition Tolerance


BASE




                                      ref: BASE: an Acid Alternative, by Dan Pritchett
Basically
 Available
 Soft State
 Eventually Consistent


Eventual
    Consistency




                                      ref: Eventually Consistent, by Werner Vogels
"eventual" in English:
                                      will occur at some
                                            point




  "eventual" in Portuguese:
    may or may not occur
"Consistência em Momento Indeterminado"
           (consistency at an
          undetermined moment)

                                      @mdediana
consistency


                                      W+R > N
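
In the usual reading of this inequality, N is the number of replicas holding a value, W the number of replicas that must acknowledge a write and R the number consulted on a read. When W + R > N, every read set shares at least one replica with every write set. The sketch below checks the inequality and confirms the overlap by brute force; both function names are hypothetical and no particular database is modeled.

```python
from itertools import combinations


def quorums_overlap(n, w, r):
    """Return True when W + R > N for N replicas, W write acks, R read replies."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("expected 1 <= W <= N and 1 <= R <= N")
    return w + r > n


def always_intersect(n, w, r):
    """Brute force: does every W-sized write set share a replica with every R-sized read set?"""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


# N=3, W=2, R=2: any read touches at least one replica that saw the write.
print(quorums_overlap(3, 2, 2), always_intersect(3, 2, 2))
# N=3, W=1, R=1: a read can land on a replica the write never reached.
print(quorums_overlap(3, 1, 1), always_intersect(3, 1, 1))
```

Raising W or R buys stronger read guarantees at the cost of latency and availability, which is the trade-off the CAP and BASE slides point at.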


durability




                                      ref: The End of an Architectural Era, by Michael Stonebraker et al.
and there's more...

                   ★ latency
                   ★ performance
                   ★ partitioning
                   ★ distribution
                   ★ replication

remember
      you are not building an
      intergalactic-scale solution
      tolerant to random failures
      across datacenters
      spread over several
      geographic locations and
      other dimensions

got the
       importance
      of architecture?




with so many definitions...
            with so many concepts...
             with so many trade-offs...
                with so many....



how did NoSQL
   become so
 sexy and popular?




despite everything....




                                      it's easy to use!
polyglot
     persistence




Questions?


Thank you




              github.com/porcelli                 github.com/gleicon

              linkedin.com/in/alexandreporcelli   linkedin.com/in/gleicon

              @porcelli                           @gleicon

              porcelli.com.br                     zenmachine.wordpress.com

quarta-feira, 8 de setembro de 2010

More Related Content

Similar to noSQL @ QCon SP

A importância dos dados em sua arquitetura... uma visão muito além do SQL Ser...
A importância dos dados em sua arquitetura... uma visão muito além do SQL Ser...A importância dos dados em sua arquitetura... uma visão muito além do SQL Ser...
A importância dos dados em sua arquitetura... uma visão muito além do SQL Ser...Alexandre Porcelli
 
Los Angeles R users group - Nov 17 2010 - Part 2
Los Angeles R users group - Nov 17 2010 - Part 2Los Angeles R users group - Nov 17 2010 - Part 2
Los Angeles R users group - Nov 17 2010 - Part 2rusersla
 
Bender kuszmaul tutorial-xldb12
Bender kuszmaul tutorial-xldb12Bender kuszmaul tutorial-xldb12
Bender kuszmaul tutorial-xldb12Atner Yegorov
 
Data Structures and Algorithms for Big Databases
Data Structures and Algorithms for Big DatabasesData Structures and Algorithms for Big Databases
Data Structures and Algorithms for Big Databasesomnidba
 
Architectural anti patterns_for_data_handling
Architectural anti patterns_for_data_handlingArchitectural anti patterns_for_data_handling
Architectural anti patterns_for_data_handlingGleicon Moraes
 
Data massage: How databases have been scaled from one to one million nodes
Data massage: How databases have been scaled from one to one million nodesData massage: How databases have been scaled from one to one million nodes
Data massage: How databases have been scaled from one to one million nodesUlf Wendel
 
Architectural anti-patterns for data handling
Architectural anti-patterns for data handlingArchitectural anti-patterns for data handling
Architectural anti-patterns for data handlingGleicon Moraes
 
Lec1cgu13updated.ppt
Lec1cgu13updated.pptLec1cgu13updated.ppt
Lec1cgu13updated.pptRahulTr22
 
Data science programming .ppt
Data science programming .pptData science programming .ppt
Data science programming .pptGanesh E
 
Lec1cgu13updated.ppt
Lec1cgu13updated.pptLec1cgu13updated.ppt
Lec1cgu13updated.pptkalai75
 
Lec1cgu13updated.ppt
Lec1cgu13updated.pptLec1cgu13updated.ppt
Lec1cgu13updated.pptAravind Reddy
 
ADLUG 2012: Linking Linked Data
ADLUG 2012: Linking Linked DataADLUG 2012: Linking Linked Data
ADLUG 2012: Linking Linked DataAndrea Gazzarini
 
Security Of Nosql Database Against Intruders Essay
Security Of Nosql Database Against Intruders EssaySecurity Of Nosql Database Against Intruders Essay
Security Of Nosql Database Against Intruders EssayMelissa Williams
 
Debunking "Purpose-Built Data Systems:": Enter the Universal Database
Debunking "Purpose-Built Data Systems:": Enter the Universal DatabaseDebunking "Purpose-Built Data Systems:": Enter the Universal Database
Debunking "Purpose-Built Data Systems:": Enter the Universal DatabaseStavros Papadopoulos
 
The Key Concepts Within Modern File Systems
The Key Concepts Within Modern File SystemsThe Key Concepts Within Modern File Systems
The Key Concepts Within Modern File SystemsAngie Willis
 

Similar to noSQL @ QCon SP (20)

ActiveRecord 2.3
ActiveRecord 2.3ActiveRecord 2.3
ActiveRecord 2.3
 
A importância dos dados em sua arquitetura... uma visão muito além do SQL Ser...
A importância dos dados em sua arquitetura... uma visão muito além do SQL Ser...A importância dos dados em sua arquitetura... uma visão muito além do SQL Ser...
A importância dos dados em sua arquitetura... uma visão muito além do SQL Ser...
 
On no sql.partiii
On no sql.partiiiOn no sql.partiii
On no sql.partiii
 
Los Angeles R users group - Nov 17 2010 - Part 2
Los Angeles R users group - Nov 17 2010 - Part 2Los Angeles R users group - Nov 17 2010 - Part 2
Los Angeles R users group - Nov 17 2010 - Part 2
 
Bender kuszmaul tutorial-xldb12
Bender kuszmaul tutorial-xldb12Bender kuszmaul tutorial-xldb12
Bender kuszmaul tutorial-xldb12
 
Data Structures and Algorithms for Big Databases
Data Structures and Algorithms for Big DatabasesData Structures and Algorithms for Big Databases
Data Structures and Algorithms for Big Databases
 
Architectural anti patterns_for_data_handling
Architectural anti patterns_for_data_handlingArchitectural anti patterns_for_data_handling
Architectural anti patterns_for_data_handling
 
Data massage: How databases have been scaled from one to one million nodes
Data massage: How databases have been scaled from one to one million nodesData massage: How databases have been scaled from one to one million nodes
Data massage: How databases have been scaled from one to one million nodes
 
Architectural anti-patterns for data handling
Architectural anti-patterns for data handlingArchitectural anti-patterns for data handling
Architectural anti-patterns for data handling
 
Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-
 
Os Krug
Os KrugOs Krug
Os Krug
 
Data Science
Data Science Data Science
Data Science
 
Lec1cgu13updated.ppt
Lec1cgu13updated.pptLec1cgu13updated.ppt
Lec1cgu13updated.ppt
 
Data science programming .ppt
Data science programming .pptData science programming .ppt
Data science programming .ppt
 
Lec1cgu13updated.ppt
Lec1cgu13updated.pptLec1cgu13updated.ppt
Lec1cgu13updated.ppt
 
Lec1cgu13updated.ppt
Lec1cgu13updated.pptLec1cgu13updated.ppt
Lec1cgu13updated.ppt
 
ADLUG 2012: Linking Linked Data
ADLUG 2012: Linking Linked DataADLUG 2012: Linking Linked Data
ADLUG 2012: Linking Linked Data
 
Security Of Nosql Database Against Intruders Essay
Security Of Nosql Database Against Intruders EssaySecurity Of Nosql Database Against Intruders Essay
Security Of Nosql Database Against Intruders Essay
 
Debunking "Purpose-Built Data Systems:": Enter the Universal Database
Debunking "Purpose-Built Data Systems:": Enter the Universal DatabaseDebunking "Purpose-Built Data Systems:": Enter the Universal Database
Debunking "Purpose-Built Data Systems:": Enter the Universal Database
 
The Key Concepts Within Modern File Systems
The Key Concepts Within Modern File SystemsThe Key Concepts Within Modern File Systems
The Key Concepts Within Modern File Systems
 

More from Alexandre Porcelli

Running rules and processes in the cloud
Running rules and processes in the cloudRunning rules and processes in the cloud
Running rules and processes in the cloudAlexandre Porcelli
 
Impulsione sua carreira contribuindo para projetos open source
Impulsione sua carreira contribuindo para projetos open sourceImpulsione sua carreira contribuindo para projetos open source
Impulsione sua carreira contribuindo para projetos open sourceAlexandre Porcelli
 
QConSP 2013 - Não confunda engenharia de software com lean startup
QConSP 2013 - Não confunda engenharia de software com lean startupQConSP 2013 - Não confunda engenharia de software com lean startup
QConSP 2013 - Não confunda engenharia de software com lean startupAlexandre Porcelli
 
JUDCon São Paulo - Drools in a Nutshell
JUDCon São Paulo - Drools in a NutshellJUDCon São Paulo - Drools in a Nutshell
JUDCon São Paulo - Drools in a NutshellAlexandre Porcelli
 
NoSQL for the rest of us - a JBoss perspective over those hot tools and how y...
NoSQL for the rest of us - a JBoss perspective over those hot tools and how y...NoSQL for the rest of us - a JBoss perspective over those hot tools and how y...
NoSQL for the rest of us - a JBoss perspective over those hot tools and how y...Alexandre Porcelli
 
Armazenamento de Dados em Poucas Palavras ou Uma resposta definitiva para tod...
Armazenamento de Dados em Poucas Palavras ou Uma resposta definitiva para tod...Armazenamento de Dados em Poucas Palavras ou Uma resposta definitiva para tod...
Armazenamento de Dados em Poucas Palavras ou Uma resposta definitiva para tod...Alexandre Porcelli
 
DevinVale: SQL, noSQL ou newSQL - Onde armazenar meus dados?
DevinVale:  SQL, noSQL ou newSQL - Onde armazenar meus dados?DevinVale:  SQL, noSQL ou newSQL - Onde armazenar meus dados?
DevinVale: SQL, noSQL ou newSQL - Onde armazenar meus dados?Alexandre Porcelli
 
noSQL e ORM, será que dá samba?
noSQL e ORM, será que dá samba?noSQL e ORM, será que dá samba?
noSQL e ORM, será que dá samba?Alexandre Porcelli
 
noSQL - Uma nova escola de pensamento
noSQL - Uma nova escola de pensamentonoSQL - Uma nova escola de pensamento
noSQL - Uma nova escola de pensamentoAlexandre Porcelli
 
SQL, NoSQL ou NewSQL: Onde armazenar meus dados?
SQL, NoSQL ou NewSQL: Onde armazenar meus dados?SQL, NoSQL ou NewSQL: Onde armazenar meus dados?
SQL, NoSQL ou NewSQL: Onde armazenar meus dados?Alexandre Porcelli
 
J1Brasil: Persistência de Dados além do JPA, ou Como usar noSQL em Java
J1Brasil: Persistência de Dados além do JPA, ou Como usar noSQL em JavaJ1Brasil: Persistência de Dados além do JPA, ou Como usar noSQL em Java
J1Brasil: Persistência de Dados além do JPA, ou Como usar noSQL em JavaAlexandre Porcelli
 
ANTLR Conference - OpenSpotLight driven by ANTLR
ANTLR Conference - OpenSpotLight driven by ANTLRANTLR Conference - OpenSpotLight driven by ANTLR
ANTLR Conference - OpenSpotLight driven by ANTLRAlexandre Porcelli
 

More from Alexandre Porcelli (20)

Dawn of the citizen developer
Dawn of the citizen developerDawn of the citizen developer
Dawn of the citizen developer
 
Running rules and processes in the cloud
Running rules and processes in the cloudRunning rules and processes in the cloud
Running rules and processes in the cloud
 
Impulsione sua carreira contribuindo para projetos open source
Impulsione sua carreira contribuindo para projetos open sourceImpulsione sua carreira contribuindo para projetos open source
Impulsione sua carreira contribuindo para projetos open source
 
QConSP 2013 - Não confunda engenharia de software com lean startup
QConSP 2013 - Não confunda engenharia de software com lean startupQConSP 2013 - Não confunda engenharia de software com lean startup
QConSP 2013 - Não confunda engenharia de software com lean startup
 
JUDCon São Paulo - Drools in a Nutshell
JUDCon São Paulo - Drools in a NutshellJUDCon São Paulo - Drools in a Nutshell
JUDCon São Paulo - Drools in a Nutshell
 
NoSQL for the rest of us - a JBoss perspective over those hot tools and how y...
NoSQL for the rest of us - a JBoss perspective over those hot tools and how y...NoSQL for the rest of us - a JBoss perspective over those hot tools and how y...
NoSQL for the rest of us - a JBoss perspective over those hot tools and how y...
 
Armazenamento de Dados em Poucas Palavras ou Uma resposta definitiva para tod...
Armazenamento de Dados em Poucas Palavras ou Uma resposta definitiva para tod...Armazenamento de Dados em Poucas Palavras ou Uma resposta definitiva para tod...
Armazenamento de Dados em Poucas Palavras ou Uma resposta definitiva para tod...
 
DevinVale: SQL, noSQL ou newSQL - Onde armazenar meus dados?
DevinVale:  SQL, noSQL ou newSQL - Onde armazenar meus dados?DevinVale:  SQL, noSQL ou newSQL - Onde armazenar meus dados?
DevinVale: SQL, noSQL ou newSQL - Onde armazenar meus dados?
 
noSQL e ORM, será que dá samba?
noSQL e ORM, será que dá samba?noSQL e ORM, será que dá samba?
noSQL e ORM, será que dá samba?
 
noSQL - Uma nova escola de pensamento
noSQL - Uma nova escola de pensamentonoSQL - Uma nova escola de pensamento
noSQL - Uma nova escola de pensamento
 
noSQL @ MSTechDay São Paulo
noSQL @ MSTechDay São PaulonoSQL @ MSTechDay São Paulo
noSQL @ MSTechDay São Paulo
 
Integration & DSL
Integration & DSLIntegration & DSL
Integration & DSL
 
SQL, NoSQL ou NewSQL: Onde armazenar meus dados?
SQL, NoSQL ou NewSQL: Onde armazenar meus dados?SQL, NoSQL ou NewSQL: Onde armazenar meus dados?
SQL, NoSQL ou NewSQL: Onde armazenar meus dados?
 
J1Brasil: Persistência de Dados além do JPA, ou Como usar noSQL em Java
J1Brasil: Persistência de Dados além do JPA, ou Como usar noSQL em JavaJ1Brasil: Persistência de Dados além do JPA, ou Como usar noSQL em Java
J1Brasil: Persistência de Dados além do JPA, ou Como usar noSQL em Java
 
noSQL WTF?! - Citi2010
noSQL WTF?! - Citi2010noSQL WTF?! - Citi2010
noSQL WTF?! - Citi2010
 
noSQL além do buzz
noSQL além do buzznoSQL além do buzz
noSQL além do buzz
 
GraphDatabases @ TDC2010
GraphDatabases @ TDC2010GraphDatabases @ TDC2010
GraphDatabases @ TDC2010
 
Motor de Regras @ TDC2010
Motor de Regras @ TDC2010Motor de Regras @ TDC2010
Motor de Regras @ TDC2010
 
OpenSpotLight - Concepts
OpenSpotLight - ConceptsOpenSpotLight - Concepts
OpenSpotLight - Concepts
 
ANTLR Conference - OpenSpotLight driven by ANTLR
ANTLR Conference - OpenSpotLight driven by ANTLRANTLR Conference - OpenSpotLight driven by ANTLR
ANTLR Conference - OpenSpotLight driven by ANTLR
 

Recently uploaded

How to become a GDSC Lead GDSC MI AOE.pptx
How to become a GDSC Lead GDSC MI AOE.pptxHow to become a GDSC Lead GDSC MI AOE.pptx
How to become a GDSC Lead GDSC MI AOE.pptxKaustubhBhavsar6
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightSafe Software
 
My key hands-on projects in Quantum, and QAI
My key hands-on projects in Quantum, and QAIMy key hands-on projects in Quantum, and QAI
My key hands-on projects in Quantum, and QAIVijayananda Mohire
 
The New Cloud World Order Is FinOps (Slideshow)
The New Cloud World Order Is FinOps (Slideshow)The New Cloud World Order Is FinOps (Slideshow)
The New Cloud World Order Is FinOps (Slideshow)codyslingerland1
 
IT Service Management (ITSM) Best Practices for Advanced Computing
IT Service Management (ITSM) Best Practices for Advanced ComputingIT Service Management (ITSM) Best Practices for Advanced Computing
IT Service Management (ITSM) Best Practices for Advanced ComputingMAGNIntelligence
 
UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3UiPath Studio Web workshop Series - Day 3
UiPath Studio Web workshop Series - Day 3DianaGray10
 
Design and Modeling for MySQL SCALE 21X Pasadena, CA Mar 2024
Design and Modeling for MySQL SCALE 21X Pasadena, CA Mar 2024Design and Modeling for MySQL SCALE 21X Pasadena, CA Mar 2024
Design and Modeling for MySQL SCALE 21X Pasadena, CA Mar 2024Alkin Tezuysal
 
Automation Ops Series: Session 2 - Governance for UiPath projects
Automation Ops Series: Session 2 - Governance for UiPath projectsAutomation Ops Series: Session 2 - Governance for UiPath projects
Automation Ops Series: Session 2 - Governance for UiPath projectsDianaGray10
 
Oracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxOracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxSatishbabu Gunukula
 
Graphene Quantum Dots-Based Composites for Biomedical Applications
Graphene Quantum Dots-Based Composites for  Biomedical ApplicationsGraphene Quantum Dots-Based Composites for  Biomedical Applications
Graphene Quantum Dots-Based Composites for Biomedical Applicationsnooralam814309
 
The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)IES VE
 
Where developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingWhere developers are challenged, what developers want and where DevEx is going
Where developers are challenged, what developers want and where DevEx is goingFrancesco Corti
 
Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.IPLOOK Networks
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNeo4j
 
March Patch Tuesday
March Patch TuesdayMarch Patch Tuesday
March Patch TuesdayIvanti
 
Explore the UiPath Community and ways you can benefit on your journey to auto...
Explore the UiPath Community and ways you can benefit on your journey to auto...Explore the UiPath Community and ways you can benefit on your journey to auto...
Explore the UiPath Community and ways you can benefit on your journey to auto...DianaGray10
 
UiPath Studio Web workshop series - Day 1
UiPath Studio Web workshop series  - Day 1UiPath Studio Web workshop series  - Day 1
UiPath Studio Web workshop series - Day 1DianaGray10
 
Flow Control | Block Size | ST Min | First Frame
Flow Control | Block Size | ST Min | First FrameFlow Control | Block Size | ST Min | First Frame
Flow Control | Block Size | ST Min | First FrameKapil Thakar
 
Keep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES LiveKeep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES LiveIES VE
 

Recently uploaded (20)

How to become a GDSC Lead GDSC MI AOE.pptx
How to become a GDSC Lead GDSC MI AOE.pptxHow to become a GDSC Lead GDSC MI AOE.pptx
How to become a GDSC Lead GDSC MI AOE.pptx
 

noSQL @ QCon SP

  • 1. noSQL (Wednesday, September 8, 2010)
  • 2. hype
  • 4. models
      • Hierarchical (IMS): late 1960’s and 1970’s
      • Directed graph (CODASYL): 1970’s
      • Relational: 1970’s and early 1980’s
      • Entity-Relationship: 1970’s
      • Extended Relational: 1980’s
      • Semantic: late 1970’s and 1980’s
      • Object-oriented: late 1980’s and early 1990’s
      • Object-relational: late 1980’s and early 1990’s
      • Semi-structured (XML): late 1990’s to late 2000’s
      • The next big thing: ???
      ref: What Goes Around Comes Around, by Michael Stonebraker and Joey Hellerstein
  • 5. next big thing?
  • 7. down with the relational database!
  • 8. down with the relational database! ...as a silver bullet!
  • 9. historic moment...
  • 11. solve specific problems
  • 12. which problems?
  • 13. Architectural Anti Patterns: Notes on Data Distribution and Handling Failures
  • 14. Required Listening: Frank Zappa - One size fits all
  • 15. Anti Patterns
      • Evolution from SQL Anti Patterns (NoSQL:br May 2010)
      • More than just RDBMS
      • Large volumes of data
      • Distribution
      • Architecture
      • Research on other tools
      • Message Queues, DHT, Job Schedulers, NoSQL
      • Indexing, Map/Reduce
  • 16. RDBMS Anti Patterns: not everything fits in a relational database, single or distributed
      • The eternal table-as-a-tree
      • Dynamic table creation
      • Table as cache
      • Table as queue
      • Table as log file
      • Stoned Procedures
      • Row Alignment
      • Extreme JOINs
      • Your schema must be printed on an A3 sheet
      • Your ORM issues full queries for dataset iterations
  • 17. Doing it wrong, Junior!
  • 18. The eternal tree
      Problem: Most threaded-discussion examples use a single table that holds all threads and replies, related to each other by an id. Usually the developer comes up with their own binary-tree variant to manage this mess.
          id - parent_id - author - text
          1 - 0 - gleicon - hello world
          2 - 1 - elvis - shout !
      Alternative: Document storage:
          { thread_id: 1, title: 'the meeting', author: 'gleicon', replies: [ { author: 'elvis', text: 'shout', replies: [{...}] } ] }
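A minimal sketch of why the document form is easier to work with, assuming the thread is already loaded as a nested Python dict. The reply contents and the `walk_replies` helper are hypothetical and only illustrate the shape shown on the slide; no particular document database is implied.

```python
# Hypothetical thread document, shaped like the one on the slide.
thread = {
    "thread_id": 1,
    "title": "the meeting",
    "author": "gleicon",
    "replies": [
        {"author": "elvis", "text": "shout", "replies": [
            {"author": "gleicon", "text": "hello world", "replies": []},
        ]},
    ],
}


def walk_replies(node, depth=0):
    """Yield (depth, author, text) for every reply nested under node."""
    for reply in node.get("replies", []):
        yield depth, reply["author"], reply["text"]
        # Recurse into the nested replies; no parent_id lookups are needed.
        yield from walk_replies(reply, depth + 1)


flat = list(walk_replies(thread))
```

The whole discussion comes back in one read, and the nesting itself carries the parent/child relation that the table had to encode through `parent_id`.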
  • 19. Dynamic table creation
      Problem: To avoid huge tables, one comes up with a "dynamic schema". For example, think of a document management company that is adding new facilities across the country. For each storage facility, a new table is created:
          item_id - row - column - stuff
          1 - 10 - 20 - cat food
          2 - 12 - 32 - trout
      Now you have to come up with "dynamic queries", which will probably query a "central storage" table and issue a huge join to check whether you have enough cat food across the country.
      Alternatives:
      - Document storage, modeling a facility as a document
      - Key/Value, modeling each facility as a SET
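The key/value alternative can be sketched with plain Python sets standing in for the store. The `facility:<name>` key layout, the facility names, and both helper functions are assumptions made for illustration; they are not taken from the talk.

```python
from collections import defaultdict

# Hypothetical key layout: one set per facility, keyed "facility:<name>".
store = defaultdict(set)


def add_item(facility, row, column, stuff):
    """Record an item at a (row, column) position inside one facility."""
    store[f"facility:{facility}"].add((row, column, stuff))


def facilities_holding(stuff):
    """Return the keys of all facilities that hold at least one such item."""
    return sorted(
        key for key, items in store.items()
        if any(item[2] == stuff for item in items)
    )


add_item("sp", 10, 20, "cat food")
add_item("sp", 12, 32, "trout")
add_item("rio", 3, 7, "cat food")
```

Adding a facility is just a new key, so nothing resembling a per-facility table or a cross-table join is required.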
  • 20. Table as cache
      Problem: Complex queries demand that a result be stored in a separate table so it can be queried quickly. Worse than views.
      Alternatives:
      - Really?
      - Memcached
      - Redis + AOF + EXPIRE
      - De-normalization
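What a cache with expiry buys over a hand-purged table can be sketched in memory. This is a stand-in under stated assumptions, not a Redis or Memcached client: the `ExpiringCache` class and its injectable `clock` parameter are hypothetical and exist only to show the expire-on-read behaviour.

```python
import time


class ExpiringCache:
    """In-memory stand-in for a key/value cache with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the behaviour can be checked
        self._data = {}       # key -> (value, deadline)

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, deadline = entry
        if self._clock() >= deadline:
            # Expired entries disappear on their own; nobody purges a table.
            del self._data[key]
            return None
        return value
```

Stale results age out by themselves, which is the part a "cache table" usually reimplements with cron jobs and DELETE statements.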
  • 21. Table as queue
      Problem: A table that holds messages waiting to be processed. Worse, they must be ordered by creation time. Corollary: the Job Scheduler table.
      Alternatives:
      - RestMQ, Resque
      - Any other message broker
      - Redis (LISTS - LPUSH + RPOP)
      - Use the right tool
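The LPUSH + RPOP pairing gives first-in, first-out order without any ORDER BY on a timestamp. A small sketch using `collections.deque` as a stand-in for a Redis list; the `lpush` and `rpop` function names merely mirror the commands named on the slide and are not a client API.

```python
from collections import deque

queue = deque()  # stand-in for a single list in the key/value store


def lpush(item):
    """Push onto the left end, like LPUSH."""
    queue.appendleft(item)


def rpop():
    """Pop from the right end, like RPOP; None when the list is empty."""
    return queue.pop() if queue else None
```

Producers push on one end and workers pop from the other, so the oldest message is always the next one delivered.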
  • 22. Table as log file
      Problem: A table to which data is written as if it were a log file. From time to time it needs to be purged. Truncating this table once a day is usually the first task assigned to new DBAs.
      Alternatives:
      - MongoDB capped collection
      - Redis, and RRD pattern
      - Riak
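The capped-collection and RRD ideas share one property: a fixed capacity where new entries push out the oldest ones. A minimal in-memory sketch of that behaviour, assuming a bounded `deque`; `MAX_ENTRIES` and `append_log` are hypothetical names chosen for illustration.

```python
from collections import deque

# Hypothetical capacity, kept tiny so the eviction is easy to see.
MAX_ENTRIES = 3
log = deque(maxlen=MAX_ENTRIES)


def append_log(line):
    """Append a log line; when the log is full the oldest entry is dropped."""
    log.append(line)


for n in range(5):
    append_log(f"event {n}")
```

Because the structure bounds itself, the daily truncate job simply has nothing to do.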
  • 23. Stoned procedures
      Problem: Stored procedures hold most of your application's logic. Also, some triggers are used to, well, trigger important data events. SPs and triggers have the magic property of vanishing from our memories and being impossible to keep versioned.
      Alternatives:
      - Be careful not to use map/reduce as modern stoned procedures. It is unfit for real-time search/processing.
      - Use your preferred language for business logic, and leave event handling to pub/sub or message queues.
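As a rough illustration of moving trigger-style reactions into application code, here is an in-process publish/subscribe sketch. The topic name, the handlers, and the `subscribe`/`publish` helpers are all hypothetical; a real deployment would put a message broker between publisher and subscribers.

```python
from collections import defaultdict

subscribers = defaultdict(list)  # topic -> list of handler callables


def subscribe(topic, handler):
    """Register a handler to be called for every event on topic."""
    subscribers[topic].append(handler)


def publish(topic, payload):
    """Call every handler registered for topic; return how many were notified."""
    handlers = subscribers[topic]
    for handler in handlers:
        handler(payload)
    return len(handlers)


audit_trail = []
subscribe("order.created", lambda order: audit_trail.append(("audit", order["id"])))
subscribe("order.created", lambda order: audit_trail.append(("email", order["id"])))
delivered = publish("order.created", {"id": 42})
```

The reactions live in ordinary source files, so they can be versioned, reviewed, and tested like the rest of the application.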
  • 24. Row Alignment
      Problem: Extra columns are created but not used, just in case. Usually they are named a1, a2, a3, a4 and called padding. There is goodwill behind that, especially when version 1 of the software needed an extra column in a 150M-row database and the ALTER TABLE took 2 days to run. But that's no excuse.
      Alternatives:
      - Quit being cheap. Quit feeling 'hacker' about padding.
      - Document-based databases such as MongoDB and CouchDB have no schema. New attributes are local to the document and can be added easily.
  • 25. Extreme JOINs
      Problem: Business concepts modeled as tables. Table inheritance (Product -> SubProduct_A). To find the complete data for a user plan, one must issue gigantic queries with lots of JOINs.
      Alternatives:
      - Document storage, such as MongoDB, might help keep important information together
      - De-normalization
      - Serialized objects
  • 26. Your schema fits on an A3 sheet
      Problem: Huge data schemas are difficult to manage. Extreme specialization creates tables that converge to a key/value model. The normal form gets priority over common sense.
          Product_A        Product_B
          id - desc        id - desc
      Alternatives:
      - De-normalization
      - Another schema?
      - Document store for flattening the model
      - Key/Value
      - See 'Extreme JOINs'
  • 27. Your ORM ...
      Problem: Your ORM issues full queries for dataset iterations, your ORM maps and creates tables that mimic your classes, even the inheritance, and performance is bad because the queries are huge, etc.
      Alternatives:
      - Apart from de-normalization and good old common sense, remember that ORMs are trying to bridge two things with distinct impedance.
      - Nothing in the relational model maps cleanly to classes and objects, not even the basic unit, which is the domain (set) of each column. Black magic?
  • 28. No silver bullet
      - Think about data handling and your system architecture
      - Think outside the norm
      - De-normalize
      - Simplify
      - Know stuff (Message queues, NoSQL, DHT)
  • 29. Cycle of changes - Product A
      1. There was the database model.
      2. Then the cache was needed. Performance was no good.
      3. Cache key: query, value: resultset.
      4. High or nonexistent expiration time [w00t] (Now there's a turning point. Data didn't need to change often. Denormalization was a given with cache.)
      5. The cache needs to be warmed or the app won't work.
      6. Key/Value storage was a natural choice. No data on MySQL anymore.
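Step 3 above (cache key: query, value: resultset) can be sketched as a read-through helper. Everything here is an assumption for illustration: the dict used as the cache, the `fake_db` function, and the SQL text do not come from the product described in the talk.

```python
def cached_query(cache, query, run_query):
    """Return the result for query, using a dict-like cache keyed by the query text."""
    if query in cache:
        return cache[query]
    result = run_query(query)  # cache miss: fall back to the database
    cache[query] = result
    return result


calls = []


def fake_db(query):
    """Hypothetical stand-in for a real database call; records each invocation."""
    calls.append(query)
    return [("cat food", 3)]


cache = {}
first = cached_query(cache, "SELECT stuff, qty FROM stock", fake_db)
second = cached_query(cache, "SELECT stuff, qty FROM stock", fake_db)
```

With the fallback in place a cold cache only slows the application down. Once the database behind it is gone, as in step 6, the key/value store is no longer a cache but the system of record.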
  • 30. Cycle of changes - Product B
      1. Postgres DB storing crawler results.
      2. There was a counter in each row, and updating this counter caused contention errors.
      3. Memcache for reads. Performance is better.
      4. First MongoDB test: no more deadlocks from counter updates.
      5. Data model was simplified; the entire crawled doc was stored.
  • 31. Stuff to think about
      Consider whether the data you use isn't already de-normalized somewhere (cached).
      Most of the anti-patterns signal architectural issues rather than only database issues. The NoSQL route (or at least a partial NoSQL route) may simplify things.
      Are you dependent on cache? Does your application fail when there is no cache? Does it just slow down?
      Think about how you put data into the database and get it back (be it SQL or NoSQL).
  • 32. architecture
  • 33. data storage has NOT been considered part of the architecture [for a long time]
  • 34. every choice means giving something up
  • 35. patterns
  • 36. how-to
  • 38. acid
  • 39. (
  • 40. ACID NoSQL exists
  • 41. )
  • 42. CAP ref: The CAP Theorem, by Seth Gilbert & Nancy Lynch
  • 43. Consistency, Availability, Partition Tolerance
  • 45. BASE ref: BASE: An Acid Alternative, by Dan Pritchett
  • 46. Basically Available, Soft State, Eventually Consistent
  • 47. Eventual Consistency ref: Eventually Consistent, by Werner Vogels
  • 48. "eventual" in English: will happen at some point; "eventual" in Portuguese: may or may not happen
  • 49. Consistency at an Undetermined Moment (@mdediana)
  • 50. consistency: W + R > N
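The slide gives only the inequality. Under the usual quorum reading (an assumption here, since the slide does not define the symbols), N is the number of replicas, W the number that must acknowledge a write, and R the number consulted on a read. When W + R > N, every read set overlaps every write set in at least one replica. The `quorums_overlap` helper is a hypothetical name for that check.

```python
def quorums_overlap(n, w, r):
    """True when every read quorum shares at least one replica with every write quorum."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    # Two subsets of sizes w and r drawn from n replicas must intersect
    # whenever their sizes add up to more than n.
    return w + r > n
```

For example, with N = 3, choosing W = 2 and R = 2 satisfies the condition, while W = 1 and R = 1 does not, so a read may miss the latest write.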
  • 51. durability ref: The End of an Architectural Era, by Michael Stonebraker et al.
  • 52. there's more... ★ latency ★ performance ★ partitioning ★ distribution ★ replication
  • 53. remember: you are not building an intergalactic-scale solution, tolerant to random failures across datacenters spread over several geographic locations and other dimensions
  • 54. see why architecture matters?
  • 55. with so many definitions... so many concepts... so many tradeoffs... so many...
  • 56. how did NoSQL become so sexy and popular?
  • 57. despite everything... it's easy to use!
  • 58. polyglot persistence
  • 59. Questions?
  • 60. Thank you github.com/porcelli github.com/gleicon linkedin.com/in/alexandreporcelli linkedin.com/in/gleicon @porcelli @gleicon porcelli.com.br zenmachine.wordpress.com