Memento
                                http://mementoweb.org/


                                 Herbert Van de Sompel
                                     Robert Sanderson
                                     Michael L. Nelson


Big Leaps Towards Seamless Navigation
        of the Web of the Past

                    Memento Update
        CNI Task Force Meeting, Spring 2011   1
Overview of Memento Framework

Deployment Progress

Memento and Data

Memento and Discovery

Memento and Branding

Alternative Web Archiving Strategies


Overview of Memento Framework

Memento wants to make it easy

to access the Web of the Past.




[Figure: Tate Online today; select date March 16 2008; Tate Online as of March 16 2008, from the National Archives]


Memento achieves this by introducing

a uniform version access capability to

 integrate the present and past Web.




Content Management Systems:

                     •  Designed to be aware of all
                        versions of a resource;

                     •  Self-contained;

                     •  Variety of proprietary version
                        mechanisms;

                     •  Versions interlinked using
                        proprietary mechanisms.



World Wide Web:

                     •  Designed to forget about prior
                        versions of a resource;

                     •  Distributed.




There are resource versions on
                       the Web:

                     •  Content Management
                        Systems;

                     •  Web Archives;

                     •  Transactional archives;

                     •  Search engine caches.



But the Web architecture has a
                        hard time dealing with them:

                      •  Cannot talk about a resource
                         as it used to exist;

                      •  Cannot access a prior version
                         knowing the current one;

                      •  Cannot access the current
                         version knowing a prior one;

                      Current approaches are ad hoc
                        and localized.


Memento:

                     •  Regards the Web as a big
                        Content Management System

                     •  Introduces a uniform
                        capability to access versions
                        on the Web;

                     •  Does not build new archives
                        but leverages all systems that
                        host versions: Web archives,
                        Content Management
                        Systems, Software Version
                        Systems, etc.

Memento’s version access
                        approach:

                      •  Is distributed: versions may
                         exist on several servers;

                      •  Uses time as a global version
                         indicator;

                      •  Is based on the primitives of
                         the Web: resource, resource
                         state, representation, content
                         negotiation, link.
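
Using time as a global version indicator can be sketched as follows. This is a minimal illustration, not the project's implementation: the archive URIs are hypothetical, and picking the Memento nearest in absolute time is an assumed selection policy that a real server may not follow.

```python
from datetime import datetime, timezone


def closest_memento(mementos, requested):
    """Return the (uri, datetime) pair whose datetime is nearest to `requested`."""
    if not mementos:
        raise ValueError("no Mementos known for this resource")
    return min(mementos, key=lambda m: abs(m[1] - requested))


# Hypothetical Mementos of one Original Resource, hosted on different servers.
mementos = [
    ("http://archive-a.example/2007/page", datetime(2007, 5, 1, tzinfo=timezone.utc)),
    ("http://archive-b.example/2008/page", datetime(2008, 3, 20, tzinfo=timezone.utc)),
    ("http://archive-a.example/2010/page", datetime(2010, 1, 15, tzinfo=timezone.utc)),
]

best = closest_memento(mementos, datetime(2008, 3, 16, tzinfo=timezone.utc))
print(best[0])
```

Because the version indicator is a datetime rather than a server-specific version number, the candidates can come from any number of servers.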



Original Resource and Versions




Bridge from Present to Past




Bridge from Past to Present




Memento Framework




Multiple Archives




Memento Client-Server Interaction
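
The client side of this interaction might look roughly like the sketch below. It assumes the `Accept-Datetime` request header and the `original` / `timegate` link relations from the Memento Internet Draft. The helper names and example URIs are hypothetical, and the Link parser is deliberately simplified (it expects `rel` to be the first parameter of each link).

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime


def datetime_negotiation_headers(when):
    """Build request headers asking for the version that existed at `when`."""
    http_date = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return {"Accept-Datetime": http_date}


def parse_link_header(value):
    """Map each rel value found in a Link header to its target URI."""
    links = {}
    for uri, params in re.findall(r'<([^>]*)>((?:\s*;\s*[^,;]+)*)', value):
        match = re.search(r'rel="?([^";]+)"?', params)
        if match:
            for rel in match.group(1).split():
                links[rel] = uri
    return links


headers = datetime_negotiation_headers(datetime(2008, 3, 16, tzinfo=timezone.utc))

# Hypothetical Link header, as a server might return it for a resource.
link_value = (
    '<http://example.org/page>; rel="original", '
    '<http://archive.example/timegate/http://example.org/page>; rel="timegate"'
)
links = parse_link_header(link_value)
print(headers, links["timegate"])
```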




Deployment Progress

Significant progress has been made towards

seamless navigation of the Web of the Past.




Standardization:

                                  •  Standardization process started
                                     via the IETF;

                                  •  Interest from IETF and W3C;

                                  •  Encouraged by major Web
                                     architects, including: Tim
                                     Berners-Lee, Mark Nottingham,
                                     Michael Hausenblas.


https://datatracker.ietf.org/doc/draft-vandesompel-memento/

Memento Clients:

                      •  Several client tools developed
                         by us and others;

                      •  Add-ons for Firefox
                         (operational) and Internet
                         Explorer (experimental);

                      •  Applications for Android
                         (operational) and iPhone/iPad
                         (in development);

                      •  Paper in next issue of Code4Lib
                         Journal.

http://www.mementoweb.org/tools/

Memento server support (1):

                      •  Memento-compliant Wayback
                         software:

                           •  Used by Internet Archive.

                           •  Available to Web archives,
                              worldwide.

                           •  Please have your favorite
                              Web Archive install this new
                              version 1.6!


http://www.mementoweb.org/tools/

Memento server support (2):

                      •  Plug-in for MediaWiki
                         (operational);

                           •  Used on W3C’s main wiki.

                      •  Please install it for your
                         MediaWiki!




http://www.mementoweb.org/tools/

Memento Server Validator

                      •  Server side client:

                           •  Attempts to perform all
                              Memento actions against a
                              given URI

                           •  Reports success/failure of
                              the interactions and
                              warnings for optional
                              aspects

                           •  Kept up to date with IETF
                              Internet Draft

http://www.mementoweb.org/tools/

Memento Proxy Support

                      •  Several systems that host
                         Mementos made Memento-
                         compliant “by proxy”:

                           •  All major Web Archives that
                              do not yet run Memento-
                              compliant Wayback software

                           •  3,000+ MediaWiki systems,
                              including Wikipedia

                      •  We want all of these to become
                         natively Memento compliant!


Memento Website:

                      •  Ongoing effort to add
                         materials that support
                         understanding and adoption:
                          •  Introduction to Memento
                          •  How to recognize
                             Mementos, TimeGates,
                             Original Resources?
                          •  Guidelines for servers that
                             host Mementos (Web
                             Archives, CMS, snapshot
                             archives, etc.)
http://www.mementoweb.org/guide/

Funding:

                      •  2007-2010: US $250K grant
                         from Library of Congress;
                          •  Approx. 50K on Memento.

                      •  2010-2011: US $1 Million
                         follow-up grant from Library of
                         Congress.

                           •  For: Specification, outreach,
                              tool development, further
                              research.



Memento and Data

Memento Time Travel is really powerful.

Time-Series Data via HTTP follow-your-nose.




Memento Framework




Memento Framework & Time Series


Original Resource: http://dbpedia.org/resource/France




Time Travel across DBpedia Versions




 Data collected through HTTP Navigation

   paper at http://arxiv.org/abs/1003.3661
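
Collecting a time series by HTTP navigation can be pictured as walking from one Memento to the previous one until none is left. The sketch below is only an illustration: `FAKE_ARCHIVE` stands in for real HTTP requests, and its URIs, dates and values are made up.

```python
def collect_time_series(start_uri, fetch):
    """Follow 'previous Memento' pointers from start_uri; return oldest first."""
    series = []
    seen = set()
    uri = start_uri
    while uri is not None and uri not in seen:
        seen.add(uri)  # guard against accidental link cycles
        archived_at, value, previous_uri = fetch(uri)
        series.append((archived_at, value))
        uri = previous_uri
    return list(reversed(series))


# Stand-in for HTTP: URI -> (archival date, extracted value, URI of previous Memento).
FAKE_ARCHIVE = {
    "http://archive.example/v3": ("2010-01-01", "value-3", "http://archive.example/v2"),
    "http://archive.example/v2": ("2009-01-01", "value-2", "http://archive.example/v1"),
    "http://archive.example/v1": ("2008-01-01", "value-1", None),
}

series = collect_time_series("http://archive.example/v3", FAKE_ARCHIVE.__getitem__)
print(series)
```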

Memento and Discovery

Very few Web sites provide a “timegate” link.

Need additional mechanisms to support Discovery.




Batch discovery of Mementos: TimeMaps




                       A TimeMap minimally lists:

•  URI and datetime of Mementos known to an archive
•  URI of Original Resource

    TimeMaps can be aggregated across systems that host Mementos
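
Aggregation across systems could be sketched as below. The dictionary layout (`original`, `mementos`) is an assumption made for illustration and is not a TimeMap serialization. The archive URIs are hypothetical.

```python
from datetime import datetime


def aggregate_timemaps(timemaps):
    """Merge TimeMaps for the same Original Resource into one sorted list."""
    originals = {tm["original"] for tm in timemaps}
    if len(originals) != 1:
        raise ValueError("TimeMaps must describe exactly one Original Resource")
    merged = {}
    for tm in timemaps:
        for uri, when in tm["mementos"]:
            merged[uri] = when  # de-duplicate by Memento URI
    return {
        "original": originals.pop(),
        "mementos": sorted(merged.items(), key=lambda item: item[1]),
    }


timemap_a = {
    "original": "http://example.org/page",
    "mementos": [("http://archive-a.example/1", datetime(2009, 6, 1)),
                 ("http://archive-a.example/2", datetime(2010, 6, 1))],
}
timemap_b = {
    "original": "http://example.org/page",
    "mementos": [("http://archive-b.example/1", datetime(2010, 1, 1))],
}

combined = aggregate_timemaps([timemap_a, timemap_b])
print([uri for uri, _ in combined["mementos"]])
```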

Batch discovery of Mementos: Feed of TimeMaps

•  A system that hosts Mementos exposes a feed (e.g. Atom) of
TimeMaps so that applications can stay in sync with its
evolving Memento collection:

   •  One Atom entry per Original Resource for which the
   system hosts Mementos;
   •  The entry provides a “timemap” link to a
   TimeMap for the Original Resource;
   •  The datetime value of the entry’s updated field
   changes when an additional Memento for the Original Resource
   becomes available (i.e. the TimeMap changes);
   •  The ID of the entry is a tag URI based on the URI of
   the Original Resource.
                    Will be proposed to IIPC
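
One such feed entry might be built as in the following sketch. The slide does not say how the tag URI is derived from the Original Resource URI, so the `tag:authority,date:original-uri` construction here is an assumption, as are the function name and all example values.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"


def timemap_entry(original_uri, timemap_uri, updated, tag_authority, tag_date):
    """Build one Atom entry that points at the TimeMap of an Original Resource."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    # Assumed ID scheme: a tag URI whose specific part is the Original Resource URI.
    entry_id = f"tag:{tag_authority},{tag_date}:{original_uri}"
    ET.SubElement(entry, f"{{{ATOM}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM}}}title").text = original_uri
    # `updated` changes whenever the TimeMap gains a Memento.
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated
    ET.SubElement(entry, f"{{{ATOM}}}link", rel="timemap", href=timemap_uri)
    return entry


entry = timemap_entry(
    original_uri="http://example.org/page",
    timemap_uri="http://archive.example/timemap/http://example.org/page",
    updated="2011-04-04T12:00:00Z",
    tag_authority="archive.example",
    tag_date="2011",
)
print(ET.tostring(entry, encoding="unicode"))
```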

Batch discovery of Mementos: robots.txt

•  robots.txt file is used by Web servers to convey
crawling policies;

•  Add a directive to support discovery of Mementos known to
the server:
     •  Pointer to a single Memento can suffice as the robot
     can crawl on from there
     •  Mementos allow for discovery of TimeMaps via HTTP
     links.
     •  e.g. jcdl.org hosts snapshot archives of prior JCDL
     conferences and adds the following to its robots.txt:

   Memento: jcdl.org/archive/2002/index.html
               Will be promoted via Internet Draft
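
A crawler could pick up the proposed directive with a few lines of parsing. This is a sketch only: `Memento:` is the directive proposed on this slide, not an established robots.txt field, and the function name and the surrounding sample file are hypothetical.

```python
def memento_pointers(robots_txt):
    """Return the values of all 'Memento:' lines found in a robots.txt body."""
    pointers = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        name, separator, value = line.partition(":")
        if separator and name.strip().lower() == "memento" and value.strip():
            pointers.append(value.strip())
    return pointers


sample = """\
User-agent: *
Disallow: /private/
# snapshot archives of prior conferences
Memento: jcdl.org/archive/2002/index.html
"""

print(memento_pointers(sample))
```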

Batch discovery of TimeGates: robots.txt

•  robots.txt file is used by Web servers to convey
crawling policies;

•  Add a directive to support discovery of TimeGates known
to the server:
     •  TimeGates can be on server itself or on external server
     •  Value for the directive is typically a regular expression
     •  e.g. example.org could point at TimeGates in its
     associated transactional archive ta.org via robots.txt:

   TimeGate: ta.org/timegate/http://example.org/*


                Will be promoted via Internet Draft
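
How a client might turn such a directive value into a TimeGate URI is sketched below, under several assumptions that the slide does not confirm: the value is read as a TimeGate prefix followed by a shell-style wildcard pattern over Original Resource URIs, the split happens at the embedded `http://`, and an `http://` scheme is prepended to the prefix. The function name is hypothetical.

```python
from fnmatch import fnmatchcase


def timegate_for(original_uri, directive_value):
    """Return a TimeGate URI for original_uri, or None if the pattern does not cover it."""
    split_at = directive_value.find("http://")
    if split_at <= 0:
        return None  # no TimeGate prefix before the embedded pattern
    prefix = directive_value[:split_at]
    pattern = directive_value[split_at:]
    if fnmatchcase(original_uri, pattern):
        return "http://" + prefix + original_uri
    return None


directive = "ta.org/timegate/http://example.org/*"
print(timegate_for("http://example.org/news/today.html", directive))
```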

Discovery of Systems that Host Mementos: Registry/Feed

 •  Registry of collections of Mementos, e.g. of Web Archives,
 Transactional Archives, etc.

 •  Feed of registry records.

 •  A registry record details essential characteristics of a
 Memento collection.
       •  cf VOiD collection description for Linked Data.




                          Will be researched

Memento and Branding

Memento can recreate pages using
     resources from different archives.

This poses a branding challenge for archives.




Current Branding Practice for Web Archives

Page and embedded resources from same Web Archive

[Figure: branding for page and embedded resources]




Branding for Web Archives in Memento Mode

Page and embedded resources from various Web Archives

[Figure: page branding for the page itself; no branding for the embedded resources]


                           Will be researched

Alternative Web Archiving Strategies

Crawl-based Archives host distinct observations.

 Transactional Archives never miss an update.




Crawl-Based Web Archives




                    Observations

For example: Heritrix crawler for Internet Archive

Crawl-Based Web Archives

•  Collect discrete observations of resources, not their entire
evolution.

•  Can be rejected (robots.txt, by user-agent, by host IP).

•  Can be deceived (cloaking, by geo-location, by user-agent).

•  Coverage of a particular Web server depends on the crawl
strategy.




Server-Side Transactional Web Archives




                       Change History

For example: TTApache, PageVault, Vignette Web Capture

Server-Side Transactional Web Archives

•  Collect all representations served by to-be-archived server.

•  To-be-archived server needs to cooperate.
     •  Incentives e.g. institutional memory, official record of
     Web presence.

•  Archival coverage restricted by to-be-archived server, does
not include external servers (e.g. embedded resources).

•  To-be-archived server can submit falsified information.

•  Archival collection management: what to keep, what not
(e.g. significant changes, deduplication, …).


Development of Transactional Web Archive Software
Capture:
•  Apache connection filter module (mod_ta) captures URI, headers, body;
•  Module POSTs in real-time to transactional archive’s Submit URI.




Submit:
•  Java-Grizzly-Jersey submission interface application;
•  Berkeley DB metadata store;
•  FS store for body and headers.
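
The division of labor between a metadata store and a body store can be illustrated with the in-memory sketch below. It is not the mod_ta or Java submission code: the class, its methods and the hash-based deduplication rule are assumptions made for illustration, with plain Python containers standing in for Berkeley DB and the file system.

```python
import hashlib
from datetime import datetime, timezone


class TransactionalStore:
    """In-memory stand-in for a metadata store plus a body store."""

    def __init__(self):
        self.bodies = {}   # content hash -> body bytes
        self.records = []  # metadata: URI, datetime served, headers, content hash

    def latest(self, uri):
        """Return the most recent metadata record for uri, or None."""
        for record in reversed(self.records):
            if record["uri"] == uri:
                return record
        return None

    def submit(self, uri, headers, body, served_at):
        """Record a served representation; skip it if the body is unchanged."""
        digest = hashlib.sha256(body).hexdigest()
        previous = self.latest(uri)
        if previous is not None and previous["digest"] == digest:
            return False  # assumed deduplication: same body as last capture
        self.bodies.setdefault(digest, body)
        self.records.append({"uri": uri, "datetime": served_at,
                             "headers": dict(headers), "digest": digest})
        return True


store = TransactionalStore()
html = {"Content-Type": "text/html"}
store.submit("http://example.org/", html, b"<p>v1</p>",
             datetime(2011, 4, 4, 12, 0, tzinfo=timezone.utc))
store.submit("http://example.org/", html, b"<p>v1</p>",
             datetime(2011, 4, 4, 12, 5, tzinfo=timezone.utc))
store.submit("http://example.org/", html, b"<p>v2</p>",
             datetime(2011, 4, 4, 13, 0, tzinfo=timezone.utc))
print(len(store.records))
```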

Development of Transactional Web Archive Software
Access:
•  Transactional archive natively supports Memento;
•  Immediate availability of archived content;
•  Export of WARC, e.g. for long-term archiving in other environment.




Development timeline:
•  Ongoing development (LANL) and testing (ODU);
•  Submit/Access finalized; development focus on collection management.
•  Expected release as open source, 3rd Quarter 2011.


More Related Content

Viewers also liked

Open Archives Initiative Object Re-Use & Exchange
Open Archives Initiative Object Re-Use & ExchangeOpen Archives Initiative Object Re-Use & Exchange
Open Archives Initiative Object Re-Use & ExchangeHerbert Van de Sompel
 
The bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking ServersThe bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking ServersHerbert Van de Sompel
 
An Overview of the OAI Object Reuse and Exchange Interoperability Framework
An Overview of the OAI Object Reuse and Exchange Interoperability FrameworkAn Overview of the OAI Object Reuse and Exchange Interoperability Framework
An Overview of the OAI Object Reuse and Exchange Interoperability FrameworkHerbert Van de Sompel
 
MESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataMESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataHerbert Van de Sompel
 
An HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked DataAn HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked DataHerbert Van de Sompel
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Herbert Van de Sompel
 
The SFX Framework for Context-Sensitive Reference Linking
The SFX Framework for  Context-Sensitive Reference LinkingThe SFX Framework for  Context-Sensitive Reference Linking
The SFX Framework for Context-Sensitive Reference LinkingHerbert Van de Sompel
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTHerbert Van de Sompel
 
towards interoperable archives: the Universal Preprint Service initiative
towards interoperable archives:  the Universal Preprint Service initiativetowards interoperable archives:  the Universal Preprint Service initiative
towards interoperable archives: the Universal Preprint Service initiativeHerbert Van de Sompel
 
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...Herbert Van de Sompel
 
Towards a Machine-Actionable Scholarly Communication System
Towards a Machine-Actionable Scholarly Communication SystemTowards a Machine-Actionable Scholarly Communication System
Towards a Machine-Actionable Scholarly Communication SystemHerbert Van de Sompel
 

Viewers also liked (17)

Open Archives Initiative Object Re-Use & Exchange
Open Archives Initiative Object Re-Use & ExchangeOpen Archives Initiative Object Re-Use & Exchange
Open Archives Initiative Object Re-Use & Exchange
 
The bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking ServersThe bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking Servers
 
The djatoka Image Server
The djatoka Image ServerThe djatoka Image Server
The djatoka Image Server
 
An Overview of the OAI Object Reuse and Exchange Interoperability Framework
An Overview of the OAI Object Reuse and Exchange Interoperability FrameworkAn Overview of the OAI Object Reuse and Exchange Interoperability Framework
An Overview of the OAI Object Reuse and Exchange Interoperability Framework
 
The aDORe Federation Architecture
The aDORe Federation ArchitectureThe aDORe Federation Architecture
The aDORe Federation Architecture
 
MESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataMESUR: Making sense and use of usage data
MESUR: Making sense and use of usage data
 
An HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked DataAn HTTP-Based Versioning Mechanism for Linked Data
An HTTP-Based Versioning Mechanism for Linked Data
 
A Clean Slate?
A Clean Slate?A Clean Slate?
A Clean Slate?
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013
 
Memento: Time Travel for the Web
Memento: Time Travel for the WebMemento: Time Travel for the Web
Memento: Time Travel for the Web
 
PID Signposting Pattern
PID Signposting PatternPID Signposting Pattern
PID Signposting Pattern
 
The SFX Framework for Context-Sensitive Reference Linking
The SFX Framework for  Context-Sensitive Reference LinkingThe SFX Framework for  Context-Sensitive Reference Linking
The SFX Framework for Context-Sensitive Reference Linking
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
 
towards interoperable archives: the Universal Preprint Service initiative
towards interoperable archives:  the Universal Preprint Service initiativetowards interoperable archives:  the Universal Preprint Service initiative
towards interoperable archives: the Universal Preprint Service initiative
 
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
 
Untitled I: Challenges ahead
Untitled I: Challenges aheadUntitled I: Challenges ahead
Untitled I: Challenges ahead
 
Towards a Machine-Actionable Scholarly Communication System
Towards a Machine-Actionable Scholarly Communication SystemTowards a Machine-Actionable Scholarly Communication System
Towards a Machine-Actionable Scholarly Communication System
 

Similar to Memento: Big Leaps Towards Seamless Navigation of the Web of the Past

Update on Memento (IIPC 2011 Plenary)
Update on Memento (IIPC 2011 Plenary)Update on Memento (IIPC 2011 Plenary)
Update on Memento (IIPC 2011 Plenary)Robert Sanderson
 
Memento: Updated technical details (May 2011)
Memento: Updated technical details (May 2011)Memento: Updated technical details (May 2011)
Memento: Updated technical details (May 2011)Herbert Van de Sompel
 
Semantic Annotation and Search for Resources in the Next Generation Web
Semantic Annotation and Search for Resources in the Next Generation WebSemantic Annotation and Search for Resources in the Next Generation Web
Semantic Annotation and Search for Resources in the Next Generation Webajithranabahu
 
VA Smalltalk Update
VA Smalltalk UpdateVA Smalltalk Update
VA Smalltalk UpdateESUG
 
VA Smalltalk Update ESUG2014
VA Smalltalk Update ESUG2014VA Smalltalk Update ESUG2014
VA Smalltalk Update ESUG2014ESUG
 
Webinar Mobile ECM Apps with Nuxeo EP
Webinar Mobile ECM Apps with Nuxeo EPWebinar Mobile ECM Apps with Nuxeo EP
Webinar Mobile ECM Apps with Nuxeo EPNuxeo
 
facebook architecture for 600M users
facebook architecture for 600M usersfacebook architecture for 600M users
facebook architecture for 600M usersJongyoon Choi
 
Tycho - Building plug-ins with Maven
Tycho - Building plug-ins with MavenTycho - Building plug-ins with Maven
Tycho - Building plug-ins with MavenPascal Rapicault
 
#OSSPARIS19 - Do not be afraid to be forked ! - YOAV KUTNER, Oro Inc.
#OSSPARIS19 - Do not be afraid to be forked ! - YOAV KUTNER, Oro Inc.#OSSPARIS19 - Do not be afraid to be forked ! - YOAV KUTNER, Oro Inc.
#OSSPARIS19 - Do not be afraid to be forked ! - YOAV KUTNER, Oro Inc.Paris Open Source Summit
 
Jasig-sakai2012-communitytranslation-kajita
Jasig-sakai2012-communitytranslation-kajitaJasig-sakai2012-communitytranslation-kajita
Jasig-sakai2012-communitytranslation-kajitaShoji Kajita
 
An introduction to honeyclient technology
An introduction to honeyclient technologyAn introduction to honeyclient technology
An introduction to honeyclient technologyAngelo Dell'Aera
 
Content Management Systems and Refactoring - Drupal, WordPress and eZ Publish
Content Management Systems and Refactoring - Drupal, WordPress and eZ PublishContent Management Systems and Refactoring - Drupal, WordPress and eZ Publish
Content Management Systems and Refactoring - Drupal, WordPress and eZ PublishJani Tarvainen
 
joomla.ppt educational content and topic
joomla.ppt educational content and topicjoomla.ppt educational content and topic
joomla.ppt educational content and topicOlajide Kuku
 
Open Source na IBM (palestra efetuada no Comaer 2008)
Open Source na IBM (palestra efetuada no Comaer 2008)Open Source na IBM (palestra efetuada no Comaer 2008)
Open Source na IBM (palestra efetuada no Comaer 2008)Cezar Taurion
 
Open Source and Open Standards for Information and Records Managers
Open Source and Open Standards for Information and Records ManagersOpen Source and Open Standards for Information and Records Managers
Open Source and Open Standards for Information and Records ManagersCheryl McKinnon
 
software technology benchmarking
software  technology benchmarkingsoftware  technology benchmarking
software technology benchmarkingMallikarjuna G D
 
Enabling The Enterprise With Php
Enabling The Enterprise With PhpEnabling The Enterprise With Php
Enabling The Enterprise With Phpphptechtalk
 
Programming With WinRT And Windows8
Programming With WinRT And Windows8Programming With WinRT And Windows8
Programming With WinRT And Windows8Rainer Stropek
 

Similar to Memento: Big Leaps Towards Seamless Navigation of the Web of the Past (20)

Update on Memento (IIPC 2011 Plenary)
Update on Memento (IIPC 2011 Plenary)Update on Memento (IIPC 2011 Plenary)
Update on Memento (IIPC 2011 Plenary)
 
Memento: Updated technical details (May 2011)
Memento: Updated technical details (May 2011)Memento: Updated technical details (May 2011)
Memento: Updated technical details (May 2011)
 
Semantic Annotation and Search for Resources in the Next Generation Web
Semantic Annotation and Search for Resources in the Next Generation WebSemantic Annotation and Search for Resources in the Next Generation Web
Semantic Annotation and Search for Resources in the Next Generation Web
 
VA Smalltalk Update
VA Smalltalk UpdateVA Smalltalk Update
VA Smalltalk Update
 
VA Smalltalk Update ESUG2014
VA Smalltalk Update ESUG2014VA Smalltalk Update ESUG2014
VA Smalltalk Update ESUG2014
 
Os php-wiki1-pdf
Os php-wiki1-pdfOs php-wiki1-pdf
Os php-wiki1-pdf
 
Webinar Mobile ECM Apps with Nuxeo EP
Webinar Mobile ECM Apps with Nuxeo EPWebinar Mobile ECM Apps with Nuxeo EP
Webinar Mobile ECM Apps with Nuxeo EP
 
facebook architecture for 600M users
facebook architecture for 600M usersfacebook architecture for 600M users
facebook architecture for 600M users
 
Tycho - Building plug-ins with Maven
Tycho - Building plug-ins with MavenTycho - Building plug-ins with Maven
Tycho - Building plug-ins with Maven
 
Memento: Big Leaps Towards Seamless Navigation of the Web of the Past

  • 1. Memento http://mementoweb.org/ Herbert Van de Sompel Robert Sanderson Michael L. Nelson Big Leaps Towards Seamless Navigation of the Web of the Past Memento Update CNI Task Force Meeting, Spring 2011 1
  • 2. Overview of Memento Framework Deployment Progress Memento and Data Memento and Discovery Memento and Branding Alternative Web Archiving Strategies
  • 3. Overview of Memento Framework Progress Memento and Data Memento and Discovery Memento and Branding Alternative Web Archiving Strategies
  • 4. Memento wants to make it easy to access the Web of the Past.
  • 5. [Figure: Tate Online today; select date March 16 2008; Tate Online as of March 16 2008, from National Archives]
  • 6. Memento achieves this by introducing a uniform version access capability to integrate the present and past Web.
  • 7. Content Management Systems: •  Designed to be aware of all versions of a resource; •  Self-contained; •  Variety of proprietary version mechanisms; •  Versions interlinked using proprietary mechanisms.
  • 8. World Wide Web: •  Designed to forget about prior versions of a resource; •  Distributed.
  • 9. There are resource versions on the Web: •  Content Management Systems; •  Web Archives; •  Transactional archives; •  Search engine caches.
  • 10. But the Web architecture has a hard time dealing with them: •  Cannot talk about a resource as it used to exist; •  Cannot access a prior version knowing the current one; •  Cannot access the current version knowing a prior one; Current approaches are ad hoc and localized.
  • 11. Memento: •  Regards the Web as a big Content Management System •  Introduces a uniform capability to access versions on the Web; •  Does not build new archives but leverages all systems that host versions: Web archives, Content Management Systems, Software Version Systems, etc.
  • 12. Memento’s version access approach: •  Is distributed: versions may exist on several servers; •  Uses time as a global version indicator; •  Is based on the primitives of the Web: resource, resource state, representation, content negotiation, link.
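The combination of "time as a global version indicator", content negotiation and links can be illustrated with a minimal Python sketch. It assumes the header and relation names of the published Memento specification (Accept-Datetime on the request, a Link header with rel values such as "original", "timemap" and "memento" on the response). The function names and all example URIs are hypothetical, and no network request is made.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime

# One link-value: <uri> followed by zero or more ;name="value" parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)="([^"]*)"')


def accept_datetime_header(when):
    """Build the request header a client would send to negotiate on time."""
    # HTTP dates are expressed in GMT, so normalise to UTC first.
    utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(utc, usegmt=True)}


def parse_links(header_value):
    """Split a Link header into dicts holding the target URI and its parameters."""
    links = []
    for uri, raw_params in LINK_RE.findall(header_value):
        links.append({"uri": uri, **dict(PARAM_RE.findall(raw_params))})
    return links


# Hypothetical request for the state of a page on March 16 2008.
headers = accept_datetime_header(datetime(2008, 3, 16, tzinfo=timezone.utc))

# Hypothetical Link header, as an archive might return it with a Memento.
sample = (
    '<http://example.org/page>; rel="original", '
    '<http://archive.example/timemap/http://example.org/page>; rel="timemap", '
    '<http://archive.example/20080316/http://example.org/page>; rel="memento"; '
    'datetime="Sun, 16 Mar 2008 00:00:00 GMT"'
)
links = parse_links(sample)
```

The "original" link is what bridges from a past version back to the present resource, and the "memento" link carries the datetime of the archived state.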
  • 13. Original Resource and Versions
  • 14. Bridge from Present to Past
  • 15. Bridge from Past to Present
  • 16. Memento Framework
  • 17. Multiple Archives
  • 18. Memento Client-Server Interaction
  • 19. Overview of Memento Framework Deployment Progress Memento and Data Memento and Discovery Memento and Branding Alternative Web Archiving Strategies
  • 20. Significant progress has been made towards seamless navigation of the Web of the Past.
  • 21. Standardization: •  Standardization process started via the IETF; •  Interest from IETF and W3C; •  Encouraged by major Web architects, including: Tim Berners-Lee, Mark Nottingham, Michael Hausenblas. https://datatracker.ietf.org/doc/draft-vandesompel-memento/
  • 22. Memento Clients: •  Several client tools developed by us and others; •  Add-ons for Firefox (operational) and Internet Explorer (experimental); •  Applications for Android (operational) and iPhone/iPad (in development); •  Paper in next issue of Code4Lib Journal. http://www.mementoweb.org/tools/
  • 23. Memento server support (1): •  Memento-compliant Wayback software: •  Used by Internet Archive. •  Available to Web archives, worldwide. •  Please have your favorite Web Archive install this new version 1.6! http://www.mementoweb.org/tools/
  • 24. Memento server support (2): •  Plug-in for MediaWiki (operational); •  Used on W3C’s main wiki. •  Please install it for your MediaWiki! http://www.mementoweb.org/tools/
  • 25. Memento Server Validator •  Server side client: •  Attempts to perform all Memento actions against a given URI •  Reports success/failure of the interactions and warnings for optional aspects •  Kept up to date with IETF Internet Draft http://www.mementoweb.org/tools/
  • 26. Memento Proxy Support •  Several systems that host Mementos made Memento-compliant “by proxy”: •  All major Web Archives that do not yet run Memento-compliant Wayback software •  3,000+ MediaWiki systems, including Wikipedia •  We want all of these to become natively Memento compliant!
  • 27. Memento Website: •  Ongoing effort to add materials that support understanding and adoption: •  Introduction to Memento •  How to recognize Mementos, TimeGates, Original Resources? •  Guidelines for servers that host Mementos (Web Archives, CMS, snapshot archives, etc.) http://www.mementoweb.org/guide/
  • 28. Funding: •  2007-2010: US $250K grant from Library of Congress; •  Approx. 50K on Memento. •  2010-2011: US $1 Million follow-up grant from Library of Congress. •  For: Specification, outreach, tool development, further research.
  • 29. Overview of Memento Framework Deployment Progress Memento and Data Memento and Discovery Memento and Branding Alternative Web Archiving Strategies
  • 30. Memento Time Travel is really powerful. Time-Series Data via HTTP follow-your-nose.
  • 31. Memento Framework
  • 32. Memento Framework & Time Series Original Resource: http://dbpedia.org/resource/France
  • 33. Time Travel across DBpedia Versions Data collected through HTTP Navigation paper at http://arxiv.org/abs/1003.3661
  • 34. Overview of Memento Framework Deployment Progress Memento and Data Memento and Discovery Memento and Branding Alternative Web Archiving Strategies
  • 35. Very few Web sites provide a “timegate” link. Need additional mechanisms to support Discovery.
  • 36. Batch discovery of Mementos: TimeMaps A TimeMap minimally lists: •  URI and datetime of Mementos known to an archive •  URI of Original Resource TimeMaps can be aggregated across systems that host Mementos
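As a rough illustration of what a client can do with a TimeMap, the Python sketch below reads a TimeMap serialized as a list of links and picks the Memento closest to a requested datetime. The link-list serialization, the function names and every URI are assumptions made for the example; aggregating TimeMaps from several archives would amount to merging the Memento lists before selecting.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

ENTRY_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM_RE = re.compile(r'([\w-]+)="([^"]*)"')


def parse_timemap(text):
    """Return (original_uri, [(datetime, memento_uri), ...]) sorted by datetime."""
    original = None
    mementos = []
    for uri, raw_params in ENTRY_RE.findall(text):
        params = dict(PARAM_RE.findall(raw_params))
        rels = params.get("rel", "").split()
        if "original" in rels:
            original = uri
        if "memento" in rels and "datetime" in params:
            mementos.append((parsedate_to_datetime(params["datetime"]), uri))
    return original, sorted(mementos)


def closest_memento(mementos, target):
    """Pick the Memento whose datetime is nearest to the target datetime."""
    return min(mementos, key=lambda item: abs(item[0] - target))


# Hypothetical TimeMap for a hypothetical Original Resource.
TIMEMAP = """
<http://example.org/page>; rel="original",
<http://archive.example/2007/http://example.org/page>; rel="memento"; datetime="Fri, 01 Jun 2007 00:00:00 GMT",
<http://archive.example/2008/http://example.org/page>; rel="memento"; datetime="Sun, 16 Mar 2008 00:00:00 GMT",
<http://archive.example/2009/http://example.org/page>; rel="memento"; datetime="Thu, 01 Jan 2009 00:00:00 GMT"
"""

original, mementos = parse_timemap(TIMEMAP)
best = closest_memento(mementos, datetime(2008, 5, 1, tzinfo=timezone.utc))
```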
  • 37. Batch discovery of Mementos: Feed of TimeMaps •  A system that hosts Mementos exposes a Feed (e.g. Atom) of TimeMaps to allow applications to remain in sync with its evolving Memento collection: •  One Atom entry per Original Resource for which the system hosts Mementos; •  The entry provides a “timemap” link to a TimeMap for the Original Resource; •  The datetime value of the updated field of the entry changes when an additional Memento for the Original Resource becomes available (i.e. the TimeMap changes); •  The ID of the entry is a tag URI based on the URI of the Original Resource. Will be proposed to IIPC
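A feed entry of the kind described above could look roughly like the output of the following Python sketch. The slide only says that the ID is a tag URI based on the URI of the Original Resource, so the authority, the date part and the percent-encoding used here are illustrative assumptions, as are the function name and all URIs.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

ATOM = "http://www.w3.org/2005/Atom"


def timemap_entry(original_uri, timemap_uri, updated,
                  authority="archive.example", tag_date="2011"):
    """Build one Atom entry that points at the TimeMap of an Original Resource."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    # Assumed ID scheme: tag URI whose specific part is the encoded original URI.
    tag_uri = f"tag:{authority},{tag_date}:{quote(original_uri, safe='')}"
    ET.SubElement(entry, f"{{{ATOM}}}id").text = tag_uri
    ET.SubElement(entry, f"{{{ATOM}}}title").text = original_uri
    # 'updated' changes whenever a new Memento is added to the TimeMap.
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated
    ET.SubElement(entry, f"{{{ATOM}}}link", {"rel": "timemap", "href": timemap_uri})
    return entry


entry = timemap_entry(
    "http://example.org/page",
    "http://archive.example/timemap/http://example.org/page",
    "2011-04-04T12:00:00Z",
)
xml_text = ET.tostring(entry, encoding="unicode")
```

An application that polls the feed only needs to refetch the TimeMaps of entries whose updated value has changed since its last visit.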
  • 38. Batch discovery of Mementos: robots.txt •  robots.txt file is used by Web servers to convey crawling policies; •  Add a directive to support discovery of Mementos known to the server: •  Pointer to a single Memento can suffice as the robot can crawl on from there •  Mementos allow for discovery of TimeMaps via HTTP links. •  e.g. jcdl.org hosts snapshot archives of prior JCDL conferences and adds the following to its robots.txt: Memento: jcdl.org/archive/2002/index.html Will be promoted via Internet Draft
  • 39. Batch discovery of TimeGates: robots.txt •  robots.txt file is used by Web servers to convey crawling policies; •  Add a directive to support discovery of TimeGates known to the server: •  TimeGates can be on server itself or on external server •  Value for the directive is typically a regular expression •  e.g. example.org could point at TimeGates in its associated transactional archive ta.org via robots.txt: TimeGate: ta.org/timegate/http://example.org/* Will be promoted via Internet Draft
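The two proposed robots.txt directives can be read with a few lines of Python. This is only a sketch of how a crawler might extract them: the directive names and example values are taken from the slides, while the function name and the surrounding robots.txt content are hypothetical, and the values are returned as written rather than interpreted as patterns.

```python
def memento_directives(robots_txt):
    """Collect the values of the proposed Memento and TimeGate directives."""
    found = {"memento": [], "timegate": []}
    for line in robots_txt.splitlines():
        # Drop comments, then split on the first colon only, because the
        # directive value may itself contain colons (e.g. "http://").
        line = line.split("#", 1)[0].strip()
        name, sep, value = line.partition(":")
        key = name.strip().lower()
        if sep and key in found and value.strip():
            found[key].append(value.strip())
    return found


# Hypothetical robots.txt combining both example directives from the slides.
ROBOTS = """\
User-agent: *
Disallow: /private/
Memento: jcdl.org/archive/2002/index.html
TimeGate: ta.org/timegate/http://example.org/*
"""

directives = memento_directives(ROBOTS)
```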
  • 40. Discovery of Systems that Host Mementos: Registry/Feed •  Registry of collections of Mementos, e.g. of Web Archives, Transactional Archives, etc. •  Feed of registry records. •  A registry record details essential characteristics of a Memento collection. •  cf VOiD collection description for Linked Data. Will be researched
  • 41. Overview of Memento Framework Deployment Progress Memento and Data Memento and Discovery Memento and Branding Alternative Web Archiving Strategies
  • 42. Memento can recreate pages using resources from different archives. This poses a branding challenge for archives.
  • 43. Current Branding Practice for Web Archives Page and embedded resources from same Web Archive Branding for page and embedded resources
  • 44. Branding for Web Archives in Memento Mode Page and embedded resources from various Web Archives Page branding No branding No branding Will be researched
  • 45. Overview of Memento Framework Deployment Progress Memento and Data Memento and Discovery Memento and Branding Alternative Web Archiving Strategies
  • 46. Crawl-based Archives host distinct observations. Transactional Archives never miss an update.
  • 47. Crawl-Based Web Archives Observations For example: Heritrix crawler for Internet Archive
  • 48. Crawl-Based Web Archives •  Collect discrete observations of resources, not their entire evolution. •  Can be rejected (robots.txt, by user-agent, by host IP) •  Can be deceived (cloaking, by geo-location, by user-agent). •  Coverage of particular Web server dependent on crawl strategy.
  • 49. Server-Side Transactional Web Archives Change History For example: TTApache, PageVault, Vignette Web Capture
  • 50. Server-Side Transactional Web Archives •  Collect all representations served by to-be-archived server. •  To-be-archived server needs to cooperate. •  Incentives e.g. institutional memory, official record of Web presence. •  Archival coverage restricted by to-be-archived server, does not include external servers (e.g. embedded resources). •  To-be-archived server can submit falsified information. •  Archival collection management: what to keep, what not (e.g. significant changes, deduplication, …).
  • 51. Development of Transactional Web Archive Software Capture: •  Apache connection filter module (mod_ta) captures URI, headers, body; •  Module POSTs in real-time to transactional archive’s Submit URI. Submit: •  Java-Grizzly-Jersey submission interface application; •  Berkeley DB metadata store; •  FS store for body and headers.
  • 52. Development of Transactional Web Archive Software Access: •  Transactional archive natively supports Memento; •  Immediate availability of archived content; •  Export of WARC, e.g. for long-term archiving in other environment. Development timeline: •  Ongoing development (LANL) and testing (ODU); •  Submit/Access finalized; development focus on collection management. •  Expected release as open source, 3rd Quarter 2011.
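The capture, submit and access steps can be pictured with a small in-memory stand-in written in Python. It is not the mod_ta, Grizzly/Jersey or Berkeley DB software named on the slides: the class, its method names and the digest-based deduplication rule are assumptions chosen to show how recording every served representation yields a change history that can be queried by datetime.

```python
import hashlib
from bisect import bisect_right
from datetime import datetime, timezone


class TransactionalArchive:
    """Toy archive: keeps a new version only when the served body changes."""

    def __init__(self):
        # uri -> list of (served_at, digest, headers, body), in submission order
        self._history = {}

    def submit(self, uri, served_at, headers, body):
        """Record a served representation; return True if it was stored."""
        digest = hashlib.sha256(body).hexdigest()
        versions = self._history.setdefault(uri, [])
        if versions and versions[-1][1] == digest:
            return False  # unchanged since the last capture: deduplicate
        versions.append((served_at, digest, dict(headers), body))
        return True

    def memento_at(self, uri, when):
        """Return the version that was current at 'when', or None if none yet."""
        versions = self._history.get(uri, [])
        # Assumes submissions arrive in chronological order (real-time capture).
        times = [version[0] for version in versions]
        index = bisect_right(times, when)
        return versions[index - 1] if index else None


def utc(*args):
    return datetime(*args, tzinfo=timezone.utc)


archive = TransactionalArchive()
page = "http://example.org/page"  # hypothetical to-be-archived resource
stored = [
    archive.submit(page, utc(2011, 1, 1), {"Content-Type": "text/html"}, b"v1"),
    archive.submit(page, utc(2011, 1, 2), {"Content-Type": "text/html"}, b"v1"),
    archive.submit(page, utc(2011, 2, 1), {"Content-Type": "text/html"}, b"v2"),
]
```

Because every change is captured at serving time, a lookup for any datetime after the first capture returns exactly the state that was being served then, which is the sense in which such an archive never misses an update.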
  • 53. Memento http://mementoweb.org/ Herbert Van de Sompel Robert Sanderson Michael L. Nelson Big Leaps Towards Seamless Navigation of the Web of the Past