1. The Real-Time Web in the Age of Agents
Joshua Shinavier
2011 Semantic Technology Conference
June 8th, 2011
2. Overview
• real-time Semantic Web is emerging
• state of the art
• real-time interaction?
• Semantic Web agents
• use cases
• the RDFAgents specification
• queries and subscriptions
• data model
• provenance and information discovery
3. Real-time interaction
with the Semantic Web
• let’s mash up the “real-time” social Web
with the Semantic Web
• data sources are plentiful
• e.g. Twitter, Facebook, Glue, ActivityStreams
• data integration is straightforward
• mapping into Semantic Web vocabularies
• linking into Semantic Web datasets
• what about interaction?
4. Semantic Web “1.0” was
• ...not “real-time”
- static RDF data, occasionally updated
• ...not ubiquitous
- desktop model (browsers, keyword search)
- client-server model
• ...not very interactive
- read-only content, monolithic triple stores
- little “bottom-up” user content
5. Getting closer to “real-time”
(latency scale, slowest to fastest: latency, technology, misc. examples)
• days: static triple stores
• hours: web pages
• minutes: blogs
• seconds: microblogs (e.g. TwitLogic)
• milliseconds: instant messaging (e.g. sparqlPuSH, C-SPARQL), streaming media
6. More ubiquitous, more interactive
• mobile Semantic Web
- e.g. RDF On the Go, Android Semantic Web Core
library, DBpedia/mSpace/OntoWiki Mobile
• semantic sensor networks
• semantic publishing and presence
- e.g. SMOB, Sharing Spaces
• smart spaces, dialogue-based Semantic Web UI
7. The 10,000-foot view
• real-time Semantic Web is growing organically
• common themes:
- rapidly changing data
- two-way, peer-to-peer communication
- participation of “ubiquitous” devices large and
small
• we need shared frameworks for real-time interaction
8. Enter Semantic Web agents
“The real power of the Semantic Web will be
realized when people create many programs that
collect Web content from diverse sources,
process the information and exchange the results
with other programs.
The effectiveness of such software agents will
increase exponentially as more machine-readable
Web content and services become available.”
— Tim Berners-Lee et al., 2001
9. By “agent”, we mean:
• peer-to-peer communication
• proactive, event-driven information sharing
• conventions for provenance tracking
• support for a variety of interaction protocols
- e.g. for query answering, subscriptions, contracts
(not that kind of agent.)
10. Use cases:
basic query answering
• Agent A → Agent B: “Who is Josh?”
• Agent B → Agent A: “Josh is a presenter at
SemTech.”
11. Use cases:
query delegation with provenance
• Agent A → Agent B: “Who is Josh?”
- Agent B → Agent C, Agent D: “Who is Josh?”
- Agent C → Agent B: “Josh is a presenter at
SemTech.”
- Agent D → Agent B: “Josh has the Twitter handle
@joshsh and has written these tweets: ...”
• Agent B → Agent A: “Agent C says that Josh is a
presenter at SemTech. Agent D says that Josh has
the Twitter handle @joshsh and has written these
tweets: ...”
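The delegation pattern above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the RDFAgents Query protocol itself: the `Agent` class, its `facts` and `peers` fields, and the tuple-based provenance chain are hypothetical names introduced here, and the sketch assumes the peer graph has no cycles.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent: answers from its own facts and delegates to peers."""
    name: str
    facts: dict = field(default_factory=dict)  # topic -> statements known first-hand
    peers: list = field(default_factory=list)  # agents this agent delegates queries to

    def query(self, topic):
        """Return (chain, statement) pairs; chain lists who said it, nearest agent first."""
        answers = [((self.name,), s) for s in self.facts.get(topic, [])]
        for peer in self.peers:
            for chain, statement in peer.query(topic):
                # Record "self says that <chain> says <statement>".
                answers.append(((self.name,) + chain, statement))
        return answers


agent_c = Agent("C", facts={"Josh": ["Josh is a presenter at SemTech."]})
agent_d = Agent("D", facts={"Josh": ["Josh has the Twitter handle @joshsh."]})
agent_b = Agent("B", peers=[agent_c, agent_d])

# Agent A asks Agent B; each answer carries the chain of agents it passed through.
answers_for_a = agent_b.query("Josh")
```

Each answer that reaches Agent A is tagged `("B", "C")` or `("B", "D")`, which corresponds to "Agent B reports that Agent C says ...".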
12. Use cases:
real-time data streams
• Agent A → Agent B: “Keep me up to date about
SemTech.”
• Agent B → Agent A: “Will do!”
• Agent B → Agent A: “Josh has just written this tweet
about SemTech: ...”
• Agent B → Agent A: “Craig has just posted this review
of SemTech: ...”
13. Use cases:
syndication with provenance
• Agent A → Agent B: “Keep me up to date about
SemTech.”
• Agent B → Agent A: “Will do!”
- Agent B → Agent C: “Keep me up to date about
SemTech.”
- Agent C → Agent B: “Will do!”
- Agent C → Agent B: “Josh has just written this
tweet about SemTech: ...”
• Agent B → Agent A: “Agent C says that Josh has just
written this tweet about SemTech: ...”
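A rough Python sketch of the syndication exchange above, assuming in-process callbacks in place of real agent messaging. The `StreamAgent` class and its method names are hypothetical and are not taken from the RDFAgents Pub-Sub protocol.

```python
class StreamAgent:
    """Hypothetical publish/subscribe agent that keeps a provenance chain per update."""

    def __init__(self, name):
        self.name = name
        self.subscribers = {}  # topic -> list of callbacks
        self.inbox = []        # (chain, statement) pairs received by this agent

    def subscribe(self, topic, callback):
        """Register a callback for a topic and acknowledge the subscription."""
        self.subscribers.setdefault(topic, []).append(callback)
        return "Will do!"

    def publish(self, topic, statement, chain=()):
        """Push a statement to subscribers, prefixing the chain with this agent's name."""
        chain = (self.name,) + tuple(chain)
        for callback in self.subscribers.get(topic, []):
            callback(topic, statement, chain)

    def relay_from(self, upstream, topic):
        """Subscribe upstream and re-publish every update to this agent's own subscribers."""
        return upstream.subscribe(
            topic, lambda t, s, chain: self.publish(t, s, chain)
        )

    def receive(self, topic, statement, chain):
        """Store an incoming update together with its provenance chain."""
        self.inbox.append((chain, statement))


agent_a, agent_b, agent_c = StreamAgent("A"), StreamAgent("B"), StreamAgent("C")
ack_to_a = agent_b.subscribe("SemTech", agent_a.receive)  # A -> B: keep me up to date
ack_to_b = agent_b.relay_from(agent_c, "SemTech")         # B -> C: keep me up to date
agent_c.publish("SemTech", "Josh has just written this tweet about SemTech: ...")
```

After the last call, Agent A's inbox holds the update with the chain `("B", "C")`, i.e. "Agent B reports that Agent C says ...".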
14. Why agents? Extreme real-time
• human limit of real-time ≈ 10 to 100 ms
- these are simultaneity thresholds, central to
human sensory integration
• technical limits of real-time are similar
- 134ms for a round-trip around the Earth (ideally)
- 28ms from New York to San Francisco and back
- actual Internet latency is 2 to 3 times higher
• TCP’s three-way handshake is a problem: every new
connection costs an extra round trip before data flows
- we need long-lived connections and
bidirectional data flow, as in XMPP (Jabber)
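The ideal latency figures above follow from distance divided by the speed of light. A quick check in Python, assuming an equatorial circumference of about 40,075 km and a New York to San Francisco great-circle distance of roughly 4,130 km (both figures are assumptions supplied here, not values from the slides):

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in vacuum
EARTH_CIRCUMFERENCE_KM = 40_075        # assumed equatorial circumference
NY_SF_DISTANCE_KM = 4_130              # assumed great-circle distance, one way


def light_time_ms(distance_km):
    """Time in milliseconds for light in vacuum to cover the given distance."""
    return distance_km / SPEED_OF_LIGHT_KM_PER_S * 1000


around_earth_ms = light_time_ms(EARTH_CIRCUMFERENCE_KM)     # once around the Earth
ny_sf_round_trip_ms = light_time_ms(2 * NY_SF_DISTANCE_KM)  # there and back

print(f"around the Earth: {around_earth_ms:.1f} ms")
print(f"New York to San Francisco and back: {ny_sf_round_trip_ms:.1f} ms")
```

Rounded to whole milliseconds these give 134 ms and 28 ms. Signals in optical fibre travel more slowly than light in vacuum, which is one reason real latencies are higher.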
15. Why agents? The standards
• FIPA (Foundation for Intelligent Physical Agents)
• most widely adopted body of agent standards
• supports peer-to-peer, asynchronous, and event-driven
communication
• transport agnostic
- compatible with HTTP, XMPP, etc.
• easily coupled with Web APIs, Semantic Web data
formats
16. RDFAgents = Semantic Web + FIPA + data streams
• RDFAgents specification
• extends FIPA
• enables query answering and real-time data
streams with provenance
• includes:
• RDF content languages
• two interaction protocols: Query and Pub-Sub
• support for trust, privacy, information
accountability
• conventions for peer-to-peer information
discovery
20. RDFAgents data model
• agents exchange RDF Datasets, in the sense of
SPARQL
- any number of Named Graphs
  - may be annotated with provenance metadata
- a single default graph
  - contains asserted (accepted) statements
• agents identify themselves with Semantic Web URIs
- agent contact information in FOAF
• Semantic Web Publishing vocabulary captures the
provenance trail
21. Example: a dataset from a message
# a named graph (generic content)
<urn:uuid:be0c72c6-2b8f-4134-b309-690039f8c419> {
    ex:post1785782813 a sioct:MicroblogPost ;
        dc:title "#SemTech begins again!" ;
        dc:created "2011-06-06T19:43:35.000Z"^^xsd:dateTime ;
        sioc:topic <http://twitlogic.fortytwo.net/hashtag/semtech> .
}
# the default graph (accepted statements)
{
    # graph metadata
    <urn:uuid:be0c72c6-2b8f-4134-b309-690039f8c419> a rdfg:Graph ;
        swp:assertedBy <urn:uuid:be0c72c6-2b8f-4134-b309-690039f8c419> ;
        swp:authority ex:twitlogic .
    # agent metadata
    ex:twitlogic a foaf:Agent ;
        foaf:mbox <xmpp:twitlogic@example.org> .
}
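To make the shape of such a message concrete, here is a minimal Python sketch that mirrors the example dataset with plain tuples. The `Dataset` class and its `authorities` helper are hypothetical illustrations, not part of RDFAgents or of any RDF library; prefixed names are kept as plain strings rather than expanded to full URIs.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical stand-in for an RDF Dataset: one default graph plus named graphs."""
    default_graph: set = field(default_factory=set)   # accepted statements
    named_graphs: dict = field(default_factory=dict)  # graph name -> set of triples

    def authorities(self, graph_name):
        """Agents that the default graph names as swp:authority for a graph."""
        return {
            o for (s, p, o) in self.default_graph
            if s == graph_name and p == "swp:authority"
        }


GRAPH = "urn:uuid:be0c72c6-2b8f-4134-b309-690039f8c419"
POST = "ex:post1785782813"

message = Dataset(
    default_graph={
        # graph metadata
        (GRAPH, "rdf:type", "rdfg:Graph"),
        (GRAPH, "swp:assertedBy", GRAPH),
        (GRAPH, "swp:authority", "ex:twitlogic"),
        # agent metadata
        ("ex:twitlogic", "rdf:type", "foaf:Agent"),
        ("ex:twitlogic", "foaf:mbox", "xmpp:twitlogic@example.org"),
    },
    named_graphs={
        GRAPH: {
            (POST, "rdf:type", "sioct:MicroblogPost"),
            (POST, "dc:title", "#SemTech begins again!"),
            (POST, "dc:created", "2011-06-06T19:43:35.000Z"),
            (POST, "sioc:topic", "http://twitlogic.fortytwo.net/hashtag/semtech"),
        }
    },
)

graph_authorities = message.authorities(GRAPH)
```

A receiver can look up who vouches for a named graph through the default graph before deciding what to do with the graph's content.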
22. The provenance trail
• keeping track of who said what
- publishers deserve credit for their content
- consumers need to decide what to believe
• RDFAgents specifies a provenance-preserving
transformation for each dataset received
- records “A said that B said X” relationships
between agents and graphs
• simple policies unravel the provenance trail to accept
or reject RDF statements
• provenance metadata also supports information
discovery
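One way to picture the provenance trail and an acceptance policy is the simplified Python sketch below. It assumes each claim carries a tuple of agent names instead of the named graphs and Semantic Web Publishing metadata that RDFAgents actually uses; `wrap` and `accept` are hypothetical helper names, and "every agent in the chain must be trusted" is just one possible policy.

```python
def wrap(sender, claims):
    """Provenance-preserving step: prefix every received claim's chain with the sender."""
    return [((sender,) + chain, statement) for chain, statement in claims]


def accept(claims, trusted):
    """Example policy: accept a statement only if every agent in its chain is trusted."""
    return {
        statement
        for chain, statement in claims
        if all(agent in trusted for agent in chain)
    }


# Agent C originates a statement, Agent B passes it on to Agent A.
original = [((), "Josh is a presenter at SemTech.")]
received_by_b = wrap("C", original)       # "C said X"
received_by_a = wrap("B", received_by_b)  # "B said that C said X"

accepted_trusting_both = accept(received_by_a, trusted={"B", "C"})
accepted_trusting_b_only = accept(received_by_a, trusted={"B"})
```

With both agents trusted the statement is accepted; if only Agent B is trusted, the same statement is rejected because Agent C appears in its chain.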
23. What can you do with RDFAgents?
• make real-time streams out of social data sources
- e.g. Twitter agents, ActivityStreams agents, etc.
• combine streams from different sources
• integrate with Web services, Linked Data and
SPARQL endpoints
• attach SPARQL-based filters for smart content feeds
• pipe the data into a triple store, publish it as Linked
Data, power your real-time applications
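Combining and filtering streams can be sketched as follows, assuming each stream is already ordered by timestamp and each item is a `(timestamp, source, text)` tuple. The sample items are made up for illustration (only the first tweet text comes from the example dataset), and a plain Python predicate stands in for a SPARQL-based filter.

```python
import heapq


def combine(*streams):
    """Merge time-ordered streams into one time-ordered stream."""
    # Tuples compare element by element, so items are ordered by timestamp first.
    return heapq.merge(*streams)


def filtered(stream, predicate):
    """Keep only the items for which the predicate returns True."""
    return (item for item in stream if predicate(item))


# Hypothetical sample data; ISO 8601 timestamps in one format sort correctly as strings.
twitter_stream = [
    ("2011-06-06T19:43:35Z", "twitter", "#SemTech begins again!"),
    ("2011-06-08T09:00:00Z", "twitter", "talk about to start at #SemTech"),
]
activity_stream = [
    ("2011-06-07T08:00:00Z", "activitystreams", "checked in at the hotel"),
    ("2011-06-07T12:00:00Z", "activitystreams", "posted a review of SemTech"),
]


def mentions_semtech(item):
    """Stand-in for a smarter, SPARQL-based filter."""
    return "semtech" in item[2].lower()


feed = list(filtered(combine(twitter_stream, activity_stream), mentions_semtech))
```

The resulting `feed` interleaves both sources in time order and drops the item that does not mention SemTech.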
24. RDFAgents implementation
• implemented in Java with JADE and Sesame
- LEAP kernel for mobile devices
- all of the usual RDF formats incl. TriG, N-Quads
- transport protocols: XMPP, HTTP, IIOP
• open-source
• in progress: high-throughput RDF data streams
with AllegroGraph
• in progress: Droidspeak library for Android OS
25. Conclusion
• Semantic Web is moving into real-time and ubiquitous
environments, but
• common interaction models are needed
• Semantic Web agents are particularly appropriate
• RDFAgents = Semantic Web + FIPA for transport-agnostic,
peer-to-peer queries and data streams with
provenance
26. Thanks!
• RDFAgents:
• http://fortytwo.net/2011/rdfagents/spec
• https://github.com/joshsh/rdfagents
• Tetherless World Constellation: http://tw.rpi.edu
• Franz Inc: http://www.franz.com
• Institute of Automation: http://english.ia.cas.cn
• TinkerPop: http://tinkerpop.com
• Contact:
• josh@fortytwo.net, @joshsh