The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
Structured Occurrence Network for provenance: talk for ipaw12 paper
1. Modelling provenance using
Structured Occurrence Networks
Paolo Missier, Brian Randell, Maciej Koutny
Newcastle University, UK
IPAW’12
Santa Barbara, CA, June 2012
Thursday, June 21, 2012
2. Petri Nets and Occurrence Nets
Historically, Occurrence Nets have been used to define
process semantics of Petri Nets [2]
IPAW 2012 - P.Missier
[2] Best, Eike, and Raymond Devillers. “Sequential and concurrent behaviour in Petri net theory.” Theoretical Computer
Science 55, no. 1 (1987): 87-136. http://www.sciencedirect.com/science/article/pii/0304397587900909.
2
Thursday, June 21, 2012
3. Occurrence Nets
ON is about capturing causal relationships
between events and conditions
ON = (C, E, F ) with nodes C E and a flow relation F ⊆ (C ×E)∪(E × C) satisfying the following:
(i) for every condition c there is at most one event e such that (e,c) ∈ F, and at most one event f such that (c,f) ∈ F;
(ii) for every event e there is at least one condition c such that (c, e) ∈ F , and at least one condition d such that (e,
d) ∈ F ;
(iii) ON forms an acyclic graph and F + is a partial order relation (the relation PrecON = (F ◦ F)|C×C is acyclic)
IPAW 2012 - P.Missier
3
Thursday, June 21, 2012
4. Provenance and Occurrence Nets
• "Provenance is defined as a record that • An Occurrence Net is an abstract record of
describes the people, institutions, a single execution of some computing
entities, and activities involved in system
producing, influencing, or delivering a • Only information about causality and
piece of data or a thing." [PROV] concurrency between events and visited
local states is represented
IPAW 2012 - P.Missier
4
Thursday, June 21, 2012
5. Provenance and Occurrence Nets
• "Provenance is defined as a record that • An Occurrence Net is an abstract record of
describes the people, institutions, a single execution of some computing
entities, and activities involved in system
producing, influencing, or delivering a • Only information about causality and
piece of data or a thing." [PROV] concurrency between events and visited
local states is represented
Data: The evolution of variable A as a ON:
IPAW 2012 - P.Missier
4
Thursday, June 21, 2012
6. Provenance and Occurrence Nets
• "Provenance is defined as a record that • An Occurrence Net is an abstract record of
describes the people, institutions, a single execution of some computing
entities, and activities involved in system
producing, influencing, or delivering a • Only information about causality and
piece of data or a thing." [PROV] concurrency between events and visited
local states is represented
Data: The evolution of variable A as a ON:
Agents: The evolution of Bob, the document editor:
read drafted ready
performed verified
ptd paper internal to
exp. results draft
p1 memo
IPAW 2012 - P.Missier
read
paper
p2
4
Thursday, June 21, 2012
7. From ON to Structured ON
ON is an adequate starting point, but extensions are needed to
represent the activity of complex systems
• Structured Occurrence Nets provide these extensions as new
relationships amongst multiple ONs [2]
IPAW 2012 - P.Missier
[2] Koutny, M., and B Randell. “Structured Occurrence Nets: A Formalism for Aiding System Failure Prevention and
Analysis Techniques.” Fundamenta Informaticae 97 (2009).
5
Thursday, June 21, 2012
8. From ON to Structured ON
ON is an adequate starting point, but extensions are needed to
represent the activity of complex systems
• Structured Occurrence Nets provide these extensions as new
relationships amongst multiple ONs [2]
IPAW 2012 - P.Missier
[2] Koutny, M., and B Randell. “Structured Occurrence Nets: A Formalism for Aiding System Failure Prevention and
Analysis Techniques.” Fundamenta Informaticae 97 (2009).
5
Thursday, June 21, 2012
9. Provenance modelling patterns using SON
Main goal of this work:
to explore the use of Structured Occurrence Nets
as a formal model of provenance
• Data is viewed as an evolving system
• Agents are also evolving systems, thus their evolution is also captured
• Uniform representation of Agents / Activity / Data interplay
• Representation of multi-layered Provenance
– eg:the provenance of a workflow run is underpinned by system-level provenance
– SONs are suitable for modelling at multiple levels of abstraction
IPAW 2012 - P.Missier
Expected by-product of the investigation:
suggest enhancements to the current SON model
6
Thursday, June 21, 2012
10. Talk outline
• C-SON: ON + communication relation
– a provenance pattern to represent workflow patterns
• T-SON: C-SON + temporal abstraction
• (more abstraction relations omitted in the talk)
• Modelling multi-layered provenance
• Agents as evolving systems
– provenance of agents
IPAW 2012 - P.Missier
7
Thursday, June 21, 2012
11. Communication SONs
• Goal: to capture communication between ONs that otherwise proceed
concurrently
• by introducing a Communication relation amongst activities in two ONs
– induces a partial order on the states and conditions of the two nets
– must result in an acyclic net
e1 cannot have happened after f1
e2, f2 must have happened simultaneously
IPAW 2012 - P.Missier
8
Thursday, June 21, 2012
12. Communication SONs
• Goal: to capture communication between ONs that otherwise proceed
concurrently
• by introducing a Communication relation amongst activities in two ONs
– induces a partial order on the states and conditions of the two nets
– must result in an acyclic net
e1 cannot have happened after f1
e2, f2 must have happened simultaneously
IPAW 2012 - P.Missier
8
Thursday, June 21, 2012
13. C-SON at work - a first provenance pattern
• Each ON models one variable as a system.
• The composed activity “A:=A+1; A:=A+B; B:=A+B” is represented as a SON
composed of a pair of communicating ONs
IPAW 2012 - P.Missier
• The resulting SON captures a (partial) order of possible reads and writes
leading to the final result
9
Thursday, June 21, 2012
14. Workflow execution traces using C-SON
X= 10
Y:= f(X)
(a) (b)
Program Execution
specification instance f()
<X,Z> := g(X,Y) Y = "200"
g()
X = 20 Z = "20"
10 r 10 r 10 w 20
X
(c)
SON representation of
Execution trace f g
Program
execution
IPAW 2012 - P.Missier
w 20
w 200 r 200
Z
Y
10
Thursday, June 21, 2012
15. Temporal SON (T-SON)
• Goal: to replace part of an ON with new “atomic” actions
– a form of temporal abstraction
– these new actions only appear atomic at one level of abstraction
IPAW 2012 - P.Missier
11
Thursday, June 21, 2012
16. T-SON in action: multi-layered provenance
• Provenance of computed data naturally comes in layers:
– program execution (incl. workflow controller)
• system-level primitives, network protocols
10 r 10
t t 10 r 10
t
X
10 fetch send 10
X
f
Program
execution
get compute
t t t
f
IPAW 2012 - P.Missier
Program
execution
Temporal abstraction used to hide/reveal lower-level details of both
12 systems and their interaction (i.e., reading from storage)
Thursday, June 21, 2012
17. T-SON: multi-layered provenance
Additional unfolding:
- assume f is a service call
- reveal details of communication with an underlying service implementation
10 r 10
X
f
Program
execution
Service
execution
IPAW 2012 - P.Missier
13
Thursday, June 21, 2012
18. T-SON: multi-layered provenance
10 r 10
t t
t
msg msg
receive send
10 fetch send 10
Service
X
execution
msg msg
get receive
prep send
t t
t
f
IPAW 2012 - P.Missier
Program
execution
14
Thursday, June 21, 2012
19. Agents as evolving systems
Alice and Bob collaborate on document editing
An agent performs activities that account for changes in the state of the data
• F is data (a file)
• Data and agents are modelled as systems
• Each system is a SON
• Communication abstraction used to synchronize events across different
IPAW 2012 - P.Missier
systems
• Bob.draft → F.write, F.read → Alice.”read draft” etc.
• For example, Bob in the state b2 is unaware of Alice’s feedback
• Bob in state b3 has read Alice’s feedback
15
Thursday, June 21, 2012
20. Agents as evolving systems
• Tracking the state of agents is important
– we know that it is the “version” of Bob that is aware of Alice’s comment that is
responsible for the latest edit
Below: Bob’s edits to the document do not take account of Alice’s comments
ptd draft b2 edit b4
Bob
w f1 r f1 w f2 r f2 w f3
IPAW 2012 - P.Missier
F
read leave
a1 a2
draft comments
16 Alice
Thursday, June 21, 2012
21. Summary
• SONs extend well-known Occurrence Nets
• Simple graphical notation with formal grounding
– amenable to various types of analysis
• Appealing for capturing system evolution and inter-system
communication
• Accommodates multiple levels of abstraction
• Evolution traces of Data, Agents and Activities provenance queried
seamlessly
• Some software tool support (in progress), more work needed here
• Feedback for customizations of the SON model itself!
IPAW 2012 - P.Missier
17
Thursday, June 21, 2012
22. Selected References
• V. Khomenko, M. Koutny, A. Yakovlev: Logic Synthesis for Asynchronous
Circuits Based on Petri Net Unfoldings and Incremental SAT. Fundamenta
Informaticae 70, 2006.
• http://www.cs.ncl.ac.uk/publications/inproceedings/papers/749.pdf
• M. Koutny, B. Randell: Structured Occurrence Nets: A Formalism for
Aiding System Failure Prevention and Analysis Techniques. Fundamenta
Informaticae 97, 2009.
• http://www.cs.ncl.ac.uk/publications/trs/papers/1162.pdf
• B. Li, M. Koutny: Verification and Simulation Tool for Communication Structured
Occurrence Nets. CS-TR, Newcastle University, (to appear).
• P.M. Merlin and B. Randell: State Restoration in Distributed Systems. Proc.
FTCS-8, 1978.
• http://www.cs.ncl.ac.uk/publications/inproceedings/papers/347.pdf
• I. Poliakov, V. Khomenko, A. Yakovlev: Workcraft - A Framework for
Interpreted Graph Models. LNCS 5606, 2009.
• B. Randell, M. Koutny: Failures: Their Definition, Modelling and Analysis. LNCS
4711, 2007.
• http://www.cs.ncl.ac.uk/publications/trs/papers/994.pdf
IPAW 2012 - P.Missier
• B. Randell, M. Koutny: Structured Occurrence Nets: Incomplete,
Contradictory and Uncertain Failure Evidence. CS-TR 1170, Newcastle
University, 2009.
• http://www.cs.ncl.ac.uk/publications/trs/papers/1170.pdf
18 Martinique, Jan. 2012
24
Thursday, June 21, 2012
23. IPAW 2012 - P.Missier EXTRA material
19
Thursday, June 21, 2012
24. SON provenance and PROV
read
b2 b3 edit b4
feedback
Bob (a)
f2 r f2 r f2 w f3
F
used wasGeneratedBy
f2 edit wa f3 (b)
sG
en
era
used wasAssociatedWith ted
By
read wasGeneratedBy
feedback Bob_b3 wasDerivedFrom Bob_b4
IPAW 2012 - P.Missier
wasAssociatedWith
wasDerivedFrom
Bob_b2
20
Thursday, June 21, 2012
25. T-SON and finite-duration activities
In Occurrence Nets, events (activities) are instantaneous.
Using temporal abstraction one can add a time duration to activities
[s,e]
Activity f f
with duration [s,e]
t t t t t t
s e
c duration interval of f c
t1 t2 tn
Time
IPAW 2012 - P.Missier
21
Thursday, June 21, 2012
26. SON -- behavioural abstraction
• Insight: ‘system’ and ‘state’ are not separate concepts
• The same graph node may be used to represent either a state or a
system, at different levels of abstraction
Key idea of Structured Occurrence Nets:
• multiple ONs, each portraying a system/state at some level of abstraction
IPAW 2012 - P.Missier
• associations amongst the ONs are then established by means of new
relation types
22
Thursday, June 21, 2012
27. B-abstraction in use
– An application of behavioural abstraction -- state/system duality
– Bob’s state b1 expands into a system’s activities (b1)
– the expansion provides additional insight into Bob’s “knowledge level” /
“preparedness” prior to drafting the document
IPAW 2012 - P.Missier
23 Bob’s state as a system
Thursday, June 21, 2012
28. Finding good abstractions - cuts in ON and SON
(a)
(d)
(b)
(e)
IPAW 2012 - P.Missier
(c)
24
Thursday, June 21, 2012