Slides for our paper "Connecting the Smithsonian American Art Museum to the Linked Data Cloud," presented at the 10th Extended Semantic Web Conference (ESWC) in Montpellier, May 2013. http://eswc-conferences.org/sites/default/files/papers2013/szekely.pdf
1. Connecting the Smithsonian American Art Museum to the Linked Data Cloud
Pedro Szekely, Craig A. Knoblock, Fengyu Yang, Xuming Zhu,
Eleanor E. Fink, Rachel Allen, and Georgina Goodlander
University of Southern California, Los Angeles, California, USA
Nanchang Hangkong University, Nanchang, China
Smithsonian American Art Museum, Washington, DC, USA
http://www.isi.edu/integration/karma
2. "The Smithsonian American Art Museum is a museum in Washington, D.C. which has one of the world's largest and most inclusive collections of art, from the colonial period to the present, made in the United States." (Wikipedia)
4. Problem
SAAM Data
What ontology to use?
Structure mismatches
Data consistency
What to link to?
100% precision
How to enable museums to do this themselves?
Pedro Szekely and Craig Knoblock, University of Southern California
5. Steps to Create Linked Data
• Map data to RDF
… select ontologies
… define mappings
• Link to external resources
… identify the links
• Curate the Linked Data
… museums demand 100% correctness
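The "map data to RDF" step can be illustrated with a minimal sketch. This is not Karma's API or the authors' mapping: the column names, the `COLUMN_TO_PROPERTY` dictionary, and the helper functions are hypothetical, and only the prefixed property names are borrowed from the Turtle examples shown elsewhere in these slides.

```python
# Hypothetical mapping from database columns to RDF properties.
# Column names are invented for illustration; the property names
# are the ones that appear in the Turtle examples in this deck.
COLUMN_TO_PROPERTY = {
    "ConstituentID": "saam:constituentId",
    "DisplayName": "skos:prefLabel",
    "BeginDate": "rdaGr2:dateOfBirth",
    "EndDate": "rdaGr2:dateOfDeath",
}


def escape_literal(value):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"')


def row_to_turtle(row):
    """Render one table row as a Turtle description of an aac-ont:Person."""
    subject = f"saam:Person_{row['ConstituentID']}"
    statements = ["    a aac-ont:Person"]
    for column, prop in COLUMN_TO_PROPERTY.items():
        value = row.get(column)
        if value is None or value == "":
            continue  # skip empty cells instead of emitting empty literals
        statements.append(f'    {prop} "{escape_literal(value)}"')
    return subject + "\n" + " ;\n".join(statements) + " .\n"


# Hypothetical input row.
row = {
    "ConstituentID": "4253",
    "DisplayName": "John Singer Sargent",
    "BeginDate": "1856-1-12",
    "EndDate": "1925-4-15",
}
turtle = row_to_turtle(row)
print(turtle)
```

In a real pipeline the mapping would be defined against the selected ontologies rather than hard-coded, which is the part the slides attribute to Karma.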
18. Evaluation of Data Mapping Using Karma
SAAM database: 8 tables, 29 columns
Ontologies: 407 classes, 105 data properties, 229 object properties

Run                            | # of times Karma’s top 4 suggestions contain the correct semantic type | # of times Karma correctly assigns object properties | Time (minutes)
Run 1: no training data        | 7 out of 29 (24%)                                                      | 30 out of 35 (85%)                                   | 18
Run 2: using Run 1 as training | 27 out of 29 (93%)                                                     | 32 out of 35 (91%)                                   | 8
20. Multiple “John Singer Sargent”
ima:Person_John_Singer_Sargent
a aac-ont:Person ;
dct:date "1856-1925" ;
foaf:name "John Singer Sargent" .
saam:Person_4253
a aac-ont:Person ;
aac-ont:associatedPlace
saam:SaamPlace_1357324439768t1r13950_0,
saam:SaamPlace_1357324439768t1r13951_0 ;
saam:constituentId "4253" ;
rdaGr2:biographicalInformation
"Painter. Sargent traveled …" ;
rdaGr2:dateAssociatedWithThePerson "1990-10-1", "1995-5-8" ;
rdaGr2:dateOfBirth "1856-1-12" ;
rdaGr2:dateOfDeath "1925-4-15" ;
rdaGr2:placeOfBirth saam:SaamPlace_1357324439768t1r13952_0 ;
rdaGr2:placeOfDeath saam:SaamPlace_1357324439768t1r13953_0 ;
foaf:name "John S. Sargent" ;
skos:altLabel "John S. Sargent" ;
skos:prefLabel "John Singer Sargent" .
cb:Person_John_Singer_Sargent
a aac-ont:Person ;
ont0:dateOfBirth "1879", "1885" ;
ont0:dateOfDeath "1925" ;
foaf:name "John Singer Sargent" .
met:Person_John_Singer_Sargent
a aac-ont:Person ;
ont0:placeOfResidence
"North and Central America",
"United States" ;
foaf:name "John Singer Sargent" .
dallas:Person_John_Singer_Sargent
a aac-ont:Person ;
ont0:dateOfBirth "1856" ;
ont0:dateOfDeath "1925" ;
foaf:name "John Singer Sargent" .
21. John Singer Sargent
ima:SaamPerson_John_Singer_Sargent
a saam:SaamPerson ;
dct:date "1856-1925" ;
foaf:name "John Singer Sargent" .
saam:SaamPerson_4253
a saam:SaamPerson ;
saam:associatedPlace
saam:SaamPlace_1357324439768t1r13950_0,
saam:SaamPlace_1357324439768t1r13951_0 ;
saam:constituentId "4253" ;
rdaGr2:biographicalInformation
"Painter. Sargent traveled …" ;
rdaGr2:dateAssociatedWithThePerson "1990-10-1", "1995-5-8" ;
rdaGr2:dateOfBirth "1856-1-12" ;
rdaGr2:dateOfDeath "1925-4-15" ;
rdaGr2:placeOfBirth saam:SaamPlace_1357324439768t1r13952_0 ;
rdaGr2:placeOfDeath saam:SaamPlace_1357324439768t1r13953_0 ;
skos:altLabel "John S. Sargent" ;
skos:prefLabel "John Singer Sargent" .
cb:SaamPerson_John_Singer_Sargent
a saam:SaamPerson ;
ont0:dateOfBirth "1879", "1885" ;
ont0:dateOfDeath "1925" ;
skos:prefLabel "John Singer Sargent" .
met:SaamPerson_John_Singer_Sargent
a saam:SaamPerson ;
ont0:placeOfResidence
"North and Central America",
"United States" ;
foaf:name "John Singer Sargent" .
dallas:SaamPerson_John_Singer_Sargent
a saam:SaamPerson ;
ont0:dateOfBirth "1856" ;
ont0:dateOfDeath "1925" ;
foaf:name "John Singer Sargent" .
22. Linking “John Singer Sargent”
saam:Person_4253
owl:sameAs cb:Person_John_Singer_Sargent ;
owl:sameAs dallas:Person_John_Singer_Sargent ;
owl:sameAs ima:Person_John_Singer_Sargent ;
owl:sameAs met:Person_John_Singer_Sargent ;
owl:sameAs dbpedia:John_Singer_Sargent ;
owl:sameAs nytimes:N49129220686803623753 ;
owl:sameAs w-flick:John_Singer_Sargent ;
...
.
23. Intuition
Estimate the discrimination power of properties, e.g., name, birth date, and death date, by counting the people for every combination of dates:
birth date | death date | # of people
…          | …          | …
1800       | 1820       | 147
1800       | 1821       | 284
1800       | 1822       | 213
…          | …          | …
Similar idea to: Song, D., Heflin, J.: Domain-independent entity coreference for linking ontology instances. ACM Journal of Data and Information Quality (ACM JDIQ) (2012)
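The intuition can be sketched in a few lines: count how many people share each (birth date, death date) combination, and treat rare combinations as more discriminating. The scoring function below (negative log of the fraction of people sharing the pair) is an assumption made for illustration, not the formula used by the authors or by Song and Heflin, and the toy population is invented.

```python
import math
from collections import Counter


def date_pair_counts(people):
    """Count people for every (birth, death) combination."""
    return Counter((p["birth"], p["death"]) for p in people)


def discrimination(counts, birth, death):
    """Assumed score: -log of the fraction of people sharing this date pair.

    Rare pairs get high scores; a pair that was never observed returns None.
    """
    total = sum(counts.values())
    n = counts.get((birth, death), 0)
    if n == 0:
        return None
    return -math.log(n / total)


# Hypothetical toy population: three people share 1800-1820, one has 1856-1925.
people = [
    {"birth": 1800, "death": 1820},
    {"birth": 1800, "death": 1820},
    {"birth": 1800, "death": 1820},
    {"birth": 1856, "death": 1925},
]
counts = date_pair_counts(people)
common = discrimination(counts, 1800, 1820)
rare = discrimination(counts, 1856, 1925)
print(common < rare)
```

A matching birth and death date then counts as stronger evidence for a link when few other people share that combination.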
24. Evaluation of Automatic Linking
SAAM names starting with “A” matched by hand
535 people, 176 matches
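The extracted slide text does not say which metrics were computed against the hand-matched set. Assuming the usual precision and recall over link pairs, a comparison against a gold standard could be sketched as follows; the identifiers and link pairs are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Compare predicted link pairs against hand-matched (gold) link pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical link pairs (source entity, target entity).
gold = {("ex:A1", "ex:X"), ("ex:A2", "ex:Y"), ("ex:A3", "ex:Z")}
predicted = {("ex:A1", "ex:X"), ("ex:A2", "ex:Y"), ("ex:A4", "ex:W")}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(precision, recall, f1)
```

Because museums ask for 100% correctness, a linker tuned this way would favor precision, leaving the remaining candidates to manual curation.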
25. Results of Automatic Linking
Getty ULAN® 2,110
Rijksmuseum 551
Geonames 3,068
DBpedia 2,194
New York Times 70
Estimate: ≈ 30 missing links to DBpedia
26. Curating Links with Karma
27. Linking with Karma
28. Results of automated linking and interactive curation are recorded using PROV.
owl:sameAs statements are constructed using SPARQL CONSTRUCT queries over PROV records.
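The slides do this step with SPARQL CONSTRUCT queries over PROV records. The sketch below is only a plain-Python analogue of that selection, meant to show the idea of deriving owl:sameAs statements from recorded decisions. The record layout (`source`, `target`, `agent`, `decision`, `time`) and all identifiers are hypothetical and do not reflect the authors' PROV modeling.

```python
def same_as_statements(records):
    """Keep the most recent decision per (source, target) pair and emit
    an owl:sameAs statement for every pair whose latest decision is 'approved'."""
    latest = {}
    for record in sorted(records, key=lambda r: r["time"]):
        # Later records overwrite earlier ones for the same candidate link.
        latest[(record["source"], record["target"])] = record
    return [
        f"{source} owl:sameAs {target} ."
        for (source, target), record in sorted(latest.items())
        if record["decision"] == "approved"
    ]


# Hypothetical provenance-style records: the automatic linker proposes two
# links, and a curator later rejects one of them.
records = [
    {"source": "ex:Person_1", "target": "ex:Candidate_A",
     "agent": "automatic", "decision": "approved", "time": 1},
    {"source": "ex:Person_1", "target": "ex:Candidate_B",
     "agent": "automatic", "decision": "approved", "time": 1},
    {"source": "ex:Person_1", "target": "ex:Candidate_B",
     "agent": "curator", "decision": "rejected", "time": 2},
]

statements = same_as_statements(records)
print("\n".join(statements))
```

Keeping the decisions as records and deriving the links from them means a curator's correction changes the published owl:sameAs statements without losing the history of who decided what.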
34. Related Work
• Europeana
• 17 million items, 1,500 institutions
• Require exports in “Europeana” format
• Amsterdam Museum, Museum Finland
• Rich ontology, RDF to RDF mapping rules
• LODAC museums in Japan
• 114 museums, simple ontology
• Research Space, British Museum
• CIDOC CRM ontologies, complex mappings
We focused significantly on link identification and curation.
35. Next Steps
• Applications leveraging linked data
• Virtual museum
• Tools to create multimedia stories about art
• Tools to find inconsistencies
• Feed data to Wikidata
• American Art Collective: a linked data
consortium of museums