Current standard web documents are designed to be presented to humans; machines cannot interpret the information they contain. The Semantic Web is organized in a structured way so that it is meaningful to both machines and humans. In this presentation, we suggest a framework that processes web documents and produces a machine-readable representation in RDF (Resource Description Framework), used together with OWL (Web Ontology Language).
Our suggested framework, which we call RS2 (RDF by Structured Reference to Semantics), takes an HTML document as input and extracts the plain text from it. The natural-language content of the plain text is then parsed to yield the subject, predicate and object of each sentence. These are looked up in the ontology to generate an RDF graph, which is the machine-intelligible semantic equivalent of the original human-readable text.
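The first stage, extracting plain text from HTML, can be sketched as follows. This is a minimal illustration using Python's standard `html.parser` module, not the RS2 prototype itself; the names `PlainTextExtractor` and `extract_plaintext` are hypothetical, and skipping `<script>` and `<style>` content is an assumption about what counts as plain text.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""

    SKIPPED = {"script", "style"}  # assumption: these carry no readable text

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_plaintext(html):
    """Return the readable text of an HTML string (hypothetical helper)."""
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The resulting string is what the later natural-language parsing stage would receive as input.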
RDF by Structured Reference to Semantics, the RS2 framework
1. Khulna University of Engineering & Technology, Department of Computer Science and Engineering. An Approach to Emerge Semantic Web. Khan Muhammad Nafee Mostafa | 0507007, Samiul Hoque Sourav | 0507035, Qudrat-E-Alahy Ratul | 0507037. Supervised by Rushdi Shams, Lecturer, CSE, KUET
2. Introduction. Web » a horde of valuable but unorganized and scattered documents. The web of documents is not intelligible to machines, but the web of linked data is. Semantic Web » web of data. RDF graph » semantics underlying the document. Bottleneck of the emerging Semantic Web » conversion of HTML to RDF.
3. Objective. Generate RDF from an HTML document. Suggest a framework titled ‘RDF by Structured Reference to Semantics’, or RS2 framework, to do so.
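10. Example: Mashrafe plays for Kolkata Knight Rider. Mashrafe’s nationality is Bangladeshi. Classes: Player, Cricket Team, Country. Instances: Mashrafe (instance of Player), Kolkata Knight Rider (instance of Cricket Team), Bangladesh (instance of Country). Relations: Mashrafe, play for, Kolkata Knight Rider; Mashrafe, nationality, Bangladesh.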
11. Architecture of RS2 framework. Stages: Extract plaintext; Parse Natural Language; Yield SPO; Lookup for Semantic equivalent; Generate RDF. Data passed between stages: text, plaintext, parse tree, Subject/Predicate/Object, Semantic Web entities for SPO.
16. Lookup semantic web entities. “I think KKR and Kolkata Knight Rider are different.” The same anomaly occurs for predicate and object.
18. Convert it to a machine-accessible form. Natural language: “KKR is located in Kolkata.” “Kolkata Knight Rider is situated at West Bengal.” RDF triple: Subject: Kolkata Knight Rider; Predicate: location; Object: Kolkata.
27. Future Work and Benefits. Future work: an application/framework to enhance Web Ontology from knowledge conceived from HTML documents; applications with Semantic Web features. Benefits: emergence of the Semantic Web; automatic conversion of piles of HTML into RDF graphs.
28. Conclusion. A framework and a prototype application to convert HTML documents into RDF. Eliminate the bottleneck in the emergence of the Semantic Web with RS2.
The Web today is like a horde of valuable documents, with humankind’s precious knowledge left unorganized and scattered. The web of documents is not intelligible to machines, but the web of linked data is. The Semantic Web is the web of data. An RDF graph is the semantic information underlying a document. The bottleneck of the emerging Semantic Web is the conversion of HTML to RDF.
Generate RDF from an HTML document. We are going to develop a framework titled ‘RDF by Structured Reference to Semantics’, or RS2 framework. RS2 will generate an RDF graph from the semantics yielded by an HTML document, by mapping them into an existing ontology.
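To make the final step concrete, the sketch below serializes subject-predicate-object tuples as RDF in the N-Triples syntax. It is an illustration only: the namespace `http://example.org/rs2/` and the helpers `to_iri` and `to_ntriples` are hypothetical, and a real mapping would use the IRIs defined by the chosen ontology instead of minting them from the words.

```python
from urllib.parse import quote

# Hypothetical namespace; a real system would take IRIs from its ontology.
BASE = "http://example.org/rs2/"


def to_iri(name):
    """Turn a label such as 'play for' into an IRI reference in angle brackets."""
    local = quote(name.strip().replace(" ", "_"))
    return "<" + BASE + local + ">"


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one N-Triples line each."""
    lines = []
    for subject, predicate, obj in triples:
        lines.append(f"{to_iri(subject)} {to_iri(predicate)} {to_iri(obj)} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples([
        ("Mashrafe", "play for", "Kolkata Knight Rider"),
        ("Mashrafe", "nationality", "Bangladesh"),
    ]))
```

Each output line is one edge of the RDF graph, so the collection of lines for a document is its graph in a form other tools can load.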
The RS2 framework needs external information from a lexicon, a mapper and an ontology.
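One way to picture the mapper's role is as a lookup from surface forms in the text to entities in the ontology, so that different wordings resolve to the same subject, predicate or object. The sketch below is an assumption about how such a lookup could work, not the RS2 design; the alias tables and the names `normalize`, `lookup` and `map_triple` are hypothetical.

```python
# Hypothetical alias tables; a real mapper would be backed by the ontology.
ENTITY_ALIASES = {
    "kkr": "Kolkata_Knight_Rider",
    "kolkata knight rider": "Kolkata_Knight_Rider",
    "kolkata": "Kolkata",
    "west bengal": "West_Bengal",
}
PREDICATE_ALIASES = {
    "is located in": "location",
    "is situated at": "location",
}


def normalize(term):
    """Lower-case, collapse whitespace and drop a trailing full stop."""
    return " ".join(term.lower().split()).rstrip(".")


def lookup(term, table):
    """Return the ontology entity for a surface form, or None if unknown."""
    return table.get(normalize(term))


def map_triple(subject, predicate, obj):
    """Map a raw SPO triple to ontology entities; None if any part is unknown."""
    mapped = (
        lookup(subject, ENTITY_ALIASES),
        lookup(predicate, PREDICATE_ALIASES),
        lookup(obj, ENTITY_ALIASES),
    )
    return None if None in mapped else mapped
```

With these tables, "KKR is located in Kolkata" and a sentence about "Kolkata Knight Rider" that uses "is situated at" both map to the same subject and the same `location` predicate.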
Parsing sentences from natural language takes several steps: (1) separate the sentences, since we parse one sentence at a time; (2) separate the words in the sentence; (3) POS tagging: find the part of speech of each word from the lexicon; (4) try to parse the sentence with a grammar, taking the parts of speech as input symbols; (5) if the parse succeeds, return the parse tree (syntax tree).
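The steps above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the RS2 parser: the lexicon holds only a handful of words, the grammar (S -> NP VP, NP -> NOUN+, VP -> VERB+ PREP? NP) is deliberately tiny, and all function names are hypothetical.

```python
import re

# Toy lexicon (assumption): maps lower-cased words to parts of speech.
LEXICON = {
    "mashrafe": "NOUN", "kkr": "NOUN", "kolkata": "NOUN",
    "plays": "VERB", "is": "VERB", "located": "VERB",
    "for": "PREP", "in": "PREP",
}


def split_sentences(text):
    """Step 1: split after sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Step 2: separate the words, dropping punctuation."""
    return re.findall(r"[A-Za-z']+", sentence)


def pos_tag(words):
    """Step 3: look up each word's part of speech; unknown words get 'UNK'."""
    return [(w, LEXICON.get(w.lower(), "UNK")) for w in words]


def take(tagged, start, tag):
    """Consume consecutive tokens carrying `tag`, beginning at `start`."""
    end = start
    while end < len(tagged) and tagged[end][1] == tag:
        end += 1
    return [w for w, _ in tagged[start:end]], end


def parse(tagged):
    """Steps 4-5: return a parse tree, or None if the grammar rejects the input."""
    subject, i = take(tagged, 0, "NOUN")
    verbs, i = take(tagged, i, "VERB")
    preps, i = take(tagged, i, "PREP")
    obj, i = take(tagged, i, "NOUN")
    if not (subject and verbs and obj) or len(preps) > 1 or i != len(tagged):
        return None
    return ("S", ("NP", subject), ("VP", ("V", verbs + preps), ("NP", obj)))


def parse_text(text):
    """Run all steps on each sentence of the text."""
    return [parse(pos_tag(tokenize(s))) for s in split_sentences(text)]
```

In a tree returned by `parse`, the first `NP` is the subject, the `V` node the predicate and the second `NP` the object, which is the information the SPO-yielding stage needs.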
An application or framework that enhances Web Ontology with knowledge drawn from HTML documents can be built on the RS2 framework. RS2 will help the emergence of a unified, giant global graph of linked data, which can enable many features of the Semantic Web. It will help convert the giant collection of HTML documents into RDF graphs of data, and applications can be built on the RDF graphs obtained in this way.
In this thesis we have tried to eliminate one of the greatest bottlenecks in the emergence of the Semantic Web. We have suggested a framework that takes an HTML web document as input and outputs an RDF graph of linked data. This will help us convert the web from a horde of documents into a web of linked data.