Use of Open Data in Hong Kong
Sammy Fung
sammy.hk
Incu-Lab ICE in StartMeUpHK - Open Data Initiative Gathering
2013/12/04
http://slidesha.re/1cleS2y
We want a better life with
public data.
We want an easier way to
access the public data.
Agenda
● What is Open Data?
● Use of Open Source Software in web crawling.
● Starting a new Open Source project, hk0weather, to create Open Weather Data.
Sammy Fung
● Software Developer
  – uses and develops open source software.
  – Perl → PHP → Python.
  – interested in Data Mining / Web Crawling.
  – owns a web and mobile technology startup.
Sammy Fung
● 15+ years in Open Source Communities.
  – Founding Chairman, Hong Kong Linux User Group.
  – Founding Chairman, Open Source Hong Kong.
  – Member, GNOME Asia committee.
  – Mozilla Representative
  – Member, program committee at COSCUP
    ● Conference for Open Source Coders, Users and Developers.
    ● Largest open source conference in Taiwan.
What is Open Data?
Open Data
Three Laws of Open Government Data by David Eaves.
1. If it can't be spidered or indexed, it doesn't exist.
2. If it isn't available in open and machine readable format, it can't engage.
3. If a legal framework doesn't allow it to be repurposed, it doesn't empower.
http://eaves.ca/2009/09/30/three-law-of-open-government-data/
Open Data
● Tim Berners-Lee, the inventor of the Web.
  – 5stardata.info
  – 5 star deployment scheme of Open Data.
5 Star Open Data
* One Star: make your stuff available on the Web (whatever format) under an open license.
** Two Star: make it available as structured data (e.g., Excel instead of image scan of a table).
*** Three Star: use non-proprietary formats (e.g., CSV instead of Excel).
**** Four Star: use URIs to denote things, so that people can point at your stuff.
***** Five Star: link your data to other data to provide context.
5stardata.info by Tim Berners-Lee, the inventor of the Web.
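As a rough illustration of the first three stars, the sketch below rates a dataset by its file format alone. The format-to-star table and the function name `stars_for_format` are assumptions made for this example and are not part of 5stardata.info. Stars four and five depend on URIs and links, which a file extension cannot show, so they are left out.

```python
# Hypothetical mapping from file format to the highest star (1-3) it can reach,
# assuming the data is already on the Web under an open license.
FORMAT_STARS = {
    "jpg": 1, "png": 1, "pdf": 1,   # on the Web, but not structured
    "xls": 2, "mdb": 2,             # structured, but proprietary
    "csv": 3, "json": 3, "xml": 3,  # structured and non-proprietary
}


def stars_for_format(fmt):
    """Return the star level for a format name; unknown formats get 1 star."""
    return FORMAT_STARS.get(fmt.strip().lower().lstrip("."), 1)


if __name__ == "__main__":
    for fmt in ("PNG", "xls", ".csv"):
        print(fmt, "*" * stars_for_format(fmt))
```

A format alone never proves the license or the usefulness of the content, so this is only a first filter.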
Open Data in Hong Kong
● Data.One
  – http://www.gov.hk/en/theme/psi
  – released on 2011/3/31.
  – First App Competition on Data.One
    ● Call for Submission now till 2014/02/28.
Weather Information in Hong Kong
● Hong Kong Observatory
  – Hourly Hong Kong Weather Report
  – Regional Weather in Hong Kong (10 min updates)
  – Weather Forecast and Weekly Weather Forecast
  – Typhoon Report and Forecast
Hong Kong Observatory RSS
Weather at Data.One
● I posted a blog entry, 'Progress of Open Government Data in Hong Kong', on 2013/01/17.
● Weather at Data.One provides 7 dataset URLs, which return RSS (XML) format (Eng/TChi/SChi).
  – One word: Useless.
  – The Data.One dataset (RSS) is completely different from HKO's own paid service (XML).
Weather at Data.One
● Example - Current local weather report:
● Plain text report in RSS.
● Difference in how the report content is wrapped:
  – Website: a pair of HTML tags, e.g. <PRE>....</PRE>.
  – Data.One: a pair of RSS description tags, <description>....</description>.
● Other weather data is missing, e.g. regional temperature updates every 12 mins.
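To make the point concrete, here is a minimal sketch of reading the report from both wrappers. The sample HTML and RSS strings and the two function names are made up for illustration; the real HKO page and Data.One feed are longer and may be structured differently.

```python
import re
import xml.etree.ElementTree as ET

# Made-up miniature samples, not the real page or feed.
SAMPLE_HTML = "<html><body><pre>Air temperature : 17 degrees Celsius</pre></body></html>"
SAMPLE_RSS = (
    "<rss version='2.0'><channel><item>"
    "<description>Air temperature : 17 degrees Celsius</description>"
    "</item></channel></rss>"
)


def report_from_html(html):
    """Return the text inside the first <pre>...</pre> pair, or None."""
    match = re.search(r"<pre>(.*?)</pre>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def report_from_rss(rss):
    """Return the text of the first item's <description>, or None."""
    node = ET.fromstring(rss).find("./channel/item/description")
    return node.text.strip() if node is not None and node.text else None
```

Either way the program ends up with the same plain-text report, which still has to be parsed before it becomes data.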
Weather at Data.One
● Weather at Data.One is a 'report', not 'data'.
● The weather RSS was already released by HKO before the launch of Data.One.
● Technically, JSON/XML formats are easier for computer programs to read.
Data.One
● In November 2013, 43 datasets are available.
  – JSON/XML = 18
  – RSS = 10
  – XLS = 6
  – CSV = 4
  – JPG/PNG = 3
  – HTML/MDB = 2
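The counts above can be checked and turned into shares with a few lines of Python. This is only a sketch: the dictionary copies the figures from the slide, and the helper name `share` is made up for this example.

```python
# Dataset counts by format, as listed on the slide (November 2013).
DATASETS = {
    "JSON/XML": 18,
    "RSS": 10,
    "XLS": 6,
    "CSV": 4,
    "JPG/PNG": 3,
    "HTML/MDB": 2,
}

total = sum(DATASETS.values())


def share(fmt):
    """Percentage of all datasets published in the given format group."""
    return round(100.0 * DATASETS[fmt] / total, 1)


if __name__ == "__main__":
    for fmt in DATASETS:
        print(fmt, share(fmt), "%")
```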
Data.One
● JSON/XML (18 datasets)
  – Air Pollution.
    ● Past 24-hour Air Pollution Index from stations.
  – Approved Charitable Fund-raising Activities
  – Restaurant and Food Licences.
  – Details of facility locations.
  – Reward Notices from Police Force.
  – Marine Traffic (Arrival/Departure).
  – Traffic Speed and special news.
  – EventHK information.
Data.One
● RSS (10 datasets)
  – Weather Information (7 datasets)
  – Beach Water Quality (1 dataset)
  – Current Air Pollution Index range and forecast (2 datasets)
Data.One
● JPG/PNG (3 datasets)
  – Exhibition gallery of government building projects.
  – Speed map panels.
  – Traffic snapshot images.
Data.One
● CSV
  – Locations of Public Facility and GovWifi
  – Past Record of Air Pollution Index
  – Marine Shipping directory of HK
● HTML
  – HTML version of Marine Traffic.
● XLS, MDB
  – 2011 Population Census.
  – Property Market Statistics.
  – Monthly Digested Stats and Registers of Auth Persons from Building Dept.
  – Routes and fares of public transport.
Data.One
● Many departments do not release their useful data, and only release the information already available on their websites.
  – Few of them maintain open data on their own.
● Most of them do not understand what 'real' open data is.
  – Open data formats instead of proprietary data formats.
  – Data instead of information.
  – Usefulness of data.
● Some departments should manage their open data in a better data structure.
Legco Meeting Minutes
and Voting Results
● In October 2013, LegCo started to publish voting results of the House Committee in XML.
● It is not part of the Data.One project.
● My open source software for the LegCo vote result XML:
  – http://github.com/sammyfung/legcovotes
Digital21 Strategy
Public Consultation Document
(G) Public Sector Information (PSI) as Default
"34. Through different channels (like press releases, publications, websites, etc.), the
Government releases a lot of information in different areas. However, most of such
information can only be read but cannot be used. In view of the immense benefits of
widening access to PSI for free and easy re-use, we propose to make all Government
information released for public consumption machine-readable by default. Where
appropriate, datasets will be released with application programming interfaces (APIs),
providing predefined functions to make their retrieval easier."
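(G) Making Public Sector Information Widely Available (translated from the Chinese version)
"34. The Government releases a large amount of information in different areas through various
channels (e.g. press releases, publications, websites). However, most of this information can only
be read and cannot be used. In view of the great benefits of opening up public sector information
for free re-use, we propose that all government information released for public use be compiled in
digital format. Where applicable, application programming interfaces will be released together
with the data to provide preset functions, so that the public can retrieve the data easily."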
Digital21 Strategy
Public Consultation Document
"33. PSI datasets can be used and meshed together to create innovative new applications, as
demonstrated by the creative and useful products and services developed from PSI in Hong Kong
and around the world. For example, using PSI datasets on traffic snapshot images, a number of
mobile apps have been developed to provide real-time traffic situation for users to avoid traffic jams
in planning their traffic routes. Experience from other developed economies shows that widening
access to PSI datasets can open up lucrative business opportunities and bring social benefits. By
tapping the creativity of the community and entrepreneurs, the use of PSI can lead to positive social
outcomes. For instance, in some cities in the United States, application of PSI on hygiene inspections
has led to a significant drop in food poisoning incidents."
Digital21 Strategy
Public Consultation Document
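(Translated from the Chinese version)
"33. As seen from the practical and creative products and services developed from public
sector information in Hong Kong and around the world, public sector information can be
used individually or in combination to develop innovative applications. For example, a
number of mobile apps have been developed from the public data of traffic snapshot images
to provide real-time traffic information, so that users can plan their routes and avoid
traffic congestion. Experience from other economies shows that opening up public sector
information for wide public use can open up profitable business opportunities and bring
benefits to society. By opening up public sector information, we can draw on the creativity
of citizens and entrepreneurs to benefit society. For example, in some cities in the United
States, the number of food poisoning incidents dropped significantly after public data on
hygiene inspections was opened up for use."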
Digital21 Strategy
Public Consultation Document
"35. Apart from Government data, there are vast amounts of PSI handled,
collected and disseminated by public organisations, which are equally useful
for the development of innovative services and products. Therefore, we
propose to encourage public organisations (e.g. public utilities and transport
operators) to release data owned by them in machine-readable format."
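(Translated from the Chinese version)
"35. Apart from Government information, Hong Kong also has a large amount of public sector
information handled, collected and disseminated by public organisations, which is equally
useful for developing innovative services and products. We therefore propose to encourage
public organisations (e.g. public utilities and transport operators) to release information
compiled in digital format."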
Open Data is important to citizens.
Use of Open Source
Software in web
crawling
Web Scraping
● a computer software technique of extracting information from websites. (Wikipedia)
● for business, hobbies, research purposes.
Web Scraping
● Look for the right URLs to scrape.
● Look for the right content in webpages.
● Save the data into a data store.
● When to run the web scraping program?
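The first three steps can be sketched with the Python standard library alone. Everything here is an assumption for illustration: the example.org URLs, the page content, the table layout and the function names are made up, and the pages are held in a dictionary so the sketch runs without network access.

```python
import re
import sqlite3

# Stand-in for downloaded pages so the sketch runs offline (made-up content).
PAGES = {
    "http://example.org/weather/current.htm": "<pre>Temperature: 17 C</pre>",
    "http://example.org/about.htm": "<p>About us</p>",
}


def pick_urls(urls):
    """Step 1: keep only the URLs worth scraping."""
    return [u for u in urls if "/weather/" in u]


def extract_temperature(html):
    """Step 2: pull the wanted value out of the page, or None if absent."""
    match = re.search(r"Temperature:\s*(-?\d+)", html)
    return int(match.group(1)) if match else None


def save(conn, url, temperature):
    """Step 3: store the value in a data store (SQLite here)."""
    conn.execute("CREATE TABLE IF NOT EXISTS readings (url TEXT, temperature INTEGER)")
    conn.execute("INSERT INTO readings VALUES (?, ?)", (url, temperature))
    conn.commit()


def run(conn):
    """Run steps 1 to 3 over the stand-in pages."""
    for url in pick_urls(PAGES):
        value = extract_temperature(PAGES[url])
        if value is not None:
            save(conn, url, value)
```

The fourth question, when to run the program, is usually answered outside the code, for example by a scheduler such as cron that calls `run` at the interval the source updates.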
Use of Open Source Software in
Web Crawling
● Use Open Source Tools to collect useful and meaningful machine-readable data.
● No need to wait for the provider to release data in machine-readable format.
Open Source Tools
● Python programming language
● with Regular Expression library
● Scrapy web crawling framework
Why python + scrapy?
● python: my favourite programming language for the past few years.
● scrapy: web crawling framework written in Python.
What is Scrapy ?
● An open source web scraping framework for Python.
● Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
Scrapy Features
● define the data you want to scrape
● write a spider to extract the data
● Built-in: selecting and extracting data from HTML and XML
● Built-in: JSON, CSV, XML output
● Interactive shell console
● Built-in: web service, telnet console, logging
● Others
Programme List of Paid TVs in 2004
● I wanted to know which channel was showing a live football match.
● Paid TV web site = M$ + IIS + ASP + Flash
● Slow....... Very Slow...... Extremely Slow!
● Couldn't connect during peak hours!
● Wrote my first web crawler in PHP in 2004.
Public Transportation in 2006-2010
● Kowloon Motor Bus (KMB)
  – No map view for a bus route
● Public Transportation Enquiry System (PTES)
  – Extremely poor, ugly (or much worse) map UI on PTES.
HK Observatory and Joint Typhoon
Warning Center
● Is any typhoon coming to Hong Kong? And when will it come?
● No easy data exchange format.
● No RSS nor ATOM.
● We don't check websites every day.
My Products
● WeatherHK ← ← ←
● TCTrack
WeatherHK
● http://twitter.com/weatherhk
● hourly current weather report
● weather forecast report
● tropical cyclone warning signals
WeatherHK
● Backend: Python + Scrapy + Database + Twitter + NNTP......
● Frontend: Twitter + Newsgroup
WeatherHK
● http://twitter.com/weatherhk
● Interviewed by MetroPop in 2009.
My Products
● WeatherHK
● TCTrack ← ← ←
TCTrack
● http://sammy.hk/projects/tctrack/tctrack.php
● Plot TC current and forecast tracks over Google Map.
● Source:
  – JTWC
  – HKO
TCTrack
● http://sammy.hk/projects/tctrack/tctrack.php
● Probably the first TC track map in HK using Google Maps
● Use of GMap: TCTrack -> Weather Underground Hong Kong -> HKO
TCTrack
● http://twitter.com/tctrack
● Tweet JTWC updates for Northwest Pacific.
Releases information to citizens
in a better presentation.
Starting new Open
Source project
hk0weather to create
Open Weather Data.
Starting new Open Source projects
to create Open Data
● Develop an open source project.
● Release data in a standard machine-readable data format.
hk0weather
● https://github.com/sammyfung/hk0weather
● Open Source Hong Kong Weather Project.
● converts HKO webpages to JSON data.
● python + scrapy
● 1st version: extracts temperature and humidity of 20+ weather stations from the current weather report and exports them in JSON format.
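The idea of turning report text into JSON can be sketched with `re` and `json` from the standard library. This is not the hk0weather code: the report excerpt, the station name table and the function `parse_report` are made up for illustration, and the real HKO report layout may differ.

```python
import json
import re

# Made-up excerpt in the spirit of a plain-text weather report.
REPORT = """\
Air temperature : 17 degrees Celsius
Relative humidity : 80 per cent
King's Park 16 degrees
Happy Valley 18 degrees
"""

# Hypothetical mapping from display names to station codes.
STATION_CODES = {"King's Park": "kingspark", "Happy Valley": "happyvalley"}


def parse_report(text):
    """Turn the per-station lines of a report into a list of dictionaries."""
    records = []
    for line in text.splitlines():
        match = re.match(r"^(.+?)\s+(-?\d+) degrees\s*;?\s*$", line)
        if match and match.group(1) in STATION_CODES:
            records.append({
                "station": STATION_CODES[match.group(1)],
                # Key spelled as in the hk0weather output.
                "temperture": int(match.group(2)),
            })
    return records


if __name__ == "__main__":
    print(json.dumps(parse_report(REPORT), sort_keys=True))
```

Lines that do not look like a station reading, such as the two header lines in the excerpt, are skipped rather than guessed at.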
hk0weather
● https://github.com/sammyfung/hk0weather
● $ virtualenv hk0weatherenv
● $ source hk0weatherenv/bin/activate
● $ pip install scrapy
● $ git clone https://github.com/sammyfung/hk0weather.git
● $ cd hk0weather
● $ scrapy crawl currwx -t json -o testresult
hk0weather
● Python
  – import re
● Scrapy
  – web crawling framework written in Python.
  – HtmlXPathSelector.
  – built-in JSON, CSV, XML output.
hk0weather
[{"humidity": 80, "station": "hko", "temperture": 17, "time": 1360785720},
{"station": "kingspark", "temperture": 16, "time": 1360785720},
{"station": "wongchukhang", "temperture": 17, "time": 1360785720},
{"station": "takwuling", "temperture": 16, "time": 1360785720},
{"station": "laufaushan", "temperture": 15, "time": 1360785720},
{"station": "taipo", "temperture": 16, "time": 1360785720},
{"station": "shatin", "temperture": 17, "time": 1360785720},
{"station": "tuenmun", "temperture": 17, "time": 1360785720},
{"station": "tseungkwano", "temperture": 16, "time": 1360785720},
{"station": "saikung", "temperture": 16, "time": 1360785720},
{"station": "cheungchau", "temperture": 17, "time": 1360785720},
{"station": "cheungchau", "temperture": 17, "time": 1360785720},
{"station": "tsingyi", "temperture": 17, "time": 1360785720},
{"station": "shekkong", "temperture": 15, "time": 1360785720},
{"station": "tsuenwanhokoon", "temperture": 15, "time": 1360785720},
{"station": "tsuenwanshingmunvalley", "temperture": 17, "time": 1360785720},
{"station": "hongkongpark", "temperture": 17, "time": 1360785720},
{"station": "shaukeiwan", "temperture": 16, "time": 1360785720},
{"station": "kowlooncity", "temperture": 16, "time": 1360785720},
{"station": "happyvalley", "temperture": 18, "time": 1360785720},
{"station": "wongtaisin", "temperture": 17, "time": 1360785720},
{"station": "stanley", "temperture": 16, "time": 1360785720},
{"station": "kwuntong", "temperture": 15, "time": 1360785720},
{"station": "shamshuipo", "temperture": 17, "time": 1360785720}]
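Once the data is in JSON, using it takes only a few lines. The sketch below loads three records copied from the output above and derives some simple figures. Reading the "time" field as a Unix timestamp is an assumption based on its value, and the variable names are made up for this example.

```python
import json
from datetime import datetime, timezone

# Three records copied from the output above.
SAMPLE = """[
 {"humidity": 80, "station": "hko", "temperture": 17, "time": 1360785720},
 {"station": "kingspark", "temperture": 16, "time": 1360785720},
 {"station": "happyvalley", "temperture": 18, "time": 1360785720}
]"""

records = json.loads(SAMPLE)

# "temperture" is the key name used in the output, so it is kept as-is here.
warmest = max(records, key=lambda r: r["temperture"])
average = sum(r["temperture"] for r in records) / len(records)

# Assumption: "time" is seconds since the Unix epoch.
observed = datetime.fromtimestamp(records[0]["time"], tz=timezone.utc)

if __name__ == "__main__":
    print(warmest["station"], average, observed.isoformat())
```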
Items.py
from scrapy.item import Item, Field

# One record per weather station in the JSON output.
class Hk0WeatherItem(Item):
    time = Field()
    station = Field()
    temperture = Field()
    humidity = Field()
Currwx.py
start_urls = (
    'http://www.weather.gov.hk/wxinfo/currwx/currentc.htm',
)
Currwx.py
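# Excerpt from the spider class; HtmlXPathSelector comes from scrapy.selector.
    def parse(self, response):
        laststation = ''
        temperture = int()
        stations = []
        hxs = HtmlXPathSelector(response)
        # Select the <div id="ming"> element of the HKO page.
        report = hxs.select('//div[@id="ming"]')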
libhk0
class hk0:
    stations = [
        (u' 天 文 台 ', 'hko'),
        (u' 京 士 柏 ', 'kingspark'),
        (u' 黃 竹 坑 ', 'wongchukhang'),
        (u' 打 鼓 嶺 ', 'takwuling'),
        (u' 流 浮 山 ', 'laufaushan'),
        # ... (list truncated on the slide)
    ]
libhk0
class hk0:
    def gettime(self, report):
        …
    def hk0current(self, report):
        …
Agenda
● What is Open Data?
● Use of Open Source Software in web crawling.
● Starting a new Open Source project, hk0weather, to create Open Weather Data.
We want an easier way to
access the public data.
We want a better life with
public data.
Thank You!
sammy.hk
http://slidesha.re/1cleS2y

Open data developments in JapanOpen data developments in Japan
Open data developments in Japan
 
Fiscal openness working group open knowledge - october 28
Fiscal openness working group   open knowledge - october 28 Fiscal openness working group   open knowledge - october 28
Fiscal openness working group open knowledge - october 28
 


Use of Open Data in Hong Kong

  • 1. Use of Open Data in Hong Kong Sammy Fung sammy.hk Incu-Lab ICE in StartMeUpHK - Open Data Initiative Gathering 2013/12/04 http://slidesha.re/1cleS2y
  • 2. We want a better life with public data.
  • 3. We want an easier way to access the public data.
  • 4. Agenda ● What is Open Data ? ● Use of Open Source Software in web crawling. ● Starting new Open Source project hk0weather to create Open Weather Data.
  • 5. Sammy Fung ● Software Developer – uses and develops open source software. – Perl → PHP → Python. – interested in Data Mining / Web Crawling. – owns a web and mobile technology startup.
  • 6. Sammy Fung ● 15+ years in Open Source Communities. – Founding Chairman, Hong Kong Linux User Group. – Founding Chairman, Open Source Hong Kong. – Member, GNOME Asia committee. – Mozilla Representative – Member, program committee at COSCUP ● Conference for Open Source Coders, Users and Developers. ● Largest open source conference in Taiwan.
  • 7. What is Open Data ?
  • 8. Open Data Three Laws of Open Government Data by David Eaves. 1. If it can't be spidered or indexed, it doesn't exist. 2. If it isn't available in open and machine readable format, it can't engage. 3. If a legal framework doesn't allow it to be repurposed, it doesn't empower. http://eaves.ca/2009/09/30/three-law-of-open-government-data/
  • 9. Open Data ● Tim Berners-Lee, the inventor of the Web. – 5stardata.info – 5 star deployment scheme of Open Data.
  • 10. * One Star - Open Data 1. make your stuff available on the Web (whatever format) under an open license. 2. make it available as structured data (e.g., Excel instead of image scan of a table) 3. use non-proprietary formats (e.g., CSV instead of Excel) 4. use URIs to denote things, so that people can point at your stuff. 5. link your data to other data to provide context. 5stardata.info by Tim Berners-Lee, the inventor of the Web.
  • 11. ** Two Star - Open Data 1. make your stuff available on the Web (whatever format) under an open license. 2. make it available as structured data (e.g., Excel instead of image scan of a table) 3. use non-proprietary formats (e.g., CSV instead of Excel) 4. use URIs to denote things, so that people can point at your stuff. 5. link your data to other data to provide context. 5stardata.info by Tim Berners-Lee, the inventor of the Web.
  • 12. *** Three Star - Open Data 1. make your stuff available on the Web (whatever format) under an open license. 2. make it available as structured data (e.g., Excel instead of image scan of a table) 3. use non-proprietary formats (e.g., CSV instead of Excel) 4. use URIs to denote things, so that people can point at your stuff. 5. link your data to other data to provide context. 5stardata.info by Tim Berners-Lee, the inventor of the Web.
  • 13. **** Four Star - Open Data 1. make your stuff available on the Web (whatever format) under an open license. 2. make it available as structured data (e.g., Excel instead of image scan of a table) 3. use non-proprietary formats (e.g., CSV instead of Excel) 4. use URIs to denote things, so that people can point at your stuff. 5. link your data to other data to provide context. 5stardata.info by Tim Berners-Lee, the inventor of the Web.
  • 14. ***** Five Star - Open Data 1. make your stuff available on the Web (whatever format) under an open license. 2. make it available as structured data (e.g., Excel instead of image scan of a table) 3. use non-proprietary formats (e.g., CSV instead of Excel) 4. use URIs to denote things, so that people can point at your stuff. 5. link your data to other data to provide context. 5stardata.info by Tim Berners-Lee, the inventor of the Web.
  • 15. Open Data in Hong Kong
  • 16. Open Data in Hong Kong ● Data.One – http://www.gov.hk/en/theme/psi – released on 2011/3/31. – First App Competition on Data.One ● Call for Submission now till 2014/02/28.
  • 17. Weather Information in Hong Kong ● Hong Kong Observatory – Hourly Hong Kong Weather Report – Regional Weather in Hong Kong (10 min updates) – Weather Forecast and Weekly Weather Forecast – Typhoon Report and Forecast
  • 20. Weather at Data.One ● I posted a blog post, 'Progress of Open Government Data in Hong Kong', on 2013/01/17. ● Weather at Data.One provides 7 dataset URLs, which return RSS (XML) format (Eng/TChi/SChi). – One word: Useless. – The Data.One dataset (RSS) is completely different from HKO's own paid service (XML).
  • 21. Weather at Data.One ● Example - Current local weather report: ● Plain text report in RSS. ● Difference in how the report content is wrapped: – Website: a pair of HTML tags, e.g. <PRE>....</PRE>. – Data.One: a pair of RSS description tags, <description>....</description>. ● Other weather data is missing, e.g. regional temperature updates every 12 mins.
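The point about the RSS description tag can be illustrated with a minimal sketch. It assumes a feed shaped like the hypothetical `SAMPLE_RSS` below; the real Data.One feed layout and report wording are not shown in the slides and may differ. The function name `report_texts` is also just an illustration. What it shows is that an RSS reader only gets back a block of plain text, which still has to be parsed before it becomes data.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS snippet (assumption): the real feed layout and wording may differ.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Current Weather Report</title>
    <item>
      <title>Sample bulletin</title>
      <description><![CDATA[Air temperature : 17 degrees Celsius
Relative Humidity : 80 per cent]]></description>
    </item>
  </channel>
</rss>"""


def report_texts(rss_xml):
    """Return the plain-text body of every <item><description> in the feed."""
    root = ET.fromstring(rss_xml)
    return [(d.text or "").strip()
            for d in root.iterfind("./channel/item/description")]


reports = report_texts(SAMPLE_RSS)
print(reports[0])
```

The result is still a free-text report: individual values such as temperature or humidity are not separate fields and need further text processing.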
  • 22. Weather at Data.One ● Weather at Data.One is a 'report', not 'data'. ● The weather RSS was already released by HKO before the launch of Data.One. ● Technically, JSON/XML formats are easier for computer programs to read.
  • 23. Data.One ● In November 2013, 43 datasets are available. – JSON/XML = 18 – RSS = 10 – XLS = 6 – CSV = 4 – JPG/PNG = 3 – HTML/MDB = 2
  • 24. Data.One ● JSON/XML (18 datasets) – Air Pollution. ● Past 24-hour Air Pollution Index from stations. – Approved Charitable Fund-raising Activities – Restaurant and Food Licences. – Details of facility locations. – Reward Notices from Police Force. – Marine Traffic (Arrival/Departure). – Traffic Speed and special news. – EventHK information.
  • 25. Data.One ● RSS (10 datasets) – Weather Information (7 datasets) – Beach Water Quality (1 dataset) – Current Air Pollution Index range and forecast (2 datasets)
  • 26. Data.One ● JPG/PNG (3 datasets) – Exhibition gallery of government building projects. – Speed map panels. – Traffic snapshot images.
  • 27. Data.One ● CSV – Locations of Public Facility and GovWifi – Past Record of Air Pollution Index – Marine Shipping directory of HK ● HTML – HTML version of Marine Traffic. ● XLS, MDB – 2011 Population Census. – Property Market Statistics. – Monthly Digested Stats and Registers of Auth Persons from Building Dept. – Routes and fares of public transport.
  • 28. Data.One ● Many departments do not release their useful data, and only release information already available on their websites. – Few of them make open data available on their own. ● Most of them do not understand what 'real' open data is. – Data instead of information. – Open data formats instead of proprietary data formats. – Usefulness of data. ● Some departments should manage their open data in a better data structure.
  • 29. Legco Meeting Minutes and Voting Results
  • 30. Legco Meeting Minutes and Voting Results
  • 31. Legco Meeting Minutes and Voting Results ● In October 2013, LegCo started to publish voting results of the House Committee in XML. ● It is not a part of the Data.One project. ● My open source software for LegCo vote result XML: – http://github.com/sammyfung/legcovotes
  • 32. Digital21 Strategy Public Consultation Document (G) Public Sector Information (PSI) as Default "34. Through different channels (like press releases, publications, websites, etc.), the Government releases a lot of information in different areas. However, most of such information can only be read but cannot be used. In view of the immense benefits of widening access to PSI for free and easy re-use, we propose to make all Government information released for public consumption machine-readable by default. Where appropriate, datasets will be released with application programming interfaces (APIs), providing predefined functions to make their retrieval easier." (G) 廣泛提供公共資料 "34. 政府透過不同途徑(例如新聞稿、出版物、網站等)發放大量不同範疇的資料。然而,這些資料大都只可供閱讀而不能使用。有見開放公共資料以供免費再用可帶來巨大效益,我們建議所有開放予公眾使用的政府資料都須以數碼格式編製。在適用情況下,資料發布時會同時推出應用程式界面,以便提供預設功能,讓公眾輕易地檢索資料。"
  • 33. Digital21 Strategy Public Consultation Document "33. PSI datasets can be used and meshed together to create innovative new applications, as demonstrated by the creative and useful products and services developed from PSI in Hong Kong and around the world. For example, using PSI datasets on traffic snapshot images, a number of mobile apps have been developed to provide real-time traffic situation for users to avoid traffic jams in planning their traffic routes. Experience from other developed economies shows that widening access to PSI datasets can open up lucrative business opportunities and bring social benefits. By tapping the creativity of the community and entrepreneurs, the use of PSI can lead to positive social outcomes. For instance, in some cities in the United States, application of PSI on hygiene inspections has led to a significant drop in food poisoning incidents."
  • 34. Digital21 Strategy Public Consultation Document "33. 由本港及世界各地利用公共資料所開發的實用創意產品及服務所見,公共資料可個別及混合使用,以開發創新的應用程式。例如,現時已有多個利用交通情況快拍圖像的公共資料開發的流動應用程式,以提供實時交通情況資料,讓使用者計劃行車路線,從而避開交通擠塞情況。根據其他經濟體系的經驗,開放公共資料,供大眾廣為使用,可開拓有利可圖的商機,並為社會帶來禆益。我們可藉着開放公共資料,借助市民及企業家的創意來造福社會。舉例來說,在美國一些城市,有關衞生檢查的公共資料在開放使用後,食物中毒事故宗數大幅減少。"
  • 35. Digital21 Strategy Public Consultation Document "35. Apart from Government data, there are vast amounts of PSI handled, collected and disseminated by public organisations, which are equally useful for the development of innovative services and products. Therefore, we propose to encourage public organisations (e.g. public utilities and transport operators) to release data owned by them in machine-readable format." "35. 除了政府資料外,本港亦備有大量經公共機構處理、收集及發放的公共資料,這些資料對開發創新服務及產品同樣有用。因此,我們建議鼓勵公共機構(例如公用事業及運輸機構)發放以數碼格式編製的資料。"
  • 36. Open Data is important to citizens.
  • 37. Use of Open Source Software in web crawling
  • 38. Web Scraping ● a computer software technique of extracting information from websites. (Wikipedia) ● for business, hobbies, research purposes.
  • 39. Web Scraping ● Look for the right URLs to scrape. ● Look for the right content in webpages. ● Save data into a data store. ● When to run the web scraping program?
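A minimal sketch of the middle two steps (finding the right content in a page and saving it into a data store), using only the Python standard library. The HTML fragment, table layout, and the names `extract_rows` and `save_rows` are assumptions made for illustration; choosing URLs and scheduling the run are left out.

```python
import re
import sqlite3

# Hypothetical page fragment (assumption) standing in for a downloaded webpage.
SAMPLE_HTML = """
<table>
  <tr><td>King's Park</td><td>16</td></tr>
  <tr><td>Sha Tin</td><td>17</td></tr>
</table>
"""

# One table row: a station name cell followed by an integer temperature cell.
ROW = re.compile(r"<tr><td>(?P<station>[^<]+)</td><td>(?P<temp>\d+)</td></tr>")


def extract_rows(html):
    """Pick out (station, temperature) pairs from the page content."""
    return [(m.group("station"), int(m.group("temp"))) for m in ROW.finditer(html)]


def save_rows(conn, rows):
    """Store the extracted rows in a small SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (station TEXT, temperature INTEGER)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
rows = extract_rows(SAMPLE_HTML)
save_rows(conn, rows)
stored = conn.execute(
    "SELECT station, temperature FROM readings ORDER BY station"
).fetchall()
print(stored)
```

Regular expressions are enough for a page this regular; for less predictable markup, an HTML parser or a framework such as Scrapy is usually a safer choice.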
  • 40. Use of Open Source Software in Web Crawling ● Use Open Source Tools to collect useful and meaningful machine-readable data. ● No need to wait for the provider to release data in machine-readable format.
  • 41. Open Source Tools ● Python programming language ● with Regular Expression library ● Scrapy web crawling framework
  • 42. Why python + scrapy ? ● python: my favourite programming language for the past few years. ● scrapy: web crawling framework written in Python.
  • 43. What is Scrapy ? ● An open source web scraping framework for Python. ● Scrapy is a fast high-level screen scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
  • 44. Scrapy Features ● define the data you want to scrape ● write a spider to extract data ● Built-in: selecting and extracting data from HTML and XML ● Built-in: JSON, CSV, XML output ● Interactive shell console ● Built-in: web service, telnet console, logging ● Others
  • 45. Programme List of Paid TVs in 2004
  • 46. Programme List of Paid TVs in 2004 ● I wanted to know which channel a live football match was showing on. ● Paid TV web site = M$ + IIS + ASP + Flash ● Slow....... Very Slow...... Extremely Slow! ● Couldn't connect during peak hours! ● Wrote my first web crawler in PHP in 2004.
  • 47. Public Transportation in 2006-2010 ● Kowloon Motor Bus (KMB) – No map view for a bus route ● Public Transportation Enquiry System (PTES) – Extremely poor, ugly (or much worse) map UI on PTES.
  • 48. HK Observatory and Joint Typhoon Warning Center ● Is any typhoon coming to Hong Kong? And when will it come? ● No easy data exchange format. ● No RSS nor ATOM. ● We don't check websites every day.
  • 49. My Products ● WeatherHK ● TCTrack
  • 50. WeatherHK ● http://twitter.com/weatherhk ● hourly current weather report ● weather forecast report ● tropical signal warning
  • 51. WeatherHK ● ● Backend: Python + Scrapy + Database + Twitter + NNTP...... Frontend: Twitter + Newsgroup
  • 54. TCTrack ● ● ● http://sammy.hk/projects/tctrack/tctrack.php Plot TC current and forecast tracks over Google Map. Source: – JTWC – HKO
  • 55. TCTrack ● ● ● http://sammy.hk/projects/tctrack/tctrack.php Probably first tctrack map in HK using GoogleMap Use of GMap: TCTrack -> Weather Underground Hong Kong -> HKO
  • 57. Release information to citizens in a better presentation.
  • 58. Starting new Open Source project hk0weather to create Open Weather Data.
  • 59. Starting new Open Source projects to create Open Data ● Develop an open source project. ● Release data in a standard machine-readable data format.
  • 60. hk0weather ● https://github.com/sammyfung/hk0weather ● Open Source Hong Kong Weather Project. ● Converts HKO webpages to JSON data. ● Python + Scrapy ● 1st version: extracts temperature and humidity for 20+ weather stations from the current weather report and exports them in JSON format.
  • 61. hk0weather ● https://github.com/sammyfung/hk0weather ● $ virtualenv hk0weatherenv ● $ source hk0weatherenv/bin/activate ● $ pip install scrapy ● $ git clone https://github.com/sammyfung/hk0weather.git ● $ cd hk0weather ● $ scrapy crawl currwx -t json -o testresult
  • 62. hk0weather ● Python – import re ● Scrapy – web crawling framework written in Python. – HtmlXPathSelector. – built-in JSON, CSV, XML output.
  • 63. hk0weather [{"humidity": 80, "station": "hko", "temperture": 17, "time": 1360785720}, {"station": "kingspark", "temperture": 16, "time": 1360785720}, {"station": "wongchukhang", "temperture": 17, "time": 1360785720}, {"station": "takwuling", "temperture": 16, "time": 1360785720}, {"station": "laufaushan", "temperture": 15, "time": 1360785720}, {"station": "taipo", "temperture": 16, "time": 1360785720}, {"station": "shatin", "temperture": 17, "time": 1360785720}, {"station": "tuenmun", "temperture": 17, "time": 1360785720}, {"station": "tseungkwano", "temperture": 16, "time": 1360785720}, {"station": "saikung", "temperture": 16, "time": 1360785720}, {"station": "cheungchau", "temperture": 17, "time": 1360785720}, {"station": "cheungchau", "temperture": 17, "time": 1360785720}, {"station": "tsingyi", "temperture": 17, "time": 1360785720}, {"station": "shekkong", "temperture": 15, "time": 1360785720}, {"station": "tsuenwanhokoon", "temperture": 15, "time": 1360785720}, {"station": "tsuenwanshingmunvalley", "temperture": 17, "time": 1360785720}, {"station": "hongkongpark", "temperture": 17, "time": 1360785720}, {"station": "shaukeiwan", "temperture": 16, "time": 1360785720}, {"station": "kowlooncity", "temperture": 16, "time": 1360785720}, {"station": "happyvalley", "temperture": 18, "time": 1360785720}, {"station": "wongtaisin", "temperture": 17, "time": 1360785720}, {"station": "stanley", "temperture": 16, "time": 1360785720}, {"station": "kwuntong", "temperture": 15, "time": 1360785720}, {"station": "shamshuipo", "temperture": 17, "time": 1360785720}]
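As a rough illustration of how a consumer might use output like this, the sketch below loads a short excerpt of the records above with Python's standard `json` module, converts the `time` field and averages the temperatures. It assumes `time` is a Unix timestamp in seconds; the variable names are illustrative, and the `temperture` key is spelled as in the project's output.

```python
import json
from datetime import datetime, timezone

# Excerpt of the hk0weather output shown above (keys spelled as in the original).
SAMPLE = '''[
 {"humidity": 80, "station": "hko", "temperture": 17, "time": 1360785720},
 {"station": "kingspark", "temperture": 16, "time": 1360785720},
 {"station": "laufaushan", "temperture": 15, "time": 1360785720}
]'''

records = json.loads(SAMPLE)

# Assumption: "time" is a Unix timestamp in seconds.
observed_at = datetime.fromtimestamp(records[0]["time"], tz=timezone.utc)

# Map each station id to its temperature and compute the mean.
temps = {r["station"]: r["temperture"] for r in records}
average = sum(temps.values()) / len(temps)

# Humidity is only present for some stations, so fall back to None.
humidity = {r["station"]: r.get("humidity") for r in records}

print(observed_at.isoformat(), temps, average)
```

Because the data is plain JSON, the same few lines would work in any language with a JSON parser, which is the point of releasing it in a machine-readable format.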
  • 64. Items.py
        class Hk0WeatherItem(Item):
            time = Field()
            station = Field()
            temperture = Field()
            humidity = Field()
  • 66. Currwx.py
        def parse(self, response):
            laststation = ''
            temperture = int()
            stations = []
            hxs = HtmlXPathSelector(response)
            report = hxs.select('//div[@id="ming"]')
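The XPath step in `parse` can be tried without installing Scrapy. The sketch below is not the project's code: it swaps in the standard library's `xml.etree.ElementTree`, which supports a limited XPath subset, and runs an equivalent query against a made-up, well-formed page. The `PAGE` markup and its contents are assumptions for illustration; the real HKO page differs.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the HKO page; the real markup differs.
PAGE = """<html><body>
<div id="header">Hong Kong Observatory</div>
<div id="ming">King's Park 16 degrees
Wong Chuk Hang 17 degrees</div>
</body></html>"""

root = ET.fromstring(PAGE)

# './/div[@id="ming"]' plays the role of '//div[@id="ming"]' on the slide:
# find the div whose id attribute is "ming", anywhere below the root.
report = root.find('.//div[@id="ming"]')

# Split the report text into non-empty, stripped lines for later parsing.
lines = [line.strip() for line in report.text.splitlines() if line.strip()]
print(lines)
```

Real web pages are rarely well-formed XML, which is one reason to use a selector built for HTML, such as Scrapy's, in an actual crawler.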
  • 67. libhk0
        class hk0:
            stations = [
                (u'天文台', 'hko'),
                (u'京士柏', 'kingspark'),
                (u'黃竹坑', 'wongchukhang'),
                (u'打鼓嶺', 'takwuling'),
                (u'流浮山', 'laufaushan'),
  • 68. libhk0
        class hk0:
            def gettime(self, report): …
            def hk0current(self, report): …
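The slides elide the bodies of `gettime` and `hk0current`, so the following is only a minimal sketch of the general idea: map the Chinese station names to station ids and pull temperatures out of report text with `re`. The `REPORT` text format, the `parse_report` function and the regular expression are hypothetical and not taken from libhk0; only the first five station pairs come from the slide.

```python
import re

# First entries of the station table shown on the libhk0 slide.
STATIONS = [
    (u'天文台', 'hko'),
    (u'京士柏', 'kingspark'),
    (u'黃竹坑', 'wongchukhang'),
    (u'打鼓嶺', 'takwuling'),
    (u'流浮山', 'laufaushan'),
]

# Hypothetical report text: "<station name> <temperature> 度" on each line.
REPORT = u"""天文台 17 度
京士柏 16 度
流浮山 15 度"""


def parse_report(text, time):
    """Return one dict per known station found in the text (hypothetical helper)."""
    names = dict(STATIONS)
    items = []
    for line in text.splitlines():
        match = re.match(r'\s*(\S+)\s+(-?\d+)\s*度', line)
        if match and match.group(1) in names:
            items.append({
                'station': names[match.group(1)],
                'temperture': int(match.group(2)),  # key spelled as in the project
                'time': time,
            })
    return items


items = parse_report(REPORT, 1360785720)
print(items)
```

Keeping the name-to-id table separate from the parsing logic means a new station only needs a new tuple in the list.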
  • 69. Agenda ● What is Open Data ? ● Use of Open Source Software in web crawling. ● Starting new Open Source project hk0weather to create Open Weather Data.
  • 70. We want an easier way to access the public data.
  • 71. We want a better life with public data.