32. Recent topics in Deep Learning
• Applications to multimodal data
  • Automatic generation of image captions [Vinyals+ to appear]
• Analysis of variable-length data with Recurrent NNs and LSTM
  • Machine translation [Sutskever+’14], video [Karpathy+ ’14]
• Achieving performance comparable to DNNs with shallow NNs
  • Model Compression [Bucilua+’06] / acquiring Dark Knowledge with a Distilled Network [Hinton+’14]
• Theoretical computer scientists are moving into the theoretical analysis of Deep Learning
  • Justification of layerwise pretraining [Arora+’13]
• Deep (Directed) Generative Models
  • Generative Stochastic Network [Bengio+’13], Generative Variational Auto-Encoder [Kingma+’13]
Covered in detail, each of these topics would fill an entire talk of this length…
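As a rough illustration of the Distilled Network idea cited above [Hinton+’14], the sketch below softens a teacher network's outputs with a temperature-scaled softmax and scores a student against those soft targets. The function names, the example logits, and the temperature values are assumptions made for this example and do not come from the slides. Real training would minimize this loss over a dataset inside a DNN framework.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / temperature; a higher temperature gives a softer distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's soft targets and the student's softened outputs."""
    soft_targets = softmax_with_temperature(teacher_logits, temperature)
    student_probs = softmax_with_temperature(student_logits, temperature)
    return -sum(t * math.log(p) for t, p in zip(soft_targets, student_probs))


# Hypothetical teacher outputs for a three-class problem.
teacher_logits = [8.0, 2.0, 0.5]
hard = softmax_with_temperature(teacher_logits, temperature=1.0)
soft = softmax_with_temperature(teacher_logits, temperature=4.0)

# A student that matches the teacher versus one that has learned nothing yet.
loss_matched = distillation_loss(teacher_logits, teacher_logits)
loss_untrained = distillation_loss([0.0, 0.0, 0.0], teacher_logits)
```

Raising the temperature moves probability mass from the top class to the other classes. That relative ranking of the wrong classes is the extra signal a small student network can learn from.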
33. Deep Learning related materials
Members of our company also publish materials on Ustream, Slideshare, our Research Blog, and elsewhere
http://www.slideshare.net/pfi/deep-learning-22350063
http://www.slideshare.net/beam2d/deep-learning20140130
http://www.slideshare.net/beam2d/deep-learning-22544096
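42. Economic forecasts for distributed intelligence
Estimates by Cisco and GE
Cisco: Internet of Everything (IoE)
Over the next 10 years, IoE will create a $14.4 trillion opportunity in the private sector
• Asset utilization / employee productivity / supply chain and logistics improvements / better customer experience / shorter time to market
• Japan's share is $761 billion (about 5%)
• In healthcare and life sciences, $99 billion of value will ultimately be created in 2013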
- White Paper Embracing the Internet of Everything To Capture Your Share of $14.4 Trillion
- Industrial Internet: Pushing the Boundaries of Minds and Machines
- The Industrial Internet@Work
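GE: Industrial Internet
The Industrial Internet will grow world GDP by $100 trillion to $150 trillion over the next 20 years
• Intelligent machines / advanced analytics / connected people
• Transportation / oil and gas / power plants / industrial facilities / medical equipment
• Healthcare example: CT and MRI maintenance costs 4 million hours per year, equivalent to $250 million in labor costs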
52. References (1/5)
[Arora+’13] Arora, Sanjeev, et al. "Provable bounds for learning some deep representations." arXiv preprint
arXiv:1310.6343 (2013).
[Ba+’13] Ba, Lei Jimmy, and Rich Caruana. "Do Deep Nets Really Need to be Deep?" arXiv preprint
arXiv:1312.6184 (2013).
[Bengio+’07] Bengio, Yoshua, et al. "Greedy layer-wise training of deep networks." Advances in neural
information processing systems 19 (2007): 153.
[Bengio+’13] Bengio, Yoshua, and Eric Thibodeau-Laufer. "Deep generative stochastic networks trainable
by backprop." arXiv preprint arXiv:1306.1091 (2013).
[Bengio’14] Bengio, Yoshua. "How auto-encoders could provide credit assignment in deep networks via
target propagation." arXiv preprint arXiv:1407.7906 (2014).
[Bucilua+’06] Buciluǎ, Cristian, Rich Caruana, and Alexandru Niculescu-Mizil. "Model compression."
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining.
ACM, 2006.
53. References (2/5)
[Dahl+’14] Dahl, George E., Navdeep Jaitly, and Ruslan Salakhutdinov. "Multi-task Neural Networks for
QSAR Predictions." arXiv preprint arXiv:1406.1231 (2014).
[Dauphin+’14] Dauphin, Yann N., et al. "Identifying and attacking the saddle point problem in high-
dimensional non-convex optimization." Advances in Neural Information Processing Systems. 2014.
[Duchi+’11] Duchi, John, Elad Hazan, and Yoram Singer. "Adaptive subgradient methods for online
learning and stochastic optimization." The Journal of Machine Learning Research 12 (2011): 2121-2159.
[Deng+’09] Deng, Jia, et al. "Imagenet: A large-scale hierarchical image database." Computer Vision and
Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE, 2009.
[GoodFellow+’09] Goodfellow, Ian, et al. "Measuring invariances in deep networks." Advances in neural
information processing systems. 2009.
[Hinton+’12] Hinton, Geoffrey E., et al. "Improving neural networks by preventing co-adaptation of feature
detectors." arXiv preprint arXiv:1207.0580 (2012).
54. References (3/5)
[Hinton+’14] Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the Knowledge in a Neural Network."
Deep Learning and Representation Learning Workshop, NIPS 2014.
[Jeffrey+’12] Dean, Jeffrey, et al. "Large scale distributed deep networks." Advances in Neural Information
Processing Systems. 2012.
[Jeffrey+’14] Large Scale Deep Learning, CIKM keynote, 2014,
http://static.googleusercontent.com/media/research.google.com/ja//people/jeff/CIKM-keynote-Nov2014.pdf
[Karpathy+ ’14] Karpathy, Andrej, et al. "Large-scale video classification with convolutional neural
networks." IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2014.
[Kingma+’13] Kingma, Diederik P., and Max Welling. "Auto-encoding variational bayes." arXiv preprint
arXiv:1312.6114 (2013).
[Klela+’14] Kiela, Douwe, and Léon Bottou. "Learning Image Embeddings using Convolutional Neural Networks
for Improved Multi-Modal Semantics." EMNLP 2014.
55. References (4/5)
[Krizhevsky+ ’12] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with
deep convolutional neural networks." Advances in neural information processing systems. 2012.
[Le+’13] Le, Quoc V. "Building high-level features using large scale unsupervised learning." Acoustics,
Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013.
[LeCun+ ’89] Yann LeCun, Bernhard Boser, John S Denker, Donnie Henderson, Richard E Howard, Wayne
Hubbard, Lawrence D Jackel, Backpropagation applied to handwritten zip code recognition, Advances in
neural information processing systems 2, NIPS 1989, 396-404
[Lin+’13] Lin, Min, Qiang Chen, and Shuicheng Yan. "Network In Network." arXiv preprint arXiv:1312.4400
(2013).
[Nair+ ’10] Nair, Vinod, and Geoffrey E. Hinton. "Rectified linear units improve restricted boltzmann
machines." Proceedings of the 27th International Conference on Machine Learning (ICML-10). 2010.
[Naito+’12] Yuki Naito and Hidemasa Bono, GGRNA: an ultrafast, transcript-oriented search engine for
genes and transcripts, Nucl. Acids Res. (2012) 40(W1):W592-W596
56. References (5/5)
[Puniyani+’10] K. Puniyani, S. Kim, and E. P. Xing, “Multi-population GWA mapping via multi-task
regularized regression,” Bioinformatics, vol. 26, no. 12, pp. i208-i216, Jun. 2010
[Srivastava+’14] Srivastava, Nitish, et al. "Dropout: A simple way to prevent neural networks from
overfitting." The Journal of Machine Learning Research 15.1 (2014): 1929-1958.
[Sutskever+’14] Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural
networks." Advances in Neural Information Processing Systems. 2014.
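[Okatani’13] Takayuki Okatani, "Research trends in deep learning for image recognition" (in Japanese),
16th Information-Based Induction Sciences Workshop (IBIS2013)
[Maruyama+’12] Hiroshi Maruyama and Daisuke Okanohara, "Edge-Heavy Data: The next-generation architecture
brought about by CPS, big data, cloud, and smartphones" (in Japanese), GICTF General Meeting special lecture,
2012, http://www.gictf.jp/doc/20120709GICTF.pdf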
71. References (1/4)
[Bengio+’07] Bengio, Yoshua, et al. "Greedy layer-wise training of deep networks." Advances in neural
information processing systems 19 (2007): 153.
[Bryson+’69] Bryson, Arthur E., and Yu-Chi Ho. "Applied optimal control." (1969).
[Cortes+’95] Cortes, Corinna, and Vladimir Vapnik. "Support-vector networks." Machine learning 20.3
(1995): 273-297.
[Elman+’90] Elman, Jeffrey L. "Finding structure in time." Cognitive science 14.2 (1990): 179-211.
[Fukushima’80] Fukushima, Kunihiko. "Neocognitron: A self-organizing neural network model for a
mechanism of pattern recognition unaffected by shift in position." Biological cybernetics 36.4 (1980):
193-202.
[Hinton+’06] Hinton, Geoffrey, Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep
belief nets." Neural computation 18.7 (2006): 1527-1554.
[Hochreiter’91] Hochreiter, Sepp. "Untersuchungen zu dynamischen neuronalen Netzen." Master's thesis,
Institut für Informatik, Technische Universität München (1991).
72. References (2/4)
[Hochreiter+’97] Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory."
Neural Computation 9.8 (1997): 1735-1780.
[Hopfield’82] Hopfield, John J. "Neural networks and physical systems with emergent collective
computational abilities." Proceedings of the national academy of sciences 79.8 (1982): 2554-2558.
[Jordan’86] Jordan, Michael I. Serial Order: A Parallel Distributed Processing Approach. No. ICS-8604.
CALIFORNIA UNIV SAN DIEGO LA JOLLA INST FOR COGNITIVE SCIENCE, 1986.
[LeCun+ ’89] Yann LeCun, Bernhard Boser, John S Denker, Donnie Henderson, Richard E Howard, Wayne
Hubbard, Lawrence D Jackel, Backpropagation applied to handwritten zip code recognition, Advances in
neural information processing systems 2, NIPS 1989, 396-404
[Krizhevsky+ ’12] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with
deep convolutional neural networks." Advances in neural information processing systems. 2012.
73. References (3/4)
[Le+’13] Le, Quoc V. "Building high-level features using large scale unsupervised learning." Acoustics,
Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on. IEEE, 2013.
[McClelland+’86] McClelland, James L., David E. Rumelhart, and PDP Research Group. "Parallel
distributed processing." Explorations in the microstructure of cognition 2 (1986).
[Minsky+’69] Minsky, Marvin, and Seymour Papert. "Perceptrons: An introduction to computational
geometry." The MIT Press, Cambridge, expanded edition 19 (1969): 88.
[Pearl’85] Pearl, Judea. "Bayesian networks: A model of self-activated memory for
evidential reasoning." (1985).
[Rosenblatt’58] Rosenblatt, Frank. "The perceptron: a probabilistic model for information storage and
organization in the brain." Psychological review 65.6 (1958): 386.
[Rumelhart+’86] Rumelhart, David E., James L. McClelland, and PDP Research Group. "Parallel
distributed processing, volume 1: Foundations." MIT Press, Cambridge, MA 19 (1986): 67-70.
74. References (4/4)
[Salakhutdinov+’09] Salakhutdinov, Ruslan, and Geoffrey E. Hinton. "Deep boltzmann
machines." International Conference on Artificial Intelligence and Statistics. 2009.
[Smolensky’86] Smolensky, Paul. "Information processing in dynamical systems: Foundations
of harmony theory." (1986): 194.
[Szegedy+’14] Szegedy, Christian, et al. "Going deeper with convolutions." arXiv preprint
arXiv:1409.4842 (2014).
[Werbos’74] Werbos, Paul. "Beyond regression: New tools for prediction and analysis in the
behavioral sciences." (1974).