Resume of Hiroyoshi Komatsu
Education
The University of Tokyo, Department of Computer Science, Master of Science (2021-04 to 2023-03)
Passed the Certificate for Students Achieving the Proficiency Level of Upper Secondary School Graduates examination in Japan (2013-09)
- Effective from April 2015 (at age 18)
Skills
- Programming Languages (proficient): Python, Scala, Java.
- Programming Languages (basic reading ability): C/C++, Haskell, JavaScript.
- Other: Android app development, SQL, NoSQL, SPARQL.
- Seven years of experience in natural language processing, with both deep learning approaches and traditional statistical machine learning models.
- Understanding of, and basic research experience in, mathematical logic (proof theory), computational complexity theory, and theoretical cryptography.
- Basic skills: Linux, Git, Kubernetes.
Experience
National Institute of Advanced Industrial Science and Technology (AIST) (September 2018 to 04 and 2018-07 to 2018-09)
- Research assistant. I designed and built a dataset for a contextual semantic parsing task based on real search queries, and built a deep learning system with PyTorch.
MITOU project 2013 (2013-10 to 2014-07)
- Project Details: https://www.ipa.go.jp/jinzai/mitou/2013/gaiyou_s-4
- I proposed a new semantic parsing approach for virtual assistant applications. I implemented a core machine learning algorithm (log-linear model), a natural language processing system, a server-side backend, and an Android front-end application.
- The project received the “Super Creator” award of the MITOU project 2013 (given to fewer than 8% of all applicants; youngest winner since 2000 at the time of the award).
- Supervisor: Yusuke Miyao
- Technical assistant. 1) I worked on a semantic parsing system that generates database queries from Japanese questions; 2) I built a coreference resolution system as part of a group project for the textual entailment workshop NTCIR-10 RITE. (Our system achieved the top score on 1 of 3 datasets among 12 teams.)
Nara Institute of Science and Technology (NAIST) (2011-07)
- Supervisor: Mamoru Komachi
- Technical assistant. I wrote a converter in Python for a Japanese predicate-argument structure corpus.
Open-Source Projects
corenlp-python
- Website: https://bitbucket.org/torotoki/corenlp-python/
- A Python wrapper for Stanford CoreNLP, a set of essential libraries for natural language processing. This is a fork of Dustin Smith’s stanford-corenlp-python with new features and bugfixes.
- Received 21 forks and 29 watchers on Bitbucket; used by natural language processing engineers and researchers, and in the source code for an international conference paper.
BCCWJ-PAS
- Tools and a dataset for predicate-argument structures and coreference in the Balanced Corpus of Contemporary Written Japanese (BCCWJ), a popular corpus of the Japanese language.
- I carried out this project while working at the Nara Institute of Science and Technology (NAIST) with Assistant Prof. Mamoru Komachi.
- Currently available as part of BCCWJ-DepParaPAS.
Other Works
- DG Lab Tokyo, 2016: I built an experimental contextual question answering system with PyTorch.
- Gakushuin University, 2016: As a freelancer, I built a benchmark system for a legal text classification task with PyTorch.
- The University of Tokyo, 2023: Teaching assistant for the lecture “Computational Complexity Theory”.
Misc
- NII Today No. 60, 2013: The Younger Generation Discusses Their Hopes for the Todai Robot Project.
- Languages:
  - Japanese: native
  - English: TOEFL iBT 86 (2020-07), TOEIC 920 (2024-10)
Publications
- PLaMo-100B: A Ground-Up Language Model Designed for Japanese Proficiency, 2024. Preferred Elements et al. arXiv preprint arXiv:2410.07563. https://arxiv.org/abs/2410.07563
- The Logic for a Mildly Context-Sensitive Fragment of the Lambek-Grishin Calculus, 2021. Hiroyoshi Komatsu. arXiv preprint arXiv:2101.03634. https://arxiv.org/abs/2101.03634
- The Display Calculus and Proof Nets for the Lambek-Grishin Calculus, and Mildly Context-Sensitive Grammars, 2021. Hiroyoshi Komatsu. SLACS.
- Latent Dirichlet Allocation for Wikipedia (Japanese), 2013. Hiroyoshi Komatsu. YANS.
- BnO at NTCIR-10 RITE: A Strong Shallow Approach and an Inference-based Textual Entailment Recognition System, 2013. Ran Tian, Yusuke Miyao, Takuya Matsuzaki, Hiroyoshi Komatsu. NTCIR-10.
Other Presentations