According to an analysis in the AI Index report, the share of arXiv publications in the subcategory “Computation and Language”, as a proportion of all publications in Artificial Intelligence, has mostly increased since 2012:
- 2012: 5.70%
- 2013: 5.74%
- 2014: 8.81%
- 2015: 10.63%
- 2016: 14.80%
- 2017: 14.41%
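To make "mostly increased" concrete, the year-over-year change can be worked out directly from the shares listed above. The following is a minimal sketch; the variable and function names are illustrative and not part of the AI Index analysis.

```python
# Share of "Computation and Language" among arXiv AI publications,
# in percent, copied from the list above.
shares = {
    2012: 5.70,
    2013: 5.74,
    2014: 8.81,
    2015: 10.63,
    2016: 14.80,
    2017: 14.41,
}


def year_over_year_changes(series):
    """Return the change in percentage points from each year to the next."""
    years = sorted(series)
    return {
        year: round(series[year] - series[prev], 2)
        for prev, year in zip(years, years[1:])
    }


changes = year_over_year_changes(shares)
for year, delta in changes.items():
    print(f"{year}: {delta:+.2f} percentage points")
```

The share rose in every year except 2017, when it fell by 0.39 percentage points.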
From AI Index’s annual report (p. 73):
Source: arXiv.org is an online archive of research articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. arXiv is owned and operated by Cornell University. See more information on arXiv.org.
Methodology: The keywords we selected, and their respective categories, are below:
- Artificial intelligence (cs.AI)
- Computation and language (cs.CL)
- Computer vision and pattern recognition (cs.CV)
- Machine learning (cs.LG)
- Neural and evolutionary computing (cs.NE)
- Robotics (cs.RO)
- Machine learning in stats (stats.ML)
For most categories, arXiv provided data for the years 1999 to 2017. For our analysis, we decided to start at the year 2010 in order to include Machine Learning in Stats, which did not exist on arXiv prior. To see other categories’ submission rates on arXiv, see arXiv.org’s submission statistics.
Categories are self-identified by authors — those shown are selected as the “primary” category. Therefore, it is worth noting that there is not one streamlined categorization process. Additionally, the Artificial intelligence or Machine learning categories may be categorized by other subfields / keywords.
arXiv team members have shared that participation on arXiv can breed more participation — meaning that an increase in a subcategory on arXiv could drive over-indexed participation by certain communities.
Growth of papers on arXiv does not reflect actual growth of papers on that topic. Some growth can be attributed to arXiv.org’s efforts to increase their paper count, or to the increasing importance of dissemination by AI communities.
What percentage of arXiv AI publications in the calendar year 2019 will be in the subcategory “Computation and Language”?
The question will resolve as per the data published by the 2020 AI Index annual report. If the methodology has substantially changed, the question resolves ambiguous.
Raw data for our analysis was provided to the AI Index team by representatives at arXiv.org. Historical data can be accessed here. Please make a copy by clicking "file" and then "make a copy" if you wish to edit it.
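For anyone working from the raw counts, the share in question could be computed roughly as follows. This is a sketch under one assumption: that "total publications in Artificial Intelligence" means the sum of primary-category submissions across the seven categories listed in the methodology. The function name is hypothetical, and the counts in the usage example are made-up placeholders, not arXiv data.

```python
# Category labels as given in the AI Index methodology excerpt.
AI_CATEGORIES = ("cs.AI", "cs.CL", "cs.CV", "cs.LG", "cs.NE", "cs.RO", "stats.ML")


def category_share(counts, category="cs.CL"):
    """Return `category` as a percentage of all AI-category submissions.

    Assumes the AI total is the sum over AI_CATEGORIES (an assumption,
    not something confirmed by the report).
    """
    missing = [c for c in AI_CATEGORIES if c not in counts]
    if missing:
        raise ValueError(f"missing counts for: {missing}")
    total = sum(counts[c] for c in AI_CATEGORIES)
    if total == 0:
        raise ValueError("total submission count is zero")
    return 100.0 * counts[category] / total


# Placeholder counts for illustration only; these are NOT real arXiv figures.
placeholder_counts = {
    "cs.AI": 100,
    "cs.CL": 150,
    "cs.CV": 300,
    "cs.LG": 250,
    "cs.NE": 50,
    "cs.RO": 50,
    "stats.ML": 100,
}
print(f"{category_share(placeholder_counts):.2f}%")
```

If the report computes the total differently (for example, by including cross-listed papers), the result would differ.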