We investigated the number of references per page for different European Geosciences Union journals, which share the same text formatting. Although the journals formally all focus on geoscience, different disciplines are covered, from ocean science and biogeosciences to the technical description of numerical model development. In this study, we show that the number of references per page is remarkably constant across these journals. In addition, this value has remained constant in the last decade, despite the consistent increase in the number of pages and in the number of references in almost all journals considered. Independently of the quality of the references used in an article, we show that for the EGU (European Geosciences Union) journals the average number of references per page is 3.82 (1.87–6.18 at 90 % confidence level). This reveals that there is a consensus regarding optimum reference density, which depends on the journal's layout and not on the journal's discipline.

The number of references in a scientific paper can influence reader
judgement of the paper's quality

It has been shown

In the last decades, the length of scientific papers has undergone a significant increase.

there is a drop in short reference lists and a corresponding increase in a bit longer and medium-sized reference lists. Long and very long reference lists remain much more stable in shares over time and therefore do not contribute much to the observed growth.

A steady state in reference numbers has until now only been artificially reached in a few journals and/or manuscript types, through the enforcement of limits in the number of references.

The European Geophysical Society (the predecessor of the European Geosciences Union) started its first open-access (OA) journal in 2001, with the launch of the journal

In this work we examined the OA EGU journals, which have identical layouts and therefore allow for a direct comparison between the different journals. In addition, all the paper-related metadata have been published online in a searchable XML format, which allows for automatic computer scripting for information gathering. It must be stressed that Copernicus Publications publish other OA journals in addition to the EGU journals considered. However, these journals use diverse layouts, which hinders the comparison between them.

In this work we analyse the reference density, i.e.
the number of references per page, in the OA journals published by the EGU.
The goal is to investigate whether the reference density varies among
journals which cover different topics but have the exact same layout.
We show that there exists a well defined range for the number of references per page, similar for all OA EGU journals,
and that this has remained remarkably constant over time.
In the Sect.

We considered articles accepted and published in XML form in the 2010–2020 period
from the EGU OA journals.
Therefore, only EGU journals which started operating in 2010 at the latest
were used in this study, which resulted in the inclusion of
a total of 12 journals (see Table

Summary of journal characteristics. The number of papers analysed in each journal is listed, as well as the number of papers excluded (also expressed as a fraction) as outliers. The papers with the highest number of pages and the highest number of references are also listed for each journal.

An automatic Python script was used to recursively collect all the information required, such as the number of pages and the number of references, from the XML version of each article.
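
The collection step can be illustrated with a minimal sketch. The element names `page_count` and `reference` below are hypothetical placeholders, since the actual XML schema is not described here; only the general approach (parse each article's XML, read the page count, count the reference entries) follows the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names: the real XML schema may differ.
PAGE_TAG = "page_count"
REFERENCE_TAG = "reference"


def extract_counts(xml_text):
    """Return (number of pages, number of references) for one article."""
    root = ET.fromstring(xml_text)
    page_text = root.findtext(f".//{PAGE_TAG}")
    if page_text is None:
        raise ValueError("no page count found in article XML")
    # Count every reference element anywhere below the root.
    n_refs = len(root.findall(f".//{REFERENCE_TAG}"))
    return int(page_text), n_refs


sample = (
    "<article><page_count>12</page_count>"
    "<references><reference/><reference/><reference/></references>"
    "</article>"
)
pages, n_refs = extract_counts(sample)
```

In practice such a function would be applied to every article file of every journal, with the results stored per journal and per publication year.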

To avoid counting papers which cited an unrepresentative number of references (such as zero references or pure compilation articles), the outliers, which were defined as (i) papers containing no references or (ii) papers containing a number of references above the average plus 3 times the standard deviation of the number of references, were removed. In total 30 028 papers were downloaded, of which 787 were excluded as outliers; i.e. 29 241 published papers were used in this analysis.
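
The outlier criterion can be written as a short filter. This is a sketch under two assumptions that the text does not specify: the average and standard deviation are computed over all downloaded papers of a journal (including those with no references), and the population standard deviation is used.

```python
from statistics import mean, pstdev


def remove_outliers(ref_counts):
    """Drop papers with no references or with more than mean + 3 * std references."""
    mu = mean(ref_counts)
    sigma = pstdev(ref_counts)  # assumption: population standard deviation
    upper = mu + 3 * sigma
    kept = [n for n in ref_counts if 0 < n <= upper]
    return kept, upper


# Illustrative numbers only: one paper without references, one compilation-like paper.
counts = [0] + [40] * 30 + [1000]
kept, upper = remove_outliers(counts)
```

With these illustrative counts, the paper with zero references and the paper with 1000 references are both removed, and the 30 ordinary papers are kept.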

In Table

In addition to the numbers of analysed and disregarded papers,
Table

For each journal analysed, we estimated the trends in number of pages, references and references per page: our results are presented in Table

Linear fit of the temporal trends of pages, references
and references per page for
different EGU journals for all analysed papers between 2010 and 2020.
The numbers inside the parentheses are the standard deviations
of the estimated time trends (slope of the linear fit).
The units are in
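
The trend estimate and its uncertainty can be reproduced with an ordinary least-squares fit. The sketch below is not the original analysis script; it assumes a plain linear fit with an intercept and reports the standard error of the slope, which corresponds to the value given in parentheses.

```python
from math import sqrt
from statistics import mean


def linear_trend(years, values):
    """Least-squares fit values = a + b * year; returns (b, standard error of b)."""
    n = len(years)  # at least three points are needed for the standard error
    x_bar, y_bar = mean(years), mean(values)
    sxx = sum((x - x_bar) ** 2 for x in years)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residual_ss = sum(
        (y - (intercept + slope * x)) ** 2 for x, y in zip(years, values)
    )
    std_err = sqrt(residual_ss / (n - 2) / sxx)
    return slope, std_err


# Illustrative data: page counts growing by exactly 0.3 pages per year.
years = list(range(2010, 2021))
pages = [15 + 0.3 * (y - 2010) for y in years]
slope, std_err = linear_trend(years, pages)
```

The same function can be applied to pages, references or references per page; whether the fit is done on individual papers or on annual means is a choice left open here.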

The increase in the number of citations may be attributed
to the growth of the available literature:
as more papers are published, more papers
can (or must) be cited in future work.
Analogously, the increase in the absolute number of citations
also reflects the maturity that a specific scientific field has reached,
whereby the large (and increasing) number of citations mirrors
the large (and increasing) amount of research performed on the specific topic. Furthermore, accessibility could be a major point
for increasing citations over time: OA papers (with the leading
role of pure OA journals) enable easy access to citable
material. In addition, technological development (e.g. fast internet connection, searchable and online downloadable journals)
facilitates the search and usage of previous literature.
Finally,

In addition to the increase in the number of pages and the number of references in the period 2010–2020, we also estimated the evolution of reference density (i.e. number of references per page) over this period.
As shown in Table

Based on these findings, we can consider the reference density to be constant in OA EGU journals, which, in turn, enables us to inspect all papers published in the period covered.

The probability density distribution of pages against references is
presented in Fig.

Shown are two-dimensional histograms (centre) with frequency histograms for pages (top) and references (right) for different EGU journals. The journal abbreviation and the total number of papers, pages and references are listed on the top right of each plot. The black line depicts the linear fit (with no intercept). The axes for the two-dimensional histograms are the same in all plots.
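
A linear fit with no intercept has a closed-form slope, namely the sum of pages times references divided by the sum of squared pages. The sketch below implements this formula; the function name and the example numbers are illustrative and not taken from the original analysis.

```python
def slope_through_origin(pages, refs):
    """Least-squares slope of refs = b * pages, i.e. a fit forced through the origin."""
    sxx = sum(p * p for p in pages)
    sxy = sum(p * r for p, r in zip(pages, refs))
    return sxy / sxx


# Illustrative papers with exactly four references per page.
b = slope_through_origin([10, 20, 30], [40, 80, 120])
```

Because the line passes through the origin, the slope can be read directly as a number of references per page.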

For each journal, the average numbers of pages and references
were calculated, and the results are presented in Table

Average numbers of pages, references and references per page (Refs. per page) for different EGU journals for all analysed papers. The range at 90 % confidence level is listed in parentheses.
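
A per-article reference density and its 90 % range can be summarised as follows. This is a sketch that assumes the range is taken as the interval between the 5th and 95th percentiles; the text does not state how the range was computed.

```python
from statistics import mean, quantiles


def density_range(pages, refs):
    """Mean reference density with an assumed 5th to 95th percentile range."""
    densities = [r / p for p, r in zip(pages, refs)]
    cuts = quantiles(densities, n=20)  # 19 cut points: 5 %, 10 %, ..., 95 %
    return mean(densities), cuts[0], cuts[-1]


# Illustrative data: 21 ten-page papers with 10, 20, ..., 210 references.
pages = [10] * 21
refs = [10 * k for k in range(1, 22)]
avg, p05, p95 = density_range(pages, refs)
```

The density is computed for each article first and then averaged, which is not the same as dividing the total number of references by the total number of pages.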

Finally, the average reference densities for each journal
(based on the reference density for each article) have been estimated (see Table

The reference density for each journal displays a classical log-normal distribution. Combining all the reference density distributions also results
(to a good approximation) in a log-normal distribution.
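
If the densities are treated as log-normal, their central value and spread are naturally described in log space. The following sketch fits the two parameters from the logarithms of the densities and derives a central 90 % interval; it is an illustration of the log-normal assumption, not the procedure used for the reported values.

```python
from math import exp, log
from statistics import NormalDist, mean, stdev


def lognormal_summary(densities, coverage=0.90):
    """Geometric mean and central interval under a log-normal assumption."""
    logs = [log(d) for d in densities]  # requires strictly positive densities
    mu, sigma = mean(logs), stdev(logs)
    z = NormalDist().inv_cdf(0.5 + coverage / 2)  # about 1.645 for 90 %
    return exp(mu), exp(mu - z * sigma), exp(mu + z * sigma)


# Illustrative densities, evenly spaced in log space.
centre, low, high = lognormal_summary([2.0, 4.0, 8.0])
```

The interval is symmetric around the geometric mean on a logarithmic axis, which is why it appears skewed towards high values on a linear axis.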

It is difficult to establish the cause of the relationship
between pages and references. Although it is clear that the number of pages and the number of references in a paper influence each other positively,
they are influenced both directly and indirectly by multiple factors, including, for example, the number of authors

The importance of references in scientific journals has been clearly established. In this work we took advantage of the OA EGU journals, which, although they cover different areas in geoscience, share the same layout, thereby allowing for a direct comparison. It is shown that in the period 2010–2020, the number of pages and the number of references have been increasing in a statistically significant way.

Different reasons may underlie this growth, such as open access to existing literature together with technological development which facilitates searching for relevant citations. Additionally, we suggest that this growth is especially strong in EGU journals, as geophysics is still a relatively immature field, with a growing number of researchers and, consequently, strong growth in the ensuing literature, which tends to be referenced increasingly in subsequent studies.

Despite the increases in publication length and number of
references in all journals since 2010, the reference density (i.e. number of references per page) has remained remarkably constant. In addition, no statistically significant difference in reference density can be observed among the journals. The average number of references per published page was estimated
based on all the published papers, which shows that the optimal reference density is

It has been shown that papers with a large number of references tend to be cited more

This work provides an indication for authors preparing their manuscript for EGU journals, suggesting how many references are “about right” in a paper. This is especially important for less experienced authors, as it shows whether their citation strategy fits with the existing body of literature. Furthermore, reviewers or editors should be particularly careful in evaluating manuscripts whose reference density is outside the range 1.87–6.18, as this indicates a non-standard (or outlying) manuscript with an uncommonly high (or low) number of references.
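
As an illustration of such a check, the sketch below compares a manuscript's reference density with the 1.87–6.18 range. The function name and labels are hypothetical, and the page count is assumed to refer to the final journal layout, since the range depends on that layout.

```python
def check_reference_density(n_refs, n_pages, low=1.87, high=6.18):
    """Classify a manuscript's references per page against the reported 90 % range."""
    density = n_refs / n_pages
    if density < low:
        label = "low"
    elif density > high:
        label = "high"
    else:
        label = "typical"
    return density, label


# Illustrative manuscript: 60 references on 15 typeset pages.
density, label = check_reference_density(60, 15)
```

A "low" or "high" label does not mean that a manuscript is flawed; it only marks a reference density that is uncommon among the papers analysed.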

The code for analysis is available upon request to the contact author.

The data used in this work are freely available under a Creative Commons licence (CC BY 4.0) on each EGU journal website.

The contact author has declared that there are no competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The author would like to thank Ulrich Pöschl for the constructive discussion, John Crowley for his support and help, and Sam Illingworth for the suggestions for improvements.

This paper was edited by Sam Illingworth and reviewed by two anonymous referees.