Today’s illustration doesn’t have anything to do with the topic below. I made it for a ten minute talk I’ll give tomorrow, at the local “Physics Slam.” You can see the program here. Short version: Six physics faculty will have ten minutes each to explain something. The audience votes on their favorite presentation. Apparently, when it was done last a few years ago, several hundred people came. We’ll see what happens this time! My title:
Why do bacteria care about physics?
At some point, I should practice…
Now on to today’s topic:
Everyone agrees that it’s impossible to keep up with the ever-expanding scientific literature. An interesting recent paper* takes a look at this phenomenon, verifying that the number of papers published every year is, indeed, growing exponentially:
* “Attention decay in science,” Pietro Della Briotta Parolo et al., http://arxiv.org/pdf/1503.01881v1.pdf
The authors look at what this means for scientific “memory.” In general, the rate at which a paper is cited by later papers decays over time (after an initial peak), as it is forgotten or as it gives rise to other works that are cited instead. One might guess that growth in the publication rate correlates with a larger decay rate for citations — we spend less time with the past as we’re swamped by new stuff. This is indeed what Parolo et al. find: a decay rate that has steadily grown over decades. This is unfortunate: by not considering papers of the more distant past we risk needlessly re-discovering insights, and we disconnect ourselves from our fields’ pioneering perspectives.
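The decay rate itself is simple to estimate: if a paper’s yearly citations fall off roughly exponentially after their peak, a log-linear least-squares fit recovers the rate. Here’s a minimal sketch in Python, using made-up citation counts (illustrative only, not data from Parolo et al.):

```python
import math

# Hypothetical yearly citation counts for one paper, starting at its
# post-publication peak (invented numbers, roughly 100 * exp(-0.3 * t)).
citations = [100, 74, 55, 41, 30, 22]

# Least-squares fit of log(citations) vs. year: if c(t) ~ c0 * exp(-rate * t),
# then log c(t) is linear in t with slope -rate.
years = list(range(len(citations)))
logs = [math.log(c) for c in citations]
n = len(years)
mean_t = sum(years) / n
mean_y = sum(logs) / n
slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(years, logs)) \
        / sum((t - mean_t) ** 2 for t in years)
rate = -slope  # decay rate in units of 1/year
print(f"decay rate ≈ {rate:.2f} per year")
```

The paper’s finding, in these terms, is that this fitted rate has been creeping upward for papers published in more recent years.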
Returning to the overall number of papers: I wonder if this terrifying growth is driven primarily by an increase in the number of scientists or by an increase in papers written per person. I suspect the former. Even within the US, there are a lot more scientists than there used to be [e.g. this graph]. In the developing world this increase is far more dramatic (see e.g. here), as (presumably) it should be.
Unfortunately, I can’t find any data on the total number of scientists worldwide — at least not with just a few minutes of searching — or even the total number of Ph.D.s awarded each year.
Looking around for any data that might help illuminate trends of population and paper production, I stumbled upon historical data for the American Physical Society (APS), namely the number of members in each year since 1905 (http://www.aps.org/membership/statistics/upload/historical-counts-14.pdf). It’s not hard to tabulate the total number of papers published each year in the Physical Review journals — the publications of the APS. Looking at how each of these changes with time might give a rough sense of whether one tracks the other. Of course, there are a lot of problems with interpreting any correlation between these two things: APS members (like me) publish in all sorts of journals, not just APS ones; non-APS members publish in APS journals; etc. Still, let’s see what these two look like:
Just considering APS journals alone, the number of papers published each year is 10 times what it was a few decades ago! Within the microcosm of APS, the number of papers being published has been growing at a far faster rate than the membership.
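The comparison here boils down to comparing two exponential growth rates. A small sketch of that arithmetic, with placeholder numbers chosen only for illustration (not the actual APS figures — though the tenfold paper growth matches the statement above):

```python
import math

# Placeholder counts, 30 years apart (NOT the real APS data):
# membership grows modestly; papers per year grow tenfold.
members_then, members_now = 30_000, 50_000
papers_then, papers_now = 2_000, 20_000
span = 30  # years

def annual_growth(old, new, years):
    """Continuously compounded growth rate r, where new = old * exp(r * years)."""
    return math.log(new / old) / years

member_rate = annual_growth(members_then, members_now, span)
paper_rate = annual_growth(papers_then, papers_now, span)
print(f"membership: {member_rate:.1%}/yr, papers: {paper_rate:.1%}/yr")
```

With these placeholder numbers, papers grow several times faster than membership — which is the qualitative pattern in the APS data.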
What does all this mean? I don’t really know. We can’t do much about the general complaint that there are too many papers to read without some deeper understanding of why we’re in this state. Lacking that, I suppose we’re just stuck reading papers as best we can, or feeling guilty for not reading…
That is a nice fish.
I’d be interested in the number of consequential papers over time… say, plot the number of papers with 20 citations or more. At least some of the paper proliferation is due to journal proliferation, and papers about one factoid being accepted by journals that will publish anything.
On the other hand, it also seems like papers at normal journals need to go into more and more depth to be accepted. When I was in grad school a fly paper was a good description of a phenotype and a mapping to where in the genome the mutation might be. Now you need a pathway, a molecular function, expression data, etc.
Yes, I certainly agree — there’s a strange paradox that lots of terrible stuff is being published, while it’s hard to get “good,” small things published.
About consequential papers — yes, this would be neat to see. The closest thing I can think of to a study of this is a paper on Physics citations over time, http://physics.bu.edu/~redner/pubs/pdf/citations-pr-rev.pdf . It has graphs of things like citation probability vs. years after publication (Fig 7) for papers published in various years, which are all very similar to each other even in the high-probability end of the curve.