How I did it
- First install R, then RStudio (not the other way around),
- Fire up RStudio,
- Install and load the libraries rplos (Github/CRAN) and plotly (Github/CRAN),
- PLOS data: query the PLOS data directly with rplos and plot the result,
- PMC data: there is no package for querying it directly (yet), but here’s a workaround pieced together from several sources. Go to the FLink system and run some queries based on keywords. Be sure to choose your database from the pull-down menu,
- Export the query result,
- Then do some analysis and plotting in RStudio.
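To make the PLOS step a bit more concrete, here is a minimal sketch of what the query and the tally could look like. It is not the code from the repository. `fetch_plos_dates` and `count_by_month` are hypothetical helper names, and I am assuming that `rplos::searchplos()` returns a list whose `data` element holds a `publication_date` column of ISO date strings, so check that against your installed version.

```r
# Packages are assumed to be installed once beforehand, e.g.:
#   install.packages(c("rplos", "plotly"))

# Tally ISO date strings such as "2016-06-23T00:00:00Z" by month.
count_by_month <- function(dates) {
  months <- substr(dates, 1, 7)  # keep the "YYYY-MM" part
  counts <- table(months)        # table() sorts the labels, so months stay in order
  data.frame(month = names(counts),
             papers = as.integer(counts),
             stringsAsFactors = FALSE)
}

# Hypothetical helper: fetch publication dates for a keyword from PLOS.
# Needs network access and the rplos package; the shape of the return
# value (res$data$publication_date) is an assumption.
fetch_plos_dates <- function(keyword, limit = 100) {
  res <- rplos::searchplos(q = keyword, fl = "publication_date", limit = limit)
  res$data$publication_date
}

# Example use (commented out because it needs a network connection):
# monthly <- count_by_month(fetch_plos_dates("hydrogeology"))
# plotly::plot_ly(monthly, x = ~month, y = ~papers, type = "bar")
```

Keeping the tally separate from the download means you can try it on a handful of made-up dates before hitting the PLOS servers.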
The Flink screenshots
Opening page and database menu
The query window: you can search by PMID or keywords
The search result, which you can save as CSV
Then I wrote some code, following Jon’s code from his blog and adding some of my own (on Github, containing: plos_analysis.R, pmc.csv, and pmc2.csv)
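For the exported CSV, the analysis boils down to reading the file and counting papers per year. Here is a rough sketch of that idea using base R only. It is not the contents of plos_analysis.R, `papers_per_year` is a hypothetical helper, and the column name `year` is an assumption, so check the header of your own export and pass the right name.

```r
# Read an exported CSV and count rows per year.
# `year_col` is a hypothetical column name; adjust it to match your export.
papers_per_year <- function(path, year_col = "year") {
  records <- read.csv(path, stringsAsFactors = FALSE)
  if (!year_col %in% names(records)) {
    stop("Column '", year_col, "' not found; available: ",
         paste(names(records), collapse = ", "))
  }
  counts <- table(records[[year_col]])
  data.frame(year = as.integer(names(counts)),
             papers = as.integer(counts))
}

# Example use (commented out because it needs the exported file):
# yearly <- papers_per_year("pmc.csv")
# plot(yearly$year, yearly$papers, type = "h",
#      xlab = "Year", ylab = "Number of papers")
```

If the export only has a full date column, you would first cut the year out of it (for example with `substr()`) before counting.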
And here are the results
From PLOS data
Using keyword ‘political science’. I don’t know what happened at the spike. Was it the US election and Brexit?
Using keyword ‘earth science’. It would be interesting to dig into what happened at the spike.
Using keyword ‘hydrogeology’. Kind of weird, huh? As if it were controlled by the wet and dry seasons. And there were only five, yes five, papers at every spike. Compare that with the hundreds of papers for earth science and political science.
From PMC data
Using keyword ‘hydrogeology’
Using keyword ‘groundwater – river water interaction’