Category Archives: data mining

Multivariate Analysis of Limestone Petrography Data On Kalipucang Formation Using R (iGeos 2017)

Multivariate Analysis of Limestone Petrography Data On Kalipucang Formation Using R

Achmad Darul1,#, Dasapta Erwin Irawan2, Jejen Ramdani1, Fauzan Septiana1, Siti Saniyyah Sholihat3
1 Fakultas Teknik dan Desain, Institut Teknologi Sains Bandung, Jalan Ganesha Boulevard LOT A-1 CBD Deltamas, Bekasi, Indonesia
2 Fakultas Ilmu dan Teknologi Kebumian, Institut Teknologi Bandung, Jalan Ganesha no. 10 Bandung, Indonesia
3 Fakultas Ilmu Pendidikan, Universitas Pendidikan Indonesia, Jalan Dr. Setiabudhi no. 229, Bandung, Indonesia

Abstract. Limestone is one of the most strategic construction materials. Its physical properties are controlled by its chemical properties. Different types of limestone can be distinguished by examining thin sections. However, classifying limestones from qualitative observation alone remains difficult, and geologists also need a method to classify large numbers of samples based on training data. This paper applies multivariate statistical techniques (principal component analysis and cluster analysis) to assist sample classification. We used 57 thin-section samples from the Kalipucang Formation, collected at three locations: Pancatengah-Tasikmalaya (PCT), Cijulang-Ciamis (CJL), and Sindangsari-Ciamis (SDS). R, an open-source statistical package, was used in the analysis. The classification obtained from our training data is consistent with the initial visual classification. Each location shows a distinct petrographic composition. Group 1 shows the dominant control of the depositional environment, with high values of foraminifera, algae, carbonate mud, and coral fragments. Group 2 shows mixing with igneous rock: plagioclase, opaque minerals, glass, and pyroxene. Group 3 shows mixing with transported sediment, with traces of quartz, iron oxides, and rock fragments. However, more trials with additional data sets are needed to test this method.

Key words: Limestones, Multivariate analysis, Petrography.

Note: This abstract has already been presented at the iGeos International Conference 2017. We are currently revising the language. We still need to link our GitHub page for the data and R code. The complete first version can be accessed at the INA-Rxiv preprint server.
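
As a rough illustration of the workflow the abstract describes (principal component analysis followed by cluster analysis), here is a minimal base R sketch. It runs on synthetic data invented for this example, not on the Kalipucang thin-section counts. The component names are borrowed from the abstract, while the group profiles, the sample size per group, the use of two principal components, and Ward linkage are all assumptions rather than the authors' settings.

```r
# Minimal sketch: PCA followed by hierarchical clustering in base R.
# All numbers below are SYNTHETIC and made up for illustration only.
set.seed(42)

n_per_group <- 20  # arbitrary; not the real sampling design
components <- c("foraminifera", "algae", "plagioclase",
                "pyroxene", "quartz", "iron_oxide")

# Hypothetical mean abundance (%) of each component in three groups
profiles <- rbind(
  c(30, 25,  2,  1,  2,  1),  # group 1: biogenic components dominate
  c(10,  8, 20, 15,  2,  1),  # group 2: igneous components present
  c(10,  8,  2,  1, 20, 15)   # group 3: transported-sediment components
)
colnames(profiles) <- components

# Simulate samples around each group profile (noise sd = 2, clipped at 0)
x <- do.call(rbind, lapply(1:3, function(g) {
  means <- rep(profiles[g, ], each = n_per_group)
  m <- matrix(rnorm(length(means), mean = means, sd = 2),
              nrow = n_per_group)
  pmax(m, 0)
}))
colnames(x) <- components
true_group <- rep(1:3, each = n_per_group)

# PCA on centered, standardized variables
pca <- prcomp(x, center = TRUE, scale. = TRUE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Cluster the samples on their first two principal component scores
scores <- pca$x[, 1:2]
hc <- hclust(dist(scores), method = "ward.D2")
cluster <- cutree(hc, k = 3)

# Cross-tabulate the clusters against the groups used to simulate the data
print(table(true_group, cluster))
```

With real data, `x` would instead be read from a table of per-sample petrographic counts, and the cross-tabulation would compare the clusters with the initial visual classification.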


Some visualisations of Bandung water quality data

Here I explore a few more types of visualization to understand groundwater behavior from a groundwater quality data set. I have 142 water quality data points measured in 2015. The data set can be downloaded from our OSF repository. We are currently writing a paper based on a multivariate analysis of this data set. I use free apps to produce all plots, and I will add more plots as the analysis moves along.


Mining PLOS and PubMed data


This post was inspired by Jon Tennant’s post on his blog (here). He discussed the number of paleontology papers published in PLOS ONE. His post was mainly based on his code using the `rplos` package (GitHub repo/CRAN repo) from the `ropensci` community. Jon’s post rekindled my interest in R, especially in text mining. So in this post, I will connect his original post with my research on the biodiversity of the Cikapundung riverbank area (on Figshare).
