Abstract:
Researchers in dynamic program analysis have used cluster analysis extensively to address a variety of problems. Typically, the clustering techniques are applied to execution profiles of high dimensionality (i.e., involving a large number of profiling elements), sometimes on the order of thousands or even hundreds of thousands. Our concern is that such a large number of profiling elements might diminish the effectiveness of the clustering process, which led us to explore dimensionality reduction as a preprocessing step to clustering. Specifically, in this work we used Principal Component Analysis (PCA) as the dimensionality reduction technique and investigated its impact on two cluster-based analysis techniques, one aimed at identifying coincidentally correct tests and the other at test suite minimization. In other words, we assessed whether PCA improves cluster-based analysis. Our experimental results showed that the impact was positive on the first technique but inconclusive on the second, which calls for further investigation. © 2013 IEEE.
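
As an illustration of the general idea of applying PCA before clustering execution profiles, the following minimal sketch may be helpful. It is not the setup used in this work: the synthetic coverage matrix, the 95% retained-variance threshold, the choice of k-means with two clusters, and the helper names pca_reduce and kmeans are all assumptions made for the example.

```python
import numpy as np


def pca_reduce(profiles, variance_kept=0.95):
    """Project profiles (one row per test) onto the leading principal components.

    variance_kept is an assumed threshold, not a value taken from the paper.
    """
    X = np.asarray(profiles, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each profiling element
    # SVD of the centered data: rows of Vt are the principal directions.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = S ** 2
    total = variances.sum()
    if total == 0:                               # all profiles identical
        return Xc[:, :1]
    cumulative = np.cumsum(variances) / total
    # Smallest number of components whose cumulative variance reaches the threshold.
    k = min(int(np.searchsorted(cumulative, variance_kept)) + 1, len(S))
    return Xc @ Vt[:k].T


def kmeans(points, k, iterations=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        nearest = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centers], axis=0
        )
        centers.append(points[np.argmax(nearest)])
    centers = np.array(centers)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iterations):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic stand-in for execution profiles: 40 tests x 500 profiling elements.
# The first 20 tests mostly cover elements 0..249, the last 20 mostly 250..499.
rng = np.random.default_rng(0)
coverage_prob = np.full((40, 500), 0.1)
coverage_prob[:20, :250] = 0.9
coverage_prob[20:, 250:] = 0.9
profiles = (rng.random((40, 500)) < coverage_prob).astype(float)

reduced = pca_reduce(profiles)          # PCA as the preprocessing step
labels = kmeans(reduced, k=2)           # clustering on the reduced profiles

print("original dimensions:", profiles.shape[1])
print("reduced dimensions: ", reduced.shape[1])
print("cluster labels:     ", labels.tolist())
```

Because the number of retained components can never exceed the number of tests, the reduction is substantial whenever profiling elements greatly outnumber tests, which is the situation described above. Whether clustering quality actually improves depends on the data and the downstream technique.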