Keynote

Keynote Speaker

Dr. Chanchal Roy

University of Saskatchewan, Canada

ABSTRACT

Large scale clone detection, analysis and benchmarking: an evolutionary perspective

Copying a code fragment and then reusing it by pasting and adapting (e.g., adding/modifying/deleting statements) is a common practice in software development, resulted in a significant amount of duplicated code in software systems. Developers consider cloning as one of the principled reengineering approaches and often intentionally practice cloning for a variety of reasons such as faster development, avoiding risk by reusing stable old code, or for time pressure. On the other hand, duplicated code poses a number of threats to the maintenance of software systems such as clones are the #1 "bad smell" in Fowler's refactoring list and there are several recent studies including studies with industrial systems show that although for many cases clones are not really harmful, and even could be useful in many cases, they could be also detrimental to software maintenance. For example, reusing a fragment containing unknown bugs may result in bugs propagation, or any changes in requirements involving a cloned fragment may lead to changes to all the similar fragments to it, multiplying the work to be done. Furthermore, inconsistent changes to the cloned fragments during any updating processes may lead to severe unexpected behaviour. Software clones are thus considered to be one of the major contributors to the high software maintenance cost, which could be up to 80% of total software development cost. The era of Big Data has introduced new applications for clone detection. For example, clone detection has been used to find similar mobile applications, to intelligently tag code snippets, to identify code examples, and so on from large inter-project repositories. The dual role of clones in software development and maintenance, along with these many emerging new applications of clone detection, has led to a great many clone detection tools and analysis frameworks. In this keynote talk, I will review the cloning literature to date, in particular, I will talk about our recent work on large scale clone detection, and the challenges in evaluating such clone detectors and how we have overcome them at least in part with our BigCloneBench and Mutation framework. I will then talk about the recent advances in clone analysis and management along with a vision for a comprehensive clone management system.