Proceedings of the 2014 International Conference on Program Comprehension

Fullname:ICPC'14: 22nd International Conference on Program Comprehension
Editors:Chanchal K. Roy; Andrew Begel; Leon Moonen
Location:Hyderabad, India
Dates:2014-Jun-02 to 2014-Jun-03
Standard No:ISBN: 978-1-4503-2879-1; ACM DL: Table of Contents; hcibib: ICPC14
  1. Invited Talks
  2. Architecture
  3. Supporting Software Engineers
  4. Collaborative and Human Aspects
  5. Recommendations
  6. Joint Session with CHASE 1
  7. Joint Session with CHASE 2
  8. Understanding Comprehension
  9. Software Quality
  10. Novel Development Tooling

Invited Talks

Innovating in India: designing for constraint, computing for inclusion (keynote) BIBAFull-Text 1
  Edward Cutrell
A fundamental tenet of user-centered design is that the needs, wants, limitations, and contexts of end users are central to the process of creating products and services that can be used and understood by the people who will use them. Most of the time these end users aren't all that different from the people designing the technology. But as the differences increase between designers and the people they're designing for, understanding and empathizing with users becomes harder and even more important. As we build software for people and communities with vastly diverse backgrounds, cultures, languages, and education, we need to stretch our ideas of what users want and need and how best to serve them.
   The Technology for Emerging Markets (TEM) group at Microsoft Research India seeks to address the needs and aspirations of people in the developing world who are just beginning to use computing technologies and services as well as those for whom access to computing still remains largely out of reach. Much of this work can be described as designing for constraint: constraints in education, in infrastructure, in financial resources, in languages and in many other areas. In this talk, I will describe some work from our group that explores how we have tried to manage these constraints to create software and systems for people and communities often overlooked by technologists.
The MoJo family: a story about clustering evaluation (invited talk) BIBAFull-Text 2
  Zhihua Wen; Vassilios Tzerpos
The need to decompose large, complex software systems into smaller, more manageable subsystems has been recognized for more than two decades. Many cluster analysis algorithms have been applied to the software domain, and several algorithms specializing in software clustering have been developed. This in turn has created the need to evaluate and compare clustering results.
   This talk will present some background on the software clustering problem and its challenges, as well as the software clustering evaluation and its challenges. It will then discuss the MoJo family of measures with an emphasis on MoJoFM (originally presented at IWPC 2004).

Architecture

Do architectural design decisions improve the understanding of software architecture? two controlled experiments BIBAFull-Text 3-13
  Mojtaba Shahin; Peng Liang; Zengyang Li
Architectural design decision (ADD) and its design rationale, as a paradigm shift on documenting and enriching architecture design description, is supposed to facilitate the understanding of architecture and the reasoning behind the design rationale, which consequently improves the architecting process and gets better architecture design results. But the lack of empirical evaluation that supports this statement is one of the major reasons that prevent industrial practitioners from using ADDs in their daily architecting activities. In this paper, we conducted two controlled experiments, as a family of experiments, to investigate how presence of ADDs can improve the understanding of architecture. The main results of our experiments are: (i) using ADDs and their rationale in architecture documentation does not affect the time needed for completing architecture design tasks; (ii) one experiment and the family of experiments achieved a significantly better understanding of architecture design when using ADDs; and (iii) with regard to the correctness of architecture understanding, more experienced participants benefited more from ADDs in comparison with less experienced ones.
Revealing the relationship between architectural elements and source code characteristics BIBAFull-Text 14-25
  Vanius Zapalowski; Ingrid Nunes; Daltro José Nunes
Understanding how a software system is structured, i.e. its architecture, is crucial for software comprehension. It allows developers to understand an implemented system and reason about how non-functional requirements are addressed. Yet, many systems lack any architectural documentation, or it is often outdated due to software evolution. In current practice, the process of recovering a system's architecture relies primarily on developer knowledge. Although existing architecture recovery approaches can help to identify architectural elements, these approaches require improvement to identify architectural concepts of a system automatically. Towards this goal, we analyze the usefulness of adopting different code-level characteristics to group elements into architectural modules. Our main contributions are an evaluation of the relationships between different sets of characteristics and their corresponding accuracies, and the evaluation results, which help us to understand which characteristics reveal information about the source code structure. Our experiment shows that an identified set of characteristics achieves an average accuracy of 80%, which indicates the usefulness of the considered characteristics for architecture recovery and thus to improving software comprehension.

Supporting Software Engineers

Understanding LDA in source code analysis BIBAFull-Text 26-36
  David Binkley; Daniel Heinz; Dawn Lawrie; Justin Overfelt
Latent Dirichlet Allocation (LDA) has seen increasing use in the understanding of source code and its related artifacts in part because of its impressive modeling power. However, this expressive power comes at a cost: the technique includes several tuning parameters whose impact on the resulting LDA model must be carefully considered. An obvious example is the burn-in period; too short a burn-in period leaves excessive echoes of the initial uniform distribution. The aim of this work is to provide insights into the tuning parameters' impact. Doing so improves the comprehension of both 1) researchers who look to exploit the power of LDA in their research and 2) those who interpret the output of LDA-using tools. It is important to recognize that the goal of this work is not to establish values for the tuning parameters because there is no universal best setting. Rather, appropriate settings depend on the problem being solved, the input corpus (in this case, typically words from the source code and its supporting artifacts), and the needs of the engineer performing the analysis. This work's primary goal is to aid software engineers in their understanding of the LDA tuning parameters by demonstrating numerically and graphically the relationship between the tuning parameters and the LDA output. A secondary goal is to enable more informed setting of the parameters. Results obtained using both production source code and a synthetic corpus underscore the need for a solid understanding of how to configure LDA's tuning parameters.
A diagnosis-based approach to software comprehension BIBAFull-Text 37-47
  Alexandre Perez; Rui Abreu
Program comprehension is a time-consuming task performed during the process of reusing, reengineering, and enhancing existing systems. Currently, there are tools to assist in program comprehension by means of dynamic analysis, but, e.g., most cannot identify the topology and the interactions of a certain functionality in need of change, especially when used in large, real-world software applications. We propose an approach, coined Spectrum-based Feature Comprehension (SFC), that borrows techniques used for automatic software-fault-localization, which were proven to be effective even when debugging large applications in resource-constrained environments. SFC analyses the program by exploiting run-time information from test case executions to compute the components that are important for a given feature (and whether a component is used to implement just one feature or more), helping software engineers to understand how a program is structured and what the functionality's dependencies are. We present a toolset, coined Pangolin, that implements SFC and displays its report to the user using an intuitive visualization. A user study with the open-source application Rhino is presented, demonstrating the efficiency of Pangolin in locating the components that should be inspected when changing a certain functionality.
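The spectrum-based idea in the abstract above can be illustrated with a minimal sketch. It assumes the Ochiai similarity coefficient, which is common in spectrum-based fault localization; the abstract does not say which coefficient SFC uses, and the function and parameter names here are hypothetical, not taken from Pangolin.

```python
from math import sqrt


def ochiai_scores(spectra, feature_runs):
    """Score each component by how strongly it is associated with a feature.

    spectra:      list of sets; spectra[i] holds the components hit by test run i.
    feature_runs: list of bools; feature_runs[i] is True if run i exercised the
                  feature of interest (the role played by "failing" runs in
                  fault localization).
    Ochiai is an assumption; the source does not name the coefficient.
    """
    components = set().union(*spectra)
    total_feature = sum(feature_runs)
    scores = {}
    for component in components:
        # Runs that hit the component and exercised the feature.
        n11 = sum(1 for hit, f in zip(spectra, feature_runs) if component in hit and f)
        # Runs that hit the component without exercising the feature.
        n10 = sum(1 for hit, f in zip(spectra, feature_runs) if component in hit and not f)
        denom = sqrt(total_feature * (n11 + n10))
        scores[component] = n11 / denom if denom else 0.0
    return scores
```

A component hit only by runs that exercise the feature scores close to 1, while one shared with unrelated runs scores lower, which loosely mirrors the distinction the abstract draws between components used by one feature and components used by several.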
dsOli: data structure operation location and identification BIBAFull-Text 48-52
  David H. White
Comprehension of C programs can be a difficult task, especially when they contain pointer-based dynamic data structures. This paper describes our tool dsOli which aims to simplify this problem by automatically locating and identifying data structure operations in C programs, such as inserting into a singly linked list. The approach is based on a dynamic analysis that seeks to identify functional units in a program by observing repetitive temporal patterns caused by multiple invocations of code fragments. The behaviour of these functional units is then classified by matching the associated heap states against templates describing common data structure operations. The analysis results are available to the user via XML output, and can also be viewed using an intuitive GUI which overlays the learnt information on the program source code.
Version history, similar report, and structure: putting them together for improved bug localization BIBAFull-Text 53-63
  Shaowei Wang; David Lo
During the evolution of a software system, a large number of bug reports are submitted. Locating the source code files that need to be fixed to resolve the bugs is a challenging problem. Thus, there is a need for a technique that can automatically figure out these buggy files. A number of bug localization solutions that take in a bug report and output a ranked list of files sorted based on their likelihood to be buggy have been proposed in the literature. However, the accuracy of these tools still needs to be improved.
   In this paper, to address this need, we propose AmaLgam, a new method for locating relevant buggy files that puts together version history, similar reports, and structure. To do this, AmaLgam integrates a bug prediction technique used in Google which analyzes version history, with a bug localization technique named BugLocator which analyzes similar reports from a bug report system, and the state-of-the-art bug localization technique BLUiR which considers structure. We perform a large-scale experiment on four open source projects, namely AspectJ, Eclipse, SWT and ZXing to localize more than 3,000 bugs. Compared with a history-aware bug localization solution of Sisman and Kak, our approach achieves a 46.1% improvement in terms of mean average precision (MAP). Compared with BugLocator, our approach achieves a 24.4% improvement in terms of MAP. Compared with BLUiR, our approach achieves a 16.4% improvement in terms of MAP.
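Mean average precision, the metric reported above, can be sketched as follows. This is the standard textbook definition rather than the authors' evaluation script. The function names are hypothetical, and the sketch assumes each query pairs one bug report's ranked file list with the set of files actually fixed for it.

```python
def average_precision(ranked_files, buggy_files):
    """Average precision for one bug report.

    ranked_files: files ordered from most to least suspicious.
    buggy_files:  set of files that were actually fixed for this report.
    Precision is sampled at each rank where a buggy file appears, then
    averaged over all buggy files (files never retrieved contribute zero).
    """
    hits = 0
    precision_sum = 0.0
    for rank, path in enumerate(ranked_files, start=1):
        if path in buggy_files:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(buggy_files) if buggy_files else 0.0


def mean_average_precision(queries):
    """Mean of average_precision over (ranked_files, buggy_files) pairs."""
    if not queries:
        return 0.0
    return sum(average_precision(r, b) for r, b in queries) / len(queries)
```

Because precision is sampled at the rank of every buggy file, moving a buggy file from rank 3 to rank 1 raises the score even when the set of retrieved files is unchanged.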
Understanding the database manipulation behavior of programs BIBAFull-Text 64-67
  Nesrine Noughi; Marco Mori; Loup Meurice; Anthony Cleve
Due to the lack of (up-to-date) documentation, software maintenance and evolution processes often necessitate the recovery of a sufficient understanding of the software system, before the latter can be adapted to new or changing requirements. To address this problem, several program comprehension techniques have been proposed to support this preliminary phase of software maintenance and evolution. Nevertheless, those techniques generally fail in gaining a complete and accurate understanding in the case of modern data-intensive systems, which are characterized by complex, dynamic and continuous interactions between the application programs and their database. In particular, understanding the database manipulation behavior of a given program involves different levels of comprehension ranging from identifying to relating and interpreting the successive database access operations. In this paper, we present our early research achievements in the development of a tool-supported framework aiming to extract and understand the database manipulation behavior of data-intensive programs.
On mapping releases to commits in open source systems BIBAFull-Text 68-71
  Joe F. Shobe; Md Yasser Karim; Motahareh Bahrami Zanjani; Huzefa Kagdi
The paper presents an empirical study on the release naming and structure in three open source projects: Google Chrome, GNU gcc, and Subversion. Their commonality and variability are discussed. An approach is developed that establishes the mapping from a particular release (major or minor) to the specific earliest and latest revisions, i.e., a commit window of a release, in the source control repository. For example, the major release 25.0 in Chrome is mapped to the earliest revision 157687 and latest revision 165096 in the trunk. This mapping between releases and commits would facilitate a systematic choice of history in units of the project evolution scale (i.e., commits that constitute a software release). A projected application is forming a training set for a source-code change prediction model: techniques such as association rule mining or machine learning need commits from the source code history.

Collaborative and Human Aspects

Ranking crowd knowledge to assist software development BIBAFull-Text 72-82
  Lucas B. L. de Souza; Eduardo C. Campos; Marcelo de A. Maia
StackOverflow.com (SO) is a Question and Answer service oriented to support collaboration among developers in order to help them solve their issues related to software development. In SO, developers post questions related to a programming topic and other members of the site can provide answers to help them. The information available on this type of service is also known as "crowd knowledge" and is currently an important trend in supporting activities related to software development and maintenance.
   We present an approach that makes use of "crowd knowledge" available in SO to recommend information that can assist developers in their activities. This strategy recommends a ranked list of pairs of questions/answers from SO based on a query (list of terms). The ranking criteria are based on two main aspects: the textual similarity of the pairs with respect to the query (the developer's problem) and the quality of the pairs. Moreover, we developed a classifier to consider only "how-to" posts. We conducted an experiment considering programming problems on three different topics (Swing, Boost and LINQ) widely used by the software development community to evaluate the proposed recommendation strategy. The results have shown that for 77.14% of the assessed activities, at least one recommended pair proved to be useful concerning the target programming problem. Moreover, for all activities, at least one recommended pair had a source code snippet considered reproducible or almost reproducible.
How do API changes trigger stack overflow discussions? a study on the Android SDK BIBAFull-Text 83-94
  Mario Linares-Vásquez; Gabriele Bavota; Massimiliano Di Penta; Rocco Oliveto; Denys Poshyvanyk
The growing number of questions related to mobile development in StackOverflow highlights an increasing interest of software developers in mobile programming. For the Android platform, 213,836 questions were tagged with Android-related labels in StackOverflow between July 2008 and August 2012. This paper aims at investigating how changes occurring to Android APIs trigger questions and activity in StackOverflow, and whether this is particularly true for certain kinds of changes. Our findings suggest that Android developers usually have more questions when the behavior of APIs is modified. In addition, deleting public methods from APIs is a trigger for questions that are (i) more discussed and of major interest for the community, and (ii) posted by more experienced developers. In general, results of this paper provide important insights about the use of social media to learn about changes in software ecosystems, and establish solid foundations for building new recommenders for notifying developers/managers about important changes and recommending them relevant crowdsourced solutions.
Towards more accurate content categorization of API discussions BIBAFull-Text 95-105
  Bo Zhou; Xin Xia; David Lo; Cong Tian; Xinyu Wang
Nowadays, software developers often discuss the usage of various APIs in online forums. Automatically assigning pre-defined semantic categories to API discussions in these forums could help manage the data in online forums, and help developers search for useful information. We refer to this process as content categorization of API discussions. To solve this problem, Hou and Mo proposed the usage of naive Bayes multinomial, which is an effective classification algorithm.
   In this paper, we propose a Cache-bAsed compoSitE algorithm, short formed as CASE, to automatically categorize API discussions. Considering that the content of an API discussion contains both textual description and source code, CASE has 3 components that analyze an API discussion in 3 different ways: text, code, and original. In the text component, CASE only considers the textual description; in the code component, CASE only considers the source code; in the original component, CASE considers the original content of an API discussion which might include textual description and source code. Next, for each component, since different terms (i.e., words) have different affinities to different categories, CASE caches a subset of terms which have the highest affinity scores to each category, and builds a classifier based on the cached terms. Finally, CASE combines all the 3 classifiers to achieve a better accuracy score. We evaluate the performance of CASE on 3 datasets which contain a total of 1,035 API discussions. The experiment results show that CASE achieves accuracy scores of 0.69, 0.77, and 0.96 for the 3 datasets respectively, which outperforms the state-of-the-art method proposed by Hou and Mo by 11%, 10%, and 2%, respectively.
CODES: mining source code descriptions from developers discussions BIBAFull-Text 106-109
  Carmine Vassallo; Sebastiano Panichella; Massimiliano Di Penta; Gerardo Canfora
Program comprehension is a crucial activity, preliminary to any software maintenance task. Such an activity can be difficult when the source code is not adequately documented, or the documentation is outdated. Unlike the many existing software re-documentation approaches, which are based on different kinds of code analysis, this paper describes CODES (mining sourCe cOde Descriptions from developErs diScussions), a tool which applies a "social" approach to software re-documentation. Specifically, CODES extracts candidate method documentation from StackOverflow discussions, and creates Javadoc descriptions from it. We evaluated CODES to mine Lucene and Hibernate method descriptions. The results indicate that CODES is able to extract descriptions for 20% and 28% of the Lucene and Hibernate methods with a precision of 84% and 91% respectively.
Condensing class diagrams by analyzing design and network metrics using optimistic classification BIBAFull-Text 110-121
  Ferdian Thung; David Lo; Mohd Hafeez Osman; Michel R. V. Chaudron
A class diagram of a software system enhances our ability to understand software design. However, this diagram is often unavailable. Developers usually reconstruct the diagram by reverse engineering it from source code. Unfortunately, the resultant diagram is often very cluttered, making it difficult to learn anything valuable from it. Thus, it would be very beneficial if we were able to condense the reverse-engineered class diagram to contain only the important classes depicting the overall design of a software system. Such a diagram would make program understanding much easier. A class can be important, for example, if its removal would break many connections between classes. In our work, we estimate this kind of importance by using design (e.g., number of attributes, number of dependencies, etc.) and network metrics (e.g., betweenness centrality, closeness centrality, etc.). We use these metrics as features and input their values to our optimistic classifier that will predict if a class is important or not. Different from standard classification, our newly proposed optimistic classification technique deals with the data scarcity problem by optimistically assigning labels to some of the unlabeled data and using them for training a better statistical model. We have evaluated our approach to condense reverse-engineered diagrams of 9 software systems and compared our approach with the state-of-the-art work of Osman et al. Our experiments show that our approach can achieve an average Area Under the Receiver Operating Characteristic Curve (AUC) score of 0.825, which is a 9.1% improvement compared to the state-of-the-art approach.
An information visualization feature model for supporting the selection of software visualizations BIBAFull-Text 122-125
  Renan Vasconcelos; Marcelo Schots; Cláudia Werner
Software development comprises the execution of a variety of tasks, such as bug discovery, finding reusable assets, dependency analysis etc. A better understanding of the task at hand and its surroundings can improve the development performance in general. Software visualizations can support such understanding by addressing different issues according to the necessity of stakeholders. However, knowing which visualizations better fit a given task in progress is not a trivial skill. In this sense, a feature model, intended for organizing the knowledge of a given domain and allowing the reuse of components, can support the identification, categorization and selection of information visualization elements. This work presents an ongoing domain analysis performed for building an information visualization feature model, whose goal is to support the process of choosing and building proper, suitable software visualizations.
Enabling integrated development environments with natural user interface interactions BIBAFull-Text 126-129
  Denis Delimarschi; George Swartzendruber; Huzefa Kagdi
The paper introduces the concept of applying Natural User Interface (NUI) interactions in the context of Integrated Development Environments (IDEs). Human voice and gestures are mapped to several IDE commands. A prototype tool is developed using the Microsoft Kinect hardware sensors and the available software development kits for Microsoft Visual Studio. A pilot study was conducted to assess the developed prototype. The results of the study suggest that it might be possible to apply natural interactions to a range of IDE capabilities.

Recommendations

Amalgamating source code authors, maintainers, and change proneness to triage change requests BIBAFull-Text 130-141
  Md Kamal Hossen; Huzefa Kagdi; Denys Poshyvanyk
The paper presents an approach, namely iMacPro, to recommend developers who are most likely to implement incoming change requests. iMacPro amalgamates the textual similarity between the given change request and source code, change proneness information, authors, and maintainers of a software system. Latent Semantic Indexing (LSI) and a lightweight analysis of source code, and its commits from the software repository, are used. The basic premise of iMacPro is that the authors and maintainers of the relevant source code, which is change prone, to a given change request are most likely to best assist with its resolution. iMacPro unifies these sources in a unique way to perform its task, which was not investigated and reported in the literature previously.
   An empirical study on three open source systems, ArgoUML, JabRef, and jEdit, was conducted to assess the effectiveness of iMacPro. A number of change requests from these systems were used in the evaluated benchmark. Recall values for top one, five, and ten recommended developers are reported. Furthermore, a comparative study with a previous approach that uses the source-code authorship information for developer recommendation was performed. Results show that iMacPro could provide recall gains from 30% to 180% over its subjected competitor with statistical significance.
Mining unit tests for code recommendation BIBAFull-Text 142-145
  Mohammad Ghafari; Carlo Ghezzi; Andrea Mocci; Giordano Tamburrelli
Developers spend a significant portion of their time understanding and learning the correct usage of the APIs of libraries they want to integrate in their projects. However, learning how to effectively use APIs is complex and time consuming. Code recommendation systems play a crucial role facilitating developers in this task by providing to them relevant examples while they code. This paper proposes a novel approach to code recommendation in which code examples are automatically obtained by mining and manipulating unit tests. In this paper we discuss the theoretical and practical implications that underpin this idea. The discussion leads to a series of fascinating research challenges that we organized in a research agenda.
Recommending automated extract method refactorings BIBAFull-Text 146-156
  Danilo Silva; Ricardo Terra; Marco Tulio Valente
Extract Method is a key refactoring for improving program comprehension. However, recent empirical research shows that refactoring tools designed to automate Extract Methods are often underused. To tackle this issue, we propose a novel approach to identify and rank Extract Method refactoring opportunities that are directly automated by IDE-based refactoring tools. Our approach aims to recommend new methods that hide structural dependencies that are rarely used by the remaining statements in the original method. We conducted an exploratory study to experiment and define the best strategies to compute the dependencies and the similarity measures used by the proposed approach. We also evaluated our approach in a sample of 81 extract method opportunities generated for JUnit and JHotDraw, achieving a precision of 48% (JUnit) and 38% (JHotDraw).
Identifying and locating interference issues in PHP applications: the case of WordPress BIBAFull-Text 157-167
  Laleh Eshkevari; Giuliano Antoniol; James R. Cordy; Massimiliano Di Penta
The large success of Content Management Systems (CMS) such as WordPress is largely due to the rich ecosystem of themes and plugins developed around the CMS that allows users to easily build and customize complex Web applications featuring photo galleries, contact forms, and blog pages. However, the design of the CMS, the plugin-based architecture, and the implicit characteristics of the programming language used to develop them (often PHP), can cause interference or unwanted side effects between the resources declared and used by different plugins. This paper describes the problem of interference between plugins in CMS, specifically those developed using PHP, and outlines an approach combining static and dynamic analysis to detect and locate such interference. Results of a case study conducted over 10 WordPress plugins show that the analysis can help to identify and locate plugin interference, and thus be used to enhance CMS quality assurance.

Joint Session with CHASE 1

Prioritizing maintainability defects based on refactoring recommendations BIBAFull-Text 168-176
  Daniela Steidl; Sebastian Eder
As a measure of software quality, current static code analyses reveal thousands of quality defects on systems in brown-field development in practice. Currently, there exists no way to prioritize among a large number of quality defects and developers lack a structured approach to address the load of refactoring. Consequently, although static analyses are often used, they do not lead to actual quality improvement. Our approach recommends removing quality defects, for example code clones and long methods, that are easy to refactor and, thus, provides developers with a first starting point for quality improvement. With an empirical industrial Java case study, we evaluate the usefulness of the recommendation based on developers' feedback. We further quantify which external factors influence the process of quality defect removal in industry software development.

Joint Session with CHASE 2

How the evolution of emerging collaborations relates to code changes: an empirical study BIBAFull-Text 177-188
  Sebastiano Panichella; Gerardo Canfora; Massimiliano Di Penta; Rocco Oliveto
Developers contributing to open source projects spontaneously group into "emerging" teams, reflected by messages exchanged over mailing lists, issue trackers and other communication means. Previous studies suggested that such teams somewhat mirror the software modularity. This paper empirically investigates how, when a project evolves, emerging teams re-organize themselves, e.g., by splitting or merging. We relate the evolution of teams to the files they change, to investigate whether teams split to work on cohesive groups of files. Results of this study, conducted on the evolution history of four open source projects (Apache httpd, Eclipse JDT, Netbeans, and Samba), provide indications of what happens in the project when teams reorganize. Specifically, we found that emerging team splits imply working on more cohesive groups of files and emerging team merges imply working on groups of files that are cohesive from a structural perspective. Such indications serve to better understand the evolution of software projects. More important, the observation of how emerging teams change can serve to suggest software remodularization actions.

Understanding Comprehension

On the effect of code regularity on comprehension BIBAFull-Text 189-200
  Ahmad Jbara; Dror G. Feitelson
It is naturally easier to comprehend simple code relative to complicated code. Regrettably, there is little agreement on how to effectively measure code complexity. As a result simple general-purpose metrics are often used, such as lines of code (LOC), McCabe's cyclomatic complexity (MCC), and Halstead's metrics. But such metrics just count syntactic features, and ignore details of the code's global structure, which may also have an effect on understandability. In particular, we suggest that code regularity -- where the same structures are repeated time after time -- may significantly reduce complexity, because once one figures out the basic repeated element it is easier to understand additional instances. We demonstrate this by controlled experiments where subjects perform cognitive tasks on different versions of the same basic function. The results indicate that versions with significant regularity lead to better comprehension, while taking similar time, despite being longer and having higher MCC. These results indicate that regularity is another attribute of code that should be taken into account in the context of studying the code's complexity and comprehension. Moreover, the fact that regularity may compensate for LOC and MCC demonstrates that complexity cannot be decomposed into independently addable contributions by individual attributes.
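To make concrete the abstract's point that syntactic metrics only count features, here is a simplified sketch of a McCabe-style count. It is an approximation built on Python's `ast` module, not the tooling used in the study; the function name and the choice of which node types count as decision points are assumptions made for illustration.

```python
import ast

# Node types treated as decision points in this simplified count (an assumption;
# real tools differ on details such as comprehensions and match statements).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source):
    """Approximate McCabe's cyclomatic complexity (MCC) of Python source text.

    Starts at 1 for the single entry path, adds 1 per branching construct and
    1 per extra operand of a boolean expression (each `and`/`or`).
    """
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            complexity += len(node.values) - 1
    return complexity
```

A function that repeats the same `if x > 255: x = 255` check for three variables gets a count of 4 under this sketch, the same as a function with three unrelated conditions. The metric sees no difference between regular and irregular code, which is the gap the paper's notion of regularity addresses.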
Capturing software traceability links from developers' eye gazes BIBAFull-Text 201-204
  Braden Walters; Timothy Shaffer; Bonita Sharif; Huzefa Kagdi
The paper presents a novel approach for recovering software traceability links from developers' eye gazes. An eye tracker is used to capture eye gazes while developers perform software maintenance tasks within the Eclipse IDE. An algorithm is presented that establishes a set of traceability links from the eye-gaze data of several developer sessions. A preliminary study assesses the feasibility and validity of the approach. The links generated by the approach were validated by another set of developers. Results indicate that our algorithm achieves strong recall when developers accurately perform bug-localization tasks.
Comprehension support during knowledge transitions: learning from field BIBAFull-Text 205-206
  Vikrant Kaulgud; A Annervaz K. M.; Janardan Misra; Gary Titus
Knowledge Transition (KT) of legacy applications is a critical activity, often determining the quality of maintenance in the early stages of a maintenance life-cycle. We developed an integrated reverse engineering tool-suite that bootstraps the KT process by providing knowledge recipients with insights into application structure, quality and functionality. The tool-suite is based on an in-depth study with KT practitioners and a comparative study of existing tools. We evaluated the benefits of the tool-suite during KT in real-life projects. In this talk, we report what we learned from the study and evaluation phases.
A visualization tool recording historical data of program comprehension tasks BIBAFull-Text 207-211
  Katsuhisa Maruyama; Takayuki Omori; Shinpei Hayashi
Software visualization has become a major technique in program comprehension. Although many tools visualize the structure, behavior, and evolution of a program, they do not consider how the tool user has come to understand it. Moreover, they discard the traces the user leaves during the trial-and-error process of a program comprehension task. This paper presents a source code visualization tool called CodeForest. It uses a forest metaphor to depict the source code of Java programs. Each tree represents a class within the program, and the collection of trees constitutes a three-dimensional forest. CodeForest helps a user try a large number of mappings of software metrics onto visual parameters. Moreover, it provides two new types of support: leaving notes that record the user's current understanding and insight alongside visualized objects, and automatically recording the user's actions during comprehension. The notes and recorded actions might serve as historical data that provide hints for accelerating the current comprehension task.
An empirical comparison of static and dynamic type systems on API usage in the presence of an IDE: Java vs. groovy with eclipse BIBAFull-Text 212-222
  Pujan Petersen; Stefan Hanenberg; Romain Robbes
Several studies have concluded that static type systems offer an advantage over dynamic type systems for programming tasks involving the discovery of a new API. However, these studies did not take into account modern IDE features; the advanced navigation and code completion techniques available in modern IDEs could drastically alter their conclusions. This study describes an experiment that compares the usage of an unknown API in Java and Groovy within the Eclipse IDE. It turns out that the previous finding that static type systems improve the usability of an unknown API still holds, even in the presence of a modern IDE.
What is the foundation of evidence of human factors decisions in language design? an empirical study on programming language workshops BIBAFull-Text 223-231
  Andreas Stefik; Stefan Hanenberg; Mark McKenney; Anneliese Andrews; Srinivas Kalyan Yellanki; Susanna Siebert
In recent years, the programming language design community has engaged in rigorous debate on the role of empirical evidence in the design of general purpose programming languages. Some scholars contend that the language community has failed to embrace a form of evidence that is non-controversial in other disciplines (e.g., medicine, biology, psychology, sociology, physics, chemistry), while others argue that a science of language design is unrealistic. While the discussion will likely persist for some time, we begin here a systematic evaluation of the use of empirical evidence with human users, documenting, paper-by-paper, the evidence provided for human factors decisions, beginning with 359 papers from the workshops PPIG, Plateau, and ESP. This preliminary work provides the following contributions: an analysis of 1) the overall quantity and quality of empirical evidence used in the workshops, and 2) the significant challenges to reliably coding academic papers. We hope that, once complete, this long-term research project will serve as a practical catalog designers can use when evaluating the impact of a language feature on human users.

Software Quality

Domain matters: bringing further evidence of the relationships among anti-patterns, application domains, and quality-related metrics in Java mobile apps BIBAFull-Text 232-243
  Mario Linares-Vásquez; Sam Klock; Collin McMillan; Aminata Sabané; Denys Poshyvanyk; Yann-Gaël Guéhéneuc
Some previous work began studying the relationship between application domains and quality, in particular through the prevalence of code and design smells (e.g., anti-patterns). Indeed, it is generally believed that the presence of these smells degrades quality but also that their prevalence varies across domains. Though anecdotal experiences and empirical evidence gathered from developers and researchers support this belief, there is still a need to further deepen our understanding of the relationship between application domains and quality. Consequently, we present a large-scale study that investigated the systematic relationships between the presence of smells and quality-related metrics computed over the bytecode of 1,343 Java Mobile Edition applications in 13 different application domains. Although we did not find evidence of a correlation between smells and quality-related metrics, we found (1) that larger differences exist between metric values of classes exhibiting smells and classes without smells and (2) that some smells are commonly present in all the domains while others are most prevalent in certain domains.
SCQAM: a scalable structured code quality assessment method for industrial software BIBAFull-Text 244-252
  Shrinath Gupta; Himanshu Kumar Singh; Radhika D. Venkatasubramanyam; Umesh Uppili
Siemens, Corporate Technology, Development Center, Asia Australia (CT DC AA) has been developing and maintaining several software projects for the Industry, Energy, Healthcare, and Infrastructure & Cities sectors of Siemens. The critical nature of these projects necessitates a high level of software code quality. As part of the code quality program at CT DC AA, the strategy is to have a scalable method for identifying issues affecting the code quality of projects across the organization. Traditionally, code quality experts in Siemens used the EMISQ method to assess code quality. EMISQ requires about three person months (two experts for six weeks) for 50-100 kLoC, making it effort intensive and time consuming. Thus, scaling this assessment method to include the hundreds of projects in CT DC AA poses many challenges. To address this, we have developed a lightweight assessment method called SCQAM (Structured Code Quality Assessment Method). SCQAM is an expert-based method wherein manual assessment of code quality by experts is directed by the systematic application of code analysis tools. In this paper, we describe the SCQAM method, experiences in applying it to projects in CT DC AA, challenges faced, and initiatives taken to enable fixing of systemic issues reported by assessments. The insights from our SCQAM experience can provide useful pointers to other organizations and practitioners interested in assessment and improvement of software code quality.
Repeatedly-executed-method viewer for efficient visualization of execution paths and states in Java BIBAFull-Text 253-257
  Toshinori Matsumura; Takashi Ishio; Yu Kashima; Katsuro Inoue
The state of a program at runtime is useful information for developers to understand a program. Omniscient debugging and logging-based tools enable developers to investigate the state of a program at an arbitrary point of time in an execution. While these tools are effective for analyzing the state at a single point of time, they might be insufficient for understanding the generic behavior of a method which includes various control-flow paths. In this paper, we propose REMViewer (Repeatedly-Executed-Method Viewer), a tool that visualizes multiple execution paths of a Java method. The tool shows each execution path in a separate view so that developers can first select actual execution paths of interest and then compare the state of local variables in the paths.
A formal evaluation of DepDegree based on Weyuker's properties BIBAFull-Text 258-261
  Dirk Beyer; Peter Häring
Complexity of source code is an important characteristic that software engineers aim to quantify using static software measurement. Several measures used in practice as indicators for software complexity have theoretical flaws. In order to assess the quality of a software measure, Weyuker established a set of properties that an indicator for program-code complexity should satisfy. It is known that several well-established complexity indicators do not fulfill Weyuker's properties. As an "early achievement" in a larger project on evaluating software measures, we show that DepDegree, a measure for data-flow dependencies, satisfies all of Weyuker's properties.
Hey! are you committing tangled changes? BIBAFull-Text 262-265
  Hiroyuki Kirinuki; Yoshiki Higo; Keisuke Hotta; Shinji Kusumoto
Although there is a principle that states a commit should only include changes for a single task, it is not always respected by developers. This means that code repositories often include commits that contain tangled changes. The presence of such tangled changes hinders analyzing code repositories because most mining software repositories (MSR) approaches are designed with the assumption that every commit includes only changes for a single task. In this paper, we propose a technique to inform developers that they are in the process of committing tangled changes. The proposed technique utilizes the changes included in past commits to judge whether a given commit includes tangled changes. If it determines that the proposed commit may include tangled changes, it offers suggestions on how the tangled changes can be split into a set of untangled changes.
A semiautomated method for classifying program analysis rules into a quality model BIBAFull-Text 266-270
  Shrinath Gupta; Himanshu Kumar Singh
Most software code quality assessment and monitoring methods use a Quality Model (QM) as an aid to capture the quality requirements of the software. An important aspect of using a QM is the classification of Program Analysis (PA) rules into the QM according to their relevance to quality attributes such as maintainability and reliability. Currently such classification is performed manually by experts, and most PA tools (such as FxCop for C#, FindBugs for Java, PC-Lint for C/C++) support hundreds of PA rules. Hence performing classification manually can be very effort intensive and time consuming, and can lead to concerns such as subjectivity and inconsistency. We therefore propose a lightweight semiautomated method to expedite classification and make it less effort intensive. The proposed classifier is based on natural language processing (NLP) techniques and uses a keyword matching algorithm. We have computed precision and recall for this classifier. We also show results from applying the technique to classify rules from FxCop, PC-Lint, and FindBugs into the EMISQ QM. We believe that the proposed approach will significantly help in reducing the time required to perform classification and hence also to incorporate newer PA tools and rules into QM-based methods.
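   A minimal sketch of what keyword matching for rule classification could look like follows. It is not the authors' classifier: the keyword lists, the rule descriptions, and the function `classify_rule` are all hypothetical, and a real quality model such as EMISQ defines its own attributes and vocabulary.

```python
import re

# Hypothetical keyword lists per quality attribute (not taken from EMISQ).
KEYWORDS = {
    "maintainability": {"unused", "naming", "duplicate", "complexity"},
    "reliability": {"null", "exception", "leak", "overflow"},
}


def classify_rule(description: str) -> list:
    """Return the attribute(s) whose keywords occur most often in the text.

    An empty list means no keyword matched, so the rule would be left
    for manual classification by an expert.
    """
    tokens = set(re.findall(r"[a-z]+", description.lower()))
    scores = {attr: len(tokens & words) for attr, words in KEYWORDS.items()}
    best = max(scores.values())
    if best == 0:
        return []
    return sorted(attr for attr, score in scores.items() if score == best)


# Invented rule descriptions, not quoted from any real tool.
reliability_rule = classify_rule("Possible null dereference may raise an exception")
maintainability_rule = classify_rule("Avoid unused private fields")
unmatched_rule = classify_rule("Use consistent indentation")
```

   Returning every attribute that ties for the best score, rather than picking one, keeps the expert in the loop for ambiguous rules, which fits the semiautomated framing of the abstract.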
An approach for evaluating and suggesting method names using n-gram models BIBAFull-Text 271-274
  Takayuki Suzuki; Kazunori Sakamoto; Fuyuki Ishikawa; Shinichi Honiden
Method names are important for the software development process. Some studies have shown that the quality of method names affects software comprehension. In response, some approaches that evaluate the comprehensibility of method names have been proposed. However, the effectiveness of existing approaches is limited because they focus on only part of a name.
   To address this limitation, we propose a novel approach for evaluating the comprehensibility of method names and suggesting comprehensible method names using n-gram models. We implemented a prototype tool and conducted two experiments as a case study. Our experiments show that our approach can correctly evaluate 75% of method names and successfully suggest 92% of the actual third words of method names.
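   To make the idea of an n-gram model over method names concrete, here is a small sketch that uses a bigram model over the words of camelCase names. The abstract does not give the order of the model or its training data, so the choice of bigrams, the class `BigramNameModel`, and the tiny corpus are assumptions for illustration only.

```python
import re
from collections import Counter, defaultdict


def split_name(name: str) -> list:
    """Split a camelCase method name into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z0-9]+", name)
    return [part.lower() for part in parts]


class BigramNameModel:
    """Counts which word follows which in a corpus of method names."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, names):
        for name in names:
            words = ["<s>"] + split_name(name)  # "<s>" marks the start of a name
            for previous, current in zip(words, words[1:]):
                self.counts[previous][current] += 1

    def suggest(self, prefix_words, k=1):
        """Suggest the k most frequent words following the last prefix word."""
        previous = prefix_words[-1] if prefix_words else "<s>"
        return [word for word, _ in self.counts[previous].most_common(k)]


# Hypothetical training corpus of method names.
model = BigramNameModel()
model.train(["getUserName", "getUserId", "getUserName", "setUserName", "getFileName"])

third_word = model.suggest(["get", "user"])  # -> ["name"]
```

   A higher-order model would condition on more than one preceding word; the bigram version is used here only because it keeps the example short.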
Cross-language bug localization BIBAFull-Text 275-278
  Xin Xia; David Lo; Xingen Wang; Chenyi Zhang; Xinyu Wang
Bug localization refers to the process of identifying source code files that contain defects from textual descriptions in bug reports. Existing bug localization techniques work on the assumption that bug reports, and identifiers and comments in source code files, are written in the same language (i.e., English). However, software users from non-English speaking countries (e.g., China) often use their native languages (e.g., Chinese) to write bug reports. For this setting, existing studies on bug localization would not work as the terms that appear in the bug reports do not appear in the source code. We refer to this problem as cross-language bug localization. In this paper, we propose a cross-language bug localization algorithm named CrosLocator, which is based on language translation.
   Since different online translators (e.g., Google and Microsoft translators) have different translation accuracies for various texts, CrosLocator uses multiple translators to convert a non-English textual description of a bug report into English -- each bug report would then have multiple translated versions. For each translated version, CrosLocator applies a bug localization technique to rank source code files. Finally, CrosLocator combines the multiple ranked lists of source code files. Our preliminary experiment on Ruby-China shows that CrosLocator could achieve mean reciprocal rank (MRR) and mean average precision (MAP) scores of up to 0.146 and 0.116, outperforming a baseline approach by an average of 10% and 12% respectively.
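   For readers unfamiliar with the two reported measures, the sketch below computes MRR and MAP for ranked lists of files. It does not reproduce CrosLocator or its results; the file names and bug reports are invented, and average precision is normalized by the number of relevant files, which is one common convention and an assumption here.

```python
def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant file, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked, relevant):
    """Mean of the precision values at each position holding a relevant file."""
    hits = 0
    total = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Two hypothetical bug reports: (ranked files, files that actually contain the bug).
reports = [
    (["a.rb", "b.rb", "c.rb"], {"b.rb"}),
    (["x.rb", "y.rb", "z.rb"], {"x.rb", "z.rb"}),
]

mrr = mean(reciprocal_rank(r, rel) for r, rel in reports)    # (1/2 + 1) / 2 = 0.75
map_score = mean(average_precision(r, rel) for r, rel in reports)  # (1/2 + 5/6) / 2 = 2/3
```

   Both measures lie between 0 and 1, and higher values mean the buggy files appear nearer the top of the ranked list.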

Novel Development Tooling

Automatic documentation generation via source code summarization of method context BIBAFull-Text 279-290
  Paul W. McBurney; Collin McMillan
A documentation generator is a programming tool that creates documentation for software by analyzing the statements and comments in the software's source code. While many of these tools are manual, in that they require specially-formatted metadata written by programmers, new research has made inroads towards automatic generation of documentation. These approaches work by stitching together keywords from the source code into readable natural language sentences. These approaches have been shown to be effective, but carry a key limitation: the generated documents do not explain the source code's context. They can describe the behavior of a Java method, but not why the method exists or what role it plays in the software. In this paper, we propose a technique that includes this context by analyzing how the Java methods are invoked. In a user study, we found that programmers benefit from our generated documentation because it includes context information.
Improving topic model source code summarization BIBAFull-Text 291-294
  Paul W. McBurney; Cheng Liu; Collin McMillan; Tim Weninger
In this paper, we present an emerging source code summarization technique that uses topic modeling to select keywords and topics as summaries for source code. Our approach organizes the topics in source code into a hierarchy, with more general topics near the top of the hierarchy. In this way, we present the software's highest-level functionality first, before lower-level details. This is an advantage over previous approaches based on topic models, which only present groups of related keywords without a hierarchy. We conducted a preliminary user study that found our approach selects keywords and topics that the participants found to be accurate in a majority of cases.
A code obfuscation framework using code clones BIBAFull-Text 295-299
  Aniket Kulkarni; Ravindra Metta
The IT industry loses tens of billions of dollars annually from security attacks such as malicious reverse engineering. To protect sensitive parts of software from such attacks, we designed a code obfuscation scheme based on nontrivial code clones. While implementing this scheme, we realized that there is currently no framework to assist the implementation of such advanced obfuscation techniques. Therefore, we have developed a framework to support code obfuscation using code clones. We successfully implemented our obfuscation technique in Java using this framework. In this paper, we present our framework and illustrate it with an example.
JCSD: visual support for understanding code control structure BIBAFull-Text 300-303
  Ahmad Jbara; Dror G. Feitelson
Program comprehension is a vital mental process in any maintenance activity. It becomes decisive as functions get larger. Such functions are burdened with a great many programming constructs, as lines of code (LOC) strongly correlate with McCabe's cyclomatic complexity (MCC). This makes it hard to take in the whole code of such functions and, as a result, hinders grasping structural properties that might be essential for maintenance. Program visualization is known as a key solution that assists in comprehending complex systems. In fact, we have shown in recent work that control structure diagrams (CSDs) could be useful to better understand and discover structural properties of such functions. For example, we found that the code regularity property, and even cloning, can be easily identified with CSDs. This paper presents JCSD, an Eclipse plug-in that implements CSDs for Java methods. In particular, it visualizes the control structure and nesting of a Java method, thereby conveying structural characteristics of the code to the programmer and helping them better understand and refactor it.
Plagiarism detection for multithreaded software based on thread-aware software birthmarks BIBAFull-Text 304-313
  Zhenzhou Tian; Qinghua Zheng; Ting Liu; Ming Fan; Xiaodong Zhang; Zijiang Yang
The availability of inexpensive multicore hardware presents a turning point in software development. In order to benefit from the continued exponential throughput advances in new processors, software applications must be multithreaded programs. As multithreaded programs become increasingly popular, plagiarism of multithreaded programs starts to plague the software industry. Although there has been tremendous progress on software plagiarism detection technology, existing dynamic approaches remain optimized for sequential programs and cannot be applied to multithreaded programs without significant redesign. This paper fills the gap by presenting two dynamic birthmark based approaches. The first approach extracts key instructions while the second approach extracts system calls. Both approaches consider the effect of thread scheduling on computing software birthmarks. We have implemented a prototype based on the Pin instrumentation framework. Our empirical study shows that the proposed approaches can effectively detect plagiarism of multithreaded programs and exhibit strong resilience to various semantic-preserving code obfuscations.
Redacting sensitive information in software artifacts BIBAFull-Text 314-325
  Mark Grechanik; Collin McMillan; Tathagata Dasgupta; Denys Poshyvanyk; Malcom Gethers
In the past decade, there have been many well-publicized cases of source code leaking from different well-known companies. These leaks pose a serious problem when the source code contains sensitive information encoded in its identifier names and comments. Unfortunately, redacting the sensitive information requires obfuscating the identifiers, which will quickly interfere with program comprehension. Program comprehension is key for programmers in understanding the source code, so sensitive information is often left unredacted.
   To address this problem, we offer a novel approach for REdacting Sensitive Information in Software arTifacts (RESIST). RESIST finds and replaces sensitive words in software artifacts in a way that reduces the impact on program comprehension. We evaluated RESIST experimentally using 57 professional programmers from over a dozen different organizations. Our evaluation shows that RESIST effectively redacts software artifacts, thereby making it difficult for participants to infer sensitive information, while maintaining a desired level of comprehension.