SHF: EAGER: Clustering Programming Artifacts to Enrich Code Foraging Environment

Project: Research project

Project Details

Description

Software developers have difficulty navigating the code base during software maintenance, and therefore waste a significant amount of time. A major challenge for software engineering is thus to understand how developers seek relevant code and to invent tools that facilitate code navigation. Previous research has resulted in some ad hoc tools without any theoretical basis, along with some descriptive theories derived by observing developer behavior. These approaches have produced only modest improvements. A more comprehensive and effective solution can only be reached through a theoretical understanding of the code navigation task itself. This research seeks a unified theory for code navigation based on mathematically modeled foraging mechanisms that evolved to help our animal ancestors find food. These same mechanisms appear to be at work when users seek useful information in the vastness of the Web. Developers, like organisms foraging for food, need to evolve strategies that maximize the gain of information useful to their maintenance tasks per unit cost. The research will explore the usefulness of this analogy and adopt theories, models and tools that could eventually help lower the cost and effort of software development and maintenance. This EAGER proposal explores the extent to which the code foraging environment can be automatically enriched by software clustering. The hypothesis guiding this research is that the way programming artifacts are grouped (clustered) can affect the profitability of the information foraging environment, which in turn can shape the way developers navigate the code base.
To test the hypothesis, the PI will (i) develop an experimental framework to assess topical locality in software, (ii) create a new set of metrics to guide the evaluation of code rearrangement and foraging tools, and (iii) compare well-established source code clustering algorithms to discover effective enrichment mechanisms. The research will be evaluated through experiments on large open-source projects that have recorded detailed developer interaction logs.
Status: Finished
Effective start/end date: 5/1/12 → 4/30/14

Funding

  • National Science Foundation: $80,000.00

ASJC Scopus Subject Areas

  • Software
  • Computer Networks and Communications
  • Engineering (all)
  • Electrical and Electronic Engineering
  • Communication