-
Are There Functionally Similar Code Clones in Practice?
Authors:
Verena Käfer,
Stefan Wagner,
Rainer Koschke
Abstract:
Having similar code fragments, also called clones, in software systems can lead to unnecessary comprehension, review and change efforts. Syntactically similar clones can often be encountered in practice. The same is not clear for clones that are only functionally similar (FSCs).
We conducted an exploratory survey among developers to investigate whether they encounter functionally similar clones in practice and whether their inclination to remove them differs from their inclination to remove syntactically similar clones.
Of the 34 developers answering the survey, 31 have experienced FSCs in their professional work, and 24 have experienced problems caused by FSCs. We found no difference in the inclination and reasoning for removing FSCs and syntactically similar clones. FSCs exist in practice and should be investigated to bring clone detectors to the same quality as for syntactically similar clones, because being able to detect them allows developers to manage and potentially remove them.
Submitted 28 March, 2018;
originally announced March 2018.
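To make the distinction in the abstract above concrete, the following is a minimal sketch of what a pair of functionally similar clones might look like. It is not taken from the paper; the function names and the task (summing the squares of even integers) are hypothetical, chosen only for illustration. The two functions share almost no syntax, yet return the same result for every list of integers, which is why token- or tree-based clone detectors would likely miss the pair.

```python
def sum_even_squares_loop(values):
    """Imperative variant: explicit loop with an accumulator."""
    total = 0
    for v in values:
        if v % 2 == 0:
            total += v * v
    return total


def sum_even_squares_functional(values):
    """Functional variant: same behavior, different syntax."""
    evens = filter(lambda v: v % 2 == 0, values)
    return sum(map(lambda v: v ** 2, evens))


if __name__ == "__main__":
    sample = [1, 2, 3, 4]
    # Both variants compute the same value for the same input.
    print(sum_even_squares_loop(sample), sum_even_squares_functional(sample))
```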
-
Poster: Communication in Open-Source Projects--End of the E-mail Era?
Authors:
Verena Käfer,
Daniel Graziotin,
Ivan Bogicevic,
Stefan Wagner,
Jasmin Ramadani
Abstract:
Communication is essential in software engineering. Especially in distributed open-source teams, communication needs to be supported by channels including mailing lists, forums, issue trackers, and chat systems. Yet, we do not have a clear understanding of which communication channels stakeholders in open-source projects use. In this study, we address this knowledge gap by investigating a statistically representative sample of 400 GitHub projects. We identify the communication channels in use by applying regular expressions to project data. We show that (1) half of the GitHub projects use observable communication channels; (2) GitHub Issues, e-mail addresses, and the modern chat system Gitter are the most common channels; (3) mailing lists are only in fifth place and have a lower market share than all modern chat systems combined.
Submitted 26 March, 2018;
originally announced March 2018.
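The regular-expression approach mentioned in the abstract above could be sketched roughly as follows. The abstract does not give the authors' actual patterns or the exact project data they scanned, so every pattern, the `CHANNEL_PATTERNS` table, and the `detect_channels` helper here are assumptions made purely for illustration.

```python
import re

# Hypothetical patterns, one per communication channel.
# These are illustrative assumptions, not the patterns used in the study.
CHANNEL_PATTERNS = {
    "gitter": re.compile(r"gitter\.im/[\w.-]+", re.IGNORECASE),
    "slack": re.compile(r"[\w-]+\.slack\.com", re.IGNORECASE),
    "irc": re.compile(r"irc://\S+", re.IGNORECASE),
    "mailing_list": re.compile(r"groups\.google\.com|\bmailing list\b", re.IGNORECASE),
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def detect_channels(text):
    """Return the set of channel names whose pattern occurs in the text."""
    return {
        name
        for name, pattern in CHANNEL_PATTERNS.items()
        if pattern.search(text)
    }


if __name__ == "__main__":
    readme = "Chat with us at https://gitter.im/example/project or mail dev@example.org"
    print(sorted(detect_channels(readme)))
```

Keeping the patterns in a single table makes it simple to add or tighten a channel definition without touching the matching logic; a real study would also need to guard against false positives, such as e-mail addresses that appear only in license headers.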
-
What Is the Best Way For Developers to Learn New Software Tools? An Empirical Comparison Between a Text and a Video Tutorial
Authors:
Verena Käfer,
Daniel Kulesz,
Stefan Wagner
Abstract:
The better developers can learn software tools, the faster they can start using them and the more efficiently they can later work with them. Tutorials are supposed to help here. While in the early days of computing, mostly text tutorials were available, nowadays software developers can choose among a huge number of tutorials for almost any popular software tool. However, little research has been conducted to understand how text tutorials differ from other tutorials, which tutorial types are preferred and, in particular, which tutorial types yield the best learning experience in terms of efficiency and effectiveness for programmers. To evaluate these questions, we converted an existing video tutorial for a novel software tool into a content-equivalent text tutorial. We then conducted an experiment in three groups where 42 undergraduate students from a software engineering course were asked to operate the software tool after using a tutorial: the first group was provided only with the video tutorial, the second group only with the text tutorial and the third group with both. In this context, the differences in terms of efficiency were almost negligible: we observed that participants using only the text tutorial completed the tutorial faster than the participants with the video tutorial. However, the participants using only the video tutorial applied the learned content faster, achieving roughly the same bottom-line performance. We also found that if both tutorial types are offered, participants prefer video tutorials for learning new content but text tutorials for looking up "missed" information. We mainly gathered our data through questionnaires and screen recordings and analyzed it with suitable statistical hypothesis tests. The data is available at [12]. Since producing tutorials requires effort, knowing which type of tutorial increases learnability, and to what extent, has immense practical relevance.
We conclude that in contexts similar to ours, while it would be ideal if software tool makers offered both tutorial types, it seems more efficient to produce only text tutorials instead of a passive video tutorial, provided you manage to motivate your learners to use them.
Submitted 31 March, 2017;
originally announced April 2017.
-
Spreadsheet Guardian: An Approach to Protecting Semantic Correctness throughout the Evolution of Spreadsheets
Authors:
Daniel Kulesz,
Verena Käfer,
Stefan Wagner
Abstract:
Spreadsheets are powerful tools which play a business-critical role in many organizations. However, many bad decisions taken due to faulty spreadsheets show that these tools need serious quality assurance. Furthermore, while collaboration on spreadsheets for maintenance tasks is common, there has been almost no support for ensuring that the spreadsheets remain correct during this process.
We have developed an approach named Spreadsheet Guardian which separates the specification of spreadsheet test rules from their execution. By automatically executing user-defined test rules, our approach is able to detect semantic faults. It also protects all collaborating spreadsheet users from introducing faults during maintenance, even if only a few end-users specify test rules. To evaluate Spreadsheet Guardian, we implemented a representative testing technique as an add-in for Microsoft Excel.
We evaluated the testing technique in two empirical evaluations with 29 end-users and 42 computer science students. The results indicate that the technique is easy to learn and to apply. Furthermore, after finishing maintenance, participants with spreadsheets "protected" by the technique are more realistic about the correctness of their spreadsheets than participants who employ only "classic", non-interactive test rules based on static analysis techniques. Hence, we believe Spreadsheet Guardian can be of use for business-critical spreadsheets.
Submitted 30 November, 2017; v1 submitted 30 November, 2016;
originally announced December 2016.