RESEARCH CASE STUDY IMPROVE BUG FINDING ABILITY OF TESTAR BY USING HISTORICAL BUG DATABASE OF DIGIOFFICE

  • Robin Bouwmeester

    Student thesis: Master's Thesis

    Abstract

    This abstract is a fictional conversation between a Teacher (T) and a Student (S).
    T: Hi, what are you working on?
    S: I’m researching a tool called TESTAR which can automatically find bugs in software systems, such as our DigiOffice product. It finds bugs by searching for suspicious messages, crashes and hangs. I want to improve the tool by detecting more kinds of bugs.
    T: Do you know which bugs this tool can find in DigiOffice?
    S: To know which types of bugs there are, the bugs must first be classified with a bug classification system. No applicable system existed, so I constructed a new one based on the large bug database of DigiOffice, which I have access to. I applied the classification to 2500 historical DigiOffice bugs, which resulted in 76 classifications divided into 5 groups.
    T: What percentage of the bugs in DigiOffice fell into each group?
    S: That would be 27% Data, 26% User interface, 24% Exception, 19% Invalid and 4% Resource-related bugs.
    T: But how can you improve TESTAR to detect more types of bugs?
    S: By constructing generic action and oracle ideas, inspired by old bugs of DigiOffice, which could be implemented in TESTAR.
    T: What do you mean by generic action or oracle ideas?
    S: By generic, I mean that the idea would not be limited to DigiOffice but would also apply to other applications. Think of a user who tests an application without any specifications or idea of what the program should do. This tester would look for inconsistencies, duplicates, spelling errors, aesthetic problems, and more. If the tester suspects an issue, the tester reports it to the program’s owner, who can decide whether the issue is valid or not.
    With every idea, I have to convince myself that it would work in at least one other application built outside our company. It is not a requirement that an idea works in every application. For example, if an idea is about a dropdown list requirement but the application has no dropdown list, then the idea does not apply to that application.
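To illustrate what such a generic oracle idea could look like, here is a minimal sketch in Python. It is not TESTAR code: the `Widget` class, the `role` values and the function name are hypothetical and chosen only for this example. The oracle flags duplicate entries in dropdown lists and reports that it does not apply when the application state contains no dropdown list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Widget:
    """Hypothetical, simplified stand-in for a GUI widget (not TESTAR's model)."""
    role: str            # assumed role names, e.g. "dropdown" or "button"
    options: List[str]   # visible entries; empty for widgets without a list


def duplicate_option_oracle(widgets: List[Widget]) -> Optional[List[str]]:
    """Return None if the idea does not apply (no dropdown list present),
    otherwise a list of suspected issues, which may be empty."""
    dropdowns = [w for w in widgets if w.role == "dropdown"]
    if not dropdowns:
        return None  # the idea is not applicable to this application state

    findings = []
    for widget in dropdowns:
        seen = set()
        for option in widget.options:
            # Compare case-insensitively and ignore surrounding whitespace.
            key = option.strip().lower()
            if key in seen:
                findings.append(f"duplicate dropdown option: {option!r}")
            seen.add(key)
    return findings
```

A finding is only a suspicion: as described above, the owner of the application still decides whether it is a real bug.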
    T: Did you find any ideas and what kind of improvement can be expected?
    S: Yes, actually, I found 99 ideas to improve TESTAR, which in theory can improve the bug-finding ability of TESTAR by around 18%: out of 100 reported bugs, 18 additional bugs would be found when all 99 ideas are implemented. TESTAR can already find around 33% of the reported bugs with the suspicious message oracle. Together, this means that, in theory, TESTAR can detect around half of all the historically reported bugs.
    T: Did you implement any ideas or are they just theoretical ideas?
    S: Yes, Fernando, a PhD student and member of the TESTAR team, and I have implemented 35 ideas for testing web applications. Almost all ideas could also be implemented for desktop applications, but that requires a technically different implementation. We had biweekly technical meetings to decide which ideas to implement and how. So no, they are not only theoretical ideas; they are practical as well.
    T: How can you verify and test generic oracle ideas if you only have DigiOffice?
    S: We periodically test the ideas on the newest versions of DigiOffice. In addition, by building a buggy version of ParaBank, we show that bugs are detected in another application. ParaBank is a fictional bank application which is often used by TESTAR.
    T: But isn’t that cheating, since you create and detect the bugs yourself?
    S: There, you have a point, but not all bugs in ParaBank were injected beforehand. There are existing bugs, and there may be unknown bugs just waiting to be found. With each bug injection, we tried to inject the bug naturally within the specification and technology of ParaBank. ParaBank is a web application with a Java backend and is built by other developers with their own programming style. DigiOffice, on the other hand, is built with Microsoft ASP.NET, and has many custom-made widgets and configuration options. By testing the ideas on DigiOffice and ParaBank at the same time, we test whether the implementation is technically working correctly. Other researchers can use the buggy version of ParaBank as a benchmark to estimate the effectiveness of test tools. For researchers, there is a need for applications with known bugs which are based on real bugs of industrial-sized applications. The injected bugs should be detected by one of the implemented oracles. We have tested this and found most of the injected bugs.
    T: Can you validate that this approach works?
    S: Formally we can’t, but we invite you to try out the implemented ideas on other systems so the ideas can be validated. That would be a huge addition to this research!
    T: Did you find any new bugs in DigiOffice with the improvements?
    S: Yes, 65 bugs were found in roughly one year. 20 of those bugs were found by looking for error messages in the web browser console log, which is a new feature in TESTAR, and 13 bugs were found by the functional oracles.
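The console log check mentioned above can be pictured with a small Python sketch. It is an illustration only and does not show how TESTAR implements the feature: the entry format (dictionaries with `level` and `message` keys), the set of suspicious levels and the function name are all assumptions made for this example.

```python
from typing import Dict, List

# Assumed log levels treated as suspicious; a real browser may use other names.
SUSPICIOUS_LEVELS = {"SEVERE", "ERROR"}


def console_error_oracle(entries: List[Dict[str, str]]) -> List[str]:
    """Return the messages of console entries whose level looks suspicious."""
    return [
        entry["message"]
        for entry in entries
        if entry.get("level", "").upper() in SUSPICIOUS_LEVELS
    ]
```

Each returned message would be reported as a suspected bug for a human to review.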
    T: OK, well, you’ve done a lot of work and I think you have made some nice improvements!
    S: Thanks!
    Date of Award: 2 Sept 2023
    Original language: English
    Supervisor: Tanja Vos (Examiner), Beatriz Marín (External assessor) & Fernando Pastor Ricós (Co-assessor)

    Master's Degree

    • Master Computer Science
    • Master Software Engineering
