Abstract
We present a novel static analysis method that combines data-flow analysis with machine learning to detect SQL injection (SQLi) and Cross-Site Scripting (XSS) vulnerabilities in PHP applications. We assembled a dataset from the National Vulnerability Database and the SAMATE project, containing vulnerable PHP code samples and their patched versions, in which the vulnerability has been fixed. We extracted features from the code samples by applying data-flow analysis techniques, including reaching definitions analysis, taint analysis, and reaching constants analysis. We used these features to train various probabilistic classifiers. To demonstrate the effectiveness of our approach, we built a tool called WIRECAML and compared it to other tools for vulnerability detection in PHP code. Our tool performed best at detecting both SQLi and XSS vulnerabilities. We also applied our approach to a number of open-source software applications and found a previously unknown vulnerability in a photo-sharing web application.
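As an illustration of the kind of data-flow feature the abstract mentions, the following minimal Python sketch runs a textbook reaching definitions analysis over a tiny hand-written control-flow graph and derives two counts for a potential SQL sink. It is not WIRECAML's implementation: the graph, the PHP statements in the comments, the `TAINTED_DEFS` set, and the feature names are all hypothetical and chosen only for this example.

```python
# Hypothetical toy control-flow graph for this PHP fragment (illustration only):
#   1: $id = $_GET['id'];          (definition of $id from user input)
#   2: if (is_numeric($id)) {      (no definition)
#   3:     $id = intval($id); }    (sanitizing redefinition of $id)
#   4: mysql_query("... $id");     (potential SQLi sink, no definition)
DEFS = {1: "id", 2: None, 3: "id", 4: None}   # node -> variable defined there
SUCCS = {1: [2], 2: [3, 4], 3: [4], 4: []}    # node -> successor nodes
TAINTED_DEFS = {1}                            # assumed: node 1 reads user input


def reaching_definitions(defs, succs):
    """Return node -> set of (variable, defining node) pairs reaching its entry."""
    preds = {n: [] for n in succs}
    for n, targets in succs.items():
        for t in targets:
            preds[t].append(n)

    in_sets = {n: set() for n in succs}
    out_sets = {n: set() for n in succs}
    changed = True
    while changed:  # iterate until a fixed point is reached
        changed = False
        for n in succs:
            # IN[n] is the union of OUT[p] over all predecessors p.
            new_in = set()
            for p in preds[n]:
                new_in |= out_sets[p]
            # OUT[n] kills earlier definitions of the same variable, then adds n.
            var = defs[n]
            if var is None:
                new_out = set(new_in)
            else:
                new_out = {(v, d) for (v, d) in new_in if v != var} | {(var, n)}
            if new_in != in_sets[n] or new_out != out_sets[n]:
                in_sets[n], out_sets[n] = new_in, new_out
                changed = True
    return in_sets


reaching = reaching_definitions(DEFS, SUCCS)

# Hypothetical feature vector for the sink at node 4: how many definitions
# reach it, and how many of those come directly from user input.
sink = 4
features = {
    "reaching_defs": len(reaching[sink]),
    "tainted_reaching_defs": sum(1 for (_, d) in reaching[sink] if d in TAINTED_DEFS),
}
print(features)
```

Both the unsanitized definition at node 1 and the sanitized one at node 3 reach the sink, because the branch at node 2 may skip the sanitizer. Counts like these could be passed to a classifier as numeric features.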
Original language | English |
---|---|
Title of host publication | Proceedings of the 13th International Conference on Availability, Reliability and Security |
Place of Publication | New York, NY, USA |
Publisher | ACM |
Number of pages | 10 |
ISBN (Print) | 978-1-4503-6448-5 |
Publication status | Published - 2018 |
Event | 13th International Conference on Availability, Reliability and Security, Hamburg, Germany. Duration: 27 Aug 2018 → 30 Aug 2018. https://dl.acm.org/citation.cfm?doid=3230833.3230856 |
Conference
Conference | 13th International Conference on Availability, Reliability and Security |
---|---|
Abbreviated title | ARES 2018 |
Country | Germany |
City | Hamburg |
Period | 27/08/18 → 30/08/18 |
Keywords
- Software security, data-flow analysis, machine learning, static code analysis, vulnerability detection