Top 10: The Web Application Vulnerability Scanners Benchmark, 2012
Commercial & Open Source Scanners
An Accuracy, Coverage, Versatility, Adaptability, Feature and Price Comparison of 60 Commercial & Open Source Black Box Web Application Vulnerability Scanners
By Shay Chen
Information Security Consultant, Researcher and Instructor
sectooladdict-$at$-gmail-$dot$-com
July 2012
Assessment Environments: WAVSEP 1.2, ZAP-WAVE (WAVSEP integration), WIVET v3-rev148
Table of Contents
1. Introduction
2. List of Tested Web Application Scanners
3. Benchmark Overview & Assessment Criteria
4. A Glimpse at the Results of the Benchmark
5. Test I - Scanner Versatility - Input Vector Support
6. Test II – Attack Vector Support – Counting Audit Features
7. Introduction to the Various Accuracy Assessments
8. Test III – The Detection Accuracy of Reflected XSS
9. Test IV – The Detection Accuracy of SQL Injection
10. Test V – The Detection Accuracy of Path Traversal/LFI
11. Test VI – The Detection Accuracy of RFI (XSS via RFI)
12. Test VII - WIVET - Coverage via Automated Crawling
13. Test VIII – Scanner Adaptability - Crawling & Scan Barriers
14. Test IX – Authentication and Usability Feature Comparison
15. Test X – The Crown Jewel - Results & Features vs. Pricing
16. Additional Comparisons, Built-in Products and Licenses
17. What Changed?
18. Initial Conclusions – Open Source vs. Commercial
19. Verifying The Benchmark Results
20. So What Now?
21. Recommended Reading List: Scanner Benchmarks
22. Thank-You Note
23. FAQ - Why Didn't You Test NTO, Cenzic and N-Stalker?
24. Appendix A – List of Tools Not Included In the Test
1. Introduction
Detailed Result Presentation at
Tools, Features, Results, Statistics and Price Comparison (Delete Cache) | A Step by Step Guide for Choosing the Right Web Application Vulnerability Scanner for *You* | A Perfectionist Guide for Optimal Use of Web Application Vulnerability Scanners [Placeholder]
Getting the information was the easy part. All I had to do
was invest a couple of years in gathering the list of tools, and a couple
more in documenting their various features. It's really a daily routine - you
read a couple of posts in newsgroups in the morning, and a couple of blogs in the
evening. Once you get used to it, it's fun, and even quite addictive.
Then came the "best" fantasy, and with it,
the inclination to test the proclaimed features of all the web application
vulnerability scanners against each other, only to find out that things are
not that simple, and finding the "best", if there is such a tool, was
not an easy task.
Inevitably, I tried searching for alternative assessment
models, methods of measurement that would handle the imperfections of the
previous assessments.
I tried to change the perspective, add tests (and hundreds
of those - 940+, to be exact), examine
different aspects, and even make parts of the test process obscure, and now,
I'm finally ready for another shot.
In spite of everything I had invested in past research, due
to the focus I had on features and accuracy, and the policy I used when
interacting with the various vendors, it was difficult, especially for me, to gain
insights from the massive amounts of data that would enable me to choose, and more
importantly, properly use the various tools in real-life
scenarios.
Is the most accurate
scanner necessarily the best choice for a point-and-shoot scenario? And what
good will it do if it can't scan an application due to a specific scan barrier
it can't handle, or because it does not support the input delivery method?
I needed to gather other pieces of the puzzle, and even more
importantly, I needed a method, or more accurately, a methodology.
I'm sorry to disappoint you, dear reader, so early in the article,
but I still don't have a perfect answer or one recommendation... But I sure
am much closer than I ever was, and although I might not have the
answer, I have many answers, and a very comprehensive, logical and clear
methodology for employing the use of all the information I'm about to present.
In the previous benchmarks, I focused on assessing 3 major aspects of web application
scanners, which revolved mostly around features & accuracy, and even though
the information was very interesting, it wasn't necessarily useful, at least
not in all scenarios.
So I decided to
take it to the edge, but since I had already reached 60 scanners, it
was hard to make an impression with a couple of extra tools, so instead, I focused
my efforts on aspects.
This time, I compared 10 different aspects of the
tools (or 14, if you count the non-competitive charts), and chose the collection
with the aim of providing practical tools for making a decision, and getting
a glimpse of the bigger picture.
Let me assure you - this time, the information is
presented in a manner that is very helpful, is easy to navigate, and is supported
by presentation platforms, articles and step by step methodologies.
Furthermore, I wrapped it all in a summary that includes
the major results and features in relation to the price, for those of us
who prefer the overview and avoid the drill-down: information and insights that, I believe, will
help testers invest their time in better-suited tools, and consumers invest
their money properly, in the long term or the short term (but not necessarily
both*).
As mentioned earlier, this research covers various aspects
for the latest versions of 11 commercial web application scanners, and the
latest versions of most of the 49 free & open source web
application scanners. It also covers some scanners that were not covered
in previous benchmarks, and includes, among others, the following components
and tests:
• A Price Comparison - in Relation to the Rest of the Benchmark Results
• Scanner Versatility - A Measure for the Scanner's Support of Protocols & Input Delivery Vectors
• Attack Vector Support - The Amount & Type of Active Scan Plugins (Vulnerability Detection)
• Reflected Cross Site Scripting Detection Accuracy
• SQL Injection Detection Accuracy
• Path Traversal / Local File Inclusion Detection Accuracy
• Remote File Inclusion Detection Accuracy (XSS/Phishing via RFI)
• WIVET Score Comparison - Automated Crawling / Input Vector Extraction
• Scanner Adaptability - Complementary Coverage Features and Scan Barrier Support
• Authentication Features Comparison
• Complementary Scan Features and Embedded Products
• General Scanning Features and Overall Impression
• License Comparison and General Information
And just before we delve into the details, one last tip:
don't focus solely on the charts - if you want to really understand what they
reflect, dig in.
Lists and charts first, detailed description later.
2. List of Tested Web Application Scanners
The following commercial scanners were included
in the benchmark:
- IBM AppScan v8.5.0.1, Build 42-SR1434 (IBM)
- WebInspect v9.20.277.0, SecureBase 4.08.00 (HP)
- Netsparker v2.1.0, Build 45 (Mavituna Security)
- Acunetix WVS v8.0, Build 20120613 (Acunetix)
- Syhunt Dynamic (SandcatPro) v4.5.0.0/1 (Syhunt)
- Burp Suite v1.4.10 (Portswigger)
- ParosPro v1.9.12 (Milescan) - WIVET / Other
- JSky v3.5.1-905 (NoSec) - WIVET / Other
- WebCruiser v2.5.1 EE (Janus Security)
- Nessus v5.0.1 - 20120701 (Tenable Network Security) - Web Scanning Features
- Ammonite v1.2 (RyscCorp)
The following new free & open source scanners were included
in the benchmark:
IronWASP v0.9.1.0
The updated versions of the following free & open
source scanners were re-tested in the benchmark:
Zed Attack Proxy (ZAP) v1.4.0.1, sqlmap
v1.0-Jul-5-2012 (Github), W3AF 1.2-rev509 (SVN), Acunetix
Free Edition v8.0-20120509, Safe3WVS v10.1 FE (Safe3
Network Center), WebSecurify v0.9 (free edition - the new
commercial version was not tested), Syhunt Mini (Sandcat Mini) v4.4.3.0,
arachni v0.4.0.3, Skipfish 2.07b, N-Stalker
2012 Free Edition v7.1.1.121 (N-Stalker), Watobo
v0.9.8-rev724 (a few new WATOBO 0.9.9 pre versions were released a few days
before the publication of the benchmark, but I didn't manage to test them in
time)
Different aspects of the following free & open
source scanners were tested in the benchmark:
VEGA 1.0 beta (Subgraph), Netsparker
Community Edition v1.7.2.13, Andiparos v1.0.6, ProxyStrike
v2.2, Wapiti v2.2.1, Paros Proxy v3.2.13, Grendel
Scan v1.0
The results were compared to those of unmaintained scanners
tested in previous benchmarks:
PowerFuzzer v1.0, Oedipus v1.8.1
(v1.8.3 is around somewhere), Scrawler v1.0, WebCruiser
v2.4.2 FE (corrections), Sandcat Free Edition v4.0.0.1, JSKY
Free Edition v1.0.0, N-Stalker 2009 Free Edition v7.0.0.223,
UWSS (Uber Web Security Scanner) v0.0.2, Grabber v0.1, WebScarab
v20100820, Mini MySqlat0r v0.5, WSTool v0.14001,
crawlfish v0.92, Gamja v1.6, iScan v0.1, LoverBoy
v1.0, DSSS (Damn Simple SQLi Scanner) v0.1h, openAcunetix
v0.1, ScreamingCSS v1.02, Secubat v0.5, SQID
(SQL Injection Digger) v0.3, SQLiX v1.0, VulnDetector
v0.0.2, Web Injection Scanner (WIS) v0.4, Xcobra v0.2, XSSploit
v0.5, XSSS v0.40, Priamos v1.0, XSSer
v1.5-1 (version 1.6 was released but I didn't manage to test it), aidSQL
02062011 (a newer revision exists in the SVN but was not officially released)
For a full list of commercial & open source tools that
were not tested in this benchmark, refer to the appendix.
3. Benchmark Overview & Assessment Criteria
The benchmark focused on testing commercial & open
source tools that are able to detect (and not necessarily exploit) security
vulnerabilities on a wide range of URLs, and thus, each tool tested was
required to support the following features:
· The ability to detect Reflected XSS and/or SQL Injection and/or Path Traversal/Local File Inclusion/Remote File Inclusion vulnerabilities.
· The ability to scan multiple URLs at once (using either a crawler/spider feature, URL/Log file parsing feature or a built-in proxy).
· The ability to control and limit the scan to internal or external host (domain/IP).
The testing procedure of all the tools included the
following phases:
Feature Documentation
The features of each scanner were documented and compared,
according to documentation, configuration, plugins and information received
from the vendor. The features were then divided into groups, which were
used to compose various hierarchical charts.
Accuracy Assessment
The scanners were all tested against the latest version of WAVSEP (v1.2, integrating ZAP-WAVE),
a benchmarking platform designed to assess the detection accuracy of web
application scanners, which was released with the publication of this benchmark.
The purpose of WAVSEP’s test cases is to provide a scale for understanding
which detection barriers each scanning tool can bypass, and which common
vulnerability variations can be detected by each tool.
· The various scanners were tested against the following test cases (GET and POST attack vectors):
o 816 test cases that were vulnerable to Path Traversal attacks.
o 108 test cases that were vulnerable to Remote File Inclusion (XSS via RFI) attacks.
o 66 test cases that were vulnerable to Reflected Cross Site Scripting attacks.
o 80 test cases that contained Error Disclosing SQL Injection exposures.
o 46 test cases that contained Blind SQL Injection exposures.
o 10 test cases that were vulnerable to Time Based SQL Injection attacks.
o 7 different categories of false positive RXSS vulnerabilities.
o 10 different categories of false positive SQLi vulnerabilities.
o 8 different categories of false positive Path Traversal / LFI vulnerabilities.
o 6 different categories of false positive Remote File Inclusion vulnerabilities.
· The benchmark included 8 experimental RXSS test cases and 2 experimental SQL Injection test cases, and although the scan results of these test cases were documented in the various scans, their results were not included in the final score, at least for now.
· In order to ensure result consistency, the directory of each exposure sub category was individually scanned multiple times using various configurations, usually using a single thread and a scan policy that only included the relevant plugins.
In order to ensure that the detection features of each scanner were truly effective, most of the scanners were tested against an additional benchmarking application that was prone to the same vulnerable test cases as the WAVSEP platform, but had a different design, slightly different behavior and different entry point format, in order to verify that no signatures were used, and that any improvement was due to the enhancement of the scanner's attack tree.
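To make the test case counts above more concrete, here is a minimal sketch of how per-category results could be turned into the two percentages that the result charts show (vulnerable test cases detected, and false positive categories triggered). It is only an illustration: the class name, field names and the "detected" numbers are invented for the example, and the formula is an assumption rather than the benchmark's documented scoring procedure. Only the totals come from the list above.

```python
from dataclasses import dataclass


@dataclass
class CategoryResult:
    """Hypothetical container for one scanner's results in one category."""
    name: str
    total_cases: int      # vulnerable test cases in the category (from the list above)
    detected_cases: int   # cases the scanner reported (invented numbers)
    fp_categories: int    # false positive categories in the category (from the list above)
    fp_triggered: int     # false positive categories the scanner reported (invented numbers)

    def detection_rate(self) -> float:
        """Percentage of vulnerable test cases that were detected."""
        return 100.0 * self.detected_cases / self.total_cases

    def false_positive_rate(self) -> float:
        """Percentage of false positive categories that were reported as vulnerable."""
        return 100.0 * self.fp_triggered / self.fp_categories


# Totals match the benchmark description; the detection figures are made up.
results = [
    CategoryResult("Path Traversal / LFI", 816, 408, 8, 2),
    CategoryResult("RFI (XSS via RFI)", 108, 54, 6, 0),
    CategoryResult("Reflected XSS", 66, 66, 7, 1),
    # Error based + blind + time based SQL injection test cases
    CategoryResult("SQL Injection", 80 + 46 + 10, 68, 10, 5),
]

for r in results:
    print(f"{r.name}: {r.detection_rate():.1f}% detected, "
          f"{r.false_positive_rate():.1f}% of false positive categories triggered")
```

Note that a false positive category can contain several instances, so a percentage of categories is a coarser figure than a percentage of individual test cases.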
Attack Surface Coverage Assessment
In order to assess the scanner's attack surface coverage, the
assessment included tests that measure the efficiency of the scanner's
automated crawling mechanism (input vector extraction), and feature
comparisons meant to assess its support for various technologies and its ability
to handle different scan barriers.
This section of the benchmark also included the WIVET
test (Web Input Vector Extractor Teaser), in which scanners were executed
against a dedicated application that can assess their crawling mechanism in the
aspect of input vector extraction. The specific details of this assessment are
provided in the relevant section.
Public tests vs. Obscure tests
In order to make the test as fair as possible, while still
enabling the various vendors to show improvement, the benchmark was divided
into tests that were publicly announced, and tests that were obscure
to all vendors:
· Publicly announced tests: the active scan feature comparison, and the detection accuracy assessment of SQL Injection and Reflected Cross Site Scripting (composed of test cases which were published as a part of WAVSEP v1.1.1).
· Tests that were obscure to all vendors until the moment of the publication: the various new groups of feature comparisons, the WIVET assessment, and the detection accuracy assessment of Path Traversal / LFI and Remote File Inclusion (XSS via RFI), implemented as 940+ test cases in WAVSEP 1.2 (a new version that was only published alongside this benchmark).
The results of the main test categories are presented within
three graphs (commercial graph, free & open source graph, unified graph),
and the detailed information of each test is presented in a dedicated section
in the benchmark presentation platform at http://www.sectoolmarket.com.
Now that we're finally done with the formalities, let's get to
the interesting part... the results.
4. A Glimpse at the Results of the Benchmark
The presentation of results in this benchmark, alongside the
dedicated website (http://www.sectoolmarket.com/)
and a series of supporting articles and methodologies ([placeholder]),
is designed to help the reader make a decision - to choose the
proper product/s or tool/s for the task at hand, within the limits of time
or budget.
For those of us that can't wait, and want to get a glimpse
of the summary of the unified results, dedicated pages are available at
the following links:
Price & Feature Comparison of Commercial Scanners
http://sectoolmarket.com/price-and-feature-comparison-of-web-application-scanners-commercial-list.html
Price & Feature Comparison of a Unified List of Commercial, Free and Open Source Products
Some of the sections might not be clear to some of
the readers at this phase, which is why I advise you to read the rest of the
article, prior to analyzing this summary.
5. Test I - Scanner Versatility - Input Vector Support
The first assessment criterion was the number of input
vectors each tool can scan (and not just parse).
Modern web applications use a variety of sub-protocols and
methods for delivering complex inputs from the browser to the server. These
methods include standard input delivery methods such as HTTP querystring
parameters and HTTP body parameters, modern
delivery methods such as JSON and XML, and even binary delivery methods for
technology specific objects such as AMF, Java serialized objects and WCF.
Since the vast majority of active scan plugins rely on input
that is meant to be injected into client originating parameters, supporting the
parameter (or rather, the input) delivery method of the tested application is a
necessity.
Although the charts in this section don't necessarily represent
the most important score, input vector support is the most important prerequisite for the
scanner to comply with when scanning a specific technology.
Reasoning: An automated tool can't detect a
vulnerability in a given parameter, if it can't scan the protocol or mimic the application's
method of delivering the input. The more vectors of input delivery that the
scanner supports, the more versatile it is in scanning different technologies
and applications (assuming it can handle the relevant scan barriers, supports
necessary features such as authentication, or alternatively, contains features that
can be used to work around the specific limitations).
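As a rough illustration of what "input delivery method" means in practice, the sketch below encodes the same parameter and value in four of the delivery methods mentioned above (querystring, HTTP body, JSON and XML). The function name, the `/search` path, the `request` root element and the `<probe>` marker are all assumptions made for the example; binary formats such as AMF or Java serialized objects are left out because they need dedicated libraries.

```python
import json
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def deliver(param: str, value: str) -> dict:
    """Encode one parameter/value pair in several input delivery methods."""
    root = ET.Element("request")              # hypothetical XML envelope
    ET.SubElement(root, param).text = value
    return {
        # Standard delivery methods
        "querystring": "/search?" + urlencode({param: value}),
        "body": urlencode({param: value}),
        # Modern delivery methods
        "json": json.dumps({param: value}),
        "xml": ET.tostring(root, encoding="unicode"),
    }


marker = "<probe>"                            # harmless placeholder value
vectors = deliver("q", marker)
for method, encoded in vectors.items():
    print(f"{method:12} {encoded}")
```

The value is identical in every case, but each method escapes and wraps it differently. A scanner that only knows how to rewrite querystring and body parameters has no way to place its payload inside the JSON or XML variants, which is why the number of supported vectors is treated here as a prerequisite rather than a bonus.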
The scanners' support for various
input delivery methods is documented in detail in the following section of
sectoolmarket (recommended - too many scanners in the chart):
The following chart shows how versatile each scanner is in scanning different input delivery vectors (and although not entirely accurate - different technologies):
The Number of Input Vectors Supported – Commercial Tools
The Number of Input Vectors Supported – Free & Open Source Tools
The Number of Input Vectors Supported – Unified List
6. Test II – Attack Vector Support – Counting Audit Features
The second assessment criterion was the number of audit
features each tool supports.
Reasoning: An automated tool can't detect an exposure
that it can't recognize (at least not directly, and not without manual analysis),
and therefore, the number of audit features will affect the number of exposures
that the tool will be able to detect (assuming the audit features are implemented
properly, that vulnerable entry points will be detected,
that the tool will be able to handle the relevant scan barriers and scanning
prerequisites, and that the tool will manage
to scan the vulnerable input vectors).
For the purpose of the benchmark, an audit feature was
defined as a common generic application-level scanning feature, supporting
the detection of exposures which could be used to attack the tested web
application, gain access to sensitive assets or attack legitimate clients.
The definition of the assessment criterion rules out product
specific exposures and infrastructure related vulnerabilities, while unique and
extremely rare features were documented and presented in a different section of
this research, and were not taken into account when calculating the results.
Exposures that were specific to Flash/Applet/Silverlight and Web Services
Assessment (with the exception of XXE) were treated in the same manner.
The comparison of the scanners' support for various audit features is documented in detail in the following section of sectoolmarket:
The Number of Audit Features in Web Application Scanners – Commercial Tools
The Number of Audit Features in Web Application Scanners – Free & Open Source Tools
The Number of Audit Features in Web Application Scanners – Unified List
So once again, now that we're done with the quantity, let's get to the quality…
7. Introduction to the Various Accuracy Assessments
The following sections present the results of the detection
accuracy assessments performed for Reflected XSS, SQL Injection, Path Traversal
and Remote File Inclusion (RXSS via RFI) - four of the most commonly supported
features in web application scanners. Although the detection accuracy of a
specific exposure might not reflect the overall condition of the scanner on its
own, it is a crucial indicator for how good a scanner is at detecting specific
vulnerability instances.
The various assessments were performed against the various
test cases of WAVSEP v1.2, which emulate different common test case
scenarios for generic technologies.
Reasoning: a scanner that is not accurate enough will
miss many exposures, and might classify non-vulnerable entry points as
vulnerable. These tests aim to assess how good each tool is at detecting the vulnerabilities
it claims to support, in a supported input vector, which is located in
a known entry point, without any restrictions that can prevent the tool
from operating properly.
8. Test III – The Detection Accuracy of Reflected XSS
The third assessment criterion was the detection accuracy of
Reflected Cross Site Scripting, a common exposure which is the 2nd most
commonly implemented feature in web application scanners, and the one in which
I noticed the greatest improvement in the various tested web application
scanners.
The comparison of the scanners' reflected cross site scripting detection accuracy is documented in detail in the following section of sectoolmarket:
Result Chart Glossary
Note that the GREEN bar represents the vulnerable test case detection accuracy, while the RED bar represents false positive categories
detected by the tool (which may result in more instances than what the bar
actually presents, when compared to the detection accuracy bar).
The Reflected XSS Detection Accuracy of Web Application Scanners – Commercial Tools
The Reflected XSS Detection Accuracy of Web Application Scanners – Open Source & Free Tools
The Reflected XSS Detection Accuracy of Web Application Scanners – Unified List
9. Test IV – The Detection Accuracy of SQL Injection
The fourth assessment criterion was the detection accuracy
of SQL Injection, one of the most famous exposures and the most commonly
implemented attack vector in web application scanners.
The evaluation was performed on an application that uses
MySQL 5.5.x as its data repository, and thus, will reflect the detection
accuracy of the tool when scanning an application that uses similar data
repositories.
The comparison of the scanners' SQL injection detection accuracy is documented in detail in the following section of sectoolmarket:
Result Chart Glossary
Note that the GREEN bar represents the vulnerable test case detection accuracy, while the RED bar represents false positive categories detected by the tool (which may result in more instances than what the bar actually presents, when compared to the detection accuracy bar).
The SQL Injection Detection Accuracy of Web Application Scanners – Commercial Tools
The SQL Injection Detection Accuracy of Web Application Scanners – Open Source & Free Tools
The SQL Injection Detection Accuracy of Web Application Scanners – Unified List
Although there are many changes in the results since the
last benchmark, both of these exposures (SQLi, RXSS) were previously assessed,
so, I believe it's time to introduce something new... something none of the
tested vendors could have prepared for in advance...
10. Test V – The Detection Accuracy of Path Traversal/LFI
The fifth assessment criterion was the detection accuracy of
Path Traversal (a.k.a Directory Traversal), a newly implemented feature in
WAVSEP v1.2, and the third most commonly implemented attack vector in web
application scanners.
The reason it was tagged along with Local File Inclusion
(LFI) is simple - many scanners don't differentiate between
inclusion and traversal, and neither do a few online vulnerability documentation
sources. In addition, the results obtained from the tests performed on
the vast majority of tools lead to the same conclusion - many plugins listed
under the name LFI detected the path traversal test cases.
While implementing the path traversal test cases and
consuming nearly every relevant piece of documentation I could find on the
subject, I decided to take the current path, in spite of some acute differences
some of the documentation sources suggested (but did implement an
infrastructure in WAVSEP for "true" inclusion exposures).
The point is not to get into a discussion of whether
or not path traversal, directory traversal and local file inclusion should be
classified as the same vulnerability, but simply to explain why in spite of the
differences some organizations / classification methods have for these
exposures, they were listed under the same name (In sectoolmarket - path
traversal detection accuracy is listed under the title LFI).
The evaluation was performed on a WAVSEP v1.2
instance that was hosted on Windows XP, and although there are specific test
cases meant to emulate servers that are running with low-privileged OS user
accounts (using the servlet context file access method), many of the test cases
emulate web servers that are running with administrative user accounts.
[Note - in addition to the WAVSEP installation, to
produce results identical to those of this benchmark, a file by the name of
content.ini must be placed in the root installation directory of the Tomcat server,
which is different from the root directory of the web server]
Although I didn't perform the path traversal scans on Linux
for all the tools, I did perform the initial experiments on Linux, and even a
couple of verifications on Linux for some of the scanners, and as weird as it
sounds, I can clearly state that the results were significantly worse,
and although I won't get the opportunity to discuss the subject in this
benchmark, I might handle it in the next.
In order to assess the detection accuracy of different path
traversal instances, I designed a total of 816 OS-adapting path
traversal test cases (meaning - the test cases adapt themselves to the OS they
are executed in, and to the server they are executed in, in the aspects of file
access delimiters and file access paths). I know it might seem like a lot, and I
guess I did get carried away with the perfectionism, but you will be surprised
to see that these tests really represent common vulnerability instances, and
not necessarily super extreme scenarios, and that the results of the tests did
prove the necessity.
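The idea of an "OS-adapting" test case can be sketched roughly as follows. This is not WAVSEP's code (WAVSEP is a Java web application, and its internals are not described here); it is a hypothetical Python illustration of how the file access delimiter and the file access path could be chosen according to the OS the code runs on. The profile table, the function names and the use of content.ini as the target filename are assumptions made for the example.

```python
import platform
from dataclasses import dataclass


@dataclass(frozen=True)
class OsProfile:
    """Hypothetical per-OS settings: path delimiter and filesystem root."""
    delimiter: str
    root: str


PROFILES = {
    "windows": OsProfile(delimiter="\\", root="C:\\"),
    "linux": OsProfile(delimiter="/", root="/"),
}


def detect_os() -> str:
    """Pick a profile name based on the OS the code is executed in."""
    return "windows" if platform.system() == "Windows" else "linux"


def relative_traversal(os_name: str, filename: str, depth: int) -> str:
    """Build a relative path that climbs `depth` directories, then names the file."""
    profile = PROFILES[os_name]
    return profile.delimiter.join([".."] * depth + [filename])


def absolute_path(os_name: str, filename: str) -> str:
    """Build an absolute path to the file, directly under the filesystem root."""
    return PROFILES[os_name].root + filename


current = detect_os()
print(current, relative_traversal(current, "content.ini", 2))
```

With these assumptions, the same logical test case yields `..\..\content.ini` on Windows and `../../content.ini` on Linux, so a scanner that only sends one delimiter style can pass on one platform and fail on the other.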
The tests were designed to emulate various combinations of the
following conditions and restrictions:
If you take a closer look at the detailed scan-specific
results at www.sectoolmarket.com, you'll notice that some scanners were completely
unaffected by the response content type and HTTP code variation, while
other scanners were dramatically affected by the variety (gee, it's nice
to know that I didn't write them all for nothing... :) ).
In reality, there were supposed to be more test cases,
primarily because I intended to test injection entry points in which the input
only affected the filename without the extension, or was injected directly into
the directory name. However, due to the sheer amount of tests and the deadline
I had for this benchmark, I decided to delete (literally) the test cases that
handled these anomalies, and focus on test cases in which the entire
filename/path was affected. That being said, I might publish these test cases
in future versions of wavsep (they amount to a couple of hundred).
The comparison of the scanners' path traversal detection accuracy is documented in detail in the following section of sectoolmarket:
Result Chart Glossary
Note that the GREEN bar represents the vulnerable test case detection accuracy, while the RED bar represents false positive categories detected by the tool (which may result in more instances than what the bar actually presents, when compared to the detection accuracy bar).
The Path Traversal / LFI Detection Accuracy of Web Application Scanners – Commercial Tools
The Path Traversal / LFI Detection Accuracy of Web Application Scanners – Open Source & Free Tools
The Path Traversal / LFI Detection Accuracy of Web Application Scanners – Unified List
And what of LFI's evil counterpart, Remote File Inclusion?
(yeah yeah, I know, it was path traversal...)
11. Test VI – The Detection Accuracy of RFI (XSS via RFI)
The sixth assessment criterion was the detection accuracy of
Remote File Inclusion (or more accurately, vectors of RFI that can result in
XSS or Phishing - and currently, not necessarily in server code execution), a
newly implemented feature in WAVSEP v1.2, and one of the most commonly
implemented attack vectors in web application scanners.
I didn't originally plan to assess the detection accuracy of
RFI in this benchmark, however, since I implemented a new structure to wavsep
that enables me to write a lot of test cases faster, I couldn't resist
the urge to try it... and thus, found a new way to decrease the amount of sleep
I get each night.
The interesting thing I found was that although RFI is
supposed to work a bit differently than LFI/Path traversal, many LFI/Path
traversal Plugins effectively detected RFI exposures, and in some instances,
the tests for both of these vulnerabilities were actually implemented in the
same plugin (usually named "file inclusions"); thus, while scanning
for Traversal/LFI/RFI, I usually activated all the relevant plugins in the
scanner, and lo and behold - got results from the LFI/Path Traversal plugins
that even the RFI dedicated plugins did not detect.
In order to assess the detection accuracy of different remote
file inclusion exposures (again, RXSS/Phishing via RFI vectors), I designed a
total of 108 remote file inclusion test cases.
The tests were designed to emulate various combinations of the
following conditions and restrictions:
Just like the case of path traversal, in reality, there were
supposed to be more XSS via RFI test cases, primarily because I intended to
test injection entry points in which the input only affected the filename
without the extension, or was injected directly into the directory name.
However, due to the sheer amount of tests and the deadline I had for this
benchmark, I decided to delete (literally) the test cases that handled these
anomalies, and focus on test cases in which the entire filename/path was
affected. That being said, I might publish these test cases in future versions
of wavsep (they amount to dozens).
[Note: Although the tested versions of Appscan and Nessus
contain RFI detection plugins, they did not support the detection of XSS via
RFI.]
The comparison of the scanners' remote file inclusion detection accuracy is documented in detail in the following section of sectoolmarket:
Result Chart Glossary
Note that the GREEN bar represents the vulnerable test case detection accuracy, while the RED bar represents false positive categories detected by the tool (which may result in more instances than what the bar actually presents, when compared to the detection accuracy bar).
The RFI (XSS via RFI) Detection Accuracy of Web Application Scanners – Commercial Tools
The RFI (XSS via RFI) Detection Accuracy of Web Application Scanners – Open Source & Free Tools
The RFI (XSS via RFI) Detection Accuracy of Web Application Scanners – Unified List
And after covering all those accuracy aspects, it's time to
cover a totally different subject - Coverage.
12. Test VII - WIVET - Coverage via Automated Crawling
The seventh assessment criterion was the scanner's WIVET
score, which is related to coverage.
The concept of coverage can mean a lot of things, but in
general, what I'm referring to is the ability of the scanner to increase the
attack surface of the tested application - to locate additional resources and
input delivery methods to attack.
Although a scanner can increase the attack surface in a
number of ways, from detecting hidden files to exposing device-specific
interfaces, this section of the benchmark focuses on automated crawling
and efficient input vector extraction.
This aspect of a scanner is extremely important in
point-and-shoot scans, scans in which the user does not "train" the
scanner to recognize the application structure, URLs and requests, either due
to time/methodology restrictions, or when the user is not a security expert
that knows how to properly use manual crawling with the scanner.
In order to evaluate these aspects in scanners, I used a
wonderful OWASP Turkey project called WIVET
(Web Input Vector Extractor Teaser). The WIVET project is a benchmarking
project that was written by an application security specialist by the name of Bedirhan Urgun, and released under
the GPL2 license.
The project is implemented as a web application which aims to "statistically analyze web link extractors", by measuring the number of input vectors extracted by each scanner while crawling the WIVET website, in order to assess how well each scanner can increase the coverage of the attack surface.
Plainly speaking, the project simply measures how well a
scanner is able to crawl the application, and how well it can locate input
vectors, by presenting a collection of challenges that contain links,
parameters and input delivery methods that the crawling process should locate
and extract.
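As a rough illustration of what such a coverage score expresses, the following minimal Python sketch computes the percentage of challenges a crawler managed to reach. The challenge identifiers and the `coverage_score` function are hypothetical, and this is not WIVET's actual scoring code.

```python
def coverage_score(all_cases, visited_cases):
    """Return the percentage of known test cases that the crawler reached."""
    all_cases = set(all_cases)
    if not all_cases:
        raise ValueError("the benchmark must define at least one test case")
    # Visits to pages that are not benchmark test cases do not count.
    reached = all_cases & set(visited_cases)
    return 100.0 * len(reached) / len(all_cases)


if __name__ == "__main__":
    cases = ["link", "form", "js_event", "flash"]   # hypothetical challenge ids
    visited = ["link", "form", "index"]             # what a crawler requested
    print(f"{coverage_score(cases, visited):.0f}%")
```

A scanner that requests pages outside the challenge set gains nothing from them, which is why the score reflects extraction ability rather than raw request volume.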
Although WIVET used to have an online instance, with my
luck, by the time I decided to use it the online version was already gone... so
I checked-out the latest subversion revision from the project's google code
website (v3-revision148), installed FastCGI on an IIS server (Windows XP),
copied the application files to a directory called wivet under the C:\Inetpub\wwwroot\
directory, and started the IIS default website.
In order for WIVET to work, the scanner must crawl the
application while consistently using the same session identifier in its
crawling requests, while avoiding the 100.php logout page (which initializes
the session, and thus the results). The results can then be viewed by accessing
the application index page, while using the session identifier used during the
scan.
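The logout exclusion requirement can be sketched as a small filter over the crawler's queue of discovered URLs. This is only an illustration under assumptions: the function names are hypothetical, the sample URLs are made up, and real scanners implement exclusions in their own ways.

```python
from urllib.parse import urlsplit

EXCLUDED_PAGES = {"100.php"}  # the WIVET logout page named in the text


def is_crawlable(url):
    """Return False for URLs whose file name is on the exclusion list."""
    page = urlsplit(url).path.rsplit("/", 1)[-1]
    return page.lower() not in EXCLUDED_PAGES


def filter_frontier(urls):
    """Drop excluded pages and duplicates while keeping discovery order."""
    seen = set()
    kept = []
    for url in urls:
        if is_crawlable(url) and url not in seen:
            seen.add(url)
            kept.append(url)
    return kept
```

Matching on the file name rather than on the full URL means the exclusion still holds when the logout link appears with a query string.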
It's a very nice idea that makes the assessment process easy and
effective. For me, however, things weren't that easy. Although some scanners
did work properly with the platform, many scanners did not receive any score,
even though I configured them exactly according to the recommendations (valid
session identifier and logout URL exclusion). After a careful examination, I
discovered the source of my problem: some of the scanners don't send the
predefined session identifier in their crawling requests (even though it's
explicitly defined in the product), and others simply ignore URL exclusions (in
certain conditions).
Since even without these bugs, not all the scanners
supported URL exclusions (100.php logout page) and predefined cookies, I had to
come up with a solution that would enable me to test all of them. I changed
the WIVET platform a little bit by deleting the link to the logout page
(100.php) from the main menu page (menu.php), and forwarded the communication of
the vast majority of scanners through a fiddler instance, in which I defined a
valid WIVET session identifier (using the filter features). In
extreme scenarios in which an upstream proxy was not supported by the scanner, I
defined the WIVET website as a proxy in an IE browser, loaded fiddler
(so it would forward the communication to the system defined proxy - WIVET),
defined burp as a transparent proxy that forwards the communication to fiddler
(upstream proxy), and scanned burp instead of the WIVET application. In that
setup, the scanner scans burp, which forwards the communication to fiddler, which
forwards the communication to the system defined proxy - the WIVET
website.
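The effect of forcing a session identifier at the proxy can be sketched as a header rewrite. The snippet below is a minimal illustration and not the fiddler filter configuration itself; the `force_session_cookie` function is hypothetical, and the `PHPSESSID` cookie name is an assumption based on PHP's default session cookie.

```python
def force_session_cookie(headers, session_id, cookie_name="PHPSESSID"):
    """Return a copy of the request headers with the session cookie overridden."""
    rewritten = dict(headers)
    # Reuse the existing header key regardless of its letter case.
    cookie_key = next((k for k in rewritten if k.lower() == "cookie"), "Cookie")
    pairs = []
    for part in rewritten.get(cookie_key, "").split(";"):
        part = part.strip()
        # Keep every cookie except a session cookie sent by the scanner.
        if part and part.split("=", 1)[0] != cookie_name:
            pairs.append(part)
    pairs.append(f"{cookie_name}={session_id}")
    rewritten[cookie_key] = "; ".join(pairs)
    return rewritten
```

Because the override happens outside the scanner, it works even when the scanner ignores its own predefined cookie setting.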
These solutions seemed to be working for most vendors, that
is until I discovered two more bugs that caused these solutions not to work for
another small group of products...
The first bug was related to the emulation of modern browser
behavior when interpreting the relative context of links in a frameset
(browsers use the link's target frame as the path basis, but some scanners used
the path basis of the link's origin page), and the other bug was related to
another browser emulation issue - some scanners did not manage to submit
forms without an action property (while a browser usually submits such a form
to the same URL that the form originated from).
I managed to solve the first bug by editing the menu page
and manually adding additional links with an alternate
context (added "pages/" to all
URLs) to the same WIVET pages, while the second bug was reported to some
vendors (and was handled by them).
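Both issues come down to which URL is used as the base when a relative reference is resolved. The following sketch uses Python's standard `urljoin` to show how the same relative link yields different absolute URLs for different base pages, and how a form without an action falls back to its own page. The page names and host are hypothetical.

```python
from urllib.parse import urljoin


def resolve_link(base_url, href):
    """Resolve a relative link against whichever page is chosen as the base."""
    return urljoin(base_url, href)


def form_target(page_url, action=None):
    """Return the submission URL; a missing or empty action means the page itself."""
    return urljoin(page_url, action or "")


if __name__ == "__main__":
    menu = "http://localhost/wivet/menu.php"          # hypothetical menu page
    frame = "http://localhost/wivet/pages/main.php"   # hypothetical frame page
    print(resolve_link(menu, "1.php"))
    print(resolve_link(frame, "1.php"))
    print(form_target("http://localhost/wivet/pages/form.php?id=3"))
```

A crawler that picks the wrong base requests a URL that does not exist, so the challenge behind the link is never counted.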
Finally, some of the scanners had bugs that I did not manage
to isolate in the given timeframe, and thus, I didn't manage to get any WIVET
score for them (a list of these products will be presented at the end of this
section).
However, the vast majority of the scanners did get a score,
which can be viewed in the following charts and links.
The comparison of the scanners' WIVET score is documented in detail in the following section of sectoolmarket:
http://sectoolmarket.com/wivet-score-unified-list.html
The WIVET Score of Web Application Scanners – Commercial Tools
The WIVET Score of Web Application Scanners – Free and Open Source Tools
The WIVET Score of Web Application Scanners – Unified List
It is important to clarify that due to these scanner bugs (and the current WIVET structure) - low scores and non-existing scores might differ once minor bugs are fixed, but the scores presented in this chart are currently all I can offer.
The following scanners didn't manage to get a WIVET score at
all (even after all the adjustments and enhancements I tried). This does not
mean that their score is necessarily low, or that there isn't any
possible way to execute them against WIVET, simply that there isn't a
simple method of doing it (at least not one that I discovered):
Syhunt Mini (Sandcat Mini), Webcruiser, IronWASP, Safe3WVS
free edition, N-Stalker 2012 free edition, Vega, Skipfish.
In addition, I didn't try scanning WIVET with various
unmaintained scanners, scanners that didn't have a spider feature (WATOBO in
the assessed version, Ammonite, etc), or with the following assessed tools: Nessus,
sqlmap.
It's crucial to note that scanners with burp-log parsing
features (such as sqlmap and IronWASP) can effectively be assigned the WIVET
score of burp, that scanners with internal proxy features (such as ZAP,
Burpsuite, Vega, etc) can be used with the crawling mechanisms of other
scanners (such as Acunetix FE), and that as a result of both of these conclusions,
any scanner that supports any of those features can be assigned the WIVET score
of any scanner in the possession of the tester (by using the crawling mechanism
of a scanner through a proxy such as burp, in order to generate scan logs).
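To illustrate how a proxy log can substitute for a scanner's own crawler, the sketch below turns logged requests into a de-duplicated list of input vectors. The one-request-per-line `METHOD URL` format is a simplifying assumption made for this example; it is not the actual burp log format, and the function name is hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def extract_input_vectors(log_lines):
    """Collect unique (method, URL without query, parameter names) entries."""
    vectors = []
    seen = set()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        method, url = fields[0].upper(), fields[1]
        parts = urlsplit(url)
        params = parse_qsl(parts.query, keep_blank_values=True)
        names = tuple(sorted({name for name, _ in params}))
        entry = (method, f"{parts.scheme}://{parts.netloc}{parts.path}", names)
        # Requests that differ only in parameter values are one input vector.
        if entry not in seen:
            seen.add(entry)
            vectors.append(entry)
    return vectors
```

The resulting list is what an audit engine would then attack, regardless of which tool produced the traffic.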
13. Test VIII – Scanner Adaptability - Crawling
& Scan Barriers
By using the seemingly irrelevant term "adaptability"
in relation to scanners, I'm actually referring to the scanner's ability to
adapt and scan the application, despite different technologies, abnormal crawling
requirements and varying scan barriers, such as Anti-CSRF tokens,
CAPTCHA mechanisms, platform specific tokens (such as required viewstate
values) or account lock mechanisms.
Although not necessarily a measurable quality, the ability
of the scanner to handle different technologies and scan barriers is an
important prerequisite, and in a sense, almost as important as being able to scan
the input delivery method.
Reasoning: An automated tool can't detect a
vulnerability in a point and shoot scenario if it can't locate & scan the
vulnerable location due to the lack of support for a certain browser add-on, the
lack of support for extracting data from certain non-standard vectors, or the
lack of support in overcoming a specific barrier, such as a required token or
challenge. The more barriers the scanner is able to handle, the more useful it
is when scanning complex applications that employ the use of various
technologies and scan barriers (assuming it can handle the relevant input
vectors, supports the necessary features such as authentication, or has a
feature that can be used to work around the specific limitations).
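As an example of handling one such barrier, the sketch below extracts hidden form fields (where Anti-CSRF tokens and viewstate values usually live) so that they can be replayed with the next request. It is a minimal illustration using Python's standard HTML parser; the class and function names are hypothetical, and a real scanner would also need to refresh the token before every request.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect the name/value pairs of hidden input fields."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and (attributes.get("type") or "").lower() == "hidden":
            name = attributes.get("name")
            if name:
                self.fields[name] = attributes.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of hidden field names and values found in the page."""
    parser = HiddenFieldParser()
    parser.feed(html)
    parser.close()
    return parser.fields
```

A scanner that skips this step keeps submitting a stale or missing token, so the application rejects its payloads before they ever reach the vulnerable code.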
The following charts show how many types of barriers
each scanner claims to be able to handle (these features were not verified, and the
information currently relies on documentation or vendor supplied information):
The Adaptability Score of Web Application Scanners – Commercial Tools
The Adaptability Score of Web Application Scanners – Free and Open Source Tools
The Adaptability Score of Web Application Scanners – Unified List
The scanners' support for various barriers is documented in detail in the following section of sectoolmarket:
14. Test IX – Authentication and
Usability Feature Comparison
Although supporting the authentication required by the
application seems like a crucial quality, in reality, certain scanner chaining features
can make up for the lack of support for certain authentication methods, by
employing the use of a 3rd party proxy to authenticate on the scanner's behalf.
For example, if we wanted to use a scanner that does not
support NTLM authentication (but does support an upstream proxy), we could
define the relevant credentials in burpsuite FE, and define it as an upstream
proxy for the tested scanner.
However, chaining the scanner to an external tool that
supports the authentication still has some disadvantages, such as potential
stability issues, thread limitation and inconvenience.
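From the client's point of view, this kind of chaining only requires pointing the HTTP traffic at the upstream proxy. The following sketch shows the idea with Python's standard library; the proxy address is an assumption (a local listener on port 8080), and the function name is hypothetical.

```python
import urllib.request


def build_chained_opener(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests through proxy_url."""
    proxy = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(proxy)


if __name__ == "__main__":
    # Assumed address of the authenticating proxy.
    opener = build_chained_opener("http://127.0.0.1:8080")
    # Requests made with opener.open(...) would now travel through the proxy,
    # which is expected to add the authentication on the client's behalf.
```

The client itself never handles the credentials, which is exactly why the approach works for tools that lack support for the authentication method.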
The following comparison table shows which authentication
methods and features are supported by the various assessed scanners:
15. Test X – The Crown Jewel - Results
& Features vs. Pricing
Finally, after reading through all the sections and charts,
and analyzing the different aspects in
which each scanner was measured, it's time to expose the price (at least for
those of you that did manage to resist the temptation to access this link at
the beginning).
The important thing to notice, specifically in
relation to commercial scanner pricing, is that each product might be a bundle
of several semi-independent products that cover different aspects of the
assessment process, which are not necessarily related to web application
security. These products currently include web service scanners, flash
application scanners and CGI scanners (SAST and IAST features were not included
on purpose).
In short, the scanner price might reflect (or not) a set of
products that might have been priced separately as independent products.
Another issue to pay attention to is the type of license acquired.
In general, I did not cover non-commercial prices in this comparison,
and in addition, did not include any vendor specific bundles, sales, discounts
and sales pitches. I presented the base prices listed in the vendor website or
provided to me by the vendor, according to a total of 6 predefined categories,
which are in fact, combinations of the following concepts:
Consultant Licenses: although there isn't a commonly
accepted term, I defined "Consultant" licenses as licenses that fit
the common requirements of a consulting firm - scanning an unrestricted amount
of IP addresses, without any boundaries or limitations.
Limited Enterprise Licenses: Any license that allowed
scanning an unlimited but restricted set of addresses (for example - internal
network addresses or organization-specific assets) was defined as an enterprise
license, which might not be suited for a consultant, but will usually suffice
for an organization interested in assessing its own applications.
Website/Year - a license to install the software on a
single station and use it for a single
year against a single IP address (the exception to this rule is Netsparker, in
which the per website price reflects 3 Websites).
Seat/Year - a license to install the software on a
single station and use it for a single year.
Perpetual Licenses - pay once, and it's yours (might
still be limited by seat, website, enterprise or consultant restrictions). The
vendor's website usually includes additional prices for optional support and
product updates.
The various prices can be viewed in the dedicated comparison
in sectoolmarket, available at the following address:
It is important to remember that these prices might change,
vary or be affected by numerous variables, from special discounts and sales to
a strategic conscious decision of a vendor to invest in you as a customer or a
beta testing site.
16. Additional Comparisons, Built-in
Products and Licenses
While in the past I used to present additional information
in external PDF files, with the new presentation platform I am now able to
present the information in a medium that is much easier to use and analyze.
Although anyone can access the root URL of sectoolmarket and search the various
sections on their own, I decided to provide a short summary of additional lists
and features that were not covered in a dedicated section of this benchmark,
but were still documented and published in sectoolmarket.
List of Tools
The list of tools tested in this benchmark, and in the
previous benchmarks, can be accessed through the following link:
Additional Features
Complementary scan features that were not evaluated or
included in the benchmark:
In order to clarify what each column in the report table
means, use the following glossary table:
Title | Possible Values
Configuration & Usage Scale | Very Simple - GUI + Wizard; Simple - GUI with simple options, Command line with scan configuration file or simple options; Complex - GUI with numerous options, Command line with multiple options; Very Complex - Manual scanning feature dependencies, multiple configuration requirements
Stability Scale | Very Stable - Rarely crashes, Never gets stuck; Stable - Rarely crashes, Gets stuck only in extreme scenarios; Unstable - Crashes every once in a while, Freezes on a consistent basis; Fragile – Freezes or Crashes on a consistent basis, Fails performing the operation in many cases
Performance Scale | Very Fast - Fast implementation with limited amount of scanning tasks; Fast - Fast implementation with plenty of scanning tasks; Slow - Slow implementation with limited amount of scanning tasks; Very Slow - Slow implementation with plenty of scanning tasks
Scan Logs
In order to access the scan logs and detailed scan results of
each scanner, simply access the scan-specific information for that scanner, by
clicking on the scanner version in the various comparison charts:
17. What Changed?
Since the latest benchmark, many open source &
commercial tools added new features and improved their detection accuracy.
The following list presents a summary of changes in the
detection accuracy of commercial tools that were tested in the previous
benchmark (+new):
· IBM AppScan - no significant changes, new results for Path Traversal and WIVET.
· WebInspect - a dramatic improvement in the detection accuracy of SQLi and XSS (fantastic result!), new results for Path Traversal, RFI (fantastic result!), and WIVET (fantastic result!)
· Netsparker - no significant changes, new results for Path Traversal and WIVET.
· Acunetix WVS - a dramatic improvement in the detection accuracy of SQLi (fantastic result!) and XSS (fantastic result!), and new results for Path Traversal, RFI and WIVET.
· Syhunt Dynamic - a dramatic improvement in the detection accuracy of XSS (fantastic result!) and SQLi, and new results for Path Traversal, RFI and WIVET.
· Burp Suite - a dramatic improvement in the detection accuracy of XSS and SQLi (fantastic result!), and new results for Path Traversal and WIVET.
· ParosPro - New results for Path Traversal and WIVET.
· JSky - New results for RFI, Path Traversal and WIVET.
· WebCruiser - No significant changes.
· Nessus - a dramatic improvement in the detection accuracy of Reflected XSS, potential bug in the LFI/RFI detection features.
· Ammonite - New results for RXSS, SQLi, RFI and Path Traversal (fantastic result!)
The following list presents a summary of changes in the
detection accuracy of free and open source tools that were tested in the
previous benchmark (+new):
· Zed Attack Proxy (ZAP) – a dramatic improvement in the detection accuracy of Reflected XSS exposures (fantastic result!), in addition to new results for Path Traversal and WIVET.
· IronWASP - New results for SQLi, XSS, Path Traversal and RFI (fantastic result!).
· arachni – an improvement in the detection accuracy of Reflected XSS exposures (mainly due to the elimination of false positives), but a decrease in the accuracy of SQL injection exposures (due to additional false positives being discovered). There are also new results for RFI, Path Traversal (incomplete due to a bug), and WIVET.
· sqlmap – a dramatic improvement in the detection accuracy of SQL Injection exposures (fantastic result!).
· Acunetix Free Edition – a dramatic improvement in the detection accuracy of Reflected XSS exposures, in addition to a new WIVET result.
· Syhunt Mini (Sandcat Mini) - a dramatic improvement in the detection accuracy of both XSS (fantastic result!) and SQLi. New results for RFI.
· Watobo – Identical results, in addition to new results for Path Traversal and WIVET. The author did not test the latest Watobo version, which was released a few days before the publication of this benchmark.
· N-Stalker 2012 FE – no significant changes, although it seems that the decreased accuracy is actually an unhandled bug in the release (unverified theory).
· Skipfish – insignificant changes that probably result from the testing methodology and/or testing environment. New results for Path Traversal, RFI and WIVET.
· WebSecurify – a major improvement in the detection accuracy of RXSS exposures, and new results for Path Traversal and WIVET.
· W3AF – a slight increase in the SQL Injection detection accuracy. New results for Path Traversal (fantastic result!), RFI and WIVET.
· Netsparker Community Edition – New results for WIVET.
· Andiparos & Paros – New results for WIVET.
· Wapiti – New results for Path Traversal, RFI and WIVET.
· ProxyStrike – New results for WIVET (Fantastic results for an open source product! again!)
· Vega - New results for Path Traversal, RFI and WIVET.
· Grendel Scan – New results for WIVET.
18. Initial Conclusions – Open Source vs.
Commercial
The following section presents my own personal opinions
on the results, and is not based purely on accurate statistics, like the rest
of the benchmark.
After testing various versions of over 51 open source
scanners on multiple occasions, and after comparing the results and experiences
to the ones I had after testing 15 commercial ones (including tools
tested in the previous benchmarks and tools I did not report), I have reached
the following conclusions:
· As far as accuracy & features, the distance between open source tools and commercial tools is insignificant, and open source tools already rival, and in some rare cases even exceed, the capabilities of commercial scanners (and vice versa).
· Although most open source scanners have not yet adjusted to support applications that use new technologies (AJAX, JSON, etc), recent advancements in the crawler of ZAP proxy (not tested in the benchmark, and might be reused by other projects), and the input vectors supported by a new project named IronWASP are a great beginning to the process. On the other hand, most of the commercial vendors already adjusted themselves to some of the new technologies, and can be used to scan them in a variety of models.
· The automated crawling capability of most commercial scanners is significantly better than that of open source projects, making these tools better for point and shoot scenarios... the difference, however, is not significant for some open source projects, which can "import" or employ the crawling capabilities of a free version of a commercial product (requires some experience with certain tools - probably more suited for a consultant than a QA engineer).
· Some open source tools, even the most accurate ones, are relatively difficult to install & use, and still require fine-tuning in various fields, particularly stability. Other open source projects, however, improved over the last year, and enhanced their user experience in many ways.
19. Verifying The Benchmark Results
The results of the benchmark can be verified by replicating
the scan methods described in the scan log of each scanner, and by testing the
scanner against WAVSEP v1.2 and WIVET v3-revision148.
The same methodology can be used to assess vulnerability
scanners that were not included in the benchmark.
The latest version of WAVSEP can be downloaded from the web
site of project WAVSEP (binary/source code distributions, installation
instructions and the test case description are provided in the web site
download section):
The latest version of WIVET can be downloaded from the
project web site, or preferably, checked-out from the project subversion
repository:
svn checkout http://wivet.googlecode.com/svn/trunk/ wivet-read-only
20. So What Now?
So now that we have all those statistics, it's time to
analyze them properly, and see which conclusions we can get to. I already
started writing a couple of articles that will make the information easy to
use, and defined a methodology that will explain exactly how to use it.
Analyzing the results however, will take me some time, since most of my time in
the next few months will be invested in another project I'm working on (will be
released soon), one I've been working on for the past year.
Since I didn't manage to test all the tools I wanted, I
might update the results of the benchmark soon with additional tools (so you
can think of it as a dynamic benchmark), and I will surely update the results
in sectoolmarket (made some promises).
If you want to get notifications on new scan results, follow
my blog or twitter account, and I'll do my best to tweet notifications when I
find the time to perform some major updates.
Since I have been in this situation before,
I know what's coming… so I apologize in advance for any delays in my
responses in the next few weeks, especially during August.
21. Recommended Reading List: Scanner
Benchmarks
The following resources include additional information on
previous benchmarks, comparisons and assessments in the field of web
application vulnerability scanners:
· "SQL Injection through HTTP Headers", by Yasser Aboukir (an analysis and enhancement of the 2011 60 scanners benchmark, with a different approach for interpreting the results, March 2012)
· "The Scanning Legion: Web Application Scanners Accuracy Assessment & Feature Comparison", one of the predecessors of the current benchmark, by Shay Chen (a comparison of 60 commercial & open source scanners, August 2011)
· "Building a Benchmark for SQL Injection Scanners", by Andrew Petukhov (a commercial & opensource scanner SQL injection benchmark with a generator that produces 27680 (!!!) test cases, August 2011)
· "Webapp Scanner Review: Acunetix versus Netsparker", by Mark Baldwin (commercial scanner comparison, April 2011)
· "Effectiveness of Automated Application Penetration Testing Tools", by Alexandre Miguel Ferreira and Harald Kleppe (commercial & freeware scanner comparison, February 2011)
· "Web Application Scanners Accuracy Assessment", one of the predecessors of the current benchmark, by Shay Chen (a comparison of 43 free & open source scanners, December 2010)
· "State of the Art: Automated Black-Box Web Application Vulnerability Testing", by Jason Bau, Elie Bursztein, Divij Gupta, John Mitchell (original paper, May 2010)
· "Analyzing the Accuracy and Time Costs of Web Application Security Scanners", by Larry Suto (commercial scanners comparison, February 2010)
· "Why Johnny Can’t Pentest: An Analysis of Black-box Web Vulnerability Scanners", by Adam Doupé, Marco Cova, Giovanni Vigna (commercial & open source scanner comparison, 2010)
· "Web Vulnerability Scanner Evaluation", by AnantaSec (commercial scanner comparison, January 2009)
· "Analyzing the Effectiveness and Coverage of Web Application Security Scanners", by Larry Suto (commercial scanners comparison, October 2007)
· "Rolling Review: Web App Scanners Still Have Trouble with Ajax", by Jordan Wiens (commercial scanners comparison, October 2007)
· "Web Application Vulnerability Scanners – a Benchmark", by Andreas Wiegenstein, Frederik Weidemann, Dr. Markus Schumacher, Sebastian Schinzel (Anonymous scanners comparison, October 2006)
22. Thank-You Note
During the research described in this article, I have
received help from plenty of individuals and resources, and I’d like to take
the opportunity to thank them all.
I might be reusing the texts, due to the late night hour and
the constant lack of sleep I have been through in the last couple of months,
but I mean every word that is written here.
For all the open source tool authors that
assisted me in testing the various tools in unreasonable late night hours and
bothered to adjust their tools for me, discuss their various features and invest
their time in explaining how I can optimize their use,
For the kind souls that helped me obtain evaluation
licenses for commercial products, for the CEOs, Marketing Executives, QA
engineers, Support and Development teams of commercial vendors, who saved
me tons of time, supported me throughout the process, helped me overcome
obstacles and proved to me that the process of interacting with a commercial
vendor can be a pleasant one, and for the various individuals that helped me
contact these vendors.
I can't thank you enough, and wish you all the best.
For the information sources that helped me gather the list
of scanners over the years, and gain knowledge, ideas, and insights, including
(but not limited to) information security sources such as Security Sh3ll
(http://security-sh3ll.blogspot.com/),
PenTestIT (http://www.pentestit.com/),
The Hacker News (http://thehackernews.com/),
Toolswatch (http://www.vulnerabilitydatabase.com/toolswatch/),
Darknet (http://www.darknet.org.uk/),
Packet Storm (http://packetstormsecurity.org/),
Google (of course), Twitter (my latest addiction) and many other
great sources that I have used over the years to gather the list of tools.
I hope that the conclusions, ideas, information and payloads
presented in this research (and the benchmarks and tools that will follow) will
contribute to all the vendors, projects and most importantly, testers that
choose to rely on them.
23. FAQ - Why Didn't You Test NTO, Cenzic
and N-Stalker?
Prior to the benchmark, I made an important decision. I
decided to go through official channels, and either contact vendors and work
with them, or use public evaluation versions of relatively simple
products. I had a huge amount of tasks, and needed the support to cut the
learning curve of understanding how to optimize the tools. I was determined to
meet my deadline, didn't have any time to spare, and was willing to make
certain sacrifices to meet my goals.
As for why specific vendors
were not included, this is the short answer:
NTO: I only managed to get in touch with NTO about two
weeks before the benchmark publication. I didn't have luck contacting the guys
I worked with in the previous benchmarks, but was eventually contacted by Kim
Dinerman. She was nice and polite, and apologized for the time the process
took. After I explained to her which timeframe they had for enhancing the
product (an action performed by other commercial vendors as well, in order to prepare
for the publicly known tests of the benchmark), they decided that the
timeframe and circumstances didn't provide an even opportunity and decided not
to participate.
I admit that by the time they contacted me, I was so loaded
with tasks that I was somewhat relieved, even though I was curious and wanted
to assess their product. That being said, I decided prior to the benchmark that
I would respect the decisions of vendors, even if it caused me to not get to a
round scanner number.
N-Stalker: I finally received a valid
N-Stalker license one day before the publication of the benchmark - a couple of
days after the final deadline I had for accepting any tool. I decided to give
it a shot, just in case it would be a simple process. However, with my luck, I
immediately discovered a bug that prevented me from properly assessing the
product and its features, and unlike the rest of the tests, which were performed
with a sufficient timeframe... this time, I had no time to find a workaround. I
decided not to publish the partial results I had (I did not want to create the
wrong impression or hurt anyone's business), and notified the vendor of the bug
and of my decision.
The vendor, for his part, thanked me for the bug report,
and promised to look into the issue. Sorry guys... I wanted to test them too...
next benchmark.
Cenzic: the story of Cenzic is much simpler than the
rest. I simply didn't manage to get in touch, and even though I did have access
to a license, I decided prior to the benchmark not to take that approach. As I
mentioned earlier, I decided to respect the vendor decisions, and not
to assess their product without their support.
The following commercial web application
vulnerability scanners were not included in the benchmark,
due to deadlines and time restrictions on my part, and in the case of
specific vendors, for other reasons.
Commercial Scanners not included in this benchmark
· N-Stalker Commercial Edition (N-Stalker)
· Hailstorm (Cenzic)
· NTOSpider (NTO)
· Retina Web Application Scanner (eEye Digital Security)
· SAINT Scanner Web Application Scanning Features (SAINT co.)
The following open source web application
vulnerability scanners were not included in the benchmark, mainly
due to time restrictions, but might be included in future benchmarks:
Open Source Scanners not included in this benchmark
· GNUCitizen JAVASCRIPT XSS SCANNER - since WebSecurify, a more advanced tool from the same vendor is already tested in the benchmark.
· Vulnerability Scanner 1.0 (by cmiN, RST) - since the source code contained traces for remotely downloaded RFI lists from locations that do not exist anymore.
The benchmark focused on web application scanners that are
able to detect either Reflected XSS or SQL Injection vulnerabilities, can be
locally installed, and are also able to scan multiple URLs in the same
execution.
As a result, the test did not include the following
types of tools:
·
Online Scanning
Services – Online applications that remotely scan applications,
including (but not limited to) Appscan On Demand (IBM), Click To Secure, QualysGuard
Web Application Scanning (Qualys), Sentinel (WhiteHat), Veracode (Veracode), VUPEN
Web Application Security Scanner (VUPEN Security), WebInspect (online service -
HP), WebScanService (Elanize KG), Gamascan (GAMASEC – currently offline), Cloud
Penetrator (Secpoint), Zero Day Scan, DomXSS
Scanner, etc.
·
Scanners without RXSS
/ SQLi detection features:
o
LFI/RFI Checker
(astalavista)
o
etc
·
Passive Scanners
(response analysis without verification):
o
etc
·
Scanners of specific
products or services (CMS scanners, Web Services Scanners, etc):
o WSDigger
o Sprajax
o ScanAjax
o Joomscan
o wpscan
o Joomlascan
o Joomsq
o WPSqli
o
etc
·
Web Application Scanning
Tools which are using Dynamic Runtime Analysis:
o PuzlBox (the free version was removed from the web site, and
is now sold as a commercial product named PHP Vulnerability Hunter)
o etc
· Uncontrollable Scanners - scanners that can’t be controlled or restricted to scan a single site, since they either receive the list of URLs to scan from a Google dork, or continue on to scan external sites that are linked to the tested site. This list currently includes the following tools (and might include more):
o Darkjumper 5.8 (scans additional external hosts that are linked to the given tested host)
o Bako's SQL Injection Scanner 2.2 (only tests sites from a Google dork)
o Serverchk (only tests sites from a Google dork)
o XSS Scanner by Xylitol (only tests sites from a Google dork)
o Hexjector by hkhexon – also falls into other
categories
o d0rk3r by b4ltazar
o etc
· Deprecated Scanners - incomplete tools that have not been maintained for a very long time. This list currently includes the following tools (and might include more):
o Wpoison (development stopped in 2003 and the new official version was never released, although the 2002 development version can be obtained by manually composing the sourceforge URL, which does not appear on the web site: http://sourceforge.net/projects/wpoison/files/ )
o etc
· De facto Fuzzers – tools that scan applications in a similar way to a scanner, but where a scanner attempts to conclude whether or not the application is vulnerable (according to some sort of “intelligent” set of rules), a fuzzer simply collects abnormal responses to various inputs and behaviors, leaving the task of drawing conclusions to the human user.
o Lilith 0.4c/0.6a (both versions 0.4c and 0.6a were tested, and although the tool seems to be a scanner at first glance, it doesn’t perform any intelligent analysis on the results).
o Spike proxy 1.48 (although the tool has XSS and SQLi scan features, it acts more like a fuzzer than like a scanner – it sends partial XSS and SQLi payloads, and does not verify that the context of the returned output is sufficient for execution or that the error presented by the server is related to a database syntax injection, leaving the verification task to the user).
· Fuzzers – scanning tools that lack the independent ability to conclude, by using some sort of verification method, whether a given response represents a vulnerable location (this category includes tools such as JBroFuzz, Firefuzzer, Proxmon, st4lk3r, etc). Fuzzers that had at least one type of exposure that was verified were included in the benchmark (Powerfuzzer).
· CGI Scanners: vulnerability scanners that focus on detecting hardening flaws and version specific hazards in web infrastructures (Nikto, Wikto, WHCC, st4lk3r, N-Stealth, etc)
· Single URL Vulnerability Scanners - scanners that can only scan one URL at a time, or can only scan information from a Google dork (uncontrollable).
o Havij (by itsecteam.com)
o Hexjector (by hkhexon)
o Simple XSS Fuzzer [SiXFu] (by www.EvilFingers.com)
o Mysqloit (by muhaimindz)
o PHP Fuzzer (by RoMeO from DarkMindZ)
o SQLi-Scanner (by Valentin Hoebel)
o Etc.
· Vulnerability Detection Assisting Tools – tools that aid in discovering a vulnerability, but do not detect the vulnerability themselves; for example:
· Exploiters - tools that can exploit vulnerabilities but have no independent ability to automatically detect vulnerabilities on a large scale. Examples:
o MultiInjector
o XSS-Proxy-Scanner
o Pangolin
o FGInjector
o Absinthe
o Safe3 SQL Injector (an exploitation tool with scanning
features (pentest mode) that are not available in the free version).
o etc
· Exceptional Cases
o SecurityQA Toolbar (iSec) – various lists and rumors
include this tool in the collection of free/open-source vulnerability scanners,
but I wasn’t able to obtain it from the vendor’s web site, or from any other
legitimate source, so I’m not really sure it fits the “free to use” category.
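
The distinction drawn above between a scanner and a de facto fuzzer comes down to whether the tool verifies its own findings. The sketch below illustrates that idea for reflected XSS only. The probe string, the function names, and the two sample responses are assumptions made for illustration, and the "scanner" check is deliberately simplistic (a real scanner analyses the output context far more thoroughly).

```python
import html

# Illustrative probe; real tools use many context-specific payloads.
PAYLOAD = "<script>alert(1)</script>"


def fuzzer_flag(body: str, marker: str = "alert(1)") -> bool:
    """Fuzzer-style: flag any response that echoes part of the probe."""
    return marker in body


def scanner_verdict(body: str, payload: str = PAYLOAD) -> bool:
    """Scanner-style: require the probe to come back unencoded,
    i.e. in a form the browser could actually execute."""
    return payload in body


# Two hypothetical responses: one HTML-encodes the input, one does not.
encoded = "<p>You searched for " + html.escape(PAYLOAD) + "</p>"
raw = "<p>You searched for " + PAYLOAD + "</p>"

for label, body in (("encoded", encoded), ("raw", raw)):
    print(label, fuzzer_flag(body), scanner_verdict(body))
```

Both responses are flagged by the fuzzer-style check, so the user still has to decide which one matters; the scanner-style check reports only the unencoded reflection.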
I am a security guy, too. While planning a pen test, I found your excellent article. I really appreciate your work!
Awesome Article!!!!
Hello, I am a Co-Founder of Orvant. I think our Securus vulnerability scanner would make a worthy addition to the list. One thing that is unique about Securus is that we leverage many of these tools as well as add our own special sauce on top. Our intent is to provide you with the greatest test and threat coverage possible, as well as the flexibility to decide which tools are worth running; being able to run a side by side comparison helps.
Will take a look at the next benchmark, somewhere around May.
Thanks, you can contact me via email dan - orvant.com if you have any questions or comments when you take a look.
DeleteShay,
Your research is comprehensive and was really helpful for me in evaluating both commercial and open-source tools. Your selection of assessment criteria was useful for the majority of vulnerabilities/features and it makes comparing the results a bit easier.
One recent update that I found was regarding ZAP, which extended the results using ZAP 2.0.0 (released in January 2013) against WAVSEP, as reported in the following link:
http://code.google.com/p/zaproxy/wiki/TestingWavsep
I look forward to reading your updates and analysis on this research, and to seeing which conclusions you reach.
Thanks,
Itay
Hello,
Thank you for your excellent article. Do you have a benchmark or overview of Source Code Security Analyzers (HP Fortify Static Code Analyzer, IBM Security AppScan Source, FindBugs, ...), and which product do you recommend?
Thanks
Hocine
Shay,
Excellent analysis. I was starting out looking for the same answer: is a commercial Web Vulnerability Scanner better value for money than an open source one? Comparing scanners is like going to a dance and meeting very attractive people; picking one is hard. The long term future is a decider. Keeping up to date with the forks is also difficult. ZAP is a fork of version 3.2.13 of the open source variant of Paros. Vega looks good. IronWasp is impressive. Tough choices. The bit I liked is your ability to put yourself in the consultant's role, scanning an unrestricted number of IP addresses. Commercial suppliers have trouble with this role.
Thanks
Shay,
Thank you for this extremely in-depth analysis of the different types of web application security scanners available. I personally prefer Veracode for application security testing (which is #20 on Forbes' list of the most promising companies in America) because of their dynamic analysis tool and clear reporting. Black Diamond Solutions is actually offering a free application security scan on the Veracode platform. Hope this helps!
It is so good that I found this post. Now I have an idea of how to check my site's security.
Thank you.
Shay,
My name is Riaan Gouws and I am the CTO of Quatrashield. First, I think you deserve much credit for the important service that you provide our industry. This detailed article is testament to your passion in this field.
I would like to ask you to also consider including our web application vulnerability scanner – QuatraScan - in your next benchmark study. Based on our own testing, we believe that our false positive rate puts us in the first tier of vendors and we are hopeful that sectooladdict can validate this as well.
I am happy to provide as much info as is needed.
Thanks, Riaan.
Hi Shay,
Great article! I have a question about interpreting the list the right way. How do the accuracy scores relate to the WIVET score? For example, for w3af: is the 35.29% accuracy measured against the whole application, or only against the 19% covered according to WIVET?
Hope you understand my question :) Thanks a lot!
Hi Thomas,
First of all, a new and more up-to-date benchmark was published last week. You can access it through the following link:
http://sectooladdict.blogspot.co.il/2014/02/wavsep-web-application-scanner.html
The WIVET score is a good indicator of how well the scanner will identify the structure of the application *automatically*, in the worst case scenario.
So if, for example, the WIVET score is 10%, the application has 100 web pages that are all vulnerable in ways the scanner can identify, and crawling the application is very difficult due to the technology, the scanner will be able to crawl about 10% of the pages and scan them for vulnerabilities... all the rest will not be tested.
Please take into consideration that this explanation *highly* simplifies the meaning of the WIVET score for the purpose of associating a value with it; in reality, the scanner may crawl anything from 0% to 100%, depending on the technology. WIVET is a great score for measuring how well a scanner will adapt to different technologies, and it isn't directly related to accuracy, but rather to coverage.
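
The simplified arithmetic in the example above can be written out as a short sketch. The function name and the sample numbers are illustrative assumptions, and the calculation inherits the same caveat: it treats the WIVET score as a flat worst-case crawl ratio, which real applications will not follow exactly.

```python
def pages_reached(total_pages: int, wivet_score: float) -> int:
    """Worst-case estimate of pages a scanner crawls automatically,
    treating the WIVET score (0.0 to 1.0) as a flat coverage ratio."""
    if not 0.0 <= wivet_score <= 1.0:
        raise ValueError("wivet_score must be between 0.0 and 1.0")
    return round(total_pages * wivet_score)


total = 100                           # pages in the hypothetical application
crawled = pages_reached(total, 0.10)  # WIVET score of 10%
untested = total - crawled
print(crawled, untested)
```

Under this simplification, the accuracy figures would apply only to the pages the scanner actually reaches, unless the remaining pages are supplied to it by other means.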