Updated August 2003
Some consumer-focused routers have for some time claimed to use Stateful Packet Inspection (SPI) in their firewalls. But the recent generation of boxes based on the ADMtek 5106 Home Gateway Controller (which was purchased by Conexant and is now the Conexant CX84200 Network Processor) seems to really be using SPI, or at least to be doing something different from its previously released brethren. The telltale signs of this new generation of routers seem to be a significant reduction in measured throughput when the LAN test client is put in the router’s DMZ, and problems completing UDP streaming tests under normal Qcheck test conditions.
At any rate, these newer routers that use both Network Address Translation (NAT) and SPI eliminate Ixia’s Qcheck from our router testing toolbox. It turns out that Qcheck is too much of a “black box”: it neither lets you adjust some important test setup properties for SPI+NAT routers nor gives you enough control over the test file size and run times.
Ixia’s IxChariot, on the other hand, does give us some of the control we need. Although it comes with the caveat that it doesn’t support testing of SPI+NAT firewalls (routers), we have found that, if configured properly, IxChariot can do the job.
We use the simple test setup pictured below to run the same three basic tests that we used to run with Qcheck, now using IxChariot.
NOTE! For all tests, we tell IxChariot to use the “Use Endpoint 1 fixed port setting” (User Settings > Firewall Options > Endpoint 1 through firewall to Endpoint 2), which is the recommended setting for testing SPI-based firewalls.
1) Throughput – This test is a measure of how fast data flows through the router. The test sends a file from computer to computer, measures how much time it takes, and calculates the result in Mbps (Megabits per second).
Our test uses an unmodified Throughput.scr IxChariot script (100,000 Byte file size, one transaction per timing record), and runs the test for 10 seconds in Batch mode with no endpoint polling. Higher numbers are better, but any result over 1-2Mbps will be plenty fast for most broadband connections, which usually run at an average of 0.5 to 0.8Mbps (even though the speed is usually advertised as being 1Mbps or higher).
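The throughput arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the bytes-and-seconds-to-Mbps conversion, not IxChariot’s own code. The function name and the 0.5 second timing are hypothetical, and we assume 1 Mbps means 1,000,000 bits per second.

```python
def throughput_mbps(bytes_transferred: int, elapsed_seconds: float) -> float:
    """Convert a transfer size and elapsed time to megabits per second.

    Assumes 1 Mbps = 1,000,000 bits per second.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    bits = bytes_transferred * 8
    return bits / elapsed_seconds / 1_000_000


FILE_SIZE_BYTES = 100_000  # file size in the Throughput.scr script

# Hypothetical timing record: one 100,000 Byte transaction took 0.5 seconds.
print(f"{throughput_mbps(FILE_SIZE_BYTES, 0.5):.2f} Mbps")
```

With those made-up numbers the result works out to 1.6 Mbps, which sits inside the 1-2Mbps range mentioned above.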
2) Response Time – This test measures the delay (also known as lag, or latency) that the router introduces into a data stream, and is essentially what you’d measure by using the ping command. This test sends a small packet of data from one computer to another and measures the time it takes to receive a reply.
We use the same test setup for this test, since IxChariot actually calculates the result from the data taken for the Throughput test. Note that the IxChariot test uses a 100,000 Byte Data size (vs. a 100 Byte size in Qcheck), so this test reports larger numbers than those obtained with Qcheck. The larger numbers are primarily a function of the test Data size and do not indicate poorer performance than that of products tested with Qcheck. Lower numbers are better, especially for gaming and any voice or video applications, but anything under 100ms (milliseconds) is fine, again because the delay that your Internet connection introduces is probably greater.
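To see why the Data size dominates the Response Time number, here is a rough Python sketch. It models a transaction as a fixed delay plus the time needed to push the payload through at a given rate. This is our simplified assumption for illustration, not the way IxChariot or Qcheck actually computes the result, and the 8 Mbps rate and 1 ms delay are hypothetical values.

```python
def transaction_time_ms(payload_bytes: int, rate_mbps: float, latency_ms: float) -> float:
    """Rough model: fixed delay plus time to move the payload at rate_mbps."""
    if rate_mbps <= 0:
        raise ValueError("rate_mbps must be positive")
    transfer_ms = payload_bytes * 8 / (rate_mbps * 1_000_000) * 1000
    return latency_ms + transfer_ms


RATE_MBPS = 8.0    # hypothetical router throughput
LATENCY_MS = 1.0   # hypothetical fixed delay

# Same router, same delay: only the Data size differs between the two tools.
for label, size in (("Qcheck (100 Bytes)", 100), ("IxChariot (100,000 Bytes)", 100_000)):
    print(f"{label}: {transaction_time_ms(size, RATE_MBPS, LATENCY_MS):.1f} ms")
```

Under this model the same hypothetical router shows a time roughly a hundred times larger with the 100,000 Byte Data size, even though nothing about the router has changed.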
3) UDP Stream – This test measures how well a router can keep up with a continuous stream of data. In addition to giving an indication of whether you’ll have trouble listening to Internet audio or watching video program streams, it tends to show flaws in the router’s routing “engine”. It uses the connectionless UDP protocol, which has less overhead and fewer error recovery mechanisms than the TCP protocol (picture a fire hose being turned on vs. a water bucket brigade).
Our test uses a modified Realmed.scr IxChariot script (17,240 Byte file size, 431 Byte Send buffer size) with a 500 kbps (0.5Mbps) Send data rate, runs the test for 10 seconds in Batch mode with no endpoint polling, and reports two numbers: Throughput and Lost Data. You want the Throughput number to be as close to 500 kbps as possible and the Lost Data to be ideally zero, which most routers will come pretty close to. Avoid products that can’t complete the test because they lock up, have less than 400 kbps throughput, or have Lost Data above 10%.
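The rule of thumb above can be written as a simple check. This Python sketch is only our illustration of the thresholds stated in the text. The function and constant names are hypothetical, and the sample results are made up.

```python
SEND_RATE_KBPS = 500.0          # Send data rate used in the test
MIN_THROUGHPUT_KBPS = 400.0     # below this, avoid the product
MAX_LOST_DATA_PERCENT = 10.0    # above this, avoid the product


def udp_stream_acceptable(completed: bool, throughput_kbps: float,
                          lost_data_percent: float) -> bool:
    """Return True if a UDP Stream result clears all three criteria."""
    if not completed:  # router locked up or the test never finished
        return False
    return (throughput_kbps >= MIN_THROUGHPUT_KBPS
            and lost_data_percent <= MAX_LOST_DATA_PERCENT)


# Made-up results for three hypothetical routers.
print(udp_stream_acceptable(True, 498.0, 0.2))   # close to 500 kbps, little loss
print(udp_stream_acceptable(True, 390.0, 0.0))   # throughput too low
print(udp_stream_acceptable(False, 0.0, 0.0))    # could not complete the test
```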
All three tests are run from WAN to LAN (“downstream”), and repeated from LAN to WAN (“upstream”). As mentioned earlier, putting the LAN endpoint into DMZ can significantly change the test results… at least in the first round of SPI+NAT products that we’ve seen. So, unless otherwise noted, the results you see in the chart are taken with the LAN endpoint in DMZ for all WAN to LAN tests, and with the LAN endpoint not in DMZ for the LAN to WAN tests. The only exception to the latter rule is the LAN to WAN UDP streaming tests, which require that the LAN endpoint be put into DMZ.
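The default combinations of test, direction, and DMZ setting described above can be laid out as a small table. The Python sketch below is just one way to write that matrix down. The class and function names are hypothetical and are not part of IxChariot.

```python
from dataclasses import dataclass

TESTS = ("Throughput", "Response Time", "UDP Stream")


@dataclass(frozen=True)
class RunConfig:
    test: str
    direction: str
    lan_in_dmz: bool


def default_runs() -> list:
    """Build the default test matrix: three tests, two directions each."""
    runs = []
    for test in TESTS:
        # All WAN to LAN tests are run with the LAN endpoint in DMZ.
        runs.append(RunConfig(test, "WAN to LAN", True))
        # LAN to WAN tests are run outside DMZ, except UDP Stream.
        runs.append(RunConfig(test, "LAN to WAN", test == "UDP Stream"))
    return runs


for run in default_runs():
    dmz = "in DMZ" if run.lan_in_dmz else "not in DMZ"
    print(f"{run.test:14} {run.direction:11} LAN endpoint {dmz}")
```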
We tend to use the same computers to run the tests, with all running Win98SE or WinXP, and having 300MHz or better processor speeds and memory configurations in excess of 256MB. The test machines have no other applications running during testing.
IMPORTANT INFORMATION! PLEASE READ CAREFULLY!
Initial testing of these products has shown significantly lower throughput for WAN-LAN tests (1 to 2 Mbps typically). Experimentation has shown that this appears to be due to a combination of the SPI implementation of the router firewall and the way that Qcheck / IxChariot performs the WAN-LAN test.
Both Qcheck and IxChariot need to initiate the sending of data through the router for the WAN-LAN test from the untrusted (WAN) side of the router. Therefore these tests must be run with the LAN-side test machine either set in DMZ, or with TCP and UDP ports 10113-10117 opened to it. Since data is coming from the untrusted (WAN) side of the router, the packets are subjected to more scrutiny than data originating from the LAN side of the router, and this appears to cause a throughput decrease.
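For readers who prefer port forwarding to DMZ, the set of forwards mentioned above (TCP and UDP ports 10113-10117) can be enumerated as follows. This Python sketch only lists the rules. It is not any router’s configuration format, and the function name, dictionary keys, and LAN address are hypothetical.

```python
def forwarding_rules(lan_ip: str, first_port: int = 10113, last_port: int = 10117) -> list:
    """List one forward per port and protocol for the LAN-side test machine."""
    return [
        {"protocol": protocol, "port": port, "forward_to": lan_ip}
        for port in range(first_port, last_port + 1)
        for protocol in ("TCP", "UDP")
    ]


# Hypothetical LAN address of the test machine.
rules = forwarding_rules("192.168.1.100")
for rule in rules:
    print(f'{rule["protocol"]} {rule["port"]} -> {rule["forward_to"]}')
```

That comes to ten entries in total: five ports, each forwarded for both TCP and UDP.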
A similar reduction in throughput can be seen for LAN-WAN tests when the LAN machine is put into DMZ, or has the Qcheck-required ports forwarded. Since the LAN-WAN test doesn’t require that the LAN-side machine be put into DMZ, we can run this test both ways and compare the results, which we will usually also report somewhere in the Performance commentary.
If a product allows SPI to be disabled, we will usually also try to test in this condition.
What does this all mean? Are these new routers really that slow?!
The bottom line is that, no, SPI+NAT routers should not be slower than non-SPI routers in normal use. Here’s why.
The WAN-LAN test does not operate the same way as, say, your web browser does. When you download a file or browse a website with your browser, the bulk of the data flows from WAN to LAN, but you originate the data request. This means that the whole data request/delivery process starts from the TRUSTED (LAN) side of the router/firewall. In this case, the data is most likely not “Inspected” (the “I” in SPI) as closely as data whose request originates from the UNTRUSTED (WAN) side of the router. The router doesn’t have to work as hard, and throughput should be higher.
By contrast, in the Qcheck / IxChariot WAN-LAN test, the machine on the TRUSTED (LAN) side of the router first does some setup with its test partner on the UNTRUSTED (WAN) side. But then the bulk data transfer for the actual test ORIGINATES from the UNTRUSTED (WAN) side of the router. (It’s done this way to keep test overhead to a minimum, which results in very high maximum test speeds.) Since data is originating from an UNTRUSTED source, this makes a big difference to the SPI firewall. This untrusted data causes the SPI to spring into action and apply a higher level of scrutiny to the data, possibly resulting in lower router throughput, depending on the router’s SPI implementation.
Most current-generation routers have throughput that is about the same in both directions. (This was not necessarily the case in earlier generation products due to design trade-offs.) So it’s reasonable to assume that you’d see WAN-LAN throughput similar to the LAN-WAN throughput results in normal, everyday use. This is because the LAN-WAN test is done with data originating from the TRUSTED side of the router, with the LAN test client neither in DMZ nor with ports forwarded to it.
So when would you see performance similar to that shown in the WAN-LAN results? Basically, when a machine is put into DMZ or has ports forwarded to it for running a web, FTP, or other server, throughput would be similar to that measured in the WAN-LAN test for data flow originating from the UNTRUSTED side of the router.
Example: If you were hosting an Internet accessible server behind the router (with the appropriate ports forwarded), INBOUND requests to that server, i.e. originating from the Internet, would experience throughput similar to the WAN-LAN test results. Since most inbound requests aren’t very large, you (or the users making the request) probably wouldn’t even notice the lower throughput.
So if the WAN-LAN results are misleading, why do we publish them? Because the real results will allow comparison as manufacturers change their SPI algorithms. And frankly, it’s the only way we can keep from confusing ourselves!