Executive Summary
The rapid development of IP technology has raised expectations across Corporate
America for powerful, low-cost networks. But many companies hesitate to move
business-critical applications to IP networks. The reason: the lower average
latency available through today's IP networks doesn't necessarily provide
sufficient reliability and performance at the application level. And since many
service providers are unable to provide the information needed to measure that
performance, compliance with Service Level Agreements (SLAs) often can't be
verified.
Product Description
Concept
Service Definition and Description
The Qwest applications are built on Firehunter, Agilent's real-time network
protocol performance-reporting tool. They draw on Agilent's expertise in
network testing and provide conclusive evidence of Qwest's superior network
technology and performance.
Where traditional network tests measure only ICMP (Internet Control Message
Protocol) "pings" against routers on a complete transit around the network,
Firehunter measures other protocols as their packets travel across the entire
IP network mesh. This provides valuable point-to-point measurements, which
serve as the backbone performance reference for SLAs. Users can then visit the
stat.qwest.net Web site to check SLA performance across the Qwest IP backbone.
Business benefits
Many service providers offer SLAs with a Catch-22. On the one hand, they
agree to the terms of the SLA. On the other hand, it is the service providers
themselves who must monitor the network to determine whether those terms are
being met. So even when business users conclude that network performance is
substandard, they must rely on the service providers to prove their own failure.
By contrast, Qwest uses a third-party measurement tool and public 7x24 Web
reporting to demonstrate network performance. Qwest does not adjust or alter the
reports; they reflect actual network performance.
The measurements monitor the Qwest SLA of 65 ms or less average network
latency, measured on the links between TeraPOPs and averaged across all
TeraPOPs.
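As a rough illustration of the arithmetic behind such an average, the following
Python sketch takes one latency figure per TeraPOP-to-TeraPOP link and compares
the mean against the 65 ms target. The POP labels and latency values are
invented placeholders, not Qwest measurements, and the function names are
hypothetical. The exact weighting Qwest applies is not described here, so a
simple unweighted mean is assumed.

```python
from statistics import mean

SLA_TARGET_MS = 65.0  # SLA figure quoted in the text


def backbone_average_ms(link_latency_ms):
    """Average the per-link latencies (keyed by a pair of POP labels)."""
    if not link_latency_ms:
        raise ValueError("no link measurements supplied")
    return mean(link_latency_ms.values())


def meets_sla(link_latency_ms, target_ms=SLA_TARGET_MS):
    """True when the averaged latency is at or below the target."""
    return backbone_average_ms(link_latency_ms) <= target_ms


# Placeholder values for illustration only (assumed, not measured).
sample_links = {
    ("pop-a", "pop-b"): 28.0,
    ("pop-a", "pop-c"): 52.0,
    ("pop-b", "pop-c"): 22.0,
}

if __name__ == "__main__":
    avg = backbone_average_ms(sample_links)
    print(f"average: {avg:.1f} ms, within SLA: {meets_sla(sample_links)}")
```

Note that an average can sit within the target even when an individual link
does not, which is one reason per-link, POP-to-POP figures are worth checking
alongside the overall number.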
The public can use the Web site not only to monitor Qwest's overall backbone
latency, but also to run their own POP-to-POP calculations, at any time, for a
number of different protocols (HTTP, TCP, and FTP). These go far beyond the
standard "ping" test. They reveal how traffic is actually being routed across the
backbone network and demonstrate the average latency to the general public.
This helps educate the public about the power of next-generation networks like
the Qwest Macro Capacity® Fiber Network.
Primary Features
Reporting
Agilent's Firehunter is the core application that performs the protocol tests.
The Web reporting interface was developed by Qwest and is populated with data
imported from Firehunter. At any time, 24 hours a day, users can visit the Web
site to obtain network status reports and to run protocol tests, traceroutes,
pings to user-specified locations from anywhere in the IP network, and BGP
queries. This round-the-clock access is key: by running multiple tests, users
can easily verify both network backbone performance and the accuracy of the
application. The site informs current and potential customers of backbone
performance and serves as Qwest's key performance indicator (KPI) for ensuring
high-quality communications.
Customer Service & Sales
Designed to provide network status and SLA confirmation, the site shows
customers and the Qwest sales team Qwest's ability to deliver high-capacity
networking. Users can measure network performance at the protocol level at any
time from any TeraPOP location and verify that the expected quality of
performance is being achieved. Furthermore, users can see whether or not an
application is being affected by network performance. Reports include hourly
and daily information as well as historical views of performance metrics for
the last 7, 30, and 90 days.
Technical Description
Agilent's Firehunter Architecture
Agilent's Firehunter is an application-based software tool that tests protocol
transactions at the application service layer. Because it is based on a client
platform, Firehunter scales as the network expands and new applications are
added. In short, it provides an application-layer toolkit for end-to-end
testing.
The following diagram highlights the architecture of Agilent Firehunter.
Agilent Firehunter deploys a three-tier architecture to provide the flexibility to
adapt to any network, large or small. The minimum configuration is one
application from each tier.
Qwest uses the basic configuration to test between the TeraPOPs that are
currently deployed. For general operations, a central server, called the DMS,
instructs agents installed on Web servers in each of the TeraPOPs that make up
the Qwest backbone to run a series of queries. Each agent requests an HTTP text
page from a Web server. Once the agent receives the page, it sends the
round-trip time back to the DMS, which calculates the average network delay for
the Web query.
By having agents query an HTTP text page instead of a standard "ping" test or
TRACEROUTE to an interface, the test measures the true network transit time
of Web traffic. As a result, Qwest can monitor actual network delay for the
service applications running on the network.
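The idea of timing a Web page fetch rather than a ping can be sketched in a few
lines of Python. This is only an illustration of the principle, not Firehunter
code: the handler class, function names, and the local stand-in Web server are
all assumptions made so the example can run on a single machine.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from statistics import mean


class TextPageHandler(BaseHTTPRequestHandler):
    """Stand-in for the Web server an agent would query (assumed)."""

    def do_GET(self):
        body = b"test page\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def timed_fetch_ms(host, port, path="/", timeout=5.0):
    """Agent-side step: fetch a page, return the round-trip time in ms."""
    start = time.perf_counter()
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        conn.getresponse().read()
    finally:
        conn.close()
    return (time.perf_counter() - start) * 1000.0


def average_round_trip_ms(host, port, samples=5):
    """Collector-side step: average several agent measurements."""
    return mean(timed_fetch_ms(host, port) for _ in range(samples))


def run_demo():
    # Port 0 asks the operating system for any free loopback port.
    server = HTTPServer(("127.0.0.1", 0), TextPageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return average_round_trip_ms("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

Calling run_demo() returns the averaged round-trip time in milliseconds for the
loopback server. Against a real server in another location, the same timing
would also include the network transit that a ping cannot attribute to Web
traffic.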
HTTP Protocol Measurement Description
DNS Resolution
HTTP packets are made up of a header and the data payload being transmitted in
the packet. The header contains information about the points of origin and
destination for the packet. When a packet is generated for delivery to a URL,
e.g. www.qwest.com, the originating computer first queries a Domain Name System
(DNS) server, which translates www.qwest.com into an IP address. This IP
address is returned to the originator, which uses it to send the packet on to
its destination.
This basic HTTP description highlights the overhead involved with DNS
resolution for an HTTP packet. Thus it can be time-consuming for an HTTP
packet to go from its origin to its destination. By measuring the process,
Firehunter can accurately depict the latency contributed by the initial DNS
translation. Such granular measurements, the first available in the industry,
are introducing a new level of protocol monitoring.
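To make the DNS step concrete, here is a minimal Python sketch that times a
single name lookup in isolation. It uses the standard library resolver call
socket.gethostbyname. The function name dns_lookup_ms and the injectable
resolver parameter are assumptions added for illustration, and nothing here
reflects how Firehunter itself performs the measurement.

```python
import socket
import time


def dns_lookup_ms(hostname, resolver=socket.gethostbyname):
    """Return (ip_address, elapsed_ms) for one name lookup."""
    start = time.perf_counter()
    address = resolver(hostname)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return address, elapsed_ms


if __name__ == "__main__":
    # "localhost" normally resolves without leaving the machine, so this
    # demo does not depend on an external name server being reachable.
    address, elapsed = dns_lookup_ms("localhost")
    print(f"localhost -> {address} in {elapsed:.3f} ms")
```

Repeated lookups of the same name are often answered from a local cache, so a
single sample may understate or overstate the typical resolution time.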
TCP Connect Time
The TCP connect time is the time it takes for the originating computer to
establish a valid connection to the destination computer. The Firehunter test is
conducted between originating and destination servers in the TeraPOPs. The
exchange that establishes a valid connection between two servers is known as a
handshake. This aspect of the HTTP protocol test is the most accurate depiction
of network latency. It is an important measurement because the rest of the
measurements broken out by Firehunter depend on service-level computer systems
or on the amount of data being sent across the network. For example, DNS time
is a measurement of a DNS server's response time, which reflects that computer
system's latency rather than actual network latency.
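A connect-time measurement of this kind can be approximated in Python by timing
how long socket.create_connection takes to return, since that call completes
once the TCP handshake has finished. The sketch below is an assumption-laden
illustration rather than Firehunter's method: the function names are
hypothetical, and a throwaway listener on the local machine stands in for a
destination server.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout=5.0):
    """Time the TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    sock = socket.create_connection((host, port), timeout=timeout)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    sock.close()
    return elapsed_ms


def demo_against_local_listener():
    # A throwaway loopback listener stands in for a destination server;
    # port 0 lets the operating system choose a free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    try:
        return tcp_connect_ms("127.0.0.1", listener.getsockname()[1])
    finally:
        listener.close()
```

Passing an IP address rather than a host name keeps DNS resolution out of the
timing, so the figure reflects the handshake alone.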
Response Times
The next two measurements of the HTTP protocol are the most important
measurements to be broken out of the total latency measurement. The first is the
server response time; the second is data transfer time.
Server Response Time
Server response time is the time that it takes for a Web server to answer a
request for a Web page. This step follows the TCP handshake. Once a
handshake has been completed, the server responds based upon the Web page
that is being requested. Measuring this response is important in order to
distinguish between network and server latency. For instance, a long response
time that seems to be the result of network saturation could in fact be a slow
response time from a Web server. Thus having the overall response time broken
out into the handshake and processing times can save a great deal of time in
tracking the source of a delay.
Data Transfer Time
Firehunter also breaks out data transfer time from the total response time. This
can be a deceptive measurement because it depends on the size of the data being
transferred. For example, a 10-kilobyte HTTP response will obviously take less
time to transit the network than a 1-megabyte HTTP response.
Total Response Time
Firehunter gives the overall response time along with the breakdown of the
different responses. The total response time is useful for gauging overall
processing times for protocol queries such as Web page requests. However, Qwest
mainly uses the protocol response breakdown to provide accurate end-to-end
network latency reports.
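The breakdown described in this section can be illustrated with a short Python
sketch that timestamps each phase of one HTTP request: connection established,
first byte of the response received, and last byte received. Everything here
is an assumption for demonstration purposes, including the function names, the
local stand-in server, and its artificial 50 ms delay. DNS time is left out
because the sketch connects to an IP address directly.

```python
import socket
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


class SlowPageHandler(BaseHTTPRequestHandler):
    """Stand-in server that pauses before answering (assumed delay)."""

    def do_GET(self):
        time.sleep(0.05)  # artificial server processing delay
        body = b"x" * 10_000
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def http_phase_times_ms(host, port, path="/", timeout=5.0):
    """Return connect, server-response, transfer and total times in ms."""
    t_start = time.perf_counter()
    sock = socket.create_connection((host, port), timeout=timeout)
    t_connected = time.perf_counter()
    try:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        received = len(sock.recv(4096))  # blocks until first bytes arrive
        t_first_byte = time.perf_counter()
        while True:  # with HTTP/1.0 the server closes when it is done
            chunk = sock.recv(4096)
            if not chunk:
                break
            received += len(chunk)
        t_done = time.perf_counter()
    finally:
        sock.close()
    return {
        "connect": (t_connected - t_start) * 1000.0,
        "server_response": (t_first_byte - t_connected) * 1000.0,
        "data_transfer": (t_done - t_first_byte) * 1000.0,
        "total": (t_done - t_start) * 1000.0,
        "bytes": received,
    }


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), SlowPageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return http_phase_times_ms("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

In the loopback demo the connect and transfer phases are tiny, while the
server-response phase absorbs the artificial delay. That is the pattern that
would point to a slow server rather than a slow network.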
Shown below is an overall graphical flow of how the measurements are broken
down by Firehunter.
Conclusion
Qwest's implementation of Agilent's Firehunter is a new departure in network
measurement technology. By allowing users to monitor network
performance on a publicly available Web site using a third-party tool, Qwest
demonstrates that it maintains its standard of 65 ms or less of average network
latency. That's the kind of performance that businesses require as they move key
applications onto IP backbones. And since customers can test those applications
at the protocol level, they can determine whether network performance is meeting
the terms of the Qwest SLA.
Appendix A
Acronyms and Abbreviations
IP Internet Protocol
SLA Service Level Agreement
ms milliseconds
ICMP Internet Control Message Protocol
HTTP HyperText Transfer Protocol
NNTP Network News Transfer Protocol
DNS Domain Name System
SMTP Simple Mail Transfer Protocol
KPI Key Performance Indicator
TCP Transmission Control Protocol