MegaPath Technical Services Knowledgebase

Identifying and Resolving Packet Loss

Posted: 24 Jan, 2007
by: NOC M.
Updated: 30 Jul, 2007
by: Magdael R.D.

When everything is working properly, each data packet should reach its destination. However, when there are problems with links or routers (e.g. overload), packets may be lost. It takes considerable time to detect and resend a lost packet, so performance drops dramatically with even a relatively small amount of packet loss. As a rule of thumb, packet loss should be less than 1 percent. Packet loss of more than 5 percent is serious.
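
The rule of thumb above can be sketched as a small Python helper. This is only an illustration: the function name and the "degraded" label for the band between 1 and 5 percent are assumptions, not terms from this article.

```python
def classify_packet_loss(sent, received):
    """Return (loss_percent, verdict) using the 1 percent / 5 percent rule of thumb."""
    if sent <= 0 or not 0 <= received <= sent:
        raise ValueError("need sent > 0 and 0 <= received <= sent")
    loss_pct = 100.0 * (sent - received) / sent
    if loss_pct < 1.0:
        verdict = "acceptable"   # under 1 percent
    elif loss_pct <= 5.0:
        verdict = "degraded"     # assumed label for the 1 to 5 percent band
    else:
        verdict = "serious"      # more than 5 percent
    return loss_pct, verdict
```

For instance, 97 replies to 100 PINGs is 3 percent loss, which falls in the middle band.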

Scenarios

Can you get to your service provider? PING your DNS server.

Internet providers supply at least one, and usually two, DNS servers (which convert names into actual Internet addresses) to subscribers. These are "nearby" nodes that usually respond to PING -- a good way to test your basic connection. Call your provider to request the numeric IP address of the DNS server. PING that IP address at least 10 times. You should get consistently low latency with no packet loss, as in the PING example at the end of this article. If you don't, run a traceroute/tracert to pinpoint the problem.


Problem 1A: Your local Internet connection
Symptom: Erratic latency or packet loss starting on node 1

If you're on a cable modem, your local subnet may be overloaded, particularly if you see this problem predominantly at times of peak demand. If you're on DSL, then your DSL modem may be having problems with sync or line errors. Either problem needs to be addressed by your provider.


Example of problem 1A
tracert 192.168.0.11
Tracing route to ns1.example.com [192.168.0.11] over a maximum of 30 hops:
1 21 ms 285 ms 316 ms gateway.example.com [192.168.0.9]
2 291 ms 18 ms 192 ms ns1.example.com [192.168.0.11]
Trace complete.

Problem 1B: Your provider's network or DNS server
Symptom: Node 1 is OK, but erratic latency or packet loss on node 2 or up

The DNS server or your provider's internal network may be overloaded or malfunctioning. These are serious problems that can be fixed only by your provider.

Example of problem 1B
tracert 192.168.0.11
Tracing route to ns1.example.com [192.168.0.11] over a maximum of 30 hops:
1 21 ms 19 ms 23 ms gateway.example.com [192.168.0.9]
2 291 ms 18 ms 192 ms ns1.example.com [192.168.0.11]
Trace complete.
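
The distinction between problems 1A and 1B comes down to which node first looks erratic. A rough Python sketch of that check follows. It assumes Windows-style tracert lines with three samples per node, as in the examples here, and the 100 ms spread threshold is an arbitrary assumption, not a figure from this article.

```python
import re

# Matches a hop line such as "2 291 ms 18 ms 192 ms ns1.example.com [192.168.0.11]".
# A lost probe is printed as "*" instead of a time.
HOP_RE = re.compile(r"^\s*(\d+)\s+((?:(?:<?\d+\s*ms|\*)\s+){3})(.*)$")


def parse_hops(trace_text):
    """Return a list of (hop_number, [rtt_ms or None, ...], host) tuples."""
    hops = []
    for line in trace_text.splitlines():
        match = HOP_RE.match(line)
        if match is None:
            continue  # command, header, footer, or blank line
        samples = [None if tok == "*" else int(tok)
                   for tok in re.findall(r"\*|\d+", match.group(2))]
        hops.append((int(match.group(1)), samples, match.group(3).strip()))
    return hops


def first_erratic_hop(hops, max_spread_ms=100):
    """Return the first hop number with a lost probe or a wide latency spread."""
    for number, samples, _host in hops:
        answered = [s for s in samples if s is not None]
        lost = len(answered) < len(samples)
        spread = max(answered) - min(answered) if answered else 0
        if lost or spread > max_spread_ms:
            return number
    return None


def diagnose_local_trace(trace_text):
    """Map the first erratic hop to problem 1A (node 1) or 1B (node 2 or up)."""
    hop = first_erratic_hop(parse_hops(trace_text))
    if hop is None:
        return "clean"
    if hop == 1:
        return "1A: local connection"
    return "1B: provider network or DNS server"
```

Fed the two example traces above, this sketch reports 1A for the first (node 1 ranges from 21 ms to 316 ms) and 1B for the second (node 1 is steady, node 2 is not).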

Can you get outside of your provider? Traceroute to a remote node.

Problem 2A: DNS failure
Symptom: Can't resolve DNS name

You may have the name wrong. Check it. Also check other names. If the name is valid, particularly if other names are also failing, either your Internet configuration is wrong (e.g. your computer is not set up with the correct DNS servers) or the provider's DNS server is acting up. You may be able to fix a configuration problem yourself (e.g. by manual configuration of DNS servers), but a DNS server problem is a serious issue that can be fixed only by your provider.

Example of problem 2A -- deliberately bogus name
tracert ss.techtv.com
Unable to resolve target system name ss.techtv.com.
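
The "check other names" reasoning can be sketched in Python with the standard socket module. The function names and the returned messages are illustrative assumptions. The resolver parameter exists only so the logic can be exercised without a live network.

```python
import socket


def can_resolve(name, resolver=socket.getaddrinfo):
    """Return True if the name resolves to at least one address."""
    try:
        return len(resolver(name, None)) > 0
    except socket.gaierror:
        return False


def diagnose_dns(target, other_names, resolver=socket.getaddrinfo):
    """Compare the failing target against other names you expect to be valid."""
    if can_resolve(target, resolver):
        return "ok"
    if not other_names:
        return "unknown: no other names to compare against"
    if any(can_resolve(name, resolver) for name in other_names):
        # Other names work, so DNS itself is functioning.
        return "target name is probably wrong"
    # Nothing resolves: suspect the configuration or the DNS server.
    return "local DNS configuration or the provider's DNS server is suspect"
```

If only the target fails, the name itself is the likely culprit. If every name fails, the configuration or the provider's DNS server is the more likely cause.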
 

Problem 2B: Internet transit overload
Symptom: Sudden jump in latency or packet loss at your provider's transit router

Most providers are connected to the Internet by means of "transit" links, typically a small number of high-speed circuits that each carry traffic from a large number of subscribers. If a transit link gets saturated, which can easily happen if a provider isn't proactive about keeping capacity ahead of subscriber traffic growth, latency will climb at the provider's "border" router (node) as packets are queued to wait their turn. If the border router runs out of queuing capacity, it copes by simply discarding packets (packet loss).

The solution is for your provider to install more capacity, or for you to find a better provider if the problem isn't fixed promptly. In the following example, note the big jump in latency at the transition between the dummy ISP (example.net) and the transit. Each "*" is a lost packet. What matters is the first node at which the problem shows up.

Example of problem 2B
tracert www.techtv.com
Tracing route to www.techtv.com [64.95.116.134] over a maximum of 30 hops:
1 21 ms 17 ms 15 ms gateway.example.net
2 18 ms 18 ms 21 ms internal.example.net
3 19 ms 20 ms 17 ms border.example.net
4 824 ms 565 ms 918 ms aaa.transit.invalid
5 656 ms * 707 ms bbb.transit.invalid
6 * 943 ms * ccc.backbone.invalid
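
Finding "the first node at which the problem shows up" can be sketched in Python as follows. The hop data is typed in by hand from the example above, with None standing for a lost packet ("*"). The 100 ms jump threshold and the choice to compare each node's best (lowest) time are assumptions made for illustration.

```python
def first_latency_jump(hops, jump_ms=100):
    """Return (hop_number, host) for the first hop whose best round-trip time
    exceeds the previous answering hop's best time by more than jump_ms."""
    previous_best = None
    for number, samples, host in hops:
        answered = [s for s in samples if s is not None]
        if not answered:
            continue  # every probe lost: no timing to compare
        best = min(answered)
        if previous_best is not None and best - previous_best > jump_ms:
            return number, host
        previous_best = best
    return None


def loss_by_hop(hops):
    """Map each hop number to its count of lost probes."""
    return {number: samples.count(None) for number, samples, _host in hops}


# Data from the problem 2B example; None marks a "*".
EXAMPLE_2B = [
    (1, [21, 17, 15], "gateway.example.net"),
    (2, [18, 18, 21], "internal.example.net"),
    (3, [19, 20, 17], "border.example.net"),
    (4, [824, 565, 918], "aaa.transit.invalid"),
    (5, [656, None, 707], "bbb.transit.invalid"),
    (6, [None, 943, None], "ccc.backbone.invalid"),
]
```

On this data the jump is found at node 4, the first transit node, and lost packets appear from node 5 onward.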
 

Problem 2C: Internet overload
Symptom: Latency or packet loss starting beyond the transit

Capacity problems on Internet "backbone" circuits and major connection nodes are much less common than some providers would have us believe, and most providers will claim they're not at fault when one occurs. However, such problems do sometimes happen, particularly when there's a critical failure (e.g. major cable inadvertently cut by a clumsy backhoe operator). The trace would look much like the example immediately above, except that the problem shows up at a later node.

Problem 2D: Routing problem
Symptom: Packets 'looping' until the trace expires

When routing "tables" (the Internet "map") get messed up, packets may get sent to the wrong place or even stuck in a loop. There usually isn't much your provider can do about this except to report the problem. Note how the following example gets stuck in a loop (that continues until the packet expires) between ccc and ddd.

Example of problem 2D
tracert www.techtv.com
Tracing route to www.techtv.com [64.95.116.134] over a maximum of 30 hops:
1 21 ms 17 ms 15 ms gateway.example.net
2 18 ms 18 ms 21 ms internal.example.net
3 19 ms 20 ms 17 ms border.example.net
4 25 ms 27 ms 22 ms aaa.transit.invalid
5 24 ms 24 ms 21 ms bbb.transit.invalid
6 34 ms 33 ms 33 ms ccc.backbone.invalid
7 32 ms 41 ms 38 ms ddd.backbone.invalid
8 35 ms 31 ms 44 ms ccc.backbone.invalid
9 46 ms 37 ms 39 ms ddd.backbone.invalid
10 33 ms 34 ms 34 ms ccc.backbone.invalid
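
A loop like the one above shows up as a node name that appears a second time later in the trace. A minimal Python sketch of that check follows. The function name is hypothetical, and it assumes you have already collected the node names in order.

```python
def find_routing_loop(hosts):
    """hosts: node names in trace order, node 1 first (None for no reply).

    Return (first_hop, repeat_hop, host) for the first revisited node,
    or None if no node appears twice."""
    first_seen = {}
    for hop_number, host in enumerate(hosts, start=1):
        if host is None:
            continue  # unanswered hop: nothing to compare
        if host in first_seen:
            return first_seen[host], hop_number, host
        first_seen[host] = hop_number
    return None


# Node names from the problem 2D example.
EXAMPLE_2D = [
    "gateway.example.net", "internal.example.net", "border.example.net",
    "aaa.transit.invalid", "bbb.transit.invalid",
    "ccc.backbone.invalid", "ddd.backbone.invalid",
    "ccc.backbone.invalid", "ddd.backbone.invalid",
    "ccc.backbone.invalid",
]
```

On the example data, the sketch reports that ccc.backbone.invalid, first seen at node 6, reappears at node 8.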


 

Basic measurement tools (software that's probably already on your computer)

Traceroute

Traces the path between you and the remote host as a series of nodes (and the links between them). Typically displays three latency measurements for each node and looks up node names (if available). Under Windows it's called "tracert," as in the examples in this article. If you use Linux, substitute "traceroute" for "tracert." Here's an example of a clean, fast DSL connection:
tracert www.TechTV.com

Tracing route to www.techtv.com [64.95.116.134] over a maximum of 30 hops:

1 21 ms 17 ms 15 ms gateway.example.net
2 18 ms 18 ms 21 ms internal.example.net
3 19 ms 20 ms 17 ms border.example.net
4 21 ms 19 ms 21 ms 500.Serial2-11.GW3.SFO4.ALTER.NET
5 19 ms 20 ms 18 ms 129.ATM2-0.XR1.SFO4.ALTER.NET
6 19 ms 20 ms 19 ms 191.ATM6-0.GW4.SFO4.ALTER.NET
7 25 ms 25 ms 26 ms internap-gw.customer.alter.net
8 24 ms 24 ms 26 ms border5.ge3-1-bbnet1.sfo.pnap.net
9 26 ms 26 ms 26 ms techtv-1.border5.sfo.pnap.net
10 28 ms 24 ms 25 ms 64.95.116.134
Trace complete.
 

PING

Sends a special packet designed to get a response back from a particular remote node, much like the echo of a sonar ping used to detect objects underwater. It's usually possible to send several (or even continuous) PINGs in succession, and the latency of each PING is commonly displayed. Since PING can be used in certain forms of Internet attacks, some nodes deliberately ignore PINGs as a security measure, so a complete failure to respond to PING is not necessarily bad. Note the consistent latency and lack of packet loss in this example of a clean, fast DSL connection (where "-n 10" is used to send 10 PINGs): 

ping -n 10 www.TechTV.com
Pinging www.TechTV.com [64.95.116.134] with 32 bytes of data:
Reply from 64.95.116.134: bytes=32 time=26 ms TTL=47
Reply from 64.95.116.134: bytes=32 time=23 ms TTL=47
Reply from 64.95.116.134: bytes=32 time=22 ms TTL=47
Reply from 64.95.116.134: bytes=32 time=22 ms TTL=47
Reply from 64.95.116.134: bytes=32 time=23 ms TTL=47
Reply from 64.95.116.134: bytes=32 time=22 ms TTL=47
Reply from 64.95.116.134: bytes=32 time=24 ms TTL=47
Reply from 64.95.116.134: bytes=32 time=25 ms TTL=47
Reply from 64.95.116.134: bytes=32 time=26 ms TTL=47
Reply from 64.95.116.134: bytes=32 time=22 ms TTL=47
PING statistics for 64.95.116.134:
Packets: Sent = 10, Received = 10, Lost = 0 (0 percent loss),
Approximate round-trip times in milliseconds:
Minimum = 22 ms, Maximum = 26 ms, Average = 23 ms
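
If you save PING output like the above to a text file, the figures can be pulled out with a short Python sketch such as the one below. It assumes the Windows-style wording shown in this article ("time=... ms" and the "Sent = ..., Received = ..., Lost = ..." line). Other systems print different text and would need different patterns.

```python
import re


def summarize_ping(output):
    """Extract reply times and packet counts from Windows-style PING output."""
    times = [int(t) for t in re.findall(r"time[=<]\s*(\d+)\s*ms", output)]
    counts = re.search(r"Sent = (\d+), Received = (\d+), Lost = (\d+)", output)
    if counts is None:
        raise ValueError("no 'Sent = ..., Received = ..., Lost = ...' line found")
    sent, received, lost = (int(n) for n in counts.groups())
    return {
        "sent": sent,
        "received": received,
        "lost": lost,
        "loss_percent": 100.0 * lost / sent if sent else 0.0,
        # Latency figures are None when no reply came back at all.
        "min_ms": min(times) if times else None,
        "max_ms": max(times) if times else None,
        "avg_ms": sum(times) / len(times) if times else None,
    }
```

The average here is an exact mean, so it may differ slightly from the rounded "Average" figure that PING itself prints.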
