dummynet
1. Description
dummynet is a flexible tool originally designed for testing networking
protocols, and since then (mis)used for bandwidth management.
It simulates/enforces queue and bandwidth limitations, delays, packet losses,
and multipath effects. It also implements a variant of Weighted Fair Queueing
called WF2Q+. It can be used on users' workstations, or on FreeBSD machines
acting as routers or bridges.
To give an idea of what you can do with dummynet, e.g. by running it on your
workstation, or by putting a PC with two ethernet cards between your network
and your router and booting it from the floppy image below, here are a few
examples:
These rules limit the total ICMP traffic (inbound+outbound) to 50Kbit/s
ipfw add pipe 1 icmp from any to any
ipfw pipe 1 config bw 50Kbit/s queue 10
These rules limit inbound traffic to 300Kbit/s for each host on your network
10.1.2.0/24.
ipfw add pipe 2 ip from any to 10.1.2.0/24
ipfw pipe 2 config bw 300Kbit/s queue 20 mask dst-ip 0x000000ff
If you want all machines to share evenly a single link, you should use instead:
ipfw add queue 1 ip from any to 10.1.2.0/24
ipfw queue 1 config weight 5 pipe 2 mask dst-ip 0x000000ff
ipfw pipe 2 config bw 300Kbit/s
And these rules simulate an ADSL link to the moon:
ipfw add pipe 3 ip from any to any out
ipfw add pipe 4 ip from any to any in
ipfw pipe 3 config bw 128Kbit/s queue 10 delay 1000ms
ipfw pipe 4 config bw 640Kbit/s queue 30 delay 1000ms
dummynet works by intercepting packets (selected by ipfw rules; ipfw is one of
the FreeBSD firewalls) on their way through the protocol stack, and passing them
through one or more objects called queues and pipes, which simulate the effects
of bandwidth limitations, propagation delays, bounded-size queues, packet
losses, and multipath. Pipes are fixed-bandwidth channels. Queues, instead,
represent queues of packets, each associated with a weight, which share the
bandwidth of the pipe they are connected to in proportion to their weight.
Each pipe and queue can be configured separately, so you can apply different
limitations/delays to different traffic according to the ipfw rules (e.g.
selecting on protocols, addresses and port ranges, interfaces, etc.). Pipes and
queues can be created dynamically, so using a single set of rules you can apply
independent limitations to all hosts in a subnet, or to all types of traffic,
etc. You can also configure the system to build cascades of pipes, so you can
simulate networks with multiple links and paths between source(s) and
destination(s).
2. Performance, status and availability
Unlike other traffic shaping packages which run in userland, dummynet has very
little overhead, as all processing is done within the kernel. There is no data
copying involved to move packets through pipes, just a bit of pointer shuffling,
and the implementation is able to handle thousands of pipes with O(log N) cost,
where N is the number of active pipes.
The WFQ variant we implement, called WF2Q+, has a complexity which is O(log N)
in the number of active flows, so again it is able to handle thousands of flows
efficiently.
dummynet has been part of FreeBSD since Sept.1998. It was rewritten in Jan.2000
and June 2000, so the most recent, feature-rich and robust versions are in
FreeBSD 3.4-STABLE and newer releases.
You don't need to install FreeBSD on your hard disk to use it, as below you will
find a bootable single-floppy version of FreeBSD which includes dummynet,
bridging, and a lot of other goodies.
Dummynet is being heavily used by lots of people, and the code seems to be
extremely stable and robust, especially in the 3.4-STABLE version and above. Bug
fixes are generally applied to the FreeBSD source tree and are available from
the CVS tree or in newer snapshots/releases of FreeBSD. From time to time I
update the floppy image on this site as well.
3. Support
If you have found a bug, please report it to me by email, but don't forget to
include information on which version of FreeBSD and dummynet you are using, your
rules (ipfw show; ipfw pipe show), your configuration (bridge or router) etc.
If you have a simple question, again just email me and I generally try to reply
as soon as possible. Again, please supply details!
For more complex things (like "I have no time to learn how to use it, I just
want this work done"), or customizations and additions of new features to
dummynet/ipfw, I am available (through my department) for doing support on a
contract basis.
Email luigi@iet.unipi.it to discuss details.
This said, FreeBSD users should be able to use dummynet without the need for
support.
The relevant manpages (ipfw(8), dummynet(4), bridge(4)) are a great source of
information, so please read an up-to-date version of them before asking
questions. You can also try posting on the various FreeBSD mailing lists or
newsgroups, which are usually a very good source of information.
4. Using dummynet
Dummynet is entirely controlled by the ipfw commands and a set of sysctl
variables.
4.1 Basic ipfw commands
The basic structure of ipfw commands is
ipfw add [N] [prob X] action PROTO from SRC to DST [options]
where
N is the rule number;
X is a number between 0 and 1 that, when present, indicates the probability of
getting a match on this rule if all other fields are correct. The default is
a deterministic match;
action is one of the actions executed on a match, which can be any of allow,
deny, skipto N, pipe N and others. To send a packet to a dummynet pipe, we have
to use pipe N;
PROTO is the protocol type we want to match (IP, TCP, UDP, ...);
SRC and DST are address specifiers (we can use addresses with netmasks,
optionally followed by ports or port ranges);
options can be used to restrict the attention to packets coming from/to specific
interfaces, or carrying some TCP flags or ICMP options, or bridged, etc.
4.2 Sysctl variables
The following are the main sysctl variables to control the behaviour of ipfw,
bridging and dummynet:
Controlling ipfw
The firewall is mostly controlled by ipfw, and the sysctl variables only serve
to give global configuration and default parameters.
net.inet.ip.fw.enable: 1
enables firewall in the IP stack
net.inet.ip.fw.one_pass: 1
Forces a single pass through the firewall. If set to 0,
packets coming out of a pipe will be reinjected into the
firewall starting with the rule after the matching one.
NOTE: there is always one pass for bridged packets.
net.inet.ip.fw.dyn_buckets: 256
Desired hash table size used for dynamic rules.
net.inet.ip.fw.curr_dyn_buckets: 256 (readonly)
Current hash table size used for dynamic rules.
net.inet.ip.fw.dyn_count: 3 (readonly)
Current number of dynamic rules.
net.inet.ip.fw.dyn_max: 1000
Max number of dynamic rules. If you exceed this limit, you will
have to wait for a rule to expire before being able to create
a new one.
net.inet.ip.fw.dyn_ack_lifetime: 300
net.inet.ip.fw.dyn_syn_lifetime: 20
net.inet.ip.fw.dyn_fin_lifetime: 20
net.inet.ip.fw.dyn_rst_lifetime: 5
net.inet.ip.fw.dyn_short_lifetime: 5
Lifetime (in seconds) for various types of dynamic rules.
Controlling dummynet
dummynet is also mostly controlled by ipfw, with the sysctl variables serving
mainly for default parameters.
net.inet.ip.dummynet.hash_size: 64
Size of hash table for dynamic pipes.
net.inet.ip.dummynet.expire: 1
Delete dynamic pipes when they become empty.
net.inet.ip.dummynet.max_chain_len: 16
Max ratio between number of dynamic queues and hash buckets.
When you exceed (max_chain_len*buckets) queues on a pipe,
packets not matching any of these will be all put into the
same default queue.
Controlling bridging
Bridging is almost exclusively controlled by sysctl variables.
net.link.ether.bridge_cfg: ed2:1,rl0:1,
set of interfaces for which bridging is enabled, and the cluster
each of them belongs to.
net.link.ether.bridge: 0
enable bridging.
net.link.ether.bridge_ipfw: 0
enable ipfw for bridging.
4.3 Pipe and queue configuration
The following ipfw commands control dummynet pipes:
ipfw pipe NN config ...
This command is used to create or reconfigure a pipe. NN is the numeric
identifier (between 1 and 65535) of the pipe. Issuing the configuration command
multiple times results in the pipe being reconfigured.
ipfw [-s field] pipe [NN] show
This command shows the parameters of a pipe. If the pipe is a dynamic one (see
the mask parameter), then all dynamic pipes created from this one are listed.
The list can be very long. The -s option allows you to sort the listing on
one of the four counters associated with the pipe.
ipfw pipe NN delete
Destroys a single pipe. Remember that packets sent to a non-existing pipe are
silently dropped.
ipfw pipe flush
Destroys all pipes.
The following parameters can be configured for a pipe by adding the
corresponding keyword to the pipe config ... line:
Bandwidth: bw NNunit
NN is the bandwidth assigned to the pipe; unit (which must follow the number
with no intervening spaces) can be any of bit/s Kbit/s Mbit/s Byte/s KByte/s
MByte/s or non-ambiguous abbreviations.
A bandwidth of 0 (or no bandwidth) results in no bandwidth limitations (hence,
no queues will ever build up).
Queue size: queue NN [unit]
Sets the queue size, in slots if only NN is specified, otherwise in Bytes or
KBytes. When there is no room in the queue, packets are dropped. The default
queue size is 50 slots.
The combination of bandwidth and queue size influences the queueing delay. Be
careful when using low bandwidths not to use too large queues, or you might
end up with several seconds of queueing delay.
Also be careful when you specify the queue size in packets: if you run tests
over the loopback interface, a packet can be very large, e.g. 16KB, again
resulting in huge delays.
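To get a feeling for the numbers, here is a rough back-of-the-envelope sketch
(not part of dummynet) of the worst-case queueing delay: the time needed to
drain a full queue through the pipe. The function name is made up for
illustration, and it assumes every slot holds a packet of the same size and
that 1 Kbit/s means 1000 bit/s.

```python
def max_queueing_delay_s(queue_slots, packet_bytes, bw_bits_per_s):
    """Worst-case time (seconds) to drain a full queue through a pipe."""
    if bw_bits_per_s <= 0:
        # bw 0 means no bandwidth limitation, so no queue ever builds up
        return 0.0
    return queue_slots * packet_bytes * 8 / bw_bits_per_s


# Default queue (50 slots) of 1500-byte packets on a 50Kbit/s pipe
print(max_queueing_delay_s(50, 1500, 50_000))    # 12.0 seconds
# Same pipe with the 10-slot queue used in the ICMP example
print(max_queueing_delay_s(10, 1500, 50_000))    # 2.4 seconds
# Same default queue over loopback, with 16KB packets
print(max_queueing_delay_s(50, 16384, 50_000))
```

So on a slow pipe the default queue of full-size ethernet packets already adds
many seconds of delay, and things get much worse with loopback-sized packets.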
Delay: delay NN ms
Sets the propagation delay of the pipe, in milliseconds. Note that the
queueing delay component is independent of the propagation delay. Also note
that all delays are approximated with a granularity of 1/HZ seconds (HZ is
typically 100, but we suggest using HZ=1000 and maybe even larger values).
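The effect of the 1/HZ granularity can be sketched as follows. This is only an
illustration: the function names are made up, and rounding to the nearest tick
is an assumption (the kernel may round differently).

```python
def tick_ms(hz):
    """Length of one kernel tick in milliseconds (1/HZ seconds)."""
    return 1000 / hz


def approximated_delay_ms(delay_ms, hz):
    """Delay after rounding to a whole number of ticks.

    Round-half-up is an assumption made for this sketch only.
    """
    ticks = (delay_ms * hz + 500) // 1000
    return ticks * 1000 / hz


print(tick_ms(100))                     # 10.0 ms per tick with HZ=100
print(approximated_delay_ms(15, 100))   # 20.0 under the rounding assumed here
print(approximated_delay_ms(15, 1000))  # 15.0 with HZ=1000
```

With HZ=100 a requested delay of 15ms cannot be represented exactly, whereas
with HZ=1000 it can, which is why larger HZ values are suggested.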
Random Packet Loss: plr X
X is a floating point number between 0 and 1 which causes packets to be
dropped at random. This is done generally to simulate lossy links. The default
is 0, or no loss.
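As an illustration of what a given plr means for a stream of packets, here is
a minimal simulation sketch. It assumes each packet is dropped independently
with probability X; the function name and the fixed seed are made up for the
example and have nothing to do with the kernel code.

```python
import random


def surviving_packets(n_packets, plr, seed=0):
    """Count packets that survive when each is dropped with probability plr."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    # rng.random() is in [0, 1): plr=0 never drops, plr=1 always drops
    return sum(1 for _ in range(n_packets) if rng.random() >= plr)


sent = 10_000
for plr in (0.0, 0.01, 0.1):
    got = surviving_packets(sent, plr)
    print(f"plr {plr}: {got} of {sent} packets delivered")
```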
Dynamic queue creation: mask ...
It is possible to associate a mask with a pipe so that bandwidth and queue
limitations are enforced separately for packets belonging to different flows.
The mask command lets you specify which parts of the following fields
contribute to identify a flow:
[proto N] [src-ip N] [dst-ip N] [src-port N] [dst-port N]
where N is a bitmask where significant bits are set to 1. You can specify one
or more masks, or the all keyword to mean that all fields are fully
significant.
The default (when no mask is specified) is to ignore all fields, so that all
packets are considered to belong to the same flow.
Whenever a new flow is encountered, a new queue (with the specified bandwidth
and queue size) is created.
WARNING!!! The number of dynamic queues that can be created in this way can
become very large. They are accessed through a hash table, whose size you can
define using the buckets NN specifier after the mask command.
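The way a mask splits traffic into flows can be sketched like this. It is a
simplified model that only looks at dst-ip (the real flow identifier also
covers the other fields listed above), and the function names are made up for
illustration.

```python
import ipaddress


def flow_key(dst_ip, dst_mask):
    """Keep only the destination-address bits that are set in the mask."""
    return int(ipaddress.IPv4Address(dst_ip)) & dst_mask


def group_by_flow(destinations, dst_mask):
    """Group packets (given by destination address) into per-flow queues."""
    queues = {}
    for dst in destinations:
        queues.setdefault(flow_key(dst, dst_mask), []).append(dst)
    return queues


packets = ["10.1.2.5", "10.1.2.7", "10.1.2.5", "10.1.2.200"]

# mask dst-ip 0x000000ff: one queue per host in 10.1.2.0/24
print(len(group_by_flow(packets, 0x000000FF)))   # 3
# no mask (all fields ignored): everything lands in a single queue
print(len(group_by_flow(packets, 0x00000000)))   # 1
```

With mask dst-ip 0x000000ff only the last byte of the address is significant,
so each host on the /24 gets its own queue.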
To use WF2Q+, packets must be passed to queues which in turn must be connected
to a pipe.
The following ipfw commands control dummynet queues:
ipfw queue NN config ...
This command is used to create or reconfigure a queue. NN is the numeric
identifier (between 1 and 65535) of the queue. Issuing the configuration command
multiple times results in the queue being reconfigured.
ipfw queue NN delete
Destroys a single queue. Remember that packets sent to a non-existing queue are
silently dropped.
ipfw queue flush
Destroys all queues.
The following parameters can be configured for a queue by adding the
corresponding keyword to the queue config ... line:
Pipe: pipe NN
NN is the identifier of the pipe used for regulating traffic.
Weight: weight NN
NN is the weight (1..100, default 1) associated with the queue.
Per-Flow queueing: mask ...
The syntax is the same as for pipes. However, all queues created dynamically
will share the parent pipe's bandwidth according to the weight.
Queue size, Random Packet Loss:
Same as for pipes.
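To see what the weights mean in practice, here is a small sketch of the
long-term bandwidth split among queues attached to the same pipe. It assumes
all the queues are backlogged (always have packets to send); an idle queue
leaves its share to the others. The function name and the queue numbers in the
example are made up for illustration.

```python
def bandwidth_shares(pipe_bw, weights):
    """Split pipe_bw among backlogged queues in proportion to their weight.

    weights maps a queue identifier to its weight; the result uses the
    same unit as pipe_bw.
    """
    total = sum(weights.values())
    return {queue: pipe_bw * w / total for queue, w in weights.items()}


# A 300Kbit/s pipe shared by three queues with weights 5, 5 and 10
shares = bandwidth_shares(300, {1: 5, 2: 5, 3: 10})
print(shares)   # {1: 75.0, 2: 75.0, 3: 150.0}  (Kbit/s)
```

Only the ratio between weights matters: weights 5, 5, 10 give the same split
as 1, 1, 2.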
5. Using dummynet for testing protocols
Dummynet was originally created to test network protocols and applications,
possibly even on a standalone system. As a consequence, some of its features
such as delay emulation, random loss etc. are explicitly designed for that
purpose.
There are a few things you should keep in mind when doing such tests, to avoid
getting incorrect results. They are all obvious, but it is still better to
spell them out.
Choosing a reasonable buffer size.
As said earlier, a packet can be subject to a delay which is proportional to the
total queue size (in bytes), and inversely proportional to the bandwidth. At
low bandwidths, these queueing delays can be extremely high, especially if the
queue size is defined in terms of packets and packets are large. The default
queue size is almost certainly too large for most purposes, and it is often
preferable to define the queue size in terms of bytes rather than packets.
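One way to pick a buffer size is to start from the largest queueing delay you
are willing to accept and work backwards to a size in bytes. The sketch below
does just that; the function name is made up, and it assumes 1 Kbit/s means
1000 bit/s.

```python
def queue_bytes_for_delay(bw_bits_per_s, max_delay_ms):
    """Largest queue (in bytes) that drains within max_delay_ms at this rate."""
    # bits/s * ms / 1000 gives bits; divide by 8 for bytes
    return bw_bits_per_s * max_delay_ms // 8000


# 50Kbit/s pipe, at most 200ms of queueing delay
print(queue_bytes_for_delay(50_000, 200))    # 1250 bytes
# 300Kbit/s pipe, at most 100ms of queueing delay
print(queue_bytes_for_delay(300_000, 100))   # 3750 bytes
```

Note that 1250 bytes is less than a single full-size ethernet packet, which
shows how quickly a queue sized in slots becomes too large on a slow pipe.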
Half-duplex vs. Full-duplex channels.
With the exception of shared-medium networks such as the ethernet, most links
that you want to simulate for your experiments are full duplex. As such, the
proper configuration is the following:
ipfw add pipe 1 ip from A to B
ipfw add pipe 2 ip from B to A
ipfw pipe 1 config ...
ipfw pipe 2 config ...
Should you really need to model a half duplex network, then you can use the
following sequence. But think twice before you do so, as it is often a
non-realistic model.
ipfw add pipe 3 ip from A to B
ipfw add pipe 3 ip from B to A
ipfw pipe 3 config ...
Interactions between bridging and multicast
You can use ipfw (and dummynet) in a bridge by setting some sysctl variables:
sysctl -w net.link.ether.bridge=1
sysctl -w net.link.ether.bridge_ipfw=1
and then specify your firewall configuration.
Be careful when you run experiments involving multicast traffic through a
dummynet-enabled bridge. Unless you set the rules right, multicast traffic in
a bridge goes through the firewall code twice: once during forwarding at level
2, once when the packet is passed to the local IP stack of the bridge.
Starting from Feb.2000, there are two ways to avoid this problem. One involves
a sysctl variable:
sysctl -w net.inet.ip.fw.enable=0
which prevents the firewall from being invoked at the IP level. Otherwise, you can
use the bridged specifier in your ruleset to match only bridged packets:
ipfw add pipe 1 ip from any to any bridged
Running over the loopback interface.
Dummynet was originally designed for running experiments on a standalone
machine. The loopback interface lets you run senders and receivers on the same
machine, but you should remember a few things:
The firewall is invoked on all packets.
This means that if you have a configuration such as
ipfw add pipe 4 ip from 127.0.0.1 to 127.0.0.1
ipfw pipe 4 config delay 100ms
and do a simple ping 127.0.0.1 you will see a delay of approximately 400ms.
In fact the ICMP request goes through the pipe twice (once down, once up),
and the same for the ICMP reply. For the same reason, if you also have
bandwidth or queue limitations, remember that the queue sees the traffic
multiple times.
You can partially overcome this problem by using additional ipfw options,
e.g. specifying a direction for matching packets, or the uid of the sender
or receiving process. Alternatively, you can assign multiple aliases to the
loopback interface, and make sure that the sender and receiver bind their
local endpoint to different addresses so that you will have distinct rules
matching traffic in the two directions.
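The arithmetic behind the 400ms figure can be written down as a tiny sketch.
The function name is made up, and the single-pass case assumes that a rule
restricted to one direction makes each packet cross the pipe only once.

```python
def loopback_ping_rtt_ms(pipe_delay_ms, passes_per_packet=2):
    """Round-trip time added by the pipe to a ping over loopback."""
    packets_per_ping = 2  # ICMP echo request + ICMP echo reply
    return packets_per_ping * passes_per_packet * pipe_delay_ms


# Rule matches in both directions: each packet crosses the pipe twice
print(loopback_ping_rtt_ms(100))                       # 400
# Rule restricted to one direction (assumed: one pass per packet)
print(loopback_ping_rtt_ms(100, passes_per_packet=1))  # 200
```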
The MTU of the loopback interface defaults to 16KB
The usual default for ethernet is 1500, and for point-to-point links often
smaller (576 or so). You can simply fix this by redefining the mtu to the
desired value with
ifconfig lo0 mtu 1500
TCP defaults.
Be very careful when using TCP, especially between processes running on the
same machine, or on the same subnet.
Apart from the MTU issue mentioned earlier, at least on FreeBSD, TCP starts
with a full window when the remote endpoint is on the same subnet as one of
the local addresses. You need a simple fix in the source (tcp_input.c, I
believe) to fix this behaviour in FreeBSD 3.x, whereas FreeBSD 4.x has sysctl
variable(s) to set the initial window.
Secondly, when you do experiments on configurations with a large
delay-bandwidth product, remember that many applications use the default
window size which is small, something like 16KB. You might end up not using
the full bandwidth just because your data transfer is window-limited.
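A quick way to check whether a transfer will be window-limited is to compare
the window with the delay-bandwidth product. The sketch below uses the usual
approximation that TCP sends at most one window per round trip; it ignores
queueing and transmission delays, and the function names are made up for
illustration.

```python
def delay_bandwidth_product_bytes(link_bps, rtt_s):
    """Bytes that must be in flight to keep the link full."""
    return link_bps * rtt_s / 8


def window_limited_throughput_bps(window_bytes, rtt_s, link_bps):
    """Best-case throughput when at most one window is sent per round trip."""
    return min(link_bps, window_bytes * 8 / rtt_s)


# 640Kbit/s pipe with 1000ms delay each way (RTT of 2 seconds), 16KB window
print(delay_bandwidth_product_bytes(640_000, 2.0))          # 160000.0 bytes
print(window_limited_throughput_bps(16 * 1024, 2.0, 640_000))  # 65536.0 bit/s
```

In this case a 16KB window can fill only about a tenth of the pipe, no matter
how the pipe itself is configured.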
5.1 Simulating multipath
One nice feature of the new version of dummynet is the ability to simulate
multiple paths between sender and receiver. This is done using probabilistic
match, e.g.:
ipfw add prob 0.33 pipe 1 ip from A to B
ipfw add prob 0.5 pipe 2 ip from A to B
ipfw add pipe 3 ip from A to B
ipfw pipe 1 config ...
ipfw pipe 2 config ...
ipfw pipe 3 config ...
Given the right packet, the first rule will match with probability 1/3; in the
remaining 2/3 of occurrences we move to the second rule, which will match with
prob 1/2 (so overall 1/2*2/3 = 1/3), and the remaining 1/3 of occurrences will
move to the third rule, which has a deterministic match. We can then configure
the three pipes as desired to emulate phenomena such as packet reordering etc.
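Because each prob value applies only to the packets that fell through the
previous rules, it helps to compute the overall split explicitly. The sketch
below goes in both directions: from prob values to overall shares, and from
the shares you want to the prob values to put in the rules. The function names
are made up, and exact fractions are used just to keep the arithmetic clean.

```python
from fractions import Fraction


def overall_shares(probs):
    """Overall fraction of packets matched by each rule in a prob cascade."""
    remaining = Fraction(1)
    shares = []
    for p in probs:
        shares.append(remaining * p)   # matched here
        remaining *= 1 - p             # falls through to the next rule
    return shares


def rule_probs(shares):
    """prob values that yield the desired overall shares (must sum to 1)."""
    remaining = Fraction(1)
    probs = []
    for s in shares:
        probs.append(s / remaining)
        remaining -= s
    return probs


# prob 1/3, prob 1/2, then a deterministic match: an even three-way split
print(overall_shares([Fraction(1, 3), Fraction(1, 2), Fraction(1)]))
# Wanted split 50% / 30% / 20%: use prob 1/2, prob 3/5, then no prob
print(rule_probs([Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]))
```

The last value returned by rule_probs is always 1, i.e. the final rule needs
no prob keyword at all.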
6. Related links
Here I collect some info on how to do various ipfw-related things. Most of this
is just URLs collected from the mailing lists, so the reliability of the info
might be different (for good or bad) from what is in this page.
PPP Over Ethernet
Detailed instructions on how to set up a PPPoE connection.
ALTQ
Alternate queueing scheme.
A PicoBSD floppy
To conclude... if you want to try dummynet, here is a bootable floppy image of a
system with FreeBSD, bridging, ipfw, dummynet, natd, ppp, drivers for a few
interfaces, and accessible via telnet.
To set up this system, download the 1.44MB image, pico.000608.bin, and copy it
to a floppy using dd under FreeBSD, or rawrite under DOS/Windows.
Then put the floppy into a machine with (hopefully) at least one interface, and
wait for it to boot. When the system comes up, login as root, password "setup",
and you can play with bridging, ipfw and dummynet using the above commands.
Luigi Rizzo
Dipartimento di Ingegneria dell'Informazione -- Univ. di Pisa
via Diotisalvi 2 -- 56126 PISA
tel. +39-050-568533 Fax +39-050-568522
email: luigi@iet.unipi.it