02/02/2005
Cluckfuck McDuck, Bumfuck Chuckschmuck
This is the fifth in a series of back-dated site updates that have only just been published
firewalling and traffic shaping part 2
So we move on to traffic shaping. The main reason I want to do this is so that I can still log in to remote hosts using ssh and type fairly easily, even if I am downloading something or you fucks are spamming my web server with stupid requests (e.g. "GET /favicon.ico". Note to self: write a rant about Firefox another time).
Suppose you're uploading and also trying to ssh. (If you're downloading you're also uploading: TCP sends ACK packets back to keep the stream reliable.) ssh sets Type Of Service (TOS) Minimise-Delay on its outgoing packets, so by default they leave your router at high priority. However, they leave at the speed your network card talks to your modem, typically 10 megabits per second. This is way above your internet connection's upload speed, which is typically about 256 kilobits per second if you're on ADSL. So a queue forms in the modem. You don't want this. The modem knows nothing about priorities: everything waits in one line, and it will just drop packets it doesn't have room to queue, including ssh packets that should be prioritised over, say, http ACK packets. This creates latency and those annoying delays between keypress and character appearing in your ssh window. You want to:
- Limit outgoing packets so that the queue is in your router, over which you have control
- Prioritise packets by criteria such as Type Of Service, protocol or port number, length etc.
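To put rough numbers on that, here is a back-of-the-envelope sketch. The link speeds are the ones quoted above; the 1500 byte packet size and the figure of 20 packets already sitting in the modem are assumptions picked for illustration, not measurements.

```python
def transmit_seconds(packet_bytes, link_bits_per_second):
    """Time needed to clock one packet onto a link of the given speed."""
    return packet_bytes * 8 / link_bits_per_second


UPLINK = 256_000      # ADSL upstream from the text, in bits per second
LAN = 10_000_000      # network card to modem, in bits per second
FULL_PACKET = 1500    # assumed: bytes in a full-size packet
QUEUED_AHEAD = 20     # assumed: full-size packets already waiting in the modem

per_packet_lan = transmit_seconds(FULL_PACKET, LAN)
per_packet_uplink = transmit_seconds(FULL_PACKET, UPLINK)
# A keypress that arrives behind the queue waits for all of it to drain.
keypress_wait = QUEUED_AHEAD * per_packet_uplink

print(f"one packet, card to modem: {per_packet_lan * 1000:.1f} ms")
print(f"one packet, modem to ISP:  {per_packet_uplink * 1000:.1f} ms")
print(f"keypress behind {QUEUED_AHEAD} packets: {keypress_wait:.2f} s")
```

Under those assumptions each full-size packet takes about 47 ms to leave the modem, so a keypress stuck behind 20 of them waits the best part of a second.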
When there is network traffic to send out, the kernel puts it in a Queueing Discipline, abbreviated to qdisc. The packets are then dequeued in whatever order the qdisc decides. A classless qdisc treats all packets the same. A classful qdisc is split into classes, and you have to specify filters to tell the qdisc which packets go into which class. The default qdisc on an ethernet interface (pfifo_fast) is a 3 band priority queue: when a packet is queued it gets put into one of three FIFO queues depending on its TOS value. Packets are dequeued by priority and no attempt is made to slow the connection down; if there are packets to send, they'll just go.
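A toy model of that three band queue, as a sketch only. The TOS-to-band mapping here is a deliberate simplification (the kernel's real table has more entries), and the class name and packet names are made up.

```python
from collections import deque

MINIMISE_DELAY = 0x10
MAXIMISE_THROUGHPUT = 0x08


def band_for(tos):
    """Simplified mapping (an assumption): band 0 is served first."""
    if tos & MINIMISE_DELAY:
        return 0
    if tos & MAXIMISE_THROUGHPUT:
        return 2
    return 1


class ThreeBandFifo:
    """Three FIFO queues; a lower band always drains before a higher one."""

    def __init__(self):
        self.bands = [deque(), deque(), deque()]

    def enqueue(self, name, tos):
        self.bands[band_for(tos)].append(name)

    def dequeue(self):
        for band in self.bands:
            if band:
                return band.popleft()
        return None  # nothing queued


q = ThreeBandFifo()
q.enqueue("ftp-data-1", MAXIMISE_THROUGHPUT)
q.enqueue("http-ack-1", 0x00)
q.enqueue("ssh-keypress", MINIMISE_DELAY)
q.enqueue("http-ack-2", 0x00)

order = [q.dequeue() for _ in range(4)]
print(order)
```

The ssh packet comes out first even though it was queued third. Note there is no rate limit anywhere in this: dequeue hands over a packet whenever it is asked, which is exactly why the real queue ends up in the modem.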
Probably the easiest way to do what I want is to replace the default qdisc with a Hierarchical Token Bucket (HTB). This qdisc is pretty much designed for the problem I'm trying to solve but is only available in recent kernels (2.4.20 onwards, so not in Debian Woody unless you've upgraded). What you can do with this is set the maximum rate at which the entire qdisc can dequeue (that is, the rate at which your upstream link can transmit), and classify traffic into subflows which will be dequeued in an order given by priorities you choose. The best part is that if a higher priority class isn't using all of its bandwidth, lower priority classes can borrow what's left over.
(If you don't have a recent kernel you can make a similar setup by putting a bunch of Token Bucket Filters into a simple priority queue; the token buckets are independent of each other, so they can't borrow, and their rates have to add up to the total upstream bandwidth you have.)
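In case "token bucket" means nothing to you, here is a minimal sketch of the idea that both setups are built on. It is a toy model, not the kernel's implementation; the class and the numbers are invented for illustration.

```python
class TokenBucket:
    """Tokens drip in at a fixed rate; sending a packet spends one per byte."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes      # the bucket never holds more than this
        self.tokens = burst_bytes     # start with a full bucket
        self.last = 0.0               # time of the previous refill

    def try_send(self, now, packet_bytes):
        # Refill for the time that has passed, capped at the bucket size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False  # not enough tokens yet: the packet stays queued


bucket = TokenBucket(rate_bytes_per_s=1000, burst_bytes=1500)
results = [
    bucket.try_send(0.0, 1500),  # bucket starts full, so this goes out
    bucket.try_send(0.5, 1500),  # only 500 tokens have dripped back in
    bucket.try_send(1.5, 1500),  # another second later the bucket is full
]
print(results)
```

Over any long stretch the bucket cannot pass more than its rate, which is what moves the queue out of the modem and into the router.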
Implementation
Okay, that's the theory, and it's not really conceptually difficult, my terrible explanations notwithstanding. How do you actually set up this stuff? Obviously you need it compiled into your kernel. Initialising it, however, requires the command tc, which has probably one of the most awful command line interfaces ever made. I don't even think it's meant to be human-readable.
The first part is relatively straightforward: you set the qdisc of your outgoing interface to be an HTB, and make its first child class specify the maximum bandwidth of the link. (That should be a little bit less than your actual upstream bandwidth, so that the queue forms in your router and not in the modem.) Then you create subclasses of that first class that each have a guaranteed bandwidth, a maximum bandwidth, and a priority. Obviously the maximum bandwidth for each subclass shouldn't exceed the maximum bandwidth of the parent class, nor should the sum of the guaranteed bandwidths.
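Here is a sketch of what that comes to in tc commands. It only builds and prints the command lines, so it runs without root. The interface name, the 220kbit link figure, the class IDs and the three way split are all assumptions for illustration, not my actual init script. In HTB a lower prio number means higher priority.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    minor: int       # class ID after the "1:" (hypothetical numbering)
    rate_kbit: int   # guaranteed bandwidth
    ceil_kbit: int   # maximum bandwidth, reachable by borrowing
    prio: int        # lower number is served first


def htb_commands(dev, link_kbit, leaves, default_minor):
    """Build tc command lines for one HTB root class with leaf subclasses."""
    if sum(leaf.rate_kbit for leaf in leaves) > link_kbit:
        raise ValueError("guaranteed rates add up to more than the link")
    if any(leaf.ceil_kbit > link_kbit for leaf in leaves):
        raise ValueError("a subclass ceiling exceeds the link rate")
    cmds = [
        # Unclassified packets land in the default class.
        f"tc qdisc add dev {dev} root handle 1: htb default {default_minor}",
        f"tc class add dev {dev} parent 1: classid 1:1 htb "
        f"rate {link_kbit}kbit ceil {link_kbit}kbit",
    ]
    for leaf in leaves:
        cmds.append(
            f"tc class add dev {dev} parent 1:1 classid 1:{leaf.minor} htb "
            f"rate {leaf.rate_kbit}kbit ceil {leaf.ceil_kbit}kbit prio {leaf.prio}"
        )
    return cmds


LEAVES = [
    Leaf(minor=10, rate_kbit=100, ceil_kbit=220, prio=1),  # interactive
    Leaf(minor=20, rate_kbit=80, ceil_kbit=220, prio=2),   # web server
    Leaf(minor=30, rate_kbit=40, ceil_kbit=220, prio=3),   # everything else
]

commands = htb_commands("eth0", 220, LEAVES, default_minor=30)
for command in commands:
    print(command)
```

The two checks at the top of htb_commands are the two rules from the paragraph above; tc itself will happily accept rates that break them.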
Finally you attach filters to the qdisc that say which subclass of the qdisc to put packets into. This is probably the hardest part, as the u32 filter is an awful thing that requires intimate knowledge of the structure of the header of a packet and inhuman ability with bitshifts and bitmasks and stuff. I found it much easier to use the fwmark filter (tc calls it fw), which classifies packets according to their iptables mark. This is why I have the mangle table, as I mentioned in the previous update. It says a lot about tc that writing iptables rules is much easier(!)
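A sketch of the marking and filtering pair, again just printing command lines. The interface name, the mark numbers, the match criteria and the class IDs (1:10 and 1:20 under an HTB qdisc with handle 1:) are hypothetical choices for illustration.

```python
DEV = "eth0"  # assumed outgoing interface

# mark -> (iptables match, HTB class to put the packet in); all hypothetical
RULES = {
    1: ("-p tcp --dport 22", "1:10"),  # ssh out to remote hosts
    2: ("-p tcp --sport 80", "1:20"),  # replies from the local web server
}


def mark_command(dev, mark, match):
    """iptables rule that stamps matching outgoing packets with a mark."""
    return (
        f"iptables -t mangle -A POSTROUTING -o {dev} {match} "
        f"-j MARK --set-mark {mark}"
    )


def filter_command(dev, mark, classid):
    """tc fw filter that sends packets carrying that mark to a class."""
    return (
        f"tc filter add dev {dev} parent 1: protocol ip prio 1 "
        f"handle {mark} fw flowid {classid}"
    )


commands = []
for mark, (match, classid) in RULES.items():
    commands.append(mark_command(DEV, mark, match))
    commands.append(filter_command(DEV, mark, classid))

for command in commands:
    print(command)
```

Anything that gets no mark matches no filter and ends up in whatever default class the HTB qdisc was given.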
Summary
Anyway, so basically what you have happening is this:
- Packet wants to get transmitted
- Packet goes into iptables POSTROUTING hook where the mangle table sets a mark on it depending on whatever criteria
- Packet gets enqueued into HTB
- HTB classifies packet (puts it into a given class) based on its iptables mark. I suggest you use the same arbitrary mark number as the priority of the class into which it is going.
- Packet sits in queue.
- Network hardware asks to have packet dequeued so it can transmit it.
- HTB looks for the highest priority packet to dequeue, and dequeues it if the configured rates allow it.
- Packet goes out, hopefully not having had to queue in or get dropped by the modem.
Please note you can only do this with outgoing packets. There's not a lot you can do with incoming traffic, except police it. That is, drop some if you receive too much. TCP at the other end notices the loss, retransmits and slows down. I do this but I'm not sure how much effect it has.
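For what it's worth, policing looks something like the sketch below, which follows the style of the examples in the HOWTO linked further down and is not necessarily what my own scripts do. The interface name and the 2000kbit downstream figure are assumptions, as is the choice to police at 90% of it. Annoyingly this is the one place a u32 match turns up, though only as "match everything".

```python
def ingress_police_commands(dev, rate_kbit):
    """Build tc command lines that drop incoming traffic above rate_kbit."""
    return [
        # The ingress qdisc has the fixed handle ffff:
        f"tc qdisc add dev {dev} handle ffff: ingress",
        # Match every IP packet and drop whatever exceeds the policed rate.
        f"tc filter add dev {dev} parent ffff: protocol ip prio 50 "
        f"u32 match ip src 0.0.0.0/0 "
        f"police rate {rate_kbit}kbit burst 10k drop flowid :1",
    ]


DOWNLINK_KBIT = 2000                         # assumed downstream speed
POLICE_KBIT = DOWNLINK_KBIT * 9 // 10        # assumed: police a bit below it

commands = ingress_police_commands("eth0", POLICE_KBIT)
for command in commands:
    print(command)
```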
Links
- Advanced Routing and Traffic Control HOWTO: just about the only documentation that exists. Also found here is Wondershaper, which does pretty much everything I've described above. I could have just installed this but I wanted to learn about it, so I wrote my own init scripts instead.
- Hierarchical Token Bucket home page, user guide, etc.
- TCNG, a very promising piece of code for which you write configuration files in a nicely clear structured configuration language, and it turns them into tc commands for you. Unfortunately I had certain problems that make it little better than beta software at the moment. That's not to say it didn't help out a lot.