IPTables has always been a relatively flexible and modular firewall; if it can't currently test for a particular packet characteristic, you have the option of writing a test or modifying an existing test. The catch - as with so much of open source - is that most of us aren't programmers.
It turns out we do have an option that doesn't require programming. Don Cohen was kind enough to write an IPTables module that pulls any bytes you'd like out of the packet, does some manipulation, and sees if the result is in a particular range. For example, I can grab the Fragmentation information out of the IP header, throw away everything except the More Fragments flag, and see if that flag is set.
Without writing any C code at all. :-)
What I'll do is introduce the core concepts here, and put in hopefully enough annotated examples that you'll be able to write your own tests.
I won't be focusing on what these fields are, or why you'd want to test them; there are lots of (warning - shameless plug for my employer ahead!) resources for doing that. If you simply need a quick reference for the packet headers, see tcpip.pdf.
All byte positions in this article start counting at 0 as the first byte of the header. For example, in the IP header, byte "0" holds the 4 bit "Version" and 4 bit "IP Header Length", byte "1" holds the "TOS" field, etc.
In its simplest form, u32 grabs a block of 4 bytes starting at Start, applies a mask of Mask to it, and compares the result to Range. Here's the syntax we'll use for our first examples:
iptables -m u32 --u32 "Start&Mask=Range"
We'll generally pick a "Start" value that's 3 less than the last byte in which you're interested. So, if you want bytes 4 and 5 of the IP header (the IP ID field), Start needs to be 5-3 = 2. Mask strips out all the stuff you don't want; it's a bitmask that can be as large as 0xFFFFFFFF. To get to our target of bytes 4 and 5, we have to discard bytes 2 and 3. Here's the mask we'll use: 0x0000FFFF. We'll actually use the shorter, and equivalent, 0xFFFF instead.
So, to test for IPID's from 2 to 256, the iptables command line is:
iptables -m u32 --u32 "2&0xFFFF=0x2:0x0100"
To read this off from left to right: "Load the u32 module, and perform the following u32 tests on this packet; grab the 4 bytes starting with byte 2 (bytes 2 and 3 are the Total Length field, and bytes 4 and 5 are the IPID), apply a mask of 0x0000FFFF (which sets the first two bytes to all zeroes, leaving the last two bytes untouched), and see if that value - the IPID - falls between 2 and 256 inclusive; if so, return true, otherwise false."
There is no standalone IPID check in IPTables, but this is the equivalent of the "ip[4:2] >= 2 and ip[4:2] <= 256" tcpdump/bpf filter (the IPID lives in bytes 4 and 5, so the bpf offset is 4).
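To make the Start&Mask=Range mechanics concrete, here's a minimal Python sketch of the arithmetic. This is not the module's code; the helper name `u32_match` and the sample header bytes are made up for illustration.

```python
import struct

def u32_match(packet: bytes, start: int, mask: int, lo: int, hi: int) -> bool:
    """Model of a simple u32 test "Start&Mask=lo:hi" (illustrative helper only)."""
    if start < 0 or start + 4 > len(packet):
        return False  # asking for bytes outside the packet never matches
    (value,) = struct.unpack_from("!I", packet, start)  # 4 bytes, network byte order
    return lo <= (value & mask) <= hi

# Hypothetical 20-byte IP header: total length 0x0054, IP ID 0x0100 (256)
header = bytes.fromhex("45000054" "01004000" "40060000" "c0a80f01" "e0000001")

# Same arithmetic as "2&0xFFFF=0x2:0x0100"
ipid_in_range = u32_match(header, 2, 0xFFFF, 0x2, 0x0100)
```

Starting at byte 2 pulls in the Total Length bytes as well, and the mask is what discards them before the range comparison.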
I leave off actions in these examples, but you can add things like:
-j LOG --log-prefix "ID-in-2-256 "
or any other action. You can also add other tests, as we'll do in a minute.
Don offers this test to see if the total packet length is greater than or equal to 256. The total length field is bytes 2 and 3 of the IP header, so our starting position is 3-3 = 0. Since we're pulling out two bytes again, the mask will be 0xFFFF here as well. The final test is:
iptables -m u32 --u32 "0&0xFFFF=0x100:0xFFFF"
This is the same as:
iptables -m length --length 256:65535
or the bpf filter
"len >= 256"
Much the same, except we'll use a mask of 0x000000FF (or its shorter equivalent 0xFF) to pull out a single byte from the 4 bytes u32 initially hands us. Let's say I want to test the TTL field for TTLs of 3 or less to find people tracerouting to us. Yes, there's a ttl module, but let's see how this would be done in u32.
I want to end up with byte 8 of the IP header, so my starting position is 8-3 = 5. Here's the test:
iptables -m u32 --u32 "5&0xFF=0:3"
Which is equivalent to:
iptables -m ttl --ttl-lt 4
or the bpf filter
"ip[8] <= 3"
To check a complete destination IP address, we'll inspect bytes 16-19. Because we want all 4 bytes, we don't need a mask at all. Let's see if the destination address is 224.0.0.1 (0xE0000001):
iptables -m u32 --u32 "16=0xE0000001"
This is equivalent to:
iptables -d 224.0.0.1/32
If we only want to look at the first three bytes (to check if a source address is part of a given class C network), we'll need to use a mask again. The mask we'll use is 0xFFFFFF00 , which throws away the last octet. Let's check if the source address (from bytes 12-15, although we'll ignore byte 15 with the mask) is in the class C network 192.168.15.0 (0xC0A80F00):
iptables -m u32 --u32 "12&0xFFFFFF00=0xC0A80F00"
Which is the same as:
iptables -s 192.168.15.0/24
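Converting dotted addresses to hex by hand is error-prone, so here's a small Python sketch that does it with the standard `ipaddress` module. The helper name `addr_to_u32` is my own, and the rule string is only assembled for display, not run.

```python
import ipaddress

def addr_to_u32(dotted: str) -> str:
    """Return an IPv4 address as the 8-digit hex value a u32 test compares against."""
    return "0x%08X" % int(ipaddress.IPv4Address(dotted))

network = ipaddress.ip_network("192.168.15.0/24")
mask_hex = "0x%08X" % int(network.netmask)            # keeps the first three octets
net_hex = addr_to_u32(str(network.network_address))

# Byte 12 is where the source address starts in the IP header
rule = '--u32 "12&%s=%s"' % (mask_hex, net_hex)
```

The same helper works for a full-address test; only the mask goes away.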
Obviously, if I want to look at the TOS field (byte 1 of the IP header), I can't start at byte 1-3 = -2. What we'll do instead is start at byte 0, pull out the byte we want, and then move it down to the last position for easy testing. This isn't the only way we could do this, but it helps demonstrate a technique we'll need in a minute.
To pull out the TOS field, I first ask u32 to give me bytes 0-3 by using an offset of 0. Now, I pull out byte 1 (the second byte in that block) with a mask of 0x00FF0000. I need to shift the TOS value down to the far right position for easy comparison. To do this, I use a technique called, unsurprisingly, "right shift". The symbol for right shift is ">>"; this is followed by the number of bits right to move the data. If you're unfamiliar with right shift, take a look at this tutorial from Harper College.
I want to move TOS two bytes - or 16 bits - to the right. This is done with ">>16". Now that we have TOS in the correct position, we compare it to 0x08 (Maximize Throughput):
iptables -m u32 --u32 "0&0x00FF0000>>16=0x08"
which is the equivalent of:
iptables -m tos --tos 8
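Here's the mask-then-shift step worked through in Python on a made-up first word of an IP header. One caution: u32 applies its operators left to right, while Python gives ">>" higher precedence than "&", so the sketch does the two steps on separate lines rather than copying the expression verbatim.

```python
import struct

# Hypothetical first four bytes of an IP header:
# version/IHL 0x45, TOS 0x08, total length 0x0054
first_word = struct.unpack("!I", bytes.fromhex("45080054"))[0]

masked = first_word & 0x00FF0000   # keep only byte 1 (TOS)
tos = masked >> 16                 # slide it down to the lowest byte
```

After the shift, `tos` can be compared directly against 0x08, which is what the "=0x08" at the end of the u32 expression does.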
I'd like to look at the "More Fragments" flag - a flag which has no existing test in iptables (-f matches 2nd and further fragments, I want to match all fragments except the last). Byte 6 contains this, so I'll start with offset 3 and throw away bytes 3-5. Normally this would use a mask of 0x000000FF, but I also want to discard the other bits in that last byte. The only bit I want to keep is the third from the top (0010 0000), so the mask I'll use is 0x00000020 . Now I have two choices; move that bit down to the lowest position and compare, or leave it in its current position and compare.
To move it down, we'll right shift 5 bits. The final test is:
iptables -m u32 --u32 "3&0x20>>5=1"
If I take the other approach of leaving the bit where it is, I need to be careful about the compare value on the right. If that bit is turned on, the compare value needs to be 0x20 as well.
iptables -m u32 --u32 "3&0x20=0x20"
Both approaches return true if the More Fragments flag is turned on.
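As a sanity check that the two forms really agree, here's a Python sketch that evaluates both against two invented headers, one with More Fragments set and one with only Don't Fragment set. The function name and the header bytes are hypothetical.

```python
import struct

def more_fragments_tests(ip_header: bytes):
    """Evaluate both forms of the More Fragments test on a raw IP header."""
    (word,) = struct.unpack_from("!I", ip_header, 3)   # bytes 3-6; byte 6 is the low byte
    shifted = ((word & 0x20) >> 5) == 1                # models "3&0x20>>5=1"
    in_place = (word & 0x20) == 0x20                   # models "3&0x20=0x20"
    return shifted, in_place

# Bytes 6-7 are 0x2000 (MF set) in the first header, 0x4000 (DF set, MF clear) in the second
mf_set = bytes.fromhex("4500005400012000400600000a0000010a000002")
mf_clear = bytes.fromhex("4500005400014000400600000a0000010a000002")

results = (more_fragments_tests(mf_set), more_fragments_tests(mf_clear))
```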
If you want to inspect more than one aspect of a packet, use:
&&
between each test.
This is a little tricky. Let's say I'd like to look at bytes 4-7 of the TCP header (the TCP sequence number). Let's take the simple approach first, and then look at some ways to improve this.
For our first version, let's assume that the IP header is 20 bytes long - usually a good guess. Our starting point is byte 4 of the tcp header that immediately follows the IP header. Our simplistic test for whether the sequence number is 41 (hex 29) might look like this:
iptables -m u32 --u32 "24=0x29"
For packets where the IP header length is 20, this will actually work, but there are a few problems. Let's fix them one by one.
First, we never check to see if the packet is even a TCP packet. This is stored in byte 9 of the IP header, so we'll pull 4 bytes starting at byte 6, drop 6-8, and check to see if it's 6. The new rule that first checks if this is a TCP packet at all and also checks that the Sequence Number is 41 is:
iptables -m u32 --u32 "6&0xFF=0x6 && 24=0x29"
The second problem we've momentarily ignored is the IP header length. True, it usually is 20 bytes long, but it can be longer, if IP options are used.
Here are the steps. We pull the IP header length (a nibble that shows how many 4-byte words there are in the header, usually 5) out of the IP header. We multiply it by 4 to get the number of bytes in the IP header. We use this number to say how many bytes to jump to get to the beginning of the TCP header, and jump 4 more bytes to get to the Sequence number.
To get the header length, we need the first byte: "0>>24", but we need to only grab the lower nibble and we need to multiply that number by 4 to get the actual number of bytes in the header. To do the multiply, we'll right shift 22 instead of 24. With this shift, we'll need to use a mask of 0x3C instead of the 0x0F we would have used. The expression so far is:
0>>22&0x3C
On an IP header with no options, that expression returns 20; just what we'd expect. Now we need to tell u32 to use that number and make a jump that many bytes into the packet, a step performed by the "@" operator:
iptables -m u32 --u32 "6&0xFF=0x6 && 0>>22&0x3C@4=0x29"
The "@" grabs the number we created on its left (20, normally) and jumps that many bytes forward (we can even do this more than once - see the TCP payload section below). The 4 to its right tells u32 to grab bytes 4-7, but u32 knows to pull them relative to the 20 bytes it skipped over. This gives us the Sequence Number, even if the IP header grows because of options. *phew*!
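The jump is easier to believe once you see it survive an IP header with options. This Python sketch models the shift, mask and jump on two invented packets, one with a 20-byte IP header and one with 24 bytes. The helper names are mine, and everything in the headers except the fields being read is zero filler.

```python
import struct

def load32(packet: bytes, offset: int) -> int:
    """Read 4 bytes at offset as a big-endian unsigned integer."""
    return struct.unpack_from("!I", packet, offset)[0]

def tcp_sequence(packet: bytes) -> int:
    """Model of "0>>22&0x3C@4": jump over the IP header, then read TCP bytes 4-7."""
    ip_header_len = (load32(packet, 0) >> 22) & 0x3C   # IHL nibble times 4
    return load32(packet, ip_header_len + 4)

tcp = struct.pack("!HHI", 1025, 80, 0x29)        # ports, then sequence number 41
plain = bytes([0x45]) + bytes(19) + tcp          # IHL 5: 20-byte IP header
with_options = bytes([0x46]) + bytes(23) + tcp   # IHL 6: 24-byte IP header

fixed_offset_read = load32(with_options, 24)     # what a hard-coded "24" would see
```

With the 24-byte header, the hard-coded offset lands on the TCP port numbers instead of the Sequence Number, while the computed jump still finds 0x29.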
The last quirk to handle is fragments. When we were only working with the IP header, this wasn't an issue; IP is designed in such a way that the IP header itself can never be fragmented. The TCP header and application payload technically might be, and if we're handed the second or further fragment, we might be looking not at the Sequence Number in bytes 4-7, but perhaps some other part of the TCP header, or more likely, some application layer data.
What we'll do is check that this is the first fragment (or an unfragmented packet, the test won't care), so that we're sure we're looking at tcp header info. To do this, we test the fragment offset in most (we discard the top three flag bits) of bytes 6 and 7 of the IP header to make sure the offset is 0. The test is:
4&0x1FFF=0
The final expression (check for TCP, check for unfragmented packet or first fragment, and jump over the IP header, checking that bytes 4-7 of the TCP header are equal to 41) is:
iptables -m u32 --u32 "6&0xFF=0x6 && 4&0x1FFF=0 && 0>>22&0x3C@4=0x29"
If the packet is, in fact, fragmented, we have one more consideration; the fragment might be so small that the field we're testing might have been put in a future fragment! In this one case, it's not an issue because every IP link should handle packets of at least 68 bytes; even if the IP header was at its maximum of 60 bytes, the first 8 bytes of the TCP header should be included in that first fragment.
When we start testing for things further into the packet, we'll have to depend on u32's ability to simply return false if we ever try to ask for a value that falls outside of the packet being inspected.
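Putting the three checks together, here's a rough Python model of the whole expression, including the "out of range means no match" behaviour described above. It's a sketch of the logic only, with invented helper names and test packets, not a reimplementation of the module.

```python
import struct

def field(packet, offset, mask=0xFFFFFFFF, shift=0):
    """Return (4 bytes at offset >> shift) & mask, or None if outside the packet."""
    if offset < 0 or offset + 4 > len(packet):
        return None
    return (struct.unpack_from("!I", packet, offset)[0] >> shift) & mask

def is_tcp_seq_41(packet):
    """Model of "6&0xFF=0x6 && 4&0x1FFF=0 && 0>>22&0x3C@4=0x29"."""
    if field(packet, 6, 0xFF) != 6:          # protocol (byte 9) must be TCP
        return False
    if field(packet, 4, 0x1FFF) != 0:        # fragment offset must be 0
        return False
    ihl = field(packet, 0, 0x3C, shift=22)   # IP header length in bytes
    return field(packet, ihl + 4) == 0x29    # None (out of range) never equals 0x29

tcp = struct.pack("!HHI", 1025, 80, 41)
whole = bytes.fromhex("4500002800010000400600000a0000010a000002") + tcp
later_fragment = bytes.fromhex("4500002800010005400600000a0000010a000002") + tcp
truncated = whole[:26]                       # cut off in the middle of the sequence number
```

Only `whole` matches; the later fragment fails the offset test, and the truncated packet fails because bytes 24-27 aren't all there.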
Let's look for ICMP Host Unreachables (ICMP, type 3, code 1).
Just as in the above example, we need to check for the Protocol field (Protocol 1 = ICMP this time) and that we're looking at a complete packet or at least the first fragment:
6&0xFF=1 && 4&0x1FFF=0
To check for the ICMP Type and Code, we skip over the IP header again ("0>>22&0x3C@..."). To grab the first two bytes, we'll start at offset 0 and just right shift 16 bits. The final test is:
iptables -m u32 --u32 "6&0xFF=1 && 4&0x1FFF=0 && 0>>22&0x3C@0>>16=0x0301"
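If it isn't obvious why Type 3 and Code 1 turn into the single value 0x0301, this short Python sketch builds a made-up ICMP packet and does the same jump and shift. The packet is zero filler apart from the fields we read, so the checksum is not valid.

```python
import struct

icmp = struct.pack("!BBH", 3, 1, 0) + bytes(4)   # type 3, code 1, checksum left at 0
# 20-byte IP header: version/IHL 0x45, protocol (byte 9) = 1 for ICMP, rest zero
packet = bytes([0x45]) + bytes(8) + bytes([1]) + bytes(10) + icmp

ihl = (struct.unpack_from("!I", packet, 0)[0] >> 22) & 0x3C    # IP header length in bytes
type_code = struct.unpack_from("!I", packet, ihl)[0] >> 16     # top two bytes of the ICMP header
```

The right shift throws away the checksum bytes, leaving Type in the high byte and Code in the low byte.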
Let's try going all the way into the packet payload now, and match packets that are UDP DNS queries. Here we're not only going to check for destination port 53, but we're also going to test the top bit of byte 2 of the payload, the QR bit; if it is clear (0), this is a DNS query, and if it is set (1), this is a response.
We start by checking that this is a UDP packet: "6&0xFF=17". We add the now familiar check for first fragment or unfragmented packet:
4&0x1FFF=0
To test the destination port, we grab bytes 2 and 3 from the udp header (after jumping over the IP header as in the previous examples):
0>>22&0x3C@0&0xFFFF=53
If the packet has passed all of the above, we go back to check the payload (remember we have to jump over the variable-length IP and 8 byte UDP headers "0>>22&0x3C@8 ...") to make sure this is a DNS query rather than a response. To grab the high bit from byte 2, I'll use offset 8 to grab the first 4 payload bytes, right shift 15 bits to deposit the QR bit in the lowest position, and throw away all the rest of the bits with a mask of 0x01:
0>>22&0x3C@8>>15&0x01=0
The final test is:
iptables -m u32 --u32 "6&0xFF=17 && 4&0x1FFF=0 && 0>>22&0x3C@0&0xFFFF=53 && 0>>22&0x3C@8>>15&0x01=0"
Ugh. I've seen stellar noise that had less entropy :-) Note that we're doing the whole thing with u32 checks; we could pull out the "udp", "first/no fragment" and "port 53" checks into other modules, and end up with this slightly more readable version:
iptables -p udp --dport 53 \! -f -m u32 --u32 "0>>22&0x3C@8>>15&0x01=0"
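To see what that last shift and mask actually pull out, here's a Python sketch that runs the same arithmetic over two invented DNS messages. In the DNS header (RFC 1035) the QR bit is 0 in a query and 1 in a response. The helper name, port numbers and message IDs are placeholders, and the UDP length and checksum are left at zero.

```python
import struct

def qr_bit(udp_datagram: bytes) -> int:
    """Model of "@8>>15&0x01", counted from the start of the UDP header."""
    (word,) = struct.unpack_from("!I", udp_datagram, 8)   # first 4 DNS bytes: ID, then flags
    return (word >> 15) & 0x01

udp_header = struct.pack("!HHHH", 40000, 53, 0, 0)          # source port 40000, destination 53
query = udp_header + struct.pack("!HH", 0x1234, 0x0100)     # flags 0x0100: QR=0, RD=1
response = udp_header + struct.pack("!HH", 0x1234, 0x8180)  # flags 0x8180: QR=1, RD=1, RA=1

bits = (qr_bit(query), qr_bit(response))
```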
First, a recap of the above, then some additional tests.
And now, some new tests:
Don Cohen wrote the u32 module, and also wrote some (if you'll forgive me) somewhat cryptic documentation inside the source code for the module. William Stearns wrote this text, which borrows some examples and concepts from Don's documentation. Many thanks to Don for reviewing an early draft of this article. Thanks also to Gary Kessler and SANS for making the TCP/IP pocket reference guide freely available.