How the passive mode in P2P programs like KaZaA and Gnutella works

I recently got a Very Clever™ friend of mine who works with networks to answer this question for me, and I thought since it was a very good explanation that I might as well post it here, so other people can find it, and gosh, maybe in some small way I'll actually have increased the information content of the Web. This text is an edited copy of a chat log.

First off, "passive mode" is really an FTP term. One of the features of FTP is that, by default, it uses different ports for control and data. This is so that clever things can be done. It is, for example, possible to use an FTP client to connect to a server and tell that server to transfer something from another server. In the default ("active") mode, the client tells the server which address and port it is listening on, and the server opens the data connection back to the client. When stateful firewalls were introduced - the kind that record outgoing sessions and allow the response traffic to come back in without needing rules to deal with it in both directions - they didn't start off terribly smart, so that incoming data connection, which matched no outgoing session, got blocked. These days, firewalls can look at the commands going over the FTP control port and dynamically open up the incoming data port that the server is going to use when it connects back to the client.
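To make the "look at the commands" part concrete, here is a minimal sketch of the one thing a firewall has to pull out of the control traffic: the address and port in an active-mode `PORT` command. The six-number format is the standard FTP one; the function name and the sample address are just made up for illustration, and a real firewall does a great deal more than this.

```python
def parse_port_command(line):
    """Return (ip, port) from an FTP 'PORT h1,h2,h3,h4,p1,p2' command line."""
    verb, _, arg = line.strip().partition(" ")
    if verb.upper() != "PORT":
        raise ValueError("not a PORT command")
    fields = [int(x) for x in arg.split(",")]
    if len(fields) != 6 or any(not 0 <= f <= 255 for f in fields):
        raise ValueError("malformed PORT argument")
    ip = ".".join(str(f) for f in fields[:4])
    # The port is sent as two bytes, high byte first.
    port = fields[4] * 256 + fields[5]
    return ip, port


# The client is asking the server to connect back to 192.168.0.10, port 5001.
print(parse_port_command("PORT 192,168,0,10,19,137"))
```

Once the firewall knows that address and port, it can let the server's incoming data connection through for the length of the transfer.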

Then NAT arrived. This hosed things slightly, but only for a short while. The problem with NAT was that even if the firewall/NAT device inspected the control port and opened up the ports, if the machine had a private IP address then the server at the other end wouldn't be able to connect back into it. Again, as a problem, this was eventually solved - using dynamic reverse-mapping and translation of ports on the public address range back to private addresses. However, this was all a bit of a faff, basically. And for what most people wanted to do with FTP (download or upload files) there was a simple solution. The simple solution is to have the client initiate all of the connections to the server: the server says which port it is listening on for data, and the client connects out to it. This meant that stateful firewalls coped, because the client inside the firewall had opened the connection. If NAT was en route, again it didn't matter - the session was opened from the inside. This is what "passive mode" is.
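Here is a toy model of why that works, as a rough sketch rather than anything resembling a real firewall. The `StatefulNat` class, the host names and port 50000 are all invented for the example; the only real numbers are FTP's usual control port (21) and active-mode data port (20).

```python
class StatefulNat:
    """Toy stateful NAT/firewall: only sessions opened from the inside are let back in."""

    def __init__(self):
        self.sessions = set()  # (inside_host, outside_host, outside_port)

    def connect_out(self, inside_host, outside_host, outside_port):
        # Outgoing connections are always allowed, and remembered.
        self.sessions.add((inside_host, outside_host, outside_port))
        return True

    def connect_in(self, outside_host, outside_port, inside_host):
        # Incoming traffic passes only if it belongs to a remembered session.
        return (inside_host, outside_host, outside_port) in self.sessions


def data_connection_works(passive):
    nat = StatefulNat()
    nat.connect_out("client", "server", 21)  # control connection, opened from inside
    if passive:
        # Passive mode: the server names a port (50000 is made up here)
        # and the client dials out to it.
        return nat.connect_out("client", "server", 50000)
    # Active mode: the server dials in from port 20, which matches no session.
    return nat.connect_in("server", 20, "client")


print(data_connection_works(passive=False))  # active mode is blocked
print(data_connection_works(passive=True))   # passive mode gets through
```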

The downside with this, of course, is that you can't have the three-way dialog (in which the client tells one server to talk to another server). Now, what no-one ever ran into with FTP was a situation such as the one which can arise with peer-to-peer filesharing applications. If you are running an FTP server, then it needs to have some form of public address - if it is behind a NAT firewall, then it needs a bunch of ports mapped to a public address anyway. With peer-to-peer though, you can have a situation where people are behind a firewall and have neither a public address nor control over the port mapping. The way that searches tend to work - taking Gnutella as an example - is that the client sends the query to the set of peers it has connections to. Those peers then forward the query on to all their peers. And so on. Each hop knocks one off the TTL on the query, and once the TTL runs out the query stops being forwarded.
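The forwarding described above can be sketched as a small simulation. This is a simplification under stated assumptions, not Gnutella's actual wire protocol: the peer names and the `flood_query` function are invented, and the sketch assumes a peer ignores a query it has already seen.

```python
from collections import deque


def flood_query(peers, origin, ttl):
    """Return {peer: hops} for every peer the query reaches before its TTL runs out."""
    reached = {origin: 0}
    queue = deque([(origin, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if remaining == 0:
            continue  # TTL used up: this peer does not forward the query
        for neighbour in peers[node]:
            if neighbour not in reached:  # assumed: duplicates are dropped
                reached[neighbour] = reached[node] + 1
                queue.append((neighbour, remaining - 1))
    return reached


# Each peer only knows about the peers it already has connections to.
peers = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D", "F"],
    "F": ["E"],
}

print(flood_query(peers, "A", ttl=2))  # E and F are too far away to hear the query
```

Note that "A" only ever talks to "B" and "C" directly. Everything further out hears the query second-hand, over connections that other peers already had open.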

The problem is that there are no sessions beyond the first-hop peers, so there's no direct way for a machine further out to get back to the first machine. The original client hasn't actually established connections to probably 95% of the other machines. Those queries have flowed along existing TCP connections which were already open. So, let's say our search succeeds. We have a rare file we want to download. Problem is that the file is only on one machine, and that machine is behind a NAT firewall with no static port mapping, so we can't connect to it. If we have a public IP address, no problem, because that machine can find us: we send a message -- along the same route we used to send the query, the web of machines -- to tell that machine to connect to us and 'push' the file to us. When passive mode works, that's what happens.
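As a rough sketch of that 'push' request: the message has to be relayed hop by hop along the route the query took, because that is the only path that exists between the two machines. Everything here (the peer names, the message fields, the two functions) is made up for illustration and is not the real Gnutella message layout.

```python
from collections import deque


def query_route(peers, origin, target):
    """Flood outwards from origin; return the hop-by-hop route to target, or None."""
    came_from = {origin: None}  # each peer remembers who handed it the query
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbour in peers[node]:
            if neighbour not in came_from:
                came_from[neighbour] = node
                queue.append(neighbour)
    if target not in came_from:
        return None
    route = []
    node = target
    while node is not None:
        route.append(node)
        node = came_from[node]
    return route[::-1]


def push_request(peers, requester, holder, filename):
    """Build a push request and list the peers that must relay it to the holder."""
    route = query_route(peers, requester, holder)
    if route is None:
        return None
    return {
        "type": "PUSH",
        "file": filename,
        "connect_to": requester,     # the holder dials out to this public address
        "relayed_via": route[1:-1],  # the peers in between just pass it along
    }


peers = {
    "me": ["p1"],
    "p1": ["me", "p2"],
    "p2": ["p1", "holder"],
    "holder": ["p2"],
}

print(push_request(peers, "me", "holder", "rare-file.ogg"))
```

The request itself is tiny, so sending it over the web of peers is cheap. The file then travels over the new direct connection that the holder opens outwards through its own NAT.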

But if we are also behind a NAT firewall, we have a problem. The only way we can get traffic between the two machines is over the web of peers. Neither of us can make that initial connection to deal with the NAT/firewall issue. And it would be terribly inefficient to use the peer web's bandwidth for anything other than queries, since that traffic would be flowing over all sorts of links, like dial-ups. So although the two machines can talk to each other, they can't start the efficient, direct connection they need to transfer files.
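Boiled down, the whole thing comes to a four-way table, sketched below. The function name and the labels are my own; "reachable" here just means the machine has a public address or a port mapped through its NAT.

```python
def transfer_method(downloader_reachable, holder_reachable):
    """Say how a file can get from the holder to the downloader, if at all."""
    if holder_reachable:
        return "direct"  # the downloader simply connects to the holder
    if downloader_reachable:
        return "push"    # the holder is asked to connect out to the downloader
    return "none"        # both behind NAT: nobody can open the connection


for downloader in (True, False):
    for holder in (True, False):
        print(downloader, holder, transfer_method(downloader, holder))
```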