Copying Loads of Data with netcat

Recently I had to set up a shiny new database server at Bandzoogle. We've got a lot of members with a lot of data, so dumping the database and then reimporting it was not an option, and even if it were, that's a problematic enterprise at best. So, to move the data I used netcat, which is one of the cooler things in the world of UNIX.

According to the homepage:

Netcat is a featured networking utility which reads and writes data across network connections, using the TCP/IP protocol.

It is designed to be a reliable "back-end" tool that can be used directly or easily driven by other programs and scripts. At the same time, it is a feature-rich network debugging and exploration tool, since it can create almost any kind of connection you would need and has several interesting built-in capabilities.

There are a lot of things you can do with netcat, if you remember that it exists, which tends to be my problem. Generally I'll struggle along with some hackish method of copying files, instead of remembering that I have this nifty tool available.

Anyway, this time I remembered and gave it a go. Basically, I tar'ed up the content on the old server, sent it over the wire to the new server, and untar'ed it there, but all on the fly, and with netcat in the middle. Here's how it looked.

On the new server, I ran this command:

nc -l 12378 | tar -xvf -

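That tells the new server to listen on port 12378 for a connection from another machine, and to pipe whatever arrives through tar, which unpacks it into the current directory. (Some netcat variants want the listening port written as -l -p 12378 instead.) Then I stopped MySQL on the old database server, and from /var/lib I ran the following command: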

tar -cvf - --exclude="mysql/huge_database" mysql | nc dest_ip_address 12378

The first part of this command tars up /var/lib/mysql and writes the archive to stdout, excluding one particularly large database that didn't need to be on the new server. The output from tar is piped to netcat, which connects to port 12378 on the destination server and sends the data over. There was a lot of data, so it took some time, but it was significantly faster than sending it via scp, which is what I would normally do, especially because both machines are on their own private network. In fact, if they weren't sitting next to each other, this wouldn't have been an option, because the data transfer is completely unencrypted in this setup.
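If you want to rehearse the tar half of this before pointing it at real data, here is a minimal sketch that runs entirely on one machine. A plain pipe stands in for the netcat pair, and the directory names (old, new, keep_db) are made up for the demo. It also assumes tar matches the exclude pattern against the member name as it appears in the archive, which is why the pattern is written relative to the directory being archived.

```shell
set -eu

# Throwaway directories standing in for the old and new servers (hypothetical layout).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/old/mysql/keep_db" "$work/old/mysql/huge_database" "$work/new"
echo "small" > "$work/old/mysql/keep_db/table.frm"
echo "large" > "$work/old/mysql/huge_database/table.frm"

# Sender side: archive the relative path "mysql" and skip one database.
# Receiver side: unpack whatever arrives on stdin.
# The pipe between them stands in for "nc dest 12378" and "nc -l 12378".
(cd "$work/old" && tar -cf - --exclude="mysql/huge_database" mysql) |
    (cd "$work/new" && tar -xf -)

# List what arrived; huge_database should not be in the listing.
find "$work/new" | sort
```

Once the listing looks right, swapping the pipe for the two netcat commands gives you the real transfer.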

Anyway, at that point I had my data. I put it in place, checked that the permissions were correct, and before too long I had a nice new database server that was ready for work.
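For the permissions check, something along these lines would do. This is only a sketch: check_owner is a hypothetical helper, and the idea that the data directory should belong to a mysql user is an assumption about a typical install, not something specific to this server. The demo runs against a temp directory owned by whoever runs it, so it is safe to try anywhere.

```shell
# List anything under a directory that is NOT owned by the expected user.
# check_owner is a hypothetical helper name used only for this sketch.
check_owner() {
    dir=$1
    user=$2
    find "$dir" ! -user "$user" -print
}

# On a real server the call might look like this (assumes a "mysql" user):
#   check_owner /var/lib/mysql mysql

# Self-contained demo: a temp directory owned by the current user.
demo=$(mktemp -d)
touch "$demo/ibdata1"
mismatches=$(check_owner "$demo" "$(id -u)")
echo "mismatched entries: ${mismatches:-none}"
rm -rf "$demo"
```

An empty result means everything under the directory has the expected owner; anything it prints is a candidate for chown.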
