Thursday, November 08, 2007

Using cURL for FTP over SSL file transfers

I recently helped a client work through some errors while trying to transfer a file over a secure FTP connection (FTP over SSL) with cURL. If you haven't used curl, it is a great tool that lends itself to scripted data transfers quite nicely. I'll quote from the curl website:
curl is a command line tool for transferring files with URL syntax, supporting FTP, FTPS, HTTP, HTTPS, SCP, SFTP, TFTP, TELNET, DICT, LDAP, LDAPS and FILE. curl supports SSL certificates, HTTP POST, HTTP PUT, FTP uploading, HTTP form based upload, proxies, cookies, user+password authentication (Basic, Digest, NTLM, Negotiate, kerberos...), file transfer resume, proxy tunneling and a busload of other useful tricks.

Anyway, using curl with FTP over SSL is usually done something like this:

curl -3 -v --cacert /etc/ssl/certs/cert.pem \
    --ftp-ssl -T "/file/to/upload/file.txt" \

Let's go over these options:
  • -3: Force the use of SSL v3.
  • -v: Gives verbose debugging output. Lines starting with ">" mean data sent by curl. Lines starting with "<" show data received by curl. Lines starting with "*" display additional information presented by curl.
  • --cacert: Specifies which file contains the SSL certificate(s) used to verify the server. This file must be in PEM format.
  • --ftp-ssl: Try to use SSL or TLS for the FTP connection. If the server does not support SSL/TLS, curl will fall back to unencrypted FTP.
  • -T: Specifies a file to upload.

The last part of the command line is simply a way to specify the username, password, host and port all in one shot.

How FTP Works

Before I get to the problem, I need to explain a bit about how FTP works. FTP operates in one of two modes - active or passive. In active mode, the client connects to the server on a control port (usually TCP port 21), then starts listening on a random high port and sends this port number back to the server. The server then connects back to the client on the specified port (usually the server's source TCP port is 20). Active mode isn't used much or even recommended anymore, since the reverse connection from the server to the client is frequently blocked, and can be a security risk if not handled properly by intervening firewalls.

Contrast this with passive mode, in which the client makes an initial connection to the server on the control port, then waits for the server to send an IP address and port number. The client connects to the specified IP address and port and then sends the data. From a firewall's perspective, this is much nicer, since the control and data connections are in the same direction and the ports are well-defined. Most FTP clients now default to passive mode, curl included.
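To make the active-mode step concrete: the client tells the server where to connect by sending a PORT command, whose argument (per RFC 959) is the client's IP address and port written as six comma-separated decimal numbers. Here is a minimal sketch of that encoding. The function name port_arg, the address 192.0.2.10 and the port 50000 are all made up for illustration; this is not something curl exposes.

```shell
#!/bin/sh
# port_arg: hypothetical helper, for illustration only.
# Usage: port_arg IPV4_ADDRESS PORT
# Prints the argument a client would send with the FTP PORT command.
port_arg() {
    ip=$1
    port=$2
    # The four address bytes are sent with commas instead of dots.
    hosts=$(printf '%s' "$ip" | tr '.' ',')
    # The port is split into a high byte and a low byte.
    printf '%s,%s,%s\n' "$hosts" "$(( port / 256 ))" "$(( port % 256 ))"
}

# Example with an illustrative address and port:
port_arg 192.0.2.10 50000
```

The server's reply to a passive mode request uses the same six-number layout, which is why it matters later on.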

The problem

Now, a problem can arise when the server sends back the IP address from a passive mode request. If the server is not configured properly, it will send back its own host IP address, which is almost always a private IP address and different from the address the client connected to. Usually a firewall or router is doing Network Address Translation (NAT) to map requests from the server's public IP address to the server's internal IP address. When the client gets this private IP address from the server, it tries to connect to a non-routable address and the connection times out. How do you know when this problem has manifested itself? Take a look at this partial debug output from curl:

...
> PASV
< 227 Entering Passive Mode (172,19,2,90,41,20)
* Trying

Here the client has sent the PASV command, which asks the server for a passive data connection. The server returns a string of six decimal numbers, representing the IP address (first four numbers) and port (last two numbers). Here the IP address is 172.19.2.90 - a non-routable IP address as per RFC 1918. When the client tries to connect to this address, it will fail.
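If you want to decode a 227 reply yourself, the arithmetic is simple: the port is the fifth number times 256 plus the sixth. The following is only a sketch; parse_pasv and is_rfc1918 are hypothetical helper names, not curl features, and the range check covers just the three RFC 1918 blocks (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).

```shell
#!/bin/sh
# parse_pasv: hypothetical helper, for illustration only.
# Takes a 227 reply line and prints "IP PORT".
parse_pasv() {
    # Keep only the text between the parentheses: h1,h2,h3,h4,p1,p2
    nums=$(printf '%s\n' "$1" | sed -n 's/.*(\([0-9,]*\)).*/\1/p')
    [ -n "$nums" ] || return 1
    # Split the six numbers on commas into $1..$6.
    old_ifs=$IFS
    IFS=,
    set -- $nums
    IFS=$old_ifs
    [ "$#" -eq 6 ] || return 1
    # The port is the high byte times 256 plus the low byte.
    printf '%s.%s.%s.%s %s\n' "$1" "$2" "$3" "$4" "$(( $5 * 256 + $6 ))"
}

# is_rfc1918: succeeds if the dotted-quad address is in a private range.
is_rfc1918() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
        *) return 1 ;;
    esac
}

reply='227 Entering Passive Mode (172,19,2,90,41,20)'
parse_pasv "$reply"
```

For the reply in the debug output, this prints 172.19.2.90 10516, and is_rfc1918 172.19.2.90 succeeds, which confirms that the server is handing out an address the client cannot reach.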

The solution...sort of

In 1998 RFC 2428 was released, which specified 'Extended Passive Mode', specifically meant to address this problem. In extended passive mode, only the port is returned to the client; the client assumes the IP address of the server has not changed. The problem with this solution is that many FTP servers still do not support extended passive mode. If you try, you will see something like this:

> EPSV
* Connect data stream passively
< 500 'EPSV': command not understood.
* disabling EPSV usage
> PASV
< 227 Entering Passive Mode (172,19,2,90,41,20)
* Trying

...and we're back to the same problem again.
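For comparison, when a server does support EPSV, RFC 2428 defines the reply as code 229 with only a port between delimiters, typically written as (|||port|). The sketch below pulls the port out of such a reply. The helper name parse_epsv is hypothetical, and the sample 229 line is constructed from the RFC's format rather than taken from a real session; it also assumes the usual "|" delimiter.

```shell
#!/bin/sh
# parse_epsv: hypothetical helper, for illustration only.
# Takes a 229 reply line and prints the data port.
# Fails for any other reply, such as the 500 error shown above.
parse_epsv() {
    port=$(printf '%s\n' "$1" |
        sed -n 's/^229 .*(|||\([0-9][0-9]*\)|).*/\1/p')
    [ -n "$port" ] || return 1
    printf '%s\n' "$port"
}

# Sample reply in the RFC 2428 format (the port number is made up):
parse_epsv '229 Entering Extended Passive Mode (|||10516|)'
```

Because no address appears in the reply at all, there is nothing for a misconfigured server to get wrong; the client simply reuses the address of the control connection.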

The Real Solution

Curl has a neat solution to this problem, requiring two additional options. The first is --disable-epsv, which prevents curl from sending the EPSV command - it will just default to standard passive mode. The second is --ftp-skip-pasv-ip, which tells curl to ignore the IP address returned by the server, and to connect back to the server IP address specified in the command line. Let's put it all together:

curl -3 -v --cacert /etc/ssl/certs/cert.pem \
    --disable-epsv --ftp-skip-pasv-ip \
    --ftp-ssl -T "/file/to/upload/file.txt" \

If this succeeds, you'll see something like this:

* SSL certificate verify ok.
...
< 226- Transfer complete - acknowledgment message is pending.
< 226 Transfer complete.
> QUIT
< 221 Goodbye.

The final 226 Transfer complete is the sign that the file was transferred to the server successfully.
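In a script you may want to check for that final 226 automatically. Here is one rough way to do it, assuming the verbose output was saved to a file (curl writes its -v output to stderr, so something like 2> curl.log on the command line would capture it). The function name transfer_succeeded and the log file name are made up for this sketch.

```shell
#!/bin/sh
# transfer_succeeded: hypothetical helper, for illustration only.
# Usage: transfer_succeeded LOGFILE
# Succeeds if the saved verbose output contains a final 226 reply.
transfer_succeeded() {
    # "226-" marks a continuation line of a multi-line reply;
    # "226 " (with a space) marks the last line, so match only that.
    grep -q '^< 226 ' "$1"
}

# Example use, assuming curl.log holds the saved verbose output:
if [ -f curl.log ] && transfer_succeeded curl.log; then
    echo "upload confirmed by server"
fi
```

Checking curl's exit status (zero on success) is the more robust test; scanning the log is mostly useful as a second opinion when debugging.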
