A co-worker is convinced that ftp is the way to go - "more reliable."
My preference is scp (partly for security, partly to avoid the messiness
of expect scripts to run ftp).
My question is this: if a file is transferred via ftp, is there a
significantly greater or lesser likelihood of it arriving corrupted than
if it were transferred via scp? Which would be more tolerant of an erratic
network connection? Obviously I can do checksums at either end, but how
much error checking (if any) do either of these file-transfer methods do?
Probably of more significance, how tolerant could each be made to network
timeouts?
Is there _any_ reason, other than speed of course, to use ftp in preference
to scp?
Chris Moll
Unix sys admin
Lawrence Berkeley Laboratory
SCP uses TCP just like FTP, but also does encryption (and optionally
compression) of the data; someone else can speak about any additional
integrity checks (aside from your data coming out "wrong" if the encryption
or compression fails or is tampered with in the middle) that might be there.
At worst, they are exactly equivalent: they both depend on TCP.
Erik Fair <fa...@clock.org>
I have been thinking over your request for a few days--here are my
very old 2 cents, for what they're worth. You should get some more
information from your co-worker as to just why FTP is "more reliable"
than scp. Without some more local facts, it's not directly clear how
to make a case one way or the other. Are we talking transport or
implementation?
From a transport standpoint, both protocols are layered on top of TCP,
so you are largely dependent both on the TCP stack and possibly on how
well that particular TCP and FTP or scp is implemented on a specific
platform. Some platforms, such as Windows for Workgroups, Windows/NT
and OS/2 have multiple FTP implementations.
For the most part, FTP is specified not to check for data integrity
(see section 3.5 of RFC 959). The FTP architecture does have
something called block mode (see 3.4.2 in RFC 959) which can be used
for checkpoint, restart and recovery. I am not aware of it
being used as such in most common implementations. It is conceptually
straightforward to recover a broken data connection and harder to
restart a broken control connection.
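For what it's worth, some ftp clients also offer a "reget" command
that restarts a retrieval at the current size of the local file.  That
is stream-mode restart rather than block mode proper, and both your
client and the server have to support it, but where it works it covers
the broken-data-connection case.  A rough, untested sketch with
invented names:

    ftp -n ftp.example.com <<'EOF'
    user anonymous me@somewhere.example
    binary
    reget bigfile
    quit
    EOF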
Obviously, scp is different here--it collects your command arguments
and then forks an ssh process to negotiate a secure channel over which
your data is transferred. Since you are encrypting data, by definition
you know when you have integrity issues. However, if you look at
packet.c, you'll see that ssh currently does not do any recovery from
network error. Recovery under scp may not be as easy as FTP since you
need to negotiate a new security context.
A case could be made for FTP reliability in terms of known problems
and issues. FTP has been around for a very long time--it appears to
have been first conceptualized around 1971. FTP predates TCP and
was used before that on something called NCP. I remember using it on
Tops-20 as early as 1978. Thus, FTP is very well known and understood
and all reasonable TCP/IP implementations should be expected to talk
it. This is nice if you want to talk to arcane platforms such as
MVS/TSO, Tops-20 and old CDC machines.
FTP's disadvantage is that its generality can lead to complex
implementations that are not well integrated with the operating
system. This is the case under MVS/TSO. The Microsoft Windows/NT 3.51
FTP server is another case that comes to mind.
scp's advantage is that it is far newer and can take advantage of
newer software technology. In addition, it currently does not have to
contend with as many TCP/IP implementations as FTP. It appears to be
more closely integrated into Unix, particularly in the area of
maintaining Unix specific file attributes such as modification times
and modes. It will probably be more reasonable about returning useful
process exit codes than FTP will.
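As a small, concrete example (the names here are invented), the -p
flag asks scp to preserve the source file's modification time and mode
bits on the far end:

    scp -p /etc/motd backuphost:/var/backups/motd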
In terms of speed, scp has built-in compression. The FTP architecture
has certain kinds of compression, but not all implementations are
required to have them. Currently, Tops-20, Digital Unix FTP, Olivetti
Unix FTP and Windows/NT 3.51 FTP do not support compressed mode (MODE C).
IBM MVS FTP (EZAFTSRV V3R2) does.
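Turning scp's compression on is just a flag; an invented example
(check your local man page for the exact option on your build):

    scp -C bigdump.tar remotehost:/scratch/bigdump.tar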
All of this leads to the following points, but please remember that
your mileage may vary:
1) Speed (example invocations for cases a-d follow this list):
a) If both FTP and scp are network I/O bound and scp has both
encryption and compression turned off, then you should see
roughly equivalent transfer times (scp probably taking
*slightly* longer) because you are waiting for the network.
b) If both FTP and scp are network I/O bound and scp has only
encryption turned on, then you should see scp taking longer
because of the extra data being transferred.
c) If both FTP and scp are I/O bound and scp has both encryption
and compression turned on, then you may see scp taking
less time, depending on what you're transferring. [Remember,
the data is compressed BEFORE the encryption takes place.]
d) If both FTP and scp are I/O bound and you turn off encryption
with scp and enable compression, scp will run faster than case
c and may win over FTP where case c doesn't.
e) If you are processor bound, FTP may be able to beat out scp
even with encryption off and compression on, depending on what
you are transferring and the implementation.
f) If you can find two FTP's that cooperate about compression, you
may win over scp in case d.
2) Data Integrity:
a) scp always computes a 32-bit CRC (Cyclic Redundancy Check),
even with compression and encryption off. Thus, it checks data
integrity.
b) FTP does not check data integrity. It is possible that you
might fudge something up with block mode (MODE B) if your
particular FTP handles it.
3) Reliability:
a) scp does not recover and restart a broken network link.
b) The FTP architecture includes provisions for restart and
recovery, but these are not required in an implementation.
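If you want to measure cases 1a through 1d on your own network,
something like the following should do it.  Treat it as a sketch: the
host and file names are invented, and the "none" cipher is a
compile-time option that not every ssh build includes.

    time scp -c none    bigfile remotehost:/tmp/bigfile  # 1a: no encryption, no compression
    time scp            bigfile remotehost:/tmp/bigfile  # 1b: encryption only
    time scp -C         bigfile remotehost:/tmp/bigfile  # 1c: encryption plus compression
    time scp -C -c none bigfile remotehost:/tmp/bigfile  # 1d: compression only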
Looking at all this, I believe that if I were in your shoes I
would go with scp because of the integrity checking, compression and
batch file issues. The secure option is a very nice plus. For now,
if scp reports an error, I would just restart the transfer.
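If you end up scripting the "just restart it" approach, a tiny wrapper
along these lines would do.  The names are invented, and the checksum
comparison at the end is simply the "checksums at either end" idea
from your original question:

    #!/bin/sh
    # Retry the copy until scp exits zero, then compare checksums.
    FILE=bigdump.tar
    until scp "$FILE" remotehost:/scratch/"$FILE"
    do
        echo "scp failed, retrying in 60 seconds..." >&2
        sleep 60
    done
    cksum "$FILE"
    ssh remotehost cksum /scratch/"$FILE"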
For gigantic files, you may need to write a small utility that snips
out some number of blocks at a time and then ensure that they are
transferred with scp. That way, you'd never have to retransmit the
whole file. You would need a corresponding utility on the target
(possibly a small shell script) to append the blocked files together,
one at a time while deleting the previous block so as not to overflow
your file system.
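Something like the following is what I have in mind -- treat it as an
untested sketch with invented host, directory and size values rather
than a finished tool.  It folds the "snip out blocks" utility and the
append-on-the-target step into a single loop by running the append
over ssh after each piece lands:

    #!/bin/sh
    # Carve FILE into pieces with dd, ship each piece with scp, have
    # the far end append it to the growing copy and delete it, repeat.
    FILE=hugefile
    DIR=/scratch                   # working directory on the target
    CHUNK=102400                   # piece size in 1k blocks (100 MB)
    i=0
    while :
    do
        dd if="$FILE" of=piece.$i bs=1k count=$CHUNK \
            skip=`expr $i \* $CHUNK` 2>/dev/null
        test -s piece.$i || { rm -f piece.$i; break; }  # past end of file
        until scp piece.$i remotehost:$DIR/piece.$i
        do
            sleep 60               # retry this piece until scp succeeds
        done
        ssh remotehost "cat $DIR/piece.$i >> $DIR/$FILE && rm $DIR/piece.$i"
        rm -f piece.$i
        i=`expr $i + 1`
    done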
Sorry I've taken so long and I hope I've been of some help,
--T