"Linux Gazette...making Linux just a little more fun!"
The Internet is a big, wide world of information, which is really great, but when one works with limited Internet access, retrieving large amounts of data can become a nightmare. This is my particular case. I work at the National Astronomic Observatory, Universidad Nacional de Colombia. Its Ethernet LAN is attached to the university's main ATM network backbone. However, the external connection to the world goes through a 64K channel, and that is a real problem when more than 500 users surf the net during the day, for the Internet becomes painfully slow. Matters change when night falls and there is nobody on campus: the transmission rate grows to acceptable levels, and one can easily download large quantities of information (for example, a whole Linux distribution). But as we are mortal human beings, it is not very advisable to stay awake every night working at the computer. A solution then arises: program the computer so that it works while we sleep. Now the question is: how do you program a Linux box to do that? This is the reason I wrote this article.
In this article I cover ftp connections only. I have not yet worked on http connections; if you have, please tell me.
At first sight, a solution comes up intuitively: use the at command to schedule an action at a given time. Let's recall what a simple ftp session looks like (the user's input is whatever follows a prompt such as bash$, Name, Password or ftp>):
bash$ ftp anyserver.anywhere.net
Connected to anyserver.anywhere.net.
220 anyserver FTP server (Version wu-2.4(1) Tue Aug 8 15:50:43 CDT 1995) ready.
Name (anyserver:theuser): anonymous
331 Guest login ok, send your complete e-mail address as password.
Password:(an e-mail address)
230 Guest login ok, access restrictions apply.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd pub
ftp> bin
ftp> get anyfile.tar.gz
150 Opening BINARY mode data connection for anyfile.tar.gz (3217 bytes).
226 Transfer complete.
3217 bytes received in 0.0402 secs (78 Kbytes/sec)
ftp> bye
221 Goodbye.
bash$
You can write a small shell script containing the steps that at will execute. To handle the internal ftp commands inside a shell script, you can use a shell syntax feature that lets you embed in the script the data that would otherwise come from standard input. This is called a "here" document:
#! /bin/sh
echo This will use a "here" document to embed ftp commands in this script
# Begin of "here" document
ftp <<**
open anyserver.anywhere.net
anonymous
mymail@mailserver.net
cd pub
bin
get anyfile.tar.gz
bye
**
# End of "here" document
echo ftp transfer ended.
Note that all the data between the ** strings is sent to the ftp program as if it had been typed by the user. So this script would open an ftp connection to anyserver.anywhere.net, log in as anonymous with mymail@mailserver.net as the password, and retrieve the file anyfile.tar.gz from the pub directory using binary transfer mode. In theory this script looks fine, but in practice it won't work. Why? Because the ftp program does not accept the user name and password via a "here" document, so in this case ftp will react with an "Invalid command" to anonymous and mymail@mailserver.net. Obviously the ftp server will reject the session when no login and password data are sent.
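The "here" document mechanism itself can be tried safely without any network connection. The following is a minimal sketch that feeds the same kind of embedded text to wc and cat instead of ftp; the function names count_lines, show_expanded and show_literal are made up for this illustration.

```shell
#!/bin/sh
# Illustration only: the function names below are hypothetical.

# Feed three lines to wc -l through a here document, the same way the
# article's script feeds commands to ftp. The ** delimiter is the one
# used in the article; any word that does not occur in the data works.
count_lines() {
    wc -l <<**
cd pub
bin
get anyfile.tar.gz
**
}

# With an unquoted delimiter the shell expands variables inside the
# here document before the program sees the text.
show_expanded() {
    file=anyfile.tar.gz
    cat <<**
get $file
**
}

# Quoting the delimiter turns expansion off, so the text is passed on
# literally, dollar sign included.
show_literal() {
    cat <<'**'
get $file
**
}
```

Calling count_lines reports three lines, show_expanded prints the get command with the file name filled in, and show_literal prints the text unchanged. The closing delimiter must stand alone at the very start of its line.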
The solution lies in a hidden file that ftp uses, called ~/.netrc, which must be located in your home directory. This file contains the information ftp needs to log in to a system, organized in three text lines:
machine anyserver.anywhere.net
login anonymous
password mymail@mailserver.net
For private ftp connections, the password field must hold the password of the account concerned, instead of an e-mail address as for anonymous ftp. This may open a security hole; for this reason ftp requires that the ~/.netrc file have no read, write, or execute permission for anybody except the user. This is easily done with the chmod command:
chmod go-rwx ~/.netrc
Now our shell script looks like this:
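Creating the file and restricting it afterwards leaves a short moment in which the password may be readable by others. As a sketch of one way to avoid that, the helper below sets a restrictive umask before writing the file. The name write_netrc and its argument order are assumptions made for this example; they are not part of ftp.

```shell
#!/bin/sh
# Sketch: write_netrc is a hypothetical helper, not part of ftp.
# Usage: write_netrc FILE MACHINE LOGIN PASSWORD
write_netrc() {
    netrc_file=$1
    (
        # The subshell keeps the umask change local. With umask 077 the
        # file is created with owner-only permissions from the start.
        umask 077
        rm -f "$netrc_file"
        {
            printf 'machine %s\n' "$2"
            printf 'login %s\n' "$3"
            printf 'password %s\n' "$4"
        } > "$netrc_file"
    )
    # Same command as in the article, kept as a safeguard.
    chmod go-rwx "$netrc_file"
}
```

A call such as write_netrc "$HOME/.netrc" anyserver.anywhere.net anonymous mymail@mailserver.net replaces any existing file, so make a backup copy first if the old entries are still needed.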
#! /bin/sh
echo This will use a "here" document to embed ftp commands in this script
# Begin of "here" document
ftp <<**
open anyserver.anywhere.net
cd pub
bin
get anyfile.tar.gz
bye
**
# End of "here" document
echo ftp transfer ended.
ftp will extract the login and password information from ~/.netrc and establish the connection. Say we called this script getdata (and made it executable with chmod ugo+x getdata). We can schedule its execution at a given time as follows (if the current directory is not in your PATH, type ./getdata instead of getdata):
bash$ at 1:00 am
getdata
(control-D)
Job 70 will be executed using /bin/sh
bash$
When you return in the morning, the requested data will be on your computer!
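Since nobody is watching when the job runs at 1:00 am, it may be worth checking the obvious failure points before scheduling it. The sketch below tests that the script is executable and that the netrc file has an entry for the server. The function name preflight and its arguments are assumptions made for this example.

```shell
#!/bin/sh
# Sketch: preflight is a hypothetical helper name.
# Usage: preflight SCRIPT NETRC_FILE HOST
# Returns 0 if SCRIPT is executable and NETRC_FILE has a
# "machine HOST" entry; otherwise prints a message and returns 1.
preflight() {
    script=$1
    netrc=$2
    host=$3
    if [ ! -x "$script" ]; then
        echo "$script is not executable" >&2
        return 1
    fi
    # Compare whole words, so dots in the host name are not treated
    # as wildcards the way they would be in a grep pattern.
    if ! awk -v h="$host" \
        '$1 == "machine" && $2 == h { found = 1 } END { exit !found }' \
        "$netrc" 2>/dev/null
    then
        echo "no entry for $host in $netrc" >&2
        return 1
    fi
    return 0
}
```

For example, preflight ./getdata "$HOME/.netrc" anyserver.anywhere.net && echo ready prints ready only when both checks pass. It does not verify that the server is reachable or that the password is accepted.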
Another useful way to use this script is:
bash$ nohup getdata &
[2] 131
bash$ nohup: appending output to 'nohup.out'
bash$
nohup lets the process it runs (in this case getdata) keep running even after the user logs out. So you can work on anything else while a set of files is retrieved in the background, or log out without killing the ftp child process.
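If you would rather choose the log file yourself than collect output in nohup.out, the streams can be redirected explicitly. The following is a small sketch of that idea; the function name run_detached is an assumption made for this example.

```shell
#!/bin/sh
# Sketch: run_detached is a hypothetical helper name.
# Usage: run_detached LOGFILE COMMAND [ARGS...]
run_detached() {
    log=$1
    shift
    # All three standard streams are redirected, so the job holds no
    # reference to the terminal and nohup has no reason to create
    # nohup.out. Everything the job prints goes to the chosen log file.
    nohup "$@" > "$log" 2>&1 < /dev/null &
    # $! is the process ID of the background job just started.
    echo "$!"
}
```

For instance, run_detached getdata.log ./getdata starts the transfer, prints its process ID, and returns at once. The ID can later be passed to kill if the transfer has to be stopped.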
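In short, you may follow these steps:
1. Create a ~/.netrc file with the login data for the server, and remove all permissions for group and others.
2. Write a shell script that embeds the ftp commands in a "here" document, and make it executable.
3. Schedule the script with at, or run it in the background with nohup.
And voilà.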
Additionally, you can add more features to the script, so that it automatically updates the ~/.netrc file and writes a log file showing when the transfer began and ended:
#!/bin/sh
# Makes a backup of the old ~/.netrc file
cp "$HOME/.netrc" "$HOME/netrc.bak"
# Configures a new ~/.netrc
rm -f "$HOME/.netrc"
echo machine anyserver.anywhere.net > "$HOME/.netrc"
echo login anonymous >> "$HOME/.netrc"
echo password mymail@mailserver.net >> "$HOME/.netrc"
chmod go-rwx "$HOME/.netrc"
# Starts the log file and records the start time
echo scriptname log file > scriptname.log
echo Begin connection at: >> scriptname.log
date >> scriptname.log
# -i turns off interactive prompting during multiple file transfers
ftp -i <<**
open anyserver.anywhere.net
bin
cd pub
get afile.tar.gz
get bfile.tar.gz
bye
**
# Records the end time
echo End connection at: >> scriptname.log
date >> scriptname.log
# End of scriptname script
Creating such a script by hand each time we need to download data may be annoying. For this reason I have written a small tcl/tk 8.0 application that generates a script tailored to our needs.
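The same idea can be sketched in plain shell. The function below is not the tcl/tk application mentioned above, only a minimal illustration of generating a download script from a host, a remote directory, and a list of files. The name make_ftp_script and its argument order are assumptions, and the login data is expected to be in ~/.netrc already.

```shell
#!/bin/sh
# Sketch: make_ftp_script is a hypothetical name; this is not the
# author's tcl/tk application.
# Usage: make_ftp_script HOST REMOTE_DIR FILE [FILE...]
# Prints a download script on standard output.
make_ftp_script() {
    host=$1
    dir=$2
    shift 2
    printf '%s\n' '#!/bin/sh'
    printf '%s\n' 'ftp -i <<**'
    printf 'open %s\n' "$host"
    printf '%s\n' 'bin'
    printf 'cd %s\n' "$dir"
    # One get command per requested file
    for f in "$@"; do
        printf 'get %s\n' "$f"
    done
    printf '%s\n' 'bye'
    printf '%s\n' '**'
    printf '%s\n' 'echo ftp transfer ended.'
}
```

A typical use would be make_ftp_script anyserver.anywhere.net pub afile.tar.gz bfile.tar.gz > getdata, followed by chmod u+x getdata. The generated file can then be scheduled with at or started with nohup like a hand-written script.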
You can find detailed information about the ftp command set in its respective man page.