One of the most useful tools available in most Linux distributions is the Wget command-line utility. With a simple one-line command, it can download files from the web and save them to the local disk. This capability might initially seem only moderately useful (why not just use Chrome or Firefox to download the file?), but most Linux servers are managed remotely through SSH. SSH normally offers only a command-line interface without any graphical components, so all server maintenance has to be done from the command line. Wget is used constantly during installation to download files from the Internet and install new programs on the system.
Normally, downloading a file from the Internet with Wget takes nothing more than the command name followed by the file's URL.
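As a minimal sketch of such a download, the call can be wrapped in a small helper. The helper name fetch and the URL in the usage comment are placeholders for illustration and do not come from a real site; the -P option tells Wget which directory to save into.

```shell
# Sketch: download one file with Wget.
# Usage: fetch <url> [target-directory]
# Wget names the local file after the last path component of the URL.
fetch() {
    # -P sets the directory the file is saved into (current directory by default)
    wget -P "${2:-.}" "$1"
}

# Example call (placeholder URL):
# fetch http://www.domain.com/files/archive.tar.gz /tmp
```

Without the helper, the equivalent one-liner is simply wget followed by the URL.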
In addition to downloading programs, however, Wget can be used to remotely trigger events or run jobs in web applications. To reuse the web application's existing code, backend jobs are often written as scripts on the website itself. To run such a job, the server simply needs to request the webpage at a predefined interval. It can do so with Wget, discarding the output by redirecting it to /dev/null:
wget -qO- http://www.domain.com/script.php > /dev/null 2>&1
This command can then be placed in a cron job and run at whatever interval the application requires.
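A minimal sketch of such a crontab entry, assuming the job should run every 15 minutes (the interval is an illustrative choice, not a requirement from the application). Cron usually runs commands with /bin/sh rather than Bash, so the sketch uses the portable redirection form > /dev/null 2>&1:

```
# minute hour day-of-month month day-of-week  command
*/15 * * * * wget -qO- http://www.domain.com/script.php > /dev/null 2>&1
```

An entry like this is typically added with crontab -e under the account that should own the job.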
A problem arises, however, when the script is secured, as it should be, so that non-administrative users cannot run system batch jobs. Passing login parameters directly in the URL is insecure, because URLs are recorded in server logs. Ideally, the job should authenticate with a token-based cookie stored only on the local machine.
To generate the cookie, first request the login script, passing the login parameters as POST data, and save the resulting cookie to disk:
wget -qO- --keep-session-cookies --save-cookies cookies.txt --post-data 'user=MYUSER&password=MYPASS' http://www.domain.com/login.php
Depending on the login form's fields, different POST data will need to be supplied. The resulting cookies are saved to the file cookies.txt in the current folder. This command should be run only once, by hand, and should not be stored in any script, so that the password is never written to disk.
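One way to keep the password out of any file is to prompt for it at login time. The sketch below is an illustration under assumptions: the helper name login_once is hypothetical, the field names user and password follow the example above and may differ on a real form, and values containing characters such as & or = would need URL-encoding first. The password still appears briefly in the process list while Wget runs, so this is best done on a machine without untrusted local users.

```shell
# Sketch: prompt for credentials and save the session cookie.
# Usage: login_once <cookie-file> <login-url>
login_once() {
    cookie_file=$1
    login_url=$2

    printf 'User: ' >&2
    IFS= read -r login_user

    printf 'Password: ' >&2
    if [ -t 0 ]; then stty -echo; fi    # hide typing when run on a terminal
    IFS= read -r login_pass
    if [ -t 0 ]; then stty echo; fi
    printf '\n' >&2

    # umask 077 inside a subshell: a newly created cookie file should be
    # readable by its owner only, without changing the caller's umask
    (
        umask 077
        wget -qO- --keep-session-cookies --save-cookies "$cookie_file" \
            --post-data "user=${login_user}&password=${login_pass}" \
            "$login_url" > /dev/null
    )
}

# Example call:
# login_once cookies.txt http://www.domain.com/login.php
```

Restricting the cookie file's permissions matters because the file acts as a login token for anyone who can read it.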
Finally, the authentication token in cookies.txt can be used to run the script in the batch job:
wget -qO- --load-cookies cookies.txt http://www.domain.com/script.php > /dev/null 2>&1
This technique lets batch jobs in a web application run behind authentication without exposing credentials in URLs or scripts, which reduces the application's attack surface. Ideally, the system account should have access only to the particular batch jobs it executes.
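Because the job's output is discarded, a failed request can go unnoticed. The following sketch wraps the authenticated call in a hypothetical helper, run_batch_job, that reports a non-zero Wget exit status on standard error, where cron can pick it up. It takes the cookie file as an explicit path, since cron's working directory may differ from the folder where cookies.txt was saved. Note the limit of this check: an expired session that merely redirects to the login page may still look like success.

```shell
# Sketch: run the authenticated job and report failures.
# Usage: run_batch_job <cookie-file> <job-url>
run_batch_job() {
    cookie_file=$1    # preferably an absolute path
    job_url=$2

    # Page output is discarded; only Wget's exit status is inspected
    if ! wget -qO- --load-cookies "$cookie_file" "$job_url" > /dev/null; then
        echo "batch job request failed: $job_url" >&2
        return 1
    fi
}

# Example call (placeholder path):
# run_batch_job /home/batch/cookies.txt http://www.domain.com/script.php
```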
An alternative way to execute batch jobs on the PHP platform is to call the PHP executable directly from the command line, instead of going through the web server. While this can work in some instances, dynamic web applications often require virtual paths to be set properly and can behave unexpectedly when called from the command line. It is generally safer and more portable across platforms to work within the web server framework and access the job service in the same manner as other web requests. The Wget technique also works with other web development technologies, such as Node.js, ASP.NET, Rails, and Django.
Written by Andrew Palczewski
About the Author
Andrew Palczewski is CEO of apHarmony, a Chicago software development company. He holds a Master's degree in Computer Engineering from the University of Illinois at Urbana-Champaign and has over ten years' experience in managing development of software projects.