You have probably needed to download a tarball or an ISO image from a web page straight to a server. Or, within a script, you may need to query both local and remote web servers over HTTP(S) to check uptime or the presence of certain content.
One of the most popular and feature-rich tools for the job is Wget, a command-line utility for downloading files from the internet. It descends from an earlier program named Geturl, written by the same author. Wget supports several protocols, namely HTTP, HTTPS and FTP, and can also retrieve files through HTTP proxies, a powerful feature set indeed.
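To make the scripting scenario concrete, here is a minimal sketch of such a check in Python, using only the standard library. The function name check_url and its parameters are our own invention for illustration, and it assumes a plain http:// or https:// URL; treat it as a starting point rather than a finished monitoring tool.

```python
import urllib.error
import urllib.request


def check_url(url, expected_text=None, timeout=10.0):
    """Return True if `url` answers with HTTP 200 and, when `expected_text`
    is given, the response body contains that text."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            if response.status != 200:
                return False
            if expected_text is None:
                return True
            # Fall back to UTF-8 when the server does not declare a charset.
            charset = response.headers.get_content_charset() or "utf-8"
            body = response.read().decode(charset, errors="replace")
            return expected_text in body
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts and
        # HTTP error statuses such as 404 or 500.
        return False
```

A cron job or deployment script could call check_url with the address of a health page and some text that page is expected to contain, then raise an alert when the result is False.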
In the mid-1990s many Unix users struggled behind extremely slow dial-up connections, leading to a growing need for a downloading agent that could deal with transient network failures without assistance from the human operator. Originally released in January 1996, GNU Wget provided a solution to these issues and became a major application for most Linux/UNIX systems when downloading files from the Internet.
To the delight of all fans of free and open source software, GNU Wget2 2.0, the successor to GNU Wget, was released yesterday.
GNU Wget2
GNU Wget2 is a free utility for non-interactive download of files from the internet. It was more than three years in the making. GNU Wget2 supports the HTTP and HTTPS protocols, as well as retrieval through HTTP(S) proxies. It is a complete rewrite on a new codebase, made up of the wget2 tool and libwget, a library for web clients. It is licensed under GPLv3+.
Wget2 is non-interactive, meaning that it can work in the background while the user is not logged on. This allows you to start a retrieval and disconnect from the system, letting Wget2 finish the work. By contrast, most Web browsers require the user's constant presence, which can be a great hindrance when transferring a lot of data.
It's important to note that GNU Wget2 brings many improvements over Wget. Its features include multi-threading, compression (gzip, bzip2, lzma/xz), HTTP/2, parsing of XML sitemaps and RSS/Atom feeds, TCP Fast Open, OCSP (stapling), multiple proxies, and more.
On top of that, in many cases GNU Wget2 downloads much faster than Wget 1.x thanks to HTTP/2, HTTP compression, parallel connections and use of the If-Modified-Since HTTP header. It can also follow links in HTML, XHTML, CSS, RSS, Atom and sitemap files to create local versions of remote web sites, fully recreating the directory structure of the original site.
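The If-Modified-Since header saves time because the server can answer "304 Not Modified" instead of sending a file you already have. The Python sketch below illustrates the general idea with the standard library. It is not how Wget2 implements it internally, and the function names build_conditional_request and fetch_if_newer are hypothetical ones chosen for this example.

```python
import os
import urllib.error
import urllib.request
from email.utils import formatdate


def build_conditional_request(url, local_path):
    """Build a GET request that carries If-Modified-Since when a local copy exists."""
    request = urllib.request.Request(url)
    if os.path.exists(local_path):
        # Use the local file's modification time, formatted as an HTTP date in GMT.
        mtime = os.path.getmtime(local_path)
        request.add_header("If-Modified-Since", formatdate(mtime, usegmt=True))
    return request


def fetch_if_newer(url, local_path, timeout=30.0):
    """Download `url` to `local_path` unless the server reports no change.

    Returns True when the file was written, False on 304 Not Modified.
    """
    request = build_conditional_request(url, local_path)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            data = response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # urllib reports 304 as an HTTPError; the local copy is still current.
            return False
        raise
    with open(local_path, "wb") as fh:
        fh.write(data)
    return True
```

The sketch reads the whole response into memory, which is fine for small files but not for ISO images. It also assumes the local file's timestamp is a fair stand-in for the time of the last download.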
GNU Wget2 has been designed for robustness over slow or unstable network connections. That means if a download fails due to a network problem, it will keep retrying until the whole file has been retrieved. In addition, if the server supports partial downloads, it may continue the download from where it left off.
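The retry-and-resume behaviour described above can be sketched in a few lines of Python. This is only an illustration of the technique (an HTTP Range request for the bytes still missing), not Wget2's actual code. The name download_with_resume, the attempt limit and the delay are assumptions made for the example, and the sketch expects the destination file to be either absent or a partial copy of the same download.

```python
import http.client
import os
import time
import urllib.error
import urllib.request


def download_with_resume(url, dest, max_attempts=5, delay=2.0, chunk_size=64 * 1024):
    """Fetch `url` into `dest`, resuming from the bytes already on disk after a failure."""
    last_error = None
    for attempt in range(max_attempts):
        if attempt:
            time.sleep(delay)  # brief pause before every retry
        offset = os.path.getsize(dest) if os.path.exists(dest) else 0
        request = urllib.request.Request(url)
        if offset:
            # Ask only for the part that is still missing.
            request.add_header("Range", f"bytes={offset}-")
        try:
            with urllib.request.urlopen(request, timeout=30) as response:
                # 206 means the server honoured the Range header, so append;
                # a plain 200 means it is sending the whole file, so start over.
                mode = "ab" if response.status == 206 else "wb"
                expected = response.headers.get("Content-Length")
                received = 0
                with open(dest, mode) as fh:
                    while True:
                        chunk = response.read(chunk_size)
                        if not chunk:
                            break
                        fh.write(chunk)
                        received += len(chunk)
            if expected is not None and received < int(expected):
                # The connection dropped mid-body; keep the partial file and retry.
                raise ConnectionError("connection closed before the body was complete")
            return
        except urllib.error.HTTPError as err:
            if err.code < 500:
                raise  # client errors such as 404 will not succeed on a retry
            last_error = err
        except (urllib.error.URLError, http.client.HTTPException, OSError) as err:
            last_error = err
    raise RuntimeError(f"giving up on {url} after {max_attempts} attempts") from last_error
```

Unlike the tool itself, this sketch gives up after a fixed number of attempts. It also does not handle a destination file that is already complete: a server would typically answer such a Range request with status 416, which the sketch simply re-raises.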
For detailed information on all new features in GNU Wget2, you can refer to the project’s GitLab page.