Automatically download an entire public website using wget recursively

wget --recursive --no-parent --user-agent="Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X; en-us) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53" --wait=2 --limit-rate=200K --no-clobber --page-requisites --convert-links --domains bytefreaks.net https://bytefreaks.net/;

Introduction:

The wget command is a powerful tool for downloading files and web pages from the internet. It is most commonly used in Linux/Unix environments but is available on other operating systems as well. The command offers a wide range of options and parameters that can be combined to suit your specific download requirements. In this post, we break down the options used in the command above and show how to use wget to download files and entire websites.
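
Before dissecting the full command, it helps to see the most basic form of wget, which simply fetches a single file into the current directory. The URL below is only a placeholder; substitute the file you actually want to download:

# Fetch one file and save it under its original name (placeholder URL).
wget https://example.com/archive.tar.gz;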

Command Explanation:

Here is a detailed explanation of the options used in the command:

  1. “--recursive” (or “-r”): Makes the download recursive, meaning that wget will follow links and download the entire website (see also the “--mirror” sketch after this list).
  2. “--no-parent” (or “-np”): Prevents wget from ascending to the parent directory when downloading. This is helpful when you want to limit the download to a specific directory.
  3. “--user-agent”: Specifies the user agent string that wget uses to identify itself to the server. In this case, it is set to that of a mobile device (an iPhone running Safari).
  4. “--wait”: Adds a delay (in seconds) between requests. This is useful to prevent the server from being overloaded with too many requests at once.
  5. “--limit-rate”: Limits the download speed to a specific rate (here, 200 KB per second).
  6. “--no-clobber”: Prevents wget from overwriting files that already exist locally.
  7. “--page-requisites”: Instructs wget to download all the files needed to display each page, including images, CSS, and JavaScript files.
  8. “--convert-links” (or “-k”): Rewrites the links in the downloaded files so that they point to the local copies. This is necessary to ensure that the downloaded site can be browsed offline.
  9. “--domains”: Restricts the download to the specified domain name(s), so wget does not follow links to other sites.
  10. “https://bytefreaks.net/”: The URL of the website that you want to download.
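
As a side note, wget also provides the --mirror shortcut, which the wget manual describes as equivalent to -r -N -l inf --no-remove-listing. The sketch below (referenced from item 1 above) combines it with the offline-viewing options from the list; --no-clobber is left out because it conflicts with the time-stamping (-N) that --mirror turns on:

# Mirror the site using --mirror (time-stamping) instead of --no-clobber; a sketch against the same example site.
wget --mirror --no-parent --page-requisites --convert-links --wait=2 --limit-rate=200K --domains bytefreaks.net https://bytefreaks.net/;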

Conclusion:

The wget command is a powerful tool for downloading files and entire websites from the internet. By combining the options described above, you can tailor a download to your specific requirements, whether that means throttling the transfer rate, identifying as a different client, or preparing a site for offline viewing. We hope this post has given you a better understanding of the wget command and how to use it.

Same command without setting the user agent:

The following command will try to download a full website, fetching every page it can reach through public links.

wget --wait=2 --limit-rate=200K --recursive --no-clobber --page-requisites --convert-links --domains example.com http://example.com/;
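
If you need finer control over the crawl, wget's --level option limits the recursion depth and --directory-prefix chooses where the files are saved. The variant below is a sketch built on the command above; the depth of 3 and the example-mirror folder name are arbitrary choices:

# Limit recursion to 3 levels and store the result under ./example-mirror/ (sketch).
wget --wait=2 --limit-rate=200K --recursive --level=3 --no-clobber --page-requisites --convert-links --domains example.com --directory-prefix=example-mirror http://example.com/;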
