
SiteSucker

This is a minor update to SiteSucker, but it is a program that I depend on, so I am posting it.

For the uninitiated, SiteSucker is a program that sucks everything (pages, images, etc.) out of a website and downloads it to your hard drive. I use it because I update my school web page at the beginning of each year, but I don’t want to lose the information from the year before. I rarely go back and look at the old copy; I mainly keep it as a backup of what happened the previous year, in case I lose some other record.

The updates are:

  • Allowed users to view the download settings while downloading.
  • Replaced wildcard support in paths settings with regular expressions.
  • Removed “Get Files via Image Links” from the Download Option and added an “Only Follow Image Links” option under the Advanced tab in the download settings.
  • Added an option to save log files in ~/Library/Logs/SiteSucker.
  • Added a Logs tab in the Download Settings window and reorganized the settings.
  • Added scanning of the style attribute in all tags for URLs.
  • Replaced URL parameters with a value in local file names.
  • Deleted empty folders in the download folder when all downloads are paused.
  • Modified the document format to improve performance when analyzing files.
  • Fixed an issue where some files failed to download when a download was resumed.
  • Fixed some issues with the Open File command.
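
For anyone whose path settings relied on wildcards, here is a rough Python sketch of how a wildcard pattern compares with a regular expression. The release notes don't say which regular expression dialect SiteSucker uses, so the patterns and paths below are made-up illustrations, not actual SiteSucker settings.

```python
import fnmatch
import re

# Hypothetical wildcard-style path pattern (an assumption, not a real SiteSucker setting).
wildcard = "/archive/*/images/*.jpg"

# fnmatch.translate turns a shell-style wildcard into an equivalent regex string.
as_regex = re.compile(fnmatch.translate(wildcard))

# A hand-written regex can say things a wildcard cannot: here the folder
# must be a four-digit year, and both .jpg and .png files are accepted.
stricter = re.compile(r"/archive/\d{4}/images/[^/]+\.(jpg|png)")

# Made-up example paths.
paths = [
    "/archive/2009/images/class-photo.jpg",
    "/archive/misc/images/logo.jpg",
    "/archive/2010/images/chart.png",
]

wildcard_hits = [p for p in paths if as_regex.match(p)]
stricter_hits = [p for p in paths if stricter.fullmatch(p)]

print("wildcard:", wildcard_hits)
print("regex:   ", stricter_hits)
```

The wildcard accepts any folder name but only .jpg files, while the regular expression narrows the folder to a year and widens the file types. That extra precision is presumably the reason for the switch, though the release notes don't say so.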
  1. Another really great application for this purpose is SiteCrawler. When I were into site downloaders a while ago, I found that SiteCrawler was able to download a lot of sites and files that SiteSucker failed on. However, I guess that many of those issues are corrected in this version. But if you miss a feature in SiteSucker or feel like testing another option, try SiteCrawler @ http://lightheadsw.com/sitecrawler/

  2. Ahhrg, stupid typo. When I *was*

