Sunday, 28 June 2015

How do I make Wget name files after part of the URL?

Short story:

I want Wget to name each downloaded file after the text matched by the regex capture group ([^/]*):

wget -r --accept-regex="^.*/([^/]*)/$" "$MYURL"

Full story:

I use GNU Wget to recursively download one specific folder of a particular WordPress website. I use a regex to accept only posts and nothing else. Here is how I run it:

wget -r --accept-regex="^.*/([^/]*)/$" "$MYURL"

It works, and Wget follows all the desired URLs. However, it saves each file as .../last_directory/index.html, whereas I want it saved as last_directory.html (the .html extension is optional).

Is there a way to do that with Wget alone? If not, could you suggest how to do the same thing with sed or similar tools?
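
To make the desired result concrete, here is a minimal sketch of one possible post-processing step, run after the recursive download has finished. It is not a Wget feature: the function name flatten_index and its argument (the top-level directory Wget created) are hypothetical, chosen only for illustration. It moves every .../last_directory/index.html to .../last_directory.html and skips any file whose target name already exists.

```shell
#!/bin/sh
# Hypothetical helper (not part of Wget): rename DIR/index.html to DIR.html
# for every index.html found below the given download root.
flatten_index() {
    root=${1%/}    # drop one trailing slash so the comparison below is reliable
    find "$root" -type f -name index.html | while IFS= read -r f; do
        dir=$(dirname "$f")
        # Leave the root's own index.html alone; renaming it would
        # place the file outside the download directory.
        if [ "$dir" = "$root" ]; then
            continue
        fi
        # Do not overwrite a file that already has the target name.
        if [ -e "$dir.html" ]; then
            echo "skipping $f: $dir.html already exists" >&2
            continue
        fi
        mv "$f" "$dir.html"
    done
}
```

After sourcing the function, it could be called as flatten_index followed by the name of the directory Wget created for the site. Directories left empty by the move are not removed, and file names containing newlines are not handled.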
