Custom user lists
• So, let's take some Word docs and pull out the user names and first and last names!
• What about the web?
wget -r -l1 --no-parent -A.doc http://www.somewebsite.com/ && \
exiftool -r -a -u -Author -LastSavedBy * > users.txt && \
strings users.txt | cut -d":" -f2 | grep -v "=" | \
grep -v -e "image files read" -e "directories scanned" | \
tr '[:space:]' '\n' | sort | uniq > cleanusers.txt
• Local disk?
exiftool -r -a -u -Author -LastSavedBy * > users.txt && \
strings users.txt | cut -d":" -f2 | grep -v "=" | \
grep -v -e "image files read" -e "directories scanned" | \
tr '[:space:]' '\n' | sort | uniq > cleanusers.txt
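The cleanup stage of the pipeline can be tried on its own without any documents on disk. Below is a minimal sketch, assuming exiftool prints "Tag : Value" lines, "========" file headers and a trailing summary; the function name clean_names and the sample lines are hypothetical, and the extra filter for empty lines is an addition, not part of the original one-liner.

```shell
#!/bin/sh
# clean_names (hypothetical helper): read exiftool-style output on stdin
# and print one unique token (first name, last name or user name) per line.
clean_names() {
    cut -d":" -f2 |                       # keep the value after "Tag :"
        grep -v "=" |                     # drop "======== file" headers
        grep -v -e "image files read" \
                -e "directories scanned" |  # drop the summary lines
        tr '[:space:]' '\n' |             # one word per line
        grep -v '^$' |                    # drop the empty lines tr leaves behind
        LC_ALL=C sort -u                  # byte-order sort, duplicates removed
}

# Made-up sample input, shaped like `exiftool -r -Author -LastSavedBy` output.
printf '%s\n' \
    '======== ./report.doc' \
    'Author                          : John Smith' \
    'Last Saved By                   : jsmith' \
    '    1 directories scanned' \
    '    1 image files read' | clean_names
```

With real data, the same function can replace the tail of the one-liner: exiftool -r -a -u -Author -LastSavedBy * | clean_names > cleanusers.txt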
More info at http://www.pauldotcom.com/Metadata_the_Silent_Killer_NS2009.pdf
================================================================
Exclude htm, html, php, asp, aspx and cgi extensions:
# wget -nd -r -R htm,html,php,asp,aspx,cgi -P /home/tools/metadata_from_[website_name] [target_domain]
Alternatively, we could have included only PDF, Word and Excel extensions using the following command:
# wget -nd -r -A pdf,doc,docx,xls,xlsx -P /home/tools/metadata_from_[website_name] [target_domain]
-nd = no directories (places all files in the specified directory)
-r = recursive download
-P [directory] = directory prefix; downloaded files are saved under this directory
-R/-A = reject or accept file types or patterns
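To show how the flags fit together, here is a minimal sketch that only builds and prints the wget command for a given target, so it can be reviewed before anything is downloaded. The function name fetch_docs_cmd is hypothetical, and deriving [website_name] from the host part of the URL is an assumption about how the output directory is named.

```shell
#!/bin/sh
# fetch_docs_cmd (hypothetical helper): print the wget command that would
# download only document types from a target into a per-site directory.
fetch_docs_cmd() {
    target=$1
    # Strip the scheme and any path to get a directory-friendly site name.
    site=$(printf '%s' "$target" | sed -e 's|^[a-z]*://||' -e 's|/.*$||')
    printf 'wget -nd -r -A pdf,doc,docx,xls,xlsx -P /home/tools/metadata_from_%s %s\n' \
        "$site" "$target"
}

# Dry run: prints the command instead of executing it.
fetch_docs_cmd http://www.somewebsite.com/
```

Pipe the output to sh to actually run the download.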