Suppose I have a file containing a list of webpage links:
www.xyz.com/asdd
www.wer.com/asdas
www.asdas.com/asd
www.asd.com/asdas
I know that running curl www.xyz.com/asdd will fetch the HTML of that webpage, and I want to extract some data from it. So the scenario is: use curl to hit all the links in the file one by one, extract some data from each page, and store it somewhere else. Any ideas or suggestions?
As indicated in the comments, this will loop through your_file and curl each line:
while IFS= read -r line
do
curl "$line"
done < your_file
To get the <title> of a page, you can grep it with something like this:
grep -iPo '(?<=<title>).*(?=</title>)' file
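Note that -P (Perl-compatible regular expressions, needed for the lookbehind/lookahead) is a GNU grep extension and is not available in BSD/macOS grep. A quick offline check of the pattern, using a made-up HTML snippet:

```shell
# Sample HTML (hypothetical content, just to demonstrate the pattern):
html='<html><head><title>Example Page</title></head><body></body></html>'

# (?<=<title>) and (?=</title>) match the tags without including them
# in the output, so only the text between them is printed.
printf '%s\n' "$html" | grep -iPo '(?<=<title>).*(?=</title>)'
# prints: Example Page
```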
So, putting it all together, you could do:
while IFS= read -r line
do
curl -s "$line" | grep -Po '(?<=<title>).*(?=</title>)'
done < your_file
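For the "store somewhere else" part of the question, one option is to wrap the loop in a small function that appends a URL-and-title line to an output file. This is a minimal sketch; the function name and the file names (links.txt, titles.tsv) are hypothetical, and it assumes GNU grep:

```shell
# Read URLs from $1, append "URL<TAB>title" lines to $2.
fetch_titles() {
    local links=$1 out=$2
    while IFS= read -r url; do
        # -s: silent mode, -L: follow redirects (e.g. a 302 response);
        # head -n1 keeps a single line if the page has several matches.
        title=$(curl -sL "$url" | grep -iPo '(?<=<title>).*(?=</title>)' | head -n1)
        printf '%s\t%s\n' "$url" "$title" >> "$out"
    done < "$links"
}

# Usage: fetch_titles links.txt titles.tsv
```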
Note curl -s enables silent mode (no progress meter). See an example with the Google page, which answers with a redirect here, hence the 302 Moved title; add -L to make curl follow it:
$ curl -s http://www.google.com | grep -Po '(?<=<title>).*(?=</title>)'
302 Moved