Possible to dump AJAX content from webpage?

Sandra Schlichting

I would like to dump all the names on this page and all the remaining 146 pages.

The red/orange previous/next buttons seem to use JavaScript and fetch the names via AJAX.

Question

Is it possible to write a script to crawl the 146 pages and dump the names?

Are there Perl modules for this kind of thing?

simbabque

You can use WWW::Mechanize or another crawler for this. Web::Scraper might also be a good fit, as in this example:

use strict;
use warnings;
use Web::Scraper;
use URI;
use Data::Dump;

# First, create your scraper block
my $scraper = scraper {
    # Grab the text of every element with class type_firstname
    # (that way you could also classify them by type)
    process ".type_firstname", "list[]" => 'TEXT';
};

# Base URL; the page number is appended as the gotopage parameter
my $base = "http://www.familiestyrelsen.dk/samliv/navne/soeginavnelister/godkendtefornavne/drengenavne/?tx_lfnamelists_pi2[gotopage]=";

my @names;
foreach my $page ( 1 .. 146 ) {
    # Fetch the page (add page number param)
    my $res = $scraper->scrape( URI->new( $base . $page ) );
    # Add them to our list of names (guard against a page with no matches)
    push @names, @{ $res->{list} || [] };
}

dd \@names;

It will give you a very long list with all the names. Running it may take some time. Try with 1..1 first.
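
Before hitting the real site, it may help to check the selector offline. The following is a minimal sketch that runs the same scraper against a small HTML string instead of a URL. The markup is hypothetical (only the class name type_firstname comes from the code above), so the real page may be structured differently.

```perl
use strict;
use warnings;
use Web::Scraper;

# Hypothetical markup: the structure of the real page is an assumption,
# only the class name type_firstname is taken from the answer above.
my $html = <<'HTML';
<html><body>
<ul>
  <li class="type_firstname">Aage</li>
  <li class="type_firstname">Bent</li>
  <li class="other">not a name</li>
</ul>
</body></html>
HTML

# Same scraper block as in the crawl loop
my $scraper = scraper {
    process ".type_firstname", "list[]" => 'TEXT';
};

# scrape() also accepts an HTML string instead of a URI object
my $res   = $scraper->scrape($html);
my @names = @{ $res->{list} || [] };

print "$_\n" for @names;
```

If this prints only the elements you expect, the selector is probably fine and you can move on to a single real page with 1..1.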
