Download pages from React sites using a list of URLs

I'm looking for a script for the following process:

  1. The script is pointed at a list of URLs
  2. For each URL, the site is visited, and the web page is saved (html and files).
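The two steps above could be sketched roughly as follows in Python. This is only an outline under assumptions: the list is taken to be a plain text file with one URL per line, and `render` is a hypothetical callable (not part of any library) that returns the fully rendered HTML for a URL, for example by driving a real browser. The names `read_url_list`, `folder_for`, and `save_all` are made up for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable
from urllib.parse import urlparse


def read_url_list(list_path: Path) -> list[str]:
    """Return the non-empty lines of the list file, skipping '#' comments."""
    lines = list_path.read_text(encoding="utf-8").splitlines()
    stripped = (line.strip() for line in lines)
    return [line for line in stripped if line and not line.startswith("#")]


def folder_for(url: str) -> str:
    """Build a filesystem-safe folder name: host plus a short hash of the URL."""
    host = urlparse(url).netloc.replace(":", "_") or "unknown-host"
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:8]
    return f"{host}-{digest}"


def save_all(list_path: Path, out_dir: Path,
             render: Callable[[str], str]) -> list[Path]:
    """Visit every URL in the list and save its HTML as <folder>/index.html."""
    saved = []
    for url in read_url_list(list_path):
        page_dir = out_dir / folder_for(url)
        page_dir.mkdir(parents=True, exist_ok=True)
        target = page_dir / "index.html"
        # `render` is an assumption: it must execute the page's JavaScript,
        # otherwise a React site only yields its empty HTML shell.
        target.write_text(render(url), encoding="utf-8")
        saved.append(target)
    return saved
```

Passing the renderer in as a parameter keeps the loop independent of whichever browser automation tool ends up working for the React sites.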

That's basically it. I've been trying to get a few different tools to work, but to no avail. The main challenge is that some of the sites I'm targeting are built with React, and they treat the request as coming from a browser with JavaScript disabled.

Any pointers or help on this would be appreciated.

Hi, have you ever used the jsoup library for web scraping in Java? On its own it has the same issue you mentioned; I've used it in the past and ran into the same problem. Using a WebClient together with jsoup seems to be the solution.
SilverHood Apps 3 months ago
In the above example, two libraries are used:
1. HtmlUnit - This simulates a web browser using the WebClient class.
2. jsoup - This is used to extract the web page and parse it if required.
In your case, you might just extract and save the page for each corresponding URL. P.S. It needs the NetBeans IDE.
SilverHood Apps 3 months ago
Hi, I've successfully created a working prototype using the said library that extracts dynamic web pages.
Kindly get in touch through my page/account for its implementation, as it will require heavy interaction.
Regards.
SilverHood Apps 3 months ago
Well, this would be easy for me to build with Python. Is this question still open? Is it OK to use Selenium? Basically, you have to emulate a web browser so that the full page gets processed, then extract the HTML and all the references in order to download all the content. In addition, the program should make the asynchronous requests in parallel.
romelgomez 3 months ago
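A rough Python sketch of the approach in the comment above: render the page in a headless browser, collect the referenced files, and fetch them in parallel. It assumes the `selenium` package and a local Chrome install are available. The helper names (`AssetCollector`, `find_assets`, `render_with_selenium`, `fetch_all`) are hypothetical, and waiting for `document.readyState` is only an approximation, since a React app may keep rendering after the load event.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class AssetCollector(HTMLParser):
    """Collect absolute URLs from <script src>, <img src> and <link href>."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.assets: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag in ("script", "img") and attr_map.get("src"):
            ref = attr_map["src"]
        elif tag == "link" and attr_map.get("href"):
            ref = attr_map["href"]
        else:
            return
        absolute = urljoin(self.base_url, ref)  # resolve relative references
        if absolute not in self.assets:
            self.assets.append(absolute)


def find_assets(html: str, base_url: str) -> list[str]:
    """Return the de-duplicated asset URLs in document order."""
    collector = AssetCollector(base_url)
    collector.feed(html)
    return collector.assets


def render_with_selenium(url: str, timeout: float = 15.0) -> str:
    """Load the page in headless Chrome and return the rendered HTML."""
    # Imported lazily so the rest of the sketch works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Approximation: the load event may fire before React finishes.
        WebDriverWait(driver, timeout).until(
            lambda d: d.execute_script("return document.readyState") == "complete"
        )
        return driver.page_source
    finally:
        driver.quit()


def fetch_bytes(url: str) -> bytes:
    """Download one asset with the standard library."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def fetch_all(urls, fetch=fetch_bytes, workers: int = 8) -> dict[str, bytes]:
    """Fetch all URLs in parallel threads; any failure raises to the caller."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(urls, pool.map(fetch, urls)))
```

A typical call chain would be `html = render_with_selenium(url)`, then `fetch_all(find_assets(html, url))`. Pages that load data late may need an explicit wait on a specific element instead of the `readyState` check.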
Do you want the page links, i.e. scripts, styles, etc. to point to their original URLs or to a relative path once saved? For example, if there's a <script src="https://domain.com/js/file.js">, do you want these links to become <script src="/js/file.js"> or leave them pointing to their original URLs (absolute paths)?
kostasx 3 months ago
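To make the two options in the question above concrete, here is a small Python sketch. It is an illustration only: `rewrite_links` and `local_path_for` are hypothetical names, the regular expression is a crude stand-in for a real HTML parser, and it touches every `src`/`href` attribute on the same host, including ordinary anchor links. Query strings are dropped when a URL is converted.

```python
import re
from urllib.parse import urlparse

# Matches src="..." or href='...' and captures prefix, quote char and value.
ATTR_RE = re.compile(r'''(\b(?:src|href)\s*=\s*)(["'])(.*?)\2''', re.IGNORECASE)


def local_path_for(url: str) -> str:
    """Map https://domain.com/js/file.js to the root-relative /js/file.js."""
    return urlparse(url).path or "/"


def rewrite_links(html: str, site_host: str, keep_absolute: bool = False) -> str:
    """Rewrite same-host absolute URLs to relative paths, or leave them as-is."""

    def replace(match: re.Match) -> str:
        prefix, quote, value = match.groups()
        # Leave third-party hosts and already-relative links untouched.
        if keep_absolute or urlparse(value).netloc != site_host:
            return match.group(0)
        return f"{prefix}{quote}{local_path_for(value)}{quote}"

    return ATTR_RE.sub(replace, html)
```

With `keep_absolute=True` the saved page keeps loading its files from the original site; with the default, the files must also be saved under matching local paths for the page to work offline.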
This seems to be a new trend here: a new account posts a bounty and then vanishes. I wonder what that achieves.
SilverHood Apps 3 months ago


0 Solutions