Algobook
- The developer's handbook
Mon May 08 2023

How to scrape a website from our React application

In this article, we will show how to scrape a website from a React application. To help us, we will use an open source component called reactjs-scraper, and we will also demo the free web scraper API that we provide on Algobook.

Promotion of our projects

So, just to get it out of the way - all the tools we use in this guide are our own, and we give them out for free. So this article is a little promotion of what we offer. The biggest achievement for us is when people use our software and tell us what is good and what can be better. Hence, we want to highlight our portfolio to bring a bigger audience to it :)

In this guide, it is our scraping capabilities we want to get some eyes on!

React scraping

We will first show how to scrape a given website from our React application, with our component called reactjs-scraper.

Download

It can be downloaded from npm

npm i reactjs-scraper

Final result

When we are done, it will look like this

[GIF: demo of the finished scraper component]

Basic usage

To get it up and running, we will implement it as below. I will go through the important parts after the code snippet.

import { ReactScraper } from "reactjs-scraper";
import { LoadingButton } from "reactjs-loading-button";

const renderButton = (content: string) => (
  <LoadingButton
    loadingMode="SPINNER"
    onClick={() => {
      navigator.clipboard.writeText(content).then(() => alert("HTML copied"));
    }}
    isLoading={false}
    text="Copy HTML"
    buttonStyle={{
      fontWeight: "bold",
      backgroundColor: "#da6868",
      color: "white",
      margin: "1rem 0",
    }}
  />
);

<ReactScraper
  placeholder="Type in the URL"
  buttonText="Crawl"
  animationColor="#fff"
  buttonStyle={{
    backgroundColor: "#2a6e06",
  }}
  renderFn={renderButton}
/>;

Some notes

  • The ReactScraper component accepts several props. All of them are optional except renderFn. In our example, we give it some custom text and change the default styling of the button and its animation.

  • renderFn is the callback that is called once the content of the webpage we want to scrape has been retrieved, so we can render it exactly how we want. In our example, we just show a button that lets the user copy the full HTML content. The callback could do anything we want, like rendering the HTML on our page or filtering out the important data from it - sky is the limit, right?

  • LoadingButton is the button component we are using. Oh surprise, it is another one of our free tools that you can check out if you like. Link to docs.
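As a small illustration of the "filter out important data" idea, here is a minimal sketch of a helper that pulls the page title and link targets out of an HTML string. The name summarizeHtml and the PageSummary type are hypothetical and not part of reactjs-scraper. The sketch also uses simple regular expressions rather than a real HTML parser, so treat it as a starting point only.

```typescript
// Hypothetical helper, not part of reactjs-scraper: extracts a few fields
// from the raw HTML string that a renderFn callback receives.
interface PageSummary {
  title: string | null;
  links: string[];
}

function summarizeHtml(html: string): PageSummary {
  // First <title>...</title> in the document, if there is one.
  const titleMatch = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  const title = titleMatch ? titleMatch[1].trim() : null;

  // Every double-quoted href attribute found on an <a> tag.
  const links: string[] = [];
  const linkPattern = /<a\b[^>]*\bhref="([^"]*)"/gi;
  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(html)) !== null) {
    links.push(match[1]);
  }

  return { title, links };
}

// Small sample modeled on the example.com markup.
const sample =
  "<html><head><title>Example Domain</title></head>" +
  '<body><a href="https://www.iana.org/domains/example">More information...</a></body></html>';

const summary = summarizeHtml(sample);
console.log(summary.title);
console.log(summary.links);
```

Inside a renderFn, we could call summarizeHtml(content) and render the title and the links instead of a copy button. For anything beyond simple pages, a proper parser such as the browser's DOMParser would likely be a more robust choice than regular expressions.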

And that's it. Now we have our nice component in our project, and we are ready to do some scraping!

Use the API instead

You don't like the component? Or are you using something other than React? Maybe you want to do this from your backend logic instead? In any case, we have an API that you can use to achieve the same thing. Our component was just built for convenience.

Using the API

The usage is super simple: just call our API with the URL of the webpage you want to scrape, and we will get it for you.

Example in JavaScript

const response = await fetch(
  "https://media.algobook.info/scrape?url=https://example.com"
);
const content = await response.json();

console.log(content.data); // Full HTML of the given page

Response

This is the full response from https://example.com/

{ "data": "<!DOCTYPE html> <html> <head> <title>Example Domain</title> <meta charset=\"utf-8\"> <meta http-equiv=\"Content-type\" content=\"text/html; charset=utf-8\"> <meta name=\"viewport\" content=\"width=device-width, initial-scale=1\"> <style type=\"text/css\"> body { background-color: #f0f0f2; margin: 0; padding: 0; font-family: -apple-system, system-ui, BlinkMacSystemFont, \"Segoe UI\", \"Open Sans\", \"Helvetica Neue\", Helvetica, Arial, sans-serif; } div { width: 600px; margin: 5em auto; padding: 2em; background-color: #fdfdff; border-radius: 0.5em; box-shadow: 2px 3px 7px 2px rgba(0, 0, 0, 0.02); } a:link, a:visited { color: #38488f; text-decoration: none; } @media (max-width: 700px) { div { margin: 0 auto; width: auto; } } </style> </head> <body> <div> <h1>Example Domain</h1> <p>This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.</p> <p><a href=\"https://www.iana.org/domains/example\">More information...</a></p> </div> </body> </html>" }
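If you call the API from your own code, it can help to wrap the request in a small function. Below is a minimal sketch of one way to do that. The endpoint and the { data } response shape come from the example above, while the function names (buildScrapeUrl, extractHtml, scrapePage) and the error handling are our own illustrative choices. It also assumes that the API decodes a percent-encoded url parameter the way standard query strings work, which we have not verified here.

```typescript
// Endpoint taken from the example above; everything else is illustrative.
const SCRAPE_ENDPOINT = "https://media.algobook.info/scrape";

// Build the request URL. The target is percent-encoded so that its own
// query string or "&" characters are not read as parameters of the API call.
// Assumption: the API decodes the url parameter like a normal query string.
function buildScrapeUrl(target: string): string {
  return `${SCRAPE_ENDPOINT}?url=${encodeURIComponent(target)}`;
}

// Check that a parsed body has the { data: string } shape and return the HTML.
function extractHtml(body: unknown): string {
  if (
    typeof body === "object" &&
    body !== null &&
    typeof (body as { data?: unknown }).data === "string"
  ) {
    return (body as { data: string }).data;
  }
  throw new Error("Unexpected response shape from the scrape API");
}

// Fetch a page through the API and resolve with its HTML.
async function scrapePage(target: string): Promise<string> {
  const response = await fetch(buildScrapeUrl(target));
  if (!response.ok) {
    throw new Error(`Scrape request failed with status ${response.status}`);
  }
  return extractHtml(await response.json());
}
```

It could then be used as scrapePage("https://example.com").then((html) => console.log(html.length)), either in the browser or in a backend runtime that provides a global fetch.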

Outro

That's it. Now you know how you can do web scraping from your React application, or from anywhere if you are using our API instead of the component.

I really hope you enjoyed this guide, and I hope even more that you find value in using some of our free software. If there is anything you think we could improve or add to any of our services, just send us an email, and we will do all we can to meet your needs.

Thanks for reading, and have a great day!
