A scraper is a program that extracts data from a website.
I've built a scraper API that extracts data from popular online marketplaces (this one is for eBay Kleinanzeigen).
A feature list:
It runs as a Python Flask application and uses Selenium with the latest Chrome driver to do the scraping.
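Under the hood the idea is just Flask routes driving a headless Chrome instance. Here is a minimal sketch of that setup (not the actual implementation; the start page, the CSS selector, and the 25-listing cut-off are assumptions based on the output shown below):

from flask import Flask, jsonify
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

app = Flask(__name__)

def make_driver():
    # Headless Chrome so the scraper can run inside a container without a display
    opts = Options()
    opts.add_argument("--headless=new")
    opts.add_argument("--no-sandbox")
    return webdriver.Chrome(options=opts)

@app.route("/getInserateUrls")
def get_inserate_urls():
    driver = make_driver()
    try:
        driver.get("https://www.ebay-kleinanzeigen.de/")
        # Collect listing links from the page; the CSS selector is an assumption
        anchors = driver.find_elements(By.CSS_SELECTOR, "a[href*='/s-anzeige/']")
        links = [a.get_attribute("href") for a in anchors]
        return jsonify({"urls": links[:25]})
    finally:
        driver.quit()

if __name__ == "__main__":
    # Port 80 so it matches the curl examples below
    app.run(host="0.0.0.0", port=80)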
With this curl request we send a GET request to the API server and get back 25 random listing URLs in JSON format.
curl -X GET "http://10.254.1.119/getInserateUrls" -H "accept: application/json"
"Expected" output:
{"urls": ["https://www.ebay-kleinanzeigen.de/s-anzeige/xyz", ...]}
With this curl request we send a GET request with the listing URL (passed in the url header) to the API server and get back all of the listing's data in JSON format.
curl -X GET "http://10.254.1.119/getInseratDetails" -H "accept: application/json" -H "url: https://www.ebay-kleinanzeigen.de/s-anzeige/xyz"
"Expected" output:
{"title": "xyz", "price": "1500.00", "images": ["https://img.ebay-kleinanzeigen.de/api/v1/prod-ads/images/81/xyz"], "tags": ["Kleinanzeigen Berlin", "Elektronik", "Handy & Telefon"], "views": "0", "description": "xyz", "uploadDate": "18.02.2023", "adId": "058195681"}
With this curl request we send a GET request with the listing URL to the API server and get back the listing's current view count in JSON format.
curl -X GET "http://10.254.1.119/getViews" -H "accept: application/json" -H "url: https://www.ebay-kleinanzeigen.de/s-anzeige/xyz"
"Expected" output:
{"views": "xyz000"}
You can use the API to build a website where you can search for listings and filter them by price, views, upload date, etc.
And that's exactly what I did: I built a website where you can search for listings and filter them.
I've made a few PHP scripts that fetch the listings, their details, and the updated view counts, and store everything in a MongoDB database.
I know it's crappy, but it works (for now).
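My scripts are written in PHP, but the flow is simple enough that a rough Python sketch shows the idea just as well (the connection string, database name, and upsert-by-adId logic here are assumptions, not a copy of my scripts):

import requests
from pymongo import MongoClient

API = "http://10.254.1.119"
db = MongoClient("mongodb://localhost:27017")["kleinanzeigen"]  # connection string/db name are placeholders

# Fetch the current batch of listing URLs and store/refresh their details
urls = requests.get(API + "/getInserateUrls", headers={"accept": "application/json"}).json()["urls"]
for url in urls:
    details = requests.get(API + "/getInseratDetails", headers={"accept": "application/json", "url": url}).json()
    # Upsert by adId so re-running the script updates existing listings instead of duplicating them
    db.listings.update_one({"adId": details["adId"]}, {"$set": dict(details, url=url)}, upsert=True)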
I've used the following tools: Python with Flask, Selenium with the Chrome driver, PHP, MongoDB, and Docker.
The API runs in a Docker container, so you can self-host it really easily. Just one command and you can call it a day.
Educational purposes. :)