Free Python Domain Crawler
Description

Have you bought our domain list and want to crawl those domains for the data you need?

Great news: we have open-sourced one of our crawlers. It is fast (over 100 connections per second) and light on RAM and CPU. Because it is asynchronous, it performs well even on a small VPS or Linux server. You can also set up a cluster of crawlers, for example with Redis and RQ (Redis Queue), to process domains from several machines.
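
To illustrate what "asynchronous" means here, the following is a minimal sketch of bounded-concurrency crawling with Python's standard asyncio module. It is not taken from crawler.py: the names crawl_all, fake_fetch and max_concurrency are hypothetical, and fake_fetch stands in for a real HTTP request so that the example runs without network access.

```python
import asyncio


async def crawl_all(domains, fetch, max_concurrency=100):
    """Run fetch(domain) for every domain, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(domain):
        async with semaphore:
            try:
                return domain, await fetch(domain)
            except Exception as exc:
                # One failing domain should not abort the whole batch.
                return domain, exc

    # gather() returns results in the same order as the input domains.
    return await asyncio.gather(*(bounded(d) for d in domains))


async def fake_fetch(domain):
    # Hypothetical placeholder for a real HTTP request (no network used).
    await asyncio.sleep(0)
    return {"domain": domain, "status": 200}


if __name__ == "__main__":
    results = asyncio.run(crawl_all(["example.com", "example.org"], fake_fetch))
    for domain, result in results:
        print(domain, result)
```

The semaphore caps the number of requests in flight, which is one common way to keep memory and CPU use low on a small server. In a cluster setup, each worker machine could run a loop like this on its own batch of domains.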

As you can see, setting up and running such an environment costs money. You can do the math: at ca. 100 requests per second, a single server needs about 30 days to work through 260,000,000 domains, so processing them within one day requires ca. 30 servers (e.g. at $10/month per server). We have done this and continue to do it.
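
The estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name servers_needed is hypothetical, and it assumes every server sustains the full request rate for the whole day.

```python
import math


def servers_needed(domains, requests_per_second, deadline_seconds=86_400):
    """Servers required to request every domain once before the deadline."""
    per_server = requests_per_second * deadline_seconds  # requests one server can make
    return math.ceil(domains / per_server)


servers = servers_needed(260_000_000, 100)
print(servers)  # 31, i.e. roughly the 30 servers quoted above
```

One server handles 8,640,000 requests per day at this rate, so 260,000,000 domains work out to just over 30 server-days.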

By buying it from us, you save money and hassle.

Feel free to use the crawler to get the data you want after buying a list of domains from us.

The open-source Domain Crawler is available on GitHub:

https://github.com/topcodersonline/domain-crawler/blob/master/crawler.py

You need to specify the fields you want to crawl and the input file.

Currently, it writes the following values to standard output in JSON format:

– Domain
– IP
– Web Server type
– Tech stack (Powered By)
– MetaGenerator
– Email
– Country Hosted
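
As a rough illustration of how such a record could be assembled and printed as JSON, here is a small sketch. It does not reproduce the code in crawler.py: the function build_record, the JSON key names and the regular expressions are assumptions made for this example, and the sample IP address comes from the reserved documentation range.

```python
import json
import re

# Simplified patterns for illustration; real-world HTML may need a proper parser.
GENERATOR_RE = re.compile(
    r'<meta[^>]+name=["\']generator["\'][^>]+content=["\']([^"\']*)["\']',
    re.IGNORECASE,
)
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def build_record(domain, ip, headers, html, country=None):
    """Collect the listed fields into one dict (key names are hypothetical)."""
    generator = GENERATOR_RE.search(html)
    return {
        "domain": domain,
        "ip": ip,
        "server": headers.get("Server"),            # web server type
        "powered_by": headers.get("X-Powered-By"),  # tech stack
        "meta_generator": generator.group(1) if generator else None,
        "emails": sorted(set(EMAIL_RE.findall(html))),
        "country": country,  # would come from a separate GeoIP lookup
    }


record = build_record(
    "example.com",
    "192.0.2.1",  # documentation-only address, not a real lookup result
    {"Server": "nginx", "X-Powered-By": "PHP/8.2"},
    '<meta name="generator" content="WordPress 6.4">'
    '<a href="mailto:info@example.com">Contact</a>',
)
print(json.dumps(record))
```

Printing one JSON object per line keeps the output easy to pipe into other tools or to load later line by line.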

Feel free to add or modify fields.

If you have questions, feel free to contact us directly.
