Case Studies

Online Database Scraping AI

Technologies: CAPTCHA solving, Automated website scraping
Industry: Finance
Client: Confidential
Platform: Web

Project Summary

A solution built for a bankruptcy services company that automatically gathers data from online bankruptcy databases.

Services

Web Software Development

Team

1 Project Manager
1 Data Science Developer

Target Audience

Accounting Firms
Finance Companies

Our client is a finance services company looking to expand its marketing efforts by acquiring clients through a new channel: bankruptcy databases. We were tasked with developing a data scraper that would go through these databases and extract data about companies that have recently filed for bankruptcy.

The scraper had to integrate seamlessly into the company’s existing software infrastructure, connect with its CMS, and have a simple and intuitive design.

Data Scraper

The data scraper we developed automatically accesses the bankruptcy databases, loads the relevant pages, and scans them for data on bankrupt companies. It checks whether each record is new or has been saved before, thereby avoiding duplicates.
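The duplicate check can be illustrated with a minimal Python sketch. It is not the production implementation: the field names `company_name` and `case_number`, and the choice of a hashed fingerprint kept in an in-memory set, are assumptions made for the example.

```python
import hashlib


def record_key(record: dict) -> str:
    """Build a stable fingerprint from the fields assumed to identify a filing."""
    parts = [
        record.get("company_name", "").strip().lower(),  # hypothetical field
        record.get("case_number", "").strip().lower(),   # hypothetical field
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def filter_new(records, seen_keys: set) -> list:
    """Return only records whose fingerprint has not been seen, and remember them."""
    fresh = []
    for record in records:
        key = record_key(record)
        if key not in seen_keys:
            seen_keys.add(key)
            fresh.append(record)
    return fresh
```

In a long-running scraper the set of seen keys would live in persistent storage rather than in memory, so that restarts do not reintroduce old records.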

Before export, the scraper normalizes fields such as phone numbers and addresses to a uniform format. The data can be exported in various formats, including Excel sheets, or added automatically to the CMS our client uses.
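As a rough illustration of this normalization step, the Python sketch below cleans phone numbers and addresses. The exact rules (digits only with an optional leading `+`, collapsed whitespace) are assumptions for the example, not the rules used in the project.

```python
import re


def normalize_phone(raw: str) -> str:
    """Keep digits only, preserving a leading '+' for international numbers."""
    raw = raw.strip()
    digits = re.sub(r"\D", "", raw)
    return "+" + digits if raw.startswith("+") else digits


def normalize_address(raw: str) -> str:
    """Collapse runs of whitespace and trim stray commas and spaces at the ends."""
    return re.sub(r"\s+", " ", raw).strip(" ,")
```

Normalizing before the duplicate check also helps, since two spellings of the same phone number then compare as equal.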

We have implemented the ability to easily train the data scraper by showing it what data to collect from a database. All a user has to do is open a new database, highlight the data, and run the scraper, after which the scraper can access the new database on a regular basis.

We have ensured stability and continuous scraping by integrating Sentry, an application monitoring and error tracking tool that alerts the user if the scraping process is interrupted.
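One way such monitoring could be wired in is sketched below in Python. It assumes the `sentry-sdk` package, whose `sentry_sdk.init(dsn=...)` would be called once at startup with the project's DSN. The helpers `default_report` and `run_with_monitoring` are hypothetical names introduced for this example.

```python
import logging

try:
    import sentry_sdk  # third-party package: sentry-sdk
except ImportError:  # lets the sketch run where Sentry is not installed
    sentry_sdk = None


def default_report(exc: Exception) -> None:
    """Send the exception to Sentry if available, otherwise log it locally."""
    if sentry_sdk is not None:
        sentry_sdk.capture_exception(exc)
    else:
        logging.error("scrape failed: %r", exc)


def run_with_monitoring(scrape_once, report=default_report) -> bool:
    """Run one scraping pass; report any failure and return whether it succeeded."""
    try:
        scrape_once()
        return True
    except Exception as exc:  # broad on purpose: any interruption should be reported
        report(exc)
        return False
```

Passing the reporter as a parameter keeps the scraping loop testable without a live Sentry project.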

Human Behavior Emulation

The data scraping process closely emulates human behavior, thus avoiding bans from database websites. Our data scraper solves CAPTCHAs, handles pagination, scrolls pages, changes its IP address dynamically, and rotates user agents.
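Two of these techniques, user agent rotation and irregular pacing between requests, can be sketched in Python with the standard library alone. The user agent strings, the delay bounds, and the helper names are placeholders chosen for the example, not values from the project.

```python
import itertools
import random
import urllib.request

# Placeholder user agent strings; a real deployment would use current browser values.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0) ExampleBrowser/2.0",
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/3.0",
]

_agent_cycle = itertools.cycle(USER_AGENTS)


def build_request(url: str) -> urllib.request.Request:
    """Create a request whose User-Agent header rotates on every call."""
    return urllib.request.Request(url, headers={"User-Agent": next(_agent_cycle)})


def human_delay(min_s: float = 2.0, max_s: float = 6.0) -> float:
    """Pick a random pause in seconds so requests are not evenly spaced."""
    return random.uniform(min_s, max_s)
```

A caller would sleep for `human_delay()` seconds between page loads and open each request with `urllib.request.urlopen(build_request(url))`.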

The scraper can collect data continuously, adding new contacts as they show up in the databases, or work on a schedule, scanning the databases once a day or once a week.
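The scheduled mode can be pictured with a small Python sketch that decides whether a database is due for another scan. The schedule names `daily` and `weekly` mirror the options above, while the function names and the idea of storing a last-run timestamp are assumptions for the example.

```python
from datetime import datetime, timedelta

# Supported schedules mapped to the interval between scans.
INTERVALS = {"daily": timedelta(days=1), "weekly": timedelta(weeks=1)}


def next_run(last_run: datetime, schedule: str) -> datetime:
    """Return the earliest time the next scan should start."""
    return last_run + INTERVALS[schedule]


def is_due(last_run: datetime, schedule: str, now: datetime) -> bool:
    """True when enough time has passed since the last scan."""
    return now >= next_run(last_run, schedule)
```

Continuous collection is the limiting case: the scraper simply starts the next pass as soon as the previous one finishes.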

Results

Our data scraper is already in daily use by our client, bringing in dozens of new potential clients each day. Its user-friendly design made the new tool simple for employees to adopt, and its CMS integration let it fit seamlessly into the existing software infrastructure.

Let's Work Together!

Do you want to know the total cost of developing and delivering your project? Tell us about your requirements, and our specialists will contact you as soon as possible.