Custom Search Platform for Recruitment Agency

Azati designed and developed a custom recruitment platform for a staffing agency based in New Jersey. The platform uses a network of interconnected microservices to streamline resume search, candidate evaluation, and hiring in general, which speeds up the recruitment process.


All Technologies Used

Python
Keras
TensorFlow
Flask
Rails
React
Ruby
Numpy
MongoDB
Selenium
Redis

Motivation

The customer needed a custom solution to automatically collect resumes from various websites, create a database, classify candidates, enhance their CVs with missing skills, and enable efficient search across the candidate database. The goal was to improve the hiring process, reducing time and costs for recruitment while ensuring high-quality matches between candidates and job descriptions.

Main Challenges

Challenge 1
Web Scraping and Data Merging

The first challenge was extracting unstructured data from multiple job sites such as LinkedIn, Indeed, and Stack Overflow, where candidates often upload incomplete or outdated resumes. The system needed to merge this data into a single comprehensive candidate profile while working around the scraping limitations these sites impose.
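The merging step can be illustrated with a minimal sketch. It assumes that partial profiles from different sites are matched on a normalized email address and that newer values overwrite older ones; the case study does not say how matching was done, and `PartialProfile`, `merge_profiles`, and all field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PartialProfile:
    """One scraped view of a candidate (all field names are hypothetical)."""
    source: str
    email: str
    name: str = ""
    skills: set[str] = field(default_factory=set)
    last_updated: str = ""  # ISO date string, e.g. "2023-05-01"


def merge_profiles(partials: list[PartialProfile]) -> dict[str, dict]:
    """Group partial profiles by email and merge each group into one record."""
    merged: dict[str, dict] = {}
    # Process oldest first so that newer names overwrite older ones.
    for p in sorted(partials, key=lambda p: p.last_updated):
        key = p.email.strip().lower()  # assumed matching key
        record = merged.setdefault(
            key, {"name": "", "skills": set(), "sources": [], "last_updated": ""}
        )
        if p.name:
            record["name"] = p.name
        record["skills"] |= {s.lower() for s in p.skills}  # union of all skills
        record["sources"].append(p.source)
        record["last_updated"] = max(record["last_updated"], p.last_updated)
    return merged


partials = [
    PartialProfile("site_a", "Jane@Example.com", "Jane Doe", {"Python", "Flask"}, "2021-03-10"),
    PartialProfile("site_b", "jane@example.com", "Jane A. Doe", {"python", "Redis"}, "2023-05-01"),
]
profile = merge_profiles(partials)["jane@example.com"]
```

Here the two partial records collapse into one profile that keeps the most recent name and the union of skills from both sites.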

Challenge 2
Understanding Complex Technologies and Skills

Recruiters often lack detailed knowledge of the vast array of technologies, programming languages, and frameworks that developers use. The challenge was to train a machine learning model that learns the relationships between these technologies and predicts missing skills from the available resume data.

Key Features

  • Web Scraping Engine: The platform automatically scrapes resumes from various job sites and merges them into a comprehensive candidate profile.
  • Data Classification and Tagging: Resumes are classified and tagged with relevant skills, programming languages, and frameworks, improving search accuracy.
  • Machine Learning Model: The system predicts missing skills and enhances resumes by associating known technologies with similar competencies.
  • Cloud-Based Architecture: The solution is scalable and cost-effective, hosted in the cloud to avoid on-site infrastructure and provide flexibility.
  • Proxy Management System: A built-in proxy management system ensures that the platform bypasses scraping limitations imposed by job sites.

Our Approach

Custom Web Scraping Network
We built a network of custom web scrapers using Selenium, assigning each scraper to a specific website. To work around site-imposed limitations and keep each scraper looking like an ordinary user, we managed proxies and user agents.
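One plausible way to manage proxies and user agents is simple round-robin rotation, sketched below. The rotation strategy, the `browser_arguments` helper, and the pool values are assumptions for illustration only; the case study does not describe the actual mechanism. The sketch only builds Chrome command-line arguments, which with Selenium could be passed to a browser session through `ChromeOptions.add_argument`.

```python
import itertools
from typing import Iterator

# Placeholder pools; a real deployment would load these from configuration.
PROXIES = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ExampleBrowser/1.0",
]


def browser_arguments(proxies: list[str], user_agents: list[str]) -> Iterator[list[str]]:
    """Yield Chrome command-line arguments, cycling through both pools."""
    proxy_cycle = itertools.cycle(proxies)
    agent_cycle = itertools.cycle(user_agents)
    while True:
        yield [
            f"--proxy-server=http://{next(proxy_cycle)}",
            f"--user-agent={next(agent_cycle)}",
        ]


sessions = browser_arguments(PROXIES, USER_AGENTS)
first = next(sessions)  # arguments for the first scraper session
```

Because the two pools have different lengths, successive sessions get different proxy and user-agent pairings before the combination repeats.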
Unstructured Data Analysis and Structuring
The data extracted from websites was unstructured. We converted it into a structured format and stored it in a NoSQL database, which allowed recruiters to search and evaluate candidates effectively.
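As a rough illustration of the structuring step, the sketch below turns free-form resume text into a flat record that a document database could store. The record layout, the `structure_resume` helper, the regular expressions, and the tiny skill vocabulary are all assumptions, not the project's actual schema.

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Hypothetical skill vocabulary; a real one would be far larger.
KNOWN_SKILLS = {"python", "ruby", "rails", "react", "flask", "mongodb", "redis"}


def structure_resume(raw_text: str, source: str) -> dict:
    """Turn free-form resume text into a flat, searchable record."""
    email_match = EMAIL_RE.search(raw_text)
    # Lowercased word tokens; punctuation such as commas and periods is dropped.
    tokens = set(re.findall(r"[a-z0-9+#]+", raw_text.lower()))
    return {
        "source": source,
        "email": email_match.group(0).lower() if email_match else None,
        "skills": sorted(KNOWN_SKILLS & tokens),
        "raw_text": raw_text,  # keep the original text for later reprocessing
    }


raw = "Jane Doe <jane@example.com>\nBuilt APIs with Python, Flask and Redis."
record = structure_resume(raw, source="site_a")
```

A dictionary of this shape maps directly onto a document in a store such as MongoDB (for example through PyMongo's `insert_one`), and the `skills` list gives recruiters a field to filter on.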
Machine Learning for Skill Enhancement
We trained a machine learning model to identify missing skills and classify candidates into groups based on their expertise, correlating known frameworks and programming languages to predict additional competencies. This feature enhanced incomplete resumes and provided recruiters with a more accurate understanding of candidates' capabilities.
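The idea of predicting missing skills from related technologies can be shown with a deliberately simple stand-in. The sketch below uses plain co-occurrence counting instead of a trained neural model; it is not the project's model, whose architecture the case study does not describe, and the function names and sample corpus are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(resumes: list[set[str]]) -> dict[str, Counter]:
    """Count how often each pair of skills appears on the same resume."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for skills in resumes:
        for a, b in permutations(skills, 2):
            counts[a][b] += 1
    return counts


def suggest_skills(known: set[str], counts: dict[str, Counter], top_n: int = 2) -> list[str]:
    """Rank skills that co-occur with the known ones but are not listed yet."""
    scores: Counter = Counter()
    for skill in known:
        for other, n in counts.get(skill, {}).items():
            if other not in known:
                scores[other] += n
    # Sort by score, then alphabetically, so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [skill for skill, _ in ranked[:top_n]]


corpus = [
    {"rails", "ruby", "redis"},
    {"rails", "ruby"},
    {"react", "javascript"},
    {"rails", "ruby", "react"},
]
counts = build_cooccurrence(corpus)
suggestions = suggest_skills({"rails"}, counts)
```

In this toy corpus a resume that lists only Rails gets Ruby as the strongest suggestion, because the two appear together on three resumes. A learned model would replace the raw counts with learned associations, but the input and output would have the same shape.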
Cloud Architecture for Scalability
The entire solution was hosted in the cloud to reduce maintenance costs and improve scalability. We used asynchronous processing, scaled the web scraping engine using Docker containers, and leveraged React for a highly interactive user interface.
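Asynchronous processing of scraped pages might look like the following sketch, which uses Python's `asyncio` with a semaphore to cap concurrency. The `process_page` coroutine is a hypothetical stand-in that only sleeps; the case study does not specify which async framework or queue the platform uses.

```python
import asyncio


async def process_page(url: str) -> str:
    """Stand-in for fetching and parsing one page (hypothetical)."""
    await asyncio.sleep(0.01)  # simulates network and parsing latency
    return f"processed:{url}"


async def process_all(urls: list[str], max_concurrency: int = 3) -> list[str]:
    """Process pages concurrently while capping how many run at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url: str) -> str:
        async with semaphore:
            return await process_page(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))


urls = [f"https://example.com/page/{i}" for i in range(5)]
results = asyncio.run(process_all(urls))
```

The concurrency cap plays the same role within one process that the number of running containers plays across the cluster: it bounds load on each target site while keeping workers busy.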

Project Impact

The platform processes about half a million webpages per month, an average of 17,000 per day, and provides recruiters with accurate, enriched candidate profiles. It increases the number of relevant candidates by 127%, and classifying and tagging a candidate takes only 4 seconds on average. The result is a faster hiring process, less stress for recruiters, and better candidate matching.
