Inventory Search Engine for Auto Parts Retailer

Azati designed and developed an intelligent inventory search engine that enhances traditional search algorithms. It analyzes the user input and looks for a specific entry. If the algorithm can’t find the requested item in the inventory, it examines the characteristics of the requested object and returns a list of similar products.

CUSTOMER:

A Brazilian online retailer of automotive parts and accessories for cars, vans, trucks, and sport utility vehicles. The customer owns and manages a chain of auto parts stores, car workshops, and an engineering lab. The customer is known for a broad selection of automotive parts and accessories and offers a structured discount pricing scheme for individual consumers.

The customer has a complex supply chain that helps it deliver auto parts across the country in the shortest possible time. Its catalog contains about 500,000 parts and about 100,000 auto-related products.

The customer suffered from a technical issue: search across the huge catalog was inaccurate and slow. A data lookup took about 2 minutes. The search algorithms had never been optimized, so there were several ways we could improve the situation.

OBJECTIVE:

After joining the project, we discovered that there was no single catalog of all auto parts. There were several catalogs from different manufacturers and several catalogs for internal use. The existing solution performed inefficient chains of requests from one data source to another. Originally, there had been a single catalog that included an enormous number of auto parts and accessories.

Today the other catalogs also store a massive amount of data about auto parts and accessories, and the volume of data has increased dramatically. So we decided not to improve the existing solution but to design a new system that can handle this amount of data. We faced several challenges.

CHALLENGE #1:

The catalogs came in several formats: XLSX files (Excel), TXT files, CSV files (open-source spreadsheets), databases (MySQL), and APIs. The majority of the files were autogenerated by separate ERP software.

So our team decided to develop several universal connectors (one for each catalog) that handle data extraction and convert the data from different sources into one well-structured format.
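The connector idea can be illustrated with a minimal sketch. It assumes a hypothetical unified record with `sku`, `name`, and `source` fields; the field names, file layouts, and sample data below are illustrative and are not taken from the project.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class PartRecord:
    """Hypothetical unified format that every connector emits."""
    sku: str
    name: str
    source: str


def read_csv_catalog(text: str, source: str) -> list[PartRecord]:
    """Connector for comma-separated catalogs with a 'sku,name' header row."""
    rows = csv.DictReader(io.StringIO(text))
    return [PartRecord(r["sku"].strip(), r["name"].strip(), source) for r in rows]


def read_txt_catalog(text: str, source: str) -> list[PartRecord]:
    """Connector for plain-text catalogs with one 'SKU | name' pair per line."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        sku, name = line.split("|", 1)
        records.append(PartRecord(sku.strip(), name.strip(), source))
    return records


# Two differently formatted sources end up in the same structure.
csv_data = "sku,name\nBP-100,Brake pad front\n"
txt_data = "OF-200 | Oil filter\n"
catalog = read_csv_catalog(csv_data, "supplier_a") + read_txt_catalog(txt_data, "erp_export")
```

Each connector hides its own format details, so the rest of the pipeline only ever sees `PartRecord` objects.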

CHALLENGE #2:

Once the data is extracted, it is time for the intelligent search. To build an accurate, fast, and secure search engine, we decided to retrieve attributes from every auto part and use that data for smart tagging. This means that every auto part has a unique set of tags, but products from the same category have similar tags.

The user query is analyzed for object attributes. Based on the matching tags, the system determines the type and category of the object and performs a data lookup.
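As a rough illustration of tag-based query analysis, the sketch below maps query words to known tags and lets the tags vote for a category. The tag vocabulary and category names are hypothetical; the real system derives tags from part attributes, which are not described in detail here.

```python
import re
from typing import Optional

# Hypothetical tag vocabulary: tag -> category it points to.
TAG_CATEGORIES = {
    "brake": "brakes",
    "pad": "brakes",
    "disc": "brakes",
    "oil": "filters",
    "filter": "filters",
    "air": "filters",
}


def extract_tags(text: str) -> set[str]:
    """Lowercase the text and keep only the words that are known tags."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w in TAG_CATEGORIES}


def guess_category(query: str) -> Optional[str]:
    """Pick the category that most query tags vote for, or None if no tag matches."""
    votes: dict[str, int] = {}
    for tag in extract_tags(query):
        category = TAG_CATEGORIES[tag]
        votes[category] = votes.get(category, 0) + 1
    if not votes:
        return None
    return max(votes, key=votes.get)
```

Once a category is chosen, the lookup only needs to consider parts that carry similar tags instead of scanning the whole catalog.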

CHALLENGE #3:

The customer wanted us to build a system where a user can look for any auto part from a single search field. Our engine had to analyze the user query and perform a data lookup in real time, rebuilding the result page. The content of the page should refresh automatically when the user adds new words to the query, as Google does.

So we decided to give up using a database and perform an in-memory search to avoid disk read-write bottlenecks.
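A minimal sketch of an in-memory lookup that supports search-as-you-type is shown below. It assumes a simple inverted index with prefix matching; the class name and data are hypothetical, and the production engine's data structures are not described in this case study.

```python
from collections import defaultdict


class InMemoryIndex:
    """Token -> set of part ids, held entirely in RAM."""

    def __init__(self) -> None:
        self._postings: dict[str, set[int]] = defaultdict(set)
        self._names: dict[int, str] = {}

    def add(self, part_id: int, name: str) -> None:
        """Register a part under every token of its name."""
        self._names[part_id] = name
        for token in name.lower().split():
            self._postings[token].add(part_id)

    def search(self, query: str) -> list[str]:
        """Return names of parts where every query word is a prefix of some token."""
        result = None
        for word in query.lower().split():
            matches: set[int] = set()
            for token, ids in self._postings.items():
                if token.startswith(word):
                    matches |= ids
            # Intersect so that every query word must match.
            result = matches if result is None else result & matches
        return sorted(self._names[i] for i in (result or set()))


index = InMemoryIndex()
index.add(1, "Brake pad front")
index.add(2, "Brake disc rear")
index.add(3, "Oil filter")
```

Because prefixes match, the result list can narrow on every keystroke: `index.search("bra")` returns both brake parts, while `index.search("bra pa")` returns only the brake pad. The linear scan over tokens keeps the sketch short; a sorted token list or a trie would be a natural choice for a large vocabulary.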

PROCESS:

At Azati we prefer building modular systems, even for small projects: such systems are more straightforward to build, test, scale, and maintain. This project is no exception. Our team decided to create two modules: one for data processing, another for user interface generation.

Because the customer wanted the system to regenerate the result page content every time the user changes the input, we decided from the very beginning to use React as the front-end framework for UI generation. It can update the page content without reloading the browser tab. The first module is based on React, with Python Tornado as the back end for request handling.

The second module is responsible for entity extraction, parts tagging, and data lookup. Many different files contained the information we needed, so our team developed a custom rule-based parser that identifies the document type and provider on the fly.

For every provider, we described a set of rules that the parser uses to extract data. The parser takes a document as input and returns structured data as output.

For several data sources, the information about auto parts was stored in MySQL and Access databases. For these databases, we built an algorithm that asynchronously sends requests to multiple databases at the same time, extracting the data without waiting for each database in turn.

As the core of the search engine, we used an enhanced pairwise comparison algorithm developed by our PhDs and well tested on bioinformatics samples in recent projects. As a result, we developed a prototype and presented it to the customer. The customer was quite impressed and asked us to improve the prototype and turn it into a complete solution.
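The per-provider rules might look roughly like the sketch below. It assumes that a provider can be recognized from the column names in a document's header; the provider names, signatures, and field mappings are hypothetical examples, not the project's actual rules.

```python
import csv
import io

# Hypothetical provider rules: a header signature used for detection and
# a mapping from provider-specific column names to unified field names.
PROVIDER_RULES = {
    "supplier_a": {
        "signature": {"PartNo", "Description"},
        "fields": {"PartNo": "sku", "Description": "name"},
    },
    "supplier_b": {
        "signature": {"codigo", "produto"},
        "fields": {"codigo": "sku", "produto": "name"},
    },
}


def detect_provider(header: list[str]) -> str:
    """Return the first provider whose signature columns all appear in the header."""
    columns = set(header)
    for provider, rule in PROVIDER_RULES.items():
        if rule["signature"] <= columns:
            return provider
    raise ValueError(f"no rule matches header {header}")


def parse_document(text: str) -> list[dict[str, str]]:
    """Detect the provider, then rename its columns to the unified schema."""
    reader = csv.DictReader(io.StringIO(text))
    provider = detect_provider(list(reader.fieldnames or []))
    fields = PROVIDER_RULES[provider]["fields"]
    return [
        {"provider": provider, **{fields[k]: v for k, v in row.items() if k in fields}}
        for row in reader
    ]
```

Supporting a new provider then means adding one entry to the rule table rather than writing new parsing code.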
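The concurrent querying described above can be sketched with Python's `asyncio`. The `query_source` coroutine is a stand-in that sleeps instead of contacting a real database, and the source names and delays are made up for the example; a real implementation would call an asynchronous database driver at that point.

```python
import asyncio


async def query_source(name: str, delay: float) -> tuple[str, list[str]]:
    """Stand-in for a real database call; sleeps instead of doing I/O."""
    await asyncio.sleep(delay)
    return name, [f"{name}-part-1", f"{name}-part-2"]


async def fetch_all(sources: dict[str, float]) -> dict[str, list[str]]:
    """Query every source concurrently and collect the rows per source."""
    tasks = [query_source(name, delay) for name, delay in sources.items()]
    results = await asyncio.gather(*tasks)  # results keep the order of tasks
    return dict(results)


rows = asyncio.run(fetch_all({"mysql_main": 0.02, "access_legacy": 0.01}))
```

With `asyncio.gather`, the total wait is close to the slowest single source rather than the sum of all of them.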
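The enhanced algorithm itself is not described here, so the sketch below only illustrates the general idea of pairwise comparison: the query is compared with each candidate, and candidates are ranked by similarity score. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in scoring function, which is an assumption and not the algorithm used in the project.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Score two strings between 0.0 (nothing in common) and 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank(query: str, names: list[str], limit: int = 3) -> list[str]:
    """Compare the query with every name and return the best matches first."""
    scored = sorted(names, key=lambda n: similarity(query, n), reverse=True)
    return scored[:limit]


best = rank("brake pad", ["Oil filter", "Brake pad front", "Spark plug"])
```

The same ranking also covers the fallback case: when no exact entry exists, the highest-scoring candidates can be returned as similar products.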

SOLUTION:

The final solution includes two interconnected modules: one for user interface generation, another for document processing and data lookup. These modules are hosted in a custom-built cloud inside the client’s infrastructure. Because a vast amount of information describes the auto parts, it takes about 3 GB of RAM to store all the attributes and tags in memory. If the customer loads a bigger dataset, the system automatically allocates additional memory.

The search engine includes two modules:

Document processing and search module
API for User Interface Generation

RESULTS:

We successfully built an intelligent search engine that aggregates information from various data sources, from documents to external databases. Our team implemented an enhanced search algorithm to provide quick and accurate search results. Because all data is stored in memory, data lookups are fast: a query takes about 1 second. The engine also ranks results by relevance using advanced scoring.

A FEW NUMBERS:

15K objects were analyzed while developing a prototype
300K attributes were generated from the sample dataset
~1 second is needed to examine the search query and return a result

NOW:

We launched this project in late November 2017. We now maintain the solution and collect usage statistics, which help us understand the users better. We plan to propose that the client improve the search with machine learning, which should make it even more accurate and faster.
