Faster Data Processing
Reduced total runtime from 48 hours to 30.5 minutes, drastically improving research productivity.
Azati significantly accelerated the client’s DNA sequence processing software by identifying and optimizing a critical bottleneck in the FASTAptamer toolkit. The team rewrote the core clusterization logic in C++ instead of Perl, making total execution roughly 80x faster and the embedded Levenshtein calculation roughly 1,000x faster.
80x overall software performance improvement
1,000x Levenshtein algorithm execution speedup
30.5 minutes end-to-end processing time after optimization
The client needed a dramatic acceleration of their DNA sequencing pipeline, which processed vast amounts of biological data. Manual research workflows were slowed by a software bottleneck, delaying experiments and reducing lab productivity. The goal was to shorten processing time while maintaining result accuracy, enabling faster research cycles and timely delivery of insights.
The client’s DNA sequencing software took approximately 48 hours to process a dataset because the FASTAptamer toolkit had a major performance bottleneck. This delay disrupted research timelines and impacted productivity. Azati analyzed the pipeline, pinpointed the slowest steps, and proposed performance engineering solutions to optimize them using low-level programming techniques.
The clusterization step relied on the Levenshtein algorithm implemented in Perl, which was inefficient for high-throughput data. Azati proposed rewriting this logic in C++ to exploit faster execution, better memory handling, and native compilation advantages, drastically reducing processing time while maintaining accuracy.
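For readers unfamiliar with the algorithm, the following is a minimal sketch of a Levenshtein (edit distance) calculation in C++. It is an illustration only and is not the code that was merged into FASTAptamer: the function name `levenshtein` and the two-row layout are assumptions chosen for clarity. It shows the kind of tight inner loop over contiguous memory that a compiled language handles well.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>
#include <utility>
#include <vector>

// Illustrative sketch only; not the implementation merged into FASTAptamer.
// Computes the Levenshtein (edit) distance between two sequences using two
// rolling rows instead of a full (|a|+1) x (|b|+1) matrix.
std::size_t levenshtein(std::string_view a, std::string_view b) {
    // Keep the shorter sequence as the row dimension to reduce memory use.
    if (a.size() < b.size()) std::swap(a, b);

    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;  // "" -> b[0..j)

    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;  // a[0..i) -> ""
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,            // deletion
                                curr[j - 1] + 1,        // insertion
                                prev[j - 1] + cost});   // substitution or match
        }
        std::swap(prev, curr);  // the finished row becomes the previous row
    }
    return prev[b.size()];
}
```

Under these assumptions, `levenshtein("ACGT", "AGT")` returns 1 (one deletion). The sketch needs C++17 for `std::string_view`.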
Analyzed the client's DNA sequencing pipeline to locate performance bottlenecks.
Determined that the clusterization program in FASTAptamer, particularly the Levenshtein calculation, consumed the majority of execution time.
Benchmarked the Perl implementation and confirmed that it was too slow for high-throughput operations.
Rewrote the Levenshtein algorithm in C++ to enhance execution speed and memory efficiency.
Integrated the optimized C++ algorithm into the client’s pipeline and validated results to ensure consistency and accuracy.
Submitted the improved algorithm to the official FASTAptamer repository, where it was merged in version 1.0.12, benefiting the global bioinformatics community.
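The steps above center on a clusterization pass that calls the distance function many times. The sketch below shows one way such a pass could look, assuming a greedy, seed-based scheme in which sequences arrive pre-sorted by abundance and each unassigned sequence seeds a new cluster. The actual FASTAptamer clustering logic may differ; `edit_distance`, `cluster_sequences`, and `max_distance` are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Plain two-row Levenshtein distance (illustrative, not FASTAptamer's code).
std::size_t edit_distance(std::string_view a, std::string_view b) {
    if (a.size() < b.size()) std::swap(a, b);
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost});
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}

// Hypothetical greedy clustering. Assumes `seqs` is already ordered from most
// to least abundant. Returns the cluster index assigned to each sequence.
std::vector<std::size_t> cluster_sequences(const std::vector<std::string>& seqs,
                                           std::size_t max_distance) {
    constexpr std::size_t unassigned = static_cast<std::size_t>(-1);
    std::vector<std::size_t> cluster_of(seqs.size(), unassigned);
    std::size_t next_cluster = 0;

    for (std::size_t i = 0; i < seqs.size(); ++i) {
        if (cluster_of[i] != unassigned) continue;  // already in a cluster
        cluster_of[i] = next_cluster;               // sequence i seeds a cluster
        for (std::size_t j = i + 1; j < seqs.size(); ++j) {
            // Join any still-unassigned sequence close enough to the seed.
            if (cluster_of[j] == unassigned &&
                edit_distance(seqs[i], seqs[j]) <= max_distance) {
                cluster_of[j] = next_cluster;
            }
        }
        ++next_cluster;
    }
    return cluster_of;
}
```

Because every seed is compared against all remaining sequences, the number of distance calls grows roughly quadratically with the dataset size, which is why the speed of the distance function dominates total runtime. Comparing the returned cluster indices against those produced by the original implementation on the same input is one simple way to check that a rewrite leaves the output unchanged.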
The original Levenshtein algorithm in Perl was a major performance bottleneck. Azati rewrote it in C++ to leverage efficient memory management and faster computation, allowing the software to compute sequence distances roughly 1,000 times faster without changing the output results.
Beyond the algorithm rewrite, Azati optimized the full DNA sequencing workflow, removing unnecessary delays and improving data handling across modules. This reduced total execution time for datasets from 48 hours to just 30.5 minutes, massively increasing research throughput and lab efficiency.
The optimized Levenshtein algorithm was submitted to FASTAptamer’s official repository and merged in version 1.0.12. This not only improved performance for the client but also contributed to the wider bioinformatics community, giving all FASTAptamer users faster DNA sequence analysis.
Reduced total runtime from 48 hours to 30.5 minutes, drastically improving research productivity.
Output results remained consistent, ensuring scientific integrity and reproducibility.
Optimization accepted into the mainstream toolset, benefiting all FASTAptamer users worldwide.