Genetic analysis tools

Big data IP solution


Description

2008 to present: Strategy, Big Data, Data Processing, Portal, Outsourcing

Development of a single online source of sequence data that addresses scientists’ and researchers’ Intellectual Property (IP) sequence search needs, including patentability, Freedom-to-Operate (FTO), patent infringement, validity, and business intelligence.

Client

SequenceBase Corporation is one of the leading providers of patent sequence information to the biotechnology, legal, pharmaceutical, scientific, technical, and academic bioinformatics communities.

Challenge

Intellectual property experts state that about 80% of the information published in a patent document is not available anywhere else. No lexicographers monitor the publication process, so inventors can use whatever language they choose. On top of that, researchers had to work carefully through multiple databases and workflows, which often left them with major difficulties:

  • Is this search really complete?
  • The volume of results is overwhelming!
  • Time taken for results analysis inhibits the IP sequence search process
  • Sifting through duplicated results from multiple databases is slow and inefficient
  • Outsourcing is expensive and hard to integrate into workflows
  • It’s hard to share and report results in a simple accessible way

In addition, the volume of legal and scientific information grows exponentially each year, which makes this multi-database approach inadequate in the long run. Under these conditions, details are easily missed in genetics research, and that is not an option. The accuracy and comprehensiveness of DNA and protein research are critical to achieving goals such as the discovery of new drugs and vaccines, genetic therapies, and sustainable agriculture.

For a long time, biotech and pharmaceutical companies, academia, and law firms had been expressing the need for a reliable and simple way to discover IP sequence information across disparate resources.

Solution

The Azati team tackled the challenge by developing and maintaining an online IP Research Portal with easy-to-use, readily accessible content, search, analysis, and reporting tools.

As a first step, a new product was developed to cover all available genetic sequences from USPTO published applications and issued patents dating back to 1982. Each database record contains a sequence and related data, including organism name, sequence length, and tables for modifications and other features. Bibliographic and text search options are also provided, including publication title, abstract, patent assignees at issue, full inventor names, and the complete set of publication, application, and parent case WIPO/PCT numbers and dates.
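To make the record layout described above more concrete, here is a minimal sketch in Python. It is an illustration only: the class name, field names, and the search helper are hypothetical and do not reflect the actual SequenceBase schema, which is not disclosed.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class PatentSequenceRecord:
    """Hypothetical shape of one patent sequence record (not the real schema)."""
    sequence: str                       # nucleotide or protein sequence
    organism: str                       # organism name
    publication_number: str             # publication identifier
    title: str                          # publication title
    assignees: list[str] = field(default_factory=list)
    inventors: list[str] = field(default_factory=list)
    publication_date: Optional[date] = None
    features: list[dict] = field(default_factory=list)  # modifications and other features

    @property
    def length(self) -> int:
        # Sequence length is derived from the sequence itself.
        return len(self.sequence)


def search_by_title(records, term):
    """Return records whose title contains `term`, ignoring case."""
    needle = term.lower()
    return [r for r in records if needle in r.title.lower()]


# Placeholder data, invented purely for the example.
records = [
    PatentSequenceRecord("ACGTACGT", "Homo sapiens", "EXAMPLE-0001",
                         "Methods for gene therapy"),
    PatentSequenceRecord("MKTAYIAK", "Escherichia coli", "EXAMPLE-0002",
                         "Recombinant protein production"),
]
hits = search_by_title(records, "gene therapy")
```

In a production system this kind of bibliographic lookup would normally be delegated to a search index rather than a Python loop; the sketch only shows which fields travel together in a record.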

As a second step, we developed the SequenceBase® BLAST® Search Portal, a web-based access point for comprehensive patent sequence searching across various sequence databases.

The portal supports searching by query sequence using one of the following algorithms: BLAST, Smith-Waterman, Multiple Sequence Search, or MOTIF. It also offers extensive data filtering, advanced reporting, and export features.
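For readers unfamiliar with Smith-Waterman, one of the algorithms listed above, the following is a minimal textbook sketch of local alignment in Python. It is not the portal's implementation: the function name, the scoring values (match, mismatch, gap), and the linear gap penalty are assumptions chosen for illustration.

```python
def smith_waterman(query, subject, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_query, aligned_subject) for the best local alignment.

    Scoring values are illustrative assumptions, with a simple linear gap penalty.
    """
    rows, cols = len(query) + 1, len(subject) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; cells never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if query[i - 1] == subject[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + pair,  # align the two characters
                score[i - 1][j] + gap,       # gap in subject
                score[i][j - 1] + gap,       # gap in query
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    aligned_q, aligned_s = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        pair = match if query[i - 1] == subject[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + pair:
            aligned_q.append(query[i - 1])
            aligned_s.append(subject[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned_q.append(query[i - 1])
            aligned_s.append("-")
            i -= 1
        else:
            aligned_q.append("-")
            aligned_s.append(subject[j - 1])
            j -= 1

    return best, "".join(reversed(aligned_q)), "".join(reversed(aligned_s))


result = smith_waterman("ACGT", "TTACGTTT")
```

This version runs in time proportional to the product of the two sequence lengths, which is why heuristic methods such as BLAST are commonly preferred for searching large databases, with Smith-Waterman used when sensitivity matters more than speed.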

Data updates occur every 24 hours. To handle the growing volume of data and its processing needs, we developed a long-term Big Data strategy. The system uses cloud, distributed processing, and scaling technologies to deliver data quickly. As a result, SequenceBase was the first company to announce “same day” data delivery to its clients.

The Azati software engineering team specializes in Intellectual Property, Machine Learning, Statistics, and Biostatistics, and has years of experience working with large datasets and optimizing accelerated sequence processing. The SequenceBase Research Portal addresses the challenges described above and thereby advances scientists’ research capabilities.





Skills:

  • Big Data
  • Cloud
  • Ruby
  • Postgres
  • Solr
  • Javascript

Some details are not disclosed due to NDA restrictions.