Automated Gene-Retrieval System for Biological Information Needs

Bioinformatics | Computer Sciences


Biological experiments are expensive, largely because of the sophisticated equipment they require. Being able to isolate and focus on a smaller set of genes or gene products that are highly relevant to the experiment, pathway, or biological system under investigation is therefore desirable, since it can reduce experimental costs. In this work, we propose an intelligent information system that generates a ranked list of the genes and gene products most pertinent to a given biological pathway, experiment, or system (henceforth referred to as a biological context). We assume that the biological context of interest can be described by textual query terms and phrases from the biological domain which, in turn, relate to the molecular functions, biological processes, and cellular components of genes and their products. Text-based analysis and mining are applied to the published literature, in the form of publication abstracts downloaded from PubMed, to rank genes and gene products that have identified relationships to the specified description terms, based on the Gene Ontology (GO) standard. At this stage, our approach produces promising results given the surrounding restrictions, one of which is the lack of similar work in the literature. For demonstration purposes, we report experimental results on the molting regulation pathway in Drosophila melanogaster (fruit fly).
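To make the general idea of literature-based gene ranking concrete, the following is a minimal sketch and not the system described in this work. It assumes a simple co-occurrence score: a gene gains one point for every query term found in an abstract that also mentions the gene. The function name `rank_genes`, the scoring rule, the placeholder gene names (`geneA`, `geneB`, `geneC`), and the toy abstracts are all hypothetical illustrations. Real use would need PubMed retrieval, gene-name normalisation, and GO-based term matching, none of which are shown here.

```python
from collections import defaultdict


def rank_genes(abstracts, gene_names, query_terms):
    """Rank genes by query-term hits in the abstracts that mention them.

    Assumption: plain case-insensitive substring matching stands in for
    proper gene-name recognition and GO term mapping.
    """
    scores = defaultdict(int)
    terms = [t.lower() for t in query_terms]
    for text in abstracts:
        lowered = text.lower()
        # Number of distinct query terms that occur in this abstract.
        hits = sum(1 for t in terms if t in lowered)
        if hits == 0:
            continue
        for gene in gene_names:
            if gene.lower() in lowered:
                scores[gene] += hits
    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy, made-up abstracts with hypothetical gene names (not real PubMed data).
toy_abstracts = [
    "geneA regulates molting and ecdysone signalling",
    "geneB is involved in wing development",
    "geneA and geneC respond to ecdysone",
]
ranking = rank_genes(toy_abstracts, ["geneA", "geneB", "geneC"], ["molting", "ecdysone"])
print(ranking)
```

Genes that never co-occur with a query term are omitted from the ranking instead of being listed with a score of zero, which keeps the output focused on candidates with at least some textual evidence.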