Advances in molecular biology have led to diagnostic tests for infectious diseases based on genetic profiles. While probe-based assays dominate the field today, sequence-based assays hold great promise for the future. However, the variable quality of the sequence information currently held in public databases limits the growth and use of sequence-based analysis. To address this problem, a standardized method for DNA sequence validation and custom database construction was developed using Mycobacterium as a development model. With this model, a computational approach to the identification of infectious diseases was developed and evaluated. The web-based application, termed BioDatabase, identified genetic sequences by means of curated databases, each containing a relatively small set of genetic data specific to a species or group. Creating a custom database involved multiple steps, beginning with the identification of highly conserved start and end sequences and of validation parameters for the intervening sequence. The process eliminated the need for multiple sequence alignment against GenBank sequences, which are valuable yet difficult to use properly because of the size and uneven quality of that collection. The custom database approach maximized application performance with minimal impact on analysis response time, allowing investigation of the optimal sequences for identifying all Mycobacterium to the species level. In a comparison of the 16S and ITS genetic regions, a curated ITS-based approach proved most effective for identification of Mycobacterium isolates.
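
The database-creation steps described above (locating conserved start and end sequences, then validating the intervening sequence) can be illustrated with a minimal Python sketch. This is not the BioDatabase implementation: the anchor strings, the function names, and the validation rules (length bounds and a cap on non-ACGT characters) are all hypothetical assumptions chosen only to show the general idea.

```python
from typing import Dict, Optional

# Hypothetical placeholder anchors; NOT real Mycobacterium flanking sequences.
START_ANCHOR = "ACCTCC"
END_ANCHOR = "GGATCA"


def extract_region(seq: str, start_anchor: str, end_anchor: str) -> Optional[str]:
    """Return the sequence between the two anchors, or None if either is missing."""
    seq = seq.upper()
    start = seq.find(start_anchor)
    if start == -1:
        return None
    region_start = start + len(start_anchor)
    end = seq.find(end_anchor, region_start)
    if end == -1:
        return None
    return seq[region_start:end]


def is_valid_region(region: str, min_len: int, max_len: int,
                    max_ambiguous: int = 0) -> bool:
    """Assumed validation rules: length bounds and a cap on non-ACGT characters."""
    if not (min_len <= len(region) <= max_len):
        return False
    ambiguous = sum(1 for base in region if base not in "ACGT")
    return ambiguous <= max_ambiguous


def build_custom_database(records: Dict[str, str], start_anchor: str,
                          end_anchor: str, min_len: int,
                          max_len: int) -> Dict[str, str]:
    """Keep only records whose extracted region passes validation."""
    database = {}
    for name, seq in records.items():
        region = extract_region(seq, start_anchor, end_anchor)
        if region is not None and is_valid_region(region, min_len, max_len):
            database[name] = region
    return database
```

In this sketch, a record is admitted to the custom database only if both anchors are found and the region between them passes every check, so a query would later be compared against a small, pre-validated set rather than against an entire public repository.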