The human mitochondrial proteome database has been developed by combining data from public repositories with experimental results and computational predictions. The experimental data are derived from highly purified mitochondria from human heart tissue, whereas the predictions have been performed with MITOPRED, a genome-scale method for the prediction of nucleus-encoded mitochondrial proteins. Mitochondrial protein sequences from the different sources have been clustered to generate a nonredundant dataset. Annotations covering protein function, structure, disease association, and pathways are collected from a number of public databases using common UNIX utilities and Perl scripts. This chapter provides a detailed description of the data sources and of the methods used to download, curate, parse, and generate meaningful annotations from primary as well as derived databases.
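The step of merging sequences from several sources into a nonredundant dataset can be illustrated with a minimal sketch. The clustering procedure actually used for the database is not specified in this summary, so the example below substitutes the simplest possible criterion, collapsing exact duplicate sequences, and is written in Python rather than the Perl mentioned above. The function names, source labels, accessions, and sequences are all hypothetical and serve only as an illustration.

```python
def normalize(seq: str) -> str:
    """Remove whitespace, uppercase, and drop a trailing stop symbol (*)."""
    return "".join(seq.split()).upper().rstrip("*")


def collapse_identical(records):
    """Group (source, accession, sequence) records by identical sequence.

    This is an assumed stand-in for real sequence clustering: only exact
    matches after normalization are merged. Returns one entry per unique
    sequence, in first-seen order, listing every contributing record.
    """
    clusters = {}
    for source, accession, seq in records:
        key = normalize(seq)
        entry = clusters.setdefault(key, {"sequence": key, "members": []})
        entry["members"].append((source, accession))
    return list(clusters.values())


# Hypothetical toy records standing in for the three kinds of sources.
records = [
    ("experimental", "EXP_0001", "MKTLLA"),
    ("predicted", "PRED_0042", "mktlla*"),
    ("public_db", "PUB_0007", "MSSQWE"),
]

nonredundant = collapse_identical(records)
for cluster in nonredundant:
    print(len(cluster["members"]), cluster["sequence"])
```

Keeping the list of contributing records for each unique sequence preserves provenance, so annotations gathered later can still be traced back to the source that supplied the protein. A production pipeline would likely cluster at a similarity threshold rather than on exact identity.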