The sequence obtained contained multiple open reading frames (ORFs). The ORF selected for analysis spans nucleotides 25 to 1023, read in the 5′ to 3′ direction. It was chosen for its high score and query cover and its low E-value.
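The ORF coordinates above imply a specific length, which can be checked with a little arithmetic. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the convention used by most ORF-finding tools). The function names `orf_length`, `codon_count` and `extract_orf` are hypothetical helpers introduced for illustration, and whether the final codon is a stop codon is an assumption not stated in the text.

```python
def orf_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based, inclusive coordinate range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of complete codons in the range; fails if not a multiple of 3."""
    length = orf_length(start, end)
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3


def extract_orf(sequence: str, start: int, end: int) -> str:
    """Slice the ORF out of a sequence using 1-based, inclusive coordinates."""
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# Coordinates reported for the chosen ORF.
nucleotides = orf_length(25, 1023)   # 999
codons = codon_count(25, 1023)       # 333

# Assumption: if the last codon is a stop codon, the protein has one
# residue fewer than the number of codons.
residues_if_stop_included = codons - 1  # 332
```

Under these assumptions the ORF is 999 nucleotides long, that is 333 codons, giving a protein of 332 or 333 residues depending on whether the reported end coordinate includes the stop codon.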
The sequence of interest was shown to code for phosphoribosylaminoimidazole carboxylase, a member of the ATP-grasp 4 superfamily, with a molecular weight of 38698.41 Da and a theoretical isoelectric point of 8.64. This protein is involved in the synthesis of purines in various bacteria, in particular those of the genus Pelagibacter. The phylogenetic tree showed that the protein was closely related to homologues from the class Alphaproteobacteria.
Analysis of the sequence with PSORTb predicted that the protein is most likely located in the cytoplasmic membrane. Analysis with SWISS-MODEL produced a three-dimensional structure of the protein with a GMQE (Global Model Quality Estimation) value of 0.72, indicating a high expected accuracy for the predicted model.