GOS 978010

Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : JCVI_READ_1091140850046
Annotathon code: GOS_978010
Sample :
  • GPS: 1°12'58"S; 90°25'22"W
  • Galapagos Islands: Devil's Crown, Floreana Island - Ecuador
  • Coastal (-2.2m, 25.5°C, 0.1-0.8 microns)
Authors
Team : BSB06CIIT
Username : Awais
Annotated on : 2009-05-24 23:03:49
  • Rahman Obaid

Synopsis

  • Taxonomy: Bacillus cereus (NCBI info)
    Rank: species - Genetic Code: Bacterial and Plant Plastid - NCBI Identifier: 1396
    Kingdom: Bacteria - Phylum: Firmicutes - Class: Bacilli - Order: Bacillales
    Bacteria; Firmicutes; Bacillales; Bacillaceae; Bacillus; Bacillus cereus group;

Genomic Sequence

>JCVI_READ_1091140850046 GOS_978010 Genomic DNA
TTCTTTTTCGTTAGTTTTTTACCTTCTTTTAATTTTTTTCTTAATGTATCTGCTTCTTTTCTAGAAATATTTTTAGGTCGCATGTTTTGATTTTGAAGAA
TAGTGTTTTTTATTTCTTTTCTAGCTCCAGCAGCTACTTCACCTTTTACATCAAACTTAAAATTTTTATTGTTAAAAGATGCAAGTCTTTGTTTAAAAAA
ATTATCTGCATCTTTCATTTTTAAACTAGCTAAATCTTTAAAAGATTTAATTGGAGAGTTAGCAAACATATTACCTACCATTTGATTAGACTCTTTTAAC
ATTTGTTTTATTTCTTTAGCCTCGGTCCTTACTGCAGGATTCACTAGTTTTAACATTTTTTCAGATTGTACTTTACCTGGGGCCATAATATAATCTGTAA
TTTTATTACGTTCTCTTTGTAAATCATCCATGACGTTTCTATAACCAGGACTATTTGTTACTTTATCAAATATGTTACCTTTCATTTTAGTAACAATGTC
ATAAAGTTTTTCATTCATGTATTCGTCTTTACGATTAAATTTTTTAGTGATGGCTTCTAATTTAGCTTGTACTTTATCACCTTCATTTTTTAAACCTTTA
GACATTAAACCTGCAGAACCAAAGTATTCTTTTAGTTTACTTAATTTACCTGCAACTGTTTGACCAAACGTACCACCTTTAGTATTATAAAAAGCCCACA
TAGATGGATCAGGTATATCTAATTTTTCATATAGTTTACTTTTAAACTTATCTTTTTTTGTGCCAAGTTTAGTTACACCTTTAGTAATAAAAGTACCTTC
TTTACCTTTACC

Translation

[112 - 810/812]   indirect strand


Annotator commentaries

First, no multiple sequence alignment was performed, because the BLAST results were very poor. For the same reason no phylogenetic tree was built: the ORF shows no significant sequence similarity to known proteins and no significant conserved domains, so I predict that this sequence is non-coding. I was therefore unable to assign any biological process or molecular function to it.


Second, from all of the results above I can strongly predict that this is a non-coding sequence from the Bacillus cereus group.



ORF finding

PROTOCOL


SMS ORF Finder / forward strand / reading frames 1, 2 and 3 / minimum ORF length of 60 codons / ATG start codon / universal genetic code


SMS ORF Finder / reverse strand / reading frames 1, 2 and 3 / minimum ORF length of 60 codons / ATG start codon / universal genetic code


RESULTS ANALYSIS


No ORF was found on the forward strand with these parameters. On the reverse strand, one ORF was found in reading frame 1. It starts with M (ATG) but lacks a stop codon at its end: it runs from base 112 to base 810 of the 812-base read, so the gene may start in this region while its stop codon lies beyond the end of the read. I therefore carried out the remaining analyses on this reading frame.
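The scan described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the SMS ORF Finder implementation: the function names are hypothetical, and two behaviours are assumptions (the reported range includes the stop codon, and an ORF still open at the end of the read is reported). The toy sequences are invented; only the coordinates 112, 810 and 812 come from this annotation.

```python
# Minimal ORF scan: ATG start, standard stop codons, one reading frame at a time.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse strand, read 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq, frame, min_codons=60):
    """Return 1-based (first_base, last_base) pairs for reading frame 1, 2 or 3."""
    seq = seq.upper()
    orfs = []
    orf_start = None
    pos = frame - 1
    while pos + 3 <= len(seq):
        codon = seq[pos:pos + 3]
        if orf_start is None:
            if codon == "ATG":
                orf_start = pos
        elif codon in STOP_CODONS:
            if (pos - orf_start) // 3 >= min_codons:
                orfs.append((orf_start + 1, pos + 3))  # assumption: stop codon included
            orf_start = None
        pos += 3
    # Assumption: an ORF that runs off the end of the read is still reported.
    if orf_start is not None and (pos - orf_start) // 3 >= min_codons:
        orfs.append((orf_start + 1, pos))
    return orfs


def reverse_to_forward(first, last, seq_len):
    """Convert reverse-strand coordinates to forward-strand coordinates."""
    return seq_len - last + 1, seq_len - first + 1


# Toy read whose reverse strand carries an ORF with no stop codon.
toy_read = "AGCAGCAGCCATCC"
print(find_orfs(reverse_complement(toy_read), frame=3, min_codons=3))  # [(3, 14)]

# Coordinates of the ORF reported in this annotation (812-base read).
print(reverse_to_forward(112, 810, 812))  # (3, 701)
print((810 - 112 + 1) // 3)               # 233 codons
```

The last two lines show that the reverse-strand ORF 112 to 810 covers forward-strand bases 3 to 701, and that its length of 233 codons is well above the 60-codon minimum.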

a) Forward Strand

No ORFs were found in reading frame 1.

No ORFs were found in reading frame 2.

No ORFs were found in reading frame 3.


b) Reverse Strand

>ORF number 1 in reading frame 1 on the reverse strand extends from base 112 to base 810.
ATGTGGGCTTTTTATAATACTAAAGGTGGTACGTTTGGTCAAACAGTTGCAGGTAAATTA
AGTAAACTAAAAGAATACTTTGGTTCTGCAGGTTTAATGTCTAAAGGTTTAAAAAATGAA
GGTGATAAAGTACAAGCTAAATTAGAAGCCATCACTAAAAAATTTAATCGTAAAGACGAA
TACATGAATGAAAAACTTTATGACATTGTTACTAAAATGAAAGGTAACATATTTGATAAA
GTAACAAATAGTCCTGGTTATAGAAACGTCATGGATGATTTACAAAGAGAACGTAATAAA
ATTACAGATTATATTATGGCCCCAGGTAAAGTACAATCTGAAAAAATGTTAAAACTAGTG
AATCCTGCAGTAAGGACCGAGGCTAAAGAAATAAAACAAATGTTAAAAGAGTCTAATCAA
ATGGTAGGTAATATGTTTGCTAACTCTCCAATTAAATCTTTTAAAGATTTAGCTAGTTTA
AAAATGAAAGATGCAGATAATTTTTTTAAACAAAGACTTGCATCTTTTAACAATAAAAAT
TTTAAGTTTGATGTAAAAGGTGAAGTAGCTGCTGGAGCTAGAAAAGAAATAAAAAACACT
ATTCTTCAAAATCAAAACATGCGACCTAAAAATATTTCTAGAAAAGAAGCAGATACATTA
AGAAAAAAATTAAAAGAAGGTAAAAAACTAACGAAAAAG

>Translation of ORF number 1 in reading frame 1 on the reverse strand.
MWAFYNTKGGTFGQTVAGKLSKLKEYFGSAGLMSKGLKNEGDKVQAKLEAITKKFNRKDE
YMNEKLYDIVTKMKGNIFDKVTNSPGYRNVMDDLQRERNKITDYIMAPGKVQSEKMLKLV
NPAVRTEAKEIKQMLKESNQMVGNMFANSPIKSFKDLASLKMKDADNFFKQRLASFNNKN
FKFDVKGEVAAGARKEIKNTILQNQNMRPKNISRKEADTLRKKLKEGKKLTKK

No ORFs were found in reading frame 2.

No ORFs were found in reading frame 3.
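The translation of ORF number 1 can be spot-checked with a small script. The sketch below builds the standard ("universal") codon table from its usual compact encoding and translates the first and last codons of the ORF as listed above; the names `CODON_TABLE` and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# First 30 and last 15 bases of ORF number 1 (reverse strand, frame 1).
print(translate("ATGTGGGCTTTTTATAATACTAAAGGTGGT"))  # MWAFYNTKGG
print(translate("AAACTAACGAAAAAG"))                 # KLTKK
```

Both fragments agree with the start (MWAFYNTKGG) and end (KLTKK) of the reported translation, and the absence of a trailing `*` reflects the missing stop codon.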


Multiple Alignment

PROTOCOL



RESULTS ANALYSIS

RAW RESULTS

Protein Domains

PROTOCOL


_> EBI INTERPRO SCAN / -1 READING FRAME AS INPUT / WITH ALL THESE PARAMETERS

BlastProDom FPrintScan HMMPIR HMMPfam HMMSmart

HMMTigr ProfileScan ScanRegExp patternScan SuperFamily SignalPHMM

TMHMM HMMPanther


_> Pfam / Sequence Search


RESULTS ANALYSIS


_> NO HITS REPORTED for the INTERPROSCAN


_> The Pfam search returned 4 hits, but none was significant. Pfam reported: "We found 4 Pfam-A matches to your search sequence (there were no significant matches). You did not choose to search for Pfam-B matches. The graphic below shows the arrangement of matches on your sequence:"


So there is no significant conserved domain present in this sequence.


Errors may be reported below for the domain discussion. As explained, the results were entirely insignificant, and this protein does not seem to have any significant conserved domain. The errors arise because no domains are present in this protein, or because any that exist are not located at the specified positions.

RAW RESULTS

==> No hits reported for INTERPROSCAN

==> Pfam 

#HMM       *->sEGsKnEGDKviAGKEAiYK<-*
#MATCH        s G KnEGDKv A  EAi K   
#SEQ          SKGLKNEGDKVQAKLEAITK    53   

#HMM       *->aagtaAaaPvapavpqeLvPheverFqALMasappaPPalarseepSavsklVetqDdavRkvl....ddvlallhhakdmSmnDiemaAasir.....lqyEiasl...qfdmqakmsVvqSgKdAv.qTLMKNQ<-*
#MATCH        ++ +++  +v +     L   e ++         +aP++++  + + +v  +V t++ +++++l+++++ v  + ++ + +S+   ++a+  +++ ++    ++as ++++f++ +k  V+      ++ T   NQ   
#SEQ          VTNSPGYRNVMDD----LQ-RERNK----ITDYIMAPGKVQSEKMLKLVNPAVRTEAKEIKQMLkesnQMVGNMFANSPIKSFK--DLASLKMKdadnfFKQRLASFnnkNFKFDVKGEVAAGARKEIkNTILQNQ    205

#HMM       *->vvEialVqkmvneHNsav<-*
#MATCH        ++E+ ++ +m++e+N+ v   
#SEQ          RTEAKEIKQMLKESNQMV    142  

#HMM       *->RlkTFqnkkWPisnl<-*
#MATCH        Rl++F nk+++++++   
#SEQ          RLASFNNKNFKFDVK    186  

Phylogeny

PROTOCOL



RESULTS ANALYSIS

RAW RESULTS

Taxonomy report

PROTOCOL


NCBI / BLAST / TAXONOMY REPORT



RESULTS ANALYSIS


As discussed in the BLAST section, the top hits for this protein are hypothetical proteins, and all of the hits in the taxonomy report belong to the same genus, Bacillus.


Bacteria › Firmicutes › Bacillales › Bacillaceae › Bacillus › Bacillus cereus group


The top hit, Bacillus cereus F65185, has the following taxonomic lineage:


Bacteria › Firmicutes › Bacillales › Bacillaceae › Bacillus › Bacillus cereus group


We can therefore predict that this DNA sequence may come from the Bacillus cereus group.



RAW RESULTS

cellular organisms
. Bacteria              [bacteria]
. . Bacillus cereus group [firmicutes]
. . . Bacillus cereus       [firmicutes]
. . . . Bacillus cereus F65185 ---------------------------------   38 4 hits [firmicutes]        hypothetical protein bcere0025_55720 [Bacillus cereus F6518
. . . . Bacillus cereus 03BB102 ................................   38 2 hits [firmicutes]        hypothetical protein BCA_A0057 [Bacillus cereus 03BB102] >g
. . . . Bacillus cereus AH603 ..................................   36 2 hits [firmicutes]        hypothetical protein bcere0026_55900 [Bacillus cereus AH603
. . . . Bacillus cereus ATCC 10987 .............................   36 2 hits [firmicutes]        hypothetical protein BCE_A0063 [Bacillus cereus ATCC 10987]
. . . . Bacillus cereus ........................................   36 7 hits [firmicutes]        hypothetical protein BCE_A0063 [Bacillus cereus ATCC 10987]
. . . . Bacillus cereus H3081.97 ...............................   36 2 hits [firmicutes]        hypothetical protein BCE_A0063 [Bacillus cereus ATCC 10987]
. . . . Bacillus cereus AH187 ..................................   36 2 hits [firmicutes]        hypothetical protein BCE_A0063 [Bacillus cereus ATCC 10987]
. . . . Bacillus cereus AH1273 .................................   36 2 hits [firmicutes]        hypothetical protein BCE_A0063 [Bacillus cereus ATCC 10987]
. . . . Bacillus cereus AH1272 .................................   36 2 hits [firmicutes]        hypothetical protein BCE_A0063 [Bacillus cereus ATCC 10987]
. . . . Bacillus cereus AH1271 .................................   36 2 hits [firmicutes]        hypothetical protein BCE_A0063 [Bacillus cereus ATCC 10987]
. . . . Bacillus cereus BDRD-ST26 ..............................   36 2 hits [firmicutes]        hypothetical protein BCE_A0063 [Bacillus cereus ATCC 10987]
. . . . Bacillus cereus R309803 ................................   36 2 hits [firmicutes]        hypothetical protein BCE_A0063 [Bacillus cereus ATCC 10987]
. . . . Bacillus cereus G9241 ..................................   36 2 hits [firmicutes]        conserved hypothetical protein protein [Bacillus cereus G92
. . . . Bacillus cereus ATCC 10876 .............................   36 2 hits [firmicutes]        hypothetical protein bcere0002_59290 [Bacillus cereus ATCC 
. . . . Bacillus cereus AH1134 .................................   35 2 hits [firmicutes]        conserved hypothetical protein [Bacillus cereus AH1134] >gi
. . . . Bacillus cereus Q1 .....................................   35 2 hits [firmicutes]        hypothetical protein BCQ_PI209 [Bacillus cereus Q1] >gi|221
. . . . Bacillus cereus 172560W ................................   35 2 hits [firmicutes]        hypothetical protein bcere0005_52830 [Bacillus cereus 17256
. . . . Bacillus cereus AH820 ..................................   34 2 hits [firmicutes]        hypothetical protein pPER272_AH820_0064 [Bacillus cereus] >
. . . Bacillus thuringiensis serovar monterrey BGSC 4AJ1 -------   38 2 hits [firmicutes]        hypothetical protein bthur0007_60810 [Bacillus thuringiensi
. . . Bacillus thuringiensis serovar huazhongensis BGSC 4BD1 ...   37 2 hits [firmicutes]        hypothetical protein bthur0011_60650 [Bacillus thuringiensi
. . . Bacillus thuringiensis serovar pulsiensis BGSC 4CC1 ......   37 4 hits [firmicutes]        hypothetical protein bthur0012_57590 [Bacillus thuringiensi
. . . Bacillus thuringiensis IBL 200 ...........................   37 4 hits [firmicutes]        hypothetical protein bthur0013_65960 [Bacillus thuringiensi
. . . Bacillus anthracis (anthrax) .............................   36 2 hits [firmicutes]        hypothetical protein pxo1_51 [Bacillus anthracis] >gi|21392
. . . Bacillus anthracis str. A2012 ............................   36 2 hits [firmicutes]        hypothetical protein pxo1_51 [Bacillus anthracis] >gi|21392
. . . Bacillus anthracis str. 'Ames Ancestor' ..................   36 2 hits [firmicutes]        hypothetical protein pxo1_51 [Bacillus anthracis] >gi|21392
. . . Bacillus anthracis str. A0488 ............................   36 2 hits [firmicutes]        hypothetical protein pxo1_51 [Bacillus anthracis] >gi|21392
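The taxonomy report above can be summarised by tallying hits per species. The sketch below uses a handful of organism names copied from the report (not the full list) and a hypothetical helper, `binomial`, that assumes the first two words of each name are the genus and species.

```python
from collections import Counter

# A few organism names taken from the taxonomy report (subset for illustration).
ORGANISMS = [
    "Bacillus cereus F65185",
    "Bacillus cereus 03BB102",
    "Bacillus cereus ATCC 10987",
    "Bacillus thuringiensis serovar monterrey BGSC 4AJ1",
    "Bacillus thuringiensis IBL 200",
    "Bacillus anthracis str. A2012",
]


def binomial(name):
    """Assume the first two words are genus and species."""
    genus, species = name.split()[:2]
    return genus, species


species_counts = Counter(" ".join(binomial(name)) for name in ORGANISMS)
genera = {binomial(name)[0] for name in ORGANISMS}

print(genera)          # {'Bacillus'}
print(species_counts)
```

A single genus in `genera` is what supports assigning the read to Bacillus; it does not by itself make the weak BLAST hits significant.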

BLAST

PROTOCOL


NCBI BLASTp / nr database



RESULTS ANALYSIS


Only one reading frame was found. When I ran BLASTp on it against the nr database, I did not find a single good or reliable hit. No putative conserved domains were detected. The top hits are shown in the raw results.


The bit scores decrease gradually and the E-values increase gradually down the list, and even the top hit is poor (bit score 38.9, E-value 0.35). Most of the hits (10 of the 13 listed in the raw results) are hypothetical proteins, so we found no strong evidence that they are functional proteins.


From these results I predict that the sequence assigned to me is non-coding.

RAW RESULTS

ref|ZP_04206589.1|  hypothetical protein bcere0025_55720 [Baci...    Bit Score 38.9  E value  0.35
ref|ZP_04112197.1|  hypothetical protein bthur0007_60810 [Baci...    Bit Score 38.5  E value  0.44 
ref|YP_002752904.1|  hypothetical protein BCA_A0057 [Bacillus ...    Bit Score 38.5  E value  0.55  
ref|ZP_04088303.1|  hypothetical protein bthur0011_60650 [Baci...    Bit Score 37.7  E value  0.74 
ref|YP_001243612.1|  iron-containing alcohol dehydrogenase [Th...    Bit Score 37.7  E value  0.82  
ref|NP_228728.1|  alcohol dehydrogenase, iron-containing [Ther...    Bit Score 37.7  E value  0.92  
ref|ZP_04082061.1|  hypothetical protein bthur0012_57590 [Baci...    Bit Score 37.4  E value  1.1  
ref|ZP_04076202.1|  hypothetical protein bthur0013_65960 [Baci...    Bit Score 37.4  E value  1.2  
emb|CAG03597.1|  unnamed protein product [Tetraodon nigroviridis]    Bit Score 37.0  E value  1.5  
ref|ZP_04200817.1|  hypothetical protein bcere0026_55900 [Baci...    Bit Score 36.6  E value  1.7  
ref|NP_982070.1|  hypothetical protein BCE_A0063 [Bacillus cer...    Bit Score 36.6  E value  1.8   
ref|NP_052747.1|  hypothetical protein pxo1_51 [Bacillus anthr...    Bit Score 36.6  E value  2.0   
ref|ZP_00239657.1|  conserved hypothetical protein protein [Ba...    Bit Score 36.6  E value  2.1
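
The reasoning in the results analysis can be checked against the raw hit list with a short script. The sketch below parses a few of the lines above (copied verbatim, not the full list) and filters them by E-value. The cutoff of 1e-3 is an assumption chosen for illustration, not a value stated in this annotation, and the regular expression only targets the line layout used on this page.

```python
import re

# Five of the raw result lines listed above.
RAW_HITS = """\
ref|ZP_04206589.1|  hypothetical protein bcere0025_55720 [Baci...    Bit Score 38.9  E value  0.35
ref|ZP_04112197.1|  hypothetical protein bthur0007_60810 [Baci...    Bit Score 38.5  E value  0.44
ref|YP_001243612.1|  iron-containing alcohol dehydrogenase [Th...    Bit Score 37.7  E value  0.82
emb|CAG03597.1|  unnamed protein product [Tetraodon nigroviridis]    Bit Score 37.0  E value  1.5
ref|NP_052747.1|  hypothetical protein pxo1_51 [Bacillus anthr...    Bit Score 36.6  E value  2.0
"""

HIT_PATTERN = re.compile(
    r"^(\S+)\s+(.*?)\s+Bit Score\s+([\d.]+)\s+E value\s+([\d.eE+-]+)\s*$"
)

E_VALUE_CUTOFF = 1e-3  # assumed significance threshold, for illustration only


def parse_hits(text):
    """Return one dict per hit line that matches the page layout."""
    hits = []
    for line in text.splitlines():
        match = HIT_PATTERN.match(line)
        if match:
            accession, description, bits, evalue = match.groups()
            hits.append({
                "accession": accession,
                "description": description,
                "bits": float(bits),
                "evalue": float(evalue),
            })
    return hits


hits = parse_hits(RAW_HITS)
significant = [h for h in hits if h["evalue"] <= E_VALUE_CUTOFF]
hypothetical = [h for h in hits if "hypothetical" in h["description"]]

print(len(hits), len(significant), len(hypothetical))  # 5 0 3
print(min(h["evalue"] for h in hits))                  # 0.35
```

With the best E-value at 0.35, no hit passes the assumed cutoff, which is consistent with the conclusion that BLAST gives no reliable evidence of a functional protein.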