GOS 1508010

From Metagenes
Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : JCVI_READ_1091118858523
Annotathon code: GOS_1508010
Sample :
  • GPS :24°10'29n; 84°20'40w
  • Caribbean Sea: Gulf of Mexico - USA
  • Coastal Sea (-2m, 26.4°C, 0.1-0.8 microns)
Authors
Team : Algarve
Username : AMS1
Annotated on : 2010-07-14 20:45:21
  • a27930 MónicaAlexandraIsidoroGomes
  • a28985 SusanaFilipaJordãoViegas
  • a34705 AndréJerónimoGuerra

Synopsis

Genomic Sequence

>JCVI_READ_1091118858523 GOS_1508010 Genomic DNA
TCGAAGGAGGCGTCGGTAAAGGAATATTTTCTGTTGGCGCCGAAGGTATTGTCATAAACAAAGAGTTCGCCAAAGCCGAAAATATCGAAGAGAAGATATG
TCTTAAAGATTTTCTAGATAAACCTCCCGAAGACATCGTGCAGGCTTTGTCTGGTTCCGAGGAAGAAGATGGTTGGAAACAAGTTTACGGAGATCTCATC
AAGCAGGCAGCTGGAGGCGATGTTAGCGATCCTACAAAGAAGGAAGCTGCAGAAAAACAATTTTTGGCAGAGTTTCAAAAATTAAAAGATGAATTCAAGA
CAGGTTTTGAAAAAGTATCTGGCACTATTCAAAATGTCGAACCAAACGAAGAATTAAAAGCAGCCACGCAAGAAACAGGGTTCTTGAAATCTCTCCTATC
CTCTAATAATCCATACAAGCTGGACGCAGGAACAGCCATCAAAATCATGAATGCGTTAATGACAGTGTCACTATCTGGACTTGTTGTTCTGACACAAAAC
CTTATTCAGATGACTGCAGACGTAGATGCCGCTCTTAAAGACTCTATCGAAGGTGTACAGACCCAAGCAGAAATTCAAGACGTTGAAGAAACTCCAGAAC
GAGCGAAAACAATTCAAGAAAAGATCGTAGCATTCGTAAAGAATAAAGATGCGCACCCTCTTATCCTAAGCAACCTAGAACAAGCTGGGTGGAATTTAGA
AAAAGACAAGGATCTTTCAAGAGTCGACATGAAAAAGGCGATGAAACCGCAAGTGGGCGAATCATTGGAAAAATTGTTAGGTGAACAATTCGAAGAATTT
CTCACCGAGTTCGAATTAGCTATTCCCGGAGAAGACGAAACTGACGATCCTAACGAAGTAGAAGAACTCCTTGATGATGTCGAAACTAACGTCAAAGAAG
ATCCAGATTTGGACGCAGCGACGCAGTAAAAATG

Translation

[447 - 926/934]   direct strand
>GOS_1508010 Translation [447-926   direct strand]
MNALMTVSLSGLVVLTQNLIQMTADVDAALKDSIEGVQTQAEIQDVEETPERAKTIQEKIVAFVKNKDAHPLILSNLEQAGWNLEKDKDLSRVDMKKAMK
PQVGESLEKLLGEQFEEFLTEFELAIPGEDETDDPNEVEELLDDVETNVKEDPDLDAATQ
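
The coordinates in the header above can be checked with simple arithmetic: a span from base 447 to base 926 covers 480 bases, that is 160 codons, which matches the 160 residues listed. The sketch below assumes 1-based, inclusive coordinates; the helper names are hypothetical and only serve the illustration.

```python
def codon_count(first, last):
    """Number of complete codons in a 1-based, inclusive span of bases."""
    length = last - first + 1
    if length % 3 != 0:
        raise ValueError("span length is not a multiple of 3")
    return length // 3


def reading_frame(first):
    """Reading frame (1, 2 or 3) of a 1-based coordinate on the strand being read."""
    return (first - 1) % 3 + 1


print(codon_count(447, 926))  # 160
print(reading_frame(447))     # 3
```

Base 447 falls in the same frame as base 3, so the translated region lies inside the frame 3 ORF reported in the ORF finding section.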

Annotator commentaries

No reliable homology to any known protein or protein domain was found. We therefore cannot assign the sequence to any known organism or family of organisms, nor can we infer the molecular function of the protein or the biological process in which it takes part.



ORF finding

PROTOCOL


a) SMS ORFinder / forward strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'Standard' genetic code

b) SMS ORFinder / reverse strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'Standard' genetic code



RESULTS ANALYSIS


Analysis of the nucleotide sequence yielded one ORF in reading frame 3 on the forward strand, from base 3 to base 929. Because initiation was set to 'any codon', the ORF begins at the first codon of the frame, GAA (glutamic acid, E), rather than at a true start codon. Its last sense codon is CAG (glutamine, Q), and it is terminated by the stop codon TAA.

No ORFs were found in reading frames 1 and 2.

We also obtained two ORFs in reading frame 1 on the reverse strand. The first extends from base 1 to base 321: it begins with the codon CAT (histidine, H), its last sense codon is TCT (serine, S), and it is terminated by the stop codon TGA. The second extends from base 751 to base 933: it begins with ACT (threonine, T) and ends with TCG (serine, S), with no stop codon. Also on the reverse strand, reading frame 3 contains one ORF, from base 270 to base 464: it begins with GAT (aspartic acid, D), its last sense codon is AGA (arginine, R), and it is terminated by the stop codon TAG. No ORFs were found in reading frame 2.

All the ORFs found end with a stop codon, except ORF number 2 in reading frame 1 on the reverse strand, which runs to the last complete codon of the read.

The ORF selected for further studies was the one in reading frame 3 on the direct strand.
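
The search settings listed in the protocol (both strands, three frames each, 'any codon' initiation, minimum 60 amino acids, standard genetic code) can be sketched in plain Python as follows. This is not the SMS ORF Finder code, only an illustration that mimics those settings; the function names are hypothetical, and coordinates on the reverse strand are counted along the reverse-complemented read, as they appear to be in the raw results.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(dna, min_aa=60):
    """Yield (strand, frame, first, last, protein) for every stop-free stretch
    of at least min_aa codons. Coordinates are 1-based and inclusive on the
    strand being read; 'last' includes the stop codon when there is one."""
    dna = dna.upper()
    for strand, seq in (("direct", dna), ("reverse", reverse_complement(dna))):
        for frame in range(3):
            protein, start = [], frame
            for pos in range(frame, len(seq) - 2, 3):
                aa = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X = ambiguous codon
                if aa == "*":
                    if len(protein) >= min_aa:
                        yield strand, frame + 1, start + 1, pos + 3, "".join(protein) + "*"
                    protein, start = [], pos + 3
                else:
                    protein.append(aa)
            # Open stretch reaching the end of the read (no stop codon).
            if len(protein) >= min_aa:
                yield strand, frame + 1, start + 1, start + 3 * len(protein), "".join(protein)


# Toy example with a lowered length threshold.
for orf in find_orfs("GAAGAAGAAGAAGAATAAGC", min_aa=5):
    print(orf)
```

With 'any codon' initiation an ORF is simply a stretch between two stop codons (or between a read end and a stop codon), which is why the ORFs reported here begin with codons such as GAA or CAT rather than ATG.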

RAW RESULTS

a) forward strand

No ORFs were found in reading frame 1.

No ORFs were found in reading frame 2.

>ORF number 1 in reading frame 3 on the direct strand extends from base 3 to base 929.
GAAGGAGGCGTCGGTAAAGGAATATTTTCTGTTGGCGCCGAAGGTATTGTCATAAACAAA
GAGTTCGCCAAAGCCGAAAATATCGAAGAGAAGATATGTCTTAAAGATTTTCTAGATAAA
CCTCCCGAAGACATCGTGCAGGCTTTGTCTGGTTCCGAGGAAGAAGATGGTTGGAAACAA
GTTTACGGAGATCTCATCAAGCAGGCAGCTGGAGGCGATGTTAGCGATCCTACAAAGAAG
GAAGCTGCAGAAAAACAATTTTTGGCAGAGTTTCAAAAATTAAAAGATGAATTCAAGACA
GGTTTTGAAAAAGTATCTGGCACTATTCAAAATGTCGAACCAAACGAAGAATTAAAAGCA
GCCACGCAAGAAACAGGGTTCTTGAAATCTCTCCTATCCTCTAATAATCCATACAAGCTG
GACGCAGGAACAGCCATCAAAATCATGAATGCGTTAATGACAGTGTCACTATCTGGACTT
GTTGTTCTGACACAAAACCTTATTCAGATGACTGCAGACGTAGATGCCGCTCTTAAAGAC
TCTATCGAAGGTGTACAGACCCAAGCAGAAATTCAAGACGTTGAAGAAACTCCAGAACGA
GCGAAAACAATTCAAGAAAAGATCGTAGCATTCGTAAAGAATAAAGATGCGCACCCTCTT
ATCCTAAGCAACCTAGAACAAGCTGGGTGGAATTTAGAAAAAGACAAGGATCTTTCAAGA
GTCGACATGAAAAAGGCGATGAAACCGCAAGTGGGCGAATCATTGGAAAAATTGTTAGGT
GAACAATTCGAAGAATTTCTCACCGAGTTCGAATTAGCTATTCCCGGAGAAGACGAAACT
GACGATCCTAACGAAGTAGAAGAACTCCTTGATGATGTCGAAACTAACGTCAAAGAAGAT
CCAGATTTGGACGCAGCGACGCAGTAA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
EGGVGKGIFSVGAEGIVINKEFAKAENIEEKICLKDFLDKPPEDIVQALSGSEEEDGWKQ
VYGDLIKQAAGGDVSDPTKKEAAEKQFLAEFQKLKDEFKTGFEKVSGTIQNVEPNEELKA
ATQETGFLKSLLSSNNPYKLDAGTAIKIMNALMTVSLSGLVVLTQNLIQMTADVDAALKD
SIEGVQTQAEIQDVEETPERAKTIQEKIVAFVKNKDAHPLILSNLEQAGWNLEKDKDLSR
VDMKKAMKPQVGESLEKLLGEQFEEFLTEFELAIPGEDETDDPNEVEELLDDVETNVKED
PDLDAATQ*

---------------------------------------------------------------------------------------------------
b) reverse strand

>ORF number 1 in reading frame 1 on the reverse strand extends from base 1 to base 321.
CATTTTTACTGCGTCGCTGCGTCCAAATCTGGATCTTCTTTGACGTTAGTTTCGACATCA
TCAAGGAGTTCTTCTACTTCGTTAGGATCGTCAGTTTCGTCTTCTCCGGGAATAGCTAAT
TCGAACTCGGTGAGAAATTCTTCGAATTGTTCACCTAACAATTTTTCCAATGATTCGCCC
ACTTGCGGTTTCATCGCCTTTTTCATGTCGACTCTTGAAAGATCCTTGTCTTTTTCTAAA
TTCCACCCAGCTTGTTCTAGGTTGCTTAGGATAAGAGGGTGCGCATCTTTATTCTTTACG
AATGCTACGATCTTTTCTTGA

>Translation of ORF number 1 in reading frame 1 on the reverse strand.
HFYCVAASKSGSSLTLVSTSSRSSSTSLGSSVSSSPGIANSNSVRNSSNCSPNNFSNDSP
TCGFIAFFMSTLERSLSFSKFHPACSRLLRIRGCASLFFTNATIFS*

>ORF number 2 in reading frame 1 on the reverse strand extends from base 751 to base 933.
ACTTGTTTCCAACCATCTTCTTCCTCGGAACCAGACAAAGCCTGCACGATGTCTTCGGGA
GGTTTATCTAGAAAATCTTTAAGACATATCTTCTCTTCGATATTTTCGGCTTTGGCGAAC
TCTTTGTTTATGACAATACCTTCGGCGCCAACAGAAAATATTCCTTTACCGACGCCTCCT
TCG

>Translation of ORF number 2 in reading frame 1 on the reverse strand.
TCFQPSSSSEPDKACTMSSGGLSRKSLRHIFSSIFSALANSLFMTIPSAPTENIPLPTPP
S

No ORFs were found in reading frame 2.

>ORF number 1 in reading frame 3 on the reverse strand extends from base 270 to base 464.
GATAAGAGGGTGCGCATCTTTATTCTTTACGAATGCTACGATCTTTTCTTGAATTGTTTT
CGCTCGTTCTGGAGTTTCTTCAACGTCTTGAATTTCTGCTTGGGTCTGTACACCTTCGAT
AGAGTCTTTAAGAGCGGCATCTACGTCTGCAGTCATCTGAATAAGGTTTTGTGTCAGAAC
AACAAGTCCAGATAG

>Translation of ORF number 1 in reading frame 3 on the reverse strand.
DKRVRIFILYECYDLFLNCFRSFWSFFNVLNFCLGLYTFDRVFKSGIYVCSHLNKVLCQN
NKSR*

Multiple Alignment

PROTOCOL




RESULTS ANALYSIS


A multiple alignment could not be performed: the best BLAST hit has a very high E-value (0.38), so the results are inconclusive and no reliable homologues are available to align.

RAW RESULTS

Protein Domains

PROTOCOL


InterProScan, default parameters at EBI



RESULTS ANALYSIS


InterProScan returned no results for the first two ORFs found or for the last one, meaning that no known protein domains were detected in those sequences. Only the penultimate ORF (ORF number 2 in reading frame 1 on the reverse strand) gave a result: one unintegrated transmembrane region.


We continued the study with the longest ORF (the one with the largest number of bases), which is ORF number 1 in reading frame 3 on the direct strand.
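
The choice of the longest ORF can be reproduced from the coordinates reported in the ORF finding section. A minimal Python sketch follows; the tuple layout and the names are illustrative and not part of the original protocol.

```python
# (strand, frame, first base, last base) as reported by the ORF search
ORFS = [
    ("direct", 3, 3, 929),
    ("reverse", 1, 1, 321),
    ("reverse", 1, 751, 933),
    ("reverse", 3, 270, 464),
]


def orf_length(orf):
    """Length in bases of an ORF given 1-based, inclusive coordinates."""
    _, _, first, last = orf
    return last - first + 1


longest = max(ORFS, key=orf_length)
print(longest, orf_length(longest))  # ('direct', 3, 3, 929) 927
```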


Phylogeny

PROTOCOL



RESULTS ANALYSIS


Neither the taxonomy report nor the multiple alignment could be produced, because the best BLAST hit has a very high E-value (0.38) and the results are inconclusive. We therefore have no data from which to build a phylogenetic tree.

RAW RESULTS

Taxonomy report

PROTOCOL




RESULTS ANALYSIS


A taxonomy report could not be produced: the best BLAST hit has a very high E-value (0.38), which makes the results inconclusive.

RAW RESULTS

BLAST

PROTOCOL


BLASTx versus NR, NCBI default parameters apart from "Number of descriptions: 1000"



RESULTS ANALYSIS


Because InterProScan gave no results, we ran a BLASTx search with our sequence to check whether any homologous proteins exist.

The E-values of the hits were all very high (the best is 0.38), so we concluded that there are no known homologous proteins. We ended our study of the sequence at this point, since further analyses would not give reliable results.
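
The E-value is the number of alignments with an equal or better score that would be expected by chance in a database of that size, so a value of 0.38 is far from significant. The reasoning above can be sketched as a small filter over lines of the BLAST description table. The three lines are copied from the raw results; the cutoff of 1e-3 is an assumed, commonly used threshold rather than a value taken from the original protocol, and the function name is hypothetical.

```python
def parse_hit(line):
    """Split one line of the BLAST description table into
    (description, bit score, E-value)."""
    description, score, evalue = line.rstrip().rsplit(None, 2)
    return description, float(score), float(evalue)


HITS = [
    "emb|CBK73300.1|  Uncharacterized protein conserved in bacteria...  40.0    0.38 ",
    "ref|ZP_02619461.1|  conserved hypothetical protein [Clostridiu...  39.7    0.49 ",
    "ref|YP_002633953.1|  putative cell division protein FtsY [Stap...  39.3    0.64 ",
]

E_VALUE_CUTOFF = 1e-3  # assumption: not specified in the original protocol

parsed = [parse_hit(line) for line in HITS]
significant = [hit for hit in parsed if hit[2] <= E_VALUE_CUTOFF]
best = min(parsed, key=lambda hit: hit[2])

print(len(significant))  # 0
print(best[1], best[2])  # 40.0 0.38
```

With no hit below the cutoff, there is nothing reliable to feed into a multiple alignment, a taxonomy report or a phylogenetic tree.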



RAW RESULTS


                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

emb|CBK73300.1|  Uncharacterized protein conserved in bacteria...  40.0    0.38 
ref|ZP_02619461.1|  conserved hypothetical protein [Clostridiu...  39.7    0.49 
ref|YP_002633953.1|  putative cell division protein FtsY [Stap...  39.3    0.64 
ref|ZP_02329253.1|  YlxF [Paenibacillus larvae subsp. larvae B...  39.3    0.64 
ref|ZP_05914259.1|  type I site-specific deoxyribonuclease, Hs...  39.3    0.64 
ref|XP_001024253.1|  Viral A-type inclusion protein repeat con...  38.9    0.84 
ref|YP_723920.1|  TPR repeat-containing protein [Trichodesmium...  38.9    0.84 
ref|YP_002828655.1|  hypothetical protein M1425_0506 [Sulfolob...  38.5    1.1  
ref|YP_002248237.1|  methyl-accepting chemotaxis protein, puta...  38.5    1.1  
ref|YP_424804.1|  hypothetical protein MCAP_0862 [Mycoplasma c...  38.5    1.1  
ref|XP_781933.1|  PREDICTED: hypothetical protein, partial [St...  38.1    1.4  
ref|ZP_05632205.1|  hypothetical protein FulcA4_02157 [Fusobac...  37.7    1.9  
gb|EDV10429.1|  conserved hypothetical protein [Saccharomyces ...  37.7    1.9  
ref|XP_788234.2|  PREDICTED: similar to Golgi autoantigen, gol...  37.7    1.9  
gb|ABD33760.1|  flagellin [Bacillus thuringiensis serovar iber...  37.7    1.9  
ref|XP_001608392.1|  erythrocyte binding protein [Plasmodium v...  37.4    2.4  
ref|XP_001312904.1|  viral A-type inclusion protein [Trichomon...  37.4    2.4  
ref|XP_001136451.1|  PREDICTED: serologically defined colon ca...  37.4    2.4  
ref|XP_525116.2|  PREDICTED: serologically defined colon cance...  37.4    2.4  
ref|XP_001136271.1|  PREDICTED: serologically defined colon ca...  37.4    2.4  
ref|XP_001136358.1|  PREDICTED: serologically defined colon ca...  37.4    2.4  
ref|XP_002733969.1|  PREDICTED: pyruvate dehydrogenase E1 alph...  37.0    3.2  
ref|ZP_06372617.1|  LOW QUALITY PROTEIN: GTP-binding protein [...  37.0    3.2  
ref|YP_002533102.1|  XkdF [Bacillus cereus Q1] >gb|ACM15998.1|...  37.0    3.2  
ref|YP_001693401.1|  putative viral A-type inclusion protein [...  37.0    3.2  
ref|YP_001924003.1|  MCP methyltransferase/methylesterase, Che...  37.0    3.2  
ref|XP_001455225.1|  hypothetical protein [Paramecium tetraure...  37.0    3.2  
ref|YP_001156062.1|  HflK protein [Polynucleobacter necessariu...  37.0    3.2  
ref|YP_697640.1|  calcium-translocating P-type ATPase, PMCA-ty...  37.0    3.2  
ref|YP_453737.1|  flagellar hook-associated protein 2 FliD [So...  37.0    3.2  
ref|XP_505021.1|  YALI0F05170p [Yarrowia lipolytica] >emb|CAG7...  37.0    3.2  
ref|YP_001928419.1|  putative zinc protease [Porphyromonas gin...  36.6    4.2  
ref|ZP_02641854.1|  ethanolamine utilization protein [Clostrid...  36.6    4.2  
ref|YP_001016434.1|  small mechanosensitive ion channel, MscS ...  36.6    4.2  
ref|YP_694777.1|  cation-transporting ATPase, P-type [Clostrid...  36.6    4.2  
ref|XP_001092411.1|  PREDICTED: similar to serologically defin...  36.6    4.2  
ref|XP_001092297.1|  PREDICTED: similar to serologically defin...  36.6    4.2  
ref|XP_001657110.1|  hypothetical protein AaeL_AAEL003671 [Aed...  36.6    4.2  
ref|NP_904531.1|  M16 family peptidase [Porphyromonas gingival...  36.6    4.2  
ref|NP_198700.2|  forkhead-associated domain-containing protei...  36.6    4.2  
ref|YP_032819.1|  hypothetical protein BQ12960 [Bartonella qui...  36.6    4.2  
ref|NP_895351.1|  small mechanosensitive ion channel [Prochlor...  36.6    4.2  
dbj|BAB08640.1|  unnamed protein product [Arabidopsis thaliana]    36.6    4.2  
ref|YP_957845.1|  hypothetical protein Maqu_0557 [Marinobacter...  36.6    4.2  
emb|CAY79542.1|  She10p [Saccharomyces cerevisiae EC1118]          36.2    5.4  
gb|EEU07494.1|  She10p [Saccharomyces cerevisiae JAY291]           36.2    5.4  
ref|XP_002558198.1|  Pc12g13920 [Penicillium chrysogenum Wisco...  36.2    5.4  
ref|XP_002611235.1|  hypothetical protein BRAFLDRAFT_71196 [Br...  36.2    5.4  
gb|EDZ72331.1|  YGL228Wp-like protein [Saccharomyces cerevisia...  36.2    5.4  
dbj|BAG61280.1|  unnamed protein product [Homo sapiens]            36.2    5.4  
dbj|BAG61251.1|  unnamed protein product [Homo sapiens]            36.2    5.4  
dbj|BAG59616.1|  unnamed protein product [Homo sapiens]            36.2    5.4  
ref|XP_002005324.1|  GI20420 [Drosophila mojavensis] >gb|EDW09...  36.2    5.4  
gb|EDN61900.1|  conserved protein [Saccharomyces cerevisiae YJ...  36.2    5.4  
gb|EAW77085.1|  serologically defined colon cancer antigen 8, ...  36.2    5.4  
gb|AAD38979.1|AF153767_1  immunoreactive 106 kDa antigen PG115...  36.2    5.4  
gb|ABD33751.1|  flagellin [Bacillus thuringiensis serovar rosk...  36.2    5.4  
gb|AAC18039.1|  antigen NY-CO-8 [Homo sapiens]                     36.2    5.4  
ref|NP_006633.1|  serologically defined colon cancer antigen 8...  36.2    5.4  
ref|NP_011286.1|  Putative glycosylphosphatidylinositol (GPI)-...  36.2    5.4  
ref|ZP_00990449.1|  hypothetical membrane associated protein [...  36.2    5.4  
ref|YP_002860349.1|  hypothetical protein CLJ_0040 [Clostridiu...  35.8    7.1  
gb|EEE20163.1|  Coronin, putative [Toxoplasma gondii GT1]          35.8    7.1  
ref|XP_458689.2|  DEHA2D05126p [Debaryomyces hansenii CBS767] ...  35.8    7.1  
dbj|BAF98220.1|  CM0216.380.nc [Lotus japonicus]                   35.8    7.1  
ref|XP_001425727.1|  hypothetical protein [Paramecium tetraure...  35.8    7.1  
ref|ZP_04217402.1|  DegV [Bacillus cereus Rock3-44] >gb|EEL508...  35.4    9.3  
ref|ZP_03960448.1|  transposase [Lactobacillus vaginalis ATCC ...  35.4    9.3  
ref|ZP_03540451.1|  hypothetical ABC transporter ATP-binding p...  35.4    9.3  
ref|XP_002371005.1|  hypothetical protein, conserved [Toxoplas...  35.4    9.3  
ref|XP_002144255.1|  conserved hypothetical protein [Penicilli...  35.4    9.3  
ref|ZP_03539629.1|  hypothetical ABC transporter ATP-binding p...  35.4    9.3  
ref|ZP_02865057.1|  cation-transporting ATPase, P-type [Clostr...  35.4    9.3  
ref|ZP_02865861.1|  ethanolamine utilization protein [Clostrid...  35.4    9.3  
ref|YP_001797523.1|  HflK protein [Polynucleobacter necessariu...  35.4    9.3  
ref|ZP_02029984.1|  hypothetical protein BIFADO_02449 [Bifidob...  35.4    9.3  
ref|XP_001458603.1|  hypothetical protein [Paramecium tetraure...  35.4    9.3  
ref|YP_910398.1|  methionyl-tRNA synthetase [Bifidobacterium a...  35.4    9.3  
ref|XP_001662334.1|  SspC, putative [Aedes aegypti] >gb|EAT356...  35.4    9.3  
gb|AAU93915.1|  coronin [Toxoplasma gondii]                        35.4    9.3  
ref|YP_072766.1|  methylgalactoside ABC transporter, ATP-bindi...  35.4    9.3  
ref|YP_034299.1|  hypothetical protein BH16060 [Bartonella hen...  35.4    9.3  

ALIGNMENTS
>emb|CBK73300.1| Uncharacterized protein conserved in bacteria [Butyrivibrio fibrisolvens 
16/4]
Length=501

 Score = 40.0 bits (92),  Expect = 0.38
 Identities = 53/250 (21%), Positives = 97/250 (38%), Gaps = 46/250 (18%)
 Frame = +3

Query  93   KICLKDFLDKPPEDIVQALSGSEEEDGWKQVYGDLIKQAAGGDVSDPTKKEAAEKQFLAE  272
            ++  K+FL+K PE I Q L  S  +   +++ G L  +    D  D    +  EK     
Sbjct  100  QLAAKNFLNKNPEQITQDLQDS-LQGNMREIIGTLSLKVINTD-RDSFSDQVMEK-----  152

Query  273  FQKLKDEFKTGFEKVSGTIQNVEPNEELKAATQETGFLKSLLSSNNPYKLDAGTAIKIMN  452
                +D  K G E +S  IQNV         T E G +   L  +N  K+    AI    
Sbjct  153  --ASRDMSKLGIEILSCNIQNV---------TDENGLIND-LGMDNTAKIKKDAAIAKAQ  200

Query  453  ALMTVSL-----------------------SGLVVLTQNLIQMTADVDAALKDSIEGVQT  563
            A   V++                       +  + + Q  ++  AD   A+ D+   +Q 
Sbjct  201  ADRDVAIAQAEADKAANDARVTAQTEIAEKNNALAIKQAELKQQADTANAVADAAYSIQQ  260

Query  564  QAEIQDVEETPERAKTIQEKIVAFVKNKDAHPLILSNLEQAGWNLEKDKDLSRVDMKKAM  743
            Q + + +E     A+  + +  A +K K+    +L   ++    +EK  D  +   +K  
Sbjct  261  QEQQKTIEAATVNAQIAKAEREAELKQKE----VLVKQQELAAQIEKQADAEKYQAEKKA  316

Query  744  KPQVGESLEK  773
            + ++ +  +K
Sbjct  317  EAELIQRQKK  326


>ref|ZP_02619461.1| conserved hypothetical protein [Clostridium botulinum Bf]
 gb|EDT84097.1| conserved hypothetical protein [Clostridium botulinum Bf]
Length=1205

 Score = 39.7 bits (91),  Expect = 0.49
 Identities = 48/244 (19%), Positives = 100/244 (40%), Gaps = 15/244 (6%)
 Frame = +3

Query  48   IVINKEFAKAENIEEKICLKDFLDKPPEDIVQALSGSEEEDGWKQVYGDLIKQAAGGDVS  227
            +V++ E +   N  +   LK+ L+  PE+IV+       +   +     LIK+  G    
Sbjct  463  LVVDLENSDTINTNKFEELKEILNNAPEEIVEEFKTPYSKIAKRYEIESLIKETEG----  518

Query  228  DPTKKEAAEKQFLAEFQKLKDEFKTGFEKVSGTIQN-----VEPNEELKAATQETGFLKS  392
            + T++   +   L +   ++D++K  +      + N     VE  E+ +   +     K+
Sbjct  519  NLTEENLDKLSNLLDNANIEDKYKLSYTSKYNKLLNDYTKKVEEQEQTEYEAKLQAATKA  578

Query  393  L-LSSNNPYKLDAGTAIKIMNALMTVSLSGLVVLTQNLIQMTADVDAALKDSIEGVQTQA  569
            +  + N+  + D   A  + N+L T   S    L   L ++  D+D    +  +  + + 
Sbjct  579  VEKAENSKNQADLDNARGLANSLKTTDKSN---LNTRLDKVQKDIDDKKTEEEKQAEYEV  635

Query  570  EIQDVEETPERAKTIQEKIVAFVKNKDAHPLILSNLEQAGWNLEKDKDLSRVDMKKAMKP  749
            ++Q   +  E+A+    K    V N       L  ++++  N   DK    +D KK  + 
Sbjct  636  KLQAATKAVEKAE--NSKNQTDVDNARGLANSLKEIDKSNLNARLDKVQKNIDDKKTEEE  693

Query  750  QVGE  761
            +  E
Sbjct  694  KQAE  697


>ref|YP_002633953.1| putative cell division protein FtsY [Staphylococcus carnosus 
subsp. carnosus TM300]
 emb|CAL27768.1| putative cell division protein FtsY [Staphylococcus carnosus 
subsp. carnosus TM300]
Length=422

 Score = 39.3 bits (90),  Expect = 0.64
 Identities = 38/142 (26%), Positives = 68/142 (47%), Gaps = 8/142 (5%)
 Frame = +3

Query  276  QKLKDEFKTGFEKVSGTIQNVEPNEELKAATQETGFLKSLLSSNNPYKLDAGTAIKIMNA  455
            ++ K+E K  FE   G I ++E  EE+++      F + L  S   ++      I     
Sbjct  55   EEKKEETKDDFEFDDGLI-SIEEFEEIESQKLGAKFKQGLEKSRENFQEQLNNLIA---R  110

Query  456  LMTVSLSGLVVLTQNLIQMTADVDA-ALKDSIEGVQTQAEIQDVEETPERAKTIQEKIVA  632
              TV       L + LI  TADV    + + ++ ++T+A+ ++++ET +  + I EKIV 
Sbjct  111  YRTVDEDFFEALEEMLI--TADVGFNTVMELVDELRTEAQRRNIKETSDLKEVIVEKIVE  168

Query  633  FVKNKDAHPLILSNLEQAGWNL  698
              +  D H  ++ NLE    N+
Sbjct  169  IYEQDDDHSEVM-NLEDGRLNV  189


>ref|ZP_02329253.1| YlxF [Paenibacillus larvae subsp. larvae BRL-230010]
Length=303

 Score = 39.3 bits (90),  Expect = 0.64
 Identities = 50/187 (26%), Positives = 75/187 (40%), Gaps = 43/187 (22%)
 Frame = +3

Query  183  VYGDLIKQAAGGDVSDPTKKEAAEKQF-------------LAEFQKLKDEFKTGFEKVSG  323
            +   LI +  GG  +   K++AA+                L EFQK  DE K  F+K   
Sbjct  56   ILNKLIPEPKGGGSAAEIKEKAAQNDLKNKEEQIGELSGKLNEFQKKYDELKENFDKKDV  115

Query  324  TI-----QNVEPNEELKAA-----TQETGFLKSLLSSNNPYKLDAGTAIKIMNAL-----  458
             +     +N E  E+L+A       ++   L SL S+ +P     G A  I+ +L     
Sbjct  116  ELKELSTENAELKEQLEAKNNAKYNEQLKKLVSLYSNMSP-----GKAAPILESLTPKET  170

Query  459  -MTVSLSGLVVLTQNLIQM----TADVDAALKDSIEGVQTQAEIQDVEETPERAKTIQEK  623
             + +S+ G     + L +M     AD   ALKD +     Q     +    ER K +Q K
Sbjct  171  ILILSMMGTDSRQKILEKMDPKEAADFSIALKDQVPAADRQ-----IAALQERVKELQNK  225

Query  624  IVAFVKN  644
              A V N
Sbjct  226  QTAEVSN  232


>ref|ZP_05914259.1| type I site-specific deoxyribonuclease, HsdR family protein [Brevibacterium 
linens BL2]
Length=1021

 Score = 39.3 bits (90),  Expect = 0.64
 Identities = 33/140 (23%), Positives = 68/140 (48%), Gaps = 6/140 (4%)
 Frame = +3

Query  210  AGGDVSDPTKKEAAEKQFLAEFQKLKDEFKTGFEKVSGTIQNVEPNEELKAATQETGFLK  389
            AG D+S    +   + Q     Q+ +D+F   F+ + G  + V PN+EL A+  +  FL 
Sbjct  726  AGIDLSTTGPQTLIDAQARMPDQEAEDDFAANFQMLQGIWEAVAPNDELAASRSQYRFLS  785

Query  390  SLLSSNNPYKLDAGTAIKIMNALMTVSLSGLVVLTQNL-IQMTADVDAALKDSIEGVQTQ  566
             + +S  P +   G+   +   L   +++ +    +++ +   ++V AA  D++  +  +
Sbjct  786  QVYASIQPSR---GSDDLLWGRLGAKTINLVHEHMEDIAVTRASEVIAADADTVSKLVEE  842

Query  567  AEIQDVEETPERAKTIQEKI  626
               +DVE+   + KT+ E I
Sbjct  843  GYEEDVEDI--KGKTVDEII  860


>ref|XP_001024253.1| Viral A-type inclusion protein repeat containing protein [Tetrahymena 
thermophila]
 gb|EAS04008.1| Viral A-type inclusion protein repeat containing protein [Tetrahymena 
thermophila SB210]
Length=3714

 Score = 38.9 bits (89),  Expect = 0.84
 Identities = 57/221 (25%), Positives = 96/221 (43%), Gaps = 31/221 (14%)
 Frame = +3

Query  78   ENIEEKIC-----LKDFLDKPPEDIVQAL---SGSEEEDGWKQVYGDLIKQAAGGDVSDP  233
            EN++EK         D L++  + ++Q     +  + E    Q Y D I +    D+   
Sbjct  720  ENLQEKYDRMNEEFDDLLNEKEQFLIQIQDLQNKCQHEQEINQKYLDQIGKYKQDDIDQQ  779

Query  234  TKKEAAEKQFLAEFQKLKDEFKTGFEKVSGTIQNVEPNEELKAATQETGFLKSLLSSNNP  413
             +    EK++L E QKLKD+      K S   Q+ E NEE+KA       L SL + N  
Sbjct  780  NR----EKKYLKEIQKLKDDIDQ-LSKQSKKNQH-ENNEEIKA------MLTSLQTKNEK  827

Query  414  YKLDAGTAIKIMNALMTVSLSGLVVLTQNLIQMTADVDAALKDSIEGVQTQAEIQDVEET  593
             + +    ++         L+ L +LTQ  I +  ++ A L++ +   Q   +IQ +E  
Sbjct  828  LEQENAQILQSSQEQQNQLLTNLEMLTQQNIDVQNNL-AILEEEVN--QKDLKIQQLE--  882

Query  594  PERAKTIQEKIVAFV-----KNKDAHPLILSNLEQAGWNLE  701
             +  +   +KI  F+     +NK  H    + L+Q    LE
Sbjct  883  -QELQLSAQKIETFLSLAEEENKIKHAQNQALLKQQEQKLE  922


>ref|YP_723920.1| TPR repeat-containing protein [Trichodesmium erythraeum IMS101]
 gb|ABG53447.1| TPR repeat [Trichodesmium erythraeum IMS101]
Length=1247

 Score = 38.9 bits (89),  Expect = 0.84
 Identities = 45/194 (23%), Positives = 85/194 (43%), Gaps = 11/194 (5%)
 Frame = +3

Query  63   EFAKAENIEEKICLKDFL--DKPPEDIVQALSGSEEEDGWKQVYGDLI-KQAAGGDVSDP  233
            E ++A N+ + +  +D +  D  P +I Q L   ++E    +   + + KQ +  +  + 
Sbjct  733  ERSQARNLVQLLTNRDLVPQDNVPPEITQQLQQQQKEIRVNKGELEKVRKQLSQQNTEEK  792

Query  234  TKKEAAEKQFLAEFQKLKDEFKTGFEKVSGTIQNVEPNEELKAATQETGF--LKSLLSSN  407
            T  E +EKQ   E QKL+ +     EK    IQ  +P   L    Q   F  ++ L +  
Sbjct  793  TSLEKSEKQLNEELQKLRQQ----LEKTFNQIQEYDPTFRLTQEVQPIAFREIQQLATEQ  848

Query  408  NPYKLDAGTAIKIMNALMTVSLSGLVVLTQNLIQMTADVDAALKDSI-EGVQTQAEIQDV  584
                ++     +   A + V  +        + Q T D    L++ + E ++   E+Q+ 
Sbjct  849  QTAIIEWYIMGEKFLAFILVPPTSTATSELFMWQSTEDDSKNLEEWVKEYLEKYYELQEA  908

Query  585  E-ETPERAKTIQEK  623
            E E  E+AK +Q++
Sbjct  909  ELENQEKAKKLQQE  922


>ref|YP_002828655.1| hypothetical protein M1425_0506 [Sulfolobus islandicus M.14.25]
 gb|ACP37357.1| hypothetical protein M1425_0506 [Sulfolobus islandicus M.14.25]
Length=1998

 Score = 38.5 bits (88),  Expect = 1.1
 Identities = 33/126 (26%), Positives = 55/126 (43%), Gaps = 18/126 (14%)
 Frame = +3

Query  237   KKEAAEKQFLAEFQKLKDEFKTGFEKVSGTIQNVEPNEE-----------LKAATQETGF  383
             K E   +      +++K E     E V   I NVE  ++           +K    +  F
Sbjct  1715  KGELTPEDIKNTIEEMKGEIDQLLENVKDYIDNVEKEKQPDETVLEYDKGIKELADKVSF  1774

Query  384   LKSL----LSSNNPYKLDAGTAIKIMNALMTVS---LSGLVVLTQNLIQMTADVDAALKD  542
              K +    L SN   K +  TAI+++N L T++   L   ++LTQ  IQ+ + +   LKD
Sbjct  1775  DKLIYFTSLISNIANKFNVPTAIELLNQLNTLTPEELQNTIILTQPQIQLLSKIAPLLKD  1834

Query  543   SIEGVQ  560
              +  +Q
Sbjct  1835  ILVSLQ  1840


>ref|YP_002248237.1| methyl-accepting chemotaxis protein, putative [Thermodesulfovibrio 
yellowstonii DSM 11347]
 gb|ACI22149.1| methyl-accepting chemotaxis protein, putative [Thermodesulfovibrio 
yellowstonii DSM 11347]
Length=474

 Score = 38.5 bits (88),  Expect = 1.1
 Identities = 45/152 (29%), Positives = 63/152 (41%), Gaps = 15/152 (9%)
 Frame = +3

Query  270  EFQKLKDEFKTGFEKVSGTIQNVEPNEEL-----KAATQETGFLKSLLSSNNPYKLDAGT  434
            EFQK+ D F    +K+S  IQ  E  +E+     K     T   +  L+       D   
Sbjct  120  EFQKIADTFNQMMDKLSTLIQTEEEKKEMQNNIIKFLQIMTQASEGDLTQRAEVTPDVFG  179

Query  435  AIKIMNALMTVSLSGLVVLTQNLIQMTADVDAALKDSIEGVQTQAEIQD---------VE  587
            ++     LMT  LS L+   +N  +        L D I+ +QT AEIQ          VE
Sbjct  180  SLADAFNLMTDGLSELIKEVKNSAEDVGQKSNILNDIIQKLQTGAEIQKQEIEKIASLVE  239

Query  588  ETPERAKTIQEK-IVAFVKNKDAHPLILSNLE  680
            E  E A   +EK  VA   +K+A   I+   E
Sbjct  240  EAAEIAYQTKEKTTVATDVSKEAMNAIIKGNE  271


>ref|YP_424804.1| hypothetical protein MCAP_0862 [Mycoplasma capricolum subsp. 
capricolum ATCC 27343]
 gb|ABC01376.1| membrane protein, putative [Mycoplasma capricolum subsp. capricolum 
ATCC 27343]
Length=750

 Score = 38.5 bits (88),  Expect = 1.1
 Identities = 58/250 (23%), Positives = 107/250 (42%), Gaps = 39/250 (15%)
 Frame = +3

Query  195  LIKQAAGGDVSDPTKKEAAEKQFLAEFQKLKDEFKTGFEKVSGTIQNVEPNE-ELKAATQ  371
            L K+    ++    K +  EK+     +K  ++ K    K   TI+N+E  + EL  + +
Sbjct  401  LNKKQLANELDIKIKSKQIEKELK---EKEIEDLKQDNNKSEETIKNLEKQDSELDTSIK  457

Query  372  ETGFLKSLLSSNNPYKLDAGTAIKIMNALMTVSLSGLVVLTQNLIQMTAD------VDAA  533
            +    KS L  N    LD    IK  N+        L + T++L Q+T +       +  
Sbjct  458  DLNNRKSELEKNLSETLDL---IKKENSTSEQLTKELDIKTKDLNQLTRENRLLEEKNKD  514

Query  534  LKDSIEGVQTQAEIQDVEETPERAKTIQE----KIVAFVKNKDAHPLILSNLEQAGWNLE  701
            LK +I   +T+ EI + E+T +  + ++     K + F KN        S LE+    LE
Sbjct  515  LKQNIASSKTKIEILESEKTKKSLELVKLEEDLKSIDFEKNS-------SLLEKE--KLE  565

Query  702  KDKDLSRVDMKKAMKPQVGESLEKllgeqfeefltefelAIPGEDETDDPNEVEELLDDV  881
             D+ + ++   + +     E L++ L +               +++TD PN++ E    V
Sbjct  566  NDEKIKKMHEAQTLLKDKQEELKERLDQLK-------------KNKTDLPNKISEKTKSV  612

Query  882  ETNVKEDPDL  911
            E+  K+  D+
Sbjct  613  ESLTKQISDI  622


>ref|XP_781933.1| PREDICTED: hypothetical protein, partial [Strongylocentrotus 
purpuratus]
 ref|XP_001176065.1| PREDICTED: hypothetical protein, partial [Strongylocentrotus 
purpuratus]
Length=1548

 Score = 38.1 bits (87),  Expect = 1.4
 Identities = 32/136 (23%), Positives = 66/136 (48%), Gaps = 6/136 (4%)
 Frame = +3

Query  345   NEELKAATQETGFLKSLLSSNNPYKLDAGTAIKIMNALMTVSLSGLVVLTQNLIQMTADV  524
             +++++    +   L++ L+S N  K     +++ +   ++   +G   + Q L   T+++
Sbjct  1189  DDDVRQLRADKASLEAALASANEEKRTFDDSLQKLRGDLSRVETGFKQMRQELGAKTSEL  1248

Query  525   DAALKDSIEGVQTQAEIQDVEETPERAKTIQEKIVAFVKNKDAHPLILSNLEQAGWNLEK  704
             +AA    +EG Q + +++  E+  E  +   E  +A + NKDA   I+  L +A   LEK
Sbjct  1249  EAA---QMEGQQVREQLKVEEQQVEAQEANLEASIAAIGNKDA---IIDELLEAKGQLEK  1302

Query  705   DKDLSRVDMKKAMKPQ  752
             +    R DM+     Q
Sbjct  1303  EIQGLRRDMQSTRHSQ  1318


>ref|ZP_05632205.1| hypothetical protein FulcA4_02157 [Fusobacterium ulcerans ATCC 
49185]
Length=570

 Score = 37.7 bits (86),  Expect = 1.9
 Identities = 49/235 (20%), Positives = 96/235 (40%), Gaps = 34/235 (14%)
 Frame = +3

Query  39   AEGIVINKEFAKAENIEEKICLKDFLDKPPEDIVQALSGSEEEDGWKQVYGDLIKQAAGG  218
            +E  +IN +F      +  I +K+   K P D  Q L   +EE   + +  ++       
Sbjct  96   SEAHIINVDFLTDRKTDGAIGIKE---KTPLDASQVLVPDDEEKNKEVLVTEIENSYLDN  152

Query  219  DVSDPTKKEAAEKQFLAEFQKLKDEFKTGFEKVSGTIQNVEPNEELKAATQETGFLKSLL  398
               +    E   K       KL +E +  +E     I+ +E  +E K+  ++   LK + 
Sbjct  153  MKINDLNLENILKSEYERNNKLLEEKRNYWET---RIKEIEKGKEFKSIRED---LKKVT  206

Query  399  SSNNP---YKLDAGTAIKIMNALMTVSLSGLVVLTQNLIQMTADVDAALKDSIEGVQTQA  569
            S  NP   +K+D                       +N+I     ++  LKD  E  Q + 
Sbjct  207  SLKNPLDIFKMDREV--------------------KNIIDSAKVLNDQLKDEKE--QIRL  244

Query  570  EIQDVEETPERAKTIQEKIVAFVKNKDAHPLILSNLEQAGWNLEKDKDLSRVDMK  734
            +++D++ +PE   ++++ + AFV+N +     L +L     N   +K +  + +K
Sbjct  245  DLEDMKNSPELKFSVEKSVEAFVQNGEIAIKDLDSLVNIYLNEVYEKKIYEIVVK  299