ORF TK16940

From Metagenes
Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : AACY01160045.1
Annotathon code: ORF_TK16940
Sample :
  • GPS: 31°10'30"N; 64°19'27.6"W
  • Sargasso Sea: Sargasso Sea, Station 11 - Bermuda (UK)
  • Open Ocean (-5m, 20.5°C, 0.1-0.8 microns)
Authors
Team : BioCell 2006
Username : jaimejardiner
Annotated on : 2008-03-19 18:52:37
  • MESURET guillaume

Synopsis

Genomic Sequence

>AACY01160045.1 ORF_TK16940 genomic DNA
GGATATTCTAGTTAACAAGGATGTTAATCTCAGAATTAATAAAATGAATAGAAAATTAATTAAACCTAAACAAGCTGCAAATATTCTGATGTGTAGCGAA
AGAACCTTAGAGACTTGGAGAAGAGAAGAGAAAGGTCCTAGTTATTATAAGATAGAAGGAAAAGTACTTTATGGGATTGATGATTTACAAAATTTTATTG
AAGGTTCCAGAGTGTCAGTTTCATAATAAACATTTATAAGGAGAATAAATGAAAATAAAAATTATAACTTTAGCAGCAATTGTTTCTTTAGTGGCTTCTT
GTTCAGGAAGTGATGGAAGTAAGGAAATAAGACTTAGTGATGCTGCAGAAACTACAGCGTATGTTGAATATCTTTTTTGTAAAAATGGACCTGATATGTC
GCAAGAGTCCTTTACAGCAATGATATCGGAATGGAATACAATACAAGACGGTATGGAAAATCCTGTTCCTATGTCTGTGGGACTCGTTCCACGAACTGAA
ACTGATTTGTATGATGGTATGTGGGTATTAGTTTGGCAATCTAAAGAGCAAAGCGAAACAGGCTGGGAAGAATGGTTGGCTGGACCTGCTGAAGATTGGA
TAGAAAAGACTAGTTCCATTCTTTCTTGTGTAGACTCAAATGGAGACGCTATAAATTATAGCTTCAATGTAAGTAATTTTAGACCTGCACAAGCTCAAGA
TGCAGAACCAGGAGGGGTCGTGGGTTTTAATTTCTGCAGTTACACCGACTCATTTGGTCCTAACGATTTAATTGCAGCCAATGGAATTTATAATCAATGG
CTAGATGCTGCTGTGGAAGCAGCAGGGACTGCTTCACCATATTTTTATACAATTCACGAACCTAATTTTAAAACTCCAATCCCAGG

Translation

[249 - 884/886]   direct strand
>ORF_TK16940 Translation [249-884   direct strand]
MKIKIITLAAIVSLVASCSGSDGSKEIRLSDAAETTAYVEYLFCKNGPDMSQESFTAMISEWNTIQDGMENPVPMSVGLVPRTETDLYDGMWVLVWQSKE
QSETGWEEWLAGPAEDWIEKTSSILSCVDSNGDAINYSFNVSNFRPAQAQDAEPGGVVGFNFCSYTDSFGPNDLIAANGIYNQWLDAAVEAAGTASPYFY
TIHEPNFKTPIP

[ Warning ] 3' incomplete: following codon is not a STOP
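
The translation and the 3' warning above can be reproduced with a few lines of code. The following is a minimal sketch, assuming the standard genetic code and 1-based inclusive coordinates as used on this page; the function names `translate_orf` and `is_three_prime_complete` are hypothetical and are not part of the Annotathon tools.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_orf(dna, start, end):
    """Translate dna[start..end] (1-based, inclusive).

    Returns (protein, next_codon). next_codon holds the bases that follow
    the ORF; it is shorter than 3 bases when the read ends first.
    Codons containing unexpected characters are translated as 'X'.
    """
    dna = dna.upper()
    coding = dna[start - 1:end]
    protein = "".join(
        CODON_TABLE.get(coding[i:i + 3], "X")
        for i in range(0, len(coding) - 2, 3)
    )
    next_codon = dna[end:end + 3]
    return protein, next_codon


def is_three_prime_complete(next_codon):
    """True only when the codon following the ORF is a STOP codon."""
    return next_codon in {"TAA", "TAG", "TGA"}
```

With the coordinates given above, 884 - 249 + 1 = 636 bases, i.e. 212 codons, which agrees with the query length of 212 reported by BLASTP. Only two bases of the 886-base read remain after position 884, so no STOP codon can follow the ORF.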

Phylogeny


Annotator commentaries

The only valid ORF is the one on the direct strand, frame +3 (the other ORFs contain STOP codons). The sequence is incomplete because it does not contain a STOP codon.

BLASTP returns very few results (only 8), with very low scores. Only the first 2 can be taken into account. Both come from the same organism: marine gamma proteobacterium HTCC2207. Since these two sequences come from the same genome (locus_tag "GB2207_xxxxx"), they are paralogs. The small number of BLAST hits suggests that the sequence codes for a very specific protein that is therefore not found in other organisms. In addition, marine gammaproteobacteria are poorly represented in the databases.

The multiple alignment shows that the sequence is incomplete at its C-terminal end, which explains the absence of a STOP codon.

The function and the biological process in which the protein takes part are completely unknown, but the protein domain search (InterPro) shows that it would contain a prokaryotic membrane lipoprotein lipid attachment site (IPR000437).

The protein would therefore be a membrane lipoprotein of a prokaryotic organism.
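
The lipoprotein prediction can be illustrated by looking for a lipobox-like motif near the N-terminus of the translated ORF. This is a minimal sketch only: the pattern `[LVI][ASTVI][GAS]C` is a commonly cited simplified lipobox consensus and is used here as an assumption; it is not the InterPro model behind IPR000437, and the function name `find_lipobox` and the 40-residue window are hypothetical choices.

```python
import re

# Simplified lipobox consensus (assumption for illustration only; this is
# not the InterPro/PROSITE model that produced the IPR000437 hit).
LIPOBOX = re.compile(r"[LVI][ASTVI][GAS]C")


def find_lipobox(protein, window=40):
    """Search the first `window` residues for a lipobox-like motif.

    Returns (cys_position, motif), where cys_position is the 1-based
    position of the conserved cysteine, or None when nothing matches.
    """
    match = LIPOBOX.search(protein[:window].upper())
    if match is None:
        return None
    # match.end() is 0-based exclusive, which equals the 1-based
    # position of the last matched residue (the cysteine).
    return match.end(), match.group()
```

Applied to the start of the translation, `find_lipobox("MKIKIITLAAIVSLVASCSGSDGSKEIR")` returns `(18, "VASC")`: the candidate lipid-attachment cysteine is residue 18, at the end of the hydrophobic stretch LAAIVSLVAS.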

PS: BLASTX returned many more results, but the scores were very low, and they corresponded to the ORF in frame +2, which was not valid.

Multiple Alignment

CLUSTAL W (1.83) multiple sequence alignment


ORF_TK16940      MKIKIIT-LAAIVSLVASCSGSDGSKEIR-----------LSDAAETTAYVEYLFCKNGP 48
marine           --MKLITGLAVTALLLVGCGDKNAEMDVAA----EMDMAAAQPAAPVSTFVEYMYCDGGA 54
2marine          --MNKLLAASFTVLALAGCSNDPAPEAAAAPDVAVAAEAMTFDMVGQVFFNEFIPCTAGP 58
                   :: :   :  .  :..*... .                    .    : *:: *  *.

ORF_TK16940      DMSQESFTAMISEWNTIQDGMENPVPMSVGLVPRTETDLYD-GMWVLVWQSKEQSETGWE 107
marine           DFSPENYAKLTAAWNLISEESPVPALGAFAIRPKVETELYD-GMWANIWSSVEAREAGWK 113
2marine          DFSEATVDAMVAEWRAS--GIAGEILGAWGYAPASENNRFQNGWWELQWSSKEAADAGWR 116
                 *:*  .   : : *.            : .  *  *.: :: * *   *.* *  ::**.

ORF_TK16940      EWLAG-PAEDWIEKTSSILSCVDSNGDAINYSFNVSNFRPAQAQDAEPGGVVGFNFCSYT 166
marine           DWVEN-HAEAFGAEFDSTLACN-AEKRFLFETMPIT--APVQEWDPAQQFQASYSFCSFK 169
2marine          QWAASDVAQAWSSKHENVMVCDAASRVSWDFNFPRDP-YSFGDIDESGQFVSAFLPCQLN 175
                 :*  .  *: :  : .. : *  :.      .:      .    *       .:  *. .

ORF_TK16940      DSFGPNDLIAANGIYNQWLDAAVEAAGTASPY-FYTIHEPNFKTPIP------------- 212
marine           EGKTQADGEAAGAAFAEWIADQRTLG-RGLNY-MAYLQIPTFDPETAGG----SIQDYTF 223
2marine          EGKTMDDLNVAIAAYNTFLDAIPVTENSFYSYGIYASNSDASEVDIYWGNFHPSFERMAL 235
                 :.    *  .* . :  ::            * :   :    .                 

ORF_TK16940      ---------------------------------------------------
marine           VRADFWGSADEQAADMTACMTEGNTAREMADAIYDCQDVGFDLYSIKRMES 274
2marine          ADATWMANGGETKAQMEAVMTCDTPDVHNAKLFYNPEDPDFS--------- 277
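
To put a number on how similar two rows of the alignment are, the aligned rows can be compared column by column. The sketch below is an assumption-laden illustration: the function name `pairwise_identity` is hypothetical, and it counts only columns where both rows carry a residue, whereas the BLAST "Identities" figure divides by the full alignment length including gapped columns, so the two percentages are not directly comparable.

```python
def pairwise_identity(row_a, row_b, gap="-"):
    """Count identical residues between two aligned rows.

    Returns (identical, compared), where `compared` is the number of
    columns in which neither row has a gap character.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    identical = 0
    compared = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue  # skip columns gapped in either row
        compared += 1
        if a == b:
            identical += 1
    return identical, compared
```

The rows passed in should be the concatenated alignment blocks for each sequence, with the sequence labels and the trailing residue counters removed.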
                                                                    
                                                                    

BLAST

BLASTP 2.2.15 [Oct-15-2006]

Reference:
Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman 
(1997), "Gapped BLAST and PSI-BLAST: a new generation of 
protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference:
Schäffer, Alejandro A., L. Aravind, Thomas L. Madden, Sergei 
Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and 
Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST 
protein database searches with composition-based statistics 
and other refinements", Nucleic Acids Res. 29:2994-3005.

RID: 1165164898-3336-34267133.BLASTQ4


Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
           4,198,544 sequences; 1,444,632,299 total letters

Query= ORF_TK16940 translation [249-884 direct strand] Length=212

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

gi|90415837|ref|ZP_01223770.1|  hypothetical protein GB2207_01...  82.4    1e-14
gi|90416936|ref|ZP_01224865.1|  hypothetical protein GB2207_06...  63.2    8e-09
gi|68128479|emb|CAJ08609.1|  hypothetical protein, conserved [Lei  38.9    0.16 
gi|118100713|ref|XP_417416.2|  PREDICTED: similar to DEP domain c  37.0    0.62
gi|115629823|ref|XP_001201703.1|  PREDICTED: similar to dynein...  35.0    2.4
gi|45187486|ref|NP_983709.1|  ADL387Cp [Eremothecium gossypii]...  33.9    5.4
gi|54642298|gb|EAL31047.1|  GA20299-PA [Drosophila pseudoobscura]  33.9    5.7
gi|86360666|ref|YP_472554.1|  flagellin C protein [Rhizobium e...  33.1    9.8
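
The hit table above can be filtered by E-value to separate the two HTCC2207 hits from the remaining weak matches. This is a minimal sketch that parses rows in the plain-text layout shown here; the name `significant_hits` and the default cutoff of 1e-5 are hypothetical choices for illustration, not values taken from the annotation.

```python
import re

# One row of the "Sequences producing significant alignments" table:
# identifier, free-text description, bit score, E-value, and an optional
# trailing "Gene info" link label that web-derived copies may carry.
HIT_ROW = re.compile(
    r"^(?P<id>gi\|\S+)\s+(?P<desc>.*?)\s+"
    r"(?P<bits>\d+(?:\.\d+)?)\s+"
    r"(?P<evalue>[0-9.]+(?:e[-+]?\d+)?)\s*(?:Gene info)?\s*$"
)


def significant_hits(table_text, max_evalue=1e-5):
    """Return (identifier, bit_score, evalue) for rows at or below the cutoff.

    max_evalue is an illustrative threshold (assumption), not a value
    prescribed by BLAST or by the annotators.
    """
    hits = []
    for line in table_text.splitlines():
        match = HIT_ROW.match(line.strip())
        if match is None:
            continue  # not a hit row (header, blank line, ...)
        evalue = float(match.group("evalue"))
        if evalue <= max_evalue:
            hits.append((match.group("id"), float(match.group("bits")), evalue))
    return hits
```

With this cutoff, only the rows with E-values 1e-14 and 8e-09 are kept, which matches the annotators' decision to consider just the first two hits.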

Alignments

>gi|90415837|ref|ZP_01223770.1|  hypothetical protein GB2207_01352 [marine gamma proteobacterium 
HTCC2207]
 gi|90332211|gb|EAS47408.1|  hypothetical protein GB2207_01352 [marine gamma proteobacterium 
HTCC2207]
Length=274

 Score = 82.4 bits (202),  Expect = 1e-14, Method: Composition-based stats.
 Identities = 53/175 (30%), Positives = 89/175 (50%), Gaps = 15/175 (8%)

Query  3    IKIITLAAIVSLVASCSGS-----DGSKEIRLS---DAAETTAYVEYLFCKNGPDMSQES  54
            +K+IT  A+ +L+    G      D + E+ ++    AA  + +VEY++C  G D S E+
Sbjct  1    MKLITGLAVTALLLVGCGDKNAEMDVAAEMDMAAAQPAAPVSTFVEYMYCDGGADFSPEN  60

Query  55   FTAMISEWNTIQDGMENPVPM--SVGLVPRTETDLYDGMWVLVWQSKEQSETGWEEWLAG  112
            +  + + WN I +  E+PVP   +  + P+ ET+LYDGMW  +W S E  E GW++W+  
Sbjct  61   YAKLTAAWNLISE--ESPVPALGAFAIRPKVETELYDGMWANIWSSVEAREAGWKDWVEN  118

Query  113  PAEDWIEKTSSILSCVDSNGDAINYSFNVSNFRPAQAQDAEPGGVVGFNFCSYTD  167
             AE +  +  S L+C   N +       +    P Q  D        ++FCS+ +
Sbjct  119  HAEAFGAEFDSTLAC---NAEKRFLFETMPITAPVQEWDPAQQFQASYSFCSFKE  170


>gi|90416936|ref|ZP_01224865.1|  hypothetical protein GB2207_06733 [marine gamma proteobacterium 
HTCC2207]
 gi|90331283|gb|EAS46527.1|  hypothetical protein GB2207_06733 [marine gamma proteobacterium 
HTCC2207]
Length=277

 Score = 63.2 bits (152),  Expect = 8e-09, Method: Composition-based stats.
 Identities = 45/154 (29%), Positives = 73/154 (47%), Gaps = 9/154 (5%)

Query  38   YVEYLFCKNGPDMSQESFTAMISEWNTIQDGMENPVPMSVGLVPRTETDLY-DGMWVLVW  96
            + E++ C  GPD S+ +  AM++EW     G+   +  + G  P +E + + +G W L W
Sbjct  48   FNEFIPCTAGPDFSEATVDAMVAEWRA--SGIAGEILGAWGYAPASENNRFQNGWWELQW  105

Query  97   QSKEQSETGWEEWLAGP-AEDWIEKTSSILSCVDSNGDAINYSFNVSNFRPAQAQDAEPG  155
             SKE ++ GW +W A   A+ W  K  +++ C       +++ FN     P    D +  
Sbjct  106  SSKEAADAGWRQWAASDVAQAWSSKHENVMVC--DAASRVSWDFNFPR-DPYSFGDIDES  162

Query  156  G--VVGFNFCSYTDSFGPNDLIAANGIYNQWLDA  187
            G  V  F  C   +    +DL  A   YN +LDA
Sbjct  163  GQFVSAFLPCQLNEGKTMDDLNVAIAAYNTFLDA  196


>gi|68128479|emb|CAJ08609.1|  hypothetical protein, conserved [Leishmania major]
Length=280

 Score = 38.9 bits (89),  Expect = 0.16, Method: Composition-based stats.
 Identities = 31/124 (25%), Positives = 56/124 (45%), Gaps = 10/124 (8%)

Query  27   IRLSDAAETTAYVEYL----FCKNGPDMSQESFTAMISEWNTIQDGMENPVPMSVGLVPR  82
            + L D  E TA +       F KN P+   E   A +  W     GM+NP   +V + P 
Sbjct  99   VLLVDVVEGTAQLSLQHMKSFLKNRPNTFPEVKDAEV--WFLRMGGMQNPQGAAVSVPPL  156

Query  83   TETDLYDGMWVLVWQSK-EQSETGWEEWLAGPAEDWIEKTSSILSCVDSNGDAINYSFNV  141
             + +   G+W   W++   + E+ W+ W AG  E +I    + + C  +N + ++ +  V
Sbjct  157  LQKNDETGLW--QWRTDIRKMESVWDGWFAGLDEAFISLPCAKMLCT-ANAERLDKTLTV  213

Query  142  SNFR  145
            +  +
Sbjct  214  AQMQ  217


>gi|118100713|ref|XP_417416.2|  PREDICTED: similar to DEP domain containing 6 [Gallus gallus]
Length=413

 Score = 37.0 bits (84),  Expect = 0.62, Method: Composition-based stats.
 Identities = 24/82 (29%), Positives = 36/82 (43%), Gaps = 17/82 (20%)

Query  108  EWLAGPAEDWIEKTSSILSCVDSNGDAINYSFNVSNFRPAQAQDAEPGG---VVGFNFCS  164
            E L  P   +++KT +I+      GDA+ + F V   RP   Q  +PGG     G   C 
Sbjct  322  EELLSPGAPYVKKTLTIV------GDAVGWGFVVRGGRPCHIQAVDPGGPAAAAGMKVCQ  375

Query  165  YTDSFGPNDLIAANGIYNQWLD  186
            +        + + NG+Y   LD
Sbjct  376  F--------VFSVNGMYVLHLD  389


>gi|115629823|ref|XP_001201703.1|  PREDICTED: similar to dynein, axonemal, heavy chain 5, partial 
[Strongylocentrotus purpuratus]
 gi|115640858|ref|XP_001187076.1|  PREDICTED: similar to dynein, axonemal, heavy chain 5, partial 
[Strongylocentrotus purpuratus]
Length=1103

 Score = 35.0 bits (79),  Expect = 2.4, Method: Composition-based stats.
 Identities = 27/92 (29%), Positives = 41/92 (44%), Gaps = 8/92 (8%)

Query  74   PMSVGLVPRTETDLYDGMWVLVWQSKEQSETGWEEW--LAGPAED-WIEKTSSIL----S  126
            P   G +     D  DG++  +W+   +S+ G   W  L GP +  WIE  +S+L    +
Sbjct  151  PQMFGRLDVATNDWTDGIFSTLWRRTLRSKKGEHVWIVLDGPVDAIWIENLNSVLDDNKT  210

Query  127  CVDSNGDAINYSFNVS-NFRPAQAQDAEPGGV  157
               +NGD I  + N    F P    +A P  V
Sbjct  211  LTLANGDRIPMAPNCKIVFEPHNIDNASPATV  242


>gi|45187486|ref|NP_983709.1|  ADL387Cp [Eremothecium gossypii]
 gi|44982224|gb|AAS51533.1|  ADL387Cp [Ashbya gossypii ATCC 10895]
Length=488

 Score = 33.9 bits (76),  Expect = 5.4, Method: Composition-based stats.
 Identities = 16/35 (45%), Positives = 20/35 (57%), Gaps = 0/35 (0%)

Query  77   VGLVPRTETDLYDGMWVLVWQSKEQSETGWEEWLA  111
            VG V  TE+++YD +  L  Q  E    GWEEW A
Sbjct  24   VGSVIPTESEVYDAVAQLWRQEPELERAGWEEWRA  58


>gi|54642298|gb|EAL31047.1|  GA20299-PA [Drosophila pseudoobscura]
Length=420

 Score = 33.9 bits (76),  Expect = 5.7, Method: Composition-based stats.
 Identities = 30/106 (28%), Positives = 47/106 (44%), Gaps = 6/106 (5%)

Query  108  EWLAGPAEDWIEKTSSILSCVDSNGDAINYSFNVSNFRPAQAQDAEPGGVVGFNFCSYTD  167
            ++L G  + WI K S I +  + N +   ++ NV  F   +       G+    F S  D
Sbjct  304  DYLEGIMKRWIAKDSEIANREEFNTET--FTINVQPFSQFRDFPRTRSGMTDTRFFS-ED  360

Query  168  SFGPND---LIAANGIYNQWLDAAVEAAGTASPYFYTIHEPNFKTP  210
             F  +      AAN I+N  L+   E +G A+  F T H P+ + P
Sbjct  361  CFHLSQRGHAAAANSIWNNMLELPGEKSGFATQLFETFHCPSEQRP  406


>gi|86360666|ref|YP_472554.1|  flagellin C protein [Rhizobium etli CFN 42]
 gi|86284768|gb|ABC93827.1|  flagellin C protein [Rhizobium etli CFN 42]
Length=302

 Score = 33.1 bits (74),  Expect = 9.8, Method: Composition-based stats.
 Identities = 16/35 (45%), Positives = 23/35 (65%), Gaps = 1/35 (2%)

Query  118  IEKTSSILSCVDSNGDAINYSFNVSNFRPAQAQDA  152
            IE +S IL   D+NGD++ YS +++NF   Q Q A
Sbjct  181  IETSSGILGTADANGDSV-YSLDITNFTTGQIQSA  214




---------------------------------------------------------------------




BLASTX 2.2.15 [Oct-15-2006]

RID: 1164717001-29286-92033587115.BLASTQ2


Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
           4,186,904 sequences; 1,441,309,946 total letters

Query= ORF_TK16940 genomic DNA Length=886

Sequences producing significant alignments:                        (Bits)  Value

gi|90415837|ref|ZP_01223770.1|  hypothetical protein GB2207_01...  97.4    8e-19
gi|90416936|ref|ZP_01224865.1|  hypothetical protein GB2207_06...  74.3    7e-12
gi|69935774|ref|ZP_00630661.1|  conserved hypothetical protein...  47.4    0.001
gi|13488566|ref|NP_109573.1|  hypothetical protein msr9739 [Me...  46.6    0.002
gi|84687548|ref|ZP_01015424.1|  hypothetical protein RB2654_04...  46.6    0.002
gi|89362044|ref|ZP_01199855.1|  conserved hypothetical protein...  45.8    0.003
gi|90425047|ref|YP_533417.1|  hypothetical protein RPC_3559 [R...  45.1    0.005
gi|89362967|ref|ZP_01200765.1|  conserved hypothetical protein...  45.1    0.005
gi|69937719|ref|ZP_00632326.1|  conserved hypothetical protein...  45.1    0.005
gi|46203306|ref|ZP_00208897.1|  hypothetical protein Magn03005...  45.1    0.005
gi|110636165|ref|YP_676373.1|  putative transcriptional regula...  45.1    0.005
gi|89361635|ref|ZP_01199449.1|  conserved hypothetical protein...  44.7    0.006
gi|89362857|ref|ZP_01200658.1|  conserved hypothetical protein...  44.3    0.008
gi|85709928|ref|ZP_01040993.1|  hypothetical protein NAP1_1362...  44.3    0.008
gi|113935183|ref|ZP_01421082.1|  conserved hypothetical protei...  43.9    0.010
gi|89361037|ref|ZP_01198853.1|  hypothetical protein XautDRAFT...  43.9    0.010
gi|90425384|ref|YP_533754.1|  putative transcriptional regulat...  43.9    0.010
gi|115526028|ref|YP_782939.1|  putative transcriptional regula...  43.9    0.010
gi|33603113|ref|NP_890673.1|  hypothetical protein BB4139 [Bor...  43.1    0.017
gi|67545243|ref|ZP_00423166.1|  conserved hypothetical protein...  43.1    0.017
gi|83944719|ref|ZP_00957085.1|  hypothetical protein OA2633_08...  42.7    0.022
gi|78699621|ref|ZP_00864112.1|  conserved hypothetical protein...  42.7    0.022
gi|46201985|ref|ZP_00208339.1|  hypothetical protein Magn03008...  42.7    0.022
gi|110634682|ref|YP_674890.1|  hypothetical protein Meso_2338 ...  42.7    0.022
gi|69938585|ref|ZP_00633026.1|  conserved hypothetical protein...  42.4    0.029
gi|90022439|ref|YP_528266.1|  hypothetical protein Sde_2797 [S...  42.0    0.038
gi|75675040|ref|YP_317461.1|  hypothetical protein Nwi_0845 [N...  42.0    0.038
gi|84704640|ref|ZP_01018140.1|  hypothetical protein PB2503_11...  41.6    0.050
gi|27375163|ref|NP_766692.1|  hypothetical protein bll0052 [Br...  41.2    0.065
gi|110632884|ref|YP_673092.1|  phage transcriptional regulator...  41.2    0.065
gi|78695988|ref|ZP_00860499.1|  conserved hypothetical protein...  40.4    0.11
gi|83648337|ref|YP_436772.1|  hypothetical protein HCH_05691 [...  40.0    0.14
gi|78697501|ref|ZP_00862008.1|  conserved hypothetical protein...  40.0    0.14
gi|103487974|ref|YP_617535.1|  putative transcriptional regula...  40.0    0.14
gi|85713466|ref|ZP_01044456.1|  hypothetical protein NB311A_02...  39.7    0.19
gi|71367751|ref|ZP_00658269.1|  Excisionase/Xis, DNA-binding [...  39.3    0.25
gi|92118041|ref|YP_577770.1|  hypothetical protein Nham_2528 [...  38.9    0.32
gi|85716955|ref|ZP_01047919.1|  hypothetical protein NB311A_09...  38.9    0.32
gi|113935160|ref|ZP_01421059.1|  conserved hypothetical protei...  38.5    0.42
gi|68128479|emb|CAJ08609.1|  hypothetical protein, conserved [Lei  38.5    0.42
gi|113935557|ref|ZP_01421455.1|  hypothetical protein CaulDRAF...  38.1    0.55
gi|90022429|ref|YP_528256.1|  hypothetical protein Sde_2785 [S...  37.7    0.72
gi|75676151|ref|YP_318572.1|  hypothetical protein Nwi_1960 [N...  37.7    0.72
gi|85708428|ref|ZP_01039494.1|  hypothetical protein NAP1_0429...  37.7    0.72
gi|30250425|ref|NP_842495.1|  hypothetical protein NE2506 [Nit...  37.4    0.94
gi|103488103|ref|YP_617664.1|  hypothetical protein Sala_2625 ...  36.6    1.6
gi|94496783|ref|ZP_01303358.1|  hypothetical protein SKA58_166...  36.2    2.1
gi|54297854|ref|YP_124223.1|  hypothetical protein lpp1909 [Le...  36.2    2.1
gi|89338954|ref|ZP_01191719.1|  conserved hypothetical protein...  36.2    2.1
gi|85713467|ref|ZP_01044457.1|  hypothetical protein NB311A_02...  36.2    2.1
gi|99078556|ref|YP_611814.1|  hypothetical protein TM1040_3583...  36.2    2.1
gi|118100713|ref|XP_417416.2|  PREDICTED: similar to DEP domain c  35.8    2.7
gi|115478597|ref|NP_001062892.1|  Os09g0327800 [Oryza sativa (...  35.8    2.7
gi|54294011|ref|YP_126426.1|  hypothetical protein lpl1072 [Le...  35.4    3.6
gi|54642298|gb|EAL31047.1|  GA20299-PA [Drosophila pseudoobscura]  35.4    3.6
gi|27375178|ref|NP_766707.1|  hypothetical protein bsr0067 [Br...  35.0    4.7
gi|84702416|ref|ZP_01016991.1|  hypothetical protein PB2503_05...  35.0    4.7
gi|94496682|ref|ZP_01303257.1|  hypothetical protein SKA58_161...  34.7    6.1
gi|87198390|ref|YP_495647.1|  hypothetical protein Saro_0365 [...  34.7    6.1
gi|60681595|ref|YP_211739.1|  possible DNA-binding protein [Ba...  34.7    6.1
gi|113937098|ref|ZP_01422982.1|  conserved hypothetical protei...  34.3    7.9
gi|113935535|ref|ZP_01421433.1|  hypothetical protein CaulDRAF...  34.3    7.9
gi|94265084|ref|ZP_01288851.1|  Excisionase/Xis, DNA-binding [...  34.3    7.9
gi|91775534|ref|YP_545290.1|  hypothetical protein Mfla_1181 [...  34.3    7.9
gi|52842576|ref|YP_096375.1|  hypothetical protein lpg2367 [Le...  34.3    7.9
gi|89362079|ref|ZP_01199890.1|  conserved hypothetical protein...  34.3    7.9  
gi|85717137|ref|ZP_01048096.1|  hypothetical protein NB311A_02...  34.3    7.9  

>gi|90415837|ref|ZP_01223770.1|  hypothetical protein GB2207_01352 [marine gamma proteobacterium 
HTCC2207]
 gi|90332211|gb|EAS47408.1|  hypothetical protein GB2207_01352 [marine gamma proteobacterium 
HTCC2207]
Length=274

 Score = 97.4 bits (241),  Expect = 8e-19
 Identities = 61/215 (28%), Positives = 98/215 (45%), Gaps = 16/215 (7%)
 Frame = +3

Query  255  IKIIT-LAAIVSLVASCSGSDGSKEIRLS-------DAAETTAYVEYLFCKNGPDMSQES  410
            +K+IT LA    L+  C   +   ++           AA  + +VEY++C  G D S E+
Sbjct  1    MKLITGLAVTALLLVGCGDKNAEMDVAAEMDMAAAQPAAPVSTFVEYMYCDGGADFSPEN  60

Query  411  FTAMISEWNTIQDGMENPVPM--SVGLVPRTETDLYDGMWVLVWQSKEQSETGWEEWLAG  584
            +  + + WN I +  E+PVP   +  + P+ ET+LYDGMW  +W S E  E GW++W+  
Sbjct  61   YAKLTAAWNLISE--ESPVPALGAFAIRPKVETELYDGMWANIWSSVEAREAGWKDWVEN  118

Query  585  PAEDWIEKTSSILSCVDSNGDAINYSFNVSNFRPAQAQDAEPGGVVGFNFCSYTDSFGPN  764
             AE +  +  S L+C   N +       +    P Q  D        ++FCS+ +     
Sbjct  119  HAEAFGAEFDSTLAC---NAEKRFLFETMPITAPVQEWDPAQQFQASYSFCSFKEGKTQA  175

Query  765  DLIAANGIYNQWLDAAVEAAGTASPYFYTIHEPNF  869
            D  AA   + +W+ A     G    Y   +  P F
Sbjct  176  DGEAAGAAFAEWI-ADQRTLGRGLNYMAYLQIPTF  209


>gi|90416936|ref|ZP_01224865.1|  hypothetical protein GB2207_06733 [marine gamma proteobacterium 
HTCC2207]
 gi|90331283|gb|EAS46527.1|  hypothetical protein GB2207_06733 [marine gamma proteobacterium 
HTCC2207]
Length=277

 Score = 74.3 bits (181),  Expect = 7e-12
 Identities = 57/218 (26%), Positives = 91/218 (41%), Gaps = 22/218 (10%)
 Frame = +3

Query  258  KIITLAAIVSLVASCSGSDGSKEIRLSDAAETTA-----------YVEYLFCKNGPDMSQ  404
            K++  +  V  +A CS     +     D A               + E++ C  GPD S+
Sbjct  3    KLLAASFTVLALAGCSNDPAPEAAAAPDVAVAAEAMTFDMVGQVFFNEFIPCTAGPDFSE  62

Query  405  ESFTAMISEWNTIQDGMENPVPMSVGLVPRTETDLY-DGMWVLVWQSKEQSETGWEEWLA  581
             +  AM++EW     G+   +  + G  P +E + + +G W L W SKE ++ GW +W A
Sbjct  63   ATVDAMVAEWRA--SGIAGEILGAWGYAPASENNRFQNGWWELQWSSKEAADAGWRQWAA  120

Query  582  GP-AEDWIEKTSSILSCVDSNGDAINYSFNVSNFRPAQAQDAEPGG--VVGFNFCSYTDS  752
               A+ W  K  +++ C       +++ FN     P    D +  G  V  F  C   + 
Sbjct  121  SDVAQAWSSKHENVMVC--DAASRVSWDFNFPR-DPYSFGDIDESGQFVSAFLPCQLNEG  177

Query  753  FGPNDLIAANGIYNQWLDAAVEAAGTASPYFYTIHEPN  866
               +DL  A   YN +LDA        S Y Y I+  N
Sbjct  178  KTMDDLNVAIAAYNTFLDAIPVTEN--SFYSYGIYASN  213


>gi|69935774|ref|ZP_00630661.1|  conserved hypothetical protein [Paracoccus denitrificans PD1222]
 gi|69152707|gb|EAN65857.1|  conserved hypothetical protein [Paracoccus denitrificans PD1222]
Length=93

 Score = 47.4 bits (111),  Expect = 0.001
 Identities = 21/49 (42%), Positives = 34/49 (69%), Gaps = 0/49 (0%)
 Frame = +2

Query  53   KLIKPKQAANILMCSERTLETWRREEKGPSYYKIEGKVLYGIDDLQNFI  199
            + ++ K+AA+ L  S RTLE  R    GP+Y+K+ G+V+Y I+DLQ ++
Sbjct  11   RYLRTKEAAHFLSLSARTLEKHRTYGTGPAYHKLGGRVVYAIEDLQAWV  59


>gi|13488566|ref|NP_109573.1|  hypothetical protein msr9739 [Mesorhizobium loti MAFF303099]
 gi|14028320|dbj|BAB54912.1|  msr9739 [Mesorhizobium loti MAFF303099]
Length=93

 Score = 46.6 bits (109),  Expect = 0.002
 Identities = 23/58 (39%), Positives = 37/58 (63%), Gaps = 1/58 (1%)
 Frame = +2

Query  53   KLIKPKQAANILMCSERTLETWRREEKGPSYYKIEGKVLYGIDDLQNFIE-GSRVSVS  223
            + ++  +AA+ L  S RTLE  R    GP+Y K+ G+V+Y +DDLQ +++ G+  S S
Sbjct  11   RYLRTSEAAHFLSLSARTLEKHRTYGTGPAYCKLGGRVVYAVDDLQAWVQRGAMTSTS  68


>gi|84687548|ref|ZP_01015424.1|  hypothetical protein RB2654_04959 [Rhodobacterales bacterium 
HTCC2654]
 gi|84664457|gb|EAQ10945.1|  hypothetical protein RB2654_04959 [Rhodobacterales bacterium 
HTCC2654]
Length=66

 Score = 46.6 bits (109),  Expect = 0.002
 Identities = 24/55 (43%), Positives = 31/55 (56%), Gaps = 0/55 (0%)
 Frame = +2

Query  53   KLIKPKQAANILMCSERTLETWRREEKGPSYYKIEGKVLYGIDDLQNFIEGSRVS  217
            K +   QAA+ L  S  TLETWR E  GP YYK+  +V Y   DL  ++E   V+
Sbjct  6    KFLDTAQAAHYLGYSMSTLETWRCESMGPKYYKLHRRVRYKQSDLDAWLEAQVVT  60


>gi|89362044|ref|ZP_01199855.1|  conserved hypothetical protein [Xanthobacter autotrophicus Py2]
 gi|89349090|gb|EAS14380.1|  conserved hypothetical protein [Xanthobacter autotrophicus Py2]
Length=93

 Score = 45.8 bits (107),  Expect = 0.003
 Identities = 21/55 (38%), Positives = 35/55 (63%), Gaps = 0/55 (0%)
 Frame = +2

Query  53   KLIKPKQAANILMCSERTLETWRREEKGPSYYKIEGKVLYGIDDLQNFIEGSRVS  217
            + ++ K+AA  L  S RTLE  R    GP+Y+K+ G+V+Y +DDL+ + +   V+
Sbjct  11   RYLRTKEAAAFLSLSARTLEKHRTYGTGPAYHKLGGRVVYSVDDLKAWADRGAVT  65


>gi|90425047|ref|YP_533417.1|  hypothetical protein RPC_3559 [Rhodopseudomonas palustris BisB18]
 gi|90107061|gb|ABD89098.1|  hypothetical protein RPC_3559 [Rhodopseudomonas palustris BisB18]
Length=89

 Score = 45.1 bits (105),  Expect = 0.005
 Identities = 25/58 (43%), Positives = 36/58 (62%), Gaps = 1/58 (1%)
 Frame = +2

Query  53   KLIKPKQAANILMCSERTLETWRREEKGPSYYKIEGKVLYGIDDLQNFIE-GSRVSVS  223
            +L++  +AA  L  S RTLE  R    GP Y KI G+V+Y ++DLQ++   G+R S S
Sbjct  11   RLLRTPEAARFLGISNRTLEKHRTYGTGPVYRKIGGRVVYAVEDLQSWSAIGARKSTS  68


>gi|89362967|ref|ZP_01200765.1|  conserved hypothetical protein [Xanthobacter autotrophicus Py2]
 gi|89348253|gb|EAS13556.1|  conserved hypothetical protein [Xanthobacter autotrophicus Py2]
Length=93

 Score = 45.1 bits (105),  Expect = 0.005
 Identities = 20/46 (43%), Positives = 30/46 (65%), Gaps = 0/46 (0%)
 Frame = +2

Query  53   KLIKPKQAANILMCSERTLETWRREEKGPSYYKIEGKVLYGIDDLQ  190
            + ++ K+AA  L  S RTLE  R    GP+Y K+ G+V+Y +DDL+
Sbjct  11   RFLRTKEAAEFLSLSARTLEKHRTYGTGPAYRKLGGRVVYAVDDLE  56

ORF finding

No ORFs were found in reading frame 1.

>ORF number 1 in reading frame 2 on the direct strand extends from base 2 to base 226.
GATATTCTAGTTAACAAGGATGTTAATCTCAGAATTAATAAAATGAATAGAAAATTAATT
AAACCTAAACAAGCTGCAAATATTCTGATGTGTAGCGAAAGAACCTTAGAGACTTGGAGA
AGAGAAGAGAAAGGTCCTAGTTATTATAAGATAGAAGGAAAAGTACTTTATGGGATTGAT
GATTTACAAAATTTTATTGAAGGTTCCAGAGTGTCAGTTTCATAA

>Translation of ORF number 1 in reading frame 2 on the direct strand.
DILVNKDVNLRINKMNRKLIKPKQAANILMCSERTLETWRREEKGPSYYKIEGKVLYGID
DLQNFIEGSRVSVS*

>ORF number 1 in reading frame 3 on the direct strand extends from base 249 to base 884.
ATGAAAATAAAAATTATAACTTTAGCAGCAATTGTTTCTTTAGTGGCTTCTTGTTCAGGA
AGTGATGGAAGTAAGGAAATAAGACTTAGTGATGCTGCAGAAACTACAGCGTATGTTGAA
TATCTTTTTTGTAAAAATGGACCTGATATGTCGCAAGAGTCCTTTACAGCAATGATATCG
GAATGGAATACAATACAAGACGGTATGGAAAATCCTGTTCCTATGTCTGTGGGACTCGTT
CCACGAACTGAAACTGATTTGTATGATGGTATGTGGGTATTAGTTTGGCAATCTAAAGAG
CAAAGCGAAACAGGCTGGGAAGAATGGTTGGCTGGACCTGCTGAAGATTGGATAGAAAAG
ACTAGTTCCATTCTTTCTTGTGTAGACTCAAATGGAGACGCTATAAATTATAGCTTCAAT
GTAAGTAATTTTAGACCTGCACAAGCTCAAGATGCAGAACCAGGAGGGGTCGTGGGTTTT
AATTTCTGCAGTTACACCGACTCATTTGGTCCTAACGATTTAATTGCAGCCAATGGAATT
TATAATCAATGGCTAGATGCTGCTGTGGAAGCAGCAGGGACTGCTTCACCATATTTTTAT
ACAATTCACGAACCTAATTTTAAAACTCCAATCCCA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
MKIKIITLAAIVSLVASCSGSDGSKEIRLSDAAETTAYVEYLFCKNGPDMSQESFTAMIS
EWNTIQDGMENPVPMSVGLVPRTETDLYDGMWVLVWQSKEQSETGWEEWLAGPAEDWIEK
TSSILSCVDSNGDAINYSFNVSNFRPAQAQDAEPGGVVGFNFCSYTDSFGPNDLIAANGI
YNQWLDAAVEAAGTASPYFYTIHEPNFKTPIP

>ORF number 1 in reading frame 1 on the reverse strand extends from base 232 to base 666.
TTTATAGCGTCTCCATTTGAGTCTACACAAGAAAGAATGGAACTAGTCTTTTCTATCCAA
TCTTCAGCAGGTCCAGCCAACCATTCTTCCCAGCCTGTTTCGCTTTGCTCTTTAGATTGC
CAAACTAATACCCACATACCATCATACAAATCAGTTTCAGTTCGTGGAACGAGTCCCACA
GACATAGGAACAGGATTTTCCATACCGTCTTGTATTGTATTCCATTCCGATATCATTGCT
GTAAAGGACTCTTGCGACATATCAGGTCCATTTTTACAAAAAAGATATTCAACATACGCT
GTAGTTTCTGCAGCATCACTAAGTCTTATTTCCTTACTTCCATCACTTCCTGAACAAGAA
GCCACTAAAGAAACAATTGCTGCTAAAGTTATAATTTTTATTTTCATTTATTCTCCTTAT
AAATGTTTATTATGA

>Translation of ORF number 1 in reading frame 1 on the reverse strand.
FIASPFESTQERMELVFSIQSSAGPANHSSQPVSLCSLDCQTNTHIPSYKSVSVRGTSPT
DIGTGFSIPSCIVFHSDIIAVKDSCDISGPFLQKRYSTYAVVSAASLSLISLLPSLPEQE
ATKETIAAKVIIFIFIYSPYKCLL*

No ORFs were found in reading frame 2.

>ORF number 1 in reading frame 3 on the reverse strand extends from base 360 to base 584.
TACCCACATACCATCATACAAATCAGTTTCAGTTCGTGGAACGAGTCCCACAGACATAGG
AACAGGATTTTCCATACCGTCTTGTATTGTATTCCATTCCGATATCATTGCTGTAAAGGA
CTCTTGCGACATATCAGGTCCATTTTTACAAAAAAGATATTCAACATACGCTGTAGTTTC
TGCAGCATCACTAAGTCTTATTTCCTTACTTCCATCACTTCCTGA

>Translation of ORF number 1 in reading frame 3 on the reverse strand.
YPHTIIQISFSSWNESHRHRNRIFHTVLYCIPFRYHCCKGLLRHIRSIFTKKIFNIRCSF
CSITKSYFLTSITS*
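
The six-frame scan reported above can be approximated with a short script. This is a simplified sketch, not the algorithm of the ORF-finding tool used on this page: it lists stop-free stretches in each of the three frames on both strands, without requiring a start codon, and reports coordinates on the strand being scanned with the terminating STOP codon excluded. The names `reverse_complement` and `open_stretches` and the `min_codons` default are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def open_stretches(dna, min_codons=50):
    """List stop-free stretches in all six reading frames.

    Returns tuples (strand, frame, start, end) with 1-based inclusive
    coordinates on the scanned strand; the STOP codon itself is excluded.
    A stretch that reaches the end of the read without a STOP is also
    reported (it is open at its 3' end).
    """
    results = []
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for frame in range(3):
            run_start = frame
            pos = frame
            while pos + 3 <= len(seq):
                if seq[pos:pos + 3] in STOP_CODONS:
                    if (pos - run_start) // 3 >= min_codons:
                        results.append((strand, frame + 1, run_start + 1, pos))
                    run_start = pos + 3
                pos += 3
            # Trailing stretch with no STOP before the end of the read.
            if (pos - run_start) // 3 >= min_codons:
                results.append((strand, frame + 1, run_start + 1, pos))
    return results
```

Because the rules differ from those of the original tool (for example in how a start position is chosen), the boundaries it reports may not match the listing above exactly.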