ORF EE21660

Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : AACY01160237.1
Annotathon code: ORF_EE21660
Sample :
  • GPS: 31°10'30"N; 64°19'27.6"W
  • Sargasso Sea: Sargasso Sea, Station 11 - Bermuda (UK)
  • Open Ocean (-5m, 20.5°C, 0.1-0.8 microns)
Authors
Team : BioCell 2007
Username : TAMT
Annotated on : 2008-03-19 18:52:37
  • MEILAC THOMAS
  • TSALANLAL AMEL

Synopsis

Genomic Sequence

>AACY01160237.1 ORF_EE21660 genomic DNA
ATATTATCAACACTATCTACTTGTATAAAATTTAATTCAAACCTAAATTCATATTTATCAAACCATTTTCTTGAACCCTCAAAATAAAATTTATCTTCTT
TTGTAAATTTCTTTCTACTATTTTCTATTCTTTTACAACACCCAAAAAAATCATCTTGACTATCCGACTTTATATAAACATTTAAAGGAAAACTATTTCT
TGCTATTTTAGCATACAACATTTGAGAATCATCTACCCAGCCTGCCATTTTATAATCTGCACTATGTATTAAATCAAACTTTACAATACCAACATTTGGT
GCCTTTGCAGCAAAATCAGTATTTACATCTCCAGGGGCCCATTTATTTTCTCTTTCTATATTAGATTTTGTTCCAACATCCCATTCAACAATTATCGGAT
ACTCTATTTTATCAATAACTATTTTTTTAAGTTGTTCATAATTTGTTATCTGTTTTCCACCAAGACTTATTCCATATTTCTTATTTGTGAAAAATAGTAC
TGGTATCTTTTTTAACCCAAGAAAATAAGAAACCATAATTCTACCTTGTCCAGGATGTAATCCCCACGCCAAATCTGGATTTTTTAAACTTGGTGTTTCT
CCACTTGGAATAATTGTACCTGGATTATAACCTAACAAAAATATATCCTCTACATCATCATCACCAAAATGTGACATATCTAATCCAAGCATACCAAGTG
GATAAGTAAAACCCATTTCCAAGATACTCTGACACATCCATTTTTGTAACCAAAAATGATGACCATAATCATTAAAATCTTGAAAATTTTTGTGAAAAAT
ACCCTGTGTAATTTGAGAA

Translation

[2 - 817/819]   indirect strand
>ORF_EE21660 Translation [2-817   indirect strand]
SQITQGIFHKNFQDFNDYGHHFWLQKWMCQSILEMGFTYPLGMLGLDMSHFGDDDVEDIFLLGYNPGTIIPSGETPSLKNPDLAWGLHPGQGRIMVSYFL
GLKKIPVLFFTNKKYGISLGGKQITNYEQLKKIVIDKIEYPIIVEWDVGTKSNIERENKWAPGDVNTDFAAKAPNVGIVKFDLIHSADYKMAGWVDDSQM
LYAKIARNSFPLNVYIKSDSQDDFFGCCKRIENSRKKFTKEDKFYFEGSRKWFDKYEFRFELNFIQVDSVDN

[ Warning ] 5' incomplete: does not start with a Methionine
[ Warning ] 3' incomplete: following codon is not a STOP

Phylogeny


Annotator commentaries

1/ ORF search.

First, we searched for ORFs with ORF Finder in SMS. The results of this analysis are reported in the ORF field: a list of 3 ORFs of different lengths. As instructed, we chose the longest one: ORF no. 1 in reading frame 2 of the reverse strand, from base 2 to base 817 (816 bp).

This sequence appears to be coding, since it contains at least one true-positive ORF of 272 codons, i.e. longer than the 60-codon threshold.
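The length check is simple arithmetic on inclusive coordinates. A minimal sketch, in which the function name and the `MIN_CODONS` constant are illustrative and the 60-codon threshold is the one quoted in the commentary:

```python
MIN_CODONS = 60  # threshold quoted in the commentary


def orf_codon_count(start: int, end: int) -> int:
    """Number of complete codons in an ORF given 1-based inclusive coordinates."""
    length_nt = end - start + 1  # inclusive range: 2..817 spans 816 nt
    return length_nt // 3


codons = orf_codon_count(2, 817)
print(codons, codons > MIN_CODONS)  # 272 True
```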

Translating the nucleotide sequence of the selected ORF shows that it neither starts with an initiation codon (Met) nor ends with a STOP codon. This suggests that our sequence is an internal coding portion of a much longer gene: the initiation codon would lie upstream of the fragment and the stop codon downstream. The translation therefore yields only part of the protein sequence, lacking both its beginning and its end.
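The translation step can be checked by hand on a small piece of the contig. The sketch below assumes the standard genetic code; the helper names are illustrative and this is not the SMS implementation. It reverse-complements the last 19 nucleotides of the genomic sequence and translates them starting at position 2 of the reverse strand, which should reproduce the first residues of the ORF.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna: str) -> str:
    """Reverse complement of an unambiguous (A/C/G/T only) DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Last 19 nt of the contig as listed under "Genomic Sequence".
contig_tail = "ACCCTGTGTAATTTGAGAA"

# Position 2 of the reverse strand corresponds to index 1 of the reverse complement.
fragment = translate(reverse_complement(contig_tail)[1:])
print(fragment)                  # SQITQG
print(fragment.startswith("M"))  # False: the ORF is 5' incomplete
```

The output matches the start of the translation reported above (SQITQG...) and confirms that the ORF does not begin with a methionine.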

2/ Homolog search.

Next, we tried to determine whether the protein sequence obtained by translating our sequence of interest has known homologs. To do so we aligned it against protein databases with Blastp, starting with Swissprot. We chose this database first because Swissprot contains far fewer sequences than general-purpose databases such as nr; with a bit of luck, our sequence would find a homolog there. This first search returned only 2 sequences with any similarity to ours. For both, the E-values are high (0.033 and 4.1) and the scores very low (38.1 and 31.2 bits). Given how little similarity these sequences share with ours, and the very high probability that this similarity is due to chance, we can quickly conclude that our sequence has no homolog in this database.

Having failed with Swissprot, we ran the same blast against nr, hoping to find homologous sequences that are absent from Swissprot (since nr is much larger). The results are no more convincing than those obtained against Swissprot. We obtained 6 sequences with low scores and high E-values: no sequence in nr is likely to be homologous to ours.
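The effect of database size on the E-value can be illustrated with the homoserine kinase hit, which scores 38.1 bits in both searches but has an E-value of 0.033 against Swissprot and 0.51 against nr. The sketch below uses the textbook approximation E ≈ m · n · 2^(-S), where m is the query length, n the number of letters in the database and S the bit score. It is only an approximation: BLAST applies effective-length corrections, so the absolute values are not expected to match the reports exactly. The function and constant names are illustrative; the database sizes are those printed in the BLAST reports on this page.

```python
def approx_evalue(bit_score: float, query_len: int, db_letters: int) -> float:
    """Textbook approximation E = m * n * 2**(-S), without effective-length corrections."""
    return query_len * db_letters * 2.0 ** (-bit_score)


QUERY_LEN = 272                 # residues in the translated ORF
SWISSPROT_LETTERS = 98_621_568  # from the Swissprot report
NR_LETTERS = 1_947_344_958      # from the nr report

e_swissprot = approx_evalue(38.1, QUERY_LEN, SWISSPROT_LETTERS)
e_nr = approx_evalue(38.1, QUERY_LEN, NR_LETTERS)

# The same bit score is less significant in the larger database.
print(f"fold change nr / Swissprot: {e_nr / e_swissprot:.1f}")
```

Under this approximation the same bit score is roughly 20 times less significant in nr than in Swissprot, which is in line with the reported shift from 0.033 to 0.51 (about 15-fold).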

Since these searches were unfruitful, we then ran a blastp against env_nr to look for homologous sequences from environmental samples. This search did not return any significant hit either. We therefore ran a tblastn against env_nt to see whether it would give a better result; it retrieved only our own sequence.
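The reasoning here (discard the self-match, then see whether anything significant remains) can be sketched as a small filter. The `Hit` type, the function names and the 1e-5 E-value cutoff are assumptions made for illustration, not values used by the annotators; the three hits are copied from the tblastn report on this page.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    subject: str
    evalue: float
    identities: int  # identical positions in the alignment
    align_len: int   # alignment length


EVALUE_CUTOFF = 1e-5  # assumed significance threshold, for illustration only


def is_self_hit(hit: Hit, query_len: int) -> bool:
    """A hit identical to the query over its full length is treated as the query itself."""
    return hit.identities == hit.align_len == query_len


def candidate_homologs(hits: List[Hit], query_len: int) -> List[Hit]:
    """Keep hits under the E-value cutoff that are not the query matching itself."""
    return [
        h for h in hits
        if h.evalue <= EVALUE_CUTOFF and not is_self_hit(h, query_len)
    ]


tblastn_hits = [
    Hit("AACY024086195.1", 1e-162, 272, 272),
    Hit("AACY022206818.1", 0.91, 30, 100),
    Hit("AACY022589205.1", 1.2, 36, 153),
]

print(candidate_homologs(tblastn_hits, query_len=272))  # []
```

With these inputs the only significant hit is the 272/272 self-match, so the filtered list is empty.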

The BLAST searches did not reveal any homolog. Before concluding, however, we looked for homology at the level of protein domains using PROSITE, Pfam and InterPro. No homologous protein domain is recorded. Consequently, no taxonomic classification could be established.

3/ Conclusion.

We can therefore say that our sequence contains an ORF that probably codes for a protein (in fact, part of one) that is still unknown (ORFan).

Multiple Alignment


BLAST

Blastp against Swissprot.

BLASTP 2.2.17 (Aug-26-2007)

Reference:
Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman 
(1997), "Gapped BLAST and PSI-BLAST: a new generation of 
protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference:
Schäffer, Alejandro A., L. Aravind, Thomas L. Madden, Sergei 
Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and 
Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST 
protein database searches with composition-based statistics 
and other refinements", Nucleic Acids Res. 29:2994-3005.

RID: KNXHW24W01R


Database: Non-redundant SwissProt sequences
           259,673 sequences; 98,621,568 total letters

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

sp|Q8ET01|KHSE_OCEIH  Homoserine kinase (HSK) (HK)                 38.1    0.033
sp|P78722|LAC2_PODAN  Laccase-2 precursor (Laccase II) (Benzen...  31.2    4.1  



>sp|Q8ET01|KHSE_OCEIH  Homoserine kinase (HSK) (HK)
Length=294

 Score = 38.1 bits (87),  Expect = 0.033, Method: Composition-based stats.
 Identities = 23/93 (24%), Positives = 41/93 (44%), Gaps = 6/93 (6%)

Query  129  QLKKIVIDKIEYPIIVEWDVGTKSNIERENKWAPGDVNTDFAAKAPNVGIVKFDLIHSAD  188
            QL+ I +D + Y   +E       N        P   N D+AAKA  +  +    ++S D
Sbjct  151  QLRDIDVDIVAYIPNIELKTSVSRNC------LPDSYNRDYAAKASAISNLTIAALYSKD  204

Query  189  YKMAGWVDDSQMLYAKIARNSFPLNVYIKSDSQ  221
            YK+AG + +  + +        P   YI+ +++
Sbjct  205  YKLAGKLMEEDLFHEPFRSELIPNFAYIREEAK  237


>sp|P78722|LAC2_PODAN  Laccase-2 precursor (Laccase II) (Benzenediol:oxygen oxidoreductase 
2) (Urishiol oxidase 2) (Diphenol oxidase 2) (Laccase 
C)
Length=621

 Score = 31.2 bits (69),  Expect = 4.1, Method: Composition-based stats.
 Identities = 11/22 (50%), Positives = 15/22 (68%), Gaps = 1/22 (4%)

Query  160  WAPG-DVNTDFAAKAPNVGIVK  180
            WAPG D+NTD+    PN G+ +
Sbjct  58   WAPGFDINTDYEVSTPNTGVTR  79

-----------------------------------------------------------------------------------------------------
Blastp against nr.

BLASTP 2.2.17 (Aug-26-2007)

RID: KNXYSAXN013


Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
from WGS projects
           5,633,163 sequences; 1,947,344,958 total letters

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

ref|NP_691385.1|  homoserine kinase [Oceanobacillus iheyensis ...  38.1    0.51
gb|EAY85281.1|  hypothetical protein OsI_006514 [Oryza sativa ...  37.4    0.99 
ref|ZP_01255484.1|  putative Oxidoreductase, FAD-binding prote...  35.8    3.0  
ref|XP_001434997.1|  hypothetical protein [Paramecium tetraure...  35.4    3.6
ref|YP_644558.1|  Uracil-DNA glycosylase superfamily [Rubrobac...  35.4    4.1
ref|YP_278057.1|  phosphatidylserine synthase [Candidatus Bloc...  34.3    9.3


>ref|NP_691385.1| homoserine kinase [Oceanobacillus iheyensis HTE831]
 sp|Q8ET01|KHSE_OCEIH  Homoserine kinase (HSK) (HK)
 dbj|BAC12420.1| homoserine kinase [Oceanobacillus iheyensis HTE831]
Length=294

 Score = 38.1 bits (87),  Expect = 0.51, Method: Composition-based stats.
 Identities = 23/93 (24%), Positives = 41/93 (44%), Gaps = 6/93 (6%)

Query  129  QLKKIVIDKIEYPIIVEWDVGTKSNIERENKWAPGDVNTDFAAKAPNVGIVKFDLIHSAD  188
            QL+ I +D + Y   +E       N        P   N D+AAKA  +  +    ++S D
Sbjct  151  QLRDIDVDIVAYIPNIELKTSVSRNC------LPDSYNRDYAAKASAISNLTIAALYSKD  204

Query  189  YKMAGWVDDSQMLYAKIARNSFPLNVYIKSDSQ  221
            YK+AG + +  + +        P   YI+ +++
Sbjct  205  YKLAGKLMEEDLFHEPFRSELIPNFAYIREEAK  237


>gb|EAY85281.1|  hypothetical protein OsI_006514 [Oryza sativa (indica cultivar-group)]
Length=721

 Score = 37.4 bits (85),  Expect = 0.99, Method: Composition-based stats.
 Identities = 21/70 (30%), Positives = 38/70 (54%), Gaps = 6/70 (8%)

Query  77   SLKNPDLAWGLHPGQGRIMVSYFLGLKKIPVLFFTNKKYGISLGGKQITNYEQLKKIVID  136
            SL    ++WG+ P  G++M    L L  IP    + +++G + GG    ++  LKK  ++
Sbjct  243  SLHIEGVSWGILPPFGQLMQLEELTLNNIP----STRRFGPNFGGVTQKSFSHLKK--VE  296

Query  137  KIEYPIIVEW  146
             ++ P +VEW
Sbjct  297  FVDMPELVEW  306


>ref|ZP_01255484.1|  putative Oxidoreductase, FAD-binding protein [Psychroflexus torquis 
ATCC 700755]
 gb|EAS69714.1|  putative Oxidoreductase, FAD-binding protein [Psychroflexus torquis 
ATCC 700755]
Length=731

 Score = 35.8 bits (81),  Expect = 3.0, Method: Composition-based stats.
 Identities = 23/79 (29%), Positives = 39/79 (49%), Gaps = 9/79 (11%)

Query  83   LAWGLHPGQGRIMVSYFLGLKKIPVLFFTNKKYGISLGGKQITNYEQLKKIVI----DKI  138
            L+W LH G+G +  S  L L    +LFF    + ++L  ++     ++K I I    D  
Sbjct  287  LSWKLHTGEGNVFWSIVLLLASASILFFMYSGFAMTLKRRK-----KVKAISIMPDKDDC  341

Query  139  EYPIIVEWDVGTKSNIERE  157
            E+ I+V  + GT  +  R+
Sbjct  342  EFVILVGSETGTTFDFARQ  360


>ref|XP_001434997.1| hypothetical protein [Paramecium tetraurelia]
 emb|CAK67600.1| unnamed protein product [Paramecium tetraurelia]
Length=3590

 Score = 35.4 bits (80),  Expect = 3.6, Method: Composition-based stats.
 Identities = 26/103 (25%), Positives = 50/103 (48%), Gaps = 6/103 (5%)

Query  124   ITNYEQLKKIVIDKIEYPIIVEWDVGTKSNIERENKWAPGDVNTDFAAKAPNVGIVKFDL  183
             + NY+ LKK ++++I +P   +  + T  ++E E  +  GD   +   + PN  + KF  
Sbjct  1580  VRNYDSLKKYLVNRIPHPGQEDKALKTLKHLEEEKNYK-GDSQEELVVEVPN-DLTKF--  1635

Query  184   IHSADYKMAGWVDDSQMLYAKIARNSFPLNVYIKSDSQDDFFG  226
               + +Y ++GW+   Q        N F L++Y    + +  FG
Sbjct  1636  --ATEYSVSGWLRWDQPPIGAPWFNVFRLSLYSTEANAEGRFG  1676


>ref|YP_644558.1| Uracil-DNA glycosylase superfamily [Rubrobacter xylanophilus 
DSM 9941]
 gb|ABG04746.1| Uracil-DNA glycosylase superfamily [Rubrobacter xylanophilus 
DSM 9941]
Length=196

 Score = 35.4 bits (80),  Expect = 4.1, Method: Composition-based stats.
 Identities = 21/73 (28%), Positives = 38/73 (52%), Gaps = 3/73 (4%)

Query  66   PGTIIPSGETPSLKNPDLAWGLHPGQGRIMVSYFLGLKKIPVLFFTNKKYGISLGGKQIT  125
            PG ++  G TP+    D ++ +  G+GRI  S  LGL   P L   +  Y + + G    
Sbjct  117  PGALVLLGSTPAKALIDRSFTMREGRGRIFESKILGL---PALATYHPAYLLRVRGAGGA  173

Query  126  NYEQLKKIVIDKI  138
            +Y +L++ V++ +
Sbjct  174  DYGRLRRQVVEDL  186


>ref|YP_278057.1| phosphatidylserine synthase [Candidatus Blochmannia pennsylvanicus 
str. BPEN]
 gb|AAZ41180.1| phosphatidylserine synthase [Candidatus Blochmannia pennsylvanicus 
str. BPEN]
Length=458

 Score = 34.3 bits (77),  Expect = 9.3, Method: Composition-based stats.
 Identities = 29/104 (27%), Positives = 48/104 (46%), Gaps = 5/104 (4%)

Query  119  LGGKQITNYEQLKKIVIDKIEYPIIVEWDVGTKSNIERENKWAPGDVNTDFAAKAPNVGI  178
            LGGK+I +     K +  +I+  I+V+W    +  I    K+   D  TD   + PN  I
Sbjct  68   LGGKKIIDALLHVKKLRPQIKIRILVDWHRARRVRIGSSKKYTNIDWYTDIIKQYPNTEI  127

Query  179  VKFDL-IHSAD----YKMAGWVDDSQMLYAKIARNSFPLNVYIK  217
              + + IH  +      + G++ D Q+LY+    N   L+V  K
Sbjct  128  AIYGIPIHVNEALGVLHLKGFIIDDQILYSGANLNGEYLHVNTK  171


---------------------------------------------------------------------------------------------------
Blastp against env_nr.

BLASTP 2.2.17 (Aug-26-2007)

RID: KNY2F4D401R


Database: Environmental sample proteins from WGS projects
           6,028,191 sequences; 1,207,406,420 total letters

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

gb|EBZ82337.1|  hypothetical protein GOS_3815077 [marine metageno  36.2    0.82 
gb|EBJ63322.1|  hypothetical protein GOS_8865825 [marine metageno  36.2    0.90 
gb|ECF57515.1|  hypothetical protein GOS_5697940 [marine metageno  35.4    1.4  
gb|EBR91608.1|  hypothetical protein GOS_7520603 [marine metageno  35.0    1.9  
gb|EBT09224.1|  hypothetical protein GOS_7330368 [marine metageno  35.0    2.1  
gb|ECK75148.1|  hypothetical protein GOS_6128638 [marine metageno  34.7    2.3  
gb|EDE43679.1|  hypothetical protein GOS_1130977 [marine metageno  34.3    3.6  
gb|ECW85016.1|  hypothetical protein GOS_2645137 [marine metageno  33.9    4.4  
gb|EBX69548.1|  hypothetical protein GOS_6547066 [marine metageno  33.9    4.6  
gb|ECL67475.1|  hypothetical protein GOS_5949631 [marine metageno  33.9    4.7  
gb|ECC57253.1|  hypothetical protein GOS_3384164 [marine metageno  33.5    5.4  
gb|EBC78381.1|  hypothetical protein GOS_14794 [marine metagenome  33.5    6.4  
gb|EDC79643.1|  hypothetical protein GOS_1412928 [marine metageno  33.1    7.0  
gb|ECU36641.1|  hypothetical protein GOS_4642498 [marine metageno  33.1    7.6  
gb|ECD49287.1|  hypothetical protein GOS_3242909 [marine metageno  33.1    7.9  
gb|ECB93230.1|  hypothetical protein GOS_5913632 [marine metageno  32.7    8.7  


>gb|EBZ82337.1|  hypothetical protein GOS_3815077 [marine metagenome]
Length=200

 Score = 36.2 bits (82),  Expect = 0.82, Method: Composition-based stats.
 Identities = 23/95 (24%), Positives = 45/95 (47%), Gaps = 7/95 (7%)

Query  153  NIERENKWAP----GDVNTDFAAKAPNVGIVKFDLIHSADYKMAGWVDDSQMLYAKIARN  208
            N E++ +W P    G+  + F    PN G+   +L   A+   +GWV + + ++   A+ 
Sbjct  104  NQEQKERWIPDLIAGEYKSCFGVTEPNSGLNTANLQTRAERTNSGWVVNGRKIWTSTAQV  163

Query  209  SFPLNVYIKSDSQDDFFGCCKRIENSRKKFTKEDK  243
            +  + +  ++  Q+D   C K I+     FT  D+
Sbjct  164  ASKILLIARTTPQED---CAKPIDGLSLFFTDLDR  195


>gb|EBJ63322.1|  hypothetical protein GOS_8865825 [marine metagenome]
Length=324

 Score = 36.2 bits (82),  Expect = 0.90, Method: Composition-based stats.
 Identities = 30/90 (33%), Positives = 41/90 (45%), Gaps = 15/90 (16%)

Query  80   NPDLAWGLHPGQGRIMVSYFLGLKKIPVLFFTNK----KYGISLGGKQIT-----NYEQL  130
            +P L W       RI     L  KKIPV   T K    + GI LG K I      NY+Q 
Sbjct  132  DPKLEWK------RISKVIKLICKKIPVSLDTRKSAIMEKGIKLGVKIINDVSGLNYDQK  185

Query  131  KKIVIDKIEYPIIVEWDVGTKSNIERENKW  160
             K ++ K + P I++   GT  N++   K+
Sbjct  186  TKYILKKYKIPFIIQHSQGTPENMQNNPKY  215


>gb|ECF57515.1|  hypothetical protein GOS_5697940 [marine metagenome]
Length=263

 Score = 35.4 bits (80),  Expect = 1.4, Method: Composition-based stats.
 Identities = 30/100 (30%), Positives = 42/100 (42%), Gaps = 17/100 (17%)

Query  168  DFAAKAPNVGIVKFDLIHSADYKMAGWVDDSQM------LYAKIARNSFPLNVYIKSDSQ  221
            DF      +  +K    +SA  K A  V DS+       +Y K  RNS P N Y K   Q
Sbjct  161  DFEKAKETINFIKQKKWNSA-LKSAEKVKDSEFRKLITWMYLKTTRNSAPFNEYKKFIEQ  219

Query  222  DDFFGCCKRIENSRKKFTKEDKFYFEGSR-----KWFDKY  256
            +D++    RI     ++  E+K Y   +       WF KY
Sbjct  220  NDYYPRINRI-----RYLAEEKIYLRNNSPTSIINWFKKY  254


>gb|EBR91608.1|  hypothetical protein GOS_7520603 [marine metagenome]
Length=233

 Score = 35.0 bits (79),  Expect = 1.9, Method: Composition-based stats.
 Identities = 24/86 (27%), Positives = 41/86 (47%), Gaps = 9/86 (10%)

Query  76   PSLKNPDLAWGLHPGQGRIMVSYFLGLKKIPVLFFTNKKYGISLGGKQITNYEQLKKIVI  135
            P  K+ DL+  L  G+     SY+  L+ +P  +F NK             YE L +   
Sbjct  93   PQKKDTDLSGFLSFGEKSFGRSYYQLLEGVPGFYFLNK---------STLEYEFLSQQET  143

Query  136  DKIEYPIIVEWDVGTKSNIERENKWA  161
            +K E  I+V  ++G K  I++ ++W+
Sbjct  144  EKQEIKIVVGKNIGQKKLIDKLSEWS  169


>gb|EBT09224.1|  hypothetical protein GOS_7330368 [marine metagenome]
Length=308

 Score = 35.0 bits (79),  Expect = 2.1, Method: Composition-based stats.
 Identities = 21/56 (37%), Positives = 30/56 (53%), Gaps = 7/56 (12%)

Query  121  GKQITNYEQLKKIVIDKIEYPIIVEWDVGTKSNIERENKWAPGDVNTDFAAKAPNV  176
            GK  TN  ++KK+ I +I+  II   D+G       E K+ P    T F A++PNV
Sbjct  135  GKTYTNINEIKKLPIKRIKGKIITLGDLG-------EVKYGPVSEKTLFKAQSPNV  183


>gb|ECK75148.1|  hypothetical protein GOS_6128638 [marine metagenome]
Length=271

 Score = 34.7 bits (78),  Expect = 2.3, Method: Composition-based stats.
 Identities = 16/40 (40%), Positives = 23/40 (57%), Gaps = 0/40 (0%)

Query  207  RNSFPLNVYIKSDSQDDFFGCCKRIENSRKKFTKEDKFYF  246
            +N F LN+ IK   QDD F    +   S KK+ K++ +YF
Sbjct  6    KNIFFLNLKIKDTKQDDIFDYLSKFIKSSKKYLKQNTYYF  45


>gb|EDE43679.1|  hypothetical protein GOS_1130977 [marine metagenome]
Length=527

 Score = 34.3 bits (77),  Expect = 3.6, Method: Composition-based stats.
 Identities = 22/69 (31%), Positives = 39/69 (56%), Gaps = 9/69 (13%)

Query  88   HPGQGRIMVSYFLGLKKIPVLFFTNKKYGISLGGKQITNYEQLKKIVIDKIEY-PIIVEW  146
            H G+ R M +  LG++K+PVL +T        G  ++  ++Q    V+DK ++ P + E+
Sbjct  20   HEGRHRCMAAKKLGIEKVPVLIYTGS------GFDRVPQWDQSTHDVVDKSDFKPQLKEY  73

Query  147  --DVGTKSN  153
              DV TK++
Sbjct  74   GSDVSTKND  82


>gb|ECW85016.1|  hypothetical protein GOS_2645137 [marine metagenome]
Length=388

 Score = 33.9 bits (76),  Expect = 4.4, Method: Composition-based stats.
 Identities = 36/136 (26%), Positives = 59/136 (43%), Gaps = 18/136 (13%)

Query  102  LKKIPVLFFTNKKYGISLG-GKQITNYEQLKKIVIDKIEYPIIVEWDVGTKSNIERENKW  160
            + K+  L   +KK    LG G +I+N + L K +I + + P  V W+      IE +NK+
Sbjct  93   INKVLALLLKSKKPLFLLGHGVKISNSQLLYKKIIQRYKIPFCVTWN--ASDVIESDNKF  150

Query  161  APGDVNTDFAAKAPNVGIVKFDLI---------HSADYKMAGWVDDSQMLYAKIARN---  208
              G     FA +  N  +   DLI             Y    +  +++++   I  N   
Sbjct  151  YMGRPGA-FAERGTNFIVQNCDLIICIGTRLPFMVTGYNSKKFAKNAKIVMVDIDSNELN  209

Query  209  --SFPLNVYIKSDSQD  222
              S  LN+ I SD++D
Sbjct  210  KPSLRLNLKINSDAKD  225


>gb|EBX69548.1|  hypothetical protein GOS_6547066 [marine metagenome]
Length=241

 Score = 33.9 bits (76),  Expect = 4.6, Method: Composition-based stats.
 Identities = 14/38 (36%), Positives = 22/38 (57%), Gaps = 0/38 (0%)

Query  13   QDFNDYGHHFWLQKWMCQSILEMGFTYPLGMLGLDMSH  50
            QDF  +  H+ LQK +C  ++++  TYPL     D+ H
Sbjct  131  QDFGPHKDHYNLQKGVCNQLIQLMHTYPLQSYVKDLQH  168


>gb|ECL67475.1|  hypothetical protein GOS_5949631 [marine metagenome]
Length=300

 Score = 33.9 bits (76),  Expect = 4.7, Method: Composition-based stats.
 Identities = 24/86 (27%), Positives = 40/86 (46%), Gaps = 9/86 (10%)

Query  76   PSLKNPDLAWGLHPGQGRIMVSYFLGLKKIPVLFFTNKKYGISLGGKQITNYEQLKKIVI  135
            P  K+ DL+  L  G+     SY+  L+K+P  +F N         K    YE L +   
Sbjct  79   PPKKDRDLSGFLRFGEKSFGHSYYHLLEKMPGFYFLN---------KSTLEYELLSQKET  129

Query  136  DKIEYPIIVEWDVGTKSNIERENKWA  161
            +K E  I +  ++  K  I++ +KW+
Sbjct  130  NKEEIKITIGKNISQKELIDKLSKWS  155


>gb|ECC57253.1|  hypothetical protein GOS_3384164 [marine metagenome]
Length=287

 Score = 33.5 bits (75),  Expect = 5.4, Method: Composition-based stats.
 Identities = 21/58 (36%), Positives = 27/58 (46%), Gaps = 6/58 (10%)

Query  67   GTIIPSGETPSLKNPDLAWGLHPGQGRIMVSYFLGLKKIPVLFFTNKKYGISLGGKQI  124
            G II + E P L  P L       +GR+ +   LG K +P  F  N KY   L  K+I
Sbjct  234  GKIIATTENPDLSKPSLP------EGRLCMVMGLGPKGLPGSFVNNSKYHFELTTKKI  285


>gb|EBC78381.1|  hypothetical protein GOS_14794 [marine metagenome]
Length=198

 Score = 33.5 bits (75),  Expect = 6.4, Method: Composition-based stats.
 Identities = 18/45 (40%), Positives = 24/45 (53%), Gaps = 2/45 (4%)

Query  228  CKRIENSRKKFTKEDKFYFEGSRKWFDKYEFRFELNFIQVDSVDN  272
            C+R  N   +F KED  +FEG R  F KY+F   ++      VDN
Sbjct  52   CERASNY--QFVKEDINHFEGLRMLFTKYDFDAVIHLAAESHVDN  94


>gb|EDC79643.1|  hypothetical protein GOS_1412928 [marine metagenome]
Length=263

 Score = 33.1 bits (74),  Expect = 7.0, Method: Composition-based stats.
 Identities = 18/53 (33%), Positives = 30/53 (56%), Gaps = 1/53 (1%)

Query  47   DMSHFGDDDVEDIFLLGYNPGTIIPSGETPSLKNPDLAWGLHPGQGRIMVSYF  99
            + S++G D+V DIF  GYN  +++P  +  SL    +A     G   +++SYF
Sbjct  154  NFSNYGADNV-DIFAPGYNVYSLLPENDYESLSGTSMAAPAVSGVASLILSYF  205


>gb|ECU36641.1|  hypothetical protein GOS_4642498 [marine metagenome]
Length=142

 Score = 33.1 bits (74),  Expect = 7.6, Method: Composition-based stats.
 Identities = 18/41 (43%), Positives = 22/41 (53%), Gaps = 0/41 (0%)

Query  212  LNVYIKSDSQDDFFGCCKRIENSRKKFTKEDKFYFEGSRKW  252
            L +   S S DDFF   K I NS   F + DK YF  S++W
Sbjct  50   LPIVFGSPSVDDFFILKKNISNSLINFEEIDKLYFFKSKRW  90


>gb|ECD49287.1|  hypothetical protein GOS_3242909 [marine metagenome]
Length=150

 Score = 33.1 bits (74),  Expect = 7.9, Method: Composition-based stats.
 Identities = 20/65 (30%), Positives = 33/65 (50%), Gaps = 1/65 (1%)

Query  101  GLKKIPVLFFTNKKYGISLGGKQITNYEQLKKIVIDKIEYPIIVE-WDVGTKSNIERENK  159
             LKK P  FF +KK    L  +     +   K+ +    YP I++  D+  + + +R N+
Sbjct  26   ALKKDPFHFFEDKKVKDVLSVRVAEVLDNAIKVNVGDDRYPTIIKKADIALEKSDQRPNR  85

Query  160  WAPGD  164
            +APGD
Sbjct  86   FAPGD  90


>gb|ECB93230.1|  hypothetical protein GOS_5913632 [marine metagenome]
Length=101

 Score = 32.7 bits (73),  Expect = 8.7, Method: Composition-based stats.
 Identities = 18/58 (31%), Positives = 26/58 (44%), Gaps = 6/58 (10%)

Query  67   GTIIPSGETPSLKNPDLAWGLHPGQGRIMVSYFLGLKKIPVLFFTNKKYGISLGGKQI  124
            GT++ + E P      L       +GR+ +   LG K +P+ F T  KY   L G  I
Sbjct  31   GTVVATTENPDYSKSSLP------KGRLCMVMGLGPKGLPISFITKSKYNFELTGSNI  82


---------------------------------------------------------------------------------------------------
tBlastn against env_nt.

TBLASTN 2.2.17 (Aug-26-2007)

RID: KNYCVCPW01R


Database: environmental samples
           5,061,419 sequences; 5,008,228,876 total letters


                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

gb|AACY024086195.1|  Marine metagenome 1096626585396, whole genom   575    1e-162
gb|AACY022206818.1|  Marine metagenome 864999, whole genome shotg  37.4    0.91  
gb|AACY022589205.1|  Marine metagenome 1095403498384, whole genom  37.0    1.2   
gb|AACY021805083.1|  Marine metagenome 1092343726176, whole genom  36.6    1.6   
gb|AACY021272742.1|  Marine metagenome 1610191, whole genome shot  36.2    2.0   
gb|AACY021737952.1|  Marine metagenome 1093016207484, whole genom  36.2    2.0   
gb|AACY021929886.1|  Marine metagenome 1093017762176, whole genom  36.2    2.0   
gb|AACY020746744.1|  Marine metagenome 1520455, whole genome shot  35.8    2.7   
gb|AAFX01067009.1|  Metagenome sequence XZS68210.b1, whole genome  35.8    2.7   
gb|AAFX01111059.1|  Metagenome sequence XZS14290.g1, whole genome  35.8    2.7   
gb|AAFX01112669.1|  Metagenome sequence 2662324_fasta.screen.C...  35.8    2.7   
gb|AACY020173203.1|  Marine metagenome 1096626200011, whole genom  35.4    3.5   
gb|AACY022755291.1|  Marine metagenome ctg_1101667162642, whol...  35.0    4.5   
gb|AACY020077416.1|  Marine metagenome 1096626089385, whole genom  34.7    5.9   
gb|AACY022222235.1|  Marine metagenome 1092963871026, whole genom  34.7    5.9   
gb|AACY022615091.1|  Marine metagenome ctg_1101667022442, whol...  34.7    5.9   

>gb|AACY024086195.1|  Marine metagenome 1096626585396, whole genome shotgun sequence
Length=819

 Score =  575 bits (1481),  Expect = 1e-162
 Identities = 272/272 (100%), Positives = 272/272 (100%), Gaps = 0/272 (0%)
 Frame = -2

Query  1    SQITQGIFHKNFQDFNDYGHHFWLQKWMCQSILEMGFTYPLGMLGLDMSHFGDDDVEDIF  60
            SQITQGIFHKNFQDFNDYGHHFWLQKWMCQSILEMGFTYPLGMLGLDMSHFGDDDVEDIF
Sbjct  818  SQITQGIFHKNFQDFNDYGHHFWLQKWMCQSILEMGFTYPLGMLGLDMSHFGDDDVEDIF  639

Query  61   LLGYNPGTIIPSGETPSLKNPDLAWGLHPGQGRIMVSYFLGLKKIPVLFFTNKKYGISLG  120
            LLGYNPGTIIPSGETPSLKNPDLAWGLHPGQGRIMVSYFLGLKKIPVLFFTNKKYGISLG
Sbjct  638  LLGYNPGTIIPSGETPSLKNPDLAWGLHPGQGRIMVSYFLGLKKIPVLFFTNKKYGISLG  459

Query  121  GKQITNYEQLKKIVIDKIEYPIIVEWDVGTKSNIERENKWAPGDVNTDFAAKAPNVGIVK  180
            GKQITNYEQLKKIVIDKIEYPIIVEWDVGTKSNIERENKWAPGDVNTDFAAKAPNVGIVK
Sbjct  458  GKQITNYEQLKKIVIDKIEYPIIVEWDVGTKSNIERENKWAPGDVNTDFAAKAPNVGIVK  279

Query  181  FDLIHSADYKMAGWVDDSQMLYAKIARNSFPLNVYIKSDSQDDFFGCCKRIENSRKKFTK  240
            FDLIHSADYKMAGWVDDSQMLYAKIARNSFPLNVYIKSDSQDDFFGCCKRIENSRKKFTK
Sbjct  278  FDLIHSADYKMAGWVDDSQMLYAKIARNSFPLNVYIKSDSQDDFFGCCKRIENSRKKFTK  99

Query  241  EDKFYFEGSRKWFDKYEFRFELNFIQVDSVDN  272
            EDKFYFEGSRKWFDKYEFRFELNFIQVDSVDN
Sbjct  98   EDKFYFEGSRKWFDKYEFRFELNFIQVDSVDN  3


>gb|AACY022206818.1|  Marine metagenome 864999, whole genome shotgun sequence
Length=824

 Score = 37.4 bits (85),  Expect = 0.91
 Identities = 30/100 (30%), Positives = 42/100 (42%), Gaps = 17/100 (17%)
 Frame = +1

Query  168  DFAAKAPNVGIVKFDLIHSADYKMAGWVDDSQM------LYAKIARNSFPLNVYIKSDSQ  221
            DF      +  +K    +SA  K A  V DS+       +Y K  RNS P N Y K   Q
Sbjct  517  DFEKAKETINFIKQKKWNSA-LKSAEKVKDSEFRKLITWMYLKTTRNSAPFNEYKKFIEQ  693

Query  222  DDFFGCCKRIENSRKKFTKEDKFYFEGSR-----KWFDKY  256
            +D++    RI     ++  E+K Y   +       WF KY
Sbjct  694  NDYYPRINRI-----RYLAEEKIYLRNNSPTSIINWFKKY  798


>gb|AACY022589205.1|  Marine metagenome 1095403498384, whole genome shotgun sequence
Length=799

 Score = 37.0 bits (84),  Expect = 1.2
 Identities = 36/153 (23%), Positives = 66/153 (43%), Gaps = 22/153 (14%)
 Frame = +1

Query  122  KQITNYEQLKKIVIDKIEYPIIVEWDVGT---KSNIERENKWAPG-----DVNTDFAAKA  173
            K+  NY+  K +VI    + II  +++G+    SN++ EN +        D+NT    K+
Sbjct  229  KKYYNYKFQKLLVIF---FTIIFIFNLGSVIVNSNLKFENSFVNDVFKKTDLNTRVLTKS  399

Query  174  PNVGIVKFDLIHSADYKMAGWVDDSQMLYAKIARNSFPLNVYIKSDSQDDFFGCCKRIEN  233
            PN+  +  D+  S+DY    + DD+Q       +N F       S+  + F         
Sbjct  400  PNIYFIISDMFPSSDYLKILYDDDNQNFLEFFKKNKFKFKDNHFSNYSNTFLSLASLFNT  579

Query  234  SRKKFTKEDKFYFEGSRKWFDKYEFRFELNFIQ  266
            +   + ++D F        F+  +F+   NF +
Sbjct  580  N---YLRDDYF--------FNPQKFKTNSNFFK  645


>gb|AACY021805083.1|  Marine metagenome 1092343726176, whole genome shotgun sequence
Length=627

 Score = 36.6 bits (83),  Expect = 1.6
 Identities = 32/145 (22%), Positives = 60/145 (41%), Gaps = 24/145 (16%)
 Frame = +1

Query  115  YGISLGGKQITNYEQLKKIVIDKIEYPIIVEWDVGTKSNIERENKWAPGDVNTDFAAKAP  174
            Y       + +NY  LK   I+ I        DVG+  N +   ++  G ++ +F +   
Sbjct  19   YSFKNNDHEFSNYVSLKDKKINLI-------MDVGSPPNKKFSEEYQAGALSFEFVSNGK  177

Query  175  NVGIVKFDLIHSADYKMAGWVDDSQMLYAKIARNSFPLNVYIKSDSQDDFFGCCKRIENS  234
             +            +  AG+ D   + + +++++S   NV I  D+       CK I+NS
Sbjct  178  KI------------FTNAGYYDYKNVKFKELSKSSAVHNVLIVDDNSS-----CKFIKNS  306

Query  235  RKKFTKEDKFYFEGSRKWFDKYEFR  259
              KF  +D    +     F+K E++
Sbjct  307  LSKFEVKDSLKTDIKNISFEKNEWK  381


>gb|AACY021272742.1|  Marine metagenome 1610191, whole genome shotgun sequence
Length=880

 Score = 36.2 bits (82),  Expect = 2.0
 Identities = 18/50 (36%), Positives = 24/50 (48%), Gaps = 0/50 (0%)
 Frame = -3

Query  38   TYPLGMLGLDMSHFGDDDVEDIFLLGYNPGTIIPSGETPSLKNPDLAWGL  87
            T PLG    ++  FGDD     F L Y PG ++  G    + N +L W L
Sbjct  866  TSPLG*SSRNVCRFGDDSRPGSFALEYRPGQLLTQGHRLQIFNAELLWYL  717


>gb|AACY021737952.1|  Marine metagenome 1093016207484, whole genome shotgun sequence
Length=776

 Score = 36.2 bits (82),  Expect = 2.0
 Identities = 41/151 (27%), Positives = 64/151 (42%), Gaps = 22/151 (14%)
 Frame = -3

Query  102  LKKIPVLFFTNK----KYGISLGGKQIT-----NYEQLKKIVIDKIEYPIIVEWDVGTKS  152
            +KKIP    T K    + GI +G K I      NY+     ++ K + P +++  VGT  
Sbjct  453  VKKIPTSLDTRKSSIMERGIRIGVKLINDVSGLNYDTKTINILKKYKIPFVIQHSVGTPE  274

Query  153  NIERENKWAPG--DVNTDFAAKAPNVGIVKFDLIHSADYKMAGWVDDSQMLYAKIARNSF  210
            N+++  K+     D+   F          K  LI S   K    + D  + + K  +++ 
Sbjct  273  NMQKNPKYKNELLDIYDYFED--------KIKLIRSRGIKHNNIILDPGIGFGKNLKHNM  118

Query  211  PLNVYIKSDSQDDFFGCCKRIENSRKKFTKE  241
             L   I+  S     G    + NSRKKF KE
Sbjct  117  NL---IRGISIFHSLGFPILVGNSRKKFIKE  34


>gb|AACY021929886.1|  Marine metagenome 1093017762176, whole genome shotgun sequence
Length=738

 Score = 36.2 bits (82),  Expect = 2.0
 Identities = 23/95 (24%), Positives = 45/95 (47%), Gaps = 7/95 (7%)
 Frame = -3

Query  153  NIERENKWAP----GDVNTDFAAKAPNVGIVKFDLIHSADYKMAGWVDDSQMLYAKIARN  208
            N E++ +W P    G+  + F    PN G+   +L   A+   +GWV + + ++   A+ 
Sbjct  295  NQEQKERWIPDLIAGEYKSCFGVTEPNSGLNTANLQTRAERTNSGWVVNGRKIWTSTAQV  116

Query  209  SFPLNVYIKSDSQDDFFGCCKRIENSRKKFTKEDK  243
            +  + +  ++  Q+D   C K I+     FT  D+
Sbjct  115  ASKILLIARTTPQED---CAKPIDGLSLFFTDLDR  20


>gb|AACY020746744.1|  Marine metagenome 1520455, whole genome shotgun sequence
Length=851

 Score = 35.8 bits (81),  Expect = 2.7
 Identities = 16/40 (40%), Positives = 23/40 (57%), Gaps = 0/40 (0%)
 Frame = -2

Query  207  RNSFPLNVYIKSDSQDDFFGCCKRIENSRKKFTKEDKFYF  246
            +N F LN+ IK   QDD F    +   S KK+ K++ +YF
Sbjct  799  KNIFFLNLKIKDTKQDDIFDYLSKFIKSSKKYLKQNTYYF  680


>gb|AAFX01067009.1|  Metagenome sequence XZS68210.b1, whole genome shotgun sequence
Length=1069

 Score = 35.8 bits (81),  Expect = 2.7
 Identities = 16/35 (45%), Positives = 20/35 (57%), Gaps = 0/35 (0%)
 Frame = +3

Query  143  IVEWDVGTKSNIERENKWAPGDVNTDFAAKAPNVG  177
            ++EWD  TKS  E +  WAP  VN D   +AP  G
Sbjct  282  LLEWDDRTKSVAEGKLPWAPTQVNLDIKEQAPKTG  386


>gb|AAFX01111059.1|  Metagenome sequence XZS14290.g1, whole genome shotgun sequence
Length=782

 Score = 35.8 bits (81),  Expect = 2.7
 Identities = 16/35 (45%), Positives = 20/35 (57%), Gaps = 0/35 (0%)
 Frame = +3

Query  143  IVEWDVGTKSNIERENKWAPGDVNTDFAAKAPNVG  177
            ++EWD  TKS  E +  WAP  VN D   +AP  G
Sbjct  267  LLEWDDRTKSVAEGKLPWAPTQVNLDIKEQAPKTG  371


>gb|AAFX01112669.1|  Metagenome sequence 2662324_fasta.screen.Contig5738, whole genome 
shotgun sequence
Length=773

 Score = 35.8 bits (81),  Expect = 2.7
 Identities = 16/35 (45%), Positives = 20/35 (57%), Gaps = 0/35 (0%)
 Frame = +3

Query  143  IVEWDVGTKSNIERENKWAPGDVNTDFAAKAPNVG  177
            ++EWD  TKS  E +  WAP  VN D   +AP  G
Sbjct  267  LLEWDDRTKSVAEGKLPWAPTQVNLDIKEQAPKTG  371


>gb|AACY020173203.1|  Marine metagenome 1096626200011, whole genome shotgun sequence
Length=1836

 Score = 35.4 bits (80),  Expect = 3.5
 Identities = 22/69 (31%), Positives = 39/69 (56%), Gaps = 9/69 (13%)
 Frame = -1

Query  88    HPGQGRIMVSYFLGLKKIPVLFFTNKKYGISLGGKQITNYEQLKKIVIDKIEY-PIIVEW  146
             H G+ R M +  LG++K+PVL +T        G  ++  ++Q    V+DK ++ P + E+
Sbjct  1779  HEGRHRCMAAKKLGIEKVPVLIYTGS------GFDRVPQWDQSTHDVVDKSDFKPQLKEY  1618

Query  147   --DVGTKSN  153
               DV TK++
Sbjct  1617  GSDVSTKND  1591


>gb|AACY022755291.1|  Marine metagenome ctg_1101667162642, whole genome shotgun sequence
Length=901

 Score = 35.0 bits (79),  Expect = 4.5
 Identities = 12/26 (46%), Positives = 18/26 (69%), Gaps = 0/26 (0%)
 Frame = +2

Query  10   KNFQDFNDYGHHFWLQKWMCQSILEM  35
            + FQ FN++G HFW  KW  QS++ +
Sbjct  473  EKFQLFNEWGFHFWGSKWATQSVVNL  550


>gb|AACY020077416.1|  Marine metagenome 1096626089385, whole genome shotgun sequence
Length=3002

 Score = 34.7 bits (78),  Expect = 5.9
 Identities = 21/68 (30%), Positives = 36/68 (52%), Gaps = 9/68 (13%)
 Frame = -1

Query  102  LKKIPVLFFTNK----KYGISLGGKQIT-----NYEQLKKIVIDKIEYPIIVEWDVGTKS  152
            +KKIP+   T K    + GIS+G K I      NY+     ++ K + P +++  VGT  
Sbjct  332  VKKIPISLDTRKSTIMENGISMGVKLINDVSGLNYDPETINILKKYKIPFVIQHSVGTPE  153

Query  153  NIERENKW  160
            N++ + K+
Sbjct  152  NMQNKAKY  129


>gb|AACY022222235.1|  Marine metagenome 1092963871026, whole genome shotgun sequence
Length=659

 Score = 34.7 bits (78),  Expect = 5.9
 Identities = 18/41 (43%), Positives = 22/41 (53%), Gaps = 0/41 (0%)
 Frame = +2

Query  212  LNVYIKSDSQDDFFGCCKRIENSRKKFTKEDKFYFEGSRKW  252
            L +   S S DDFF   K I NS   F + DK YF  S++W
Sbjct  149  LPIVFGSPSVDDFFILKKNISNSLINFEEIDKLYFFKSKRW  271


>gb|AACY022615091.1|  Marine metagenome ctg_1101667022442, whole genome shotgun sequence
Length=830

 Score = 34.7 bits (78),  Expect = 5.9
 Identities = 14/38 (36%), Positives = 22/38 (57%), Gaps = 0/38 (0%)
 Frame = -1

Query  13   QDFNDYGHHFWLQKWMCQSILEMGFTYPLGMLGLDMSH  50
            QDF  +  H+ LQK +C  ++++  TYPL     D+ H
Sbjct  440  QDFGPHKDHYNLQKGVCNQLIQLMHTYPLQSYVKDLQH  327

ORF finding

ORF Finder search in SMS (Sequence Manipulation Suite): any codon accepted as a start codon, reading frames 1, 2 and 3
on both the direct and reverse strands, ORFs of at least 60 codons, standard genetic code.
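
The search settings described above can be approximated with a short script. This is only a minimal sketch, not the SMS implementation: the function names (`reverse_complement`, `translate`, `find_orfs`) are illustrative, and it assumes that the 60-codon minimum counts coding codons without the stop, and that coordinates are 1-based and counted on the strand being read, with the stop codon included in the reported range.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def find_orfs(seq, min_codons=60):
    """Yield (strand, frame, start, end, protein) for every open stretch.

    Any codon may open an ORF, so an ORF is simply the stretch between two
    stop codons (or between a sequence end and a stop codon).
    Assumption: min_codons counts coding codons and excludes the stop.
    """
    seq = seq.upper()
    for strand, s in (("direct", seq), ("reverse", reverse_complement(seq))):
        for frame in range(3):
            protein = translate(s[frame:])
            start = 0  # index of the first codon of the current open stretch
            for i, aa in enumerate(protein):
                if aa != "*":
                    continue
                if i - start >= min_codons:
                    # Range ends on the last base of the stop codon.
                    yield (strand, frame + 1, frame + 3 * start + 1,
                           frame + 3 * (i + 1), protein[start:i])
                start = i + 1
            # Stretch still open at the end of the sequence (no stop codon).
            if len(protein) - start >= min_codons:
                yield (strand, frame + 1, frame + 3 * start + 1,
                       frame + 3 * len(protein), protein[start:])
```

Calling `list(find_orfs(contig))` on the genomic sequence as a plain string should list the same kind of records as the output below, one tuple per ORF; exact agreement with SMS is not guaranteed because of the assumptions stated above.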

>ORF number 1 in reading frame 1 on the direct strand extends from base 256 to base 441.
TCTGCACTATGTATTAAATCAAACTTTACAATACCAACATTTGGTGCCTTTGCAGCAAAA
TCAGTATTTACATCTCCAGGGGCCCATTTATTTTCTCTTTCTATATTAGATTTTGTTCCA
ACATCCCATTCAACAATTATCGGATACTCTATTTTATCAATAACTATTTTTTTAAGTTGT
TCATAA

>Translation of ORF number 1 in reading frame 1 on the direct strand.
SALCIKSNFTIPTFGAFAAKSVFTSPGAHLFSLSILDFVPTSHSTIIGYSILSITIFLSC
S*

No ORFs were found in reading frame 2 on the direct strand.

>ORF number 1 in reading frame 3 on the direct strand extends from base 273 to base 497.
ATCAAACTTTACAATACCAACATTTGGTGCCTTTGCAGCAAAATCAGTATTTACATCTCC
AGGGGCCCATTTATTTTCTCTTTCTATATTAGATTTTGTTCCAACATCCCATTCAACAAT
TATCGGATACTCTATTTTATCAATAACTATTTTTTTAAGTTGTTCATAATTTGTTATCTG
TTTTCCACCAAGACTTATTCCATATTTCTTATTTGTGAAAAATAG

>Translation of ORF number 1 in reading frame 3 on the direct strand.
IKLYNTNIWCLCSKISIYISRGPFIFSFYIRFCSNIPFNNYRILYFINNYFFKLFIICYL
FSTKTYSIFLICEK*


No ORFs were found in reading frame 1 on the reverse strand.

>ORF number 1 in reading frame 2 on the reverse strand extends from base 2 to base 817.
TCTCAAATTACACAGGGTATTTTTCACAAAAATTTTCAAGATTTTAATGATTATGGTCAT
CATTTTTGGTTACAAAAATGGATGTGTCAGAGTATCTTGGAAATGGGTTTTACTTATCCA
CTTGGTATGCTTGGATTAGATATGTCACATTTTGGTGATGATGATGTAGAGGATATATTT
TTGTTAGGTTATAATCCAGGTACAATTATTCCAAGTGGAGAAACACCAAGTTTAAAAAAT
CCAGATTTGGCGTGGGGATTACATCCTGGACAAGGTAGAATTATGGTTTCTTATTTTCTT
GGGTTAAAAAAGATACCAGTACTATTTTTCACAAATAAGAAATATGGAATAAGTCTTGGT
GGAAAACAGATAACAAATTATGAACAACTTAAAAAAATAGTTATTGATAAAATAGAGTAT
CCGATAATTGTTGAATGGGATGTTGGAACAAAATCTAATATAGAAAGAGAAAATAAATGG
GCCCCTGGAGATGTAAATACTGATTTTGCTGCAAAGGCACCAAATGTTGGTATTGTAAAG
TTTGATTTAATACATAGTGCAGATTATAAAATGGCAGGCTGGGTAGATGATTCTCAAATG
TTGTATGCTAAAATAGCAAGAAATAGTTTTCCTTTAAATGTTTATATAAAGTCGGATAGT
CAAGATGATTTTTTTGGGTGTTGTAAAAGAATAGAAAATAGTAGAAAGAAATTTACAAAA
GAAGATAAATTTTATTTTGAGGGTTCAAGAAAATGGTTTGATAAATATGAATTTAGGTTT
GAATTAAATTTTATACAAGTAGATAGTGTTGATAAT

>Translation of ORF number 1 in reading frame 2 on the reverse strand.
SQITQGIFHKNFQDFNDYGHHFWLQKWMCQSILEMGFTYPLGMLGLDMSHFGDDDVEDIF
LLGYNPGTIIPSGETPSLKNPDLAWGLHPGQGRIMVSYFLGLKKIPVLFFTNKKYGISLG
GKQITNYEQLKKIVIDKIEYPIIVEWDVGTKSNIERENKWAPGDVNTDFAAKAPNVGIVK
FDLIHSADYKMAGWVDDSQMLYAKIARNSFPLNVYIKSDSQDDFFGCCKRIENSRKKFTK
EDKFYFEGSRKWFDKYEFRFELNFIQVDSVDN

No ORFs were found in reading frame 3 on the reverse strand.
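
As a quick consistency check on the reverse-strand ORF above, the reported coordinates (base 2 to base 817 on an 819-base contig) can be converted into a codon count and into direct-strand coordinates. The helper below is a hypothetical sketch; it assumes the reverse-strand positions are counted from the 5' end of the reverse complement.

```python
def orf_summary(start, end, contig_length):
    """Summarise an ORF given 1-based, inclusive coordinates on the reverse strand."""
    nt_length = end - start + 1
    codons, leftover = divmod(nt_length, 3)
    return {
        "nt_length": nt_length,
        "codons": codons,
        "leftover_bases": leftover,
        # Same interval expressed on the direct strand.
        "direct_start": contig_length - end + 1,
        "direct_end": contig_length - start + 1,
        # Bases of the contig left unread on each side of the ORF.
        "bases_before": start - 1,
        "bases_after": contig_length - end,
    }


summary = orf_summary(2, 817, 819)
print(summary)
```

With these inputs the ORF covers 816 nucleotides, that is 272 complete codons, which matches the 272 residues of the translation. Only 1 base precedes it and 2 bases follow it, so the open frame runs across the whole contig without meeting a stop codon, in agreement with the 5' and 3' incomplete warnings on the translation.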