ORF LE23220

From Metagenes
Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : JCVI_READ_1091143176848
Annotathon code: ORF_LE23220
Sample :
  • GPS: 5°33'10"N; 87°5'16"W
  • Eastern Tropical Pacific: Dirty Rock, Cocos Island - Costa Rica
  • Fringing Reef (-1.1m, 28.3°C, 0.8-3.0 microns)
Authors
Team : BioCell 2007
Username : Bodi
Annotated on : 2008-03-19 18:52:37
  • BOURGES CHRISTOPHE
  • GENSOLLEN THOMAS

Synopsis

  • Gene symbol: (none)
  • Taxonomy: Viruses (NCBI info)
    Rank: no rank - Genetic Code: Standard - NCBI Identifier: 10239
    Kingdom: - Phylum: - Class: - Order:
    Viruses;

Genomic Sequence

>JCVI_READ_1091143176848 ORF_LE23220 genomic DNA
GTTGTAGCTACTGAAGCACACAACTGTACAATTGATACAGTTATTGATGTTCGTGGTGTTGATGACAACGCATACAACGGAACATATAATATTACATCTG
TAGTTGATCCATTTACATTTAAATATACTGCAGGTTCTACACCAACTATTGCAACAGCAGGTGGTGAATATACAGTAACACCTAAGGGTGCTTTCGGAAC
CAATCTTGAGATTGGTATGATGGATCAGCAAAATGGTATCTTCTTTAGATATGCGAACGGTGAATTAAGTGTTGTTCGTAGATCATCTACTTATCAACTA
TCAGGTAAAGTCAATGTTACTAATGGAAGCACACTAGTTTCTAGTTTCACTGGTATCAATGGTCAAGGAACTAAGTTTTCAAAACAGTTGAAACCAGGTG
ACTATGTTGTCATCCGTGGTTCTTCTTATCGTGTTGATGGTATCGTATCTGATACTCAGATGGTTATTTTCCCTGACTATCGTGGTCCTACAGCAGGTAA
TGTACCTGTAACTAAGACTACAGAAGTAGAATGGAAACAAGGTGACTGGAACATTGACCGTTGTGATGGTACAGGTAAGACTGGTTACACACTCGACACT
ACCAAGATGCAAATGTTCTACATGGATTACTCTTGGTATGGTGCTGGTTTTATTCGTTGGGGTTTCCGTGCATTAAATGGTGATGTTATTTACGCTCATA
AGATACCTAACAACAACCAGAACACTGAAGCATACATGAGATCTGGTAACCTACCAGCTCGTTACGAAGTTAATACTATTCCACCATCAACAACTGCTAC
TAGAACATTTGCTAGTGGTGATAGCACATTGTATGTTGCTGATGATCTTTCTCAGTTCCCAACATCTGGTACACTTCGTGTCAAGCAATCTACTAGTGCT
ACTGCTGGTACTCAAGAGTATATAAATTACACTGGTAAAACATCTTTCGTTCAAGATATTATTAGCACTAACTCACGTAACGATACTATCACTGTTTCTT
CTACTACAGGATTACACCAGGCGGTCAACAAACATTATTTTTGATGTCCCATTTGCTAACATAGTGCTAACAAAA

Translation

[1 - 1041/1075]   direct strand
>ORF_LE23220 Translation [1-1041   direct strand]
VVATEAHNCTIDTVIDVRGVDDNAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVTPKGAFGTNLEIGMMDQQNGIFFRYANGELSVVRRSSTYQL
SGKVNVTNGSTLVSSFTGINGQGTKFSKQLKPGDYVVIRGSSYRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQGDWNIDRCDGTGKTGYTLDT
TKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEVNTIPPSTTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSA
TAGTQEYINYTGKTSFVQDIISTNSRNDTITVSSTTGLHQAVNKHYF

[ Warning ] 5' incomplete: does not start with a Methionine

Phylogeny

neighbor-fitch-kitch :

  +------------------meta4     
  ! 
  !           +-----------meta1     
  !     +-----5 
  !     !     !       +---ORF_LE2322
  !     !     +-------2 
  4-----6             +------meta2     
  !     ! 
  !     !       +------------------meta3     
  !     +-------3 
  !             !         +meta5     
  !             +---------1 
  !                       +-----meta6     
  ! 
  +---------------------Cphage    



Parsimony:

     +-----------------meta3     
     !  
  +--5              +--meta6     
  !  !  +-----------7  
  !  !  !           +--meta2     
  !  +--4  
  !     !     +--------meta5     
  !     +-----6  
  1           !  +-----ORF_LE2322
  !           +--3  
  !              !  +--meta1     
  !              +--2  
  !                 +--meta4     
  !  
  +--------------------Cphage    

Annotator commentaries

We searched for ORFs with SMS (Sequence Manipulation Suite), using the "any codon" option and the standard genetic code. On the direct strand, in reading frame 1, we found an ORF running from base 1 to base 1044 (stop codon included). It is the longest ORF and is therefore the one studied here. The protein encoded by this ORF is at least 347 amino acids long. The 1041 bp studied are coding and are followed by a stop codon, but the 5' end of the gene is missing. The multiple alignment suggests that about 400 amino acids (i.e. 1200 nucleotides) are missing from the 5' end of our ORF, compared with the two highest-scoring homologs. This could correspond to a sequencing error, given the percentage identity between these sequences (over 70%).
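The ORF search described above can be illustrated with a minimal Python sketch. It is not the SMS implementation: the function names `translate` and `longest_open_stretch` are hypothetical, and the "any codon" mode is assumed to mean that an ORF is any stop-free stretch in the chosen frame, with no requirement for an initial methionine.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=1):
    """Translate a DNA string in reading frame 1, 2 or 3 of the direct strand."""
    dna = dna.upper()
    start = frame - 1
    codons = (dna[i:i + 3] for i in range(start, len(dna) - 2, 3))
    # Unknown codons (for example containing N) are shown as X.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def longest_open_stretch(dna, frame=1):
    """Return (nt_start, nt_end, protein) for the longest stop-free stretch.

    Coordinates are 1-based and exclude the stop codon ("any codon" mode,
    as assumed here: no start codon is required).
    """
    protein = translate(dna, frame)
    best_len, best_pos, pos = 0, 0, 0
    for segment in protein.split("*"):
        if len(segment) > best_len:
            best_len, best_pos = len(segment), pos
        pos += len(segment) + 1  # +1 skips the stop codon
    nt_start = (frame - 1) + best_pos * 3 + 1
    nt_end = nt_start + best_len * 3 - 1
    return nt_start, nt_end, protein[best_pos:best_pos + best_len]
```

Applied to the first 18 bases of the read followed by a stop codon, `longest_open_stretch("GTTGTAGCTACTGAAGCATAA")` yields the stretch 1-18 and the peptide VVATEA, which matches the start of the translation given above. The same arithmetic links the reported sizes: 347 codons cover 1041 nucleotides, and the stop codon extends the ORF to base 1044.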

To determine whether our ORF has any known homologs, we ran a BLASTp against Swissprot but found no homolog with a significant score (very high E-values). The sequences returned by this first search share a low percentage identity with our ORF, so they cannot be used for our study. We then ran a BLASTp against nr and found a single homolog with a significant score: 416 bits and an E-value of 5e-115. The pairwise alignment from this BLAST shows 72% identity over a total alignment length of 338 amino acids. Given these good results, this could indicate that the ORF studied is phylogenetically very close to this homolog. The other hits from the same BLAST have E-values of 4e-06 or higher and therefore cannot be used to study our ORF. Having obtained only one significant homolog with the blastp against nr, we ran a blastp against Environmental samples (env_nr). This BLAST returned many homologs of our ORF, with scores above 400 and E-values as low as 2e-159, but they come from organisms that have not yet been identified. We use them to build the phylogenetic tree and also to see whether homologs of our ORF are abundant in the environment. Indeed, many organisms (about a hundred) appear to carry a sequence homologous to our ORF.
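The selection of significant hits can be sketched in a few lines of Python. The scores and E-values below are copied from the nr hit table in this report. The cutoff `E_VALUE_CUTOFF = 1e-10` is an assumption chosen only for illustration, since the report states no explicit threshold, and the helper names are hypothetical.

```python
# (accession, bit score, E-value) copied from the nr hit table of this report.
NR_HITS = [
    ("YP_195208.1", 416.0, 5e-115),
    ("YP_001112759.1", 55.2, 4e-06),
    ("YP_200273.1", 38.0, 0.55),
    ("YP_450549.1", 38.0, 0.57),
]

# Assumed threshold for illustration only; the report does not state one.
E_VALUE_CUTOFF = 1e-10


def parse_hit_line(line):
    """Parse one summary line ending in '<bit score> <E-value>'."""
    description, score, e_value = line.rsplit(None, 2)
    accession = description.split("|")[1]
    return accession, float(score), float(e_value)


def significant_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep the hits whose E-value is at or below the cutoff."""
    return [hit for hit in hits if hit[2] <= cutoff]
```

With this assumed cutoff only YP_195208.1 (Synechococcus phage S-PM2) is retained, in line with the single significant homolog described above. A stricter or looser cutoff would be an equally defensible choice.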

We found no known protein domain with InterPro. Our only significant homolog from an identified organism comes from a bacteriophage of the cyanobacterium Synechococcus (Synechococcus phage S-PM2), but no gene symbol is associated with this sequence. The protein corresponding to this homolog is a structural protein of the virus. Given the strong similarity between this protein and the one encoded by the ORF studied, we can suppose that the latter plays the same role. We therefore proposed "structural molecule activity" as the molecular function of our ORF.

For the taxonomic study, the multiple alignment brought together the protein sequence of the ORF, that of our only significant homolog, and a few sequences from the blastp against Environmental samples. These last sequences should show whether our ORF groups with the virus or with the homologs from unidentified organisms. After multiple alignment with Clustal W, we used the PHYLIP package to build our phylogenetic tree. In both the NJ tree and the parsimony tree (unrooted), the ORF groups with the sequences from unidentified organisms, and the best homolog (Synechococcus phage S-PM2) falls outside this group. Given the strong similarity between the protein sequence of the ORF and that of the phage, one may therefore think that the homologs from unidentified organisms and the ORF itself belong to the same taxonomic group as the phage: viruses. Nevertheless, a single significant homolog from an identified organism cannot prove this hypothesis.
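Distance-based trees such as NJ start from a matrix of pairwise distances computed on the multiple alignment. The sketch below uses the uncorrected p-distance (the fraction of differing residues over gap-free columns) as a deliberate simplification. It is not the distance model applied by PHYLIP, and the function names are hypothetical. The toy input is the first 15 columns of one alignment block from this report.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of differing residues over columns where neither row has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no gap-free column to compare")
    return differences / compared


def distance_matrix(alignment):
    """Return {(name_a, name_b): p-distance} for every unordered pair."""
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a, b in combinations(alignment, 2)
    }


# First 15 columns of the block starting with "QMFYMDYSWYGAGF" in this report.
toy_alignment = {
    "Cphage": "QMFYMDYSWYGAGFI",
    "meta4": "QMFYLDYSWYGAGFI",
    "meta1": "QMFYMDYSWYGAGFV",
    "ORF_LE23220": "QMFYMDYSWYGAGFI",
}
toy_distances = distance_matrix(toy_alignment)
```

On so short a fragment the distances carry no phylogenetic weight. The point is only to show the kind of input a neighbor-joining program consumes, and why columns with gaps, which are frequent here because most sequences are partial, must be handled explicitly.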












Multiple Alignment

CLUSTAL W (1.82) multiple sequence alignment


Cphage           MARKSIRSNYYIFDASAREVILPGGIQREQLVLITNVTDNKVIYNFSDPELTATEYHIST
meta4            MARKSIKSNYYLFDASARQVVIPGGIQREQLTLITNVTDNKIIYNFSDPELTATAYSIET
meta1            -----------------------------------------VIYNFSDPELTASIYSIQT
ORF_LE23220      ------------------------------------------------------------
meta2            ------------------------------------------------------------
meta3            ------------------------------------------------------------
meta5            ------------------------------------------------------------
meta6            ------------------------------------------------------------
                                                                             

Cphage           DIRNVTTTRIVLAYDTTSMSDTDKLQIVYDDFEETIKPSETYHDAVNKSKVSNPQSQIDT
meta4            DIRNVTTTRVTLAYDTTSMADTDKLQIVYDDFEETIKPAETYMDSVNKQRVSNPQSQIDT
meta1            DIRNVTTTRVVLSYDTTGMEDSDDLQIIVDDFEETVKPAETYNDAVNKSKVSQPQSQIDT
ORF_LE23220      ------------------------------------------------------------
meta2            ------------------------------------------------------------
meta3            ------------------------------------------------------------
meta5            ------------------------------------------------------------
meta6            ------------------------------------------------------------
                                                                             

Cphage           DFEYGTQDTKWEALAMINNNPFAYKGQDVLNVSDIQVTTGSKIVTVSSTS----NPGAGT
meta4            DFEYGTQGTKWEALAMINNNPFAYKSETALAVTQIEAFTSSKLIRVTVDTNQTALPAAGS
meta1            DFEYGTQDTKWEALAMVNNNPFAYKSQTPIVITEIQTTPGSREMAVSCSTP----PGAGT
ORF_LE23220      ------------------------------------------------------------
meta2            ------------------------------------------------------------
meta3            ---------------------------------------DSKQIRVSVDTSQTPLPNAGT
meta5            ------------------------------------------------------------
meta6            ------------------------------------------------------------
                                                                             

Cphage           AVYIQDSLFPGANGVFITNTS-----NSNSFTYTAKYPWTFGPG--SVYNSARTALYRGV
meta4            AVFVQDTTFPAANGVFIIDAAGTGSLNSNQFTYTASFAWTQGNS--SIQISGRTDVYTGS
meta1            AIYMQDATFPGANGVFIIDSISTTG-AFVGFKYTAKYEWPSGTGGTNIYDAARTAVYSGI
ORF_LE23220      ------------------------------------------------------------
meta2            ------------------------------------------------------------
meta3            AVFIQDTIHDGANGVFIVDKKLGAQNQFTYTAKYPWTTGDSDIDIDARTAVYTGALFTGA
meta5            ------------------------------------------------------------
meta6            ------------------------------------------------------------
                                                                             

Cphage           HYTASEIGGAITVSFPGD----GSVKVTTSRAHGLEIGNEFGMVGSTGTNVNGSWIVARV
meta4            HYTGSDIGGSLTLSAPGD----GQVRVVCTNAHGLEVGNEIAITGSNGNNVNGSWVVATV
meta1            HFTGSDLGGTITLSTPSAGVMSGSVQVDTTQAHGLEVGNEIAIVGSAGTNVNGSWVVARV
ORF_LE23220      ------------------------------------------------------------
meta2            ------------------------------------------------------------
meta3            EIGGNIT---------ISAPGNGNVRVTCSNAHGLEIGNEIAVTGSQGSNVNGSWIVATV
meta5            ------------------------------------------------------------
meta6            ------------------------------------------------------------
                                                                             

Cphage           ESPTIFYYFPDGTPSGTG--GTKKLYPRPQGNSVHRAFDGGVKFSTNSTSKNQQSIRQTK
meta4            ESPTIFEYYPTTAPSGGVGSGTAKLYPRPQGNSVHRAFDGGVKFSTNSISKNQQAIRQTK
meta1            ESPTRFYYYPDAAPTGSVATGTIKLYPRPQGNSIHRAFDGGVKFSTNSFSKNQQAVRQTK
ORF_LE23220      ------------------------------------------------------------
meta2            ------------------------------------------------------------
meta3            ESPTVFEYYPDAAPSGTVNGGTIKLYPRPQGNSVHRAFDGGVKFSTNSNSKNQQQIRQTK
meta5            ------------------------------------AFDGGVKFSTNSHSKNQQAIRQTK
meta6            ------------------------------------------------------------
                                                                             

Cphage           RYFRYQSGKGVAFSTGSILAPAIENIDEISSSGTTVTVVAATNHNISKGSQIDVRGCSDN
meta4            RYFRYQSGKGVAFSTGSILAPAIENIDSITASGTTVTVVTTVAHNITRGSQIDVRSCDDN
meta1            RYFRYQSGKGVSFSTGSILEPAIENLDSITASGTDVTVVSTDAHNVTRDTVVRVQGVNDN
ORF_LE23220      -------------------------------------VVATEAHNCTIDTVIDVRGVDDN
meta2            ------------------------------------------------------------
meta3            RYFRYQSGKGVAFSTGSILAPAIENIDSITASGTTVTVVSSVAHNITRDTEVKVQGCNDN
meta5            RYFRYQSGKGIAFSTGSILAPAIENVDEITSSGSTVTVVCAVAHNVTRDTEVDVRGCNDN
meta6            -----------------------------------------------------------N
                                                                             

Cphage           AYNGIYEVTNIVDPFTLQYVVPSTPTQTTTGGEYSVTAINSFGTNLEIGMMDQQNGIFFR
meta4            NYNGTFTVSEVIDAFTFKYTALAAPSVTVAGGSYTITPINAYGVNLEIGMMDQQNGIFFR
meta1            NYNGTFNVTNVVDPYTFQYAADSSPTDSEAAGEYTITPVNSYGTKLEIGMMDQQNGIFFR
ORF_LE23220      AYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVTPKGAFGTNLEIGMMDQQNGIFFR
meta2            -------------------------------------------------------GIFFR
meta3            AYNGTYNITNVIDAYKFEYQASVAPTESTAAGEYTITPVNANGVQLELGMMDQQNGIFFR
meta5            VYNGTFQVSNVINAYTFQYVAGSTPSEATASGEYTVTPVNANGINLEIGMMDQQNGIFFR
meta6            VYNGTFQVSNVIDAFTFQYVAGSTPSEATASGEYTVTPVNANGVNLEIGMMDQQNGIFFR
                                                                        *****

Cphage           YSNGVLSVVRRTSTFQLSGKVSVTNGSTLVSNFTTINGQTTKFAKQLKPGDYVVIRGSSY
meta4            YSHGQIEVVRRTSTFQLSGKATVTQGSSVVSSYTGVNGASTRFAKQLKVGDYVVLRGSSY
meta1            WASGSLSVVRRTSTFQLSGRLTVTNGSTLVSSFAGPNLQTTKFAKQLKPGDYVVIRGASY
ORF_LE23220      YANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGINGQGTKFSKQLKPGDYVVIRGSSY
meta2            YSNGELSVVRRSSTYQLSGKVTVNNGSTLVSSFTGINGQGTKFAKQLQPGDYVVIRGSSY
meta3            HANGHTSVVRRSATFQLSGRATVTNGNSLISSYTGPNQQGTKFAKQLQIGDYVVLRGSTY
meta5            HANGHTSLVRRTSTFQLSGKATVTSGSSLISSYTGPNQQGTKFAKQLKIGDYVVLRGSSY
meta6            HANGHTSLVRRTSTFQLSGKATVTSGSSLISSYTGPNQQGTKFAKQLKIGDYVVLRGSSY
                  : *  .:***::*:****: .*..*.:::*.::  *   *:*:***: *****:**::*

Cphage           RVDGIISDTQMVIFPDYRGPTSNDVPVTKTVETEWNQTEWNLDRMDGTGKSGYTLDPTRM
meta4            RVDGIISDTQIVIFPDYRGQSDINVPITKTEEAVWRQSEWNLDRCDGTGKSGYTLDVTKM
meta1            RVDGIISDTQMVIFPDYRGPSASNVPVTKTVETEWNQGDWNIDRCDGTGKTGYTLDPTKM
ORF_LE23220      RVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQGDWNIDRCDGTGKTGYTLDTTKM
meta2            RVDGIISDTQMVIFPDYRGPSAGNVPVTKTTEVEWKQGDWNIDRCDGTGKTGYNLDVTKM
meta3            RVDGIVSDTQMVIFPDYRGPSDINVPITKVTETEWQQEDWNLDRCDGTGKSGYSLDVTKM
meta5            RIDGIISDTQMVIFPDYRGPSDINVPITKTVETVWDQPSWNIDRCDGTGKSGYTLDVTKM
meta6            RIDGIISDTQMVIFPDYRGPSDINVPITKTVETVWDQPSWNIDRCDGTGKSGYTLDVTKM
                 *:***:****:******** :  :**:**. *. * * .**:** *****:**.** *:*

Cphage           QMFYMDYSWYGAGFIRWGFRAEDGNIIYAHKIPNNNVNTEAYMRSGNLPSRYEVNTIPPS
meta4            QMFYLDYSWYGAGFIRWGFRAADGDVIYAHKIPNNNFNTEAYMRSGNLPARYEVNTICPS
meta1            QMFYMDYSWYGAGFVRWGFRALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEVNTIPPA
ORF_LE23220      QMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEVNTIPPS
meta2            QMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEVNTIPPK
meta3            QMFYMDYSWYGAGFIRWGFRATDGDVIYAHKIPNNNFNTEAYMRSGNLPARYEVNTICPK
meta5            QMFYMDYSWYGAGFIRWGFRATDGNVIYIHKIPNNNFNTEAYMRSGNLPSRYEVNTICPT
meta6            QMFYMDYSWYGAGFIRWGFRATDGNVIYIHKIPNNNFNTEAYMRSGNLPSRYEVNTICPT
                 ****:*********:****** :*::** ******* ************:******* * 

Cphage           TIASKTLSSTDNVLYVADAPVKFPDSGTLRIKKTTSATDGVYEYVNYTSKTSYVKDVFNV
meta4            VTATTDISSGASVLYVDKAPELFPGNGTLRIRQTDSASTADIEYVNYTGTTSFKQDVIAT
meta1            TVASRTFTSGDSTLYIADAPTHFPSSGTLRVRQTTGSTAGTQEYINYTGKVQFQQDVIAV
ORF_LE23220      TTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSATAGTQEYINYTGKTSFVQDIIST
meta2            TVATRTFASGDSTLYVADDLTQFPASGTLRVKQSSSATGGTQEYINYTGKTSFVQDIISV
meta3            STATTSFANNATSIYVTEAPYAFPTSGTLRIRQTTSATAANTEYVNYTGTTKFQQDIIAV
meta5            TQATKSITNSDSVIYVTEPPYGFPDAGTLRIRQTTSATAANQEYVNYTGTTKYQQDVISV
meta6            TQATKSITNSDTTIYVTEPPYGFPDAGTLRIRQTTSATAANQEYVNYTGTTKYQQDVISV
                   *:  ::.  . :*: .    **  ****:::: .:: .  **:***....: :*:: .

Cphage           SSN--IIEVNSTTGLSGGGVQTVHFSIPFSNVVAGKTYYVATVPTSTTFTITDTPGSSTP
meta4            EAAGNTIQVASSSGLQGNGVQSIRFDRPFSNIVANRTYYVATVPNATSFTITETPSSSTP
meta1            SAAGDTIEVASTTGLSPGGQQTITFDTPFSNIVAKKIYYVAAVPSATTFKITDTIGDATG
ORF_LE23220      NSRNDTITVSSTTGLHQAVNKHYF------------------------------------
meta2            DAGNDTIVVSSTTGLTPGGQQTIIFDTPFSNIVANKTYFVAAVPSSTTFKITETQGDSTG
meta3            TSGNDAIEVADTTGLQGGGVQTITFTVPFSNVVARKTYYIASILSSTLFTITDTPGSSVA
meta5            VAANDSIVVGTTTGLQGGGVQTVTFDRPFSNIVSGKTYYVAQVISSTAFSITDTAGNSLG
meta6            VAGNNSIVVGSTTGLQGGGVQTVTFDRPFSNIVSGKTYYVAQVISGTAFSITDTAGNSLG
                  :    * *  ::**     :                                       

Cphage           LTLTDQSGSALSPLSVVSSGSFNGLIREQSGAIAEMIMAAGSSSGVSTLEGIGYQKGQRV
meta4            ISLLAQQVLLCLHWQLQSLEHSLV------------------------------------
meta1            IALTDATGSALSPLSRASAGSFTGCY----------------------------------
ORF_LE23220      ------------------------------------------------------------
meta2            IALTGATGSALSPLSRARAGSFTGITREQARAASVTLSIGNGESSGTVSSATGIQKGQRV
meta3            IALNDETGTALSPLSTGLSGAFTGVTREQAGATAVNLTIADGSSVGTVSSSTGIQKGQLV
meta5            ISLQDQTGTTLSPLGIVRSGAFTGVTREQAGATGINLTIADNASSGTVSSATGIKKDN--
meta6            ISLLDQTGTTLSPLGIARSGAFTGVTREQSGASAINLTIADNSSSGTVSSGTGIQVGQRV
                                                                             

Cphage           IG-TGIPDGTSVYSVSGSNIEFSNAVTSANPTATFIPMGTSQEANTFTYSETQPISVELL
meta4            ------------------------------------------------------------
meta1            ------------------------------------------------------------
ORF_LE23220      ------------------------------------------------------------
meta2            IDNSDIPADTFVHSISGTNITLSRAVTADNPAGVKFPPLGAGAASTFTYSATQPIGVELI
meta3            VG-ADIPDDTTVYNIVGTTITLSKAVTAANPQNVTFAAMGTGVPSAFTYSATQPISLELI
meta5            ------------------------------------------------------------
meta6            VG-ADVPDDAYVNYISGTSIKLSKAVTAANPTNIRFPSMGTGSATAFTYSATQPIALELI
                                                                             

Cphage           EATSVPQISHWGSSVIMDGLYDDDRAYVYTVGTKLRRTIGTGGNRTRSVLALRISPSVDN
meta4            ------------------------------------------------------------
meta1            ------------------------------------------------------------
ORF_LE23220      ------------------------------------------------------------
meta2            QATSVPQISHWGSSIIMDGEYDEDRAYIYSVGTKTGRSVSSGATKAILGLRLAP-TVDNG
meta3            AATSVPQISHWGSSVIMDGEFDEDRAYIYSVGTSTGRSIASGATKGILAIRVAP-SIDNG
meta5            ------------------------------------------------------------
meta6            AATSVPQISHWGSSVIMDGRLDDDRAYVYTAATKRQGGIQSGQTRAIIALRVAP-SVDNG
                                                                             

Cphage           GIIGSFGSRELINRMQLVFRSMDVLSEGQFFVELVLNPIPTISTEWLPVGGTSLAEYSNF
meta4            ------------------------------------------------------------
meta1            ------------------------------------------------------------
ORF_LE23220      ------------------------------------------------------------
meta2            ITGTQLGDRELVNRMQLVMRDCQIVANGVFFVELVLNPTVTIQAEWEPVGGTSLAQYAEL
meta3            ISG-SFGARELTNRMQLVMRDCQIVSNGVFFVELLLNPLCDTSATWQNVGGTSLAQFAVL
meta5            ------------------------------------------------------------
meta6            IPG-NFGTRELVNRMQLVLSQVDISSNGKFFVELVLNPVPDIQGIWIPVGGTSIAQYAIM
                                                                             

Cphage           DGRSDVDFIGGEVIYGFYAGGSQDNAVPQSYNLESVKEISNSILGGGSNEYTTSSPPDPS
meta4            ------------------------------------------------------------
meta1            ------------------------------------------------------------
ORF_LE23220      ------------------------------------------------------------
meta2            NNNCELVGG-EVVYAFYAGDAGGFGAGASTVPLDDVKEISNSVLGGGGANLVVASGANPT
meta3            GNNAELVGG-EVVYAFYAGDAG-FASGAASISLKDVKEISNCILGGGKATYDDAS---PT
meta5            ------------------------------------------------------------
meta6            NTNTQLLGG-EVIFGFYSDNGV------NQYSLDEVKELSNSILGGGSANYDTSTGANPT
                                                                             

Cphage           GIYPDGPEVVGIRVTNIGSSVAKIDARISWTEAQA----
meta4            ---------------------------------------
meta1            ---------------------------------------
ORF_LE23220      ---------------------------------------
meta2            GVFPDGPEVLAVRVTNIAGGFGSGSRSADFKFSWTEAQA
meta3            GVFPDGPEVLAVRVTNIAGGFGSSAKSADFKFSWTEAQA
meta5            ---------------------------------------
meta6            GIFPDGPEVLAIRATNIT----NSQKGIDARFSWKEAQA








BLAST

BLASTp of the 347 AA ORF against Swissprot

                                                                  Score     E
Sequences producing significant alignments:                       (Bits)  Value

sp|Q9X2V9.2|MCJC_ECOLX  Microcin J25-processing protein mcjC       33.1    1.8  
sp|P17334.2|PTQC_ECOLI  N,N'-diacetylchitobiose permease IIC c...  32.0    3.9
sp|P36093|PHD1_YEAST  Putative transcription factor PHD1           32.0    4.4
sp|Q9FFB0|SCP47_ARATH  Serine carboxypeptidase-like 47 precursor   31.6    5.6



Pairwise alignment

>sp|Q9X2V9.2|MCJC_ECOLX  Microcin J25-processing protein mcjC
Length=513

 Score = 33.1 bits (74),  Expect = 1.8, Method: Composition-based stats.
 Identities = 22/92 (23%), Positives = 46/92 (50%), Gaps = 4/92 (4%)

Query  68   NLEIGMMDQQNGIFFRYANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGINGQGTKFS  127
            N++I + D+ + I F   +   S+    + + LSG    T  S +   +   NG+  KF+
Sbjct  97   NMDIFVSDKISDIKFLNPDMTFSLNITMAEHYLSGNRIATQESLITGIYKVNNGEFIKFN  156

Query  128  KQLKP----GDYVVIRGSSYRVDGIVSDTQMV  155
             QLKP     ++ + + ++  +D I+ + +M+
Sbjct  157  NQLKPVLLRDEFSITKKNNSTIDSIIDNIEMM  188


>sp|P17334.2|PTQC_ECOLI Gene info N,N'-diacetylchitobiose permease IIC component (PTS system N,N'-diacetylchitobiose-specific 
EIIC component) (EIIC-Chb)
Length=452

 GENE ID: 945982 chbC | N,N'-diacetylchitobiose-specific enzyme IIC component of
PTS [Escherichia coli K12] (Over 10 PubMed links)

 Score = 32.0 bits (71),  Expect = 3.9, Method: Composition-based stats.
 Identities = 13/29 (44%), Positives = 20/29 (68%), Gaps = 0/29 (0%)

Query  10  TIDTVIDVRGVDDNAYNGTYNITSVVDPF  38
           TI+T+  ++G+  N YNGT  I S++ PF
Sbjct  72  TIETLNGLKGIGGNVYNGTLGIMSLMAPF  100


>sp|P36093|PHD1_YEAST Gene info Putative transcription factor PHD1
Length=366

 GENE ID: 853823 PHD1 | Transcriptional activator that enhances pseudohyphal
growth; regulates expression of FLO11, an adhesin required for pseudohyphal
filament formation; similar to StuA, an A. nidulans developmental regulator;
potential Cdc28p substrate [Saccharomyces cerevisiae] (10 or fewer PubMed links)

 Score = 32.0 bits (71),  Expect = 4.4, Method: Composition-based stats.
 Identities = 22/80 (27%), Positives = 36/80 (45%), Gaps = 2/80 (2%)

Query  30   NITSVVDPFTFKYTAGSTPTIATAGGEYTVTPKGAFGTNLEIGMMDQQNGIFFRYANGEL  89
            N TSV  P   K  A ++PT+        V+        +   M + +N I ++     +
Sbjct  148  NDTSVARPNNLKSIAAASPTVTATTRTPGVSSTSVLKPRVITTMWEDENTICYQVEANGI  207

Query  90   SVVRRSSTYQLSGK--VNVT  107
            SVVRR+    ++G   +NVT
Sbjct  208  SVVRRADNNMINGTKLLNVT  227


>sp|Q9FFB0|SCP47_ARATH Gene info Serine carboxypeptidase-like 47 precursor
Length=505

 GENE ID: 832362 SCPL47 | SCPL47 (serine carboxypeptidase-like 47); serine
carboxypeptidase [Arabidopsis thaliana] (10 or fewer PubMed links)

 Score = 31.6 bits (70),  Expect = 5.6, Method: Composition-based stats.
 Identities = 23/94 (24%), Positives = 42/94 (44%), Gaps = 12/94 (12%)

Query  223  RALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEVNTIPPSTTATRTFASGDSTLYVADD  282
            R LN   +++  +  ++ N E  ++S NL  +Y+VN IP          S D+   +   
Sbjct  24   RILNNPSVFSSSLNFSSGNAERLIKSFNLMPKYDVNVIPK--------GSLDAPRLIERQ  75

Query  283  LSQFPTSGTLRVKQSTSATAGTQEYINYTGKTSF  316
            +    T+G+    ++ S     QE+ +Y G  S 
Sbjct  76   IDFLATAGS----KNASVGPSVQEFGHYAGYYSL  105




BLASTp of the 347 AA ORF against nr

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

ref|YP_195208.1|  virion structural protein [Cyanophage phage ...   416    5e-115
ref|YP_001112759.1|  hypothetical protein Dred_1404 [Desulfoto...  55.2    4e-06
ref|YP_200273.1|  hypothetical protein XOO1634 [Xanthomonas or...  38.0    0.55
ref|YP_450549.1|  hypothetical protein XOO_1520 [Xanthomonas o...  38.0    0.57
gb|AAF02290.1|AF090831_1  cytochrome oxidase subunit III [Mytilus  35.9    2.3
ref|XP_001351707.1|  hypothetical protein, conserved [Plasmodi...  35.6    2.8
gb|AAO16596.1|  cytochrome oxidase subunit III [Mytilus trossulus  35.6    3.0
ref|ZP_00121342.2|  COG3507: Beta-xylosidase [Bifidobacterium lon  35.6    3.2
ref|YP_214763.1|  cytochrome c oxidase subunit III [Mytilus ga...  35.2    3.4
ref|NP_695401.1|  possible endo-1,5-alpha-L-arabinosidase [Bif...  34.9    4.4
gb|AAO51629.1|  similar to Anabaena sp. (strain PCC 7120). Pol...  34.5    6.1
ref|NP_781205.1|  zink-carboxypeptidase [Clostridium tetani E8...  34.2    8.5

Pairwise alignment

>ref|YP_195208.1| Gene info virion structural protein [Cyanophage phage S-PM2]
 emb|CAF34238.1| Gene info virion structural protein [Cyanophage phage S-PM2]
Length=1095

 GENE ID: 3260257 S-PM2p173 | virion structural protein
[Synechococcus phage S-PM2] (10 or fewer PubMed links)

 Score =  416 bits (1194),  Expect = 5e-115, Method: Composition-based stats.
 Identities = 246/338 (72%), Positives = 287/338 (84%), Gaps = 2/338 (0%)

Query  1    VVATEAHNCTIDTVIDVRGVDDNAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVT  60
            VVA   HN +  + IDVRG  DNAYNG Y +T++VDPFT +Y   STPT  T GGEY+VT
Sbjct  381  VVAATNHNISKGSQIDVRGCSDNAYNGIYEVTNIVDPFTLQYVVPSTPTQTTTGGEYSVT  440

Query  61   PKGAFGTNLEIGMMDQQNGIFFRYANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGIN  120
               +FGTNLEIGMMDQQNGIFFRY+NG LSVVRR+ST+QLSGKV+VTNGSTLVS+FT IN
Sbjct  441  AINSFGTNLEIGMMDQQNGIFFRYSNGVLSVVRRTSTFQLSGKVSVTNGSTLVSNFTTIN  500

Query  121  GQGTKFSKQLKPGDYVVIRGSSYRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQ  180
            GQ TKF+KQLKPGDYVVIRGSSYRVDGI+SDTQMVIFPDYRGPT+ +VPVTKT E EW Q
Sbjct  501  GQTTKFAKQLKPGDYVVIRGSSYRVDGIISDTQMVIFPDYRGPTSNDVPVTKTVETEWNQ  560

Query  181  GDWNIDRCDGTGKTGYTLDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQ  240
             +WN+DR DGTGK+GYTLD T+MQMFYMDYSWYGAGFIRWGFRA +G++IYAHKIPNNN 
Sbjct  561  TEWNLDRMDGTGKSGYTLDPTRMQMFYMDYSWYGAGFIRWGFRAEDGNIIYAHKIPNNNV  620

Query  241  NTEAYMRSGNLPARYEVNTIPPSTTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSA  300
            NTEAYMRSGNLP+RYEVNTIPPST A++T +S D+ LYVAD   +FP SGTLR+K++TSA
Sbjct  621  NTEAYMRSGNLPSRYEVNTIPPSTIASKTLSSTDNVLYVADAPVKFPDSGTLRIKKTTSA  680

Query  301  TAGTQEYINYTGKTSFVQDIISTNSRNDTITVSSTTGL  338
            T G  EY+NYT KTS+V+D+ + +S  + I V+STTGL
Sbjct  681  TDGVYEYVNYTSKTSYVKDVFNVSS--NIIEVNSTTGL  716
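The summary figures of this alignment can be rechecked with simple arithmetic. The sketch below assumes that BLAST truncates percentages to whole numbers, which is consistent with the values shown in this report, and that the number of residues missing upstream can be estimated from the offset between subject and query start positions. The function names are hypothetical.

```python
def blast_percent(count, alignment_length):
    """Percentage truncated to an integer (assumed BLAST display convention)."""
    return (100 * count) // alignment_length


def missing_upstream_residues(query_start, subject_start):
    """Residues of the subject that precede the first aligned query residue,
    beyond those the query could cover itself."""
    return (subject_start - 1) - (query_start - 1)


# Values read from the alignment with YP_195208.1 above.
identity_pct = blast_percent(246, 338)
positives_pct = blast_percent(287, 338)
gaps_pct = blast_percent(2, 338)
missing_aa = missing_upstream_residues(query_start=1, subject_start=381)
missing_nt = 3 * missing_aa
```

The alignment starts at residue 381 of the phage protein, so 380 residues (1140 nucleotides) lie upstream of the region covered by the ORF. This is in line with the estimate of roughly 400 amino acids given in the annotator commentaries.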


>ref|YP_001112759.1| Gene info hypothetical protein Dred_1404 [Desulfotomaculum reducens MI-1]
 gb|ABO49934.1| Gene info hypothetical protein Dred_1404 [Desulfotomaculum reducens MI-1]
Length=338

 GENE ID: 4956333 Dred_1404 | hypothetical protein
[Desulfotomaculum reducens MI-1]

 Score = 55.2 bits (145),  Expect = 4e-06, Method: Composition-based stats.
 Identities = 46/134 (34%), Positives = 64/134 (47%), Gaps = 4/134 (2%)

Query  142  SYRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWK-QGDWNIDRCDGTGKTGYTLDT  200
            S ++ GI  DT    F  Y G + G V   K     W  Q  WN+D+ DGTG +G  LD 
Sbjct  51   STQIIGI-GDTNDGFFFGYNGSSFG-VLRRKNGLDNWTPQASWNMDKLDGTGSSGLILDP  108

Query  201  TKMQMFYMDYSWYGAGFIRWGFRALN-GDVIYAHKIPNNNQNTEAYMRSGNLPARYEVNT  259
            TK  ++ + Y W G G I +     N GD++  HKIP  N NT+  + +  LP   +V+ 
Sbjct  109  TKGNVYSIQYQWLGFGMIYFYIENQNTGDLLLVHKIPYANANTDPSIFNPVLPLMAQVSN  168

Query  260  IPPSTTATRTFASG  273
               +T      AS 
Sbjct  169  TTNNTNIKLETASA  182


>ref|YP_200273.1| Gene info hypothetical protein XOO1634 [Xanthomonas oryzae pv. oryzae KACC10331]
 gb|AAW74888.1| Gene info conserved hypothetical protein [Xanthomonas oryzae pv. oryzae 
KACC10331]
Length=756

 GENE ID: 3262543 vrgS | hypothetical protein
[Xanthomonas oryzae pv. oryzae KACC10331] (10 or fewer PubMed links)

 Score = 38.0 bits (95),  Expect = 0.55, Method: Composition-based stats.
 Identities = 36/94 (38%), Positives = 41/94 (43%), Gaps = 4/94 (4%)

Query  15   IDVRGVDDNAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVTPKGAFGTNLEIGMM  74
            I V G D     GT   TSV    T  + AG T TIA AG   TVT  G F T L    +
Sbjct  570  ITVTGTDTLGVTGT-RTTSVTGAVTETFQAGQTKTIAAAGYRETVT--GDFATTLNGNFV  626

Query  75   DQQNGIFFR-YANGELSVVRRSSTYQLSGKVNVT  107
             Q+NG +        L  V   +T QLS    VT
Sbjct  627  SQRNGTWKETVTQTSLRQVIGKTTEQLSAGREVT  660


>ref|YP_450549.1| Gene info hypothetical protein XOO_1520 [Xanthomonas oryzae pv. oryzae 
MAFF 311018]
 dbj|BAE68275.1| Gene info conserved hypothetical protein [Xanthomonas oryzae pv. oryzae 
MAFF 311018]
Length=735

 GENE ID: 3858965 XOO1520 | hypothetical protein
[Xanthomonas oryzae pv. oryzae MAFF 311018]

 Score = 38.0 bits (95),  Expect = 0.57, Method: Composition-based stats.
 Identities = 36/94 (38%), Positives = 41/94 (43%), Gaps = 4/94 (4%)

Query  15   IDVRGVDDNAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVTPKGAFGTNLEIGMM  74
            I V G D     GT   TSV    T  + AG T TIA AG   TVT  G F T L    +
Sbjct  549  ITVTGTDTLGVTGT-RTTSVTGAVTETFQAGQTKTIAAAGYRETVT--GDFATTLNGNFV  605

Query  75   DQQNGIFFR-YANGELSVVRRSSTYQLSGKVNVT  107
             Q+NG +        L  V   +T QLS    VT
Sbjct  606  SQRNGTWKETVTQTSLRQVIGKTTEQLSAGREVT  639


>gb|AAF02290.1|AF090831_1  cytochrome oxidase subunit III [Mytilus californianus]
Length=72

 Score = 35.9 bits (89),  Expect = 2.3, Method: Composition-based stats.
 Identities = 17/60 (28%), Positives = 33/60 (55%), Gaps = 1/60 (1%)

Query  198  LDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEV  257
            +D   ++++ + Y W+G     W F+  +GD IY  K P+   +  AY++  + P+ Y+V
Sbjct  8    VDVVWVRLWCLVYVWFGGWLYMWWFKMWDGD-IYTFKYPDAKPSWYAYVQEEHAPSWYKV  66


>ref|XP_001351707.1| Gene info hypothetical protein, conserved [Plasmodium falciparum 3D7]
 emb|CAD51514.1| Gene info hypothetical protein, conserved [Plasmodium falciparum 3D7]
Length=848

 GENE ID: 812964 PFE0750c | hypothetical protein, conserved
[Plasmodium falciparum 3D7] (10 or fewer PubMed links)

 Score = 35.6 bits (88),  Expect = 2.8, Method: Composition-based stats.
 Identities = 38/153 (24%), Positives = 64/153 (41%), Gaps = 14/153 (9%)

Query  187  RCDGTGKTGYTLDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAYM  246
            +C+ T  +GYT    K    Y    ++   F R G  A   + +Y H+IPN N   E + 
Sbjct  580  KCNPTKDSGYT----KADKSYTSKQYFCIYFAR-GCCAYGHNCLYKHRIPNENDELE-FE  633

Query  247  RSGNLPARYEVNTIPPSTTATRTFASGDSTLYVA----DDLSQFPTSGTLRVKQSTSATA  302
             S ++  R + NT     T   TF +   TL++     ++++Q P     ++        
Sbjct  634  ASVDIFGREKFNTFKDDMTGVGTFNNDCRTLFIGSIFINNINQVPV--IEKILYEEFLPF  691

Query  303  GTQEYINY--TGKTSFVQDIISTNSRNDTITVS  333
            G  EY+ Y      +F+Q     N+    I +S
Sbjct  692  GNIEYVRYIPNKNIAFIQFTNRVNAEFAKIAMS  724


>gb|AAO16596.1|  cytochrome oxidase subunit III [Mytilus trossulus]
Length=120

 Score = 35.6 bits (88),  Expect = 3.0, Method: Composition-based stats.
 Identities = 17/60 (28%), Positives = 32/60 (53%), Gaps = 1/60 (1%)

Query  198  LDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEV  257
            +D   + ++ + Y W+G     W F+  +GDV Y  K P+   +  AY++  + P+ Y+V
Sbjct  56   VDVVWVALWCLVYVWFGGWLYMWWFKMWDGDV-YTFKYPDAKPSWYAYIQEEHAPSWYKV  114


>ref|ZP_00121342.2|  COG3507: Beta-xylosidase [Bifidobacterium longum DJO10A]
Length=194

 Score = 35.6 bits (88),  Expect = 3.2, Method: Composition-based stats.
 Identities = 30/89 (33%), Positives = 44/89 (49%), Gaps = 8/89 (8%)

Query  116  FTGINGQGTKFSKQLKPGDY-VVI--RGSSYRVDGIVSDTQMVIFPDYRGPTA-GNVPVT  171
            +TG     TK+  +   GDY  VI  + +S++    V+D       DYRG     N+ +T
Sbjct  71   YTGTKAADTKYDAKNVAGDYEFVIHDQRTSFKGPKKVTDKHST---DYRGVNKPANITLT  127

Query  172  KTTEVEW-KQGDWNIDRCDGTGKTGYTLD  199
            +  +V   K G W +D+ DGTG    TLD
Sbjct  128  EDGKVTGDKTGTWKLDKSDGTGDMTITLD  156


>ref|YP_214763.1| cytochrome c oxidase subunit III [Mytilus galloprovincialis]
 gb|AAR31736.1| cytochrome c oxidase subunit III [Mytilus galloprovincialis]
Length=311

 GENE ID: 3332125 COX3 | cytochrome c oxidase subunit III
[Mytilus galloprovincialis] (10 or fewer PubMed links)

 Score = 35.2 bits (87),  Expect = 3.4, Method: Composition-based stats.
 Identities = 17/60 (28%), Positives = 32/60 (53%), Gaps = 1/60 (1%)

Query  198  LDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEV  257
            +D   + ++ + Y W+G     W F+  +GDV Y  K P+   +  AY++  + P+ Y+V
Sbjct  247  VDVVWVALWCLVYVWFGGWLYMWWFKMWDGDV-YTFKYPDAKPSWYAYIQEEHAPSWYKV  305


>ref|NP_695401.1| possible endo-1,5-alpha-L-arabinosidase [Bifidobacterium longum 
NCC2705]
 gb|AAN24037.1| possible endo-1,5-alpha-L-arabinosidase [Bifidobacterium longum 
NCC2705]
Length=531

 GENE ID: 1022842 BL0183 | possible endo-1,5-alpha-L-arabinosidase
[Bifidobacterium longum NCC2705] (10 or fewer PubMed links)

 Score = 34.9 bits (86),  Expect = 4.4, Method: Composition-based stats.
 Identities = 30/89 (33%), Positives = 44/89 (49%), Gaps = 8/89 (8%)

Query  116  FTGINGQGTKFSKQLKPGDY-VVI--RGSSYRVDGIVSDTQMVIFPDYRGPTAG-NVPVT  171
            +TG     TK+  +   GDY  VI  + +S++    V+D       DYRG     N+ +T
Sbjct  408  YTGTKAADTKYDAKNVAGDYEFVIHDQRTSFKGPKKVTDKHST---DYRGVNKPVNITLT  464

Query  172  KTTEVEW-KQGDWNIDRCDGTGKTGYTLD  199
            +  +V   K G W +D+ DGTG    TLD
Sbjct  465  EDGKVTGDKTGTWKLDKSDGTGDMTITLD  493


>gb|AAO51629.1|  similar to Anabaena sp. (strain PCC 7120). Polyketide synthase 
[Dictyostelium discoideum]
Length=1995

 Score = 34.5 bits (85),  Expect = 6.1, Method: Composition-based stats.
 Identities = 35/119 (29%), Positives = 56/119 (47%), Gaps = 8/119 (6%)

Query  96   STYQLSGKVNVTNGSTLVSSFTGINGQGTKFS--KQLKPGD--YVVIRGSSYRVDGI-VS  150
            S   + G VN+   +T + +F+ +N   T  S    ++ G+  Y +I+GSSY VDG   S
Sbjct  203  SKLSIVGGVNLIVDTTNIKAFSYLNMLSTSGSLKDSIRDGNKIYCIIKGSSYNVDGNGNS  262

Query  151  DTQMVIFPDYRGPTAGNVPVT-KTTEVEWKQGDWNIDRCDGTGK-TGYTLDTTKMQMFY  207
            D Q    P      + N+ +  K+T       D +   C GTG  TG  ++T  + M +
Sbjct  263  DKQNFYAPSSIS-QSDNIKLAIKSTNGSITCDDIDYFECHGTGTPTGDPIETKGISMAF  320


>ref|NP_781205.1| zink-carboxypeptidase [Clostridium tetani E88]
 gb|AAO35142.1| zink-carboxypeptidase [Clostridium tetani E88]
Length=581

 GENE ID: 1060123 CTC00519 | zink-carboxypeptidase [Clostridium tetani E88]
(10 or fewer PubMed links)

 Score = 34.2 bits (84),  Expect = 8.5, Method: Composition-based stats.
 Identities = 36/145 (24%), Positives = 58/145 (40%), Gaps = 10/145 (6%)

Query  1    VVATEAHNCTIDTVIDVRGVDDNAY--NGTYNITSV----VDPFTFKYTAGSTPTIATAG  54
            ++ + A N  I    DVR V DN +  N   ++ S      D +TF    G  PTI  AG
Sbjct  270  ILISSAKNLNIPK--DVRKVSDNLFVNNAQKDLASRNFSNYDYYTFT-EKGGKPTIFEAG  326

Query  55   GEYTVTPKGAFGTNLEIGMMDQQNGIFFRYANGELSVVRRSSTYQLSGKVNVTNGSTLVS  114
             +  +  + A+G       + +  GI     N E  V+    + Q   +    N   +  
Sbjct  327  TDPKIG-RNAYGLQPSFSFLVESRGIGIGKENFERRVLSHIVSAQNIIRSTANNADLVKK  385

Query  115  SFTGINGQGTKFSKQLKPGDYVVIR  139
            +      + TK  K + P D +V+R
Sbjct  386  TIDNARKEITKLGKTVDPNDKLVLR  410
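
The "Identities" and "Gaps" figures reported with each alignment can be recounted directly from the aligned Query and Sbjct strings. The sketch below is only an illustration: the function name `alignment_stats` is hypothetical, and it does not reproduce the "Positives" count, which would need the substitution matrix used by BLAST. The example strings are copied from the Mytilus trossulus (AAO16596.1) alignment above.

```python
def alignment_stats(query, sbjct):
    """Count identities and gap columns in a pairwise alignment.

    Both strings must be the aligned rows of one alignment block
    (same length, '-' marking a gap).
    """
    if len(query) != len(sbjct):
        raise ValueError("aligned rows must have the same length")
    # A column is an identity when both rows carry the same residue.
    identities = sum(1 for q, s in zip(query, sbjct) if q == s and q != "-")
    # A column is a gap when either row carries '-'.
    gaps = sum(1 for q, s in zip(query, sbjct) if q == "-" or s == "-")
    return identities, gaps, len(query)


# Aligned rows copied from the AAO16596.1 hit (Query 198-257, Sbjct 56-114).
QUERY = "LDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEV"
SBJCT = "VDVVWVALWCLVYVWFGGWLYMWWFKMWDGDV-YTFKYPDAKPSWYAYIQEEHAPSWYKV"

identities, gaps, length = alignment_stats(QUERY, SBJCT)
print(f"Identities = {identities}/{length} ({100 * identities / length:.0f}%), "
      f"Gaps = {gaps}/{length}")
# prints: Identities = 17/60 (28%), Gaps = 1/60
```

With E-values between 2.8 and 8.5 and identities around 24-33%, recounting the figures this way does not make these nr hits any more convincing; it only shows how the percentages are derived.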



Blastp against environmental samples (env_nr):

Sequences producing significant alignments:                       (Bits)  Value

gb|EBL29051.1|  hypothetical protein GOS_8595502 [marine metageno   562    2e-159
gb|EDB50388.1|  hypothetical protein GOS_1812769 [marine metageno   451    8e-126
gb|ECV30510.1|  hypothetical protein GOS_2924511 [marine metageno   412    3e-114
gb|ECX02854.1|  hypothetical protein GOS_2612174 [marine metageno   411    5e-114
gb|EDI89573.1|  hypothetical protein GOS_1789994 [marine metageno   396    2e-109
gb|EBN58132.1|  hypothetical protein GOS_8222789 [marine metageno   393    1e-108
gb|EDE97880.1|  hypothetical protein GOS_1036762 [marine metageno   391    8e-108
gb|EBA66340.1|  hypothetical protein GOS_363840 [marine metagenom   343    2e-93 
gb|ECH47406.1|  hypothetical protein GOS_5115624 [marine metageno   273    2e-72 
gb|ECI35556.1|  hypothetical protein GOS_5118533 [marine metageno   215    5e-55 
gb|ECK99817.1|  hypothetical protein GOS_5116567 [marine metageno   210    2e-53 
gb|EBM74109.1|  hypothetical protein GOS_8360821 [marine metageno   202    7e-51 
gb|ECK10767.1|  hypothetical protein GOS_5149627 [marine metageno   198    1e-49 
gb|ECY36742.1|  hypothetical protein GOS_2371994 [marine metageno   194    2e-48 
gb|EDB40570.1|  hypothetical protein GOS_1829690 [marine metageno   193    2e-48 
gb|EDE24433.1|  hypothetical protein GOS_1164552 [marine metageno   188    8e-47 
gb|EBD38981.1|  hypothetical protein GOS_9937366 [marine metageno   186    3e-46 
gb|EBK13726.1|  hypothetical protein GOS_8782514 [marine metageno   185    5e-46 
gb|EBE37095.1|  hypothetical protein GOS_9775088 [marine metageno   183    2e-45 
gb|EDI41734.1|  hypothetical protein GOS_436553 [marine metagenom   183    3e-45 
gb|EBC86909.1|  hypothetical protein GOS_1501 [marine metagenome]   181    1e-44 
gb|EDB83388.1|  hypothetical protein GOS_1584431 [marine metageno   179    4e-44 
gb|EDH09167.1|  hypothetical protein GOS_667731 [marine metagenom   178    7e-44 
gb|ECQ18545.1|  hypothetical protein GOS_3158114 [marine metageno   177    1e-43 
gb|ECE38116.1|  hypothetical protein GOS_3192478 [marine metageno   173    3e-42 
gb|ECB51552.1|  hypothetical protein GOS_4064910 [marine metageno   173    4e-42 
gb|ECX86404.1|  hypothetical protein GOS_2462194 [marine metageno   172    4e-42 
gb|EBM54147.1|  hypothetical protein GOS_8393448 [marine metageno   171    1e-41 
gb|ECD22382.1|  hypothetical protein GOS_4281375 [marine metageno   171    1e-41 
gb|EDD27489.1|  hypothetical protein GOS_1329328 [marine metageno   168    6e-41 
gb|EDB37089.1|  hypothetical protein GOS_1835634 [marine metageno   165    9e-40 
gb|EBL89404.1|  hypothetical protein GOS_8496857 [marine metageno   162    5e-39 
gb|EBC44309.1|  hypothetical protein GOS_69092 [marine metagenome   162    5e-39 
gb|EDB37857.1|  hypothetical protein GOS_1834265 [marine metageno   162    5e-39 
gb|EBY68590.1|  hypothetical protein GOS_4839564 [marine metageno   161    2e-38 
gb|EDE35082.1|  hypothetical protein GOS_1145805 [marine metageno   158    9e-38 
gb|EDC78377.1|  hypothetical protein GOS_1415215 [marine metageno   157    2e-37 
gb|EBO95708.1|  hypothetical protein GOS_7991700 [marine metageno   156    3e-37 
gb|EBM36340.1|  hypothetical protein GOS_8421847 [marine metageno   156    3e-37 
gb|EBY07609.1|  hypothetical protein GOS_5298784 [marine metageno   156    3e-37 
gb|EBK59973.1|  hypothetical protein GOS_8705896 [marine metageno   153    4e-36 
gb|EBC87277.1|  hypothetical protein GOS_927 [marine metagenome]    151    9e-36 
gb|EDC79826.1|  hypothetical protein GOS_1412607 [marine metageno   150    3e-35 
gb|EBV81120.1|  hypothetical protein GOS_6847055 [marine metageno   150    3e-35 
gb|EDD55939.1|  hypothetical protein GOS_1283759 [marine metageno   149    5e-35 
gb|EBM34083.1|  hypothetical protein GOS_8425478 [marine metageno   148    7e-35 
gb|ECU25210.1|  hypothetical protein GOS_5110527 [marine metageno   146    3e-34 
gb|EBQ47438.1|  hypothetical protein GOS_7746033 [marine metageno   145    6e-34 
gb|EDC23652.1|  hypothetical protein GOS_1512048 [marine metageno   145    7e-34 
gb|EBL47388.1|  hypothetical protein GOS_8565864 [marine metageno   144    1e-33 
gb|ECE77493.1|  hypothetical protein GOS_5337576 [marine metageno   144    2e-33 
gb|EDG08619.1|  hypothetical protein GOS_842794 [marine metagenom   143    3e-33 
gb|EBM12732.1|  hypothetical protein GOS_8458696 [marine metageno   143    4e-33 
gb|ECP05415.1|  hypothetical protein GOS_3041481 [marine metageno   142    7e-33 
gb|EDH88897.1|  hypothetical protein GOS_524184 [marine metagenom   137    1e-31 
gb|EDC40986.1|  hypothetical protein GOS_1481028 [marine metageno   136    3e-31 
gb|EDB02789.1|  hypothetical protein GOS_1894129 [marine metageno   135    6e-31 
gb|ECR75892.1|  hypothetical protein GOS_3910070 [marine metageno   135    6e-31 
gb|ECM91327.1|  hypothetical protein GOS_4513046 [marine metageno   133    3e-30 
gb|EBP03542.1|  hypothetical protein GOS_7978396 [marine metageno   132    9e-30 
gb|EBU14530.1|  hypothetical protein GOS_7159426 [marine metageno   128    7e-29 
gb|ECG70480.1|  hypothetical protein GOS_4682438 [marine metageno   126    3e-28 
gb|EBN45433.1|  hypothetical protein GOS_8243916 [marine metageno   125    6e-28 
gb|EDI30870.1|  hypothetical protein GOS_454979 [marine metagenom   125    7e-28 
gb|ECH47407.1|  hypothetical protein GOS_5115625 [marine metageno   125    1e-27 
gb|EBV37469.1|  hypothetical protein GOS_6915895 [marine metageno   123    3e-27 
gb|EBD41976.1|  hypothetical protein GOS_9932803 [marine metageno   123    4e-27 
gb|ECF95231.1|  hypothetical protein GOS_4198001 [marine metageno   121    2e-26 
gb|EBN92170.1|  hypothetical protein GOS_8166217 [marine metageno   119    4e-26 
gb|EDE66640.1|  hypothetical protein GOS_1091159 [marine metageno   118    8e-26 
gb|EBK65660.1|  hypothetical protein GOS_8696385 [marine metageno   118    9e-26 
gb|ECI63249.1|  hypothetical protein GOS_4037754 [marine metageno   118    1e-25 
gb|ECP99038.1|  hypothetical protein GOS_3910774 [marine metageno   117    3e-25 
gb|EBQ12204.1|  hypothetical protein GOS_7799817 [marine metageno   116    3e-25 
gb|ECE26013.1|  hypothetical protein GOS_3672767 [marine metageno   114    2e-24 
gb|ECT85968.1|  hypothetical protein GOS_4290802 [marine metageno   112    9e-24 
gb|EBW29648.1|  hypothetical protein GOS_6768882 [marine metageno   110    2e-23 
gb|ECR44983.1|  hypothetical protein GOS_5130246 [marine metageno   106    4e-22 
gb|ECU68623.1|  hypothetical protein GOS_3391639 [marine metageno   101    1e-20 
gb|ECT93294.1|  hypothetical protein GOS_4004854 [marine metageno   100    2e-20 
gb|EBK18438.1|  hypothetical protein GOS_8774532 [marine metageno  98.0    1e-19 
gb|EBN96962.1|  hypothetical protein GOS_8158088 [marine metageno  94.5    2e-18 
gb|EBN03533.1|  hypothetical protein GOS_8312441 [marine metageno  93.5    3e-18 
gb|EBQ90163.1|  hypothetical protein GOS_7680541 [marine metageno  91.1    2e-17 
gb|ECP13152.1|  hypothetical protein GOS_6219024 [marine metageno  89.0    8e-17 
gb|ECS62862.1|  hypothetical protein GOS_3960620 [marine metageno  85.2    1e-15 
gb|EBM51210.1|  hypothetical protein GOS_8398165 [marine metageno  84.2    2e-15 
gb|ECS62861.1|  hypothetical protein GOS_3960619 [marine metageno  83.8    2e-15 
gb|EBO89349.1|  hypothetical protein GOS_8002450 [marine metageno  83.5    3e-15 
gb|ECP12527.1|  hypothetical protein GOS_6246196 [marine metageno  83.1    5e-15 
gb|EBZ01948.1|  hypothetical protein GOS_3528747 [marine metageno  82.1    1e-14 
gb|EBN92169.1|  hypothetical protein GOS_8166216 [marine metageno  81.4    1e-14 
gb|EBK36698.1|  hypothetical protein GOS_8744578 [marine metageno  81.1    2e-14 
gb|ECG76118.1|  hypothetical protein GOS_4452174 [marine metageno  80.0    4e-14 
gb|EBQ90165.1|  hypothetical protein GOS_7680543 [marine metageno  80.0    4e-14 
gb|EBA66339.1|  hypothetical protein GOS_363839 [marine metagenom  77.6    2e-13 
gb|ECG87663.1|  hypothetical protein GOS_4011145 [marine metageno  77.6    2e-13 
gb|EBD41770.1|  hypothetical protein GOS_9933161 [marine metageno  76.2    5e-13 
gb|EBV87241.1|  hypothetical protein GOS_6837472 [marine metageno  75.6    9e-13 
gb|ECD32314.1|  hypothetical protein GOS_3900817 [marine metageno  73.8    3e-12 



Pairwise alignments:

>gb|EBL29051.1|  hypothetical protein GOS_8595502 [marine metagenome]
Length=347

 Score =  562 bits (1617),  Expect = 2e-159, Method: Composition-based stats.
 Identities = 347/347 (100%), Positives = 347/347 (100%), Gaps = 0/347 (0%)

Query  1    VVATEAHNCTIDTVIDVRGVDDNAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVT  60
            VVATEAHNCTIDTVIDVRGVDDNAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVT
Sbjct  1    VVATEAHNCTIDTVIDVRGVDDNAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVT  60

Query  61   PKGAFGTNLEIGMMDQQNGIFFRYANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGIN  120
            PKGAFGTNLEIGMMDQQNGIFFRYANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGIN
Sbjct  61   PKGAFGTNLEIGMMDQQNGIFFRYANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGIN  120

Query  121  GQGTKFSKQLKPGDYVVIRGSSYRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQ  180
            GQGTKFSKQLKPGDYVVIRGSSYRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQ
Sbjct  121  GQGTKFSKQLKPGDYVVIRGSSYRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQ  180

Query  181  GDWNIDRCDGTGKTGYTLDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQ  240
            GDWNIDRCDGTGKTGYTLDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQ
Sbjct  181  GDWNIDRCDGTGKTGYTLDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQ  240

Query  241  NTEAYMRSGNLPARYEVNTIPPSTTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSA  300
            NTEAYMRSGNLPARYEVNTIPPSTTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSA
Sbjct  241  NTEAYMRSGNLPARYEVNTIPPSTTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSA  300

Query  301  TAGTQEYINYTGKTSFVQDIISTNSRNDTITVSSTTGLHQAVNKHYF  347
            TAGTQEYINYTGKTSFVQDIISTNSRNDTITVSSTTGLHQAVNKHYF
Sbjct  301  TAGTQEYINYTGKTSFVQDIISTNSRNDTITVSSTTGLHQAVNKHYF  347


>gb|EDB50388.1|  hypothetical protein GOS_1812769 [marine metagenome]
Length=760

 Score =  451 bits (1293),  Expect = 8e-126, Method: Composition-based stats.
 Identities = 263/338 (77%), Positives = 302/338 (89%), Gaps = 0/338 (0%)

Query  1    VVATEAHNCTIDTVIDVRGVDDNAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVT  60
            VV+T+AHN T DTV+ V+GV+DN YNGT+N+T+VVDP+TF+Y A S+PT + A GEYT+T
Sbjct  352  VVSTDAHNVTRDTVVRVQGVNDNNYNGTFNVTNVVDPYTFQYAADSSPTDSEAAGEYTIT  411

Query  61   PKGAFGTNLEIGMMDQQNGIFFRYANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGIN  120
            P  ++GT LEIGMMDQQNGIFFR+A+G LSVVRR+ST+QLSG++ VTNGSTLVSSF G N
Sbjct  412  PVNSYGTKLEIGMMDQQNGIFFRWASGSLSVVRRTSTFQLSGRLTVTNGSTLVSSFAGPN  471

Query  121  GQGTKFSKQLKPGDYVVIRGSSYRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQ  180
             Q TKF+KQLKPGDYVVIRG+SYRVDGI+SDTQMVIFPDYRGP+A NVPVTKT E EW Q
Sbjct  472  LQTTKFAKQLKPGDYVVIRGASYRVDGIISDTQMVIFPDYRGPSASNVPVTKTVETEWNQ  531

Query  181  GDWNIDRCDGTGKTGYTLDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQ  240
            GDWNIDRCDGTGKTGYTLD TKMQMFYMDYSWYGAGF+RWGFRALNGDVIYAHKIPNNNQ
Sbjct  532  GDWNIDRCDGTGKTGYTLDPTKMQMFYMDYSWYGAGFVRWGFRALNGDVIYAHKIPNNNQ  591

Query  241  NTEAYMRSGNLPARYEVNTIPPSTTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSA  300
            NTEAYMRSGNLPARYEVNTIPP+T A+RTF SGDSTLY+AD  + FP+SGTLRV+Q+T +
Sbjct  592  NTEAYMRSGNLPARYEVNTIPPATVASRTFTSGDSTLYIADAPTHFPSSGTLRVRQTTGS  651

Query  301  TAGTQEYINYTGKTSFVQDIISTNSRNDTITVSSTTGL  338
            TAGTQEYINYTGK  F QD+I+ ++  DTI V+STTGL
Sbjct  652  TAGTQEYINYTGKVQFQQDVIAVSAAGDTIEVASTTGL  689


>gb|ECV30510.1|  hypothetical protein GOS_2924511 [marine metagenome]
Length=642

 Score =  412 bits (1181),  Expect = 3e-114, Method: Composition-based stats.
 Identities = 240/260 (92%), Positives = 249/260 (95%), Gaps = 0/260 (0%)

Query  79   GIFFRYANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGINGQGTKFSKQLKPGDYVVI  138
            GIFFRY+NGELSVVRRSSTYQLSGKV V NGSTLVSSFTGINGQGTKF+KQL+PGDYVVI
Sbjct  1    GIFFRYSNGELSVVRRSSTYQLSGKVTVNNGSTLVSSFTGINGQGTKFAKQLQPGDYVVI  60

Query  139  RGSSYRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQGDWNIDRCDGTGKTGYTL  198
            RGSSYRVDGI+SDTQMVIFPDYRGP+AGNVPVTKTTEVEWKQGDWNIDRCDGTGKTGY L
Sbjct  61   RGSSYRVDGIISDTQMVIFPDYRGPSAGNVPVTKTTEVEWKQGDWNIDRCDGTGKTGYNL  120

Query  199  DTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEVN  258
            D TKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEVN
Sbjct  121  DVTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEVN  180

Query  259  TIPPSTTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSATAGTQEYINYTGKTSFVQ  318
            TIPP T ATRTFASGDSTLYVADDL+QFP SGTLRVKQS+SAT GTQEYINYTGKTSFVQ
Sbjct  181  TIPPKTVATRTFASGDSTLYVADDLTQFPASGTLRVKQSSSATGGTQEYINYTGKTSFVQ  240

Query  319  DIISTNSRNDTITVSSTTGL  338
            DIIS ++ NDTI VSSTTGL
Sbjct  241  DIISVDAGNDTIVVSSTTGL  260


>gb|ECX02854.1|  hypothetical protein GOS_2612174 [marine metagenome]
Length=943

 Score =  411 bits (1179),  Expect = 5e-114, Method: Composition-based stats.
 Identities = 238/339 (70%), Positives = 282/339 (83%), Gaps = 0/339 (0%)

Query  1    VVATEAHNCTIDTVIDVRGVDDNAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVT  60
            VV++ AHN T DT + V+G +DNAYNGTYNIT+V+D + F+Y A   PT +TA GEYT+T
Sbjct  230  VVSSVAHNITRDTEVKVQGCNDNAYNGTYNITNVIDAYKFEYQASVAPTESTAAGEYTIT  289

Query  61   PKGAFGTNLEIGMMDQQNGIFFRYANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGIN  120
            P  A G  LE+GMMDQQNGIFFR+ANG  SVVRRS+T+QLSG+  VTNG++L+SS+TG N
Sbjct  290  PVNANGVQLELGMMDQQNGIFFRHANGHTSVVRRSATFQLSGRATVTNGNSLISSYTGPN  349

Query  121  GQGTKFSKQLKPGDYVVIRGSSYRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQ  180
             QGTKF+KQL+ GDYVV+RGS+YRVDGIVSDTQMVIFPDYRGP+  NVP+TK TE EW+Q
Sbjct  350  QQGTKFAKQLQIGDYVVLRGSTYRVDGIVSDTQMVIFPDYRGPSDINVPITKVTETEWQQ  409

Query  181  GDWNIDRCDGTGKTGYTLDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQ  240
             DWN+DRCDGTGK+GY+LD TKMQMFYMDYSWYGAGFIRWGFRA +GDVIYAHKIPNNN 
Sbjct  410  EDWNLDRCDGTGKSGYSLDVTKMQMFYMDYSWYGAGFIRWGFRATDGDVIYAHKIPNNNF  469

Query  241  NTEAYMRSGNLPARYEVNTIPPSTTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSA  300
            NTEAYMRSGNLPARYEVNTI P +TAT +FA+  +++YV +    FPTSGTLR++Q+TSA
Sbjct  470  NTEAYMRSGNLPARYEVNTICPKSTATTSFANNATSIYVTEAPYAFPTSGTLRIRQTTSA  529

Query  301  TAGTQEYINYTGKTSFVQDIISTNSRNDTITVSSTTGLH  339
            TA   EY+NYTG T F QDII+  S ND I V+ TTGL 
Sbjct  530  TAANTEYVNYTGTTKFQQDIIAVTSGNDAIEVADTTGLQ  568


>gb|EDI89573.1|  hypothetical protein GOS_1789994 [marine metagenome]
Length=798

 Score =  396 bits (1135),  Expect = 2e-109, Method: Composition-based stats.
 Identities = 226/334 (67%), Positives = 273/334 (81%), Gaps = 0/334 (0%)

Query  6    AHNCTIDTVIDVRGVDDNAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVTPKGAF  65
            AHN T  + IDVR  DDN YNGT+ ++ V+D FTFKYTA + P++  AGG YT+TP  A+
Sbjct  397  AHNITRGSQIDVRSCDDNNYNGTFTVSEVIDAFTFKYTALAAPSVTVAGGSYTITPINAY  456

Query  66   GTNLEIGMMDQQNGIFFRYANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGINGQGTK  125
            G NLEIGMMDQQNGIFFRY++G++ VVRR+ST+QLSGK  VT GS++VSS+TG+NG  T+
Sbjct  457  GVNLEIGMMDQQNGIFFRYSHGQIEVVRRTSTFQLSGKATVTQGSSVVSSYTGVNGASTR  516

Query  126  FSKQLKPGDYVVIRGSSYRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQGDWNI  185
            F+KQLK GDYVV+RGSSYRVDGI+SDTQ+VIFPDYRG +  NVP+TKT E  W+Q +WN+
Sbjct  517  FAKQLKVGDYVVLRGSSYRVDGIISDTQIVIFPDYRGQSDINVPITKTEEAVWRQSEWNL  576

Query  186  DRCDGTGKTGYTLDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAY  245
            DRCDGTGK+GYTLD TKMQMFY+DYSWYGAGFIRWGFRA +GDVIYAHKIPNNN NTEAY
Sbjct  577  DRCDGTGKSGYTLDVTKMQMFYLDYSWYGAGFIRWGFRAADGDVIYAHKIPNNNFNTEAY  636

Query  246  MRSGNLPARYEVNTIPPSTTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSATAGTQ  305
            MRSGNLPARYEVNTI PS TAT   +SG S LYV      FP +GTLR++Q+ SA+    
Sbjct  637  MRSGNLPARYEVNTICPSVTATTDISSGASVLYVDKAPELFPGNGTLRIRQTDSASTADI  696

Query  306  EYINYTGKTSFVQDIISTNSRNDTITVSSTTGLH  339
            EY+NYTG TSF QD+I+T +  +TI V+S++GL 
Sbjct  697  EYVNYTGTTSFKQDVIATEAAGNTIQVASSSGLQ  730


>gb|EBN58132.1|  hypothetical protein GOS_8222789 [marine metagenome]
Length=50

 Score =  393 bits (1127),  Expect = 1e-108, Method: Composition-based stats.
 Identities = 231/332 (69%), Positives = 275/332 (82%), Gaps = 0/332 (0%)

Query  1    VVATEAHNCTIDTVIDVRGVDDNAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVT  60
            VV   AHN T DT +DVRG +DN YNGT+ +++V++ +TF+Y AGSTP+ ATA GEYTVT
Sbjct  62   VVCAVAHNVTRDTEVDVRGCNDNVYNGTFQVSNVINAYTFQYVAGSTPSEATASGEYTVT  121

Query  61   PKGAFGTNLEIGMMDQQNGIFFRYANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGIN  120
            P  A G NLEIGMMDQQNGIFFR+ANG  S+VRR+ST+QLSGK  VT+GS+L+SS+TG N
Sbjct  122  PVNANGINLEIGMMDQQNGIFFRHANGHTSLVRRTSTFQLSGKATVTSGSSLISSYTGPN  181

Query  121  GQGTKFSKQLKPGDYVVIRGSSYRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQ  180
             QGTKF+KQLK GDYVV+RGSSYR+DGI+SDTQMVIFPDYRGP+  NVP+TKT E  W Q
Sbjct  182  QQGTKFAKQLKIGDYVVLRGSSYRIDGIISDTQMVIFPDYRGPSDINVPITKTVETVWDQ  241

Query  181  GDWNIDRCDGTGKTGYTLDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQ  240
              WNIDRCDGTGK+GYTLD TKMQMFYMDYSWYGAGFIRWGFRA +G+VIY HKIPNNN 
Sbjct  242  PSWNIDRCDGTGKSGYTLDVTKMQMFYMDYSWYGAGFIRWGFRATDGNVIYIHKIPNNNF  301

Query  241  NTEAYMRSGNLPARYEVNTIPPSTTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSA  300
            NTEAYMRSGNLP+RYEVNTI P+T AT++  + DS +YV +    FP +GTLR++Q+TSA
Sbjct  302  NTEAYMRSGNLPSRYEVNTICPTTQATKSITNSDSVIYVTEPPYGFPDAGTLRIRQTTSA  361

Query  301  TAGTQEYINYTGKTSFVQDIISTNSRNDTITV  332
            TA  QEY+NYTG T + QD+IS  + ND+I V
Sbjct  362  TAANQEYVNYTGTTKYQQDVISVVAANDSIVV  393


>gb|EDE97880.1|  hypothetical protein GOS_1036762 [marine metagenome]
Length=686

 Score =  391 bits (1119),  Expect = 8e-108, Method: Composition-based stats.
 Identities = 224/317 (70%), Positives = 266/317 (83%), Gaps = 0/317 (0%)

Query  23   NAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVTPKGAFGTNLEIGMMDQQNGIFF  82
            N YNGT+ +++V+D FTF+Y AGSTP+ ATA GEYTVTP  A G NLEIGMMDQQNGIFF
Sbjct  1    NVYNGTFQVSNVIDAFTFQYVAGSTPSEATASGEYTVTPVNANGVNLEIGMMDQQNGIFF  60

Query  83   RYANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGINGQGTKFSKQLKPGDYVVIRGSS  142
            R+ANG  S+VRR+ST+QLSGK  VT+GS+L+SS+TG N QGTKF+KQLK GDYVV+RGSS
Sbjct  61   RHANGHTSLVRRTSTFQLSGKATVTSGSSLISSYTGPNQQGTKFAKQLKIGDYVVLRGSS  120

Query  143  YRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQGDWNIDRCDGTGKTGYTLDTTK  202
            YR+DGI+SDTQMVIFPDYRGP+  NVP+TKT E  W Q  WNIDRCDGTGK+GYTLD TK
Sbjct  121  YRIDGIISDTQMVIFPDYRGPSDINVPITKTVETVWDQPSWNIDRCDGTGKSGYTLDVTK  180

Query  203  MQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEVNTIPP  262
            MQMFYMDYSWYGAGFIRWGFRA +G+VIY HKIPNNN NTEAYMRSGNLP+RYEVNTI P
Sbjct  181  MQMFYMDYSWYGAGFIRWGFRATDGNVIYIHKIPNNNFNTEAYMRSGNLPSRYEVNTICP  240

Query  263  STTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSATAGTQEYINYTGKTSFVQDIIS  322
            +T AT++  + D+T+YV +    FP +GTLR++Q+TSATA  QEY+NYTG T + QD+IS
Sbjct  241  TTQATKSITNSDTTIYVTEPPYGFPDAGTLRIRQTTSATAANQEYVNYTGTTKYQQDVIS  300

Query  323  TNSRNDTITVSSTTGLH  339
              + N++I V STTGL 
Sbjct  301  VVAGNNSIVVGSTTGLQ  317


>gb|EBA66340.1|  hypothetical protein GOS_363840 [marine metagenome]
Length=427

 Score =  343 bits (981),  Expect = 2e-93, Method: Composition-based stats.
 Identities = 184/251 (73%), Positives = 216/251 (86%), Gaps = 0/251 (0%)

Query  1    VVATEAHNCTIDTVIDVRGVDDNAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVT  60
            VV+  AHN T DTV+ V+GV DNAYNG + +T+V+D + F+YTA STPT  TA GEYTVT
Sbjct  62   VVSAVAHNVTRDTVVKVQGVTDNAYNGEHQVTNVIDAYQFQYTASSTPTETTASGEYTVT  121

Query  61   PKGAFGTNLEIGMMDQQNGIFFRYANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGIN  120
            P GA+G NLEIGMMDQQNG+FFR++NG  SVVRR+ST+QLSGK  VTNGS+L+SS+T  N
Sbjct  122  PIGAYGVNLEIGMMDQQNGLFFRHSNGRTSVVRRTSTFQLSGKATVTNGSSLISSYTAPN  181

Query  121  GQGTKFSKQLKPGDYVVIRGSSYRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQ  180
             QGTKF+KQL+ GDY+V+RGSSYRVDGI+SDTQ+VIFPDYRGP+A NVP+TKT E EW Q
Sbjct  182  QQGTKFAKQLEIGDYIVLRGSSYRVDGIISDTQLVIFPDYRGPSAINVPITKTAETEWYQ  241

Query  181  GDWNIDRCDGTGKTGYTLDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQ  240
             +WN+DRCDGTGK+GY+LD TKMQM YMDYSWYGAGFIR+GFR  +GDVIY HKIPNNN 
Sbjct  242  EEWNLDRCDGTGKSGYSLDVTKMQMLYMDYSWYGAGFIRFGFRGEDGDVIYVHKIPNNNF  301

Query  241  NTEAYMRSGNL  251
            NTEAYMRSGNL
Sbjct  302  NTEAYMRSGNL  312


>gb|ECH47406.1|  hypothetical protein GOS_5115624 [marine metagenome]
Length=290

 Score =  273 bits (778),  Expect = 2e-72, Method: Composition-based stats.
 Identities = 152/194 (78%), Positives = 176/194 (90%), Gaps = 0/194 (0%)

Query  145  VDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQGDWNIDRCDGTGKTGYTLDTTKMQ  204
            +DGI+SDTQMVIFPDYRGP+AGNVPVTKT E EW Q DWNIDRCDGTGKTGYT+D TKMQ
Sbjct  1    IDGIISDTQMVIFPDYRGPSAGNVPVTKTVETEWAQADWNIDRCDGTGKTGYTIDPTKMQ  60

Query  205  MFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEAYMRSGNLPARYEVNTIPPST  264
            MFYMDYSWYGAGF+RWGFRALNGD+IYAHKIPNNNQNTEAYMRSGNLPARYEVNT+PPST
Sbjct  61   MFYMDYSWYGAGFVRWGFRALNGDIIYAHKIPNNNQNTEAYMRSGNLPARYEVNTVPPST  120

Query  265  TATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSATAGTQEYINYTGKTSFVQDIISTN  324
              T++F++ D+TL+VA DL+ FP++GTL +K+STSATAG  EYINYTGKT+F QD+I+ +
Sbjct  121  VTTKSFSNSDTTLFVASDLADFPSAGTLALKRSTSATAGEGEYINYTGKTAFSQDVINVS  180

Query  325  SRNDTITVSSTTGL  338
            +  DTI V+STTGL
Sbjct  181  AAADTIEVASTTGL  194


>gb|ECI35556.1|  hypothetical protein GOS_5118533 [marine metagenome]
Length=243

 Score =  215 bits (610),  Expect = 5e-55, Method: Composition-based stats.
 Identities = 127/154 (82%), Positives = 139/154 (90%), Gaps = 0/154 (0%)

Query  185  IDRCDGTGKTGYTLDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQNTEA  244
            IDRCDGTGKTGYTLDTTKMQMFYMDYSWYGAGF+RWGFRALNGDVIYAHKIPNNNQNTEA
Sbjct  1    IDRCDGTGKTGYTLDTTKMQMFYMDYSWYGAGFVRWGFRALNGDVIYAHKIPNNNQNTEA  60

Query  245  YMRSGNLPARYEVNTIPPSTTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSATAGT  304
            YMRSGNLPARYEVNTIPP+T  TRT  S DST+YVAD  + FP+SGTLR+KQST AT G 
Sbjct  61   YMRSGNLPARYEVNTIPPATACTRTINSSDSTIYVADAPTHFPSSGTLRIKQSTGATTGV  120

Query  305  QEYINYTGKTSFVQDIISTNSRNDTITVSSTTGL  338
            QEY+NYTGKT FVQD+IS ++  DTITV+ST+GL
Sbjct  121  QEYVNYTGKTVFVQDVISVSAAGDTITVASTSGL  154

ORF finding

SMS ORF Finder/any codon/reading frames 1, 2 and 3/minimum length: 60/direct strand/standard genetic code

>ORF number 1 in reading frame 1 on the direct strand extends from base 1 to base 1044.
GTTGTAGCTACTGAAGCACACAACTGTACAATTGATACAGTTATTGATGTTCGTGGTGTT
GATGACAACGCATACAACGGAACATATAATATTACATCTGTAGTTGATCCATTTACATTT
AAATATACTGCAGGTTCTACACCAACTATTGCAACAGCAGGTGGTGAATATACAGTAACA
CCTAAGGGTGCTTTCGGAACCAATCTTGAGATTGGTATGATGGATCAGCAAAATGGTATC
TTCTTTAGATATGCGAACGGTGAATTAAGTGTTGTTCGTAGATCATCTACTTATCAACTA
TCAGGTAAAGTCAATGTTACTAATGGAAGCACACTAGTTTCTAGTTTCACTGGTATCAAT
GGTCAAGGAACTAAGTTTTCAAAACAGTTGAAACCAGGTGACTATGTTGTCATCCGTGGT
TCTTCTTATCGTGTTGATGGTATCGTATCTGATACTCAGATGGTTATTTTCCCTGACTAT
CGTGGTCCTACAGCAGGTAATGTACCTGTAACTAAGACTACAGAAGTAGAATGGAAACAA
GGTGACTGGAACATTGACCGTTGTGATGGTACAGGTAAGACTGGTTACACACTCGACACT
ACCAAGATGCAAATGTTCTACATGGATTACTCTTGGTATGGTGCTGGTTTTATTCGTTGG
GGTTTCCGTGCATTAAATGGTGATGTTATTTACGCTCATAAGATACCTAACAACAACCAG
AACACTGAAGCATACATGAGATCTGGTAACCTACCAGCTCGTTACGAAGTTAATACTATT
CCACCATCAACAACTGCTACTAGAACATTTGCTAGTGGTGATAGCACATTGTATGTTGCT
GATGATCTTTCTCAGTTCCCAACATCTGGTACACTTCGTGTCAAGCAATCTACTAGTGCT
ACTGCTGGTACTCAAGAGTATATAAATTACACTGGTAAAACATCTTTCGTTCAAGATATT
ATTAGCACTAACTCACGTAACGATACTATCACTGTTTCTTCTACTACAGGATTACACCAG
GCGGTCAACAAACATTATTTTTGA

>Translation of ORF number 1 in reading frame 1 on the direct strand.
VVATEAHNCTIDTVIDVRGVDDNAYNGTYNITSVVDPFTFKYTAGSTPTIATAGGEYTVT
PKGAFGTNLEIGMMDQQNGIFFRYANGELSVVRRSSTYQLSGKVNVTNGSTLVSSFTGIN
GQGTKFSKQLKPGDYVVIRGSSYRVDGIVSDTQMVIFPDYRGPTAGNVPVTKTTEVEWKQ
GDWNIDRCDGTGKTGYTLDTTKMQMFYMDYSWYGAGFIRWGFRALNGDVIYAHKIPNNNQ
NTEAYMRSGNLPARYEVNTIPPSTTATRTFASGDSTLYVADDLSQFPTSGTLRVKQSTSA
TAGTQEYINYTGKTSFVQDIISTNSRNDTITVSSTTGLHQAVNKHYF*

>ORF number 1 in reading frame 2 on the direct strand extends from base 740 to base 925.
GATCTGGTAACCTACCAGCTCGTTACGAAGTTAATACTATTCCACCATCAACAACTGCTA
CTAGAACATTTGCTAGTGGTGATAGCACATTGTATGTTGCTGATGATCTTTCTCAGTTCC
CAACATCTGGTACACTTCGTGTCAAGCAATCTACTAGTGCTACTGCTGGTACTCAAGAGT
ATATAA

>Translation of ORF number 1 in reading frame 2 on the direct strand.
DLVTYQLVTKLILFHHQQLLLEHLLVVIAHCMLLMIFLSSQHLVHFVSSNLLVLLLVLKS
I*

No ORFs were found in reading frame 3
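
With the "any codon" setting, an ORF does not need to begin with ATG: every stretch of codons between two in-frame stop codons is a candidate. The following is a minimal sketch of that idea, not the SMS ORF Finder implementation. The name `find_orfs` is hypothetical, and two points are assumptions: the minimum length is counted in codons including the stop codon, and a stretch that runs to the end of the read without a stop is also reported.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def find_orfs(seq, frame, min_codons=60):
    """Return 1-based (start, end) coordinates of ORFs in one reading frame.

    frame is 1, 2 or 3. An ORF is any run of codons ending at a stop codon
    (or at the end of the sequence), starting right after the previous stop.
    """
    seq = seq.upper()
    orfs = []
    start = frame - 1          # 0-based index of the current ORF start
    i = start
    while i + 3 <= len(seq):
        if seq[i:i + 3] in STOP_CODONS:
            n_codons = (i + 3 - start) // 3   # stop codon included (assumption)
            if n_codons >= min_codons:
                orfs.append((start + 1, i + 3))
            start = i + 3      # next ORF begins after this stop
        i += 3
    # Trailing stretch with no stop codon before the end of the read.
    if (i - start) // 3 >= min_codons:
        orfs.append((start + 1, i))
    return orfs


# Toy sequence (not from the read): ATG AAA TAG CCC GGG TTT TGA
toy = "ATGAAATAGCCCGGGTTTTGA"
print(find_orfs(toy, 1, min_codons=3))  # [(1, 9), (10, 21)]
print(find_orfs(toy, 2, min_codons=3))  # [(5, 19)]
```

Run on the full read with `min_codons=60`, a scan of this kind would be expected to report the same sort of intervals as listed above, but the exact boundaries depend on how the tool counts length, so the output should be compared with the SMS results rather than taken as equivalent.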



SMS ORF Finder/any codon/reading frames 1, 2 and 3/minimum length: 60/reverse strand/standard genetic code

No ORFs were found in reading frame 1.

>ORF number 1 in reading frame 2 on the reverse strand extends from base 104 to base 289.
GTTAGTGCTAATAATATCTTGAACGAAAGATGTTTTACCAGTGTAATTTATATACTCTTG
AGTACCAGCAGTAGCACTAGTAGATTGCTTGACACGAAGTGTACCAGATGTTGGGAACTG
AGAAAGATCATCAGCAACATACAATGTGCTATCACCACTAGCAAATGTTCTAGTAGCAGT
TGTTGA

>Translation of ORF number 1 in reading frame 2 on the reverse strand.
VSANNILNERCFTSVIYILLSTSSSTSRLLDTKCTRCWELRKIISNIQCAITTSKCSSSS
C*

>ORF number 2 in reading frame 2 on the reverse strand extends from base 404 to base 697.
TGCACGGAAACCCCAACGAATAAAACCAGCACCATACCAAGAGTAATCCATGTAGAACAT
TTGCATCTTGGTAGTGTCGAGTGTGTAACCAGTCTTACCTGTACCATCACAACGGTCAAT
GTTCCAGTCACCTTGTTTCCATTCTACTTCTGTAGTCTTAGTTACAGGTACATTACCTGC
TGTAGGACCACGATAGTCAGGGAAAATAACCATCTGAGTATCAGATACGATACCATCAAC
ACGATAAGAAGAACCACGGATGACAACATAGTCACCTGGTTTCAACTGTTTTGA

>Translation of ORF number 2 in reading frame 2 on the reverse strand.
CTETPTNKTSTIPRVIHVEHLHLGSVECVTSLTCTITTVNVPVTLFPFYFCSLSYRYITC
CRTTIVRENNHLSIRYDTINTIRRTTDDNIVTWFQLF*

>ORF number 3 in reading frame 2 on the reverse strand extends from base 812 to base 1075.
TTCACCGTTCGCATATCTAAAGAAGATACCATTTTGCTGATCCATCATACCAATCTCAAG
ATTGGTTCCGAAAGCACCCTTAGGTGTTACTGTATATTCACCACCTGCTGTTGCAATAGT
TGGTGTAGAACCTGCAGTATATTTAAATGTAAATGGATCAACTACAGATGTAATATTATA
TGTTCCGTTGTATGCGTTGTCATCAACACCACGAACATCAATAACTGTATCAATTGTACA
GTTGTGTGCTTCAGTAGCTACAAC

>Translation of ORF number 3 in reading frame 2 on the reverse strand.
FTVRISKEDTILLIHHTNLKIGSESTLRCYCIFTTCCCNSWCRTCSIFKCKWINYRCNII
CSVVCVVINTTNINNCINCTVVCFSSYN

>ORF number 1 in reading frame 3 on the reverse strand extends from base 852 to base 1073.
TCCATCATACCAATCTCAAGATTGGTTCCGAAAGCACCCTTAGGTGTTACTGTATATTCA
CCACCTGCTGTTGCAATAGTTGGTGTAGAACCTGCAGTATATTTAAATGTAAATGGATCA
ACTACAGATGTAATATTATATGTTCCGTTGTATGCGTTGTCATCAACACCACGAACATCA
ATAACTGTATCAATTGTACAGTTGTGTGCTTCAGTAGCTACA

>Translation of ORF number 1 in reading frame 3 on the reverse strand.
SIIPISRLVPKAPLGVTVYSPPAVAIVGVEPAVYLNVNGSTTDVILYVPLYALSSTPRTS
ITVSIVQLCASVAT
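
The reverse-strand ORFs above are numbered along the reverse complement of the read, so their coordinates cannot be compared directly with those of the direct-strand ORFs. The sketch below shows the conversion under one assumption: a read length of 1075 bases, which is the last coordinate reported for the reverse strand. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def to_direct_coords(start, end, seq_len):
    """Map a 1-based interval on the reverse strand to direct-strand coordinates."""
    return seq_len - end + 1, seq_len - start + 1


READ_LENGTH = 1075  # assumed from the last reverse-strand coordinate above

# The first 11 bases of the read give the last 11 bases of reverse ORF number 3.
print(reverse_complement("GTTGTAGCTAC"))         # GTAGCTACAAC

# Reverse frame 2, ORF number 3 (bases 812 to 1075) on the direct strand:
print(to_direct_coords(812, 1075, READ_LENGTH))  # (1, 264)
```

Under that assumption, every reverse-strand ORF listed here maps to a region inside direct-strand ORF number 1 (bases 1 to 1044), which is what one expects for spurious open frames on the strand opposite a long coding region.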
