GOS 1288010

From Metagenes
Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : JCVI_READ_1091120585933
Annotathon code: GOS_1288010
Sample :
  • GPS :10°42'59n; 80°15'16w
  • Caribbean Sea: Northeast of Colón - Panama
  • Coastal (-1.7m, 27.7°C, 0.1-0.8 microns)
Authors
Team : Biochimie 2010
Username : leaman
Annotated on : 2010-06-24 13:53:29
  • BLAIN Emmanuelle
  • LECOMTE Lea

Synopsis

Genomic Sequence

>JCVI_READ_1091120585933 GOS_1288010 Genomic DNA
CCCGCGAGGAGACCCACTCTCAGAACTCTGTGTCTAAGCTCCAAATGGGCATCATCGACAGAATCCTCCCAAACCAGTCCTTGCTTCATTACTACACACG
ATGAGGTGAATGAAAATAATCCTTAGTTTATTTGAAGCGCTAGAGCCCTTTGGATATACCATGGAGGAGGCTAGTGGGCCTGAATGGAGGATTCACGGAT
TAGTCGGTTTGGCCCTACTAATCAATGTAGTATTCCTGAAGCTATCGATTGGTGGGCCATGGGATTCAGAGACATTTACTTTAGGCCTGATAGGCTCGAT
TTCCCTGGCCTTATTCTATGTGTCTTGGTATAGATTAACGTTCAGAAGGAGAGGACTAATTCCTTGGGTTGATCTCTGGAAGGAACCCTCTTCATCGGCT
AAGAAGGGGCTATTATCCTCTCTCATAGTTCTCTCCATGGCTTGGCTTTCTGGTAATCATCTGCAGCACATACTTCCGACCCCCACAGGATTGGTCTTGT
CCCTGATCGGTTTCCTGATGCTGACTCAGTCAGCATATGTACTGATGAGTATAGGACCTTTGTCAGATGACTAGCGTTTAGACCAAATCTCAAGCCATTT
TGTGATATTCCTCTCAACGTCTTCTTCAGACACTTCCCTGCTCTCAACAGAGCCGTCTGTAATCGCCTTGACCCTGCATCCACATGCATTTGCGTGCCCT
CCTCCGTCTGGATCAAAATAAGTCGCTAGCCGTGTTAAGTCGAAGATTCCTCCTTCTTTATGAAGGAACGAGTTCGTGTAGAATGAGGCCGAAACAGGAT
ATCTCCCCTCTTCCCCGAAAGATGCTCCGATGTCGCCGTGAATCACGATACAAGCATCACATGAATCCCCCGCAAAAGCCGTAATATGGTACCCATTGGA
ACGCATCCCCAAACCATCTA

Translation

[110 - 571/920]   direct strand
>GOS_1288010 Translation [110-571   direct strand]
MKIILSLFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGLIGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKG
LLSSLIVLSMAWLSGNHLQHILPTPTGLVLSLIGFLMLTQSAYVLMSIGPLSDD

Annotator commentaries

  • Our genomic DNA sequence GOS_1288010 comes from the Caribbean Sea; it is 920 base pairs long.


  • To begin, we searched for the different ORFs, and one putative ORF, longer than the others, stood out. This putative ORF lies in reading frame 2 on the direct strand; it runs from base 110 to base 574, i.e. 465 bases (574 - 110 + 1) including the stop codon. The ORF appears to be coding because it is fairly long. Our analysis of initiation codons shows a methionine at the start of the sequence that could be the initiation codon. The multiple alignment supports this hypothesis, since nearly all of the selected homologous sequences begin with a methionine, and the ORF ends with a stop codon.


  • This ORF would encode a protein of 154 aa with a molecular weight of 17.05 kDa.

This protein has catalogued homologs, but none of them has been characterized, so we have no information about its biological function.

Nevertheless, it is predicted to have 4 transmembrane domains and a signal peptide, which suggests a transmembrane protein (e.g. a receptor or a transport protein). The signal peptide could act as a signal receptor or as a targeting peptide that is cleaved afterwards.


  • Since the homologs of this protein have not been characterized, we cannot define a study group or a taxonomic group.


  • After aligning the presumed homologous sequences and building the trees, we can suppose that our genomic DNA sequence GOS_1288010 is close to sequences gi_143590958 and gi_143656696. These two sequences are the top two hits of our BLAST search, with the highest scores and the lowest E-values. They were all sampled in the same region. We can therefore infer that our putative protein also comes from a marine metagenome and that it probably has the same function as these two sequences.







ORF finding

PROTOCOL:

a) any codon / SMS ORF Finder / direct strand / frames 1, 2 & 3 / min. 60 aa / standard genetic code

b) any codon / SMS ORF Finder / reverse strand / frames 1, 2 & 3 / min. 60 aa / standard genetic code

c) ATG / SMS ORF Finder / direct strand / frame 2 / min. 60 aa / bacterial genetic code (same result with the standard code)

d) Molecular mass / SMS / Protein Molecular Weight

e) any codon / SMS ORF Finder / direct strand / frames 1, 2 & 3 / min. 60 aa / bacterial genetic code

f) any codon / SMS ORF Finder / reverse strand / frames 1, 2 & 3 / min. 60 aa / bacterial genetic code
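
The ORF search described in this protocol can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SMS ORF Finder implementation: the names `reverse_complement` and `find_orfs` are hypothetical, only the three stop codons TAA, TAG and TGA are used, open stretches that reach the end of the read without a stop are ignored, and the minimum length is counted in codons including the stop, which may differ from the tool's own convention.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def find_orfs(dna, min_codons=60, require_atg=False):
    """List (frame, start, end) for stop-terminated ORFs on one strand.

    Coordinates are 1-based and inclusive on the strand passed in, and
    `end` includes the stop codon. With require_atg=False an ORF opens at
    the start of the frame or right after a stop ("any codon" mode).
    """
    orfs = []
    for frame in range(3):
        # 0-based index where the current open stretch begins (None = closed).
        start = None if require_atg else frame
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if codon in STOP_CODONS:
                if start is not None and (i + 3 - start) // 3 >= min_codons:
                    orfs.append((frame + 1, start + 1, i + 3))
                start = None if require_atg else i + 3
            elif start is None and codon == "ATG":
                start = i
    return orfs


# Synthetic example: a stop, then ATG + 70 codons + stop, all in frame 2.
toy = "A" + "TGA" + "ATG" + "GCT" * 70 + "TAG" + "AA"
print(find_orfs(toy))                      # direct strand, any codon
print(find_orfs(toy, require_atg=True))    # direct strand, ATG start only
print(find_orfs(reverse_complement(toy)))  # reverse strand
```

Under these conventions, running `find_orfs` on the 920-base read should list the frame 2 ORF of this annotation as `(2, 110, 574)`, because a TGA stop occupies bases 107-109 and the TAG stop occupies bases 572-574.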




ANALYSIS OF RESULTS:


We began our analysis with the standard genetic code and then compared it with the bacterial genetic code; the results are identical (the results for the bacterial genetic code are given in the raw results).


a) Direct strand

- No ORF is found in frame 1

- A putative ORF is found in frame 2; it runs from base 110 to base 574 and is 465 bases long

- A putative ORF is found in frame 3; it runs from base 672 to base 884 and is 213 bases long



> We therefore find 2 putative ORFs of more than 60 codons with no internal stop codon; the longest ORF is the one in frame 2


b) Reverse strand

- A putative ORF is found in frame 1; it runs from base 196 to base 498 and is 303 bases long

- No ORF is found in frame 2

- A putative ORF is found in frame 3; it runs from base 3 to base 350 and is 348 bases long



> We find 2 putative ORFs of more than 60 codons with no internal stop codon. The longest ORF is in frame 3; it is 348 bases long.


_> We think that the ORF to consider is the 465-base ORF on the direct strand in frame 2, because it is the longest. The other ORFs do not seem insignificant, however, since they all exceed 60 codons and contain no internal stop codon, but we were asked to choose the longest.


To refine our search, we repeat the direct-strand analysis with the ATG initiation codon option, because the "any codon" analysis returns a sequence that starts immediately after a stop codon. We thus consider the case where our putative protein starts with a methionine.



c) Direct strand, ATG initiation codon


> We find an ORF in frame 2 on the direct strand of 465 bases, i.e. 155 codons and 154 aa. It starts with an ATG, which may be the initiation codon, ends with a stop codon and contains no internal stop codon, so we can suppose that this ORF is coding.
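
The figures for this ORF follow from its coordinates. The short sketch below (the helper name `orf_stats` is hypothetical) counts both end points, so the length is end - start + 1, and assumes that the last codon of the span is the stop codon.

```python
def orf_stats(start, end):
    """Summarise an ORF from 1-based inclusive coordinates (stop codon included)."""
    bases = end - start + 1              # both end points are counted
    codons, remainder = divmod(bases, 3)
    return {
        "bases": bases,
        "codons": codons,                # includes the stop codon
        "residues": codons - 1,          # the stop codon encodes no amino acid
        "in_frame": remainder == 0,
    }


print(orf_stats(110, 574))
```

For the span 110 to 574 this gives 465 bases, 155 codons and 154 residues, in agreement with the 154-aa translation.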


d) Our putative ORF appears to be complete, so we compute its molecular weight with SMS: 17.05 kDa


e) and f) ---> identical to a) and b); details are given in the raw results.


Conclusion:


  • Our nucleotide sequence is probably coding, since we obtain several fairly long ORFs.
  • Our analyses lead us to choose the ORF on the direct strand in frame 2 because it is the longest. It starts with a methionine, which may be the initiation codon; our next analyses may allow us to confirm this hypothesis.
  • Our putative ORF is complete, since it ends with a stop codon at positions 572-574; we can therefore compute its molecular weight, which is 17.05 kDa.
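
The molecular weight can be approximated by summing average residue masses and adding one water molecule for the free termini. This is a sketch with commonly tabulated average masses; the SMS Protein Molecular Weight tool may use slightly different values, and the names `RESIDUE_MASS` and `molecular_weight_kda` are hypothetical.

```python
# Average residue masses in daltons (amino acid minus one water molecule).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the N- and C-termini


def molecular_weight_kda(protein):
    """Approximate average molecular weight of a protein, in kDa."""
    return (sum(RESIDUE_MASS[aa] for aa in protein) + WATER) / 1000.0


# Translation of the frame 2 ORF, copied from the raw results (stop removed).
PROTEIN = (
    "MKIILSLFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGL"
    "IGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQH"
    "ILPTPTGLVLSLIGFLMLTQSAYVLMSIGPLSDD"
)

print(len(PROTEIN), "residues")
print(round(molecular_weight_kda(PROTEIN), 2), "kDa")
```

With these masses the result should come out close to the 17.05 kDa reported by SMS.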
RAW RESULTS:
a) ORFs, direct strand, frames 1, 2 & 3

No ORFs were found in reading frame 1.

>ORF number 1 in reading frame 2 on the direct strand extends from base 110 to base 574.
ATGAAAATAATCCTTAGTTTATTTGAAGCGCTAGAGCCCTTTGGATATACCATGGAGGAG
GCTAGTGGGCCTGAATGGAGGATTCACGGATTAGTCGGTTTGGCCCTACTAATCAATGTA
GTATTCCTGAAGCTATCGATTGGTGGGCCATGGGATTCAGAGACATTTACTTTAGGCCTG
ATAGGCTCGATTTCCCTGGCCTTATTCTATGTGTCTTGGTATAGATTAACGTTCAGAAGG
AGAGGACTAATTCCTTGGGTTGATCTCTGGAAGGAACCCTCTTCATCGGCTAAGAAGGGG
CTATTATCCTCTCTCATAGTTCTCTCCATGGCTTGGCTTTCTGGTAATCATCTGCAGCAC
ATACTTCCGACCCCCACAGGATTGGTCTTGTCCCTGATCGGTTTCCTGATGCTGACTCAG
TCAGCATATGTACTGATGAGTATAGGACCTTTGTCAGATGACTAG

>Translation of ORF number 1 in reading frame 2 on the direct strand.
MKIILSLFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGL
IGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQH
ILPTPTGLVLSLIGFLMLTQSAYVLMSIGPLSDD*

>ORF number 1 in reading frame 3 on the direct strand extends from base 672 to base 884.
CCCTGCATCCACATGCATTTGCGTGCCCTCCTCCGTCTGGATCAAAATAAGTCGCTAGCC
GTGTTAAGTCGAAGATTCCTCCTTCTTTATGAAGGAACGAGTTCGTGTAGAATGAGGCCG
AAACAGGATATCTCCCCTCTTCCCCGAAAGATGCTCCGATGTCGCCGTGAATCACGATAC
AAGCATCACATGAATCCCCCGCAAAAGCCGTAA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
PCIHMHLRALLRLDQNKSLAVLSRRFLLLYEGTSSCRMRPKQDISPLPRKMLRCRRESRY
KHHMNPPQKP*

b) ORFs, reverse strand, frames 1, 2 & 3


>ORF number 1 in reading frame 1 on the reverse strand extends from base 196 to base 498.
CGACTTATTTTGATCCAGACGGAGGAGGGCACGCAAATGCATGTGGATGCAGGGTCAAGG
CGATTACAGACGGCTCTGTTGAGAGCAGGGAAGTGTCTGAAGAAGACGTTGAGAGGAATA
TCACAAAATGGCTTGAGATTTGGTCTAAACGCTAGTCATCTGACAAAGGTCCTATACTCA
TCAGTACATATGCTGACTGAGTCAGCATCAGGAAACCGATCAGGGACAAGACCAATCCTG
TGGGGGTCGGAAGTATGTGCTGCAGATGATTACCAGAAAGCCAAGCCATGGAGAGAACTA
TGA

>Translation of ORF number 1 in reading frame 1 on the reverse strand.
RLILIQTEEGTQMHVDAGSRRLQTALLRAGKCLKKTLRGISQNGLRFGLNASHLTKVLYS
SVHMLTESASGNRSGTRPILWGSEVCAADDYQKAKPWREL*

No ORFs were found in reading frame 2.

>ORF number 1 in reading frame 3 on the reverse strand extends from base 3 to base 350.
GATGGTTTGGGGATGCGTTCCAATGGGTACCATATTACGGCTTTTGCGGGGGATTCATGT
GATGCTTGTATCGTGATTCACGGCGACATCGGAGCATCTTTCGGGGAAGAGGGGAGATAT
CCTGTTTCGGCCTCATTCTACACGAACTCGTTCCTTCATAAAGAAGGAGGAATCTTCGAC
TTAACACGGCTAGCGACTTATTTTGATCCAGACGGAGGAGGGCACGCAAATGCATGTGGA
TGCAGGGTCAAGGCGATTACAGACGGCTCTGTTGAGAGCAGGGAAGTGTCTGAAGAAGAC
GTTGAGAGGAATATCACAAAATGGCTTGAGATTTGGTCTAAACGCTAG

>Translation of ORF number 1 in reading frame 3 on the reverse strand.
DGLGMRSNGYHITAFAGDSCDACIVIHGDIGASFGEEGRYPVSASFYTNSFLHKEGGIFD
LTRLATYFDPDGGGHANACGCRVKAITDGSVESREVSEEDVERNITKWLEIWSKR*


c) ORF, direct strand, frame 2, ATG codon


>ORF number 1 in reading frame 2 on the direct strand extends from base 110 to base 574.
ATGAAAATAATCCTTAGTTTATTTGAAGCGCTAGAGCCCTTTGGATATACCATGGAGGAG
GCTAGTGGGCCTGAATGGAGGATTCACGGATTAGTCGGTTTGGCCCTACTAATCAATGTA
GTATTCCTGAAGCTATCGATTGGTGGGCCATGGGATTCAGAGACATTTACTTTAGGCCTG
ATAGGCTCGATTTCCCTGGCCTTATTCTATGTGTCTTGGTATAGATTAACGTTCAGAAGG
AGAGGACTAATTCCTTGGGTTGATCTCTGGAAGGAACCCTCTTCATCGGCTAAGAAGGGG
CTATTATCCTCTCTCATAGTTCTCTCCATGGCTTGGCTTTCTGGTAATCATCTGCAGCAC
ATACTTCCGACCCCCACAGGATTGGTCTTGTCCCTGATCGGTTTCCTGATGCTGACTCAG
TCAGCATATGTACTGATGAGTATAGGACCTTTGTCAGATGACTAG

>Translation of ORF number 1 in reading frame 2 on the direct strand.
MKIILSLFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGL
IGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQH
ILPTPTGLVLSLIGFLMLTQSAYVLMSIGPLSDD*


d) ----> 17.05 kDa


e) ORFs, direct strand, frames 1, 2 & 3 (bacterial genetic code)


No ORFs were found in reading frame 1.

>ORF number 1 in reading frame 2 on the direct strand extends from base 110 to base 574.
ATGAAAATAATCCTTAGTTTATTTGAAGCGCTAGAGCCCTTTGGATATACCATGGAGGAG
GCTAGTGGGCCTGAATGGAGGATTCACGGATTAGTCGGTTTGGCCCTACTAATCAATGTA
GTATTCCTGAAGCTATCGATTGGTGGGCCATGGGATTCAGAGACATTTACTTTAGGCCTG
ATAGGCTCGATTTCCCTGGCCTTATTCTATGTGTCTTGGTATAGATTAACGTTCAGAAGG
AGAGGACTAATTCCTTGGGTTGATCTCTGGAAGGAACCCTCTTCATCGGCTAAGAAGGGG
CTATTATCCTCTCTCATAGTTCTCTCCATGGCTTGGCTTTCTGGTAATCATCTGCAGCAC
ATACTTCCGACCCCCACAGGATTGGTCTTGTCCCTGATCGGTTTCCTGATGCTGACTCAG
TCAGCATATGTACTGATGAGTATAGGACCTTTGTCAGATGACTAG

>Translation of ORF number 1 in reading frame 2 on the direct strand.
MKIILSLFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGL
IGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQH
ILPTPTGLVLSLIGFLMLTQSAYVLMSIGPLSDD*

>ORF number 1 in reading frame 3 on the direct strand extends from base 672 to base 884.
CCCTGCATCCACATGCATTTGCGTGCCCTCCTCCGTCTGGATCAAAATAAGTCGCTAGCC
GTGTTAAGTCGAAGATTCCTCCTTCTTTATGAAGGAACGAGTTCGTGTAGAATGAGGCCG
AAACAGGATATCTCCCCTCTTCCCCGAAAGATGCTCCGATGTCGCCGTGAATCACGATAC
AAGCATCACATGAATCCCCCGCAAAAGCCGTAA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
PCIHMHLRALLRLDQNKSLAVLSRRFLLLYEGTSSCRMRPKQDISPLPRKMLRCRRESRY
KHHMNPPQKP*

f) ORFs, reverse strand, frames 1, 2 & 3 (bacterial genetic code)


>ORF number 1 in reading frame 1 on the reverse strand extends from base 196 to base 498.
CGACTTATTTTGATCCAGACGGAGGAGGGCACGCAAATGCATGTGGATGCAGGGTCAAGG
CGATTACAGACGGCTCTGTTGAGAGCAGGGAAGTGTCTGAAGAAGACGTTGAGAGGAATA
TCACAAAATGGCTTGAGATTTGGTCTAAACGCTAGTCATCTGACAAAGGTCCTATACTCA
TCAGTACATATGCTGACTGAGTCAGCATCAGGAAACCGATCAGGGACAAGACCAATCCTG
TGGGGGTCGGAAGTATGTGCTGCAGATGATTACCAGAAAGCCAAGCCATGGAGAGAACTA
TGA

>Translation of ORF number 1 in reading frame 1 on the reverse strand.
RLILIQTEEGTQMHVDAGSRRLQTALLRAGKCLKKTLRGISQNGLRFGLNASHLTKVLYS
SVHMLTESASGNRSGTRPILWGSEVCAADDYQKAKPWREL*

No ORFs were found in reading frame 2.

>ORF number 1 in reading frame 3 on the reverse strand extends from base 3 to base 350.
GATGGTTTGGGGATGCGTTCCAATGGGTACCATATTACGGCTTTTGCGGGGGATTCATGT
GATGCTTGTATCGTGATTCACGGCGACATCGGAGCATCTTTCGGGGAAGAGGGGAGATAT
CCTGTTTCGGCCTCATTCTACACGAACTCGTTCCTTCATAAAGAAGGAGGAATCTTCGAC
TTAACACGGCTAGCGACTTATTTTGATCCAGACGGAGGAGGGCACGCAAATGCATGTGGA
TGCAGGGTCAAGGCGATTACAGACGGCTCTGTTGAGAGCAGGGAAGTGTCTGAAGAAGAC
GTTGAGAGGAATATCACAAAATGGCTTGAGATTTGGTCTAAACGCTAG

>Translation of ORF number 1 in reading frame 3 on the reverse strand.
DGLGMRSNGYHITAFAGDSCDACIVIHGDIGASFGEEGRYPVSASFYTNSFLHKEGGIFD
LTRLATYFDPDGGGHANACGCRVKAITDGSVESREVSEEDVERNITKWLEIWSKR*



Multiple Alignment

PROTOCOL:


a) Multiple alignment with MUSCLE / alignment curation with Gblocks / stringent selection option.

b) Multiple alignment with MUSCLE / alignment curation with Gblocks / stringent selection option / after removal of one sequence, ECQ21037.1 hypothetical protein GOS_3060423.


ANALYSIS OF RESULTS:


To begin, we selected 26 of our 47 homologous sequences for the multiple alignment. Since we could not define a study group or an outgroup, we selected the sequences according to how close their length is to that of our sequence, so that the alignment would be homogeneous.


Sequences chosen for the multiple alignment (26):


Accession   Description                                           Max score  Total score  Query cover  E-value
EDF75304.1  hypothetical protein GOS_900216 [marine metagenome]   301        301          99%          5e-81
EDG10892.1  hypothetical protein GOS_838951 [marine metagenome]   282        282          97%          3e-75
EBN01749.1  hypothetical protein GOS_8315127 [marine metagenome]  136        136          94%          2e-31
EBK69923.1  hypothetical protein GOS_8689362 [marine metagenome]  126        126          90%          2e-28
EBW77633.1  hypothetical protein GOS_6693422 [marine metagenome]  123        123          98%          2e-27
EDF64348.1  hypothetical protein GOS_919126 [marine metagenome]   122        122          90%          3e-27
EDF87642.1  hypothetical protein GOS_878966 [marine metagenome]   121        121          98%          8e-27
ECH45445.1  hypothetical protein GOS_5190370 [marine metagenome]  121        121          95%          9e-27
EDF80869.1  hypothetical protein GOS_890542 [marine metagenome]   118        118          89%          7e-26
ECF72060.1  hypothetical protein GOS_5109402 [marine metagenome]  97.8       97.8         97%          1e-19
ECQ21037.1  hypothetical protein GOS_3060423 [marine metagenome]  96.7       96.7         93%          3e-19
ECT42809.1  hypothetical protein GOS_6018401 [marine metagenome]  95.1       95.1         83%          6e-19
EBA68462.1  hypothetical protein GOS_360053 [marine metagenome]   94.0       94.0         87%          2e-18
EBN06076.1  hypothetical protein GOS_8308395 [marine metagenome]  90.9       90.9         84%          1e-17
EDI93593.1  hypothetical protein GOS_1783225 [marine metagenome]  88.6       88.6         91%          6e-17
ECU35447.1  hypothetical protein GOS_4689927 [marine metagenome]  88.2       88.2         91%          8e-17
EBT93118.1  hypothetical protein GOS_7192607 [marine metagenome]  88.2       88.2         97%          8e-17
EBD28922.1  hypothetical protein GOS_9953355 [marine metagenome]  87.8       87.8         88%          1e-16
EBU77893.1  hypothetical protein GOS_7007326 [marine metagenome]  85.9       85.9         91%          4e-16
EBZ66621.1  hypothetical protein GOS_4415766 [marine metagenome]  84.7       84.7         83%          8e-16
EDI82476.1  hypothetical protein GOS_1802492 [marine metagenome]  84.7       84.7         91%          1e-15
EBL11661.1  hypothetical protein GOS_8620954 [marine metagenome]  84.7       84.7         83%          1e-15
ECD20916.1  hypothetical protein GOS_4330322 [marine metagenome]  80.5       80.5         81%          2e-14
ECQ77423.1  hypothetical protein GOS_4323591 [marine metagenome]  68.6       68.6         80%          7e-11
ECA18118.1  hypothetical protein GOS_5887129 [marine metagenome]  77.0       77.0         90%          2e-13
ECM13022.1  hypothetical protein GOS_4126107 [marine metagenome]  76.6       76.6         89%          2e-13



Analysis of the multiple alignment:


  • In the Gblocks output, our putative protein begins after 17 gap columns (alignment column 18) with a methionine, like 5 other sequences, which does not allow us to confirm or reject our hypothesis about the initiation codon.


  • The program kept two blocks, one from position 93 to 142 and another from 173 to 186; with the parameters used, the flanking positions of a block must be conserved in at least 23 sequences.


  • Gblocks keeps 64 positions, i.e. 30% of the 211 positions of the alignment. These positions tell us which amino acids are identical, similar, close or different between our putative protein and the selected homologous sequences. This is a fairly high proportion, which suggests that we should be able to reconstruct the evolutionary history that separated these proteins with a phylogenetic tree.
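
The number of retained positions can be checked directly from the block flanks listed in the raw Gblocks output. The sketch below uses a hypothetical helper, `retained_positions`, and treats each flank pair as a 1-based inclusive interval; the flank values and the 211-column alignment length are copied from this page.

```python
def retained_positions(flanks):
    """Total number of alignment columns covered by inclusive (start, end) blocks."""
    return sum(end - start + 1 for start, end in flanks)


# Flanks reported by Gblocks for the first run (27 sequences).
run_a = [(93, 142), (173, 186)]
# Blocks quoted for the second run, after removal of one sequence.
run_b = [(49, 63), (67, 116), (147, 160)]

kept_a = retained_positions(run_a)
kept_b = retained_positions(run_b)
print(kept_a, kept_b)

# Share of the 211 original alignment columns kept in the first run.
print(round(100 * kept_a / 211), "%")
```

The two totals correspond to the 64 and 79 positions reported by Gblocks, and the first run keeps about 30% of the 211 alignment columns.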



Analysis of the multiple alignment after removal of one sequence:


  • The number of retained positions increases from 64 to 79, so we can expect our tree to be more precise. The conserved blocks have also changed; there are now 3: [49 63] [67 116] [147 160].


  • All or nearly all of the sequences begin with a methionine, though not all at the same position; we can therefore think that our hypothesis that the initiation codon is a methionine is correct.
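
The check on initiation codons amounts to finding the first non-gap residue of each aligned row. The sketch below does this for a few rows copied from the first block of the second Gblocks run; the helper name `first_residue` is hypothetical, and only the first 60 alignment columns are included.

```python
def first_residue(aligned_row):
    """Return (column, residue) of the first non-gap character, or None."""
    for column, char in enumerate(aligned_row, start=1):
        if char != "-":
            return column, char
    return None


# Rows copied from the first 60 columns of the curated alignment.
ROWS = {
    "GOS_1288010_Tra": "-------------MKIILSLFEALE--PFGYTMEE--------ASGPEWRIHGLVGLALL",
    "gi|143590958|gb": "-------------MKIILSLFEALE--PFGYTMEE--------ASGPEWRIHGLVGLALL",
    "gi|143656696|gb": "----------------MLSLFEALD--PFGHTMEE--------ASGPEWRIHGLVGAALL",
    "gi|140532212|gb": "-------VFWPHNLSQRIGCFQALV--IIGTYMAELTPSK---EQGPPWQGLAIIGVILV",
    "gi|139786306|gb": "--------------------IITAA--RPAHYMVS--------QSEPQWKPHAVAAGLLI",
}

for name, row in ROWS.items():
    column, residue = first_residue(row)
    note = "starts with M" if residue == "M" else "does not start with M"
    print(name, column, residue, note)
```

Rows that do not start with M may simply be partial sequences from metagenomic reads, so this check alone cannot settle the position of the initiation codon.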
RAW RESULTS:


a) Gblocks curation:

Gblocks 0.91b Results

Processed file: input.fasta
Number of sequences: 27
Alignment assumed to be: Protein
New number of positions: 64 (selected positions are underlined in blue)

                         10        20        30        40        50        60
                 =========+=========+=========+=========+=========+=========+
gi|134330123|gb  ----------------------------------------------MEEDAAWV------
gi|136274652|gb  -------------------------------MSCLYMGE----------EPEWR------
gi|141183798|gb  MCTGWQHSTSQPARDNNCIFNHAHIESQRVQPFKHNPTC-----VVMANEPEWR------
gi|135979845|gb  -----------MLPPKPAATITASLTAVIGNPRGCNLSRITHHGLIMGTVPEWR------
gi|139070491|gb  -------MVGNILPPRPAATITASLITVINNPRGYNLSRITHHSVTMARVPEWR------
gi|137407868|gb  -------MVGSILLPNPAATITASLTTVIENPKGYNLSSITHHGLHMATIPEWR------
gi|144109928|gb  -------MVGSILPPNPAATITASFTAVIKNPKGYNLSSITHHAVIMATTPEWR------
gi|137564391|gb  -----------MLLPNPAATITASLTTVINNPKGYNLSSITHHGVIMATVPEWR------
gi|144094326|gb  -----------MLLPNPAATITASLTTVINNPKGYNLSSITHHGVIMATVPEWR------
gi|139533243|gb  -------MVGSILLPNPAATITASLTTVIENPKGYNLSSITHHGVIMATVPEWR------
gi|141945933|gb  -------MVGSILLPNPAATITASLTTVIENPKGYNLSSITHHGVIMATVPEWR------
gi|138523308|gb  -----------MLWPHKISQNIGCFQALV--IIGTHMAELNASK---AQGPPWQ------
gi|140532212|gb  -----------VFWPHNLSQRIGCFQALV--IIGTYMAELTPSK---EQGPPWQ------
gi|141265812|gb  -----------------VKALAGNIRVPS--PAAMRTAS--------RTGSMRRHTRHRL
gi|138436847|gb  -----------------------------------------------MAEVRWG------
gi|135918067|gb  --------------------------MCY--RLGIVMTE--------DKAPDWI------
Translation_of   -----------------MKIILSLFEALE--PFGYTMEE--------ASGPEWR------
gi|143590958|gb  -----------------MKIILSLFEALE--PFGYTMEE--------ASGPEWR------
gi|143656696|gb  --------------------MLSLFEALD--PFGHTMEE--------ASGPEWR------
gi|137923410|gb  -----------------MTFLLSRFEGQI--PVGSIVGN--------ETEVNGV------
gi|143615374|gb  -----------------MTFLLSRFEGQI--PVGSIVGN--------ETEVNGV------
gi|141813602|gb  ----------------------------------------------MGAKAEWR------
gi|134766806|gb  --------------------------------------------MHMGEKPEWR------
gi|136268459|gb  -----------------MTPSLRGFQAQT--PPRPYMEA--------KAAPEWR------
gi|143601414|gb  ------------------------MITTA--RPDHSMGG--------QSEPRWK------
gi|143561525|gb  ------------------------MITAA--RPGHSMVS--------QSEPQWK------
gi|139786306|gb  ------------------------IITAA--RPAHYMVS--------QSEPQWK------
                                                                             


                         70        80        90       100       110       120
                 =========+=========+=========+=========+=========+=========+
gi|134330123|gb  ----------------GMGIFGLIIMIDVLIIGWAPNGPWNDDSFSLGVLGMIGVASLYL
gi|136274652|gb  ----------------LLGLFGVICLIQTFL-DITPAGPWDSRSFSRGAVGLIGLVCIYI
gi|141183798|gb  ----------------WYTLAGLLMVVQTLF-DVAPSGPWDAPSFTRGIVGLVGLCCLYI
gi|135979845|gb  ----------------WLVVIGLALVGQTLF-DVAPEGPWGASSFTRGVIGLSGLVCIYL
gi|139070491|gb  ----------------WVTIAGFVMVIQTMF-DVAPNGPWNAPSFTRGVIGLGGLCCLYI
gi|137407868|gb  ----------------WITLAGLVMVAQTMF-DIAPEGPWGATSFTRGLIGLCGLCCLYV
gi|144109928|gb  ----------------WFTITGLIMVAQTMV-DLAPEGPWGASSFTRGLIGLCGLCCLYV
gi|137564391|gb  ----------------WVTLAGLVMVAQTMV-DLAPEGPWGASSFTRGLIGLCGLCCLYI
gi|144094326|gb  ----------------WVTVAGLVMVAQTMV-DLAPEGPWGASSFTRGLIGLCGLCCLYI
gi|139533243|gb  ----------------WITVAGLVMVVQTMF-DLAPEGPWGASSFTRGLIGLCGLCCLYV
gi|141945933|gb  ----------------LITVAGLVMVAQTMF-DLAPEGPWGASSFTRGLIGLCGLCCLYV
gi|138523308|gb  ----------------GLAIIGVILIAQTFT-DFGWKGPWNDVSFTQGSLGLVGGILIYL
gi|140532212|gb  ----------------GLAIIGVILVAQTFT-DFGWKGPWNDVSFTQGSIGLVGGILLYL
gi|141265812|gb  SSVALQHEVMERKDTVGLLVAGVLLILQSFT-DLAPDGPWDSPSFTRGVLGLVGMVLVYL
gi|138436847|gb  ----------------AWGIAGILLVAQTFL-DLVPSGPWGNGAMGTGFLGLAGFGCLYV
gi|135918067|gb  ----------------KFLFIGFILLINTIFIKISSPWPWGSESFTLGVIGLIGLVMLYI
Translation_of   ----------------IHGLVGLALLINVVFLKLSIGGPWDSETFTLGLIGSISLALFYV
gi|143590958|gb  ----------------IHGLVGLALLINVVFLKLSIGGPWDSETFTLGLIGSISLALFYV
gi|143656696|gb  ----------------IHGLVGAALLINVVFLKISIDGPWDSETFTLGLIGSVSIALFYV
gi|137923410|gb  ----------------TPAIIGSILLLNTMILGFSFPGPWDSESFTLGIIGMTGLGFWYV
gi|143615374|gb  ----------------TPAIIGSILLLNTMILGFSFPGPWDSESFTLGIIGMTGLGFWYV
gi|141813602|gb  ----------------VFAAVGTVLVAITFT-NWSPEGPWNDSTFTSGSFGLTGMMLLYL
gi|134766806|gb  ----------------GFAAIGTLLVAITFT-DLSPKGPWSDETFTSGSIGLIGLSFIYL
gi|136268459|gb  ----------------AHASIGSILLLNTLTVKLAPAGPWGAESFTLGVIGLIGLVFLYT
gi|143601414|gb  ----------------PHAAAAGVLILDALVLGVAPSGPWDDSSFSRGVIGLVGACLAYV
gi|143561525|gb  ----------------PHAVAAVLLILDALVFGVAPSGPWDDSSFSRGVIGLVGASLAYV
gi|139786306|gb  ----------------PHAVAAGLLILDALVIGVAPGGPWDDSSFSRGLIGLVGACLAYV
                                                 ############################


                        130       140       150       160       170       180
                 =========+=========+=========+=========+=========+=========+
gi|134330123|gb  SWYRWRFKQKGVVPWLRLWKDVRSGGMKVTIVGLMVMLGTWFLYA---SGPFPPAGGLIL
gi|136274652|gb  AWFRFTFKQKGLIPTIGVLKNPKKSWIYVMTFGIACYLFVILNQKMSFEEYFPETTGMIV
gi|141183798|gb  AWFRFTFKSNGIIPSINRWQNPQQSWIKVLIFSLVCFVIVGILQINKIEDKLPETAGMIV
gi|135979845|gb  GWFRFTFKQSGIIPSINRWQKPEASWKLVVLFSCLCLIALAVINNTNLSDNLPATTGMIV
gi|139070491|gb  GWFRFTFRQSGIIPSINRWQQPEKSWKYVVLFSLFCMLTLPLINSPNLADSLPETTGMII
gi|137407868|gb  GWFRYTFRQNGIIPSINRWREPKRSWKLVLSFSFVCIGLVLAINNTDLKDSLPETSGMIL
gi|144109928|gb  GWFRVTFRQSGIIPSINRWRQPKKSWKLVLIFSFTCMLLVYGINNTELKNILPETSGMIL
gi|137564391|gb  GWFRFTFRQSGIIPSINRWRQPEKSWKLVFGFSLVCIVLVFGINNSDLKDTLPETSGMIL
gi|144094326|gb  GWFRFTFRQSGIIPSINRWRQPEKSWKLVFGFSLVCIVLVYAINNSGLKDTLPETSGMIL
gi|139533243|gb  GWFRFTFRQSGIIPSINRWRQPEKSWKLVLIFSLSCISLVYLINNTNLKDVLPETSGMIL
gi|141945933|gb  GWFRFTFRQSGIIPSINRWRKPEKSWKLVMIFSFACILLVYMINNTDLKGILPETSGMIL
gi|138523308|gb  AWFRWHFKINGLIPTLDRWKNPKKGMLNLFIFGIIMVIITWLVGG-PFSEIFPRPTGMLL
gi|140532212|gb  SWFRWHFKINGLIPTLDRWKNPKKGMLNLFIFGIVMTIITWLVGG-PFSEIFPRPTGMIL
gi|141265812|gb  AWFRHTFGIFGVAPTVNRWATPETTWLRVVAFGLGCLVATRAIRLFDDSGVVPEPAGLLI
gi|138436847|gb  AWYRRTFSTPGLLPTSDLWKDPAGSWPKVVGAGVLFLLLSYLAGREEVDAWMPDTAGLVL
gi|135918067|gb  SWYRFTFRRRGLVPWLDLWKEPKESSIKLFLFSFIITIFSYVLGK--NQYFFPDPTSLIL
Translation_of   SWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGN-HLQHILPTPTGLVL
gi|143590958|gb  SWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGN-HLQHILPTPTGLVL
gi|143656696|gb  SWYRLTFRRRGLIPWVDLWKEPSSSAKKVLLSSLIVLSMAWLSGN-HLQHILPTPTGLVL
gi|137923410|gb  SWYRFTFNRKGLIPWLDNWKTPEKSSKQVLAVGVFTIILSWTAGN-PLQEYLPDPTGLVL
gi|143615374|gb  SWYRFTFNRKGLIPWLDNWKTPEKSSKQVLAVGVFTIILSWTAGN-PLQEYLPDPTGLVL
gi|141813602|gb  AWFRLTFESKGVVPTMDMWKDPEGTSPLVIGVGLVILGIAYAVGR---IDFFPGPAGLIL
gi|134766806|gb  AWFRLTFEKKGVVPTLDLWRDPGQTSLTVMGVGIVTLGIAYAVGR---IEFFPEPAGLIL
gi|136268459|gb  AWYRLTFKRKGLVPWMDMWENPKDSSRTVIVAGLAIIASAWLAGN-VVEEALPKPTGLIL
gi|143601414|gb  AWYRRTFKRKGLVPWIDLWEKPEESARLVLYASIGFLTISWIAGN-PMQPHLPDPTGLVL
gi|143561525|gb  AWYRRTFKRKGLVPWIDLWEKPEESARLVLYASIGFLAISWVAGN-PMQPHLPDPTGLIL
gi|139786306|gb  AWYRRTFNRKGLVPWIDLWEKPEESARLVLYASIGFLTISWIAGN-PMQPHLPEPTGLVL
                 ######################                              ########


                        190       200       210
                 =========+=========+=========+=
gi|134330123|gb  NLIGALMVLQGTYAILSS-GYLSE------S
gi|136274652|gb  LLIGSLSLLNSIYVWMVVSGPLKQDIVKEQE
gi|141183798|gb  MLVGCLTLLNSIYVGLVVSGPLNE--TTEEE
gi|135979845|gb  LLIASLSMLNGLYVGLVVSGPLSN--RSEEE
gi|139070491|gb  LLIASLAMLNGX-------------------
gi|137407868|gb  LLIASLSMLNAVYVSLVVSGPLSN--LPEEE
gi|144109928|gb  LLIASLSMLNGIYVGLVVSGPLKK--RSEEE
gi|137564391|gb  LLIASLAMLNAIYVGLVVSGPLSK--TTEEE
gi|144094326|gb  LLIASLSMLNAIYVGLVVSGPLSK--ITEEE
gi|139533243|gb  LLIASLSMLNAIYVGLVVLGPLSK--PDEEE
gi|141945933|gb  LLIASLSMLNAIYVGLVVS------------
gi|138523308|gb  GLISLLALLQFSYAWLYFNNYFLE-----EE
gi|140532212|gb  GLISLLTLLQVSYAWLYFNNYFLE-----EE
gi|141265812|gb  TLVGVLAIMNGLYVWAVTSGPLND----EEE
gi|138436847|gb  SLIGLLTVLNGLYVGAVI-GPLSE-----EE
gi|135918067|gb  SLIALLTFIQATYVLLSTTVLLDD-------
Translation_of   SLIGFLMLTQSAYVLMSI-GPLSD------D
gi|143590958|gb  SLIGFLMLTQSAYVLMSI-GPLSD------D
gi|143656696|gb  SLIGFLMLTQSAYVLMSI-GPLSD------D
gi|137923410|gb  LLIGLLISLSGAYSMLAL-GPLAD--SIADE
gi|143615374|gb  LLIGLLISLSGIYSMLAL-GPLAD--SIADE
gi|141813602|gb  SLVGLLVATNGVYVWLSSAGPLSR----EEE
gi|134766806|gb  SLIGLLVTTNGFYVWMSTEGPLSS----EEE
gi|136268459|gb  TLVGLLTLLNGVYVYLSV-GALSD-----LE
gi|143601414|gb  VLVGLLLGLQAVYVYLVI-GPLKE------E
gi|143561525|gb  VLVGLLLGLQAVYVYLVI-GPLKE------E
gi|139786306|gb  VLVGLLLGLQAVYVYLVI-GPLKE------E
                 ######                         






Parameters used
Minimum Number Of Sequences For A Conserved Position: 14
Minimum Number Of Sequences For A Flanking Position: 23
Maximum Number Of Contiguous Nonconserved Positions: 4
Minimum Length Of A Block: 10
Allowed Gap Positions: None
Use Similarity Matrices: Yes


Flank positions of the 2 selected block(s)
Flanks: [93  142]  [173  186]  

New number of positions in input.fasta-gb:  64  (30% of the original 211 positions)



b) Gblocks curation after removal of one sequence, ECQ21037.1 hypothetical protein GOS_3060423.

Gblocks 0.91b Results

Processed file: input.fasta
Number of sequences: 26
Alignment assumed to be: Protein
New number of positions: 79 (selected positions are underlined in blue)

                         10        20        30        40        50        60
                 =========+=========+=========+=========+=========+=========+
gi|134330123|gb  ------------------------------------------MEEDAAWVGMGIFGLIIM
gi|141265812|gb  VKALAGNIRVPSPAAMRTASRTGSMRRHTRHRLSSVALQHEVMERKD-TVGLLVAGVLLI
gi|136274652|gb  -------------------------------------MSCLYMGEEPEWRLLGLFGVICL
gi|135979845|gb  -------MLPPKPAATITASLTAVIGNPRGCNLSRITHHGLIMGTVPEWRWLVVIGLALV
gi|139070491|gb  ---MVGNILPPRPAATITASLITVINNPRGYNLSRITHHSVTMARVPEWRWVTIAGFVMV
gi|137407868|gb  ---MVGSILLPNPAATITASLTTVIENPKGYNLSSITHHGLHMATIPEWRWITLAGLVMV
gi|144109928|gb  ---MVGSILPPNPAATITASFTAVIKNPKGYNLSSITHHAVIMATTPEWRWFTITGLIMV
gi|137564391|gb  -------MLLPNPAATITASLTTVINNPKGYNLSSITHHGVIMATVPEWRWVTLAGLVMV
gi|144094326|gb  -------MLLPNPAATITASLTTVINNPKGYNLSSITHHGVIMATVPEWRWVTVAGLVMV
gi|139533243|gb  ---MVGSILLPNPAATITASLTTVIENPKGYNLSSITHHGVIMATVPEWRWITVAGLVMV
gi|141945933|gb  ---MVGSILLPNPAATITASLTTVIENPKGYNLSSITHHGVIMATVPEWRLITVAGLVMV
gi|138523308|gb  -------MLWPHKISQNIGCFQALV--IIGTHMAELNASK---AQGPPWQGLAIIGVILI
gi|140532212|gb  -------VFWPHNLSQRIGCFQALV--IIGTYMAELTPSK---EQGPPWQGLAIIGVILV
gi|135918067|gb  ----------------------MCY--RLGIVMTE--------DKAPDWIKFLFIGFILL
gi|143590958|gb  -------------MKIILSLFEALE--PFGYTMEE--------ASGPEWRIHGLVGLALL
GOS_1288010_Tra  -------------MKIILSLFEALE--PFGYTMEE--------ASGPEWRIHGLVGLALL
gi|143656696|gb  ----------------MLSLFEALD--PFGHTMEE--------ASGPEWRIHGLVGAALL
gi|138436847|gb  -------------------------------------------MAEVRWGAWGIAGILLV
gi|137923410|gb  -------------MTFLLSRFEGQI--PVGSIVGN--------ETEVNGVTPAIIGSILL
gi|143615374|gb  -------------MTFLLSRFEGQI--PVGSIVGN--------ETEVNGVTPAIIGSILL
gi|141813602|gb  ------------------------------------------MGAKAEWRVFAAVGTVLV
gi|134766806|gb  ----------------------------------------MHMGEKPEWRGFAAIGTLLV
gi|136268459|gb  -------------MTPSLRGFQAQT--PPRPYMEA--------KAAPEWRAHASIGSILL
gi|143601414|gb  --------------------MITTA--RPDHSMGG--------QSEPRWKPHAAAAGVLI
gi|143561525|gb  --------------------MITAA--RPGHSMVS--------QSEPQWKPHAVAAVLLI
gi|139786306|gb  --------------------IITAA--RPAHYMVS--------QSEPQWKPHAVAAGLLI
                                                                 ############


                         70        80        90       100       110       120
                 =========+=========+=========+=========+=========+=========+
gi|134330123|gb  IDVLIIGWAPNGPWNDDSFSLGVLGMIGVASLYLSWYRWRFKQKGVVPWLRLWKDVRSGG
gi|141265812|gb  LQSFT-DLAPDGPWDSPSFTRGVLGLVGMVLVYLAWFRHTFGIFGVAPTVNRWATPETTW
gi|136274652|gb  IQTFL-DITPAGPWDSRSFSRGAVGLIGLVCIYIAWFRFTFKQKGLIPTIGVLKNPKKSW
gi|135979845|gb  GQTLF-DVAPEGPWGASSFTRGVIGLSGLVCIYLGWFRFTFKQSGIIPSINRWQKPEASW
gi|139070491|gb  IQTMF-DVAPNGPWNAPSFTRGVIGLGGLCCLYIGWFRFTFRQSGIIPSINRWQQPEKSW
gi|137407868|gb  AQTMF-DIAPEGPWGATSFTRGLIGLCGLCCLYVGWFRYTFRQNGIIPSINRWREPKRSW
gi|144109928|gb  AQTMV-DLAPEGPWGASSFTRGLIGLCGLCCLYVGWFRVTFRQSGIIPSINRWRQPKKSW
gi|137564391|gb  AQTMV-DLAPEGPWGASSFTRGLIGLCGLCCLYIGWFRFTFRQSGIIPSINRWRQPEKSW
gi|144094326|gb  AQTMV-DLAPEGPWGASSFTRGLIGLCGLCCLYIGWFRFTFRQSGIIPSINRWRQPEKSW
gi|139533243|gb  VQTMF-DLAPEGPWGASSFTRGLIGLCGLCCLYVGWFRFTFRQSGIIPSINRWRQPEKSW
gi|141945933|gb  AQTMF-DLAPEGPWGASSFTRGLIGLCGLCCLYVGWFRFTFRQSGIIPSINRWRKPEKSW
gi|138523308|gb  AQTFT-DFGWKGPWNDVSFTQGSLGLVGGILIYLAWFRWHFKINGLIPTLDRWKNPKKGM
gi|140532212|gb  AQTFT-DFGWKGPWNDVSFTQGSIGLVGGILLYLSWFRWHFKINGLIPTLDRWKNPKKGM
gi|135918067|gb  INTIFIKISSPWPWGSESFTLGVIGLIGLVMLYISWYRFTFRRRGLVPWLDLWKEPKESS
gi|143590958|gb  INVVFLKLSIGGPWDSETFTLGLIGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSA
GOS_1288010_Tra  INVVFLKLSIGGPWDSETFTLGLIGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSA
gi|143656696|gb  INVVFLKISIDGPWDSETFTLGLIGSVSIALFYVSWYRLTFRRRGLIPWVDLWKEPSSSA
gi|138436847|gb  AQTFL-DLVPSGPWGNGAMGTGFLGLAGFGCLYVAWYRRTFSTPGLLPTSDLWKDPAGSW
gi|137923410|gb  LNTMILGFSFPGPWDSESFTLGIIGMTGLGFWYVSWYRFTFNRKGLIPWLDNWKTPEKSS
gi|143615374|gb  LNTMILGFSFPGPWDSESFTLGIIGMTGLGFWYVSWYRFTFNRKGLIPWLDNWKTPEKSS
gi|141813602|gb  AITFT-NWSPEGPWNDSTFTSGSFGLTGMMLLYLAWFRLTFESKGVVPTMDMWKDPEGTS
gi|134766806|gb  AITFT-DLSPKGPWSDETFTSGSIGLIGLSFIYLAWFRLTFEKKGVVPTLDLWRDPGQTS
gi|136268459|gb  LNTLTVKLAPAGPWGAESFTLGVIGLIGLVFLYTAWYRLTFKRKGLVPWMDMWENPKDSS
gi|143601414|gb  LDALVLGVAPSGPWDDSSFSRGVIGLVGACLAYVAWYRRTFKRKGLVPWIDLWEKPEESA
gi|143561525|gb  LDALVFGVAPSGPWDDSSFSRGVIGLVGASLAYVAWYRRTFKRKGLVPWIDLWEKPEESA
gi|139786306|gb  LDALVIGVAPGGPWDDSSFSRGLIGLVGACLAYVAWYRRTFNRKGLVPWIDLWEKPEESA
                 ###   ##################################################    


                        130       140       150       160       170       180
                 =========+=========+=========+=========+=========+=========+
gi|134330123|gb  MKVTIVGLMVMLGTWFLYA---SGPFPPAGGLILNLIGALMVLQGTYAILSS-GYLSE--
gi|141265812|gb  LRVVAFGLGCLVATRAIRLFDDSGVVPEPAGLLITLVGVLAIMNGLYVWAVTSGPLND--
gi|136274652|gb  IYVMTFGIACYLFVILNQKMSFEEYFPETTGMIVLLIGSLSLLNSIYVWMVVSGPLKQDI
gi|135979845|gb  KLVVLFSCLCLIALAVINNTNLSDNLPATTGMIVLLIASLSMLNGLYVGLVVSGPLSN--
gi|139070491|gb  KYVVLFSLFCMLTLPLINSPNLADSLPETTGMIILLIASLAMLNGX--------------
gi|137407868|gb  KLVLSFSFVCIGLVLAINNTDLKDSLPETSGMILLLIASLSMLNAVYVSLVVSGPLSN--
gi|144109928|gb  KLVLIFSFTCMLLVYGINNTELKNILPETSGMILLLIASLSMLNGIYVGLVVSGPLKK--
gi|137564391|gb  KLVFGFSLVCIVLVFGINNSDLKDTLPETSGMILLLIASLAMLNAIYVGLVVSGPLSK--
gi|144094326|gb  KLVFGFSLVCIVLVYAINNSGLKDTLPETSGMILLLIASLSMLNAIYVGLVVSGPLSK--
gi|139533243|gb  KLVLIFSLSCISLVYLINNTNLKDVLPETSGMILLLIASLSMLNAIYVGLVVLGPLSK--
gi|141945933|gb  KLVMIFSFACILLVYMINNTDLKGILPETSGMILLLIASLSMLNAIYVGLVVS-------
gi|138523308|gb  LNLFIFGIIMVIITWLVGG-PFSEIFPRPTGMLLGLISLLALLQFSYAWLYFNNYFLE--
gi|140532212|gb  LNLFIFGIVMTIITWLVGG-PFSEIFPRPTGMILGLISLLTLLQVSYAWLYFNNYFLE--
gi|135918067|gb  IKLFLFSFIITIFSYVLGK--NQYFFPDPTSLILSLIALLTFIQATYVLLSTTVLLDD--
gi|143590958|gb  KKGLLSSLIVLSMAWLSGN-HLQHILPTPTGLVLSLIGFLMLTQSAYVLMSI-GPLSD--
GOS_1288010_Tra  KKGLLSSLIVLSMAWLSGN-HLQHILPTPTGLVLSLIGFLMLTQSAYVLMSI-GPLSD--
gi|143656696|gb  KKVLLSSLIVLSMAWLSGN-HLQHILPTPTGLVLSLIGFLMLTQSAYVLMSI-GPLSD--
gi|138436847|gb  PKVVGAGVLFLLLSYLAGREEVDAWMPDTAGLVLSLIGLLTVLNGLYVGAVI-GPLSE--
gi|137923410|gb  KQVLAVGVFTIILSWTAGN-PLQEYLPDPTGLVLLLIGLLISLSGAYSMLAL-GPLAD--
gi|143615374|gb  KQVLAVGVFTIILSWTAGN-PLQEYLPDPTGLVLLLIGLLISLSGIYSMLAL-GPLAD--
gi|141813602|gb  PLVIGVGLVILGIAYAVGR---IDFFPGPAGLILSLVGLLVATNGVYVWLSSAGPLSR--
gi|134766806|gb  LTVMGVGIVTLGIAYAVGR---IEFFPEPAGLILSLIGLLVTTNGFYVWMSTEGPLSS--
gi|136268459|gb  RTVIVAGLAIIASAWLAGN-VVEEALPKPTGLILTLVGLLTLLNGVYVYLSV-GALSD--
gi|143601414|gb  RLVLYASIGFLTISWIAGN-PMQPHLPDPTGLVLVLVGLLLGLQAVYVYLVI-GPLKE--
gi|143561525|gb  RLVLYASIGFLAISWVAGN-PMQPHLPDPTGLILVLVGLLLGLQAVYVYLVI-GPLKE--
gi|139786306|gb  RLVLYASIGFLTISWIAGN-PMQPHLPEPTGLVLVLVGLLLGLQAVYVYLVI-GPLKE--
                                           ##############                    


                 
                 =====
gi|134330123|gb  ----S
gi|141265812|gb  --EEE
gi|136274652|gb  VKEQE
gi|135979845|gb  RSEEE
gi|139070491|gb  -----
gi|137407868|gb  LPEEE
gi|144109928|gb  RSEEE
gi|137564391|gb  TTEEE
gi|144094326|gb  ITEEE
gi|139533243|gb  PDEEE
gi|141945933|gb  -----
gi|138523308|gb  ---EE
gi|140532212|gb  ---EE
gi|135918067|gb  -----
gi|143590958|gb  ----D
GOS_1288010_Tra  ----D
gi|143656696|gb  ----D
gi|138436847|gb  ---EE
gi|137923410|gb  SIADE
gi|143615374|gb  SIADE
gi|141813602|gb  --EEE
gi|134766806|gb  --EEE
gi|136268459|gb  ---LE
gi|143601414|gb  ----E
gi|143561525|gb  ----E
gi|139786306|gb  ----E
                      






Parameters used
Minimum Number Of Sequences For A Conserved Position: 14
Minimum Number Of Sequences For A Flanking Position: 22
Maximum Number Of Contiguous Nonconserved Positions: 8
Minimum Length Of A Block: 10
Allowed Gap Positions: None
Use Similarity Matrices: Yes


Flank positions of the 3 selected block(s)
Flanks: [49  63]  [67  116]  [147  160]  

New number of positions in input.fasta-gb:  79  (42% of the original 185 positions)
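
The number of retained positions can be checked directly from the flank coordinates. This is a minimal sketch, assuming the flanks are 1-based and inclusive and that the reported percentage is truncated to a whole number; both assumptions are inferred from the figures above, not from the Gblocks documentation.

```python
# Flanks of the 3 selected blocks, as reported above (assumed 1-based, inclusive).
flanks = [(49, 63), (67, 116), (147, 160)]
original_length = 185  # alignment columns before trimming

# Length of each block, then the total number of columns kept.
block_lengths = [end - start + 1 for start, end in flanks]
kept = sum(block_lengths)

# Truncated to a whole percent (assumption, chosen to match the report).
percent_kept = int(100 * kept / original_length)

print(block_lengths, kept, percent_kept)
```

The three blocks add up to the 79 positions reported, which is the alignment actually used to build the trees.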



Protein Domains

PROTOCOL:

InterPro / ORF / frame 2 / direct strand / 154 aa


ANALYSIS OF RESULTS:


Our analysis reveals several structural features in our putative protein but gives no information on functional regions.

Our putative protein is not integrated into InterPro: there is no IPR accession, which means that no entry in the database corresponds to a functional domain. This does not mean that our sequence is non-coding.

The structural information is somewhat vague. We learn that our putative protein would have 4 transmembrane regions and a signal peptide. This suggests a transmembrane protein (e.g. a receptor or a transport protein); the signal peptide could be a receptor signal or a targeting peptide that is subsequently cleaved.




We limited our analysis to InterPro because it integrates Prosite and Pfam.

RAW RESULTS:

InterPro results:

Sequence_1	D484A4052840A07D	154	TMHMM	tmhmm	transmembrane_regions	30	50	NA	?	12-Feb-2010	NULL	NULL
Sequence_1	D484A4052840A07D	154	TMHMM	tmhmm	transmembrane_regions	56	76	NA	?	12-Feb-2010	NULL	NULL
Sequence_1	D484A4052840A07D	154	TMHMM	tmhmm	transmembrane_regions	96	114	NA	?	12-Feb-2010	NULL	NULL
Sequence_1	D484A4052840A07D	154	TMHMM	tmhmm	transmembrane_regions	120	138	NA	?	12-Feb-2010	NULL	NULL
Sequence_1	D484A4052840A07D	154	SignalPHMM	SignalP	signal-peptide	1	48	NA	?	12-Feb-2010	NULL	NULL
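
The coordinates in this raw output can be compared programmatically. The sketch below is only an illustration: it uses a whitespace-separated copy of the five lines above, assumes that columns 4, 7 and 8 hold the method, start and end, and the helper names `parse_features` and `overlap` are hypothetical.

```python
RAW = """\
Sequence_1 D484A4052840A07D 154 TMHMM tmhmm transmembrane_regions 30 50 NA ? 12-Feb-2010 NULL NULL
Sequence_1 D484A4052840A07D 154 TMHMM tmhmm transmembrane_regions 56 76 NA ? 12-Feb-2010 NULL NULL
Sequence_1 D484A4052840A07D 154 TMHMM tmhmm transmembrane_regions 96 114 NA ? 12-Feb-2010 NULL NULL
Sequence_1 D484A4052840A07D 154 TMHMM tmhmm transmembrane_regions 120 138 NA ? 12-Feb-2010 NULL NULL
Sequence_1 D484A4052840A07D 154 SignalPHMM SignalP signal-peptide 1 48 NA ? 12-Feb-2010 NULL NULL
"""


def parse_features(raw):
    """Return (method, start, end) for each line of the raw output."""
    features = []
    for line in raw.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        features.append((fields[3], int(fields[6]), int(fields[7])))
    return features


def overlap(a, b):
    """Number of residues shared by two inclusive (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


features = parse_features(RAW)
tm_regions = [(s, e) for method, s, e in features if method == "TMHMM"]
signal = [(s, e) for method, s, e in features if method == "SignalPHMM"][0]

# Residues shared between the signal peptide and each transmembrane region.
overlaps = [overlap(signal, tm) for tm in tm_regions]
print(tm_regions, signal, overlaps)
```

The predicted signal peptide (1-48) shares residues 30-48 with the first predicted transmembrane region (30-50), so the two predictions may describe the same N-terminal hydrophobic segment. This should be kept in mind when interpreting the signal peptide.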

Phylogeny

PROTOCOL:

a) Tree 1 / Phylogeny.fr / PhyML method / no bootstrap / TreeDyn graphical rendering

b) Tree 2 / Phylogeny.fr / BioNJ method

c) Record of the closest sequence, gi_143590958

d) Record of the closest sequence, gi_143656696

e) Tree 3 / Phylogeny.fr / PhyML method / no bootstrap / TreeDyn graphical rendering / without sequence ECQ21037.1 hypothetical protein GOS_3060423


ANALYSIS OF RESULTS:


a) Tree 1


Our putative protein is close to many marine metagenome sequences, in particular these two: gi_143590958 and gi_143656696. Unfortunately, these sequences have not yet been characterized.

We therefore cannot deduce a molecular function, but we can suppose that our sequence is indeed coding, that it also comes from a marine metagenome, and that the protein it encodes probably has a function similar to that of the other two.


b) Tree 2


Our putative protein remains very close to the same sequences, gi_143590958 and gi_143656696. This confirms the hypotheses drawn from the previous tree: our putative protein also comes from a marine metagenome, and its function is probably similar or even identical to that of the other two sequences.


c) and d)

The records for these two sequences show that they were sampled along a voyage from the eastern coast of North America to the eastern Pacific Ocean, including the Sargasso Sea, the Panama Canal and the Galapagos Islands.

This is consistent with our sequence, which comes from Panama. They are also the same length, or nearly so (154 and 151 aa), as our putative protein (154 aa).
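
The lengths of the two closest sequences can be verified from their GenBank records. This is a minimal sketch: the ORIGIN blocks and `coded_by` coordinates are copied from the two records given in the raw results, the helper names are hypothetical, and the CDS check assumes that each `coded_by` interval includes the stop codon.

```python
EDF75304_ORIGIN = """
        1 mkiilslfea lepfgytmee asgpewrihg lvglallinv vflklsiggp wdsetftlgl
       61 igsislalfy vswyrltfrr rglipwvdlw kepsssakkg llsslivlsm awlsgnhlqh
      121 ilptptglvl sligflmltq sayvlmsigp lsdd
"""

EDG10892_ORIGIN = """
        1 mlslfealdp fghtmeeasg pewrihglvg aallinvvfl kisidgpwds etftlgligs
       61 vsialfyvsw yrltfrrrgl ipwvdlwkep sssakkvlls slivlsmawl sgnhlqhilp
      121 tptglvlsli gflmltqsay vlmsigplsd d
"""


def origin_to_sequence(block):
    """Join the residue chunks of an ORIGIN block, dropping position counters."""
    chunks = []
    for line in block.splitlines():
        chunks.extend(tok for tok in line.split() if tok.isalpha())
    return "".join(chunks).upper()


def cds_protein_length(start, end):
    """Protein length for a CDS interval, assuming it includes the stop codon."""
    nucleotides = end - start + 1
    return nucleotides // 3 - 1


seq_a = origin_to_sequence(EDF75304_ORIGIN)  # gi_143590958
seq_b = origin_to_sequence(EDG10892_ORIGIN)  # gi_143656696

print(len(seq_a), cds_protein_length(2412, 2876))
print(len(seq_b), cds_protein_length(603, 1058))
```

Both checks agree with the lengths stated in the LOCUS lines: 154 aa for gi_143590958 and 151 aa for gi_143656696.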


e) Tree 3

Our putative protein is still just as close to the same sequences. Removing sequence ECQ21037.1 allowed us to better analyze our multiple alignment but did not change the tree much, because that sequence was not in the same branch of the tree as our putative protein.

RAW RESULTS:
a) Tree 1:
                                                                                                    -------0.2-----
 
              +-----------------------------------marine_metagenome_gi_141265812                   (ecological métagenomes)
              |
    +---------+             +----------------marine_metagenome_gi_136274652                        (ecological métagenomes)
    |         |             |
    |         +-------------+       +--------marine_metagenome_gi_141183798                        (ecological métagenomes)
    |                       |       |
    |                       |       |           +-marine_metagenome_gi_139070491                   (ecological métagenomes)
    |                       +-------+           |
    |                               |           |   +---------marine_metagenome_gi_135979845       (ecological métagenomes)
    |                               +-----------+   |
    |                                           |   |
    |                                           |   |      +marine_metagenome_gi_137564391         (ecological métagenomes)
    |                                           +---+      |
 +--+                                               |      |marine_metagenome_gi_144094326         (ecological métagenomes)
 |  |                                               |      |
 |  |                                               +------++------marine_metagenome_gi_137407868  (ecological métagenomes)
 |  |                                                      ||
 |  |                                                      ||--marine_metagenome_gi_141945933      (ecological métagenomes)
 |  |                                                      ||
 |  |                                                      ++--marine_metagenome_gi_144109928      (ecological métagenomes)
 |  |                                                       |
 |  |                                                       +marine_metagenome_gi_139533243        (ecological métagenomes)
 |  |
 |  |                                +---marine_metagenome_gi_138523308                            (ecological métagenomes)
 |  +--------------------------------+
 |                                   +-marine_metagenome_gi_140532212                              (ecological métagenomes)
 |
 |
 |                                     +marine_metagenome_gi_143601414                             (ecological métagenomes)
 |                                   +-+
 |          +------------------------+ +-----marine_metagenome_gi_139786306                        (ecological métagenomes)
 |          |                        |
 |          |                        +marine_metagenome_gi_143561525                               (ecological métagenomes)
 |          |
 |          |                 +----------marine_metagenome_gi_135918067                            (ecological métagenomes)
 |          |                 |
 |          |                 |                        +marine_metagenome_gi_143590958             (ecological métagenomes)
 |          |      +----------+                        |
 |    +-----+      |          |                        |Translation_of_ORF_number_1_in_reading_frame_2_on_the_direct_str
 |    |     |      |          |      +-----------------+
 |    |     |      |          |      |                 +-----marine_metagenome_gi_143656696        (ecological métagenomes)
 |    |     |      |          +------+
 |    |     | +----+                 |                    +marine_metagenome_gi_137923410          (ecological métagenomes)
 |    |     | |    |                 +--------------------+
 |    |     | |    |                                      +marine_metagenome_gi_143615374          (ecological métagenomes)
 +----+     +-+    |
      |       |    |
      |       |    +---------------marine_metagenome_gi_136268459                                  (ecological métagenomes)
      |       |
      |       +-----------------------------------------marine_metagenome_gi_134330123             (ecological métagenomes)
      |
      |      +-----------------------------------------------------marine_metagenome_gi_138436847  (ecological métagenomes)
      +------+
             |            +--------------------marine_metagenome_gi_141813602                      (ecological métagenomes)
             +------------+
                          +-----marine_metagenome_gi_134766806                                     (ecological métagenomes)




b) Tree 2:


                                                                                                      -----0.2----
 
                            +--------------marine_metagenome_gi_135918067                          (ecological métagenomes)
                            |
                      +-----+                          +Translation_of_ORF_number_1_in_reading_frame_2_on_the_direct_str
                      |     |                      +---+
                      |     |                      |   +marine_metagenome_gi_143590958             (ecological métagenomes)
                      |     +----------------------+
                 +----+                            +marine_metagenome_gi_143656696                 (ecological métagenomes)
                 |    |
                 |    |                 +marine_metagenome_gi_137923410                            (ecological métagenomes)
           +-----+    |                 |
           |     |    +-----------------+
           |     |                      +marine_metagenome_gi_143615374                            (ecological métagenomes)
           |     |
      +----+     +--------------marine_metagenome_gi_136268459                                     (ecological métagenomes)
      |    |
      |    |                 +-marine_metagenome_gi_143601414                                      (ecological métagenomes)
      |    |                ++
      |    +----------------++marine_metagenome_gi_143561525                                       (ecological métagenomes)
      |                     |
      |                     +---marine_metagenome_gi_139786306                                     (ecological métagenomes)
      |
      |                                           +----marine_metagenome_gi_137407868              (ecological métagenomes)
      |                                          ++
      |                                          |+marine_metagenome_gi_144109928                  (ecological métagenomes)
      |                                         ++
      |                                         |+marine_metagenome_gi_139533243                   (ecological métagenomes)
      |                                         |
      |                                         |
 +----+                                    +----+-marine_metagenome_gi_141945933                   (ecological métagenomes)
 |    |                                    |    |
 |    |                                    |    |marine_metagenome_gi_137564391                    (ecological métagenomes)
 |    |                                 +--+    |
 |    |                                 |  |    +marine_metagenome_gi_144094326                    (ecological métagenomes)
 |    |                       +---------+  |
 |    |                       |         |  +-------marine_metagenome_gi_135979845                  (ecological métagenomes)
 |    |                       |         |
 |    |                     +-+         +----marine_metagenome_gi_139070491                        (ecological métagenomes)
 |    |                     | |
 |    |          +----------+ +---------marine_metagenome_gi_141183798                             (ecological métagenomes)
 |    |          |          |
 |    |   +------+          +--------------------marine_metagenome_gi_136274652                    (ecological métagenomes)
 |    |   |      |
 |    +---+      +---------------------------------marine_metagenome_gi_141265812                  (ecological métagenomes)
 |        |
 |        +--------------------------------------------marine_metagenome_gi_138436847              (ecological métagenomes)
 |
 |
 |   +--------------------------------------------------marine_metagenome_gi_134330123             (ecological métagenomes)
 | +-+
 | | |                                        +-----marine_metagenome_gi_138523308                 (ecological métagenomes)
 | | +----------------------------------------+
 +-+                                          +marine_metagenome_gi_140532212                      (ecological métagenomes)
   |
   |         +-----------------marine_metagenome_gi_141813602                                      (ecological métagenomes)
   +---------+
             +----------marine_metagenome_gi_134766806                                             (ecological métagenomes)




c) Record of the closest sequence, gi_143590958

LOCUS       EDF75304                 154 aa            linear   ENV 06-APR-2007
DEFINITION  hypothetical protein GOS_900216 [marine metagenome].
ACCESSION   EDF75304
VERSION     EDF75304.1  GI:143590958
DBSOURCE    accession EP661019.1
KEYWORDS    .
SOURCE      marine metagenome
  ORGANISM  marine metagenome
            unclassified sequences; metagenomes; ecological metagenomes.
REFERENCE   1  (residues 1 to 154)
  AUTHORS   Yooseph,S., Sutton,G., Rusch,D.B., Halpern,A.L., Williamson,S.J.,
            Remington,K., Eisen,J.A., Heidelberg,K.B., Manning,G., Li,W.,
            Jaroszewski,L., Cieplak,P., Miller,C.S., Li,H., Mashiyama,S.T.,
            Joachimiak,M.P., van Belle,C., Chandonia,J.M., Soergel,D.A.,
            Zhai,Y., Natarajan,K., Lee,S., Raphael,B.J., Bafna,V., Friedman,R.,
            Brenner,S.E., Godzik,A., Eisenberg,D., Dixon,J.E., Taylor,S.S.,
            Strausberg,R.L., Frazier,M. and Venter,J.C.
  TITLE     The Sorcerer II Global Ocean Sampling Expedition: Expanding the
            Universe of Protein Families
  JOURNAL   PLoS Biol. 5 (3), E16 (2007)
   PUBMED   17355171
  REMARK    Publication Status: Available-Online prior to print
REFERENCE   2  (residues 1 to 154)
  AUTHORS   Kannan,N., Taylor,S.S., Zhai,Y., Venter,J.C. and Manning,G.
  TITLE     Structural and Functional Diversity of the Microbial Kinome
  JOURNAL   PLoS Biol. 5 (3), E17 (2007)
   PUBMED   17355172
  REMARK    Publication Status: Available-Online prior to print
REFERENCE   3  (residues 1 to 154)
  AUTHORS   Rusch,D.B., Halpern,A.L., Sutton,G., Heidelberg,K.B.,
            Williamson,S., Yooseph,S., Wu,D., Eisen,J.A., Hoffman,J.M.,
            Remington,K., Beeson,K., Tran,B., Smith,H., Baden-Tillson,H.,
            Stewart,C., Thorpe,J., Freeman,J., Andrews-Pfannkoch,C.,
            Venter,J.E., Li,K., Kravitz,S., Heidelberg,J.F., Utterback,T.,
            Rogers,Y.H., Falcon,L.I., Souza,V., Bonilla-Rosso,G.,
            Eguiarte,L.E., Karl,D.M., Sathyendranath,S., Platt,T.,
            Bermingham,E., Gallardo,V., Tamayo-Castillo,G., Ferrari,M.R.,
            Strausberg,R.L., Nealson,K., Friedman,R., Frazier,M. and
            Venter,J.C.
  TITLE     The Sorcerer II Global Ocean Sampling Expedition: Northwest
            Atlantic through Eastern Tropical Pacific
  JOURNAL   PLoS Biol. 5 (3), E77 (2007)
   PUBMED   17355176
  REMARK    Publication Status: Available-Online prior to print
REFERENCE   4  (residues 1 to 154)
  CONSRTM   J. Craig Venter Institute
  TITLE     Direct Submission
  JOURNAL   Submitted (02-MAR-2007) J. Craig Venter Institute, 9704 Medical
            Center Drive, Rockville, MD 20850, USA
COMMENT     Method: conceptual translation.
FEATURES             Location/Qualifiers
     source          1..154
                     /organism="marine metagenome"
                     /isolation_source="isolated as part of a large dataset
                     composed predominantly from surface water marine samples
                     collected along a voyage from Eastern North American coast
                     to the Eastern Pacific Ocean, including locations in the
                     Sargasso Sea, Panama Canal, and the Galapagos Islands"
                     /db_xref="taxon:408172"
                     /environmental_sample
                     /note="metagenomic"
     Protein         1..154
                     /product="hypothetical protein"
     CDS             1..154
                     /locus_tag="GOS_900216"
                     /coded_by="complement(EP661019.1:2412..2876)"
                     /note="JCVI_ORF_1096697956010; CAM_CL_10918"
                     /transl_table=11
ORIGIN      
        1 mkiilslfea lepfgytmee asgpewrihg lvglallinv vflklsiggp wdsetftlgl
       61 igsislalfy vswyrltfrr rglipwvdlw kepsssakkg llsslivlsm awlsgnhlqh
      121 ilptptglvl sligflmltq sayvlmsigp lsdd
//



d) Record of the closest sequence, gi_143656696


LOCUS       EDG10892                 151 aa            linear   ENV 06-APR-2007
DEFINITION  hypothetical protein GOS_838951 [marine metagenome].
ACCESSION   EDG10892
VERSION     EDG10892.1  GI:143656696
DBSOURCE    accession EP648010.1
KEYWORDS    .
SOURCE      marine metagenome
  ORGANISM  marine metagenome
            unclassified sequences; metagenomes; ecological metagenomes.
REFERENCE   1  (residues 1 to 151)
  AUTHORS   Yooseph,S., Sutton,G., Rusch,D.B., Halpern,A.L., Williamson,S.J.,
            Remington,K., Eisen,J.A., Heidelberg,K.B., Manning,G., Li,W.,
            Jaroszewski,L., Cieplak,P., Miller,C.S., Li,H., Mashiyama,S.T.,
            Joachimiak,M.P., van Belle,C., Chandonia,J.M., Soergel,D.A.,
            Zhai,Y., Natarajan,K., Lee,S., Raphael,B.J., Bafna,V., Friedman,R.,
            Brenner,S.E., Godzik,A., Eisenberg,D., Dixon,J.E., Taylor,S.S.,
            Strausberg,R.L., Frazier,M. and Venter,J.C.
  TITLE     The Sorcerer II Global Ocean Sampling Expedition: Expanding the
            Universe of Protein Families
  JOURNAL   PLoS Biol. 5 (3), E16 (2007)
   PUBMED   17355171
  REMARK    Publication Status: Available-Online prior to print
REFERENCE   2  (residues 1 to 151)
  AUTHORS   Kannan,N., Taylor,S.S., Zhai,Y., Venter,J.C. and Manning,G.
  TITLE     Structural and Functional Diversity of the Microbial Kinome
  JOURNAL   PLoS Biol. 5 (3), E17 (2007)
   PUBMED   17355172
  REMARK    Publication Status: Available-Online prior to print
REFERENCE   3  (residues 1 to 151)
  AUTHORS   Rusch,D.B., Halpern,A.L., Sutton,G., Heidelberg,K.B.,
            Williamson,S., Yooseph,S., Wu,D., Eisen,J.A., Hoffman,J.M.,
            Remington,K., Beeson,K., Tran,B., Smith,H., Baden-Tillson,H.,
            Stewart,C., Thorpe,J., Freeman,J., Andrews-Pfannkoch,C.,
            Venter,J.E., Li,K., Kravitz,S., Heidelberg,J.F., Utterback,T.,
            Rogers,Y.H., Falcon,L.I., Souza,V., Bonilla-Rosso,G.,
            Eguiarte,L.E., Karl,D.M., Sathyendranath,S., Platt,T.,
            Bermingham,E., Gallardo,V., Tamayo-Castillo,G., Ferrari,M.R.,
            Strausberg,R.L., Nealson,K., Friedman,R., Frazier,M. and
            Venter,J.C.
  TITLE     The Sorcerer II Global Ocean Sampling Expedition: Northwest
            Atlantic through Eastern Tropical Pacific
  JOURNAL   PLoS Biol. 5 (3), E77 (2007)
   PUBMED   17355176
  REMARK    Publication Status: Available-Online prior to print
REFERENCE   4  (residues 1 to 151)
  CONSRTM   J. Craig Venter Institute
  TITLE     Direct Submission
  JOURNAL   Submitted (02-MAR-2007) J. Craig Venter Institute, 9704 Medical
            Center Drive, Rockville, MD 20850, USA
COMMENT     Method: conceptual translation.
FEATURES             Location/Qualifiers
     source          1..151
                     /organism="marine metagenome"
                     /isolation_source="isolated as part of a large dataset
                     composed predominantly from surface water marine samples
                     collected along a voyage from Eastern North American coast
                     to the Eastern Pacific Ocean, including locations in the
                     Sargasso Sea, Panama Canal, and the Galapagos Islands"
                     /db_xref="taxon:408172"
                     /environmental_sample
                     /note="metagenomic"
     Protein         1..151
                     /product="hypothetical protein"
     CDS             1..151
                     /locus_tag="GOS_838951"
                     /coded_by="EP648010.1:603..1058"
                     /note="JCVI_ORF_1096680234916; CAM_CL_10918"
                     /transl_table=11
ORIGIN      
        1 mlslfealdp fghtmeeasg pewrihglvg aallinvvfl kisidgpwds etftlgligs
       61 vsialfyvsw yrltfrrrgl ipwvdlwkep sssakkvlls slivlsmawl sgnhlqhilp
      121 tptglvlsli gflmltqsay vlmsigplsd d
//



e) Tree 3:
                     ----0.1---
 
                                                                             +marine_metagenome_gi_143590958
                                                                             |
                                                                             |GOS_1288010_Traduction_110-571_sens_direct
                                         +-----------------------------------+
                                         |                                   +------marine_metagenome_gi_143656696
                                         |
                             +-----------+                                   +marine_metagenome_gi_137923410
                             |           |  +--------------------------------+
                             |           |  |                                +marine_metagenome_gi_143615374
                     +-------+           +--+
                     |       |              |
                     |       |              +---------------------marine_metagenome_gi_135918067
                 +---+       |
                 |   |       +--------------------marine_metagenome_gi_136268459
                 |   |
  +--------------+   +------------------------------------------------------------------marine_metagenome_gi_134330123
  |              |
  |              |                                        +marine_metagenome_gi_143561525
  |              |                                        |
  |              +----------------------------------------+   +--marine_metagenome_gi_143601414
 ++                                                       +---+
 ||                                                           +-----marine_metagenome_gi_139786306
 ||
 ||        +------------------------------------------------------------------------marine_metagenome_gi_138436847
 ||        |
 |+--------+                   +--------------------------marine_metagenome_gi_141813602
 |         +-------------------+
 |                             |
 |                             +----marine_metagenome_gi_134766806
 |
 |                                        +---marine_metagenome_gi_138523308
 |      +---------------------------------+
 |      |                                 +--marine_metagenome_gi_140532212
 +------+
        |       +-----------------------------------------------------------marine_metagenome_gi_141265812
        |       |
        +-------+                +---------------------------marine_metagenome_gi_136274652
                |                |
                +----------------+                      +-------marine_metagenome_gi_135979845
                                 |                      |
                                 |                      |            +----------marine_metagenome_gi_139070491
                                 +----------------------+            |
                                                        |            |       +-marine_metagenome_gi_137564391
                                                        +------------+       |
                                                                     |       |
                                                                     |       |marine_metagenome_gi_144094326
                                                                     |       |
                                                                     +-------+  +-marine_metagenome_gi_139533243
                                                                             |  |
                                                                             |  |--marine_metagenome_gi_141945933
                                                                             +--+
                                                                                |-------marine_metagenome_gi_137407868
                                                                                |
                                                                                +------marine_metagenome_gi_144109928

Taxonomy report

PROTOCOL:

BLASTP / ORF on the direct strand, frame 2, 154 aa, against an environmental samples protein database; NCBI default parameters except "number of descriptions = 500"



ANALYSIS OF RESULTS:


Since our sequence has no significant homolog in nr or Swiss-Prot, we ran a BLASTP against the environmental samples database. We obtain several similar sequences, but without any taxonomic information because they have not yet been annotated.

Consequently, we cannot define a study group and an outgroup, since the only taxonomic information available for all the sequences is "hypothetical protein, marine metagenome".


We selected 40 sequences that appear to be homologous after a detailed study of their max score, E-value, identity, alignment length and the number of gaps they contain. We were thus able to define a score threshold of 50, an E-value of about 10^-05, an identity above 30%, a minimum alignment length of 90 to 100 residues, and a proportion of gaps less than or equal to 9%.
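
The score and E-value thresholds can be expressed as a simple filter over BLAST description lines. This is a sketch under stated assumptions: the `Hit` type and the `parse_hit` and `keep` helpers are hypothetical, the columns after the description are taken to be max score, total score, query coverage and E-value, and the default E-value cutoff of 2e-05 is an interpretation of "about 10^-05". Identity, alignment length and gaps are not present in these lines, so they are not checked here.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    accession: str
    max_score: float
    query_coverage: int  # percent
    e_value: float


def parse_hit(line):
    """Parse one description line; tolerates a missing space after ']'."""
    description, _, numbers = line.rpartition("]")
    max_score, _total_score, coverage, e_value = numbers.split()
    return Hit(description.split()[0], float(max_score),
               int(coverage.rstrip("%")), float(e_value))


def keep(hit, min_score=50.0, max_e_value=2e-5):
    """Apply the score and E-value thresholds (cutoff values are assumptions)."""
    return hit.max_score >= min_score and hit.e_value <= max_e_value


# Three lines copied from the hit list in this report.
LINES = [
    "EDF75304.1 hypothetical protein GOS_900216 [marine metagenome] 301 301 99% 5e-81",
    "ECQ21037.1 hypothetical protein GOS_3060423 [marine metagenome]96.7 96.7 93% 3e-19",
    "ECU30093.1 hypothetical protein GOS_4911241 [marine metagenome]50.8 50.8 60% 2e-05",
]

hits = [parse_hit(line) for line in LINES]
kept = [h.accession for h in hits if keep(h)]
strict = [h.accession for h in hits if keep(h, max_e_value=1e-5)]
print(kept, strict)
```

With a strict cutoff of 1e-05 the weakest hit (ECU30093.1, E-value 2e-05) would be excluded, which is why the threshold is best read as approximate.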


Description of the 47 sequences considered homologous (accession, description, max score, total score, query coverage, E-value):



EDF75304.1 hypothetical protein GOS_900216 [marine metagenome] 301 301 99% 5e-81
EDG10892.1 hypothetical protein GOS_838951 [marine metagenome] 282 282 97% 3e-75
EBN01749.1 hypothetical protein GOS_8315127 [marine metagenome] 136 136 94% 2e-31
EBN38109.1 hypothetical protein GOS_8255980 [marine metagenome] 129 129 98% 3e-29
EBK69923.1 hypothetical protein GOS_8689362 [marine metagenome] 126 126 90% 2e-28
EBW77633.1 hypothetical protein GOS_6693422 [marine metagenome] 123 123 98% 2e-27
EDF64348.1 hypothetical protein GOS_919126 [marine metagenome] 122 122 90% 3e-27
EDF87642.1 hypothetical protein GOS_878966 [marine metagenome] 121 121 98% 8e-27
ECH45445.1 hypothetical protein GOS_5190370 [marine metagenome] 121 121 95% 9e-27
EDF80869.1 hypothetical protein GOS_890542 [marine metagenome] 118 118 89% 7e-26
ECH90300.1 hypothetical protein GOS_3423331 [marine metagenome] 106 106 77% 3e-22
ECF72060.1 hypothetical protein GOS_5109402 [marine metagenome] 97.8 97.8 97% 1e-19
ECQ21037.1 hypothetical protein GOS_3060423 [marine metagenome] 96.7 96.7 93% 3e-19
ECT42809.1 hypothetical protein GOS_6018401 [marine metagenome] 95.1 95.1 83% 6e-19
EBA68462.1 hypothetical protein GOS_360053 [marine metagenome] 94.0 94.0 87% 2e-18
EBN06076.1 hypothetical protein GOS_8308395 [marine metagenome] 90.9 90.9 84% 1e-17
EDI93593.1 hypothetical protein GOS_1783225 [marine metagenome] 88.6 88.6 91% 6e-17
EDG45611.1 hypothetical protein GOS_778845 [marine metagenome] 88.2 88.2 82% 7e-17
ECU35447.1 hypothetical protein GOS_4689927 [marine metagenome] 88.2 88.2 91% 8e-17
EBT93118.1 hypothetical protein GOS_7192607 [marine metagenome] 88.2 88.2 97% 8e-17
EBD28922.1 hypothetical protein GOS_9953355 [marine metagenome] 87.8 87.8 88% 1e-16
EBU77893.1 hypothetical protein GOS_7007326 [marine metagenome] 85.9 85.9 91% 4e-16
EBZ66621.1 hypothetical protein GOS_4415766 [marine metagenome] 84.7 84.7 83% 8e-16
EDI82476.1 hypothetical protein GOS_1802492 [marine metagenome] 84.7 84.7 91% 1e-15
EBL11661.1 hypothetical protein GOS_8620954 [marine metagenome] 84.7 84.7 83% 1e-15
EDD60700.1 hypothetical protein GOS_1275691 [marine metagenome] 84.3 84.3 87% 1e-15
EBN78289.1 hypothetical protein GOS_8189251 [marine metagenome] 84.3 84.3 51% 1e-15
ECD20916.1 hypothetical protein GOS_4330322 [marine metagenome] 80.5 80.5 81% 2e-14
EBL23404.1 hypothetical protein GOS_8603451 [marine metagenome] 80.5 80.5 74% 2e-14
ECC19927.1 hypothetical protein GOS_4824622 [marine metagenome] 77.8 77.8 83% 1e-13
ECA18118.1 hypothetical protein GOS_5887129 [marine metagenome] 77.0 77.0 90% 2e-13
EBK29602.1 hypothetical protein GOS_8755925 [marine metagenome] 77.0 77.0 70% 2e-13
ECM13022.1 hypothetical protein GOS_4126107 [marine metagenome] 76.6 76.6 89% 2e-13
EDB60955.1 hypothetical protein GOS_1626242 [marine metagenome] 75.9 75.9 69% 4e-13
EBL81460.1 hypothetical protein GOS_8509950 [marine metagenome] 73.9 73.9 72% 2e-12
ECV78015.1 hypothetical protein GOS_2836784 [marine metagenome] 73.9 73.9 68% 2e-12
ECD44217.1 hypothetical protein GOS_3432934 [marine metagenome] 72.4 72.4 81% 5e-12
EBX88779.1 hypothetical protein GOS_6517362 [marine metagenome] 71.2 71.2 72% 9e-12
ECJ30169.1 hypothetical protein GOS_4865127 [marine metagenome] 71.2 71.2 71% 1e-11
ECQ77423.1 hypothetical protein GOS_4323591 [marine metagenome] 68.6 68.6 80% 7e-11
EDB11118.1 hypothetical protein GOS_1880221 [marine metagenome] 66.2 66.2 72% 4e-10
EBU24568.1 hypothetical protein GOS_7144060 [marine metagenome] 62.8 62.8 60% 4e-09
EBW29005.1 hypothetical protein GOS_6769902 [marine metagenome] 62.4 62.4 76% 5e-09
EDG04535.1 hypothetical protein GOS_849949 [marine metagenome] 61.6 61.6 69% 9e-09
ECT89181.1 hypothetical protein GOS_4166703 [marine metagenome] 57.4 57.4 60% 2e-07
ECT76578.1 hypothetical protein GOS_4646595 [marine metagenome] 54.3 54.3 76% 1e-06
ECU30093.1 hypothetical protein GOS_4911241 [marine metagenome] 50.8 50.8 60% 2e-05




RAW RESULTS:

Sequences considered homologous


EDF75304.1 hypothetical protein GOS_900216 [marine metagenome]301	 301	99%	5e-81	
EDG10892.1 hypothetical protein GOS_838951 [marine metagenome]282	 282	97%	3e-75	
EBN01749.1 hypothetical protein GOS_8315127 [marine metagenome]136	 136	94%	2e-31	
EBN38109.1 hypothetical protein GOS_8255980 [marine metagenome]129	 129	98%	3e-29	
EBK69923.1 hypothetical protein GOS_8689362 [marine metagenome]126	 126	90%	2e-28	
EBW77633.1 hypothetical protein GOS_6693422 [marine metagenome]123       123	98%	2e-27	
EDF64348.1 hypothetical protein GOS_919126 [marine metagenome]122	 122	90%	3e-27	
EDF87642.1 hypothetical protein GOS_878966 [marine metagenome]121	 121	98%	8e-27	
ECH45445.1 hypothetical protein GOS_5190370 [marine metagenome]121	 121	95%	9e-27	
EDF80869.1 hypothetical protein GOS_890542 [marine metagenome]118	 118	89%	7e-26	
ECH90300.1 hypothetical protein GOS_3423331 [marine metagenome]106	 106	77%	3e-22	
ECF72060.1 hypothetical protein GOS_5109402 [marine metagenome]97.8      97.8	97%	1e-19	
ECQ21037.1 hypothetical protein GOS_3060423 [marine metagenome]96.7      96.7	93%	3e-19	
ECT42809.1 hypothetical protein GOS_6018401 [marine metagenome]95.1      95.1	83%	6e-19	
EBA68462.1 hypothetical protein GOS_360053 [marine metagenome]94.0	 94.0	87%	2e-18	
EBN06076.1 hypothetical protein GOS_8308395 [marine metagenome]90.9      90.9	84%	1e-17	
EDI93593.1 hypothetical protein GOS_1783225 [marine metagenome]88.6      88.6	91%	6e-17	
EDG45611.1 hypothetical protein GOS_778845 [marine metagenome]88.2	 88.2	82%	7e-17	
ECU35447.1 hypothetical protein GOS_4689927 [marine metagenome]88.2      88.2	91%	8e-17	
EBT93118.1 hypothetical protein GOS_7192607 [marine metagenome]88.2      88.2	97%	8e-17	
EBD28922.1 hypothetical protein GOS_9953355 [marine metagenome]87.8      87.8	88%	1e-16	
EBU77893.1 hypothetical protein GOS_7007326 [marine metagenome]85.9      85.9	91%	4e-16	
EBZ66621.1 hypothetical protein GOS_4415766 [marine metagenome]84.7      84.7	83%	8e-16	
EDI82476.1 hypothetical protein GOS_1802492 [marine metagenome]84.7      84.7	91%	1e-15	
EBL11661.1 hypothetical protein GOS_8620954 [marine metagenome]84.7      84.7	83%	1e-15	
EDD60700.1 hypothetical protein GOS_1275691 [marine metagenome]84.3      84.3	87%	1e-15	
EBN78289.1 hypothetical protein GOS_8189251 [marine metagenome]84.3      84.3	51%	1e-15	
ECD20916.1 hypothetical protein GOS_4330322 [marine metagenome]80.5      80.5	81%	2e-14	
EBL23404.1 hypothetical protein GOS_8603451 [marine metagenome]80.5	 80.5	74%	2e-14	
ECC19927.1 hypothetical protein GOS_4824622 [marine metagenome]77.8	 77.8	83%	1e-13	
ECA18118.1 hypothetical protein GOS_5887129 [marine metagenome]77.0	 77.0	90%	2e-13	
EBK29602.1 hypothetical protein GOS_8755925 [marine metagenome]77.0	 77.0	70%	2e-13	
ECM13022.1 hypothetical protein GOS_4126107 [marine metagenome]76.6	 76.6	89%	2e-13	
EDB60955.1 hypothetical protein GOS_1626242 [marine metagenome]75.9	 75.9	69%	4e-13	
EBL81460.1 hypothetical protein GOS_8509950 [marine metagenome]73.9	 73.9	72%	2e-12	
ECV78015.1 hypothetical protein GOS_2836784 [marine metagenome]73.9	 73.9	68%	2e-12	
ECD44217.1 hypothetical protein GOS_3432934 [marine metagenome]72.4	 72.4	81%	5e-12	
EBX88779.1 hypothetical protein GOS_6517362 [marine metagenome]71.2	 71.2	72%	9e-12	
ECJ30169.1 hypothetical protein GOS_4865127 [marine metagenome]71.2	 71.2	71%	1e-11	
ECQ77423.1 hypothetical protein GOS_4323591 [marine metagenome]68.6	 68.6	80%	7e-11	
EDB11118.1 hypothetical protein GOS_1880221 [marine metagenome]66.2	 66.2	72%	4e-10	
EBU24568.1 hypothetical protein GOS_7144060 [marine metagenome]62.8	 62.8	60%	4e-09	
EBW29005.1 hypothetical protein GOS_6769902 [marine metagenome]62.4	 62.4	76%	5e-09	
EDG04535.1 hypothetical protein GOS_849949 [marine metagenome] 61.6	 61.6	69%	9e-09	
ECT89181.1 hypothetical protein GOS_4166703 [marine metagenome]57.4	 57.4	60%	2e-07	
ECT76578.1 hypothetical protein GOS_4646595 [marine metagenome]54.3	 54.3	76%	1e-06	
ECU30093.1 hypothetical protein GOS_4911241 [marine metagenome]50.8	 50.8	60%	2e-05	

BLAST

PROTOCOL:


a) BLASTP / ORF on the direct strand, frame 2, 154 aa, against the Swiss-Prot protein database; NCBI default parameters except "number of descriptions" = 500.


b) BLASTP / ORF on the direct strand, frame 2, 154 aa, against the non-redundant protein database nr; NCBI default parameters except "number of descriptions" = 500.


c) BLASTP / ORF on the direct strand, frame 2, 154 aa, against the environmental samples protein database; NCBI default parameters except "number of descriptions" = 500.



ANALYSIS OF RESULTS:



a) blastp of the putative ORF against Swiss-Prot:


12 sequences are found.

- max score: 32.0; E-value: 1.5

- min score: 29.3; E-value: 9.4


---> The scores are very low and vary little, and the E-values are high (all above 1). The hits are therefore most likely false positives: chance matches rather than homologs. In addition, the hit lengths vary widely (147 to 994 aa among the alignments shown), and very few are close in size to our 154 aa putative protein.
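As a rough illustration of why bit scores near 30 carry little weight, the sketch below uses the standard Karlin-Altschul relation E ≈ N · 2^(-S), where S is the bit score and N the effective search space. N is not reported in the output, so it is back-calculated here from the best Swiss-Prot hit; that back-calculation and the function names are assumptions of this example. Reported E-values also include composition-based adjustments, so only approximate agreement should be expected.

```python
def expected_hits(bit_score: float, search_space: float) -> float:
    """Karlin-Altschul estimate of chance hits: E = N * 2**(-S)."""
    return search_space * 2.0 ** (-bit_score)


def implied_search_space(bit_score: float, e_value: float) -> float:
    """Invert the relation to recover N from one reported (S, E) pair."""
    return e_value * 2.0 ** bit_score


# Best Swiss-Prot hit reported above: 32.0 bits, E = 1.5
space = implied_search_space(32.0, 1.5)
print(f"implied effective search space: {space:.2e}")

# Predicted E-value for the weakest hit (29.3 bits); the report lists 9.4
print(f"predicted E at 29.3 bits: {expected_hits(29.3, space):.1f}")

# Each extra bit halves the expected number of chance hits
print(f"predicted E at 40 bits: {expected_hits(40.0, space):.3f}")
```

With a search space of this size, a hit needs roughly 40 bits or more before its expected number of chance occurrences drops well below 1, which is why none of the Swiss-Prot hits can be trusted.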



We will refine the search using a larger database, nr.


b) blastp of the putative ORF against nr:


13 sequences are found.

- max score: 38.1; E-value: 0.29

- min score: 33.1; E-value: 8.9


---> The results are similar to the previous ones: the scores are low and the E-values high, and only 2 sequences are about 150 amino acids long. Most of the proposed sequences also show low identity and a few gaps. We conclude that these sequences are not homologous, so no function can be assigned to our putative protein from them.


-> BLAST reports non-homologous sequences by chance, and also because short stretches of amino acids can be similar in proteins that are otherwise unrelated.



c) blastp of the putative ORF against environmental samples:


63 sequences are found.

- max score: 301; E-value: 5e-81

- min score: 31.6; E-value: 9.0


---> These results are far more significant than the previous ones: the best score is 301 bits, with a lowest (best) E-value of 5e-81. All hits are hypothetical proteins from marine metagenomes. We get no further information about biological function or taxonomy, because the environmental database contains only sequences that have been deposited but not yet analyzed in detail. This search therefore only confirms that our sequence is present in the database (the top hit, EDF75304.1, is 100% identical over all 154 residues) and lets us identify homologs by examining the pairwise alignments in detail.
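The selection of candidate homologs from the environmental hit list can be sketched as a small parser and filter. This is a minimal sketch, assuming the pasted line layout used on this page (accession, description ending in a bracketed source, max score, total score, coverage, E-value). The thresholds `max_evalue` and `min_coverage` are hypothetical illustration values, not the criteria the annotators actually applied.

```python
import re

# One hit per line: accession, description ending in "]", then four numeric columns
HIT_LINE = re.compile(
    r"^(?P<acc>\S+)\s+(?P<desc>.*\])\s*"
    r"(?P<max_score>[\d.]+)\s+(?P<total_score>[\d.]+)\s+"
    r"(?P<coverage>\d+)%\s+(?P<evalue>\S+)\s*$"
)


def parse_hits(text):
    """Turn pasted description lines into dictionaries; unparsable lines are skipped."""
    hits = []
    for line in text.splitlines():
        m = HIT_LINE.match(line.strip())
        if m:
            hits.append({
                "acc": m["acc"],
                "score": float(m["max_score"]),
                "coverage": int(m["coverage"]),
                "evalue": float(m["evalue"]),
            })
    return hits


def select_candidates(hits, max_evalue=1e-5, min_coverage=50):
    """Keep hits with a low E-value that also cover enough of the query."""
    return [h for h in hits
            if h["evalue"] <= max_evalue and h["coverage"] >= min_coverage]


# Four lines copied from the environmental-samples description list
SAMPLE = """\
EDF75304.1 hypothetical protein GOS_900216 [marine metagenome]301\t 301\t99%\t5e-81
EBN78289.1 hypothetical protein GOS_8189251 [marine metagenome]84.3\t84.3\t51%\t1e-15
EBY81693.1 hypothetical protein GOS_4316489 [marine metagenome]41.2\t41.2\t27%\t0.011
ECW35784.1 hypothetical protein GOS_2735370 [marine metagenome]31.6\t31.6\t31%\t9.0
"""

for hit in select_candidates(parse_hits(SAMPLE)):
    print(hit["acc"], hit["score"], f'{hit["coverage"]}%', hit["evalue"])
```

Filtering on coverage as well as E-value helps set aside short local matches that score well but span only a fraction of the 154 aa query.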








RAW RESULTS:


a) BLASTP against Swiss-Prot

DESCRIPTION:

Sequences producing significant alignments:                       (Bits)  Value

sp|Q8U403.1|COBD_PYRFU  RecName: Full=Probable cobalamin biosy...  32.0    1.5  
sp|Q6C4W5.1|PFA3_YARLI  RecName: Full=Palmitoyltransferase PFA...  30.8    3.4  
sp|B0U1L1.1|SYV_XYLFM  RecName: Full=Valyl-tRNA synthetase; Al...  30.0    5.6  
sp|O13789.1|YEOB_SCHPO  RecName: Full=Uncharacterized beta-glu...  29.6    6.3  
sp|Q48P92.1|PYRR_PSE14  RecName: Full=Bifunctional protein pyr...  29.6    6.4  
sp|O31698.1|YKUL_BACSU  RecName: Full=CBS domain-containing pr...  29.6    6.4  
sp|Q9PH12.1|SYV_XYLFA  RecName: Full=Valyl-tRNA synthetase; Al...  29.6    6.4  
sp|Q4ZZ69.1|PYRR_PSEU2  RecName: Full=Bifunctional protein pyr...  29.6    6.6  
sp|A6UYL3.1|PYRR_PSEA7  RecName: Full=Bifunctional protein pyr...  29.6    7.1  
sp|P33125.1|HSP82_AJECA  RecName: Full=Heat shock protein 82       29.6    7.1  
sp|B3E5Z2.1|MUTL_GEOLS  RecName: Full=DNA mismatch repair prot...  29.3    9.4  
sp|Q1GT16.1|AROA_SPHAL  RecName: Full=3-phosphoshikimate 1-car...  29.3    9.4  


ALIGNMENT:

>sp|Q8U403.1|COBD_PYRFU  RecName: Full=Probable cobalamin biosynthesis protein cobD
Length=285

 Score = 32.0 bits (71),  Expect = 1.5, Method: Compositional matrix adjust.
 Identities = 22/78 (28%), Positives = 43/78 (55%), Gaps = 15/78 (19%)

Query  71   VSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQHILPTPTGLVL  130
            + ++   +RRR   P++D           G +SSL+V+ +A++  +HL + LP P  L+L
Sbjct  32   IEFFDNKYRRRS--PYLDFLV--------GAISSLVVIGLAFIL-SHLPNFLPNPFNLIL  80

Query  131  SLIGFLMLTQSAYVLMSI  148
            S    + L +S++ + S+
Sbjct  81   S----IYLLKSSFAIRSL  94


>sp|Q6C4W5.1|PFA3_YARLI  RecName: Full=Palmitoyltransferase PFA3; AltName: Full=Protein 
fatty acyltransferase 3
Length=401

 Score = 30.8 bits (68),  Expect = 3.4, Method: Compositional matrix adjust.
 Identities = 15/47 (31%), Positives = 29/47 (61%), Gaps = 5/47 (10%)

Query  20  EASGPEWRIHGLVGLA--LLINVVFLKLSI---GGPWDSETFTLGLI  61
           E+    WR++G+ G+A  ++ NV++LK+     G P D + F++ L+
Sbjct  49  ESDTTFWRVYGVAGVAIGIMCNVLYLKVCKVGPGSPTDIDNFSVPLV  95


>sp|B0U1L1.1|SYV_XYLFM  RecName: Full=Valyl-tRNA synthetase; AltName: Full=Valine--tRNA 
ligase; Short=ValRS
Length=991

 Score = 30.0 bits (66),  Expect = 5.6, Method: Composition-based stats.
 Identities = 21/85 (24%), Positives = 40/85 (47%), Gaps = 8/85 (9%)

Query  44   KLSIGGPWDSETFTLGLIGSISLALFYVSWYR--LTFRRRGLIPWVDLWK------EPSS  95
            +L +   W   TFT+    S ++   +V WY   L +R + L+ W  + K      E  +
Sbjct  133  RLGVSADWSRSTFTMDPQPSAAVTEAFVRWYEEGLIYRGQRLVNWDPVLKTAISDLEVEN  192

Query  96   SAKKGLLSSLIVLSMAWLSGNHLQH  120
             A++G+L S+       ++  H++H
Sbjct  193  VAEEGMLWSIRYPLSDGVTYEHIEH  217


>sp|O13789.1|YEOB_SCHPO  RecName: Full=Uncharacterized beta-glucan synthesis-associated 
protein C17G6.11c
Length=636

 GENE ID: 2542171 SPAC17G6.11c | glucosidase (predicted)
[Schizosaccharomyces pombe]

 Score = 29.6 bits (65),  Expect = 6.3, Method: Composition-based stats.
 Identities = 23/93 (24%), Positives = 39/93 (41%), Gaps = 5/93 (5%)

Query  10   ALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGG-----PWDSETFTLGLIGSI  64
            A + +G+  +  SGP+  I   VG      +    +   G     P   E   + L   I
Sbjct  497  AYQKYGFDYKPGSGPDALISWFVGDEYTWTMRQPAVGQNGNIASRPVSEEPMIVVLNFGI  556

Query  65   SLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSA  97
            S    Y  WY LTF +   + +V ++++ S S+
Sbjct  557  SPTWIYFYWYELTFPQTMYVDYVRIYQDSSDSS  589


>sp|Q48P92.1|PYRR_PSE14  RecName: Full=Bifunctional protein pyrR; Includes: RecName: Full=Pyrimidine 
operon regulatory protein; Includes: RecName: 
Full=Uracil phosphoribosyltransferase; Short=UPRTase
Length=170

 GENE ID: 3558892 pyrR | bifunctional pyrimidine regulatory protein PyrR uracil
phosphoribosyltransferase [Pseudomonas syringae pv. phaseolicola 1448A]

 Score = 29.6 bits (65),  Expect = 6.4, Method: Compositional matrix adjust.
 Identities = 15/46 (32%), Positives = 21/46 (45%), Gaps = 0/46 (0%)

Query  42  FLKLSIGGPWDSETFTLGLIGSISLALFYVSWYRLTFRRRGLIPWV  87
           F+ +  GG W ++     L     L    VS+YR  F + GL P V
Sbjct  31  FIGIRTGGVWVAQALLKALNNPAPLGTLDVSFYRDDFSQNGLHPQV  76


>sp|O31698.1|YKUL_BACSU  RecName: Full=CBS domain-containing protein ykuL
Length=147

 Score = 29.6 bits (65),  Expect = 6.4, Method: Compositional matrix adjust.
 Identities = 11/28 (39%), Positives = 18/28 (64%), Gaps = 0/28 (0%)

Query  15  GYTMEEASGPEWRIHGLVGLALLINVVF  42
           GYT      P +R+HGL+G  +++N +F
Sbjct  44  GYTAIPVLDPSYRLHGLIGTNMIMNSIF  71


>sp|Q9PH12.1|SYV_XYLFA  RecName: Full=Valyl-tRNA synthetase; AltName: Full=Valine--tRNA 
ligase; Short=ValRS
Length=994

 Score = 29.6 bits (65),  Expect = 6.4, Method: Composition-based stats.
 Identities = 21/85 (24%), Positives = 40/85 (47%), Gaps = 8/85 (9%)

Query  44   KLSIGGPWDSETFTLGLIGSISLALFYVSWYR--LTFRRRGLIPWVDLWK------EPSS  95
            +L +   W   TFT+    S ++   +V WY   L +R + L+ W  + K      E  +
Sbjct  133  RLGVSADWSRSTFTMDPQPSAAVTEAFVRWYEAGLIYRGQRLVNWDPVLKTAISDLEVEN  192

Query  96   SAKKGLLSSLIVLSMAWLSGNHLQH  120
             A++G+L S+       ++  H++H
Sbjct  193  VAEEGMLWSIRYPLSDGVTYEHVEH  217


>sp|Q4ZZ69.1|PYRR_PSEU2  RecName: Full=Bifunctional protein pyrR; Includes: RecName: Full=Pyrimidine 
operon regulatory protein; Includes: RecName: 
Full=Uracil phosphoribosyltransferase; Short=UPRTase
Length=170

 GENE ID: 3365959 Psyr_0483 | bifunctional pyrimidine regulatory protein PyrR
uracil phosphoribosyltransferase [Pseudomonas syringae pv. syringae B728a]

 Score = 29.6 bits (65),  Expect = 6.6, Method: Compositional matrix adjust.
 Identities = 15/46 (32%), Positives = 21/46 (45%), Gaps = 0/46 (0%)

Query  42  FLKLSIGGPWDSETFTLGLIGSISLALFYVSWYRLTFRRRGLIPWV  87
           F+ +  GG W ++     L     L    VS+YR  F + GL P V
Sbjct  31  FIGIRTGGVWVAQALLKALNNPAPLGTLDVSFYRDDFSQNGLHPQV  76


>sp|A6UYL3.1|PYRR_PSEA7  RecName: Full=Bifunctional protein pyrR; Includes: RecName: Full=Pyrimidine 
operon regulatory protein; Includes: RecName: 
Full=Uracil phosphoribosyltransferase; Short=UPRTase
Length=170

 GENE ID: 5355277 pyrR | bifunctional pyrimidine regulatory protein PyrR uracil
phosphoribosyltransferase [Pseudomonas aeruginosa PA7]

 Score = 29.6 bits (65),  Expect = 7.1, Method: Compositional matrix adjust.
 Identities = 16/46 (34%), Positives = 21/46 (45%), Gaps = 0/46 (0%)

Query  42  FLKLSIGGPWDSETFTLGLIGSISLALFYVSWYRLTFRRRGLIPWV  87
           ++ +  GG W +E     L     L    VS+YR  F R GL P V
Sbjct  31  YVGIHTGGIWVAEALLKALGNEEPLGTLDVSFYRDDFTRNGLHPQV  76


>sp|P33125.1|HSP82_AJECA  RecName: Full=Heat shock protein 82
Length=679

 Score = 29.6 bits (65),  Expect = 7.1, Method: Composition-based stats.
 Identities = 18/34 (52%), Positives = 22/34 (64%), Gaps = 1/34 (2%)

Query  7    LFE-ALEPFGYTMEEASGPEWRIHGLVGLALLIN  39
            LFE +L   G+T+EE SG   RIH LV L L I+
Sbjct  618  LFETSLLVSGFTIEEPSGFAGRIHKLVSLGLNID  651


>sp|B3E5Z2.1|MUTL_GEOLS  RecName: Full=DNA mismatch repair protein mutL
Length=589

 Score = 29.3 bits (64),  Expect = 9.4, Method: Composition-based stats.
 Identities = 10/21 (47%), Positives = 14/21 (66%), Gaps = 0/21 (0%)

Query  11   LEPFGYTMEEASGPEWRIHGL  31
            LEP G+ +EE  G  WRI+ +
Sbjct  474  LEPLGFELEEFGGQTWRINAV  494
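The "Identities" figure in each alignment header can be checked by hand. The sketch below recounts it for the last alignment above (the MUTL_GEOLS hit), assuming the two aligned strings are copied exactly from the Query and Sbjct lines; the function name `identity` is hypothetical. The percentage is truncated rather than rounded, which matches the figures printed in the headers on this page.

```python
def identity(query: str, subject: str):
    """Count identical, non-gap positions between two aligned strings."""
    if len(query) != len(subject):
        raise ValueError("aligned strings must have the same length")
    matches = sum(1 for q, s in zip(query, subject) if q == s and q != "-")
    return matches, len(query)


# Aligned segments copied from the MUTL_GEOLS alignment (query 11-31, subject 474-494)
query = "LEPFGYTMEEASGPEWRIHGL"
sbjct = "LEPLGFELEEFGGQTWRINAV"

matches, length = identity(query, sbjct)
# Integer division truncates the percentage
print(f"Identities = {matches}/{length} ({100 * matches // length}%)")
# prints: Identities = 10/21 (47%)
```

Nearly half the residues match here, yet the alignment spans only 21 positions, so the hit still has an E-value of 9.4: identity over a short window says little about homology.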




b) BLASTP against nr:

Query ID
    lcl|18431
Description
    None
Molecule type
    amino acid
Query Length
    155

Database Name
    nr
Description
    All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
Program
    BLASTP 2.2.22+



DESCRIPTION
Sequences producing significant alignments:                       (Bits)  Value

ref|ZP_01308161.1|  pyrimidine regulatory protein PyrR [Oceano...  38.1    0.29 
ref|ZP_06162448.1|  conserved hypothetical protein [Actinomyce...  35.0    2.6  
gb|EEY58175.1|  conserved hypothetical protein [Phytophthora i...  33.9    5.7  
ref|ZP_01076821.1|  pyrimidine operon regulatory protein PyrR ...  33.9    6.2  
ref|ZP_03644826.1|  hypothetical protein BACCOPRO_03216 [Bacte...  33.5    6.8  
gb|AAL28746.2|  LD15505p [Drosophila melanogaster]                 33.5    6.9  
ref|YP_001581975.1|  hypothetical protein Nmar_0641 [Nitrosopu...  33.5    7.4  
ref|NP_609190.2|  CG8460 [Drosophila melanogaster] >gb|AAF5261...  33.5    7.5  
ref|XP_002036166.1|  GM16819 [Drosophila sechellia] >gb|EDW520...  33.5    7.6  
ref|XP_002078602.1|  GD23511 [Drosophila simulans] >gb|EDX0418...  33.5    7.8  
ref|YP_001354638.1|  bifunctional pyrimidine regulatory protei...  33.5    7.9  
ref|ZP_03520118.1|  hypothetical protein RetlG_01712 [Rhizobiu...  33.5    8.2  
ref|XP_002120843.1|  PREDICTED: similar to predicted protein [...  33.1    8.9  


ALIGNMENT

>ref|ZP_01308161.1|  pyrimidine regulatory protein PyrR [Oceanobacter sp. RED65]
 gb|EAT11213.1|  pyrimidine regulatory protein PyrR [Oceanobacter sp. RED65]
Length=173

 Score = 38.1 bits (87),  Expect = 0.29, Method: Compositional matrix adjust.
 Identities = 21/71 (29%), Positives = 36/71 (50%), Gaps = 0/71 (0%)

Query  39   NVVFLKLSIGGPWDSETFTLGLIGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAK  98
            +V+ + +  GG W ++T    L  +  L L  +S+YR  F ++GL P V   + P    K
Sbjct  31   DVIIVGIHTGGAWVAQTLHQELKLNTELGLLDISFYRDDFTQKGLHPEVKSSELPEVEGK  90

Query  99   KGLLSSLIVLS  109
              +L   +V+S
Sbjct  91   TIILVDDVVMS  101


>ref|ZP_06162448.1|  conserved hypothetical protein [Actinomyces sp. oral taxon 848 
str. F0332]
 gb|EEZ78045.1|  conserved hypothetical protein [Actinomyces sp. oral taxon 848 
str. F0332]
Length=235

 Score = 35.0 bits (79),  Expect = 2.6, Method: Compositional matrix adjust.
 Identities = 27/66 (40%), Positives = 39/66 (59%), Gaps = 5/66 (7%)

Query  31  LVGLALLIN--VVFLKLSIGGPWDSETFTLGLIGSISLALFYVSWYRLTFRRRGLIPWVD  88
           LVG AL++N  +V L  ++GG  D+ TF   L+G++S A   VS     +RR+ L   VD
Sbjct  26  LVGNALVVNGLLVALVFALGG--DAGTFLRALLGAVS-AFGLVSALVFAWRRKRLKDAVD  82

Query  89  LWKEPS  94
            W+E S
Sbjct  83  RWQESS  88


>gb|EEY58175.1|  conserved hypothetical protein [Phytophthora infestans T30-4]
Length=1341

 Score = 33.9 bits (76),  Expect = 5.7, Method: Composition-based stats.
 Identities = 15/39 (38%), Positives = 25/39 (64%), Gaps = 0/39 (0%)

Query  9     EALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSI  47
             E LEP+G  ++ +SG   +I G +GL+L +  V + LS+
Sbjct  1074  ETLEPYGGQVKSSSGHPLKIRGWIGLSLRLGAVEVSLSV  1112


>ref|ZP_01076821.1|  pyrimidine operon regulatory protein PyrR [Marinomonas sp. MED121]
 gb|EAQ65040.1|  pyrimidine operon regulatory protein PyrR [Marinomonas sp. MED121]
Length=164

 Score = 33.9 bits (76),  Expect = 6.2, Method: Compositional matrix adjust.
 Identities = 20/71 (28%), Positives = 34/71 (47%), Gaps = 0/71 (0%)

Query  39   NVVFLKLSIGGPWDSETFTLGLIGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAK  98
            N V + +  GG W +E     L  ++ LA   +++YR  F R GL P V     P+   +
Sbjct  27   NPVVVGIHSGGVWVAERLIKALSTNVPLATLDITFYRDDFTRAGLHPKVKQTSLPAIENQ  86

Query  99   KGLLSSLIVLS  109
              +L   +++S
Sbjct  87   HIILVDDVLMS  97


>ref|ZP_03644826.1|  hypothetical protein BACCOPRO_03216 [Bacteroides coprophilus 
DSM 18228]
 gb|EEF77694.1|  hypothetical protein BACCOPRO_03216 [Bacteroides coprophilus 
DSM 18228]
Length=363

 Score = 33.5 bits (75),  Expect = 6.8, Method: Compositional matrix adjust.
 Identities = 12/44 (27%), Positives = 22/44 (50%), Gaps = 0/44 (0%)

Query  66   LALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLS  109
            L L Y  ++RL + R+G++ W+D W+              +VL+
Sbjct  109  LELHYCKFFRLQYNRKGILGWIDKWRTRQDERIVRRFDKFVVLT  152


>gb|AAL28746.2|  LD15505p [Drosophila melanogaster]
Length=428

 Score = 33.5 bits (75),  Expect = 6.9, Method: Compositional matrix adjust.
 Identities = 22/74 (29%), Positives = 30/74 (40%), Gaps = 7/74 (9%)

Query  48   GGPWDSETFTLGLIGSISLALFYVSWYRLTFRRRGL-------IPWVDLWKEPSSSAKKG  100
            GGP D + F LGL+    LA   VS +R  F+  GL       + +V  W        K 
Sbjct  69   GGPQDQDVFDLGLVSPEPLAKDIVSNHRGYFKETGLRRFNGTTLGYVTPWNSHGYDVAKI  128

Query  101  LLSSLIVLSMAWLS  114
                  ++S  WL 
Sbjct  129  FAKKFDIISPVWLQ  142


>ref|YP_001581975.1|  hypothetical protein Nmar_0641 [Nitrosopumilus maritimus SCM1]
 gb|ABX12537.1|  hypothetical protein Nmar_0641 [Nitrosopumilus maritimus SCM1]
Length=313

 GENE ID: 5774432 Nmar_0641 | hypothetical protein
[Nitrosopumilus maritimus SCM1]

 Score = 33.5 bits (75),  Expect = 7.4, Method: Compositional matrix adjust.
 Identities = 20/56 (35%), Positives = 32/56 (57%), Gaps = 3/56 (5%)

Query  24   PEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGLIGSISLAL---FYVSWYRL  76
            P   +H ++  +L+I    LK++I     S  F LG+IG+I++ L   F +SWY L
Sbjct  205  PSAGVHSIIIFSLVIGAFMLKMNIPRARKSMYFVLGIIGTITVNLIRIFSLSWYAL  260


>ref|NP_609190.2|  CG8460 [Drosophila melanogaster]
 gb|AAF52614.1|  CG8460 [Drosophila melanogaster]
 gb|AAL28794.1|  LD18607p [Drosophila melanogaster]
 gb|ACL84074.1|  CG8460-PA [synthetic construct]
 gb|ACL89181.1|  CG8460-PA [synthetic construct]
Length=402

 GENE ID: 34115 CG8460 | CG8460 gene product from transcript CG8460-RA
[Drosophila melanogaster]

 Score = 33.5 bits (75),  Expect = 7.5, Method: Compositional matrix adjust.
 Identities = 22/74 (29%), Positives = 30/74 (40%), Gaps = 7/74 (9%)

Query  48   GGPWDSETFTLGLIGSISLALFYVSWYRLTFRRRGL-------IPWVDLWKEPSSSAKKG  100
            GGP D + F LGL+    LA   VS +R  F+  GL       + +V  W        K 
Sbjct  43   GGPQDQDVFDLGLVSPEPLAKDIVSNHRGYFKETGLRRFNGTTLGYVTPWNSHGYDVAKI  102

Query  101  LLSSLIVLSMAWLS  114
                  ++S  WL 
Sbjct  103  FAKKFDIISPVWLQ  116


>ref|XP_002036166.1|  GM16819 [Drosophila sechellia]
 gb|EDW52089.1|  GM16819 [Drosophila sechellia]
Length=402

 GENE ID: 6611630 Dsec\GM16819 | GM16819 gene product from transcript GM16819-RA
[Drosophila sechellia]

 Score = 33.5 bits (75),  Expect = 7.6, Method: Compositional matrix adjust.
 Identities = 22/74 (29%), Positives = 30/74 (40%), Gaps = 7/74 (9%)

Query  48   GGPWDSETFTLGLIGSISLALFYVSWYRLTFRRRGL-------IPWVDLWKEPSSSAKKG  100
            GGP D + F LGL+    LA   VS +R  F+  GL       + +V  W        K 
Sbjct  43   GGPQDQDVFDLGLVSPEPLAKDIVSNHRGYFKETGLRRFNGTTLGYVTPWNSHGYDVAKI  102

Query  101  LLSSLIVLSMAWLS  114
                  ++S  WL 
Sbjct  103  FAKKFDIISPVWLQ  116


>ref|XP_002078602.1|  GD23511 [Drosophila simulans]
 gb|EDX04187.1|  GD23511 [Drosophila simulans]
Length=402

 GENE ID: 6731451 Dsim\GD23511 | GD23511 gene product from transcript GD23511-RA
[Drosophila simulans]

 Score = 33.5 bits (75),  Expect = 7.8, Method: Compositional matrix adjust.
 Identities = 22/74 (29%), Positives = 30/74 (40%), Gaps = 7/74 (9%)

Query  48   GGPWDSETFTLGLIGSISLALFYVSWYRLTFRRRGL-------IPWVDLWKEPSSSAKKG  100
            GGP D + F LGL+    LA   VS +R  F+  GL       + +V  W        K 
Sbjct  43   GGPQDQDVFDLGLVSPEPLAKDIVSNHRGYFKETGLRRFNGTTLGYVTPWNSHGYDVAKI  102

Query  101  LLSSLIVLSMAWLS  114
                  ++S  WL 
Sbjct  103  FAKKFDIISPVWLQ  116





c) BLASTP against environmental samples

DESCRIPTION (columns: accession and description, max score, total score, query coverage, E-value):

EDF75304.1 hypothetical protein GOS_900216 [marine metagenome]301	 301	99%	5e-81	
EDG10892.1 hypothetical protein GOS_838951 [marine metagenome]282	 282	97%	3e-75	
EBN01749.1 hypothetical protein GOS_8315127 [marine metagenome]136	 136	94%	2e-31	
EBN38109.1 hypothetical protein GOS_8255980 [marine metagenome]129	 129	98%	3e-29	
EBK69923.1 hypothetical protein GOS_8689362 [marine metagenome]126	 126	90%	2e-28	
EBW77633.1 hypothetical protein GOS_6693422 [marine metagenome]123	 123	98%	2e-27	
EDF64348.1 hypothetical protein GOS_919126 [marine metagenome]122	 122	90%	3e-27	
EDF87642.1 hypothetical protein GOS_878966 [marine metagenome]121	 121	98%	8e-27	
ECH45445.1 hypothetical protein GOS_5190370 [marine metagenome]121	 121	95%	9e-27	
EDF80869.1 hypothetical protein GOS_890542 [marine metagenome]118	 118	89%	7e-26	
ECH90300.1 hypothetical protein GOS_3423331 [marine metagenome]106	 106	77%	3e-22	
ECF72060.1 hypothetical protein GOS_5109402 [marine metagenome]97.8	97.8	97%	1e-19	
ECQ21037.1 hypothetical protein GOS_3060423 [marine metagenome]96.7	96.7	93%	3e-19	
ECT42809.1 hypothetical protein GOS_6018401 [marine metagenome]95.1	95.1	83%	6e-19	
EBA68462.1 hypothetical protein GOS_360053 [marine metagenome]94.0	94.0	87%	2e-18	
EBN06076.1 hypothetical protein GOS_8308395 [marine metagenome]90.9	90.9	84%	1e-17	
EDI93593.1 hypothetical protein GOS_1783225 [marine metagenome]88.6	88.6	91%	6e-17	
EDG45611.1 hypothetical protein GOS_778845 [marine metagenome]88.2	88.2	82%	7e-17	
ECU35447.1 hypothetical protein GOS_4689927 [marine metagenome]88.2	88.2	91%	8e-17	
EBT93118.1 hypothetical protein GOS_7192607 [marine metagenome]88.2	88.2	97%	8e-17	
EBD28922.1 hypothetical protein GOS_9953355 [marine metagenome]87.8	87.8	88%	1e-16	
EBU77893.1 hypothetical protein GOS_7007326 [marine metagenome]85.9	85.9	91%	4e-16	
EBZ66621.1 hypothetical protein GOS_4415766 [marine metagenome]84.7	84.7	83%	8e-16	
EDI82476.1 hypothetical protein GOS_1802492 [marine metagenome]84.7	84.7	91%	1e-15	
EBL11661.1 hypothetical protein GOS_8620954 [marine metagenome]84.7	84.7	83%	1e-15	
EDD60700.1 hypothetical protein GOS_1275691 [marine metagenome]84.3	84.3	87%	1e-15	
EBN78289.1 hypothetical protein GOS_8189251 [marine metagenome]84.3	84.3	51%	1e-15	
ECD20916.1 hypothetical protein GOS_4330322 [marine metagenome]80.5	80.5	81%	2e-14	
EBL23404.1 hypothetical protein GOS_8603451 [marine metagenome]80.5	80.5	74%	2e-14	
ECC19927.1 hypothetical protein GOS_4824622 [marine metagenome]77.8	77.8	83%	1e-13	
ECA18118.1 hypothetical protein GOS_5887129 [marine metagenome]77.0	77.0	90%	2e-13	
EBK29602.1 hypothetical protein GOS_8755925 [marine metagenome]77.0	77.0	70%	2e-13	
ECM13022.1 hypothetical protein GOS_4126107 [marine metagenome]76.6	76.6	89%	2e-13	
EDB60955.1 hypothetical protein GOS_1626242 [marine metagenome]75.9	75.9	69%	4e-13	
EBL81460.1 hypothetical protein GOS_8509950 [marine metagenome]73.9	73.9	72%	2e-12	
ECV78015.1 hypothetical protein GOS_2836784 [marine metagenome]73.9	73.9	68%	2e-12	
ECD44217.1 hypothetical protein GOS_3432934 [marine metagenome]72.4	72.4	81%	5e-12	
EBX88779.1 hypothetical protein GOS_6517362 [marine metagenome]71.2	71.2	72%	9e-12	
ECJ30169.1 hypothetical protein GOS_4865127 [marine metagenome]71.2	71.2	71%	1e-11	
ECQ77423.1 hypothetical protein GOS_4323591 [marine metagenome]68.6	68.6	80%	7e-11	
EDB11118.1 hypothetical protein GOS_1880221 [marine metagenome]66.2	66.2	72%	4e-10	
EBU24568.1 hypothetical protein GOS_7144060 [marine metagenome]62.8	62.8	60%	4e-09	
EBW29005.1 hypothetical protein GOS_6769902 [marine metagenome]62.4	62.4	76%	5e-09	
EDG04535.1 hypothetical protein GOS_849949 [marine metagenome]61.6	61.6	69%	9e-09	
ECT89181.1 hypothetical protein GOS_4166703 [marine metagenome]57.4	57.4	60%	2e-07	
EDB07587.1 hypothetical protein GOS_1886194 [marine metagenome]54.7	54.7	42%	1e-06	
ECG56178.1 hypothetical protein GOS_5256911 [marine metagenome]54.7	54.7	36%	1e-06	
ECT76578.1 hypothetical protein GOS_4646595 [marine metagenome]54.3	54.3	76%	1e-06	
EDG23931.1 hypothetical protein GOS_816322 [marine metagenome]51.2	51.2	52%	1e-05	
ECU30093.1 hypothetical protein GOS_4911241 [marine metagenome]50.8	50.8	60%	2e-05	
EBA86257.1 hypothetical protein GOS_329017 [marine metagenome]46.2	46.2	80%	3e-04	
EBY81693.1 hypothetical protein GOS_4316489 [marine metagenome]41.2	41.2	27%	0.011	
EBW77646.1 hypothetical protein GOS_6693409 [marine metagenome]38.9	38.9	53%	0.054	
EBG20883.1 hypothetical protein GOS_9471878 [marine metagenome]35.0	35.0	46%	0.91	
EDB54582.1 hypothetical protein GOS_1805397 [marine metagenome]34.7	34.7	29%	1.1	
ECG55383.1 hypothetical protein GOS_5284482 [marine metagenome]34.7	34.7	54%	1.1	
ECW29559.1 hypothetical protein GOS_2746504 [marine metagenome]33.5	33.5	38%	2.4	
ECW52028.1 hypothetical protein GOS_2706418 [marine metagenome]33.1	33.1	35%	3.5	
ECW50134.1 hypothetical protein GOS_2710162 [marine metagenome]32.7	32.7	35%	4.1	
EDE61141.1 hypothetical protein GOS_1100605 [marine metagenome]32.3	32.3	51%	5.7	
EBF47063.1 hypothetical protein GOS_9592355 [marine metagenome]32.0	32.0	52%	7.6	
ECJ49610.1 hypothetical protein GOS_4098280 [marine metagenome]31.6	31.6	20%	9.0	
ECW35784.1 hypothetical protein GOS_2735370 [marine metagenome]31.6	31.6	31%	9.0



ALIGNMENT:


>gb|EDF75304.1|  hypothetical protein GOS_900216 [marine metagenome]
Length=154

 Score =  301 bits (771),  Expect = 5e-81, Method: Compositional matrix adjust.
 Identities = 154/154 (100%), Positives = 154/154 (100%), Gaps = 0/154 (0%)

Query  1    MKIILSLFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGL  60
            MKIILSLFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGL
Sbjct  1    MKIILSLFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGL  60

Query  61   IGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQH  120
            IGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQH
Sbjct  61   IGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQH  120

Query  121  ILPTPTGLVLSLIGFLMLTQSAYVLMSIGPLSDD  154
            ILPTPTGLVLSLIGFLMLTQSAYVLMSIGPLSDD
Sbjct  121  ILPTPTGLVLSLIGFLMLTQSAYVLMSIGPLSDD  154


>gb|EDG10892.1|  hypothetical protein GOS_838951 [marine metagenome]
Length=151

 Score =  282 bits (721),  Expect = 3e-75, Method: Compositional matrix adjust.
 Identities = 142/151 (94%), Positives = 148/151 (98%), Gaps = 0/151 (0%)

Query  4    ILSLFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGLIGS  63
            +LSLFEAL+PFG+TMEEASGPEWRIHGLVG ALLINVVFLK+SI GPWDSETFTLGLIGS
Sbjct  1    MLSLFEALDPFGHTMEEASGPEWRIHGLVGAALLINVVFLKISIDGPWDSETFTLGLIGS  60

Query  64   ISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQHILP  123
            +S+ALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKK LLSSLIVLSMAWLSGNHLQHILP
Sbjct  61   VSIALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKVLLSSLIVLSMAWLSGNHLQHILP  120

Query  124  TPTGLVLSLIGFLMLTQSAYVLMSIGPLSDD  154
            TPTGLVLSLIGFLMLTQSAYVLMSIGPLSDD
Sbjct  121  TPTGLVLSLIGFLMLTQSAYVLMSIGPLSDD  151


>gb|EBN01749.1|  hypothetical protein GOS_8315127 [marine metagenome]
Length=155

 Score =  136 bits (343),  Expect = 2e-31, Method: Compositional matrix adjust.
 Identities = 68/146 (46%), Positives = 101/146 (69%), Gaps = 0/146 (0%)

Query  8    FEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGLIGSISLA  67
            F+A  P    ME  + PEWR H  +G  LL+N + +KL+  GPW +E+FTLG+IG I L 
Sbjct  8    FQAQTPPRPYMEAKAAPEWRAHASIGSILLLNTLTVKLAPAGPWGAESFTLGVIGLIGLV  67

Query  68   LFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQHILPTPTG  127
              Y +WYRLTF+R+GL+PW+D+W+ P  S++  +++ L +++ AWL+GN ++  LP PTG
Sbjct  68   FLYTAWYRLTFKRKGLVPWMDMWENPKDSSRTVIVAGLAIIASAWLAGNVVEEALPKPTG  127

Query  128  LVLSLIGFLMLTQSAYVLMSIGPLSD  153
            L+L+L+G L L    YV +S+G LSD
Sbjct  128  LILTLVGLLTLLNGVYVYLSVGALSD  153


>gb|EBN38109.1|  hypothetical protein GOS_8255980 [marine metagenome]
Length=158

 Score =  129 bits (324),  Expect = 3e-29, Method: Compositional matrix adjust.
 Identities = 70/153 (45%), Positives = 97/153 (63%), Gaps = 0/153 (0%)

Query  1    MKIILSLFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGL  60
            M  + S FEAL PFG ++   +        ++G  LL+N + L +S  GPWDSE+FTLG+
Sbjct  1    MTFLFSRFEALMPFGSSVGSDTSANGTAPAIIGSLLLLNTMILGISFSGPWDSESFTLGI  60

Query  61   IGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQH  120
            IG   L  +YVSWYR TF R+GLIPW+D W  P  S+K+ L   +  + ++W +GN LQ 
Sbjct  61   IGMTGLGFWYVSWYRFTFNRKGLIPWLDNWDNPEQSSKQVLAVGVFTIVLSWAAGNPLQE  120

Query  121  ILPTPTGLVLSLIGFLMLTQSAYVLMSIGPLSD  153
             LP PTGLVL LIG L+     Y ++++GPL+D
Sbjct  121  YLPDPTGLVLLLIGLLISLSGIYSMLALGPLAD  153


>gb|EBK69923.1|  hypothetical protein GOS_8689362 [marine metagenome]
Length=144

 Score =  126 bits (317),  Expect = 2e-28, Method: Compositional matrix adjust.
 Identities = 69/140 (49%), Positives = 90/140 (64%), Gaps = 1/140 (0%)

Query  15   GYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGLIGSISLALFYVSWY  74
            G  M E   P+W     +G  LLIN +F+K+S   PW SE+FTLG+IG I L + Y+SWY
Sbjct  6    GIVMTEDKAPDWIKFLFIGFILLINTIFIKISSPWPWGSESFTLGVIGLIGLVMLYISWY  65

Query  75   RLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQHILPTPTGLVLSLIG  134
            R TFRRRGL+PW+DLWKEP  S+ K  L S I+   +++ G + Q+  P PT L+LSLI 
Sbjct  66   RFTFRRRGLVPWLDLWKEPKESSIKLFLFSFIITIFSYVLGKN-QYFFPDPTSLILSLIA  124

Query  135  FLMLTQSAYVLMSIGPLSDD  154
             L   Q+ YVL+S   L DD
Sbjct  125  LLTFIQATYVLLSTTVLLDD  144


>gb|EBW77633.1|  hypothetical protein GOS_6693422 [marine metagenome]
Length=158

 Score =  123 bits (309),  Expect = 2e-27, Method: Compositional matrix adjust.
 Identities = 70/153 (45%), Positives = 95/153 (62%), Gaps = 0/153 (0%)

Query  1    MKIILSLFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGL  60
            M  +LS FE   P G  +   +        ++G  LL+N + L  S  GPWDSE+FTLG+
Sbjct  1    MTFLLSRFEGQIPVGSIVGNETEVNGVTPAIIGSILLLNTMILGFSFPGPWDSESFTLGI  60

Query  61   IGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQH  120
            IG   L  +YVSWYR TF R+GLIPW+D WK P  S+K+ L   +  + ++W +GN LQ 
Sbjct  61   IGMTGLGFWYVSWYRFTFNRKGLIPWLDNWKTPEKSSKQVLAVGVFTIILSWTAGNPLQE  120

Query  121  ILPTPTGLVLSLIGFLMLTQSAYVLMSIGPLSD  153
             LP PTGLVL LIG L+    AY ++++GPL+D
Sbjct  121  YLPDPTGLVLLLIGLLISLSGAYSMLALGPLAD  153


>gb|EDF64348.1|  hypothetical protein GOS_919126 [marine metagenome]
Length=147

 Score =  122 bits (307),  Expect = 3e-27, Method: Compositional matrix adjust.
 Identities = 58/140 (41%), Positives = 98/140 (70%), Gaps = 0/140 (0%)

Query  15   GYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGLIGSISLALFYVSWY  74
            G++M   S P+W+ H +  + L+++ +   ++  GPWD  +F+ G+IG +  +L YV+WY
Sbjct  8    GHSMVSQSEPQWKPHAVAAVLLILDALVFGVAPSGPWDDSSFSRGVIGLVGASLAYVAWY  67

Query  75   RLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQHILPTPTGLVLSLIG  134
            R TF+R+GL+PW+DLW++P  SA+  L +S+  L+++W++GN +Q  LP PTGL+L L+G
Sbjct  68   RRTFKRKGLVPWIDLWEKPEESARLVLYASIGFLAISWVAGNPMQPHLPDPTGLILVLVG  127

Query  135  FLMLTQSAYVLMSIGPLSDD  154
             L+  Q+ YV + IGPL ++
Sbjct  128  LLLGLQAVYVYLVIGPLKEE  147


>gb|EDF87642.1|  hypothetical protein GOS_878966 [marine metagenome]
Length=158

 Score =  121 bits (303),  Expect = 8e-27, Method: Compositional matrix adjust.
 Identities = 69/153 (45%), Positives = 94/153 (61%), Gaps = 0/153 (0%)

Query  1    MKIILSLFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGL  60
            M  +LS FE   P G  +   +        ++G  LL+N + L  S  GPWDSE+FTLG+
Sbjct  1    MTFLLSRFEGQIPVGSIVGNETEVNGVTPAIIGSILLLNTMILGFSFPGPWDSESFTLGI  60

Query  61   IGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQH  120
            IG   L  +YVSWYR TF R+GLIPW+D WK P  S+K+ L   +  + ++W +GN LQ 
Sbjct  61   IGMTGLGFWYVSWYRFTFNRKGLIPWLDNWKTPEKSSKQVLAVGVFTIILSWTAGNPLQE  120

Query  121  ILPTPTGLVLSLIGFLMLTQSAYVLMSIGPLSD  153
             LP PTGLVL LIG L+     Y ++++GPL+D
Sbjct  121  YLPDPTGLVLLLIGLLISLSGIYSMLALGPLAD  153


>gb|ECH45445.1|  hypothetical protein GOS_5190370 [marine metagenome]
Length=147

 Score =  121 bits (303),  Expect = 9e-27, Method: Compositional matrix adjust.
 Identities = 62/148 (41%), Positives = 99/148 (66%), Gaps = 1/148 (0%)

Query  7    LFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGLIGSISL  66
            +  A  P  Y + + S P+W+ H +    L+++ + + ++ GGPWD  +F+ GLIG +  
Sbjct  1    IITAARPAHYMVSQ-SEPQWKPHAVAAGLLILDALVIGVAPGGPWDDSSFSRGLIGLVGA  59

Query  67   ALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQHILPTPT  126
             L YV+WYR TF R+GL+PW+DLW++P  SA+  L +S+  L+++W++GN +Q  LP PT
Sbjct  60   CLAYVAWYRRTFNRKGLVPWIDLWEKPEESARLVLYASIGFLTISWIAGNPMQPHLPEPT  119

Query  127  GLVLSLIGFLMLTQSAYVLMSIGPLSDD  154
            GLVL L+G L+  Q+ YV + IGPL ++
Sbjct  120  GLVLVLVGLLLGLQAVYVYLVIGPLKEE  147


>gb|EDF80869.1|  hypothetical protein GOS_890542 [marine metagenome]
Length=147

 Score =  118 bits (295),  Expect = 7e-26, Method: Compositional matrix adjust.
 Identities = 59/139 (42%), Positives = 94/139 (67%), Gaps = 0/139 (0%)

Query  16   YTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGLIGSISLALFYVSWYR  75
            ++M   S P W+ H      L+++ + L ++  GPWD  +F+ G+IG +   L YV+WYR
Sbjct  9    HSMGGQSEPRWKPHAAAAGVLILDALVLGVAPSGPWDDSSFSRGVIGLVGACLAYVAWYR  68

Query  76   LTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQHILPTPTGLVLSLIGF  135
             TF+R+GL+PW+DLW++P  SA+  L +S+  L+++W++GN +Q  LP PTGLVL L+G 
Sbjct  69   RTFKRKGLVPWIDLWEKPEESARLVLYASIGFLTISWIAGNPMQPHLPDPTGLVLVLVGL  128

Query  136  LMLTQSAYVLMSIGPLSDD  154
            L+  Q+ YV + IGPL ++
Sbjct  129  LLGLQAVYVYLVIGPLKEE  147


>gb|ECH90300.1|  hypothetical protein GOS_3423331 [marine metagenome]
Length=121

 Score =  106 bits (264),  Expect = 3e-22, Method: Compositional matrix adjust.
 Identities = 49/120 (40%), Positives = 73/120 (60%), Gaps = 0/120 (0%)

Query  1    MKIILSLFEALEPFGYTMEEASGPEWRIHGLVGLALLINVVFLKLSIGGPWDSETFTLGL  60
            M  +LS FE   P G  +   +     +  ++G  L++N + L  S  GPWDSE+FTLG+
Sbjct  1    MTFLLSRFEGQIPIGSVVANETEVSGSVPAIIGSLLILNTMILGFSFPGPWDSESFTLGI  60

Query  61   IGSISLALFYVSWYRLTFRRRGLIPWVDLWKEPSSSAKKGLLSSLIVLSMAWLSGNHLQH  120
            IG   L  +YV+WYR TF R+GLIPW+D W++P  S+K+ L   +  + ++W +GN LQ 
Sbjct  61   IGMTGLGFWYVAWYRFTFNRKGLIPWLDNWQDPEKSSKQVLAVGVFTIILSWAAGNPLQE  120
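
The statistics line printed for each hit above (Identities, Positives, Gaps) can be re-read programmatically when comparing hits. The following is a minimal sketch, not part of the original annotation: the names `STAT_FIELD`, `parse_hit_stats` and `exact_percent` are hypothetical helpers, and the regular expression assumes the exact "label = n/m (p%)" layout shown in this report. It recomputes each percentage from the raw counts; in the hits listed here the reported value appears to be the recomputed one rounded down (for example 62/148 is reported as 41%).

```python
import re

# Assumed layout: "Label = count/length (percent%)", as printed in the hits above.
STAT_FIELD = re.compile(
    r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)"
)


def parse_hit_stats(line):
    """Return {label: (count, aligned_length, reported_percent)} for one statistics line."""
    return {
        label: (int(count), int(length), int(percent))
        for label, count, length, percent in STAT_FIELD.findall(line)
    }


def exact_percent(count, length):
    """Percentage recomputed from the raw counts, without rounding."""
    return 100.0 * count / length


if __name__ == "__main__":
    # Statistics line copied from the hit against ECH45445.1 above.
    line = " Identities = 62/148 (41%), Positives = 99/148 (66%), Gaps = 1/148 (0%)"
    for label, (count, length, shown) in parse_hit_stats(line).items():
        value = exact_percent(count, length)
        print(f"{label}: {count}/{length} = {value:.1f}% (reported {shown}%)")
```

Recomputing from the counts is useful mainly for ranking hits whose reported percentages tie, such as the two 45% hits of length 158.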


TAXONOMY REPORT:

marine metagenome .    63 hits    1 orgs [root; unclassified sequences; metagenomes; ecological metagenomes]