ORF XW19480

From Metagenes
Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : AACY01160155.1
Annotathon code: ORF_XW19480
Sample :
  • GPS :31°10'30n; 64°19'27.6w
  • Sargasso Sea: Sargasso Sea, Station 11 - Bermuda (UK)
  • Open Ocean (-5m, 20.5°C, 0.1-0.8 microns)
Authors
Team : Biochimie 2007
Username : stefcec
Annotated on : 2008-03-19 18:52:37
  • BLANCHARD Cécile
  • MANEVILLE Stéphanie

Synopsis

Genomic Sequence

>AACY01160155.1 ORF_XW19480 genomic DNA
ATAAGATGATTGGTTTGATTTTTGGTGAGACAAGTTTTCCAAATGAAATTCTAAAAAAAGTGAAGAAAAAAAAATTAAATTATTTAATAATTGATTTAAC
TCAATTAAAAAAATTTAAAAAAGATAAAAAATCTTATTCAGTTTCTATAGGTCAATTTGGAAAAATTATTAATATTCTAAAAAAAAATAATTGCAAAAAA
GTCTTATTTGCTGGAAAGGTAAATAAGCCAAACTTTTCTAGATTAAAATTAGACTTTAAAGGCATTTATTATATTCCAAGAATTATAAAAGCATCAAAGC
TTGGAGATGCGGCTATAATAAAAGAAATTATTAAAATATTAGCTCAAAACAAGATAAAAACTGAAAATTCACTAAAATTCAATCCTGAGCTTTCATTAAA
AAGGGGAAATTATTCAAAGATAAAACCAAACGATCATGATCAATCAGATATTAAAAAGGCAATTAAAAAATTAAATAATTTAAGACAATATAATTTTAGC
CAAGGGGTTGTAGTTAGAAATAAAACAGTTGTAGCTATAGAAGGAAAGGGTGGAACAAAAAAAATGCTTGAAAAAAGTAAAAGTAAAAAATTTAGAAATC
ATGGTGTTTTAGTAAAATTTCCCAAAAAAAAACAAGACTTAAGAGTAGATCTACCTACTATTGGATTAAAAACTTTAAAACAAAGCAAAACTGCTGGGTT
AAAGGGAATTGTAGTTAAAAATAAACAGCATGTTTTTTTAGATAAAAACCAATGTATTAAATTTGCTAATCAAAATAGAATGTTTATTTCAGTAAAATGA
AAAAGATATTTGTTTTAACTGGCGAACCTTCAGGTGATAAATTAGCATCTACTGTAATTGCTAAACTTAAACAAAATCATTCAGATATAGAGTATCTGAG
TGTG

Translation

[6 - 797/904]   direct strand
>ORF_XW19480 Translation [6-797   direct strand]
MIGLIFGETSFPNEILKKVKKKKLNYLIIDLTQLKKFKKDKKSYSVSIGQFGKIINILKKNNCKKVLFAGKVNKPNFSRLKLDFKGIYYIPRIIKASKLG
DAAIIKEIIKILAQNKIKTENSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLRQYNFSQGVVVRNKTVVAIEGKGGTKKMLEKSKSKKFRNHG
VLVKFPKKKQDLRVDLPTIGLKTLKQSKTAGLKGIVVKNKQHVFLDKNQCIKFANQNRMFISVK

Phylogeny

USING INFOBIOGEN
NJ TREE ROOTED WITH Leptospira interrogans (Linter130):


                   +--------------Ckstutt   (planctomycetes)  O
                   ! 
  +---------------S8  +---------Pmlms1    (delta )  O
  !                !  !  
  !                +S10  +----------------Lintra    (delta)  O
  !                   !  !  
  !                   !  !  +---------------Rbalt    (planctomycetes)   O
  !                   +S11  !  
  !                      !  !  +----------Mmc1      (proteo)   O
  !                      !  !  !  
  !                      +S12  !                          +Cpela1062 (alpha) O
  !                         !  !                  +------S3 
  !                         !  !        +--------S4       +Cpela1002  (alpha) O
  !                         !  !        !         ! 
  !                         +S13  +----T9         +------NOTRESEQ  
  !                            !  !     ! 
  !                            !  !     !                +Fm25586   (fusobacteries)  X
  !                            !  !     +----------------S2 
  !                            !  !                      +Fn49256   (fusobacteries)   X
  !                            !  !  
  !                            +S16  +--------------Goxy    (alpha)  O
  !                               !  !  
  !                               !  !             +--Ccres15   (alpha)  O
  !                               !  !        +---S5 
  !                               !  !     +S14    +---CspK31    (alpha)   O
  !                               +S18     !  !  
  !                                  !  +S15  +---------Oalex     (alpha)   O
  !                                  !  !  !  
  !                                  !  !  +----------Pberm     (alpha)    O
  !                                  !  !  
  !                                  +S19            +----RpaluB5   (alpha)    O
  !                                     !          +S6 
  !                                     !  +------S7 +-----RpaluA53   (alpha)  O
  !                                     !  !       ! 
  !                                     +S20       +------Bjapo     (alpha)   O
  !                                        !  
  !                                        !  +-----------Sagg      (alpha)   O
  !                                        +S17  
  !                                           +------------Smed      (alpha)   O
  ! 
  S1Linter    (spirochetes)  O
  ! 
  +Linter130    (spirochetes)  O



PARSIMONY TREE ROOTED WITH Leptospira interrogans (Linter130):

  +--------------------------------------------------------------Linter130 
  !  
  !                                                           +--Fn49256   
  !                                                  +-------15  
  !                                                  !        +--Fm25586   
  !     +-------------------------------------------14  
  !     !                                            !     +-----NOTRESEQ  
  !     !                                            +----13  
  !     !                                                  !  +--Cpela1002 
  !     !                                                  +-12  
  !     !                                                     +--Cpela1062 
  !     !  
  !     !                             +--------------------------Goxy      
  !     !                             !  
  1     !                             !                       +--Smed      
  !  +-11                             !              +-------17  
  !  !  !                             !              !        +--Sagg      
  !  !  !              +-------------18  +----------16  
  !  !  !              !              !  !           !     +-----Bjapo     
  !  !  !              !              !  !           +----10  
  !  !  !              !              !  !                 !  +--RpaluA53  
  !  !  !              !              !  !                 +--9  
  !  !  !              !              +--8                    +--RpaluB5   
  !  !  !              !                 !  
  !  !  !              !                 !                    +--Pberm     
  !  !  !              !                 !              +-----7  
  !  !  +--------------4                 !              !     +--Oalex     
  !  !                 !                 +--------------6  
  +--2                 !                                !     +--CspK31    
     !                 !                                +-----5  
     !                 !                                      +--Ccres15   
     !                 !  
     !                 !                             +-----------Lintra    
     !                 !                             !  
     !                 !                             !        +--Ckstutt   
     !                 +----------------------------19     +-21  
     !                                               !  +-20  +--Rbalt     
     !                                               !  !  !  
     !                                               +--3  +-----Mmc1      
     !                                                  !  
     !                                                  +--------Pmlms1    
     !  
     +-----------------------------------------------------------Linter    



Annotator commentaries

1. ORF search:

The ORF search was carried out with the ORF Finder tool in SMS (Sequence Manipulation Suite; the parameters used are given in the results). Both reading directions (direct and reverse strand) were searched. We first allowed the ORF to start at any codon, then restricted the search to the start codons ATG, GTG, CTG and TTG.

With "any codon", we obtain a single ORF, on the direct strand and in reading frame 3. This ORF runs from nucleotide 3 to nucleotide 800. Nucleotide 3 is the first nucleotide after a stop codon. The ORF ends with a STOP codon.

We then repeated the ORF search with the alternative start codons in order to find a START codon downstream of nucleotide 3. Again we obtain a single ORF, on the direct strand and in frame 3, which begins with a methionine whose first nucleotide is at position 6 and ends at the same STOP codon.
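The search described above can be illustrated with a minimal Python sketch. This is not the SMS ORF Finder itself: the function name `find_orfs` is hypothetical, only the direct strand is scanned, and the toy input is the first 17 bases of the read followed by an artificial TGA stop codon. The codon table is the standard genetic code.

```python
# Standard genetic code, written in the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def find_orfs(dna, starts=("ATG",)):
    """Return (start, end, protein) for complete ORFs on the direct strand.

    Positions are 1-based; `end` is the last base before the stop codon.
    """
    dna = dna.upper()
    found = []
    for frame in range(3):
        pos = frame
        while pos + 3 <= len(dna):
            if dna[pos:pos + 3] not in starts:
                pos += 3
                continue
            residues = ["M"]  # the initiator codon is read as methionine
            end = pos + 3
            stop_seen = False
            while end + 3 <= len(dna):
                aa = CODON_TABLE.get(dna[end:end + 3], "X")
                if aa == "*":
                    stop_seen = True
                    break
                residues.append(aa)
                end += 3
            if not stop_seen:
                break  # no in-frame stop before the end of the read
            found.append((pos + 1, end, "".join(residues)))
            pos = end + 3  # resume scanning after the stop codon
    return found


# Toy input (assumption): start of the read plus an artificial stop codon.
toy = "ATAAGATGATTGGTTTG" + "TGA"
print(find_orfs(toy))
```

On the toy input the only ORF starts at position 6, as in the real read. Passing `starts=("ATG", "GTG", "CTG", "TTG")` mimics the alternative start codon search.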

We therefore have a complete ORF encoding 264 amino acids (nucleotides 6 to 797). We then computed the molecular mass in SMS with the Protein Molecular Weight tool. Our protein has a molecular mass of 30.18 kDa.
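As a rough cross-check of this kind of calculation, the mass of a protein can be estimated by summing average residue masses and adding one water molecule. The sketch below is an assumption about how such a tool works in general, not the SMS implementation; the function name `protein_mass_da` is hypothetical, and small differences from the SMS value are to be expected depending on the mass table used.

```python
# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def protein_mass_da(sequence):
    """Estimate the average molecular mass of an unmodified protein, in Da."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


# Usage: paste the 264-residue translation in place of this short fragment.
fragment = "MIGLIFGETSFPNEILKKVKKKK"
print(round(protein_mass_da(fragment) / 1000, 2), "kDa")
```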

2. BLAST:

With the translated ORF we ran a blastp at NCBI against the nr and Swiss-Prot databases in order to find homologs of our ORF. Against Swiss-Prot we obtained few hits, with low scores (between 30 and 32) and high E-values (2.3 and above). Against nr, by contrast, we found many potential homologs with high scores (from 100 to more than 200) and low E-values (on the order of 1e-50 to 1e-25), with both BLASTx and BLASTp. From these hits we had to define our study group and our outgroup. According to the taxonomy report, the potential homologs with the best scores all belong to the alpha-proteobacteria. We therefore decided that our study group would consist only of alpha-proteobacteria. We took the potential homologs with the twelve best scores, which correspond to nine different species, some of them represented by several strains. We can suppose that our coding region comes from an alpha-proteobacterium.

Since our study group contains only one kind of proteobacteria (alpha), we chose several diverse groups for the outgroup, in order to broaden the study and better pin down where our protein really belongs. The sequences chosen nevertheless have satisfactory scores (from 70 to 100). The outgroup comprises: 1 proteobacterium, 2 delta-proteobacteria, 2 planctomycetes, 2 fusobacteria and 2 spirochetes.
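The selection of hits by score and E-value can be sketched in Python by parsing the one-line hit summaries of the BLAST report. This is a minimal sketch: the helper names `parse_hit` and `filter_hits` and the 1e-10 cutoff are assumptions chosen for illustration, and the three sample lines are copied from the BLAST section of this page.

```python
def parse_hit(line):
    """Split one BLAST summary line into (identifier, description, bits, E-value)."""
    left, score, evalue = line.rstrip().rsplit(None, 2)
    identifier, _, description = left.partition("  ")
    if evalue.startswith("e"):
        evalue = "1" + evalue  # older reports print e.g. "e-100"
    return identifier, description.strip(), float(score), float(evalue)


def filter_hits(lines, max_evalue):
    """Keep the hits whose E-value does not exceed `max_evalue`."""
    hits = [parse_hit(line) for line in lines if line.strip()]
    return [hit for hit in hits if hit[3] <= max_evalue]


sample = [
    "gi|71083615|ref|YP_266334.1|  Protein of unknown function (DUF...   221    3e-56 ",
    "gi|16126153|ref|NP_420717.1|  hypothetical protein CC_1910 [Ca...   105    3e-21 ",
    "gi|14423677|sp|Q9HJ26|DNLI_THEAC  DNA ligase (Polydeoxyribonucleo  32.0    2.3  ",
]

for identifier, description, bits, evalue in filter_hits(sample, max_evalue=1e-10):
    print(identifier, bits, evalue)
```

With this cutoff the two nr hits are kept and the Swiss-Prot hit (E-value 2.3) is discarded, which matches the reasoning of the paragraph above.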


3. Multiple alignment

After choosing our two groups, we renamed the sequences (first letter of the genus in upper case, followed by an abbreviation of the species in lower case). We did the same for our coding region (NOTRESEQ, "our sequence"). In the alignment, a few amino acids are conserved across all the sequences:
  • charged amino acids (E, D, K), which may be important for the function of the protein depending on their position in the 3D structure (stabilization of the protein, ...);
  • hydrophobic amino acids (A), which may contribute to a transmembrane domain or impose 90° angles in sheets, ..., again depending on the 3D structure;
  • other amino acids (Q, G, T, P), whose roles in 3D structures are less specific but which can form hydrogen bonds that stabilize the protein, or modulate alpha helices (angles), ...
All of this is only a hypothesis based on the properties of the individual amino acids. Nevertheless, these conserved residues very likely play an important role in the function of the protein, since they are found in all the potential homologs of our sequence as well as in the outgroup.
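Strictly conserved positions (the `*` marks in the CLUSTAL W output) can be recovered with a few lines of Python. The sketch below uses a hypothetical helper, `conserved_columns`, and only a 15-column excerpt of four sequences taken from the alignment on this page, so more columns look conserved here than in the full 22-sequence alignment.

```python
def conserved_columns(alignment):
    """Return (index, residue) for gap-free columns with a single residue type."""
    sequences = list(alignment.values())
    conserved = []
    for index, column in enumerate(zip(*sequences)):
        if "-" not in column and len(set(column)) == 1:
            conserved.append((index, column[0]))
    return conserved


# Excerpt from the start of the last full alignment block (same columns for all rows).
excerpt = {
    "NOTRESEQ": "DLRVDLPTIGLKTLK",
    "Cpela1062": "DLRMDLPTIGLQTLK",
    "Linter": "DHRFDLPTVGENTLK",
    "Fm25586": "DMRVDVPVIGLNTIE",
}

for index, residue in conserved_columns(excerpt):
    print(index, residue)
```

In the full alignment only the D, P, G and T of this excerpt (indices 4, 6, 9 and 12) carry a `*`; the D and R at indices 0 and 2 are conserved only within this four-sequence subset.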


4. Phylogenetic analysis

We built trees with two methods: parsimony and Neighbour Joining (NJ). Both trees are rooted, with Linter130 as the outgroup sequence because it has the lowest score in the outgroup. The two trees agree on the position of our sequence, although they differ in some deeper branches. We therefore chose to study the tree obtained by NJ.

Our sequence does fall within the alpha-proteobacteria. However, the two fusobacteria appear at node 9 (T9 in the NJ tree), which separates NOTRESEQ and the two potential homologs with the two best blastp scores (Cpela1062 and Cpela1002) on one side from the two fusobacteria on the other. This result leads us to hypothesize that a transfer of the coding region from the alpha-proteobacteria to the fusobacteria took place at node 9. We therefore describe the fusobacterial sequences as xenologs.

Apart from this atypical feature, our study group and our outgroup are clearly distinct.
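Neighbour Joining works from a matrix of pairwise distances. As an illustration of that first step only, the sketch below computes uncorrected p-distances (the fraction of differing positions) on a short excerpt of the alignment. The helper name `p_distance` is hypothetical, and the tree-building program used by the annotators may apply a different distance correction.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(a != b for a, b in pairs) / len(pairs)


# 15-column excerpt from the alignment on this page.
fragments = {
    "NOTRESEQ": "DLRVDLPTIGLKTLK",
    "Cpela1062": "DLRMDLPTIGLQTLK",
    "Fm25586": "DMRVDVPVIGLNTIE",
}

for name_a, name_b in combinations(fragments, 2):
    distance = p_distance(fragments[name_a], fragments[name_b])
    print(f"{name_a:10s} {name_b:10s} {distance:.3f}")
```

On this excerpt NOTRESEQ is closer to Cpela1062 (2 differences out of 15) than to Fm25586 (6 out of 15), which is consistent with the grouping seen in both trees.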

5. Protein domains

With InterPro we obtain a single domain, which is uncharacterized, but its entry indicates that it is a protein characteristic of bacteria. This is reassuring for our study as a whole. With Pfam we obtain the same, single result. With PROSITE we obtain no result.

Based on the NJ tree, we looked in the blastp results at the functional annotations of the two orthologs and the two xenologs closest to our sequence, in order to learn more about the possible function of our protein. Unfortunately, these are unknown or hypothetical proteins.

To conclude, our ORF does encode a complete protein originating from the alpha-proteobacteria. This protein is characteristic of bacteria, but its function is unknown.



Multiple Alignment

USING INFOBIOGEN
CLUSTAL W (1.82) multiple sequence alignment


Linter          ------------------------------------------------------------
Linter130       ------------------------------------------------------------
Pmlms1          ------------------------------------------------------------
Mmc1            ------------------------------------------------------------
Ccres15         ------------------------------------------------------------
CspK31          ------------------------------------------------------------
Oalex           ------------------------------------------------------------
Pberm           ------------------------------------------------------------
RpaluB5         ---------------------------------------------------------MTA
RpaluA53        ----------------------------------------------------------MT
Bjapo           -----------------------------------------------------MAADMTS
Cpela1062       ------------------------------------------------------------
Cpela1002       ------------------------------------------------------------
NOTRESEQ        ------------------------------------------------------------
Fm25586         ------------------------------------------------------------
Fn49256         ------------------------------------------------------------
Sagg            ---------------------------------------------------------MVR
Smed            MLQADFRGGWLDPCKRSRNPRRVSRLRTGSRDPGLYRRRERPRAVVAESRRQRLMSVAIH
Goxy            ------------------------------------------------------------
Lintra          ------------------------------------------------------------
Rbalt           ----------------------------------MTNALPMRNAVHNTSESRSHDCESIL
Ckstutt         ------------------------------------------------------------
                                                                            

Linter          -----MGKLGILAGAGELPHIGMKEAL--LAGEDPIFFSIIESDFHVGMYEDRNIPIHIV
Linter130       -----------------------------MAGEDPIFFSIIESDFHVGMYEDRNIPIHIV
Pmlms1          -MATRMSKLGIIAGGGQFPLLVAQAAR-RHGREVAVVAHRGESVPELEQAAASCLWIKLG
Mmc1            --------MGIIAGSGAIPALLIDKLRHCHHTAVVVAAHVGEADPKLTQLADAIEWVRLG
Ccres15         -----MRKLGLIAGGGALPVELASHCE--AAGRAFAVMRLRSFAD-PSLDRYPGADVGIG
CspK31          --MAVPRKLGLIAGGGSLPVELAQHCE--AAGRPFSVMRLRSFAE-PVLARYPGVEVGLG
Oalex           --MAVYQRLGLIAGGGDLPVYVARAAQ--TGDRLACVIALKGFAD-PTRYDSP-VIRGIA
Pberm           -MAEPWKKLGIIAGGGSLPLKIAESCQ--QQDAPFHILALSGYAD-DILKSFKPSWCGIG
RpaluB5         PGLQIGSPVGVIAGGGVLPFAIADSMQ--ARQITPLLIGLRGFCDPTGIARFRHHWISIG
RpaluA53        GASQIGSPVGLIAGGGVLPFAVADSLQ--ARGIGAVLFALKGSCDADQLSRYRHHWISIG
Bjapo           AAAGISSPVGVVAGGGAMPFAVADSLA--TRGITPVLFPLRGACDPVQVEKFRHRWISVG
Cpela1062       -------MIGLFLGDTDFSEAVLKNIK--KLNKRYFIIDFSK--NNKFKNDINSNRISIG
Cpela1002       -------MIGLFLGDTDFSEAVLKNIK--KLNKRYFIIDFSK--NNKFKNDINSNRISIG
NOTRESEQ        -------MIGLIFGETSFPNEILKKVK--KKKLNYLIIDLTQ--LKKFKKDKKSYSVSIG
Fm25586         -----MEKIGLIVGNGKFPLYFIEEAK--NSNISVYPIGLFPSVDEEIKKIDNYTEFNIG
Fn49256         -----MEKIGLIVGNGKFPLYFIEEAK--NSNISVYPIGLFPSVDDEIKKLDNYAEFNVG
Sagg            ADMATEPRLALIAGNGSLPCQIADALS--NAGREFKIIAIKGEAD-ERTRAQADTELGWG
Smed            RLPQSKGRLAIIAGAGALPHHVAEAAR--RQGENPFIIALSREAD-ADWTGFDHTVCAIG
Goxy            --MGDRGCIGILAGSGPLPAQVAAAAI--AKGRKVFVIGFRDFADRALLEPYPHEIIRLA
Lintra          -------MLGVIAGSGQFPFMVVKGAQ-EKGYKVIVCGFHGHTDSKLEDIADHFEMMPLG
Rbalt           RDVPEGAPVGLIAGWGRFPICVAEKLK-ALGHPVHCVAITGHAGEELNDICESVLWAGVG
Ckstutt         -----------MAGNGRFPILFAKGAK-NNNVPVIAVAIEGETSPEVGQYVEKLYWIGVA
                                                                            

Linter          KIGTLLKLCKRHNVDRLLLLGKVKKEIIFKNLKFDLKAIA-------LLARMINKH----
Linter130       KIGTLLKLCKRHNVDRLLLLGKVKKEIIFKNLKFDLKAIA-------LLARMINKH----
Pmlms1          QLGKMVNFLRRQGVQQCLFAGTITKTRIFRDVWPDFKALQ-------LWGRIDSRQ----
Mmc1            QFKRILRFFHAQGVTHIVMVGGITKTQIWN-IRPDTLALK-------IATRLKHMQ----
Ccres15         EFGKIFKALRAEGCDVVCFAG-NVSRPDFSALMPDARGLK-------VLPSLIVAAR-KG
CspK31          EFGKVFKALRAEGCEAVCFAG-VVERPDFAAIKPDLRGLT-------VMPGLINAAR-KG
Oalex           QLGQVVKDLRQADCDAVCFAG-IVTRPDFSALKPDLKGMA-------FLPQALAAAA-RG
Pberm           EVGKAIRVLKDHGCDAVVLAG-NVTRPNFATLRPDWRGAK-------LLPKILSAAT-QG
RpaluB5         QYGRLKRLLRAEHCRDVMFIG-SVVRPSLASVRLDWGAVR-------VLPSVMAAYR-GG
RpaluA53        AFGQLRRLLRAEQCRDVLFIG-ALVRPSLSAVRLDWGAIR-------VMPAILAAYR-GG
Bjapo           QLGRAMRLFREEGCRDLIFIG-TLVRPSLSEIRFDFTTLR-------LLGNVIRAFR-GG
Cpela1062       KFGKIIDLIKEKKSKKVLFAGKIAK-PKFSTLRLDLKGIY-------YMPSILKAAK-LG
Cpela1002       KFGKIINLIKEKKSKKVLFAGKIAK-PKFSTLRLDLKGIY-------YMPSILKAAK-LG
NOTRESEQ        QFGKIINILKKNNCKKVLFAGKVNK-PNFSRLKLDFKGIY-------YIPRIIKASK-LG
Fm25586         HIGEIIKYLLLRDINKIVMLGKVEKKLIFENLILDKYGEK-------IMEIVPDNK----
Fn49256         HIGEIIKYLLLRDVTKIVMLGKVEKRLIFENLILDKYGEK-------IMEIVPDNK----
Sagg            EIGRLYKFLKKTGCRDVLLIGGVSKRPDFTSILGDIGTLK-------RLPTIIRALA-GG
Smed            DFAAISHTFEAEKIDRVVLSGAVRRRPEWRDIRPTLKTLA-------KVPRVFRTLMSGG
Goxy            AAGDILGALKRNNCRELVLIG-PVRRPAWRDLRPDAEGAR-------ILARLGRAIF-SG
Lintra          QFNRLIRFFRRSGVIELCMAG-AINKPRALQVRPDFRAFR-------LYFSLCRKGD---
Rbalt           RFGGHLRYFKRNDVAHVTMAGKLFKSDLLYSGSVWIRHTPDWTCIKTFWPCLFGARRDAR
Ckstutt         QIGKLIKIFKQENVSKAVMAGGLTKGNMFSSLRNLRLLPDLR-----TINLWYKNVKRRD
                                  : *                                       

Linter          DYSIFKTVADEFAKEKITIISQKTFLQSLFLPEGRFTKKPLTQKELEDIAFGMDYAEKMA
Linter130       DYSIFKTVADEFAKEKITIISQKTFLQSLFLPEGRFTKKPLTQKELEDIAFGMDYAEKMA
Pmlms1          DDAILRAIADLLAAEGIEVVASTLYLEELLFPKGVLSRKKPNAEQRADIDFGWQMARRIG
Mmc1            DDHLLRAIAETLEERGFVVCGAHELAPELLAPVGILGHHRPNSELWQDMRLGWQMAKAIG
Ccres15         DDALLRRVLDEFEKEGFEIEGAHEVMGEMTLPRGRLGKVSPAPEHMADIDKALDVAREIG
CspK31          DDALLRRLLSEFEKEGFAIEGAHEVRGEMTLPRGSLGRHAPTDAHRTDMDRALTVARAIG
Oalex           DDALLRVIVGFFEKEGFTVIGANDIADELLVEPGLIGSIRPDAIAEADAKKALHVAGVTG
Pberm           DGAMLDVLVATFASEGFYVVGADDVATALTVPAGALGMLGPDTCDLSDMRKAAAVVAALG
RpaluB5         DDHLLTSIGRIFEGEGFRLYGVKDVAPELLMPRGELTQATPDEGHLADIAKGIAVLAALS
RpaluA53        DDHLLTGIGQIFERDGFRLLGLKDVAPDLLMPEGCMTRARPNKDTEADIAKGRAVLAALS
Bjapo           DDHLLSGVGRILEQGGFRMVGIKDVAPDLLMPEGCISRAWPSDTSKTDIERGRAVLTALG
Cpela1062       DAAIIKAIIKILDNEKIKVLSSVFFNPELTVKRGNYTKLKANRKDINSIKMGITYFNKLK
Cpela1002       DAAIIKAIIKILDNEKIKVLSSVFFNPELTVKRGNYTKLKANKSDINSIKMGITYFNKLK
NOTRESEQ        DAAIIKEIIKILAQNKIKTENSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLR
Fm25586         DETLLFAIIGFIKLSGIKVLPQSYLMKKFIFETKCYTEKEPDIDDEKTISIGIEAARLLS
Fn49256         DETLLFAIIGFIRLSGIKVLPQSYLMKKFIFETKCYTEKEPDFDDEKTISLGIEAARLLS
Sagg            DDSLLTKVIRLFEVEGYRVVGIKDVAPQLLASSGVLGKVQPSQTDWRDAELALRACEKLG
Smed            DDAVLRMVIELIEASGAHVIGAHEVVPGLLADVGPLGRHAPTDEDQRDIRAGIAAANALG
Goxy            DDGLLGAVVRVLGEEGFHVRGAHEFLEHATGRSGTLGRVLPDAQAKQDIARGVEVLKVMA
Lintra          -DALFRTIIKEFEKEGFLMVSPSTFVPFLHCPPGVLSNKQPDKDILAEISYGWPIATSLG
Rbalt           DDRLLGAVIDTYENHAMKICSATDLAPELLAKTGQLTRRKPSSAIQSDISSGWQIAKTMG
Ckstutt         DQTLLGAVADELLKDGIELQSSTLYVPQLLAKKGILTKKNPTDREMEDIYFAVPLAKEIA
                   ::  :                                           .        

Linter          GLDIGQTVVVLDKSVLAVEAVEGTDLAICRGGSFAKK----------GKATVCKSSKPNQ
Linter130       GLDIGQTVVVLDKSVLAVEAVEGTDLAICRGGSFAKK----------GKATVCKSSKPNQ
Pmlms1          ELDIGQCVVVRQRAVLAVEAIEGTDAAIRRGGELGR-----------EQAVVVKVRKPNQ
Mmc1            ALDIGQGVVVRERVVLAVEAVEGTDAMLQRAGKLSR-----------GGGCLVKVSKPQQ
Ccres15         RLDIGQGAVVCEGLVLAVEAQEGTDAMLRRVADLPEAIRGRAERR---LGVLAKAPKPIQ
CspK31          ALDVGQGAVVCDGLVLAVEAQEGTDAMLRRVADLPEAIRGRAEAP---RGVLAKAPKPIQ
Oalex           AEDIGQGAVVCKGLVLAVEAQEGTDQMLARVAGLPAELRGDELNR---SGVLAKRPKPGQ
Pberm           PFDVGQGAVVRQGFVIAIEAAEGTDLMLGRCAPLIARLQGEEGNRSERRGVLLKCPKPEQ
RpaluB5         PFDIGQGVIVIDGHVVAVEDIGGTDALLANLARLRAQGAIHAKPG---RGVLVKSPKSGQ
RpaluA53        PFDIGQGCVVIDGHVVSVEDTGGTDGLLRRVEQLRGERRLRAKPG---RGVLVKAPKSGQ
Bjapo           PFDIGQAAVVIDGHVVAVEDIEGTDALLARVARLREEGRIRAATG---RGVLVKAPKSSQ
Cpela1062       SLDHVQAIIVKDNTILAIEDQQGTKKMLSKLKKK-------------SEGILIKLPKKKQ
Cpela1002       SLDHVQAIIVKNDTIIAIEDHQGTKKMLSKLKKK-------------SEGILIKLPKKKQ
NOTRESEQ        QYNFSQGVVVRNKTVVAIEGKGGTKKMLEKSKSKKF----------RNHGVLVKFPKKKQ
Fm25586         RVDVGQTVVCRDRAVIAVEGIEGTDETLKRAGQYS-----------DKDNILIKMSRPQQ
Fn49256         RVDVGQTVVCRDRAVIAVEGIEGTDETLKRAGQYS-----------DKDNILIKMSRPQQ
Sagg            ELDIGQAAVAVGGRVVALEGAEGTDAMLQRCADLKRNGRIRAKSH---TGVLVKTAKPNQ
Smed            ALDVGQGAVAVGGRVVALEGAEGTDAMLARVSDLRKDGRISVRRR----GVLVKLCKPQQ
Goxy            ALDIGQGCVVQNGLVLAVEALEGTDAMLGRCGRLMQAG---------SGGVLVKMPKTGQ
Lintra          RFDIGQLIVVKQQMVIAIECLEGTNATLQRGAELGGK-----------NCVAIKIAKPIQ
Rbalt           GLDIGQAITIKDGTIIAVEAIEGTDACIARTGELCRRG----------GWTLVKVSKPDQ
Ckstutt         KHGIGQCIVVKEKVVLAVEAFEGTDEAIRRGGKLGRS-----------DVVVIKVCKQNF
                  .  *        ::::*   **.  : .                       *  :   

Linter          DHRFDLPTVGENTLKAMYENNCGTLALRTGETIIVHPKEFINLAEKFKINILSIGSGNLK
Linter130       DHRFDLPTVGENTLKTMYENNCGTLALRTGETIIVHPKEFINLAEKFKINILSIGSGNLK
Pmlms1          DFRFDLPAIGRQTIATMQEARAAVLAVEARQALLFDPRETLAAADQAGLVVVGVEEAADG
Mmc1            DLRLDMPTIGVATIQNLHRAGLRGLAVESGSTLIVDYIGMLAEADRLGIVVVGCDAAQMT
Ccres15         ETRVDLPTIGVATIHRAARAGLAGIVGEAGRLLVVDREAVIAAADDLGLFVLGVDPQERP
CspK31          ETKVDLPTIGVATVQRAARAGLAGVVGEAGRLLIVDREQVIACADDLGLFVYGVEPRADA
Oalex           ERRIDLPVIGVSTVQGAARAGLAGIVIPAGGAMVLGREAVGQAADAAGLAVWAVEMDGPA
Pberm           ERRVDLPTIGVRTVELAAEAGLAGIAVEASGGLVLDSGAVARCADARGLFVYGYTSHDLR
RpaluB5         DLRFDLPTLGPRTVEGVAAAGLAGIAVAAGNTLVAEPQETIKAADAAGLFVTGVPA----
RpaluA53        DLRFDLPALGPKTIEGLIAAQIAGVAVVAGHTVVAEPQAMVDAADRAGLFVTGVAA----
Bjapo           DLRFDLPTIGPRTIEGVARAGLAGIAVIAGNTIAAEPQAMIALADAKYLFIIGLPA----
Cpela1062       DLRMDLPTIGLQTLKDCKKYGLKGIVLRSKKNIFLDKAKCIAFANKNKIFVKII------
Cpela1002       DLRMDLPTIGLQTLKDCKKYGLKGIVLKSKKNIFLDKAKSIAFANKNKIFVKII------
NOTRESEQ        DLRVDLPTIGLKTLKQSKTAGLKGIVVKNKQHVFLDKNQCIKFANQNRMFISVK------
Fm25586         DMRVDVPVIGLNTIETAIKNGFKGIVAQAKKMIFLNQKECIELANKNNIFIVGKKI----
Fn49256         DMRVDVPVIGLNTVETAIKNGFKGIVAQAKKMIFLNQKECIELANKNNIFIVGKKI----
Sagg            DLRVDLPTVGPKTIDLAVAAGLAGIAVEASGALIAEKDVTLKKADDAGLFVIGIEHGSSI
Smed            DERADLPSIGPSTVAEAHAAGLAGIAIEAGRALVLERTRLVEAADRSGMFVLGIERNLRR
Goxy            DVRADMPTIGPETLENAARNGLRGVAFQPGVTLMTDPAGCVKLADRYGLFLYGLTPEDLK
Lintra          DERVDLPAIGLETIHLLVKYQFKCIAVSAEKTLFFDMPEALTLANKHKLCVMSLSDTDIR
Rbalt           DMRFDVPTIGPQTIQRVHEAGGAAIAIEAGKTILLDSEETIQLADRLGIALVAMASADTM
Ckstutt         DPRFDIPTVGLDTIKTLKESSASVLALEAGRTIILDIEETLAEADKAGISVIGIGASRLT
                : : *:* :*  *:          :.      :          *:   : :         

Linter          KINSTIQKIR--
Linter130       KINSTIQKIR--
Pmlms1          RLLF--------
Mmc1            DNMGREGPL---
Ccres15         ------------
CspK31          ------------
Oalex           DDANA-------
Pberm           EGA---------
RpaluB5         ------------
RpaluA53        ------------
Bjapo           ------------
Cpela1062       ------------
Cpela1002       ------------
NOTRESEQ        ------------
Fm25586         ------------
Fn49256         ------------
Sagg            DAVLSEQPDERS
Smed            ERE---------
Goxy            EK----------
Lintra          ------------
Rbalt           EDQLSSSRKAA-
Ckstutt         IDNLL-------

BLAST

USING NCBI

BLASTp AGAINST THE SWISSPROT DATABASE:


Sequences producing significant alignments:                       (Bits)  Value

gi|14423677|sp|Q9HJ26|DNLI_THEAC  DNA ligase (Polydeoxyribonucleo  32.0    2.3  
gi|8134566|sp|O50008|METE_ARATH  5-methyltetrahydropteroyltrig...  31.2    4.2   
gi|74585304|sp|Q59XV0|SET2_CANAL  Histone-lysine N-methyltrans...  30.8    5.0  
gi|25090272|sp|Q8K923|EX1_BUCAP  Exodeoxyribonuclease I (Exonu...  30.0    8.6  


BLASTx AGAINST THE NR DATABASE:
Sequences producing significant alignments:                       (Bits)  Value

gi|71083615|ref|YP_266334.1|  Protein of unknown function (DUF...   221    3e-56 
gi|91761964|ref|ZP_01263929.1|  hypothetical protein PU1002_01...   221    5e-56
gi|16126153|ref|NP_420717.1|  hypothetical protein CC_1910 [Ca...   105    3e-21 
gi|113933567|ref|ZP_01419469.1|  Protein of unknown function D...   101    5e-20
gi|84701776|ref|ZP_01016351.1|  hypothetical protein PB2503_02...   101    6e-20
gi|91977315|ref|YP_569974.1|  protein of unknown function DUF1...   100    1e-19 
gi|52699350|ref|ZP_00340758.1|  COG3494: Uncharacterized prote...  94.7    6e-18
gi|91205179|ref|YP_537534.1|  hypothetical protein RBE_0364 [R...  94.0    1e-17 
gi|15604565|ref|NP_221083.1|  hypothetical protein RP730 [Rick...  94.0    1e-17 
gi|67458570|ref|YP_246194.1|  hypothetical protein RF_0178 [Ri...  92.8    2e-17 
gi|109728947|ref|ZP_01380358.1|  hypothetical protein RbelO_01...  92.0    4e-17
gi|102191508|ref|ZP_01347320.1|  hypothetical protein RcanM_01...  91.7    5e-17
gi|86749933|ref|YP_486429.1|  Protein of unknown function DUF1...  91.7    5e-17 
gi|51473899|ref|YP_067656.1|  rickettsial conserved hypothetic...  91.3    7e-17 
gi|34581279|ref|ZP_00142759.1|  hypothetical protein [Ricketts...  90.5    1e-16
gi|27379959|ref|NP_771488.1|  hypothetical protein bll4848 [Br...  90.1    1e-16 
gi|42454173|ref|ZP_00154080.1|  COG3494: Uncharacterized prote...  90.1    1e-16
gi|15893033|ref|NP_360747.1|  hypothetical protein RC1110 [Ric...  89.7    2e-16 
gi|78695193|ref|ZP_00859705.1|  conserved hypothetical protein...  88.6    4e-16
gi|115524572|ref|YP_781483.1|  protein of unknown function DUF...  88.6    4e-16 
gi|110679827|ref|YP_682834.1|  hypothetical protein RD1_2597 [...  86.7    2e-15 
gi|114768807|ref|ZP_01446433.1|  hypothetical protein OM2255_0...  86.3    2e-15
gi|110633745|ref|YP_673953.1|  protein of unknown function DUF...  85.5    4e-15 
gi|19703931|ref|NP_603493.1|  hypothetical protein FN0596 [Fus...  84.0    1e-14 
gi|83858375|ref|ZP_00951897.1|  hypothetical protein OA2633_02...  84.0    1e-14
gi|114797181|ref|YP_760484.1|  hypothetical protein HNE_1780 [...  83.6    1e-14 
gi|34764132|ref|ZP_00145004.1|  hypothetical protein [Fusobact...  83.2    2e-14
gi|114569943|ref|YP_756623.1|  protein of unknown function DUF...  82.8    2e-14 
gi|58040254|ref|YP_192218.1|  hypothetical protein GOX1823 [Gl...  82.8    2e-14 
gi|92117253|ref|YP_576982.1|  protein of unknown function DUF1...  82.4    3e-14 
gi|75676038|ref|YP_318459.1|  Protein of unknown function DUF1...  82.4    3e-14 
gi|85716984|ref|ZP_01047947.1|  hypothetical protein NB311A_06...  82.0    4e-14
gi|118589999|ref|ZP_01547403.1|  hypothetical protein SIAM614_...  81.3    7e-14
gi|113872922|ref|ZP_01413051.1|  conserved hypothetical protei...  81.3    7e-14
gi|99081244|ref|YP_613398.1|  protein of unknown function DUF1...  81.3    7e-14 
gi|90423948|ref|YP_532318.1|  protein of unknown function DUF1...  79.3    3e-13 
gi|39935975|ref|NP_948251.1|  hypothetical protein RPA2910 [Rh...  79.0    3e-13 
gi|83951894|ref|ZP_00960626.1|  hypothetical protein ISM_15065...  78.2    6e-13
gi|17987117|ref|NP_539751.1|  hypothetical protein BMEI0834 [B...  76.6    2e-12 
gi|126737633|ref|ZP_01753363.1|  hypothetical protein RSK20926...  75.5    4e-12
gi|23502028|ref|NP_698155.1|  conserved hypothetical protein T...  75.5    4e-12 
gi|116251988|ref|YP_767826.1|  hypothetical protein RL2232 [Rh...  75.1    5e-12 
gi|15888711|ref|NP_354392.1|  hypothetical protein AGR_C_2562 ...  75.1    5e-12 
gi|83311584|ref|YP_421848.1|  hypothetical protein amb2485 [Ma...  75.1    5e-12 
gi|46202584|ref|ZP_00052965.2|  COG3494: Uncharacterized prote...  75.1    5e-12
gi|121525911|ref|ZP_01658847.1|  conserved hypothetical protei...  74.7    6e-12
gi|15965259|ref|NP_385612.1|  hypothetical protein SMc02090 [S...  74.7    6e-12 
gi|86138413|ref|ZP_01056987.1|  hypothetical protein MED193_04...  74.7    6e-12
gi|86357545|ref|YP_469437.1|  hypothetical protein RHE_CH01924...  74.3    8e-12 
gi|89360180|ref|ZP_01197999.1|  conserved hypothetical protein...  73.9    1e-11
gi|69933302|ref|ZP_00628504.1|  conserved hypothetical protein...  73.2    2e-11 
gi|114327609|ref|YP_744766.1|  hypothetical protein GbCGDNIH1_...  72.8    2e-11 
gi|83953539|ref|ZP_00962260.1|  hypothetical protein NAS141_04...  72.0    4e-11
gi|124514699|gb|EAY56211.1|  Uncharacterized protein conserved...  71.6    5e-11
gi|126462141|ref|YP_001043255.1|  protein of unknown function ...  70.9    9e-11 
gi|121602728|ref|YP_988901.1|  hypothetical protein BARBAKC583...  70.5    1e-10 
gi|126726878|ref|ZP_01742717.1|  hypothetical protein RB2150_0...  70.1    2e-10
gi|77463267|ref|YP_352771.1|  hypothetical protein RSP_2715 [R...  70.1    2e-10 
gi|94269143|ref|ZP_01291381.1|  Protein of unknown function DU...  69.7    2e-10
gi|83592933|ref|YP_426685.1|  Protein of unknown function DUF1...  69.7    2e-10 
gi|49475420|ref|YP_033461.1|  Phosphatidate cytidyltransferase...  69.3    3e-10 
gi|116751168|ref|YP_847855.1|  protein of unknown function DUF...  69.3    3e-10 
gi|56696556|ref|YP_166913.1|  hypothetical protein SPO1674 [Si...  68.9    3e-10 
gi|83942320|ref|ZP_00954781.1|  hypothetical protein EE36_1480...  68.6    5e-10
gi|121542724|ref|ZP_01674441.1|  conserved hypothetical protei...  68.2    6e-10
gi|32476481|ref|NP_869475.1|  hypothetical protein RB10538 [Rh...  68.2    6e-10 
gi|13470831|ref|NP_102400.1|  hypothetical protein mll0631 [Me...  67.0    1e-09 
gi|83368776|ref|ZP_00913637.1|  conserved hypothetical protein...  67.0    1e-09
gi|1262295|gb|AAA96792.1|  ORF9; hypothetical protein              66.6    2e-09
gi|117925149|ref|YP_865766.1|  protein of unknown function DUF...  66.6    2e-09 
gi|91202217|emb|CAJ75277.1|  conserved hypothetical protein [C...  66.2    2e-09
gi|84500831|ref|ZP_00999066.1|  hypothetical protein OB2597_01...  66.2    2e-09
gi|144898245|emb|CAM75109.1|  conserved hypothetical protein, ...  65.5    4e-09
gi|91761963|ref|ZP_01263928.1|  lipid-A-disaccharide synthase ...  64.3    9e-09
gi|71083616|ref|YP_266335.1|  lipid-A-disaccharide synthase (l...  62.8    2e-08 
gi|118737582|ref|ZP_01585985.1|  protein of unknown function D...  62.4    3e-08
gi|49474286|ref|YP_032328.1|  Phosphatidate cytidyltransferase...  62.4    3e-08 
gi|94987560|ref|YP_595493.1|  hypothetical protein LI1118 [Law...  61.6    6e-08 
gi|46580770|ref|YP_011578.1|  hypothetical protein DVU2365 [De...  61.6    6e-08 
gi|89068198|ref|ZP_01155608.1|  hypothetical protein OG2516_02...  61.6    6e-08
gi|120601945|ref|YP_966345.1|  protein of unknown function DUF...  61.2    7e-08 
gi|114764261|ref|ZP_01443489.1|  hypothetical protein R2601_25...  61.2    7e-08
gi|89210177|ref|ZP_01188569.1|  Protein of unknown function DU...  61.2    7e-08
gi|116327597|ref|YP_797317.1|  hypothetical protein LBL_0829 [...  60.1    2e-07 
gi|45658416|ref|YP_002502.1|  hypothetical protein LIC12578 [L...  59.7    2e-07 
gi|24213797|ref|NP_711278.1|  hypothetical protein LA1097 [Lep...  59.7    2e-07 
gi|121535891|ref|ZP_01667688.1|  protein of unknown function D...  59.3    3e-07
gi|78356422|ref|YP_387871.1|  hypothetical protein Dde_1375 [D...  58.2    6e-07 
gi|114704864|ref|ZP_01437772.1|  hypothetical protein FP2506_0...  57.8    8e-07
gi|89054940|ref|YP_510391.1|  hypothetical protein Jann_2449 [...  57.8    8e-07 
gi|126729716|ref|ZP_01745529.1|  hypothetical protein SSE37_04...  57.0    1e-06
gi|15606496|ref|NP_213876.1|  hypothetical protein aq_1276 [Aq...  54.3    9e-06 
gi|90419603|ref|ZP_01227513.1|  conserved hypothetical protein...  53.9    1e-05
gi|88941965|ref|ZP_01147345.1|  conserved hypothetical protein...  52.8    3e-05
gi|116625259|ref|YP_827415.1|  protein of unknown function DUF...  52.4    3e-05 
gi|51246794|ref|YP_066678.1|  hypothetical protein DP2942 [Des...  50.8    1e-04 
gi|87307079|ref|ZP_01089225.1|  hypothetical protein DSM3645_0...  48.9    4e-04
gi|121538196|ref|ZP_01669971.1|  protein of unknown function D...  48.5    5e-04
gi|86157513|ref|YP_464298.1|  protein of unknown function DUF1...  46.6    0.002 
gi|108763981|ref|YP_632885.1|  hypothetical protein MXAN_4723 ...  45.8    0.003 


BLASTp AGAINST THE NR DATABASE:

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

gi|71083615|ref|YP_266334.1|  Protein of unknown function (DUF...   209    1e-52
gi|91761964|ref|ZP_01263929.1|  hypothetical protein PU1002_01...   208    2e-52
gi|16126153|ref|NP_420717.1|  hypothetical protein CC_1910 [Ca...   141    3e-32
gi|84701776|ref|ZP_01016351.1|  hypothetical protein PB2503_02...   137    5e-31
gi|91977315|ref|YP_569974.1|  protein of unknown function DUF1...   136    1e-30
gi|113933567|ref|ZP_01419469.1|  Protein of unknown function D...   135    2e-30
gi|115524572|ref|YP_781483.1|  protein of unknown function DUF...   122    2e-26
gi|27379959|ref|NP_771488.1|  hypothetical protein bll4848 [Br...   122    2e-26
gi|118589999|ref|ZP_01547403.1|  hypothetical protein SIAM614_...   120    1e-25
gi|58040254|ref|YP_192218.1|  hypothetical protein GOX1823 [Gl...   118    2e-25
gi|83858375|ref|ZP_00951897.1|  hypothetical protein OA2633_02...   116    1e-24
gi|78695193|ref|ZP_00859705.1|  conserved hypothetical protein...   114    5e-24
gi|86749933|ref|YP_486429.1|  Protein of unknown function DUF1...   114    6e-24
gi|113872922|ref|ZP_01413051.1|  conserved hypothetical protei...   112    2e-23
gi|39935975|ref|NP_948251.1|  hypothetical protein RPA2910 [Rh...   111    4e-23
gi|102191508|ref|ZP_01347320.1|  hypothetical protein RcanM_01...   109    1e-22
gi|110633745|ref|YP_673953.1|  protein of unknown function DUF...   109    2e-22
gi|114327609|ref|YP_744766.1|  hypothetical protein GbCGDNIH1_...   109    2e-22
gi|90423948|ref|YP_532318.1|  protein of unknown function DUF1...   108    2e-22
gi|91205179|ref|YP_537534.1|  hypothetical protein RBE_0364 [R...   108    2e-22
gi|83951894|ref|ZP_00960626.1|  hypothetical protein ISM_15065...   107    7e-22
gi|17987117|ref|NP_539751.1|  hypothetical protein BMEI0834 [B...   107    8e-22
gi|99081244|ref|YP_613398.1|  protein of unknown function DUF1...   106    1e-21
gi|23502028|ref|NP_698155.1|  conserved hypothetical protein T...   105    2e-21
gi|51473899|ref|YP_067656.1|  rickettsial conserved hypothetic...   105    2e-21
gi|75676038|ref|YP_318459.1|  Protein of unknown function DUF1...   105    3e-21
gi|109728947|ref|ZP_01380358.1|  hypothetical protein RbelO_01...   105    3e-21
gi|85716984|ref|ZP_01047947.1|  hypothetical protein NB311A_06...   105    3e-21
gi|114797181|ref|YP_760484.1|  hypothetical protein HNE_1780 [...   104    4e-21
gi|114569943|ref|YP_756623.1|  protein of unknown function DUF...   104    4e-21
gi|15888711|ref|NP_354392.1|  hypothetical protein AGR_C_2562 ...   104    5e-21
gi|89360180|ref|ZP_01197999.1|  conserved hypothetical protein...   104    5e-21
gi|52699350|ref|ZP_00340758.1|  COG3494: Uncharacterized prote...   103    1e-20
gi|116251988|ref|YP_767826.1|  hypothetical protein RL2232 [Rh...   102    2e-20
gi|92117253|ref|YP_576982.1|  protein of unknown function DUF1...   102    2e-20
gi|34581279|ref|ZP_00142759.1|  hypothetical protein [Ricketts...   102    2e-20
gi|67458570|ref|YP_246194.1|  hypothetical protein RF_0178 [Ri...   102    2e-20
gi|42454173|ref|ZP_00154080.1|  COG3494: Uncharacterized prote...   102    2e-20
gi|15893033|ref|NP_360747.1|  hypothetical protein RC1110 [Ric...   101    4e-20
gi|15604565|ref|NP_221083.1|  hypothetical protein RP730 [Rick...   100    7e-20
gi|86357545|ref|YP_469437.1|  hypothetical protein RHE_CH01924...   100    9e-20
gi|94269143|ref|ZP_01291381.1|  Protein of unknown function DU...   100    9e-20
gi|19703931|ref|NP_603493.1|  hypothetical protein FN0596 [Fus...   100    1e-19
gi|34764132|ref|ZP_00145004.1|  hypothetical protein [Fusobact...  99.4    2e-19
gi|110679827|ref|YP_682834.1|  hypothetical protein RD1_2597 [...  99.4    2e-19
gi|86138413|ref|ZP_01056987.1|  hypothetical protein MED193_04...  99.4    2e-19
gi|126726878|ref|ZP_01742717.1|  hypothetical protein RB2150_0...  98.6    3e-19
gi|114768807|ref|ZP_01446433.1|  hypothetical protein OM2255_0...  98.6    3e-19
gi|124514699|gb|EAY56211.1|  Uncharacterized protein conserved...  98.2    3e-19
gi|1262295|gb|AAA96792.1|  ORF9; hypothetical protein              96.7    1e-18
gi|69933302|ref|ZP_00628504.1|  conserved hypothetical protein...  96.3    1e-18
gi|126737633|ref|ZP_01753363.1|  hypothetical protein RSK20926...  95.5    2e-18
gi|56696556|ref|YP_166913.1|  hypothetical protein SPO1674 [Si...  95.1    3e-18
gi|13470831|ref|NP_102400.1|  hypothetical protein mll0631 [Me...  94.7    5e-18
gi|83953539|ref|ZP_00962260.1|  hypothetical protein NAS141_04...  91.3    5e-17
gi|83311584|ref|YP_421848.1|  hypothetical protein amb2485 [Ma...  90.9    7e-17
gi|117925149|ref|YP_865766.1|  protein of unknown function DUF...  90.5    7e-17
gi|15965259|ref|NP_385612.1|  hypothetical protein SMc02090 [S...  90.5    8e-17
gi|83592933|ref|YP_426685.1|  Protein of unknown function DUF1...  90.5    8e-17
gi|94987560|ref|YP_595493.1|  hypothetical protein LI1118 [Law...  90.1    9e-17
gi|49475420|ref|YP_033461.1|  Phosphatidate cytidyltransferase...  90.1    1e-16
gi|46202584|ref|ZP_00052965.2|  COG3494: Uncharacterized prote...  90.1    1e-16
gi|84500831|ref|ZP_00999066.1|  hypothetical protein OB2597_01...  89.7    1e-16
gi|83942320|ref|ZP_00954781.1|  hypothetical protein EE36_1480...  89.4    2e-16
gi|121602728|ref|YP_988901.1|  hypothetical protein BARBAKC583...  89.0    2e-16
gi|116751168|ref|YP_847855.1|  protein of unknown function DUF...  88.6    3e-16
gi|121525911|ref|ZP_01658847.1|  conserved hypothetical protei...  87.8    5e-16
gi|49474286|ref|YP_032328.1|  Phosphatidate cytidyltransferase...  85.5    3e-15
gi|114704864|ref|ZP_01437772.1|  hypothetical protein FP2506_0...  84.7    4e-15
gi|116625259|ref|YP_827415.1|  protein of unknown function DUF...  84.3    5e-15
gi|91202217|emb|CAJ75277.1|  conserved hypothetical protein [C...  84.0    8e-15
gi|114764261|ref|ZP_01443489.1|  hypothetical protein R2601_25...  83.6    1e-14
gi|46580770|ref|YP_011578.1|  hypothetical protein DVU2365 [De...  83.6    1e-14
gi|89054940|ref|YP_510391.1|  hypothetical protein Jann_2449 [...  83.2    1e-14
gi|120601945|ref|YP_966345.1|  protein of unknown function DUF...  82.8    2e-14
gi|121535891|ref|ZP_01667688.1|  protein of unknown function D...  82.4    2e-14
gi|89068198|ref|ZP_01155608.1|  hypothetical protein OG2516_02...  82.4    2e-14
gi|89210177|ref|ZP_01188569.1|  Protein of unknown function DU...  81.3    5e-14
gi|121542724|ref|ZP_01674441.1|  conserved hypothetical protei...  80.1    1e-13
gi|90419603|ref|ZP_01227513.1|  conserved hypothetical protein...  77.8    6e-13
gi|78356422|ref|YP_387871.1|  hypothetical protein Dde_1375 [D...  77.4    7e-13
gi|32476481|ref|NP_869475.1|  hypothetical protein RB10538 [Rh...  77.4    8e-13
gi|118737582|ref|ZP_01585985.1|  protein of unknown function D...  74.7    5e-12
gi|51246794|ref|YP_066678.1|  hypothetical protein DP2942 [Des...  73.9    7e-12
gi|126736311|ref|ZP_01752053.1|  hypothetical protein RCCS2_00...  72.8    2e-11
gi|24213797|ref|NP_711278.1|  hypothetical protein LA1097 [Lep...  70.9    7e-11
gi|83368776|ref|ZP_00913637.1|  conserved hypothetical protein...  70.1    1e-10
gi|45658416|ref|YP_002502.1|  hypothetical protein LIC12578 [L...  70.1    1e-10
gi|116327597|ref|YP_797317.1|  hypothetical protein LBL_0829 [...  68.2    5e-10
gi|87307079|ref|ZP_01089225.1|  hypothetical protein DSM3645_0...  66.2    2e-09
gi|121538196|ref|ZP_01669971.1|  protein of unknown function D...  64.3    6e-09
gi|88941965|ref|ZP_01147345.1|  conserved hypothetical protein...  64.3    6e-09
gi|126729716|ref|ZP_01745529.1|  hypothetical protein SSE37_04...  62.8    2e-08
gi|126462141|ref|YP_001043255.1|  protein of unknown function ...  62.8    2e-08
gi|77463267|ref|YP_352771.1|  hypothetical protein RSP_2715 [R...  62.0    3e-08
gi|114778068|ref|ZP_01452968.1|  hypothetical protein SPV1_056...  60.8    7e-08
gi|15606496|ref|NP_213876.1|  hypothetical protein aq_1276 [Aq...  59.7    2e-07
gi|94971580|ref|YP_593628.1|  protein of unknown function DUF1...  59.3    2e-07
gi|86157513|ref|YP_464298.1|  protein of unknown function DUF1...  58.9    3e-07
gi|108763981|ref|YP_632885.1|  hypothetical protein MXAN_4723 ...  58.9    3e-07
gi|115377101|ref|ZP_01464316.1|  phosphatidate cytidyltransfer...  56.2    2e-06
gi|84516081|ref|ZP_01003441.1|  hypothetical protein SKA53_040...  51.2    5e-05
gi|77551749|gb|ABA94546.1|  retrotransposon protein, putative,...  35.0    4.2  
gi|124400534|emb|CAK66020.1|  unnamed protein product [Paramecium  34.7    5.0  
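
The hit list above can be screened programmatically. The following is a minimal Python sketch, not part of the original annotation: the names `Hit`, `parse_hit` and `significant` are hypothetical, and the 1e-5 E-value cutoff is an assumed threshold chosen only for illustration. It assumes each summary line starts with a `gi|` identifier and ends with the bit score and the E-value.

```python
import re
from typing import List, NamedTuple, Optional


class Hit(NamedTuple):
    identifier: str
    bits: float
    evalue: float


# Identifier, free-text description, bit score, E-value.
# Any trailing link text after the E-value is tolerated and ignored.
HIT_LINE = re.compile(
    r"^(gi\|\S+)\s+.*?\s+(\d+(?:\.\d+)?)\s+"
    r"(\d+(?:\.\d+)?(?:e-?\d+)?)(?:\s+[A-Za-z][A-Za-z ]*)?$"
)


def parse_hit(line: str) -> Optional[Hit]:
    """Parse one summary line; return None for headers and blank lines."""
    match = HIT_LINE.match(line.strip())
    if match is None:
        return None
    return Hit(match.group(1), float(match.group(2)), float(match.group(3)))


def significant(lines: List[str], max_evalue: float = 1e-5) -> List[Hit]:
    """Keep hits whose E-value is at or below the (assumed) cutoff."""
    hits = [hit for hit in map(parse_hit, lines) if hit is not None]
    return [hit for hit in hits if hit.evalue <= max_evalue]
```

With such a cutoff, the last two entries of the list (E-values 4.2 and 5.0) would be discarded, while the DUF1009-like hits would be kept.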


STUDY GROUP:

>gi|71083615|ref|YP_266334.1|  Protein of unknown function (DUF1009) [Candidatus Pelagibacter 
ubique HTCC1062]
 gi|71062728|gb|AAZ21731.1|  Protein of unknown function (DUF1009) [Candidatus Pelagibacter 
ubique HTCC1062]
Length=261

 Score =  209 bits (532),  Expect = 1e-52, Method: Composition-based stats.
 Identities = 145/263 (55%), Positives = 193/263 (73%), Gaps = 3/263 (1%)

Query  1    MIGLIFGETSFPNEILKKVKKKKLNYLIIDLTQLKKFKKDKKSYSVSIGQFGKIINILKK  60
            MIGL  G+T F   +LK +KK    Y IID ++  KFK D  S  +SIG+FGKII+++K+
Sbjct  1    MIGLFLGDTDFSEAVLKNIKKLNKRYFIIDFSKNNKFKNDINSNRISIGKFGKIIDLIKE  60

Query  61   NNCKKVLFAGKVNKPNFSRLKLDFKGIYYIPRIIKASKLGDAAIIKEIIKILAQNKIKTE  120
               KKVLFAGK+ KP FS L+LD KGIYY+P I+KA+KLGDAAIIK IIKIL   KIK  
Sbjct  61   KKSKKVLFAGKIAKPKFSTLRLDLKGIYYMPSILKAAKLGDAAIIKAIIKILDNEKIKVL  120

Query  121  NSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLRQYNFSQGVVVRNKTVVAIEG  180
            +S+ FNPEL++KRGNY+K+K N  D + IK  I   N L+  +  Q ++V++ T++AIE 
Sbjct  121  SSVFFNPELTVKRGNYTKLKANRKDINSIKMGITYFNKLKSLDHVQAIIVKDNTILAIED  180

Query  181  KGGTKKMLEKSKSKKFRNHGVLVKFPKKKQDLRVDLPTIGLKTLKQSKTAGLKGIVVKNK  240
            + GTKKML K K K     G+L+K PKKKQDLR+DLPTIGL+TLK  K  GLKGIV+++K
Sbjct  181  QQGTKKMLSKLKKKS---EGILIKLPKKKQDLRMDLPTIGLQTLKDCKKYGLKGIVLRSK  237

Query  241  QHVFLDKNQCIKFANQNRMFISV  263
            +++FLDK +CI FAN+N++F+ +
Sbjct  238  KNIFLDKAKCIAFANKNKIFVKI  260


>gi|91761964|ref|ZP_01263929.1|  hypothetical protein PU1002_01826 [Candidatus Pelagibacter ubique 
HTCC1002]
 gi|91717766|gb|EAS84416.1|  hypothetical protein PU1002_01826 [Candidatus Pelagibacter ubique 
HTCC1002]
Length=261

 Score =  208 bits (530),  Expect = 2e-52, Method: Composition-based stats.
 Identities = 147/263 (55%), Positives = 191/263 (72%), Gaps = 3/263 (1%)

Query  1    MIGLIFGETSFPNEILKKVKKKKLNYLIIDLTQLKKFKKDKKSYSVSIGQFGKIINILKK  60
            MIGL  G+T F   +LK +KK    Y IID ++  KFK D  S  +SIG+FGKIIN++K+
Sbjct  1    MIGLFLGDTDFSEAVLKNIKKLNKRYFIIDFSKNNKFKNDINSNRISIGKFGKIINLIKE  60

Query  61   NNCKKVLFAGKVNKPNFSRLKLDFKGIYYIPRIIKASKLGDAAIIKEIIKILAQNKIKTE  120
               KKVLFAGK+ KP FS L+LD KGIYY+P I+KA+KLGDAAIIK IIKIL   KIK  
Sbjct  61   KKSKKVLFAGKIAKPKFSTLRLDLKGIYYMPSILKAAKLGDAAIIKAIIKILDNEKIKVL  120

Query  121  NSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLRQYNFSQGVVVRNKTVVAIEG  180
            +S+ FNPEL++KRGNY+K+K N  D + IK  I   N L+  +  Q ++V+N T++AIE 
Sbjct  121  SSVFFNPELTVKRGNYTKLKANKSDINSIKMGITYFNKLKSLDHVQAIIVKNDTIIAIED  180

Query  181  KGGTKKMLEKSKSKKFRNHGVLVKFPKKKQDLRVDLPTIGLKTLKQSKTAGLKGIVVKNK  240
              GTKKML K K K     G+L+K PKKKQDLR+DLPTIGL+TLK  K  GLKGIV+K+K
Sbjct  181  HQGTKKMLSKLKKKS---EGILIKLPKKKQDLRMDLPTIGLQTLKDCKKYGLKGIVLKSK  237

Query  241  QHVFLDKNQCIKFANQNRMFISV  263
            +++FLDK + I FAN+N++F+ +
Sbjct  238  KNIFLDKAKSIAFANKNKIFVKI  260


>gi|16126153|ref|NP_420717.1|  hypothetical protein CC_1910 [Caulobacter crescentus CB15]
 gi|13423363|gb|AAK23885.1|  conserved hypothetical protein [Caulobacter crescentus CB15]
Length=280

 Score =  141 bits (355),  Expect = 4e-32, Method: Composition-based stats.
 Identities = 84/268 (31%), Positives = 138/268 (51%), Gaps = 8/268 (2%)

Query  2    IGLIFGETSFPNEILKKVKKKKLNYLIIDLTQLKKFKKDK-KSYSVSIGQFGKIINILKK  60
            +GLI G  + P E+    +     + ++ L        D+     V IG+FGKI   L+ 
Sbjct  4    LGLIAGGGALPVELASHCEAAGRAFAVMRLRSFADPSLDRYPGADVGIGEFGKIFKALRA  63

Query  61   NNCKKVLFAGKVNKPNFSRLKLDFKGIYYIPRIIKASKLGDAAIIKEIIKILAQNKIKTE  120
              C  V FAG V++P+FS L  D +G+  +P +I A++ GD A+++ ++    +   + E
Sbjct  64   EGCDVVCFAGNVSRPDFSALMPDARGLKVLPSLIVAARKGDDALLRRVLDEFEKEGFEIE  123

Query  121  NSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLRQYNFSQGVVVRNKTVVAIEG  180
             + +   E++L RG   K+ P     +DI KA+     + + +  QG VV    V+A+E 
Sbjct  124  GAHEVMGEMTLPRGRLGKVSPAPEHMADIDKALDVAREIGRLDIGQGAVVCEGLVLAVEA  183

Query  181  KGGTKKML-------EKSKSKKFRNHGVLVKFPKKKQDLRVDLPTIGLKTLKQSKTAGLK  233
            + GT  ML       E  + +  R  GVL K PK  Q+ RVDLPTIG+ T+ ++  AGL 
Sbjct  184  QEGTDAMLRRVADLPEAIRGRAERRLGVLAKAPKPIQETRVDLPTIGVATIHRAARAGLA  243

Query  234  GIVVKNKQHVFLDKNQCIKFANQNRMFI  261
            GIV +  + + +D+   I  A+   +F+
Sbjct  244  GIVGEAGRLLVVDREAVIAAADDLGLFV  271


>gi|84701776|ref|ZP_01016351.1|  hypothetical protein PB2503_02422 [Parvularcula bermudensis HTCC2503]
 gi|84691022|gb|EAQ16862.1|  hypothetical protein PB2503_02422 [Parvularcula bermudensis HTCC2503]
Length=290

 Score =  137 bits (346),  Expect = 5e-31, Method: Composition-based stats.
 Identities = 85/275 (30%), Positives = 140/275 (50%), Gaps = 19/275 (6%)

Query  2    IGLIFGETSFPNEILKKVKKKKLNYLIIDLTQ-----LKKFKKDKKSYSVSIGQFGKIIN  56
            +G+I G  S P +I +  +++   + I+ L+      LK FK         IG+ GK I 
Sbjct  8    LGIIAGGGSLPLKIAESCQQQDAPFHILALSGYADDILKSFKPS----WCGIGEVGKAIR  63

Query  57   ILKKNNCKKVLFAGKVNKPNFSRLKLDFKGIYYIPRIIKASKLGDAAIIKEIIKILAQNK  116
            +LK + C  V+ AG V +PNF+ L+ D++G   +P+I+ A+  GD A++  ++   A   
Sbjct  64   VLKDHGCDAVVLAGNVTRPNFATLRPDWRGAKLLPKILSAATQGDGAMLDVLVATFASEG  123

Query  117  IKTENSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLRQYNFSQGVVVRNKTVV  176
                 +      L++  G    + P+  D SD++KA   +  L  ++  QG VVR   V+
Sbjct  124  FYVVGADDVATALTVPAGALGMLGPDTCDLSDMRKAAAVVAALGPFDVGQGAVVRQGFVI  183

Query  177  AIEGKGGTKKM----------LEKSKSKKFRNHGVLVKFPKKKQDLRVDLPTIGLKTLKQ  226
            AIE   GT  M          L+  +  +    GVL+K PK +Q+ RVDLPTIG++T++ 
Sbjct  184  AIEAAEGTDLMLGRCAPLIARLQGEEGNRSERRGVLLKCPKPEQERRVDLPTIGVRTVEL  243

Query  227  SKTAGLKGIVVKNKQHVFLDKNQCIKFANQNRMFI  261
            +  AGL GI V+    + LD     + A+   +F+
Sbjct  244  AAEAGLAGIAVEASGGLVLDSGAVARCADARGLFV  278


>gi|91977315|ref|YP_569974.1|  protein of unknown function DUF1009 [Rhodopseudomonas palustris 
BisB5]
 gi|91683771|gb|ABE40073.1|  protein of unknown function DUF1009 [Rhodopseudomonas palustris 
BisB5]
Length=285

 Score =  136 bits (342),  Expect = 1e-30, Method: Composition-based stats.
 Identities = 81/274 (29%), Positives = 142/274 (51%), Gaps = 17/274 (6%)

Query  2    IGLIFGETSFPNEILKKVKKKKLNYLII------DLTQLKKFKKDKKSYSVSIGQFGKII  55
            +G+I G    P  I   ++ +++  L+I      D T + +F+     + +SIGQ+G++ 
Sbjct  12   VGVIAGGGVLPFAIADSMQARQITPLLIGLRGFCDPTGIARFRH----HWISIGQYGRLK  67

Query  56   NILKKNNCKKVLFAGKVNKPNFSRLKLDFKGIYYIPRIIKASKLGDAAIIKEIIKILAQN  115
             +L+  +C+ V+F G V +P+ + ++LD+  +  +P ++ A + GD  ++  I +I    
Sbjct  68   RLLRAEHCRDVMFIGSVVRPSLASVRLDWGAVRVLPSVMAAYRGGDDHLLTSIGRIFEGE  127

Query  116  KIKTENSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLRQYNFSQGVVVRNKTV  175
              +        PEL + RG  ++  P++   +DI K I  L  L  ++  QGV+V +  V
Sbjct  128  GFRLYGVKDVAPELLMPRGELTQATPDEGHLADIAKGIAVLAALSPFDIGQGVIVIDGHV  187

Query  176  VAIEGKGGTKKMLEKSKSKKFR-------NHGVLVKFPKKKQDLRVDLPTIGLKTLKQSK  228
            VA+E  GGT  +L      + +         GVLVK PK  QDLR DLPT+G +T++   
Sbjct  188  VAVEDIGGTDALLANLARLRAQGAIHAKPGRGVLVKSPKSGQDLRFDLPTLGPRTVEGVA  247

Query  229  TAGLKGIVVKNKQHVFLDKNQCIKFANQNRMFIS  262
             AGL GI V     +  +  + IK A+   +F++
Sbjct  248  AAGLAGIAVAAGNTLVAEPQETIKAADAAGLFVT  281


>gi|113933567|ref|ZP_01419469.1|  Protein of unknown function DUF1009 [Caulobacter sp. K31]
 gi|113733788|gb|EAU14856.1|  Protein of unknown function DUF1009 [Caulobacter sp. K31]
Length=283

 Score =  135 bits (340),  Expect = 2e-30, Method: Composition-based stats.
 Identities = 77/268 (28%), Positives = 141/268 (52%), Gaps = 8/268 (2%)

Query  2    IGLIFGETSFPNEILKKVKKKKLNYLIIDLTQLKK-FKKDKKSYSVSIGQFGKIINILKK  60
            +GLI G  S P E+ +  +     + ++ L    +          V +G+FGK+   L+ 
Sbjct  7    LGLIAGGGSLPVELAQHCEAAGRPFSVMRLRSFAEPVLARYPGVEVGLGEFGKVFKALRA  66

Query  61   NNCKKVLFAGKVNKPNFSRLKLDFKGIYYIPRIIKASKLGDAAIIKEIIKILAQNKIKTE  120
              C+ V FAG V +P+F+ +K D +G+  +P +I A++ GD A+++ ++    +     E
Sbjct  67   EGCEAVCFAGVVERPDFAAIKPDLRGLTVMPGLINAARKGDDALLRRLLSEFEKEGFAIE  126

Query  121  NSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLRQYNFSQGVVVRNKTVVAIEG  180
             + +   E++L RG+  +  P D  ++D+ +A+     +   +  QG VV +  V+A+E 
Sbjct  127  GAHEVRGEMTLPRGSLGRHAPTDAHRTDMDRALTVARAIGALDVGQGAVVCDGLVLAVEA  186

Query  181  KGGTKKML-------EKSKSKKFRNHGVLVKFPKKKQDLRVDLPTIGLKTLKQSKTAGLK  233
            + GT  ML       E  + +     GVL K PK  Q+ +VDLPTIG+ T++++  AGL 
Sbjct  187  QEGTDAMLRRVADLPEAIRGRAEAPRGVLAKAPKPIQETKVDLPTIGVATVQRAARAGLA  246

Query  234  GIVVKNKQHVFLDKNQCIKFANQNRMFI  261
            G+V +  + + +D+ Q I  A+   +F+
Sbjct  247  GVVGEAGRLLIVDREQVIACADDLGLFV  274


>gi|115524572|ref|YP_781483.1|  protein of unknown function DUF1009 [Rhodopseudomonas palustris 
BisA53]
 gi|115518519|gb|ABJ06503.1|  protein of unknown function DUF1009 [Rhodopseudomonas palustris 
BisA53]
Length=284

 Score =  122 bits (306),  Expect = 2e-26, Method: Composition-based stats.
 Identities = 76/274 (27%), Positives = 138/274 (50%), Gaps = 17/274 (6%)

Query  2    IGLIFGETSFPNEILKKVKKKKLNYLII------DLTQLKKFKKDKKSYSVSIGQFGKII  55
            +GLI G    P  +   ++ + +  ++       D  QL +++     + +SIG FG++ 
Sbjct  11   VGLIAGGGVLPFAVADSLQARGIGAVLFALKGSCDADQLSRYRH----HWISIGAFGQLR  66

Query  56   NILKKNNCKKVLFAGKVNKPNFSRLKLDFKGIYYIPRIIKASKLGDAAIIKEIIKILAQN  115
             +L+   C+ VLF G + +P+ S ++LD+  I  +P I+ A + GD  ++  I +I  ++
Sbjct  67   RLLRAEQCRDVLFIGALVRPSLSAVRLDWGAIRVMPAILAAYRGGDDHLLTGIGQIFERD  126

Query  116  KIKTENSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLRQYNFSQGVVVRNKTV  175
              +        P+L +  G  ++ +PN   ++DI K    L  L  ++  QG VV +  V
Sbjct  127  GFRLLGLKDVAPDLLMPEGCMTRARPNKDTEADIAKGRAVLAALSPFDIGQGCVVIDGHV  186

Query  176  VAIEGKGGTKKMLEKSKS----KKFR---NHGVLVKFPKKKQDLRVDLPTIGLKTLKQSK  228
            V++E  GGT  +L + +     ++ R     GVLVK PK  QDLR DLP +G KT++   
Sbjct  187  VSVEDTGGTDGLLRRVEQLRGERRLRAKPGRGVLVKAPKSGQDLRFDLPALGPKTIEGLI  246

Query  229  TAGLKGIVVKNKQHVFLDKNQCIKFANQNRMFIS  262
             A + G+ V     V  +    +  A++  +F++
Sbjct  247  AAQIAGVAVVAGHTVVAEPQAMVDAADRAGLFVT  280


>gi|27379959|ref|NP_771488.1|  hypothetical protein bll4848 [Bradyrhizobium japonicum USDA 110]
 gi|27353112|dbj|BAC50113.1|  bll4848 [Bradyrhizobium japonicum USDA 110]
Length=289

 Score =  122 bits (305),  Expect = 2e-26, Method: Composition-based stats.
 Identities = 74/273 (27%), Positives = 135/273 (49%), Gaps = 17/273 (6%)

Query  2    IGLIFGETSFPNEILKKVKKKKLNYLII------DLTQLKKFKKDKKSYSVSIGQFGKII  55
            +G++ G  + P  +   +  + +  ++       D  Q++KF+       +S+GQ G+ +
Sbjct  16   VGVVAGGGAMPFAVADSLATRGITPVLFPLRGACDPVQVEKFRHR----WISVGQLGRAM  71

Query  56   NILKKNNCKKVLFAGKVNKPNFSRLKLDFKGIYYIPRIIKASKLGDAAIIKEIIKILAQN  115
             + ++  C+ ++F G + +P+ S ++ DF  +  +  +I+A + GD  ++  + +IL Q 
Sbjct  72   RLFREEGCRDLIFIGTLVRPSLSEIRFDFTTLRLLGNVIRAFRGGDDHLLSGVGRILEQG  131

Query  116  KIKTENSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLRQYNFSQGVVVRNKTV  175
              +        P+L +  G  S+  P+D  ++DI++    L  L  ++  Q  VV +  V
Sbjct  132  GFRMVGIKDVAPDLLMPEGCISRAWPSDTSKTDIERGRAVLTALGPFDIGQAAVVIDGHV  191

Query  176  VAIEGKGGTKKML-------EKSKSKKFRNHGVLVKFPKKKQDLRVDLPTIGLKTLKQSK  228
            VA+E   GT  +L       E+ + +     GVLVK PK  QDLR DLPTIG +T++   
Sbjct  192  VAVEDIEGTDALLARVARLREEGRIRAATGRGVLVKAPKSSQDLRFDLPTIGPRTIEGVA  251

Query  229  TAGLKGIVVKNKQHVFLDKNQCIKFANQNRMFI  261
             AGL GI V     +  +    I  A+   +FI
Sbjct  252  RAGLAGIAVIAGNTIAAEPQAMIALADAKYLFI  284


>gi|118589999|ref|ZP_01547403.1|  hypothetical protein SIAM614_15080 [Stappia aggregata IAM 12614]
 gi|118437496|gb|EAV44133.1|  hypothetical protein SIAM614_15080 [Stappia aggregata IAM 12614]
Length=301

 Score =  120 bits (300),  Expect = 1e-25, Method: Composition-based stats.
 Identities = 79/269 (29%), Positives = 133/269 (49%), Gaps = 9/269 (3%)

Query  2    IGLIFGETSFPNEILKKVKKKKLNYLIIDLT-QLKKFKKDKKSYSVSIGQFGKIINILKK  60
            + LI G  S P +I   +      + II +  +  +  + +    +  G+ G++   LKK
Sbjct  12   LALIAGNGSLPCQIADALSNAGREFKIIAIKGEADERTRAQADTELGWGEIGRLYKFLKK  71

Query  61   NNCKKVLFAGKVNK-PNFSRLKLDFKGIYYIPRIIKASKLGDAAIIKEIIKILAQNKIKT  119
              C+ VL  G V+K P+F+ +  D   +  +P II+A   GD +++ ++I++      + 
Sbjct  72   TGCRDVLLIGGVSKRPDFTSILGDIGTLKRLPTIIRALAGGDDSLLTKVIRLFEVEGYRV  131

Query  120  ENSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLRQYNFSQGVVVRNKTVVAIE  179
                   P+L    G   K++P+  D  D + A++    L + +  Q  V     VVA+E
Sbjct  132  VGIKDVAPQLLASSGVLGKVQPSQTDWRDAELALRACEKLGELDIGQAAVAVGGRVVALE  191

Query  180  GKGGTKKMLEKSKSKK------FRNH-GVLVKFPKKKQDLRVDLPTIGLKTLKQSKTAGL  232
            G  GT  ML++    K       ++H GVLVK  K  QDLRVDLPT+G KT+  +  AGL
Sbjct  192  GAEGTDAMLQRCADLKRNGRIRAKSHTGVLVKTAKPNQDLRVDLPTVGPKTIDLAVAAGL  251

Query  233  KGIVVKNKQHVFLDKNQCIKFANQNRMFI  261
             GI V+    +  +K+  +K A+   +F+
Sbjct  252  AGIAVEASGALIAEKDVTLKKADDAGLFV  280


>gi|58040254|ref|YP_192218.1|  hypothetical protein GOX1823 [Gluconobacter oxydans 621H]
 gi|58002668|gb|AAW61562.1|  Hypothetical protein GOX1823 [Gluconobacter oxydans 621H]
Length=280

 Score =  118 bits (296),  Expect = 2e-25, Method: Composition-based stats.
 Identities = 61/217 (28%), Positives = 117/217 (53%), Gaps = 1/217 (0%)

Query  46   VSIGQFGKIINILKKNNCKKVLFAGKVNKPNFSRLKLDFKGIYYIPRIIKASKLGDAAII  105
            + +   G I+  LK+NNC++++  G V +P +  L+ D +G   + R+ +A   GD  ++
Sbjct  53   IRLAAAGDILGALKRNNCRELVLIGPVRRPAWRDLRPDAEGARILARLGRAIFSGDDGLL  112

Query  106  KEIIKILAQNKIKTENSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLRQYNFS  165
              ++++L +       + +F    + + G   ++ P+   + DI + ++ L  +   +  
Sbjct  113  GAVVRVLGEEGFHVRGAHEFLEHATGRSGTLGRVLPDAQAKQDIARGVEVLKVMAALDIG  172

Query  166  QGVVVRNKTVVAIEGKGGTKKMLEK-SKSKKFRNHGVLVKFPKKKQDLRVDLPTIGLKTL  224
            QG VV+N  V+A+E   GT  ML +  +  +  + GVLVK PK  QD+R D+PTIG +TL
Sbjct  173  QGCVVQNGLVLAVEALEGTDAMLGRCGRLMQAGSGGVLVKMPKTGQDVRADMPTIGPETL  232

Query  225  KQSKTAGLKGIVVKNKQHVFLDKNQCIKFANQNRMFI  261
            + +   GL+G+  +    +  D   C+K A++  +F+
Sbjct  233  ENAARNGLRGVAFQPGVTLMTDPAGCVKLADRYGLFL  269


ORF finding

SOFTWARE: ORF FINDER IN SMS
PARAMETERS: AT LEAST 60 CODONS, STANDARD GENETIC CODE, FRAMES 1, 2 AND 3, ALTERNATIVE START CODONS

DIRECT STRAND:

No ORFs were found in reading frame 1.

No ORFs were found in reading frame 2.

>ORF number 1 in reading frame 3 on the direct strand extends from base 6 to base 800.
ATGATTGGTTTGATTTTTGGTGAGACAAGTTTTCCAAATGAAATTCTAAAAAAAGTGAAG
AAAAAAAAATTAAATTATTTAATAATTGATTTAACTCAATTAAAAAAATTTAAAAAAGAT
AAAAAATCTTATTCAGTTTCTATAGGTCAATTTGGAAAAATTATTAATATTCTAAAAAAA
AATAATTGCAAAAAAGTCTTATTTGCTGGAAAGGTAAATAAGCCAAACTTTTCTAGATTA
AAATTAGACTTTAAAGGCATTTATTATATTCCAAGAATTATAAAAGCATCAAAGCTTGGA
GATGCGGCTATAATAAAAGAAATTATTAAAATATTAGCTCAAAACAAGATAAAAACTGAA
AATTCACTAAAATTCAATCCTGAGCTTTCATTAAAAAGGGGAAATTATTCAAAGATAAAA
CCAAACGATCATGATCAATCAGATATTAAAAAGGCAATTAAAAAATTAAATAATTTAAGA
CAATATAATTTTAGCCAAGGGGTTGTAGTTAGAAATAAAACAGTTGTAGCTATAGAAGGA
AAGGGTGGAACAAAAAAAATGCTTGAAAAAAGTAAAAGTAAAAAATTTAGAAATCATGGT
GTTTTAGTAAAATTTCCCAAAAAAAAACAAGACTTAAGAGTAGATCTACCTACTATTGGA
TTAAAAACTTTAAAACAAAGCAAAACTGCTGGGTTAAAGGGAATTGTAGTTAAAAATAAA
CAGCATGTTTTTTTAGATAAAAACCAATGTATTAAATTTGCTAATCAAAATAGAATGTTT
ATTTCAGTAAAATGA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
MIGLIFGETSFPNEILKKVKKKKLNYLIIDLTQLKKFKKDKKSYSVSIGQFGKIINILKK
NNCKKVLFAGKVNKPNFSRLKLDFKGIYYIPRIIKASKLGDAAIIKEIIKILAQNKIKTE
NSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLRQYNFSQGVVVRNKTVVAIEG
KGGTKKMLEKSKSKKFRNHGVLVKFPKKKQDLRVDLPTIGLKTLKQSKTAGLKGIVVKNK
QHVFLDKNQCIKFANQNRMFISVK*

REVERSE STRAND:

No ORFs were found in reading frame 1.

No ORFs were found in reading frame 2.

No ORFs were found in reading frame 3.
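
The kind of search reported above (three frames per strand, standard stop codons, a minimum length in codons) can be sketched in Python as follows. This is not the SMS implementation: `find_orfs` and `reverse_complement` are hypothetical names, only ATG is treated as a start codon, the stop codon is counted in the length, and ORFs without a stop codon are ignored, all of which are assumptions that may differ from the tool.

```python
from typing import Iterator, Tuple

STOPS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def reverse_complement(dna: str) -> str:
    """Return the reverse strand, read 5' to 3'."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(dna: str, min_codons: int = 60,
              start: str = "ATG") -> Iterator[Tuple[int, int, int]]:
    """Yield (frame, first_base, last_base), 1-based and stop codon included."""
    dna = dna.upper()
    for frame in range(3):
        orf_start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if codon in STOPS:
                # Close the open ORF, if any, and report it when long enough.
                if orf_start is not None and (i + 3 - orf_start) // 3 >= min_codons:
                    yield frame + 1, orf_start + 1, i + 3
                orf_start = None
            elif orf_start is None and codon == start:
                orf_start = i


toy = "CCATGGCTGCTGCTTAAG"  # toy sequence, not the metagenomic read
print(list(find_orfs(toy, min_codons=3)))
print(list(find_orfs(reverse_complement(toy), min_codons=3)))
```

Calling `find_orfs` on the sequence gives the direct-strand frames, and calling it on `reverse_complement(sequence)` gives the reverse-strand frames.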


PARAMETERS: AT LEAST 60 CODONS, STANDARD GENETIC CODE, FRAMES 1, 2 AND 3, ANY CODON AS START


DIRECT STRAND:

No ORFs were found in reading frame 1.

No ORFs were found in reading frame 2.

>ORF number 1 in reading frame 3 on the direct strand extends from base 3 to base 800.
AAGATGATTGGTTTGATTTTTGGTGAGACAAGTTTTCCAAATGAAATTCTAAAAAAAGTG
AAGAAAAAAAAATTAAATTATTTAATAATTGATTTAACTCAATTAAAAAAATTTAAAAAA
GATAAAAAATCTTATTCAGTTTCTATAGGTCAATTTGGAAAAATTATTAATATTCTAAAA
AAAAATAATTGCAAAAAAGTCTTATTTGCTGGAAAGGTAAATAAGCCAAACTTTTCTAGA
TTAAAATTAGACTTTAAAGGCATTTATTATATTCCAAGAATTATAAAAGCATCAAAGCTT
GGAGATGCGGCTATAATAAAAGAAATTATTAAAATATTAGCTCAAAACAAGATAAAAACT
GAAAATTCACTAAAATTCAATCCTGAGCTTTCATTAAAAAGGGGAAATTATTCAAAGATA
AAACCAAACGATCATGATCAATCAGATATTAAAAAGGCAATTAAAAAATTAAATAATTTA
AGACAATATAATTTTAGCCAAGGGGTTGTAGTTAGAAATAAAACAGTTGTAGCTATAGAA
GGAAAGGGTGGAACAAAAAAAATGCTTGAAAAAAGTAAAAGTAAAAAATTTAGAAATCAT
GGTGTTTTAGTAAAATTTCCCAAAAAAAAACAAGACTTAAGAGTAGATCTACCTACTATT
GGATTAAAAACTTTAAAACAAAGCAAAACTGCTGGGTTAAAGGGAATTGTAGTTAAAAAT
AAACAGCATGTTTTTTTAGATAAAAACCAATGTATTAAATTTGCTAATCAAAATAGAATG
TTTATTTCAGTAAAATGA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
KMIGLIFGETSFPNEILKKVKKKKLNYLIIDLTQLKKFKKDKKSYSVSIGQFGKIINILK
KNNCKKVLFAGKVNKPNFSRLKLDFKGIYYIPRIIKASKLGDAAIIKEIIKILAQNKIKT
ENSLKFNPELSLKRGNYSKIKPNDHDQSDIKKAIKKLNNLRQYNFSQGVVVRNKTVVAIE
GKGGTKKMLEKSKSKKFRNHGVLVKFPKKKQDLRVDLPTIGLKTLKQSKTAGLKGIVVKN
KQHVFLDKNQCIKFANQNRMFISVK*


REVERSE STRAND:

No ORFs were found in reading frame 1.

No ORFs were found in reading frame 2.

No ORFs were found in reading frame 3.
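
The two direct-strand results (bases 6 to 800 and bases 3 to 800) can be cross-checked with simple coordinate arithmetic. This is a small illustrative sketch; `orf_summary` is a hypothetical helper, and it assumes 1-based inclusive coordinates that include the terminal stop codon, as in the sequences listed above, which both end in TGA.

```python
from typing import Dict


def orf_summary(first_base: int, last_base: int) -> Dict[str, int]:
    """Frame, codon count and residue count for 1-based inclusive coordinates."""
    length = last_base - first_base + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    codons = length // 3
    return {
        "frame": (first_base - 1) % 3 + 1,
        "codons": codons,        # includes the stop codon
        "residues": codons - 1,  # translated amino acids
    }


print(orf_summary(6, 800))  # ORF starting at the ATG
print(orf_summary(3, 800))  # ORF allowed to start at any codon
```

Both coordinate pairs fall in reading frame 3, and the second ORF is one codon longer, which matches the extra leading K in its translation.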