ORF ES22450

From Metagenes
Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : AACY01160305.1
Annotathon code: ORF_ES22450
Sample :
  • GPS : 31°10'30"N; 64°19'27.6"W
  • Sargasso Sea: Sargasso Sea, Station 11 - Bermuda (UK)
  • Open Ocean (-5m, 20.5°C, 0.1-0.8 microns)
Authors
Team : BioCell 2007
Username : cqfpf
Annotated on : 2008-03-19 18:52:37
  • ROSSI ANNE LISE
  • SAID SOILIHI RIWADI

Synopsis

  • Taxonomy: Proteobacteria (NCBI info)
    Rank: phylum - Genetic Code: Bacterial and Plant Plastid - NCBI Identifier: 1224
    Kingdom: Bacteria - Phylum: Proteobacteria - Class: - Order:
    Bacteria; Proteobacteria;

Genomic Sequence

>AACY01160305.1 ORF_ES22450 genomic DNA
GCCAGCATCAGCCAGGCCATCAGCACCAGCATCGCGGAACAAGACAGTCCCGATGTCGCGGTCATTCCTGAGGGGCCGTACCTCGTCCCTTTTGTCCAAT
CTCCGATGTAGCTTCATTCTGCCTCGAACTGAGTAGAATGCGGATTGCAGCGCAGGCAGCACCGTAACCGTTGTCAATATTGCAGACCAACACTCCCGGC
GCACAGGAGGACAAACAGGCTGAAAGCGCAGTTTCTCCGTTTCGGCTCGCACCGTATCCGGTGGAGGTGGGAAGCGCAATGATCGCTCCAGGGACCAAAC
CTCCAAGAACGGTCGGCAGTGCAGCATCCATTCCCGCAACAGCAATCACAACCGGATGCTTGCGGATCTCCTCGACCCTTTCCATCAACCTCCAGATTCC
CGCCACTCCCACATCAAAAATCGCCTTGGCACCAAAACCATTGAATTCAAGCGTTCTCAAAGCTTCTCGTGCAACTCTTGCATCCGAAGTGCCTGCAGCA
ACAACCGCCACTTCCGGAGCATTTTGCTTCACGGGAGAAACGGGTTTTTTGAAAAAGGCGGTTTGGGAAACAGGATCATAATCCAGATCCGAACTGACCT
CCTCACTCAGAGCAGAATGTTTTTCTACGGAAAGTCTGGTCAGCAATAGAGACGCATTGTTGTTCCGGGCTTCTTCCAGAATCGACTGAATCTGTTCCAC
AGATTTCTGTAGGCAAAAAATCGCTTCGGCCAATCCGATCCTGTCGGTCCGGTCCCAGTCGAGCTTGATTTCGGAAGAACTAGAAGTCATGTCTGGAGTT
CCGGAATCTTAAAGGCACTTCCACGTTGATAGGCGCAGAAACGGACCTTGGTAGAGGGACGGTGCTGAAAACAC

Translation

[67 - 810/874]   indirect strand
>ORF_ES22450 Translation [67-810   indirect strand]
DSGTPDMTSSSSEIKLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEVSSDLDYDPVSQTAFFKKPVSPVKQNAPEV
AVVAAGTSDARVAREALRTLEFNGFGAKAIFDVGVAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGETALSACLS
SCAPGVLVCNIDNGYGAACAAIRILLSSRQNEATSEIGQKGRGTAPQE

[ Warning ] 5' incomplete: does not start with a Methionine

Phylogeny

22 Populations

Neighbor-Joining/UPGMA method version 3.6a2.1


 Neighbor-joining method

 Negative branch lengths allowed


                                                         +--Anaeromyxo  [d-proteobacteria]
                                         +---------------3 
                                      +-17               +----A.dehaloge [d-proteobacteria]
                                      !  !  
                                      !  +-----------------------P.pacifica [d-proteobacteria]
                                      !  
                                      !              +--------P.propioni [d-proteobacteria]
                                      !           +-12  
                                      !           !  +---------G.lovleyi [d-proteobacteria]
                                      !        +-14  
                                   +-20        !  !      +---G.sulfurre [d-proteobacteria]
                                   !  !        !  +------7 
                                   !  !     +-15         +--G.metallir [d-proteobacteria]
                                   !  !     !  !  
                                   !  !     !  !      +----G.uraniumr [d-proteobacteria]
                                   !  !     !  !  +---8 
                                   !  !  +-16  +-13   +-----Geobacter [d-proteobacteria]
                                   !  !  !  !     !  
                                +-18  !  !  !     +--------G.bemidjie [d-proteobacteria]
                                !  !  +-19  !  
                                !  !     !  +------------P.carbinol [d-proteobacteria]  
                                !  !     !  
                                !  !     +------------------S.fumaroxi [d-proteobacteria]
                                !  !  
                        +------11  !      +-----------------C.concisus [e-proteobacteria]
                        !       !  !   +--9 
                        !       !  +--10  +----------W.succinog [e-proteobacteria]
                        !       !      !  
  +---------------------6       !      +--------------S.aciditro [d-proteobacteria]
  !                     !       !  
  !                     !       +-------------------D.psychrop [d-proteobacteria]
  !                     ! 
  !                     !    +---------------Methylobac [a-proteobacteria]
  !                     +----5 
  !                          !    +------------------ORF_ES2245
  !                          +----4 
  !                               +--------------------V.shilonii [g-proteobacteria]
  ! 
  !   +--P.marinus2 [cyanobacteria]
  2---1 
  !   +-P.marinus3 [cyanobacteria]
  ! 
  +-----P.marinus1 [cyanobacteria]



Parsimony tree:

Tree(s) from protpars:


Protein parsimony algorithm, version 3.6a2.1



One most parsimonious tree found:




                                                           +-----S.aciditro [d-proteobacteria]
                                +-------------------------14  
                                !                          !  +--W.succinog [e-proteobacteria]
                                !                          +-13  
                                !                             +--C.concisus [e-proteobacteria]
                                !  
                                !                          +-----V.shilonii [g-proteobacteria]
                                !                 +-------21  
                                !                 !        !  +--ORF_ES2245
                                !                 !        +-20  
           +-------------------12              +-19           +--Methylobac [a-proteobacteria]
           !                    !              !  !  
           !                    !              !  !        +-----P.marinus1 [cyanobacteria]
           !                    !              !  +-------18  
           !                    !           +-16           !  +--P.marinus3 [cyanobacteria]
           !                    !           !  !           +-17  
           !                    !           !  !              +--P.marinus2 [cyanobacteria]
           !                    !        +-15  !  
           !                    !        !  !  +-----------------D.psychrop [d-proteobacteria]
        +-10                    +-------11  !  
        !  !                             !  +--------------------S.fumaroxi [d-proteobacteria]
        !  !                             !  
        !  !                             +-----------------------P.carbinol [d-proteobacteria]
        !  !  
        !  !                                               +-----G.bemidjie [d-proteobacteria]
        !  !                                   +-----------9  
        !  !                                   !           !  +--Geobacter [d-proteobacteria]
        !  !                                   !           +--8  
     +--3  !                                   !              +--G.uraniumr [d-proteobacteria]
     !  !  +-----------------------------------7  
     !  !                                      !              +--G.metallir [d-proteobacteria]
     !  !                                      !        +-----6  
     !  !                                      !        !     +--G.sulfurre [d-proteobacteria]
     !  !                                      +--------5  
  +--2  !                                               !     +--G.lovleyi [d-proteobacteria]
  !  !  !                                               +-----4  
  !  !  !                                                     +--P.propioni [d-proteobacteria]
  !  !  !  
  1  !  +--------------------------------------------------------P.pacifica [d-proteobacteria]
  !  !  
  !  +-----------------------------------------------------------A.dehaloge [d-proteobacteria]
  !  
  +--------------------------------------------------------------Anaeromyxo [d-proteobacteria]

  remember: this is an unrooted tree!


requires a total of   2254.000

Annotator commentaries

ORF: To annotate our sequence, we first searched for ORFs with the ORF Finder tool on SMS. We ran a "greedy" search using the criterion "starts with any codon" (any codon), keeping only ORFs longer than 60 codons that end with a stop codon. This gave five ORFs: two on the direct strand and three on the reverse strand. We kept the one encoding the longest protein. However, this ORF does not start with a methionine, so it is incomplete. We then ran a second ORF search with the criterion "starts with ATG", which returned the same ORF, only slightly shorter. To decide which ORF is the more complete, we ran a Blastp against nr and checked whether the top homologous sequences began at that methionine. Since they did not, we kept the longer ORF, without an initial methionine. The multiple alignment let us compare the position of the initial methionine; from it we conclude that our sequence is missing its first 27 amino acids. The fact that ORF Finder returns a result, that the protein is longer than 60 codons, and that it has homologs with good e-values lets us conclude that the sequence is coding.
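The "any codon" search described above amounts to collecting stop-to-stop reading frames on both strands. The following is a minimal sketch of that idea, not the SMS ORF Finder implementation: the function names, the `min_codons` parameter and the toy sequence are assumptions made for illustration, and the standard genetic code is used.

```python
# Hypothetical sketch of a stop-to-stop ("any codon" start) ORF search.
# Codon table built in the usual TCAG order (standard genetic code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def find_orfs(dna, min_codons=60):
    """List (strand, frame, protein) for every frame segment that ends
    with a stop codon and has at least min_codons residues, longest first."""
    orfs = []
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for frame in range(3):
            segments = translate(seq[frame:]).split("*")
            # The last segment runs off the end without a stop: skip it.
            for protein in segments[:-1]:
                if len(protein) >= min_codons:
                    orfs.append((strand, frame, protein))
    return sorted(orfs, key=lambda orf: len(orf[2]), reverse=True)


# Toy input (not the metagenomic sequence): one short ORF on the direct strand.
toy = "ATG" + "GCT" * 5 + "TAA"
print(find_orfs(toy, min_codons=3))
```

Restricting the search to ORFs that start with ATG would mean trimming each protein to its first "M", which is why that second search returns a shorter version of the same ORF.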

Homology: We ran two Blastp searches, one against Swiss-Prot and one against nr. The NCBI Blastp against Swiss-Prot returned homologous sequences with a best e-value of 4e-25. The NCBI Blastp against nr returned better e-values, the best being 2e-49, so we chose to keep the results of this second Blastp. They show that there are many homologs, which is useful for confirming the results of the functional-domain searches in InterPro and for forming a first idea of the phylogeny. The sequences returned by the NCBI Blastp against nr are bacterial oxidoreductases, except for ten "circadian phase modifier" proteins or derivatives, so we cannot put forward any firm hypothesis about the function of our protein. Circadian phase modifiers shift circadian phases by perturbing either the minimum and maximum values or the length of the cycle (for example the sleep cycle). Some of them may be oxidoreductases, which are enzymes that catalyze the transfer of electrons from one molecule to another. The protein domain search in InterPro reveals two domains. The PurE domain, spanning amino acids 96 to 246, has no known function. Its gene is found in 4 of the 11 Swiss-Prot hits (we checked in this database because its functional information is more reliable, having been verified by curators). The AIR carboxylase domain, spanning amino acids 106 to 225, lies within the PurE domain. It functions as a phosphoribosylaminoimidazole carboxylase involved in IMP biosynthesis, which is part of purine biosynthesis.
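To tally hits by annotation rather than by eye, the one-line summaries of the BLAST report can be parsed. This is a sketch under assumptions: the regular expression targets the plain-text layout shown in the BLAST section of this page, the keyword categories are a hypothetical grouping chosen for illustration, and truncated descriptions ("...") are classified only on the text that survives.

```python
import re
from collections import Counter

# One summary line: accession, description, bit score, E-value.
HIT_LINE = re.compile(
    r"^(?P<acc>\S+)\s+(?P<desc>.+?)\s+(?P<bits>\d+)\s+"
    r"(?P<evalue>\d+(?:\.\d+)?(?:e-\d+)?)\s*$"
)


def parse_hits(report_text):
    """Return a list of (accession, description, bits, evalue) tuples."""
    hits = []
    for line in report_text.splitlines():
        match = HIT_LINE.match(line.strip())
        if match:
            hits.append((
                match["acc"],
                match["desc"],
                int(match["bits"]),
                float(match["evalue"]),
            ))
    return hits


def classify(description):
    """Hypothetical keyword grouping of hit descriptions."""
    text = description.lower()
    if "circadian" in text or "cpma" in text:
        return "CpmA-like"
    if any(key in text for key in ("pure", "ncair", "air carboxylase",
                                   "phosphoribosyl")):
        return "PurE-related"
    return "other"


# Three summary lines copied from the report on this page.
sample = """\
ref|ZP_01866594.1|  hypothetical protein VSAK1_18939 [Vibrio s...   199    2e-49
ref|ZP_01850389.1|  putative circadian phase modifier CpmA-lik...   195    2e-48
ref|ZP_00778228.1|  NCAIR mutase (PurE)-related proteins [Ther...   166    1e-39
"""
hits = parse_hits(sample)
print(Counter(classify(desc) for _, desc, _, _ in hits))
```

Running the same functions over the full hit list would give the share of each annotation class among the hits, keeping in mind that "hypothetical protein" entries carry no functional information.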
In the multiple alignment, built with ClustalW, the protein regions spanning amino acids 5 to 49 and 92 to 224 appear highly conserved. The second conserved region corresponds to the PurE and AirC functional domains. The function of the first region is not known precisely but, since it is well conserved, it may take part in regulating the function of the protein, and perhaps even that of the second region.
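Conserved regions can also be located programmatically by listing the alignment columns where every sequence carries the same residue, which is what the "*" marks under a ClustalW block indicate. The sketch below is an illustration only: `identical_columns` is a hypothetical helper, and the three rows are the first 20 columns of the last alignment block for ORF_ES22450, V.shilonii and P.marinus1, so the result applies to that excerpt and not to the full 22-sequence alignment.

```python
def identical_columns(rows):
    """Return 0-based indices of columns where all rows share the same
    residue. Columns containing a gap ('-') are never counted."""
    if not rows:
        return []
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("aligned rows must all have the same length")
    return [
        i for i in range(length)
        if rows[0][i] != "-" and all(row[i] == rows[0][i] for row in rows)
    ]


# Excerpt of three aligned rows taken from the alignment on this page.
excerpt = [
    "LSACLSSCAPGVLVCNIDNG",  # ORF_ES22450
    "LHSCLGSCAAGVMTMNIDNG",  # V.shilonii
    "LNSMLSSCAPGISVMNIDNG",  # P.marinus1
]
columns = identical_columns(excerpt)
print(columns)
print(f"{len(columns)}/{len(excerpt[0])} columns identical in this excerpt")
```

Runs of consecutive indices in the output correspond to fully conserved stretches such as the NIDNG motif at the end of this excerpt.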

Potential function: The results above indicate that our protein may be a phosphoribosylaminoimidazole carboxylase. Further searches on the protein domains in InterPro confirm that our protein sequence is involved in a phosphoribosylaminoimidazole carboxylase complex, which takes part in nucleotide synthesis. They also let us put forward a hypothesis about its biological processes, which appear to be IMP biosynthesis and purine biosynthesis. We hesitated over the choice of biological process, since purine biosynthesis feeds into both DNA and RNA metabolism. We chose RNA metabolism because it carries a notion of expression at the right time and in the right place, and because our sequence could plausibly be involved in circadian cycles.

Phylogeny: The taxonomy reports suggest that our sequence is very close to the Proteobacteria, because these are the first, highest-scoring organisms in the lineage report. However, the scores of the proteobacterial sequences are not very different from those of the cyanobacteria, other bacteria, or even archaea. To determine more precisely which organisms our sequence is closest to, we built several phylogenetic trees, varying the study group and the outgroup. The only group into which our sequence integrated was the Proteobacteria. We therefore selected different kinds of proteobacteria, from different organisms, with the highest scores. For the outgroup, we chose sequences homologous to ours but not from Proteobacteria, and settled on the cyanobacteria. Among the Proteobacteria, our sequence may be a g-proteobacterium: it appears phylogenetically close to a g-proteobacterium, V. shilonii (evolutionary distance of 8). This hypothesis of evolutionary proximity is supported by the parsimony tree. However, its membership of the g-proteobacteria remains a hypothesis, because the Blast with 100 target sequences returned only one g-proteobacterial sequence. Moreover, in a taxonomy report from the NCBI Blastp against nr with 1000 target sequences, the other g-proteobacteria have scores of 35 or less, which seems to indicate an absence of homology.
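A neighbor-joining tree is built from a matrix of pairwise distances between aligned sequences. The sketch below computes the simplest such measure, the p-distance (fraction of differing positions among gap-free columns). It is an illustration only: the trees on this page were produced with PHYLIP, whose distance model is not stated here, `p_distance` is a hypothetical helper, and the rows are a single 60-column block copied from the multiple alignment, so the values are not comparable to the branch lengths shown.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of differing residues over columns without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


# One 60-column block of the ClustalW alignment (fourth block on this page).
block = {
    "ORF_ES22450": "VAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGETA",
    "Methylobac":  "VAGLWRLTRRLEEIRAHPVVVAVAGMDAALASVLGGLVAGAVIGVPTSVGYGVAAGGRPA",
    "V.shilonii":  "VSALWRLTNALEDINKAKIIIAVAGMEAALPTVLAGLTPRPIIAVPTSVGYGVSNGGELA",
    "P.marinus1":  "VAGLHRLLNQIDEINKYDVLIVCAGMEGALATVIGGLLPQPIIAVPVSIGYGVSKNGETA",
}

for name_a, name_b in combinations(block, 2):
    distance = p_distance(block[name_a], block[name_b])
    print(f"{name_a:12s} {name_b:12s} {distance:.3f}")
```

Applying the same function to the full-length aligned sequences would give a complete distance matrix, which a neighbor-joining program could then cluster.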

Multiple Alignment

CLUSTAL W (1.82) multiple sequence alignment


Anaeromyxobacter       --------------MDERKLRRLLDGVRKGDTTVDDAVADMKHAP--FERLGDVATLDTH
A.dehalogenans         --------------MDEKTLRALLARVKRGQSSLDEAVAALKGAP--FERMGDLATLDTH
P.pacifica             ---------MSAAATSAQILDALLAGELDRDAALTKLEGLRPSAPSAVEALDDYARIDHD
P.propionicus          --------------MTPNSIRELLGAVSQGKTSVDDAVERLRHLP---FQDLGCAQVDHH
G.lovleyi              ------------MLMVLEELQALLQAVAAGATTPEQGLERLRHLP---FEDLGFAQVDHH
G.sulfurreducens       --------------MDPRELKTILRSFKDGALSEDEMLERLRHLP---YEDVGDALVDHH
G.metallireducens      MIFLLECTDTRLDAMDPNELKILLNEFKSGAIGEDEALERLRHLP---FEDIGDAMVDHH
G.uraniumreducens      --------------MDRTELKRLLYNVKNNDIGVEDALDRLKHLP---FEELGCATIDHH
Geobacter              --------------MDRAELKLLLQKVKENRINVDEALKCLRHLP---YEELDCATVDHH
G.bemidjiensis         ------MYGGKIRFMQSKVIENILQEVGAGSLDVQTALERLKHLP---FEDVGCATVDHH
P.carbinolicus         --------------MNPSELHKLLSAIRSGQVSVEEGLSRLATLP---FEDVGEALIDHH
S.fumaroxidans         ----------------MQRLEQILKDYKEGRRELSDVLTYLRKLP---FEDLSFARIDHH
C.concisus             --------------MSETEILEFIAGIKNGKMSEQDALKYLKNYP---FNDIGCAKIDTQ
W.succinogenes         --------------MQTQAILTLLEQVQQGNLDIQGALEQLKKLP---FEDLGFAKIDHH
S.aciditrophicus       --------------MDTDVLRDILEKVKEGRVAVDEGMALLKSHF---YLDVGCAKIDTH
D.psychrophila         --------------MNPHLLTEILGLLQDGTNSLEQTLKELKDFP---AERVKDACIDHQ
ORF_ES22450            -----------------------------------------DSGTPDMTSSSSEIKLDWD
Methylobacterium       ---------------------------------------------------MTEFVLDVE
V.shilonii             ---------------------------------------------------MSGIIIDFE
P.marinus2             --------------------------------------------------MNFDIKFDFQ
P.marinus3             --------------------------------------------------MNLDIRFDFQ
P.marinus1             --------------------------------------------------MNLDIRFDFQ
                                                                               .* .

Anaeromyxobacter       RALRVGMPEVVLAESKTAAQVAAIAKQLAARGP-LLVTRLAPDKAAAARR-------AVK
A.dehalogenans         RTLRVGMPEVVLAESKTAAQVAAIARKLAARGP-LLVTRLAPEKAGPARR-------AVK
P.pacifica             RAKRRGFPEVIYGPGKTTEQIVGIFGRLAAHNPNVLCTRTTAETAEAVRARLVEDGVGVE
P.propionicus          RELRQGMPEVIFGEGKDAGQIVQIMTAMHATGSNILVTRLSPEKGAEIGTH-------FP
G.lovleyi              RVLRQGQPEVIFGQGKTVEQIGRIMEAMVARGSNVLVTRLEEVRAVELLAV-------FP
G.sulfurreducens       RGLRQGFPEVIFGAGKSAGQVERIMASLAAKGNNILVTRLDEAKALAVKEA-------FP
G.metallireducens      RSLRQGFPEVIFGAGKSAGQIERIMASLAARGNNILVTRLDEAKALAVKEA-------FP
G.uraniumreducens      RTLRQGFPEVIFGESKTVGQMEQIIIALLAKGNNILATRVDGEKAAKLRQT-------FP
Geobacter              RALRQGFPEVIFGESKTVGQMEQIIMALLGKGNNVLATRVDGEKGAALMHK-------FP
G.bemidjiensis         RSLRQGFPEVIFGQGKNLAQMRTIIAALLAKGGNVLATRVSAAKGAKLKET-------FP
P.carbinolicus         RGLRQGAPEVIFGQGKTSEQILRIAEGLLNGNNNVLVTRLDADKAAPLVER-------WP
S.fumaroxidans         RKLRRGFPEVVYGEGKSAEQILSIIRAMKEFGSNVLVTRVDAVKAGHILEN-------ID
C.concisus             RALRNGTGEVIYGANKTDDEILQIASAIGEKNENILITRTNESVFKRMREI-------FP
W.succinogenes         REIRNGYPEVIYAQGKTPAQISEITKHLLERGNNILATRASEAAYESILAF-------CP
S.aciditrophicus       REWRVGYPEVIFCSGKTVEQVKAIVKAMLEREANILATRASEDLFEEIRKI-------CP
D.psychrophila         REIRTGIPEVIYGAAKSVEQIITIARAQIATGGPVIATRVEREKAKQVQLVLP------E
ORF_ES22450            RTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEVS-----SD
Methylobacterium       RPARIGLEEAVYAAGKTAAQVAAILAAAAERRASLLVTRLDPEKLAALPGTLR-----AE
V.shilonii             RKQRCGVEEAILCSSKSPLQIEQIIQMALERDARMLFTRLSLEKWQSLSPTFH-----PK
P.marinus2             RRDRLGLIEAIWGQDKSIDQLERLCGNVLSKNEVVFITRINSEKANYLLDLYD-----DA
P.marinus3             RRERLGLIEAIWGQDKSIDQLKRLSENVLSKNEVVFITRINSEKAIDLLDFYD-----DA
P.marinus1             RRERIGLIEAIWGADKSVDQLKRVSKEVLQKKEVVLITRIDKEKAIHLLDEFE-----EA
                       *  * *  *.:    *   ::  :          :: **                     

Anaeromyxobacter       GATYDPVSRTLRKGR--MDLPARGPVAVCCAGTSDIPVCEEAAVTLDVMGVEAIRIYDVG
A.dehalogenans         GSVYDPVSRTLRRGR--MDLPARGPVAVCCAGTSDIPVCEEAAVTLEVMGVEPIRVYDVG
P.pacifica             ALEYEPRSGLLSLWRE-REVRYPGTIAVVSAGTSDLRVAVEAERCATIMGNRVEALADVG
P.propionicus          AARYHDEARCLTLAQRPPEMRGRGTILVVSAGTSDIGVAAEALVTARFLGNEAEPLYDVG
G.lovleyi              RSFYHAEARCLTLETTPKREQGRGTILIVAAGTSDLQVASEALVTARFMGNQAELLCDVG
G.sulfurreducens       SAMWHADARCLTLEQRPIEKRGLGTVLVLSAGTSDLPVAAEALVTLRMLGNEASHLYDVG
G.metallireducens      PAVWHADARCLTLEQKPAERRGRGTVLVVSAGTSDLPVAAEALVTLRMLGNDTEHLYDVG
G.uraniumreducens      TAHYHSEARCLTIEQKPVELKGRGKILVISAGTSDIPVAGEAVITGRMMGNEVESLFDVG
Geobacter              VASYHRESRCLTIEQKPVEAKGRGKIVVISAGTSDMPVAGEAVITARIMGNEVETLFDVG
G.bemidjiensis         QAVYHPDARALTIEQHPVELRGKGKILVVCAGTSDIPVAAEALLTARLMGNEVEHIYDVG
P.carbinolicus         EAAYDPLGRTLTIVRHAVATTGRGPILVICAGTSDLPVAREASTAARMLGNRVEELADVG
S.fumaroxidans         GPVYHPVARVLSYEQERVIPRCRGIVQVICAGSSDVPVAEEAAITAEMMGQSVERYCDVG
C.concisus             QANFNARGRVISVKFK-EPAPTKSYIAIVSAGTADGAVVEEAYETAKFLGNDVRKFTDVG
W.succinogenes         EAKYNALGKTITIKRR-EITPPPTHITIVAAGTSDLPVVEEAYETATILGNRVEKIVDVG
S.aciditrophicus       AAVYSSLARAFTIKRK-ALEVSSGYIALVTAGTSDLPVAEEAAVTAELFGNRVERIVDVG
D.psychrophila         CHYHERASMLTCLERRATAPSFRGEALILCAGTSDIPVAEEARVTLEALGSPVKTIYDIG
ORF_ES22450            LDYDPVSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKAIFDVG
Methylobacterium       IDYCDLSRTAWFGPPRPVRGQ--GRIAIVAAGTSDLPVAREAERTLRYAGEAATVIADVG
V.shilonii             LSYCSVSETAILG-TITDLTQQESPIAIVSGGSSDTNICHEILRTLNYHGVSASLYEDVG
P.marinus2             RFHEEANCLLIGKNFNKIITN--KKVAIISGGSSDLAVTLEAQLALEIYGVNCQSFIDVG
P.marinus3             RFYEEAKCLIIGKNLNKLNTN--KKVAIISGGSSDLPVTLEAQLALEIYGVNCQSFIDVG
P.marinus1             IFYEEANCLIVGENFNKISTN--KKVAIIAGGSSDLAVTLEAKLSLELYGVRCQSFIDVG
                                                  :  .*::*  :  *        *       *:*

Anaeromyxobacter       VAGLHRLLARRGDLERARAVIVVAGMEGALPSVVGGLVGRPVIGVPTSVGYGASLGGIAP
A.dehalogenans         VAGIHRLLARRDDLDRARAVIVAAGMEGALPSVVGGLVGRPVIGVPTSVGYGASLGGLAP
P.pacifica             VAGLHRLLAVRPTLERARVIVCVAGFEAALPSVVAGLVPCPVIAVPTSTGYGASFGGLSA
P.propionicus          VSGIHRLLARREMLCSASVIIVVAGMEGALPSVVGGLVDRPVIAVPTSVGYGASFGGIAA
G.lovleyi              VAGIHRLLSRMELLRAATVLIVVAGMEGALPSVVGGLVDRPVIAVPTSVGYGASFGGIAA
G.sulfurreducens       VAGIHRLLARRDVLFSARVLIVVAGMEGALPSVVGGLVDRPVIAVPTSVGYGASFGGIAA
G.metallireducens      VAGIHRLLARREALTAASVLIVVAGMEGALPSVVGGLVDRPVIAVPTSVGYGAAFGGIAA
G.uraniumreducens      VAGLHRLLAQKELLISAAVVIVVAGMEGALPSVVGGLVDKPVIAVPTSVGYGASFGGIAA
Geobacter              VAGLHRLLARKELLFSASVIIVVAGMEGALPSVVGGLVDKPVIAVPTSVGYGASFGGVAA
G.bemidjiensis         VAGLHRLLARRSALAEASVIIVVAGMEGALPSVVGGLVDKPVIAVPTSIGYGASFGGVAA
P.carbinolicus         VAGLHRLLAHLDQLRRASVIIAVAGMEGALPSVIGGLVAAPVIAVPTSVGYGASLGGVAA
S.fumaroxidans         VAGLHRLLGIWDELQKGSVYVVVAGMEGALPSVVAGLVRRPVIAVPTSVGYGASFGGVGA
C.concisus             VAGLHRLVAKIDEIRGAKVVIAVAGMEGALASVLAGLVSVPVIAVPTSVGYGANFGGLSA
W.succinogenes         VAGLHRLLAHIEVIRQAKVLIVVAGMEGALASVIGGLVDKPVIAVPTSVGYGASFGGLSA
S.aciditrophicus       VAGIHRLFYNLEAIRGARVVIVVAGMEGALPSVVGGLVDKPVIAVPTSVGYGASFNGLSA
D.psychrophila         IAGLHRLLLHREVISQASVIIVVAGMEGALASVVGGLCTAPIIGVPTSVGYGASFGGVSA
ORF_ES22450            VAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGETA
Methylobacterium       VAGLWRLTRRLEEIRAHPVVVAVAGMDAALASVLGGLVAGAVIGVPTSVGYGVAAGGRPA
V.shilonii             VSALWRLTNALEDINKAKIIIAVAGMEAALPTVLAGLTPRPIIAVPTSVGYGVSNGGELA
P.marinus2             VAGLHRLMSQLEEINKYDVLIVCAGMEGALATVVGGLLAQPIIAVPVSVGYGVSKDGETA
P.marinus3             VAGLHRLISQLEEINKYDVLIVCAGMEGALATVVGGLLAQPIIAVPVSVGYGVSKNGETA
P.marinus1             VAGLHRLLNQIDEINKYDVLIVCAGMEGALATVIGGLLPQPIIAVPVSIGYGVSKNGETA
                       ::.: **      :      :  **::.**.:*:.**   .:*.:*.* ***.  .*  .

Anaeromyxobacter       LLTMLNSCAPNVTVVNIDNGFGAAFVAGLVARE---------------------
A.dehalogenans         LFTMLNSCAPNVTVVNVDNGFGAAFVAGLVARG---------------------
P.pacifica             LLGCLTSCAAGVTTVNIDNGFGAAYAATLMNRLGTEDPSR--------------
P.propionicus          LLGMLNSCAAGVTVVNIDNGFGAACAASLINRV---------------------
G.lovleyi              LLGMLNSCASGVTVVNIDNGFGAACAASLMNRV---------------------
G.sulfurreducens       LLGMLNSCAAGVTVVNIDNGFGAAVAASKINRE---------------------
G.metallireducens      LLGMLNSCAAGVTVVNIDNGFGAAVAASKINRE---------------------
G.uraniumreducens      LLGMLNSCAAGVTVVNIDNGFGAAYAASLMNRE---------------------
Geobacter              LLGMLNSCAAGVTVVNIDNGFGAAYAASLINRESSSTI----------------
G.bemidjiensis         LLGMLNSCAAGVTVVNIDNGFGAAYAASLMNRVHR-------------------
P.carbinolicus         LLGMLNSCASGVTVVNIDNGFGAACAAARINRQEQP------------------
S.fumaroxidans         LLGMLNSCAPGVAVVNIDNGFGAGYLASVINEGGLPPGEARPEGEGAS------
C.concisus             LLAMLNSCANGISVVNIDNGFGAAYNASLINHL---------------------
W.succinogenes         LLCMLNSCASGVSVVNIDNGFGAGYNASIINHL---------------------
S.aciditrophicus       LLAMLNSCASGVCVVNIDNGYGAGYLAGMINRL---------------------
D.psychrophila         LLTMLNSCAPGLAVVNIDNGFGAACMAFSINRQNPTG-----------------
ORF_ES22450            LSACLSSCAPGVLVCNIDNGYGAACAAIRILLSSRQNEATSEIGQKGRGTAPQE
Methylobacterium       LDVMLASCAPGLAVVNIDNGYGAACAALRFLHAAGRLAGA--------------
V.shilonii             LHSCLGSCAAGVMTMNIDNGFGAACAAIKLLNSFQPN-----------------
P.marinus2             LNSMLSSCSPGIAVMNIDNGYGAAMAALRIIKSIS-------------------
P.marinus3             LNSMLSSCSPGVAVMNIDNGYGAAMAALRIIKSIS-------------------
P.marinus1             LNSMLSSCAPGISVMNIDNGYGAAMAALRIIKRI--------------------
                       *   * **: .: . *:***:**.  *  .                        

BLAST

NCBI Blastp against nr

BLASTP 2.2.17 (Aug-26-2007) 
Reference:
Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman 
(1997), "Gapped BLAST and PSI-BLAST: a new generation of 
protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference:
Schäffer, Alejandro A., L. Aravind, Thomas L. Madden, Sergei 
Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and 
Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST 
protein database searches with composition-based statistics 
and other refinements", Nucleic Acids Res. 29:2994-3005.

RID: KPEDXA6001R


Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
from WGS projects
           5,633,163 sequences; 1,947,344,958 total letters
Query=  Translation of ORF number 1 in reading frame 1 on the reverse strand.
Length=249

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

ref|ZP_01866594.1|  hypothetical protein VSAK1_18939 [Vibrio s...   199    2e-49
ref|ZP_01850389.1|  putative circadian phase modifier CpmA-lik...   195    2e-48
ref|YP_001011753.1|  putative circadian phase modifier CpmA-li...   176    9e-43 
ref|YP_001091687.1|  putative circadian phase modifier CpmA-li...   172    2e-41 
ref|YP_001484702.1|  putative circadian phase modifier CpmA [P...   171    3e-41 
ref|YP_292039.1|  circadian phase modifier CpmA homolog [Proch...   171    4e-41 
ref|YP_001015518.1|  putative circadian phase modifier CpmA [P...   171    4e-41 
ref|YP_001009867.1|  putative circadian phase modifier CpmA-li...   170    8e-41 
ref|YP_397868.1|  putative circadian phase modifier CpmA [Proc...   169    1e-40 
ref|YP_729997.1|  putative circadian phase modifier CpmA homol...   167    5e-40 
ref|ZP_00778228.1|  NCAIR mutase (PurE)-related proteins [Ther...   166    1e-39
ref|NP_893395.1|  putative circadian phase modifier CpmA homol...   165    2e-39 
ref|NP_622425.1|  NCAIR mutase (PurE)-related proteins [Thermo...   164    3e-39 
ref|ZP_01455801.1|  NCAIR mutase (PurE)-related proteins [Ther...   163    7e-39
ref|ZP_01004957.1|  Circadian phase modifier CpmA-like protein...   161    4e-38
ref|ZP_01470649.1|  circadian phase modifier CpmA-like protein...   160    9e-38
emb|CAJ71941.1|  conserved hypothetical protein [Candidatus Ku...   158    3e-37
ref|ZP_01695562.1|  AIR carboxylase, putative [Bacillus coagul...   154    3e-36
ref|YP_001307704.1|  NCAIR mutase (PurE)-related protein [Clos...   153    9e-36 
ref|YP_358887.1|  hypothetical protein CHY_0015 [Carboxydother...   153    1e-35 
ref|YP_001252653.1|  hypothetical protein CBO0105 [Clostridium...   152    2e-35 
ref|YP_357809.1|  NCAIR mutase (PurE)-related proteins [Peloba...   152    2e-35 
ref|ZP_01124630.1|  putative circadian phase modifier CpmA [Sy...   152    3e-35
ref|YP_795944.1|  NCAIR mutase (PurE)-related protein [Lactoba...   151    3e-35 
ref|YP_001225444.1|  Circadian phase modifier CpmA homolog [Sy...   151    4e-35 
ref|YP_001389468.1|  hypothetical protein CLI_0160 [Clostridiu...   150    6e-35 
ref|YP_001111523.1|  hypothetical protein Dred_0149 [Desulfoto...   150    6e-35 
ref|YP_001227989.1|  Circadian phase modifier CpmA homolog [Sy...   150    8e-35 
ref|YP_001017951.1|  putative circadian phase modifier CpmA-li...   150    8e-35 
ref|YP_462061.1|  phosphoribosylaminoimidazole carboxylase NCA...   149    1e-34 
ref|YP_001393731.1|  PurE-related protein [Clostridium kluyver...   149    2e-34 
ref|YP_187850.1|  AIR carboxylase, putative [Staphylococcus ep...   148    3e-34 
ref|NP_347412.1|  NCAIR mutase (PurE)-related protein [Clostri...   148    3e-34 
ref|ZP_01576355.1|  NCAIR mutase (PurE)-related protein [Clost...   148    3e-34
ref|NP_763927.1|  NCAIR mutase (PurE)-related protein [Staphyl...   148    3e-34 
ref|YP_001180494.1|  NCAIR mutase (PurE)-related protein [Cald...   148    4e-34 
ref|NP_894191.1|  putative circadian phase modifier CpmA homol...   147    5e-34 
ref|YP_001139288.1|  hypothetical protein cgR_2380 [Corynebact...   147    7e-34 
ref|YP_377508.1|  circadian phase modifier CpmA-like [Synechoc...   147    7e-34 
ref|YP_900314.1|  hypothetical protein Ppro_0625 [Pelobacter p...   146    1e-33 
ref|YP_001318366.1|  1-(5-phosphoribosyl)-5-amino-4-imidazole-...   145    1e-33 
gb|EDP13334.1|  hypothetical protein CLOBOL_06373 [Clostridium...   145    2e-33
ref|ZP_01965895.1|  hypothetical protein RUMOBE_03643 [Ruminoc...   145    2e-33
ref|YP_001036799.1|  NCAIR mutase (PurE)-related protein [Clos...   145    3e-33 
ref|ZP_01594602.1|  conserved hypothetical protein [Geobacter ...   144    4e-33
ref|ZP_01083949.1|  putative circadian phase modifier CpmA-lik...   144    6e-33
ref|NP_738970.1|  putative carboxylase [Corynebacterium effici...   144    7e-33 
ref|ZP_02026789.1|  hypothetical protein EUBVEN_02054 [Eubacte...   143    7e-33
ref|YP_724157.1|  putative phosphoribosylaminoimidazole carbox...   143    1e-32 
ref|ZP_01081331.1|  putative circadian phase modifier CpmA-lik...   142    1e-32
ref|ZP_01771846.1|  Hypothetical protein COLAER_00835 [Collins...   142    1e-32
ref|ZP_01623146.1|  probable phosphoribosylaminoimidazole carb...   142    2e-32
ref|NP_681979.1|  circadian phase modifier CpmA homolog [Therm...   142    2e-32 
ref|ZP_01875229.1|  hypothetical protein LNTAR_16012 [Lentisph...   142    3e-32
ref|ZP_00107377.1|  COG1691: NCAIR mutase (PurE)-related prote...   141    3e-32
ref|YP_001213181.1|  NCAIR mutase (PurE)-related proteins [Pel...   141    4e-32 
ref|YP_300265.1|  putative NCAIR mutase [Staphylococcus saprop...   140    6e-32 
ref|NP_951202.1|  hypothetical protein GSU0140 [Geobacter sulf...   140    6e-32 
ref|YP_395399.1|  Putative NCAIR mutase, PurE-related protein ...   140    7e-32 
ref|YP_383164.1|  hypothetical protein Gmet_0193 [Geobacter me...   140    7e-32 
ref|ZP_01666253.1|  conserved hypothetical protein [Thermosinu...   139    1e-31
ref|NP_070103.1|  hypothetical protein AF1275 [Archaeoglobus f...   139    1e-31 
ref|YP_805156.1|  NCAIR mutase (PurE)-related protein [Pedioco...   139    1e-31 
ref|YP_147817.1|  hypothetical protein GK1964 [Geobacillus kau...   139    2e-31 
ref|YP_001125968.1|  NCAIR mutase (pure)-related protein [Geob...   139    2e-31 
ref|YP_448505.1|  hypothetical protein Msp_1492 [Methanosphaer...   138    3e-31 
ref|NP_897701.1|  putative circadian phase modifier CpmA homol...   137    4e-31 
ref|ZP_01774416.1|  conserved hypothetical protein [Geobacter ...   137    7e-31
ref|YP_322326.1|  probable phosphoribosylaminoimidazole carbox...   137    8e-31 
ref|ZP_01905616.1|  hypothetical protein PPSIR1_39580 [Plesioc...   137    8e-31
ref|NP_783950.1|  NCAIR mutase, PurE-related protein [Lactobac...   137    9e-31 
ref|YP_001229003.1|  NCAIR mutase (PurE)-related protein-like ...   136    1e-30 
gb|ABV27369.1|  ncair mutase [Candidatus Chloracidobacterium ther   136    1e-30
ref|ZP_01389817.1|  conserved hypothetical protein [Geobacter ...   136    1e-30
ref|YP_064543.1|  hypothetical protein DP0807 [Desulfotalea ps...   135    2e-30 
ref|YP_828323.1|  hypothetical protein Acid_7127 [Solibacter u...   135    3e-30 
ref|NP_487925.1|  hypothetical protein alr3885 [Nostoc sp. PCC...   134    4e-30 
ref|NP_442897.1|  hypothetical protein sll1489 [Synechocystis ...   134    4e-30 
ref|YP_517993.1|  hypothetical protein DSY1760 [Desulfitobacte...   134    7e-30 
emb|CAO91356.1|  unnamed protein product [Microcystis aeruginosa    133    9e-30
ref|ZP_01730730.1|  hypothetical protein CY0110_06329 [Cyanoth...   133    9e-30
ref|NP_924408.1|  circadian phase modifier CpmA homolog [Gloeo...   133    1e-29 
ref|YP_846588.1|  hypothetical protein Sfum_2473 [Syntrophobac...   132    1e-29 
ref|ZP_01805115.1|  hypothetical protein CdifQ_04000218 [Clost...   132    2e-29
ref|YP_001381621.1|  hypothetical protein Anae109_4459 [Anaero...   132    2e-29 
ref|ZP_01469620.1|  circadian phase modifier CpmA-like protein...   132    2e-29
ref|YP_001086665.1|  hypothetical protein CD0195 [Clostridium ...   132    2e-29 
ref|YP_001466327.1|  ncair mutase [Campylobacter concisus 1382...   132    2e-29 
ref|YP_381215.1|  putative circadian phase modifier CpmA [Syne...   132    2e-29 
ref|YP_360116.1|  hypothetical protein CHY_1282 [Carboxydother...   132    2e-29 
ref|NP_875743.1|  Circadian phase modifier CpmA homolog [Proch...   131    3e-29 
ref|YP_467514.1|  hypothetical protein Adeh_4314 [Anaeromyxoba...   131    3e-29 
ref|YP_590423.1|  hypothetical protein Acid345_1347 [Acidobact...   130    8e-29 
ref|YP_001325629.1|  1-(5-phosphoribosyl)-5-amino-4-imidazole-...   129    2e-28 
ref|YP_477323.1|  circadian phase modifier CpmA-like protein [...   129    2e-28 
ref|NP_906400.1|  NCAIR MUTASE (PURE)-RELATED PROTEIN [Wolinel...   127    5e-28 
ref|ZP_00517535.1|  1-(5-Phosphoribosyl)-5-amino-4-imidazole-c...   127    6e-28
ref|YP_400185.1|  circadian phase modifier CpmA [Synechococcus...   127    8e-28 
ref|YP_171092.1|  circadian phase modifier CpmA [Synechococcus...   126    9e-28 
ref|YP_474975.1|  circadian phase modifier CpmA-like protein [...   125    2e-27 

ref|ZP_01866594.1|  hypothetical protein VSAK1_18939 [Vibrio shilonii AK1]
 gb|EDL54832.1|  hypothetical protein VSAK1_18939 [Vibrio shilonii AK1]
Length=220

 Score =  199 bits (505),  Expect = 2e-49, Method: Composition-based stats.
 Identities = 108/220 (49%), Positives = 143/220 (65%), Gaps = 1/220 (0%)

Query  12   SEIKLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEV  71
            S I +D++R  R G+ EAI C  KS  QI+ I++ A   +A +L TRLS+EK  +LS   
Sbjct  2    SGIIIDFERKQRCGVEEAILCSSKSPLQIEQIIQMALERDARMLFTRLSLEKWQSLSPTF  61

Query  72   SSDLDYDPVSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKAIF  131
               L Y  VS+TA         +Q +P +A+V+ G+SD  +  E LRTL ++G  A    
Sbjct  62   HPKLSYCSVSETAILGTITDLTQQESP-IAIVSGGSSDTNICHEILRTLNYHGVSASLYE  120

Query  132  DVGVAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNG  191
            DVGV+ +WRL   +E+I K  ++IAVAGM+AALPTVL GL P  IIA+PTS GYG S  G
Sbjct  121  DVGVSALWRLTNALEDINKAKIIIAVAGMEAALPTVLAGLTPRPIIAVPTSVGYGVSNGG  180

Query  192  ETALSACLSSCAPGVLVCNIDNGYGAACAAIRILLSSRQN  231
            E AL +CL SCA GV+  NIDNG+GAACAAI++L S + N
Sbjct  181  ELALHSCLGSCAAGVMTMNIDNGFGAACAAIKLLNSFQPN  220


ref|ZP_01850389.1|  putative circadian phase modifier CpmA-like protein [Methylobacterium 
sp. 4-46]
 gb|EDK91119.1|  putative circadian phase modifier CpmA-like protein [Methylobacterium 
sp. 4-46]
Length=222

 Score =  195 bits (495),  Expect = 2e-48, Method: Composition-based stats.
 Identities = 120/214 (56%), Positives = 150/214 (70%), Gaps = 2/214 (0%)

Query  12   SEIKLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEV  71
            +E  LD +R  RIGL EA++   K+  Q+ +IL  A    ASLL+TRL  EK +AL   +
Sbjct  2    TEFVLDVERPARIGLEEAVYAAGKTAAQVAAILAAAAERRASLLVTRLDPEKLAALPGTL  61

Query  72   SSDLDYDPVSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKAIF  131
             +++DY  +S+TA+F  P  PV+     +A+VAAGTSD  VAREA RTL + G  A  I 
Sbjct  62   RAEIDYCDLSRTAWFGPP-RPVRGQG-RIAIVAAGTSDLPVAREAERTLRYAGEAATVIA  119

Query  132  DVGVAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNG  191
            DVGVAG+WRL  R+EEIR HPVV+AVAGMDAAL +VLGGLV GA+I +PTS GYG +  G
Sbjct  120  DVGVAGLWRLTRRLEEIRAHPVVVAVAGMDAALASVLGGLVAGAVIGVPTSVGYGVAAGG  179

Query  192  ETALSACLSSCAPGVLVCNIDNGYGAACAAIRIL  225
              AL   L+SCAPG+ V NIDNGYGAACAA+R L
Sbjct  180  RPALDVMLASCAPGLAVVNIDNGYGAACAALRFL  213


ref|YP_001011753.1|  putative circadian phase modifier CpmA-like protein [Prochlorococcus 
marinus str. MIT 9515]
 gb|ABM72646.1|  putative circadian phase modifier CpmA-like protein [Prochlorococcus 
marinus str. MIT 9515]
Length=217

 Score =  176 bits (447),  Expect = 9e-43, Method: Composition-based stats.
 Identities = 96/213 (45%), Positives = 140/213 (65%), Gaps = 2/213 (0%)

Query  13   EIKLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEVS  72
            +I+ D+ R +RIGL EAI+   KSV+Q++ + +E       +L+TR+  EK   L +E  
Sbjct  4    DIRFDFQRRERIGLIEAIWGADKSVDQLKRVSKEVLQKKEVVLITRIDKEKAIHLLDEFE  63

Query  73   SDLDYDPVSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKAIFD  132
              + Y+  +      +  + +  N  +VA++A G+SD  V  EA  +LE  G   ++  D
Sbjct  64   EAIFYEE-ANCLIVGENFNKISTNK-KVAIIAGGSSDLAVTLEAKLSLELYGVRCQSFID  121

Query  133  VGVAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGE  192
            VGVAG+ RL+ +++EI K+ V+I  AGM+ AL TV+GGL+P  IIA+P S GYG S+NGE
Sbjct  122  VGVAGLHRLLNQIDEINKYDVLIVCAGMEGALATVIGGLLPQPIIAVPVSIGYGVSKNGE  181

Query  193  TALSACLSSCAPGVLVCNIDNGYGAACAAIRIL  225
            TAL++ LSSCAPG+ V NIDNGYGAA AA+RI+
Sbjct  182  TALNSMLSSCAPGISVMNIDNGYGAAMAALRII  214

ref|YP_001091687.1|  putative circadian phase modifier CpmA-like protein [Prochlorococcus 
marinus str. MIT 9301]
 gb|ABO18086.1|  putative circadian phase modifier CpmA-like protein [Prochlorococcus 
marinus str. MIT 9301]
Length=218

 Score =  172 bits (436),  Expect = 2e-41, Method: Composition-based stats.
 Identities = 94/215 (43%), Positives = 139/215 (64%), Gaps = 2/215 (0%)

Query  13   EIKLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEVS  72
            +IK D+ R DR+GL EAI+   KS++Q++ +     + N  + +TR++ EK + L + + 
Sbjct  4    DIKFDFQRRDRLGLIEAIWGQDKSIDQLERLCGNVLSKNEVVFITRINSEKANYLLD-LY  62

Query  73   SDLDYDPVSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKAIFD  132
             D  +   +      K  + +  N  +VA+++ G+SD  V  EA   LE  G   ++  D
Sbjct  63   DDARFHEEANCLLIGKNFNKIITNK-KVAIISGGSSDLAVTLEAQLALEIYGVNCQSFID  121

Query  133  VGVAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGE  192
            VGVAG+ RLM ++EEI K+ V+I  AGM+ AL TV+GGL+   IIA+P S GYG S++GE
Sbjct  122  VGVAGLHRLMSQLEEINKYDVLIVCAGMEGALATVVGGLLAQPIIAVPVSVGYGVSKDGE  181

Query  193  TALSACLSSCAPGVLVCNIDNGYGAACAAIRILLS  227
            TAL++ LSSC+PG+ V NIDNGYGAA AA+RI+ S
Sbjct  182  TALNSMLSSCSPGIAVMNIDNGYGAAMAALRIIKS  216


ref|YP_001484702.1|  putative circadian phase modifier CpmA [Prochlorococcus marinus 
str. MIT 9215]
 gb|ABV51116.1|  putative circadian phase modifier CpmA [Prochlorococcus marinus 
str. MIT 9215]
Length=218

 Score =  171 bits (434),  Expect = 3e-41, Method: Composition-based stats.
 Identities = 94/215 (43%), Positives = 138/215 (64%), Gaps = 2/215 (0%)

Query  13   EIKLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEVS  72
            +IK D+ R DR+GL EAI+   KS++Q++ +     + N  + +TR++ EK + L   + 
Sbjct  4    DIKFDFQRRDRLGLIEAIWGQDKSIDQLERLCRNVLSKNEVVFITRINSEKANYLLN-LY  62

Query  73   SDLDYDPVSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKAIFD  132
             D  +   +      K  + +  N  +VA+++ G+SD  V  EA   LE  G   ++  D
Sbjct  63   DDARFHEEANCLTIGKNFNKIITNK-KVAIISGGSSDLAVTLEAQLALEIYGVNCQSFID  121

Query  133  VGVAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGE  192
            VGVAG+ RLM ++EEI K+ V+I  AGM+ AL TV+GGL+   IIA+P S GYG S++GE
Sbjct  122  VGVAGLHRLMSQLEEINKYDVLIVFAGMEGALATVVGGLLAQPIIAVPVSVGYGVSKDGE  181

Query  193  TALSACLSSCAPGVLVCNIDNGYGAACAAIRILLS  227
            TAL++ LSSC+PG+ V NIDNGYGAA AA+RI+ S
Sbjct  182  TALNSMLSSCSPGIAVMNIDNGYGAAMAALRIIKS  216


ref|YP_292039.1|  circadian phase modifier CpmA homolog [Prochlorococcus marinus 
str. NATL2A]
 gb|AAZ58336.1|  circadian phase modifier CpmA homolog [Prochlorococcus marinus 
str. NATL2A]
Length=215

 Score =  171 bits (433),  Expect = 4e-41, Method: Composition-based stats.
 Identities = 104/217 (47%), Positives = 138/217 (63%), Gaps = 8/217 (3%)

Query  12   SEIKLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEV  71
            +E  +D+ R  RIG+ EAI+   K++EQI  IL++ +    + L+TRL+ EK   L  E 
Sbjct  2    NESIIDFQRRTRIGVVEAIWGEHKTIEQISEILKKYQLECETALVTRLTKEKGQKLLVEF  61

Query  72   SSDLDYDPVS---QTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAK  128
             S  ++  +S       FK+  S    +  EV ++  GTSD  VA EA   L F+G   K
Sbjct  62   PS-AEFHEISGCLTLGEFKECTS----SKEEVIILTGGTSDVGVASEAEIALNFHGIKTK  116

Query  129  AIFDVGVAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGAS  188
             + DVGVAG+ RL++R+EEI+   VVIA AGM+ ALPTVL GL+P  II LP S GYG S
Sbjct  117  LLIDVGVAGLHRLLDRIEEIKLSKVVIACAGMEGALPTVLAGLIPQPIIGLPISVGYGIS  176

Query  189  RNGETALSACLSSCAPGVLVCNIDNGYGAACAAIRIL  225
              G+TAL   L+SCAPG++V NIDNGYGAA AA+RIL
Sbjct  177  GGGKTALEGMLASCAPGLVVVNIDNGYGAAMAAMRIL  213


ref|YP_001015518.1|  putative circadian phase modifier CpmA [Prochlorococcus marinus 
str. NATL1A]
 gb|ABM76254.1|  putative circadian phase modifier CpmA [Prochlorococcus marinus 
str. NATL1A]
Length=215

 Score =  171 bits (433),  Expect = 4e-41, Method: Composition-based stats.
 Identities = 105/217 (48%), Positives = 138/217 (63%), Gaps = 8/217 (3%)

Query  12   SEIKLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEV  71
            +E  +D+ R  RIG+ EAI+   K++EQI  IL++ +    + L+TRL+ EK   L  E 
Sbjct  2    NESIIDFQRRTRIGVVEAIWGEHKTIEQISEILKKYQLECETALVTRLTKEKGQKLLVEF  61

Query  72   SSDLDYDPVS---QTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAK  128
             S  ++  +S       FK+  S    +  EV ++  GTSD  VA EA   L F+G   K
Sbjct  62   PS-AEFQEISGCLTLGEFKECTS----SKEEVIILTGGTSDVGVASEAEIALNFHGIKTK  116

Query  129  AIFDVGVAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGAS  188
             + DVGVAG+ RL+ER+EEI+   VVIA AGM+ ALPTVL GL+P  II LP S GYG S
Sbjct  117  LLIDVGVAGLHRLLERLEEIKLAKVVIACAGMEGALPTVLAGLIPQPIIGLPISVGYGIS  176

Query  189  RNGETALSACLSSCAPGVLVCNIDNGYGAACAAIRIL  225
              G+TAL   L+SCAPG++V NIDNGYGAA AA+RIL
Sbjct  177  GGGKTALEGMLASCAPGLVVVNIDNGYGAAMAAMRIL  213


ref|YP_001009867.1|  putative circadian phase modifier CpmA-like protein [Prochlorococcus 
marinus str. AS9601]
 gb|ABM70760.1|  putative circadian phase modifier CpmA-like protein [Prochlorococcus 
marinus str. AS9601]
Length=218

 Score =  170 bits (430),  Expect = 8e-41, Method: Composition-based stats.
 Identities = 93/215 (43%), Positives = 139/215 (64%), Gaps = 2/215 (0%)

Query  13   EIKLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEVS  72
            +IK D+ R DR+GL EAI+   KS++Q++ +     + N  + +TR++ EK + L + + 
Sbjct  4    DIKFDFQRRDRLGLIEAIWGQDKSIDQLERLCGNVLSKNEVVFITRINSEKANYLLD-LY  62

Query  73   SDLDYDPVSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKAIFD  132
             D  +   +      K  + +  N  +VA+++ G+SD  +  EA   LE  G   ++  D
Sbjct  63   DDARFYEKANCLTIGKNFNKIITNK-KVAIISGGSSDLAITLEAQLALEIYGVNCQSFID  121

Query  133  VGVAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGE  192
            VGVAG+ RLM ++EEI K+ V+I  AGM+ AL TV+GGL+   IIA+P S GYG S++GE
Sbjct  122  VGVAGLHRLMSQLEEINKYDVLIVCAGMEGALATVVGGLLAQPIIAVPVSVGYGVSKDGE  181

Query  193  TALSACLSSCAPGVLVCNIDNGYGAACAAIRILLS  227
            TAL++ LSSC+PG+ V NIDNGYGAA AA+RI+ S
Sbjct  182  TALNSMLSSCSPGIAVMNIDNGYGAAMAALRIIKS  216

ref|YP_397868.1|  putative circadian phase modifier CpmA [Prochlorococcus marinus 
str. MIT 9312]
 gb|ABB50432.1|  putative circadian phase modifier CpmA [Prochlorococcus marinus 
str. MIT 9312]
Length=218

 Score =  169 bits (428),  Expect = 1e-40, Method: Composition-based stats.
 Identities = 94/215 (43%), Positives = 140/215 (65%), Gaps = 2/215 (0%)

Query  13   EIKLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEVS  72
            +I+ D+ R +R+GL EAI+   KS++Q++ + E   + N  + +TR++ EK   L +   
Sbjct  4    DIRFDFQRRERLGLIEAIWGQDKSIDQLKRLSENVLSKNEVVFITRINSEKAIDLLD-FY  62

Query  73   SDLDYDPVSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKAIFD  132
             D  +   ++     K ++ +  N  +VA+++ G+SD  V  EA   LE  G   ++  D
Sbjct  63   DDARFYEEAKCLIIGKNLNKLNTNK-KVAIISGGSSDLPVTLEAQLALEIYGVNCQSFID  121

Query  133  VGVAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGE  192
            VGVAG+ RL+ ++EEI K+ V+I  AGM+ AL TV+GGL+   IIA+P S GYG S+NGE
Sbjct  122  VGVAGLHRLISQLEEINKYDVLIVCAGMEGALATVVGGLLAQPIIAVPVSVGYGVSKNGE  181

Query  193  TALSACLSSCAPGVLVCNIDNGYGAACAAIRILLS  227
            TAL++ LSSC+PGV V NIDNGYGAA AA+RI+ S
Sbjct  182  TALNSMLSSCSPGVAVMNIDNGYGAAMAALRIIKS  216

ref|YP_729997.1|  putative circadian phase modifier CpmA homolog [Synechococcus 
sp. CC9311]
 gb|ABI46206.1|  putative circadian phase modifier CpmA homolog [Synechococcus 
sp. CC9311]
Length=221

 Score =  167 bits (423),  Expect = 5e-40, Method: Composition-based stats.
 Identities = 95/216 (43%), Positives = 135/216 (62%), Gaps = 0/216 (0%)

Query  10   SSSEIKLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSE  69
            SS +I+LDW R DR+G++EAI+ L KSV+QI +ILE     +   L+TR+   K  A+ +
Sbjct  2    SSDDIRLDWQRNDRLGISEAIWGLHKSVDQIVAILEAFAVRDQPALVTRVDETKAQAVLQ  61

Query  70   EVSSDLDYDPVSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKA  129
              +++L                 ++     V V++ GTSD  VA EA   L ++G  A  
Sbjct  62   RCNTELVRFEARARCLTSGAPPTLRPELGTVTVLSGGTSDLPVAAEAQLALHWHGIDAGL  121

Query  130  IFDVGVAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASR  189
            + DVGVAG+ RL++++ ++++  V+IA AGM+ ALPTVL GL+P  +I +P S GYG S 
Sbjct  122  LLDVGVAGLHRLLDQLPKLQQSSVLIACAGMEGALPTVLAGLLPQPVIGVPVSVGYGVSA  181

Query  190  NGETALSACLSSCAPGVLVCNIDNGYGAACAAIRIL  225
             G  AL   L+SCAPG+ V NIDNGYGAA AA+RIL
Sbjct  182  GGRAALDGMLASCAPGLAVVNIDNGYGAAMAALRIL  217

ref|ZP_00778228.1|  NCAIR mutase (PurE)-related proteins [Thermoanaerobacter ethanolicus 
ATCC 33223]
 gb|EAO65243.1|  NCAIR mutase (PurE)-related proteins [Thermoanaerobacter ethanolicus 
ATCC 33223]
Length=253

 Score =  166 bits (420),  Expect = 1e-39, Method: Composition-based stats.
 Identities = 98/207 (47%), Positives = 130/207 (62%), Gaps = 3/207 (1%)

Query  15   KLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEVSSD  74
            K+D+ R  R G  E IFC  K+ EQ++ I      N + +L TR S E   A+ E V   
Sbjct  39   KIDYHREIRKGFPEVIFCQGKTPEQVKEIAFNMFKNGSDVLGTRASREHFEAVKEVVEKA  98

Query  75   LDYDPVSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKAIFDVG  134
            + Y+     +      +P+K+    + VVAAGTSD  VA EA  T E  G   K  +DVG
Sbjct  99   VYYEEARIISIRN---TPIKKTKGIIGVVAAGTSDLPVAEEAAITAELMGNTVKRFYDVG  155

Query  135  VAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGETA  194
            VAG+ RL+++++EIR+  V+IAVAGM+ ALPTVLGGLV   IIA+PTS GYGA+ +G +A
Sbjct  156  VAGLHRLLDKLDEIRQCRVLIAVAGMEGALPTVLGGLVSSPIIAVPTSVGYGANFHGLSA  215

Query  195  LSACLSSCAPGVLVCNIDNGYGAACAA  221
            L   L+SCA GV V NIDNG+GAA +A
Sbjct  216  LLTMLNSCASGVSVVNIDNGFGAAYSA  242

ref|NP_893395.1|  putative circadian phase modifier CpmA homolog [Prochlorococcus 
marinus subsp. pastoris str. CCMP1986]
 emb|CAE19737.1|  putative circadian phase modifier CpmA homolog [Prochlorococcus 
marinus subsp. pastoris str. CCMP1986]
Length=218

 Score =  165 bits (417),  Expect = 2e-39, Method: Composition-based stats.
 Identities = 92/213 (43%), Positives = 134/213 (62%), Gaps = 2/213 (0%)

Query  13   EIKLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEVS  72
            +I+ D+ R  RIGL EAI+   KS++Q++ + +E       + +TR+  EK + L E + 
Sbjct  4    DIRFDFQRRGRIGLIEAIWGADKSIDQLERVSKEVLEKKEVVFITRIDKEKAAHLLE-IF  62

Query  73   SDLDYDPVSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKAIFD  132
             +  +   +      + ++    N  +VA+++ G+SD  V  EA  +LE  G   +   D
Sbjct  63   KEGRFIEEANCFIIGENLNKFSTNK-KVAIISGGSSDLAVTLEAKLSLEIYGVSCQTFID  121

Query  133  VGVAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGE  192
            VGVAG+ RL   +EEI K+ V+I  AGM+ AL TV+GGL+   IIA+P S GYG S+NGE
Sbjct  122  VGVAGLHRLFSEIEEINKYDVLIVCAGMEGALATVIGGLLSQPIIAVPVSIGYGVSKNGE  181

Query  193  TALSACLSSCAPGVLVCNIDNGYGAACAAIRIL  225
            TAL++ LSSCAPG+ V NIDNGYGAA AA+RI+
Sbjct  182  TALNSMLSSCAPGISVMNIDNGYGAAMAALRII  214
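
The "Identities / Positives / Gaps" line printed under each alignment header above can be read programmatically to compare hits on the same footing. The following is a minimal sketch, assuming the line layout shown in this report; the function name `parse_hsp_stats` and the dictionary keys are illustrative choices, not part of BLAST.

```python
import re

# Layout of the statistics line as it appears in this report (assumed stable).
HSP_STATS = re.compile(
    r"Identities = (\d+)/(\d+) \(\d+%\), "
    r"Positives = (\d+)/\d+ \(\d+%\), "
    r"Gaps = (\d+)/\d+ \(\d+%\)"
)

def parse_hsp_stats(line):
    """Return counts and recomputed percentages, or None if the line does not match."""
    match = HSP_STATS.search(line)
    if match is None:
        return None
    identities, length, positives, gaps = (int(g) for g in match.groups())
    return {
        "identities": identities,
        "positives": positives,
        "gaps": gaps,
        "length": length,  # alignment length, gaps included
        "pct_identity": 100.0 * identities / length,
        "pct_positives": 100.0 * positives / length,
        "pct_gaps": 100.0 * gaps / length,
    }

# Statistics line of the Vibrio shilonii AK1 hit (ZP_01866594.1).
line = " Identities = 108/220 (49%), Positives = 143/220 (65%), Gaps = 1/220 (0%)"
stats = parse_hsp_stats(line)
print(stats["identities"], stats["length"], round(stats["pct_identity"], 1))
```

The percentages are recomputed from the raw counts rather than taken from the parentheses, so they keep their decimals when hits are ranked.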

____________________________________________________________________________________________________


Blastp NCBI against SwissProt

BLASTP 2.2.17 (Aug-26-2007) 
Reference:
Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman 
(1997), "Gapped BLAST and PSI-BLAST: a new generation of 
protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference:
Schäffer, Alejandro A., L. Aravind, Thomas L. Madden, Sergei 
Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and 
Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST 
protein database searches with composition-based statistics 
and other refinements", Nucleic Acids Res. 29:2994-3005.

RID: KPF5VPZY013


Database: Non-redundant SwissProt sequences
           259,673 sequences; 98,621,568 total letters
Query=  ORF_ES22450 Translation [67-810 indirect strand]
Length=248

Sequences producing significant alignments:                       (Bits)  Value

sp|Q57629|Y165_METJA  Uncharacterized protein MJ0165                114    4e-25
sp|Q58033|PUR6_METJA  Phosphoribosylaminoimidazole carboxylase...  50.4    6e-06
sp|P41654|PUR6_METTH  Probable phosphoribosylaminoimidazole ca...  41.6    0.003
sp|Q9UY68|PUR6_PYRAB  Phosphoribosylaminoimidazole carboxylase...  40.0    0.008
sp|P76103.2|YDCO_ECOLI  Inner membrane protein ydcO                33.1    0.96  
sp|O06456|PUR6_SULSO  Phosphoribosylaminoimidazole carboxylase...  32.3    1.8  
sp|Q9P7W0|OPA3_SCHPO  Putative OPA3-like protein C1703.11          32.0    2.2  
sp|O75386|TULP3_HUMAN  Tubby-related protein 3 (Tubby-like protei  30.8    4.8   
sp|Q8BQM4|HEAT3_MOUSE  HEAT repeat-containing protein 3            30.8    5.4   
sp|Q47I43|CHEB1_DECAR  Chemotaxis response regulator protein-g...  30.8    5.5   
sp|Q8AXY6|MUSK_CHICK  Muscle, skeletal receptor tyrosine prote...  30.0    9.2   

sp|Q57629|Y165_METJA  Uncharacterized protein MJ0165
Length=256

 Score =  114 bits (284),  Expect = 4e-25, Method: Composition-based stats.
 Identities = 79/217 (36%), Positives = 118/217 (54%), Gaps = 16/217 (7%)

Query  14   IKLDWDRTDRIGLAEAIFCLQKSVEQI-QSILEEARNNNASLLLTRLSVEKHSALSEEVS  72
            +KLD +R  R G+ E ++   K +++I ++ L+    N  +L      +EK   LS+E+ 
Sbjct  38   LKLDINRQFRTGVPEVVYGKGKDIDEIIKATLKLVEKNGIALATKIEDIEK---LSDEIR  94

Query  73   S------DLDYDPVSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFG  126
                   D+  +  ++T   K     VK+   +V ++ AGTSD  VA EA  TLE  G  
Sbjct  95   KWNLKNYDIKINKKAKTLIIKNKNYEVKKIG-KVGILTAGTSDIPVAEEAKDTLEIMGVE  153

Query  127  AKAIFDVGVAGIWRLMERVEEIRKHPV--VIAVAGMDAALPTVLGGLVPGAIIALPTSTG  184
            A   +DVG+AGI RL   ++ + +  V  +I VAGM+ ALP+V+  +V   +I +PTST 
Sbjct  154  AITAYDVGIAGIHRLFPALKRMIEEDVCCIIVVAGMEGALPSVIASMVDIPVIGVPTSTS  213

Query  185  YGASRNGETALSACLSSCAPGVLVCNIDNGYGAACAA  221
            YG      T L   L SC+PG+ V NIDNG+GA   A
Sbjct  214  YGIKI---TPLLTMLHSCSPGIAVVNIDNGFGAGVFA  247


sp|Q58033|PUR6_METJA  Phosphoribosylaminoimidazole carboxylase catalytic subunit (AIR 
carboxylase) (AIRC)
Length=157

 Score = 50.4 bits (119),  Expect = 6e-06, Method: Composition-based stats.
 Identities = 52/145 (35%), Positives = 72/145 (49%), Gaps = 18/145 (12%)

Query  100  VAVVAAGTSDARVAREALRTLEFNGFGAKAIFDVGVAGIWRLMERVEEIRKHP---VVIA  156
            + ++    SD ++A +A+  L+   FG +  F+V VA   R  E VEEI K+    V IA
Sbjct  2    ICIIMGSESDLKIAEKAVNILK--EFGVE--FEVRVASAHRTPELVEEIVKNSKADVFIA  57

Query  157  VAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGETALSACLSSCA--PGVLVCN--ID  212
            +AG+ A LP V+  L    +IA+P      A  +G   L A LSS    PG+ V    ID
Sbjct  58   IAGLAAHLPGVVASLTTKPVIAVPVD----AKLDG---LDALLSSVQMPPGIPVATVGID  110

Query  213  NGYGAACAAIRILLSSRQNEATSEI  237
             G  AA  A+ IL    +N A   I
Sbjct  111  RGENAAILALEILALKDENIAKKLI  135


sp|P41654|PUR6_METTH  Probable phosphoribosylaminoimidazole carboxylase (AIR carboxylase) 
(AIRC)
Length=334

 Score = 41.6 bits (96),  Expect = 0.003, Method: Composition-based stats.
 Identities = 38/135 (28%), Positives = 57/135 (42%), Gaps = 15/135 (11%)

Query  98   PEVAVVAAGTSDARVAREALRTLEFNGFGAKAIFDVGVAGIWRLMERVEEIRKHP-----  152
            P V ++    SD R+A +A+   E      +  +D+ VA   R  E+V+ I         
Sbjct  3    PRVMILLGSASDFRIAEKAMEIFE----ELRIPYDLRVASAHRTHEKVKAIVSEAVKAGV  58

Query  153  -VVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGETALSACLSSCAPG-VLVCN  210
             V I +AG+ A LP ++       +I +P     G    G  AL AC     P  V    
Sbjct  59   EVFIGIAGLSAHLPGMISANTHRPVIGVPVDVKLG----GLDALFACSQMPFPAPVATVG  114

Query  211  IDNGYGAACAAIRIL  225
            +D G  AA  A +I+
Sbjct  115  VDRGENAAILAAQII  129

sp|Q9UY68|PUR6_PYRAB  Phosphoribosylaminoimidazole carboxylase catalytic subunit (AIR 
carboxylase) (AIRC)
Length=174

 Score = 40.0 bits (92),  Expect = 0.008, Method: Composition-based stats.
 Identities = 43/141 (30%), Positives = 63/141 (44%), Gaps = 16/141 (11%)

Query  93   VKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKAIFDVGVAGIWRLMERVEEI----  148
            VK   P+V ++    SD  V +EA + LE      +  +++ V    R  ER+ E     
Sbjct  2    VKVGMPKVGIIMGSDSDLPVMKEAAKVLE----DFEVDYEMKVISAHRTPERLHEYARTA  57

Query  149  --RKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGETALSACLSSCAPGV  206
              R   V+IA AG  A LP VL  L    +I +P  +    + NG  +L + +    PG+
Sbjct  58   EERGIEVIIAGAGGAAHLPGVLAALTMIPVIGVPIKS---KALNGLDSLLS-IVQMPPGI  113

Query  207  LVCN--IDNGYGAACAAIRIL  225
             V    ID    AA  A+ IL
Sbjct  114  PVATVGIDGAKNAALLALEIL  134

sp|P76103.2|YDCO_ECOLI  Inner membrane protein ydcO
Length=391

 Score = 33.1 bits (74),  Expect = 0.96, Method: Composition-based stats.
 Identities = 32/103 (31%), Positives = 47/103 (45%), Gaps = 15/103 (14%)

Query  90   VSPVKQNAPEVAVVAAGTSDARVAR----EALRTLEFNGFGAKAIFDVGVAGIWRLMERV  145
            V+   QNAP +A + A    A V+       L  L F+ FG   ++ VG+A I   + + 
Sbjct  221  VTMASQNAPGIAAMKAAGYSAPVSPLIVFTGLLALVFSPFG---VYSVGIAAITAAICQS  277

Query  146  EEIRKHP------VVIAVAGMDAALPTVLGGLVPGAIIALPTS  182
             E   HP      +  AVAG+   L  + G  + G + ALP S
Sbjct  278  PE--AHPDKDQRWLAAAVAGIFYLLAGLFGSAITGMMAALPVS  318

sp|O06456|PUR6_SULSO  Phosphoribosylaminoimidazole carboxylase catalytic subunit (AIR 
carboxylase) (AIRC)
Length=158

 Score = 32.3 bits (72),  Expect = 1.8, Method: Composition-based stats.
 Identities = 38/131 (29%), Positives = 56/131 (42%), Gaps = 6/131 (4%)

Query  98   PEVAVVAAGTSDARVAREALRTLEFNGFGAKAIFDVGVAGIWRLMERVEEIRKH--PVVI  155
            P+VAV+    +D    REA+  L+  G   +A           +M+  +E  K    V+I
Sbjct  2    PKVAVIMGSKNDWEYMREAVEILKQFGIDYEARVVSAHRTPEFMMQYAKEAEKRGIEVII  61

Query  156  AVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNG-ETALSACLSSCAPGVLVCNIDNG  214
            A AG  A LP ++  L    +I +P  +    + NG ++ LS         V    I   
Sbjct  62   AGAGGAAHLPGMVASLTSLPVIGVPIPS---KNLNGLDSLLSIVQMPYGVPVATVAIGGA  118

Query  215  YGAACAAIRIL  225
              AA  AIRIL
Sbjct  119  KNAALLAIRIL  129

sp|Q9P7W0|OPA3_SCHPO  Putative OPA3-like protein C1703.11
Length=218

 Score = 32.0 bits (71),  Expect = 2.2, Method: Composition-based stats.
 Identities = 24/100 (24%), Positives = 46/100 (46%), Gaps = 7/100 (7%)

Query  19   DRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLSVEKHSALSEEVSSDLDYD  78
            ++  R  +AEAI  LQ  + +I  I+E+        +L +   E  S+  E  S++ D+D
Sbjct  107  EKNRRDEVAEAILGLQHEIVRINEIMEK------QFVLQKKKNELQSSTEEIDSTEKDFD  160

Query  79   PVSQTAF-FKKPVSPVKQNAPEVAVVAAGTSDARVAREAL  117
             + +     ++ +  ++QN P     A  T    + RE +
Sbjct  161  ELHKVILKVERELHTLRQNTPSQNEQAEATPSKEIPRETV  200


sp|O75386|TULP3_HUMAN  Tubby-related protein 3 (Tubby-like protein 3)
Length=442

 Score = 30.8 bits (68),  Expect = 4.8, Method: Composition-based stats.
 Identities = 21/70 (30%), Positives = 37/70 (52%), Gaps = 4/70 (5%)

Query  42   SILEEARNNN---ASLLLTRLSVEKHSALSEEVSSDLDYDPVSQTAFFKKPVSPVKQNAP  98
            S++EE   N    AS    +  ++KH  +SE V+ D + D +SQ+A  ++P S   QN+ 
Sbjct  106  SVVEEDAENTVDTASKPGLQERLQKHD-ISESVNFDEETDGISQSACLERPNSASSQNST  164

Query  99   EVAVVAAGTS  108
            +     + T+
Sbjct  165  DTGTSGSATA  174


sp|Q8BQM4|HEAT3_MOUSE  HEAT repeat-containing protein 3
Length=679

 Score = 30.8 bits (68),  Expect = 5.4, Method: Composition-based stats.
 Identities = 36/119 (30%), Positives = 54/119 (45%), Gaps = 14/119 (11%)

Query  81   SQTAFFKKP-VSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKAIFDVGVAGIW  139
            S+T  FK+P  SP++    E A  + GT D      A   LE     +  + +   AG+ 
Sbjct  4    SRTKRFKRPQFSPIESCQAEAAAASNGTGDEEDDGPAAELLEKLQHPSAEVRECACAGLA  63

Query  140  RLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALPTSTGYGASRNGETALSAC  198
            RL      +++ P +  +A  DA     LG L+  + +A+   T  GA RN    LSAC
Sbjct  64   RL------VQQRPALPDLARRDAV--RRLGPLLLDSSLAV-RETAAGALRN----LSAC  109


sp|Q47I43|CHEB1_DECAR  Chemotaxis response regulator protein-glutamate methylesterase 
1
Length=350

 Score = 30.8 bits (68),  Expect = 5.5, Method: Composition-based stats.
 Identities = 32/109 (29%), Positives = 51/109 (46%), Gaps = 9/109 (8%)

Query  80   VSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTLEFNGFGAKAIFDVGVAGI-  138
            V  +A  +  +S +   AP++ VV A   DA+ ARE ++ L  +        DV +  + 
Sbjct  10   VDDSALMRGLLSQMINLAPDMEVVGAA-PDAQSAREMIKVLNPDVL----TLDVQMPKMD  64

Query  139  -WRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAI--IALPTSTG  184
                +ER+  +R  PVV+  +  +A   T L  L  GAI  I  P + G
Sbjct  65   GLEFLERLMRLRPMPVVMVSSFTEAGSDTTLKALELGAIDFIGKPRADG  113


sp|Q8AXY6|MUSK_CHICK  Muscle, skeletal receptor tyrosine protein kinase precursor (Muscle-specific 
tyrosine protein kinase receptor) (Muscle-specific 
kinase receptor) (MuSK)
Length=947

 Score = 30.0 bits (66),  Expect = 9.2, Method: Composition-based stats.
 Identities = 16/50 (32%), Positives = 23/50 (46%), Gaps = 3/50 (6%)

Query  183  TGYGASRNGETALSACLSSCAPGVLVCNIDNGYGAA---CAAIRILLSSR  229
            T Y   RNG+      +     GV  C  DNG GAA   C A+++ +  +
Sbjct  73   TRYSIQRNGQLLTILSVEDSDDGVYCCTADNGVGAAAQSCGALQVKMRPK  122
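
The SwissProt hit list above mixes hits with very small E-values and hits with E-values close to 1 or larger. A short sketch of how the summary table could be filtered is given here. It assumes that each summary line ends with the bit score followed by the E-value, as in this report; the cutoff of 0.01 and the function names are illustrative assumptions, not a rule stated in this annotation.

```python
def parse_hit_line(line):
    """Split one summary line into (identifier, description, bit_score, e_value)."""
    # The last two whitespace-separated fields are the bit score and the E-value.
    left, score, evalue = line.rstrip().rsplit(None, 2)
    identifier, description = left.split(None, 1)
    return identifier, description.strip(), float(score), float(evalue)

def significant_hits(lines, max_evalue=0.01):
    """Keep hits whose E-value is at or below max_evalue (hypothetical cutoff)."""
    hits = [parse_hit_line(l) for l in lines if l.strip()]
    return [h for h in hits if h[3] <= max_evalue]

# First six lines of the SwissProt summary table, copied from the report.
table = """\
sp|Q57629|Y165_METJA  Uncharacterized protein MJ0165                114    4e-25
sp|Q58033|PUR6_METJA  Phosphoribosylaminoimidazole carboxylase...  50.4    6e-06
sp|P41654|PUR6_METTH  Probable phosphoribosylaminoimidazole ca...  41.6    0.003
sp|Q9UY68|PUR6_PYRAB  Phosphoribosylaminoimidazole carboxylase...  40.0    0.008
sp|P76103.2|YDCO_ECOLI  Inner membrane protein ydcO                33.1    0.96
sp|O06456|PUR6_SULSO  Phosphoribosylaminoimidazole carboxylase...  32.3    1.8
""".splitlines()

kept = significant_hits(table)
for identifier, description, score, evalue in kept:
    print(identifier, score, evalue)
```

With this cutoff only the MJ0165 hit and the three top PUR6 hits are retained; the remaining entries in the table have E-values of 0.96 or more.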


 


ORF finding

ORF Finder (SMS): start at any codon / minimum length: 60 codons / frames 1, 2 and 3, direct strand

No ORFs were found in reading frame 1.

>ORF number 1 in reading frame 2 on the direct strand extends from base 224 to base 649.
AAGCGCAGTTTCTCCGTTTCGGCTCGCACCGTATCCGGTGGAGGTGGGAAGCGCAATGAT
CGCTCCAGGGACCAAACCTCCAAGAACGGTCGGCAGTGCAGCATCCATTCCCGCAACAGC
AATCACAACCGGATGCTTGCGGATCTCCTCGACCCTTTCCATCAACCTCCAGATTCCCGC
CACTCCCACATCAAAAATCGCCTTGGCACCAAAACCATTGAATTCAAGCGTTCTCAAAGC
TTCTCGTGCAACTCTTGCATCCGAAGTGCCTGCAGCAACAACCGCCACTTCCGGAGCATT
TTGCTTCACGGGAGAAACGGGTTTTTTGAAAAAGGCGGTTTGGGAAACAGGATCATAATC
CAGATCCGAACTGACCTCCTCACTCAGAGCAGAATGTTTTTCTACGGAAAGTCTGGTCAG
CAATAG

>Translation of ORF number 1 in reading frame 2 on the direct strand.
KRSFSVSARTVSGGGGKRNDRSRDQTSKNGRQCSIHSRNSNHNRMLADLLDPFHQPPDSR
HSHIKNRLGTKTIEFKRSQSFSCNSCIRSACSNNRHFRSILLHGRNGFFEKGGLGNRIII
QIRTDLLTQSRMFFYGKSGQQ*

>ORF number 1 in reading frame 3 on the direct strand extends from base 168 to base 581.
CCGTTGTCAATATTGCAGACCAACACTCCCGGCGCACAGGAGGACAAACAGGCTGAAAGC
GCAGTTTCTCCGTTTCGGCTCGCACCGTATCCGGTGGAGGTGGGAAGCGCAATGATCGCT
CCAGGGACCAAACCTCCAAGAACGGTCGGCAGTGCAGCATCCATTCCCGCAACAGCAATC
ACAACCGGATGCTTGCGGATCTCCTCGACCCTTTCCATCAACCTCCAGATTCCCGCCACT
CCCACATCAAAAATCGCCTTGGCACCAAAACCATTGAATTCAAGCGTTCTCAAAGCTTCT
CGTGCAACTCTTGCATCCGAAGTGCCTGCAGCAACAACCGCCACTTCCGGAGCATTTTGC
TTCACGGGAGAAACGGGTTTTTTGAAAAAGGCGGTTTGGGAAACAGGATCATAA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
PLSILQTNTPGAQEDKQAESAVSPFRLAPYPVEVGSAMIAPGTKPPRTVGSAASIPATAI
TTGCLRISSTLSINLQIPATPTSKIALAPKPLNSSVLKASRATLASEVPAATTATSGAFC
FTGETGFLKKAVWETGS*

ORF Finder (SMS): start at any codon / minimum length: 60 codons / frames 1, 2 and 3, reverse strand

>ORF number 1 in reading frame 1 on the reverse strand extends from base 67 to base 813.
GATTCCGGAACTCCAGACATGACTTCTAGTTCTTCCGAAATCAAGCTCGACTGGGACCGG
ACCGACAGGATCGGATTGGCCGAAGCGATTTTTTGCCTACAGAAATCTGTGGAACAGATT
CAGTCGATTCTGGAAGAAGCCCGGAACAACAATGCGTCTCTATTGCTGACCAGACTTTCC
GTAGAAAAACATTCTGCTCTGAGTGAGGAGGTCAGTTCGGATCTGGATTATGATCCTGTT
TCCCAAACCGCCTTTTTCAAAAAACCCGTTTCTCCCGTGAAGCAAAATGCTCCGGAAGTG
GCGGTTGTTGCTGCAGGCACTTCGGATGCAAGAGTTGCACGAGAAGCTTTGAGAACGCTT
GAATTCAATGGTTTTGGTGCCAAGGCGATTTTTGATGTGGGAGTGGCGGGAATCTGGAGG
TTGATGGAAAGGGTCGAGGAGATCCGCAAGCATCCGGTTGTGATTGCTGTTGCGGGAATG
GATGCTGCACTGCCGACCGTTCTTGGAGGTTTGGTCCCTGGAGCGATCATTGCGCTTCCC
ACCTCCACCGGATACGGTGCGAGCCGAAACGGAGAAACTGCGCTTTCAGCCTGTTTGTCC
TCCTGTGCGCCGGGAGTGTTGGTCTGCAATATTGACAACGGTTACGGTGCTGCCTGCGCT
GCAATCCGCATTCTACTCAGTTCGAGGCAGAATGAAGCTACATCGGAGATTGGACAAAAG
GGACGAGGTACGGCCCCTCAGGAATGA

>Translation of ORF number 1 in reading frame 1 on the reverse strand.
DSGTPDMTSSSSEIKLDWDRTDRIGLAEAIFCLQKSVEQIQSILEEARNNNASLLLTRLS
VEKHSALSEEVSSDLDYDPVSQTAFFKKPVSPVKQNAPEVAVVAAGTSDARVAREALRTL
EFNGFGAKAIFDVGVAGIWRLMERVEEIRKHPVVIAVAGMDAALPTVLGGLVPGAIIALP
TSTGYGASRNGETALSACLSSCAPGVLVCNIDNGYGAACAAIRILLSSRQNEATSEIGQK
GRGTAPQE*

>ORF number 1 in reading frame 2 on the reverse strand extends from base 530 to base 874.
TTGCTGTTGCGGGAATGGATGCTGCACTGCCGACCGTTCTTGGAGGTTTGGTCCCTGGAG
CGATCATTGCGCTTCCCACCTCCACCGGATACGGTGCGAGCCGAAACGGAGAAACTGCGC
TTTCAGCCTGTTTGTCCTCCTGTGCGCCGGGAGTGTTGGTCTGCAATATTGACAACGGTT
ACGGTGCTGCCTGCGCTGCAATCCGCATTCTACTCAGTTCGAGGCAGAATGAAGCTACAT
CGGAGATTGGACAAAAGGGACGAGGTACGGCCCCTCAGGAATGACCGCGACATCGGGACT
GTCTTGTTCCGCGATGCTGGTGCTGATGGCCTGGCTGATGCTGGC

>Translation of ORF number 1 in reading frame 2 on the reverse strand.
LLLREWMLHCRPFLEVWSLERSLRFPPPPDTVRAETEKLRFQPVCPPVRRECWSAILTTV
TVLPALQSAFYSVRGRMKLHRRLDKRDEVRPLRNDRDIGTVLFRDAGADGLADAG

>ORF number 1 in reading frame 3 on the reverse strand extends from base 462 to base 701.
TGTGGGAGTGGCGGGAATCTGGAGGTTGATGGAAAGGGTCGAGGAGATCCGCAAGCATCC
GGTTGTGATTGCTGTTGCGGGAATGGATGCTGCACTGCCGACCGTTCTTGGAGGTTTGGT
CCCTGGAGCGATCATTGCGCTTCCCACCTCCACCGGATACGGTGCGAGCCGAAACGGAGA
AACTGCGCTTTCAGCCTGTTTGTCCTCCTGTGCGCCGGGAGTGTTGGTCTGCAATATTGA


>Translation of ORF number 1 in reading frame 3 on the reverse strand.
CGSGGNLEVDGKGRGDPQASGCDCCCGNGCCTADRSWRFGPWSDHCASHLHRIRCEPKRR
NCAFSLFVLLCAGSVGLQY*
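
The ORF Finder runs above report, for each reading frame, stretches of at least 60 codons that start at any codon and end at a stop codon or at the end of the sequence. The sketch below illustrates that kind of search under stated assumptions: it uses the standard amino acid assignments, counts the stop codon in the length, and reports 1-based coordinates on whichever strand is passed in. It is not the SMS implementation, and the names `find_orfs` and `reverse_complement` are illustrative.

```python
# Standard amino acid assignments, built from the usual TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def reverse_complement(seq):
    """Return the reverse strand, read 5' to 3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_orfs(seq, frame, min_codons=60):
    """List (start, end, translation) for ORFs in one frame (1, 2 or 3).

    An ORF starts at any codon and runs up to and including the next stop
    codon, or to the last complete codon if no stop is found.
    """
    seq = seq.upper()
    orfs = []
    protein = []
    orf_start = frame - 1  # 0-based index of the first codon of the current ORF
    for i in range(frame - 1, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X for codons with ambiguous bases
        protein.append(aa)
        if aa == "*":
            if len(protein) >= min_codons:
                orfs.append((orf_start + 1, i + 3, "".join(protein)))
            protein = []
            orf_start = i + 3
    if len(protein) >= min_codons:  # open-ended ORF reaching the sequence end
        orfs.append((orf_start + 1, orf_start + 3 * len(protein), "".join(protein)))
    return orfs

# First 21 bases of the frame 1 reverse-strand ORF listed above.
fragment = "GATTCCGGAACTCCAGACATG"
print(find_orfs(fragment, frame=1, min_codons=1))  # [(1, 21, 'DSGTPDM')]
```

For the reverse strand, the genomic sequence would first be passed through `reverse_complement` and the three frames searched on the result.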