ORF BJ16030

Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : AACY01159979.1
Annotathon code: ORF_BJ16030
Sample :
  • GPS: 31°10'30"N; 64°19'27.6"W
  • Sargasso Sea: Sargasso Sea, Station 11 - Bermuda (UK)
  • Open Ocean (-5m, 20.5°C, 0.1-0.8 microns)
Authors
Team : BioCell 2006
Username : bouda
Annotated on : 2008-03-19 18:52:37
  • JANE jonathan
  • ROUDAUT Yann

Synopsis

Genomic Sequence

>AACY01159979.1 ORF_BJ16030 genomic DNA
CGAATGATTTTACAGAGCACTGGATCCGAGGCCTGATCTACATGGTCTCGGTAATGGCAATTTTGTCGGCGCATGAAGCAGGGCACTTTGTTGCAGCATG
GCGTCATCGAATTCCTGCAACGCTTCCATTTTTCTTACCGCTTCCAGTGATGCTAACTGGGACACTTGGCGCCGTAATTGGCATGGAAGGATCTCGGGCA
GACAGAAAACAGTTATTTGATATCGCCTTAGCTGGACCTCTCGCTGGTCTTCTTGTTGCGATTCCTGTTTTTGTAGCGGGGCTGGTGCTTGCTCAACCGG
CAGATAGCAGCCTGTTTTCAATGCCTTTACTTGCAACATGGCTTTTGAGACTTGTTCGGCCAGATTTACCAGTAGGCCAGGTGCTTATCCCAAATGCGTT
CTTGCTGGCTGGCTGGGTAGGTTTTCTTGTAACTGGACTGAATATGATTCCCCTCAGCCAACTCGATGGTGGGCATATTAGCCATGCTGTTTTTGGTCGG
CGTTCGTGCTGGGTGGCCAGAAGTGTCCTCCTCGGAGCAATAACCGCTATTATTCTTGTAGGAGCTGATCATTGGGTTTTAATGGTTGTTTTAGTCACGT
TTATGGGTGTCGATCACCCGCCCATTCGAAATGAATCGCAGCCGTTGGGCACCGCGAGAACAATTCTGGGCATTGCTTCATTTGTCATTCCGGTGATTAC
ATTCATGCCGGAGCCGCTGCTGCTGCCCGGATTCATTTTCATTCGTTGACGCCCTGCCATTCCGCTCACACAATAGATTAAGATTGTTTTAGATGGAGTC
AGACGCACTGGTTCCAAGGCTGCTACATAACCGAGAGGTGATCCGTGAACTTTGTTGAACTAAAAGACCCTGCAATTT

Translation

[3 - 746/878]   direct strand
>ORF_BJ16030 Translation [3-746   direct strand]
NDFTEHWIRGLIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIGMEGSRADRKQLFDIALAGPLAGLLVAIPVFVAGLVLAQPA
DSSLFSMPLLATWLLRLVRPDLPVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRRSCWVARSVLLGAITAIILVGADHWVLMVVLVTF
MGVDHPPIRNESQPLGTARTILGIASFVIPVITFMPEPLLLPGFIFIR

[ Warning ] 5' incomplete: does not start with a Methionine
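The coordinates above (frame 3, nucleotides 3 to 746 of the direct strand) can be checked with a few lines of code. This is a minimal sketch assuming the standard genetic code; `translate_orf` and `PREFIX` are illustrative names, not part of the Annotathon tooling, and only the first 50 nucleotides of the contig are used to keep the example short.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orf(dna, start, end):
    """Translate dna[start..end] (1-based, inclusive) on the direct strand."""
    segment = dna[start - 1:end].upper()
    if len(segment) % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    # Unknown or ambiguous codons are reported as 'X'.
    return "".join(
        CODON_TABLE.get(segment[i:i + 3], "X")
        for i in range(0, len(segment), 3)
    )


# First 50 nucleotides of AACY01159979.1, copied from the Genomic Sequence section.
PREFIX = "CGAATGATTTTACAGAGCACTGGATCCGAGGCCTGATCTACATGGTCTCG"

peptide = translate_orf(PREFIX, 3, 47)
print(peptide)             # NDFTEHWIRGLIYMV
print((746 - 3 + 1) // 3)  # 248 codons in the full ORF
```

The 15 residues printed match the start of the translation shown above, and the codon count matches the 248-residue query length reported in the BLAST section.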

Phylogeny

          +------------EU1methano
         ! 
         !                 +------------------GNSBchloro
         !             +---8 
         !          +-12   +---------------------GNSBroseif
         !          !  !  
  +------6          !  +---------------------------EUpicrophi
  !      !          !  
  !      !          !                   +---------EUthermoco
  !      !          !     +-------------3 
  !      !          !     !             +-----------EUpyrococc
  !      !          !  +-18  
  !      !          !  !  !  +----------------PLcandidat
  !      +---------14  !  !  !  
  !                 !  !  +-17    +---------------PRanaeromy
  !                 !  !     !  +-9 
  !                 !  !     +-13 +-----------------------PRstigmate
  !                 !  !        !  
  !                 !  !        +-------------------------EuBacAcido
  !                 !  !  
  !                 !  !                    +------------CYcrocosph
  !                 +-16                 +--5 
  !                    !                 !  !     +---CYAnabaena
  !                    !         +-------7  +-----2 
  !                    !         !       !        +-----CYnostoc  
  !                    !         !       ! 
  !                    !      +-10       +-----------------EUSynechoc
  !                    !      !  !  
  !                    !      !  !           +-------GSBpelodic
  !                    !  +--11  +-----------4 
  !                    !  !   !              +----------GSBprosthe
  !                    +-15   !  
  !                       !   +---------------------SPleptospi
  !                       !  
  !                       +-----------------ORF_BJ1603
  ! 
  1----EUmethanos
  ! 
  +----EU2methano
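Neighbor-joining, the method the annotators report using for the tree they kept, works from a matrix of pairwise distances between aligned sequences. The sketch below computes the simplest such distance, the uncorrected p-distance (fraction of differing residues over gap-free columns), for two 19-residue fragments copied from the multiple alignment on this page. The function name is illustrative, and the tree-building program the annotators used may well have applied a different distance correction.

```python
def p_distance(seq_a, seq_b):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return differences / compared


# Aligned fragments taken from the same columns of the ClustalW output.
ORF_FRAGMENT = "SAHEAGHFVAAWRHRIPAT"   # ORF_BJ16030
SPL_FRAGMENT = "SAHEMGHFLAARYYGIKAT"   # SPleptospira

print(round(p_distance(ORF_FRAGMENT, SPL_FRAGMENT), 3))  # 0.368 (7 differences / 19 columns)
```

A full analysis would compute this for every pair of sequences over the whole alignment before handing the matrix to a neighbor-joining implementation.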



Annotator commentaries

We searched for ORFs with ORF Finder. One ORF was found in reading frame 3 of the direct strand, starting at nucleotide 3 and ending at nucleotide 746. Given the number of codons, we hypothesize that it encodes a protein. We computed the molecular weight of the translated sequence with "Protein Molecular Weight" from the SMS suite; this mass applies only to our fragment and not to the complete gene product, because our sequence covers only part of the gene.

We then ran several BLAST searches against different databases to find homologous sequences. We started with a blastn, whose result was uninterpretable because the hits were too vague. We continued with a blastp against Swiss-Prot, but that result was also uninterpretable, with E-values that were too high: from 0.8 to 8 (see notebook). We therefore ran a blastp against nr, which gave the results reported here. These results were problematic because two clearly distinct groups appeared with satisfactory and very similar E-values: Archaea and Bacteria.

We first built a multiple alignment with ClustalW, using several groups of bacteria as the study group and a few Euryarchaeota as the outgroup. Note that the gene is incomplete, as shown by the many gaps in the alignments. We then built the two phylogenetic trees. They differed from each other, but both showed the two groups mixed together. We therefore thought this might be a case of silent duplication. To confirm or refute this, we built another multiple alignment (see notebook) with more sequences in each group. The result showed even more mixing, which refutes our first hypothesis: it is very unlikely that so many hidden paralogies (silent duplications during evolution, that is, a duplication followed by several losses in the descendant lineages, which can mislead one into thinking that two organisms are more closely related than they really are) arose for the small portion of the gene we are studying, given that the number of bacterial groups among the probable homologs is high.

Our conclusion is that genes were transferred between these two groups: genetic information was exchanged horizontally and then passed on to descendant generations. We chose to keep the tree built with NJ because we have no means of favoring one tree over the other. The chosen tree lets us see divergence through branch lengths. We could not identify orthologs and believe the homologs are xenologs, given the gene transfers. Using InterPro, we found that a protein domain matches our sequence: peptidase M50. This assignment is supported by the number of homologous sequences in our tree that carry the same function. From the function records of the sequences homologous to our ORF, we saw that our gene encodes a membrane-associated protein that functions as a zinc metalloprotease, which justifies our choice of functions and processes. However, despite long efforts we are unable to name this gene, because the Swiss-Prot records that would have allowed it are not informative enough about our ORF.
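The molecular weight step mentioned in the commentary can be sketched as follows. This is an assumption-laden illustration, not the SMS implementation: it sums commonly tabulated average residue masses and adds one water molecule for the free termini, so results may differ slightly from what the SMS tool reports. As the annotators note, any value computed for this ORF describes only the 248-residue fragment, not the full protein.

```python
# Average residue masses in daltons (amino acid minus one water),
# as commonly tabulated; the tool used by the annotators may use
# slightly different values.
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "E": 129.1155, "Q": 128.1307, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.01528


def molecular_weight(protein):
    """Approximate average mass of a linear peptide, in daltons."""
    protein = protein.upper()
    unknown = set(protein) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER_MASS


# First 15 residues of the ORF_BJ16030 translation, used here as a short demo.
fragment = "NDFTEHWIRGLIYMV"
print(f"{molecular_weight(fragment):.1f} Da")
```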


Multiple Alignment

CLUSTAL W (1.82) multiple sequence alignment


GSBprosthecochloris      ------------------------------------------------------------
GSBpelodictyon           ------------------------------------------------------------
CYnostoc                 -----------MVLKLLVGRFVKITEDNMAFWFLFILLLGLATYLMVQHSVAHITRTP--
CYAnabaena               ------------------------------MTFWFLLLLGIATYLMVQRSVANITKTP--
CYcrocosphaera           ------------------------------MNWLLLFLLGAFTYYLAQKTVAPITRTP--
EUmethanosarcina         --------------------------MNQENHNKGKGIFNAEETVSR-------------
EU2methanosarcina        --------------------MNPKMKPEIEDQRNGRGKSEVEESISR-------------
EU1methanococcoides      ----------------------------MSSINKAPDDAEVESMIID-------------
EUthermococcus           ---------MPRGIYECVNCGHREVLDSTEPVIERACPKCGGDMILVGYAVSGVESPNVR
EUpyrococcus             MVREGRWRLLPRGIYECVNCKHREVLDKTQPLLPGRCPVCGGDMILVGYELDIEEEE---
EUSynechococcus          -----------------------------MADWILLALVGSATYLLLQKSVARLSKVP--
PLcandidatus             ------------------------------------------------------------
ORF_BJ16030              ------------------------------------------------------------
PRanaeromyxobacter       ------------------------------------------------------------
PRstigmatella            ---------------MGGRGLGTKEKAPLSPGGPDRRSSGSGSLLDGDAHGGRAGHADDG
GNSBchloroflexus         -------------------------------------MFDVDPTVALR------------
GNSBroseiflexus          ---------------------------------MEPHVADTLDRLELMG-----------
EuBacAcido               ------------------------------------------------------------
SPleptospira             ------------------------------------------------------------
EUpicrophilus            ------------------------------------------------------------
                                                                                     

GSBprosthecochloris      ------------------------------------MRDNPSTRREILAEQNYLLHITLL
GSBpelodictyon           ------------------------------------------------MKQNYPLHLTLF
CYnostoc                 -------------------------VWLLWLVLMTPAFLLSGWTLIYGTKQPPPPALIVS
CYAnabaena               -------------------------IWLLWLVLMTPALVLTGWTLVYGVKQPPPPALIFW
CYcrocosphaera           -------------------------LWIIWLVMMMPAVVWTIWYEINGEDQQIPLPIIII
EUmethanosarcina         ------------------------------LYPYIVRVFDVYEVQNSGEALYFFGTPRTN
EU2methanosarcina        ------------------------------IYPFISRVFDVYEIQKSGDILYFFGTPKVD
EU1methanococcoides      ------------------------------LYDDIHPFFKAYEVGYADSAILFYGVPQID
EUthermococcus           SGREEVPAPEGSGVSPGVEVPRESPHGLPPEVEAKLNEFYSLRFYGFDGHVAVFEVLDIY
EUpyrococcus             ---------------------------KPSFEDFLREHYDLGELIEHRGEVYAYEVLGIK
EUSynechococcus          -------------------------WRILWLVMMMPPLIWVLGRQVFQYEMPPLLMLGLF
PLcandidatus             ------------------------------------------MTREDQKSPPFFKKVRIH
ORF_BJ16030              ------------------------------------------------------------
PRanaeromyxobacter       --------------------------------------MDELVLRPRPRSRFPAANLALF
PRstigmatella            AGQREDAVAGRGARALQVHLRGAHGLGLGHQELVHHGVEVALDQVDDVKDGRQAQGLGLD
GNSBchloroflexus         ----------------------------MLATEVMMVTTGEELVHGEQHAFRYRGQLLRE
GNSBroseiflexus          ----------------------------RVRLALDGLMTIERYAWDSRGALTMVGKLHAP
EuBacAcido               --------------------------------MSDPNTPLPTESPEYDYPVPVYYYPRVK
SPleptospira             -----------------------------------------------MKQSRFSTHIILF
EUpicrophilus            ----------------MEVKVQSDDLEYVVSTVRSYINSYETDVNPLYIRFYFFESDNPL
                                                                                     

GSBprosthecochloris      CITLLTTLWAGAFWTGHQVRIDS-------------------------------------
GSBpelodictyon           FLTLLTTLWAGAFWGGHPVSFSS-------------------------------------
CYnostoc                 SLFVCTVLYWILFLGGRRVPRDTQTEVPAQASESQP--IIQPTPEPLVR-----------
CYAnabaena               PSIICLLLYWLLFQWGRQLPRDTQTAPQTTESQSAN--HPTAEPVPVR------------
CYcrocosphaera           PLVIGFTLYVWLIRLGKISSSDQPKETVQNSQPKLENIEQPPQDSEKIR-----------
EUmethanosarcina         TENITGELWEPLQQFGFGCTLKYELGEYVLLVFPEKKAKE--------------------
EU2methanosarcina        TENVMGELWAPLEQRGFGGTLKYELGEHVIIVAPVKKSEE--------------------
EU1methanococcoides      PKLIYQDLWPKLLAKGYKLSFSKEFGEDVLVVSPIQEVPE--------------------
EUthermococcus           EKNFERVLR-ELENLGYWAALKKRDGRIVLFVFPAGKIPP--------------------
EUpyrococcus             TENFEEVLR-EAEKFGYWLALKRREGKIVLYVFPAQLYEDK-------------------
EUSynechococcus          LASYFTSMTLVRRGRLRPPTAGSQPGAENTSPVAGSDRESLLEAEEEKEDSAADLASATL
PLcandidatus             VLLFIATFL---------------------------------------------------
ORF_BJ16030              ------------------------------------------------------------
PRanaeromyxobacter       LATLATTLWAGFTLSPLAPLG---------------------------------------
PRstigmatella            HRDGLTASGGELGARNHGALNAGARYEAARCDGGQNGLLHCVSSRNGRLEG---------
GNSBchloroflexus         AQAAHDAIVTRAQALGYTPLFQADPAGAAILFIPTPPKAPP-------------------
GNSBroseiflexus          ADKVYPQIRAGMAALGFTPFLRKSGDDVEIMALPFVISAPK-------------------
EuBacAcido               TRYWLHALLLALTFFTTLTVGAKFQDNFMHHQPLIGDGFPFP------------------
SPleptospira             ILTFLTLTFQSEFFELPFLSIQS-------------------------------------
EUpicrophilus            LDENFDEIRKILVPSGYIPAVIKGPENYIEVTRRPKENYRS-------------------
                                                                                     

GSBprosthecochloris      ------------------------------------------------------------
GSBpelodictyon           ------------------------------------------------------------
CYnostoc                 ---PIEPTEETQLRNCFPWSVYYVQNIEYRPQAVICRGQLRTKASNAYQQIKTNIEAQFG
CYAnabaena               ---PIEPTEETQLRNCFPWSIYYVQNIEYRPQAVICRGQLRTTPTQAYQQIRANIEAQFG
CYcrocosphaera           ---PITATEEKSLRDCFPWEVYYLQNVDYRPQGILCRGKLRTAPEKAYKSIKKNIEKVFG
EUmethanosarcina         ------------------------------------------------------------
EU2methanosarcina        ------------------------------------------------------------
EU1methanococcoides      ------------------------------------------------------------
EUthermococcus           ------------------------------------------------------------
EUpyrococcus             ------------------------------------------------------------
EUSynechococcus          VHTPISEVPREKLNHCFPWNVFYLQSVEYRPQAIICRGNLRADPTEAYERVQRNVENTFG
PLcandidatus             ------------------------------------------------------------
ORF_BJ16030              ------------------------------------------------------------
PRanaeromyxobacter       ------------------------------------------------------------
PRstigmatella            -----------------------------VMGKQRSGPPLEWCPQARGVRSRSLVCVKQG
GNSBchloroflexus         ------------------------------------------------------------
GNSBroseiflexus          ------------------------------------------------------------
EuBacAcido               ------------------------------------------------------------
SPleptospira             ------------------------------------------------------------
EUpicrophilus            ------------------------------------------------------------
                                                                                     

GSBprosthecochloris      ------------------------------------------------------------
GSBpelodictyon           ------------------------------------------------------------
CYnostoc                 DRFVLIFQEGLNDKPFFVLVPNIQAAKDRNTPRREQERLTRPGLALLLVVATLITTTLVG
CYAnabaena               DRFLLIFQEGFNGKPFFVLVPNSQAAKAN---ARQSEPLTRPGLALLLLVATLVTTTLVG
CYcrocosphaera           DHFLILFQEGLQEKPFFALVPNPWSKNESEK-NSDEEKLKRPVFALTLLLLTLLTTTIIG
EUmethanosarcina         ----------------------------------------KTWINLVLFIATFFTTMVCG
EU2methanosarcina        ----------------------------------------KIWINLALFMATGFTTMICG
EU1methanococcoides      ----------------------------------------RIWINVLLAVATVFTTMFAG
EUthermococcus           ---------------------------------------DNPWLPWLFLVLTVLSTFFAG
EUpyrococcus             ---------------------------------------ENPLVGIALFILTLLSTFFAG
EUSynechococcus          KRFLVVLQEGFAGKPFFALVPNPAARRSLTR------QQEWPLLALGLLLFTFWTTLTAG
PLcandidatus             ------------------------------------------------------------
ORF_BJ16030              ------------------------------------------------------------
PRanaeromyxobacter       ------------------------------------------------------------
PRstigmatella            SGEGRLVLTRREAALQSRGPLGTHPPLAPSMETALVRPLSRPWLHLLLLVVTLASAFVSF
GNSBchloroflexus         ---------------------------------------SRLWLAVLLFVLTVASTMFVG
GNSBroseiflexus          ---------------------------------------PNIVLPVALFIITVLSTLMVG
EuBacAcido               ------------------------------------------------------------
SPleptospira             ------------------------------------------------------------
EUpicrophilus            -----------------------------------------IYVNIIMLVLTLLSTVYVG
                                                                                     

GSBprosthecochloris      -------------------LVNFFKDLSYGKEYAAALLIFLGVHEFGHFFAALSHRIRTT
GSBpelodictyon           -------------------LPLFISSLGTGIPYSLSLLLFLTVHEFGHFFAAMRHRVQAT
CYnostoc                 VEIAGASLPPLWEIGSLFKVLSNPDVLFKGLPYALGLMTILGIHELGHYLTAKFYKIRST
CYAnabaena               VKIAGIDP---------TRLQSDPKLLLQGLPYALALMTILGIHEMGHYFTARFYKIRST
CYcrocosphaera           TVAIVGVAQET--------LNTDPSLLLKGLPYSLGLITILGIHELSHYFTAIRYKIATT
EUmethanosarcina         AWMSG------------ADLENDLFQLFRGLPFTLAIMAVLGSHEMAHYVMARYHGMKAS
EU2methanosarcina        AWMFG------------VNLTSDPIQVFRGLPFTLAILAVLGSHEMAHYAMARYHGMKTS
EU1methanococcoides      ATMFG------------VDIFSEPSQFIKGLPFTLAIMFVLGSHEMGHYIVAKMHGMRTS
EUthermococcus           YYLALNYIATLEH----YGLPGLRNPYIIALSFSVSVMAIIGTHELGHKIAATYHGVKAT
EUpyrococcus             YILSLNYVKTLED----LNLPGIKNLYLNALAFSLGIISILGSHEMGHKIAATIHNVKST
EUSynechococcus          AQAAGVGP----------DRLLHLPSLLKGLPYAVGILAILGSHEGIRYWVARRHGIKTS
PLcandidatus             -----------------------TTYYVNGIWYSLAIMSILLSHELGHFFMCRKYHVDAT
ORF_BJ16030              --------------------NDFTEHWIRGLIYMVSVMAILSAHEAGHFVAAWRHRIPAT
PRanaeromyxobacter       --------------------PTLGRVLEGGLPFAGALVAILFTHEMGHYVLARRHRVDTT
PRstigmatella            AFFLGGQG----------EELGLPGWVGGSLSFALALLSILGAHEMGHYVLARFHGVDTS
GNSBchloroflexus         GQEYIESTG------------QVVFNWGYALSFSGSLLAILLAHEMGHFIVARREKVAVS
GNSBroseiflexus          ALYDG------------IDVFSNPAGIVAGIPFSATIMGILFVHEMGHYIVGRWRRAPVS
EuBacAcido               ----------------LLWVLHQPSNLLLGLPFALSLMGILLAHEMGHYVYCRRYHVLAT
SPleptospira             ----------------------LKELFFLRLPYSLSLIIILSAHEMGHFLAARYYGIKAT
EUpicrophilus            SIYAASFVRPG-------PYYEFYKLLYGFVFFSLPLMFILGIHETAHYLVARRYRVNAS
                                                         :   :: .:  **  :           :

GSBprosthecochloris      LPYFIPVPPM-----PFLLNLGTLGAVIRIKEKIPDTKSLFDTGVSGPLSGFIIALGLLI
GSBpelodictyon           LPYYIPMPPL-----PFLLSLGTMGAVIKVKERIPGTNSLFDIGIAGPIGGFAVSVGLLV
CYnostoc                 LPYFIPMP----------FFLGTFGAFIQMRSPIPNRKALFDISIAGPLAGFVVTLPLLI
CYAnabaena               LPYFIPIP----------FFLGTFGAFIQMRSPIPNRKALFDVGIAGPLAGFIATLPLVI
CYcrocosphaera           LPYFIPIP----------FFLGTFGAFIQMKAPVPHRKALFDVAVAGPLGGFIVTIPLLI
EUmethanosarcina         LPYFIPFP----------TFIGTMGALIRYRGPVPSRKALFDVGVAGPLVGLFMSVAVTV
EU2methanosarcina        LPYFIPFP----------TFIGTMGAVIRYKGPIPDRKALFDVGVAGPLVGLFVSIAVTI
EU1methanococcoides      LPYFIPFP----------TIIGTMGAVIKHRGVIPDRKALFDVAVAGPLVGLVASVIVTF
EUthermococcus           MPYFIPFP----------NILGTLGAVIRVKSPLPTRNAAIDLGVSGPIAGFLVAVPVTV
EUpyrococcus             FPYFIPFP----------SFIGTLGAIIRVKSPIPTRNAAIDLGASGPLVGLIVAIPVTA
EUSynechococcus          LPYFIPVP----------FVLGTFGAFIELKEPVPNRKVLFDIAVAGPLAGSLVALTMLL
PLcandidatus             LPYFLPLP---------LPPFGTFGAVIKMKGHIPHKRALFDIGAAGPLMGLVFAIPAIV
ORF_BJ16030              LPFFLPLP---------VMLTGTLGAVIGMEGSRADRKQLFDIALAGPLAGLLVAIPVFV
PRanaeromyxobacter       LPYFIPVP----------FGAGTLGAVIRIRSALPSRKATLEIGAAGPIAGFLVAVPLLV
PRstigmatella            LPYFIPLP--------LLSMVGTLGAVIVIRGRIPHRNALVDIGAAGPLAGLVVAVPVLL
GNSBchloroflexus         YPFFIPMP---------LFLLGTMGAFIAIKDLVPNRRSLLAIGIAGPLAGLVVAIPVLA
GNSBroseiflexus          LPYFIPVPPIPIPGLGIITFTGTLGAVIVQREPMLDRKTILEIGIAGPLAGLVVALPLLF
EuBacAcido               LPYFLPAP----------TLIGTLGAFIRIKSPIRSRKALFDIGIGGPIAGFVVAMPLLF
SPleptospira             WPYFIPIP---------FAPIGTMGAVIRILEPIRNKKQLFDIGIWGPLMSLILSVPCYI
EUpicrophilus            LPFFIPFP----------YIIGTFGAFVSLRDPIPDRKAMTEIGAAGPIAGFLASIPLMF
                          *:::* *             **:**.:         .     .  **: .   ::    

GSBprosthecochloris      YGFTHLPPIDYIYAIHPEYRSL---------------------GGIPATAPAETLFLGKN
GSBpelodictyon           YGFLHLPPADFIYSIHPEYLQS---------------------GGLEVAVPSGTLVLGKN
CYnostoc                 WGLAHSEVVPLIEEK-------------------------------TRFLNPDALNPKYS
CYAnabaena               WGLAHSDLVPLTEN--------------------------------TSLLNPDALNPKYS
CYcrocosphaera           WGISLSDIVPLPTVES------------------------------ASLLNVEALDPRFS
EUmethanosarcina         IGLNLE-------ASA-----------------------------------VNPFSKFVM
EU2methanosarcina        IGLNLD-------VPE-----------------------------------INPLPDSLM
EU1methanococcoides      IGLSLP-------PVE-----------------------------------YIVTPGNMV
EUthermococcus           LGLKLSVLVPMSMVPS-----------------------------------TEGGLYFGT
EUpyrococcus             IGLRLSPLVPVDYLQ------------------------------------GEGTIYFGM
EUSynechococcus          VGLVFSTPGDPPAGPE----------------------------GQPTPISFHRIDPRLS
PLcandidatus             VGLILSDVRPVPADSS-------------------------------------NYLGLGE
ORF_BJ16030              AGLVLAQP--------------------------------------------ADSSLFSM
PRanaeromyxobacter       WGLAHSEVHQVAAGVAGTSVASPLDALRA------------WMDGRELFGPDTGVRVYGD
PRstigmatella            WGLAHSPIVEAPLPETGLMGEGSLWVLAQRLFGWLMLQLTHASAPPGVESEGLWQVLFGD
GNSBchloroflexus         IGLSISEVKQVVPLPG------------------------------------SFTEGNSL
GNSBroseiflexus          YGLATSPVGPPPPGGY-------------------------------------IQEGNSI
EuBacAcido               LGLALSRAG-------------------------------------------SGEPIDFG
SPleptospira             VGIYLSSLGPIDSVRE----------------------------------NPGIISFGES
EUpicrophilus            LAQYFEKVIKP--------------------------------------------VNNVI
                          .                                                          

GSBprosthecochloris      LLYILLEEIIRPSQLPPMTEMYHYPFLFAGWLGCFVTALNLLPVGQLDGGHITYAMFGKK
GSBpelodictyon           LLWMGLEYLIAPKELPPMTEIYHYPFLFAGWLGSLVTALNLLPVGQLDGGHITYAMFGRR
CYnostoc                 ILLALLSKLALGSQLTAKSALDLHPVAVAGFIGLIVTALNLMPVGQLDGGHIVHAMFGQR
CYAnabaena               ILVALLAKLALGSALTAKLAIDLHPVAVAGFLGLIVTALNLMPVGQLDGGHIVHAMFGQR
CYcrocosphaera           FLFAILVKLVLGSSFVAGKALHLHPLAVAGYIGLIVTALNLMPVGQLDGGHMVHAMFGQK
EUmethanosarcina         PSG-LPPLFVFIQNLVGATGENLHPVAFAGWVGMFVTLLNLLPAGQLDGGHILRAMLGKK
EU2methanosarcina        FEIGLPPLFVMIQKVVGVTGSNLHPVAFAGWVGMFVTLLNLLPAGQLDGGHVLRAMLGKK
EU1methanococcoides      LDIQVPLLFQAINTISGNTVETMHPVAFAGWVGMLVTVLNLLPSGQLDGGHIVRAMLGER
EUthermococcus           NLLFEALQRLVLN-VQGDYVIFLHPVAIAGWVGILVTFLNLIPVAQLDGGHILRAFISEK
EUpyrococcus             NLIFYGLSKLVIGDVPEGFGIILHPLAIAGWVGILVTFLNLLPAAQLDGGHIARAFLPEK
EUSynechococcus          VLLAILARLVLGDQLQPGQVIDLHPLAFAGWLGLVVVAFNLVPVGQLDGGHIVHAIYGQQ
PLcandidatus             PVLFSFIAKLLFGTLPEGMDIYLHPLAFAGWAGLFVTALNLLPIGQLDGGHIMYALLGKK
ORF_BJ16030              PLLATWLLRLVRPDLPVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRR
PRanaeromyxobacter       SLVTWAAQRLVWGTLPAGHEVFVHPVGFAAWLGLLVTTLNLVPMGQLDGGHVLYALLGRR
PRstigmatella            SLLMQGLRWLAVGPLPAGKQLYEHPVVVAGWFGLLITMLNLMPVGQLDGGHLSFALWGRH
GNSBchloroflexus         LYAAMKILIFGRFLPSGGEDVYLHPVALAGWAGLLVTGLNLLPAGQLDGGHIFFALFGPR
GNSBroseiflexus          LYAMAKYLVFGQFLPGNGVDVQLNAVAWGAWIGLLVTMINLLPIGQLDGGHVAYALLGEY
EuBacAcido               FPLVFHLAHIVCGPRVALSQTALHPIAIAAWFGMFATSLNLLPGGQLDGGHILASVWPRT
SPleptospira             IFTITMNQWILGPFDPAAQDVWIHPLAQAGWVGLLVTAINLLPFGQLDGGHVIYSVFGER
EUpicrophilus            PFQLNYPLIYKLFGIFEPVKVPVFPMVFAVWVGIFATAMNLIPAGQLDGGHIVRGLLGSR
                                                 ..  . : * . . :*::* .******:  ..    

GSBprosthecochloris      GHLLTARIFLFFIIVLGLPSFLFIITELILPQQGQLLAPWLIEWSWPGWILWAFILYKVI
GSBpelodictyon           GHALAAKAFLLFIMLLGFPSFVELLLSWLMPEALALIPAVMLRWSWSGWILWSFILSRFI
CYnostoc                 VAIIIGQVARLLLLLLSLI--------------------------REEFLMWAIILLFMP
CYAnabaena               TAMFIGQIARLLLLLLSLV--------------------------QSEFFVWAIILLFIP
CYcrocosphaera           TAIVIGQLTRIFMLVLAMS--------------------------RPEFLIWAILLWLMP
EUmethanosarcina         AEKISFMMPRVLFLIGLYVIYW-------------------LKEDGFIWISWALFLWIFA
EU2methanosarcina        ADRVSSMMPRILFLIGFYVIYW-------------------LKGDGFIWIFWALFLWAFA
EU1methanococcoides      AKHVSMAMPFILGCLGLYVIFV-------------------LQQNGGIWMFWSIFLLLFA
EUthermococcus           AHKMITYAAALLLVG--------------------------MSYLWSGWLIWAILIIFIG
EUpyrococcus             VHRVLTYALGFVAIG--------------------------LSYLWPGWFLWGLLILIMG
EUSynechococcus          MGANVGRVTRWLVLLLALT-------------------------VQPWLLLWALLLFVIT
PLcandidatus             SDIVYRIGIFIFCVITVFFYKG---------------------------WILFAILLLIF
ORF_BJ16030              SCWVARSVLLGAITAIILVG--------------------------ADHWVLMVVLVTFM
PRanaeromyxobacter       GARIGSEVVSAGLLVAGLTLSWN-------------------------WLFWWLLTRFLI
PRstigmatella            ARGLGRCVALVLLLLTVFASAS--------------------------WGVWLLVTVKLV
GNSBchloroflexus         AARIMSMIVAVALLGLGFLWSG--------------------------WFIWAVMIALIG
GNSBroseiflexus          AHYLAYAFIGGCVLLGILVAPN---------------------------WLLWGVLGLFI
EuBacAcido               HRWISICTIIALFGLSFFLFLG-----------------------------WLLWAIFLA
SPleptospira             YRNWIYYLFTAFLLLCLWN---------------------------FSWLLWGFLIYFII
EUpicrophilus            AYILNYIFLGFLFYLAIVYNYLG--------------------------WLFLALFVIFL
                                                                                   . 

GSBprosthecochloris      GVEHP-ATQIKQPLSTKRTLLGWIAIAIFILCFTPVPFGVI-------------------
GSBpelodictyon           GLNHP-PTVHDHSLSTGRVVYGWVAIAIFVITFTPVPFGVT-------------------
CYnostoc                 LIDEP-ALNDVTELDNKRDIWGLLAMALLIVIILPLPQAIANFWQI--------------
CYAnabaena               LVDEP-ALNDVTELDTKRDILGLLAMALLVIIVLPMPEAIANLLQI--------------
CYcrocosphaera           IMDQP-ALNDVTELDDIRDFIGLFCLGLLIVILLPVPGAISQWLGI--------------
EUmethanosarcina         AIGHPSPLHDEVELDKKRILLGIITFILGLLCFTLIPFKPIP------------------
EU2methanosarcina        AAGHPSPLHDKVKLDRKRILIGILTFILGLLCFTLIPFKPIT------------------
EU1methanococcoides      LAGHPRTLNDDIKLDKRRMALGIGTFILGLLCFTLVPFTLVIT-----------------
EUthermococcus           SAGNPGALDEVSPISKGRIVLALTALVIFVITATPRPLWTA-------------------
EUpyrococcus             RVGNPGALDEVTPLTWSRKVLAIIIWAVFIASATLVPFSTSS------------------
EUSynechococcus          SADEP-ALNDVAELDEGRELLGLAILSWLVLILLPVPPFLQSWLGLA-------------
PLcandidatus             GFRHPSPADEYTPLDPRRKMLGIALFIIFLLSFTPVPLKF--------------------
ORF_BJ16030              GVDHPPIRNESQPLGTARTILGIASFVIPVITFMPEPLLLPGFIFIR-------------
PRanaeromyxobacter       GARHP-PPLRDEPLDARGRVLAVATLLLFAVTFVPVPISL--------------------
PRstigmatella            GFGHPEVIEPGVPLSRGRKWLCLLCFLALVGCAMPIPLRQVLS-----------------
GNSBchloroflexus         QQRSP-LRNEISPLEGPWRWLAYLGILTFLLVFTPIPITVTVP-----------------
GNSBroseiflexus          GPRHPPPLNDVSRIGPGHAALAVLGLITFVLLFMPNPLQVVGVPQ---------------
EuBacAcido               IAIRHPWVPEYPPLDKPRRWIAFCGLVMLVITIAPRPFAGLSIYDFIMYWKHGG------
SPleptospira             KVEHPFVPDPAAPLDRIRKIGGLLVLFALIFIFVPSPIQLGTNMNRPGLAEEVWISLKSV
EUpicrophilus            GLVHPPALNDYARIKMRDVFIGIFCLLMFIITFTPLPIKP--------------------
                                      :                      *                       

GSBprosthecochloris      ----
GSBpelodictyon           ----
CYnostoc                 ----
CYAnabaena               ----
CYcrocosphaera           ----
EUmethanosarcina         ----
EU2methanosarcina        ----
EU1methanococcoides      ----
EUthermococcus           ----
EUpyrococcus             ----
EUSynechococcus          ----
PLcandidatus             ----
ORF_BJ16030              ----
PRanaeromyxobacter       ----
PRstigmatella            ----
GNSBchloroflexus         ----
GNSBroseiflexus          ----
EuBacAcido               ----
SPleptospira             YSSL
EUpicrophilus            ----
                             

BLAST

BLASTP 2.2.15 [Oct-15-2006]

Reference:
Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman 
(1997), "Gapped BLAST and PSI-BLAST: a new generation of 
protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference:
Schäffer, Alejandro A., L. Aravind, Thomas L. Madden, Sergei 
Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and 
Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST 
protein database searches with composition-based statistics 
and other refinements", Nucleic Acids Res. 29:2994-3005.

RID: 1163605444-26513-77006931206.BLASTQ4


Database: Non-redundant SwissProt sequences
           217,874 sequences; 82,041,608 total letters

Query= ORF_BJ16030 translation [3-746 direct strand] Length=248 

Sequences producing significant alignments:                        (Bits)  Value

gi|2499925|sp|Q55518|Y528_SYNY3  Putative zinc metalloprotease sl  33.1    0.80
gi|81651261|sp|Q6GHH3|Y1238_STAAR  Putative zinc metalloprotease   32.0    1.7
gi|38605593|sp|Q8NWZ4|Y1145_STAAW  Putative zinc metalloprotea...  32.0    1.7
gi|54040032|sp|P63333|Y1105_STAAN  Putative zinc metalloprotea...  32.0    1.7
gi|81694637|sp|Q5HGG9|Y1281_STAAC  Putative zinc metalloprotease   32.0    1.8
gi|74626334|sp|Q9Y7U4|NSE3_SCHPO  Non-structural maintenance o...  32.0    2.0
gi|51315812|sp|O75460|ERN1_HUMAN  Serine/threonine-protein kin...  30.0    8.0
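To see why bit scores near 30 are not significant against a database of this size, the relation between bit score and expected number of chance hits can be sketched with the Karlin-Altschul formula E = m · n · 2^(-S'). The sketch below uses the raw query length and database size from this report. BLAST itself uses shorter "effective" lengths and, here, composition-based statistics, so the values it reports (0.80 to 8.0) are somewhat lower than this uncorrected estimate. The function name is illustrative.

```python
def expected_hits(bit_score, query_len, db_len):
    """Uncorrected Karlin-Altschul estimate E = m * n * 2**(-S') for bit score S'."""
    return query_len * db_len * 2.0 ** (-bit_score)


QUERY_LEN = 248           # "Length=248" in the report
DB_LETTERS = 82_041_608   # total letters in the SwissProt database searched

for bits in (33.1, 32.0, 30.0):
    estimate = expected_hits(bits, QUERY_LEN, DB_LETTERS)
    print(f"{bits:5.1f} bits -> about {estimate:.1f} chance hits expected")
```

Every hit in the table therefore scores at a level that chance alone is expected to produce about once or more per search, which is consistent with the annotators' decision to discard the Swiss-Prot results and search nr instead.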

Alignments

>gi|2499925|sp|Q55518|Y528_SYNY3 Putative zinc metalloprotease sll0528
Length=379

 Score = 33.1 bits (74),  Expect = 0.80, Method: Composition-based stats.
 Identities = 38/158 (24%), Positives = 62/158 (39%), Gaps = 37/158 (23%)

Query  7    WIRGLIYMVSVMAILSAHEAGHFVAAWRHRIPA-TLPFFLPLPVMLTGTLGAVIGMEGSR  65
            WI GLI  + + A + AHE GH + A    I   ++  FL          G +  +E   
Sbjct  58   WILGLITALLLFASVVAHELGHSLVALAQGIEVKSITLFL---------FGGLASLEKES  108

Query  66   ADRKQLFDIALAGPLAGLLVAIPVFVAGLVLAQPADSSLFSMPLLATWLLRLVRPDLPVG  125
                Q F +A+AGP   L++ + + + G  +  P                         G
Sbjct  109  NTPWQAFAVAIAGPAVSLVLFLGLTIVGTQIPLPVP-----------------------G  145

Query  126  QVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAV  163
            Q +I     L G +   +   N+IP   LDGG++  ++
Sbjct  146  QAIIG----LLGMINLALALFNLIPGLPLDGGNVLKSI  179


>gi|81651261|sp|Q6GHH3|Y1238_STAAR Putative zinc metalloprotease SAR1238
Length=428

 Score = 32.0 bits (71),  Expect = 1.7, Method: Composition-based stats.
 Identities = 23/69 (33%), Positives = 35/69 (50%), Gaps = 6/69 (8%)

Query  135  LAGWVGFLVTGL---NMIPLSQLDGGHISHAVFGRRSCWVARSVLLGAITAIILVGADHW  191
            L G+   L   L   N+IP+  LDGG I   ++        + V   A T II +GA   
Sbjct  355  LIGYTALLSVNLGIMNLIPIPALDGGRILFVIY---EAIFRKPVNKKAETTIIAIGAIFM  411

Query  192  VLMVVLVTF  200
            V++++LVT+
Sbjct  412  VVIMILVTW  420


>gi|38605593|sp|Q8NWZ4|Y1145_STAAW Putative zinc metalloprotease MW1145
 gi|81649414|sp|Q6G9V1|Y1196_STAAS Putative zinc metalloprotease SAS1196
Length=428

 Score = 32.0 bits (71),  Expect = 1.7, Method: Composition-based stats.
 Identities = 23/69 (33%), Positives = 35/69 (50%), Gaps = 6/69 (8%)

Query  135  LAGWVGFLVTGL---NMIPLSQLDGGHISHAVFGRRSCWVARSVLLGAITAIILVGADHW  191
            L G+   L   L   N+IP+  LDGG I   ++        + V   A T II +GA   
Sbjct  355  LIGYTALLSVNLGIMNLIPIPALDGGRILFVIY---EAIFRKPVNKKAETTIIAIGAIFM  411

Query  192  VLMVVLVTF  200
            V++++LVT+
Sbjct  412  VVIMILVTW  420


>gi|54040032|sp|P63333|Y1105_STAAN Putative zinc metalloprotease SA1105
 gi|54042339|sp|P63332|Y1262_STAAM Putative zinc metalloprotease SAV1262
Length=428

 Score = 32.0 bits (71),  Expect = 1.7, Method: Composition-based stats.
 Identities = 23/69 (33%), Positives = 35/69 (50%), Gaps = 6/69 (8%)

Query  135  LAGWVGFLVTGL---NMIPLSQLDGGHISHAVFGRRSCWVARSVLLGAITAIILVGADHW  191
            L G+   L   L   N+IP+  LDGG I   ++        + V   A T II +GA   
Sbjct  355  LIGYTALLSVNLGIMNLIPIPALDGGRILFVIY---EAIFRKPVNKKAETTIIAIGAIFM  411

Query  192  VLMVVLVTF  200
            V++++LVT+
Sbjct  412  VVIMILVTW  420


>gi|81694637|sp|Q5HGG9|Y1281_STAAC Putative zinc metalloprotease SACOL1281
Length=428

 Score = 32.0 bits (71),  Expect = 1.8, Method: Composition-based stats.
 Identities = 23/69 (33%), Positives = 35/69 (50%), Gaps = 6/69 (8%)

Query  135  LAGWVGFLVTGL---NMIPLSQLDGGHISHAVFGRRSCWVARSVLLGAITAIILVGADHW  191
            L G+   L   L   N+IP+  LDGG I   ++        + V   A T II +GA   
Sbjct  355  LIGYTALLSVNLGIMNLIPIPALDGGRILFVIY---EAIFRKPVNKKAETTIIAIGAIFM  411

Query  192  VLMVVLVTF  200
            V++++LVT+
Sbjct  412  VVIMILVTW  420


>gi|74626334|sp|Q9Y7U4|NSE3_SCHPO  Non-structural maintenance of chromosome element 3 (Non-SMC element 
3)
Length=328

 Score = 32.0 bits (71),  Expect = 2.0, Method: Composition-based stats.
 Identities = 22/64 (34%), Positives = 32/64 (50%), Gaps = 1/64 (1%)

Query  96   LAQPADSSLFSMPLLATWLLRLVRP-DLPVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQL  154
            L +PA S+  S  L   W+LR   P +L     LI ++ L   + GFL+T +  I +S  
Sbjct  166  LRRPATSNANSSNLHRYWVLRSTLPMELQKDSRLIVDSVLDTAYYGFLMTVIAFIAVSHC  225

Query  155  DGGH  158
              GH
Sbjct  226  SVGH  229


>gi|51315812|sp|O75460|ERN1_HUMAN Serine/threonine-protein kinase/endoribonuclease IRE1 precursor 
(Inositol-requiring protein 1) (hIRE1p) (IRE1a) (Ire1-alpha) 
(Endoplasmic reticulum-to-nucleus signaling 1) [Includes: 
Serine/threonine-protein kinase ; Endoribonuclease ]
Length=977

 Score = 30.0 bits (66),  Expect = 8.0, Method: Composition-based stats.
 Identities = 17/40 (42%), Positives = 22/40 (55%), Gaps = 10/40 (25%)

Query  119  RPDLPVGQVL------IPNAFLLAGWVGFLVTGLNMIPLS  152
            RP+ PV  +L      I + FLL GWV F++T     PLS
Sbjct  432  RPEAPVDSMLKDMATIILSTFLLIGWVAFIIT----YPLS  467




  Database: Non-redundant SwissProt sequences
    Posted date:  Nov 14, 2006  5:54 PM
  Number of letters in database: 82,041,608
  Number of sequences in database:  217,874
Lambda     K      H
   0.330    0.145    0.459 
Gapped
Lambda     K      H
   0.267   0.0410    0.140 
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 217874
Number of Hits to DB: 2026809
Number of extensions: 84426
Number of successful extensions: 294
Number of sequences better than 10: 4
Number of HSP's better than 10 without gapping: 0
Number of HSP's gapped: 302
Number of HSP's successfully gapped: 4
Length of query: 248
Length of database: 82041608
Length adjustment: 109
Effective length of query: 139
Effective length of database: 58293342
Effective search space: 8102774538
Effective search space used: 8102774538
T: 11
A: 40
X1: 15 (7.1 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 40 (20.0 bits)
S2: 65 (29.6 bits)
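
The effective lengths and the effective search space listed above can be re-derived from the other figures in the same block. The sketch below assumes the usual BLAST convention that the length adjustment is subtracted once from the query and once from every database sequence; the variable names are illustrative and not part of the BLAST output.

```python
# Values copied from the BLAST statistics block above.
query_length = 248
db_letters = 82_041_608
db_sequences = 217_874
length_adjustment = 109

# Assumption: the length adjustment is removed once from the query
# and once from each sequence in the database.
effective_query = query_length - length_adjustment
effective_db = db_letters - db_sequences * length_adjustment
effective_search_space = effective_query * effective_db

print(effective_query)         # 139
print(effective_db)            # 58293342
print(effective_search_space)  # 8102774538
```

The three printed values agree with the "Effective length of query", "Effective length of database" and "Effective search space" lines of the report.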

//////////////////////

BLASTP 2.2.15 [Oct-15-2006]

Reference:
Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman 
(1997), "Gapped BLAST and PSI-BLAST: a new generation of 
protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference:
Schäffer, Alejandro A., L. Aravind, Thomas L. Madden, Sergei 
Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and 
Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST 
protein database searches with composition-based statistics 
and other refinements", Nucleic Acids Res. 29:2994-3005.

RID: 1163605949-1252-140239588763.BLASTQ4


Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
           4,113,407 sequences; 1,418,464,042 total letters

Query= ORF_BJ16030 translation [3-746 direct strand] Length=248 

Sequences producing significant alignments:                        (Bits)  Value

gi|87306567|ref|ZP_01088714.1|  hypothetical protein DSM3645_0...   164    2e-39
gi|91202377|emb|CAJ72016.1|  conserved hypothetical protein [C...   125    1e-27
gi|110619951|emb|CAJ35229.1|  putative metalloprotease (M50 fa...   121    2e-26
gi|57641182|ref|YP_183660.1|  membrane-associated metallopepti...   110    7e-23
gi|108760121|ref|YP_634836.1|  peptidase, M50 (S2P protease) f...   108    2e-22
gi|116329531|ref|YP_799251.1|  Peptidase, M50 family [Leptospi...   107    4e-22
gi|81298898|ref|YP_399106.1|  hypothetical protein Synpcc7942_...   107    5e-22
gi|56751426|ref|YP_172127.1|  hypothetical protein syc1417_d [...   107    5e-22
gi|73668262|ref|YP_304277.1|  zinc metalloprotease [Methanosar...   107    7e-22
gi|20089141|ref|NP_615216.1|  hypothetical protein MA0243 [Met...   105    2e-21
gi|14521814|ref|NP_127290.1|  hypothetical protein PAB1063 [Py...   104    3e-21
gi|115377636|ref|ZP_01464831.1|  peptidase, M50 family [Stigma...   103    7e-21
gi|76258449|ref|ZP_00766104.1|  Peptidase M50 [Chloroflexus au...   103    8e-21
gi|68551965|ref|ZP_00591358.1|  Peptidase M50 [Prosthecochlori...   103    1e-20
gi|15790876|ref|NP_280700.1|  hypothetical protein VNG2012C [H...   102    2e-20
gi|68548817|ref|ZP_00588286.1|  Peptidase M50 [Pelodictyon pha...   102    2e-20
gi|91773435|ref|YP_566127.1|  peptidase M50 [Methanococcoides ...   100    4e-20
gi|76801782|ref|YP_326790.1|  probable metalloprotease [Natron...   100    6e-20
gi|86160740|ref|YP_467525.1|  peptidase M50 [Anaeromyxobacter ...   100    7e-20
gi|21227625|ref|NP_633547.1|  Zinc metalloprotease [Methanosar...  98.2    3e-19
gi|86606401|ref|YP_475164.1|  peptidase, M50B family [Synechoc...  97.8    3e-19
gi|67923155|ref|ZP_00516644.1|  Peptidase M50 [Crocosphaera wa...  97.8    4e-19
gi|14590262|ref|NP_142328.1|  hypothetical protein PH0351 [Pyr...  97.8    4e-19
gi|11497673|ref|NP_068894.1|  hypothetical protein AF0053 [Arc...  97.8    4e-19
gi|24212763|ref|NP_710244.1|  hypothetical protein LA0063 [Lep...  97.4    4e-19
gi|17230910|ref|NP_487458.1|  hypothetical protein all3418 [No...  97.4    5e-19
gi|75909644|ref|YP_323940.1|  Peptidase M50 [Anabaena variabil...  97.4    5e-19
gi|18976764|ref|NP_578121.1|  metalloprotease [Pyrococcus furi...  97.4    6e-19
gi|106888483|ref|ZP_01355683.1|  Peptidase M50 [Roseiflexus sp...  96.7    8e-19
gi|55378407|ref|YP_136257.1|  hypothetical protein rrnAC1637 [...  96.3    1e-18
gi|23127252|ref|ZP_00109126.1|  COG0750: Predicted membrane-as...  95.9    2e-18
gi|16331565|ref|NP_442293.1|  hypothetical protein slr0643 [Sy...  95.5    2e-18
gi|48477315|ref|YP_023021.1|  zinc metalloprotease [Picrophilu...  94.7    4e-18
gi|71482040|ref|ZP_00661741.1|  Peptidase M50 [Prosthecochlori...  93.2    1e-17
gi|67919415|ref|ZP_00512993.1|  Peptidase M50 [Chlorobium limi...  92.8    1e-17
gi|22298414|ref|NP_681661.1|  hypothetical protein tll0871 [Th...  92.0    2e-17
gi|94968211|ref|YP_590259.1|  peptidase M50 [Acidobacteria bac...  92.0    2e-17
gi|110667076|ref|YP_656887.1|  probable membrane associated me...  89.7    1e-16
gi|13541085|ref|NP_110773.1|  Predicted membrane-associated Zn...  89.7    1e-16
gi|55379738|ref|YP_137588.1|  hypothetical protein rrnAC3176 [...  89.4    2e-16
gi|113477344|ref|YP_723405.1|  peptidase M50 [Trichodesmium er...  89.0    2e-16
gi|86607717|ref|YP_476479.1|  peptidase, M50B family [Synechoc...  89.0    2e-16
gi|78187529|ref|YP_375572.1|  zinc protease, putative [Pelodic...  89.0    2e-16
gi|16082612|ref|NP_394800.1|  Predicted membrane-associated Zn...  88.2    3e-16
gi|10640687|emb|CAC12465.1|  conserved hypothetical membrane p...  86.7    1e-15
gi|67938062|ref|ZP_00530592.1|  Peptidase M50 [Chlorobium phae...  86.3    1e-15
gi|69268506|ref|ZP_00609238.1|  Peptidase M50 [Ferroplasma aci...  83.2    1e-14
gi|67923178|ref|ZP_00516666.1|  Peptidase M50 [Crocosphaera wa...  81.6    3e-14
gi|110597188|ref|ZP_01385477.1|  Peptidase M50 [Chlorobium fer...  80.1    8e-14
gi|23128478|ref|ZP_00110323.1|  COG0750: Predicted membrane-as...  79.3    2e-13
gi|116624251|ref|YP_826407.1|  peptidase M50 [Solibacter usita...  77.8    4e-13
gi|13541649|ref|NP_111337.1|  Predicted membrane-associated Zn...  77.4    5e-13
gi|21674522|ref|NP_662587.1|  zinc protease, putative [Chlorob...  75.9    1e-12
gi|75907293|ref|YP_321589.1|  Peptidase M50 [Anabaena variabil...  74.7    4e-12
gi|17229606|ref|NP_486154.1|  hypothetical protein alr2114 [No...  72.8    1e-11
gi|83815374|ref|YP_445072.1|  peptidase, M50 family protein [S...  72.0    2e-11
gi|86607310|ref|YP_476073.1|  peptidase, M50 family [Synechoco...  70.9    5e-11
gi|86608731|ref|YP_477493.1|  hypothetical protein CYB_1255 [S...  68.6    3e-10
gi|16330216|ref|NP_440944.1|  hypothetical protein sll0862 [Sy...  68.2    3e-10
gi|22298733|ref|NP_681980.1|  hypothetical protein tll1190 [Th...  68.2    3e-10
gi|16081895|ref|NP_394299.1|  hypothetical protein Ta0839 [The...  67.8    4e-10
gi|15238440|ref|NP_198372.1|  EGY1 (ETHYLENE-DEPENDENT GRAVITR...  67.4    6e-10
gi|115455845|ref|NP_001051523.1|  Os03g0792400 [Oryza sativa (...  67.0    7e-10
gi|49457926|gb|AAO37991.2|  expressed protein [Oryza sativa (j...  67.0    7e-10
gi|116060343|emb|CAL55679.1|  unnamed protein product [Ostreococc  66.6    8e-10
gi|15239226|ref|NP_196193.1|  metalloendopeptidase [Arabidopsi...  63.9    6e-09
gi|115455101|ref|NP_001051151.1|  Os03g0729000 [Oryza sativa (...  63.5    8e-09
gi|42573279|ref|NP_974736.1|  metalloendopeptidase [Arabidopsis t  63.5    8e-09
gi|113474292|ref|YP_720353.1|  peptidase M50 [Trichodesmium er...  63.2    1e-08
gi|67934827|ref|ZP_00527853.1|  zinc protease, putative [Chlor...  60.8    5e-08
gi|54290179|dbj|BAD61067.1|  unknown protein [Oryza sativa (japon  60.5    7e-08
gi|115434462|ref|NP_001041989.1|  Os01g0142100 [Oryza sativa (...  60.5    8e-08
gi|69270399|ref|ZP_00610431.1|  Peptidase M50 [Ferroplasma aci...  60.1    1e-07
gi|56751543|ref|YP_172244.1|  hypothetical protein syc1534_d [...  59.3    1e-07
gi|8978356|dbj|BAA98209.1|  unnamed protein product [Arabidopsis   59.3    2e-07
gi|116062581|dbj|BAA79899.2|  hypothetical protein [Aeropyrum per  58.9    2e-07
gi|37521282|ref|NP_924659.1|  hypothetical protein glr1713 [Gl...  58.9    2e-07
gi|14601068|ref|NP_147594.1|  hypothetical protein APE0915 [Aerop  57.8    4e-07
gi|81301385|ref|YP_401593.1|  hypothetical protein Synpcc7942_...  56.6    1e-06
gi|110740640|dbj|BAE98423.1|  hypothetical protein [Arabidopsis t  55.5    2e-06
gi|15220875|ref|NP_173229.1|  unknown protein [Arabidopsis tha...  55.1    3e-06
gi|110637322|ref|YP_677529.1|  zinc protease [Cytophaga hutchi...  50.4    8e-05
gi|48477496|ref|YP_023202.1|  hypothetical zinc metalloproteas...  45.8    0.002
gi|30692714|ref|NP_851094.1|  DNA binding / metalloendopeptida...  44.3    0.005
gi|68055730|ref|ZP_00539872.1|  Peptidase M50 [Exiguobacterium...  40.8    0.051
gi|87162734|gb|ABD28529.1|  Peptidase M, neutral zinc metallop...  39.7    0.11
gi|42780644|ref|NP_977891.1|  hypothetical protein BCE_1570 [B...  39.7    0.13
gi|49184372|ref|YP_027624.1|  hypothetical protein BAS1355 [Ba...  38.1    0.32
gi|30261543|ref|NP_843920.1|  hypothetical protein BA1465 [Bac...  38.1    0.32
gi|49477237|ref|YP_035663.1|  membrane metalloprotease [Bacill...  38.1    0.33
gi|89202924|ref|ZP_01181628.1|  Peptidase M50 [Bacillus cereus...  38.1    0.35
gi|89894538|ref|YP_518025.1|  hypothetical protein DSY1792 [De...  37.7    0.47
gi|65318810|ref|ZP_00391769.1|  COG1994: Zn-dependent proteases [  36.2    1.2
gi|20094249|ref|NP_614096.1|  Predicted membrane-associated Zn...  35.8    1.7
gi|47569114|ref|ZP_00239802.1|  membrane metalloprotease [Baci...  35.8    2.0
gi|107025401|ref|YP_622912.1|  peptidase M50 [Burkholderia cen...  35.8    2.0
gi|84353019|ref|ZP_00977961.1|  COG1994: Zn-dependent protease...  35.4    2.5
gi|78061178|ref|YP_371086.1|  Peptidase M50 [Burkholderia sp. ...  35.0    3.0
gi|50755557|ref|XP_414795.1|  PREDICTED: similar to importin alph  34.7    3.6
gi|115359427|ref|YP_776565.1|  peptidase M50 [Burkholderia cep...  34.7    3.8


>gi|87306567|ref|ZP_01088714.1|  hypothetical protein DSM3645_09547 [Blastopirellula marina DSM 
3645]
 gi|87290746|gb|EAQ82633.1|  hypothetical protein DSM3645_09547 [Blastopirellula marina DSM 
3645]
Length=349

 Score =  164 bits (416),  Expect = 2e-39, Method: Composition-based stats.
 Identities = 112/241 (46%), Positives = 161/241 (66%), Gaps = 12/241 (4%)

Query  10   GLIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIGMEGSRADRK  69
            GLIYM  ++AIL AHE GHF+   R+RI A+ P+F+P+P+   GT+GAVIGM+G +A+R+
Sbjct  110  GLIYMACLLAILFAHEMGHFLMTVRYRIHASYPYFIPIPISPIGTMGAVIGMDGLKANRR  169

Query  70   QLFDialagplagllvaipVFVAG-----LVLAQPADSSLFSMPLLATWLLRLVR-PDL-  122
            QLFDI LAGPLAGL++AIPV   G     L    P+  SL  +PL   W +   + P   
Sbjct  170  QLFDIGLAGPLAGLVIAIPVLYVGILQMDLTKTAPSPYSL-DVPLGLAWAMAWFQVPGYS  228

Query  123  ---PVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRRSCWVARSVLLGA  179
               PV Q  + N + +AGWVG LVTGLNM+P+SQLDGGH+++ +FG+ S ++A  V++ A
Sbjct  229  LGDPVAQAQL-NPYFMAGWVGLLVTGLNMLPISQLDGGHVAYTLFGKWSYFLAWGVIIAA  287

Query  180  ITAIILVGADHWVLMVVLVTFMGVDHPPIRNESQPLGTARTILGIASFVIPVITFMPEPL  239
            +TA+ L     W+LM++LV  +G  HPP  ++S  +G  R I+G  S VIP++ F P+ +
Sbjct  288  VTAMALGAGWTWILMLILVLVIGPSHPPTADDSVKIGAVRWIVGFTSLVIPILCFPPQAI  347

Query  240  L  240
            +
Sbjct  348  I  348
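
The percentages on the "Identities / Positives / Gaps" line are plain ratios over the alignment length. A minimal sketch for extracting them is given below; the function name `alignment_percentages` is hypothetical, and the use of integer truncation (rather than rounding) is an assumption inferred from the figures printed in this report.

```python
import re

def alignment_percentages(summary_line):
    """Return {label: (count, length, percent)} for a BLAST summary line."""
    result = {}
    for label, count, length in re.findall(r"(\w+) = (\d+)/(\d+)", summary_line):
        count, length = int(count), int(length)
        # Integer division truncates; this reproduces the values shown in the report.
        result[label] = (count, length, count * 100 // length)
    return result

line = "Identities = 112/241 (46%), Positives = 161/241 (66%), Gaps = 12/241 (4%)"
stats = alignment_percentages(line)
print(stats["Identities"])  # (112, 241, 46)
print(stats["Positives"])   # (161, 241, 66)
print(stats["Gaps"])        # (12, 241, 4)
```

Note that 161/241 is about 66.8%, yet the report prints 66%, which is why truncation is assumed here.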


>gi|91202377|emb|CAJ72016.1|  conserved hypothetical protein [Candidatus Kuenenia stuttgartiensis]
Length=271

 Score =  125 bits (315),  Expect = 1e-27, Method: Composition-based stats.
 Identities = 100/246 (40%), Positives = 146/246 (59%), Gaps = 12/246 (4%)

Query  3    FTEHWIRGLIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIGME  62
             T +++ G+ Y +++M+IL +HE GHF    ++ + ATLP+FLPLP+   GT GAVI M+
Sbjct  27   LTTYYVNGIWYSLAIMSILLSHELGHFFMCRKYHVDATLPYFLPLPLPPFGTFGAVIKMK  86

Query  63   GSRADRKQLFDialagplagllvaipVFVAGLVLAQ----PADSSLF---SMPLLATWLL  115
            G    ++ LFDI  AGPL GL+ AIP  V GL+L+     PADSS +     P+L +++ 
Sbjct  87   GHIPHKRALFDIGAAGPLMGLVFAIPAIVVGLILSDVRPVPADSSNYLGLGEPVLFSFIA  146

Query  116  RLVRPDLPVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRRSCWVAR--  173
            +L+   LP G  +  +    AGW G  VT LN++P+ QLDGGHI +A+ G++S  V R  
Sbjct  147  KLLFGTLPEGMDIYLHPLAFAGWAGLFVTALNLLPIGQLDGGHIMYALLGKKSDIVYRIG  206

Query  174  SVLLGAITAIILVGADHWVLMVVLVTFMGVDHPPIRNESQPLGTARTILGIASFVIPVIT  233
              +   IT     G   W+L  +L+   G  HP   +E  PL   R +LGIA F+I +++
Sbjct  207  IFIFCVITVFFYKG---WILFAILLLIFGFRHPSPADEYTPLDPRRKMLGIALFIIFLLS  263

Query  234  FMPEPL  239
            F P PL
Sbjct  264  FTPVPL  269


>gi|110619951|emb|CAJ35229.1|  putative metalloprotease (M50 family) [uncultured methanogenic 
archaeon]
Length=352

 Score =  121 bits (304),  Expect = 2e-26, Method: Composition-based stats.
 Identities = 95/247 (38%), Positives = 141/247 (57%), Gaps = 18/247 (7%)

Query  9    RGLIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIGMEGSRADR  68
            +GL + +++M  L +HE GH++ + ++ I ATLP+F+P P    GT+GA+I  +G   +R
Sbjct  110  KGLPFAIAIMVALGSHELGHYIVSRKYGIDATLPYFIPFPFSPIGTMGAIIRQKGPVPNR  169

Query  69   KQLFDialagplagllvaipVFVAGLVLAQPADSSL------FSMPLLATWLLRLVRPDL  122
            K LFD+ +AGPL GL V++ + V GL+L  P   +        + PLL  +L  +V P  
Sbjct  170  KALFDVGIAGPLVGLAVSVVIIVIGLMLPAPEIDTTSGTYMQINTPLLFDFLAWVVHPGE  229

Query  123  PVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRRSCWVARSVLLGAITA  182
             +  V   N    AGWVG LVT LNMIP+ QLDGGH+S AVFG R+  ++R V+   I A
Sbjct  230  TLTSV---NPIAFAGWVGLLVTVLNMIPVGQLDGGHVSRAVFGERANLISR-VMPIIIMA  285

Query  183  IILVG-------ADHWVLMVVLVTFMGV-DHPPIRNESQPLGTARTILGIASFVIPVITF  234
              L G        + W+L   L   M    HP   +++Q +G  R IL  A+FV+ ++ F
Sbjct  286  FGLYGTFILQQPGEIWILWGFLSALMSAGSHPKPTDDTQTIGVPRYILAAAAFVLALLCF  345

Query  235  MPEPLLL  241
             P P+ +
Sbjct  346  TPFPITM  352


>gi|57641182|ref|YP_183660.1| membrane-associated metallopeptidase, M50 family [Thermococcus 
kodakarensis KOD1]
 gi|57159506|dbj|BAD85436.1| membrane-associated metallopeptidase, M50 family [Thermococcus 
kodakarensis KOD1]
Length=436

 Score =  110 bits (274),  Expect = 7e-23, Method: Composition-based stats.
 Identities = 94/247 (38%), Positives = 138/247 (55%), Gaps = 21/247 (8%)

Query  7    WIRGLIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIGMEGSRA  66
            +I  L + VSVMAI+  HE GH +AA  H + AT+P+F+P P +L GTLGAVI ++    
Sbjct  194  YIIALSFSVSVMAIIGTHELGHKIAATYHGVKATMPYFIPFPNIL-GTLGAVIRVKSPLP  252

Query  67   DRKQLFDialagplagllvaipVFVAGLVLAQPADSSL---------FSMPLLATWLLRL  117
             R    D+ ++GP+AG LVA+PV V GL L+     S+         F   LL   L RL
Sbjct  253  TRNAAIDLGVSGPIAGFLVAVPVTVLGLKLSVLVPMSMVPSTEGGLYFGTNLLFEALQRL  312

Query  118  VRPDLPVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRRSCWVARSVLL  177
            V  ++    V+  +   +AGWVG LVT LN+IP++QLDGGHI  A    ++       ++
Sbjct  313  VL-NVQGDYVIFLHPVAIAGWVGILVTFLNLIPVAQLDGGHILRAFISEKA-----HKMI  366

Query  178  GAITAIILVGADH----WVLMVVLVTFMG-VDHPPIRNESQPLGTARTILGIASFVIPVI  232
                A++LVG  +    W++  +L+ F+G   +P   +E  P+   R +L + + VI VI
Sbjct  367  TYAAALLLVGMSYLWSGWLIWAILIIFIGSAGNPGALDEVSPISKGRIVLALTALVIFVI  426

Query  233  TFMPEPL  239
            T  P PL
Sbjct  427  TATPRPL  433


>gi|108760121|ref|YP_634836.1| peptidase, M50 (S2P protease) family [Myxococcus xanthus DK 1622]
 gi|108464001|gb|ABF89186.1| peptidase, M50 (S2P protease) family [Myxococcus xanthus DK 1622]
Length=365

 Score =  108 bits (270),  Expect = 2e-22, Method: Composition-based stats.
 Identities = 94/279 (33%), Positives = 139/279 (49%), Gaps = 44/279 (15%)

Query  5    EHWIRGLIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIGMEGS  64
            E   R L + +S++AIL  HE GH+V A  HR+  +LP+F+PLPV+  GTLGAVI +   
Sbjct  82   EAAFRALAFSLSLLAILGTHEMGHYVLARWHRVETSLPYFIPLPVLGVGTLGAVIRIRDR  141

Query  65   RADRKQLFDialagplagllvaipVFVAGLVLAQPADS----------------------  102
              +R  L DI  AGPLAGL+VA+P+   GL  +   D+                      
Sbjct  142  IPNRNALVDIGAAGPLAGLVVALPILFWGLAHSTVVDAPDIPSTLFPGDGSLWVIGRDVF  201

Query  103  ----------------------SLFSMPLLATWLLRLVRPDLPVGQVLIPNAFLLAGWVG  140
                                  +LF   LL   L RL    LP G+ ++ +  ++AGW G
Sbjct  202  TWVMDRVTNAPPAPETPFNGVQTLFGDSLLMQGLTRLALGPLPEGKDILVHPVVIAGWFG  261

Query  141  FLVTGLNMIPLSQLDGGHISHAVFGRRSCWVARSVLLGAITAIILVGADHWVLMVVLVTF  200
             LVT LN++P+ QLDGGH+++A++GRR+ WV R+V L  +   + V A   + ++V    
Sbjct  262  LLVTLLNLMPVGQLDGGHLAYALWGRRAHWVGRAVALVLLVLTLFVTASWGLWLLVTSKL  321

Query  201  MGVDHPPIRNESQPLGTARTILGIASFVIPVITFMPEPL  239
            +G  HP +    +PL   R  +     +  +   MP PL
Sbjct  322  VGFGHPEVVEPQEPLSPLRKWICALCLLALIGCAMPIPL  360


>gi|116329531|ref|YP_799251.1| Peptidase, M50 family [Leptospira borgpetersenii serovar Hardjo-bovis 
L550]
 gi|116329846|ref|YP_799564.1| Peptidase, M50 family [Leptospira borgpetersenii serovar Hardjo-bovis 
JB197]
 gi|116122275|gb|ABJ80318.1| Peptidase, M50 family [Leptospira borgpetersenii serovar Hardjo-bovis 
L550]
 gi|116123535|gb|ABJ74806.1| Peptidase, M50 family [Leptospira borgpetersenii serovar Hardjo-bovis 
JB197]
Length=308

 Score =  107 bits (268),  Expect = 4e-22, Method: Composition-based stats.
 Identities = 90/248 (36%), Positives = 140/248 (56%), Gaps = 25/248 (10%)

Query  11   LIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIGMEGSRADRKQ  70
            L Y +S++ ILSAHE GHF+AA  + I AT P+F+P+P    GT+GAVI +     ++KQ
Sbjct  45   LPYSLSLIIILSAHEMGHFLAARYYGIKATWPYFIPIPFAPIGTMGAVIRILEPIRNKKQ  104

Query  71   LFDialagplagllvaipVFVAGLVLAQ--PADS------------SLFSMPLLATWLLR  116
            LFDI + GPL  L++++P ++ G+ L+   P DS            S+F++  +  W+L 
Sbjct  105  LFDIGIWGPLMSLILSVPCYIVGIYLSSLGPIDSVRENPGIISFGESIFTIT-MNQWIL-  162

Query  117  LVRPDLPVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRRSCWVARSVL  176
               P  P  Q +  +    AGWVG LVT +N++P  QLDGGH+ ++VFG R     R+ +
Sbjct  163  --GPFDPAAQDVWIHPLAQAGWVGLLVTAINLLPFGQLDGGHVIYSVFGERY----RNWI  216

Query  177  LGAITAIILVGADH--WVLMVVLVTF-MGVDHPPIRNESQPLGTARTILGIASFVIPVIT  233
                TA +L+   +  W+L   L+ F + V+HP + + + PL   R I G+      +  
Sbjct  217  YYLFTAFLLLCLWNFSWLLWGFLIYFIIKVEHPFVPDPAAPLDRIRKIGGLLVLFALIFI  276

Query  234  FMPEPLLL  241
            F+P P+ L
Sbjct  277  FVPSPIQL  284


>gi|81298898|ref|YP_399106.1| hypothetical protein Synpcc7942_0087 [Synechococcus elongatus 
PCC 7942]
 gi|81167779|gb|ABB56119.1| conserved hypothetical protein [Synechococcus elongatus PCC 7942]
Length=503

 Score =  107 bits (267),  Expect = 5e-22, Method: Composition-based stats.
 Identities = 89/246 (36%), Positives = 141/246 (57%), Gaps = 14/246 (5%)

Query  8    IRGLIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIGMEGSRAD  67
            +RGL Y +S++AIL  HE GHF AA +HR+ A+LP+F+P+P  L GT GA + +     D
Sbjct  253  LRGLPYALSLLAILGVHEFGHFWAARKHRLQASLPYFIPVPAFL-GTFGAFVRIRSPIPD  311

Query  68   RKQLFDialagplagllvaipVFVAGLVLAQ----PADSSLFSMPLL---ATWLLRLVR-  119
            RK LFD+ ++GPLAGL++ +P+ + GL  +Q    P  S L +   L    + L+ L+  
Sbjct  312  RKALFDVGVSGPLAGLVITLPLLIWGLTQSQVVPMPERSGLLNFSALDPGVSILMGLISH  371

Query  120  ----PDLPVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRRSCWVARSV  175
                  L + Q L  +   +AG++G +VT LN++P+ QLDGGHI HA+FG+R   V   V
Sbjct  372  LSLGDRLGLNQALQLHPVAIAGYLGLIVTALNLVPVGQLDGGHIVHAMFGQRQGAVIGQV  431

Query  176  LLGAITAIILVGADHWVLMVVLVTFMGVDHPPIRNESQPLGTARTILGIASFVIPVITFM  235
                I A+  V ++  +  ++L+     D P + + S+ L   R  +G  +  I ++  +
Sbjct  432  ARLCILALSFVRSELLLWALLLLLLPVADEPALNDLSE-LDDRRDGIGFLALFILILIVL  490

Query  236  PEPLLL  241
            P P +L
Sbjct  491  PLPPVL  496


>gi|56751426|ref|YP_172127.1| hypothetical protein syc1417_d [Synechococcus elongatus PCC 6301]
 gi|56686385|dbj|BAD79607.1| hypothetical protein [Synechococcus elongatus PCC 6301]
Length=503

 Score =  107 bits (267),  Expect = 5e-22, Method: Composition-based stats.
 Identities = 89/246 (36%), Positives = 141/246 (57%), Gaps = 14/246 (5%)

Query  8    IRGLIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIGMEGSRAD  67
            +RGL Y +S++AIL  HE GHF AA +HR+ A+LP+F+P+P  L GT GA + +     D
Sbjct  253  LRGLPYALSLLAILGVHEFGHFWAARKHRLQASLPYFIPVPAFL-GTFGAFVRIRSPIPD  311

Query  68   RKQLFDialagplagllvaipVFVAGLVLAQ----PADSSLFSMPLL---ATWLLRLVR-  119
            RK LFD+ ++GPLAGL++ +P+ + GL  +Q    P  S L +   L    + L+ L+  
Sbjct  312  RKALFDVGVSGPLAGLVITLPLLIWGLTQSQVVPMPERSGLLNFSALDPGVSILMGLISH  371

Query  120  ----PDLPVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRRSCWVARSV  175
                  L + Q L  +   +AG++G +VT LN++P+ QLDGGHI HA+FG+R   V   V
Sbjct  372  LSLGDRLGLNQALQLHPVAIAGYLGLIVTALNLVPVGQLDGGHIVHAMFGQRQGAVIGQV  431

Query  176  LLGAITAIILVGADHWVLMVVLVTFMGVDHPPIRNESQPLGTARTILGIASFVIPVITFM  235
                I A+  V ++  +  ++L+     D P + + S+ L   R  +G  +  I ++  +
Sbjct  432  ARLCILALSFVRSELLLRALLLLLLPVADEPALNDLSE-LDDRRDGIGFLALFILILIVL  490

Query  236  PEPLLL  241
            P P +L
Sbjct  491  PLPPVL  496


>gi|73668262|ref|YP_304277.1| zinc metalloprotease [Methanosarcina barkeri str. fusaro]
 gi|72395424|gb|AAZ69697.1| zinc metalloprotease [Methanosarcina barkeri str. fusaro]
Length=369

 Score =  107 bits (266),  Expect = 7e-22, Method: Composition-based stats.
 Identities = 87/241 (36%), Positives = 133/241 (55%), Gaps = 13/241 (5%)

Query  9    RGLIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIGMEGSRADR  68
            RGL + +++MA+L +HE  H+V A  H + A+LP+F+P P  + GT+GA+I   G    R
Sbjct  128  RGLPFTLAIMAVLGSHEMAHYVMARYHGMKASLPYFIPFPTFI-GTMGALIRYRGPVPSR  186

Query  69   KQLFDialagplagllvaipVFVAGLVLAQPADS--SLFSMPLLATWLLRLVRPDL-PVG  125
            K LFD+ +AGPL GL +++ V V GL L   A +  S F MP     L   ++  +   G
Sbjct  187  KALFDVGVAGPLVGLFMSVAVTVIGLNLEASAVNPFSKFVMPSGLPPLFVFIQNLVGATG  246

Query  126  QVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRR----SCWVARSVLLGAIT  181
            + L P AF  AGWVG  VT LN++P  QLDGGHI  A+ G++    S  + R + L  + 
Sbjct  247  ENLHPVAF--AGWVGMFVTLLNLLPAGQLDGGHILRAMLGKKAEKISFMMPRVLFLIGLY  304

Query  182  AIILVGADHWVLM---VVLVTFMGVDHPPIRNESQPLGTARTILGIASFVIPVITFMPEP  238
             I  +  D ++ +   + L  F  + HP   ++   L   R +LGI +F++ ++ F   P
Sbjct  305  VIYWLKEDGFIWISWALFLWIFAAIGHPSPLHDEVELDKKRILLGIITFILGLLCFTLIP  364

Query  239  L  239
             
Sbjct  365  F  365


>gi|20089141|ref|NP_615216.1| hypothetical protein MA0243 [Methanosarcina acetivorans C2A]
 gi|19914009|gb|AAM03696.1| conserved hypothetical protein [Methanosarcina acetivorans C2A]
Length=368

 Score =  105 bits (261),  Expect = 2e-21, Method: Composition-based stats.
 Identities = 90/245 (36%), Positives = 137/245 (55%), Gaps = 20/245 (8%)

Query  9    RGLIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIGMEGSRADR  68
            +GL + ++++A+L +HE  H+  A  H +  +LP+F+P P  + GT+GAVI   G   DR
Sbjct  126  QGLPFTLAILAVLGSHEMAHYAMARHHGMKTSLPYFIPFPTFI-GTMGAVIRYRGPIPDR  184

Query  69   KQLFDialagplagllvaipVFVAGLVLAQPA-----DSSLFSM--PLLATWLLRLVRPD  121
            K LFD+ +AGPL GLLV+I V + GL L  PA     DS +F +  P L   L +LV   
Sbjct  185  KALFDVGIAGPLVGLLVSIVVTIIGLNLDVPAVKPLPDSLMFELGLPPLFVMLQKLVGV-  243

Query  122  LPVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRRSCWVA----RSVLL  177
               G  L P AF  AGWVG  VT LN++P  QLDGGH+  A+ G+++ WV+    R +L+
Sbjct  244  --TGSNLHPVAF--AGWVGMFVTLLNLLPAGQLDGGHVLRAMLGKKADWVSSMMPRILLM  299

Query  178  GAITAIILVGADHWVLM---VVLVTFMGVDHPPIRNESQPLGTARTILGIASFVIPVITF  234
              I  +  +  D ++ +   + L  F    HP   ++   L   R ++GI +F++ ++ F
Sbjct  300  IGIYVVYGLKGDGFIWIFWALFLWAFAAAGHPSPLHDKMKLDRKRILIGILTFILGLLCF  359

Query  235  MPEPL  239
               P 
Sbjct  360  TLIPF  364


>gi|14521814|ref|NP_127290.1| hypothetical protein PAB1063 [Pyrococcus abyssi GE5]
 gi|5459034|emb|CAB50520.1| Peptidase, M50 family [Pyrococcus abyssi GE5]
Length=409

 Score =  104 bits (260),  Expect = 3e-21, Method: Composition-based stats.
 Identities = 83/222 (37%), Positives = 128/222 (57%), Gaps = 11/222 (4%)

Query  7    WIRGLIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIGMEGSRA  66
            ++  L + + +++IL  HE GH +AA  H + +T P+F+P P  + GTLGAVI ++    
Sbjct  166  YLNALAFSLGIISILGTHEMGHKIAASIHNVKSTFPYFIPFPSFI-GTLGAVIRVKSPIP  224

Query  67   DRKQLFDialagplagllvaipVFVAGLVLA--------QPADSSLFSMPLLATWLLRLV  118
             R    D+ ++GP+AGLLVAIPV + GL L+        +  ++  F   LL   L++LV
Sbjct  225  TRNAEVDLGVSGPIAGLLVAIPVTIIGLKLSAVVPINYLEKGETIYFGSSLLFYGLMKLV  284

Query  119  RPDLPVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRRSCWVARSVLLG  178
              DLP    +I +   +AGWVG LVT LN+IP +QLDGGH++ A+   ++  V  +  LG
Sbjct  285  LGDLPQNVGIILHPLAVAGWVGILVTFLNLIPAAQLDGGHVARALLPEKAHRVL-TYTLG  343

Query  179  AITAIILVGADHWVLMVVLVTFMG-VDHPPIRNESQPLGTAR  219
             +T  +      W+L  +L+  MG V +P   +E  PL T+R
Sbjct  344  FLTIGLAYFWPGWILWGILILLMGRVGNPGALDEVSPLTTSR  385


>gi|115377636|ref|ZP_01464831.1|  peptidase, M50 family [Stigmatella aurantiaca DW4/3-1]
 gi|115365345|gb|EAU64385.1|  peptidase, M50 family [Stigmatella aurantiaca DW4/3-1]
Length=546

 Score =  103 bits (257),  Expect = 7e-21, Method: Composition-based stats.
 Identities = 99/279 (35%), Positives = 139/279 (49%), Gaps = 46/279 (16%)

Query  7    WIRG-LIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVM-LTGTLGAVIGMEGS  64
            W+ G L + +++++IL AHE GH+V A  H +  +LP+F+PLP++ + GTLGAVI + G 
Sbjct  263  WVGGSLSFALALLSILGAHEMGHYVLARFHGVDTSLPYFIPLPLLSMVGTLGAVIVIRGR  322

Query  65   RADRKQLFDialagplagllvaipVFVAGL----------------------VLAQ----  98
               R  L DI  AGPLAGL+VA+PV + GL                      VLAQ    
Sbjct  323  IPHRNALVDIGAAGPLAGLVVAVPVLLWGLAHSPIVEAPLPETGLMGEGSLWVLAQRLFG  382

Query  99   ------------PADSS------LFSMPLLATWLLRLVRPDLPVGQVLIPNAFLLAGWVG  140
                        P   S      LF   LL   L  L    LP G+ L  +  ++AGW G
Sbjct  383  WLMLQLTHASAPPGVESEGLWQVLFGDSLLMQGLRWLAVGPLPAGKQLYEHPVVVAGWFG  442

Query  141  FLVTGLNMIPLSQLDGGHISHAVFGRRSCWVARSVLLGAITAIILVGADHWVLMVVLVTF  200
             L+T LN++P+ QLDGGH+S A++GR +  + R V L  +   +   A   V ++V V  
Sbjct  443  LLITMLNLMPVGQLDGGHLSFALWGRHARGLGRCVALVLLLLTVFASASWGVWLLVTVKL  502

Query  201  MGVDHPPIRNESQPLGTARTILGIASFVIPVITFMPEPL  239
            +G  HP +     PL   R  L +  F+  V   MP PL
Sbjct  503  VGFGHPEVIEPGVPLSRGRKWLCLLCFLALVGCAMPIPL  541


>gi|76258449|ref|ZP_00766104.1|  Peptidase M50 [Chloroflexus aurantiacus J-10-fl]
 gi|76166533|gb|EAO60658.1|  Peptidase M50 [Chloroflexus aurantiacus J-10-fl]
Length=364

 Score =  103 bits (257),  Expect = 8e-21, Method: Composition-based stats.
 Identities = 89/246 (36%), Positives = 132/246 (53%), Gaps = 17/246 (6%)

Query  6    HWIRGLIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIGMEGSR  65
            +W   L +  S++AIL AHE GHF+ A R  +  + PFF+P+P  L GT+GA I ++   
Sbjct  119  NWGYALSFSASLLAILLAHELGHFIVARREGVAVSYPFFIPMPFFLLGTMGAFIAIKDLV  178

Query  66   ADRKQLFDialagplagllvaipVFVAGLVLAQ-------PADSSLFSMPLLATWLLRLV  118
             +R+ L  I +AGPLAGL+VAIPV   GL +++       P   +  +  L A   + + 
Sbjct  179  PNRRALLAIGIAGPLAGLVVAIPVLAIGLSISEVKQVVPLPGSFTEGNSLLYAAMKILIF  238

Query  119  RPDLPV-GQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRRSCWVARSVLL  177
               LP  G+ +  +   LAGW G LVTGLN++P  QLDGGHI  A+FG R+     + ++
Sbjct  239  GRFLPSGGEDVYLHPVALAGWAGLLVTGLNLLPAGQLDGGHIFFALFGARA-----ARIM  293

Query  178  GAITAIILVGA----DHWVLMVVLVTFMGVDHPPIRNESQPLGTARTILGIASFVIPVIT  233
              I A+ L+G       W +  V+V  +G    P+RNE  PL      L     +  ++ 
Sbjct  294  SMIVAVALLGLGFLWSGWFIWAVMVALIGQQRSPLRNEISPLEGPWRWLAYLGLLTFILV  353

Query  234  FMPEPL  239
            F P P+
Sbjct  354  FTPVPI  359


>gi|68551965|ref|ZP_00591358.1|  Peptidase M50 [Prosthecochloris aestuarii DSM 271]
 gi|68241088|gb|EAN23356.1|  Peptidase M50 [Prosthecochloris aestuarii DSM 271]
Length=342

 Score =  103 bits (256),  Expect = 1e-20, Method: Composition-based stats.
 Identities = 91/280 (32%), Positives = 135/280 (48%), Gaps = 54/280 (19%)

Query  13   YMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVML----TGTLGAVIGMEGSRADR  68
            Y  +++  L  HE GHF AA  HRI  TLP+F+P+P M      GTLGAVI ++    D 
Sbjct  61   YAAALLIFLGVHEFGHFFAALSHRIRTTLPYFIPVPPMPFLLNLGTLGAVIRIKEKIPDT  120

Query  69   KQLFDialagplagllvaipVFV----------------------AGLVLAQPADSSLFS  106
            K LFD  ++GPL+G ++A+ + +                       G+    PA++    
Sbjct  121  KSLFDTGVSGPLSGFIIALGLLIYGFTHLPPIDYIYAIHPEYRSLGGIPATAPAETLFLG  180

Query  107  MPLLATWLLRLVRPD-LPVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFG  165
              LL   L  ++RP  LP    +    FL AGW+G  VT LN++P+ QLDGGHI++A+FG
Sbjct  181  KNLLYILLEEIIRPSQLPPMTEMYHYPFLFAGWLGCFVTALNLLPVGQLDGGHITYAMFG  240

Query  166  RRSCWVARSV------------LLGAITAIILVGADH-------------WVLMV-VLVT  199
            ++   +   +             L  IT +IL                  W+L   +L  
Sbjct  241  KKGHLLTARIFLFFIIVLGLPSFLFIITELILPQQGQLLAPWLIEWSWPGWILWAFILYK  300

Query  200  FMGVDHPPIRNESQPLGTARTILGIASFVIPVITFMPEPL  239
             +GV+HP  + + QPL T RT+LG  +  I ++ F P P 
Sbjct  301  VIGVEHPATQIK-QPLSTKRTLLGWIAIAIFILCFTPVPF  339

ORF finding

No ORFs were found in reading frame 1.

No ORFs were found in reading frame 2.

>ORF number 1 in reading frame 3 on the direct strand extends from base 3 to base 749.
AATGATTTTACAGAGCACTGGATCCGAGGCCTGATCTACATGGTCTCGGTAATGGCAATT
TTGTCGGCGCATGAAGCAGGGCACTTTGTTGCAGCATGGCGTCATCGAATTCCTGCAACG
CTTCCATTTTTCTTACCGCTTCCAGTGATGCTAACTGGGACACTTGGCGCCGTAATTGGC
ATGGAAGGATCTCGGGCAGACAGAAAACAGTTATTTGATATCGCCTTAGCTGGACCTCTC
GCTGGTCTTCTTGTTGCGATTCCTGTTTTTGTAGCGGGGCTGGTGCTTGCTCAACCGGCA
GATAGCAGCCTGTTTTCAATGCCTTTACTTGCAACATGGCTTTTGAGACTTGTTCGGCCA
GATTTACCAGTAGGCCAGGTGCTTATCCCAAATGCGTTCTTGCTGGCTGGCTGGGTAGGT
TTTCTTGTAACTGGACTGAATATGATTCCCCTCAGCCAACTCGATGGTGGGCATATTAGC
CATGCTGTTTTTGGTCGGCGTTCGTGCTGGGTGGCCAGAAGTGTCCTCCTCGGAGCAATA
ACCGCTATTATTCTTGTAGGAGCTGATCATTGGGTTTTAATGGTTGTTTTAGTCACGTTT
ATGGGTGTCGATCACCCGCCCATTCGAAATGAATCGCAGCCGTTGGGCACCGCGAGAACA
ATTCTGGGCATTGCTTCATTTGTCATTCCGGTGATTACATTCATGCCGGAGCCGCTGCTG
CTGCCCGGATTCATTTTCATTCGTTGA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
NDFTEHWIRGLIYMVSVMAILSAHEAGHFVAAWRHRIPATLPFFLPLPVMLTGTLGAVIG
MEGSRADRKQLFDIALAGPLAGLLVAIPVFVAGLVLAQPADSSLFSMPLLATWLLRLVRP
DLPVGQVLIPNAFLLAGWVGFLVTGLNMIPLSQLDGGHISHAVFGRRSCWVARSVLLGAI
TAIILVGADHWVLMVVLVTFMGVDHPPIRNESQPLGTARTILGIASFVIPVITFMPEPLL
LPGFIFIR*

ORF search on the reverse strand

>ORF number 1 in reading frame 1 on the reverse strand extends from base 301 to base 513.
AACCCAATGATCAGCTCCTACAAGAATAATAGCGGTTATTGCTCCGAGGAGGACACTTCT
GGCCACCCAGCACGAACGCCGACCAAAAACAGCATGGCTAATATGCCCACCATCGAGTTG
GCTGAGGGGAATCATATTCAGTCCAGTTACAAGAAAACCTACCCAGCCAGCCAGCAAGAA
CGCATTTGGGATAAGCACCTGGCCTACTGGTAA

>Translation of ORF number 1 in reading frame 1 on the reverse strand.
NPMISSYKNNSGYCSEEDTSGHPARTPTKNSMANMPTIELAEGNHIQSSYKKTYPASQQE
RIWDKHLAYW*

>ORF number 1 in reading frame 2 on the reverse strand extends from base 20 to base 265.
TTCAACAAAGTTCACGGATCACCTCTCGGTTATGTAGCAGCCTTGGAACCAGTGCGTCTG
ACTCCATCTAAAACAATCTTAATCTATTGTGTGAGCGGAATGGCAGGGCGTCAACGAATG
AAAATGAATCCGGGCAGCAGCAGCGGCTCCGGCATGAATGTAATCACCGGAATGACAAAT
GAAGCAATGCCCAGAATTGTTCTCGCGGTGCCCAACGGCTGCGATTCATTTCGAATGGGC
GGGTGA

>Translation of ORF number 1 in reading frame 2 on the reverse strand.
FNKVHGSPLGYVAALEPVRLTPSKTILIYCVSGMAGRQRMKMNPGSSSGSGMNVITGMTN
EAMPRIVLAVPNGCDSFRMGG*

>ORF number 2 in reading frame 2 on the reverse strand extends from base 311 to base 586.
TCAGCTCCTACAAGAATAATAGCGGTTATTGCTCCGAGGAGGACACTTCTGGCCACCCAG
CACGAACGCCGACCAAAAACAGCATGGCTAATATGCCCACCATCGAGTTGGCTGAGGGGA
ATCATATTCAGTCCAGTTACAAGAAAACCTACCCAGCCAGCCAGCAAGAACGCATTTGGG
ATAAGCACCTGGCCTACTGGTAAATCTGGCCGAACAAGTCTCAAAAGCCATGTTGCAAGT
AAAGGCATTGAAAACAGGCTGCTATCTGCCGGTTGA

>Translation of ORF number 2 in reading frame 2 on the reverse strand.
SAPTRIIAVIAPRRTLLATQHERRPKTAWLICPPSSWLRGIIFSPVTRKPTQPASKNAFG
ISTWPTGKSGRTSLKSHVASKGIENRLLSAG*

>ORF number 3 in reading frame 2 on the reverse strand extends from base 587 to base 775.
GCAAGCACCAGCCCCGCTACAAAAACAGGAATCGCAACAAGAAGACCAGCGAGAGGTCCA
GCTAAGGCGATATCAAATAACTGTTTTCTGTCTGCCCGAGATCCTTCCATGCCAATTACG
GCGCCAAGTGTCCCAGTTAGCATCACTGGAAGCGGTAAGAAAAATGGAAGCGTTGCAGGA
ATTCGATGA

>Translation of ORF number 3 in reading frame 2 on the reverse strand.
ASTSPATKTGIATRRPARGPAKAISNNCFLSARDPSMPITAPSVPVSITGSGKKNGSVAG
IR*

>ORF number 1 in reading frame 3 on the reverse strand extends from base 495 to base 869.
GCACCTGGCCTACTGGTAAATCTGGCCGAACAAGTCTCAAAAGCCATGTTGCAAGTAAAG
GCATTGAAAACAGGCTGCTATCTGCCGGTTGAGCAAGCACCAGCCCCGCTACAAAAACAG
GAATCGCAACAAGAAGACCAGCGAGAGGTCCAGCTAAGGCGATATCAAATAACTGTTTTC
TGTCTGCCCGAGATCCTTCCATGCCAATTACGGCGCCAAGTGTCCCAGTTAGCATCACTG
GAAGCGGTAAGAAAAATGGAAGCGTTGCAGGAATTCGATGACGCCATGCTGCAACAAAGT
GCCCTGCTTCATGCGCCGACAAAATTGCCATTACCGAGACCATGTAGATCAGGCCTCGGA
TCCAGTGCTCTGTAA

>Translation of ORF number 1 in reading frame 3 on the reverse strand.
APGLLVNLAEQVSKAMLQVKALKTGCYLPVEQAPAPLQKQESQQEDQREVQLRRYQITVF
CLPEILPCQLRRQVSQLASLEAVRKMEALQEFDDAMLQQSALLHAPTKLPLPRPCRSGLG
SSAL*
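
The reverse-strand ORFs above are read on the reverse complement of the contig. The sketch below shows how such a sequence and its coordinates can be related back to the direct strand. It assumes that reverse-strand positions are counted from the 5' end of the reverse complement, which appears consistent with the listing above; the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def reverse_to_direct(start, end, length):
    """Map 1-based reverse-strand coordinates onto the direct strand.

    Assumption: reverse-strand position p corresponds to direct-strand
    position length - p + 1.
    """
    return length - end + 1, length - start + 1

# First 27 bases of the contig (direct strand).
contig_start = "CGAATGATTTTACAGAGCACTGGATCC"
rc = reverse_complement(contig_start)
print(rc)

# ORF number 1 in reading frame 3 on the reverse strand, contig length 878.
span = reverse_to_direct(495, 869, 878)
print(span)
```

The reverse complement of the contig start begins with GGATCCAGTGCTCTGTAA, which is how the ORF in reading frame 3 on the reverse strand ends, so that ORF overlaps the 5' end of the direct-strand ORF.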