ORF KS23670
Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!

| Sequence | |
|---|---|
| CAMERA AccNum: | JCVI_READ_1091140009608 |
| Annotathon code: | ORF_KS23670 |
| Sample: | |

| Authors | |
|---|---|
| Team: | Biochimie 2008 |
| Username: | damien |
| Annotated on: | 2008-06-04 11:01:22 |
Synopsis
- Gene symbol: Unknown gene symbol
- Biological Process: unknown biological process
- Molecular Function: unknown molecular function
- Taxonomy: Bacteria (NCBI info)
Rank: superkingdom - Genetic Code: Bacterial and Plant Plastid - NCBI Identifier: 2
Kingdom: Bacteria - Phylum: - Class: - Order:
Bacteria;
Genomic Sequence
>JCVI_READ_1091140009608 ORF_KS23670 genomic DNA
TTCAATGGGAATGAGTGGTGACTTGGATGAGGCAGTTATGGAGGGAAGTACCATGATTCGTGTGGGTTCAGCTTTGTATGGTAAACGATCATGACCTTTG
TTTGAAGTATGCTTGCACAATTTTATTTAAGTTGTTCATTAGATTGTAATTAAATGAATAATTCTAAGGAAAATTTGGAAAGTAATTACTGGCTAAGTCT
TCGTAAATTGATAGACGATGAATCTCCATTAGTCAGAAGTGCTTTGCTTGCTGAACTTAGAAAGCATCCGAAAGATGGAAAACTTTTTCTCGAAAATATA
ATTCAAAACCAAAAAGATATTCTTGCAAAATTCGCACTTTCATTAATTGAAAGTCTTGGTTGGAGTGATGGAGTTGGGAATTTTTTAAAATTCATTCGAT
CTCAAAGATACGAACTTGAATCGGGTTGTTTTTTGTTGGATCGAACGATTTTTCCAACTTTTGAAATCTCCTCATCCACTCTTTTTCTCGACCAACTCGC
AGATCGTGTTCGTGAGCTTTTAACTCCTCCCCAAAATGCTCGGGATATTTGTTCAGTAATCAATCGAGTTTTTTTTCATGAATTCGGATTTAGAGGAGCA
ACCAAGGATTTTGCTGATCCAAAAAATAGTTTTCTCCACCTTGTATTGGAAAGGAAAAGAGGTTTACCGATTTCTCTATCGGTAATTTACATTCTAATTG
CTCGAAGGGTAAGTTTAGACCTTGAGCCAATTGGACTTCCAGGAAGATTTATGGTTGGTAGCTTTACCGATGATCAACCATTTTACATTGATGTGTGGTC
AGGGGGTAAATTCTTTGATATCGACCAGATGGAGGAATTCCTAGGAAATTCAGAGCTTGAAAACTCTGGTTCATCTCTTCTTCCTGTAACAGTAGCAGAA
ACT
Translation
[154 - 903/903] direct strand
>ORF_KS23670 Translation [154-903 direct strand]
MNNSKENLESNYWLSLRKLIDDESPLVRSALLAELRKHPKDGKLFLENIIQNQKDILAKFALSLIESLGWSDGVGNFLKFIRSQRYELESGCFLLDRTIF
PTFEISSSTLFLDQLADRVRELLTPPQNARDICSVINRVFFHEFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPGRFM
VGSFTDDQPFYIDVWSGGKFFDIDQMEEFLGNSELENSGSSLLPVTVAET
[ Warning ] 3' incomplete: following codon is not a STOP
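The coordinates reported above can be checked with a little arithmetic. The sketch below is only an illustration: the helper name `orf_summary` is hypothetical and not part of Annotathon. It derives the reading frame and codon count from the 1-based coordinates 154 to 903, and flags that the ORF runs to the last base of the 903 bp read, which is why no stop codon follows it.

```python
def orf_summary(start, end, seq_len):
    """Summarise a direct-strand ORF given 1-based inclusive coordinates.

    Hypothetical helper for illustration; not an Annotathon function.
    """
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return {
        "frame": (start - 1) % 3 + 1,        # reading frame 1, 2 or 3
        "codons": length_nt // 3,            # number of translated codons
        "reaches_read_end": end == seq_len,  # True means no room for a stop codon
    }


summary = orf_summary(154, 903, 903)
print(summary)  # {'frame': 1, 'codons': 250, 'reaches_read_end': True}
```

The 250 codons agree with the 250-residue translation, and frame 1 agrees with the frame named in the annotator commentary.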
Phylogeny
-Tree with PHYLIP (tree): tree(s) from neighbor:
21 Populations Neighbor-Joining/UPGMA method version 3.6a2.1 Neighbor-joining method Negative branch lengths allowed
+--------NOSTOC PUNCTIFORME (cyanobacteria) ! ! +----------SYNECHOCOCCUS (cyanobacteria) ! ! ! ! +-----------------------physcomitrella patens (mosses) ! ! +--8 7-9 ! +---------------------chlamydomonas reinhardtii (green algae) ! ! ! ! ! ! +ANAEROMYXOBACTER (d-proteobacteria) ! ! ! +-----------------3 ! ! ! ! +ANAEROMYXOBACTER SP (d-proteobacteria) ! +-10 ! ! ! ! +ornithorynchus anatinus(monotremes) ! ! ! +--1 ! ! ! +------------2 +-------gallus gallus (birds) ! ! ! ! ! ! ! ! +------------4 +-------danio rerio (bony fishes) ! ! ! ! ! ! +---17 +--11 +-----------------------------apis mellifera (bees) ! ! ! ! ! ! +-14 +-----------------------------Translation (ORF) ! ! ! ! ! ! +-15 +-------------------------------NITROSOPUMILUS MARITIMUS (creanarchaeotes) ! ! ! ! ! ! ! +--------------------GEMMATA OBSC (planctomycetes) ! ! ! ! ! ! +----XYLELLA FASTIDIOSA (g-proteobacteria) ! +-18 +-5 ! ! +-------------6 +----XANTHOMONAS ORYZAE (g-proteobacteria) ! ! ! ! ! ! +-16 +----STENOTROPHOMONAS MALTOPHILIA (g-proteobacteria) ! ! ! ! ! ! ! +--------------------BURKHOLDERIA XENOVORANS (b-proteobacteria) ! +-19 ! ! +----------------------------------LEPTOSPIRA BORGPETERSENII(spirochetes) ! ! +-12 ! +-13 +----------------------------------MICROSCILLA MARINA (CFB group bacteria) ! ! ! +-------------------------------SYMBIOBACTERIUM THERMOPHILUM (firmicutes) ! +--------GLOEOBACTER (cyanobacteria)
///////////////////////////////////////////////////////////////////////////////////////////////////
-Tree with PHYLIP (distance): tree(s) from protpars:
Protein parsimony algorithm, version 3.6a2.1 One most parsimonious tree found:
+--------------chlamydomonas reinhardtii (green algae) ! ! +--------MICROSCILLA MARINA (CFB group bacteria) +----------16 +-20 ! ! ! ! +-----SYMBIOBACTERIUM THERMOPHILUM (firmicutes) ! 
! ! +-18 ! +-17 ! +--Translation (ORF) ! ! +-19 ! ! +--LEPTOSPIRA BORGPETERSENII(spirochetes) +----------------------13 ! ! ! +-----------physcomitrella patens (mosses) ! ! ! ! +-----danio rerio (bony fishes) ! ! +-10 ! ! ! ! +--apis mellifera (bees) ! +-----------------9 +-15 ! ! +--gallus gallus (birds) +--8 ! ! ! +--------ornithorynchus anatinus(monotremes) ! ! ! ! +--NITROSOPUMILUS MARITIMUS (creanarchaeotes) ! ! +----------------14 ! ! ! +--GEMMATA OBSC (planctomycetes) ! ! ! ! ! ! +--------BURKHOLDERIA XENOVORANS (b-proteobacteria) ! +----------------------------12 +----11 +--3 ! ! ! +-----STENOTROPHOMONAS MALTOPHILIA (g-proteobacteria) ! ! ! ! +--7 ! ! ! ! ! +--XANTHOMONAS ORYZAE (g-proteobacteria) ! ! +-----5 +--6 ! ! ! +--XYLELLA FASTIDIOSA (g-proteobacteria) ! ! ! +--2 ! ! +--ANAEROMYXOBACTER (d-proteobacteria) ! ! ! +-----------4 ! ! ! +--ANAEROMYXOBACTER SP (d-proteobacteria) ! ! ! 1 ! +-----------------------------------------------------SYNECHOCOCCUS (cyanobacteria) ! ! ! +--------------------------------------------------------NOSTOC PUNCTIFORME (cyanobacteria) ! +-----------------------------------------------------------GLOEOBACTER (cyanobacteria)
___________________________________________________________________________________________________
These two trees give different results. In the first (NJ), our sequence sits on an isolated branch that lies closer to the eukaryotes. In the second (Protpars), it is closer to the prokaryotes, and in particular to one bacterium (Leptospira). This makes it ambiguous which taxonomic group to assign to the ORF.
Annotator commentaries
We study a sequence from the East Coast of the United States, sampled on 17/11/03. The sequence is 903 base pairs long; we were unable to assign it a function.
ORF search:
For the ORF search we use the ORF Finder tool from SMS. We ask it to display only ORFs longer than 60 codons and to use the "standard" genetic code. The results show a single ORF on the direct strand, in reading frame 1, and no ORF on the reverse strand. We therefore continue the analysis with this single ORF, which is incomplete at its 3' end and is 250 amino acids long. Such a length suggests that the sequence is coding. According to the multiple alignment, our ORF appears to lack at least fifty or so amino acids at its 3' end: most of the sequences aligned with the ORF extend about 50 amino acids further. The missing part would thus amount to at least 20% of the ORF's current length (50 of 250 residues), or about 17% of the complete sequence. If these figures are right, the complete sequence would be about 300 amino acids long, with a theoretical mass of 110 Da x 300 = 33 kDa.
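The search settings described above (both strands, three frames each, minimum length of 60 codons, standard genetic code) can be sketched as follows. This is a minimal illustration and not the SMS ORF Finder implementation: the function names are hypothetical, ORFs are assumed to start at ATG, and ORFs that run off the end of the read are kept and flagged as lacking a stop codon. The mass estimate uses the 110 Da per residue average quoted in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(dna):
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def scan_strand(dna, strand, min_codons):
    """Return ATG-initiated ORFs found on one strand.

    Each hit is (strand, frame, start, end, codons, has_stop), with 1-based
    inclusive coordinates on the scanned strand and the stop codon excluded.
    """
    hits = []
    n = len(dna)
    for frame in range(3):
        start = None  # 0-based index of the current ATG, if any
        for i in range(frame, n - 2, 3):
            codon = dna[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                codons = (i - start) // 3
                if codons >= min_codons:
                    hits.append((strand, frame + 1, start + 1, i, codons, True))
                start = None
        if start is not None:  # ORF still open at the end of the read
            codons = (n - start) // 3
            if codons >= min_codons:
                hits.append((strand, frame + 1, start + 1,
                             start + 3 * codons, codons, False))
    return hits


def find_orfs(dna, min_codons=60):
    dna = dna.upper()
    return (scan_strand(dna, "+", min_codons)
            + scan_strand(reverse_complement(dna), "-", min_codons))


def estimate_mass_kda(n_residues, avg_residue_da=110.0):
    """Rough protein mass from an average residue mass (assumed 110 Da)."""
    return n_residues * avg_residue_da / 1000.0


# Toy read: one complete 6-codon ORF followed by an open-ended 5-codon ORF.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "ATG" + "AAA" * 4
orfs = find_orfs(toy, min_codons=5)
print(orfs)
print(estimate_mass_kda(300))  # 33.0
```

On the real read, a scan of this kind would be expected to report the open-ended ORF at 154-903 with `has_stop` set to False, matching the 3' incomplete warning.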
BLAST and taxonomy report:
We run one BLAST against (nr) and another against SwissProt. The latter is less informative: the SwissProt scores and E-values are low and vary widely. The best score is 81.6 with an E-value of 3e-15, yet the ninth best score is already down to 40.0, with an E-value of 0.011. In the BLAST against (nr), where the best score is 109 (E-value = 2e-22), there is no abrupt drop in scores or E-values, since the hundredth best hit still scores 68.9 with an E-value of 3e-10. We therefore continue the annotation with the (nr) results.
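The contrast between the two searches can be put in numbers. The sketch below is an illustration only; the function names are hypothetical and the inputs are the four bit scores quoted above. It computes the fraction of the best score lost by the Nth hit and the average loss per rank.

```python
def relative_drop(best_score, nth_score):
    """Fraction of the best bit score lost at the Nth hit."""
    return (best_score - nth_score) / best_score


def drop_per_rank(best_score, nth_score, n):
    """Average bit-score loss per rank between hit 1 and hit n."""
    return (best_score - nth_score) / (n - 1)


# Values quoted in the commentary above.
swissprot_drop = relative_drop(81.6, 40.0)      # best hit vs 9th hit
nr_drop = relative_drop(109.0, 68.9)            # best hit vs 100th hit
swissprot_slope = drop_per_rank(81.6, 40.0, 9)
nr_slope = drop_per_rank(109.0, 68.9, 100)

print(f"SwissProt: {swissprot_drop:.0%} lost by hit 9, {swissprot_slope:.1f} bits per rank")
print(f"nr:        {nr_drop:.0%} lost by hit 100, {nr_slope:.1f} bits per rank")
```

SwissProt loses about half of its best score within nine hits, whereas (nr) loses just over a third across a hundred hits, which supports the choice to continue with (nr).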
Relative to the ORF sequence, most HSPs fall between amino acids 70 and 240. The pairwise alignments let us see below which score the sequences are no longer homologous. Below the threshold of 68.9, the sequences show fewer similarities, and where similarities exist they span shorter stretches. The best scores for eukaryotes and prokaryotes are fairly close (87 and 109 respectively), and there is no abrupt drop between the scores of successive alignments: when they decrease, they do so gradually, by about 2 points from one line of the taxonomy report to the next. Eukaryotes are, however, much less represented. We take all living organisms as our study group and have no outgroup. Our study group therefore comprises 13 bacteria and one archaeon, plus 6 different eukaryotes, each sequence corresponding to a different score. We already note that a great many sequences listed in the taxonomy report have no identified function. For now, nothing can be predicted about the function of our sequence.
Multiple alignment and phylogeny:
The multiple alignment was built with Clustal W, and PHYLIP was used to build the phylogenetic trees. In the multiple alignment, most of the aligned sequences start to show similarities from the same position. Moreover, the indels fall at the same positions and almost always have the same length across sequences. Several of these sequences are therefore likely to be close homologs of the ORF. By contrast, some eukaryotic sequences (Gallus gallus, for example) are much longer than most of the others (ours included); these sequences may have followed a different evolutionary path in those eukaryotes. The two trees (Neighbor-joining and Protpars) give different results. In the NJ tree our sequence lies on the same branch as the eukaryotes, but we cannot say that it belongs to them, since other eukaryotes are mixed in with the bacteria. In the Protpars tree the picture is quite different: our sequence lies closer to the prokaryotes (Leptospira) and, with a few exceptions, eukaryotes and prokaryotes sit on separate branches. Given these results, it is hard to say which taxonomic group our sequence belongs to. Considering how the eukaryotes behave in the multiple alignment (some sequences being longer), the most likely case is that our sequence belongs to the prokaryotes.
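Neighbor-joining works from a matrix of pairwise distances computed on the alignment. The text does not say how that matrix was obtained, so the sketch below uses the simplest possible measure, the p-distance (fraction of differing residues over gap-free columns), purely as an illustration; PHYLIP's own distance programs use substitution models instead. The three rows are copied from one 50-column block of the Clustal W alignment in this record, and the function name is hypothetical.

```python
def p_distance(row_a, row_b):
    """Fraction of differing residues over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    differences = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return differences / compared


# One alignment block (50 columns) for three of the 21 sequences.
block = {
    "Translation": "ATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPGR",
    "LEPTOSPIRA":  "NNDQYDDPNNSFVTRIVHTRKGIPISLSTIYLLVAKRLSLPLYGVNMPLH",
    "gallus":      "NELDYYNSLNSYIHQVLIRRTGIPISLSVLYLTIARQLGVKLEPVNFPSH",
}

d_leptospira = p_distance(block["Translation"], block["LEPTOSPIRA"])
d_gallus = p_distance(block["Translation"], block["gallus"])
print(f"Translation vs LEPTOSPIRA: {d_leptospira:.2f}")
print(f"Translation vs gallus:     {d_gallus:.2f}")
```

A single 50-column block is far too short to settle the NJ versus Protpars disagreement; the point is only to show what kind of quantity a distance method consumes, in contrast to parsimony, which counts substitutions along a tree.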
Search for homologous domains:
We submit our sequence in FASTA format to InterPro and obtain no result. Prosite and Pfam give no results either, which leads us to say that our sequence has no clearly identifiable function. This is consistent with what we see in the BLAST (nr) output and the Taxonomy Report. The functional information attached to the matching sequences indicates that it could be a hypothetical protein, one that would be present throughout the living world.
Multiple Alignment
Study group comprising eukaryotes and prokaryotes (including one archaeon).
Prokaryotes:
- GLOEOBACTER (cyanobacteria)
- NOSTOC PUNCTIFORME (cyanobacteria)
- SYNECHOCOCCUS (cyanobacteria)
- ANAEROMYXOBACTER (d-proteobacteria)
- ANAEROMYXOBACTER SP (d-proteobacteria)
- XYLELLA FASTIDIOSA (g-proteobacteria)
- XANTHOMONAS ORYZAE (g-proteobacteria)
- STENOTROPHOMONAS MALTOPHILIA (g-proteobacteria)
- BURKHOLDERIA XENOVORANS (b-proteobacteria)
- GEMMATA OBSC (planctomycetes)
- LEPTOSPIRA BORGPETERSENII (spirochetes)
- SYMBIOBACTERIUM THERMOPHILUM (firmicutes)
- MICROSCILLA MARINA (CFB group bacteria)
- NITROSOPUMILUS MARITIMUS (creanarchaeotes)
Eukaryotes:
- ornithorynchus anatinus (monotremes)
- gallus gallus (birds)
- danio rerio (bony fishes)
- physcomitrella patens (mosses)
- apis mellifera (bees)
- chlamydomonas reinhardtii (green algae)
In the alignments and trees that follow, the line labelled Translation represents our ORF. In addition, prokaryotes are written in uppercase and eukaryotes in lowercase to simplify the analysis.
...................................................................................................
CLUSTAL W (1.82) multiple sequence alignment GLOEOBACTER -------------------------------------------------- NOSTOC -------------------------------------------------- SYNECHOCOCCUS -------------------------------------------------- ANAEROMYXOBACTER.DEHA -------------------------------------------------- ANAEROMYXOBACTER.SP -------------------------------------------------- XYLELLA -------------------------------------------------- XANTHOMONAS.ORYZAE -------------------------------------------------- STENOTROPHOMONAS -------------------------------------------------- ornithorynchus -------------------------------------------------- gallus -------------------------------------------------- danio -------------------------------------------------- BURKHOLDERIA -------------------------------------------------- GEMMATA -------------------------------------------------- physcomitrella MEPACANLARSQFSVMPVSLSPCCSKSNVCNTRVSSAFLQQGNVTTRSSL NITROSOPUMILUS -------------------------------------------------- apis -------------------------------------------------- chlamydomonas -------------------------------------------------- LEPTOSPIRA -------------------------------------------------- SYMBIOBACTERIUM -------------------------------------------------- Translation -------------------------------------------------- MICROSCILLA -------------------------------------------------- GLOEOBACTER -------------------------------------------------- NOSTOC -------------------------------------------------- SYNECHOCOCCUS -------------------------------------------------- ANAEROMYXOBACTER.DEHA -------------------------------------------------- ANAEROMYXOBACTER.SP -------------------------------------------------- XYLELLA -------------------------------------------------- XANTHOMONAS.ORYZAE -------------------------------------------------- STENOTROPHOMONAS -------------------------------------------------- ornithorynchus -------------------------------------------------- gallus 
---------------------------------MAAAEAERLCLTDLPGE danio -------------------------------------------------- BURKHOLDERIA -------------------------------------------------- GEMMATA -------------------------------------------------- physcomitrella LVTSTQAVQVGNYACCAQSCTHGVELNHGLGGRLVMHTDIGWHERLEVRD NITROSOPUMILUS -------------------------------------------------- apis -------------------------------------------------- chlamydomonas -------------------------------------------------- LEPTOSPIRA -------------------------------------------------- SYMBIOBACTERIUM -------------------------------------------------- Translation -------------------------------------------------- MICROSCILLA -------------------------------------------------- GLOEOBACTER -------------------------------------------------- NOSTOC -------------------------------------------------- SYNECHOCOCCUS -------------------------------------------------- ANAEROMYXOBACTER.DEHA -------------------------------------------------- ANAEROMYXOBACTER.SP -------------------------------------------------- XYLELLA -------------------------------------------------- XANTHOMONAS.ORYZAE -------------------------------------------------- STENOTROPHOMONAS -------------------------------------------------- ornithorynchus -------------------------------------------------- gallus LLELILCCDVLGAADIGRVACTCRRLREACQPRGKVWRERFRLRWPSLLK danio -------------------------------------------------- BURKHOLDERIA -------------------------------------------------- GEMMATA -------------------------------------------------- physcomitrella VHGHSARSKTSSTNVGCSRSLFDNLEGHWFRDQPRIGSTHSSAFGNSSPA NITROSOPUMILUS -------------------------------------------------- apis ---------------------------------------------MKDKK chlamydomonas -------------------------------------------------- LEPTOSPIRA -------------------------------------------------- SYMBIOBACTERIUM -------------------------------------------------- Translation 
-------------------------------------------------- MICROSCILLA -------------------------------------------------- GLOEOBACTER -------------------------------------------------- NOSTOC -------------------------------------------------- SYNECHOCOCCUS -------------------------------------------------- ANAEROMYXOBACTER.DEHA -------------------------------------------------- ANAEROMYXOBACTER.SP -------------------------------------------------- XYLELLA -------------------------------------------------- XANTHOMONAS.ORYZAE -------------------------------------------------- STENOTROPHOMONAS -------------------------------------------------- ornithorynchus -------------------------------------------------- gallus YYSHTDSVSWLEEYKARHNAGLEAQRIVASFSKRFFSEHVPCDGFSDIET danio -------------------------------------------------- BURKHOLDERIA -------------------------------------------------- GEMMATA -------------------------------------------------- physcomitrella STSGGIHRRGSPWGSKRPRSRNCCCSSEGMGGDKNETGAGGEEKYEVSRI NITROSOPUMILUS -------------------------------------------------- apis LFSLDELWPLLKEVYQTSKELEHQMINWKEEVKISLNKFDFLFCPKEGAH chlamydomonas ------------------------------------------MDSGDSGE LEPTOSPIRA MTHFVFRKNIFFMSLSDSYHDSFHLPSDKLEGKFYQLEFSGTEEKIRVIR SYMBIOBACTERIUM -------------------------MRTEGELRALVSLLADEQEGVARAA Translation ---------------------MNNSKENLESNYWLSLRKLIDDESPLVRS MICROSCILLA ---------------------------MNQNEIKALVSLLDDDDYEVINH GLOEOBACTER ------------------------------------MEQSSARQRFIRVL NOSTOC ------------------------------------MNFSSARQYFYQEI SYNECHOCOCCUS ------------------------------------MSIPAARQRFALEI ANAEROMYXOBACTER.DEHA ---------------------------------MTATRDLEGRARARFAE ANAEROMYXOBACTER.SP ---------------------------------MTATRDLEGRARARFAE XYLELLA ------------------------------------MVEQLLLPLWNSLA XANTHOMONAS.ORYZAE ------------------------MNPAPLADTLGSMVDILPLPHWTALA STENOTROPHOMONAS ------------------------------------MQDRISLPDWDALA ornithorynchus 
--MTPSYFEDSALLVLHLFLRKALTWKYYAKKILYFLRQQEILKSLKAFL gallus LGCPSHFFEDELMCILNMEGRKGLTWKYYAKKILYFLRQQNILKNLKEYL danio ---------------------KSLTLKYYAKKILYFFRQQNILRNLKVFL BURKHOLDERIA ---------------------------------------MTRVLDYFSTL GEMMATA ---------------------------------------MKLSAALEALS physcomitrella PNEGVTNRATPSSKSDRVADRSAAQVGTLESELFLYTGRMRALENFRAEV NITROSOPUMILUS -----------------------------------MEKEFDPFVAEWFSF apis PLAYYFLIDELTTLIKCPAIVSNLTHRYYALKVIRYLKQTHLKGEWQKFI chlamydomonas GPPSGGAERGGPGSEAGAGSSSGAPGGEAGEPLTLGAGVLSRMRQLRALD LEPTOSPIRA EIADMIPWQFEIDEFIEEFKDPTLRIFARSISSIVHMERINTRYSLLADK SYMBIOBACTERIUM WDELLGAGQAAVPFLEAAFDDPDPRLRGRVRALLEELRLEALELQWRRLA Translation ALLAELRKHPKDGKLFLENIIQNQKDILAKFALSLIESLGWSDGVGNFLK MICROSCILLA IKEKIISMGDVLIPFLETAWENNFSPVVQRRIEELIHTLQLQSLQERFRD GLOEOBACTER QQPEVQIDLAEAALYIALEAYPGLDV--EEYLNALDTMAEEVRERIEGER NOSTOC QQSDEQIDLAKAALYIAQEEYPKLDP--EEYLNALDTMAWELQERLPDSR SYNECHOCOCCUS Q--SEPIDLGRAALWIAQEAYPDLEV--EEYVAALDEMAAEVQERLPPER ANAEROMYXOBACTER.DEHA LLGRDPVPLDEAALAIAQEEYPALEP--EAYLTRLDELAARVVRRVPGPV ANAEROMYXOBACTER.SP LLGRDPVPLDHAALAIAQEEYPALEP--EAYLTRLDDLAARVVRRVPGPV XYLELLA TVDDETLPLMSTALLIARDEYPDLDA--NLYDTLVQSYVEYLRSEVEEIS XANTHOMONAS.ORYZAE SLDDDAVPLMSTALLIARDEYPQLDA--VLYDTLVQSHVEHLRREVDAID STENOTROPHOMONAS DLEDEALPLLPTALLIARDEYPDLQP--STYDALIQSHVDHLRSEVDSID ornithorynchus ERAENDESFLEGAVLIDRYCNPLSDINLEEIRAQVDDIADRVRLALRGKN gallus QRPADQQSFLEGAVLIDQYCNPLSDICLKSVQAQVDEITDKVRKVLRTKN danio ERPPEQQSALEGAVLVDQYCNPLTDVTFESISAQVEEITDKVKKCLRAQN BURKHOLDERIA VAEDDSLPLTEAALSLAQDAYPDLDL--QGALAEIDELVVRLQRRMPDDA GEMMATA VDPSAETDLARIALLIARDAYSGMNP--RAYLRRIERLAEQLRPRLKGSL physcomitrella AKPDEEISLVRAAVFVAQHLYPRITA--EEVEAEMKEMADELEPLLPPPA NITROSOPUMILUS VKNPNFNLVEKCLKFAQILEYPDLDI--EKHIEKITKIGMSLKESISDVK apis SLPSKQQTLERGATIVAQWSQPERHVSYPTISSILDNIAEQTKDLLKEQH chlamydomonas EYRQLMEQLAEGALLIARHRYPELDE--AEVRGVLDDMARKASALLPAEA LEPTOSPIRA GHVNDYGDLEEAVFLLSSVGDPKASY--HEFKIYLDKLALRVEELYDLNP SYMBIOBACTERIUM 
QKDDDALDLEEGCILLAALGGTVGRE--RAVASFLDAVAGSVRAHMPAVG Translation FIRSQRYELESGCFLLDRTIFPTFEIS--SSTLFLDQLADRVRELLTPPQ MICROSCILLA WKDNEQEDLLKGIWLVACFQYPDLDLR--KVTKELDKIYHEVWLSHRAYA . : GLOEOBACTER -----------------------------YPLRMLQGINRYLYDDLGFRG NOSTOC -----------------------------YPLRIVQSINQYLYDDLKFSG SYNECHOCOCCUS -----------------------------YPLRVIKILNHYLFEDLGFRG ANAEROMYXOBACTER.DEHA -----------------------------RAASALRALREVLHDEEGLRG ANAEROMYXOBACTER.SP -----------------------------RAASALRALREVLHDEEGLRG XYLELLA -----------------------------LWPLKMAAVNRYLFQKLGYSG XANTHOMONAS.ORYZAE -----------------------------PWPLKMAAVNRYLFDELGYSG STENOTROPHOMONAS -----------------------------NSPLKMAAINRHLFDELGYSG ornithorynchus ARHPSVSSEAGGTSTVAD---------GELQRQVLDAINQVLYEQLKYRG gallus PRHPSVASKAG-EILIPE---------VELQRQVLDAMNCVLYEQLKYKG danio AAHPSLRAGQGECFVLED---------FEFQRQVICAVNTVLYEQLQYKG BURKHOLDERIA -----------------------------DIKQKVGILNRFFFRELGFAS GEMMATA -------------------------------AARTAELSTFLFEECGFAG physcomitrella ER---------------------------YTMRMINSINRYLYGQLGYKG NITROSOPUMILUS N-----------------------------PTYLISMLNEHLFENLGYSG apis PNHSIFSIPTERFIFWKNNIIDDNQWNISETRQVTDALCKVLFEKLGFYG chlamydomonas ERR--------------------------YPLRVVAAINRVMYVDYGFRG LEPTOSPIRA EYVS--------------------------EELKVHFLTRVLSSEENFQG SYMBIOBACTERIUM ---------------------------------GLRAMGEVLFENLRFRA Translation -----------------------------NARDICSVINRVFFHEFGFRG MICROSCILLA S-----------------------------PHDKVKNLNNILFTKLGFSS : : . . 
GLOEOBACTER NEEEYYDPRNSFLNEVIDRRTGIPITLALVYLEIARRVDFPMVGVSMPGH NOSTOC NKIDYYDPRNSFFNDVIDRRLGIPITLALVYLEVARRIDFPMVGVGMPGH SYNECHOCOCCUS NREDYYDPRNSFLNEVIDRRTGIPITLSLIYLELARRIDFPMAGVGMPGH ANAEROMYXOBACTER.DEHA NDDDYYDPRNSFLNDVLDRRLGIPITLALVYMEVGRRAGLRLEGVGFPGH ANAEROMYXOBACTER.SP NDDDYYDPRNSFLNDVLDRRLGIPITLALVYMEVGRRAGLRLEGVGFPGH XYLELLA NHDEYYDPRNSYLNQVFERRLGNPISLAVIQIEVARRLGIPLDGVSFPGH XANTHOMONAS.ORYZAE NHEQYYDPRNSYLNQVFERRLGNPISLAMVQIEVARRLGIPLAGVSFPGH STENOTROPHOMONAS DHDEYYDPRNSYLNQVFERRLGNPISLALVQMEVARRLGIPLDGVSFPGH ornithorynchus NQADYYNALNLFIHQVLIRRTGIPISLSVVYLTVARQLGVPLEPVNFPGH gallus NELDYYNSLNSYIHQVLIRRTGIPISLSVLYLTIARQLGVKLEPVNFPSH danio NERDYYNPLNSYIHQVLLRRTGIPISLSVLYMTLARKLGVPLEPVNFPNH BURKHOLDERIA NLNDYYDPDNSHLNVVLKRRRGIPISLAVLYLEMAEQIGIPVRGVSFPGH GEMMATA NTEDYYDPRNSYLNKVLDRQVGLPIALSVLAMEVGRRAGLDVVGVGLPGH physcomitrella -ATNYLDPDNSCINMVLKRREGLPLTMSLLYMELAKRVGLPMQGVNLPAH NITROSOPUMILUS DDDDYYNPKNNFLNEVIDKKTGLPITISILYAEVAKFIGLDLKIVGFPSH apis NSEMYYSSENSFIDRVLERRRGIPITLAIVFESVARRLGIRCEPVSFPSH chlamydomonas NQEDYYSADNSCINKVVERRVGIPITLSLVYMEVGRRLGLTLRGVNLPGH LEPTOSPIRA NNDQYDDPNNSFVTRIVHTRKGIPISLSTIYLLVAKRLSLPLYGVNMPLH SYMBIOBACTERIUM --GEMGNPEHHFLPSVLERRRGIPIALAAVYILVGRRVGLPVYGVAMPDH Translation ATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPGR MICROSCILLA NTQNFHSPGNSMINIVLESRKGNPISLCIVYMLIAQKLKMPIYGINLPNI .. : . :. : * *:::. : :.. . 
: :* GLOEOBACTER FLIRPDRSDM-------EIHIDAFHRGEILFREDCGERLERIYG-RRLEL NOSTOC FLIRPNIPDI-------EIFVDAFNGGEIIFAQDCEERLSQIYQ-QTVTL SYNECHOCOCCUS FLIRPLFAGA-------EIFVDPFQQGEILFPEDCQELLSQIYG-PGIPF ANAEROMYXOBACTER.DEHA FLAKYVSPGGV------EVFVDAYHGGEMLSADECVARYKARTG-GKDLD ANAEROMYXOBACTER.SP FLAKYVSPGGV------EVFVDAYHGGEMLSADECVARYKARTG-GKDLD XYLELLA FLVRLPVDDG-------ILVMDPFNGGRPLDAEELRERARPHLG-GEAPD XANTHOMONAS.ORYZAE FLVRLPVGDG-------VLVMDPFNGGRPLGVDELRERARPHLG-GEIPD STENOTROPHOMONAS FLVRLPVDDG-------VLVMDPFNGGRPLDVDELRERAKSHLG-GQMPD ornithorynchus FLLRWGQGAKGSPDIFDYTYIDAFGKGKRLTVKECEYLIGHHVT-EEFYG gallus FLLRWCQGKEGSTDIFDYTYIDAFGKGKQLTVKECEYLIGHHVT-EEFYG danio FLLRWCQNQRRCGDIYDYIYIDAFGNGKQLAAKECEILIKHQAT-ADYYS BURKHOLDERIA FLLRVTTPDG-------DVMLDPTSGHSLSESEMVEMLEPYVAS-AGESV GEMMATA FIVKAVEGNE-------EVLLDPFNGGQFLDIEGCEALVGGVTG-RPFEA physcomitrella FMCRPTGDGL-------EFFIDAHANGKITFLQDAEERLSVVYG-VPVEI NITROSOPUMILUS ILVKYNEEMI----------LDPFYDGRLLDIDDLQEILDTNYG-GEIEF apis FLLRWKETYAPEFKDTENYYIDVFNGGQFLTKKNCPRIGGVSRCPIEKYN chlamydomonas FMIQPVCYPA-----------------------EAEERLSELLG-APVRI LEPTOSPIRA FLLHFDSPDY-------ETFIDPFHGGVLLDKSTCIRFLEANSF---TPS SYMBIOBACTERIUM FLAMYAEADR-------PAYVDCFNQGQVYRYETLSRILSRRGV------ Translation FMVGSFTDDQ-------PFYIDVWSGGKFFDIDQMEEFLGNSEL------ MICROSCILLA FVLTYKSEET-------QFYINAFNRGLIFSRSEIDNYIAQLNI------ :: GLOEOBACTER HPAFFEAVGA---RRFLARMLTNLKFAYWVRSDWQSALGTVERLLVIFPN NOSTOC QPEFLAVVSN---RQFLARMLTNIKFIYLKQQELEKTLAAIERILLLFPN SYNECHOCOCCUS QEHHLRPTPP---RLILVRLLNNLKQIYLSRAELEPALAAAERILLLIPE ANAEROMYXOBACTER.DEHA ARYLAAVSP----RQLLARMLQNLKRVYAERKDDVRLFWVLDRILLVTPD ANAEROMYXOBACTER.SP ARYLAAVSP----RQILARMLQNLKRVYAERKDDVRLFWVLDRILLLTPG XYLELLA DRALAQILNPAPHRTILVRILRNLHSVYANTDRWDRAARCADRILKLVPN XANTHOMONAS.ORYZAE DRALAQILDPAPHRAILIRILRNLHGVYADAEHWDRAARSADRILKLVPD STENOTROPHOMONAS DQVLAQILDPAPARAILMRMLRNLHGVYAEAGEWDRAARSADRLLKLAPE ornithorynchus VVNAKEVLQR-----MVGNLLSLGKR-ERTDQSYQLLRDSLDLYLAMYPD gallus VVTSKEVLQR-----MVGNLLNLGKR-ESTDQSYQLLRDSLDLYLAMYPD 
danio AISTSELLLR-----MVGNLLNIGMRGEGNEKSYQLLRDSLDLYLSINPD BURKHOLDERIA SRALRMLLQPATRREIIARMLRNLKSTYLQTERWQRLLAVQQRLVILLPE GEMMATA TPEALAATPPG---AIVARMLQNLKTAYLAERDYRRGARVTRRLTQLVPA physcomitrella NPEFLKHTSITN-RAFLLRLLFNLKRIYFERKDPVSTLCIIDYLKIVRPG NITROSOPUMILUS KPEFLDEITH---EQILVRLTRNLKNSYVQSFVYDKALRCVNMVLAIEPE apis VHEEATAVEVVTRMANNLEIAARQHTHINGTHRTARLRSALELRYMIQPN chlamydomonas DPQLLKESRPLPPRTFLLRMLSNLRSIYLSTQQVDSLLTVVKFMRATLEA LEPTOSPIRA ERYFTRASTLS----IIKRMYRNLIHIYRKEQFRDMEDILSRQLLILENK SYMBIOBACTERIUM -VAPERVLSPCSTRVILYRMLGNLERLYTGVGQSRMVGRVQRWREILVVK Translation ----------------------------------------ENSGSSLLPV MICROSCILLA -PTTNRFYYPCSNLEIIQRIMRNLVVAFEKLNEDEKARELKELLQVLDA- GLOEOBACTER TPAEWRDRGILHYRLGHPTAARADLENYLRSAPGAE-DSARIRQLLDELQ NOSTOC LTLELRDRGLISYQLGNYPQAVNDLQHYLAKVPDAQ-DASVIRRLLTELG SYNECHOCOCCUS SLPHLRDRGLLYYQLGRWQQACQDLKRYLKQAPFHP-EINRSDEHLIREI ANAEROMYXOBACTER.DEHA QREALRDRGLAAARLGGAAAAIRDLEAYLSLAPAAG-DAEEVRAAVAGLR ANAEROMYXOBACTER.SP QREALRDRGLAAARLGGAAAAIRDLEAYLSLAPAAAGDAEEIRAAVAGLR XYLELLA QPEALRDRGLAYLQLGHRSGALNDLKRYLQLYPSTH-NVDMVRGHLVDLS XANTHOMONAS.ORYZAE QPEALRDRGMAYLHLGHRNGARHDLTRYLVLNPNAQ-DAANLHEHLVELN STENOTROPHOMONAS QDDALRDRGLAYLQLEYLAGARHDLGQYLKRNPEAS-DAQWLREKLIDLG ornithorynchus HVQHLLLQARLYFHLGIWPEK----------------------------- gallus NVQHLMLQARLYFHLGIWPEKVLDILQHIQALDPSQHGAVGYLVQHTLEH danio NVQYLLLQARLYFHLGIWPEKVLDILQHIQALDPSQHGAVGYLVQHTLEH BURKHOLDERIA SIEEVRDRGFAYARLDYLRPALEDLERYLGDRPDAE-DATVVESQLHELR GEMMATA DASQQRDLGVLLVQAEQFGRAVDPLRAYLRAEPGAEDAADVRKFLYRALN physcomitrella VIEETRDYGICLYLLNRFSEAIPCLEAYIQEAPRAT-DAESMRSLLTKMR NITROSOPUMILUS SPDDIRDKGILEERLLNYDSALKYLNKYLEINPNAEDVDFILELIRSIKT apis DTNTVLQLGRIYMSQHMDLTELDLELMSRGQANMISQTFKTLQKCQKRLQ chlamydomonas AAP---------------EAAAAPVDPSLVSPEEAQTVAAVLEEARRRMR LEPTOSPIRA LKA----------------------------------------------- SYMBIOBACTERIUM GRP----------------------------------------------- Translation TVAET--------------------------------------------- MICROSCILLA 
-------------------------------------------------- GLOEOBACTER KQDR---------------------------------------------- NOSTOC RD------------------------------------------------ SYNECHOCOCCUS VERLESHPDS---------------------------------------- ANAEROMYXOBACTER.DEHA AGRGALLN------------------------------------------ ANAEROMYXOBACTER.SP AGRGALLN------------------------------------------ XYLELLA NERIQTH------------------------------------------- XANTHOMONAS.ORYZAE SQRARAH------------------------------------------- STENOTROPHOMONAS GPVPRLH------------------------------------------- ornithorynchus -------------------------------------------------- gallus IERRKEEVGPEVKHRSDEKHKEVCFSIGLIMKHKRYGYNCVIYGWDPACM danio IQHKRHPVEPEVKRRSAPEHRDVQFSTGLIMKHKRSGYNCVIYGWDPKCT BURKHOLDERIA QRTQHNDRD----------------------------------------- GEMMATA EVARWN-------------------------------------------- physcomitrella RDKMS--------------------------------------------- NITROSOPUMILUS KN------------------------------------------------ apis PKVIIPSVK---------------YAIGLIMKHKIYGYLCVITGWDVRCM chlamydomonas LNRPDRD------------------------------------------- LEPTOSPIRA -------------------------------------------------- SYMBIOBACTERIUM -------------------------------------------------- Translation -------------------------------------------------- MICROSCILLA -------------------------------------------------- GLOEOBACTER -------------------------------------------------- NOSTOC -------------------------------------------------- SYNECHOCOCCUS -------------------------------------------------- ANAEROMYXOBACTER.DEHA -------------------------------------------------- ANAEROMYXOBACTER.SP -------------------------------------------------- XYLELLA -------------------------------------------------- XANTHOMONAS.ORYZAE -------------------------------------------------- STENOTROPHOMONAS -------------------------------------------------- ornithorynchus -------------------------------------------------- gallus 
MGHEWIRNMNVHSLPHGPHQPFYNVLVEDGSCRYAAQENLEHNSEPREIP danio MSQEWINTMRVHQLSKGADQPFYNVLVQDGTCRYAAQENLEPHSAPLEIA BURKHOLDERIA -------------------------------------------------- GEMMATA -------------------------------------------------- physcomitrella -------------------------------------------------- NITROSOPUMILUS -------------------------------------------------- apis ASTEWMNEMNVDGLEEGADQPFYKIFVDDGSCQYAAQENLLLAPNPEWIN chlamydomonas -------------------------------------------------- LEPTOSPIRA -------------------------------------------------- SYMBIOBACTERIUM -------------------------------------------------- Translation -------------------------------------------------- MICROSCILLA -------------------------------------------------- GLOEOBACTER -------------------------------------------------- NOSTOC -------------------------------------------------- SYNECHOCOCCUS -------------------------------------------------- ANAEROMYXOBACTER.DEHA -------------------------------------------------- ANAEROMYXOBACTER.SP -------------------------------------------------- XYLELLA -------------------------------------------------- XANTHOMONAS.ORYZAE -------------------------------------------------- STENOTROPHOMONAS -------------------------------------------------- ornithorynchus -------------------------------------------------- gallus HPDIGRYFSEFTGTHYLANTELEIRYPEDLELTCTTVQKIYSSGKERMTA danio HPEIGRYFNEFSETHYISNEELQARYPEDMCKTNRTVEELYHSLTPNSGQ BURKHOLDERIA -------------------------------------------------- GEMMATA -------------------------------------------------- physcomitrella -------------------------------------------------- NITROSOPUMILUS -------------------------------------------------- apis HHAIGRYFYKFSGAHYIPNEEKAKEYPEDEKVCNELIVEYMQNGITYNTT chlamydomonas -------------------------------------------------- LEPTOSPIRA -------------------------------------------------- SYMBIOBACTERIUM -------------------------------------------------- Translation 
-------------------------------------------------- MICROSCILLA -------------------------------------------------- GLOEOBACTER ---------------- NOSTOC ---------------- SYNECHOCOCCUS ---------------- ANAEROMYXOBACTER.DEHA ---------------- ANAEROMYXOBACTER.SP ---------------- XYLELLA ---------------- XANTHOMONAS.ORYZAE ---------------- STENOTROPHOMONAS ---------------- ornithorynchus ---------------- gallus ASQGWKQTGV------ danio SPEPTANFQDHDLLDP BURKHOLDERIA ---------------- GEMMATA ---------------- physcomitrella ---------------- NITROSOPUMILUS ---------------- apis ---------------- chlamydomonas ---------------- LEPTOSPIRA ---------------- SYMBIOBACTERIUM ---------------- Translation ---------------- MICROSCILLA ----------------
___________________________________________________________________________________________________
The alignment shows that most sequences start and end at the same positions. Three amino acids are conserved in all sequences. The indels fall at the same positions and have roughly the same lengths across sequences. This could reflect homology between the sequences. Among the eukaryotes, several have much longer sequences, which start well before the others (physcomitrella), end well after them (danio rerio), or even start well before and end well after (gallus gallus)...
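The count of fully conserved positions corresponds to the columns that Clustal W marks with `*` in its consensus line. A minimal sketch of that count is given below, on a small made-up alignment rather than the 21 real sequences; the function name is hypothetical.

```python
def count_conserved_columns(rows):
    """Count columns where every row carries the same residue and no gap."""
    if not rows:
        return 0
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("aligned rows must have the same length")
    conserved = 0
    for column in zip(*rows):
        residues = set(column)
        if len(residues) == 1 and "-" not in residues:
            conserved += 1
    return conserved


# Made-up toy alignment: columns 1, 3 and 5 are identical in all rows.
toy_alignment = [
    "MKP-LS",
    "MRPALS",
    "MKPGLT",
]
n_conserved = count_conserved_columns(toy_alignment)
print(n_conserved)  # 3
```

Applied to the full alignment, the same count should reproduce the three `*` marks, provided the rows of each block are first joined into one string per sequence.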
BLAST
Blastp against the (nr) database:

BLASTP 2.2.18 (Feb-03-2008)

Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis, Alejandro A. Schäffer, and Yi-Kuo Yu (2005) "Protein database searches using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

RID: WRHCKA2S014
Database: All non-redundant GenBank CDS translations+PDB+SwissProt+PIR+PRF excluding environmental samples from WGS projects
6,283,845 sequences; 2,146,951,973 total letters

Query= Translation of ORF number 1 in reading frame 1 on the direct strand.
Length=250

Sequences producing significant alignments:  Score (Bits)  E Value
ref|YP_590415.1| hypothetical protein Acid345_1339 [Acidobact...  109  2e-22
ref|YP_473862.1| hypothetical protein CYA_0380 [Synechococcus...  100  8e-20
ref|YP_478074.1| tetratricopeptide repeat protein [Synechococ...  98.2  5e-19
ref|ZP_02735526.1| hypothetical protein GobsU_27206 [Gemmata ...  93.6  1e-17
ref|YP_001518509.1| hypothetical protein AM1_4212 [Acaryochlo...  89.7  1e-16
ref|ZP_02326065.1| conserved hypothetical protein [Anaeromyxo...  87.8  6e-16
ref|ZP_00518391.1| similar to Uncharacterized conserved prote...  87.8  6e-16
ref|XP_001783555.1| predicted protein [Physcomitrella patens ...  87.4  7e-16
ref|YP_323133.1| hypothetical protein Ava_2624 [Anabaena vari...  86.7  1e-15
ref|ZP_01618704.1| hypothetical protein L8106_04536 [Lyngbya ...  86.7  1e-15
ref|NP_925439.1| hypothetical protein glr2493 [Gloeobacter vi...  86.7  1e-15
ref|XP_396942.3| PREDICTED: similar to F-box only protein 21 ...  86.7  1e-15
ref|ZP_01629100.1| hypothetical protein N9414_05744 [Nodulari...  86.7  1e-15
ref|ZP_02174933.1| conserved hypothetical protein [Anaeromyxo...  86.3  2e-15
ref|NP_484068.1| hypothetical protein alr0024 [Nostoc sp. PCC...  86.3  2e-15
ref|ZP_01644664.1| conserved hypothetical protein [Stenotroph...  86.3  2e-15
ref|ZP_01731197.1| hypothetical protein CY0110_12327 [Cyanoth...  86.3  2e-15
ref|ZP_01883302.1| hypothetical protein PBAL39_09731 [Pedobac...  85.5  3e-15
ref|YP_001582702.1| hypothetical protein Nmar_1368 [Nitrosopu...  85.5  3e-15
ref|YP_001379739.1| hypothetical protein Anae109_2554 [Anaero...  84.7  6e-15
ref|YP_464426.1| hypothetical protein Adeh_1215 [Anaeromyxoba...  84.3  7e-15
ref|XP_001505558.1| PREDICTED: similar to F-box only protein ...  83.6  1e-14
ref|YP_001120469.1| hypothetical protein Bcep1808_2642 [Burkh...  83.6  1e-14
ref|ZP_00681733.1| TPR repeat [Xylella fastidiosa Ann-1] >gb|...  83.2  1e-14
ref|YP_001354211.1| hypothetical protein mma_2521 [Janthinoba...  82.8  2e-14
ref|YP_560225.1| hypothetical protein Bxe_A0761 [Burkholderia...  82.8  2e-14
ref|ZP_01499874.1| conserved hypothetical protein [Burkholder...  82.4  3e-14
ref|YP_001278485.1| hypothetical protein RoseRS_4192 [Roseifl...  82.0  3e-14
ref|ZP_00651027.1| TPR repeat [Xylella fastidiosa Dixon] >ref...  81.6  4e-14
ref|ZP_01511058.1| conserved hypothetical protein [Burkholder...  81.6  4e-14
emb|CAO88150.1| unnamed protein product [Microcystis aeruginosa  81.6  4e-14
sp|Q87DH6|Y709_XYLFT UPF0162 protein PD_0709  81.6  4e-14
ref|NP_778929.1| hypothetical protein PD0709 [Xylella fastidi...  81.6  4e-14
sp|Q9PD85|Y1494_XYLFA UPF0162 protein XF_1494  81.6  4e-14
ref|NP_298783.1| hypothetical protein XF1494 [Xylella fastidi...  81.6  4e-14
ref|XP_001697344.1| hypothetical protein CHLREDRAFT_192924 [C...  81.3  6e-14
ref|YP_797022.1| hypothetical protein LBL_0492 [Leptospira bo...  80.9  7e-14
ref|YP_720431.1| hypothetical protein Tery_0503 [Trichodesmiu...  80.9  8e-14
ref|YP_000890.1| hypothetical protein LIC10915 [Leptospira in...  80.9  8e-14
ref|NP_713397.1| hypothetical protein LA3217 [Leptospira inte...  80.1  1e-13
ref|YP_370119.1| hypothetical protein Bcep18194_A5881 [Burkho...  79.7  2e-13
ref|ZP_01693284.1| hypothetical protein M23134_04875 [Microsc...  79.3  2e-13
ref|YP_365261.1| hypothetical protein XCV3530 [Xanthomonas ca...  79.3  2e-13
ref|NP_643720.1| hypothetical protein XAC3413 [Xanthomonas ax...  79.3  2e-13
ref|ZP_00111354.1| COG2912: Uncharacterized conserved protein...  79.0  3e-13
ref|YP_001660836.1| hypothetical protein MAE_58220 [Microcyst...  78.6  3e-13
emb|CAP50264.1| conserved hypothetical protein [Xanthomonas c...  78.6  3e-13
ref|NP_638613.1| hypothetical protein XCC3267 [Xanthomonas ca...  78.6  3e-13
ref|YP_074740.1| hypothetical protein STH911 [Symbiobacterium...  78.6  4e-13
ref|YP_001430198.1| hypothetical protein Rcas_0042 [Roseiflex...  78.6  4e-13
ref|ZP_01517496.1| conserved hypothetical protein [Comamonas ...  78.6  4e-13
ref|YP_969535.1| hypothetical protein Aave_1169 [Acidovorax a...  78.2  5e-13
ref|ZP_00987301.1| COG2912: Uncharacterized conserved protein...  77.8  6e-13
ref|XP_696130.2| PREDICTED: similar to mKIAA0875 protein, partia  77.8  6e-13
ref|YP_630196.1| tetratricopeptide repeat protein [Myxococcus...  77.8  7e-13
gb|EAY68084.1| hypothetical protein BDAG_00787 [Burkholderia dol  77.8  7e-13
ref|ZP_02354727.1| hypothetical protein BoklE_04548 [Burkhold...  77.0  1e-12
ref|ZP_02361908.1| hypothetical protein BoklC_04253 [Burkhold...  77.0  1e-12
ref|XP_425297.2| PREDICTED: similar to KIAA0875 protein [Gallus  77.0  1e-12
ref|YP_001619054.1| hypothetical protein sce8404 [Sorangium c...  76.6  1e-12
ref|NP_824307.1| hypothetical protein SAV_3131 [Streptomyces ...  76.6  2e-12
ref|YP_199764.1| hypothetical protein XOO1125 [Xanthomonas or...  76.3  2e-12
ref|ZP_01466024.1| conserved hypothetical protein [Stigmatell...  76.3  2e-12
ref|YP_677380.1| hypothetical protein CHU_0753 [Cytophaga hut...  75.9  2e-12
gb|ABZ93155.1| Conserved hypothetical protein [Leptospira bif...  75.5  3e-12
ref|ZP_02244494.1| hypothetical protein Xoryp_18060 [Xanthomo...  75.5  3e-12
ref|YP_001100703.1| hypothetical protein HEAR2453 [Herminiimo...  75.5  3e-12
ref|XP_001636708.1| predicted protein [Nematostella vectensis...  75.5  3e-12
ref|YP_001563208.1| hypothetical protein Daci_2183 [Delftia a...  74.7  5e-12
ref|ZP_01720679.1| hypothetical protein ALPR1_15704 [Algoriph...  73.9  8e-12
ref|YP_296972.1| hypothetical protein Reut_A2767 [Ralstonia e...  73.9  9e-12
ref|ZP_01554972.1| conserved hypothetical protein [Burkholder...  73.6  1e-11
ref|ZP_02209215.1| conserved hypothetical protein [Polynucleo...  73.2  1e-11
ref|YP_550374.1| hypothetical protein Bpro_3568 [Polaromonas ...  73.2  2e-11
ref|YP_987209.1| hypothetical protein Ajs_2998 [Acidovorax sp...  72.8  2e-11
ref|YP_585048.1| hypothetical protein Rmet_2906 [Ralstonia me...  72.8  2e-11
ref|NP_629282.1| hypothetical protein SCO5133 [Streptomyces c...  72.4  3e-11
ref|XP_001419186.1| predicted protein [Ostreococcus lucimarin...  72.0  4e-11
ref|YP_774488.1| hypothetical protein Bamb_2598 [Burkholderia...  71.6  4e-11
ref|XP_001374929.1| PREDICTED: similar to KIAA0875 protein [Mono  71.6  4e-11
ref|YP_996547.1| hypothetical protein Veis_1775 [Verminephrob...  71.2  6e-11
ref|NP_520679.1| hypothetical protein RSc2558 [Ralstonia sola...  71.2  6e-11
ref|YP_001156535.1| hypothetical protein Pnuc_1758 [Polynucle...  70.9  7e-11
ref|YP_107498.1| hypothetical protein BPSL0873 [Burkholderia ...  70.9  7e-11
ref|ZP_02504948.1| hypothetical protein BpseBC_04828 [Burkhol...  70.5  9e-11
ref|YP_523774.1| hypothetical protein Rfer_2526 [Rhodoferax f...  70.5  9e-11
ref|XP_001601019.1| PREDICTED: similar to KIAA0875 protein [Naso  70.1  1e-10
gb|AAI58867.1| Unknown (protein for MGC:188488) [Rattus norvegic  69.3  2e-10
gb|ABK76830.1| conserved hypothetical protein [Cenarchaeum symbi  68.9  3e-10
ref|YP_441293.1| hypothetical protein BTH_I0737 [Burkholderia...  68.9  3e-10
ref|ZP_01857655.1| hypothetical protein PM8797T_30454 [Planct...  68.9  3e-10
ref|YP_621813.1| hypothetical protein Bcen_1937 [Burkholderia...  68.9  3e-10
ref|NP_001101808.1| F-box only protein 21 [Rattus norvegicus]...  68.9  3e-10
ref|ZP_02378053.1| hypothetical protein BuboB_10031 [Burkholderi  68.9  3e-10
ref|YP_983223.1| hypothetical protein Pnap_3003 [Polaromonas ...  68.6  3e-10
ref|YP_615146.1| hypothetical protein Sala_0087 [Sphingopyxis...  68.6  4e-10
ref|ZP_01341732.1| hypothetical protein Bmal2_03000556 [Burkh...  68.6  4e-10
ref|NP_663539.1| F-box protein 21 [Mus musculus] >sp|Q8VDH1|F...  68.6  4e-10
gb|EDL19801.1| F-box only protein 21, isoform CRA_a [Mus musculu  68.2  4e-10
gb|EDL19802.1| F-box only protein 21, isoform CRA_b [Mus musculu  68.2  4e-10

Alignments:

>ref|YP_590415.1| hypothetical protein Acid345_1339 [Acidobacteria bacterium Ellin345] gb|ABF40341.1| conserved hypothetical protein [Acidobacteria bacterium Ellin345] Length=284 GENE ID: 4070877 Acid345_1339 | hypothetical protein [Acidobacteria bacterium Ellin345] Score = 109 bits (273), Expect = 2e-22, Method: Compositional matrix adjust.
Identities = 58/190 (30%), Positives = 105/190 (55%), Gaps = 8/190 (4%) Query 65 IESLGWSDGVGNFLKFIRSQ----RYELESGCFLLDRTIFPTFEISSSTLFLDQLADRVR 120 ++ +G F+ +R++ R +L + RT +P +I + ++ LA +VR Sbjct 1 MQVVGTRSVTEEFIALVRAEIEDDRIDLLHASLTIARTEYPNLDIRNYVTRVEALASKVR 60 Query 121 ELLTPPQNARDICSVINRVFFHEFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYI 180 L+P + + + +N V FHE FRG +D+ DP+NSFL+ V+ER+ G+PI++SV+Y+ Sbjct 61 SRLSPEASTIETVNTLNHVMFHEMNFRGNREDYYDPRNSFLNDVIERRLGIPITMSVLYM 120 Query 181 LIARRVSLDLEPIGLPGRFMVGSF-TDDQPFYIDVWSGGKFFDIDQMEEFLGN---SELE 236 +ARRV L L +G+PG F++ + + + ID ++ G + + E+ + E+ Sbjct 121 EVARRVGLPLVGVGMPGHFLLKVYDVEGRQILIDPFNNGSMLNASECEQRMKEIYAGEVR 180 Query 237 NSGSSLLPVT 246 LLPV+ Sbjct 181 FKPEFLLPVS 190 >ref|YP_473862.1| Gene info hypothetical protein CYA_0380 [Synechococcus sp. JA-3-3Ab] gb|ABC98599.1| Gene info conserved hypothetical protein [Synechococcus sp. JA-3-3Ab] Length=278 GENE ID: 3897952 CYA_0380 | hypothetical protein [Synechococcus sp. JA-3-3Ab] (10 or fewer PubMed links) Score = 100 bits (250), Expect = 8e-20, Method: Compositional matrix adjust. Identities = 47/157 (29%), Positives = 89/157 (56%), Gaps = 0/157 (0%) Query 76 NFLKFIRSQRYELESGCFLLDRTIFPTFEISSSTLFLDQLADRVRELLTPPQNARDICSV 135 FL+ I++ +L+ + + +P ++ LD++A V+E L P + + + Sbjct 9 RFLQEIQADPIDLDRAALWIAQEAYPDLDVEEYLAALDEMAAEVQERLPPERYPLRVIRI 68 Query 136 INRVFFHEFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGL 195 +NR F + GF G ++D+ DP+NSFL+ V++R+ G+PI+LS+IY+ +ARR+ + +G+ Sbjct 69 LNRYLFEDLGFCGNSEDYYDPRNSFLNEVIDRRTGIPITLSLIYLELARRIDFPMAGVGM 128 Query 196 PGRFMVGSFTDDQPFYIDVWSGGKFFDIDQMEEFLGN 232 PG F+V D ++D + G+ + +E L Sbjct 129 PGHFLVRPLFADAEIFVDPFHQGEILFPEDCQERLAQ 165 >ref|YP_478074.1| Gene info tetratricopeptide repeat protein [Synechococcus sp. JA-2-3B'a(2-13)] gb|ABD02811.1| Gene info tetratricopeptide repeat protein [Synechococcus sp. JA-2-3B'a(2-13)] Length=279 GENE ID: 3900915 CYB_1856 | tetratricopeptide repeat protein [Synechococcus sp. 
JA-2-3B'a(2-13)] (10 or fewer PubMed links) Score = 98.2 bits (243), Expect = 5e-19, Method: Compositional matrix adjust. Identities = 45/152 (29%), Positives = 84/152 (55%), Gaps = 0/152 (0%) Query 81 IRSQRYELESGCFLLDRTIFPTFEISSSTLFLDQLADRVRELLTPPQNARDICSVINRVF 140 I+S+ +L + + +P E+ LD++A V+E L P + + ++N Sbjct 14 IQSEPIDLGRAALWIAQEAYPDLEVEEYVAALDEMAAEVQERLPPERYPLRVIKILNHYL 73 Query 141 FHEFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPGRFM 200 F + GFRG +D+ DP+NSFL+ V++R+ G+PI+LS+IY+ +ARR+ + +G+PG F+ Sbjct 74 FEDLGFRGNREDYYDPRNSFLNEVIDRRTGIPITLSLIYLELARRIDFPMAGVGMPGHFL 133 Query 201 VGSFTDDQPFYIDVWSGGKFFDIDQMEEFLGN 232 + ++D + G+ + +E L Sbjct 134 IRPLFAGAEIFVDPFQQGEILFPEDCQELLSQ 165 >ref|ZP_02735526.1| hypothetical protein GobsU_27206 [Gemmata obscuriglobus UQM 2246] Length=273 Score = 93.6 bits (231), Expect = 1e-17, Method: Compositional matrix adjust. Identities = 48/141 (34%), Positives = 83/141 (58%), Gaps = 2/141 (1%) Query 92 CFLLDRTIFPTFEISSSTLFLDQLADRVRELLTPPQNARDICSVINRVFFHEFGFRGATK 151 L+ R + + +++LA+++R L AR + ++ F E GF G T+ Sbjct 24 ALLIARDAYSGMNPRAYLRRIERLAEQLRPRLKGSLAAR--TAELSTFLFEECGFAGNTE 81 Query 152 DFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPGRFMVGSFTDDQPFY 211 D+ DP+NS+L+ VL+R+ GLPI+LSV+ + + RR LD+ +GLPG F+V + ++ Sbjct 82 DYYDPRNSYLNKVLDRQVGLPIALSVLAMEVGRRAGLDVVGVGLPGHFIVKAVEGNEEVL 141 Query 212 IDVWSGGKFFDIDQMEEFLGN 232 +D ++GG+F DI+ E +G Sbjct 142 LDPFNGGQFLDIEGCEALVGG 162 >ref|YP_001518509.1| Gene info hypothetical protein AM1_4212 [Acaryochloris marina MBIC11017] gb|ABW29192.1| Gene info conserved hypothetical protein [Acaryochloris marina MBIC11017] Length=273 GENE ID: 5683015 AM1_4212 | hypothetical protein [Acaryochloris marina MBIC11017] Score = 89.7 bits (221), Expect = 1e-16, Method: Compositional matrix adjust. 
Identities = 44/138 (31%), Positives = 75/138 (54%), Gaps = 0/138 (0%) Query 82 RSQRYELESGCFLLDRTIFPTFEISSSTLFLDQLADRVRELLTPPQNARDICSVINRVFF 141 +S +L + + + +P + LD +A V E L + +NR + Sbjct 16 QSSEPDLAAAALYIAQEEYPALAVDEYLNALDTMAGEVEERLPADPYPLKVLQTLNRYLY 75 Query 142 HEFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPGRFMV 201 + GF G ++ + DP+NSFL+ VL+R+ G+PI+LS++YI IARR++ +E I PG F+V Sbjct 76 EDLGFTGNSQHYYDPRNSFLNDVLDRRLGIPITLSLVYIEIARRINFPMEGINFPGHFLV 135 Query 202 GSFTDDQPFYIDVWSGGK 219 DD ++D + G+ Sbjct 136 RPTRDDMNIFVDPFYQGE 153 >ref|ZP_02326065.1| conserved hypothetical protein [Anaeromyxobacter dehalogenans 2CP-1] gb|EDR80602.1| conserved hypothetical protein [Anaeromyxobacter dehalogenans 2CP-1] Length=282 Score = 87.8 bits (216), Expect = 6e-16, Method: Compositional matrix adjust. Identities = 44/139 (31%), Positives = 72/139 (51%), Gaps = 1/139 (0%) Query 88 LESGCFLLDRTIFPTFEISSSTLFLDQLADRVRELLTPPQNARDICSVINRVFFHEFGFR 147 L+ + + +P E + LD+LA RV + P A + V E G R Sbjct 26 LDEAALAIAQEEYPALEPEAYLTRLDELAARVVRRVPGPVRAASALRALREVLHDEEGLR 85 Query 148 GATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPGRFMVGSFTDD 207 G D+ DP+NSFL+ VL+R+ G+PI+L+++Y+ + RR L LE +G PG F+ + Sbjct 86 GNDDDYYDPRNSFLNDVLDRRLGIPITLALVYMEVGRRAGLRLEGVGFPGHFLAKYVSPG 145 Query 208 Q-PFYIDVWSGGKFFDIDQ 225 ++D + GG+ D+ Sbjct 146 GVEVFVDAYHGGEMLSADE 164 >ref|ZP_00518391.1| similar to Uncharacterized conserved protein [Crocosphaera watsonii WH 8501] gb|EAM48526.1| similar to Uncharacterized conserved protein [Crocosphaera watsonii WH 8501] Length=265 Score = 87.8 bits (216), Expect = 6e-16, Method: Compositional matrix adjust. 
Identities = 46/162 (28%), Positives = 85/162 (52%), Gaps = 7/162 (4%) Query 88 LESGCFLLDRTIFPTFEISSSTLFLDQLADRVRELLTPPQNARDICSVINRVFFHEFGFR 147 L + + +P +I + LD + R+++ L + IN+ F E GF+ Sbjct 17 LAKASLIYAKYQYPRLDIQDYLMTLDIIFQRIKDELGKEVYPLKVIKTINKYLFKELGFK 76 Query 148 GATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPGRFMVGSFTDD 207 G ++ DP NS+L+ V+++K G+PI+LSVIY+ IA+R+ + IG+PG F++ ++ Sbjct 77 GNKDNYYDPDNSYLNQVIDKKTGIPITLSVIYLEIAKRLDFPMVGIGMPGHFLIRPEFEN 136 Query 208 QPFYIDVWSGGKFFDIDQMEEFLGNSELENSGSSLLPVTVAE 249 ++DV++ G+ F + EL+ P+T+ E Sbjct 137 VGIFVDVFNQGEIL-------FKEDCELKLKEIYQQPITLEE 171 >ref|XP_001783555.1| Gene info predicted protein [Physcomitrella patens subsp. patens] gb|EDQ51644.1| Gene info predicted protein [Physcomitrella patens subsp. patens] Length=515 Score = 87.4 bits (215), Expect = 7e-16, Method: Compositional matrix adjust. Identities = 43/134 (32%), Positives = 76/134 (56%), Gaps = 3/134 (2%) Query 99 IFPTFEISSSTLFLDQLADRVRELLTPPQN--ARDICSVINRVFFHEFGFRGATKDFADP 156 ++P + ++AD + LL PP + + INR + + G++GAT ++ DP Sbjct 270 LYPRITAEEVEAEMKEMADELEPLLPPPAERYTMRMINSINRYLYGQLGYKGAT-NYLDP 328 Query 157 KNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPGRFMVGSFTDDQPFYIDVWS 216 NS +++VL+R+ GLP+++S++Y+ +A+RV L ++ + LP FM D F+ID + Sbjct 329 DNSCINMVLKRREGLPLTMSLLYMELAKRVGLPMQGVNLPAHFMCRPTGDGLEFFIDAHA 388 Query 217 GGKFFDIDQMEEFL 230 GK + EE L Sbjct 389 NGKITFLQDAEERL 402 >ref|YP_323133.1| Gene info hypothetical protein Ava_2624 [Anabaena variabilis ATCC 29413] gb|ABA22238.1| Gene info conserved hypothetical protein [Anabaena variabilis ATCC 29413] Length=271 GENE ID: 3681907 Ava_2624 | hypothetical protein [Anabaena variabilis ATCC 29413] Score = 86.7 bits (213), Expect = 1e-15, Method: Compositional matrix adjust. 
Identities = 39/133 (29%), Positives = 74/133 (55%), Gaps = 0/133 (0%) Query 87 ELESGCFLLDRTIFPTFEISSSTLFLDQLADRVRELLTPPQNARDICSVINRVFFHEFGF 146 +L + + +P ++ LD +A V E L + + IN+ + + GF Sbjct 22 DLARAALYIAKEEYPRLDVEEYLSALDTMAMEVEERLPSSRYPLRVIQGINQYLYDDLGF 81 Query 147 RGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPGRFMVGSFTD 206 G KD+ DP+NSF + V++R+ G+PI+L+++Y+ IA+R+ +E +GLPG F++ Sbjct 82 IGNQKDYYDPRNSFFNEVIDRRVGIPITLALVYLEIAQRIDFPMEGVGLPGHFLIRPAIS 141 Query 207 DQPFYIDVWSGGK 219 D ++D ++ G+ Sbjct 142 DMEIFVDAFNRGE 154

>ref|ZP_01618704.1| hypothetical protein L8106_04536 [Lyngbya sp. PCC 8106] gb|EAW39179.1| hypothetical protein L8106_04536 [Lyngbya sp. PCC 8106] Length=274 Score = 86.7 bits (213), Expect = 1e-15, Method: Compositional matrix adjust. Identities = 42/139 (30%), Positives = 77/139 (55%), Gaps = 0/139 (0%) Query 83 SQRYELESGCFLLDRTIFPTFEISSSTLFLDQLADRVRELLTPPQNARDICSVINRVFFH 142 +Q+ +L + + FP + LD++A V E L + + IN+ F Sbjct 18 NQQIDLAKAALYMAQEQFPDLDPEEYLKALDEMAAEVLERLDEERYPLRVIQTINQYLFD 77 Query 143 EFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPGRFMVG 202 + F G ++ DP NS+L+ V++R+ G+PI+LSV+Y+ IA+R++ + IG+PG F++ Sbjct 78 DLEFVGNESNYYDPNNSYLNQVIDRRTGIPITLSVVYLEIAKRINFPMVGIGMPGHFLIR 137 Query 203 SFTDDQPFYIDVWSGGKFF 221 +D Y+DV++ G+ Sbjct 138 PDFEDVGIYVDVFNRGEIL 156
____________________________________________________________________________________________________
Blastp against the (SwissProt) database:

BLASTP 2.2.18 (Feb-03-2008)

RID: WRNAHNVJ016
Database: Non-redundant SwissProt sequences
309,621 sequences; 115,465,120 total letters

Query= Translation of ORF number 1 in reading frame 1 on the direct strand.
Length=250

Sequences producing significant alignments:  Score (Bits)  E Value
sp|Q87DH6|Y709_XYLFT UPF0162 protein PD_0709  81.6  3e-15
sp|Q9PD85|Y1494_XYLFA UPF0162 protein XF_1494  81.6  3e-15
sp|Q8VDH1|FBX21_MOUSE F-box only protein 21  68.6  3e-11
sp|O94952|FBX21_HUMAN F-box only protein 21  67.0  8e-11
sp|Q5R5S1|FBX21_PONPY F-box only protein 21  66.6  1e-10
sp|Q9KQ28|Y2176_VIBCH UPF0162 protein VC_2176  57.4  6e-08
sp|Q9CN81|Y557_PASMU UPF0162 protein PM0557  53.9  7e-07
sp|P45252|Y1558_HAEIN UPF0162 protein HI1558  52.8  1e-06
sp|Q9HYI6|Y3419_PSEAE UPF0162 protein PA3419  48.1  4e-05
sp|P57270|Y173_BUCAI UPF0162 protein BU173  40.0  0.011
sp|P0A2L9|SIRB1_SALTY Protein sirB1 >sp|P0A2M0|SIRB1_SALTI Prote  35.4  0.25
sp|Q2G8L3|PSD_NOVAD Phosphatidylserine decarboxylase proenzym...  33.9  0.71
sp|Q9X5E3|PSD_ZYMMO Phosphatidylserine decarboxylase proenzym...  33.9  0.79
sp|P0AGM5.1|SIRB1_ECOLI Protein sirB1 >sp|P0AGM6|SIRB1_ECO57 Pro  33.5  0.91
sp|Q8K9W8|Y167_BUCAP UPF0162 protein BUsg_167  32.3  2.0
sp|Q3MHH0.2|WD40A_BOVIN WD repeat-containing protein 40A  32.3  2.0
sp|Q2NBU9.1|PSD_ERYLH Phosphatidylserine decarboxylase proenz...  32.3  2.4
sp|Q3EAE5|FBDL1_ARATH Putative F-box/FBD/LRR-repeat protein At4g  31.6  3.5
sp|Q7UI46|NRDJ_RHOBA Vitamin B12-dependent ribonucleotide red...  31.6  3.7
sp|Q1GRP7|PSD_SPHAL Phosphatidylserine decarboxylase proenzym...  30.4  9.0
sp|Q9HSF7|TF2B7_HALSA Transcription initiation factor IIB 7 (TFI  30.0  9.4

Alignments:

>sp|Q87DH6|Y709_XYLFT UPF0162 protein PD_0709 Length=281 Score = 81.6 bits (200), Expect = 3e-15, Method: Compositional matrix adjust. Identities = 49/157 (31%), Positives = 83/157 (52%), Gaps = 8/157 (5%) Query 76 NFLKFIRSQRYELESGCFLLDRTIFPTFEIS-SSTL---FLDQLADRVRELLTPPQNARD 131 N L + + L S L+ R +P + + TL +++ L V E+ P Sbjct 11 NNLATVDDETLPLMSTALLIARDEYPDLDANLYDTLVQSYVEYLRSEVEEISLWPLK--- 67 Query 132 ICSVINRVFFHEFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLE 191 + +NR F + G+ G ++ DP+NS+L+ V ER+ G PISL+VI I +ARR+ + L+ Sbjct 68 -MAAVNRYLFQKLGYSGNHDEYYDPRNSYLNQVFERRLGNPISLAVIQIEVARRLGIPLD 126 Query 192 PIGLPGRFMVGSFTDDQPFYIDVWSGGKFFDIDQMEE 228 + PG F+V DD +D ++GG+ D +++ E Sbjct 127 GVSFPGHFLVRLPVDDGILVMDPFNGGRPLDAEELRE 163

>sp|Q9PD85|Y1494_XYLFA UPF0162 protein XF_1494 Length=281 Score = 81.6 bits (200), Expect = 3e-15, Method: Compositional matrix adjust. Identities = 49/157 (31%), Positives = 83/157 (52%), Gaps = 8/157 (5%) Query 76 NFLKFIRSQRYELESGCFLLDRTIFPTFEIS-SSTL---FLDQLADRVRELLTPPQNARD 131 N L + + L S L+ R +P + + TL +++ L V E+ P Sbjct 11 NSLATVDDETLPLMSTALLIARDEYPDLDANLYDTLVQSYVEYLRSEVEEISLWPLK--- 67 Query 132 ICSVINRVFFHEFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLE 191 + +NR F + G+ G ++ DP+NS+L+ V ER+ G PISL+VI I +ARR+ + L+ Sbjct 68 -MAAVNRYLFQKLGYSGNHDEYYDPRNSYLNQVFERRLGNPISLAVIQIEVARRLGIPLD 126 Query 192 PIGLPGRFMVGSFTDDQPFYIDVWSGGKFFDIDQMEE 228 + PG F+V DD +D ++GG+ D +++ E Sbjct 127 GVSFPGHFLVRLPVDDGILVMDPFNGGRPLDAEELRE 163

>sp|Q8VDH1|FBX21_MOUSE F-box only protein 21 Length=627 GENE ID: 231670 Fbxo21 | F-box protein 21 [Mus musculus] (Over 10 PubMed links) Score = 68.6 bits (166), Expect = 3e-11, Method: Compositional matrix adjust.
Identities = 33/110 (30%), Positives = 60/110 (54%), Gaps = 8/110 (7%) Query 131 DICSVINRVFFHEFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDL 190 + IN V + + F+G D+ + N ++H VL R+ G+PIS+S++Y+ +AR++ + L Sbjct 260 QVLDAINYVLYDQLKFKGNRMDYYNALNLYMHQVLTRRTGIPISMSLLYLTVARQLGVPL 319 Query 191 EPIGLPGRFMV-------GSFTDDQPF-YIDVWSGGKFFDIDQMEEFLGN 232 EP+ P F++ G+ D + YID + GK + + E +G Sbjct 320 EPVNFPSHFLLRWCQGAEGATLDIFDYIYIDAFGKGKQLTVKECEYLIGQ 369 >sp|O94952|FBX21_HUMAN Gene info F-box only protein 21 Length=621 GENE ID: 23014 FBXO21 | F-box protein 21 [Homo sapiens] (10 or fewer PubMed links) Score = 67.0 bits (162), Expect = 8e-11, Method: Compositional matrix adjust. Identities = 33/109 (30%), Positives = 60/109 (55%), Gaps = 8/109 (7%) Query 132 ICSVINRVFFHEFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLE 191 + +N V + + F+G D+ + N ++H VL R+ G+PIS+S++Y+ IAR++ + LE Sbjct 261 VLDAMNYVLYDQLKFKGNRMDYYNALNLYMHQVLIRRTGIPISMSLLYLTIARQLGVPLE 320 Query 192 PIGLPGRFMV-------GSFTDDQPF-YIDVWSGGKFFDIDQMEEFLGN 232 P+ P F++ G+ D + YID + GK + + E +G Sbjct 321 PVNFPSHFLLRWCQGAEGATLDIFDYIYIDAFGKGKQLTVKECEYLIGQ 369 >sp|Q5R5S1|FBX21_PONPY F-box only protein 21 Length=621 Score = 66.6 bits (161), Expect = 1e-10, Method: Compositional matrix adjust. Identities = 33/109 (30%), Positives = 60/109 (55%), Gaps = 8/109 (7%) Query 132 ICSVINRVFFHEFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLE 191 + +N V + + F+G D+ + N ++H VL R+ G+PIS+S++Y+ IAR++ + LE Sbjct 261 VLDAMNYVLYDQLKFQGNRMDYYNALNLYMHQVLIRRTGIPISMSLLYLTIARQLGVPLE 320 Query 192 PIGLPGRFMV-------GSFTDDQPF-YIDVWSGGKFFDIDQMEEFLGN 232 P+ P F++ G+ D + YID + GK + + E +G Sbjct 321 PVNFPSHFLLRWCQGAEGATLDIFDYIYIDAFGKGKQLTVKECEYLIGQ 369 >sp|Q9KQ28|Y2176_VIBCH UPF0162 protein VC_2176 Length=270 Score = 57.4 bits (137), Expect = 6e-08, Method: Compositional matrix adjust. 
Identities = 32/132 (24%), Positives = 72/132 (54%), Gaps = 1/132 (0%) Query 87 ELESGCFLLDRTIFPTFEISSSTLFLDQLADRVRELLTPPQNARDICSVINRVFFHEFGF 146 EL G L++ I P ++ + + L +L L ++ + R+F+ E+GF Sbjct 13 ELVEGALALNKAINPETQLEWAHIELARLLKEAELALVHERDEKARFEAFLRLFYQEWGF 72 Query 147 RGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPGRFMVG-SFT 205 G + + D +N+F+ VL+R++G+P+SL + + + ++ L I P +F++ +++ Sbjct 73 SGDREAYFDSRNAFIDQVLQRRKGIPVSLGSLLLYLGHKLGFPLNGISFPTQFLLSLNWS 132 Query 206 DDQPFYIDVWSG 217 ++P Y++ ++G Sbjct 133 GERPIYLNPFNG 144 >sp|Q9CN81|Y557_PASMU UPF0162 protein PM0557 Length=264 Score = 53.9 bits (128), Expect = 7e-07, Method: Compositional matrix adjust. Identities = 24/80 (30%), Positives = 48/80 (60%), Gaps = 0/80 (0%) Query 138 RVFFHEFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPG 197 ++F+ E+GF + + N +L+ VLE +RG+P+SL I + IA +++L L P+ P Sbjct 60 QLFYGEWGFHCDPESYFLSSNLYLNDVLETRRGMPVSLGAILLYIADKLNLPLYPVNFPT 119 Query 198 RFMVGSFTDDQPFYIDVWSG 217 + ++ + + + +I+ W G Sbjct 120 QLVIRAEVEGEVAFINPWDG 139 >sp|P45252|Y1558_HAEIN UPF0162 protein HI1558 Length=267 Score = 52.8 bits (125), Expect = 1e-06, Method: Compositional matrix adjust. Identities = 23/103 (22%), Positives = 54/103 (52%), Gaps = 0/103 (0%) Query 115 LADRVRELLTPPQNARDICSVINRVFFHEFGFRGATKDFADPKNSFLHLVLERKRGLPIS 174 L + R+ ++P + + ++F+ ++GF +D+ +N +L V E ++G+P++ Sbjct 36 LVRKARKKISPDWPKEEQIHQLLQLFYGDWGFHCDPEDYFYARNLYLPYVFEHRQGMPVT 95 Query 175 LSVIYILIARRVSLDLEPIGLPGRFMVGSFTDDQPFYIDVWSG 217 L + +A + L + P+ P + ++ + D+ +ID W G Sbjct 96 LGAMVFYLAEALDLPIYPVNFPTQLILRAEVRDEVAFIDPWDG 138 >sp|Q9HYI6|Y3419_PSEAE UPF0162 protein PA3419 Length=270 Score = 48.1 bits (113), Expect = 4e-05, Method: Compositional matrix adjust. 
Identities = 24/75 (32%), Positives = 45/75 (60%), Gaps = 2/75 (2%) Query 156 PKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPGRFMVGSFTDDQPFYIDVW 215 P+++ L LVL+R++G P+SL+++ + +ARR+ + L + PGRF++ D +D Sbjct 85 PRSALLPLVLQRRQGQPLSLALVAMELARRLDIPLVGVNFPGRFLLRVPQAD--HLLDPA 142 Query 216 SGGKFFDIDQMEEFL 230 +G + + D E L Sbjct 143 TGRRLYTPDCRELLL 157

>sp|P57270|Y173_BUCAI UPF0162 protein BU173 Length=269 Score = 40.0 bits (92), Expect = 0.011, Method: Compositional matrix adjust. Identities = 23/97 (23%), Positives = 56/97 (57%), Gaps = 3/97 (3%) Query 138 RVFFHEFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYILIARRVSLDLEPIGLPG 197 ++F+ ++ F GA+ + ++ VL+ ++G +SL +I++ IA+ + L L P+ P Sbjct 64 QLFYTQWNFGGASGVYKLSDTLWIDNVLKTRKGTAVSLGIIFLHIAQSLKLPLNPVVFPT 123 Query 198 RFMV-GSFTDDQPFYIDVWSGGKFFDIDQMEEFL-GN 232 + ++ + +++ + I+ ++ G+ D +E +L GN Sbjct 124 QLILRADWINEKKWLINPFN-GEILDQHTLEVWLKGN 159
___________________________________________________________________________________________________
The scores returned by the BLAST against SwissProt are lower than those returned by the BLAST against (nr) (best hit 81.6 bits versus 109 bits), and are therefore less significant. To define the threshold we rely on the results of the BLAST against (nr). To do so, we examine the pairwise alignments and try to determine the score below which sequences are considered non-homologous. When the alignment between our ORF and the sequence it is compared with shows only very few conserved regions, and those regions are very short, the sequences can be considered non-homologous. On this basis we set a threshold score of 68.9 bits.
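As an illustration of how such a cut-off could be applied, here is a minimal Python sketch that partitions a handful of hits from the (nr) list by bit score. The accessions, scores and E-values are copied from the hit list above; the function name `split_by_score` is hypothetical, and treating a hit that scores exactly 68.9 as retained is an assumption, since the text does not say whether the threshold is inclusive.

```python
# Bit-score threshold chosen in the text.
THRESHOLD_BITS = 68.9

# (accession, bit score, E-value): a few rows copied from the (nr) hit list.
NR_HITS = [
    ("ref|YP_590415.1|", 109.0, 2e-22),
    ("gb|AAI58867.1|", 69.3, 2e-10),
    ("gb|ABK76830.1|", 68.9, 3e-10),
    ("ref|YP_983223.1|", 68.6, 3e-10),
    ("gb|EDL19801.1|", 68.2, 4e-10),
]


def split_by_score(hits, threshold):
    """Partition hits into (kept, rejected) according to their bit score.

    Assumption: a hit scoring exactly at the threshold is kept.
    """
    kept = [hit for hit in hits if hit[1] >= threshold]
    rejected = [hit for hit in hits if hit[1] < threshold]
    return kept, rejected


kept, rejected = split_by_score(NR_HITS, THRESHOLD_BITS)
print("kept:    ", [accession for accession, _, _ in kept])
print("rejected:", [accession for accession, _, _ in rejected])
```

With this reading of the threshold, the hits at 68.6 and 68.2 bits at the bottom of the (nr) list fall below the cut-off.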
ORF finding
We use ORF Finder. Before searching for ORFs in all reading frames on the direct and reverse strands, we set the following parameters:
-> ORFs may start with any codon.
-> They must be at least 60 amino acids long, since we consider that beyond this length ORFs are unlikely to arise by chance alone and may correspond to coding sequences.
-> ORF Finder will use the standard genetic code for its search.

Results:

- on the direct strand:

ORF Finder results (DIRECT STRAND)
Results for 903 residue sequence "ORF_KS23670 genomic DNA (North American East Coast: Block Island, NY)" starting "TTCAATGGGA"

>ORF number 1 in reading frame 1 on the direct strand extends from base 154 to base 903.
ATGAATAATTCTAAGGAAAATTTGGAAAGTAATTACTGGCTAAGTCTTCGTAAATTGATA
GACGATGAATCTCCATTAGTCAGAAGTGCTTTGCTTGCTGAACTTAGAAAGCATCCGAAA
GATGGAAAACTTTTTCTCGAAAATATAATTCAAAACCAAAAAGATATTCTTGCAAAATTC
GCACTTTCATTAATTGAAAGTCTTGGTTGGAGTGATGGAGTTGGGAATTTTTTAAAATTC
ATTCGATCTCAAAGATACGAACTTGAATCGGGTTGTTTTTTGTTGGATCGAACGATTTTT
CCAACTTTTGAAATCTCCTCATCCACTCTTTTTCTCGACCAACTCGCAGATCGTGTTCGT
GAGCTTTTAACTCCTCCCCAAAATGCTCGGGATATTTGTTCAGTAATCAATCGAGTTTTT
TTTCATGAATTCGGATTTAGAGGAGCAACCAAGGATTTTGCTGATCCAAAAAATAGTTTT
CTCCACCTTGTATTGGAAAGGAAAAGAGGTTTACCGATTTCTCTATCGGTAATTTACATT
CTAATTGCTCGAAGGGTAAGTTTAGACCTTGAGCCAATTGGACTTCCAGGAAGATTTATG
GTTGGTAGCTTTACCGATGATCAACCATTTTACATTGATGTGTGGTCAGGGGGTAAATTC
TTTGATATCGACCAGATGGAGGAATTCCTAGGAAATTCAGAGCTTGAAAACTCTGGTTCA
TCTCTTCTTCCTGTAACAGTAGCAGAAACT

>Translation of ORF number 1 in reading frame 1 on the direct strand.
MNNSKENLESNYWLSLRKLIDDESPLVRSALLAELRKHPKDGKLFLENIIQNQKDILAKF
ALSLIESLGWSDGVGNFLKFIRSQRYELESGCFLLDRTIFPTFEISSSTLFLDQLADRVR
ELLTPPQNARDICSVINRVFFHEFGFRGATKDFADPKNSFLHLVLERKRGLPISLSVIYI
LIARRVSLDLEPIGLPGRFMVGSFTDDQPFYIDVWSGGKFFDIDQMEEFLGNSELENSGS
SLLPVTVAET

No ORFs were found in reading frame 2.
No ORFs were found in reading frame 3.
- on the reverse strand:

ORF Finder results (REVERSE STRAND)
Results for 903 residue sequence "ORF_KS23670 genomic DNA (North American East Coast: Block Island, NY)" starting "TTCAATGGGA"

No ORFs were found in reading frame 1.
No ORFs were found in reading frame 2.
No ORFs were found in reading frame 3.
________________________________________________________________________________________________________________________________
We have only one ORF (direct strand, reading frame 1), so the study can only be carried out on this one. This ORF appears to be incomplete at the 3' end: it extends from base 154 to base 903 (the last base) of our starting sequence, and its final codon is ACT, which codes for threonine and is not a stop codon. The ORF is, however, complete at the 5' end: it starts at base 154 of our sequence and its first amino acid is a methionine. The ORF is 250 amino acids long; taking the average mass of an amino acid as 110 Da, we estimate the mass of the ORF product at approximately 27.5 kDa. This remains to be confirmed by a more precise calculation with the Protein Molecular Weight tool on the SMS site (see below).
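The search settings and the size estimate above can be illustrated with a short Python sketch. It is not the ORF Finder implementation: the function names are hypothetical, and it assumes that with "any start codon" an ORF is simply a stop-free stretch of a reading frame, reported with 1-based inclusive coordinates that exclude the closing stop codon. The toy sequence is made up; the coordinates 154 and 903 and the 110 Da average are the values quoted in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code
AVERAGE_RESIDUE_MASS_DA = 110        # rule-of-thumb average used in the text


def find_orfs(dna, min_codons):
    """List stop-free stretches of at least `min_codons` codons.

    Returns (frame, first_base, last_base) tuples for the three frames of
    the given strand; coordinates are 1-based and inclusive.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = frame
        pos = frame
        while pos + 3 <= len(dna):
            if dna[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((frame + 1, start + 1, pos))
                start = pos + 3  # next candidate begins after the stop
            pos += 3
        # A stretch that runs to the end of the read has no stop codon,
        # which is the "incomplete at the 3' end" case.
        if (pos - start) // 3 >= min_codons:
            orfs.append((frame + 1, start + 1, pos))
    return orfs


def orf_length_codons(first_base, last_base):
    """Number of codons spanned by an ORF given its base coordinates."""
    n_bases = last_base - first_base + 1
    if n_bases % 3:
        raise ValueError("ORF length is not a multiple of three")
    return n_bases // 3


def approx_mass_kda(n_residues):
    """Rough protein mass in kDa from the average residue mass."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000


# Toy sequence (hypothetical), with a low length cut-off for readability.
print(find_orfs("TAAATGAAACCCGGG", min_codons=3))

# Values quoted in the text: ORF from base 154 to base 903.
n_codons = orf_length_codons(154, 903)
print(n_codons, "codons, about", approx_mass_kda(n_codons), "kDa")
```

The last two lines reproduce the arithmetic of the paragraph above: 750 bases correspond to 250 codons, and 250 x 110 Da gives 27.5 kDa.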