GOS 2149020

From Metagenes
Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : JCVI_READ_1091118860115
Annotathon code: GOS_2149020
Sample :
  • GPS: 9°9'52"N; 79°50'10"W
  • Panama Canal: Lake Gatun - Panama
  • Fresh Water (-2m, 28.6°C, 0.1-0.8 microns)
Authors
Team : Algarve 2011
Username : anajoao
Annotated on : 2011-05-21 17:58:49
  • Bandarra Ana Teresa Santos Lopes
  • Roberto João Vitor Martins

Synopsis

  • Taxonomy: Viruses (NCBI info)
    Rank: no rank - Genetic Code: Standard - NCBI Identifier: 10239
    Kingdom: - Phylum: - Class: - Order:
    Viruses;

Genomic Sequence

>JCVI_READ_1091118860115 GOS_2149020 Genomic DNA
ATTGGATCACTGCCGTGCCCACCTTGAGGTGATAGAACTGCTTCAATTGCTCCAGTAGCATTTGAAGCAACTACAACAGGAGTGGTCAAAGCAGCATCAG
AGTATACATTGCCATTGTTTAAAAGAATATTTGCATAACTATAACCAGAACCAGCACTTTGTAAAATTGCGTATGTGATATTTCCAGAAGAATTTGTGCC
AAATTTAACAACACCACCGCTTCCGTCGCCTACAATTGGAGCATAAAGAGATGTTTGACTTGTTGGTAAGTTAGATCCAATAGCTTTAAGAACAACTGTA
CTTACAGAACCATTGACAGCAGCAGCTTGTACTACAGAATCAGCAACAATAGGAATAAAATCTGTAGATAGAAATCTTAGAACATCATTGGTGGAAATAG
TATACATATATTTCCAAATGTAATTACAAAAACCATTCGCATCTACTGGTTCTTTGTAGATTCCATTAGCATATGTTCCACTTGATGGAGTTGTTGATGG
TTCATAAGCAACCGTTGGTAATACGCCACCTGGAAGAAGAGAAGCGGGTCTTTCACCGTTGTAAAGACATTTAAATACTTCATATTTTCCATTTACAACA
TAAAATTTAGCAGTTGAAATATTTTGGGAACCAGTTGATCCACCAGGAAGAACTACAGAAGAAGTCTTTTGCTCGGAATAATCTGGTCTCCACATATCAA
AACGACTATTTGTGGATATATCCCAGTTAAATCTTCTCACAACTGGTCTTGCGTATTCACCAGTAATTCTCTTTGCCGCAAGGATATCGGCATATGCTAA
GTATTTTTCTGTTTGATTATCGTAAGGAACTGTTGGAATATCATCGGTAGCAAAACGATAAACCCCACTTAAAGCTGTGGCAGCAGTGTTTGAACTAGTA
ATAGGATTATACCAGTTAAAGTAGAACCTTGTGGTGGAATAGCGGAAGCAGTTGGACCTACAGCAGAAAGAAGAGA

Translation

[3 - 971/976]   indirect strand
>GOS_2149020 Translation [3-971   indirect strand]
SSFCCRSNCFRYSTTRFYFNWYNPITSSNTAATALSGVYRFATDDIPTVPYDNQTEKYLAYADILAAKRITGEYARPVVRRFNWDISTNSRFDMWRPDYS
EQKTSSVVLPGGSTGSQNISTAKFYVVNGKYEVFKCLYNGERPASLLPGGVLPTVAYEPSTTPSSGTYANGIYKEPVDANGFCNYIWKYMYTISTNDVLR
FLSTDFIPIVADSVVQAAAVNGSVSTVVLKAIGSNLPTSQTSLYAPIVGDGSGGVVKFGTNSSGNITYAILQSAGSGYSYANILLNNGNVYSDAALTTPV
VVASNATGAIEAVLSPQGGHGSD

[ Warning ] 5' incomplete: does not start with a Methionine
[ Warning ] 3' incomplete: following codon is not a STOP

Annotator commentaries

We consider the sequence to be coding because BLAST found homologous sequences.

In the protein domain search, the E-values were significant, and the matched domains correspond to a known biological function.

In BLASTp, the alignments we found have significant E-values and high scores, ranging from 72 (lowest) to 286 (highest). The sequences are very similar. In agreement with BLAST, our sequence is related to Synechococcus phage and cyanophage sequences.

The phylogenetic trees are not rooted, and the two methods give similar trees. The results of the phylogenetic trees are consistent with the taxonomy report.

The molecular weight was not computed because the sequence is incomplete at both ends.

Because this is a viral protein, we cannot say anything about the biological process; it is unknown.

ORF finding

PROTOCOL


a) SMS ORFinder / forward strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code


b) SMS ORFinder / reverse strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code



RESULTS ANALYSIS


Submitting the metagenomic DNA sequence to SMS ORFinder gave only one ORF, in reading frame 3 of the reverse strand. On the forward strand we did not find any ORF.

As no other ORFs were possible, we worked with the only one available, with the aim of finding a meaningful biological function.

We considered the sequence to be coding because we found homologs in the BLAST analysis.

The 5' end of the sequence is incomplete because it does not start with a methionine, and the 3' end is also incomplete because the codon following the ORF is not a STOP.
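The ORF search described above can be sketched in a few lines of Python. This is only an illustration, not the SMS ORFinder implementation: it assumes that 'any codon' initiation means every stop-free stretch of at least `min_aa` codons is reported, and the function names (`reverse_complement`, `translate`, `open_frames`) are hypothetical. It also flags the two incompleteness conditions discussed above (no leading methionine, no following STOP).

```python
# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse strand read 5' to 3'."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, frame):
    """Translate from a 1-based reading frame (1, 2 or 3); unknown codons become 'X'."""
    start = frame - 1
    codons = (dna[i:i + 3] for i in range(start, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def open_frames(dna, min_aa=60):
    """Yield every stop-free stretch of at least min_aa codons in frames 1-3."""
    for frame in (1, 2, 3):
        segments = translate(dna, frame).split("*")
        pos = 0  # index (in codons) where the current segment starts
        for n, seg in enumerate(segments):
            if len(seg) >= min_aa:
                first = (frame - 1) + 3 * pos + 1  # 1-based base coordinate
                yield {
                    "frame": frame,
                    "first_base": first,
                    "last_base": first + 3 * len(seg) - 1,
                    "protein": seg,
                    "starts_with_met": seg.startswith("M"),
                    # False when the segment runs into the end of the read
                    "followed_by_stop": n < len(segments) - 1,
                }
            pos += len(seg) + 1  # skip the segment and its stop codon


# Last 23 bases of the read; their reverse complement opens the reverse strand.
read_tail = "GGACCTACAGCAGAAAGAAGAGA"
rc = reverse_complement(read_tail)
print(translate(rc, 3))  # SSFCCRS, the first residues of the annotated ORF
print((976 - 2) // 3)    # 324 codons fit in frame 3 of a 976-base read
```

Applied to the reverse complement of the full 976-base read with `min_aa=60`, a scan of this kind would be expected to report the frame 3 stretch from base 3 to base 974, with both flags set to `False`.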

RAW RESULTS

a) forward strand

No ORFs were found in reading frame 1.

No ORFs were found in reading frame 2.

No ORFs were found in reading frame 3.

-------------------------------------------------------------------------------------------------------------------

b) reverse strand

No ORFs were found in reading frame 1.

No ORFs were found in reading frame 2.

>ORF number 1 in reading frame 3 on the reverse strand extends from base 3 to base 974.
TCTTCTTTCTGCTGTAGGTCCAACTGCTTCCGCTATTCCACCACAAGGTTCTACTTTAAC
TGGTATAATCCTATTACTAGTTCAAACACTGCTGCCACAGCTTTAAGTGGGGTTTATCGT
TTTGCTACCGATGATATTCCAACAGTTCCTTACGATAATCAAACAGAAAAATACTTAGCA
TATGCCGATATCCTTGCGGCAAAGAGAATTACTGGTGAATACGCAAGACCAGTTGTGAGA
AGATTTAACTGGGATATATCCACAAATAGTCGTTTTGATATGTGGAGACCAGATTATTCC
GAGCAAAAGACTTCTTCTGTAGTTCTTCCTGGTGGATCAACTGGTTCCCAAAATATTTCA
ACTGCTAAATTTTATGTTGTAAATGGAAAATATGAAGTATTTAAATGTCTTTACAACGGT
GAAAGACCCGCTTCTCTTCTTCCAGGTGGCGTATTACCAACGGTTGCTTATGAACCATCA
ACAACTCCATCAAGTGGAACATATGCTAATGGAATCTACAAAGAACCAGTAGATGCGAAT
GGTTTTTGTAATTACATTTGGAAATATATGTATACTATTTCCACCAATGATGTTCTAAGA
TTTCTATCTACAGATTTTATTCCTATTGTTGCTGATTCTGTAGTACAAGCTGCTGCTGTC
AATGGTTCTGTAAGTACAGTTGTTCTTAAAGCTATTGGATCTAACTTACCAACAAGTCAA
ACATCTCTTTATGCTCCAATTGTAGGCGACGGAAGCGGTGGTGTTGTTAAATTTGGCACA
AATTCTTCTGGAAATATCACATACGCAATTTTACAAAGTGCTGGTTCTGGTTATAGTTAT
GCAAATATTCTTTTAAACAATGGCAATGTATACTCTGATGCTGCTTTGACCACTCCTGTT
GTAGTTGCTTCAAATGCTACTGGAGCAATTGAAGCAGTTCTATCACCTCAAGGTGGGCAC
GGCAGTGATCCA

>Translation of ORF number 1 in reading frame 3 on the reverse strand.
SSFCCRSNCFRYSTTRFYFNWYNPITSSNTAATALSGVYRFATDDIPTVPYDNQTEKYLA
YADILAAKRITGEYARPVVRRFNWDISTNSRFDMWRPDYSEQKTSSVVLPGGSTGSQNIS
TAKFYVVNGKYEVFKCLYNGERPASLLPGGVLPTVAYEPSTTPSSGTYANGIYKEPVDAN
GFCNYIWKYMYTISTNDVLRFLSTDFIPIVADSVVQAAAVNGSVSTVVLKAIGSNLPTSQ
TSLYAPIVGDGSGGVVKFGTNSSGNITYAILQSAGSGYSYANILLNNGNVYSDAALTTPV
VVASNATGAIEAVLSPQGGHGSDP

Multiple Alignment

PROTOCOL

Phylogeny.fr / MUSCLE


RESULTS ANALYSIS



The multiple alignment shows that our sequence is incomplete at both the 5' and the 3' ends. It does not start with a START codon (methionine), whereas nearly all of the other sequences in the alignment start with a methionine.

Overall, our sequence is not highly similar to the others.
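The two observations above (missing start methionine, limited similarity) can be checked directly on rows of the alignment. The sketch below is an illustration only: the helper names `starts_with_met` and `percent_identity` are hypothetical, and identity is computed over columns where both rows carry a residue, which is one of several possible conventions. The two rows are copied from alignment columns 181-240 in the raw results.

```python
def starts_with_met(row, gap="-"):
    """True if the first residue of an aligned row is methionine."""
    return row.replace(gap, "")[:1] == "M"


def percent_identity(row_a, row_b, gap="-"):
    """Percent identity over columns where both aligned rows have a residue."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue  # ignore columns with a gap in either row
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Alignment columns 181-240: GOS_2149020 and gi|255929072.
gos = "----IPT--VPYDNQTEKYLAYADILAAKRITGEYARPVVRRFNWDISTNS------RFD"
ref = "----VPP--LPLDNQREKIALYDEIIAAKRITDSFARTVVRRYNWNLVANP------KFD"
print(round(percent_identity(gos, ref), 1))
```

Running the same function over every pair of full-length rows gives an identity matrix, which makes it easier to see which reference sequences our fragment is closest to.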




RAW RESULTS

a) MUSCLE
CLUSTAL FORMAT:MUSCLE (3.7) multiple sequence alignment

                          10        20        30        40        50        60
                 =========+=========+=========+=========+=========+=========+
GOS_2149020      ----SSFCCRSNCFRYSTTRFYFNWY-NP------------------------ITS----
gi|255929072|re  -----MAAIISDKFRIFNAKQFLESLTEG---------------------AT-DDS----
gi|291335845|gb  RWKICKVDAYVEIFNISGTFVVGETVSGGGWSATVAEVNENSILVNNVLPTA-TTTPSFG
gi|58532894|ref  -----MAAIISDKFRIFNAKQFLESLSEG--------------------TVAGSGGDPEL
gi|326784074|re  -----MAAIISDKFRIFNAKQFLESLSE---------------------PVG-GAEDS--
gi|326783059|re  ------MALVTDNFRIYAAESFRNTLT---------------------------------
gi|326782112|re  ------MALVTDNFRIYAAESFRNTLQ---------------------------------
gi|326781852|re  -----MAAIITDQIRILNAKNFVSGIT---------------------------------
gi|326783352|re  -----MAALLTDQFRIFSARKFIKALEG---------------------PDA-TQSDDVA
gi|326784585|re  -----MAALLTDQFRIFSARKFIKALEG---------------------PDA-TQSDSAA
gi|61806285|ref  -----MAALLTDQFRIFSAQKFIKALEG---------------------PNA-TESDTVA
gi|113200609|re  -----MAALLTDQFRIFSAKKFIKALEG---------------------PIA-TQSDDAA
gi|326783573|re  -----MAALLTDQFRIFSSKKFIKALEG---------------------PNA-TQSDDDA
gi|326782840|re  -----MAALLTDQFRIFSAKKFIKALEG---------------------PDA-TQSDDAA
gi|326782364|re  -----MSALLTDQFRIFSAKKFIKSLEG---------------------PDA-TQSDAAA
gi|326784346|re  -----MAALLTDQFRIFSAKKFIKALEG---------------------PDA-TQSDDAA
gi|326782632|re  -----MAALLTDQFRIFSAKKFIKALEG---------------------PDA-TQSDDAA
gi|326783763|re  -----MPAIISEQFRILNAETFVQSFVG---------------------VG---------
gi|291335680|gb  -----MAAIITDQLRILNAKNFVNSVR---------------------------------
gi|61805984|ref  -----MPAIVTDQFRILNANNFVESVE---------------------------------
                        ###############                                      


                         70        80        90       100       110       120
                 =========+=========+=========+=========+=========+=========+
GOS_2149020      ------------------------------------------------------------
gi|255929072|re  -PDKTRMYFFVGRPQ-------PWRAYLEVYSKSSTNFTVGNEVYVG-------TYGTTA
gi|291335845|gb  TTLTGATSSATAK-----------------------------------------------
gi|58532894|ref  GNERTRMYFFVGRPQ-------GWDSFLELYSQNSTPFVASQFVYVSSDGNGSYTYANSP
gi|326784074|re  -PEKTRMYFFVGRPQ-------RWDAFLEIYSQNAVAFAENQFVYVASDGNGSYTYANSP
gi|326783059|re  --ATNKVYMFVGRAK-------TWG-----------------------------------
gi|326782112|re  --ATNKVYMFVGRAK-------TWG-----------------------------------
gi|326781852|re  -SSANSYYSFIGLPN-------PSDY----------------------------------
gi|326783352|re  GTTRDRLYLFIGRPQ-------TWD-----------------------------------
gi|326784585|re  GANRDRLYVFIGRSQ-------PWD-----------------------------------
gi|61806285|ref  GATRDRLYLFIGRPQ-------SWD-----------------------------------
gi|113200609|re  GATRDRLYIFIGRPQ-------SWD-----------------------------------
gi|326783573|re  GTTRDRLYLFIGRPQ-------TWD-----------------------------------
gi|326782840|re  GTSRDRLYLFIGRPQ-------TWD-----------------------------------
gi|326782364|re  GEDRDRLYVFIGRPQ-------AWD-----------------------------------
gi|326784346|re  GTSRDRLYLFIGRPQ-------TWD-----------------------------------
gi|326782632|re  GASRDRAYLFIGRPQ-------SWD-----------------------------------
gi|326783763|re  -STVNKYYAFMGLPN-------SVEP----------------------------------
gi|291335680|gb  -DSNNSYYSFIGLPN-------PFDY----------------------------------
gi|61805984|ref  -SDKNAYYVFIGLSNPTGTPSPSVQV----------------------------------
                                                                             


                        130       140       150       160       170       180
                 =========+=========+=========+=========+=========+=========+
GOS_2149020      ------------------------------------------SNTAATALSGVYRFATDD
gi|255929072|re  FRATVAAVYDSALLLTDVFGSNGVNSAPPLGSDLKETADGGATDTLATANSGVYRYATED
gi|291335845|gb  --------------------------------------------------SGTYRYATEE
gi|58532894|ref  FKASIVAAYENSLVLSSIQPS--VNSSPLPGATIK--GWNGASDTGAEAKSGVYRFATED
gi|326784074|re  FKASIEQVYDNSLILSDVTPS--VNSTPLPNAVLE--GWNGAADTGAEARAGVYRYATED
gi|326783059|re  ---------------------------------------------------------SSD
gi|326782112|re  ---------------------------------------------------------SSD
gi|326781852|re  ----------------------------------------------------QDDW--DS
gi|326783352|re  ---------------------------------------------------------NEN
gi|326784585|re  ---------------------------------------------------------NEN
gi|61806285|ref  ---------------------------------------------------------NEN
gi|113200609|re  ---------------------------------------------------------NEN
gi|326783573|re  ---------------------------------------------------------NEN
gi|326782840|re  ---------------------------------------------------------NEN
gi|326782364|re  ---------------------------------------------------------NEN
gi|326784346|re  ---------------------------------------------------------NEN
gi|326782632|re  ---------------------------------------------------------NEN
gi|326783763|re  ------------------------------------------------KAGGTATW--AT
gi|291335680|gb  ----------------------------------------------------------DS
gi|61805984|ref  ------------------------------------------------GYGRSSDWNKTN
                                                                             


                        190       200       210       220       230       240
                 =========+=========+=========+=========+=========+=========+
GOS_2149020      ----IPT--VPYDNQTEKYLAYADILAAKRITGEYARPVVRRFNWDISTNS------RFD
gi|255929072|re  ----VPP--LPLDNQREKIALYDEIIAAKRITDSFARTVVRRYNWNLVANP------KFD
gi|291335845|gb  ----APP--APLDNYSEKLAIYNELIAAKRVTGPFARLVVPRYNWN-LTLN-----PKFD
gi|58532894|ref  ----IPP--VPLDNQREKFDVYDDIIAAKRITSQFVRPVVTRYDWNLVATE-----KRFD
gi|326784074|re  ----TPP--TPLDNQIEKFSVYDEIIAAKRITDQFARAVITRYDWNLLATE-----PRFD
gi|326783059|re  ----VPPTGEPIDSFEYARTSYGDSVAFKRVDVSDTALVIPRVDWIDPTKTTGGVGRTYS
gi|326782112|re  ----VPPTGEPLDSFEYARTSYGDSVAFKRVDISDTALVIPRVDWTDPTKTTGGVGRTYS
gi|326781852|re  ----NPP--APKDNFSQENDYWDTMVALKKINSGDVRQVIPKRTWTS--------GTTYD
gi|326783352|re  ----SPP--QAIDSFQEFSSSYDDMISLKRVLASDTVQVVRRIDWVSPEQTTGGLGFTYD
gi|326784585|re  ----APP--QAVDSFSEFSNSYDDMISLKRVLAADTVQVVRRIDWVSPEETTGGLGFTYD
gi|61806285|ref  ----SPP--QAVDSFSEFSGSYDDMVSLKRVLASDTVQVVRRIDWVSPEQTTGGLGFTYD
gi|113200609|re  ----SPP--QAVDSFLEFSGSFDDMIALKRVLASDTIQVVRRIDWVSPEQTTGGLGFTYD
gi|326783573|re  ----SPP--QAVDSFSEFSGSYDDMISLKRVLASDTVQVVRRIDWVSPEQTTGGLGFTYD
gi|326782840|re  ----SPP--QAVDSFSEFSGSYDDMISMKRVLASDTIQVVRRIDWVSPEQTTGGLGFTYD
gi|326782364|re  ----SPP--QAIDAFDQFSDSYDDMISMKRVLASDTIQVIRRIDWTPPEQTTGGLGFTYD
gi|326784346|re  ----SPP--QAVDSFSEFSGSYDDMISMKRVLASDTIQVVRRIDWVSPEQTTGGLGFTYD
gi|326782632|re  ----SPP--QAVDSFSEFSGSYDDMISLKRVLASDTVQVVRRIDWVSPEETTGGLGFTYD
gi|326783763|re  ----NTP--SPLDGFEEEYSIKESIIAMKKVTDKDVRRLVRKVSWVA--------GTTYE
gi|291335680|gb  NWNVNPP--APKDNGDEENDYWDTIIAVKKIGNDDVRHVVRKIIWTR--------ENVYD
gi|61805984|ref  ----STP--KPLDSFSSTAHVGDTMMFGKRIASANIRRIVRRIDWTA--------GKRYE
                           ###################################               


                        250       260       270       280       290       300
                 =========+=========+=========+=========+=========+=========+
GOS_2149020      MWRPDYSEQ--------K-TSSVVLPGGS---TGSQNISTAKFYVVNGKYEVFKCLYNGE
gi|255929072|re  MWKPDYSATPGGGGQIGK-QT-------A---TGADSISDAKFYVMNSTYEVFKCLYNGE
gi|291335845|gb  MYRPNYSPTPGGGGSVGK-DT-------A---TGQSSLSEGKFYVMNQQYEVFKCLYNGE
gi|58532894|ref  MFKPDYSATVA--GRVGK-SS-------T---TGASALGDAKYYVVNANYEVFKCLYNGE
gi|326784074|re  MYKPDYSATTT--GQVGK-QS-------T---TGAASLGASKFYVINSNYEVFKCLYNGQ
gi|326783059|re  MYKPDYAPT--------K-TT-------A---NGASRLYDSNFYVMNSDFNVYKCLYNGQ
gi|326782112|re  MYKPDYAPT--------K-TT-------A---NGSSRLYDSNFYVMNSDFNVYKCLYNGQ
gi|326781852|re  MYRHDYSVT--------NTA--------A--VSGATNLYSAFYYVMNSDFRVYACLQNGT
gi|326783352|re  MYRHDYSPS--------K-TA-------S---SGATKLYDSDFYVVNSQYQVYKVIYNGT
gi|326784585|re  MYRHNYSPS--------K-TA-------S---SGATKLYDADFFVVNSQYQVYKCIYNGT
gi|61806285|ref  MYRHDYSPS--------K-TA-------A---SGATKLYDSDFYVVNSQYQVYKCIYNGT
gi|113200609|re  MYRHDYSPS--------K-TA-------S---SGATKLYDSDFYVVNSQYQVYKCIYNGT
gi|326783573|re  MYRHDYSPS--------K-TA-------A---SGATKLYDSDFYVVNSQYQVYKCIYNGT
gi|326782840|re  MYRHDYSPT--------K-TA-------A---SGATKLYDSDFYVVNSQYQCYKVIYNGT
gi|326782364|re  MYRHDYSPT--------N-TA-------A---SGATKLYDADFYVVNTNYQVYKCIYNGT
gi|326784346|re  MYRHDYSPS--------K-TA-------A---SGATKLYDSDFYVVNSQYQCYKCIYNGT
gi|326782632|re  MYRHDYSPS--------K-TA-------A---SGATKLYDSDFYVVNSQYQVYKCIYNGT
gi|326783763|re  MYRHDYNIY--------NLTP-------I---TSQGSLYDANYYVVNEDLKVYICLQNGS
gi|291335680|gb  MYRNDISRN--------RISN-------S---SSSSSIYASNFYVLNEDYRVYICLNNGT
gi|61805984|ref  MYRDDYSTE--------AGA--------QSPINDSSRLYGASYYVMNSEFKVYICISNGS
                                                  ###########################


                        310       320       330       340       350       360
                 =========+=========+=========+=========+=========+=========+
GOS_2149020      RPA----SLLPGGVLPTVAYEPSTTPSS------------GT----YA---N--GIYKEP
gi|255929072|re  GPG----N--PTG-------VDAVEEPTT--SGGNYDAGTGL----YT---E--TT----
gi|291335845|gb  GAA----N--PTG-------QNATYEPKSQPSPG-----QGA----FA---N--GIYTEP
gi|58532894|ref  FPGQVDPD--PVY-------EPKTNPSGG---EGTYNPSTGIFTERAT---AIV------
gi|326784074|re  FPGQV--D--PNP-----VYEPKTTPSAG---QGTYDAGSGLFTESAD---AVVAN----
gi|326783059|re  SPE----F--PRG-------RPSLVEPT------------GT----ST---T--II----
gi|326782112|re  SPE----F--PRG-------RPSLVEPT------------GT----ST---T--II----
gi|326781852|re  DPN----N--PNG-------KPSLDEPT------------FT----DLEPRS--AG----
gi|326783352|re  SPS----D--PNG-------KPSTVEPT------------GS----ST---S--II----
gi|326784585|re  SPS----D--PNG-------KPSTVEPT------------GT----ST---S--II----
gi|61806285|ref  SPS----D--PNG-------KPSTVEPT------------GT----ST---S--II----
gi|113200609|re  SPS----D--PNG-------KPSTVEPT------------GT----ST---S--II----
gi|326783573|re  SPS----D--PNG-------KPSTVEPT------------GT----ST---S--II----
gi|326782840|re  SPS----D--PNG-------KPSTVEPT------------GT----ST---S--II----
gi|326782364|re  SPS----D--PNG-------KPSTIEPT------------GT----ST---S--II----
gi|326784346|re  SPS----D--PNG-------KPSTVEPT------------GT----ST---S--II----
gi|326782632|re  SPS----D--PNG-------KPSTVEPT------------GT----ST---S--II----
gi|326783763|re  DPE----N--PKG-------RPSYDQPT------------FV----DLEPRA--AG----
gi|291335680|gb  DPE----H--PNG-------RPSLDQPT------------FT----DLEPKA--AG----
gi|61805984|ref  SGD----N--PTG-------NISQDEPM------------FT----DLEPSR--AG----
                 ##                                                          


                        370       380       390       400       410       420
                 =========+=========+=========+=========+=========+=========+
GOS_2149020      -VDA--NG--FCNYIWKYMYTISTNDVLRFLSTDFIPIV-------ADSVVQAAA-----
gi|255929072|re  -NVN--G-----NYIWKYMYTIPTDDVLKFLSSDFMPIVL--PANPSRTTVVGQA-----
gi|291335845|gb  AGTA--------GYIWKHMFTVPTGDVLAFLSTDFMPVVE--STELSRTQVEALA-----
gi|58532894|ref  -AASGAS-----GYVWKYMYTIPTEDVLRFLSTNFMSINL--AGEPTRAGTESIA-----
gi|326784074|re  -TAG--S-----GYIWKYMYTIPTDDVLRFLSTNFMPINL--TGEATRAATEAAA-----
gi|326783059|re  -ETS--DSPGVYSYRWKYLYTIDADNILKFVTTEFIPVL-------TNSLVQSAA-----
gi|326782112|re  -ETS--DSPGVYSYRWKYLYTIDADNILKFVTSEFIPVL-------SNSLVTSAA-----
gi|326781852|re  -SSG--D-----GYLWKYLYTIKPNEVVKFESTDFMPVPADWATSTDNAAVRDNA-----
gi|326783352|re  -TTG--D-----GYRWKYMYTIPVASVLKFFSNDYMPVF-------TNDAVRTNA-----
gi|326784585|re  -TTG--D-----GYRWKYMYTIPVASVLKFFSNDYMPVF-------TNTAVKTNA-----
gi|61806285|ref  -TTG--D-----GYRWKYMYTIPVASVLKFFSNDYMPVF-------TNAAVQTNA-----
gi|113200609|re  -TTA--D-----SYRWKYLYTIPVASVLKFFSNDYMPVF-------TNDAVKTNA-----
gi|326783573|re  -TTG--D-----GYRWKYMYTIPVASVLKFFSNDYMPVF-------TNSAVQTNA-----
gi|326782840|re  -TTG--D-----GYRWKYMYTIPVASVLKFFSNDYMPVF-------TNTSVKTNA-----
gi|326782364|re  -TTA--D-----GYRWKYMYTIPVAQVLKFFSNEYMPVF-------TNNSVKTNA-----
gi|326784346|re  -TTG--D-----GYRWKYMYTIPVASVLKFFSNDYMPVF-------TNTSVKTNA-----
gi|326782632|re  -TTG--D-----GYRWKYMYTIPVASVLKFFSNDYMPVF-------TNDSVKTNA-----
gi|326783763|re  -TSG--D-----GYVWKYLYTIKPSEIVKFDSIEYIPVPENWGNQGETVATKANA-----
gi|291335680|gb  -LSG--D-----GYIWKYLYTIKPSEIIKFDSTNYIPVPDNWDGNSNTAEVRSHA-----
gi|61805984|ref  -TSG--D-----GYVWKYLYTVSPADILKFDSTEYITVPNN-------WSTSTDAQIKAV
                              #########################                      


                        430       440       450       460       470       480
                 =========+=========+=========+=========+=========+=========+
GOS_2149020      --------VNGSVSTVVLKAIGSNLP-TSQTSLYA--PIVGD-----GSGGV----VKFG
gi|255929072|re  --------VDGAIDVVLIEDAGSGLPANRGVGNELYAGIKGD-----GTGGV----VKMT
gi|291335845|gb  --------VDGAVHVAVVKDGGSGLP--ASDTLYT--SVKGD-----GSGAV----VELT
gi|58532894|ref  --------VDGAIDIVLIEDRGTGLP--NGTH-YA--PVVGDG----QLGGNNPAIVKIV
gi|326784074|re  --------VDGAIDVVLVEDIGSGLP--NGTH-YA--PVLGDGQVS-GTQSV----VKIV
gi|326783059|re  --------NSGSVDTVVIENAGSGYN--NGTFTNV--PIRGDYTVNGGTQAS----CTVT
gi|326782112|re  --------NTGSVDTVVIENAGSGYN--NGQFTNV--PIRGDYNVNGGTQAL----CTVN
gi|326781852|re  --------VDGSIKVVTVTNSGVGLGTANQTYTRV--PIQGD-----GTGAE----CTLT
gi|326783352|re  --------VSGEIDTVVISSAGSGYN--NGTYDNV--AINGD-----GTGGR----VSIV
gi|326784585|re  --------VTGEIDTVVINAAGSGYN--NGTYDNV--AINGD-----GTGGR----VSIV
gi|61806285|ref  --------VAGEVDTVVINAAGSGYN--NGTYDNV--AINGD-----GTGGR----VSIV
gi|113200609|re  --------VTGEVDTVVITSAGTGYN--NGTYDNV--AINGD-----GTGGR----VSIV
gi|326783573|re  --------VSGEVDTVVINSAGSGYN--NGTYDNV--AINGD-----GTGGR----VSVV
gi|326782840|re  --------VAGEIDTVVINSAGSGYN--NGTYDNV--AINGD-----GTGGR----VSIV
gi|326782364|re  --------VSGEVDTVVITSSGSGYN--NGTYDNV--AIAGD-----GVGGR----VSIV
gi|326784346|re  --------VAGEIDTVVINSAGSGYN--NGTYDNV--GINGD-----GTGGR----VSVV
gi|326782632|re  --------VAGEIDTVVINAAGSGYN--NGTYDNV--AINGD-----GTGGR----VSIV
gi|326783763|re  --------IDGKIEVVVVNDRGSNYQPISTSFANV--PILGD-----GSGGK----ATIT
gi|291335680|gb  -------ESSGQLKNILITDRGVGLGTANVIYTNV--PIKGD-----GTGAE----ASIV
gi|61805984|ref  RENGDSSLNSNQIKHIYIDKAGGKYA--DGLGQEV--DILGD-----GTGGK----ARVD
                           ################                                  


                        490       500       510       520       530       540
                 =========+=========+=========+=========+=========+=========+
GOS_2149020      TNSSGNITYAILQSAGS-GYSYANILLNNGNV----------YSD--AALT---TPVVV-
gi|255929072|re  TDGSGSILTAEVEVRGS-GYTYANVLLSNGNL----------FTD--PGLAPGDAIATP-
gi|291335845|gb  TDASGTVTAASMFNVGS-GYSYGNLLLETGE-----------------------------
gi|58532894|ref  VS-GGQIESTEVVDRGTGGYTYASVPLENGQT----------IAGLPYGLYSNQTLTTPR
gi|326784074|re  VS-SGSIESTEVVVKGA-GYTYASIALDDGAT----------VGGIKYGLYAEQALTTAR
gi|326783059|re  VV-SGSISAVTITTAGS-GYSFASIDTSL-------------IANI--------------
gi|326782112|re  VV-SGSVSSVTITQAGS-GYSFASIDVSL-------------ITNI--------------
gi|326781852|re  VGADSKVSGVTVSNQGS-GYSYGSLNLE---------------AG------------GV-
gi|326783352|re  ID-GGRIISATVTSGGT-GYTFGKISIDS-------------ITGI--------------
gi|326784585|re  VD-GGKVTSATVTSGGT-GYTFGQISINA-------------ITGI--------------
gi|61806285|ref  ID-GGKIISATVTSGGT-GYTFGKISVDN-------------ITGI--------------
gi|113200609|re  VD-GGKIISATVTSGGT-GYTFGKISVDN-------------ITGI--------------
gi|326783573|re  ID-GGKIISATVTSGGT-GYTFGKISVDA-------------ITGI--------------
gi|326782840|re  VD-GGKIISATVTSGGT-GYTFGKISVDN-------------ITGI--------------
gi|326782364|re  VD-GGKIISATVTSGGT-GYSFGKISVDT-------------ISGI--------------
gi|326784346|re  VD-GGKVISATVTSGGT-GYTFGKVSVDN-------------ITGI--------------
gi|326782632|re  VD-GGKIISATVTSGGT-GYTFGKISVDS-------------VTGI--------------
gi|326783763|re  IDSFGKVSEVFVTDGGE-GYTHGSIQFFPGAPGSESGGVLANLTNT-----------GI-
gi|291335680|gb  IGNDSKIDRIVVTNGGS-GYTYGIVDYI---------------SG------------GL-
gi|61805984|ref  VVG-GKITNATVSSGGS-GYSYGLVDLG-------------ALQDA--------------
                     ############                                            


                        550       560       570       580       590       600
                 =========+=========+=========+=========+=========+=========+
GOS_2149020      -ASNA--TGAIEAVLSPQGGHGSDP-----------------------------------
gi|255929072|re  -G-GW--TGALEAILPPQGGHGSDHETELNGKRVMTNIRLTYDEGSGDFPVDNDFRRIGI
gi|291335845|gb  ------------------------------------------------------------
gi|58532894|ref  TGVTG--TGALEVVLPPQGGHGSDFELELNAKRVMTNIRLTYAEGQGDFPVDNDFRRIGI
gi|326784074|re  TGVGG--TGALEVVLPPQGGHGADFELELNAKRVMTNIRLTYAEGSGDFPVDNDFRRIGI
gi|326783059|re  -GNGT--NADLDVVLPPNGGHGKDSVRELGAYRLMFASKLETTSAFVDFPNDLTYRRVGL
gi|326782112|re  -GNGS--DASLDVVLPPNGGHGNDSVRELGAYRLMFASKLETTSAFIDFPNDLTYRRVGL
gi|326781852|re  -PTG-TTIPTFDVIMAPQGGHGADIYRELGAYNVLLYSRIENDNENPDFVTGNQIARVGI
gi|326783352|re  -GTGA--SGIVDVIMPPPGGHGSDSVVELGAFRVMVNAKLSYDEGAGDFPIDNDYRRIGL
gi|326784585|re  -GTGT--SGEVDVVIPPPDGHGYDSSIELGGFRVMINAKLSYDEGAGDFPIDNDYRRIGL
gi|61806285|ref  -GTGT--GGQVDVIIPPPGGHGKDSVVELGAFRTMINAKLSYDEGAGDFPVDNDYRRIGL
gi|113200609|re  -GTGT--SGQVDVIIPPPNGHGFDPIVELGAFRVMINSKTSYAEGAGDFPIDNDYRRIGL
gi|326783573|re  -GTGT--GGQVDVIIPPPGGHGNDAVVEIGAFRVMINAKLSYDEGAGDFPVDNDYRRIGL
gi|326782840|re  -GTGT--GAIVDVIIPPPGGHGSDAVVELGAFRVMINAKLSYDEGAGDFPIDNDYRRIGL
gi|326782364|re  -GTGA--AAQIDVSMPPPGGHGFDSIIELGAYRVMINAKLSYDEGAGDFPVDNDYRRVGL
gi|326784346|re  -GTGT--GGQVDVIIPPPGGHGADAVVELGAFRVMINAKLSYDEGAGDFPIDNDYRRIGL
gi|326782632|re  -GTGT--GGQVDVIIPPPNGHGADAVVELGAFRVMINAKLSYDEGAGDFPIDNDYRRIGL
gi|326783763|re  -GT--TSVANFSVIIPPKGGHGYDVYRELGAYRALLYSRFETLETNPDIIEGNDFARVGL
gi|291335680|gb  -PVN-TLAPSFDVIIPPKGGHGADIYNELGAFNVMIFSAIENDLENPDFILENQISRVGI
gi|61805984|ref  -AHPSNQRAKLVPIIPPSLGHGYDIYKELGTDRVLIYAR--FDDSTKDFPSDTKFAQVGI
                                                                             


                        610       620       630       640       650       660
                 =========+=========+=========+=========+=========+=========+
GOS_2149020      ------------------------------------------------------------
gi|255929072|re  IKDPYAWGT-TDFLLSDTVSGLRALKIEGS-----QADYIP--DETITQTV-SGG---TA
gi|291335845|gb  ------------VYTD--------------------------------------------
gi|58532894|ref  LKDPELWNS-SDYATLSTLNGMYAVKIQNA----TAG-FVA--DEEITQNLAGGG---VA
gi|326784074|re  IKDPFNWST-TDFAVLDTLNGLYAAKITGA-----SADYVS--DETITQALAGGG---TA
gi|326783059|re  VLNPTDYNT-TTVCSQNTRSAVKALIFPQSGAGTPSGNFVA--GETITQT------STNA
gi|326782112|re  VLNPTDYNT-TTVCSQNTRSAVKALIFPQSGAGTPSGTFAP--GETITQS------TTGA
gi|326781852|re  VENPQV-SV-GNVLTSDKASALNALKLTG--TGYSSASFTA--DSYFTQTVATGS---TA
gi|326783352|re  VTNPLKFGT-EELISDLTVSAAKAVIFSPT----FQGNYVP--DEIITQTRVVGGTSITG
gi|326784585|re  VTNPLKFGT-SELLADLTVSATKAVIFSPT----FQGNYVP--DEIITQTRVVGGTNITA
gi|61806285|ref  ITNPLKYGT-AELLSDLTVSATKAVIFSPT----FQGNYVP--DEIITQTRVVGGTNVTA
gi|113200609|re  VTNPKKFGT-EELLSDLTVSAAKAVIFPPS----FQGNYTP--DEIITQNRVVGGTNVTA
gi|326783573|re  ITNPLKFGT-SELISDLTVSASKAVIFSPT----FQGNYVP--DEIITQTRVVGGQNVTA
gi|326782840|re  VTNPLKFGT-SELISDLTVSATKAAIFSPT----FQGNYVP--DEIITQTRVVGGTNVTA
gi|326782364|re  VVNPRKFGT-SELQSDLTSSVTKAVIFAPT----FQGNFLP--DEIITQTRTVGGQSVTS
gi|326784346|re  VTNPLKFGT-SELISDLTVSATKAAIFSPT----FQGNYVP--DEIITQTRVVGGTNVTA
gi|326782632|re  ITNPLKFGT-EELISDLTVSATKAVIFSPT----FQGNYVP--DEIITQTRVVGGTNVTA
gi|326783763|re  IKNPTVFGSNTELLDTAMVSGLKALKLGG--IT-TATTYAV--DSEITQTVGVGS---TA
gi|291335680|gb  VQNPTVYNS-TDILTLDKASSISAIKLIG--TGFNEVSFDE--DSVIEQTVGTGQ---TA
gi|61805984|ref  VKNPTKVGT-AVTYSDSTFSSTQAFIFDT---I-ADSSVTPKVGERITQVLTSGQ---IA
                                                                             


                        670       680       690       700       710       720
                 =========+=========+=========+=========+=========+=========+
GOS_2149020      ------------------------------------------------------------
gi|255929072|re  YGTVVSWVLDSGSTTDGVLKYIQTN-D---------------AHTDQGVVRAF-ESSGAT
gi|291335845|gb  ------------------------------------------------------SGLT-A
gi|58532894|ref  KGNVVSWTLDTGSTTNGVLKYVQSP-E---------------FHANGGIVRAF-DNSA-N
gi|326784074|re  KGTVVSWTLDSGSTTDGVLKYIQSP-D---------------LHADQGVVRAF-DDSA-N
gi|326783059|re  KGLVVSY-----DSTTKVLKYYQDV-TDG---------------TVNGNVIAF-SGAN-Q
gi|326782112|re  KGFVVSY-----DATTKVLKYYQDS-TDG---------------TVNGNVIAF-AGAN-Q
gi|326781852|re  VGRVVNY-----DATTGVLKYWQDRSLAGF------------TTAGIGITNPT-YGFD-L
gi|326783352|re  RGRVVSW-----NPTTKVLKYYQNS-IDGI------------FPEVTGTQNEL-DGSN-V
gi|326784585|re  RARVISW-----NATTKVLKYYQNS-VDGI------------FPEVTGTQNEF-DGSN-V
gi|61806285|ref  RARVISW-----NATTKVLKYYQNA-VDGI------------FPEVTGTQNEF-DGSN-V
gi|113200609|re  RGRVVSW-----NATTKVLKYYQNN-VDGI------------FPEVTGTLNEF-DGSN-V
gi|326783573|re  RARVISW-----NATTKVLKYYQNA-VDGI------------FPEVTGTQNEF-DGSN-V
gi|326782840|re  RGRVISW-----NATTKVLKYYQNA-IDGI------------FPEVTGTQNEF-DGSN-V
gi|326782364|re  RARVISW-----NPTTKVLKYYQNR-VDGI------------FPEITGSLNEF-DGSN-A
gi|326784346|re  RGRVISW-----NATTKVLKYYQNA-VDGI------------FPEVTGTQNEF-DGSN-V
gi|326782632|re  RGRVISW-----NATTKVLKYYQNA-VDGI------------FPEVTGTQNEF-DGSN-V
gi|326783763|re  IGYVASW-----DKVTGVLKYYQPMGLASS-------ETGYKIIPFTSVPDAG-YGVT-I
gi|291335680|gb  IGRVISY-----NKSTGILKYWNDRYTHGVGYG----GTISYYPQYGLDRYTF-TNTP-T
gi|61805984|ref  QGYVASY-----DEETKVMKYFRDRSL---NYTTTLDQTDYTGISTSGRIYSFEDSSN-A
                                                                             


                        730       740       750       760       770       780
                 =========+=========+=========+=========+=========+=========+
GOS_2149020      ------------------------------------------------------------
gi|255929072|re  ISGGQS--GGDGT--VTTGY-------------NTGGGV----LPLLGHSFTAGLSNPEI
gi|291335845|gb  AAGAFS--GTAAL-----------------------------------------------
gi|58532894|ref  IVGSNS--LASGS--VDP---------------SANATT----I--LGIPFTSGFGYPEI
gi|326784074|re  IVGSAS--LASGA--V-A---------------TGVTGT----TL-LGVSFTNGFGTPEI
gi|326783059|re  ITGSAS--SYNAT--PDATFG------T-SSVP-LTQITIGVSVYELGLSFVTGYANEEI
gi|326782112|re  ITGSTN--SYTAT--PDATFG------T-ASVP-LTQITIGVSVYELGLSFVEGYANEEI
gi|326781852|re  KEFTAS--PDAGG--SVTIVPSSGSNLAIDTSFTGITTVINNRTYYLGQSFTSGVAGPEV
gi|326783352|re  ISGATS--GAAGQ--PDVNF---------PAVPNSSSRTINNTEYDLGMKFNNGYAKPEI
gi|326784585|re  ISGATS--GASGQ--PDVNF---------PAVPNSSSRTINNTEYDLGMRFTSGYAKAEI
gi|61806285|ref  VSGATS--GAAGA--PDVNF---------PAVPNSSARTINNTEYDLGMKFNNGYAKPEI
gi|113200609|re  INGATS--GAAGQ--PDVNF---------PAVPNSSSRTINNTEYDLGMKFNNGYAKPEI
gi|326783573|re  ISGATS--GAAGQ--PDVNF---------PAVPNSSSRTINNTEYDLGMKFNNGYAKPEI
gi|326782840|re  INGATS--GAAGQGQPDVNF---------PAVPNSSSRTINNTEYDLGMKFNNGYSKPEL
gi|326782364|re  ISGASS--GSSAE--PDINF---------PTVPNTSSRVINNTEYDLGMRFTSAYAKPEV
gi|326784346|re  INGATS--GAAGQ--PDVNF---------PAVPNSSSRTINNTEYDLGMKFNNGYSKPEL
gi|326782632|re  INGATS--GAAGQ--PDVNF---------PAVPNSSSRTINNTEYDLGMKFNNGYSKPEI
gi|326783763|re  NGSSVS--GSLLS--IDTN-------------YNGVSTSINNKTYQLGMSFTSGISSAEF
gi|291335680|gb  GAGSINIRGNEGD--IVLQI---------DTAFTGSTMPINNRTYNLGQNFVNGLANPEV
gi|61805984|ref  IKGDSS--NFSAG--INTSFS------GI---TTNPT---GTKLIDLGVNFSNGLSNSEI
                                                                             


                        790       800       810
                 =========+=========+=========+==
GOS_2149020      --------------------------------
gi|255929072|re  ENNSGDVIYIENRRLITRAPDQIEDIKLVIEF
gi|291335845|gb  -----NWLFLQRRTWF----------------
gi|58532894|ref  KPNSGDIIYVENRRLITRAADQIEDIKLVIEF
gi|326784074|re  EQNSGDIIYVENRRLITRAADQIEDIKLVIEF
gi|326783059|re  ELNSGEILYLDNRIPITRSADQNEELKVVIEF
gi|326782112|re  ELNSGEILYLDNRIPITRSADQNEELKVVIEF
gi|326781852|re  KKHAGNIIYVDNRPSITRSSNQKEDIKIILQF
gi|326783352|re  KSNSGNVVYIDNRRAISRANDQVEDIKIVIEF
gi|326784585|re  EPNSGQVVYIDNRRAISRANDQVEDIKIVIEF
gi|61806285|ref  ESNSGDVVYIDNRRSISRANDQVEDIKIVIEF
gi|113200609|re  KSNDGDIIYIDNRRAISRANDQIEDIKIVIEF
gi|326783573|re  KSNSGQVVYIDNRRSISRANDQVEDIKIVIEF
gi|326782840|re  ASNSGQVVYIDNRRAISRANDQVEDIKIVIEF
gi|326782364|re  AFNTGEIIYLDNRRSISRASDQIEDIKIVIEF
gi|326784346|re  ASNSGQVVYIDNRRAISRANDQVEDIKIVIEF
gi|326782632|re  KSNSGQVVYIDNRRAISRANDQVEDIKIVIEF
gi|326783763|re  NTKSGEIIYIDNRTAIPRSASQKEDIKIVLEF
gi|291335680|gb  QKYSGNIIYVDNRPPITRSQNQKELIKVILQF
gi|61805984|ref  NKGSGEIVYLDNRPLIARNERQKEDVKIILEF


                                                 










Protein Domains

PROTOCOL

InterProScan, default parameters at EBI


RESULTS ANALYSIS


Analyzing the ORF with InterProScan gave 3 results: two signatures with a known biological function (E-values 2.8e-30 and 1.9e-19), both integrated in the same InterPro entry (IPR015298), and one unintegrated signature (6e-14).

We chose Bacteriophage T4, Gp8 (accession number IPR015298) as the more specific domain, because its Pfam match has the most significant E-value (2.8e-30), compared with the less significant E-value of the superfamily match (1.9e-19).
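The selection described above can be written as a small sketch. The values are copied from the raw InterProScan results of this annotation; the `Hit` record and the helper names (`best_hit`, `coverage`, `group_by_entry`) are hypothetical and only serve to make the reasoning explicit.

```python
from collections import namedtuple

Hit = namedtuple("Hit", "method accession description start end evalue interpro")

PROTEIN_LENGTH = 324  # length reported in the raw InterProScan rows
HITS = [
    Hit("Gene3D", "G3DSA:2.170.290.10", "no description", 77, 230, 6e-14, None),
    Hit("HMMPfam", "PF09215", "Phage-Gp8", 35, 307, 2.8e-30, "IPR015298"),
    Hit("superfamily", "SSF89433", "Baseplate structural protein gp8",
        39, 238, 1.9e-19, "IPR015298"),
]


def best_hit(hits):
    """Return the hit with the lowest (most significant) E-value."""
    return min(hits, key=lambda h: h.evalue)


def coverage(hit, length):
    """Fraction of the protein covered by the hit (inclusive coordinates)."""
    return (hit.end - hit.start + 1) / length


def group_by_entry(hits):
    """Group signature accessions by InterPro entry; None means unintegrated."""
    groups = {}
    for h in hits:
        groups.setdefault(h.interpro or "unintegrated", []).append(h.accession)
    return groups


top = best_hit(HITS)
print(top.accession, top.evalue, round(coverage(top, PROTEIN_LENGTH), 2))
print(group_by_entry(HITS))
```

Grouping by InterPro entry shows that the Pfam and superfamily signatures point to the same entry, so they support one domain assignment rather than two independent ones.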

RAW RESULTS

Sequence_1	83A9550EE450ED8E	324	Gene3D	      G3DSA:2.170.290.10	no description	                        77	230	6e-14	T	14-Apr-2011	NULL	        NULL	
Sequence_1	83A9550EE450ED8E	324	HMMPfam	      PF09215	                Phage-Gp8	                        35	307	2.8e-30	T	14-Apr-2011	IPR015298	Bacteriophage T4, Gp8	
Sequence_1	83A9550EE450ED8E	324	superfamily   SSF89433	                Baseplate structural protein gp8	39	238	1.9e-19	T	14-Apr-2011	IPR015298	Bacteriophage T4, Gp8	

Phylogeny

PROTOCOL


a) Phylogeny.fr / BioNJ method / outgroup: no outgroup

b) Phylogeny.fr / PhyML method / no bootstrap / default substitution model / outgroup: no outgroup



RESULTS ANALYSIS


The two trees are in broad agreement: in both, GOS_2149020 falls in the same group as uncultured phage MedDCM-OCT-S08-C1281, Synechococcus phage S-RSM4, Synechococcus phage S-PM2 and Cyanophage Syn1.

Neither tree includes outgroup sequences, because the homologs are few and all belong to the same order; the trees are therefore unrooted.

We believe that our sequence is related to Synechococcus phages and other cyanophages. The uncultured phage MedDCM-OCT-S08-C1281 may also be related to this group, but this is not certain.

The trees agree with the BLAST-based taxonomy report.




RAW RESULTS

a) BioNJ                                                                              
                                                                                                      ------0.1----
 
                                          +------------------------------------------GOS_2149020
                                          |
    +-------------------------------------+----------------------------------------uncultured_phage_MedDCM-OCT-S08-C1281           [viruses]
    |                                     |
    |                                     |         +--------------------Synechococcus_phage_S-RSM4                                [viruses]                         
    |                                     |         |
    |                                     +---------+          +--------------Synechococcus_phage_S-PM2                            [viruses]           
    |                                               |          |
 +--+                                               +----------+
 |  |                                                          +--------------Cyanophage_Syn1                                      [viruses]
 |  |
 |  |                                                +--Cyanophage_M4-259                                                          [viruses]
 |  |------------------------------------------------+
 |  |                                                +----Cyanophage_M4-247                                                        [viruses]
 |  |
 |  |--------------------------------------------Cyanophage_8102-4                                                                 [viruses]
 |  |
 |  |--------------------------------------------Synechococcus_phage_syn9                                                          [viruses]
 |  |
 |  |--------------------------------------------Cyanophage_Syn33                                                                  [viruses]
 |  |
 |  |
 |  |                                          +-Cyanophage_9303-10a                                                               [viruses]
 |  |------------------------------------------+
 |  |                                          +Cyanophage_8102-12                                                                 [viruses]
 |  |
 |  |------------------------------------------------Cyanophage_NATL1A15                                                           [viruses]
 |  |
 |  |------------------------------------------Cyanophage_6501-1                                                                   [viruses]
 |  |
 |  |-------------------------------------------Prochlorococcus_phage_P-SSM4                                                       [viruses]
 |  |
 |  +------------------------------------------Cyanophage_Syn19                                                                    [viruses]
 |
 |
 |           +-----------------------------------------------------Prochlorococcus_phage_P-SSM2                                    [viruses]
 +-----------+
             |              +--------------------------------------------------Cyanophage_8109-3                                   [viruses]
             +--------------+
                            |----------------------------------Cyanophage_8017-1                                                   [viruses]
                            |
                            +--------------------------------------------------------uncultured_phage_MedDCM-OCT-S04-C93           [viruses]


b) PhyML
                                                                                                         ----0.1---
 
                                                                    +---------------Synechococcus_phage_S-PM2                      [viruses]
                                        +---------------------------+
                                        |                           +------Cyanophage_Syn1                                         [viruses]
                                        |
 +--------------------------------------+---------------------Synechococcus_phage_S-RSM4                                           [viruses]
 |                                      |
 |                                      |--------------------------------------------GOS_2149020
 |                                      |
 |                                      | 
 |                                      +----------------------------uncultured_phage_MedDCM-OCT-S08-C1281                         [viruses]
 |
 |                   +-------------------------------------------Prochlorococcus_phage_P-SSM2                                      [viruses]
 |                   
 |+------------------+              +-------------------------------Cyanophage_8109-3                                              [viruses]
 ||                  |              |   
 ||                  +--------------+             +----------------------Cyanophage_8017-1                                         [viruses]
 ||                                 +-------------+
 ||                                               +----------------------------------uncultured_phage_MedDCM-OCT-S04-C93           [viruses]
 ++
  |                                         +Cyanophage_M4-259                                                                     [viruses]
  |-----------------------------------------+
  |                                         |
  |                                         +----Cyanophage_M4-247                                                                 [viruses]
  |
  |-----------------------------------Cyanophage_8102-4                                                                            [viruses]
  |
  |-----------------------------------Cyanophage_9303-10a                                                                          [viruses]
  |
  |----------------------------------Cyanophage_8102-12                                                                            [viruses]
  | 
  |                                +----Synechococcus_phage_syn9                                                                   [viruses]
  |                                | 
  +--------------------------------+Cyanophage_Syn19                                                                               [viruses]
                                   |
                                   |
                                   |--Prochlorococcus_phage_P-SSM4                                                                 [viruses]
                                   
                                   | +-----Cyanophage_NATL1A15                                                                     [viruses]
                                   |-+
                                   | +Cyanophage_6501-1                                                                            [viruses]
                                   |
                                   +-----Cyanophage_Syn33                                                                          [viruses]

Taxonomy report

PROTOCOL

a) BLASTp versus NR, NCBI default parameters apart from "Number of descriptions" set to 1000


b) BLASTp versus swissprot, NCBI default parameters apart from "Number of descriptions" set to 1000



RESULTS ANALYSIS


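Most of the hits against NR are viral (37 of 59 hits, 19 of 30 organisms), and all 19 sequences selected as the ingroup are phages.

Because the significant homologs are few and all belong to the same order, we built the phylogenetic tree from ingroup sequences only.

The taxonomy results are heterogeneous, and the E-values of the hits vary widely (from 4e-75 to 1e-10 within the ingroup).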


Ingroup:

ref|YP_003097384.1| Synechococcus phage 4e-75 Synechococcus phage S-RSM4 [viruses]

gb|ADD95443.1| uncultured phage 6e-64 uncultured phage MedDCM-OCT-S08-C1281 [viruses]

ref|YP_195117.1| Synechococcus phage 5e-63 Synechococcus phage S-PM2 [viruses]

ref|YP_004324467.1| Prochlorococcus phage 6e-62 Prochlorococcus phage Syn1 [viruses]

ref|YP_004323456.1| Prochlorococcus phage 4e-29 Prochlorococcus phage P-HM2 [viruses]

ref|YP_004322513.1| Prochlorococcus phage 4e-27 Prochlorococcus phage P-HM1 [viruses]

ref|YP_004322254.1| Synechococcus phage 5e-27 Synechococcus phage S-SM2 [viruses]

ref|YP_004323702.1| Prochlorococcus phage 2e-26 Prochlorococcus phage Syn33 [viruses]

ref|YP_004324924.1| Prochlorococcus phage 3e-26 Prochlorococcus phage P-SSM7 [viruses]

ref|YP_214644.1| Prochlorococcus phage 2e-25 Prochlorococcus phage P-SSM4 [viruses]

ref|YP_717771.1| Synechococcus phage 2e-25 Synechococcus phage syn9 [viruses]

ref|YP_004323924.1| Synechococcus phage 2e-25 Synechococcus phage Syn19 [viruses]

ref|YP_004323238.1| Prochlorococcus phage 2e-24 Prochlorococcus phage P-RSM4 [viruses]

ref|YP_004322764.1| Synechococcus phage 5e-24 Synechococcus phage S-ShM2 [viruses]

ref|YP_004324700.1| Synechococcus phage 1e-23 Synechococcus phage S-SSM5 [viruses]

ref|YP_004322996.1| Synechococcus phage 2e-23 Synechococcus phage S-SM1 [viruses]

ref|YP_004324157.1| Synechococcus phage 8e-22 Synechococcus phage S-SSM7 [viruses]

gb|ADD95286.1| uncultured phage 2e-15 uncultured phage MedDCM-OCT-S04-C93 [viruses]

ref|YP_214344.1| Prochlorococcus phage 1e-10 Prochlorococcus phage P-SSM2 [viruses]













RAW RESULTS
a)
root .........................................    59 hits   30 orgs 
. Viruses ....................................    37 hits   19 orgs 
. . Caudovirales .............................    31 hits   15 orgs [dsDNA viruses, no RNA stage]
. . . Myoviridae .............................    27 hits   13 orgs 
. . . . unclassified T4-like viruses .........    11 hits    5 orgs [T4-like viruses]
. . . . . Synechococcus phage S-RSM4 .........     2 hits    1 orgs 
. . . . . Synechococcus phage S-PM2 ..........     2 hits    1 orgs 
. . . . . Prochlorococcus phage P-SSM4 .......     2 hits    1 orgs 
. . . . . Synechococcus phage syn9 ...........     2 hits    1 orgs 
. . . . . Prochlorococcus phage P-SSM2 .......     3 hits    1 orgs 
. . . . unclassified Myoviridae ..............    16 hits    8 orgs 
. . . . . Prochlorococcus phage P-HM1 ........     2 hits    1 orgs 
. . . . . Synechococcus phage S-SM2 ..........     2 hits    1 orgs 
. . . . . Prochlorococcus phage P-SSM7 .......     2 hits    1 orgs 
. . . . . Prochlorococcus phage P-RSM4 .......     2 hits    1 orgs 
. . . . . Synechococcus phage S-ShM2 .........     2 hits    1 orgs 
. . . . . Synechococcus phage S-SSM5 .........     2 hits    1 orgs 
. . . . . Synechococcus phage S-SM1 ..........     2 hits    1 orgs 
. . . . . Synechococcus phage S-SSM7 .........     2 hits    1 orgs 
. . . unclassified Caudovirales ..............     4 hits    2 orgs 
. . . . Prochlorococcus phage Syn1 ...........     2 hits    1 orgs 
. . . . Prochlorococcus phage Syn33 ..........     2 hits    1 orgs 
. . environmental samples ....................     2 hits    2 orgs 
. . . uncultured phage MedDCM-OCT-S08-C1281 ..     1 hits    1 orgs 
. . . uncultured phage MedDCM-OCT-S04-C93 ....     1 hits    1 orgs 
. . unclassified phages ......................     4 hits    2 orgs 
. . . Prochlorococcus phage P-HM2 ............     2 hits    1 orgs 
. . . Synechococcus phage Syn19 ..............     2 hits    1 orgs 
. cellular organisms .........................    22 hits   11 orgs 
. . Bacteria .................................    11 hits    4 orgs 
. . . Actinomycetales ........................     7 hits    2 orgs [Actinobacteria; Actinobacteria (class); Actinobacteridae]
. . . . Streptomyces clavuligerus ATCC 27064 .     3 hits    1 orgs [Streptomycineae; Streptomycetaceae; Streptomyces; Streptomyces clavuligerus]
. . . . Frankia sp. EuI1c ....................     4 hits    1 orgs [Frankineae; Frankiaceae; Frankia]
. . . Acidobacterium sp. MP5ACTX8 ............     2 hits    1 orgs [Fibrobacteres/Acidobacteria group; Acidobacteria; Acidobacteria (class); Acidobacteriales; Acidobacteriaceae; Acidobacterium]
. . . Dehalococcoides ethenogenes 195 ........     2 hits    1 orgs [Chloroflexi; Dehalococcoidetes; Dehalococcoides; Dehalococcoides ethenogenes]
. . Eukaryota ................................    11 hits    7 orgs 
. . . Fungi/Metazoa group ....................     9 hits    6 orgs 
. . . . Dikarya ..............................     8 hits    5 orgs [Fungi]
. . . . . Eurotiomycetidae ...................     6 hits    4 orgs [Ascomycota; saccharomyceta; Pezizomycotina; leotiomyceta; Eurotiomycetes]
. . . . . . Neosartorya ......................     5 hits    3 orgs [Eurotiales; Trichocomaceae]
. . . . . . . Neosartorya fischeri NRRL 181 ..     2 hits    1 orgs [Neosartorya fischeri group; Neosartorya fischeri]
. . . . . . . Aspergillus fumigatus ..........     3 hits    2 orgs 
. . . . . . . . Aspergillus fumigatus Af293 ..     2 hits    1 orgs 
. . . . . . . . Aspergillus fumigatus A1163 ..     1 hits    1 orgs 
. . . . . . Ajellomyces dermatitidis ER-3 ....     1 hits    1 orgs [Onygenales; Ajellomycetaceae; Ajellomyces; Ajellomyces dermatitidis]
. . . . . Cryptococcus gattii WM276 ..........     2 hits    1 orgs [Basidiomycota; Agaricomycotina; Tremellomycetes; Tremellales; Tremellaceae; Filobasidiella; Filobasidiella/Cryptococcus neoformans species complex; Cryptococcus gattii]
. . . . Callithrix jacchus ...................     1 hits    1 orgs [Metazoa; Eumetazoa; Bilateria; Coelomata; Deuterostomia; Chordata; Craniata; Vertebrata; Gnathostomata; Teleostomi; Euteleostomi; Sarcopterygii; Tetrapoda; Amniota; Mammalia; Theria; Eutheria; Euarchontoglires; Primates; Haplorrhini; Simiiformes; Platyrrhini; Cebidae; Callitrichinae; Callithrix]
. . . Perkinsus marinus ATCC 50983 ...........     2 hits    1 orgs [Alveolata; Perkinsea; Perkinsida; Perkinsidae; Perkinsus; Perkinsus marinus]

b)
root .......................................     9 hits    8 orgs 
. cellular organisms .......................     8 hits    7 orgs 
. . Methanocaldococcus jannaschii ..........     2 hits    1 orgs [Archaea; Euryarchaeota; Methanococci; Methanococcales; Methanocaldococcaceae; Methanocaldococcus]
. . Bacteria ...............................     3 hits    3 orgs 
. . . Bacillus .............................     2 hits    2 orgs [Firmicutes; Bacilli; Bacillales; Bacillaceae]
. . . . Bacillus subtilis ..................     1 hits    1 orgs [Bacillus subtilis group]
. . . . Bacillus pseudofirmus OF4 ..........     1 hits    1 orgs [Bacillus pseudofirmus]
. . . Shewanella piezotolerans WP3 .........     1 hits    1 orgs [Proteobacteria; Gammaproteobacteria; Alteromonadales; Shewanellaceae; Shewanella; Shewanella piezotolerans]
. . Fungi/Metazoa group ....................     3 hits    3 orgs [Eukaryota]
. . . Schizosaccharomyces japonicus yFS275 .     1 hits    1 orgs [Fungi; Dikarya; Ascomycota; Taphrinomycotina; Schizosaccharomycetes; Schizosaccharomycetales; Schizosaccharomycetaceae; Schizosaccharomyces; Schizosaccharomyces japonicus]
. . . Bilateria ............................     2 hits    2 orgs [Metazoa; Eumetazoa]
. . . . Caenorhabditis elegans .............     1 hits    1 orgs [Pseudocoelomata; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; Rhabditidae; Peloderinae; Caenorhabditis]
. . . . Homo sapiens .......................     1 hits    1 orgs [Coelomata; Deuterostomia; Chordata; Craniata; Vertebrata; Gnathostomata; Teleostomi; Euteleostomi; Sarcopterygii; Tetrapoda; Amniota; Mammalia; Theria; Eutheria; Euarchontoglires; Primates; Haplorrhini; Simiiformes; Catarrhini; Hominoidea; Hominidae; Homininae; Homo]
. Human adenovirus 6 .......................     1 hits    1 orgs [Viruses; dsDNA viruses, no RNA stage; Adenoviridae; Mastadenovirus; Human adenovirus C]
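
The share of hits falling under each top-level group can be read off the indented report programmatically. The sketch below is one possible way to do it, assuming the layout seen above (one ". " per indentation level, then the name, a run of dots, and the hit and organism counts). Only a few lines of report a) are copied in as sample input, and the function names are hypothetical.

```python
import re

# A few lines copied from report a) above, used here as sample input.
SAMPLE = """\
root .........................................    59 hits   30 orgs
. Viruses ....................................    37 hits   19 orgs
. . Caudovirales .............................    31 hits   15 orgs [dsDNA viruses, no RNA stage]
. cellular organisms .........................    22 hits   11 orgs
. . Bacteria .................................    11 hits    4 orgs
. . Eukaryota ................................    11 hits    7 orgs
"""

# Indentation prefix, taxon name, dot leader, hit count, organism count.
LINE = re.compile(r"^((?:\. )*)(.+?) \.+\s+(\d+) hits\s+(\d+) orgs")


def parse_report(text):
    """Return (depth, name, hits, orgs) tuples for every line that matches the layout."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            depth = len(match.group(1)) // 2  # each level is written as ". "
            rows.append((depth, match.group(2), int(match.group(3)), int(match.group(4))))
    return rows


def top_level_shares(rows):
    """Fraction of all hits under each depth-1 group; assumes the first row is the root."""
    total = rows[0][2]
    return {name: hits / total for depth, name, hits, _ in rows if depth == 1}


rows = parse_report(SAMPLE)
for name, share in top_level_shares(rows).items():
    print(f"{name}: {share:.0%} of hits")
```

Applied to the full report, the same parsing gives the proportion of viral versus cellular hits without counting lines by hand.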

BLAST

PROTOCOL

a) BLASTp versus NR, NCBI default parameters apart from "Number of descriptions" set to 1000


b) BLASTp versus swissprot, NCBI default parameters apart from "Number of descriptions" set to 1000


RESULTS ANALYSIS


BLASTp against NR returned 19 alignments with significant E-values, ranging from 4e-75 to 1e-10.

Seventeen of the 19 hits are annotated as baseplate wedge (or baseplate wedge subunit). Of the two remaining hits, one is annotated as T4-like baseplate wedge, which is essentially the same function, and the other is a hypothetical protein.

Given the significant scores and E-values of these hits to proteins of known function, our sequence is probably homologous to phage baseplate wedge proteins.

The BLASTp results against NR and against Swiss-Prot differ: the E-values are not similar, and the lists of homologs differ in composition and order.




RAW RESULTS

a)
Sequences producing significant alignments:                      Score   E                                                                         
                                                                (Bits) Value

ref|YP_003097384.1| baseplate wedge subunit [Synechococcus...     286  4e-75
gb|ADD95443.1| hypothetical protein [uncultured phage MedD...     249  6e-64
ref|YP_195117.1| baseplate wedge [Synechococcus phage S-PM2]      246  5e-63
ref|YP_004324467.1| baseplate wedge [Prochlorococcus phage...     242  6e-62
ref|YP_004323456.1| baseplate wedge [Prochlorococcus phage...     133  4e-29
ref|YP_004322513.1| baseplate wedge [Prochlorococcus phage...     126  4e-27
ref|YP_004322254.1| baseplate wedge [Synechococcus phage S...     126  5e-27
ref|YP_004323702.1| baseplate wedge [Prochlorococcus phage...     124  2e-26
ref|YP_004324924.1| baseplate wedge [Prochlorococcus phage...     123  3e-26
ref|YP_214644.1| baseplate wedge [Prochlorococcus phage P-...     121  2e-25
ref|YP_717771.1| baseplate wedge [Synechococcus phage syn9]       120  2e-25
ref|YP_004323924.1| baseplate wedge [Synechococcus phage S...     120  2e-25
ref|YP_004323238.1| baseplate wedge [Prochlorococcus phage...     117  2e-24
ref|YP_004322764.1| baseplate wedge [Synechococcus phage S...     116  5e-24
ref|YP_004324700.1| baseplate wedge [Synechococcus phage S...     115  1e-23
ref|YP_004322996.1| baseplate wedge [Synechococcus phage S...     114  2e-23
ref|YP_004324157.1| baseplate wedge [Synechococcus phage S...     108  8e-22
gb|ADD95286.1| T4-like baseplate wedge [uncultured phage M...      87  2e-15
ref|YP_214344.1| baseplate wedge [Prochlorococcus phage P-...      72  1e-10
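
The tally of annotations among the significant hits can be checked with a short script. The sketch below parses a few rows copied from the table above; the row layout (accession, annotation up to the opening bracket, bit score, E-value) is inferred from those rows, and both the function names and the default cut-off `max_evalue=1e-5` are hypothetical choices made for illustration, not values used in the analysis.

```python
import re
from collections import Counter

# A few rows copied from the hit table above, used here as sample input.
HITS = """\
ref|YP_003097384.1| baseplate wedge subunit [Synechococcus...     286  4e-75
gb|ADD95443.1| hypothetical protein [uncultured phage MedD...     249  6e-64
ref|YP_195117.1| baseplate wedge [Synechococcus phage S-PM2]      246  5e-63
gb|ADD95286.1| T4-like baseplate wedge [uncultured phage M...      87  2e-15
ref|YP_214344.1| baseplate wedge [Prochlorococcus phage P-...      72  1e-10
"""

# Accession, annotation (text before "["), bit score, E-value.
ROW = re.compile(r"^(\S+)\s+(.+?)\s+\[.*\s(\d+)\s+(\S+)\s*$")


def parse_blast_table(table):
    """Parse the one-line hit summaries into dictionaries."""
    hits = []
    for line in table.splitlines():
        match = ROW.match(line)
        if match:
            accession, annotation, score, evalue = match.groups()
            hits.append({
                "accession": accession,
                "annotation": annotation,
                "score": int(score),
                "evalue": float(evalue),
            })
    return hits


def tally(hits, max_evalue=1e-5):
    """Count annotations among hits at or below the (hypothetical) E-value cut-off."""
    return Counter(h["annotation"] for h in hits if h["evalue"] <= max_evalue)


blast_hits = parse_blast_table(HITS)
for annotation, count in tally(blast_hits).most_common():
    print(count, annotation)
```

Run on all 19 rows, this kind of tally shows at a glance how many hits share the baseplate wedge annotation and which ones differ.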

ALIGNMENTS
>ref|YP_003097384.1| Gene info linked to YP_003097384.1 baseplate wedge subunit [Synechococcus phage S-RSM4]
 emb|CAR63347.1| Gene info linked to CAR63347.1 baseplate wedge subunit [Synechococcus phage S-RSM4]
Length=615

 Score =  286 bits (731),  Expect = 4e-75, Method: Compositional matrix adjust.
 Identities = 157/315 (50%), Positives = 211/315 (67%), Gaps = 31/315 (10%)

Query  27   SSNTAATALSGVYRFATDDIPTVPYDNQTEKYLAYADILAAKRITGEYARPVVRRFNWDI  86
            +++T ATA SGVYR+AT+D+P +P DNQ EK   Y +I+AAKRIT  +AR VVRR+NW++
Sbjct  116  ATDTLATANSGVYRYATEDVPPLPLDNQREKIALYDEIIAAKRITDSFARTVVRRYNWNL  175

Query  87   STNSRFDMWRPDYSEQKTSSVVLPGG--------STGSQNISTAKFYVVNGKYEVFKCLY  138
              N +FDMW+PDYS         PGG        +TG+ +IS AKFYV+N  YEVFKCLY
Sbjct  176  VANPKFDMWKPDYS-------ATPGGGGQIGKQTATGADSISDAKFYVMNSTYEVFKCLY  228

Query  139  NGERPASLLPGGVLPTVAYEPSTTPSSGTYANGIYKEPVDANGFCNYIWKYMYTISTNDV  198
            NGE P +  P GV      EP+T+  +     G+Y E  + NG  NYIWKYMYTI T+DV
Sbjct  229  NGEGPGN--PTGV--DAVEEPTTSGGNYDAGTGLYTETTNVNG--NYIWKYMYTIPTDDV  282

Query  199  LRFLSTDFIPIV-----ADSVVQAAAVNGSVSTVVLKAIGSNLPTSQ---TSLYAPIVGD  250
            L+FLS+DF+PIV     + + V   AV+G++  V+++  GS LP ++     LYA I GD
Sbjct  283  LKFLSSDFMPIVLPANPSRTTVVGQAVDGAIDVVLIEDAGSGLPANRGVGNELYAGIKGD  342

Query  251  GSGGVVKFGTNSSGNITYAILQSAGSGYSYANILLNNGNVYSDAALTTPVVVASNA--TG  308
            G+GGVVK  T+ SG+I  A ++  GSGY+YAN+LL+NGN+++D  L     +A+    TG
Sbjct  343  GTGGVVKMTTDGSGSILTAEVEVRGSGYTYANVLLSNGNLFTDPGLAPGDAIATPGGWTG  402

Query  309  AIEAVLSPQGGHGSD  323
            A+EA+L PQGGHGSD
Sbjct  403  ALEAILPPQGGHGSD  417


>gb|ADD95443.1|  hypothetical protein [uncultured phage MedDCM-OCT-S08-C1281]
Length=357

 Score =  249 bits (635),  Expect = 6e-64, Method: Compositional matrix adjust.
 Identities = 139/288 (48%), Positives = 188/288 (65%), Gaps = 30/288 (10%)

Query  25   ITSSNTAATALSGVYRFATDDIPTVPYDNQTEKYLAYADILAAKRITGEYARPVVRRFNW  84
            +T + ++ATA SG YR+AT++ P  P DN +EK   Y +++AAKR+TG +AR VV R+NW
Sbjct  62   LTGATSSATAKSGTYRYATEEAPPAPLDNYSEKLAIYNELIAAKRVTGPFARLVVPRYNW  121

Query  85   DISTNSRFDMWRPDYSEQKTSSVVLPGG--------STGSQNISTAKFYVVNGKYEVFKC  136
            +++ N +FDM+RP+YS         PGG        +TG  ++S  KFYV+N +YEVFKC
Sbjct  122  NLTLNPKFDMYRPNYSPT-------PGGGGSVGKDTATGQSSLSEGKFYVMNQQYEVFKC  174

Query  137  LYNGERPASLLPGGVLPTVAYEPSTTPS--SGTYANGIYKEPVDANGFCNYIWKYMYTIS  194
            LYNGE  A+  P G   T  YEP + PS   G +ANGIY EP    G   YIWK+M+T+ 
Sbjct  175  LYNGEGAAN--PTGQNAT--YEPKSQPSPGQGAFANGIYTEPA---GTAGYIWKHMFTVP  227

Query  195  TNDVLRFLSTDFIPIV-----ADSVVQAAAVNGSVSTVVLKAIGSNLPTSQTSLYAPIVG  249
            T DVL FLSTDF+P+V     + + V+A AV+G+V   V+K  GS LP S T LY  + G
Sbjct  228  TGDVLAFLSTDFMPVVESTELSRTQVEALAVDGAVHVAVVKDGGSGLPASDT-LYTSVKG  286

Query  250  DGSGGVVKFGTNSSGNITYAILQSAGSGYSYANILLNNGNVYSDAALT  297
            DGSG VV+  T++SG +T A + + GSGYSY N+LL  G VY+D+ LT
Sbjct  287  DGSGAVVELTTDASGTVTAASMFNVGSGYSYGNLLLETGEVYTDSGLT  334


>ref|YP_195117.1| Gene info linked to YP_195117.1 baseplate wedge [Synechococcus phage S-PM2]
 emb|CAF34147.1| Gene info linked to CAF34147.1 baseplate wedge [Synechococcus phage S-PM2]
Length=634

 Score =  246 bits (627),  Expect = 5e-63, Method: Compositional matrix adjust.
 Identities = 150/324 (46%), Positives = 197/324 (61%), Gaps = 36/324 (11%)

Query  27   SSNTAATALSGVYRFATDDIPTVPYDNQTEKYLAYADILAAKRITGEYARPVVRRFNWD-  85
            +S+T A A SGVYRFAT+DIP VP DNQ EK+  Y DI+AAKRIT ++ RPVV R++W+ 
Sbjct  126  ASDTGAEAKSGVYRFATEDIPPVPLDNQREKFDVYDDIIAAKRITSQFVRPVVTRYDWNL  185

Query  86   ISTNSRFDMWRPDYSEQKTSSVVLPGGSTGSQNISTAKFYVVNGKYEVFKCLYNGERPAS  145
            ++T  RFDM++PDYS    +  V    +TG+  +  AK+YVVN  YEVFKCLYNGE    
Sbjct  186  VATEKRFDMFKPDYS-ATVAGRVGKSSTTGASALGDAKYYVVNANYEVFKCLYNGE----  240

Query  146  LLPGGVLPTVAYEPSTTPS--SGTY--ANGIYKEP----VDANGFCNYIWKYMYTISTND  197
              PG V P   YEP T PS   GTY  + GI+ E     V A+G   Y+WKYMYTI T D
Sbjct  241  -FPGQVDPDPVYEPKTNPSGGEGTYNPSTGIFTERATAIVAASGASGYVWKYMYTIPTED  299

Query  198  VLRFLSTDFIPI-VADSVVQAA----AVNGSVSTVVLKAIGSNLPTSQTSLYAPIVGDGS  252
            VLRFLST+F+ I +A    +A     AV+G++  V+++  G+ LP    + YAP+VGDG 
Sbjct  300  VLRFLSTNFMSINLAGEPTRAGTESIAVDGAIDIVLIEDRGTGLPNG--THYAPVVGDGQ  357

Query  253  GG-----VVKFGTNSSGNITYAILQSAGSGYSYANILLNNGN--------VYSDAALTTP  299
             G     +VK   +     +  ++     GY+YA++ L NG         +YS+  LTTP
Sbjct  358  LGGNNPAIVKIVVSGGQIESTEVVDRGTGGYTYASVPLENGQTIAGLPYGLYSNQTLTTP  417

Query  300  VVVASNATGAIEAVLSPQGGHGSD  323
                   TGA+E VL PQGGHGSD
Sbjct  418  -RTGVTGTGALEVVLPPQGGHGSD  440


>ref|YP_004324467.1| Gene info linked to YP_004324467.1 baseplate wedge [Prochlorococcus phage Syn1]
 gb|ADO99197.1| Gene info linked to ADO99197.1 baseplate wedge [Prochlorococcus phage Syn1]
Length=627

 Score =  242 bits (617),  Expect = 6e-62, Method: Compositional matrix adjust.
 Identities = 147/323 (46%), Positives = 205/323 (63%), Gaps = 36/323 (11%)

Query  27   SSNTAATALSGVYRFATDDIPTVPYDNQTEKYLAYADILAAKRITGEYARPVVRRFNWD-  85
            +++T A A +GVYR+AT+D P  P DNQ EK+  Y +I+AAKRIT ++AR V+ R++W+ 
Sbjct  121  AADTGAEARAGVYRYATEDTPPTPLDNQIEKFSVYDEIIAAKRITDQFARAVITRYDWNL  180

Query  86   ISTNSRFDMWRPDYSEQKTSSVVLPGGSTGSQNISTAKFYVVNGKYEVFKCLYNGERPAS  145
            ++T  RFDM++PDYS   T+  V    +TG+ ++  +KFYV+N  YEVFKCLYNG+    
Sbjct  181  LATEPRFDMYKPDYS-ATTTGQVGKQSTTGAASLGASKFYVINSNYEVFKCLYNGQ----  235

Query  146  LLPGGVLPTVAYEPSTTPSS--GTY--ANGIYKEPVDA----NGFCNYIWKYMYTISTND  197
              PG V P   YEP TTPS+  GTY   +G++ E  DA         YIWKYMYTI T+D
Sbjct  236  -FPGQVDPNPVYEPKTTPSAGQGTYDAGSGLFTESADAVVANTAGSGYIWKYMYTIPTDD  294

Query  198  VLRFLSTDFIPI-----VADSVVQAAAVNGSVSTVVLKAIGSNLPTSQTSLYAPIVGD--  250
            VLRFLST+F+PI        +  +AAAV+G++  V+++ IGS LP    + YAP++GD  
Sbjct  295  VLRFLSTNFMPINLTGEATRAATEAAAVDGAIDVVLVEDIGSGLPNG--THYAPVLGDGQ  352

Query  251  --GSGGVVKFGTNSSGNITYAILQSAGSGYSYANILLNNG--------NVYSDAALTTPV  300
              G+  VVK    SSG+I    +   G+GY+YA+I L++G         +Y++ ALTT  
Sbjct  353  VSGTQSVVKI-VVSSGSIESTEVVVKGAGYTYASIALDDGATVGGIKYGLYAEQALTT-A  410

Query  301  VVASNATGAIEAVLSPQGGHGSD  323
                  TGA+E VL PQGGHG+D
Sbjct  411  RTGVGGTGALEVVLPPQGGHGAD  433


>ref|YP_004323456.1| Gene info linked to YP_004323456.1 baseplate wedge [Prochlorococcus phage P-HM2]
 gb|ADO99865.1| Gene info linked to ADO99865.1 baseplate wedge [Prochlorococcus phage P-HM2]
Length=504


 Score =  133 bits (334),  Expect = 4e-29, Method: Compositional matrix adjust.
 Identities = 102/318 (32%), Positives = 163/318 (51%), Gaps = 48/318 (15%)

Query  21   WYNPITSSNTAATALSGVYRFATDDIPTV--PYDNQTEKYLAYADILAAKRITGEYARPV  78
            + N +T++N     +     + + D+P    P D+      +Y D +A KR+       V
Sbjct  16   FRNTLTATNKVYMFVGRAKTWGSSDVPPTGEPIDSFEYARTSYGDSVAFKRVDVSDTALV  75

Query  79   VRRFNWDISTNSR------FDMWRPDYSEQKTSSVVLPGGSTGSQNISTAKFYVVNGKYE  132
            + R +W   T +       + M++PDY+  KT++        G+  +  + FYV+N  + 
Sbjct  76   IPRVDWIDPTKTTGGVGRTYSMYKPDYAPTKTTA-------NGASRLYDSNFYVMNSDFN  128

Query  133  VFKCLYNGERPASLLPGGVLPTVAYEPSTTPSSGTYANGIYKEPVDANGFCNYIWKYMYT  192
            V+KCLYNG+ P    P G        PS    +GT    I  E  D+ G  +Y WKY+YT
Sbjct  129  VYKCLYNGQSPE--FPRG-------RPSLVEPTGTSTTII--ETSDSPGVYSYRWKYLYT  177

Query  193  ISTNDVLRFLSTDFIPIVADSVVQAAAVNGSVSTVVLKAIGSNLPTSQTSLYAPIVGDGS  252
            I  +++L+F++T+FIP++ +S+VQ+AA +GSV TVV++  GS    + T    PI GD +
Sbjct  178  IDADNILKFVTTEFIPVLTNSLVQSAANSGSVDTVVIENAGSGY-NNGTFTNVPIRGDYT  236

Query  253  GGVVKFGTNS-------SGNITYAILQSAGSGYSYANILLNNGNVYSDAALTTPVVVASN  305
               V  GT +       SG+I+   + +AGSGYS+A+I         D +L   +   +N
Sbjct  237  ---VNGGTQASCTVTVVSGSISAVTITTAGSGYSFASI---------DTSLIANIGNGTN  284

Query  306  ATGAIEAVLSPQGGHGSD  323
            A   ++ VL P GGHG D
Sbjct  285  AD--LDVVLPPNGGHGKD  300


>ref|YP_004322513.1| Gene info linked to YP_004322513.1 baseplate wedge [Prochlorococcus phage P-HM1]
 gb|ADO98712.1| Gene info linked to ADO98712.1 baseplate wedge [Prochlorococcus phage P-HM1]
Length=504

 Score =  126 bits (317),  Expect = 4e-27, Method: Compositional matrix adjust.
 Identities = 99/316 (31%), Positives = 159/316 (50%), Gaps = 44/316 (14%)

Query  21   WYNPITSSNTAATALSGVYRFATDDIPTV--PYDNQTEKYLAYADILAAKRITGEYARPV  78
            + N + ++N     +     + + D+P    P D+      +Y D +A KR+       V
Sbjct  16   FRNTLQATNKVYMFVGRAKTWGSSDVPPTGEPLDSFEYARTSYGDSVAFKRVDISDTALV  75

Query  79   VRRFNWDISTNSR------FDMWRPDYSEQKTSSVVLPGGSTGSQNISTAKFYVVNGKYE  132
            + R +W   T +       + M++PDY+  KT++        GS  +  + FYV+N  + 
Sbjct  76   IPRVDWTDPTKTTGGVGRTYSMYKPDYAPTKTTA-------NGSSRLYDSNFYVMNSDFN  128

Query  133  VFKCLYNGERPASLLPGGVLPTVAYEPSTTPSSGTYANGIYKEPVDANGFCNYIWKYMYT  192
            V+KCLYNG+ P    P G        PS    +GT    I  E  D+ G  +Y WKY+YT
Sbjct  129  VYKCLYNGQSPE--FPRG-------RPSLVEPTGTSTTII--ETSDSPGVYSYRWKYLYT  177

Query  193  ISTNDVLRFLSTDFIPIVADSVVQAAAVNGSVSTVVLKAIGSNLPTSQTSLYAPIVGD--  250
            I  +++L+F++++FIP++++S+V +AA  GSV TVV++  GS     Q +   PI GD  
Sbjct  178  IDADNILKFVTSEFIPVLSNSLVTSAANTGSVDTVVIENAGSGYNNGQFT-NVPIRGDYN  236

Query  251  ---GSGGVVKFGTNSSGNITYAILQSAGSGYSYANILLNNGNVYSDAALTTPVVVASNAT  307
               G+  +      S    +  I Q AGSGYS+A+I         D +L T +   S+A 
Sbjct  237  VNGGTQALCTVNVVSGSVSSVTITQ-AGSGYSFASI---------DVSLITNIGNGSDA-  285

Query  308  GAIEAVLSPQGGHGSD  323
             +++ VL P GGHG+D
Sbjct  286  -SLDVVLPPNGGHGND  300


>ref|YP_004322254.1| Gene info linked to YP_004322254.1 baseplate wedge [Synechococcus phage S-SM2]
 gb|ADO97440.1| Gene info linked to ADO97440.1 baseplate wedge [Synechococcus phage S-SM2]
Length=516

 Score =  126 bits (316),  Expect = 5e-27, Method: Compositional matrix adjust.
 Identities = 92/297 (31%), Positives = 152/297 (51%), Gaps = 48/297 (16%)

Query  39   YRFATDDIPTVPYDNQTEKYLAYADILAAKRITGEYARPVVRRFNWDISTNSRFDMWRPD  98
            Y+   D  P  P DN +++   +  ++A K+I     R V+ +  W  ++ + +DM+R D
Sbjct  40   YQDDWDSNPPAPKDNFSQENDYWDTMVALKKINSGDVRQVIPKRTW--TSGTTYDMYRHD  97

Query  99   YSEQKTSSVVLPGGSTGSQNISTAKFYVVNGKYEVFKCLYNGERPASLLPGGVLPTVAYE  158
            YS   T++V      +G+ N+ +A +YV+N  + V+ CL NG  P +  P G  P++  E
Sbjct  98   YSVTNTAAV------SGATNLYSAFYYVMNSDFRVYACLQNGTDPNN--PNG-KPSLD-E  147

Query  159  PSTTP----SSGTYANGIYKEPVDANGFCNYIWKYMYTISTNDVLRFLSTDFIPIVAD--  212
            P+ T     S+G+  +G             Y+WKY+YTI  N+V++F STDF+P+ AD  
Sbjct  148  PTFTDLEPRSAGSSGDG-------------YLWKYLYTIKPNEVVKFESTDFMPVPADWA  194

Query  213  -----SVVQAAAVNGSVSTVVLKAIGSNLPTS-QTSLYAPIVGDGSGGVVKFGTNSSGNI  266
                 + V+  AV+GS+  V +   G  L T+ QT    PI GDG+G        +   +
Sbjct  195  TSTDNAAVRDNAVDGSIKVVTVTNSGVGLGTANQTYTRVPIQGDGTGAECTLTVGADSKV  254

Query  267  TYAILQSAGSGYSYANILLNNGNVYSDAALTTPVVVASNATGAIEAVLSPQGGHGSD  323
            +   + + GSGYSY ++ L  G V +   + T            + +++PQGGHG+D
Sbjct  255  SGVTVSNQGSGYSYGSLNLEAGGVPTGTTIPT-----------FDVIMAPQGGHGAD  300


>ref|YP_004323702.1| Gene info linked to YP_004323702.1 baseplate wedge [Prochlorococcus phage Syn33]
 gb|ADO99688.1| Gene info linked to ADO99688.1 baseplate wedge [Prochlorococcus phage Syn33]
Length=510

 Score =  124 bits (312),  Expect = 2e-26, Method: Compositional matrix adjust.
 Identities = 100/336 (30%), Positives = 160/336 (48%), Gaps = 52/336 (15%)

Query  4    CCRSNCFR-YSTTRFYFNWYNP--ITSSNTAATALSGVYRFA-------TDDIPTVPYDN  53
               ++ FR +S  +F      P    S + A T    +Y F         ++ P    D+
Sbjct  3    ALLTDQFRIFSARKFIKALEGPDATQSDDVAGTTRDRLYLFIGRPQTWDNENSPPQAIDS  62

Query  54   QTEKYLAYADILAAKRITGEYARPVVRRFNWDISTNSR------FDMWRPDYSEQKTSSV  107
              E   +Y D+++ KR+       VVRR +W     +       +DM+R DYS  KT+S 
Sbjct  63   FQEFSSSYDDMISLKRVLASDTVQVVRRIDWVSPEQTTGGLGFTYDMYRHDYSPSKTAS-  121

Query  108  VLPGGSTGSQNISTAKFYVVNGKYEVFKCLYNGERPASLLPGGVLPTVAYEPSTTPSSGT  167
                  +G+  +  + FYVVN +Y+V+K +YNG  P+   P G       +PST   +G+
Sbjct  122  ------SGATKLYDSDFYVVNSQYQVYKVIYNGTSPSD--PNG-------KPSTVEPTGS  166

Query  168  YANGIYKEPVDANGFCNYIWKYMYTISTNDVLRFLSTDFIPIVADSVVQAAAVNGSVSTV  227
              + I       +G   Y WKYMYTI    VL+F S D++P+  +  V+  AV+G + TV
Sbjct  167  STSII----TTGDG---YRWKYMYTIPVASVLKFFSNDYMPVFTNDAVRTNAVSGEIDTV  219

Query  228  VLKAIGSNLPTSQTSLYAPIVGDGSGGVVKFGTNSSGNITYAILQSAGSGYSYANILLNN  287
            V+ + GS    + T     I GDG+GG V    +  G I  A + S G+GY++  I    
Sbjct  220  VISSAGSGY-NNGTYDNVAINGDGTGGRVSIVID-GGRIISATVTSGGTGYTFGKI----  273

Query  288  GNVYSDAALTTPVVVASNATGAIEAVLSPQGGHGSD  323
                   ++ +   + + A+G ++ ++ P GGHGSD
Sbjct  274  -------SIDSITGIGTGASGIVDVIMPPPGGHGSD  302


>ref|YP_004324924.1| Gene info linked to YP_004324924.1 baseplate wedge [Prochlorococcus phage P-SSM7]
 gb|ADO99010.1| Gene info linked to ADO99010.1 baseplate wedge [Prochlorococcus phage P-SSM7]
Length=510

 Score =  123 bits (309),  Expect = 3e-26, Method: Compositional matrix adjust.
 Identities = 92/286 (32%), Positives = 144/286 (50%), Gaps = 42/286 (15%)

Query  44   DDIPTVPYDNQTEKYLAYADILAAKRITGEYARPVVRRFNWDISTNSR------FDMWRP  97
            ++ P    D+ +E   +Y D+++ KR+       VVRR +W     +       +DM+R 
Sbjct  53   ENAPPQAVDSFSEFSNSYDDMISLKRVLAADTVQVVRRIDWVSPEETTGGLGFTYDMYRH  112

Query  98   DYSEQKTSSVVLPGGSTGSQNISTAKFYVVNGKYEVFKCLYNGERPASLLPGGVLPTVAY  157
            +YS  KT+S       +G+  +  A F+VVN +Y+V+KC+YNG  P+   P G   TV  
Sbjct  113  NYSPSKTAS-------SGATKLYDADFFVVNSQYQVYKCIYNGTSPSD--PNGKPSTV--  161

Query  158  EPSTTPSSGTYANGIYKEPVDANGFCNYIWKYMYTISTNDVLRFLSTDFIPIVADSVVQA  217
            EP+ T +S       Y+            WKYMYTI    VL+F S D++P+  ++ V+ 
Sbjct  162  EPTGTSTSIITTGDGYR------------WKYMYTIPVASVLKFFSNDYMPVFTNTAVKT  209

Query  218  AAVNGSVSTVVLKAIGSNLPTSQTSLYAPIVGDGSGGVVKFGTNSSGNITYAILQSAGSG  277
             AV G + TVV+ A GS    + T     I GDG+GG V    +  G +T A + S G+G
Sbjct  210  NAVTGEIDTVVINAAGSGY-NNGTYDNVAINGDGTGGRVSIVVD-GGKVTSATVTSGGTG  267

Query  278  YSYANILLNNGNVYSDAALTTPVVVASNATGAIEAVLSPQGGHGSD  323
            Y++  I +N        A+T    + +  +G ++ V+ P  GHG D
Sbjct  268  YTFGQISIN--------AITG---IGTGTSGEVDVVIPPPDGHGYD  302


>ref|YP_214644.1| baseplate wedge [Prochlorococcus phage P-SSM4]
 gb|AAX46884.1| baseplate wedge [Prochlorococcus phage P-SSM4]
Length=510

 Score =  121 bits (303),  Expect = 2e-25, Method: Compositional matrix adjust.
 Identities = 92/283 (33%), Positives = 139/283 (49%), Gaps = 42/283 (15%)

Query  47   PTVPYDNQTEKYLAYADILAAKRITGEYARPVVRRFNWDISTNSR------FDMWRPDYS  100
            P    D+ +E   +Y D+++ KR+       VVRR +W     +       +DM+R DYS
Sbjct  56   PPQAVDSFSEFSGSYDDMVSLKRVLASDTVQVVRRIDWVSPEQTTGGLGFTYDMYRHDYS  115

Query  101  EQKTSSVVLPGGSTGSQNISTAKFYVVNGKYEVFKCLYNGERPASLLPGGVLPTVAYEPS  160
              KT++       +G+  +  + FYVVN +Y+V+KC+YNG  P+   P G   TV  EP+
Sbjct  116  PSKTAA-------SGATKLYDSDFYVVNSQYQVYKCIYNGTSPSD--PNGKPSTV--EPT  164

Query  161  TTPSSGTYANGIYKEPVDANGFCNYIWKYMYTISTNDVLRFLSTDFIPIVADSVVQAAAV  220
             T +S       Y+            WKYMYTI    VL+F S D++P+  ++ VQ  AV
Sbjct  165  GTSTSIITTGDGYR------------WKYMYTIPVASVLKFFSNDYMPVFTNAAVQTNAV  212

Query  221  NGSVSTVVLKAIGSNLPTSQTSLYAPIVGDGSGGVVKFGTNSSGNITYAILQSAGSGYSY  280
             G V TVV+ A GS    + T     I GDG+GG V    +  G I  A + S G+GY++
Sbjct  213  AGEVDTVVINAAGSGY-NNGTYDNVAINGDGTGGRVSIVID-GGKIISATVTSGGTGYTF  270

Query  281  ANILLNNGNVYSDAALTTPVVVASNATGAIEAVLSPQGGHGSD  323
              I ++N              + +   G ++ ++ P GGHG D
Sbjct  271  GKISVDNI-----------TGIGTGTGGQVDVIIPPPGGHGKD  302


>ref|YP_717771.1| baseplate wedge [Synechococcus phage syn9]
 gb|ABA47073.1| baseplate wedge [Synechococcus phage syn9]
Length=510

 Score =  120 bits (302),  Expect = 2e-25, Method: Compositional matrix adjust.
 Identities = 103/334 (31%), Positives = 161/334 (48%), Gaps = 52/334 (16%)

Query  7    SNCFR-YSTTRFYFNWYNPI-TSSNTAATALSG-VYRFA-------TDDIPTVPYDNQTE  56
            ++ FR +S  +F      PI T S+ AA A    +Y F         ++ P    D+  E
Sbjct  6    TDQFRIFSAKKFIKALEGPIATQSDDAAGATRDRLYIFIGRPQSWDNENSPPQAVDSFLE  65

Query  57   KYLAYADILAAKRITGEYARPVVRRFNWDISTNSR------FDMWRPDYSEQKTSSVVLP  110
               ++ D++A KR+       VVRR +W     +       +DM+R DYS  KT+S    
Sbjct  66   FSGSFDDMIALKRVLASDTIQVVRRIDWVSPEQTTGGLGFTYDMYRHDYSPSKTAS----  121

Query  111  GGSTGSQNISTAKFYVVNGKYEVFKCLYNGERPASLLPGGVLPTVAYEPSTTPSSGTYAN  170
               +G+  +  + FYVVN +Y+V+KC+YNG  P+   P G   TV  EP+ T +S     
Sbjct  122  ---SGATKLYDSDFYVVNSQYQVYKCIYNGTSPSD--PNGKPSTV--EPTGTSTSIITTA  174

Query  171  GIYKEPVDANGFCNYIWKYMYTISTNDVLRFLSTDFIPIVADSVVQAAAVNGSVSTVVLK  230
              Y+            WKY+YTI    VL+F S D++P+  +  V+  AV G V TVV+ 
Sbjct  175  DSYR------------WKYLYTIPVASVLKFFSNDYMPVFTNDAVKTNAVTGEVDTVVIT  222

Query  231  AIGSNLPTSQTSLYAPIVGDGSGGVVKFGTNSSGNITYAILQSAGSGYSYANILLNNGNV  290
            + G+    + T     I GDG+GG V    +  G I  A + S G+GY++  I ++N   
Sbjct  223  SAGTGY-NNGTYDNVAINGDGTGGRVSIVVD-GGKIISATVTSGGTGYTFGKISVDN---  277

Query  291  YSDAALTTPVVVASNATGAIEAVLSPQGGHGSDP  324
                 +T    + +  +G ++ ++ P  GHG DP
Sbjct  278  -----ITG---IGTGTSGQVDVIIPPPNGHGFDP  303


>ref|YP_004323924.1| baseplate wedge [Synechococcus phage Syn19]
 gb|ADO99466.1| baseplate wedge [Synechococcus phage Syn19]
Length=510

 Score =  120 bits (302),  Expect = 2e-25, Method: Compositional matrix adjust.
 Identities = 97/314 (31%), Positives = 150/314 (48%), Gaps = 49/314 (16%)

Query  23   NPITSSNTAATALSGVYRFA-------TDDIPTVPYDNQTEKYLAYADILAAKRITGEYA  75
            N   S + A T    +Y F         ++ P    D+ +E   +Y D+++ KR+     
Sbjct  25   NATQSDDDAGTTRDRLYLFIGRPQTWDNENSPPQAVDSFSEFSGSYDDMISLKRVLASDT  84

Query  76   RPVVRRFNWDISTNSR------FDMWRPDYSEQKTSSVVLPGGSTGSQNISTAKFYVVNG  129
              VVRR +W     +       +DM+R DYS  KT++       +G+  +  + FYVVN 
Sbjct  85   VQVVRRIDWVSPEQTTGGLGFTYDMYRHDYSPSKTAA-------SGATKLYDSDFYVVNS  137

Query  130  KYEVFKCLYNGERPASLLPGGVLPTVAYEPSTTPSSGTYANGIYKEPVDANGFCNYIWKY  189
            +Y+V+KC+YNG  P+   P G   TV  EP+ T +S       Y+            WKY
Sbjct  138  QYQVYKCIYNGTSPSD--PNGKPSTV--EPTGTSTSIITTGDGYR------------WKY  181

Query  190  MYTISTNDVLRFLSTDFIPIVADSVVQAAAVNGSVSTVVLKAIGSNLPTSQTSLYAPIVG  249
            MYTI    VL+F S D++P+  +S VQ  AV+G V TVV+ + GS    + T     I G
Sbjct  182  MYTIPVASVLKFFSNDYMPVFTNSAVQTNAVSGEVDTVVINSAGSGY-NNGTYDNVAING  240

Query  250  DGSGGVVKFGTNSSGNITYAILQSAGSGYSYANILLNNGNVYSDAALTTPVVVASNATGA  309
            DG+GG V    +  G I  A + S G+GY++  I           ++     + +   G 
Sbjct  241  DGTGGRVSVVID-GGKIISATVTSGGTGYTFGKI-----------SVDAITGIGTGTGGQ  288

Query  310  IEAVLSPQGGHGSD  323
            ++ ++ P GGHG+D
Sbjct  289  VDVIIPPPGGHGND  302

b)                                                               Score   E                                                                         
                                                                (Bits) Value

sp|Q58976.1|PYRB_METJA RecName: Full=Aspartate carbamoyltr...      34  1.2
sp|Q58473.1|RIO2_METJA RecName: Full=RIO-type serine/threo...      32  4.8
sp|P42175.2|NARG_BACSU RecName: Full=Nitrate reductase alp...      33  1.9
sp|B6JZY9.1|GET1_SCHJY RecName: Full=Protein get1; AltName...      32  4.8
sp|B8CIL6.1|Y674_SHEPW RecName: Full=UPF0042 nucleotide-bi...      31  5.0
sp|Q09165.2|DIG1_CAEEL RecName: Full=Mesocentin; Flags: Pr...      31  5.2
sp|Q04966.1|HEX_ADE06 RecName: Full=Hexon protein; AltName...      31  6.5
sp|P30266.3|CATE_BACPE RecName: Full=Catalase                      31  7.5
sp|Q86Y46.1|K2C73_HUMAN RecName: Full=Keratin, type II cyt...      31  9.2

ALIGNMENTS
>sp|Q58976.1|PYRB_METJA  RecName: Full=Aspartate carbamoyltransferase; AltName: Full=Aspartate 
transcarbamylase; Short=ATCase
Length=306

 Score = 34.3 bits (77),  Expect = 1.2, Method: Compositional matrix adjust.
 Identities = 18/42 (43%), Positives = 22/42 (52%), Gaps = 9/42 (21%)

Query  129  GKYEVFKCL---------YNGERPASLLPGGVLPTVAYEPST  161
            GK E+ + L          N +RP  LL G +L TV YEPST
Sbjct  11   GKEEILEILDEARKMEELLNTKRPLKLLEGKILATVFYEPST  52


>sp|P42175.2|NARG_BACSU  RecName: Full=Nitrate reductase alpha chain
Length=1228

 Score = 33.5 bits (75),  Expect = 1.9, Method: Composition-based stats.
 Identities = 20/72 (28%), Positives = 36/72 (50%), Gaps = 6/72 (8%)

Query  52   DNQTEKYLAYA----DILAAKRITGEYARPVVRRF--NWDISTNSRFDMWRPDYSEQKTS  105
            + +TE+++ YA    D      ++ E       RF    DI   ++ D W+P   +++TS
Sbjct  325  NQETERFIEYAKQYTDFPFLVTLSKENGVYTAGRFLHAKDIGRKTKHDQWKPAVWDEQTS  384

Query  106  SVVLPGGSTGSQ  117
            S  +P G+ GS+
Sbjct  385  SFAIPQGTMGSR  396


>sp|B6JZY9.1|GET1_SCHJY  RecName: Full=Protein get1; AltName: Full=Guided entry of tail-anchored 
proteins 1; Flags: Precursor
Length=170

 Score = 32.3 bits (72),  Expect = 4.8, Method: Compositional matrix adjust.
 Identities = 20/79 (25%), Positives = 33/79 (42%), Gaps = 3/79 (4%)

Query  89   NSRFDMWRPDYSEQKTSSVVLPGGSTGSQNISTAKFYVVNGKYEVFKCLYNGERPASLLP  148
            N +FD     + +Q   S ++   S G + + +  F++V   Y       N   P   +P
Sbjct  75   NRKFDQLNVKWEKQ---SKIVSQKSEGVKKLISLTFWIVTRGYRFIVQFKNSGNPVFAVP  131

Query  149  GGVLPTVAYEPSTTPSSGT  167
             G+LPT A      P + T
Sbjct  132  EGMLPTWALWFLALPKAKT  150


>sp|Q58473.1|RIO2_METJA  RecName: Full=RIO-type serine/threonine-protein kinase Rio2
Length=270

 Score = 32.3 bits (72),  Expect = 4.8, Method: Compositional matrix adjust.
 Identities = 16/46 (35%), Positives = 25/46 (54%), Gaps = 1/46 (2%)

Query  217  AAAVNGSVSTVVLKAIGSNLPTSQT-SLYAPIVGDGSGGVVKFGTN  261
            A A+N  V   +LKAIG+ L   +   +Y  ++ DG   V+KF  +
Sbjct  52   ALAINAFVKKGILKAIGNKLGVGKEGDVYTVLLSDGREAVLKFHKH  97


>sp|B8CIL6.1|Y674_SHEPW  RecName: Full=UPF0042 nucleotide-binding protein swp_0674
Length=284

 Score = 32.0 bits (71),  Expect = 5.0, Method: Compositional matrix adjust.
 Identities = 21/61 (34%), Positives = 26/61 (43%), Gaps = 5/61 (8%)

Query  75   ARPVVRRFNWDISTNSRFDMWRPDYSEQKTSSVVLPGGSTGSQNISTAKFYVVNGKYEVF  134
             RP+V +F W I    R   W PD      S + +  G TG Q+ S    YV     E F
Sbjct  211  QRPLVNKFIWQIENLLR--TWLPDLERNNRSYLTIAIGCTGGQHRSV---YVTERLAEHF  265

Query  135  K  135
            K
Sbjct  266  K  266


>sp|Q09165.2|DIG1_CAEEL  RecName: Full=Mesocentin; Flags: Precursor
Length=13100

 Score = 32.0 bits (71),  Expect = 5.2, Method: Compositional matrix adjust.
 Identities = 36/123 (29%), Positives = 50/123 (41%), Gaps = 17/123 (14%)

Query  184   NYIWKYM----YTISTNDVLRFL----STDFIPIVADSVVQAAAVNGS-----VSTVVLK  230
             NYI+  +     TI T+D  R +      D  P+  D+       +G       S   L 
Sbjct  5572  NYIYPAVGPNGQTIPTDDTGRTVYPVRGPDGTPLPTDASGAVIGPDGEPIPTDASGKPLS  5631

Query  231   AIGSNLPTSQTSLYAPIVGDGSGGVVKFGTNSSGNITYAILQSAGSGYSY---ANILLNN  287
             A GS LPT     Y  +  DGS  V    T+ SGN  Y ++   G+  S     N L N+
Sbjct  5632  ADGSPLPTDNNGNYVIVPTDGS-TVKSHPTDDSGNTIYPVVNEDGTPLSTDLSGNFLTNS  5690

Query  288   GNV  290
             G +
Sbjct  5691  GEI  5693


>sp|Q04966.1|HEX_ADE06  RecName: Full=Hexon protein; AltName: Full=Late protein 2
Length=465

 Score = 31.6 bits (70),  Expect = 6.5, Method: Compositional matrix adjust.
 Identities = 29/97 (30%), Positives = 40/97 (41%), Gaps = 11/97 (11%)

Query  160  STTPSSGTYANGIYKEPVDANGFCNYIWKYMYTISTNDVLRFLSTDFIPIVADSVVQAAA  219
             TTP    Y  G Y  P ++NG    +W     + +   ++F ST        S      
Sbjct  116  KTTPMKPCY--GSYARPTNSNGGQGVLWLTNGKLESQVEMQFFST--------STNATNE  165

Query  220  VNGSVSTVVLKAIGSNLPTSQTSL-YAPIVGDGSGGV  255
            VN    TVVL +   N+ T  T L Y P +GD +  V
Sbjct  166  VNNIQPTVVLYSEDVNMETPDTHLSYKPKMGDKNAKV  202


>sp|P30266.3|CATE_BACPE  RecName: Full=Catalase
Length=678

 Score = 31.6 bits (70),  Expect = 7.5, Method: Compositional matrix adjust.
 Identities = 18/65 (28%), Positives = 29/65 (45%), Gaps = 5/65 (8%)

Query  183  CNYIWKYMYTISTNDVLRFLSTDFIPIVADSVVQA-----AAVNGSVSTVVLKAIGSNLP  237
                W  M  +  N ++   S +   + + SV Q      A V   ++  V +AIG+NLP
Sbjct  453  AKLFWNSMSEVEKNHIIEAFSFELGKVQSKSVQQQVVEMFAHVTSDLAKPVAEAIGANLP  512

Query  238  TSQTS  242
             S+ S
Sbjct  513  QSEGS  517


>sp|Q86Y46.1|K2C73_HUMAN  RecName: Full=Keratin, type II cytoskeletal 73; AltName: Full=Cytokeratin-73; 
Short=CK-73; AltName: Full=Keratin-73; Short=K73; 
AltName: Full=Type II inner root sheath-specific keratin-K6irs3; 
AltName: Full=Type-II keratin Kb36
Length=540

 Score = 31.2 bits (69),  Expect = 9.2, Method: Compositional matrix adjust.
 Identities = 18/80 (23%), Positives = 34/80 (43%), Gaps = 11/80 (14%)

Query  62   ADILAAKRITGEYARPVVRRFNWDISTNSRFDMWRPDYSEQKTSSVVLPGGSTGSQNIST  121
            +++ + + +  +Y +      N   +  + F + + D     TS V L            
Sbjct  206  SELRSVREVVEDYKKRYEEEINKRTTAENEFVVLKKDVDAAYTSKVELQ-----------  254

Query  122  AKFYVVNGKYEVFKCLYNGE  141
            AK   ++G+ + FKCLY GE
Sbjct  255  AKVDALDGEIKFFKCLYEGE  274