GOS 1412010

Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : JCVI_READ_1091118858520
Annotathon code: GOS_1412010
Sample :
  • GPS: 24°10'29"N; 84°20'40"W
  • Caribbean Sea: Gulf of Mexico - USA
  • Coastal Sea (-2m, 26.4°C, 0.1-0.8 microns)
Authors
Team : Algarve
Username : 768990bio1
Annotated on : 2010-07-30 14:30:16
  • a37076 AnaSofiaCarraçoCosta
  • a37089 LilianaSofiaGriloSantos
  • a37090 ManuelAlexanderGuerreiroVieira

Synopsis

  • Taxonomy: Planctomycetales (NCBI info)
    Rank: order - Genetic Code: Bacterial and Plant Plastid - NCBI Identifier: 112
    Kingdom: Bacteria - Phylum: Planctomycetes - Class: Planctomycetacia - Order: Planctomycetales
    Bacteria; Planctomycetes; Planctomycetacia; Planctomycetales;

Genomic Sequence

>JCVI_READ_1091118858520 GOS_1412010 Genomic DNA
ACGATGCCCGGCCGGATGCCGCCGCCAGCCATCCAGACGGTGAAGCCATGCGGGTGGTGGTCGCGACCGTATCCGGAGCCGTCGCCCTGTGCCATCGGCG
TTCGGCCGAACTCGCCGCCCCAGATCACTAGCGTCTCGTCAAGCAGCCCGCGTTGTTTCAGGTCGGCGATGAGCGCTGCGATCGGGCGATCCACCGCGCG
GCAGCGCTCGGTGTGTTGCTTGGCGATCTCGCCGTGCGAATCCCAGCCCTTCTCGTAAAGCTGCACGAACCTCACGCCGCTCTCGCTCAACCTCCGCGCC
AACAGGCAGTTGTTGGCGAATGACACCCGGCCCGGATCGGCTCCGTATGTCTCGAGCACTGCCCGCGGCTCAGCCGCGATGTCCGCAAGCTCCGGCACGG
CGGTTTGCATCCGATACGCCAGCTCGTATTGCTTGATGCGGGCGAGGATTGCCGGATCGTTCACCAACTCGTGCCGTCGACGGTTCAGCGCGCCCAGCAG
GTTGATCTGGTCGCGGCGGGCCTCACGGCTGACCCCGTCCGGATTATTGAGATAAAGCACCGGGTCGCCCTGAGTCCGCAGCGGCACGCCCTGATGTTCC
GGTGGCAGAAAGCCCGCGCCCCAATAGCTATCGAGCAGCGGCTGAATATTCCTGCCGGAGGCGAGCACGATGTACGCCGGCAGATCGCTGTTCTCGGTGC
CCAGCCCGTACGACGCCCAAGCGCCCAGCGATGGCCAACCGAAACGCACGTTGCCGGTGTTCATAAACGTCACCGCCGGGTCGTGGTTGAACGTGTCGGA
GTACAGCGAACGCACGATGGCCAACTCGTCCGCCACGCCGGCCACGTGTGGAAACAGCTCCGAAATCTCGATGCCGCTTTGGCCGTGCGGCCGAAACTTC
CACGGTGATCCCTTCACCTTCGGCTCC

Translation

[2 - 925/927]   indirect strand
>GOS_1412010 Translation [2-925   indirect strand]
EPKVKGSPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNVRFGWPSLGAWASYGLGTENSDLPAYIVLASGRNIQPLLDS
YWGAGFLPPEHQGVPLRTQGDPVLYLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPELADIAAEPRAVLETYGADPGR
VSFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQHTERCRAVDRPIAALIADLKQRGLLDETLVIWGGEFGRTPMAQGDGSGYGRDHHPHGFTVWMA
GGGIRPGI

[ Warning ] 5' incomplete: does not start with a Methionine
[ Warning ] 3' incomplete: following codon is not a STOP
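
The strand and frame reported above can be checked by hand on a small piece of the read. The following is a minimal Python sketch, not the tool the annotators used: it reverse-complements the last 27 bases of the genomic sequence and translates them starting at base 2 of the reverse strand with the standard genetic code. The names `reverse_complement` and `translate` are illustrative choices for this example.

```python
# Standard genetic code (NCBI table 1), with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str, first_base: int = 1) -> str:
    """Translate from a 1-based start position, ignoring a trailing partial codon."""
    start = first_base - 1
    usable_end = len(dna) - (len(dna) - start) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(start, usable_end, 3))


# Last 27 bases of the read, copied from the genomic sequence above.
READ_3PRIME_END = "CACGGTGATCCCTTCACCTTCGGCTCC"

reverse_start = reverse_complement(READ_3PRIME_END)
print(reverse_start)                # beginning of the reverse strand
print(translate(reverse_start, 2))  # frame starting at base 2 of the reverse strand
```

Because the 3' end of the read becomes the 5' end of the reverse strand, the peptide printed here corresponds to the first residues of the translation shown above.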

Annotator commentaries

The chosen ORF (ORF number 1 in reading frame 2 on the reverse strand, extending from base 2 to base 925) gives highly significant E-values (E << 1e-06) only against hypothetical proteins or proteins of unknown function, and it is incomplete at both ends. Even so, we consider it a potential coding ORF because it is more than 300 codons long and has a good number of homologs.

From the tree analysis, we conclude that the GOS sequence belongs to the Planctomycetales; a more specific and more certain taxonomic assignment would require the complete sequence and a larger number of in-group organisms.

All the homologs found have unknown function. The sequence contains an alkaline phosphatase-like domain with an E-value of 3.3e-11, so the biological process could be a metabolic process and the molecular function could be catalytic activity. However, in the absence of another complementary domain or other supporting information, both the biological process and the molecular function were kept as unknown.

ORF finding

PROTOCOL


a) SMS ORFinder / forward strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code

b) SMS ORFinder / reverse strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code



RESULTS ANALYSIS


There are seven ORFs on the direct strand and four ORFs on the reverse strand.

The chosen ORF is the longest; it runs from base 2 to base 925 on the reverse strand. It appears to be coding and is incomplete at both the 5' and 3' ends, meaning that it has neither a START nor a STOP codon.

Apart from the chosen ORF, none of the ORFs found appears to be biologically meaningful.

The other ORFs found overlap the chosen ORF.

   RAW RESULTS


a) forward strand

>ORF number 1 in reading frame 1 on the direct strand extends from base 1 to base 258.
ACGATGCCCGGCCGGATGCCGCCGCCAGCCATCCAGACGGTGAAGCCATGCGGGTGGTGG
TCGCGACCGTATCCGGAGCCGTCGCCCTGTGCCATCGGCGTTCGGCCGAACTCGCCGCCC
CAGATCACTAGCGTCTCGTCAAGCAGCCCGCGTTGTTTCAGGTCGGCGATGAGCGCTGCG
ATCGGGCGATCCACCGCGCGGCAGCGCTCGGTGTGTTGCTTGGCGATCTCGCCGTGCGAA
TCCCAGCCCTTCTCGTAA

>Translation of ORF number 1 in reading frame 1 on the direct strand.
TMPGRMPPPAIQTVKPCGWWSRPYPEPSPCAIGVRPNSPPQITSVSSSSPRCFRSAMSAA
IGRSTARQRSVCCLAISPCESQPFS

>ORF number 2 in reading frame 1 on the direct strand extends from base 259 to base 555.
AGCTGCACGAACCTCACGCCGCTCTCGCTCAACCTCCGCGCCAACAGGCAGTTGTTGGCG
AATGACACCCGGCCCGGATCGGCTCCGTATGTCTCGAGCACTGCCCGCGGCTCAGCCGCG
ATGTCCGCAAGCTCCGGCACGGCGGTTTGCATCCGATACGCCAGCTCGTATTGCTTGATG
CGGGCGAGGATTGCCGGATCGTTCACCAACTCGTGCCGTCGACGGTTCAGCGCGCCCAGC
AGGTTGATCTGGTCGCGGCGGGCCTCACGGCTGACCCCGTCCGGATTATTGAGATAA

>Translation of ORF number 2 in reading frame 1 on the direct strand.
SCTNLTPLSLNLRANRQLLANDTRPGSAPYVSSTARGSAAMSASSGTAVCIRYASSYCLM
RARIAGSFTNSCRRRFSAPSRLIWSRRASRLTPSGLLR

>ORF number 3 in reading frame 1 on the direct strand extends from base 646 to base 927.
ATATTCCTGCCGGAGGCGAGCACGATGTACGCCGGCAGATCGCTGTTCTCGGTGCCCAGC
CCGTACGACGCCCAAGCGCCCAGCGATGGCCAACCGAAACGCACGTTGCCGGTGTTCATA
AACGTCACCGCCGGGTCGTGGTTGAACGTGTCGGAGTACAGCGAACGCACGATGGCCAAC
TCGTCCGCCACGCCGGCCACGTGTGGAAACAGCTCCGAAATCTCGATGCCGCTTTGGCCG
TGCGGCCGAAACTTCCACGGTGATCCCTTCACCTTCGGCTCC

>Translation of ORF number 3 in reading frame 1 on the direct strand.
IFLPEASTMYAGRSLFSVPSPYDAQAPSDGQPKRTLPVFINVTAGSWLNVSEYSERTMAN
SSATPATCGNSSEISMPLWPCGRNFHGDPFTFGS

>ORF number 1 in reading frame 2 on the direct strand extends from base 173 to base 436.
GCGCTGCGATCGGGCGATCCACCGCGCGGCAGCGCTCGGTGTGTTGCTTGGCGATCTCGC
CGTGCGAATCCCAGCCCTTCTCGTAAAGCTGCACGAACCTCACGCCGCTCTCGCTCAACC
TCCGCGCCAACAGGCAGTTGTTGGCGAATGACACCCGGCCCGGATCGGCTCCGTATGTCT
CGAGCACTGCCCGCGGCTCAGCCGCGATGTCCGCAAGCTCCGGCACGGCGGTTTGCATCC
GATACGCCAGCTCGTATTGCTTGA

>Translation of ORF number 1 in reading frame 2 on the direct strand.
ALRSGDPPRGSARCVAWRSRRANPSPSRKAARTSRRSRSTSAPTGSCWRMTPGPDRLRMS
RALPAAQPRCPQAPARRFASDTPARIA

>ORF number 2 in reading frame 2 on the direct strand extends from base 551 to base 766.
GATAAAGCACCGGGTCGCCCTGAGTCCGCAGCGGCACGCCCTGATGTTCCGGTGGCAGAA
AGCCCGCGCCCCAATAGCTATCGAGCAGCGGCTGAATATTCCTGCCGGAGGCGAGCACGA
TGTACGCCGGCAGATCGCTGTTCTCGGTGCCCAGCCCGTACGACGCCCAAGCGCCCAGCG
ATGGCCAACCGAAACGCACGTTGCCGGTGTTCATAA

>Translation of ORF number 2 in reading frame 2 on the direct strand.
DKAPGRPESAAARPDVPVAESPRPNSYRAAAEYSCRRRARCTPADRCSRCPARTTPKRPA
MANRNARCRCS

>ORF number 1 in reading frame 3 on the direct strand extends from base 132 to base 323.
CGTCTCGTCAAGCAGCCCGCGTTGTTTCAGGTCGGCGATGAGCGCTGCGATCGGGCGATC
CACCGCGCGGCAGCGCTCGGTGTGTTGCTTGGCGATCTCGCCGTGCGAATCCCAGCCCTT
CTCGTAAAGCTGCACGAACCTCACGCCGCTCTCGCTCAACCTCCGCGCCAACAGGCAGTT
GTTGGCGAATGA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
RLVKQPALFQVGDERCDRAIHRAAALGVLLGDLAVRIPALLVKLHEPHAALAQPPRQQAV
VGE

>ORF number 2 in reading frame 3 on the direct strand extends from base 324 to base 908.
CACCCGGCCCGGATCGGCTCCGTATGTCTCGAGCACTGCCCGCGGCTCAGCCGCGATGTC
CGCAAGCTCCGGCACGGCGGTTTGCATCCGATACGCCAGCTCGTATTGCTTGATGCGGGC
GAGGATTGCCGGATCGTTCACCAACTCGTGCCGTCGACGGTTCAGCGCGCCCAGCAGGTT
GATCTGGTCGCGGCGGGCCTCACGGCTGACCCCGTCCGGATTATTGAGATAAAGCACCGG
GTCGCCCTGAGTCCGCAGCGGCACGCCCTGATGTTCCGGTGGCAGAAAGCCCGCGCCCCA
ATAGCTATCGAGCAGCGGCTGAATATTCCTGCCGGAGGCGAGCACGATGTACGCCGGCAG
ATCGCTGTTCTCGGTGCCCAGCCCGTACGACGCCCAAGCGCCCAGCGATGGCCAACCGAA
ACGCACGTTGCCGGTGTTCATAAACGTCACCGCCGGGTCGTGGTTGAACGTGTCGGAGTA
CAGCGAACGCACGATGGCCAACTCGTCCGCCACGCCGGCCACGTGTGGAAACAGCTCCGA
AATCTCGATGCCGCTTTGGCCGTGCGGCCGAAACTTCCACGGTGA

>Translation of ORF number 2 in reading frame 3 on the direct strand.
HPARIGSVCLEHCPRLSRDVRKLRHGGLHPIRQLVLLDAGEDCRIVHQLVPSTVQRAQQV
DLVAAGLTADPVRIIEIKHRVALSPQRHALMFRWQKARAPIAIEQRLNIPAGGEHDVRRQ
IAVLGAQPVRRPSAQRWPTETHVAGVHKRHRRVVVERVGVQRTHDGQLVRHAGHVWKQLR
NLDAALAVRPKLPR


b) reverse strand

>ORF number 1 in reading frame 1 on the reverse strand extends from base 1 to base 300.
GGAGCCGAAGGTGAAGGGATCACCGTGGAAGTTTCGGCCGCACGGCCAAAGCGGCATCGA
GATTTCGGAGCTGTTTCCACACGTGGCCGGCGTGGCGGACGAGTTGGCCATCGTGCGTTC
GCTGTACTCCGACACGTTCAACCACGACCCGGCGGTGACGTTTATGAACACCGGCAACGT
GCGTTTCGGTTGGCCATCGCTGGGCGCTTGGGCGTCGTACGGGCTGGGCACCGAGAACAG
CGATCTGCCGGCGTACATCGTGCTCGCCTCCGGCAGGAATATTCAGCCGCTGCTCGATAG


>Translation of ORF number 1 in reading frame 1 on the reverse strand.
GAEGEGITVEVSAARPKRHRDFGAVSTRGRRGGRVGHRAFAVLRHVQPRPGGDVYEHRQR
AFRLAIAGRLGVVRAGHREQRSAGVHRARLRQEYSAAAR

>ORF number 2 in reading frame 1 on the reverse strand extends from base 559 to base 789.
GCCGCGGGCAGTGCTCGAGACATACGGAGCCGATCCGGGCCGGGTGTCATTCGCCAACAA
CTGCCTGTTGGCGCGGAGGTTGAGCGAGAGCGGCGTGAGGTTCGTGCAGCTTTACGAGAA
GGGCTGGGATTCGCACGGCGAGATCGCCAAGCAACACACCGAGCGCTGCCGCGCGGTGGA
TCGCCCGATCGCAGCGCTCATCGCCGACCTGAAACAACGCGGGCTGCTTGA

>Translation of ORF number 2 in reading frame 1 on the reverse strand.
AAGSARDIRSRSGPGVIRQQLPVGAEVERERREVRAALREGLGFARRDRQATHRALPRGG
SPDRSAHRRPETTRAA

>ORF number 1 in reading frame 2 on the reverse strand extends from base 2 to base 925.
GAGCCGAAGGTGAAGGGATCACCGTGGAAGTTTCGGCCGCACGGCCAAAGCGGCATCGAG
ATTTCGGAGCTGTTTCCACACGTGGCCGGCGTGGCGGACGAGTTGGCCATCGTGCGTTCG
CTGTACTCCGACACGTTCAACCACGACCCGGCGGTGACGTTTATGAACACCGGCAACGTG
CGTTTCGGTTGGCCATCGCTGGGCGCTTGGGCGTCGTACGGGCTGGGCACCGAGAACAGC
GATCTGCCGGCGTACATCGTGCTCGCCTCCGGCAGGAATATTCAGCCGCTGCTCGATAGC
TATTGGGGCGCGGGCTTTCTGCCACCGGAACATCAGGGCGTGCCGCTGCGGACTCAGGGC
GACCCGGTGCTTTATCTCAATAATCCGGACGGGGTCAGCCGTGAGGCCCGCCGCGACCAG
ATCAACCTGCTGGGCGCGCTGAACCGTCGACGGCACGAGTTGGTGAACGATCCGGCAATC
CTCGCCCGCATCAAGCAATACGAGCTGGCGTATCGGATGCAAACCGCCGTGCCGGAGCTT
GCGGACATCGCGGCTGAGCCGCGGGCAGTGCTCGAGACATACGGAGCCGATCCGGGCCGG
GTGTCATTCGCCAACAACTGCCTGTTGGCGCGGAGGTTGAGCGAGAGCGGCGTGAGGTTC
GTGCAGCTTTACGAGAAGGGCTGGGATTCGCACGGCGAGATCGCCAAGCAACACACCGAG
CGCTGCCGCGCGGTGGATCGCCCGATCGCAGCGCTCATCGCCGACCTGAAACAACGCGGG
CTGCTTGACGAGACGCTAGTGATCTGGGGCGGCGAGTTCGGCCGAACGCCGATGGCACAG
GGCGACGGCTCCGGATACGGTCGCGACCACCACCCGCATGGCTTCACCGTCTGGATGGCT
GGCGGCGGCATCCGGCCGGGCATC

>Translation of ORF number 1 in reading frame 2 on the reverse strand.
EPKVKGSPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNV
RFGWPSLGAWASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQGVPLRTQG
DPVLYLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPEL
ADIAAEPRAVLETYGADPGRVSFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQHTE
RCRAVDRPIAALIADLKQRGLLDETLVIWGGEFGRTPMAQGDGSGYGRDHHPHGFTVWMA
GGGIRPGI

>ORF number 1 in reading frame 3 on the reverse strand extends from base 168 to base 443.
ACACCGGCAACGTGCGTTTCGGTTGGCCATCGCTGGGCGCTTGGGCGTCGTACGGGCTGG
GCACCGAGAACAGCGATCTGCCGGCGTACATCGTGCTCGCCTCCGGCAGGAATATTCAGC
CGCTGCTCGATAGCTATTGGGGCGCGGGCTTTCTGCCACCGGAACATCAGGGCGTGCCGC
TGCGGACTCAGGGCGACCCGGTGCTTTATCTCAATAATCCGGACGGGGTCAGCCGTGAGG
CCCGCCGCGACCAGATCAACCTGCTGGGCGCGCTGA

>Translation of ORF number 1 in reading frame 3 on the reverse strand.
TPATCVSVGHRWALGRRTGWAPRTAICRRTSCSPPAGIFSRCSIAIGARAFCHRNIRACR
CGLRATRCFISIIRTGSAVRPAATRSTCWAR

Multiple Alignment

PROTOCOL


a) Muscle multiple sequence alignment / default parameters;


RESULTS ANALYSIS


From the multiple alignment we conclude that the sequence is incomplete at both the 5' and 3' ends.

Many gaps are inserted in the sequence at both ends.

Not all the sequences align well, in particular u5H12_in, u3FN_in, u8FN_in and u6FN_in, which are also the in-group sequences with the lowest homology.
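
One way to put a number on how well two rows align is the fraction of identical residues over the columns where neither row has a gap. The sketch below applies this to the GOS_141201 and Pl.ms.in rows of one block of the raw Muscle alignment in this section. The function name `pairwise_identity` is illustrative, and this simple measure is an assumption of the example, not the criterion the annotators used.

```python
def pairwise_identity(row_a: str, row_b: str) -> float:
    """Fraction of identical residues over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(row_a, row_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either row
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else 0.0


# Two rows copied from the same block of the raw alignment.
GOS_ROW = "ALIADLKQRGLLDETLVIWGGEFGRTPMAQGDGSGY----GRDHHPHGFTVWMAGGGIRP"
PLMS_ROW = "ALIKDLRQRGLLDDTLVVWGAEFGRTPMVQGDRKAP----GRDHHKDAYTVWMAGGGVKR"

print(f"identity GOS vs Pl.ms.in: {pairwise_identity(GOS_ROW, PLMS_ROW):.2f}")
```

Repeating the calculation against the u5H12_in, u3FN_in, u8FN_in and u6FN_in rows of the same block would show whether those sequences are indeed the most divergent.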




Key to the sequence labels used in the multiple sequence alignment (in = in-group, out = out-group):


u5H12_in uncultured planctomycete 5H12 in

u3FN_in uncultured planctomycete 3FN in

u8FN_in uncultured planctomycete 8FN in

u6FN_in uncultured planctomycete 6FN in

V.sp.out Verrucomicrobium spinosum out

Si.ln.out Spirosoma linguale out

Sh.spi out Sphingobacterium spiritivorum 1 out

Sh2_out Sphingobacterium spiritivorum 2 out

F.bc.out Flavobacteriales bacterium out

Pl.lm.in Planctomyces limnophilus in

Pi.st.in Pirellula staleyi in

G.o.in Gemmata obscuriglobus in

B.mn.in Blastopirellula marina in

R.bl.in Rhodopirellula baltica in

GOS_141201 GOS_1412010 (the annotated sequence)

Pl.ms.in Planctomyces maris in

RAW RESULTS
a) Muscle multiple sequence alignment
CLUSTAL FORMAT: MUSCLE (3.7) multiple sequence alignment


u5H12_in        -------------------------------MSESVDNAGCGSNEHVNRRSLLKLAGLS-
u3FN_in         --------------------------------------MNSSSRNGMSRRHFMQHAATAA
u8FN_in         -----------------------------------MNPRHPSNRSTPSRRDLLKTGGIAA
u6FN_in         ------------------------------------------MTFEFSRRNALMTLSCGF
V.sp.out        -----------------------------------MNPFHLASARLATRRTFLGQSGLGL
Si.ln.out       -------------------------------MNKLIQELRHAAAERETRRHFLHTCSTGL
Sh.spi out      ----------------------------MVDLDKLMREAQQYKLQAVTRRHFLKDCVAGI
Sh2_out         ----------------------------MVDLDKLMREAQQYKLQAVTRRHFLKDCVAGI
F.bc.out        ---------------------------MEDLINRLIQEKLARETESKTRRHFLKNCTQGI
Pl.lm.in        -----------------------------MAIMSLQNMVYEQLRQQMVRRYFFQRSAAGL
Pi.st.in        ------------------------------------MSMFDDLLAAHTRRYFFGKMAHGL
G.o.in          ----------------------------------MLPFSPAERQQLLNRRWFLRECGVGL
B.mn.in         -----------------------------------------MTPNDISRRWFLRDCGVGL
R.bl.in         MADGQSHDFELGRNDHQIMNSLNMSYAQSTGLTSADPSRFAKGLDLMRRRWFLQQCGLGL
GOS_141201      ------------------------------------------------------------
Pl.ms.in        ------------------------------------MNAHLQQLLNYTRREFFQRAGMGI
                                                                            

u5H12_in        GLSWLTPLATGLARQ-------------------------------KEQSKVDQARSLIV
u3FN_in         TTIPTFQFLNQLQA--------------------------------NAADVRKRQKACIL
u8FN_in         MGLSLPRLLQSQDRPPGPGNEENGPQS------SISP-----------TSARTKEISCIF
u6FN_in         GYMALAGLTTKAAEDSKN---------------PLAP---------KAPHYPATAKRVIF
V.sp.out        GAMALQGMLPGLSSAAPAAGGMSLPEN------PLLP---------HQAPLPAKARSVIY
Si.ln.out       GAMALSSVLGSCGFFGKENPEANAVGAASLSGEPTAP---------HPSQYIPKAKRIIY
Sh.spi out      GSIALGSLLASCGGSAQGDAPLNLNALN-----PMIP---------RAPHFPAKAKSVIY
Sh2_out         GSIALGSLLASCGGTGQGDAPLNLNALN-----PMIP---------RAPHFPAKAKSVIY
F.bc.out        GGLALSSIFMGCDSFGKPTKKNQITFAER-DLNPLAT---------LAPPYSPKVKSIIY
Pl.lm.in        GGAALASLLNPQLFSGMPALAANPEVS------STVPPDLSSLGALPGLHHAPKAKRVIW
Pi.st.in        GTAALASLLAHDSQADDAAKVAAAN--------ELVS-----LGALKSLHHAPKAKRVIW
G.o.in          GAMALADLARHEARGADA---------------PRA----------LKTHHEPKAKRVIY
B.mn.in         GSIALADMLRGEAGAEQID--------------PMAP---------KRPHFAGKAKNVIF
R.bl.in         GHIALTTLMAEAGDLELAGPDRSSVN-------PMAP---------KSPHFPAKIKNVIL
GOS_141201      ------------------------------------------------------------
Pl.ms.in        GGAALTTLLANDLQAALPTAAN-----------PMAA---------RQSHFTPKAKNVIF
                                                                            

u5H12_in        LWLEGAPSQLETFDPHPGAEIAAGSLA-----------------------------RPTN
u3FN_in         MWMGGGPPSIDIWDLKP------------------------GSKN-------GGEFKPID
u8FN_in         IHQYGGLSQLDSWDPKP-----------------------LSSAE---IRGP-YSPIQTV
u6FN_in         LCMRGGPSHVDTFDYKPQLTKDSGKQSPG-----------QKSRK---LMGSPWSFKQCG
V.sp.out        LHMSGAPPTLDLFDYKPKLNELHMQDCPDSLFAGKRFAFIKGRPK---MLGSPYKFKQYG
Si.ln.out       IHMAGSPSQLELFDYKPELAKYNGKDCPQALLEGKKFAFIRGTPK---MLGPQGKFAQRG
Sh.spi out      LHMAGAPSQLELFDYKPELHKLHNKPCPDSLLKGKKFAFIRGTPN---MLGPQATFAQYG
Sh2_out         LHMAGAPSQLELFDYKPELHKLHNKPCPDSLLKGKKFAFIRGTPN---MLGPQATFAQYG
F.bc.out        LHMAGAPSQLEMFDYKPALQKLDGQDCPQSLLEGKKFAFIKGTPK---LMGPQAKFKQEG
Pl.lm.in        LFMADGPSQLDLFDYKPKMVDWFDKDLPESIRNGQRITTMTSGQSRFPIAPSVFKFNQYG
Pi.st.in        LFMADAPSQLDLFDYKPKLADFFDKDLPESIRQGQRITTMTSGQSRLPVAPSTFKFAQHG
G.o.in          LFMGGAPSHLELFDNKPQLAKFDGTLPPPELLKNYRAAFINPNSK---LLGPKFKFKKFG
B.mn.in         LFMAGAPSHLEMFDYKPQLEKFDGSLPPAELLDGYRAAFINPNSK---LLGPKFKFAKHG
R.bl.in         LFMGGGPSQFEMFDYKPELERLDGTLPPAELLDGYRAAFINPNSK---LLGPKFKFEKKG
GOS_141201      ------------------------------------------EPK---VKGSPWKFRPHG
Pl.ms.in        LHMVGAPSHLDLYDAKPKLQELDGELVPDKLWEGLRLAFIREQPK---LMGSPFAFQQQG
                                                                            

u5H12_in        APGI-LLGDGFEQTAEQMDSISLVRAITSKEGDHARAVYNIKTGYRPDPTLIHPAIGSVI
u3FN_in         TKGDLQISEHMPKTAMVMDNLSVVRSMSTREADHTR-------G------------GILH
u8FN_in         TPGF-QISELMPKLSQMSDKYSVIRSMTHGNAQHDQANAMMLAG-RSNPAPDDPSFGAMV
u6FN_in         ESGL-PISDLFPHLGKHADDLCLVNGMAGEVPNHPQAYLKLHTG---SFRFVRPSVGSWA
V.sp.out        QSGA-WVSDMFPHFTKIVDEVALVKSMNTDQFNHAPAELFVHTG---DMRAGGASIGSWV
Si.ln.out       QSGA-WVSDYLPHLQGVVDEISFLKAMHTDQFNHAPAQLLMHTG---SARLGRPSMGSWV
Sh.spi out      ESGA-WISDHLPHFSKVADEVSFLKAVHTDQFNHGPAQLFMHTG---SARLGRPSIGSWV
Sh2_out         ESGA-WISDHLPHFSKVADEVSFLKAVHTDQFNHGPAQLFMHTG---SARLGRPSIGSWV
F.bc.out        ESGN-WVSNYLPHFKKVVDDVAFLKAVHTDQFNHGPAQLFMHTG---SARLGRPSIGSWA
Pl.lm.in        QNGT-YISELLPHLAGVVDDLTIVKTMYTEAINHDPAITFIQTG---SELPGRPSLGAWL
Pi.st.in        KNGT-WLSELLPKLGEVVDDIAIIKTMNTEAINHDPAITYIQTG---SQIPGRPSMGAWS
G.o.in          QSGT-ELGELLPHLGAVVDDIAVVKSMHTDAFNHAPAQILALTG---HQQFGRPSAGAWV
B.mn.in         ECGA-EISELLPYTAKIADELSIIKSMKTDAFNHAPAQIMMNTG---SPLFGKPSLGAWT
R.bl.in         SAGT-HISELLPHTAGVLDDICLIRSMKTDAFNHAPAQLMMSTG---SQQFGRPSMGSWT
GOS_141201      QSGI-EISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTG---NVRFGWPSLGAWA
Pl.ms.in        EAGL-PISELMPHLGSVSDELCMIHSLKTDHFNHAPAQLFFQTG---FSRFGRPSLGSWV
                  *   :.: :       *.  .:. :     :*         *            *   

u5H12_in        CHQIPKAASSMVDIPRHISILA-------GQSAGRG----GYLGDRYDAFKIGDPLNPVP
u3FN_in         AHCL---CTQPDRRASELRF------GRQLRTRFKAART-GNPLVCFRRRQRAQLGLPSA
u8FN_in         T-KL---RPSFAKIPPHVWLQK---------------YGGGAAPPEQTYLSGGRLGAAVA
u6FN_in         LYGL---GTENQDLPGFITLNP------ETRVGGAQNYGSSFLPAYYQGTAIGQINQSLA
V.sp.out        TYGL---GSENLDLPGFVVLLS----GGTDPTGGKSLWNSGFLPSVYQGVQCRTTGEPIL
Si.ln.out       TYGL---GTENDNLPGYIVLAS----GGKQPDAGKSVWGSGFLPTVYQGVQCRTDGDPVL
Sh.spi out      TYGL---GSENSNLPGFVVLTS----GGKTPDAGKSVWGSGFLPSVYQGVQCRSKGDPVL
Sh2_out         TYGL---GSENSNLPGFVVLTS----GGKTPDAGKSVWGSGFLPSVYQGVQCRSKGDPVL
F.bc.out        TYGL---GSENQNLPGFVVLTS----GGNSPDAGKSVWGSGFLPSVYQGVQCRSKGDPVL
Pl.lm.in        SYGI---GSPNQDLPAFVVLHSKIAAGAQTQALFSRLWGSGFLPTKHQGVALRSSGDPVL
Pi.st.in        SYGL---GSENQDLPAYVVLHSKLAPGSSSQALFSRLWGSGFLPTKHQGVALRSSGDPVL
G.o.in          SYGL---GSESKDLPGFVVFSS----GSKGPSGGNSCWGSGFLPSSHAGTLFRSAGDPVL
B.mn.in         MYGL---GSESRNLPGFVVFSS----GKKGPSGGSSNWGSGYLPTVYQGVQLRSVGDPVL
R.bl.in         TYGL---GSESRDLPAYVVFNS----GKKGPSAGSGNWNSGFLPSLHSGVEFRSSGDPVL
GOS_141201      SYGL---GTENSDLPAYIVLAS----GRNIQPLLDSYWGAGFLPPEHQGVPLRTQGDPVL
Pl.ms.in        NYGL---GSENSNLPGFVVLIT----G-NVAGAGNSLWGSGFLPSIYQGVEFRSSGDPVL
                   :    .     .  : :                    .                .  

u5H12_in        -------DLKPRVGDRQQNQRLKDLS-------VVDQAFL-------QRQGNNRTLQGVL
u3FN_in         -----RCIRPSGVSSTGQIQNAQMSGSKGRLSSRLKMLDVVETNFAKSRRGDAPKSHTEV
u8FN_in         PMLIGKQHDENPASDDFRVRAFDRSQ-------GISQQRLKTRWQLTQKIQN-SSNAGKT
u6FN_in         NARFGNIKNPR-FSEALQREQLNLLQ-------EMNQELL-------NKKEVSPELEGVI
V.sp.out        -----FSKNPEGMARDSRRRSLDALG-------RLNQLEA-------AELGD-PETLTRI
Si.ln.out       -----YASDPSGISRDIRKQTIDAIS-------QINQQQY-------DEVKD-PEILTRI
Sh.spi out      -----YIADPDGMGRDLKKHTIDAIN-------KVNMDEY-------ETYKD-PETLSRI
Sh2_out         -----YIADPDGMGRDLKKHTIDAIN-------KVNMDEY-------ETYKD-PETLSRI
F.bc.out        -----YIKDPDGITRDLKKSTIDAIN-------KINKEEY-------LKYAD-PEILARI
Pl.lm.in        -----YLSNPKGVSSQSRRTMLDGLA-------ELNQQHL-------QEMGD-PEIAARI
Pi.st.in        -----YLSNPGGVSKETRRKMLDGLA-------ELNQQKY-------EAVGD-PEITSRI
G.o.in          -----FLGNPAGIDPGLQRDSLDALN-------ALNKRRL-------DVVGD-PEIAARI
B.mn.in         -----YLSNPTGIDRTVQRDSLDTIN-------QLNQQRL-------AATGD-PEIATRI
R.bl.in         -----YLSNPPGMSDWSQRQSLQTVN-------ELNRRRL-------DVVGD-PEIATRI
GOS_141201      -----YLNNPDGVSREARRDQINLLG-------ALNRRRH-------ELVND-PAILARI
Pl.ms.in        -----FLSNPKGMTGEDRKRIIDSVN-------HLNKVQL-------ADVGD-PEIATRI
                                 .    .           :.                        

u5H12_in        GNHN---------LDAALSMMSSEQISAFDVSKADSSLREEYGDTP-----FGRGCLAAV
u3FN_in         YQKA----------VNLMTSSQMK---AFKVEDEDPKTKEAYGEN-----NFGQGLLMAR
u8FN_in         SEYDRLDLYQRKSFDLLDSTKAKE---AFDIKQEPSTIRERYGQNP-----LGQNLVLAR
u6FN_in         ESYE---------LAFRMQNAVPT---LMDVSSESRETLDMYGVGVKETDNFGRQCLLAR
V.sp.out        SQYE---------LAYRMQTAVPE---VFDIQKEPESVRNLYGAKPGEG-SFANNCLLAR
Si.ln.out       AQYE---------LAFRMQMSVPD---AMDIKSEPQYMLDSYGVDPNKG-SFARNCLLAR
Sh.spi out      AQYE---------MAYKMQVAVPE---VMDIASEPEYIHELYGTQPGKE-SFANNCLLAR
Sh2_out         AQYE---------MAYKMQVAVPE---VMDIASEPEYIHELYGTQPGKE-SFANNCLLAR
F.bc.out        NQYE---------MAYRMQIAVPE---VMNINNEPEDIKQMYGVEPGKE-SFANNCLLAR
Pl.lm.in        AQYE---------MAFRMQTSVPD---LVDLKDESKETLEMYGPEVNEPGTFAYNCLLAR
Pi.st.in        AQYE---------MAYRMQSSVPE---LVDVSKETQETLDLYGPEVKEPGTFAYNCLLAR
G.o.in          SAYE---------MAGRMQSSAPE---LMDLSKETKETLAMYGAEPGKP-SFANNCLLAR
B.mn.in         NSFE---------MAYRMQASGPE---LMDLSSEPKHVLDMYGVDPDKP-SFAKNCLLAR
R.bl.in         NGYE---------MAYRMQSSAPE---AMALSDEPEHMLKLYGAEPGKM-SFANNCLLAR
GOS_141201      KQYE---------LAYRMQTAVPE---LADIAAEPRAVLETYGADPGRV-SFANNCLLAR
Pl.ms.in        NQYE---------MAYRMQSAVPE---LMDISNEPKHIHEQYGTQPGKA-SFANNCLLAR
                                              :          **        :.   : * 

u5H12_in        RLIEQGVRCVEVTLN-------------GWDTHVNNHE--------LQAGRIEILDPAFA
u3FN_in         RLVETGVPFIEVSSG-------------GWDLHNGVFT-------ALKDTKLPELDQGIG
u8FN_in         RLVEAGVRLVNVLAWTGLAPQEKFVSVETWDMHGNADVGIFEDGWNGLPFALPRADQAVA
u6FN_in         RFAESGVRFIELCHG-------------NWDQHGNLKG--------KLESNCRATDQPIA
V.sp.out        RLVENGVRYVQLFDW-------------GWDIHGTGKGDDLV---NKFPQKCRDVDQACA
Si.ln.out       RLVERGVRFVQLFDW-------------GWDTHGTSADGSID---VGLKAKCKESDQAVA
Sh.spi out      KLVEQGVRFVQLFDW-------------GWDSHGTSASDSID---LGFRNKCREIDRPMT
Sh2_out         KLVEQGVRFVQLFDW-------------GWDSHGTSASDSID---LGFRNKCREIDRPMT
F.bc.out        KLVEDGVRFVQLFDW-------------GWDTHGNIREGSID---IGLRNKCREIDRPIT
Pl.lm.in        RLAERGVRFTQIFLR-------------GWDHHGGLPG--------QIRQLVKSADQPCA
Pi.st.in        RLAERGVRFTQVFLR-------------GWDHHNNLPK--------QIPLLCKSMDQPAA
G.o.in          RLVERGVRFVQLFHE-------------AWDHHGGLTN--------GLKAECGKTDKACA
B.mn.in         RLVERGTRFVQLFHE-------------AWDQHGNLKK--------DLQENCLATDQACA
R.bl.in         RLVQRGVRFVQLFHE-------------SWDQHGGLTG--------GLKQNCGDTDQACA
GOS_141201      RLSESGVRFVQLYEK-------------GWDSHGEIAK--------QHTERCRAVDRPIA
Pl.ms.in        RLVERGVRFVQLFDQ-------------GWDHHGSIVK--------SLKNKCRQVDQPIA
                .: : *.   ::                 ** *                      *    

u5H12_in        ALIRDLKRRELLDSTMVVCGGEFGRTPWMNPLG-------GRDHWPSGFSMALAGGGIQG
u3FN_in         ALTADLKQRGMLDDVVLVWMGEFGRTPRINANV-------GRDHWARSWSVMIGGGGLQG
u8FN_in         TLLTDLEERGLLETTLVVLCGEFGRTPKISRGAKRI----GRDHWPNCYSALVAGGGIQG
u6FN_in         ALLTDLKQRGMLKDTLVVWGGEFGRTPHVKKKD-------GRDHNATGFSTWMAGGGVKG
V.sp.out        ALITDLKQRGLLENTLVVWGGEFGRTPMNEARGGST--YLGRDHHPNCFTMWMAGGGIKG
Si.ln.out       ALLNDLKQRGLLDDTLVVWGGEFGRTPMQENRDGQTLPFMGRDHHLEAFTVWMAGGGVKK
Sh.spi out      ALIMDLKQRGLLDETLVVWGGEFGRTPMQENRDNRDMPFMGRDHHTDAYTIWMAGGGIRK
Sh2_out         ALIMDLKQRGLLDETLVVWGGEFGRTPMQENRDNRDMPFMGRDHHTDAYTIWMAGGGVRK
F.bc.out        ALILDLKQRGLLDETLIVWGGEFGRTPMQENRGNKKMAFKGRDHHGDAFTMWIAGGGIKK
Pl.lm.in        ALIKDLKQRGMLDDTLVVWGGEFGRTIYSQGTLTKENY--GRDHHPRCFTMWFAGGGMKP
Pi.st.in        GLLKDLKQRGMLEDTLVVWGGEFGRTVYSQGTLTKENH--GRDHHPRCFTMWMAGAGVKP
G.o.in          ALIKDLKQRGLLKDTLVVWGGEFGRTPMVQGGND------GRDHHPNCYSVWLAGGGVKP
B.mn.in         ALVQDLKQRGLLEDTLVIWGGEFGRTPMVQGGGDD-----GRDHHPNAFTMWMAGGGAKG
R.bl.in         ALVKDLKQQGMLDETLVIWGGEFGRTPMVQGGND------GRDHHPNSFSMWMAGGGLKP
GOS_141201      ALIADLKQRGLLDETLVIWGGEFGRTPMAQGDGSGY----GRDHHPHGFTVWMAGGGIRP
Pl.ms.in        ALIKDLRQRGLLDDTLVVWGAEFGRTPMVQGDRKAP----GRDHHKDAYTVWMAGGGVKR
                 *  **  . :*. .:::  .*****   .          ****.   ::  ..*.* . 

u5H12_in        GRVVGETSANPKRDTIDRKTEL-KNPHSVEDIHATILGSLGIELTKELDTPIGRPMEICQ
u3FN_in         GVAVGAT------DADGTTVV--DRSYLPGDIWSTVSHALGISTKIVHTSKRGRPMKIAN
u8FN_in         GVVYGES------DKQGAYVK--SNPVSLEDFTATLFSAMKIDP---TSRLSPDGFTLPA
u6FN_in         GQRVGAT------DEYGITAI--ENKMNFHDLHATMLHTLGLDHTKLTYRYAGRDFRLTD
V.sp.out        GIVHGAT------DELGYDVV--EGKYTIRDLQTTLLHQLGFDAHKFSYRYQGLNQRLIG
Si.ln.out       GFSFGET------DDIGYYGV--KDKVHIHDLQATILHLLGFDHTKLTYQFQGRPFRLTD
Sh.spi out      GVTYGET------DEIGFTAV--SGRSSVHDVHATMLHLLGFDHEKFTYEFQGRPFRLTD
Sh2_out         GVTYGET------DEIGFTAV--SGRSSVHDVHATMLHLLGFDHEKFTYEFQGRPFRLTD
F.bc.out        GASHGKT------DDIGFSGT--EGRVSVHDVHATIMHLLGFDHEQFTFEFQGRHFRLTD
Pl.lm.in        GTVYGET------DDFSYNIVSENQKMHIHDLNATILHLLGINHERLTYRSQGRDFRLTD
Pi.st.in        GVVHGET------DDFSYNVV--DKPVHIHDLNATILHLMGVNHEQLTYRFQGRDFRLTD
G.o.in          GLVLGAS------DDLGFNAT--DDRVHVHDLNATLLHLLGLDHEKLTYRLQGRDFRLTD
B.mn.in         GVSYGET------DDFGFNTV--KDGVHVRDLHATILHLLGFDHNRLSVKFQGLAQKLTG
R.bl.in         GLAYGQT------DELGFDVA--ENPVHVHDLHATILHLLGFDHKQLTYRFQGRDYRLTD
GOS_141201      GI----------------------------------------------------------
Pl.ms.in        GFAYGKT------DDIGFNVA--ENPMHVNDFHATLLHLLGMDHERLTFKFQGLDMRVTG
                *                                                           

u5H12_in        ----GEMIRELLD
u3FN_in         ---GGTPIKELIG
u8FN_in         -ST-GDVIGDLF-
u6FN_in         -VY-GRVVKDILS
V.sp.out        PTGDGRLMKEILA
Si.ln.out       -VA-GKVVKPILA
Sh.spi out      -VE-GNLISDII-
Sh2_out         -VE-GNLISNII-
F.bc.out        -VE-GEVIKEILA
Pl.lm.in        -VE-GNVIREILA
Pi.st.in        -VE-GHVVKEILT
G.o.in          -VH-GVVVEKLLA
B.mn.in         -VEPARIVHDLIA
R.bl.in         -VH-GELVHDILA
GOS_141201      -------------
Pl.ms.in        -VA-GNVVPDIIA
                             

Protein Domains

PROTOCOL


InterProScan; default parameters at EDI


RESULTS ANALYSIS


InterProScan returns two results.

The second result, Alkaline phosphatase-like (residues 201-305), lies within the first result, Protein of unknown function DUF1501 (residues 6-307).

The DUF1501 match has the smaller E-value (3.7e-115) and is therefore the more significant hit, while the Alkaline phosphatase-like match has a larger but still acceptable E-value (3.3e-11).

The first result has no known function, but the Alkaline phosphatase-like domain contained within it is associated with catalytic activity (molecular function) and with a metabolic process (biological process).




RAW RESULTS

Translation	FCB2C8F24941C0EF	308	HMMPfam	PF07394	DUF1501	6	307	3.7e-115	T	16-Mar-2010	IPR010869	Protein of unknown function DUF1501	
Translation	FCB2C8F24941C0EF	308	superfamily	SSF53649	Alkaline phosphatase-like	201	305	3.3e-11	T	16-Mar-2010	IPR017850	Alkaline-phosphatase-like, core domain	Molecular Function: catalytic activity (GO:0003824), Biological Process: metabolic process (GO:0008152)
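
The raw output is tab-separated, so the nesting of the two matches and the ordering of their E-values can be checked programmatically. The sketch below is illustrative: the field names of `Match` are labels chosen for this example and do not come from the output itself, and the two records are copied from the raw results above with the trailing InterPro and GO columns omitted.

```python
from typing import List, NamedTuple


class Match(NamedTuple):
    method: str
    accession: str
    name: str
    start: int
    end: int
    evalue: float


# The two raw records, with the InterPro and GO columns left out for brevity.
RAW = (
    "Translation\tFCB2C8F24941C0EF\t308\tHMMPfam\tPF07394\tDUF1501"
    "\t6\t307\t3.7e-115\tT\t16-Mar-2010\n"
    "Translation\tFCB2C8F24941C0EF\t308\tsuperfamily\tSSF53649"
    "\tAlkaline phosphatase-like\t201\t305\t3.3e-11\tT\t16-Mar-2010\n"
)


def parse_matches(raw: str) -> List[Match]:
    """Parse tab-separated lines into Match records (column positions as seen above)."""
    matches = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        f = line.split("\t")
        matches.append(Match(f[3], f[4], f[5], int(f[6]), int(f[7]), float(f[8])))
    return matches


def contains(outer: Match, inner: Match) -> bool:
    """True if the inner match lies entirely within the outer match."""
    return outer.start <= inner.start and inner.end <= outer.end


duf1501, alk_phos = parse_matches(RAW)
print("nested:", contains(duf1501, alk_phos))
print("most significant:", min((duf1501, alk_phos), key=lambda m: m.evalue).name)
```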

Phylogeny

PROTOCOL


a) Phylogeny.fr/ BioNJ method /no bootstrap / out group: Verrucomicrobium_spinosum(Verrucomicrobia), Spirosoma_linguale (Sphingobacteria), Sphingobacterium_spiritivorum_1 (Sphingobacteria), Sphingobacterium_spiritivorum_2 (Sphingobacteria), Flavobacteriales_bacterium (Flavobacteria).



b) Phylogeny.fr/ PhyML method / no bootstrap / out group: Verrucomicrobium_spinosum(Verrucomicrobia), Spirosoma_linguale (Sphingobacteria), Sphingobacterium_spiritivorum_1 (Sphingobacteria), Sphingobacterium_spiritivorum_2 (Sphingobacteria), Flavobacteriales_bacterium (Flavobacteria).



RESULTS ANALYSIS


The two trees do not agree with each other.

In both trees the out-group sequences are separated from the in-group sequences.

Because all the in-group sequences (among which the GOS sequence is satisfactorily placed) belong to the order Planctomycetales, we can conclude that the sequence belongs to the Planctomycetales, class Planctomycetacia. Because in both trees it groups with the uncultured planctomycetes, nothing more can be said about the organism; moreover, this grouping may simply arise because those sequences are more distant from the rest of the in-group, and because the GOS sequence is incomplete at both ends.

The GOS sequence is inserted at node 2.



In tree a) the marked nodes have values below 0.7:

node 1 - 0.66

node 2 - 0.29

node 3 - 0.17

node 4 - 0.53

node 5 - 0.66

node 6 - 0.31

node 7 - 0.23



In tree b) the marked nodes have values below 0.7:


node 1 - 0.4

node 2 - 0.002

RAW RESULTS

a) BioNJ

   +-----------------Planctomyces_maris_in                                                                               (planctomycetacia)
   |
   |               +-------Planctomyces_limnophilus_in                                                                   (planctomycetacia)
   |     +---------+
   |     |         +----------Pirellula_staleyi_in                                                                       (planctomycetacia)        
   | +---+4
   |7|   |
5+-+++   +----------------Gemmata_obscuriglobus_in                                                                        (planctomycetacia)
 | |||
 | |||  +-------------Blastopirellula_marina_in                                                                           (planctomycetacia)
 | ||+--+
 | || 6 +----------------Rhodopirellula_baltica_in                                                                        (planctomycetacia)
 | ||
 | ||                         +------------------------------------------------------uncultured_planctomycete_5H12_in      (planctomycetacia)
 |3++                         |
 |  |           +-------------+        +-------------------------------------------------uncultured_planctomycete_3FN_in   (planctomycetacia)
 |  |           |             +--------+
 |  |+----------+                      |
 |  ||          |                      +------------------------------------uncultured_planctomycete_8FN_in                (planctomycetacia)
 |  ||          |
 |  ++          +----------------------------------uncultured_planctomycete_6FN_in                                         (planctomycetacia)
 |  2|
 |   +---------------------GOS_1412010
 |
 | +----------------Verrucomicrobium_spinosum_out                                                                           (Verrucomicrobia)
 | |
 | |   +------------Spirosoma_linguale_out                                                                                  (Sphingobacteria)
1+-+   |
   |   |
   +---+               +Sphingobacterium_spiritivorum_1_out                                                                 (Sphingobacteria)
       |        +------+
       +--------+      +Sphingobacterium_spiritivorum_2_out                                                                 (Sphingobacteria)
                |
                +-----------Flavobacteriales_bacterium_out                                                                  (Flavobacteria)


----------------------------------------------------------------------------------------------------------------------------

b) PhyML

                  +-----Planctomyces_limnophilus_in                                                                         (planctomycetacia)
            +-------+
            |       +--------Pirellula_staleyi_in                                                                           (planctomycetacia)
        +---+
        |   +-----------Gemmata_obscuriglobus_in                                                                            (planctomycetacia)
    +---+
    |   |
    |   +-------------Rhodopirellula_baltica_in                                                                             (planctomycetacia)
    |
    |+----------Blastopirellula_marina_in                                                                                   (planctomycetacia)
 +--+|
 |  ||                                         +-----------------------------------------uncultured_planctomycete_3FN_in    (planctomycetacia)
 |  ||                               1+--------+
 |  ||                                |        +---------------------------------uncultured_planctomycete_8FN_in            (planctomycetacia)
 | 2++                 +--------------+
 |   |                 |              +-----------------------------------------------uncultured_planctomycete_5H12_in      (planctomycetacia)
 |   |       +---------+
 |   |       |         |
 |   |  +----+         +--------------------------uncultured_planctomycete_6FN_in                                           (planctomycetacia)
 |   |  |    |
 |   +--+    +------------------GOS_1412010
 |      |
 |      +-----------Planctomyces_maris_in                                                                                   (planctomycetacia)
 |
 |  +-------------Verrucomicrobium_spinosum_out                                                                             (Verrucomicrobiae)
 |  |
 |  |     +---------Spirosoma_linguale_out                                                                                  (Sphingobacteria)
 +--+     |
    |     |
    +-----+             +Sphingobacterium_spiritivorum_1_out                                                                (Sphingobacteria)
          |       +-----+
          +-------+     +Sphingobacterium_spiritivorum_2_out                                                                (Sphingobacteria)
                  |
                  +--------Flavobacteriales_bacterium_out                                                                   (Flavobacteria)

Taxonomy report

PROTOCOL


a) BLASTp vs NR, NCBI default parameters with "Max target sequences" set to 1000


RESULTS ANALYSIS


The in-group and out-group were chosen from the lineage report and from the E-values of the hits in BLASTp vs NR. The number of organisms is low, which is reflected in the small number of sequences and in the wide range of E-values in the in-group (2e-103 to 4e-14).

The planctomycete hits were chosen as the in-group. The out-group sequences do not belong to the same class as the in-group, but come from classes that are not too distant from the planctomycetes.



In group: planctomycetes

ZP_01857254.1 Planctomyces maris in Planctomyces maris DSM 8797 2e-103 (planctomycetacia)

ZP_04427336.1 Planctomyces limnophilus in Planctomyces limnophilus DSM 3776 5e-95 (planctomycetacia)

ZP_01092432.1 Blastopirellula marina in Blastopirellula marina DSM 3645 4e-95 (planctomycetacia)

NP_870347.1 Rhodopirellula baltica in Rhodopirellula baltica SH 1 8e-95 (planctomycetacia)

YP_003373058.1 Pirellula staleyi in Pirellula staleyi DSM 6068 1e-94 (planctomycetacia)

ZP_02736239.1 Gemmata obscuriglobus in Gemmata obscuriglobus UQM 2246 2e-92 (planctomycetacia)

ABX10661.1 uncultured planctomycete 6FN in uncultured planctomycete 6FN 5e-68 (planctomycetacia)

ABX10571.1 uncultured planctomycete 5H12 in uncultured planctomycete 5H12 9e-23 (planctomycetacia)

ABX10686.1 uncultured planctomycete 8FN in uncultured planctomycete 8FN 4e-20 (planctomycetacia)

ABX10648.1 uncultured planctomycete 3FN in uncultured planctomycete 3FN 4e-14 (planctomycetacia)


Out group: verrucomicrobia, flavobacteria, sphingobacteria

ZP_02925393.1 Verrucomicrobium spinosum out Verrucomicrobium spinosum DSM 4136 9e-97 (Verrucomicrobiae)

ZP_01107554.1 Flavobacteriales bacterium out Flavobacteriales bacterium HTCC2170 6e-96 (Flavobacteria)

ZP_04781244.1 Sphingobacterium spiritivorum 1 out Sphingobacterium spiritivorum ATCC 33861 1e-95 (Sphingobacteria)

ZP_03969507.1 Sphingobacterium spiritivorum 2 out Sphingobacterium spiritivorum ATCC 33300 2e-95 (Sphingobacteria)

YP_003390217.1 Spirosoma linguale out Spirosoma linguale DSM 74 2e-94 (Sphingobacteria)
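
The spread of E-values inside each group can be checked directly from the two lists above. The following is a minimal Python sketch and not part of the original protocol: the `ENTRIES` table is transcribed by hand from those lists, and the function names `evalue_range` and `spread_in_orders_of_magnitude` are illustrative choices.

```python
from math import log10

# (accession, group, E-value, class), transcribed from the in/out group lists.
ENTRIES = [
    ("ZP_01857254.1", "in", 2e-103, "planctomycetacia"),
    ("ZP_04427336.1", "in", 5e-95, "planctomycetacia"),
    ("ZP_01092432.1", "in", 4e-95, "planctomycetacia"),
    ("NP_870347.1", "in", 8e-95, "planctomycetacia"),
    ("YP_003373058.1", "in", 1e-94, "planctomycetacia"),
    ("ZP_02736239.1", "in", 2e-92, "planctomycetacia"),
    ("ABX10661.1", "in", 5e-68, "planctomycetacia"),
    ("ABX10571.1", "in", 9e-23, "planctomycetacia"),
    ("ABX10686.1", "in", 4e-20, "planctomycetacia"),
    ("ABX10648.1", "in", 4e-14, "planctomycetacia"),
    ("ZP_02925393.1", "out", 9e-97, "Verrucomicrobiae"),
    ("ZP_01107554.1", "out", 6e-96, "Flavobacteria"),
    ("ZP_04781244.1", "out", 1e-95, "Sphingobacteria"),
    ("ZP_03969507.1", "out", 2e-95, "Sphingobacteria"),
    ("YP_003390217.1", "out", 2e-94, "Sphingobacteria"),
]


def evalue_range(entries, group):
    """Return (best, worst) E-value for the given group ('in' or 'out')."""
    values = [evalue for _, g, evalue, _ in entries if g == group]
    return min(values), max(values)


def spread_in_orders_of_magnitude(entries, group):
    """Difference between the worst and best E-value on a log10 scale."""
    best, worst = evalue_range(entries, group)
    return log10(worst) - log10(best)


if __name__ == "__main__":
    for group in ("in", "out"):
        best, worst = evalue_range(ENTRIES, group)
        spread = spread_in_orders_of_magnitude(ENTRIES, group)
        print(f"{group}-group: {best:g} to {worst:g} "
              f"(about {spread:.0f} orders of magnitude)")
```

With these values the in-group spans roughly 89 orders of magnitude, because of the four weakly matching uncultured planctomycete sequences, while the out-group spans only about two.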

RAW RESULTS

Lineage Report

Bacteria          [bacteria]
. Planctomycetales  [planctomycetes]
. . Planctomycetaceae [planctomycetes]
. . . Planctomyces      [planctomycetes]
. . . . Planctomyces maris DSM 8797 -------------------------  379 188 hits [planctomycetes]      hypothetical protein PM8797T_08334 [Planctomyces maris DSM 
. . . . Planctomyces limnophilus DSM 3776 ...................  351  50 hits [planctomycetes]      Protein of unknown function (DUF1501) [Planctomyces limnoph
. . . Blastopirellula marina DSM 3645 -----------------------  352  62 hits [planctomycetes]      hypothetical protein DSM3645_27773 [Blastopirellula marina 
. . . Rhodopirellula baltica SH 1 ...........................  350  82 hits [planctomycetes]      hypothetical protein RB12159 [Rhodopirellula baltica SH 1] 
. . . Pirellula staleyi DSM 6068 ............................  350 114 hits [planctomycetes]      protein of unknown function DUF1501 [Pirellula staleyi DSM 
. . . Gemmata obscuriglobus UQM 2246 ........................  343  60 hits [planctomycetes]      hypothetical protein GobsU_30795 [Gemmata obscuriglobus UQM
. . uncultured planctomycete 6FN ----------------------------  261   1 hit  [planctomycetes]      hypothetical secreted protein [uncultured planctomycete 6FN]
. . uncultured planctomycete 5H12 ...........................  111   1 hit  [planctomycetes]      hypothetical protein 5H12_9 [uncultured planctomycete 5H12]
. . uncultured planctomycete 8FN ............................  102   1 hit  [planctomycetes]      hypothetical secreted protein [uncultured planctomycete 8FN]
. . uncultured planctomycete 3FN ............................   82   1 hit  [planctomycetes]      hypothetical secreted protein [uncultured planctomycete 3FN]
. bacterium Ellin514 ----------------------------------------  363  28 hits [verrucomicrobia]     protein of unknown function DUF1501 [bacterium Ellin514] >g
. Verrucomicrobium spinosum DSM 4136 ........................  357  47 hits [verrucomicrobia]     hypothetical protein VspiD_02085 [Verrucomicrobium spinosum
. Flavobacteriales bacterium HTCC2170 .......................  354  12 hits [CFB group bacteria]  hypothetical protein FB2170_08934 [Flavobacteriales bacteri
. Sphingobacterium spiritivorum ATCC 33861 ..................  353   2 hits [CFB group bacteria]  protein of hypothetical function DUF1501 [Sphingobacterium 
. Sphingobacterium spiritivorum ATCC 33300 ..................  352   2 hits [CFB group bacteria]  protein of hypothetical function DUF1501 [Sphingobacterium 
. Candidatus Solibacter usitatus Ellin6076 ..................  351  30 hits [bacteria]            hypothetical protein Acid_6540 [Solibacter usitatus Ellin60
. Spirosoma linguale DSM 74 .................................  349  12 hits [CFB group bacteria]  protein of unknown function DUF1501 [Spirosoma linguale DSM
. Algoriphagus sp. PR1 ......................................  349   6 hits [CFB group bacteria]  hypothetical protein ALPR1_16089 [Algoriphagus sp. PR1] >gi
. Flavobacteria bacterium MS024-2A ..........................  344   4 hits [CFB group bacteria]  protein of unknown function DUF1501 [Flavobacteria bacteriu
. Chthoniobacter flavus Ellin428 ............................  342 100 hits [verrucomicrobia]     protein of unknown function DUF1501 [Chthoniobacter flavus 
. Lentisphaera araneosa HTCC2155 ............................  341  50 hits [bacteria]            hypothetical protein LNTAR_22115 [Lentisphaera araneosa HTC
. Dyadobacter fermentans DSM 18053 ..........................  335  14 hits [CFB group bacteria]  protein of unknown function DUF1501 [Dyadobacter fermentans
. Pedobacter heparinus DSM 2366 .............................  334   2 hits [CFB group bacteria]  protein of unknown function DUF1501 [Pedobacter heparinus D
. uncultured marine bacterium 105 ...........................  316   2 hits [bacteria]            hypothetical protein MBMO_EBAC750-01A01.32 [uncultured mari
. Pseudoalteromonas atlantica T6c ...........................  309   2 hits [g-proteobacteria]    twin-arginine translocation pathway signal [Pseudoalteromon
. uncultured Poribacteria bacterium 64K2 ....................  278   1 hit  [bacteria]            hypothetical protein [uncultured Poribacteria bacterium 64K
. Methylocella silvestris BL2 ...............................  249   2 hits [a-proteobacteria]    protein of unknown function DUF1501 [Methylocella silvestri
. Polaromonas naphthalenivorans CJ2 .........................  244   2 hits [b-proteobacteria]    hypothetical protein Pnap_0574 [Polaromonas naphthalenivora
. Azoarcus sp. BH72 .........................................  233   2 hits [b-proteobacteria]    hypothetical protein azo2345 [Azoarcus sp. BH72] >gi|119671
. Plesiocystis pacifica SIR-1 ...............................  122   4 hits [d-proteobacteria]    hypothetical protein PPSIR1_37894 [Plesiocystis pacifica SI
. Prosthecobacter vanneervenii ..............................  112   1 hit  [verrucomicrobia]     hypothetical protein [Prosthecobacter vanneervenii]
. Leptospira biflexa serovar Patoc strain 'Patoc 1 (Paris)' .   90   2 hits [spirochetes]         hypothetical protein LEPBI_I0161 [Leptospira biflexa serova
. Leptospira biflexa serovar Patoc strain 'Patoc 1 (Ames)' ..   90   2 hits [spirochetes]         hypothetical protein LEPBI_I0161 [Leptospira biflexa serova
. Reinekea blandensis MED297 ................................   51   2 hits [g-proteobacteria]    hypothetical protein MED297_16923 [Reinekea sp. MED297] >gi
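
To compare taxonomic groups when picking in-group and out-group candidates, the organism lines of the lineage report above can be read into a small table. This is a hedged sketch only: the regular expression is an assumption based on the layout shown here (one ". " per level, a dot or dash filler, best score, hit count, bracketed group label), and `parse_leaf` and `best_score_per_group` are hypothetical helper names.

```python
import re

# Matches organism lines of the lineage report, for example:
# ". . uncultured planctomycete 6FN -----  261   1 hit  [planctomycetes]  ..."
LEAF = re.compile(
    r"^(?P<indent>(?:\. )*)"      # one ". " per taxonomic level
    r"(?P<name>.+?) [.\-]+\s+"    # organism name, then dot or dash filler
    r"(?P<score>\d+)\s+"          # best bit score for that organism
    r"(?P<hits>\d+) hits?\s+"     # number of hits ("hit" or "hits")
    r"\[(?P<group>[^\]]+)\]"      # taxonomic group label in brackets
)


def parse_leaf(line):
    """Return a dict for an organism line, or None for any other line."""
    match = LEAF.match(line)
    if match is None:
        return None
    return {
        "depth": len(match.group("indent")) // 2,
        "organism": match.group("name"),
        "score": int(match.group("score")),
        "hits": int(match.group("hits")),
        "group": match.group("group"),
    }


def best_score_per_group(lines):
    """Map each group label to the highest bit score seen in that group."""
    best = {}
    for line in lines:
        leaf = parse_leaf(line)
        if leaf is not None and leaf["score"] > best.get(leaf["group"], 0):
            best[leaf["group"]] = leaf["score"]
    return best


if __name__ == "__main__":
    # Three lines copied from the report above (descriptions shortened).
    sample = [
        ". . . . Planctomyces maris DSM 8797 -----  379 188 hits [planctomycetes]      hypothetical protein",
        ". bacterium Ellin514 -----  363  28 hits [verrucomicrobia]     protein of unknown function DUF1501",
        ". Flavobacteriales bacterium HTCC2170 .....  354  12 hits [CFB group bacteria]  hypothetical protein",
    ]
    for group, score in best_score_per_group(sample).items():
        print(group, score)
```

Internal nodes of the report (such as the Planctomycetaceae line) carry no score and are skipped by `parse_leaf`.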

BLAST

PROTOCOL


a) BLASTp vs NR, NCBI default parameters with "Max target sequences" set to 1000

b) BLASTp vs Swiss-Prot, NCBI default parameters with "Max target sequences" set to 1000


RESULTS ANALYSIS


A large number of hits have very small E-values (E-value << 1e-6); almost all of them are hypothetical proteins or proteins of unknown function.

The NR and Swiss-Prot results do not agree: the Swiss-Prot E-values are all above 0.1 and therefore not significant.

Because the Swiss-Prot database is much more limited, the data used are from the BLASTp vs NR results.
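
The significance check described above can be reproduced on the raw hit table with a few lines of Python. This is a minimal sketch under stated assumptions: the line layout (identifier, truncated description, bit score, E-value) is inferred from the BLASTp vs NR listing in this page, the 1e-6 cutoff is the one quoted in the analysis, and `parse_hit`, `significant_hits` and `is_uncharacterized` are hypothetical helper names.

```python
import re

# One hit line: identifier, description, bit score, E-value.
HIT = re.compile(
    r"^(?P<id>\S+)\s+"               # e.g. ref|ZP_01857254.1|
    r"(?P<desc>.*?)\s+"              # (truncated) description
    r"(?P<bits>\d+(?:\.\d+)?)\s+"    # bit score
    r"(?P<evalue>\S+)\s*$"           # E-value, e.g. 2e-103
)

# Three lines copied from the BLASTp vs NR table in this page.
SAMPLE = [
    "ref|ZP_01857254.1|  hypothetical protein PM8797T_08334 [Planct...   379    2e-103",
    "ref|NP_870138.1|  sulfatase [Rhodopirellula baltica SH 1] >emb...   344    7e-93",
    "ref|ZP_04426448.1|  uncharacterized conserved protein [Plancto...   149    3e-34",
]


def parse_hit(line):
    """Return (id, description, bits, evalue), or None for non-hit lines."""
    match = HIT.match(line)
    if match is None:
        return None
    try:
        evalue = float(match.group("evalue"))
    except ValueError:
        return None
    return (match.group("id"), match.group("desc"),
            float(match.group("bits")), evalue)


def significant_hits(lines, max_evalue=1e-6):
    """Keep only the parsed hits whose E-value is at or below the cutoff."""
    hits = (parse_hit(line) for line in lines)
    return [hit for hit in hits if hit is not None and hit[3] <= max_evalue]


def is_uncharacterized(description):
    """True when the description gives no known function."""
    text = description.lower()
    return any(key in text for key in
               ("hypothetical", "unknown function", "uncharacterized"))


if __name__ == "__main__":
    kept = significant_hits(SAMPLE)
    unknown = [hit for hit in kept if is_uncharacterized(hit[1])]
    print(f"{len(kept)} significant hits, {len(unknown)} without known function")
```

Header lines such as "Sequences producing significant alignments" do not match the pattern and are skipped, so the whole raw block can be passed in unchanged.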



RAW RESULTS

a) BLASTp vs NR

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

ref|ZP_01857254.1|  hypothetical protein PM8797T_08334 [Planct...   379    2e-103
ref|ZP_03630904.1|  protein of unknown function DUF1501 [bacte...   363    2e-98 
ref|ZP_03630017.1|  protein of unknown function DUF1501 [bacte...   359    2e-97 
ref|ZP_02925393.1|  hypothetical protein VspiD_02085 [Verrucom...   357    8e-97 
ref|ZP_01856761.1|  hypothetical protein PM8797T_30671 [Planct...   357    1e-96 
ref|ZP_02931028.1|  hypothetical protein VspiD_30335 [Verrucom...   355    4e-96 
ref|ZP_01107554.1|  hypothetical protein FB2170_08934 [Flavoba...   354    6e-96 
ref|ZP_03629479.1|  protein of unknown function DUF1501 [bacte...   354    7e-96 
ref|ZP_04781244.1|  protein of hypothetical function DUF1501 [...   353    1e-95 
ref|ZP_01855397.1|  hypothetical protein PM8797T_15066 [Planct...   353    2e-95 
ref|ZP_03969507.1|  protein of hypothetical function DUF1501 [...   352    2e-95 
ref|ZP_01092432.1|  hypothetical protein DSM3645_27773 [Blasto...   352    4e-95 
ref|ZP_04427336.1|  Protein of unknown function (DUF1501) [Pla...   351    5e-95 
ref|YP_827747.1|  hypothetical protein Acid_6540 [Solibacter u...   351    5e-95 
ref|NP_870347.1|  hypothetical protein RB12159 [Rhodopirellula...   350    8e-95 
ref|YP_003373058.1|  protein of unknown function DUF1501 [Pire...   350    1e-94 
ref|YP_003390217.1|  protein of unknown function DUF1501 [Spir...   349    2e-94 
ref|ZP_01720756.1|  hypothetical protein ALPR1_16089 [Algoriph...   349    2e-94 
ref|NP_870138.1|  sulfatase [Rhodopirellula baltica SH 1] >emb...   344    7e-93 
ref|ZP_03702867.1|  protein of unknown function DUF1501 [Flavo...   344    7e-93 
ref|YP_825205.1|  hypothetical protein Acid_3953 [Solibacter u...   344    8e-93 
ref|YP_003385895.1|  protein of unknown function DUF1501 [Spir...   343    1e-92 
ref|NP_863903.1|  hypothetical protein RB430 [Rhodopirellula b...   343    2e-92 
ref|ZP_02736239.1|  hypothetical protein GobsU_30795 [Gemmata ...   343    2e-92 
ref|NP_865100.1|  sulfatase [Rhodopirellula baltica SH 1] >emb...   342    2e-92 
ref|ZP_03128999.1|  protein of unknown function DUF1501 [Chtho...   342    3e-92 
ref|NP_868132.1|  signal peptide [Rhodopirellula baltica SH 1]...   342    4e-92 
ref|ZP_01092833.1|  hypothetical protein DSM3645_06756 [Blasto...   342    4e-92 
ref|ZP_01877013.1|  hypothetical protein LNTAR_22115 [Lentisph...   341    7e-92 
ref|ZP_03129150.1|  protein of unknown function DUF1501 [Chtho...   341    7e-92 
ref|NP_867708.1|  hypothetical protein RB7228 [Rhodopirellula ...   340    1e-91 
ref|ZP_01092944.1|  hypothetical protein DSM3645_07311 [Blasto...   340    1e-91 
ref|YP_003370758.1|  protein of unknown function DUF1501 [Pire...   340    1e-91 
ref|ZP_01091001.1|  hypothetical protein DSM3645_11502 [Blasto...   339    2e-91 
ref|NP_869220.1|  hypothetical protein RB10053 [Rhodopirellula...   339    2e-91 
ref|ZP_01853690.1|  hypothetical protein PM8797T_25346 [Planct...   339    2e-91 
ref|ZP_03703410.1|  protein of unknown function DUF1501 [Flavo...   339    3e-91 
ref|YP_822258.1|  hypothetical protein Acid_0975 [Solibacter u...   338    5e-91 
ref|ZP_02930892.1|  hypothetical protein VspiD_29645 [Verrucom...   336    2e-90 
ref|ZP_01107522.1|  hypothetical protein FB2170_08774 [Flavoba...   336    2e-90 
ref|ZP_02925725.1|  hypothetical protein VspiD_03765 [Verrucom...   336    2e-90 
ref|ZP_02927300.1|  hypothetical protein VspiD_11660 [Verrucom...   335    3e-90 
ref|ZP_03631679.1|  protein of unknown function DUF1501 [bacte...   335    3e-90 
ref|ZP_03131590.1|  protein of unknown function DUF1501 [Chtho...   335    3e-90 
ref|ZP_01854425.1|  hypothetical protein PM8797T_24351 [Planct...   335    4e-90 
ref|YP_003086002.1|  protein of unknown function DUF1501 [Dyad...   335    5e-90 
ref|ZP_03129528.1|  protein of unknown function DUF1501 [Chtho...   335    5e-90 
ref|YP_003092977.1|  protein of unknown function DUF1501 [Pedo...   334    6e-90 
ref|ZP_01718231.1|  hypothetical protein ALPR1_13615 [Algoriph...   334    6e-90 
ref|YP_003084767.1|  protein of unknown function DUF1501 [Dyad...   334    9e-90 
ref|YP_827500.1|  hypothetical protein Acid_6289 [Solibacter u...   333    1e-89 
ref|YP_003385810.1|  protein of unknown function DUF1501 [Spir...   332    4e-89 
ref|NP_868363.1|  hypothetical protein RB8440 [Rhodopirellula ...   332    4e-89 
ref|ZP_01108005.1|  hypothetical protein FB2170_01452 [Flavoba...   332    4e-89 
ref|ZP_04430100.1|  Protein of unknown function (DUF1501) [Pla...   332    4e-89 
ref|ZP_01854820.1|  hypothetical protein PM8797T_29758 [Planct...   331    6e-89 
ref|YP_003370766.1|  protein of unknown function DUF1501 [Pire...   330    1e-88 
ref|YP_003086365.1|  protein of unknown function DUF1501 [Dyad...   330    2e-88 
ref|NP_868678.1|  sulfatase [Rhodopirellula baltica SH 1] >emb...   329    3e-88 
ref|ZP_02734530.1|  hypothetical protein GobsU_22177 [Gemmata ...   329    3e-88 
ref|ZP_01873115.1|  hypothetical protein LNTAR_22974 [Lentisph...   329    3e-88 
ref|ZP_02928256.1|  hypothetical protein VspiD_16440 [Verrucom...   329    3e-88 
ref|YP_003086590.1|  protein of unknown function DUF1501 [Dyad...   328    4e-88 
ref|ZP_01091675.1|  hypothetical protein DSM3645_24600 [Blasto...   328    4e-88 
ref|ZP_01876657.1|  Twin-arginine translocation pathway signal...   328    5e-88 
ref|ZP_03127031.1|  protein of unknown function DUF1501 [Chtho...   328    5e-88 
ref|ZP_03127852.1|  protein of unknown function DUF1501 [Chtho...   328    6e-88 
ref|NP_869125.1|  sulfatase [Rhodopirellula baltica SH 1] >emb...   328    7e-88 
ref|ZP_03629928.1|  protein of unknown function DUF1501 [bacte...   326    2e-87 
ref|ZP_01855834.1|  hypothetical protein PM8797T_17227 [Planct...   326    2e-87 
ref|ZP_01873250.1|  hypothetical protein LNTAR_13597 [Lentisph...   326    2e-87 
ref|NP_865098.1|  hypothetical protein RB2690 [Rhodopirellula ...   326    2e-87 
ref|YP_003388871.1|  protein of unknown function DUF1501 [Spir...   325    4e-87 
ref|ZP_04428956.1|  Protein of unknown function (DUF1501) [Pla...   325    6e-87 
ref|ZP_03127351.1|  protein of unknown function DUF1501 [Chtho...   325    6e-87 
ref|ZP_02736676.1|  hypothetical protein GobsU_32994 [Gemmata ...   323    1e-86 
ref|YP_003088071.1|  protein of unknown function DUF1501 [Dyad...   323    1e-86 
ref|YP_003085746.1|  protein of unknown function DUF1501 [Dyad...   323    2e-86 
ref|ZP_03127819.1|  protein of unknown function DUF1501 [Chtho...   322    3e-86 
ref|ZP_01855363.1|  hypothetical protein PM8797T_14896 [Planct...   322    3e-86 
ref|ZP_04427017.1|  Protein of unknown function (DUF1501) [Pla...   322    3e-86 
ref|ZP_01872687.1|  hypothetical protein LNTAR_17098 [Lentisph...   322    4e-86 
ref|YP_003390694.1|  protein of unknown function DUF1501 [Spir...   322    4e-86 
ref|NP_863780.1|  hypothetical protein RB218 [Rhodopirellula b...   321    7e-86 
ref|ZP_02926849.1|  hypothetical protein VspiD_09395 [Verrucom...   320    9e-86 
ref|YP_003373035.1|  protein of unknown function DUF1501 [Pire...   320    1e-85 
ref|ZP_01855318.1|  hypothetical protein PM8797T_14671 [Planct...   320    2e-85 
ref|ZP_03127353.1|  protein of unknown function DUF1501 [Chtho...   320    2e-85 
ref|ZP_02927331.1|  hypothetical protein VspiD_11815 [Verrucom...   319    2e-85 
ref|ZP_03631677.1|  protein of unknown function DUF1501 [bacte...   319    2e-85 
ref|ZP_02930257.1|  hypothetical protein VspiD_26460 [Verrucom...   319    3e-85 
ref|ZP_02928258.1|  hypothetical protein VspiD_16450 [Verrucom...   318    3e-85 
ref|ZP_03626973.1|  protein of unknown function DUF1501 [bacte...   317    1e-84 
ref|ZP_01873584.1|  hypothetical protein LNTAR_08569 [Lentisph...   317    2e-84 
ref|ZP_01875693.1|  hypothetical protein LNTAR_19040 [Lentisph...   316    2e-84 
gb|AAR37458.1|  hypothetical protein MBMO_EBAC750-01A01.32 [un...   316    3e-84 
ref|NP_870329.1|  hypothetical protein RB12118 [Rhodopirellula...   315    3e-84 
ref|ZP_01872643.1|  hypothetical protein LNTAR_16878 [Lentisph...   315    3e-84 
ref|ZP_04428645.1|  Protein of unknown function (DUF1501) [Pla...   315    4e-84 
ref|ZP_01854205.1|  hypothetical protein PM8797T_16178 [Planct...   315    4e-84 
ref|YP_003369338.1|  protein of unknown function DUF1501 [Pire...   313    1e-83 
ref|ZP_02732205.1|  hypothetical protein GobsU_10408 [Gemmata ...   313    1e-83 
ref|ZP_01872684.1|  hypothetical protein LNTAR_17083 [Lentisph...   313    2e-83 
ref|ZP_01089232.1|  hypothetical protein DSM3645_01495 [Blasto...   313    2e-83 
ref|YP_003371883.1|  protein of unknown function DUF1501 [Pire...   311    5e-83 
ref|ZP_01105813.1|  hypothetical protein FB2170_06405 [Flavoba...   311    6e-83 
ref|ZP_03130924.1|  protein of unknown function DUF1501 [Chtho...   310    2e-82 
ref|YP_660396.1|  twin-arginine translocation pathway signal [...   309    3e-82 
ref|ZP_01876017.1|  hypothetical protein LNTAR_19967 [Lentisph...   308    6e-82 
ref|ZP_01872682.1|  hypothetical protein LNTAR_17073 [Lentisph...   308    6e-82 
ref|ZP_01854951.1|  hypothetical protein PM8797T_23604 [Planct...   307    1e-81 
ref|ZP_01876721.1|  hypothetical protein LNTAR_25075 [Lentisph...   306    1e-81 
ref|ZP_01874628.1|  hypothetical protein LNTAR_20133 [Lentisph...   306    2e-81 
ref|ZP_02731514.1|  hypothetical protein GobsU_06925 [Gemmata ...   306    2e-81 
ref|ZP_02737754.1|  hypothetical protein GobsU_38453 [Gemmata ...   303    1e-80 
ref|ZP_02928306.1|  hypothetical protein VspiD_16690 [Verrucom...   303    2e-80 
ref|ZP_03129269.1|  protein of unknown function DUF1501 [Chtho...   301    4e-80 
ref|YP_003371611.1|  protein of unknown function DUF1501 [Pire...   300    1e-79 
ref|YP_003371257.1|  protein of unknown function DUF1501 [Pire...   299    2e-79 
ref|NP_864042.1|  putative related to sulfatase [Rhodopirellul...   298    4e-79 
ref|ZP_03128596.1|  protein of unknown function DUF1501 [Chtho...   296    2e-78 
ref|ZP_02927371.1|  Twin-arginine translocation pathway signal...   296    3e-78 
ref|YP_827755.1|  hypothetical protein Acid_6548 [Solibacter u...   295    3e-78 
ref|ZP_02925122.1|  hypothetical protein VspiD_00730 [Verrucom...   295    3e-78 
ref|NP_870880.1|  hypothetical protein RB13126 [Rhodopirellula...   294    7e-78 
ref|ZP_02926662.1|  hypothetical protein VspiD_08460 [Verrucom...   292    2e-77 
ref|ZP_02927946.1|  hypothetical protein VspiD_14890 [Verrucom...   292    3e-77 
ref|YP_003371260.1|  protein of unknown function DUF1501 [Pire...   291    7e-77 
ref|ZP_02730408.1|  hypothetical protein GobsU_01317 [Gemmata ...   290    1e-76 
ref|YP_003371874.1|  protein of unknown function DUF1501 [Pire...   290    2e-76 
ref|ZP_01853875.1|  hypothetical protein PM8797T_20678 [Planct...   290    2e-76 
ref|ZP_02927906.1|  hypothetical protein VspiD_14690 [Verrucom...   289    2e-76 
ref|ZP_01851689.1|  hypothetical protein PM8797T_27749 [Planct...   289    3e-76 
ref|ZP_01853395.1|  hypothetical protein PM8797T_26580 [Planct...   288    5e-76 
ref|ZP_02732624.1|  hypothetical protein GobsU_12525 [Gemmata ...   287    1e-75 
ref|ZP_01857105.1|  hypothetical protein PM8797T_00639 [Planct...   285    3e-75 
ref|ZP_01089145.1|  hypothetical protein DSM3645_01060 [Blasto...   285    4e-75 
ref|ZP_01854483.1|  hypothetical protein PM8797T_24641 [Planct...   284    8e-75 
ref|ZP_01855816.1|  hypothetical protein PM8797T_17137 [Planct...   283    1e-74 
ref|ZP_01855716.1|  hypothetical protein PM8797T_26915 [Planct...   283    1e-74 
ref|ZP_02928847.1|  hypothetical protein VspiD_19395 [Verrucom...   283    1e-74 
ref|ZP_01875778.1|  hypothetical protein LNTAR_02197 [Lentisph...   283    1e-74 
ref|ZP_01093323.1|  hypothetical protein DSM3645_16265 [Blasto...   283    2e-74 
ref|ZP_03131336.1|  protein of unknown function DUF1501 [Chtho...   281    9e-74 
ref|NP_867041.1|  sulfatase [Rhodopirellula baltica SH 1] >emb...   279    3e-73 
gb|AAW84308.1|  hypothetical protein [uncultured Poribacteria ...   278    4e-73 
ref|ZP_01093590.1|  hypothetical protein DSM3645_25542 [Blasto...   278    8e-73 
ref|NP_865869.1|  hypothetical protein RB4032 [Rhodopirellula ...   276    2e-72 
ref|ZP_01093903.1|  hypothetical protein DSM3645_04440 [Blasto...   276    3e-72 
ref|ZP_01854150.1|  hypothetical protein PM8797T_15903 [Planct...   275    4e-72 
ref|ZP_02735781.1|  hypothetical protein GobsU_28485 [Gemmata ...   275    5e-72 
ref|NP_870885.1|  hypothetical protein RB13132 [Rhodopirellula...   274    1e-71 
ref|YP_822425.1|  hypothetical protein Acid_1146 [Solibacter u...   273    1e-71 
ref|ZP_01852047.1|  hypothetical protein PM8797T_21773 [Planct...   272    4e-71 
ref|NP_865177.1|  sulfatase [Rhodopirellula baltica SH 1] >emb...   271    5e-71 
ref|NP_864781.1|  hypothetical protein RB2170 [Rhodopirellula ...   271    7e-71 
ref|YP_003371019.1|  protein of unknown function DUF1501 [Pire...   271    9e-71 
ref|ZP_04427515.1|  Protein of unknown function (DUF1501) [Pla...   270    1e-70 
ref|ZP_04425483.1|  uncharacterized conserved protein [Plancto...   270    1e-70 
ref|ZP_02925656.1|  hypothetical protein VspiD_03420 [Verrucom...   270    2e-70 
ref|YP_003372017.1|  protein of unknown function DUF1501 [Pire...   270    2e-70 
ref|YP_003368788.1|  protein of unknown function DUF1501 [Pire...   269    2e-70 
ref|ZP_02735194.1|  hypothetical protein GobsU_25531 [Gemmata ...   269    2e-70 
ref|YP_003371857.1|  protein of unknown function DUF1501 [Pire...   269    3e-70 
ref|NP_869883.1|  putative related to sulfatase [Rhodopirellul...   269    4e-70 
ref|ZP_02735517.1|  hypothetical protein GobsU_27161 [Gemmata ...   268    6e-70 
ref|ZP_03129195.1|  protein of unknown function DUF1501 [Chtho...   268    8e-70 
ref|ZP_01852440.1|  hypothetical protein PM8797T_05220 [Planct...   267    1e-69 
ref|ZP_03128757.1|  protein of unknown function DUF1501 [Chtho...   266    2e-69 
ref|ZP_03129642.1|  protein of unknown function DUF1501 [Chtho...   266    2e-69 
ref|ZP_01852944.1|  hypothetical protein PM8797T_03184 [Planct...   266    3e-69 
gb|AAR37452.1|  hypothetical protein MBMO_EBAC750-01A01.26 [un...   265    4e-69 
ref|ZP_02925201.1|  hypothetical protein VspiD_01125 [Verrucom...   263    1e-68 
ref|ZP_03128780.1|  protein of unknown function DUF1501 [Chtho...   263    1e-68 
ref|YP_003372660.1|  protein of unknown function DUF1501 [Pire...   263    1e-68 
ref|ZP_04429791.1|  Protein of unknown function (DUF1501) [Pla...   263    2e-68 
ref|NP_870319.1|  hypothetical protein RB12102 [Rhodopirellula...   263    2e-68 
ref|YP_003372561.1|  protein of unknown function DUF1501 [Pire...   262    3e-68 
gb|ABX10661.1|  hypothetical secreted protein [uncultured plan...   261    5e-68 
ref|ZP_02735556.1|  hypothetical protein GobsU_27356 [Gemmata ...   261    6e-68 
ref|ZP_01854808.1|  hypothetical protein PM8797T_29698 [Planct...   261    8e-68 
ref|NP_866636.1|  hypothetical protein RB5328 [Rhodopirellula ...   260    1e-67 
ref|NP_863970.1|  hypothetical protein RB546 [Rhodopirellula b...   260    1e-67 
ref|ZP_01091735.1|  hypothetical protein DSM3645_02278 [Blasto...   260    2e-67 
ref|ZP_02925399.1|  hypothetical protein VspiD_02115 [Verrucom...   259    2e-67 
ref|ZP_01855532.1|  hypothetical protein PM8797T_06877 [Planct...   259    3e-67 
ref|ZP_02731539.1|  hypothetical protein GobsU_07052 [Gemmata ...   259    3e-67 
ref|ZP_01856103.1|  hypothetical protein PM8797T_15471 [Planct...   259    3e-67 
ref|ZP_01090775.1|  hypothetical protein DSM3645_14785 [Blasto...   259    3e-67 
ref|ZP_01092737.1|  hypothetical protein DSM3645_26474 [Blasto...   258    6e-67 
ref|ZP_02928216.1|  hypothetical protein VspiD_16240 [Verrucom...   257    1e-66 
ref|ZP_02930014.1|  hypothetical protein VspiD_25235 [Verrucom...   257    1e-66 
ref|ZP_03127886.1|  protein of unknown function DUF1501 [Chtho...   255    3e-66 
ref|NP_870582.1|  sulfatase [Rhodopirellula baltica SH 1] >emb...   255    5e-66 
ref|ZP_02927884.1|  hypothetical protein VspiD_14580 [Verrucom...   255    5e-66 
ref|ZP_01088859.1|  hypothetical protein DSM3645_10272 [Blasto...   254    6e-66 
ref|YP_003372015.1|  protein of unknown function DUF1501 [Pire...   254    6e-66 
ref|ZP_01852773.1|  hypothetical protein PM8797T_13188 [Planct...   254    9e-66 
ref|ZP_01854164.1|  hypothetical protein PM8797T_15973 [Planct...   254    1e-65 
ref|ZP_03129879.1|  protein of unknown function DUF1501 [Chtho...   253    2e-65 
ref|YP_003371626.1|  protein of unknown function DUF1501 [Pire...   253    2e-65 
ref|ZP_01852438.1|  hypothetical protein PM8797T_05210 [Planct...   253    2e-65 
ref|ZP_01089809.1|  hypothetical protein DSM3645_29127 [Blasto...   253    3e-65 
ref|ZP_01874254.1|  hypothetical protein LNTAR_12371 [Lentisph...   252    3e-65 
ref|ZP_03626735.1|  protein of unknown function DUF1501 [bacte...   252    4e-65 
ref|NP_866691.1|  hypothetical protein RB5428 [Rhodopirellula ...   252    4e-65 
ref|ZP_02734956.1|  hypothetical protein GobsU_24341 [Gemmata ...   251    8e-65 
ref|YP_003368869.1|  protein of unknown function DUF1501 [Pire...   251    8e-65 
ref|ZP_02736877.1|  hypothetical protein GobsU_34005 [Gemmata ...   251    9e-65 
ref|YP_002363403.1|  protein of unknown function DUF1501 [Meth...   249    2e-64 
ref|ZP_04426341.1|  uncharacterized conserved protein [Plancto...   249    2e-64 
ref|ZP_02737271.1|  hypothetical protein GobsU_36010 [Gemmata ...   249    3e-64 
ref|NP_867331.1|  sulfatase [Rhodopirellula baltica SH 1] >emb...   249    3e-64 
ref|ZP_01856201.1|  hypothetical protein PM8797T_00292 [Planct...   249    3e-64 
ref|ZP_01853937.1|  hypothetical protein PM8797T_20988 [Planct...   248    4e-64 
ref|ZP_03128760.1|  protein of unknown function DUF1501 [Chtho...   248    7e-64 
ref|ZP_02732916.1|  hypothetical protein GobsU_14037 [Gemmata ...   248    7e-64 
ref|ZP_01873072.1|  hypothetical protein LNTAR_22759 [Lentisph...   248    7e-64 
ref|NP_866326.1|  hypothetical protein RB4805 [Rhodopirellula ...   247    9e-64 
ref|ZP_01874017.1|  hypothetical protein LNTAR_11176 [Lentisph...   247    1e-63 
ref|YP_003371968.1|  protein of unknown function DUF1501 [Pire...   247    2e-63 
ref|ZP_03133438.1|  protein of unknown function DUF1501 [Chtho...   246    2e-63 
ref|ZP_02929458.1|  hypothetical protein VspiD_22455 [Verrucom...   246    2e-63 
ref|ZP_01088614.1|  hypothetical protein DSM3645_09047 [Blasto...   245    6e-63 
ref|ZP_01851695.1|  hypothetical protein PM8797T_27779 [Planct...   245    6e-63 
ref|ZP_03129093.1|  protein of unknown function DUF1501 [Chtho...   244    7e-63 
ref|ZP_01877392.1|  hypothetical protein LNTAR_07991 [Lentisph...   244    8e-63 
ref|ZP_01090268.1|  hypothetical protein DSM3645_21052 [Blasto...   244    8e-63 
ref|ZP_01852958.1|  hypothetical protein PM8797T_03254 [Planct...   244    9e-63 
ref|YP_980815.1|  hypothetical protein Pnap_0574 [Polaromonas ...   244    1e-62 
ref|ZP_02929363.1|  hypothetical protein VspiD_21980 [Verrucom...   244    1e-62 
ref|ZP_01855129.1|  hypothetical protein PM8797T_30067 [Planct...   244    1e-62 
ref|ZP_02733064.1|  hypothetical protein GobsU_14779 [Gemmata ...   244    1e-62 
ref|ZP_03626730.1|  protein of unknown function DUF1501 [bacte...   244    1e-62 
ref|ZP_01105430.1|  hypothetical protein FB2170_04490 [Flavoba...   243    2e-62 
ref|ZP_02930276.1|  hypothetical protein VspiD_26555 [Verrucom...   243    2e-62 
ref|ZP_01874793.1|  hypothetical protein LNTAR_20958 [Lentisph...   242    3e-62 
ref|ZP_02928559.1|  hypothetical protein VspiD_17955 [Verrucom...   242    4e-62 
ref|ZP_02732541.1|  hypothetical protein GobsU_12100 [Gemmata ...   242    4e-62 
ref|YP_825310.1|  hypothetical protein Acid_4060 [Solibacter u...   241    5e-62 
ref|ZP_04428710.1|  Protein of unknown function (DUF1501) [Pla...   241    6e-62 
ref|YP_827398.1|  hypothetical protein Acid_6187 [Solibacter u...   241    7e-62 
ref|ZP_01857519.1|  hypothetical protein PM8797T_14559 [Planct...   241    9e-62 
ref|ZP_04428544.1|  Protein of unknown function (DUF1501) [Pla...   238    6e-61 
ref|ZP_02930269.1|  hypothetical protein VspiD_26520 [Verrucom...   237    1e-60 
ref|ZP_03630501.1|  protein of unknown function DUF1501 [bacte...   237    1e-60 
ref|ZP_04428235.1|  uncharacterized conserved protein [Plancto...   237    1e-60 
ref|ZP_03129809.1|  protein of unknown function DUF1501 [Chtho...   236    2e-60 
ref|ZP_02737208.1|  hypothetical protein GobsU_35693 [Gemmata ...   236    3e-60 
ref|ZP_01854261.1|  hypothetical protein PM8797T_16458 [Planct...   235    4e-60 
ref|YP_003368739.1|  protein of unknown function DUF1501 [Pire...   235    4e-60 
ref|ZP_03129285.1|  protein of unknown function DUF1501 [Chtho...   235    4e-60 
ref|ZP_02927776.1|  hypothetical protein VspiD_14040 [Verrucom...   235    4e-60 
ref|YP_933849.1|  hypothetical protein azo2345 [Azoarcus sp. B...   233    2e-59 
ref|ZP_02730646.1|  hypothetical protein GobsU_02543 [Gemmata ...   233    2e-59 
ref|ZP_01857965.1|  hypothetical protein PM8797T_02454 [Planct...   233    2e-59 
ref|ZP_03130363.1|  protein of unknown function DUF1501 [Chtho...   233    2e-59 
ref|ZP_02736729.1|  hypothetical protein GobsU_33259 [Gemmata ...   233    3e-59 
ref|ZP_03131080.1|  protein of unknown function DUF1501 [Chtho...   232    3e-59 
ref|ZP_02927341.1|  hypothetical protein VspiD_11865 [Verrucom...   230    2e-58 
ref|ZP_03127946.1|  protein of unknown function DUF1501 [Chtho...   229    2e-58 
ref|YP_822143.1|  hypothetical protein Acid_0859 [Solibacter u...   229    3e-58 
ref|ZP_02734377.1|  hypothetical protein GobsU_21410 [Gemmata ...   227    1e-57 
ref|ZP_01854202.1|  hypothetical protein PM8797T_16163 [Planct...   227    1e-57 
ref|ZP_02925546.1|  hypothetical protein VspiD_02860 [Verrucom...   226    2e-57 
ref|ZP_02734667.1|  hypothetical protein GobsU_22882 [Gemmata ...   226    2e-57 
ref|ZP_01854739.1|  hypothetical protein PM8797T_29353 [Planct...   226    2e-57 
ref|ZP_02931184.1|  hypothetical protein VspiD_31125 [Verrucom...   226    2e-57 
ref|ZP_01855366.1|  hypothetical protein PM8797T_14911 [Planct...   225    4e-57 
ref|ZP_01856741.1|  hypothetical protein PM8797T_14464 [Planct...   223    2e-56 
ref|ZP_03132322.1|  protein of unknown function DUF1501 [Chtho...   223    3e-56 
ref|YP_003370733.1|  protein of unknown function DUF1501 [Pire...   223    3e-56 
ref|ZP_04425975.1|  Protein of unknown function (DUF1501) [Pla...   221    1e-55 
ref|YP_003371853.1|  protein of unknown function DUF1501 [Pire...   220    2e-55 
ref|ZP_03131958.1|  protein of unknown function DUF1501 [Chtho...   220    2e-55 
ref|ZP_02736483.1|  hypothetical protein GobsU_32019 [Gemmata ...   217    1e-54 
ref|ZP_01856699.1|  hypothetical protein PM8797T_14254 [Planct...   216    4e-54 
ref|ZP_01091346.1|  hypothetical protein DSM3645_05725 [Blasto...   215    4e-54 
ref|YP_003372010.1|  protein of unknown function DUF1501 [Pire...   215    5e-54 
ref|YP_003371985.1|  protein of unknown function DUF1501 [Pire...   215    6e-54 
ref|ZP_01856297.1|  hypothetical protein PM8797T_27572 [Planct...   210    1e-52 
ref|ZP_03130358.1|  protein of unknown function DUF1501 [Chtho...   208    6e-52 
ref|NP_867306.1|  hypothetical protein RB6510 [Rhodopirellula ...   207    1e-51 
ref|NP_868014.1|  hypothetical protein RB7760 [Rhodopirellula ...   204    9e-51 
ref|ZP_02925263.1|  hypothetical protein VspiD_01435 [Verrucom...   203    2e-50 
ref|ZP_01853425.1|  hypothetical protein PM8797T_26730 [Planct...   197    9e-49 
ref|ZP_03627200.1|  protein of unknown function DUF1501 [bacte...   184    1e-44 
ref|NP_864132.1|  sulfatase 1 precursor [Rhodopirellula baltic...   184    1e-44 
ref|ZP_03127076.1|  protein of unknown function DUF1501 [Chtho...   176    3e-42 
ref|ZP_01854156.1|  hypothetical protein PM8797T_15933 [Planct...   169    4e-40 
ref|ZP_03127075.1|  protein of unknown function DUF1501 [Chtho...   151    9e-35 
ref|ZP_01852088.1|  hypothetical protein PM8797T_21978 [Planct...   151    9e-35 
ref|ZP_02730713.1|  hypothetical protein GobsU_02880 [Gemmata ...   151    9e-35 
ref|ZP_01876932.1|  hypothetical protein LNTAR_09701 [Lentisph...   150    1e-34 
ref|ZP_04426448.1|  uncharacterized conserved protein [Plancto...   149    3e-34 
ref|ZP_03132144.1|  protein of unknown function DUF1501 [Chtho...   149    4e-34 
ref|ZP_02930322.1|  hypothetical protein VspiD_26785 [Verrucom...   147    1e-33 
ref|ZP_02735961.1|  hypothetical protein GobsU_29391 [Gemmata ...   144    1e-32 
ref|ZP_01854143.1|  hypothetical protein PM8797T_15868 [Planct...   142    5e-32 
ref|YP_003370659.1|  protein of unknown function DUF1501 [Pire...   141    1e-31 
ref|YP_003368871.1|  protein of unknown function DUF1501 [Pire...   140    3e-31 
ref|ZP_01875695.1|  hypothetical protein LNTAR_19050 [Lentisph...   139    4e-31 
ref|NP_867469.1|  hypothetical protein RB6795 [Rhodopirellula ...   139    6e-31 
ref|ZP_02927014.1|  hypothetical protein VspiD_10230 [Verrucom...   137    1e-30 
ref|ZP_02931373.1|  hypothetical protein VspiD_32070 [Verrucom...   137    1e-30 
ref|ZP_01853172.1|  hypothetical protein PM8797T_09744 [Planct...   137    2e-30 
ref|ZP_02737862.1|  hypothetical protein GobsU_38993 [Gemmata ...   137    2e-30 
ref|YP_003370549.1|  protein of unknown function DUF1501 [Pire...   135    4e-30 
ref|ZP_01853065.1|  hypothetical protein PM8797T_09209 [Planct...   135    8e-30 
ref|YP_003370915.1|  protein of unknown function DUF1501 [Pire...   134    9e-30 
ref|ZP_03130881.1|  protein of unknown function DUF1501 [Chtho...   134    1e-29 
ref|ZP_02928050.1|  hypothetical protein VspiD_15410 [Verrucom...   134    2e-29 
ref|ZP_04429041.1|  uncharacterized conserved protein [Plancto...   133    2e-29 
ref|NP_867064.1|  hypothetical protein RB6101 [Rhodopirellula ...   133    3e-29 
ref|ZP_01090316.1|  hypothetical protein DSM3645_21292 [Blasto...   132    5e-29 
ref|YP_003372003.1|  protein of unknown function DUF1501 [Pire...   131    8e-29 
ref|YP_003372790.1|  protein of unknown function DUF1501 [Pire...   131    1e-28 
ref|YP_003369169.1|  protein of unknown function DUF1501 [Pire...   130    1e-28 
ref|ZP_02931018.1|  hypothetical protein VspiD_30285 [Verrucom...   129    3e-28 
ref|ZP_02733656.1|  hypothetical protein GobsU_17775 [Gemmata ...   127    1e-27 
ref|ZP_02931050.1|  hypothetical protein VspiD_30455 [Verrucom...   127    1e-27 
ref|ZP_04428720.1|  uncharacterized conserved protein [Plancto...   127    1e-27 
ref|ZP_01853766.1|  hypothetical protein PM8797T_25726 [Planct...   127    1e-27 
ref|ZP_01094116.1|  hypothetical protein DSM3645_13063 [Blasto...   127    2e-27 
ref|YP_003370624.1|  protein of unknown function DUF1501 [Pire...   126    3e-27 
ref|ZP_02927897.1|  hypothetical protein VspiD_14645 [Verrucom...   126    3e-27 
ref|YP_003371321.1|  protein of unknown function DUF1501 [Pire...   126    4e-27 
ref|ZP_01854591.1|  hypothetical protein PM8797T_03945 [Planct...   126    4e-27 
ref|ZP_02931362.1|  hypothetical protein VspiD_32015 [Verrucom...   125    5e-27 
ref|YP_003369182.1|  protein of unknown function DUF1501 [Pire...   125    6e-27 
ref|ZP_01854481.1|  hypothetical protein PM8797T_24631 [Planct...   125    7e-27 
ref|ZP_02735466.1|  hypothetical protein GobsU_26901 [Gemmata ...   124    9e-27 
ref|ZP_02732388.1|  hypothetical protein GobsU_11325 [Gemmata ...   124    1e-26 
ref|ZP_01851693.1|  hypothetical protein PM8797T_27769 [Planct...   124    2e-26 
ref|ZP_01089299.1|  hypothetical protein DSM3645_16565 [Blasto...   123    2e-26 
ref|NP_865732.1|  hypothetical protein RB3808 [Rhodopirellula ...   123    2e-26 
ref|YP_825016.1|  hypothetical protein Acid_3761 [Solibacter u...   123    3e-26 
ref|ZP_03131234.1|  protein of unknown function DUF1501 [Chtho...   123    3e-26 
ref|ZP_01853208.1|  hypothetical protein PM8797T_09924 [Planct...   122    4e-26 
ref|ZP_02731025.1|  hypothetical protein GobsU_04454 [Gemmata ...   122    4e-26 
ref|ZP_01853747.1|  hypothetical protein PM8797T_25631 [Planct...   122    4e-26 
ref|ZP_02732092.1|  hypothetical protein GobsU_09843 [Gemmata ...   122    4e-26 
ref|YP_823540.1|  hypothetical protein Acid_2265 [Solibacter u...   122    6e-26 
ref|YP_003369379.1|  protein of unknown function DUF1501 [Pire...   122    7e-26 
ref|ZP_01909638.1|  hypothetical protein PPSIR1_37894 [Plesioc...   122    7e-26 
ref|ZP_02737846.1|  hypothetical protein GobsU_38913 [Gemmata ...   121    8e-26 
ref|ZP_01856843.1|  hypothetical protein PM8797T_16655 [Planct...   121    9e-26 
ref|ZP_01854055.1|  hypothetical protein PM8797T_18259 [Planct...   121    9e-26 
ref|YP_003369771.1|  protein of unknown function DUF1501 [Pire...   121    1e-25 
ref|ZP_02737730.1|  hypothetical protein GobsU_38333 [Gemmata ...   120    1e-25 
ref|YP_826281.1|  hypothetical protein Acid_5041 [Solibacter u...   120    2e-25 
ref|ZP_04426027.1|  uncharacterized conserved protein [Plancto...   120    2e-25 
ref|YP_003372266.1|  protein of unknown function DUF1501 [Pire...   120    3e-25 
ref|NP_867558.1|  hypothetical protein RB6969 [Rhodopirellula ...   119    3e-25 
ref|ZP_01856653.1|  hypothetical protein PM8797T_02249 [Planct...   119    3e-25 
ref|YP_003372376.1|  protein of unknown function DUF1501 [Pire...   119    3e-25 
ref|NP_868971.1|  hypothetical protein RB9598 [Rhodopirellula ...   119    4e-25 
ref|YP_003368891.1|  protein of unknown function DUF1501 [Pire...   119    6e-25 
ref|YP_003370478.1|  protein of unknown function DUF1501 [Pire...   117    1e-24 
ref|NP_866971.1|  sulfatase [Rhodopirellula baltica SH 1] >emb...   117    1e-24 
ref|YP_003370138.1|  protein of unknown function DUF1501 [Pire...   117    1e-24 
ref|ZP_03133042.1|  protein of unknown function DUF1501 [Chtho...   117    1e-24 
ref|ZP_02731211.1|  hypothetical protein GobsU_05406 [Gemmata ...   117    1e-24 
ref|ZP_01092914.1|  hypothetical protein DSM3645_07161 [Blasto...   117    1e-24 
ref|ZP_02931140.1|  hypothetical protein VspiD_30905 [Verrucom...   117    2e-24 
ref|ZP_01873963.1|  hypothetical protein LNTAR_10906 [Lentisph...   117    2e-24 
ref|ZP_01855688.1|  hypothetical protein PM8797T_12101 [Planct...   116    3e-24 
ref|ZP_02733228.1|  hypothetical protein GobsU_15603 [Gemmata ...   116    3e-24 
ref|ZP_02732318.1|  hypothetical protein GobsU_10975 [Gemmata ...   116    4e-24 
ref|ZP_03628878.1|  protein of unknown function DUF1501 [bacte...   116    4e-24 
ref|ZP_01856895.1|  hypothetical protein PM8797T_01099 [Planct...   116    4e-24 
ref|YP_003369173.1|  protein of unknown function DUF1501 [Pire...   116    4e-24 
ref|YP_003372640.1|  protein of unknown function DUF1501 [Pire...   115    6e-24 
ref|ZP_03131227.1|  protein of unknown function DUF1501 [Chtho...   114    1e-23 
ref|ZP_01090042.1|  hypothetical protein DSM3645_23471 [Blasto...   114    1e-23 
ref|ZP_02928757.1|  hypothetical protein VspiD_18945 [Verrucom...   114    1e-23 
ref|ZP_02928140.1|  hypothetical protein VspiD_15860 [Verrucom...   114    1e-23 
ref|ZP_03129213.1|  protein of unknown function DUF1501 [Chtho...   113    2e-23 
ref|ZP_02734729.1|  hypothetical protein GobsU_23192 [Gemmata ...   113    2e-23 
ref|ZP_01852710.1|  hypothetical protein PM8797T_12873 [Planct...   112    4e-23 
ref|ZP_02731000.1|  hypothetical protein GobsU_04319 [Gemmata ...   112    4e-23 
ref|YP_003368897.1|  protein of unknown function DUF1501 [Pire...   112    5e-23 
ref|ZP_04428611.1|  uncharacterized conserved protein [Plancto...   112    6e-23 
ref|ZP_02735036.1|  hypothetical protein GobsU_24741 [Gemmata ...   112    6e-23 
ref|ZP_01853565.1|  hypothetical protein PM8797T_11214 [Planct...   112    6e-23 
emb|CAQ51414.1|  hypothetical protein [Prosthecobacter vanneer...   112    7e-23 
ref|ZP_01855710.1|  hypothetical protein PM8797T_26885 [Planct...   112    7e-23 
ref|ZP_01853939.1|  hypothetical protein PM8797T_20998 [Planct...   112    8e-23 
ref|ZP_01855772.1|  hypothetical protein PM8797T_27195 [Planct...   111    8e-23 
gb|ABX10571.1|  hypothetical protein 5H12_9 [uncultured planct...   111    8e-23 
ref|ZP_02735681.1|  hypothetical protein GobsU_27981 [Gemmata ...   111    1e-22 
ref|ZP_01852148.1|  hypothetical protein PM8797T_22278 [Planct...   111    1e-22 
ref|ZP_01857211.1|  hypothetical protein PM8797T_07417 [Planct...   110    2e-22 
ref|ZP_01854565.1|  hypothetical protein PM8797T_03815 [Planct...   110    2e-22 
ref|ZP_03128911.1|  protein of unknown function DUF1501 [Chtho...   110    2e-22 
ref|YP_003369408.1|  protein of unknown function DUF1501 [Pire...   110    3e-22 
ref|ZP_03127901.1|  protein of unknown function DUF1501 [Chtho...   110    3e-22 
ref|YP_826940.1|  hypothetical protein Acid_5708 [Solibacter u...   109    4e-22 
ref|ZP_01854788.1|  hypothetical protein PM8797T_29598 [Planct...   109    4e-22 
ref|YP_003368753.1|  protein of unknown function DUF1501 [Pire...   109    4e-22 
ref|NP_870671.1|  hypothetical protein RB12732 [Rhodopirellula...   109    5e-22 
ref|ZP_01908563.1|  hypothetical protein PPSIR1_33184 [Plesioc...   108    5e-22 
ref|ZP_02925556.1|  hypothetical protein VspiD_02910 [Verrucom...   108    6e-22 
ref|ZP_01093136.1|  hypothetical protein DSM3645_15575 [Blasto...   108    7e-22 
ref|ZP_01092106.1|  hypothetical protein DSM3645_25894 [Blasto...   108    7e-22 
ref|ZP_01092133.1|  hypothetical protein DSM3645_26029 [Blasto...   108    9e-22 
ref|ZP_01093767.1|  hypothetical protein DSM3645_08221 [Blasto...   108    1e-21 
ref|ZP_01857082.1|  hypothetical protein PM8797T_19712 [Planct...   108    1e-21 
ref|ZP_01855007.1|  hypothetical protein PM8797T_07719 [Planct...   107    1e-21 
ref|ZP_04426163.1|  uncharacterized conserved protein [Plancto...   107    2e-21 
ref|ZP_01089025.1|  hypothetical protein DSM3645_00460 [Blasto...   107    2e-21 
ref|ZP_01090430.1|  hypothetical protein DSM3645_12001 [Blasto...   107    2e-21 
ref|YP_003369824.1|  protein of unknown function DUF1501 [Pire...   106    4e-21 
ref|ZP_02734897.1|  hypothetical protein GobsU_24042 [Gemmata ...   105    4e-21 
ref|ZP_01851901.1|  hypothetical protein PM8797T_28809 [Planct...   105    5e-21 
ref|ZP_04429253.1|  uncharacterized conserved protein [Plancto...   105    5e-21 
ref|ZP_03133292.1|  protein of unknown function DUF1501 [Chtho...   105    6e-21 
ref|NP_866593.1|  hypothetical protein RB5255 [Rhodopirellula ...   105    7e-21 
ref|NP_863941.1|  hypothetical protein RB490 [Rhodopirellula b...   104    1e-20 
ref|ZP_01852887.1|  hypothetical protein PM8797T_02899 [Planct...   104    1e-20 
ref|ZP_03128421.1|  protein of unknown function DUF1501 [Chtho...   102    4e-20 
ref|ZP_03132475.1|  protein of unknown function DUF1501 [Chtho...   102    4e-20 
gb|ABX10686.1|  hypothetical secreted protein [uncultured plan...   102    4e-20 
ref|ZP_02735796.1|  hypothetical protein GobsU_28560 [Gemmata ...   102    4e-20 
ref|ZP_04430094.1|  uncharacterized conserved protein [Plancto...   102    5e-20 
ref|ZP_04429409.1|  Protein of unknown function (DUF1501) [Pla...   102    6e-20 
ref|ZP_01852890.1|  hypothetical protein PM8797T_02914 [Planct...   102    7e-20 
ref|YP_003085453.1|  protein of unknown function DUF1501 [Dyad...   100    2e-19 
ref|ZP_01851838.1|  hypothetical protein PM8797T_28494 [Planct...   100    2e-19 
ref|YP_825001.1|  hypothetical protein Acid_3745 [Solibacter u...   100    2e-19 
ref|ZP_03129604.1|  protein of unknown function DUF1501 [Chtho...   100    2e-19 
ref|ZP_03129634.1|  protein of unknown function DUF1501 [Chtho...  99.8    3e-19 
ref|ZP_04428316.1|  uncharacterized conserved protein [Plancto...  99.8    4e-19 
ref|ZP_02731516.1|  hypothetical protein GobsU_06935 [Gemmata ...  99.0    6e-19 
ref|ZP_01854633.1|  hypothetical protein PM8797T_04155 [Planct...  98.6    7e-19 
ref|ZP_01854735.1|  hypothetical protein PM8797T_29333 [Planct...  98.6    9e-19 
ref|YP_003370409.1|  protein of unknown function DUF1501 [Pire...  98.2    1e-18 
ref|ZP_02734817.1|  hypothetical protein GobsU_23632 [Gemmata ...  97.4    2e-18 
ref|ZP_01856086.1|  hypothetical protein PM8797T_15386 [Planct...  97.4    2e-18 
ref|ZP_01852283.1|  hypothetical protein PM8797T_22953 [Planct...  97.1    2e-18 
ref|NP_864815.1|  hypothetical protein RB2231 [Rhodopirellula ...  96.7    3e-18 
ref|ZP_01852920.1|  hypothetical protein PM8797T_03064 [Planct...  96.3    4e-18 
ref|YP_828850.1|  hypothetical protein Acid_7666 [Solibacter u...  96.3    4e-18 
ref|ZP_03128694.1|  protein of unknown function DUF1501 [Chtho...  95.9    5e-18 
ref|ZP_01105334.1|  hypothetical protein FB2170_04010 [Flavoba...  95.9    5e-18 
ref|ZP_01856242.1|  hypothetical protein PM8797T_27297 [Planct...  95.9    6e-18 
ref|ZP_01855446.1|  hypothetical protein PM8797T_13837 [Planct...  95.5    7e-18 
ref|ZP_02733406.1|  hypothetical protein GobsU_16499 [Gemmata ...  95.1    9e-18 
ref|ZP_03626946.1|  protein of unknown function DUF1501 [bacte...  94.4    1e-17 
ref|ZP_01855110.1|  hypothetical protein PM8797T_29972 [Planct...  94.4    1e-17 
ref|ZP_02732129.1|  hypothetical protein GobsU_10028 [Gemmata ...  94.4    1e-17 
ref|YP_003369235.1|  protein of unknown function DUF1501 [Pire...  94.4    2e-17 
ref|YP_003385808.1|  protein of unknown function DUF1501 [Spir...  94.4    2e-17 
ref|ZP_01720413.1|  hypothetical protein ALPR1_14374 [Algoriph...  94.4    2e-17 
ref|YP_003370277.1|  protein of unknown function DUF1501 [Pire...  94.0    2e-17 
ref|YP_003370193.1|  protein of unknown function DUF1501 [Pire...  93.6    2e-17 
ref|ZP_01852805.1|  hypothetical protein PM8797T_13348 [Planct...  93.2    3e-17 
ref|ZP_03629388.1|  protein of unknown function DUF1501 [bacte...  92.8    4e-17 
ref|ZP_04429374.1|  Protein of unknown function (DUF1501) [Pla...  92.4    6e-17 
ref|ZP_02731013.1|  hypothetical protein GobsU_04384 [Gemmata ...  92.4    6e-17 
ref|ZP_01853773.1|  hypothetical protein PM8797T_25761 [Planct...  91.7    9e-17 
ref|ZP_01090891.1|  hypothetical protein DSM3645_10952 [Blasto...  91.7    1e-16 
ref|ZP_03130514.1|  protein of unknown function DUF1501 [Chtho...  91.3    1e-16 
ref|ZP_02733304.1|  hypothetical protein GobsU_15987 [Gemmata ...  91.3    1e-16 
ref|YP_001837582.1|  hypothetical protein LEPBI_I0161 [Leptosp...  90.9    2e-16 
ref|YP_003371703.1|  protein of unknown function DUF1501 [Pire...  90.5    2e-16 
ref|YP_003368899.1|  protein of unknown function DUF1501 [Pire...  89.4    5e-16 
ref|ZP_01852951.1|  hypothetical protein PM8797T_03219 [Planct...  89.0    6e-16 
ref|ZP_02926622.1|  hypothetical protein VspiD_08260 [Verrucom...  88.2    1e-15 
ref|ZP_02736909.1|  hypothetical protein GobsU_34165 [Gemmata ...  87.8    2e-15 
ref|YP_003370886.1|  protein of unknown function DUF1501 [Pire...  87.4    2e-15 
ref|ZP_01856805.1|  hypothetical protein PM8797T_30891 [Planct...  86.7    3e-15 
ref|ZP_01852113.1|  hypothetical protein PM8797T_22103 [Planct...  85.5    7e-15 
ref|ZP_01857657.1|  hypothetical protein PM8797T_30464 [Planct...  85.5    8e-15 
ref|ZP_01091597.1|  hypothetical protein DSM3645_24210 [Blasto...  85.1    1e-14 
ref|ZP_01854939.1|  hypothetical protein PM8797T_23544 [Planct...  84.7    1e-14 
ref|ZP_01873668.1|  hypothetical protein LNTAR_08989 [Lentisph...  84.0    2e-14 
gb|ABX10648.1|  hypothetical secreted protein [uncultured plan...  82.8    4e-14 
ref|ZP_01855774.1|  hypothetical protein PM8797T_27205 [Planct...  82.0    8e-14 
ref|ZP_03129635.1|  protein of unknown function DUF1501 [Chtho...  82.0    8e-14 
ref|ZP_04425976.1|  Protein of unknown function (DUF1501) [Pla...  81.6    1e-13 
ref|ZP_02735043.1|  hypothetical protein GobsU_24776 [Gemmata ...  81.6    1e-13 
ref|ZP_01853648.1|  hypothetical protein PM8797T_25136 [Planct...  81.3    1e-13 
ref|ZP_02733295.1|  hypothetical protein GobsU_15942 [Gemmata ...  81.3    1e-13 
ref|ZP_01857675.1|  hypothetical protein PM8797T_11641 [Planct...  80.9    2e-13 
ref|ZP_03130415.1|  protein of unknown function DUF1501 [Chtho...  80.1    3e-13 
ref|ZP_03130555.1|  protein of unknown function DUF1501 [Chtho...  79.0    6e-13 
ref|ZP_02730745.1|  hypothetical protein GobsU_03040 [Gemmata ...  77.0    2e-12 
ref|YP_003373034.1|  protein of unknown function DUF1501 [Pire...  76.6    3e-12 
ref|ZP_01873959.1|  hypothetical protein LNTAR_10886 [Lentisph...  75.5    7e-12 
ref|ZP_01873591.1|  hypothetical protein LNTAR_08604 [Lentisph...  74.7    1e-11 
ref|ZP_02928809.1|  hypothetical protein VspiD_19205 [Verrucom...  74.7    1e-11 
ref|ZP_01854095.1|  hypothetical protein PM8797T_18459 [Planct...  74.3    2e-11 
ref|ZP_02736908.1|  hypothetical protein GobsU_34160 [Gemmata ...  72.4    7e-11 
ref|ZP_02738017.1|  hypothetical protein GobsU_39782 [Gemmata ...  68.9    7e-10 
ref|ZP_02731828.1|  hypothetical protein GobsU_08517 [Gemmata ...  60.8    2e-07 
ref|ZP_01853055.1|  hypothetical protein PM8797T_09159 [Planct...  59.3    5e-07 
ref|ZP_02732574.1|  hypothetical protein GobsU_12265 [Gemmata ...  55.1    9e-06 
ref|ZP_02731517.1|  hypothetical protein GobsU_06940 [Gemmata ...  52.0    8e-05 
ref|ZP_01114887.1|  hypothetical protein MED297_16923 [Reineke...  52.0    8e-05 
gb|ACN58762.1|  putative exported protein [uncultured bacteriu...  48.9    7e-04 
ref|ZP_04769738.1|  protein of unknown function DUF1501 [Astic...  47.8    0.001 
ref|ZP_05055957.1|  conserved hypothetical protein [Verrucomic...  47.8    0.002 
ref|ZP_03724912.1|  twin-arginine translocation pathway signal...  47.8    0.002 
ref|YP_269110.1|  hypothetical protein CPS_2392 [Colwellia psy...  47.4    0.002 
ref|ZP_03627199.1|  hypothetical protein Cflav_PD5243 [bacteri...  47.0    0.003 
ref|ZP_05026872.1|  conserved hypothetical protein [Microcoleu...  47.0    0.003 
ref|YP_002550579.1|  hypothetical protein Avi_3565 [Agrobacter...  46.6    0.004 
ref|YP_724364.1|  twin-arginine translocation pathway signal [...  45.4    0.007 
ref|YP_002987725.1|  protein of unknown function DUF1501 [Dick...  45.4    0.008 
ref|ZP_02906248.1|  protein of unknown function DUF1501 [Burkh...  43.9    0.024 
ref|ZP_05056147.1|  conserved hypothetical protein [Verrucomic...  43.1    0.036 
ref|ZP_03269459.1|  protein of unknown function DUF1501 [Burkh...  43.1    0.039 
ref|ZP_04425872.1|  uncharacterized conserved protein [Plancto...  43.1    0.040 
ref|ZP_02367627.1|  hypothetical protein BoklC_33270 [Burkhold...  43.1    0.042 
ref|ZP_02162931.1|  Twin-arginine translocation pathway signal...  42.7    0.051 
ref|YP_001811048.1|  hypothetical protein BamMC406_4376 [Burkh...  42.4    0.069 
ref|ZP_02894795.1|  protein of unknown function DUF1501 [Burkh...  42.4    0.070 
ref|YP_775800.1|  hypothetical protein Bamb_3912 [Burkholderia...  42.4    0.071 
ref|ZP_02360141.1|  hypothetical protein BoklE_31990 [Burkhold...  42.4    0.075 
ref|ZP_01450348.1|  hypothetical protein OM2255_18171 [alpha p...  42.0    0.077 
ref|ZP_01201359.1|  putative twin-arginine translocation pathw...  42.0    0.081 
ref|XP_002290996.1|  predicted protein [Thalassiosira pseudona...  42.0    0.089 
ref|YP_002547113.1|  hypothetical protein Avi_5238 [Agrobacter...  42.0    0.092 
ref|ZP_05885927.1|  DUF1501 domain-containing protein [Vibrio ...  42.0    0.094 
ref|YP_001618398.1|  hypothetical protein sce7749 [Sorangium c...  41.6    0.11  
ref|YP_372143.1|  hypothetical protein Bcep18194_B1385 [Burkho...  41.6    0.12  
ref|ZP_01302694.1|  hypothetical protein SKA58_03360 [Sphingom...  41.2    0.14  
ref|XP_002294919.1|  predicted protein [Thalassiosira pseudona...  41.2    0.14  
ref|NP_923592.1|  N-acetylglucosamine 6-sulfatase [Gloeobacter...  41.2    0.14  
ref|NP_927164.1|  hypothetical protein glr4218 [Gloeobacter vi...  41.2    0.15  
ref|YP_001519231.1|  hypothetical protein AM1_4942 [Acaryochlo...  40.8    0.17  
ref|YP_001615050.1|  hypothetical protein sce4407 [Sorangium c...  40.8    0.18  
ref|XP_002293606.1|  predicted protein [Thalassiosira pseudona...  40.4    0.23  
ref|ZP_01911419.1|  Tat (twin-arginine translocation) pathway ...  40.4    0.23  
ref|ZP_05035938.1|  conserved hypothetical protein [Synechococ...  40.4    0.26  
ref|YP_003392910.1|  protein of unknown function DUF1501 [Cone...  40.4    0.28  
ref|YP_002784614.1|  Conserved hypothetical protein, precursor...  40.4    0.28  
ref|YP_003332076.1|  protein of unknown function DUF1501 [Dick...  40.4    0.28  
ref|YP_002989025.1|  protein of unknown function DUF1501 [Dick...  40.0    0.34  
ref|YP_623755.1|  twin-arginine translocation pathway signal [...  40.0    0.35  
ref|ZP_05811641.1|  protein of unknown function DUF1501 [Mesor...  40.0    0.36  
ref|YP_002234268.1|  hypothetical protein BCAM1657 [Burkholder...  39.7    0.38  
ref|ZP_03626846.1|  sulfatase [bacterium Ellin514] >gb|EEF6282...  39.7    0.38  
ref|ZP_01811801.1|  hypothetical protein VSWAT3_24889 [Vibrion...  39.7    0.45  
ref|ZP_01693758.1|  conserved hypothetical protein [Microscill...  39.7    0.45  
ref|ZP_04771294.1|  protein of unknown function DUF1501 [Astic...  39.3    0.58  
ref|YP_002379925.1|  protein of unknown function DUF1501 [Cyan...  39.3    0.58  
ref|YP_001860255.1|  hypothetical protein Bphy_4086 [Burkholde...  39.3    0.60  
ref|NP_357518.1|  hypothetical protein Atu3080 [Agrobacterium ...  39.3    0.62  
ref|ZP_01460227.1|  Tat (twin-arginine translocation) pathway ...  38.9    0.66  
ref|NP_246621.1|  hypothetical protein PM1682 [Pasteurella mul...  38.9    0.67  
ref|YP_001192872.1|  hypothetical protein Fjoh_0518 [Flavobact...  38.9    0.74  
ref|YP_002759965.1|  hypothetical protein GAU_0453 [Gemmatimon...  38.9    0.77  
ref|ZP_01450313.1|  hypothetical protein OM2255_17996 [alpha p...  38.9    0.82  
ref|YP_002542343.1|  hypothetical protein Arad_9773 [Agrobacte...  38.5    0.86  
ref|ZP_06224994.1|  protein of unknown function DUF1501 [Burkh...  38.5    0.89  
ref|YP_591943.1|  hypothetical protein Acid345_2868 [Candidatu...  38.5    0.90  
ref|YP_629264.1|  Tat pathway signal sequence domain-containin...  38.5    0.97  
ref|YP_001779439.1|  hypothetical protein Bcenmc03_5828 [Burkh...  38.5    1.1   
ref|NP_937442.1|  hypothetical protein VVA1386 [Vibrio vulnifi...  38.1    1.2   
ref|ZP_05094695.1|  conserved hypothetical protein [marine gam...  38.1    1.2   
ref|YP_558718.1|  hypothetical protein Bxe_A2304 [Burkholderia...  38.1    1.4   
ref|YP_001296394.1|  hypothetical protein FP1512 [Flavobacteri...  37.7    1.5   
ref|ZP_06053753.1|  DUF1501 domain-containing protein [Grimont...  37.7    1.5   
ref|XP_002290908.1|  predicted protein [Thalassiosira pseudona...  37.7    1.5   
ref|YP_825925.1|  hypothetical protein Acid_4681 [Solibacter u...  37.7    1.6   
ref|YP_003156246.1|  arylsulfatase A family protein [Brachybac...  37.7    1.7   
ref|ZP_06383564.1|  twin-arginine translocation pathway signal...  37.7    1.8   
ref|ZP_05126993.1|  twin-arginine translocation pathway signal...  37.7    1.8   
ref|ZP_05102721.1|  sulfatase [Roseobacter sp. GAI101] >gb|EEB...  37.4    1.9   
ref|ZP_03276253.1|  protein of unknown function DUF1501 [Arthr...  37.4    1.9   
ref|NP_948934.1|  hypothetical protein RPA3596 [Rhodopseudomon...  37.4    2.2   
ref|YP_001993087.1|  protein of unknown function DUF1501 [Rhod...  37.4    2.3   
ref|ZP_02730597.1|  hypothetical protein GobsU_02296 [Gemmata ...  37.4    2.3   
ref|XP_001771463.1|  predicted protein [Physcomitrella patens ...  37.0    2.5   
ref|YP_002911194.1|  Twin-arginine translocation pathway signa...  37.0    2.6   
gb|AAR37838.1|  twin-arginine translocation domain protein [un...  37.0    2.6   
ref|YP_003075061.1|  hypothetical protein TERTU_3754 [Teredini...  37.0    2.7   
ref|ZP_03131134.1|  type I phosphodiesterase/nucleotide pyroph...  37.0    3.0   
ref|ZP_02367535.1|  ribonuclease II (RNB)-like protein [Burkho...  37.0    3.0   
ref|ZP_02357059.1|  ribonuclease II (RNB)-like protein [Burkho...  37.0    3.0   
ref|YP_356810.1|  radical SAM domain-containing protein [Pelob...  37.0    3.0   
ref|YP_002910413.1|  Ribonuclease II [Burkholderia glumae BGR1...  37.0    3.1   
ref|YP_553690.1|  hypothetical protein Bxe_B1625 [Burkholderia...  37.0    3.2   
ref|ZP_06465779.1|  protein of unknown function DUF1501 [Burkh...  36.6    3.6   
ref|ZP_01113058.1|  Tat (twin-arginine translocation) pathway ...  36.6    3.6   
ref|ZP_03588010.1|  twin-arginine translocation pathway signal...  36.6    3.8   
ref|ZP_02464804.1|  ribonuclease II (RNB)-like protein [Burkho...  36.6    4.1   
ref|XP_001216756.1|  hypothetical protein ATEG_08135 [Aspergil...  36.2    4.2   
ref|YP_001448581.1|  hypothetical protein VIBHAR_06463 [Vibrio...  36.2    4.3   
ref|ZP_01986330.1|  twin-arginine translocation pathway signal...  36.2    4.3   
ref|YP_003119901.1|  protein of unknown function DUF1501 [Chit...  36.2    4.4   
ref|XP_002373009.1|  EF hand domain protein [Aspergillus flavu...  36.2    4.5   
ref|ZP_01718041.1|  probable sulfatase atsG [Algoriphagus sp. ...  36.2    4.6   
ref|ZP_06294676.1|  protein of unknown function DUF1501 [Burkh...  36.2    4.6   
ref|ZP_02373404.1|  ribonuclease II (RNB)-like protein [Burkho...  36.2    4.6   
ref|YP_001565509.1|  hypothetical protein Daci_4495 [Delftia a...  36.2    4.6   
ref|ZP_06171160.1|  protein of unknown function DUF1501 [Brevu...  36.2    4.7   
ref|YP_003386039.1|  protein of unknown function DUF1501 [Spir...  36.2    4.8   
ref|ZP_02387270.1|  ribonuclease II (RNB)-like protein [Burkho...  36.2    4.9   
ref|ZP_06120947.1|  cystathionine beta-lyase [Caulobacter segn...  36.2    5.0   
ref|ZP_03571275.1|  twin-arginine translocation pathway signal...  36.2    5.1   
ref|YP_001579521.1|  hypothetical protein Bmul_1336 [Burkholde...  36.2    5.1   
ref|ZP_00051182.1|  COG4102: Uncharacterized protein conserved...  36.2    5.2   
ref|YP_003311122.1|  amidase, hydantoinase/carbamoylase family...  36.2    5.3   
ref|YP_441717.1|  ribonuclease II [Burkholderia thailandensis ...  35.8    5.5   
ref|YP_457286.1|  hypothetical protein ELI_01985 [Erythrobacte...  35.8    5.6   
ref|YP_002909483.1|  hypothetical protein bglu_2g19230 [Burkho...  35.8    5.7   
dbj|BAI79868.1|  hypothetical protein [Deferribacter desulfuri...  35.8    5.9   
ref|XP_001622929.1|  predicted protein [Nematostella vectensis...  35.8    6.0   
ref|YP_001005727.1|  hypothetical protein YE1419 [Yersinia ent...  35.8    6.0   
ref|YP_001187594.1|  sulfatase [Pseudomonas mendocina ymp] >gb...  35.8    6.1   
ref|XP_657921.1|  hypothetical protein AN0317.2 [Aspergillus n...  35.8    6.1   
ref|ZP_06490334.1|  hypothetical protein XcampmN_12355 [Xantho...  35.8    6.3   
ref|YP_269085.1|  sulfatase family protein [Colwellia psychrer...  35.8    6.3   
ref|ZP_06487994.1|  hypothetical protein XcampvN_25910 [Xantho...  35.8    6.6   
ref|ZP_02195169.1|  hypothetical protein 1103602000598_AND4_10...  35.8    6.6   
ref|ZP_02245118.1|  hypothetical protein Xoryp_21350 [Xanthomo...  35.8    6.9   
ref|XP_001817855.1|  hypothetical protein [Aspergillus oryzae ...  35.4    7.4   
ref|YP_117539.1|  putative phosphoketolase [Nocardia farcinica...  35.4    7.5   
ref|ZP_01623303.1|  hypothetical protein L8106_25360 [Lyngbya ...  35.4    7.7   
ref|YP_001036496.1|  sulfatase [Clostridium thermocellum ATCC ...  35.4    7.7   
ref|YP_001915909.1|  hypothetical protein PXO_02839 [Xanthomon...  35.4    7.8   
ref|YP_199065.1|  hypothetical protein XOO0426 [Xanthomonas or...  35.4    7.9   
ref|ZP_01734699.1|  heparan N-sulfatase [Flavobacteria bacteri...  35.4    8.4   
ref|YP_841661.1|  hypothetical protein H16_B2149 [Ralstonia eu...  35.4    8.4   
ref|YP_449417.1|  hypothetical protein XOO_0388 [Xanthomonas o...  35.4    8.5   
ref|YP_828056.1|  sulfatase [Solibacter usitatus Ellin6076] >g...  35.4    8.6   
ref|ZP_01035718.1|  ABC-type transport system ATP-binding prot...  35.4    8.9   
ref|ZP_04547095.1|  arylsulfatase B [Bacteroides sp. D1] >gb|E...  35.0    9.5   
ref|XP_001214017.1|  conserved hypothetical protein [Aspergill...  35.0    9.5   
ref|XP_002537082.1|  conserved hypothetical protein [Ricinus c...  35.0    9.7   
ref|ZP_02884044.1|  protein of unknown function DUF1501 [Burkh...  35.0    9.8   
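
The summary list above ends each line with two numeric columns: the bit score and the E-value. As a minimal sketch (not part of the original annotation), the lines could be parsed and filtered as follows. The function names and the `max_evalue` cutoff are hypothetical choices made for illustration; the sample lines are copied from the list above.

```python
def parse_hit_line(line):
    """Split one BLAST summary line into (accession, description, bit_score, e_value).

    Assumes the last two whitespace-separated fields are the bit score and
    the E-value, as in the list above.
    """
    head, score, evalue = line.rstrip().rsplit(None, 2)
    # head looks like "ref|ACCESSION|  description [organism...]"
    _db, accession, description = head.split("|", 2)
    return accession, description.strip(), float(score), float(evalue)


def filter_hits(lines, max_evalue):
    """Keep only hits whose E-value does not exceed max_evalue (hypothetical cutoff)."""
    hits = [parse_hit_line(l) for l in lines if l.strip()]
    return [h for h in hits if h[3] <= max_evalue]


# Three lines copied verbatim from the summary list.
SAMPLE = [
    "ref|NP_864132.1|  sulfatase 1 precursor [Rhodopirellula baltic...   184    1e-44 ",
    "ref|ZP_02731517.1|  hypothetical protein GobsU_06940 [Gemmata ...  52.0    8e-05 ",
    "ref|ZP_02884044.1|  protein of unknown function DUF1501 [Burkh...  35.0    9.8   ",
]

if __name__ == "__main__":
    for accession, description, score, evalue in filter_hits(SAMPLE, max_evalue=1e-10):
        print(accession, score, evalue, description)
```

Splitting from the right keeps the parser independent of the description text, which may itself contain spaces, brackets, or `|` characters.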

ALIGNMENTS
>ref|ZP_01857254.1| hypothetical protein PM8797T_08334 [Planctomyces maris DSM 8797]
 gb|EDL56895.1| hypothetical protein PM8797T_08334 [Planctomyces maris DSM 8797]
Length=474

 Score =  379 bits (974),  Expect = 2e-103, Method: Compositional matrix adjust.
 Identities = 177/308 (57%), Positives = 227/308 (73%), Gaps = 1/308 (0%)

Query  1    EPKVKGSPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNV  60
            +PK+ GSP+ F+  G++G+ ISEL PH+  V+DEL ++ SL +D FNH PA  F  TG  
Sbjct  107  QPKLMGSPFAFQQQGEAGLPISELMPHLGSVSDELCMIHSLKTDHFNHAPAQLFFQTGFS  166

Query  61   RFGWPSLGAWASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQGVPLRTQG  120
            RFG PSLG+W +YGLG+ENS+LP ++VL +G N+    +S WG+GFLP  +QGV  R+ G
Sbjct  167  RFGRPSLGSWVNYGLGSENSNLPGFVVLITG-NVAGAGNSLWGSGFLPSIYQGVEFRSSG  225

Query  121  DPVLYLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPEL  180
            DPVL+L+NP G++ E R+  I+ +  LN+ +   V DP I  RI QYE+AYRMQ+AVPEL
Sbjct  226  DPVLFLSNPKGMTGEDRKRIIDSVNHLNKVQLADVGDPEIATRINQYEMAYRMQSAVPEL  285

Query  181  ADIAAEPRAVLETYGADPGRVSFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQHTE  240
             DI+ EP+ + E YG  PG+ SFANNCLLARRL E GVRFVQL+++GWD HG I K    
Sbjct  286  MDISNEPKHIHEQYGTQPGKASFANNCLLARRLVERGVRFVQLFDQGWDHHGSIVKSLKN  345

Query  241  RCRAVDRPIAALIADLKQRGLLDETLVIWGGEFGRTPMAQGDGSGYGRDHHPHGFTVWMA  300
            +CR VD+PIAALI DL+QRGLLD+TLV+WG EFGRTPM QGD    GRDHH   +TVWMA
Sbjct  346  KCRQVDQPIAALIKDLRQRGLLDDTLVVWGAEFGRTPMVQGDRKAPGRDHHKDAYTVWMA  405

Query  301  GGGIRPGI  308
            GGG++ G 
Sbjct  406  GGGVKRGF  413
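
The statistics line of the alignment above gives identities, positives, and gaps as a count over the alignment length, followed by a percentage. The printed percentages appear to be truncated rather than rounded: 227/308 is about 73.7%, shown as 73%. The sketch below is an illustration only, with hypothetical function names; it extracts the three ratios and checks them against integer truncation.

```python
import re

# Matches e.g. "Identities = 177/308 (57%)"
STAT = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_stats(line):
    """Return {name: (count, alignment_length, printed_percent)} for one stats line."""
    return {name: (int(n), int(d), int(p)) for name, n, d, p in STAT.findall(line)}


def matches_truncation(stats):
    """True if every printed percentage equals floor(100 * count / length)."""
    return all(n * 100 // d == p for n, d, p in stats.values())


# Stats line copied from the first alignment (Planctomyces maris DSM 8797).
LINE = " Identities = 177/308 (57%), Positives = 227/308 (73%), Gaps = 1/308 (0%)"

if __name__ == "__main__":
    stats = parse_stats(LINE)
    for name, (n, d, p) in stats.items():
        print(f"{name}: {n}/{d} = {100 * n / d:.1f}% (printed {p}%)")
    print("consistent with truncation:", matches_truncation(stats))
```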


>ref|ZP_03630904.1| protein of unknown function DUF1501 [bacterium Ellin514]
 gb|EEF58791.1| protein of unknown function DUF1501 [bacterium Ellin514]
Length=480

 Score =  363 bits (931),  Expect = 2e-98, Method: Compositional matrix adjust.
 Identities = 171/306 (55%), Positives = 224/306 (73%), Gaps = 2/306 (0%)

Query  3    KVKGSPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNVRF  62
            K+ G  + F  HGQSG E+SEL PH+A VAD++AIV+S+ +D FNH PA   M+TG+ +F
Sbjct  116  KLLGPKFSFARHGQSGAELSELLPHLAEVADDIAIVKSMSTDAFNHAPAQIMMHTGSQQF  175

Query  63   GWPSLGAWASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQGVPLRTQGDP  122
            G PS+GAW+ YGLG+E+ DLP ++V +SG        S WG+GFLP  + GV  R+QGDP
Sbjct  176  GRPSVGAWSLYGLGSESKDLPGFVVFSSGAKGPSGGASNWGSGFLPTVYNGVMFRSQGDP  235

Query  123  VLYLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPELAD  182
            +LYL+NP GV  + +RD ++ +  LN++  ++V DP I  RI  YE+AYRMQT+ P+L D
Sbjct  236  ILYLSNPKGVDDQIQRDTLDSVRNLNQKHLDVVGDPEISTRINSYEMAYRMQTSAPDLMD  295

Query  183  IAAEPRAVLETYGADPGRVSFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQHTERC  242
            ++ EP+ VLE YG +PG+ SFA NCLLARRL E GVRFV+L+ + WD HG +    T+ C
Sbjct  296  LSKEPKHVLEMYGVEPGKPSFAMNCLLARRLIERGVRFVELFHESWDQHGNLKADLTKNC  355

Query  243  RAVDRPIAALIADLKQRGLLDETLVIWGGEFGRTPMAQGDGSGYGRDHHPHGFTVWMAGG  302
            +  D+  AAL+ DLKQRGLLD+TLVIWGGEFGRTPM QG     GRDHHP+ FT+WMAGG
Sbjct  356  KHTDQASAALVKDLKQRGLLDDTLVIWGGEFGRTPMVQGGDD--GRDHHPNCFTMWMAGG  413

Query  303  GIRPGI  308
            GI+PGI
Sbjct  414  GIKPGI  419


>ref|ZP_03630017.1| protein of unknown function DUF1501 [bacterium Ellin514]
 gb|EEF59628.1| protein of unknown function DUF1501 [bacterium Ellin514]
Length=491

 Score =  359 bits (921),  Expect = 2e-97, Method: Compositional matrix adjust.
 Identities = 165/305 (54%), Positives = 216/305 (70%), Gaps = 4/305 (1%)

Query  7    SPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNVRFGWPS  66
            +  KF  HG+ G E+SE  PH+  + D++AIV+S+ +D FNH PA  FMNTG  +FG PS
Sbjct  127  TTLKFSKHGKCGAELSETLPHLGEIVDDIAIVKSMTTDAFNHAPAQIFMNTGATQFGRPS  186

Query  67   LGAWASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQGVPLRTQGDPVLYL  126
            +G+W +YGLG+E+  LP ++VL+S         S WG GFLP  +QGVP R  GDP+L L
Sbjct  187  MGSWVTYGLGSESQSLPGFVVLSSAGGTSGGA-SNWGCGFLPTVYQGVPFRRSGDPILSL  245

Query  127  NNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPELADIAAE  186
            +NP+GV+++ +RD +++L  LN+   ++V DP I  RI  +E+AYRMQ + PEL DI+ E
Sbjct  246  SNPNGVTKQMQRDSLDVLKELNQHHLDVVGDPEIATRINAFEMAYRMQASAPELMDISKE  305

Query  187  PRAVLETYGADPGRVSFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQHTERCRAVD  246
             +  LE YGA+PG+ SFANNCLLARRL E GVRFVQLY + WD H E+      +C   D
Sbjct  306  SKDTLEMYGAEPGKSSFANNCLLARRLVERGVRFVQLYHEAWDHHSEVINGVKNQCGVTD  365

Query  247  RPIAALIADLKQRGLLDETLVIWGGEFGRTPMAQGD---GSGYGRDHHPHGFTVWMAGGG  303
            +P AALI DLKQRGLL++TLV+WGGEFGRTPM + +   G   GRDHHP  FT+WMAGGG
Sbjct  366  KPAAALIKDLKQRGLLEDTLVVWGGEFGRTPMVETNEAAGRKMGRDHHPQAFTMWMAGGG  425

Query  304  IRPGI  308
            I+PGI
Sbjct  426  IKPGI  430


>ref|ZP_02925393.1| hypothetical protein VspiD_02085 [Verrucomicrobium spinosum DSM 
4136]
Length=490

 Score =  357 bits (916),  Expect = 8e-97, Method: Compositional matrix adjust.
 Identities = 172/314 (54%), Positives = 228/314 (72%), Gaps = 7/314 (2%)

Query  2    PKVKGSPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNVR  61
            PK+ GSP+KF+ +GQSG  +S++FPH   + DE+A+V+S+ +D FNH PA  F++TG++R
Sbjct  114  PKMLGSPYKFKQYGQSGAWVSDMFPHFTKIVDEVALVKSMNTDQFNHAPAELFVHTGDMR  173

Query  62   FGWPSLGAWASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQGVPLRTQGD  121
             G  S+G+W +YGLG+EN DLP ++VL SG        S W +GFLP  +QGV  RT G+
Sbjct  174  AGGASIGSWVTYGLGSENLDLPGFVVLLSGGTDPTGGKSLWNSGFLPSVYQGVQCRTTGE  233

Query  122  PVLYLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPELA  181
            P+L+  NP+G++R++RR  ++ LG LN+     + DP  L RI QYELAYRMQTAVPE+ 
Sbjct  234  PILFSKNPEGMARDSRRRSLDALGRLNQLEAAELGDPETLTRISQYELAYRMQTAVPEVF  293

Query  182  DIAAEPRAVLETYGADPGRVSFANNCLLARRLSESGVRFVQLYEKGWDSHG-----EIAK  236
            DI  EP +V   YGA PG  SFANNCLLARRL E+GVR+VQL++ GWD HG     ++  
Sbjct  294  DIQKEPESVRNLYGAKPGEGSFANNCLLARRLVENGVRYVQLFDWGWDIHGTGKGDDLVN  353

Query  237  QHTERCRAVDRPIAALIADLKQRGLLDETLVIWGGEFGRTPM--AQGDGSGYGRDHHPHG  294
            +  ++CR VD+  AALI DLKQRGLL+ TLV+WGGEFGRTPM  A+G  +  GRDHHP+ 
Sbjct  354  KFPQKCRDVDQACAALITDLKQRGLLENTLVVWGGEFGRTPMNEARGGSTYLGRDHHPNC  413

Query  295  FTVWMAGGGIRPGI  308
            FT+WMAGGGI+ GI
Sbjct  414  FTMWMAGGGIKGGI  427


>ref|ZP_01856761.1| hypothetical protein PM8797T_30671 [Planctomyces maris DSM 8797]
 gb|EDL57318.1| hypothetical protein PM8797T_30671 [Planctomyces maris DSM 8797]
Length=478

 Score =  357 bits (915),  Expect = 1e-96, Method: Compositional matrix adjust.
 Identities = 167/310 (53%), Positives = 217/310 (70%), Gaps = 7/310 (2%)

Query  4    VKGSPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNVRFG  63
            +  S ++F  HG+SG  +SEL PH A +ADEL +V+SLY++  NHDPA+TF+ TG+++ G
Sbjct  110  IAASMYQFARHGESGTWMSELLPHTAKIADELCVVKSLYTEAINHDPAITFLQTGSIQAG  169

Query  64   WPSLGAWASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQGVPLRTQGDPV  123
             PS+G+W SYGLG+EN DLP ++ L SG   QPL D  WG+GFLP  HQGV  R   DPV
Sbjct  170  RPSMGSWISYGLGSENRDLPTFVALTSGAGGQPLYDRLWGSGFLPTRHQGVKFRRSSDPV  229

Query  124  LYLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPELADI  183
            L+L+NP G+ ++ RRD ++ LG LNR   +   DP I  RI QYELA+RMQT++PELAD+
Sbjct  230  LFLSNPPGIDQQTRRDMLDDLGELNRLSLQQKGDPEIATRISQYELAFRMQTSIPELADL  289

Query  184  AAEPRAVLETYG---ADPGRVSFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQHTE  240
            + E  A  E YG     PG  ++A NCLLARRL+E GVRF+QLY +GWD H  +  +  +
Sbjct  290  SEETSATFELYGEQAKQPG--TYAANCLLARRLAERGVRFIQLYHRGWDHHLNLPTKIRQ  347

Query  241  RCRAVDRPIAALIADLKQRGLLDETLVIWGGEFGRTPMAQG--DGSGYGRDHHPHGFTVW  298
                 D+  AALI DLKQRG+LD+TLV+W GEFGRT   QG    + YGRDHHP  FTVW
Sbjct  348  LTEETDQATAALILDLKQRGMLDDTLVVWAGEFGRTVYCQGTLTATNYGRDHHPRCFTVW  407

Query  299  MAGGGIRPGI  308
             AGGG++PG+
Sbjct  408  AAGGGMKPGM  417


>ref|ZP_02931028.1| hypothetical protein VspiD_30335 [Verrucomicrobium spinosum DSM 
4136]
Length=485

 Score =  355 bits (911),  Expect = 4e-96, Method: Compositional matrix adjust.
 Identities = 172/312 (55%), Positives = 221/312 (70%), Gaps = 7/312 (2%)

Query  4    VKGSPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNVRFG  63
            V  S +KF  HG+S   ISEL PHV+ +AD+L  ++S++++  NHDPA+TF  TG    G
Sbjct  112  VAPSVFKFAEHGESRATISELMPHVSQIADDLCFIKSMHTEAINHDPAITFFQTGRQIAG  171

Query  64   WPSLGAWASYGLGTENSDLPAYIVLAS----GRNIQPLLDSYWGAGFLPPEHQGVPLRTQ  119
            +PS+G+W SYGLG+EN DLPA++VL S     ++ QPL    WG+GFLP EHQGV  R  
Sbjct  172  YPSMGSWLSYGLGSENKDLPAFVVLTSFGSGRKDCQPLASRLWGSGFLPSEHQGVRFRNS  231

Query  120  GDPVLYLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPE  179
            GDPVLYL+NP G+S   RR  +++L ALN  R  LV DP I  RI QYE+A+RMQT+VP+
Sbjct  232  GDPVLYLSNPGGMSPSMRRRSLDVLNALNEDRLRLVGDPEIQTRIAQYEMAFRMQTSVPD  291

Query  180  LADIAAEPRAVLETYGADPGRV-SFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQH  238
            L D+  EP+ +L+ YG D  R  S+A NCLLARRL+E  VRFVQL+  GWD H  + K  
Sbjct  292  LTDMKDEPQHILDMYGPDVKRPGSYAANCLLARRLAERDVRFVQLFHMGWDQHFNLPKAI  351

Query  239  TERCRAVDRPIAALIADLKQRGLLDETLVIWGGEFGRTPMAQG--DGSGYGRDHHPHGFT  296
              +C   D+P AALI DLKQRGLL++TL++WGGEFGRT  +QG    S YGRDHHP  FT
Sbjct  352  QGQCHDTDQPTAALIKDLKQRGLLEDTLIVWGGEFGRTIYSQGTLTESNYGRDHHPRCFT  411

Query  297  VWMAGGGIRPGI  308
            V++AGGGI+PG+
Sbjct  412  VFLAGGGIKPGM  423


>ref|ZP_01107554.1| hypothetical protein FB2170_08934 [Flavobacteriales bacterium 
HTCC2170]
 gb|EAR00618.1| hypothetical protein FB2170_08934 [Flavobacteriales bacterium 
HTCC2170]
Length=503

 Score =  354 bits (909),  Expect = 6e-96, Method: Compositional matrix adjust.
 Identities = 164/315 (52%), Positives = 223/315 (70%), Gaps = 9/315 (2%)

Query  2    PKVKGSPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNVR  61
            PK+ G   KF+  G+SG  +S   PH   V D++A ++++++D FNH PA  FM+TG+ R
Sbjct  127  PKLMGPQAKFKQEGESGNWVSNYLPHFKKVVDDVAFLKAVHTDQFNHGPAQLFMHTGSAR  186

Query  62   FGWPSLGAWASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQGVPLRTQGD  121
             G PS+G+WA+YGLG+EN +LP ++VL SG N      S WG+GFLP  +QGV  R++GD
Sbjct  187  LGRPSIGSWATYGLGSENQNLPGFVVLTSGGNSPDAGKSVWGSGFLPSVYQGVQCRSKGD  246

Query  122  PVLYLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPELA  181
            PVLY+ +PDG++R+ ++  I+ +  +N+  +    DP ILARI QYE+AYRMQ AVPE+ 
Sbjct  247  PVLYIKDPDGITRDLKKSTIDAINKINKEEYLKYADPEILARINQYEMAYRMQIAVPEVM  306

Query  182  DIAAEPRAVLETYGADPGRVSFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQHTE-  240
            +I  EP  + + YG +PG+ SFANNCLLAR+L E GVRFVQL++ GWD+HG I +   + 
Sbjct  307  NINNEPEDIKQMYGVEPGKESFANNCLLARKLVEDGVRFVQLFDWGWDTHGNIREGSIDI  366

Query  241  ----RCRAVDRPIAALIADLKQRGLLDETLVIWGGEFGRTPMAQGDGSG----YGRDHHP  292
                +CR +DRPI ALI DLKQRGLLDETL++WGGEFGRTPM +  G+      GRDHH 
Sbjct  367  GLRNKCREIDRPITALILDLKQRGLLDETLIVWGGEFGRTPMQENRGNKKMAFKGRDHHG  426

Query  293  HGFTVWMAGGGIRPG  307
              FT+W+AGGGI+ G
Sbjct  427  DAFTMWIAGGGIKKG  441


>ref|ZP_03629479.1| protein of unknown function DUF1501 [bacterium Ellin514]
 gb|EEF60197.1| protein of unknown function DUF1501 [bacterium Ellin514]
Length=492

 Score =  354 bits (908),  Expect = 7e-96, Method: Compositional matrix adjust.
 Identities = 169/315 (53%), Positives = 222/315 (70%), Gaps = 9/315 (2%)

Query  2    PKVKGSPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNVR  61
            PK+  +P KF  HG+ G+E+SEL PH+A VAD++ +VRS+ +D FNH PA  F+N+G+ +
Sbjct  116  PKILATPHKFARHGKCGMELSELLPHLATVADDITLVRSMVTDAFNHAPAQIFLNSGSTQ  175

Query  62   FGWPSLGAWASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQGVPLRTQGD  121
             G PS+G+W +YGLG+E+ DLP ++V+ SG        + W +GFLP  +QGV  R+QGD
Sbjct  176  IGRPSMGSWLNYGLGSESQDLPGFVVMLSGGGQPSGGTACWSSGFLPTVYQGVQFRSQGD  235

Query  122  PVLYLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPELA  181
            PVL+L+NP G+S + RR  ++ L  LN        DP I  RI  +E+AY+MQT+ PEL 
Sbjct  236  PVLFLSNPAGMSVQDRRRALDALHDLNEMELNATGDPEIATRINSFEMAYKMQTSAPELM  295

Query  182  DIAAEPRAVLETYGADPGRVSFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQHT--  239
            DI+ EP+ + E YG +PG+V+FANNCLLARRL E G RFVQLY +GWD HG         
Sbjct  296  DISKEPQYIHEMYGTEPGKVAFANNCLLARRLIERGSRFVQLYHRGWDHHGTAPNTDIVN  355

Query  240  -----ERCRAVDRPIAALIADLKQRGLLDETLVIWGGEFGRTPM-AQGDGSGY-GRDHHP  292
                 ++CR  DRP+AALI DLKQRGLL++TLVIWGGEFGRTPM  + DGS Y GRDHHP
Sbjct  356  PEGLPKQCRETDRPMAALIKDLKQRGLLEDTLVIWGGEFGRTPMNEERDGSKYLGRDHHP  415

Query  293  HGFTVWMAGGGIRPG  307
              F++WMAGGGI+ G
Sbjct  416  KAFSLWMAGGGIKGG  430


>ref|ZP_04781244.1| protein of hypothetical function DUF1501 [Sphingobacterium spiritivorum 
ATCC 33861]
 gb|EER70026.1| protein of hypothetical function DUF1501 [Sphingobacterium spiritivorum 
ATCC 33861]
Length=497

 Score =  353 bits (906),  Expect = 1e-95, Method: Compositional matrix adjust.
 Identities = 166/316 (52%), Positives = 221/316 (69%), Gaps = 9/316 (2%)

Query  2    PKVKGSPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNVR  61
            P + G    F  +G+SG  IS+  PH + VADE++ ++++++D FNH PA  FM+TG+ R
Sbjct  122  PNMLGPQATFAQYGESGAWISDHLPHFSKVADEVSFLKAVHTDQFNHGPAQLFMHTGSAR  181

Query  62   FGWPSLGAWASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQGVPLRTQGD  121
             G PS+G+W +YGLG+ENS+LP ++VL SG        S WG+GFLP  +QGV  R++GD
Sbjct  182  LGRPSIGSWVTYGLGSENSNLPGFVVLTSGGKTPDAGKSVWGSGFLPSVYQGVQCRSKGD  241

Query  122  PVLYLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPELA  181
            PVLY+ +PDG+ R+ ++  I+ +  +N   +E   DP  L+RI QYE+AY+MQ AVPE+ 
Sbjct  242  PVLYIADPDGMGRDLKKHTIDAINKVNMDEYETYKDPETLSRIAQYEMAYKMQVAVPEVM  301

Query  182  DIAAEPRAVLETYGADPGRVSFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQHTE-  240
            DIA+EP  + E YG  PG+ SFANNCLLAR+L E GVRFVQL++ GWDSHG  A    + 
Sbjct  302  DIASEPEYIHELYGTQPGKESFANNCLLARKLVEQGVRFVQLFDWGWDSHGTSASDSIDL  361

Query  241  ----RCRAVDRPIAALIADLKQRGLLDETLVIWGGEFGRTPMAQG----DGSGYGRDHHP  292
                +CR +DRP+ ALI DLKQRGLLDETLV+WGGEFGRTPM +     D    GRDHH 
Sbjct  362  GFRNKCREIDRPMTALIMDLKQRGLLDETLVVWGGEFGRTPMQENRDNRDMPFMGRDHHT  421

Query  293  HGFTVWMAGGGIRPGI  308
              +T+WMAGGGIR G+
Sbjct  422  DAYTIWMAGGGIRKGV  437


>ref|ZP_01855397.1| hypothetical protein PM8797T_15066 [Planctomyces maris DSM 8797]
 gb|EDL58779.1| hypothetical protein PM8797T_15066 [Planctomyces maris DSM 8797]
Length=479

 Score =  353 bits (905),  Expect = 2e-95, Method: Compositional matrix adjust.
 Identities = 167/310 (53%), Positives = 217/310 (70%), Gaps = 3/310 (0%)

Query  1    EPKVKGSPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNV  60
            E K +G   +FR +G++G EIS+  P  A +AD++ I+RS+ ++  NHDPA TF+NTG  
Sbjct  109  ELKCQGPLTRFRKYGRNGQEISDFLPWTAKIADDICIIRSMVTEQINHDPAHTFLNTGTA  168

Query  61   RFGWPSLGAWASYGLGTENSDLPAYIVLAS--GRNIQPLLDSYWGAGFLPPEHQGVPLRT  118
              G PS+G+W +YGLG+E  +LP ++VL S  GRN QP+    WG GFLP  +QGV   +
Sbjct  169  ISGRPSMGSWITYGLGSETEELPGFVVLTSVGGRNPQPIASRQWGTGFLPSRYQGVQFNS  228

Query  119  QGDPVLYLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVP  178
             GDPV YL NP+G+S   ++  I  +  L+R R+E V +P I  RI  YE+A+ MQT+VP
Sbjct  229  TGDPVNYLKNPEGISNSQQKQLIETIQKLDRYRNERVTNPEIDTRIAAYEMAFMMQTSVP  288

Query  179  ELADIAAEPRAVLETYGADPGRVSFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQH  238
            EL D++ E R  LE YGA+PG  S+ANNCLLARRL+E G RF+ LY +GWD HG++ +  
Sbjct  289  ELMDLSGETRQTLEMYGAEPGSGSYANNCLLARRLAERGSRFIHLYHRGWDHHGDLVRYM  348

Query  239  TERCRAVDRPIAALIADLKQRGLLDETLVIWGGEFGRTPMAQGDGSGYGRDHHPHGFTVW  298
               C   D+P  ALI DLKQRG+LDETLVIWGGEFGRTPM QG G G GRDHH  GF++W
Sbjct  349  NTCCGLTDKPTWALINDLKQRGMLDETLVIWGGEFGRTPMFQGKG-GAGRDHHIKGFSMW  407

Query  299  MAGGGIRPGI  308
            MAGGGI+ GI
Sbjct  408  MAGGGIKGGI  417


>ref|ZP_03969507.1| protein of hypothetical function DUF1501 [Sphingobacterium spiritivorum 
ATCC 33300]
 gb|EEI90786.1| protein of hypothetical function DUF1501 [Sphingobacterium spiritivorum 
ATCC 33300]
Length=497

 Score =  352 bits (904),  Expect = 2e-95, Method: Compositional matrix adjust.
 Identities = 165/316 (52%), Positives = 221/316 (69%), Gaps = 9/316 (2%)

Query  2    PKVKGSPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNVR  61
            P + G    F  +G+SG  IS+  PH + VADE++ ++++++D FNH PA  FM+TG+ R
Sbjct  122  PNMLGPQATFAQYGESGAWISDHLPHFSKVADEVSFLKAVHTDQFNHGPAQLFMHTGSAR  181

Query  62   FGWPSLGAWASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQGVPLRTQGD  121
             G PS+G+W +YGLG+ENS+LP ++VL SG        S WG+GFLP  +QGV  R++GD
Sbjct  182  LGRPSIGSWVTYGLGSENSNLPGFVVLTSGGKTPDAGKSVWGSGFLPSVYQGVQCRSKGD  241

Query  122  PVLYLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPELA  181
            PVLY+ +PDG+ R+ ++  I+ +  +N   +E   DP  L+RI QYE+AY+MQ AVPE+ 
Sbjct  242  PVLYIADPDGMGRDLKKHTIDAINKVNMDEYETYKDPETLSRIAQYEMAYKMQVAVPEVM  301

Query  182  DIAAEPRAVLETYGADPGRVSFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQHTE-  240
            DIA+EP  + E YG  PG+ SFANNCLLAR+L E GVRFVQL++ GWDSHG  A    + 
Sbjct  302  DIASEPEYIHELYGTQPGKESFANNCLLARKLVEQGVRFVQLFDWGWDSHGTSASDSIDL  361

Query  241  ----RCRAVDRPIAALIADLKQRGLLDETLVIWGGEFGRTPMAQG----DGSGYGRDHHP  292
                +CR +DRP+ ALI DLKQRGLLDETLV+WGGEFGRTPM +     D    GRDHH 
Sbjct  362  GFRNKCREIDRPMTALIMDLKQRGLLDETLVVWGGEFGRTPMQENRDNRDMPFMGRDHHT  421

Query  293  HGFTVWMAGGGIRPGI  308
              +T+WMAGGG+R G+
Sbjct  422  DAYTIWMAGGGVRKGV  437


>ref|ZP_01092432.1| hypothetical protein DSM3645_27773 [Blastopirellula marina DSM 
3645]
 gb|EAQ78953.1| hypothetical protein DSM3645_27773 [Blastopirellula marina DSM 
3645]
Length=467

 Score =  352 bits (902),  Expect = 4e-95, Method: Compositional matrix adjust.
 Identities = 168/306 (54%), Positives = 218/306 (71%), Gaps = 1/306 (0%)

Query  3    KVKGSPWKFRPHGQSGIEISELFPHVAGVADELAIVRSLYSDTFNHDPAVTFMNTGNVRF  62
            K+ G  +KF  HG+ G EISEL P+ A +ADEL+I++S+ +D FNH PA   MNTG+  F
Sbjct  101  KLLGPKFKFAKHGECGAEISELLPYTAKIADELSIIKSMKTDAFNHAPAQIMMNTGSPLF  160

Query  63   GWPSLGAWASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQGVPLRTQGDP  122
            G PSLGAW  YGLG+E+ +LP ++V +SG+       S WG+G+LP  +QGV LR+ GDP
Sbjct  161  GKPSLGAWTMYGLGSESRNLPGFVVFSSGKKGPSGGSSNWGSGYLPTVYQGVQLRSVGDP  220

Query  123  VLYLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPELAD  182
            VLYL+NP G+ R  +RD ++ +  LN++R     DP I  RI  +E+AYRMQ + PEL D
Sbjct  221  VLYLSNPTGIDRTVQRDSLDTINQLNQQRLAATGDPEIATRINSFEMAYRMQASGPELMD  280

Query  183  IAAEPRAVLETYGADPGRVSFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQHTERC  242
            +++EP+ VL+ YG DP + SFA NCLLARRL E G RFVQL+ + WD HG + K   E C
Sbjct  281  LSSEPKHVLDMYGVDPDKPSFAKNCLLARRLVERGTRFVQLFHEAWDQHGNLKKDLQENC  340

Query  243  RAVDRPIAALIADLKQRGLLDETLVIWGGEFGRTPMAQGDGSGYGRDHHPHGFTVWMAGG  302
             A D+  AAL+ DLKQRGLL++TLVIWGGEFGRTPM QG G   GRDHHP+ FT+WMAGG
Sbjct  341  LATDQACAALVQDLKQRGLLEDTLVIWGGEFGRTPMVQGGGDD-GRDHHPNAFTMWMAGG  399

Query  303  GIRPGI  308
            G + G+
Sbjct  400  GAKGGV  405

---------------------------------------------------------------------------------------------------------------------

b) Blastp vs swissprot

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

sp|Q5Z066.1|PHK_NOCFA  RecName: Full=Probable phosphoketolase      35.4    0.43 
sp|Q21MG7.1|DDL_SACD2  RecName: Full=D-alanine--D-alanine liga...  33.1    2.1  
sp|A1T8K4.1|UVRC_MYCVP  RecName: Full=UvrABC system protein C;...  32.3    3.6  
sp|P77318.2|YDEN_ECOLI  RecName: Full=Uncharacterized sulfatas...  32.3    3.9  
sp|Q5HPA2.2|EBH_STAEQ  RecName: Full=Extracellular matrix-bind...  32.0    4.9  
sp|Q8CP76.1|EBH_STAES  RecName: Full=Extracellular matrix-bind...  32.0    4.9  
sp|P75264.1|Y134_MYCPN  RecName: Full=Putative ABC transporter...  32.0    5.2  
sp|Q493L3.1|MURB_BLOPB  RecName: Full=UDP-N-acetylenolpyruvoyl...  31.6    6.1  
sp|Q1AHB2.1|KSL11_ORYSI  RecName: Full=Stemod-13(17)-ene synth...  31.6    6.6  
sp|A6WRJ9.2|SYL_SHEB8  RecName: Full=Leucyl-tRNA synthetase; A...  31.2    8.2  


ALIGNMENTS
>sp|Q5Z066.1|PHK_NOCFA RecName: Full=Probable phosphoketolase
Length=822

 Score = 35.4 bits (80),  Expect = 0.43, Method: Compositional matrix adjust.
 Identities = 38/127 (29%), Positives = 52/127 (40%), Gaps = 22/127 (17%)

Query  72   SYGLGTENSDLPAYIVLASGR-NIQPLLDSYWGAGFLPPEHQGVPLRTQGDPVLYLN---  127
            +YG   +N DL  + V+  G     PL  S+ G  FL P   G  L     P+L LN   
Sbjct  185  AYGAALDNPDLTVFCVIGDGEAETGPLATSWHGNKFLNPGRDGAVL-----PILALNEYK  239

Query  128  --NPDGVSREARRDQINLLGALNRRRHELV----NDPAILARIKQYELAYRMQTAVPELA  181
              NP   +R    + INLL       HE +    +DP ++       LA  M T +  +A
Sbjct  240  IANPTLFARIPEPELINLLEGYG---HEPIVVAGDDPGVV----HQRLAAAMDTCMNRIA  292

Query  182  DIAAEPR  188
             I    R
Sbjct  293  QIQRAAR  299


>sp|Q21MG7.1|DDL_SACD2 RecName: Full=D-alanine--D-alanine ligase; AltName: Full=D-alanylalanine 
synthetase; AltName: Full=D-Ala-D-Ala ligase
Length=314

 Score = 33.1 bits (74),  Expect = 2.1, Method: Compositional matrix adjust.
 Identities = 18/44 (40%), Positives = 22/44 (50%), Gaps = 5/44 (11%)

Query  70   WASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQG  113
            W S GL T     P Y  L    + Q +LDS  G GF+ P H+G
Sbjct  118  WQSVGLVT-----PEYASLVENSDWQAVLDSLGGQGFVKPAHEG  156


>sp|A1T8K4.1|UVRC_MYCVP RecName: Full=UvrABC system protein C; Short=Protein uvrC; AltName: 
Full=Excinuclease ABC subunit C
Length=678

 Score = 32.3 bits (72),  Expect = 3.6, Method: Compositional matrix adjust.
 Identities = 34/105 (32%), Positives = 46/105 (43%), Gaps = 27/105 (25%)

Query  118  TQGDPVLYLNNPDGV-------------------SREARRDQINLLGAL----NRRRHEL  154
            T  DPV++  N DG+                   S+ ++R   + L ++      RR  L
Sbjct  563  TAPDPVIFPRNSDGLYLLQRVRDEAHRFAISYHRSKRSKRMTASALDSVRGLGEHRRKAL  622

Query  155  VNDPAILARIKQYELAYRMQTAVPELADIAAEPRAVLETYGADPG  199
            V     LAR+KQ  +     TAVP +   AA  RAVLE  GAD G
Sbjct  623  VTHFGSLARLKQASVDE--ITAVPGIG--AATARAVLEALGADSG  663


>sp|P77318.2|YDEN_ECOLI RecName: Full=Uncharacterized sulfatase ydeN; Flags: Precursor
Length=560

 Score = 32.3 bits (72),  Expect = 3.9, Method: Compositional matrix adjust.
 Identities = 22/93 (23%), Positives = 43/93 (46%), Gaps = 11/93 (11%)

Query  224  YEKGWDSHGEIAKQHTERCRAVDRPIAALIADLKQRGLLDETLVIWGGEFGRT---PMA-  279
            Y+K +++  + A  +     +VD+ +  ++  LK+ G  D T++++  + G     P+  
Sbjct  297  YQKQFNTGSQTADNYYASVYSVDQGVKRILEQLKKNGQYDNTIILFTSDNGAVIDGPLPL  356

Query  280  QGDGSGYGRDHHPHG-----FTVWMAGGGIRPG  307
             G   GY    +P G     F  W   G ++PG
Sbjct  357  NGAQKGYKSQTYPGGTHTPMFMWW--KGKLQPG  387


>sp|Q5HPA2.2|EBH_STAEQ RecName: Full=Extracellular matrix-binding protein ebh; AltName: 
Full=ECM-binding protein homolog
Length=9439

 Score = 32.0 bits (71),  Expect = 4.9, Method: Composition-based stats.
 Identities = 37/190 (19%), Positives = 75/190 (39%), Gaps = 21/190 (11%)

Query  14    HGQSGIEISELFPHVAGVADELAIVRSLY---------SDTFNHDPAVTFMNTGNVRFGW  64
             H Q+  +++E+      + +E+  +++L          S   N DP V  +   +++ G 
Sbjct  6300  HAQTKQQVAEIIAQANKLNNEMGTLKTLVEEQSNVHQQSKYINEDPQVQNIYNDSIQKGR  6359

Query  65    PSLGAWASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQGVPLRTQGDPVL  124
               L        GT +  L    +  + +NI    +   G   L    Q        + + 
Sbjct  6360  EILN-------GTTDDVLNNNKIADAIQNIHLTKNDLHGDQKLQKAQQDAT-----NELN  6407

Query  125   YLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPELADIA  184
             YL N +   R++  D+IN   +     ++L +  A+   ++Q E    ++ +V +L+D  
Sbjct  6408  YLTNLNNSQRQSEHDEINSAPSRTEVSNDLNHAKALNEAMRQLENEVALENSVKKLSDFI  6467

Query  185   AEPRAVLETY  194
              E  A    Y
Sbjct  6468  NEDEAAQNEY  6477


>sp|Q8CP76.1|EBH_STAES RecName: Full=Extracellular matrix-binding protein ebh; AltName: 
Full=ECM-binding protein homolog
Length=9439

 Score = 32.0 bits (71),  Expect = 4.9, Method: Composition-based stats.
 Identities = 37/190 (19%), Positives = 75/190 (39%), Gaps = 21/190 (11%)

Query  14    HGQSGIEISELFPHVAGVADELAIVRSLY---------SDTFNHDPAVTFMNTGNVRFGW  64
             H Q+  +++E+      + +E+  +++L          S   N DP V  +   +++ G 
Sbjct  6300  HAQTKQQVAEIIAQANKLNNEMGTLKTLVEEQSNVHQQSKYINEDPQVQNIYNDSIQKGR  6359

Query  65    PSLGAWASYGLGTENSDLPAYIVLASGRNIQPLLDSYWGAGFLPPEHQGVPLRTQGDPVL  124
               L        GT +  L    +  + +NI    +   G   L    Q        + + 
Sbjct  6360  EILN-------GTTDDVLNNNKIADAIQNIHLTKNDLHGDQKLQKAQQDAT-----NELN  6407

Query  125   YLNNPDGVSREARRDQINLLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPELADIA  184
             YL N +   R++  D+IN   +     ++L +  A+   ++Q E    ++ +V +L+D  
Sbjct  6408  YLTNLNNSQRQSEHDEINSAPSRTEVSNDLNHAKALNEAMRQLENEVALENSVKKLSDFI  6467

Query  185   AEPRAVLETY  194
              E  A    Y
Sbjct  6468  NEDEAAQNEY  6477


>sp|P75264.1|Y134_MYCPN RecName: Full=Putative ABC transporter ATP-binding protein MG187 
homolog
Length=586

 Score = 32.0 bits (71),  Expect = 5.2, Method: Compositional matrix adjust.
 Identities = 21/77 (27%), Positives = 41/77 (53%), Gaps = 3/77 (3%)

Query  140  QIN-LLGALNRRRHELVNDPAILARIKQYELAYRMQTAVPELADIAAEPRAVLETYGADP  198
            QIN L+ ++ +++ EL  +  ++ R KQ+ +    +  + ++ D+  + +A LET  AD 
Sbjct  165  QINDLMVSVFQKQSELEANLKLIPRKKQFAIISLSKETLSQIRDVETKAKAALET--ADS  222

Query  199  GRVSFANNCLLARRLSE  215
              V       L ++LSE
Sbjct  223  AEVEQTIKSELKQKLSE  239


>sp|Q493L3.1|MURB_BLOPB RecName: Full=UDP-N-acetylenolpyruvoylglucosamine reductase; 
AltName: Full=UDP-N-acetylmuramate dehydrogenase
Length=345

 Score = 31.6 bits (70),  Expect = 6.1, Method: Compositional matrix adjust.
 Identities = 23/79 (29%), Positives = 35/79 (44%), Gaps = 3/79 (3%)

Query  200  RVSFANNCLLARRLSESGVRFVQLYEKGWDSHGEIAKQHTERCRAVDRPIAALIADLKQR  259
            R S   NCL    +   G+R  + ++   D H E+A  H E+     R I   I  ++ +
Sbjct  162  RDSIFRNCLEKYAIVSVGLRLCKKWKPILDYH-ELA--HLEKFHITPRQIFNFIYIIRHK  218

Query  260  GLLDETLVIWGGEFGRTPM  278
             L D  LV   G F + P+
Sbjct  219  KLPDPVLVGNAGSFFKNPI  237


>sp|Q1AHB2.1|KSL11_ORYSI RecName: Full=Stemod-13(17)-ene synthase; AltName: Full=Stemodene 
synthase; AltName: Full=Ent-kaurene synthase-like 11; Short=OsKSL11
Length=816

 Score = 31.6 bits (70),  Expect = 6.6, Method: Compositional matrix adjust.
 Identities = 16/45 (35%), Positives = 24/45 (53%), Gaps = 10/45 (22%)

Query  199  GRVSFANNCLLARRLS---------ESGVRFVQLYEKGWDSHGEI  234
             R++F+ NC+L   +          E  V FV L ++ WD+HGEI
Sbjct  538  ARIAFSQNCMLTTMVDDFFDGGGSMEEMVNFVALIDE-WDNHGEI  581


>sp|A6WRJ9.2|SYL_SHEB8 RecName: Full=Leucyl-tRNA synthetase; AltName: Full=Leucine--tRNA 
ligase; Short=LeuRS
Length=859

 Score = 31.2 bits (69),  Expect = 8.2, Method: Compositional matrix adjust.
 Identities = 44/194 (22%), Positives = 80/194 (41%), Gaps = 41/194 (21%)

Query  104  AGFLPPEHQGVPLRTQG----DPVLYLN--------NPDGVSREARRDQINLLGALNRRR  151
            AG +        L TQG    D   Y+N        +P  V+   + D+  +  A+++  
Sbjct  549  AGLVNSNEPAKQLLTQGMVLADAFYYINEKGARVWVSPLDVATTEKDDKGRITKAIDKDG  608

Query  152  HELVNDPAILARIKQYELAYRMQTAVPELADIAAEPRAVLETYGADPGR--VSFANNCLL  209
            +ELV               Y   + + +  +   +P+ ++E YGAD  R  + FA+   L
Sbjct  609  NELV---------------YTGMSKMSKSKNNGIDPQVMVEKYGADTVRLFMMFASPPEL  653

Query  210  ARRLSESGV----RFVQLYEKGWDSHGEIAKQHTERCRAVDRPIAALIADLKQ-RGLLDE  264
                 ESGV    RF++   + W    ++A +H  +  +    ++ L +D K  R  + +
Sbjct  654  TLEWQESGVEGAHRFIK---RLW----KLANEHVNQANSEALDVSTLTSDQKALRREVHK  706

Query  265  TLVIWGGEFGRTPM  278
            T+     + GR  M
Sbjct  707  TIAKVTDDIGRRQM  720
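
The E-values in the Swiss-Prot hit table above range from 0.43 to 8.2, in contrast with the nr hits (2e-95 and smaller). A minimal sketch for filtering such a table by E-value follows. The function names (`parse_hit_table`, `significant`) and the cutoff of 1e-3 are assumptions made for illustration, not values taken from the original annotation; the parser only relies on the score and E-value being the last two whitespace-separated fields of each row, as in this report.

```python
def parse_hit_table(lines):
    """Parse rows of a BLAST 'Sequences producing significant alignments' table.

    Returns a list of (identifier, bit_score, e_value) tuples. Header lines,
    blank lines and anything whose last two fields are not numbers are skipped.
    """
    hits = []
    for line in lines:
        parts = line.rsplit(None, 2)  # description, score, E-value
        if len(parts) != 3:
            continue
        description, score, evalue = parts
        try:
            hits.append((description.split()[0], float(score), float(evalue)))
        except ValueError:
            continue  # e.g. the "(Bits)  Value" header row
    return hits


def significant(hits, cutoff=1e-3):
    """Keep hits whose E-value is at or below the cutoff (1e-3 is an assumed default)."""
    return [hit for hit in hits if hit[2] <= cutoff]


if __name__ == "__main__":
    table = [
        "Sequences producing significant alignments:                       (Bits)  Value",
        "sp|Q5Z066.1|PHK_NOCFA  RecName: Full=Probable phosphoketolase      35.4    0.43 ",
        "sp|A6WRJ9.2|SYL_SHEB8  RecName: Full=Leucyl-tRNA synthetase; A...  31.2    8.2  ",
    ]
    hits = parse_hit_table(table)
    print(hits)
    print(significant(hits))
```

With the assumed cutoff, none of the Swiss-Prot rows listed above would be retained, whereas every nr hit shown in section a) would pass.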