GOS 1415010

Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : JCVI_READ_1400318
Annotathon code: GOS_1415010
Sample :
  • GPS :31°32'06n; 63°35'42w
  • Sargasso Sea: Sargasso Sea, Station 13 - Bermuda (UK)
  • Open Ocean (-5m, 20°C, 0.22-0.8 microns)
Authors
Team : Algarve
Username : 768990bio1
Annotated on : 2010-07-30 14:30:16
  • a37076 AnaSofiaCarraçoCosta
  • a37089 LilianaSofiaGriloSantos
  • a37090 ManuelAlexanderGuerreiroVieira

Synopsis

Genomic Sequence

>JCVI_READ_1400318 GOS_1415010 Genomic DNA
TCCTGATTTAATTTTGATTCTAACCCAAATGTTTCTTCTGAAGCTTCATTTTTTACTATAACACCTTGAATTAATCCAGTACCGTCTCGAAATATTAAAA
ACCATATCTTACCAATACTTCTAGTATTATAAACCCAGCCTTTCATCTGAATACTTTCACCACAATAACTATCGTTTAACTCGTTTACCATTAAAGTTTT
CATTTTTTTATATCCTTTTACCAAGTCGTCATGATATCATTCAAAATATCTTCCGCAATTTTATTCATTGCTATCCTAGCAGCAAAGCTTCTAGGATCAC
CAAATTCATCATCATCTAATTCATCAATTTTACCATCATTATCATTATCTATACCATCATTGCTAATATCGCCGCTTAATCCATATGATCCAAAACCTGT
ATAAGTTTTTGATATTAGATTTACATTATTTAAAACATCATACCATTCTAATTTTAAAAAGACTTTATACCTATACTCTGATACTGACTCTTCTTTATTA
TAAGTATAAGGATTTTCATCAATTTTTGATATCGTACCTTTCAAAATAGAATTAGCAACTTCTTCGTTTAATACTTTTAATACTCCTTCCTGATTAAACT
TTTCAAGAATCCTATCAGTTACAATTTGCTCTAATCCATATTCAGAAGTTTCATTATCTACTAATGGTATGCTTATAGATTTTATATGCGGTGGTATTGA
TCCAGCCAGAGAATAAATAGCACATCCATTTAAAAGTAAAATTAAATTAAAAATTAAAAATTTCATTTAGGATTTTTTCTAATCTTGGTTTCTAATCCAT
ATTCATCAATTTTTCTATAAAGCGTTCTTTCACTCATTCCCAGTGATTTGGCTGTTTTTCTCCTGTTTGTATTGAAAGAATCTTAATGTCCTAATTATTG
GTTTCCTTTTCTAAATCTCTTATG

Translation

[159 - 704/924]   indirect strand
>GOS_1415010 Translation [159-704   indirect strand]
MKFLIFNLILLLNGCAIYSLAGSIPPHIKSISIPLVDNETSEYGLEQIVTDRILEKFNQEGVLKVLNEEVANSILKGTISKIDENPYTYNKEESVSEYRY
KVFLKLEWYDVLNNVNLISKTYTGFGSYGLSGDISNDGIDNDNDGKIDELDDDEFGDPRSFAARIAMNKIAEDILNDIMTTW

Annotator commentaries

This putative ORF begins with a methionine, which the multiple alignment supports as the start codon, and ends with a stop codon. It is therefore a complete ORF.

The databases searched do not contain enough information to complete the annotation: the biological process, molecular function and gene name cannot be predicted. The source organism is probably a bacterium, but the results are too inconsistent and incomplete to be more specific.

ORF finding

PROTOCOL


a) SMS ORFinder / forward strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code

b) SMS ORFinder / reverse strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code
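
The two scans above can be approximated with a short script. The following is a minimal sketch in Python, not the SMS implementation: it assumes that 'any codon' initiation means every stop-free stretch of codons in a frame counts as an ORF, closed either by a stop codon or by the end of the read, and that reverse-strand coordinates are counted along the reverse complement. The names `find_orfs` and `reverse_complement` are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(dna, min_aa=60):
    """Yield (frame, start, end, protein) for each stop-free stretch.

    Coordinates are 1-based and inclusive; `end` includes the stop codon
    when the stretch is closed by one.
    """
    for frame in range(3):
        start, protein = frame, []
        for pos in range(frame, len(dna) - 2, 3):
            aa = CODON_TABLE.get(dna[pos:pos + 3], "X")  # "X" for ambiguous codons
            if aa == "*":
                if len(protein) >= min_aa:
                    yield frame + 1, start + 1, pos + 3, "".join(protein)
                start, protein = pos + 3, []
            else:
                protein.append(aa)
        if len(protein) >= min_aa:  # open-ended stretch at the end of the read
            yield frame + 1, start + 1, start + 3 * len(protein), "".join(protein)


# Toy example: the first seven codons of the retained ORF.
for orf in find_orfs("ATGAAATTTTTAATTTTTAAT", min_aa=6):
    print(orf)
```

For the forward scan the read would be passed to `find_orfs` directly; for the reverse scan it would be passed through `reverse_complement` first.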



RESULTS ANALYSIS

There are three potential ORFs of at least 60 amino acids on the direct strand and two more on the reverse strand.

The longest putative ORF is ORF number 1 in reading frame 3 on the reverse strand, which extends from base 159 to base 707; it returns results in both the protein domain and the homolog searches.

The ORFs found in reading frames (RF) 1 and 3 on the forward strand have unintegrated InterProScan matches: a signal peptide in RF1, and a signal peptide and a transmembrane region in RF3.

ORF number 1 in reading frame 2 on the reverse strand extends from base 689 to base 922. It is incomplete and barely overlaps the larger ORF (if its start codon is a methionine, there is no overlap). This ORF is very likely coding, because it has significant BLASTp results (more than 40 sequences with E-values under e-06) and a nucleic acid-binding protein domain (E-value is 14e-08).
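
The coordinates quoted for this ORF can be cross-checked with a little arithmetic. The sketch below assumes that reverse-strand positions are counted along the reverse complement of the 924-base read, which is consistent with the ATG at position 159 pairing with the CAT that ends at position 766 of the read as printed. The helper names are illustrative. It also shows why the translation header ends at base 704 while the ORF itself ends at base 707: the last three bases are the stop codon.

```python
READ_LENGTH = 924  # length of the read given under "Genomic Sequence"


def reverse_to_forward(start, end, seq_len=READ_LENGTH):
    """Map a 1-based inclusive interval on the reverse complement
    to the equivalent interval on the forward strand."""
    return seq_len - end + 1, seq_len - start + 1


def reading_frame(start):
    """Reading frame (1, 2 or 3) of an ORF from its 1-based start."""
    return (start - 1) % 3 + 1


orf_start, orf_end = 159, 707
length_nt = orf_end - orf_start + 1                  # 549 nt, stop codon included
n_residues = length_nt // 3 - 1                      # 182 amino acids
last_coding_base = orf_start + 3 * n_residues - 1    # 704

print(reverse_to_forward(orf_start, orf_end))   # (218, 766)
print(reading_frame(orf_start))                 # 3
print(length_nt, n_residues, last_coding_base)  # 549 182 704
```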
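
The overlap statement can be made concrete. In this sketch both ORFs are treated as 1-based inclusive intervals on the reverse strand, and the candidate methionine start is taken to be the first M in the translation of the smaller ORF (an assumption; a later M could also be the true start). The function name `overlap_length` is illustrative.

```python
def overlap_length(a, b):
    """Number of positions shared by two 1-based inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


large_orf = (159, 707)   # reading frame 3, reverse strand
small_orf = (689, 922)   # reading frame 2, reverse strand

# Start of the small ORF if translation begins at its first methionine.
small_orf_n_term = "YHDDLVKGYKKMKTLMVNEL"             # first residues of its translation
first_met_residue = small_orf_n_term.index("M") + 1   # residue 12
met_start = small_orf[0] + 3 * (first_met_residue - 1)

print(overlap_length(large_orf, small_orf))                   # 19
print(met_start)                                              # 722
print(overlap_length(large_orf, (met_start, small_orf[1])))   # 0
```

With 'any codon' initiation the two ORFs share 19 bases; starting the smaller ORF at base 722 removes the overlap entirely.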



RAW RESULTS

a) forward strand

>ORF number 1 in reading frame 1 on the direct strand extends from base 367 to base 549.
TATCGCCGCTTAATCCATATGATCCAAAACCTGTATAAGTTTTTGATATTAGATTTACAT
TATTTAAAACATCATACCATTCTAATTTTAAAAAGACTTTATACCTATACTCTGATACTG
ACTCTTCTTTATTATAAGTATAAGGATTTTCATCAATTTTTGATATCGTACCTTTCAAAA
TAG

>Translation of ORF number 1 in reading frame 1 on the direct strand.
YRRLIHMIQNLYKFLILDLHYLKHHTILILKRLYTYTLILTLLYYKYKDFHQFLISYLSK

>ORF number 1 in reading frame 2 on the direct strand extends from base 125 to base 319.
TATTATAAACCCAGCCTTTCATCTGAATACTTTCACCACAATAACTATCGTTTAACTCGT
TTACCATTAAAGTTTTCATTTTTTTATATCCTTTTACCAAGTCGTCATGATATCATTCAA
AATATCTTCCGCAATTTTATTCATTGCTATCCTAGCAGCAAAGCTTCTAGGATCACCAAA
TTCATCATCATCTAA

>Translation of ORF number 1 in reading frame 2 on the direct strand.
YYKPSLSSEYFHHNNYRLTRLPLKFSFFYILLPSRHDIIQNIFRNFIHCYPSSKASRITK
FIII

>ORF number 1 in reading frame 3 on the direct strand extends from base 195 to base 404.
AGTTTTCATTTTTTTATATCCTTTTACCAAGTCGTCATGATATCATTCAAAATATCTTCC
GCAATTTTATTCATTGCTATCCTAGCAGCAAAGCTTCTAGGATCACCAAATTCATCATCA
TCTAATTCATCAATTTTACCATCATTATCATTATCTATACCATCATTGCTAATATCGCCG
CTTAATCCATATGATCCAAAACCTGTATAA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
SFHFFISFYQVVMISFKISSAILFIAILAAKLLGSPNSSSSNSSILPSLSLSIPSLLISP
LNPYDPKPV

b) reverse strand

No ORFs were found in reading frame 1.

>ORF number 1 in reading frame 2 on the reverse strand extends from base 689 to base 922.
TATCATGACGACTTGGTAAAAGGATATAAAAAAATGAAAACTTTAATGGTAAACGAGTTA
AACGATAGTTATTGTGGTGAAAGTATTCAGATGAAAGGCTGGGTTTATAATACTAGAAGT
ATTGGTAAGATATGGTTTTTAATATTTCGAGACGGTACTGGATTAATTCAAGGTGTTATA
GTAAAAAATGAAGCTTCAGAAGAAACATTTGGGTTAGAATCAAAATTAAATCAG

>Translation of ORF number 1 in reading frame 2 on the reverse strand.
YHDDLVKGYKKMKTLMVNELNDSYCGESIQMKGWVYNTRSIGKIWFLIFRDGTGLIQGVI
VKNEASEETFGLESKLNQ

>ORF number 1 in reading frame 3 on the reverse strand extends from base 159 to base 707.
ATGAAATTTTTAATTTTTAATTTAATTTTACTTTTAAATGGATGTGCTATTTATTCTCTG
GCTGGATCAATACCACCGCATATAAAATCTATAAGCATACCATTAGTAGATAATGAAACT
TCTGAATATGGATTAGAGCAAATTGTAACTGATAGGATTCTTGAAAAGTTTAATCAGGAA
GGAGTATTAAAAGTATTAAACGAAGAAGTTGCTAATTCTATTTTGAAAGGTACGATATCA
AAAATTGATGAAAATCCTTATACTTATAATAAAGAAGAGTCAGTATCAGAGTATAGGTAT
AAAGTCTTTTTAAAATTAGAATGGTATGATGTTTTAAATAATGTAAATCTAATATCAAAA
ACTTATACAGGTTTTGGATCATATGGATTAAGCGGCGATATTAGCAATGATGGTATAGAT
AATGATAATGATGGTAAAATTGATGAATTAGATGATGATGAATTTGGTGATCCTAGAAGC
TTTGCTGCTAGGATAGCAATGAATAAAATTGCGGAAGATATTTTGAATGATATCATGACG
ACTTGGTAA

>Translation of ORF number 1 in reading frame 3 on the reverse strand.
MKFLIFNLILLLNGCAIYSLAGSIPPHIKSISIPLVDNETSEYGLEQIVTDRILEKFNQE
GVLKVLNEEVANSILKGTISKIDENPYTYNKEESVSEYRYKVFLKLEWYDVLNNVNLISK
TYTGFGSYGLSGDISNDGIDNDNDGKIDELDDDEFGDPRSFAARIAMNKIAEDILNDIMT
TW

Multiple Alignment

PROTOCOL

a) MUSCLE sequence alignment / default parameters


RESULTS ANALYSIS

Because the homologs are few and share only low similarity with the query, a reliable tree could not be built.

This multiple alignment is not split into in-groups and out-groups; it includes only the sequences with significant E-values (E-value < 0.001). It still gives an idea of where the initiation codon must be: in this case it indicates that the methionine is the initiation codon.



Organisms of the sequences used:
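
The selection rule described here (keep only hits with an E-value below 0.001) can be sketched as a small filter over BLAST summary lines. This is an illustration under stated assumptions, not the procedure the annotators used: the four sample lines are copied from the BLASTp table on this page, the cutoff is applied as a strict inequality, and `parse_hit` and `significant_hits` are hypothetical helper names.

```python
E_VALUE_CUTOFF = 1e-3  # threshold quoted in the analysis above

SUMMARY_LINES = [
    "ref|YP_001998058.1|  conserved hypothetical protein [Chlorobac...  65.5    2e-09",
    "ref|YP_002537627.1|  hypothetical protein Geob_2172 [Geobacter...  47.4    7e-04",
    "ref|YP_001960395.1|  conserved hypothetical protein [Chlorobiu...  47.0    0.001",
    "ref|YP_846203.1|  hypothetical protein Sfum_2085 [Syntrophobac...  46.2    0.002",
]


def parse_hit(line):
    """Split a BLAST summary line into (accession, bit score, E-value)."""
    fields = line.split()
    accession = fields[0].split("|")[1]
    return accession, float(fields[-2]), float(fields[-1])


def significant_hits(lines, cutoff=E_VALUE_CUTOFF):
    """Keep the hits whose E-value is strictly below the cutoff."""
    hits = (parse_hit(line) for line in lines)
    return [hit for hit in hits if hit[2] < cutoff]


for accession, score, e_value in significant_hits(SUMMARY_LINES):
    print(accession, score, e_value)
```

Because BLAST prints rounded E-values, a hit reported as exactly 0.001 sits on the boundary; the strict comparison used here excludes it.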

YP_001998058 Chlorobaculum parvum NCIB 8327 2e-09 [green sulfur bacteria]

YP_001943979 Chlorobium limicola DSM 245 5e-09 [green sulfur bacteria]

YP_003291329 Rhodothermus marinus DSM 4252 8e-09 [CFB group bacteria]

YP_001740456 Candidatus Cloacamonas acidaminovorans 1e-07 [bacteria]

YP_002016458 Prosthecochloris aestuarii DSM 271 2e-07 [green sulfur bacteria]

YP_001995433 Chloroherpeton thalassium ATCC 35110 4e-07 [green sulfur bacteria]

NP_662649 Chlorobium tepidum TLS 1e-06 [green sulfur bacteria]

YP_002430365 Desulfatibacillum alkenivorans AK-01 1e-06 [d-proteobacteria]

YP_446749 Salinibacter ruber DSM 13855 2e-06 [CFB group bacteria]

YP_003572741 Salinibacter ruber 3e-06 [CFB group bacteria]

ZP_03130666 Chthoniobacter flavus Ellin428 5e-06 [verrucomicrobia]

CAJ72110 Candidatus Kuenenia stuttgartiensis 9e-06 [planctomycetes]

YP_591432 Candidatus Koribacter versatilis Ellin345 1e-05 [bacteria]

YP_001231867 Geobacter uraniireducens Rf4 2e-05 [d-proteobacteria]

NP_968356 Bdellovibrio bacteriovorus HD100 9e-05 [d-proteobacteria]

ZP_01873542 Lentisphaera araneosa HTCC2155 1e-04 [bacteria]

YP_002019144 Pelodictyon phaeoclathratiforme BU-1 2e-04 [green sulfur bacteria]

ZP_04429847 Planctomyces limnophilus DSM 3776 5e-04 [planctomycetes]

YP_002537627 Geobacter sp. FRC-32 8e-04 [d-proteobacteria]


RAW RESULTS
a) MUSCLE
CLUSTAL FORMAT: MUSCLE (3.7) multiple sequence alignment


Ca.Ko.vers      -------------------------MTAFRVSLIAIALAVVLSTGC---GYHEA--GKAV
Ca.C.acida      ---------------------------MVKKFIYAILLLVIL-SSC---HYSVY--SNA-
GOS_141501      ------------------------------MKFLIFNLILLL-NGC--AIYSLA--GS--
R.ma.DSM_4      ---------------MRTRCTMISKKSYKRWVWLWWIGWLAL-GGC--AYYSFT--GAA-
S.r.DSM_13      -----------------------------------MAFAASL-PGC--TVYSFS--GAS-
S.ruber         --------MPTNPPPFSLTPRTSNRLRSWIVAGLLVAFAASL-PGC--TVYSFS--GAS-
Chlh.th.AT      -------MIFSDELSLSIFKIFTMISKTRLFLLFFFFLMATL-SGC----YSFS--EGS-
Chlbi.li.D      ---------------MFEKRSDTMRHIGTKAVIFLAFTIAIL-QGC----YSFS--GGS-
Pe.ph.BU-1      -----------------------MLKAAAPIITLFVLVLATL-QGC----YSFA--GSS-
Pr.ae.DSM       -------------------------MVRRSLFPLLFLCMFMM-HGC----YSFS--GSS-
Chlba.pa.N      --------------------MNAMHKRTGTLLLLFGLLTVIL-QGC----YSFS--GAS-
Chlbi.te.T      -----------------------MRKTTGTLFVLLTLVTLIL-QGC----YSFS--GGA-
L.ar.HTCC2      ---------------------------MIKKLLISSLMALVL-VSC--AGYQVG--NMG-
Cht.f.Elli      ----------------------------MRFAFLLSLAAFVF-TGC--AGYHIG--PVQP
Ca.Ku.stut      ---------MPKCFYKMTIPCVFKRRHYTLSALVFPMIFFFI-TGC---GYSSK--SL--
Pl.ln.DSM       MTTPRDQQRCQGEFAPRESFPAAQKISRRDCLAMLSTLPWVF-TGC---GYTI---GKA-
G.u.Rf4         -------------------------MHRFFAFWLLMIVGIAL-NGC---GYHHV--GSVA
B.bo.HD100      --MIASRSHYFCDDFLLILYSTGVVDAIFKAFRITLILSLFL-SGC---AYRLG--SGTR
D.al.AK-01      -------------------------MQKALLRLVLPVLMLSL-CAC---GYHFS--GGA-
                                                         :  .*    *         

Ca.Ko.vers      RIPPD-VKTISIPIFENQTKTY--HIEQIITNAVIHEMSSRTKYRIVSSPSDDVDATITG
Ca.C.acida      -YP-H-LRKIRILAFENRTSEY--GLGDKLLNYLNREIREDGRLKLVT---EDPDCTLEG
GOS_141501      -IPPH-IKSISIPLVDNETSEY--GLEQIVTDRILEKFNQEGVLKV-LNE-EVANSILKG
R.ma.DSM_4      -IPSH-LQTIAIPLVEDRSNSPFTNLDQQLTELLIERFVNQTRLTLEPDP-EAADALLEV
S.r.DSM_13      -IPSN-LETISVPIAQNNTSSPVNRLGADLTDLLTDRFVDRTRLSLTTDD-AGADALLRA
S.ruber         -IPSN-LETISVPIAQNNTSSPVNRLGADLTDLLTDRFVDRTRLSLTTDD-AGADALLRA
Chlh.th.AT      -LPSH-IKTIAIPVFGDESSSGIGQLREQLTTKMVDHVQSQSSLVIEQDR-RLANSVMEA
Chlbi.li.D      -VPSH-MKTVAVPVFQDRSQAGIAPFRAELTRRLTEKIESQSPLRMTPSR-ATADGLLEG
Pe.ph.BU-1      -LPAH-LKTIAIPVFEDRSAAGIAQLRGELTGSLVNRIESQSPLRVTPSL-ARADALLEG
Pr.ae.DSM       -VPAH-INTVAIPLFGDTSGAGVAQLTIKLTDMVHRKVEGESRLQIEPNR-DRADAVLEG
Chlba.pa.N      -IPEH-LHTVAVPLFDDASQAGIAEFRERCTRRLVNKVEAQSDLSIEPDL-SRANAVLKG
Chlbi.te.T      -LPPH-LHTVAVPLFDDTTQAGIAEFREGITRSLINKIESQSTLSIEPDP-SRADAVLKG
L.ar.HTCC2      -HPQ--IKRVSVGKIKNLSDEP--RLALIVMDKLKEAIRQDGTYELVNAGEGKADAIIQG
Cht.f.Elli      KFMEG-IHKIAIPTFRNDTLEP--RVETILATTVINQFQQDGTYQIVDE--KDADAILEG
Ca.Ku.stut      -LRSN-VRSIYVPIFDNDTFRR--GYEFDLTRAVRDQLLLRTNLRIVDK--DEADSILFG
Pl.ln.DSM       -FSPQ-IRTVTVPIFENDTFRR--GIEYQLTEAVQREIRTRTPFKLVNG--DEAETRLSG
G.u.Rf4         GSPVGGKDGVHIPVFANKSYHP--GLETVLTSNLVDEFARRSGGKAVDE--AAARLVLSG
B.bo.HD100      SIPGG-YKQISVPIFKNKTQET--GIEVAFTNTLIQEFQRSRVARIVDN--SLSEVAVIG
D.al.AK-01      -RPGDPVRAVFIPVLDNETAET--GLETMITNGLIREFTREQKFTLAHSR-AEANVLLAG
                         : :    : :              :   .                   :  

Ca.Ko.vers      KVLTTESTPA-TYDSQTGR-----------ATTALV---TVTASVKFVDR-HGKVLFENN
Ca.C.acida      AILSFSENVY-SYDTANQ------------VQDYQV---KMVVSVTFTDLINNTVLYQNS
GOS_141501      TISKIDENPY-TYNKE-ES-----------VSEYRY---KVFLKLEWYDVLNNVNLISKT
R.ma.DSM_4      RIDRYTNEPT-AVGGA-ER-----------AERNRV---TITVTVRYVDQVNDQVLLARS
S.r.DSM_13      RIQRYTNEPT-GVSGD-ER-----------ATTNQV---TIRVQVRYVDQTKGEDLLNQT
S.ruber         RIQRYTNEPT-GVSGD-ER-----------ATTNQV---TIRVQVRYVDQTKDEELLNQT
Chlh.th.AT      VIKSYSDEPS-QVSSTTER-----------ATQNRI---SITISVTYKDLVTKKTLFTQS
Chlbi.li.D      TITTFSDEPS-QLSSKTER-----------AITNRI---TITVNAAFQDRVNKTMLFERT
Pe.ph.BU-1      ALISFSDAPS-QLSSITER-----------AMTNRV---TLVVQVTMVDRVKKTAMFTQS
Pr.ae.DSM       VIVSYNDEAS-QLSSETER-----------ASTNRI---TLVVKAVFRDRLEDKELFPTT
Chlba.pa.N      VILSYADEPS-QLGSSTER-----------AVTNRV---TIVLKAEFEDMVKHEQLFSQT
Chlbi.te.T      AIVSYSDEPS-QLGSATER-----------AVTNRI---TIVLQADFDDQVKNSKLFSQT
L.ar.HTCC2      TLPQFSFRKV-GFSKNDDD-----------DKKYRVDNYRATVKFTYEVVTPKKLVIQKF
Cht.f.Elli      TLDVLQRNPARSVRGNVLL-----------TKEYTL---NVRCRFKLTKKSTGVIVDQRV
Ca.Ku.stut      KISSVNENVL--IEDRKDN-----------IVESRV---TIRADIRWVDTRTGRAIVERK
Pl.ln.DSM       KIVEIRKDVL-GETTWDD------------PRELQF---SLMVHVTWEDLRSGEVLGKET
G.u.Rf4         AVLSYAVTPV-SYTAA-DK-----------IREYRL---TIKVVATLSDRQSGKVLWKGE
B.bo.HD100      QIDSVQYLPG-AKRVAGDSSAPYLPNGTVVASEYRI---LLNVTVKVVRQADGTELWSGS
D.al.AK-01      SIASLLDENA-ARRSSGD------------SALRRV---KMILSLELLDE-DGRVLWADD
                 :                                                     :    

Ca.Ko.vers      --------NYTFRDQYQLSADPLSFFEED----------------------TVALHRMAS
Ca.C.acida      --------GLTVTELYAVAEGGTAKFKTK----------------------EEAVEELIS
GOS_141501      ---------YTGFGSYG-LSGD---ISNDGIDNDNDGKIDELDDDEFGDPRSFAARIAMN
R.ma.DSM_4      ---------FSAFEEYDPIAQG---LAGE----------------------EETARLVLR
S.r.DSM_13      ---------FSGAANYNPVEAG---LDGE----------------------RQAAQSALE
S.ruber         ---------FSGAANYNPVEAG---LDGE----------------------RQAAQNALE
Chlh.th.AT      ---------FTGISDYA--IGD---FTAQ----------------------QESIEEAID
Chlbi.li.D      ---------FTGFADYS--VGS---FSGK----------------------QEAIMQSLE
Pe.ph.BU-1      ---------FVGFSDYR--TGD---PVGQ----------------------QEALRFCVD
Pr.ae.DSM       --------SFTGFADYS--AGS---YAGQ----------------------QEAIRSSVE
Chlba.pa.N      ---------FVGFADYQ--AGS---YSAQ----------------------QGAIDSAID
Chlbi.te.T      ---------FVGFADYQ--TGN---YTAQ----------------------QTAIQSAYN
L.ar.HTCC2      --------TMTGNGDFSDGVSI---EQNR----------------------REGLERASY
Cht.f.Elli      ---------VTGTTSFYATGSD---SVSQDVNQDE----------------RQAVPLAAA
Ca.Ku.stut      --------NIKGTTEFIVLRNE---TL------------------------TSSSNESFV
Pl.ln.DSM       APLDTSSIAMASQADFAPEVGQ---SLAT------------------------ATADSTS
G.u.Rf4         ---------LSGSQDYPANSNLALQQNSE----------------------DAAVREICR
B.bo.HD100      ---------FSGERTYAAPQVT---LAGVNSINPLYNLSAR----------RQNIDLMAY
D.al.AK-01      --------KISEYETYQVVSEN---LAATQANK------------------NQALSVLTT
                               :                                            

Ca.Ko.vers      DFSRTLVSNI-L--EAF----------------------
Ca.C.acida      KLYKTILQNS-I--EGW----------------------
GOS_141501      KIAEDILNDI-M--TTW----------------------
R.ma.DSM_4      NLADDIFTAA-T--SNW----------------------
S.r.DSM_13      NVADDIFSTA-T--SNW----------------------
S.ruber         NVADDIFSTA-T--SNW----------------------
Chlh.th.AT      QVSDDILNRL-LAGAGW----------------------
Chlbi.li.D      TIATDILDNM-L--SDWK---------------------
Pe.ph.BU-1      QIIDEIFDRV-V--SGW----------------------
Pr.ae.DSM       QISDDIFNAM-V--SIW----------------------
Chlba.pa.N      MAVDELFNRM-I--SNW----------------------
Chlbi.te.T      MALDDLFNQM-I--SNW----------------------
L.ar.HTCC2      SMATKVVTQL-A--EGW----------------------
Cht.f.Elli      DMAVQLVSQY-A--EGW----------------------
Ca.Ku.stut      KLAQSIVESMEE--DWW----------------------
Pl.ln.DSM       RLARRIVNMM-E--TPW----------------------
G.u.Rf4         RLAEQIHEKT-Q--EDF----------------------
B.bo.HD100      DIMSEAHDRI-T--ENF----------------------
D.al.AK-01      RLAMKIRHRMEAYFEGF

Protein Domains

PROTOCOL


a) InterProScan; default parameters at EBI



RESULTS ANALYSIS

The results are inconclusive. The only matches are unintegrated and consist of two overlapping regions: a signal peptide (residues 1-21) and a transmembrane region (residues 5-25).

This putative protein is therefore probably secreted or associated with a membrane.
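
How much the two predicted regions overlap can be read directly from the raw rows. The sketch below is illustrative only: it copies the two InterProScan rows reported for this ORF, splits them on whitespace (an assumption about the raw layout), and intersects the residue ranges. `parse_features` is a hypothetical helper name.

```python
RAW_ROWS = """\
GOS_1415010 38FD18757DE73BBE 182 SignalPHMM SignalP signal-peptide 1 21 NA ? 16-Mar-2010 NULL NULL
GOS_1415010 38FD18757DE73BBE 182 TMHMM tmhmm transmembrane_regions 5 25 NA ? 16-Mar-2010 NULL NULL
"""


def parse_features(text):
    """Map each feature name to the range of residues it covers."""
    features = {}
    for line in text.splitlines():
        fields = line.split()
        features[fields[5]] = range(int(fields[6]), int(fields[7]) + 1)
    return features


features = parse_features(RAW_ROWS)
shared = sorted(set(features["signal-peptide"])
                & set(features["transmembrane_regions"]))
print(shared[0], shared[-1], len(shared))  # 5 21 17
```

Residues 5 to 21 are covered by both predictions, so the two tools are largely describing the same N-terminal hydrophobic segment.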

RAW RESULTS

GOS_1415010   38FD18757DE73BBE	182	SignalPHMM	SignalP	signal-peptide	1	21	NA	?	16-Mar-2010	NULL	NULL
GOS_1415010   38FD18757DE73BBE	182	TMHMM	tmhmm	transmembrane_regions	5	25	NA	?	16-Mar-2010	NULL	NULL

Phylogeny

PROTOCOL



RESULTS ANALYSIS


There are not enough results to build a representative tree for identifying the query's organism, and the tree could not be rooted with an out-group.

RAW RESULTS

Taxonomy report

PROTOCOL


1) BLASTp vs NR, default NCBI parameters * "1000 Max target sequences"



RESULTS ANALYSIS

The BLAST search returns very few organisms, each with few hits; as a result, no list of in-groups and out-groups could be assembled to build a correct tree or a significant multiple alignment.


RAW RESULTS

Lineage Report

root
. cellular organisms
. . Bacteria                     [bacteria]
. . . Bacteroidetes/Chlorobi group [bacteria]
. . . . Chlorobiaceae                [green sulfur bacteria]
. . . . . Chlorobaculum                [green sulfur bacteria]
. . . . . . Chlorobaculum parvum NCIB 8327 -------------   65  2 hits [green sulfur bacteria]  conserved hypothetical protein [Chlorobaculum parvum NCIB 8
. . . . . . Chlorobium tepidum TLS .....................   56  2 hits [green sulfur bacteria]  hypothetical protein CT1770 [Chlorobium tepidum TLS] >gi|21
. . . . . Chlorobium limicola DSM 245 ------------------   64  2 hits [green sulfur bacteria]  conserved hypothetical protein [Chlorobium limicola DSM 245
. . . . . Prosthecochloris aestuarii DSM 271 ...........   59  2 hits [green sulfur bacteria]  hypothetical protein Paes_1799 [Prosthecochloris aestuarii 
. . . . . Chloroherpeton thalassium ATCC 35110 .........   58  2 hits [green sulfur bacteria]  conserved hypothetical protein [Chloroherpeton thalassium A
. . . . . Pelodictyon phaeoclathratiforme BU-1 .........   48  2 hits [green sulfur bacteria]  hypothetical protein Ppha_2332 [Pelodictyon phaeoclathratif
. . . . . Chlorobium phaeobacteroides BS1 ..............   46  2 hits [green sulfur bacteria]  conserved hypothetical protein [Chlorobium phaeobacteroides
. . . . . Chlorobium luteolum DSM 273 ..................   45  2 hits [green sulfur bacteria]  hypothetical protein Plut_0414 [Chlorobium luteolum DSM 273
. . . . . Chlorobium phaeobacteroides DSM 266 ..........   42  2 hits [green sulfur bacteria]  hypothetical protein Cpha266_1999 [Chlorobium phaeobacteroi
. . . . . Chlorobium ferrooxidans DSM 13031 ............   42  4 hits [green sulfur bacteria]  conserved hypothetical protein [Chlorobium ferrooxidans DSM
. . . . . Chlorobium phaeovibrioides DSM 265 ...........   37  2 hits [green sulfur bacteria]  hypothetical protein Cvib_0466 [Prosthecochloris vibrioform
. . . . Rhodothermus marinus DSM 4252 ------------------   63  2 hits [CFB group bacteria]     hypothetical protein Rmar_2060 [Rhodothermus marinus DSM 42
. . . . Salinibacter ruber DSM 13855 ...................   55  2 hits [CFB group bacteria]     hypothetical protein SRU_2651 [Salinibacter ruber DSM 13855
. . . . Flavobacteria bacterium BBFL7 ..................   40  2 hits [CFB group bacteria]     hypothetical protein BBFL7_01085 [Flavobacteria bacterium B
. . . . Chitinophaga pinensis DSM 2588 .................   40  2 hits [CFB group bacteria]     hypothetical protein Cpin_5918 [Chitinophaga pinensis DSM 2
. . . . Pedobacter sp. BAL39 ...........................   39  2 hits [CFB group bacteria]     hypothetical protein PBAL39_09866 [Pedobacter sp. BAL39] >g
. . . . Gramella forsetii KT0803 .......................   35  2 hits [CFB group bacteria]     hypothetical protein GFO_3592 [Gramella forsetii KT0803] >g
. . . . Bacteroides eggerthii DSM 20697 ................   34  2 hits [CFB group bacteria]     hypothetical protein BACEGG_01261 [Bacteroides eggerthii DS
. . . Candidatus Cloacamonas acidaminovorans -----------   60  2 hits [bacteria]               hypothetical protein; putative signal peptide [Candidatus C
. . . Desulfatibacillum alkenivorans AK-01 .............   56  2 hits [d-proteobacteria]       hypothetical protein Dalk_1194 [Desulfatibacillum alkenivor
. . . Chthoniobacter flavus Ellin428 ...................   54  2 hits [verrucomicrobia]        hypothetical protein CfE428DRAFT_3831 [Chthoniobacter flavu
. . . Candidatus Kuenenia stuttgartiensis ..............   53  1 hit  [planctomycetes]         hypothetical protein [Candidatus Kuenenia stuttgartiensis]
. . . Candidatus Koribacter versatilis Ellin345 ........   53  2 hits [bacteria]               hypothetical protein Acid345_2357 [Candidatus Koribacter ve
. . . Geobacter uraniireducens Rf4 .....................   52  2 hits [d-proteobacteria]       hypothetical protein Gura_3133 [Geobacter uraniireducens Rf
. . . Bdellovibrio bacteriovorus HD100 .................   50  2 hits [d-proteobacteria]       hypothetical protein Bd1463 [Bdellovibrio bacteriovorus HD1
. . . Lentisphaera araneosa HTCC2155 ...................   50  2 hits [bacteria]               hypothetical protein LNTAR_08359 [Lentisphaera araneosa HTC
. . . Planctomyces limnophilus DSM 3776 ................   47  2 hits [planctomycetes]         hypothetical protein PlimDRAFT_45770 [Planctomyces limnophi
. . . Elusimicrobium minutum Pei191 ....................   47  2 hits [bacteria]               hypothetical protein Emin_0202 [Elusimicrobium minutum Pei1
. . . Geobacter sp. FRC-32 .............................   47  2 hits [d-proteobacteria]       hypothetical protein Geob_2172 [Geobacter sp. FRC-32] >gi|2
. . . Syntrophobacter fumaroxidans MPOB ................   46  2 hits [d-proteobacteria]       hypothetical protein Sfum_2085 [Syntrophobacter fumaroxidan
. . . Syntrophus aciditrophicus SB .....................   44  2 hits [d-proteobacteria]       hypothetical protein SYN_02375 [Syntrophus aciditrophicus S
. . . Desulfuromonas acetoxidans DSM 684 ...............   43  2 hits [d-proteobacteria]       lipoprotein, putative [Desulfuromonas acetoxidans DSM 684] 
. . . Fibrobacter succinogenes subsp. succinogenes S85 .   42  2 hits [bacteria]               hypothetical protein Fisuc_0376 [Fibrobacter succinogenes s
. . . Blastopirellula marina DSM 3645 ..................   41  2 hits [planctomycetes]         hypothetical protein DSM3645_07690 [Blastopirellula marina 
. . . Verrucomicrobium spinosum DSM 4136 ...............   41  1 hit  [verrucomicrobia]        hypothetical protein VspiD_16590 [Verrucomicrobium spinosum
. . . Planctomyces maris DSM 8797 ......................   39  2 hits [planctomycetes]         hypothetical protein PM8797T_14229 [Planctomyces maris DSM 
. . . Geobacter lovleyi SZ .............................   39  2 hits [d-proteobacteria]       lipoprotein, putative [Geobacter lovleyi SZ] >gi|189421433|
. . . Pelobacter carbinolicus DSM 2380 .................   36  2 hits [d-proteobacteria]       hypothetical protein Pcar_1414 [Pelobacter carbinolicus DSM
. . . Pasteurella dagmatis ATCC 43325 ..................   35  2 hits [g-proteobacteria]       hydrogenase maturation protein HypF [Pasteurella dagmatis A
. . . Victivallis vadensis ATCC BAA-548 ................   34  2 hits [bacteria]               hypothetical protein Vvad_PD0539 [Victivallis vadensis ATCC
. . . Hydrogenobaculum sp. SN ..........................   34  2 hits [aquificales]            hypothetical protein HydSNDRAFT_1166 [Hydrogenobaculum sp. 
. . . Stigmatella aurantiaca DW4/3-1 ...................   34  2 hits [d-proteobacteria]       hypothetical protein STIAU_5427 [Stigmatella aurantiaca DW4
. . . Methylacidiphilum infernorum V4 ..................   34  2 hits [verrucomicrobia]        hypothetical protein Minf_1576 [Methylacidiphilum infernoru
. . . Desulfitobacterium hafniense Y51 .................   33  2 hits [firmicutes]             hypothetical protein DSY4595 [Desulfitobacterium hafniense 
. . Pan troglodytes ------------------------------------   35  1 hit  [primates]               PREDICTED: similar to POM121-like 1 [Pan troglodytes]
. . Coprinopsis cinerea okayama7#130 ...................   34  2 hits [basidiomycetes]         predicted protein [Coprinopsis cinerea okayama7#130] >gi|11
. . Branchiostoma floridae .............................   34  4 hits [lancelets]              hypothetical protein BRAFLDRAFT_118068 [Branchiostoma flori
. . Giardia lamblia ATCC 50803 .........................   34  1 hit  [diplomonads]            hypothetical protein GL50803_34701 [Giardia lamblia ATCC 50
. . Mus musculus (mouse) ...............................   34 11 hits [rodents]                EGF-like, fibronectin type III and laminin G domains [Mus m
. . Dictyostelium discoideum AX4 .......................   34  2 hits [cellular slime molds]   hypothetical protein DDB_G0282251 [Dictyostelium discoideum
. . Ricinus communis ...................................   33  2 hits [eudicots]               nucleic acid binding protein, putative [Ricinus communis] >
. . Rattus norvegicus (brown rat) ......................   33  1 hit  [rodents]                vomeronasal 2 receptor, 71 [Rattus norvegicus]
. Newcastle disease virus ------------------------------   34 11 hits [viruses]                hemagglutinin-neuraminidase [Newcastle disease virus]

BLAST

PROTOCOL


1) BLASTp vs NR, default NCBI parameters * "1000 Max target sequences"



RESULTS ANALYSIS

A few results have significant E-values (under e-06), but their identities and scores are low. Given these values, and because the homologs are themselves uncharacterized (hypothetical proteins), it is impossible to predict a biological process.
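
To put a number on "low identity", the statistics line of an alignment can be parsed and related to the query length. The sketch below uses the best hit (YP_001998058.1) as sample input; the regular expression, the variable names and the coverage calculation are illustrative assumptions, with the aligned query range (residues 10 to 150) read from the alignment block for that hit.

```python
import re

STATS_LINE = "Identities = 47/148 (31%), Positives = 77/148 (52%), Gaps = 10/148 (6%)"
QUERY_LENGTH = 182  # residues in the translated ORF


def parse_stats(line):
    """Return {name: (count, alignment_length)} from a BLAST statistics line."""
    return {name: (int(num), int(den))
            for name, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", line)}


stats = parse_stats(STATS_LINE)
identical, aligned = stats["Identities"]
percent_identity = 100 * identical / aligned        # about 31.8
query_span = 150 - 10 + 1                           # residues 10-150 of the query
query_coverage = 100 * query_span / QUERY_LENGTH    # about 77.5

print(round(percent_identity, 1), round(query_coverage, 1))
```

Even for the best hit, fewer than one residue in three is identical, although the alignment covers most of the query.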

RAW RESULTS

1) BLASTp vs NR

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

ref|YP_001998058.1|  conserved hypothetical protein [Chlorobac...  65.5    2e-09
ref|YP_001943979.1|  conserved hypothetical protein [Chlorobiu...  64.3    5e-09
ref|YP_003291329.1|  hypothetical protein Rmar_2060 [Rhodother...  63.9    8e-09
ref|YP_001740456.1|  hypothetical protein; putative signal pep...  60.1    9e-08
ref|YP_002016458.1|  hypothetical protein Paes_1799 [Prostheco...  59.3    2e-07
ref|YP_001995433.1|  conserved hypothetical protein [Chloroher...  58.2    4e-07
ref|NP_662649.1|  hypothetical protein CT1770 [Chlorobium tepi...  56.6    1e-06
ref|YP_002430365.1|  hypothetical protein Dalk_1194 [Desulfati...  56.6    1e-06
ref|YP_446749.1|  hypothetical protein SRU_2651 [Salinibacter ...  55.8    2e-06
ref|ZP_03130666.1|  hypothetical protein CfE428DRAFT_3831 [Cht...  54.3    5e-06
emb|CAJ72110.1|  hypothetical protein [Candidatus Kuenenia stu...  53.5    9e-06
ref|YP_591432.1|  hypothetical protein Acid345_2357 [Candidatu...  53.5    1e-05
ref|YP_001231867.1|  hypothetical protein Gura_3133 [Geobacter...  52.8    2e-05
ref|NP_968356.1|  hypothetical protein Bd1463 [Bdellovibrio ba...  50.4    9e-05
ref|ZP_01873542.1|  hypothetical protein LNTAR_08359 [Lentisph...  50.1    1e-04
ref|YP_002019144.1|  hypothetical protein Ppha_2332 [Pelodicty...  48.9    2e-04
ref|ZP_04429847.1|  hypothetical protein PlimDRAFT_45770 [Plan...  47.8    5e-04
ref|YP_001875102.1|  hypothetical protein Emin_0202 [Elusimicr...  47.8    5e-04
ref|YP_002537627.1|  hypothetical protein Geob_2172 [Geobacter...  47.4    7e-04
ref|YP_001960395.1|  conserved hypothetical protein [Chlorobiu...  47.0    0.001
ref|YP_846203.1|  hypothetical protein Sfum_2085 [Syntrophobac...  46.2    0.002
ref|YP_374345.1|  hypothetical protein Plut_0414 [Chlorobium l...  45.8    0.002
ref|YP_460225.1|  hypothetical protein SYN_02375 [Syntrophus a...  44.3    0.006
ref|ZP_01313417.1|  lipoprotein, putative [Desulfuromonas acet...  43.9    0.007
ref|YP_912436.1|  hypothetical protein Cpha266_1999 [Chlorobiu...  42.7    0.016
ref|YP_003248470.1|  hypothetical protein Fisuc_0376 [Fibrobac...  42.7    0.016
ref|ZP_01385167.1|  conserved hypothetical protein [Chlorobium...  42.7    0.018
ref|ZP_01092660.1|  hypothetical protein DSM3645_07690 [Blasto...  41.6    0.041
ref|ZP_02928286.1|  hypothetical protein VspiD_16590 [Verrucom...  41.2    0.055
ref|ZP_01203323.1|  hypothetical protein BBFL7_01085 [Flavobac...  40.4    0.074
ref|YP_003125537.1|  hypothetical protein Cpin_5918 [Chitinoph...  40.0    0.12 
ref|ZP_01856694.1|  hypothetical protein PM8797T_14229 [Planct...  39.7    0.13 
ref|ZP_01883329.1|  hypothetical protein PBAL39_09866 [Pedobac...  39.3    0.18 
ref|YP_001952351.1|  lipoprotein, putative [Geobacter lovleyi ...  39.3    0.19 
ref|YP_001129990.1|  hypothetical protein Cvib_0466 [Prostheco...  37.4    0.77 
ref|YP_356830.1|  hypothetical protein Pcar_1414 [Pelobacter c...  36.6    1.3  
ref|YP_863597.1|  hypothetical protein GFO_3592 [Gramella fors...  35.4    2.4  
ref|ZP_05920157.1|  hydrogenase maturation protein HypF [Paste...  35.0    3.2  
ref|XP_001141769.1|  PREDICTED: similar to POM121-like 1 [Pan ...  35.0    3.5  
ref|ZP_06244868.1|  hypothetical protein Vvad_PD0539 [Victival...  34.7    4.0  
ref|XP_001840495.1|  predicted protein [Coprinopsis cinerea ok...  34.7    4.1  
gb|ACT22850.1|  hemagglutinin-neuraminidase [Newcastle disease...  34.7    4.3  
gb|ACT22838.1|  hemagglutinin-neuraminidase [Newcastle disease...  34.7    4.3  
ref|ZP_03458486.1|  hypothetical protein BACEGG_01261 [Bactero...  34.7    4.4  
gb|ACT22874.1|  hemagglutinin-neuraminidase [Newcastle disease...  34.7    4.5  
gb|ABK63994.1|  hemagglutinin-neuraminidase [Newcastle disease...  34.7    4.5  
gb|AAS67136.1|  hemagglutinin-neuraminidase [Newcastle disease...  34.7    4.5  
gb|AAQ54620.1|  hemagglutinin-neuraminidase [Newcastle disease...  34.7    4.5  
gb|ACT22856.1|  hemagglutinin-neuraminidase [Newcastle disease...  34.7    4.6  
ref|ZP_06463287.1|  hypothetical protein HydSNDRAFT_1166 [Hydr...  34.7    4.6  
gb|AAQ54642.1|  hemagglutinin-neuraminidase [Newcastle disease...  34.7    5.0  
ref|XP_002596059.1|  hypothetical protein BRAFLDRAFT_118068 [B...  34.7    5.1  
ref|ZP_01465097.1|  hypothetical protein STIAU_5427 [Stigmatel...  34.3    5.8  
gb|EFD95060.1|  hypothetical protein GL50803_34701 [Giardia la...  34.3    6.0  
ref|YP_001940228.1|  hypothetical protein Minf_1576 [Methylaci...  34.3    6.1  
gb|AAI50711.1|  EGF-like, fibronectin type III and laminin G d...  34.3    6.3  
gb|EDL03345.1|  expressed sequence AU040377, isoform CRA_a [Mu...  34.3    6.3  
gb|EDL03346.1|  expressed sequence AU040377, isoform CRA_b [Mu...  34.3    6.3  
sp|Q4VBE4.1|EGFLA_MOUSE  RecName: Full=Pikachurin; AltName: Fu...  34.3    6.3  
ref|NP_848863.1|  EGF-like, fibronectin type III and laminin G...  34.3    6.3  
ref|XP_640310.1|  hypothetical protein DDB_G0282251 [Dictyoste...  34.3    6.3  
gb|AAQ54622.1|  hemagglutinin-neuraminidase [Newcastle disease...  34.3    6.5  
ref|XP_002513526.1|  nucleic acid binding protein, putative [R...  33.9    6.8  
ref|YP_520828.1|  hypothetical protein DSY4595 [Desulfitobacte...  33.9    7.2  
ref|ZP_01385024.1|  SMC protein-like [Chlorobium ferrooxidans ...  33.9    8.1  
gb|ACT22868.1|  hemagglutinin-neuraminidase [Newcastle disease...  33.5    9.1  
ref|NP_001092986.1|  vomeronasal 2 receptor, 71 [Rattus norveg...  33.5    9.1  
ref|XP_002596060.1|  hypothetical protein BRAFLDRAFT_66209 [Br...  33.5    9.3  

ALIGNMENTS
>ref|YP_001998058.1| conserved hypothetical protein [Chlorobaculum parvum NCIB 8327]
 gb|ACF10858.1| conserved hypothetical protein [Chlorobaculum parvum NCIB 8327]
Length=172

 Score = 65.5 bits (158),  Expect = 2e-09, Method: Compositional matrix adjust.
 Identities = 47/148 (31%), Positives = 77/148 (52%), Gaps = 10/148 (6%)

Query  10   LLLNGCAIYSLAG-SIPPHIKSISIPLVDNETSEYGLEQI---VTDRILEKFNQEGVLKV  65
            ++L GC  YS +G SIP H+ ++++PL D+  S+ G+ +     T R++ K   +  L +
Sbjct  20   VILQGC--YSFSGASIPEHLHTVAVPLFDD-ASQAGIAEFRERCTRRLVNKVEAQSDLSI  76

Query  66   LNE-EVANSILKGTI-SKIDENPYTYNKEESVSEYRYKVFLKLEWYDVLNNVNLISKTYT  123
              +   AN++LKG I S  DE     +  E     R  + LK E+ D++ +  L S+T+ 
Sbjct  77   EPDLSRANAVLKGVILSYADEPSQLGSSTERAVTNRVTIVLKAEFEDMVKHEQLFSQTFV  136

Query  124  GFGSYGLSGDISNDG-IDNDNDGKIDEL  150
            GF  Y      +  G ID+  D  +DEL
Sbjct  137  GFADYQAGSYSAQQGAIDSAIDMAVDEL  164


>ref|YP_001943979.1| conserved hypothetical protein [Chlorobium limicola DSM 245]
 gb|ACD91000.1| conserved hypothetical protein [Chlorobium limicola DSM 245]
Length=178

 Score = 64.3 bits (155),  Expect = 5e-09, Method: Compositional matrix adjust.
 Identities = 48/185 (25%), Positives = 88/185 (47%), Gaps = 31/185 (16%)

Query  3    FLIFNLILLLNGCAIYSLAGSIPPHIKSISIPLVDNETSEYGLEQI---VTDRILEKFNQ  59
            FL F  I +L GC  +S  GS+P H+K++++P+  +  S+ G+      +T R+ EK   
Sbjct  19   FLAFT-IAILQGCYSFS-GGSVPSHMKTVAVPVFQDR-SQAGIAPFRAELTRRLTEKIES  75

Query  60   EGVLKVL-NEEVANSILKGTISKIDENPYTYN-KEESVSEYRYKVFLKLEWYDVLNNVNL  117
            +  L++  +   A+ +L+GTI+   + P   + K E     R  + +   + D +N   L
Sbjct  76   QSPLRMTPSRATADGLLEGTITTFSDEPSQLSSKTERAITNRITITVNAAFQDRVNKTML  135

Query  118  ISKTYTGFGSYGLSGDISNDGIDNDNDGKIDELDDDEFGDPRSFAARIAMNKIAEDILND  177
              +T+TGF  Y + G  S         GK + +               ++  IA DIL++
Sbjct  136  FERTFTGFADYSV-GSFS---------GKQEAIMQ-------------SLETIATDILDN  172

Query  178  IMTTW  182
            +++ W
Sbjct  173  MLSDW  177


>ref|YP_003291329.1| hypothetical protein Rmar_2060 [Rhodothermus marinus DSM 4252]
 gb|ACY48941.1| hypothetical protein Rmar_2060 [Rhodothermus marinus DSM 4252]
Length=180

 Score = 63.9 bits (154),  Expect = 8e-09, Method: Compositional matrix adjust.
 Identities = 38/133 (28%), Positives = 74/133 (55%), Gaps = 9/133 (6%)

Query  10   LLLNGCAIYSLAGS-IPPHIKSISIPLVDNETSE--YGLEQIVTDRILEKF-NQEGVLKV  65
            L L GCA YS  G+ IP H+++I+IPLV++ ++     L+Q +T+ ++E+F NQ  +   
Sbjct  25   LALGGCAYYSFTGAAIPSHLQTIAIPLVEDRSNSPFTNLDQQLTELLIERFVNQTRLTLE  84

Query  66   LNEEVANSILKGTISKIDENPYTYNKEESVSEYRYKVFLKLEWYDVLNNVNLISKTYTGF  125
             + E A+++L+  I +    P      E     R  + + + + D +N+  L++++++ F
Sbjct  85   PDPEAADALLEVRIDRYTNEPTAVGGAERAERNRVTITVTVRYVDQVNDQVLLARSFSAF  144

Query  126  GSY-----GLSGD  133
              Y     GL+G+
Sbjct  145  EEYDPIAQGLAGE  157


>ref|YP_001740456.1| hypothetical protein; putative signal peptide [Candidatus Cloacamonas acidaminovorans]
 emb|CAO80249.1| hypothetical protein; putative signal peptide [Candidatus Cloacamonas acidaminovorans]
Length=166

 Score = 60.1 bits (144),  Expect = 9e-08, Method: Compositional matrix adjust.
 Identities = 39/120 (32%), Positives = 66/120 (55%), Gaps = 9/120 (7%)

Query  2    KFLIFNLILLLNGCAIYSLAGSIPPHIKSISIPLVDNETSEYGLEQIVTDRILEKFNQE-  60
            KF+   L+L++     YS+  +  PH++ I I   +N TSEYGL     D++L   N+E 
Sbjct  4    KFIYAILLLVILSSCHYSVYSNAYPHLRKIRILAFENRTSEYGL----GDKLLNYLNREI  59

Query  61   ---GVLKVLNEEVANSILKGTISKIDENPYTYNKEESVSEYRYKVFLKLEWYDVLNNVNL  117
               G LK++ E+  +  L+G I    EN Y+Y+    V +Y+ K+ + + + D++NN  L
Sbjct  60   REDGRLKLVTED-PDCTLEGAILSFSENVYSYDTANQVQDYQVKMVVSVTFTDLINNTVL  118


>ref|YP_002016458.1| hypothetical protein Paes_1799 [Prosthecochloris aestuarii DSM 271]
 gb|ACF46811.1| conserved hypothetical protein [Prosthecochloris aestuarii DSM 271]
Length=168

 Score = 59.3 bits (142),  Expect = 2e-07, Method: Compositional matrix adjust.
 Identities = 48/159 (30%), Positives = 81/159 (50%), Gaps = 13/159 (8%)

Query  4    LIFNLILLLNGCAIYSLAGS-IPPHIKSISIPLVDNETSEYGLEQI---VTDRILEKFNQ  59
            L+F  + +++GC  YS +GS +P HI +++IPL   +TS  G+ Q+   +TD +  K   
Sbjct  9    LLFLCMFMMHGC--YSFSGSSVPAHINTVAIPLF-GDTSGAGVAQLTIKLTDMVHRKVEG  65

Query  60   EGVLKV-LNEEVANSILKGTI-SKIDENPYTYNKEESVSEYRYKVFLKLEWYDVLNNVNL  117
            E  L++  N + A+++L+G I S  DE     ++ E  S  R  + +K  + D L +  L
Sbjct  66   ESRLQIEPNRDRADAVLEGVIVSYNDEASQLSSETERASTNRITLVVKAVFRDRLEDKEL  125

Query  118  ISKT-YTGFGSYGLSGDISNDGIDNDNDGKIDELDDDEF  155
               T +TGF  Y      S  G        ++++ DD F
Sbjct  126  FPTTSFTGFADYSAG---SYAGQQEAIRSSVEQISDDIF  161


>ref|YP_001995433.1| conserved hypothetical protein [Chloroherpeton thalassium ATCC 35110]
 gb|ACF12986.1| conserved hypothetical protein [Chloroherpeton thalassium ATCC 35110]
Length=187

 Score = 58.2 bits (139),  Expect = 4e-07, Method: Compositional matrix adjust.
 Identities = 40/147 (27%), Positives = 77/147 (52%), Gaps = 10/147 (6%)

Query  12   LNGCAIYSLAGSIPPHIKSISIPLVDNETSEYGL----EQIVTDRILEKFNQEGVLKVLN  67
            L+GC  +S  GS+P HIK+I+IP+  +E+S  G+    EQ+ T  +    +Q  ++   +
Sbjct  35   LSGCYSFS-EGSLPSHIKTIAIPVFGDESSS-GIGQLREQLTTKMVDHVQSQSSLVIEQD  92

Query  68   EEVANSILKGTI-SKIDENPYTYNKEESVSEYRYKVFLKLEWYDVLNNVNLISKTYTGFG  126
              +ANS+++  I S  DE     +  E  ++ R  + + + + D++    L ++++TG  
Sbjct  93   RRLANSVMEAVIKSYSDEPSQVSSTTERATQNRISITISVTYKDLVTKKTLFTQSFTGIS  152

Query  127  SYGLSGDISNDGIDNDNDGKIDELDDD  153
             Y + GD +        +  ID++ DD
Sbjct  153  DYAI-GDFTAQ--QESIEEAIDQVSDD  176


>ref|NP_662649.1| hypothetical protein CT1770 [Chlorobium tepidum TLS]
 gb|AAM72991.1| hypothetical protein CT1770 [Chlorobium tepidum TLS]
Length=169

 Score = 56.6 bits (135),  Expect = 1e-06, Method: Compositional matrix adjust.
 Identities = 34/121 (28%), Positives = 64/121 (52%), Gaps = 7/121 (5%)

Query  13   NGCAIYSLAGSIPPHIKSISIPLVDNETSEYGLEQI---VTDRILEKFNQEGVLKV-LNE  68
             GC  +S  G++PPH+ ++++PL D +T++ G+ +    +T  ++ K   +  L +  + 
Sbjct  20   QGCYSFS-GGALPPHLHTVAVPLFD-DTTQAGIAEFREGITRSLINKIESQSTLSIEPDP  77

Query  69   EVANSILKGTI-SKIDENPYTYNKEESVSEYRYKVFLKLEWYDVLNNVNLISKTYTGFGS  127
              A+++LKG I S  DE     +  E     R  + L+ ++ D + N  L S+T+ GF  
Sbjct  78   SRADAVLKGAIVSYSDEPSQLGSATERAVTNRITIVLQADFDDQVKNSKLFSQTFVGFAD  137

Query  128  Y  128
            Y
Sbjct  138  Y  138


>ref|YP_002430365.1| hypothetical protein Dalk_1194 [Desulfatibacillum alkenivorans AK-01]
 gb|ACL02897.1| conserved hypothetical protein [Desulfatibacillum alkenivorans AK-01]
Length=175

 Score = 56.6 bits (135),  Expect = 1e-06, Method: Compositional matrix adjust.
 Identities = 36/107 (33%), Positives = 62/107 (57%), Gaps = 6/107 (5%)

Query  9    ILLLNGCAI-YSLAGSIPP--HIKSISIPLVDNETSEYGLEQIVTDRILEKFNQEGVLKV  65
            +L+L+ CA  Y  +G   P   ++++ IP++DNET+E GLE ++T+ ++ +F +E    +
Sbjct  12   VLMLSLCACGYHFSGGARPGDPVRAVFIPVLDNETAETGLETMITNGLIREFTREQKFTL  71

Query  66   LNEEV-ANSILKGTI-SKIDENPYTYNKEESVSEYRYKVFLKLEWYD  110
             +    AN +L G+I S +DEN    +  +S    R K+ L LE  D
Sbjct  72   AHSRAEANVLLAGSIASLLDENAARRSSGDSALR-RVKMILSLELLD  117


>ref|YP_446749.1| hypothetical protein SRU_2651 [Salinibacter ruber DSM 13855]
 gb|ABC44937.1| conserved hypothetical protein [Salinibacter ruber DSM 13855]
Length=160

 Score = 55.8 bits (133),  Expect = 2e-06, Method: Compositional matrix adjust.
 Identities = 39/175 (22%), Positives = 84/175 (48%), Gaps = 25/175 (14%)

Query  12   LNGCAIYSLAG-SIPPHIKSISIPLVDNETSEY--GLEQIVTDRILEKFNQEGVLKVLNE  68
            L GC +YS +G SIP ++++IS+P+  N TS     L   +TD + ++F     L +  +
Sbjct  7    LPGCTVYSFSGASIPSNLETISVPIAQNNTSSPVNRLGADLTDLLTDRFVDRTRLSLTTD  66

Query  69   EV-ANSILKGTISKIDENPYTYNKEESVSEYRYKVFLKLEWYDVLNNVNLISKTYTGFGS  127
            +  A+++L+  I +    P   + +E  +  +  + +++ + D     +L+++T++G  +
Sbjct  67   DAGADALLRARIQRYTNEPTGVSGDERATTNQVTIRVQVRYVDQTKGEDLLNQTFSGAAN  126

Query  128  YGLSGDISNDGIDNDNDGKIDELDDDEFGDPRSFAARIAMNKIAEDILNDIMTTW  182
            Y    +    G+D +                   AA+ A+  +A+DI +   + W
Sbjct  127  Y----NPVEAGLDGERQ-----------------AAQSALENVADDIFSTATSNW  160


>ref|ZP_03130666.1| hypothetical protein CfE428DRAFT_3831 [Chthoniobacter flavus Ellin428]
 gb|EDY18446.1| hypothetical protein CfE428DRAFT_3831 [Chthoniobacter flavus Ellin428]
Length=174

 Score = 54.3 bits (129),  Expect = 5e-06, Method: Compositional matrix adjust.
 Identities = 40/147 (27%), Positives = 73/147 (49%), Gaps = 13/147 (8%)

Query  3    FLIFNLILLLNGCAIYSLAGSIPPH----IKSISIPLVDNETSEYGLEQIVTDRILEKFN  58
            FL+     +  GCA Y + G + P     I  I+IP   N+T E  +E I+   ++ +F 
Sbjct  5    FLLSLAAFVFTGCAGYHI-GPVQPKFMEGIHKIAIPTFRNDTLEPRVETILATTVINQFQ  63

Query  59   QEGVLKVLNEEVANSILKGTISKIDENPYT-------YNKEESVS-EYRYKVFLKLEWYD  110
            Q+G  ++++E+ A++IL+GT+  +  NP           KE +++   R+K+  K     
Sbjct  64   QDGTYQIVDEKDADAILEGTLDVLQRNPARSVRGNVLLTKEYTLNVRCRFKLTKKSTGVI  123

Query  111  VLNNVNLISKTYTGFGSYGLSGDISND  137
            V   V   + ++   GS  +S D++ D
Sbjct  124  VDQRVVTGTTSFYATGSDSVSQDVNQD  150


>emb|CAJ72110.1| hypothetical protein [Candidatus Kuenenia stuttgartiensis]
Length=181

 Score = 53.5 bits (127),  Expect = 9e-06, Method: Compositional matrix adjust.
 Identities = 33/122 (27%), Positives = 68/122 (55%), Gaps = 10/122 (8%)

Query  4    LIFNLILL-LNGCAIYSLAGSIPPHIKSISIPLVDNET----SEYGLEQIVTDRILEKFN  58
            L+F +I   + GC  YS    +  +++SI +P+ DN+T     E+ L + V D++L + N
Sbjct  24   LVFPMIFFFITGCG-YSSKSLLRSNVRSIYVPIFDNDTFRRGYEFDLTRAVRDQLLLRTN  82

Query  59   QEGVLKVLNEEVANSILKGTISKIDENPYTYNKEESVSEYRYKVFLKLEWYDVLNNVNLI  118
                L++++++ A+SIL G IS ++EN    ++++++ E R  +   + W D      ++
Sbjct  83   ----LRIVDKDEADSILFGKISSVNENVLIEDRKDNIVESRVTIRADIRWVDTRTGRAIV  138

Query  119  SK  120
             +
Sbjct  139  ER  140


>ref|YP_591432.1| hypothetical protein Acid345_2357 [Candidatus Koribacter versatilis Ellin345]
 gb|ABF41358.1| hypothetical protein Acid345_2357 [Candidatus Koribacter versatilis Ellin345]
Length=175

 Score = 53.5 bits (127),  Expect = 1e-05, Method: Compositional matrix adjust.
 Identities = 34/131 (25%), Positives = 64/131 (48%), Gaps = 7/131 (5%)

Query  9    ILLLNGCAIYSLAGS---IPPHIKSISIPLVDNETSEYGLEQIVTDRILEKFNQEGVLKV  65
            ++L  GC  Y  AG    IPP +K+ISIP+ +N+T  Y +EQI+T+ ++ + +     ++
Sbjct  15   VVLSTGCG-YHEAGKAVRIPPDVKTISIPIFENQTKTYHIEQIITNAVIHEMSSRTKYRI  73

Query  66   LNEEV--ANSILKGTISKIDENPYTYNKEE-SVSEYRYKVFLKLEWYDVLNNVNLISKTY  122
            ++      ++ + G +   +  P TY+ +    +     V   +++ D    V   +  Y
Sbjct  74   VSSPSDDVDATITGKVLTTESTPATYDSQTGRATTALVTVTASVKFVDRHGKVLFENNNY  133

Query  123  TGFGSYGLSGD  133
            T    Y LS D
Sbjct  134  TFRDQYQLSAD  144