GOS 2185010

Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : JCVI_READ_1091118858896
Annotathon code: GOS_2185010
Sample :
  • GPS :24°10'29n; 84°20'40w
  • Caribbean Sea: Gulf of Mexico - USA
  • Coastal Sea (-2m, 26.4°C, 0.1-0.8 microns)
Authors
Team : Algarve 2011
Username : vmarques
Annotated on : 2011-05-23 16:15:12
  • Marques Vera Cristina Jordão

Synopsis

  • Taxonomy: Proteobacteria (NCBI info)
    Rank: phylum - Genetic Code: Bacterial and Plant Plastid - NCBI Identifier: 1224
    Kingdom: Bacteria - Phylum: Proteobacteria - Class: - Order:
    Bacteria; Proteobacteria;

Genomic Sequence

>JCVI_READ_1091118858896 GOS_2185010 Genomic DNA
GATTTGCTTCGAAAAACTCCCCCAAGCAAAAAGGATGTTGGATTTATTCATTCGTTTCACGGACCATCAAATTAGTGTGTGCAATCATTTCTAAATCATT
CCAGATATATCAAAACCAGCACACCAGTATAGGTATAAAGTTGAAGGCCGCTGGGAACCCTCGCCATTTCATAACCTCCAAACATTCCAGCCAATGGGAA
ATCTCCCAATTGCTCATGGATCAGTTGGATATCCACATCCGATTTTCCATATAAGGCTTCCCCCCTCGCAGCACAATTAAAATAAATTCCAAAACTGGGC
GGTTTTTCCTTTTTCATCCTCTTCAACATTGCATTCAGGTCGTCCCTGGCACTGATCGCACTTCGATAGGCAAAAGATACCACGGTTCCCTTTTCAATAA
TTTGGGAAAACTGAAGCCCCTGTCGTGCCACATCGATTCCTGTAAGATGCCTGACCATCGAAGATTCCCCTTCAAATTTCGGCTCTTCAGGGTCAAGTGG
GAAACTGATCAACAATTGTCTGGCAGCATTCTCAAGATTGTCAAATTCAAGTTCTGAAGCCACCCTGCTGAAAACTTCCAAAGCTGGAGTTCCATCCAGA
GTGAGGACTAAATCATCCTCAAAATCAGTCACGAACATCGGGTCACCAATCATCTTACAGGATTGGGTCACTCCTGCTGTAAACTCAGGAACCCCCTTGA
AACCCATGCCTCCAGCACCATTGACAACAACCCCTTCGGCACCGAATTGAACGGAAATCCTGCCTGATCCATCATCACAGGATCCTGCTCCAAAAACCAT
TGGGTCTGATTGGACATAATTCAGCATGTTGATAAAATTATAGGGCTGATGCTGATAGACATCCGGGAAAAACAGGAACAAGGGTTTTCTCCCG

Translation

[2 - 796/894]   indirect strand
>GOS_2185010 Translation [2-796   indirect strand]
GRKPLFLFFPDVYQHQPYNFINMLNYVQSDPMVFGAGSCDDGSGRISVQFGAEGVVVNGAGGMGFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLD
GTPALEVFSRVASELEFDNLENAARQLLISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQFSQIIEKGTVVSFAYRSAISARDDLNAMLKRMKKEKPPS
FGIYFNCAARGEALYGKSDVDIQLIHEQLGDFPLAGMFGGYEMARVPSGLQLYTYTGVLVLIYLE

[ Warning ] 5' incomplete: does not start with a Methionine
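The first residues of this translation can be checked by hand. The following Python sketch reverse-complements the last 22 bases of the read and translates from the second base of the reverse strand, using the standard genetic code. The function names are illustrative assumptions and are not part of the Annotathon tools.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from the first base; unknown codons become X."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


# Last 22 bases of the read, copied from the genomic sequence above.
read_tail = "CAGGAACAAGGGTTTTCTCCCG"
reverse_strand = reverse_complement(read_tail)
# Reading frame 2 on the reverse strand starts at its second base.
protein_start = translate(reverse_strand[1:])
print(protein_start)
```

The printed peptide can be compared with the first residues of the translation; it starts with a glycine codon (GGG), not with ATG, which is what the warning reports.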

Annotator commentaries

Although its homologs are hypothetical proteins or proteins of unknown function, I chose ORF number 1 in reading frame 2 on the reverse strand because its e-values are relatively good (below 7e-04) and its translation is longer than 200 aa (265 aa, from a 798-nucleotide ORF that includes the stop codon). However, the sequence is incomplete at the 5' end: it starts with a G (glycine) instead of an M (methionine).


From the tree analysis, I conclude that the chosen sequence probably belongs to the Proteobacteria. I could not be more specific because of the low number of organisms in the ingroup.


It is a sequence with known biological function.


The protein domain analysis suggests that the sequence has two domains, FIST (F-box and intracellular signal transduction) and FIST C (the C-terminal FIST domain), with e-values of 3.2e-11 and 9.4e-22, respectively.

Therefore, the molecular function may be signal transducer activity, involving the binding of small ligands such as amino acids.

On investigating these domains further, I found that FIST can have several functions (1), which is why I considered the biological process to be unknown.




(1) Borziak, K., & Zhulin, I. B. (2007). FIST: a sensory domain for diverse signal transduction pathways in prokaryotes and ubiquitin signaling in eukaryotes. Bioinformatics 23, 2518-2521.



ORF finding

PROTOCOL


a) SMS ORFinder / direct strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code


b) SMS ORFinder / reverse strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code




RESULTS ANALYSIS


This sequence has one ORF in reading frame 1 of the direct strand, one in reading frame 2 of the same strand, and two ORFs in reading frame 3. The reverse strand has only one ORF, in reading frame 2.

An ORF found with "any codon" as the initiation codon and a start position beyond position 3 cannot be incomplete at the 5' end, because a STOP codon lies just upstream of it. I can therefore conclude that:

ORF number 1 in reading frame 1 on the direct strand is complete at both the 5' end and the 3' end.

ORF number 1 in reading frame 2 on the direct strand is also complete at the 5' end and, since its translation ends with a stop codon, at the 3' end.

ORF number 1 in reading frame 3 on the direct strand is incomplete at the 5' end but complete at the 3' end.

ORF number 2 in reading frame 3 on the direct strand is complete at both the 5' end and the 3' end.

ORF number 1 in reading frame 2 on the reverse strand is incomplete at the 5' end but complete at the 3' end, where it ends with a stop codon (this is the ORF analyzed).
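The rule used in the list above (an ORF is bounded at the 5' end when a stop codon precedes it in the same frame, and at the 3' end when it ends with a stop codon) can be sketched in Python. This is a minimal illustration under the assumption that an ORF is any stretch of codons between two in-frame stops; it is not the SMS ORF Finder implementation, and the function name and dictionary keys are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, frame, min_codons=60):
    """Scan one reading frame (1, 2 or 3) with 'any codon' initiation.

    Returns a list of dicts with 1-based start/end coordinates. 'end'
    includes the stop codon when there is one.
    """
    seq = seq.upper()
    orfs = []
    start = frame - 1           # 0-based index of the current ORF's first base
    upstream_stop = False       # True once a stop has been seen in this frame
    pos = start
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in STOPS:
            if (pos - start) // 3 >= min_codons:
                orfs.append({"start": start + 1, "end": pos + 3,
                             "upstream_stop": upstream_stop,
                             "ends_with_stop": True})
            start = pos + 3
            upstream_stop = True
        pos += 3
    # Trailing stretch that runs off the end of the read without a stop.
    if (pos - start) // 3 >= min_codons:
        orfs.append({"start": start + 1, "end": pos,
                     "upstream_stop": upstream_stop,
                     "ends_with_stop": False})
    return orfs


# Toy sequence (not from the read): ATG AAA TAG | CCC GGG TTT TGA | CC
toy = "ATGAAATAGCCCGGGTTTTGACC"
toy_orfs = find_orfs(toy, frame=1, min_codons=2)
for orf in toy_orfs:
    print(orf)
```

In the toy sequence, the first ORF starts at the edge of the sequence, so it has no upstream stop and could be truncated at its 5' end, whereas the second ORF is bounded by stops on both sides.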


Apparently none of the ORFs, except the longest one (ORF number 1 in reading frame 2 on the reverse strand), has biological significance. The other ORFs overlap with the chosen ORF.


The ORF analyzed is the longest and the only one with biological significance, based on the BLASTp and protein domain results.

It runs from base 2 to base 799 on the reverse strand. It is coding and incomplete at the 5' end, which means that the ORF lacks a START codon.

The multiple alignment confirms that this ORF is incomplete: it does not start with a Met codon, and many gaps precede it.


RAW RESULTS

a) direct strand

>ORF number 1 in reading frame 1 on the direct strand extends from base 415 to base 750.
AGCCCCTGTCGTGCCACATCGATTCCTGTAAGATGCCTGACCATCGAAGATTCCCCTTCA
AATTTCGGCTCTTCAGGGTCAAGTGGGAAACTGATCAACAATTGTCTGGCAGCATTCTCA
AGATTGTCAAATTCAAGTTCTGAAGCCACCCTGCTGAAAACTTCCAAAGCTGGAGTTCCA
TCCAGAGTGAGGACTAAATCATCCTCAAAATCAGTCACGAACATCGGGTCACCAATCATC
TTACAGGATTGGGTCACTCCTGCTGTAAACTCAGGAACCCCCTTGAAACCCATGCCTCCA
GCACCATTGACAACAACCCCTTCGGCACCGAATTGA

>Translation of ORF number 1 in reading frame 1 on the direct strand.
SPCRATSIPVRCLTIEDSPSNFGSSGSSGKLINNCLAAFSRLSNSSSEATLLKTSKAGVP
SRVRTKSSSKSVTNIGSPIILQDWVTPAVNSGTPLKPMPPAPLTTTPSAPN*

>ORF number 1 in reading frame 2 on the direct strand extends from base 95 to base 280.
ATCATTCCAGATATATCAAAACCAGCACACCAGTATAGGTATAAAGTTGAAGGCCGCTGG
GAACCCTCGCCATTTCATAACCTCCAAACATTCCAGCCAATGGGAAATCTCCCAATTGCT
CATGGATCAGTTGGATATCCACATCCGATTTTCCATATAAGGCTTCCCCCCTCGCAGCAC
AATTAA

>Translation of ORF number 1 in reading frame 2 on the direct strand.
IIPDISKPAHQYRYKVEGRWEPSPFHNLQTFQPMGNLPIAHGSVGYPHPIFHIRLPPSQH
N*

>ORF number 1 in reading frame 3 on the direct strand extends from base 3 to base 254.
TTTGCTTCGAAAAACTCCCCCAAGCAAAAAGGATGTTGGATTTATTCATTCGTTTCACGG
ACCATCAAATTAGTGTGTGCAATCATTTCTAAATCATTCCAGATATATCAAAACCAGCAC
ACCAGTATAGGTATAAAGTTGAAGGCCGCTGGGAACCCTCGCCATTTCATAACCTCCAAA
CATTCCAGCCAATGGGAAATCTCCCAATTGCTCATGGATCAGTTGGATATCCACATCCGA
TTTTCCATATAA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
FASKNSPKQKGCWIYSFVSRTIKLVCAIISKSFQIYQNQHTSIGIKLKAAGNPRHFITSK
HSSQWEISQLLMDQLDIHIRFSI*

>ORF number 2 in reading frame 3 on the direct strand extends from base 255 to base 557.
GGCTTCCCCCCTCGCAGCACAATTAAAATAAATTCCAAAACTGGGCGGTTTTTCCTTTTT
CATCCTCTTCAACATTGCATTCAGGTCGTCCCTGGCACTGATCGCACTTCGATAGGCAAA
AGATACCACGGTTCCCTTTTCAATAATTTGGGAAAACTGAAGCCCCTGTCGTGCCACATC
GATTCCTGTAAGATGCCTGACCATCGAAGATTCCCCTTCAAATTTCGGCTCTTCAGGGTC
AAGTGGGAAACTGATCAACAATTGTCTGGCAGCATTCTCAAGATTGTCAAATTCAAGTTC
TGA

>Translation of ORF number 2 in reading frame 3 on the direct strand.
GFPPRSTIKINSKTGRFFLFHPLQHCIQVVPGTDRTSIGKRYHGSLFNNLGKLKPLSCHI
DSCKMPDHRRFPFKFRLFRVKWETDQQLSGSILKIVKFKF*


-------------------------------------------------------------------------------
b)reverse strand

No ORFs were found in reading frame 1.

>ORF number 1 in reading frame 2 on the reverse strand extends from base 2 to base 799.
GGGAGAAAACCCTTGTTCCTGTTTTTCCCGGATGTCTATCAGCATCAGCCCTATAATTTT
ATCAACATGCTGAATTATGTCCAATCAGACCCAATGGTTTTTGGAGCAGGATCCTGTGAT
GATGGATCAGGCAGGATTTCCGTTCAATTCGGTGCCGAAGGGGTTGTTGTCAATGGTGCT
GGAGGCATGGGTTTCAAGGGGGTTCCTGAGTTTACAGCAGGAGTGACCCAATCCTGTAAG
ATGATTGGTGACCCGATGTTCGTGACTGATTTTGAGGATGATTTAGTCCTCACTCTGGAT
GGAACTCCAGCTTTGGAAGTTTTCAGCAGGGTGGCTTCAGAACTTGAATTTGACAATCTT
GAGAATGCTGCCAGACAATTGTTGATCAGTTTCCCACTTGACCCTGAAGAGCCGAAATTT
GAAGGGGAATCTTCGATGGTCAGGCATCTTACAGGAATCGATGTGGCACGACAGGGGCTT
CAGTTTTCCCAAATTATTGAAAAGGGAACCGTGGTATCTTTTGCCTATCGAAGTGCGATC
AGTGCCAGGGACGACCTGAATGCAATGTTGAAGAGGATGAAAAAGGAAAAACCGCCCAGT
TTTGGAATTTATTTTAATTGTGCTGCGAGGGGGGAAGCCTTATATGGAAAATCGGATGTG
GATATCCAACTGATCCATGAGCAATTGGGAGATTTCCCATTGGCTGGAATGTTTGGAGGT
TATGAAATGGCGAGGGTTCCCAGCGGCCTTCAACTTTATACCTATACTGGTGTGCTGGTT
TTGATATATCTGGAATGA

>Translation of ORF number 1 in reading frame 2 on the reverse strand.
GRKPLFLFFPDVYQHQPYNFINMLNYVQSDPMVFGAGSCDDGSGRISVQFGAEGVVVNGA
GGMGFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASELEFDNL
ENAARQLLISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQFSQIIEKGTVVSFAYRSAI
SARDDLNAMLKRMKKEKPPSFGIYFNCAARGEALYGKSDVDIQLIHEQLGDFPLAGMFGG
YEMARVPSGLQLYTYTGVLVLIYLE*

No ORFs were found in reading frame 3.


Multiple Alignment

PROTOCOL


a) Phylogeny.fr/ T-coffee



RESULTS ANALYSIS


The ORF begins at position 59 and is incomplete at the 5' end, as shown by the gaps inserted there.

All the other sequences in the multiple alignment begin upstream of the GOS start position, and all of them are complete at the 5' end.

All the sequences have similar lengths.


Conserved regions only exist from position 244 onwards, which is where the multiple alignment is considered good.

Comparing the protein domain results with the alignment suggests that they do not agree, because the domains end at position 244, where the conserved regions begin.


However, the alignment is weak, probably because the proteins are hypothetical.
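When comparing alignment positions with domain coordinates, note that alignment columns count gaps, whereas domain coordinates count residues of the ungapped protein. A small Python sketch of the conversion is given here, using a toy aligned row rather than the real alignment; the function name is a hypothetical helper.

```python
def residue_index(aligned_row, column):
    """Map a 1-based alignment column to a 1-based residue number.

    Returns None when the sequence has a gap at that column.
    """
    if aligned_row[column - 1] == "-":
        return None
    # Count the non-gap characters up to and including the column.
    return sum(1 for char in aligned_row[:column] if char != "-")


# Toy aligned row (not taken from the alignment): residues M, K, L, V
toy_row = "--MK-LV"
print(residue_index(toy_row, 3))   # column 3 holds the first residue
print(residue_index(toy_row, 5))   # column 5 is a gap
print(residue_index(toy_row, 7))   # column 7 holds the fourth residue
```

Applied to the GOS row of the alignment, a function like this would express the Gblocks flank columns in the same residue coordinates as the domain boundaries before the two are compared.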





RAW RESULTS



Gblocks 0.91b Results
 
Processed file: input.fasta
 Number of sequences: 10
Alignment assumed to be: Protein
 New number of positions: 127 (selected positions are underlined in blue) 

                         10        20        30        40        50        60
                 =========+=========+=========+=========+=========+=========+
GOS_2185010      ------------------------------------------------------------
G._violaceus_ou  --MAGFASD-TMHWAGALSRRPTVAEALTEA-----TRAIRSQM----------------
A._Cellulolytic  ----------MARFGDGLVADVDLVRAAEAA-----ARAARMPL----------------
Synechococcus_s  -----------MQWVNALSQLPSLEAALRQV-----VEEAKAKLQVAQLKSALTRQVQAV
C._akajimensis   ----------MPILPNH-AAVTHFPFPFSEAEIQRWSAQQRREL----------------
M._tuberculosis  -----------MRIGVGVCTTPDARQAAVEA-----AGQARDEL----------------
R.centenum       MSITDLAGRTDTGFASALAVGADWADAAKNC-----LAALP-----------------S-
Acidovorax_sp.   ----------MRLFPNAHATHPQWHMAAVLV-----LAQLRAQM-----------ALPQY
Beggiatoa_sp.    ------MEK----FQFGHASAKNWQQATQAC-----LNQMSNLT-----------A----
R._eutropha      ------MH--DVAFVHAHAANARWQDALVDC-----RQQLEAQL-----------AVQHA
                                                                             


                         70        80        90       100       110       120
                 =========+=========+=========+=========+=========+=========+
GOS_2185010      ------------------GRKPLFLFFPDVYQHQPYNFINML------------------
G._violaceus_ou  -A--------------GRRVDLLFVFASPDFAQGAGQWLGGLQREL-ACRVQIGCSGGGI
A._Cellulolytic  -G--------------GHNPDLALVFVCGDDPAETARALERAAAAV-HARTVIGCSASGV
Synechococcus_s  GSGSLPGLLTGYVPSQPLRPNLGILFVSAAFASEYIRVLPLLSELL-EVDVLIGCSGGGI
C._akajimensis   -G--------------G-PATFALIFCSQEHVDDISDLIEIVQIYA-HVPTVVGCSGVGL
M._tuberculosis  -A--------------GEAPSLAVLLGSRAHTDRAADVLSAVLQMI-DPPALVGCIAQAI
R.centenum       ----------------APGANLGFVYVTDPLADQLSSIVTLFRGVT-GVDQWVGSVGMGI
Acidovorax_sp.   ----------------ASAPTLGLLYITDHYADDAAALLDLLRRELPTVTDWAGAVGVGV
Beggiatoa_sp.    -----------------SDNHLGFIYITDLFANYVYDILDYLKQQT-NIPHWVGTVGIGI
R._eutropha      RLGS------------AVALTLGWCYLSDYYATAAERILDALQQEW-PGVAWVGTAAMGI
                                                                             


                        130       140       150       160       170       180
                 =========+=========+=========+=========+=========+=========+
GOS_2185010      ------------------------------------------------------------
G._violaceus_ou  IGAGSEVEGPSALSLLAAHLPGVELRPF--WLKAEELPDLDSS--PKTWENLMEISAGAA
A._Cellulolytic  IGAGRAVERRAAASVWAGVLPGVRIRAF--HLEVIRTPQGMAV--LGLPP-----VDDAD
Synechococcus_s  VGGGHEIEEGPALSLSLAVLPDVALHPF--YLRGNQLPDLDAP--PSAWIDLVGVLPQSK
C._akajimensis   IANSDEIENDAGVSIALYRLPGTQAIAH--HIPTSCFGTVDTP--AS-FKRDLGSSLDQA
M._tuberculosis  VAGRHEIEDEPAVVVWLAS--GLAAETFQLDFVRTGSGA--LI--TGYRF-----DRTAR
R.centenum       VGRGTEVFDHPAIAVLLTTLPADAFRLF--PAVSDGMAPLEEA--AGTWL-----ATRQP
Acidovorax_sp.   AGNNVEYFDEPALSLMLCDLPRAHYRVF--S----GVAPLAHGDAAGVGF--------AA
Beggiatoa_sp.    CSSAKEYFDVPAIAIMIGEFPEESFSIF--NTVSQDFDTFSRT--HQSWC-----DNKQA
R._eutropha      SACGVEYIDEPALALMMAPLPRDAFRLF--S----GRKPL---PPATSGF--------TP
                                                                             


                        190       200       210       220       230       240
                 =========+=========+=========+=========+=========+=========+
GOS_2185010      ---------------------NYVQSDPMVFGAGSCDDGSGRISVQFGA--------EGV
G._violaceus_ou  PHFVLMVDGSSFPVDVLIGGLDFAFPKAIKVGGLASGGNRPGQNRLFFG--------DQA
A._Cellulolytic  VLGIVLADPYSFPADGFVEQANRTV-SVPLVGGMAFGAAGPGSTRLSLD--------RRS
Synechococcus_s  PHFLLLADGFSSRISELLQGLDFAYPGAVKVGGLASGGRGPRGNALFLLDARTPTPRREL
C._akajimensis   NAWMLFASSESIGHDSWLPAWNQATGGKVTIGGFASSPSENPQSHLFLN--------GQH
M._tuberculosis  DLHLLLPDPYTFPSNLLIEHPNTDLPGTAVVGGVVSGGRRRGDTRLFRD--------HDV
R.centenum       LLGLVHADPRTPRLPDLIAAVA-RRAGGFLVGGLCASRGE----QAVLA--------GRV
Acidovorax_sp.   HTALVHADGQTPELADLVAEMAGRTSSGYLFGGVVASRGA----QVQLA-LPAGGEGDGV
Beggiatoa_sp.    LFAIVHGDPRNRRIIKLIHQMSERLGEGFLVGGLTSSRHQ----YLQIA--------DTV
R._eutropha      HTALVHAEGSTPDLQDLLRELSERTATGYLFGGLSSARNQ----TLQIA--------DGI
                                                                             


                        250       260       270       280       290       300
                 =========+=========+=========+=========+=========+=========+
GOS_2185010      VVNGAGGMGFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASEL
G._violaceus_ou  VGSGAVGVVLAGDIAVEAAVAQGCRPVGETFQITRAEGNLLWELDGQPALQVLQTVLQQL
A._Cellulolytic  VERGAVGVLLGGPVGVRTAVSQGCRPIGPPMTVTAARDNVLLELAGMPAVRKLERVLAEL
Synechococcus_s  YREGTVGLALSGNVVLDAVVAQGCRPIGDPLRVTEAEGNVILSLEGRPPLAVLQDLAERL
C._akajimensis   YQDGAVALSLEGHVTIEPLLTQGCRPIGSPWIVTEAEHNLIHKIGNRPILEVLRDTLENM
M._tuberculosis  LTSGVVGVRLPGMRGV-PVVSQGCRPIGYPYIVTGADGILITELGGRPPLQRLREIVEGL
R.centenum       AQGGLSGVLFSDRVALATGLTQGCSPIGPTRTVTDAEASIIKEIDGRPALEALRHDVGEL
Acidovorax_sp.   LHGGLSGVAFDAQVELVSRVTQGCQPIGAQAAITAAQDNVVLELDGEPALDVLLDTLGVT
Beggiatoa_sp.    TEGGISGVLFSTTVNVTTRLTQSCVPIGPRHQITESSNNVIIRIDDRPALDIFKEDIGPK
R._eutropha      FTGGLSGVLFGPGVGLISRVTQGCQPIGPARTVTRAERNVVFTLDHEPALDCVLKDLGLE
                    ###########     ###############################          


                        310       320       330       340       350       360
                 =========+=========+=========+=========+=========+=========+
GOS_2185010      EFDNL---ENAARQL---LISFPLDPEE----PKFEGESSMVRHLTGIDVARQGLQFSQI
G._violaceus_ou  DENDQ---RLARNAL---FVGVRMSEFH----SGSEQGDFLVRNLMGVDSRTGGLAVGEW
A._Cellulolytic  SAEDQ---ALASAGL---QIGIAMDEYA----EDHDMGDFLVRGILGIDPARQGIAIGDV
Synechococcus_s  SPSDQ---RLARQAL---FIGLLMDEFK----SEPTSGDFLIRVILGIDPRVGAIAIGDR
C._akajimensis   SDDDQ---QLAHGNI---FIGLVLDEYK----SSFGTGDFLVRNLAAIDPQTGAIAIATP
M._tuberculosis  SPDER---ALVSHGL---QIGIVVDEHL----AAPGQGDFVIRGLLGADPSTGSIEIDEV
R.centenum       LANDL---ERVAGYV---FAGLPI--------AGSDTGDYLVRNLLGIDPQRGWIAVGEP
Acidovorax_sp.   LQGDA---QAALRAVRATLAGLEDADAPQRQRTGHFGADTRVRHIVGLDATRSGVALGDH
Beggiatoa_sp.    LSKDL---NQVAGLI---FVGLPI--------IGSDTGDFMVRNLAGIDPEHKLLAIGET
R._eutropha      RDAPTKAMAEALSGT---LAGLSAGTEDAPRLPGMFGAETVVRHLIGLDVQHRVLALADV
                                                          ###################


                        370       380       390       400       410       420
                 =========+=========+=========+=========+=========+=========+
GOS_2185010      IEKGTVVSFAYRSAISARDDLNAMLKRMKKE----------------K------------
G._violaceus_ou  LRTGQTVRFHLRDAATSRDDLQLVLQRHRLE------H-------SGA------------
A._Cellulolytic  VPVGRTVRFHVRDAASAGDDLRSTVKRLREE------F-------TA-------------
Synechococcus_s  VRPGQTVQFHLRDAQTSAEDLRWALSRYCAERNLQQSYPAERSSQPKP------------
C._akajimensis   PRIGQNLQFQIRDPHTAAIDMEELLKRKKAR------L-------QGR------------
M._tuberculosis  VQVGATMQFQVRDAAGADKDLRLTVERAAAR------L-------PGR------------
R.centenum       VARGRPLLFCRRDRAAAEADLKRMLGQMKRR------L-------GGG------------
Acidovorax_sp.   VEVGMRLAFCQRNVGAARADLMRICAEVREE------L-------SPQLEEAQALPAAGS
Beggiatoa_sp.    VNPGTPIIFTRRDRKVAHKDFIKILNNIKTQ------L-------KGR------------
R._eutropha      PETGMRLAFCTRNPAAARTDLTRIATEIRAE------I-------EGG------AA----
                 ####################                                        


                        430       440       450       460       470       480
                 =========+=========+=========+=========+=========+=========+
GOS_2185010      ----------PPSFGIYFNCAARGEALYGKSDVDIQLIHEQLGD-FPLAGMFGGYEMARV
G._violaceus_ou  ----------PPAGALLFSCLGRGESLYGEPDVDSTLFAQVLGEGVPLAGFFCNGEIGPV
A._Cellulolytic  -----------VESALLFSCNGRGSHLFPDAAHDVSVVRGVLGV-QAVAGFFAAGEIGPV
Synechococcus_s  ----------DPCGALMFSCLGRGKGLYGTPNFDSQRFRELLGE-LPLGGFFCNGEIGPV
C._akajimensis   ----------RIYGGCLCDCIGRGASLYGAPNQDVSAIQNALPG-IPLSGIFCNGEFATV
M._tuberculosis  -----------AAGALLFTCNGRGRRMFGVADHDASTIEELLGG-IPLAGFFAAGEIGPI
R.centenum       ----------TPKGGVYISCIARGPGLFGDPDHELRAIAEHLGD-FPLAGFFAGGEI---
Acidovorax_sp.   APHAEPAGGRRICGALYVSCSGRGGPHFGGPSAELQIVRHALGD-VPLAGFFAGGEI---
Beggiatoa_sp.    ----------LPRGGVYYSCMGRGESLFGKDSQELKIIQSVLGD-FPLVGFFANGEI---
R._eutropha      ---------GKALGALYISCSGRGGPHFGAPSAELQIVRRALGD-LPMAGFFANGEI---
                                    ########################   ###########   


                        490       500
                 =========+=========+=====
GOS_2185010      PSGLQLYTYTGVLVLIYLE------
G._violaceus_ou  GSTTFLHGYTSSFGLFRPRTSGRL-
A._Cellulolytic  AGRTYLHGFSASIAAFGTCTDRST-
Synechococcus_s  GGSTFLHGYTSCFGIFRPAR-----
C._akajimensis   KQQTQLHGYAASLGLFVEKNESANP
M._tuberculosis  AGRNALHGFTASMALFVDDM---E-
R.centenum       -SHDRLYSYTGVLTLFL--------
Acidovorax_sp.   -AAHRLYGYTGVLTVFVAPD-G-AL
Beggiatoa_sp.    -YYQRLYGYTGILTLFL--------
R._eutropha      -ARDHLYGYTGVLTVFTGAV-V-S-
                      ###########         



 

Parameters used
Minimum Number Of Sequences For A Conserved Position: 6
Minimum Number Of Sequences For A Flanking Position: 9
Maximum Number Of Contiguous Nonconserved Positions: 8
Minimum Length Of A Block: 10
Allowed Gap Positions: None
Use Similarity Matrices: Yes

Flank positions of the 6 selected block(s)
Flanks: [244  254]  [260  290]  [342  380]  [440  463]  [467  477]  [486  496]  

New number of positions in input.fasta-gb:  127  (25% of the original 505 positions)
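The number of retained positions can be recomputed from the flank coordinates reported by Gblocks. The sketch below parses the "Flanks" line copied from the output above and sums the block lengths; the variable names are illustrative.

```python
import re

# Copied from the Gblocks output above.
flanks_line = "Flanks: [244  254]  [260  290]  [342  380]  [440  463]  [467  477]  [486  496]"
original_positions = 505

# Each block is reported as [first_column  last_column], both inclusive.
blocks = [(int(a), int(b)) for a, b in re.findall(r"\[(\d+)\s+(\d+)\]", flanks_line)]
block_lengths = [end - start + 1 for start, end in blocks]
selected = sum(block_lengths)
percent = round(100 * selected / original_positions)

print(block_lengths)
print(selected, percent)
```

The six blocks add up to the 127 selected positions, about 25% of the 505 original columns, in agreement with the Gblocks summary.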


Protein Domains

PROTOCOL


InterPro, default parameters at EBI



RESULTS ANALYSIS


This sequence has two domains, which do not overlap.


FIST C stands for F-box and intracellular signal transduction, C-terminal. This sensory domain is found in signal transduction proteins from bacteria, archaea and eukaryotes. The available evidence suggests that FIST domains can bind small ligands, such as amino acids.


FIST_C has the smaller e-value (9.4e-22); FIST has a larger e-value (3.2e-11), which is still acceptable.
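The two statements above (no overlap, and FIST_C being the stronger hit) can be checked directly from the domain coordinates. In this short Python sketch the tuples are typed in by hand from the InterPro raw results, and the helper name is hypothetical.

```python
# (name, start, end, e-value), copied from the InterPro raw results.
domains = [
    ("FIST_C", 104, 244, 9.4e-22),
    ("FIST", 3, 103, 3.2e-11),
]


def overlap(a, b):
    """Number of residues shared by two (name, start, end, e-value) hits."""
    return max(0, min(a[2], b[2]) - max(a[1], b[1]) + 1)


shared = overlap(domains[0], domains[1])
best = min(domains, key=lambda d: d[3])   # smallest e-value = strongest hit
print(shared, best[0])
```

FIST ends at residue 103 and FIST_C starts at residue 104, so the two hits are adjacent and share no residues.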





RAW RESULTS

Sequence_1	4C1D44EA504F84E5	265	HMMPfam	PF10442	FIST_C	104	244	9.4e-22	T	14-Mar-2011	IPR019494	FIST C domain	
Sequence_1	4C1D44EA504F84E5	265	HMMPfam	PF08495	FIST	3	103	3.2e-11	T	14-Mar-2011	IPR013702	FIST domain, N-terminal	

Phylogeny

PROTOCOL


a) Phylogeny.fr / PhyML method / number of bootstraps: 100 / default substitution model / outgroup: Gloeobacter violaceus (cyanobacteria), Acidothermus cellulolyticus (high GC Gram+), Mycobacterium tuberculosis (high GC Gram+), Synechococcus sp. (cyanobacteria), Coraliomargarita akajimensis (verrucomicrobia)



b) Phylogeny.fr / BioNJ method / number of bootstraps: 100 / default substitution model / outgroup: Gloeobacter violaceus (cyanobacteria), Acidothermus cellulolyticus (high GC Gram+), Mycobacterium tuberculosis (high GC Gram+), Synechococcus sp. (cyanobacteria), Coraliomargarita akajimensis (verrucomicrobia)




RESULTS ANALYSIS


The tree is rooted: all the outgroup sequences descend from a single node, and this node contains only outgroups.

All of the sequences have low identity (<50%) and low scores (<200), although the e-values are good (E << 1e-2). The proteins are hypothetical, so we cannot know for sure whether they really exist.

Branches with support values below 70% were collapsed, because only branches at or above this value can be trusted.


The two trees are in agreement.


In both, the outgroups are separated from the ingroups, and this separation can be trusted because the support value at the outgroup node is 0.99 on the PhyML tree and 0.96 on the BioNJ tree.


The node that contains all the ingroup sequences corresponds to the phylum Proteobacteria. The GOS sequence falls within it, so I conclude that the GOS sequence probably belongs to the Proteobacteria.
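The branch-collapsing step mentioned in this analysis can be illustrated with a small Python sketch. It assumes a hypothetical tree representation in which a leaf is a string and an internal node is a (support, children) tuple; this is not how Phylogeny.fr stores trees, and the toy tree is not one of the trees shown in this section.

```python
def collapse(node, threshold=0.70):
    """Dissolve internal nodes whose support is below the threshold.

    A leaf is a string; an internal node is (support, [children]).
    The children of a weak node are attached directly to its parent,
    which turns the unreliable branch into a polytomy.
    """
    if isinstance(node, str):
        return node
    support, children = node
    kept = []
    for child in children:
        child = collapse(child, threshold)
        weak = (not isinstance(child, str)
                and child[0] is not None
                and child[0] < threshold)
        if weak:
            kept.extend(child[1])
        else:
            kept.append(child)
    return (support, kept)


# Toy tree: the root has no support value, one weak clade and one strong clade.
toy_tree = (None, [(0.55, ["A", "B"]), (0.88, ["C", "D"])])
collapsed = collapse(toy_tree)
print(collapsed)
```

In the toy example the clade supported at 0.55 is dissolved and its leaves hang directly from the root, while the clade supported at 0.88 is kept.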


RAW RESULTS

a) PhyML
   
                         ----0.1--
 
                                +------------------------------------------G._violaceus_out                                 [cyanobacteria]
                          0.81  |
                       +--------+-------------------------------------------Synechococcus_sp._out                           [cyanobacteria]
                       |        |
        0.99           |        |
 +---------------------+        +------------------------------------------------C._akajimensis_out                          [verrucomicrobia]
 |                     |
 |                     |    0.82   +-------------------------------A._Cellulolyticus_out                                     [high GC Gram+]
 |                     +-----------+
 |                                 +------------------------------------M._tuberculosis_out                                  [high GC Gram+]
 |
 |
 |             +---------------------------------------------------------------------------------------------GOS_2185010
 |             |
 |             |           0.88             +----------------Acidovorax_sp.                                                  [b-proteobacteria]
 +-------------+----------------------------+
               |                            +--------------------R._eutropha                                                 [b-proteobacteria]
               |
               |
               |      0.86       +---------------------R.centenum                                                            [a-proteobacteria]
               +-----------------+
                                 +-----------------------------------------------Beggiatoa_sp.                               [g-proteobacteria]





----------------------------------------------------------------------------------------------------------------------------


b)BioNJ


      -------0.1-----
 
                   +-------------------------------------------------------G._violaceus_out                                  [cyanobacteria]
        0.96       |
 +-----------------+-------------------------------------------------------------Synechococcus_sp._out                       [cyanobacteria]
 |                 |
 |                 |
 |                 |-------------------------------------------------------------------------C._akajimensis_out               [verrucomicrobia]
 |                 |
 |                 |                 +------------------------------------------A._Cellulolyticus_out                         [high GC Gram+]
 |                 +-----------------+
 |                                   +-------------------------------------------------M._tuberculosis_out                   [high GC Gram+]
 |
 |
 |                          1                     +------------------------Acidovorax_sp.                                    [b-proteobacteria]
 |       +----------------------------------------+
 |       |                                        +----------------------------R._eutropha                                   [b-proteobacteria]
 |       |
 |       |      0.72        +-----------------------------------------R.centenum                                             [a-proteobacteria]
 |       |------------------+
 +-------+                  |
         |                  +------------------------------------------------------Beggiatoa_sp.                             [g-proteobacteria]
         |
         +---------------------------------------------------------------------------------------------------GOS_2185010

                                                                       

Taxonomy report

PROTOCOL


BLASTp vs NR, default NCBI parameters + "1000 Max Target Sequences"



RESULTS ANALYSIS


The ingroups and outgroups were chosen based on the lineage report and on the e-values of the sequences in BLASTp vs NR.

The taxonomic rank chosen to build the tree was "phylum", with Proteobacteria as the ingroup and phyla other than Proteobacteria as the outgroup.


Ingroup: Proteobacteria

ref|YP_988070.1| Acidovorax sp. 2e-20 Acidovorax sp. JS42 [b-proteobacteria]

ref|YP_002299266.1| R. centenum 9e-25 Rhodospirillum centenum SW [a-proteobacteria]

ref|ZP_02002961.1| Beggiatoa sp. 5e-20 Beggiatoa sp. PS [g-proteobacteria]

ref|YP_728373.1| R. eutropha 5e-20 Ralstonia eutropha H16 [b-proteobacteria]


Outgroup: other than Proteobacteria (examples: cyanobacteria, verrucomicrobia, high GC Gram+)

ref|NP_923772.1| G. violaceus 7e-23 Gloeobacter violaceus PCC 7421 [cyanobacteria]

ref|YP_003549120.1| C. akajimensis 5e-15 Coraliomargarita akajimensis DSM 45221 [verrucomicrobia]

ref|YP_873649.1| A. cellulolyticus 1e-22 Acidothermus cellulolyticus 11B [high GC Gram+]

ref|NP_215142.1| M. tuberculosis 9e-17 Mycobacterium tuberculosis H37Rv [high GC Gram+]

ref|YP_475304.1| Synechococcus sp. 5e-17 Synechococcus sp. JA-3-3Ab [cyanobacteria]
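Choosing ingroups and outgroups from a lineage report amounts to reading hit counts per taxon at a given depth. The following Python sketch parses lines in the layout of the lineage report in the raw results of this section (dot-indented taxon name, then "N hits M orgs"); the regular expression and function name are assumptions made for illustration, and the sample lines are shortened copies of the first three report lines.

```python
import re

# Leading ". " pairs give the depth; the name ends before the run of dots.
LINEAGE_LINE = re.compile(r"^((?:\. )*)(.+?) \.{2,}\s+(\d+) hits\s+(\d+) orgs")


def parse_lineage(lines):
    """Return (depth, taxon, hits, organisms) for each recognised line."""
    rows = []
    for line in lines:
        match = LINEAGE_LINE.match(line)
        if match:
            depth = len(match.group(1)) // 2
            rows.append((depth, match.group(2),
                         int(match.group(3)), int(match.group(4))))
    return rows


sample = [
    "cellular organisms ..........   547 hits  222 orgs [root]",
    ". Bacteria ..........   503 hits  201 orgs ",
    ". . Proteobacteria ..........   169 hits   77 orgs ",
]
rows = parse_lineage(sample)
for depth, taxon, hits, orgs in rows:
    print(depth, taxon, hits, orgs)
```

With the full report as input, the rows at the phylum depth show how the hits are distributed between Proteobacteria and the other phyla.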





RAW RESULTS

cellular organisms ..................................................   547 hits  222 orgs [root]
. Bacteria ..........................................................   503 hits  201 orgs 
. . Proteobacteria ..................................................   169 hits   77 orgs 
. . . Alphaproteobacteria ...........................................    16 hits    9 orgs 
. . . . Rhodospirillaceae ...........................................     9 hits    5 orgs [Rhodospirillales]
. . . . . Rhodospirillum centenum SW ................................     2 hits    1 orgs [Rhodospirillum; Rhodospirillum centenum]
. . . . . Magnetospirillum ..........................................     5 hits    3 orgs 
. . . . . . Magnetospirillum magnetotacticum MS-1 ...................     2 hits    1 orgs [Magnetospirillum magnetotacticum]
. . . . . . Magnetospirillum magneticum AMB-1 .......................     2 hits    1 orgs [Magnetospirillum magneticum]
. . . . . . Magnetospirillum gryphiswaldense MSR-1 ..................     1 hits    1 orgs [Magnetospirillum gryphiswaldense]
. . . . . Azospirillum sp. B510 .....................................     2 hits    1 orgs [Azospirillum]
. . . . Rhizobiales .................................................     5 hits    3 orgs 
. . . . . Xanthobacter autotrophicus Py2 ............................     2 hits    1 orgs [Xanthobacteraceae; Xanthobacter; Xanthobacter autotrophicus]
. . . . . Methylosinus trichosporium OB3b ...........................     2 hits    1 orgs [Methylocystaceae; Methylosinus; Methylosinus trichosporium]
. . . . . uncultured Rhizobium sp. HF0500_35F13 .....................     1 hits    1 orgs [Rhizobiaceae; Rhizobium/Agrobacterium group; Rhizobium; environmental samples]
. . . . Roseobacter sp. GAI101 ......................................     2 hits    1 orgs [Rhodobacterales; Rhodobacteraceae; Roseobacter]
. . . delta/epsilon subdivisions ....................................    37 hits   14 orgs 
. . . . Deltaproteobacteria .........................................    35 hits   13 orgs 
. . . . . Myxococcales ..............................................    12 hits    4 orgs 
. . . . . . Nannocystineae ..........................................     6 hits    2 orgs 
. . . . . . . Plesiocystis pacifica SIR-1 ...........................     4 hits    1 orgs [Nannocystaceae; Plesiocystis; Plesiocystis pacifica]
. . . . . . . Haliangium ochraceum DSM 14365 ........................     2 hits    1 orgs [Haliangiaceae; Haliangium; Haliangium ochraceum]
. . . . . . Anaeromyxobacter sp. Fw109-5 ............................     2 hits    1 orgs [Cystobacterineae; Myxococcaceae; Anaeromyxobacter]
. . . . . . Sorangium cellulosum 'So ce 56' .........................     4 hits    1 orgs [Sorangiineae; Polyangiaceae; Sorangium; Sorangium cellulosum]
. . . . . Desulfobacterales .........................................     3 hits    2 orgs 
. . . . . . Desulfurivibrio alkaliphilus AHT2 .......................     2 hits    1 orgs [Desulfobulbaceae; Desulfurivibrio; Desulfurivibrio alkaliphilus]
. . . . . . uncultured Desulfobacterium sp. .........................     1 hits    1 orgs [Desulfobacteraceae; Desulfobacterium; environmental samples]
. . . . . Geobacter .................................................    16 hits    5 orgs [Desulfuromonadales; Geobacteraceae]
. . . . . . Geobacter sp. FRC-32 ....................................     2 hits    1 orgs 
. . . . . . Geobacter uraniireducens Rf4 ............................     2 hits    1 orgs [Geobacter uraniireducens]
. . . . . . Geobacter sp. M18 .......................................     4 hits    1 orgs 
. . . . . . Geobacter bemidjiensis Bem ..............................     4 hits    1 orgs [Geobacter bemidjiensis]
. . . . . . Geobacter sp. M21 .......................................     4 hits    1 orgs 
. . . . . Syntrophobacter fumaroxidans MPOB .........................     2 hits    1 orgs [Syntrophobacterales; Syntrophobacteraceae; Syntrophobacter; Syntrophobacter fumaroxidans]
. . . . . Desulfovibrio sp. FW1012B .................................     2 hits    1 orgs [Desulfovibrionales; Desulfovibrionaceae; Desulfovibrio]
. . . . Campylobacterales bacterium GD 1 ............................     2 hits    1 orgs [Epsilonproteobacteria; Campylobacterales; unclassified Campylobacterales]
. . . Gammaproteobacteria ...........................................    33 hits   18 orgs 
. . . . Acidithiobacillus ...........................................     6 hits    3 orgs [Acidithiobacillales; Acidithiobacillaceae]
. . . . . Acidithiobacillus ferrooxidans ............................     4 hits    2 orgs 
. . . . . . Acidithiobacillus ferrooxidans ATCC 53993 ...............     2 hits    1 orgs 
. . . . . . Acidithiobacillus ferrooxidans ATCC 23270 ...............     2 hits    1 orgs 
. . . . . Acidithiobacillus caldus ATCC 51756 .......................     2 hits    1 orgs [Acidithiobacillus caldus]
. . . . Beggiatoa sp. PS ............................................     2 hits    1 orgs [Thiotrichales; Thiotrichaceae; Beggiatoa]
. . . . Thioalkalivibrio ............................................     4 hits    2 orgs [Chromatiales; Ectothiorhodospiraceae]
. . . . . Thioalkalivibrio sp. K90mix ...............................     2 hits    1 orgs 
. . . . . Thioalkalivibrio sp. HL-EbGR7 .............................     2 hits    1 orgs 
. . . . Endoriftia persephone 'Hot96_1+Hot96_2' .....................     1 hits    1 orgs [unclassified Gammaproteobacteria; sulfur-oxidizing symbionts; Endoriftia; Endoriftia persephone]
. . . . Vibrionales .................................................    16 hits    9 orgs 
. . . . . Vibrio ....................................................    14 hits    8 orgs [Vibrionaceae]
. . . . . . Vibrio sp. MED222 .......................................     2 hits    1 orgs 
. . . . . . Vibrio splendidus .......................................     4 hits    2 orgs 
. . . . . . . Vibrio splendidus LGP32 ...............................     2 hits    1 orgs 
. . . . . . . Vibrio splendidus 12B01 ...............................     2 hits    1 orgs 
. . . . . . Vibrio brasiliensis LMG 20546 ...........................     2 hits    1 orgs [Vibrio brasiliensis]
. . . . . . Vibrio vulnificus .......................................     6 hits    4 orgs 
. . . . . . . Vibrio vulnificus MO6-24/O ............................     2 hits    1 orgs 
. . . . . . . Vibrio vulnificus CMCP6 ...............................     1 hits    1 orgs 
. . . . . . . Vibrio vulnificus YJ016 ...............................     2 hits    1 orgs 
. . . . . Vibrionales bacterium SWAT-3 ..............................     2 hits    1 orgs [unclassified Vibrionales]
. . . . Saccharophagus degradans 2-40 ...............................     2 hits    1 orgs [Alteromonadales; Alteromonadaceae; Saccharophagus; Saccharophagus degradans]
. . . . Pseudomonas aeruginosa PA7 ..................................     2 hits    1 orgs [Pseudomonadales; Pseudomonadaceae; Pseudomonas; Pseudomonas aeruginosa group; Pseudomonas aeruginosa]
. . . Betaproteobacteria ............................................    81 hits   35 orgs 
. . . . Burkholderiales .............................................    47 hits   24 orgs 
. . . . . Comamonadaceae ............................................    33 hits   17 orgs 
. . . . . . Acidovorax ..............................................     9 hits    5 orgs 
. . . . . . . Acidovorax sp. JS42 ...................................     2 hits    1 orgs 
. . . . . . . Acidovorax ebreus TPSY ................................     2 hits    1 orgs [Acidovorax ebreus]
. . . . . . . Acidovorax delafieldii 2AN ............................     2 hits    1 orgs [Acidovorax delafieldii]
. . . . . . . Acidovorax citrulli AAC00-1 ...........................     2 hits    1 orgs [Acidovorax citrulli]
. . . . . . . Acidovorax avenae subsp. avenae ATCC 19860 ............     1 hits    1 orgs [Acidovorax avenae; Acidovorax avenae subsp. avenae]
. . . . . . Verminephrobacter eiseniae EF01-2 .......................     2 hits    1 orgs [Verminephrobacter; Verminephrobacter eiseniae]
. . . . . . Polaromonas .............................................     4 hits    2 orgs 
. . . . . . . Polaromonas naphthalenivorans CJ2 .....................     2 hits    1 orgs [Polaromonas naphthalenivorans]
. . . . . . . Polaromonas sp. JS666 .................................     2 hits    1 orgs 
. . . . . . Curvibacter putative symbiont of Hydra magnipapillata ...     2 hits    1 orgs [Curvibacter]
. . . . . . Rhodoferax ferrireducens T118 ...........................     2 hits    1 orgs [Albidiferax; Albidiferax ferrireducens]
. . . . . . Alicycliphilus denitrificans BC .........................     2 hits    1 orgs [Alicycliphilus; Alicycliphilus denitrificans]
. . . . . . Variovorax paradoxus ....................................     4 hits    2 orgs [Variovorax]
. . . . . . . Variovorax paradoxus EPS ..............................     2 hits    1 orgs 
. . . . . . . Variovorax paradoxus S110 .............................     2 hits    1 orgs 
. . . . . . Delftia acidovorans SPH-1 ...............................     2 hits    1 orgs [Delftia; Delftia acidovorans]
. . . . . . Comamonas testosteroni ..................................     6 hits    3 orgs [Comamonas]
. . . . . . . Comamonas testosteroni CNB-2 ..........................     2 hits    1 orgs [Comamonas testosteroni CNB-1]
. . . . . . . Comamonas testosteroni S44 ............................     2 hits    1 orgs 
. . . . . . . Comamonas testosteroni KF-1 ...........................     2 hits    1 orgs 
. . . . . Burkholderiaceae ..........................................     8 hits    4 orgs 
. . . . . . Cupriavidus .............................................     4 hits    2 orgs 
. . . . . . . Ralstonia eutropha H16 ................................     2 hits    1 orgs [Cupriavidus necator]
. . . . . . . Ralstonia eutropha JMP134 .............................     2 hits    1 orgs [Cupriavidus pinatubonensis]
. . . . . . Limnobacter sp. MED105 ..................................     2 hits    1 orgs [Limnobacter]
. . . . . . Lautropia mirabilis ATCC 51599 ..........................     2 hits    1 orgs [Lautropia; Lautropia mirabilis]
. . . . . Burkholderiales Genera incertae sedis .....................     4 hits    2 orgs [unclassified Burkholderiales]
. . . . . . Leptothrix cholodnii SP-6 ...............................     2 hits    1 orgs [Leptothrix; Leptothrix cholodnii]
. . . . . . Methylibium petroleiphilum PM1 ..........................     2 hits    1 orgs [Methylibium; Methylibium petroleiphilum]
. . . . . Oxalobacter formigenes OXCC13 .............................     2 hits    1 orgs [Oxalobacteraceae; Oxalobacter; Oxalobacter formigenes]
. . . . Rhodocyclaceae ..............................................     6 hits    3 orgs [Rhodocyclales]
. . . . . Azoarcus sp. BH72 .........................................     2 hits    1 orgs [Azoarcus]
. . . . . Thauera sp. MZ1T ..........................................     2 hits    1 orgs [Thauera]
. . . . . Dechloromonas aromatica RCB ...............................     2 hits    1 orgs [Dechloromonas; Dechloromonas aromatica]
. . . . Thiobacillus denitrificans ATCC 25259 .......................     2 hits    1 orgs [Hydrogenophilales; Hydrogenophilaceae; Thiobacillus; Thiobacillus denitrificans]
. . . . Methylophilaceae ............................................    22 hits    5 orgs [Methylophilales]
. . . . . Methylovorus ..............................................     8 hits    2 orgs 
. . . . . . Methylovorus sp. MP688 ..................................     4 hits    1 orgs 
. . . . . . Methylovorus sp. SIP3-4 .................................     4 hits    1 orgs 
. . . . . Methylotenera .............................................     8 hits    2 orgs 
. . . . . . Methylotenera sp. 301 ...................................     4 hits    1 orgs 
. . . . . . Methylotenera mobilis JLW8 ..............................     4 hits    1 orgs [Methylotenera mobilis]
. . . . . Methylobacillus flagellatus KT ............................     6 hits    1 orgs [Methylobacillus; Methylobacillus flagellatus]
. . . . Candidatus Accumulibacter phosphatis clade IIA str. UW-1 ....     2 hits    1 orgs [unclassified Betaproteobacteria; Candidatus Accumulibacter; Candidatus Accumulibacter phosphatis]
. . . . Gallionella capsiferriformans ES-2 ..........................     2 hits    1 orgs [Gallionellales; Gallionellaceae; Gallionella; Gallionella capsiferriformans]
. . . Magnetococcus sp. MC-1 ........................................     2 hits    1 orgs [unclassified Proteobacteria; Magnetococcus]
. . Cyanobacteria ...................................................   100 hits   48 orgs 
. . . Gloeobacter violaceus PCC 7421 ................................     2 hits    1 orgs [Gloeobacteria; Gloeobacterales; Gloeobacter; Gloeobacter violaceus]
. . . Chroococcales .................................................    64 hits   32 orgs 
. . . . Synechococcus ...............................................    41 hits   20 orgs 
. . . . . Synechococcus sp. JA-2-3B'a(2-13) .........................     2 hits    1 orgs 
. . . . . Synechococcus sp. JA-3-3Ab ................................     2 hits    1 orgs 
. . . . . Synechococcus sp. RS9916 ..................................     2 hits    1 orgs 
. . . . . Synechococcus sp. RS9917 ..................................     2 hits    1 orgs 
. . . . . Synechococcus sp. WH 8109 .................................     2 hits    1 orgs 
. . . . . Synechococcus sp. BL107 ...................................     2 hits    1 orgs 
. . . . . Synechococcus elongatus ...................................     5 hits    2 orgs 
. . . . . . Synechococcus elongatus PCC 7942 ........................     3 hits    1 orgs 
. . . . . . Synechococcus elongatus PCC 6301 ........................     2 hits    1 orgs 
. . . . . Synechococcus sp. CC9311 ..................................     2 hits    1 orgs 
. . . . . Synechococcus sp. CB0101 ..................................     1 hits    1 orgs 
. . . . . Synechococcus sp. WH 8102 .................................     2 hits    1 orgs 
. . . . . Synechococcus sp. CC9605 ..................................     2 hits    1 orgs 
. . . . . Synechococcus sp. CC9902 ..................................     2 hits    1 orgs 
. . . . . Synechococcus sp. WH 5701 .................................     2 hits    1 orgs 
. . . . . Synechococcus sp. RCC307 ..................................     2 hits    1 orgs 
. . . . . Synechococcus sp. PCC 7002 ................................     2 hits    1 orgs 
. . . . . Synechococcus sp. WH 7805 .................................     2 hits    1 orgs 
. . . . . Synechococcus sp. WH 7803 .................................     2 hits    1 orgs 
. . . . . Synechococcus sp. CB0205 ..................................     1 hits    1 orgs 
. . . . . Synechococcus sp. PCC 7335 ................................     4 hits    1 orgs 
. . . . Microcystis aeruginosa ......................................     3 hits    2 orgs [Microcystis]
. . . . . Microcystis aeruginosa NIES-843 ...........................     2 hits    1 orgs 
. . . . . Microcystis aeruginosa PCC 7806 ...........................     1 hits    1 orgs 
. . . . Cyanothece ..................................................    14 hits    7 orgs 
. . . . . Cyanothece sp. ATCC 51142 .................................     2 hits    1 orgs 
. . . . . Cyanothece sp. CCY0110 ....................................     2 hits    1 orgs 
. . . . . Cyanothece sp. PCC 7822 ...................................     2 hits    1 orgs 
. . . . . Cyanothece sp. PCC 7425 ...................................     2 hits    1 orgs 
. . . . . Cyanothece sp. PCC 7424 ...................................     2 hits    1 orgs 
. . . . . Cyanothece sp. PCC 8801 ...................................     2 hits    1 orgs 
. . . . . Cyanothece sp. PCC 8802 ...................................     2 hits    1 orgs 
. . . . Cyanobium sp. PCC 7001 ......................................     2 hits    1 orgs [Cyanobium]
. . . . Crocosphaera watsonii WH 8501 ...............................     2 hits    1 orgs [Crocosphaera; Crocosphaera watsonii]
. . . . Synechocystis sp. PCC 6803 ..................................     2 hits    1 orgs [Synechocystis]
. . . Oscillatoriales ...............................................    14 hits    7 orgs 
. . . . Oscillatoria sp. PCC 6506 ...................................     2 hits    1 orgs [Oscillatoria]
. . . . Trichodesmium erythraeum IMS101 .............................     2 hits    1 orgs [Trichodesmium; Trichodesmium erythraeum]
. . . . Arthrospira .................................................     6 hits    3 orgs 
. . . . . Arthrospira platensis .....................................     2 hits    2 orgs 
. . . . . . Arthrospira platensis str. Paraca .......................     1 hits    1 orgs 
. . . . . . Arthrospira platensis NIES-39 ...........................     1 hits    1 orgs 
. . . . . Arthrospira maxima CS-328 .................................     4 hits    1 orgs [Arthrospira maxima]
. . . . Microcoleus chthonoplastes PCC 7420 .........................     2 hits    1 orgs [Microcoleus; Microcoleus chthonoplastes]
. . . . Lyngbya sp. PCC 8106 ........................................     2 hits    1 orgs [Lyngbya]
. . . Nostocaceae ...................................................    12 hits    5 orgs [Nostocales]
. . . . Nodularia spumigena CCY9414 .................................     2 hits    1 orgs [Nodularia; Nodularia spumigena]
. . . . Anabaena ....................................................     4 hits    2 orgs 
. . . . . Anabaena variabilis ATCC 29413 ............................     2 hits    1 orgs [Anabaena variabilis]
. . . . . 'Nostoc azollae' 0708 .....................................     2 hits    1 orgs [Anabaena azollae]
. . . . Nostoc ......................................................     6 hits    2 orgs 
. . . . . Nostoc sp. PCC 7120 .......................................     2 hits    1 orgs 
. . . . . Nostoc punctiforme PCC 73102 ..............................     4 hits    1 orgs [Nostoc punctiforme]
. . . Acaryochloris marina MBIC11017 ................................     4 hits    1 orgs [unclassified Cyanobacteria; Acaryochloris; Acaryochloris marina]
. . . Prochlorococcus marinus .......................................     4 hits    2 orgs [Prochlorales; Prochlorococcaceae; Prochlorococcus]
. . . . Prochlorococcus marinus str. MIT 9303 .......................     2 hits    1 orgs 
. . . . Prochlorococcus marinus str. MIT 9313 .......................     2 hits    1 orgs 
. . Actinobacteria (class) ..........................................   182 hits   49 orgs [Actinobacteria]
. . . Actinomycetales ...............................................   180 hits   48 orgs [Actinobacteridae]
. . . . Acidothermus cellulolyticus 11B .............................     2 hits    1 orgs [Frankineae; Acidothermaceae; Acidothermus; Acidothermus cellulolyticus]
. . . . Streptosporangineae .........................................     8 hits    4 orgs 
. . . . . Streptosporangium roseum DSM 43021 ........................     2 hits    1 orgs [Streptosporangiaceae; Streptosporangium; Streptosporangium roseum]
. . . . . Nocardiopsaceae ...........................................     4 hits    2 orgs 
. . . . . . Nocardiopsis dassonvillei subsp. dassonvillei DSM 43111 .     2 hits    1 orgs [Nocardiopsis; Nocardiopsis dassonvillei; Nocardiopsis dassonvillei subsp. dassonvillei]
. . . . . . Thermobifida fusca YX ...................................     2 hits    1 orgs [Thermobifida; Thermobifida fusca]
. . . . . Thermomonospora curvata DSM 43183 .........................     2 hits    1 orgs [Thermomonosporaceae; Thermomonospora; Thermomonospora curvata]
. . . . Thermobispora bispora DSM 43833 .............................     2 hits    1 orgs [Pseudonocardineae; Pseudonocardiaceae; Thermobispora; Thermobispora bispora]
. . . . Mycobacterium ...............................................   168 hits   42 orgs [Corynebacterineae; Mycobacteriaceae]
. . . . . Mycobacterium kansasii ATCC 12478 .........................     1 hits    1 orgs [Mycobacterium kansasii]
. . . . . Mycobacterium tuberculosis complex ........................   167 hits   41 orgs 
. . . . . . Mycobacterium tuberculosis ..............................   153 hits   37 orgs 
. . . . . . . Mycobacterium tuberculosis T92 ........................     6 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis SUMu006 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis 02_1987 ....................     6 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis SUMu001 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis SUMu010 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis T17 ........................     3 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis T46 ........................     6 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis SUMu011 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis K85 ........................     6 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis H37Rv ......................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis CDC1551 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis H37Ra ......................     6 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis F11 ........................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis 94_M4241A ..................     6 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis EAS054 .....................     6 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis T85 ........................     6 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis KZN 1435 ...................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis str. Haarlem ...............     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis '98-R604 INH-RIF-EM' .......     2 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis KZN 4207 ...................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis CPHL_A .....................     6 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis KZN 605 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis 210 ........................     2 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis KZN R506 ...................     2 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis SUMu002 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis SUMu003 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis SUMu004 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis SUMu005 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis SUMu008 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis SUMu007 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis SUMu009 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis KZN V2475 ..................     2 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis CDC1551A ...................     2 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis C ..........................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis SUMu012 ....................     4 hits    1 orgs 
. . . . . . . Mycobacterium tuberculosis GM 1503 ....................     4 hits    1 orgs 
. . . . . . Mycobacterium bovis .....................................    14 hits    4 orgs 
. . . . . . . Mycobacterium bovis AF2122/97 .........................     4 hits    1 orgs 
. . . . . . . Mycobacterium bovis BCG ...............................     8 hits    2 orgs 
. . . . . . . . Mycobacterium bovis BCG str. Pasteur 1173P2 .........     4 hits    1 orgs 
. . . . . . . . Mycobacterium bovis BCG str. Tokyo 172 ..............     4 hits    1 orgs 
. . . Conexibacter woesei DSM 14684 .................................     2 hits    1 orgs [Rubrobacteridae; Solirubrobacterales; Conexibacteraceae; Conexibacter; Conexibacter woesei]
. . Candidatus Nitrospira defluvii ..................................     2 hits    1 orgs [Nitrospirae; Nitrospira (class); Nitrospirales; Nitrospiraceae; Nitrospira]
. . Verrucomicrobia .................................................     8 hits    4 orgs [Chlamydiae/Verrucomicrobia group]
. . . Coraliomargarita akajimensis DSM 45221 ........................     2 hits    1 orgs [Opitutae; Puniceicoccales; Puniceicoccaceae; Coraliomargarita; Coraliomargarita akajimensis]
. . . Verrucomicrobiales ............................................     4 hits    2 orgs [Verrucomicrobiae]
. . . . bacterium Ellin514 ..........................................     2 hits    1 orgs [Verrucomicrobia subdivision 3]
. . . . Verrucomicrobium spinosum DSM 4136 ..........................     2 hits    1 orgs [Verrucomicrobiaceae; Verrucomicrobium; Verrucomicrobium spinosum]
. . . Methylacidiphilum infernorum V4 ...............................     2 hits    1 orgs [unclassified Verrucomicrobia; Methylacidiphilales; Methylacidiphilaceae; Methylacidiphilum; Methylacidiphilum infernorum]
. . Planctomycetaceae ...............................................    10 hits    5 orgs [Planctomycetes; Planctomycetacia; Planctomycetales]
. . . Blastopirellula marina DSM 3645 ...............................     2 hits    1 orgs [Blastopirellula; Blastopirellula marina]
. . . Planctomyces ..................................................     4 hits    2 orgs 
. . . . Planctomyces brasiliensis DSM 5305 ..........................     2 hits    1 orgs [Planctomyces brasiliensis]
. . . . Planctomyces limnophilus DSM 3776 ...........................     2 hits    1 orgs [Planctomyces limnophilus]
. . . Pirellula staleyi DSM 6068 ....................................     2 hits    1 orgs [Pirellula; Pirellula staleyi]
. . . Isosphaera pallida ATCC 43644 .................................     2 hits    1 orgs [Isosphaera; Isosphaera pallida]
. . Chloroflexi .....................................................     4 hits    2 orgs 
. . . Ktedonobacter racemifer DSM 44963 .............................     2 hits    1 orgs [Ktedonobacteria; Ktedonobacterales; Ktedonobacteraceae; Ktedonobacter; Ktedonobacter racemifer]
. . . Herpetosiphon aurantiacus ATCC 23779 ..........................     2 hits    1 orgs [Chloroflexi (class); Herpetosiphonales; Herpetosiphonaceae; Herpetosiphon; Herpetosiphon aurantiacus]
. . Bacteroidetes ...................................................     6 hits    3 orgs [Bacteroidetes/Chlorobi group]
. . . Algoriphagus sp. PR1 ..........................................     2 hits    1 orgs [Cytophagia; Cytophagales; Cyclobacteriaceae; Algoriphagus]
. . . Flavobacteriaceae .............................................     4 hits    2 orgs [Flavobacteria; Flavobacteriales]
. . . . Dokdonia donghaensis MED134 .................................     2 hits    1 orgs [Dokdonia; Dokdonia donghaensis]
. . . . Maribacter sp. HTCC2170 .....................................     2 hits    1 orgs [Maribacter]
. . Firmicutes ......................................................    22 hits   12 orgs 
. . . Clostridia ....................................................    16 hits    9 orgs 
. . . . Clostridiales ...............................................     9 hits    5 orgs 
. . . . . Peptococcaceae ............................................     4 hits    2 orgs 
. . . . . . Desulfotomaculum acetoxidans DSM 771 ....................     2 hits    1 orgs [Desulfotomaculum; Desulfotomaculum acetoxidans]
. . . . . . Pelotomaculum thermopropionicum SI ......................     2 hits    1 orgs [Pelotomaculum; Pelotomaculum thermopropionicum]
. . . . . Butyrivibrio proteoclasticus B316 .........................     2 hits    1 orgs [Lachnospiraceae; Butyrivibrio; Clostridium proteoclasticum]
. . . . . Clostridium ...............................................     3 hits    2 orgs [Clostridiaceae]
. . . . . . Clostridium sp. M62/1 ...................................     2 hits    1 orgs 
. . . . . . Clostridium cf. saccharolyticum K10 .....................     1 hits    1 orgs [Clostridium saccharolyticum]
. . . . Halanaerobium praevalens DSM 2228 ...........................     1 hits    1 orgs [Halanaerobiales; Halanaerobiaceae; Halanaerobium; Halanaerobium praevalens]
. . . . Thermoanaerobacterales ......................................     6 hits    3 orgs 
. . . . . Moorella group ............................................     4 hits    2 orgs [Thermoanaerobacteraceae]
. . . . . . Moorella thermoacetica ATCC 39073 .......................     2 hits    1 orgs [Moorella; Moorella thermoacetica]
. . . . . . Ammonifex degensii KC4 ..................................     2 hits    1 orgs [Ammonifex; Ammonifex degensii]
. . . . . Caldicellulosiruptor lactoaceticus 6A .....................     2 hits    1 orgs [Thermoanaerobacterales Family III. Incertae Sedis; Caldicellulosiruptor; Caldicellulosiruptor lactoaceticus]
. . . Bacillus ......................................................     6 hits    3 orgs [Bacilli; Bacillales; Bacillaceae]
. . . . Bacillus cellulosilyticus DSM 2522 ..........................     2 hits    1 orgs [Bacillus cellulosilyticus]
. . . . Bacillus subtilis subsp. spizizenii .........................     4 hits    2 orgs [Bacillus subtilis group; Bacillus subtilis]
. . . . . Bacillus subtilis subsp. spizizenii ATCC 6633 .............     2 hits    1 orgs 
. . . . . Bacillus subtilis subsp. spizizenii str. W23 ..............     2 hits    1 orgs 
. Eukaryota .........................................................    38 hits   18 orgs 
. . Viridiplantae ...................................................    17 hits    5 orgs 
. . . Chlamydomonadales .............................................     6 hits    2 orgs [Chlorophyta; Chlorophyceae]
. . . . Chlamydomonas reinhardtii ...................................     2 hits    1 orgs [Chlamydomonadaceae; Chlamydomonas]
. . . . Volvox carteri f. nagariensis ...............................     4 hits    1 orgs [Volvocaceae; Volvox; Volvox carteri]
. . . rosids ........................................................    11 hits    3 orgs [Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliophyta; eudicotyledons; core eudicotyledons]
. . . . Vitis vinifera ..............................................     3 hits    1 orgs [rosids incertae sedis; Vitales; Vitaceae; Vitis]
. . . . Malpighiales ................................................     8 hits    2 orgs [fabids]
. . . . . Ricinus communis ..........................................     2 hits    1 orgs [Euphorbiaceae; Acalyphoideae; Acalypheae; Ricinus]
. . . . . Populus trichocarpa .......................................     6 hits    1 orgs [Salicaceae; Saliceae; Populus]
. . Aureococcus anophagefferens .....................................     2 hits    1 orgs [stramenopiles; Pelagophyceae; Aureococcus]
. . Fungi/Metazoa group .............................................    19 hits   12 orgs 
. . . Bilateria .....................................................    17 hits   11 orgs [Metazoa; Eumetazoa]
. . . . Coelomata ...................................................    13 hits    8 orgs 
. . . . . Deuterostomia .............................................    10 hits    5 orgs 
. . . . . . Chordata ................................................     8 hits    4 orgs 
. . . . . . . Branchiostoma floridae ................................     2 hits    1 orgs [Cephalochordata; Branchiostomidae; Branchiostoma]
. . . . . . . Euteleostomi ..........................................     5 hits    2 orgs [Craniata; Vertebrata; Gnathostomata; Teleostomi]
. . . . . . . . Tetraodon nigroviridis ..............................     2 hits    1 orgs [Actinopterygii; Actinopteri; Neopterygii; Teleostei; Elopocephala; Clupeocephala; Euteleostei; Neognathi; Neoteleostei; Eurypterygii; Ctenosquamata; Acanthomorpha; Euacanthomorpha; Holacanthopterygii; Acanthopterygii; Euacanthopterygii; Percomorpha; Tetraodontiformes; Tetraodontoidei; Tetradontoidea; Tetraodontidae; Tetraodon]
. . . . . . . . Xenopus (Silurana) tropicalis .......................     3 hits    1 orgs [Sarcopterygii; Tetrapoda; Amphibia; Batrachia; Anura; Mesobatrachia; Pipoidea; Pipidae; Xenopodinae; Xenopus; Silurana]
. . . . . . . Oikopleura dioica .....................................     1 hits    1 orgs [Tunicata; Appendicularia; Oikopleuridae; Oikopleura]
. . . . . . Strongylocentrotus purpuratus ...........................     2 hits    1 orgs [Echinodermata; Eleutherozoa; Echinozoa; Echinoidea; Euechinoidea; Echinacea; Echinoida; Strongylocentrotidae; Strongylocentrotus]
. . . . . Aculeata ..................................................     3 hits    3 orgs [Protostomia; Panarthropoda; Arthropoda; Mandibulata; Pancrustacea; Hexapoda; Insecta; Dicondylia; Pterygota; Neoptera; Endopterygota; Hymenoptera; Apocrita]
. . . . . . Formicidae ..............................................     2 hits    2 orgs [Vespoidea]
. . . . . . . Harpegnathos saltator .................................     1 hits    1 orgs [Ponerinae; Ponerini; Harpegnathos]
. . . . . . . Solenopsis invicta ....................................     1 hits    1 orgs [Myrmicinae; Solenopsidini; Solenopsis]
. . . . . . Apis mellifera ..........................................     1 hits    1 orgs [Apoidea; Apidae; Apinae; Apini; Apis]
. . . . Caenorhabditis ..............................................     4 hits    3 orgs [Pseudocoelomata; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; Rhabditidae; Peloderinae]
. . . . . Caenorhabditis remanei ....................................     2 hits    1 orgs 
. . . . . Caenorhabditis briggsae ...................................     2 hits    2 orgs 
. . . . . . Caenorhabditis briggsae AF16 ............................     1 hits    1 orgs 
. . . Nectria haematococca mpVI 77-13-4 .............................     2 hits    1 orgs [Fungi; Dikarya; Ascomycota; saccharomyceta; Pezizomycotina; leotiomyceta; sordariomyceta; Sordariomycetes; Hypocreomycetidae; Hypocreales; Nectriaceae; Nectria; Nectria haematococca complex; Nectria haematococca; Nectria haematococca mpVI]
. Euryarchaeota .....................................................     6 hits    3 orgs [Archaea]
. . Methanococcoides burtonii DSM 6242 ..............................     2 hits    1 orgs [Methanomicrobia; Methanosarcinales; Methanosarcinaceae; Methanococcoides; Methanococcoides burtonii]
. . Halobacteriaceae ................................................     4 hits    2 orgs [Halobacteria; Halobacteriales]
. . . Haloarcula marismortui ATCC 43049 .............................     2 hits    1 orgs [Haloarcula; Haloarcula marismortui]
. . . Haloterrigena turkmenica DSM 5511 .............................     2 hits    1 orgs [Haloterrigena; Haloterrigena turkmenica]

BLAST

PROTOCOL


a) BLASTp versus NR, NCBI default parameters apart from "Number of descriptions" = 1000


b) BLASTp versus SWISSPROT, NCBI default parameters apart from "Number of descriptions" = 1000



RESULTS ANALYSIS



The scores and the number of hits are low (below 200 and 500, respectively), but the e-values are low enough to be trusted (E << 1e-2).

Almost all of the homologs found are hypothetical proteins or proteins of unknown function, so I cannot infer the protein's function from them. However, I can conclude that the sequence probably has a significant function; otherwise it would not have been conserved throughout evolution. This suggests that the protein encoded by GOS_2185010 plays an important role in the cells where it occurs.


There are two homologous sequences in BLASTp versus SWISSPROT, but both are uncharacterized proteins.

The NR and SWISSPROT results do not agree because most of the e-values obtained against SWISSPROT are not significant.

I will use BLASTp vs NR from now on because the aim is to find all possible homologs for a phylogenetic analysis.
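
The e-value criterion used in this analysis (E << 1e-2) can be illustrated with a small filter over hit-summary lines. This is only a minimal sketch, not part of the original protocol: the function names parse_hit and significant_hits are hypothetical, the 1e-2 cut-off is taken from the text above, and the parser assumes the plain-text layout of the raw listing, where the last two columns are the bit score and the e-value.

```python
def parse_hit(line):
    """Split one hit-summary line into (accession, description, bit score, e-value).

    Assumes the last two whitespace-separated columns are the bit score
    and the e-value, as in the plain-text BLAST hit listing.
    """
    head, score, evalue = line.rsplit(None, 2)
    accession, _, description = head.partition(" ")
    # Some older BLAST text reports abbreviate 1e-100 as "e-100".
    if evalue.startswith("e"):
        evalue = "1" + evalue
    return accession, description.strip(), float(score), float(evalue)


def significant_hits(lines, max_evalue=1e-2):
    """Keep only the hits whose e-value is below max_evalue (assumed cut-off)."""
    hits = []
    for line in lines:
        line = line.strip()
        if "|" not in line:  # skip column headers and blank lines
            continue
        hit = parse_hit(line)
        if hit[3] < max_evalue:
            hits.append(hit)
    return hits


# Two lines copied from the BLASTp versus NR listing.
example = [
    "ref|YP_002299266.1|  hypothetical protein RC1_3089 [Rhodospiri...   118    9e-25",
    "ref|YP_003461793.1|  FIST C domain protein [Thioalkalivibrio s...  99.8    3e-19",
]

for accession, description, score, evalue in significant_hits(example):
    print(accession, score, evalue)
```

Applied to the SWISSPROT listing, the same filter would be expected to discard most hits, which is consistent with the decision to continue with the NR results.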



RAW RESULTS

a)
                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

ref|YP_002299266.1|  hypothetical protein RC1_3089 [Rhodospiri...   118    9e-25
ref|ZP_01912048.1|  hypothetical protein PPSIR1_16925 [Plesioc...   116    3e-24
ref|ZP_00055761.1|  COG4398: Uncharacterized protein conserved...   113    3e-23
ref|NP_923772.1|  hypothetical protein gll0826 [Gloeobacter vi...   112    6e-23
ref|YP_423691.1|  hypothetical protein amb4328 [Magnetospirill...   111    1e-22
ref|YP_873649.1|  hypothetical protein Acel_1891 [Acidothermus...   111    1e-22
ref|YP_003796027.1|  hypothetical protein NIDE0322 [Candidatus...   108    1e-21
ref|YP_001381206.1|  hypothetical protein Anae109_4044 [Anaero...   106    3e-21
ref|YP_003343943.1|  hypothetical protein Sros_8558 [Streptosp...   105    6e-21
ref|YP_002219557.1|  hypothetical protein Lferr_1108 [Acidithi...   105    9e-21
ref|YP_988070.1|  hypothetical protein Ajs_3887 [Acidovorax sp...   103    2e-20
ref|ZP_02002961.1|  conserved hypothetical protein [Beggiatoa ...   102    5e-20
ref|YP_728373.1|  hypothetical protein H16_B0207 [Ralstonia eu...   102    5e-20
ref|YP_003447216.1|  hypothetical protein AZL_000340 [Azospiri...   101    1e-19
ref|YP_999639.1|  hypothetical protein Veis_4934 [Verminephrob...   101    1e-19
ref|YP_003267641.1|  domain of unknown function DUF1745 [Halia...   100    2e-19
ref|YP_003461793.1|  FIST C domain protein [Thioalkalivibrio s...  99.8    3e-19
ref|YP_931645.1|  hypothetical protein azo0140 [Azoarcus sp. B...  99.4    5e-19
ref|ZP_04764088.1|  protein of unknown function DUF1745 [Acido...  97.4    2e-18
ref|YP_980550.1|  hypothetical protein Pnap_0306 [Polaromonas ...  97.1    2e-18
ref|YP_003653700.1|  hypothetical protein Tbis_3113 [Thermobis...  97.1    2e-18
ref|ZP_04748039.1|  hypothetical protein MkanA1_08708 [Mycobac...  96.7    3e-18
ref|YP_299770.1|  hypothetical protein Reut_B5581 [Ralstonia e...  96.3    4e-18
ref|YP_002515297.1|  domain of unknown function DUF1745 [Thioa...  96.3    4e-18
emb|CBA32316.1|  hypothetical protein Csp_D31530 [Curvibacter ...  95.1    9e-18
ref|YP_521904.1|  hypothetical protein Rfer_0622 [Rhodoferax f...  94.7    1e-17
ref|YP_972819.1|  hypothetical protein Aave_4508 [Acidovorax a...  94.7    1e-17
ref|ZP_03423779.1|  hypothetical protein MtubT9_05535 [Mycobac...  93.6    2e-17
ref|YP_003678048.1|  domain of unknown function DUF1745 [Nocar...  93.2    3e-17
ref|YP_478038.1|  hypothetical protein CYB_1819 [Synechococcus...  93.2    3e-17
ref|ZP_07434680.1|  hypothetical protein TMFG_03295 [Mycobacte...  92.8    4e-17
gb|ADX48294.1|  domain of unknown function DUF1745 [Acidovorax...  92.8    4e-17
ref|ZP_03414580.1|  hypothetical protein Mtub0_01637 [Mycobact...  92.8    4e-17
ref|YP_475304.1|  hypothetical protein CYA_1894 [Synechococcus...  92.4    5e-17
ref|ZP_07413073.1|  hypothetical protein TMAG_02508 [Mycobacte...  92.4    6e-17
ref|ZP_03535547.1|  hypothetical protein MtubT1_03865 [Mycobac...  92.4    6e-17
ref|ZP_05291766.1|  hypothetical protein ACA_2647 [Acidithioba...  92.0    7e-17
ref|ZP_07487733.1|  hypothetical protein TMKG_03909 [Mycobacte...  92.0    7e-17
ref|ZP_06453457.1|  LOW QUALITY PROTEIN: conserved hypothetica...  92.0    7e-17
ref|NP_215142.1|  hypothetical protein Rv0628c [Mycobacterium ...  91.7    9e-17
ref|ZP_05771286.1|  hypothetical protein MtubK8_05719 [Mycobac...  91.7    9e-17
ref|ZP_04924289.1|  conserved hypothetical protein [Mycobacter...  91.7    1e-16
ref|ZP_07492241.2|  hypothetical protein TMLG_03378 [Mycobacte...  91.3    1e-16
ref|YP_003298679.1|  hypothetical protein Tcur_1055 [Thermomon...  90.5    2e-16
ref|YP_001792284.1|  hypothetical protein Lcho_3261 [Leptothri...  89.7    3e-16
ref|ZP_07112126.1|  conserved hypothetical protein [Oscillator...  88.2    9e-16
ref|YP_004128546.1|  hypothetical protein Alide_3953 [Alicycli...  88.2    1e-15
ref|YP_314647.1|  hypothetical protein Tbd_0889 [Thiobacillus ...  87.8    1e-15
ref|YP_547312.1|  hypothetical protein Bpro_0450 [Polaromonas ...  87.0    3e-15
ref|ZP_01472900.1|  hypothetical protein RS9916_39286 [Synecho...  86.7    3e-15
ref|ZP_01080920.1|  hypothetical protein RS9917_03688 [Synecho...  86.3    4e-15
ref|YP_003549120.1|  domain of unknown function DUF1745 [Coral...  85.9    5e-15
ref|ZP_01088597.1|  hypothetical protein DSM3645_08962 [Blasto...  85.9    5e-15
ref|ZP_01629844.1|  hypothetical protein N9414_22128 [Nodulari...  85.5    6e-15
ref|YP_001022824.1|  hypothetical protein Mpe_A3636 [Methylibi...  85.1    8e-15
ref|YP_321568.1|  hypothetical protein Ava_1049 [Anabaena vari...  84.7    1e-14
ref|YP_004272173.1|  domain of unknown function DUF1745 [Planc...  84.7    1e-14
ref|YP_003395047.1|  domain of unknown function DUF1745 [Conex...  84.7    1e-14
ref|YP_001417805.1|  hypothetical protein Xaut_2912 [Xanthobac...  84.7    1e-14
ref|ZP_03631608.1|  protein of unknown function DUF1745 [bacte...  84.3    1e-14
ref|YP_001617600.1|  hypothetical protein sce6951 [Sorangium c...  84.3    2e-14
ref|ZP_06890313.1|  protein of unknown function DUF1745 [Methy...  84.0    2e-14
ref|NP_486891.1|  hypothetical protein alr2851 [Nostoc sp. PCC...  84.0    2e-14
emb|CAM74134.1|  protein conserved in bacteria [Magnetospirill...  83.2    4e-14
ref|ZP_03419268.1|  hypothetical protein Mtub9_03837 [Mycobact...  82.4    5e-14
ref|ZP_03414856.1|  hypothetical protein Mtub0_03063 [Mycobact...  82.4    6e-14
ref|YP_001864412.1|  hypothetical protein Npun_F0717 [Nostoc p...  82.0    7e-14
ref|NP_335325.1|  hypothetical protein MT0897 [Mycobacterium t...  82.0    7e-14
ref|NP_215389.1|  hypothetical protein Rv0874c [Mycobacterium ...  82.0    8e-14
ref|YP_002354113.1|  domain of unknown function DUF1745 [Thaue...  81.3    1e-13
ref|ZP_06970289.1|  protein of unknown function DUF1745 [Ktedo...  80.9    1e-13
ref|YP_003720872.1|  hypothetical protein Aazo_1561 ['Nostoc a...  80.9    2e-13
ref|YP_722964.1|  hypothetical protein Tery_3390 [Trichodesmiu...  80.5    2e-13
ref|YP_004153098.1|  hypothetical protein Varpa_0769 [Variovor...  79.7    4e-13
ref|ZP_06380305.1|  hypothetical protein AplaP_01340 [Arthrosp...  79.3    5e-13
ref|YP_002942669.1|  hypothetical protein Vapar_0750 [Variovor...  79.3    5e-13
ref|YP_003368908.1|  hypothetical protein Psta_0358 [Pirellula...  78.2    9e-13
ref|YP_001659254.1|  hypothetical protein MAE_42400 [Microcyst...  77.8    1e-12
ref|YP_288505.1|  hypothetical protein Tfu_0444 [Thermobifida ...  77.4    2e-12
ref|YP_001517878.1|  hypothetical protein AM1_3572 [Acaryochlo...  77.0    2e-12
ref|ZP_05790775.1|  conserved hypothetical protein [Synechococ...  76.6    3e-12
ref|ZP_05025912.1|  conserved domain protein [Microcoleus chth...  75.9    5e-12
ref|ZP_01619015.1|  hypothetical protein L8106_01732 [Lyngbya ...  75.9    5e-12
ref|ZP_03272861.1|  protein of unknown function DUF1745 [Arthr...  75.1    8e-12
ref|ZP_01469352.1|  hypothetical protein BL107_08029 [Synechoc...  75.1    8e-12
gb|ADI22910.1|  uncharacterized protein conserved in bacteria ...  75.1    9e-12
ref|YP_001804802.1|  hypothetical protein cce_3388 [Cyanothece...  75.1    1e-11
ref|ZP_05044208.1|  conserved domain protein [Cyanobium sp. PC...  74.7    1e-11
gb|AAB08473.1|  putative protein [Synechococcus elongatus PCC ...  74.3    1e-11
ref|YP_729563.1|  hypothetical protein sync_0329 [Synechococcu...  73.9    2e-11
ref|ZP_07974667.1|  hypothetical protein SCB01_13439 [Synechoc...  73.9    2e-11
ref|YP_003629433.1|  hypothetical protein Plim_1400 [Planctomy...  73.9    2e-11
ref|NP_896381.1|  hypothetical protein SYNW0286 [Synechococcus...  73.6    2e-11
ref|YP_380612.1|  hypothetical protein Syncc9605_0281 [Synecho...  73.6    2e-11
ref|YP_378064.1|  hypothetical protein Syncc9902_2063 [Synecho...  73.6    2e-11
ref|YP_171863.1|  hypothetical protein syc1153_d [Synechococcu...  73.6    3e-11
ref|ZP_01084758.1|  hypothetical protein WH5701_01325 [Synecho...  73.6    3e-11
ref|ZP_01730197.1|  hypothetical protein CY0110_28949 [Cyanoth...  72.8    4e-11
ref|YP_003888287.1|  hypothetical protein Cyan7822_3057 [Cyano...  72.4    6e-11
ref|ZP_00515460.1|  similar to Uncharacterized protein conserv...  72.4    7e-11
ref|ZP_02926352.1|  hypothetical protein VspiD_06900 [Verrucom...  71.6    9e-11
emb|CAO87243.1|  unnamed protein product [Microcystis aerugino...  71.2    1e-10
ref|ZP_03531069.1|  hypothetical protein MtubG1_01980 [Mycobac...  71.2    1e-10
ref|YP_001228471.1|  hypothetical protein SynRCC307_2215 [Syne...  70.9    2e-10
ref|YP_004040188.1|  hypothetical protein MPQ_1799 [Methylovor...  70.9    2e-10
ref|YP_001018434.1|  hypothetical protein P9303_24381 [Prochlo...  70.9    2e-10
ref|ZP_02537784.1|  hypothetical protein Epers_31631 [Endorift...  70.5    2e-10
ref|NP_895646.1|  hypothetical protein PMT1819 [Prochlorococcu...  70.5    2e-10
ref|YP_003051568.1|  hypothetical protein Msip34_1796 [Methylo...  70.5    2e-10
ref|ZP_07719104.1|  hypothetical protein ALPR1_02485 [Algoriph...  70.1    3e-10
ref|YP_001734438.1|  hypothetical protein SYNPCC7002_A1183 [Sy...  70.1    3e-10
ref|YP_003690056.1|  domain of unknown function DUF1745 [Desul...  69.7    4e-10
ref|YP_002484915.1|  hypothetical protein Cyan7425_4241 [Cyano...  69.3    4e-10
ref|ZP_01914162.1|  hypothetical protein LMED105_02585 [Limnob...  68.9    6e-10
ref|YP_002379319.1|  domain of unknown function DUF1745 [Cyano...  68.6    9e-10
ref|YP_002537285.1|  domain of unknown function DUF1745 [Geoba...  67.8    1e-09
ref|YP_002371146.1|  hypothetical protein PCC8801_0913 [Cyanot...  67.4    2e-09
ref|YP_003675320.1|  FIST C domain protein [Methylotenera sp. ...  65.5    7e-09
ref|YP_001561558.1|  hypothetical protein Daci_0527 [Delftia a...  64.7    1e-08
ref|ZP_01123502.1|  hypothetical protein WH7805_07511 [Synecho...  64.7    1e-08
ref|YP_004178630.1|  hypothetical protein Isop_1496 [Isosphaer...  64.7    1e-08
ref|YP_001224055.1|  hypothetical protein SynWH7803_0332 [Syne...  63.9    2e-08
ref|ZP_07970043.1|  hypothetical protein SCB02_03858 [Synechoc...  63.9    2e-08
ref|YP_001230880.1|  hypothetical protein Gura_2119 [Geobacter...  63.9    2e-08
ref|YP_004201181.1|  hypothetical protein GM18_4501 [Geobacter...  63.2    3e-08
ref|YP_002139664.1|  FIST domain-containing protein [Geobacter...  62.0    7e-08
ref|NP_442809.1|  hypothetical protein sll0524 [Synechocystis ...  62.0    9e-08
ref|ZP_05035696.1|  conserved domain protein [Synechococcus sp...  61.6    1e-07
ref|YP_002140826.1|  FIST domain-containing protein [Geobacter...  61.6    1e-07
ref|YP_003276414.1|  hypothetical protein CtCNB1_0372 [Comamon...  61.2    1e-07
ref|YP_003021169.1|  domain of unknown function DUF1745 [Geoba...  61.2    1e-07
ref|ZP_03545489.1|  protein of unknown function DUF1745 [Comam...  60.5    2e-07
ref|YP_003049391.1|  hypothetical protein Mmol_1961 [Methylote...  59.3    5e-07
ref|YP_003023904.1|  domain of unknown function DUF1745 [Geoba...  59.3    5e-07
ref|YP_004200527.1|  hypothetical protein GM18_3825 [Geobacter...  57.0    2e-06
ref|ZP_02930760.1|  hypothetical protein VspiD_28985 [Verrucom...  56.6    3e-06
ref|YP_284380.1|  hypothetical protein Daro_1154 [Dechloromona...  56.2    4e-06
ref|ZP_07667983.1|  hypothetical protein TMGG_02939 [Mycobacte...  55.8    6e-06
ref|ZP_06508773.1|  conserved hypothetical protein [Mycobacter...  55.1    1e-05
ref|ZP_03424057.1|  hypothetical protein MtubT9_07011 [Mycobac...  54.7    1e-05
ref|XP_001700835.1|  hypothetical protein CHLREDRAFT_187500 [C...  54.7    1e-05
ref|YP_001940493.1|  hypothetical protein Minf_1841 [Methylaci...  52.8    5e-05
ref|YP_003191633.1|  diguanylate cyclase [Desulfotomaculum ace...  50.8    2e-04
ref|ZP_05072560.1|  conserved domain protein [Campylobacterale...  50.4    2e-04
ref|YP_544961.1|  hypothetical protein Mfla_0852 [Methylobacil...  50.4    3e-04
gb|ADO76354.1|  diguanylate cyclase [Halanaerobium praevalens ...  50.1    3e-04
ref|YP_846772.1|  hypothetical protein Sfum_2659 [Syntrophobac...  49.3    5e-04
gb|EGB10503.1|  hypothetical protein AURANDRAFT_62534 [Aureoco...  48.9    6e-04
ref|ZP_01066475.1|  GGDEF family protein [Vibrio sp. MED222] >...  48.9    7e-04
ref|YP_001517594.1|  hypothetical protein AM1_3284 [Acaryochlo...  47.8    0.001
ref|XP_002954384.1|  hypothetical protein VOLCADRAFT_118736 [V...  47.0    0.002
ref|YP_002418560.1|  GGDEF family protein [Vibrio splendidus L...  47.0    0.002
ref|ZP_08019152.1|  conjugal transfer protein [Lautropia mirab...  46.6    0.003
ref|ZP_04579904.1|  predicted protein [Oxalobacter formigenes ...  46.6    0.004
ref|YP_004095055.1|  FIST C domain [Bacillus cellulosilyticus ...  46.6    0.004
ref|YP_001544045.1|  hypothetical protein Haur_1269 [Herpetosi...  44.7    0.013
ref|YP_003051527.1|  domain of unknown function DUF1745 [Methy...  44.3    0.017
emb|CBX30267.1|  hypothetical protein N47_D30760 [uncultured D...  44.3    0.017
ref|YP_565606.1|  hypothetical protein Mbur_0911 [Methanococco...  44.3    0.018
ref|YP_003168159.1|  hypothetical protein CAP2UW1_2952 [Candid...  43.9    0.023
ref|XP_002609756.1|  hypothetical protein BRAFLDRAFT_122089 [B...  43.9    0.024
ref|YP_003674596.1|  domain of unknown function DUF1745 [Methy...  43.1    0.039
emb|CBA29234.1|  hypothetical protein Csp_A11120 [Curvibacter ...  42.7    0.047
ref|YP_004040145.1|  domain of unknown function duf1745 [Methy...  42.7    0.053
ref|YP_001870165.1|  putative transcriptional regulator [Nosto...  42.4    0.070
ref|YP_003848332.1|  hypothetical protein Galf_2569 [Gallionel...  42.0    0.086
ref|ZP_05099333.1|  conserved domain protein [Roseobacter sp. ...  42.0    0.094
emb|CAN79767.1|  hypothetical protein VITISV_019403 [Vitis vin...  41.6    0.10 
emb|CBI15680.3|  unnamed protein product [Vitis vinifera]          41.6    0.11 
ref|XP_002283895.1|  PREDICTED: hypothetical protein [Vitis vi...  41.2    0.13 
ref|ZP_06368249.1|  protein serine/threonine phosphatase [Desu...  41.2    0.15 
emb|CAG02350.1|  unnamed protein product [Tetraodon nigroviridis]  40.8    0.20 
ref|YP_001210861.1|  hypothetical protein PTH_0311 [Pelotomacu...  40.4    0.23 
emb|CAF91713.1|  unnamed protein product [Tetraodon nigroviridis]  40.4    0.26 
ref|ZP_00992308.1|  GGDEF family protein [Vibrio splendidus 12...  40.4    0.26 
ref|XP_002955138.1|  hypothetical protein VOLCADRAFT_118962 [V...  40.4    0.26 
ref|ZP_01049207.1|  conserved hypothetical protein [Dokdonia d...  40.4    0.27 
ref|ZP_01815145.1|  GGDEF family protein [Vibrionales bacteriu...  40.0    0.28 
ref|XP_002511159.1|  conserved hypothetical protein [Ricinus c...  40.0    0.30 
gb|EFN83047.1|  F-box only protein 22 [Harpegnathos saltator]      40.0    0.32 
ref|YP_429553.1|  hypothetical protein Moth_0690 [Moorella the...  40.0    0.32 
gb|EGB05124.1|  hypothetical protein AURANDRAFT_66694 [Aureoco...  40.0    0.35 
ref|ZP_00054896.2|  COG3287: Uncharacterized conserved protein...  39.3    0.48 
ref|NP_001072897.1|  F-box protein 22 [Xenopus (Silurana) trop...  39.3    0.50 
ref|ZP_08096939.1|  hypothetical protein VIBR0546_01144 [Vibri...  38.9    0.68 
ref|ZP_01909114.1|  hypothetical protein PPSIR1_19394 [Plesioc...  38.5    0.90 
ref|YP_545350.1|  hypothetical protein Mfla_1241 [Methylobacil...  38.1    1.1  
ref|ZP_07738310.1|  FIST C domain [Caldicellulosiruptor lactoa...  37.7    1.6  
ref|ZP_05038897.1|  conserved domain protein [Synechococcus sp...  37.0    2.4  
ref|XP_002317146.1|  predicted protein [Populus trichocarpa] >...  37.0    2.8  
ref|YP_001615886.1|  hypothetical protein sce5243 [Sorangium c...  37.0    3.0  
ref|YP_527371.1|  hypothetical protein Sde_1899 [Saccharophagu...  36.6    3.4  
ref|YP_003830357.1|  GGDEF domain-containing protein [Butyrivi...  36.6    3.6  
ref|YP_003860904.1|  endo-1,4-beta-xylanase B [Maribacter sp. ...  36.6    3.8  
ref|YP_003048728.1|  domain of unknown function DUF1745 [Methy...  36.6    3.9  
ref|XP_003045801.1|  predicted protein [Nectria haematococca m...  36.2    4.2  
ref|XP_001122526.1|  PREDICTED: similar to F-box only protein ...  36.2    4.4  
ref|ZP_06346762.1|  acetyl-CoA carboxylase, carboxyl transfera...  36.2    4.6  
ref|YP_004187575.1|  hypothetical protein VVM_00670 [Vibrio vu...  36.2    5.1  
emb|CBY10167.1|  unnamed protein product [Oikopleura dioica]       36.2    5.2  
ref|XP_003103571.1|  hypothetical protein CRE_28750 [Caenorhab...  36.2    5.2  
ref|XP_794413.1|  PREDICTED: similar to F-box only protein 22 ...  35.8    5.5  
ref|NP_760333.1|  hypothetical protein VV1_1420 [Vibrio vulnif...  35.8    5.5  
ref|ZP_06874357.1|  putative DNA-modified purine glycosidase [...  35.8    5.7  
ref|NP_935756.1|  hypothetical protein VV2963 [Vibrio vulnific...  35.8    5.8  
ref|YP_866851.1|  hypothetical protein Mmc1_2954 [Magnetococcu...  35.8    6.3  
ref|XP_002632404.1|  Hypothetical protein CBG00428 [Caenorhabd...  35.8    6.6  
gb|EFZ14140.1|  hypothetical protein SINV_16325 [Solenopsis in...  35.4    6.9  
ref|XP_002317147.1|  predicted protein [Populus trichocarpa] >...  35.4    7.0  
emb|CAP21880.2|  hypothetical protein CBG_00428 [Caenorhabditi...  35.4    7.2  
ref|ZP_03272220.1|  hypothetical protein AmaxDRAFT_1038 [Arthr...  35.4    7.9  
ref|YP_003239980.1|  domain of unknown function DUF1745 [Ammon...  35.4    8.5  
emb|CBK76062.1|  acetyl-CoA carboxylase carboxyltransferase su...  35.0    9.0  
ref|XP_002322271.1|  f-box family protein [Populus trichocarpa...  35.0    9.0  
ref|YP_135101.1|  hypothetical protein rrnAC0348 [Haloarcula m...  35.0    9.4  
ref|YP_003405192.1|  domain of unknown function DUF1745 [Halot...  35.0    9.4  
ref|YP_001349287.1|  hypothetical protein PSPA7_3933 [Pseudomo...  35.0    9.8  

ALIGNMENTS
>ref|YP_002299266.1| hypothetical protein RC1_3089 [Rhodospirillum centenum SW]
 gb|ACJ00454.1| conserved hypothetical protein [Rhodospirillum centenum SW]
Length=381

 Score =  118 bits (295),  Expect = 9e-25, Method: Compositional matrix adjust.
 Identities = 75/210 (36%), Positives = 104/210 (50%), Gaps = 11/210 (5%)

Query  55   VVVNGAGGMGFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASE  114
            V   G  G+ F        G+TQ C  IG    VTD E  ++  +DG PALE       E
Sbjct  178  VAQGGLSGVLFSDRVALATGLTQGCSPIGPTRTVTDAEASIIKEIDGRPALEALRHDVGE  237

Query  115  LEFDNLENAARQLLISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQFSQIIEKGTVVSF  174
            L  ++LE  A  +    P+   +        +VR+L GID  R  +   + + +G  + F
Sbjct  238  LLANDLERVAGYVFAGLPIAGSDTG----DYLVRNLLGIDPQRGWIAVGEPVARGRPLLF  293

Query  175  AYRSAISARDDLNAMLKRMKKE---KPPSFGIYFNCAARGEALYGKSDVDIQLIHEQLGD  231
              R   +A  DL  ML +MK+      P  G+Y +C ARG  L+G  D +++ I E LGD
Sbjct  294  CRRDRAAAEADLKRMLGQMKRRLGGGTPKGGVYISCIARGPGLFGDPDHELRAIAEHLGD  353

Query  232  FPLAGMFGGYEMARVPSGLQLYTYTGVLVL  261
            FPLAG F G E+    S  +LY+YTGVL L
Sbjct  354  FPLAGFFAGGEI----SHDRLYSYTGVLTL  379


>ref|ZP_01912048.1| hypothetical protein PPSIR1_16925 [Plesiocystis pacifica SIR-1]
 gb|EDM75004.1| hypothetical protein PPSIR1_16925 [Plesiocystis pacifica SIR-1]
Length=409

 Score =  116 bits (291),  Expect = 3e-24, Method: Compositional matrix adjust.
 Identities = 79/269 (30%), Positives = 126/269 (47%), Gaps = 10/269 (3%)

Query  1    GRKPLFLFFPDVYQHQPYNFINMLNYVQSDPMVFG---AGSCDDGSGRISVQFGAEGVVV  57
            G  PL + FPD +       +  L+       V G   +G    G  R+   F       
Sbjct  138  GPDPLLMLFPDPFSWPGPEVLGSLDRAFPQGTVVGGLASGGARPGEHRL---FCDRSTHH  194

Query  58   NGAGGMGFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASELEF  117
             G  G+  +G  E    V Q C+ +G PMFVT  + ++V  LDG PA+E   ++ + LE 
Sbjct  195  RGMVGLALRGNLEVETIVAQGCRPVGAPMFVTRRQANIVYELDGRPAVEALQQLFTTLEP  254

Query  118  DNLENAARQLLISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQFSQIIEKGTVVSFAYR  177
            D+   A   LLI   + P+    +    +VR+L G+D +   +  +  +    VV F  R
Sbjct  255  DDRARARTSLLIGLSMHPQLEVHDQGDFLVRNLIGVDPSSGAVGIAAELHGHPVVQFHLR  314

Query  178  SAISARD---DLNAMLKRMKKEKPPSFGIYFNCAARGEALYGKSDVDIQLIHEQLG-DFP  233
             A +A     DL A  +R+  E+ P+  + F+C  RGE LYG++  D +++ E LG   P
Sbjct  315  DAQTAASELHDLAAEHQRIHGERAPAVALLFSCLGRGEHLYGRTGHDSEVLREHLGATLP  374

Query  234  LAGMFGGYEMARVPSGLQLYTYTGVLVLI  262
            LAG F   E+  +     ++ YT  ++L+
Sbjct  375  LAGFFCNGEIGPIAGRTFMHGYTSSILLL  403


>ref|ZP_00055761.1| COG4398: Uncharacterized protein conserved in bacteria [Magnetospirillum 
magnetotacticum MS-1]
Length=370

 Score =  113 bits (282),  Expect = 3e-23, Method: Compositional matrix adjust.
 Identities = 71/199 (36%), Positives = 103/199 (52%), Gaps = 17/199 (8%)

Query  70   EFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASELEFDNLENAARQLLI  129
            E   G+TQ C  +G+   VT+    +V+ LDG PAL+V      EL   +L   A  + +
Sbjct  178  EVLTGMTQGCSPLGEVHTVTESWQGVVMALDGRPALDVLKEEVGELLARDLRRIAGYIHV  237

Query  130  SFPLDPEEPKFEGESS---MVRHLTGIDVARQGLQFSQIIEKGTVVSFAYRSAISARDDL  186
              P        EG+ S    VR L GID  +  +   + +E+G  + F  R A +AR DL
Sbjct  238  GLPA-------EGDDSHDYQVRTLIGIDPGQGWIAIGEHVEEGGRLIFVRRDANAARADL  290

Query  187  NAMLKRMKKE---KPPSFGIYFNCAARGEALYGKSDVDIQLIHEQLGDFPLAGMFGGYEM  243
              ML  +K+    +P   G+Y +C  RGE ++G  D + +L+HE LGDFPL G F   E+
Sbjct  291  RRMLIGLKERLDGRPIRAGLYVSCTGRGEYMFGHKDAEPELLHEVLGDFPLIGFFANGEI  350

Query  244  ARVPSGLQLYTYTGVLVLI  262
            +R      LY +TGVL L+
Sbjct  351  SRD----HLYGFTGVLTLL  365


>ref|NP_923772.1| hypothetical protein gll0826 [Gloeobacter violaceus PCC 7421]
 dbj|BAC88767.1| gll0826 [Gloeobacter violaceus PCC 7421]
Length=407

 Score =  112 bits (279),  Expect = 6e-23, Method: Compositional matrix adjust.
 Identities = 71/206 (35%), Positives = 105/206 (51%), Gaps = 4/206 (1%)

Query  55   VVVNGAGGMGFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASE  114
             V +GA G+   G     A V Q C+ +G+   +T  E +L+  LDG PAL+V   V  +
Sbjct  188  AVGSGAVGVVLAGDIAVEAAVAQGCRPVGETFQITRAEGNLLWELDGQPALQVLQTVLQQ  247

Query  115  LEFDNLENAARQLLISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQFSQIIEKGTVVSF  174
            L+ ++   A   L +   +       E    +VR+L G+D    GL   + +  G  V F
Sbjct  248  LDENDQRLARNALFVGVRMSEFHSGSEQGDFLVRNLMGVDSRTGGLAVGEWLRTGQTVRF  307

Query  175  AYRSAISARDDLNAMLKRMKKEK---PPSFGIYFNCAARGEALYGKSDVDIQLIHEQLGD  231
              R A ++RDDL  +L+R + E    PP+  + F+C  RGE+LYG+ DVD  L  + LG+
Sbjct  308  HLRDAATSRDDLQLVLQRHRLEHSGAPPAGALLFSCLGRGESLYGEPDVDSTLFAQVLGE  367

Query  232  -FPLAGMFGGYEMARVPSGLQLYTYT  256
              PLAG F   E+  V S   L+ YT
Sbjct  368  GVPLAGFFCNGEIGPVGSTTFLHGYT  393


>ref|YP_423691.1| hypothetical protein amb4328 [Magnetospirillum magneticum AMB-1]
 dbj|BAE53132.1| Uncharacterized protein conserved in bacteria [Magnetospirillum 
magneticum AMB-1]
Length=367

 Score =  111 bits (277),  Expect = 1e-22, Method: Compositional matrix adjust.
 Identities = 68/196 (35%), Positives = 98/196 (50%), Gaps = 11/196 (5%)

Query  70   EFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASELEFDNLENAARQLLI  129
            E   G+TQ C  +G    VT+    +V+ LDG PALEV      EL   +L   A  + +
Sbjct  175  EVLTGMTQGCSPLGAIHTVTESWQGVVMALDGRPALEVLKEEVGELLARDLRRIAGYIHV  234

Query  130  SFPLDPEEPKFEGESSMVRHLTGIDVARQGLQFSQIIEKGTVVSFAYRSAISARDDLNAM  189
              P + E+         VR L GID  +  +     +E+G  + F  R A +AR DL  M
Sbjct  235  GLPAEGEDD----HDYQVRTLIGIDPNQGWIAIGDHVEEGGTLMFVRRDANAARTDLRRM  290

Query  190  LKRMKKE---KPPSFGIYFNCAARGEALYGKSDVDIQLIHEQLGDFPLAGMFGGYEMARV  246
            L  +K+    +P   G+Y +C  RGE ++G    +  L+HE LGDFPL G F   E++R 
Sbjct  291  LTGLKERLDGRPIRAGLYVSCTGRGEYMFGHQGAEPDLLHEVLGDFPLIGYFANGEISRD  350

Query  247  PSGLQLYTYTGVLVLI  262
                 LY +TG+L L+
Sbjct  351  ----HLYGFTGILTLL  362


>ref|YP_873649.1| hypothetical protein Acel_1891 [Acidothermus cellulolyticus 11B]
 gb|ABK53663.1| domain of unknown function DUF1745 [Acidothermus cellulolyticus 
11B]
Length=391

 Score =  111 bits (277),  Expect = 1e-22, Method: Compositional matrix adjust.
 Identities = 80/260 (31%), Positives = 123/260 (48%), Gaps = 16/260 (6%)

Query  5    LFLFFPDVYQHQPYNFINMLNYVQSDPMVFGA--GSCDDGSGRIS-----VQFGAEGVVV  57
            L +   D Y      F+   N   S P+V G   G+   GS R+S     V+ GA GV++
Sbjct  126  LGIVLADPYSFPADGFVEQANRTVSVPLVGGMAFGAAGPGSTRLSLDRRSVERGAVGVLL  185

Query  58   NGAGGMGFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASELEF  117
             G  G+           V+Q C+ IG PM VT   D+++L L G PA+    RV +EL  
Sbjct  186  GGPVGV--------RTAVSQGCRPIGPPMTVTAARDNVLLELAGMPAVRKLERVLAELSA  237

Query  118  DNLENAARQLLISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQFSQIIEKGTVVSFAYR  177
            ++   A+  L I   +D      +    +VR + GID ARQG+    ++  G  V F  R
Sbjct  238  EDQALASAGLQIGIAMDEYAEDHDMGDFLVRGILGIDPARQGIAIGDVVPVGRTVRFHVR  297

Query  178  SAISARDDLNAMLKRMKKE-KPPSFGIYFNCAARGEALYGKSDVDIQLIHEQLGDFPLAG  236
             A SA DDL + +KR+++E       + F+C  RG  L+  +  D+ ++   LG   +AG
Sbjct  298  DAASAGDDLRSTVKRLREEFTAVESALLFSCNGRGSHLFPDAAHDVSVVRGVLGVQAVAG  357

Query  237  MFGGYEMARVPSGLQLYTYT  256
             F   E+  V     L+ ++
Sbjct  358  FFAAGEIGPVAGRTYLHGFS  377


>ref|YP_003796027.1| hypothetical protein NIDE0322 [Candidatus Nitrospira defluvii]
 emb|CBK40101.1| conserved exported protein of unknown function [Candidatus Nitrospira 
defluvii]
Length=408

 Score =  108 bits (269),  Expect = 1e-21, Method: Compositional matrix adjust.
 Identities = 65/209 (32%), Positives = 101/209 (49%), Gaps = 2/209 (0%)

Query  55   VVVNGAGGMGFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASE  114
            V  +G  G+   G       ++Q C+ IGD   VT  E +++  L G PAL     V  +
Sbjct  186  VYSDGLVGVALSGNISVRTVISQGCRPIGDRFIVTKAEHNVIQELGGIPALHCLQTVFGQ  245

Query  115  LEFDNLENAARQLLISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQFSQIIEKGTVVSF  174
            L  D    A R L I   +D +  +F     ++R+L G D     +    +I++G  V F
Sbjct  246  LSMDERAQAQRALHIGIAMDEQRAQFTRGDFLIRNLLGADQQTGAIVVGDVIQEGQTVQF  305

Query  175  AYRSAISARDDLNAML--KRMKKEKPPSFGIYFNCAARGEALYGKSDVDIQLIHEQLGDF  232
              R A SA +DL+A+L   R+ + + P   + F+C  RG+ L+G  + D  ++ EQLG  
Sbjct  306  QVRDAQSADEDLHALLAASRLDESQRPLGALLFSCCGRGKGLFGVPNHDASVLGEQLGAI  365

Query  233  PLAGMFGGYEMARVPSGLQLYTYTGVLVL  261
            PLAG F   E+  V     L+ YT  + +
Sbjct  366  PLAGFFAQGELGPVGGRNFLHGYTASIAI  394


>ref|YP_001381206.1| hypothetical protein Anae109_4044 [Anaeromyxobacter sp. Fw109-5]
 gb|ABS28222.1| domain of unknown function DUF1745 [Anaeromyxobacter sp. Fw109-5]
Length=401

 Score =  106 bits (265),  Expect = 3e-21, Method: Compositional matrix adjust.
 Identities = 70/212 (34%), Positives = 99/212 (47%), Gaps = 2/212 (0%)

Query  52   AEGVVVNGAGGMGFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRV  111
            AE V  NG  G+ F G  E    + Q C+ IG PM VT  +  ++  LDG P L+V + +
Sbjct  174  AEDVHRNGGVGVVFTGNLEVDTLIAQGCRAIGAPMLVTRCQHGVLQELDGRPPLQVIAEL  233

Query  112  ASELEFDNLENAARQLLISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQFSQIIEKGTV  171
             + LE  + E     L +   L  +E +F+    +VR+L G D     L     +   TV
Sbjct  234  YASLEPRDRELMQTSLFLGLELRSDEVEFQPGELLVRNLIGADEDTGALAVGAELRPLTV  293

Query  172  VSFAYRSAISARDDLNAMLKRMKKEKP--PSFGIYFNCAARGEALYGKSDVDIQLIHEQL  229
            V F  R A SA  +L  ML R ++     P+  + F+C  RG  L+G  D D  L  EQL
Sbjct  294  VQFVLRDAHSAEQELRRMLARHRRAATGRPAGALLFSCVGRGAGLFGHPDHDTSLFEEQL  353

Query  230  GDFPLAGMFGGYEMARVPSGLQLYTYTGVLVL  261
            G  PL G F   E+  V     ++ YT    +
Sbjct  354  GPAPLGGFFCNGEIGPVGGTTFVHGYTSAFAM  385


>ref|YP_003343943.1| hypothetical protein Sros_8558 [Streptosporangium roseum DSM 
43021]
 gb|ACZ91200.1| conserved hypothetical protein [Streptosporangium roseum DSM 
43021]
Length=398

 Score =  105 bits (262),  Expect = 6e-21, Method: Compositional matrix adjust.
 Identities = 75/258 (30%), Positives = 123/258 (48%), Gaps = 5/258 (1%)

Query  7    LFFPDVYQHQPYNFINMLNYVQSD-PMVFGAGSCDDGSGRISVQFGAEG-VVVNGAGGMG  64
            + F D Y      F+     V  D P++ G  +   G G  +V+  A+G +   GA G+ 
Sbjct  130  ILFADPYSFPTDGFVERSQEVLGDLPLIGGLANAIQGRG--AVRLFADGEIYTEGAVGVL  187

Query  65   FKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASELEFDNLENAA  124
              G    +  V+Q C+ IG  M VT  ED+L+L L G PAL     + S L+ D+ +  A
Sbjct  188  LSGPVNISTVVSQGCRPIGPTMAVTAVEDNLLLELAGQPALARLEEIVSALDEDDRDLVA  247

Query  125  RQLLISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQFSQIIEKGTVVSFAYRSAISARD  184
              L I   +D    + E    ++R + GID  R+ +    ++E G  V F  R A +A +
Sbjct  248  SGLQIGIAMDEYAERHERGDFLIRGVLGIDPEREAVAIGDVVEIGRTVRFQVRDAATADE  307

Query  185  DLNAMLKRMKKEKPPSFG-IYFNCAARGEALYGKSDVDIQLIHEQLGDFPLAGMFGGYEM  243
            DL  +L   ++E     G + F+C  RG A++G +D D   + + LG   +AG F   E+
Sbjct  308  DLYELLDAHREEFGRVDGALLFSCNGRGSAMFGTADHDAVALRDTLGPISVAGFFAAGEV  367

Query  244  ARVPSGLQLYTYTGVLVL  261
              V     ++ +T  +++
Sbjct  368  GPVGGHNHVHGFTASVLV  385


>ref|YP_002219557.1| hypothetical protein Lferr_1108 [Acidithiobacillus ferrooxidans 
ATCC 53993]
 ref|YP_002425819.1| hypothetical protein AFE_1387 [Acidithiobacillus ferrooxidans 
ATCC 23270]
 gb|ACH83350.1| domain of unknown function DUF1745 [Acidithiobacillus ferrooxidans 
ATCC 53993]
 gb|ACK80550.1| conserved hypothetical protein [Acidithiobacillus ferrooxidans 
ATCC 23270]
Length=378

 Score =  105 bits (261),  Expect = 9e-21, Method: Compositional matrix adjust.
 Identities = 66/210 (32%), Positives = 106/210 (51%), Gaps = 11/210 (5%)

Query  55   VVVNGAGGMGFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASE  114
            VV  G  G+ F         +TQ C  IG    + +  ++++  LD  PAL+VF+    +
Sbjct  170  VVHGGLSGVTFSENIAVATRLTQGCSPIGPIHHINEAHNNVISRLDDRPALDVFNDETCD  229

Query  115  LEFDNLENAARQLLISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQFSQIIEKGTVVSF  174
            +   +L+ AA  + ++ P+  ++        MVR L GID+ R+ L   +  E G  + F
Sbjct  230  ILSRDLQRAAGFIFVAMPVKNDDRG----DYMVRTLVGIDIERKLLAIGEYAEAGQSLMF  285

Query  175  AYRSAISARDDLNAMLK---RMKKEKPPSFGIYFNCAARGEALYGKSDVDIQLIHEQLGD  231
              R + +A +DL  ML    R+   +    G+YF C  RG+ L+G    ++++I + LGD
Sbjct  286  CKRDSGTAGEDLERMLSDISRLADGRKIRGGLYFTCVWRGQNLFGPDSAELRMIRDGLGD  345

Query  232  FPLAGMFGGYEMARVPSGLQLYTYTGVLVL  261
            FPL G F   E+    S  ++Y YTGVL L
Sbjct  346  FPLVGFFANAEI----SHDKIYGYTGVLTL  371


>ref|YP_988070.1| hypothetical protein Ajs_3887 [Acidovorax sp. JS42]
 ref|YP_002554593.1| hypothetical protein Dtpsy_3160 [Acidovorax ebreus TPSY]
 gb|ABM43994.1| domain of unknown function DUF1745 [Acidovorax sp. JS42]
 gb|ACM34593.1| domain of unknown function DUF1745 [Acidovorax ebreus TPSY]
Length=421

 Score =  103 bits (258),  Expect = 2e-20, Method: Compositional matrix adjust.
 Identities = 72/242 (30%), Positives = 111/242 (46%), Gaps = 38/242 (15%)

Query  53   EGVVVNGAGGMGFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEV-FSRV  111
            +GV+  G  G+ F    E  + VTQ C+ IG    +T  +D++VL LDG PAL+V    +
Sbjct  177  DGVLHGGLSGVAFDAQVELVSRVTQGCQPIGAQAAITAAQDNVVLELDGEPALDVLLDTL  236

Query  112  ASELEFDNLE--NAARQLLISFPLDPEEPKFE-----GESSMVRHLTGIDVARQGLQFSQ  164
               L+ D      A R  L     D + P+ +     G  + VRH+ G+D  R G+    
Sbjct  237  GVTLQGDAQAALRAVRATLAGLE-DADAPQRQRTGHFGADTRVRHIVGLDATRSGVALGD  295

Query  165  IIEKGTVVSFAYRSAISARDDLNAMLKRMKKEKPPSF-----------------------  201
             +E G  ++F  R+  +AR DL  +   +++E  P                         
Sbjct  296  HVEVGMRLAFCQRNVGAARADLMRICAEVREELSPQLEEAQALPAAGSAPHAEPAGGRRI  355

Query  202  --GIYFNCAARGEALYGKSDVDIQLIHEQLGDFPLAGMFGGYEMARVPSGLQLYTYTGVL  259
               +Y +C+ RG   +G    ++Q++   LGD PLAG F G E+A      +LY YTGVL
Sbjct  356  CGALYVSCSGRGGPHFGGPSAELQIVRHALGDVPLAGFFAGGEIA----AHRLYGYTGVL  411

Query  260  VL  261
             +
Sbjct  412  TV  413


>ref|ZP_02002961.1| conserved hypothetical protein [Beggiatoa sp. PS]
 gb|EDN67040.1| conserved hypothetical protein [Beggiatoa sp. PS]
Length=374

 Score =  102 bits (255),  Expect = 5e-20, Method: Compositional matrix adjust.
 Identities = 65/213 (31%), Positives = 103/213 (49%), Gaps = 11/213 (5%)

Query  52   AEGVVVNGAGGMGFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRV  111
            A+ V   G  G+ F      T  +TQSC  IG    +T+  +++++ +D  PAL++F   
Sbjct  168  ADTVTEGGISGVLFSTTVNVTTRLTQSCVPIGPRHQITESSNNVIIRIDDRPALDIFKED  227

Query  112  ASELEFDNLENAARQLLISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQFSQIIEKGTV  171
                   +L   A  + +  P+   +        MVR+L GID   + L   + +  GT 
Sbjct  228  IGPKLSKDLNQVAGLIFVGLPIIGSDTG----DFMVRNLAGIDPEHKLLAIGETVNPGTP  283

Query  172  VSFAYRSAISARDDLNAMLKRMK---KEKPPSFGIYFNCAARGEALYGKSDVDIQLIHEQ  228
            + F  R    A  D   +L  +K   K + P  G+Y++C  RGE+L+GK   ++++I   
Sbjct  284  IIFTRRDRKVAHKDFIKILNNIKTQLKGRLPRGGVYYSCMGRGESLFGKDSQELKIIQSV  343

Query  229  LGDFPLAGMFGGYEMARVPSGLQLYTYTGVLVL  261
            LGDFPL G F   E+       +LY YTG+L L
Sbjct  344  LGDFPLVGFFANGEIYY----QRLYGYTGILTL  372

-------------------------------------------------------------------------------------------

b) BLASTp versus SWISSPROT, NCBI default parameters apart from "Number of descriptions=1000"

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

sp|P64730.1|Y644_MYCBO  RecName: Full=Uncharacterized protein ...  91.7    4e-18
sp|P0A5D3.1|Y874_MYCTU  RecName: Full=Uncharacterized protein ...  82.0    4e-15
sp|Q9PLM5.1|SECA_CHLMU  RecName: Full=Protein translocase subu...  33.5    1.4  
sp|Q5U3V7.1|S2543_DANRE  RecName: Full=Solute carrier family 2...  33.1    1.9  
sp|Q1R0J2.1|LLDD_CHRSD  RecName: Full=L-lactate dehydrogenase ...  33.1    2.0  
sp|Q8TZZ8.1|RL3_PYRFU  RecName: Full=50S ribosomal protein L3P     33.1    2.0  
sp|Q8F3V7.1|LIPA_LEPIN  RecName: Full=Lipoyl synthase; AltName...  32.7    2.2  
sp|Q72RU2.1|LIPA_LEPIC  RecName: Full=Lipoyl synthase; AltName...  32.7    2.2  
sp|Q8PR33.1|LLDD_XANAC  RecName: Full=L-lactate dehydrogenase ...  32.0    3.7  
sp|Q3BZH2.1|LLDD_XANC5  RecName: Full=L-lactate dehydrogenase ...  32.0    3.8  
sp|B0RLM2.1|LLDD_XANCB  RecName: Full=L-lactate dehydrogenase ...  32.0    4.2  
sp|O61492.3|FLOT2_DROME  RecName: Full=Flotillin-2                 32.0    4.2  
sp|Q4V0H2.1|LLDD_XANC8  RecName: Full=L-lactate dehydrogenase ...  32.0    4.4  
sp|B2FIJ0.1|LLDD_STRMK  RecName: Full=L-lactate dehydrogenase ...  32.0    4.5  
sp|Q1IF69.1|LLDD_PSEE4  RecName: Full=L-lactate dehydrogenase ...  32.0    4.6  
sp|B4SMK1.1|LLDD_STRM5  RecName: Full=L-lactate dehydrogenase ...  32.0    4.6  
sp|Q5H6Z4.1|LLDD_XANOR  RecName: Full=L-lactate dehydrogenase ...  32.0    4.7  
sp|A8HTC9.1|LLDD_AZOC5  RecName: Full=L-lactate dehydrogenase ...  31.6    5.0  
sp|B0KIT4.1|LLDD_PSEPG  RecName: Full=L-lactate dehydrogenase ...  31.6    5.2  
sp|Q88DT3.1|LLDD_PSEPK  RecName: Full=L-lactate dehydrogenase ...  31.6    5.4  
sp|B1J244.1|LLDD_PSEPW  RecName: Full=L-lactate dehydrogenase ...  31.6    5.9  
sp|P39461.1|FUMC_SULSO  RecName: Full=Fumarate hydratase class...  31.2    6.6  
sp|B5XMV0.1|LLDD_KLEP3  RecName: Full=L-lactate dehydrogenase ...  31.2    8.0  
sp|A6TFK0.1|LLDD_KLEP7  RecName: Full=L-lactate dehydrogenase ...  30.8    8.2  
sp|B3E7B3.1|SUCC_GEOLS  RecName: Full=Succinyl-CoA ligase [ADP...  30.8    8.4  
sp|P46454.1|LLDD_HAEIN  RecName: Full=L-lactate dehydrogenase ...  30.8    8.6  
sp|A7MNF6.1|LLDD_ENTS8  RecName: Full=L-lactate dehydrogenase ...  30.8    8.8  
sp|A5UFG9.1|LLDD_HAEIG  RecName: Full=L-lactate dehydrogenase ...  30.8    8.9  
sp|A5UBE3.1|LLDD_HAEIE  RecName: Full=L-lactate dehydrogenase ...  30.8    8.9  
sp|Q4QJK8.1|LLDD_HAEI8  RecName: Full=L-lactate dehydrogenase ...  30.8    9.0  
sp|A8ARJ1.1|LLDD_CITK8  RecName: Full=L-lactate dehydrogenase ...  30.8    9.5  
sp|A4W540.1|LLDD_ENT38  RecName: Full=L-lactate dehydrogenase ...  30.8    9.9  
sp|A1AHE2.1|LLDD_ECOK1  RecName: Full=L-lactate dehydrogenase ...  30.8    9.9  
sp|Q87G18.1|LLDD_VIBPA  RecName: Full=L-lactate dehydrogenase ...  30.8    9.9  
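
Note on reading the table above: only the first two hits (Y644_MYCBO and Y874_MYCTU) have E-values far below 1; every other hit has an E-value above 1. A minimal sketch of how such a hit table could be filtered automatically is given below. The function names (parse_hits, significant) and the 1e-5 cutoff are assumptions chosen for illustration; they are not part of the original annotation or of any NCBI tool.

```python
import re

# Hypothetical pattern for one row of a BLAST hit table:
# identifier, free-text description, bit score, E-value.
HIT_LINE = re.compile(
    r"^(\S+)\s+(.*?)\s+(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?(?:e[+-]?\d+)?)\s*$"
)

def parse_hits(lines):
    """Return a list of (identifier, bit_score, e_value) tuples."""
    hits = []
    for line in lines:
        m = HIT_LINE.match(line)
        if m:
            hits.append((m.group(1), float(m.group(3)), float(m.group(4))))
    return hits

def significant(hits, max_evalue=1e-5):
    """Keep hits whose E-value is at or below the (assumed) cutoff."""
    return [h for h in hits if h[2] <= max_evalue]

# Four rows copied from the table above.
sample = [
    "sp|P64730.1|Y644_MYCBO  RecName: Full=Uncharacterized protein ...  91.7    4e-18",
    "sp|P0A5D3.1|Y874_MYCTU  RecName: Full=Uncharacterized protein ...  82.0    4e-15",
    "sp|Q9PLM5.1|SECA_CHLMU  RecName: Full=Protein translocase subu...  33.5    1.4  ",
    "sp|Q8TZZ8.1|RL3_PYRFU  RecName: Full=50S ribosomal protein L3P     33.1    2.0  ",
]

for ident, score, evalue in significant(parse_hits(sample)):
    print(ident, score, evalue)
```

With the assumed cutoff, only the two Mycobacterium uncharacterized proteins are retained from these sample rows.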

ALIGNMENTS
>sp|P64730.1|Y644_MYCBO RecName: Full=Uncharacterized protein Mb0644c
 sp|P64729.1|Y628_MYCTU RecName: Full=Uncharacterized protein Rv0628c/MT0656
Length=383

 Score = 91.7 bits (226),  Expect = 4e-18, Method: Compositional matrix adjust.
 Identities = 58/190 (31%), Positives = 87/190 (46%), Gaps = 4/190 (2%)

Query  75   VTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASELEFDNLENAARQLLISFPLD  134
            V+Q C+ IG+P  VT  +  ++  L G P L     +   +  D  E  +R L I   +D
Sbjct  192  VSQGCRPIGEPYIVTGADGAVITELGGRPPLHRLREIVLGMAPDEQELVSRGLQIGIVVD  251

Query  135  PEEPKFEGESS-MVRHLTGIDVARQGLQFSQIIEKGTVVSFAYRSAISARDDLNAMLKRM  193
             E     G+   ++R L G D     +   +++E G  V F  R A +A  DL   ++R 
Sbjct  252  -EHLAVPGQGDFLIRGLLGADPTTGAIGIGEVVEVGATVQFQVRDAAAADKDLRLAVERA  310

Query  194  KKE--KPPSFGIYFNCAARGEALYGKSDVDIQLIHEQLGDFPLAGMFGGYEMARVPSGLQ  251
              E   PP  G+ F C  RG  ++G +D D   I + LG  PLAG F   E+  V     
Sbjct  311  AAELPGPPVGGLLFTCNGRGRRMFGVTDHDASTIEDLLGGIPLAGFFAAGEIGPVAGHNA  370

Query  252  LYTYTGVLVL  261
            L+ +T  + L
Sbjct  371  LHGFTASMAL  380
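
Note: the "Identities / Positives / Gaps" line of an alignment such as the one above can be read programmatically when many hits need to be compared. The sketch below is only an illustration; parse_stats is a hypothetical helper, and the fraction is recomputed from the raw counts rather than taken from the percentage printed by BLAST, whose rounding convention is not assumed here.

```python
import re

# Matches e.g. "Identities = 58/190" and captures name, count, aligned length.
STATS = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+)")

def parse_stats(line):
    """Return {'identities': (count, aligned_length), ...} from a BLAST stats line."""
    return {name.lower(): (int(n), int(d)) for name, n, d in STATS.findall(line)}

# Stats line copied from the Y644_MYCBO alignment above.
line = " Identities = 58/190 (31%), Positives = 87/190 (46%), Gaps = 4/190 (2%)"
stats = parse_stats(line)

identical, aligned = stats["identities"]
identity_fraction = identical / aligned
print(f"{identical}/{aligned} identical residues, fraction = {identity_fraction:.3f}")
```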


>sp|P0A5D3.1|Y874_MYCTU RecName: Full=Uncharacterized protein Rv0874c/MT0897
 sp|P0A5D4.1|Y898_MYCBO RecName: Full=Uncharacterized protein Mb0898c
Length=386

 Score = 82.0 bits (201),  Expect = 4e-15, Method: Compositional matrix adjust.
 Identities = 57/201 (29%), Positives = 89/201 (45%), Gaps = 8/201 (3%)

Query  64   GFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASELEFDNLENA  123
            G +GVP     V+Q C+ IG P  VT  +  L+  L G P L+    +   L  D     
Sbjct  185  GMRGVPV----VSQGCRPIGYPYIVTGADGILITELGGRPPLQRLREIVEGLSPDERALV  240

Query  124  ARQLLISFPLDPEEPKFEGESS-MVRHLTGIDVARQGLQFSQIIEKGTVVSFAYRSAISA  182
            +  L I   +D E     G+   ++R L G D +   ++  ++++ G  + F  R A  A
Sbjct  241  SHGLQIGIVVD-EHLAAPGQGDFVIRGLLGADPSTGSIEIDEVVQVGATMQFQVRDAAGA  299

Query  183  RDDLNAMLKRMKKEKP--PSFGIYFNCAARGEALYGKSDVDIQLIHEQLGDFPLAGMFGG  240
              DL   ++R     P   +  + F C  RG  ++G +D D   I E LG  PLAG F  
Sbjct  300  DKDLRLTVERAAARLPGRAAGALLFTCNGRGRRMFGVADHDASTIEELLGGIPLAGFFAA  359

Query  241  YEMARVPSGLQLYTYTGVLVL  261
             E+  +     L+ +T  + L
Sbjct  360  GEIGPIAGRNALHGFTASMAL  380


>sp|Q9PLM5.1|SECA_CHLMU RecName: Full=Protein translocase subunit secA
Length=968

 Score = 33.5 bits (75),  Expect = 1.4, Method: Compositional matrix adjust.
 Identities = 39/183 (22%), Positives = 70/183 (39%), Gaps = 29/183 (15%)

Query  18   YNFINMLNYVQSDPMVFGAGSCDDGSGRISVQFGAEGVVVN------GAGGMGFKGVPEF  71
            +  +N  N+ Q   ++ GAG    G+  ++      G  +        AGG+   G    
Sbjct  595  HTVLNAKNHAQEAEIIAGAGKV--GAVTVATNMAGRGTDIKLDKEAVAAGGLYVIGTSRH  652

Query  72   TA-----GVTQSCKMIGDP---MFVTDFEDDLVLTLDGTPALEVFSRVASELEFDNLENA  123
             +      +   C  +GDP    F   FED L + L  +P L    R     E + + + 
Sbjct  653  QSRRIDRQLRGRCARLGDPGAAKFFLSFEDRL-MRLFASPKLNTLIRHFRPPEGEAMSDP  711

Query  124  ARQLLISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQFSQIIEKGTVVSFAYRSAISAR  183
                LI    +  + + EG +  +R  T        L++  ++ K     +A+R+ +   
Sbjct  712  MFDRLI----ETAQKRVEGRNYTIRKHT--------LEYDDVMNKQRQTIYAFRNDVLHA  759

Query  184  DDL  186
            DDL
Sbjct  760  DDL  762


>sp|Q5U3V7.1|S2543_DANRE RecName: Full=Solute carrier family 25 member 43
Length=345

 Score = 33.1 bits (74),  Expect = 1.9, Method: Compositional matrix adjust.
 Identities = 28/103 (28%), Positives = 49/103 (48%), Gaps = 21/103 (20%)

Query  114  ELEFDNLENAARQLL-------ISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQ--FSQ  164
             + F +L+N     L       +SFP +  + K + +S ++ H  G+DV   G+   F Q
Sbjct  195  HVRFTSLQNFINGCLAAGVAQTLSFPFETVKKKMQAQSLVLPHCGGVDVHFNGMADCFRQ  254

Query  165  IIEKGTVVSFAYRSAISARDDLNAMLKRMKKEKPPSFGIYFNC  207
            +I+   V+  A  S ++A      M+K +     P FG+ F+C
Sbjct  255  VIKNKGVM--ALWSGLTAN-----MVKIV-----PYFGLLFSC  285


>sp|Q1R0J2.1|LLDD_CHRSD RecName: Full=L-lactate dehydrogenase [cytochrome]
Length=392

 Score = 33.1 bits (74),  Expect = 2.0, Method: Compositional matrix adjust.
 Identities = 17/39 (44%), Positives = 23/39 (59%), Gaps = 1/39 (2%)

Query  31   PMVFGAGSCDDGSGRISVQFGAEGVVVNGAGGMGFKGVP  69
            PM+   G  D    R +V+FGA+G+VV+  GG    GVP
Sbjct  247  PMII-KGILDPEDARDAVRFGADGIVVSNHGGRQLDGVP  284


>sp|Q8TZZ8.1|RL3_PYRFU RecName: Full=50S ribosomal protein L3P
Length=365

 Score = 33.1 bits (74),  Expect = 2.0, Method: Compositional matrix adjust.
 Identities = 22/72 (31%), Positives = 34/72 (48%), Gaps = 3/72 (4%)

Query  63   MGFKGVPEFTAGVTQSCKMIGDPMFVTDFEDDLVLTLDGTPALEVFSRVASELEFDNLEN  122
            +GF G   + AG+T    +  +P      E  + +T+  TP L VF   A  + +  LE 
Sbjct  39   LGFAG---YKAGMTHILMIDDEPGLTNGKEIFMPVTIIETPPLRVFGIRAYRMGYLGLET  95

Query  123  AARQLLISFPLD  134
            A   ++  FPLD
Sbjct  96   ATEVIVPDFPLD  107


>sp|Q8F3V7.1|LIPA_LEPIN RecName: Full=Lipoyl synthase; AltName: Full=Lip-syn; Short=LS; 
AltName: Full=Lipoate synthase; AltName: Full=Lipoic acid 
synthase; AltName: Full=Sulfur insertion protein lipA
Length=301

 Score = 32.7 bits (73),  Expect = 2.2, Method: Compositional matrix adjust.
 Identities = 17/34 (50%), Positives = 21/34 (62%), Gaps = 3/34 (8%)

Query  131  FPLDPEEPKFEGESSM---VRHLTGIDVARQGLQ  161
            FPLDPEEPK   ESS+   +RH+    V R  L+
Sbjct  94   FPLDPEEPKRIAESSIALDLRHVVITSVNRDDLE  127


>sp|Q72RU2.1|LIPA_LEPIC RecName: Full=Lipoyl synthase; AltName: Full=Lip-syn; Short=LS; 
AltName: Full=Lipoate synthase; AltName: Full=Lipoic acid 
synthase; AltName: Full=Sulfur insertion protein lipA
Length=301

 Score = 32.7 bits (73),  Expect = 2.2, Method: Compositional matrix adjust.
 Identities = 17/34 (50%), Positives = 21/34 (62%), Gaps = 3/34 (8%)

Query  131  FPLDPEEPKFEGESSM---VRHLTGIDVARQGLQ  161
            FPLDPEEPK   ESS+   +RH+    V R  L+
Sbjct  94   FPLDPEEPKRIAESSIALGLRHVVITSVNRDDLE  127


>sp|Q8PR33.1|LLDD_XANAC RecName: Full=L-lactate dehydrogenase [cytochrome]
Length=388

 Score = 32.0 bits (71),  Expect = 3.7, Method: Compositional matrix adjust.
 Identities = 21/74 (29%), Positives = 32/74 (44%), Gaps = 8/74 (10%)

Query  18   YNFINMLNYVQSDPMVFGAGSCDDGSGRISVQFGAEGVVVNGAGGMGFKGV-------PE  70
            +  +  +    + PMV   G  D    R +V+FGA+G+VV+  GG    GV       P 
Sbjct  234  WKDLEWIREFWTGPMVI-KGILDPEDARDAVRFGADGIVVSNHGGRQLDGVLSSARALPA  292

Query  71   FTAGVTQSCKMIGD  84
                V    K++ D
Sbjct  293  IADAVKGELKILAD  306


>sp|Q3BZH2.1|LLDD_XANC5 RecName: Full=L-lactate dehydrogenase [cytochrome]
Length=388

 Score = 32.0 bits (71),  Expect = 3.8, Method: Compositional matrix adjust.
 Identities = 21/74 (29%), Positives = 32/74 (44%), Gaps = 8/74 (10%)

Query  18   YNFINMLNYVQSDPMVFGAGSCDDGSGRISVQFGAEGVVVNGAGGMGFKGV-------PE  70
            +  +  +    + PMV   G  D    R +V+FGA+G+VV+  GG    GV       P 
Sbjct  234  WKDLEWIREFWTGPMVI-KGILDPEDARDAVRFGADGIVVSNHGGRQLDGVLSSARALPA  292

Query  71   FTAGVTQSCKMIGD  84
                V    K++ D
Sbjct  293  IADAVKGELKILAD  306


>sp|B0RLM2.1|LLDD_XANCB RecName: Full=L-lactate dehydrogenase [cytochrome]
Length=386

 Score = 32.0 bits (71),  Expect = 4.2, Method: Compositional matrix adjust.
 Identities = 21/74 (29%), Positives = 32/74 (44%), Gaps = 8/74 (10%)

Query  18   YNFINMLNYVQSDPMVFGAGSCDDGSGRISVQFGAEGVVVNGAGGMGFKGV-------PE  70
            +  +  +    + PMV   G  D    R +V+FGA+G+VV+  GG    GV       P 
Sbjct  234  WKDLEWIREFWTGPMVI-KGILDPEDARDAVRFGADGIVVSNHGGRQLDGVLSSARALPA  292

Query  71   FTAGVTQSCKMIGD  84
                V    K++ D
Sbjct  293  IADAVKGELKILAD  306


>sp|O61492.3|FLOT2_DROME RecName: Full=Flotillin-2
Length=438

 Score = 32.0 bits (71),  Expect = 4.2, Method: Compositional matrix adjust.
 Identities = 25/101 (25%), Positives = 47/101 (47%), Gaps = 5/101 (4%)

Query  66   KGVPEFTAGVTQSCKMIGDPMFV-TDFEDDLVLTLDGTPALEVFSRVASELE---FDNLE  121
            +GVP    GV Q CK++    +  TD+ +D    L GT + +   +   E++      LE
Sbjct  63   QGVPLTVTGVAQ-CKIMKSSSYKQTDYHNDEADELLGTASEQFLGKSVKEIKQTILQTLE  121

Query  122  NAARQLLISFPLDPEEPKFEGESSMVRHLTGIDVARQGLQF  162
               R +L +  ++      +  +++VR +   DV R G++ 
Sbjct  122  GHLRAILGTLTVEEVYKDRDQFAALVREVAAPDVGRMGIEI  162