GOS 1323010

From Metagenes
Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : JCVI_READ_1092343626955
Annotathon code: GOS_1323010
Sample :
  • GPS : 15°16'40"S; 148°13'28"W
  • Polynesia Archipelagos: Tikehau Lagoon - Fr. Polynesia
  • Coral Atoll (-1.2m, 27.8°C, 0.1-0.8 microns)
Authors
Team : Algarve
Username : Bioteam
Annotated on : 2010-07-11 13:14:20
  • AntónioJoséVieiraCanavarro a36794@ualg.pt
  • VâniaFilipaSimõesJoão a36812@ualg.pt
  • a36790 AnaCatarinaDiasPereira

Synopsis

Genomic Sequence

>JCVI_READ_1092343626955 GOS_1323010 Genomic DNA
GCGCATACCCCACAATAACAGCTGTATCACAGCCGTTGAAAACCCCAGCCCATTCGGGCACGTTGGGTCAAACGATCGCCCATCTTCATTAAAAAGTGAT
ATCACATTCACTGCAGCCGTCACACCACCCGCAACAATTGCCATGGATTGATACCGAATGGTGGGGCATCGCACACTATTGGTTTGCACCATTTCTAACT
GTTCAGCACTACGGTGAAAATCATCAATGGTGTCCATATTAGACGAATAAGATAGCCTCACCATGCGCCCGGCAGTGGTTACAGAATCAACCGCAAATTG
AACAACCATCGGTCCCGCCAAAAGCCATAACAATCCCAACCCCTCACCCCCAAGTTGCACCTCACACATGGAGTTCGCTCGAACAACTTCAGAAACTTGG
CCCAAATGAGTTGGACTACATCCTTCAGGCAAAGACTGACAATCCAACATGCCCATTTTATATAAAGGCATTAATGGTGGGCAGGTGCTATTCACATTCA
CAGTTGTTACTCTACGCTATGATTTTTCACCAGCGGCACTGATCGCCTCGCGCTGCACCACAAGGGTAACGTTAGATTTATAAGCATCTAATATATCTTG
AGCCTCAACTTGCGTATATTCCTTCACAAAGGCATCATTGCAATAGCGGTAGCGATTAGGCCCTAACTTTGTTAAAGAGGTATAATGCCCCGAACCATCT
TGACCAAAAAAGAAAGTCACCGCGCGAATATGTTGAGCATTAAGTGTGATGGAGTCTGATGAACATATGACTGATTGAGTCGCTTGATCAAAAACCAAGC
GATGAACTGTAGACATACGAATGCCATTCGGGTAATTGTAACGCGTCGTCCTGTGTTGTCGAAGTGAACTTACATCCTGACCATTAATCTG

Translation

[376 - 891/891]   indirect strand


Annotator commentaries

Our analysis leads us to conclude that the sequence is probably non-coding. Although the selected ORF is longer than 150 amino acids, neither BLASTp nor BLASTx found any significant homologs: the e-values are too high and the scores too low (e-values ≥ 3.0 and scores ≤ 35 in BLASTp; e-values ≥ 1.9 and scores ≤ 35.8 in BLASTx). Because of this, and because no protein domains were identified, we could not run a multiple alignment, build a phylogenetic tree, or infer the biological function, the biological process or the taxonomic classification.

ORF finding

PROTOCOL


a) SMS ORFinder / forward strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code

b) SMS ORFinder / reverse strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code



RESULTS ANALYSIS


The direct strand has one ORF in frame 3, and the reverse strand has three ORFs: two in frame 1 and one in frame 2. We chose the second ORF of reverse frame 1 (bases 376 to 891) because it is longer than 150 amino acids and is therefore the most likely to have a biological function. No start codon or stop codon was identified for this ORF, so neither its amino nor its carboxyl extremity is defined: it is an open-ended sequence.
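The ORF search described in the protocol can be approximated in a few lines of Python. This is only a minimal sketch, not the SMS ORF Finder implementation: the function names (`reverse_complement`, `translate`, `find_orfs`) are hypothetical, 'any codon' initiation is assumed to mean that an ORF is any stop-free stretch of at least `min_aa` codons, and coordinates are 1-based on the strand being scanned.

```python
# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def find_orfs(dna, min_aa=60):
    """List (frame, start, end, protein) for stop-free stretches >= min_aa.

    Assumption: 'any codon' initiation, so an ORF runs from the codon after
    a stop (or the start of the frame) to the next stop (or the end of the
    frame). The stop codon, when present, is included in the coordinates.
    """
    orfs = []
    for frame in range(3):
        segments = translate(dna[frame:]).split("*")
        pos = 0  # codon index of the current segment within this frame
        for idx, segment in enumerate(segments):
            has_stop = idx < len(segments) - 1  # last segment is open-ended
            if len(segment) >= min_aa:
                start = frame + 3 * pos + 1
                end = frame + 3 * (pos + len(segment) + (1 if has_stop else 0))
                protein = segment + ("*" if has_stop else "")
                orfs.append((frame + 1, start, end, protein))
            pos += len(segment) + 1  # skip the segment and its stop codon
    return orfs
```

Under these assumptions, `find_orfs(read, min_aa=60)` would scan the direct strand and `find_orfs(reverse_complement(read), min_aa=60)` the reverse strand, where `read` is the genomic sequence given above.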



RAW RESULTS

a) forward strand

No ORFs were found in reading frame 1.

No ORFs were found in reading frame 2.

>ORF number 1 in reading frame 3 on the direct strand extends from base 243 to base 542.
ACGAATAAGATAGCCTCACCATGCGCCCGGCAGTGGTTACAGAATCAACCGCAAATTGAA
CAACCATCGGTCCCGCCAAAAGCCATAACAATCCCAACCCCTCACCCCCAAGTTGCACCT
CACACATGGAGTTCGCTCGAACAACTTCAGAAACTTGGCCCAAATGAGTTGGACTACATC
CTTCAGGCAAAGACTGACAATCCAACATGCCCATTTTATATAAAGGCATTAATGGTGGGC
AGGTGCTATTCACATTCACAGTTGTTACTCTACGCTATGATTTTTCACCAGCGGCACTGA


>Translation of ORF number 1 in reading frame 3 on the direct strand.
TNKIASPCARQWLQNQPQIEQPSVPPKAITIPTPHPQVAPHTWSSLEQLQKLGPNELDYI
LQAKTDNPTCPFYIKALMVGRCYSHSQLLLYAMIFHQRH*

____________________________________________________________________


b)reverse strand

>ORF number 1 in reading frame 1 on the reverse strand extends from base 1 to base 375.
CAGATTAATGGTCAGGATGTAAGTTCACTTCGACAACACAGGACGACGCGTTACAATTAC
CCGAATGGCATTCGTATGTCTACAGTTCATCGCTTGGTTTTTGATCAAGCGACTCAATCA
GTCATATGTTCATCAGACTCCATCACACTTAATGCTCAACATATTCGCGCGGTGACTTTC
TTTTTTGGTCAAGATGGTTCGGGGCATTATACCTCTTTAACAAAGTTAGGGCCTAATCGC
TACCGCTATTGCAATGATGCCTTTGTGAAGGAATATACGCAAGTTGAGGCTCAAGATATA
TTAGATGCTTATAAATCTAACGTTACCCTTGTGGTGCAGCGCGAGGCGATCAGTGCCGCT
GGTGAAAAATCATAG

>Translation of ORF number 1 in reading frame 1 on the reverse strand.
QINGQDVSSLRQHRTTRYNYPNGIRMSTVHRLVFDQATQSVICSSDSITLNAQHIRAVTF
FFGQDGSGHYTSLTKLGPNRYRYCNDAFVKEYTQVEAQDILDAYKSNVTLVVQREAISAA
GEKS*

>ORF number 2 in reading frame 1 on the reverse strand extends from base 376 to base 891.
CGTAGAGTAACAACTGTGAATGTGAATAGCACCTGCCCACCATTAATGCCTTTATATAAA
ATGGGCATGTTGGATTGTCAGTCTTTGCCTGAAGGATGTAGTCCAACTCATTTGGGCCAA
GTTTCTGAAGTTGTTCGAGCGAACTCCATGTGTGAGGTGCAACTTGGGGGTGAGGGGTTG
GGATTGTTATGGCTTTTGGCGGGACCGATGGTTGTTCAATTTGCGGTTGATTCTGTAACC
ACTGCCGGGCGCATGGTGAGGCTATCTTATTCGTCTAATATGGACACCATTGATGATTTT
CACCGTAGTGCTGAACAGTTAGAAATGGTGCAAACCAATAGTGTGCGATGCCCCACCATT
CGGTATCAATCCATGGCAATTGTTGCGGGTGGTGTGACGGCTGCAGTGAATGTGATATCA
CTTTTTAATGAAGATGGGCGATCGTTTGACCCAACGTGCCCGAATGGGCTGGGGTTTTCA
ACGGCTGTGATACAGCTGTTATTGTGGGGTATGCGC

>Translation of ORF number 2 in reading frame 1 on the reverse strand.
RRVTTVNVNSTCPPLMPLYKMGMLDCQSLPEGCSPTHLGQVSEVVRANSMCEVQLGGEGL
GLLWLLAGPMVVQFAVDSVTTAGRMVRLSYSSNMDTIDDFHRSAEQLEMVQTNSVRCPTI
RYQSMAIVAGGVTAAVNVISLFNEDGRSFDPTCPNGLGFSTAVIQLLLWGMR

>ORF number 1 in reading frame 2 on the reverse strand extends from base 422 to base 613.
TGCCTTTATATAAAATGGGCATGTTGGATTGTCAGTCTTTGCCTGAAGGATGTAGTCCAA
CTCATTTGGGCCAAGTTTCTGAAGTTGTTCGAGCGAACTCCATGTGTGAGGTGCAACTTG
GGGGTGAGGGGTTGGGATTGTTATGGCTTTTGGCGGGACCGATGGTTGTTCAATTTGCGG
TTGATTCTGTAA

>Translation of ORF number 1 in reading frame 2 on the reverse strand.
CLYIKWACWIVSLCLKDVVQLIWAKFLKLFERTPCVRCNLGVRGWDCYGFWRDRWLFNLR
LIL*

No ORFs were found in reading frame 3.

Multiple Alignment

PROTOCOL



RESULTS ANALYSIS


We could not select any ingroup or outgroup sequences, so we could not run the multiple alignment. Without an MSA, it was not possible to determine the start and stop codons or the full length of the sequence.

RAW RESULTS

Protein Domains

PROTOCOL

InterPro, default parameters at EBI


RESULTS ANALYSIS


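No protein domains could be identified. The only InterPro results are two TMHMM transmembrane-region predictions (residues 122-142 and 152-170), for which no e-value is available (NA).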

RAW RESULTS

Sequence_1	8EAA775908467CFE	172	TMHMM	tmhmm	transmembrane_regions	122	142	NA	?	17-Mar-2010	NULL	NULL
Sequence_1	8EAA775908467CFE	172	TMHMM	tmhmm	transmembrane_regions	152	170	NA	?	17-Mar-2010	NULL	NULL
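The raw rows above can be read programmatically. The sketch below is illustrative only: the column names in `FIELDS` are an assumption about the tab-separated InterProScan raw layout (they do not appear in the output itself), and `parse_rows` is a hypothetical helper.

```python
# Assumed column order of the tab-separated raw output (labels are ours).
FIELDS = [
    "query", "crc64", "length", "method", "entry", "description",
    "start", "end", "evalue", "status", "date", "interpro_id", "interpro_name",
]

# The two rows reported in the raw results, with tabs written as \t.
RAW = """\
Sequence_1\t8EAA775908467CFE\t172\tTMHMM\ttmhmm\ttransmembrane_regions\t122\t142\tNA\t?\t17-Mar-2010\tNULL\tNULL
Sequence_1\t8EAA775908467CFE\t172\tTMHMM\ttmhmm\ttransmembrane_regions\t152\t170\tNA\t?\t17-Mar-2010\tNULL\tNULL
"""


def parse_rows(raw):
    """Turn each raw line into a dict; 'NA' e-values become None."""
    rows = []
    for line in raw.strip().splitlines():
        rec = dict(zip(FIELDS, line.split("\t")))
        for key in ("length", "start", "end"):
            rec[key] = int(rec[key])
        rec["evalue"] = None if rec["evalue"] == "NA" else float(rec["evalue"])
        rows.append(rec)
    return rows


rows = parse_rows(RAW)
# Predicted transmembrane segments as (start, end) residue positions.
regions = [(r["start"], r["end"]) for r in rows]
# True when no row carries an e-value.
no_evalues = all(r["evalue"] is None for r in rows)
```

Here `regions` holds the two predicted transmembrane segments and `no_evalues` records that neither row carries an e-value.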

Phylogeny

PROTOCOL



RESULTS ANALYSIS


Since we could not run a multiple alignment (due to the lack of ingroups and outgroups), it was not possible to build a phylogenetic tree.

RAW RESULTS

Taxonomy report

PROTOCOL


BLASTp versus NR, NCBI default parameters apart from "Number of descriptions" = 1000



RESULTS ANALYSIS


The results were neither significant (e-values ≥ 1.9 and scores ≤ 35) nor sufficient in number (only 11 organisms in the lineage report), so we could not select any ingroups or outgroups.

RAW RESULTS

Lineage Report

cellular organisms
. Bacteria           [bacteria]
. . Listeria           [firmicutes]
. . . Listeria welshimeri serovar 6b str. SLCC5334 -   34 2 hits [firmicutes]          malonyl CoA-acyl carrier protein transacylase [Listeria wel
. . . Listeria grayi DSM 20601 .....................   33 2 hits [firmicutes]          [acyl-carrier-protein] S-malonyltransferase [Listeria grayi
. . . Listeria innocua Clip11262 ...................   33 1 hit  [firmicutes]          acyl-carrier-protein S-malonyltransferase [Listeria innocua
. . . Listeria innocua .............................   33 1 hit  [firmicutes]          acyl-carrier-protein S-malonyltransferase [Listeria innocua
. . Bacteroides eggerthii DSM 20697 ----------------   33 2 hits [CFB group bacteria]  hypothetical protein BACEGG_03174 [Bacteroides eggerthii DS
. . Corynebacterium tuberculostearicum SK141 .......   33 2 hits [high GC Gram+]       hypothetical protein CORTU0001_0894 [Corynebacterium tuberc
. . Corynebacterium pseudogenitalium ATCC 33035 ....   33 2 hits [high GC Gram+]       hypothetical protein HMPREF0305_0699 [Corynebacterium pseud
. . Bacteroides dorei 5_1_36/D4 ....................   33 2 hits [CFB group bacteria]  ATPase [Bacteroides sp. D4] >gi|255689898|ref|ZP_05413573.1
. . Bacteroides finegoldii DSM 17565 ...............   33 2 hits [CFB group bacteria]  ATPase [Bacteroides sp. D4] >gi|255689898|ref|ZP_05413573.1
. Aedes albopictus (forest day mosquito) -----------   33 1 hit  [flies]               signal transducer and activator of transcription [Aedes alb
. Rattus norvegicus (brown rat) ....................   33 5 hits [rodents]             transcriptional co-activator with PDZ-binding motif (TA)Z, 

BLAST

PROTOCOL

a) BLASTp versus NR, NCBI default parameters apart from "Number of descriptions" = 1000


b) BLASTx versus NR, NCBI default parameters apart from "Number of descriptions" = 1000




RESULTS ANALYSIS


The BLASTp results show no homologous sequences for our ORF: the e-values were ≥ 3.0 and the scores ≤ 35, whereas a significant hit should have an e-value ≤ 10⁻² and a score > 200. We therefore also ran BLASTx. Its scores are similar to the BLASTp ones (≤ 35.8) and, although the e-values improved slightly, they are still too high (≥ 1.9). We conclude that the sequence has no detectable homologs.
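The significance check applied to the BLAST tables can be sketched as follows. The thresholds (e-value at most 10⁻², bit score above 200) are the ones quoted in this analysis, not universal cut-offs, and `parse_summary` and `significant` are hypothetical helper names. The input is the BLASTp summary table from the raw results below, copied as is.

```python
# BLASTp summary lines, copied from the raw results (descriptions truncated
# by BLAST itself).
BLASTP_SUMMARY = """\
ref|ZP_05632424.1|  alginate O-acetyltransferase, putative [Fu...  35.0    3.0
ref|NP_572182.1|  CG7024 [Drosophila melanogaster] >gb|AAF4597...  34.3    5.5
ref|ZP_03460358.1|  hypothetical protein BACEGG_03174 [Bactero...  33.9    5.8
ref|XP_002100071.1|  GE16362 [Drosophila yakuba] >gb|EDX01179....  33.9    6.9
ref|YP_001710584.1|  lipoyl synthase [Clavibacter michiganensi...  33.9    7.1
ref|XP_002413033.1|  beta chain of the tetrameric hemoglobin, ...  33.9    7.2
ref|ZP_04555519.1|  ATPase [Bacteroides sp. D4] >ref|ZP_054135...  33.5    8.3
ref|YP_001863434.1|  hypothetical protein Bphy_7419 [Burkholde...  33.5    8.9
ref|YP_002842833.1|  PilT protein domain protein [Sulfolobus i...  33.1    9.9
"""


def parse_summary(text):
    """Split each line into (description, bit_score, e_value)."""
    hits = []
    for line in text.strip().splitlines():
        # The last two whitespace-separated tokens are the score and e-value.
        description, score, evalue = line.strip().rsplit(None, 2)
        hits.append((description, float(score), float(evalue)))
    return hits


def significant(hits, max_evalue=1e-2, min_score=200.0):
    """Keep hits that satisfy both thresholds used in this annotation."""
    return [h for h in hits if h[2] <= max_evalue and h[1] > min_score]


hits = parse_summary(BLASTP_SUMMARY)
best_evalue = min(h[2] for h in hits)  # smallest e-value among the hits
best_score = max(h[1] for h in hits)   # largest bit score among the hits
kept = significant(hits)               # hits passing both thresholds
```

With these thresholds `kept` is empty, which is the situation described above: no hit comes close to either cut-off.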

RAW RESULTS

a) BLASTp
                                                           

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

ref|ZP_05632424.1|  alginate O-acetyltransferase, putative [Fu...  35.0    3.0  
ref|NP_572182.1|  CG7024 [Drosophila melanogaster] >gb|AAF4597...  34.3    5.5  
ref|ZP_03460358.1|  hypothetical protein BACEGG_03174 [Bactero...  33.9    5.8  
ref|XP_002100071.1|  GE16362 [Drosophila yakuba] >gb|EDX01179....  33.9    6.9  
ref|YP_001710584.1|  lipoyl synthase [Clavibacter michiganensi...  33.9    7.1  
ref|XP_002413033.1|  beta chain of the tetrameric hemoglobin, ...  33.9    7.2  
ref|ZP_04555519.1|  ATPase [Bacteroides sp. D4] >ref|ZP_054135...  33.5    8.3  
ref|YP_001863434.1|  hypothetical protein Bphy_7419 [Burkholde...  33.5    8.9  
ref|YP_002842833.1|  PilT protein domain protein [Sulfolobus i...  33.1    9.9  

ALIGNMENTS
>ref|ZP_05632424.1| alginate O-acetyltransferase, putative [Fusobacterium ulcerans 
ATCC 49185]
Length=483

 Score = 35.0 bits (79),  Expect = 3.0, Method: Compositional matrix adjust.
 Identities = 17/66 (25%), Positives = 35/66 (53%), Gaps = 2/66 (3%)

Query  22   GMLDCQSLPEGCSPTHLGQVSEVVRANSMCEVQLGGEGLGLLWLLAGPMVVQFAVDSVTT  81
            GM D + L    + TH+GQ+ E+ ++     + + G  + L++LL   ++V F  +S   
Sbjct  395  GMFDIKGLI--YTGTHMGQIGEMTKSYRELTIGMLGNKINLVFLLLATIIVVFMNNSYEK  452

Query  82   AGRMVR  87
            + + +R
Sbjct  453  SKKNIR  458


>ref|NP_572182.1| CG7024 [Drosophila melanogaster]
 gb|AAF45979.1| CG7024 [Drosophila melanogaster]
 gb|AAQ23628.1| AT31065p [Drosophila melanogaster]
Length=479

 Score = 34.3 bits (77),  Expect = 5.5, Method: Compositional matrix adjust.
 Identities = 35/131 (26%), Positives = 55/131 (41%), Gaps = 9/131 (6%)

Query  33   CSPTHLGQVSEVVRANSMCEVQLGGEGLGLLW-----LLAGPMVVQFAVDSVTTAGRMV-  86
            C   H G  + V RA++M E  + G+ +  LW     +LA     QFAVD     G +V 
Sbjct  224  CENNHYGMGTHVKRASAMTEFYMRGQYIPGLWVDGNQVLAVRSATQFAVDHALKHGPIVL  283

Query  87   -RLSYSSNMDTIDDFHRSAEQLEMVQ-TNSVRCPTIRYQSMAIVAGGVTAAVNVISLFNE  144
               +Y     ++ D   S    E VQ T   R P   ++S  I+A  +     + +L ++
Sbjct  284  EMSTYRYVGHSMSDPGTSYRSREEVQSTREKRDPITSFRSQ-IIALCLADEEELKALDDK  342

Query  145  DGRSFDPTCPN  155
              +  D  C  
Sbjct  343  TRKQVDSICKK  353


>ref|ZP_03460358.1| hypothetical protein BACEGG_03174 [Bacteroides eggerthii DSM 
20697]
 gb|EEC52823.1| hypothetical protein BACEGG_03174 [Bacteroides eggerthii DSM 
20697]
Length=441

 Score = 33.9 bits (76),  Expect = 5.8, Method: Compositional matrix adjust.
 Identities = 20/55 (36%), Positives = 34/55 (61%), Gaps = 6/55 (10%)

Query  76   VDSVTTAGRMVRLSYSSNMDTIDDFHRSAEQLEMVQTNSVRCPTIRYQSMAIVAG  130
            +D VT++G +     S ++DT DD+  + E+L ++Q  S  CP IR ++ AI +G
Sbjct  249  LDDVTSSGEV-----SCHVDTFDDYVAALEKLFVIQNISAWCPAIRSKT-AIRSG  297


>ref|XP_002100071.1| GE16362 [Drosophila yakuba]
 gb|EDX01179.1| GE16362 [Drosophila yakuba]
Length=485

 Score = 33.9 bits (76),  Expect = 6.9, Method: Compositional matrix adjust.
 Identities = 36/147 (24%), Positives = 58/147 (39%), Gaps = 13/147 (8%)

Query  19   YKMGMLDCQSLPEGCSPTHLGQVSEVVRANSMCEVQLGGEGLGLLW-----LLAGPMVVQ  73
            Y M  L C      C   H G  + V RA++M E  + G+ +  LW     +LA     Q
Sbjct  210  YNMAKLWCLPCIFVCENNHYGMGTHVRRASAMSEFYMRGQYIPGLWVDGNQVLAVRSATQ  269

Query  74   FAVDSVTTAG----RMVRLSY-SSNMDTIDDFHRSAEQLEMVQTNSVRCPTIRYQSMAIV  128
            FAVD     G     M    Y   +M      +RS ++++  +  S    + R Q   I+
Sbjct  270  FAVDHALNHGPIVLEMSTYRYVGHSMSDPGTSYRSRDEVQAAREKSDPITSFRSQ---II  326

Query  129  AGGVTAAVNVISLFNEDGRSFDPTCPN  155
            A  +     + +L ++  +  D  C  
Sbjct  327  ALCLADEEELKALEDKTKKQVDSICKK  353


>ref|YP_001710584.1| lipoyl synthase [Clavibacter michiganensis subsp. sepedonicus]
 sp|B0RE24.1|LIPA_CLAMS RecName: Full=Lipoyl synthase; AltName: Full=Lipoic acid synthase; 
AltName: Full=Lipoate synthase; AltName: Full=Sulfur insertion 
protein lipA; AltName: Full=Lip-syn; Short=LS
 emb|CAQ01985.1| lipoyl synthase [Clavibacter michiganensis subsp. sepedonicus]
Length=329

 Score = 33.9 bits (76),  Expect = 7.1, Method: Compositional matrix adjust.
 Identities = 22/60 (36%), Positives = 30/60 (50%), Gaps = 6/60 (10%)

Query  34   SPTHLGQVSEVVRANSMCEVQLGGEGLGLLWLLAGPMVVQFAVDSVTTAGRMVRLSYSSN  93
            SP HL  V+  VR     E++   E +G L +LAGP+     V S   AGR+   S S+ 
Sbjct  253  SPRHL-PVARWVRPEEFVEIKAEAEAIGFLGVLAGPL-----VRSSYRAGRLYAQSMSAK  306


>ref|XP_002413033.1| beta chain of the tetrameric hemoglobin, putative [Ixodes scapularis]
 gb|EEC16341.1| beta chain of the tetrameric hemoglobin, putative [Ixodes scapularis]
Length=177

 Score = 33.9 bits (76),  Expect = 7.2, Method: Compositional matrix adjust.
 Identities = 25/82 (30%), Positives = 36/82 (43%), Gaps = 13/82 (15%)

Query  52   EVQLGGEGLGLLWLLAGP----MVVQFAVDSVT---------TAGRMVRLSYSSNMDTID  98
            EVQ  G  + ++     P    + V FA D +               V  + +S +DT+D
Sbjct  36   EVQTSGVAIFVVLFFKHPAYQKLFVAFAADPIAELPQNPRAIAHALTVAYAITSIIDTLD  95

Query  99   DFHRSAEQLEMVQTNSVRCPTI  120
            +   SAE +  V TN VR PTI
Sbjct  96   EPETSAELVRKVATNHVRHPTI  117


>ref|ZP_04555519.1| ATPase [Bacteroides sp. D4]
 ref|ZP_05413573.1| conserved hypothetical protein [Bacteroides finegoldii DSM 17565]
 gb|EEO46853.1| ATPase [Bacteroides sp. D4]
 gb|EEX47375.1| conserved hypothetical protein [Bacteroides finegoldii DSM 17565]
Length=430

 Score = 33.5 bits (75),  Expect = 8.3, Method: Compositional matrix adjust.
 Identities = 17/49 (34%), Positives = 30/49 (61%), Gaps = 5/49 (10%)

Query  76   VDSVTTAGRMVRLSYSSNMDTIDDFHRSAEQLEMVQTNSVRCPTIRYQS  124
            +D VT++G +     S ++DT DD+  + E+L ++Q  S  CP IR ++
Sbjct  238  LDDVTSSGEV-----SCHVDTFDDYVSALEKLFVIQNISAWCPAIRSKT  281


>ref|YP_001863434.1| hypothetical protein Bphy_7419 [Burkholderia phymatum STM815]
 gb|ACC76384.1| conserved hypothetical protein [Burkholderia phymatum STM815]
Length=412

 Score = 33.5 bits (75),  Expect = 8.9, Method: Compositional matrix adjust.
 Identities = 15/38 (39%), Positives = 23/38 (60%), Gaps = 0/38 (0%)

Query  134  AAVNVISLFNEDGRSFDPTCPNGLGFSTAVIQLLLWGM  171
            AAV ++ LF+E GR   P C + +  S A++ LL+  M
Sbjct  336  AAVGLLFLFHESGRQVAPRCASAMLLSGAIVSLLVSAM  373


>ref|YP_002842833.1| PilT protein domain protein [Sulfolobus islandicus M.16.27]
 gb|ACP54788.1| PilT protein domain protein [Sulfolobus islandicus M.16.27]
Length=129

 Score = 33.1 bits (74),  Expect = 9.9, Method: Compositional matrix adjust.
 Identities = 16/60 (26%), Positives = 29/60 (48%), Gaps = 0/60 (0%)

Query  96   TIDDFHRSAEQLEMVQTNSVRCPTIRYQSMAIVAGGVTAAVNVISLFNEDGRSFDPTCPN  155
            T +D    ++ LE+++   V+ P I       V   +T+ V++I    E+ R F+  C N
Sbjct  13   TFEDSENHSKALEIIEKEDVKIPQIVVYEFLWVLAKLTSDVSLIKTKIEELREFEIICEN  72

____________________________________________________________________________________


b)BLASTx

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

ref|XP_001499179.2|  PREDICTED: similar to Disabled homolog 2 ...  35.8    1.9  
ref|XP_001421089.1|  predicted protein [Ostreococcus lucimarin...  35.4    2.5  
ref|NP_577921.1|  oligopeptide transporter permease appc [Pyro...  35.0    3.3  
ref|YP_002800089.1|  hypothetical protein Avin_29470 [Azotobac...  34.7    4.3  
ref|YP_850024.1|  malonyl CoA-acyl carrier protein transacylas...  34.7    4.3  
ref|YP_315578.1|  TrwC protein [Thiobacillus denitrificans ATC...  34.7    4.3  
ref|ZP_06596791.1|  type I phosphodiesterase / nucleotide pyro...  34.3    5.6  
ref|YP_002322142.1|  type I phosphodiesterase/nucleotide pyrop...  34.3    5.6  
emb|CAB09912.1|  hypothetical protein MLCL383.38c [Mycobacteri...  34.3    5.6  
ref|NP_301401.1|  hypothetical protein ML0431 [Mycobacterium l...  34.3    5.6  
ref|ZP_04443462.1|  [acyl-carrier-protein] S-malonyltransferas...  33.9    7.3  
ref|ZP_03460358.1|  hypothetical protein BACEGG_03174 [Bactero...  33.9    7.3  
gb|AAQ64662.1|  signal transducer and activator of transcripti...  33.9    7.3  
ref|ZP_05365284.1|  hypothetical protein CORTU0001_0894 [Coryn...  33.5    9.6  
ref|XP_002494428.1|  ZYRO0A01254p [Zygosaccharomyces rouxii] >...  33.5    9.6  
ref|ZP_03920510.1|  hypothetical protein HMPREF0305_0699 [Cory...  33.5    9.6  
ref|ZP_04555519.1|  ATPase [Bacteroides sp. D4] >ref|ZP_054135...  33.5    9.6  
ref|XP_002562627.1|  Pc20g00650 [Penicillium chrysogenum Wisco...  33.5    9.6  
gb|EDM14888.1|  transcriptional co-activator with PDZ-binding ...  33.5    9.6  
ref|XP_001835732.1|  hypothetical protein CC1G_07156 [Coprinop...  33.5    9.6  
ref|NP_001020040.1|  WW domain containing transcription regula...  33.5    9.6  
ref|NP_471256.1|  acyl-carrier-protein S-malonyltransferase [L...  33.5    9.6  

ALIGNMENTS
>ref|XP_001499179.2| PREDICTED: similar to Disabled homolog 2 (Differentially-expressed 
protein 2) (DOC-2) [Equus caballus]
Length=744

 Score = 35.8 bits (81),  Expect = 1.9
 Identities = 19/44 (43%), Positives = 21/44 (47%), Gaps = 2/44 (4%)
 Frame = -3

Query  259  SPCARQWLQNQPQIEQPSVPPKAITIPTPHPQVAPHTWSSLEQL  128
            SP    W  NQP     S PP A  +  P   VAP+TWSS   L
Sbjct  527  SPAVASW--NQPSSFAASTPPPAPVVWGPSASVAPNTWSSTSPL  568


>ref|XP_001421089.1| predicted protein [Ostreococcus lucimarinus CCE9901]
 gb|ABO99382.1| predicted protein [Ostreococcus lucimarinus CCE9901]
Length=2146

 Score = 35.4 bits (80),  Expect = 2.5
 Identities = 21/76 (27%), Positives = 39/76 (51%), Gaps = 4/76 (5%)
 Frame = -3

Query  229  QPQIEQPSVPPKA-ITIPTPHPQVAPHTWSSLEQLQ-KLGPNELDYILQAKTDNPTCPFY  56
            QP +E PS PP   + +P+P P + P     L+ LQ + G   +  I   +  +P+C   
Sbjct  773  QPPVEVPSPPPPPPVAVPSPSPPILPTCLRLLQDLQGRCG--SIFGIENDEAFDPSCCDI  830

Query  55   IKALMVGRCYSHSQLL  8
            ++ + V RC+  + ++
Sbjct  831  VRGMNVERCFCDTAII  846


>ref|NP_577921.1| oligopeptide transporter permease appc [Pyrococcus furiosus DSM 
3638]
 gb|AAL80316.1| hypothetical oligopeptide transport system permease protein appc 
[Pyrococcus furiosus DSM 3638]
Length=474

 Score = 35.0 bits (79),  Expect = 3.3
 Identities = 16/43 (37%), Positives = 25/43 (58%), Gaps = 0/43 (0%)
 Frame = -3

Query  226  PQIEQPSVPPKAITIPTPHPQVAPHTWSSLEQLQKLGPNELDY  98
            P + +P++P K  TI   +P+V P TWSS+       P+E+ Y
Sbjct  41   PYVTEPNIPDKWKTIWIENPKVVPPTWSSIFSGVSEAPHEVIY  83


>ref|YP_002800089.1| hypothetical protein Avin_29470 [Azotobacter vinelandii DJ]
 gb|ACO79114.1| hypothetical protein Avin_29470 [Azotobacter vinelandii DJ]
Length=410

 Score = 34.7 bits (78),  Expect = 4.3
 Identities = 27/92 (29%), Positives = 42/92 (45%), Gaps = 11/92 (11%)
 Frame = -3

Query  334  FAPF--LTVQHYGENHQWCPY*TNKIASPCARQWLQNQPQIEQPSVP-PKAITIPTPHPQ  164
            + PF  LT+ +  EN QW P     +++P  R   +N P++E P +   +AI IP     
Sbjct  131  YRPFDELTLSYGRENLQWGPSVILSLSNPFTRDNGRNNPRVEVPGLDYYRAIWIPN----  186

Query  163  VAPHTWSSLEQLQKLGPNELDYILQAKTDNPT  68
                TW +L  +   GP   D +    +  PT
Sbjct  187  ---ETW-TLSTIYNYGPGRSDDVESYMSGQPT  214


>ref|YP_850024.1| malonyl CoA-acyl carrier protein transacylase [Listeria welshimeri 
serovar 6b str. SLCC5334]
 emb|CAK21245.1| fabD [Listeria welshimeri serovar 6b str. SLCC5334]
Length=313

 Score = 34.7 bits (78),  Expect = 4.3
 Identities = 27/95 (28%), Positives = 41/95 (43%), Gaps = 5/95 (5%)
 Frame = +1

Query  220  FAVDSVTTAGRMVRLSYSSNMDTIDDFHRSAEQLEMVQTNSVRCPTIRYQSM----AIVA  387
            F++  + T G M  L+ S N         S   L  ++T  V+   +   S+    A+VA
Sbjct  42   FSITDIITEGPMELLTKSENAQPAL-VSTSVAILRALETYGVKADYVAGHSLGEYSALVA  100

Query  388  GGVTAAVNVISLFNEDGRSFDPTCPNGLGFSTAVI  492
            GG   A + I L  + G   +   PNG G   AV+
Sbjct  101  GGFLEASDAIYLVRKRGELMEKAVPNGAGAMAAVL  135


>ref|YP_315578.1| TrwC protein [Thiobacillus denitrificans ATCC 25259]
 gb|AAZ97773.1| TrwC protein [Thiobacillus denitrificans ATCC 25259]
Length=929

 Score = 34.7 bits (78),  Expect = 4.3
 Identities = 18/68 (26%), Positives = 32/68 (47%), Gaps = 3/68 (4%)
 Frame = -3

Query  265  IASPCARQWLQNQPQIEQPSVPPKAITIPTPHPQVAPHTWSSLEQLQKLGPNELDYILQA  86
            +A+   R+  Q +  + +P  PP A+ +P P P + P  W + E    +   +   IL+ 
Sbjct  859  LAAQIERESGQKETALAEPEPPPPALELPMPGPAIEPAPWETSEPAIDVRTPD---ILEI  915

Query  85   KTDNPTCP  62
            +T  P  P
Sbjct  916  ETPPPAVP  923


>ref|ZP_06596791.1| type I phosphodiesterase / nucleotide pyrophosphatase family 
protein [Bifidobacterium breve DSM 20213]
 gb|EFE88754.1| type I phosphodiesterase / nucleotide pyrophosphatase family 
protein [Bifidobacterium breve DSM 20213]
Length=275

 Score = 34.3 bits (77),  Expect = 5.6
 Identities = 24/66 (36%), Positives = 31/66 (46%), Gaps = 13/66 (19%)
 Frame = -1

Query  294  INGVHIRRIR*PHHAPGSGYRINRKLNNHRSRQK---------P*QSQPLTPKLHLTHGV  142
            ++GV +  +R  +HAP     INR LNN  S +          P  S  LT   H  HGV
Sbjct  10   VDGVRLDIVRKENHAPN----INRILNNGSSAEMTMEVPTISGPGWSSILTGTTHAQHGV  65

Query  141  RSNNFR  124
            + N FR
Sbjct  66   QDNTFR  71


>ref|YP_002322142.1| type I phosphodiesterase/nucleotide pyrophosphatase [Bifidobacterium 
longum subsp. infantis ATCC 15697]
 gb|ACJ51764.1| type I phosphodiesterase/nucleotide pyrophosphatase [Bifidobacterium 
longum subsp. infantis ATCC 15697]
Length=275

 Score = 34.3 bits (77),  Expect = 5.6
 Identities = 24/66 (36%), Positives = 31/66 (46%), Gaps = 13/66 (19%)
 Frame = -1

Query  294  INGVHIRRIR*PHHAPGSGYRINRKLNNHRSRQK---------P*QSQPLTPKLHLTHGV  142
            ++GV +  +R  +HAP     INR LNN  S +          P  S  LT   H  HGV
Sbjct  10   VDGVRLDIVRKENHAPN----INRILNNGSSAEMTMEVPTISGPGWSSILTGTTHAQHGV  65

Query  141  RSNNFR  124
            + N FR
Sbjct  66   QDNTFR  71


>emb|CAB09912.1| hypothetical protein MLCL383.38c [Mycobacterium leprae]
Length=261

 Score = 34.3 bits (77),  Expect = 5.6
 Identities = 15/44 (34%), Positives = 22/44 (50%), Gaps = 2/44 (4%)
 Frame = -3

Query  283  PY*TNKIASPCARQWLQNQPQIEQPSVPPKAITIPTPHPQVAPH  152
            P+   K      R W Q+QP      +PP+ +T+P PH   +PH
Sbjct  53   PWTPKKPPQQLPRYWQQDQPP--PTDIPPEGLTLPPPHEPKSPH  94


>ref|NP_301401.1| hypothetical protein ML0431 [Mycobacterium leprae TN]
 ref|YP_002503032.1| hypothetical protein MLBr_00431 [Mycobacterium leprae Br4923]
 emb|CAC29939.1| putative membrane protein [Mycobacterium leprae]
 emb|CAR70524.1| putative membrane protein [Mycobacterium leprae Br4923]
Length=259

 Score = 34.3 bits (77),  Expect = 5.6
 Identities = 15/44 (34%), Positives = 22/44 (50%), Gaps = 2/44 (4%)
 Frame = -3

Query  283  PY*TNKIASPCARQWLQNQPQIEQPSVPPKAITIPTPHPQVAPH  152
            P+   K      R W Q+QP      +PP+ +T+P PH   +PH
Sbjct  51   PWTPKKPPQQLPRYWQQDQPP--PTDIPPEGLTLPPPHEPKSPH  92


>ref|ZP_04443462.1| [acyl-carrier-protein] S-malonyltransferase [Listeria grayi DSM 
20601]
 gb|EEN77906.1| [acyl-carrier-protein] S-malonyltransferase [Listeria grayi DSM 
20601]
Length=313

 Score = 33.9 bits (76),  Expect = 7.3
 Identities = 24/97 (24%), Positives = 48/97 (49%), Gaps = 5/97 (5%)
 Frame = +1

Query  220  FAVDSVTTAGRMVRLSYSSNMDTIDDFHRSAEQLEMVQTNSVRCPTIRYQSM----AIVA  387
            F++  +   G + +L+ S N       + +A  L +++ ++++   +   S+    A+VA
Sbjct  42   FSLSELIQEGPIEKLTKSENAQPALVTNSTAI-LRVLEAHAIKADYVAGHSLGEYSALVA  100

Query  388  GGVTAAVNVISLFNEDGRSFDPTCPNGLGFSTAVIQL  498
            GG   A + I L N+ G+  +   PNG G   A++ L
Sbjct  101  GGYLKAEDAIYLVNKRGQLMEAAVPNGQGAMAAILGL  137


>ref|ZP_03460358.1| hypothetical protein BACEGG_03174 [Bacteroides eggerthii DSM 
20697]
 gb|EEC52823.1| hypothetical protein BACEGG_03174 [Bacteroides eggerthii DSM 
20697]
Length=441

 Score = 33.9 bits (76),  Expect = 7.3
 Identities = 20/57 (35%), Positives = 32/57 (56%), Gaps = 7/57 (12%)
 Frame = +1

Query  226  VDSVTTAGRMVRLSYSSNMDTIDDFHRSAEQLEMVQTNSVRCPTIRYQSMAIVAGGV  396
            +D VT++G +     S ++DT DD+  + E+L ++Q  S  CP IR  S   +  GV
Sbjct  249  LDDVTSSGEV-----SCHVDTFDDYVAALEKLFVIQNISAWCPAIR--SKTAIRSGV  298