GOS 1385010

Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : JCVI_READ_1091143176888
Annotathon code: GOS_1385010
Sample :
  • GPS :5°33'10n; 87°5'16w
  • Eastern Tropical Pacific: Dirty Rock, Cocos Island - Costa Rica
  • Fringing Reef (-1.1m, 28.3°C, 0.8-3.0 microns)
Authors
Team : Algarve
Username : minifelizardix
Annotated on : 2010-07-08 21:00:54
  • a37095@ualg.pt SoniaOliveira
  • a37097 TatianaRaquelLopesEstevesRafaelFelizardo1
  • a37099 VeraCristinaJordaoMarques1

Synopsis

Genomic Sequence

>JCVI_READ_1091143176888 GOS_1385010 Genomic DNA
CATTCTCGCCTTTGTTTGGTAATTTATCTTTGAGAAAGAAATGTATAAAGTGACATTCATTTATTTTATTGTTTGCGGTAAATAGACCGTTCCATTTCCA
ATGCATCTTTTGTAAAGGTACACGATACTTCTTAATAAAAAAGTTAAGTAAGGTTTGGTCAGTTGACCACTTCCATGTACCTACACCGTCCACAAAGTCT
TTGAATTCACTTCGCATTAGAAACTCTTTTGGTGTTTGTCCTTGTAGATAAGGTTTAAACTTCTGACAATTTAAAAGTATCATACCCATATTACAAAACT
CAAACCCTAATTCATTTGGGTCAAAGTTTATACCACTAAAATCATGTAGTCTTTCATACTGCATATATGAATAGTTAACTATCTTTTTCTCATACCAATC
TTCGATAGGCATATCTCTTTCAACCACTGCACCAAAGGCCGCATCGTCCGAGAACTCGTTAAAGATGTTTGGTGCGTCTGGACGGATATAGATATCTGCA
TCTACGATTGCAATCTGGTCATAGTCATCTATGTAATTAAATGCATTCTCTTTTTCATAGATAGGTAGATATCCACCATACTTCATATAAGACTCTTCAC
TTCTACCAGAATTAAAAGGGTCTGGTTTTATCCATAGTTGTGGTGCAGTTAATACATGATGCGTGATATCGTGTTTTTCACAATAGTTTTTTACTGACTC
GATGCAGTGTTCATATAGTTTCGATTGCGGGCCTACCGCAACCTGAAATATCATTCTATTTATATTTTTCATCTAAAATCTCAAACATTACTTTTAAACC
CGTTTCTCCATGTTCCATGCTACCACCTACAGTTTCATGGTAAACTGGTTTTGTGTAATTATACACATTATTTATTACATTGTCAAGTCCTTTGTAATGC
AACATAAAATCATCAGTAAAGTTTGTTAAGGTTTACTTACTGCAAGTTTCAACATATCAGTTGCATCTCTGGGGTTAGAGTGTATGCGTGTGTGACGGGA
CTATCTTTTTATCTTTGCCCAGATGGGTGGTCAATAAAT

Translation

[268 - 1038/1039]   indirect strand
>GOS_1385010 Translation [268-1038   indirect strand]
MKNINRMIFQVAVGPQSKLYEHCIESVKNYCEKHDITHHVLTAPQLWIKPDPFNSGRSEESYMKYGGYLPIYEKENAFNYIDDYDQIAIVDADIYIRPDA
PNIFNEFSDDAAFGAVVERDMPIEDWYEKKIVNYSYMQYERLHDFSGINFDPNELGFEFCNMGMILLNCQKFKPYLQGQTPKEFLMRSEFKDFVDGVGTW
KWSTDQTLLNFFIKKYRVPLQKMHWKWNGLFTANNKINECHFIHFFLKDKLPNKGEN

[ Warning ] 3' incomplete: following codon is not a STOP
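
The coordinates [268 - 1038/1039] are counted on the reverse-complemented read. As a minimal sketch (the function names are illustrative and not part of any Annotathon tool), the Python below maps them back to the direct strand and shows why the ORF is flagged as 3' incomplete: a single base remains after position 1038, too few for a complete in-frame STOP codon.

```python
def reverse_to_direct(start, end, read_length):
    """Map 1-based inclusive reverse-strand coordinates to the direct strand."""
    return read_length - end + 1, read_length - start + 1


def bases_after_orf(end, read_length):
    """Number of bases left on the scanned strand after the ORF's last base."""
    return read_length - end


READ_LENGTH = 1039             # length of JCVI_READ_1091143176888
ORF_START, ORF_END = 268, 1038  # reverse-strand coordinates from the header

direct_start, direct_end = reverse_to_direct(ORF_START, ORF_END, READ_LENGTH)
codons = (ORF_END - ORF_START + 1) // 3
leftover = bases_after_orf(ORF_END, READ_LENGTH)

print(direct_start, direct_end)  # 2 772
print(codons)                    # 257
print(leftover < 3)              # True: no room for a full codon after the ORF
```

The 257 codons agree with the 257-residue translation shown above.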

Annotator commentaries

This sequence is coding. It participates in a metabolic process, and its molecular function is transferase activity. No further conclusions are possible because most of the required analyses could not be carried out.

ORF finding

PROTOCOL

a) SMS ORF Finder / direct strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code

b) SMS ORF Finder / reverse strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code
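
The protocol above can be approximated with a short Python sketch. It is not the SMS implementation: it assumes that with 'any codon' initiation an ORF is any run of codons closed by a STOP or by the end of the read, and that the 60 AA minimum excludes the STOP itself. All function names are illustrative.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T sequence."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def find_orfs(dna, min_aa=60):
    """List (strand, frame, start, end, protein) tuples.

    Coordinates are 1-based and inclusive on the scanned strand.
    """
    orfs = []
    strands = (("direct", dna.upper()), ("reverse", reverse_complement(dna)))
    for strand, seq in strands:
        for frame in range(3):
            protein = translate(seq[frame:])
            first = 0  # codon index where the current open stretch begins
            # A sentinel stop closes a stretch that runs off the end of the read.
            for i, aa in enumerate(protein + "*"):
                if aa != "*":
                    continue
                if i - first >= min_aa:
                    last = min(i + 1, len(protein))  # keep a real stop, drop the sentinel
                    orfs.append((strand, frame + 1, frame + 3 * first + 1,
                                 frame + 3 * last, protein[first:last]))
                first = i + 1
    return orfs


# Toy input: 60 alanine codons followed by one extra base (no STOP in frame 1).
for orf in find_orfs("GCT" * 60 + "G"):
    print(orf[:4], len(orf[4]))
```

An ORF that ends without a trailing `*` in its protein string ran into the end of the read, which is the situation reported for the reverse-strand ORF of this sequence.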



RESULTS ANALYSIS


The ORF has a START codon but no STOP codon and no internal STOPs. It lies on the reverse strand in reading frame 1. No longer ORF exists, and no other ORF appears to have biological significance. On the forward strand we found 3 ORFs, 2 of which are in reading frame 2. On the reverse strand we found 1 ORF.

RAW RESULTS:

a) forward strand

>No ORFs were found in reading frame 1.

>ORF number 1 in reading frame 2 on the direct strand extends from base 371 to base 568.
ATAGTTAACTATCTTTTTCTCATACCAATCTTCGATAGGCATATCTCTTTCAACCACTGC
ACCAAAGGCCGCATCGTCCGAGAACTCGTTAAAGATGTTTGGTGCGTCTGGACGGATATA
GATATCTGCATCTACGATTGCAATCTGGTCATAGTCATCTATGTAATTAAATGCATTCTC
TTTTTCATAGATAGGTAG

>Translation of ORF number 1 in reading frame 2 on the direct strand.
IVNYLFLIPIFDRHISFNHCTKGRIVRELVKDVWCVWTDIDICIYDCNLVIVIYVIKCIL
FFIDR*

>ORF number 2 in reading frame 2 on the direct strand extends from base 719 to base 907.
TTTCGATTGCGGGCCTACCGCAACCTGAAATATCATTCTATTTATATTTTTCATCTAAAA
TCTCAAACATTACTTTTAAACCCGTTTCTCCATGTTCCATGCTACCACCTACAGTTTCAT
GGTAAACTGGTTTTGTGTAATTATACACATTATTTATTACATTGTCAAGTCCTTTGTAAT
GCAACATAA

>Translation of ORF number 2 in reading frame 2 on the direct strand.
FRLRAYRNLKYHSIYIFHLKSQTLLLNPFLHVPCYHLQFHGKLVLCNYTHYLLHCQVLCN
AT*

>ORF number 1 in reading frame 3 on the direct strand extends from base 54 to base 251.
CATTCATTTATTTTATTGTTTGCGGTAAATAGACCGTTCCATTTCCAATGCATCTTTTGT
AAAGGTACACGATACTTCTTAATAAAAAAGTTAAGTAAGGTTTGGTCAGTTGACCACTTC
CATGTACCTACACCGTCCACAAAGTCTTTGAATTCACTTCGCATTAGAAACTCTTTTGGT
GTTTGTCCTTGTAGATAA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
HSFILLFAVNRPFHFQCIFCKGTRYFLIKKLSKVWSVDHFHVPTPSTKSLNSLRIRNSFG
VCPCR*

b) reverse strand

>ORF number 1 in reading frame 1 on the reverse strand extends from base 268 to base 1038.
ATGAAAAATATAAATAGAATGATATTTCAGGTTGCGGTAGGCCCGCAATCGAAACTATAT
GAACACTGCATCGAGTCAGTAAAAAACTATTGTGAAAAACACGATATCACGCATCATGTA
TTAACTGCACCACAACTATGGATAAAACCAGACCCTTTTAATTCTGGTAGAAGTGAAGAG
TCTTATATGAAGTATGGTGGATATCTACCTATCTATGAAAAAGAGAATGCATTTAATTAC
ATAGATGACTATGACCAGATTGCAATCGTAGATGCAGATATCTATATCCGTCCAGACGCA
CCAAACATCTTTAACGAGTTCTCGGACGATGCGGCCTTTGGTGCAGTGGTTGAAAGAGAT
ATGCCTATCGAAGATTGGTATGAGAAAAAGATAGTTAACTATTCATATATGCAGTATGAA
AGACTACATGATTTTAGTGGTATAAACTTTGACCCAAATGAATTAGGGTTTGAGTTTTGT
AATATGGGTATGATACTTTTAAATTGTCAGAAGTTTAAACCTTATCTACAAGGACAAACA
CCAAAAGAGTTTCTAATGCGAAGTGAATTCAAAGACTTTGTGGACGGTGTAGGTACATGG
AAGTGGTCAACTGACCAAACCTTACTTAACTTTTTTATTAAGAAGTATCGTGTACCTTTA
CAAAAGATGCATTGGAAATGGAACGGTCTATTTACCGCAAACAATAAAATAAATGAATGT
CACTTTATACATTTCTTTCTCAAAGATAAATTACCAAACAAAGGCGAGAAT

>Translation of ORF number 1 in reading frame 1 on the reverse strand.
MKNINRMIFQVAVGPQSKLYEHCIESVKNYCEKHDITHHVLTAPQLWIKPDPFNSGRSEE
SYMKYGGYLPIYEKENAFNYIDDYDQIAIVDADIYIRPDAPNIFNEFSDDAAFGAVVERD
MPIEDWYEKKIVNYSYMQYERLHDFSGINFDPNELGFEFCNMGMILLNCQKFKPYLQGQT
PKEFLMRSEFKDFVDGVGTWKWSTDQTLLNFFIKKYRVPLQKMHWKWNGLFTANNKINEC
HFIHFFLKDKLPNKGEN

No ORFs were found in reading frame 2.

No ORFs were found in reading frame 3.

Multiple Alignment

PROTOCOL



RESULTS ANALYSIS


We do not have data to support an alignment.

RAW RESULTS

Protein Domains

PROTOCOL

InterPro default parameters at EBI



RESULTS ANALYSIS


Only one match was found. It is unintegrated (not assigned to an InterPro entry), and its molecular function is transferase activity. The match has a significant E-value (2.5e-06). Proteins of this superfamily (nucleotide-diphospho-sugar transferases) catalyze the transfer of a sugar moiety from a nucleotide-diphospho-sugar donor to an acceptor molecule.



RAW RESULTS

Sequence_1	C43F5BCACF791BAE	257	superfamily	SSF53448	Nucleotide-diphospho-sugar transferases	71	250	2.5e-06	T	16-Mar-2010	NULL	NULL	
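
As a rough illustration of how the raw line above can be read, the following Python sketch splits it on tabs and computes how much of the 257-residue translation the match covers. The column meanings (identifier, checksum, length, method, entry, description, start, end, E-value, ...) are inferred from the line itself and should be treated as an assumption; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DomainMatch:
    protein_id: str
    length: int       # length of the query protein
    method: str       # analysis that produced the match
    entry: str        # member-database accession
    description: str
    start: int        # first matched residue (1-based)
    end: int          # last matched residue (inclusive)
    evalue: float


def parse_raw_line(line):
    """Parse one tab-separated raw result line (column order assumed)."""
    f = line.rstrip("\n").split("\t")
    return DomainMatch(f[0], int(f[2]), f[3], f[4], f[5],
                       int(f[6]), int(f[7]), float(f[8]))


def coverage(match):
    """Fraction of the query protein spanned by the match."""
    return (match.end - match.start + 1) / match.length


raw = ("Sequence_1\tC43F5BCACF791BAE\t257\tsuperfamily\tSSF53448\t"
       "Nucleotide-diphospho-sugar transferases\t71\t250\t2.5e-06\tT\t"
       "16-Mar-2010\tNULL\tNULL\t")
match = parse_raw_line(raw)
print(match.entry, match.start, match.end, f"{coverage(match):.1%}")
```

Residues 71 to 250 span 180 of 257 positions, so the superfamily match covers roughly 70% of the translated ORF.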

Phylogeny

PROTOCOL



RESULTS ANALYSIS


Without the alignment, we cannot build the tree.

RAW RESULTS

Taxonomy report

PROTOCOL

BLASTx vs NR, default NCBI parameters + "1000 Max target sequences"


RESULTS ANALYSIS


We cannot complete the taxonomy report because the hits have no biological significance.

RAW RESULTS

Sphingobacterium spiritivorum ATCC 33300 [CFB group bacteria]
. Sphingobacterium spiritivorum ATCC 33300 -   35 2 hits [CFB group bacteria]  conserved hypothetical protein [Sphingobacterium spiritivor

BLAST

PROTOCOL

1) BLASTp vs NR, default NCBI parameters + "1000 Max target sequences"

2) BLASTx vs NR, default NCBI parameters + "1000 Max target sequences"



RESULTS ANALYSIS


The E-values are too high to be considered significant, and the hits are short, so the similarity is low.

We cannot establish homologies because we have no alignment.
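
The judgement that the E-values are too high can be made explicit with a small filter over the BLAST summary table. This is only a sketch: the cutoff of 1e-3 is an assumed, commonly used threshold rather than one stated in this annotation, and the function names are illustrative.

```python
def parse_hit_line(line):
    """Split one BLAST summary line into (description, bit_score, e_value)."""
    description, bits, evalue = line.rstrip().rsplit(None, 2)
    return description.strip(), float(bits), float(evalue)


def significant_hits(lines, max_evalue=1e-3):
    """Keep only hits whose E-value is at or below the (assumed) cutoff."""
    hits = [parse_hit_line(line) for line in lines if line.strip()]
    return [hit for hit in hits if hit[2] <= max_evalue]


# Three lines copied from the BLASTp summary table in RAW RESULTS.
table = [
    "ref|YP_001381601.1|  hypothetical protein Anae109_4439 [Anaero...  41.2    0.098",
    "ref|YP_001095823.1|  glycosyl transferase family protein [Shew...  39.7    0.31 ",
    "ref|YP_794500.1|  glycosyltransferase-like protein [Lactobacil...  38.1    1.0  ",
]

best = min(parse_hit_line(line)[2] for line in table)
print(best)                     # 0.098
print(significant_hits(table))  # []
```

Even the best hit (E-value 0.098) falls far short of the cutoff, which is consistent with the conclusion that no homology can be established from these results.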

RAW RESULTS

1) BLASTp
Sequences producing significant alignments:                       (Bits)  Value

ref|YP_001381601.1|  hypothetical protein Anae109_4439 [Anaero...  41.2    0.098
ref|ZP_06370161.1|  hypothetical protein DFW101DRAFT_2731 [Des...  40.8    0.13 
ref|YP_002951438.1|  hypothetical protein DMR_00610 [Desulfovi...  40.4    0.17 
ref|ZP_02156141.1|  hypothetical protein KT99_03107 [Shewanell...  39.7    0.28 
ref|YP_001095823.1|  glycosyl transferase family protein [Shew...  39.7    0.31 
ref|YP_794500.1|  glycosyltransferase-like protein [Lactobacil...  38.1    1.0  
ref|YP_001323648.1|  L-tyrosine decarboxylase [Methanococcus v...  37.4    1.7  
ref|YP_002406292.1|  putative zinc-dependent hydrolase [Escher...  37.4    1.7  
ref|YP_001742566.1|  hypothetical protein EcSMS35_0465 [Escher...  37.0    1.8  
ref|YP_001196464.1|  carboxymethylenebutenolidase [Flavobacter...  36.6    3.0  
ref|YP_003232988.1|  hypothetical protein ECO111_0458 [Escheri...  36.2    3.1  
gb|EDL11534.1|  contactin associated protein 4, isoform CRA_c ...  36.2    3.5  
dbj|BAC35446.1|  unnamed protein product [Mus musculus]            36.2    3.5  
dbj|BAD32533.1|  mKIAA1763 protein [Mus musculus]                  36.2    3.6  
ref|NP_569724.2|  contactin associated protein-like 4 precurso...  36.2    3.7  
sp|Q99P47.1|CNTP4_MOUSE  RecName: Full=Contactin-associated pr...  36.2    3.7  
gb|EDL11533.1|  contactin associated protein 4, isoform CRA_b ...  35.8    4.1  
ref|YP_003227539.1|  hypothetical protein ECO26_0460 [Escheric...  35.8    4.6  
ref|ZP_03049688.1|  conserved domain protein [Escherichia coli...  35.4    6.0  
ref|ZP_05434260.1|  Sel1 domain protein repeat-containing prot...  35.4    6.0  
ref|YP_001548870.1|  L-tyrosine decarboxylase [Methanococcus m...  35.4    6.0  
ref|YP_001461611.1|  hypothetical protein EcE24377A_0460 [Esch...  35.4    6.0  
ref|YP_001457268.1|  hypothetical protein EcHS_A0500 [Escheric...  35.4    6.2  
ref|YP_002291725.1|  hypothetical protein ECSE_0450 [Escherich...  35.4    6.3  
ref|ZP_03726297.1|  hypothetical protein ObacDRAFT_7121 [Opitu...  35.4    6.6  
ref|ZP_03028517.1|  conserved domain protein [Escherichia coli...  35.0    6.8  
ref|ZP_01060541.1|  dienelactone hydrolase family protein [Lee...  35.0    7.1  
ref|YP_001098058.1|  L-tyrosine decarboxylase [Methanococcus m...  35.0    7.6  
ref|XP_390402.1|  hypothetical protein FG10226.1 [Gibberella z...  35.0    8.3  
emb|CAR63633.1|  hypothetical protein [Angiostrongylus cantone...  35.0    8.6  
ref|YP_001330347.1|  L-tyrosine decarboxylase [Methanococcus m...  34.7    9.6  

ALIGNMENTS
>ref|YP_001381601.1| hypothetical protein Anae109_4439 [Anaeromyxobacter sp. Fw109-5]
 gb|ABS28617.1| hypothetical protein Anae109_4439 [Anaeromyxobacter sp. Fw109-5]
Length=305

 Score = 41.2 bits (95),  Expect = 0.098, Method: Compositional matrix adjust.
 Identities = 29/103 (28%), Positives = 52/103 (50%), Gaps = 11/103 (10%)

Query  4    INRMIFQVAVGPQ-SKLYEHCIESVKNYCEKHDITHHVLTAPQLWIKPDPFNSG--RSEE  60
            + + +  +A+G +  + YEH +   + +  +H    HV+ +      P+PF     RS++
Sbjct  22   MRKALVTMAIGARFEREYEHLLPLREAWARRHGWDLHVVRS-----IPEPFVRAHSRSDK  76

Query  61   SYMKYGGYLPIYEKENAFNYIDDYDQIAIVDADIYIRPDAPNI  103
              + +  YL   +   AF+   DYD +A VD+DI I P AP +
Sbjct  77   PGLGWCCYLYKLQLPEAFS---DYDLVAYVDSDIAINPAAPCL  116


>ref|ZP_06370161.1| hypothetical protein DFW101DRAFT_2731 [Desulfovibrio sp. FW1012B]
 gb|EFC19728.1| hypothetical protein DFW101DRAFT_2731 [Desulfovibrio sp. FW1012B]
Length=273

 Score = 40.8 bits (94),  Expect = 0.13, Method: Compositional matrix adjust.
 Identities = 32/105 (30%), Positives = 43/105 (40%), Gaps = 15/105 (14%)

Query  85   DQIAIVDADIYIRPDAPNIFNEFSDDAAFGAVVERDMPIEDWYEKKIVNYSYMQYERLHD  144
            DQ+A +DADI IRPDAPN+F     D  FGAV     P  + Y           Y   H 
Sbjct  72   DQVAWLDADIVIRPDAPNVFEGVPPD-TFGAVDAFASPSPEAYAAAAGKIR--AYLAGHG  128

Query  145  FSGINFDPNELGF-----------EFCNMGMILLNCQKFKPYLQG  178
             +G + D    GF                G+++L  +   P L+ 
Sbjct  129  LAGPD-DATPEGFYHHYGFDTGPDRVAQTGVLVLTPEVHGPVLEA  172


>ref|YP_002951438.1| hypothetical protein DMR_00610 [Desulfovibrio magneticus RS-1]
 dbj|BAH73552.1| hypothetical protein [Desulfovibrio magneticus RS-1]
Length=328

 Score = 40.4 bits (93),  Expect = 0.17, Method: Compositional matrix adjust.
 Identities = 37/153 (24%), Positives = 59/153 (38%), Gaps = 26/153 (16%)

Query  1    MKNINRMIFQVAVGP--QSKLYEHCIESVKNYCEKHDITHHVLTAPQLWIKPDPFNSGRS  58
            +  + R I  + VG   + +   H  ESV+ Y + H +   VL  P   I   P    RS
Sbjct  21   LPAMKRAIATLVVGEACRKRFAAHAAESVRAYAKSHGLALVVLDKP---IDDSPRGRARS  77

Query  59   EESYMKYGGYLPIYEKENAFNY--IDDYDQIAIVDADIYIRPDAPNIFNEFSDDAAFGAV  116
                       P ++K   F+   +   DQ+  +DADI + P AP++F+        GAV
Sbjct  78   -----------PAWQKCLLFDVPKLRACDQVLWLDADIAVAPGAPDVFDGVP-PGTLGAV  125

Query  117  VERDMPIEDWYEKKIVNYSYMQYERLHDFSGIN  149
                   E +      +Y+  Q       + I 
Sbjct  126  -------EQFSSPTPADYATAQAATRRRLAAIG  151


>ref|ZP_02156141.1| hypothetical protein KT99_03107 [Shewanella benthica KT99]
 gb|EDQ02468.1| hypothetical protein KT99_03107 [Shewanella benthica KT99]
Length=283

 Score = 39.7 bits (91),  Expect = 0.28, Method: Compositional matrix adjust.
 Identities = 59/259 (22%), Positives = 108/259 (41%), Gaps = 48/259 (18%)

Query  4    INRMIFQVAVGPQSKLYEHCIESVKNYCEKHDITHHVLTAPQLWIKPDPFNSGRSEESYM  63
            + + IF +A+G  + +Y+  IES + Y EK  +   ++ +  L  + D  N         
Sbjct  1    MQKAIFTLAIG-DNPMYKAAIESFRAYAEK--VGADLIISDALHYQVDIKNP--------  49

Query  64   KYGGYLPIYEKENAFNYIDDYDQIAIVDADIYIRPDAPNIFNEFSDDAAF-----GAVVE  118
            KY       EK      +  YD++  +DAD  I P+A N+F+   D  +      G   +
Sbjct  50   KYDANPAWSEKLYIGELLKRYDRVLYLDADFIITPEAENVFDTLDDLNSVYMFNEGRYRD  109

Query  119  RDMPIEDWYEKKIVNYSYMQYERLHDFSGINFDPNELGFEFCNMGMILLNCQKFKPYLQG  178
            R   I++              + L D    N+D       + NMGM+L++          
Sbjct  110  RTPVIKE------------ACDLLGDVP--NWDSESNLPVYFNMGMLLIS---------K  146

Query  179  QTPK-EFLMRSEFKDFVDGVGTWKWSTDQTLLNFFIKKYRVPLQKMHWKWNGL-FTANNK  236
            Q P  E L   + +   + +  +    +QTL N+ I+K+++  + +  K+N +     + 
Sbjct  147  QCPLFEHLSVEKLQSVCNKIKFY----EQTLTNYMIQKHKISYKCVESKFNRMDLLGLDN  202

Query  237  INECHFIHFF---LKDKLP  252
              +  FIH+     +DK P
Sbjct  203  YRQADFIHYAGRGFRDKCP  221


>ref|YP_001095823.1| glycosyl transferase family protein [Shewanella loihica PV-4]
 gb|ABO25564.1| glycosyl transferase, family 8 [Shewanella loihica PV-4]
Length=284

 Score = 39.7 bits (91),  Expect = 0.31, Method: Compositional matrix adjust.
 Identities = 55/251 (21%), Positives = 100/251 (39%), Gaps = 49/251 (19%)

Query  4    INRMIFQVAVGPQSKLYEHCIESVKNYCEKHDITHHVLTAPQLWIKPDPFNSGRSEESYM  63
            + + IF +A+G  + +Y   ++S + Y EK  +   ++ + +L  K    N         
Sbjct  1    MKKAIFTLAIG-DNPMYRAALKSFERYAEK--VGADLVVSDRLHYKIHIENP--------  49

Query  64   KYGGYLPIYEKENAFNYIDDYDQIAIVDADIYIRPDAPNIFNEFSDDAAF-----GAVVE  118
            KY       EK      +  YD++  +DADI + P A +IF E+ D         G   +
Sbjct  50   KYSASPAWPEKLYTAELLKQYDRVLYLDADIMVTPWARDIFEEYQDLETVYMFDEGPYTD  109

Query  119  RDMPIEDWYEKKIVNYSYMQYERLHDFSGINFDPNELG-FEFCNMGMILLN--CQKFKPY  175
            R +P+ +      +N +         F  +   P   G + + N GM L++  C  F   
Sbjct  110  RTIPVGE------INAA---------FDAVESWPQTDGHYSYYNFGMFLISKECPLF---  151

Query  176  LQGQTPKEFLMRSEFKDFVDGVGTWKWSTDQTLLNFFIKKYRVPLQKMHWKWNGL-FTAN  234
                   E       +   + V  +    DQT +N+ I+K ++  Q +   +N +    N
Sbjct  152  -------ELATLEGMQQVCNTVKFY----DQTYVNYVIQKNKIKNQGVDAAFNRMDLLGN  200

Query  235  NKINECHFIHF  245
                +  FIH+
Sbjct  201  EDYRKADFIHY  211


>ref|YP_794500.1| glycosyltransferase-like protein [Lactobacillus brevis ATCC 367]
 gb|ABJ63469.1| Glycosyltransferase related enzyme [Lactobacillus brevis ATCC 
367]
Length=339

 Score = 38.1 bits (87),  Expect = 1.0, Method: Compositional matrix adjust.
 Identities = 26/103 (25%), Positives = 46/103 (44%), Gaps = 11/103 (10%)

Query  85   DQIAIVDADIYIRPDA-PNIFNEFSD--DAAFGAVVERDMPIEDWYEKKIVNYSYMQYER  141
            D I  +DAD+   P   P +  +  D  D  +G  V  D   + W+++      Y    +
Sbjct  100  DLIITIDADLQDDPTIIPAMIAQAQDGYDIVYG--VRNDRSTDTWFKRTTAQSFYWLMGK  157

Query  142  LHDFSGINFDPNELGFEFCNMGMI--LLNCQKFKPYLQGQTPK  182
            L    G+N  PN   F      ++  L+ CQ+ +P+++G  P+
Sbjct  158  L----GVNLIPNHSDFRLLTHRVVTELMRCQESRPFIRGLIPQ  196


>ref|YP_001323648.1| L-tyrosine decarboxylase [Methanococcus vannielii SB]
 sp|A6URB4.1|MFNA_METVS RecName: Full=L-tyrosine decarboxylase; Short=TDC
 gb|ABR55036.1| Pyridoxal-dependent decarboxylase [Methanococcus vannielii SB]
Length=384

 Score = 37.4 bits (85),  Expect = 1.7, Method: Compositional matrix adjust.
 Identities = 27/73 (36%), Positives = 34/73 (46%), Gaps = 6/73 (8%)

Query  110  DAAFGAVVERDMPIEDWYEK-KIVNYSYMQYERLHDFSGINFDPNELGFEFCNMGMILLN  168
            DAAFG  V     I   Y+K K+ NY Y     L   S I  DP+++G    + G IL  
Sbjct  193  DAAFGGFV-----IPFLYDKYKLKNYRYEFDFSLEGVSSITIDPHKMGLAPISAGGILFR  247

Query  169  CQKFKPYLQGQTP  181
               FK YL   +P
Sbjct  248  NNSFKKYLDVDSP  260


>ref|YP_002406292.1| putative zinc-dependent hydrolase [Escherichia coli IAI39]
 emb|CAR16388.1| putative zinc-dependent hydrolase [Escherichia coli IAI39]
Length=331

 Score = 37.4 bits (85),  Expect = 1.7, Method: Compositional matrix adjust.
 Identities = 51/243 (20%), Positives = 96/243 (39%), Gaps = 31/243 (12%)

Query  36   ITHHVLTAPQLWIKPDPFNSGRSEESYMKYGGYLP---IYEKENAFNYIDDYDQIAIVDA  92
            +T  +L +  L + P    S ++ E Y    G  P   +  + +    + D D  +++  
Sbjct  22   LTKRILFSLMLIVSPCVVASEKAHELYDSIYGGQPAPDVINRLHKMAELGDVDAQSLLGW  81

Query  93   DIY-----IRPDAPNIFNEFSDDAAFGAVVERDMPI---EDWYEKKIVNYSYMQYERLHD  144
            + Y      +PD       F   A  G   +R+ P+     +Y+ ++V   Y +   L +
Sbjct  82   EYYQPRFDTKPDVQEAIKWFELAAKHG---DREAPLALGSIYYDGELVRVDYAKAYALFN  138

Query  145  FS---GINFDPNELGFEFCNMGMILLNCQKFKPYLQGQT-----PKEFLMRSEFKDFVDG  196
             +   G+N   + LG  + N   + ++C+K K YL         P++FL     KD +D 
Sbjct  139  QAAQYGVNLAWSRLGMMYANGQYVEVDCKKAKEYLDKGVHIYGGPEDFLATCR-KDMID-  196

Query  197  VGTWKWSTDQTLLNFFIKKYRVP---LQKMHWKWNGLFTANNKINECHFIHFFLKDKLPN  253
                + + D TL    + +  +    L K     + LF   NK+ E   +      + P+
Sbjct  197  ----RKTVDDTLPVITVTRSGMRDNFLDKGFSCMDSLFATTNKLGEVANLRVTFSIRSPS  252

Query  254  KGE  256
              E
Sbjct  253  GKE  255


>ref|YP_001742566.1| hypothetical protein EcSMS35_0465 [Escherichia coli SMS-3-5]
 gb|ACB16936.1| conserved domain protein [Escherichia coli SMS-3-5]
Length=311

 Score = 37.0 bits (84),  Expect = 1.8, Method: Compositional matrix adjust.
 Identities = 51/243 (20%), Positives = 96/243 (39%), Gaps = 31/243 (12%)

Query  36   ITHHVLTAPQLWIKPDPFNSGRSEESYMKYGGYLP---IYEKENAFNYIDDYDQIAIVDA  92
            +T  +L +  L + P    S ++ E Y    G  P   +  + +    + D D  +++  
Sbjct  2    LTKRILFSLMLIVSPCVVASEKAHELYDSIYGGQPAPDVINRLHKMAELGDVDAQSLLGW  61

Query  93   DIY-----IRPDAPNIFNEFSDDAAFGAVVERDMPI---EDWYEKKIVNYSYMQYERLHD  144
            + Y      +PD       F   A  G   +R+ P+     +Y+ ++V   Y +   L +
Sbjct  62   EYYQPRFDTKPDVQEAIKWFELAAKHG---DREAPLALGSIYYDGELVRVDYAKAYALFN  118

Query  145  FS---GINFDPNELGFEFCNMGMILLNCQKFKPYLQGQT-----PKEFLMRSEFKDFVDG  196
             +   G+N   + LG  + N   + ++C+K K YL         P++FL     KD +D 
Sbjct  119  QAAQYGVNLAWSRLGMMYANGQYVEVDCKKAKEYLDKGVHIYGGPEDFLATCR-KDMID-  176

Query  197  VGTWKWSTDQTLLNFFIKKYRVP---LQKMHWKWNGLFTANNKINECHFIHFFLKDKLPN  253
                + + D TL    + +  +    L K     + LF   NK+ E   +      + P+
Sbjct  177  ----RKTVDDTLPVITVTRSGMRDNFLDKGFSCMDSLFATTNKLGEVANLRVTFSIRSPS  232

Query  254  KGE  256
              E
Sbjct  233  GKE  235


>ref|YP_001196464.1| carboxymethylenebutenolidase [Flavobacterium johnsoniae UW101]
 gb|ABQ07145.1| Carboxymethylenebutenolidase [Flavobacterium johnsoniae UW101]
Length=298

 Score = 36.6 bits (83),  Expect = 3.0, Method: Compositional matrix adjust.
 Identities = 31/130 (23%), Positives = 58/130 (44%), Gaps = 12/130 (9%)

Query  60   ESYMKYGGYLPIYEKENAFNYIDDYDQIAIVDADIYIRPDAPNIFNEFSDDAAFGAVVER  119
            ES  K  G + ++E      YI+D  + A ++  I + PDA +    +  +   G  +++
Sbjct  94   ESKKKLPGIIVVHENRGLNPYIEDVGRRAALEGFITLAPDALSPLGGYPGNDDEGRELQK  153

Query  120  DMPIEDWYEKKIVNYSYMQ-YERLHDFSGINFDPNELGFEFCN-----MGMILLNCQKFK  173
                E+  E  I  Y Y++ ++  + + G+      +GF F       M + +L+ +   
Sbjct  154  KRTREEMLEDFIAAYEYLKSHKDCNGYVGV------VGFCFGGWISNMMAVKILDLKAAV  207

Query  174  PYLQGQTPKE  183
            PY  GQ  KE
Sbjct  208  PYYGGQPAKE  217


>ref|YP_003232988.1| hypothetical protein ECO111_0458 [Escherichia coli O111:H- str. 
11128]
 dbj|BAI34437.1| hypothetical protein [Escherichia coli O111:H- str. 11128]
Length=311

 Score = 36.2 bits (82),  Expect = 3.1, Method: Compositional matrix adjust.
 Identities = 52/243 (21%), Positives = 95/243 (39%), Gaps = 31/243 (12%)

Query  36   ITHHVLTAPQLWIKPDPFNSGRSEESYMKYGGYLPIYEKENAFNYI---DDYDQIAIVDA  92
            +T  +L +  L + P    S ++ E Y    G  P  +  N  + +    D D  +++  
Sbjct  2    LTKRILFSVMLIVSPSVVASEKAHELYDSIYGGKPAPDVINTLHKMAESGDIDAQSLLGW  61

Query  93   DIY-----IRPDAPNIFNEFSDDAAFGAVVERDMPIE---DWYEKKIVNYSYMQYERLHD  144
            + Y      +PD       F   A  G   +R+ P+     +Y+ + V   Y +   L +
Sbjct  62   EYYQPRYDTKPDVQEAIKWFELAAKQG---DREAPLALGGIYYDGEQVRVDYAKAYALFN  118

Query  145  FS---GINFDPNELGFEFCNMGMILLNCQKFKPYLQGQT-----PKEFLMRSEFKDFVDG  196
             +   G+N   + LG  + N   + ++C+K K YL         P++FL     KD +D 
Sbjct  119  QAAQHGVNLARSRLGIMYANGQYVEVDCKKAKEYLDKGVHIYGGPEDFLATCR-KDMID-  176

Query  197  VGTWKWSTDQTLLNFFIKKYRVP---LQKMHWKWNGLFTANNKINECHFIHFFLKDKLPN  253
                + + D TL    + +  +    L K     + LF   NK+ E   +      + P+
Sbjct  177  ----RKTVDDTLPVITVTRSGMRDNFLDKGFSCMDSLFATTNKLGEVANLRVTFSIRRPS  232

Query  254  KGE  256
              E
Sbjct  233  GKE  235


>gb|EDL11534.1| contactin associated protein 4, isoform CRA_c [Mus musculus]
Length=1200

 Score = 36.2 bits (82),  Expect = 3.5, Method: Composition-based stats.
 Identities = 16/54 (29%), Positives = 29/54 (53%), Gaps = 6/54 (11%)

Query  21   EHCIESVKNYCEKHDITHHVLTAPQLWIKPDPFNSGRSEESYMKYGGYLPIYEK  74
            EHC + +  YC+K  + +    +P+ W        GR+ E+   +GG LP+++K
Sbjct  696  EHCQQELVYYCKKSRLVNQQDGSPRSWW------VGRTNETQTYWGGSLPVHQK  743


2) BLASTx

Sequences producing significant alignments:                       (Bits)  Value

ref|ZP_03968186.1|  conserved hypothetical protein [Sphingobac...  35.0    8.5  

ALIGNMENTS
>ref|ZP_03968186.1| conserved hypothetical protein [Sphingobacterium spiritivorum 
ATCC 33300]
 gb|EEI92024.1| conserved hypothetical protein [Sphingobacterium spiritivorum 
ATCC 33300]
Length=1247

 Score = 35.0 bits (79),  Expect = 8.5
 Identities = 28/97 (28%), Positives = 44/97 (45%), Gaps = 11/97 (11%)
 Frame = +1

Query  358   DMPIEDWYEKKIVNYSYMQYERLHDFSGI-NFDPNELGFEFCNMGMILLNCQKFKPYLQG  534
             D PI +W   K+ N S+M Y  L     I N++ +++     NM  +  N + F   +  
Sbjct  945   DQPIGNWNTVKVANMSFMFYNALVFNQNIGNWNTSQV----TNMSNLFSNAKAFNQNIDN  1000

Query  535   -QTPKEFLMRSEFKD---FVDGVGTWKWS--TDQTLL  627
               T K   M S F+D   F   +G+W     TD +L+
Sbjct  1001  WDTQKVVTMNSMFRDASVFNQNIGSWNTQKVTDMSLM  1037