GOS 1464010

From Metagenes
Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : JCVI_READ_1091120595580
Annotathon code: GOS_1464010
Sample :
  • GPS: 24°29'18"N; 83°4'12"W
  • Caribbean Sea: Off Key West, FL - USA
  • Coastal (-1.7m, 25°C, 0.1-0.8 microns)
Authors
Team : Algarve
Username : CDL3
Annotated on : 2010-07-12 12:24:42
  • a34619 DallylAymardElKhoury1
  • a37000 CarlaAlexandraTavaresLopesVarela1
  • a37829 LiliaEneidaGóiaMendesCosta1

Synopsis

Genomic Sequence

>JCVI_READ_1091120595580 GOS_1464010 Genomic DNA
GGCGATCTTGGTGTCCTGCGGGATGTGCCGAGAGATGGTGGTGTTCAGCACGTTTCGGGCCACAAAGCGTTCTGCGGTGAGCTTGACCTGGTCGGTGATG
ACGCCCAGCACCCGTTCTGGCGATTGACTCGAGAAGTCCACCGTGGCCAAGAGCGGGAGGAATCGATGAAACGGCTTGAAACGCCTCAGTGTAAGATACT
AACTTCAATCGAGCAGACAGAGGTTCGTTTCATTTCTTGGACCCAAGAAGGCTAATATATCATCCAATCACGATTTTTCTTTGTGAGCCAAAAGGGACTA
CTGCTCCTCAGTGCAGTCCTTGGCGGGGCTGCACAAACGGTGGCAGAATCCTTTGGTATCTCGTCTGTGACGACGATTTTGAGTTGCCTGTTCCTCCTCC
TATTCATTCAGGAGGCATGGATTATTCAGTGGAAGATGGGAGGTGAGAACGTCGATTTCACCCGCCGGTTTAATCGGAAATATCCTCCGGGCGAGTTTTC
TCTCGTGAGGTTTGGGCCCCACCTCGTTGGAACGCTTGGCGTTGTGTTGTTCATCGCATTTCTCGGGTTGCCAACAAATGCAGTCTCGATTCATGCCGTC
TTTTGCAGTTTCTTCATTCTGAAATGGCTTTCAGATCCGCTTCTCGGGTGCCTCGCAGAGTACCACAACCGGTTGTTCCAACCCATTTTCGCGTACGTTG
TGATTTTCATCTATGCAACGACGTCGTCCGTTGAGTCCCAAATCCTCGAGTTGACACCGATTCCATGGACCCAATCCGTGAGCCTTGTCCTGTTCTTCTT
TGCAGTCCTACAAATGCGAATGGCGTACTATCAAAAATTCTGTTTCCGATCTGAGATCTCGGTTGATGGACAACTGAACCTCGTCCTCATGTCACTTCTC
TTTATGTCGCTTCCAGGCTTGGCCTACGCCATGGATGTCTTCATGGCCGACTGAGTGAGTGCGTGACGGCAAGGAAAGCATGTTCCTCGCCATGCTTTTG
CTTCTGGTCAG

Translation

[256 - 951/1011]   direct strand
>GOS_1464010 Translation [256-951   direct strand]
YIIQSRFFFVSQKGLLLLSAVLGGAAQTVAESFGISSVTTILSCLFLLLFIQEAWIIQWKMGGENVDFTRRFNRKYPPGEFSLVRFGPHLVGTLGVVLFI
AFLGLPTNAVSIHAVFCSFFILKWLSDPLLGCLAEYHNRLFQPIFAYVVIFIYATTSSVESQILELTPIPWTQSVSLVLFFFAVLQMRMAYYQKFCFRSE
ISVDGQLNLVLMSLLFMSLPGLAYAMDVFMAD

[ Warning ] 5' incomplete: does not start with a Methionine

Annotator commentaries

Our sequence encodes a protein, but the protein is incomplete: the translation does not start with a methionine, so the 5' end is missing. Without the complete protein we cannot determine its molecular weight.


In both BLAST searches (BLASTp with the protein, BLASTx with the nucleotide sequence) we found high E-values; the lowest was 1.2. To be meaningful, an E-value should be 1e-04 or lower, so our results are not significant and we cannot base any analysis on them.


Without significant (low) E-values, and because the two BLAST searches gave very different results, we cannot determine which taxonomic group the sequence belongs to, nor its biological function.
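
The E-value cutoff described above can be sketched as a simple filter. This is a minimal illustration, not part of the original protocol: the function name `significant_hits` is hypothetical, and the four (accession, E-value) pairs are copied from the BLASTp raw results on this page.

```python
# E-value cutoff quoted in the commentary above (1e-04).
E_VALUE_CUTOFF = 1e-4

# (accession, E-value) pairs copied from the BLASTp raw results on this page.
blastp_hits = [
    ("ZP_04609051.1", 1.2),
    ("BAD21143.1", 1.4),
    ("XP_001496123.1", 2.0),
    ("NP_082582.1", 4.5),
]


def significant_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Return only the hits whose E-value is at or below the cutoff."""
    return [(acc, e) for acc, e in hits if e <= cutoff]


best_e = min(e for _, e in blastp_hits)
kept = significant_hits(blastp_hits)
print(f"best E-value: {best_e}, significant hits: {len(kept)}")
```

With the best E-value at 1.2, no hit passes the cutoff, which is why no functional or taxonomic conclusion is drawn.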

ORF finding

PROTOCOL

a) SMS ORFinder / forward strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code

b) SMS ORFinder / reverse strand / frames 1, 2 & 3 / min 60 AA / 'any codon' initiation / 'standard' genetic code
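
The protocol above can be approximated with a short script. This is a simplified sketch and not the SMS ORF Finder implementation: it assumes that with 'any codon' initiation an ORF runs from the start of the frame (or the base after the previous stop codon) to the next stop codon inclusive, or to the end of the sequence, and that the minimum length counts amino acids without the stop. The names `find_orfs` and `revcomp` are hypothetical. The demo fragment is the last line of ORF number 2 in frame 1 from the raw results below.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1) in the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def revcomp(seq):
    """Reverse complement, used to scan the reverse strand."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, frame, min_aa=60):
    """Return (start, end, protein) tuples for one frame (1, 2 or 3).

    Coordinates are 1-based and inclusive; the stop codon, when present,
    is included in the span and shown as '*' in the protein.
    """
    seq = seq.upper()
    orfs = []
    start = frame - 1  # 0-based index of the first base of the current ORF
    protein = []
    for i in range(frame - 1, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # 'X' for ambiguous codons
        protein.append(aa)
        if aa == "*":
            if len(protein) - 1 >= min_aa:
                orfs.append((start + 1, i + 3, "".join(protein)))
            start = i + 3
            protein = []
    # Open-ended ORF that reaches the end of the sequence without a stop.
    if len(protein) >= min_aa:
        orfs.append((start + 1, start + 3 * len(protein), "".join(protein)))
    return orfs


# Last 39 bases of ORF number 2 (frame 1, direct strand), from the raw results.
demo = "GGCTTGGCCTACGCCATGGATGTCTTCATGGCCGACTGA"
print(find_orfs(demo, frame=1, min_aa=10))
```

For the reverse strand, the same function would be called on `revcomp(seq)`, in which case the coordinates refer to positions on the reverse-complemented sequence.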



RESULTS ANALYSIS


On the forward strand we found two ORFs in frame 1, none in frame 2 and two in frame 3.

These ORFs are respectively 255, 699, 252 and 261 nucleotides long (coordinates are inclusive, so length = end - start + 1).


On the reverse strand we found two ORFs in frame 1, none in frame 2 and two in frame 3.

These ORFs are respectively 231, 198, 180 and 228 nucleotides long.


The ORF that most probably encodes our protein is the second ORF in frame 1 of the forward strand (bases 256 to 954), because it is the longest one at 699 nucleotides. Although this ORF is coding, the protein is not complete: the 5' end of the sequence is missing, so the translation does not start with a methionine.
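
The length arithmetic can be checked with a small helper. The ORF finder reports inclusive coordinates, so a span from base 256 to base 954 covers 954 - 256 + 1 = 699 bases, that is 233 codons, or 232 amino acids plus the stop codon. The function names below are hypothetical and only illustrate the calculation.

```python
def orf_length_nt(start, end):
    """Length in nucleotides of an ORF given 1-based inclusive coordinates."""
    return end - start + 1


def orf_length_aa(start, end, has_stop=True):
    """Number of amino acids, not counting the stop codon when there is one."""
    codons = orf_length_nt(start, end) // 3
    return codons - 1 if has_stop else codons


# ORF number 2 in frame 1 on the direct strand: bases 256 to 954.
print(orf_length_nt(256, 954), orf_length_aa(256, 954))
```

The 232 amino acids obtained this way agree with the sequence length of 232 listed in the InterPro raw results.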

RAW RESULTS

a)forward strand

>ORF number 1 in reading frame 1 on the direct strand extends from base 1 to base 255.
GGCGATCTTGGTGTCCTGCGGGATGTGCCGAGAGATGGTGGTGTTCAGCACGTTTCGGGC
CACAAAGCGTTCTGCGGTGAGCTTGACCTGGTCGGTGATGACGCCCAGCACCCGTTCTGG
CGATTGACTCGAGAAGTCCACCGTGGCCAAGAGCGGGAGGAATCGATGAAACGGCTTGAA
ACGCCTCAGTGTAAGATACTAACTTCAATCGAGCAGACAGAGGTTCGTTTCATTTCTTGG
ACCCAAGAAGGCTAA

>Translation of ORF number 1 in reading frame 1 on the direct strand.
GDLGVLRDVPRDGGVQHVSGHKAFCGELDLVGDDAQHPFWRLTREVHRGQEREESMKRLE
TPQCKILTSIEQTEVRFISWTQEG*

>ORF number 2 in reading frame 1 on the direct strand extends from base 256 to base 954.
TATATCATCCAATCACGATTTTTCTTTGTGAGCCAAAAGGGACTACTGCTCCTCAGTGCA
GTCCTTGGCGGGGCTGCACAAACGGTGGCAGAATCCTTTGGTATCTCGTCTGTGACGACG
ATTTTGAGTTGCCTGTTCCTCCTCCTATTCATTCAGGAGGCATGGATTATTCAGTGGAAG
ATGGGAGGTGAGAACGTCGATTTCACCCGCCGGTTTAATCGGAAATATCCTCCGGGCGAG
TTTTCTCTCGTGAGGTTTGGGCCCCACCTCGTTGGAACGCTTGGCGTTGTGTTGTTCATC
GCATTTCTCGGGTTGCCAACAAATGCAGTCTCGATTCATGCCGTCTTTTGCAGTTTCTTC
ATTCTGAAATGGCTTTCAGATCCGCTTCTCGGGTGCCTCGCAGAGTACCACAACCGGTTG
TTCCAACCCATTTTCGCGTACGTTGTGATTTTCATCTATGCAACGACGTCGTCCGTTGAG
TCCCAAATCCTCGAGTTGACACCGATTCCATGGACCCAATCCGTGAGCCTTGTCCTGTTC
TTCTTTGCAGTCCTACAAATGCGAATGGCGTACTATCAAAAATTCTGTTTCCGATCTGAG
ATCTCGGTTGATGGACAACTGAACCTCGTCCTCATGTCACTTCTCTTTATGTCGCTTCCA
GGCTTGGCCTACGCCATGGATGTCTTCATGGCCGACTGA

>Translation of ORF number 2 in reading frame 1 on the direct strand.
YIIQSRFFFVSQKGLLLLSAVLGGAAQTVAESFGISSVTTILSCLFLLLFIQEAWIIQWK
MGGENVDFTRRFNRKYPPGEFSLVRFGPHLVGTLGVVLFIAFLGLPTNAVSIHAVFCSFF
ILKWLSDPLLGCLAEYHNRLFQPIFAYVVIFIYATTSSVESQILELTPIPWTQSVSLVLF
FFAVLQMRMAYYQKFCFRSEISVDGQLNLVLMSLLFMSLPGLAYAMDVFMAD*

No ORFs were found in reading frame 2.

>ORF number 1 in reading frame 3 on the direct strand extends from base 195 to base 446.
GATACTAACTTCAATCGAGCAGACAGAGGTTCGTTTCATTTCTTGGACCCAAGAAGGCTA
ATATATCATCCAATCACGATTTTTCTTTGTGAGCCAAAAGGGACTACTGCTCCTCAGTGC
AGTCCTTGGCGGGGCTGCACAAACGGTGGCAGAATCCTTTGGTATCTCGTCTGTGACGAC
GATTTTGAGTTGCCTGTTCCTCCTCCTATTCATTCAGGAGGCATGGATTATTCAGTGGAA
GATGGGAGGTGA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
DTNFNRADRGSFHFLDPRRLIYHPITIFLCEPKGTTAPQCSPWRGCTNGGRILWYLVCDD
DFELPVPPPIHSGGMDYSVEDGR*

>ORF number 2 in reading frame 3 on the direct strand extends from base 474 to base 734.
TCGGAAATATCCTCCGGGCGAGTTTTCTCTCGTGAGGTTTGGGCCCCACCTCGTTGGAAC
GCTTGGCGTTGTGTTGTTCATCGCATTTCTCGGGTTGCCAACAAATGCAGTCTCGATTCA
TGCCGTCTTTTGCAGTTTCTTCATTCTGAAATGGCTTTCAGATCCGCTTCTCGGGTGCCT
CGCAGAGTACCACAACCGGTTGTTCCAACCCATTTTCGCGTACGTTGTGATTTTCATCTA
TGCAACGACGTCGTCCGTTGA

>Translation of ORF number 2 in reading frame 3 on the direct strand.
SEISSGRVFSREVWAPPRWNAWRCVVHRISRVANKCSLDSCRLLQFLHSEMAFRSASRVP
RRVPQPVVPTHFRVRCDFHLCNDVVR*

---------------------------------------------------------------------------------------------------
b) reverse strand

>ORF number 1 in reading frame 1 on the reverse strand extends from base 382 to base 612.
AAGCCATTTCAGAATGAAGAAACTGCAAAAGACGGCATGAATCGAGACTGCATTTGTTGG
CAACCCGAGAAATGCGATGAACAACACAACGCCAAGCGTTCCAACGAGGTGGGGCCCAAA
CCTCACGAGAGAAAACTCGCCCGGAGGATATTTCCGATTAAACCGGCGGGTGAAATCGAC
GTTCTCACCTCCCATCTTCCACTGAATAATCCATGCCTCCTGAATGAATAG

>Translation of ORF number 1 in reading frame 1 on the reverse strand.
KPFQNEETAKDGMNRDCICWQPEKCDEQHNAKRSNEVGPKPHERKLARRIFPIKPAGEID
VLTSHLPLNNPCLLNE*

>ORF number 2 in reading frame 1 on the reverse strand extends from base 814 to base 1011.
TATCTTACACTGAGGCGTTTCAAGCCGTTTCATCGATTCCTCCCGCTCTTGGCCACGGTG
GACTTCTCGAGTCAATCGCCAGAACGGGTGCTGGGCGTCATCACCGACCAGGTCAAGCTC
ACCGCAGAACGCTTTGTGGCCCGAAACGTGCTGAACACCACCATCTCTCGGCACATCCCG
CAGGACACCAAGATCGCC

>Translation of ORF number 2 in reading frame 1 on the reverse strand.
YLTLRRFKPFHRFLPLLATVDFSSQSPERVLGVITDQVKLTAERFVARNVLNTTISRHIP
QDTKIA

No ORFs were found in reading frame 2.

>ORF number 1 in reading frame 3 on the reverse strand extends from base 126 to base 305.
GGACGAGGTTCAGTTGTCCATCAACCGAGATCTCAGATCGGAAACAGAATTTTTGATAGT
ACGCCATTCGCATTTGTAGGACTGCAAAGAAGAACAGGACAAGGCTCACGGATTGGGTCC
ATGGAATCGGTGTCAACTCGAGGATTTGGGACTCAACGGACGACGTCGTTGCATAGATGA


>Translation of ORF number 1 in reading frame 3 on the reverse strand.
GRGSVVHQPRSQIGNRIFDSTPFAFVGLQRRTGQGSRIGSMESVSTRGFGTQRTTSLHR*


>ORF number 2 in reading frame 3 on the reverse strand extends from base 783 to base 1010.
AACGAACCTCTGTCTGCTCGATTGAAGTTAGTATCTTACACTGAGGCGTTTCAAGCCGTT
TCATCGATTCCTCCCGCTCTTGGCCACGGTGGACTTCTCGAGTCAATCGCCAGAACGGGT
GCTGGGCGTCATCACCGACCAGGTCAAGCTCACCGCAGAACGCTTTGTGGCCCGAAACGT
GCTGAACACCACCATCTCTCGGCACATCCCGCAGGACACCAAGATCGC

>Translation of ORF number 2 in reading frame 3 on the reverse strand.
NEPLSARLKLVSYTEAFQAVSSIPPALGHGGLLESIARTGAGRHHRPGQAHRRTLCGPKR
AEHHHLSAHPAGHQDR

Multiple Alignment

PROTOCOL

NOT DONE


RESULTS ANALYSIS


This section depends on the taxonomic groups found in the BLAST lineage report. As the E-values are not significant, we cannot rely on them to build multiple alignments.


RAW RESULTS

Protein Domains

PROTOCOL

1) InterPro (EBI), default parameters


RESULTS ANALYSIS


InterPro reported no E-values (NA) for the features found (one SignalP signal peptide and seven TMHMM transmembrane regions), so we cannot conclude anything about the biological significance of the domains found.

RAW RESULTS

1)InterPro

GOS_1464010	A1CD1619A9478DB9	232	SignalPHMM	SignalP	signal-peptide	1	26	NA	?	23-Apr-2010	NULL	NULL
GOS_1464010	A1CD1619A9478DB9	232	TMHMM	tmhmm	transmembrane_regions	15	33	NA	?	23-Apr-2010	NULL	NULL
GOS_1464010	A1CD1619A9478DB9	232	TMHMM	tmhmm	transmembrane_regions	39	59	NA	?	23-Apr-2010	NULL	NULL
GOS_1464010	A1CD1619A9478DB9	232	TMHMM	tmhmm	transmembrane_regions	82	102	NA	?	23-Apr-2010	NULL	NULL
GOS_1464010	A1CD1619A9478DB9	232	TMHMM	tmhmm	transmembrane_regions	108	126	NA	?	23-Apr-2010	NULL	NULL
GOS_1464010	A1CD1619A9478DB9	232	TMHMM	tmhmm	transmembrane_regions	141	159	NA	?	23-Apr-2010	NULL	NULL
GOS_1464010	A1CD1619A9478DB9	232	TMHMM	tmhmm	transmembrane_regions	169	187	NA	?	23-Apr-2010	NULL	NULL
GOS_1464010	A1CD1619A9478DB9	232	TMHMM	tmhmm	transmembrane_regions	208	228	NA	?	23-Apr-2010	NULL	NULL
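
The tab-separated rows above can be read programmatically, for example to list the predicted transmembrane regions. This is a minimal sketch: the column names in `FIELDS` are labels chosen for this example (the raw output has no header row), `parse_features` is a hypothetical helper, and only the first three rows are reproduced.

```python
import csv
import io

# Three rows copied from the raw output above (tab-separated, no header row).
RAW = (
    "GOS_1464010\tA1CD1619A9478DB9\t232\tSignalPHMM\tSignalP\tsignal-peptide"
    "\t1\t26\tNA\t?\t23-Apr-2010\tNULL\tNULL\n"
    "GOS_1464010\tA1CD1619A9478DB9\t232\tTMHMM\ttmhmm\ttransmembrane_regions"
    "\t15\t33\tNA\t?\t23-Apr-2010\tNULL\tNULL\n"
    "GOS_1464010\tA1CD1619A9478DB9\t232\tTMHMM\ttmhmm\ttransmembrane_regions"
    "\t39\t59\tNA\t?\t23-Apr-2010\tNULL\tNULL\n"
)

# Column labels are assumptions made for this sketch, not an official header.
FIELDS = ["query", "checksum", "length", "method", "accession", "description",
          "start", "end", "evalue", "status", "date", "ipr_id", "ipr_desc"]


def parse_features(text):
    """Parse tab-separated rows into dicts with integer coordinates."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=FIELDS, delimiter="\t")
    features = []
    for row in reader:
        features.append({
            "method": row["method"],
            "start": int(row["start"]),
            "end": int(row["end"]),
            # 'NA' means the method reported no E-value for this feature.
            "evalue": None if row["evalue"] == "NA" else float(row["evalue"]),
        })
    return features


features = parse_features(RAW)
tm = [f for f in features if f["method"] == "TMHMM"]
print(len(tm), "transmembrane regions:", [(f["start"], f["end"]) for f in tm])
```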

Phylogeny

PROTOCOL

NOT DONE


RESULTS ANALYSIS


Without significant E-values it is not feasible to build a tree, so we cannot determine which taxonomic group our sequence belongs to.

RAW RESULTS

Taxonomy report

PROTOCOL

1) BLASTp vs NR, default NCBI parameters, 'Max target sequences' = 1000

2) BLASTx vs NR, default NCBI parameters, 'Max target sequences' = 1000




RESULTS ANALYSIS


These results were not significant, because the E-values were too high and the two BLAST searches gave very different results.

RAW RESULTS
1)BLASTp

Lineage Report

cellular organisms
. Bacteria           [bacteria]
. . Actinomycetales    [high GC Gram+]
. . . Micromonospora sp. ATCC 39149 -   37  2 hits [high GC Gram+]       conserved hypothetical protein [Micromonospora sp. ATCC 391
. . . Streptomyces venezuelae .......   37  1 hit  [high GC Gram+]       hypothetical protein [Streptomyces venezuelae]
. . Opitutaceae bacterium TAV2 ------   35  2 hits [verrucomicrobia]     ABC-type phosphate transport system periplasmic component-l
. . Pseudomonas entomophila L48 .....   35  2 hits [g-proteobacteria]    cysteine sulfinate desulfinase [Pseudomonas entomophila L48
. Equus caballus (equine) -----------   36  1 hit  [odd-toed ungulates]  PREDICTED: similar to Heat shock 70 kDa protein 12B [Equus 
. Mus musculus (mouse) ..............   35  7 hits [rodents]             heat shock protein 12B [Mus musculus] >gi|27734248|sp|Q9CZJ
. Homo sapiens (man) ................   34 14 hits [primates]            unnamed protein product [Homo sapiens]
. Entamoeba dispar SAW760 ...........   34  2 hits [eukaryotes]          hypothetical protein [Entamoeba dispar SAW760] >gi|16590042
. Trichomonas vaginalis G3 ..........   34  2 hits [trichomonads]        hypothetical protein [Trichomonas vaginalis G3] >gi|1218964

--------------------------------------------------------------------------------

2)BLASTx

Lineage Report

cellular organisms
. Eukaryota                               [eukaryotes]
. . Leishmania                              [kinetoplastids]
. . . Leishmania braziliensis species complex [kinetoplastids]
. . . . Leishmania braziliensis                 [kinetoplastids]
. . . . . Leishmania braziliensis MHOM/BR/75/M2904 -   37 1 hit  [kinetoplastids]    ABC transporter [Leishmania braziliensis MHOM/BR/75/M2904] 
. . . . . Leishmania braziliensis ..................   37 1 hit  [kinetoplastids]    ABC transporter [Leishmania braziliensis MHOM/BR/75/M2904] 
. . . Leishmania major strain Friedlin -------------   36 2 hits [kinetoplastids]    hypothetical protein [Leishmania major strain Friedlin] >gi
. . Schistosoma mansoni ----------------------------   35 2 hits [flatworms]         phospholipid-transporting atpase [Schistosoma mansoni] >gi|
. Arthrospira platensis str. Paraca ----------------   37 1 hit  [cyanobacteria]     permease protein of iron(III) ABC transporter [Arthrospira 
. Vibrio cholerae RC385 ............................   37 2 hits [g-proteobacteria]  transporter, major facilitator family [Vibrio cholerae RC38
. Saccharopolyspora erythraea NRRL 2338 ............   36 1 hit  [high GC Gram+]     carbon monoxide dehydrogenase subunit G [Saccharopolyspora 
. Agrobacterium radiobacter K84 ....................   35 2 hits [a-proteobacteria]  glycine betaine/L-proline ABC transporter [Agrobacterium ra

BLAST

PROTOCOL

1) BLASTp vs NR, default NCBI parameters, 'Max target sequences' = 1000

2) BLASTx vs NR, default NCBI parameters, 'Max target sequences' = 1000



RESULTS ANALYSIS


No significant hits were found in either BLAST search. The E-values were too high, and without good E-values we can build neither a phylogenetic tree nor a multiple alignment, so we cannot determine which taxonomic group the sequence under study belongs to.


RAW RESULTS
1)BLASTp

                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

ref|ZP_04609051.1|  conserved hypothetical protein [Micromonos...  37.4    1.2  
dbj|BAD21143.1|  hypothetical protein [Streptomyces venezuelae]    37.4    1.4  
ref|XP_001496123.1|  PREDICTED: similar to Heat shock 70 kDa p...  36.6    2.0  
ref|NP_082582.1|  heat shock protein 12B [Mus musculus] >sp|Q9...  35.4    4.5  
ref|ZP_03727888.1|  ABC-type phosphate transport system peripl...  35.4    5.1  
ref|YP_609701.1|  cysteine sulfinate desulfinase [Pseudomonas ...  35.0    6.9  
dbj|BAG59626.1|  unnamed protein product [Homo sapiens]            34.7    7.6  
dbj|BAG62852.1|  unnamed protein product [Homo sapiens]            34.7    7.6  
emb|CAI18841.2|  heat shock 70kD protein 12B [Homo sapiens]        34.7    8.0  
gb|AAH90857.1|  HSPA12B protein [Homo sapiens]                     34.7    8.3  
gb|AAI43933.1|  HSPA12B protein [Homo sapiens]                     34.7    8.3  
ref|XP_001737006.1|  hypothetical protein [Entamoeba dispar SA...  34.7    8.4  
ref|XP_001330348.1|  hypothetical protein [Trichomonas vaginal...  34.7    8.6  
dbj|BAB71261.1|  unnamed protein product [Homo sapiens]            34.7    8.8  
ref|NP_443202.3|  heat shock 70 kDa protein 12B [Homo sapiens]...  34.7    8.9  

ALIGNMENTS
>ref|ZP_04609051.1| conserved hypothetical protein [Micromonospora sp. ATCC 39149]
 gb|EEP74981.1| conserved hypothetical protein [Micromonospora sp. ATCC 39149]
Length=385

 Score = 37.4 bits (85),  Expect = 1.2, Method: Compositional matrix adjust.
 Identities = 37/122 (30%), Positives = 53/122 (43%), Gaps = 24/122 (19%)

Query  64   ENVDFTRRFNRKYPPGEFS-LVRFGPHLVGTL--GVVLFIAFLGLPTNAVSIHAVFCSFF  120
            + V+ +R  N +  PG  + +VRFG   V  L   VV  +A  G P              
Sbjct  244  DAVELSRSLNPEDVPGRLTFIVRFGAKEVDRLLPPVVEAVARCGAP--------------  289

Query  121  ILKWLSDPLLGC---LAEYHNRLFQPIFAYVVIF---IYATTSSVESQILELTPIPWTQS  174
             + WL DP+ G    L+ +  RL +P+ A +  F   + A         LELTP P T+ 
Sbjct  290  -VVWLCDPMHGNGLRLSGFKTRLIEPMRAEITSFARILQANRRWPAGLHLELTPDPVTEC  348

Query  175  VS  176
            VS
Sbjct  349  VS  350


>dbj|BAD21143.1| hypothetical protein [Streptomyces venezuelae]
Length=392

 Score = 37.4 bits (85),  Expect = 1.4, Method: Compositional matrix adjust.
 Identities = 37/122 (30%), Positives = 51/122 (41%), Gaps = 24/122 (19%)

Query  64   ENVDFTRRFNRKYPPGEFS-LVRFGPHLVGTL--GVVLFIAFLGLPTNAVSIHAVFCSFF  120
            + V  +R  N +  PG  + +VRFG   V  L   VV  +A  G P              
Sbjct  251  DAVALSRSLNPEGVPGRLTFIVRFGAKEVDELLPPVVRAVARHGAP--------------  296

Query  121  ILKWLSDPLLGC---LAEYHNRLFQPIFAYVVIFIYATTSSVE---SQILELTPIPWTQS  174
             + WL DP+ G    LA +  RL +P+ A    F+       +      LELTP P T+ 
Sbjct  297  -VVWLCDPMHGNGLKLAGHKTRLIEPMRAETAAFVRTLREHGQWPAGLHLELTPDPVTEC  355

Query  175  VS  176
            VS
Sbjct  356  VS  357


>ref|XP_001496123.1| PREDICTED: similar to Heat shock 70 kDa protein 12B [Equus caballus]
Length=686

 Score = 36.6 bits (83),  Expect = 2.0, Method: Compositional matrix adjust.
 Identities = 30/112 (26%), Positives = 43/112 (38%), Gaps = 12/112 (10%)

Query  67   DFTRRFNRKYPPG--------EFSLVRFGPHLVGTLGVVLFIAFLGL----PTNAVSIHA  114
            DF   F R+ P          E      GPH  G L + L  +F+        + V    
Sbjct  371  DFIATFKRQRPAAWVDLTIAFEARKRTAGPHRAGALNISLPFSFIDFYRKQRGHNVETAL  430

Query  115  VFCSFFILKWLSDPLLGCLAEYHNRLFQPIFAYVVIFIYATTSSVESQILEL  166
               S  ++KW S  +L    E  N LFQP  + ++  I A     E Q ++L
Sbjct  431  RRSSLNVVKWSSQGMLRMSCEAMNELFQPTVSGIIEHIEALLGRPEVQDVKL  482


>ref|NP_082582.1| heat shock protein 12B [Mus musculus]
 sp|Q9CZJ2.1|HS12B_MOUSE RecName: Full=Heat shock 70 kDa protein 12B
 dbj|BAB28308.1| unnamed protein product [Mus musculus]
 gb|AAH11103.1| Heat shock protein 12B [Mus musculus]
 gb|AAO37639.1| HSPA12B protein [Mus musculus]
 emb|CAM21096.1| heat shock protein 12B [Mus musculus]
 gb|EDL28301.1| heat shock protein 12B [Mus musculus]
Length=685

 Score = 35.4 bits (80),  Expect = 4.5, Method: Compositional matrix adjust.
 Identities = 29/112 (25%), Positives = 44/112 (39%), Gaps = 12/112 (10%)

Query  67   DFTRRFNRKYPPG--------EFSLVRFGPHLVGTLGVVLFIAFLGL----PTNAVSIHA  114
            DF  +F R+ P          E      GPH  G L + L  +F+        + V    
Sbjct  371  DFIAKFKRQRPAAWVDLTIAFEARKRTAGPHRAGALNISLPFSFIDFYRKQRGHNVETAL  430

Query  115  VFCSFFILKWLSDPLLGCLAEYHNRLFQPIFAYVVIFIYATTSSVESQILEL  166
               S  ++KW S  +L    E  N LFQP  + ++  I    +  E Q ++L
Sbjct  431  RRSSVNLVKWSSQGMLRMSCEAMNELFQPTVSGIIQHIEMLLAKPEVQGVKL  482


>ref|ZP_03727888.1| ABC-type phosphate transport system periplasmic component-like 
protein [Opitutaceae bacterium TAV2]
 gb|EEG18097.1| ABC-type phosphate transport system periplasmic component-like 
protein [Opitutaceae bacterium TAV2]
Length=526

 Score = 35.4 bits (80),  Expect = 5.1, Method: Compositional matrix adjust.
 Identities = 33/110 (30%), Positives = 52/110 (47%), Gaps = 2/110 (1%)

Query  13   KGLLLLSAVLGGAAQTVAESFGISSVTTILSCLFLLLFIQEAWIIQWKMGGENVDFTRRF  72
            +GLLL++ +  G    +    G+S   +I   L + L  +E   +  +   ++ D  R  
Sbjct  23   QGLLLVAVLGAGVLDCILPDSGMSVSVSIAGALLVWLLGREGKKVVLRAYPKSDDGVRMA  82

Query  73   NRKYPPGEFSLVRFGPHLVGTLGVVLFIAFLGLPTNAVSIHAVFCSFFIL  122
                PPG+F L R+ P L+ +L   LF+A  GL   A    AVF  F I+
Sbjct  83   ECPPPPGDFWL-RYQPVLIASL-WSLFVALAGLLFPAPEEGAVFTIFLIV  130


>ref|YP_609701.1| cysteine sulfinate desulfinase [Pseudomonas entomophila L48]
 emb|CAK16917.1| cysteine sulfinate desulfinase [Pseudomonas entomophila L48]
Length=401

 Score = 35.0 bits (79),  Expect = 6.9, Method: Compositional matrix adjust.
 Identities = 20/71 (28%), Positives = 31/71 (43%), Gaps = 1/71 (1%)

Query  58   QWKMGGENVDFTRRFNRKYPPGEFSLVRFGPHLVGTLGVVLFIAFLG-LPTNAVSIHAVF  116
             W+ GGE V      N  + P         P + G +G+   + +L  L TNAV+ H   
Sbjct  245  HWQFGGEMVQLADYQNASFRPAPLGFEAGTPPIAGVIGLGATLDYLASLDTNAVAAHEAS  304

Query  117  CSFFILKWLSD  127
                +L+ L+D
Sbjct  305  LHQHLLRGLAD  315


>dbj|BAG59626.1| unnamed protein product [Homo sapiens]
Length=520

 Score = 34.7 bits (78),  Expect = 7.6, Method: Compositional matrix adjust.
 Identities = 30/112 (26%), Positives = 43/112 (38%), Gaps = 12/112 (10%)

Query  67   DFTRRFNRKYPPG--------EFSLVRFGPHLVGTLGVVLFIAFLGL----PTNAVSIHA  114
            DF   F R+ P          E      GPH  G L + L  +F+        + V    
Sbjct  205  DFIATFKRQRPAAWVDLTIAFEARKRTAGPHRAGALNISLPFSFIDFYRKQRGHNVETAL  264

Query  115  VFCSFFILKWLSDPLLGCLAEYHNRLFQPIFAYVVIFIYATTSSVESQILEL  166
               S   +KW S  +L    E  N LFQP  + ++  I A  +  E Q ++L
Sbjct  265  RRSSVNFVKWSSQGMLRMSCEAMNELFQPTVSGIIQHIEALLARPEVQGVKL  316


>dbj|BAG62852.1| unnamed protein product [Homo sapiens]
Length=600

 Score = 34.7 bits (78),  Expect = 7.6, Method: Compositional matrix adjust.
 Identities = 30/112 (26%), Positives = 43/112 (38%), Gaps = 12/112 (10%)

Query  67   DFTRRFNRKYPPG--------EFSLVRFGPHLVGTLGVVLFIAFLGL----PTNAVSIHA  114
            DF   F R+ P          E      GPH  G L + L  +F+        + V    
Sbjct  285  DFIATFKRQRPAAWVDLTIAFEARKRTAGPHRAGALNISLPFSFIDFYRKQRGHNVETAL  344

Query  115  VFCSFFILKWLSDPLLGCLAEYHNRLFQPIFAYVVIFIYATTSSVESQILEL  166
               S   +KW S  +L    E  N LFQP  + ++  I A  +  E Q ++L
Sbjct  345  RRSSVNFVKWSSQGMLRMSCEAMNELFQPTVSGIIQHIEALLARPEVQGVKL  396


>emb|CAI18841.2| heat shock 70kD protein 12B [Homo sapiens]
Length=600

 Score = 34.7 bits (78),  Expect = 8.0, Method: Compositional matrix adjust.
 Identities = 30/112 (26%), Positives = 43/112 (38%), Gaps = 12/112 (10%)

Query  67   DFTRRFNRKYPPG--------EFSLVRFGPHLVGTLGVVLFIAFLGL----PTNAVSIHA  114
            DF   F R+ P          E      GPH  G L + L  +F+        + V    
Sbjct  285  DFIATFKRQRPAAWVDLTIAFEARKRTAGPHRAGALNISLPFSFIDFYRKQRGHNVETAL  344

Query  115  VFCSFFILKWLSDPLLGCLAEYHNRLFQPIFAYVVIFIYATTSSVESQILEL  166
               S   +KW S  +L    E  N LFQP  + ++  I A  +  E Q ++L
Sbjct  345  RRSSVNFVKWSSQGMLRMSCEAMNELFQPTVSGIIQHIEALLARPEVQGVKL  396


>gb|AAH90857.1| HSPA12B protein [Homo sapiens]
Length=687

 Score = 34.7 bits (78),  Expect = 8.3, Method: Compositional matrix adjust.
 Identities = 30/112 (26%), Positives = 43/112 (38%), Gaps = 12/112 (10%)

Query  67   DFTRRFNRKYPPG--------EFSLVRFGPHLVGTLGVVLFIAFLGL----PTNAVSIHA  114
            DF   F R+ P          E      GPH  G L + L  +F+        + V    
Sbjct  372  DFIATFKRQRPAAWVDLTIAFEARKRTAGPHRAGALNISLPFSFIDFYRKQRGHNVETAL  431

Query  115  VFCSFFILKWLSDPLLGCLAEYHNRLFQPIFAYVVIFIYATTSSVESQILEL  166
               S   +KW S  +L    E  N LFQP  + ++  I A  +  E Q ++L
Sbjct  432  RRSSVNFVKWSSQGMLRMSCEAMNELFQPTVSGIIQHIEALLARPEVQGVKL  483


>gb|AAI43933.1| HSPA12B protein [Homo sapiens]
Length=685

 Score = 34.7 bits (78),  Expect = 8.3, Method: Compositional matrix adjust.
 Identities = 30/112 (26%), Positives = 43/112 (38%), Gaps = 12/112 (10%)

Query  67   DFTRRFNRKYPPG--------EFSLVRFGPHLVGTLGVVLFIAFLGL----PTNAVSIHA  114
            DF   F R+ P          E      GPH  G L + L  +F+        + V    
Sbjct  370  DFIATFKRQRPAAWVDLTIAFEARKRTAGPHRAGALNISLPFSFIDFYRKQRGHNVETAL  429

Query  115  VFCSFFILKWLSDPLLGCLAEYHNRLFQPIFAYVVIFIYATTSSVESQILEL  166
               S   +KW S  +L    E  N LFQP  + ++  I A  +  E Q ++L
Sbjct  430  RRSSVNFVKWSSQGMLRMSCEAMNELFQPTVSGIIQHIEALLARPEVQGVKL  481


>ref|XP_001737006.1| hypothetical protein [Entamoeba dispar SAW760]
 gb|EDR26745.1| hypothetical protein, conserved [Entamoeba dispar SAW760]
Length=758

 Score = 34.7 bits (78),  Expect = 8.4, Method: Composition-based stats.
 Identities = 22/66 (33%), Positives = 34/66 (51%), Gaps = 2/66 (3%)

Query  93   TLGVVLFIAFLGLPTNAVSIH--AVFCSFFILKWLSDPLLGCLAEYHNRLFQPIFAYVVI  150
            TL  +LF+ FL L  + V +H  + F  +++L +L   LL     Y N +   +F  + I
Sbjct  133  TLSFILFLDFLHLYISMVKLHLFSGFYEYYVLVFLFSFLLMISMHYSNPILIYLFTLLYI  192

Query  151  FIYATT  156
            FIY  T
Sbjct  193  FIYCYT  198


2)BLASTx
                                                                   Score     E
Sequences producing significant alignments:                       (Bits)  Value

ref|XP_001566410.1|  ABC transporter [Leishmania braziliensis ...  37.7    2.1  
ref|ZP_06382673.1|  permease protein of iron(III) ABC transpor...  37.4    2.8  
ref|ZP_04916126.1|  transporter, major facilitator family [Vib...  37.4    2.8  
ref|ZP_06564415.1|  carbon monoxide dehydrogenase subunit G [S...  37.0    3.6  
ref|XP_847711.1|  hypothetical protein [Leishmania major strai...  36.6    4.7  
ref|XP_002578905.1|  phospholipid-transporting atpase [Schisto...  35.8    8.0  
ref|YP_002541084.1|  glycine betaine/L-proline ABC transporter...  35.8    8.0  

ALIGNMENTS
>ref|XP_001566410.1| ABC transporter [Leishmania braziliensis MHOM/BR/75/M2904]
 emb|CAM39919.1| ABC transporter, putative [Leishmania braziliensis]
Length=1867

 Score = 37.7 bits (86),  Expect = 2.1
 Identities = 31/153 (20%), Positives = 65/153 (42%), Gaps = 18/153 (11%)
 Frame = +1

Query  430   WKMGGENVDFTRRFNRKYPPGEFSLVRFGPHLVGTLGVVLFIAFLGLPTNAVS--IHAVF  603
             +++ G +V      N   P GEF       +    +G+ + + F+ +P+N +S  +    
Sbjct  1233  YQLFGSSVPMPTVVNSPMPLGEFEKSLVSANKQVMIGIFIILPFIFIPSNTISFIVEEKE  1292

Query  604   CSFFILKWLSDP------LLGCLAEYHNRLFQPIFAYVVIFIYATTSSVESQILELTPIP  765
                  ++W+S        L   + ++   +   I A+V+  I+  T  +    ++     
Sbjct  1293  SGARHMQWISGASVLAYWLSSFVFDFACYIVTQILAFVIFLIFRRTEYIGKDTID-----  1347

Query  766   WTQSVSLVLF-FFAVLQMRMAYYQKFCFRSEIS  861
                  +LVLF FF +  + ++Y+  F FRS  +
Sbjct  1348  ----AALVLFLFFGLTSIPVSYFLSFFFRSSFT  1376
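
The BLASTx query coordinates above are nucleotide positions on the read, while the BLASTp coordinates are amino acid positions in the translated ORF. A small sketch of the conversion follows, assuming the ORF starts at base 256 on the direct strand (frame +1) as annotated on this page; `nt_to_aa` is a hypothetical helper.

```python
ORF_START = 256  # first base of the annotated ORF on the direct strand


def nt_to_aa(nt_pos, orf_start=ORF_START):
    """Map a 1-based nucleotide position on the read to the 1-based
    amino acid position in the ORF translation."""
    offset = nt_pos - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the ORF")
    return offset // 3 + 1


# First query block of the Leishmania braziliensis BLASTx alignment: 430 to 603.
print(nt_to_aa(430), nt_to_aa(603))
```

Bases 430 to 603 map to amino acids 59 to 116, which lets the BLASTx alignments be compared with the BLASTp alignments that use ORF coordinates.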


>ref|ZP_06382673.1| permease protein of iron(III) ABC transporter [Arthrospira platensis 
str. Paraca]
Length=529

 Score = 37.4 bits (85),  Expect = 2.8
 Identities = 26/85 (30%), Positives = 40/85 (47%), Gaps = 12/85 (14%)
 Frame = +1

Query  505  VRFGPHLVGTLGVVLFIAFLGLPTNAVSIHAVFCSFFILKWLSDPLLGCLAEYHNRLFQP  684
            VRF   L  TL    +I F G+P   VS+  VF     + WL           + +L   
Sbjct  345  VRFPSRLTATLERTSYIGF-GMPGIVVSLSLVFFGANYVPWL-----------YQQLPML  392

Query  685  IFAYVVIFIYATTSSVESQILELTP  759
            IFAY+V+FI     ++   +L+++P
Sbjct  393  IFAYLVLFIPQAVGTLRGSLLQVSP  417


>ref|ZP_04916126.1| transporter, major facilitator family [Vibrio cholerae RC385]
 gb|EDN13478.1| transporter, major facilitator family [Vibrio cholerae RC385]
Length=398

 Score = 37.4 bits (85),  Expect = 2.8
 Identities = 29/101 (28%), Positives = 50/101 (49%), Gaps = 8/101 (7%)
 Frame = +1

Query  700  VIFIYATTSSVESQILELTPIPWTQSVSLVLFFFAVLQMRMAYYQKF------CFRSEIS  861
            ++FI+    S  S IL +    WT  V++++ F A L MR  +  KF       FRS  +
Sbjct  278  LVFIFGMALSATSTILLINGGSWTTMVAVIVIFCANLMMRNFFSIKFQALISPSFRS--T  335

Query  862  VDGQLNLVLMSLLFMSLPGLAYAMDVFMAD*VSA*RQGKHV  984
            +D   + ++  +L +SLP    A+D    + ++A   G +V
Sbjct  336  IDSLFSTMMRIVLIISLPLSGNAIDTIGWEVMTALFVGSYV  376


>ref|ZP_06564415.1| carbon monoxide dehydrogenase subunit G [Saccharopolyspora erythraea 
NRRL 2338]
Length=242

 Score = 37.0 bits (84),  Expect = 3.6
 Identities = 19/49 (38%), Positives = 26/49 (53%), Gaps = 7/49 (14%)
 Frame = -2

Query  158  SRSWPRWTSRVNRQNGCWASS-------PTRSSSPQNALWPETC*TPPS  33
            SR WP  +  V+ + GCWA+        PTRSS+P +A  P     PP+
Sbjct  192  SRRWPPRSQSVSSRAGCWAAGAGGAVCWPTRSSTPPSASVPTATPCPPA  240


>ref|XP_847711.1| hypothetical protein [Leishmania major strain Friedlin]
 gb|AAZ09503.1| hypothetical protein, conserved [Leishmania major strain Friedlin]
Length=3008

 Score = 36.6 bits (83),  Expect = 4.7
 Identities = 21/42 (50%), Positives = 24/42 (57%), Gaps = 0/42 (0%)
 Frame = -3

Query  196  SYTEAFQAVSSIPPALGHGGLLESIARTGAGRHHRPGQAHRR  71
            S TEA  AVS+I  A G GG  +S+ RTG  RH   G   RR
Sbjct  926  SGTEATSAVSAILYAGGKGGGCDSVERTGEERHRHKGTGDRR  967


>ref|XP_002578905.1| phospholipid-transporting atpase [Schistosoma mansoni]
 emb|CAZ35143.1| phospholipid-transporting atpase [Schistosoma mansoni]
Length=1311

 Score = 35.8 bits (81),  Expect = 8.0
 Identities = 21/53 (39%), Positives = 30/53 (56%), Gaps = 3/53 (5%)
 Frame = +1

Query  784   LVLFFF---AVLQMRMAYYQKFCFRSEISVDGQLNLVLMSLLFMSLPGLAYAM  933
             LVL+FF    +  +   +Y  FC  S+ SV  QL L+L +L+  SLP L Y +
Sbjct  957   LVLYFFYKNLIFTLPQMFYGFFCAYSQQSVYPQLYLILFNLIMTSLPILLYGI  1009


>ref|YP_002541084.1| glycine betaine/L-proline ABC transporter [Agrobacterium radiobacter 
K84]
 gb|ACM29487.1| glycine betaine/L-proline ABC transporter [Agrobacterium radiobacter 
K84]
Length=283

 Score = 35.8 bits (81),  Expect = 8.0
 Identities = 27/112 (24%), Positives = 53/112 (47%), Gaps = 6/112 (5%)
 Frame = +1

Query  616  ILKWLSDPLLGCLAEYHNRLFQPIFAYVVIFIYATTSSVESQILELTPIPWTQSVSLVLF  795
            IL +    LL    E+ N  F P+FA +   I A  +++E+ +L L   P+     +V+F
Sbjct  5    ILDYSPGVLLAPAVEWLNTNFHPLFATISAVIEAVLNAIEAVLLSLP--PYAVIALVVIF  62

Query  796  FFAVLQMRMAYYQK----FCFRSEISVDGQLNLVLMSLLFMSLPGLAYAMDV  939
             F+V   R+A        FC  +++ V     + L+++  +    +A+ + +
Sbjct  63   AFSVAGWRVAILAACALGFCLTADLWVPSIQTISLVTVAVLISVVIAFPLGI  114