ORF RX17820

From Metagenes
Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : AACY01160096.1
Annotathon code: ORF_RX17820
Sample :
  • GPS: 31°10'30"N; 64°19'27.6"W
  • Sargasso Sea: Sargasso Sea, Station 11 - Bermuda (UK)
  • Open Ocean (-5m, 20.5°C, 0.1-0.8 microns)
Authors
Team : BioCell 2006
Username : Maxsmith
Annotated on : 2008-03-19 18:52:37
  • COMBEL maxime
  • MASQUELIER marion

Synopsis

Genomic Sequence

>AACY01160096.1 ORF_RX17820 genomic DNA
ATTGCATCTTATCCGGGATCTGCGCCAAACTGGACGTTCCTAGGACAAATAGCATAATCATCACCCAGATCCGGAACAAATGATTAATCAATTGAGTCAA
ATTCATCCACTGATTGTCGCACTCTTTTTGAGTGTGTCGGTAGTGAATCTTACATTTGCCGCGCCAGAAGAAGATCGCTGGATTCGTGTGGACAACGGAG
ATGTCGCCTTTTCTACCAACCTAGGTGAATCTGAAGCACTAGAGCTAGAACGCTCAATTCGCCTATTCTCCGCGTTTAGCAAAACTTTTTTGCCAGTTAG
GGAAAATTATTCGATACCACTAGAGTTAATTGTTTTCGCGAAGAAAGCTGATTTTGAGGACACGGTAAAACCTAGAAAATTTGCTTCCTACACCAATTCT
GAACTGGATGGTGTTCTCATCGTCGCTGCTCCCTCTACCAGCAAAGATGTCGATCTTCTAGAAAATCTGAAGCACGAGCTCGCGCACTATCACATGCGTC
ATACTTCGATTAATTATCCACTTTGGTACGAAGAGGGAATGGCAACCCTGTTATCCGAGGCAACACTTACATTTGTAGACGACGCCATCAAAGCCGAATT
CAAAACTCCCAAGCCCACGGCAGGTTTTCCATTAAAACGATCTACAAAAATGGTAAGAAAAGCCTGGTTGGTTGAACATCTTAAACGAAGAAGTCTGCGT
AATCTGAACTTAAGGATCATTCACAACTTCTATAATGATAGTCATCGACTGGCCAACTTCTTCCATTTTAACGAAAGTGATGATTCCAGATTCTCGATGA
AAGCACTGAATCAATATCTATTAAACCAATCAAGTACTCTTTTCTCCTCTCTTAATGTGACGCC

Translation

[80 - 862/864]   direct strand
>ORF_RX17820 Translation [80-862   direct strand]
MINQLSQIHPLIVALFLSVSVVNLTFAAPEEDRWIRVDNGDVAFSTNLGESEALELERSIRLFSAFSKTFLPVRENYSIPLELIVFAKKADFEDTVKPRK
FASYTNSELDGVLIVAAPSTSKDVDLLENLKHELAHYHMRHTSINYPLWYEEGMATLLSEATLTFVDDAIKAEFKTPKPTAGFPLKRSTKMVRKAWLVEH
LKRRSLRNLNLRIIHNFYNDSHRLANFFHFNESDDSRFSMKALNQYLLNQSSTLFSSLNVT

[ Warning ] 3' incomplete: following codon is not a STOP
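
The coordinates in the translation header can be cross-checked with simple arithmetic. The sketch below is an illustration only and not part of the original annotation; the function name `orf_summary` is hypothetical. It takes the 1-based, inclusive positions 80 and 862 and the contig length 864 from the header, and derives the codon count, the reading frame, and the number of bases left after the ORF.

```python
def orf_summary(start, end, contig_length):
    """Return (codons, frame, trailing_bases) for a 1-based inclusive ORF.

    start, end     -- first and last translated base (1-based, inclusive)
    contig_length  -- total length of the contig in bases
    """
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    codons = span // 3
    frame = (start - 1) % 3 + 1          # reading frame numbered 1..3
    trailing_bases = contig_length - end  # bases left after the last codon
    return codons, frame, trailing_bases


# Values taken from the header "[80 - 862/864]" above.
print(orf_summary(80, 862, 864))  # -> (261, 2, 2)
```

Only two bases remain after position 862, so no further complete codon, and therefore no stop codon, can be read in this frame, which is consistent with the 3' incomplete warning.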

Phylogeny


Annotator commentaries

We used ORF Finder (SMS) to determine whether our sequence contains open reading frames. We first ran the search with "any codon" as the start and found:
  • On the direct strand: 1 ORF in frame 2
  • On the reverse strand: 2 ORFs in frame 1 (145-351; 559-762) and 1 ORF in frame 3

To refine the search, we reran ORF Finder with ATG as the start codon and found:
  • On the direct strand: 1 ORF in frame 2
  • On the reverse strand: 1 ORF in frame 1

Since the ORF on the direct strand is the longest, we chose it for the rest of the analysis. We thus obtain a coding ORF of 261 amino acids on the direct strand, running from position 80 to position 862. Because our sequence contains no stop codon for this ORF, we conclude that it continues downstream of our sequence.
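
The ORF search described above can be sketched as a six-frame scan. This is a minimal illustration written for this page, not the SMS ORF Finder code: the names `find_orfs`, `translate` and `min_codons` are hypothetical, the standard genetic code is assumed, and, unlike the ORF Finder output further down, the reported end position excludes the stop codon. Coordinates are counted on the scanned strand, so reverse-strand positions refer to the reverse complement.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the usual TCAG-ordered table.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    # Unknown codons (e.g. containing N) are rendered as X.
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def find_orfs(seq, min_codons=30, require_atg=True):
    """Yield (strand, frame, start, end, protein, has_stop) tuples.

    start/end are 1-based and inclusive on the scanned strand.
    has_stop is False when the ORF runs off the 3' end of the sequence.
    """
    seq = seq.upper()
    for strand, s in (("direct", seq), ("reverse", reverse_complement(seq))):
        for frame in range(3):
            # In "any codon" mode an ORF opens at the frame start or after a stop.
            start = None if require_atg else frame
            for pos in range(frame, len(s) - 2, 3):
                codon = s[pos:pos + 3]
                if codon in STOPS:
                    if start is not None and (pos - start) // 3 >= max(min_codons, 1):
                        yield strand, frame + 1, start + 1, pos, translate(s[start:pos]), True
                    start = None if require_atg else pos + 3
                elif start is None and codon == "ATG":
                    start = pos
            # ORF still open at the end of the sequence: 3' incomplete.
            end = frame + 3 * ((len(s) - frame) // 3)
            if start is not None and (end - start) // 3 >= max(min_codons, 1):
                yield strand, frame + 1, start + 1, end, translate(s[start:end]), False


# Toy sequence (not the contig): one complete ORF and one 3'-incomplete ORF in frame 3.
toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "ATG" + "AAA" * 3
for orf in find_orfs(toy, min_codons=3):
    print(orf)
```

Running the same function on the contig with `require_atg=True` is the kind of call that would be expected to report the direct-strand ORF as 3' incomplete (`has_stop` set to `False`), in line with the conclusion above.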

We then ran a protein domain prediction with InterPro (EBI). We obtained a single result, "noIPR unintegrated", which gives no information about any domain that our sequence might encode.

To confirm this result and to compare our sequence with sequences that are already annotated, we ran BLASTP against SwissProt and against nr. We obtained very few hits (4 on SwissProt and 8 on nr), with very poor E-values (0.51 for the best hit on SwissProt and 0.039 on nr). As these results were uninformative, we then ran BLASTX against nr to see whether another, smaller ORF in our sequence might encode a protein. The results were again very poor: very few hits (10), with E-values no better than 0.010.
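
The judgement that these hits are poor can be made explicit by parsing the BLAST summary lines and filtering on the E-value. The sketch below is illustrative only: the function names are hypothetical, the three hit lines are copied from the BLASTP-against-nr summary further down, and the cutoff of 1e-5 is an assumed, commonly used convention rather than a threshold stated by the annotators.

```python
# Three summary lines copied from the BLASTP against nr report on this page.
HITS_NR = """\
gi|108763613|ref|YP_630548.1|  hypothetical protein MXAN_2327 ...  41.6    0.039
gi|116253222|ref|YP_769060.1|  putative transmembrane protein ...  38.1    0.45
gi|86160014|ref|YP_466799.1|  hypothetical protein Adeh_3596 [...  35.8    2.1
"""


def parse_hits(text):
    """Parse BLAST one-line summaries into (identifier, bit_score, evalue) tuples."""
    hits = []
    for line in text.splitlines():
        fields = line.rsplit(None, 2)  # the last two columns are score and E-value
        if len(fields) != 3:
            continue
        description, score, evalue = fields
        try:
            hits.append((description.split()[0], float(score), float(evalue)))
        except ValueError:
            continue  # not a summary line
    return hits


def significant(hits, max_evalue=1e-5):
    """Keep hits at or below the (assumed) E-value cutoff."""
    return [hit for hit in hits if hit[2] <= max_evalue]


hits = parse_hits(HITS_NR)
print(len(hits), "hits parsed,", len(significant(hits)), "below the cutoff")
```

With this cutoff none of the listed hits is retained, which matches the annotators' reading of the results.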

According to our analyses, our sequence is coding, and the results of our BLAST searches point to a hypothetical protein. We can therefore conclude that our sequence probably contains a novel coding sequence that has not yet been annotated and that shows no significant similarity to sequences from known organisms. Under these conditions we cannot infer the function of our protein or the biological process in which it takes part, nor can we build a phylogenetic tree from multiple alignments.


N.B.: The BLAST searches were also run with the other ORFs. Once again, the results were uninformative.

Multiple Alignment


BLAST

----------------------BLAST against SwissProt-----------------------------

BLASTP 2.2.15 [Oct-15-2006] 

Database: Non-redundant SwissProt sequences
           217,875 sequences; 82,042,039 total letters

gi|22095932|sp|O94681|ODO2_SCHPO  Probable dihydrolipoyllysine...  33.9    0.51 
gi|267149|sp|Q00942|TOP2_ASFB7  DNA topoisomerase 2 (DNA topoisom  31.6    3.1  
gi|83308972|sp|Q49YH4|Y1020_STAS1  UPF0354 protein SSP1020         30.4    6.5   
gi|6685546|sp|O88986|KBL_MOUSE  2-amino-3-ketobutyrate coenzym...  30.4    6.8   


>gi|22095932|sp|O94681|ODO2_SCHPO  Probable dihydrolipoyllysine-residue succinyltransferase component 
of 2-oxoglutarate dehydrogenase complex, mitochondrial 
precursor (E2) (Probable dihydrolipoamide succinyltransferase 
component of 2-oxoglutarate dehydrogenase complex)
Length=452

 Score = 33.9 bits (76),  Expect = 0.51, Method: Composition-based stats.
 Identities = 23/75 (30%), Positives = 34/75 (45%), Gaps = 6/75 (8%)

Query  168  DAIKAEFKTPKP--TAGFPLKRS----TKMVRKAWLVEHLKRRSLRNLNLRIIHNFYNDS  221
            DA + EF +PKP      P+K+S    T+  R +    +  R  +  + LRI        
Sbjct  181  DAKEPEFSSPKPKPAKSEPVKQSKPKATETARPSSFSRNEDRVKMNRMRLRIAERLKESQ  240

Query  222  HRLANFFHFNESDDS  236
            +R A+   FNE D S
Sbjct  241  NRAASLTTFNECDMS  255


>gi|267149|sp|Q00942|TOP2_ASFB7  DNA topoisomerase 2 (DNA topoisomerase II)
Length=1192

 Score = 31.6 bits (70),  Expect = 3.1, Method: Composition-based stats.
 Identities = 27/103 (26%), Positives = 41/103 (39%), Gaps = 18/103 (17%)

Query  59   SIRLFSAFSKTFLPVRENYSIPLEL-----------IVFAKKADFEDTVKPRKFASYTN-  106
            S++L S F KT  P  +++ +P              +     A  E    P +   YT  
Sbjct  802  SVQLASEFIKTMFPAEDSWLLPYVFEDGQRAEPEYYVPVLPLAIMEYGANPSEGWKYTTW  861

Query  107  -SELDGVLIVAAPSTSKDVDLLENLKHELAHYHMRHTSINYPL  148
              +L+ +L +      KD     N KHEL HY ++H     PL
Sbjct  862  ARQLEDILALVRAYVDKD-----NPKHELLHYAIKHKITILPL  899


>gi|83308972|sp|Q49YH4|Y1020_STAS1  UPF0354 protein SSP1020
Length=287

 Score = 30.4 bits (67),  Expect = 6.5, Method: Composition-based stats.
 Identities = 26/91 (28%), Positives = 43/91 (47%), Gaps = 7/91 (7%)

Query  164  TFVDDAIKAEFKTPKPTAGFPLKRSTKMVRKAWLVE-HLKRRSLRNL---NLRIIHNFYN  219
            +FV DA  AE           L +S +++ +A L E  L ++ L+ +   N+R + N Y 
Sbjct  105  SFVIDAHTAETNI---YYAVDLGKSYRLIDEAMLEELKLTKQQLKEMALFNVRKLENKYT  161

Query  220  DSHRLANFFHFNESDDSRFSMKALNQYLLNQ  250
                  N F+F  S+D   + + LN   LN+
Sbjct  162  TDEVKGNIFYFVNSNDGYDASRILNTSFLNE  192


>gi|6685546|sp|O88986|KBL_MOUSE  2-amino-3-ketobutyrate coenzyme A ligase, mitochondrial precursor 
(AKB ligase) (Glycine acetyltransferase)
Length=416

 Score = 30.4 bits (67),  Expect = 6.8, Method: Composition-based stats.
 Identities = 19/63 (30%), Positives = 30/63 (47%), Gaps = 2/63 (3%)

Query  128  ENLKHELAHYHMRHTSINYPLWYEEGMATLLSEATLTFVDDAIKAEFKTPKPTAGFPLKR  187
            +NL+ ++AH+H R  +I YP  ++      L EA LT  D  +  E        G  L +
Sbjct  110  KNLEAKIAHFHQREDAILYPSCFDANAG--LFEALLTPEDAVLSDELNHASIIDGIRLCK  167

Query  188  STK  190
            + K
Sbjct  168  AHK  170

------------------------BLAST against nr-----------------------------------


BLASTP 2.2.15 [Oct-15-2006]

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
           4,196,452 sequences; 1,444,328,266 total letters

gi|108763613|ref|YP_630548.1|  hypothetical protein MXAN_2327 ...  41.6    0.039 
gi|116253222|ref|YP_769060.1|  putative transmembrane protein ...  38.1    0.45  
gi|86160014|ref|YP_466799.1|  hypothetical protein Adeh_3596 [...  35.8    2.1   
gi|116624734|ref|YP_826890.1|  hypothetical protein Acid_5658 ...  35.0    3.1   
gi|86160013|ref|YP_466798.1|  hypothetical protein Adeh_3595 [...  35.0    3.1   
gi|92915559|ref|ZP_01284182.1|  conserved hypothetical protein...  35.0    3.2  
gi|51894443|ref|YP_077134.1|  hypothetical protein STH3309 [Sy...  34.7    4.0   
gi|108756961|ref|YP_630691.1|  hypothetical protein MXAN_2471 ...  34.3    5.8   



>gi|108763613|ref|YP_630548.1|  hypothetical protein MXAN_2327 [Myxococcus xanthus DK 1622]
 gi|108467493|gb|ABF92678.1|  hypothetical protein MXAN_2327 [Myxococcus xanthus DK 1622]
Length=524

 Score = 41.6 bits (96),  Expect = 0.039, Method: Composition-based stats.
 Identities = 43/158 (27%), Positives = 68/158 (43%), Gaps = 13/158 (8%)

Query  34   WIRVDNGDVAFSTNLGESEALE-LERSIRLFSAFSKTFLP--VRENYSIPLELIVFAKKA  90
            W+R+D+      T+L   EA E ++R  R  +A   +  P  +R+  +  L++ V     
Sbjct  38   WLRLDSDHYTLHTDLLAEEAREAMQRLERTRAAILTSMWPQSLRQQMT-KLDVYVIQSPR  96

Query  91   DFEDTVKPRKFASYTNSELDGVLIVAA-----PSTSKDVDLLEN--LKHELAHYHMRHTS  143
            +FE     R  A +  S+ + +++++        T   + L  +  L HELAHY   +  
Sbjct  97   EFEGLYPRRVRAFFFRSDSEALIVLSGRPGTWEQTFSGLSLASSSPLNHELAHYLSAYPL  156

Query  144  INYPLWYEEGMATLLSEATLTFVDDAIKAEFKTPKPTA  181
               P W  EGMA  L   TL    D   A    P  TA
Sbjct  157  SRQPRWLSEGMAEYLE--TLRISKDGRTAVVGAPHWTA  192


>gi|116253222|ref|YP_769060.1|  putative transmembrane protein [Rhizobium leguminosarum bv. viciae 
3841]
 gi|115257870|emb|CAK08968.1|  putative transmembrane protein [Rhizobium leguminosarum bv. viciae 
3841]
Length=370

 Score = 38.1 bits (87),  Expect = 0.45, Method: Composition-based stats.
 Identities = 28/82 (34%), Positives = 40/82 (48%), Gaps = 5/82 (6%)

Query  9    HPLIVALFLSVSVVNLTFAA---PEEDRWIRVDNGDVAFSTNLGESEALELERSIRLFSA  65
            HPL++A+   V  + L   A      DR  R    D+AF  +LG + AL    S+RL  +
Sbjct  202  HPLLLAVAFLVCALGLFATALYFDLGDRLRRTTRSDIAFWLHLGAAPALLF--SVRLLMS  259

Query  66   FSKTFLPVRENYSIPLELIVFA  87
            F   FL V +  SI   +IV +
Sbjct  260  FDGNFLDVAQAVSIKTPVIVIS  281


>gi|86160014|ref|YP_466799.1|  hypothetical protein Adeh_3596 [Anaeromyxobacter dehalogenans 
2CP-C]
 gi|85776525|gb|ABC83362.1|  hypothetical protein Adeh_3596 [Anaeromyxobacter dehalogenans 
2CP-C]
Length=498

 Score = 35.8 bits (81),  Expect = 2.1, Method: Composition-based stats.
 Identities = 26/136 (19%), Positives = 53/136 (38%), Gaps = 3/136 (2%)

Query  34   WIRVDNGDVAFSTNLGESEALELERSI-RLFSAFSKTFLPVRENYSIPLELIVFAKKADF  92
            W  +   ++   T+L   +A +L R + R++                P+ ++ F  + +F
Sbjct  36   WRELRTANILLQTDLSSGKAQDLARELDRIYDVVRIALFRRPPPTVAPMRVVAFQSEEEF  95

Query  93   EDTVKPRKFASYTNSELDGVLIVAAPSTSKDVDLLENLKHELAHYHMRHTSINYPLWYEE  152
                 P+   +Y  S      ++  P    D   +  + HE+ H+         P W+ E
Sbjct  96   H-LFAPKDATAYHMSGTRLGAVMLTPGLLADSQRIVAV-HEITHHVTTPLFARQPRWFAE  153

Query  153  GMATLLSEATLTFVDD  168
            G+A  +    +T VD+
Sbjct  154  GLACYMESMAMTGVDN  169


>gi|116624734|ref|YP_826890.1|  hypothetical protein Acid_5658 [Solibacter usitatus Ellin6076]
 gi|116227896|gb|ABJ86605.1|  hypothetical protein Acid_5658 [Solibacter usitatus Ellin6076]
Length=597

 Score = 35.0 bits (79),  Expect = 3.1, Method: Composition-based stats.
 Identities = 32/137 (23%), Positives = 60/137 (43%), Gaps = 10/137 (7%)

Query  26   FAAPEEDRWIRVDNGDVAFSTNLGESEALELERSIRLFSAFSKTFLPVRENYSIPLELIV  85
            F+AP+ D W+++ + +    T  GE    +L +      +F           + P  +I 
Sbjct  25   FSAPQ-DSWLKITSANFELYTTAGERSGRDLIKHFEQVRSFFTQAFGAHLAAARPARIIA  83

Query  86   FAKKADFEDTVKPRKFAS--YTNSEL-DGVLIVAAPSTSKDVDLLENLKHELAHYHMRHT  142
            F  + +++   +P +FAS  Y    + D +++  A S    V +     HE  H  +  +
Sbjct  84   FRNEKEYQ-PYRPGEFASAFYQPGAVHDFIVMSGASSEHYPVAI-----HEYTHLMIHQS  137

Query  143  SINYPLWYEEGMATLLS  159
             ++ P W  EG+A L S
Sbjct  138  GMDLPPWLNEGLAELYS  154


>gi|86160013|ref|YP_466798.1|  hypothetical protein Adeh_3595 [Anaeromyxobacter dehalogenans 
2CP-C]
 gi|85776524|gb|ABC83361.1|  hypothetical protein Adeh_3595 [Anaeromyxobacter dehalogenans 
2CP-C]
Length=529

 Score = 35.0 bits (79),  Expect = 3.1, Method: Composition-based stats.
 Identities = 35/142 (24%), Positives = 56/142 (39%), Gaps = 17/142 (11%)

Query  26   FAAPEED--RWIRVDNGDVAFSTNLGESEALELERSIRLFSAFSKTFLPVREN-YSIP--  80
            F  PE+    W  +    V   T+L   +A EL           +TF+ VR   +  P  
Sbjct  52   FRCPEQGGPDWHELRTEHVVLQTDLPSWKAKELA------GELERTFVVVRTGLFRNPPP  105

Query  81   ----LELIVFAKKADFEDTVKPRKFASYTNSELDGVLIVAAPSTSKDVDLLENLKHELAH  136
                L ++ FA +++FE    P    +Y +       +V  P T  D      + HEL H
Sbjct  106  APGLLRVVAFASESEFE-RFAPMGAGAYYHRPPFFAPVVVMPGTLGDAQRTV-IAHELTH  163

Query  137  YHMRHTSINYPLWYEEGMATLL  158
            +         P W+ EG+A+ +
Sbjct  164  HLTAQLFARQPPWFREGLASFM  185


>gi|92915559|ref|ZP_01284182.1|  conserved hypothetical protein [Mycobacterium sp. KMS]
 gi|108800252|ref|YP_640449.1|  hypothetical protein Mmcs_3286 [Mycobacterium sp. MCS]
 gi|92440295|gb|EAS98139.1|  conserved hypothetical protein [Mycobacterium sp. KMS]
 gi|108770671|gb|ABG09393.1|  conserved hypothetical protein [Mycobacterium sp. MCS]
Length=275

 Score = 35.0 bits (79),  Expect = 3.2, Method: Composition-based stats.
 Identities = 32/107 (29%), Positives = 46/107 (42%), Gaps = 10/107 (9%)

Query  114  IVAAPSTS--KDVDLLENLKHELAHYHMR-HTSINYPLWYEEGMATLLSEATLTFVDDAI  170
            IV AP  +   D DL   L+HEL H+ +R  T+ + P W  EG+A  L+    T   DA 
Sbjct  137  IVFAPGAAAMTDEDLRIVLRHELFHHAVREQTAADAPRWLTEGVADHLARPRTTPAPDAE  196

Query  171  KA-----EFKTPKPTAGFPLKRSTKMVRKAWLVEHLKRRSLRNLNLR  212
             A     +  TP         R+ +     ++ +      LR L LR
Sbjct  197  TALPTDSDLDTPGAVRSQAYDRAWRFA--TYVADRYGPERLRALYLR  241


>gi|51894443|ref|YP_077134.1|  hypothetical protein STH3309 [Symbiobacterium thermophilum IAM 
14863]
 gi|51858132|dbj|BAD42290.1|  hypothetical protein [Symbiobacterium thermophilum IAM 14863]
Length=305

 Score = 34.7 bits (78),  Expect = 4.0, Method: Composition-based stats.
 Identities = 16/41 (39%), Positives = 22/41 (53%), Gaps = 1/41 (2%)

Query  127  LENLKHELAHYHMRH-TSINYPLWYEEGMATLLSEATLTFV  166
            L  + HEL HY +   T  NYP W+ EG+A  + E    +V
Sbjct  155  LSPVAHELTHYLLDELTEGNYPRWFTEGLAQYVEELATGYV  195


>gi|108756961|ref|YP_630691.1|  hypothetical protein MXAN_2471 [Myxococcus xanthus DK 1622]
 gi|108460841|gb|ABF86026.1|  hypothetical protein MXAN_2471 [Myxococcus xanthus DK 1622]
Length=507

 Score = 34.3 bits (77),  Expect = 5.8, Method: Composition-based stats.
 Identities = 25/86 (29%), Positives = 39/86 (45%), Gaps = 4/86 (4%)

Query  81   LELIVFAKKADFEDTVKPRKFASYTNSELDGVLIVAAP---STSKDVDLLENLKHELAHY  137
            +++IV   ++  E+    R     TN+E DG L+V A    + S+    +    HEL HY
Sbjct  76   VDIIVLHNRSALEEFTNIRIEGFSTNTE-DGPLLVLAGHAYALSEATADITTQAHELTHY  134

Query  138  HMRHTSINYPLWYEEGMATLLSEATL  163
                  +  P W  EG+A+ L    L
Sbjct  135  LSELALVRQPRWLSEGLASYLETIAL  160


-----------------------------BLASTX-------------------------------------------


BLASTX 2.2.15 [Oct-15-2006] 

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF excluding environmental samples
           4,196,452 sequences; 1,444,328,266 total letters

gi|108763613|ref|YP_630548.1|  hypothetical protein MXAN_2327 ...  43.9    0.010 
gi|116624734|ref|YP_826890.1|  hypothetical protein Acid_5658 ...  41.2    0.062 
gi|108756961|ref|YP_630691.1|  hypothetical protein MXAN_2471 ...  38.9    0.31  
gi|114769594|ref|ZP_01447204.1|  cobaltochelatase [alpha prote...  36.2    2.0  
gi|86160014|ref|YP_466799.1|  hypothetical protein Adeh_3596 [...  36.2    2.0   
gi|86160013|ref|YP_466798.1|  hypothetical protein Adeh_3595 [...  35.0    4.5   
gi|18033721|gb|AAL57224.1|  gamma-glutamylcysteine synthetase ...  35.0    4.5  
gi|4713921|gb|AAD28293.1|  gamma-glutamylcysteine synthetase [Pla  35.0    4.5  
gi|68070807|ref|XP_677317.1|  gamma-glutamylcysteine synthetas...  35.0    4.5   
gi|68059036|ref|XP_671496.1|  hypothetical protein PB301533.00...  35.0    4.5   


>gi|108763613|ref|YP_630548.1|  hypothetical protein MXAN_2327 [Myxococcus xanthus DK 1622]
 gi|108467493|gb|ABF92678.1|  hypothetical protein MXAN_2327 [Myxococcus xanthus DK 1622]
Length=524

 Score = 43.9 bits (102),  Expect = 0.010
 Identities = 43/158 (27%), Positives = 68/158 (43%), Gaps = 13/158 (8%)
 Frame = +2

Query  179  WIRVDNGDVAFSTNLGESEALE-LERSIRLFSAFSKTFLP--VRENYSIPLELIVFAKKA  349
            W+R+D+      T+L   EA E ++R  R  +A   +  P  +R+  +  L++ V     
Sbjct  38   WLRLDSDHYTLHTDLLAEEAREAMQRLERTRAAILTSMWPQSLRQQMT-KLDVYVIQSPR  96

Query  350  DFEDTVKPRKFASYTNSELDGVLIVAA-----PSTSKDVDLLEN--LKHELAHYHMRHTS  508
            +FE     R  A +  S+ + +++++        T   + L  +  L HELAHY   +  
Sbjct  97   EFEGLYPRRVRAFFFRSDSEALIVLSGRPGTWEQTFSGLSLASSSPLNHELAHYLSAYPL  156

Query  509  INYPLWYEEGMATLLSEATLTFVDDAIKAEFKTPKPTA  622
               P W  EGMA  L   TL    D   A    P  TA
Sbjct  157  SRQPRWLSEGMAEYLE--TLRISKDGRTAVVGAPHWTA  192


>gi|116624734|ref|YP_826890.1|  hypothetical protein Acid_5658 [Solibacter usitatus Ellin6076]
 gi|116227896|gb|ABJ86605.1|  hypothetical protein Acid_5658 [Solibacter usitatus Ellin6076]
Length=597

 Score = 41.2 bits (95),  Expect = 0.062
 Identities = 31/137 (22%), Positives = 59/137 (43%), Gaps = 10/137 (7%)
 Frame = +2

Query  155  FAAPEEDRWIRVDNGDVAFSTNLGESEALELERSIRLFSAFSKTFLPVRENYSIPLELIV  334
            F+AP+ D W+++ + +    T  GE    +L +      +F           + P  +I 
Sbjct  25   FSAPQ-DSWLKITSANFELYTTAGERSGRDLIKHFEQVRSFFTQAFGAHLAAARPARIIA  83

Query  335  FAKKADFEDTVKPRKFAS---YTNSELDGVLIVAAPSTSKDVDLLENLKHELAHYHMRHT  505
            F  + +++   +P +FAS      +  D +++  A S    V +     HE  H  +  +
Sbjct  84   FRNEKEYQP-YRPGEFASAFYQPGAVHDFIVMSGASSEHYPVAI-----HEYTHLMIHQS  137

Query  506  SINYPLWYEEGMATLLS  556
             ++ P W  EG+A L S
Sbjct  138  GMDLPPWLNEGLAELYS  154


>gi|108756961|ref|YP_630691.1|  hypothetical protein MXAN_2471 [Myxococcus xanthus DK 1622]
 gi|108460841|gb|ABF86026.1|  hypothetical protein MXAN_2471 [Myxococcus xanthus DK 1622]
Length=507

 Score = 38.9 bits (89),  Expect = 0.31
 Identities = 36/138 (26%), Positives = 56/138 (40%), Gaps = 14/138 (10%)
 Frame = +2

Query  179  WIRVDNGDVAFSTNLGESEALELERSIRLF-----SAFSKTFLPVRENYSIPLELIVFAK  343
            W+ V +      TNL    A E  + + L       A+  +F P        +++IV   
Sbjct  29   WVEVRSPHFTVRTNLDTETAEEAAQELELLREGLLQAWGGSFDPPGT-----VDIIVLHN  83

Query  344  KADFEDTVKPRKFASYTNSELDGVLIVAAP---STSKDVDLLENLKHELAHYHMRHTSIN  514
            ++  E+    R     TN+E DG L+V A    + S+    +    HEL HY      + 
Sbjct  84   RSALEEFTNIRIEGFSTNTE-DGPLLVLAGHAYALSEATADITTQAHELTHYLSELALVR  142

Query  515  YPLWYEEGMATLLSEATL  568
             P W  EG+A+ L    L
Sbjct  143  QPRWLSEGLASYLETIAL  160


>gi|114769594|ref|ZP_01447204.1|  cobaltochelatase [alpha proteobacterium HTCC2255]
 gi|114549299|gb|EAU52181.1|  cobaltochelatase [alpha proteobacterium HTCC2255]
Length=1239

 Score = 36.2 bits (82),  Expect = 2.0
 Identities = 50/197 (25%), Positives = 81/197 (41%), Gaps = 34/197 (17%)
 Frame = +2

Query  71   PEQMINQLSQIHPLIVALFLSVSVVNLTFAAPEEDRWIRVDNG----DVAFSTNLGESEA  238
            PEQ +     I+PLI+       V  + F++  E  W    NG    D+A +  L E + 
Sbjct  272  PEQSVENSGAINPLIMEATQRAPVFQVVFSSSSETVWENELNGLNARDIAMNVALPEVDG  331

Query  239  LELERSIRL-FSAF--SKTFLPVRENYSIPLELIVFAKKADFEDTVKPRKFASYTNSELD  409
              L R+I     AF   KT  P+  +Y    + I +       D  K      +TN++  
Sbjct  332  RVLTRAISFKGEAFFDEKTQCPI-GSYRARGDRIEYV-----ADLTKNWVNLRHTNAKTK  385

Query  410  GVLIVAAPSTSKD---------------VDLLENLKHELAHYHMRHTSINYPLWYEEGMA  544
             V ++ A   +KD               VD+++ LKHE   YH +    + P   +E M 
Sbjct  386  KVSLILANYPNKDGRLANGVGLDTPQATVDMMKMLKHE--GYHTK----DLPNSSDELMK  439

Query  545  TLLSEATLTFVDDAIKA  595
             +++  T    D AI++
Sbjct  440  KIMNGPTNWLTDRAIRS  456


>gi|86160014|ref|YP_466799.1|  hypothetical protein Adeh_3596 [Anaeromyxobacter dehalogenans 
2CP-C]
 gi|85776525|gb|ABC83362.1|  hypothetical protein Adeh_3596 [Anaeromyxobacter dehalogenans 
2CP-C]
Length=498

 Score = 36.2 bits (82),  Expect = 2.0
 Identities = 26/136 (19%), Positives = 53/136 (38%), Gaps = 3/136 (2%)
 Frame = +2

Query  179  WIRVDNGDVAFSTNLGESEALELERSI-RLFSAFSKTFLPVRENYSIPLELIVFAKKADF  355
            W  +   ++   T+L   +A +L R + R++                P+ ++ F  + +F
Sbjct  36   WRELRTANILLQTDLSSGKAQDLARELDRIYDVVRIALFRRPPPTVAPMRVVAFQSEEEF  95

Query  356  EDTVKPRKFASYTNSELDGVLIVAAPSTSKDVDLLENLKHELAHYHMRHTSINYPLWYEE  535
                 P+   +Y  S      ++  P    D   +  + HE+ H+         P W+ E
Sbjct  96   H-LFAPKDATAYHMSGTRLGAVMLTPGLLADSQRIVAV-HEITHHVTTPLFARQPRWFAE  153

Query  536  GMATLLSEATLTFVDD  583
            G+A  +    +T VD+
Sbjct  154  GLACYMESMAMTGVDN  169


>gi|86160013|ref|YP_466798.1|  hypothetical protein Adeh_3595 [Anaeromyxobacter dehalogenans 
2CP-C]
 gi|85776524|gb|ABC83361.1|  hypothetical protein Adeh_3595 [Anaeromyxobacter dehalogenans 
2CP-C]
Length=529

 Score = 35.0 bits (79),  Expect = 4.5
 Identities = 35/142 (24%), Positives = 57/142 (40%), Gaps = 17/142 (11%)
 Frame = +2

Query  155  FAAPEEDR--WIRVDNGDVAFSTNLGESEALELERSIRLFSAFSKTFLPVREN-YSIP--  319
            F  PE+    W  +    V   T+L   +A EL   +       +TF+ VR   +  P  
Sbjct  52   FRCPEQGGPDWHELRTEHVVLQTDLPSWKAKELAGELE------RTFVVVRTGLFRNPPP  105

Query  320  ----LELIVFAKKADFEDTVKPRKFASYTNSELDGVLIVAAPSTSKDVDLLENLKHELAH  487
                L ++ FA +++FE    P    +Y +       +V  P T  D      + HEL H
Sbjct  106  APGLLRVVAFASESEFE-RFAPMGAGAYYHRPPFFAPVVVMPGTLGDAQRTV-IAHELTH  163

Query  488  YHMRHTSINYPLWYEEGMATLL  553
            +         P W+ EG+A+ +
Sbjct  164  HLTAQLFARQPPWFREGLASFM  185


>gi|18033721|gb|AAL57224.1|  gamma-glutamylcysteine synthetase [Plasmodium berghei]
 gi|18033723|gb|AAL57225.1|  gamma-glutamylcysteine synthetase [Plasmodium berghei]
 gi|18033725|gb|AAL57226.1|  gamma-glutamylcysteine synthetase [Plasmodium berghei]
Length=967

 Score = 35.0 bits (79),  Expect = 4.5
 Identities = 37/179 (20%), Positives = 72/179 (40%), Gaps = 18/179 (10%)
 Frame = +2

Query  68   DPEQMINQLSQIHPLIVALFLSVSVVNLTFAAPEEDRWIRVDNGDVAFSTNLGESEALEL  247
            D + + +QL+ I PL +A+      +   F    + RW  + N     S +    + L  
Sbjct  489  DAKYVYDQLAVIAPLFLAITACTPYLG-GFLTETDARWRVISN-----SVDCRTEDELSY  542

Query  248  ERSIRL--FSAFSKTFLPVRENYSIPLELIVFAKKADFEDTVKPR--KFASYTNSEL---  406
                R    S +    LP+++NY    ++ +   K  ++  +K    ++ S   S L   
Sbjct  543  ISKPRYSGISLYISDELPLKKNYYFYNDIDIILNKNVYDKLIKENVDEYLSRHISSLFVR  602

Query  407  DGVLIVAAPSTSKDVDLLENLKHELAHYHMRHTSINYPLWYEEGMATLLSEATLTFVDD  583
            D +++     + KD+  ++N+ HE           N  +W EE M  +       F++D
Sbjct  603  DPIVVFEGSFSEKDITTIQNIMHE-----KNENINNSKMWSEEEMNKIYLSDDFEFLED  656


>gi|4713921|gb|AAD28293.1|  gamma-glutamylcysteine synthetase [Plasmodium berghei]
Length=967

 Score = 35.0 bits (79),  Expect = 4.5
 Identities = 37/179 (20%), Positives = 72/179 (40%), Gaps = 18/179 (10%)
 Frame = +2

Query  68   DPEQMINQLSQIHPLIVALFLSVSVVNLTFAAPEEDRWIRVDNGDVAFSTNLGESEALEL  247
            D + + +QL+ I PL +A+      +   F    + RW  + N     S +    + L  
Sbjct  489  DAKYVYDQLAVIAPLFLAITACTPYLG-GFLTETDARWRVISN-----SVDCRTEDELSY  542

Query  248  ERSIRL--FSAFSKTFLPVRENYSIPLELIVFAKKADFEDTVKPR--KFASYTNSEL---  406
                R    S +    LP+++NY    ++ +   K  ++  +K    ++ S   S L   
Sbjct  543  ISKPRYSGISLYISDELPLKKNYYFYNDIDIILNKNVYDKLIKENVDEYLSRHISSLFVR  602

Query  407  DGVLIVAAPSTSKDVDLLENLKHELAHYHMRHTSINYPLWYEEGMATLLSEATLTFVDD  583
            D +++     + KD+  ++N+ HE           N  +W EE M  +       F++D
Sbjct  603  DPIVVFEGSFSEKDITTIQNIMHE-----KNENINNSKMWSEEEMNKIYLSDDFEFLED  656


>gi|68070807|ref|XP_677317.1|  gamma-glutamylcysteine synthetase [Plasmodium berghei strain 
ANKA]
 gi|56497386|emb|CAH98696.1|  gamma-glutamylcysteine synthetase, putative [Plasmodium berghei]
Length=965

 Score = 35.0 bits (79),  Expect = 4.5
 Identities = 37/179 (20%), Positives = 72/179 (40%), Gaps = 18/179 (10%)
 Frame = +2

Query  68   DPEQMINQLSQIHPLIVALFLSVSVVNLTFAAPEEDRWIRVDNGDVAFSTNLGESEALEL  247
            D + + +QL+ I PL +A+      +   F    + RW  + N     S +    + L  
Sbjct  487  DAKYVYDQLAVIAPLFLAITACTPYLG-GFLTETDARWRVISN-----SVDCRTEDELSY  540

Query  248  ERSIRL--FSAFSKTFLPVRENYSIPLELIVFAKKADFEDTVKPR--KFASYTNSEL---  406
                R    S +    LP+++NY    ++ +   K  ++  +K    ++ S   S L   
Sbjct  541  ISKPRYSGISLYISDELPLKKNYYFYNDIDIILNKNVYDKLIKENVDEYLSRHISSLFVR  600

Query  407  DGVLIVAAPSTSKDVDLLENLKHELAHYHMRHTSINYPLWYEEGMATLLSEATLTFVDD  583
            D +++     + KD+  ++N+ HE           N  +W EE M  +       F++D
Sbjct  601  DPIVVFEGSFSEKDITTIQNIMHE-----KNENINNSKMWSEEEMNKIYLSDDFEFLED  654


>gi|68059036|ref|XP_671496.1|  hypothetical protein PB301533.00.0 [Plasmodium berghei strain 
ANKA]
 gi|56487727|emb|CAI04104.1|  hypothetical protein PB301533.00.0 [Plasmodium berghei]
Length=325

 Score = 35.0 bits (79),  Expect = 4.5
 Identities = 37/179 (20%), Positives = 72/179 (40%), Gaps = 18/179 (10%)
 Frame = +2

Query  68   DPEQMINQLSQIHPLIVALFLSVSVVNLTFAAPEEDRWIRVDNGDVAFSTNLGESEALEL  247
            D + + +QL+ I PL +A+      +   F    + RW  + N     S +    + L  
Sbjct  33   DAKYVYDQLAVIAPLFLAITACTPYLG-GFLTETDARWRVISN-----SVDCRTEDELSY  86

Query  248  ERSIRL--FSAFSKTFLPVRENYSIPLELIVFAKKADFEDTVKPR--KFASYTNSEL---  406
                R    S +    LP+++NY    ++ +   K  ++  +K    ++ S   S L   
Sbjct  87   ISKPRYSGISLYISDELPLKKNYYFYNDIDIILNKNVYDKLIKENVDEYLSRHISSLFVR  146

Query  407  DGVLIVAAPSTSKDVDLLENLKHELAHYHMRHTSINYPLWYEEGMATLLSEATLTFVDD  583
            D +++     + KD+  ++N+ HE           N  +W EE M  +       F++D
Sbjct  147  DPIVVFEGSFSEKDITTIQNIMHE-----KNENINNSKMWSEEEMNKIYLSDDFEFLED  200

ORF finding

---------------------any codons------------------------------------------

direct strand

No ORFs were found in reading frame 1.

>ORF number 1 in reading frame 2 on the direct strand extends from base 53 to base 862.
CATAATCATCACCCAGATCCGGAACAAATGATTAATCAATTGAGTCAAATTCATCCACTG
ATTGTCGCACTCTTTTTGAGTGTGTCGGTAGTGAATCTTACATTTGCCGCGCCAGAAGAA
GATCGCTGGATTCGTGTGGACAACGGAGATGTCGCCTTTTCTACCAACCTAGGTGAATCT
GAAGCACTAGAGCTAGAACGCTCAATTCGCCTATTCTCCGCGTTTAGCAAAACTTTTTTG
CCAGTTAGGGAAAATTATTCGATACCACTAGAGTTAATTGTTTTCGCGAAGAAAGCTGAT
TTTGAGGACACGGTAAAACCTAGAAAATTTGCTTCCTACACCAATTCTGAACTGGATGGT
GTTCTCATCGTCGCTGCTCCCTCTACCAGCAAAGATGTCGATCTTCTAGAAAATCTGAAG
CACGAGCTCGCGCACTATCACATGCGTCATACTTCGATTAATTATCCACTTTGGTACGAA
GAGGGAATGGCAACCCTGTTATCCGAGGCAACACTTACATTTGTAGACGACGCCATCAAA
GCCGAATTCAAAACTCCCAAGCCCACGGCAGGTTTTCCATTAAAACGATCTACAAAAATG
GTAAGAAAAGCCTGGTTGGTTGAACATCTTAAACGAAGAAGTCTGCGTAATCTGAACTTA
AGGATCATTCACAACTTCTATAATGATAGTCATCGACTGGCCAACTTCTTCCATTTTAAC
GAAAGTGATGATTCCAGATTCTCGATGAAAGCACTGAATCAATATCTATTAAACCAATCA
AGTACTCTTTTCTCCTCTCTTAATGTGACG

>Translation of ORF number 1 in reading frame 2 on the direct strand.
HNHHPDPEQMINQLSQIHPLIVALFLSVSVVNLTFAAPEEDRWIRVDNGDVAFSTNLGES
EALELERSIRLFSAFSKTFLPVRENYSIPLELIVFAKKADFEDTVKPRKFASYTNSELDG
VLIVAAPSTSKDVDLLENLKHELAHYHMRHTSINYPLWYEEGMATLLSEATLTFVDDAIK
AEFKTPKPTAGFPLKRSTKMVRKAWLVEHLKRRSLRNLNLRIIHNFYNDSHRLANFFHFN
ESDDSRFSMKALNQYLLNQSSTLFSSLNVT

No ORFs were found in reading frame 3.


reverse strand


>ORF number 1 in reading frame 1 on the reverse strand extends from base 145 to base 351.
ATGATCCTTAAGTTCAGATTACGCAGACTTCTTCGTTTAAGATGTTCAACCAACCAGGCT
TTTCTTACCATTTTTGTAGATCGTTTTAATGGAAAACCTGCCGTGGGCTTGGGAGTTTTG
AATTCGGCTTTGATGGCGTCGTCTACAAATGTAAGTGTTGCCTCGGATAACAGGGTTGCC
ATTCCCTCTTCGTACCAAAGTGGATAA

>Translation of ORF number 1 in reading frame 1 on the reverse strand.
MILKFRLRRLLRLRCSTNQAFLTIFVDRFNGKPAVGLGVLNSALMASSTNVSVASDNRVA
IPSSYQSG*

>ORF number 2 in reading frame 1 on the reverse strand extends from base 559 to base 762.
TTTTCCCTAACTGGCAAAAAAGTTTTGCTAAACGCGGAGAATAGGCGAATTGAGCGTTCT
AGCTCTAGTGCTTCAGATTCACCTAGGTTGGTAGAAAAGGCGACATCTCCGTTGTCCACA
CGAATCCAGCGATCTTCTTCTGGCGCGGCAAATGTAAGATTCACTACCGACACACTCAAA
AAGAGTGCGACAATCAGTGGATGA

>Translation of ORF number 2 in reading frame 1 on the reverse strand.
FSLTGKKVLLNAENRRIERSSSSASDSPRLVEKATSPLSTRIQRSSSGAANVRFTTDTLK
KSATISG*

No ORFs were found in reading frame 2.

>ORF number 1 in reading frame 3 on the reverse strand extends from base 645 to base 863.
GTTGGTAGAAAAGGCGACATCTCCGTTGTCCACACGAATCCAGCGATCTTCTTCTGGCGC
GGCAAATGTAAGATTCACTACCGACACACTCAAAAAGAGTGCGACAATCAGTGGATGAAT
TTGACTCAATTGATTAATCATTTGTTCCGGATCTGGGTGATGATTATGCTATTTGTCCTA
GGAACGTCCAGTTTGGCGCAGATCCCGGATAAGATGCAA

>Translation of ORF number 1 in reading frame 3 on the reverse strand.
VGRKGDISVVHTNPAIFFWRGKCKIHYRHTQKECDNQWMNLTQLINHLFRIWVMIMLFVL
GTSSLAQIPDKMQ


-------------------------------ATG--------------------------------------

direct strand

No ORFs were found in reading frame 1.

>ORF number 1 in reading frame 2 on the direct strand extends from base 80 to base 862.
ATGATTAATCAATTGAGTCAAATTCATCCACTGATTGTCGCACTCTTTTTGAGTGTGTCG
GTAGTGAATCTTACATTTGCCGCGCCAGAAGAAGATCGCTGGATTCGTGTGGACAACGGA
GATGTCGCCTTTTCTACCAACCTAGGTGAATCTGAAGCACTAGAGCTAGAACGCTCAATT
CGCCTATTCTCCGCGTTTAGCAAAACTTTTTTGCCAGTTAGGGAAAATTATTCGATACCA
CTAGAGTTAATTGTTTTCGCGAAGAAAGCTGATTTTGAGGACACGGTAAAACCTAGAAAA
TTTGCTTCCTACACCAATTCTGAACTGGATGGTGTTCTCATCGTCGCTGCTCCCTCTACC
AGCAAAGATGTCGATCTTCTAGAAAATCTGAAGCACGAGCTCGCGCACTATCACATGCGT
CATACTTCGATTAATTATCCACTTTGGTACGAAGAGGGAATGGCAACCCTGTTATCCGAG
GCAACACTTACATTTGTAGACGACGCCATCAAAGCCGAATTCAAAACTCCCAAGCCCACG
GCAGGTTTTCCATTAAAACGATCTACAAAAATGGTAAGAAAAGCCTGGTTGGTTGAACAT
CTTAAACGAAGAAGTCTGCGTAATCTGAACTTAAGGATCATTCACAACTTCTATAATGAT
AGTCATCGACTGGCCAACTTCTTCCATTTTAACGAAAGTGATGATTCCAGATTCTCGATG
AAAGCACTGAATCAATATCTATTAAACCAATCAAGTACTCTTTTCTCCTCTCTTAATGTG
ACG

>Translation of ORF number 1 in reading frame 2 on the direct strand.
MINQLSQIHPLIVALFLSVSVVNLTFAAPEEDRWIRVDNGDVAFSTNLGESEALELERSI
RLFSAFSKTFLPVRENYSIPLELIVFAKKADFEDTVKPRKFASYTNSELDGVLIVAAPST
SKDVDLLENLKHELAHYHMRHTSINYPLWYEEGMATLLSEATLTFVDDAIKAEFKTPKPT
AGFPLKRSTKMVRKAWLVEHLKRRSLRNLNLRIIHNFYNDSHRLANFFHFNESDDSRFSM
KALNQYLLNQSSTLFSSLNVT

No ORFs were found in reading frame 3.

reverse strand

>ORF number 1 in reading frame 1 on the reverse strand extends from base 145 to base 351.
ATGATCCTTAAGTTCAGATTACGCAGACTTCTTCGTTTAAGATGTTCAACCAACCAGGCT
TTTCTTACCATTTTTGTAGATCGTTTTAATGGAAAACCTGCCGTGGGCTTGGGAGTTTTG
AATTCGGCTTTGATGGCGTCGTCTACAAATGTAAGTGTTGCCTCGGATAACAGGGTTGCC
ATTCCCTCTTCGTACCAAAGTGGATAA

>Translation of ORF number 1 in reading frame 1 on the reverse strand.
MILKFRLRRLLRLRCSTNQAFLTIFVDRFNGKPAVGLGVLNSALMASSTNVSVASDNRVA
IPSSYQSG*

No ORFs were found in reading frame 2.

No ORFs were found in reading frame 3.