ORF NQ21170

From Metagenes
Warning: this metagenomic sequence has been carefully annotated by students during bioinformatics assignments. These quality annotations are therefore the result of a teaching exercise that you are most welcome to amend and extend if necessary!


Sequence
CAMERA AccNum : AACY01159963.1
Annotathon code: ORF_NQ21170
Sample :
  • GPS: 31°10'30"N; 64°19'27.6"W
  • Sargasso Sea: Sargasso Sea, Station 11 - Bermuda (UK)
  • Open Ocean (-5m, 20.5°C, 0.1-0.8 microns)
Authors
Team : BioCell 2007
Username : deust
Annotated on : 2008-03-19 18:52:37
  • MAZZELLA JEAN MICHAEL
  • SABER AHMED

Synopsis

  • Taxonomy: Rhizobiales (NCBI info)
    Rank: order - Genetic Code: Bacterial and Plant Plastid - NCBI Identifier: 356
    Kingdom: Bacteria - Phylum: Proteobacteria - Class: Alphaproteobacteria - Order: Rhizobiales
    Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales;

Genomic Sequence

>AACY01159963.1 ORF_NQ21170 genomic DNA
TATTGTGTGGCGAACATGCCAGGTGCTGTCGCGCGGACCTCAACGTTTGCTCTTAACAATGCAACGCTGCCTTTCATGCTGGCCCTTGCGAACAAGGGAT
ATCGTCAGGCGCTGGCCGACGATCCACATTTGTTGGCGGGACTTAACGTGCATCACGGCGCCGTGACTTACGCCGCCGTGGCTTCAGCCTTGGGTGAACC
CTTCCATGATGCTGCCAGCGTGATCGCTGCCTAAGCGACATCTCGATACCGTCTCAGAAATTGACCAGGCCGGTGCAGTAATGCGCCGGGCTTTGCTGTT
GAAGCTGCCTTTATTCTCCCCATCTCTCCCTAGCTTGCGGTTCGCAATCTAGTCGGGTCTCAGAGGAGAGGAAGCAATAATGGCAAAGAATGCACTTTTG
TTGGGCCTTCTTGGTTTTGCCTTTGCTCTGGCAATTATGGGCATTTCTTTGATCTTAGACCGTGAACCGAAACGGGCACCGTTGGTCGGTCCAGTCTCGT
TCTTTGCGGCGCCTGTCGCCGAGGGTGGCCTGCTGTTTTCAGCATTGCTGGAGGGCAGGGGGGTCGTCGGCCGATCGCAGACAGCAACTCTCAGTATTGA
TGAAATGATAACGCCGGTTGTTGCGGGTCGGGTCTTGGCAATGTCGGATGCTGAGCTTGCTTTTGTCGAACTCGACAGTGTGGGCGGCGATGTTCAATCG
TCCATCGACATTGCGCGCCGACTGCGGGCTGCCGGATCGCACACTCATGTTGCGTCTGGGGCGAAGTGTTTCAGTGCCTGCACGGTTATCTACCAGGGTG
GCGTCGAACGAACGGCGGGCGAGGAGGCACTATTCCTTCTTCACTATGCGGTTCAGGTTTCAGACGACCCAAATCATGCGCGTGTTGGAAGTGTCTGGGG
CACGGTCGCGTTGATCGAAGCGATGATTGATCTGGGCACCGACTCATCGGTTTACGA

Translation

[353 - 955/957]   direct strand
>ORF_NQ21170 Translation [353-955   direct strand]
SGLRGEEAIMAKNALLLGLLGFAFALAIMGISLILDREPKRAPLVGPVSFFAAPVAEGGLLFSALLEGRGVVGRSQTATLSIDEMITPVVAGRVLAMSDA
ELAFVELDSVGGDVQSSIDIARRLRAAGSHTHVASGAKCFSACTVIYQGGVERTAGEEALFLLHYAVQVSDDPNHARVGSVWGTVALIEAMIDLGTDSSV
Y

[ Warning ] 5' incomplete: does not start with a Methionine
[ Warning ] 3' incomplete: following codon is not a STOP

Phylogeny

Tree built with the neighbor-joining method

with the Rhizobiales as the study group

and Sagittula stellata as the outgroup
  





      +--------------------------------------------ORF_NQ2117
  +---4 
  !   !    +-------------------------Fulvimarina Pelagi         (Rhizobiales)
  !   +----5 
  !        ! +-----------------Mesorhizobium                    (Rhizobiales)
  !        +-6 
  !          +---------------------------Bradyrhizobium         (Rhizobiales)
  ! 
  !             +Rhodobacter Sphaeroides                        (Alphaproteobacteria)
  !  +----------1 
  3--2          +Rhodobacter Sphaeroides ATCC                   (Alphaproteobacteria)
  !  ! 
  !  +--------------Rhodobacterales Bacterium                   (Alphaproteobacteria)
  ! 
  +-----------------Sagittula Stellata                          (Alphaproteobacteria)




---------------------------------------------------------------------------------------------

Tree built with the parsimony method

with the Rhizobiales as the study group

and Sagittula stellata as the outgroup



  +--------------------Sagittula Stellata                              (Alphaproteobacteria)
  !  
  !           +--------ORF_NQ2117      
  3  +--------7  
  !  !        !  +-----Bradyrhizobium                                  (Rhizobiales)
  !  !        +--6  
  !  !           !  +--Fulvimarina Pelagi                              (Rhizobiales)
  +--4           +--5  
     !              +--Mesorhizobium                                   (Rhizobiales)
     !  
     !           +-----Rhodobacterales Bacterium                       (Alphaproteobacteria)
     +-----------2            
                 !  +--Rhodobacter Sphaeroides ATCC                    (Alphaproteobacteria)
                 +--1  
                    +--Rhodobacter Sphaeroides                         (Alphaproteobacteria)

Annotator commentaries

We began our study of the sequence by searching for ORFs. To do so, we used the SMS program with the "any codon" option, examining all 6 possible reading frames. The protein sequence corresponding to the ORF under study is 201 amino acids long. Its 5' end is incomplete: the ORF does not begin with a methionine (but with a serine). The same is true of the 3' end, which does not terminate with a STOP codon. The nucleotide sequence ends with the nucleotides G and A, which is not enough to form a codon (a codon comprising 3 bases). We consider the ORF to be coding because it contains more than 60 amino acids. Moreover, the sequence is read on the direct strand.
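The checks described above can be sketched in a few lines of Python. This is only an illustration, not the SMS implementation: the function names `translate` and `orf_report` are hypothetical, the codon table is the standard genetic code, and the two DNA fragments are the first 42 bases of the ORF and the last 11 bases of the contig, copied from this page.

```python
# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons only; 1 or 2 trailing bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def orf_report(contig_length, start, end, protein):
    """Summarise an ORF given 1-based inclusive coordinates on the contig."""
    return {
        "length_aa": (end - start + 1) // 3,
        "unused_3prime_bases": contig_length - end,
        "starts_with_met": protein.startswith("M"),
        "ends_with_stop": protein.endswith("*"),
    }


# Fragments copied from this page (hypothetical variable names).
ORF_5PRIME = "TCGGGTCTCAGAGGAGAGGAAGCAATAATGGCAAAGAATGCA"  # bases 353-394
CONTIG_3PRIME = "TCGGTTTACGA"                               # bases 947-957

if __name__ == "__main__":
    print(translate(ORF_5PRIME))
    print(translate(CONTIG_3PRIME))
    print(orf_report(957, 353, 955, translate(ORF_5PRIME)))
```

With the coordinates 353-955 on the 957-base contig, the report gives 201 codons and 2 unused bases at the 3' end, and neither a leading methionine nor a terminal stop.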


In InterPro, 2 features were detected in our protein: a signal peptide, extending from amino acid 1 to amino acid 25, and a transmembrane region, extending from amino acid 15 to amino acid 35. However, these features have no identifier; they are reported as "unintegrated". We then ran a BLASTp of our protein sequence against Swiss-Prot and against NR. The results of the BLASTp against Swiss-Prot were inconclusive (in the lineage report, only 2 similar proteins, from 2 different species, appeared, both with very poor e-values of about 3.6), so we opted for a BLASTp against NR. With the NR database, the e-values of the putative homologs turned out to be fairly low. We therefore selected only 7 homologous sequences (from 10^-6 for the lowest e-value up to 4×10^-4 for the last sequence retained) to build the best possible phylogenetic tree. By studying the multiple alignment (with the CLUSTALW program), we identified a conserved region extending from amino acid 105 to amino acid 164. We can hypothesize that our sequence contains an uncatalogued functional domain, or that this region is not a domain. Still using the multiple alignment, we defined our study group, the Rhizobiales, and the outgroup, Sagittula stellata, both of which belong to the Alphaproteobacteria. The Taxonomy Report indicates that some protein regions of our sequence are widely conserved among the Alphaproteobacteria. We therefore chose a study group within the Alphaproteobacteria: the Rhizobiales (with the lowest possible e-values, of the order of 10^-6). The outgroup, Sagittula stellata, belongs to another subgroup of the Alphaproteobacteria.
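The e-value filtering step can be illustrated with a short Python sketch. The hit list below is copied from the first ten rows of the NR report on this page; the names `HITS` and `select_hits` are hypothetical, and the cutoff of 4e-04 is taken from the commentary. A threshold alone keeps 9 of these rows, so it does not by itself reproduce the 7 sequences finally used; the sketch only shows the filtering principle.

```python
# (accession, organism, bit score, E-value) from the NR BLASTp report above.
HITS = [
    ("YP_675788.1", "Mesorhizobium sp. BNC1", 56.2, 1e-06),
    ("ZP_01744988.1", "Sagittula stellata E-37", 54.3, 4e-06),
    ("ZP_01438849.1", "Fulvimarina pelagi HTCC2506", 54.3, 4e-06),
    ("YP_354585.1", "Rhodobacter sphaeroides 2.4.1", 52.8, 1e-05),
    ("YP_001045647.1", "Rhodobacter sphaeroides ATCC 17029", 50.1, 8e-05),
    ("ZP_01012351.1", "Rhodobacterales bacterium HTCC2654", 49.3, 1e-04),
    ("YP_001208362.1", "Bradyrhizobium sp. ORS278", 48.1, 3e-04),
    ("YP_569330.1", "Rhodopseudomonas palustris BisB5", 47.8, 4e-04),
    ("YP_001237260.1", "Bradyrhizobium sp. BTAi1", 47.8, 4e-04),
    ("YP_001312506.1", "Sinorhizobium medicae WSM419", 45.8, 0.002),
]


def select_hits(hits, max_evalue):
    """Keep hits whose E-value is at most max_evalue, best (lowest) first."""
    kept = [hit for hit in hits if hit[3] <= max_evalue]
    return sorted(kept, key=lambda hit: hit[3])


if __name__ == "__main__":
    for accession, organism, bits, evalue in select_hits(HITS, 4e-04):
        print(f"{accession:16s} {organism:36s} {bits:5.1f} {evalue:.0e}")
```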


We attempted to build the phylogenetic tree with the EDTALN program. We first used the parsimony method. We observe that our protein probably comes from an organism belonging to the Rhizobiales. To confirm this, we used the neighbor-joining method: the rooted tree obtained, with Sagittula stellata as the outgroup, gave essentially the same result, namely that our ORF is closer to the Rhizobiales subgroup than to the other Alphaproteobacteria.
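Neighbor-joining works from a matrix of pairwise distances between the aligned sequences. As a minimal sketch of that input, the Python code below computes an uncorrected p-distance (fraction of differing positions, ignoring gapped columns). The distance model actually used by EDTALN is not stated on this page, so the p-distance is an assumption, and `p_distance` is a hypothetical helper. The three rows are taken from the third block of the CLUSTAL W alignment on this page.

```python
# Rows copied from the third block of the CLUSTAL W alignment on this page.
ALIGNED = {
    "Rhodobacter": "TGQIAAGDAARFSALLEQRG---ERPEVVELDSSGGVVSEALLIGRQIRALGAATEVEAG",
    "Rhodobacter2": "TGQIAAGDAARFSALLEQRG---ERPEVVGLDSSGGVVSEALLIGRQIRALGAATEVEAG",
    "ORF_NQ21170": "DEMITPVVAGRVLAMSDAEL------AFVELDSVGGDVQSSIDIARRLRAAGSHTHVASG",
}


def p_distance(seq_a, seq_b):
    """Fraction of differing residues over columns where neither row has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    different = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return different / compared


if __name__ == "__main__":
    names = list(ALIGNED)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            d = p_distance(ALIGNED[first], ALIGNED[second])
            print(f"{first:14s} {second:14s} {d:.3f}")
```

On this block the two Rhodobacter sequences differ at a single ungapped position, whereas the ORF is far more distant from either of them, which is consistent with the long ORF branch in both trees.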


As mentioned above, our protein sequence contains a signal peptide as well as a transmembrane region. In addition, according to the Taxonomy Report, the homologous protein from Sagittula stellata is a periplasmic protein. From these two observations, we could suppose that our protein interacts with the plasma membrane, in a way that we cannot define (for lack of evidence). The taxonomic group determined from the 2 phylogenetic trees is the Rhizobiales; its NCBI identifier is 356. Finally, we are not able to propose an exact molecular function for the protein, nor a gene symbol.
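As a rough, independent sanity check of the predicted transmembrane region (residues 15 to 35), one can compute its mean hydropathy on the Kyte-Doolittle scale. This is not the method used by InterPro; it is only a sketch, with hypothetical names (`KYTE_DOOLITTLE`, `mean_hydropathy`), applied to the first 40 residues of the translation shown on this page.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# First 40 residues of the ORF_NQ21170 translation, copied from this page.
N_TERMINUS = "SGLRGEEAIMAKNALLLGLLGFAFALAIMGISLILDREPK"


def segment(protein, first, last):
    """Return residues first..last using 1-based inclusive positions."""
    return protein[first - 1:last]


def mean_hydropathy(peptide):
    """Average Kyte-Doolittle hydropathy of a peptide."""
    return sum(KYTE_DOOLITTLE[residue] for residue in peptide) / len(peptide)


if __name__ == "__main__":
    tm = segment(N_TERMINUS, 15, 35)
    flank = segment(N_TERMINUS, 36, 40)
    print(tm, round(mean_hydropathy(tm), 2))
    print(flank, round(mean_hydropathy(flank), 2))
```

The segment 15-35 is strongly hydrophobic on this scale while the residues that follow it are strongly hydrophilic, which is compatible with a membrane-spanning stretch but does not by itself establish the topology of the protein.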

Multiple Alignment

CLUSTAL W (1.82) multiple sequence alignment


Rhodobacter          ---------MSAEAAPRRAIWAMLILQVG--------------IGAALIGMDLARPAPPS
Rhodobacter2         ---------MSAEAAPRRAIWAMLILQVG--------------IGAALIGMDLARPAPPS
Rhodobacterales      MSTETTDATDPAPTGIRRAIAGIVGVQVA--------------IAAALLITDLAGSLPRI
Sagittula            -------MSTRTARPVARTLAAVLIFQVG--------------IGVLLVLGDIRSAPFSL
Mesorhizobium        -MTESDTPSASSNPPPRHSSFAVYFARFNDGALMRGAFVGVLIAAATLVGLDLREMVDSN
Fulvimarina          MIGEESSPGRLERLIQKIPEGAVLRSVFVTLLAVSCGIVYLDWQEFSEAELDNARIERTE
Bradyrhizobium       --MATPLNQRVHAWLAGNPDESVLRWVFR----------SILVVTVVILALDLIDQTTPH
ORF_NQ21170          --------SGLRGEEAIMAKNALLLGLLG--------------FAFALAIMGISLILDRE
                                       .  .:    .                       .        

Rhodobacter          PADLFAPASVPQMRPYRP---DLRPAPG-TDGPQMRP------MPARLEFGGEG-ARVTL
Rhodobacter2         PADLFAPASVPQMRPYRP---DLRPAPG-TDGPQMRP------MPARLEFGGEG-ARVTL
Rhodobacterales      DPLTRQSPSGPSTRPYSP---DRAPSTLPGERPVTGP------MPERLEVSADG-PTLTL
Sagittula            PFSSPQAPRLSEPVRPGD---QRRTFNPSRDRPTVQPSRDPGELPDRLTLTYED-ATWRL
Mesorhizobium        GLLPAEPATLQHTFPVLPPAVDAGPNRKQTNDPRQFVTANQEQLRQPISFTLEAGGTLRA
Fulvimarina          PMPVRRPEPGDQVRPYLPKTIPVGPDRGQPVLPGYDGPVDSEAMTRPMTFFDAGEGIVSG
Bradyrhizobium       AASDTAQPQLDIQRDQPDGPADSPAVLTPLLKRLMPLPKGDPALEQPISLELRGGGRLYA
ORF_NQ21170          PKRAP----------------LVGPVSFFAAPVAEGGLLFSALLEGRGVVGRSQTATLSI
                                             .                  :     .          

Rhodobacter          TGQIAAGDAARFSALLEQRG---ERPEVVELDSSGGVVSEALLIGRQIRALGAATEVEAG
Rhodobacter2         TGQIAAGDAARFSALLEQRG---ERPEVVGLDSSGGVVSEALLIGRQIRALGAATEVEAG
Rhodobacterales      TGQIAPGDGARIGDELRTRAGAGQTVATIRLDSPGGSVSDALEIGELIRGSGIDTEIAAN
Sagittula            EGAIEDGDAQRLMPQITAAD---PKIETLVLQSPGGSVRDAIDLGRHLRASGIATTVLSG
Mesorhizobium        SGAIDPGSAARLRAELDARG---EYVERVSLNSPGGALDDAIDMARMLRERGISTLVENG
Fulvimarina          VGRIEIGTASDLREFLAERDRVGGEVRRLFLHSPGGSVEDALQMARDLRADGISTEVPAD
Bradyrhizobium       TGTITPGSARAFADEVERHG---EYVKTVVLNSPGGSVADALAMGRLIRNRKFATEIEAG
ORF_NQ21170          DEMITPVVAGRVLAMSDAEL------AFVELDSVGGDVQSSIDIARRLRAAGSHTHVASG
                        *    .  .                : *.* ** : .:: :.. :*     * :  .

Rhodobacter          AVCLSACPYLLAGGVERRVAEGGLVGVHQHYFGENSLLP----AFLAVEDVQRGQAEVMR
Rhodobacter2         AVCLSACPYLLAGGVERRVAEGGLVGVHQHYFGENSLLP----AFLAVEDVQRGQAEVMR
Rhodobacterales      AICLSACPYILAGGVERTVASSGRVGVHQHYFGESTILP----AFIAVEDIQRGQAEVMA
Sagittula            EICYSACPYLFAGGTTRTAEPSASIGVHQHYFGESTILP----AFVAVEDIQRGQAEVMT
Mesorhizobium        AICASSCPLMLAGGTTRQVEEQAAVGLHQFYTALDPAIR----PAQALANAQMTTARISR
Fulvimarina          GYCASACPLVFAGGLTRLAGASSWVGLHQVYAAEIPGMPNARDLDRSISDIQQTIARAQQ
Bradyrhizobium       KTCASSCPLVFAGGIERRAGERAMIGVHQIAAVRSAKPAR---TDDDMSLAQNVSARCQR
ORF_NQ21170          AKCFSACTVIYQGGVERTAGEEALFLLHYAVQVSDDPNH-------ARVGSVWGTVALIE
                       * *:*. :  **  * .   . . :*                           .    

Rhodobacter          YLDEMGVDPRLMMHGMETPAREIYMLDAARLAELRLSTEGA-------------------
Rhodobacter2         YLDEMGVDPRLMMHGMETPAREIYMLDAARLAELRLSTEGG-------------------
Rhodobacterales      YLTRMGIGLGIMEHAMRTPPDQIYLLSQEELSEYDMVTAAQ-------------------
Sagittula            YLDDMGVDVRVMSHALATPSNEIYILLPEELERYGFTTPET-------------------
Mesorhizobium        HLQEMGLDPAIWLHALDTPPRALYYLTAEEMKRYKLVTQTKEVAQQ--------------
Fulvimarina          LLADMGVDPAIWINAMETPPADLYVLTEEELITSRYVRPFPDGPEFVGPRRPADFVLEGD
Bradyrhizobium       HLADMGIDLKVWVHAMETPHDQLFTFTPDELKSLNLVTSAPESRPEKPRS----------
ORF_NQ21170          AMIDLGTDSSVY------------------------------------------------
                      :  :* .  :                                                 

Rhodobacter          --------------------------------
Rhodobacter2         --------------------------------
Rhodobacterales      --------------------------------
Sagittula            --------------------------------
Mesorhizobium        --------------------------------
Fulvimarina          QIASTEDEETAVPDGNPAHETAVTPPSSTGEG
Bradyrhizobium       --------------------------------
ORF_NQ21170          --------------------------------


BLAST

BLASTp of the 201-aa ORF against SWISSPROT


                                                                  Score     E
Sequences producing significant alignments:                       (Bits)  Value

sp|Q89EW7|MODC2_BRAJA  Molybdenum import ATP-binding protein modC  30.8    3.6  
sp|Q8J0U0|TRXB_PNEJI  Thioredoxin reductase                        30.8    3.6  



>sp|Q89EW7|MODC2_BRAJA  Molybdenum import ATP-binding protein modC 2
Length=897

 Score = 30.8 bits (68),  Expect = 3.6, Method: Composition-based stats.
 Identities = 22/70 (31%), Positives = 38/70 (54%), Gaps = 2/70 (2%)

Query  73   GRSQTATLSIDEMITPVVAGR-VLAMSDAELAFVELDSVGGDVQSSIDIARRLRAAGSHT  131
            G+++  T+ +D     +V G  VL + +     VE++ VG D+Q+  +IA R R AG+  
Sbjct  705  GKTKVGTVLVDRGDAELVEGPIVLELHEGAPVLVEMN-VGVDLQTLYEIAVRGRVAGAER  763

Query  132  HVASGAKCFS  141
              A G + F+
Sbjct  764  RCAIGLEAFA  773


>sp|Q8J0U0|TRXB_PNEJI  Thioredoxin reductase
Length=327

 Score = 30.8 bits (68),  Expect = 3.6, Method: Composition-based stats.
 Identities = 30/95 (31%), Positives = 39/95 (41%), Gaps = 31/95 (32%)

Query  121  ARRLRAAGSHTHVASGAKCFSACTV------IYQG------GVERTAGEEALFLLHYAVQ  168
            ARRL   G  T+   G    SAC V      I++G      G   +A EE+LFL  YA +
Sbjct  121  ARRLHIPGEETYWQRG---ISACAVCDGAAPIFRGKCLSVVGGGDSAAEESLFLTRYATK  177

Query  169  ----VSDDP------------NHARVGSVWGTVAL  187
                V  D             NH ++  +W TV L
Sbjct  178  VYLLVRRDKLRASAVMAKRLLNHPKIEVIWNTVVL  212




--------------------------------------------------------------------------------------------

BLASTp of the 201-aa ORF against NR


Sequences producing significant alignments:                       (Bits)  Value

ref|YP_675788.1|  hypothetical protein Meso_3251 [Mesorhizobiu...  56.2    1e-06
ref|ZP_01744988.1|  periplasmic protein-like [Sagittula stella...  54.3    4e-06
ref|ZP_01438849.1|  hypothetical protein FP2506_15809 [Fulvima...  54.3    4e-06
ref|YP_354585.1|  hypothetical protein RSP_3068 [Rhodobacter s...  52.8    1e-05
ref|YP_001045647.1|  hypothetical protein Rsph17029_3795 [Rhod...  50.1    8e-05
ref|ZP_01012351.1|  hypothetical protein RB2654_12534 [Rhodoba...  49.3    1e-04
ref|YP_001208362.1|  hypothetical protein BRADO6539 [Bradyrhiz...  48.1    3e-04
ref|YP_569330.1|  hypothetical protein RPD_2194 [Rhodopseudomo...  47.8    4e-04
ref|YP_001237260.1|  hypothetical protein BBta_1104 [Bradyrhiz...  47.8    4e-04
ref|YP_001312506.1|  hypothetical protein Smed_3760 [Sinorhizo...  45.8    0.002
ref|ZP_01011741.1|  hypothetical protein RB2654_20733 [Rhodoba...  45.4    0.002
ref|YP_486882.1|  hypothetical protein RPB_3275 [Rhodopseudomo...  45.1    0.003
ref|NP_436886.1|  hypothetical protein SMb20360 [Sinorhizobium...  44.3    0.005
ref|YP_533077.1|  hypothetical protein RPC_3216 [Rhodopseudomo...  43.9    0.006
ref|YP_471602.1|  hypothetical protein RHE_PA00006 [Rhizobium ...  43.9    0.006
ref|ZP_01442472.1|  hypothetical protein R2601_21221 [Roseovar...  43.5    0.007
ref|ZP_00964280.1|  hypothetical protein NAS141_07835 [Sulfito...  43.1    0.009
ref|YP_616535.1|  hypothetical protein Sala_1489 [Sphingopyxis...  43.1    0.010
ref|YP_781165.1|  hypothetical protein RPE_2243 [Rhodopseudomo...  42.7    0.011
ref|YP_496607.1|  hypothetical protein Saro_1329 [Novosphingob...  42.0    0.019
ref|ZP_01226153.1|  conserved hypothetical protein [Aurantimon...  42.0    0.022
ref|YP_767470.1|  hypothetical protein RL1867 [Rhizobium legum...  41.6    0.026
ref|YP_001170058.1|  hypothetical protein Rsph17025_3898 [Rhod...  41.6    0.032
ref|YP_001370395.1|  periplasmic protein-like protein [Ochroba...  41.2    0.033
ref|ZP_00999112.1|  hypothetical protein OB2597_02042 [Oceanic...  40.8    0.052
ref|ZP_01156334.1|  hypothetical protein OG2516_01721 [Oceanic...  40.4    0.066
ref|ZP_01746764.1|  periplasmic protein-like [Sagittula stella...  40.0    0.086
ref|ZP_01740848.1|  hypothetical protein RB2150_14256 [Rhodoba...  39.7    0.11
ref|YP_511353.1|  periplasmic protein-like [Jannaschia sp. CCS...  39.3    0.14
ref|ZP_00958181.1|  hypothetical protein ISM_00100 [Roseovariu...  38.9    0.16
ref|ZP_01746289.1|  periplasmic protein-like [Sagittula stella...  38.9    0.18
ref|ZP_00953603.1|  hypothetical protein EE36_02893 [Sulfitoba...  38.9    0.18
ref|ZP_02008006.1|  periplasmic protein-like protein [Ralstoni...  38.9    0.19
ref|NP_947478.1|  hypothetical protein RPA2133 [Rhodopseudomon...  38.9    0.20
ref|YP_001045074.1|  hypothetical protein Rsph17029_3205 [Rhod...  38.5    0.25
ref|ZP_00970923.1|  COG0740: Protease subunit of ATP-dependent...  37.4    0.54
gb|AAR38193.1|  hypothetical protein EBAC000-36A07.38 [uncultu...  37.4    0.55
ref|YP_001202137.1|  hypothetical protein pQBR0391 [Pseudomona...  36.6    0.83
ref|ZP_01748019.1|  lipoprotein, putative [Sagittula stellata ...  36.6    0.87
ref|YP_587254.1|  hypothetical protein Rmet_5126 [Ralstonia me...  36.6    0.94
ref|NP_105499.1|  hypothetical protein mlr4687 [Mesorhizobium ...  36.6    0.96
ref|YP_156189.1|  Beta-lactamase class D (N-terminal domain) f...  36.2    1.1
ref|YP_001294799.1|  Clp protease [Microbacterium phage Min1] ...  36.2    1.1
ref|ZP_01552420.1|  periplasmic protein-like [Methylophilales ...  36.2    1.2
ref|YP_769107.1|  hypothetical protein RL3527 [Rhizobium legum...  35.4    1.8
ref|YP_001202162.1|  hypothetical protein pQBR0416 [Pseudomona...  35.0    2.4
ref|YP_299390.1|  putative Uncharacterized protein conserved i...  35.0    2.8
ref|ZP_01058608.1|  hypothetical protein MED193_21646 [Roseoba...  35.0    2.8
ref|ZP_01040517.1|  hydrolase, alpha/beta fold family protein ...  34.7    3.8
ref|YP_459019.1|  hydrolase, alpha/beta fold family protein [E...  34.3    4.3
ref|YP_001187022.1|  periplasmic protein-like protein [Pseudom...  34.3    5.0
ref|YP_166473.1|  lipoprotein, putative [Silicibacter pomeroyi...  33.9    5.4
ref|YP_001346360.1|  hypothetical protein PSPA7_0974 [Pseudomo...  33.9    6.1
ref|ZP_01039376.1|  hypothetical protein NAP1_03705 [Erythroba...  33.1    9.1


>ref|YP_675788.1| hypothetical protein Meso_3251 [Mesorhizobium sp. BNC1]
 gb|ABG64623.1| conserved hypothetical protein [Mesorhizobium sp. BNC1]
Length=278

 Score = 56.2 bits (134),  Expect = 1e-06, Method: Composition-based stats.
 Identities = 52/173 (30%), Positives = 75/173 (43%), Gaps = 14/173 (8%)

Query  35   LDREPKRAPLVGPVSFFAAPVAEGGLLFSALLEGRGVVGRSQTATLSIDEMITPVVAGRV  94
            +D  P R     P  F  A   +     S  LE  G        TL     I P  A R+
Sbjct  80   VDAGPNRKQTNDPRQFVTANQEQLRQPISFTLEAGG--------TLRASGAIDPGSAARL  131

Query  95   LAMSDAELAFVE---LDSVGGDVQSSIDIARRLRAAGSHTHVASGAKCFSACTVIYQGGV  151
             A  DA   +VE   L+S GG +  +ID+AR LR  G  T V +GA C S+C ++  GG 
Sbjct  132  RAELDARGEYVERVSLNSPGGALDDAIDMARMLRERGISTLVENGAICASSCPLMLAGGT  191

Query  152  ERTAGEEALFLLHY---AVQVSDDPNHARVGSVWGTVALIEAMIDLGTDSSVY  201
             R   E+A   LH    A+  +  P  A   +   T  +   + ++G D +++
Sbjct  192  TRQVEEQAAVGLHQFYTALDPAIRPAQALANAQMTTARISRHLQEMGLDPAIW  244


>ref|ZP_01744988.1|  periplasmic protein-like [Sagittula stellata E-37]
 gb|EBA09216.1|  periplasmic protein-like [Sagittula stellata E-37]
Length=249

 Score = 54.3 bits (129),  Expect = 4e-06, Method: Composition-based stats.
 Identities = 28/70 (40%), Positives = 41/70 (58%), Gaps = 0/70 (0%)

Query  95   LAMSDAELAFVELDSVGGDVQSSIDIARRLRAAGSHTHVASGAKCFSACTVIYQGGVERT  154
            +  +D ++  + L S GG V+ +ID+ R LRA+G  T V SG  C+SAC  ++ GG  RT
Sbjct  111  ITAADPKIETLVLQSPGGSVRDAIDLGRHLRASGIATTVLSGEICYSACPYLFAGGTTRT  170

Query  155  AGEEALFLLH  164
            A   A   +H
Sbjct  171  AEPSASIGVH  180


>ref|ZP_01438849.1|  hypothetical protein FP2506_15809 [Fulvimarina pelagi HTCC2506]
 gb|EAU41913.1|  hypothetical protein FP2506_15809 [Fulvimarina pelagi HTCC2506]
Length=332

 Score = 54.3 bits (129),  Expect = 4e-06, Method: Composition-based stats.
 Identities = 35/103 (33%), Positives = 57/103 (55%), Gaps = 9/103 (8%)

Query  107  LDSVGGDVQSSIDIARRLRAAGSHTHVASGAKCFSACTVIYQGGVERTAGEEALFLLH--  164
            L S GG V+ ++ +AR LRA G  T V +   C SAC +++ GG+ R AG  +   LH  
Sbjct  151  LHSPGGSVEDALQMARDLRADGISTEVPADGYCASACPLVFAGGLTRLAGASSWVGLHQV  210

Query  165  YAVQVSDDPNHAR-----VGSVWGTVALIEAMI-DLGTDSSVY  201
            YA ++   PN AR     +  +  T+A  + ++ D+G D +++
Sbjct  211  YAAEIPGMPN-ARDLDRSISDIQQTIARAQQLLADMGVDPAIW  252


>ref|YP_354585.1| hypothetical protein RSP_3068 [Rhodobacter sphaeroides 2.4.1]
 gb|ABA80684.1| conserved hypothetical protein [Rhodobacter sphaeroides 2.4.1]
Length=240

 Score = 52.8 bits (125),  Expect = 1e-05, Method: Composition-based stats.
 Identities = 29/60 (48%), Positives = 37/60 (61%), Gaps = 0/60 (0%)

Query  105  VELDSVGGDVQSSIDIARRLRAAGSHTHVASGAKCFSACTVIYQGGVERTAGEEALFLLH  164
            VELDS GG V  ++ I R++RA G+ T V +GA C SAC  +  GGVER   E  L  +H
Sbjct  112  VELDSSGGVVSEALLIGRQIRALGAATEVEAGAVCLSACPYLLAGGVERRVAEGGLVGVH  171


>ref|YP_001045647.1| hypothetical protein Rsph17029_3795 [Rhodobacter sphaeroides ATCC 17029]
 gb|ABN78875.1| conserved hypothetical protein [Rhodobacter sphaeroides ATCC 17029]
Length=240

 Score = 50.1 bits (118),  Expect = 8e-05, Method: Composition-based stats.
 Identities = 28/60 (46%), Positives = 36/60 (60%), Gaps = 0/60 (0%)

Query  105  VELDSVGGDVQSSIDIARRLRAAGSHTHVASGAKCFSACTVIYQGGVERTAGEEALFLLH  164
            V LDS GG V  ++ I R++RA G+ T V +GA C SAC  +  GGVER   E  L  +H
Sbjct  112  VGLDSSGGVVSEALLIGRQIRALGAATEVEAGAVCLSACPYLLAGGVERRVAEGGLVGVH  171


>ref|ZP_01012351.1|  hypothetical protein RB2654_12534 [Rhodobacterales bacterium 
HTCC2654]
 gb|EAQ13898.1|  hypothetical protein RB2654_12534 [Rhodobacterales bacterium 
HTCC2654]
Length=253

 Score = 49.3 bits (116),  Expect = 1e-04, Method: Composition-based stats.
 Identities = 27/72 (37%), Positives = 38/72 (52%), Gaps = 0/72 (0%)

Query  93   RVLAMSDAELAFVELDSVGGDVQSSIDIARRLRAAGSHTHVASGAKCFSACTVIYQGGVE  152
            R  A +   +A + LDS GG V  +++I   +R +G  T +A+ A C SAC  I  GGVE
Sbjct  113  RTRAGAGQTVATIRLDSPGGSVSDALEIGELIRGSGIDTEIAANAICLSACPYILAGGVE  172

Query  153  RTAGEEALFLLH  164
            RT        +H
Sbjct  173  RTVASSGRVGVH  184


>ref|YP_001208362.1| hypothetical protein BRADO6539 [Bradyrhizobium sp. ORS278]
 emb|CAL80147.1| conserved hypothetical protein [Bradyrhizobium sp. ORS278]
Length=272

 Score = 48.1 bits (113),  Expect = 3e-04, Method: Composition-based stats.
 Identities = 23/65 (35%), Positives = 37/65 (56%), Gaps = 0/65 (0%)

Query  105  VELDSVGGDVQSSIDIARRLRAAGSHTHVASGAKCFSACTVIYQGGVERTAGEEALFLLH  164
            V L+S GG V  ++ + R +R     T + +G  C S+C +++ GG+ER AGE A+  +H
Sbjct  134  VVLNSPGGSVADALAMGRLIRNRKFATEIEAGKTCASSCPLVFAGGIERRAGERAMIGVH  193

Query  165  YAVQV  169
                V
Sbjct  194  QIAAV  198


>ref|YP_569330.1| hypothetical protein RPD_2194 [Rhodopseudomonas palustris BisB5]
 gb|ABE39429.1| conserved hypothetical protein [Rhodopseudomonas palustris BisB5]
Length=273

 Score = 47.8 bits (112),  Expect = 4e-04, Method: Composition-based stats.
 Identities = 36/121 (29%), Positives = 58/121 (47%), Gaps = 9/121 (7%)

Query  53   APVAEGGLLFSALLEGRGVVGRSQT------ATLSIDEMITPVVA---GRVLAMSDAELA  103
            +P   GG   + L +  GV+G+  T        L+    ITP  A    + +      + 
Sbjct  75   SPWLPGGDRLAPLPQPDGVLGKGMTFELVSGGRLTATGTITPGTAEAFAKAIERHGEYIK  134

Query  104  FVELDSVGGDVQSSIDIARRLRAAGSHTHVASGAKCFSACTVIYQGGVERTAGEEALFLL  163
             V L+S GG V  ++ + R +R     T V +G  C S+C +++ GGVER AG++A   +
Sbjct  135  TVVLNSPGGSVSDALTMGRLIRDRKLATFVEAGRYCASSCPLVFFGGVERRAGDKAAIGV  194

Query  164  H  164
            H
Sbjct  195  H  195


>ref|YP_001237260.1| hypothetical protein BBta_1104 [Bradyrhizobium sp. BTAi1]
 gb|ABQ33354.1| hypothetical protein BBta_1104 [Bradyrhizobium sp. BTAi1]
Length=268

 Score = 47.8 bits (112),  Expect = 4e-04, Method: Composition-based stats.
 Identities = 32/102 (31%), Positives = 50/102 (49%), Gaps = 6/102 (5%)

Query  105  VELDSVGGDVQSSIDIARRLRAAGSHTHVASGAKCFSACTVIYQGGVERTAGEEALFLLH  164
            V L+S GG V  ++ + R +R     T + +G  C S+C +++ GG+ER AGE A+  +H
Sbjct  130  VVLNSPGGSVADALAMGRLIRTRKFATEIEAGKYCASSCPLVFAGGIERRAGERAVIGVH  189

Query  165  YAVQVSDDPNHARVGSVWGTVALIEA-----MIDLGTDSSVY  201
                V    + AR G        I A     + D+G D  V+
Sbjct  190  QIAAVR-TASAARSGDDMSRAQNISARCQRHLADMGIDLKVW  230


>ref|YP_001312506.1| hypothetical protein Smed_3760 [Sinorhizobium medicae WSM419]
 gb|ABR62573.1| conserved hypothetical protein [Sinorhizobium medicae WSM419]
Length=262

 Score = 45.8 bits (107),  Expect = 0.002, Method: Composition-based stats.
 Identities = 23/60 (38%), Positives = 37/60 (61%), Gaps = 0/60 (0%)

Query  105  VELDSVGGDVQSSIDIARRLRAAGSHTHVASGAKCFSACTVIYQGGVERTAGEEALFLLH  164
            V L+S GG V+ ++ I++ +R     T +AS A C S+C +++ GGV R A  +AL  +H
Sbjct  126  VALNSPGGSVEDALAISKLIREKKLDTKIASRALCASSCPIVFAGGVARVAASDALIGVH  185


ORF finding

SMS ORF Finder / any codon / reading frames 1, 2, 3 / minimum length 60 / direct strand / standard genetic code

>ORF number 1 in reading frame 1 on the direct strand extends from base 1 to base 234.
TATTGTGTGGCGAACATGCCAGGTGCTGTCGCGCGGACCTCAACGTTTGCTCTTAACAAT
GCAACGCTGCCTTTCATGCTGGCCCTTGCGAACAAGGGATATCGTCAGGCGCTGGCCGAC
GATCCACATTTGTTGGCGGGACTTAACGTGCATCACGGCGCCGTGACTTACGCCGCCGTG
GCTTCAGCCTTGGGTGAACCCTTCCATGATGCTGCCAGCGTGATCGCTGCCTAA

>Translation of ORF number 1 in reading frame 1 on the direct strand.
YCVANMPGAVARTSTFALNNATLPFMLALANKGYRQALADDPHLLAGLNVHHGAVTYAAV
ASALGEPFHDAASVIAA*

>ORF number 2 in reading frame 1 on the direct strand extends from base 334 to base 465.
CTTGCGGTTCGCAATCTAGTCGGGTCTCAGAGGAGAGGAAGCAATAATGGCAAAGAATGC
ACTTTTGTTGGGCCTTCTTGGTTTTGCCTTTGCTCTGGCAATTATGGGCATTTCTTTGAT
CTTAGACCGTGA

>Translation of ORF number 2 in reading frame 1 on the direct strand.
LAVRNLVGSQRRGSNNGKECTFVGPSWFCLCSGNYGHFFDLRP*

>ORF number 3 in reading frame 1 on the direct strand extends from base 466 to base 600.
ACCGAAACGGGCACCGTTGGTCGGTCCAGTCTCGTTCTTTGCGGCGCCTGTCGCCGAGGG
TGGCCTGCTGTTTTCAGCATTGCTGGAGGGCAGGGGGGTCGTCGGCCGATCGCAGACAGC
AACTCTCAGTATTGA

>Translation of ORF number 3 in reading frame 1 on the direct strand.
TETGTVGRSSLVLCGACRRGWPAVFSIAGGQGGRRPIADSNSQY*

>ORF number 4 in reading frame 1 on the direct strand extends from base 655 to base 930.
GCTTGCTTTTGTCGAACTCGACAGTGTGGGCGGCGATGTTCAATCGTCCATCGACATTGC
GCGCCGACTGCGGGCTGCCGGATCGCACACTCATGTTGCGTCTGGGGCGAAGTGTTTCAG
TGCCTGCACGGTTATCTACCAGGGTGGCGTCGAACGAACGGCGGGCGAGGAGGCACTATT
CCTTCTTCACTATGCGGTTCAGGTTTCAGACGACCCAAATCATGCGCGTGTTGGAAGTGT
CTGGGGCACGGTCGCGTTGATCGAAGCGATGATTGA

>Translation of ORF number 4 in reading frame 1 on the direct strand.
ACFCRTRQCGRRCSIVHRHCAPTAGCRIAHSCCVWGEVFQCLHGYLPGWRRTNGGRGGTI
PSSLCGSGFRRPKSCACWKCLGHGRVDRSDD*

>ORF number 1 in reading frame 2 on the direct strand extends from base 2 to base 166.
ATTGTGTGGCGAACATGCCAGGTGCTGTCGCGCGGACCTCAACGTTTGCTCTTAACAATG
CAACGCTGCCTTTCATGCTGGCCCTTGCGAACAAGGGATATCGTCAGGCGCTGGCCGACG
ATCCACATTTGTTGGCGGGACTTAACGTGCATCACGGCGCCGTGA

>Translation of ORF number 1 in reading frame 2 on the direct strand.
IVWRTCQVLSRGPQRLLLTMQRCLSCWPLRTRDIVRRWPTIHICWRDLTCITAP*

>ORF number 2 in reading frame 2 on the direct strand extends from base 224 to base 352.
TCGCTGCCTAAGCGACATCTCGATACCGTCTCAGAAATTGACCAGGCCGGTGCAGTAATG
CGCCGGGCTTTGCTGTTGAAGCTGCCTTTATTCTCCCCATCTCTCCCTAGCTTGCGGTTC
GCAATCTAG

>Translation of ORF number 2 in reading frame 2 on the direct strand.
SLPKRHLDTVSEIDQAGAVMRRALLLKLPLFSPSLPSLRFAI*

>ORF number 3 in reading frame 2 on the direct strand extends from base 353 to base 955.
TCGGGTCTCAGAGGAGAGGAAGCAATAATGGCAAAGAATGCACTTTTGTTGGGCCTTCTT
GGTTTTGCCTTTGCTCTGGCAATTATGGGCATTTCTTTGATCTTAGACCGTGAACCGAAA
CGGGCACCGTTGGTCGGTCCAGTCTCGTTCTTTGCGGCGCCTGTCGCCGAGGGTGGCCTG
CTGTTTTCAGCATTGCTGGAGGGCAGGGGGGTCGTCGGCCGATCGCAGACAGCAACTCTC
AGTATTGATGAAATGATAACGCCGGTTGTTGCGGGTCGGGTCTTGGCAATGTCGGATGCT
GAGCTTGCTTTTGTCGAACTCGACAGTGTGGGCGGCGATGTTCAATCGTCCATCGACATT
GCGCGCCGACTGCGGGCTGCCGGATCGCACACTCATGTTGCGTCTGGGGCGAAGTGTTTC
AGTGCCTGCACGGTTATCTACCAGGGTGGCGTCGAACGAACGGCGGGCGAGGAGGCACTA
TTCCTTCTTCACTATGCGGTTCAGGTTTCAGACGACCCAAATCATGCGCGTGTTGGAAGT
GTCTGGGGCACGGTCGCGTTGATCGAAGCGATGATTGATCTGGGCACCGACTCATCGGTT
TAC

>Translation of ORF number 3 in reading frame 2 on the direct strand.
SGLRGEEAIMAKNALLLGLLGFAFALAIMGISLILDREPKRAPLVGPVSFFAAPVAEGGL
LFSALLEGRGVVGRSQTATLSIDEMITPVVAGRVLAMSDAELAFVELDSVGGDVQSSIDI
ARRLRAAGSHTHVASGAKCFSACTVIYQGGVERTAGEEALFLLHYAVQVSDDPNHARVGS
VWGTVALIEAMIDLGTDSSVY

>ORF number 1 in reading frame 3 on the direct strand extends from base 57 to base 146.
CAATGCAACGCTGCCTTTCATGCTGGCCCTTGCGAACAAGGGATATCGTCAGGCGCTGGC
CGACGATCCACATTTGTTGGCGGGACTTAA

>Translation of ORF number 1 in reading frame 3 on the direct strand.
QCNAAFHAGPCEQGISSGAGRRSTFVGGT*

>ORF number 2 in reading frame 3 on the direct strand extends from base 459 to base 608.
ACCGTGAACCGAAACGGGCACCGTTGGTCGGTCCAGTCTCGTTCTTTGCGGCGCCTGTCG
CCGAGGGTGGCCTGCTGTTTTCAGCATTGCTGGAGGGCAGGGGGGTCGTCGGCCGATCGC
AGACAGCAACTCTCAGTATTGATGAAATGA

>Translation of ORF number 2 in reading frame 3 on the direct strand.
TVNRNGHRWSVQSRSLRRLSPRVACCFQHCWRAGGSSADRRQQLSVLMK*

>ORF number 3 in reading frame 3 on the direct strand extends from base 612 to base 914.
CGCCGGTTGTTGCGGGTCGGGTCTTGGCAATGTCGGATGCTGAGCTTGCTTTTGTCGAAC
TCGACAGTGTGGGCGGCGATGTTCAATCGTCCATCGACATTGCGCGCCGACTGCGGGCTG
CCGGATCGCACACTCATGTTGCGTCTGGGGCGAAGTGTTTCAGTGCCTGCACGGTTATCT
ACCAGGGTGGCGTCGAACGAACGGCGGGCGAGGAGGCACTATTCCTTCTTCACTATGCGG
TTCAGGTTTCAGACGACCCAAATCATGCGCGTGTTGGAAGTGTCTGGGGCACGGTCGCGT
TGA

>Translation of ORF number 3 in reading frame 3 on the direct strand.
RRLLRVGSWQCRMLSLLLSNSTVWAAMFNRPSTLRADCGLPDRTLMLRLGRSVSVPARLS
TRVASNERRARRHYSFFTMRFRFQTTQIMRVLEVSGARSR*


-------------------------------------------------------------------------------------------

SMS ORF Finder / any codon / reading frames 1, 2, 3 / minimum length 60 / reverse strand / standard genetic code

>ORF number 1 in reading frame 1 on the reverse strand extends from base 265 to base 609.
ACATCGCCGCCCACACTGTCGAGTTCGACAAAAGCAAGCTCAGCATCCGACATTGCCAAG
ACCCGACCCGCAACAACCGGCGTTATCATTTCATCAATACTGAGAGTTGCTGTCTGCGAT
CGGCCGACGACCCCCCTGCCCTCCAGCAATGCTGAAAACAGCAGGCCACCCTCGGCGACA
GGCGCCGCAAAGAACGAGACTGGACCGACCAACGGTGCCCGTTTCGGTTCACGGTCTAAG
ATCAAAGAAATGCCCATAATTGCCAGAGCAAAGGCAAAACCAAGAAGGCCCAACAAAAGT
GCATTCTTTGCCATTATTGCTTCCTCTCCTCTGAGACCCGACTAG

>Translation of ORF number 1 in reading frame 1 on the reverse strand.
TSPPTLSSSTKASSASDIAKTRPATTGVIISSILRVAVCDRPTTPLPSSNAENSRPPSAT
GAAKNETGPTNGARFGSRSKIKEMPIIARAKAKPRRPNKSAFFAIIASSPLRPD*

>ORF number 2 in reading frame 1 on the reverse strand extends from base 610 to base 774.
ATTGCGAACCGCAAGCTAGGGAGAGATGGGGAGAATAAAGGCAGCTTCAACAGCAAAGCC
CGGCGCATTACTGCACCGGCCTGGTCAATTTCTGAGACGGTATCGAGATGTCGCTTAGGC
AGCGATCACGCTGGCAGCATCATGGAAGGGTTCACCCAAGGCTGA

>Translation of ORF number 2 in reading frame 1 on the reverse strand.
IANRKLGRDGENKGSFNSKARRITAPAWSISETVSRCRLGSDHAGSIMEGFTQG*

>ORF number 3 in reading frame 1 on the reverse strand extends from base 775 to base 918.
AGCCACGGCGGCGTAAGTCACGGCGCCGTGATGCACGTTAAGTCCCGCCAACAAATGTGG
ATCGTCGGCCAGCGCCTGACGATATCCCTTGTTCGCAAGGGCCAGCATGAAAGGCAGCGT
TGCATTGTTAAGAGCAAACGTTGA

>Translation of ORF number 3 in reading frame 1 on the reverse strand.
SHGGVSHGAVMHVKSRQQMWIVGQRLTISLVRKGQHERQRCIVKSKR*

>ORF number 1 in reading frame 2 on the reverse strand extends from base 2 to base 172.
CGTAAACCGATGAGTCGGTGCCCAGATCAATCATCGCTTCGATCAACGCGACCGTGCCCC
AGACACTTCCAACACGCGCATGATTTGGGTCGTCTGAAACCTGAACCGCATAGTGAAGAA
GGAATAGTGCCTCCTCGCCCGCCGTTCGTTCGACGCCACCCTGGTAGATAA

>Translation of ORF number 1 in reading frame 2 on the reverse strand.
RKPMSRCPDQSSLRSTRPCPRHFQHAHDLGRLKPEPHSEEGIVPPRPPFVRRHPGR*

>ORF number 2 in reading frame 2 on the reverse strand extends from base 188 to base 367.
AACACTTCGCCCCAGACGCAACATGAGTGTGCGATCCGGCAGCCCGCAGTCGGCGCGCAA
TGTCGATGGACGATTGAACATCGCCGCCCACACTGTCGAGTTCGACAAAAGCAAGCTCAG
CATCCGACATTGCCAAGACCCGACCCGCAACAACCGGCGTTATCATTTCATCAATACTGA


>Translation of ORF number 2 in reading frame 2 on the reverse strand.
NTSPQTQHECAIRQPAVGAQCRWTIEHRRPHCRVRQKQAQHPTLPRPDPQQPALSFHQY*


>ORF number 3 in reading frame 2 on the reverse strand extends from base 368 to base 523.
GAGTTGCTGTCTGCGATCGGCCGACGACCCCCCTGCCCTCCAGCAATGCTGAAAACAGCA
GGCCACCCTCGGCGACAGGCGCCGCAAAGAACGAGACTGGACCGACCAACGGTGCCCGTT
TCGGTTCACGGTCTAAGATCAAAGAAATGCCCATAA

>Translation of ORF number 3 in reading frame 2 on the reverse strand.
ELLSAIGRRPPCPPAMLKTAGHPRRQAPQRTRLDRPTVPVSVHGLRSKKCP*

>ORF number 4 in reading frame 2 on the reverse strand extends from base 629 to base 727.
GGAGAGATGGGGAGAATAAAGGCAGCTTCAACAGCAAAGCCCGGCGCATTACTGCACCGG
CCTGGTCAATTTCTGAGACGGTATCGAGATGTCGCTTAG

>Translation of ORF number 4 in reading frame 2 on the reverse strand.
GEMGRIKAASTAKPGALLHRPGQFLRRYRDVA*

>ORF number 5 in reading frame 2 on the reverse strand extends from base 854 to base 955.
CGATATCCCTTGTTCGCAAGGGCCAGCATGAAAGGCAGCGTTGCATTGTTAAGAGCAAAC
GTTGAGGTCCGCGCGACAGCACCTGGCATGTTCGCCACACAA

>Translation of ORF number 5 in reading frame 2 on the reverse strand.
RYPLFARASMKGSVALLRANVEVRATAPGMFATQ

>ORF number 1 in reading frame 3 on the reverse strand extends from base 129 to base 419.
TGCCTCCTCGCCCGCCGTTCGTTCGACGCCACCCTGGTAGATAACCGTGCAGGCACTGAA
ACACTTCGCCCCAGACGCAACATGAGTGTGCGATCCGGCAGCCCGCAGTCGGCGCGCAAT
GTCGATGGACGATTGAACATCGCCGCCCACACTGTCGAGTTCGACAAAAGCAAGCTCAGC
ATCCGACATTGCCAAGACCCGACCCGCAACAACCGGCGTTATCATTTCATCAATACTGAG
AGTTGCTGTCTGCGATCGGCCGACGACCCCCCTGCCCTCCAGCAATGCTGA

>Translation of ORF number 1 in reading frame 3 on the reverse strand.
CLLARRSFDATLVDNRAGTETLRPRRNMSVRSGSPQSARNVDGRLNIAAHTVEFDKSKLS
IRHCQDPTRNNRRYHFINTESCCLRSADDPPALQQC*

>ORF number 2 in reading frame 3 on the reverse strand extends from base 504 to base 647.
GATCAAAGAAATGCCCATAATTGCCAGAGCAAAGGCAAAACCAAGAAGGCCCAACAAAAG
TGCATTCTTTGCCATTATTGCTTCCTCTCCTCTGAGACCCGACTAGATTGCGAACCGCAA
GCTAGGGAGAGATGGGGAGAATAA

>Translation of ORF number 2 in reading frame 3 on the reverse strand.
DQRNAHNCQSKGKTKKAQQKCILCHYCFLSSETRLDCEPQARERWGE*

>ORF number 3 in reading frame 3 on the reverse strand extends from base 705 to base 815.
GACGGTATCGAGATGTCGCTTAGGCAGCGATCACGCTGGCAGCATCATGGAAGGGTTCAC
CCAAGGCTGAAGCCACGGCGGCGTAAGTCACGGCGCCGTGATGCACGTTAA

>Translation of ORF number 3 in reading frame 3 on the reverse strand.
DGIEMSLRQRSRWQHHGRVHPRLKPRRRKSRRRDAR*
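
The listings above can be approximated with a short Python sketch of an "any codon" ORF search: in a given reading frame, an ORF is taken to be the stretch between two stop codons (or between a stop and the end of the sequence), reported with 1-based inclusive coordinates that include the closing stop, as in the SMS output. This is a simplified illustration, not the SMS source code; `find_orfs`, `reverse_complement` and the `min_codons` threshold are hypothetical names, and the exact way SMS counts the minimum length is an assumption.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq, frame, min_codons):
    """List (start, end) of stop-to-stop ORFs in one frame (frame = 1, 2 or 3).

    Coordinates are 1-based and inclusive; a closing stop codon is included.
    An ORF still open at the end of the sequence is reported up to the last
    complete codon. min_codons counts every codon in the reported range.
    """
    seq = seq.upper()
    orfs = []
    start = frame - 1  # 0-based index of the first base of the current ORF
    pos = start
    while pos + 3 <= len(seq):
        codon = seq[pos:pos + 3]
        pos += 3
        if codon in STOP_CODONS:
            if (pos - start) // 3 >= min_codons:
                orfs.append((start + 1, pos))
            start = pos
    # Trailing ORF without a stop codon (3' incomplete).
    if pos > start and (pos - start) // 3 >= min_codons:
        orfs.append((start + 1, pos))
    return orfs


if __name__ == "__main__":
    toy = "ATGAAATAGCCCGGGTTTTGA"  # hypothetical toy sequence
    for frame in (1, 2, 3):
        print("direct ", frame, find_orfs(toy, frame, 2))
        print("reverse", frame, find_orfs(reverse_complement(toy), frame, 2))
```

Applied to the contig in reading frame 2 of the direct strand, this convention yields an ORF that starts right after a stop codon and runs to the last complete codon of the sequence, which is the situation of ORF_NQ21170 (353-955, 3' incomplete).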