The nucleic acid codes supported are:
A --> adenosine
C --> cytidine
G --> guanosine
T --> thymidine
U --> uridine
R --> G A (purine)
Y --> T C (pyrimidine)
K --> G T (keto)
M --> A C (amino)
S --> G C (strong)
W --> A T (weak)
B --> G T C
D --> G A T
H --> A C T
V --> G C A
N --> A G C T (any)
- gap of indeterminate length
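
The ambiguity codes in this list can be read as a lookup table from one letter to the set of bases it allows. A minimal Python sketch of that idea follows; the names `NUCLEIC_CODES`, `matches`, and `expand` are illustrative assumptions, not part of BLAST, and the gap character `-` is left out because it does not stand for a base.

```python
from itertools import product

# Nucleic acid codes from the list above, mapped to the bases each one allows.
# Assumption: U is kept as its own base rather than being folded into T.
NUCLEIC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "GA", "Y": "TC", "K": "GT", "M": "AC",
    "S": "GC", "W": "AT",
    "B": "GTC", "D": "GAT", "H": "ACT", "V": "GCA",
    "N": "AGCT",
}

def matches(code, base):
    """Return True if the concrete base is allowed by the (possibly ambiguous) code."""
    return base.upper() in NUCLEIC_CODES[code.upper()]

def expand(pattern):
    """List every concrete sequence that a short ambiguous pattern can stand for."""
    choices = [NUCLEIC_CODES[letter] for letter in pattern.upper()]
    return ["".join(combo) for combo in product(*choices)]

print(matches("R", "A"))  # R (purine) allows A
print(expand("RY"))       # one purine followed by one pyrimidine
```

Note that `expand` grows exponentially with the number of ambiguous letters, so it is only practical for short patterns.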
For those programs that use amino acid query sequences
(BLASTP and TBLASTN), the accepted amino acid codes are:
A alanine
B aspartate or asparagine
C cysteine
D aspartate
E glutamate
F phenylalanine
G glycine
H histidine
I isoleucine
K lysine
L leucine
M methionine
N asparagine
P proline
Q glutamine
R arginine
S serine
T threonine
U selenocysteine
V valine
W tryptophan
Y tyrosine
Z glutamate or glutamine
X any
* translation stop
- gap of indeterminate length
>sp|Q38361|VINT_BPMD2 Integrase OS=Mycobacterium phage D29 GN=33 PE=3 SV=1
MDAEAWLASEKRLIDNEEWTPPAEREKKAAASAITVEEYTKKWIAERD
LAGGTKDLYSTHARKRIYPVLGDTPVAEMTPALVRAWWAGMGKQYP
TARRHAYNVLRAVMNTAVEDKLVSENPCRIEQKAPAERDVEALTPEEL
DVVAGEVFEHYRVAVYILAWTSLRFGELIEIRRKDIVDDGETMKLRVR
RGAARVGEKIVVGNTKTVRSKRPVTVPPHVAAMIREHMADRTKMNK
GPEALLVTTTRGQRLSKSAFTRSLKKGYAKIGRPDLRIHDLRAVGATL
AAQAGATTKELMVRLGHTTPRMAMKYQMASAARDEEIARRMSELAGI
TP
The sequence is an amino acid sequence because it contains the amino acid P (proline), and there is no P in the list of nucleic acid codes.
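
The same check can be done for every letter at once. The Python sketch below reports which letters of a sequence fall outside the nucleic acid codes listed earlier; the names `NUCLEIC_LETTERS` and `non_nucleic_letters` are assumptions made for illustration, and only the first line of the integrase record is used as input.

```python
# All nucleic acid codes from the list above, plus the gap character.
NUCLEIC_LETTERS = set("ACGTURYKMSWBDHVN-")

def non_nucleic_letters(sequence):
    """Return the sorted letters in the sequence that are not nucleic acid codes."""
    return sorted(set(sequence.upper()) - NUCLEIC_LETTERS)

# First line of the integrase record quoted above.
first_line = "MDAEAWLASEKRLIDNEEWTPPAEREKKAAASAITVEEYTKKWIAERD"

# Any letter reported here cannot occur in a nucleotide sequence,
# so a non-empty result means the record is a protein sequence.
print(non_nucleic_letters(first_line))
```

An empty result would not prove the sequence is nucleotide, since a short peptide could consist only of letters such as A, C, G, and T; the test is conclusive only in the direction used here.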
The first amino acid in the integrase sequence is M (methionine), and its codon in mRNA is the start codon AUG.
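
To illustrate how AUG maps to M, here is a small Python sketch that translates an mRNA string codon by codon. The codon table is deliberately partial (standard genetic code, only the codons for M, D, A, and E), and `hypothetical_mrna` is an invented example chosen to give the first four residues of the integrase; it is not the actual phage gene sequence, which is not shown in this document.

```python
# Partial codon table from the standard genetic code, limited to M, D, A, and E.
CODON_TABLE = {
    "AUG": "M",                                      # methionine, the start codon
    "GAU": "D", "GAC": "D",                          # aspartate
    "GCU": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
    "GAA": "E", "GAG": "E",                          # glutamate
}

def translate(mrna):
    """Translate an mRNA string codon by codon using the partial table above."""
    mrna = mrna.upper()
    usable = len(mrna) - len(mrna) % 3  # ignore a trailing incomplete codon
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)

# Hypothetical mRNA fragment, invented for this example.
hypothetical_mrna = "AUGGAUGCUGAA"
print(translate(hypothetical_mrna))
```

A codon outside the partial table raises a `KeyError`; a full 64-codon table would be needed to translate real transcripts.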