Bite 306. Translate coding sequences to proteins

Genes can be converted (translated) to proteins using a three-base decoding system, as described in Bite 255.
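To make the three-base decoding concrete, here is a minimal sketch in plain Python. The helper name `split_codons` and the four-entry `CODON_EXCERPT` dictionary are made up for this illustration; a full genetic code has 64 codons, and this is not the Biopython approach the bite asks for.

```python
from typing import List

# Tiny excerpt of the standard genetic code, just enough for this demo.
# "*" marks a stop codon.
CODON_EXCERPT = {"ATG": "M", "GGG": "G", "TTT": "F", "TAA": "*"}


def split_codons(sequence: str) -> List[str]:
    """Split a DNA sequence into consecutive three-base codons."""
    return [sequence[i:i + 3] for i in range(0, len(sequence), 3)]


codons = split_codons("ATGGGGTTTTAA")
print(codons)  # ['ATG', 'GGG', 'TTT', 'TAA']

# Look up each codon to get the one-letter amino acid code.
print("".join(CODON_EXCERPT[codon] for codon in codons))  # MGF*
```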

Your job is to create a function that takes a coding sequence (CDS), that is, the region of a gene that encodes a protein, excluding any other features, and returns the translated protein/polypeptide as a str.

Since this is a beginner bite, we've enabled Biopython. Have a look at the Bio.Seq module and locate the correct function to solve this bite. You will also need to clean up any whitespace in the input before translating it.
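One possible way to handle the whitespace clean-up is sketched below. The helper name `clean_sequence` is hypothetical, and both removing whitespace inside the sequence (not only at the ends) and upper-casing the bases are assumptions; the bite's tests may only need part of this.

```python
def clean_sequence(raw: str) -> str:
    """Remove all whitespace (spaces, tabs, newlines) and upper-case the bases."""
    # str.split() with no argument splits on any run of whitespace,
    # so joining the pieces drops leading, trailing and internal whitespace.
    return "".join(raw.split()).upper()


print(clean_sequence("  ATG GGG\nTTT\tTAA \n"))  # ATGGGGTTTTAA
```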

Although the genetic code is highly conserved, some organisms and cell compartments use slightly different encodings. Therefore, Biopython implements several mappings, or "translation tables", that you can use directly.
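As an illustration of how translation tables can differ, the sketch below hard-codes a few well-known differences between NCBI tables. The dictionaries and the helper `is_start_codon` are written for this example only; in the bite itself you would rely on the tables that ship with Biopython rather than typing them out.

```python
# Start codons as listed in NCBI translation tables 1 (Standard) and 11 (Bacterial).
START_CODONS = {
    "Standard": {"TTG", "CTG", "ATG"},
    "Bacterial": {"TTG", "CTG", "ATT", "ATC", "ATA", "ATG", "GTG"},
}

# A few codons whose meaning differs between two tables ("*" marks a stop).
DIFFERENCES = {
    "TGA": {"Standard": "*", "Vertebrate Mitochondrial": "W"},
    "AGA": {"Standard": "R", "Vertebrate Mitochondrial": "*"},
    "ATA": {"Standard": "I", "Vertebrate Mitochondrial": "M"},
}


def is_start_codon(codon: str, table: str) -> bool:
    """Return True if the codon is a valid start codon in the named table."""
    return codon in START_CODONS[table]


print(is_start_codon("GTG", "Standard"))   # False
print(is_start_codon("GTG", "Bacterial"))  # True
```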

Also make sure you find and set the option for complete coding sequences (CDS): with it, a start codon in position 1 is translated as the amino acid M rather than the amino acid it would normally encode (for example, TTG normally encodes L).
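The effect of treating the input as a complete CDS can be sketched by hand as follows. This is a simplified illustration and not the Biopython call the bite asks you to find: the function name `translate_complete_cds`, the use of `ValueError`, and the five-codon `FORWARD` excerpt of the bacterial table are all assumptions made for the example.

```python
# Small excerpt of the bacterial table (NCBI table 11); a real table has 64 codons.
FORWARD = {"ATG": "M", "TTG": "L", "GGG": "G", "TTT": "F", "ACG": "T"}
START_CODONS = {"TTG", "CTG", "ATT", "ATC", "ATA", "ATG", "GTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate_complete_cds(sequence: str) -> str:
    """Translate a complete CDS: start codon first, stop codon last."""
    if len(sequence) % 3 != 0:
        raise ValueError("Sequence length is not a multiple of three")
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    if len(codons) < 2:
        raise ValueError("A complete CDS needs a start and a stop codon")
    if codons[0] not in START_CODONS:
        raise ValueError(f"First codon '{codons[0]}' is not a start codon")
    if codons[-1] not in STOP_CODONS:
        raise ValueError(f"Final codon '{codons[-1]}' is not a stop codon")
    body = codons[1:-1]
    if any(codon in STOP_CODONS for codon in body):
        raise ValueError("Extra in-frame stop codon found")
    # The start codon always becomes M, whatever it encodes elsewhere,
    # and the terminal stop codon is dropped from the result.
    return "M" + "".join(FORWARD[codon] for codon in body)


print(FORWARD["TTG"])                          # L (its usual meaning)
print(translate_complete_cds("TTGGGGTTTTAA"))  # MGF (TTG read as a start codon)
```

Codons missing from the excerpt raise a `KeyError` here; a full table would avoid that.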

Example of how the function should work:

>>> translate_gene("ATGGGGTTTTAA", "Bacterial")
'MGF'
>>> translate_gene("TTGGGGTTTTAA", "Bacterial")
'MGF'
>>> translate_gene("ACGGGGTTTTAA", "Bacterial")
TranslationError: First codon 'ACG' is not a start codon

Good luck!



We use Python 3.8