Proteins fulfill important functions in all organisms and consist of amino acids linked together in a specific order.
Although single changes of these amino acids can have devastating effects on the protein function, not all changes carry the same severity. The impact is largely influenced by the chemical and physical properties of the substituted amino acid.
But how can differences in proteins be quantified? An efficient way to calculate similarity is the use of scoring matrices.
In this Bite you will use BLOSUM and PAM matrices to calculate protein similarity scores.
The score is the sum of the scoring matrix values for each aligned pair of amino acids. Each amino acid is represented by a different letter (e.g. A stands for alanine, R for aRginine ...).
Consider the following scoring matrix:

BLOSUM62 (excerpt):

  |  A  R  N  D  C  Q [...]
--+-------------------------
A |  4 -1 -2 -2  0 -1 [...]
R | -1  5  0 -2 -3  1 [...]
To calculate the BLOSUM62 similarity score between two sequences Seq1 and Seq2, the sequences are aligned and the individual scores for each amino acid pair are added up as follows:

Seq1    A  R  R  N  C  Q  A
Seq2    A  A  R  R  A  A  A
Score   4 -1  5  0  0 -1  4  --> SUM == 11

The similarity score is 11 in this example.
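The worked example can be reproduced with a few lines of Python. This is only a sketch: the dictionary layout (pairs of letters as keys) and the names `BLOSUM62_EXCERPT`, `pair_score` and `similarity` are assumptions made for illustration, and the matrices provided with the Bite may be structured differently. Only the A and R rows from the excerpt are included.

```python
# Values copied from the BLOSUM62 excerpt (rows A and R only).
# The (row, column) tuple key format is an assumption for this sketch.
BLOSUM62_EXCERPT = {
    ("A", "A"): 4, ("A", "R"): -1, ("A", "N"): -2,
    ("A", "D"): -2, ("A", "C"): 0, ("A", "Q"): -1,
    ("R", "A"): -1, ("R", "R"): 5, ("R", "N"): 0,
    ("R", "D"): -2, ("R", "C"): -3, ("R", "Q"): 1,
}


def pair_score(matrix, a, b):
    """Look up one pair, trying both orders because the matrix is symmetric."""
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix[(b, a)]  # a KeyError here means the pair is not in the matrix


def similarity(matrix, seq1, seq2):
    """Sum the pair scores position by position (sequences assumed aligned)."""
    return sum(pair_score(matrix, a, b) for a, b in zip(seq1, seq2))


score = similarity(BLOSUM62_EXCERPT, "ARRNCQA", "AARRAAA")
print(score)  # 11
```

Pairs such as N/R appear in the excerpt only as R/N, which is why the lookup tries both orders.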
- Implement a general matrix score calculator for amino acid sequences using the provided matrices.
- Implement a custom error class and raise it in the matrix_score function to address non-existent pairs (see tests for more information).
- Write a function that returns the most closely related (highest-scoring) amino acid sequence(s).
Note: For this bite you can assume that all sequences are already properly aligned.
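One possible shape for these three tasks is sketched below. Apart from matrix_score, which the task list mentions, every name here is hypothetical: the error class `MissingPairError`, the helper `closest_match`, the argument order, and the tuple-keyed `TOY` matrix are assumptions, and the tests for the Bite may expect different names or signatures.

```python
class MissingPairError(Exception):
    """Hypothetical error for a pair that the scoring matrix does not contain."""


def matrix_score(seq1, seq2, matrix):
    """Sum the matrix values for each aligned pair (argument order is assumed)."""
    total = 0
    for a, b in zip(seq1, seq2):
        if (a, b) in matrix:
            total += matrix[(a, b)]
        elif (b, a) in matrix:  # symmetric matrix: accept the reversed pair
            total += matrix[(b, a)]
        else:
            raise MissingPairError(f"no score for pair {a}/{b}")
    return total


def closest_match(reference, candidates, matrix):
    """Return every candidate that shares the highest score with the reference."""
    if not candidates:
        return []
    scores = {seq: matrix_score(reference, seq, matrix) for seq in candidates}
    best = max(scores.values())
    return [seq for seq, value in scores.items() if value == best]


# Toy matrix with three values taken from the BLOSUM62 excerpt.
TOY = {("A", "A"): 4, ("A", "R"): -1, ("R", "R"): 5}

print(closest_match("AR", ["AA", "RR", "AR"], TOY))  # ['AR']
print(closest_match("AA", ["AR", "RA"], TOY))        # ['AR', 'RA']
```

Returning a list rather than a single string keeps ties visible, which matches the "sequence(s)" wording in the task.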