Bite 303. Unique genes

You have received a list of DNA sequences for a specific gene in the FASTA file format (see Bite 298).

Your job is to collapse this FASTA file into a new FASTA file that contains only unique gene sequences. Entries that share a sequence are merged into one entry, and its header is changed accordingly to list all of their locus tags.

To make things more interesting, the function accepts not only plain FASTA files but also gzipped FASTA files.
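
One way to accept both kinds of input is to look at the first two bytes of the file, since gzip streams start with the magic bytes 0x1f 0x8b. The helper below is only a sketch: the name open_fasta is hypothetical and not part of the Bite, and the tests may expect a different detection rule (for example, the file extension).

```python
import gzip
from pathlib import Path

# Every gzip stream begins with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def open_fasta(path):
    """Open a plain or gzipped FASTA file for reading in text mode.

    Hypothetical helper: detection is by magic bytes, not by extension.
    """
    path = Path(path)
    # Peek at the first two bytes in binary mode to decide how to open.
    with path.open("rb") as handle:
        is_gzipped = handle.read(2) == GZIP_MAGIC
    if is_gzipped:
        return gzip.open(path, "rt")
    return path.open("r")
```

Used as a context manager, for example `with open_fasta("input.fasta.gz") as handle:`, the rest of the code can read lines without caring which format it was given.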

Watch out for edge cases as specified in the tests.

Example

convert_to_unique_genes("input.fasta", "output.fasta")

input.fasta

>gene [locus_tag=AA11]
AAAAAA
>gene [locus_tag=BB22]
AAAAAA
>gene [locus_tag=CC33]
GAAAAC

output.fasta

>gene [locus_tags=AA11,BB22]
AAAAAA
>gene [locus_tag=CC33]
GAAAAC
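
The collapsing step in the example can be sketched in memory as follows. This is not the reference solution: the names parse_fasta and collapse_fasta are hypothetical, and the sketch assumes that every header contains a [locus_tag=...] field, that sequences are compared exactly as written, and that output order follows first appearance. The edge cases in the tests may require different choices.

```python
import re

# Captures the value inside "[locus_tag=...]" in a header line.
LOCUS_TAG = re.compile(r"\[locus_tag=([^\]]+)\]")


def parse_fasta(text):
    """Yield (header, sequence) pairs; multi-line sequences are joined."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines before the first header are ignored (assumption).
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def collapse_fasta(text):
    """Merge entries with identical sequences and combine their locus tags."""
    records = {}  # sequence -> (gene name, list of locus tags)
    for header, sequence in parse_fasta(text):
        match = LOCUS_TAG.search(header)
        if match is None:
            raise ValueError(f"no locus_tag in header: {header!r}")
        name = header[: match.start()].strip()
        # Dicts keep insertion order, so first appearance decides output order.
        _, tags = records.setdefault(sequence, (name, []))
        tags.append(match.group(1))

    lines = []
    for sequence, (name, tags) in records.items():
        key = "locus_tag" if len(tags) == 1 else "locus_tags"
        lines.append(f">{name} [{key}={','.join(tags)}]")
        lines.append(sequence)
    return "".join(f"{line}\n" for line in lines)
```

A full convert_to_unique_genes would read the input file (plain or gzipped), pass its text through a function like this, and write the result to the output path.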