choppa.align.AlignFactory

class choppa.align.AlignFactory(fitness_dict: OrderedDict, complex: Structure)

Base class for aligning fitness data with PDB complexes within choppa.

__init__(fitness_dict: OrderedDict, complex: Structure)

Methods

__init__(fitness_dict, complex)

align_fitness()

Align a fitness OrderedDict to a complex object

alignment_idx_to_original_idx(alignment)

The BioPython alignment shifts indices freely based on query/target overlap.

complex_get_seq()

From a PDB.Structure.Structure, extracts the amino acid sequence

complex_get_seqidcs()

From a PDB.Structure.Structure, extracts the sequence indices as stored in the PDB file

fill_aligned_fitness(aligned_fitness_dict)

For an aligned fitness dict, there may be gaps with respect to the complex PDB.

fitness_get_seq()

From a fitness OrderedDict, extracts the amino acid sequence

fitness_get_seqidcs()

From a fitness OrderedDict, extracts the amino acid sequence indices as a list

fitness_reset_keys(alignment)

Given a fitness OrderedDict and an alignment, reset the keys of the fitness OrderedDict.

get_alignment(fitness_seq, complex_seq)

Aligns two AA sequences with BioPython's PairwiseAligner (https://biopython.org/DIST/docs/tutorial/Tutorial.html#sec128).

get_fitness_alignment_shift_dict(alignment)

Given the complex sequence with its residue indices (which may not start at 0) and the fitness-complex alignment, creates a dictionary of the form {fitness_idx : aligned_idx} that gives the index to use for each fitness data entry.

validate_alignment()

[Placeholder] validates the alignment of a fitness OrderedDict to a complex object

align_fitness()

Align a fitness OrderedDict to a complex object

alignment_idx_to_original_idx(alignment)

The BioPython alignment shifts indices freely based on query/target overlap. This function returns a dict that maps the new indices in the alignment back to the indices of the original sequence.
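As a rough illustration of the idea (not choppa's actual implementation), the sketch below assumes the alignment is available as a gapped string and maps each non-gap alignment column back to its position in the ungapped sequence. The function name `alignment_to_original_indices` and the gapped-string input are hypothetical.

```python
from typing import Dict


def alignment_to_original_indices(gapped_seq: str, gap_char: str = "-") -> Dict[int, int]:
    """Map each non-gap alignment column to its 0-based index in the ungapped sequence.

    Hypothetical helper: assumes the aligned sequence is given as a gapped string.
    """
    mapping = {}
    original_idx = 0
    for column, residue in enumerate(gapped_seq):
        if residue == gap_char:
            # Gap columns have no counterpart in the original sequence.
            continue
        mapping[column] = original_idx
        original_idx += 1
    return mapping


mapping = alignment_to_original_indices("--MK-TA")
print(mapping)  # {2: 0, 3: 1, 5: 2, 6: 3}
```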

complex_get_seq()

From a PDB.Structure.Structure, extracts the amino acid sequence

complex_get_seqidcs()

From a PDB.Structure.Structure, extracts the sequence indices as stored in the PDB file

fill_aligned_fitness(aligned_fitness_dict)

For an aligned fitness dict, there may be gaps with respect to the complex PDB. Fills these with empty data for easier parsing during visualization.
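A minimal sketch of this kind of gap filling is shown below. It assumes the aligned fitness data is keyed by PDB residue index; the helper names and the layout of the placeholder entry are hypothetical, and choppa's own 'empty' entries may look different.

```python
from collections import OrderedDict


def make_empty_entry():
    # Hypothetical placeholder layout, not choppa's actual schema.
    return {"wildtype": None, "mutants": []}


def fill_missing_residues(aligned_fitness, complex_idcs):
    """Return a new OrderedDict with one entry per complex residue index.

    Residues present in the complex but absent from the fitness data
    receive a fresh placeholder entry.
    """
    filled = OrderedDict()
    for idx in complex_idcs:
        if idx in aligned_fitness:
            filled[idx] = aligned_fitness[idx]
        else:
            filled[idx] = make_empty_entry()
    return filled


aligned = OrderedDict([(332, {"wildtype": "K"}), (334, {"wildtype": "T"})])
filled = fill_missing_residues(aligned, [332, 333, 334])
print(list(filled.keys()))  # [332, 333, 334]
```

Every residue in the complex then has an entry, so downstream code can iterate over the complex without checking for missing keys.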

fitness_get_seq()

From a fitness OrderedDict, extracts the amino acid sequence

fitness_get_seqidcs()

From a fitness OrderedDict, extracts the amino acid sequence indices as a list

fitness_reset_keys(alignment)

Given a fitness OrderedDict and an alignment, reset the keys of the fitness OrderedDict.

NB: also adds entries for indices that exist in the PDB but not in the fitness data, represented as ‘empty’ dict entries. This way the fitness HTML view shows ‘empty’ fitness data for those residues.
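The re-keying step on its own (leaving out the 'empty' entries) might look like the sketch below. It assumes a mapping of the form {fitness_idx : aligned_idx} is already available; the function name `reset_keys` is hypothetical, and dropping fitness entries that have no aligned index is an assumption rather than documented choppa behaviour.

```python
from collections import OrderedDict


def reset_keys(fitness, shift):
    """Return a new OrderedDict keyed by aligned index.

    Assumption: fitness entries without an aligned index are dropped.
    """
    return OrderedDict(
        (shift[old_key], value)
        for old_key, value in fitness.items()
        if old_key in shift
    )


fitness = OrderedDict([(1, {"wildtype": "M"}), (2, {"wildtype": "K"}), (3, {"wildtype": "T"})])
shift = {2: 332, 3: 333}  # {fitness_idx: aligned_idx}
rekeyed = reset_keys(fitness, shift)
print(list(rekeyed.keys()))  # [332, 333]
```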

get_alignment(fitness_seq, complex_seq)

Aligns two AA sequences with BioPython’s PairwiseAligner (https://biopython.org/DIST/docs/tutorial/Tutorial.html#sec128). We do a local alignment with BLOSUM to take evolutionary divergence into account.
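To make the notion of a local alignment concrete, here is a small pure-Python Smith-Waterman sketch. It is a stand-in for illustration only: it uses simple match/mismatch scores and a linear gap penalty instead of a BLOSUM matrix, it does not call BioPython, and the function name and scoring values are assumptions.

```python
def local_align(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment with simple scoring (illustrative only).

    Returns the best score and the list of aligned (index_in_a, index_in_b) pairs.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best-scoring cell until the score drops to zero.
    i, j = best_pos
    pairs = []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    pairs.reverse()
    return best, pairs


score, pairs = local_align("GGMKTAYIAK", "MKTAYI")
print(score)     # 12
print(pairs[0])  # (2, 0)
```

Because the alignment is local, the leading `GG` and trailing `AK` of the first sequence are left unaligned rather than penalised, which is why the aligned indices need not start at 0.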

get_fitness_alignment_shift_dict(alignment)

Given the complex sequence with its residue indices (which may not start at 0) and the fitness-complex alignment, creates a dictionary of the form {fitness_idx : aligned_idx} that gives the index to use for each fitness data entry.
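One way such a dictionary could be built is sketched below. It assumes the alignment is summarised as matching blocks of (start, end) positions for each sequence, loosely modelled on the aligned ranges a BioPython alignment exposes; the function name, argument names, and example indices are all hypothetical.

```python
def shift_dict_from_blocks(fitness_blocks, complex_blocks, fitness_idcs, complex_idcs):
    """Build {fitness_idx: aligned_idx} from paired aligned blocks.

    fitness_blocks / complex_blocks: lists of (start, end) positions (end exclusive)
    into the fitness and complex sequences; paired blocks have equal length.
    fitness_idcs / complex_idcs: residue indices as stored in each data source.
    """
    shift = {}
    for (f_start, f_end), (c_start, c_end) in zip(fitness_blocks, complex_blocks):
        for f_pos, c_pos in zip(range(f_start, f_end), range(c_start, c_end)):
            shift[fitness_idcs[f_pos]] = complex_idcs[c_pos]
    return shift


fitness_idcs = [1, 2, 3, 4, 5]            # indices used in the fitness data
complex_idcs = [332, 333, 334, 336, 337]  # PDB residue numbering, not starting at 0
shift = shift_dict_from_blocks([(1, 4)], [(0, 3)], fitness_idcs, complex_idcs)
print(shift)  # {2: 332, 3: 333, 4: 334}
```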

validate_alignment()

[Placeholder] validates the alignment of a fitness OrderedDict to a complex object