TEST module
Genotyping module
- constrain.test.genotyping.concatenating_list_of_dfs(list_of_dfs: list)
Concatenates a list of DataFrames into a single pd.DataFrame by rows.
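As a minimal sketch of row-wise concatenation, assuming the function behaves like `pd.concat` along axis 0: the frames and values below are made up for illustration, and whether the original resets the index is not stated here.

```python
import pandas as pd

# Two toy frames with the same columns (hypothetical data).
plate_1 = pd.DataFrame({"Sample-Name": ["A1", "A2"], "align_score": [543.0, 636.0]})
plate_2 = pd.DataFrame({"Sample-Name": ["A10"], "align_score": [634.0]})

# Stack the frames on top of each other; axis=0 means "by rows".
combined = pd.concat([plate_1, plate_2], axis=0)
```

Each frame keeps its own index labels unless `ignore_index=True` is passed, so duplicate labels can appear in the result.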
- constrain.test.genotyping.pairwise_alignment_of_templates(reads: list, templates: list, primers: list) → dict
Infers the relationship of templates to reads based on the highest score from a pairwise alignment.
- Parameters:
reads (list of Bio.SeqRecord.SeqRecord) – sequencing reads: .ab1 files converted to Bio.SeqRecord.SeqRecord objects
templates (list of Bio.SeqRecord.SeqRecord) – templates to relate the reads to, for example plasmids
primers (list of Bio.SeqRecord.SeqRecord) – primers used to find where each read should start
- Return type:
pd.DataFrame in the following format
Example
>>> df_alignment = pairwise_alignment_of_templates(reads, templates, primers_for_seq)
>>> df_alignment
     Sample-Name                    inf_promoter_name  align_score  inf_promoter
132  yp53re_cpr_A10_A10-pad_cpr_fw  pCCW12             634.0        5
188  yp53re_cpr_A11_A11-pad_cpr_fw  pTPI1              904.0        6
247  yp53re_cpr_A12_A12-pad_cpr_fw  pTPI1              851.0        6
93   yp53re_cpr_A1_A01-pad_cpr_fw   pCCW12             543.0        5
41   yp53re_cpr_A2_A02-pad_cpr_fw   pCCW12             636.0        5
Notes
If you want an inf_part_number column, set the description of each template Bio.SeqRecord.SeqRecord as follows:
pCCW12.description = '1'
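The "highest score wins" idea can be sketched in plain Python. This is an illustration only, not the library's implementation: the aligner and scoring scheme actually used are not shown in this section, so a small Smith-Waterman local alignment scorer stands in for them. Primer-based trimming of the read start is omitted, and the helper names and toy sequences are hypothetical.

```python
def local_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment score (score only, no traceback)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def best_template(read, templates):
    """Return (template_name, score) for the highest-scoring template."""
    scores = {name: local_alignment_score(read, seq) for name, seq in templates.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Toy sequences, not real promoter sequences.
templates = {"pCCW12": "ACGTACGTTTGACA", "pTPI1": "GGCATTCCAGGTCA"}
reads = {"read_1": "CGTACGTTTG", "read_2": "CATTCCAGG"}

inferred = {name: best_template(seq, templates) for name, seq in reads.items()}
```

With real data the reads and templates would be Bio.SeqRecord.SeqRecord objects, and the sequence strings would come from `str(record.seq)`.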
- constrain.test.genotyping.plat_seq_data_wrangler(sequencing_plates: list) → list
Converts the values in a list of Plate2Seq pd.DataFrames to numeric values and removes NaN values.
- Parameters:
sequencing_plates (list of pd.DataFrames) – sliced Plate2Seq pd.DataFrames
- Return type:
Plate2Seq pd.DataFrames with numeric values
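A minimal sketch of this kind of cleanup, assuming it amounts to coercing every column with `pd.to_numeric` and then dropping rows that contain NaN. The helper name `to_numeric_and_drop_nan` and the column names `AvgQual` and `usedBases` are hypothetical; the real function may treat some columns differently.

```python
import pandas as pd


def to_numeric_and_drop_nan(plates):
    """Coerce all columns to numbers and drop rows that contain NaN."""
    cleaned = []
    for plate in plates:
        # Values that cannot be parsed as numbers become NaN.
        numeric = plate.apply(pd.to_numeric, errors="coerce")
        cleaned.append(numeric.dropna())
    return cleaned


# Toy plate with string values and one unparsable entry (hypothetical columns).
plate = pd.DataFrame({"AvgQual": ["55", "n/a", "48"], "usedBases": ["700", "650", "30"]})
cleaned_plates = to_numeric_and_drop_nan([plate])
```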
- constrain.test.genotyping.plate_AvgQual(list_of_dfs_numeric: list, Avg_qual=50, used_bases=25) → list
Filters out rows that do not meet the Avg_qual and used_bases criteria.
- Parameters:
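list_of_dfs_numeric (list of pd.DataFrames) – sliced, numeric Plate2Seq pd.DataFrames
Avg_qual (int) – average-quality threshold used for filtering (default 50)
used_bases (int) – used-bases threshold used for filtering (default 25)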
- Return type:
Plate2Seq pd.DataFrames containing only the rows that meet the Avg_qual and used_bases criteria
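The filter can be illustrated with a boolean mask. This is a sketch under assumptions: the column names `AvgQual` and `usedBases`, the helper name `filter_by_quality`, and the strict "greater than" comparison are all guesses, since this section does not state them.

```python
import pandas as pd


def filter_by_quality(plates, avg_qual=50, used_bases=25):
    """Keep rows that exceed both thresholds (assumed columns and direction)."""
    filtered = []
    for plate in plates:
        mask = (plate["AvgQual"] > avg_qual) & (plate["usedBases"] > used_bases)
        filtered.append(plate[mask])
    return filtered


# Toy numeric plate: row 1 fails on quality, row 2 fails on used bases.
plate = pd.DataFrame({"AvgQual": [55, 48, 60], "usedBases": [700, 650, 20]})
filtered_plates = filter_by_quality([plate])
```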
- constrain.test.genotyping.slicing_and_naming_seq_plates(sequencing_plates, where_to_slice=7) → list
Slices the rows of each DataFrame in a list and renames the columns. Used to ease pre-processing of Plate2Seq Excel files.
- Parameters:
sequencing_plates (list of pd.DataFrames) – Plate2Seq pd.DataFrames
where_to_slice (int) – row index at which to slice each DataFrame
- Return type:
list of sliced plate pd.DataFrames
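One way to picture the slicing step is sketched below. It assumes that the rows above `where_to_slice` are report metadata, that the row at `where_to_slice` holds the real column names, and that the data follows. This layout and the helper name `slice_and_rename` are assumptions for illustration; the actual Plate2Seq layout and renaming rule may differ.

```python
import pandas as pd


def slice_and_rename(plates, where_to_slice=7):
    """Drop leading metadata rows and promote the header row to column names."""
    sliced = []
    for plate in plates:
        header = plate.iloc[where_to_slice]            # row holding the column names
        body = plate.iloc[where_to_slice + 1:].reset_index(drop=True)
        body.columns = list(header)
        sliced.append(body)
    return sliced


# Toy sheet: two metadata rows, one header row, two data rows.
raw = pd.DataFrame(
    [
        ["Sequencing report", None],
        ["Order", "123"],
        ["Sample-Name", "AvgQual"],
        ["A1", 55],
        ["A2", 48],
    ]
)
sliced_plates = slice_and_rename([raw], where_to_slice=2)
```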