pyprobound.table.get_dataframe

get_dataframe(paths, total_count=None, random_state=None)

Loads tab-delimited count tables into the columns of a pandas DataFrame.

The input count tables are assumed to have a sequence field and a series of count fields, all separated by a tab character, with no header.
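As an illustration of this layout, the sketch below writes a small count table to an in-memory buffer and reads it back using only the standard library. The sequences and counts are made-up example data, and the sketch assumes the sequence is the first field on each line; it does not call pyprobound.

```python
import csv
import io

# Hypothetical example data: one sequence field followed by two count fields.
rows = [
    ("ACGTACGT", 12, 3),
    ("TTGACCTA", 0, 7),
    ("GGCATCGA", 5, 5),
]

# Write the table as tab-separated text with no header row.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(rows)
table_text = buffer.getvalue()

# Read it back to confirm the layout: field 0 is the sequence,
# and every remaining field is an integer count.
parsed = [
    (fields[0], [int(count) for count in fields[1:]])
    for fields in csv.reader(io.StringIO(table_text), delimiter="\t")
]
```

Writing the same text to a file on disk would give a table in the format described above.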

Parameters:
  • paths (str | list[str]) – The path, or list of paths, of the count tables to be merged into a dataframe.

  • total_count (int | None) – The total number of counts to be sampled from each column.

  • random_state (int | None) – A seed used to make the output reproducible if total_count is specified.
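To make the interaction between total_count and random_state concrete, here is a minimal sketch of one plausible reading: drawing a fixed number of reads without replacement from a single column of counts, with a seed for reproducibility. The helper downsample_counts and its seed argument are hypothetical, and the sampling scheme pyprobound actually uses is not stated here and may differ.

```python
import random
from collections import Counter


def downsample_counts(counts, total_count, seed=None):
    """Draw total_count reads without replacement from a sequence -> count mapping.

    Hypothetical helper for illustration only; not part of pyprobound.
    """
    sequences = list(counts)
    weights = [counts[seq] for seq in sequences]
    if total_count > sum(weights):
        raise ValueError("total_count exceeds the number of counts available")
    rng = random.Random(seed)
    # counts= treats each sequence as repeated that many times (Python 3.9+).
    drawn = rng.sample(sequences, k=total_count, counts=weights)
    sampled = Counter(drawn)
    return {seq: sampled.get(seq, 0) for seq in sequences}


# Made-up column of counts for three sequences.
column = {"ACGTACGT": 12, "TTGACCTA": 0, "GGCATCGA": 5}
first = downsample_counts(column, total_count=10, seed=0)
second = downsample_counts(column, total_count=10, seed=0)
```

With the same seed, both calls return the same sampled counts, which is the kind of reproducibility the random_state parameter is described as providing.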

Return type:

DataFrame

Returns:

A pandas DataFrame of integer counts, with each column representing a sequencing round. Sequences are stored in the index of the dataframe.
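The shape of such a dataframe can be approximated with plain pandas, as in the sketch below. It reads a made-up table directly with pandas.read_csv rather than calling get_dataframe, and the column labels (0, 1, ...) are an assumption; the labels pyprobound assigns may differ.

```python
import io

import pandas as pd

# Hypothetical table text: sequence first, then counts, no header row.
table_text = "ACGTACGT\t12\t3\nTTGACCTA\t0\t7\nGGCATCGA\t5\t5\n"

# header=None keeps the first row as data; index_col=0 moves the
# sequences into the index, leaving only the count columns.
df = pd.read_csv(io.StringIO(table_text), sep="\t", header=None, index_col=0)

# Relabel the count columns 0, 1, ... so each one stands for a round
# (assumed labeling), and drop the leftover index name.
df.columns = range(df.shape[1])
df.index.name = None
```

Counts for a given sequence and round can then be looked up by label, for example df.loc["TTGACCTA", 1].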