src.utils.pandas module

Helper methods for manipulating pandas DataFrames, e.g. merging tables, dropping list entries, and indexing by a certain feature.

src.utils.pandas.concatOnPrefixes(table, prefixes, ident)

Partition columns by prefix and stack matches row-wise.

For each prefix in prefixes:
  • take all columns starting with the prefix

  • append all non-matching columns

  • add a column ‘ident’ with the prefix label

Then concatenate these partitions along rows.

Parameters:
  • table (DataFrame)

  • prefixes (list[str])

  • ident (str)

Return type:

DataFrame
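The steps above can be sketched roughly as follows. This is a minimal illustration, not the module's actual implementation: the function name `concat_on_prefixes_sketch` is hypothetical, and it assumes that "non-matching columns" means columns matching none of the prefixes and that column names keep their prefix.

```python
import pandas as pd


def concat_on_prefixes_sketch(table, prefixes, ident):
    """Hypothetical re-creation of the documented behaviour (assumptions noted)."""
    # Assumption: "non-matching" columns are those that match none of the prefixes.
    shared = [c for c in table.columns
              if not any(str(c).startswith(p) for p in prefixes)]
    parts = []
    for prefix in prefixes:
        # Take all columns starting with the current prefix.
        matching = [c for c in table.columns if str(c).startswith(prefix)]
        # Append the non-matching columns and label the partition.
        part = table[matching + shared].copy()
        part[ident] = prefix
        parts.append(part)
    # Stack the partitions row-wise; columns missing in a partition become NaN.
    return pd.concat(parts, axis=0, ignore_index=True)


wide = pd.DataFrame({"a_x": [1, 2], "b_x": [3, 4], "id": [10, 11]})
stacked = concat_on_prefixes_sketch(wide, ["a_", "b_"], "source")
print(stacked)
```

In this sketch the result has one block of rows per prefix, with the `source` column recording which prefix each block came from. Whether the real function also strips the prefix from the column names is not stated in the documentation.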

src.utils.pandas.dropListEntries(dataframe)

Drops all columns of a dataframe that contain a list.

Parameters:

dataframe (pd.DataFrame) – a dataframe to format

Returns:

dataframe – the resulting dataframe with the list-containing columns dropped

Return type:

pd.DataFrame
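A minimal sketch of this behaviour, under stated assumptions: the name `drop_list_entries_sketch` is hypothetical, and a column is treated as "containing a list" as soon as any of its cells is a Python `list`. The real function may use a different test.

```python
import pandas as pd


def drop_list_entries_sketch(dataframe):
    """Hypothetical re-creation: drop every column holding at least one list."""
    # Assumption: a single list-valued cell is enough to drop the column.
    list_columns = [
        c for c in dataframe.columns
        if dataframe[c].map(lambda v: isinstance(v, list)).any()
    ]
    return dataframe.drop(columns=list_columns)


df = pd.DataFrame({"name": ["a", "b"], "tags": [["x"], ["y", "z"]], "n": [1, 2]})
cleaned = drop_list_entries_sketch(df)
print(list(cleaned.columns))
```

Here only the `tags` column holds lists, so it is the only column removed. The input dataframe is left unchanged.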

src.utils.pandas.indexByOneVariable(df, var)

Sets one variable/feature as the new index of the dataframe.

Parameters:
  • df (pd.DataFrame) – the dataframe to change

  • var (str) – the name of the feature/variable

Returns:

dataframe – the new dataframe indexed by var

Return type:

pd.DataFrame
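One plausible way to obtain this result is pandas' `DataFrame.set_index`. The sketch below assumes exactly that; the name `index_by_one_variable_sketch` is hypothetical, and whether the real function keeps `var` as a regular column is not documented.

```python
import pandas as pd


def index_by_one_variable_sketch(df, var):
    """Hypothetical re-creation: use the column named `var` as the index."""
    # set_index returns a new dataframe and, by default, removes `var`
    # from the regular columns (an assumption about the real behaviour).
    return df.set_index(var)


people = pd.DataFrame({"id": ["p1", "p2"], "age": [30, 41]})
indexed = index_by_one_variable_sketch(people, "id")
print(indexed.loc["p2", "age"])
```

After indexing, rows can be looked up by the values of `var` through `.loc`.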

src.utils.pandas.mergeTables(tables)

Merges tables along the rows. Assumes that the column names are unique.

Parameters:

tables (list[DataFrame])

Return type:

DataFrame
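The description leaves the merge direction open to interpretation. The sketch below shows one possible reading, in which the tables are aligned on their row index and placed side by side, so column names must not repeat across tables. The name `merge_tables_sketch` and the explicit uniqueness check are assumptions for illustration; the real function may instead stack the tables with `axis=0`.

```python
import pandas as pd


def merge_tables_sketch(tables):
    """Hypothetical re-creation: join tables on their row index."""
    # Assumption: "unique colnames" means no column name occurs in two tables.
    all_columns = [c for t in tables for c in t.columns]
    if len(all_columns) != len(set(all_columns)):
        raise ValueError("column names must be unique across tables")
    # Align rows by index and place the columns next to each other.
    return pd.concat(tables, axis=1)


left = pd.DataFrame({"age": [30, 41]}, index=["p1", "p2"])
right = pd.DataFrame({"score": [0.5, 0.9]}, index=["p1", "p2"])
merged = merge_tables_sketch([left, right])
print(merged)
```

Failing early on duplicate names is a design choice of this sketch; without the check, `pd.concat` would silently produce a dataframe with repeated column labels.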

Dependency Diagrams (without externals):

digraph imports {
  rankdir=LR;
  node [shape=box];
  "src.core.process.extract" -> "src.utils.pandas";
  "src.main" -> "src.utils.pandas";
}

Import dependencies (collapsed)
