Package peppy Documentation

Class Project

A class to model a Project (collection of samples and metadata).

Parameters:

  • cfg (str | Mapping): Project config file (YAML), or appropriate key-value mapping of data to constitute project
  • sample_table_index (str | Iterable[str]): name of the columns to set the sample_table index to
  • subsample_table_index (str | Iterable[str]): name of the columns to set the subsample_table index to
  • amendments (str | Iterable[str]): names of the amendments, declared within the configuration file, to activate

Examples:

    from peppy import Project
    prj = Project("ngs")
    samples = prj.samples
def __init__(self, cfg=None, amendments=None, sample_table_index=None, subsample_table_index=None, defer_samples_creation=False)

Initialize self. See help(type(self)) for accurate signature.

def activate_amendments(self, amendments)

Update settings based on amendment-specific values.

This method will update Project attributes, adding new values associated with the amendments indicated, and in case of collision with an existing key/attribute the amendments' values will be favored.

Parameters:

  • amendments (Iterable[str]): names of the amendments to be activated

Returns:

  • peppy.Project: Updated Project instance

Raises:

  • TypeError: if argument to amendments parameter is null
  • NotImplementedError: if this call is made on a project not created from a config file
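
The collision rule described above (amendment values are favored over existing ones) can be illustrated with plain dictionaries. This is a minimal sketch of the documented behavior, not peppy's implementation; `apply_amendments`, `base_settings`, and `declared_amendments` are hypothetical names introduced only for illustration.

```python
# Hypothetical stand-ins for a project's settings and its declared amendments;
# the keys and values are illustrative, not peppy internals.
base_settings = {"sample_table": "samples.csv", "genome": "hg38"}
declared_amendments = {
    "mouse": {"genome": "mm10"},
    "newlib": {"sample_table": "samples_newlib.csv", "protocol": "ATAC"},
}


def apply_amendments(settings, declared, names):
    """Return a copy of settings updated with each named amendment, in order."""
    if names is None:
        raise TypeError("amendments must not be None")
    if isinstance(names, str):
        names = [names]  # accept a single amendment name as well
    updated = dict(settings)  # leave the original settings untouched
    for name in names:
        updated.update(declared[name])  # amendment values win on key collisions
    return updated


amended = apply_amendments(base_settings, declared_amendments, ["mouse", "newlib"])
```

Because the sketch works on a copy, the original settings remain available, which is loosely analogous to what deactivate_amendments restores.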
def add_samples(self, samples)

Add list of Sample objects

Parameters:

  • samples (peppy.Sample | Iterable[peppy.Sample]): samples to add
def amendments(self)

Return the list of currently active amendments, or None if none were activated

Returns:

  • Iterable[str]: a list of currently active amendment names
def attr_constants(self)

Update each Sample with constants declared by a Project. If Project does not declare constants, no update occurs.

def attr_derive(self, attrs=None)

Set derived attributes for all Samples tied to this Project instance

def attr_imply(self)

Infer value for additional field(s) from other field(s).

Add columns/fields to the sample based on the values of already-set fields that the sample's project declares as implying values for additional data elements of the sample.
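
The idea of implied attributes can be sketched with plain dictionaries. The rule layout below (an "if" mapping of attribute to accepted values and a "then" mapping of attributes to set) is an assumption made for illustration, and `imply_attributes` is a hypothetical helper, not a peppy function.

```python
# Hypothetical implication rules: when every attribute named under "if" holds one
# of the listed values, the attributes under "then" are added to the sample.
rules = [
    {"if": {"organism": ["human", "Homo sapiens"]}, "then": {"genome": "hg38"}},
    {"if": {"organism": ["mouse"]}, "then": {"genome": "mm10"}},
]


def imply_attributes(sample, rules):
    """Return a copy of sample with attributes implied by the matching rules."""
    updated = dict(sample)
    for rule in rules:
        conditions = rule["if"].items()
        if all(updated.get(attr) in allowed for attr, allowed in conditions):
            updated.update(rule["then"])
    return updated


mouse_sample = imply_attributes({"sample_name": "s1", "organism": "mouse"}, rules)
frog_sample = imply_attributes({"sample_name": "s2", "organism": "frog"}, rules)
```

A sample that matches no rule, such as `frog_sample` here, is returned without additional fields.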

def attr_merge(self)

Merge sample subannotations (from subsample table) with sample annotations (from sample_table)

def attr_remove(self)

Remove declared attributes from all samples that have them defined

def attr_synonyms(self)

For all samples, copy attribute values to a new attribute

def config(self)

Get the config mapping

Returns:

  • Mapping: config. May be formatted to comply with the most recent version specifications
def config_file(self)

Get the config file path

Returns:

  • str: path to the config file
def copy(self)

Copy self to a new object.

def create_samples(self)

Populate Project with Sample objects

def deactivate_amendments(self)

Bring the original project settings back.

Returns:

  • peppy.Project: Updated Project instance

Raises:

  • NotImplementedError: if this call is made on a project not created from a config file
def get_description(self)

Infer project description from config file.

The provided description has to be of a class coercible to string.

Returns:

  • str: inferred description for project.

Raises:

  • InvalidConfigFileException: if description is not of a class coercible to string
def get_sample(self, sample_name)

Get an individual sample object from the project.

Will raise a ValueError if the sample is not found. In the case of multiple samples with the same name (which is not typically allowed), a warning is issued and the first sample is returned.

Parameters:

  • sample_name (str): The name of a sample to retrieve

Returns:

  • peppy.Sample: The requested Sample object
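
The lookup contract described above (ValueError when nothing matches, a warning plus the first match when names are duplicated) can be sketched over a plain list of dictionaries. This is an illustration of the documented behavior only; `find_sample` and the sample records are hypothetical and do not reflect how peppy stores samples.

```python
import warnings


def find_sample(samples, sample_name):
    """Return the first sample whose name matches, following the documented contract."""
    matches = [s for s in samples if s["sample_name"] == sample_name]
    if not matches:
        raise ValueError(f"No sample named '{sample_name}' in this project")
    if len(matches) > 1:
        # Duplicate names are not typically allowed; warn and fall back to the first.
        warnings.warn(f"{len(matches)} samples named '{sample_name}'; returning the first")
    return matches[0]


# Hypothetical sample records used only for this sketch.
samples = [
    {"sample_name": "frog_1", "protocol": "RNA-seq"},
    {"sample_name": "frog_2", "protocol": "RNA-seq"},
    {"sample_name": "frog_2", "protocol": "ATAC-seq"},
]
unique_hit = find_sample(samples, "frog_1")
```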
def get_samples(self, sample_names)

Returns a list of sample objects given a list of sample names

Parameters:

  • sample_names (list): A list of sample names to retrieve

Returns:

  • list[peppy.Sample]: A list of Sample objects
def infer_name(self)

Infer project name from config file path.

First assume the name is the folder in which the config file resides, unless that folder is named "metadata", in which case the project name is the parent of that folder.

Returns:

  • str: inferred name for project.

Raises:

  • InvalidConfigFileException: if the project lacks both a name and a configuration file (no basis, then, for inference)
  • InvalidConfigFileException: if specified Project name is invalid
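
The folder-based rule described above can be sketched with the standard library alone. This is a minimal illustration under the assumption that only the config file path is available; `infer_project_name` is a hypothetical helper and omits the validation and error handling listed under Raises.

```python
import os


def infer_project_name(config_path):
    """Infer a project name from the folder that holds the config file."""
    config_folder = os.path.dirname(os.path.abspath(config_path))
    name = os.path.basename(config_folder)
    if name == "metadata":
        # A "metadata" folder is not informative; use its parent folder instead.
        name = os.path.basename(os.path.dirname(config_folder))
    return name


direct = infer_project_name("/data/projects/ngs/project_config.yaml")
nested = infer_project_name("/data/projects/ngs/metadata/project_config.yaml")
```

Both paths in the example resolve to the same project folder, so `direct` and `nested` hold the same name.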
def list_amendments(self)

Return a list of available amendments or None if not declared

Returns:

  • Iterable[str]: a list of available amendment names
def load_samples(self)
def modify_samples(self)
def parse_config_file(self, cfg_path, amendments=None)

Parse the provided YAML config file and check that required fields exist.

Parameters:

  • cfg_path (str): path to the config file to read and parse
  • amendments (Iterable[str]): Names of amendments to activate

Raises:

  • KeyError: if config file lacks required section(s)
def sample_name_colname(self)

Name of the column in the sample table that effectively contains the sample names.

It is "sample_name" by default, but when that column is missing it can be replaced by the selected sample table index, defined at object instantiation.

Returns:

  • str: name of the column that consists of sample identifiers
def sample_table(self)

Get sample table. If any sample edits were performed, it will be regenerated.

Returns:

  • pandas.DataFrame: a data frame with current sample attributes
def samples(self)

Generic/base Sample instance for each of this Project's samples.

Returns:

  • Iterable[Sample]: Sample instance for each of this Project's samples
def subsample_table(self)

Get subsample table

Returns:

  • pandas.DataFrame: a data frame with subsample attributes

Class Sample

Class to model Samples based on a pandas Series.

Parameters:

  • series (Mapping | pandas.core.series.Series): Sample's data.
def __init__(self, series, prj=None)

Initialize self. See help(type(self)) for accurate signature.

def copy(self)

Copy self to a new object.

def derive_attribute(self, data_sources, attr_name)

Uses the template path provided in the project config section "data_sources" to piece together an actual path by substituting variables (encoded by "{variable}") with sample attributes.

Parameters:

  • data_sources (Mapping): mapping from key name (as a value in a cell of a tabular data structure) to, e.g., filepath
  • attr_name (str): Name of sample attribute (equivalently, sample sheet column) specifying a derived column.

Returns:

  • str: regex expansion of data source specified in configuration, with variable substitutions made

Raises:

  • ValueError: if argument to data_sources parameter is null/empty
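
The "{variable}" substitution described above can be sketched with Python's built-in str.format. This is an illustration of the template idea only: `expand_data_source`, the sample record, and the template are hypothetical, and the sketch performs no further expansion of the resulting path.

```python
def expand_data_source(data_sources, sample, attr_name):
    """Fill a data source template with the sample's own attribute values."""
    if not data_sources:
        raise ValueError("data_sources must be a non-empty mapping")
    source_key = sample[attr_name]  # the cell value names a data source
    template = data_sources[source_key]  # e.g. "/data/{sample_name}.fastq.gz"
    return template.format(**sample)  # substitute "{variable}" placeholders


# Hypothetical sample and data source template used only for this sketch.
sample = {"sample_name": "frog_1", "file": "local_files"}
data_sources = {"local_files": "/data/{sample_name}.fastq.gz"}
path = expand_data_source(data_sources, sample, "file")
```

Here the "file" attribute starts out holding the key "local_files" and, once derived, would hold the filled-in path stored in `path`.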
def get_sheet_dict(self)

Create key-value pairs for items originally passed in via the sample sheet. This is useful for summarizing; it provides a representation of the sample that excludes things like config files and derived entries.

Returns:

  • OrderedDict: mapping from name to value for data elements originally provided via the sample sheet (i.e., a map-like representation of the instance, excluding derived items)
def project(self)

Get the project mapping

Returns:

  • peppy.Project: project object the sample was created from
def to_dict(self, add_prj_ref=False)

Serializes itself as dict object.

Parameters:

  • add_prj_ref (bool): whether the project reference bound to the Sample object should be included in the YAML representation

Returns:

  • dict: dict representation of this Sample
def to_yaml(self, path, add_prj_ref=False)

Serializes itself in YAML format.

Parameters:

  • path (str): A file path to write YAML to; provide this or the subs_folder_path
  • add_prj_ref (bool): whether the project reference bound to the Sample object should be included in the YAML representation

Class PeppyError

Base error type for peppy custom errors.

def __init__(self, msg)

Initialize self. See help(type(self)) for accurate signature.

Version Information: peppy v0.31.1, generated by lucidoc v0.4.2