API#

Accessor#

From cf-xarray.

class cf_pandas.accessor.CFAccessor(pandas_obj)[source]#

Bases: object

Dataframe accessor analogous to cf-xarray accessor.

Attributes:
axes

Property that returns a dictionary mapping valid Axis standard names for .cf[] to variable names.

axes_cols

Property that returns a list of column names from the axes mapping.

coordinates

Property that returns a dictionary mapping valid Coordinate standard names for .cf[] to variable names.

coordinates_cols

Property that returns a list of column names from the coordinates mapping.

custom_keys

Returns a dictionary mapping criteria keys to variable names.

standard_names

Returns a dictionary mapping standard_names to variable names, if there is a match.

Methods

keys()

Utility function that returns valid keys for .cf[].

property axes: Dict[str, List[str]]#

Property that returns a dictionary mapping valid Axis standard names for .cf[] to variable names.

This is useful for checking whether a key is valid for indexing, i.e. that the attributes necessary to allow indexing by that key exist. It will return the Axis names ("X", "Y", "Z", "T") present in .columns.

Returns:

Dictionary with keys that can be used with __getitem__ or as .cf[key]. Keys will be the appropriate subset of (“X”, “Y”, “Z”, “T”). Values are lists of variable names that match that particular key.

Return type:

dict
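As a rough illustration of the kind of mapping this property returns, the following sketch matches column names against one pattern per axis and keeps only the axes that have a match. The `AXIS_PATTERNS` table and the `axes_mapping` helper are hypothetical and simplified; the criteria cf-pandas actually uses may differ.

```python
import re
from typing import Dict, List

# Hypothetical, simplified patterns; not the criteria cf-pandas ships with.
AXIS_PATTERNS = {
    "X": r"^(x|lon|longitude)$",
    "Y": r"^(y|lat|latitude)$",
    "Z": r"^(z|depth|height)$",
    "T": r"^(t|time)$",
}


def axes_mapping(columns: List[str]) -> Dict[str, List[str]]:
    """Map each axis name to the matching columns, skipping axes with no match."""
    mapping: Dict[str, List[str]] = {}
    for axis, pattern in AXIS_PATTERNS.items():
        matches = [c for c in columns if re.search(pattern, c, flags=re.IGNORECASE)]
        if matches:
            mapping[axis] = matches
    return mapping


columns = ["time", "lon", "lat", "temp"]
axes = axes_mapping(columns)
print(axes)  # {'X': ['lon'], 'Y': ['lat'], 'T': ['time']}
```

Only axes with at least one matching column appear as keys, so `"Z" in axes` is a quick check for whether indexing by "Z" could succeed.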

property axes_cols: List[str]#

Property that returns a list of column names from the axes mapping.

Returns:

Variable names that are the column names which represent axes.

Return type:

list

property coordinates: Dict[str, List[str]]#

Property that returns a dictionary mapping valid Coordinate standard names for .cf[] to variable names.

This is useful for checking whether a key is valid for indexing, i.e. that the attributes necessary to allow indexing by that key exist. It will return the Coordinate names ("latitude", "longitude", "vertical", "time") present in .columns.

Returns:

Dictionary of valid Coordinate names that can be used with __getitem__ or .cf[key]. Keys will be the appropriate subset of ("latitude", "longitude", "vertical", "time"). Values are lists of variable names that match that particular key.

Return type:

dict

property coordinates_cols: List[str]#

Property that returns a list of column names from the coordinates mapping.

Returns:

Variable names that are the column names which represent coordinates.

Return type:

list

property custom_keys#

Returns a dictionary mapping criteria keys to variable names.

Returns:

Dictionary mapping criteria keys to variable names.

Return type:

dict

Notes

This must be used with the context-manager form of providing custom_criteria.

keys() Set[str][source]#

Utility function that returns valid keys for .cf[].

This is useful for checking whether a key is valid for indexing, i.e. that the attributes necessary to allow indexing by that key exist.

Returns:

Set of valid key names that can be used with __getitem__ or .cf[key].

Return type:

set

property standard_names#

Returns a dictionary mapping standard_names to variable names, if there is a match. Compares with all cf-standard names.

Returns:

Dictionary mapping standard_names to variable names.

Return type:

dict

Notes

This is not the same as the cf-xarray accessor method of the same name, which searches for variables with standard_name attributes and surfaces those values to map to the variable name.

cf_pandas.accessor.apply_mapper(mappers: Union[Callable[[DataFrame, str], List[str]], Tuple[Callable[[DataFrame, str], List[str]], ...]], obj: DataFrame, key: Hashable, error: bool = True, default: Optional[Any] = None) List[Any][source]#

Applies a mapping function; does error handling / returning defaults. Expects the mapper function to raise an error if passed a bad key. It should return a list in all other cases including when there are no results for a good key.
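The contract described above (a mapper raises on a bad key and returns a list, possibly empty, for a good key) can be sketched as follows. This is a simplified stand-in, not the library's implementation: `apply_mapper_sketch` and `axis_mapper` are hypothetical names, and a plain list of column names replaces the DataFrame.

```python
from typing import Any, Callable, Hashable, List, Optional, Sequence

Mapper = Callable[[Sequence[str], Hashable], List[str]]


def apply_mapper_sketch(
    mappers: Sequence[Mapper],
    columns: Sequence[str],
    key: Hashable,
    error: bool = True,
    default: Optional[Any] = None,
) -> List[Any]:
    """Run each mapper; re-raise on a bad key if `error`, else fall back to `default`."""
    fallback = [] if default is None else [default]
    results: List[Any] = []
    for mapper in mappers:
        try:
            found = mapper(columns, key)
        except KeyError:
            if error:
                raise
            found = []
        for name in found:
            if name not in results:  # keep order, drop duplicates
                results.append(name)
    return results if results else fallback


def axis_mapper(columns: Sequence[str], key: Hashable) -> List[str]:
    """Hypothetical mapper: raises on unknown keys, returns a list otherwise."""
    valid = {"T": ["time"], "X": ["lon"]}
    if key not in valid:
        raise KeyError(key)
    return [c for c in columns if c in valid[key]]


cols = ["time", "temp"]
found = apply_mapper_sketch([axis_mapper], cols, "T")
empty = apply_mapper_sketch([axis_mapper], cols, "X")
fallback = apply_mapper_sketch([axis_mapper], cols, "bad", error=False, default="temp")
print(found, empty, fallback)  # ['time'] [] ['temp']
```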

cf-pandas utilities#

Utilities for cf-pandas.

cf_pandas.utils.always_iterable(obj: Any, allowed=(tuple, list, set, dict)) Iterable[source]#

This is from cf-xarray.
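Judging from the signature, the function presumably wraps anything that is not one of the allowed container types in a list. A minimal sketch of that behavior, under that assumption and with the hypothetical name `always_iterable_sketch`:

```python
from typing import Any, Iterable


def always_iterable_sketch(obj: Any, allowed=(tuple, list, set, dict)) -> Iterable:
    """Return `obj` unchanged if it is an allowed container, else wrap it in a list."""
    return obj if isinstance(obj, allowed) else [obj]


wrapped = always_iterable_sketch("temp")
unchanged = always_iterable_sketch(["temp", "salt"])
print(wrapped, unchanged)  # ['temp'] ['temp', 'salt']
```

Strings are deliberately not in `allowed`, so a single string key is treated as one item rather than iterated character by character.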

cf_pandas.utils.astype(value, type_)[source]#

Return value as type type_. Written in particular so that a string, PosixPath, or Timestamp is returned correctly as a list.

cf_pandas.utils.match_criteria_key(available_values: list, keys_to_match: Union[str, list], criteria: Optional[dict] = None, split: bool = False) list[source]#

Use criteria to choose matches to key(s) from available_values.

Parameters:
  • available_values (list) – String or list of strings to compare against list of category values. They should be keys in criteria.

  • keys_to_match (str, list) – Key(s) from criteria to match with available_values.

  • criteria (dict, optional) – Criteria to use to map from variable to attributes describing the variable. If user has defined custom_criteria, this will be used by default.

  • split (bool, optional) – If split is True, split the available_values by white space before performing matches. This is helpful e.g. when column headers have the form “standard_name (units)” and you want to match standard_name.

Returns:

Values from available_values that match keys_to_match, according to criteria.

Return type:

list

Notes

This uses logic from cf-xarray.
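The effect of split can be illustrated with a small stand-alone sketch. It assumes criteria are nested as `{key: {attribute: regex}}` and that a match returns the original, unsplit value; `match_criteria_key_sketch` is a hypothetical simplification, not the library function.

```python
import re
from typing import Dict, List, Union


def match_criteria_key_sketch(
    available_values: List[str],
    keys_to_match: Union[str, List[str]],
    criteria: Dict[str, Dict[str, str]],
    split: bool = False,
) -> List[str]:
    """Return the available values that match any pattern stored under the keys."""
    if isinstance(keys_to_match, str):
        keys_to_match = [keys_to_match]
    results: List[str] = []
    for key in keys_to_match:
        for pattern in criteria.get(key, {}).values():
            for value in available_values:
                # With split=True each whitespace-separated piece is tested on its own.
                candidates = value.split() if split else [value]
                if any(re.match(pattern, c) for c in candidates) and value not in results:
                    results.append(value)
    return results


criteria = {"temp": {"standard_name": "sea_water_temperature$"}}
columns = ["sea_water_temperature (degC)", "sea_water_salinity (psu)"]

no_split = match_criteria_key_sketch(columns, "temp", criteria)
with_split = match_criteria_key_sketch(columns, "temp", criteria, split=True)
print(no_split)    # []
print(with_split)  # ['sea_water_temperature (degC)']
```

Without splitting, the trailing "(degC)" prevents the anchored pattern from matching; with splitting, the first piece of the header matches on its own.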

cf_pandas.utils.set_up_criteria(criteria: Optional[Union[dict, Iterable]] = None) ChainMap[source]#

Get custom criteria from options.

Parameters:

criteria (dict, optional) – Criteria to use to map from variable to attributes describing the variable. If user has defined custom_criteria, this will be used by default.

Returns:

Criteria

Return type:

ChainMap
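For readers unfamiliar with ChainMap, the sketch below shows how a single dictionary or several dictionaries can be combined into one lookup. It is a simplification under stated assumptions: `set_up_criteria_sketch` is a hypothetical name, and it returns an empty ChainMap when no criteria are given instead of reading them from the package options.

```python
from collections import ChainMap
from typing import Iterable, Optional, Union


def set_up_criteria_sketch(criteria: Optional[Union[dict, Iterable[dict]]] = None) -> ChainMap:
    """Combine one dict or several dicts of criteria into a single ChainMap."""
    if criteria is None:
        return ChainMap()  # the real function falls back to user options instead
    if isinstance(criteria, dict):
        return ChainMap(criteria)
    return ChainMap(*criteria)


first = {"temp": {"standard_name": "sea_water_temperature"}}
second = {"temp": {"name": "temp"}, "salt": {"standard_name": "sea_water_salinity"}}

combined = set_up_criteria_sketch([first, second])
print(sorted(combined))   # ['salt', 'temp']
print(combined["temp"])   # {'standard_name': 'sea_water_temperature'}
```

When the same key appears in more than one dictionary, the ChainMap returns the entry from the earliest dictionary in the sequence.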

cf_pandas.utils.standard_names()[source]#

Returns list of CF standard_names.

Returns:

All CF standard_names

Return type:

list

Reg class for writing regular expressions#

Class for writing regular expressions.

class cf_pandas.reg.Reg(exclude: Optional[Union[List[str], str]] = None, exclude_start: Optional[Union[List[str], str]] = None, exclude_end: Optional[Union[List[str], str]] = None, include: Optional[Union[List[str], str]] = None, include_or: Optional[Union[List[str], str]] = None, include_exact: Optional[str] = None, include_start: Optional[str] = None, include_end: Optional[str] = None, ignore_case: bool = True)[source]#

Bases: object

Class to write a regular expression.

Notes

  • Input strings are never allowed to be empty.

  • Special characters must be escaped and the string given as a raw string, e.g., r"\[celsius\]" to match "[celsius]".

  • The exclude options are logical “or”.

  • The include option is logical “and”, include_or is logical “or”, and the other include_ options allow for only one selection. If you want to use more than one include_start for example, you should make an additional regular expression.

Methods

check()

Check to make sure selected options are compatible.

exclude(string)

Exclude string from anywhere in matches.

exclude_end(string)

Exclude string from end of matches.

exclude_start(string)

Exclude string from start of matches.

include(string)

String must be present anywhere in matches, logical "and".

include_end(string)

String must be present at the end of matches.

include_exact(string)

String must match exactly.

include_or(string)

String must be present anywhere in matches, logical "or".

include_start(string)

String must be present at the start of matches.

pattern()

Generate regular expression pattern from user rules.

check()[source]#

Check to make sure selected options are compatible.

exclude(string: Union[str, list])[source]#

Exclude string from anywhere in matches.

Parameters:

string (str, list) – Matches with regular expression pattern will not contain string(s).

Notes

As a list of strings, this acts as a logical “or” for the exclusions.

exclude_end(string: Union[str, list])[source]#

Exclude string from end of matches.

Parameters:

string (str, list) – Matches with regular expression pattern will not end with string(s).

Notes

As a list of strings, this acts as a logical “or” for the exclusions.

exclude_start(string: Union[str, list])[source]#

Exclude string from start of matches.

Parameters:

string (str, list) – Matches with regular expression pattern will not start with string(s).

Notes

As a list of strings, this acts as a logical “or” for the exclusions.

include(string: Union[str, list])[source]#

String must be present anywhere in matches, logical “and”.

Parameters:

string (str, list) – Matches with regular expression pattern will contain all string(s).

Notes

A list of strings will be treated as a logical “and”.

include_end(string: str)[source]#

String must be present at the end of matches.

Parameters:

string (str) – Matches with regular expression pattern will end with string.

include_exact(string: str)[source]#

String must match exactly.

Parameters:

string (str) – A match with regular expression pattern will be exactly string.

include_or(string: Union[str, list])[source]#

String must be present anywhere in matches, logical “or”.

Parameters:

string (str, list) – Matches with regular expression pattern will contain at least one of string(s).

Notes

A list of strings will be treated as a logical “or”.

include_start(string: str)[source]#

String must be present at the start of matches.

Parameters:

string (str) – Matches with regular expression pattern will start with string.

pattern() str[source]#

Generate regular expression pattern from user rules.

Returns:

Regular expression accounting for all input selections.

Return type:

str
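One way to express rules like these in a single regular expression is with lookahead assertions. The sketch below is an illustration only: `build_pattern` is a hypothetical helper, and the pattern that Reg.pattern() actually generates may be structured differently. Inputs are treated as regex fragments, so special characters must already be escaped by the caller.

```python
import re
from typing import List, Optional


def build_pattern(
    include: Optional[List[str]] = None,
    include_or: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
    include_start: Optional[str] = None,
    include_end: Optional[str] = None,
    ignore_case: bool = True,
) -> str:
    """Assemble one regular expression from include/exclude rules using lookaheads."""
    parts = ["(?i)"] if ignore_case else []
    parts.append("^")
    if exclude:  # none of these may appear anywhere (logical "or" of exclusions)
        parts.append("(?!.*(?:" + "|".join(exclude) + "))")
    for fragment in include or []:  # every one of these must appear (logical "and")
        parts.append("(?=.*" + fragment + ")")
    if include_or:  # at least one of these must appear (logical "or")
        parts.append("(?=.*(?:" + "|".join(include_or) + "))")
    if include_start:
        parts.append("(?=" + include_start + ")")
    if include_end:
        parts.append("(?=.*" + include_end + "$)")
    parts.append(".*$")
    return "".join(parts)


pattern = build_pattern(include=["temperature"], exclude=["air"], include_start="sea")
names = ["sea_water_temperature", "air_temperature", "sea_water_salinity", "Sea_Surface_Temperature"]
matches = [name for name in names if re.search(pattern, name)]
print(matches)  # ['sea_water_temperature', 'Sea_Surface_Temperature']
```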

cf_pandas.reg.joinpat(regs: Sequence[Reg]) str[source]#

Join patterns from Reg objects.

Parameters:

regs (Sequence) – Reg objects from which .pattern() will be used.

Returns:

Regular expression patterns from regs joined together with “|”

Return type:

str
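Joining with “|” yields a pattern that matches whenever any one of the individual patterns matches. A minimal sketch, assuming plain pattern strings in place of Reg objects (`joinpat_sketch` is a hypothetical name):

```python
import re
from typing import List, Sequence


def joinpat_sketch(patterns: Sequence[str]) -> str:
    """Join individual regular expression patterns into one alternation."""
    return "|".join(patterns)


joined = joinpat_sketch(["^temp$", "^salt$"])
names: List[str] = ["temp", "salt", "time"]
hits = [name for name in names if re.search(joined, name)]
print(joined)  # ^temp$|^salt$
print(hits)    # ['temp', 'salt']
```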

Vocab class for handling custom variable-selection vocabularies#

Class for creating and working with vocabularies.

class cf_pandas.vocab.Vocab(openname: Optional[str] = None)[source]#

Bases: object

Class to handle vocabularies.

Methods

add(other_vocab, method)

Add two Vocab objects together...

make_entry(nickname, expressions[, attr])

Make an entry for vocab.

open_file(openname)

Open previously-saved vocab.

save(savename)

Save to file.

add(other_vocab: Union[DefaultDict[str, Dict[str, str]], Vocab], method: str) Vocab[source]#

Add two Vocab objects together by adding their .vocab attributes together. Expressions are piped together but otherwise not changed. This is used for both __add__ and __iadd__.

Parameters:
  • other_vocab (Vocab) – Other Vocab object to combine with.

  • method (str) – Whether to run as “add” which returns a new Vocab object or “iadd” which adds to the original object.

Returns:

vocab + other_vocab either as a new object or in place.

Return type:

Vocab
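The piping of expressions can be pictured with plain nested dictionaries of the form `{nickname: {attribute: expression}}`. This is a sketch of the "add" behavior under that assumption; `add_vocabs` is a hypothetical helper and not the method itself.

```python
from collections import defaultdict
from typing import DefaultDict, Dict

VocabDict = Dict[str, Dict[str, str]]


def add_vocabs(left: VocabDict, right: VocabDict) -> DefaultDict[str, Dict[str, str]]:
    """Return a new vocabulary; expressions sharing a nickname and attribute are piped."""
    merged: DefaultDict[str, Dict[str, str]] = defaultdict(dict)
    for source in (left, right):
        for nickname, entry in source.items():
            for attr, expression in entry.items():
                if attr in merged[nickname]:
                    merged[nickname][attr] += "|" + expression
                else:
                    merged[nickname][attr] = expression
    return merged


a = {"temp": {"standard_name": "sea_water_temperature"}}
b = {"temp": {"standard_name": "air_temperature"}, "salt": {"name": "sal"}}

combined = add_vocabs(a, b)
print(dict(combined))
# {'temp': {'standard_name': 'sea_water_temperature|air_temperature'}, 'salt': {'name': 'sal'}}
```

Because the result is built in a fresh dictionary, neither input is modified, which mirrors the “add” method; an “iadd” variant would update the first vocabulary in place instead.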

make_entry(nickname: str, expressions: Union[str, list], attr: str = 'standard_name')[source]#

Make an entry for vocab.

Parameters:
  • nickname (str) – The nickname to call the variable being represented in this entry.

  • expressions (str, list) – Regular expression(s) to use to select out the variable in a regex match. Multiple expressions input in a list are piped together to create one str of expressions.

  • attr (str) – What attribute to identify the regular expressions with. Default is “standard_name”, but other reasonable options are any variable attributes in a netcdf file such as “units”, “name”, and “long_name”.

Examples

The following creates an entry in the vocabulary stored in vocab.vocab. It doesn’t print the entry but it has been pasted in below the example to show what it looks like.

>>> import cf_pandas as cfp
>>> vocab = cfp.Vocab()
>>> vocab.make_entry("temp", ["a","b"], attr="name")
{'temp': {'name': 'a|b'}}

open_file(openname: Union[str, PurePath])[source]#

Open previously-saved vocab.

Parameters:

openname (str, PurePath) – Where to find vocab to open.

save(savename: Union[str, PurePath])[source]#

Save to file.

Parameters:

savename (str, PurePath) – Filename to save to.

cf_pandas.vocab.merge(vocabs: Sequence[Vocab]) Vocab[source]#

Add together multiple Vocab objects.

Parameters:

vocabs (Sequence[Vocab]) – Sequence of Vocab objects to merge.

Returns:

Single Vocab object made up of input vocabs.

Return type:

Vocab

widget class for easy human selection of variables to exactly match#

Widget

class cf_pandas.widget.Selector(options: Sequence, vocab: Optional[Vocab] = None, nickname_in: str = '', include_in: str = '', exclude_in: str = '')[source]#

Bases: object

Coordinates interaction with dropdown widget to make simple vocabularies.

Options are filtered by a regular expression written to reflect the include and exclude inputs; the dropdown updates whenever these inputs change. To select multiple options, hold command or control while clicking. Push the “save” button once the nickname and the options selected in the dropdown are the variables you want to match exactly in a future regular expression search.

Examples

Show widget with a short list of options. Input a nickname and press button to save an entry to the running vocabulary in the object:

>>> import cf_pandas as cfp
>>> sel = cfp.Selector(options=["var1", "var2", "var3"])
>>> sel

See resulting vocabulary with:

>>> sel.vocab

Methods

button_pressed(*args)

Saves a new entry in the catalog when button is pressed.

button_pressed(*args)[source]#

Saves a new entry in the catalog when button is pressed.

cf_pandas.widget.dropdown(nickname: str, options: Union[Sequence, Series], include: str = '', exclude: str = '')[source]#

Makes the widget that is used by the Selector class.

Options are filtered by a regular expression written to reflect the include and exclude inputs; the dropdown updates whenever these inputs change. To select multiple options, hold command or control while clicking. Push the “save” button once the nickname and the options selected in the dropdown are the variables you want to match exactly in a future regular expression search.

Parameters:
  • nickname (str) – nickname to associate with the Vocab class vocabulary entry from this, e.g., “temp”. Inputting this to the function creates a text box for the user to enter it into.

  • options (Sequence) – strings to select from in the dropdown widget. Will be filtered by include and exclude inputs.

  • include (str) – include must be in options values for them to show in the dropdown. Will update as more are input. To input more than one, join separate strings with “|”. For example, to search on both “temperature” and “sea_water”, input “temperature|sea_water”.

  • exclude (str) – exclude must not be in options values for them to show in the dropdown. Will update as more are input. To input more than one, join separate strings with “|”. For example, to exclude both “temperature” and “sea_water”, input “temperature|sea_water”.
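The include and exclude filtering described in the parameters above can be sketched without the widget itself. `filter_options` is a hypothetical helper that treats both inputs as regular expressions in which “|” separates alternatives; the widget's own filtering may differ in detail.

```python
import re
from typing import List, Sequence


def filter_options(options: Sequence[str], include: str = "", exclude: str = "") -> List[str]:
    """Keep options that match `include` (if given) and do not match `exclude` (if given)."""
    kept = []
    for option in options:
        if include and not re.search(include, option):
            continue
        if exclude and re.search(exclude, option):
            continue
        kept.append(option)
    return kept


options = ["sea_water_temperature", "air_temperature", "sea_water_salinity", "wind_speed"]
shown = filter_options(options, include="temperature|sea_water", exclude="air")
print(shown)  # ['sea_water_temperature', 'sea_water_salinity']
```

An empty include or exclude string leaves that side of the filter switched off, so calling the helper with neither argument returns every option.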