openest.models.ddp_model module
-
class openest.models.ddp_model.DDPModel(p_format=None, source=None, xx_is_categorical=False, xx=None, yy_is_categorical=False, yy=None, pp=None, unaccounted=None, scaled=True)
Bases: openest.models.univariate_model.UnivariateModel, openest.models.memoizable.MemoizableUnivariate

Discrete-Discrete-Probability (DDP) Format
A DDP file describes a dose-response relationship with a limited collection of response outcomes. The dose and response values may be either categorical or sampled at a collection of numerical levels.
<y-value-1>, …, <y-value-N> and <x-value-1>, …, <x-value-N> are either strings (for named categories) or numerical values.

The format of a DDP file is:
    <format>,<y-value-1>,<y-value-2>,...
    <x-value-1>,p(y1|x1),p(y2|x1),...
    <x-value-2>,p(y1|x2),p(y2|x2),...
Below is a sample categorical DDP file:
    ddp1,live,dead
    control,.5,.5
    treated,.9,.1
Below is a sample numerical DDP file:
    ddp1,-10.0,-.33333333333,3.33333333333,10.0
    0.0,0.5,0.5,0.0,0.0
    13.3333333333,0.0,0.5,0.5,0.0
    26.6666666667,0.0,0.0,0.5,0.5
    40.0,0.0,0.0,0.0,0.5
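The layout described above can be read with Python's standard csv module. The following is a minimal sketch and not openest's own reader: the helper names `parse_ddp` and `_coerce` are hypothetical, and treating an axis as numeric only when every label converts to a float is an assumption.

```python
import csv
import io


def _coerce(labels):
    """Return floats if every label is numeric, else the stripped strings."""
    try:
        return [float(label) for label in labels]
    except ValueError:
        return [label.strip() for label in labels]


def parse_ddp(text):
    """Split DDP text into (p_format, yy, xx, pp). Hypothetical helper."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    header, body = rows[0], rows[1:]
    p_format = header[0]                      # e.g. "ddp1"
    yy = _coerce(header[1:])                  # response (column) labels
    xx = _coerce([row[0] for row in body])    # dose (row) labels
    pp = [[float(cell) for cell in row[1:]] for row in body]
    return p_format, yy, xx, pp


sample = "ddp1,live,dead\ncontrol,.5,.5\ntreated,.9,.1\n"
p_format, yy, xx, pp = parse_ddp(sample)
print(p_format, yy, xx, pp)
```

Each row of `pp` then holds the values p(y|x) for one x label, in the same order as `yy`.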
Parameters:
- p_format (str) – Probability format. May be one of the following values:
  - ddp1 - the p(.) values are simple probabilities (0 ≤ p(.) ≤ 1 and sum p(y|x) = 1)
  - ddp2 - the p(.) values are log probabilities
- source (str) – Metadata attribute. Name of the file this object was read from.
- xx_is_categorical (bool) – Indicates whether xx is categorical. False indicates numeric data.
- xx (list-like) – X axis index
- yy_is_categorical (bool) – Indicates whether yy is categorical. False indicates numeric data.
- yy (list-like) – Y axis index
- pp (array-like) – underlying numpy(?) data array
- unaccounted (numpy.array) – column of remaining probability. unaccounted = 1-sum(pp, axis=1).
- scaled (bool) – Indicates whether data has been scaled. If scaled, re-scale so pp.sum(axis=1)==1.
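To make the `unaccounted` column concrete, the sketch below computes it for the numerical sample file shown earlier. It uses plain Python lists rather than a numpy array, so it only illustrates the arithmetic and says nothing about how DDPModel stores or rescales the data.

```python
# Rows of p(y|x) taken from the numerical sample file.
pp = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 0.5],
]

# Plain-Python equivalent of 1 - pp.sum(axis=1) on a numpy array.
unaccounted = [1.0 - sum(row) for row in pp]
print(unaccounted)  # [0.0, 0.0, 0.0, 0.5]
```

Only the last row leaves probability mass unassigned, because its listed values sum to 0.5.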
-
static create_lin(yy, xxs)
Create a DDP model by supplying a y index and a dictionary of p-values.

Parameters:
- yy (list-like) – y-index labels
- xxs (dict) – dictionary keyed by x-index values, each mapping to the p-values for that row
-
draw_sample(x=None)
Randomly sample a label from the y-index using the p values in row x.

If x is None (default), use the first row. Uses self.get_closest(x) to find the nearest match for x-index label x.
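The sampling step can be illustrated with the standard library. This is a sketch of weighted label sampling in general, not the DDPModel implementation; `draw_label` is a hypothetical name, and the row is assumed to already hold plain (ddp1-style) probabilities.

```python
import random


def draw_label(yy, row, rng=random):
    """Pick one y label with probability proportional to its weight in row."""
    return rng.choices(yy, weights=row, k=1)[0]


yy = ["live", "dead"]
treated = [0.9, 0.1]          # the "treated" row of the categorical sample

rng = random.Random(0)        # seeded generator for repeatable draws
draws = [draw_label(yy, treated, rng) for _ in range(1000)]
print(draws.count("live") / len(draws))
```

With 1000 draws the printed fraction should be close to 0.9.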
-
eval_pval(x, p, threshold=0.001)
Inverse CDF evaluation.

Returns the value of $y$ that corresponds to a given p-value: $F^{-1}(p | x)$.
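For a discrete distribution, one common reading of $F^{-1}(p | x)$ is the smallest y whose cumulative probability reaches p. The sketch below implements that reading for a single row; it is an assumption about the definition, it does not model the `threshold` argument, and `inverse_cdf` is a hypothetical name.

```python
def inverse_cdf(yy, row, p):
    """Return the first y (yy ascending) whose cumulative probability >= p."""
    cumulative = 0.0
    for y, prob in zip(yy, row):
        cumulative += prob
        if cumulative >= p:
            return y
    # p exceeds the total mass in the row: fall back to the largest y.
    return yy[-1]


yy = [-10.0, -0.33333333333, 3.33333333333, 10.0]   # header of the numerical sample
row = [0.0, 0.5, 0.5, 0.0]                          # row for x = 13.3333333333

print(inverse_cdf(yy, row, 0.25))  # -0.33333333333
print(inverse_cdf(yy, row, 0.75))  # 3.33333333333
```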
-
get_closest(x=None)
Return the closest index on the x axis.

If the x index is categorical, coerce x to a string and find the first matching index. If numeric, find the closest value.

If x is None (default), return 0.
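The lookup rule described above can be sketched as a standalone function. `closest_index` is a hypothetical name, and raising ValueError for an unknown category is an assumption about behaviour that the documentation does not specify.

```python
def closest_index(xx, x=None, categorical=False):
    """Return the row index on the x axis that best matches x."""
    if x is None:
        return 0
    if categorical:
        # First exact match after coercing to str; ValueError if absent.
        return [str(label) for label in xx].index(str(x))
    # Numeric axis: index of the value with the smallest absolute distance.
    return min(range(len(xx)), key=lambda i: abs(xx[i] - x))


print(closest_index([0.0, 13.3333333333, 26.6666666667, 40.0], 15))        # 1
print(closest_index(["control", "treated"], "treated", categorical=True))  # 1
print(closest_index(["control", "treated"]))                               # 0
```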
-
get_mean(x=None)
Returns the mean of the y-index labels, weighted by the p values in row x.

If x is None (default), use the first row. Uses self.get_closest(x) to find the nearest match for x-index label x.
-
get_sdev(x=None)
Returns the standard deviation of the y-index labels, weighted by the p values in row x.

If x is None (default), use the first row. Uses self.get_closest(x) to find the nearest match for x-index label x.
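The weighted mean and standard deviation of a row can be worked through by hand as follows. This is a sketch of the standard formulas, not the library's code: `weighted_mean_sdev` is a hypothetical name, and dividing by the row total (so that rows with unaccounted mass are renormalised) is an assumption.

```python
import math


def weighted_mean_sdev(yy, row):
    """Mean and standard deviation of numeric yy, weighted by row."""
    total = sum(row)
    mean = sum(y * p for y, p in zip(yy, row)) / total
    variance = sum(p * (y - mean) ** 2 for y, p in zip(yy, row)) / total
    return mean, math.sqrt(variance)


yy = [-10.0, -0.33333333333, 3.33333333333, 10.0]   # header of the numerical sample
row = [0.0, 0.5, 0.5, 0.0]                          # row for x = 13.3333333333

mean, sdev = weighted_mean_sdev(yy, row)
print(mean, sdev)
```

For this row the two equally weighted labels give a mean of about 1.5 and a standard deviation of about 1.83.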
-
interpolate_x(newxx, kind='quadratic')
Custom interpolation method. Wrapper around scipy.interpolate.interp1d.

Parameters:
- newxx (list-like) – new x axis
- kind (str) – interpolation method, passed to scipy.interpolate.interp1d
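The effect of resampling the x axis can be sketched directly with scipy.interpolate.interp1d. This shows only the underlying SciPy call on the numerical sample data; `interpolate_rows` is a hypothetical helper, and whether DDPModel clips or renormalises the interpolated rows afterwards is not stated in the documentation.

```python
import numpy as np
from scipy.interpolate import interp1d

xx = np.array([0.0, 13.3333333333, 26.6666666667, 40.0])
pp = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 0.5],
])


def interpolate_rows(xx, pp, newxx, kind="quadratic"):
    """Interpolate each column of pp along the x axis onto newxx."""
    f = interp1d(xx, pp, kind=kind, axis=0)
    return f(np.asarray(newxx, dtype=float))


new_pp = interpolate_rows(xx, pp, [0.0, 10.0, 20.0, 30.0, 40.0])
print(new_pp.shape)  # (5, 4)
```

Higher-order kinds such as 'quadratic' can produce values slightly outside [0, 1] between the original rows, so the result may need clipping before it is treated as probabilities.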