Detectors
Module of detectors.
A detector detects anomalous time points from time series.
-
class adtk.detector.AutoregressionAD(n_steps=1, step_size=1, regressor=None, c=3.0, side='both')
Detector that detects anomalous autoregression property in time series.
Many time series have autoregressive behavior. For example, in a linearly autoregressive time series, the current value is a linear combination of several previous values. Violation of the usual autoregressive behavior may indicate an anomaly.
The detector applies a regressor to learn the autoregressive property of the time series, and identifies a time point as anomalous when the residual of autoregression is anomalously large.
This detector is internally implemented as a Pipenet object. Advanced users may learn more details by checking attribute pipe_.
- Parameters
n_steps (int, optional) – Number of steps (previous values) to include in the model. Default: 1.
step_size (int, optional) – Length of a step. For example, if n_steps=2, step_size=3, X_[t-3] and X_[t-6] will be used to predict X_[t]. Default: 1.
regressor (object, optional) – Regressor to be used. Same as a scikit-learn regressor, it should minimally have fit and predict methods. If not given, a linear regressor will be used.
c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 3.0.
side (str, optional) –
If “both”, to detect anomalous positive and negative residuals;
If “positive”, to only detect anomalous positive residuals;
If “negative”, to only detect anomalous negative residuals.
Default: “both”.
- Return type
None
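The idea described above can be sketched with the standard library alone. This is a minimal illustration, not ADTK's implementation: it assumes n_steps=1, uses ordinary least squares as the regressor, and bounds the residuals by Q1 - c*IQR and Q3 + c*IQR. The helper names fit_line and ar_residual_anomalies, and the synthetic series, are hypothetical.

```python
from statistics import quantiles


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ar_residual_anomalies(values, step_size=1, c=3.0, side="both"):
    """Return positions whose autoregression residual is outside the IQR bounds."""
    lagged = values[:-step_size]   # X[t - step_size]
    current = values[step_size:]   # X[t]
    slope, intercept = fit_line(lagged, current)
    residuals = [y - (slope * x + intercept) for x, y in zip(lagged, current)]

    # Normal range of residuals, based on their interquartile range.
    q1, _, q3 = quantiles(residuals, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - c * iqr, q3 + c * iqr

    flagged = []
    for i, r in enumerate(residuals):
        too_high = side in ("both", "positive") and r > high
        too_low = side in ("both", "negative") and r < low
        if too_high or too_low:
            flagged.append(i + step_size)  # shift back to a position in values
    return flagged


# Synthetic series: X[t] = 0.9 * X[t-1] + 1 plus small deterministic noise.
series = [0.0]
for t in range(1, 60):
    series.append(0.9 * series[-1] + 1.0 + 0.01 * ((t * 7) % 5 - 2))
series[30] += 5.0  # inject a spike

flagged = ar_residual_anomalies(series)
print(flagged)
```

With side="negative", the positive residual at the spike is ignored, which mirrors the role of the side parameter described above.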
-
pipe_
Internal Pipenet object.
- Type
-
detect(ts, return_list=False)
Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
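The difference between the two return forms can be illustrated with a small standard-library sketch. It is a simplification under stated assumptions, not ADTK's conversion: each label is treated as instantaneous, and consecutive anomalous points are merged into a closed interval (first, last). ADTK's own conversion may treat points as periods and produce different interval endpoints. The function name labels_to_events is hypothetical, and datetime objects stand in for pandas Timestamps.

```python
from datetime import datetime, timedelta


def labels_to_events(timestamps, labels):
    """Convert binary labels into a list of events (single points or closed intervals)."""
    events, run = [], []

    def close_run():
        # A run of one point is an instantaneous event; longer runs become intervals.
        if run:
            events.append(run[0] if len(run) == 1 else (run[0], run[-1]))
            run.clear()

    for ts, is_anomalous in zip(timestamps, labels):
        if is_anomalous:
            run.append(ts)
        else:
            close_run()
    close_run()
    return events


start = datetime(2020, 1, 1)
timestamps = [start + timedelta(hours=h) for h in range(6)]
labels = [False, True, False, True, True, True]

events = labels_to_events(timestamps, labels)
print(events)
```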
-
fit(ts)
Train the detector with given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
- Return type
None
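The Series versus DataFrame behavior of fit and detect can be sketched as follows. This is only an illustration of the dispatch logic: plain lists stand in for a pandas Series, a dict of lists stands in for a DataFrame, and the "detector" is a deliberately trivial min/max rule. The names fit and detect here are free functions written for this sketch, not ADTK's methods.

```python
def fit_one(values):
    """Stand-in univariate training: remember the min and max seen in training."""
    return (min(values), max(values))


def fit(ts):
    """Train one model for a single series, or k independent models for k columns."""
    if isinstance(ts, dict):
        return {name: fit_one(column) for name, column in ts.items()}
    return fit_one(ts)


def detect(model, ts):
    """Apply the trained model(s) to a single series or to each column."""
    def flag(values, bounds):
        low, high = bounds
        return [v < low or v > high for v in values]

    if isinstance(ts, dict):
        # Per-column models are matched by column name; a single model is reused.
        return {
            name: flag(column, model[name] if isinstance(model, dict) else model)
            for name, column in ts.items()
        }
    return flag(ts, model)


train = {"cpu": list(range(10, 19)), "mem": list(range(100, 190, 10))}
test = {"cpu": [14, 40], "mem": [140, 40]}

per_column = detect(fit(train), test)      # k detectors, applied respectively
shared = detect(fit(train["cpu"]), test)   # one detector, applied to every column
print(per_column, shared)
```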
-
fit_detect(ts, return_list=False)
Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit_predict(ts, return_list=False)
Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
get_params()
Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict(ts, return_list=False)
Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
score(ts, anomaly_true, scoring='recall', **kwargs)
Detect anomalies and score the results against true anomalies.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be applied to each series independently.
anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If pandas DataFrame, each column is a binary series and is treated as an independent type of anomaly.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If dict, each key-value pair is a list of events and is treated as an independent type of anomaly.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”.
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score(s) for each type of anomaly.
- Return type
float or dict
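For binary label series, the four scoring options can be sketched point by point as below. This is an assumption-laden illustration rather than the metrics module itself: it compares labels position by position, returns NaN when a score is undefined, and does not cover event lists, whose scoring in ADTK may follow different rules. The function name score_labels is hypothetical.

```python
def score_labels(true_labels, predicted_labels, scoring="recall"):
    """Point-wise recall, precision, F1, or IoU between two binary label sequences."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)       # anomalous and detected
    fp = sum(1 for t, p in pairs if not t and p)   # normal but detected
    fn = sum(1 for t, p in pairs if t and not p)   # anomalous but missed

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    if scoring == "recall":
        return ratio(tp, tp + fn)
    if scoring == "precision":
        return ratio(tp, tp + fp)
    if scoring == "f1":
        return ratio(2 * tp, 2 * tp + fp + fn)
    if scoring == "iou":
        return ratio(tp, tp + fp + fn)
    raise ValueError("scoring must be 'recall', 'precision', 'f1', or 'iou'")


true_labels = [0, 1, 1, 0, 0, 1]
predicted_labels = [0, 1, 0, 0, 1, 1]
scores = {
    name: score_labels(true_labels, predicted_labels, name)
    for name in ("recall", "precision", "f1", "iou")
}
print(scores)
```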
-
set_params(**params)
Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
-
class adtk.detector.CustomizedDetector1D(detect_func, detect_func_params=None, fit_func=None, fit_func_params=None)
Univariate detector derived from a user-given function and parameters.
- Parameters
detect_func (function) –
A function detecting anomalies from univariate time series.
The first input argument must be a pandas Series. Optional input arguments may be accepted through the parameter detect_func_params and the output of fit_func. The output must be a binary pandas Series with the same index as the input.
detect_func_params (dict, optional) – Parameters of detect_func. Default: None.
fit_func (function, optional) –
A function training parameters of detect_func with univariate time series.
The first input argument must be a pandas Series, and optional input arguments may be accepted through the parameter fit_func_params. The output must be a dict that can be used by detect_func as parameters. Default: None.
fit_func_params (dict, optional) – Parameters of fit_func. Default: None.
- Return type
None
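The contract between fit_func and detect_func can be sketched as follows. This is a hypothetical stand-in written to show the data flow only: plain lists replace pandas Series so that the sketch runs with the standard library, and TinyCustomDetector, fit_mean_band, and detect_out_of_band are names invented for illustration, not part of ADTK.

```python
def fit_mean_band(values, width=2.0):
    """fit_func: learn parameters and return them as a dict for detect_func."""
    mean = sum(values) / len(values)
    return {"low": mean - width, "high": mean + width}


def detect_out_of_band(values, low, high):
    """detect_func: return one binary label per input value."""
    return [v < low or v > high for v in values]


class TinyCustomDetector:
    """Minimal imitation of the constructor arguments described above."""

    def __init__(self, detect_func, detect_func_params=None,
                 fit_func=None, fit_func_params=None):
        self.detect_func = detect_func
        self.detect_func_params = detect_func_params or {}
        self.fit_func = fit_func
        self.fit_func_params = fit_func_params or {}
        self._fitted_params = {}

    def fit(self, values):
        if self.fit_func is not None:
            self._fitted_params = self.fit_func(values, **self.fit_func_params)

    def detect(self, values):
        # User-given parameters and fitted parameters are passed together.
        params = {**self.detect_func_params, **self._fitted_params}
        return self.detect_func(values, **params)


detector = TinyCustomDetector(
    detect_func=detect_out_of_band,
    fit_func=fit_mean_band,
    fit_func_params={"width": 2.0},
)
detector.fit([10, 10, 10, 10])
labels = detector.detect([9, 13, 7])
print(labels)
```

With ADTK itself, functions of this shape (operating on pandas Series) would be passed as detect_func and fit_func to CustomizedDetector1D.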
-
detect(ts, return_list=False)
Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit(ts)
Train the detector with given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
- Return type
None
-
fit_detect(ts, return_list=False)
Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit_predict(ts, return_list=False)
Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
get_params()
Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict(ts, return_list=False)
Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
score(ts, anomaly_true, scoring='recall', **kwargs)
Detect anomalies and score the results against true anomalies.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be applied to each series independently.
anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If pandas DataFrame, each column is a binary series and is treated as an independent type of anomaly.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If dict, each key-value pair is a list of events and is treated as an independent type of anomaly.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”.
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score(s) for each type of anomaly.
- Return type
float or dict
-
set_params(**params)
Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
-
class adtk.detector.GeneralizedESDTestAD(alpha=0.05)
Detector that detects anomaly based on generalized ESD test.
This detector performs the generalized extreme Studentized deviate (ESD) test [1, 2] on historical data and identifies normal values vs. outliers for training. For predicting, the detector adds each value in the testing series to the set of normal values from the training series independently, and performs the generalized ESD test on this set (all normal values from the training series, plus one value from the testing series) to evaluate whether this value of interest is an outlier.
Please note that a key assumption of the generalized ESD test is that values follow an approximately normal distribution. Please only use this detector when this assumption holds.
[1] Rosner, Bernard (May 1983), Percentage Points for a Generalized ESD Many-Outlier Procedure, Technometrics, 25(2), pp. 165-172.
[2] https://www.itl.nist.gov/div898/handbook/eda/section3/eda35h3.htm
- Parameters
alpha (float, optional) – Significance level. Default: 0.05.
- Return type
None
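For reference, the test statistic and critical value of the generalized ESD test, as given in the NIST handbook cited above [2], can be written as follows. Here n is the number of values, r is an assumed upper bound on the number of outliers (how the detector chooses it is not stated on this page), and alpha is the significance level.

```latex
R_i = \frac{\max_j \lvert x_j - \bar{x} \rvert}{s},
\qquad
\lambda_i = \frac{(n-i)\, t_{p,\,n-i-1}}
                 {\sqrt{\left(n-i-1+t_{p,\,n-i-1}^{2}\right)(n-i+1)}},
\qquad
p = 1 - \frac{\alpha}{2\,(n-i+1)},
\qquad i = 1, \dots, r
```

At each step i, the mean and the standard deviation s are computed over the values that remain, and the value that maximizes the deviation is removed before the next step. The symbol t with subscripts p and n-i-1 is the 100p percentage point of the t distribution with n-i-1 degrees of freedom. The number of outliers is the largest i for which R_i exceeds lambda_i.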
-
detect(ts, return_list=False)
Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit(ts)
Train the detector with given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
- Return type
None
-
fit_detect(ts, return_list=False)
Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit_predict(ts, return_list=False)
Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
get_params()
Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict(ts, return_list=False)
Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
score(ts, anomaly_true, scoring='recall', **kwargs)
Detect anomalies and score the results against true anomalies.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be applied to each series independently.
anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If pandas DataFrame, each column is a binary series and is treated as an independent type of anomaly.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If dict, each key-value pair is a list of events and is treated as an independent type of anomaly.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”.
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score(s) for each type of anomaly.
- Return type
float or dict
-
set_params(**params)
Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
-
class adtk.detector.InterQuartileRangeAD(c=3.0)
Detector that detects anomaly based on inter-quartile range of historical data.
This detector compares time series values with 1st and 3rd quartiles of historical data, and identifies time points as anomalous when differences are beyond the inter-quartile range (IQR) times a user-given factor c.
- Parameters
c (float, or 2-tuple (float, float), optional) – Factor used to determine the bound of normal range (between Q1-c*IQR and Q3+c*IQR). If a tuple (c1, c2), the factors are for lower and upper bound respectively. Default: 3.0.
- Return type
None
-
abs_low_
The fitted lower bound of normal range.
- Type
float
-
abs_high_
The fitted upper bound of normal range.
- Type
float
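How the factor c turns into the fitted bounds can be sketched as below. This is a minimal standard-library illustration under stated assumptions, not ADTK's code: quartiles are computed with linear interpolation (statistics.quantiles with method="inclusive"), values strictly outside the bounds are flagged, and the class name SimpleIQRDetector is hypothetical. Plain lists stand in for pandas Series.

```python
from statistics import quantiles


class SimpleIQRDetector:
    """Flag values outside [Q1 - c1 * IQR, Q3 + c2 * IQR] of the training data."""

    def __init__(self, c=3.0):
        self.c = c

    def fit(self, values):
        # A single factor applies to both sides; a tuple gives (lower, upper).
        c_low, c_high = self.c if isinstance(self.c, tuple) else (self.c, self.c)
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        self.abs_low_ = q1 - c_low * iqr
        self.abs_high_ = q3 + c_high * iqr

    def detect(self, values):
        return [v < self.abs_low_ or v > self.abs_high_ for v in values]


train = list(range(10, 19))  # Q1 = 12, Q3 = 16, IQR = 4

symmetric = SimpleIQRDetector(c=3.0)
symmetric.fit(train)
print(symmetric.abs_low_, symmetric.abs_high_)   # 0.0 28.0
print(symmetric.detect([9, 29, -1]))             # [False, True, True]

asymmetric = SimpleIQRDetector(c=(1.0, 3.0))     # tighter lower bound only
asymmetric.fit(train)
print(asymmetric.abs_low_, asymmetric.abs_high_) # 8.0 28.0
```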
-
detect(ts, return_list=False)
Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit(ts)
Train the detector with given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
- Return type
None
-
fit_detect(ts, return_list=False)
Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit_predict(ts, return_list=False)
Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
get_params()
Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict(ts, return_list=False)
Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
score(ts, anomaly_true, scoring='recall', **kwargs)
Detect anomalies and score the results against true anomalies.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be applied to each series independently.
anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If pandas DataFrame, each column is a binary series and is treated as an independent type of anomaly.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If dict, each key-value pair is a list of events and is treated as an independent type of anomaly.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”.
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score(s) for each type of anomaly.
- Return type
float or dict
-
set_params(**params)
Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
-
class adtk.detector.LevelShiftAD(window, c=6.0, side='both', min_periods=None)
Detector that detects level shift of time series values.
This detector compares values of two time windows next to each other, and identifies the time point in between as a level-shift point if the difference of the medians in the two time windows is anomalously large.
This detector is internally implemented as a Pipenet object. Advanced users may learn more details by checking attribute pipe_.
- Parameters
window (int or str, or 2-tuple of int or str) –
Size of the time windows.
If int, it is the number of time points in this time window.
If str, it must be able to be converted into a pandas Timedelta object.
If 2-tuple, it defines the left and right window respectively.
c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 6.0.
side (str, optional) –
If “both”, to detect anomalous positive and negative changes;
If “positive”, to only detect anomalous positive changes;
If “negative”, to only detect anomalous negative changes.
Default: “both”.
min_periods (int, or 2-tuple of int, optional) – Minimum number of observations in each window required to have a value for that window. If 2-tuple, it defines the left and right window respectively. Default: None, i.e. all observations must have values.
- Return type
None
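As a rough illustration of the idea described for this detector (compare the medians of two adjacent windows, then flag differences that are large relative to the interquartile range), a pure-Python sketch follows. It is not adtk's implementation: the function name `level_shift_points`, the restriction to equal integer windows, and the bound Q3 + c * IQR applied to absolute differences are assumptions made for illustration only.

```python
from statistics import median, quantiles


def level_shift_points(values, window, c=6.0):
    """Return indices t where the medians of the windows on either side of t differ unusually.

    Hypothetical helper, not part of adtk. The left window is values[t-window:t],
    the right window is values[t:t+window].
    """
    diffs = {}
    for t in range(window, len(values) - window + 1):
        left = values[t - window:t]
        right = values[t:t + window]
        diffs[t] = abs(median(right) - median(left))

    # Assumed rule: a difference is anomalous when it exceeds Q3 + c * IQR
    # of all observed differences.
    q1, _, q3 = quantiles(list(diffs.values()), n=4, method="inclusive")
    upper = q3 + c * (q3 - q1)
    return [t for t, d in diffs.items() if d > upper]


# A series that jumps from level 1 to level 10 at index 20.
series = [1.0] * 20 + [10.0] * 20
flagged = level_shift_points(series, window=5)
print(flagged)
```

In this sketch a single shift produces a short run of flagged indices around the change point, because several neighbouring window pairs straddle the jump.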
-
pipe_
¶ Internal pipenet object.
- Type
-
detect
(ts, return_list=False)¶ Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
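To make the two return shapes concrete, the sketch below converts a binary label sequence into a list of events: a single timestamp for an isolated anomalous point, and a 2-tuple (start, end) for a run of consecutive anomalous points. This is a simplified stand-in, not adtk's conversion logic: `labels_to_events` is a hypothetical name, and standard-library `datetime` objects are used in place of pandas Timestamps.

```python
from datetime import datetime, timedelta


def labels_to_events(timestamps, labels):
    """Collapse runs of truthy labels into events (hypothetical helper).

    An isolated anomalous point becomes a single timestamp; a run of
    consecutive anomalous points becomes a closed interval (start, end).
    """
    events = []
    start = prev = None
    for ts, flag in zip(timestamps, labels):
        if flag:
            if start is None:
                start = ts
            prev = ts
        elif start is not None:
            events.append(start if start == prev else (start, prev))
            start = prev = None
    if start is not None:  # close a run that reaches the end of the series
        events.append(start if start == prev else (start, prev))
    return events


base = datetime(2020, 1, 1)
timestamps = [base + timedelta(days=i) for i in range(6)]
labels = [0, 1, 0, 1, 1, 0]
events = labels_to_events(timestamps, labels)
print(events)
```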
-
fit
(ts)¶ Train the detector with given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
- Return type
None
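The statement that a DataFrame with k columns yields k independently trained univariate detectors can be pictured with the sketch below. It uses a dict of lists as a stand-in for a DataFrame and a deliberately trivial "detector" (a median baseline with a fixed tolerance); the names `fit_per_column` and `detect_per_column` are hypothetical and do not appear in adtk.

```python
from statistics import median


def fit_per_column(columns):
    """Fit one independent model per column (here: just the column median)."""
    return {name: median(values) for name, values in columns.items()}


def detect_per_column(models, columns, tolerance):
    """Apply each column's own model to that column only."""
    return {
        name: [abs(v - models[name]) > tolerance for v in values]
        for name, values in columns.items()
    }


# Two columns standing in for a DataFrame with k = 2.
frame = {
    "cpu": [1.0, 1.1, 0.9, 5.0],
    "mem": [10.0, 10.2, 9.8, 10.1],
}
models = fit_per_column(frame)
result = detect_per_column(models, frame, tolerance=1.0)
print(result)
```

Each column keeps its own fitted state, so an outlier in one column has no effect on what is considered normal in another.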
-
fit_detect
(ts, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used to train the detector and then checked for anomalies. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit_predict
(ts, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used to train the detector and then checked for anomalies. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
get_params
()¶ Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict
(ts, return_list=False)¶ Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
score
(ts, anomaly_true, scoring='recall', **kwargs)¶ Detect anomalies and score the results against true anomalies.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be applied to each series independently.
anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If pandas DataFrame, each column is a binary series and is treated as an independent type of anomaly.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If dict, each key-value pair is a list of events and is treated as an independent type of anomaly.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, or “iou”. See module metrics for more information. Default: “recall”.
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score(s) for each type of anomaly.
- Return type
float or dict
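For readers unfamiliar with the four scoring names, the sketch below computes them point by point on two binary label sequences. This is only the textbook definition applied to individual time points; the metrics module referenced above may treat events and time intervals differently, and `point_scores` is a hypothetical helper rather than an adtk function.

```python
def point_scores(y_true, y_pred):
    """Point-wise recall, precision, F1 and IoU for binary labels (hypothetical helper)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)

    def ratio(numerator, denominator):
        # Undefined ratios (zero denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    recall = ratio(tp, tp + fn)        # share of true anomalies that were detected
    precision = ratio(tp, tp + fp)     # share of detections that are true anomalies
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = float("nan")
    iou = ratio(tp, tp + fp + fn)      # intersection over union of the two label sets
    return {"recall": recall, "precision": precision, "f1": f1, "iou": iou}


y_true = [0, 1, 1, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 1]
scores = point_scores(y_true, y_pred)
print(scores)
```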
-
set_params
(**params)¶ Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
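The get_params and set_params pair follows a convention familiar from scikit-learn-style estimators: parameters are read back as a dict and updated by keyword. The toy class below sketches that convention under stated assumptions; `TinyDetector` and its validation behaviour are hypothetical and say nothing about how adtk validates parameter names.

```python
class TinyDetector:
    """Toy object illustrating a get_params / set_params convention (hypothetical)."""

    def __init__(self, low=None, high=None):
        self.low = low
        self.high = high

    def get_params(self):
        # Return the current parameters as a plain dict.
        return {"low": self.low, "high": self.high}

    def set_params(self, **params):
        # Update parameters by keyword; unknown names are rejected in this sketch.
        known = self.get_params()
        for name, value in params.items():
            if name not in known:
                raise ValueError(f"unknown parameter: {name}")
            setattr(self, name, value)


detector = TinyDetector(low=0.0)
detector.set_params(high=10.0)
print(detector.get_params())
```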
-
class
adtk.detector.
PersistAD
(window=1, c=3.0, side='both', min_periods=None, agg='median')[source]¶ Detector that detects anomaly based on values in a preceding period.
This detector compares time series values with the values of their preceding time windows, and identifies a time point as anomalous if the change of value from its preceding average or median is anomalously large.
This detector is internally implemented as a Pipenet object. Advanced users may learn more details by checking attribute pipe_.
- Parameters
window (int or str, optional) –
Size of the preceding time window.
If int, it is the number of time points in this time window.
If str, it must be able to be converted into a pandas Timedelta object.
Default: 1.
c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 3.0.
side (str, optional) –
If “both”, to detect anomalous positive and negative changes;
If “positive”, to only detect anomalous positive changes;
If “negative”, to only detect anomalous negative changes.
Default: “both”.
min_periods (int, optional) – Minimum number of observations in each window required to have a value for that window. Default: None, i.e. all observations must have values.
agg (str, optional) – Aggregation operation of the time window, either “mean” or “median”. Default: “median”.
- Return type
None
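The parameters above can be read as the following procedure: aggregate the preceding window, measure the change of the current value from that aggregate, and flag changes that fall outside a range derived from the interquartile range. The sketch below is one plausible reading, not adtk's implementation; the function name `persist_anomalies`, the integer-only window, and the bounds Q1 - c * IQR and Q3 + c * IQR on signed changes are assumptions.

```python
from statistics import mean, median, quantiles


def persist_anomalies(values, window=1, c=3.0, side="both", agg="median"):
    """Flag indices whose value departs sharply from the preceding window (hypothetical helper)."""
    aggregate = median if agg == "median" else mean
    # Signed change of each value from the aggregate of its preceding window.
    changes = {
        t: values[t] - aggregate(values[t - window:t])
        for t in range(window, len(values))
    }

    # Assumed normal range: [Q1 - c * IQR, Q3 + c * IQR] of the observed changes.
    q1, _, q3 = quantiles(list(changes.values()), n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - c * iqr, q3 + c * iqr

    flagged = []
    for t, change in changes.items():
        if side in ("both", "positive") and change > high:
            flagged.append(t)
        elif side in ("both", "negative") and change < low:
            flagged.append(t)
    return flagged


series = [10, 11, 10, 11, 10, 11, 10, 30, 10, 11, 10, 11]
both = persist_anomalies(series)
positive_only = persist_anomalies(series, side="positive")
print(both, positive_only)
```

With side="both" the spike is reported twice in this sketch: once for the jump up and once for the drop back, which is why restricting to one side can be useful.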
-
pipe_
¶ Internal pipenet object.
- Type
-
detect
(ts, return_list=False)¶ Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit
(ts)¶ Train the detector with given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
- Return type
None
-
fit_detect
(ts, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used to train the detector and then checked for anomalies. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit_predict
(ts, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used to train the detector and then checked for anomalies. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
get_params
()¶ Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict
(ts, return_list=False)¶ Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
score
(ts, anomaly_true, scoring='recall', **kwargs)¶ Detect anomalies and score the results against true anomalies.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be applied to each series independently.
anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If pandas DataFrame, each column is a binary series and is treated as an independent type of anomaly.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If dict, each key-value pair is a list of events and is treated as an independent type of anomaly.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, or “iou”. See module metrics for more information. Default: “recall”.
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score(s) for each type of anomaly.
- Return type
float or dict
-
set_params
(**params)¶ Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
-
class
adtk.detector.
QuantileAD
(low=None, high=None)[source]¶ Detector that detects anomaly based on quantiles of historical data.
This detector compares time series values with user-specified quantiles of historical data, and identifies time points as anomalous when values are beyond the thresholds.
- Parameters
low (float, optional) – Quantile of historical data below which a value is regarded as an anomaly. Must be between 0 and 1. Default: None, i.e. no threshold on lower side.
high (float, optional) – Quantile of historical data above which a value is regarded as an anomaly. Must be between 0 and 1. Default: None, i.e. no threshold on upper side.
- Return type
None
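The relationship between the quantile parameters (low, high) and absolute bounds learned from historical data can be sketched as below. This is an illustration under assumptions, not adtk's code: the helper names are hypothetical, and linear interpolation between order statistics is just one common way to compute a quantile.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between order statistics (one common convention)."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def fit_bounds(history, low=None, high=None):
    """Turn quantile levels into absolute bounds learned from historical data."""
    ordered = sorted(history)
    abs_low = quantile(ordered, low) if low is not None else None
    abs_high = quantile(ordered, high) if high is not None else None
    return abs_low, abs_high


def detect(values, abs_low, abs_high):
    """Flag values beyond the fitted bounds; a missing bound is never violated."""
    return [
        (abs_low is not None and v < abs_low) or (abs_high is not None and v > abs_high)
        for v in values
    ]


history = [float(v) for v in range(101)]  # training data: 0, 1, ..., 100
abs_low, abs_high = fit_bounds(history, low=0.01, high=0.99)
flags = detect([50.0, -5.0, 120.0], abs_low, abs_high)
print(abs_low, abs_high, flags)
```

The fitted bounds are computed once from the training data and then reused on new data, so later values do not move the thresholds in this sketch.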
-
abs_low_
¶ The fitted lower bound of normal range.
- Type
float
-
abs_high_
¶ The fitted upper bound of normal range.
- Type
float
-
detect
(ts, return_list=False)¶ Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit
(ts)¶ Train the detector with given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
- Return type
None
-
fit_detect
(ts, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used to train the detector and then checked for anomalies. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit_predict
(ts, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used to train the detector and then checked for anomalies. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
get_params
()¶ Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict
(ts, return_list=False)¶ Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
score
(ts, anomaly_true, scoring='recall', **kwargs)¶ Detect anomalies and score the results against true anomalies.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be applied to each series independently.
anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If pandas DataFrame, each column is a binary series and is treated as an independent type of anomaly.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If dict, each key-value pair is a list of events and is treated as an independent type of anomaly.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, or “iou”. See module metrics for more information. Default: “recall”.
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score(s) for each type of anomaly.
- Return type
float or dict
-
set_params
(**params)¶ Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
-
class
adtk.detector.
SeasonalAD
(freq=None, side='both', c=3.0, trend=False)[source]¶ Detector that detects anomalous values away from seasonal pattern.
This detector uses a seasonal decomposition transformer to remove the seasonal pattern (and, optionally, the trend), and identifies a time point as anomalous when the residual of the seasonal decomposition is anomalously large.
This detector is internally implemented as a Pipenet object. Advanced users may learn more details by checking attribute pipe_.
- Parameters
freq (int, optional) – Length of a seasonal cycle as the number of time points in a cycle. If not specified, the model will try to determine it based on autocorrelation of the training series. Default: None.
c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 3.0.
side (str, optional) –
If “both”, to detect anomalous positive and negative residuals;
If “positive”, to only detect anomalous positive residuals;
If “negative”, to only detect anomalous negative residuals.
Default: “both”.
trend (bool, optional) – Whether to extract trend during decomposition. Default: False.
- Return type
None
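The decomposition-then-threshold idea can be illustrated with a small pure-Python sketch: estimate a seasonal pattern of known length freq, subtract it, and flag residuals outside a range based on the interquartile range. The sketch makes several assumptions that are not taken from adtk: the name `seasonal_anomalies`, a per-phase median as the seasonal estimate, no trend removal, no automatic detection of freq, and the bounds Q1 - c * IQR and Q3 + c * IQR.

```python
from statistics import median, quantiles


def seasonal_anomalies(values, freq, c=3.0):
    """Flag indices whose residual from a per-phase median pattern is unusually large."""
    # Seasonal pattern: the median of all values sharing the same phase in the cycle.
    seasonal = [median(values[phase::freq]) for phase in range(freq)]
    residuals = [v - seasonal[t % freq] for t, v in enumerate(values)]

    # Assumed normal range of residuals: [Q1 - c * IQR, Q3 + c * IQR].
    q1, _, q3 = quantiles(residuals, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - c * iqr, q3 + c * iqr
    return [t for t, r in enumerate(residuals) if r < low or r > high]


cycle = [0.0, 5.0, 10.0, 5.0]   # one seasonal cycle of length 4
series = cycle * 10
series[17] += 20.0              # inject a single spike
flagged = seasonal_anomalies(series, freq=4)
print(flagged)
```

The median is used for the seasonal estimate here because a mean would be pulled toward the spike and leave small residuals on every other point in the same phase.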
-
freq_
¶ Length of seasonal cycle as the number of time points in a cycle. Equal to parameter freq if it is specified. Otherwise, calculated based on autocorrelation of the training series.
- Type
int
-
seasonal_
¶ Seasonal pattern extracted from training series.
- Type
pandas.Series
-
pipe_
¶ Internal pipenet object.
- Type
-
detect
(ts, return_list=False)¶ Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit
(ts)¶ Train the detector with given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
- Return type
None
-
fit_detect
(ts, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used to train the detector and then checked for anomalies. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit_predict
(ts, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used to train the detector and then checked for anomalies. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
get_params
()¶ Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict
(ts, return_list=False)¶ Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of events, where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval;
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
score
(ts, anomaly_true, scoring='recall', **kwargs)¶ Detect anomalies and score the results against true anomalies.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be applied to each series independently.
anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If pandas DataFrame, each column is a binary series and is treated as an independent type of anomaly.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If dict, each key-value pair is a list of events and is treated as an independent type of anomaly.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, or “iou”. See module metrics for more information. Default: “recall”.
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score(s) for each type of anomaly.
- Return type
float or dict
-
set_params
(**params)¶ Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
-
class
adtk.detector.
ThresholdAD
(low=None, high=None)[source]¶ Detector that detects anomaly based on user-given threshold.
This detector compares time series values with user-given thresholds, and identifies time points as anomalous when values are beyond the thresholds.
- Parameters
low (float, optional) – Threshold below which a value is regarded as an anomaly. Default: None, i.e. no threshold on lower side.
high (float, optional) – Threshold above which a value is regarded as an anomaly. Default: None, i.e. no threshold on upper side.
- Return type
None
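Because the thresholds are supplied by the user, nothing has to be learned from data, and the detection rule reduces to a comparison per value. A minimal sketch of that rule follows; `threshold_flags` is a hypothetical helper, not adtk's API, and it treats a bound of None as "no limit on that side", as the parameter descriptions above state.

```python
def threshold_flags(values, low=None, high=None):
    """Return True for each value below `low` or above `high` (hypothetical helper)."""
    flags = []
    for v in values:
        too_low = low is not None and v < low
        too_high = high is not None and v > high
        flags.append(too_low or too_high)
    return flags


readings = [1, 5, 12, -3]
two_sided = threshold_flags(readings, low=0, high=10)
upper_only = threshold_flags(readings, high=10)
print(two_sided, upper_only)
```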
-
detect
(ts, return_list=False)¶ Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and the detector will be applied to each univariate series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in the input;
If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in the input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
get_params
()¶ Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict
(ts, return_list=False)¶ Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and the detector will be applied to each univariate series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
score
(ts, anomaly_true, scoring='recall', **kwargs)¶ Detect anomalies and score the results against true anomalies.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and the detector will be applied to each series independently.
anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If pandas DataFrame, each column is a binary series and is treated as an independent type of anomaly.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If dict, each key-value pair is a list of events and is treated as an independent type of anomaly.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score(s) for each type of anomaly.
- Return type
float or dict
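For binary label series, the four scoring options can be illustrated with their usual point-wise definitions. This is a sketch under that assumption: the metrics module remains the authority on how ADTK computes them, in particular for event lists, and binary_scores is a hypothetical helper.

```python
import pandas as pd


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_scores(y_true, y_pred):
    """Point-wise recall, precision, F1 and IoU for two aligned binary series."""
    y_true = y_true.astype(bool)
    y_pred = y_pred.astype(bool)
    hits = int((y_true & y_pred).sum())    # points flagged in both series
    union = int((y_true | y_pred).sum())   # points flagged in either series
    recall = _ratio(hits, int(y_true.sum()))
    precision = _ratio(hits, int(y_pred.sum()))
    f1 = _ratio(2 * precision * recall, precision + recall)
    iou = _ratio(hits, union)
    return {"recall": recall, "precision": precision, "f1": f1, "iou": iou}


index = pd.date_range("2024-01-01", periods=6, freq="D")
y_true = pd.Series([0, 1, 1, 0, 1, 0], index=index)
y_pred = pd.Series([0, 1, 0, 0, 1, 1], index=index)
scores = binary_scores(y_true, y_pred)
print(scores)
```

Here two of the three true anomalies are found and two of the three detections are correct, so recall, precision and F1 all equal 2/3, while IoU is 2/4.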
-
set_params
(**params)¶ Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
-
class
adtk.detector.
VolatilityShiftAD
(window, c=6.0, side='both', min_periods=None, agg='std')[source]¶ Detector that detects shift of volatility in time series.
This detector compares the volatility of two adjacent time windows, and identifies the time point in between as a volatility-shift point if the difference between the volatility measurements of the two windows is anomalously large.
This detector is internally implemented as a Pipenet object. Advanced users may learn more details by checking attribute pipe_.
- Parameters
window (int or str, or 2-tuple of int or str) –
Size of the time windows.
If int, it is the number of time points in the window.
If str, it must be able to be converted into a pandas Timedelta object.
If 2-tuple, it defines the left and right window respectively.
c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 6.0.
side (str, optional) –
If “both”, to detect anomalous positive and negative changes;
If “positive”, to only detect anomalous positive changes;
If “negative”, to only detect anomalous negative changes.
Default: “both”.
min_periods (int or 2-tuple of int, optional) – Minimum number of observations in each window required to have a value for that window. If 2-tuple, it defines the left and right window respectively. Default: None, i.e. all observations must have values.
agg (str, optional) – Measurement of volatility in a time window, one of “std” (standard deviation), “iqr” (interquartile range), or “idr” (interdecile range). Default: “std”.
- Return type
None
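The idea of comparing two adjacent windows can be sketched in plain pandas. This is an illustration, not ADTK's implementation: volatility_shift_flags is a hypothetical helper, it uses the plain difference of the two standard deviations, and it assumes the normal range is [Q1 - c * IQR, Q3 + c * IQR] computed from the same series. The synthetic data (standard deviation 1 for the first half, 10 for the second) is also an assumption.

```python
import numpy as np
import pandas as pd


def volatility_shift_flags(s, window, c=6.0):
    """Compare the std of the windows before and after each point (hypothetical helper)."""
    rolling_std = s.rolling(window).std()
    left = rolling_std.shift(1)                # window covering [t - window, t - 1]
    right = rolling_std.shift(-(window - 1))   # window covering [t, t + window - 1]
    diff = right - left                        # NaN where either window is incomplete
    q1, q3 = diff.quantile(0.25), diff.quantile(0.75)
    iqr = q3 - q1
    flags = (diff < q1 - c * iqr) | (diff > q3 + c * iqr)   # corresponds to side="both"
    return diff, flags


rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(0, 1, 1000), rng.normal(0, 10, 1000)])
s = pd.Series(values, index=pd.date_range("2024-01-01", periods=2000, freq="D"))

diff, flags = volatility_shift_flags(s, window=50)
peak = diff.abs().idxmax()
print(peak, int(flags.sum()))
```

With this data the largest difference should appear close to the point where the volatility changes. Some points inside the noisier half may be flagged as well, which is one reason to tune c and window for real data.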
-
pipe_
¶ Internal pipenet object.
- Type
-
detect
(ts, return_list=False)¶ Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit
(ts)¶ Train the detector with given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
- Return type
None
-
fit_detect
(ts, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
fit_predict
(ts, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be trained and applied to each series independently.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
get_params
()¶ Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict
(ts, return_list=False)¶ Detect anomalies from given time series.
- Parameters
ts (pandas.Series or pandas.DataFrame) –
Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series.
If the detector was trained with a Series, the detector will be applied to each univariate series independently;
If the detector was trained with a DataFrame, i.e. the detector is essentially k detectors, those detectors will be applied to each univariate series respectively.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If input is a Series and return_list=False, return a Series;
If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
If input is a Series and return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If input is a DataFrame and return_list=True, return a dict of event lists, where each key-value pair corresponds to a column in input.
- Return type
pandas.Series, pandas.DataFrame, list, or dict
-
score
(ts, anomaly_true, scoring='recall', **kwargs)¶ Detect anomalies and score the results against true anomalies.
- Parameters
ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, it is treated as k independent univariate time series, and k univariate detectors will be applied to each series independently.
anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If pandas DataFrame, each column is a binary series and is treated as an independent type of anomaly.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
If dict, each key-value pair is a list of events and is treated as an independent type of anomaly.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score(s) for each type of anomaly.
- Return type
float or dict
-
set_params
(**params)¶ Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
-
class
adtk.detector.
CustomizedDetectorHD
(detect_func, detect_func_params=None, fit_func=None, fit_func_params=None)[source]¶ Multivariate detector derived from a user-given function and parameters.
- Parameters
detect_func (function) –
A function detecting anomalies from multivariate time series.
The first input argument must be a pandas DataFrame. Optional input arguments may be supplied through parameter detect_func_params and through the output of fit_func. The output must be a binary pandas Series with the same index as the input.
detect_func_params (dict, optional) – Parameters of detect_func. Default: None.
fit_func (function, optional) –
A function training parameters of detect_func with multivariate time series.
The first input argument must be a pandas DataFrame. Optional input arguments may be supplied through parameter fit_func_params. The output must be a dict that can be used by detect_func as parameters. Default: None.
fit_func_params (dict, optional) – Parameters of fit_func. Default: None.
- Return type
None
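A pair of functions satisfying the contract above might look like the following sketch. The function bodies, the parameter names upper and margin, and the sample data are all hypothetical. The last lines imitate by hand what the detector is assumed to do, namely pass the dict returned by fit_func to detect_func as keyword arguments.

```python
import pandas as pd


def fit_func(df, margin=0.0):
    """Learn an upper limit for the row-wise sum of all columns (hypothetical example)."""
    return {"upper": float(df.sum(axis=1).max()) + margin}


def detect_func(df, upper):
    """Return a binary Series, indexed like `df`, flagging rows whose sum exceeds `upper`."""
    return df.sum(axis=1) > upper


index = pd.date_range("2024-01-01", periods=3, freq="D")
train = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [1.0, 1.0, 1.0]}, index=index)
test = pd.DataFrame({"a": [1.0, 5.0, 2.0], "b": [1.0, 1.0, 1.0]}, index=index)

fitted = fit_func(train)               # {"upper": 4.0}
labels = detect_func(test, **fitted)   # assumed calling convention, done manually here
print(labels.tolist())                 # [False, True, False]
```

With ADTK installed, these two functions would be supplied as detect_func and fit_func when constructing CustomizedDetectorHD, and an extra argument such as margin would go through fit_func_params.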
-
detect
(df, return_list=False)¶ Detect anomalies from given time series.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
fit
(df)¶ Train the detector with given time series.
- Parameters
df (pandas.DataFrame) – Time series to be used to train the detector.
- Return type
None
-
fit_detect
(df, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
df (pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
fit_predict
(df, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
df (pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
get_params
()¶ Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict
(df, return_list=False)¶ Detect anomalies from given time series.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
score
(df, anomaly_true, scoring='recall', **kwargs)¶ Detect anomalies and score the results against true anomalies.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
anomaly_true (pandas.Series or list) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score of detection result.
- Return type
float
-
set_params
(**params)¶ Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
-
class
adtk.detector.
MinClusterDetector
(model)[source]¶ Detector that detects anomalies based on clustering of historical data.
This detector performs clustering using a clustering model, and identifies a time point as anomalous if it belongs to the smallest cluster.
- Parameters
model (object) – A clustering model to be used for clustering time series values. Same as a clustering model in scikit-learn, the model should minimally have a fit method and a predict method. The predict method should return an array of cluster labels.
- Return type
None
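The smallest-cluster rule can be sketched with a toy model that offers the required fit and predict methods. MidpointCluster is a hypothetical stand-in for a real clustering model, and the steps below are an assumed outline of the idea rather than ADTK's code.

```python
import numpy as np
import pandas as pd


class MidpointCluster:
    """Toy two-cluster model with fit/predict (hypothetical, for illustration only)."""

    def fit(self, X):
        first = np.asarray(X, dtype=float)[:, 0]
        # Split the first column halfway between its minimum and maximum.
        self.midpoint_ = (first.min() + first.max()) / 2.0
        return self

    def predict(self, X):
        return (np.asarray(X, dtype=float)[:, 0] > self.midpoint_).astype(int)


index = pd.date_range("2024-01-01", periods=6, freq="D")
df = pd.DataFrame(
    {"x": [1.0, 1.1, 0.9, 1.0, 5.0, 1.05], "y": [2.0, 2.1, 1.9, 2.0, 9.0, 2.0]},
    index=index,
)

model = MidpointCluster().fit(df)
labels = model.predict(df)                 # one cluster label per time point
smallest = np.bincount(labels).argmin()    # label of the least populated cluster
flags = pd.Series(labels == smallest, index=df.index)
print(flags.tolist())                      # [False, False, False, False, True, False]
```

Only the fifth time point falls into the smaller of the two clusters, so it is the only one flagged.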
-
detect
(df, return_list=False)¶ Detect anomalies from given time series.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
fit
(df)¶ Train the detector with given time series.
- Parameters
df (pandas.DataFrame) – Time series to be used to train the detector.
- Return type
None
-
fit_detect
(df, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
df (pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
fit_predict
(df, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
df (pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
get_params
()¶ Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict
(df, return_list=False)¶ Detect anomalies from given time series.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
score
(df, anomaly_true, scoring='recall', **kwargs)¶ Detect anomalies and score the results against true anomalies.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
anomaly_true (pandas.Series or list) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score of detection result.
- Return type
float
-
set_params
(**params)¶ Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
-
class
adtk.detector.
OutlierDetector
(model)[source]¶ Detector that detects anomalies based on an outlier detection model.
This detector performs time-independent outlier detection using the given model, and identifies a time point as anomalous if it is labelled as an outlier.
- Parameters
model (object) – An outlier detection model to be used. Like an outlier detection model in scikit-learn (e.g. EllipticEnvelope, IsolationForest, LocalOutlierFactor), the model should minimally have a fit_predict method, or fit and predict methods. The fit_predict or predict method should return an array of outlier indicators where outliers are marked by -1.
- Return type
None
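The model contract described above (a fit_predict method returning -1 for outliers) can be illustrated with a toy model. MadOutlierModel is hypothetical and stands in for a scikit-learn estimator; it scores each row by its largest per-column deviation from the median, scaled by the median absolute deviation.

```python
import numpy as np
import pandas as pd


class MadOutlierModel:
    """Toy outlier model exposing fit_predict (hypothetical, for illustration only)."""

    def __init__(self, k=5.0):
        self.k = k

    def fit_predict(self, X):
        X = np.asarray(X, dtype=float)
        median = np.median(X, axis=0)
        deviation = np.abs(X - median)
        mad = np.median(deviation, axis=0)      # assumed non-zero in this sketch
        score = (deviation / mad).max(axis=1)   # worst scaled deviation per row
        return np.where(score > self.k, -1, 1)  # -1 marks an outlier


index = pd.date_range("2024-01-01", periods=5, freq="D")
df = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0, 2.0, 100.0], "y": [10.0, 12.0, 11.0, 9.0, 10.0]},
    index=index,
)

labels = MadOutlierModel().fit_predict(df)
flags = pd.Series(labels == -1, index=df.index)   # binary series aligned with the input
print(labels.tolist())   # [1, 1, 1, 1, -1]
print(flags.tolist())    # [False, False, False, False, True]
```

Each row is judged on its own, without regard to its position in time, which is what "time-independent" refers to.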
-
detect
(df, return_list=False)¶ Detect anomalies from given time series.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
fit
(df)¶ Train the detector with given time series.
- Parameters
df (pandas.DataFrame) – Time series to be used to train the detector.
- Return type
None
-
fit_detect
(df, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
df (pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
fit_predict
(df, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
df (pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
get_params
()¶ Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict
(df, return_list=False)¶ Detect anomalies from given time series.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
score
(df, anomaly_true, scoring='recall', **kwargs)¶ Detect anomalies and score the results against true anomalies.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
anomaly_true (pandas.Series or list) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score of detection result.
- Return type
float
-
set_params
(**params)¶ Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
-
class
adtk.detector.
PcaAD
(k=1, c=5.0)[source]¶ Detector that detects outlier points with principal component analysis.
This detector applies principal component analysis (PCA) to the multivariate time series (every time point is treated as a point in high-dimensional space), measures the reconstruction error at every time point, and identifies a time point as anomalous when the reconstruction error is anomalously large.
This detector is internally implemented as a Pipeline object. Advanced users may learn more details by checking attribute pipe_.
- Parameters
k (int, optional) – Number of principal components to use. Default: 1.
c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 5.0.
- Return type
None
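The reconstruction-error idea can be sketched with NumPy alone. This is an illustration under stated assumptions, not ADTK's implementation: the data is centered but not scaled, the error is the Euclidean distance between a row and its projection onto the top k principal axes, and the upper bound is taken as Q3 + c * IQR of the errors. Both helper names are hypothetical.

```python
import numpy as np
import pandas as pd


def pca_reconstruction_error(df, k=1):
    """Distance between each row and its projection onto the top-k principal axes."""
    X = df.to_numpy(dtype=float)
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]                                    # shape (k, n_columns)
    reconstructed = centered @ components.T @ components   # back in the original space
    return pd.Series(np.linalg.norm(centered - reconstructed, axis=1), index=df.index)


def iqr_upper_flags(error, c=5.0):
    """Flag errors above Q3 + c * IQR (assumed form of the bound)."""
    q1, q3 = error.quantile(0.25), error.quantile(0.75)
    return error > q3 + c * (q3 - q1)


index = pd.date_range("2024-01-01", periods=20, freq="D")
x = np.arange(20, dtype=float)
y = 2.0 * x
y[10] += 15.0                      # one point pushed off the line y = 2x
df = pd.DataFrame({"x": x, "y": y}, index=index)

error = pca_reconstruction_error(df, k=1)
flags = iqr_upper_flags(error)
print(error.idxmax(), bool(flags.iloc[10]))
```

Because nearly all points lie on one line, a single component reconstructs them well, and the displaced point stands out with a much larger error.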
-
pipe_
¶ Internal pipeline object.
- Type
-
detect
(df, return_list=False)¶ Detect anomalies from given time series.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
fit
(df)¶ Train the detector with given time series.
- Parameters
df (pandas.DataFrame) – Time series to be used to train the detector.
- Return type
None
-
fit_detect
(df, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
df (pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
fit_predict
(df, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
df (pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
get_params
()¶ Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict
(df, return_list=False)¶ Detect anomalies from given time series.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
score
(df, anomaly_true, scoring='recall', **kwargs)¶ Detect anomalies and score the results against true anomalies.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
anomaly_true (pandas.Series or list) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score of detection result.
- Return type
float
-
set_params
(**params)¶ Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None
-
class
adtk.detector.
RegressionAD
(regressor, target, c=3.0, side='both')[source]¶ Detector that detects anomalous inter-series relationship.
This detector performs regression to model the relationship between a target series and the rest of the series, and identifies a time point as anomalous when the residual of the regression is anomalously large.
This detector is internally implemented as a Pipenet object. Advanced users may learn more details by checking attribute pipe_.
- Parameters
target (str) – Name of the column to be regarded as target variable.
regressor (object) – Regressor to be used. Same as a scikit-learn regressor, it should minimally have fit and predict methods.
c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 3.0.
side (str, optional) –
If “both”, to detect anomalous positive and negative residuals;
If “positive”, to only detect anomalous positive residuals;
If “negative”, to only detect anomalous negative residuals.
Default: “both”.
- Return type
None
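The residual-based rule can be sketched with an ordinary least-squares fit in NumPy. This is an illustration, not ADTK's implementation: regression_residual_flags is a hypothetical helper, the residual is taken as actual minus predicted, and the normal range is assumed to be [Q1 - c * IQR, Q3 + c * IQR] of the residuals, which corresponds to side="both".

```python
import numpy as np
import pandas as pd


def regression_residual_flags(df, target, c=3.0):
    """Regress `target` on the other columns and flag anomalously large residuals."""
    y = df[target].to_numpy(dtype=float)
    X = df.drop(columns=target).to_numpy(dtype=float)
    X = np.column_stack([X, np.ones(len(df))])        # append an intercept column
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = pd.Series(y - X @ coef, index=df.index)
    q1, q3 = residual.quantile(0.25), residual.quantile(0.75)
    iqr = q3 - q1
    flags = (residual < q1 - c * iqr) | (residual > q3 + c * iqr)
    return residual, flags


index = pd.date_range("2024-01-01", periods=30, freq="D")
x = np.arange(30, dtype=float)
y = 3.0 * x + 1.0 + 0.1 * np.sin(x)   # near-linear relationship with a small wiggle
y[15] += 20.0                          # one point that breaks the relationship
df = pd.DataFrame({"x": x, "y": y}, index=index)

residual, flags = regression_residual_flags(df, target="y")
print(residual.abs().idxmax(), bool(flags.iloc[15]))
```

Restricting detection to one side would amount to keeping only one of the two comparisons in the flags expression.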
-
pipe_
¶ Internal pipenet object.
- Type
-
detect
(df, return_list=False)¶ Detect anomalies from given time series.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
fit
(df)¶ Train the detector with given time series.
- Parameters
df (pandas.DataFrame) – Time series to be used to train the detector.
- Return type
None
-
fit_detect
(df, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
df (pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
fit_predict
(df, return_list=False)¶ Train the detector and detect anomalies from the time series used for training.
- Parameters
df (pandas.DataFrame) – Time series used both to train the detector and to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
get_params
()¶ Get the parameters of this model.
- Returns
Model parameters.
- Return type
dict
-
predict
(df, return_list=False)¶ Detect anomalies from given time series.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
return_list (bool, optional) – Whether to return a list of anomalous events, or a binary series indicating normal/anomalous. Default: False.
- Returns
Detected anomalies.
If return_list=False, return a binary series;
If return_list=True, return a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
- Return type
pandas.Series or list
-
score
(df, anomaly_true, scoring='recall', **kwargs)¶ Detect anomalies and score the results against true anomalies.
- Parameters
df (pandas.DataFrame) – Time series to detect anomalies from.
anomaly_true (pandas.Series or list) –
True anomalies.
If pandas Series, it is treated as a series of binary labels.
If list, a list of events where an event is a pandas Timestamp if it is instantaneous or a 2-tuple of pandas Timestamps if it is a closed time interval.
scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
**kwargs – Optional parameters for scoring function. See module metrics for more information.
- Returns
Score of detection result.
- Return type
float
-
set_params
(**params)¶ Set the parameters of this model.
- Parameters
**params – Model parameters to set.
- Return type
None