Detectors

Module of detectors.

A detector detects anomalous time points from time series.

adtk.detector.print_all_models()

Print description of every model in this module.

class adtk.detector.ThresholdAD(low=None, high=None)

Detector that detects anomalies based on user-given thresholds.

This detector compares time series values with user-given thresholds, and identifies time points as anomalous when values are beyond the thresholds.

This is a univariate detector. When it is applied to a multivariate time series (i.e. pandas DataFrame), it will be applied to every series independently. All parameters can be defined as a dict object where key-value pairs are series names (i.e. column names of DataFrame) and the model parameter for that series. If not, then the same parameter will be applied to all series.
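
The per-series convention above can be pictured with a small helper. This is a minimal sketch in plain Python, not ADTK code: resolve_param is a hypothetical name, and treating a series name missing from the dict as an error is an assumption.

```python
def resolve_param(param, series_name):
    """Return the parameter value that applies to one series.

    A dict is read as {series name: value}; any other value is shared by
    every series. Hypothetical helper, for illustration only.
    """
    if isinstance(param, dict):
        return param[series_name]  # assumption: a missing name raises KeyError
    return param


columns = ("temperature", "pressure")

# One threshold per column, given as a dict.
per_series = {name: resolve_param({"temperature": 30.0, "pressure": 1.2}, name)
              for name in columns}

# A single scalar shared by all columns.
shared = {name: resolve_param(25.0, name) for name in columns}
```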

Parameters:
  • low (float, optional) – Threshold below which a value is regarded as an anomaly. Default: None, i.e. no threshold on the lower side.
  • high (float, optional) – Threshold above which a value is regarded as an anomaly. Default: None, i.e. no threshold on the upper side.
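
The thresholding rule can be sketched in a few lines of plain Python. This is an illustration of the rule as described, not ADTK's implementation: threshold_flags is a hypothetical name, and the strict comparisons (a value equal to a threshold counts as normal) are an assumption.

```python
from typing import List, Optional, Sequence


def threshold_flags(values: Sequence[float],
                    low: Optional[float] = None,
                    high: Optional[float] = None) -> List[bool]:
    """Flag values below `low` or above `high`.

    A bound of None disables that side, mirroring the documented defaults.
    """
    flags = []
    for v in values:
        too_low = low is not None and v < low      # assumed strict comparison
        too_high = high is not None and v > high   # assumed strict comparison
        flags.append(too_low or too_high)
    return flags


readings = [3.0, 15.0, 7.5, -2.0, 9.9]
flags = threshold_flags(readings, low=0.0, high=10.0)
no_bounds = threshold_flags(readings)  # both sides disabled: nothing is flagged
```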
detect(ts, return_list=False)

Detect anomalies from given time series.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.
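
The relation between the two return formats can be sketched as follows. This is a plain-Python illustration, not ADTK code: labels_to_events is a hypothetical name, and representing a run of consecutive anomalous points as a (first, last) tuple is an assumption about what a time stamp tuple holds.

```python
from datetime import datetime, timedelta


def labels_to_events(stamps, flags):
    """Convert binary labels into a list of time stamps and time stamp tuples."""
    events = []
    run = []  # time stamps of the current run of consecutive anomalous points

    def close_run():
        if run:
            # A single point stays a time stamp; a longer run becomes a tuple.
            events.append(run[0] if len(run) == 1 else (run[0], run[-1]))
            run.clear()

    for stamp, flag in zip(stamps, flags):
        if flag:
            run.append(stamp)
        else:
            close_run()
    close_run()  # a run may extend to the end of the series
    return events


start = datetime(2020, 1, 1)
stamps = [start + timedelta(hours=h) for h in range(6)]
flags = [False, True, False, True, True, False]
events = labels_to_events(stamps, flags)
```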

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit(ts)

Train the detector with given time series.

Parameters: ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
fit_detect(ts, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to be used for training and be detected for anomalies. If a DataFrame with k columns, k univariate detectors will be trained and applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit_predict(ts, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns: Model parameters.
Return type: dict
predict(ts, return_list=False)

Alias of detect.

score(ts, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –

    True anomalies.

    • If Series, it is a series of binary labels indicating which time points are anomalous;
    • If DataFrame, each column is considered as an independent type of anomaly;
    • If list, it is a list of anomalous events in the form of time points (pandas.Timestamp) or time windows (2-tuple of time stamps);
    • If a dict of lists, each value is considered as an independent type of anomaly.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, or “iou”. See module metrics for more information. Default: “recall”.
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score(s) for each type of anomaly.

Return type:

float or dict
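
For orientation, the four scoring names can be illustrated point-wise on binary labels. This is a minimal sketch in plain Python under stated assumptions: pointwise_scores is a hypothetical name, the definitions are the textbook ones, and how module metrics treats event lists or its optional parameters is not covered here.

```python
def pointwise_scores(true_flags, pred_flags):
    """Point-wise recall, precision, F1 and IoU for two binary label lists."""
    pairs = list(zip(true_flags, pred_flags))
    tp = sum(1 for t, p in pairs if t and p)       # anomalous and detected
    fp = sum(1 for t, p in pairs if p and not t)   # detected but normal
    fn = sum(1 for t, p in pairs if t and not p)   # anomalous but missed
    nan = float("nan")                             # score undefined
    return {
        "recall": tp / (tp + fn) if tp + fn else nan,
        "precision": tp / (tp + fp) if tp + fp else nan,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else nan,
        "iou": tp / (tp + fp + fn) if tp + fp + fn else nan,
    }


truth = [False, True, True, False, True, False]
detected = [False, True, False, False, True, True]
scores = pointwise_scores(truth, detected)
```

In this example two of the three true anomalies are detected and one detection is spurious, so recall, precision and F1 are all 2/3 while IoU is 1/2.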

set_params(**kwargs)

Set parameters of this model.

Parameters: **kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.QuantileAD(low=None, high=None)

Detector that detects anomalies based on quantiles of historical data.

This detector compares time series values with user-specified quantiles of historical data, and identifies time points as anomalous when values are beyond the thresholds.

This is a univariate detector. When it is applied to a multivariate time series (i.e. pandas DataFrame), it will be applied to every series independently. All parameters can be defined as a dict object where key-value pairs are series names (i.e. column names of DataFrame) and the model parameter for that series. If not, then the same parameter will be applied to all series.

Parameters:
  • low (float, optional) – Quantile of historical data below which a value is regarded as an anomaly. Must be between 0 and 1. Default: None, i.e. no threshold on the lower side.
  • high (float, optional) – Quantile of historical data above which a value is regarded as an anomaly. Must be between 0 and 1. Default: None, i.e. no threshold on the upper side.
abs_low_

The fitted lower bound of normal range.

Type: float
abs_high_

The fitted upper bound of normal range.

Type: float
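
The split between fitting (quantiles become absolute bounds) and detecting (values are compared with those bounds) can be sketched as follows. This is a plain-Python illustration, not ADTK's implementation: the function names are hypothetical, and the linear-interpolation quantile and strict comparisons are assumptions.

```python
import math


def quantile(values, q):
    """Quantile by linear interpolation between order statistics (assumed rule)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def fit_quantile_bounds(history, low=None, high=None):
    """Turn quantile levels into absolute bounds; None leaves a side open."""
    abs_low = None if low is None else quantile(history, low)
    abs_high = None if high is None else quantile(history, high)
    return abs_low, abs_high


def detect_outside(values, abs_low, abs_high):
    """Flag values outside the fitted bounds."""
    return [(abs_low is not None and v < abs_low)
            or (abs_high is not None and v > abs_high)
            for v in values]


history = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 10.0, 14.0, 12.0]
abs_low, abs_high = fit_quantile_bounds(history, low=0.125, high=0.875)
flags = detect_outside([9.0, 12.0, 13.0, 15.0], abs_low, abs_high)
```

Here abs_low and abs_high play the role of the fitted attributes abs_low_ and abs_high_: they are learned once from historical data and then reused for any new series.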
detect(ts, return_list=False)

Detect anomalies from given time series.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit(ts)

Train the detector with given time series.

Parameters: ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
fit_detect(ts, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to be used for training and be detected for anomalies. If a DataFrame with k columns, k univariate detectors will be trained and applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit_predict(ts, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns: Model parameters.
Return type: dict
predict(ts, return_list=False)

Alias of detect.

score(ts, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –

    True anomalies.

    • If Series, it is a series of binary labels indicating which time points are anomalous;
    • If DataFrame, each column is considered as an independent type of anomaly;
    • If list, it is a list of anomalous events in the form of time points (pandas.Timestamp) or time windows (2-tuple of time stamps);
    • If a dict of lists, each value is considered as an independent type of anomaly.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, or “iou”. See module metrics for more information. Default: “recall”.
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score(s) for each type of anomaly.

Return type:

float or dict

set_params(**kwargs)

Set parameters of this model.

Parameters: **kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.InterQuartileRangeAD(c=3.0)

Detector that detects anomalies based on the inter-quartile range of historical data.

This detector compares time series values with 1st and 3rd quartiles of historical data, and identifies time points as anomalous when differences are beyond the inter-quartile range times a user-given factor c.

This is a univariate detector. When it is applied to a multivariate time series (i.e. pandas DataFrame), it will be applied to every series independently. All parameters can be defined as a dict object where key-value pairs are series names (i.e. column names of DataFrame) and the model parameter for that series. If not, then the same parameter will be applied to all series.

Parameters: c (float, or 2-tuple (float, float), optional) – Factor used to determine the bounds of the normal range (between Q1-c*IQR and Q3+c*IQR). If a tuple (c1, c2), the factors are for the lower and upper bound respectively. Default: 3.0.
abs_low_

The fitted lower bound of normal range.

Type: float
abs_high_

The fitted upper bound of normal range.

Type: float
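
The normal range Q1-c*IQR to Q3+c*IQR can be worked through in plain Python. This is a minimal sketch, not ADTK's implementation: iqr_bounds and quantile are hypothetical names, and the linear-interpolation quartiles are an assumption.

```python
import math


def quantile(values, q):
    """Quantile by linear interpolation between order statistics (assumed rule)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def iqr_bounds(history, c=3.0):
    """Return (lower, upper) = (Q1 - c1*IQR, Q3 + c2*IQR).

    A scalar c is used on both sides; a tuple (c1, c2) sets them separately.
    """
    c_low, c_high = c if isinstance(c, tuple) else (c, c)
    q1, q3 = quantile(history, 0.25), quantile(history, 0.75)
    iqr = q3 - q1
    return q1 - c_low * iqr, q3 + c_high * iqr


history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
symmetric = iqr_bounds(history)                 # Q1 = 3, Q3 = 7, IQR = 4
asymmetric = iqr_bounds(history, c=(1.0, 0.5))  # tighter upper bound
```

With this history the default c=3.0 gives the range (-9, 19), while c=(1.0, 0.5) narrows it to (-1, 9).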
detect(ts, return_list=False)

Detect anomalies from given time series.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit(ts)

Train the detector with given time series.

Parameters: ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
fit_detect(ts, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to be used for training and be detected for anomalies. If a DataFrame with k columns, k univariate detectors will be trained and applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit_predict(ts, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns: Model parameters.
Return type: dict
predict(ts, return_list=False)

Alias of detect.

score(ts, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –

    True anomalies.

    • If Series, it is a series of binary labels indicating which time points are anomalous;
    • If DataFrame, each column is considered as an independent type of anomaly;
    • If list, it is a list of anomalous events in the form of time points (pandas.Timestamp) or time windows (2-tuple of time stamps);
    • If a dict of lists, each value is considered as an independent type of anomaly.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, or “iou”. See module metrics for more information. Default: “recall”.
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score(s) for each type of anomaly.

Return type:

float or dict

set_params(**kwargs)

Set parameters of this model.

Parameters: **kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.GeneralizedESDTestAD(alpha=0.05)

Detector that detects anomalies based on the generalized ESD test.

This detector performs the generalized extreme Studentized deviate (ESD) test [1, 2] on historical data to separate normal values from outliers during training. For prediction, the detector adds each value in the testing series, one at a time, to the set of normal values from the training series, and performs the generalized ESD test on this set (all normal values from the training series, plus one value from the testing series) to evaluate whether the value of interest is an outlier.

Note that a key assumption of the generalized ESD test is that normal values follow an approximately normal distribution. Only use this detector when this assumption holds.

This is a univariate detector. When it is applied to a multivariate time series (i.e. pandas DataFrame), it will be applied to every series independently. All parameters can be defined as a dict object where key-value pairs are series names (i.e. column names of DataFrame) and the model parameter for that series. If not, then the same parameter will be applied to all series.

[1] Rosner, Bernard (May 1983), Percentage Points for a Generalized ESD Many-Outlier Procedure, Technometrics, 25(2), pp. 165-172.

[2] https://www.itl.nist.gov/div898/handbook/eda/section3/eda35h3.htm

Parameters: alpha (float, optional) – Significance level. Default: 0.05.
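
For reference, the generalized ESD test as formulated in [2] can be summarized as below. The symbols n (sample size), r (upper bound on the number of outliers) and t (percentage point of the t distribution) are notation from that reference, not ADTK parameter names; only alpha corresponds to the parameter above. How the detector chooses r is not stated here.

```latex
% Step i = 1, ..., r: compute the statistic on the n - i + 1 values that
% remain, then remove the value that maximizes |x_j - \bar{x}|.
R_i = \frac{\max_j \lvert x_j - \bar{x} \rvert}{s}

% Critical value for step i, with t_{p,\nu} the 100p percentage point of
% the t distribution with \nu degrees of freedom:
\lambda_i = \frac{(n - i)\, t_{p,\,n-i-1}}
                 {\sqrt{\left(n - i - 1 + t_{p,\,n-i-1}^{2}\right)(n - i + 1)}},
\qquad
p = 1 - \frac{\alpha}{2\,(n - i + 1)}

% The number of outliers is the largest i for which R_i > \lambda_i.
```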
detect(ts, return_list=False)

Detect anomalies from given time series.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit(ts)

Train the detector with given time series.

Parameters: ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
fit_detect(ts, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to be used for training and be detected for anomalies. If a DataFrame with k columns, k univariate detectors will be trained and applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit_predict(ts, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns: Model parameters.
Return type: dict
predict(ts, return_list=False)

Alias of detect.

score(ts, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –

    True anomalies.

    • If Series, it is a series of binary labels indicating which time points are anomalous;
    • If DataFrame, each column is considered as an independent type of anomaly;
    • If list, it is a list of anomalous events in the form of time points (pandas.Timestamp) or time windows (2-tuple of time stamps);
    • If a dict of lists, each value is considered as an independent type of anomaly.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, or “iou”. See module metrics for more information. Default: “recall”.
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score(s) for each type of anomaly.

Return type:

float or dict

set_params(**kwargs)

Set parameters of this model.

Parameters: **kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.PersistAD(window=1, c=3.0, side='both', min_periods=None, agg='median')

Detector that detects anomalies based on values in a preceding period.

This detector compares time series values with the values of their preceding time windows, and identifies a time point as anomalous if the change of value from its preceding average or median is beyond a threshold based on historical interquartile range.

This detector is internally implemented as a Pipenet object. Advanced users may learn more details by checking attribute pipe_.

This is a univariate detector. When it is applied to a multivariate time series (i.e. pandas DataFrame), it will be applied to every series independently. All parameters can be defined as a dict object where key-value pairs are series names (i.e. column names of DataFrame) and the model parameter for that series. If not, then the same parameter will be applied to all series.

Parameters:
  • window (int, optional) – Number of time points in the time window. Default: 1.
  • c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 3.0.
  • side (str, optional) – If “both”, to detect anomalous positive and negative changes; If “positive”, to only detect anomalous positive changes; If “negative”, to only detect anomalous negative changes. Default: “both”.
  • min_periods (int, optional) – Minimum number of observations in each window required to have a value for that window. Default: None, i.e. all observations must have values.
  • agg (str, optional) – Aggregation operation of the time window, either “mean” or “median”. Default: “median”.
pipe_

Internal pipenet object.

Type: adtk.pipe.Pipenet
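
The idea of comparing each value with its preceding window can be sketched in plain Python. This is a simplified illustration, not the internal Pipenet: persist_flags and quantile are hypothetical names, the detector is fitted and applied on the same series, the handling of min_periods is omitted, and the exact way the pipeline measures and thresholds the change may differ.

```python
import math
from statistics import mean, median


def quantile(values, q):
    """Quantile by linear interpolation between order statistics (assumed rule)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def persist_flags(values, window=1, c=3.0, side="both", agg="median"):
    """Flag points whose change from the preceding window is unusually large."""
    agg_fn = median if agg == "median" else mean
    # Change of each value from the aggregate of the `window` points before it;
    # the first `window` points have no complete preceding window.
    changes = [None] * window + [
        values[i] - agg_fn(values[i - window:i])
        for i in range(window, len(values))
    ]
    known = [ch for ch in changes if ch is not None]
    q1, q3 = quantile(known, 0.25), quantile(known, 0.75)
    lower, upper = q1 - c * (q3 - q1), q3 + c * (q3 - q1)
    flags = []
    for ch in changes:
        positive = ch is not None and ch > upper
        negative = ch is not None and ch < lower
        if side == "positive":
            flags.append(positive)
        elif side == "negative":
            flags.append(negative)
        else:
            flags.append(positive or negative)
    return flags


series = [10.0, 10.0, 11.0, 10.0, 10.0, 11.0, 10.0, 30.0, 10.0, 10.0, 11.0, 10.0]
both = [i for i, f in enumerate(persist_flags(series)) if f]
up_only = [i for i, f in enumerate(persist_flags(series, side="positive")) if f]
```

In this series the spike at position 7 is an anomalous positive change and the return to the usual level at position 8 is an anomalous negative change, so side="both" reports both positions while side="positive" reports only the spike.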
detect(ts, return_list=False)

Detect anomalies from given time series.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit(ts)

Train the detector with given time series.

Parameters: ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
fit_detect(ts, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to be used for training and be detected for anomalies. If a DataFrame with k columns, k univariate detectors will be trained and applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit_predict(ts, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns: Model parameters.
Return type: dict
predict(ts, return_list=False)

Alias of detect.

score(ts, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –

    True anomalies.

    • If Series, it is a series of binary labels indicating which time points are anomalous;
    • If DataFrame, each column is considered as an independent type of anomaly;
    • If list, it is a list of anomalous events in the form of time points (pandas.Timestamp) or time windows (2-tuple of time stamps);
    • If a dict of lists, each value is considered as an independent type of anomaly.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, or “iou”. See module metrics for more information. Default: “recall”.
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score(s) for each type of anomaly.

Return type:

float or dict

set_params(**kwargs)

Set parameters of this model.

Parameters: **kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.LevelShiftAD(window=10, c=6.0, side='both', min_periods=None)

Detector that detects level shift of time series values.

This detector compares median values inside time windows next to each other, and identifies a time point as a level shift point if the difference between the time windows on its left side and its right side is beyond a threshold based on historical interquartile range.

This detector is internally implemented as a Pipenet object. Advanced users may learn more details by checking attribute pipe_.

This is a univariate detector. When it is applied to a multivariate time series (i.e. pandas DataFrame), it will be applied to every series independently. All parameters can be defined as a dict object where key-value pairs are series names (i.e. column names of DataFrame) and the model parameter for that series. If not, then the same parameter will be applied to all series.

Parameters:
  • window (int, optional) – Number of time points in each time window. Default: 10.
  • c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 6.0.
  • side (str, optional) – If “both”, to detect anomalous positive and negative changes; If “positive”, to only detect anomalous positive changes; If “negative”, to only detect anomalous negative changes. Default: “both”.
  • min_periods (int, optional) – Minimum number of observations in each window required to have a value for that window. Default: None, i.e. all observations must have values.
pipe_

Internal pipenet object.

Type: adtk.pipe.Pipenet
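
The comparison of adjacent windows can be sketched in plain Python. This is a simplified illustration, not the internal Pipenet: level_shift_flags and quantile are hypothetical names, the detector is fitted and applied on the same series, only the side="both" behavior is shown, and which points belong to the left and right windows is an assumption.

```python
import math
from statistics import median


def quantile(values, q):
    """Quantile by linear interpolation between order statistics (assumed rule)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def level_shift_flags(values, window=10, c=6.0):
    """Flag points where the medians of the two adjacent windows differ a lot."""
    n = len(values)
    diffs = [None] * n  # undefined where either window would be incomplete
    for i in range(window, n - window + 1):
        left = values[i - window:i]    # the `window` points before position i
        right = values[i:i + window]   # position i and the points after it
        diffs[i] = median(right) - median(left)
    known = [d for d in diffs if d is not None]
    q1, q3 = quantile(known, 0.25), quantile(known, 0.75)
    lower, upper = q1 - c * (q3 - q1), q3 + c * (q3 - q1)
    return [d is not None and (d < lower or d > upper) for d in diffs]


low_level = [1.0, 1.1, 0.9, 1.2] * 3
high_level = [5.0, 5.1, 4.9, 5.2] * 3
flags = level_shift_flags(low_level + high_level, window=3)
shift_points = [i for i, f in enumerate(flags) if f]
```

The level changes at position 12, and the positions immediately around it are flagged as well because their windows already straddle the change.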
detect(ts, return_list=False)

Detect anomalies from given time series.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit(ts)

Train the detector with given time series.

Parameters: ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
fit_detect(ts, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to be used for training and be detected for anomalies. If a DataFrame with k columns, k univariate detectors will be trained and applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit_predict(ts, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns: Model parameters.
Return type: dict
predict(ts, return_list=False)

Alias of detect.

score(ts, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –

    True anomalies.

    • If Series, it is a series of binary labels indicating which time points are anomalous;
    • If DataFrame, each column is considered as an independent type of anomaly;
    • If list, it is a list of anomalous events in the form of time points (pandas.Timestamp) or time windows (2-tuple of time stamps);
    • If a dict of lists, each value is considered as an independent type of anomaly.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, or “iou”. See module metrics for more information. Default: “recall”.
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score(s) for each type of anomaly.

Return type:

float or dict

set_params(**kwargs)

Set parameters of this model.

Parameters: **kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.VolatilityShiftAD(window=10, c=6.0, side='both', min_periods=None, agg='std')

Detector that detects shifts in the volatility level of a time series.

This detector compares standard deviations inside time windows next to each other, and identifies a time point as a volatility shift point if the change from the time window on its left side to the one on its right side is beyond a threshold based on historical interquartile range.

This detector is internally implemented as a Pipenet object. Advanced users may learn more details by checking attribute pipe_.

This is a univariate detector. When it is applied to a multivariate time series (i.e. pandas DataFrame), it will be applied to every series independently. All parameters can be defined as a dict object where key-value pairs are series names (i.e. column names of DataFrame) and the model parameter for that series. If not, then the same parameter will be applied to all series.

Parameters:
  • window (int, optional) – Number of time points in each time window. Default: 10.
  • c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 6.0.
  • side (str, optional) – If “both”, to detect anomalous positive and negative changes; If “positive”, to only detect anomalous positive changes; If “negative”, to only detect anomalous negative changes. Default: “both”.
  • min_periods (int, optional) – Minimum number of observations in each window required to have a value for that window. Default: None, i.e. all observations must have values.
  • agg (str, optional) – Aggregation operation of the time window, one of “std”, “iqr” or “idr”. Default: “std”.
pipe_

Internal pipenet object.

Type: adtk.pipe.Pipenet
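
The window comparison behind this detector can be sketched in plain Python. This is a simplified illustration of the first step only, not the internal Pipenet: volatility_changes is a hypothetical name, the change is measured here as a plain difference of sample standard deviations (the pipeline may measure it differently), and the interquartile-range thresholding of the resulting changes is left out.

```python
from statistics import stdev


def volatility_changes(values, window=10):
    """Change in standard deviation from the left window to the right window."""
    n = len(values)
    changes = [None] * n  # undefined where either window would be incomplete
    for i in range(window, n - window + 1):
        left = values[i - window:i]    # the `window` points before position i
        right = values[i:i + window]   # position i and the points after it
        changes[i] = stdev(right) - stdev(left)
    return changes


calm = [0.9, 1.1] * 6    # small fluctuations
wild = [0.0, 4.0] * 6    # same mean region, much larger fluctuations
changes = volatility_changes(calm + wild, window=4)

# Position with the largest increase in volatility.
defined = [i for i, ch in enumerate(changes) if ch is not None]
peak = max(defined, key=lambda i: changes[i])
```

The change is close to zero while both windows lie in the same regime and is largest at position 12, where the left window is entirely calm and the right window entirely wild.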
detect(ts, return_list=False)

Detect anomalies from given time series.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict
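The difference between the binary-series form and the list form can be illustrated with a small sketch. It assumes that an isolated anomalous point is reported as a single time stamp and that a run of consecutive anomalous points is reported as a (start, end) tuple; the helper name `labels_to_events` is hypothetical, and the library's exact convention for the end of a window may differ.

```python
from datetime import datetime, timedelta


def labels_to_events(timestamps, labels):
    """Collapse runs of consecutive anomalous points into events."""
    events = []
    run = []  # time stamps of the anomalous run currently being collected

    def close_run():
        if run:
            events.append(run[0] if len(run) == 1 else (run[0], run[-1]))
            run.clear()

    for ts, is_anomaly in zip(timestamps, labels):
        if is_anomaly:
            run.append(ts)
        else:
            close_run()
    close_run()  # a run may extend to the end of the series
    return events


start = datetime(2020, 1, 1)
timestamps = [start + timedelta(hours=h) for h in range(6)]
labels = [0, 1, 0, 1, 1, 1]
events = labels_to_events(timestamps, labels)
# events holds one single time stamp (hour 1) and one window (hours 3 to 5)
```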

fit(ts)

Train the detector with given time series.

Parameters:ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
fit_detect(ts, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to be used for training and be detected for anomalies. If a DataFrame with k columns, k univariate detectors will be trained and applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit_predict(ts, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns:Model parameters.
Return type:dict
predict(ts, return_list=False)

Alias of detect.

score(ts, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –

    True anomalies.

    • If Series, it is a series of binary labels indicating whether each time point is anomalous;
    • If DataFrame, each column is considered as an independent type of anomaly;
    • If list, it is a list of anomalous events in form of time points (pandas.Timestamp) or time windows (2-tuple of time stamps);
    • If a dict of lists, each value is considered as an independent type of anomaly.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score(s) for each type of anomaly.

Return type:

float or dict
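For binary label series, the four scoring options can be read as the usual point-wise definitions. The sketch below assumes those standard definitions (true positives, false positives and false negatives counted per time point); the helper name `point_scores` is hypothetical, and the metrics module may treat lists of events differently.

```python
def point_scores(true_labels, pred_labels):
    """Point-wise recall, precision, F1 and IoU for two binary label sequences."""
    pairs = list(zip(true_labels, pred_labels))
    tp = sum(1 for t, p in pairs if t and p)        # anomalous and detected
    fp = sum(1 for t, p in pairs if not t and p)    # normal but detected
    fn = sum(1 for t, p in pairs if t and not p)    # anomalous but missed
    nan = float("nan")
    return {
        "recall": tp / (tp + fn) if tp + fn else nan,
        "precision": tp / (tp + fp) if tp + fp else nan,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else nan,
        "iou": tp / (tp + fp + fn) if tp + fp + fn else nan,
    }


anomaly_true = [0, 1, 1, 0, 1, 0]
anomaly_pred = [0, 1, 0, 0, 1, 1]
scores = point_scores(anomaly_true, anomaly_pred)
# two hits, one false alarm, one miss: recall = precision = f1 = 2/3, iou = 1/2
```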

set_params(**kwargs)

Set parameters of this model.

Parameters:**kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.AutoregressionAD(n_steps=1, step_size=1, regressor=None, c=3.0, side='both')[source]

Detector that detects anomalous autoregression property in time series.

Many time series have autoregressive behavior. For example, in a linear autoregressive time series, the current value is a linear combination of several previous values. Violation of the usual autoregressive behavior may indicate an anomaly.

The detector applies a regressor to learn the autoregression property of the time series, and identifies a time point as anomalous when the residual of autoregression is beyond a threshold based on historical interquartile range.

This detector is internally implemented as a Pipenet object. Advanced users may learn more details by checking attribute pipe_.

This is a univariate detector. When it is applied to a multivariate time series (i.e. pandas DataFrame), it will be applied to every series independently. All parameters can be defined as a dict object where key-value pairs are series names (i.e. column names of DataFrame) and the model parameter for that series. If not, then the same parameter will be applied to all series.

Parameters:
  • n_steps (int, optional) – Number of steps (previous values) to include in the model. Default: 1.
  • step_size (int, optional) – Length of a step. For example, if n_steps=2, step_size=3, X_[t-3] and X_[t-6] will be used to predict X_[t]. Default: 1.
  • regressor (object, optional) – Regressor to be used. Same as a scikit-learn regressor, it should minimally have fit and predict methods. If not given, a linear regressor will be used.
  • c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 3.0.
  • side (str, optional) – If “both”, to detect anomalous positive and negative residuals; If “positive”, to only detect anomalous positive residuals; If “negative”, to only detect anomalous negative residuals. Default: “both”.
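How n_steps and step_size select the predictors can be shown with a short sketch. It only builds the lagged inputs that a regressor would be fitted on; the helper name `lagged_rows` is hypothetical and this is not the library's own code.

```python
def lagged_rows(series, n_steps=1, step_size=1):
    """Return (t, predictors, target) for every t that has all of its lags.

    The predictors are X[t - step_size], X[t - 2 * step_size], ...,
    X[t - n_steps * step_size]; the target is X[t].
    """
    max_lag = n_steps * step_size
    rows = []
    for t in range(max_lag, len(series)):
        lags = [series[t - k * step_size] for k in range(1, n_steps + 1)]
        rows.append((t, lags, series[t]))
    return rows


series = list(range(100, 110))  # X[0] = 100, ..., X[9] = 109
rows = lagged_rows(series, n_steps=2, step_size=3)
# the first usable time point is t = 6, predicted from X[3] and X[0]
```

A regressor with fit and predict methods would then be trained on the predictor lists, and the residual (target minus prediction) compared against the interquartile-range bounds.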
pipe_

Internal pipenet object.

Type:adtk.pipe.Pipenet
detect(ts, return_list=False)

Detect anomalies from given time series.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit(ts)

Train the detector with given time series.

Parameters:ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
fit_detect(ts, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to be used for training and be detected for anomalies. If a DataFrame with k columns, k univariate detectors will be trained and applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit_predict(ts, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns:Model parameters.
Return type:dict
predict(ts, return_list=False)

Alias of detect.

score(ts, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –

    True anomalies.

    • If Series, it is a series of binary labels indicating whether each time point is anomalous;
    • If DataFrame, each column is considered as an independent type of anomaly;
    • If list, it is a list of anomalous events in form of time points (pandas.Timestamp) or time windows (2-tuple of time stamps);
    • If a dict of lists, each value is considered as an independent type of anomaly.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score(s) for each type of anomaly.

Return type:

float or dict

set_params(**kwargs)

Set parameters of this model.

Parameters:**kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.SeasonalAD(freq=None, side='both', c=3.0, trend=False)[source]

Detector that detects anomalous values away from seasonal pattern.

This detector uses a seasonal decomposition transformer to remove the seasonal pattern (and, optionally, the trend), and identifies a time point as anomalous when the residual of seasonal decomposition is beyond a threshold based on historical interquartile range.

This detector is internally implemented as a Pipenet object. Advanced users may learn more details by checking attribute pipe_.

This is a univariate detector. When it is applied to a multivariate time series (i.e. pandas DataFrame), it will be applied to every series independently. All parameters can be defined as a dict object where key-value pairs are series names (i.e. column names of DataFrame) and the model parameter for that series. If not, then the same parameter will be applied to all series.

Parameters:
  • freq (int, optional) – Length of a seasonal cycle. If not given, the model will determine automatically based on autocorrelation of the training series. Default: None.
  • c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 3.0.
  • side (str, optional) – If “both”, to detect anomalous positive and negative residuals; If “positive”, to only detect anomalous positive residuals; If “negative”, to only detect anomalous negative residuals. Default: “both”.
  • trend (bool, optional) – Whether to extract trend during decomposition. Only used when classic seasonal decomposition is applied. Default: False.
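The idea of removing a seasonal pattern and inspecting the residual can be sketched as follows. This is an assumption-laden illustration rather than the library's decomposition: the seasonal pattern is taken as the mean of each phase of the cycle, no trend is extracted, and the helper name `seasonal_residuals` is hypothetical.

```python
import statistics


def seasonal_residuals(series, freq):
    """Estimate a per-phase seasonal pattern and subtract it from the series."""
    # Mean of all values that share the same position within the cycle.
    pattern = [statistics.fmean(series[phase::freq]) for phase in range(freq)]
    residuals = [x - pattern[i % freq] for i, x in enumerate(series)]
    return pattern, residuals


cycle = [1.0, 5.0, 2.0, 0.0]
series = cycle * 6          # six clean cycles of length 4
series[13] += 12.0          # inject one anomalous value
pattern, residuals = seasonal_residuals(series, freq=4)
worst = max(range(len(series)), key=lambda i: abs(residuals[i]))
```

Here `worst` points at the injected value. Note that a mean-based pattern is itself pulled towards the anomaly, which is one reason the residuals are compared with bounds derived from the interquartile range rather than with a fixed cutoff.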
freq_

Length of seasonal cycle. Equal to parameter freq if it is given. Otherwise, calculated based on autocorrelation of the training series.

Type:int
seasonal_

Seasonal pattern extracted from training series.

Type:pandas.Series
pipe_

Internal pipenet object.

Type:adtk.pipe.Pipenet
detect(ts, return_list=False)

Detect anomalies from given time series.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit(ts)

Train the detector with given time series.

Parameters:ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
fit_detect(ts, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to be used for training and be detected for anomalies. If a DataFrame with k columns, k univariate detectors will be trained and applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit_predict(ts, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns:Model parameters.
Return type:dict
predict(ts, return_list=False)

Alias of detect.

score(ts, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –

    True anomalies.

    • If Series, it is a series of binary labels indicating whether each time point is anomalous;
    • If DataFrame, each column is considered as an independent type of anomaly;
    • If list, it is a list of anomalous events in form of time points (pandas.Timestamp) or time windows (2-tuple of time stamps);
    • If a dict of lists, each value is considered as an independent type of anomaly.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score(s) for each type of anomaly.

Return type:

float or dict

set_params(**kwargs)

Set parameters of this model.

Parameters:**kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.CustomizedDetector1D(detect_func=None, detect_func_params=None, fit_func=None, fit_func_params=None)[source]

Detector derived from a user-given function and parameters.

This is a univariate detector. When it is applied to a multivariate time series (i.e. pandas DataFrame), it will be applied to every series independently. All parameters can be defined as a dict object where key-value pairs are series names (i.e. column names of DataFrame) and the model parameter for that series. If not, then the same parameter will be applied to all series.

Parameters:
  • detect_func (function) – A function detecting anomalies from a given time series. The first input argument must be a pandas Series; additional optional arguments are allowed. The output must be a binary pandas Series with the same index as the input.
  • detect_func_params (dict, optional) – Parameters of detect_func. Default: None.
  • fit_func (function, optional) – A function that learns from a list of time series and returns a dict of parameters that detect_func can use for future detection. Default: None.
  • fit_func_params (dict, optional) – Parameters of fit_func. Default: None.
detect(ts, return_list=False)

Detect anomalies from given time series.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit(ts)

Train the detector with given time series.

Parameters:ts (pandas.Series or pandas.DataFrame) – Time series to be used to train the detector. If a DataFrame with k columns, k univariate detectors will be trained independently.
fit_detect(ts, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to be used for training and be detected for anomalies. If a DataFrame with k columns, k univariate detectors will be trained and applied to them independently.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If input is a Series and return_list=False, return a Series;
  • If input is a DataFrame and return_list=False, return a DataFrame, where each column corresponds to a column in input;
  • If input is a Series and return_list=True, return a list of time stamps or time stamp tuples;
  • If input is a DataFrame and return_list=True, return a dict of lists, where each key-value pair corresponds to a column in input.

Return type:

pandas.Series, pandas.DataFrame, list, or dict

fit_predict(ts, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns:Model parameters.
Return type:dict
predict(ts, return_list=False)

Alias of detect.

score(ts, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • ts (pandas.Series or pandas.DataFrame) – Time series to detect anomalies from. If a DataFrame with k columns, k univariate detectors will be applied to them independently.
  • anomaly_true (pandas.Series, pandas.DataFrame, list, or dict) –

    True anomalies.

    • If Series, it is a series of binary labels indicating whether each time point is anomalous;
    • If DataFrame, each column is considered as an independent type of anomaly;
    • If list, it is a list of anomalous events in form of time points (pandas.Timestamp) or time windows (2-tuple of time stamps);
    • If a dict of lists, each value is considered as an independent type of anomaly.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score(s) for each type of anomaly.

Return type:

float or dict

set_params(**kwargs)

Set parameters of this model.

Parameters:**kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.MinClusterDetector(model=None)[source]

Detector that detects anomalies based on clustering of historical data.

This detector performs clustering using a clustering model, and identifies a time point as anomalous if it belongs to the minimal cluster.

Parameters:model (object) – A clustering model to be used for clustering time series values. Same as a clustering model in scikit-learn, the model should minimally have a fit method and a predict method. The predict method should return an array of cluster labels.
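Once the clustering model has produced one cluster label per time point, the "minimal cluster" rule reduces to finding the least populated label. The sketch below assumes that reading; the helper name `flag_minimal_cluster` is hypothetical, and how ties between equally small clusters are resolved in the library is not covered here.

```python
from collections import Counter


def flag_minimal_cluster(cluster_labels):
    """Mark every point that belongs to the least populated cluster."""
    counts = Counter(cluster_labels)
    smallest = min(counts, key=counts.get)  # on ties, the first label seen wins
    return [label == smallest for label in cluster_labels]


# Labels as a clustering model's predict method might return them.
cluster_labels = [0, 0, 1, 0, 0, 2, 2, 0, 2]
flags = flag_minimal_cluster(cluster_labels)
# cluster 1 has a single member, so only position 2 is flagged
```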
detect(df, return_list=False)

Detect anomalies from given time series.

Parameters:
  • df (pandas.DataFrame) – Time series to detect anomalies from.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If return_list=False, return a binary series;
  • If return_list=True, return a list of time stamps or time stamp tuples.

Return type:

pandas.Series or list

fit(df)

Train the detector with given time series.

Parameters:df (pandas.DataFrame) – Time series to be used to train the detector.
fit_detect(df, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • df (pandas.DataFrame) – Time series to be used for training and be detected for anomalies.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If return_list=False, return a binary series;
  • If return_list=True, return a list of time stamps or time stamp tuples.

Return type:

pandas.Series or list

fit_predict(df, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns:Model parameters.
Return type:dict
predict(df, return_list=False)

Alias of detect.

score(df, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • df (pandas.DataFrame) – Time series to detect anomalies from.
  • anomaly_true (Series, or a list of Timestamps or Timestamp tuple) –

    True anomalies.

    • If Series, it is a series of binary labels indicating whether each time point is anomalous;
    • If list, it is a list of anomalous events in form of time windows.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score of detection result.

Return type:

float

set_params(**kwargs)

Set parameters of this model.

Parameters:**kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.OutlierDetector(model=None)[source]

Detector that detects anomalies based on an outlier detection model.

This detector performs time-independent outlier detection using the given model, and identifies a time point as anomalous if it is labelled as an outlier.

Parameters:model (object) – An outlier detection model to be used. Same as an outlier detection model in scikit-learn (e.g. EllipticEnvelope, IsolationForest, LocalOutlierFactor), the model should minimally have a fit_predict method, or fit and predict methods. The fit_predict or predict method should return an array of outlier indicators where outliers are marked by -1.
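The model contract described above (a fit_predict method returning -1 for outliers) can be illustrated with a toy stand-in. Both `MedianDistanceModel` and `to_binary_labels` are hypothetical names made up for this sketch; a real use would pass a scikit-learn model instead.

```python
import statistics


class MedianDistanceModel:
    """Toy outlier model exposing a scikit-learn style fit_predict method."""

    def __init__(self, max_distance):
        self.max_distance = max_distance

    def fit_predict(self, rows):
        # Judge each row by the distance of its first feature from the median.
        # Row order plays no role, which is what "time-independent" means here.
        center = statistics.median([row[0] for row in rows])
        return [-1 if abs(row[0] - center) > self.max_distance else 1
                for row in rows]


def to_binary_labels(indicators):
    """Translate outlier indicators (-1 = outlier) into anomaly flags."""
    return [value == -1 for value in indicators]


rows = [(1.0,), (1.2,), (0.9,), (9.0,), (1.1,)]
model = MedianDistanceModel(max_distance=3.0)
labels = to_binary_labels(model.fit_predict(rows))
```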
detect(df, return_list=False)

Detect anomalies from given time series.

Parameters:
  • df (pandas.DataFrame) – Time series to detect anomalies from.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If return_list=False, return a binary series;
  • If return_list=True, return a list of time stamps or time stamp tuples.

Return type:

pandas.Series or list

fit(df)

Train the detector with given time series.

Parameters:df (pandas.DataFrame) – Time series to be used to train the detector.
fit_detect(df, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • df (pandas.DataFrame) – Time series to be used for training and be detected for anomalies.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If return_list=False, return a binary series;
  • If return_list=True, return a list of time stamps or time stamp tuples.

Return type:

pandas.Series or list

fit_predict(df, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns:Model parameters.
Return type:dict
predict(df, return_list=False)

Alias of detect.

score(df, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • df (pandas.DataFrame) – Time series to detect anomalies from.
  • anomaly_true (Series, or a list of Timestamps or Timestamp tuple) –

    True anomalies.

    • If Series, it is a series of binary labels indicating whether each time point is anomalous;
    • If list, it is a list of anomalous events in form of time windows.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score of detection result.

Return type:

float

set_params(**kwargs)

Set parameters of this model.

Parameters:**kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.RegressionAD(target=None, regressor=None, c=3.0, side='both')[source]

Detector that detects anomalous inter-series relationship.

This detector performs regression to build relationship between a target series and the rest of series, and identifies a time point as anomalous when the residual of regression is beyond a threshold based on historical interquartile range.

This detector is internally implemented as a Pipenet object. Advanced users may learn more details by checking attribute pipe_.

Parameters:
  • target (str, optional) – Name of the column to be regarded as target variable. If not specified, the first column in input DataFrame will be used.
  • regressor (object) – Regressor to be used. Same as a scikit-learn regressor, it should minimally have fit and predict methods.
  • c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 3.0.
  • side (str, optional) – If “both”, to detect anomalous positive and negative residuals; If “positive”, to only detect anomalous positive residuals; If “negative”, to only detect anomalous negative residuals. Default: “both”.
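The residual-based rule can be sketched for the simplest case of one target series and one other series. This sketch assumes ordinary least squares as the regressor and computes the interquartile-range bounds from the same data it checks; the helper names `fit_line` and `regression_anomalies` are hypothetical, and the library's internals may differ.

```python
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def regression_anomalies(target, other, c=3.0):
    """Flag points whose regression residual leaves the IQR-based range."""
    slope, intercept = fit_line(other, target)
    residuals = [t - (slope * o + intercept) for t, o in zip(target, other)]
    q1, _, q3 = statistics.quantiles(residuals, n=4, method="inclusive")
    low, high = q1 - c * (q3 - q1), q3 + c * (q3 - q1)
    return [r < low or r > high for r in residuals]


other = [float(i) for i in range(20)]
# The target follows 2 * other + 1 with a small deterministic wobble.
target = [2.0 * o + 1.0 + 0.1 * ((i % 3) - 1) for i, o in enumerate(other)]
target[12] += 15.0  # break the relationship at one time point
flags = regression_anomalies(target, other, c=3.0)
```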
pipe_

Internal pipenet object.

Type:adtk.pipe.Pipenet
detect(df, return_list=False)

Detect anomalies from given time series.

Parameters:
  • df (pandas.DataFrame) – Time series to detect anomalies from.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If return_list=False, return a binary series;
  • If return_list=True, return a list of time stamps or time stamp tuples.

Return type:

pandas.Series or list

fit(df)

Train the detector with given time series.

Parameters:df (pandas.DataFrame) – Time series to be used to train the detector.
fit_detect(df, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • df (pandas.DataFrame) – Time series to be used for training and be detected for anomalies.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If return_list=False, return a binary series;
  • If return_list=True, return a list of time stamps or time stamp tuples.

Return type:

pandas.Series or list

fit_predict(df, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns:Model parameters.
Return type:dict
predict(df, return_list=False)

Alias of detect.

score(df, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • df (pandas.DataFrame) – Time series to detect anomalies from.
  • anomaly_true (Series, or a list of Timestamps or Timestamp tuple) –

    True anomalies.

    • If Series, it is a series of binary labels indicating whether each time point is anomalous;
    • If list, it is a list of anomalous events in form of time windows.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score of detection result.

Return type:

float

set_params(**kwargs)

Set parameters of this model.

Parameters:**kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.PcaAD(k=1, c=5.0)[source]

Detector that detects outlier point with principal component analysis.

This detector applies principal component analysis (PCA) to the multivariate time series (every time point is treated as a point in high-dimensional space), measures the reconstruction error at every time point, and identifies a time point as anomalous when the reconstruction error is beyond a threshold based on historical interquartile range.

This detector is internally implemented as a Pipeline object. Advanced users may learn more details by checking attribute pipe_.

Parameters:
  • k (int, optional) – Number of principal components to use. Default: 1.
  • c (float, optional) – Factor used to determine the bound of normal range based on historical interquartile range. Default: 5.0.
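Reconstruction error can be made concrete for the smallest possible case: two series and k=1. The sketch below uses the closed-form orientation of the leading eigenvector of a 2x2 scatter matrix, so it does not generalize beyond two columns; the helper name `pca_reconstruction_errors` is hypothetical and this is not the library's implementation.

```python
import math
import statistics


def pca_reconstruction_errors(points):
    """Reconstruction error of 2-D points when keeping one principal component."""
    mx = statistics.fmean([p[0] for p in points])
    my = statistics.fmean([p[1] for p in points])
    centered = [(x - mx, y - my) for x, y in points]
    sxx = sum(dx * dx for dx, _ in centered)
    syy = sum(dy * dy for _, dy in centered)
    sxy = sum(dx * dy for dx, dy in centered)
    # Orientation of the leading eigenvector of [[sxx, sxy], [sxy, syy]].
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)
    errors = []
    for dx, dy in centered:
        score = dx * ux + dy * uy  # coordinate along the first component
        # Distance between the point and its projection onto that component.
        errors.append(math.hypot(dx - score * ux, dy - score * uy))
    return errors


# Ten time points where the second series is twice the first, plus one that is not.
points = [(float(i), 2.0 * i) for i in range(10)] + [(5.0, 20.0)]
errors = pca_reconstruction_errors(points)
worst = max(range(len(points)), key=lambda i: errors[i])
```

The last point breaks the relationship between the two series and therefore has the largest error. In the detector, such errors would then be compared with the bound controlled by c.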
pipe_

Internal pipenet object.

Type:adtk.pipe.Pipenet
detect(df, return_list=False)

Detect anomalies from given time series.

Parameters:
  • df (pandas.DataFrame) – Time series to detect anomalies from.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If return_list=False, return a binary series;
  • If return_list=True, return a list of time stamps or time stamp tuples.

Return type:

pandas.Series or list

fit(df)

Train the detector with given time series.

Parameters:df (pandas.DataFrame) – Time series to be used to train the detector.
fit_detect(df, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • df (pandas.DataFrame) – Time series to be used for training and be detected for anomalies.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If return_list=False, return a binary series;
  • If return_list=True, return a list of time stamps or time stamp tuples.

Return type:

pandas.Series or list

fit_predict(df, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns:Model parameters.
Return type:dict
predict(df, return_list=False)

Alias of detect.

score(df, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • df (pandas.DataFrame) – Time series to detect anomalies from.
  • anomaly_true (Series, or a list of Timestamps or Timestamp tuple) –

    True anomalies.

    • If Series, it is a series of binary labels indicating whether each time point is anomalous;
    • If list, it is a list of anomalous events in form of time windows.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score of detection result.

Return type:

float

set_params(**kwargs)

Set parameters of this model.

Parameters:**kwargs – Model parameters to set. If empty, then all parameters will be reset to default values.
class adtk.detector.CustomizedDetectorHD(detect_func=None, detect_func_params=None, fit_func=None, fit_func_params=None)[source]

Detector derived from a user-given function and parameters.

Parameters:
  • detect_func (function) – A function that detects anomalies from a given time series. Its first argument must be a pandas DataFrame; additional optional arguments are allowed. The output must be a binary pandas Series with the same index as the input.
  • detect_func_params (dict, optional) – Parameters of detect_func. Default: None.
  • fit_func (function, optional) – A function that learns from a list of time series and returns a dict of parameters that detect_func can use for future detection. Default: None.
  • fit_func_params (dict, optional) – Parameters of fit_func. Default: None.
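
As a minimal sketch of the two callables described above, the following defines a hypothetical detect_func that flags rows whose column sum exceeds a threshold, and a hypothetical fit_func that learns that threshold from training data. The names detect_row_sum, fit_row_sum, and build_detector, the threshold and quantile parameters, and the sample data are all illustrative. The sketch also assumes that the dict returned by fit_func is forwarded to detect_func as keyword arguments, which this reference does not state explicitly.

```python
import pandas as pd


def detect_row_sum(df: pd.DataFrame, threshold: float = 10.0) -> pd.Series:
    """Hypothetical detect_func: flag rows whose column sum exceeds threshold."""
    # The result is a binary (boolean) Series sharing the index of the input.
    return df.sum(axis=1) > threshold


def fit_row_sum(df: pd.DataFrame, quantile: float = 0.95) -> dict:
    """Hypothetical fit_func: learn a threshold from the training data."""
    # The key matches the keyword argument of detect_row_sum (an assumption
    # about how the learned parameters are handed over).
    return {"threshold": float(df.sum(axis=1).quantile(quantile))}


def build_detector():
    """Hypothetical helper wiring the callables together (requires adtk)."""
    from adtk.detector import CustomizedDetectorHD

    return CustomizedDetectorHD(
        detect_func=detect_row_sum,
        fit_func=fit_row_sum,
        fit_func_params={"quantile": 0.75},
    )


index = pd.date_range("2020-01-01", periods=5, freq="D")
df = pd.DataFrame({"a": [1, 2, 3, 4, 50], "b": [1, 1, 1, 1, 40]}, index=index)

# Calling the functions directly shows the contract each one must satisfy.
params = fit_row_sum(df, quantile=0.75)
labels = detect_row_sum(df, **params)
```

Exercising the functions on their own, before passing them to the detector, is a quick way to check that detect_func returns a Series aligned with the input index.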
detect(df, return_list=False)

Detect anomalies from given time series.

Parameters:
  • df (pandas.DataFrame) – Time series to detect anomalies from.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If return_list=False, return a binary series;
  • If return_list=True, return a list of time stamps or time stamp tuples.

Return type:

pandas.Series or list

fit(df)

Train the detector with given time series.

Parameters:
  • df (pandas.DataFrame) – Time series used to train the detector.
fit_detect(df, return_list=False)

Train the detector and detect anomalies from the time series used for training.

Parameters:
  • df (pandas.DataFrame) – Time series used to train the detector and then searched for anomalies.
  • return_list (bool, optional) – Whether to return a list of anomalous time stamps, or a binary series indicating normal/anomalous. Default: False.
Returns:

Detected anomalies.

  • If return_list=False, return a binary series;
  • If return_list=True, return a list of time stamps or time stamp tuples.

Return type:

pandas.Series or list

fit_predict(df, return_list=False)

Alias of fit_detect.

get_params()

Get parameters of this model.

Returns: Model parameters.
Return type: dict
predict(df, return_list=False)

Alias of detect.

score(df, anomaly_true, scoring='recall', **kwargs)

Detect anomalies and score the results against true anomalies.

Parameters:
  • df (pandas.DataFrame) – Time series to detect anomalies from.
  • anomaly_true (pandas.Series, or list of Timestamps or Timestamp tuples) –

    True anomalies.

    • If a Series, it is a series of binary labels indicating which time points are anomalous;
    • If a list, it is a list of anomalous events in the form of time windows.
  • scoring (str, optional) – Scoring function to use. Must be one of “recall”, “precision”, “f1”, and “iou”. See module metrics for more information. Default: “recall”.
  • **kwargs – Optional parameters for scoring function. See module metrics for more information.
Returns:

Score of detection result.

Return type:

float
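
For the “iou” option, a point-wise intersection-over-union between two binary label series can be sketched as follows. This is the generic set-based definition applied to invented data, and it is an assumption that it matches what module metrics computes for binary series. For lists of time windows the score would have to be based on interval lengths instead.

```python
import pandas as pd

index = pd.date_range("2020-01-01", periods=6, freq="D")
anomaly_true = pd.Series([False, True, True, False, False, True], index=index)
anomaly_pred = pd.Series([False, True, False, False, True, True], index=index)

# Points flagged in both series, and points flagged in at least one of them.
intersection = int((anomaly_true & anomaly_pred).sum())
union = int((anomaly_true | anomaly_pred).sum())

iou = intersection / union
```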

set_params(**kwargs)

Set parameters of this model.

Parameters:
  • **kwargs – Model parameters to set. If empty, all parameters are reset to their default values.