catch22: CAnonical Time-series CHaracteristics: Selected through highly comparative time-series analysis

Data mining and knowledge discovery, 2019-11, Vol.33 (6), p.1821-1852 [Peer Reviewed Journal]

The Author(s) 2019; ISSN: 1384-5810; EISSN: 1573-756X; DOI: 10.1007/s10618-019-00647-x

  • Title:
    catch22: CAnonical Time-series CHaracteristics: Selected through highly comparative time-series analysis
  • Author: Lubba, Carl H. ; Sethi, Sarab S. ; Knaute, Philip ; Schultz, Simon R. ; Fulcher, Ben D. ; Jones, Nick S.
  • Subjects: Artificial Intelligence ; Chemistry and Earth Sciences ; Computer Science ; Data Mining and Knowledge Discovery ; Information Storage and Retrieval ; Physics ; Statistics for Engineering
  • Is Part Of: Data mining and knowledge discovery, 2019-11, Vol.33 (6), p.1821-1852
  • Description: Capturing the dynamical properties of time series concisely as interpretable feature vectors can enable efficient clustering and classification for time-series applications across science and industry. Selecting an appropriate feature-based representation of time series for a given application can be achieved through systematic comparison across a comprehensive time-series feature library, such as those in the hctsa toolbox. However, this approach is computationally expensive and involves evaluating many similar features, limiting the widespread adoption of feature-based representations of time series for real-world applications. In this work, we introduce a method to infer small sets of time-series features that (i) exhibit strong classification performance across a given collection of time-series problems, and (ii) are minimally redundant. Applying our method to a set of 93 time-series classification datasets (containing over 147,000 time series) and using a filtered version of the hctsa feature library (4791 features), we introduce a set of 22 CAnonical Time-series CHaracteristics, catch22, tailored to the dynamics typically encountered in time-series data-mining tasks. This dimensionality reduction, from 4791 to 22, is associated with an approximately 1000-fold reduction in computation time and near linear scaling with time-series length, despite an average reduction in classification accuracy of just 7%. catch22 captures a diverse and interpretable signature of time series in terms of their properties, including linear and non-linear autocorrelation, successive differences, value distributions and outliers, and fluctuation scaling properties. We provide an efficient implementation of catch22, accessible from many programming environments, that facilitates feature-based time-series analysis for scientific, industrial, financial and medical applications using a common language of interpretable time-series properties.
  • Publisher: New York: Springer US
  • Language: English
  • Identifier: ISSN: 1384-5810
    EISSN: 1573-756X
    DOI: 10.1007/s10618-019-00647-x
  • Source: ProQuest Central
    Springer Nature OA Free Journals
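
The description above refers to representing a time series as a short vector of interpretable features. As a rough illustration of that idea only, the following Python sketch computes three toy features loosely inspired by the property families the abstract names (linear autocorrelation, successive differences, outliers). The function names, the three features, and the outlier threshold of 2 standard deviations are assumptions made for this example; they are not the catch22 feature definitions and not the authors' implementation.

```python
from statistics import mean, pstdev


def z_score(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        raise ValueError("a constant series cannot be z-scored")
    return [(x - mu) / sigma for x in series]


def autocorrelation(z, lag):
    """Average product of z-scored values separated by `lag` steps."""
    n = len(z)
    return sum(z[i] * z[i + lag] for i in range(n - lag)) / (n - lag)


def toy_features(series):
    """Hypothetical 3-feature summary; NOT the catch22 feature set."""
    z = z_score(series)
    return {
        # linear autocorrelation at lag 1
        "acf_lag1": autocorrelation(z, 1),
        # mean absolute successive difference of the z-scored series
        "mean_abs_diff": mean(abs(b - a) for a, b in zip(z, z[1:])),
        # fraction of points more than 2 standard deviations from the mean
        "outlier_fraction": sum(abs(v) > 2 for v in z) / len(z),
    }


if __name__ == "__main__":
    alternating = [1, -1] * 10   # rapidly oscillating series
    ramp = list(range(20))       # smooth, slowly varying series
    print(toy_features(alternating))
    print(toy_features(ramp))
```

In this sketch the alternating series yields a negative lag-1 autocorrelation and the ramp a positive one, which is the kind of contrast a feature vector is meant to capture. For real analyses, the implementation provided by the authors would be the appropriate tool.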
