Authors:

Xiaosheng Li, Wenjie Xi, Jessica Lin

Paper:

https://arxiv.org/abs/2408.07956

Introduction

Neural networks are fundamental in machine learning, data mining, and artificial intelligence. Typically, these networks undergo a training phase where their parameters are tuned according to specific learning rules and data. This training often involves backpropagation to optimize an objective function. Once trained, these networks can be deployed for various tasks, including classification, clustering, and regression.

Time series clustering, which involves grouping time series instances into homogeneous groups, is a crucial and challenging task in time series data mining. It has applications in finance, biology, climate, medicine, and more. Existing methods for time series clustering often focus on a single aspect, such as shape or point-to-point distance, which can be suboptimal for certain data types.

In this work, we introduce RandomNet, a novel method for time series clustering using untrained deep neural networks. Unlike conventional methods that adjust network weights using backpropagation, RandomNet utilizes different sets of random parameters to extract various representations of the data. These representations are clustered, and the results are ensembled to produce the final clustering. This approach ensures that data only needs to pass through the networks once, obviating the need for backpropagation, and provides a more efficient solution for time series clustering tasks.

Background and Related Work

Definitions and Notations

  • Definition 1: A time series \( T = [t_1, t_2, \ldots, t_m] \) is an ordered sequence of real-valued data points, where \( m \) is the length of the time series.
  • Definition 2: Given a set of time series \( \{T_i\}_{i=1}^n \) and the number of clusters \( k \), the objective of time series clustering is to assign each time series instance \( T_i \) a group label \( c_j \), where \( j \in \{1, \ldots, k\} \) and \( n \) is the number of instances in the dataset.
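
In code terms, these definitions amount to an \( n \times m \) array of series plus a length-\( n \) vector of cluster labels; a minimal illustration (variable names are ours):

```python
import numpy as np

n, m, k = 100, 60, 3                 # instances, series length, number of clusters
D = np.random.randn(n, m)            # row i is the time series T_i = [t_1, ..., t_m]
C = np.random.randint(0, k, size=n)  # a clustering assigns a label c_j to each T_i
```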

Related Work

Time series clustering methods can be categorized into four groups: raw-data-based methods, feature-based methods, deep-learning-based methods, and others.

  • Raw-data-based methods: These methods directly apply classic clustering algorithms such as k-means to the raw time series. They often struggle with scale variance, phase shifting, distortion, and noise in the data. Dynamic Time Warping (DTW) and its variants are popular distance measures used to address these challenges; a minimal DTW sketch follows this list.
  • Feature-based methods: These methods transform time series into flat, unordered features and then apply classic clustering algorithms. Examples include U-shapelets and Similarity PreservIng RepresentAtion Learning (SPIRAL).
  • Deep-learning-based methods: Many methods in this category adopt the autoencoder architecture for clustering. Examples include Improved Deep Embedded Clustering (IDEC) and Deep Temporal Clustering (DTC).
  • Other methods: These include methods like Symbolic Pattern Forest (SPF), which uses Symbolic Aggregate approXimation (SAX) to transform time series subsequences into symbolic patterns.
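
For reference, here is a minimal dynamic-programming sketch of the classic DTW distance mentioned in the first item (squared-difference local cost, with none of the warping-window constraints or speedups used by its variants):

```python
import numpy as np

def dtw_distance(a, b):
    """Classic DTW between two 1-D series via dynamic programming, O(len(a) * len(b))."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return np.sqrt(cost[n, m])
```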

The Proposed Method

Architecture and Algorithm

RandomNet is structured with multiple branches, each containing a CNN-LSTM block designed to capture both spatial and temporal dependencies of time series, followed by k-means clustering. Each branch produces its own clustering, and a selection mechanism is used to remove skewed or deviant clusterings. The final clustering is obtained by ensembling the selected clusterings using the Hybrid Bipartite Graph Formulation (HBGF).

Algorithm 1: RandomNet

```plaintext
Require: D: time series dataset; B: number of branches; k: number of clusters;
         lr, ur: lower and upper bounds on cluster size; sr: selection rate
Ensure: C: clustering assignment of D

function RandomNet(D, B, k, lr, ur, sr)
    ClusteringSet ← ∅
    for i = 1 to B do
        Randomize(CNN-LSTM blocks)
        Features ← CNN-LSTM blocks(D)
        Clustering ← Kmeans(Features, k)
        ClusteringSet.add(Clustering)
    end for
    SelectedSet ← Selection(ClusteringSet, sr, lr, ur, B)  // Algorithm 2
    C ← Ensemble(SelectedSet)  // Algorithm 3
    return C
end function
```
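
To make the control flow concrete, here is a simplified Python sketch of this loop. It is not the paper's implementation: each branch is a single randomly initialized Conv1d + LSTM, the selection step (Algorithm 2) is omitted, and the HBGF ensemble (Algorithm 3) is replaced by a plain co-association consensus; all function names are ours.

```python
import numpy as np
import torch
from sklearn.cluster import KMeans

def random_branch_features(X, out_channels=16, hidden=32):
    """One branch: a freshly randomized (untrained) Conv1d + LSTM, run once in inference mode."""
    conv = torch.nn.Conv1d(1, out_channels, kernel_size=8, padding="same")
    lstm = torch.nn.LSTM(out_channels, hidden, batch_first=True)
    with torch.no_grad():                        # no backpropagation anywhere
        h = torch.relu(conv(X.unsqueeze(1)))     # (n, channels, m)
        _, (h_n, _) = lstm(h.transpose(1, 2))    # last hidden state as the feature vector
    return h_n.squeeze(0).numpy()                # (n, hidden)

def randomnet_sketch(D, B=50, k=3):
    """Simplified RandomNet loop: B random branches, k-means per branch, consensus."""
    X = torch.tensor(D, dtype=torch.float32)
    coassoc = np.zeros((len(D), len(D)))
    for _ in range(B):
        feats = random_branch_features(X)
        labels = KMeans(n_clusters=k, n_init=10).fit_predict(feats)
        coassoc += labels[:, None] == labels[None, :]
    # cluster the averaged co-association matrix to get the final assignment
    return KMeans(n_clusters=k, n_init=10).fit_predict(coassoc / B)
```

The sketch preserves the property that matters: every branch is used exactly once in inference mode, so the data passes through each network a single time and no gradients are ever computed.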

Effectiveness of RandomNet

RandomNet’s effectiveness is based on the idea that multiple random parameters can generate diverse data representations. Some representations are relevant to the clustering task, while others are not. By ensembling the clustering results derived from these diverse representations, the noise introduced by irrelevant representations tends to cancel out, while the connections provided by relevant representations are strengthened.

Theorem 1

Assume two instances, \( T_1 \) and \( T_2 \), are from the same class. If they reside within the same cluster under some relevant representations, then RandomNet assigns these two instances to the same cluster in the final output.

Theorem 2

Assume the ensemble size to be \( b \). Then, the lower bound of \( b \) needed to provide a good clustering is given by \( -2 \ln \alpha / \gamma^2 \), where \( \gamma \) represents the percentage of relevant representations and \( 1-\alpha \) is the confidence level.
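
To make the bound concrete, plugging in illustrative values (our own, not figures from the paper): with \( \gamma = 0.1 \) and \( \alpha = 0.05 \), we get \( b \ge -2 \ln(0.05) / 0.1^2 \approx 600 \) branches.

```python
import math

def ensemble_size_lower_bound(gamma, alpha):
    """Theorem 2 lower bound on the ensemble size b: -2 ln(alpha) / gamma^2."""
    return math.ceil(-2.0 * math.log(alpha) / gamma ** 2)

# Illustrative values (our assumption): 10% relevant representations, 95% confidence.
print(ensemble_size_lower_bound(gamma=0.10, alpha=0.05))  # -> 600
```

Note how the bound scales: halving \( \gamma \) quadruples the required ensemble size, while tightening the confidence level grows it only logarithmically.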

Experimental Evaluation

Experimental Setup

To evaluate the effectiveness of RandomNet, we run the algorithm on all 128 datasets from the UCR time series archive. These datasets come from different disciplines with various characteristics. We compare RandomNet with several state-of-the-art methods, including kDBA, KSC, k-shape, SPIRAL, SPF, IDEC, DTC, ROCKET, MiniRocket, and MultiRocket.

Hyperparameter Analysis

We conduct a series of experiments to tune the hyperparameters and investigate their influence on model performance. The hyperparameters include the number of branches, the selection rate, and the lower and upper bounds on cluster size.

Experimental Results

We compare RandomNet with k-means, kmeansE, ROCKET, MiniRocket, MultiRocket, and other state-of-the-art methods. RandomNet achieves the highest average Rand Index and the best average rank among all compared methods.

Ablation Study

We verify the effectiveness of each component in RandomNet by comparing the performance of the full RandomNet and its four variants on 108 UCR datasets. The results show that the full RandomNet is better than the four variants in average Rand Index and average rank.

Visualizing Clusters for Different Methods

We visualize the 2D embeddings of the Cylinder-Bell-Funnel (CBF) dataset using t-SNE and compare the cluster assignments by k-means, MiniRocket, and RandomNet with the true labels. RandomNet is able to identify the correct clusters more accurately than k-means and MiniRocket.
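
As a reference for reproducing this kind of figure, here is a minimal helper (ours, not the paper's code) that projects a dataset to 2D with t-SNE and colors the points by a given cluster assignment:

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_cluster_embedding(X, assignments, title):
    """Embed the rows of X into 2D with t-SNE and color points by cluster label."""
    emb = TSNE(n_components=2, random_state=0).fit_transform(X)
    plt.scatter(emb[:, 0], emb[:, 1], c=assignments, s=10, cmap="tab10")
    plt.title(title)
    plt.show()
```

Calling it once per assignment (true labels, k-means, MiniRocket, RandomNet) would yield panels comparable to those in the paper.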

Testing the Time Complexity

We test the scalability and effectiveness of RandomNet by generating datasets of varying sizes and lengths. The results show that the average running time of RandomNet grows linearly with both the number of instances and the length of the time series.

Analyzing Noise Sensitivity

We test the noise sensitivity of RandomNet using three different datasets injected with varying levels of random Gaussian noise. RandomNet exhibits strong resilience to noise and performs better than SPF.
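
A minimal version of such a noise injection, assuming zero-mean Gaussian noise scaled relative to each series' standard deviation (the paper's exact protocol may differ):

```python
import numpy as np

def inject_gaussian_noise(X, noise_level, seed=None):
    """Add zero-mean Gaussian noise, scaled per series by noise_level * std."""
    rng = np.random.default_rng(seed)
    scale = noise_level * X.std(axis=1, keepdims=True)
    return X + rng.normal(0.0, 1.0, X.shape) * scale
```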

Finding the Optimal Number of Clusters

We apply the Elbow Method to determine the optimal number of clusters for the CBF dataset. RandomNet can find an obvious “elbow” at \( k = 3 \), whereas it is hard to locate a clear “elbow” for k-means.
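
For reference, the standard elbow procedure can be sketched as below, using k-means inertia as the within-cluster dispersion score; this helper is our illustration, and the score used to locate RandomNet's elbow follows the paper:

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

def elbow_curve(X, k_max=10):
    """Plot within-cluster sum of squares (inertia) against the number of clusters."""
    ks = range(1, k_max + 1)
    inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
                for k in ks]
    plt.plot(ks, inertias, marker="o")
    plt.xlabel("number of clusters k")
    plt.ylabel("inertia")
    plt.show()
```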

Conclusion and Future Work

In this paper, we introduced RandomNet, a novel method for time series clustering that utilizes deep neural networks with random parameters to extract diverse representations of the input time series. The selection mechanism and ensemble step cancel out the noise from irrelevant representations and strengthen the signal from relevant ones, providing reliable clusterings. Extensive evaluation across all 128 UCR datasets demonstrates accuracy competitive with state-of-the-art methods, along with superior efficiency.

Future research directions may involve integrating more complex or domain-specific network structures into our method. Additionally, incorporating some level of training into the framework could potentially improve performance. We will also explore the potential of applying our method to multivariate time series or other data types, such as image data.
