RQA2 – Comprehensive Reference Guide#
Overview#
The RQA2 module is the modern, object-oriented core of the SMdRQA package
(version 2025.7.27). It supersedes the legacy RQA_functions procedural
interface and bundles four cooperating classes:
| Class | Purpose |
|---|---|
| `RQA2` | End-to-end RQA pipeline: data loading, parameter estimation, recurrence-plot construction, measure computation, visualisation, and batch processing. |
| `RQA2_simulators` | Reproducible generators for well-known chaotic dynamical systems (Rössler, Lorenz, Hénon, Chua) used for benchmarking. |
| `RQA2_tests` | Surrogate-data generation (FT, AAFT, IAAFT, IDFS, WIAAFT, PPS) and comprehensive statistical validation of nonlinear dynamics metrics. |
| `RQA2_ml` | Machine learning benchmarking: feature engineering from RQA measures, nested cross-validation with best-subset feature selection, surrogate null baselines, statistical comparison (Wilcoxon signed-rank), feature importance, clustering validity, and publication-ready visualisations. |
Key Design Principles#
Lazy evaluation – Parameters τ, m, and ε are computed on first access and cached; re-loading data resets all caches automatically.
Unified API – One class instance drives the entire analysis pipeline.
Reproducibility – Full state (data, parameters, measures, config) is serialisable via save_results/load_results.
Separation of concerns – Simulation, surrogate testing, and measurement live in dedicated classes that are each independently unit-testable.
Quick Start#
```python
from SMdRQA.RQA2 import RQA2
import numpy as np

# ── 1. Load data (shape (N,) or (N, D)) ──────────────────────────────
data = np.load('mytimeseries.npy')

# ── 2. Create analysis object (z-scores the data by default) ─────────
rqa = RQA2(data, reqrr=0.10)  # target 10 % recurrence rate

# ── 3. Inspect automatically selected parameters ─────────────────────
print(f"tau = {rqa.tau}, m = {rqa.m}, eps = {rqa.eps:.4f}")

# ── 4. Compute all RQA measures in one call ──────────────────────────
measures = rqa.compute_rqa_measures()
for k, v in measures.items():
    print(f"  {k:30s}: {v:.4f}")

# ── 5. Visualise ─────────────────────────────────────────────────────
rqa.plot_recurrence_plot(save_path='rp.png')
rqa.plot_rqa_measures_summary()

# ── 6. Save for auditable workflows ──────────────────────────────────
rqa.save_results('analysis.pkl')
```
Mathematical Background#
Takens’ Delay Embedding#
Given a scalar time series \(\{x(t)\}_{t=1}^{N}\) sampled from an unknown dynamical system, the delay-embedding theorem (Takens, 1981) guarantees that the map
\[
\vec{y}(t) = \bigl(x(t),\, x(t+\tau),\, \ldots,\, x(t+(m-1)\tau)\bigr)
\]
is a diffeomorphism (smooth bijection) between the original attractor and the reconstructed manifold in \(\mathbb{R}^m\), provided:
the embedding dimension satisfies \(m \geq 2 d_f + 1\) (where \(d_f\) is the fractal dimension of the attractor), and
the delay \(\tau\) avoids both redundancy (too small) and independence (too large).
The result extends straightforwardly to multivariate data: for a \(D\)-dimensional input the delay vector has shape \((N_\text{emb},\, m,\, D)\) with \(N_\text{emb} = N - (m-1)\tau\).
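As a concrete illustration, the univariate case can be sketched in a few lines of NumPy (`delay_embed` is a hypothetical helper used for exposition, not part of the RQA2 API):

```python
import numpy as np

def delay_embed(x, m, tau):
    """Return the delay-embedded array of shape (N_emb, m) for a 1-D
    series (delay_embed is an illustrative helper, not the RQA2 API)."""
    x = np.asarray(x, dtype=float)
    n_emb = len(x) - (m - 1) * tau
    if n_emb <= 0:
        raise ValueError("series too short for the chosen (m, tau)")
    return np.column_stack([x[i * tau : i * tau + n_emb] for i in range(m)])

Y = delay_embed(np.arange(10.0), m=3, tau=2)
print(Y.shape)   # (6, 3): N_emb = 10 - (3 - 1) * 2 = 6
```

Each row of `Y` is one reconstructed state vector; RQA2 builds the same structure internally (with an extra axis for multivariate input).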
Recurrence Matrix#
Given the embedded signal, the recurrence matrix is
\[
R_{ij} = \Theta\!\left(\varepsilon - \lVert \vec{y}_i - \vec{y}_j \rVert\right), \qquad i, j = 1, \ldots, N_\text{emb},
\]
where \(\Theta\) is the Heaviside step function and \(\varepsilon\) is the neighbourhood radius. \(R_{ij}=1\) iff the two state vectors are within \(\varepsilon\) of each other in Euclidean distance.
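A direct translation of this definition (`recurrence_matrix` is an illustrative helper; RQA2 performs this step internally):

```python
import numpy as np
from scipy.spatial.distance import cdist

def recurrence_matrix(Y, eps):
    """Threshold pairwise Euclidean distances between embedded state
    vectors (illustrative helper, not the library's implementation)."""
    return (cdist(Y, Y) <= eps).astype(int)

Y = np.array([[0.0], [0.1], [1.0]])   # three (already embedded) states
R = recurrence_matrix(Y, eps=0.2)
print(R)   # only the first two states are within eps of each other
```

The recurrence rate is simply `R.mean()`, which is how the \(\varepsilon\) search below evaluates each candidate radius.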
RQA Measures – Definitions#
All measures are derived from the lengths and densities of diagonal and vertical
line structures in \(R\). Let \(\ell_\text{min}\) denote the minimum
accepted line length (lmin in the config).
Diagonal-line measures (predictability, determinism)
\[
\mathrm{DET} = \frac{\sum_{\ell \ge \ell_\text{min}} \ell\, P(\ell)}{\sum_{\ell \ge 1} \ell\, P(\ell)}, \qquad
L = \frac{\sum_{\ell \ge \ell_\text{min}} \ell\, P(\ell)}{\sum_{\ell \ge \ell_\text{min}} P(\ell)}, \qquad
L_\text{entr} = -\sum_{\ell \ge \ell_\text{min}} p(\ell) \ln p(\ell),
\]
where \(P(\ell)\) is the diagonal line-length histogram and \(p(\ell) = P(\ell)/\sum P(\ell)\).
Vertical-line measures (laminarity, trapping time)
\[
\mathrm{LAM} = \frac{\sum_{v \ge \ell_\text{min}} v\, V(v)}{\sum_{v \ge 1} v\, V(v)}, \qquad
TT = \frac{\sum_{v \ge \ell_\text{min}} v\, V(v)}{\sum_{v \ge \ell_\text{min}} V(v)},
\]
where \(V(v)\) is the vertical line-length histogram.
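For instance, DET follows directly from a diagonal line-length histogram; the `determinism` helper below is a sketch of the formula above, not the RQA2 implementation:

```python
import numpy as np

def determinism(P, lmin=2):
    """DET from a diagonal line-length histogram P, where P[l] counts
    lines of exact length l (sketch of the formula, not the RQA2 code)."""
    l = np.arange(len(P))
    den = np.sum(l * P)
    return float(np.sum(l[lmin:] * P[lmin:]) / den) if den else 0.0

P = np.array([0, 4, 3, 0, 1])   # four lines of length 1, three of 2, one of 4
print(determinism(P))           # (3*2 + 1*4) / (4*1 + 3*2 + 1*4) = 10/14
```

LAM and TT follow the same pattern with the vertical histogram \(V(v)\).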
Full measure table:

| Key | Name | Interpretation |
|---|---|---|
| `recurrence_rate` | Recurrence Rate (RR) | Fraction of recurrent points. Fix RR constant when comparing conditions. |
| `determinism` | Determinism (DET) | Share of recurrent points on diagonal lines >= lmin. High DET signals rule-based dynamics. |
| | Average diagonal length (L) | Mean predictability horizon; inversely related to the largest Lyapunov exponent. |
| | Max diagonal length (Lmax) | 1/Lmax ~ largest Lyapunov exponent for chaotic flows. |
| | Diagonal entropy (Lentr) | Complexity of the diagonal line-length distribution. |
| | Diagonal mode | Most frequent diagonal line length. |
| `laminarity` | Laminarity (LAM) | Share of recurrent points on vertical lines. Signals laminar phases. |
| `average_vertical_length` | Trapping Time (TT) | Mean duration of laminar (slowly-changing) states. |
| | Max vertical length (Vmax) | Longest laminar episode. |
| | Vertical entropy (Ventr) | Complexity of the vertical line-length distribution. |
| | Vertical mode | Most frequent vertical line length. |
Parameter Estimation Algorithms#
Time Delay tau#
Default method – first MI minimum
Computes the time-delayed mutual information
\[
I[\tau] = \sum_{i,j} p_{ij}(\tau) \ln \frac{p_{ij}(\tau)}{p_i\, p_j},
\]
where \(p_{ij}(\tau)\) is the joint probability that \(x(t)\) falls in bin \(i\) and \(x(t+\tau)\) in bin \(j\), using a multidimensional histogram estimator (mi_method='histdd') or by averaging 1-D MI across dimensions (mi_method='avg'). The optimal delay \(\tau^*\) is the first local minimum of \(I[\tau]\).
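The estimator can be sketched with NumPy histograms. Here `delayed_mi` and `first_mi_minimum` are hypothetical names; the library's `histdd` estimator may bin and aggregate differently:

```python
import numpy as np

def delayed_mi(x, tau, bins=16):
    """Histogram estimate of the mutual information between x(t) and
    x(t + tau) -- a sketch, not the library's estimator."""
    a, b = x[:-tau], x[tau:]
    pxy, _, _ = np.histogram2d(a, b, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x(t)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of x(t + tau)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def first_mi_minimum(x, max_tau=50):
    """Return the first local minimum of I[tau] (global argmin as fallback)."""
    mi = [delayed_mi(x, t) for t in range(1, max_tau + 1)]
    for i in range(1, len(mi) - 1):
        if mi[i] < mi[i - 1] and mi[i] < mi[i + 1]:
            return i + 1                   # taus are 1-based
    return int(np.argmin(mi)) + 1

x = np.sin(np.linspace(0, 20 * np.pi, 2000))
tau_star = first_mi_minimum(x, max_tau=80)
```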
Polynomial method
Fits a cross-validated polynomial to the MI curve and returns the first minimum of the fitted function. More robust for noisy or short series where the discrete minimum is ambiguous.
```python
rqa = RQA2(data, tau_method='polynomial', mi_method='avg')

# or override per call:
tau = rqa.compute_time_delay(method='polynomial', mi_method='histdd')
```
Inspect the MI curve to verify the automatic choice:

```python
rqa.plot_tau_mi_curve(max_tau=80)
```
Embedding Dimension m#
Uses the False Nearest Neighbours (FNN) algorithm (Kennel et al., 1992):
For each candidate \(m\), find the nearest neighbour of every embedded point in \(\mathbb{R}^m\).
Lift both points to \(\mathbb{R}^{m+1}\) by appending the next delayed coordinate.
A neighbour is false if the ratio of distances after/before the lift exceeds a threshold \(r\).
The FNN fraction is computed across a range of \(r\) values (Rmin to Rmax, with rdiv steps).
The optimal \(m^*\) is the smallest dimension for which the FNN fraction drops by at least bound relative to the previous dimension.
```python
rqa = RQA2(data, Rmin=1, Rmax=10, rdiv=451, bound=0.2)
m = rqa.compute_embedding_dimension()
rqa.plot_fnn_curve(max_m=10)
```
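The core of the FNN criterion can be sketched as follows. This is a simplified, illustrative version with a single fixed threshold `r`; the library sweeps `r` over `Rmin` to `Rmax` instead:

```python
import numpy as np
from scipy.spatial import cKDTree

def fnn_fraction(x, m, tau, r=10.0):
    """Fraction of false nearest neighbours at dimension m (simplified
    Kennel-style criterion with one threshold r -- illustrative only)."""
    n = len(x) - m * tau          # points that still have the lifted coordinate
    if n <= 1:
        return 0.0
    Y = np.column_stack([x[i * tau : i * tau + n] for i in range(m)])
    nxt = x[m * tau : m * tau + n]            # coordinate added by the lift
    d, idx = cKDTree(Y).query(Y, k=2)
    d, idx = d[:, 1], idx[:, 1]               # nearest non-identical neighbour
    d = np.where(d == 0, 1e-12, d)
    return float(np.mean(np.abs(nxt - nxt[idx]) / d > r))

rng = np.random.default_rng(0)
t = np.linspace(0, 40 * np.pi, 3000)
x = np.sin(t) + 0.01 * rng.standard_normal(t.size)
fracs = [fnn_fraction(x, m, tau=25) for m in (1, 2, 3)]
# For a noisy sine the FNN fraction should drop sharply once m reaches 2
```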
Neighbourhood Radius epsilon#
A linear scan over epsdiv candidate values in [epsmin, epsmax] finds the first \(\varepsilon\) for which
\[
\lvert RR(\varepsilon) - \text{reqrr} \rvert \le \text{rr\_delta}.
\]
If no candidate satisfies the tolerance, the midpoint (epsmin + epsmax) / 2 is returned as a fallback.
```python
rqa = RQA2(data, reqrr=0.05, rr_delta=0.002,
           epsmin=0, epsmax=5, epsdiv=2001)
eps = rqa.compute_neighborhood_radius()
print(f"Achieved RR = {rqa.recurrence_rate:.4f}")
```
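The scan logic itself is simple enough to sketch; `find_eps` below is an illustrative stand-in for `compute_neighborhood_radius`, with looser defaults than the library's:

```python
import numpy as np
from scipy.spatial.distance import cdist

def find_eps(Y, reqrr=0.10, rr_delta=0.01, epsmin=0.0, epsmax=5.0, epsdiv=2001):
    """Return the first eps whose recurrence rate is within rr_delta of
    reqrr, falling back to the interval midpoint (sketch of the scan)."""
    D = cdist(Y, Y)                           # pairwise Euclidean distances
    for eps in np.linspace(epsmin, epsmax, epsdiv):
        rr = np.count_nonzero(D <= eps) / D.size
        if abs(rr - reqrr) <= rr_delta:
            return eps, rr
    return (epsmin + epsmax) / 2, None        # fallback: midpoint

rng = np.random.default_rng(1)
Y = rng.standard_normal((300, 3))             # a mock embedded signal
eps, rr = find_eps(Y)
```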
Configuration Reference#
All configuration keys are passed as **kwargs to the constructor and stored
in rqa.config. They can also be inspected or updated at runtime:
```python
rqa = RQA2(data, reqrr=0.05, lmin=3)
print(rqa.config)
rqa.config['lmin'] = 5  # update before recomputing
```
| Key | Default | Description |
|---|---|---|
| `reqrr` | 0.10 | Target recurrence rate (0 < reqrr < 1). Clamped to [0.01, 0.99]. |
| `rr_delta` | 0.005 | Tolerance \|RR - reqrr\| for accepting an epsilon candidate. |
| `epsmin` | 0 | Lower bound of epsilon search range. |
| `epsmax` | 10 | Upper bound of epsilon search range. |
| `epsdiv` | 1001 | Resolution of the linear epsilon scan. |
| `lmin` | 2 | Minimum line length for DET, LAM, entropy, average, and mode measures. |
| `Rmin` | 1 | Lower bound of FNN threshold ratio search. |
| `Rmax` | 10 | Upper bound of FNN threshold ratio search. |
| `rdiv` | 451 | Number of candidate FNN threshold values. |
| `delta` | 0.001 | FNN convergence tolerance (FNN ratio < delta -> accept dimension). |
| `bound` | 0.2 | Minimum fractional drop in FNN ratio required to select a dimension. |
Tip
For long, high-dimensional signals reduce epsdiv and rdiv (e.g. to 501
and 201) to keep runtimes reasonable. For very short signals (N < 200)
increase reqrr slightly (e.g. 0.15) to guarantee a non-trivial recurrence
plot.
API Reference#
Constructor#
```python
rqa = RQA2(data=None, normalize=True, **kwargs)
```
data may be omitted; call load_data() before accessing any computed
property. normalize=True applies z-score normalisation column-wise.
Lazy Properties#
Accessing any property triggers computation on first call and caches the result.
Loading new data with load_data() resets all caches.
| Property | Description |
|---|---|
| `tau` | Optimal time delay (int >= 1). |
| `m` | Optimal embedding dimension (int >= 1). |
| `eps` | Neighbourhood radius (float > 0). |
| `recurrence_rate` | Fraction of recurrent points RR in [0, 1]. |
| | Binary recurrence matrix, shape (N_emb, N_emb). |
| | Delay-embedded tensor, shape (N_emb, m, D). |
Core Computation Methods#
| Method call | Description |
|---|---|
| `compute_time_delay(method=..., mi_method=...)` | Recompute tau with an explicit algorithm choice. |
| `compute_embedding_dimension()` | Recompute m via FNN. |
| `compute_neighborhood_radius()` | Recompute epsilon for a given target recurrence rate. |
| | Build the recurrence matrix from current tau, m, epsilon. |
| | Build the delay-embedding tensor. |
| `compute_rqa_measures()` | Compute all 11 RQA measures; returns a dict. |
| | Return DET as a float in [0, 1]. |
| | Return LAM as a float in [0, 1]. |
| | Return TT (mean vertical line length). |
| `get_summary()` | Return a nested dict of data info, parameters, and measures. |
Visualisation Methods#
All plot methods accept figsize=(w, h) and save_path=None.
| Method call | Description |
|---|---|
| `plot_recurrence_plot()` | Display the binary recurrence matrix. |
| `plot_tau_mi_curve(max_tau=...)` | Plot MI vs tau with optimal tau marked. |
| `plot_fnn_curve(max_m=...)` | Plot FNN ratio vs m with optimal m marked. |
| `plot_rqa_measures_summary()` | 2x3 panel: main measures, entropy, avg/max/mode lengths, parameters. |
| | Plot original (unnormalised) signal; stacked for multivariate data. |
Persistence and Batch Processing#
| Method call | Description |
|---|---|
| `save_results(path)` | Pickle the full analysis state. |
| `load_results(path)` | Restore a previously saved analysis state. |
| `batch_process(input_path, output_path, ...)` | Process all data files in input_path and write results to output_path (called on the class). |
Common Workflows#
Workflow 1 – Single Univariate Signal#
```python
import numpy as np
from SMdRQA.RQA2 import RQA2

t = np.linspace(0, 20 * np.pi, 2000)
data = np.sin(t) + 0.1 * np.random.randn(2000)

rqa = RQA2(data, reqrr=0.10, lmin=2)
measures = rqa.compute_rqa_measures()
print(f"DET = {measures['determinism']:.3f}")
print(f"LAM = {measures['laminarity']:.3f}")
print(f"TT  = {measures['average_vertical_length']:.2f}")
rqa.plot_recurrence_plot()
```
Workflow 2 – Multivariate (MdRQA)#
```python
import numpy as np
from SMdRQA.RQA2 import RQA2, RQA2_simulators

sim = RQA2_simulators(seed=0)
x, y, z = sim.lorenz(n=2000)
data = np.column_stack([x, y, z])  # shape (2000, 3)

rqa = RQA2(data, reqrr=0.08)
measures = rqa.compute_rqa_measures()
rqa.plot_rqa_measures_summary(save_path='lorenz_summary.png')
```
Workflow 3 – Surrogate Validation#
```python
from SMdRQA.RQA2 import RQA2_simulators, RQA2_tests

sim = RQA2_simulators(seed=42)
x, _, _ = sim.rossler(n=1000, a=0.1)
signal = x.astype(float)

tester = RQA2_tests(signal, seed=0, max_workers=4)
surr = tester.generate('IAAFT', n_surrogates=200)

systems = sim.generate_test_battery()
results = tester.comprehensive_validation(systems, n_surrogates=100)
# results[system_name][surrogate_method][metric] -> p-value
```
Workflow 4 – Batch Processing#
```python
from SMdRQA.RQA2 import RQA2
import pandas as pd

results, errors = RQA2.batch_process(
    input_path='./data/raw/',
    output_path='./data/processed/',
    group_level=True,
    group_level_estimates=['tau', 'm'],
    reqrr=0.10,
)

df = pd.DataFrame(results)
print(df[['file', 'determinism', 'laminarity']].head())
```
Workflow 5 – Save and Reload#
```python
rqa = RQA2(data)
_ = rqa.compute_rqa_measures()
rqa.save_results('my_analysis.pkl')

# Later / different session
rqa2 = RQA2()
rqa2.load_results('my_analysis.pkl')
print(rqa2.get_summary())
```
Workflow 6 – Inspecting Parameter Curves#
Always visually verify automatic parameter choices before trusting them:
```python
rqa = RQA2(data)

# Visualise MI curve – confirm tau is at the first minimum
rqa.plot_tau_mi_curve(max_tau=100)

# Visualise FNN curve – confirm m is where FNN reaches zero
rqa.plot_fnn_curve(max_m=12)

# Verify the achieved recurrence rate
print(f"Target RR = {rqa.config['reqrr']:.3f}")
print(f"Actual RR = {rqa.recurrence_rate:.3f}")
```
Workflow 7 – ML Pipelines (Supervised + Unsupervised)#
```python
from SMdRQA.RQA2 import RQA2_ml, RQA2_simulators
import numpy as np

sim = RQA2_simulators(seed=42)
battery = sim.generate_test_battery()

# Build a small labeled dataset by slicing trajectories
signals, labels = [], []
for name, data in battery.items():
    x = data['x']
    seg_len = len(x) // 4
    for i in range(4):
        signals.append(x[i * seg_len:(i + 1) * seg_len])
        labels.append(name)

ml = RQA2_ml()
features = ml.build_feature_table(
    signals,
    labels=labels,
    window_size=100,
    window_step=20,
    window_stats=('mean', 'median', 'mode'),
)
X = features.drop(columns=['id', 'label'])
y = features['label']

supervised, best_model = ml.supervised_benchmark(X, y, cv=3)
print(supervised)

unsupervised, cluster_labels = ml.unsupervised_benchmark(
    X, n_clusters=len(set(labels)))
print(unsupervised)
```
Machine Learning Pipelines (RQA2_ml)#
The RQA2_ml class provides a full feature-to-model workflow built on top of the RQA2 measures. It is designed for quick hypothesis testing in both supervised and unsupervised settings, with reproducible defaults and minimal boilerplate.
Beginner-Friendly Overview#
If you are new to RQA or machine learning, here is the simplest way to think about this pipeline:
Start with a time series (a list of numbers that changes over time).
RQA turns it into numbers that describe repeating patterns.
Windowed RQA repeats this in small chunks to capture local changes.
Summaries (mean/median/mode) turn those chunks into a few stable features.
Machine learning uses those features to classify or cluster systems.
Minimal “Hello World” Example#
```python
import numpy as np
from SMdRQA.RQA2 import RQA2_ml

# Two very simple signals
s1 = np.sin(np.linspace(0, 4 * np.pi, 200))
s2 = np.random.default_rng(0).standard_normal(200)

ml = RQA2_ml()
features = ml.build_feature_table(
    [s1, s2],
    labels=["sine", "noise"],
    window_size=60,
    window_step=10,
    window_stats=("mean", "median", "mode"),
)
X = features.drop(columns=["id", "label"])
y = features["label"]

results, model = ml.supervised_benchmark(X, y, models=("svm",), cv=2)
print(results)
```
What You Need to Provide#
Signals: either a list of arrays or a folder of .npy files.
Labels (optional): required only for supervised learning.
window_size: must be provided; it controls how long each window is.
Rule of thumb: choose a window_size that is large enough to see repeating structure, but small enough to detect local changes.
Windowed RQA Features#
Windowed RQA is computed by sliding a square window along the diagonal of the recurrence plot (the same strategy used in the legacy sliding-window utilities). For each window, RQA measures are computed, then aggregated.
```python
rqa = RQA2(data)
windows = rqa.compute_windowed_rqa_measures(window_size=100, window_step=10)
summary = rqa.summarize_windowed_measures(
    windows, stats=('mean', 'median', 'mode'))
```

The summary features are flattened with names like:

```
recurrence_rate__mean
determinism__median
laminarity__mode
```
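The underlying windowing strategy (a square window sliding along the main diagonal of the RP) can be illustrated with a toy recurrence matrix; `windowed_rr` is a sketch that computes only the per-window recurrence rate:

```python
import numpy as np

def windowed_rr(R, window_size, window_step):
    """Recurrence rate of square windows slid along the main diagonal
    of a recurrence matrix R (sketch of the windowing strategy)."""
    n = R.shape[0]
    starts = range(0, n - window_size + 1, window_step)
    return np.array([R[s:s + window_size, s:s + window_size].mean()
                     for s in starts])

R = np.eye(20, dtype=int)                     # toy recurrence matrix
rates = windowed_rr(R, window_size=5, window_step=5)
print(rates)   # [0.2 0.2 0.2 0.2] -- each window holds 5 of 25 points
```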
Feature Table Builder#
build_feature_table returns a pandas DataFrame containing:
Whole-signal RQA measures
Windowed summary features (mean, median, mode)
Optional parameters tau, m, eps (when include_params=True)
```python
from SMdRQA.RQA2 import RQA2_ml

ml = RQA2_ml()
features = ml.build_feature_table(
    signals_or_dir="./data/npy_files/",
    labels=None,
    window_size=120,
    window_step=20,
    window_stats=('mean', 'median', 'mode'),
    include_params=True,
)
```
Windowed features are prefixed with win_ in the output DataFrame to
avoid collisions with whole-signal measures.
Supervised Benchmark (Quick)#
supervised_benchmark evaluates multiple classifiers with stratified
cross-validation and reports accuracy, macro-F1, and ROC AUC:
```python
X = features.drop(columns=['id', 'label'])
y = features['label']
results, best_model = ml.supervised_benchmark(
    X, y, models=('knn', 'svm', 'rf'), cv=5)
```
The method returns:
A results DataFrame with mean/std scores for each model
The best-performing fitted model (ready to predict)
This is a lighter alternative for exploratory analysis. For paper-quality
validation, use nested_cv_benchmark instead.
Nested Cross-Validation with Feature Selection#
nested_cv_benchmark implements the validation procedure described in
the SMdRQA paper: the outer loop evaluates generalisation on held-out data,
while the inner loop performs best-subset feature selection exclusively on
the training fold to prevent data leakage.
```python
results = ml.nested_cv_benchmark(
    X, y, model='knn',
    outer_iterations=100,
    test_fraction=1/3,
    feature_selection='auto',  # exhaustive if <= 12 features, else forward
)
print(f"Accuracy: {results['accuracy'].mean():.3f} +/- {results['accuracy'].std():.3f}")
print(f"ROC AUC:  {results['roc_auc'].mean():.3f} +/- {results['roc_auc'].std():.3f}")
print("Most selected features:")
print(results['feature_frequency'].head(5))
```
Returns a dict with:
accuracy / roc_auc: score arrays (one per outer iteration)
selected_features: list of feature-index tuples per iteration
feature_frequency: Series counting how often each feature was selected
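The leakage-free structure can be illustrated in miniature with scikit-learn primitives on synthetic data (one outer split instead of 100 iterations; all data and variable names here are illustrative, not the library's internals):

```python
import numpy as np
from itertools import combinations
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split, cross_val_score

rng = np.random.default_rng(0)
# Three noise features plus one informative feature (class means 0 vs 1)
informative = np.concatenate([np.zeros(30), np.ones(30)]) + rng.normal(0, 0.3, 60)
X = np.column_stack([rng.normal(size=(60, 3)), informative])
y = np.array([0] * 30 + [1] * 30)

# Outer split: held out strictly for evaluation
Xtr, Xte, ytr, yte = train_test_split(
    X, y, test_size=1/3, stratify=y, random_state=0)

# Inner loop: exhaustive best-subset search on the training fold only
best_subset, best_inner = None, -np.inf
for k in range(1, X.shape[1] + 1):
    for subset in combinations(range(X.shape[1]), k):
        cols = list(subset)
        score = cross_val_score(
            KNeighborsClassifier(), Xtr[:, cols], ytr, cv=3).mean()
        if score > best_inner:
            best_subset, best_inner = cols, score

# Evaluate the chosen subset once on the untouched outer test fold
outer_acc = (KNeighborsClassifier()
             .fit(Xtr[:, best_subset], ytr)
             .score(Xte[:, best_subset], yte))
```

Because the test fold never participates in subset selection, `outer_acc` is an unbiased estimate of generalisation for the selected features.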
Surrogate Null Baseline#
surrogate_baseline shuffles the labels n_permutations times and
evaluates each with stratified k-fold CV, yielding a null distribution:
```python
null = ml.surrogate_baseline(X, y, model='knn', n_permutations=100)
```
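The idea behind the null baseline can be sketched with plain scikit-learn on synthetic data (illustrative only; the library method wraps this with stratified k-fold and bookkeeping):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Two well-separated Gaussian classes
X = np.vstack([rng.normal(0, 1, (30, 4)), rng.normal(2, 1, (30, 4))])
y = np.array([0] * 30 + [1] * 30)

# Null distribution: shuffle the labels, score with CV, repeat
null_scores = [
    cross_val_score(KNeighborsClassifier(), X, rng.permutation(y), cv=3).mean()
    for _ in range(20)
]
real_score = cross_val_score(KNeighborsClassifier(), X, y, cv=3).mean()
# real_score should sit far above the permutation null around chance level
```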
Statistical Comparison#
compare_scores implements the Wilcoxon signed-rank test with
rank-biserial effect size:
```python
stat = RQA2_ml.compare_scores(
    results['accuracy'], null['null_accuracy'],
    alternative='greater')
print(f"p = {stat['p_value']:.4f}, effect size r = {stat['effect_size']:.2f}")
```
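The same test can be reproduced with SciPy directly; `rank_biserial` below is an illustrative helper implementing the usual definition of the rank-biserial correlation for paired samples:

```python
import numpy as np
from scipy.stats import wilcoxon, rankdata

def rank_biserial(a, b):
    """Rank-biserial effect size for paired samples (illustrative helper)."""
    d = np.asarray(a, float) - np.asarray(b, float)
    d = d[d != 0]                       # drop zero differences, as Wilcoxon does
    r = rankdata(np.abs(d))
    return (r[d > 0].sum() - r[d < 0].sum()) / r.sum()

acc = np.array([0.80, 0.82, 0.78, 0.85, 0.79, 0.83])    # mock CV accuracies
null = np.array([0.50, 0.55, 0.48, 0.52, 0.51, 0.49])   # mock null accuracies
stat, p = wilcoxon(acc, null, alternative='greater')
r = rank_biserial(acc, null)
```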
Feature Importance#
feature_importance computes permutation importance on a fitted model:
```python
imp = RQA2_ml.feature_importance(best_model, X, y, n_repeats=10)
print(imp.head(10))
RQA2_ml.plot_feature_importance(imp, save_path='importance.png')
```
Unsupervised Benchmark#
unsupervised_benchmark evaluates clustering methods and reports multiple
validity indices: silhouette, Calinski-Harabasz, and Davies-Bouldin. When
ground-truth labels are available, the adjusted Rand index is also computed.
```python
results, labels = ml.unsupervised_benchmark(
    X, y_true=y, methods=('kmeans', 'gmm', 'agglo'), k_range=(2, 6))
ml.plot_cluster_validity(results, save_path='validity.png')
```
Cluster Stability#
cluster_stability assesses reproducibility via bootstrap resampling:
```python
stab = ml.cluster_stability(X, method='kmeans', n_clusters=2, n_bootstrap=100)
print(f"Stability ARI: {stab['mean_ari']:.3f} +/- {stab['std_ari']:.3f}")
```
Visualisation Methods#
All plotting methods accept an optional save_path for direct export and
return the matplotlib Figure object:
plot_benchmark_results(results, baseline=None) – box plots of score distributions, optionally overlaying null distributions.
plot_confusion_matrix(y_true, y_pred, labels=None) – annotated heatmap.
plot_feature_importance(importance_df, top_n=15) – horizontal bar chart.
plot_cluster_validity(results_df) – line plots of validity indices vs k.
plot_cluster_scatter(X, labels, method='pca') – PCA-projected 2-D scatter.
Notes and Best Practices#
Choose a window_size that preserves local dynamics but still provides enough points for reliable RQA measures.
For fair cross-system comparisons, fix parameters across groups using group_level_params in build_feature_table.
Use nested_cv_benchmark for paper-quality validation and supervised_benchmark for quick exploration.
Always compare against a surrogate baseline to ensure classification performance exceeds chance.
Use models=('svm',) or methods=('kmeans',) if you want a single baseline rather than full benchmarking.
Glossary (Plain Language)#
Recurrence plot (RP): a square image that marks when the system revisits similar states.
RQA measures: numeric summaries extracted from the RP (e.g., recurrence rate, determinism).
Windowed RQA: computing RQA on smaller slices of the RP to capture local changes over time.
Feature table: a spreadsheet-like table where each row is one signal and each column is a numeric feature.
Supervised learning: you already know the label (e.g., “chaotic” vs “periodic”) and train a classifier.
Unsupervised learning: you do not supply labels; clustering groups similar signals together.
Common Pitfalls#
“window_size exceeds RP size”: your signal is too short or the window is too large. Reduce the window size or use longer signals.
Only one class in y: supervised learning needs at least two labels.
Very small windows: can produce unstable measures; increase window size or step size.
RQA2_simulators – Chaotic System Generators#
RQA2_simulators integrates three continuous-time attractors (using
scipy.integrate.solve_ivp, RK45, rtol=1e-9, atol=1e-12) and one
discrete-time map.
Available systems:
| Method | System | Default chaotic parameters |
|---|---|---|
| `rossler` | Rössler attractor | a=0.2, b=0.2, c=5.7; chaotic band attractor. Use a=0.1 for periodic. |
| `lorenz` | Lorenz attractor | sigma=10, rho=28, beta=8/3; classic butterfly attractor. |
| `henon` | Hénon map | a=1.4, b=0.3; discrete 2-D map on a fractal attractor. |
| `chua` | Chua’s circuit | alpha=15.6, beta=28, m0=-1.143, m1=-0.714; double-scroll attractor. |
| `generate_test_battery` | All of the above | Returns a dict keyed by system name (e.g. 'rossler_chaotic'). |
```python
from SMdRQA.RQA2 import RQA2_simulators

sim = RQA2_simulators(seed=42)
x, y, z = sim.rossler(tmax=5000, n=2000, a=0.1)  # limit cycle
x, y, z = sim.lorenz(n=2000)                     # butterfly
x, y = sim.henon(n=2000)                         # Hénon map
systems = sim.generate_test_battery()            # full battery
```
RQA2_tests – Surrogate Data and Validation#
Surrogate data testing determines whether a measured signal exhibits genuine nonlinear structure by comparing statistics to ensembles of null surrogates.
Surrogate Algorithms#
| Key | Algorithm | Null hypothesis and notes |
|---|---|---|
| `FT` | Fourier Transform | Signal is a stationary linear Gaussian process. Fastest; randomises Fourier phases only. |
| `AAFT` | Amplitude-Adjusted FT | Same as FT but also matches the marginal amplitude distribution. |
| `IAAFT` | Iterative AAFT | Iteratively matches spectrum and amplitude; best accuracy/speed trade-off. |
| `IDFS` | Iterative Digitally-Filtered Shuffled | Targets higher-order cumulants; starts from a shuffled realisation. |
| `WIAAFT` | Wavelet-based IAAFT | Applies IAAFT per wavelet level; preserves multiscale structure. |
| `PPS` | Pseudo-Periodic Surrogate | Preserves return-map geometry; best for near-periodic / weakly chaotic signals. |
```python
from SMdRQA.RQA2 import RQA2_tests
import numpy as np

signal = np.random.randn(512).astype(float)
tester = RQA2_tests(signal, seed=42, max_workers=4)

surr_iaaft = tester.generate('IAAFT', n_surrogates=200, n_iter=200)
surr_wave = tester.generate('WIAAFT', n_surrogates=100, wavelet='db8', level=4)
surr_pps = tester.generate('PPS', n_surrogates=50, dim=5, noise_factor=0.1)
```
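To make the null hypotheses concrete, here is a minimal phase-randomised (FT) surrogate in plain NumPy (`ft_surrogate` is a sketch, not the RQA2_tests implementation):

```python
import numpy as np

def ft_surrogate(x, rng):
    """Phase-randomised (FT) surrogate: keeps the power spectrum, destroys
    phase structure -- a minimal sketch, not the RQA2_tests implementation."""
    X = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, len(X))
    phases[0] = 0.0           # keep the DC component real
    phases[-1] = 0.0          # keep the Nyquist component real (even length)
    return np.fft.irfft(np.abs(X) * np.exp(1j * phases), n=len(x))

rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 8 * np.pi, 256))
s = ft_surrogate(x, rng)
# s has (numerically) the same power spectrum as x but scrambled phases
```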
Nonlinear Validation Metrics#
Six metrics are evaluated in comprehensive_validation():
| Metric key | Description |
|---|---|
| | Largest Lyapunov exponent (Rosenstein nearest-neighbour divergence). Positive value -> chaotic dynamics. |
| | Third-order temporal asymmetry (Ramsey & Rothman 1996). Non-zero -> time-irreversible, hence nonlinear, process. |
| | Approximate entropy of length-m patterns; lower -> more regular. |
| | Grassberger-Procaccia fractal dimension estimate. Finite, low value -> low-dimensional deterministic attractor. |
| | Absolute skewness of first differences; asymmetric amplitude fluctuations. |
| | Normalised local linear prediction error; lower -> more predictable. |
Comprehensive Validation#
```python
from SMdRQA.RQA2 import RQA2_simulators, RQA2_tests

sim = RQA2_simulators(seed=0)
systems = sim.generate_test_battery()
signal = systems['rossler_chaotic']['x'].astype(float)

tester = RQA2_tests(signal, seed=42)
results = tester.comprehensive_validation(
    systems_data=systems,
    n_surrogates=200,
    save_path='validation_heatmap.png',
)
# results[system_name][surrogate_method][metric] -> p-value
```
A p-value near 0 means the original signal is significantly different from the surrogate ensemble for that metric, providing evidence of the type of structure the surrogate destroys.
Parallel Generation#
Parallelism activates automatically when n_surrogates >= 50 and
max_workers > 1:
```python
tester = RQA2_tests(signal, seed=0, max_workers=8)
surr = tester.generate('IAAFT', n_surrogates=500)
```
Implementation Notes#
Indexing#
All internal arrays use 0-based indexing. The embedded signal has shape
(N_emb, m, D) where N_emb = N - (m-1)*tau. Line-length histogram
arrays have length N_emb + 1 so that index i directly counts lines of
length i.
Distance Computation#
Both the recurrence matrix and the epsilon search use
scipy.spatial.distance.cdist for vectorised Euclidean distances. Memory
scales as \(O(N_\text{emb}^2)\); for N_emb > 5000 consider windowed
analysis via RP_maker or down-sampling.
Cross-Validated Polynomial Degree#
The polynomial tau method uses sklearn.model_selection.RepeatedKFold
(5-fold, 3 repeats) to select the degree that minimises RMSE on held-out data,
then finds the first minimum of the fitted polynomial.
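The selection loop can be sketched as follows, using a mock MI curve and illustrative variable names (the library's internals may differ in detail):

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold

rng = np.random.default_rng(0)
tau = np.arange(1, 41, dtype=float)
mi = 1.0 / tau + 0.01 * rng.standard_normal(tau.size)   # mock MI curve

# 5-fold, 3-repeat CV over candidate polynomial degrees
cv = RepeatedKFold(n_splits=5, n_repeats=3, random_state=0)
best_deg, best_rmse = None, np.inf
for deg in range(2, 9):
    errs = []
    for tr, te in cv.split(tau):
        coef = np.polyfit(tau[tr], mi[tr], deg)
        resid = np.polyval(coef, tau[te]) - mi[te]
        errs.append(np.sqrt(np.mean(resid ** 2)))
    rmse = float(np.mean(errs))
    if rmse < best_rmse:
        best_deg, best_rmse = deg, rmse
# The first minimum of the degree-best_deg fit is then taken as tau*
```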
Surrogate Seeding#
Each surrogate receives a unique seed derived from the parent RNG, guaranteeing
statistical independence and reproducibility given the same seed argument.
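This is the standard NumPy `SeedSequence.spawn` pattern; a minimal sketch of the idea:

```python
import numpy as np

# One parent seed deterministically spawns independent child streams,
# so surrogate k always sees the same RNG regardless of worker order.
parent = np.random.SeedSequence(42)
rngs = [np.random.default_rng(child) for child in parent.spawn(4)]
draws = [r.standard_normal(3) for r in rngs]
```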
Reproducibility Checklist#
Normalise multivariate data (normalize=True, the default).
Verify parameters visually before analysis:

```python
rqa.plot_tau_mi_curve()
rqa.plot_fnn_curve()
```

Check the achieved RR:

```python
print(rqa.recurrence_rate)
```

Use consistent lmin across all signals in a study.
Fix parameters across groups with batch_process(..., group_level=True).
Save state after each expensive computation:

```python
rqa.save_results('checkpoint.pkl')
```
Troubleshooting#
| Symptom | Likely cause and fix |
|---|---|
| Embedding produces no points (N_emb <= 0) | N is too small for the chosen (m, tau). Reduce m or tau, or increase N. |
| RR is 0 or 1 regardless of epsilon | Check normalisation and widen the [epsmin, epsmax] search range. |
| tau = 1 for all signals | MI curve has no local minimum. Try tau_method='polynomial'. |
| m is very large (> 10) | FNN never drops below delta. Increase delta or bound, or denoise the signal. |
| Surrogates take very long | Reduce n_surrogates or n_iter, or increase max_workers. |
| Type errors on integer input | Cast your array: signal = signal.astype(float). |
| Sphinx docs fail to build | Ensure SMdRQA is importable from the docs build environment and all dependencies (PyWavelets, scikit-learn, seaborn) are installed. |
| Recurrence plot is all black or all white | epsilon is too large (all black) or too small (all white); adjust reqrr or the epsilon search range. |
Extending RQA2#
Subclass RQA2 and override one or more private methods
to integrate custom algorithms without changing the public API:
```python
import numpy as np
from SMdRQA.RQA2 import RQA2

class MyRQA(RQA2):
    def _findtau_default(self, mi_method):
        """Custom tau estimator (e.g. autocorrelation zero crossing)."""
        acf = np.correlate(self.data[:, 0], self.data[:, 0], mode='full')
        acf = acf[len(acf) // 2:]
        zeros = np.where(acf < 0)[0]
        return int(zeros[0]) if len(zeros) > 0 else 1

    def _findm(self, tau, sd):
        """Override with a global false-strand or singular-value method."""
        ...
```
Key extension points:
_findtau_default, _findtau_polynomial – time-delay strategies
_findm – embedding dimension selection
_findeps – neighbourhood radius selection
_diaghist, _vert_hist – line structure extraction
_percentmorethan, _entropy, _average, _maxi, _mode – measure computation
Citing This Work#
Please cite the following when using any component of RQA2 or SMdRQA in academic publications:
Thaikkandi, S., Sharika, K. M., & Nivedita. (2025). SMdRQA: Sliding Window
Multidimensional Recurrence Quantification Analysis (Version 2025.7.27)
[Software]. Zenodo. https://doi.org/10.5281/zenodo.10854678
BibTeX:
```bibtex
@software{smdrqa2025,
  author    = {Thaikkandi, Swarag and Sharika, K. M. and Nivedita},
  title     = {{SMdRQA}: Sliding Window Multidimensional Recurrence
               Quantification Analysis},
  year      = {2025},
  version   = {2025.7.27},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.10854678},
  url       = {https://doi.org/10.5281/zenodo.10854678}
}
```
Key references for the underlying algorithms:
Recurrence plots – Eckmann et al., Europhys. Lett. 4 (1987)
RQA measures – Marwan et al., Phys. Rep. 438 (2007)
Takens embedding – Takens, Lecture Notes in Mathematics 898 (1981)
FNN algorithm – Kennel et al., Phys. Rev. A 45 (1992)
MI for tau – Fraser & Swinney, Phys. Rev. A 33 (1986)
IAAFT – Schreiber & Schmitz, Phys. Rev. Lett. 77 (1996)
PPS – Small et al., Phys. Rev. Lett. 87 (2001)