Documentation¶
Welcome to NeuroKit’s documentation. Here you can find information and learn about Python, NeuroKit, Physiological Signals and more.
You can navigate to the different sections using the left panel. We recommend checking out the guides and examples, where you can find tutorials and hands-on walkthroughs.
Introduction¶
The Python Toolbox for Neurophysiological Signal Processing
NeuroKit2 is a user-friendly package providing easy access to advanced biosignal processing routines. Researchers and clinicians without extensive knowledge of programming or biomedical signal processing can analyze physiological data with only two lines of code.
Quick Example¶
import neurokit2 as nk
# Download example data
data = nk.data("bio_eventrelated_100hz")
# Preprocess the data (filter, find peaks, etc.)
processed_data, info = nk.bio_process(ecg=data["ECG"], rsp=data["RSP"], eda=data["EDA"], sampling_rate=100)
# Compute relevant features
results = nk.bio_analyze(processed_data, sampling_rate=100)
And boom 💥 your analysis is done 😎
Installation¶
You can install NeuroKit2 from PyPI
pip install neurokit2
or conda-forge
conda install -c conda-forge neurokit2
If you’re not sure what to do, read our installation guide.
Contributing¶
NeuroKit2 is the most welcoming project, with a large community of contributors of all levels of programming expertise. But the package is still far from perfect! Thus, if you have ideas for improvements or new features, or just want to learn Python and do something useful at the same time, do not hesitate to check out the following guides:
Also, if you have developed new signal processing methods or algorithms and want to increase their usage, popularity and citations, get in touch with us to eventually add them to NeuroKit. A great opportunity for the users as well as for the original developers!
Documentation¶
Click on the links above and check out our tutorials:
General¶
Examples¶
You can try out these examples directly in your browser.
Don’t know which tutorial is suited for your case? Follow this flowchart:
Citation¶
The NeuroKit2 paper can be found here 🎉 Additionally, you can get the reference directly from Python by running:
nk.cite()
You can cite NeuroKit2 as follows:
Makowski, D., Pham, T., Lau, Z. J., Brammer, J. C., Lespinasse, F., Pham, H.,
Schölzel, C., & Chen, S. A. (2021). NeuroKit2: A Python toolbox for neurophysiological signal processing.
Behavior Research Methods, 53(4), 1689–1696. https://doi.org/10.3758/s13428-020-01516-y
Full bibtex reference:
@article{Makowski2021neurokit,
author = {Dominique Makowski and Tam Pham and Zen J. Lau and Jan C. Brammer and Fran{\c{c}}ois Lespinasse and Hung Pham and Christopher Schölzel and S. H. Annabel Chen},
title = {{NeuroKit}2: A Python toolbox for neurophysiological signal processing},
journal = {Behavior Research Methods},
volume = {53},
number = {4},
pages = {1689--1696},
publisher = {Springer Science and Business Media {LLC}},
doi = {10.3758/s13428-020-01516-y},
url = {https://doi.org/10.3758/s13428-020-01516-y},
year = 2021,
month = {feb}
}
Let us know if you used NeuroKit2 in a publication! Open a new discussion (select the NK in publications category) and link the paper. The community would be happy to hear how you used it and to learn about your research. We could also feature it once we have a section on the website for papers that used the software.
Physiological Data Preprocessing¶
Simulate physiological signals¶
import numpy as np
import pandas as pd
import neurokit2 as nk
# Generate synthetic signals
ecg = nk.ecg_simulate(duration=10, heart_rate=70)
ppg = nk.ppg_simulate(duration=10, heart_rate=70)
rsp = nk.rsp_simulate(duration=10, respiratory_rate=15)
eda = nk.eda_simulate(duration=10, scr_number=3)
emg = nk.emg_simulate(duration=10, burst_number=2)
# Visualise biosignals
data = pd.DataFrame({"ECG": ecg,
"PPG": ppg,
"RSP": rsp,
"EDA": eda,
"EMG": emg})
nk.signal_plot(data, subplots=True)
Electrodermal Activity (EDA/GSR)¶
# Generate 10 seconds of EDA signal (recorded at 250 samples / second) with 2 SCR peaks
eda = nk.eda_simulate(duration=10, sampling_rate=250, scr_number=2, drift=0.01)
# Process it
signals, info = nk.eda_process(eda, sampling_rate=250)
# Visualise the processing
nk.eda_plot(signals, sampling_rate=250)
Cardiac activity (ECG)¶
# Generate 15 seconds of ECG signal (recorded at 250 samples / second)
ecg = nk.ecg_simulate(duration=15, sampling_rate=250, heart_rate=70)
# Process it
signals, info = nk.ecg_process(ecg, sampling_rate=250)
# Visualise the processing
nk.ecg_plot(signals, sampling_rate=250)
Respiration (RSP)¶
# Generate one minute of respiratory (RSP) signal (recorded at 250 samples / second)
rsp = nk.rsp_simulate(duration=60, sampling_rate=250, respiratory_rate=15)
# Process it
signals, info = nk.rsp_process(rsp, sampling_rate=250)
# Visualise the processing
nk.rsp_plot(signals, sampling_rate=250)
Electromyography (EMG)¶
# Generate 10 seconds of EMG signal (recorded at 250 samples / second)
emg = nk.emg_simulate(duration=10, sampling_rate=250, burst_number=3)
# Process it
signals, info = nk.emg_process(emg, sampling_rate=250)
# Visualise the processing
nk.emg_plot(signals, sampling_rate=250)
Photoplethysmography (PPG/BVP)¶
# Generate 15 seconds of PPG signal (recorded at 250 samples / second)
ppg = nk.ppg_simulate(duration=15, sampling_rate=250, heart_rate=70)
# Process it
signals, info = nk.ppg_process(ppg, sampling_rate=250)
# Visualize the processing
nk.ppg_plot(signals, sampling_rate=250)
Electrooculography (EOG)¶
# Import EOG data
eog_signal = nk.data("eog_100hz")
# Process it
signals, info = nk.eog_process(eog_signal, sampling_rate=100)
# Plot
plot = nk.eog_plot(signals, info, sampling_rate=100)
Electrogastrography (EGG)¶
Consider helping us develop it!
Physiological Data Analysis¶
The analysis of physiological data usually comes in two types: event-related or interval-related.
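Conceptually, the distinction can be sketched in plain numpy (a toy illustration of the idea only, not NeuroKit's actual epoching API; in NeuroKit, functions such as nk.epochs_create() and nk.bio_analyze() handle this for real data):

```python
import numpy as np

# Toy "signal": 10 seconds at 100 Hz of heart-rate-like values
sampling_rate = 100
signal = np.sin(np.linspace(0, 20 * np.pi, 10 * sampling_rate)) + 70

# Event-related: slice short epochs around event onsets (here, events at 2 s and 6 s)
onsets = [2 * sampling_rate, 6 * sampling_rate]
epoch_length = 1 * sampling_rate  # 1-second epochs
epochs = [signal[o:o + epoch_length] for o in onsets]
event_means = [float(np.mean(e)) for e in epochs]  # one feature per event

# Interval-related: summarize one long, continuous stretch as a whole
interval_mean = float(np.mean(signal))
```

Event-related analysis thus yields one set of features per event, whereas interval-related analysis yields one set of features for the whole recording (or resting-state period).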
Heart Rate Variability (HRV)¶
Check out our Heart Rate Variability in Psychology: A Review of HRV Indices and an Analysis Tutorial paper for:
a comprehensive review of the most up-to-date HRV indices
a discussion of their significance in psychological research and practice
a step-by-step guide to HRV analysis using NeuroKit2
You can cite the paper as follows:
 Pham, T., Lau, Z. J., Chen, S. H. A., & Makowski, D. (2021).
Heart Rate Variability in Psychology: A Review of HRV Indices and an Analysis Tutorial.
Sensors, 21(12), 3998. https://doi.org/10.3390/s21123998
Compute HRV indices using Python
Time domain: RMSSD, MeanNN, SDNN, SDSD, CVNN, etc.
Frequency domain: Spectral power density in various frequency bands (Ultra low/ULF, Very low/VLF, Low/LF, High/HF, Very high/VHF), Ratio of LF to HF power, Normalized LF (LFn) and HF (HFn), Log-transformed HF (LnHF).
Nonlinear domain: Spread of RR intervals (SD1, SD2, ratio of SD2 to SD1), Cardiac Sympathetic Index (CSI), Cardiac Vagal Index (CVI), Modified CSI, Sample Entropy (SampEn).
# Download data
data = nk.data("bio_resting_8min_100hz")
# Find peaks
peaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100)
# Compute HRV indices
nk.hrv(peaks, sampling_rate=100, show=True)
>>> HRV_RMSSD HRV_MeanNN HRV_SDNN ... HRV_CVI HRV_CSI_Modified HRV_SampEn
>>> 0 69.697983 696.395349 62.135891 ... 4.829101 592.095372 1.259931
Miscellaneous¶
ECG Delineation¶
Delineate the QRS complex of an electrocardiac signal (ECG), including P-peaks and T-peaks, as well as their onsets and offsets.
# Download data
ecg_signal = nk.data(dataset="ecg_3000hz")['ECG']
# Extract R-peak locations
_, rpeaks = nk.ecg_peaks(ecg_signal, sampling_rate=3000)
# Delineate
signal, waves = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="dwt", show=True, show_type='all')
Signal Processing¶
Signal processing functionalities
Filtering: Using different methods.
Detrending: Remove the baseline drift or trend.
Distorting: Add noise and artifacts.
# Generate original signal
original = nk.signal_simulate(duration=6, frequency=1)
# Distort the signal (add noise, linear trend, artifacts etc.)
distorted = nk.signal_distort(original,
noise_amplitude=0.1,
noise_frequency=[5, 10, 20],
powerline_amplitude=0.05,
artifacts_amplitude=0.3,
artifacts_number=3,
linear_drift=0.5)
# Clean (filter and detrend)
cleaned = nk.signal_detrend(distorted)
cleaned = nk.signal_filter(cleaned, lowcut=0.5, highcut=1.5)
# Compare the 3 signals
plot = nk.signal_plot([original, distorted, cleaned])
Complexity (Entropy, Fractal Dimensions, …)¶
Optimize complexity parameters (delay tau, dimension m, tolerance r)
# Generate signal
signal = nk.signal_simulate(frequency=[1, 3], noise=0.01, sampling_rate=100)
# Find optimal time delay, embedding dimension and r
parameters = nk.complexity_optimize(signal, show=True)
Compute complexity features
Entropy: Sample Entropy (SampEn), Approximate Entropy (ApEn), Fuzzy Entropy (FuzzEn), Multiscale Entropy (MSE), Shannon Entropy (ShEn)
Fractal dimensions: Correlation Dimension D2, …
Detrended Fluctuation Analysis
nk.entropy_sample(signal)
nk.entropy_approximate(signal)
Signal Decomposition¶
# Create complex signal
signal = nk.signal_simulate(duration=10, frequency=1) # High freq
signal += 3 * nk.signal_simulate(duration=10, frequency=3) # Higher freq
signal += 3 * np.linspace(0, 2, len(signal)) # Add baseline and linear trend
signal += 2 * nk.signal_simulate(duration=10, frequency=0.1, noise=0) # Nonlinear trend
signal += np.random.normal(0, 0.02, len(signal)) # Add noise
# Decompose signal using Empirical Mode Decomposition (EMD)
components = nk.signal_decompose(signal, method='emd')
nk.signal_plot(components) # Visualize components
# Recompose merging correlated components
recomposed = nk.signal_recompose(components, threshold=0.99)
nk.signal_plot(recomposed) # Visualize components
Signal Power Spectrum Density (PSD)¶
# Generate complex signal
signal = nk.signal_simulate(duration=20, frequency=[0.5, 5, 10, 15], amplitude=[2, 1.5, 0.5, 0.3], noise=0.025)
# Get the PSD using different methods
welch = nk.signal_psd(signal, method="welch", min_frequency=1, max_frequency=20, show=True)
multitaper = nk.signal_psd(signal, method="multitapers", max_frequency=20, show=True)
lomb = nk.signal_psd(signal, method="lomb", min_frequency=1, max_frequency=20, show=True)
burg = nk.signal_psd(signal, method="burg", min_frequency=1, max_frequency=20, order=10, show=True)
Statistics¶
Highest Density Interval (HDI)
x = np.random.normal(loc=0, scale=1, size=100000)
ci_min, ci_max = nk.hdi(x, ci=0.95, show=True)
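For intuition, here is a rough stand-in using plain numpy: an equal-tailed interval based on percentiles. For a symmetric distribution such as this one, it is close to the HDI; nk.hdi() computes the actual highest-density interval, which can differ for skewed distributions.

```python
import numpy as np

np.random.seed(42)
x = np.random.normal(loc=0, scale=1, size=100_000)

# Equal-tailed 95% interval: cut 2.5% off each tail.
# (This approximates the HDI only when the distribution is symmetric.)
ci_min, ci_max = np.percentile(x, [2.5, 97.5])
print(round(ci_min, 1), round(ci_max, 1))  # roughly -2.0 and 2.0
```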
Popularity¶
NeuroKit2 is one of the most welcoming packages for new contributors and users, as well as one of the fastest-growing packages. So stop hesitating and hop onboard 🤗
Notes¶
The authors do not provide any warranty. If this software causes your keyboard to blow up, your brain to liquefy, your toilet to clog or a zombie plague to break loose, the authors CANNOT IN ANY WAY be held responsible.
Installation¶
Hint
Spotted a typo? Would like to add something or make a correction? Join us by contributing (see these guides).
Instructions for users without a Python installation¶
Since NeuroKit2 is a Python package, let’s first make sure that you have a Python installation on your computing device. Then we can move on to install the NeuroKit2 package itself.
Windows¶
Winpython¶
The advantage of WinPython is its portability (i.e., it works out of a folder) and its default setup (convenient for science).
Download a non-Zero version of WinPython
Install it somewhere (the desktop is a good place). It creates a folder called WPyXXxxxx
In the WPyXXxxxx folder, open WinPython Command Prompt.exe
Now you can proceed to install the NeuroKit2 package
Miniconda or Anaconda¶
The difference between the two is straightforward: Miniconda is recommended if you don’t have much storage space and you know what you want to install. Similar to WinPython, Anaconda comes with a base environment, meaning you have basic packages pre-installed. Here is some more information to help you choose between Miniconda and Anaconda.
Download and install Miniconda or Anaconda (make sure the Anaconda3 directory is similar to this: C:\Users\<username>\anaconda3\)
Open the Anaconda Prompt (search for it on your computer)
Run conda help to see your options
Note
There should be a name in parentheses before your user’s directory, e.g., (base) C:\Users\<yourusername>. That is the name of your computing environment. By default, you have a base environment. We don’t want that, so create a new environment.
Run conda create --name <yourenvname>; activate it every time you open up conda by running conda activate <yourenvname>
Now you can proceed to install the NeuroKit2 package
Mac OS¶
Instructions for users with a Python installation¶
If you have Python installed as part of Miniconda or Anaconda, please follow these steps:
As described in above, open the Anaconda Prompt and activate your conda environment
You can now install NeuroKit2 from conda-forge by typing
conda config --add channels conda-forge
conda install neurokit2
If you use another Python installation, you can simply install NeuroKit2 from PyPI:
pip install neurokit2
If you don’t have pip installed, this Python installation guide can guide you through the process.
conda or pip are the preferred methods to install NeuroKit2, as they will install the most up-to-date stable release.
It is also possible to install NeuroKit2 directly from GitHub:
pip install https://github.com/neuropsychology/neurokit/zipball/master
Hint
Enjoy living on the edge? You can always install the latest dev branch to access work-in-progress features using pip install https://github.com/neuropsychology/neurokit/zipball/dev
Get Started¶
Contents:
Get familiar with Python in 10 minutes¶
Hint
Spotted a typo? Would like to add something or make a correction? Join us by contributing (see these guides).
You have no experience in computer science? You are afraid of code? You feel betrayed because you didn’t expect to do programming in psychology studies? Relax! We got you covered.
This tutorial will provide you with all you need to know to dive into the wonderful world of scientific programming. The goal here is not to become a programmer or a software designer, but rather to be able to use the power of programming to get scientific results.
Setup¶
The first thing you will need is to install Python on your computer (we have a tutorial for that). In fact, this includes two things: installing Python (the language) and an environment to be able to use it. For this tutorial, we will assume you have something that looks like Spyder (called an IDE). But you can use Jupyter notebooks, or anything else; it doesn’t really matter.
There is one important concept to understand here: the difference between the CONSOLE and the EDITOR. The editor is like a cooking table where you prepare your ingredients to make a dish, whereas the console is like the oven, you only open it to put the dish in it and get the result.
Most of the code you write will go in the editor. It’s basically a text editor (such as Notepad), except that it automatically highlights the code. Importantly, you can directly execute a line of code (which is equivalent to copying it and pasting it into the console).
For instance, try writing 1+1 somewhere in the file in the editor pane. Now, if you select the piece of code you just wrote and press F9 (or CTRL + ENTER), it will execute it.
As a result, you should see in the console the command that you gave and, below it, its output (which is 2).
Now that the distinction between where we write the code and where the output appears is clear, take some time to explore the settings and turn the editor background to BLACK. Why? Because it’s more comfortable for the eyes, but most importantly, because it’s cool 😎.
Congrats, you’ve become a programmer, a wizard of the modern times.
You can now save the file (CTRL + S), which will be saved with a .py extension (i.e., a Python script). Try closing everything and reopening this file with the editor.
Variables¶
The most important concept in programming is variables, which is a fancy name for something that you already know. Do you remember, from your mathematics classes, the famous x, this placeholder for any value? Well, x was a variable, i.e., a name referring to some other thing.
Hint
Despite what I just said, a variable in programming is not equivalent to a variable in statistics, where it refers to some specific data (for instance, age is a variable containing multiple observations). In programming, a variable is simply the name that we give to some entity, which could be anything.
We can assign a value to a variable using the = sign, for instance:
x = 2
y = 3
Once we execute these two lines, Python will know that x refers to 2, and y to 3. We can now write:
print(x + y)
Which will print in the console the correct result.
We can also store the output in a third variable:
x = 2
y = 3
anothervariable = x * y
print(anothervariable)
Variables and data types¶
The next important thing to have in mind is that variables have types. Basic types include integers (numbers without decimals), floats (numbers with decimals), strings (character text) and booleans (True and False). Depending on their type, variables will not behave in the same way. For example, try:
print(1 + 2)
print("1" + "2")
What happened here? Well, quotation marks ("I am quoted") are used to represent strings (i.e., text). So in the second line, the values that we added were not numbers, but text. And when you add strings together in Python, it concatenates them.
One can change the type of a variable with the following:
int(1.0) # transform the input to an integer
float(1) # transform the input to a float
str(1) # transform the input into text
Also, here I used the hash symbol (#) to write comments, i.e., things that won’t be executed by Python. This is super useful to annotate each line of your code, to remember what you do and why you do it.
Types are often the source of many errors, as they are usually incompatible with one another. For instance, you cannot add a number (int or float) to a character string. Try running 3 + "a": it will throw a TypeError.
Lists and dictionaries¶
Two other important types are lists and dictionaries. You can think of them as containers, as they hold multiple variables. The main difference between them is that in a list, you access individual elements by their position (for instance, “give me the third one”), whereas in a dictionary, you access an element by its name (also known as its key), for example “give me the element named A”.
A list is created using square brackets, and a dictionary using curly brackets. Importantly, in a dictionary, you must specify a name (key) for each element. Here’s what it looks like:
mylist = [1, 2, 3]
mydict = {"A": 1, "B": 2, "C": 3}
Keep in mind that there are more types of containers, such as arrays and dataframes, that we will talk about later.
Basic indexing¶
There’s no point in storing elements in containers if we cannot access them later on. As mentioned earlier, we can access an element from a dictionary by its key within square brackets (note that here the square brackets don’t denote a list; they just mean “look inside the preceding container”).
mydict = {"A": 1, "B": 2, "C": 3}
x = mydict["B"]
print(x)
Exercise time! If you have followed this tutorial so far, you should be able to guess what the following code will output:
mydict = {"1": 0, "2": 42, "x": 7}
x = str(1 + 1)
y = mydict[x]
print(y)
Answer: If you guessed 42, you’re right, congrats! If you guessed 7, you have likely confused the variable named x (which contains 1+1 converted to a string, i.e., "2") with the character "x". And if you guessed 0… what is wrong with you?
Indexing starts from 0¶
As mentioned earlier, one can access elements from a list by their position. However, and this is very important to remember (it is the source of many beginner errors), in Python, positions start from 0. That means that the first element is the 0th.
So if we want the 2nd element of the list, we have to ask for the element at index 1:
mylist = [1, 2, 3]
x = mylist[1]
print(x)
Control flow (if and else)¶
One important notion in programming is control flow. You want the code to do something different depending on a condition. For instance, if x is lower than 3, print “lower than 3”. In Python, this is done as follows:
x = 2
if x < 3:
    print("lower than 3")
One very important thing to notice is that the if statement corresponds to a “chunk” of code, as signified by the colon :. The chunk is usually written below and has to be indented (you can indent a line or a chunk of code by pressing the TAB key).
What is indentation?
this
    is
        indentation
This indentation must be consistent: usually one level of indentation corresponds to 4 spaces. Make sure you respect that throughout your script, as this is very important in Python. If you break the rule, it will throw an error. Try running the following:
if 2 < 3:
    print("lower than 3")
Finally, if statements can be followed by else statements, which take care of what happens if the condition is not fulfilled:
x = 5
if x < 3:
    print("lower")
else:
    print("higher")
Again, note the indentation and how the else statement creates a new indented chunk.
For loops¶
One of the most used concepts is loops, and in particular for loops. Loops are chunks of code that will be run several times, until a condition is met.
A for loop creates a variable that will successively take all the values of a list (or other iterable type). Let’s look at the code below:
for var in [1, 2, 3]:
    print(var)
Here, the for loop creates a variable (that we named var), that will successively take all the values of the provided list.
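Loops become especially useful when combined with a variable that accumulates a result across the iterations, as in this small sketch:

```python
# A common pattern: accumulate a result across iterations
total = 0
for var in [1, 2, 3]:
    total = total + var  # add each element to the running total
print(total)  # 6
```

Here, total starts at 0 and grows at each pass through the loop, ending up with the sum of all the elements.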
Functions¶
Now that you know what a variable is, as well as the purpose of little things like if, else, for, etc., the last most common thing that you will find in code is function calls. In fact, we have already used some of them! Things like print(), str() and int() were functions. You’ve probably encountered them in secondary school mathematics too: remember f(x)?
One important thing about functions is that, most of the time (though not always), they take something in and return something out. A function is like a factory: you give it some raw material and it outputs some transformed stuff.
For instance, let’s say we want to transform a variable containing an integer into a character string:
x = 3
x = str(x)
print(x)
As we can see, the str() function takes x as an input and outputs the transformed version, which we collect using the equal sign = and store back in the x variable, replacing its content.
Another useful function is range(), which creates a sequence of integers and is often used in combination with for loops. Remember our previous loop:
mylist = [1, 2, 3]
for var in mylist:
    print(var)
We can rewrite it using the range() function to create a sequence of length 3 (which will go from 0 to 2; remember that Python indexing starts from 0!), extracting and printing each element of the list:
mylist = [1, 2, 3]
for i in range(3):
    print(mylist[i])
It’s a bit more complicated than the previous version, it’s true. But that’s the beauty of programming: all things can be done in a near-infinite number of ways, allowing for your creativity to be expressed.
Exercise time! Can you write a loop that adds 1 to each element of the list?
Answer:
mylist = [1, 2, 3]
for i in range(3):
    mylist[i] = mylist[i] + 1
print(mylist)
If you understand what happened here, in this combination of lists, functions, loops and indexing, great! You are ready to move on.
Packages¶
Interestingly, Python alone does not include a lot of functions. And that’s also its strength, because it lets you easily use functions developed by other people, which are stored in packages (or modules). A package is a collection of functions that can be downloaded and used in your code.
One of the most popular packages is numpy (for NUMerical PYthon), which includes a lot of functions for maths and scientific programming. This package is likely already installed in your Python distribution. However, installing a package doesn’t mean you can use it: you first have to import (load) it in your script. This usually happens at the top of a Python file, like this:
import numpy
Once you have imported it (you have to run that line), you can use its functions. For instance, let’s use the function to compute square roots included in this package:
x = numpy.sqrt(9)
print(x)
You will notice that we have to first write the package name, then a dot, and then the sqrt() function. Why is it like that? Imagine you load two packages, both having a function named sqrt(). How would the program know which one to use? Here, it knows that it has to look for the sqrt() function in the numpy package.
You might think it’s annoying to write the name of the package every time, especially if the package name is long. This is why we often use aliases. For instance, numpy is usually loaded under the shortcut np, which makes it shorter to use:
import numpy as np
x = np.sqrt(9)
print(x)
Lists vs. vectors (arrays)¶
Packages can also add new types. One important type available through numpy is the array.
In short, an array is a container, similar to a list. However, it can only contain one type of thing inside (for instance, only floats, or only strings) and can be multidimensional (imagine a 3D cube made of little cubes, each containing a value). If an array is one-dimensional (like a list, i.e., a sequence of elements), we can call it a vector.
A list can be converted to a vector using the array() function from the numpy package:
mylist = [1, 2, 3]
myvector = np.array(mylist)
print(myvector)
In signal processing, vectors are often used instead of lists to store the signal values, because they are more efficient and allow you to do some cool stuff with them. For instance, remember our exercise above, in which we had to add 1 to each element of the list? Well, using vectors, you can do this directly:
myvector = np.array([1, 2, 3])
myvector = myvector + 1
print(myvector)
Indeed, vectors allow for vectorized operations, which means that any operation is propagated on each element of the vector. And that’s very useful for signal processing :)
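To illustrate the multidimensional aspect mentioned above, here is a small sketch of a 2D array (the names are just illustrative):

```python
import numpy as np

# Arrays can be multidimensional: here, 2 rows and 3 columns
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
print(matrix.shape)  # (2, 3): (number of rows, number of columns)

# Vectorized operations apply element-wise to the whole array at once
doubled = matrix * 2
print(doubled)
```

The same element-wise logic you saw with vectors carries over unchanged to arrays of any dimension.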
Conditional indexing¶
Arrays can also be transformed into arrays of booleans (True or False) using a condition, for instance:
myvector = np.array([1, 2, 3, 2, 1])
vector_of_bools = myvector <= 2 # <= means less than OR equal to
print(vector_of_bools)
This returns a vector of the same length, filled with True (if the condition is met) or False otherwise. This new vector can then be used as a mask to index and subset the original vector. For instance, we can select all the elements of the array that fulfill the condition:
myvector = np.array([1, 2, 3, 2, 1])
mask = myvector <= 2
subset = myvector[mask]
print(subset)
Additionally, we can also modify a subset of values on the fly:
myvector = np.array([1, 2, 3, 2, 1])
myvector[myvector <= 2] = 6
print(myvector)
Here we assigned a new value, 6, to all elements of the vector that met the condition (were less than or equal to 2).
Dataframes¶
If you’ve followed everything until now, congrats! You’re almost there. The last important type that we are going to see is dataframes. A dataframe is essentially a table with rows and columns. Often, the rows represent different observations and the columns different variables.
Dataframes are available in Python through the pandas package, another widely used package, usually imported under the shortcut pd. A dataframe can be constructed from a dictionary: each key becomes the variable name, and the list or vector associated with it becomes the variable values.
import pandas as pd
# Create variables
var1 = [1, 2, 3]
var2 = [5, 6, 7]
# Put them in a dict
data = {"Variable1": var1, "Variable2": var2}
# Convert this dict to a dataframe
data = pd.DataFrame.from_dict(data)
print(data)
This creates a dataframe with 3 rows (the observations) and 2 columns (the variables). One can access the variables by their name:
print(data["Variable1"])
Note that Python cares about the case: tHiS is not equivalent to ThIs. And pd.DataFrame has to be written with the D and F in capital letters. This is another common source of beginner errors, so make sure you put capital letters in the right place.
Reading data¶
Now that you know how to create a dataframe in Python, note that you can also use pandas to read data from a file (.csv, Excel, etc.) by its path:
import pandas as pd
data = pd.read_excel("C:/Users/Dumbledore/Desktop/myfile.xlsx") # this is an example
print(data)
Additionally, this can also read data directly from the internet! Try running the following:
import pandas as pd
data = pd.read_csv("https://raw.githubusercontent.com/neuropsychology/NeuroKit/master/data/bio_eventrelated_100hz.csv")
print(data)
Next steps¶
Now that you know the basics, and that you can distinguish between the different elements of Python code (function calls, variables, etc.), we recommend that you dive in and follow our other examples and tutorials, which will show you some uses of Python to get something out of it.
Where to start¶
Hint
This page is under construction. Consider helping us develop it by contributing.
Here are a few examples that are good for getting started with NeuroKit.
Examples¶
The notebooks in this repo are meant to illustrate what you can do with NeuroKit. They show how easy it has become to use cutting-edge methods, while still retaining the freedom to change a myriad of parameters. These notebooks are organized into different sections that correspond to NeuroKit’s modules.
Try the examples in your browser¶
Hint
Spotted a typo? Would like to add something or make a correction? Join us by contributing (see this tutorial).
You are free to click on the link below to run everything… without having to install anything! There you’ll find a JupyterLab with notebooks ready to fire up. If you need help figuring out the interface, the secret is shift+enter.
1. Analysis Paradigm¶
Examples dedicated to specific analysis pipelines, such as for event-related paradigms and resting state.
Ideas of examples to be implemented:
> Preprocessing feature signals for machine learning analysis
> EEG + physiological activity during resting state
> Comparing interval-related activity from different “mental states” (e.g., meditation, induced emotion vs. neutral)
2. Biosignal Processing¶
Examples dedicated to processing pipelines and the extraction of measures from multiple signals at a time. What’s your thing? How do you do it?
Ideas of examples to be implemented:
> Batch preprocessing of multiple recordings
> PPG processing for respiration and temperature
> EMG overview (so many muscles to investigate)
> add yours...
a) Custom processing pipeline¶
custom.ipynb
 Description
This notebook breaks down the default NeuroKit pipeline used in the *_process() functions. It guides you in creating your own pipeline with the parameters best suited for your signals.
3. Heart rate and heart cycles¶
Examples dedicated to the analysis of ECG, PPG and HRV time series. Are you a fan of the neurovisceral integration model? How would you infer a cognitive or affective process with HRV? How do you investigate the asymmetry of cardiac cycles?
Ideas of examples to be implemented:
> Benchmark different peak detection methods
> Resting-state analysis of HRV
> Comparing resting state and movie watching
> add yours
a) Detecting components of the cardiac cycle¶
ecg_delineation.ipynb
 Description
This notebook illustrates how reliable the peak detection is by analyzing the morphology of each cardiac cycle. It shows you how the PQRST components are extracted.
b) Looking closer at heart beats¶
heartbeats.ipynb
 Description
This notebook gives hints for a thorough investigation of ECG signals by visualizing individual heart beats, interactively.
4. Electrodermal activity¶
Examples dedicated to the analysis of EDA signals.
Ideas of examples to be implemented:
> Pain experiments
> Temperature
> add yours
a) Extracting information in EDA¶
eda.ipynb
 Description
This notebook goes to the heart of the complexity of EDA analysis by breaking down how the tonic and phasic components are extracted from the signal.
5. Respiration rate and respiration cycles¶
Examples dedicated to the analysis of respiratory signals, e.g., as recorded by a respiration belt or, potentially, derived from PPG.
Ideas of examples to be implemented:
> Meditation experiments
> Stress regulation
> add yours
a) Extracting Respiration Rate Variability metrics¶
rrv.ipynb
 Description
This notebook breaks down the extraction of variability metrics performed by rsp_rrv().
6. Muscle activity¶
Examples dedicated to the analysis of EMG signals.
Ideas of examples to be implemented:
> Suggestion and muscle activation
> Sleep data analysis
>... nothing yet!
Simulate Artificial Physiological Signals¶
This example can be referenced by citing the package.
NeuroKit’s core signal processing functions cover electrocardiogram (ECG), respiratory (RSP), electrodermal activity (EDA), and electromyography (EMG) data. Hence, this example shows how to use NeuroKit to simulate these physiological signals with customized parametric control.
[5]:
import warnings
warnings.filterwarnings('ignore')
[6]:
# Load NeuroKit and other useful packages
import neurokit2 as nk
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
[7]:
plt.rcParams['figure.figsize'] = [10, 6] # Bigger images
Cardiac Activity (ECG)¶
With ecg_simulate(), you can generate an artificial ECG signal of a desired length (here, duration=10), noise level, and heart rate. As you can see in the plot below, ecg50 has about half as many heart beats as ecg100, and ecg50 also contains more noise.
[8]:
# Alternate heart rate and noise levels
ecg50 = nk.ecg_simulate(duration=10, noise=0.05, heart_rate=50)
ecg100 = nk.ecg_simulate(duration=10, noise=0.01, heart_rate=100)
# Visualize
ecg_df = pd.DataFrame({"ECG_100": ecg100,
                       "ECG_50": ecg50})
nk.signal_plot(ecg_df, subplots=True)
You can also generate either the default, simple simulation based on Daubechies wavelets, which roughly approximates one cardiac cycle, or a more complex one by specifying method="ecgsyn".
[9]:
# Alternate methods
ecg_sim = nk.ecg_simulate(duration=10, method="simple")
ecg_com = nk.ecg_simulate(duration=10, method="ecgsyn")
# Visualize
methods = pd.DataFrame({"ECG_Simple": ecg_sim,
                        "ECG_Complex": ecg_com})
nk.signal_plot(methods, subplots=True)
Respiration (RSP)¶
To simulate a synthetic respiratory signal, you can use rsp_simulate() and choose a specific duration and breathing rate. In the example below, you can see that rsp7 has a lower breathing rate than rsp15. You can also decide which model generates the signal. The simple rsp15 signal uses method="sinusoidal", which approximates a respiratory cycle with a trigonometric sine wave. The complex rsp15 signal uses method="breathmetrics", a more advanced model that interpolates inhalation and exhalation pauses between each respiratory cycle.
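As a rough intuition for the sinusoidal model (a hand-rolled sketch, not rsp_simulate()'s actual implementation), a respiratory cycle at a given breathing rate is just a sine wave at the corresponding frequency:

```python
import numpy as np

def sinusoidal_rsp(duration=20, sampling_rate=100, respiratory_rate=15):
    """Approximate a respiratory signal with a plain sine wave.

    `respiratory_rate` is in breaths per minute, so the oscillation
    frequency in Hz is respiratory_rate / 60.
    """
    t = np.arange(duration * sampling_rate) / sampling_rate
    return np.sin(2 * np.pi * (respiratory_rate / 60) * t)

rsp15 = sinusoidal_rsp(respiratory_rate=15)  # one breath every 4 seconds
rsp7 = sinusoidal_rsp(respiratory_rate=7)    # slower breathing
```

rsp_simulate() additionally handles noise and amplitude scaling on top of the chosen model.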
[10]:
# Simulate
rsp15_sim = nk.rsp_simulate(duration=20, respiratory_rate=15, method="sinusoidal")
rsp15_com = nk.rsp_simulate(duration=20, respiratory_rate=15, method="breathmetrics")
rsp7 = nk.rsp_simulate(duration=20, respiratory_rate=7, method="breathmetrics")
# Visualize respiration rate
rsp_df = pd.DataFrame({"RSP7": rsp7,
                       "RSP15_simple": rsp15_sim,
                       "RSP15_complex": rsp15_com})
nk.signal_plot(rsp_df, subplots=True)
Electromyography (EMG)¶
Now, we come to generating an artificial EMG signal using emg_simulate(). Here, you can specify the number of bursts of muscular activity (burst_number) in the signal as well as the duration of the bursts (burst_duration). As you can see, the active muscle periods in EMG2_Longer last longer than those of EMG2, and EMG5 contains more bursts than the former two.
[11]:
# Simulate
emg2 = nk.emg_simulate(duration=10, burst_number=2, burst_duration=1.0)
emg2_long = nk.emg_simulate(duration=10, burst_number=2, burst_duration=1.5)
emg5 = nk.emg_simulate(duration=10, burst_number=5, burst_duration=1.0)
# Visualize
emg_df = pd.DataFrame({"EMG2": emg2,
                       "EMG2_Longer": emg2_long,
                       "EMG5": emg5})
nk.signal_plot(emg_df, subplots=True)
Electrodermal Activity (EDA)¶
Finally, eda_simulate() can be used to generate a synthetic EDA signal of a given duration, specifying the number of skin conductance responses, or activity ‘peaks’ (scr_number), and the drift of the signal. You can also modify the noise level of the signal.
[12]:
# Simulate
eda1 = nk.eda_simulate(duration=10, scr_number=1, drift=0.01, noise=0.05)
eda3 = nk.eda_simulate(duration=10, scr_number=3, drift=0.01, noise=0.01)
eda3_long = nk.eda_simulate(duration=10, scr_number=3, drift=0.1, noise=0.01)
# Visualize
eda_df = pd.DataFrame({"EDA1": eda1,
                       "EDA3": eda3,
                       "EDA3_Longer": eda3_long})
nk.signal_plot(eda_df, subplots=True)
Customize your Processing Pipeline¶
This example can be referenced by citing the package.
While NeuroKit is designed to be beginner-friendly, experts who want more control over their processing pipeline can also tune the functions to their specific usage. Taking ECG processing as an example, this example shows advanced users how to customize their own processing pipeline with NeuroKit.
[6]:
# Load NeuroKit and other useful packages
import neurokit2 as nk
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import pandas as pd
[7]:
plt.rcParams['figure.figsize'] = [15, 9] # Bigger images
The Default NeuroKit processing pipeline¶
NeuroKit provides a very useful set of functions, *_process() (e.g., ecg_process(), eda_process(), emg_process(), …), which are all-in-one functions that clean, preprocess and process the signals. They include good, sensible defaults that should suit most users and typical use cases. That said, in some cases you might want more control over the processing pipeline.
This is how ecg_process()
is typically used:
[8]:
# Simulate ecg signal (you can use your own one)
ecg = nk.ecg_simulate(duration=15, sampling_rate=1000, heart_rate=80)
# Default processing pipeline
signals, info = nk.ecg_process(ecg, sampling_rate=1000)
# Visualize
plot = nk.ecg_plot(signals)
Building your own process()
function¶
Now, if you look at the code of ecg_process() (https://github.com/neuropsychology/NeuroKit/blob/master/neurokit2/ecg/ecg_process.py#L49; see here for how to explore the code), you can see that it is in fact very simple.
It uses what can be referred to as “midlevel functions”, such as ecg_clean()
, ecg_peaks()
, ecg_rate()
etc.
This means that you can basically recreate the ecg_process()
function very easily by calling these midlevel functions:
[9]:
# Define a new function
def my_processing(ecg_signal):
    # Do processing
    ecg_cleaned = nk.ecg_clean(ecg_signal, sampling_rate=1000)
    instant_peaks, rpeaks = nk.ecg_peaks(ecg_cleaned, sampling_rate=1000)
    rate = nk.ecg_rate(rpeaks, sampling_rate=1000, desired_length=len(ecg_cleaned))
    quality = nk.ecg_quality(ecg_cleaned, sampling_rate=1000)

    # Prepare output
    signals = pd.DataFrame({"ECG_Raw": ecg_signal,
                            "ECG_Clean": ecg_cleaned,
                            "ECG_Rate": rate,
                            "ECG_Quality": quality})
    signals = pd.concat([signals, instant_peaks], axis=1)
    info = rpeaks
    return signals, info
You can now use this function as you would do with ecg_process()
.
[10]:
# Process the signal using previously defined function
signals, info = my_processing(ecg)
# Visualize
plot = nk.ecg_plot(signals)
Changing the processing parameters¶
Now, you might want to ask, why would you recreate the processing function? Well, it allows you to change the parameters of the inside as you please. Let’s say you want to use a specific cleaning method.
First, let’s look at the documentation for ecg_clean() (https://neurokit2.readthedocs.io/en/latest/functions.html#neurokit2.ecg_clean): you can see that there are several different cleaning methods that can be specified. The default is the NeuroKit method; however, depending on the quality of your signal (and several other factors), other methods may be more appropriate. It is up to you to make this decision.
You can now change the methods as you please for each function in your custom processing function that you have written above:
[11]:
# Define a new function
def my_processing(ecg_signal):
    # Do processing
    ecg_cleaned = nk.ecg_clean(ecg_signal, sampling_rate=1000, method="engzeemod2012")
    instant_peaks, rpeaks = nk.ecg_peaks(ecg_cleaned, sampling_rate=1000)
    rate = nk.ecg_rate(rpeaks, sampling_rate=1000, desired_length=len(ecg_cleaned))
    quality = nk.ecg_quality(ecg_cleaned, sampling_rate=1000)

    # Prepare output
    signals = pd.DataFrame({"ECG_Raw": ecg_signal,
                            "ECG_Clean": ecg_cleaned,
                            "ECG_Rate": rate,
                            "ECG_Quality": quality})
    signals = pd.concat([signals, instant_peaks], axis=1)
    info = rpeaks
    return signals, info
Similarly, you can select a different method for the peak detection.
Customize even more!¶
It is possible that none of these methods suits your needs, or that you want to test a new method. Rejoice, as NeuroKit allows you to do just that by providing what can be referred to as “low-level” functions.
For instance, you can rewrite the cleaning procedure using the signal processing tools offered by NeuroKit:
[12]:
def my_cleaning(ecg_signal, sampling_rate):
    detrended = nk.signal_detrend(ecg_signal, order=1)
    cleaned = nk.signal_filter(detrended, sampling_rate=sampling_rate, lowcut=2, highcut=9, method='butterworth')
    return cleaned
You can use this function inside your custom processing written above:
[13]:
# Define a new function
def my_processing(ecg_signal):
    # Do processing
    ecg_cleaned = my_cleaning(ecg_signal, sampling_rate=1000)
    instant_peaks, rpeaks = nk.ecg_peaks(ecg_cleaned, sampling_rate=1000)
    rate = nk.ecg_rate(rpeaks, sampling_rate=1000, desired_length=len(ecg_cleaned))
    quality = nk.ecg_quality(ecg_cleaned, sampling_rate=1000)

    # Prepare output
    signals = pd.DataFrame({"ECG_Raw": ecg_signal,
                            "ECG_Clean": ecg_cleaned,
                            "ECG_Rate": rate,
                            "ECG_Quality": quality})
    signals = pd.concat([signals, instant_peaks], axis=1)
    info = rpeaks
    return signals, info
Congrats, you have created your own processing pipeline! Let’s see how it performs:
[14]:
signals, info = my_processing(ecg)
plot = nk.ecg_plot(signals)
This doesn’t look bad :) Can you do better?
Analyze Electrodermal Activity (EDA)¶
This example can be referenced by citing the package.
This example shows how to use NeuroKit2 to extract features from Electrodermal Activity (EDA).
[5]:
# Load the NeuroKit package and other useful packages
import neurokit2 as nk
import matplotlib.pyplot as plt
%matplotlib inline
[6]:
plt.rcParams['figure.figsize'] = [15, 5] # Bigger images
Extract the cleaned EDA signal¶
In this example, we will use a simulated EDA signal. However, you can use any signal you have (for instance, extracted from a file using read_acqknowledge()).
[7]:
# Simulate 10 seconds of EDA Signal (recorded at 250 samples / second)
eda_signal = nk.eda_simulate(duration=10, sampling_rate=250, scr_number=3, drift=0.01)
Once you have a raw EDA signal in the shape of a vector (i.e., a one-dimensional array), or a list, you can use eda_process() to process it.
[8]:
# Process the raw EDA signal
signals, info = nk.eda_process(eda_signal, sampling_rate=250)
Note: It is critical that you specify the correct sampling rate of your signal throughout many processing functions, as this allows NeuroKit to have a time reference.
This function outputs two elements, a dataframe containing the different signals (e.g., the raw signal, clean signal, SCR samples marking the different features etc.), and a dictionary containing information about the Skin Conductance Response (SCR) peaks (e.g., onsets, peak amplitude etc.).
Locate Skin Conductance Response (SCR) features¶
The processing function does two important things for our purpose: first, it cleans the signal; second, it detects the location of 1) the peak onsets, 2) the peak amplitudes, and 3) the half-recovery times. Let’s extract these from the output.
[9]:
# Extract clean EDA and SCR features
cleaned = signals["EDA_Clean"]
features = [info["SCR_Onsets"], info["SCR_Peaks"], info["SCR_Recovery"]]
We can now visualize the location of the peak onsets, the peak amplitudes, and the half-recovery time points in the cleaned EDA signal, respectively marked by the red, blue, and orange dashed lines.
[10]:
# Visualize SCR features in cleaned EDA signal
plot = nk.events_plot(features, cleaned, color=['red', 'blue', 'orange'])
Decompose EDA into Phasic and Tonic components¶
We can also decompose the EDA signal into its phasic and tonic components: the phasic Skin Conductance Response (SCR) and the tonic Skin Conductance Level (SCL), respectively. The SCR represents the stimulus-dependent, fast-changing part of the signal, whereas the SCL is slow-changing and continuous. Separating these two signals helps provide a more accurate estimation of the true SCR amplitude.
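The intuition behind the decomposition can be sketched with a naive moving-average baseline (this is NOT the algorithm eda_phasic() uses, which offers proper high-pass and model-based methods, but it conveys the slow-vs-fast idea):

```python
import numpy as np

def naive_phasic_tonic(eda, sampling_rate, window_sec=4.0):
    """Crude decomposition sketch: estimate the tonic level with a
    moving average and treat the fast residual as the phasic part."""
    win = int(window_sec * sampling_rate)
    kernel = np.ones(win) / win
    tonic = np.convolve(eda, kernel, mode="same")  # slow-moving baseline
    phasic = eda - tonic                           # fast residual
    return phasic, tonic

# Toy signal: slow linear drift plus one fast bump at t = 5 s
sr = 100
t = np.arange(10 * sr) / sr
eda = 0.1 * t + np.exp(-((t - 5) ** 2) / 0.05)
phasic, tonic = naive_phasic_tonic(eda, sr)
```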
[11]:
# Filter phasic and tonic components
data = nk.eda_phasic(nk.standardize(eda_signal), sampling_rate=250)
Note: here we standardized the raw EDA signal before the decomposition, which can be useful in the presence of high inter-individual variations.
We can now add the raw signal to the dataframe containing the two signals, and plot them!
[12]:
data["EDA_Raw"] = eda_signal # Add raw signal
data.plot()
[12]:
<matplotlib.axes._subplots.AxesSubplot at 0x209e08abba8>
Quick Plot¶
You can obtain all of these features by using the eda_plot() function on the dataframe of processed EDA.
[13]:
# Plot EDA signal
plot = nk.eda_plot(signals)
Analyze Respiratory Rate Variability (RRV)¶
This example can be referenced by citing the package.
Respiratory Rate Variability (RRV), i.e., variations in respiratory rhythm, is a crucial index of general health and respiratory complications. This example shows how to use NeuroKit to perform RRV analysis.
Download Data and Extract Relevant Signals¶
[2]:
# Load NeuroKit and other useful packages
import neurokit2 as nk
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
[3]:
plt.rcParams['figure.figsize'] = 15, 5 # Bigger images
In this example, we will download a dataset that contains electrocardiogram, respiratory, and electrodermal activity signals, and extract only the respiratory (RSP) signal.
[3]:
# Get data
data = pd.read_csv("https://raw.githubusercontent.com/neuropsychology/NeuroKit/master/data/bio_eventrelated_100hz.csv")
rsp = data["RSP"]
nk.signal_plot(rsp, sampling_rate=100) # Visualize
You now have the raw RSP signal in the shape of a vector (i.e., a one-dimensional array). You can then clean it using rsp_clean() and extract the inhalation peaks of the signal using rsp_peaks(). This will output 1) a dataframe indicating the occurrences of inhalation peaks and exhalation troughs (“1” marked in a list of zeros), and 2) a dictionary with the samples of the peaks and troughs.
Note: As the dataset was recorded at 100 Hz, make sure the ``sampling_rate`` argument is also set to 100 Hz. It is critical that you specify the correct sampling rate of your signal throughout all the processing functions.
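The “1 marked in a list of zeros” format can be pictured like this (a generic sketch of the convention, not NeuroKit’s internal code):

```python
import numpy as np

def peaks_to_markers(peak_indices, desired_length):
    """Turn a list of sample indices into a 0/1 marker vector
    of the same length as the signal ("1" at each peak)."""
    markers = np.zeros(desired_length, dtype=int)
    markers[np.asarray(peak_indices)] = 1
    return markers

markers = peaks_to_markers([3, 7], desired_length=10)
# markers -> [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
```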
[4]:
# Clean signal
cleaned = nk.rsp_clean(rsp, sampling_rate=100)
# Extract peaks
df, peaks_dict = nk.rsp_peaks(cleaned, sampling_rate=100)
info = nk.rsp_fixpeaks(peaks_dict)
formatted = nk.signal_formatpeaks(info, desired_length=len(cleaned), peak_indices=info["RSP_Peaks"])
[5]:
nk.signal_plot(pd.DataFrame({"RSP_Raw": rsp, "RSP_Clean": cleaned}), sampling_rate=100, subplots=True)
[6]:
candidate_peaks = nk.events_plot(peaks_dict['RSP_Peaks'], cleaned)
[7]:
fixed_peaks = nk.events_plot(info['RSP_Peaks'], cleaned)
[8]:
# Extract rate
rsp_rate = nk.rsp_rate(formatted, desired_length=None, sampling_rate=100) # Note: You can also replace info with peaks dictionary
# Visualize
nk.signal_plot(rsp_rate, sampling_rate=100)
plt.ylabel('BPM')
[8]:
Text(0, 0.5, 'BPM')
Analyse RRV¶
Now that we have extracted the respiratory rate signal and the peaks dictionary, you can input them into rsp_rrv(). This outputs a variety of RRV indices, including time-domain, frequency-domain, and nonlinear features. Examples of time-domain features include RMSSD (root mean square of successive differences) and SDBB (standard deviation of the breath-to-breath intervals). Power spectral analyses (e.g., LF, HF, LF/HF) and entropy measures (e.g., sample entropy, SampEn, where smaller values indicate that the respiratory rate is regular and predictable) are examples of frequency-domain and nonlinear features, respectively.
A Poincaré plot is also shown when setting show=True, plotting each breath-to-breath interval against the next successive one. It shows the distribution of successive respiratory rates.
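The time-domain and Poincaré indices mentioned above have compact definitions that can be written down directly from the breath-to-breath (BB) intervals (a hand-rolled illustration of the formulas, not rsp_rrv()'s exact implementation):

```python
import numpy as np

def rrv_time_domain(bb_intervals):
    """Compute SDBB, RMSSD, and Poincare SD1/SD2 from
    breath-to-breath intervals (e.g., in milliseconds)."""
    bb = np.asarray(bb_intervals, dtype=float)
    diff = np.diff(bb)                        # successive differences
    sdbb = np.std(bb, ddof=1)                 # SD of breath-to-breath intervals
    rmssd = np.sqrt(np.mean(diff ** 2))       # root mean square of successive differences
    sd1 = np.std(diff, ddof=1) / np.sqrt(2)   # short-term variability (Poincare width)
    sd2 = np.sqrt(2 * sdbb ** 2 - sd1 ** 2)   # long-term variability (Poincare length)
    return {"SDBB": sdbb, "RMSSD": rmssd, "SD1": sd1, "SD2": sd2}

# Toy breath-to-breath intervals in milliseconds
metrics = rrv_time_domain([4000, 4100, 4200, 4300, 4200, 4100])
```

In the Poincaré plot, SD1 is the dispersion perpendicular to the identity line and SD2 the dispersion along it.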
[9]:
rrv = nk.rsp_rrv(rsp_rate, info, sampling_rate=100, show=True)
rrv
Neurokit warning: signal_psd(): The duration of recording is too short to support a sufficiently long window for high frequency resolution. Consider using a longer recording or increasing the `min_frequency`
[9]:
The output contains one row (index 0) with the following columns:

RRV_SDBB      1030.411296
RRV_RMSSD     1269.625397
RRV_SDSD      1286.590811
RRV_VLF          0.000000
RRV_LF         203.846501
RRV_HF        2000.465169
RRV_LFHF         0.101900
RRV_LFn          0.092476
RRV_HFn          0.907524
RRV_SD1        909.757087
RRV_SD2       1138.348330
RRV_SD2SD1       1.251266
RRV_ApEn         0.496939
RRV_SampEn       0.693147
RRV_DFA          0.755783
This is a simple visualization tool for short-term (SD1) and long-term (SD2) variability in respiratory rhythm.
See documentation for full reference¶
RRV method taken from: Soni et al. 2019
ECGDerived Respiration (EDR) Analysis¶
ECG-derived respiration (EDR) is the extraction of respiratory information from the ECG: a non-invasive way to monitor respiratory activity when respiratory signals are not recorded. In clinical settings, this is convenient as it allows monitoring cardiac and respiratory signals simultaneously from a single recorded ECG signal. This example shows how to use NeuroKit to perform EDR analysis.
This example can be referenced by citing the package.
[2]:
# Load NeuroKit and other useful packages
import neurokit2 as nk
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
[3]:
plt.rcParams['figure.figsize'] = [15, 5] # Bigger images
Download ECG Data¶
In this example, we will download a dataset containing an ECG signal sampled at 1000 Hz.
[3]:
# Get data
ecg = np.array(pd.read_csv("https://raw.githubusercontent.com/neuropsychology/NeuroKit/dev/data/ecg_1000hz.csv"))[:, 1]
# Visualize signal
nk.signal_plot(ecg)
Extraction of ECG Features¶
Now you can extract the R peaks of the signal using ecg_peaks()
and compute the heart period using ecg_rate()
.
Note: As the dataset was recorded at 1000 Hz, make sure the ``sampling_rate`` argument is also set to 1000 Hz. It is critical that you specify the correct sampling rate of your signal throughout all the processing functions.
[4]:
# Extract peaks
rpeaks, info = nk.ecg_peaks(ecg, sampling_rate=1000)
# Compute rate
ecg_rate = nk.ecg_rate(rpeaks, sampling_rate=1000, desired_length=len(ecg))
Analyse EDR¶
Now that we have an array of the heart period, we can input it into ecg_rsp() to extract the EDR. The default method is by Van Gent et al. 2019; see the full reference in the documentation (run: nk.ecg_rsp?).
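As an intuition for what EDR methods exploit: respiration modulates heart rate (respiratory sinus arrhythmia), so a breathing oscillation can be recovered from a heart-rate series by isolating the respiratory frequency band. Below is a toy sketch using a crude FFT band-pass (not the Van Gent et al. method that ecg_rsp() implements):

```python
import numpy as np

# Toy heart-rate series: 70 bpm baseline modulated by breathing at 0.25 Hz
# (15 breaths/min), plus a little noise
sr = 10  # the heart-rate series is sampled at 10 Hz
t = np.arange(60 * sr) / sr
rng = np.random.default_rng(0)
heart_rate = 70 + 3 * np.sin(2 * np.pi * 0.25 * t) + 0.5 * rng.standard_normal(len(t))

# Crude band-pass: zero out FFT components outside a plausible
# respiratory band (0.1-0.5 Hz), then invert the transform
freqs = np.fft.rfftfreq(len(t), d=1 / sr)
spectrum = np.fft.rfft(heart_rate)
spectrum[(freqs < 0.1) | (freqs > 0.5)] = 0
edr = np.fft.irfft(spectrum, n=len(t))

# Dominant frequency of the recovered signal (should match the breathing rate)
dominant = freqs[np.argmax(np.abs(np.fft.rfft(edr)))]
```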
[5]:
edr = nk.ecg_rsp(ecg_rate, sampling_rate=1000)
# Visual comparison
nk.signal_plot(edr)
The function ecg_rsp()
incorporates different methods to compute EDR. For a visual comparison of the different methods, we can create a dataframe of EDR columns each of which are produced by different methods, and then plot it, like so:
[16]:
edr_df = pd.DataFrame({"Van Gent et al.": nk.ecg_rsp(ecg_rate, sampling_rate=1000),
"Charlton et al." : nk.ecg_rsp(ecg_rate, sampling_rate=1000, method="charlton2016"),
"Soni et al.": nk.ecg_rsp(ecg_rate, sampling_rate=1000, method="soni2019"),
"Sarkar et al.": nk.ecg_rsp(ecg_rate, sampling_rate=1000, method="sarkar2015")})
[17]:
nk.signal_plot(edr_df, subplots=True)
Extract and Visualize Individual Heartbeats¶
This example can be referenced by citing the package.
This example shows how to use NeuroKit to extract and visualize the QRS complexes (individual heartbeats) from an electrocardiogram (ECG).
[17]:
# Load NeuroKit and other useful packages
import neurokit2 as nk
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
[18]:
plt.rcParams['figure.figsize'] = [15, 9] # Bigger images
Extract the cleaned ECG signal¶
In this example, we will use a simulated ECG signal. However, you can use any signal of your own (for instance, extracted from a file using read_acqknowledge()).
[19]:
# Simulate 30 seconds of ECG Signal (recorded at 250 samples / second)
ecg_signal = nk.ecg_simulate(duration=30, sampling_rate=250)
Once you have a raw ECG signal in the shape of a vector (i.e., a one-dimensional array), or a list, you can use ecg_process() to process it.
Note: It is critical that you specify the correct sampling rate of your signal throughout many processing functions, as this allows NeuroKit to have a time reference.
[20]:
# Automatically process the (raw) ECG signal
signals, info = nk.ecg_process(ecg_signal, sampling_rate=250)
This function outputs two elements: a dataframe containing the different signals (raw, cleaned, etc.), and a dictionary containing various additional information (peak locations, …).
Extract R-peaks location¶
The processing function does two important things for our purpose: 1) it cleans the signal and 2) it detects the location of the R-peaks. Let’s extract these from the output.
[21]:
# Extract clean ECG and R-peaks location
rpeaks = info["ECG_R_Peaks"]
cleaned_ecg = signals["ECG_Clean"]
Great. We can now mark the R-peaks’ locations in the signal to make sure they were detected correctly.
[22]:
# Visualize R-peaks in ECG signal
plot = nk.events_plot(rpeaks, cleaned_ecg)
Once we know where the R-peaks are located, we can create windows of signal around them (e.g., 1 second long, starting 400 ms before the R-peak), which we can refer to as epochs.
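The windowing idea can be sketched with plain NumPy (a simplified illustration; the ecg_segment() and epochs_create() functions used below handle the bookkeeping, interpolation and labeling for you):

```python
import numpy as np

def make_epochs(signal, peaks, sampling_rate, before=0.4, after=0.6):
    """Cut a window around each peak index: `before` seconds before
    it and `after` seconds after it (1-second epochs in total here)."""
    pre, post = int(before * sampling_rate), int(after * sampling_rate)
    epochs = []
    for p in peaks:
        if p - pre >= 0 and p + post <= len(signal):  # skip truncated edge windows
            epochs.append(signal[p - pre:p + post])
    return np.array(epochs)

signal = np.arange(1000, dtype=float)  # dummy 4-second signal at 250 Hz
epochs = make_epochs(signal, peaks=[50, 250, 500, 900], sampling_rate=250)
# The first and last peaks sit too close to the edges, so 2 epochs remain
```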
Segment the signal around the heart beats¶
You can now epoch all these individual heart beats, synchronized by their R-peaks, with the ecg_segment() function.
[23]:
# Plotting all the heart beats
epochs = nk.ecg_segment(cleaned_ecg, rpeaks=rpeaks, sampling_rate=250, show=True)
This creates a dictionary of dataframes, one per ‘epoch’ (in this case, one per heart beat).
Advanced Plotting¶
This section covers more advanced plotting for visualizing all the heartbeat segments. The code below uses packages other than NeuroKit2 to manually set the colour gradient of the signals and to create a more interactive experience: by hovering your cursor over each signal, an annotation with the corresponding heart beat index is shown.
Custom colors and legend¶
Here, we define a function to create the epochs. It takes in cleaned, the cleaned signal, and peaks, the array of R-peak locations.
[24]:
%matplotlib notebook
plt.rcParams['figure.figsize'] = [10, 6] # resize
# Define a function to create epochs
def plot_heartbeats(cleaned, peaks, sampling_rate=None):
    heartbeats = nk.epochs_create(cleaned, events=peaks, epochs_start=-0.3, epochs_end=0.4, sampling_rate=sampling_rate)
    heartbeats = nk.epochs_to_df(heartbeats)
    return heartbeats
heartbeats = plot_heartbeats(cleaned_ecg, peaks=rpeaks, sampling_rate=250)
heartbeats.head()
[24]:
     Signal  Index  Label      Time
0  0.193262    139      1 -0.300000
1  0.187513    140      1 -0.295977
2  0.181679    141      1 -0.291954
3  0.175692    142      1 -0.287931
4  0.169474    143      1 -0.283908
We then pivot the dataframe so that each column corresponds to the signal values of one channel, or Label.
[25]:
heartbeats_pivoted = heartbeats.pivot(index='Time', columns='Label', values='Signal')
heartbeats_pivoted.head()
[25]:
Label             1        10        11  ...         7         8         9
Time                                     ...
-0.300000  0.193262  0.130034  0.136912  ...  0.138593  0.134619  0.137870
-0.295977  0.187513  0.129303  0.135751  ...  0.137616  0.133792  0.137158
-0.291954  0.181679  0.128410  0.134396  ...  0.136513  0.132834  0.136297
-0.287931  0.175692  0.127313  0.132779  ...  0.135241  0.131704  0.135261
-0.283908  0.169474  0.125951  0.130811  ...  0.133737  0.130349  0.134013

5 rows × 34 columns
[26]:
# Import dependencies
import matplotlib.pyplot as plt
# Prepare figure
fig, ax = plt.subplots()
plt.close(fig)
ax.set_title("Individual Heart Beats")
ax.set_xlabel("Time (seconds)")
# Aesthetics
labels = list(heartbeats_pivoted)
labels = ['Channel ' + x for x in labels] # Set labels for each signal
cmap = iter(plt.cm.YlOrRd(np.linspace(0,1, int(heartbeats["Label"].nunique())))) # Get color map
lines = [] # Create empty list to contain the plot of each signal
for i, x, color in zip(labels, heartbeats_pivoted, cmap):
    line, = ax.plot(heartbeats_pivoted[x], label='%s' % i, color=color)
    lines.append(line)
# Show figure
fig
Interactivity¶
This section of the code handles the aesthetics and interactivity of the plot. Unfortunately, the interactivity is not active in this example, but it should work in your console! As you hover your cursor over each signal, an annotation of the channel that produced it is shown. The figure below is a still image.
Note: you need to install the ``mplcursors`` package for the interactive part (``pip install mplcursors``)
[27]:
# Import packages
import ipywidgets as widgets
from ipywidgets import interact, interact_manual
import mplcursors
[28]:
# Obtain hover cursor
mplcursors.cursor(lines, hover=True, highlight=True).connect("add", lambda sel: sel.annotation.set_text(sel.artist.get_label()))
# Return figure
fig
Locate P, Q, S and T waves in ECG¶
This example shows how to use NeuroKit to delineate the ECG peaks in Python. This means detecting and locating all components of the QRS complex, including P-peaks and T-peaks, as well as their onsets and offsets, from an ECG signal.
This example can be referenced by citing the package.
[3]:
# Load NeuroKit and other useful packages
import neurokit2 as nk
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
plt.rcParams['figure.figsize'] = [8, 5] # Bigger images
In this example, we will use a short segment of ECG signal with a sampling rate of 3000 Hz.
Find the R peaks¶
[4]:
# Retrieve ECG data from data folder (sampling rate = 3000 Hz)
ecg_signal = nk.data(dataset="ecg_3000hz")['ECG']
# Extract R-peak locations
_, rpeaks = nk.ecg_peaks(ecg_signal, sampling_rate=3000)
The ecg_peaks() function will return a dictionary containing the samples at which the R-peaks are located.
Let’s visualize the R-peaks’ locations in the signal to make sure they were detected correctly.
[5]:
# Visualize R-peaks in ECG signal
plot = nk.events_plot(rpeaks['ECG_R_Peaks'], ecg_signal)
# Zooming into the first 5 R-peaks
plot = nk.events_plot(rpeaks['ECG_R_Peaks'][:5], ecg_signal[:20000])
Visually, the R-peaks seem to have been correctly identified. You can also explore different R-peak detection methods provided by NeuroKit’s ecg_peaks().
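As an intuition for what a detector does (a deliberately naive sketch; the actual algorithms behind ecg_peaks(), such as Pan-Tompkins, are far more robust), consider a threshold-plus-local-maximum rule:

```python
import numpy as np

def naive_rpeak_detector(signal, threshold):
    """Flag samples that exceed `threshold` and are local maxima.
    Real detectors handle noise, baseline wander and varying
    amplitudes, but the core idea is similar."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i] > threshold and signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]:
            peaks.append(i)
    return np.array(peaks)

# A toy "ECG": flat baseline with three sharp spikes
toy = np.zeros(300)
toy[[50, 150, 250]] = 1.0
rpeaks_naive = naive_rpeak_detector(toy, threshold=0.5)
```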
Locate other waves (P, Q, S, T) and their onset and offset¶
In ecg_delineate(), NeuroKit implements different methods to segment the QRS complexes: a derivative-based ‘peak’ method, and wavelet-based methods that delineate the complexes.
Peak method¶
First, let’s take a look at the ‘peak’ method and its output.
[6]:
# Delineate the ECG signal
_, waves_peak = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="peak")
[7]:
# Visualize the T-peaks, P-peaks, Q-peaks and S-peaks
plot = nk.events_plot([waves_peak['ECG_T_Peaks'],
waves_peak['ECG_P_Peaks'],
waves_peak['ECG_Q_Peaks'],
waves_peak['ECG_S_Peaks']], ecg_signal)
# Zooming into the first 3 R-peaks, with focus on T-peaks, P-peaks, Q-peaks and S-peaks
plot = nk.events_plot([waves_peak['ECG_T_Peaks'][:3],
waves_peak['ECG_P_Peaks'][:3],
waves_peak['ECG_Q_Peaks'][:3],
waves_peak['ECG_S_Peaks'][:3]], ecg_signal[:12500])
Visually, the ‘peak’ method seems to have correctly identified the P-peaks, Q-peaks, S-peaks and T-peaks for this signal, at least for the first few complexes. Well done, peak!
However, it can be quite tiring to zoom in on each complex and inspect them one by one. To get a better overview of all complexes at once, you can make use of the show argument in ecg_delineate(), as below.
[8]:
# Delineate the ECG signal and visualize all peaks of the ECG complexes
_, waves_peak = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="peak", show=True, show_type='peaks')
The ‘peak’ method does a commendable job of identifying all the ECG peaks for this piece of ECG signal.
On top of the above peaks, the peak method also identifies the wave boundaries, namely the onsets of P-peaks and the offsets of T-peaks. You can vary the show_type argument to specify the information you would like to plot.
Let’s visualize them below:
[9]:
# Delineate the ECG signal and visualize all P-peak boundaries
signal_peak, waves_peak = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="peak", show=True, show_type='bounds_P')
[10]:
# Delineate the ECG signal and visualize all T-peak boundaries
signal_peak, waves_peak = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="peak", show=True, show_type='bounds_T')
Both the onsets of P-peaks and the offsets of T-peaks appear to have been correctly identified here. This information will be used to delineate cardiac phases in ecg_phase().
Let’s next take a look at the continuous wavelet method.
Continuous Wavelet Method (CWT)¶
[11]:
# Delineate the ECG signal
signal_cwt, waves_cwt = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="cwt", show=True, show_type='all')
By specifying ‘all’ in the show_type argument, you can plot all delineated information output by the CWT method. However, it can be hard to evaluate the accuracy of the delineated information with everything plotted together. Let’s tease them apart!
[12]:
# Visualize P-peaks and T-peaks
signal_cwt, waves_cwt = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="cwt", show=True, show_type='peaks')
[13]:
# Visualize T-wave boundaries
signal_cwt, waves_cwt = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="cwt", show=True, show_type='bounds_T')
[14]:
# Visualize P-wave boundaries
signal_cwt, waves_cwt = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="cwt", show=True, show_type='bounds_P')
[15]:
# Visualize R-wave boundaries
signal_cwt, waves_cwt = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="cwt", show=True, show_type='bounds_R')
Unlike the peak method, the continuous wavelet method does not identify the Q-peaks and S-peaks. However, it provides more information regarding the boundaries of the waves.
Visually, except for a few exceptions, the CWT method does a great job. However, the P-wave boundaries are not very clearly identified here.
Last but not least, we will look at the third method in the NeuroKit ecg_delineate() function: the discrete wavelet method.
Discrete Wavelet Method (DWT, the default method)¶
[16]:
# Delineate the ECG signal
signal_dwt, waves_dwt = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="dwt", show=True, show_type='all')
[17]:
# Visualize P-peaks and T-peaks
signal_dwt, waves_dwt = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="dwt", show=True, show_type='peaks')
[18]:
# Visualize T-wave boundaries
signal_dwt, waves_dwt = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="dwt", show=True, show_type='bounds_T')
[19]:
# Visualize P-wave boundaries
signal_dwt, waves_dwt = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="dwt", show=True, show_type='bounds_P')
[20]:
# Visualize R-wave boundaries
signal_dwt, waves_dwt = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="dwt", show=True, show_type='bounds_R')
Visually, from the plots above, the delineated outputs of the DWT method appear to be more accurate than those of CWT, especially for the P-peaks and P-wave boundaries.
Overall, for this signal, the peak and DWT methods seem to be superior to the CWT method.
How to create epochs¶
So, in your experiment, participants undergo a number of trials (events), possibly of different conditions, and you are wondering how to locate these events in your signals and perhaps cut them into epochs for further analysis?
This example can be referenced by citing the package.
This example shows how to use NeuroKit to extract epochs from data based on the location of events. In case you have multiple data files per subject, it also shows how to loop through the subject folders and put the files together in an epoch format for further analysis.
[5]:
# Load NeuroKit and other useful packages
import neurokit2 as nk
import matplotlib.pyplot as plt
[6]:
plt.rcParams['figure.figsize'] = [15, 5] # Bigger images
In this example, we will use a short segment of data which has ECG, EDA and respiration (RSP) signals.
One signal with multiple event markings¶
[7]:
# Retrieve ECG data from data folder (sampling rate = 100 Hz)
data = nk.data("bio_eventrelated_100hz")
Besides the signal channels, this data also has a fourth channel consisting of a series of 0s and 5s. This is a binary marking of the Digital Input channel in BIOPAC.
Let’s visualize the event-marking channel below.
[8]:
# Visualize the event-marking channel
plt.plot(data['Photosensor'])
[8]:
[<matplotlib.lines.Line2D at 0x257b48910f0>]
Depending on how you set up your experiment, the onset of an event can be marked either by the signal going from 0 to 5 or vice versa. In this data, the onsets of the events are marked where the signal in the event-marking channel goes from 5 to 0, and the offsets where it goes from 0 to 5.
As shown in the figure above, the signal goes from 5 to 0 four times, corresponding to the 4 events (4 trials) in this data.
There were 2 types (conditions) of images shown to the participant: "Negative" vs. "Neutral" in terms of emotion. Each condition had 2 trials. The following list is the condition order.
[9]:
condition_list = ["Negative", "Neutral", "Neutral", "Negative"]
Before we can epoch the data, we have to locate the events and extract the related information. This can be done with the NeuroKit function events_find().
[10]:
# Find events
events = nk.events_find(event_channel=data["Photosensor"],
                        threshold_keep='below',
                        event_conditions=condition_list)
events
[10]:
{'onset': array([ 1024, 4957, 9224, 12984]),
'duration': array([300, 300, 300, 300]),
'label': array(['1', '2', '3', '4'], dtype='<U11'),
'condition': ['Negative', 'Neutral', 'Neutral', 'Negative']}
The output of events_find() is a dictionary containing the event onsets, durations, labels and conditions.
As stated, since the event onsets in this data are marked by the event channel going from 5 to 0, threshold_keep is set to 'below'. Depending on your data, you can customize the arguments of events_find() to correctly locate the events.
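Under the hood, locating events boils down to finding where the marker channel crosses a threshold. Here is a minimal pure-numpy sketch of the idea, using hypothetical marker values (events_find() does this for you, and additionally handles durations, labels and conditions):

```python
import numpy as np

# Hypothetical event-marking channel: 5 at rest, 0 during each event
channel = np.array([5, 5, 0, 0, 0, 5, 5, 5, 0, 0, 5])

# Keep the samples below the threshold, as with threshold_keep='below'
below = (channel < 2.5).astype(int)

# Onsets are 0 -> 1 transitions of the mask, offsets are 1 -> 0 transitions
onsets = np.where(np.diff(below) == 1)[0] + 1
offsets = np.where(np.diff(below) == -1)[0] + 1
durations = offsets - onsets  # onsets: [2 8], durations: [3 2]
```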
You can use the events_plot() function to plot the events that have been found together with your event channel to confirm that it is correct.
[11]:
plot = nk.events_plot(events, data['Photosensor'])
Or you can visualize the events together with all the other signals.
[12]:
plot = nk.events_plot(events, data)
After you have located the events, you can create epochs using the NeuroKit epochs_create() function. However, we recommend processing your signal before cutting it into smaller epochs. You can read more about processing physiological signals with NeuroKit in the Custom your Processing Pipeline example.
[13]:
# Process the signal
df, info = nk.bio_process(ecg=data["ECG"], rsp=data["RSP"], eda=data["EDA"], sampling_rate=100)
Now, let’s think about what we want our epochs to look like. For this example, we want:
1. Epochs to start *1 second before the event onset*
2. Epochs to end *6 seconds* afterwards
These are passed into the epochs_start and epochs_end arguments, respectively.
Our epochs will then cover the region from -1 s to +6 s relative to the event onsets (i.e., 700 data points, since the signal is sampled at 100 Hz).
[14]:
# Build and plot epochs
epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-1, epochs_end=6)
And as easy as that, you have created a dictionary of four DataFrames, each corresponding to the epoch of one event.
Here, all your epochs have the same start and end times, specified by epochs_start and epochs_end. Nevertheless, you can also pass a list of different timings to these two arguments to customize the duration of the epochs for each of your events.
One subject with multiple data files¶
In some experimental designs, instead of having one signal file with multiple events, each subject has multiple files, where each file is the recording of one event.
In the following example, we will show you how to loop through the subject folders and put the files together in an epoch format for further analysis.
Firstly, let’s say your data is arranged as follows, where each subject has a folder and each folder contains multiple data files corresponding to different events:
[Experiment folder]
│
├── Data
│   ├── Subject_001/
│   │   ├── event_1.[csv]
│   │   ├── event_2.[csv]
│   │   └── ...
│   └── Subject_002/
│       ├── event_1.[csv]
│       ├── event_2.[csv]
│       └── ...
└── analysis_script.py
The following illustrates what your analysis script might look like. Try to recreate this data structure and analysis script on your computer!
Now, in our analysis script, let’s load the necessary packages:
[21]:
# Load packages
import pandas as pd
import os
Assuming that your working directory is the Experiment folder and that you want to read all the data files of Subject_001, your analysis script should look something like this:
[ ]:
# Your working directory should be the Experiment folder
import numpy as np

participant = 'Subject_001'
sampling_rate = 100

# List all data files in the Subject_001 folder
all_files = os.listdir('Data/' + participant)

# Create an empty dictionary to store your files (events)
epochs = {}

# Loop through each file in the subject folder
for i, file in enumerate(all_files):
    # Read the file
    data = pd.read_csv('Data/' + participant + '/' + file)
    # Add a Label column (e.g., Label 1 for epoch 1)
    data['Label'] = np.full(len(data), str(i + 1))
    # Set the index of the data to time in seconds
    index = data.index / sampling_rate
    data = data.set_index(pd.Series(index))
    # Add the file to the dictionary
    epochs[str(i + 1)] = data

epochs
And ta-dah! You should now have a dictionary called epochs that resembles the output of the NeuroKit epochs_create() function. Each DataFrame in epochs corresponds to an event (a trial) that Subject_001 underwent.
The epochs are now ready for further analysis!
Heart Rate Variability (HRV)¶
For a comprehensive review of the most up-to-date HRV indices, a discussion of their significance in psychology, and a step-by-step guide to HRV analysis using NeuroKit2, the Heart Rate Variability in Psychology: A Review of HRV Indices and an Analysis Tutorial paper is a good place to start.
NeuroKit2 is among the most comprehensive software packages for computing HRV indices, and the list of features is available below:
Domains and indices, with the number of the listed indices supported by each package (NeuroKit, heartpy, HRV, pyHRV):

Time Domain
Indices: CVNN, CVSD, MAD, MHR, MRRI, NNI parameters, ΔNNI parameters, MadNN, MeanNN, MedianNN, MCVNN, pNN20, pNN50, RMSSD, SDANN, SDNN, SDNN_index, SDSD, TINN
Supported: NeuroKit 12, heartpy 6, HRV 6, pyHRV 10

Frequency Domain
Indices: ULF, VLF, LF, LFn, LF Peak, LF Relative, HF, HFnu, HF Peak, HF Relative, LF/HF
Supported: NeuroKit 7, heartpy 3, HRV 6, pyHRV 11

Non-Linear Domain
Indices: SD1, SD2, S, SD1/SD2, SampEn, DFA, CSI, Modified CSI, CVI
Supported: NeuroKit 8, heartpy 4, HRV 2, pyHRV 6
Compute HRV features¶
This example can be referenced by citing the package.
This example shows how to use NeuroKit2 to compute heart rate variability (HRV) indices in the time, frequency, and non-linear domains.
[12]:
# Load the NeuroKit package and other useful packages
import neurokit2 as nk
import matplotlib.pyplot as plt
%matplotlib inline
[13]:
plt.rcParams['figure.figsize'] = [15, 9] # Bigger images
Download Dataset¶
First, let’s download the resting-state data (sampled at 100 Hz) using nk.data().
[14]:
data = nk.data("bio_resting_5min_100hz")
data.head() # Print first 5 rows
[14]:
ECG  PPG  RSP  

0  0.003766  0.102539  0.494652 
1  0.017466  0.103760  0.502483 
2  0.015679  0.107422  0.511102 
3  0.001598  0.110855  0.518791 
4  0.002483  0.112610  0.528669 
You can see that it consists of three different signals: ECG, PPG (an alternative measure of heart rate as compared to ECG), and RSP (respiration). Now, let’s extract the ECG signal as a vector (i.e., a one-dimensional array) and find the peaks using ecg_peaks().
[15]:
# Find peaks
peaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100)
Note: It is critical that you specify the correct sampling rate of your signal throughout many processing functions, as this allows NeuroKit to have a time reference.
This produces two elements: peaks, a DataFrame of the same length as the input signal in which occurrences of R-peaks are marked with 1 in a list of zeros, and info, a dictionary of the sample points at which these R-peaks occur.
HRV is the temporal variation between consecutive heartbeats (R-R intervals). Here, we will use peaks, i.e., the occurrences of the heartbeat peaks, as the input argument of the following HRV functions to extract the indices.
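To make this concrete, the R-R intervals underlying all HRV indices can be obtained by differencing the R-peak locations and converting from samples to milliseconds. A small sketch with hypothetical peak indices (in practice, they come from the info dictionary returned by ecg_peaks()):

```python
import numpy as np

rpeaks = np.array([50, 120, 195, 265, 340])  # hypothetical R-peak sample indices
sampling_rate = 100

# R-R intervals in milliseconds
rr = np.diff(rpeaks) / sampling_rate * 1000  # [700. 750. 700. 750.]
```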
TimeDomain Analysis¶
First, let’s extract the timedomain indices.
[16]:
# Extract time-domain HRV indices
hrv_time = nk.hrv_time(peaks, sampling_rate=100, show=True)
hrv_time
[16]:
HRV_RMSSD  HRV_MeanNN  HRV_SDNN  HRV_SDSD  HRV_CVNN  HRV_CVSD  HRV_MedianNN  HRV_MadNN  HRV_MCVNN  HRV_pNN50  HRV_pNN20  HRV_TINN  HRV_HTI  

0  69.697983  696.395349  62.135891  69.779109  0.089225  0.100084  690.0  44.478  0.064461  14.651163  49.302326  950.0  4.343434 
These features include the RMSSD (the square root of the mean of the squared successive differences between adjacent R-R intervals), the MeanNN (the mean of the R-R intervals), and so on. You can also visualize the distribution of R-R intervals by specifying show=True in hrv_time().
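For intuition, here is how two of these indices could be computed by hand from a hypothetical series of R-R intervals (hrv_time() computes these and many more for you):

```python
import numpy as np

rr = np.array([700.0, 750.0, 710.0, 760.0, 690.0])  # hypothetical R-R intervals (ms)

meannn = np.mean(rr)                        # MeanNN: mean of the R-R intervals
rmssd = np.sqrt(np.mean(np.diff(rr) ** 2))  # RMSSD: root mean square of successive differences
# meannn is 722.0 and rmssd is about 53.6
```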
FrequencyDomain Analysis¶
Now, let’s extract the frequency-domain features, which involve, for example, the spectral power density of different frequency bands. Again, you can visualize the power across frequency bands by specifying show=True in hrv_frequency().
[17]:
hrv_freq = nk.hrv_frequency(peaks, sampling_rate=100, show=True)
hrv_freq
[17]:
HRV_ULF  HRV_VLF  HRV_LF  HRV_HF  HRV_VHF  HRV_LFHF  HRV_LFn  HRV_HFn  HRV_LnHF  

0  NaN  NaN  1168.925644  1586.90506  253.936141  0.736607  0.388377  0.527252  7.369541 
NonLinear Domain Analysis¶
Now, let’s compute the non-linear indices with hrv_nonlinear().
[18]:
hrv_non = nk.hrv_nonlinear(peaks, sampling_rate=100, show=True)
hrv_non
[18]:
HRV_SD1  HRV_SD2  HRV_SD2SD1  HRV_CSI  HRV_CVI  HRV_CSI_Modified  HRV_SampEn  

0  49.341281  85.461606  1.732051  1.732051  4.829101  592.095372  1.259931 
This will produce a Poincaré plot, which plots each R-R interval against the next one.
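The descriptors of this plot, SD1 and SD2, measure the dispersion of the point cloud perpendicular to and along the line of identity, respectively. A pure-numpy sketch with hypothetical intervals (hrv_nonlinear() computes these for you):

```python
import numpy as np

rr = np.array([700.0, 750.0, 710.0, 760.0, 690.0, 740.0])  # hypothetical R-R intervals (ms)

# Each interval paired with the next one
x, y = rr[:-1], rr[1:]

# Rotate the cloud by 45 degrees: SD1 is the spread across the identity line
# (short-term variability), SD2 the spread along it (long-term variability)
sd1 = np.std((y - x) / np.sqrt(2), ddof=1)
sd2 = np.std((y + x) / np.sqrt(2), ddof=1)
```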
All Domains¶
Finally, if you’d like to extract HRV indices from all three domains, you can simply input peaks into hrv(), where you can specify show=True to visualize the combination of plots: the R-R interval distribution, the power spectral density of the frequency bands, and the Poincaré scattergram.
[19]:
hrv_indices = nk.hrv(peaks, sampling_rate=100, show=True)
hrv_indices
[19]:
HRV_RMSSD  HRV_MeanNN  HRV_SDNN  HRV_SDSD  HRV_CVNN  HRV_CVSD  HRV_MedianNN  HRV_MadNN  HRV_MCVNN  HRV_pNN50  ...  HRV_LFn  HRV_HFn  HRV_LnHF  HRV_SD1  HRV_SD2  HRV_SD2SD1  HRV_CSI  HRV_CVI  HRV_CSI_Modified  HRV_SampEn  

0  69.697983  696.395349  62.135891  69.779109  0.089225  0.100084  690.0  44.478  0.064461  14.651163  ...  0.388377  0.527252  7.369541  49.341281  85.461606  1.732051  1.732051  4.829101  592.095372  1.259931 
1 rows × 29 columns
Resources¶
There are several other packages more focused on HRV, in which you might find more in-depth explanations and documentation. See their documentation here:
Complexity Analysis of Physiological Signals¶
A complex system can be loosely defined as one that comprises many components that interact with each other. In science, this approach is used to investigate how the relationships between a system’s parts result in its collective behaviours, and how the system interacts and forms relationships with its environment.
In recent years, there has been an increase in the use of complex systems to model physiological systems, for instance for medical purposes, where the dynamics of physiological systems can distinguish which systems are healthy and which are impaired. One prime example of how complexity exists in physiological systems is heart rate variability (HRV), where higher complexity (greater HRV) has been shown to be an indicator of health, suggesting an enhanced ability to adapt to environmental changes.
Although complexity is an important concept in health science, it is still foreign to many health scientists. This tutorial aims to provide a simple guide to the main tenets of complexity analysis in relation to physiological signals.
Basic Concepts¶
Definitions¶
Complex systems are examples of nonlinear dynamical systems.
A dynamical system can be described by a vector of numbers, called its state, which can be represented by a point in a phase space. In terms of physiology, this vector might include values of variables such as lung capacity, heart rate and blood pressure. This state aims to provide a complete description of the system (in this case, health) at some point in time.
The set of all possible states is called the state space and the way in which the system evolves over time (e.g., change in person’s health state over time) can be referred to as trajectory.
After a sufficiently long time, different trajectories may evolve or converge to a common subset of the state space called an attractor. The presence and behavior of attractors gives intuition about the underlying dynamical system. An attractor can be a fixed point, where all trajectories converge to a single point (i.e., homoeostatic equilibrium), or it can be periodic, where trajectories follow a regular, cyclic path.
Timedelay embedding¶
Nonlinear timeseries analysis is used to understand, characterize and predict dynamical information about human physiological systems. This is based on the concept of statespace reconstruction, which allows one to reconstruct the full dynamics of a nonlinear system from a single time series (a signal).
One standard method for state-space reconstruction is time-delay embedding (also known as delay-coordinate embedding). It aims to identify the state of a system at a particular point in time by searching the past history of observations for similar states and, by studying how these similar states evolve, to predict the future course of the time series.
In conclusion, the purpose of timedelay embeddings is to reconstruct the state and dynamics of an unknown dynamical system from measurements or observations of that system taken over time.
In this gif here, you can see how the phase space is constructed by plotting delayed signals against the original signal (where each time series is an embedded version i.e., delayed version of the original). Each point in the 3D reconstruction can be thought of as a time segment, with different points capturing different segments of history of variable X. Credits go to this short illustration.
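The construction shown in the animation takes only a few lines of numpy: stack the signal with delayed copies of itself, so that each row becomes one point of the reconstructed phase space. A minimal, illustrative helper (NeuroKit provides this functionality as well; the function below is just a sketch):

```python
import numpy as np

def embed(signal, delay=1, dimension=3):
    """Time-delay embedding: stack delayed copies of the signal as columns."""
    n = len(signal) - (dimension - 1) * delay
    return np.column_stack([signal[i * delay:i * delay + n] for i in range(dimension)])

x = np.sin(np.linspace(0, 8 * np.pi, 400))  # a simple periodic signal
points = embed(x, delay=10, dimension=3)    # each row is one state-space point
# points has shape (380, 3); plotting its columns traces the attractor
```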
Embedding Parameters¶
For the reconstructed dynamics to be identical to the full dynamics of the system, some basic parameters need to be optimally determined for timedelayed embedding:
Time delay: tau, τ
A measure of time that sets basic delay
Generates the respective axes of the reconstruction: x(t), x(t-tau), x(t-2*tau), …
E.g., if tau=1, the state x(t) would be plotted against its prior self x(t-1)
If τ is too small, the constructed signals are too much alike; if it is too large, the reconstructed trajectory will show connections between states very far apart in time (no relationship), which can make the reconstruction extremely complex
Embedding dimension, m
Number of vectors to be compared (i.e. no. of additional signals of time delayed values of tau)
Dictates how many axes will be shown in the reconstruction space i.e. how much of the system’s history is shown
Dimensionality must be sufficiently high to generate relevant information and create a rich history of states over time, but also low enough to be easily understandable
Tolerance threshold, r
Tolerance for accepting similar patterns
Visualize Embedding
This is how a typical sinusoidal signal looks when embedded in 2D and 3D, respectively.
Using NeuroKit
There are different methods to guide the choice of parameters.
In NeuroKit, you can use nk.complexity_optimize()
to estimate the optimal parameters, including time delay, embedding dimension and tolerance threshold.
import neurokit2 as nk
signal = nk.signal_simulate(duration=10, frequency=1, noise=0.01)
parameters = nk.complexity_optimize(signal, show=True)
parameters
>>> {'delay': 13, 'dimension': 6, 'r': 0.014}
In the above example, the optimal time delay is estimated using the Mutual Information method (Fraser & Swinney, 1986), the optimal embedding dimension using the Average False Nearest Neighbour method (Cao, 1997), and the optimal r using the Maximum Approximate Entropy (Lu et al., 2008). These are the default methods in nk.complexity_optimize(). Nevertheless, you can specify your preferred method via the method arguments. You can read more about these methods in this chapter.
Entropy as measures of Complexity¶
The complexity of physiological signals can be represented by the entropy of these nonlinear, dynamic physiological systems.
Entropy can be defined as the measure of disorder in a signal.
Shannon Entropy (ShEn)¶
Call: nk.entropy_shannon()
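To build intuition for this definition, Shannon entropy can be computed by hand from the distribution of values in a signal. A minimal sketch (nk.entropy_shannon() provides a proper implementation):

```python
import numpy as np

def shannon_entropy(signal, bins=10):
    """Shannon entropy (in bits) of a signal's value distribution."""
    counts, _ = np.histogram(signal, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]  # drop empty bins (0 * log 0 is taken as 0)
    return -np.sum(p * np.log2(p))

flat = np.zeros(1000)                                 # perfectly ordered: entropy 0
noisy = np.random.RandomState(42).uniform(size=1000)  # disordered: high entropy
```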
Approximate Entropy (ApEn)¶
Quantifies the amount of regularity and the unpredictability of fluctuations in time-series data.
Advantages of ApEn: lower computational demand (it can be designed to work with small data samples, i.e., fewer than 50 data points, and can be applied in real time) and less sensitivity to noise.
Smaller values indicate that the data is more regular and predictable, while larger values correspond to more complexity or irregularity in the data.
Call: nk.entropy_approximate()
Examples of use
Reference 
Signal 
Parameters 
Findings 

Caldirola et al. (2004) 
17-min breath-by-breath recordings of respiration parameters 
m=1, r=0.2 
Panic disorder patients showed higher ApEn indexes in baseline RSP patterns (all parameters) than healthy subjects 
Burioka et al. (2003) 
30-min recordings of respiration, 20-s recordings of EEG 
m=2, r=0.2, τ=1.1s for respiration, 0.09s for EEG 
Lower ApEn of respiratory movement and EEG in stage IV sleep than other stages of consciousness 
Boettger et al. (2009) 
64s recordings of QT and RR intervals 
m=2, r=0.2 
Higher ratio of ApEn(QT) to ApEn(RR) for higher intensities of exercise, reflecting sympathetic activity 
Taghavi et al. (2011) 
2-min recordings of EEG 
m=2, r=0.1 
Higher ApEn of normal subjects than schizophrenic patients particularly in limbic areas of the brain 
Sample Entropy (SampEn)¶
A modification of approximate entropy.
Advantages over ApEn: data-length independence and a relatively trouble-free implementation.
Larger values indicate higher complexity, whereas smaller values characterize more self-similar and regular signals.
Call: nk.entropy_sample()
Examples of use
Reference 
Signal 
Parameters 
Findings 

Lake et al. (2002) 
25-min recordings of RR intervals 
m=3, r=0.2 
SampEn is lower in the course of neonatal sepsis and sepsis-like illness 
Lake et al. (2011) 
24h recordings of RR intervals 
m=1, r=to vary 
In patients over 40 years old, SampEn has high degrees of accuracy in distinguishing atrial fibrillation from normal sinus rhythm in 12-beat calculations performed hourly 
Estrada et al. (2015) 
EMG diaphragm signal 
m=1, r=0.3 
fSampEn (fixed SampEn) method to extract RSP rate from respiratory EMG signal 
Kapidzic et al. (2014) 
RR intervals and its corresponding RSP signal 
m=2, r=0.2 
During paced breathing, significant reduction of SampEn(Resp) and SampEn(RR) with age in male subjects, compared to a smaller and non-significant SampEn decrease in females 
Abásolo et al. (2006) 
5min recordings of EEG in 5 second epochs 
m=1, r=0.25 
Alzheimer’s Disease patients had lower SampEn than controls in parietal and occipital regions 
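For intuition, sample entropy boils down to counting how often patterns of length m repeat within the tolerance r, and how often those matches survive when the patterns are extended to length m+1. A brute-force, illustrative implementation (use nk.entropy_sample() in practice):

```python
import numpy as np

def sample_entropy(signal, m=2, r=0.2):
    """Brute-force Sample Entropy sketch (tolerance r as a fraction of the SD)."""
    x = np.asarray(signal, dtype=float)
    tol = r * np.std(x)
    n = len(x)

    def matches(length):
        # All overlapping templates of the given length
        templates = np.array([x[i:i + length] for i in range(n - length)])
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev distance to all later templates (self-matches excluded)
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(dist <= tol)
        return count

    return -np.log(matches(m + 1) / matches(m))

regular = np.sin(np.linspace(0, 20 * np.pi, 500))      # predictable signal
irregular = np.random.RandomState(0).normal(size=500)  # noisy signal
# The regular signal yields a much lower SampEn than the irregular one
```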
Fuzzy Entropy (FuzzyEn)¶
Similar to ApEn and SampEn
Call: nk.entropy_fuzzy()
Multiscale Entropy (MSE)¶
Expresses different levels of either ApEn or SampEn by means of multiple factors for generating multiple time series.
Captures more useful information than the single scalar value produced by ApEn or SampEn.
Call: nk.entropy_multiscale()
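The multiple time series are obtained by coarse-graining: averaging the signal over non-overlapping windows of increasing length, then computing the entropy of each coarse-grained series. A sketch of the coarse-graining step (nk.entropy_multiscale() runs the whole procedure):

```python
import numpy as np

def coarse_grain(signal, scale):
    """Average non-overlapping windows of length `scale` (one multiscale step)."""
    x = np.asarray(signal, dtype=float)
    n = len(x) // scale
    return x[:n * scale].reshape(n, scale).mean(axis=1)

x = np.arange(12, dtype=float)
coarse_grain(x, 3)  # windows [0,1,2], [3,4,5], ... give means 1, 4, 7, 10
```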
Detrended Fluctuation Analysis (DFA)¶
Analyze Electrooculography (EOG) data (eye blinks, saccades, etc.)¶
[26]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import neurokit2 as nk
plt.rcParams['figure.figsize'] = [14, 8] # Increase plot size
Explore the EOG signal¶
Let’s load the example dataset corresponding to a vertical EOG signal.
[47]:
eog_signal = pd.read_csv("data/eog_100hz.csv")
eog_signal = nk.as_vector(eog_signal) # Extract the only column as a vector
nk.signal_plot(eog_signal)
Let’s zoom in to some areas where clear blinks are present.
[48]:
nk.signal_plot(eog_signal[100:1700])
Clean the signal¶
We can now filter the signal to remove some noise and trends.
[49]:
eog_cleaned = nk.eog_clean(eog_signal, sampling_rate=100, method='neurokit')
Let’s visualize the same chunk and compare the cleaned version with the original signal.
[50]:
nk.signal_plot([eog_signal[100:1700], eog_cleaned[100:1700]])
Detect and visualize eye blinks¶
We will now run a peak detection algorithm to detect the location of the blinks.
[54]:
blinks = nk.eog_findpeaks(eog_cleaned, sampling_rate=100, method="mne")
blinks
[54]:
array([ 277, 430, 562, 688, 952, 1378, 1520, 1752, 3353,
3783, 3928, 4031, 4168, 4350, 4723, 4878, 5213, 5365,
5699, 5904, 6312, 6539, 6714, 7224, 7382, 7553, 7827,
8014, 8154, 8312, 8626, 8702, 9140, 9425, 9741, 9948,
10142, 10230, 10368, 10708, 10965, 11256, 11683, 11775],
dtype=int64)
[77]:
events = nk.epochs_create(eog_cleaned, blinks, sampling_rate=100, epochs_start=-0.3, epochs_end=0.7)
[79]:
data = nk.epochs_to_array(events) # Convert to 2D array
data = nk.standardize(data) # Rescale so that all the blinks are on the same scale
# Plot with their median (used here as a robust average)
plt.plot(data, linewidth=0.4)
plt.plot(np.median(data, axis=1), linewidth=3, linestyle='-', color="black")
[79]:
[<matplotlib.lines.Line2D at 0x18fc51f7e50>]
Fit a function to a signal¶
This example can be referenced by citing the package.
This small tutorial will show you how to use Python to estimate the best-fitting line for some data. This can be used to find the optimal line passing through a signal.
[11]:
# Load packages
import numpy as np
import pandas as pd
import scipy.optimize
import matplotlib.pyplot as plt
%matplotlib inline
import neurokit2 as nk
plt.rcParams['figure.figsize'] = [14, 8] # Increase plot size
Fit a linear function¶
We will start by generating some random data on a scale from 0 to 10 (the x-axis), and then pass them through a function to create their y values.
[12]:
x = np.random.uniform(0., 10., size=100)
y = 3. * x + 2. + np.random.normal(0., 10., 100)
plt.plot(x, y, '.')
[12]:
[<matplotlib.lines.Line2D at 0x23a3a3e68d0>]
Now, in this case, we know that the best-fitting line will be a linear function (i.e., a straight line), and we want to find its parameters. A linear function has two parameters: the intercept and the slope.
First, we need to create this function, which takes some x values and the parameters, and returns the y values.
[13]:
def function_linear(x, intercept, slope):
    y = intercept + slope * x
    return y
Now, using the power of scipy, we can optimize this function based on our data to find the parameters that minimize the least-squares error. It just needs the function and the data’s x and y values.
[14]:
params, covariance = scipy.optimize.curve_fit(function_linear, x, y)
params
[14]:
array([4.19966702, 2.12600208])
So the optimal parameters (in our case, the intercept and the slope) are returned in the params object. We can unpack these parameters (using the star symbol *) into our linear function and create the predicted y values for our x axis.
[15]:
fit = function_linear(x, *params)
plt.plot(x, y, '.')
plt.plot(x, fit, '-')
[15]:
[<matplotlib.lines.Line2D at 0x23a3a407588>]
Nonlinear curves¶
We can also use this approach to approximate non-linear curves.
[16]:
signal = nk.eda_simulate(sampling_rate=50)
nk.signal_plot(signal)
In this example, we will try to approximate this Skin Conductance Response (SCR) using a gamma distribution, which is a quite flexible distribution defined by 3 parameters (a, loc and scale).
On top of these 3 parameters, we will add 2 more, an intercept and a size parameter, to give it more flexibility.
[17]:
import scipy.stats

def function_gamma(x, intercept, size, a, loc, scale):
    gamma = scipy.stats.gamma.pdf(x, a=a, loc=loc, scale=scale)
    y = intercept + (size * gamma)
    return y
We can start by visualizing the function with “arbitrary” parameters:
[18]:
x = np.linspace(0, 20, num=500)
y = function_gamma(x, intercept=1, size=1, a=3, loc=0.5, scale=0.33)
nk.signal_plot(y)
Since these values are already a good start, we will use them as a “starting point” (through the p0 argument) to help the estimation algorithm converge (otherwise it might never find the right combination of parameters).
[19]:
params, covariance = scipy.optimize.curve_fit(function_gamma, x, signal, p0=[1, 1, 3, 0.5, 0.33])
params
[19]:
array([0.95075442, 2.23136452, 1.10417281, 2.04403253, 1.88667224])
[20]:
fit = function_gamma(x, *params)
plt.plot(x, signal, '.')
plt.plot(x, fit, '-')
[20]:
[<matplotlib.lines.Line2D at 0x23a3a7ffe80>]
Resources¶
Contents:
Recording good quality signals¶
Hint
Spotted a typo? Would like to add something or make a correction? Join us by contributing (see this tutorial).
This tutorial is a work in progress.
Recording¶
Signal quality¶
Artifacts and Anomalies¶
What software for physiological signal processing¶
Hint
Spotted a typo? Would like to add something or make a correction? Join us by contributing (see this tutorial).
If you’re here, it’s probably because you have (or plan to have) some physiological data (aka biosignals, e.g., ECG for cardiac activity, RSP for respiration, EDA for electrodermal activity, EMG for muscle activity, etc.), that you plan to process and analyze these signals, and that you don’t know where to start. Whether you are an undergrad, a master’s or PhD student, a postdoc or a full professor, you’re in the right place.
So let’s discuss a few things to consider to best decide on your options.
Software vs. programming language (packages)¶
In this context, a software package would be a program that you download, install (through some sort of .exe file), and start, similarly to most of the other programs installed on your computer.
They are appealing because of their (apparent) simplicity and familiarity of usage: you can click on icons and menus and you can see all the available options, which makes it easy for exploration. In a way, it also feels safe, because you can always close the software, press “do not save the changes”, and start again.
Unfortunately, when it comes to science, this comes with a set of limitations: they are in general quite expensive, are limited to the set of included features (it’s not easy to use one feature from one software and another feature from another one), have a slow pace of updates (and thus often don’t include the most recent and cutting-edge algorithms, but rather well-established ones), and are not open-source (and thus prevent fully reproducible analyses).
Software for biosignals processing
AcqKnowledge: General physiological analysis software (ECG, PPG, RSP, EDA, EMG, …).
Kubios: Heartrate variability (HRV).
Unfortunately, closed software is the preferred option for many researchers. Why? For PIs, it’s usually because these are established tools backed by some sort of company, with experts, advertisers and sellers that do their job well. The companies also offer some guarantees in terms of training, updates, troubleshooting, etc. For younger researchers starting with physiological data analysis, it’s usually because they don’t have much (or any) experience with programming languages. They feel that there is already a lot to learn on the theoretical side of physiological signal processing, so they don’t want to add learning a programming language on top of that.
However, it is important to understand that you don’t necessarily have to know how to code to use some of the packages. Moreover, some of them include a GUI (see below), which makes them very easy to use and a great alternative to the software mentioned above.
Note
TLDR; Closed proprietary software, even though seemingly appealing, might not be a good investment of time or money.
GUI vs. code¶
TODO.
Packages with a GUI
Note
TLDR; While GUIs can be good alternatives and a first step into programming-language-based tools, coding will provide you with more freedom, incredible power and the best possible fit for your data and issues.
Matlab vs. Python vs. R vs. Julia¶
What is the best programming language for physiological data analysis?
Matlab is the historical main contender. However… TODO.
Pythonbased packages
NeuroKit2: ECG, PPG, RSP, EDA, EMG.
BioSPPy: ECG, RSP, EDA, EMG.
PySiology: ECG, EDA, EMG.
pyphysio: ECG, PPG, EDA.
HeartPy: ECG.
hrv: ECG.
hrv-analysis: ECG.
pyhrv: ECG.
py-ecg-detectors: ECG.
Systole: PPG.
eda-explorer: EDA.
Pypsy: EDA.
MNE: EEG.
tensorpac: EEG.
PyGaze: Eyetracking.
PyTrack: Eyetracking.
Matlabbased packages
BreatheEasyEDA: EDA.
EDA: EDA.
unfold: EEG.
Additional Resources¶
Hint
Would like to add something? Join us by contributing (see this tutorial).
General Neuroimaging¶
ECG¶
EDA¶
EEG¶
Functions¶
ECG¶
Submodule for NeuroKit.
 ecg_analyze(data, sampling_rate=1000, method='auto', subepoch_rate=[None, None])[source]¶
Performs ECG analysis on either epochs (event-related analysis) or on longer periods of data, such as resting-state data.
 Parameters
data (Union[dict, pd.DataFrame]) – A dictionary of epochs, containing one DataFrame per epoch, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df(). Can also take a DataFrame of processed signals from a longer period of data, typically generated by ecg_process() or bio_process(). Can also take a dict containing sets of separate periods of data.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Defaults to 1000Hz.
method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’, where the right method is chosen based on the mean duration of the data (‘event-related’ for durations under 10s).
subepoch_rate (list) – For event-related analysis, a smaller “sub-epoch” within the epoch of an event can be specified. The ECG rate-related features of this “sub-epoch” (e.g., ECG_Rate, ECG_Rate_Max), relative to the baseline (where applicable), will be computed. The first value of the list specifies the start of the sub-epoch and the second specifies the end (in seconds), e.g., subepoch_rate = [1, 3] or subepoch_rate = [1, None]. Defaults to [None, None].
 Returns
DataFrame – A dataframe containing the analyzed ECG features. If event-related analysis is conducted, each epoch is indicated by the Label column. See the ecg_eventrelated() and ecg_intervalrelated() docstrings for details.
See also
bio_process
,ecg_process
,epochs_create
,ecg_eventrelated
,ecg_intervalrelated
Examples
>>> import neurokit2 as nk
>>>
>>> # Example 1: Download the data for event-related analysis
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data for event-related analysis
>>> df, info = nk.bio_process(ecg=data["ECG"], sampling_rate=100)
>>> events = nk.events_find(data["Photosensor"], threshold_keep='below',
...                         event_conditions=["Negative", "Neutral",
...                                           "Neutral", "Negative"])
>>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=1.9)
>>>
>>> # Analyze
>>> analyze_epochs = nk.ecg_analyze(epochs, sampling_rate=100)
>>> analyze_epochs
>>>
>>> # Example 2: Download the resting-state data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.ecg_process(data["ECG"], sampling_rate=100)
>>>
>>> # Analyze
>>> analyze_df = nk.ecg_analyze(df, sampling_rate=100)
>>> analyze_df
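The sub-epoch windowing that subepoch_rate describes can be sketched in plain NumPy. The rate trace and timings below are hypothetical and chosen purely for illustration; this is not NeuroKit's internal code:

```python
import numpy as np

# Hypothetical interpolated heart rate for one epoch sampled at 100 Hz,
# spanning -0.1 s to 1.9 s around an event onset.
sampling_rate = 100
times = np.arange(-0.1, 1.9, 1 / sampling_rate)
rate = 70 + 5 * np.sin(2 * np.pi * 0.5 * times)  # toy rate trace

# subepoch_rate=[1, None] means: compute rate features from t=1 s
# to the end of the epoch, relative to the pre-onset baseline.
baseline = rate[times < 0].mean()
window = rate[times >= 1.0]

rate_max = window.max() - baseline    # analogous to ECG_Rate_Max
rate_mean = window.mean() - baseline  # analogous to ECG_Rate_Mean
```

The second value of subepoch_rate simply shortens the window from the other end, e.g. [1, 3] would additionally apply `times <= 3.0`.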
 ecg_clean(ecg_signal, sampling_rate=1000, method='neurokit')[source]¶
Clean an ECG signal.
Prepare a raw ECG signal for R-peak detection with the specified method.
 Parameters
ecg_signal (Union[list, np.array, pd.Series]) – The raw ECG channel.
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.
method (str) – The processing pipeline to apply. Can be one of ‘neurokit’ (default), ‘biosppy’, ‘pantompkins1985’, ‘hamilton2002’, ‘elgendi2010’, ‘engzeemod2012’.
 Returns
array – Vector containing the cleaned ECG signal.
See also
ecg_findpeaks
,signal_rate
,ecg_process
,ecg_plot
Examples
>>> import pandas as pd
>>> import neurokit2 as nk
>>> import matplotlib.pyplot as plt
>>>
>>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000)
>>> signals = pd.DataFrame({
...     "ECG_Raw": ecg,
...     "ECG_NeuroKit": nk.ecg_clean(ecg, sampling_rate=1000, method="neurokit"),
...     "ECG_BioSPPy": nk.ecg_clean(ecg, sampling_rate=1000, method="biosppy"),
...     "ECG_PanTompkins": nk.ecg_clean(ecg, sampling_rate=1000, method="pantompkins1985"),
...     "ECG_Hamilton": nk.ecg_clean(ecg, sampling_rate=1000, method="hamilton2002"),
...     "ECG_Elgendi": nk.ecg_clean(ecg, sampling_rate=1000, method="elgendi2010"),
...     "ECG_EngZeeMod": nk.ecg_clean(ecg, sampling_rate=1000, method="engzeemod2012")})
>>> signals.plot()
<AxesSubplot:>
References
Jiapu Pan and Willis J. Tompkins. A Real-Time QRS Detection Algorithm. In: IEEE Transactions on Biomedical Engineering BME-32.3 (1985), pp. 230–236.
Hamilton, Open Source ECG Analysis Software Documentation, E.P.Limited, 2002.
 ecg_delineate(ecg_cleaned, rpeaks=None, sampling_rate=1000, method='dwt', show=False, show_type='peaks', check=False)[source]¶
Delineate QRS complex.
Function to delineate the QRS complex.
Cardiac Cycle: A typical ECG heartbeat consists of a P wave, a QRS complex and a T wave. The P wave represents the wave of depolarization that spreads from the SA node throughout the atria. The QRS complex reflects the rapid depolarization of the right and left ventricles. Since the ventricles are the largest part of the heart in terms of mass, the QRS complex usually has a much larger amplitude than the P wave. The T wave represents ventricular repolarization. On rare occasions, a U wave can be seen following the T wave. The U wave is believed to be related to the last remnants of ventricular repolarization.
 Parameters
ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().
rpeaks (Union[list, np.array, pd.Series]) – The samples at which R-peaks occur. Accessible with the key “ECG_R_Peaks” in the info dictionary returned by ecg_findpeaks().
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.
method (str) – Can be one of ‘peak’ for a peak-based method, ‘cwt’ for continuous wavelet transform or ‘dwt’ (default) for discrete wavelet transform.
show (bool) – If True, will return a plot visualizing the delineated waves.
show_type (str) – The type of delineated waves information shown in the plot.
check (bool) – Defaults to False. If True, replaces the delineated features with np.nan if their standardized distance from the R-peaks is greater than 3.
 Returns
waves (dict) – A dictionary containing additional information. For the derivative (‘peak’) method, the dictionary contains the samples at which P-peaks, Q-peaks, S-peaks, T-peaks, P-onsets and T-offsets occur, accessible with the keys “ECG_P_Peaks”, “ECG_Q_Peaks”, “ECG_S_Peaks”, “ECG_T_Peaks”, “ECG_P_Onsets” and “ECG_T_Offsets” respectively.
For the wavelet methods, in addition to the above information, the dictionary contains the samples at which QRS-onsets and QRS-offsets occur, accessible with the keys “ECG_P_Peaks”, “ECG_T_Peaks”, “ECG_P_Onsets”, “ECG_P_Offsets”, “ECG_Q_Peaks”, “ECG_S_Peaks”, “ECG_T_Onsets”, “ECG_T_Offsets”, “ECG_R_Onsets” and “ECG_R_Offsets” respectively.
signals (DataFrame) – A DataFrame of the same length as the input signal, in which occurrences of peaks, onsets and offsets are marked as “1” in a list of zeros.
See also
ecg_clean
,signal_fixpeaks
,ecg_peaks
,signal_rate
,ecg_process
,ecg_plot
Examples
>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000)
>>> cleaned = nk.ecg_clean(ecg, sampling_rate=1000)
>>> _, rpeaks = nk.ecg_peaks(cleaned, sampling_rate=1000)
>>> signals, waves = nk.ecg_delineate(cleaned, rpeaks, sampling_rate=1000)
>>> nk.events_plot(waves["ECG_P_Peaks"], cleaned)
<Figure ...>
>>> nk.events_plot(waves["ECG_T_Peaks"], cleaned)
<Figure ...>
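The outlier rule behind the check option can be sketched as a standardized-distance filter. The sample positions below are made up for illustration, and this is a simplified version of the idea rather than NeuroKit's exact implementation:

```python
import numpy as np

# Hypothetical sample indices: 12 R-peaks, with T-peaks ~300 samples later,
# except the last one, which is implausibly far away.
rpeaks = np.arange(12) * 1000 + 800
tpeaks = rpeaks + 300
tpeaks[-1] = rpeaks[-1] + 4200

# Standardize the R-to-T distances and replace features whose
# standardized distance exceeds 3 with np.nan.
dist = tpeaks - rpeaks
z = (dist - dist.mean()) / dist.std()
cleaned = np.where(np.abs(z) > 3, np.nan, tpeaks.astype(float))
```

Only the implausible last T-peak is discarded; the other delineated points are untouched.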
References
Martínez, J. P., Almeida, R., Olmos, S., Rocha, A. P., & Laguna, P. (2004). A wavelet-based ECG delineator: evaluation on standard databases. IEEE Transactions on Biomedical Engineering, 51(4), 570-581.
 ecg_eventrelated(epochs, silent=False, subepoch_rate=[None, None])[source]¶
Performs event-related ECG analysis on epochs.
 Parameters
epochs (Union[dict, pd.DataFrame]) – A dict containing one DataFrame per event/trial, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df().
silent (bool) – If True, silence possible warnings.
subepoch_rate (list) – A smaller “sub-epoch” within the epoch of an event can be specified. The ECG rate-related features of this “sub-epoch” (e.g., ECG_Rate, ECG_Rate_Max), relative to the baseline (where applicable), will be computed. The first value of the list specifies the start of the sub-epoch and the second specifies the end (in seconds), e.g., subepoch_rate = [1, 3] or subepoch_rate = [1, None]. Defaults to [None, None].
 Returns
DataFrame – A dataframe containing the analyzed ECG features for each epoch, with each epoch indicated by the Label column (if not present, by the Index column). The analyzed features consist of the following:
”ECG_Rate_Max”: the maximum heart rate after stimulus onset.
”ECG_Rate_Min”: the minimum heart rate after stimulus onset.
”ECG_Rate_Mean”: the mean heart rate after stimulus onset.
”ECG_Rate_SD”: the standard deviation of the heart rate after stimulus onset.
”ECG_Rate_Max_Time”: the time at which maximum heart rate occurs.
”ECG_Rate_Min_Time”: the time at which minimum heart rate occurs.
”ECG_Phase_Atrial”: indication of whether the onset of the event concurs with atrial systole (1) or diastole (0).
”ECG_Phase_Ventricular”: indication of whether the onset of the event concurs with ventricular systole (1) or diastole (0).
”ECG_Phase_Atrial_Completion”: indication of the stage of the current cardiac (atrial) phase (0 to 1) at the onset of the event.
”ECG_Phase_Ventricular_Completion”: indication of the stage of the current cardiac (ventricular) phase (0 to 1) at the onset of the event.
We also include the following experimental features related to the parameters of a quadratic model:
”ECG_Rate_Trend_Linear”: The parameter corresponding to the linear trend.
”ECG_Rate_Trend_Quadratic”: The parameter corresponding to the curvature.
”ECG_Rate_Trend_R2”: the quality of the quadratic model. If too low, the parameters might not be reliable or meaningful.
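What these trend parameters capture can be illustrated with a plain NumPy quadratic fit on a hypothetical within-epoch rate trace (NeuroKit's exact fitting procedure may differ):

```python
import numpy as np

# Hypothetical within-epoch heart rate trace (2 s at 100 Hz):
# a known linear + quadratic trend plus a little noise.
t = np.linspace(0, 2, 200)
rng = np.random.default_rng(42)
rate = 70 + 2.0 * t - 1.5 * t**2 + rng.normal(0, 0.1, t.size)

# Fit a quadratic model; the coefficients play the role of
# ECG_Rate_Trend_Quadratic (curvature) and ECG_Rate_Trend_Linear.
quad, lin, intercept = np.polyfit(t, rate, deg=2)

# R-squared of the model, analogous to ECG_Rate_Trend_R2.
pred = np.polyval([quad, lin, intercept], t)
r2 = 1 - np.sum((rate - pred) ** 2) / np.sum((rate - rate.mean()) ** 2)
```

With little noise, the recovered coefficients sit close to the generating values and R² is high; on noisy epochs R² drops, which is exactly the "might not be reliable" caveat above.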
See also
events_find
,epochs_create
,bio_process
Examples
>>> import neurokit2 as nk
>>>
>>> # Example with simulated data
>>> ecg, info = nk.ecg_process(nk.ecg_simulate(duration=20))
>>>
>>> # Process the data
>>> epochs = nk.epochs_create(ecg, events=[5000, 10000, 15000],
...                           epochs_start=-0.1, epochs_end=1.9)
>>> nk.ecg_eventrelated(epochs)
  Label Event_Onset ... ECG_Phase_Completion_Ventricular ECG_Quality_Mean
1     1         ... ...                              ...              ...
2     2         ... ...                              ...              ...
3     3         ... ...                              ...              ...

[3 rows x 17 columns]
>>>
>>> # Example with real data
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data
>>> df, info = nk.bio_process(ecg=data["ECG"], sampling_rate=100)
>>> events = nk.events_find(data["Photosensor"], threshold_keep='below',
...                         event_conditions=["Negative", "Neutral",
...                                           "Neutral", "Negative"])
>>> epochs = nk.epochs_create(df, events, sampling_rate=100,
...                           epochs_start=-0.1, epochs_end=1.9)
>>> nk.ecg_eventrelated(epochs) #doctest: +ELLIPSIS
  Label Condition ... ECG_Phase_Completion_Ventricular ECG_Quality_Mean
1     1  Negative ...                              ...              ...
2     2   Neutral ...                              ...              ...
3     3   Neutral ...                              ...              ...
4     4  Negative ...                              ...              ...

[4 rows x 18 columns]
 ecg_findpeaks(ecg_cleaned, sampling_rate=1000, method='neurokit', show=False)[source]¶
Find R-peaks in an ECG signal.
Low-level function used by ecg_peaks() to identify R-peaks in an ECG signal using a different set of algorithms. See ecg_peaks() for details.
 Parameters
ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.
method (string) – The algorithm to be used for R-peak detection. Can be one of ‘neurokit’ (default), ‘pantompkins1985’, ‘hamilton2002’, ‘christov2004’, ‘gamboa2008’, ‘elgendi2010’, ‘engzeemod2012’, ‘kalidas2017’, ‘martinez2003’, ‘rodrigues2021’ or ‘promac’.
show (bool) – If True, will return a plot visualizing the thresholds used in the algorithm. Useful for debugging.
 Returns
info (dict) – A dictionary containing additional information, in this case the samples at which R-peaks occur, accessible with the key “ECG_R_Peaks”.
See also
ecg_clean
,signal_fixpeaks
,ecg_peaks
,ecg_rate
,ecg_process
,ecg_plot
Examples
>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000)
>>> cleaned = nk.ecg_clean(ecg, sampling_rate=1000)
>>> info = nk.ecg_findpeaks(cleaned)
>>> nk.events_plot(info["ECG_R_Peaks"], cleaned)
<Figure ...>
>>>
>>> # Different methods
>>> neurokit = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="neurokit"), method="neurokit")
>>> pantompkins1985 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="pantompkins1985"), method="pantompkins1985")
>>> nabian2018 = nk.ecg_findpeaks(cleaned, method="nabian2018")
>>> hamilton2002 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="hamilton2002"), method="hamilton2002")
>>> martinez2003 = nk.ecg_findpeaks(cleaned, method="martinez2003")
>>> christov2004 = nk.ecg_findpeaks(cleaned, method="christov2004")
>>> gamboa2008 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="gamboa2008"), method="gamboa2008")
>>> elgendi2010 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="elgendi2010"), method="elgendi2010")
>>> engzeemod2012 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="engzeemod2012"), method="engzeemod2012")
>>> kalidas2017 = nk.ecg_findpeaks(nk.ecg_clean(ecg, method="kalidas2017"), method="kalidas2017")
>>> rodrigues2021 = nk.ecg_findpeaks(cleaned, method="rodrigues2021")
>>>
>>> # Visualize
>>> nk.events_plot([neurokit["ECG_R_Peaks"],
...                 pantompkins1985["ECG_R_Peaks"],
...                 nabian2018["ECG_R_Peaks"],
...                 hamilton2002["ECG_R_Peaks"],
...                 christov2004["ECG_R_Peaks"],
...                 gamboa2008["ECG_R_Peaks"],
...                 elgendi2010["ECG_R_Peaks"],
...                 engzeemod2012["ECG_R_Peaks"],
...                 kalidas2017["ECG_R_Peaks"],
...                 martinez2003["ECG_R_Peaks"],
...                 rodrigues2021["ECG_R_Peaks"]], cleaned)
<Figure ...>
>>>
>>> # Method agreement
>>> ecg = nk.ecg_simulate(duration=10, sampling_rate=500)
>>> ecg = nk.signal_distort(ecg,
...                         sampling_rate=500,
...                         noise_amplitude=0.2, noise_frequency=[25, 50],
...                         artifacts_amplitude=0.2, artifacts_frequency=50)
>>> nk.ecg_findpeaks(ecg, sampling_rate=1000, method="promac", show=True)
{'ECG_R_Peaks': array(...)}
References
Rodrigues, Tiago & Samoutphonh, Sirisack & Plácido da Silva, Hugo & Fred, Ana. (2021). A Low-Complexity R-peak Detection Algorithm with Adaptive Thresholding for Wearable Devices.
Gamboa, H. (2008). Multimodal behavioral biometrics based on hci and electrophysiology. PhD Thesis, Universidade.
Zong, W., Heldt, T., Moody, G. B., & Mark, R. G. (2003). An open-source algorithm to detect onset of arterial blood pressure pulses. In Computers in Cardiology, 2003 (pp. 259-262). IEEE.
Hamilton, P. (2002, September). Open source ECG analysis. In Computers in Cardiology (pp. 101-104). IEEE.
Pan, J., & Tompkins, W. J. (1985). A real-time QRS detection algorithm. IEEE Transactions on Biomedical Engineering, (3), 230-236.
Engelse, W. A. H., & Zeelenberg, C. (1979). A single scan algorithm for QRS detection and feature extraction. IEEE Comput Cardiol. Long Beach: IEEE Computer Society.
Lourenço, A., Silva, H., Leite, P., Lourenço, R., & Fred, A. L. (2012). Real Time Electrocardiogram Segmentation for Finger-based ECG Biometrics. In Biosignals (pp. 49-54).
Nabian, M., Yin, Y., Wormwood, J., Quigley, K. S., Barrett, L. F., Ostadabbas, S. (2018). An Open-Source Feature Extraction Tool for the Analysis of Peripheral Physiological Data. IEEE Journal of Translational Engineering in Health and Medicine, 6, 1-11. doi:10.1109/jtehm.2018.2878000
 ecg_intervalrelated(data, sampling_rate=1000)[source]¶
Performs ECG analysis on longer periods of data (typically > 10 seconds), such as resting-state data.
 Parameters
data (Union[dict, pd.DataFrame]) – A DataFrame containing the different processed signal(s) as different columns, typically generated by ecg_process() or bio_process(). Can also take a dict containing sets of separately processed DataFrames.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
 Returns
DataFrame – A dataframe containing the analyzed ECG features. The analyzed features consist of the following:
”ECG_Rate_Mean”: the mean heart rate.
”ECG_HRV”: the different heart rate variability metrics.
See hrv_summary() docstrings for details.
See also
bio_process
,ecg_eventrelated
Examples
>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.ecg_process(data["ECG"], sampling_rate=100)
>>>
>>> # Single dataframe is passed
>>> nk.ecg_intervalrelated(df, sampling_rate=100)
   ECG_Rate_Mean  HRV_MeanNN ...
0            ...         ...
>>>
>>> epochs = nk.epochs_create(df, events=[0, 15000], sampling_rate=100,
...                           epochs_end=150)
>>> nk.ecg_intervalrelated(epochs)
  Label  ECG_Rate_Mean ...
1   ...            ...
 ecg_peaks(ecg_cleaned, sampling_rate=1000, method='neurokit', correct_artifacts=False)[source]¶
Find R-peaks in an ECG signal.
Find R-peaks in an ECG signal using the specified method.
 Parameters
ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.
method (string) – The algorithm to be used for R-peak detection. Can be one of ‘neurokit’ (default), ‘pantompkins1985’, ‘hamilton2002’, ‘christov2004’, ‘gamboa2008’, ‘elgendi2010’, ‘engzeemod2012’ or ‘kalidas2017’.
correct_artifacts (bool) – Whether or not to identify artifacts as defined by Jukka A. Lipponen & Mika P. Tarvainen (2019): A robust algorithm for heart rate variability time series artefact correction using novel beat classification, Journal of Medical Engineering & Technology, DOI: 10.1080/03091902.2019.1640306.
 Returns
signals (DataFrame) – A DataFrame of the same length as the input signal, in which occurrences of R-peaks are marked as “1” in a list of zeros. Accessible with the key “ECG_R_Peaks”.
info (dict) – A dictionary containing additional information, in this case the samples at which R-peaks occur, accessible with the key “ECG_R_Peaks”, as well as the signal’s sampling rate.
See also
ecg_clean
,ecg_findpeaks
,ecg_process
,ecg_plot
,signal_rate
,signal_fixpeaks
Examples
>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000)
>>> cleaned = nk.ecg_clean(ecg, sampling_rate=1000)
>>> signals, info = nk.ecg_peaks(cleaned, correct_artifacts=True)
>>> nk.events_plot(info["ECG_R_Peaks"], cleaned)
<Figure ...>
References
Gamboa, H. (2008). Multimodal behavioral biometrics based on hci and electrophysiology. PhD Thesis, Universidade.
W. Zong, T. Heldt, G.B. Moody, and R.G. Mark. An open-source algorithm to detect onset of arterial blood pressure pulses. In Computers in Cardiology, 2003, pages 259–262, 2003.
Hamilton, Open Source ECG Analysis Software Documentation, E.P.Limited, 2002.
Jiapu Pan and Willis J. Tompkins. A Real-Time QRS Detection Algorithm. In: IEEE Transactions on Biomedical Engineering BME-32.3 (1985), pp. 230–236.
C. Zeelenberg, A single scan algorithm for QRS detection and feature extraction, IEEE Comp. in Cardiology, vol. 6, pp. 37-42, 1979.
A. Lourenco, H. Silva, P. Leite, R. Lourenco and A. Fred, “Real Time Electrocardiogram Segmentation for Finger Based ECG Biometrics”, BIOSIGNALS 2012, pp. 49-54, 2012.
 ecg_phase(ecg_cleaned, rpeaks=None, delineate_info=None, sampling_rate=None)[source]¶
Compute cardiac phase (for both atrial and ventricular).
Finds the cardiac phase, labelled as 1 for systole and 0 for diastole.
 Parameters
ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().
rpeaks (list or array or DataFrame or Series or dict) – The samples at which the different ECG peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with ecg_findpeaks() or ecg_peaks().
delineate_info (dict) – A dictionary containing additional information on ECG delineation, obtained with ecg_delineate().
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to None.
 Returns
signals (DataFrame) – A DataFrame of same length as ecg_signal containing the following columns:
”ECG_Phase_Atrial”: cardiac phase, marked by “1” for systole and “0” for diastole.
”ECG_Phase_Completion_Atrial”: cardiac phase (atrial) completion, expressed in percentage (from 0 to 1), representing the stage of the current cardiac phase.
”ECG_Phase_Ventricular”: cardiac phase, marked by “1” for systole and “0” for diastole.
”ECG_Phase_Completion_Ventricular”: cardiac phase (ventricular) completion, expressed in percentage (from 0 to 1), representing the stage of the current cardiac phase.
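The completion columns can be sketched as the elapsed fraction between consecutive peaks. The peak positions below are hypothetical and the code is a simplified illustration, not NeuroKit's internal implementation:

```python
import numpy as np

# Hypothetical R-peak sample indices and a signal of 3000 samples
rpeaks = np.array([0, 1000, 2100, 3000])
n_samples = 3000

# For each sample, completion = fraction of the current interval elapsed,
# rising linearly from 0 at one peak to 1 just before the next.
samples = np.arange(n_samples)
idx = np.searchsorted(rpeaks, samples, side="right") - 1
prev_peak = rpeaks[idx]
next_peak = rpeaks[np.minimum(idx + 1, len(rpeaks) - 1)]
span = np.where(next_peak > prev_peak, next_peak - prev_peak, 1)
completion = (samples - prev_peak) / span
```

The resulting vector resets to 0 at every peak and approaches 1 just before the next one, which is what makes it useful for locating an event within the cardiac cycle.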
See also
Examples
>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=10, sampling_rate=1000)
>>> cleaned = nk.ecg_clean(ecg, sampling_rate=1000)
>>> _, rpeaks = nk.ecg_peaks(cleaned)
>>> signals, waves = nk.ecg_delineate(cleaned, rpeaks, sampling_rate=1000)
>>>
>>> cardiac_phase = nk.ecg_phase(ecg_cleaned=cleaned, rpeaks=rpeaks,
...                              delineate_info=waves, sampling_rate=1000)
>>> nk.signal_plot([cleaned, cardiac_phase], standardize=True)
 ecg_plot(ecg_signals, rpeaks=None, sampling_rate=None, show_type='default')[source]¶
Visualize ECG data.
 Parameters
ecg_signals (DataFrame) – DataFrame obtained from ecg_process().
rpeaks (dict) – The samples at which the R-peaks occur. Dict returned by ecg_process(). Defaults to None.
sampling_rate (int) – The sampling frequency of the ECG (in Hz, i.e., samples/second). Needs to be supplied if the data should be plotted over time in seconds. Otherwise the data is plotted over samples. Defaults to None. Must be specified to plot artifacts.
show_type (str) – Visualize the ECG data with ‘default’ or visualize artifacts thresholds with ‘artifacts’ produced by ecg_fixpeaks(), or ‘full’ to visualize both.
 Returns
fig – Figure representing a plot of the processed ECG signals (and peak artifacts).
Examples
>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=15, sampling_rate=1000, heart_rate=80)
>>> signals, info = nk.ecg_process(ecg, sampling_rate=1000)
>>> nk.ecg_plot(signals, sampling_rate=1000, show_type='default')
<Figure ...>
See also
 ecg_process(ecg_signal, sampling_rate=1000, method='neurokit')[source]¶
Process an ECG signal.
Convenience function that automatically processes an ECG signal.
 Parameters
ecg_signal (Union[list, np.array, pd.Series]) – The raw ECG channel.
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.
method (str) – The processing pipeline to apply. Defaults to “neurokit”.
 Returns
signals (DataFrame) – A DataFrame of the same length as the ecg_signal containing the following columns:
”ECG_Raw”: the raw signal.
”ECG_Clean”: the cleaned signal.
”ECG_R_Peaks”: the R-peaks marked as “1” in a list of zeros.
”ECG_Rate”: heart rate interpolated between R-peaks.
”ECG_P_Peaks”: the P-peaks marked as “1” in a list of zeros.
”ECG_Q_Peaks”: the Q-peaks marked as “1” in a list of zeros.
”ECG_S_Peaks”: the S-peaks marked as “1” in a list of zeros.
”ECG_T_Peaks”: the T-peaks marked as “1” in a list of zeros.
”ECG_P_Onsets”: the P-onsets marked as “1” in a list of zeros.
”ECG_P_Offsets”: the P-offsets marked as “1” in a list of zeros (only when the method in ecg_delineate is wavelet).
”ECG_T_Onsets”: the T-onsets marked as “1” in a list of zeros (only when the method in ecg_delineate is wavelet).
”ECG_T_Offsets”: the T-offsets marked as “1” in a list of zeros.
”ECG_R_Onsets”: the R-onsets marked as “1” in a list of zeros (only when the method in ecg_delineate is wavelet).
”ECG_R_Offsets”: the R-offsets marked as “1” in a list of zeros (only when the method in ecg_delineate is wavelet).
”ECG_Phase_Atrial”: cardiac phase, marked by “1” for systole and “0” for diastole.
”ECG_Phase_Ventricular”: cardiac phase, marked by “1” for systole and “0” for diastole.
”ECG_Phase_Completion_Atrial”: cardiac phase (atrial) completion, expressed in percentage (from 0 to 1), representing the stage of the current cardiac phase.
”ECG_Phase_Completion_Ventricular”: cardiac phase (ventricular) completion, expressed in percentage (from 0 to 1), representing the stage of the current cardiac phase.
info (dict) – A dictionary containing the samples at which the R-peaks occur, accessible with the key “ECG_R_Peaks”, as well as the signal’s sampling rate.
See also
ecg_clean
,ecg_findpeaks
,ecg_plot
,signal_rate
,signal_fixpeaks
Examples
>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=15, sampling_rate=1000, heart_rate=80)
>>> signals, info = nk.ecg_process(ecg, sampling_rate=1000)
>>> nk.ecg_plot(signals)
<Figure ...>
 ecg_quality(ecg_cleaned, rpeaks=None, sampling_rate=1000, method='averageQRS', approach=None)[source]¶
Quality of ECG Signal.
The “averageQRS” method computes a continuous index of quality of the ECG signal, by interpolating the distance of each QRS segment from the average QRS segment present in the data. This index is therefore relative: 1 corresponds to heartbeats that are closest to the average sample, and 0 corresponds to the heartbeat most distant from that average sample. Note that 1 does not necessarily mean “good”: if the majority of samples are bad, then being close to the average will likely mean bad as well. Use this index with care and plot it alongside your ECG signal to see if it makes sense.
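The idea behind “averageQRS” can be sketched with synthetic beats. This is a conceptual illustration only: the actual implementation segments real QRS complexes and interpolates the index over the whole signal.

```python
import numpy as np

# Hypothetical matrix of segmented heartbeats (one row per beat),
# where beat 3 is distorted relative to the others.
rng = np.random.default_rng(0)
beats = np.tile(np.sin(np.linspace(0, np.pi, 50)), (5, 1))
beats += rng.normal(0, 0.01, beats.shape)
beats[3] += 0.5  # a deviant beat

# Distance of each beat from the average beat...
template = beats.mean(axis=0)
distances = np.linalg.norm(beats - template, axis=1)

# ...rescaled so the closest beat scores 1 and the farthest scores 0.
quality = 1 - (distances - distances.min()) / (distances.max() - distances.min())
```

Note how the index is purely relative to the recording's own average, which is exactly why a score of 1 in a mostly-bad recording can still mean "bad".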
The “zhao2018” method (Zhao et al., 2018) extracts several signal quality indices (SQIs): QRS wave power spectrum distribution pSQI, kurtosis kSQI, and baseline relative power basSQI. An additional R-peak detection match qSQI was originally computed in the paper but is left out of this algorithm. The indices were originally weighted with a ratio of [0.4, 0.4, 0.1, 0.1] to generate the final classification outcome, but because qSQI was dropped, the weights have been rearranged to [0.6, 0.2, 0.2] for pSQI, kSQI and basSQI respectively.
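The rearranged weighting can be sketched as a weighted vote. The per-index scores and decision thresholds below are placeholders for illustration only; see Zhao et al. (2018) for the actual heuristic rules:

```python
# Hypothetical per-index quality levels: 1.0 = good, 0.5 = suspicious, 0.0 = bad
sqi_scores = {"pSQI": 1.0, "kSQI": 0.0, "basSQI": 0.5}

# The rearranged weights for the three remaining indices
weights = {"pSQI": 0.6, "kSQI": 0.2, "basSQI": 0.2}

# Weighted combination of the indices
fused = sum(weights[name] * score for name, score in sqi_scores.items())

# Placeholder decision thresholds, not the paper's exact rules
if fused >= 0.9:
    label = "Excellent"
elif fused >= 0.5:
    label = "Barely acceptable"
else:
    label = "Unacceptable"
```

The point of the weighting is that pSQI dominates the outcome, mirroring its 0.6 share after qSQI was dropped.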
 Parameters
ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG signal in the form of a vector of values.
rpeaks (tuple or list) – The list of Rpeak samples returned by ecg_peaks(). If None, peaks is computed from the signal input.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
method (str) – The method for computing ECG signal quality, can be “averageQRS” (default) or “zhao2018”.
approach (str) – The data fusion approach as documented in Zhao et al. (2018). Can be “simple” or “fuzzy”. The former performs simple heuristic fusion of SQIs and the latter performs fuzzy comprehensive evaluation. If None (default), simple heuristic fusion is used.
**kwargs – Keyword arguments to be passed to signal_power() in the computation of basSQI and pSQI.
 Returns
array or str – Vector containing the quality index ranging from 0 to 1 for the “averageQRS” method; returns a string classification (“Unacceptable”, “Barely acceptable” or “Excellent”) of the signal for the “zhao2018” method.
See also
References
Zhao, Z., & Zhang, Y. (2018). “SQI quality evaluation mechanism of singlelead ECG signal based on simple heuristic fusion and fuzzy comprehensive evaluation”. Frontiers in Physiology, 9, 727.
Examples
>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=30, sampling_rate=300, noise=0.2)
>>> ecg_cleaned = nk.ecg_clean(ecg, sampling_rate=300)
>>> quality = nk.ecg_quality(ecg_cleaned, sampling_rate=300)
>>>
>>> nk.signal_plot([ecg_cleaned, quality], standardize=True)
>>>
>>> # Zhao et al. (2018) method
>>> quality = nk.ecg_quality(ecg_cleaned, sampling_rate=300,
...                          method="zhao2018", approach="fuzzy")
 ecg_rate(peaks, sampling_rate=1000, desired_length=None, interpolation_method='monotone_cubic')¶
Calculate signal rate from a series of peaks.
This function can also be called via ecg_rate(), ppg_rate() or rsp_rate() (aliases provided for consistency).
 Parameters
peaks (Union[list, np.array, pd.DataFrame, pd.Series, dict]) – The samples at which the peaks occur. If an array is passed in, it is assumed that it was obtained with signal_findpeaks(). If a DataFrame is passed in, it is assumed to be of the same length as the input signal, with occurrences of R-peaks marked as “1”; such containers are obtained with e.g., ecg_findpeaks() or rsp_findpeaks().
sampling_rate (int) – The sampling frequency of the signal that contains peaks (in Hz, i.e., samples/second). Defaults to 1000.
desired_length (int) – If left at the default None, the returned rate will have the same number of elements as peaks. If set to a value larger than the sample at which the last peak occurs in the signal (i.e., peaks[-1]), the returned rate will be interpolated between peaks over desired_length samples. To interpolate the rate over the entire duration of the signal, set desired_length to the number of samples in the signal. Cannot be smaller than or equal to the sample at which the last peak occurs. Defaults to None.
interpolation_method (str) – Method used to interpolate the rate between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the ydirection). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.
 Returns
array – A vector containing the rate.
See also
signal_findpeaks
,signal_fixpeaks
,signal_plot
Examples
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=1)
>>> info = nk.signal_findpeaks(signal)
>>>
>>> rate = nk.signal_rate(peaks=info["Peaks"], desired_length=len(signal))
>>> fig = nk.signal_plot(rate)
>>> fig
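The overshoot issue that motivates the ‘monotone_cubic’ default can be demonstrated directly with SciPy, which provides both interpolants (this demo is independent of NeuroKit):

```python
import numpy as np
from scipy.interpolate import CubicSpline, PchipInterpolator

# Monotone step-like data, e.g. a rate that plateaus between peaks
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 0.0, 1.0, 1.0])
xs = np.linspace(0, 3, 301)

spline = CubicSpline(x, y)(xs)          # classic cubic spline
monotone = PchipInterpolator(x, y)(xs)  # monotone cubic (PCHIP)

# The cubic spline overshoots outside the data range [0, 1];
# the monotone interpolant stays within it.
```

For a physiological rate, such overshoots would be implausible values that were never measured, which is why the monotone variant is the safer default.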
 ecg_rsp(ecg_rate, sampling_rate=1000, method='vangent2019')[source]¶
Extract ECG Derived Respiration (EDR).
This implementation is far from being complete, as the information in the related papers prevents me from getting a full understanding of the procedure. Help is required!
 Parameters
ecg_rate (array) – The heart rate signal as obtained via ecg_rate().
sampling_rate (int) – The sampling frequency of the signal that contains the Rpeaks (in Hz, i.e., samples/second). Defaults to 1000Hz.
method (str) – Can be one of ‘vangent2019’ (default), ‘soni2019’, ‘charlton2016’ or ‘sarkar2015’.
 Returns
array – A Numpy array containing the derived respiratory signal.
Examples
>>> import neurokit2 as nk
>>> import pandas as pd
>>>
>>> # Get heart rate
>>> data = nk.data("bio_eventrelated_100hz")
>>> rpeaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100)
>>> ecg_rate = nk.signal_rate(rpeaks, sampling_rate=100, desired_length=len(rpeaks))
>>>
>>> # Get ECG-Derived Respiration (EDR)
>>> edr = nk.ecg_rsp(ecg_rate, sampling_rate=100)
>>> nk.standardize(pd.DataFrame({"EDR": edr, "RSP": data["RSP"]})).plot()
<AxesSubplot:>
>>>
>>> # Method comparison (the closer to 0 the better)
>>> nk.standardize(pd.DataFrame({
...     "True RSP": data["RSP"],
...     "vangent2019": nk.ecg_rsp(ecg_rate, sampling_rate=100, method="vangent2019"),
...     "sarkar2015": nk.ecg_rsp(ecg_rate, sampling_rate=100, method="sarkar2015"),
...     "charlton2016": nk.ecg_rsp(ecg_rate, sampling_rate=100, method="charlton2016"),
...     "soni2019": nk.ecg_rsp(ecg_rate, sampling_rate=100, method="soni2019")})).plot()
<AxesSubplot:>
References
van Gent, P., Farah, H., van Nes, N., & van Arem, B. (2019). HeartPy: A novel heart rate algorithm for the analysis of noisy signals. Transportation Research Part F: Traffic Psychology and Behaviour, 66, 368-378.
Sarkar, S., Bhattacherjee, S., & Pal, S. (2015). Extraction of respiration signal from ECG for respiratory rate estimation.
Charlton, P. H., Bonnici, T., Tarassenko, L., Clifton, D. A., Beale, R., & Watkinson, P. J. (2016). An assessment of algorithms to estimate respiratory rate from the electrocardiogram and photoplethysmogram. Physiological measurement, 37(4), 610.
Soni, R., & Muniyandi, M. (2019). Breath rate variability: a novel measure to study the meditation effects. International Journal of Yoga, 12(1), 45.
 ecg_segment(ecg_cleaned, rpeaks=None, sampling_rate=1000, show=False)[source]¶
Segment an ECG signal into single heartbeats.
 Parameters
ecg_cleaned (Union[list, np.array, pd.Series]) – The cleaned ECG channel as returned by ecg_clean().
rpeaks (dict) – The samples at which the R-peaks occur. Dict returned by ecg_peaks(). Defaults to None.
sampling_rate (int) – The sampling frequency of ecg_signal (in Hz, i.e., samples/second). Defaults to 1000.
show (bool) – If True, will return a plot of heartbeats. Defaults to False.
 Returns
dict – A dict containing DataFrames for all segmented heartbeats.
Examples
>>> import neurokit2 as nk
>>>
>>> ecg = nk.ecg_simulate(duration=15, sampling_rate=1000, heart_rate=80)
>>> ecg_cleaned = nk.ecg_clean(ecg, sampling_rate=1000)
>>> nk.ecg_segment(ecg_cleaned, rpeaks=None, sampling_rate=1000, show=True)
{'1': Signal Index Label ..., '2': Signal Index Label ..., '19': Signal Index Label ...}
 ecg_simulate(duration=10, length=None, sampling_rate=1000, noise=0.01, heart_rate=70, heart_rate_std=1, method='ecgsyn', random_state=None)[source]¶
Simulate an ECG/EKG signal.
Generate an artificial (synthetic) ECG signal of a given duration and sampling rate using either the ECGSYN dynamical model (McSharry et al., 2003) or a simpler model based on Daubechies wavelets to roughly approximate cardiac cycles.
 Parameters
duration (int) – Desired recording length in seconds.
sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).
length (int) – The desired length of the signal (in samples).
noise (float) – Noise level (amplitude of the Laplace noise).
heart_rate (int) – Desired simulated heart rate (in beats per minute). The default is 70. Note that for the ECGSYN method, random fluctuations are to be expected to mimic a real heart rate. These fluctuations can cause some slight discrepancies between the requested heart rate and the empirical heart rate, especially for shorter signals.
heart_rate_std (int) – Desired heart rate standard deviation (beats per minute).
method (str) – The model used to generate the signal. Can be ‘simple’ for a simulation based on Daubechies wavelets that roughly approximates a single cardiac cycle. If ‘ecgsyn’ (default), will use an advanced model described in McSharry et al. (2003).
random_state (int) – Seed for the random number generator.
 Returns
array – Vector containing the ECG signal.
Examples
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> ecg1 = nk.ecg_simulate(duration=10, method="simple")
>>> ecg2 = nk.ecg_simulate(duration=10, method="ecgsyn")
>>> pd.DataFrame({"ECG_Simple": ecg1,
...               "ECG_Complex": ecg2}).plot(subplots=True)
array([<AxesSubplot:>, <AxesSubplot:>], dtype=object)
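The idea behind the ‘simple’ method (a rough approximation of a single cardiac cycle) can be illustrated without NeuroKit by summing a few bumps for the P, Q, R, S and T waves. Note that this is only a sketch: NeuroKit's actual ‘simple’ method uses Daubechies wavelets, and the wave locations, widths and amplitudes below are illustrative guesses, not the package's coefficients.

```python
import math

def gaussian(t, center, width, amplitude):
    """A single Gaussian bump used to approximate one ECG wave."""
    return amplitude * math.exp(-((t - center) ** 2) / (2 * width ** 2))

def simple_beat(sampling_rate=1000):
    """Roughly approximate one 1-second cardiac cycle as a sum of
    Gaussian bumps for the P, Q, R, S and T waves (illustrative values)."""
    waves = [  # (center in s, width in s, amplitude)
        (0.20, 0.040, 0.15),   # P wave
        (0.36, 0.010, -0.10),  # Q wave
        (0.40, 0.015, 1.00),   # R peak
        (0.44, 0.010, -0.15),  # S wave
        (0.65, 0.060, 0.30),   # T wave
    ]
    return [sum(gaussian(i / sampling_rate, c, w, a) for c, w, a in waves)
            for i in range(sampling_rate)]

beat = simple_beat()  # one synthetic heartbeat, 1000 samples at 1000 Hz
```

Concatenating (and slightly jittering) such beats gives a crude ECG-like signal; the real function additionally handles noise, heart-rate variation and resampling.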
See also
rsp_simulate, eda_simulate, ppg_simulate, emg_simulate
References
McSharry, P. E., Clifford, G. D., Tarassenko, L., & Smith, L. A. (2003). A dynamical model for generating synthetic electrocardiogram signals. IEEE transactions on biomedical engineering, 50(3), 289-294.
https://github.com/diarmaidocualain/ecg_simulation
PPG¶
Submodule for NeuroKit.
 ppg_analyze(data, sampling_rate=1000, method='auto')[source]¶
Performs PPG analysis on either epochs (event-related analysis) or on longer periods of data such as resting-state data.
 Parameters
data (Union[dict, pd.DataFrame]) – A dictionary of epochs, containing one DataFrame per epoch, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df(). Can also take a DataFrame of processed signals from a longer period of data, typically generated by ppg_process() or bio_process(). Can also take a dict containing sets of separate periods of data.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Defaults to 1000Hz.
method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’, where the right method is chosen based on the mean duration of the data (‘event-related’ for durations under 10s).
 Returns
DataFrame – A dataframe containing the analyzed PPG features. If event-related analysis is conducted, each epoch is indicated by the Label column. See ppg_eventrelated() and ppg_intervalrelated() docstrings for details.
See also
bio_process, ppg_process, epochs_create, ppg_eventrelated, ppg_intervalrelated
Examples
>>> import neurokit2 as nk
>>>
>>> # Example 1: Simulate data for event-related analysis
>>> ppg = nk.ppg_simulate(duration=20, sampling_rate=1000)
>>>
>>> # Process data
>>> ppg_signals, info = nk.ppg_process(ppg, sampling_rate=1000)
>>> epochs = nk.epochs_create(ppg_signals, events=[5000, 10000, 15000],
...                           epochs_start=-0.1, epochs_end=1.9)
>>>
>>> # Analyze
>>> analyze_epochs = nk.ppg_analyze(epochs, sampling_rate=1000)
>>> analyze_epochs
>>>
>>> # Example 2: Download the resting-state data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.ppg_process(data["PPG"], sampling_rate=100)
>>>
>>> # Analyze
>>> analyze_df = nk.ppg_analyze(df, sampling_rate=100)
>>> analyze_df
 ppg_clean(ppg_signal, sampling_rate=1000, heart_rate=None, method='elgendi')[source]¶
Clean a photoplethysmogram (PPG) signal.
Prepare a raw PPG signal for systolic peak detection.
 Parameters
ppg_signal (Union[list, np.array, pd.Series]) – The raw PPG channel.
heart_rate (Union[int, float]) – The heart rate of the PPG signal. Applicable only if method is “nabian2018” to check that filter frequency is appropriate.
sampling_rate (int) – The sampling frequency of the PPG (in Hz, i.e., samples/second). The default is 1000.
method (str) – The processing pipeline to apply. Can be one of “elgendi” or “nabian2018”. The default is “elgendi”.
 Returns
clean (array) – A vector containing the cleaned PPG.
See also
Examples
>>> import neurokit2 as nk
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>>
>>> # Simulate and clean signal
>>> ppg = nk.ppg_simulate(heart_rate=75, duration=30)
>>> ppg_elgendi = nk.ppg_clean(ppg, method='elgendi')
>>> ppg_nabian = nk.ppg_clean(ppg, method='nabian2018', heart_rate=75)
>>>
>>> # Plot and compare methods
>>> signals = pd.DataFrame({"PPG_Raw": ppg,
...                         "PPG_Elgendi": ppg_elgendi,
...                         "PPG_Nabian": ppg_nabian})
>>> signals.plot()
<AxesSubplot:>
References
Nabian, M., Yin, Y., Wormwood, J., Quigley, K. S., Barrett, L. F., & Ostadabbas, S. (2018). An Open-Source Feature Extraction Tool for the Analysis of Peripheral Physiological Data. IEEE Journal of Translational Engineering in Health and Medicine, 6, 1-11. doi:10.1109/jtehm.2018.2878000
 ppg_eventrelated(epochs, silent=False)[source]¶
Performs event-related PPG analysis on epochs.
 Parameters
epochs (Union[dict, pd.DataFrame]) – A dict containing one DataFrame per event/trial, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df().
silent (bool) – If True, silence possible warnings.
 Returns
DataFrame – A dataframe containing the analyzed PPG features for each epoch, with each epoch indicated by the Label column (if not present, by the Index column). The analyzed features consist of the following:
"PPG_Rate_Baseline": the baseline heart rate (at stimulus onset).
"PPG_Rate_Max": the maximum heart rate after stimulus onset.
"PPG_Rate_Min": the minimum heart rate after stimulus onset.
"PPG_Rate_Mean": the mean heart rate after stimulus onset.
"PPG_Rate_SD": the standard deviation of the heart rate after stimulus onset.
"PPG_Rate_Max_Time": the time at which maximum heart rate occurs.
"PPG_Rate_Min_Time": the time at which minimum heart rate occurs.
We also include the following experimental features related to the parameters of a quadratic model:
"PPG_Rate_Trend_Linear": the parameter corresponding to the linear trend.
"PPG_Rate_Trend_Quadratic": the parameter corresponding to the curvature.
"PPG_Rate_Trend_R2": the quality of the quadratic model. If too low, the parameters might not be reliable or meaningful.
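The quadratic-trend idea can be sketched independently of NeuroKit: fit rate ≈ b0 + b1·t + b2·t² by ordinary least squares and report the R² of the fit. This is a dependency-free illustration of the concept; NeuroKit's implementation may differ in details such as the scaling of the time axis, and the function name and sample values here are illustrative.

```python
def quadratic_trend(t, y):
    """Least-squares fit of y ~ b0 + b1*t + b2*t**2 via the 3x3 normal
    equations (solved with Cramer's rule). Returns (b0, b1, b2) and R^2."""
    s = [sum(ti ** k for ti in t) for k in range(5)]          # moments of t
    b = [sum(yi * ti ** k for ti, yi in zip(t, y)) for k in range(3)]
    A = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(A)
    coefs = []
    for col in range(3):                                      # Cramer's rule
        m = [row[:] for row in A]
        for r in range(3):
            m[r][col] = b[r]
        coefs.append(det3(m) / d)

    fitted = [coefs[0] + coefs[1] * ti + coefs[2] * ti ** 2 for ti in t]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, 1 - ss_res / ss_tot

t = [i / 10 for i in range(20)]                        # time in seconds
rate = [70 + 5 * ti - 2 * ti ** 2 for ti in t]         # noise-free quadratic "heart rate"
(b0, b1, b2), r2 = quadratic_trend(t, rate)
```

On this noise-free input the fit recovers the linear term (b1 ≈ 5), the curvature (b2 ≈ -2) and an R² of essentially 1; on real epochs a low R² warns that the trend parameters may not be meaningful.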
See also
events_find, epochs_create, ppg_process
Examples
>>> import neurokit2 as nk
>>>
>>> # Example with simulated data
>>> ppg, info = nk.ppg_process(nk.ppg_simulate(duration=20))
>>>
>>> # Process the data
>>> epochs = nk.epochs_create(ppg, events=[5000, 10000, 15000],
...                           epochs_start=-0.1, epochs_end=1.9)
>>> nk.ppg_eventrelated(epochs)
  Label Event_Onset ... PPG_Rate_Trend_Linear PPG_Rate_Trend_R2
1     1         ... ...                   ...               ...
2     2         ... ...                   ...               ...
3     3         ... ...                   ...               ...
[3 rows x 12 columns]
 ppg_findpeaks(ppg_cleaned, sampling_rate=1000, method='elgendi', show=False)[source]¶
Find systolic peaks in a photoplethysmogram (PPG) signal.
 Parameters
ppg_cleaned (Union[list, np.array, pd.Series]) – The cleaned PPG channel as returned by ppg_clean().
sampling_rate (int) – The sampling frequency of the PPG (in Hz, i.e., samples/second). The default is 1000.
method (str) – The processing pipeline to apply. Can be one of “elgendi”. The default is “elgendi”.
show (bool) – If True, returns a plot of the thresholds used during peak detection. Useful for debugging. The default is False.
 Returns
info (dict) – A dictionary containing additional information, in this case the samples at which systolic peaks occur, accessible with the key “PPG_Peaks”.
See also
Examples
>>> import neurokit2 as nk
>>> import matplotlib.pyplot as plt
>>>
>>> ppg = nk.ppg_simulate(heart_rate=75, duration=30)
>>> ppg_clean = nk.ppg_clean(ppg)
>>> info = nk.ppg_findpeaks(ppg_clean)
>>> peaks = info["PPG_Peaks"]
>>>
>>> plt.plot(ppg, label="raw PPG")
>>> plt.plot(ppg_clean, label="clean PPG")
>>> plt.scatter(peaks, ppg[peaks], c="r", label="systolic peaks")
>>> plt.legend()
References
Elgendi M, Norton I, Brearley M, Abbott D, Schuurmans D (2013) Systolic Peak Detection in
Acceleration Photoplethysmograms Measured from Emergency Responders in Tropical Conditions. PLoS ONE 8(10): e76585. doi:10.1371/journal.pone.0076585.
 ppg_intervalrelated(data, sampling_rate=1000)[source]¶
Performs PPG analysis on longer periods of data (typically > 10 seconds), such as resting-state data.
 Parameters
data (Union[dict, pd.DataFrame]) – A DataFrame containing the different processed signal(s) as different columns, typically generated by ppg_process(). Can also take a dict containing sets of separately processed DataFrames.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
 Returns
DataFrame – A dataframe containing the analyzed PPG features. The analyzed features consist of the following:
"PPG_Rate_Mean": the mean heart rate.
The different heart rate variability (HRV) metrics: see hrv_summary() docstrings for details.
See also
Examples
>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.ppg_process(data["PPG"], sampling_rate=100)
>>>
>>> # Single dataframe is passed
>>> nk.ppg_intervalrelated(df, sampling_rate=100)
   PPG_Rate_Mean  HRV_MeanNN ...
0            ...         ... ...
>>>
>>> epochs = nk.epochs_create(df, events=[0, 15000], sampling_rate=100,
...                           epochs_end=150)
>>> nk.ppg_intervalrelated(epochs)
  Label  PPG_Rate_Mean ...
1     1            ... ...
 ppg_plot(ppg_signals, sampling_rate=None)[source]¶
Visualize photoplethysmogram (PPG) data.
 Parameters
ppg_signals (DataFrame) – DataFrame obtained from ppg_process().
sampling_rate (int) – The sampling frequency of the PPG (in Hz, i.e., samples/second). Needs to be supplied if the data should be plotted over time in seconds. Otherwise the data is plotted over samples. Defaults to None.
 Returns
fig – Figure representing a plot of the processed PPG signals.
Examples
>>> import neurokit2 as nk
>>>
>>> # Simulate data
>>> ppg = nk.ppg_simulate(duration=10, sampling_rate=1000, heart_rate=70)
>>>
>>> # Process signal
>>> signals, info = nk.ppg_process(ppg, sampling_rate=1000)
>>>
>>> # Plot
>>> nk.ppg_plot(signals)
<Figure ...>
See also
 ppg_process(ppg_signal, sampling_rate=1000, **kwargs)[source]¶
Process a photoplethysmogram (PPG) signal.
Convenience function that automatically processes a photoplethysmogram signal.
 Parameters
ppg_signal (Union[list, np.array, pd.Series]) – The raw PPG channel.
sampling_rate (int) – The sampling frequency of ppg_signal (in Hz, i.e., samples/second).
 Returns
signals (DataFrame) – A DataFrame of same length as ppg_signal containing the following columns:
"PPG_Raw": the raw signal.
"PPG_Clean": the cleaned signal.
"PPG_Rate": the heart rate as measured based on PPG peaks.
"PPG_Peaks": the PPG peaks marked as "1" in a list of zeros.
info (dict) – A dictionary containing the information of peaks and the signals’ sampling rate.
See also
Examples
>>> import neurokit2 as nk
>>>
>>> ppg = nk.ppg_simulate(duration=10, sampling_rate=1000, heart_rate=70)
>>> signals, info = nk.ppg_process(ppg, sampling_rate=1000)
>>> fig = nk.ppg_plot(signals)
>>> fig
 ppg_rate(peaks, sampling_rate=1000, desired_length=None, interpolation_method='monotone_cubic')¶
Calculate signal rate from a series of peaks.
This function can also be called via ecg_rate(), ppg_rate() or rsp_rate() (aliases provided for consistency).
 Parameters
peaks (Union[list, np.array, pd.DataFrame, pd.Series, dict]) – The samples at which the peaks occur. If an array is passed in, it is assumed that it was obtained with signal_findpeaks(). If a DataFrame is passed in, it is assumed it is of the same length as the input signal in which occurrences of R-peaks are marked as “1”, with such containers obtained with e.g., ecg_findpeaks() or rsp_findpeaks().
sampling_rate (int) – The sampling frequency of the signal that contains peaks (in Hz, i.e., samples/second). Defaults to 1000.
desired_length (int) – If left at the default None, the returned rate will have the same number of elements as peaks. If set to a value larger than the sample at which the last peak occurs in the signal (i.e., peaks[-1]), the returned rate will be interpolated between peaks over desired_length samples. To interpolate the rate over the entire duration of the signal, set desired_length to the number of samples in the signal. Cannot be smaller than or equal to the sample at which the last peak occurs in the signal. Defaults to None.
interpolation_method (str) – Method used to interpolate the rate between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the y-direction). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.
 Returns
array – A vector containing the rate.
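The core of the rate computation (before any interpolation to desired_length) is simply 60 divided by the inter-peak interval in seconds. A minimal, dependency-free sketch; the function name and peak positions are illustrative, and NeuroKit additionally pads and interpolates the result:

```python
def rate_from_peaks(peaks, sampling_rate=1000):
    """Instantaneous rate at each peak (after the first), in events per
    minute, computed from the inter-peak intervals in seconds."""
    intervals = [(b - a) / sampling_rate for a, b in zip(peaks, peaks[1:])]
    return [60.0 / interval for interval in intervals]

# Peaks 800 samples apart at 1000 Hz -> one beat per 0.8 s -> 75 bpm
rate = rate_from_peaks([0, 800, 1600, 2400], sampling_rate=1000)
```

To match the full signal length, these per-peak values would then be interpolated (monotone-cubically, by default) over desired_length samples.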
See also
signal_findpeaks, signal_fixpeaks, signal_plot
Examples
>>> import neurokit2 as nk
>>>
>>> signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=1)
>>> info = nk.signal_findpeaks(signal)
>>>
>>> rate = nk.signal_rate(peaks=info["Peaks"], desired_length=len(signal))
>>> fig = nk.signal_plot(rate)
>>> fig
 ppg_simulate(duration=120, sampling_rate=1000, heart_rate=70, frequency_modulation=0.2, ibi_randomness=0.1, drift=0, motion_amplitude=0.1, powerline_amplitude=0.01, burst_number=0, burst_amplitude=1, random_state=None, show=False)[source]¶
Simulate a photoplethysmogram (PPG) signal.
Phenomenological approximation of PPG. The PPG wave is described with four landmarks: wave onset, location of the systolic peak, location of the dicrotic notch and location of the diastolic peaks. These landmarks are defined as x and y coordinates (in a time series). These coordinates are then interpolated at the desired sampling rate to obtain the PPG signal.
 Parameters
duration (int) – Desired recording length in seconds. The default is 120.
sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second). The default is 1000.
heart_rate (int) – Desired simulated heart rate (in beats per minute). The default is 70. Note that for the ECGSYN method, random fluctuations are to be expected to mimic a real heart rate. These fluctuations can cause some slight discrepancies between the requested heart rate and the empirical heart rate, especially for shorter signals.
frequency_modulation (float) – Float between 0 and 1. Determines how pronounced respiratory sinus arrythmia (RSA) is (0 corresponds to absence of RSA). The default is 0.2.
ibi_randomness (float) – Float between 0 and 1. Determines how much random noise there is in the duration of each PPG wave (0 corresponds to absence of variation). The default is 0.1.
drift (float) – Float between 0 and 1. Determines how pronounced the baseline drift (0.05 Hz) is (0 corresponds to absence of baseline drift). The default is 0.
motion_amplitude (float) – Float between 0 and 1. Determines how pronounced the motion artifact (0.5 Hz) is (0 corresponds to absence of motion artifact). The default is 0.1.
powerline_amplitude (float) – Float between 0 and 1. Determines how pronounced the powerline artifact (50 Hz) is (0 corresponds to absence of powerline artifact). Note that powerline_amplitude > 0 is only possible if ‘sampling_rate’ is >= 500. The default is 0.01.
burst_amplitude (float) – Float between 0 and 1. Determines how pronounced high frequency burst artifacts are (0 corresponds to absence of bursts). The default is 1.
burst_number (int) – Determines how many high frequency burst artifacts occur. The default is 0.
show (bool) – If True, returns a plot of the landmarks and interpolated PPG. Useful for debugging.
random_state (int) – Seed for the random number generator. Keep it fixed for reproducible results.
 Returns
ppg (array) – A vector containing the PPG.
See also
ecg_simulate, rsp_simulate, eda_simulate, emg_simulate
Examples
>>> import neurokit2 as nk
>>>
>>> ppg = nk.ppg_simulate(duration=40, sampling_rate=500, heart_rate=75, random_state=42)
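The landmark-and-interpolation idea described above can be sketched in plain Python: place (x, y) coordinates for the wave onset, systolic peak, dicrotic notch and diastolic peak, then interpolate them onto a regular time grid. Linear interpolation is used here only to keep the sketch dependency-free (NeuroKit uses smoother interpolation), and the landmark coordinates are illustrative, not the package's values.

```python
def interpolate_landmarks(x, y, sampling_rate=100):
    """Linearly interpolate (x, y) landmark coordinates onto a regular
    time grid with the given sampling rate (piecewise-linear sketch)."""
    step = 1.0 / sampling_rate
    out, j, t = [], 0, x[0]
    while t <= x[-1] + 1e-9:
        # Advance to the segment [x[j], x[j+1]] that contains t
        while j < len(x) - 2 and t > x[j + 1]:
            j += 1
        frac = (t - x[j]) / (x[j + 1] - x[j])
        out.append(y[j] + frac * (y[j + 1] - y[j]))
        t += step
    return out

# One PPG wave: onset, systolic peak, dicrotic notch, diastolic peak, next onset
x = [0.0, 0.15, 0.35, 0.45, 0.8]   # landmark times (s), illustrative
y = [0.0, 1.00, 0.45, 0.55, 0.0]   # landmark amplitudes, illustrative
wave = interpolate_landmarks(x, y, sampling_rate=100)
```

Repeating such waves with jittered landmark positions (frequency modulation, IBI randomness) and adding drift, motion and powerline components yields the full simulated signal.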
HRV¶
 hrv(peaks, sampling_rate=1000, show=False, **kwargs)[source]¶
Computes indices of Heart Rate Variability (HRV).
Computes HRV indices in the time, frequency, and nonlinear domain. Note that a minimum duration of the signal containing the peaks is recommended for some HRV indices to be meaningful. For instance, 1, 2 and 5 minutes of high quality signal are the recommended minima for HF, LF and LF/HF, respectively. See references for details.
 Parameters
peaks (dict) – Samples at which cardiac extrema (i.e., Rpeaks, systolic peaks) occur. Can be a list of indices or the output(s) of other functions such as ecg_peaks, ppg_peaks, ecg_process or bio_process.
sampling_rate (int, optional) – Sampling rate (Hz) of the continuous cardiac signal in which the peaks occur. Should be at least twice as high as the highest frequency in vhf. By default 1000.
show (bool, optional) – If True, returns the plots generated for each of the domains.
 Returns
DataFrame – Contains HRV metrics from three domains:
frequency (see hrv_frequency)
time (see hrv_time)
nonlinear (see hrv_nonlinear)
If RSP data is provided (e.g., the output of bio_process), RSA indices are also computed (see hrv_rsa). If no raw respiratory data is available, users can also choose to use ecg_rsp to obtain an ECG-derived respiratory signal, although this is not an ideal procedure.
See also
ecg_peaks, ppg_peaks, hrv_time, hrv_frequency, hrv_nonlinear, hrv_rsa
Examples
>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Clean signal and find peaks
>>> ecg_cleaned = nk.ecg_clean(data["ECG"], sampling_rate=100)
>>> peaks, info = nk.ecg_peaks(ecg_cleaned, sampling_rate=100, correct_artifacts=True)
>>>
>>> # Compute HRV indices
>>> hrv_indices = nk.hrv(peaks, sampling_rate=100, show=True)
>>> hrv_indices
>>>
>>> # Compute HRV from processed signals
>>> signals, info = nk.bio_process(data, sampling_rate=100)
>>> hrv = nk.hrv(signals, sampling_rate=100, show=True)
>>> hrv
References
Pham, T., Lau, Z. J., Chen, S. H. A., & Makowski, D. (2021). Heart Rate Variability in Psychology: A Review of HRV Indices and an Analysis Tutorial. Sensors, 21(12), 3998. https://doi.org/10.3390/s21123998
Stein, P. K. (2002). Assessing heart rate variability from real-world Holter reports. Cardiac electrophysiology review, 6(3), 239-244.
Shaffer, F., & Ginsberg, J. P. (2017). An overview of heart rate variability metrics and norms. Frontiers in public health, 5, 258.
 hrv_frequency(peaks, sampling_rate=1000, ulf=(0, 0.0033), vlf=(0.0033, 0.04), lf=(0.04, 0.15), hf=(0.15, 0.4), vhf=(0.4, 0.5), psd_method='welch', show=False, silent=True, normalize=True, order_criteria=None, **kwargs)[source]¶
Computes frequencydomain indices of Heart Rate Variability (HRV).
Note that a minimum duration of the signal containing the peaks is recommended for some HRV indices to be meaningful. For instance, 1, 2 and 5 minutes of high quality signal are the recommended minima for HF, LF and LF/HF, respectively. See references for details.
 Parameters
peaks (dict) – Samples at which cardiac extrema (i.e., Rpeaks, systolic peaks) occur. Can be a list of indices or the output(s) of other functions such as ecg_peaks, ppg_peaks, ecg_process or bio_process.
sampling_rate (int, optional) – Sampling rate (Hz) of the continuous cardiac signal in which the peaks occur. Should be at least twice as high as the highest frequency in vhf. By default 1000.
ulf (tuple, optional) – Lower and upper limit of the ultra-low frequency band. By default (0, 0.0033).
vlf (tuple, optional) – Lower and upper limit of the very-low frequency band. By default (0.0033, 0.04).
lf (tuple, optional) – Upper and lower limit of the low frequency band. By default (0.04, 0.15).
hf (tuple, optional) – Upper and lower limit of the high frequency band. By default (0.15, 0.4).
vhf (tuple, optional) – Lower and upper limit of the very-high frequency band. By default (0.4, 0.5).
psd_method (str) – Method used for spectral density estimation. For details see signal.signal_power. By default “welch”.
silent (bool) – If False, warnings will be printed. Defaults to True.
show (bool) – If True, will plot the power in the different frequency bands.
normalize (bool) – Normalization of power by maximum PSD value. Defaults to True. Normalization allows comparison between different PSD methods.
order_criteria (str) – The criteria to automatically select order in parametric PSD (only used for autoregressive (AR) methods such as ‘burg’). Defaults to None.
**kwargs (optional) – Other arguments.
 Returns
DataFrame – Contains frequency domain HRV metrics:
ULF: The spectral power density pertaining to the ultra-low frequency band, i.e., 0 to .0033 Hz by default.
VLF: The spectral power density pertaining to the very-low frequency band, i.e., .0033 to .04 Hz by default.
LF: The spectral power density pertaining to the low frequency band, i.e., .04 to .15 Hz by default.
HF: The spectral power density pertaining to the high frequency band, i.e., .15 to .4 Hz by default.
VHF: The variability, or signal power, in the very-high frequency band, i.e., .4 to .5 Hz by default.
LFn: The normalized low frequency, obtained by dividing the low frequency power by the total power.
HFn: The normalized high frequency, obtained by dividing the high frequency power by the total power.
LnHF: The log-transformed HF.
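The normalized and log-transformed indices follow directly from the absolute band powers. A minimal sketch; the power values below are made-up illustrative numbers, not outputs of NeuroKit:

```python
import math

# Hypothetical absolute band powers (illustrative values only)
power = {"ULF": 10.0, "VLF": 600.0, "LF": 800.0, "HF": 400.0, "VHF": 40.0}

total = sum(power.values())        # total power across all bands
lf_n = power["LF"] / total         # LFn: LF power divided by total power
hf_n = power["HF"] / total         # HFn: HF power divided by total power
ln_hf = math.log(power["HF"])      # LnHF: natural log of the HF power
```

Because LFn and HFn share the same denominator, they can be compared directly across recordings with different total power.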
See also
ecg_peaks, ppg_peaks, hrv_summary, hrv_time, hrv_nonlinear
Examples
>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Find peaks
>>> peaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100)
>>>
>>> # Compute HRV indices using different PSD methods
>>> hrv_welch = nk.hrv_frequency(peaks, sampling_rate=100, show=True, psd_method="welch")
>>> hrv_burg = nk.hrv_frequency(peaks, sampling_rate=100, show=True, psd_method="burg")
>>> hrv_lomb = nk.hrv_frequency(peaks, sampling_rate=100, show=True, psd_method="lomb")
>>> hrv_multitapers = nk.hrv_frequency(peaks, sampling_rate=100, show=True, psd_method="multitapers")
References
Stein, P. K. (2002). Assessing heart rate variability from real-world Holter reports. Cardiac electrophysiology review, 6(3), 239-244.
Shaffer, F., & Ginsberg, J. P. (2017). An overview of heart rate variability metrics and norms. Frontiers in public health, 5, 258.
Boardman, A., Schlindwein, F. S., & Rocha, A. P. (2002). A study on the optimum order of autoregressive models for heart rate variability. Physiological measurement, 23(2), 325.
Bachler, M. (2017). Spectral Analysis of Unevenly Spaced Data: Models and Application in Heart Rate Variability. Simul. Notes Eur., 27(4), 183-190.
 hrv_nonlinear(peaks, sampling_rate=1000, show=False, **kwargs)[source]¶
Computes nonlinear indices of Heart Rate Variability (HRV).
See references for details.
 Parameters
peaks (dict) – Samples at which cardiac extrema (i.e., Rpeaks, systolic peaks) occur. Can be a list of indices or the output(s) of other functions such as ecg_peaks, ppg_peaks, ecg_process or bio_process.
sampling_rate (int, optional) – Sampling rate (Hz) of the continuous cardiac signal in which the peaks occur. Should be at least twice as high as the highest frequency in vhf. By default 1000.
dfa_windows (list) – A list of tuples containing the number of heartbeats to compute the DFA short-term scaling exponent, α1, and the long-term scaling exponent, α2, respectively. Defaults to [[4, 11], [12, None]], where α1 is estimated from 4 to 11 heartbeats and α2 is estimated from a larger number of heartbeats, i.e., 12 beats and above, based on Acharya et al. (2002).
show (bool, optional) – If True, will return a Poincaré plot, a scattergram, which plots each RR interval against the next successive one. The ellipse centers around the average RR interval. By default False.
**kwargs (optional) – Other arguments to be passed into fractal_dfa() and fractal_correlation().
 Returns
DataFrame – Contains nonlinear HRV metrics:
Characteristics of the Poincaré Plot Geometry:
SD1: a measure of the spread of RR intervals on the Poincaré plot perpendicular to the line of identity. It is an index of short-term RR interval fluctuations, i.e., beat-to-beat variability. It is equivalent (although on another scale) to RMSSD, and therefore it is redundant to report correlations with both (Ciccone, 2017).
SD2: a measure of the spread of RR intervals on the Poincaré plot along the line of identity. It is an index of long-term RR interval fluctuations.
SD1SD2: the ratio between short- and long-term fluctuations of the RR intervals (SD1 divided by SD2).
S: the area of the ellipse described by SD1 and SD2 (pi * SD1 * SD2). It is proportional to SD1SD2.
CSI: the Cardiac Sympathetic Index (Toichi, 1997), calculated by dividing the longitudinal variability of the Poincaré plot (4*SD2) by its transverse variability (4*SD1).
CVI: the Cardiac Vagal Index (Toichi, 1997), equal to the logarithm of the product of longitudinal (4*SD2) and transverse (4*SD1) variability.
CSI_Modified: the modified CSI (Jeppesen, 2014), obtained by dividing the square of the longitudinal variability by its transverse variability.
Indices of Heart Rate Asymmetry (HRA), i.e., asymmetry of the Poincaré plot (Yan, 2017):
GI: Guzik’s Index, defined as the distance of points above the line of identity (LI) to LI divided by the distance of all points in the Poincaré plot to LI except those that are located on LI.
SI: Slope Index, defined as the phase angle of points above LI divided by the phase angle of all points in the Poincaré plot except those that are located on LI.
AI: Area Index, defined as the cumulative area of the sectors corresponding to the points that are located above LI divided by the cumulative area of sectors corresponding to all points in the Poincaré plot except those that are located on LI.
PI: Porta’s Index, defined as the number of points below LI divided by the total number of points in the Poincaré plot except those that are located on LI.
SD1d and SD1a: short-term variance of contributions of decelerations (prolongations of RR intervals) and accelerations (shortenings of RR intervals), respectively (Piskorski, 2011).
C1d and C1a: the contributions of heart rate decelerations and accelerations to short-term HRV, respectively (Piskorski, 2011).
SD2d and SD2a: long-term variance of contributions of decelerations (prolongations of RR intervals) and accelerations (shortenings of RR intervals), respectively (Piskorski, 2011).
C2d and C2a: the contributions of heart rate decelerations and accelerations to long-term HRV, respectively (Piskorski, 2011).
SDNNd and SDNNa: total variance of contributions of decelerations (prolongations of RR intervals) and accelerations (shortenings of RR intervals), respectively (Piskorski, 2011).
Cd and Ca: the total contributions of heart rate decelerations and accelerations to HRV.
Indices of Heart Rate Fragmentation (Costa, 2017):
PIP: Percentage of inflection points of the RR intervals series.
IALS: Inverse of the average length of the acceleration/deceleration segments.
PSS: Percentage of short segments.
PAS: Percentage of NN intervals in alternation segments.
Indices of Complexity:
ApEn: The approximate entropy measure of HRV, calculated by entropy_approximate().
SampEn: The sample entropy measure of HRV, calculated by entropy_sample().
ShanEn: The Shannon entropy measure of HRV, calculated by entropy_shannon().
FuzzyEn: The fuzzy entropy measure of HRV, calculated by entropy_fuzzy().
MSE: The multiscale entropy measure of HRV, calculated by entropy_multiscale().
CMSE: The composite multiscale entropy measure of HRV, calculated by entropy_multiscale().
RCMSE: The refined composite multiscale entropy measure of HRV, calculated by entropy_multiscale().
CD: The Correlation Dimension of the HR signal, calculated by fractal_correlation().
HFD: The Higuchi’s Fractal Dimension of the HR signal, calculated by fractal_higuchi(). kmax is set to “default”.
KFD: The Katz’s Fractal Dimension of the HR signal, calculated by fractal_katz().
LZC: The LempelZiv complexity of the HR signal, calculated by fractal_lempelziv().
DFA_alpha1: The monofractal detrended fluctuation analysis of the HR signal corresponding to short-term correlations, calculated by fractal_dfa().
DFA_alpha2: The monofractal detrended fluctuation analysis of the HR signal corresponding to long-term correlations, calculated by fractal_dfa().
DFA_alpha1_ExpRange: The multifractal detrended fluctuation analysis of the HR signal corresponding to short-term correlations, calculated by fractal_dfa(). ExpRange is the range of singularity exponents, corresponding to the width of the singularity spectrum.
DFA_alpha2_ExpRange: The multifractal detrended fluctuation analysis of the HR signal corresponding to long-term correlations, calculated by fractal_dfa(). ExpRange is the range of singularity exponents, corresponding to the width of the singularity spectrum.
DFA_alpha1_ExpMean: Multifractal DFA. ExpMean is the mean of singularity exponents.
DFA_alpha2_ExpMean: Multifractal DFA. ExpMean is the mean of singularity exponents.
DFA_alpha1_DimRange: The multifractal detrended fluctuation analysis of the HR signal corresponding to short-term correlations, calculated by fractal_dfa(). DimRange is the range of singularity dimensions, corresponding to the height of the singularity spectrum.
DFA_alpha2_DimRange: The multifractal detrended fluctuation analysis of the HR signal corresponding to long-term correlations, calculated by fractal_dfa(). DimRange is the range of singularity dimensions, corresponding to the height of the singularity spectrum.
DFA_alpha1_DimMean: Multifractal DFA. DimMean is the mean of singularity dimensions.
DFA_alpha2_DimMean: Multifractal DFA. DimMean is the mean of singularity dimensions.
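The core Poincaré descriptors listed above can be sketched from RR intervals alone, using the standard identities SD1² = Var(ΔRR)/2 and SD2² = 2·SDNN² − SD1², and the Toichi (1997) definitions of CSI and CVI with the 4·SD longitudinal/transverse axes. This is a dependency-free illustration: the RR values are made up, and NeuroKit's implementation may differ in details (e.g., sample vs. population variance).

```python
import math
import statistics

def poincare_indices(rr):
    """Poincaré plot descriptors from a list of RR intervals (ms)."""
    sdnn = statistics.pstdev(rr)                        # overall variability
    sdsd = statistics.pstdev([b - a for a, b in zip(rr, rr[1:])])
    sd1 = sdsd / math.sqrt(2)                           # short-term spread
    sd2 = math.sqrt(max(2 * sdnn ** 2 - sd1 ** 2, 0))   # long-term spread
    return {
        "SD1": sd1,
        "SD2": sd2,
        "S": math.pi * sd1 * sd2,                       # ellipse area
        "CSI": sd2 / sd1,                               # (4*SD2) / (4*SD1)
        "CVI": math.log10((4 * sd1) * (4 * sd2)),       # log of axis product
    }

# Made-up, mildly drifting RR intervals (ms)
indices = poincare_indices([800, 810, 820, 815, 830, 840, 835, 850])
```

For such slowly drifting intervals the long-term spread dominates (SD2 > SD1), which is the physiologically typical pattern.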
See also
ecg_peaks, ppg_peaks, hrv_frequency, hrv_time, hrv_summary
Examples
>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Find peaks
>>> peaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100)
>>>
>>> # Compute HRV indices
>>> hrv = nk.hrv_nonlinear(peaks, sampling_rate=100, show=True)
>>> hrv
References
Yan, C., Li, P., Ji, L., Yao, L., Karmakar, C., & Liu, C. (2017). Area asymmetry of heart rate variability signal. Biomedical engineering online, 16(1), 112.
Ciccone, A. B., Siedlik, J. A., Wecht, J. M., Deckert, J. A., Nguyen, N. D., & Weir, J. P. (2017). Reminder: RMSSD and SD1 are identical heart rate variability metrics. Muscle & nerve, 56(4), 674-678.
Shaffer, F., & Ginsberg, J. P. (2017). An overview of heart rate variability metrics and norms. Frontiers in public health, 5, 258.
Costa, M. D., Davis, R. B., & Goldberger, A. L. (2017). Heart rate fragmentation: a new approach to the analysis of cardiac interbeat interval dynamics. Front. Physiol. 8, 255.
Jeppesen, J., Beniczky, S., Johansen, P., Sidenius, P., & Fuglsang-Frederiksen, A. (2014). Using Lorenz plot and Cardiac Sympathetic Index of heart rate variability for detecting seizures for patients with epilepsy. In 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (pp. 4563-4566). IEEE.
Piskorski, J., & Guzik, P. (2011). Asymmetric properties of long-term and total heart rate variability. Medical & biological engineering & computing, 49(11), 1289-1297.
Stein, P. K. (2002). Assessing heart rate variability from real-world Holter reports. Cardiac electrophysiology review, 6(3), 239-244.
Brennan, M. et al. (2001). Do Existing Measures of Poincaré Plot Geometry Reflect Nonlinear Features of Heart Rate Variability?. IEEE Transactions on Biomedical Engineering, 48(11), 1342-1347.
Toichi, M., Sugiura, T., Murai, T., & Sengoku, A. (1997). A new method of assessing cardiac autonomic function and its comparison with spectral analysis and coefficient of variation of R–R interval. Journal of the autonomic nervous system, 62(1-2), 79-84.
Acharya, R. U., Lim, C. M., & Joseph, P. (2002). Heart rate variability analysis using correlation dimension and detrended fluctuation analysis. Itbm-Rbm, 23(6), 333-339.
 hrv_rsa(ecg_signals, rsp_signals=None, rpeaks=None, sampling_rate=1000, continuous=False, window=None, window_number=None)[source]¶
Respiratory Sinus Arrhythmia (RSA)
Respiratory sinus arrhythmia (RSA), also referred to as ‘cardiac coherence’ or ‘physiological coherence’ (though these terms often encompass a wider meaning), is the naturally occurring variation in heart rate during the breathing cycle. Metrics quantifying it are often used as a measure of parasympathetic nervous system activity. Neurophysiology informs us that the functional output of the myelinated vagus originating from the nucleus ambiguus has a respiratory rhythm. Thus, there would be a temporal relation between the respiratory rhythm being expressed in the firing of these efferent pathways and the functional effect on the heart rate rhythm manifested as RSA. Importantly, several methods exist to quantify RSA:
The Peak-to-trough (P2T) algorithm measures the statistical range, in milliseconds, of the heart
period oscillation associated with synchronous respiration. Operationally, subtracting the shortest heart period during inspiration from the longest heart period during a breath cycle produces an estimate of RSA for each breath. The peak-to-trough method makes no statistical assumption or correction (e.g., adaptive filtering) regarding other sources of variance in the heart period time series that may confound, distort, or interact with the metric, such as slower periodicities and baseline trend. Although it has been proposed that the P2T method “acts as a time-domain filter dynamically centered at the exact ongoing respiratory frequency” (Grossman, 1992), the method does not transform the time series in any way, as a filtering process would. Instead, the method uses knowledge of the ongoing respiratory cycle to associate segments of the heart period time series with either inhalation or exhalation (Lewis, 2012).
The Porges-Bohrer (PB) algorithm assumes that the heart period time series reflects the sum of several
component time series. Each of these component time series may be mediated by different neural mechanisms and may have different statistical features. The Porges-Bohrer method applies an algorithm that selectively extracts RSA, even when the periodic process representing RSA is superimposed on a complex baseline that may include aperiodic and slow periodic processes. Since the method is designed to remove sources of variance in the heart period time series other than the variance within the frequency band of spontaneous breathing, it is capable of accurately quantifying RSA even when the signal-to-noise ratio is low.
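The P2T procedure described above can be sketched in a few lines. This is an illustrative toy with hypothetical inputs (beat times and heart periods are made up), not NeuroKit's implementation:

```python
import numpy as np

def rsa_p2t(heart_periods, period_times, breath_onsets):
    """Peak-to-trough RSA sketch: per breath cycle, longest minus shortest
    heart period. Inputs are hypothetical: heart periods in ms, beat times
    and breath-cycle onset times in seconds."""
    estimates = []
    for start, end in zip(breath_onsets[:-1], breath_onsets[1:]):
        # Heart periods whose beats fall inside this breath cycle
        mask = (period_times >= start) & (period_times < end)
        cycle = heart_periods[mask]
        if len(cycle) >= 2:
            # RSA estimate = longest minus shortest heart period in the cycle
            estimates.append(cycle.max() - cycle.min())
        else:
            estimates.append(np.nan)  # too few beats in this cycle: no estimate
    return np.array(estimates)
```

Cycles with fewer than two beats produce no estimate, which is what the "RSA_P2T_NoRSA" count below reflects.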
 Parameters
ecg_signals (DataFrame) – DataFrame obtained from ecg_process(). Should contain columns ECG_Rate and ECG_R_Peaks. Can also take a DataFrame comprising both ECG and RSP signals, generated by bio_process().
rsp_signals (DataFrame) – DataFrame obtained from rsp_process(). Should contain columns RSP_Phase and RSP_Phase_Completion. Has no effect when a DataFrame comprising both ECG and RSP signals is passed as ecg_signals. Defaults to None.
rpeaks (dict) – The samples at which the R-peaks of the ECG signal occur. Dict returned by ecg_peaks(), ecg_process(), or bio_process(). Defaults to None.
sampling_rate (int) – The sampling frequency of signals (in Hz, i.e., samples/second).
continuous (bool) – If False, will return RSA properties computed from the data (one value per index). If True, will return continuous estimations of RSA of the same length as the signal. See below for more details.
window (int) – For calculating RSA second by second. Length of each segment in seconds. If None (default), window will be set at 32 seconds.
window_number (int) – Between 2 and 8. For calculating RSA second by second. Number of windows to be calculated in Peak Matched Multiple Window. If None (default), window_number will be set at 8.
 Returns
rsa (dict) – A dictionary containing the RSA features, which includes:
“RSA_P2T_Values”: the estimate of RSA during each breath cycle, produced by subtracting the shortest heart period (or RR interval) from the longest heart period, in ms.
“RSA_P2T_Mean”: the mean peak-to-trough estimate across all cycles, in ms.
“RSA_P2T_Mean_log”: the logarithm of the mean of the RSA estimates.
“RSA_P2T_SD”: the standard deviation of all RSA estimates.
“RSA_P2T_NoRSA”: the number of breath cycles from which RSA could not be calculated.
“RSA_PorgesBohrer”: the Porges-Bohrer estimate of RSA, optimal when the signal-to-noise ratio is low, in ln(ms^2).
Example
>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data
>>> ecg_signals, info = nk.ecg_process(data["ECG"], sampling_rate=100)
>>> rsp_signals, _ = nk.rsp_process(data["RSP"], sampling_rate=100)
>>>
>>> # Get RSA features
>>> nk.hrv_rsa(ecg_signals, rsp_signals, info, sampling_rate=100, continuous=False)
{'RSA_P2T_Mean': ..., 'RSA_P2T_Mean_log': ..., 'RSA_P2T_SD': ..., 'RSA_P2T_NoRSA': ..., 'RSA_PorgesBohrer': ..., 'RSA_Gates_Mean': ..., 'RSA_Gates_Mean_log': ..., 'RSA_Gates_SD': ...}
>>>
>>> # Get RSA as a continuous signal
>>> rsa = nk.hrv_rsa(ecg_signals, rsp_signals, info, sampling_rate=100, continuous=True)
>>> rsa
      RSA_P2T  RSA_Gates
0         ...        ...
1         ...        ...
2         ...        ...
...       ...        ...
[15000 rows x 2 columns]
>>> nk.signal_plot([ecg_signals["ECG_Rate"], rsp_signals["RSP_Rate"], rsa], standardize=True)
References
Servant, D., Logier, R., Mouster, Y., & Goudemand, M. (2009). La variabilité de la fréquence cardiaque. Intérêts en psychiatrie. L’Encéphale, 35(5), 423–428. doi:10.1016/j.encep.2008.06.016
Lewis, G. F., Furman, S. A., McCool, M. F., & Porges, S. W. (2012). Statistical strategies to quantify respiratory sinus arrhythmia: Are commonly used metrics equivalent? Biological Psychology, 89(2), 349-364.
Zohar, A. H., Cloninger, C. R., & McCraty, R. (2013). Personality and heart rate variability: exploring pathways from personality to cardiac coherence and health. Open Journal of Social Sciences, 1(06), 32.
 hrv_time(peaks, sampling_rate=1000, show=False, **kwargs)[source]¶
Computes time-domain indices of Heart Rate Variability (HRV).
See references for details.
 Parameters
peaks (dict) – Samples at which cardiac extrema (i.e., R-peaks, systolic peaks) occur. Can be a list of indices or the output(s) of other functions such as ecg_peaks, ppg_peaks, ecg_process or bio_process.
sampling_rate (int, optional) – Sampling rate (Hz) of the continuous cardiac signal in which the peaks occur. Should be at least twice as high as the highest frequency in vhf. By default 1000.
show (bool) – If True, will plot the distribution of RR intervals.
 Returns
DataFrame – Contains time-domain HRV metrics:
- MeanNN: The mean of the RR intervals.
- SDNN: The standard deviation of the RR intervals.
- SDANN1, SDANN2, SDANN5: The standard deviation of average RR intervals extracted from n-minute segments of the time series (1, 2 and 5 minutes by default). Note that these indices require a minimal signal duration (3, 6 and 15 minutes respectively) and will be silently skipped if the data provided is too short.
- SDNNI1, SDNNI2, SDNNI5: The mean of the standard deviations of RR intervals extracted from n-minute segments of the time series (1, 2 and 5 minutes by default). The same minimal-duration requirements apply.
- RMSSD: The square root of the mean of the squared successive differences between adjacent RR intervals. It is equivalent (although on another scale) to SD1, and therefore it is redundant to report correlations with both (Ciccone, 2017).
- SDSD: The standard deviation of the successive differences between RR intervals.
- CVNN: The standard deviation of the RR intervals (SDNN) divided by the mean of the RR intervals (MeanNN).
- CVSD: The root mean square of successive differences (RMSSD) divided by the mean of the RR intervals (MeanNN).
- MedianNN: The median of the RR intervals.
- MadNN: The median absolute deviation of the RR intervals.
- HCVNN: The median absolute deviation of the RR intervals (MadNN) divided by the median of the RR intervals (MedianNN).
- IQRNN: The interquartile range (IQR) of the RR intervals.
- pNN50: The proportion of successive RR interval differences larger than 50 ms, out of the total number of RR intervals.
- pNN20: The proportion of successive RR interval differences larger than 20 ms, out of the total number of RR intervals.
- TINN: A geometrical parameter of the HRV, or more specifically, the baseline width of the RR interval distribution obtained by triangular interpolation, where the error of least squares determines the triangle. It is an approximation of the RR interval distribution.
- HTI: The HRV triangular index, i.e., the total number of RR intervals divided by the height of the RR interval histogram.
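Several of the indices above are simple statistics of the RR-interval series. The following is a minimal NumPy sketch for a handful of them (illustrative only; nk.hrv_time() is the supported API and handles peak inputs, units, and edge cases):

```python
import numpy as np

def hrv_time_sketch(rr):
    """Compute a few time-domain HRV indices from RR intervals in ms.

    Simplified sketch of the definitions, not NeuroKit's implementation.
    """
    rr = np.asarray(rr, dtype=float)
    diff = np.diff(rr)  # successive differences between adjacent RR intervals
    return {
        "MeanNN": np.mean(rr),
        "SDNN": np.std(rr, ddof=1),
        "RMSSD": np.sqrt(np.mean(diff ** 2)),   # root mean square of successive diffs
        "SDSD": np.std(diff, ddof=1),
        "CVNN": np.std(rr, ddof=1) / np.mean(rr),
        "pNN50": np.mean(np.abs(diff) > 50) * 100,  # % of |successive diffs| > 50 ms
    }
```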
See also
ecg_peaks, ppg_peaks, hrv_frequency, hrv_summary, hrv_nonlinear
Examples
>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Find peaks
>>> peaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100)
>>>
>>> # Compute HRV indices
>>> hrv = nk.hrv_time(peaks, sampling_rate=100, show=True)
References
Ciccone, A. B., Siedlik, J. A., Wecht, J. M., Deckert, J. A., Nguyen, N. D., & Weir, J. P. (2017). Reminder: RMSSD and SD1 are identical heart rate variability metrics. Muscle & Nerve, 56(4), 674-678.
Stein, P. K. (2002). Assessing heart rate variability from real-world Holter reports. Cardiac Electrophysiology Review, 6(3), 239-244.
Shaffer, F., & Ginsberg, J. P. (2017). An overview of heart rate variability metrics and norms.
Frontiers in public health, 5, 258.
RSP¶
Submodule for NeuroKit.
 rsp_amplitude(rsp_cleaned, peaks, troughs=None, interpolation_method='monotone_cubic')[source]¶
Compute respiratory amplitude.
Compute respiratory amplitude given the raw respiration signal and its extrema.
 Parameters
rsp_cleaned (Union[list, np.array, pd.Series]) – The cleaned respiration channel as returned by rsp_clean().
peaks (list or array or DataFrame or Series or dict) – The samples at which the inhalation peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().
troughs (list or array or DataFrame or Series or dict) – The samples at which the inhalation troughs occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().
interpolation_method (str) – Method used to interpolate the amplitude between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the ydirection). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.
 Returns
array – A vector containing the respiratory amplitude.
See also
rsp_clean, rsp_peaks, signal_rate, rsp_process, rsp_plot
Examples
>>> import neurokit2 as nk
>>> import pandas as pd
>>>
>>> rsp = nk.rsp_simulate(duration=90, respiratory_rate=15)
>>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000)
>>> signals, info = nk.rsp_peaks(cleaned)
>>>
>>> amplitude = nk.rsp_amplitude(cleaned, signals)
>>> fig = nk.signal_plot(pd.DataFrame({"RSP": rsp, "Amplitude": amplitude}), subplots=True)
>>> fig
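The no-overshoot property that motivates the ‘monotone_cubic’ default can be demonstrated with SciPy's PchipInterpolator. This is a toy illustration with made-up peak/trough indices, not NeuroKit's internals:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# Hypothetical extrema indices for a stand-in "cleaned RSP" signal
peaks = np.array([100, 300, 520])         # sample indices of inhalation peaks
troughs = np.array([40, 230, 450])        # sample indices of preceding troughs
signal = np.sin(np.linspace(0, 10, 600))  # stand-in for a cleaned RSP signal

# One amplitude value per breath: peak height minus preceding trough height
amplitude = signal[peaks] - signal[troughs]

# Monotone cubic (PCHIP) interpolation between breaths, evaluated at every
# sample; unlike a cubic spline, it never overshoots the known values
interp = PchipInterpolator(peaks, amplitude)
continuous = interp(np.arange(peaks[0], peaks[-1]))
```

Because PCHIP is monotone between data points, the interpolated amplitude always stays within the range of the per-breath values, ruling out physiologically implausible excursions.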
 rsp_analyze(data, sampling_rate=1000, method='auto', subepoch_rate=[None, None])[source]¶
Performs RSP analysis on either epochs (event-related analysis) or on longer periods of data such as resting-state data.
 Parameters
data (dict or DataFrame) – A dictionary of epochs, containing one DataFrame per epoch, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df(). Can also take a DataFrame of processed signals from a longer period of data, typically generated by rsp_process() or bio_process(). Can also take a dict containing sets of separate periods of data.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Defaults to 1000Hz.
method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’, where the right method is chosen based on the mean duration of the data (‘event-related’ for durations under 10s).
subepoch_rate (list) – For event-related analysis, a smaller “subepoch” within the epoch of an event can be specified. The rate-related features of this “subepoch” (e.g., RSP_Rate, RSP_Rate_Max), relative to the baseline (where applicable), will be computed. The first value of the list specifies the start of the subepoch and the second specifies the end of the subepoch (in seconds), e.g., subepoch_rate = [1, 3] or subepoch_rate = [1, None]. Defaults to [None, None].
 Returns
DataFrame – A dataframe containing the analyzed RSP features. If event-related analysis is conducted, each epoch is indicated by the Label column. See rsp_eventrelated() and rsp_intervalrelated() docstrings for details.
See also
bio_process, rsp_process, epochs_create, rsp_eventrelated, rsp_intervalrelated
Examples
>>> import neurokit2 as nk
>>>
>>> # Example 1: Download the data for event-related analysis
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data for event-related analysis
>>> df, info = nk.bio_process(rsp=data["RSP"], sampling_rate=100)
>>> events = nk.events_find(data["Photosensor"], threshold_keep='below',
...                         event_conditions=["Negative", "Neutral", "Neutral", "Negative"])
>>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=1.9)
>>>
>>> # Analyze
>>> nk.rsp_analyze(epochs, sampling_rate=100)
>>>
>>> # Example 2: Download the resting-state data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.rsp_process(data["RSP"], sampling_rate=100)
>>>
>>> # Analyze
>>> nk.rsp_analyze(df, sampling_rate=100)
 rsp_clean(rsp_signal, sampling_rate=1000, method='khodadad2018')[source]¶
Preprocess a respiration (RSP) signal.
Clean a respiration signal using different sets of parameters, such as ‘khodadad2018’ (linear detrending followed by a fifth-order 2 Hz low-pass IIR Butterworth filter) or ‘biosppy’ (second-order 0.1-0.35 Hz bandpass Butterworth filter followed by constant detrending).
 Parameters
rsp_signal (Union[list, np.array, pd.Series]) – The raw respiration channel (as measured, for instance, by a respiration belt).
sampling_rate (int) – The sampling frequency of rsp_signal (in Hz, i.e., samples/second).
method (str) – The processing pipeline to apply. Can be one of “khodadad2018” (default) or “biosppy”.
 Returns
array – Vector containing the cleaned respiratory signal.
See also
rsp_findpeaks, signal_rate, rsp_amplitude, rsp_process, rsp_plot
Examples
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=30, sampling_rate=50, noise=0.01)
>>> signals = pd.DataFrame({"RSP_Raw": rsp,
...                         "RSP_Khodadad2018": nk.rsp_clean(rsp, sampling_rate=50, method="khodadad2018"),
...                         "RSP_BioSPPy": nk.rsp_clean(rsp, sampling_rate=50, method="biosppy")})
>>> fig = signals.plot()
>>> fig
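The ‘khodadad2018’ pipeline described above (linear detrending, then a fifth-order 2 Hz low-pass Butterworth filter) can be sketched with SciPy. This is an illustrative approximation; nk.rsp_clean() should be used in practice:

```python
import numpy as np
from scipy import signal as sps

def clean_rsp_sketch(rsp, sampling_rate=1000):
    """Rough sketch of a khodadad2018-style cleaning: linear detrend, then a
    fifth-order 2 Hz low-pass Butterworth filter (applied zero-phase here)."""
    detrended = sps.detrend(rsp, type="linear")  # remove linear trend
    b, a = sps.butter(5, 2, btype="lowpass", fs=sampling_rate)
    return sps.filtfilt(b, a, detrended)         # zero-phase filtering
```

Breathing rhythms sit well below 2 Hz, so the slow respiratory component passes through while high-frequency noise and the baseline trend are removed.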
 rsp_eventrelated(epochs, silent=False, subepoch_rate=[None, None])[source]¶
Performs event-related RSP analysis on epochs.
 Parameters
epochs (Union[dict, pd.DataFrame]) – A dict containing one DataFrame per event/trial, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df().
silent (bool) – If True, silence possible warnings.
subepoch_rate (list) – A smaller “subepoch” within the epoch of an event can be specified. The rate-related features of this “subepoch” (e.g., RSP_Rate, RSP_Rate_Max), relative to the baseline (where applicable), will be computed. The first value of the list specifies the start of the subepoch and the second specifies the end of the subepoch (in seconds), e.g., subepoch_rate = [1, 3] or subepoch_rate = [1, None]. Defaults to [None, None].
 Returns
DataFrame – A dataframe containing the analyzed RSP features for each epoch, with each epoch indicated by the Label column (if not present, by the Index column). The analyzed features consist of the following:
- “RSP_Rate_Max”: the maximum respiratory rate after stimulus onset.
- “RSP_Rate_Min”: the minimum respiratory rate after stimulus onset.
- “RSP_Rate_Mean”: the mean respiratory rate after stimulus onset.
- “RSP_Rate_SD”: the standard deviation of the respiratory rate after stimulus onset.
- “RSP_Rate_Max_Time”: the time at which maximum respiratory rate occurs.
- “RSP_Rate_Min_Time”: the time at which minimum respiratory rate occurs.
- “RSP_Amplitude_Max”: the change in maximum respiratory amplitude from before stimulus onset.
- “RSP_Amplitude_Min”: the change in minimum respiratory amplitude from before stimulus onset.
- “RSP_Amplitude_Mean”: the change in mean respiratory amplitude from before stimulus onset.
- “RSP_Amplitude_SD”: the standard deviation of the respiratory amplitude after stimulus onset.
- “RSP_Phase”: indication of whether the onset of the event concurs with respiratory inspiration (1) or expiration (0).
- “RSP_PhaseCompletion”: indication of the stage of the current respiration phase (0 to 1) at the onset of the event.
See also
events_find, epochs_create, bio_process
Examples
>>> import neurokit2 as nk
>>>
>>> # Example with simulated data
>>> rsp, info = nk.rsp_process(nk.rsp_simulate(duration=120))
>>> epochs = nk.epochs_create(rsp, events=[5000, 10000, 15000], epochs_start=-0.1, epochs_end=1.9)
>>>
>>> # Analyze
>>> rsp1 = nk.rsp_eventrelated(epochs)
>>> rsp1
>>>
>>> # Example with real data
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data
>>> df, info = nk.bio_process(rsp=data["RSP"], sampling_rate=100)
>>> events = nk.events_find(data["Photosensor"], threshold_keep='below',
...                         event_conditions=["Negative", "Neutral", "Neutral", "Negative"])
>>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=2.9)
>>>
>>> # Analyze
>>> rsp2 = nk.rsp_eventrelated(epochs)
>>> rsp2
 rsp_findpeaks(rsp_cleaned, sampling_rate=1000, method='khodadad2018', amplitude_min=0.3)[source]¶
Extract extrema in a respiration (RSP) signal.
Low-level function used by rsp_peaks() to identify inhalation peaks and exhalation troughs in a preprocessed respiration signal using different sets of parameters. See rsp_peaks() for details.
 Parameters
rsp_cleaned (Union[list, np.array, pd.Series]) – The cleaned respiration channel as returned by rsp_clean().
sampling_rate (int) – The sampling frequency of ‘rsp_cleaned’ (in Hz, i.e., samples/second).
method (str) – The processing pipeline to apply. Can be one of “khodadad2018” (default) or “biosppy”.
amplitude_min (float) – Only applies if method is “khodadad2018”. Extrema that have a vertical distance smaller than (amplitude_min * average vertical distance) to any direct neighbour are removed as false positive outliers, i.e., amplitude_min should be a positive float (the default is 0.3). Larger values correspond to more conservative thresholds (i.e., more extrema removed as outliers).
 Returns
info (dict) – A dictionary containing additional information, in this case the samples at which inhalation peaks and exhalation troughs occur, accessible with the keys “RSP_Peaks”, and “RSP_Troughs”, respectively.
See also
rsp_clean, rsp_fixpeaks, rsp_peaks, signal_rate, rsp_amplitude, rsp_process, rsp_plot
Examples
>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=30, respiratory_rate=15)
>>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000)
>>> info = nk.rsp_findpeaks(cleaned)
>>> fig = nk.events_plot([info["RSP_Peaks"], info["RSP_Troughs"]], cleaned)
>>> fig
 rsp_fixpeaks(peaks, troughs=None)[source]¶
Correct RSP peaks.
Low-level function used by rsp_peaks() to correct the peaks found by rsp_findpeaks(). Currently does not perform any correction for RSP. See rsp_peaks() for details.
 Parameters
peaks (list or array or DataFrame or Series or dict) – The samples at which the inhalation peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().
troughs (list or array or DataFrame or Series or dict) – The samples at which the inhalation troughs occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().
 Returns
info (dict) – A dictionary containing additional information, in this case the samples at which inhalation peaks and exhalation troughs occur, accessible with the keys “RSP_Peaks”, and “RSP_Troughs”, respectively.
See also
rsp_clean, rsp_findpeaks, rsp_peaks, rsp_amplitude, rsp_process, rsp_plot
Examples
>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=30, respiratory_rate=15)
>>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000)
>>> info = nk.rsp_findpeaks(cleaned)
>>> info = nk.rsp_fixpeaks(info)
>>> fig = nk.events_plot([info["RSP_Peaks"], info["RSP_Troughs"]], cleaned)
>>> fig
 rsp_intervalrelated(data, sampling_rate=1000)[source]¶
Performs RSP analysis on longer periods of data (typically > 10 seconds), such as resting-state data.
 Parameters
data (DataFrame or dict) – A DataFrame containing the different processed signal(s) as different columns, typically generated by rsp_process() or bio_process(). Can also take a dict containing sets of separately processed DataFrames.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
 Returns
DataFrame – A dataframe containing the analyzed RSP features. The analyzed features consist of the following:
- “RSP_Rate_Mean”: the mean respiratory rate.
- “RSP_Amplitude_Mean”: the mean respiratory amplitude.
- “RSP_RRV”: the different respiratory rate variability metrics. See rsp_rrv() docstrings for details.
- “RSP_Phase_Duration_Inspiration”: the average inspiratory duration.
- “RSP_Phase_Duration_Expiration”: the average expiratory duration.
- “RSP_Phase_Duration_Ratio”: the inspiratory-to-expiratory time ratio (I/E).
See also
bio_process, rsp_eventrelated
Examples
>>> import neurokit2 as nk
>>>
>>> # Download data
>>> data = nk.data("bio_resting_5min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.rsp_process(data["RSP"], sampling_rate=100)
>>>
>>> # Single dataframe is passed
>>> nk.rsp_intervalrelated(df, sampling_rate=100)
>>>
>>> epochs = nk.epochs_create(df, events=[0, 15000], sampling_rate=100, epochs_end=150)
>>> nk.rsp_intervalrelated(epochs)
 rsp_peaks(rsp_cleaned, sampling_rate=1000, method='khodadad2018', amplitude_min=0.3)[source]¶
Identify extrema in a respiration (RSP) signal.
This function runs rsp_findpeaks() and rsp_fixpeaks() to identify and process inhalation peaks and exhalation troughs in a preprocessed respiration signal, using different sets of parameters such as:
Khodadad et al. (2018) (https://iopscience.iop.org/article/10.1088/1361-6579/aad7e6/meta)
BioSPPy (https://github.com/PIA-Group/BioSPPy/blob/master/biosppy/signals/resp.py)
 Parameters
rsp_cleaned (Union[list, np.array, pd.Series]) – The cleaned respiration channel as returned by rsp_clean().
sampling_rate (int) – The sampling frequency of ‘rsp_cleaned’ (in Hz, i.e., samples/second).
method (str) – The processing pipeline to apply. Can be one of “khodadad2018” (default) or “biosppy”.
amplitude_min (float) – Only applies if method is “khodadad2018”. Extrema that have a vertical distance smaller than (amplitude_min * average vertical distance) to any direct neighbour are removed as false positive outliers, i.e., amplitude_min should be a positive float (the default is 0.3). Larger values correspond to more conservative thresholds (i.e., more extrema removed as outliers).
 Returns
info (dict) – A dictionary containing additional information, in this case the samples at which inhalation peaks and exhalation troughs occur, accessible with the keys “RSP_Peaks”, and “RSP_Troughs”, respectively, as well as the signals’ sampling rate.
peak_signal (DataFrame) – A DataFrame of same length as the input signal in which occurrences of inhalation peaks and exhalation troughs are marked as “1” in lists of zeros. Accessible with the keys “RSP_Peaks” and “RSP_Troughs” respectively.
See also
rsp_clean, signal_rate, rsp_findpeaks, rsp_fixpeaks, rsp_amplitude, rsp_process, rsp_plot
Examples
>>> import neurokit2 as nk
>>> import pandas as pd
>>>
>>> rsp = nk.rsp_simulate(duration=30, respiratory_rate=15)
>>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000)
>>> peak_signal, info = nk.rsp_peaks(cleaned, sampling_rate=1000)
>>>
>>> data = pd.concat([pd.DataFrame({"RSP": rsp}), peak_signal], axis=1)
>>> fig = nk.signal_plot(data)
>>> fig
 rsp_phase(peaks, troughs=None, desired_length=None)[source]¶
Compute respiratory phase (inspiration and expiration).
Finds the respiratory phase, labelled as 1 for inspiration and 0 for expiration.
 Parameters
peaks (list or array or DataFrame or Series or dict) – The samples at which the inhalation peaks occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().
troughs (list or array or DataFrame or Series or dict) – The samples at which the inhalation troughs occur. If a dict or a DataFrame is passed, it is assumed that these containers were obtained with rsp_findpeaks().
desired_length (int) – By default, the returned phase vector has the same number of elements as peaks. If set to an integer, the returned values will be interpolated between peaks over desired_length samples. Has no effect if a DataFrame is passed in as the peaks argument.
 Returns
signals (DataFrame) – A DataFrame of same length as rsp_signal containing the following columns:
- “RSP_Phase”: breathing phase, marked by “1” for inspiration and “0” for expiration.
- “RSP_Phase_Completion”: breathing phase completion, expressed as a fraction (from 0 to 1), representing the stage of the current respiratory phase.
See also
Examples
>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=30, respiratory_rate=15)
>>> cleaned = nk.rsp_clean(rsp, sampling_rate=1000)
>>> peak_signal, info = nk.rsp_peaks(cleaned)
>>>
>>> phase = nk.rsp_phase(peak_signal, desired_length=len(cleaned))
>>> fig = nk.signal_plot([rsp, phase], standardize=True)
>>> fig
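The labelling rule above (inspiration from a trough up to the next peak, expiration from a peak up to the next trough) can be sketched directly from the extrema indices. A simplified illustration, not nk.rsp_phase() itself:

```python
import numpy as np

def rsp_phase_sketch(peaks, troughs, length):
    """Label each sample 1 (inspiration) or 0 (expiration); samples outside
    any complete trough-peak or peak-trough segment stay NaN."""
    phase = np.full(length, np.nan)
    for t in troughs:
        nxt = peaks[peaks > t]
        if len(nxt):
            phase[t:nxt[0]] = 1  # trough -> next peak: inspiration
    for p in peaks:
        nxt = troughs[troughs > p]
        if len(nxt):
            phase[p:nxt[0]] = 0  # peak -> next trough: expiration
    return phase
```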
 rsp_plot(rsp_signals, sampling_rate=None)[source]¶
Visualize respiration (RSP) data.
 Parameters
rsp_signals (DataFrame) – DataFrame obtained from rsp_process().
sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).
Examples
>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=90, respiratory_rate=15)
>>> rsp_signals, info = nk.rsp_process(rsp, sampling_rate=1000)
>>> fig = nk.rsp_plot(rsp_signals)
>>> fig
 Returns
fig – Figure representing a plot of the processed rsp signals.
See also
 rsp_process(rsp_signal, sampling_rate=1000, method='khodadad2018')[source]¶
Process a respiration (RSP) signal.
Convenience function that automatically processes a respiration signal with one of the following methods:
Khodadad et al. (2018) (https://iopscience.iop.org/article/10.1088/1361-6579/aad7e6/meta)
BioSPPy (https://github.com/PIA-Group/BioSPPy/blob/master/biosppy/signals/resp.py)
 Parameters
rsp_signal (Union[list, np.array, pd.Series]) – The raw respiration channel (as measured, for instance, by a respiration belt).
sampling_rate (int) – The sampling frequency of rsp_signal (in Hz, i.e., samples/second).
method (str) – The processing pipeline to apply. Can be one of “khodadad2018” (default) or “biosppy”.
 Returns
signals (DataFrame) – A DataFrame of same length as rsp_signal containing the following columns:
- “RSP_Raw”: the raw signal.
- “RSP_Clean”: the cleaned signal.
- “RSP_Peaks”: the inhalation peaks marked as “1” in a list of zeros.
- “RSP_Troughs”: the exhalation troughs marked as “1” in a list of zeros.
- “RSP_Rate”: breathing rate interpolated between inhalation peaks.
- “RSP_Amplitude”: breathing amplitude interpolated between inhalation peaks.
- “RSP_Phase”: breathing phase, marked by “1” for inspiration and “0” for expiration.
- “RSP_Phase_Completion”: breathing phase completion, expressed as a fraction (from 0 to 1), representing the stage of the current respiratory phase.
info (dict) – A dictionary containing the samples at which inhalation peaks and exhalation troughs occur, accessible with the keys “RSP_Peaks”, and “RSP_Troughs” respectively, as well as the signals’ sampling rate.
See also
rsp_clean, rsp_findpeaks, signal_rate, rsp_amplitude, rsp_plot
Examples
>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=90, respiratory_rate=15)
>>> signals, info = nk.rsp_process(rsp, sampling_rate=1000)
>>> fig = nk.rsp_plot(signals)
>>> fig
 rsp_rate(rsp_cleaned, peaks=None, sampling_rate=1000, window=10, hop_size=1, method='peak', peak_method='khodadad2018', interpolation_method='monotone_cubic')[source]¶
Find the respiration rate, either from the detected peaks or via the cross-correlation method.
 Parameters
rsp_cleaned (Union[list, np.array, pd.Series]) – The cleaned respiration channel as returned by rsp_clean().
peaks (Union[list, np.array, pd.Series, pd.DataFrame]) – The respiration peaks as returned by rsp_peaks(). If None (default), respiration peaks will be automatically identified from the rsp_clean signal.
sampling_rate (int) – The sampling frequency of ‘rsp_cleaned’ (in Hz, i.e., samples/second).
window (int) – The duration of the sliding window (in seconds). Defaults to 10 seconds.
hop_size (int) – The number of samples between each successive window. Defaults to 1 sample.
method (str) – Can be either “peak” or “xcorr”. With “peak”, the rate is calculated from the periods between respiration peaks. With “xcorr”, cross-correlations between the changes in respiration and a bank of sinusoids of different frequencies are calculated to identify the principal frequency of oscillation.
peak_method (str) – Method to identify respiration peaks. Can be one of “khodadad2018” (default) or “biosppy”.
interpolation_method (str) – Method used to interpolate the rate between peaks. See signal_interpolate(). ‘monotone_cubic’ is chosen as the default interpolation method since it ensures monotone interpolation between data points (i.e., it prevents physiologically implausible “overshoots” or “undershoots” in the ydirection). In contrast, the widely used cubic spline interpolation does not ensure monotonicity.
 Returns
rsp_rate (np.ndarray) – Instantaneous respiration rate.
Example
>>> import neurokit2 as nk
>>>
>>> rsp_signal = nk.data("rsp_200hz.txt").iloc[:, 0]
>>> sampling_rate = 200
>>> rsp_cleaned = nk.rsp_clean(rsp_signal, sampling_rate=sampling_rate)
>>> rsp_rate_peak = nk.rsp_rate(rsp_cleaned, sampling_rate=sampling_rate, method="peak")
>>> rsp_rate_xcorr = nk.rsp_rate(rsp_cleaned, sampling_rate=sampling_rate, method="xcorr")
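For intuition, the “peak” method boils down to converting peak-to-peak periods into an instantaneous rate. A minimal sketch (the helper name `rate_from_peaks` is hypothetical, not part of NeuroKit):

```python
import numpy as np

def rate_from_peaks(peak_indices, sampling_rate):
    """Convert peak sample indices into instantaneous rates (breaths/min).

    Each period between two successive peaks yields one rate value:
    60 divided by the period in seconds.
    """
    periods = np.diff(np.asarray(peak_indices)) / sampling_rate  # seconds per breath
    return 60.0 / periods  # breaths per minute

# Peaks 4 seconds apart at 100 Hz -> a steady 15 breaths/min
rates = rate_from_peaks([0, 400, 800, 1200], sampling_rate=100)
print(rates)  # [15. 15. 15.]
```

NeuroKit then interpolates these per-breath values (monotone cubic by default) to obtain a continuous rate signal.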
 rsp_rrv(rsp_rate, peaks=None, sampling_rate=1000, show=False, silent=True)[source]¶
Computes time-domain and frequency-domain features for Respiratory Rate Variability (RRV) analysis.
 Parameters
rsp_rate (array) – Array containing the respiratory rate, produced by signal_rate().
peaks (dict) – The samples at which the inhalation peaks occur. Dict returned by rsp_peaks(). Defaults to None.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second).
show (bool) – If True, will return a Poincaré plot, a scattergram, which plots each breath-to-breath interval against the next successive one. The ellipse centers around the average breath-to-breath interval. Defaults to False.
silent (bool) – If False, warnings will be printed. Defaults to True.
 Returns
DataFrame – DataFrame consisting of the computed RRV metrics, which include:
“RRV_SDBB”: the standard deviation of the breath-to-breath intervals.
“RRV_RMSSD”: the root mean square of successive differences of the breath-to-breath intervals.
“RRV_SDSD”: the standard deviation of the successive differences between adjacent breath-to-breath intervals.
“RRV_BBx”: the number of successive interval differences that are greater than x seconds.
“RRV_pBBx”: the proportion of breath-to-breath intervals that are greater than x seconds, out of the total number of intervals.
“RRV_VLF”: spectral power density pertaining to the very low frequency band, i.e., 0 to .04 Hz by default.
“RRV_LF”: spectral power density pertaining to the low frequency band, i.e., .04 to .15 Hz by default.
“RRV_HF”: spectral power density pertaining to the high frequency band, i.e., .15 to .4 Hz by default.
“RRV_LFHF”: the ratio of low frequency power to high frequency power.
“RRV_LFn”: the normalized low frequency, obtained by dividing the low frequency power by the total power.
“RRV_HFn”: the normalized high frequency, obtained by dividing the high frequency power by the total power.
“RRV_SD1”: SD1 is a measure of the spread of breath-to-breath intervals on the Poincaré plot perpendicular to the line of identity. It is an index of short-term variability.
“RRV_SD2”: SD2 is a measure of the spread of breath-to-breath intervals on the Poincaré plot along the line of identity. It is an index of long-term variability.
“RRV_SD2SD1”: the ratio between short and long term fluctuations of the breath-to-breath intervals (SD2 divided by SD1).
“RRV_ApEn”: the approximate entropy of RRV, calculated by entropy_approximate().
“RRV_SampEn”: the sample entropy of RRV, calculated by entropy_sample().
“RRV_DFA_alpha1”: the short-term fluctuation value generated from Detrended Fluctuation Analysis, i.e., the root mean square deviation from the fitted trend of the breath-to-breath intervals. Only computed if there are more than 160 breath cycles in the signal.
“RRV_DFA_alpha2”: the long-term fluctuation value. Only computed if there are more than 640 breath cycles in the signal.
“RRV_alpha1_ExpRange”: Multifractal DFA. ExpRange is the range of singularity exponents, corresponding to the width of the singularity spectrum.
“RRV_alpha2_ExpRange”: Multifractal DFA. ExpRange is the range of singularity exponents, corresponding to the width of the singularity spectrum.
“RRV_alpha1_ExpMean”: Multifractal DFA. ExpMean is the mean of singularity exponents.
“RRV_alpha2_ExpMean”: Multifractal DFA. ExpMean is the mean of singularity exponents.
“RRV_alpha1_DimRange”: Multifractal DFA. DimRange is the range of singularity dimensions, corresponding to the height of the singularity spectrum.
“RRV_alpha2_DimRange”: Multifractal DFA. DimRange is the range of singularity dimensions, corresponding to the height of the singularity spectrum.
“RRV_alpha1_DimMean”: Multifractal DFA. DimMean is the mean of singularity dimensions.
“RRV_alpha2_DimMean”: Multifractal DFA. DimMean is the mean of singularity dimensions.
See also
signal_rate, rsp_peaks, signal_power, entropy_sample, entropy_approximate
Examples
>>> import neurokit2 as nk
>>>
>>> rsp = nk.rsp_simulate(duration=90, respiratory_rate=15)
>>> rsp, info = nk.rsp_process(rsp)
>>> rrv = nk.rsp_rrv(rsp, show=True)
References
Soni, R., & Muniyandi, M. (2019). Breath rate variability: a novel measure to study the meditation
effects. International Journal of Yoga, 12(1), 45.
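The time-domain metrics above follow directly from the breath-to-breath intervals. A toy sketch of the SDBB/RMSSD/SDSD definitions (`rrv_time_domain` is a hypothetical helper; nk.rsp_rrv() computes these and many more):

```python
import numpy as np

def rrv_time_domain(peak_indices, sampling_rate):
    """Toy computation of the basic time-domain RRV metrics from
    inhalation peak sample indices."""
    bb = np.diff(np.asarray(peak_indices)) / sampling_rate  # breath-to-breath intervals (s)
    diffs = np.diff(bb)                                     # successive differences
    return {
        "RRV_SDBB": np.std(bb, ddof=1),             # SD of the intervals
        "RRV_RMSSD": np.sqrt(np.mean(diffs ** 2)),  # root mean square of successive diffs
        "RRV_SDSD": np.std(diffs, ddof=1),          # SD of the successive diffs
    }

# Peaks at slightly irregular spacings (sample indices, at 100 Hz)
metrics = rrv_time_domain([0, 400, 780, 1200, 1590], sampling_rate=100)
```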
 rsp_simulate(duration=10, length=None, sampling_rate=1000, noise=0.01, respiratory_rate=15, method='breathmetrics', random_state=None)[source]¶
Simulate a respiratory signal.
Generate an artificial (synthetic) respiratory signal of a given duration and rate.
 Parameters
duration (int) – Desired duration of the signal (in seconds).
sampling_rate (int) – The desired sampling rate (in Hz, i.e., samples/second).
length (int) – The desired length of the signal (in samples).
noise (float) – Noise level (amplitude of the Laplace noise).
respiratory_rate (float) – Desired number of breath cycles in one minute.
method (str) – The model used to generate the signal. Can be ‘sinusoidal’ for a simulation based on a trigonometric sine wave that roughly approximates a single respiratory cycle. If ‘breathmetrics’ (default), will use an advanced model described in Noto et al. (2018).
random_state (int) – Seed for the random number generator.
 Returns
array – Vector containing the respiratory signal.
Examples
>>> import pandas as pd
>>> import numpy as np
>>> import neurokit2 as nk
>>>
>>> rsp1 = nk.rsp_simulate(duration=30, method="sinusoidal")
>>> rsp2 = nk.rsp_simulate(duration=30, method="breathmetrics")
>>> fig = pd.DataFrame({"RSP_Simple": rsp1, "RSP_Complex": rsp2}).plot(subplots=True)
>>> fig
References
Noto, T., Zhou, G., Schuele, S., Templer, J., & Zelano, C. (2018). Automated analysis of breathing waveforms using BreathMetrics: A respiratory signal processing toolbox. Chemical Senses, 43(8), 583–597. https://doi.org/10.1093/chemse/bjy045
See also
rsp_clean, rsp_findpeaks, signal_rate, rsp_process, rsp_plot
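A rough sketch of what the ‘sinusoidal’ model amounts to, assuming one sine cycle per breath plus Laplace-distributed noise (`simulate_rsp_sinusoidal` is a hypothetical stand-in; NeuroKit's ‘breathmetrics’ model is considerably more realistic):

```python
import numpy as np

def simulate_rsp_sinusoidal(duration=10, sampling_rate=1000,
                            respiratory_rate=15, noise=0.01, random_state=None):
    """One sine cycle per breath, plus Laplace-distributed noise."""
    rng = np.random.default_rng(random_state)
    t = np.arange(duration * sampling_rate) / sampling_rate
    frequency = respiratory_rate / 60.0  # breaths per minute -> Hz
    signal = np.sin(2 * np.pi * frequency * t)
    return signal + rng.laplace(scale=noise, size=signal.size)

rsp = simulate_rsp_sinusoidal(duration=30, sampling_rate=100, random_state=42)
```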
EDA¶
Submodule for NeuroKit.
 eda_analyze(data, sampling_rate=1000, method='auto')[source]¶
Performs EDA analysis on either epochs (event-related analysis) or on longer periods of data such as resting-state data.
 Parameters
data (Union[dict, pd.DataFrame]) – A dictionary of epochs, containing one DataFrame per epoch, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df(). Can also take a DataFrame of processed signals from a longer period of data, typically generated by eda_process() or bio_process(). Can also take a dict containing sets of separate periods of data.
sampling_rate (int) – The sampling frequency of the signal (in Hz, i.e., samples/second). Defaults to 1000Hz.
method (str) – Can be one of ‘event-related’ for event-related analysis on epochs, or ‘interval-related’ for analysis on longer periods of data. Defaults to ‘auto’, where the right method will be chosen based on the mean duration of the data (‘event-related’ for durations under 10s).
 Returns
DataFrame – A dataframe containing the analyzed EDA features. If eventrelated analysis is conducted, each epoch is indicated by the Label column. See eda_eventrelated() and eda_intervalrelated() docstrings for details.
See also
bio_process, eda_process, epochs_create, eda_eventrelated, eda_intervalrelated
Examples
>>> import neurokit2 as nk
>>> # Example 1: Download the data for event-related analysis
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data for event-related analysis
>>> df, info = nk.bio_process(eda=data["EDA"], sampling_rate=100)
>>> events = nk.events_find(data["Photosensor"], threshold_keep='below',
...                         event_conditions=["Negative", "Neutral", "Neutral", "Negative"])
>>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=1.9)
>>>
>>> # Analyze
>>> analyze_epochs = nk.eda_analyze(epochs, sampling_rate=100)
>>> analyze_epochs
>>>
>>> # Example 2: Download the resting-state data
>>> data = nk.data("bio_resting_8min_100hz")
>>>
>>> # Process the data
>>> df, info = nk.eda_process(data["EDA"], sampling_rate=100)
>>>
>>> # Analyze
>>> analyze_df = nk.eda_analyze(df, sampling_rate=100)
>>> analyze_df
 eda_autocor(eda_cleaned, sampling_rate=1000, lag=4)[source]¶
Computes the autocorrelation measure of an EDA signal, i.e., the correlation between the time series and a specified time-lagged version of itself.
 Parameters
eda_cleaned (Union[list, np.array, pd.Series]) – The cleaned EDA signal.
sampling_rate (int) – The sampling frequency of raw EDA signal (in Hz, i.e., samples/second). Defaults to 1000Hz.
lag (int) – Time lag in seconds. Defaults to 4 seconds to avoid autoregressive correlations approaching 1, as recommended by Halem et al. (2020).
 Returns
float – Autocorrelation index of the EDA signal.
See also
Examples
>>> import neurokit2 as nk
>>>
>>> # Simulate EDA signal
>>> eda_signal = nk.eda_simulate(duration=5, scr_number=5, drift=0.1)
>>> eda_cleaned = nk.eda_clean(eda_signal)
>>> cor = nk.eda_autocor(eda_cleaned)
>>> cor
References
Halem, S., van Roekel, E., Kroencke, L., Kuper, N., & Denissen, J. (2020). Moments That Matter? On the Complexity of Using Triggers Based on Skin Conductance to Sample Arousing Events Within an Experience Sampling Framework. European Journal of Personality.
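The autocorrelation index described above is essentially a Pearson correlation between the signal and a copy of itself shifted by `lag` seconds. An illustrative sketch (`eda_autocor_sketch` is hypothetical, not NeuroKit's implementation):

```python
import numpy as np

def eda_autocor_sketch(eda_cleaned, sampling_rate=1000, lag=4):
    """Pearson correlation between the signal and itself shifted by `lag` seconds."""
    eda_cleaned = np.asarray(eda_cleaned, dtype=float)
    shift = int(lag * sampling_rate)
    if shift >= eda_cleaned.size:
        raise ValueError("lag is longer than the signal")
    # Correlate the un-shifted head of the signal with its shifted tail
    return np.corrcoef(eda_cleaned[:-shift], eda_cleaned[shift:])[0, 1]

# A monotonically drifting signal is maximally autocorrelated
r = eda_autocor_sketch(np.arange(1000) / 100, sampling_rate=100, lag=4)
```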
 eda_changepoints(eda_cleaned)[source]¶
Calculate the number of change points of the skin conductance signal, in terms of mean and variance. Defaults to an algorithm penalty of 10000, as recommended by Halem et al. (2020).
 Parameters
eda_cleaned (Union[list, np.array, pd.Series]) – The cleaned EDA signal.
 Returns
float – Number of changepoints in the EDA signal.
See also
Examples
>>> import neurokit2 as nk
>>>
>>> # Simulate EDA signal
>>> eda_signal = nk.eda_simulate(duration=5, scr_number=5, drift=0.1)
>>> eda_cleaned = nk.eda_clean(eda_signal)
>>> changepoints = nk.eda_changepoints(eda_cleaned)
>>> changepoints
References
Halem, S., van Roekel, E., Kroencke, L., Kuper, N., & Denissen, J. (2020). Moments That Matter? On the Complexity of Using Triggers Based on Skin Conductance to Sample Arousing Events Within an Experience Sampling Framework. European Journal of Personality.
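To convey the idea of counting change points "in terms of mean", here is a deliberately naive sliding-window mean-shift detector. It is not the penalized changepoint algorithm NeuroKit uses internally, and `count_mean_shifts` is a hypothetical helper, purely for illustration:

```python
import numpy as np

def count_mean_shifts(signal, window=50, threshold=1.0):
    """Flag a change point whenever the mean of the preceding `window`
    samples differs from the mean of the following `window` samples by
    more than `threshold`, then jump past the transition region so each
    shift is only counted once."""
    signal = np.asarray(signal, dtype=float)
    count, i = 0, window
    while i + window <= signal.size:
        left = signal[i - window:i].mean()
        right = signal[i:i + window].mean()
        if abs(right - left) > threshold:
            count += 1
            i += 2 * window  # skip the transition region
        else:
            i += 1
    return count

# A step signal with two shifts in mean
sig = np.concatenate([np.zeros(200), np.full(200, 3.0), np.zeros(200)])
print(count_mean_shifts(sig))  # 2
```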
 eda_clean(eda_signal, sampling_rate=1000, method='neurokit')[source]¶
Preprocess Electrodermal Activity (EDA) signal.
 Parameters
eda_signal (Union[list, np.array, pd.Series]) – The raw EDA signal.
sampling_rate (int) – The sampling frequency of eda_signal (in Hz, i.e., samples/second).
method (str) – The processing pipeline to apply. Can be one of ‘neurokit’ (default) or ‘biosppy’.
 Returns
array – Vector containing the cleaned EDA signal.
See also
Examples
>>> import pandas as pd
>>> import neurokit2 as nk
>>>
>>> eda = nk.eda_simulate(duration=30, sampling_rate=100, scr_number=10, noise=0.01, drift=0.02)
>>> signals = pd.DataFrame({"EDA_Raw": eda,
...                         "EDA_BioSPPy": nk.eda_clean(eda, sampling_rate=100, method='biosppy'),
...                         "EDA_NeuroKit": nk.eda_clean(eda, sampling_rate=100, method='neurokit')})
>>> fig = signals.plot()
>>> fig
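For intuition about why cleaning helps: EDA varies slowly, so a low-pass filter removes most high-frequency measurement noise. A hedged sketch using a zero-phase Butterworth filter (`eda_lowpass_sketch` and the exact cutoff are illustrative assumptions; the settings of NeuroKit's ‘neurokit’ pipeline may differ):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def eda_lowpass_sketch(eda_signal, sampling_rate=1000, cutoff=3, order=4):
    """Smooth a signal with a zero-phase Butterworth low-pass filter.
    A ~3 Hz cutoff is a common choice for slow signals such as EDA."""
    b, a = butter(order, cutoff, btype="lowpass", fs=sampling_rate)
    # filtfilt applies the filter forwards and backwards, cancelling phase lag
    return filtfilt(b, a, np.asarray(eda_signal, dtype=float))
```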
 eda_eventrelated(epochs, silent=False)[source]¶
Performs event-related EDA analysis on epochs.
 Parameters
epochs (Union[dict, pd.DataFrame]) – A dict containing one DataFrame per event/trial, usually obtained via epochs_create(), or a DataFrame containing all epochs, usually obtained via epochs_to_df().
silent (bool) – If True, silence possible warnings.
 Returns
DataFrame – A dataframe containing the analyzed EDA features for each epoch, with each epoch indicated by the Label column (if not present, by the Index column). The analyzed features consist of the following:
“EDA_SCR”: indication of whether a Skin Conductance Response (SCR) occurs following the event (1 if an SCR onset is present and 0 if absent) and, if so, its corresponding peak amplitude, time of peak, rise and recovery time. If no SCR occurs, NaNs are displayed for the features below.
“EDA_Peak_Amplitude”: the maximum amplitude of the phasic component of the signal.
“SCR_Peak_Amplitude”: the peak amplitude of the first SCR in each epoch.
“SCR_Peak_Amplitude_Time”: the timepoint of each first SCR peak amplitude.
“SCR_RiseTime”: the rise time of each first SCR, i.e., the time it takes for the SCR to reach peak amplitude from onset.
“SCR_RecoveryTime”: the half-recovery time of each first SCR, i.e., the time it takes for the SCR to decrease to half amplitude.
See also
events_find, epochs_create, bio_process
Examples
>>> import neurokit2 as nk
>>>
>>> # Example with simulated data
>>> eda = nk.eda_simulate(duration=15, scr_number=3)
>>>
>>> # Process data
>>> eda_signals, info = nk.eda_process(eda, sampling_rate=1000)
>>> epochs = nk.epochs_create(eda_signals, events=[5000, 10000, 15000], sampling_rate=1000,
...                           epochs_start=-0.1, epochs_end=1.9)
>>>
>>> # Analyze
>>> nk.eda_eventrelated(epochs)
>>>
>>> # Example with real data
>>> data = nk.data("bio_eventrelated_100hz")
>>>
>>> # Process the data
>>> df, info = nk.bio_process(eda=data["EDA"], sampling_rate=100)
>>> events = nk.events_find(data["Photosensor"], threshold_keep='below',
...                         event_conditions=["Negative", "Neutral", "Neutral", "Negative"])
>>> epochs = nk.epochs_create(df, events, sampling_rate=100, epochs_start=-0.1, epochs_end=6.9)