--- title: "RTransferEntropy" author: "Simon Behrendt, Thomas Dimpfl, Franziska J. Peter, David J. Zimmermann" date: "`r Sys.Date()`" output: rmarkdown::html_vignette bibliography: bdpz2018.bib vignette: > %\VignetteIndexEntry{RTransferEntropy} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 4 ) options(future.globals.maxSize = Inf) library(ggplot2) theme_set(theme_light()) plot_series <- function(x, y) { p1 <- ggplot(data.frame(x = c(NA, x[1:(length(x) - 1)]), y = y), aes(x, y)) + geom_smooth() + geom_point(alpha = 0.5, size = 0.5) + labs(x = expression(X[t - 1]), y = expression(Y[t])) + coord_fixed(1) + scale_x_continuous(limits = range(x)) + scale_y_continuous(limits = range(y)) p2 <- ggplot(data.frame(x = x, y = y), aes(x, y)) + geom_smooth() + geom_point(alpha = 0.5, size = 0.5) + labs(x = expression(X[t]), y = expression(Y[t])) + coord_fixed(1) + scale_x_continuous(limits = range(x)) + scale_y_continuous(limits = range(y)) p3 <- ggplot(data.frame(x = x, y = c(NA, y[1:(length(y) - 1)])), aes(x, y)) + geom_smooth() + geom_point(alpha = 0.5, size = 0.5) + labs(x = expression(X[t]), y = expression(Y[t - 1])) + coord_fixed(1) + scale_x_continuous(limits = range(x)) + scale_y_continuous(limits = range(y)) p <- gridExtra::grid.arrange(p1, p2, p3, ncol = 3) return(invisible(p)) } ``` ## Introduction to RTransferEntropy The measurement of information transfer between different time series is the basis of research questions in various research areas, including biometrics, economics, ecological modelling, neuroscience, sociology, and thermodynamics. The quantification of information transfer commonly relies on measures that have been derived from subject-specific assumptions and restrictions concerning the underlying stochastic processes or theoretical models. With the development of transfer entropy, information theory based measures have become a popular alternative to quantify information flows within various disciplines. Transfer entropy is a non-parametric measure of directed, asymmetric information transfer between two processes. We show how to quantify the information flow between two stationary time series and how to test for its statistical significance using Shannon transfer entropy and Rényi transfer entropy within the package `RTransferEntropy`. A core aspect of the provided package is to allow statistical inference and hypothesis testing in the context of transfer entropy. To this end, we first present the methodology, i.e. the derivation and calculation of transfer entropy as well as the associated bias correction applied to calculate effective transfer entropy, and describe our approach to statistical inference. Afterwards, we introduce the package in detail and demonstrate its functionality in several applications to simulated processes as well as an application to financial time series. ## Measuring information flows using transfer entropy Let $log$ denote the logarithm to the base 2, then informational gain is measured in bits. Shannon entropy [@S48] states that for a discrete random variable $J$ with probability distribution $p(j)$, where $j$ stands for the different outcomes the random variable $J$ can take, the average number of bits required to optimally encode independent draws from the distribution of $J$ can be calculated as $$ H_J = - \sum_j p(j) \cdot log \left(p(j)\right). $$ Strictly speaking, Shannon's formula is a measure for uncertainty, which increases with the number of bits needed to optimally encode a sequence of realizations of $J$. In order to measure the information flow between two processes, Shannon entropy is combined with the concept of the Kullback-Leibler distance [@KL51] and by assuming that the underlying processes evolve over time according to a Markov process [@schreiber2000]. Let $I$ and $J$ denote two discrete random variables with marginal probability distributions $p(i)$ and $p(j)$ and joint probability distribution $p(i,j)$, whose dynamical structures correspond to stationary Markov processes of order $k$ (process $I$) and $l$ (process $J$). The Markov property implies that the probability to observe $I$ at time $t+1$ in state $i$ conditional on the $k$ previous observations is $p(i_{t+1}|i_t,...,i_{t-k+1})=p(i_{t+1}|i_t,...,i_{t-k})$. The average number of bits needed to encode the observation in $t+1$ if the previous $k$ values are known is given by $$ h_I(k)=- \sum_i p\left(i_{t+1}, i_t^{(k)}\right) \cdot log \left(p\left(i_{t+1}|i_t^{(k)}\right)\right), $$ where $i^{(k)}_t=(i_t,...,i_{t-k+1})$. $h_J(l)$ can be derived analogously for process $J$. In the bivariate case, information flow from process $J$ to process $I$ is measured by quantifying the deviation from the generalized Markov property $p(i_{t+1}| i_t^{(k)})=p(i_{t+1}| i_t^{(k)},j_t^{(l)})$ relying on the Kullback-Leibler distance [@schreiber2000]. Thus, (Shannon) transfer entropy is given by $$ T_{J \rightarrow I}(k,l) = \sum_{i,j} p\left(i_{t+1}, i_t^{(k)}, j_t^{(l)}\right) \cdot log \left(\frac{p\left(i_{t+1}| i_t^{(k)}, j_t^{(l)}\right)}{p\left(i_{t+1}|i_t^{(k)}\right)}\right), $$ where $T_{J\rightarrow I}$ consequently measures the information flow from $J$ to $I$ ( $T_{I \rightarrow J}$ as a measure for the information flow from $I$ to $J$ can be derived analogously). Transfer entropy can also be based on Rényi entropy [@R70] rather than Shannon entropy. Rényi entropy introduces a weighting parameter $q>0$ for the individual probabilities $p(j)$ and can be calculated as $$ H^q_J = \frac{1}{1-q} log \left(\sum_j p^q(j)\right). $$ For $q\rightarrow 1$, Rényi entropy converges to Shannon entropy. For $01$ the weights induce a preference for outcomes $j$ with a higher initial probability. Consequently, Rényi entropy provides a more flexible tool for estimating uncertainty, since different areas of a distribution can be emphasized, depending on the parameter $q$. Using the escort distribution [for more information, see @BeckS93] $\phi_q(j)=\frac{p^q(j)}{\sum_j p^q(j)}$ with $q >0$ to normalize the weighted distributions, @JKS12 derive the Rényi transfer entropy measure as $$ RT_{J \rightarrow I}(k,l) = \frac{1}{1-q} log \left(\frac{\sum_i \phi_q\left(i_t^{(k)}\right)p^q\left(i_{t+1}|i^{(k)}_t\right)}{\sum_{i,j} \phi_q\left(i^{(k)}_t,j^{(l)}_t\right)p^q\left(i_{t+1}|i^{(k)}_t,j^{(l)}_t \right)}\right). $$ Analogously to (Shannon) transfer entropy, Rényi transfer entropy measures the information flow from $J$ to $I$. Note that, contrary to Shannon transfer entropy, the calculation of Rényi transfer entropy can result in negative values. In such a situation, knowing the history of $J$ reveals even greater risk than would otherwise be indicated by only knowing the history of $I$ alone. For more details on this issue see @JKS12. The above transfer entropy estimates are commonly biased due to small sample effects. A remedy is provided by the effective transfer entropy [@MK02], which is computed in the following way: $$ ET_{J \rightarrow I}(k,l)= T_{J \rightarrow I}(k,l)- T_{J_{\text{shuffled}} \rightarrow I}(k,l), $$ where $T_{J_{\text{shuffled}} \rightarrow I}(k,l)$ indicates the transfer entropy using a shuffled version of the time series of $J$. Shuffling implies randomly drawing values from the time series of $J$ and realigning them to generate a new time series. This procedure destroys the time series dependencies of $J$ as well as the statistical dependencies between $J$ and $I$. As a result $T_{J_{\text{shuffled}} \rightarrow I}(k,l)$ converges to zero with increasing sample size and any nonzero value of $T_{J_{\text{shuffled}} \rightarrow I}(k,l)$ is due to small sample effects. The transfer entropy estimates from shuffled data can therefore be used as an estimator for the bias induced by these small sample effects. To derive a consistent estimator, shuffling is repeated many times and the average of the resulting shuffled transfer entropy estimates across all replications is subtracted from the Shannon or Rényi transfer entropy estimate to obtain a bias corrected effective transfer entropy estimate. In order to assess the statistical significance of transfer entropy estimates, we rely on a Markov block bootstrap as proposed by @Dimpfl2013. In contrast to shuffling, the Markov block bootstrap preserves the dependencies within each time series. Thereby, it generates the distribution of transfer entropy estimates under the null hypothesis of no information transfer, i.e. randomly drawn blocks of process $J$ are realigned to form a simulated series, which retains the univariate dependencies of $J$ but eliminates the statistical dependencies between $J$ and $I$. Shannon or Rényi transfer entropy is then estimated based on the simulated time series. Repeating this procedure yields the distribution of the transfer entropy estimate under the null of no information flow. The p-value associated with the null hypothesis of no information transfer is given by $1-\hat{q}_{TE}$, where $\hat{q}_{TE}$ denotes the quantile of the simulated distribution that corresponds to the original transfer entropy estimate. The calculation of Shannon and Rényi transfer entropy is based on discrete data. If the data does not exhibit a discrete structure that allows for transfer entropy estimation, it has to be discretized. This can be achieved by symbolic recoding, i.e. by partitioning the data into a finite number of bins, which can either be based on defining upper and lower bounds for the bins a priori or by choosing specific quantiles of the empirical distribution of the data. Denote the bounds specified for the $n$ bins by $q_1, q_2, ..., q_n$, where $q_1< q_2< ... Y calc_te(x, y) calc_ete(x, y) # and Y->X calc_te(y, x) calc_ete(y, x) ``` Note that the effective transfer entropy relies on a random component (induced by shuffling and repeated reestimation). Therefore, the results might be slightly different. ### Simulated series II Consider again the above example, where the results show a significant information flow from $X$ to $Y$, but not vice versa. Similar conclusions could be drawn from using a vector autoregressive model and testing for Granger causality. However, the main advantage of using transfer entropy is that it is not limited to linear relationships. Consider the following nonlinear relation between $X$ and $Y$, where, again, only $Y$ depends on $X$: $$ \begin{split} x_t & = & 0.2x_{t-1} + \varepsilon_{x,t}\\ y_t & = & \sqrt{\mid x_{t-1}\mid} + \varepsilon_{y,t}, \end{split} $$ with $\varepsilon_{x,t}$ and $\varepsilon_{y,t}$ being standard normally distributed. We simulate these processes, discarding the first 200 observations as a burn-in period. ```{r gen_data_2, eval=T} set.seed(12345) n <- 2500 x <- rep(0, n + 200) y <- rep(0, n + 200) x[1] <- rnorm(1, 0, 1) y[1] <- rnorm(1, 0, 1) for (i in 2:(n + 200)) { x[i] <- 0.2 * x[i - 1] + rnorm(1, 0, 1) y[i] <- sqrt(abs(x[i - 1])) + rnorm(1, 0, 1) } x <- x[-(1:200)] y <- y[-(1:200)] ``` As in the previous example, a scatterplot provides a first impression of the dependencies among both time series for different lead-lag combinations. ```{r plot_data_2, echo=F, message=FALSE, warning=FALSE} plot_series(x, y) ``` The focus of this example is on Shannon transfer entropy. We use the standard settings of the `transfer_entropy()` function which uses one lag to calculate the transfer entropy. ```{r te_2, eval=T} shannon_te2 <- transfer_entropy(x, y) shannon_te2 ``` The resulting Shannon transfer entropy estimate indicates that there is a significant information flow from $X$ to $Y$, but not in the other direction. In the same situation, using a VAR would not reveal any relationship between $X$ and $Y$. Using the package [`vars`](https://cran.r-project.org/package=vars) and one lag (the true lag structure) delivers the following estimates of the VAR(1) for the dependent variable $Y$: ```{r var_comparison, message=FALSE, warning=FALSE} library(vars) varfit <- VAR(cbind(x, y), p = 1, type = "const") svf <- summary(varfit) svf$varresult$y ``` The VAR cannot detect the nonlinear dependence of $Y$ on $X$, which leads to a parameter estimate `x.l1` that is not statistically significant. The autoregressive nature of $X$ (not shown above, but can be obtained with `svf$varresult$x`) is, as expected, readily identified. The precise value of the transfer entropy is influenced by the choice of the quantiles. To illustrate the effect, we reestimate Shannon transfer entropy for a selection of quantiles. The following graph reports the results for increasing tail bins (and, thus, a shrinking central bin). ```{r te_2a, eval=T} df <- data.frame(q1 = 5:25, q2 = 95:75) df$ete <- apply( df, 1, function(el) calc_ete(x, y, quantiles = c(el[["q1"]], el[["q2"]])) ) df$quantiles <- factor(sprintf("(%02.f, %02.f)", df$q1, df$q2)) ggplot(df, aes(x = quantiles, y = ete)) + geom_point() + theme(axis.text.x = element_text(angle = 90)) + labs(x = "Quantiles", y = "ETE (X->Y)") ``` As can be seen, the relative value of the effective transfer entropy from $X$ to $Y$ increases as the central bin gets smaller. Statistical significance (not shown in the plot), is, however, not affected. ### Simulated series III As shown above, the question how to determine the quantiles is important as it impacts the order of magnitude of the transfer entropy. A different approach is provided by Rényi transfer entropy, which allows to reweight the probabilities associated with the individual bins. Due to the proposed symbolic encoding of the data, tail events are associated with a lower likelihood compared to events in the centre of the distribution. Rényi transfer entropy, thus, puts more weight on the tails when calculating transfer entropy. This is particularly convenient when the distribution is assumed to be more informative in the tails. Note again that Rényi transfer entropy estimates can potentially turn out negative. To illustrate the use of Rényi transfer entropy, we simulate data for which the dependence of $Y$ on $X$ changes with the level of the innovation. $$ \begin{split} x_t & = & 0.2x_{t-1} + \varepsilon_{x,t}\\ y_t & = & \begin{cases} \phantom{0.3}x_{t-1} + \varepsilon_{y,t} \quad \text{if } |\varepsilon_{y,t}| > s \\ 0.2x_{t-1} + \varepsilon_{y,t} \quad \text{if } |\varepsilon_{y,t}| < s \end{cases}, \end{split} $$ with $\varepsilon_{x,t}$ and $\varepsilon_{y,t}$ being standard normally distributed and $s = 2\sigma_y$ and $\sigma_y$ the standard deviation of $\varepsilon_{y,t}$. As before, $X$ serves as a predictor for $Y$, but not vice versa. ```{r gen_data_3, eval=T} set.seed(12345) x <- rep(0, n + 200) y <- rep(0, n + 200) x[1] <- rnorm(1, 0, 1) y[1] <- rnorm(1, 0, 1) for (i in 2:(n + 200)) { x[i] <- 0.2 * x[i - 1] + rnorm(1, 0, 1) y[i] <- ifelse( abs(x[i - 1]) > 1.65, x[i - 1] + rnorm(1, 0, 1), 0.2 * x[i - 1] + rnorm(1, 0, 1) ) } x <- x[-(1:200)] y <- y[-(1:200)] ``` Again, a scatterplot provides a first approximation of the dependencies among both time series. ```{r plot_data_3, echo=F, message=FALSE, warning=FALSE} plot_series(x, y) ``` In a first step, we estimate Rényi transfer entropy with a weighting parameter $q=0.3$, which gives a reasonable weight to the (infrequent) tail observations. Estimation of Rényi transfer entropy is, thus, invoqued as follows: ```{r te_renyi_3} set.seed(12345) renyi_te <- transfer_entropy(x, y, entropy = "Renyi", q = 0.3) renyi_te ``` The results indicate first and foremost that there is (statistically significant) information flow from $X$ to $Y$. The effect in the other direction is not statistically significant. The appealing property of the Rényi transfer entropy is the possibility to reweight the probabilities. Furthermore, as $q \rightarrow 1$, Rényi transfer entropy approaches Shannon transfer entropy. We illustrate these two effects using different values of $q$ in the estimation of Rényi transfer entropy and compare the result with the Shannon transfer entropy. Note that the values are only comparable when the same bins are used in the symbolic encoding. Furthermore, the results are illustrated using the raw transfer entropy estimates and not the effective transfer entropy estimates because the latter depend on the shuffling procedure. ```{r q_test} qs <- c(seq(0.1, 0.9, 0.1), 0.99) te <- sapply(qs, function(q) calc_te(x, y, entropy = "renyi", q = q)) names(te) <- sprintf("q = %.2f", qs) te_shannon <- calc_te(x, y) te_shannon ``` The Shannon transfer entropy from $X$ to $Y$ is estimated as `r sprintf("%.4f", te_shannon)`. Using different values of $q$ we obtain the following results for the Rényi transfer entropy. ```{r plot_q_test, message=FALSE, warning=FALSE} round(te, 4) text_df <- data.frame(x = 0.25, y = te_shannon, lab = sprintf("Shannon's TE = %.4f", te_shannon)) ggplot(data.frame(x = qs, y = te), aes(x = x, y = y)) + geom_hline(yintercept = te_shannon, color = "red", linetype = "dashed") + geom_smooth(se = F, color = "black", size = 0.5) + theme_light() + labs(x = "Values for q", y = "Renyi's Transfer Entropy", title = "Renyi's Transfer Entropy for different Values of q") + geom_text(data = text_df, aes(label = lab), color = "red", nudge_y = 0.01) ``` As can be seen, the value of Rényi transfer entropy is highest for small values of $q$ and decreases as $q$ increases. Furthermore, it is indeed approaching `r sprintf("%.4f", te_shannon)` as $q\rightarrow 1$. ## Application to financial time series The analysis of information flows in financial markets has a long history. Using transfer entropy widens the possibilities to detect information flows as nonlinear relationships can also be accounted for. To illustrate the application of transfer entropy, we use a dataset of 10 individual stocks comprised in the S\&P 500 index as well as the index itself. The data range from January 3, 2000, to December 29, 2017. This dataset is included as the `stocks` object in the package. A stock market index is a weighted average of individual stocks. Of course, small stocks have less weight and, hence, may only have limited impact on the index while large corporations might dominate. On the other hand, it is unclear upfront how the market environment as a whole (as measured by the index) might provide information to the individual stocks. To measure the extent to which information flows between the index and the stocks, we use transfer entropy. As the calculation of transfer entropy requires stationary data, we calculate log-returns from the price series. ```{r load_data, message=FALSE, warning=FALSE} library(data.table) # for data manipulation res <- lapply(split(stocks, stocks$ticker), function(d) { te <- transfer_entropy(d$ret, d$sp500, shuffles = 50, nboot = 100, quiet = T) data.table( ticker = d$ticker[1], dir = c("X->Y", "Y->X"), coef(te)[1:2, 2:3] ) }) df <- rbindlist(res) # order the ticker by the ete of X->Y df[, ticker := factor(ticker, levels = unique(df$ticker)[order(df[dir == "X->Y"]$ete)])] # rename the variable (xy/yx) df[, dir := factor(dir, levels = c("X->Y", "Y->X"), labels = c("Flow towards Market", "Flow towards Stock"))] ggplot(df, aes(x = ticker, y = ete)) + facet_wrap(~dir) + geom_hline(yintercept = 0, color = "gray") + theme(axis.text.x = element_text(angle = 90)) + labs(x = NULL, y = "Effective Transfer Entropy") + geom_errorbar(aes(ymin = ete - qnorm(0.95) * se, ymax = ete + qnorm(0.95) * se), width = 0.25, col = "blue") + geom_point() ``` The graphic reports effective transfer entropy estimates together with 95% confidence bounds for 10 stocks. The results illustrate that there is indeed bi-directional information flow. However, the information flow from the stocks to the market is higher for most stocks than the other direction. Also, as can easily be seen from the confidence bounds, the information flow towards the stock is not statistically significant in two cases (on a 5\% significance level). Again, the estimated transfer entropy depends on the choice of the quantile. This is illustrated in the following Figure which reports the changes of effective transfer entropy values for two different quantiles `(05, 95)` and `c(10, 90)` for the two flow directions: stock towards market on the left facet, and market towards stock on the right facet. The different colors represent different stocks. The estimated information flow from the market to the stocks is rather robust with respect to the choice of the quantiles, most values in both flow directions show little change (as indicated by small absolute slopes of the lines). ```{r density_plot_1} # calculate the same ete with different quantiles df2 <- stocks[, .(ete_xy = calc_ete(ret, sp500, quantiles = c(10, 90)), ete_yx = calc_ete(sp500, ret, quantiles = c(10, 90))), by = ticker] # combine the quantiles into a single dt df1 <- dcast(df[, .(dir, ticker, ete)], ticker ~ dir, value.var = "ete") setnames(df1, c("ticker", "ete_xy", "ete_yx")) dt <- rbindlist(list( df1[, quantiles := "(05, 95)"], df2[, quantiles := "(10, 90)"] )) df_long2 <- melt(dt, id.vars = c("ticker", "quantiles")) df_long2[, quantiles := factor(quantiles, levels = c("(05, 95)", "(10, 90)"))] df_long2[, variable := factor(variable, levels = c("ete_xy", "ete_yx"), labels = c("Flow towards Market", "Flow towards Stock"))] ggplot(df_long2, aes(x = quantiles, y = value, color = ticker, group = ticker)) + geom_line() + facet_wrap(~variable) + labs( x = "Quantiles", y = "Effective Transfer Entropy", title = "Change of ETE-Values for different Quantiles", color = "Ticker" ) ``` In finance, pricing relevant information is readily associated with tail events, i.e. relatively large positive or negative returns. If these are indeed more relevant, Rényi transfer entropy provides a tool to give more weight to their contribution to the overall information flow. To illustrate this feature we use one stock and calculate Rényi transfer entropy from the index to the stock for a selection of the weighting parameters $q$. ```{r renyi_te} qs <- c(seq(0.1, 0.9, 0.1), 0.99) d <- stocks[ticker == "AXP"] q_list <- lapply(qs, function(q) { # transfer_entropy will give a warning as nboot < 100 suppressWarnings({ tefit <- transfer_entropy(d$ret, d$sp500, lx = 1, ly = 1, entropy = "Renyi", q = q, shuffles = 50, quantiles = c(10, 90), nboot = 20, quiet = T) }) data.table( q = q, dir = c("X->Y", "Y->X"), coef(tefit)[, 2:3] ) }) qdt <- rbindlist(q_list) sh_dt <- data.table( dir = c("X->Y", "Y->X"), ete = c(calc_ete(d$ret, d$sp500), calc_ete(d$sp500, d$ret)) ) qdt[, pe := qnorm(0.95) * se] ggplot(qdt, aes(x = q, y = ete)) + geom_hline(yintercept = 0, color = "darkgray") + geom_hline(data = sh_dt, aes(yintercept = ete), linetype = "dashed", color = "red") + geom_point() + geom_errorbar(aes(ymin = ete - pe, ymax = ete + pe), width = 0.25/10, col = "blue") + facet_wrap(~dir) + labs(x = "Values for q", y = "Renyi's Transfer Entropy", title = "Renyi's Transfer Entropy for different Values of q", subtitle = "For American Express (AXP, X) and the S&P 500 Index (Y)") ``` For low values of $q$, the information in the tails is given a high weight which leads in the current situation to a significant effective transfer entropy result. This indicates that indeed tail dependence is given between the S\&P 500 and the stock. As the weight is reduced, the effective transfer entropy decreases and even becomes negative (between $q=0.3$ and $q=0.8 in the direction from the stock to the index and between $q=0.5$ and $q=0.9$ in the other direction). This would mean that the knowledge of the either the stock or the index would even indicate a higher risk expose of the respective other entity. Note that this does not mean that there is no information flow. The information flow merely suggests a higher risk of the dependent variable as opposed to a situation with positive Rényi transfer entropy where the risk about the dependent variable is reduced by knowledge of the other variable. The red dashed line in the graph represents the Shannon entropy. The last value of $q$ is 0.99 for which the Rényi transfer entropy is already fairly close to the value of the Shannon transfer entropy. This illustrates that Rényi transfer entropy converges to Shannon transfer entropy as $q$ approaches 1. ## Parallel execution `RTransferEntropy` uses the [`future`](https://cran.r-project.org/package=future) package internally that allows for parallel execution. To enable parallel execution, you have to select a plan (see also `?future::plan`). The following code uses multiple cores as set by `plan(multisession)` then it reverts to sequential execution. ```{r future_details, eval = F} library(future) # enable parallelism plan(multisession) te <- transfer_entropy(x, y, nboot = 100) # execute sequential again plan(sequential) te <- transfer_entropy(x, y, nboot = 100) ``` Turning the function to `quiet = TRUE` might further help to reduce time when the function is called repeatedly. If you want to disable output for all calls, you can use `set_quiet(TRUE)`. ```{r set_quiet} set_quiet(TRUE) te <- transfer_entropy(x, y, nboot = 0) set_quiet(FALSE) te <- transfer_entropy(x, y, nboot = 0) # close multisession, see also ?plan plan(sequential) ``` ## Comparison with existing package The authors of this package are aware of one other package that calculates transfer entropy, the [`TransferEntropy`-package](https://github.com/Healthcast/TransEnt). This package allows the user to compute the Shannon transfer entropy measure using the `computeTE()`-function and returns a single value for the calculated transfer entropy. The present package also provides the computation of Shannon transfer entropy, but is not limited to it. By default (using the `transfer_entropy`-function), it computes transfer entropy, effective transfer entropy (Eff. TE), standard errors, and p-values for both directions between the input time series. Standard errors are based on a bootstrap and bootstrapped transfer entropy quantiles are also provided by default. The focus of this package, therefore, is to allow the researcher to conduct inference about hypotheses regarding effective transfer entropies. Furthermore, the present package also allows the calculation of Rényi transfer entropy. The latter is particularly useful if the tails are assumed to be more informative than the centre of the distribution. The important difference between the [`TransferEntropy`-package](https://github.com/Healthcast/TransEnt) and this package is the way the data are discretized. The former package relies on the k-nearest neighbor approach of Kraskov to estimate mutual information. This package uses symbolic encoding based on selected bins or quantiles of the empirical distribution of the data to estimate empirical frequencies. As in particular the bootstrap is computationally intensive, this package uses `Rcpp` and parallel processing to decrease computing time. ## References