--- title: "Circular Statistics and Distribution Overlays" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Circular Statistics and Distribution Overlays} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include=FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 6, fig.align = "center" ) ``` ```{r setup} if (requireNamespace("pkgload", quietly = TRUE)) { pkgload::load_all("..", export_all = FALSE, helpers = FALSE, quiet = TRUE) } else if (requireNamespace("radiatR", quietly = TRUE)) { library(radiatR) } else { stop("Package 'radiatR' not installed and 'pkgload' not available.") } library(ggplot2) ``` ## Overview The main *radiatR* vignette covers the path from raw tracking files to a plotted set of headings. This vignette picks up from a heading data frame and shows the analysis layer: - **Dispersion summaries** — `circ_dispersion()`, `sector_summary()` - **Parametric fits** — `vonmises_fit()`, `wrappedcauchy_fit()` - **Hypothesis tests** — `test_uniformity()`, `test_mean_directions()`, `test_concentration()`, all with multiple-comparison correction - **Correlation** — `circ_cor()` (circular-linear and circular-circular) - **Distribution overlays** — `add_angle_rose()`, `add_vonmises_density()`, `add_wrappedcauchy_density()`, `add_circular_kde()` - **Significance geometry** — `add_critical_r()` (Rayleigh / V-test circle) and `add_critical_v_line()` (V-test boundary) Every statistics function takes a data frame with a heading column in radians (default name `"heading"`) and returns a tidy data frame, so the results drop straight into `dplyr`, `knitr::kable()`, or further plotting. ## A heading data frame We derive one heading per trial from the bundled `cpunctatus` dataset with the ring-crossing rule, then attach each trial's target half-width (`arc`) for grouping. This is the same construction used in the main vignette. ```{r headings} data(cpunctatus) hd <- derive_headings(cpunctatus, rule = "crossing", circ0 = 0.2, circ1 = 0.4, coords = "relative", angle_convention = "clock") names(hd)[names(hd) == "id"] <- "trial_id" # attach target half-width (arc) from the dataset, by trial arc_map <- unique(cpunctatus@data[, c("trial_id", "arc")]) hd <- merge(hd, arc_map, by = "trial_id") hd$arc <- factor(hd$arc) # keep trials with a defined crossing heading hd <- hd[is.finite(hd$heading), , drop = FALSE] head(hd[, c("trial_id", "arc", "heading")]) ``` The `heading` column is in radians, reference-relative (0 = toward the target). Everything below operates on that column. ## Dispersion summaries `circ_dispersion()` returns the mean direction, resultant length *R*, and circular standard deviation. Grouping by `arc` gives one row per condition. ```{r dispersion} circ_dispersion(hd, group_col = "arc") ``` *R* runs from 0 (uniformly scattered) to 1 (all headings identical); the circular SD moves the opposite way. For dense per-frame heading series — gaze direction from a tethered subject, say — `sector_summary()` bins the angles and reports dwell proportions per sector: ```{r sectors} sector_summary(hd, sectors = 8L) ``` ## Parametric fits `vonmises_fit()` estimates the mean direction \(\mu\) and concentration \(\kappa\) by maximum likelihood, with asymptotic standard errors and a confidence interval on \(\mu\): ```{r vonmises} vonmises_fit(hd, group_col = "arc")[, c("arc", "mu_deg", "kappa", "n")] ``` `wrappedcauchy_fit()` is the heavier-tailed alternative — more robust when the data have outliers or weak directionality. Its concentration \(\rho\) is bounded to \([0, 1)\): ```{r wrappedcauchy} wrappedcauchy_fit(hd, group_col = "arc")[, c("arc", "mu_deg", "rho", "n")] ``` Both return a row per group, so a quick `merge()` puts the two concentration estimates side by side for comparison. ## Hypothesis tests ### Uniformity `test_uniformity()` asks, per group, whether the headings have *any* preferred direction. The Rayleigh test gives an exact p-value; when testing many conditions at once, pass `p_adjust` for a corrected `p_value_adj` column: ```{r uniformity} test_uniformity(hd, group_col = "arc", test = "rayleigh", p_adjust = "BH") ``` ### Equal mean directions `test_mean_directions()` is the Watson-Williams test — the circular analogue of a one-way ANOVA on the mean angle. The omnibus form asks whether *any* group differs: ```{r ww-omnibus} test_mean_directions(hd, group_col = "arc") ``` Set `pairwise = TRUE` for all pairwise comparisons; `p_adjust` is strongly recommended here because the number of comparisons grows quickly: ```{r ww-pairwise} pw <- test_mean_directions(hd, group_col = "arc", pairwise = TRUE, p_adjust = "holm") head(pw[order(pw$p_value_adj), ]) ``` ### Equal concentrations `test_concentration()` checks whether the groups are equally concentrated (the circular analogue of a test for equal variances), a key assumption behind the Watson-Williams test above: ```{r concentration} test_concentration(hd, group_col = "arc") ``` ## Circular correlation `circ_cor()` measures the association between headings and a covariate. With `x_type = "linear"` (the default) it computes the circular-linear correlation — here, whether heading direction is associated with the numeric target half-width: ```{r cor-linear} hd$arc_num <- as.numeric(as.character(hd$arc)) circ_cor(hd, x_col = "arc_num", angle_col = "heading", x_type = "linear") ``` The returned `r` is unsigned (association strength, 0–1); the test statistic \(n r^2\) is approximately \(\chi^2_2\). For two angular variables, pass `x_type = "circular"` to get Fisher's \(\rho \in [-1, 1]\). ## Circular regression Where `circ_cor()` measures the *strength* of an association, `circ_regression()` *models* it: it fits the Fisher–Lee circular–linear regression of a heading on one or more linear covariates, via a formula interface over `circular::lm.circular()`. To show that the fit recovers a known effect, we simulate trajectories whose mean heading shifts with a predictor — the per-condition `mean_slope` controls that shift — and then fit the model back: ```{r reg-simulate} cond <- data.frame(condition = "demo", n_trials = 120, ref_mean = 0, concentration_base = 12, mean_slope = 0.6, predictor_mean = 0, predictor_sd = 1) s <- simulate_tracks(conditions = cond, n_points = 8, seed = 1) reg <- s[!duplicated(s$trial_id), c("predictor", "final_heading")] names(reg)[2] <- "heading" fit <- circ_regression(reg, heading ~ predictor) summary(fit) ``` The coefficient table recovers a **positive, significant** slope for `predictor`, confirming the simulated effect. Note the magnitude: the Fisher–Lee model links the mean heading to the linear predictor through a \(2\arctan(\cdot)\) function, so the fitted slope is on that **link scale** and is attenuated relative to the plain `mean_slope` used to simulate. Interpret the **sign and significance** of the coefficients directly, and use `predict()` to read the effect back on the angular (heading) scale: ```{r reg-predict} predict(fit, data.frame(predictor = c(-1, 0, 1))) ``` The predicted mean heading tracks the predictor — rotating with its value — exactly as the simulation set up. The fitted relationship can be drawn straight onto the circular panel: each covariate value becomes a mean-direction arrow, colour-graded by the predictor, so you see the mean heading sweep as the covariate increases. ```{r reg-fitted-arrows} radiate(headings_frame(hd, heading, units = "radians")) + add_circ_mean(fitted_directions(fit, at = seq(-2, 2, length.out = 7)), colour_col = "predictor") ``` ## Distribution overlays The overlay layers draw an angular distribution in the same Cartesian unit-disc space as `radiate()`, so they compose onto a bare circular canvas with `+`. We build that canvas once: ```{r canvas} canvas <- ggplot() + coord_fixed() + add_circ(radius = 1) + theme_void() ``` A **rose diagram** bins the headings into wedges; the parametric and non-parametric density curves overlay on top. Giving every layer the same `scale` aligns their radii, so the shapes can be compared directly: ```{r overlays, fig.width=6.5, fig.height=6.5} vm <- vonmises_fit(hd) # pooled fit (group_col = NULL) wc <- wrappedcauchy_fit(hd) canvas + add_angle_rose(hd, bins = 12, scale = 0.8, fill = "grey80") + add_circular_kde(hd, scale = 0.8, colour = "tomato") + add_vonmises_density(vm, scale = 0.8, colour = "steelblue") + add_wrappedcauchy_density(wc, scale = 0.8, colour = "darkorange") ``` The grey wedges are the empirical rose; the **steelblue** curve is the von Mises fit, **darkorange** the wrapped Cauchy, and **tomato** a circular kernel density estimate. Where the parametric curves track the rose closely the fit is good; a systematic gap (especially in the tails) is the cue to prefer the heavier-tailed wrapped Cauchy. ## Per-group overlays Every overlay that summarises headings takes a `colour_col`, so a single call draws one element per group, coloured by it -- handy for comparing conditions on one plot. Here the mean-direction **arrow** (`add_heading_arrow()`) and its bootstrap **confidence interval** (`add_heading_interval()`) are split by target half-width (`arc`): ```{r grouped-overlays, fig.width=6.5, fig.height=6.5} canvas + add_heading_arrow(hd, colour_col = "arc") + add_heading_interval(hd, colour_col = "arc", stat = "bootstrap_ci") ``` Each `arc` group gets its own mean vector and CI arc in a matching colour. The grouping is independent of any faceting, so you can (for example) colour by cohort while faceting `radiate()` by treatment. `radiate()`'s own built-in arrow follows the same idea via `arrow_colour_col`, drawing one arrow per group: ```{r radiate-arrow-colour, eval=FALSE} radiate(cpunctatus, group_col = "trial_id", show_arrow = TRUE, arrow_colour_col = "arc") ``` ## Significance geometry The remaining two helpers draw the *decision boundary* of a significance test directly in resultant-length space (radius 0–1), so it can be read against the observed mean vector. `add_critical_r()` draws the **Rayleigh critical circle**: the mean resultant length needed for significance at level `alpha`, namely \(r_\text{crit} = \sqrt{-\log(\alpha)/n}\). If the mean vector reaches past the circle, the headings are significantly non-uniform. ```{r critical-r, fig.width=6.5, fig.height=6.5} disp <- circ_dispersion(hd) mean_vec <- data.frame(x = disp$resultant_R * cos(disp$mean_dir), y = disp$resultant_R * sin(disp$mean_dir)) canvas + add_critical_r(hd, alpha = 0.05, test = "rayleigh") + geom_segment(data = mean_vec, aes(x = 0, y = 0, xend = x, yend = y), arrow = arrow(length = unit(0.15, "inches")), linewidth = 1) ``` ```{r rayleigh-pval} test_uniformity(hd, test = "rayleigh") ``` The arrow tip lies well outside the dashed critical circle, matching the tiny Rayleigh p-value above. `add_critical_v_line()` is the corresponding boundary for the **V-test**, which tests uniformity against a *specified* direction `mu0` (here 0, i.e. toward the target). Unlike the Rayleigh circle, the V-test boundary is a straight line perpendicular to `mu0`: significance requires the mean vector's projection onto `mu0` to exceed \(c = z_\alpha / \sqrt{2n}\). Set `show_region = TRUE` to shade the rejection side. ```{r critical-v, fig.width=6.5, fig.height=6.5} canvas + add_critical_v_line(hd, mu0 = 0, alpha = 0.05, show_region = TRUE) + geom_segment(data = mean_vec, aes(x = 0, y = 0, xend = x, yend = y), arrow = arrow(length = unit(0.15, "inches")), linewidth = 1) ``` When the experiment has an *a priori* expected direction (the target), the V-test is more powerful than the omnibus Rayleigh test, and the line makes the one-sided nature of the decision visible. ## Where next - The **main vignette** covers building these heading data frames and the trajectory/heading-overlay plotting layers. - The **Loaders vignette** covers reading tracking exports from 20+ tools into the `Tracks` objects these analyses start from. - Every function above is documented individually under *Reference*, grouped by role (summaries, parametric fitting, correlation, hypothesis tests, and distribution overlays). ```