---
title: "Circular Statistics and Distribution Overlays"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Circular Statistics and Distribution Overlays}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment  = "#>",
  fig.width = 6,
  fig.height = 6,
  fig.align = "center"
)
```

```{r setup}
if (requireNamespace("pkgload", quietly = TRUE)) {
  pkgload::load_all("..", export_all = FALSE, helpers = FALSE, quiet = TRUE)
} else if (requireNamespace("radiatR", quietly = TRUE)) {
  library(radiatR)
} else {
  stop("Package 'radiatR' not installed and 'pkgload' not available.")
}
library(ggplot2)
```

## Overview

The main *radiatR* vignette covers the path from raw tracking files to a
plotted set of headings. This vignette picks up from a heading data frame and
shows the analysis layer:

- **Dispersion summaries** — `circ_dispersion()`, `sector_summary()`
- **Parametric fits** — `vonmises_fit()`, `wrappedcauchy_fit()`
- **Hypothesis tests** — `test_uniformity()`, `test_mean_directions()`,
  `test_concentration()`, all with multiple-comparison correction
- **Correlation** — `circ_cor()` (circular-linear and circular-circular)
- **Distribution overlays** — `add_angle_rose()`, `add_vonmises_density()`,
  `add_wrappedcauchy_density()`, `add_circular_kde()`
- **Significance geometry** — `add_critical_r()` (Rayleigh / V-test circle)
  and `add_critical_v_line()` (V-test boundary)

Every statistics function takes a data frame with a heading column in radians
(default name `"heading"`) and returns a tidy data frame, so the results drop
straight into `dplyr`, `knitr::kable()`, or further plotting.

## A heading data frame

We derive one heading per trial from the bundled `cpunctatus` dataset with the
ring-crossing rule, then attach each trial's target half-width (`arc`) for
grouping. This is the same construction used in the main vignette.

```{r headings}
data(cpunctatus)

hd <- derive_headings(cpunctatus, rule = "crossing",
                      circ0 = 0.2, circ1 = 0.4,
                      coords = "relative",
                      angle_convention = "clock")
names(hd)[names(hd) == "id"] <- "trial_id"

# attach target half-width (arc) from the dataset, by trial
arc_map  <- unique(cpunctatus@data[, c("trial_id", "arc")])
hd       <- merge(hd, arc_map, by = "trial_id")
hd$arc   <- factor(hd$arc)

# keep trials with a defined crossing heading
hd <- hd[is.finite(hd$heading), , drop = FALSE]
head(hd[, c("trial_id", "arc", "heading")])
```

The `heading` column is in radians, reference-relative (0 = toward the
target). Everything below operates on that column.

## Dispersion summaries

`circ_dispersion()` returns the mean direction, resultant length *R*, and
circular standard deviation. Grouping by `arc` gives one row per condition.

```{r dispersion}
circ_dispersion(hd, group_col = "arc")
```

*R* runs from 0 (uniformly scattered) to 1 (all headings identical); the
circular SD moves the opposite way. For dense per-frame heading series — gaze
direction from a tethered subject, say — `sector_summary()` bins the angles and
reports dwell proportions per sector:

```{r sectors}
sector_summary(hd, sectors = 8L)
```

## Parametric fits

`vonmises_fit()` estimates the mean direction \(\mu\) and concentration
\(\kappa\) by maximum likelihood, with asymptotic standard errors and a
confidence interval on \(\mu\):

```{r vonmises}
vonmises_fit(hd, group_col = "arc")[, c("arc", "mu_deg", "kappa", "n")]
```

`wrappedcauchy_fit()` is the heavier-tailed alternative — more robust when the
data have outliers or weak directionality. Its concentration \(\rho\) is
bounded to \([0, 1)\):

```{r wrappedcauchy}
wrappedcauchy_fit(hd, group_col = "arc")[, c("arc", "mu_deg", "rho", "n")]
```

Both return a row per group, so a quick `merge()` puts the two concentration
estimates side by side for comparison.

## Hypothesis tests

### Uniformity

`test_uniformity()` asks, per group, whether the headings have *any* preferred
direction. The Rayleigh test gives an exact p-value; when testing many
conditions at once, pass `p_adjust` for a corrected `p_value_adj` column:

```{r uniformity}
test_uniformity(hd, group_col = "arc", test = "rayleigh", p_adjust = "BH")
```

### Equal mean directions

`test_mean_directions()` is the Watson-Williams test — the circular analogue of
a one-way ANOVA on the mean angle. The omnibus form asks whether *any* group
differs:

```{r ww-omnibus}
test_mean_directions(hd, group_col = "arc")
```

Set `pairwise = TRUE` for all pairwise comparisons; `p_adjust` is strongly
recommended here because the number of comparisons grows quickly:

```{r ww-pairwise}
pw <- test_mean_directions(hd, group_col = "arc",
                           pairwise = TRUE, p_adjust = "holm")
head(pw[order(pw$p_value_adj), ])
```

### Equal concentrations

`test_concentration()` checks whether the groups are equally concentrated (the
circular analogue of a test for equal variances), a key assumption behind the
Watson-Williams test above:

```{r concentration}
test_concentration(hd, group_col = "arc")
```

## Circular correlation

`circ_cor()` measures the association between headings and a covariate. With
`x_type = "linear"` (the default) it computes the circular-linear correlation —
here, whether heading direction is associated with the numeric target
half-width:

```{r cor-linear}
hd$arc_num <- as.numeric(as.character(hd$arc))
circ_cor(hd, x_col = "arc_num", angle_col = "heading", x_type = "linear")
```

The returned `r` is unsigned (association strength, 0–1); the test statistic
\(n r^2\) is approximately \(\chi^2_2\). For two angular variables, pass
`x_type = "circular"` to get Fisher's \(\rho \in [-1, 1]\).

## Circular regression

Where `circ_cor()` measures the *strength* of an association, `circ_regression()`
*models* it: it fits the Fisher–Lee circular–linear regression of a heading on
one or more linear covariates, via a formula interface over
`circular::lm.circular()`. To show that the fit recovers a known effect, we
simulate trajectories whose mean heading shifts with a predictor — the
per-condition `mean_slope` controls that shift — and then fit the model back:

```{r reg-simulate}
cond <- data.frame(condition = "demo", n_trials = 120, ref_mean = 0,
                   concentration_base = 12, mean_slope = 0.6,
                   predictor_mean = 0, predictor_sd = 1)
s   <- simulate_tracks(conditions = cond, n_points = 8, seed = 1)
reg <- s[!duplicated(s$trial_id), c("predictor", "final_heading")]
names(reg)[2] <- "heading"

fit <- circ_regression(reg, heading ~ predictor)
summary(fit)
```

The coefficient table recovers a **positive, significant** slope for
`predictor`, confirming the simulated effect. Note the magnitude: the Fisher–Lee
model links the mean heading to the linear predictor through a \(2\arctan(\cdot)\)
function, so the fitted slope is on that **link scale** and is attenuated
relative to the plain `mean_slope` used to simulate. Interpret the **sign and
significance** of the coefficients directly, and use `predict()` to read the
effect back on the angular (heading) scale:

```{r reg-predict}
predict(fit, data.frame(predictor = c(-1, 0, 1)))
```

The predicted mean heading tracks the predictor — rotating with its value —
exactly as the simulation set up.

The fitted relationship can be drawn straight onto the circular panel: each
covariate value becomes a mean-direction arrow, colour-graded by the predictor,
so you see the mean heading sweep as the covariate increases.

```{r reg-fitted-arrows}
radiate(headings_frame(hd, heading, units = "radians")) +
  add_circ_mean(fitted_directions(fit, at = seq(-2, 2, length.out = 7)),
                colour_col = "predictor")
```

## Distribution overlays

The overlay layers draw an angular distribution in the same Cartesian unit-disc
space as `radiate()`, so they compose onto a bare circular canvas with `+`. We
build that canvas once:

```{r canvas}
canvas <- ggplot() + coord_fixed() +
  add_circ(radius = 1) +
  theme_void()
```

A **rose diagram** bins the headings into wedges; the parametric and
non-parametric density curves overlay on top. Giving every layer the same
`scale` aligns their radii, so the shapes can be compared directly:

```{r overlays, fig.width=6.5, fig.height=6.5}
vm <- vonmises_fit(hd)        # pooled fit (group_col = NULL)
wc <- wrappedcauchy_fit(hd)

canvas +
  add_angle_rose(hd, bins = 12, scale = 0.8, fill = "grey80") +
  add_circular_kde(hd, scale = 0.8, colour = "tomato") +
  add_vonmises_density(vm, scale = 0.8, colour = "steelblue") +
  add_wrappedcauchy_density(wc, scale = 0.8, colour = "darkorange")
```

The grey wedges are the empirical rose; the **steelblue** curve is the von
Mises fit, **darkorange** the wrapped Cauchy, and **tomato** a circular kernel
density estimate. Where the parametric curves track the rose closely the fit is
good; a systematic gap (especially in the tails) is the cue to prefer the
heavier-tailed wrapped Cauchy.

## Per-group overlays

Every overlay that summarises headings takes a `colour_col`, so a single call
draws one element per group, coloured by it -- handy for comparing conditions on
one plot. Here the mean-direction **arrow** (`add_heading_arrow()`) and its
bootstrap **confidence interval** (`add_heading_interval()`) are split by target
half-width (`arc`):

```{r grouped-overlays, fig.width=6.5, fig.height=6.5}
canvas +
  add_heading_arrow(hd,    colour_col = "arc") +
  add_heading_interval(hd, colour_col = "arc", stat = "bootstrap_ci")
```

Each `arc` group gets its own mean vector and CI arc in a matching colour. The
grouping is independent of any faceting, so you can (for example) colour by
cohort while faceting `radiate()` by treatment. `radiate()`'s own built-in arrow
follows the same idea via `arrow_colour_col`, drawing one arrow per group:

```{r radiate-arrow-colour, eval=FALSE}
radiate(cpunctatus, group_col = "trial_id",
        show_arrow = TRUE, arrow_colour_col = "arc")
```

## Significance geometry

The remaining two helpers draw the *decision boundary* of a significance test
directly in resultant-length space (radius 0–1), so it can be read against the
observed mean vector.

`add_critical_r()` draws the **Rayleigh critical circle**: the mean resultant
length needed for significance at level `alpha`, namely
\(r_\text{crit} = \sqrt{-\log(\alpha)/n}\). If the mean vector reaches past the
circle, the headings are significantly non-uniform.

```{r critical-r, fig.width=6.5, fig.height=6.5}
disp <- circ_dispersion(hd)
mean_vec <- data.frame(x = disp$resultant_R * cos(disp$mean_dir),
                       y = disp$resultant_R * sin(disp$mean_dir))

canvas +
  add_critical_r(hd, alpha = 0.05, test = "rayleigh") +
  geom_segment(data = mean_vec,
               aes(x = 0, y = 0, xend = x, yend = y),
               arrow = arrow(length = unit(0.15, "inches")),
               linewidth = 1)
```

```{r rayleigh-pval}
test_uniformity(hd, test = "rayleigh")
```

The arrow tip lies well outside the dashed critical circle, matching the tiny
Rayleigh p-value above.

`add_critical_v_line()` is the corresponding boundary for the **V-test**, which
tests uniformity against a *specified* direction `mu0` (here 0, i.e. toward the
target). Unlike the Rayleigh circle, the V-test boundary is a straight line
perpendicular to `mu0`: significance requires the mean vector's projection onto
`mu0` to exceed \(c = z_\alpha / \sqrt{2n}\). Set `show_region = TRUE` to shade
the rejection side.

```{r critical-v, fig.width=6.5, fig.height=6.5}
canvas +
  add_critical_v_line(hd, mu0 = 0, alpha = 0.05, show_region = TRUE) +
  geom_segment(data = mean_vec,
               aes(x = 0, y = 0, xend = x, yend = y),
               arrow = arrow(length = unit(0.15, "inches")),
               linewidth = 1)
```

When the experiment has an *a priori* expected direction (the target), the
V-test is more powerful than the omnibus Rayleigh test, and the line makes the
one-sided nature of the decision visible.

## Where next

- The **main vignette** covers building these heading data frames and the
  trajectory/heading-overlay plotting layers.
- The **Loaders vignette** covers reading tracking exports from 20+ tools into
  the `Tracks` objects these analyses start from.
- Every function above is documented individually under *Reference*, grouped by
  role (summaries, parametric fitting, correlation, hypothesis tests, and
  distribution overlays).
```