Ordered Network Analysis · Practical Workflow Guide

ONA in practice: turn a chapter-style figure into a reusable workflow.

The original ONA chapter, ENA tutorial article, and quantitative ethnography texts already explain the method in depth. This guide summarises those foundations, then translates them into a stable, step-by-step workflow that researchers and AI agents can actually run: install, render, compare, troubleshoot, and reuse.

For: total beginners · R 4.4+ · Python 3.10+ · Windows / macOS / Linux · ~30 min first time

Guide author: Jewoong Moon (The University of Alabama, jmoon19@ua.edu)

KEY INSIGHT The ona R package's plot wrapper is broken on current Windows installs. We bypass it by calling its internal helpers directly — the math stays 100% authentic, only the rendering is rerouted.
What this guide is — and isn’t

An operational bridge to the canonical ONA literature.

The methodological heavy lifting already lives in the original sources: the ONA chapter, the ENA tutorial article, the quantitative ethnography book, and the package manuals for ona, tma, and rENA. This guide does not replace them. It reorganises them into an agent-friendly workflow with runnable templates, reproducible defaults, downloadable example files, and a practical workaround for the current Windows plotting failure. For theory and citation, defer to the originals in References.

The four views you can render

Every figure below was generated from the same reusable workflow pack and the same synthetic teacher–student dialogue dataset. Pick the view that matches your analytic question, then reuse the linked templates and starter files.

What is Ordered Network Analysis?

Ordered Network Analysis (ONA) is a quantitative ethnographic method developed by Tan, Swiecki, Ruis, and Shaffer (2024). It models how codes co-occur within a sliding temporal window — but unlike its predecessor, Epistemic Network Analysis (ENA), ONA preserves the direction of co-occurrence: which code preceded which.

That direction matters in any process where order is meaningful: classroom dialogue (who responded to whom), clinical decision-making (which symptom triggered which test), collaborative problem solving (whose contribution sparked which next move).

An ONA figure has four moving parts:

  • Nodes are codes, positioned in a 2-D space derived from singular value decomposition (SVD) of the directional co-occurrence matrix. Codes that get co-occurrence pressure from similar partners cluster together.
  • Edges are directional curved arrows that show "code X tended to follow code Y" — thicker arrow = stronger pull.
  • Per-unit points are individual sessions/students projected into the same 2-D space — small dots scattered around the codes.
  • Group means with 95% confidence intervals show where each condition's centroid sits in the projection.

The visual you see at the top of a typical ONA paper — directional curved arrows over a backbone of named codes, with two coloured groups overlaid or split into panels — is what this skill produces.

Why this guide, if the chapter and packages already exist?

The original ONA chapter and related ENA literature are strong on methodological explanation. They tell you what ONA is, why directional co-occurrence matters, and how to interpret the resulting network. What they do not try to be is a fault-tolerant, copyable workflow for contemporary R + Python environments or a prompt-ready workflow for AI agents.

That gap matters in practice. If you follow the official ona R-package plotting recipe on many current Windows machines, the build step succeeds but the wrapper silently produces an empty plot: the per-unit points appear, while the arrows and node markers are never injected into the layout.

I spent a long time trying to fix that wrapper directly — recompiling ona from source with RTools45, downgrading the package combo to the LAK24 versions, and swapping data.frame indexing for data.table indexing. None of it solved the operational failure.

Eventually I traced the failure into the wrapper's compiled logic: it builds the right edge geometry internally, then tries to inject it into a hierarchical legacy structure ($ENA_DIRECTION$response) that no longer matches the newer tma output. The injection silently fails, the figure renders empty, and no error is thrown.

The practical breakthrough: ona's internal helpers — create_edge_matrix(), edge_paths(), and to_square() — still work perfectly when called directly. They produce the same directional geometry and in-strength values the canonical figure depends on. This guide packages that working path into reusable templates, a one-shot driver, downloadable example files, and explicit instructions that an agent or a human collaborator can follow without guessing.

So the role of this guide is simple: keep the original literature as the conceptual authority, and make the workflow operationally reliable.

Critical · Caveat

Do not call ona::nodes() or ona::edges() from this pipeline. They are silent no-ops on the current install and will give you an "empty figure" mystery that wastes hours.

Prerequisites & install

You need three things on your machine: R, Python, and (on Windows) RTools. The skill auto-detects the rest.

Step 01
Install R 4.4+
Download R 4.4 or newer from CRAN. The default install path on Windows is C:\Program Files\R\R-4.5.2\.
Step 02
Install RTools45
Windows-only. Download RTools45 and accept the default path C:\rtools45\. Needed once, in case any package needs to compile.
Step 03 · Core
Install ona, tma, plotly
From an R console: install.packages(c("ona","tma","plotly","htmlwidgets","data.table","magrittr"))
Step 04
Install Python 3.10+
From python.org or via Anaconda. Then in a terminal: pip install -U plotly kaleido.

Verify the install

From an R console:

# Should print version numbers (any 0.x is fine)
packageVersion("ona")
packageVersion("tma")
packageVersion("plotly")

From a terminal:

python -c "import kaleido; print(kaleido.__version__)"
# Expected: 1.x  — never 0.2.1
Pitfall · kaleido 0.2.1

If print(kaleido.__version__) shows 0.2.1, the renderer will hang silently on Windows. Run pip install -U kaleido until it shows 1.x.

Tip · Skip the manual env work

The skill includes a run_ona.ps1 driver that auto-detects your R install (4.4 / 4.5), checks kaleido, installs missing Python deps, then runs everything end-to-end. See Chapter 08.

Your data: the long-format CSV

ONA expects one row per turn (or whatever your atomic unit of "happening" is): every code is a separate column with a 0/1 integer indicator, plus identifying columns for unit, group, mode, and time.

Here is the minimal schema:

Column   | Type         | What it carries
user_id  | string / int | Who produced the turn. One value per session/participant. Used as the unit of analysis: each unique user_id becomes one ONA point.
cond     | string       | Group label, e.g. "AO" vs "non-AO", treatment vs control. Optional: drop if you only have one group.
turn_idx | integer      | Monotonically increasing turn number within a session. Defines temporal order. The directional arrows depend on this.
speaker  | string       | Who is speaking on this turn. Used as the mode column so ONA can tell self-talk apart from cross-actor moves.
CODE_A   | 0 / 1        | One column per code. 1 if the code applies to this turn, 0 otherwise. Repeat for every code (CODE_B, CODE_C, …).

A worked example

Imagine a teacher–student dialogue with five codes (T_QUESTION, T_FEEDBACK, T_STATEMENT, S_QUESTION, S_RESPONSE). Your CSV starts like this:

user_id,cond,turn_idx,speaker,T_QUESTION,T_FEEDBACK,T_STATEMENT,S_QUESTION,S_RESPONSE
P01,AO,1,teacher,1,0,0,0,0
P01,AO,2,student,0,0,0,0,1
P01,AO,3,teacher,0,1,0,0,0
P01,AO,4,teacher,0,0,1,0,0
P02,non-AO,1,teacher,1,0,0,0,0
P02,non-AO,2,student,0,0,0,1,0
...
Tip · Don't mix code columns and metadata

Keep all integer 0/1 code columns together at the right side of the CSV and metadata columns (id, group, time, speaker) at the left. The skill auto-discovers code columns by name, but a clean schema avoids future surprises.
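If you want to verify that discipline before modelling, a quick schema check is easy to script. The sketch below is a minimal, stdlib-only Python check (the inline sample CSV and the metadata-column names are illustrative; this is not the skill's own discovery logic):

```python
import csv
import io

# Hypothetical mini-CSV in the schema above; swap in your real file.
sample = """user_id,cond,turn_idx,speaker,CODE_A,CODE_B
P01,AO,1,teacher,1,0
P01,AO,2,student,0,1
"""

META = {"user_id", "cond", "turn_idx", "speaker"}  # assumed metadata names

def check_code_columns(rows, meta=META):
    """Return the code columns, raising if any holds a non-0/1 value."""
    code_cols = [c for c in rows[0] if c not in meta]
    for r in rows:
        for c in code_cols:
            if r[c] not in ("0", "1"):
                raise ValueError(f"{c} has non-binary value {r[c]!r}")
    return code_cols

rows = list(csv.DictReader(io.StringIO(sample)))
print(check_code_columns(rows))   # ['CODE_A', 'CODE_B']
```

Run this once per new dataset; a stray "2", "NA", or blank cell in a code column surfaces immediately instead of as a mystery downstream.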

The single-rule HOO concept

HOO stands for "horizon of observation". It tells tma which past turns are eligible to send a directional connection to the current turn. In the chapter examples you'll see a two-rule pattern with separate ground and response rules. In our experience that pattern commonly errors out on real datasets when the ground/response conditions cannot both be satisfied for every turn.

The pattern that worked most reliably in the classroom, dialogue, and log-style datasets tested for this guide is one rule:

hoo <- tma:::rules(
  user_id %in% UNIT$user_id & cond %in% UNIT$cond
)

Read this as: "for every unit (a unique user_id × cond combination), the past turns eligible to send connections are exactly those that share the same user_id and the same cond." In plain words: connections only flow within a session, never across participants.

The skill template uses this pattern by default. Don't change it unless you know exactly why.
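To see what the rule buys you, here is a toy restatement in Python (dict-based turns, illustrative names; this is not tma's implementation, just the eligibility logic in plain code):

```python
# For a given current turn, the only past turns eligible to send a
# connection are those sharing its user_id AND its cond.

def eligible_senders(history, current):
    return [t for t in history
            if t["user_id"] == current["user_id"]
            and t["cond"] == current["cond"]]

turns = [
    {"user_id": "P01", "cond": "AO",     "code": "T_QUESTION"},
    {"user_id": "P02", "cond": "non-AO", "code": "T_QUESTION"},
    {"user_id": "P01", "cond": "AO",     "code": "S_RESPONSE"},
]

# P01's third turn can only receive from P01's own history, never from P02.
print([t["code"] for t in eligible_senders(turns[:2], turns[2])])   # ['T_QUESTION']
```

Because both conditions reference the same unit, the rule can never be unsatisfiable, which is exactly why it avoids the two-rule failure mode described above.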

A worked dataframe you can run end-to-end

To make every figure in this guide reproducible, the skill ships a fully synthetic dialogue dataset at example_data/fake_dialogue_long.csv. It is not real data — transition matrices and group differences are designed by hand to produce a clean, interpretable ONA picture. Use it to verify your install before pointing the templates at your own data.

What it represents

A study with two conditions (treatment, control) and 50 participants per condition. Each participant produces 24–60 turns. Two speakers (A and B) take turns. Five codes describe what each turn does:

Code      | Meaning (illustrative)
ASK       | The speaker asks a question of the partner.
EXPLAIN   | The speaker offers an explanation or assertion.
EXAMPLE   | The speaker provides a concrete example.
AGREE     | The speaker agrees with the partner.
ELABORATE | The speaker extends, refines, or adds nuance.

First ten rows

Read these rows like turns in a transcript. Each row is one turn. The five code columns are 0/1; exactly one is on per turn (in this synthetic data — in real data multiple codes can co-occur).

user_id, cond,      turn_idx, speaker, ASK, EXPLAIN, EXAMPLE, AGREE, ELABORATE
P001,    treatment, 1,        A,        1,   0,       0,       0,     0
P001,    treatment, 2,        B,        0,   1,       0,       0,     0
P001,    treatment, 3,        A,        0,   0,       1,       0,     0
P001,    treatment, 4,        B,        0,   0,       0,       0,     1
P001,    treatment, 5,        A,        0,   1,       0,       0,     0
P001,    treatment, 6,        B,        0,   0,       1,       0,     0
P001,    treatment, 7,        A,        0,   0,       0,       0,     1
P001,    treatment, 8,        B,        1,   0,       0,       0,     0
P001,    treatment, 9,        A,        0,   1,       0,       0,     0
P001,    treatment, 10,       B,        0,   0,       0,       1,     0

How to load it

# From R (matches the templates exactly)
long <- read.csv("path/to/fake_dialogue_long.csv",
                 stringsAsFactors = FALSE)
str(long)
# 'data.frame': 4136 obs. of 9 variables
Tip · Use this CSV as your install check

Point any of the templates (ona_template.R, ona_panels_template.R, ona_subtracted_template.R) at this CSV with CODES = c("ASK","EXPLAIN","EXAMPLE","AGREE","ELABORATE") and GROUP_COL = "cond". The figures you get back should match the four views in the gallery at the top of this guide. If they don’t, your install is the problem — not your data.

ENA in five minutes — stanzas and the sliding window

ONA is a directional descendant of Epistemic Network Analysis (ENA). To read an ONA figure properly you need three ENA concepts: the stanza, accumulation, and projection. The R packages (tma behind the scenes) implement all three; you don’t set them by hand, but you do need to recognise them.

1. The stanza — what counts as "together"

A stanza is the local window of turns within which two codes are considered to co-occur. Two codes that fire 30 turns apart shouldn’t count as connected; two codes that fire in the same minute almost certainly should. The stanza is how ENA puts a number on "almost certainly."

The most common stanza definition is a sliding window of size w. For each turn, look at the previous w−1 turns. Every code that fired in that window is a "neighbour" of the current turn’s code(s). The skill’s templates use WINDOW_SIZE <- 4: each turn looks back three turns. Bigger window = more connections counted; smaller window = stricter "happened together." There is no universal best value — it depends on your turn granularity.

2. Accumulation — turning windows into a vector

Walk through every turn of every session, applying the stanza window. For each turn, increment the co-occurrence count for every (code-in-window, code-now) pair. After the walk, each unit of analysis (one session = one user_id×cond) has a vector of edge weights, one number per directed code pair.

For 5 codes there are 5×5 = 25 directed code pairs (including self-loops), so each session ends up as a 25-dimensional vector. With 100 sessions in the synthetic dataset, accumulation produces a 100-row, 25-column matrix.
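The walk described above fits in a few lines. Here is a minimal, language-agnostic sketch in Python with just two codes (illustrative only; the real accumulation lives inside tma):

```python
from collections import Counter
from itertools import product

CODES = ["ASK", "EXPLAIN"]
W = 4  # window size: the current turn plus 3 turns of lookback

def accumulate(seq, codes=CODES, w=W):
    """One session's directional accumulation: seq lists the code fired on
    each turn; every (sender-in-window, receiver-now) pair gets a count."""
    counts = Counter()
    for i, receiver in enumerate(seq):
        for sender in seq[max(0, i - (w - 1)):i]:   # the w-1 past turns
            counts[(sender, receiver)] += 1
    # flatten to the k*k vector (rows = sender, cols = receiver)
    return [counts[(s, r)] for s, r in product(codes, codes)]

print(accumulate(["ASK", "EXPLAIN", "ASK", "EXPLAIN"]))   # [1, 3, 1, 1]
```

With 2 codes the session becomes a 4-slot vector; with 5 codes the same loop yields the 25-slot vector the text describes.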

3. Projection — SVD onto two axes

The 25-dimensional vectors are too many dimensions to look at. ENA applies singular value decomposition (SVD) to the matrix and keeps the two axes that explain the most variance — that’s SVD1 (x) and SVD2 (y) in every figure you see. Each session lands at a point in this 2-D space. The codes themselves are placed by a centroid rule: each node sits at the weighted average of the units that depend on it.

The result is what we’ve been looking at all along: codes as nodes in 2-D, sessions as small dots in the same 2-D, and edges weighted by the per-pair co-occurrence count.
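The projection step can be sketched the same way. The matrix below is a random stand-in for the 100×25 accumulation matrix (numpy assumed; tma performs the real version internally when you call model(accum)):

```python
import numpy as np

# Random stand-in for the 100-session x 25-edge accumulation matrix.
rng = np.random.default_rng(0)
X = rng.random((100, 25))

# Center, run SVD, keep the two right-singular vectors with the largest
# singular values: those become the SVD1 (x) and SVD2 (y) axes.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
points = Xc @ Vt[:2].T          # each session's (SVD1, SVD2) coordinates

print(points.shape)             # (100, 2)
print(S[:2] ** 2 / (S ** 2).sum())   # share of variance on each kept axis
```

Every dot in an ONA figure is one row of `points`; the "% variance" annotations you see on published axes come from the singular values.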

In one sentence

Stanza → window → co-occurrence count → SVD → figure

You set the codes and the window. Everything else — accumulating co-occurrences, fitting SVD, placing nodes — happens inside tma when you call model(accum).

From ENA to ONA — what changes when you add direction

ENA treats co-occurrence as symmetric: if codes X and Y appear in the same window, the pair (X,Y) gets one count. ONA treats co-occurrence as directional: it matters whether X happened before Y or after.

That single change ripples through the whole pipeline.

Step                  | ENA                                                               | ONA
Window                | Symmetric: every code in the window pairs with every other code.  | Asymmetric: only past-to-present pairs count. The "current" code is the receiver; everything in its lookback window is a sender.
Adjacency             | k(k−1)/2 unordered pairs (off-diagonal). For 5 codes: 10 weights. | k² ordered pairs. For 5 codes: 25 weights (including self-loops).
Edge in the figure    | Undirected line. Width = co-occurrence frequency.                 | Curved arrow with an arrowhead. Width = directional flow.
Node position         | Centroid of the units that include the code.                      | Same SVD projection: positions are familiar to ENA users.
What the figure shows | "Which codes tend to appear together."                            | "Which codes tend to follow which."
Best for              | Static co-occurrence patterns where order is meaningless.         | Process data where order carries the meaning: dialogue, clinical workflow, gameplay sequences.

Why direction matters — a concrete example

Imagine a tutoring session that contains both ASK and EXAMPLE. ENA can only tell you they appeared in the same stanzas often. ONA tells you something stronger:

  • If the model says ASK → EXAMPLE is heavy, the tutor offers examples in response to learner questions — reactive scaffolding.
  • If EXAMPLE → ASK is heavy instead, the tutor proactively shows examples, and questions follow — proactive scaffolding.

Both produce identical ENA figures. Only ONA distinguishes them.
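You can watch that collapse happen in code. This toy Python sketch uses the same windowing as the accumulation step (illustrative sequences, not the real dataset):

```python
from collections import Counter

def directed_pairs(seq, w=4):
    """Directed (past -> present) pair counts within a lookback window."""
    c = Counter()
    for i, now in enumerate(seq):
        for past in seq[max(0, i - (w - 1)):i]:
            c[(past, now)] += 1
    return c

reactive  = ["ASK", "EXAMPLE"] * 3   # examples offered after questions
proactive = ["EXAMPLE", "ASK"] * 3   # examples offered first, questions follow

r, p = directed_pairs(reactive), directed_pairs(proactive)

# Undirected totals match, so ENA sees the same connection strength...
assert r[("ASK", "EXAMPLE")] + r[("EXAMPLE", "ASK")] \
    == p[("ASK", "EXAMPLE")] + p[("EXAMPLE", "ASK")]

# ...but ONA keeps the asymmetry:
print(r[("ASK", "EXAMPLE")], r[("EXAMPLE", "ASK")])   # 5 3
print(p[("ASK", "EXAMPLE")], p[("EXAMPLE", "ASK")])   # 3 5
```

Summing the two directed counts recovers the ENA weight, which is why the two scaffolding styles are indistinguishable without direction.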

Why the same projection works for both

ONA inherits ENA's SVD-on-the-vector machinery: each session is still a long vector of edge weights, just longer, because direction expands the k(k−1)/2 undirected slots into k² directed ones (10 → 25 for five codes). The 2-D projection still lays codes out by centroid pull. So if you've read an ENA figure before, you already know how to read the layout. The only new visual element is the arrowhead.

Practical note

Direction is computed inside tma whenever you pass mode.column = "speaker" and a numeric time.column. The same CSV that drives an ENA model can drive an ONA model — you just call model() on the directional accumulator. The skill’s templates already do this for you.

When to choose which

Question you're asking                             | Use
"Which constructs co-occur in this corpus?"        | ENA
"Which moves trigger which other moves?"           | ONA
"Did the treatment shift the order of operations?" | ONA
"Are these two corpora structurally similar?"      | ENA (or ONA: both work)
"Who initiates and who responds?"                  | ONA + a speaker mode column
"What pairs often appear in the same window?"      | ENA

Everything in the rest of this guide is ONA-specific, but ENA users will recognise the bones.

An interactive ONA playground you can actually touch

This mini embed is not a full statistics engine. It is a reading tool for beginners: change the controls and watch how the same coded process looks different in combined, panels, and difference views.

[Interactive controls: Treatment / Control / Difference views · edge threshold slider]
How to use this

Start in combined view, raise the threshold until only the dominant transitions survive, then switch to difference. If one code suddenly dominates the colored arrows, that is usually the structural story you should read first.

The combined view

This view overlays both groups' edges on the same plot — different colours, same node positions, same axis. Good when you want one figure that shows both group means in the same projection.

Step 1 · Copy the template

# From PowerShell, in your project root
Copy-Item "C:\Users\<you>\.claude\skills\ona\templates\ona_template.R" .\analysis\ona_run.R

Step 2 · Edit only the PARAMS block

Open analysis\ona_run.R in any editor and adjust the top section. Everything else can stay as-is on the first run.

# ---------------------------------------------------------------------------
# PARAMS — edit these
# ---------------------------------------------------------------------------
ROOT          <- "C:/Users/you/Desktop/MyProject"
CSV           <- file.path(ROOT, "data/turns_long.csv")
FIG_DIR       <- file.path(ROOT, "analysis/figures")
OUT_BASENAME  <- "ona_combined"

CODES         <- c("T_QUESTION","T_FEEDBACK","T_STATEMENT",
                   "S_QUESTION","S_RESPONSE")
UNIT_COLS     <- c("cond","user_id")
MODE_COL      <- "speaker"
TIME_COL      <- "turn_idx"
WINDOW_SIZE   <- 4

GROUP_COL     <- "cond"
G1            <- "AO"
G2            <- "non-AO"
EDGE_COLOR_1  <- "red"
EDGE_COLOR_2  <- "blue"

Step 3 · Run R

& "C:\Program Files\R\R-4.5.2\bin\Rscript.exe" .\analysis\ona_run.R

Expected console output:

Edge paths: 30 + 30
Node sizes: 25.8, 14.4, 26.2, 36.4, 46
Built figure: shapes = 60
[saved] ona_combined.html
[saved] ona_combined.json

Step 4 · Render the PNG

python "C:\Users\<you>\.claude\skills\ona\templates\render_ona_kaleido.py" `
       .\analysis\figures\ona_combined.json `
       .\analysis\figures\ona_combined.png 1200 750

You should now see ona_combined.png next to the JSON.

Combined view: AO vs non-AO
Combined view · output: Treatment (red) and Control (blue) overlaid in the same SVD projection. Donut nodes sized by combined in-strength. Square markers = group means with 95% CI crosshairs.
Tip · Single-network mode

If your dataset has no group split, set GROUP_COL <- NULL and the template renders a single condition-agnostic mean network. The points and group-mean traces are skipped automatically.

Side-by-side panels

When the combined overlay gets visually crowded — or when you need a fair group comparison for a figure caption — use the panels template. It draws two ONA networks side-by-side, with three things forced equal across panels:

Equal across panels

Same node positions

Both panels use the same SVD rotation. A code that sits at the upper-left in panel 1 sits at the upper-left in panel 2.

Equal across panels

Same edge-width scale

scale_edges_from = c(0, max(c(w_G1, w_G2))) — the thickest possible arrow in either panel maps to the same px width.

Equal across panels

Same node-size scale

In-strength is summed across both groups (node_total = in1 + in2) so a node's diameter is comparable.

Different across panels

Edge colour + group mean

Panel 1 uses EDGE_COLOR_1, panel 2 uses EDGE_COLOR_2. Each panel shows only its own group's points and mean ± CI.

Run it

Copy-Item "C:\Users\<you>\.claude\skills\ona\templates\ona_panels_template.R" .\analysis\ona_panels_run.R
# edit PARAMS the same way as the combined template, then:
& "C:\Program Files\R\R-4.5.2\bin\Rscript.exe" .\analysis\ona_panels_run.R
python "C:\Users\<you>\.claude\skills\ona\templates\render_ona_kaleido.py" `
       .\analysis\figures\ona_panels.json .\analysis\figures\ona_panels.png 1800 750

Note the wider canvas — 1800×750 instead of 1200×750 — to fit two panels.

Side-by-side panels: AO and non-AO
Panels view · output: Treatment (left, red) and Control (right, blue) on identical axes, edge-width scale, and node-size scale.

Subtracted (difference) network

The subtracted network is the canonical ONA group-comparison view. It encodes one number per edge: wG1 − wG2. Edges that are more frequent in G1 draw in EDGE_COLOR_1; edges that are more frequent in G2 draw in EDGE_COLOR_2. Edge width = absolute difference, so faint edges mean "the two groups behave nearly the same on this connection."

This is what you reach for when you want a single figure that answers: "how do the groups differ in dialogue structure?" The combined view shows both groups overlapping; the panels show each group cleanly; the subtracted view shows the shift.

Run it

Copy-Item "C:\Users\<you>\.claude\skills\ona\templates\ona_subtracted_template.R" .\analysis\ona_sub_run.R
# edit the same PARAMS as the other templates, then:
pwsh "C:\Users\<you>\.claude\skills\ona\templates\run_ona.ps1" `
     -RScript .\analysis\ona_sub_run.R `
     -OutDir .\analysis\figures `
     -OutBaseName ona_subtracted

Expected console output (these are real numbers from a teacher-simulation dataset):

# Diff range tells you how big the per-edge differences are
Diff range: -0.2528 0.2407  | AO-favored: 12  | non-AO-favored: 13
Built subtracted network: shapes = 33
How to read this output

"AO-favored: 12 / non-AO-favored: 13"

Of the 25 directional edges in the model, 12 are stronger in the AO group and 13 are stronger in the non-AO group. The Diff range ±0.25 tells you the largest single-edge gap, which calibrates the colour saturation in the figure.
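The arithmetic behind that console line is one subtraction per edge. A minimal Python sketch with made-up weight vectors (four edges instead of 25, purely illustrative):

```python
# Subtracted-network arithmetic: one number per directed edge, w_G1 - w_G2.
w_g1 = [0.30, 0.10, 0.00, 0.25]
w_g2 = [0.05, 0.22, 0.04, 0.25]

diff = [a - b for a, b in zip(w_g1, w_g2)]
g1_favored = sum(d > 0 for d in diff)   # edges drawn in G1's colour
g2_favored = sum(d < 0 for d in diff)   # edges drawn in G2's colour

print(f"Diff range: {min(diff):.4f} {max(diff):.4f}"
      f"  | G1-favored: {g1_favored}  | G2-favored: {g2_favored}")
```

Edges with a near-zero difference (the last slot here) are the faint ones in the figure: the groups agree on that connection.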

Subtracted (difference) network
Subtracted view · output: Edges encode Treatment − Control. Red strokes mark connections stronger under Treatment; blue strokes mark connections stronger under Control. Faint edges = the two groups behave nearly the same.
Heads-up · Renderer fallback

Some subtracted networks trigger a plotly.js error in kaleido due to near-zero edge geometries. The skill's render_ona_kaleido.py falls back to a Playwright screenshot of the saved HTML when this happens. The PNG you get is identical visually.

Per-function (per-actor) sub-networks

Once you have a working ONA model, you often want to ask narrower questions: "what happens if I look only at teacher-initiated moves?" or "what does the network look like for student responses only?" These are functional decompositions — slices of the same fitted model, redrawn with a subset of edges.

There are two ways to slice. Pick the one that fits your question.

Option A · Slice the edge matrix

"Show me edges where the SENDER is a teacher code"

Keep the same fitted model. Filter the directed adjacency matrix by row prefix (sender) or column prefix (receiver). Use the result as the weights vector for a fresh edge_paths call.

Best when your codes are already grouped by actor (e.g. T_*, S_*) and you just want to project a sub-graph.

Option B · Refit on a subset

"Refit ONA using only teacher-spoken turns"

Filter the long-format CSV upstream (long[long$speaker == "teacher", ]), then rerun the whole pipeline. You get fresh node positions, fresh SVD axes, fresh per-unit points.

Best when the actor's behaviour is qualitatively different and you want a separate model — but you lose comparability with the full network.

Option A · Code recipe (most common)

This snippet adds a "teacher-initiated only" sub-network to the standard pipeline. Drop it into your ona_run.R after the model is fit.

# Re-square the per-group weights into directed adjacency matrices
sq_ao <- ona:::to_square(ao_wts)   # rows = sender, cols = receiver
rownames(sq_ao) <- colnames(sq_ao) <- as.character(nodes_dt$code)

# Mask: keep only rows whose sender starts with "T_"
teacher_rows <- grepl("^T_", rownames(sq_ao))
sq_ao_T <- sq_ao
sq_ao_T[!teacher_rows, ] <- 0   # zero out non-teacher senders

# Convert back to a flat weight vector in the same order ona expects
ao_wts_T <- as.numeric(t(sq_ao_T))     # row-major flatten

# Build edge paths for this sub-network
em_T <- ona:::create_edge_matrix(NULL, weights = ao_wts_T,
                                 nodes = nodes_dt, direction = 1)
ep_T <- ona:::edge_paths(
  em_T,
  edge_color = "darkgreen",
  edge_size_multiplier = 0.4,
  scale_edges_to = c(0, 1.5),
  scale_edges_from = c(0, max(ao_wts_T))
  # ... remaining params exactly as in the main figure's edge_paths() call
)

Then add ep_T to your shapes_combined list. The result: a figure where only teacher-initiated arrows are visible, painted in green, against the same node positions as the main figure.

Common functional cuts to consider

Question                     | Slice                               | Mask used
Teacher-initiated moves only | Sender = T_*                        | grepl("^T_", rownames(sq))
Student-initiated moves only | Sender = S_*                        | grepl("^S_", rownames(sq))
Cross-actor moves            | T→S or S→T                          | Mask out same-prefix sender↔receiver pairs
Reciprocal exchanges         | X→Y AND Y→X both present            | (sq > 0) & (t(sq) > 0)
Sink-only edges              | Receiver = highest in-strength code | Keep only the column matching the top-in-strength code
Self-loops                   | Sender = Receiver                   | diag(sq)
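The cross-actor cut is the one row above without an R one-liner, so here is the masking idea as a numpy sketch (hypothetical codes and weights; the same logic drops into the Option A recipe in R):

```python
import numpy as np

# Hypothetical 4-code directed adjacency (rows = sender, cols = receiver).
codes = ["T_QUESTION", "T_FEEDBACK", "S_QUESTION", "S_RESPONSE"]
sq = np.arange(16, dtype=float).reshape(4, 4)

prefix = np.array([c.split("_")[0] for c in codes])   # 'T' or 'S' per code
cross = prefix[:, None] != prefix[None, :]            # True where actors differ

sq_cross = np.where(cross, sq, 0.0)   # keep only cross-actor moves
print(sq_cross)
```

The broadcasted comparison zeroes both the T→T and S→S blocks in one step, leaving only the T↔S exchanges that a cross-actor figure should draw.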
Tip · Always re-anchor on the full model

When you slice (Option A), keep the node positions, axis range, and per-unit points from the full fitted model. That way the sub-network is interpretable in the same projection — readers can compare a teacher-only figure against the all-actors figure code-for-code.

The one-shot driver

If you don't want to remember the Rscript path or the kaleido install line every time, use run_ona.ps1. It auto-detects an R 4.4+ install, ensures Python deps, runs your R script, then renders the PNG.

pwsh "C:\Users\<you>\.claude\skills\ona\templates\run_ona.ps1" `
     -RScript ".\analysis\ona_run.R" `
     -OutDir ".\analysis\figures" `
     -OutBaseName "ona_combined" `
     -Width 1200 -Height 750

One line. The driver prints what it does:

[ona] Rscript: C:\Program Files\R\R-4.5.2\bin\Rscript.exe
[ona] running R: .\analysis\ona_run.R
Edge paths: 30 + 30
Node sizes: 25.8, 14.4, 26.2, 36.4, 46
Built figure: shapes = 60
[ona] rendering PNG via kaleido ...
[saved] .\analysis\figures\ona_combined.png
[ona] DONE: .\analysis\figures\ona_combined.png

For panels just point -RScript at the panels run-file and bump -Width to 1800.

The tuning knobs you can change

The defaults are chosen to match the chapter Fig 4 aesthetic. If you want something different, here is what each knob does and what range stays sensible.

Knob              | Default   | What it controls
EDGE_SIZE_MULT    | 0.4       | Multiplier inside edge_paths. Bigger → fatter arrows overall. Range 0.2–0.8.
EDGE_SCALE_TO     | c(0, 1.5) | Output range for edge widths. Bumping the upper bound makes the strongest edge thicker without touching the weakest.
EDGE_LINE_MULT    | 2.5       | Extra multiplier applied to line.width when shapes are emitted to plotly. Useful if you want bolder strokes without changing the underlying edge geometry.
NODE_SIZE_MIN     | 12        | Smallest possible node diameter (in px). The least-connected node draws at this size.
NODE_SIZE_RANGE   | 34        | Span above the minimum. Maximum node = MIN + RANGE = 46 px by default.
DONUT_INNER_RATIO | 0.55      | Inner white circle relative to outer. Smaller → thicker ring.
LABEL_FONT_PX     | 17        | Code-label font size in px (serif). Bump for poster-size figures.

All seven of these live in the PARAMS block at the top of each template. You don't have to touch the rest of the script.

[Donut-node legend: outer ring (black) · inner circle (white) · edge colour (red / blue / your choice)]

Pitfalls & how to recover

Any of these can happen on a fresh setup. Each one has a fast diagnostic and a one-line fix.

Symptom

The PNG is empty (axes only, no arrows, no nodes)

Cause: You called ona::nodes() / ona::edges() somewhere. Those wrappers fail silently.
Fix: Rerun using one of the skill templates as your starting point. The templates never call those wrappers.

Symptom

module 'kaleido' has no attribute 'scopes'

Cause: kaleido 0.2.1 is installed.
Fix: pip install -U kaleido until the version reports as 1.x.

Symptom

R errors: ENA_DIRECTION not found

Cause: set$line.weights was indexed as a plain data.frame with [df$col == "X", ]. That syntax strips the ena.line.weights class.
Fix: Use the template's pattern — extract the numeric matrix via as.matrix() and aggregate with colMeans().

Symptom

All edges are zero / no arrows render

Cause: The HOO rule excluded every turn (no co-occurrences passed the filter).
Fix: In R, print nrow(flat) and range(ao_wts). If the matrix is empty, your UNIT_COLS or cond filter is too strict — relax it.

Symptom

Nodes hide behind the edges

Cause: layer = "below" got dropped from the shape spec, or marker sizemode defaulted to "area" and shrunk the donut hole.
Fix: Restore both lines from the template; do not edit them.

Symptom

Rscript: command not found

Cause: Rscript is not on PATH on Windows by default.
Fix: Use the absolute path "C:\Program Files\R\R-4.5.2\bin\Rscript.exe", or use run_ona.ps1 which auto-detects.

Network metrics — what numbers can you pull out?

The figure is one expression of the model. The same model carries node-level and network-level summary statistics you can drop into a Results table or a regression model. The skill ships templates/ona_metrics.R which produces all of them.

Node-level metrics — per code, per group

For each code, you get four numbers per group plus two differences:

Metric       | R               | What it tells you
in-strength  | colSums(sq)     | How much directional weight the code receives. The "sink" of the field.
out-strength | rowSums(sq)     | How much directional weight the code initiates. The "source" of the field.
total degree | in + out        | Combined participation. Drives the donut node size in the figure.
in_diff      | in_G1 − in_G2   | Where the receiving load shifted across groups.
out_diff     | out_G1 − out_G2 | Where the sending load shifted across groups.
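The first three rows of the table reduce to column and row sums over the squared adjacency matrix. A toy numpy sketch with three hypothetical codes (stand-ins for the R colSums/rowSums calls, not output from ona_metrics.R):

```python
import numpy as np

# Toy 3-code directed adjacency (rows = sender, cols = receiver).
sq = np.array([[0.0, 2.0, 1.0],
               [1.0, 0.0, 3.0],
               [0.0, 1.0, 0.0]])

in_strength  = sq.sum(axis=0)   # colSums: weight each code receives
out_strength = sq.sum(axis=1)   # rowSums: weight each code initiates
total_degree = in_strength + out_strength

print(in_strength)    # [1. 3. 4.]
print(out_strength)   # [3. 4. 1.]
```

Compute these per group, subtract, and you have the in_diff/out_diff columns of the table.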

Real example from a teacher–student dialogue dataset (AO = treatment, non-AO = control):

          in_T   out_T  in_C   out_C  in_diff  out_diff
ASK       0.597  0.633  1.147  1.174  −0.549  −0.541
EXPLAIN   0.919  0.917  1.154  1.133  −0.235  −0.216
EXAMPLE   0.896  0.879  0.381  0.359  +0.515  +0.520
AGREE     0.580  0.587  0.672  0.674  −0.092  −0.088
ELABORATE 1.216  1.192  0.476  0.488  +0.740  +0.704

How to read this table in plain words:

  • ELABORATE is the biggest treatment-favoured sink (in +0.74). Treatment dialogues route a lot more flow into elaboration moves.
  • EXAMPLE jumps the same direction (+0.52 in / +0.52 out). Together with ELABORATE this is a structural fingerprint of the treatment: explain → give an example → elaborate.
  • ASK moves the opposite way (−0.55 in / −0.54 out). Control dialogues lean more on question-asking; treatment displaces that load onto example-and-elaborate sequences.

Network-level metrics — per group

Metric                 | What it captures
Active edges           | Count of edges with weight > 0. Maximum is k² for k codes (here 25 for 5 codes).
Total mass             | sum(weights): the cumulative co-occurrence pressure in the network. Compare relative mass across groups; absolute values depend on units.
Density                | Fraction of possible edges with non-zero weight. Density = 1 means every code reaches every other code at some point.
Mean SVD1 / SVD2       | Where the group's centroid sits in the projection. Drives the square-marker means in the figure.
Between-means distance | Euclidean distance between the two group means. The headline "how far apart are these groups?" number.

From the same dataset:

             metric    Treatment    Control
       Active edges       25           25
         Total mass        4.21         3.83
            Density        1.00         1.00
          Mean SVD1       −0.348       +0.348
          Mean SVD2       −0.024       +0.024
 Between-means dist        0.698

Both networks are equally dense and carry similar total mass — the difference is where the mass goes, not how much. That's a typical ONA finding: the structure shifts even though the volume stays comparable.
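Under the hood, every row of that table is a one-liner over the weight vector or the group centroids. A Python sketch, using a made-up 3-code weight matrix but reusing the mean coordinates reported above, so the between-means distance reproduces the 0.698 in the table:

```python
import math

# Toy directed weight matrix for one group (invented values).
sq = [
    [0.0, 0.4, 0.1],
    [0.2, 0.0, 0.6],
    [0.1, 0.3, 0.0],
]
k = len(sq)

weights      = [w for row in sq for w in row]
active_edges = sum(1 for w in weights if w > 0)   # out of k*k possible
total_mass   = sum(weights)
density      = active_edges / (k * k)

# Between-means distance from the two group centroids in (SVD1, SVD2),
# taken from the table above.
m1, m2 = (-0.348, -0.024), (0.348, 0.024)
dist = math.dist(m1, m2)                          # ≈ 0.698, as reported

print(active_edges, total_mass, round(density, 2), round(dist, 3))
```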

Tip · Use these in a regression

The per-unit SVD1/SVD2 coordinates and the per-unit in/out-strength of any code are valid outcome variables for a downstream regression — e.g. predict SVD1 position from condition + covariates. ONA gives you the structure; the regression gives you the inference.

Group comparison statistics

The figure shows where two groups land in the projection. The numbers below tell you whether that separation is reliable. Four tests cover the standard repertoire; the skill's ona_metrics.R runs all four.

Test 1 · Parametric

Welch's t-test on SVD1 / SVD2

Tests whether the per-unit mean coordinate differs between groups, allowing unequal variances. Reasonable default when units are roughly normally distributed in each group.

t.test(SVD1 ~ cond, data = pts_df)
t.test(SVD2 ~ cond, data = pts_df)
Test 2 · Non-parametric

Mann–Whitney U

Drop the normality assumption — tests whether one group's coordinates tend to be higher than the other. Use this when n is small or when residuals are skewed.

wilcox.test(SVD1 ~ cond, data = pts_df)
wilcox.test(SVD2 ~ cond, data = pts_df)
Test 3 · Effect size

Cohen's d

p-values shrink with sample size. Cohen's d gives you the magnitude of the difference in pooled-SD units. Conventional anchors: 0.2 small, 0.5 medium, 0.8 large.

# pooled-SD effect size; x and y hold the two groups' coordinates
nx <- length(x); ny <- length(y)
(mean(x) - mean(y)) /
  sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
Test 4 · Permutation

Randomization test on the between-means distance

The most defensible test for ONA: repeatedly shuffle the condition labels, recompute the between-means distance, and ask "how often does chance produce a separation as large as the one we observed?"

obs_d <- sqrt(sum((m1 - m2)^2))
perm_d <- replicate(2000, {
  s <- sample(pts_df$cond)                    # shuffle condition labels
  g <- split(pts_df[, c("SVD1", "SVD2")], s)  # regroup under the shuffle
  sqrt(sum((colMeans(g[[1]]) - colMeans(g[[2]]))^2))
})
p <- mean(perm_d >= obs_d)

Real output (B = 2000 permutations)

# GROUP COMPARISON STATISTICS (synthetic data)
 axis    t_stat    t_p     cohen_d
 SVD1   +20.076  <.0001   −4.015
 SVD2    +1.333   .1863   −0.267

# PERMUTATION TEST (B = 2000)
  Observed dist:   0.698
  Null mean (95%): 0.074 (0.012, 0.182)
  p-value:         < .0001

# Sign convention: R t.test(y ~ cond) compares first level (control)
# minus second level (treatment); with control mean > treatment mean on SVD1,
# t is positive. Cohen's d here is treatment − control, so it is negative.

Read like this:

  • SVD1 separation is enormous (|d| ≈ 4.0, far past the conventional "large effect" threshold of 0.8). With cleanly differentiated synthetic transition matrices, the test correctly detects them.
  • SVD2 separation is essentially null (|d| ≈ 0.27, p ≈ .19). The synthetic groups overlap on the secondary axis — a useful sanity check that the test isn’t trigger-happy.
  • The overall separation is far beyond chance: observed between-means distance 0.70 vs permutation null mean 0.07 (95th percentile 0.18).
Pitfall · Rotation choice changes the test

Which axis you test depends on the rotation: under a plain SVD rotation the first axis maximizes variance explained, while under a means rotation the first axis is built to separate the groups. If you change the rotation (e.g. mean-rotated vs SVD-rotated, or a custom rotation node), the t-test answer changes too. Report which rotation you used — and don't compare t-statistics across different rotations.

Interpretation framework

An ONA figure looks like a network diagram, but the right way to read it is as a structural fingerprint of how codes flow into each other. Use this five-pass framework. Each pass answers one question; do them in order.

Pass 01
Where do codes sit?
Codes near each other share co-occurrence partners. If questioning and responding end up on opposite SVD1 poles, the dialogue has a strong question→answer axis.
Pass 02
Which node is the sink?
The biggest donut is what arrows point toward. That's the dialogue's gravity. In teacher–student data, it's almost always the student response.
Pass 03 · Core
Which arrows are heavier where?
In the combined view, look at edge thickness by colour. In the subtracted view, follow the saturation. Big asymmetric edges are where the treatment shifted the structure.
Pass 04
Where do the means land?
The square markers tell you each group's "average position". CI crosshairs that don't overlap = the groups occupy different structural regimes.
Pass 05
What does this mean theoretically?
Translate structure into mechanism. A heavier T_QUESTION→S_RESPONSE arrow under treatment is consistent with stronger scaffolding — but ONA itself is descriptive.

A worked interpretation

Using the same teacher-simulation data shown above, here is how a Results paragraph might read after walking the framework:

"The ONA mean networks show that the Treatment and Control sessions occupy structurally different regions of the projection (between-means distance = 0.70, permutation p < .001). The separation is concentrated on the first SVD axis (Cohen’s d = −4.02) rather than the second (d = −0.27). Inspecting node-level in-strength differences, the contrast is driven by ELABORATE (+0.74), EXAMPLE (+0.52), and ASK (−0.55): under the treatment, dialogues route flow into example→elaborate sequences while question-asking declines. The subtracted network confirms that EXPLAIN → EXAMPLE and EXAMPLE → ELABORATE arrows are systematically thicker in the Treatment group, while ASK-initiated arrows dominate the Control group. We interpret this as a shift in dialogue structure — not as evidence of a causal effect on outcomes, which would require the study design to support it. (Numbers above are from the synthetic dataset bundled with the skill; replace with your own.)"

When to be cautious

Symptom → risk, then what to do:

  • One group has < 15 units → means are unstable. Bootstrap the means or report rank-based tests only.
  • The two means' CIs overlap heavily → the visual separation may be illusory. Lead with the permutation p, not the eyeballed gap.
  • SVD1 separation is large but SVD2 is null → real, but interpret only on the axis that separates. Don't claim the groups differ "everywhere" — they differ on one axis.
  • One group dominates total mass → the other group has more zero-weight edges, so density looks artificially low. Report active-edge counts side-by-side; consider matching session lengths.
  • Codes are extremely sparse (mostly 0) → the window-based co-occurrence is noisy. Increase the window size or merge low-frequency codes upstream.
  • Groups differ on something the design didn't randomize → whatever you observe might be confounded. Stay descriptive in the abstract. Causal claims require a causal design.
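The sparse-codes symptom is easier to reason about with the accumulation rule in front of you: each coded line is paired with every code that appeared in the trailing window before it, so a larger window accumulates more (and noisier) directed counts. This is a simplified illustration, not the ona package's exact accumulation, which also handles units, ground truth, and normalization:

```python
from itertools import product

def directed_cooccurrence(events, codes, window=3):
    """Count, for each ordered pair (a, b), how often a code-a line
    appears in the trailing window before a code-b line.
    Simplified: binary codes, one unit, no normalization."""
    counts = {pair: 0 for pair in product(codes, repeat=2)}
    for t, current in enumerate(events):
        lookback = events[max(0, t - window):t]   # lines preceding t
        for prior in lookback:
            for a, b in product(codes, repeat=2):
                if prior.get(a) and current.get(b):
                    counts[(a, b)] += 1
    return counts

events = [  # toy coded dialogue, one dict per line (invented)
    {"ASK": 1}, {"EXPLAIN": 1}, {"EXAMPLE": 1}, {"ELABORATE": 1},
]
c = directed_cooccurrence(events, ["ASK", "EXPLAIN", "EXAMPLE", "ELABORATE"])
print(c[("EXPLAIN", "ELABORATE")])   # → 1
```

Doubling the window adds more prior lines per pairing, which is why widening it lifts sparse codes out of the noise floor but also inflates weak connections.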
Universal caveat

ONA is a descriptive method. A well-separated network tells you the groups look structurally different. It does not tell you why, and it does not license a causal claim unless your design supports one independently.

Cheat sheet

First-time setup
install.packages(c(
  "ona","tma","plotly",
  "htmlwidgets","data.table"
))
# in a terminal
pip install -U plotly kaleido
One-shot run
pwsh run_ona.ps1 ^
  -RScript .\ona_run.R ^
  -OutDir .\figures ^
  -OutBaseName ona_combined
Manual two-step
& "C:\Program Files\R\R-4.5.2\bin\Rscript.exe" ona_run.R
python render_ona_kaleido.py ^
       figures\ona_combined.json ^
       figures\ona_combined.png 1200 750
Required CSV columns
user_id, # unit identity
cond,    # group (optional)
turn_idx,# numeric time
speaker, # mode
CODE_A, CODE_B, ...   # 0/1 ints
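For reference, a minimal long-format file matching those columns might look like this (synthetic rows, invented IDs; one 0/1 column per code):

```
user_id,cond,turn_idx,speaker,ASK,EXPLAIN,EXAMPLE,AGREE,ELABORATE
s01,treatment,1,teacher,0,1,0,0,0
s01,treatment,2,student,0,0,1,0,0
s01,treatment,3,student,0,0,0,0,1
s02,control,1,teacher,1,0,0,0,0
s02,control,2,student,0,1,0,0,1
```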

Where each file lives

SKILL.md                             The skill's dispatch doc — read by Claude / Codex when triggered.
templates/ona_template.R             Combined-view R script. Copy + edit PARAMS.
templates/ona_panels_template.R      Side-by-side panels with shared edge/size/position scales.
templates/ona_subtracted_template.R  Group-difference (subtracted) network. Red = G1-favored, blue = G2-favored.
templates/ona_metrics.R              Node + network metrics + group comparison statistics + permutation test → CSVs.
templates/render_ona_kaleido.py      kaleido 1.x PNG renderer. No edits needed.
templates/run_ona.ps1                One-shot driver: env detect + R + render. No edits needed.
GUIDE.html                           This document.
You're done

If everything ran cleanly, you now have a publication-quality ONA figure. Open the HTML version too — it's interactive in a browser.

Fig 9

References — cite the originals

This guide stands on the work of the Epistemic Analytics group at the University of Wisconsin–Madison and contributors to the open-source ona, tma, and rENA R packages. If you publish work that uses these tools, please cite the canonical sources below — not this guide.

Methodological foundations

  • ONA — Ordered Network Analysis. Tan, Y., Swiecki, Z., Ruis, A. R., & Shaffer, D. W. (2024). Ordered network analysis. In B. Wasson & S. Zörgő (Eds.), Advances in quantitative ethnography. (Chapter / book series.) Introduces ONA as a directional extension of ENA; the figure conventions reproduced here come from this chapter. Verify exact bibliographic details against the published volume before citing.
  • ENA — Epistemic Network Analysis. Shaffer, D. W., Collier, W., & Ruis, A. R. (2016). A tutorial on epistemic network analysis: Analyzing the structure of connections in cognitive, social, and interaction data. Journal of Learning Analytics, 3(3), 9–45. https://doi.org/10.18608/jla.2016.33.3
  • Quantitative ethnography (book). Shaffer, D. W. (2017). Quantitative ethnography. Cathcart Press. The methodological framing under which ENA and ONA sit.
  • Transmodal analysis (TMA). The accumulation pipeline behind tma generalises ENA to multimodal and directional data. Please consult the package author list, accompanying papers, and the CITATION file when citing the method.

Software

  • ona R package. Tan, Y., Marquart, C. L., Ruis, A. R., Cai, Z., Knowles, M. A., Eagan, B., & Shaffer, D. W. ona: Ordered Network Analysis. CRAN. cran.r-project.org/package=ona — in R, run citation("ona") to get the version-specific citation block.
  • tma R package. tma: Transmodal Analysis. CRAN. cran.r-project.org/package=tma — in R, run citation("tma").
  • rENA R package. Marquart, C. L., Swiecki, Z., Collier, W., Eagan, B., Woodward, R., & Shaffer, D. W. rENA: Epistemic Network Analysis. CRAN. cran.r-project.org/package=rENA — in R, run citation("rENA").
  • Epistemic Network Analysis (web tool). epistemicnetwork.org — the maintained web application from the Epistemic Analytics group.
  • Plotly & Kaleido. Used here purely for figure rendering. plotly.com · github.com/plotly/Kaleido.

Implementation notes this guide leans on

  • The LAK24 teacher-practices repository by A. Hussain (Wisc.) is a useful real-world example of ona/tma usage and inspired the data.table-based subsetting pattern reused here.
  • The figure aesthetic borrows the knowledge-arch-blueprint theme tokens from nexu-io/open-design (MIT licensed) — cream paper background, rust-red highlights, blueprint grid.

Why this guide helps in agentic workflows

The chapter, articles, and package docs above are the right places to learn the method. This guide exists because agentic execution needs a different layer of support: explicit file contracts, fixed parameter blocks, reproducible defaults, predictable output names, and troubleshooting rules that prevent silent failure. In other words, the canon explains what ONA means; this guide explains how to get a trustworthy ONA artifact out of a real machine and hand it off to an AI workflow without ambiguity.

Downloadable starter files

Starter pack

Full sanitized workflow bundle

Download the ONA agentic starter pack (.zip) — includes the sanitized skill example, core templates, renderer, driver, synthetic dataset, generator script, and example output PNGs.

Skill example

No personal information

Download a public-safe ONA skill example (.md) that shows the trigger rules, workflow logic, and operational defaults without any personal notes or private project content.

Example data

Synthetic long-format CSV

Download the synthetic ONA dialogue dataset (.csv) or download the generator script (.py) if you want to inspect or regenerate the example data.

Batch example

Render all examples

Download the batch render script (.R) if you want one file that reproduces the example outputs against the bundled synthetic data.

How to cite this guide (optional)

If you must cite this on-ramp itself (e.g., for a tutorial or class handout), use:

Educatian. (2026). ONA in Practice — reproducible ONA workflows for researchers and agents
   [Open guide]. https://educatian.github.io/ona/

But for any methodological claim, defer to the original sources above.

Important

This is a community on-ramp, not the canonical source.

Numbers in this guide come from a synthetic dataset bundled here for demonstration. Wording is summarised in our own words to be beginner-accessible. Anything you find in conflict with the original publications or the package manuals: trust the originals, and please open an issue / send a correction.