SEO A/B Test Analyzer

Analyze and validate your SEO changes with SEO A/B testing based on the Causal Impact approach.

  1. Upload data
  2. Run analysis
  3. Review results

Input data

Upload a CSV with columns: date, variant, control.

Q&A

What does this tool do?

It estimates the causal impact of an SEO change by comparing observed variant performance to a modelled counterfactual built from historical variant and control data.
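As a rough illustration of the counterfactual idea (not the tool's actual model, which follows the Causal Impact approach), one could fit the pre-period relationship between control and variant and project it into the post-period; the function name and data below are hypothetical:

```python
import numpy as np

def counterfactual(control_pre, variant_pre, control_post):
    """Fit variant ~ a*control + b on pre-period data (ordinary least
    squares), then predict what the variant would have looked like in
    the post-period had no change been made."""
    a, b = np.polyfit(control_pre, variant_pre, 1)
    return a * np.array(control_post) + b

# Pre-period: variant tracks control closely (roughly 2x control).
control_pre = [100, 110, 105, 120, 115]
variant_pre = [200, 221, 209, 241, 229]
predicted_post = counterfactual(control_pre, variant_pre, [118, 125])
# Compare predicted_post against the observed post-period variant
# values to estimate the effect of the change.
```

The actual model also quantifies uncertainty around this baseline, which a plain regression sketch does not.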

What input file do I need?

Use either CSV upload or Google Search Console mode. Both produce daily time-series data in the same structure: date, variant, control.

How does Google Search Console mode work?

Connect GSC, choose a property and metric (clicks or impressions), set the fetch date range, and paste variant/control URL lists. The tool fetches and aggregates daily totals automatically.

Can I run the analyzer with CSV only?

Yes. Upload a CSV with columns date, variant, and control. Values should be daily totals for clicks, sessions, or impressions.
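A minimal sanity check of the expected structure before uploading (a sketch assuming pandas; the example values are illustrative):

```python
import io
import pandas as pd

# Example of the expected CSV layout: one row per day, daily totals.
csv_text = """date,variant,control
2024-01-01,1200,1150
2024-01-02,1310,1240
2024-01-03,1180,1100
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Validate the structure: required columns, ordered dates, no
# duplicate days, non-negative daily totals.
assert list(df.columns) == ["date", "variant", "control"]
assert df["date"].is_monotonic_increasing
assert not df["date"].duplicated().any()
assert (df[["variant", "control"]] >= 0).all().all()
```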

Can I use impressions or sessions instead of clicks?

Yes. Keep one metric type consistent across variant and control for all dates within the same run.

How much data is recommended?

At least 100 pre-intervention days is recommended. The tool can run with less, but confidence and robustness are usually weaker.

How should I set the start date?

Choose the first day after the SEO change was launched on the variant group. Everything before the start date is the pre-period; the start date onward is the post-period.

What is Impact on Clicks?

The average relative effect during the post-period. Positive values indicate uplift versus the modelled baseline; negative values indicate decline.
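For example, the average relative effect can be computed from daily observed and predicted values (a sketch with made-up numbers, not the tool's exact estimator):

```python
observed  = [120, 130, 125, 140]   # post-period variant values
predicted = [100, 110, 105, 115]   # modelled counterfactual baseline

# Relative effect per day, then averaged over the post-period.
daily_relative = [(o - p) / p for o, p in zip(observed, predicted)]
impact = sum(daily_relative) / len(daily_relative)
# impact is about +19.7%: observed clicks ran ~20% above baseline
```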

What is Confidence Level?

Confidence Level is calculated as 1 - p_value. Higher confidence means the observed impact is less likely to be random variation.
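For example, a p-value of 0.03 corresponds to 97% confidence:

```python
p_value = 0.03                 # posterior tail-area probability
confidence = 1 - p_value       # 0.97, i.e. 97% confidence
```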

What is P value?

P value is the posterior tail-area probability from the model. Lower values are stronger evidence against a no-effect outcome.

What is Daily Effect?

The average absolute day-level difference between observed variant performance and the modelled counterfactual.

What is Cumulative Effect?

The total absolute impact over the post-period, built by summing the estimated day-level effects.
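Both Daily Effect and Cumulative Effect can be sketched from the same day-level differences (illustrative numbers only):

```python
observed  = [120, 130, 125, 140]   # post-period variant values
predicted = [100, 110, 105, 115]   # modelled counterfactual baseline

# Absolute difference per day: 20, 20, 20, 25
daily_diffs = [o - p for o, p in zip(observed, predicted)]

daily_effect      = sum(daily_diffs) / len(daily_diffs)  # average per-day lift
cumulative_effect = sum(daily_diffs)                     # total over post-period
# daily_effect -> 21.25, cumulative_effect -> 85
```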

How do I read chart 1 (actual vs predicted)?

If actual and predicted lines separate after launch and remain separated, that suggests sustained impact. The confidence band shows forecast uncertainty.

How do I read chart 2 (cumulative effect)?

An upward cumulative curve indicates accumulating positive impact; a downward curve indicates accumulating loss over time.

What does Analysis quality summarize?

It combines effect size, confidence, interval behavior, fit quality, and robustness checks into one decision-oriented quality view.

How is Reliability Score calculated?

It is a composite score from three components: Statistical evidence, Precision and stability, and Robustness.

What is Statistical evidence?

This reflects the strength of the causal signal using confidence, whether the effect interval clearly excludes zero, and interval tightness.

What is Precision and stability?

This measures how tight and consistent estimates are, using pre-period fit quality and post-period uncertainty behavior.

What is Robustness?

This checks whether conclusions remain stable under placebo-date runs and sensitivity runs with different model assumptions.

What is 95% interval width?

This is the width of the relative-effect uncertainty interval. Narrower intervals generally indicate more precise estimates.

What is Pre-period coverage?

Pre-period coverage shows how often actual pre-period values fall inside the model's uncertainty bounds. Values around 95% are typically healthy.
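Coverage is just the fraction of pre-period points that land inside their bounds (a sketch with illustrative numbers):

```python
# Pre-period actuals and the model's 95% uncertainty bounds.
actual = [100, 105,  98, 110, 103]
lower  = [ 95, 100,  99, 104,  98]
upper  = [108, 112, 106, 115, 110]

inside = [lo <= a <= hi for a, lo, hi in zip(actual, lower, upper)]
coverage = sum(inside) / len(inside)
# The third point (98) falls below its lower bound -> coverage = 0.8
```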

What is in Advanced diagnostics?

Advanced diagnostics is the technical validation layer behind Reliability Score. It includes:

  Residual autocorr (lag1): leftover error pattern
  Placebo false-positive rate: how often fake launch dates still look significant
  Sensitivity spread: effect variation across model settings
  Sign consistency: whether uplift/decline direction stays stable

Use this section to confirm the result is not fragile.
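As an example of one such diagnostic, lag-1 residual autocorrelation can be computed like this (a sketch; the tool's exact implementation may differ):

```python
def lag1_autocorr(residuals):
    """Correlation between each residual and the previous day's
    residual. Values near 0 suggest the model captured the series'
    structure; large values mean leftover pattern in the errors."""
    n = len(residuals)
    mean = sum(residuals) / n
    num = sum((residuals[t] - mean) * (residuals[t - 1] - mean)
              for t in range(1, n))
    den = sum((r - mean) ** 2 for r in residuals)
    return num / den

# Alternating residuals show strong negative autocorrelation (-5/6),
# a sign of systematic leftover pattern rather than random noise.
score = lag1_autocorr([1, -1, 1, -1, 1, -1])
```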

What do GO / HOLD / NO-GO mean?

GO: strong positive and reliable signal. HOLD: mixed or fragile evidence. NO-GO: reliably negative impact.

Common reasons for unreliable output?

Short pre-period, unstable control series, major concurrent external events, missing dates, or inconsistent aggregation logic between variant and control.

Legacy version

Prefer the original Streamlit experience? Use the previous analyzer app.

Open legacy SEO A/B Test Analyzer