Module 2.3: Choosing the Right Discovery Method

Duration: 20 min
Prerequisites: Modules 2.1 and 2.2

What You'll Learn

  1. When to use PCMCI vs PCMCIplus vs LPCMCI
  2. Understanding method assumptions
  3. What the different output graphs mean
  4. Practical examples of each method

Key Concepts Explained

What are "same-time effects"?

Sometimes cause and effect happen so fast that they appear simultaneous in your data:

  • You measure temperature and energy every hour
  • But turning on AC (cause) affects energy (effect) within seconds
  • In your hourly data, both changes appear in the same hour: a "same-time" (contemporaneous) effect
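
The effect of coarse sampling can be sketched with a small simulation. This is only an illustration under assumed numbers: a made-up AC signal drives energy use one minute later, and both series are then averaged into 60-minute blocks. The helper names (pearson, lag_corr, block_means) and all coefficients are hypothetical and not part of any library.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def lag_corr(cause, effect, lag):
    """Correlation between cause(t - lag) and effect(t)."""
    if lag == 0:
        return pearson(cause, effect)
    return pearson(cause[:-lag], effect[lag:])

def block_means(series, width):
    """Average consecutive non-overlapping blocks (e.g. minutes -> hours)."""
    return [sum(series[i:i + width]) / width
            for i in range(0, len(series) - width + 1, width)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n_minutes = 60_000

# Assumed toy process: energy at minute t depends on AC at minute t-1
ac = [rng.gauss(0, 1) for _ in range(n_minutes)]
energy = [0.0] + [0.9 * ac[t - 1] + 0.3 * rng.gauss(0, 1)
                  for t in range(1, n_minutes)]

# Minute-level data: the dependence shows up at lag 1, not lag 0
fine_lag0 = lag_corr(ac, energy, 0)
fine_lag1 = lag_corr(ac, energy, 1)

# Hourly averages: the same dependence now shows up at lag 0
ac_hourly = block_means(ac, 60)
energy_hourly = block_means(energy, 60)
coarse_lag0 = lag_corr(ac_hourly, energy_hourly, 0)
coarse_lag1 = lag_corr(ac_hourly, energy_hourly, 1)

print(f"minute data: lag0={fine_lag0:.2f}, lag1={fine_lag1:.2f}")
print(f"hourly data: lag0={coarse_lag0:.2f}, lag1={coarse_lag1:.2f}")
```

In this toy setup the dependence is lagged at minute resolution, but after hourly averaging almost all of it sits at lag 0. A lagged-only method would have nothing to find in the hourly data.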

What are "hidden confounders"?

Variables you didn't measure that affect multiple things you did measure:

  • You measure ice cream sales and drowning
  • You didn't measure temperature (the hidden confounder)
  • Hot weather raises both, so the two appear linked even though neither causes the other
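
A short simulation can make the confounding pattern concrete. The sketch below uses invented numbers: temperature drives both ice cream sales and drownings, and neither affects the other. The helpers (pearson, residuals) and all coefficients are assumptions made for illustration only.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing a least-squares line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 5000

# Assumed toy model: temperature is the only driver of both variables
temperature = [rng.gauss(25, 5) for _ in range(n)]
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
drownings = [0.5 * t + rng.gauss(0, 2) for t in temperature]

# Without temperature, the two variables look strongly related
raw_corr = pearson(ice_cream, drownings)

# Once temperature is accounted for, the relationship disappears
partial_corr = pearson(residuals(ice_cream, temperature),
                       residuals(drownings, temperature))

print(f"raw correlation:     {raw_corr:.2f}")
print(f"given temperature:   {partial_corr:.2f}")
```

If temperature is not in your dataset, the second check is impossible, and a method that assumes no hidden confounders may report a causal link between the two measured variables.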

Method Overview

Method      Same-time effects?   Hidden confounders?   Best for
PCMCI       No                   No                    Simple, lagged-only
PCMCIplus   Yes                  No                    Most common choice
LPCMCI      Yes                  Yes                   Complex systems
RPCMCI      No                   No                    Regime changes
JPCMCI+     Yes                  Some                  Multiple datasets

Decision Flowchart

START: Do same-time causal effects exist?
│
├── NO (only lagged effects)
│   └── PCMCI ✓
│
└── YES (contemporaneous effects)
    └── Are there hidden confounders?
        ├── NO → PCMCIplus ✓ (recommended)
        └── YES or UNSURE → LPCMCI
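
The same decision logic can be written as a small helper. This is a sketch only: choose_method is a hypothetical function, not part of tigramite, and it covers just the three methods in the flowchart.

```python
from typing import Optional

def choose_method(same_time_effects: bool,
                  hidden_confounders: Optional[bool] = None) -> str:
    """Return the method name suggested by the flowchart.

    hidden_confounders: True, False, or None when you are unsure.
    """
    if not same_time_effects:
        # Only lagged effects: the original algorithm is enough
        return "PCMCI"
    if hidden_confounders is False:
        # Same-time effects, all relevant variables measured
        return "PCMCIplus"
    # Hidden confounders present, or you cannot rule them out
    return "LPCMCI"

print(choose_method(same_time_effects=False))
print(choose_method(same_time_effects=True, hidden_confounders=False))
print(choose_method(same_time_effects=True, hidden_confounders=None))
```

Treating "unsure" the same as "yes" mirrors the flowchart: when hidden confounders cannot be ruled out, the helper falls back to LPCMCI.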

Method 1: PCMCI (Original)

Assumes: Only LAGGED effects (no same-time causation)

Use when:

  • Effects take time (e.g., today's ad spend affects next week's sales)
  • Fast sampling rate relative to causal process
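
from tigramite.pcmci import PCMCI
from tigramite.independence_tests.parcorr import ParCorr  # tigramite 5.x path

# `dataframe` is assumed to be a tigramite DataFrame built beforehand
parcorr = ParCorr()  # linear partial-correlation test
pcmci = PCMCI(dataframe=dataframe, cond_ind_test=parcorr)
results = pcmci.run_pcmci(tau_max=5, pc_alpha=0.05)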

Method 2: PCMCIplus

Assumes: Same-time effects CAN exist, but NO hidden confounders

Use when:

  • Sampling is coarse relative to the causal process (effects can appear within the same time step)
  • You believe you've measured all relevant variables

# Reuses the pcmci object created in the Method 1 example
results = pcmci.run_pcmciplus(tau_max=5, pc_alpha=0.05)

Recommendation: Start with PCMCIplus. It handles both lagged and same-time effects, which covers the most common cases.

Method 3: LPCMCI (Latent PCMCI)

Assumes: Hidden confounders MAY exist

Use when:

  • You suspect unmeasured variables affect your system
  • You work in a complex domain (biology, climate) where you can't measure everything

from tigramite.lpcmci import LPCMCI

# Reuses the dataframe and parcorr test from the Method 1 example
lpcmci = LPCMCI(dataframe=dataframe, cond_ind_test=parcorr)
results = lpcmci.run_lpcmci(tau_max=3)

Understanding Edge Types

Symbol   Meaning                    Example
-->      Definite cause             X(t-1) → Y(t)
<--      Definite effect            Y(t) ← X(t-1)
o-o      Undetermined direction     X(t) ? Y(t)
o->      Possibly causal            X(t) maybe→ Y(t)
<->      Hidden confounder likely   H → X and H → Y
x-x      Conflicting evidence       Rare, check data
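
To read these symbols out of a result, a small helper can turn a graph of edge strings into plain sentences. The sketch below assumes the graph is indexed as graph[i][j][tau], meaning a link from variable i at time t-tau to variable j at time t, with an empty string for "no link". That layout is my understanding of how tigramite stores results['graph'], so verify it against your installed version. The describe_links function and the toy graph are hypothetical.

```python
# Edge meanings taken from the table above
EDGE_MEANINGS = {
    "-->": "definite cause",
    "<--": "definite effect",
    "o-o": "undetermined direction",
    "o->": "possibly causal",
    "<->": "hidden confounder likely",
    "x-x": "conflicting evidence",
}

def describe_links(graph, var_names):
    """List every non-empty link in a graph[i][j][tau] array of edge strings."""
    lines = []
    for i, row in enumerate(graph):
        for j, lags in enumerate(row):
            for tau, edge in enumerate(lags):
                if edge == "":
                    continue  # no link at this lag
                # Same-time links appear in both [i][j] and [j][i]; report once
                if tau == 0 and i > j:
                    continue
                source = f"{var_names[i]}(t-{tau})" if tau else f"{var_names[i]}(t)"
                meaning = EDGE_MEANINGS.get(edge, "unknown edge type")
                lines.append(f"{source} {edge} {var_names[j]}(t): {meaning}")
    return lines

# Hand-written toy graph for two variables and tau_max = 1 (not real output)
toy_graph = [
    [["", "-->"], ["o-o", "-->"]],  # links starting at X
    [["o-o", ""], ["", ""]],        # links starting at Y
]

links = describe_links(toy_graph, ["X", "Y"])
for line in links:
    print(line)
```

The function only iterates and compares strings, so it should work equally on nested lists and on a NumPy array of strings.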

Practical Recommendations

  1. Start with PCMCIplus: it is the most flexible option for common cases
  2. Use PCMCI if you're confident effects are only lagged
  3. Use LPCMCI if:
    • You can't measure all relevant variables
    • You see unexpected bidirectional edges
    • You're in a complex domain (biology, climate)
  4. Check assumptions by visualizing your data first!