Module 2.3: Choosing the Right Discovery Method
What You'll Learn
- When to use PCMCI vs PCMCIplus vs LPCMCI
- Understanding method assumptions
- What the different output graphs mean
- Practical examples of each method
Key Concepts Explained
What are "same-time effects"?
Sometimes cause and effect happen so fast that they appear simultaneous in your data:
- You measure temperature and energy every hour
- But turning on AC (cause) affects energy (effect) within seconds
- In your hourly data, both changes land in the same hour, so the link shows up as a "same-time" (contemporaneous) effect
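To make this concrete, the sketch below simulates a hypothetical minute-level process in which the effect follows the cause by one minute, then sums both series into hourly bins. Every name and number here (`x_fine`, the 0.8 coefficient, the noise level) is an illustrative assumption, not taken from a real dataset.

```python
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

random.seed(0)
MINUTES_PER_HOUR = 60
n_hours = 500
n_minutes = n_hours * MINUTES_PER_HOUR

# Minute-level process: y reacts to x with a one-minute delay.
x_fine = [random.gauss(0, 1) for _ in range(n_minutes)]
y_fine = [random.gauss(0, 0.5)]  # first minute has no earlier x to react to
for t in range(1, n_minutes):
    y_fine.append(0.8 * x_fine[t - 1] + random.gauss(0, 0.5))

def to_hourly(series):
    """Sum consecutive blocks of 60 minute-level values."""
    return [sum(series[i:i + MINUTES_PER_HOUR])
            for i in range(0, len(series), MINUTES_PER_HOUR)]

x_hour, y_hour = to_hourly(x_fine), to_hourly(y_fine)

# Lag 0: x and y in the same hour. Lag 1: x in the previous hour.
corr_lag0 = pearson(x_hour, y_hour)
corr_lag1 = pearson(x_hour[:-1], y_hour[1:])
print(f"hourly lag 0: {corr_lag0:.2f}, hourly lag 1: {corr_lag1:.2f}")
```

At minute resolution the dependence is purely lagged, but after aggregation almost all of it sits at lag 0. A lagged-only method applied to the hourly series would have nothing to find.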
What are "hidden confounders"?
Variables you didn't measure that affect multiple things you did measure:
- You measure ice cream sales and drowning
- You didn't measure temperature (the hidden confounder)
- Hot weather drives both, so the two look causally linked even though neither causes the other
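A small simulation shows the effect. In this sketch, temperature drives both series and there is no direct link between them; the coefficients and variable names are made up for illustration. The correlation between ice cream sales and drowning is strong until temperature is regressed out of both.

```python
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

def residuals(y, x):
    """Residuals of y after a least-squares fit on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
             / sum((u - mean_x) ** 2 for u in x))
    return [v - mean_y - slope * (u - mean_x) for u, v in zip(x, y)]

random.seed(1)
n_days = 5000

# Temperature is the common cause; the other two never influence each other.
temperature = [random.gauss(0, 1) for _ in range(n_days)]
ice_cream = [2.0 * t + random.gauss(0, 1) for t in temperature]
drowning = [1.5 * t + random.gauss(0, 1) for t in temperature]

# What you see if temperature was never measured.
raw_corr = pearson(ice_cream, drowning)

# What you see once temperature is measured and controlled for.
partial_corr = pearson(residuals(ice_cream, temperature),
                       residuals(drowning, temperature))

print(f"raw: {raw_corr:.2f}, controlling for temperature: {partial_corr:.2f}")
```

If temperature is missing from the dataset, only the first number is available, and a method that assumes no hidden confounders may report a link between the two measured variables.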
Method Overview
| Method | Same-time effects? | Hidden confounders? | Best for |
|---|---|---|---|
| PCMCI | No | No | Simple, lagged-only |
| PCMCIplus | Yes | No | Most common choice |
| LPCMCI | Yes | Yes | Complex systems |
| RPCMCI | No | No | Regime changes |
| JPCMCI+ | Yes | Some | Multiple datasets |
Decision Flowchart
START: Do same-time causal effects exist?
│
├── NO (only lagged effects)
│   └── PCMCI ✓
│
└── YES (contemporaneous effects)
    │
    └── Are there hidden confounders?
        │
        ├── NO → PCMCIplus ✓ (recommended)
        │
        └── YES or UNSURE → LPCMCI ✓
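The flowchart can also be written as a small function. This is only a sketch of the decision logic above; `choose_method` and its argument names are hypothetical helpers, not part of tigramite.

```python
def choose_method(contemporaneous_effects: bool, hidden_confounders: str) -> str:
    """Return the method name suggested by the decision flowchart.

    contemporaneous_effects: True if same-time causal effects may exist.
    hidden_confounders: "no", "yes", or "unsure".
    """
    if hidden_confounders not in {"no", "yes", "unsure"}:
        raise ValueError("hidden_confounders must be 'no', 'yes', or 'unsure'")

    # First branch: only lagged effects, so the original PCMCI is enough.
    if not contemporaneous_effects:
        return "PCMCI"

    # Second branch: same-time effects exist, so ask about hidden confounders.
    if hidden_confounders == "no":
        return "PCMCIplus"
    return "LPCMCI"  # "yes" or "unsure"


print(choose_method(contemporaneous_effects=True, hidden_confounders="unsure"))
```

Treating "unsure" the same as "yes" mirrors the flowchart: when in doubt about unmeasured variables, it points to the method that tolerates them.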
Method 1: PCMCI (Original)
Assumes: Only LAGGED effects (no same-time causation)
Use when:
- Effects take time (e.g., today's ad spend affects next week's sales)
- Your sampling rate is fast relative to the causal process, so every effect shows up at a later time step
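from tigramite.pcmci import PCMCI
from tigramite.independence_tests.parcorr import ParCorr  # import path in recent tigramite releases
# dataframe: a tigramite.data_processing.DataFrame wrapping your (time, variables) array
parcorr = ParCorr()  # linear partial-correlation independence test
pcmci = PCMCI(dataframe=dataframe, cond_ind_test=parcorr)
results = pcmci.run_pcmci(tau_max=5, pc_alpha=0.05)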
Method 2: PCMCIplus
Assumes: Same-time effects CAN exist, but NO hidden confounders
Use when:
- Your sampling is coarse relative to the causal process (effects can appear instantaneous)
- You believe you've measured all relevant variables
# Reuses the pcmci object from the PCMCI example; only the run method changes
results = pcmci.run_pcmciplus(tau_max=5, pc_alpha=0.05)
Recommendation: Start with PCMCIplus - it's the most flexible for common cases.
Method 3: LPCMCI (Latent PCMCI)
Assumes: Hidden confounders MAY exist
Use when:
- You suspect unmeasured variables affect your system
- Complex systems where you can't measure everything
- You're in a complex domain (biology, climate)
from tigramite.lpcmci import LPCMCI
lpcmci = LPCMCI(dataframe=dataframe, cond_ind_test=parcorr)
results = lpcmci.run_lpcmci(tau_max=3)
Understanding Edge Types
| Symbol | Meaning | Example |
|---|---|---|
| --> | Definite cause | X(t-1) → Y(t) |
| <-- | Definite effect | Y(t) ← X(t-1) |
| o-o | Undetermined direction | X(t) ? Y(t) |
| o-> | Possibly causal (X causes Y, or a hidden confounder links them) | X(t) maybe→ Y(t) |
| <-> | Hidden confounder likely | H → X and H → Y |
| x-x | Conflicting evidence | Rare; re-check the data |
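To read these symbols out of a result, a helper like the one below can turn a graph array into plain sentences. It is a sketch under one assumption: that `graph[i][j][tau]` holds the edge from variable `i` at time `t - tau` to variable `j` at time `t`, with an empty string for "no link", which is how we understand tigramite's `results['graph']`; verify this against your installed version. `EDGE_MEANINGS` and `describe_links` are hypothetical names, and the example graph is hand-built rather than real output.

```python
EDGE_MEANINGS = {
    "-->": "definite cause",
    "<--": "definite effect",
    "o-o": "undetermined direction",
    "o->": "possibly causal",
    "<->": "hidden confounder likely",
    "x-x": "conflicting evidence",
}

def describe_links(graph, var_names):
    """Return one readable line per link in a (N, N, tau_max + 1) graph array.

    Assumes graph[i][j][tau] is the edge from var i at t - tau to var j at t.
    """
    lines = []
    for i, source in enumerate(var_names):
        for j, target in enumerate(var_names):
            for tau, edge in enumerate(graph[i][j]):
                if edge == "":
                    continue  # no link at this lag
                # Same-time links are stored twice (i, j) and (j, i); keep one copy.
                if tau == 0 and j < i:
                    continue
                src = f"{source}(t-{tau})" if tau else f"{source}(t)"
                meaning = EDGE_MEANINGS.get(edge, "unknown edge type")
                lines.append(f"{src} {edge} {target}(t): {meaning}")
    return lines

# Hand-built example with two variables and tau_max = 1 (not real output).
example_graph = [
    [["", "-->"], ["o-o", "-->"]],  # links starting at X
    [["o-o", ""], ["", ""]],        # links starting at Y
]
lines = describe_links(example_graph, ["X", "Y"])
for line in lines:
    print(line)
```

Because the helper only indexes `graph[i][j]` and iterates over lags, it works on nested lists as well as on a NumPy string array.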
Practical Recommendations
- Start with PCMCIplus - Most flexible for common cases
- Use PCMCI if you're confident effects are only lagged
- Use LPCMCI if:
  - You can't measure all relevant variables
  - You see unexpected bidirectional edges
  - You're in a complex domain (biology, climate)
- Check assumptions first: plot your time series and look for trends or regime changes before running any method
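One assumption the PCMCI family shares is that the series are roughly stationary, so a visible trend in a plot is a warning sign. As a numeric companion to plotting, the sketch below compares the first and second half of a series. `split_half_summary` is a hypothetical helper and a crude heuristic, not a formal stationarity test, and both series are simulated.

```python
import random
import statistics

def split_half_summary(series):
    """Compare mean and spread between the first and second half of a series."""
    half = len(series) // 2
    first, second = series[:half], series[half:]
    return {
        "mean_shift": statistics.fmean(second) - statistics.fmean(first),
        "std_ratio": statistics.stdev(second) / statistics.stdev(first),
    }

random.seed(2)
n = 1000

# A stable series and one with a slow upward drift (both simulated).
stable = [random.gauss(0, 1) for _ in range(n)]
trending = [0.01 * t + random.gauss(0, 1) for t in range(n)]

stable_summary = split_half_summary(stable)
trending_summary = split_half_summary(trending)

print("stable:  ", stable_summary)
print("trending:", trending_summary)
```

A mean shift that is large relative to the series' own spread, as in the drifting example, suggests detrending or differencing the data before running any of the methods.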