Simple trading rules, big impact
The Holy Grail of trading success won't be found in cutting-edge classical machine learning models or in ever more complex rules
Introduction
Picture this: You’re standing at the helm of a ship in stormy markets, armed with a treasure map labeled "Classical machine learning models"—a maze of algorithms, charts, and optimization routines. In the other hand, you clutch a crumpled napkin with the words "If open > last week’s open, buy" scrawled hastily. Now ask yourself: which do you trust to guide you to the treasure?
I'll say it quietly: The napkin could save your loot.
Imagine teaching a parrot to trade. If you give it 100 rules, the poor bird will freeze in decision-making paralysis. But give it one rule—something simple and easy to follow—and it might actually make a profit. Humans, algorithms, and even parrots thrive in simplicity.
Let’s quantify this with a bit of math. Suppose a trading strategy has k independent rules. Each rule has a 55% chance of being correct—better than a coin flip, but not by much. The probability that all rules align perfectly is:
\( P(\text{all } k \text{ rules correct}) = 0.55^{k} \)
For k=2, that’s 30%. For k=5, it plummets to 5%. Adding rules reduces your chances of winning, even if each rule is slightly good. Every added rule introduces a new chance to fail. Complexity isn’t just confusing—it’s mathematically self-sabotaging.
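If you want to watch that decay happen, a tiny sanity check is enough (a minimal sketch, assuming fully independent rules that are each right 55% of the time):

p_single = 0.55  # probability that a single rule is right
for k in range(1, 6):
    # probability that all k independent rules are right at the same time
    print(k, round(p_single ** k, 4))  # 1 -> 0.55, 2 -> ~0.30, 5 -> ~0.05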
But wait, it gets worse.
In reality, rules are rarely independent. They often share hidden dependencies. If Rule 1 fails, Rule 2 is more likely to fail too. This turns our formula into:
\( P_{\text{joint}} = 0.55^{k} \times d \)
where \(d\) is a dependency factor.
And that dependency factor is usually less than 1, crushing your odds further. In my experience: 1 rule, okay; 2 rules, okay; more than 2… Bzzz! That said, the method of rule assembly also comes into play here (and it is quite different from signal ensemble methods). A toy example to illustrate the difference:
Conditional rules assembly
# Define rules (the price, spread and volume below are made-up values for illustration)
close_price, prev_close_price = 101.0, 100.0
spread = 0.03
volume = 1_200_000

rule1 = (close_price > prev_close_price)
rule2 = (spread < 0.05)
rule3 = (volume > 1_000_000)

# Assemble rules with logic
if (rule1 and rule2) or rule3:
    decision = "Buy"
else:
    decision = "Hold"
Signals assembly
# Signals from 3 strategies
strategy_A = 1 # Buy
strategy_B = 0 # Hold
strategy_C = 1 # Buy
# Assemble signals (majority vote)
final_signal = 1 if (strategy_A + strategy_B + strategy_C) >= 2 else 0
But let's get back to the matter at hand and model this with expected value. Suppose each rule you add to your strategy slightly improves your edge but massively increases your failure rate.
For example:
1 Rule: 55% win rate, 2% gain per win, 1% loss per fail.
\( E = (0.55 \times 0.02) + (0.45 \times -0.01) = 0.0065 = 0.65\% \text{ per trade} \)
5 Rules: 5% win rate (as calculated earlier), 5% gain per win, 1% loss per fail.
\( E = (0.05 \times 0.05) + (0.95 \times -0.01) = -0.007 = -0.7\% \text{ per trade} \)
Adding rules turned your profitable strategy into a money furnace. Why? Because small gains can’t offset low win rates. A simpler approach yields steady profits, while extra complexity burns capital. Complexity opens too many paths to failure, and even marginally higher returns can’t make up for that.
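If you'd rather let Python do that arithmetic, here is the same comparison as a tiny helper (the inputs are the toy figures used above, not estimates from real data):

def expected_value(win_rate, gain, loss):
    # EV per trade = P(win) * gain - P(loss) * loss
    return win_rate * gain - (1 - win_rate) * loss

print(expected_value(0.55, 0.02, 0.01))  # ~ +0.0065 per trade: the one-rule strategy
print(expected_value(0.05, 0.05, 0.01))  # ~ -0.0070 per trade: five rules that must all align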
Consider the classic Pareto principle: 80% of results come from 20% of inputs. In trading, this becomes even more lopsided. Adding a 5th rule to a strategy might improve accuracy by 1% but introduce 3 new failure modes.
If rules are positively correlated—as they often are, even when meticulously selected to appear uncorrelated—the joint probability collapses faster than a soufflé pulled too soon from the oven. Suppose, for example, that each rule shares a 0.8 correlation with the others.
Now you’re relying on a 0.5% chance of success. Good luck with that.
So, what’s the fix? Embrace the One Good Rule philosophy:
Isolate the rule: Find a single logical condition that repeats many times over long periods. It’s no longer just about finding a high-probability rule; what we’re looking for is high persistence and a low rate of false positives (see the sketch after this list).
Stress-test it: See how it holds up in multiple scenarios, and build synthetic scenarios to test it beyond historical data.
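As a sketch of what checking persistence might look like in practice (a minimal example, assuming a pandas DataFrame with an 'open' column and a DatetimeIndex; the lookback and hold values are placeholders, not a recommendation):

import pandas as pd

def yearly_hit_rate(df, lookback=5, hold=1):
    # Toy persistence check: how often does "open > open `lookback` days ago"
    # precede a positive `hold`-day move, year by year?
    signal = df['open'] > df['open'].shift(lookback)
    fwd_ret = df['open'].shift(-hold) / df['open'] - 1
    hits = fwd_ret[signal].dropna() > 0
    return hits.groupby(hits.index.year).mean()

A rule worth keeping shows a hit rate that stays on the same side of 50% year after year, not one spectacular year followed by noise.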
Now, let’s see how simplicity dodges this statistical bullet. It’s important to show the beauty of dumb rules…
Simple rules like "Buy if today’s open > open 5 days ago" have one superpower: they fail loudly and clearly. You either win or lose fast, letting you adapt quickly. Complex models, however, fail in silence.
Let’s model a basic rule: Buy if today’s open > yesterday’s open, hold for 1 day. Assume:
55% win rate (profit = 2%)
45% loss rate (loss = 1%)
Expected value per trade:
\( E = (0.55 \times 0.02) + (0.45 \times -0.01) = 0.011 - 0.0045 = 0.0065 \)
That’s a 0.65% average gain per trade. Compound it over 100 trades (\(1.0065^{100} \approx 1.91\)) and you’re up roughly 91%.
Now, let’s make this spicy. Suppose you add a time limit to your trades (e.g., "close after 5 days, no matter what"). This rule does two things:
Caps your losses (no hoping a losing trade will recover).
Forces closing (no emotional clinging to winners).
Let’s recalculate EV with a time limit. Assume the time limit:
Reduces loss size from -1% to -0.8% (because you exit earlier).
Slightly reduces win size from +2% to +1.8% (because you exit winners too soon sometimes).
Then, to calculate the new expected value, let’s incorporate the updated values:
\( E = (0.55 \times 0.018) + (0.45 \times -0.008) = 0.0099 - 0.0036 = 0.0063 \approx 0.63\% \text{ per trade} \)
Even with smaller wins and losses, the rule still works because the asymmetry (bigger wins than losses) is preserved. This is the heart of simple strategies: They’re not about being right all the time—they’re about being less wrong.
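To see that it really is the preserved asymmetry doing the work, you can compound both variants over 100 trades. This is a minimal sketch using the toy figures above (costs ignored, win rate assumed unchanged by the time cap):

import random

random.seed(0)  # fixed seed so the comparison is repeatable

def compound(n_trades, win_rate, gain, loss):
    # Multiply up the equity curve trade by trade
    equity = 1.0
    for _ in range(n_trades):
        equity *= (1 + gain) if random.random() < win_rate else (1 - loss)
    return equity

print(compound(100, 0.55, 0.020, 0.010))  # original rule
print(compound(100, 0.55, 0.018, 0.008))  # time-capped rule: smaller wins, smaller losses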
Why do simple rules win? They are like your favorite hoodie—comfortable, reliable, and always there when you need them. They shine for two big reasons:
Clear feedback: When a simple rule stops working, it’s immediately apparent.
Ease of adaptation: Unlike complex systems, simple rules can be adjusted or refined without the risk of disrupting an intricate network of dependencies.
####################################################################
# 🔥 THIS SNIPPET WILL BE COMPLETED IN FUTURE POSTS 🔥
####################################################################
def combinations(iterable, r):
    """
    Generate r-length combinations of the input iterable
    (same logic as itertools.combinations).
    """
    pool = list(iterable)
    n = len(pool)
    if r > n:
        return
    indices = list(range(r))
    yield tuple(pool[i] for i in indices)
    while True:
        # Find the rightmost index that can still be advanced
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        # Reset the indices to the right of the advanced one
        for j in range(i + 1, r):
            indices[j] = indices[j - 1] + 1
        yield tuple(pool[x] for x in indices)

def same_rule(rule1, rule2):
    """
    Return True if both rules share the same base parameters.
    """
    return (
        rule1['shift'] == rule2['shift'] and
        rule1['i'] == rule2['i'] and
        rule1['operator'] == rule2['operator'] and
        rule1['price_type'] == rule2['price_type'] and
        rule1['side'] == rule2['side'])

def update_buffer(buffer, new_rule, new_scores, top_n, scores=0.99):
    ...
####################################################################
# 🔒 FOR PAID SUBSCRIBERS ONLY 🔒
####################################################################
Not bad for a rule a toddler could understand! The same rule kept working for four years across multiple market regimes. But wait—can something this simple really compete with machine learning?
Markets are mostly noise. Even the best ML models struggle because they’re trained to find patterns in chaos—like teaching a cat to solve Sudoku. Let’s model this.
Suppose the market’s daily return \(r_t\) is:
\( r_t = 0.001 + \epsilon_t, \qquad \epsilon_t \sim \mathcal{N}(0,\ 0.02^2) \)
The signal (0.1% daily drift) is dwarfed by noise (2% standard deviation). Now, let’s assume:
Simple rule detects the signal 55% of the time.
ML model detects it 70% of the time (wow, such genius!).
ML model trades 10x more frequently.
In a toy example, the ML model does show a slightly better Sharpe ratio (and I'm being generous, because with real data this often isn't true). But wait! We didn’t account for transaction costs, slippage, and other frictions. Add those in, and the ML edge vanishes faster than a mirage on a scorching desert road.
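Before the verdict, it is worth feeling how little signal there is to fight over in the first place. Here is a minimal simulation of the return model above (0.1% drift, 2% daily noise; the year length and trial count are arbitrary choices):

import numpy as np

np.random.seed(7)
DRIFT, NOISE, DAYS = 0.001, 0.02, 252  # the toy return model from above

# Simulate 10,000 independent "years" of daily returns and ask how often
# even the sign of the average return matches the true positive drift
years = np.random.normal(DRIFT, NOISE, size=(10_000, DAYS))
share_positive = (years.mean(axis=1) > 0).mean()
print(share_positive)  # roughly 0.78: about one year in five looks negative despite the drift

If a whole year of data can point the wrong way that often, squeezing a reliable edge out of day-to-day patterns, after costs, is asking a lot of any model.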
The verdict? The simple rule isn’t just better—it’s obviously better.
So, why do we overengineer!? We are pattern-recognition machines, hardwired to seek order in chaos. In trading, this manifests as an obsession with complexity—more rules, more data, more layers. It’s the financial equivalent of the "kitchen sink" approach: if we throw everything into the model, surely it’ll capture the market’s secrets. Sometimes you end up with a great dish, but you might also go up in flames.
Okay, time to play! This one is just for beginners. Here's my proposal: play with this little code snippet and tell me in a comment what results you get 😈
Instructions
===========================================
🎮 Welcome to the Trading rule game! 🎮
===========================================
Compete to see if your trading rule can beat the pre-defined rules.
Define your rule:
Options:
1. noisy_open > prev_open
2. noisy_open < prev_open
3. close > noisy_open
Or create your own custom rule using 'noisy_open', 'prev_open', and 'close'.
Enter your rule (e.g., noisy_open > prev_open): noisy_open > prev_open
🔍 Testing rules...
Testing Rule 1: noisy_open > prev_open...
Testing Rule 2: noisy_open < prev_open...
Testing Rule 3: close > noisy_open...
Testing Player's Rule...
Feel free to modify it, break it, or simply ignore it.
import pandas as pd
import numpy as np
class RuleGame:
    def __init__(self, data, test_noise_levels, hold_period=1):
        """
        Initialize the RuleGame.
        Parameters:
        - data (pd.DataFrame): Market data with 'open' and 'close' columns.
        - test_noise_levels (list): Levels of random noise for stress-testing.
        - hold_period (int): Number of days to hold the position.
        """
        self.data = data.copy()
        self.test_noise_levels = test_noise_levels
        self.hold_period = hold_period
        self.data['prev_open'] = self.data['open'].shift(1)
        # Drop the first row since prev_open will be NaN, then reset the index
        # so that positional lookups (iloc) in stress_test_rule stay aligned
        self.data.dropna(subset=['prev_open'], inplace=True)
        self.data.reset_index(drop=True, inplace=True)
    def stress_test_rule(self, rule_func):
        """
        Stress-test a single rule.
        Parameters:
        - rule_func (function): A function defining the rule.
        Returns:
        - pd.DataFrame: Results of the stress test.
        """
        results = []
        for noise_level in self.test_noise_levels:
            # Add noise to the 'open' price
            self.data['noisy_open'] = self.data['open'] + np.random.normal(0, noise_level, len(self.data))
            # Apply the trading rule
            try:
                self.data['signal'] = rule_func(self.data)
            except Exception as e:
                print(f"Error applying rule: {e}")
                return pd.DataFrame()
            # Ensure 'signal' is a boolean Series
            if not pd.api.types.is_bool_dtype(self.data['signal']):
                print("Error: The rule did not return a boolean Series.")
                return pd.DataFrame()
            # Simulate trade outcomes
            wins, losses = [], []
            for idx, row in self.data[self.data['signal']].iterrows():
                exit_idx = idx + self.hold_period
                if exit_idx < len(self.data):
                    entry_price = row['noisy_open']
                    exit_price = self.data.iloc[exit_idx]['close']
                    pnl = exit_price - entry_price
                    if pnl > 0:
                        wins.append(pnl)
                    else:
                        losses.append(abs(pnl))
            # Calculate performance metrics
            total_trades = len(wins) + len(losses)
            win_rate = len(wins) / total_trades if total_trades > 0 else 0
            avg_gain = np.mean(wins) if wins else 0
            avg_loss = np.mean(losses) if losses else 0
            expected_value = (win_rate * avg_gain) - ((1 - win_rate) * avg_loss)
            results.append({
                'noise_level': noise_level,
                'win_rate': win_rate,
                'avg_gain': avg_gain,
                'avg_loss': avg_loss,
                'expected_value': expected_value
            })
        return pd.DataFrame(results)
    def start_game(self):
        """
        Start the Rule Game.
        """
        print("Welcome to the Trading Rule Game!")
        print("Compete to see if your trading rule can beat the pre-defined rules.")
        # Pre-defined rules
        pre_defined_rules = {
            "Rule 1: noisy_open > prev_open": lambda df: df['noisy_open'] > df['prev_open'],
            "Rule 2: noisy_open < prev_open": lambda df: df['noisy_open'] < df['prev_open'],
            "Rule 3: close > noisy_open": lambda df: df['close'] > df['noisy_open']
        }
        # Player defines their rule
        print("\nDefine your rule:")
        print("Options:")
        print("1. noisy_open > prev_open")
        print("2. noisy_open < prev_open")
        print("3. close > noisy_open")
        print("Or create your own custom rule using 'noisy_open', 'prev_open', and 'close'.")
        player_rule_input = input("Enter your rule (e.g., noisy_open > prev_open): ").strip()
        # Check for any quotes within the rule
        if "'" in player_rule_input or '"' in player_rule_input:
            print("Invalid rule. Please enter the rule without any quotes.")
            return
        # Validate and parse the player's rule
        try:
            # Use pandas' eval for safe evaluation
            # This will return a boolean Series
            player_rule = lambda df: df.eval(player_rule_input)
            # Test the rule on a small subset with 'noisy_open' added
            # to ensure it returns boolean
            temp_data = self.data.copy()
            temp_data['noisy_open'] = temp_data['open'] + np.random.normal(0, self.test_noise_levels[0], len(temp_data))
            test_signal = player_rule(temp_data.head(10))
            if not pd.api.types.is_bool_dtype(test_signal):
                raise ValueError("The rule does not return a boolean Series.")
        except Exception as e:
            print(f"Invalid rule. Error: {e}")
            return
        # Test all rules
        results = {}
        print("\nTesting rules...")
        for rule_name, rule_func in pre_defined_rules.items():
            print(f"Testing {rule_name}...")
            results[rule_name] = self.stress_test_rule(rule_func)
        # Test player's rule
        print("Testing Player's Rule...")
        results["Player's Rule"] = self.stress_test_rule(player_rule)
        # Display results
        for rule_name, result_df in results.items():
            print(f"\nResults for {rule_name}:")
            if result_df.empty:
                print("No results due to an error in the rule.")
                continue
            print(result_df[['noise_level', 'win_rate', 'avg_gain', 'avg_loss', 'expected_value']])
        # Determine the winner based on average expected value across noise levels
        winner = None
        highest_ev = -np.inf
        for rule_name, result_df in results.items():
            if result_df.empty:
                continue
            avg_ev = result_df['expected_value'].mean()
            if avg_ev > highest_ev:
                highest_ev = avg_ev
                winner = rule_name
        if winner:
            print("\nWho wins?")
            print(f"The best rule is: {winner} with an average expected value of {highest_ev:.4f}")
        else:
            print("\nNo valid rules to determine a winner.")
if __name__ == "__main__":
    # Generate synthetic market data
    np.random.seed(42)
    data = pd.DataFrame({
        'open': np.random.uniform(100, 200, 1000),
        'close': np.random.uniform(100, 200, 1000)
    })
    # Game parameters
    test_noise_levels = [0.5, 1, 2, 5]
    # Initialize and start the game
    game = RuleGame(data, test_noise_levels, hold_period=1)
    game.start_game()