Financial Market Prediction
Overview
Financial market prediction combines technical analysis (price patterns), fundamental analysis (intrinsic value), sentiment analysis (market psychology), and quantitative methods (statistical/ML models) to forecast asset prices, volatility, and market regime changes. Markets are notoriously difficult to predict, which is what the efficient market hypothesis would lead one to expect, but exploitable edges exist in volatility forecasting, sentiment-driven mispricings, and machine learning approaches that process alternative data at scale.
Technical Analysis
Price Pattern Recognition
import numpy as np
import pandas as pd


class TechnicalAnalyzer:
    """Core technical analysis indicators for market prediction."""

    def __init__(self, prices: pd.DataFrame):
        """prices: DataFrame with columns ['open', 'high', 'low', 'close', 'volume']"""
        self.df = prices.copy()

    def moving_averages(self, short_window: int = 20, long_window: int = 50):
        """Compute moving average crossover signals."""
        self.df['sma_short'] = self.df['close'].rolling(short_window).mean()
        self.df['sma_long'] = self.df['close'].rolling(long_window).mean()
        self.df['ema_short'] = self.df['close'].ewm(span=short_window).mean()
        # +1 while the short SMA is above the long SMA (golden-cross regime),
        # -1 while it is below (death-cross regime), 0 during the warm-up
        # period when either average is still NaN.
        spread = self.df['sma_short'] - self.df['sma_long']
        self.df['ma_signal'] = np.sign(spread).fillna(0).astype(int)
        # Crossover detection: +2 marks a bullish cross, -2 a bearish cross
        self.df['crossover'] = self.df['ma_signal'].diff()
        return self.df[['sma_short', 'sma_long', 'ma_signal', 'crossover']]

    def rsi(self, period: int = 14) -> pd.Series:
        """Relative Strength Index: momentum oscillator (0-100).

        Uses simple rolling means of gains and losses rather than
        Wilder's recursive smoothing.
        """
        delta = self.df['close'].diff()
        gain = delta.clip(lower=0)
        loss = (-delta).clip(lower=0)
        avg_gain = gain.rolling(period).mean()
        avg_loss = loss.rolling(period).mean()
        rs = avg_gain / avg_loss.replace(0, np.nan)
        rsi = 100 - (100 / (1 + rs))
        # A window with gains and no losses is the RSI ceiling, not missing data
        rsi = rsi.mask((avg_loss == 0) & (avg_gain > 0), 100.0)
        self.df['rsi'] = rsi
        return rsi

    def bollinger_bands(self, period: int = 20, std_dev: float = 2.0):
        """Bollinger Bands: volatility-based envelope around price."""
        sma = self.df['close'].rolling(period).mean()
        std = self.df['close'].rolling(period).std()
        self.df['bb_upper'] = sma + std_dev * std
        self.df['bb_lower'] = sma - std_dev * std
        self.df['bb_middle'] = sma
        self.df['bb_width'] = (self.df['bb_upper'] - self.df['bb_lower']) / sma
        # 0 = at the lower band, 1 = at the upper band
        self.df['bb_position'] = (self.df['close'] - self.df['bb_lower']) / \
            (self.df['bb_upper'] - self.df['bb_lower'])
        return self.df[['bb_upper', 'bb_lower', 'bb_middle', 'bb_width', 'bb_position']]

    def macd(self, fast: int = 12, slow: int = 26, signal: int = 9):
        """MACD: trend-following momentum indicator."""
        ema_fast = self.df['close'].ewm(span=fast).mean()
        ema_slow = self.df['close'].ewm(span=slow).mean()
        self.df['macd'] = ema_fast - ema_slow
        self.df['macd_signal'] = self.df['macd'].ewm(span=signal).mean()
        self.df['macd_histogram'] = self.df['macd'] - self.df['macd_signal']
        return self.df[['macd', 'macd_signal', 'macd_histogram']]

    def volume_profile(self, window: int = 20):
        """Volume-weighted analysis."""
        # Rolling VWAP approximated from closing prices
        self.df['vwap'] = (
            (self.df['close'] * self.df['volume']).rolling(window).sum() /
            self.df['volume'].rolling(window).sum()
        )
        self.df['volume_sma'] = self.df['volume'].rolling(window).mean()
        self.df['volume_ratio'] = self.df['volume'] / self.df['volume_sma']
        return self.df[['vwap', 'volume_sma', 'volume_ratio']]

    def generate_signals(self) -> pd.DataFrame:
        """Combine indicators into a composite signal."""
        self.moving_averages()
        self.rsi()
        self.bollinger_bands()
        self.macd()
        self.volume_profile()
        signals = pd.DataFrame(index=self.df.index)
        # Trend signal
        signals['trend'] = self.df['ma_signal']
        # RSI signal, used as a contrarian oscillator:
        # oversold (<30) is bullish, overbought (>70) is bearish
        signals['momentum'] = np.where(self.df['rsi'] < 30, 1,
                                       np.where(self.df['rsi'] > 70, -1, 0))
        # Mean reversion signal
        signals['mean_reversion'] = np.where(self.df['bb_position'] < 0.1, 1,
                                             np.where(self.df['bb_position'] > 0.9, -1, 0))
        # MACD signal
        signals['macd_signal'] = np.where(self.df['macd_histogram'] > 0, 1, -1)
        # Volume confirmation
        signals['volume_confirm'] = np.where(self.df['volume_ratio'] > 1.5, 1, 0)
        # Composite (weights sum to 1.0)
        signals['composite'] = (
            0.3 * signals['trend'] +
            0.2 * signals['momentum'] +
            0.2 * signals['mean_reversion'] +
            0.2 * signals['macd_signal'] +
            0.1 * signals['volume_confirm']
        )
        return signals
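The `rsi` method above averages gains and losses with simple rolling means. Wilder's original formulation instead smooths them recursively, which reacts more slowly to a single large move. The sketch below is a minimal pure-Python version of that variant for comparison; the function name `wilder_rsi` and the choice to return 50 for a completely flat series are assumptions made for illustration, not part of the skill.

```python
def wilder_rsi(closes, period=14):
    """RSI with Wilder's recursive smoothing (hypothetical helper).

    Returns one value per close, starting at index `period`.
    """
    if len(closes) <= period:
        raise ValueError("need more than `period` closes")
    changes = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]

    def to_rsi(avg_gain, avg_loss):
        if avg_loss == 0:
            # Assumed convention: 100 if only gains, 50 if no movement at all
            return 100.0 if avg_gain > 0 else 50.0
        return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)

    # Seed with simple averages over the first `period` changes
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    values = [to_rsi(avg_gain, avg_loss)]
    # Then update recursively: new = (old * (period - 1) + current) / period
    for i in range(period, len(changes)):
        avg_gain = (avg_gain * (period - 1) + gains[i]) / period
        avg_loss = (avg_loss * (period - 1) + losses[i]) / period
        values.append(to_rsi(avg_gain, avg_loss))
    return values


rising = wilder_rsi([100.0 + i for i in range(30)])
falling = wilder_rsi([100.0 - i for i in range(30)])
```

A monotonically rising series pins the indicator at its ceiling and a falling one at its floor, which is a quick sanity check before using it on real prices.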
Fundamental Analysis
Intrinsic Value Estimation
class FundamentalAnalyzer:
    """Fundamental analysis for equity valuation."""

    def dcf_valuation(self, free_cash_flows: list, growth_rate: float,
                      terminal_growth: float, discount_rate: float,
                      shares_outstanding: int, net_debt: float = 0.0) -> dict:
        """
        Discounted Cash Flow valuation.
        Projects future cash flows and discounts to present value.
        net_debt (debt minus cash) bridges enterprise value to equity value;
        the default of 0 treats the two as equal.
        """
        # The Gordon growth terminal value is only defined when r > g
        if discount_rate <= terminal_growth:
            raise ValueError("discount_rate must exceed terminal_growth")
        # Project cash flows
        projected = []
        last_fcf = free_cash_flows[-1]
        for year in range(1, 6):  # 5-year projection
            projected_fcf = last_fcf * (1 + growth_rate) ** year
            pv = projected_fcf / (1 + discount_rate) ** year
            projected.append({'year': year, 'fcf': projected_fcf, 'pv': pv})
        # Terminal value
        terminal_fcf = projected[-1]['fcf'] * (1 + terminal_growth)
        terminal_value = terminal_fcf / (discount_rate - terminal_growth)
        terminal_pv = terminal_value / (1 + discount_rate) ** 5
        # Enterprise value
        enterprise_value = sum(p['pv'] for p in projected) + terminal_pv
        # Equity value per share
        equity_per_share = (enterprise_value - net_debt) / shares_outstanding
        return {
            'enterprise_value': enterprise_value,
            'equity_per_share': equity_per_share,
            'terminal_value_fraction': terminal_pv / enterprise_value,
            'projected_cash_flows': projected
        }

    def relative_valuation(self, target: dict, comparables: list) -> dict:
        """
        Relative valuation using comparable company multiples.
        All implied values are per share so that they can be averaged.
        """
        multiples = {
            'pe_ratio': [],
            'ev_ebitda': [],
            'ps_ratio': [],
            'pb_ratio': []
        }
        for comp in comparables:
            for key in multiples:
                if comp.get(key):
                    multiples[key].append(comp[key])
        implied_values = {}
        if multiples['pe_ratio'] and target.get('eps'):
            median_pe = np.median(multiples['pe_ratio'])
            implied_values['pe_implied'] = target['eps'] * median_pe
        # EV/EBITDA implies an enterprise value, so convert it to a per-share
        # equity value. The 'shares_outstanding' and 'net_debt' keys are
        # assumed here; without a share count this multiple is skipped.
        if (multiples['ev_ebitda'] and target.get('ebitda')
                and target.get('shares_outstanding')):
            median_ev_ebitda = np.median(multiples['ev_ebitda'])
            implied_ev = target['ebitda'] * median_ev_ebitda
            implied_values['ev_ebitda_implied'] = (
                (implied_ev - target.get('net_debt', 0)) / target['shares_outstanding']
            )
        if multiples['ps_ratio'] and target.get('revenue_per_share'):
            median_ps = np.median(multiples['ps_ratio'])
            implied_values['ps_implied'] = target['revenue_per_share'] * median_ps
        # The 'book_value_per_share' key is likewise an assumed input name
        if multiples['pb_ratio'] and target.get('book_value_per_share'):
            median_pb = np.median(multiples['pb_ratio'])
            implied_values['pb_implied'] = target['book_value_per_share'] * median_pb
        if implied_values:
            avg_implied = np.mean(list(implied_values.values()))
            current_price = target.get('current_price')
            return {
                'implied_values': implied_values,
                'average_implied_value': avg_implied,
                'current_price': target.get('current_price', 0),
                # Upside in percent; None when no usable current price is given
                'upside': (avg_implied / current_price - 1) * 100 if current_price else None
            }
        return {'error': 'Insufficient data'}
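The `terminal_value_fraction` returned by `dcf_valuation` deserves attention because the terminal value usually dominates a five-year DCF, and it is very sensitive to the gap between the discount rate and the terminal growth rate. The following is a small standalone sketch of that sensitivity using the same projection logic; the function name `terminal_share` and the input figures are illustrative assumptions.

```python
def terminal_share(last_fcf, growth, terminal_growth, discount_rate, years=5):
    """Fraction of DCF enterprise value contributed by the terminal value."""
    if discount_rate <= terminal_growth:
        raise ValueError("discount_rate must exceed terminal_growth")
    # Present value of the explicitly projected cash flows
    explicit_pv = sum(
        last_fcf * (1 + growth) ** y / (1 + discount_rate) ** y
        for y in range(1, years + 1)
    )
    # Gordon growth terminal value, discounted back from the final year
    final_fcf = last_fcf * (1 + growth) ** years
    terminal_value = final_fcf * (1 + terminal_growth) / (discount_rate - terminal_growth)
    terminal_pv = terminal_value / (1 + discount_rate) ** years
    return terminal_pv / (explicit_pv + terminal_pv)


# Illustrative inputs: 8% growth for five years, 2.5% terminal growth
share_low_rate = terminal_share(100.0, 0.08, 0.025, discount_rate=0.08)
share_high_rate = terminal_share(100.0, 0.08, 0.025, discount_rate=0.12)
```

With these inputs the terminal value makes up well over half of the total at either rate, and the share rises as the discount rate falls toward the terminal growth rate, so small changes in either assumption move the valuation substantially.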
Sentiment Analysis for Markets
class MarketSentimentAnalyzer:
    """Analyze market sentiment from multiple data sources."""

    def __init__(self):
        self.sentiment_sources = {}

    def fear_greed_index(self, market_data: dict) -> dict:
        """
        Compute a CNN-style Fear & Greed Index from market indicators.
        0 = Extreme Fear, 100 = Extreme Greed.
        Each available indicator is mapped linearly onto 0-100 and clipped.
        """
        indicators = {}
        # 1. Market momentum (S&P vs 125-day MA): ratio 0.95 -> 0, 1.05 -> 100
        if 'sp500' in market_data and 'sp500_ma125' in market_data:
            ratio = market_data['sp500'] / market_data['sp500_ma125']
            indicators['momentum'] = min(100, max(0, (ratio - 0.95) / 0.10 * 100))
        # 2. VIX (Volatility Index): VIX 50 -> 0, VIX 0 -> 100
        if 'vix' in market_data:
            vix = market_data['vix']
            indicators['volatility'] = min(100, max(0, (50 - vix) / 50 * 100))
        # 3. Put/Call ratio: 1.2 -> 0, 0.4 -> 100
        if 'put_call_ratio' in market_data:
            pcr = market_data['put_call_ratio']
            indicators['put_call'] = min(100, max(0, (1.2 - pcr) / 0.8 * 100))
        # 4. Junk bond demand (spread over treasuries, in percentage points):
        #    8 -> 0, 2 -> 100
        if 'high_yield_spread' in market_data:
            spread = market_data['high_yield_spread']
            indicators['junk_bond'] = min(100, max(0, (8 - spread) / 6 * 100))
        # 5. Market breadth (advance/decline, scaled to [-1, 1])
        if 'advance_decline' in market_data:
            ad = market_data['advance_decline']
            indicators['breadth'] = min(100, max(0, (ad + 1) / 2 * 100))
        if not indicators:
            return {'error': 'Insufficient market data'}
        index = np.mean(list(indicators.values()))
        return {
            'fear_greed_index': index,
            'label': self._label_sentiment(index),
            'components': indicators,
            'contrarian_signal': 'buy' if index < 25 else 'sell' if index > 75 else 'neutral'
        }

    def _label_sentiment(self, index: float) -> str:
        if index < 20:
            return 'Extreme Fear'
        if index < 40:
            return 'Fear'
        if index < 60:
            return 'Neutral'
        if index < 80:
            return 'Greed'
        return 'Extreme Greed'

    def news_sentiment(self, headlines: list) -> dict:
        """Analyze sentiment from news headlines with a keyword lexicon."""
        positive_words = {'surge', 'rally', 'gain', 'jump', 'soar', 'boom',
                          'record', 'bullish', 'upgrade', 'beat', 'strong'}
        negative_words = {'crash', 'plunge', 'drop', 'fall', 'sink', 'bear',
                          'recession', 'crisis', 'downgrade', 'miss', 'weak'}
        scores = []
        for headline in headlines:
            # Strip surrounding punctuation so "rally," still matches "rally".
            # Matching is exact: inflected forms such as "rallies" are missed.
            words = {w.strip('.,!?:;"\'()') for w in headline.lower().split()}
            pos = len(words & positive_words)
            neg = len(words & negative_words)
            if pos + neg > 0:
                score = (pos - neg) / (pos + neg)
            else:
                score = 0
            scores.append(score)
        avg_sentiment = np.mean(scores) if scores else 0
        return {
            'average_sentiment': avg_sentiment,
            'label': 'positive' if avg_sentiment > 0.1 else 'negative' if avg_sentiment < -0.1 else 'neutral',
            'n_headlines': len(headlines),
            'extreme_negative_count': sum(1 for s in scores if s < -0.5),
            'extreme_positive_count': sum(1 for s in scores if s > 0.5)
        }
Volatility Forecasting (GARCH)
class GARCHModel:
    """
    GARCH(1,1) for volatility forecasting.
    Captures volatility clustering: large moves follow large moves.
    sigma²_t = omega + alpha * r²_{t-1} + beta * sigma²_{t-1}
    """

    def __init__(self, omega: float = 0.00001, alpha: float = 0.1,
                 beta: float = 0.85):
        self.omega = omega
        self.alpha = alpha
        self.beta = beta
        self.fitted = False

    @staticmethod
    def _conditional_variance(returns, omega, alpha, beta):
        """Run the GARCH(1,1) recursion, seeded with the sample variance."""
        n = len(returns)
        sigma2 = np.zeros(n)
        sigma2[0] = np.var(returns)
        for t in range(1, n):
            sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1]
            sigma2[t] = max(sigma2[t], 1e-10)  # numerical floor
        return sigma2

    def fit(self, returns: np.ndarray):
        """Fit GARCH(1,1) using (Gaussian) maximum likelihood."""
        from scipy.optimize import minimize
        returns = np.asarray(returns, dtype=float)

        def neg_log_likelihood(params):
            omega, alpha, beta = params
            # Penalize parameters outside the positivity/stationarity region
            if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
                return 1e10
            sigma2 = self._conditional_variance(returns, omega, alpha, beta)
            ll = -0.5 * np.sum(np.log(2 * np.pi * sigma2) + returns**2 / sigma2)
            return -ll

        result = minimize(
            neg_log_likelihood,
            x0=[self.omega, self.alpha, self.beta],
            method='Nelder-Mead'
        )
        self.omega, self.alpha, self.beta = result.x
        self.fitted = True
        self._returns = returns
        # Compute final conditional variance
        self._sigma2 = self._conditional_variance(
            returns, self.omega, self.alpha, self.beta
        )
        return self

    def forecast_volatility(self, steps: int = 1) -> np.ndarray:
        """Forecast conditional volatility for future periods."""
        if not self.fitted:
            raise RuntimeError("call fit() before forecast_volatility()")
        forecasts = np.zeros(steps)
        last_sigma2 = self._sigma2[-1]
        last_return2 = self._returns[-1]**2
        for h in range(steps):
            if h == 0:
                forecasts[h] = self.omega + self.alpha * last_return2 + self.beta * last_sigma2
            else:
                # Beyond one step, E[r²] equals the forecast variance itself
                forecasts[h] = self.omega + (self.alpha + self.beta) * forecasts[h-1]
        return np.sqrt(forecasts)  # Return as volatility (std dev)

    def long_run_volatility(self) -> float:
        """Unconditional (long-run) volatility."""
        long_run_var = self.omega / (1 - self.alpha - self.beta)
        return np.sqrt(long_run_var)

    def half_life(self) -> float:
        """Half-life of variance shocks (periods to decay by 50%)."""
        persistence = self.alpha + self.beta
        if persistence >= 1:
            return float('inf')
        return np.log(0.5) / np.log(persistence)
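The multi-step recursion in `forecast_volatility` has a closed form: the forecast variance decays geometrically from the one-step-ahead value toward the long-run variance at rate alpha + beta, which is also where the half-life formula comes from. The sketch below checks the recursion against that closed form in plain Python. The parameter values are the class defaults and the starting variance is an arbitrary assumption, not fitted output.

```python
import math


def garch_variance_path(omega, alpha, beta, next_var, steps):
    """Iterate E[sigma2_{t+h}] = omega + (alpha + beta) * E[sigma2_{t+h-1}]."""
    path = [next_var]
    for _ in range(steps - 1):
        path.append(omega + (alpha + beta) * path[-1])
    return path


def garch_variance_closed_form(omega, alpha, beta, next_var, h):
    """h-step-ahead variance; h = 1 returns next_var itself."""
    persistence = alpha + beta
    long_run = omega / (1.0 - persistence)
    return long_run + persistence ** (h - 1) * (next_var - long_run)


# Illustrative parameters (the class defaults), not fitted values
omega, alpha, beta = 1e-5, 0.10, 0.85
next_var = 4e-4  # assumed one-step-ahead variance (2% daily volatility)

path = garch_variance_path(omega, alpha, beta, next_var, steps=60)
long_run_var = omega / (1.0 - alpha - beta)
half_life = math.log(0.5) / math.log(alpha + beta)
```

Because the starting variance is above the long-run level here, the path declines monotonically toward it, and the gap between the two halves roughly every `half_life` periods.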
Options-Implied Probabilities
class OptionsImpliedProbability:
    """Extract probability distributions from option prices."""

    @staticmethod
    def implied_move(atm_straddle_price: float, stock_price: float,
                     days_to_expiry: int) -> dict:
        """
        The at-the-money straddle price implies the expected move.
        The straddle approximates the expected absolute move, which for a
        normal distribution is sqrt(2/pi) (about 0.8) standard deviations,
        so it is scaled by sqrt(pi/2) to recover the implied volatility.
        """
        implied_move_pct = atm_straddle_price / stock_price
        period_vol = implied_move_pct * np.sqrt(np.pi / 2)
        annualized_vol = period_vol * np.sqrt(252 / days_to_expiry)
        return {
            'implied_move_pct': implied_move_pct * 100,
            'implied_move_dollars': atm_straddle_price,
            'implied_range': (
                stock_price * (1 - implied_move_pct),
                stock_price * (1 + implied_move_pct)
            ),
            'annualized_vol': annualized_vol * 100,
            'one_std_dev_range': (
                stock_price * (1 - period_vol),
                stock_price * (1 + period_vol)
            )
        }

    @staticmethod
    def probability_above(current_price: float, strike: float,
                          implied_vol: float, days: int,
                          risk_free_rate: float = 0.05) -> float:
        """
        Risk-neutral probability that price exceeds a given strike at
        expiration (using Black-Scholes N(d2)). `days` is in trading days.
        """
        from scipy.stats import norm
        T = days / 252
        d2 = (np.log(current_price / strike) +
              (risk_free_rate - 0.5 * implied_vol**2) * T) / \
            (implied_vol * np.sqrt(T))
        return norm.cdf(d2)

    @staticmethod
    def risk_neutral_density(strikes: np.ndarray, call_prices: np.ndarray,
                             risk_free_rate: float, T: float) -> dict:
        """
        Extract risk-neutral probability density from option prices
        using Breeden-Litzenberger formula.
        Strikes must be sorted in ascending order.
        """
        # Second derivative of call price w.r.t. strike = risk-neutral density
        dK = np.diff(strikes)
        dC = np.diff(call_prices)
        first_deriv = dC / dK
        d2K = (dK[:-1] + dK[1:]) / 2
        d2C = np.diff(first_deriv)
        density = np.exp(risk_free_rate * T) * d2C / d2K
        mid_strikes = strikes[1:-1]
        # Negative values come from noisy or non-convex quotes; clip them to
        # zero and weight by strike spacing so that uneven grids are handled.
        weights = np.clip(density, 0, None) * d2K
        total = np.sum(weights)
        mean = np.sum(mid_strikes * weights) / total
        return {
            'strikes': mid_strikes,
            'density': density,
            'mean': mean,
            'std': np.sqrt(np.sum(mid_strikes**2 * weights) / total - mean**2)
        }
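A useful sanity check for the Breeden-Litzenberger extraction is to feed it call prices generated from a known model and confirm that the recovered density integrates to about one and has a mean equal to the forward price. The sketch below does this with Black-Scholes prices on a uniform strike grid, using only the standard library. All inputs (spot, volatility, rate, maturity, grid) are assumed values chosen for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, vol, rate, T):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * T) / (vol * math.sqrt(T))
    d2 = d1 - vol * math.sqrt(T)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * T) * norm_cdf(d2)


# Assumed market inputs and a wide, uniform strike grid
spot, vol, rate, T = 100.0, 0.20, 0.05, 0.5
dK = 0.5
strikes = [20.0 + dK * i for i in range(561)]  # 20.0 to 300.0
calls = [bs_call(spot, k, vol, rate, T) for k in strikes]

# Breeden-Litzenberger: density(K) = exp(rT) * d²C/dK², by central differences
density = [
    math.exp(rate * T) * (calls[i - 1] - 2.0 * calls[i] + calls[i + 1]) / dK ** 2
    for i in range(1, len(strikes) - 1)
]
inner_strikes = strikes[1:-1]

total_mass = sum(q * dK for q in density)
mean = sum(k * q * dK for k, q in zip(inner_strikes, density)) / total_mass
forward = spot * math.exp(rate * T)
```

With real quotes the grid is coarse and prices are noisy, so the raw second difference is usually unstable; smoothing the prices or the implied volatilities before differentiating is a common remedy.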
Machine Learning for Alpha Generation
class MLAlphaModel:
    """Machine learning pipeline for financial prediction."""

    def __init__(self):
        self.feature_generators = []
        self.model = None

    def generate_features(self, df: pd.DataFrame) -> pd.DataFrame:
        """Generate predictive features from OHLCV data.

        Every feature uses only data up to and including the current row.
        """
        features = pd.DataFrame(index=df.index)
        # Returns
        for period in [1, 5, 10, 20, 60]:
            features[f'return_{period}d'] = df['close'].pct_change(period)
        # Volatility
        for period in [5, 10, 20]:
            features[f'volatility_{period}d'] = df['close'].pct_change().rolling(period).std()
        # Volume features
        features['volume_ratio_20d'] = df['volume'] / df['volume'].rolling(20).mean()
        features['volume_trend'] = df['volume'].rolling(5).mean() / df['volume'].rolling(20).mean()
        # Price relative to moving averages
        for period in [10, 20, 50, 200]:
            features[f'price_vs_ma{period}'] = df['close'] / df['close'].rolling(period).mean() - 1
        # RSI
        delta = df['close'].diff()
        gain = delta.clip(lower=0).rolling(14).mean()
        loss = (-delta.clip(upper=0)).rolling(14).mean()
        features['rsi_14'] = 100 - (100 / (1 + gain / loss.replace(0, np.nan)))
        # Bollinger band position: -1 = lower band, 0 = middle, +1 = upper band
        sma20 = df['close'].rolling(20).mean()
        std20 = df['close'].rolling(20).std()
        features['bb_position'] = (df['close'] - sma20) / (2 * std20)
        # Day of week, month
        if isinstance(df.index, pd.DatetimeIndex):
            features['day_of_week'] = df.index.dayofweek
            features['month'] = df.index.month
        # Drops the warm-up rows (the first 200, due to the 200-day average)
        return features.dropna()

    def walk_forward_backtest(self, features: pd.DataFrame, target: pd.Series,
                              train_size: int = 252, step_size: int = 21) -> dict:
        """
        Walk-forward validation: train on past, predict future, step forward.
        Prevents look-ahead bias.
        `target` must be a binary label aligned row-for-row with `features`.
        The training window expands; `train_size` is its initial length.
        """
        from sklearn.ensemble import GradientBoostingClassifier
        from sklearn.metrics import accuracy_score, roc_auc_score
        results = []
        n = len(features)
        for start in range(train_size, n - step_size, step_size):
            X_train = features.iloc[:start]
            y_train = target.iloc[:start]
            X_test = features.iloc[start:start + step_size]
            y_test = target.iloc[start:start + step_size]
            model = GradientBoostingClassifier(
                n_estimators=100, max_depth=3, learning_rate=0.1
            )
            model.fit(X_train, y_train)
            predictions = model.predict(X_test)
            probabilities = model.predict_proba(X_test)[:, 1]
            accuracy = accuracy_score(y_test, predictions)
            try:
                auc = roc_auc_score(y_test, probabilities)
            except ValueError:
                # Test window contains a single class; AUC is undefined
                auc = 0.5
            results.append({
                'period_start': features.index[start],
                'accuracy': accuracy,
                'auc': auc,
                'n_predictions': len(y_test)
            })
        if not results:
            return {'error': 'Not enough data for a single walk-forward period'}
        return {
            'mean_accuracy': np.mean([r['accuracy'] for r in results]),
            'mean_auc': np.mean([r['auc'] for r in results]),
            'n_periods': len(results),
            'results': results
        }
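One subtlety the walk-forward loop does not address on its own: if the target is built from future returns (for example, the sign of the next 5-day return), the last few training labels overlap the test window and leak information. A common remedy is to leave a gap between the end of training and the start of testing. The sketch below generates expanding-window index ranges with such a gap. The function name, the `gap` parameter, and the sample sizes are assumptions for illustration, and it includes a final full window that the loop above would skip.

```python
def walk_forward_splits(n_samples, train_size, step_size, gap=0):
    """Yield (train_range, test_range) pairs for an expanding window.

    `gap` drops the last `gap` rows before each test window from training,
    so labels computed from future returns cannot overlap the test period.
    """
    start = train_size
    while start + step_size <= n_samples:
        yield range(0, start - gap), range(start, start + step_size)
        start += step_size


# Assumed sizes: 400 rows, one year of initial training, monthly steps,
# and a 5-row gap matching a hypothetical 5-day forward-return label
splits = list(walk_forward_splits(n_samples=400, train_size=252, step_size=21, gap=5))
first_train, first_test = splits[0]
```

Each range can be passed straight to `DataFrame.iloc`, for example `features.iloc[list(train_range)]`, and the gap should be at least as long as the label horizon.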
Key Takeaways
- Technical analysis identifies patterns in price and volume; it works best as a timing tool on top of fundamental views
- Fundamental analysis (DCF, relative valuation) provides the "what to buy" while technicals provide the "when to buy"
- Sentiment indicators (VIX, put/call ratio, Fear & Greed) are most valuable as contrarian signals at extremes
- GARCH models capture volatility clustering and provide calibrated volatility forecasts essential for risk management
- Option prices encode the market's risk-neutral probability distribution for future prices; the Breeden-Litzenberger formula extracts it
- Machine learning for financial prediction requires walk-forward validation to avoid look-ahead bias; random train/test splits leak future information into training and are invalid
- The most robust predictive features tend to be simple: momentum, mean reversion, volume, and volatility, not complex patterns
- Transaction costs, slippage, and market impact are the reality check; many "profitable" strategies evaporate after costs
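The last takeaway can be made concrete with a small cost model. The sketch below subtracts a proportional cost from per-period strategy returns every time the position changes. The function name, the 10 basis point cost, and the sample numbers are assumptions for illustration, and it ignores slippage and market impact, which typically grow with trade size.

```python
def net_returns(gross_returns, positions, cost_per_turnover=0.001):
    """Subtract proportional trading costs from per-period strategy returns.

    positions[t] is the position (-1, 0 or 1) held during period t.
    Cost is charged on the absolute change in position, starting from flat.
    """
    net = []
    previous = 0.0
    for gross, position in zip(gross_returns, positions):
        turnover = abs(position - previous)  # a flip from +1 to -1 counts as 2
        net.append(gross - cost_per_turnover * turnover)
        previous = position
    return net


# Hypothetical strategy: small positive gross returns with frequent trading
gross = [0.002, -0.001, 0.003, 0.001]
positions = [1, 1, -1, 0]
net = net_returns(gross, positions, cost_per_turnover=0.001)
```

In this toy example most of the gross return is consumed by costs, which is the typical fate of high-turnover signals with a thin edge.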