Welcome to the Deep Reinforcement Learning-based Stock Trading Framework! This project leverages financial data, key technical indicators, and deep reinforcement learning techniques to simulate a robust trading agent that learns to maximize portfolio returns.
- Download historical stock market data using the yfinance library.
- Compute popular indicators: SMA, RSI, MACD.
- Custom trading environment with realistic constraints like transaction costs and balance tracking.
- Reward structure encouraging portfolio optimization and volatility management.
- HighPerformancePolicyNetwork: Residual connections, BatchNorm, and Dropout for stability.
- SimplePolicyNetwork: LayerNorm, GELU activation, and improved regularization.
- Intuitive graphs for portfolio value, actions, and price trends.
- Easy-to-interpret results for training and testing phases.
- Download and process stock market data.
- Compute SMA, RSI, and MACD indicators.
- TradingEnv: Custom RL environment for portfolio management.
- Actions: Sell, Buy, Hold
- Two versions of neural networks for action prediction:
- HighPerformancePolicyNetwork: Residuals & BatchNorm for high performance.
- SimplePolicyNetwork: Lightweight with GELU and Dropout.
- Implements policy gradient optimization with entropy regularization.
- Early stopping mechanism to prevent overfitting.
- Predict portfolio behavior and plot results with easy-to-read graphs.
- Set the stock symbol and date range to automatically download price data for that ticker (e.g., AAPL, NVDA).
data = get_stock_data('NVDA', start_date='2023-01-01', end_date='2025-01-01')
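A minimal sketch of what `get_stock_data` could look like, assuming daily OHLCV bars from yfinance (the column-flattening step is only needed on newer yfinance versions that return a MultiIndex):

```python
import pandas as pd
import yfinance as yf

def get_stock_data(symbol, start_date, end_date):
    """Download daily OHLCV data for `symbol` between the given dates."""
    data = yf.download(symbol, start=start_date, end=end_date)
    if isinstance(data.columns, pd.MultiIndex):
        data.columns = data.columns.get_level_values(0)  # drop the ticker level (newer yfinance)
    return data.dropna()  # remove rows with missing values
```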
- Calculate key indicators (a computation sketch follows this list):
- Simple Moving Average (SMA): Tracks long-term trends.
- Relative Strength Index (RSI): Measures momentum.
- Moving Average Convergence Divergence (MACD): Tracks trend changes.
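The window lengths below are illustrative defaults (20-day SMA, 14-day RSI, 12/26-day EMAs for MACD), and the helper name `add_indicators` is hypothetical:

```python
def add_indicators(data, sma_window=20, rsi_window=14):
    # Simple Moving Average: rolling mean of closing prices
    data['SMA'] = data['Close'].rolling(window=sma_window).mean()

    # Relative Strength Index: average gain vs. average loss over the window
    delta = data['Close'].diff()
    gain = delta.clip(lower=0).rolling(window=rsi_window).mean()
    loss = (-delta.clip(upper=0)).rolling(window=rsi_window).mean()
    data['RSI'] = 100 - 100 / (1 + gain / loss)

    # MACD: 12-day EMA minus 26-day EMA of the closing price
    ema12 = data['Close'].ewm(span=12, adjust=False).mean()
    ema26 = data['Close'].ewm(span=26, adjust=False).mean()
    data['MACD'] = ema12 - ema26

    return data.dropna()
```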
- Simulates realistic trading conditions:
- Balance Tracking: Start with $1000.
- Transaction Costs: 0.5% cost per trade.
- State Representation: Combines past 5 days' data into one state.
env = TradingEnv(data, window_size=5)
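The skeleton below illustrates how balance tracking, the 0.5% transaction cost, and the 5-day window could fit together. The class name, action encoding (0 = Sell, 1 = Buy, 2 = Hold), and step signature are assumptions, not the project's exact API:

```python
import numpy as np

class TradingEnvSketch:
    """Illustrative skeleton of the environment mechanics, not the exact implementation."""
    def __init__(self, data, window_size=5, initial_balance=1000, transaction_cost=0.005):
        self.data = data
        self.window_size = window_size
        self.initial_balance = initial_balance
        self.transaction_cost = transaction_cost
        self.reset()

    def reset(self):
        self.balance = self.initial_balance   # cash on hand
        self.shares = 0                       # shares currently held
        self.t = self.window_size             # current time index
        return self._state()

    def _state(self):
        # Stack the last `window_size` rows of features into one flat state vector
        window = self.data.iloc[self.t - self.window_size:self.t]
        return window.values.flatten()

    def step(self, action):
        price = float(self.data['Close'].iloc[self.t])
        if action == 1 and self.balance >= price:      # Buy one share
            self.balance -= price * (1 + self.transaction_cost)
            self.shares += 1
        elif action == 0 and self.shares > 0:           # Sell one share
            self.balance += price * (1 - self.transaction_cost)
            self.shares -= 1
        # action == 2: Hold

        self.t += 1
        done = self.t >= len(self.data)
        next_price = float(self.data['Close'].iloc[min(self.t, len(self.data) - 1)])
        portfolio_value = self.balance + self.shares * next_price
        reward = portfolio_value - self.initial_balance  # simplistic reward for illustration
        return (self._state() if not done else None), reward, done
```

The actual environment additionally shapes the reward toward portfolio optimization and volatility management, as noted in the features above.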
- Two policy network options (a sketch of the lightweight variant follows this list):
- HighPerformancePolicyNetwork 🏋️: Designed for large-scale training.
- SimplePolicyNetwork 🎯: Lightweight and efficient.
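Assuming a PyTorch backend, the lightweight variant might look roughly like this (hidden size and dropout rate are illustrative):

```python
import torch
import torch.nn as nn

class SimplePolicyNetwork(nn.Module):
    """Lightweight policy: LayerNorm + GELU + Dropout, outputs action probabilities."""
    def __init__(self, state_dim, hidden_dim=128, n_actions=3, dropout=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim),
            nn.LayerNorm(hidden_dim),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, hidden_dim),
            nn.LayerNorm(hidden_dim),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, n_actions),
        )

    def forward(self, state):
        logits = self.net(state)
        return torch.softmax(logits, dim=-1)  # probabilities over Sell / Buy / Hold
```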
- Training Process
- Policy gradient method with discounted rewards.
- Gradient clipping and entropy regularization for stability.
portfolio_values, initial_balance = train(env, policy_net, optimizer)
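A sketch of a single REINFORCE-style update with discounted returns, entropy regularization, and gradient clipping; the coefficient values and the helper name `update_policy` are assumptions, not the project's exact training loop:

```python
import torch

def update_policy(policy_net, optimizer, log_probs, entropies, rewards,
                  gamma=0.99, entropy_coef=0.01, max_grad_norm=1.0):
    """One policy-gradient update over a finished episode (illustrative values)."""
    # Discounted returns, computed backwards through the episode
    returns, G = [], 0.0
    for r in reversed(rewards):
        G = r + gamma * G
        returns.insert(0, G)
    returns = torch.tensor(returns)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)  # normalize for stability

    # Policy-gradient loss with an entropy bonus to keep exploration alive
    loss = -(torch.stack(log_probs) * returns).sum() - entropy_coef * torch.stack(entropies).sum()

    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(policy_net.parameters(), max_grad_norm)  # gradient clipping
    optimizer.step()
    return loss.item()
```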
- Evaluate the model by visualizing:
- Portfolio value during the prediction phase.
- 'Buy', 'Sell' and 'Hold' actions overlaid on price trends.
visualize_training(portfolio_values, initial_balance)
predict(env, policy_net)
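The action overlay can be reproduced with a few lines of matplotlib; this is not the project's `visualize_training`/`predict` code, just an example of the Buy/Sell markers described above (assuming 0 = Sell, 1 = Buy):

```python
import matplotlib.pyplot as plt

def plot_actions(prices, actions):
    """Overlay Buy/Sell decisions on the closing-price series."""
    plt.figure(figsize=(12, 5))
    plt.plot(prices, label='Close price')
    buys = [i for i, a in enumerate(actions) if a == 1]
    sells = [i for i, a in enumerate(actions) if a == 0]
    plt.scatter(buys, [prices[i] for i in buys], marker='^', color='green', label='Buy')
    plt.scatter(sells, [prices[i] for i in sells], marker='v', color='red', label='Sell')
    plt.xlabel('Trading day')
    plt.ylabel('Price')
    plt.legend()
    plt.show()
```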
💡 The following graph shows the change in portfolio value during training.
- Add additional technical indicators (e.g., Bollinger Bands, ATR).
- Integrate more diverse datasets and multiple stocks.
- Experiment with other RL algorithms (e.g., PPO, A3C).
Seoul National University Graduate School of Data Science (SNU GSDS), under the guidance of Navy Lee.
For any questions or feedback, contact us at: 📧 [email protected] or [email protected]