- Initialise
python3 -m venv .venv
- Activate
source .venv/bin/activate
- Install dependencies
pip3 install -r requirements.txt
- Generate model
python3 gen.py
NOTE: this will fail on the first run. Fix the following manually, then run python3 gen.py again:
- BTCUSD_1H.csv: remove the URL from the first line of the file.
- BTC_sentiment.csv: the data is wrapped in JSON and the columns need reordering (date should be the first column, but it arrives as the last).
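The two manual fixes above could be scripted roughly as follows. This is a minimal sketch, not part of the project: the helper names are hypothetical, it assumes the stray first line starts with http:// or https://, and it assumes the date really is the last column. Unwrapping the JSON is left out because the wrapper's structure is not described here.

```python
import csv
import io


def strip_leading_url_line(text: str) -> str:
    """Drop the first line if it looks like a URL rather than a CSV header."""
    first, _, rest = text.partition("\n")
    # Assumption: the unwanted line begins with an http(s) scheme.
    if first.strip().lower().startswith(("http://", "https://")):
        return rest
    return text


def move_last_column_first(text: str) -> str:
    """Rewrite CSV text so the last column (assumed to be the date) comes first."""
    rows = list(csv.reader(io.StringIO(text)))
    # Skip blank lines; rotate each row so its final field leads.
    reordered = [row[-1:] + row[:-1] for row in rows if row]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(reordered)
    return out.getvalue()
```

To apply either helper to a file, read it with pathlib.Path("BTCUSD_1H.csv").read_text(), pass the text through the function, and write the result back with write_text(). Keep a copy of the original file first.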
- Run data collection
python3 collect.py
To run collect.py in the background as a persistent process (so it doesn't stop when you close your terminal):
nohup ./.venv/bin/python3 collect.py &
This command uses nohup to prevent the process from being terminated when the controlling terminal is closed, and & to run it in the background. Output will be redirected to nohup.out by default.
This script connects to the Binance WebSocket stream to collect real-time BTCUSDT data and calculate features, logging everything to BTCUSD_trading.csv.
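As a rough illustration of the message-to-CSV step, the sketch below parses one stream message and appends it as a row. It is not the code in collect.py: the function names are hypothetical, and it assumes a Binance trade-stream payload with the keys T (trade time in milliseconds), p (price) and q (quantity). The stream and columns the script actually uses may differ.

```python
import csv
import io
import json


def trade_to_row(message: str) -> list:
    """Convert one trade-stream JSON message into [time_ms, price, quantity]."""
    event = json.loads(message)
    # Assumed payload keys: T = trade time (ms), p = price, q = quantity.
    # Price and quantity arrive as strings, so convert them to floats.
    return [int(event["T"]), float(event["p"]), float(event["q"])]


def append_row(handle, row) -> None:
    """Append one row to an already open CSV file handle."""
    csv.writer(handle, lineterminator="\n").writerow(row)
```

In a real collector the handle would come from open("BTCUSD_trading.csv", "a", newline=""), and any derived features would be added to the row before it is written.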
To stop the collect.py process if it was started with nohup:
pgrep -f "python3 collect.py" | xargs kill
This command finds the PID of every running collect.py process and sends it SIGTERM (the default kill signal) in one step.
- Run trading bot
python3 trade.py
This script reads the BTCUSD_trading.csv file, loads the pre-trained model, makes price predictions, and logs trade actions to BTCUSD_predictions_trades.csv.
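The step from a price prediction to a logged trade action might look like the sketch below. The rule, the function name and the 0.1% threshold are all assumptions made for illustration; the logic in trade.py is not shown here and may be quite different.

```python
def decide_action(current_price: float, predicted_price: float,
                  threshold: float = 0.001) -> str:
    """Return BUY, SELL or HOLD from the predicted relative price change."""
    if current_price <= 0:
        raise ValueError("current_price must be positive")
    # Relative change the model expects, e.g. 0.01 means +1%.
    change = (predicted_price - current_price) / current_price
    if change > threshold:
        return "BUY"
    if change < -threshold:
        return "SELL"
    # Predicted move is too small to act on.
    return "HOLD"
```

A dead band around zero like this one avoids trading on predictions that barely differ from the current price.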
To view the trade.html page, which displays the BTCUSD_predictions_trades.csv data, you need to run a local web server. Navigate to the project's root directory in your terminal and execute the following command:
python3 -m http.server 4242
After starting the server, open your web browser and go to http://localhost:4242/trade.html (the port must match the one passed to http.server). The page will automatically update with the latest data from BTCUSD_predictions_trades.csv every 5 seconds.
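If you would rather start the server from Python than from the command line, a roughly equivalent sketch using only the standard library is shown here. The make_server helper is hypothetical, and unlike python3 -m http.server it binds to the loopback address only.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(port: int = 4242, directory: str = ".") -> ThreadingHTTPServer:
    """Build a static file server rooted at `directory` (not yet serving)."""
    # SimpleHTTPRequestHandler accepts a `directory` keyword (Python 3.7+).
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    # Bind to loopback only so the files are not exposed on the network.
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling make_server().serve_forever() from the project's root directory then serves trade.html on port 4242 until interrupted with Ctrl+C.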
To run the tests for this project, use the following command:
python3 -m unittest test_features.py
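For orientation, a feature test in the unittest style could look like the sketch below. The simple_moving_average feature and the test names are hypothetical and stand in for whatever test_features.py actually covers.

```python
import unittest


def simple_moving_average(prices, window):
    """Hypothetical feature: mean of the last `window` prices."""
    if window <= 0 or len(prices) < window:
        raise ValueError("need window > 0 and at least `window` prices")
    return sum(prices[-window:]) / window


class TestSimpleMovingAverage(unittest.TestCase):
    def test_average_of_last_window(self):
        # Last two prices are 3.0 and 4.0, so the mean is 3.5.
        self.assertAlmostEqual(
            simple_moving_average([1.0, 2.0, 3.0, 4.0], 2), 3.5
        )

    def test_too_few_prices_raises(self):
        with self.assertRaises(ValueError):
            simple_moving_average([1.0], 2)
```

Saved in a file of its own, a test case like this is discovered and run by the same python3 -m unittest command.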
[] Gradient boosting machines (XGBoost, LightGBM or CatBoost): good for regression tasks such as predicting future prices, and for understanding feature importance.
[] Deep learning (LSTMs or GRUs): specialised for sequential data. Good at capturing long-term dependencies and complex patterns in price data; particularly useful with large datasets.
[] Recurrent neural networks (RNNs and their variants): designed for handling time series data and sequential dependencies effectively.
[] GANs (Generative Adversarial Networks): can be used to simulate and generate synthetic price data for testing trading strategies and risk analysis.
[] Bayesian machine learning: can help with probabilistic modelling, providing a range of possible outcomes and quantifying uncertainty in predictions. Especially useful for volatile assets like Bitcoin.
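To make the Bayesian item concrete, here is a toy sketch of quantifying uncertainty. It is not a proposed trading model: it uses the textbook Normal-Normal conjugate update for a mean return with known observation variance, and the function names, priors and variances are all assumptions.

```python
import math


def posterior_mean_return(returns, prior_mean=0.0, prior_var=1.0, obs_var=1.0):
    """Normal-Normal conjugate update for the mean return (obs_var known).

    Returns (posterior_mean, posterior_var).
    """
    n = len(returns)
    # Precisions (inverse variances) add under the conjugate update.
    precision = 1.0 / prior_var + n / obs_var
    mean = (prior_mean / prior_var + sum(returns) / obs_var) / precision
    return mean, 1.0 / precision


def credible_interval(mean, var, z=1.96):
    """Approximate 95% interval for the mean under the normal posterior."""
    half_width = z * math.sqrt(var)
    return mean - half_width, mean + half_width
```

The posterior variance shrinks as more returns are observed, so the interval narrows with more data and stays wide when data is scarce, which is the "range of possible outcomes" the item refers to.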