πŸ’ΉπŸ“ˆ Crypto Predictor - AI-Based Cryptocurrency Trading Strategy

This project leverages machine learning to develop a strategy for cryptocurrency trading. The core objective is to build a model that predicts whether the price of a cryptocurrency will increase, decrease, or remain stable over the next market interval (the following days). This is achieved by collecting historical price data and applying machine learning techniques to make predictions based on these trends.
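The three-way target (increase, decrease, stable) can be derived from the next day's relative price change. A minimal sketch of such a labeling function is shown below; the `1%` threshold for "stable" is an illustrative assumption, not the project's exact value:

```python
import pandas as pd

def label_movements(close: pd.Series, threshold: float = 0.01) -> pd.Series:
    """Label each day by the next day's relative close-price change:
    1 = increase, -1 = decrease, 0 = stable (within +/- threshold).
    The last row has no next day, so it defaults to 0."""
    next_return = close.shift(-1) / close - 1.0
    labels = pd.Series(0, index=close.index)
    labels[next_return > threshold] = 1
    labels[next_return < -threshold] = -1
    return labels

prices = pd.Series([100.0, 102.5, 102.6, 100.0, 100.2])
print(label_movements(prices).tolist())  # [1, 0, -1, 0, 0]
```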

Technical analysis indicators and market statistics were used to forecast cryptocurrency price changes. The project builds a dataset from historical cryptocurrency data and employs various classification methods to predict future price movements. The resulting application, Crypto Predictor, is a web-based tool designed to aid users in trading by providing trend predictions for selected cryptocurrencies.

This project was originally developed for the course β€˜Data Mining & Machine Learning’ in MSc in Artificial Intelligence & Data Engineering at University of Pisa.

*(Figure: architecture)*

πŸ“Š Dataset Creation

The dataset was created by scraping historical cryptocurrency values from Yahoo Finance for Bitcoin, Ethereum, and Binance Coin (BNB) between 2020 and 2022. The Moving Average Cross Strategy, a common financial approach, was employed to generate features for the model.

This includes the concept of the Golden Cross, where a short-term moving average crosses above a long-term moving average, indicating a potential rise in the price trend:

*(Figure: golden cross example)*

The dataset includes key financial attributes such as open, high, low, close, and adjusted close prices, along with short-term and long-term exponential moving averages (EMAs).
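The EMA features and golden-cross signal can be computed directly with pandas. The sketch below uses illustrative spans of 12 and 26 days (common defaults, not necessarily the project's exact values):

```python
import pandas as pd

def add_ema_features(df: pd.DataFrame, short_span: int = 12,
                     long_span: int = 26) -> pd.DataFrame:
    """Add short/long exponential moving averages of the close price
    and a golden-cross flag (short EMA crossing above the long EMA)."""
    out = df.copy()
    out["ema_short"] = out["Close"].ewm(span=short_span, adjust=False).mean()
    out["ema_long"] = out["Close"].ewm(span=long_span, adjust=False).mean()
    above = out["ema_short"] > out["ema_long"]
    # Golden cross: short EMA moves from below to above the long EMA.
    out["golden_cross"] = above & ~above.shift(1, fill_value=False)
    return out

df = pd.DataFrame({"Close": [10.0, 9.0, 8.0, 9.5, 11.0, 12.0]})
print(add_ema_features(df, short_span=2, long_span=4)[["ema_short", "ema_long", "golden_cross"]])
```

Because the short EMA weights recent prices more heavily, it reacts faster to the rebound and eventually crosses the long EMA, which the `golden_cross` column flags on exactly that day.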

πŸ—οΈ Implementation Steps

The project involved several steps:

  1. Feature Generation
    • New attributes, such as short-term and long-term EMAs, were generated.
  2. Feature Selection
    • Attributes strongly correlated with the output classes were selected using a supervised heuristic approach based on mutual information.
  3. Data Normalization
    • Standardization of features was performed using z-score normalization to ensure that all features have a mean of 0 and variance of 1.
  4. Feature Transformation
    • Principal Component Analysis (PCA) was applied to reduce the dimensionality of the dataset while retaining most of the variation in the data.

πŸ›οΈ Architecture

Here is the architecture that was used: architecture

πŸ” Classification Models

Various classifiers were tested, including:

  • K-Nearest Neighbors
  • Logistic Regression
  • Gaussian NaΓ―ve Bayes
  • AdaBoost
  • Random Forest

The goal was to achieve high accuracy and F-measure scores, with an accuracy target of at least 70%. After extensive testing, the Random Forest classifier was selected as the best model due to its robust performance across different datasets.
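A minimal sketch of training and scoring a Random Forest on synthetic three-class data (the data and hyperparameters here are illustrative, not the project's tuned configuration):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
# Synthetic target tied to the first feature so the model has signal to learn.
y = np.where(X[:, 0] > 0.3, 1, np.where(X[:, 0] < -0.3, -1, 0))

# Chronological-style split: train on the first 240 rows, test on the rest.
X_train, X_test = X[:240], X[240:]
y_train, y_test = y[:240], y[240:]

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
print(accuracy_score(y_test, pred), f1_score(y_test, pred, average="macro"))
```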

πŸ“‰ Model Evaluation and Selection

| Currency | Selected Model | Class | Precision | Recall | F1-score | Support | Accuracy |
|----------|----------------|-------|-----------|--------|----------|---------|----------|
| BTC | Random Forest | -1.0 | 0.73 | 0.54 | 0.62 | 61 | 0.722 |
|  |  | 0.0 | 0.67 | 0.77 | 0.71 | 98 |  |
|  |  | 1.0 | 0.81 | 0.84 | 0.82 | 61 |  |
| ETH | Random Forest | -1.0 | 0.83 | 0.74 | 0.78 | 65 | 0.618 |
|  |  | 0.0 | 0.52 | 0.15 | 0.23 | 73 |  |
|  |  | 1.0 | 0.55 | 0.94 | 0.69 | 82 |  |
| BNB | Random Forest | -1.0 | 0.74 | 0.58 | 0.65 | 59 | 0.641 |
|  |  | 0.0 | 0.59 | 0.43 | 0.50 | 84 |  |
|  |  | 1.0 | 0.63 | 0.92 | 0.75 | 77 |  |

Time Series Split Cross-Validation was used to evaluate the models, ensuring that temporal dependencies in the data were preserved. The Random Forest classifier demonstrated the best performance, showing resilience against trend drifts and achieving consistent accuracy and F1-score results across different temporal folds.
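Time Series Split Cross-Validation grows the training window over time so each fold trains only on the past and tests on the subsequent window. A small sketch with scikit-learn's `TimeSeriesSplit` (the sample count and fold count are illustrative):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(24).reshape(-1, 1)  # 24 time-ordered samples (e.g. daily rows)

tscv = TimeSeriesSplit(n_splits=3)
for fold, (train_idx, test_idx) in enumerate(tscv.split(X)):
    # Training indices always precede test indices: no future data leaks in.
    print(f"fold {fold}: train [0..{train_idx[-1]}], "
          f"test [{test_idx[0]}..{test_idx[-1]}]")
```

Unlike shuffled k-fold, this scheme respects the temporal order of market data, which is why it was used to check the models' resilience against trend drift.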

πŸ”— GitHub Repository

Visit the project repository here for access to the codebase and project documentation.

(if you enjoyed this content, please consider leaving a star ⭐).

πŸ“Έ Screenshots

Here are two screenshots illustrating the web app interface: *(Screenshot 1)* *(Screenshot 2)*

This post is licensed under CC BY 4.0 by the author.