Merge branch 'develop' into pr/Axel-CH/8779

This commit is contained in:
Matthias
2023-06-20 17:43:50 +02:00
59 changed files with 2073 additions and 1217 deletions

View File

@@ -136,6 +136,7 @@ jobs:
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
check-latest: true
- name: Cache_dependencies
uses: actions/cache@v3

View File

@@ -6,7 +6,7 @@ To download data (candles / OHLCV) needed for backtesting and hyperoptimization
If no additional parameter is specified, freqtrade will download data for `"1m"` and `"5m"` timeframes for the last 30 days.
Exchange and pairs will come from `config.json` (if specified using `-c/--config`).
Otherwise `--exchange` becomes mandatory.
Without provided configuration, `--exchange` becomes mandatory.
You can use a relative timerange (`--days 20`) or an absolute starting point (`--timerange 20200101-`). For incremental downloads, the relative approach should be used.
@@ -83,40 +83,47 @@ Common arguments:
```
!!! Tip "Downloading all data for one quote currency"
Often, you'll want to download data for all pairs of a specific quote-currency. In such cases, you can use the following shorthand:
`freqtrade download-data --exchange binance --pairs .*/USDT <...>`. The provided "pairs" string will be expanded to contain all active pairs on the exchange.
To also download data for inactive (delisted) pairs, add `--include-inactive-pairs` to the command.
!!! Note "Startup period"
`download-data` is a strategy-independent command. The idea is to download a big chunk of data once, and then iteratively increase the amount of data stored.
For that reason, `download-data` does not care about the "startup-period" defined in a strategy. It's up to the user to download additional days if the backtest should start at a specific point in time (while respecting startup period).
### Pairs file
### Start download
In alternative to the whitelist from `config.json`, a `pairs.json` file can be used.
If you are using Binance for example:
- create a directory `user_data/data/binance` and copy or create the `pairs.json` file in that directory.
- update the `pairs.json` file to contain the currency pairs you are interested in.
A very simple command (assuming an available `config.json` file) can look as follows.
```bash
mkdir -p user_data/data/binance
touch user_data/data/binance/pairs.json
freqtrade download-data --exchange binance
```
The format of the `pairs.json` file is a simple json list.
Mixing different stake-currencies is allowed for this file, since it's only used for downloading.
This will download historical candle (OHLCV) data for all the currency pairs defined in the configuration.
``` json
[
"ETH/BTC",
"ETH/USDT",
"BTC/USDT",
"XRP/ETH"
]
```
Alternatively, specify the pairs directly
```bash
freqtrade download-data --exchange binance --pairs ETH/USDT XRP/USDT BTC/USDT
```
!!! Tip "Downloading all data for one quote currency"
Often, you'll want to download data for all pairs of a specific quote-currency. In such cases, you can use the following shorthand:
`freqtrade download-data --exchange binance --pairs .*/USDT <...>`. The provided "pairs" string will be expanded to contain all active pairs on the exchange.
To also download data for inactive (delisted) pairs, add `--include-inactive-pairs` to the command.
or as regex (in this case, to download all active USDT pairs)
```bash
freqtrade download-data --exchange binance --pairs .*/USDT
```
### Other Notes
* To use a different directory than the exchange specific default, use `--datadir user_data/data/some_directory`.
* To change the exchange used to download the historical data from, please use a different configuration file (you'll probably need to adjust rate limits etc.)
* To use `pairs.json` from some other directory, use `--pairs-file some_other_dir/pairs.json`.
* To download historical candle (OHLCV) data for only 10 days, use `--days 10` (defaults to 30 days).
* To download historical candle (OHLCV) data from a fixed starting point, use `--timerange 20200101-` - which will download all data from January 1st, 2020.
* Use `--timeframes` to specify which timeframes to download historical candle (OHLCV) data for. Default is `--timeframes 1m 5m` which will download 1-minute and 5-minute data.
* To use exchange, timeframe and list of pairs as defined in your configuration file, use the `-c/--config` option. With this, the script uses the whitelist defined in the config as the list of currency pairs to download data for and does not require the pairs.json file. You can combine `-c/--config` with most other options.
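For example, a single command combining several of the documented options above (dates and the data directory are placeholders) could look like this:

```bash
freqtrade download-data -c config.json --timerange 20210101- --timeframes 1h 4h --datadir user_data/data/binance
```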
??? Note "Permission denied errors"
If your configuration directory `user_data` was made by docker, you may get the following error:
@@ -131,39 +138,7 @@ Mixing different stake-currencies is allowed for this file, since it's only used
sudo chown -R $UID:$GID user_data
```
### Start download
Then run:
```bash
freqtrade download-data --exchange binance
```
This will download historical candle (OHLCV) data for all the currency pairs you defined in `pairs.json`.
Alternatively, specify the pairs directly
```bash
freqtrade download-data --exchange binance --pairs ETH/USDT XRP/USDT BTC/USDT
```
or as regex (to download all active USDT pairs)
```bash
freqtrade download-data --exchange binance --pairs .*/USDT
```
### Other Notes
- To use a different directory than the exchange specific default, use `--datadir user_data/data/some_directory`.
- To change the exchange used to download the historical data from, please use a different configuration file (you'll probably need to adjust rate limits etc.)
- To use `pairs.json` from some other directory, use `--pairs-file some_other_dir/pairs.json`.
- To download historical candle (OHLCV) data for only 10 days, use `--days 10` (defaults to 30 days).
- To download historical candle (OHLCV) data from a fixed starting point, use `--timerange 20200101-` - which will download all data from January 1st, 2020.
- Use `--timeframes` to specify which timeframes to download historical candle (OHLCV) data for. Default is `--timeframes 1m 5m` which will download 1-minute and 5-minute data.
- To use exchange, timeframe and list of pairs as defined in your configuration file, use the `-c/--config` option. With this, the script uses the whitelist defined in the config as the list of currency pairs to download data for and does not require the pairs.json file. You can combine `-c/--config` with most other options.
#### Download additional data before the current timerange
### Download additional data before the current timerange
Assuming you downloaded all data from 2022 (`--timerange 20220101-`) - but you'd now like to also backtest with earlier data.
You can do so by using the `--prepend` flag, combined with `--timerange` - specifying an end-date.
@@ -238,7 +213,36 @@ Size has been taken from the BTC/USDT 1m spot combination for the timerange spec
For the best performance/size mix, we recommend using either feather or parquet.
#### Sub-command convert data
### Pairs file
In alternative to the whitelist from `config.json`, a `pairs.json` file can be used.
If you are using Binance for example:
* create a directory `user_data/data/binance` and copy or create the `pairs.json` file in that directory.
* update the `pairs.json` file to contain the currency pairs you are interested in.
```bash
mkdir -p user_data/data/binance
touch user_data/data/binance/pairs.json
```
The format of the `pairs.json` file is a simple json list.
Mixing different stake-currencies is allowed for this file, since it's only used for downloading.
``` json
[
"ETH/BTC",
"ETH/USDT",
"BTC/USDT",
"XRP/ETH"
]
```
!!! Note
The `pairs.json` file is only used when no configuration is loaded (implicitly by naming, or via `--config` flag).
You can force the usage of this file via `--pairs-file pairs.json` - however, we recommend using the pairlist from within the configuration, either via the `exchange.pair_whitelist` or `pairs` setting in the configuration.
## Sub-command convert data
```
usage: freqtrade convert-data [-h] [-v] [--logfile FILE] [-V] [-c PATH]
@@ -290,7 +294,7 @@ Common arguments:
```
##### Example converting data
### Example converting data
The following command will convert all candle (OHLCV) data available in `~/.freqtrade/data/binance` from json to jsongz, saving disk space in the process.
It'll also remove original json data files (`--erase` parameter).
@@ -299,7 +303,7 @@ It'll also remove original json data files (`--erase` parameter).
freqtrade convert-data --format-from json --format-to jsongz --datadir ~/.freqtrade/data/binance -t 5m 15m --erase
```
#### Sub-command convert trade data
## Sub-command convert trade data
```
usage: freqtrade convert-trade-data [-h] [-v] [--logfile FILE] [-V] [-c PATH]
@@ -342,7 +346,7 @@ Common arguments:
```
##### Example converting trades
### Example converting trades
The following command will convert all available trade-data in `~/.freqtrade/data/kraken` from jsongz to json.
It'll also remove original jsongz data files (`--erase` parameter).
@@ -351,7 +355,7 @@ It'll also remove original jsongz data files (`--erase` parameter).
freqtrade convert-trade-data --format-from jsongz --format-to json --datadir ~/.freqtrade/data/kraken --erase
```
### Sub-command trades to ohlcv
## Sub-command trades to ohlcv
When you need to use `--dl-trades` (kraken only) to download data, conversion of trades data to ohlcv data is the last step.
This command will allow you to repeat this last step for additional timeframes without re-downloading the data.
@@ -400,13 +404,13 @@ Common arguments:
```
#### Example trade-to-ohlcv conversion
### Example trade-to-ohlcv conversion
``` bash
freqtrade trades-to-ohlcv --exchange kraken -t 5m 1h 1d --pairs BTC/EUR ETH/EUR
```
### Sub-command list-data
## Sub-command list-data
You can get a list of downloaded data using the `list-data` sub-command.
@@ -451,7 +455,7 @@ Common arguments:
```
#### Example list-data
### Example list-data
```bash
> freqtrade list-data --userdir ~/.freqtrade/user_data/
@@ -465,7 +469,7 @@ ETH/BTC 5m, 15m, 30m, 1h, 2h, 4h, 6h, 12h, 1d
ETH/USDT 5m, 15m, 30m, 1h, 2h, 4h
```
### Trades (tick) data
## Trades (tick) data
By default, `download-data` sub-command downloads Candles (OHLCV) data. Some exchanges also provide historic trade-data via their API.
This data can be useful if you need many different timeframes, since it is only downloaded once, and then resampled locally to the desired timeframes.
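For example, a hedged invocation using the `--dl-trades` flag described in the trades-to-ohlcv section above (Kraken being the exchange this flag is documented for):

```bash
freqtrade download-data --exchange kraken --pairs BTC/EUR ETH/EUR --dl-trades
```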

View File

@@ -180,6 +180,9 @@ You can ask for each of the defined features to be included also for informative
In total, the number of features the user of the presented example strategy has created is: length of `include_timeframes` * no. features in `feature_engineering_expand_*()` * length of `include_corr_pairlist` * no. `include_shifted_candles` * length of `indicator_periods_candles`
$= 3 * 3 * 3 * 2 * 2 = 108$.
!!! note "Learn more about creative feature engineering"
Check out our [medium article](https://emergentmethods.medium.com/freqai-from-price-to-prediction-6fadac18b665) geared toward helping users learn how to creatively engineer features.
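For illustration only, a `freqai` configuration fragment consistent with the multiplication above (assuming three features are defined in the strategy's `feature_engineering_expand_*()` functions; pair names are placeholders) might look like:

```json
"freqai": {
    "feature_parameters": {
        "include_timeframes": ["5m", "15m", "4h"],
        "include_corr_pairlist": ["BTC/USD", "ETH/USD", "LINK/USD"],
        "include_shifted_candles": 2,
        "indicator_periods_candles": [10, 20]
    }
}
```

With these values, $3 * 3 * 3 * 2 * 2 = 108$ features are produced, matching the count above.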
### Gain finer control over `feature_engineering_*` functions with `metadata`
@@ -209,41 +212,7 @@ Another example, where the user wants to use live metrics from the trade databas
You need to set the standard dictionary in the config so that FreqAI can return proper dataframe shapes. These values will likely be overridden by the prediction model, but in the case where the model has yet to set them, or needs a default initial value, the pre-set values are what will be returned.
## Feature normalization
FreqAI is strict when it comes to data normalization. The train features, $X^{train}$, are always normalized to [-1, 1] using a shifted min-max normalization:
$$X^{train}_{norm} = 2 * \frac{X^{train} - X^{train}.min()}{X^{train}.max() - X^{train}.min()} - 1$$
All other data (test data and unseen prediction data in dry/live/backtest) is always automatically normalized to the training feature space according to industry standards. FreqAI stores all the metadata required to ensure that test and prediction features will be properly normalized and that predictions are properly denormalized. For this reason, it is not recommended to eschew industry standards and modify FreqAI internals - however - advanced users can do so by inheriting `train()` in their custom `IFreqaiModel` and using their own normalization functions.
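As a quick sketch of the shifted min-max formula above (illustrative only, not FreqAI's internal code):

```python
import numpy as np

def shifted_minmax(train: np.ndarray, other: np.ndarray):
    """Scale to [-1, 1] using the *train* min/max, then apply the same
    transform to test/prediction data (whose values may leave [-1, 1])."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    scale = lambda x: 2 * (x - lo) / (hi - lo) - 1
    return scale(train), scale(other)

X_train = np.random.rand(100, 5)
X_test = np.random.rand(20, 5)
X_train_norm, X_test_norm = shifted_minmax(X_train, X_test)
```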
## Data dimensionality reduction with Principal Component Analysis
You can reduce the dimensionality of your features by activating the `principal_component_analysis` in the config:
```json
"freqai": {
"feature_parameters" : {
"principal_component_analysis": true
}
}
```
This will perform PCA on the features and reduce their dimensionality so that the explained variance of the data set is >= 0.999. Reducing data dimensionality makes training the model faster and hence allows for more up-to-date models.
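A rough scikit-learn equivalent of that variance criterion (illustrative; FreqAI performs this internally when the flag is set):

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(500, 40)        # stand-in feature matrix

# A float n_components keeps components until explained variance >= 0.999
pca = PCA(n_components=0.999)
X_reduced = pca.fit_transform(X)
print(X.shape[1], "->", X_reduced.shape[1], "features")
```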
## Inlier metric
The `inlier_metric` is a metric aimed at quantifying how similar the features of a data point are to the most recent historical data points.
You define the lookback window by setting `inlier_metric_window` and FreqAI computes the distance between the present time point and each of the previous `inlier_metric_window` lookback points. A Weibull function is fit to each of the lookback distributions and its cumulative distribution function (CDF) is used to produce a quantile for each lookback point. The `inlier_metric` is then computed for each time point as the average of the corresponding lookback quantiles. The figure below explains the concept for an `inlier_metric_window` of 5.
![inlier-metric](assets/freqai_inlier-metric.jpg)
FreqAI adds the `inlier_metric` to the training features and hence gives the model access to a novel type of temporal information.
This function does **not** remove outliers from the data set.
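A loose sketch of the flavor of this computation for a single time point (assuming Euclidean distances and a per-point Weibull fit; FreqAI's actual implementation differs in detail):

```python
import numpy as np
from scipy.stats import weibull_min

window = 5
feats = np.random.rand(200, 10)              # stand-in feature rows, newest last
current, history = feats[-1], feats[-1 - window:-1]

# distance from the present point to each lookback point
dists = np.linalg.norm(history - current, axis=1)

quantiles = []
for i, h in enumerate(history):
    # distances within the lookback window form this point's distribution
    d_hist = np.linalg.norm(history - h, axis=1)
    d_hist = d_hist[d_hist > 0]
    c, loc, scale = weibull_min.fit(d_hist, floc=0)
    quantiles.append(weibull_min.cdf(dists[i], c, loc, scale))

inlier_metric = float(np.mean(quantiles))    # higher = more dissimilar
```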
## Weighting features for temporal importance
### Weighting features for temporal importance
FreqAI allows you to set a `weight_factor` to weight recent data more strongly than past data via an exponential function:
@@ -253,13 +222,103 @@ where $W_i$ is the weight of data point $i$ in a total set of $n$ data points. B
![weight-factor](assets/freqai_weight-factor.jpg)
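A sketch consistent with that exponential form (assuming `weight_factor` enters as shown; see the parameter table for the exact definition):

```python
import numpy as np

def exponential_weights(n: int, weight_factor: float) -> np.ndarray:
    """Weights decay exponentially going back in time; the newest point is ~1."""
    i = np.arange(n)
    return np.exp(-i / (weight_factor * n))[::-1]

w = exponential_weights(10, weight_factor=0.9)
print(w.round(3))    # increases toward the most recent data points
```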
## Building the data pipeline
By default, FreqAI builds a dynamic pipeline based on user configuration settings. The default settings are robust and designed to work with a variety of methods. The two default steps are a `MinMaxScaler(-1,1)` and a `VarianceThreshold`, which removes any column with zero variance. Users can activate further steps with additional configuration parameters. For example, adding `use_SVM_to_remove_outliers: true` to the `freqai` config automatically adds the [`SVMOutlierExtractor`](#identifying-outliers-using-a-support-vector-machine-svm) to the pipeline. Likewise, `principal_component_analysis: true` activates PCA, the [DissimilarityIndex](#identifying-outliers-with-the-dissimilarity-index-di) is activated with `DI_threshold: 1`, noise can be added to the data with `noise_standard_deviation: 0.1`, and [DBSCAN](#identifying-outliers-with-dbscan) outlier removal is activated with `use_DBSCAN_to_remove_outliers: true`.
!!! note "More information available"
Please review the [parameter table](freqai-parameter-table.md) for more information on these parameters.
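Putting those toggles together, a hypothetical `freqai` config fragment activating all of the optional steps mentioned above could look like:

```json
"freqai": {
    "feature_parameters": {
        "use_SVM_to_remove_outliers": true,
        "principal_component_analysis": true,
        "DI_threshold": 1,
        "noise_standard_deviation": 0.1,
        "use_DBSCAN_to_remove_outliers": true
    }
}
```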
### Customizing the pipeline
Users are encouraged to customize the data pipeline to their needs. This can be done by setting `dk.feature_pipeline` to the desired `Pipeline` object inside the `IFreqaiModel` `train()` function, or, for those who prefer not to touch `train()`, by overriding the `define_data_pipeline`/`define_label_pipeline` functions in their `IFreqaiModel`:
!!! note "More information available"
FreqAI uses the [`DataSieve`](https://github.com/emergentmethods/datasieve) pipeline, which follows the SKlearn pipeline API but adds, among other features, coherent point removal across the X, y, and sample_weight vectors, feature removal, and feature-name tracking.
```python
from typing import Any, Dict

from datasieve.transforms import SKLearnWrapper, DissimilarityIndex
from datasieve.pipeline import Pipeline
from sklearn.preprocessing import QuantileTransformer, StandardScaler
from freqai.base_models import BaseRegressionModel
from freqtrade.freqai.data_kitchen import FreqaiDataKitchen


class MyFreqaiModel(BaseRegressionModel):
    """
    Some cool custom model
    """

    def fit(self, data_dictionary: Dict, dk: FreqaiDataKitchen, **kwargs) -> Any:
        """
        My custom fit function
        """
        model = cool_model.fit()
        return model

    def define_data_pipeline(self) -> Pipeline:
        """
        User defines their custom feature pipeline here (if they wish)
        """
        feature_pipeline = Pipeline([
            ('qt', SKLearnWrapper(QuantileTransformer(output_distribution='normal'))),
            ('di', DissimilarityIndex(di_threshold=1))
        ])

        return feature_pipeline

    def define_label_pipeline(self) -> Pipeline:
        """
        User defines their custom label pipeline here (if they wish)
        """
        label_pipeline = Pipeline([
            ('qt', SKLearnWrapper(StandardScaler())),
        ])

        return label_pipeline
```
Here, you are defining the exact pipeline that will be used for your feature set during training and prediction. You can use *most* SKLearn transformation steps by wrapping them in the `SKLearnWrapper` class as shown above. In addition, you can use any of the transformations available in the [`DataSieve` library](https://github.com/emergentmethods/datasieve).
You can easily add your own transformation by creating a class that inherits from the datasieve `BaseTransform` and implementing your `fit()`, `transform()` and `inverse_transform()` methods:
```python
from datasieve.transforms.base_transform import BaseTransform
# import whatever else you need


class MyCoolTransform(BaseTransform):
    def __init__(self, **kwargs):
        self.param1 = kwargs.get('param1', 1)

    def fit(self, X, y=None, sample_weight=None, feature_list=None, **kwargs):
        # do something with X, y, sample_weight, or/and feature_list
        return X, y, sample_weight, feature_list

    def transform(self, X, y=None, sample_weight=None,
                  feature_list=None, outlier_check=False, **kwargs):
        # do something with X, y, sample_weight, or/and feature_list
        return X, y, sample_weight, feature_list

    def inverse_transform(self, X, y=None, sample_weight=None, feature_list=None, **kwargs):
        # do/don't do something with X, y, sample_weight, or/and feature_list
        return X, y, sample_weight, feature_list
```
!!! note "Hint"
You can define this custom class in the same file as your `IFreqaiModel`.
### Migrating a custom `IFreqaiModel` to the new Pipeline
If you have created your own custom `IFreqaiModel` with a custom `train()`/`predict()` function, *and* you still rely on `data_cleaning_train/predict()`, then you will need to migrate to the new pipeline. If your model does *not* rely on `data_cleaning_train/predict()`, then you do not need to worry about this migration.
More details about the migration can be found [here](strategy_migration.md#freqai---new-data-pipeline).
## Outlier detection
Equity and crypto markets suffer from a high level of non-patterned noise in the form of outlier data points. FreqAI implements a variety of methods to identify such outliers and hence mitigate risk.
### Identifying outliers with the Dissimilarity Index (DI)
The Dissimilarity Index (DI) aims to quantify the uncertainty associated with each prediction made by the model.
You can tell FreqAI to remove outlier data points from the training/test data sets using the DI by including the following statement in the config:
@@ -271,7 +330,7 @@ You can tell FreqAI to remove outlier data points from the training/test data se
}
```
The DI allows predictions which are outliers (not existent in the model feature space) to be thrown out due to low levels of certainty. To do so, FreqAI measures the distance between each training data point (feature vector), $X_{a}$, and all other training data points:
This will add the `DissimilarityIndex` step to your `feature_pipeline` and set the threshold to 1. The DI allows predictions which are outliers (not existent in the model feature space) to be thrown out due to low levels of certainty. To do so, FreqAI measures the distance between each training data point (feature vector), $X_{a}$, and all other training data points:
$$ d_{ab} = \sqrt{\sum_{j=1}^p(X_{a,j}-X_{b,j})^2} $$
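A NumPy sketch of that pairwise distance (illustrative; the `DissimilarityIndex` in DataSieve implements the real index, and the reduction to a single DI value below is only a rough flavor, not the exact definition):

```python
import numpy as np
from scipy.spatial.distance import pdist

X_train = np.random.rand(100, 8)     # training feature vectors
x_new = np.random.rand(8)            # incoming prediction point

# d_ab between the new point and every training point
d = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))

d_bar = pdist(X_train).mean()        # average pairwise training distance
DI = d.mean() / d_bar                # rough flavor of the dissimilarity index
print("likely outlier" if DI > 1 else "inlier")
```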
@@ -305,9 +364,9 @@ You can tell FreqAI to remove outlier data points from the training/test data se
}
```
The SVM will be trained on the training data and any data point that the SVM deems to be beyond the feature space will be removed.
This will add the `SVMOutlierExtractor` step to your `feature_pipeline`. The SVM will be trained on the training data, and any data point that the SVM deems to be beyond the feature space will be removed.
FreqAI uses `sklearn.linear_model.SGDOneClassSVM` (details are available on scikit-learn's webpage [here](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDOneClassSVM.html) (external website)) and you can elect to provide additional parameters for the SVM, such as `shuffle`, and `nu`.
You can elect to provide additional parameters for the SVM, such as `shuffle`, and `nu` via the `feature_parameters.svm_params` dictionary in the config.
The parameter `shuffle` is set to `False` by default to ensure consistent results. If it is set to `True`, running the SVM multiple times on the same data set might result in different outcomes, due to `max_iter` being too low for the algorithm to reach the demanded `tol`. Increasing `max_iter` solves this issue, but makes the procedure take longer.
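For example, a hedged config fragment passing such parameters through `feature_parameters.svm_params` (values are illustrative):

```json
"freqai": {
    "feature_parameters": {
        "use_SVM_to_remove_outliers": true,
        "svm_params": {
            "shuffle": false,
            "nu": 0.1
        }
    }
}
```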
@@ -325,7 +384,7 @@ You can configure FreqAI to use DBSCAN to cluster and remove outliers from the t
}
```
DBSCAN is an unsupervised machine learning algorithm that clusters data without needing to know how many clusters there should be.
This will add the `DataSieveDBSCAN` step to your `feature_pipeline`. It is an unsupervised machine learning algorithm that clusters data without needing to know how many clusters there should be.
Given a number of data points $N$, and a distance $\varepsilon$, DBSCAN clusters the data set by setting all data points that have $N-1$ other data points within a distance of $\varepsilon$ as *core points*. A data point that is within a distance of $\varepsilon$ from a *core point* but that does not have $N-1$ other data points within a distance of $\varepsilon$ from itself is considered an *edge point*. A cluster is then the collection of *core points* and *edge points*. Data points that have no other data points at a distance $<\varepsilon$ are considered outliers. The figure below shows a cluster with $N = 3$.
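An illustrative scikit-learn call mirroring that description (FreqAI's `DataSieveDBSCAN` step computes its own epsilon and handles train/predict coherence):

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.random.rand(300, 6)
# min_samples plays the role of N, eps the role of epsilon
labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)
outliers = X[labels == -1]           # points that belong to no cluster
print(f"{len(outliers)} outliers out of {len(X)} points")
```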

View File

@@ -107,6 +107,13 @@ This is for performance reasons - FreqAI relies on making quick predictions/retr
it needs to download all the training data at the beginning of a dry/live instance. FreqAI stores and appends
new candles automatically for future retrains. This means that if new pairs arrive later in the dry run due to a volume pairlist, it will not have the data ready. However, FreqAI does work with the `ShuffleFilter`, or with a `VolumePairList` that keeps the total pairlist constant (but reorders the pairs according to volume).
## Additional learning materials
Here we compile some external materials that provide deeper looks into various components of FreqAI:
- [Real-time head-to-head: Adaptive modeling of financial market data using XGBoost and CatBoost](https://emergentmethods.medium.com/real-time-head-to-head-adaptive-modeling-of-financial-market-data-using-xgboost-and-catboost-995a115a7495)
- [FreqAI - from price to prediction](https://emergentmethods.medium.com/freqai-from-price-to-prediction-6fadac18b665)
## Credits
FreqAI is developed by a group of individuals who all contribute specific skillsets to the project.

docs/lookahead-analysis.md Normal file (100 added lines)
View File

@@ -0,0 +1,100 @@
# Lookahead analysis
This page explains how to validate your strategy in terms of look ahead bias.
Look ahead bias is the bane of any strategy, since it is very easy to introduce into a backtest but very hard to detect.
Backtesting initializes all timestamps at once and calculates all indicators at the beginning.
This means that if your indicators or entry/exit signals look into future candles, they will falsify your backtest.
Lookahead-analysis requires historic data to be available.
To learn how to get data for the pairs and exchange you're interested in,
head over to the [Data Downloading](data-download.md) section of the documentation.
This command is built on top of backtesting: it internally chains backtests and pokes at the strategy to provoke it into showing look ahead bias.
It does this by looking not at the strategy itself, but at the results it returns - things like changed indicator values and moved entries/exits compared to the full backtest.
Most options of [Backtesting](backtesting.md) can be used.
It also supports lookahead-analysis of FreqAI strategies. The following settings are forced:
- `--cache` is forced to "none".
- `--max-open-trades` is forced to be at least equal to the number of pairs.
- `--dry-run-wallet` is forced to be basically infinite.
## Lookahead-analysis command reference
```
usage: freqtrade lookahead-analysis [-h] [-v] [--logfile FILE] [-V] [-c PATH]
[-d PATH] [--userdir PATH] [-s NAME]
[--strategy-path PATH]
[--recursive-strategy-search]
[--freqaimodel NAME]
[--freqaimodel-path PATH] [-i TIMEFRAME]
[--timerange TIMERANGE]
[--data-format-ohlcv {json,jsongz,hdf5,feather,parquet}]
[--max-open-trades INT]
[--stake-amount STAKE_AMOUNT]
[--fee FLOAT] [-p PAIRS [PAIRS ...]]
[--enable-protections]
[--dry-run-wallet DRY_RUN_WALLET]
[--timeframe-detail TIMEFRAME_DETAIL]
[--strategy-list STRATEGY_LIST [STRATEGY_LIST ...]]
[--export {none,trades,signals}]
[--export-filename PATH]
[--breakdown {day,week,month} [{day,week,month} ...]]
[--cache {none,day,week,month}]
[--freqai-backtest-live-models]
[--minimum-trade-amount INT]
[--targeted-trade-amount INT]
[--lookahead-analysis-exportfilename LOOKAHEAD_ANALYSIS_EXPORTFILENAME]
options:
--minimum-trade-amount INT
Minimum trade amount for lookahead-analysis
--targeted-trade-amount INT
Targeted trade amount for lookahead analysis
--lookahead-analysis-exportfilename LOOKAHEAD_ANALYSIS_EXPORTFILENAME
Use this csv-filename to store lookahead-analysis-
results
```
!!! Note ""
The above output was reduced to the options `lookahead-analysis` adds on top of the regular backtesting commands.
### Summary
Checks a given strategy for look ahead bias via lookahead-analysis
Look ahead bias means that the backtest uses data from future candles, making the strategy unviable beyond backtesting and producing false hopes for the person backtesting.
### Introduction
Many strategies have fallen prey to look ahead bias without the programmer knowing it.
Any backtest populates the full dataframe, including all time stamps, at the beginning.
If the programmer is not careful, or is oblivious to how things work internally (which can sometimes be really hard to find out), the strategy will simply look into the future, making it seem amazing but not realistic.
This command tries to verify the strategy's validity against the aforementioned look ahead bias.
### How does the command work?
It starts with a backtest of all pairs to generate a baseline for indicators and entries/exits.
After the backtest has run, it checks whether `minimum-trade-amount` is met, and cancels the lookahead-analysis for this strategy if it is not.
After setting the baseline, it does additional runs for every entry and exit signal separately.
When a verification backtest is done, it compares the indicator values at the signal candles (either entry or exit) and reports any bias.
After all signals have been verified or falsified, a result table is generated for the user to review.
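A minimal invocation, using only flags from the usage reference above (strategy name and timerange are placeholders):

```bash
freqtrade lookahead-analysis --strategy MyAwesomeStrategy --timerange 20220101-20220501 --minimum-trade-amount 10 --targeted-trade-amount 20
```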
### Caveats
- `lookahead-analysis` can only verify or falsify the trades it actually calculated.
If the strategy has many different signals or signal types, it's up to you to select appropriate parameters to ensure that all signals have triggered at least once. Signals that never triggered will not have been verified.
This could lead to a false negative (the strategy would then be reported as non-biased).
- `lookahead-analysis` has access to everything that backtesting has, too.
Please don't use provocative configurations, such as enabling position stacking.
If you decide to do so anyway, make doubly sure that you will never run out of `max_open_trades` slots, nor of leftover money in your wallet.

View File

@@ -1,6 +1,6 @@
markdown==3.3.7
mkdocs==1.4.3
mkdocs-material==9.1.15
mkdocs-material==9.1.16
mdx_truly_sane_lists==1.3
pymdown-extensions==10.0.1
jinja2==3.1.2

View File

@@ -728,3 +728,86 @@ Targets now get their own, dedicated method.
return dataframe
```
### FreqAI - New data Pipeline
If you have created your own custom `IFreqaiModel` with a custom `train()`/`predict()` function, *and* you still rely on `data_cleaning_train/predict()`, then you will need to migrate to the new pipeline. If your model does *not* rely on `data_cleaning_train/predict()`, then you do not need to worry about this migration. That means that this migration guide is relevant for a very small percentage of power-users. If you stumbled upon this guide by mistake, feel free to inquire in depth about your problem in the Freqtrade discord server.
The conversion involves first removing `data_cleaning_train/predict()` and replacing them with a `define_data_pipeline()` and `define_label_pipeline()` function to your `IFreqaiModel` class:
```python linenums="1" hl_lines="11-14 47-49 55-57"
class MyCoolFreqaiModel(BaseRegressionModel):
    """
    Some cool custom IFreqaiModel you made before Freqtrade version 2023.6
    """
    def train(
        self, unfiltered_df: DataFrame, pair: str, dk: FreqaiDataKitchen, **kwargs
    ) -> Any:

        # ... your custom stuff

        # Remove these lines
        # data_dictionary = dk.make_train_test_datasets(features_filtered, labels_filtered)
        # self.data_cleaning_train(dk)
        # data_dictionary = dk.normalize_data(data_dictionary)
        # (1)

        # Add these lines. Now we control the pipeline fit/transform ourselves
        dd = dk.make_train_test_datasets(features_filtered, labels_filtered)
        dk.feature_pipeline = self.define_data_pipeline(threads=dk.thread_count)
        dk.label_pipeline = self.define_label_pipeline(threads=dk.thread_count)

        (dd["train_features"],
         dd["train_labels"],
         dd["train_weights"]) = dk.feature_pipeline.fit_transform(dd["train_features"],
                                                                  dd["train_labels"],
                                                                  dd["train_weights"])

        (dd["test_features"],
         dd["test_labels"],
         dd["test_weights"]) = dk.feature_pipeline.transform(dd["test_features"],
                                                             dd["test_labels"],
                                                             dd["test_weights"])

        dd["train_labels"], _, _ = dk.label_pipeline.fit_transform(dd["train_labels"])
        dd["test_labels"], _, _ = dk.label_pipeline.transform(dd["test_labels"])

        # ... your custom code
        return model

    def predict(
        self, unfiltered_df: DataFrame, dk: FreqaiDataKitchen, **kwargs
    ) -> Tuple[DataFrame, npt.NDArray[np.int_]]:

        # ... your custom stuff

        # Remove these lines:
        # self.data_cleaning_predict(dk)
        # (2)

        # Add these lines:
        dk.data_dictionary["prediction_features"], outliers, _ = dk.feature_pipeline.transform(
            dk.data_dictionary["prediction_features"], outlier_check=True)

        # Remove this line
        # pred_df = dk.denormalize_labels_from_metadata(pred_df)
        # (3)

        # Replace with these lines
        pred_df, _, _ = dk.label_pipeline.inverse_transform(pred_df)

        if self.freqai_info.get("DI_threshold", 0) > 0:
            dk.DI_values = dk.feature_pipeline["di"].di_values
        else:
            dk.DI_values = np.zeros(len(outliers.index))
        dk.do_predict = outliers.to_numpy()

        # ... your custom code
        return (pred_df, dk.do_predict)
```
1. Data normalization and cleaning is now homogenized with the new pipeline definition. This is created in the new `define_data_pipeline()` and `define_label_pipeline()` functions. The `data_cleaning_train()` and `data_cleaning_predict()` functions are no longer used. You can override `define_data_pipeline()` to create your own custom pipeline if you wish.
2. Data normalization and cleaning is now homogenized with the new pipeline definition. This is created in the new `define_data_pipeline()` and `define_label_pipeline()` functions. The `data_cleaning_train()` and `data_cleaning_predict()` functions are no longer used. You can override `define_data_pipeline()` to create your own custom pipeline if you wish.
3. Data denormalization is done with the new pipeline. Replace this with the lines below.

View File

@@ -19,7 +19,8 @@ from freqtrade.commands.list_commands import (start_list_exchanges, start_list_f
start_list_markets, start_list_strategies,
start_list_timeframes, start_show_trades)
from freqtrade.commands.optimize_commands import (start_backtesting, start_backtesting_show,
start_edge, start_hyperopt)
start_edge, start_hyperopt,
start_lookahead_analysis)
from freqtrade.commands.pairlist_commands import start_test_pairlist
from freqtrade.commands.plot_commands import start_plot_dataframe, start_plot_profit
from freqtrade.commands.strategy_utils_commands import start_strategy_update

freqtrade/commands/arguments.py Normal file → Executable file (24 lines changed)
View File

@@ -117,7 +117,11 @@ NO_CONF_REQURIED = ["convert-data", "convert-trade-data", "download-data", "list
NO_CONF_ALLOWED = ["create-userdir", "list-exchanges", "new-strategy"]
ARGS_STRATEGY_UTILS = ["strategy_list", "strategy_path", "recursive_strategy_search"]
ARGS_STRATEGY_UPDATER = ["strategy_list", "strategy_path", "recursive_strategy_search"]
ARGS_LOOKAHEAD_ANALYSIS = [
a for a in ARGS_BACKTEST if a not in ("position_stacking", "use_max_market_positions", 'cache')
] + ["minimum_trade_amount", "targeted_trade_amount", "lookahead_analysis_exportfilename"]
class Arguments:
@@ -201,8 +205,9 @@ class Arguments:
start_install_ui, start_list_data, start_list_exchanges,
start_list_freqAI_models, start_list_markets,
start_list_strategies, start_list_timeframes,
start_new_config, start_new_strategy, start_plot_dataframe,
start_plot_profit, start_show_trades, start_strategy_update,
start_lookahead_analysis, start_new_config,
start_new_strategy, start_plot_dataframe, start_plot_profit,
start_show_trades, start_strategy_update,
start_test_pairlist, start_trading, start_webserver)
subparsers = self.parser.add_subparsers(dest='command',
@@ -451,4 +456,15 @@ class Arguments:
'files to the current version',
parents=[_common_parser])
strategy_updater_cmd.set_defaults(func=start_strategy_update)
self._build_args(optionlist=ARGS_STRATEGY_UTILS, parser=strategy_updater_cmd)
self._build_args(optionlist=ARGS_STRATEGY_UPDATER, parser=strategy_updater_cmd)
# Add lookahead_analysis subcommand
lookahead_analayis_cmd = subparsers.add_parser(
'lookahead-analysis',
help="Check for potential look ahead bias.",
parents=[_common_parser, _strategy_parser])
lookahead_analayis_cmd.set_defaults(func=start_lookahead_analysis)
self._build_args(optionlist=ARGS_LOOKAHEAD_ANALYSIS,
parser=lookahead_analayis_cmd)

freqtrade/commands/cli_options.py Normal file → Executable file (17 lines changed)
View File

@@ -690,4 +690,21 @@ AVAILABLE_CLI_OPTIONS = {
help='Run backtest with ready models.',
action='store_true'
),
"minimum_trade_amount": Arg(
'--minimum-trade-amount',
help='Minimum trade amount for lookahead-analysis',
type=check_int_positive,
metavar='INT',
),
"targeted_trade_amount": Arg(
'--targeted-trade-amount',
help='Targeted trade amount for lookahead analysis',
type=check_int_positive,
metavar='INT',
),
"lookahead_analysis_exportfilename": Arg(
'--lookahead-analysis-exportfilename',
help="Use this csv-filename to store lookahead-analysis-results",
type=str
),
}

View File

@@ -1,18 +1,16 @@
import logging
import sys
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Any, Dict, List
from typing import Any, Dict
from freqtrade.configuration import TimeRange, setup_utils_configuration
from freqtrade.constants import DATETIME_PRINT_FORMAT, Config
from freqtrade.data.converter import convert_ohlcv_format, convert_trades_format
from freqtrade.data.history import (convert_trades_to_ohlcv, refresh_backtest_ohlcv_data,
refresh_backtest_trades_data)
from freqtrade.data.history import convert_trades_to_ohlcv, download_data_main
from freqtrade.enums import CandleType, RunMode, TradingMode
from freqtrade.exceptions import OperationalException
from freqtrade.exchange import market_is_active, timeframe_to_minutes
from freqtrade.plugins.pairlist.pairlist_helpers import dynamic_expand_pairlist, expand_pairlist
from freqtrade.exchange import timeframe_to_minutes
from freqtrade.plugins.pairlist.pairlist_helpers import expand_pairlist
from freqtrade.resolvers import ExchangeResolver
from freqtrade.util.binance_mig import migrate_binance_futures_data
@@ -20,7 +18,7 @@ from freqtrade.util.binance_mig import migrate_binance_futures_data
logger = logging.getLogger(__name__)
def _data_download_sanity(config: Config) -> None:
def _check_data_config_download_sanity(config: Config) -> None:
if 'days' in config and 'timerange' in config:
raise OperationalException("--days and --timerange are mutually exclusive. "
"You can only specify one or the other.")
@@ -37,78 +35,14 @@ def start_download_data(args: Dict[str, Any]) -> None:
"""
config = setup_utils_configuration(args, RunMode.UTIL_EXCHANGE)
_data_download_sanity(config)
timerange = TimeRange()
if 'days' in config:
time_since = (datetime.now() - timedelta(days=config['days'])).strftime("%Y%m%d")
timerange = TimeRange.parse_timerange(f'{time_since}-')
if 'timerange' in config:
timerange = timerange.parse_timerange(config['timerange'])
# Remove stake-currency to skip checks which are not relevant for datadownload
config['stake_currency'] = ''
pairs_not_available: List[str] = []
# Init exchange
exchange = ExchangeResolver.load_exchange(config, validate=False)
markets = [p for p, m in exchange.markets.items() if market_is_active(m)
or config.get('include_inactive')]
expanded_pairs = dynamic_expand_pairlist(config, markets)
# Manual validations of relevant settings
if not config['exchange'].get('skip_pair_validation', False):
exchange.validate_pairs(expanded_pairs)
logger.info(f"About to download pairs: {expanded_pairs}, "
f"intervals: {config['timeframes']} to {config['datadir']}")
for timeframe in config['timeframes']:
exchange.validate_timeframes(timeframe)
_check_data_config_download_sanity(config)
try:
if config.get('download_trades'):
if config.get('trading_mode') == 'futures':
raise OperationalException("Trade download not supported for futures.")
pairs_not_available = refresh_backtest_trades_data(
exchange, pairs=expanded_pairs, datadir=config['datadir'],
timerange=timerange, new_pairs_days=config['new_pairs_days'],
erase=bool(config.get('erase')), data_format=config['dataformat_trades'])
# Convert downloaded trade data to different timeframes
convert_trades_to_ohlcv(
pairs=expanded_pairs, timeframes=config['timeframes'],
datadir=config['datadir'], timerange=timerange, erase=bool(config.get('erase')),
data_format_ohlcv=config['dataformat_ohlcv'],
data_format_trades=config['dataformat_trades'],
)
else:
if not exchange.get_option('ohlcv_has_history', True):
raise OperationalException(
f"Historic klines not available for {exchange.name}. "
"Please use `--dl-trades` instead for this exchange "
"(will unfortunately take a long time)."
)
migrate_binance_futures_data(config)
pairs_not_available = refresh_backtest_ohlcv_data(
exchange, pairs=expanded_pairs, timeframes=config['timeframes'],
datadir=config['datadir'], timerange=timerange,
new_pairs_days=config['new_pairs_days'],
erase=bool(config.get('erase')), data_format=config['dataformat_ohlcv'],
trading_mode=config.get('trading_mode', 'spot'),
prepend=config.get('prepend_data', False)
)
download_data_main(config)
except KeyboardInterrupt:
sys.exit("SIGINT received, aborting ...")
finally:
if pairs_not_available:
logger.info(f"Pairs [{','.join(pairs_not_available)}] not available "
f"on exchange {exchange.name}.")
def start_convert_trades(args: Dict[str, Any]) -> None:

View File

@@ -132,3 +132,15 @@ def start_edge(args: Dict[str, Any]) -> None:
# Initialize Edge object
edge_cli = EdgeCli(config)
edge_cli.start()
def start_lookahead_analysis(args: Dict[str, Any]) -> None:
    """
    Start the backtest bias tester script
    :param args: Cli args from Arguments()
    :return: None
    """
    from freqtrade.optimize.lookahead_analysis_helpers import LookaheadAnalysisSubFunctions

    config = setup_utils_configuration(args, RunMode.UTIL_NO_EXCHANGE)
    LookaheadAnalysisSubFunctions.start(config)

View File

@@ -203,7 +203,7 @@ class Configuration:
# This will override the strategy configuration
self._args_to_config(config, argname='timeframe',
logstring='Parameter -i/--timeframe detected ... '
'Using timeframe: {} ...')
'Using timeframe: {} ...')
self._args_to_config(config, argname='position_stacking',
logstring='Parameter --enable-position-stacking detected ...')
@@ -300,6 +300,9 @@ class Configuration:
self._args_to_config(config, argname='hyperoptexportfilename',
logstring='Using hyperopt file: {}')
self._args_to_config(config, argname='lookahead_analysis_exportfilename',
logstring='Saving lookahead analysis results into {} ...')
self._args_to_config(config, argname='epochs',
logstring='Parameter --epochs detected ... '
'Will run Hyperopt with for {} epochs ...'
@@ -474,6 +477,19 @@ class Configuration:
self._args_to_config(config, argname='analysis_csv_path',
logstring='Path to store analysis CSVs: {}')
self._args_to_config(config, argname='analysis_csv_path',
logstring='Path to store analysis CSVs: {}')
# Lookahead analysis results
self._args_to_config(config, argname='targeted_trade_amount',
logstring='Targeted Trade amount: {}')
self._args_to_config(config, argname='minimum_trade_amount',
logstring='Minimum Trade amount: {}')
self._args_to_config(config, argname='lookahead_analysis_exportfilename',
logstring='Path to store lookahead-analysis-results: {}')
def _process_runmode(self, config: Config) -> None:
self._args_to_config(config, argname='dry_run',
@@ -552,6 +568,7 @@ class Configuration:
# Fall back to /dl_path/pairs.json
pairs_file = config['datadir'] / 'pairs.json'
if pairs_file.exists():
logger.info(f'Reading pairs file "{pairs_file}".')
config['pairs'] = load_file(pairs_file)
if 'pairs' in config and isinstance(config['pairs'], list):
config['pairs'].sort()

View File

@@ -8,6 +8,7 @@ from typing import Any, Dict, List, Literal, Tuple
from freqtrade.enums import CandleType, PriceType, RPCMessageType
DOCS_LINK = "https://www.freqtrade.io/en/stable"
DEFAULT_CONFIG = 'config.json'
DEFAULT_EXCHANGE = 'bittrex'
PROCESS_THROTTLE_SECS = 5 # sec
@@ -163,6 +164,9 @@ CONF_SCHEMA = {
'trading_mode': {'type': 'string', 'enum': TRADING_MODES},
'margin_mode': {'type': 'string', 'enum': MARGIN_MODES},
'reduce_df_footprint': {'type': 'boolean', 'default': False},
'minimum_trade_amount': {'type': 'number', 'default': 10},
'targeted_trade_amount': {'type': 'number', 'default': 20},
'lookahead_analysis_exportfilename': {'type': 'string'},
'liquidation_buffer': {'type': 'number', 'minimum': 0.0, 'maximum': 0.99},
'backtest_breakdown': {
'type': 'array',

View File

@@ -6,7 +6,7 @@ Includes:
* download data from exchange and store to disk
"""
# flake8: noqa: F401
from .history_utils import (convert_trades_to_ohlcv, get_timerange, load_data, load_pair_history,
refresh_backtest_ohlcv_data, refresh_backtest_trades_data, refresh_data,
validate_backtest_data)
from .history_utils import (convert_trades_to_ohlcv, download_data_main, get_timerange, load_data,
load_pair_history, refresh_backtest_ohlcv_data,
refresh_backtest_trades_data, refresh_data, validate_backtest_data)
from .idatahandler import get_datahandler

View File

@@ -7,7 +7,7 @@ from typing import Dict, List, Optional, Tuple
from pandas import DataFrame, concat
from freqtrade.configuration import TimeRange
from freqtrade.constants import DATETIME_PRINT_FORMAT, DEFAULT_DATAFRAME_COLUMNS
from freqtrade.constants import DATETIME_PRINT_FORMAT, DEFAULT_DATAFRAME_COLUMNS, Config
from freqtrade.data.converter import (clean_ohlcv_dataframe, ohlcv_to_dataframe,
trades_remove_duplicates, trades_to_ohlcv)
from freqtrade.data.history.idatahandler import IDataHandler, get_datahandler
@@ -15,6 +15,8 @@ from freqtrade.enums import CandleType
from freqtrade.exceptions import OperationalException
from freqtrade.exchange import Exchange
from freqtrade.misc import format_ms_time
from freqtrade.plugins.pairlist.pairlist_helpers import dynamic_expand_pairlist
from freqtrade.util.binance_mig import migrate_binance_futures_data
logger = logging.getLogger(__name__)
@@ -294,7 +296,7 @@ def refresh_backtest_ohlcv_data(exchange: Exchange, pairs: List[str], timeframes
continue
for timeframe in timeframes:
logger.info(f'Downloading pair {pair}, interval {timeframe}.')
logger.debug(f'Downloading pair {pair}, {candle_type}, interval {timeframe}.')
process = f'{idx}/{len(pairs)}'
_download_pair_history(pair=pair, process=process,
datadir=datadir, exchange=exchange,
@@ -483,3 +485,77 @@ def validate_backtest_data(data: DataFrame, pair: str, min_date: datetime,
logger.warning("%s has missing frames: expected %s, got %s, that's %s missing values",
pair, expected_frames, dflen, expected_frames - dflen)
return found_missing
def download_data_main(config: Config) -> None:

    timerange = TimeRange()
    if 'days' in config:
        time_since = (datetime.now() - timedelta(days=config['days'])).strftime("%Y%m%d")
        timerange = TimeRange.parse_timerange(f'{time_since}-')

    if 'timerange' in config:
        timerange = timerange.parse_timerange(config['timerange'])

    # Remove stake-currency to skip checks which are not relevant for datadownload
    config['stake_currency'] = ''

    pairs_not_available: List[str] = []

    # Init exchange
    from freqtrade.resolvers.exchange_resolver import ExchangeResolver
    exchange = ExchangeResolver.load_exchange(config, validate=False)
    available_pairs = [
        p for p in exchange.get_markets(
            tradable_only=True, active_only=not config.get('include_inactive')
        ).keys()
    ]

    expanded_pairs = dynamic_expand_pairlist(config, available_pairs)

    # Manual validations of relevant settings
    if not config['exchange'].get('skip_pair_validation', False):
        exchange.validate_pairs(expanded_pairs)
    logger.info(f"About to download pairs: {expanded_pairs}, "
                f"intervals: {config['timeframes']} to {config['datadir']}")

    for timeframe in config['timeframes']:
        exchange.validate_timeframes(timeframe)

    # Start downloading
    try:
        if config.get('download_trades'):
            if config.get('trading_mode') == 'futures':
                raise OperationalException("Trade download not supported for futures.")
            pairs_not_available = refresh_backtest_trades_data(
                exchange, pairs=expanded_pairs, datadir=config['datadir'],
                timerange=timerange, new_pairs_days=config['new_pairs_days'],
                erase=bool(config.get('erase')), data_format=config['dataformat_trades'])

            # Convert downloaded trade data to different timeframes
            convert_trades_to_ohlcv(
                pairs=expanded_pairs, timeframes=config['timeframes'],
                datadir=config['datadir'], timerange=timerange, erase=bool(config.get('erase')),
                data_format_ohlcv=config['dataformat_ohlcv'],
                data_format_trades=config['dataformat_trades'],
            )
        else:
            if not exchange.get_option('ohlcv_has_history', True):
                raise OperationalException(
                    f"Historic klines not available for {exchange.name}. "
                    "Please use `--dl-trades` instead for this exchange "
                    "(will unfortunately take a long time)."
                )
            migrate_binance_futures_data(config)
            pairs_not_available = refresh_backtest_ohlcv_data(
                exchange, pairs=expanded_pairs, timeframes=config['timeframes'],
                datadir=config['datadir'], timerange=timerange,
                new_pairs_days=config['new_pairs_days'],
                erase=bool(config.get('erase')), data_format=config['dataformat_ohlcv'],
                trading_mode=config.get('trading_mode', 'spot'),
                prepend=config.get('prepend_data', False)
            )
    finally:
        if pairs_not_available:
            logger.info(f"Pairs [{','.join(pairs_not_available)}] not available "
                        f"on exchange {exchange.name}.")

View File

@@ -301,7 +301,7 @@ class Exchange:
return list((self._api.timeframes or {}).keys())
@property
def markets(self) -> Dict:
def markets(self) -> Dict[str, Any]:
"""exchange ccxt markets"""
if not self._markets:
logger.info("Markets were not loaded. Loading them now..")

View File

@@ -199,6 +199,7 @@ class Okx(Exchange):
order_reg['type'] = 'stoploss'
order_reg['status_stop'] = 'triggered'
return order_reg
order = self._order_contracts_to_amount(order)
order['type'] = 'stoploss'
return order

View File

@@ -2,7 +2,7 @@ import logging
import random
from abc import abstractmethod
from enum import Enum
from typing import Optional, Type, Union
from typing import List, Optional, Type, Union
import gymnasium as gym
import numpy as np
@@ -141,6 +141,9 @@ class BaseEnvironment(gym.Env):
Unique to the environment action count. Must be inherited.
"""
def action_masks(self) -> List[bool]:
return [self._is_valid(action.value) for action in self.actions]
def seed(self, seed: int = 1):
self.np_random, seed = seeding.np_random(seed)
return [seed]

View File

@@ -13,7 +13,8 @@ import pandas as pd
import torch as th
import torch.multiprocessing
from pandas import DataFrame
from stable_baselines3.common.callbacks import EvalCallback
from sb3_contrib.common.maskable.callbacks import MaskableEvalCallback
from sb3_contrib.common.maskable.utils import is_masking_supported
from stable_baselines3.common.monitor import Monitor
from stable_baselines3.common.utils import set_random_seed
from stable_baselines3.common.vec_env import SubprocVecEnv, VecMonitor
@@ -48,7 +49,7 @@ class BaseReinforcementLearningModel(IFreqaiModel):
self.reward_params = self.freqai_info['rl_config']['model_reward_parameters']
self.train_env: Union[VecMonitor, SubprocVecEnv, gym.Env] = gym.Env()
self.eval_env: Union[VecMonitor, SubprocVecEnv, gym.Env] = gym.Env()
self.eval_callback: Optional[EvalCallback] = None
self.eval_callback: Optional[MaskableEvalCallback] = None
self.model_type = self.freqai_info['rl_config']['model_type']
self.rl_config = self.freqai_info['rl_config']
self.df_raw: DataFrame = DataFrame()
@@ -82,6 +83,9 @@ class BaseReinforcementLearningModel(IFreqaiModel):
if self.ft_params.get('use_DBSCAN_to_remove_outliers', False):
self.ft_params.update({'use_DBSCAN_to_remove_outliers': False})
logger.warning('User tried to use DBSCAN with RL. Deactivating DBSCAN.')
if self.ft_params.get('DI_threshold', False):
self.ft_params.update({'DI_threshold': False})
logger.warning('User tried to use DI_threshold with RL. Deactivating DI_threshold.')
if self.freqai_info['data_split_parameters'].get('shuffle', False):
self.freqai_info['data_split_parameters'].update({'shuffle': False})
logger.warning('User tried to shuffle training data. Setting shuffle to False')
@@ -107,27 +111,37 @@ class BaseReinforcementLearningModel(IFreqaiModel):
training_filter=True,
)
data_dictionary: Dict[str, Any] = dk.make_train_test_datasets(
dd: Dict[str, Any] = dk.make_train_test_datasets(
features_filtered, labels_filtered)
self.df_raw = copy.deepcopy(data_dictionary["train_features"])
self.df_raw = copy.deepcopy(dd["train_features"])
dk.fit_labels() # FIXME useless for now, but just satiating append methods
# normalize all data based on train_dataset only
prices_train, prices_test = self.build_ohlc_price_dataframes(dk.data_dictionary, pair, dk)
data_dictionary = dk.normalize_data(data_dictionary)
dk.feature_pipeline = self.define_data_pipeline(threads=dk.thread_count)
# data cleaning/analysis
self.data_cleaning_train(dk)
(dd["train_features"],
dd["train_labels"],
dd["train_weights"]) = dk.feature_pipeline.fit_transform(dd["train_features"],
dd["train_labels"],
dd["train_weights"])
if self.freqai_info.get('data_split_parameters', {}).get('test_size', 0.1) != 0:
(dd["test_features"],
dd["test_labels"],
dd["test_weights"]) = dk.feature_pipeline.transform(dd["test_features"],
dd["test_labels"],
dd["test_weights"])
logger.info(
f'Training model on {len(dk.data_dictionary["train_features"].columns)}'
f' features and {len(data_dictionary["train_features"])} data points'
f' features and {len(dd["train_features"])} data points'
)
self.set_train_and_eval_environments(data_dictionary, prices_train, prices_test, dk)
self.set_train_and_eval_environments(dd, prices_train, prices_test, dk)
model = self.fit(data_dictionary, dk)
model = self.fit(dd, dk)
logger.info(f"--------------------done training {pair}--------------------")
@@ -151,9 +165,11 @@ class BaseReinforcementLearningModel(IFreqaiModel):
self.train_env = self.MyRLEnv(df=train_df, prices=prices_train, **env_info)
self.eval_env = Monitor(self.MyRLEnv(df=test_df, prices=prices_test, **env_info))
self.eval_callback = EvalCallback(self.eval_env, deterministic=True,
render=False, eval_freq=len(train_df),
best_model_save_path=str(dk.data_path))
self.eval_callback = MaskableEvalCallback(self.eval_env, deterministic=True,
render=False, eval_freq=len(train_df),
best_model_save_path=str(dk.data_path),
use_masking=(self.model_type == 'MaskablePPO' and
is_masking_supported(self.eval_env)))
actions = self.train_env.get_actions()
self.tensorboard_callback = TensorboardCallback(verbose=1, actions=actions)
@@ -236,13 +252,10 @@ class BaseReinforcementLearningModel(IFreqaiModel):
unfiltered_df, dk.training_features_list, training_filter=False
)
filtered_dataframe = self.drop_ohlc_from_df(filtered_dataframe, dk)
dk.data_dictionary["prediction_features"] = self.drop_ohlc_from_df(filtered_dataframe, dk)
filtered_dataframe = dk.normalize_data_from_metadata(filtered_dataframe)
dk.data_dictionary["prediction_features"] = filtered_dataframe
# optional additional data cleaning/analysis
self.data_cleaning_predict(dk)
dk.data_dictionary["prediction_features"], _, _ = dk.feature_pipeline.transform(
dk.data_dictionary["prediction_features"], outlier_check=True)
pred_df = self.rl_model_predict(
dk.data_dictionary["prediction_features"], dk, self.model)

View File

@@ -17,8 +17,8 @@ logger = logging.getLogger(__name__)
class BaseClassifierModel(IFreqaiModel):
"""
Base class for regression type models (e.g. Catboost, LightGBM, XGboost etc.).
User *must* inherit from this class and set fit() and predict(). See example scripts
such as prediction_models/CatboostPredictionModel.py for guidance.
User *must* inherit from this class and set fit(). See example scripts
such as prediction_models/CatboostClassifier.py for guidance.
"""
def train(
@@ -50,21 +50,30 @@ class BaseClassifierModel(IFreqaiModel):
logger.info(f"-------------------- Training on data from {start_date} to "
f"{end_date} --------------------")
# split data into train/test data.
data_dictionary = dk.make_train_test_datasets(features_filtered, labels_filtered)
dd = dk.make_train_test_datasets(features_filtered, labels_filtered)
if not self.freqai_info.get("fit_live_predictions_candles", 0) or not self.live:
dk.fit_labels()
# normalize all data based on train_dataset only
data_dictionary = dk.normalize_data(data_dictionary)
dk.feature_pipeline = self.define_data_pipeline(threads=dk.thread_count)
# optional additional data cleaning/analysis
self.data_cleaning_train(dk)
(dd["train_features"],
dd["train_labels"],
dd["train_weights"]) = dk.feature_pipeline.fit_transform(dd["train_features"],
dd["train_labels"],
dd["train_weights"])
if self.freqai_info.get('data_split_parameters', {}).get('test_size', 0.1) != 0:
(dd["test_features"],
dd["test_labels"],
dd["test_weights"]) = dk.feature_pipeline.transform(dd["test_features"],
dd["test_labels"],
dd["test_weights"])
logger.info(
f"Training model on {len(dk.data_dictionary['train_features'].columns)} features"
)
logger.info(f"Training model on {len(data_dictionary['train_features'])} data points")
logger.info(f"Training model on {len(dd['train_features'])} data points")
model = self.fit(data_dictionary, dk)
model = self.fit(dd, dk)
end_time = time()
@@ -89,10 +98,11 @@ class BaseClassifierModel(IFreqaiModel):
filtered_df, _ = dk.filter_features(
unfiltered_df, dk.training_features_list, training_filter=False
)
filtered_df = dk.normalize_data_from_metadata(filtered_df)
dk.data_dictionary["prediction_features"] = filtered_df
self.data_cleaning_predict(dk)
dk.data_dictionary["prediction_features"], outliers, _ = dk.feature_pipeline.transform(
dk.data_dictionary["prediction_features"], outlier_check=True)
predictions = self.model.predict(dk.data_dictionary["prediction_features"])
if self.CONV_WIDTH == 1:
@@ -107,4 +117,10 @@ class BaseClassifierModel(IFreqaiModel):
pred_df = pd.concat([pred_df, pred_df_prob], axis=1)
if dk.feature_pipeline["di"]:
dk.DI_values = dk.feature_pipeline["di"].di_values
else:
dk.DI_values = np.zeros(len(outliers.index))
dk.do_predict = outliers.to_numpy()
return (pred_df, dk.do_predict)

View File

@@ -1,5 +1,6 @@
import logging
from typing import Dict, List, Tuple
from time import time
from typing import Any, Dict, List, Tuple
import numpy as np
import numpy.typing as npt
@@ -35,6 +36,7 @@ class BasePyTorchClassifier(BasePyTorchModel):
return dataframe
"""
def __init__(self, **kwargs):
super().__init__(**kwargs)
self.class_name_to_index = None
@@ -68,9 +70,12 @@ class BasePyTorchClassifier(BasePyTorchModel):
filtered_df, _ = dk.filter_features(
unfiltered_df, dk.training_features_list, training_filter=False
)
filtered_df = dk.normalize_data_from_metadata(filtered_df)
dk.data_dictionary["prediction_features"] = filtered_df
self.data_cleaning_predict(dk)
dk.data_dictionary["prediction_features"], outliers, _ = dk.feature_pipeline.transform(
dk.data_dictionary["prediction_features"], outlier_check=True)
x = self.data_convertor.convert_x(
dk.data_dictionary["prediction_features"],
device=self.device
@@ -85,6 +90,13 @@ class BasePyTorchClassifier(BasePyTorchModel):
pred_df_prob = DataFrame(probs.detach().tolist(), columns=class_names)
pred_df = DataFrame(predicted_classes_str, columns=[dk.label_list[0]])
pred_df = pd.concat([pred_df, pred_df_prob], axis=1)
if dk.feature_pipeline["di"]:
dk.DI_values = dk.feature_pipeline["di"].di_values
else:
dk.DI_values = np.zeros(len(outliers.index))
dk.do_predict = outliers.to_numpy()
return (pred_df, dk.do_predict)
def encode_class_names(
@@ -149,3 +161,58 @@ class BasePyTorchClassifier(BasePyTorchModel):
)
return self.class_names
def train(
self, unfiltered_df: DataFrame, pair: str, dk: FreqaiDataKitchen, **kwargs
) -> Any:
"""
Filter the training data and train a model on it. Train makes heavy use of the datakitchen
for storing, saving, loading, and analyzing the data.
:param unfiltered_df: Full dataframe for the current training period
:return:
:model: Trained model which can be used for inference (self.predict)
"""
logger.info(f"-------------------- Starting training {pair} --------------------")
start_time = time()
features_filtered, labels_filtered = dk.filter_features(
unfiltered_df,
dk.training_features_list,
dk.label_list,
training_filter=True,
)
# split data into train/test data.
dd = dk.make_train_test_datasets(features_filtered, labels_filtered)
if not self.freqai_info.get("fit_live_predictions_candles", 0) or not self.live:
dk.fit_labels()
dk.feature_pipeline = self.define_data_pipeline(threads=dk.thread_count)
(dd["train_features"],
dd["train_labels"],
dd["train_weights"]) = dk.feature_pipeline.fit_transform(dd["train_features"],
dd["train_labels"],
dd["train_weights"])
if self.freqai_info.get('data_split_parameters', {}).get('test_size', 0.1) != 0:
(dd["test_features"],
dd["test_labels"],
dd["test_weights"]) = dk.feature_pipeline.transform(dd["test_features"],
dd["test_labels"],
dd["test_weights"])
logger.info(
f"Training model on {len(dk.data_dictionary['train_features'].columns)} features"
)
logger.info(f"Training model on {len(dd['train_features'])} data points")
model = self.fit(dd, dk)
end_time = time()
logger.info(f"-------------------- Done training {pair} "
f"({end_time - start_time:.2f} secs) --------------------")
return model

View File

@@ -1,12 +1,8 @@
import logging
from abc import ABC, abstractmethod
from time import time
from typing import Any
import torch
from pandas import DataFrame
from freqtrade.freqai.data_kitchen import FreqaiDataKitchen
from freqtrade.freqai.freqai_interface import IFreqaiModel
from freqtrade.freqai.torch.PyTorchDataConvertor import PyTorchDataConvertor
@@ -29,51 +25,6 @@ class BasePyTorchModel(IFreqaiModel, ABC):
self.splits = ["train", "test"] if test_size != 0 else ["train"]
self.window_size = self.freqai_info.get("conv_width", 1)
def train(
self, unfiltered_df: DataFrame, pair: str, dk: FreqaiDataKitchen, **kwargs
) -> Any:
"""
Filter the training data and train a model on it. Train makes heavy use of the datakitchen
for storing, saving, loading, and analyzing the data.
:param unfiltered_df: Full dataframe for the current training period
:return:
:model: Trained model which can be used for inference (self.predict)
"""
logger.info(f"-------------------- Starting training {pair} --------------------")
start_time = time()
features_filtered, labels_filtered = dk.filter_features(
unfiltered_df,
dk.training_features_list,
dk.label_list,
training_filter=True,
)
# split data into train/test data.
data_dictionary = dk.make_train_test_datasets(features_filtered, labels_filtered)
if not self.freqai_info.get("fit_live_predictions", 0) or not self.live:
dk.fit_labels()
# normalize all data based on train_dataset only
data_dictionary = dk.normalize_data(data_dictionary)
# optional additional data cleaning/analysis
self.data_cleaning_train(dk)
logger.info(
f"Training model on {len(dk.data_dictionary['train_features'].columns)} features"
)
logger.info(f"Training model on {len(data_dictionary['train_features'])} data points")
model = self.fit(data_dictionary, dk)
end_time = time()
logger.info(f"-------------------- Done training {pair} "
f"({end_time - start_time:.2f} secs) --------------------")
return model
@property
@abstractmethod
def data_convertor(self) -> PyTorchDataConvertor:

View File

@@ -1,5 +1,6 @@
import logging
from typing import Tuple
from time import time
from typing import Any, Tuple
import numpy as np
import numpy.typing as npt
@@ -17,6 +18,7 @@ class BasePyTorchRegressor(BasePyTorchModel):
A PyTorch implementation of a regressor.
User must implement the fit method.
"""
def __init__(self, **kwargs):
super().__init__(**kwargs)
@@ -36,10 +38,11 @@ class BasePyTorchRegressor(BasePyTorchModel):
filtered_df, _ = dk.filter_features(
unfiltered_df, dk.training_features_list, training_filter=False
)
filtered_df = dk.normalize_data_from_metadata(filtered_df)
dk.data_dictionary["prediction_features"] = filtered_df
self.data_cleaning_predict(dk)
dk.data_dictionary["prediction_features"], outliers, _ = dk.feature_pipeline.transform(
dk.data_dictionary["prediction_features"], outlier_check=True)
x = self.data_convertor.convert_x(
dk.data_dictionary["prediction_features"],
device=self.device
@@ -47,5 +50,71 @@ class BasePyTorchRegressor(BasePyTorchModel):
self.model.model.eval()
y = self.model.model(x)
pred_df = DataFrame(y.detach().tolist(), columns=[dk.label_list[0]])
pred_df = dk.denormalize_labels_from_metadata(pred_df)
pred_df, _, _ = dk.label_pipeline.inverse_transform(pred_df)
if dk.feature_pipeline["di"]:
dk.DI_values = dk.feature_pipeline["di"].di_values
else:
dk.DI_values = np.zeros(len(outliers.index))
dk.do_predict = outliers.to_numpy()
return (pred_df, dk.do_predict)
def train(
self, unfiltered_df: DataFrame, pair: str, dk: FreqaiDataKitchen, **kwargs
) -> Any:
"""
Filter the training data and train a model on it. Train makes heavy use of the datakitchen
for storing, saving, loading, and analyzing the data.
:param unfiltered_df: Full dataframe for the current training period
:return:
:model: Trained model which can be used for inference (self.predict)
"""
logger.info(f"-------------------- Starting training {pair} --------------------")
start_time = time()
features_filtered, labels_filtered = dk.filter_features(
unfiltered_df,
dk.training_features_list,
dk.label_list,
training_filter=True,
)
# split data into train/test data.
dd = dk.make_train_test_datasets(features_filtered, labels_filtered)
if not self.freqai_info.get("fit_live_predictions_candles", 0) or not self.live:
dk.fit_labels()
dk.feature_pipeline = self.define_data_pipeline(threads=dk.thread_count)
dk.label_pipeline = self.define_label_pipeline(threads=dk.thread_count)
dd["train_labels"], _, _ = dk.label_pipeline.fit_transform(dd["train_labels"])
dd["test_labels"], _, _ = dk.label_pipeline.transform(dd["test_labels"])
(dd["train_features"],
dd["train_labels"],
dd["train_weights"]) = dk.feature_pipeline.fit_transform(dd["train_features"],
dd["train_labels"],
dd["train_weights"])
dd["train_labels"], _, _ = dk.label_pipeline.fit_transform(dd["train_labels"])
if self.freqai_info.get('data_split_parameters', {}).get('test_size', 0.1) != 0:
(dd["test_features"],
dd["test_labels"],
dd["test_weights"]) = dk.feature_pipeline.transform(dd["test_features"],
dd["test_labels"],
dd["test_weights"])
dd["test_labels"], _, _ = dk.label_pipeline.transform(dd["test_labels"])
logger.info(
f"Training model on {len(dk.data_dictionary['train_features'].columns)} features"
)
logger.info(f"Training model on {len(dd['train_features'])} data points")
model = self.fit(dd, dk)
end_time = time()
logger.info(f"-------------------- Done training {pair} "
f"({end_time - start_time:.2f} secs) --------------------")
return model
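
Regressors additionally run a label pipeline: labels are scaled before fitting, and the model's raw outputs are mapped back afterwards with `inverse_transform`. A self-contained round trip using the same MinMaxScaler-backed pipeline that `define_label_pipeline()` builds:

```python
import pandas as pd
from datasieve.pipeline import Pipeline
from datasieve.transforms import SKLearnWrapper
from sklearn.preprocessing import MinMaxScaler

label_pipeline = Pipeline([
    ("scaler", SKLearnWrapper(MinMaxScaler(feature_range=(-1, 1))))
])

train_labels = pd.DataFrame({"&-target": [1.0, 2.0, 3.0, 4.0]})
scaled, _, _ = label_pipeline.fit_transform(train_labels)    # values now in [-1, 1]
restored, _, _ = label_pipeline.inverse_transform(scaled)    # back to 1.0 .. 4.0
```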

View File

@@ -16,8 +16,8 @@ logger = logging.getLogger(__name__)
class BaseRegressionModel(IFreqaiModel):
"""
Base class for regression type models (e.g. Catboost, LightGBM, XGBoost etc.).
User *must* inherit from this class and set fit() and predict(). See example scripts
such as prediction_models/CatboostPredictionModel.py for guidance.
User *must* inherit from this class and set fit(). See example scripts
such as prediction_models/CatboostRegressor.py for guidance.
"""
def train(
@@ -49,21 +49,33 @@ class BaseRegressionModel(IFreqaiModel):
logger.info(f"-------------------- Training on data from {start_date} to "
f"{end_date} --------------------")
# split data into train/test data.
data_dictionary = dk.make_train_test_datasets(features_filtered, labels_filtered)
dd = dk.make_train_test_datasets(features_filtered, labels_filtered)
if not self.freqai_info.get("fit_live_predictions_candles", 0) or not self.live:
dk.fit_labels()
# normalize all data based on train_dataset only
data_dictionary = dk.normalize_data(data_dictionary)
dk.feature_pipeline = self.define_data_pipeline(threads=dk.thread_count)
dk.label_pipeline = self.define_label_pipeline(threads=dk.thread_count)
# optional additional data cleaning/analysis
self.data_cleaning_train(dk)
(dd["train_features"],
dd["train_labels"],
dd["train_weights"]) = dk.feature_pipeline.fit_transform(dd["train_features"],
dd["train_labels"],
dd["train_weights"])
dd["train_labels"], _, _ = dk.label_pipeline.fit_transform(dd["train_labels"])
if self.freqai_info.get('data_split_parameters', {}).get('test_size', 0.1) != 0:
(dd["test_features"],
dd["test_labels"],
dd["test_weights"]) = dk.feature_pipeline.transform(dd["test_features"],
dd["test_labels"],
dd["test_weights"])
dd["test_labels"], _, _ = dk.label_pipeline.transform(dd["test_labels"])
logger.info(
f"Training model on {len(dk.data_dictionary['train_features'].columns)} features"
)
logger.info(f"Training model on {len(data_dictionary['train_features'])} data points")
logger.info(f"Training model on {len(dd['train_features'])} data points")
model = self.fit(data_dictionary, dk)
model = self.fit(dd, dk)
end_time = time()
@@ -85,14 +97,12 @@ class BaseRegressionModel(IFreqaiModel):
"""
dk.find_features(unfiltered_df)
filtered_df, _ = dk.filter_features(
dk.data_dictionary["prediction_features"], _ = dk.filter_features(
unfiltered_df, dk.training_features_list, training_filter=False
)
filtered_df = dk.normalize_data_from_metadata(filtered_df)
dk.data_dictionary["prediction_features"] = filtered_df
# optional additional data cleaning/analysis
self.data_cleaning_predict(dk)
dk.data_dictionary["prediction_features"], outliers, _ = dk.feature_pipeline.transform(
dk.data_dictionary["prediction_features"], outlier_check=True)
predictions = self.model.predict(dk.data_dictionary["prediction_features"])
if self.CONV_WIDTH == 1:
@@ -100,6 +110,11 @@ class BaseRegressionModel(IFreqaiModel):
pred_df = DataFrame(predictions, columns=dk.label_list)
pred_df = dk.denormalize_labels_from_metadata(pred_df)
pred_df, _, _ = dk.label_pipeline.inverse_transform(pred_df)
if dk.feature_pipeline["di"]:
dk.DI_values = dk.feature_pipeline["di"].di_values
else:
dk.DI_values = np.zeros(len(outliers.index))
dk.do_predict = outliers.to_numpy()
return (pred_df, dk.do_predict)

View File

@@ -1,70 +0,0 @@
import logging
from time import time
from typing import Any
from pandas import DataFrame
from freqtrade.freqai.data_kitchen import FreqaiDataKitchen
from freqtrade.freqai.freqai_interface import IFreqaiModel
logger = logging.getLogger(__name__)
class BaseTensorFlowModel(IFreqaiModel):
"""
Base class for TensorFlow type models.
User *must* inherit from this class and set fit() and predict().
"""
def train(
self, unfiltered_df: DataFrame, pair: str, dk: FreqaiDataKitchen, **kwargs
) -> Any:
"""
Filter the training data and train a model on it. Train makes heavy use of the datakitchen
for storing, saving, loading, and analyzing the data.
:param unfiltered_df: Full dataframe for the current training period
:param metadata: pair metadata from strategy.
:return:
:model: Trained model which can be used for inference (self.predict)
"""
logger.info(f"-------------------- Starting training {pair} --------------------")
start_time = time()
# filter the features requested by user in the configuration file and elegantly handle NaNs
features_filtered, labels_filtered = dk.filter_features(
unfiltered_df,
dk.training_features_list,
dk.label_list,
training_filter=True,
)
start_date = unfiltered_df["date"].iloc[0].strftime("%Y-%m-%d")
end_date = unfiltered_df["date"].iloc[-1].strftime("%Y-%m-%d")
logger.info(f"-------------------- Training on data from {start_date} to "
f"{end_date} --------------------")
# split data into train/test data.
data_dictionary = dk.make_train_test_datasets(features_filtered, labels_filtered)
if not self.freqai_info.get("fit_live_predictions_candles", 0) or not self.live:
dk.fit_labels()
# normalize all data based on train_dataset only
data_dictionary = dk.normalize_data(data_dictionary)
# optional additional data cleaning/analysis
self.data_cleaning_train(dk)
logger.info(
f"Training model on {len(dk.data_dictionary['train_features'].columns)} features"
)
logger.info(f"Training model on {len(data_dictionary['train_features'])} data points")
model = self.fit(data_dictionary, dk)
end_time = time()
logger.info(f"-------------------- Done training {pair} "
f"({end_time - start_time:.2f} secs) --------------------")
return model

View File

@@ -28,6 +28,11 @@ from freqtrade.strategy.interface import IStrategy
logger = logging.getLogger(__name__)
FEATURE_PIPELINE = "feature_pipeline"
LABEL_PIPELINE = "label_pipeline"
TRAINDF = "trained_df"
METADATA = "metadata"
class pair_info(TypedDict):
model_filename: str
@@ -425,7 +430,7 @@ class FreqaiDataDrawer:
dk.data["training_features_list"] = list(dk.data_dictionary["train_features"].columns)
dk.data["label_list"] = dk.label_list
with (save_path / f"{dk.model_filename}_metadata.json").open("w") as fp:
with (save_path / f"{dk.model_filename}_{METADATA}.json").open("w") as fp:
rapidjson.dump(dk.data, fp, default=self.np_encoder, number_mode=rapidjson.NM_NATIVE)
return
@@ -450,39 +455,39 @@ class FreqaiDataDrawer:
elif self.model_type in ["stable_baselines3", "sb3_contrib", "pytorch"]:
model.save(save_path / f"{dk.model_filename}_model.zip")
if dk.svm_model is not None:
dump(dk.svm_model, save_path / f"{dk.model_filename}_svm_model.joblib")
dk.data["data_path"] = str(dk.data_path)
dk.data["model_filename"] = str(dk.model_filename)
dk.data["training_features_list"] = dk.training_features_list
dk.data["label_list"] = dk.label_list
# store the metadata
with (save_path / f"{dk.model_filename}_metadata.json").open("w") as fp:
with (save_path / f"{dk.model_filename}_{METADATA}.json").open("w") as fp:
rapidjson.dump(dk.data, fp, default=self.np_encoder, number_mode=rapidjson.NM_NATIVE)
# save the train data to file so we can check preds for area of applicability later
# save the pipelines to pickle files
with (save_path / f"{dk.model_filename}_{FEATURE_PIPELINE}.pkl").open("wb") as fp:
cloudpickle.dump(dk.feature_pipeline, fp)
with (save_path / f"{dk.model_filename}_{LABEL_PIPELINE}.pkl").open("wb") as fp:
cloudpickle.dump(dk.label_pipeline, fp)
# save the train data to file for post processing if desired
dk.data_dictionary["train_features"].to_pickle(
save_path / f"{dk.model_filename}_trained_df.pkl"
save_path / f"{dk.model_filename}_{TRAINDF}.pkl"
)
dk.data_dictionary["train_dates"].to_pickle(
save_path / f"{dk.model_filename}_trained_dates_df.pkl"
)
if self.freqai_info["feature_parameters"].get("principal_component_analysis"):
cloudpickle.dump(
dk.pca, (dk.data_path / f"{dk.model_filename}_pca_object.pkl").open("wb")
)
self.model_dictionary[coin] = model
self.pair_dict[coin]["model_filename"] = dk.model_filename
self.pair_dict[coin]["data_path"] = str(dk.data_path)
if coin not in self.meta_data_dictionary:
self.meta_data_dictionary[coin] = {}
self.meta_data_dictionary[coin]["train_df"] = dk.data_dictionary["train_features"]
self.meta_data_dictionary[coin]["meta_data"] = dk.data
self.meta_data_dictionary[coin][METADATA] = dk.data
self.meta_data_dictionary[coin][FEATURE_PIPELINE] = dk.feature_pipeline
self.meta_data_dictionary[coin][LABEL_PIPELINE] = dk.label_pipeline
self.save_drawer_to_disk()
return
@@ -492,7 +497,7 @@ class FreqaiDataDrawer:
Load only metadata into datakitchen to increase performance during
presaved backtesting (prediction file loading).
"""
with (dk.data_path / f"{dk.model_filename}_metadata.json").open("r") as fp:
with (dk.data_path / f"{dk.model_filename}_{METADATA}.json").open("r") as fp:
dk.data = rapidjson.load(fp, number_mode=rapidjson.NM_NATIVE)
dk.training_features_list = dk.data["training_features_list"]
dk.label_list = dk.data["label_list"]
@@ -512,15 +517,17 @@ class FreqaiDataDrawer:
dk.data_path = Path(self.pair_dict[coin]["data_path"])
if coin in self.meta_data_dictionary:
dk.data = self.meta_data_dictionary[coin]["meta_data"]
dk.data_dictionary["train_features"] = self.meta_data_dictionary[coin]["train_df"]
dk.data = self.meta_data_dictionary[coin][METADATA]
dk.feature_pipeline = self.meta_data_dictionary[coin][FEATURE_PIPELINE]
dk.label_pipeline = self.meta_data_dictionary[coin][LABEL_PIPELINE]
else:
with (dk.data_path / f"{dk.model_filename}_metadata.json").open("r") as fp:
with (dk.data_path / f"{dk.model_filename}_{METADATA}.json").open("r") as fp:
dk.data = rapidjson.load(fp, number_mode=rapidjson.NM_NATIVE)
dk.data_dictionary["train_features"] = pd.read_pickle(
dk.data_path / f"{dk.model_filename}_trained_df.pkl"
)
with (dk.data_path / f"{dk.model_filename}_{FEATURE_PIPELINE}.pkl").open("rb") as fp:
dk.feature_pipeline = cloudpickle.load(fp)
with (dk.data_path / f"{dk.model_filename}_{LABEL_PIPELINE}.pkl").open("rb") as fp:
dk.label_pipeline = cloudpickle.load(fp)
dk.training_features_list = dk.data["training_features_list"]
dk.label_list = dk.data["label_list"]
@@ -530,9 +537,6 @@ class FreqaiDataDrawer:
model = self.model_dictionary[coin]
elif self.model_type == 'joblib':
model = load(dk.data_path / f"{dk.model_filename}_model.joblib")
elif self.model_type == 'keras':
from tensorflow import keras
model = keras.models.load_model(dk.data_path / f"{dk.model_filename}_model.h5")
elif 'stable_baselines' in self.model_type or 'sb3_contrib' == self.model_type:
mod = importlib.import_module(
self.model_type, self.freqai_info['rl_config']['model_type'])
@@ -544,9 +548,6 @@ class FreqaiDataDrawer:
model = zip["pytrainer"]
model = model.load_from_checkpoint(zip)
if Path(dk.data_path / f"{dk.model_filename}_svm_model.joblib").is_file():
dk.svm_model = load(dk.data_path / f"{dk.model_filename}_svm_model.joblib")
if not model:
raise OperationalException(
f"Unable to load model, ensure model exists at " f"{dk.data_path} "
@@ -556,11 +557,6 @@ class FreqaiDataDrawer:
if coin not in self.model_dictionary:
self.model_dictionary[coin] = model
if self.config["freqai"]["feature_parameters"]["principal_component_analysis"]:
dk.pca = cloudpickle.load(
(dk.data_path / f"{dk.model_filename}_pca_object.pkl").open("rb")
)
return model
def update_historic_data(self, strategy: IStrategy, dk: FreqaiDataKitchen) -> None:
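
Because the fitted pipelines replace the old per-file svm/pca artifacts, the drawer now persists them whole with cloudpickle, which (unlike plain pickle) also handles user-defined pipeline steps. The save/load pair in isolation, with an illustrative path:

```python
import cloudpickle
from pathlib import Path
from datasieve.pipeline import Pipeline
from datasieve.transforms import SKLearnWrapper
from sklearn.preprocessing import MinMaxScaler

feature_pipeline = Pipeline([
    ("scaler", SKLearnWrapper(MinMaxScaler(feature_range=(-1, 1))))
])

save_path = Path("user_data/models/example")  # illustrative path
save_path.mkdir(parents=True, exist_ok=True)

# save (mirrors FreqaiDataDrawer.save_data)
with (save_path / "example_feature_pipeline.pkl").open("wb") as fp:
    cloudpickle.dump(feature_pipeline, fp)

# load (mirrors FreqaiDataDrawer.load_data)
with (save_path / "example_feature_pipeline.pkl").open("rb") as fp:
    feature_pipeline = cloudpickle.load(fp)
```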

View File

@@ -4,7 +4,6 @@ import logging
import random
import shutil
from datetime import datetime, timezone
from math import cos, sin
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
@@ -12,16 +11,12 @@ import numpy as np
import numpy.typing as npt
import pandas as pd
import psutil
from datasieve.pipeline import Pipeline
from pandas import DataFrame
from scipy import stats
from sklearn import linear_model
from sklearn.cluster import DBSCAN
from sklearn.metrics.pairwise import pairwise_distances
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from freqtrade.configuration import TimeRange
from freqtrade.constants import Config
from freqtrade.constants import DOCS_LINK, Config
from freqtrade.data.converter import reduce_dataframe_footprint
from freqtrade.exceptions import OperationalException
from freqtrade.exchange import timeframe_to_seconds
@@ -81,11 +76,12 @@ class FreqaiDataKitchen:
self.backtest_predictions_folder: str = "backtesting_predictions"
self.live = live
self.pair = pair
self.svm_model: linear_model.SGDOneClassSVM = None
self.keras: bool = self.freqai_config.get("keras", False)
self.set_all_pairs()
self.backtest_live_models = config.get("freqai_backtest_live_models", False)
self.feature_pipeline = Pipeline()
self.label_pipeline = Pipeline()
self.DI_values: npt.NDArray = np.array([])
if not self.live:
self.full_path = self.get_full_models_path(self.config)
@@ -227,13 +223,7 @@ class FreqaiDataKitchen:
drop_index = pd.isnull(filtered_df).any(axis=1) # get the rows that have NaNs,
drop_index = drop_index.replace(True, 1).replace(False, 0) # pep8 requirement.
if (training_filter):
const_cols = list((filtered_df.nunique() == 1).loc[lambda x: x].index)
if const_cols:
filtered_df = filtered_df.filter(filtered_df.columns.difference(const_cols))
self.data['constant_features_list'] = const_cols
logger.warning(f"Removed features {const_cols} with constant values.")
else:
self.data['constant_features_list'] = []
# we don't care about total row number (total no. datapoints) in training, we only care
# about removing any row with NaNs
# if labels have multiple columns (the user wants to train multiple models), we detect that here
@@ -264,8 +254,7 @@ class FreqaiDataKitchen:
self.data["filter_drop_index_training"] = drop_index
else:
if 'constant_features_list' in self.data and len(self.data['constant_features_list']):
filtered_df = self.check_pred_labels(filtered_df)
# we are backtesting so we need to preserve row number to send back to strategy,
# so now we use do_predict to avoid any prediction based on a NaN
drop_index = pd.isnull(filtered_df).any(axis=1)
@@ -307,107 +296,6 @@ class FreqaiDataKitchen:
return self.data_dictionary
def normalize_data(self, data_dictionary: Dict) -> Dict[Any, Any]:
"""
Normalize all data in the data_dictionary according to the training dataset
:param data_dictionary: dictionary containing the cleaned and
split training/test data/labels
:returns:
:data_dictionary: updated dictionary with standardized values.
"""
# standardize the data by training stats
train_max = data_dictionary["train_features"].max()
train_min = data_dictionary["train_features"].min()
data_dictionary["train_features"] = (
2 * (data_dictionary["train_features"] - train_min) / (train_max - train_min) - 1
)
data_dictionary["test_features"] = (
2 * (data_dictionary["test_features"] - train_min) / (train_max - train_min) - 1
)
for item in train_max.keys():
self.data[item + "_max"] = train_max[item]
self.data[item + "_min"] = train_min[item]
for item in data_dictionary["train_labels"].keys():
if data_dictionary["train_labels"][item].dtype == object:
continue
train_labels_max = data_dictionary["train_labels"][item].max()
train_labels_min = data_dictionary["train_labels"][item].min()
data_dictionary["train_labels"][item] = (
2
* (data_dictionary["train_labels"][item] - train_labels_min)
/ (train_labels_max - train_labels_min)
- 1
)
if self.freqai_config.get('data_split_parameters', {}).get('test_size', 0.1) != 0:
data_dictionary["test_labels"][item] = (
2
* (data_dictionary["test_labels"][item] - train_labels_min)
/ (train_labels_max - train_labels_min)
- 1
)
self.data[f"{item}_max"] = train_labels_max
self.data[f"{item}_min"] = train_labels_min
return data_dictionary
def normalize_single_dataframe(self, df: DataFrame) -> DataFrame:
train_max = df.max()
train_min = df.min()
df = (
2 * (df - train_min) / (train_max - train_min) - 1
)
for item in train_max.keys():
self.data[item + "_max"] = train_max[item]
self.data[item + "_min"] = train_min[item]
return df
def normalize_data_from_metadata(self, df: DataFrame) -> DataFrame:
"""
Normalize a set of data using the min and max values from
the associated training data.
:param df: Dataframe to be standardized
"""
train_max = [None] * len(df.keys())
train_min = [None] * len(df.keys())
for i, item in enumerate(df.keys()):
train_max[i] = self.data[f"{item}_max"]
train_min[i] = self.data[f"{item}_min"]
train_max_series = pd.Series(train_max, index=df.keys())
train_min_series = pd.Series(train_min, index=df.keys())
df = (
2 * (df - train_min_series) / (train_max_series - train_min_series) - 1
)
return df
def denormalize_labels_from_metadata(self, df: DataFrame) -> DataFrame:
"""
Denormalize a set of data using the min and max values from
the associated training data.
:param df: Dataframe of predictions to be denormalized
"""
for label in df.columns:
if df[label].dtype == object or label in self.unique_class_list:
continue
df[label] = (
(df[label] + 1)
* (self.data[f"{label}_max"] - self.data[f"{label}_min"])
/ 2
) + self.data[f"{label}_min"]
return df
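
For reference, the scaling these removed helpers implemented is a plain linear map of each column into [-1, 1] based on the training min/max, plus its exact inverse. A worked round trip in pure pandas:

```python
import pandas as pd

train = pd.Series([10.0, 20.0, 30.0])
t_min, t_max = train.min(), train.max()

scaled = 2 * (train - t_min) / (t_max - t_min) - 1      # -> [-1.0, 0.0, 1.0]
restored = (scaled + 1) * (t_max - t_min) / 2 + t_min   # -> [10.0, 20.0, 30.0]
```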
def split_timerange(
self, tr: str, train_split: int = 28, bt_split: float = 7
) -> Tuple[list, list]:
@@ -452,9 +340,7 @@ class FreqaiDataKitchen:
tr_training_list_timerange.append(copy.deepcopy(timerange_train))
# associated backtest period
timerange_backtest.startts = timerange_train.stopts
timerange_backtest.stopts = timerange_backtest.startts + int(bt_period)
if timerange_backtest.stopts > config_timerange.stopts:
@@ -485,426 +371,6 @@ class FreqaiDataKitchen:
return df
def check_pred_labels(self, df_predictions: DataFrame) -> DataFrame:
"""
Check that prediction feature labels match training feature labels.
:param df_predictions: incoming predictions
"""
constant_labels = self.data['constant_features_list']
df_predictions = df_predictions.filter(
df_predictions.columns.difference(constant_labels)
)
logger.warning(
f"Removed {len(constant_labels)} features from prediction features, "
f"these were considered constant values during most recent training."
)
return df_predictions
def principal_component_analysis(self) -> None:
"""
Performs Principal Component Analysis on the data for dimensionality reduction
and outlier detection (see self.remove_outliers())
No parameters or returns, it acts on the data_dictionary held by the DataHandler.
"""
from sklearn.decomposition import PCA # avoid importing if we dont need it
pca = PCA(0.999)
pca = pca.fit(self.data_dictionary["train_features"])
n_keep_components = pca.n_components_
self.data["n_kept_components"] = n_keep_components
n_components = self.data_dictionary["train_features"].shape[1]
logger.info("reduced feature dimension by %s", n_components - n_keep_components)
logger.info("explained variance %f", np.sum(pca.explained_variance_ratio_))
train_components = pca.transform(self.data_dictionary["train_features"])
self.data_dictionary["train_features"] = pd.DataFrame(
data=train_components,
columns=["PC" + str(i) for i in range(0, n_keep_components)],
index=self.data_dictionary["train_features"].index,
)
# normalising transformed training features
self.data_dictionary["train_features"] = self.normalize_single_dataframe(
self.data_dictionary["train_features"])
# keeping a copy of the non-transformed features so we can check for errors during
# model load from disk
self.data["training_features_list_raw"] = copy.deepcopy(self.training_features_list)
self.training_features_list = self.data_dictionary["train_features"].columns
if self.freqai_config.get('data_split_parameters', {}).get('test_size', 0.1) != 0:
test_components = pca.transform(self.data_dictionary["test_features"])
self.data_dictionary["test_features"] = pd.DataFrame(
data=test_components,
columns=["PC" + str(i) for i in range(0, n_keep_components)],
index=self.data_dictionary["test_features"].index,
)
# normalise transformed test features to transformed training features
self.data_dictionary["test_features"] = self.normalize_data_from_metadata(
self.data_dictionary["test_features"])
self.data["n_kept_components"] = n_keep_components
self.pca = pca
logger.info(f"PCA reduced total features from {n_components} to {n_keep_components}")
if not self.data_path.is_dir():
self.data_path.mkdir(parents=True, exist_ok=True)
return None
def pca_transform(self, filtered_dataframe: DataFrame) -> None:
"""
Use an existing pca transform to transform data into components
:param filtered_dataframe: DataFrame = the cleaned dataframe
"""
pca_components = self.pca.transform(filtered_dataframe)
self.data_dictionary["prediction_features"] = pd.DataFrame(
data=pca_components,
columns=["PC" + str(i) for i in range(0, self.data["n_kept_components"])],
index=filtered_dataframe.index,
)
# normalise transformed predictions to transformed training features
self.data_dictionary["prediction_features"] = self.normalize_data_from_metadata(
self.data_dictionary["prediction_features"])
def compute_distances(self) -> float:
"""
Compute distances between each training point and every other training
point. This metric defines the neighborhood of trained data and is used
for prediction confidence in the Dissimilarity Index
"""
# logger.info("computing average mean distance for all training points")
pairwise = pairwise_distances(
self.data_dictionary["train_features"], n_jobs=self.thread_count)
# remove the diagonal entries, which are self-distances of ~0
np.fill_diagonal(pairwise, np.NaN)
pairwise = pairwise.reshape(-1, 1)
avg_mean_dist = pairwise[~np.isnan(pairwise)].mean()
return avg_mean_dist
def get_outlier_percentage(self, dropped_pts: npt.NDArray) -> float:
"""
Check if more than X% of points were dropped during outlier detection.
"""
outlier_protection_pct = self.freqai_config["feature_parameters"].get(
"outlier_protection_percentage", 30)
outlier_pct = (dropped_pts.sum() / len(dropped_pts)) * 100
if outlier_pct >= outlier_protection_pct:
return outlier_pct
else:
return 0.0
def use_SVM_to_remove_outliers(self, predict: bool) -> None:
"""
Build/inference a Support Vector Machine to detect outliers
in training data and prediction
:param predict: bool = If true, inference an existing SVM model, else construct one
"""
if self.keras:
logger.warning(
"SVM outlier removal not currently supported for Keras based models. "
"Skipping user requested function."
)
if predict:
self.do_predict = np.ones(len(self.data_dictionary["prediction_features"]))
return
if predict:
if not self.svm_model:
logger.warning("No svm model available for outlier removal")
return
y_pred = self.svm_model.predict(self.data_dictionary["prediction_features"])
do_predict = np.where(y_pred == -1, 0, y_pred)
if (len(do_predict) - do_predict.sum()) > 0:
logger.info(f"SVM tossed {len(do_predict) - do_predict.sum()} predictions.")
self.do_predict += do_predict
self.do_predict -= 1
else:
# use SGDOneClassSVM to increase speed?
svm_params = self.freqai_config["feature_parameters"].get(
"svm_params", {"shuffle": False, "nu": 0.1})
self.svm_model = linear_model.SGDOneClassSVM(**svm_params).fit(
self.data_dictionary["train_features"]
)
y_pred = self.svm_model.predict(self.data_dictionary["train_features"])
kept_points = np.where(y_pred == -1, 0, y_pred)
# keep_index = np.where(y_pred == 1)
outlier_pct = self.get_outlier_percentage(1 - kept_points)
if outlier_pct:
logger.warning(
f"SVM detected {outlier_pct:.2f}% of the points as outliers. "
f"Keeping original dataset."
)
self.svm_model = None
return
self.data_dictionary["train_features"] = self.data_dictionary["train_features"][
(y_pred == 1)
]
self.data_dictionary["train_labels"] = self.data_dictionary["train_labels"][
(y_pred == 1)
]
self.data_dictionary["train_weights"] = self.data_dictionary["train_weights"][
(y_pred == 1)
]
logger.info(
f"SVM tossed {len(y_pred) - kept_points.sum()}"
f" train points from {len(y_pred)} total points."
)
# same for test data
# TODO: This (and the part above) could be refactored into a separate function
# to reduce code duplication
if self.freqai_config['data_split_parameters'].get('test_size', 0.1) != 0:
y_pred = self.svm_model.predict(self.data_dictionary["test_features"])
kept_points = np.where(y_pred == -1, 0, y_pred)
self.data_dictionary["test_features"] = self.data_dictionary["test_features"][
(y_pred == 1)
]
self.data_dictionary["test_labels"] = self.data_dictionary["test_labels"][(
y_pred == 1)]
self.data_dictionary["test_weights"] = self.data_dictionary["test_weights"][
(y_pred == 1)
]
logger.info(
f"{self.pair}: SVM tossed {len(y_pred) - kept_points.sum()}"
f" test points from {len(y_pred)} total points."
)
return
def use_DBSCAN_to_remove_outliers(self, predict: bool, eps=None) -> None:
"""
Use DBSCAN to cluster training data and remove "noisy" data (read outliers).
User controls this via the config param `DBSCAN_outlier_pct` which indicates the
pct of training data that they want to be considered outliers.
:param predict: bool = If False (training), iterate to find the best hyper parameters
to match user requested outlier percent target.
If True (prediction), use the parameters determined from
the previous training to estimate if the current prediction point
is an outlier.
"""
if predict:
if not self.data['DBSCAN_eps']:
return
train_ft_df = self.data_dictionary['train_features']
pred_ft_df = self.data_dictionary['prediction_features']
num_preds = len(pred_ft_df)
df = pd.concat([train_ft_df, pred_ft_df], axis=0, ignore_index=True)
clustering = DBSCAN(eps=self.data['DBSCAN_eps'],
min_samples=self.data['DBSCAN_min_samples'],
n_jobs=self.thread_count
).fit(df)
do_predict = np.where(clustering.labels_[-num_preds:] == -1, 0, 1)
if (len(do_predict) - do_predict.sum()) > 0:
logger.info(f"DBSCAN tossed {len(do_predict) - do_predict.sum()} predictions")
self.do_predict += do_predict
self.do_predict -= 1
else:
def normalise_distances(distances):
normalised_distances = (distances - distances.min()) / \
(distances.max() - distances.min())
return normalised_distances
def rotate_point(origin, point, angle):
# rotate a point counterclockwise by a given angle (in radians)
# around a given origin
x = origin[0] + cos(angle) * (point[0] - origin[0]) - \
sin(angle) * (point[1] - origin[1])
y = origin[1] + sin(angle) * (point[0] - origin[0]) + \
cos(angle) * (point[1] - origin[1])
return (x, y)
MinPts = int(len(self.data_dictionary['train_features'].index) * 0.25)
# measure pairwise distances to nearest neighbours
neighbors = NearestNeighbors(
n_neighbors=MinPts, n_jobs=self.thread_count)
neighbors_fit = neighbors.fit(self.data_dictionary['train_features'])
distances, _ = neighbors_fit.kneighbors(self.data_dictionary['train_features'])
distances = np.sort(distances, axis=0).mean(axis=1)
normalised_distances = normalise_distances(distances)
x_range = np.linspace(0, 1, len(distances))
line = np.linspace(normalised_distances[0],
normalised_distances[-1], len(normalised_distances))
deflection = np.abs(normalised_distances - line)
max_deflection_loc = np.where(deflection == deflection.max())[0][0]
origin = x_range[max_deflection_loc], line[max_deflection_loc]
point = x_range[max_deflection_loc], normalised_distances[max_deflection_loc]
rot_angle = np.pi / 4
elbow_loc = rotate_point(origin, point, rot_angle)
epsilon = elbow_loc[1] * (distances[-1] - distances[0]) + distances[0]
clustering = DBSCAN(eps=epsilon, min_samples=MinPts,
n_jobs=int(self.thread_count)).fit(
self.data_dictionary['train_features']
)
logger.info(f'DBSCAN found eps of {epsilon:.2f}.')
self.data['DBSCAN_eps'] = epsilon
self.data['DBSCAN_min_samples'] = MinPts
dropped_points = np.where(clustering.labels_ == -1, 1, 0)
outlier_pct = self.get_outlier_percentage(dropped_points)
if outlier_pct:
logger.warning(
f"DBSCAN detected {outlier_pct:.2f}% of the points as outliers. "
f"Keeping original dataset."
)
self.data['DBSCAN_eps'] = 0
return
self.data_dictionary['train_features'] = self.data_dictionary['train_features'][
(clustering.labels_ != -1)
]
self.data_dictionary["train_labels"] = self.data_dictionary["train_labels"][
(clustering.labels_ != -1)
]
self.data_dictionary["train_weights"] = self.data_dictionary["train_weights"][
(clustering.labels_ != -1)
]
logger.info(
f"DBSCAN tossed {dropped_points.sum()}"
f" train points from {len(clustering.labels_)}"
)
return
def compute_inlier_metric(self, set_='train') -> None:
"""
Compute inlier metric from backwards distance distributions.
This metric defines how well features from a timepoint fit
into previous timepoints.
"""
def normalise(dataframe: DataFrame, key: str) -> DataFrame:
if set_ == 'train':
min_value = dataframe.min()
max_value = dataframe.max()
self.data[f'{key}_min'] = min_value
self.data[f'{key}_max'] = max_value
else:
min_value = self.data[f'{key}_min']
max_value = self.data[f'{key}_max']
return (dataframe - min_value) / (max_value - min_value)
no_prev_pts = self.freqai_config["feature_parameters"]["inlier_metric_window"]
if set_ == 'train':
compute_df = copy.deepcopy(self.data_dictionary['train_features'])
elif set_ == 'test':
compute_df = copy.deepcopy(self.data_dictionary['test_features'])
else:
compute_df = copy.deepcopy(self.data_dictionary['prediction_features'])
compute_df_reindexed = compute_df.reindex(
index=np.flip(compute_df.index)
)
pairwise = pd.DataFrame(
np.triu(
pairwise_distances(compute_df_reindexed, n_jobs=self.thread_count)
),
columns=compute_df_reindexed.index,
index=compute_df_reindexed.index
)
pairwise = pairwise.round(5)
column_labels = [
'{}{}'.format('d', i) for i in range(1, no_prev_pts + 1)
]
distances = pd.DataFrame(
columns=column_labels, index=compute_df.index
)
for index in compute_df.index[no_prev_pts:]:
current_row = pairwise.loc[[index]]
current_row_no_zeros = current_row.loc[
:, (current_row != 0).any(axis=0)
]
distances.loc[[index]] = current_row_no_zeros.iloc[
:, :no_prev_pts
]
distances = distances.replace([np.inf, -np.inf], np.nan)
drop_index = pd.isnull(distances).any(axis=1)
distances = distances[drop_index == 0]
inliers = pd.DataFrame(index=distances.index)
for key in distances.keys():
current_distances = distances[key].dropna()
current_distances = normalise(current_distances, key)
if set_ == 'train':
fit_params = stats.weibull_min.fit(current_distances)
self.data[f'{key}_fit_params'] = fit_params
else:
fit_params = self.data[f'{key}_fit_params']
quantiles = stats.weibull_min.cdf(current_distances, *fit_params)
df_inlier = pd.DataFrame(
{key: quantiles}, index=distances.index
)
inliers = pd.concat(
[inliers, df_inlier], axis=1
)
inlier_metric = pd.DataFrame(
data=inliers.sum(axis=1) / no_prev_pts,
columns=['%-inlier_metric'],
index=compute_df.index
)
inlier_metric = (2 * (inlier_metric - inlier_metric.min()) /
(inlier_metric.max() - inlier_metric.min()) - 1)
if set_ in ('train', 'test'):
inlier_metric = inlier_metric.iloc[no_prev_pts:]
compute_df = compute_df.iloc[no_prev_pts:]
self.remove_beginning_points_from_data_dict(set_, no_prev_pts)
self.data_dictionary[f'{set_}_features'] = pd.concat(
[compute_df, inlier_metric], axis=1)
else:
self.data_dictionary['prediction_features'] = pd.concat(
[compute_df, inlier_metric], axis=1)
self.data_dictionary['prediction_features'].fillna(0, inplace=True)
logger.info('Inlier metric computed and added to features.')
return None
def remove_beginning_points_from_data_dict(self, set_='train', no_prev_pts: int = 10):
features = self.data_dictionary[f'{set_}_features']
weights = self.data_dictionary[f'{set_}_weights']
labels = self.data_dictionary[f'{set_}_labels']
self.data_dictionary[f'{set_}_weights'] = weights[no_prev_pts:]
self.data_dictionary[f'{set_}_features'] = features.iloc[no_prev_pts:]
self.data_dictionary[f'{set_}_labels'] = labels.iloc[no_prev_pts:]
def add_noise_to_training_features(self) -> None:
"""
Add noise to train features to reduce the risk of overfitting.
"""
mu = 0 # no shift
sigma = self.freqai_config["feature_parameters"]["noise_standard_deviation"]
compute_df = self.data_dictionary['train_features']
noise = np.random.normal(mu, sigma, [compute_df.shape[0], compute_df.shape[1]])
self.data_dictionary['train_features'] += noise
return
def find_features(self, dataframe: DataFrame) -> None:
"""
Find features in the strategy provided dataframe
@@ -925,37 +391,6 @@ class FreqaiDataKitchen:
labels = [c for c in column_names if "&" in c]
self.label_list = labels
def check_if_pred_in_training_spaces(self) -> None:
"""
Compares the distance from each prediction point to each training data
point. It uses this information to estimate a Dissimilarity Index (DI)
and avoid making predictions on any points that are too far away
from the training data set.
"""
distance = pairwise_distances(
self.data_dictionary["train_features"],
self.data_dictionary["prediction_features"],
n_jobs=self.thread_count,
)
self.DI_values = distance.min(axis=0) / self.data["avg_mean_dist"]
do_predict = np.where(
self.DI_values < self.freqai_config["feature_parameters"]["DI_threshold"],
1,
0,
)
if (len(do_predict) - do_predict.sum()) > 0:
logger.info(
f"{self.pair}: DI tossed {len(do_predict) - do_predict.sum()} predictions for "
"being too far from training data."
)
self.do_predict += do_predict
self.do_predict -= 1
def set_weights_higher_recent(self, num_weights: int) -> npt.ArrayLike:
"""
Set weights so that recent data is more heavily weighted during
@@ -1325,9 +760,9 @@ class FreqaiDataKitchen:
" which was deprecated on March 1, 2023. Please refer "
"to the strategy migration guide to use the new "
"feature_engineering_* methods: \n"
"https://www.freqtrade.io/en/stable/strategy_migration/#freqai-strategy \n"
f"{DOCS_LINK}/strategy_migration/#freqai-strategy \n"
"And the feature_engineering_* documentation: \n"
"https://www.freqtrade.io/en/latest/freqai-feature-engineering/"
f"{DOCS_LINK}/freqai-feature-engineering/"
)
tfs: List[str] = self.freqai_config["feature_parameters"].get("include_timeframes")
@@ -1515,3 +950,32 @@ class FreqaiDataKitchen:
timerange.startts += buffer * timeframe_to_seconds(self.config["timeframe"])
return timerange
# deprecated functions
def normalize_data(self, data_dictionary: Dict) -> Dict[Any, Any]:
"""
Deprecation warning, migration assistance
"""
logger.warning(f"Your custom IFreqaiModel relies on the deprecated"
" data pipeline. Please update your model to use the new data pipeline."
" This can be achieved by following the migration guide at "
f"{DOCS_LINK}/strategy_migration/#freqai-new-data-pipeline "
"We added a basic pipeline for you, but this will be removed "
"in a future version.")
return data_dictionary
def denormalize_labels_from_metadata(self, df: DataFrame) -> DataFrame:
"""
Deprecation warning, migration assistance
"""
logger.warning(f"Your custom IFreqaiModel relies on the deprecated"
" data pipeline. Please update your model to use the new data pipeline."
" This can be achieved by following the migration guide at "
f"{DOCS_LINK}/strategy_migration/#freqai-new-data-pipeline "
"We added a basic pipeline for you, but this will be removed "
"in a future version.")
pred_df, _, _ = self.label_pipeline.inverse_transform(df)
return pred_df

View File

@@ -7,14 +7,18 @@ from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Literal, Optional, Tuple
import datasieve.transforms as ds
import numpy as np
import pandas as pd
import psutil
from datasieve.pipeline import Pipeline
from datasieve.transforms import SKLearnWrapper
from numpy.typing import NDArray
from pandas import DataFrame
from sklearn.preprocessing import MinMaxScaler
from freqtrade.configuration import TimeRange
from freqtrade.constants import Config
from freqtrade.constants import DOCS_LINK, Config
from freqtrade.data.dataprovider import DataProvider
from freqtrade.enums import RunMode
from freqtrade.exceptions import OperationalException
@@ -503,68 +507,43 @@ class IFreqaiModel(ABC):
"feature_engineering_* functions"
)
def data_cleaning_train(self, dk: FreqaiDataKitchen) -> None:
"""
Base data cleaning method for train.
Functions here improve/modify the input data by identifying outliers,
computing additional metrics, adding noise, reducing dimensionality etc.
"""
def define_data_pipeline(self, threads=-1) -> Pipeline:
ft_params = self.freqai_info["feature_parameters"]
pipe_steps = [
('const', ds.VarianceThreshold(threshold=0)),
('scaler', SKLearnWrapper(MinMaxScaler(feature_range=(-1, 1))))
]
if ft_params.get('inlier_metric_window', 0):
dk.compute_inlier_metric(set_='train')
if self.freqai_info["data_split_parameters"]["test_size"] > 0:
dk.compute_inlier_metric(set_='test')
if ft_params.get(
"principal_component_analysis", False
):
dk.principal_component_analysis()
if ft_params.get("principal_component_analysis", False):
pipe_steps.append(('pca', ds.PCA()))
pipe_steps.append(('post-pca-scaler',
SKLearnWrapper(MinMaxScaler(feature_range=(-1, 1)))))
if ft_params.get("use_SVM_to_remove_outliers", False):
dk.use_SVM_to_remove_outliers(predict=False)
svm_params = ft_params.get(
"svm_params", {"shuffle": False, "nu": 0.01})
pipe_steps.append(('svm', ds.SVMOutlierExtractor(**svm_params)))
if ft_params.get("DI_threshold", 0):
dk.data["avg_mean_dist"] = dk.compute_distances()
di = ft_params.get("DI_threshold", 0)
if di:
pipe_steps.append(('di', ds.DissimilarityIndex(di_threshold=di, n_jobs=threads)))
if ft_params.get("use_DBSCAN_to_remove_outliers", False):
if dk.pair in self.dd.old_DBSCAN_eps:
eps = self.dd.old_DBSCAN_eps[dk.pair]
else:
eps = None
dk.use_DBSCAN_to_remove_outliers(predict=False, eps=eps)
self.dd.old_DBSCAN_eps[dk.pair] = dk.data['DBSCAN_eps']
pipe_steps.append(('dbscan', ds.DBSCAN(n_jobs=threads)))
if self.freqai_info["feature_parameters"].get('noise_standard_deviation', 0):
dk.add_noise_to_training_features()
sigma = self.freqai_info["feature_parameters"].get('noise_standard_deviation', 0)
if sigma:
pipe_steps.append(('noise', ds.Noise(sigma=sigma)))
def data_cleaning_predict(self, dk: FreqaiDataKitchen) -> None:
"""
Base data cleaning method for predict.
Functions here are complementary to the functions of data_cleaning_train.
"""
ft_params = self.freqai_info["feature_parameters"]
return Pipeline(pipe_steps)
# ensure user is feeding the correct indicators to the model
self.check_if_feature_list_matches_strategy(dk)
def define_label_pipeline(self, threads=-1) -> Pipeline:
if ft_params.get('inlier_metric_window', 0):
dk.compute_inlier_metric(set_='predict')
label_pipeline = Pipeline([
('scaler', SKLearnWrapper(MinMaxScaler(feature_range=(-1, 1))))
])
if ft_params.get(
"principal_component_analysis", False
):
dk.pca_transform(dk.data_dictionary['prediction_features'])
if ft_params.get("use_SVM_to_remove_outliers", False):
dk.use_SVM_to_remove_outliers(predict=True)
if ft_params.get("DI_threshold", 0):
dk.check_if_pred_in_training_spaces()
if ft_params.get("use_DBSCAN_to_remove_outliers", False):
dk.use_DBSCAN_to_remove_outliers(predict=True)
return label_pipeline
def model_exists(self, dk: FreqaiDataKitchen) -> bool:
"""
@@ -576,8 +555,6 @@ class IFreqaiModel(ABC):
"""
if self.dd.model_type == 'joblib':
file_type = ".joblib"
elif self.dd.model_type == 'keras':
file_type = ".h5"
elif self.dd.model_type in ["stable_baselines3", "sb3_contrib", "pytorch"]:
file_type = ".zip"
@@ -701,7 +678,7 @@ class IFreqaiModel(ABC):
# # for keras type models, the conv_window needs to be prepended so
# # viewing is correct in frequi
if self.freqai_info.get('keras', False) or self.ft_params.get('inlier_metric_window', 0):
if self.ft_params.get('inlier_metric_window', 0):
n_lost_points = self.freqai_info.get('conv_width', 2)
zeros_df = DataFrame(np.zeros((n_lost_points, len(hist_preds_df.columns))),
columns=hist_preds_df.columns)
@@ -991,3 +968,50 @@ class IFreqaiModel(ABC):
:do_predict: np.array of 1s and 0s to indicate places where freqai needed to remove
data (NaNs) or felt uncertain about data (i.e. SVM and/or DI index)
"""
# deprecated functions
def data_cleaning_train(self, dk: FreqaiDataKitchen, pair: str):
"""
Throw a deprecation warning if this function is called.
"""
logger.warning(f"Your model {self.__class__.__name__} relies on the deprecated"
" data pipeline. Please update your model to use the new data pipeline."
" This can be achieved by following the migration guide at "
f"{DOCS_LINK}/strategy_migration/#freqai-new-data-pipeline")
dk.feature_pipeline = self.define_data_pipeline(threads=dk.thread_count)
dd = dk.data_dictionary
(dd["train_features"],
dd["train_labels"],
dd["train_weights"]) = dk.feature_pipeline.fit_transform(dd["train_features"],
dd["train_labels"],
dd["train_weights"])
(dd["test_features"],
dd["test_labels"],
dd["test_weights"]) = dk.feature_pipeline.transform(dd["test_features"],
dd["test_labels"],
dd["test_weights"])
dk.label_pipeline = self.define_label_pipeline(threads=dk.thread_count)
dd["train_labels"], _, _ = dk.label_pipeline.fit_transform(dd["train_labels"])
dd["test_labels"], _, _ = dk.label_pipeline.transform(dd["test_labels"])
return
def data_cleaning_predict(self, dk: FreqaiDataKitchen, pair: str):
"""
Throw a deprecation warning if this function is called.
"""
logger.warning(f"Your model {self.__class__.__name__} relies on the deprecated"
" data pipeline. Please update your model to use the new data pipeline."
" This can be achieved by following the migration guide at "
f"{DOCS_LINK}/strategy_migration/#freqai-new-data-pipeline")
dd = dk.data_dictionary
dd["predict_features"], outliers, _ = dk.feature_pipeline.transform(
dd["predict_features"], outlier_check=True)
if self.freqai_info.get("DI_threshold", 0) > 0:
dk.DI_values = dk.feature_pipeline["di"].di_values
else:
dk.DI_values = np.zeros(len(outliers.index))
dk.do_predict = outliers.to_numpy()
return
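
With the `data_cleaning_*` hooks reduced to deprecation shims, preprocessing is now customized by overriding `define_data_pipeline()` (and, for regressors, `define_label_pipeline()`) in the user model. A hedged sketch of such an override, reusing only the datasieve steps shown above (the class name is illustrative, and a real model must still implement `fit()`):

```python
import datasieve.transforms as ds
from datasieve.pipeline import Pipeline
from datasieve.transforms import SKLearnWrapper
from sklearn.preprocessing import MinMaxScaler

from freqtrade.freqai.base_models.BaseRegressionModel import BaseRegressionModel


class MyRegressor(BaseRegressionModel):  # illustrative subclass
    def define_data_pipeline(self, threads=-1) -> Pipeline:
        # keep the default constant-filter + scaling, append a DI step
        # with a custom threshold
        return Pipeline([
            ("const", ds.VarianceThreshold(threshold=0)),
            ("scaler", SKLearnWrapper(MinMaxScaler(feature_range=(-1, 1)))),
            ("di", ds.DissimilarityIndex(di_threshold=1.0, n_jobs=threads)),
        ])
    # fit() omitted: a concrete model must still implement it
```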

View File

@@ -103,13 +103,13 @@ class PyTorchTransformerRegressor(BasePyTorchRegressor):
"""
dk.find_features(unfiltered_df)
filtered_df, _ = dk.filter_features(
dk.data_dictionary["prediction_features"], _ = dk.filter_features(
unfiltered_df, dk.training_features_list, training_filter=False
)
filtered_df = dk.normalize_data_from_metadata(filtered_df)
dk.data_dictionary["prediction_features"] = filtered_df
self.data_cleaning_predict(dk)
dk.data_dictionary["prediction_features"], outliers, _ = dk.feature_pipeline.transform(
dk.data_dictionary["prediction_features"], outlier_check=True)
x = self.data_convertor.convert_x(
dk.data_dictionary["prediction_features"],
device=self.device
@@ -131,7 +131,13 @@ class PyTorchTransformerRegressor(BasePyTorchRegressor):
yb = yb.cpu().squeeze()
pred_df = pd.DataFrame(yb.detach().numpy(), columns=dk.label_list)
pred_df = dk.denormalize_labels_from_metadata(pred_df)
pred_df, _, _ = dk.label_pipeline.inverse_transform(pred_df)
if self.freqai_info.get("DI_threshold", 0) > 0:
dk.DI_values = dk.feature_pipeline["di"].di_values
else:
dk.DI_values = np.zeros(len(outliers.index))
dk.do_predict = outliers.to_numpy()
if x.shape[1] > 1:
zeros_df = pd.DataFrame(np.zeros((x.shape[1] - len(pred_df), len(pred_df.columns))),
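
When the transformer consumes a window of candles (`x.shape[1] > 1`), the first `window - 1` timestamps produce no prediction, so zero rows are prepended to keep the returned frame aligned with the input (the prepend order is assumed here from the conv_width comment in IFreqaiModel). A toy illustration:

```python
import numpy as np
import pandas as pd

window = 4
pred_df = pd.DataFrame({"&-target": [0.1, 0.2]})  # predictions for the last rows
zeros_df = pd.DataFrame(np.zeros((window - len(pred_df), len(pred_df.columns))),
                        columns=pred_df.columns)
aligned = pd.concat([zeros_df, pred_df], axis=0, ignore_index=True)
# -> rows 0..1 are zero padding, rows 2..3 hold the real predictions
```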

View File

@@ -2,7 +2,8 @@ import logging
from typing import Any, Dict
from pandas import DataFrame
from stable_baselines3.common.callbacks import EvalCallback
from sb3_contrib.common.maskable.callbacks import MaskableEvalCallback
from sb3_contrib.common.maskable.utils import is_masking_supported
from stable_baselines3.common.vec_env import SubprocVecEnv, VecMonitor
from freqtrade.freqai.data_kitchen import FreqaiDataKitchen
@@ -55,9 +56,11 @@ class ReinforcementLearner_multiproc(ReinforcementLearner):
env_info=env_info) for i
in range(self.max_threads)]))
self.eval_callback = EvalCallback(self.eval_env, deterministic=True,
render=False, eval_freq=eval_freq,
best_model_save_path=str(dk.data_path))
self.eval_callback = MaskableEvalCallback(self.eval_env, deterministic=True,
render=False, eval_freq=eval_freq,
best_model_save_path=str(dk.data_path),
use_masking=(self.model_type == 'MaskablePPO' and
is_masking_supported(self.eval_env)))
# TENSORBOARD CALLBACK IS NOT RECOMMENDED FOR USE WITH MULTIPLE ENVS,
# IT WILL RETURN FALSE INFORMATION; IT IS ALSO NOT THREAD SAFE WITH SB3!!!

View File

@@ -5,6 +5,7 @@ from xgboost import XGBRFRegressor
from freqtrade.freqai.base_models.BaseRegressionModel import BaseRegressionModel
from freqtrade.freqai.data_kitchen import FreqaiDataKitchen
from freqtrade.freqai.tensorboard import TBCallback
logger = logging.getLogger(__name__)
@@ -44,7 +45,10 @@ class XGBoostRFRegressor(BaseRegressionModel):
model = XGBRFRegressor(**self.model_training_parameters)
model.set_params(callbacks=[TBCallback(dk.data_path)], activate=self.activate_tensorboard)
model.fit(X=X, y=y, sample_weight=sample_weight, eval_set=eval_set,
sample_weight_eval_set=eval_weights, xgb_model=xgb_model)
# set the callbacks to empty so that we can serialize to disk later
model.set_params(callbacks=[])
return model
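
Clearing the callbacks before returning matters because the data drawer later pickles the fitted model to disk, and a live TensorBoard callback may not serialize. The pattern in isolation, on toy data:

```python
import numpy as np
from xgboost import XGBRFRegressor

X = np.random.rand(32, 4)
y = np.random.rand(32)

model = XGBRFRegressor(n_estimators=10)
model.fit(X, y)
model.set_params(callbacks=[])  # drop live callback objects so the model pickles cleanly
```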

View File

@@ -2,6 +2,9 @@
import logging
logger = logging.getLogger(__name__)
def set_loggers(verbosity: int = 0, api_verbosity: str = 'info') -> None:
"""
Set the logging level for third party libraries
@@ -23,3 +26,30 @@ def set_loggers(verbosity: int = 0, api_verbosity: str = 'info') -> None:
logging.getLogger('werkzeug').setLevel(
logging.ERROR if api_verbosity == 'error' else logging.INFO
)
__BIAS_TESTER_LOGGERS = [
'freqtrade.resolvers',
'freqtrade.strategy.hyper',
'freqtrade.configuration.config_validation',
]
def reduce_verbosity_for_bias_tester() -> None:
"""
Reduce verbosity for bias tester.
It loads the same strategy several times, which would spam the log.
"""
logger.info("Reducing verbosity for bias tester.")
for logger_name in __BIAS_TESTER_LOGGERS:
logging.getLogger(logger_name).setLevel(logging.WARNING)
def restore_verbosity_for_bias_tester() -> None:
"""
Restore verbosity after bias tester.
"""
logger.info("Restoring log verbosity.")
log_level = logging.NOTSET
for logger_name in __BIAS_TESTER_LOGGERS:
logging.getLogger(logger_name).setLevel(log_level)
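
A usage sketch for the two helpers, bracketing the repeated strategy loads that a bias test performs (the try/finally guarantees the levels are restored even on failure):

```python
from freqtrade.loggers.set_log_levels import (reduce_verbosity_for_bias_tester,
                                              restore_verbosity_for_bias_tester)

reduce_verbosity_for_bias_tester()
try:
    # run the repeated backtests / strategy reloads here
    ...
finally:
    restore_verbosity_for_bias_tester()
```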

View File

@@ -24,6 +24,7 @@ from freqtrade.enums import (BacktestState, CandleType, ExitCheckTuple, ExitType
from freqtrade.exceptions import DependencyException, OperationalException
from freqtrade.exchange import (amount_to_contract_precision, price_to_precision,
timeframe_to_minutes, timeframe_to_seconds)
from freqtrade.exchange.exchange import Exchange
from freqtrade.mixins import LoggingMixin
from freqtrade.optimize.backtest_caching import get_strategy_run_id
from freqtrade.optimize.bt_progress import BTProgress
@@ -72,7 +73,7 @@ class Backtesting:
backtesting.start()
"""
def __init__(self, config: Config) -> None:
def __init__(self, config: Config, exchange: Optional[Exchange] = None) -> None:
LoggingMixin.show_output = False
self.config = config
@@ -89,7 +90,10 @@ class Backtesting:
self.rejected_df: Dict[str, Dict] = {}
self._exchange_name = self.config['exchange']['name']
self.exchange = ExchangeResolver.load_exchange(self.config, load_leverage_tiers=True)
if not exchange:
exchange = ExchangeResolver.load_exchange(self.config, load_leverage_tiers=True)
self.exchange = exchange
self.dataprovider = DataProvider(self.config, self.exchange)
if self.config.get('strategy_list'):
@@ -114,16 +118,7 @@ class Backtesting:
self.timeframe_min = timeframe_to_minutes(self.timeframe)
self.init_backtest_detail()
self.pairlists = PairListManager(self.exchange, self.config, self.dataprovider)
if 'VolumePairList' in self.pairlists.name_list:
raise OperationalException("VolumePairList not allowed for backtesting. "
"Please use StaticPairList instead.")
if 'PerformanceFilter' in self.pairlists.name_list:
raise OperationalException("PerformanceFilter not allowed for backtesting.")
if len(self.strategylist) > 1 and 'PrecisionFilter' in self.pairlists.name_list:
raise OperationalException(
"PrecisionFilter not allowed for backtesting multiple strategies."
)
self._validate_pairlists_for_backtesting()
self.dataprovider.add_pairlisthandler(self.pairlists)
self.pairlists.refresh_pairlist()
@@ -164,6 +159,18 @@ class Backtesting:
self.init_backtest()
def _validate_pairlists_for_backtesting(self):
if 'VolumePairList' in self.pairlists.name_list:
raise OperationalException("VolumePairList not allowed for backtesting. "
"Please use StaticPairList instead.")
if 'PerformanceFilter' in self.pairlists.name_list:
raise OperationalException("PerformanceFilter not allowed for backtesting.")
if len(self.strategylist) > 1 and 'PrecisionFilter' in self.pairlists.name_list:
raise OperationalException(
"PrecisionFilter not allowed for backtesting multiple strategies."
)
@staticmethod
def cleanup():
LoggingMixin.show_output = True
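
Accepting a pre-built exchange lets callers that spin up many `Backtesting` instances (such as the lookahead analysis below) load markets once and reuse them. A sketch, where `config`, `config_a` and `config_b` stand for already-validated configuration dicts:

```python
from freqtrade.optimize.backtesting import Backtesting
from freqtrade.resolvers import ExchangeResolver

exchange = ExchangeResolver.load_exchange(config, load_leverage_tiers=True)
# every instance reuses the same exchange object instead of re-loading markets
bt_a = Backtesting(config_a, exchange)
bt_b = Backtesting(config_b, exchange)
```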

View File

@@ -0,0 +1,275 @@
import logging
import shutil
from copy import deepcopy
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional
from pandas import DataFrame
from freqtrade.configuration import TimeRange
from freqtrade.data.history import get_timerange
from freqtrade.exchange import timeframe_to_minutes
from freqtrade.loggers.set_log_levels import (reduce_verbosity_for_bias_tester,
restore_verbosity_for_bias_tester)
from freqtrade.optimize.backtesting import Backtesting
logger = logging.getLogger(__name__)
class VarHolder:
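"""Container for the inputs and outputs of a single backtest run over one timerange."""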
timerange: TimeRange
data: DataFrame
indicators: Dict[str, DataFrame]
result: DataFrame
compared: DataFrame
from_dt: datetime
to_dt: datetime
compared_dt: datetime
timeframe: str
class Analysis:
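"""Aggregated counters describing the bias findings for one strategy."""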
def __init__(self) -> None:
self.total_signals = 0
self.false_entry_signals = 0
self.false_exit_signals = 0
self.false_indicators: List[str] = []
self.has_bias = False
class LookaheadAnalysis:
def __init__(self, config: Dict[str, Any], strategy_obj: Dict):
self.failed_bias_check = True
self.full_varHolder = VarHolder()
self.entry_varHolders: List[VarHolder] = []
self.exit_varHolders: List[VarHolder] = []
self.exchange: Optional[Any] = None
# pull variables into the scope of this lookahead_analysis instance
self.local_config = deepcopy(config)
self.local_config['strategy'] = strategy_obj['name']
self.current_analysis = Analysis()
self.minimum_trade_amount = config['minimum_trade_amount']
self.targeted_trade_amount = config['targeted_trade_amount']
self.strategy_obj = strategy_obj
@staticmethod
def dt_to_timestamp(dt: datetime):
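# Treat the datetime as UTC and return its unix timestamp,
# e.g. dt_to_timestamp(datetime(2020, 1, 1)) == 1577836800.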
timestamp = int(dt.replace(tzinfo=timezone.utc).timestamp())
return timestamp
@staticmethod
def get_result(backtesting: Backtesting, processed: DataFrame):
min_date, max_date = get_timerange(processed)
result = backtesting.backtest(
processed=deepcopy(processed),
start_date=min_date,
end_date=max_date
)
return result
@staticmethod
def report_signal(result: dict, column_name: str, checked_timestamp: datetime):
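"""Return True if the backtest results contain a row where column_name equals checked_timestamp."""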
df = result['results']
if df[column_name].shape[0] == 0:
return False
df_cut = df[df[column_name] == checked_timestamp]
return df_cut[column_name].shape[0] != 0
# analyzes two data frames with processed indicators and shows differences between them.
def analyze_indicators(self, full_vars: VarHolder, cut_vars: VarHolder, current_pair: str):
# extract dataframes
cut_df: DataFrame = cut_vars.indicators[current_pair]
full_df: DataFrame = full_vars.indicators[current_pair]
# reduce both dataframes to the single row at the compared timestamp
full_df_cut = full_df[
(full_df.date == cut_vars.compared_dt)
].reset_index(drop=True)
cut_df_cut = cut_df[
(cut_df.date == cut_vars.compared_dt)
].reset_index(drop=True)
# check if dataframes are not empty
if full_df_cut.shape[0] != 0 and cut_df_cut.shape[0] != 0:
# compare dataframes
compare_df = full_df_cut.compare(cut_df_cut)
if compare_df.shape[0] > 0:
for col_name, values in compare_df.items():
col_idx = compare_df.columns.get_loc(col_name)
compare_df_row = compare_df.iloc[0]
# compare_df columns are tuples whose second element is either 'self' or 'other'
if 'other' in col_name[1]:
continue
self_value = compare_df_row[col_idx]
other_value = compare_df_row[col_idx + 1]
# output differences
if self_value != other_value:
if col_name[0] not in self.current_analysis.false_indicators:
self.current_analysis.false_indicators.append(col_name[0])
logger.info(f"=> found look ahead bias in indicator "
f"{col_name[0]}. "
f"{str(self_value)} != {str(other_value)}")
def prepare_data(self, varholder: VarHolder, pairs_to_load: List[str]):
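"""Backtest the holder's timerange and store data, indicators and results on the holder."""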
if 'freqai' in self.local_config and 'identifier' in self.local_config['freqai']:
# purge previous data if the freqai model is defined
# (to be sure nothing is carried over from older backtests)
path_to_current_identifier = (
Path(f"{self.local_config['user_data_dir']}/models/"
f"{self.local_config['freqai']['identifier']}").resolve())
# remove folder and its contents
if Path.exists(path_to_current_identifier):
shutil.rmtree(path_to_current_identifier)
prepare_data_config = deepcopy(self.local_config)
prepare_data_config['timerange'] = (str(self.dt_to_timestamp(varholder.from_dt)) + "-" +
str(self.dt_to_timestamp(varholder.to_dt)))
prepare_data_config['exchange']['pair_whitelist'] = pairs_to_load
backtesting = Backtesting(prepare_data_config, self.exchange)
self.exchange = backtesting.exchange
backtesting._set_strategy(backtesting.strategylist[0])
varholder.data, varholder.timerange = backtesting.load_bt_data()
backtesting.load_bt_data_detail()
varholder.timeframe = backtesting.timeframe
varholder.indicators = backtesting.strategy.advise_all_indicators(varholder.data)
varholder.result = self.get_result(backtesting, varholder.indicators)
def fill_full_varholder(self):
self.full_varHolder = VarHolder()
# derive start and end datetimes from the configured timerange
parsed_timerange = TimeRange.parse_timerange(self.local_config['timerange'])
if parsed_timerange.startdt is None:
self.full_varHolder.from_dt = datetime.fromtimestamp(0, tz=timezone.utc)
else:
self.full_varHolder.from_dt = parsed_timerange.startdt
if parsed_timerange.stopdt is None:
self.full_varHolder.to_dt = datetime.utcnow()
else:
self.full_varHolder.to_dt = parsed_timerange.stopdt
self.prepare_data(self.full_varHolder, self.local_config['pairs'])
def fill_entry_and_exit_varHolders(self, result_row):
# entry_varHolder
entry_varHolder = VarHolder()
self.entry_varHolders.append(entry_varHolder)
entry_varHolder.from_dt = self.full_varHolder.from_dt
entry_varHolder.compared_dt = result_row['open_date']
# to_dt needs +1 candle since it won't buy on the last candle
entry_varHolder.to_dt = (
result_row['open_date'] +
timedelta(minutes=timeframe_to_minutes(self.full_varHolder.timeframe)))
self.prepare_data(entry_varHolder, [result_row['pair']])
# exit_varHolder
exit_varHolder = VarHolder()
self.exit_varHolders.append(exit_varHolder)
# to_dt needs +1 candle since it will always exit/force-exit trades on the last candle
exit_varHolder.from_dt = self.full_varHolder.from_dt
exit_varHolder.to_dt = (
result_row['close_date'] +
timedelta(minutes=timeframe_to_minutes(self.full_varHolder.timeframe)))
exit_varHolder.compared_dt = result_row['close_date']
self.prepare_data(exit_varHolder, [result_row['pair']])
# analyze a single trade from full_varholder and check it for bias
def analyze_row(self, idx, result_row):
# if force-sold, ignore this signal since it would exit unconditionally on the last candle.
if result_row.close_date == self.dt_to_timestamp(self.full_varHolder.to_dt):
return
# keep track of how many signals are processed in total
self.current_analysis.total_signals += 1
# fill entry_varHolder and exit_varHolder
self.fill_entry_and_exit_varHolders(result_row)
# register if the entry signal is broken
if not self.report_signal(
self.entry_varHolders[idx].result,
"open_date",
self.entry_varHolders[idx].compared_dt):
self.current_analysis.false_entry_signals += 1
# register if the exit signal is broken
if not self.report_signal(
self.exit_varHolders[idx].result,
"close_date",
self.exit_varHolders[idx].compared_dt):
self.current_analysis.false_exit_signals += 1
# check if the indicators themselves contain biased data
self.analyze_indicators(self.full_varHolder, self.entry_varHolders[idx], result_row['pair'])
self.analyze_indicators(self.full_varHolder, self.exit_varHolders[idx], result_row['pair'])
def start(self) -> None:
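"""Run one full backtest, then re-run each signal on a cut timerange and compare results."""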
# first run a single full backtest
self.fill_full_varholder()
reduce_verbosity_for_bias_tester()
# check whether the trade-amount requirements were met by full_varholder
found_signals: int = self.full_varHolder.result['results'].shape[0] + 1
if found_signals >= self.targeted_trade_amount:
logger.info(f"Found {found_signals} trades, "
f"calculating {self.targeted_trade_amount} trades.")
elif self.targeted_trade_amount >= found_signals >= self.minimum_trade_amount:
logger.info(f"Only found {found_signals} trades. Calculating all available trades.")
else:
logger.info(f"found {found_signals} trades "
f"which is less than minimum_trade_amount {self.minimum_trade_amount}. "
f"Cancelling this backtest lookahead bias test.")
return
# now we loop through all signals
# starting from the same datetime to avoid misreports of bias
for idx, result_row in self.full_varHolder.result['results'].iterrows():
if self.current_analysis.total_signals == self.targeted_trade_amount:
break
self.analyze_row(idx, result_row)
# Restore verbosity, so it's not too quiet for the next strategy
restore_verbosity_for_bias_tester()
# check and report signals
if self.current_analysis.total_signals < self.local_config['minimum_trade_amount']:
logger.info(f" -> {self.local_config['strategy']} : too few trades. "
f"We only found {self.current_analysis.total_signals} trades. "
f"Hint: Extend the timerange to get at least "
f"{self.local_config['minimum_trade_amount']} trades, "
f"or lower the value of minimum_trade_amount.")
self.failed_bias_check = True
elif (self.current_analysis.false_entry_signals > 0 or
self.current_analysis.false_exit_signals > 0 or
len(self.current_analysis.false_indicators) > 0):
logger.info(f" => {self.local_config['strategy']} : bias detected!")
self.current_analysis.has_bias = True
self.failed_bias_check = False
else:
logger.info(f"{self.local_config['strategy']}: no bias detected.")
self.failed_bias_check = False

View File

@@ -0,0 +1,202 @@
import logging
import time
from pathlib import Path
from typing import Any, Dict, List
import pandas as pd
from freqtrade.constants import Config
from freqtrade.exceptions import OperationalException
from freqtrade.optimize.lookahead_analysis import LookaheadAnalysis
from freqtrade.resolvers import StrategyResolver
logger = logging.getLogger(__name__)
class LookaheadAnalysisSubFunctions:
@staticmethod
def text_table_lookahead_analysis_instances(
config: Dict[str, Any],
lookahead_instances: List[LookaheadAnalysis]):
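"""Print a tabulated overview of all analysis instances and return (table, headers, data)."""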
headers = ['filename', 'strategy', 'has_bias', 'total_signals',
'biased_entry_signals', 'biased_exit_signals', 'biased_indicators']
data = []
for inst in lookahead_instances:
if config['minimum_trade_amount'] > inst.current_analysis.total_signals:
data.append(
[
inst.strategy_obj['location'].parts[-1],
inst.strategy_obj['name'],
"too few trades caught "
f"({inst.current_analysis.total_signals}/{config['minimum_trade_amount']})."
f"Test failed."
]
)
elif inst.failed_bias_check:
data.append(
[
inst.strategy_obj['location'].parts[-1],
inst.strategy_obj['name'],
'error while checking'
]
)
else:
data.append(
[
inst.strategy_obj['location'].parts[-1],
inst.strategy_obj['name'],
inst.current_analysis.has_bias,
inst.current_analysis.total_signals,
inst.current_analysis.false_entry_signals,
inst.current_analysis.false_exit_signals,
", ".join(inst.current_analysis.false_indicators)
]
)
from tabulate import tabulate
table = tabulate(data, headers=headers, tablefmt="orgtbl")
print(table)
return table, headers, data
@staticmethod
def export_to_csv(config: Dict[str, Any], lookahead_analysis: List[LookaheadAnalysis]):
def add_or_update_row(df, row_data):
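# Upsert: rows are keyed on the (filename, strategy) pair.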
if (
(df['filename'] == row_data['filename']) &
(df['strategy'] == row_data['strategy'])
).any():
# Update existing row
pd_series = pd.DataFrame([row_data])
df.loc[
(df['filename'] == row_data['filename']) &
(df['strategy'] == row_data['strategy'])
] = pd_series
else:
# Add new row
df = pd.concat([df, pd.DataFrame([row_data], columns=df.columns)])
return df
if Path(config['lookahead_analysis_exportfilename']).exists():
# Read CSV file into a pandas dataframe
csv_df = pd.read_csv(config['lookahead_analysis_exportfilename'])
else:
# Create a new empty DataFrame with the desired column names and set the index
csv_df = pd.DataFrame(columns=[
'filename', 'strategy', 'has_bias', 'total_signals',
'biased_entry_signals', 'biased_exit_signals', 'biased_indicators'
],
index=None)
for inst in lookahead_analysis:
# only update if enough trades were found and the bias check completed
if (inst.current_analysis.total_signals > config['minimum_trade_amount']
and inst.failed_bias_check is not True):
new_row_data = {'filename': inst.strategy_obj['location'].parts[-1],
'strategy': inst.strategy_obj['name'],
'has_bias': inst.current_analysis.has_bias,
'total_signals':
int(inst.current_analysis.total_signals),
'biased_entry_signals':
int(inst.current_analysis.false_entry_signals),
'biased_exit_signals':
int(inst.current_analysis.false_exit_signals),
'biased_indicators':
",".join(inst.current_analysis.false_indicators)}
csv_df = add_or_update_row(csv_df, new_row_data)
# Fill NaN values in the signal-count columns with 0
csv_df['total_signals'] = csv_df['total_signals'].fillna(0)
csv_df['biased_entry_signals'] = csv_df['biased_entry_signals'].fillna(0)
csv_df['biased_exit_signals'] = csv_df['biased_exit_signals'].fillna(0)
# Convert columns to integers
csv_df['total_signals'] = csv_df['total_signals'].astype(int)
csv_df['biased_entry_signals'] = csv_df['biased_entry_signals'].astype(int)
csv_df['biased_exit_signals'] = csv_df['biased_exit_signals'].astype(int)
logger.info(f"saving {config['lookahead_analysis_exportfilename']}")
csv_df.to_csv(config['lookahead_analysis_exportfilename'], index=False)
@staticmethod
def calculate_config_overrides(config: Config):
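"""Force config values that would otherwise cause false positives in the bias check."""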
if config['targeted_trade_amount'] < config['minimum_trade_amount']:
# this combo doesn't make any sense.
raise OperationalException(
"Targeted trade amount can't be smaller than minimum trade amount."
)
if len(config['pairs']) > config['max_open_trades']:
logger.info('max_open_trades was less than the number of pairs. '
'Setting max_open_trades to the number of pairs to avoid false positives.')
config['max_open_trades'] = len(config['pairs'])
min_dry_run_wallet = 1000000000
if config['dry_run_wallet'] < min_dry_run_wallet:
logger.info('Dry run wallet was below 1 billion; '
'raising it to 1 billion to avoid false positives.')
config['dry_run_wallet'] = min_dry_run_wallet
# enforce cache to be 'none', shift it to 'none' if not already
# (since the default value is 'day')
if config.get('backtest_cache') is None:
config['backtest_cache'] = 'none'
elif config['backtest_cache'] != 'none':
logger.info(f"backtest_cache = "
f"{config['backtest_cache']} detected. "
f"Inside lookahead-analysis it is enforced to be 'none'. "
f"Changed it to 'none'")
config['backtest_cache'] = 'none'
return config
@staticmethod
def initialize_single_lookahead_analysis(config: Config, strategy_obj: Dict[str, Any]):
logger.info(f"Bias test of {Path(strategy_obj['location']).name} started.")
start = time.perf_counter()
current_instance = LookaheadAnalysis(config, strategy_obj)
current_instance.start()
elapsed = time.perf_counter() - start
logger.info(f"Checking look ahead bias via backtests "
f"of {Path(strategy_obj['location']).name} "
f"took {elapsed:.0f} seconds.")
return current_instance
@staticmethod
def start(config: Config):
config = LookaheadAnalysisSubFunctions.calculate_config_overrides(config)
strategy_objs = StrategyResolver.search_all_objects(
config, enum_failed=False, recursive=config.get('recursive_strategy_search', False))
lookaheadAnalysis_instances = []
# unify --strategy and --strategy_list into one list
if not (strategy_list := config.get('strategy_list', [])):
if config.get('strategy') is None:
raise OperationalException(
"No Strategy specified. Please specify a strategy via --strategy or "
"--strategy_list"
)
strategy_list = [config['strategy']]
# only analyze strategies that could be properly loaded.
for strat in strategy_list:
for strategy_obj in strategy_objs:
if strategy_obj['name'] == strat:
lookaheadAnalysis_instances.append(
LookaheadAnalysisSubFunctions.initialize_single_lookahead_analysis(
config, strategy_obj))
break
# report the results
if lookaheadAnalysis_instances:
LookaheadAnalysisSubFunctions.text_table_lookahead_analysis_instances(
config, lookaheadAnalysis_instances)
if config.get('lookahead_analysis_exportfilename') is not None:
LookaheadAnalysisSubFunctions.export_to_csv(config, lookaheadAnalysis_instances)
else:
logger.error("There were no strategies specified neither through "
"--strategy nor through "
"--strategy_list "
"or timeframe was not specified.")

View File

@@ -97,7 +97,7 @@ class Order(ModelBase):
@property
def safe_filled(self) -> float:
return self.filled if self.filled is not None else self.amount or 0.0
return self.filled if self.filled is not None else 0.0
@property
def safe_cost(self) -> float:
@@ -726,7 +726,7 @@ class LocalTrade():
self.stoploss_order_id = None
self.close_rate_requested = self.stop_loss
self.exit_reason = ExitType.STOPLOSS_ON_EXCHANGE.value
if self.is_open:
if self.is_open and order.safe_filled > 0:
logger.info(f'{order.order_type.upper()} is hit for {self}.')
else:
raise ValueError(f'Unknown order type: {order.order_type}')
@@ -1423,7 +1423,10 @@ class Trade(ModelBase, LocalTrade):
e.g. `(trade_filter=Trade.id == trade_id)`
:return: unsorted query object
"""
return Trade.session.scalars(Trade.get_trades_query(trade_filter, include_orders))
query = Trade.get_trades_query(trade_filter, include_orders)
# This should remain split: if use_db is False, the session is not available
# and the line above would raise an exception.
return Trade.session.scalars(query)
@staticmethod
def get_open_order_trades() -> List['Trade']:

View File

@@ -34,7 +34,7 @@ class FreqaiModelResolver(IResolver):
Load the custom class from config parameter
:param config: configuration dictionary
"""
disallowed_models = ["BaseRegressionModel", "BaseTensorFlowModel"]
disallowed_models = ["BaseRegressionModel"]
freqaimodel_name = config.get("freqaimodel")
if not freqaimodel_name:

View File

@@ -70,9 +70,9 @@ def __run_pairlist(job_id: str, config_loc: Config):
except (OperationalException, Exception) as e:
logger.exception(e)
ApiBG.jobs[job_id]['error'] = str(e)
ApiBG.jobs[job_id]['status'] = 'failed'
finally:
ApiBG.jobs[job_id]['is_running'] = False
ApiBG.jobs[job_id]['status'] = 'failed'
ApiBG.pairlist_running = False

View File

@@ -534,10 +534,10 @@ class Telegram(RPCHandler):
if order_nr == 1:
lines.append(f"*{wording} #{order_nr}:*")
lines.append(
f"*Amount:* {cur_entry_amount} "
f"*Amount:* {cur_entry_amount:.8g} "
f"({round_coin_value(order['cost'], quote_currency)})"
)
lines.append(f"*Average Price:* {cur_entry_average}")
lines.append(f"*Average Price:* {cur_entry_average:.8g}")
else:
sum_stake = 0
sum_amount = 0
@@ -560,9 +560,9 @@ class Telegram(RPCHandler):
if is_open:
lines.append("({})".format(dt_humanize(order["order_filled_date"],
granularity=["day", "hour", "minute"])))
lines.append(f"*Amount:* {cur_entry_amount} "
lines.append(f"*Amount:* {cur_entry_amount:.8g} "
f"({round_coin_value(order['cost'], quote_currency)})")
lines.append(f"*Average {wording} Price:* {cur_entry_average} "
lines.append(f"*Average {wording} Price:* {cur_entry_average:.8g} "
f"({price_to_1st_entry:.2%} from 1st entry Rate)")
lines.append(f"*Order filled:* {order['order_filled_date']}")
@@ -633,11 +633,11 @@ class Telegram(RPCHandler):
])
lines.extend([
"*Open Rate:* `{open_rate:.8f}`",
"*Close Rate:* `{close_rate:.8f}`" if r['close_rate'] else "",
"*Open Rate:* `{open_rate:.8g}`",
"*Close Rate:* `{close_rate:.8g}`" if r['close_rate'] else "",
"*Open Date:* `{open_date}`",
"*Close Date:* `{close_date}`" if r['close_date'] else "",
" \n*Current Rate:* `{current_rate:.8f}`" if r['is_open'] else "",
" \n*Current Rate:* `{current_rate:.8g}`" if r['is_open'] else "",
("*Unrealized Profit:* " if r['is_open'] else "*Close Profit: *")
+ "`{profit_ratio:.2%}` `({profit_abs_r})`",
])
@@ -658,9 +658,9 @@ class Telegram(RPCHandler):
"`({initial_stop_loss_ratio:.2%})`")
# Adding stoploss and stoploss percentage only if it is not None
lines.append("*Stoploss:* `{stop_loss_abs:.8f}` " +
lines.append("*Stoploss:* `{stop_loss_abs:.8g}` " +
("`({stop_loss_ratio:.2%})`" if r['stop_loss_ratio'] else ""))
lines.append("*Stoploss distance:* `{stoploss_current_dist:.8f}` "
lines.append("*Stoploss distance:* `{stoploss_current_dist:.8g}` "
"`({stoploss_current_dist_ratio:.2%})`")
if r['open_order']:
lines.append(

View File

@@ -1300,7 +1300,7 @@ class IStrategy(ABC, HyperStrategyMixin):
timedout = (order.status == 'open' and order.order_date_utc < timeout_threshold)
if timedout:
return True
time_method = (self.check_exit_timeout if order.side == trade.exit_side
time_method = (self.check_exit_timeout if order.ft_order_side == trade.exit_side
else self.check_entry_timeout)
return strategy_safe_wrapper(time_method,

View File

@@ -3,7 +3,7 @@ import logging
from packaging import version
from sqlalchemy import select
from freqtrade.constants import Config
from freqtrade.constants import DOCS_LINK, Config
from freqtrade.enums.tradingmode import TradingMode
from freqtrade.exceptions import OperationalException
from freqtrade.persistence.pairlock import PairLock
@@ -25,7 +25,7 @@ def migrate_binance_futures_names(config: Config):
if version.parse("2.6.26") > version.parse(ccxt.__version__):
raise OperationalException(
"Please follow the update instructions in the docs "
"(https://www.freqtrade.io/en/latest/updating/) to install a compatible ccxt version.")
f"({DOCS_LINK}/updating/) to install a compatible ccxt version.")
_migrate_binance_futures_db(config)
migrate_binance_futures_data(config)

View File

@@ -22,6 +22,7 @@ nav:
- Web Hook: webhook-config.md
- Data Downloading: data-download.md
- Backtesting: backtesting.md
- Lookahead analysis: lookahead-analysis.md
- Hyperopt: hyperopt.md
- FreqAI:
- Introduction: freqai.md

View File

@@ -9,18 +9,18 @@
coveralls==3.3.1
ruff==0.0.272
mypy==1.3.0
pre-commit==3.3.2
pre-commit==3.3.3
pytest==7.3.2
pytest-asyncio==0.21.0
pytest-cov==4.1.0
pytest-mock==3.10.0
pytest-mock==3.11.1
pytest-random-order==1.1.0
isort==5.12.0
# For datetime mocking
time-machine==2.9.0
time-machine==2.10.0
# Convert jupyter notebooks to markdown documents
nbconvert==7.4.0
nbconvert==7.5.0
# mypy types
types-cachetools==5.3.0.5

View File

@@ -5,7 +5,7 @@
torch==2.0.1
#until these branches will be released we can use this
gymnasium==0.28.1
stable_baselines3==2.0.0a10
stable_baselines3==2.0.0a13
sb3_contrib>=2.0.0a9
# Progress bar for stable-baselines3 and sb3-contrib
tqdm==4.65.0

View File

@@ -5,8 +5,8 @@
# Required for freqai
scikit-learn==1.1.3
joblib==1.2.0
catboost==1.1.1; sys_platform == 'darwin' and python_version < '3.9'
catboost==1.2; 'arm' not in platform_machine and (sys_platform != 'darwin' or python_version >= '3.9')
catboost==1.2; 'arm' not in platform_machine
lightgbm==3.3.5
xgboost==1.7.5
xgboost==1.7.6
tensorboard==2.13.0
datasieve==0.1.5

View File

@@ -5,4 +5,4 @@
scipy==1.10.1
scikit-learn==1.1.3
scikit-optimize==0.9.0
filelock==3.12.1
filelock==3.12.2

View File

@@ -2,7 +2,7 @@ numpy==1.24.3
pandas==2.0.2
pandas-ta==0.3.14b
ccxt==3.1.34
ccxt==3.1.44
cryptography==41.0.1; platform_machine != 'armv7l'
cryptography==40.0.1; platform_machine == 'armv7l'
aiohttp==3.8.4
@@ -23,7 +23,7 @@ jinja2==3.1.2
tables==3.8.0
blosc==1.11.1
joblib==1.2.0
rich==13.4.1
rich==13.4.2
pyarrow==12.0.0; platform_machine != 'armv7l'
# find first, C search in arrays
@@ -38,7 +38,7 @@ orjson==3.9.1
sdnotify==0.3.2
# API Server
fastapi==0.96.0
fastapi==0.97.0
pydantic==1.10.9
uvicorn==0.22.0
pyjwt==2.7.0

View File

@@ -5,7 +5,7 @@ from setuptools import setup
plot = ['plotly>=4.0']
hyperopt = [
'scipy',
'scikit-learn',
'scikit-learn<=1.1.3',
'scikit-optimize>=0.7.0',
'filelock',
]
@@ -16,7 +16,8 @@ freqai = [
'catboost; platform_machine != "aarch64"',
'lightgbm',
'xgboost',
'tensorboard'
'tensorboard',
'datasieve>=0.1.5'
]
freqai_rl = [

View File

@@ -641,7 +641,7 @@ def test_get_ui_download_url_direct(mocker):
def test_download_data_keyboardInterrupt(mocker, markets):
dl_mock = mocker.patch('freqtrade.commands.data_commands.refresh_backtest_ohlcv_data',
dl_mock = mocker.patch('freqtrade.commands.data_commands.download_data_main',
MagicMock(side_effect=KeyboardInterrupt))
patch_exchange(mocker)
mocker.patch(f'{EXMS}.markets', PropertyMock(return_value=markets))
@@ -660,7 +660,7 @@ def test_download_data_keyboardInterrupt(mocker, markets):
def test_download_data_timerange(mocker, markets):
dl_mock = mocker.patch('freqtrade.commands.data_commands.refresh_backtest_ohlcv_data',
dl_mock = mocker.patch('freqtrade.data.history.history_utils.refresh_backtest_ohlcv_data',
MagicMock(return_value=["ETH/BTC", "XRP/BTC"]))
patch_exchange(mocker)
mocker.patch(f'{EXMS}.markets', PropertyMock(return_value=markets))
@@ -708,10 +708,10 @@ def test_download_data_timerange(mocker, markets):
def test_download_data_no_markets(mocker, caplog):
dl_mock = mocker.patch('freqtrade.commands.data_commands.refresh_backtest_ohlcv_data',
dl_mock = mocker.patch('freqtrade.data.history.history_utils.refresh_backtest_ohlcv_data',
MagicMock(return_value=["ETH/BTC", "XRP/BTC"]))
patch_exchange(mocker, id='binance')
mocker.patch(f'{EXMS}.markets', PropertyMock(return_value={}))
mocker.patch(f'{EXMS}.get_markets', return_value={})
args = [
"download-data",
"--exchange", "binance",
@@ -723,11 +723,11 @@ def test_download_data_no_markets(mocker, caplog):
assert log_has("Pairs [ETH/BTC,XRP/BTC] not available on exchange Binance.", caplog)
def test_download_data_no_exchange(mocker, caplog):
mocker.patch('freqtrade.commands.data_commands.refresh_backtest_ohlcv_data',
def test_download_data_no_exchange(mocker):
mocker.patch('freqtrade.data.history.history_utils.refresh_backtest_ohlcv_data',
MagicMock(return_value=["ETH/BTC", "XRP/BTC"]))
patch_exchange(mocker)
mocker.patch(f'{EXMS}.markets', PropertyMock(return_value={}))
mocker.patch(f'{EXMS}.get_markets', return_value={})
args = [
"download-data",
]
@@ -740,7 +740,7 @@ def test_download_data_no_exchange(mocker, caplog):
def test_download_data_no_pairs(mocker):
mocker.patch('freqtrade.commands.data_commands.refresh_backtest_ohlcv_data',
mocker.patch('freqtrade.data.history.history_utils.refresh_backtest_ohlcv_data',
MagicMock(return_value=["ETH/BTC", "XRP/BTC"]))
patch_exchange(mocker)
mocker.patch(f'{EXMS}.markets', PropertyMock(return_value={}))
@@ -758,7 +758,7 @@ def test_download_data_no_pairs(mocker):
def test_download_data_all_pairs(mocker, markets):
dl_mock = mocker.patch('freqtrade.commands.data_commands.refresh_backtest_ohlcv_data',
dl_mock = mocker.patch('freqtrade.data.history.history_utils.refresh_backtest_ohlcv_data',
MagicMock(return_value=["ETH/BTC", "XRP/BTC"]))
patch_exchange(mocker)
mocker.patch(f'{EXMS}.markets', PropertyMock(return_value=markets))
@@ -792,13 +792,13 @@ def test_download_data_all_pairs(mocker, markets):
assert set(dl_mock.call_args_list[0][1]['pairs']) == expected
def test_download_data_trades(mocker, caplog):
dl_mock = mocker.patch('freqtrade.commands.data_commands.refresh_backtest_trades_data',
def test_download_data_trades(mocker):
dl_mock = mocker.patch('freqtrade.data.history.history_utils.refresh_backtest_trades_data',
MagicMock(return_value=[]))
convert_mock = mocker.patch('freqtrade.commands.data_commands.convert_trades_to_ohlcv',
convert_mock = mocker.patch('freqtrade.data.history.history_utils.convert_trades_to_ohlcv',
MagicMock(return_value=[]))
patch_exchange(mocker)
mocker.patch(f'{EXMS}.markets', PropertyMock(return_value={}))
mocker.patch(f'{EXMS}.get_markets', return_value={})
args = [
"download-data",
"--exchange", "kraken",
@@ -829,7 +829,7 @@ def test_download_data_trades(mocker, caplog):
def test_download_data_data_invalid(mocker):
patch_exchange(mocker, id="kraken")
mocker.patch(f'{EXMS}.markets', PropertyMock(return_value={}))
mocker.patch(f'{EXMS}.get_markets', return_value={})
args = [
"download-data",
"--exchange", "kraken",

View File

@@ -0,0 +1,96 @@
from unittest.mock import MagicMock, PropertyMock
import pytest
from freqtrade.configuration.config_setup import setup_utils_configuration
from freqtrade.data.history.history_utils import download_data_main
from freqtrade.enums import RunMode
from freqtrade.exceptions import OperationalException
from tests.conftest import EXMS, log_has, patch_exchange
def test_download_data_main_no_markets(mocker, caplog):
dl_mock = mocker.patch('freqtrade.data.history.history_utils.refresh_backtest_ohlcv_data',
MagicMock(return_value=["ETH/BTC", "XRP/BTC"]))
patch_exchange(mocker, id='binance')
mocker.patch(f'{EXMS}.get_markets', return_value={})
config = setup_utils_configuration({"exchange": "binance"}, RunMode.UTIL_EXCHANGE)
config.update({
"days": 20,
"pairs": ["ETH/BTC", "XRP/BTC"],
"timeframes": ["5m", "1h"]
})
download_data_main(config)
assert dl_mock.call_args[1]['timerange'].starttype == "date"
assert log_has("Pairs [ETH/BTC,XRP/BTC] not available on exchange Binance.", caplog)
def test_download_data_main_all_pairs(mocker, markets):
dl_mock = mocker.patch('freqtrade.data.history.history_utils.refresh_backtest_ohlcv_data',
MagicMock(return_value=["ETH/BTC", "XRP/BTC"]))
patch_exchange(mocker)
mocker.patch(f'{EXMS}.markets', PropertyMock(return_value=markets))
config = setup_utils_configuration({"exchange": "binance"}, RunMode.UTIL_EXCHANGE)
config.update({
"pairs": [".*/USDT"],
"timeframes": ["5m", "1h"]
})
download_data_main(config)
expected = set(['ETH/USDT', 'XRP/USDT', 'NEO/USDT', 'TKN/USDT'])
assert set(dl_mock.call_args_list[0][1]['pairs']) == expected
assert dl_mock.call_count == 1
dl_mock.reset_mock()
config.update({
"pairs": [".*/USDT"],
"timeframes": ["5m", "1h"],
"include_inactive": True
})
download_data_main(config)
expected = set(['ETH/USDT', 'LTC/USDT', 'XRP/USDT', 'NEO/USDT', 'TKN/USDT'])
assert set(dl_mock.call_args_list[0][1]['pairs']) == expected
def test_download_data_main_trades(mocker):
dl_mock = mocker.patch('freqtrade.data.history.history_utils.refresh_backtest_trades_data',
MagicMock(return_value=[]))
convert_mock = mocker.patch('freqtrade.data.history.history_utils.convert_trades_to_ohlcv',
MagicMock(return_value=[]))
patch_exchange(mocker)
mocker.patch(f'{EXMS}.get_markets', return_value={})
config = setup_utils_configuration({"exchange": "binance"}, RunMode.UTIL_EXCHANGE)
config.update({
"days": 20,
"pairs": ["ETH/BTC", "XRP/BTC"],
"timeframes": ["5m", "1h"],
"download_trades": True,
})
download_data_main(config)
assert dl_mock.call_args[1]['timerange'].starttype == "date"
assert dl_mock.call_count == 1
assert convert_mock.call_count == 1
config.update({
"download_trades": True,
"trading_mode": "futures",
})
with pytest.raises(OperationalException,
match="Trade download not supported for futures."):
download_data_main(config)
def test_download_data_main_data_invalid(mocker):
patch_exchange(mocker, id="kraken")
mocker.patch(f'{EXMS}.get_markets', return_value={})
config = setup_utils_configuration({"exchange": "kraken"}, RunMode.UTIL_EXCHANGE)
config.update({
"days": 20,
"pairs": ["ETH/BTC", "XRP/BTC"],
"timeframes": ["5m", "1h"],
})
with pytest.raises(OperationalException, match=r"Historic klines not available for .*"):
download_data_main(config)

View File

@@ -1,6 +1,7 @@
# pragma pylint: disable=missing-docstring, protected-access, C0103
import json
import logging
import uuid
from pathlib import Path
from shutil import copyfile
@@ -503,9 +504,10 @@ def test_validate_backtest_data(default_conf, mocker, caplog, testdatadir) -> No
])
def test_refresh_backtest_ohlcv_data(
mocker, default_conf, markets, caplog, testdatadir, trademode, callcount):
dl_mock = mocker.patch('freqtrade.data.history.history_utils._download_pair_history',
MagicMock())
caplog.set_level(logging.DEBUG)
dl_mock = mocker.patch('freqtrade.data.history.history_utils._download_pair_history')
mocker.patch(f'{EXMS}.markets', PropertyMock(return_value=markets))
mocker.patch.object(Path, "exists", MagicMock(return_value=True))
mocker.patch.object(Path, "unlink", MagicMock())
@@ -520,7 +522,7 @@ def test_refresh_backtest_ohlcv_data(
assert dl_mock.call_count == callcount
assert dl_mock.call_args[1]['timerange'].starttype == 'date'
assert log_has("Downloading pair ETH/BTC, interval 1m.", caplog)
assert log_has_re(r"Downloading pair ETH/BTC, .* interval 1m\.", caplog)
def test_download_data_no_markets(mocker, default_conf, caplog, testdatadir):

View File

@@ -9,9 +9,9 @@ from freqtrade.configuration import TimeRange
from freqtrade.data.dataprovider import DataProvider
from freqtrade.exceptions import OperationalException
from freqtrade.freqai.data_kitchen import FreqaiDataKitchen
from tests.conftest import get_patched_exchange, log_has_re
from tests.conftest import get_patched_exchange
from tests.freqai.conftest import (get_patched_data_kitchen, get_patched_freqai_strategy,
make_data_dictionary, make_unfiltered_dataframe)
make_unfiltered_dataframe)
from tests.freqai.test_freqai_interface import is_mac
@@ -72,68 +72,6 @@ def test_check_if_model_expired(mocker, freqai_conf):
shutil.rmtree(Path(dk.full_path))
def test_use_DBSCAN_to_remove_outliers(mocker, freqai_conf, caplog):
freqai = make_data_dictionary(mocker, freqai_conf)
# freqai_conf['freqai']['feature_parameters'].update({"outlier_protection_percentage": 1})
freqai.dk.use_DBSCAN_to_remove_outliers(predict=False)
assert log_has_re(r"DBSCAN found eps of 1\.7\d\.", caplog)
def test_compute_distances(mocker, freqai_conf):
freqai = make_data_dictionary(mocker, freqai_conf)
freqai_conf['freqai']['feature_parameters'].update({"DI_threshold": 1})
avg_mean_dist = freqai.dk.compute_distances()
assert round(avg_mean_dist, 2) == 1.98
def test_use_SVM_to_remove_outliers_and_outlier_protection(mocker, freqai_conf, caplog):
freqai = make_data_dictionary(mocker, freqai_conf)
freqai_conf['freqai']['feature_parameters'].update({"outlier_protection_percentage": 0.1})
freqai.dk.use_SVM_to_remove_outliers(predict=False)
assert log_has_re(
"SVM detected 7.83%",
caplog,
)
def test_compute_inlier_metric(mocker, freqai_conf, caplog):
freqai = make_data_dictionary(mocker, freqai_conf)
freqai_conf['freqai']['feature_parameters'].update({"inlier_metric_window": 10})
freqai.dk.compute_inlier_metric(set_='train')
assert log_has_re(
"Inlier metric computed and added to features.",
caplog,
)
def test_add_noise_to_training_features(mocker, freqai_conf):
freqai = make_data_dictionary(mocker, freqai_conf)
freqai_conf['freqai']['feature_parameters'].update({"noise_standard_deviation": 0.1})
freqai.dk.add_noise_to_training_features()
def test_remove_beginning_points_from_data_dict(mocker, freqai_conf):
freqai = make_data_dictionary(mocker, freqai_conf)
freqai.dk.remove_beginning_points_from_data_dict(set_='train')
def test_principal_component_analysis(mocker, freqai_conf, caplog):
freqai = make_data_dictionary(mocker, freqai_conf)
freqai.dk.principal_component_analysis()
assert log_has_re(
"reduced feature dimension by",
caplog,
)
def test_normalize_data(mocker, freqai_conf):
freqai = make_data_dictionary(mocker, freqai_conf)
data_dict = freqai.dk.data_dictionary
freqai.dk.normalize_data(data_dict)
assert any('_max' in entry for entry in freqai.dk.data.keys())
assert any('_min' in entry for entry in freqai.dk.data.keys())
def test_filter_features(mocker, freqai_conf):
freqai, unfiltered_dataframe = make_unfiltered_dataframe(mocker, freqai_conf)
freqai.dk.find_features(unfiltered_dataframe)

View File

@@ -1,3 +1,4 @@
import logging
import platform
import shutil
import sys
@@ -37,21 +38,22 @@ def can_run_model(model: str) -> None:
pytest.skip("Reinforcement learning / PyTorch module not available on intel based Mac OS.")
@pytest.mark.parametrize('model, pca, dbscan, float32, can_short, shuffle, buffer', [
('LightGBMRegressor', True, False, True, True, False, 0),
('XGBoostRegressor', False, True, False, True, False, 10),
('XGBoostRFRegressor', False, False, False, True, False, 0),
('CatboostRegressor', False, False, False, True, True, 0),
('PyTorchMLPRegressor', False, False, False, False, False, 0),
('PyTorchTransformerRegressor', False, False, False, False, False, 0),
('ReinforcementLearner', False, True, False, True, False, 0),
('ReinforcementLearner_multiproc', False, False, False, True, False, 0),
('ReinforcementLearner_test_3ac', False, False, False, False, False, 0),
('ReinforcementLearner_test_3ac', False, False, False, True, False, 0),
('ReinforcementLearner_test_4ac', False, False, False, True, False, 0),
@pytest.mark.parametrize('model, pca, dbscan, float32, can_short, shuffle, buffer, noise', [
('LightGBMRegressor', True, False, True, True, False, 0, 0),
('XGBoostRegressor', False, True, False, True, False, 10, 0.05),
('XGBoostRFRegressor', False, False, False, True, False, 0, 0),
('CatboostRegressor', False, False, False, True, True, 0, 0),
('PyTorchMLPRegressor', False, False, False, False, False, 0, 0),
('PyTorchTransformerRegressor', False, False, False, False, False, 0, 0),
('ReinforcementLearner', False, True, False, True, False, 0, 0),
('ReinforcementLearner_multiproc', False, False, False, True, False, 0, 0),
('ReinforcementLearner_test_3ac', False, False, False, False, False, 0, 0),
('ReinforcementLearner_test_3ac', False, False, False, True, False, 0, 0),
('ReinforcementLearner_test_4ac', False, False, False, True, False, 0, 0),
])
def test_extract_data_and_train_model_Standard(mocker, freqai_conf, model, pca,
dbscan, float32, can_short, shuffle, buffer):
dbscan, float32, can_short, shuffle,
buffer, noise):
can_run_model(model)
@@ -68,12 +70,14 @@ def test_extract_data_and_train_model_Standard(mocker, freqai_conf, model, pca,
freqai_conf.update({"reduce_df_footprint": float32})
freqai_conf['freqai']['feature_parameters'].update({"shuffle_after_split": shuffle})
freqai_conf['freqai']['feature_parameters'].update({"buffer_train_data_candles": buffer})
freqai_conf['freqai']['feature_parameters'].update({"noise_standard_deviation": noise})
if 'ReinforcementLearner' in model:
model_save_ext = 'zip'
freqai_conf = make_rl_config(freqai_conf)
# test the RL guardrails
freqai_conf['freqai']['feature_parameters'].update({"use_SVM_to_remove_outliers": True})
freqai_conf['freqai']['feature_parameters'].update({"DI_threshold": 2})
freqai_conf['freqai']['data_split_parameters'].update({'shuffle': True})
if 'test_3ac' in model or 'test_4ac' in model:
@@ -162,7 +166,6 @@ def test_extract_data_and_train_model_MultiTargets(mocker, freqai_conf, model, s
assert Path(freqai.dk.data_path / f"{freqai.dk.model_filename}_model.joblib").is_file()
assert Path(freqai.dk.data_path / f"{freqai.dk.model_filename}_metadata.json").is_file()
assert Path(freqai.dk.data_path / f"{freqai.dk.model_filename}_trained_df.pkl").is_file()
assert Path(freqai.dk.data_path / f"{freqai.dk.model_filename}_svm_model.joblib").is_file()
assert len(freqai.dk.data['training_features_list']) == 14
shutil.rmtree(Path(freqai.dk.full_path))
@@ -218,7 +221,6 @@ def test_extract_data_and_train_model_Classifiers(mocker, freqai_conf, model):
f"{freqai.dk.model_filename}_model{model_file_extension}").exists()
assert Path(freqai.dk.data_path / f"{freqai.dk.model_filename}_metadata.json").exists()
assert Path(freqai.dk.data_path / f"{freqai.dk.model_filename}_trained_df.pkl").exists()
assert Path(freqai.dk.data_path / f"{freqai.dk.model_filename}_svm_model.joblib").exists()
shutil.rmtree(Path(freqai.dk.full_path))
@@ -283,9 +285,6 @@ def test_start_backtesting(mocker, freqai_conf, model, num_files, strat, caplog)
_, base_df = freqai.dd.get_base_and_corr_dataframes(sub_timerange, "LTC/BTC", freqai.dk)
df = base_df[freqai_conf["timeframe"]]
for i in range(5):
df[f'%-constant_{i}'] = i
metadata = {"pair": "LTC/BTC"}
freqai.dk.set_paths('LTC/BTC', None)
freqai.start_backtesting(df, metadata, freqai.dk, strategy)
@@ -293,14 +292,6 @@ def test_start_backtesting(mocker, freqai_conf, model, num_files, strat, caplog)
assert len(model_folders) == num_files
Trade.use_db = True
assert log_has_re(
"Removed features ",
caplog,
)
assert log_has_re(
"Removed 5 features from prediction features, ",
caplog,
)
Backtesting.cleanup()
shutil.rmtree(Path(freqai.dk.full_path))
@@ -425,36 +416,6 @@ def test_backtesting_fit_live_predictions(mocker, freqai_conf, caplog):
shutil.rmtree(Path(freqai.dk.full_path))
def test_principal_component_analysis(mocker, freqai_conf):
freqai_conf.update({"timerange": "20180110-20180130"})
freqai_conf.get("freqai", {}).get("feature_parameters", {}).update(
{"princpial_component_analysis": "true"})
strategy = get_patched_freqai_strategy(mocker, freqai_conf)
exchange = get_patched_exchange(mocker, freqai_conf)
strategy.dp = DataProvider(freqai_conf, exchange)
strategy.freqai_info = freqai_conf.get("freqai", {})
freqai = strategy.freqai
freqai.live = True
freqai.dk = FreqaiDataKitchen(freqai_conf)
freqai.dk.live = True
timerange = TimeRange.parse_timerange("20180110-20180130")
freqai.dd.load_all_pair_histories(timerange, freqai.dk)
freqai.dd.pair_dict = MagicMock()
data_load_timerange = TimeRange.parse_timerange("20180110-20180130")
new_timerange = TimeRange.parse_timerange("20180120-20180130")
freqai.dk.set_paths('ADA/BTC', None)
freqai.extract_data_and_train_model(
new_timerange, "ADA/BTC", strategy, freqai.dk, data_load_timerange)
assert Path(freqai.dk.data_path / f"{freqai.dk.model_filename}_pca_object.pkl")
shutil.rmtree(Path(freqai.dk.full_path))
def test_plot_feature_importance(mocker, freqai_conf):
from freqtrade.freqai.utils import plot_feature_importance
@@ -540,6 +501,7 @@ def test_get_required_data_timerange(mocker, freqai_conf):
def test_download_all_data_for_training(mocker, freqai_conf, caplog, tmpdir):
caplog.set_level(logging.DEBUG)
strategy = get_patched_freqai_strategy(mocker, freqai_conf)
exchange = get_patched_exchange(mocker, freqai_conf)
pairlist = PairListManager(exchange, freqai_conf)

View File

@@ -0,0 +1,366 @@
# pragma pylint: disable=missing-docstring, W0212, line-too-long, C0103, unused-argument
from copy import deepcopy
from pathlib import Path
from unittest.mock import MagicMock, PropertyMock
import pytest
from freqtrade.commands.optimize_commands import start_lookahead_analysis
from freqtrade.data.history import get_timerange
from freqtrade.exceptions import OperationalException
from freqtrade.optimize.lookahead_analysis import Analysis, LookaheadAnalysis
from freqtrade.optimize.lookahead_analysis_helpers import LookaheadAnalysisSubFunctions
from tests.conftest import EXMS, get_args, log_has_re, patch_exchange
@pytest.fixture
def lookahead_conf(default_conf_usdt):
default_conf_usdt['minimum_trade_amount'] = 10
default_conf_usdt['targeted_trade_amount'] = 20
default_conf_usdt['strategy_path'] = str(
Path(__file__).parent.parent / "strategy/strats/lookahead_bias")
default_conf_usdt['strategy'] = 'strategy_test_v3_with_lookahead_bias'
default_conf_usdt['max_open_trades'] = 1
default_conf_usdt['dry_run_wallet'] = 1000000000
default_conf_usdt['pairs'] = ['UNITTEST/USDT']
return default_conf_usdt
def test_start_lookahead_analysis(mocker):
single_mock = MagicMock()
text_table_mock = MagicMock()
mocker.patch.multiple(
'freqtrade.optimize.lookahead_analysis_helpers.LookaheadAnalysisSubFunctions',
initialize_single_lookahead_analysis=single_mock,
text_table_lookahead_analysis_instances=text_table_mock,
)
args = [
"lookahead-analysis",
"--strategy",
"strategy_test_v3_with_lookahead_bias",
"--strategy-path",
str(Path(__file__).parent.parent / "strategy/strats/lookahead_bias"),
"--pairs",
"UNITTEST/BTC",
"--max-open-trades",
"1"
]
pargs = get_args(args)
pargs['config'] = None
start_lookahead_analysis(pargs)
assert single_mock.call_count == 1
assert text_table_mock.call_count == 1
single_mock.reset_mock()
# Test invalid config
args = [
"lookahead-analysis",
"--strategy",
"strategy_test_v3_with_lookahead_bias",
"--strategy-path",
str(Path(__file__).parent.parent / "strategy/strats/lookahead_bias"),
"--targeted-trade-amount",
"10",
"--minimum-trade-amount",
"20",
]
pargs = get_args(args)
pargs['config'] = None
with pytest.raises(OperationalException,
match=r"Targeted trade amount can't be smaller than minimum trade amount.*"):
start_lookahead_analysis(pargs)
def test_lookahead_helper_invalid_config(lookahead_conf) -> None:
conf = deepcopy(lookahead_conf)
conf['targeted_trade_amount'] = 10
conf['minimum_trade_amount'] = 40
with pytest.raises(OperationalException,
match=r"Targeted trade amount can't be smaller than minimum trade amount.*"):
LookaheadAnalysisSubFunctions.start(conf)
def test_lookahead_helper_no_strategy_defined(lookahead_conf):
conf = deepcopy(lookahead_conf)
conf['pairs'] = ['UNITTEST/USDT']
del conf['strategy']
with pytest.raises(OperationalException,
match=r"No Strategy specified"):
LookaheadAnalysisSubFunctions.start(conf)
def test_lookahead_helper_start(lookahead_conf, mocker) -> None:
single_mock = MagicMock()
text_table_mock = MagicMock()
mocker.patch.multiple(
'freqtrade.optimize.lookahead_analysis_helpers.LookaheadAnalysisSubFunctions',
initialize_single_lookahead_analysis=single_mock,
text_table_lookahead_analysis_instances=text_table_mock,
)
LookaheadAnalysisSubFunctions.start(lookahead_conf)
assert single_mock.call_count == 1
assert text_table_mock.call_count == 1
single_mock.reset_mock()
text_table_mock.reset_mock()
def test_lookahead_helper_text_table_lookahead_analysis_instances(lookahead_conf):
analysis = Analysis()
analysis.has_bias = True
analysis.total_signals = 5
analysis.false_entry_signals = 4
analysis.false_exit_signals = 3
strategy_obj = {
'name': "strategy_test_v3_with_lookahead_bias",
'location': Path(lookahead_conf['strategy_path'], f"{lookahead_conf['strategy']}.py")
}
instance = LookaheadAnalysis(lookahead_conf, strategy_obj)
instance.current_analysis = analysis
table, headers, data = (LookaheadAnalysisSubFunctions.
text_table_lookahead_analysis_instances(lookahead_conf, [instance]))
# check row contents for a run that has too few signals
assert data[0][0] == 'strategy_test_v3_with_lookahead_bias.py'
assert data[0][1] == 'strategy_test_v3_with_lookahead_bias'
assert 'too few trades' in data[0][2]
assert len(data[0]) == 3
# now check for an error which occurred after enough trades
analysis.total_signals = 12
analysis.false_entry_signals = 11
analysis.false_exit_signals = 10
instance = LookaheadAnalysis(lookahead_conf, strategy_obj)
instance.current_analysis = analysis
table, headers, data = (LookaheadAnalysisSubFunctions.
text_table_lookahead_analysis_instances(lookahead_conf, [instance]))
assert data[0][2].__contains__("error")
# adjust the instance so it no longer reports an error
instance.failed_bias_check = False
table, headers, data = (LookaheadAnalysisSubFunctions.
text_table_lookahead_analysis_instances(lookahead_conf, [instance]))
assert data[0][0] == 'strategy_test_v3_with_lookahead_bias.py'
assert data[0][1] == 'strategy_test_v3_with_lookahead_bias'
assert data[0][2] # True
assert data[0][3] == 12
assert data[0][4] == 11
assert data[0][5] == 10
assert data[0][6] == ''
analysis.false_indicators.append('falseIndicator1')
analysis.false_indicators.append('falseIndicator2')
table, headers, data = (LookaheadAnalysisSubFunctions.
text_table_lookahead_analysis_instances(lookahead_conf, [instance]))
assert data[0][6] == 'falseIndicator1, falseIndicator2'
# check the number of returned rows
assert len(data) == 1
# check the row count when passing multiple instances
table, headers, data = (LookaheadAnalysisSubFunctions.text_table_lookahead_analysis_instances(
lookahead_conf, [instance, instance, instance]))
assert len(data) == 3
def test_lookahead_helper_export_to_csv(lookahead_conf):
import pandas as pd
lookahead_conf['lookahead_analysis_exportfilename'] = "temp_csv_lookahead_analysis.csv"
# remove the export file if it exists, so the test starts from a clean state
# (this is repeated at the end of the test to clean up)
if Path(lookahead_conf['lookahead_analysis_exportfilename']).exists():
Path(lookahead_conf['lookahead_analysis_exportfilename']).unlink()
# 1st check: create a new file and verify its contents
analysis1 = Analysis()
analysis1.has_bias = True
analysis1.total_signals = 12
analysis1.false_entry_signals = 11
analysis1.false_exit_signals = 10
analysis1.false_indicators.append('falseIndicator1')
analysis1.false_indicators.append('falseIndicator2')
lookahead_conf['lookahead_analysis_exportfilename'] = "temp_csv_lookahead_analysis.csv"
strategy_obj1 = {
'name': "strat1",
'location': Path("file1.py"),
}
instance1 = LookaheadAnalysis(lookahead_conf, strategy_obj1)
instance1.failed_bias_check = False
instance1.current_analysis = analysis1
LookaheadAnalysisSubFunctions.export_to_csv(lookahead_conf, [instance1])
saved_data1 = pd.read_csv(lookahead_conf['lookahead_analysis_exportfilename'])
expected_values1 = [
[
'file1.py', 'strat1', True,
12, 11, 10,
"falseIndicator1,falseIndicator2"
],
]
expected_columns = ['filename', 'strategy', 'has_bias',
'total_signals', 'biased_entry_signals', 'biased_exit_signals',
'biased_indicators']
expected_data1 = pd.DataFrame(expected_values1, columns=expected_columns)
assert Path(lookahead_conf['lookahead_analysis_exportfilename']).exists()
assert expected_data1.equals(saved_data1)
# 2nd check: update the same strategy (which internally changed or is being retested)
expected_values2 = [
[
'file1.py', 'strat1', False,
22, 21, 20,
"falseIndicator3,falseIndicator4"
],
]
expected_data2 = pd.DataFrame(expected_values2, columns=expected_columns)
analysis2 = Analysis()
analysis2.has_bias = False
analysis2.total_signals = 22
analysis2.false_entry_signals = 21
analysis2.false_exit_signals = 20
analysis2.false_indicators.append('falseIndicator3')
analysis2.false_indicators.append('falseIndicator4')
strategy_obj2 = {
'name': "strat1",
'location': Path("file1.py"),
}
instance2 = LookaheadAnalysis(lookahead_conf, strategy_obj2)
instance2.failed_bias_check = False
instance2.current_analysis = analysis2
LookaheadAnalysisSubFunctions.export_to_csv(lookahead_conf, [instance2])
saved_data2 = pd.read_csv(lookahead_conf['lookahead_analysis_exportfilename'])
assert expected_data2.equals(saved_data2)
# 3rd check: now we add a new row to an already existing file
expected_values3 = [
[
'file1.py', 'strat1', False,
22, 21, 20,
"falseIndicator3,falseIndicator4"
],
[
'file3.py', 'strat3', True,
32, 31, 30, "falseIndicator5,falseIndicator6"
],
]
expected_data3 = pd.DataFrame(expected_values3, columns=expected_columns)
analysis3 = Analysis()
analysis3.has_bias = True
analysis3.total_signals = 32
analysis3.false_entry_signals = 31
analysis3.false_exit_signals = 30
analysis3.false_indicators.append('falseIndicator5')
analysis3.false_indicators.append('falseIndicator6')
lookahead_conf['lookahead_analysis_exportfilename'] = "temp_csv_lookahead_analysis.csv"
strategy_obj3 = {
'name': "strat3",
'location': Path("file3.py"),
}
instance3 = LookaheadAnalysis(lookahead_conf, strategy_obj3)
instance3.failed_bias_check = False
instance3.current_analysis = analysis3
LookaheadAnalysisSubFunctions.export_to_csv(lookahead_conf, [instance3])
saved_data3 = pd.read_csv(lookahead_conf['lookahead_analysis_exportfilename'])
assert expected_data3.equals(saved_data3)
# remove csv file after the test is done
if Path(lookahead_conf['lookahead_analysis_exportfilename']).exists():
Path(lookahead_conf['lookahead_analysis_exportfilename']).unlink()
def test_initialize_single_lookahead_analysis(lookahead_conf, mocker, caplog):
mocker.patch('freqtrade.data.history.get_timerange', get_timerange)
mocker.patch(f'{EXMS}.get_fee', return_value=0.0)
mocker.patch(f'{EXMS}.get_min_pair_stake_amount', return_value=0.00001)
mocker.patch(f'{EXMS}.get_max_pair_stake_amount', return_value=float('inf'))
patch_exchange(mocker)
mocker.patch('freqtrade.plugins.pairlistmanager.PairListManager.whitelist',
PropertyMock(return_value=['UNITTEST/BTC']))
lookahead_conf['pairs'] = ['UNITTEST/USDT']
lookahead_conf['timeframe'] = '5m'
lookahead_conf['timerange'] = '20180119-20180122'
start_mock = mocker.patch('freqtrade.optimize.lookahead_analysis.LookaheadAnalysis.start')
strategy_obj = {
'name': "strategy_test_v3_with_lookahead_bias",
'location': Path(lookahead_conf['strategy_path'], f"{lookahead_conf['strategy']}.py")
}
instance = LookaheadAnalysisSubFunctions.initialize_single_lookahead_analysis(
lookahead_conf, strategy_obj)
assert log_has_re(r"Bias test of .* started\.", caplog)
assert start_mock.call_count == 1
assert instance.strategy_obj['name'] == "strategy_test_v3_with_lookahead_bias"
@pytest.mark.parametrize('scenario', [
'no_bias', 'bias1'
])
def test_biased_strategy(lookahead_conf, mocker, caplog, scenario) -> None:
mocker.patch('freqtrade.data.history.get_timerange', get_timerange)
mocker.patch(f'{EXMS}.get_fee', return_value=0.0)
mocker.patch(f'{EXMS}.get_min_pair_stake_amount', return_value=0.00001)
mocker.patch(f'{EXMS}.get_max_pair_stake_amount', return_value=float('inf'))
patch_exchange(mocker)
mocker.patch('freqtrade.plugins.pairlistmanager.PairListManager.whitelist',
PropertyMock(return_value=['UNITTEST/BTC']))
lookahead_conf['pairs'] = ['UNITTEST/USDT']
lookahead_conf['timeframe'] = '5m'
lookahead_conf['timerange'] = '20180119-20180122'
# Patch the scenario parameter to allow easy selection
mocker.patch('freqtrade.strategy.hyper.HyperStrategyMixin.load_params_from_file',
return_value={
'params': {
"buy": {
"scenario": scenario
}
}
})
strategy_obj = {'name': "strategy_test_v3_with_lookahead_bias"}
instance = LookaheadAnalysis(lookahead_conf, strategy_obj)
instance.start()
# Assert init correct
assert log_has_re(f"Strategy Parameter: scenario = {scenario}", caplog)
# check non-biased strategy
if scenario == "no_bias":
assert not instance.current_analysis.has_bias
# check biased strategy
elif scenario == "bias1":
assert instance.current_analysis.has_bias
def test_config_overrides(lookahead_conf):
lookahead_conf['max_open_trades'] = 0
lookahead_conf['dry_run_wallet'] = 1
lookahead_conf['pairs'] = ['BTC/USDT', 'ETH/USDT', 'SOL/USDT']
lookahead_conf = LookaheadAnalysisSubFunctions.calculate_config_overrides(lookahead_conf)
assert lookahead_conf['dry_run_wallet'] == 1000000000
assert lookahead_conf['max_open_trades'] == 3

View File

@@ -0,0 +1,58 @@
# pragma pylint: disable=missing-docstring, invalid-name, pointless-string-statement
from pandas import DataFrame
from technical.indicators import ichimoku
from freqtrade.strategy import IStrategy
from freqtrade.strategy.parameters import CategoricalParameter
class strategy_test_v3_with_lookahead_bias(IStrategy):
INTERFACE_VERSION = 3
# Minimal ROI designed for the strategy
minimal_roi = {
"40": 0.0,
"30": 0.01,
"20": 0.02,
"0": 0.04
}
# Optimal stoploss designed for the strategy
stoploss = -0.10
# Optimal timeframe for the strategy
timeframe = '5m'
scenario = CategoricalParameter(['no_bias', 'bias1'], default='bias1', space="buy")
# Number of candles the strategy requires before producing valid signals
startup_candle_count: int = 20
def populate_indicators(self, dataframe: DataFrame, metadata: dict) -> DataFrame:
# bias is introduced here
if self.scenario.value != 'no_bias':
ichi = ichimoku(dataframe,
conversion_line_period=20,
base_line_periods=60,
laggin_span=120,
displacement=30)
dataframe['chikou_span'] = ichi['chikou_span']
return dataframe
def populate_entry_trend(self, dataframe: DataFrame, metadata: dict) -> DataFrame:
if self.scenario.value == 'no_bias':
dataframe.loc[dataframe['close'].shift(10) < dataframe['close'], 'enter_long'] = 1
else:
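# shift(-10) references candles 10 steps in the future: deliberate lookahead bias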
dataframe.loc[dataframe['close'].shift(-10) > dataframe['close'], 'enter_long'] = 1
return dataframe
def populate_exit_trend(self, dataframe: DataFrame, metadata: dict) -> DataFrame:
if self.scenario.value == 'no_bias':
dataframe.loc[
dataframe['close'].shift(10) < dataframe['close'], 'exit_long'] = 1
else:
dataframe.loc[
dataframe['close'].shift(-10) > dataframe['close'], 'exit_long'] = 1
return dataframe

View File

@@ -1237,6 +1237,8 @@ def test_handle_stoploss_on_exchange(mocker, default_conf_usdt, fee, caplog, is_
'type': 'stop_loss_limit',
'price': 3,
'average': 2,
'filled': enter_order['amount'],
'remaining': 0,
'amount': enter_order['amount'],
})
mocker.patch(f'{EXMS}.fetch_stoploss_order', stoploss_order_hit)
@@ -3029,8 +3031,8 @@ def test_manage_open_orders_exit_usercustom(
freqtrade.manage_open_orders()
assert cancel_order_mock.call_count == 0
assert rpc_mock.call_count == 1
assert freqtrade.strategy.check_exit_timeout.call_count == 1
assert freqtrade.strategy.check_entry_timeout.call_count == 0
assert freqtrade.strategy.check_exit_timeout.call_count == (0 if is_short else 1)
assert freqtrade.strategy.check_entry_timeout.call_count == (1 if is_short else 0)
freqtrade.strategy.check_exit_timeout = MagicMock(side_effect=KeyError)
freqtrade.strategy.check_entry_timeout = MagicMock(side_effect=KeyError)
@@ -3038,8 +3040,8 @@ def test_manage_open_orders_exit_usercustom(
freqtrade.manage_open_orders()
assert cancel_order_mock.call_count == 0
assert rpc_mock.call_count == 1
assert freqtrade.strategy.check_exit_timeout.call_count == 1
assert freqtrade.strategy.check_entry_timeout.call_count == 0
assert freqtrade.strategy.check_exit_timeout.call_count == (0 if is_short else 1)
assert freqtrade.strategy.check_entry_timeout.call_count == (1 if is_short else 0)
# Return True - sells!
freqtrade.strategy.check_exit_timeout = MagicMock(return_value=True)
@@ -3047,8 +3049,8 @@ def test_manage_open_orders_exit_usercustom(
freqtrade.manage_open_orders()
assert cancel_order_mock.call_count == 1
assert rpc_mock.call_count == 2
assert freqtrade.strategy.check_exit_timeout.call_count == 1
assert freqtrade.strategy.check_entry_timeout.call_count == 0
assert freqtrade.strategy.check_exit_timeout.call_count == (0 if is_short else 1)
assert freqtrade.strategy.check_entry_timeout.call_count == (1 if is_short else 0)
trade = Trade.session.scalars(select(Trade)).first()
# cancelling didn't succeed - order-id remains open.
assert trade.open_order_id is not None

View File

@@ -7,6 +7,8 @@ import pytest
from freqtrade.exceptions import OperationalException
from freqtrade.loggers import (FTBufferingHandler, FTStdErrStreamHandler, set_loggers,
setup_logging, setup_logging_pre)
from freqtrade.loggers.set_log_levels import (reduce_verbosity_for_bias_tester,
restore_verbosity_for_bias_tester)
def test_set_loggers() -> None:
@@ -128,3 +130,21 @@ def test_set_loggers_journald_importerror(import_fails):
match=r'You need the cysystemd python package.*'):
setup_logging(config)
logger.handlers = orig_handlers
def test_reduce_verbosity():
setup_logging_pre()
reduce_verbosity_for_bias_tester()
prior_level = logging.getLogger('freqtrade').getEffectiveLevel()
assert logging.getLogger('freqtrade.resolvers').getEffectiveLevel() == logging.WARNING
assert logging.getLogger('freqtrade.strategy.hyper').getEffectiveLevel() == logging.WARNING
# base level wasn't changed
assert logging.getLogger('freqtrade').getEffectiveLevel() == prior_level
restore_verbosity_for_bias_tester()
assert logging.getLogger('freqtrade.resolvers').getEffectiveLevel() == prior_level
assert logging.getLogger('freqtrade.strategy.hyper').getEffectiveLevel() == prior_level
# base level wasn't changed
assert logging.getLogger('freqtrade').getEffectiveLevel() == prior_level