remove unnecessary example in feature_engineering.md

This commit is contained in:
robcaulk
2023-06-10 16:13:28 +02:00
parent 229ee643cd
commit ad8a4897ce

View File

@@ -307,64 +307,7 @@ class MyCoolTransform(BaseTransform):
If you have created your own custom `IFreqaiModel` with a custom `train()`/`predict()` function, *and* you still rely on `data_cleaning_train/predict()`, then you will need to migrate to the new pipeline. If your model does *not* rely on `data_cleaning_train/predict()`, then you do not need to worry about this migration.
The conversion involves first removing `data_cleaning_train/predict()` and replacing them with a `define_data_pipeline()` and `define_label_pipeline()` function to your `IFreqaiModel` class:
```python
class MyCoolFreqaiModel(BaseRegressionModel):
def train(
self, unfiltered_df: DataFrame, pair: str, dk: FreqaiDataKitchen, **kwargs
) -> Any:
# ... your custom stuff
# Remove these lines
# data_dictionary = dk.make_train_test_datasets(features_filtered, labels_filtered)
# self.data_cleaning_train(dk)
# data_dictionary = dk.normalize_data(data_dictionary)
# Add these lines. Now we control the pipeline fit/transform ourselves
dd = dk.make_train_test_datasets(features_filtered, labels_filtered)
dk.feature_pipeline = self.define_data_pipeline(threads=dk.thread_count)
dk.label_pipeline = self.define_label_pipeline(threads=dk.thread_count)
(dd["train_features"],
dd["train_labels"],
dd["train_weights"]) = dk.feature_pipeline.fit_transform(dd["train_features"],
dd["train_labels"],
dd["train_weights"])
(dd["test_features"],
dd["test_labels"],
dd["test_weights"]) = dk.feature_pipeline.transform(dd["test_features"],
dd["test_labels"],
dd["test_weights"])
dd["train_labels"], _, _ = dk.label_pipeline.fit_transform(dd["train_labels"])
dd["test_labels"], _, _ = dk.label_pipeline.transform(dd["test_labels"])
def predict(
self, unfiltered_df: DataFrame, dk: FreqaiDataKitchen, **kwargs
) -> Tuple[DataFrame, npt.NDArray[np.int_]]:
# ... your custom stuff
# Remove these lines:
# self.data_cleaning_predict(dk)
# Add these lines:
dk.data_dictionary["prediction_features"], outliers, _ = dk.feature_pipeline.transform(
dk.data_dictionary["prediction_features"], outlier_check=True)
# Remove this line
# pred_df = dk.denormalize_labels_from_metadata(pred_df)
# Replace with these lines
pred_df, _, _ = dk.label_pipeline.inverse_transform(pred_df)
if self.freqai_info.get("DI_threshold", 0) > 0:
dk.DI_values = dk.feature_pipeline["di"].di_values
else:
dk.DI_values = np.zeros(len(outliers.index))
dk.do_predict = outliers.to_numpy()
More details about the migration can be found [here](strategy_migration.md#freqai---new-data-pipeline).
## Outlier detection