4.4 Strategy Development FAQs
1. Survivorship Bias
Survivorship bias (or survivor bias) arises when a statistical analysis uses a data set that includes only the entities that have "survived" to a specific point in time. The classic example is the analysis of hedge fund returns. Over the past 30 years, many hedge funds (e.g., Long-Term Capital Management (LTCM)) have been hit hard or closed after significant losses. If the returns of the past 30 years are examined using only the hedge funds still operating today, the data set will understate hedge fund risk, because none of the funds that went out of business are included.
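The effect can be illustrated with a small simulation. The sketch below is only an illustration under assumed parameters: the monthly return distribution, the closure threshold `floor`, and the function name `simulate_funds` are all hypothetical and not taken from any real fund data.

```python
import random

def simulate_funds(n_funds=1000, n_months=120, floor=0.6, seed=42):
    """Simulate fund wealth paths; a fund closes once wealth drops below `floor`."""
    rng = random.Random(seed)
    results = []  # one (total_return, survived) pair per fund
    for _fund in range(n_funds):
        wealth, survived = 1.0, True
        for _month in range(n_months):
            # Assumed monthly return: mean 0.5%, standard deviation 5%.
            wealth *= 1.0 + rng.gauss(0.005, 0.05)
            if wealth < floor:
                survived = False  # the fund closes and stops reporting
                break
        results.append((wealth - 1.0, survived))
    return results

def average(values):
    return sum(values) / len(values)

funds = simulate_funds()
all_returns = [r for r, _ in funds]
survivor_returns = [r for r, alive in funds if alive]

print(f"funds launched: {len(all_returns)}, still operating: {len(survivor_returns)}")
print(f"mean total return, all funds:      {average(all_returns):.2%}")
print(f"mean total return, survivors only: {average(survivor_returns):.2%}")
```

Because every closed fund ends with a loss larger than any surviving fund can show, the survivors-only average is always the higher of the two whenever at least one fund has closed.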
2. Omitted Variable Bias
Omitted variable bias (OVB) occurs when a model leaves out one or more important causal variables. The model then compensates for the missing variable by over- or underestimating the effect of one of the included variables, which happens when an included variable is correlated with the omitted one. Identifying which independent variables have predictive power for the dependent variable is difficult. One approach is to search for the subset of variables that explains the most variation in the dependent variable, a method known as best subset selection. Alternatively, one can work with the eigenvectors of the data (linear combinations of the available variables), which is the approach taken in principal component analysis (PCA). One problem with PCA is that it can also overfit the data, and the eigenvectors may not remain stable over time.
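A minimal numerical sketch of the mechanism, using synthetic data: the true coefficients (1.0 on `x1`, 2.0 on `x2`) and the 0.8 dependence of `x2` on `x1` are assumptions chosen for illustration. Regressing `y` on `x1` alone pushes the effect of the missing `x2` onto `x1`.

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
n = 5000
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.8 * a + rng.gauss(0, 0.6) for a in x1]          # correlated with x1
y = [1.0 * a + 2.0 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

# Short regression (x2 omitted): slope = cov(x1, y) / var(x1).
slope_short = cov(x1, y) / cov(x1, x1)

# Long regression on x1 and x2: solve the 2x2 normal equations by Cramer's rule.
s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
s1y, s2y = cov(x1, y), cov(x2, y)
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det

print(f"x1 coefficient with x2 omitted:  {slope_short:.2f}")
print(f"x1 coefficient with x2 included: {b1:.2f} (x2: {b2:.2f})")
```

In this setup the short regression should report a coefficient near 1.0 + 2.0 × 0.8 = 2.6 for `x1`, while the full regression recovers values close to the true 1.0 and 2.0.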
3. Long-Short Bias
Long-short bias arises from an uneven distribution of returns in the data set. For example, if the test set covers a one-sided rising market in which most returns are positive, the optimized strategy will be tilted toward the long side. One way to check for this bias is to convert the test-set returns to log returns, flip their signs, and run the test again.
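The sign-flip check can be sketched as follows. The helper name `mirrored_prices` and the sample prices are hypothetical; the idea is to build a mirror-image price series whose log returns are the negatives of the originals, then rerun the same backtest on it.

```python
import math

def mirrored_prices(prices):
    """Rebuild a price series from the sign-flipped log returns of `prices`."""
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    mirrored = [prices[0]]
    for r in log_returns:
        mirrored.append(mirrored[-1] * math.exp(-r))  # flip the sign of each return
    return mirrored

prices = [100.0, 102.0, 105.0, 104.0, 110.0]   # a mostly rising test set
mirror = mirrored_prices(prices)
print([round(p, 2) for p in mirror])           # a mostly falling mirror image
```

A strategy whose performance collapses on the mirrored series, compared with the original, is likely leaning on the direction of the test-set market instead of on a genuine signal.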
4. Look-Ahead Bias (Future Functions)
This problem arises when future information leaks into the strategy. For example, suppose a strategy trades at the highest price of the current bar: in live trading, nobody knows a bar's high until the bar has completed, but a historical backtest can see all of the data at once. The remedy is to step through prices bar by bar in a loop, as they would arrive in real time, and never use data from a later moment inside the loop. Note: a K-line (candlestick) chart summarizes a price time series over fixed periods, and a single K-line is called a bar.
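As a sketch of the bar-by-bar discipline, the hypothetical helpers below compute a breakout level for each bar from completed bars only; the current bar's own high is deliberately excluded. The function names and sample data are illustrative assumptions.

```python
def prior_high(highs, lookback):
    """Highest high over the `lookback` bars completed before each bar."""
    levels = []
    for i in range(len(highs)):
        window = highs[max(0, i - lookback):i]   # excludes bar i itself
        levels.append(max(window) if window else None)
    return levels

def breakout_signals(highs, closes, lookback):
    """True where a bar closes above the highest high of the preceding bars."""
    levels = prior_high(highs, lookback)
    return [lvl is not None and c > lvl for c, lvl in zip(closes, levels)]

highs = [10.0, 12.0, 11.0, 15.0, 14.0]
closes = [9.5, 11.5, 10.5, 14.5, 13.0]
levels = prior_high(highs, lookback=2)
signals = breakout_signals(highs, closes, lookback=2)
print(levels)    # [None, 10.0, 12.0, 12.0, 15.0]
print(signals)   # [False, True, False, True, False]
```

A simple sanity check for any indicator is that truncating the input must not change the values already computed: `prior_high(highs[:3], 2)` equals the first three entries of `prior_high(highs, 2)`.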
5. Endpoint Effects Due to Filtering
This problem may shatter many people's dreams. Wavelet analysis and Empirical Mode Decomposition (EMD) are effective noise-reduction methods in engineering, but when applied to time series they suffer from an "endpoint effect": their output is least reliable at the most recent observations, so they cannot be relied on to judge the recent trend or to guide trading. The same problem extends to other techniques, such as z-score standardization computed over the whole sample and the use of Hidden Markov Models (HMM) to infer the current market state. What these methods have in common is that new market data changes the model's estimates of historical states. All global-optimization filtering techniques therefore suffer from this problem, which is essentially a form of overfitting.
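The z-score case is easy to demonstrate. In the sketch below (function names and data are illustrative assumptions), a z-score computed over the whole sample rewrites its own history when one new observation arrives, whereas an expanding-window version that uses only data seen so far does not.

```python
from statistics import mean, pstdev

def zscore_full(xs):
    """Z-score using the mean and standard deviation of the whole sample."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]

def zscore_expanding(xs, min_obs=2):
    """Z-score of each point using only observations up to and including it."""
    out = []
    for i in range(len(xs)):
        seen = xs[:i + 1]
        s = pstdev(seen) if len(seen) >= min_obs else 0.0
        out.append((xs[i] - mean(seen)) / s if s > 0 else None)
    return out

history = [1.0, 2.0, 3.0, 2.0, 1.0]
extended = history + [10.0]          # one new market observation arrives

print(zscore_full(history)[-1], zscore_full(extended)[4])            # values differ
print(zscore_expanding(history)[-1], zscore_expanding(extended)[4])  # values match
```

A backtest built on the full-sample version implicitly uses information that was not available at the time each historical value was supposedly observed.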
4.5 Strategy development and backtesting using TqSdk
TqSdk is an open-source Python framework developed by Shinny Technology. It provides a complete solution covering data cleaning, strategy development, and live trading. Some of its features:
- Installation in under a minute.
- Tick and K-line data for all contracts since listing, with customizable K-line periods from 1 second to 1 day.
- Support for simulated (demo) and live trading.
- Tick-level or K-line-level backtesting, including backtesting of multi-contract strategies.
Note: a tick is a snapshot of market data at a single moment in trading.
1. Installation
Before installing TqSdk, prepare a suitable environment and a Python package manager. The requirements are:
- Python 3.6 or above.
- Windows 7 or higher or Linux.
To install TqSdk, simply run the following command in a terminal or command line environment:
pip install tqsdk
2. Data Download
Once installation is complete, data for any instrument at any period can be downloaded easily. For example:
from datetime import datetime
3. Strategy Backtesting
Passing a TqBacktest instance when creating the TqApi instance enables backtesting:
#!/usr/bin/env python
During backtesting, orders are matched as follows. A limit order is filled only when its price reaches or crosses the opposite-side quote, and it is filled at the order's own price; if there is no opposite-side quote (the contract is at limit up or limit down), it cannot be filled. A market order is filled at the opposite-side quote; if there is no opposite-side quote (limit up or limit down), it is cancelled automatically. Simulated trading has no partial fills: an order is either filled in full or not at all.
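These matching rules can be restated as a small standalone function. This is a hypothetical sketch for illustration only, not TqSdk's implementation or API: the name `simulate_fill` and its parameters are assumptions, with `limit_price=None` standing for a market order and a `None` quote standing for a missing side of the book at limit up or limit down.

```python
from typing import Optional

def simulate_fill(direction: str, limit_price: Optional[float],
                  bid: Optional[float], ask: Optional[float]) -> Optional[float]:
    """Return the fill price for the whole order, or None if it does not trade."""
    # A buy trades against the ask, a sell against the bid.
    opposite = ask if direction == "BUY" else bid
    if opposite is None:
        # No opposite quote: a limit order stays unfilled,
        # and a market order is cancelled.
        return None
    if limit_price is None:
        return opposite                      # market order: opposite-side price
    if direction == "BUY":
        crosses = limit_price >= opposite
    else:
        crosses = limit_price <= opposite
    return limit_price if crosses else None  # limit order: fills at its own price

print(simulate_fill("BUY", 101.0, bid=100.0, ask=100.5))   # 101.0
print(simulate_fill("BUY", 100.0, bid=100.0, ask=100.5))   # None
print(simulate_fill("SELL", None, bid=100.0, ask=100.5))   # 100.0
print(simulate_fill("BUY", None, bid=100.0, ask=None))     # None
```

Because a fill is all or nothing, the function returns a single price for the entire order and does not track filled volume.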
4.6 Chapter Summary
This chapter introduced the foundations of strategy development. After studying it, readers should understand how quantitative strategies are classified and evaluated, be able to develop strategies according to their actual needs by following an industrialized process, and be aware of some common problems in strategy development.