Quantitative Trading Core Strategy Development Chapter 4 Part IV

4.4 Strategy Development FAQs

1. Survivorship Bias

Survivorship bias (also called survivor bias) arises when a statistical analysis uses a data set that includes only entities that have “survived” to a specific point in time. The classic example is the analysis of hedge fund returns. Over the past 30 years, many hedge funds, such as Long-Term Capital Management (LTCM), have been hit hard or closed after significant losses. If returns over those 30 years are examined using only the hedge funds still operating today, the data set understates hedge fund risk, because none of the funds that went out of business are included.
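
A small simulation can make the effect concrete. The sketch below is illustrative only: the number of funds, the return distribution, and the rule that a fund closes once it has lost half of its starting capital are all assumptions, not real hedge fund data.

```python
import random
import statistics


def simulate_funds(n_funds=1000, n_years=30, mean=0.05, vol=0.20,
                   ruin_level=0.5, seed=42):
    """Simulate yearly returns; a fund closes once its value drops below ruin_level."""
    rng = random.Random(seed)
    funds = []
    for _ in range(n_funds):
        value, returns, alive = 1.0, [], True
        for _ in range(n_years):
            r = rng.gauss(mean, vol)      # hypothetical yearly return
            returns.append(r)
            value *= 1.0 + r
            if value < ruin_level:        # fund goes out of business
                alive = False
                break
        funds.append((returns, alive))
    return funds


def mean_annual_return(funds):
    """Average of every yearly return recorded by the given funds."""
    return statistics.mean(r for returns, _ in funds for r in returns)


funds = simulate_funds()
survivors = [f for f in funds if f[1]]

all_mean = mean_annual_return(funds)            # includes closed funds
survivor_mean = mean_annual_return(survivors)   # what a survivor-only data set shows

print("funds simulated:", len(funds), "survivors:", len(survivors))
print("mean yearly return, all funds:      %.4f" % all_mean)
print("mean yearly return, survivors only: %.4f" % survivor_mean)
```

Under these assumptions the survivor-only average comes out higher than the average over every fund that ever existed, which is exactly the distortion a survivor-only data set introduces.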

2. Omitted Variable Bias

Omitted variable bias (also called hidden or latent variable bias) occurs when a model leaves out one or more important causal variables. The model then incorrectly compensates for the missing variable by over- or underestimating the effect of one of the included variables, especially when an included variable is correlated with the missing one. Identifying the independent variables that have predictive power for the dependent variable is not easy. One approach is to search for the subset of variables that explains the most variation in the dependent variable, a method known as best subset selection. Alternatively, one can identify eigenvectors (linear combinations of the available variables), which is the method used in principal component analysis. One problem with principal component analysis is that it can also overfit the data, and the eigenvectors may not remain stable over time.
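
The distortion can be reproduced with synthetic data. The following sketch is an illustration under stated assumptions: the true coefficients (1.0 and 2.0), the correlation between the two variables, and the noise levels are all invented for the example. With a positive correlation, as here, the included variable absorbs part of the omitted variable's effect; a negative correlation would push the estimate the other way.

```python
import random


def simple_slope(x, y):
    """OLS slope of y on a single regressor x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def two_var_slopes(x1, x2, y):
    """OLS slopes of y on x1 and x2 (with intercept), via the 2x2 normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


rng = random.Random(0)
n = 5000
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.8 * a + rng.gauss(0, 0.6) for a in x1]                      # correlated with x1
y = [1.0 * a + 2.0 * b + rng.gauss(0, 0.5) for a, b in zip(x1, x2)]

short_slope = simple_slope(x1, y)      # x2 omitted from the model
b1, b2 = two_var_slopes(x1, x2, y)     # both variables included

print("slope on x1 with x2 omitted:  %.3f" % short_slope)
print("slopes with both variables:   %.3f, %.3f" % (b1, b2))
```

With x2 omitted, the estimated slope on x1 lands near 1.0 + 2.0 × 0.8 = 2.6 rather than the true 1.0; including x2 recovers both coefficients.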

3. Long-Short Bias

Long-short bias is caused by an uneven distribution of returns in a data set. For example, if the test set is a one-sided rising market in which most returns are positive, the optimized strategy will be tilted towards the long side. One way to address this bias is to take the log returns of the test set, flip their signs, and run the test again.
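
The sign-flipping step can be sketched as follows. The function name mirror_prices and the sample prices are hypothetical; the idea is to rebuild a mirrored price path from negated log returns so that the same strategy can be tested on both the original and the mirrored series.

```python
import math


def mirror_prices(prices):
    """Rebuild a price path whose log returns are the negatives of the original ones."""
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    mirrored = [prices[0]]
    for r in log_returns:
        mirrored.append(mirrored[-1] * math.exp(-r))   # flip the sign of each return
    return mirrored


def total_log_return(prices):
    """Log return from the first to the last price."""
    return math.log(prices[-1] / prices[0])


prices = [100.0, 102.0, 105.0, 104.0, 110.0]   # hypothetical rising market
mirrored = mirror_prices(prices)

print("original:", prices)
print("mirrored:", [round(p, 2) for p in mirrored])
print("total log returns:", total_log_return(prices), total_log_return(mirrored))
```

A strategy that only looks good because the sample trends upward will tend to perform poorly on the mirrored path, while a strategy with no directional tilt should behave similarly on both.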

4. Look-Ahead Bias (Future Functions)

This problem arises when future information is introduced into the strategy. For example, suppose we want to trade at the maximum price of the current bar. In real time, no one knows what the maximum price of a bar will be until the bar has completed. Unfortunately, historical backtesting lets us see all of the information at once. The remedy is to process prices bar by bar in a loop, as they would arrive in real time, and never use data from a later moment inside the loop. Note: a K-line (candlestick) summarizes the price series over one period, and a single K-line is called a bar.
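
The difference between a rule that peeks at the finished bar and one that uses only completed bars can be shown with a toy example. The bar data and both rules below are hypothetical and exist only to contrast the two styles of backtest.

```python
# Hypothetical OHLC bars, oldest first
bars = [
    {"open": 100.0, "high": 103.0, "low": 99.0, "close": 102.0},
    {"open": 102.0, "high": 104.0, "low": 100.0, "close": 101.0},
    {"open": 101.0, "high": 102.0, "low": 97.0, "close": 98.0},
    {"open": 98.0, "high": 101.0, "low": 97.0, "close": 100.0},
    {"open": 100.0, "high": 105.0, "low": 99.0, "close": 104.0},
]


def lookahead_pnl(bars):
    """WRONG: buys at the open and sells at the bar's high, which is only known afterwards."""
    return sum(bar["high"] - bar["open"] for bar in bars)


def causal_pnl(bars):
    """Decides on each bar using only the previous, completed bar."""
    pnl = 0.0
    for prev, cur in zip(bars, bars[1:]):
        if prev["close"] > prev["open"]:            # previous bar closed up
            pnl += cur["close"] - cur["open"]       # buy at open, sell at close
    return pnl


print("PnL with look-ahead:", lookahead_pnl(bars))
print("PnL using only completed bars:", causal_pnl(bars))
```

The look-ahead version can never lose, because every bar's high is at least its open; that guaranteed profit is the signature of future information leaking into the test.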

5. Endpoint effects due to filtering

This problem may shatter many people’s dreams. Wavelet analysis and Empirical Mode Decomposition (EMD) are effective noise reduction methods in engineering, but both suffer from an “endpoint effect” when applied to time series: their estimates are unreliable near the most recent data points, so they cannot be used to judge the recent trend or to guide trading. The same problem extends to other techniques, such as z-score standardization over the full sample and the use of Hidden Markov Models (HMM) to measure the current market state. Their common feature is that new market data changes the model’s view of historical states. All global optimization filtering techniques therefore suffer from this problem, which is essentially a form of overfitting.
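
The z-score case is the easiest to demonstrate. In the sketch below (the data and function names are made up for illustration), a z-score computed over the whole sample changes its historical values as soon as one new observation arrives, while an expanding-window z-score, which uses only data available at each point, does not.

```python
import statistics


def global_zscores(series):
    """Z-scores using the mean and standard deviation of the whole sample."""
    mu = statistics.mean(series)
    sd = statistics.pstdev(series)
    return [(x - mu) / sd for x in series]


def expanding_zscores(series, min_periods=3):
    """Z-scores where each point uses only the data up to and including itself."""
    out = []
    for i in range(len(series)):
        window = series[: i + 1]
        if len(window) < min_periods:
            out.append(None)                 # not enough history yet
            continue
        mu = statistics.mean(window)
        sd = statistics.pstdev(window)
        out.append((series[i] - mu) / sd if sd > 0 else 0.0)
    return out


history = [10.0, 11.0, 12.0, 11.0, 13.0]
extended = history + [20.0]                  # one new observation arrives

print("global, before:", global_zscores(history)[2])
print("global, after: ", global_zscores(extended)[2])
print("expanding, before:", expanding_zscores(history)[2])
print("expanding, after: ", expanding_zscores(extended)[2])
```

A backtest built on the global version silently uses statistics that were not known at the time of each decision, which is why its results cannot be reproduced in live trading.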

4.5 Strategy development and backtesting using TqSdk

TqSdk is an open source Python framework developed by Shinny Technology. It provides a complete solution for users, from data cleaning and strategy development to live trading. Some of its features are:

Installation in less than a minute.
Tick and K-line data for all contracts since listing, with support for customized K-line periods from 1 second to 1 day.
Support for demo and live trading.
Performs Tick or K-line backtesting and supports backtesting of multi-contract strategies.
Note: a Tick is a snapshot of the market quote at a given moment.

1. Installation

Before installing TqSdk, prepare a suitable environment and a Python package management tool (pip), including:

Python 3.6 or above.
Windows 7 or higher or Linux.

To install TqSdk, simply run the following command in a terminal or command line environment:

pip install tqsdk

2. Data downloads

Once the installation is complete, we can easily download data for any instrument at any period. For example:

from datetime import datetime, date
from contextlib import closing
from tqsdk import TqApi, TqSim
from tqsdk.tools import DataDownloader

api = TqApi(TqSim())
download_tasks = {}

# Download SR901 daily bars from 2018-01-01 to 2018-09-01
download_tasks["SR_daily"] = DataDownloader(api,
                                            symbol_list="CZCE.SR901",
                                            dur_sec=24*60*60,
                                            start_dt=date(2018, 1, 1),
                                            end_dt=date(2018, 9, 1),
                                            csv_file_name="SR901_daily.csv")

# Download 5-minute bars of the rb main continuous contract from 2017-01-01 to 2018-09-01
download_tasks["rb_5min"] = DataDownloader(api,
                                           symbol_list="KQ.m@SHFE.rb",
                                           dur_sec=5*60,
                                           start_dt=date(2017, 1, 1),
                                           end_dt=date(2018, 9, 1),
                                           csv_file_name="rb_5min.csv")

# Download 1-minute bars of cu1805, cu1807 and IC1803 from 06:00 on 2018-01-01
# to 16:00 on 2018-06-01; all data is aligned to the timestamps of cu1805.
# For example, during the cu1805 night session the IC1803 fields are N/A.
# For example, cu1805 does not trade from 13:00 to 13:30, so the IC1803 bars
# between 13:00 and 13:30 are skipped.
download_tasks["cu_min"] = DataDownloader(api,
                                          symbol_list=["SHFE.cu1805", "SHFE.cu1807", "CFFEX.IC1803"],
                                          dur_sec=60,
                                          start_dt=datetime(2018, 1, 1, 6, 0, 0),
                                          end_dt=datetime(2018, 6, 1, 16, 0, 0),
                                          csv_file_name="cu_min.csv")

# Download T1809 order book Tick data from 00:00 on 2018-05-01 to 00:00 on 2018-06-01
# (dur_sec=0 requests Tick data)
download_tasks["T_tick"] = DataDownloader(api,
                                          symbol_list=["CFFEX.T1809"],
                                          dur_sec=0,
                                          start_dt=datetime(2018, 5, 1),
                                          end_dt=datetime(2018, 6, 1),
                                          csv_file_name="T1809_tick.csv")

# Use "with closing" to make sure resources are released once the downloads finish
with closing(api):
    while not all([v.is_finished() for v in download_tasks.values()]):
        api.wait_update()
        print("progress:", {k: ("%.2f%%" % v.get_progress()) for k, v in download_tasks.items()})

3. Strategy Backtesting

Pass a TqBacktest object when creating the TqApi instance to enable backtesting:

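#!/usr/bin/env python
# -*- coding: utf-8 -*-
__author__ = 'chengzhi'

from datetime import date
from contextlib import closing
from tqsdk import TqApi, TqSim, TqBacktest, TargetPosTask

"""
Open a long position if the current price is above the MA15 of the 5-minute K-line.
Close the position if it is below.
Backtest from 2018-05-01 to 2018-10-01.
"""

# Passing TqBacktest when creating the api instance switches to backtest mode
api = TqApi(TqSim(), backtest=TqBacktest(start_dt=date(2018, 5, 1), end_dt=date(2018, 10, 1)))
# Get a reference to the 5-minute K-line of m1901
klines = api.get_kline_serial("DCE.m1901", 5*60, data_length=15)
# Create the target position task for m1901; it adjusts the m1901 position to the given target
target_pos = TargetPosTask(api, "DCE.m1901")
# Use "with closing" to make sure resources are released once the backtest finishes
with closing(api):
    while True:
        api.wait_update()
        if api.is_changing(klines):
            # .iloc assumes klines is a pandas DataFrame, as in recent TqSdk releases
            ma = sum(klines.close.iloc[-15:]) / 15
            print("Latest price", klines.close.iloc[-1], "MA", ma)
            if klines.close.iloc[-1] > ma:
                print("Latest price above MA: target long 5 lots")
                # Set the target position to long 5 lots
                target_pos.set_target_volume(5)
            elif klines.close.iloc[-1] < ma:
                print("Latest price below MA: target flat")
                # Set the target position to flat
                target_pos.set_target_volume(0)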
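
The decision rule inside the loop can also be checked offline, without connecting to TqSdk. The sketch below is a hypothetical helper (target_volume is not part of TqSdk) that reproduces the price-versus-MA15 rule on a plain list of closing prices.

```python
def target_volume(closes, window=15, long_volume=5, current_target=0):
    """Return the target position implied by the price-versus-MA rule."""
    if len(closes) < window:
        return current_target               # not enough data yet
    ma = sum(closes[-window:]) / window
    last = closes[-1]
    if last > ma:
        return long_volume                  # price above MA: go long
    if last < ma:
        return 0                            # price below MA: go flat
    return current_target                   # price equals MA: keep the position


rising = [float(p) for p in range(100, 115)]    # 15 closes from 100 to 114
falling = list(reversed(rising))
flat = [100.0] * 15

print("rising market target:", target_volume(rising))
print("falling market target:", target_volume(falling, current_target=5))
print("flat market target:", target_volume(flat, current_target=5))
```

Separating the rule from the event loop in this way makes it easy to unit test the signal before running a full backtest.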

In backtesting, a limit order is filled only if its price reaches or crosses the counterparty quote (the opposite side of the order book), and it is filled at the order’s own price; if there is no counterparty quote (the contract is at limit-up or limit-down), it cannot be filled. A market order is filled at the counterparty price; if there is no counterparty quote (limit-up or limit-down), it is cancelled automatically. Simulated trading has no partial fills: an order is either filled in full or not filled at all.
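
These matching rules can be expressed as a small function. This is a sketch of the rules as described in the text, not TqSdk's actual implementation; the function name, the argument names, and the status strings are all hypothetical.

```python
def simulate_fill(direction, volume, bid, ask, limit_price=None):
    """Return (filled_volume, fill_price, status) for one simulated order.

    bid/ask are the best quotes; None means that side of the book is empty
    (for example at limit-up or limit-down). limit_price=None means a market order.
    """
    # The counterparty quote is the ask for a buy and the bid for a sell
    opposite = ask if direction == "BUY" else bid

    if limit_price is None:                      # market order
        if opposite is None:
            return 0, None, "CANCELLED"          # no counterparty: cancel automatically
        return volume, opposite, "FILLED"        # fill at the counterparty price

    if opposite is None:                         # limit order with no counterparty
        return 0, None, "PENDING"

    if direction == "BUY":
        crosses = limit_price >= opposite
    else:
        crosses = limit_price <= opposite

    if crosses:
        return volume, limit_price, "FILLED"     # full fill at the order's own price
    return 0, None, "PENDING"


print(simulate_fill("BUY", 5, bid=99.0, ask=100.0, limit_price=101.0))
print(simulate_fill("BUY", 5, bid=99.0, ask=100.0, limit_price=99.5))
print(simulate_fill("SELL", 3, bid=99.0, ask=100.0))
print(simulate_fill("BUY", 3, bid=99.0, ask=None))
```

Note that a crossing limit order is filled at its own price rather than at the better counterparty price, so the simulated result is conservative for aggressive limit orders.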

4.6 Summary of the chapter

This chapter introduced the foundations of strategy development. After studying it, readers should understand how quantitative strategies are classified and evaluated, be able to develop strategies for their own needs by following an industrialized process, and recognize some common problems in strategy development.