Origin

Stocks and Digital Currency

The digital currency market is a 7*25, 365 days a year, non-stop trading market that generates huge amounts of data every day. Compared to the A-share market, which trades for only 4 hours a day, it is very difficult to get milliseconds of data. The digital currency market is undoubtedly a more friendly market.

I even had a fight with my hairdresser yesterday, he still insists that quantization is evil and is stealing money from them. And my point is Quantization is the big trend, and those who know quantization are facing traditional traders who don’t understand quantization in a downward spiral. The only way is to embrace quantization as soon as possible, but people have their own aspirations and cannot be persuaded.

The digital currency market is constantly evolving, from a person can make a lot of money by manually moving, to manual moving can not be profitable to move to the program to move arbitrage, then the program with the program coupled with a simple indicator for programmed trading, and then, the simple indicator began to fail, the cost of gradually began to introduce the concept of factors. Higher level began to high frequency.

This step by step can be clearly seen. The digital currency market is a gradual conversion from the cognitive threshold to the technical threshold of the process, at the beginning as long as you have the determination to gamble, you can obtain profiteering, to the present you can only clear away the fog in front of you, cross the threshold of cognition, and at the same time technically sound in order to obtain enough to satisfy their own profits.

All of this is similar to the early days of the reopening, except that this is a 5 times the speed of the market, everything is in the rapid evolution. I hope that students reading this article can embrace the oncoming wave together. Get your own share of the glory.

Multifactor and Digital Currency

Back to multi-factor, this article does not involve complex mathematical modeling and logical derivation, is simple to use the digital currency spot market as a data source to build a factor back test framework, to facilitate the evaluation of factors.

Factor can be regarded as an indicator, can also be written as an expression, which represents a kind of investment logic, reflects a kind of return information.

For example: The closing price factor assumes that the stock price predicts future returns, and that higher stock prices predict higher (or lower) returns. Constructing a portfolio of this factor is equivalent to a strategy of buying high priced stocks on a regular rotation. Factors that consistently generate excess returns are often referred to as alpha, such as the market capitalization factor and the momentum factor, which are proven effective factors.

Both the stock market and the digital currency market are extremely complex and it is impossible to fully predict future returns, but there is some predictability. Effective Alpha investment models gradually expire as more capital is invested, but new models and new Alpha will be created.

Market capitalization factor was very successful in the A-share market, since 2007, a decade of backtesting, simply buy the lowest market capitalization stocks, adjusted every day, gained more than 400 times the return, far more than the broader market. But in 2017, the small-cap factor failed, and the value factor became popular, reflecting the limitations of a single factor. We need to weigh the validation and use of factors. Finding factors is the basis of the strategy, combining multiple uncorrelated valid factors can build a better strategy.

Code implementation

1
2
3
4
5
6
7
8
9
# Introduction of related libraries
import requests
from datetime import date,datetime
import time
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import requests, zipfile, io
%matplotlib inline

Build data environment

The source of the data is the binance exchange spot market, because all these data have been deposited into the database, so that it is convenient for local backtesting and research, I have selected the species with a relatively large trading volume, and opened the trading pairs in the USDT perpetual futures at the same time. As I said before, multi-factor is actually a strategy for selecting coins, which requires that the coins should be sufficiently large, so as to have enough space to select.

First build a database environment to facilitate the data call

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
## Defining database-related constants
start_time = "2022-01-01 00:00:00"
end_time = "2022-09-01 00:00:00"
rule_type = '1h'

df_dict = {}
date_columns = "open_time_GMT8"

username = ""
password = ""
host = ""
port = 3306
dbname = "binance_spot"

# Initialize the database connection, using the pymysql module
db = pymysql.connect(host=host, port=port,user=username, passwd=password,db=dbname, charset="utf8")
# Use the cursor() method to create a cursor object cursor
con = db.cursor()

Define commonly used functions

Then define a few commonly used functions to make it easier to call them.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
#———————————————————— Defining Related Functions ————————————————————#
# Get table name
def get_tables(con, dbname):
con.execute("SHOW TABLES")
results = con.fetchall()
tables_name = [item for sublist in results for item in sublist]
return tables_name

# periodic conversion function
def transfer_to_period_data(df, rule_type):
period_df = df.resample(rule=rule_type, on='open_time_GMT8', label='left', closed='left').agg(
{'open': 'first',
'high': 'max',
'low': 'min',
'close': 'last',
'volume': 'sum',
'amount': 'sum'
})

period_df.dropna(subset=['open'], inplace=True) # 去除一天都没有交易的周期
period_df = period_df[period_df['volume'] > 0] # 去除成交量为0的交易周期
period_df.reset_index(inplace=True)
all_data = period_df[['open_time_GMT8', 'open', 'high', 'low', 'close', 'volume', 'amount']]
return all_data

def export_table_by_date(table_name, date_columns, start_date, end_date):
"""
Getting relevant data at a specified time
Args:
table_name: eg. eth_usdt
date_columns: time series eg. "open_time_GMT8"
Returns:Pandas.DataFrmae()
"""
try:
query = f"SELECT * FROM {table_name} WHERE {date_columns} BETWEEN '{start_date}' AND '{end_date}'"
con.execute(query)
results = con.fetchall()

columns = []
for i in range(len(con.description)):
columns.append(con.description[i][0])

df = pd.DataFrame(results, columns=columns)
return df, None
except Exception as e:
return None, e

Let’s print out the tables that are in the database memory

1
2
3
## Get the table names of all tables in the specified database
tables_names = get_tables(con, dbname)
print(tables_names, len(tables_names))

output:

1
['1inch_usdt', 'aave_usdt', 'aca_usdt', 'ach_usdt', 'acm_usdt', 'ada_usdt', 'adx_usdt', 'agld_usdt', 'aion_usdt', 'akro_usdt', 'alcx_usdt', 'algo_usdt', 'alice_usdt', 'alpaca_usdt', 'alpha_usdt', 'alpine_usdt', 'amp_usdt', 'anc_usdt', 'ankr_usdt', 'ant_usdt', 'any_usdt', 'ape_usdt', 'api3_usdt', 'apt_usdt', 'ar_usdt', 'ardr_usdt', 'arpa_usdt', 'asr_usdt', 'astr_usdt', 'ata_usdt', 'atm_usdt', 'atom_usdt', 'auction_usdt', 'aud_usdt', 'audio_usdt', 'auto_usdt', 'ava_usdt', 'avax_usdt', 'axs_usdt', 'badger_usdt', 'bake_usdt', 'bal_usdt', 'band_usdt', 'bar_usdt', 'bat_usdt', 'bcc_usdt', 'bch_usdt', 'bchabc_usdt', 'bchsv_usdt', 'beam_usdt', 'bel_usdt', 'beta_usdt', 'bico_usdt', 'bifi_usdt', 'bkrw_usdt', 'blz_usdt', 'bnb_usdt', 'bnt_usdt', 'bnx_usdt', 'bond_usdt', 'bsw_usdt', 'btc_usdt', 'btcst_usdt', 'btg_usdt', 'bts_usdt', 'btt_usdt', 'bttc_usdt', 'burger_usdt', 'busd_usdt', 'bzrx_usdt', 'c98_usdt', 'cake_usdt', 'celo_usdt', 'celr_usdt', 'cfx_usdt', 'chess_usdt', 'chr_usdt', 'chz_usdt', 'city_usdt', 'ckb_usdt', 'clv_usdt', 'cocos_usdt', 'comp_usdt', 'cos_usdt', 'coti_usdt', 'crv_usdt', 'ctk_usdt', 'ctsi_usdt', 'ctxc_usdt', 'cvc_usdt', 'cvp_usdt', 'cvx_usdt', 'dai_usdt', 'dar_usdt', 'dash_usdt', 'data_usdt', 'dcr_usdt', 'dego_usdt', 'dent_usdt', 'dexe_usdt', 'df_usdt', 'dgb_usdt', 'dia_usdt', 'dnt_usdt', 'dock_usdt', 'dodo_usdt', 'doge_usdt', 'dot_usdt', 'drep_usdt', 'dusk_usdt', 'dydx_usdt', 'egld_usdt', 'elf_usdt', 'enj_usdt', 'ens_usdt', 'eos_usdt', 'eps_usdt', 'epx_usdt', 'erd_usdt', 'ern_usdt', 'etc_usdt', 'eth_usdt', 'eur_usdt', 'farm_usdt', 'fet_usdt', 'fida_usdt', 'fil_usdt', 'fio_usdt', 'firo_usdt', 'fis_usdt', 'flm_usdt', 'flow_usdt', 'flux_usdt', 'for_usdt', 'forth_usdt', 'front_usdt', 'ftm_usdt', 'ftt_usdt', 'fun_usdt', 'fxs_usdt', 'gal_usdt', 'gala_usdt', 'gbp_usdt', 'ghst_usdt', 'glmr_usdt', 'gmt_usdt', 'gmx_usdt', 'gno_usdt', 'grt_usdt', 'gtc_usdt', 'gto_usdt', 'gxs_usdt', 'hard_usdt', 'hbar_usdt', 'hc_usdt', 'hft_usdt', 'hifi_usdt', 'high_usdt', 'hive_usdt', 'hnt_usdt', 'hot_usdt', 'icp_usdt', 'icx_usdt', 'idex_usdt', 'ilv_usdt', 'imx_usdt', 'inj_usdt', 'iost_usdt', 'iota_usdt', 'iotx_usdt', 'iris_usdt', 'jasmy_usdt', 'joe_usdt', 'jst_usdt', 'juv_usdt', 'kava_usdt', 'kda_usdt', 'keep_usdt', 'key_usdt', 'klay_usdt', 'kmd_usdt', 'knc_usdt', 'kp3r_usdt', 'ksm_usdt', 'lazio_usdt', 'ldo_usdt', 'lend_usdt', 'lever_usdt', 'lina_usdt', 'link_usdt', 'lit_usdt', 'loka_usdt', 'lqty_usdt', 'lrc_usdt', 'lsk_usdt', 'ltc_usdt', 'lto_usdt', 'luna_usdt', 'magic_usdt', 'mana_usdt', 'mask_usdt', 'matic_usdt', 'mbl_usdt', 'mbox_usdt', 'mc_usdt', 'mco_usdt', 'mdt_usdt', 'mdx_usdt', 'mft_usdt', 'mina_usdt', 'mir_usdt', 'mith_usdt', 'mkr_usdt', 'mln_usdt', 'mob_usdt', 'movr_usdt', 'mtl_usdt', 'multi_usdt', 'nano_usdt', 'nbs_usdt', 'nbt_usdt', 'near_usdt', 'neo_usdt', 'nexo_usdt', 'nkn_usdt', 'nmr_usdt', 'npxs_usdt', 'nu_usdt', 'nuls_usdt', 'ocean_usdt', 'og_usdt', 'ogn_usdt', 'om_usdt', 'omg_usdt', 'one_usdt', 'ong_usdt', 'ont_usdt', 'ooki_usdt', 'op_usdt', 'orn_usdt', 'oxt_usdt', 'pax_usdt', 'paxg_usdt', 'people_usdt', 'perl_usdt', 'perp_usdt', 'pha_usdt', 'pla_usdt', 'pnt_usdt', 'pols_usdt', 'poly_usdt', 'polyx_usdt', 'pond_usdt', 'porto_usdt', 'powr_usdt', 'psg_usdt', 'pundix_usdt', 'pyr_usdt', 'qi_usdt', 'qnt_usdt', 'qtum_usdt', 'quick_usdt', 'rad_usdt', 'ramp_usdt', 'rare_usdt', 'ray_usdt', 'reef_usdt', 'rei_usdt', 'rep_usdt', 'req_usdt', 'rgt_usdt', 'rif_usdt', 'rlc_usdt', 'rndr_usdt', 'rose_usdt', 'rpl_usdt', 'rsr_usdt', 'rune_usdt', 'sand_usdt', 'santos_usdt', 'sc_usdt', 'scrt_usdt', 'sfp_usdt', 'shib_usdt', 'skl_usdt', 'slp_usdt', 'snx_usdt', 'sol_usdt', 'spell_usdt', 'srm_usdt', 'steem_usdt', 'stg_usdt', 'stmx_usdt', 'storj_usdt', 'storm_usdt', 'stpt_usdt', 'strat_usdt', 'strax_usdt', 'stx_usdt', 'sun_usdt', 'susd_usdt', 'sushi_usdt', 'sxp_usdt', 'sys_usdt', 't_usdt', 'tct_usdt', 'tfuel_usdt', 'theta_usdt', 'tko_usdt', 'tlm_usdt', 'tomo_usdt', 'torn_usdt', 'trb_usdt', 'tribe_usdt', 'troy_usdt', 'tru_usdt', 'trx_usdt', 'tusd_usdt', 'tvk_usdt', 'twt_usdt', 'uma_usdt', 'unfi_usdt', 'uni_usdt', 'usdc_usdt', 'usdp_usdt', 'ust_usdt', 'utk_usdt', 'ven_usdt', 'vet_usdt', 'vgx_usdt', 'vidt_usdt', 'vite_usdt', 'voxel_usdt', 'vtho_usdt', 'wan_usdt', 'waves_usdt', 'waxp_usdt', 'win_usdt', 'wing_usdt', 'wnxm_usdt', 'woo_usdt', 'wrx_usdt', 'wtc_usdt', 'xec_usdt', 'xem_usdt', 'xlm_usdt', 'xmr_usdt', 'xno_usdt', 'xrp_usdt', 'xtz_usdt', 'xvg_usdt', 'xvs_usdt', 'xzc_usdt', 'yfi_usdt', 'yfii_usdt', 'ygg_usdt', 'zec_usdt', 'zen_usdt', 'zil_usdt', 'zrx_usdt'] 360

Then output the data in the table to see the data structure :

1
2
3
symbol = 'btc_usdt'
df, error = export_table_by_date(symbol, date_columns, start_time, end_time)
df

output:

id open_time_GMT8 open high low close volume amount
0 2291854 2022-01-01 00:00:00 48005.37 48027.31 47937.50 48004.75 50.29149 2.413640e+06
1 2291855 2022-01-01 00:01:00 48007.21 48011.78 47951.50 47951.50 14.75541 7.080488e+05
2 2291856 2022-01-01 00:02:00 47956.29 47992.36 47944.49 47965.46 15.33062 7.353789e+05
3 2291857 2022-01-01 00:03:00 47965.46 47970.25 47910.56 47954.34 13.35734 6.403608e+05
4 2291858 2022-01-01 00:04:00 47954.35 47971.49 47949.55 47955.58 8.34380 4.001467e+05
... ... ... ... ... ... ... ... ...
349916 2641770 2022-08-31 23:56:00 20148.68 20155.55 20143.12 20151.50 170.79243 3.441419e+06
349917 2641771 2022-08-31 23:57:00 20152.62 20153.10 20130.85 20142.19 227.70751 4.586203e+06
349918 2641772 2022-08-31 23:58:00 20143.65 20151.31 20136.47 20145.66 160.46265 3.232340e+06
349919 2641773 2022-08-31 23:59:00 20144.40 20158.24 20138.16 20140.63 193.27150 3.894068e+06
349920 2641774 2022-09-01 00:00:00 20142.26 20166.17 20131.97 20158.42 294.71744 5.938896e+06

349921 rows × 8 columns

1
2
3
4
5
6
7
8
9
10
11
12
13
14
# Filter the tables to keep only the ones with high recognition. 
tables = ['btc_usdt', 'eth_usdt', 'bch_usdt', 'xrp_usdt', 'eos_usdt', 'ltc_usdt', 'trx_usdt', 'etc_usdt', 'link_usdt', 'xlm_usdt', 'xmr_usdt', 'dash_usdt', 'zec_usdt', 'xtz_usdt', 'bnb_usdt', 'atom_usdt',
'ont_usdt', 'iota_usdt', 'bat_usdt', 'vet_usdt', 'neo_usdt', 'qtum_usdt', 'iost_usdt','theta_usdt','algo_usdt', 'zil_usdt', 'knc_usdt', 'zrx_usdt', 'comp_usdt', 'omg_usdt', 'doge_usdt', 'sxp_usdt', 'kava_usdt',
'band_usdt', 'rlc_usdt', 'waves_usdt', 'snx_usdt', 'dot_usdt', 'yfi_usdt', 'bal_usdt', 'crv_usdt', 'ada_usdt', 'trb_usdt', 'rune_usdt',
'sushi_usdt', 'srm_usdt', 'egld_usdt', 'sol_usdt', 'icx_usdt', 'blz_usdt', 'uni_usdt', 'avax_usdt', 'ftm_usdt', 'hnt_usdt', 'enj_usdt',
'flm_usdt', 'tomo_usdt', 'ksm_usdt', 'near_usdt', 'aave_usdt', 'fil_usdt', 'rsr_usdt', 'lrc_usdt', 'matic_usdt', 'ocean_usdt', 'cvc_usdt',
'bel_usdt', 'ctk_usdt', 'axs_usdt', 'alpha_usdt', 'zen_usdt', 'skl_usdt', 'grt_usdt', '1inch_usdt', 'chz_usdt', 'sand_usdt', 'ankr_usdt',
'bts_usdt', 'unfi_usdt', 'reef_usdt', 'sfp_usdt', 'xem_usdt', 'coti_usdt', 'chr_usdt', 'mana_usdt', 'alice_usdt', 'hbar_usdt', 'one_usdt',
'stmx_usdt', 'dent_usdt', 'celr_usdt', 'hot_usdt', 'mtl_usdt', 'ogn_usdt', 'nkn_usdt', 'sc_usdt', 'dgb_usdt', 'icp_usdt', 'tlm_usdt',
'iotx_usdt', 'audio_usdt', 'ray_usdt', 'c98_usdt', 'mask_usdt', 'ata_usdt', 'dydx_usdt', 'gala_usdt', 'celo_usdt', 'klay_usdt', 'arpa_usdt'
, 'ctsi_usdt', 'ens_usdt', 'people_usdt', 'ant_usdt', 'rose_usdt', 'dusk_usdt', 'flow_usdt', 'imx_usdt', 'api3_usdt', 'gmt_usdt',
'ape_usdt', 'bnx_usdt', 'woo_usdt', 'ftt_usdt', 'jasmy_usdt', 'dar_usdt', 'gal_usdt', 'op_usdt', 'btc_usdt', 'eth_usdt', 'inj_usdt',
'stg_usdt', 'spell_usdt', 'ldo_usdt', 'cvx_usdt']
print(len(tables))

output: 135

1
2
3
4
5
6
7
8
9
10
11
## All the data is stored in a file for subsequent analysis.
for i in tables:
df, error = export_table_by_date(i, date_columns, start_time, end_time)
if df is None:
print(f"Error exporting table {i}: {error}")
continue

if not df.empty:
df = transfer_to_period_data(df, rule_type)
df.index = pd.to_datetime(df["open_time_GMT8"])
df_dict[i] = df

Grouping of data

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
df_close = pd.DataFrame(index=pd.date_range(start=start_time, end=end_time, freq=rule_type),columns=df_dict.keys())
df_open = pd.DataFrame(index=pd.date_range(start=start_time, end=end_time, freq=rule_type),columns=df_dict.keys())
df_volume = pd.DataFrame(index=pd.date_range(start=start_time, end=end_time, freq=rule_type),columns=df_dict.keys())
df_amount = pd.DataFrame(index=pd.date_range(start=start_time, end=end_time, freq=rule_type),columns=df_dict.keys())

for symbol in df_dict.keys():
df_s = df_dict[symbol]
df_close[symbol] = df_s['close']
df_open[symbol] = df_s['open']
df_volume[symbol] = df_s['volume']
df_amount[symbol] = df_s['amount']



df_close = df_close.dropna(how='all')
df_open = df_open.dropna(how='all')
df_volume = df_volume.dropna(how='all')
df_amount = df_count.dropna(how='all')
1
2
3
# Normalization process and output
df_norm = df_close/df_close.fillna(method='bfill').iloc[0] #归一化
df_norm.mean(axis=1).plot(figsize=(15,6),grid=True);

As we can see the market is very poor going from 1.0 at the beginning of the year to 0.3 in September and the market capitalization is down 35% of what it started with

The backtesting engine

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
# Backtesting engine
class Exchange:

def __init__(self, trade_symbols, fee=0.0004, initial_balance=10000):
self.initial_balance = initial_balance #初始的资产
self.fee = fee
self.trade_symbols = trade_symbols
self.account = {'USDT':{'realised_profit':0, 'unrealised_profit':0, 'total':initial_balance, 'fee':0, 'leverage':0, 'hold':0}}
for symbol in trade_symbols:
self.account[symbol] = {'amount':0, 'hold_price':0, 'value':0, 'price':0, 'realised_profit':0,'unrealised_profit':0,'fee':0}

def Trade(self, symbol, direction, price, amount):

cover_amount = 0 if direction*self.account[symbol]['amount'] >=0 else min(abs(self.account[symbol]['amount']), amount)
open_amount = amount - cover_amount
self.account['USDT']['realised_profit'] -= price*amount*self.fee #扣除手续费
self.account['USDT']['fee'] += price*amount*self.fee
self.account[symbol]['fee'] += price*amount*self.fee

if cover_amount > 0: #先平仓
self.account['USDT']['realised_profit'] += -direction*(price - self.account[symbol]['hold_price'])*cover_amount #利润
self.account[symbol]['realised_profit'] += -direction*(price - self.account[symbol]['hold_price'])*cover_amount

self.account[symbol]['amount'] -= -direction*cover_amount
self.account[symbol]['hold_price'] = 0 if self.account[symbol]['amount'] == 0 else self.account[symbol]['hold_price']

if open_amount > 0:
total_cost = self.account[symbol]['hold_price']*direction*self.account[symbol]['amount'] + price*open_amount
total_amount = direction*self.account[symbol]['amount']+open_amount

self.account[symbol]['hold_price'] = total_cost/total_amount
self.account[symbol]['amount'] += direction*open_amount


def Buy(self, symbol, price, amount):
self.Trade(symbol, 1, price, amount)

def Sell(self, symbol, price, amount):
self.Trade(symbol, -1, price, amount)

def Update(self, close_price): #对资产进行更新
self.account['USDT']['unrealised_profit'] = 0
self.account['USDT']['hold'] = 0
for symbol in self.trade_symbols:
if not np.isnan(close_price[symbol]):
self.account[symbol]['unrealised_profit'] = (close_price[symbol] - self.account[symbol]['hold_price'])*self.account[symbol]['amount']
self.account[symbol]['price'] = close_price[symbol]
self.account[symbol]['value'] = abs(self.account[symbol]['amount'])*close_price[symbol]
self.account['USDT']['hold'] += self.account[symbol]['value']
self.account['USDT']['unrealised_profit'] += self.account[symbol]['unrealised_profit']
self.account['USDT']['total'] = round(self.account['USDT']['realised_profit'] + self.initial_balance + self.account['USDT']['unrealised_profit'],6)
self.account['USDT']['leverage'] = round(self.account['USDT']['hold']/self.account['USDT']['total'],3)

# Functions for test factors
def Test(factor, symbols, period=1, N=40, value=300):
e = Exchange(symbols, fee=0.0002, initial_balance=10000)
res_list = []
index_list = []
factor = factor.dropna(how='all')
for idx, row in factor.iterrows():
if idx.hour % period == 0:
buy_symbols = row.sort_values().dropna()[0:N].index
sell_symbols = row.sort_values().dropna()[-N:].index
prices = df_close.loc[idx,]
index_list.append(idx)
for symbol in symbols:
if symbol in buy_symbols and e.account[symbol]['amount'] <= 0:
e.Buy(symbol,prices[symbol],value/prices[symbol]-e.account[symbol]['amount'])
if symbol in sell_symbols and e.account[symbol]['amount'] >= 0:
e.Sell(symbol,prices[symbol], value/prices[symbol]+e.account[symbol]['amount'])
e.Update(prices)
res_list.append([e.account['USDT']['total'],e.account['USDT']['hold']])
return pd.DataFrame(data=res_list, columns=['total','hold'],index = index_list)
1
2
3
4
#Volume
factor_volume = df_volume
factor_volume_res = Test(factor_volume, tables, period=4)
factor_volume_res.total.plot(figsize=(15,6),grid=True);

1
2
3
4
#3 hour momentum factor
factor_1 = (df_close - df_close.shift(3))/df_close.shift(3)
factor_1_res = Test(factor_1,tables,period=1)
factor_1_res.total.plot(figsize=(15,6),grid=True);

1
2
3
4
#24 hour momentum factor
factor_1 = (df_close - df_close.shift(24))/df_close.shift(3)
factor_1_res = Test(factor_1,tables,period=1)
factor_1_res.total.plot(figsize=(15,6),grid=True);

1
2
3
4
#Volume Factor
factor_3 = df_volume.rolling(24).mean()/df_volume.rolling(96).mean()
factor_3_res = Test(factor_3, tables, period=8)
factor_3_res.total.plot(figsize=(15,6),grid=True);

1
2
3
4
# Volatility factor
factor_7 = (df_close/df_open).rolling(24).std()
factor_7_res = Test(factor_7, tables, period=2)
factor_7_res.total.plot(figsize=(15,6),grid=True);

1
2
3
4
# Volume and closing price correlation factors
factor_8 = df_close.rolling(96).corr(df_volume)
factor_8_res = Test(factor_8, tables, period=4)
factor_8_res.total.plot(figsize=(15,6),grid=True);

1
2
3
4
5
6
7
8
9
10
11
12
13
# Normalization function, removing missing and extreme values and normalized
def norm_factor(factor):
factor = factor.dropna(how='all')
factor_clip = factor.apply(lambda x:x.clip(x.quantile(0.2), x.quantile(0.8)),axis=1)
factor_norm = factor_clip.add(-factor_clip.mean(axis=1),axis ='index').div(factor_clip.std(axis=1),axis ='index')
return factor_norm


df_volume_norm = norm_factor(df_volume)
factor_close_norm = norm_factor(factor_close)
factor_3_norm = norm_factor(factor_3)
factor_7_norm = norm_factor(factor_7)
factor_8_norm = norm_factor(factor_8)

polyfactor synthesis

1
2
3
4
## Multi-factor backtesting
factor_total = 0.6*df_volume_norm + 0.4*factor_close_norm + 0.5*factor_3_norm + 0.3*factor_7_norm + 0.4*factor_8_norm
factor_total_res = Test(factor_total, tables, period=8)
factor_total_res.total.plot(figsize=(15,6),grid=True);

Summary

This paper is a replication of the inventor’s article, with only a few parameter changes, so the results are similar.

In the backtesting data, we can see that the multi-factor performance is not as good as the single-factor performance of Momentum, and we are looking for multi-factors with better performance. We are looking for better multi-factor performance. There are still a lot of ways to play with multi-factor strategies and a lot of room for improvement, so I hope we can discuss this together.