Choice of data fetching interface
Generally speaking, there are several ways to get stock data:
- tushare, baostock, qstock and other Python library interfaces. The advantage is that the data has already been organized and cleaned, so it is convenient to fetch: a single line of code is enough. The disadvantage is that there are often various restrictions and the precision is limited. For example, tushare requires registration and earning points before it can be used, and the finest time granularity these libraries offer for free is 5 minutes, so if you want 1-minute data you have to find another way yourself.
- Crawl the corresponding data directly from web pages. This approach is more flexible, but it must not be interrupted: if your crawler server goes down one day, the missing data is hard to recover and has to be backfilled from other sources, which raises the question of why you did not use those other channels in the first place.
- Download the data through stock trading software and then convert it into your own format.
- Buy the data directly.
Downloading the data is not the end of the job: it is best to store it in a database so that it can be reached from multiple terminals and queried at any time. The following sections walk through these approaches.
Getting historical data via a Python library
Install the baostock library.
```
pip install baostock -i https://pypi.tuna.tsinghua.edu.cn/simple/ --trusted-host pypi.tuna.tsinghua.edu.cn
```
Get historical A-share K-line data
```python
import baostock as bs
```
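A minimal sketch of the usual baostock workflow for daily K-line bars (the stock code and date range below are just placeholders):

```python
import baostock as bs
import pandas as pd

bs.login()   # anonymous login is enough for historical data

rs = bs.query_history_k_data_plus(
    "sh.600519",                                    # Kweichow Moutai
    "date,code,open,high,low,close,volume,amount",
    start_date="2022-01-01",
    end_date="2022-12-31",
    frequency="d",       # d/w/m for day/week/month, 5/15/30/60 for minutes
    adjustflag="2",      # 1 = post-adjusted, 2 = pre-adjusted, 3 = unadjusted
)

rows = []
while rs.error_code == "0" and rs.next():
    rows.append(rs.get_row_data())
df = pd.DataFrame(rows, columns=rs.fields)
print(df.head())

bs.logout()
```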
Access to valuation indicators (daily frequency)
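In baostock the daily valuation fields (PE, PB, PS, price-to-cash-flow) are also pulled through query_history_k_data_plus; a minimal sketch with the login step included:

```python
import baostock as bs
import pandas as pd

bs.login()
rs = bs.query_history_k_data_plus(
    "sh.600519",
    "date,code,close,peTTM,pbMRQ,psTTM,pcfNcfTTM",   # valuation fields
    start_date="2022-01-01",
    end_date="2022-12-31",
    frequency="d",
    adjustflag="3",
)
rows = []
while rs.error_code == "0" and rs.next():
    rows.append(rs.get_row_data())
print(pd.DataFrame(rows, columns=rs.fields).head())
bs.logout()
```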
Ex-rights and ex-dividend information: query_dividend_data()
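A minimal sketch of query_dividend_data; the year is a placeholder, and yearType="report" filters by announcement year while "operate" filters by ex-dividend year:

```python
import baostock as bs
import pandas as pd

bs.login()
rs = bs.query_dividend_data(code="sh.600519", year="2022", yearType="report")
rows = []
while rs.error_code == "0" and rs.next():
    rows.append(rs.get_row_data())
print(pd.DataFrame(rows, columns=rs.fields))
bs.logout()
```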
Basic information on securities: query_stock_basic()
```python
## Login step omitted
```
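With the login step written out, a minimal sketch of query_stock_basic, which returns the listing date, security type and trading status for a code:

```python
import baostock as bs
import pandas as pd

bs.login()
rs = bs.query_stock_basic(code="sh.600519")
rows = []
while rs.error_code == "0" and rs.next():
    rows.append(rs.get_row_data())
print(pd.DataFrame(rows, columns=rs.fields))   # fields include code, code_name, ipoDate, type, status
bs.logout()
```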
Quarterly-frequency company performance snapshot: query_performance_express_report()
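A minimal sketch of query_performance_express_report; start_date and end_date bound the report publication dates, and the range below is a placeholder:

```python
import baostock as bs
import pandas as pd

bs.login()
rs = bs.query_performance_express_report(
    "sh.600519", start_date="2021-01-01", end_date="2022-12-31"
)
rows = []
while rs.error_code == "0" and rs.next():
    rows.append(rs.get_row_data())
print(pd.DataFrame(rows, columns=rs.fields))
bs.logout()
```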
Comparison of libraries
A comparison of these libraries is summarized in the attached image.
Crawling data through web pages
Project structure
database connection pool
```python
## connectionpool.py
```
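A minimal sketch of what such a connection pool can look like, assuming a MySQL backend accessed through pymysql plus DBUtils (the class name, credentials and pool size are all placeholders):

```python
import pymysql
from dbutils.pooled_db import PooledDB   # pip install DBUtils pymysql

class ConnectionPool:
    """Hold a PooledDB instance so DB connections can be reused across threads."""

    def __init__(self, host="127.0.0.1", port=3306, user="root",
                 password="", database="stock"):
        self._pool = PooledDB(
            creator=pymysql,      # DB-API module used to create raw connections
            maxconnections=10,    # upper bound on simultaneously open connections
            blocking=True,        # wait instead of failing when the pool is exhausted
            host=host, port=port, user=user,
            password=password, database=database, charset="utf8mb4",
        )

    def get_connection(self):
        """Borrow a connection; closing it returns it to the pool."""
        return self._pool.connection()
```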
database operation
```python
## DBOperator.py
```
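A matching sketch of the database-operation layer, built on the hypothetical ConnectionPool from the previous sketch; the SQL it runs is up to the caller:

```python
class DBOperator:
    """Thin wrapper that borrows a pooled connection per statement."""

    def __init__(self, pool):
        self.pool = pool   # a ConnectionPool as sketched above

    def execute(self, sql, params=None):
        """Run an INSERT/UPDATE/DELETE and commit."""
        conn = self.pool.get_connection()
        try:
            with conn.cursor() as cur:
                cur.execute(sql, params or ())
            conn.commit()
        finally:
            conn.close()   # hands the connection back to the pool

    def query(self, sql, params=None):
        """Run a SELECT and return all rows."""
        conn = self.pool.get_connection()
        try:
            with conn.cursor() as cur:
                cur.execute(sql, params or ())
                return cur.fetchall()
        finally:
            conn.close()
```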
log module
```python
## LoggerFactory.py
```
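A minimal sketch of a logger factory using the standard logging module (file name and format are placeholders):

```python
import logging

def get_logger(name, logfile="stock.log"):
    """Return a logger that writes to both the console and a log file."""
    logger = logging.getLogger(name)
    if logger.handlers:          # already configured -- avoid duplicate handlers
        return logger
    logger.setLevel(logging.INFO)
    fmt = logging.Formatter("%(asctime)s %(levelname)s [%(name)s] %(message)s")
    for handler in (logging.StreamHandler(),
                    logging.FileHandler(logfile, encoding="utf-8")):
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger
```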
Getting Stock Historical Data - Single Threaded Mode
```python
## stockyahoo.py
```
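Judging by the file name, the data source is Yahoo Finance; a single-threaded sketch, assuming access through the yfinance package (ticker list and date range are placeholders):

```python
import yfinance as yf   # pip install yfinance

TICKERS = ["600519.SS", "000001.SZ", "0700.HK"]   # Yahoo-style symbols

def fetch_one(ticker):
    """Download daily history for a single ticker as a DataFrame."""
    return yf.download(ticker, start="2022-01-01", end="2022-12-31", progress=False)

if __name__ == "__main__":
    for ticker in TICKERS:       # one request after another
        df = fetch_one(ticker)
        print(ticker, len(df), "rows")
        # at this point df could be written to the database via DBOperator
```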
Get stock history data - multi-threaded mode
```python
# -*- coding: UTF-8 -*-
```
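A multi-threaded sketch of the same download, using concurrent.futures; fetch_one and TICKERS are the placeholders from the single-threaded sketch:

```python
# -*- coding: UTF-8 -*-
from concurrent.futures import ThreadPoolExecutor, as_completed

import yfinance as yf

TICKERS = ["600519.SS", "000001.SZ", "0700.HK"]

def fetch_one(ticker):
    return yf.download(ticker, start="2022-01-01", end="2022-12-31", progress=False)

if __name__ == "__main__":
    # downloads are I/O bound, so threads overlap the network waits
    with ThreadPoolExecutor(max_workers=5) as pool:
        futures = {pool.submit(fetch_one, t): t for t in TICKERS}
        for future in as_completed(futures):
            print(futures[future], len(future.result()), "rows")
```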
A few commonly used interfaces for getting stock data
Brief Information
Request URL

```
http://qt.gtimg.cn/q=s_sh600519   // sh600519 is Kweichow Moutai's stock code; splice your own code into the URL
```
Response

```
v_s_sh600519="1~贵州茅台~600519~2075.95~42.95~2.11~38930~794942~~26078.04~GP-A";
```
This is the summary information for the stock code we spliced into the URL; the quoted string holds the stock's summary fields, separated by `~`:
| No. | Return value | Meaning |
| --- | --- | --- |
| 1 | 1 | Exchange: 200 = US (us), 100 = HK (hk), 51 = SZ (sz), 1 = SH (sh) |
| 2 | 贵州茅台 | Stock name |
| 3 | 600519 | Stock code |
| 4 | 2075.95 | Current price |
| 5 | 42.95 | Price change |
| 6 | 2.11 | Percentage change (%) |
| 7 | 38930 | Volume |
| 8 | 794942 | Turnover |
| 9 | 26078.04 | Total market value |
| 10 | GP-A | Security type (A-share) |
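A small sketch of calling the summary interface and splitting the reply; the GBK decoding is an assumption based on the interface returning Chinese text in that encoding:

```python
import requests

resp = requests.get("http://qt.gtimg.cn/q=s_sh600519")
resp.encoding = "gbk"                    # the reply contains GBK-encoded Chinese text
payload = resp.text.split('"')[1]        # the part inside the quotes
fields = payload.split("~")
print("name:", fields[1], "code:", fields[2], "price:", fields[3])
```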
Daily K-line data (returned in JSON format)
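A sketch of one commonly used Tencent daily K-line endpoint; the URL, its parameters and the field order of each bar are assumptions here and should be verified before relying on them:

```python
import requests

# assumed endpoint: param = "<code>,day,<start>,<end>,<count>,<adjust>"
url = "https://web.ifzq.gtimg.cn/appstock/app/fqkline/get?param=sh600519,day,,,320,qfq"
data = requests.get(url).json()
node = data["data"]["sh600519"]
bars = node.get("qfqday") or node.get("day")   # adjusted bars if present
print(bars[:2])   # each bar is roughly [date, open, close, high, low, volume, ...]
```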
Minute data
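A sketch of the corresponding minute-data endpoint; again the URL and the shape of the JSON are assumptions to double-check:

```python
import requests

url = "https://web.ifzq.gtimg.cn/appstock/app/minute/query"
data = requests.get(url, params={"code": "sh600519"}).json()
minutes = data["data"]["sh600519"]["data"]["data"]   # one string per minute
print(minutes[:3])   # e.g. "0930 2075.95 1234" (time, price, cumulative volume)
```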
Getting Historical Data via Stock Software
Download the data to your local machine through the stock trading software client.
I don’t need to tell you more about this.
Run the following code
```python
# coding: UTF-8
```
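A sketch of reading the locally exported day-line files, assuming the client is Tongdaxin (TDX), whose .day files store fixed 32-byte records; adjust the path and record layout if your software differs:

```python
# coding: UTF-8
import struct

def read_tdx_day(path):
    """Yield (date, open, high, low, close, amount, volume) from a TDX .day file."""
    with open(path, "rb") as f:
        while True:
            buf = f.read(32)
            if len(buf) < 32:
                break
            # date and prices are ints (prices in units of 0.01 yuan),
            # amount is a float, volume an int, and the last field is reserved
            date, o, h, l, c, amount, vol, _ = struct.unpack("<IIIIIfII", buf)
            yield date, o / 100, h / 100, l / 100, c / 100, amount, vol

if __name__ == "__main__":
    for row in read_tdx_day("vipdoc/sh/lday/sh600519.day"):   # placeholder path
        print(row)
```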
Buying the data directly
Buying the data directly requires no technical work and is also the most efficient route, but still, don't forget why you started. Some people's goal is Rome, while others were born in Rome; the two are not comparable. May we all soon reach the destination we have in mind.