Use Your Computer to Make Informed Decisions in Stock Trading: Practical Introduction — Part 2: Exploring Finance APIs
This second part of the series covers several well-known finance APIs that you can use in your Python code to obtain and analyse stock data. We’ll focus on APIs that are free or mostly free, meaning that requests to the key services, and most of the data an API provides, cost nothing.
This part assumes that you’ve already installed a Python environment on your local machine and have familiarised yourself with Google Colab, as discussed in Part 1 of this series. I encourage you to use Google Colab to try the examples in this article.
APIs to Obtain and Analyse Stock Data
You have several options when it comes to getting stock data programmatically. The most natural way is via an API, which you can call either directly through HTTP requests (for example, with the requests library) or through a Python wrapper library for the API. In this article, you’ll look at the following APIs:
- Yahoo Finance API
- Quandl API
- Pandas-datareader
- get-all-tickers library (a Python wrapper to an API provided by the NASDAQ marketplace)
In this part, we won’t go too deep into financial analysis tasks you can perform with these APIs but rather will look at their general capabilities. However, you’ll get a chance to get your hands dirty with the APIs.
General rules for API selection
There are many parameters you should take into consideration before choosing an API:
- Functionality. It is quite rare for one data source to have enough data to cover all your needs. For example, the Yahoo Finance API is most useful for daily data: hourly and minute-level history is available only for a limited recent period.
- Free / Paid. Many data sources have a free tier for trying their data (with a limited number of calls per day), while paid options can offer more granularity and no usage limits.
- Stability. Check when the last release was and how often the data source is updated. Many Python libraries have pages on pypi.org, where you can find the number of installs (more is better), GitHub stars and forks (more is better), and the current health status. For small projects, you should assume that the data source can sometimes be unreachable, or can return a null value or an error at any moment.
- Documentation. It is very handy to have all the details of the API calls covered in one place. For example, Quandl has a separate web page for many of its time series, with a detailed description and code snippets for different programming languages, such as this one: https://www.quandl.com/data/LBMA/GOLD-Gold-Price-London-Fixing
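The stability point above suggests wrapping every call to a data source defensively. Here is a minimal sketch of that idea. The helper fetch_with_retry and the stand-in flaky_source are hypothetical names invented for this illustration and are not part of any library discussed in this article:

```python
import time


def fetch_with_retry(fetch, retries=3, delay_seconds=0.0):
    """Call fetch() up to `retries` times; return None if every attempt fails or returns None."""
    for attempt in range(retries):
        try:
            result = fetch()
            if result is not None:
                return result
        except Exception as error:  # a real script would catch narrower exception types
            print(f"attempt {attempt + 1} failed: {error}")
        time.sleep(delay_seconds)
    return None


# Hypothetical stand-in for an unreliable data source: fails twice, then succeeds.
calls = {"count": 0}


def flaky_source():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("source unreachable")
    return {"Close": 33.66}


data = fetch_with_retry(flaky_source)
print(data)
```

In a notebook, you could pass any function that performs a single request, and treat a None result as "no data available right now".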
Yahoo Finance API
As mentioned in the previous section, it is very useful to look at the library’s page on the PyPI website (https://pypi.org/project/yfinance/). It shows that, as of mid-July 2020, the project is quite popular: it has 165k installs per month, 2,300 stars on GitHub, and 6,700 followers on Twitter, and the latest version was released only about half a year ago, in December 2019.
The Yahoo Finance API allows you to make 2,000 requests per hour per IP address, which amounts to 48,000 requests per day. Before obtaining stock data programmatically, let’s look at the Yahoo Finance website in a browser to see what can be found there. Suppose you want to look at historical stock prices for Pfizer Inc. (PFE). To do this, point your browser to https://finance.yahoo.com/quote/PFE/history?p=PFE
The key fragment of what you’ll see in your browser is shown in the following screenshot:
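The request limit mentioned above translates into a daily budget and a minimum spacing between requests. The sketch below works through that arithmetic and adds a small pacing helper. The Throttle class is a hypothetical illustration, not something yfinance provides:

```python
import time

REQUESTS_PER_HOUR = 2000  # limit quoted in the text

requests_per_day = REQUESTS_PER_HOUR * 24
min_interval_seconds = 3600 / REQUESTS_PER_HOUR
print(requests_per_day, min_interval_seconds)


class Throttle:
    """Hypothetical helper: sleeps so that calls are at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        now = self.clock()
        if self.last_call is not None:
            remaining = self.interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


# Create one throttle and call throttle.wait() before each request in a long loop.
throttle = Throttle(min_interval_seconds)
```

For a handful of tickers this is unnecessary; it only matters when you loop over thousands of symbols.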
Let’s now try to get some stock data programmatically from Yahoo Finance. Go to Google Colab, as described in Part 1, and open a new notebook for the examples in this article.
First, install the yfinance library in your notebook:
!pip install yfinance
Now, suppose you first want to look at some general information, including financials, about the company of interest. This can be done as follows (use a new code cell in your Colab notebook for this code):
import yfinance as yf

pfe = yf.Ticker('PFE')
pfe.info
The output is truncated to save space:
{
'zip': '10017',
'sector': 'Healthcare',
'fullTimeEmployees': 88300,
'longBusinessSummary': 'Pfizer Inc. develops, manufactures, and sells healthcare products worldwide. It offers …',
'city': 'New York',
'phone': '212-733-2323',
'state': 'NY',
'country': 'United States',
…
'profitMargins': 0.31169,
'enterpriseToEbitda': 11.87,
'52WeekChange': -0.15343297,
…
}
For details on dividends and stock splits, you can take advantage of the actions property of the Ticker object:
pfe.actions
In this particular example, this should produce the following output:
Date        Dividends  Stock Splits
1972-08-29  0.00333    0.0
1972-11-28  0.00438    0.0
1973-02-28  0.00333    0.0
1973-05-30  0.00333    0.0
1973-08-28  0.00333    0.0
…           …          …
2019-05-09  0.36000    0.0
2019-08-01  0.36000    0.0
2019-11-07  0.36000    0.0
2020-01-30  0.38000    0.0
2020-05-07  0.38000    0.0
The dividend started at as little as $0.00333 in cash per share in 1972 and reached $0.38 per share in 2020.
Dividends can affect a stock’s price in many ways and change previously observed growth patterns. For example, a company may raise its dividend at some point to show that it is ready to return more of its earnings to shareholders and increase their investment income. This can push the stock price up if the market believes that the company’s management intends to keep paying high dividends.
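To put the dividend growth into perspective, you can turn the first and last values from the output above into a growth multiple and an average annual growth rate. This is a rough sketch: it assumes the two figures are directly comparable, and it uses whole calendar years rather than exact dates:

```python
first_dividend = 0.00333   # 1972-08-29, from the output above
last_dividend = 0.38       # 2020-05-07, from the output above
years = 2020 - 1972        # rough span; ignores the exact months

growth_multiple = last_dividend / first_dividend
# Compound annual growth rate: the constant yearly rate that produces the same multiple.
annual_growth = growth_multiple ** (1 / years) - 1

print(f"Dividend grew {growth_multiple:.0f}x, about {annual_growth:.1%} per year")
```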
Now suppose you want to obtain historical stock prices for Pfizer Inc. over the past six months. This can be done as follows:
hist = pfe.history(period="6mo")
Depending on your needs, you can specify another period. Your options include: 1d, 5d, 1mo, 3mo, 6mo, 1y, 2y, 5y, 10y, ytd, max. The hist variable in the previous snippet now contains the requested stock data. But in what format? You can check this as follows:
type(hist)
<class 'pandas.core.frame.DataFrame'>
As you can see, yfinance returns data in the pandas dataframe format. So you can use the dataframe’s info() method to print a concise summary:
hist.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 125 entries, 2020-01-13 to 2020-07-10
Data columns (total 7 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   Open          125 non-null    float64
 1   High          125 non-null    float64
 2   Low           125 non-null    float64
 3   Close         125 non-null    float64
 4   Volume        125 non-null    int64
 5   Dividends     125 non-null    float64
 6   Stock Splits  125 non-null    int64
Suppose you’re interested in open prices only. The necessary selection from the original dataframe can be done as follows:
df1 = hist[['Open']]
print(df1)
Printing the df1 dataframe produces the following output:
             Open
Date
2020-01-13  38.83
2020-01-14  38.65
2020-01-15  39.39
2020-01-16  39.98
2020-01-17  39.76
…            …
2020-07-06  34.95
2020-07-07  34.05
2020-07-08  34.01
2020-07-09  33.73
2020-07-10  33.66
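A common next step with a price column like this is to convert prices into day-over-day returns. The sketch below does this in plain Python for the first five open prices shown above. With the dataframe itself, df1.pct_change() computes the same quantity:

```python
# First five open prices from the output above.
open_prices = [38.83, 38.65, 39.39, 39.98, 39.76]

# Simple return between consecutive days: price_today / price_yesterday - 1.
daily_returns = [
    today / yesterday - 1
    for yesterday, today in zip(open_prices, open_prices[1:])
]

for value in daily_returns:
    print(f"{value:+.2%}")
```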
Some Examples on Using Yahoo Finance API
Now that you know how yfinance works in general, let’s look at how it might be used in some simple examples. Say you want to look at the one-year stock price history for the following companies:
tickers = ['TSLA', 'API', 'LMND', 'MRK']
Some info on the above tickers and their recent performance:
- TSLA (Tesla Inc): shows the most impressive growth. Although many investors were “shorting” the stock (betting on its decline), it has grown remarkably in recent months.
- API (Agora Inc) and LMND (Lemonade Inc): companies that recently had their IPOs. Their prices are quite volatile in the first months and can jump 10–20% in a matter of days. This offers an opportunity to make profits quickly but also carries more risk, as these stocks can fall just as quickly.
- MRK (Merck & Co., Inc.): like many other stocks across most verticals, it dropped around 2020-03 (the Covid-19 effect) and has now recovered to the previous year’s levels.
For clarity, you might want to make a plot for each company:
import matplotlib.pyplot as plt

# One subplot per ticker, stacked vertically
for i, ticker in enumerate(tickers):
    current_ticker = yf.Ticker(ticker)
    plt.subplot(len(tickers), 1, i + 1)
    current_ticker.history(period='1y')['Close'].plot(figsize=(16, 60), title='1 year price history for ticker: ' + ticker)
The plot for Tesla Inc might look as illustrated in the following figure:
Continuing with this example, suppose you want to look at a particular financial parameter of a certain company from the list of tickers you defined here:
ticker = tickers[0]
yf_info = yf.Ticker(ticker).info
print(ticker)
TSLA
You already saw an example of using the info property of a Ticker object at the beginning of this section. If you recall, info includes a lot of parameters related to the company, both general and financial. You can extract the one you need as follows:
# an easy way to get 1-year stock growth
yf_info['52WeekChange']
4.8699937
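Direct indexing as shown above raises a KeyError if a key is absent. Assuming that some tickers may lack a given key or carry None for it (an assumption, in line with the stability advice earlier), a defensive lookup could be sketched as follows. Both get_metric and sample_info are hypothetical names used only for this illustration:

```python
# Hypothetical, shortened stand-in for the dictionary returned by Ticker.info.
sample_info = {"sector": "Healthcare", "profitMargins": 0.31169}


def get_metric(info, key, default=None):
    """Return info[key] if present and not None, otherwise the default."""
    value = info.get(key)
    return default if value is None else value


print(get_metric(sample_info, "profitMargins"))
print(get_metric(sample_info, "52WeekChange"))
```

Returning None instead of raising lets a loop over many tickers continue and leaves the decision about missing values to a later step.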
The following example illustrates how you can compare two financial parameters: 52WeekChange and profitMargins for several tickers:
stock_52w_change = []
profitsMargins = []
tickers = ['NVS', 'JNJ', 'ABBV', 'AMGN']
for ticker in tickers:
    print(ticker)
    current_ticker = yf.Ticker(ticker)
    current_ticker_info = current_ticker.info
    stock_52w_change.append(current_ticker_info['52WeekChange'])
    profitsMargins.append(current_ticker_info['profitMargins'])
You’ll combine the stock_52w_change and profitsMargins lists created in the above code cell into a Pandas dataframe:
import pandas as pd

# Row labels are given as a list, in the same order as the data rows
df = pd.DataFrame([stock_52w_change, profitsMargins], columns=tickers, index=['52w change', 'profitMargins'])
print(df)
                   NVS       JNJ      ABBV      AMGN
52w change    -0.06242  0.160992  0.482794  0.469441
profitMargins  0.24318  0.188630  0.247700  0.320250
Note that the index labels are passed as a list rather than a set. The rows follow the order of the lists given to pd.DataFrame, and a set has no guaranteed order, so with a set the labels could end up attached to the wrong rows.
You might also want to look at a visual representation of this comparison:
import matplotlib.ticker as mtick

ax = df.plot.bar()
ax.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=1))
ax.set_title('Comparing Profit Margins and 52 weeks growth rates for pharma stocks')
This code should generate the following bar chart:
Interestingly, the four companies from the same pharma sector have fairly similar values of one of the most important financial ratios, profitMargins (profitMargins = [Net Income / Net Sales] * 100%), ranging from 19% to 32%, while their stock price performance differs widely: the 52-week change (roughly a one-year change) ranges from -6% to 48%.
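The profit margin formula quoted above is easy to apply by hand. The following sketch uses made-up figures chosen only for illustration, not data for any of the companies discussed:

```python
def profit_margin(net_income, net_sales):
    """Profit margin as a fraction: net income divided by net sales."""
    if net_sales == 0:
        raise ValueError("net sales must be non-zero")
    return net_income / net_sales


# Hypothetical figures in millions of USD, chosen only for illustration.
margin = profit_margin(net_income=250.0, net_sales=1000.0)
print(f"{margin:.0%}")
```

The function returns a fraction (0.25 here), which matches the scale of the profitMargins values in the dataframe. Multiply by 100 to express it as a percentage.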
Getting Data for S&P500 Index
It’s common practice to compare a stock’s performance and ‘health check’ against an index value, which represents the aggregated average performance of a set of stocks. The best-known index is probably the S&P 500. You can get it with pandas-datareader, which uses Stooq as one of its data sources. The list of all available indexes is on this page.
With the following code, you can create a plot for 1 year price history for index S&P500:
import pandas_datareader.data as pdr
from datetime import date, datetime, timedelta

end = date.today()
# Start about one year back, with a two-day margin; timedelta avoids invalid day numbers
start = datetime(end.year, end.month, end.day) - timedelta(days=367)
# Daily S&P 500 index values from Stooq
spx_index = pdr.get_data_stooq('^SPX', start, end)
spx_index['Close'].plot(title='1 year price history for index S&P500')
This should generate the following plot:
As you can see from the plot, the S&P 500 was around 3,000 a year ago, dropped to about 2,200 in March, and has since climbed to about 3,200 (a ~8% increase in one year). Over the last three months the market has been ‘bullish’, with steadily rising prices for the index and for many individual stocks.
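The drop and recovery described above can be quantified with a maximum drawdown calculation, the largest decline from a previous peak. The sketch below uses only the three rounded index levels quoted in the text, so its results are approximate. With these rounded inputs the one-year change comes out at about 7%, in line with the roughly 8% read from the actual data:

```python
def max_drawdown(values):
    """Largest peak-to-trough decline in a series, as a negative fraction."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1)
    return worst


# Rounded index levels quoted in the text: a year ago, the March low, and now.
levels = [3000, 2200, 3200]

print(f"max drawdown: {max_drawdown(levels):.1%}")
print(f"one-year change: {levels[-1] / levels[0] - 1:.1%}")
```

The same function works on a full list of closing prices, for example one built from the Close column of the dataframe above.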
Quandl API
Quandl brings together millions of financial and economic datasets from hundreds of sources, providing access to them via a single free API. This diversity of sources lets you look at other classes of investment, say gold or bitcoin, and compare them with the performance of stocks. To make this API available in your Colab notebook, install its Python wrapper with the following command:
!pip install quandl
After successful installation, you can start using it as illustrated in the following example. Here, you request the London gold prices, which are considered the international standard for gold pricing.
import quandl

london_fixing_gold_price = quandl.get("LBMA/GOLD", start_date=start, end_date=end, authtoken="<your auth token>")
(You will need to create an account, describe the purpose of using Quandl data, and get a free token.)
The gold price in London is set twice a day, so you might want to look at your options before making a plot:
print(london_fixing_gold_price.columns)
Index(['USD (AM)', 'USD (PM)', 'GBP (AM)', 'GBP (PM)', 'EURO (AM)', 'EURO (PM)'], dtype='object')
Suppose you want to look at the morning prices in USD:
london_fixing_gold_price['USD (AM)'].plot(figsize=(20, 5), title="Gold Price: London Fixing in USD(AM price)")
plt.show()
The generated plot might look as follows:
# a harder way to get 1-year growth, controlling the exact dates
london_fixing_gold_price['USD (AM)']['2020-07-17'] / london_fixing_gold_price['USD (AM)']['2019-07-17']
1.2870502569960023
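Indexing by an exact date, as above, fails if the series has no row for that day, for instance on a weekend or a holiday. One way around this is to fall back to the latest earlier date that does have a price. The sketch below shows the idea in plain Python. The prices dictionary holds made-up values and price_on_or_before is a hypothetical helper. Real data would come from the dataframe returned by quandl.get:

```python
from bisect import bisect_right
from datetime import date

# Hypothetical sample of dated prices; a real series would come from quandl.get.
prices = {
    date(2019, 7, 16): 1410.0,
    date(2019, 7, 18): 1420.0,
    date(2020, 7, 17): 1800.0,
}


def price_on_or_before(series, day):
    """Return the price for `day`, or for the latest earlier date if `day` is missing."""
    dates = sorted(series)
    position = bisect_right(dates, day)
    if position == 0:
        raise KeyError(f"no data on or before {day}")
    return series[dates[position - 1]]


start_price = price_on_or_before(prices, date(2019, 7, 17))  # falls back to 2019-07-16
end_price = price_on_or_before(prices, date(2020, 7, 17))
print(f"growth: {end_price / start_price - 1:.1%}")
```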
List of Tickers
Sometimes you may simply need to get a complete list of ticker symbols from a certain marketplace or several marketplaces. This is where the get-all-tickers library may come in very handy. In this section, we’ll touch upon this library, which is actually a wrapper to an API provided by the NASDAQ marketplace. This library allows you to retrieve all the tickers from the three most popular stock exchanges: NYSE, NASDAQ, and AMEX.
The library is open source, and you can find it in this GitHub repository. If you explore the code, you’ll find that it extracts the tickers from several CSV files:
_NYSE_URL = 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=nyse&render=download'
_NASDAQ_URL = 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=nasdaq&render=download'
_AMEX_URL = 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=amex&render=download'
So, using the above links, you can download a list of tickers directly as a file (and then import it into Python) without using the library, which is especially useful when a request to the library hangs (this sometimes happens).
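Once you have such a file, reading the ticker symbols from it takes only a few lines. The sketch below parses CSV text with the standard csv module. The sample rows and the "Symbol" column name are assumptions made for illustration, so check the header of the file you actually download:

```python
import csv
import io

# Hypothetical excerpt of a downloaded file; the "Symbol" column name is an assumption.
sample_csv = '''"Symbol","Name","Sector"
"PFE","Pfizer, Inc.","Health Care"
"MRK","Merck & Company, Inc.","Health Care"
'''


def read_tickers(csv_text, column="Symbol"):
    """Return the stripped, non-empty values of one column from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row.get(column) or "").strip()
        for row in reader
        if (row.get(column) or "").strip()
    ]


print(read_tickers(sample_csv))
```

For a file saved on disk, you would read its contents first, for example with open(path, newline="").read(), and pass the text to the same function.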
Before you can start using the get-all-tickers library, you’ll need to install it. This can be done with the pip command as follows in a code cell in your Colab notebook:
!pip install get-all-tickers
After the successful installation, issue the following line of code in a new code cell to get all tickers from NYSE and NASDAQ stock exchanges:
from get_all_tickers import get_tickers as gt

list_of_tickers = gt.get_tickers(NYSE=True, NASDAQ=True, AMEX=False)
The first thing you might want to do is to check the number of returned tickers:
len(list_of_tickers)
6144
Then you might want to look at the tickers:
print(list_of_tickers)
Conclusion
In this second part of the series, you got familiar with some interesting finance APIs and looked at some examples of their usage.
You learned that gold grew by about 29% over the last year, the S&P 500 index rose about 8%, the four selected pharma stocks returned between -6% and +48%, and Tesla Inc stock made an incredible jump of roughly 5x. Most of these are good examples of profitable investing.
In the next part, you’ll learn how sentiment analysis applied to news articles about stock movements can help you make informed decisions in your trading operations.