The Zipline Python library allows us to write and test trading algorithms. We can follow the Zipline tutorial to set up the library. Once we have market data, either the sample data provided or data we source ourselves, running an algorithm is straightforward.
However, the examples and documentation do not explicitly cover how to build more complex and dynamic algorithms based on stock factors. Here is the example from the library that uses the Pipeline class to sort by one factor (RSI).
We will demonstrate how to filter a universe of stocks using multiple factors to create a dynamic algorithm, and use custom factors not included within the default library.
Code
Import Statements
First we will define the import statements. The order function allows us to place orders. Next we list the pipeline-related imports that allow us to define factor criteria. In this example we will use AnnualizedVolatility, RSI and BollingerBands as factors. We will add a custom factor later on, and we will also import pricing data to use as a criterion for stock selection.
import numpy as np
from zipline.api import order
#API imports for pipeline
from zipline.api import attach_pipeline
from zipline.api import pipeline_output
from zipline.pipeline import Pipeline
from zipline.pipeline.data import USEquityPricing
from zipline.pipeline.factors import AnnualizedVolatility, RSI, BollingerBands, CustomFactor
Factor Instantiation
Next, all of the imported factors need to be instantiated as variables, as seen in the last five lines of the snippet below. Some factors take further arguments. For example, BollingerBands computes the simple moving average of price over window_length look-back periods, along with the values k standard deviations above and below that moving average.
def initialize(context):
    #runs once when script starts
    #context is a persistent namespace for algorithm state (e.g. context.portfolio)
    #Factor criteria
    close_price = USEquityPricing.close.latest
    vol = USEquityPricing.volume.latest
    ann_std = AnnualizedVolatility(annualization_factor = 252)
    rsi = RSI(window_length = 15)
    bol252 = BollingerBands(window_length = 252, k = 2)
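To make the Bollinger Bands description concrete, here is a small NumPy sketch of the calculation for a single stock. It is an illustration only, not Zipline's implementation: the bollinger_bands helper and the closing prices are made up, and we assume the population standard deviation (NumPy's default).

```python
import numpy as np

def bollinger_bands(closes, k=2.0):
    """Return (lower, middle, upper) for one window of closing prices.

    Hypothetical helper for illustration; assumes population std (ddof=0).
    """
    closes = np.asarray(closes, dtype=float)
    middle = np.nanmean(closes)      # simple moving average over the window
    spread = k * np.nanstd(closes)   # k standard deviations
    return middle - spread, middle, middle + spread

window = [10.0, 12.0, 14.0, 12.0, 12.0]  # invented closes, oldest first
lower, middle, upper = bollinger_bands(window, k=2)
print(lower, middle, upper)
```

The middle value is what bol252.middle exposes in the pipeline, computed there over 252 days for every stock at once.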
Custom Factor Definition
Now we will define a custom factor using the imported USEquityPricing data. A custom factor class takes as inputs an iterable of Zipline's BoundColumn objects; in this case we use USEquityPricing.close and USEquityPricing.open. The inputs and window_length set on the class are default values and can be overridden when instantiating the class. The computation is done with numpy inside the compute method, which receives one array per input in the order the inputs are listed (here close, then open) and writes its result into out. Quantopian explains the basis of custom factors here. Our custom factor computes the average open-to-close percent change per day over the window, which is 10 days by default. Lastly we instantiate the class, overriding the default window length with 5 days. In the final code we will place the class definition before the initialize function.
class NDayMeanPctDif(CustomFactor):
    #Default inputs.
    inputs = [USEquityPricing.close, USEquityPricing.open]
    window_length = 10

    def compute(self, today, asset_ids, out, close, open):
        #Calculates the column-wise mean difference, ignoring NaNs
        out[:] = np.nanmean((close - open) / open, axis = 0)

mean_pct_dif = NDayMeanPctDif(window_length = 5)
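To see what the compute method does with its arrays, we can run the same expression outside Zipline on a tiny example. This is a sketch with invented prices: we assume each input arrives as a 2-D array with one row per day and one column per asset, and we name the variable open_ only to avoid shadowing Python's built-in open.

```python
import numpy as np

# Rows are days (oldest first), columns are assets; all values are invented.
open_ = np.array([[10.0, 20.0],
                  [10.0, np.nan],   # missing open for the second asset
                  [10.0, 20.0]])
close = np.array([[11.0, 20.0],
                  [10.0, 21.0],
                  [12.0, 22.0]])

daily_pct = (close - open_) / open_      # open-to-close change per day and asset
out = np.empty(close.shape[1])           # one output slot per asset
out[:] = np.nanmean(daily_pct, axis=0)   # average down each column, skipping NaNs
print(out)
```

The first asset averages 10%, 0% and 20% to 10%; the second skips the day with the missing open and averages 0% and 10% to 5%.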
Constructing the Pipeline
Next we use the defined variables to construct a boolean expression that constrains the factors and creates a stock universe. The mask_custom boolean expression sets the universe of available stocks, and stockBasket gets the 5 stocks with the highest average price change over the last 5 days within the mask_custom universe.
With the code below we are telling Zipline to find stocks that meet the following conditions:
- The price is greater than $15.
- The volume is greater than 100,000 shares traded per day.
- The price is higher than the 252-day moving average.
- The annualized volatility of the stock is greater than 50%.
- The RSI over a 15-day period is less than 30.
- Finally, the custom factor requires the average price change over a 5-day period to be greater than 1%.
#screening
mask_custom = ((close_price > 15)
               & (vol > 100000)
               & (close_price > bol252.middle)
               & (ann_std > 0.50)
               & (rsi < 30)
               & (mean_pct_dif > 0.01))
stockBasket = mean_pct_dif.top(5, mask = mask_custom)
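The effect of combining a mask with top can be sketched in plain Python. This is only an illustration of the logic, not how Zipline evaluates it: the tickers, the values, and the passes and top_n helpers are all invented, and only three of the six conditions are included to keep it short.

```python
# Invented factor values for a single day.
rows = {
    "AAA": {"close": 20.0, "volume": 500_000, "mean_pct_dif": 0.030},
    "BBB": {"close": 12.0, "volume": 900_000, "mean_pct_dif": 0.050},  # fails price
    "CCC": {"close": 40.0, "volume": 50_000,  "mean_pct_dif": 0.040},  # fails volume
    "DDD": {"close": 30.0, "volume": 200_000, "mean_pct_dif": 0.020},
    "EEE": {"close": 18.0, "volume": 300_000, "mean_pct_dif": 0.005},  # fails factor
}

def passes(row):
    """Stand-in for the mask: every condition must hold."""
    return (row["close"] > 15
            and row["volume"] > 100_000
            and row["mean_pct_dif"] > 0.01)

def top_n(rows, n):
    """Rank only the stocks that pass the mask, highest factor value first."""
    universe = {ticker: row for ticker, row in rows.items() if passes(row)}
    ranked = sorted(universe, key=lambda t: universe[t]["mean_pct_dif"], reverse=True)
    return ranked[:n]

basket = top_n(rows, 2)
print(basket)
```

BBB has the highest factor value but never reaches the ranking because the mask removes it first, which is the point of passing mask to top.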
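In the code below, pipe is the Pipeline object that describes which columns to compute and which screen to apply; the DataFrame of matching stocks is produced later by pipeline_output. attach_pipeline registers the pipeline with the algorithm under the name "Stocks".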
#Column construction
pipe_columns = {"close_price": close_price, "volume": vol, 'ann_std': ann_std, 'rsi': rsi, "mean_pct_dif": mean_pct_dif}
pipe = Pipeline(columns = pipe_columns, screen = stockBasket)
attach_pipeline(pipe, "Stocks")
Running the Pipeline
In Zipline the handle_data function runs once per bar: every minute or every day, depending on the data frequency. The pipeline_output function returns the latest results of the pipeline registered with attach_pipeline, as a DataFrame with one row per selected stock and one column per pipeline column. For this example we will buy 100 shares of each stock output by the pipeline.
def handle_data(context, data):
    context.days_stocks = pipeline_output('Stocks')
    print(context.days_stocks)
    #The stocks are in the DataFrame's index; iterating the frame itself yields column names
    for stock in context.days_stocks.index:
        order(stock, 100)
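Because the pipeline output is a DataFrame with one row per stock, the orders need to loop over its index rather than over the frame itself. The pandas sketch below shows the difference using a stand-in frame; the tickers and values are invented, and in a real run the index holds Zipline asset objects rather than strings.

```python
import pandas as pd

# Stand-in for pipeline_output('Stocks'); tickers and values are invented.
output = pd.DataFrame(
    {"close_price": [20.0, 30.0], "rsi": [25.0, 28.0]},
    index=["AAA", "DDD"],
)

columns_seen = [name for name in output]         # iterating a DataFrame yields column labels
assets_seen = [asset for asset in output.index]  # the index holds the selected stocks

print(columns_seen)
print(assets_seen)
```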
Conclusion
We constructed a custom factor and created an algorithm that trades a dynamic universe of stocks, which changes as different stocks meet our specified requirements over time. This algorithm doesn't complete a round-trip, but it gives an idea of how to use multiple factors simultaneously. Below is the complete example code:
import numpy as np
from zipline.api import order
#API imports for pipeline
from zipline.api import attach_pipeline
from zipline.api import pipeline_output
from zipline.pipeline import Pipeline
from zipline.pipeline.data import USEquityPricing
from zipline.pipeline.factors import AnnualizedVolatility, RSI, BollingerBands, CustomFactor

class NDayMeanPctDif(CustomFactor):
    #Default inputs.
    inputs = [USEquityPricing.close, USEquityPricing.open]
    window_length = 10

    def compute(self, today, asset_ids, out, close, open):
        #Calculates the column-wise mean difference, ignoring NaNs
        out[:] = np.nanmean((close - open) / open, axis = 0)

def initialize(context):
    #runs once when script starts
    #context is a persistent namespace for algorithm state (e.g. context.portfolio)
    #Factor criteria
    close_price = USEquityPricing.close.latest
    vol = USEquityPricing.volume.latest
    ann_std = AnnualizedVolatility(annualization_factor = 252)
    rsi = RSI(window_length = 15)
    bol252 = BollingerBands(window_length = 252, k = 2)
    mean_pct_dif = NDayMeanPctDif(window_length = 5)
    #screening
    mask_custom = ((close_price > 15) & (vol > 100000) & (close_price > bol252.middle) & (ann_std > 0.50) & (rsi < 30) & (mean_pct_dif > 0.01))
    stockBasket = mean_pct_dif.top(5, mask = mask_custom)
    #Column construction
    pipe_columns = {"close_price": close_price, "volume": vol, 'ann_std': ann_std, 'rsi': rsi, "mean_pct_dif": mean_pct_dif}
    pipe = Pipeline(columns = pipe_columns, screen = stockBasket)
    attach_pipeline(pipe, "Stocks")

def handle_data(context, data):
    context.days_stocks = pipeline_output('Stocks')
    print(context.days_stocks)
    for stock in context.days_stocks.index:
        order(stock, 100)
We run the code in a Jupyter notebook or on the command line with something like the following:
%zipline --bundle <bundle_name> --start 2021-1-6 --end 2022-2-1 --capital-base 25000 -o strat_test.pickle --no-benchmark --data-frequency daily