Stock Trading Algorithm on top of Market Event Study

This post is the result of the first six weeks of class from the Computational Investing course I’m currently following on Coursera. The course is an introduction to Portfolio Management and Optimization in Python and lays the foundations for the second part of the track, which will deal with Machine Learning for Trading. Let’s move quickly to the core of the business.

The question I want to answer is the following:

  • Is it possible to exploit event studies in a trading strategy?

First of all we should clarify what an event study is. As Wikipedia states, an event study is a statistical method to assess the impact of an event on the value of a firm. This definition is very broad and can easily cover facts directly concerning the company (e.g. the private life of the CEO, mergers with other firms, confidential news from insiders) as well as anomalous fluctuations in the price of the stock. I naively (and maybe incorrectly) split events regarding a company into these two types, news related and market related, but in practice the two are generally tightly correlated. In any case, since it is not easy to access and parse news feeds in real time, we will focus on market related events: in the rest of the post an event is an anomalous behavior in the price of a stock whose consequences we could exploit to trade more efficiently.

Now that we have properly defined an event we can go back to the beginning and think a little more about what studying an event really means. To understand it let’s walk through a complete example and suppose that we have an event whenever the closing price of a stock at the end of day i is below $10 while at the end of day i-1 it was at least $10. Thus we are examining a drop in the price of the stock through the $10 level. Given this definition the question is: what happens, statistically, to the prices of stocks that experience this kind of fluctuation? Is there a trend that could somehow be exploited? The reason behind these questions is that if we knew in advance that a stock followed a specific pattern as a consequence of some event we could adjust our trading strategy accordingly. If the statistics suggest that the price is bound to increase, it may be a good idea to go long the shares, whereas in the opposite case the better decision is to go short.
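
To make the event definition concrete, here is a minimal sketch in plain Python (no QSTK or pandas needed). The function name find_crossings and the sample prices are made up for illustration; the comparison mirrors the one used in find_events() further down, where yesterday's close may be equal to the threshold.

```python
def find_crossings(closes, threshold=10.0):
    """Return the indices i where the close drops below the threshold
    after having been at or above it on day i-1."""
    events = []
    for i in range(1, len(closes)):
        if closes[i] < threshold and closes[i - 1] >= threshold:
            events.append(i)
    return events


# Hypothetical closing prices for one stock, one value per trading day.
closes = [10.8, 10.2, 9.7, 9.9, 10.4, 9.5]
print(find_crossings(closes))  # -> [2, 5]
```

Day 3 is not an event because the stock was already below the threshold on day 2: only the crossing itself counts.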

In order to run an event study we take advantage of the EventProfiler class inside the QSTK library. This class allows us to define an event and then, given a time interval and a list of stocks, it works in the following way: it goes through the firms one after the other and whenever it finds an event it sets that day as day 0. Then it looks 20 days before and 20 days after the event and saves that time window. After having analyzed all the stocks it aligns the events on day 0, averages all the prices before and after and scales the result by the market (SPY). The output is a chart which basically answers this question: what happens on average when the closing price of a stock at the end of day i is below $10 while at the end of day i-1 it was at least $10?

The test period runs from 1 January 2008 to 31 December 2009 (in the middle of the financial crisis), and the stocks chosen are the 500 contained in the S&P 500 index in 2012. The graph is shown below and the following information can be extracted from it. First, 461 such events were registered during the investigated time frame. Second, on the day of the event there is a drop of about 10% in the stock price w.r.t. the day before. Third, the price seems to recover after day zero, even though the confidence intervals of the daily increase are huge.

[Figure: SPY2012_10$]
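
The align-and-average idea can be sketched in a few lines of plain Python. This is a simplified illustration and not QSTK's actual implementation: the function event_profile, its arguments and the sample data are all assumptions, and the real EventProfiler also draws the error bars mentioned above.

```python
def event_profile(prices, market, event_days, lookback=20, lookforward=20):
    """Average market-relative price path around the events.

    prices:     dict symbol -> list of closes
    market:     list of benchmark closes on the same days
    event_days: list of (symbol, day_index) pairs
    Every window is rebased so that day 0 equals 1.0.
    """
    windows = []
    for sym, day in event_days:
        series = prices[sym]
        # skip events too close to the edges to have a full window
        if day - lookback < 0 or day + lookforward >= len(series):
            continue
        window = []
        for offset in range(-lookback, lookforward + 1):
            stock_rel = series[day + offset] / series[day]
            market_rel = market[day + offset] / market[day]
            window.append(stock_rel / market_rel)
        windows.append(window)
    if not windows:
        return []
    width = lookback + lookforward + 1
    return [sum(w[k] for w in windows) / len(windows) for k in range(width)]


# Hypothetical data: two stocks with an event on day 2 and a flat market.
prices = {'AAA': [12.0, 11.0, 9.0, 9.9, 10.8],
          'BBB': [11.0, 10.0, 8.0, 8.0, 8.8]}
market = [100.0] * 5
profile = event_profile(prices, market, [('AAA', 2), ('BBB', 2)],
                        lookback=1, lookforward=2)
print(profile)  # one averaged value per day, from day -1 to day +2
```

Dividing by the market path is what keeps a general market move from being mistaken for an effect of the event.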

Now the idea is the following. If the observed behavior holds, we can build a trading strategy that buys on the day of the event and sells, let’s say, 5 trading days later (we don’t want to hold too long, even though the price increases almost monotonically). Just to recap, here is the whole pipeline from event definition to portfolio assessment.

[Figure: trade]

Now that we have a plan let’s dive into the code (you can find all the code on GitHub).

# import statements
import pandas as pd
import numpy as np
import math
import copy
import sys
import matplotlib.pyplot as plt
from pylab import *
import QSTK.qstkutil.qsdateutil as du
import datetime as dt
import QSTK.qstkutil.DataAccess as da
import QSTK.qstkutil.tsutil as tsu
import QSTK.qstkstudy.EventProfiler as ep
# save the marketsim() function as marketsim.py
from marketsim import marketsim

First of all I’ll introduce the two main functions one after the other.

find_events(ls_symbols, d_data, shares=100): given the list of the stocks in the portfolio, their historical prices and the number of shares to be traded, it identifies the events and issues a Buy order on the day of each event and a Sell order 5 trading days later. Finally it writes a csv file (event-orders.csv) to be passed to the market simulator. The first lines of the csv file are previewed below (year, month, day, stock, order, shares).

[Figure: orders]

def find_events(ls_symbols, d_data, shares=100):
    ''' Find the events and write the matching orders to event-orders.csv '''
    df_close = d_data['actual_close']
    orders = ''
    print "Finding Events"

    ldt_timestamps = df_close.index

    for s_sym in ls_symbols:
        # stop 5 days before the end so that the sell date always exists
        for i in range(1, len(ldt_timestamps)-5):

            f_symprice_today = df_close[s_sym].ix[ldt_timestamps[i]]
            f_symprice_yest = df_close[s_sym].ix[ldt_timestamps[i - 1]]

            # event: the close drops below 10$ after a close at or above 10$
            if f_symprice_today < 10 and f_symprice_yest >= 10:
                # buy on the day of the event
                buy_time = pd.to_datetime(ldt_timestamps[i]).strftime('%Y,%m,%d,')
                buy_order = buy_time+str(s_sym)+',Buy,'+str(shares)+',\n'
                orders += buy_order

                # sell 5 trading days later
                sell_time = pd.to_datetime(ldt_timestamps[i+5]).strftime('%Y,%m,%d,')
                sell_order = sell_time+str(s_sym)+',Sell,'+str(shares)+',\n'
                orders += sell_order

    # every row ends with a trailing comma (marketsim drops the empty column)
    with open('event-orders.csv', 'w') as f_orders:
        f_orders.write(orders)

    print 'Saved orders to csv file'
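
As a side note on the file format, every row written by find_events() ends with a comma, so a csv parser sees one extra empty field at the end of each row. The sketch below reads such rows back with Python's standard csv module. The function parse_orders, the symbol AAA and the two sample rows are made up for illustration.

```python
import csv
import io
from datetime import date

# Two hypothetical rows in the format year,month,day,stock,order,shares,
sample = "2008,01,10,AAA,Sell,100,\n2008,01,03,AAA,Buy,100,\n"


def parse_orders(text):
    """Parse order rows and return them sorted by date."""
    orders = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        # the trailing comma produces a 7th, empty field that we ignore
        year, month, day, symbol, side, shares = row[:6]
        when = date(int(year), int(month), int(day))
        orders.append((when, symbol, side, int(shares)))
    return sorted(orders, key=lambda order: order[0])


for order in parse_orders(sample):
    print(order)
```

Sorting by date matters because find_events() writes the orders stock by stock, not in chronological order.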

marketsim(investment, orders_file, out_file): given the initial investment in dollars ($50,000 in our case), the csv file containing all the orders (the output of find_events()) and the file where the results of the simulation are saved, this function executes the orders in chronological order and automatically updates the value of the portfolio. It writes a csv file with the portfolio value over time, shows a plot comparing the portfolio performance against the market benchmark and prints a summary of the main financial metrics used to evaluate the portfolio.

def marketsim(investment, orders_file, out_file):
    df = pd.read_csv(orders_file, parse_dates=[[0,1,2]], header=None)
    
    df.columns = ['date', 'stock', 'order', 'shares', 'no']
    df = df.drop('no',1)
    df = df.sort('date', 0)
    df = df.reset_index(drop=True)
    df['date'] = df['date'] + dt.timedelta(hours=16)
    
    start_date = df['date'][0]
    end_date = df['date'][df.shape[0]-1]
    dt_timeofday = dt.timedelta(hours=16)
    ldt_timestamps = du.getNYSEdays(start_date, end_date, dt_timeofday)
    c_dataobj = da.DataAccess('Yahoo')
    
    ls_keys = ['close']
    equities = list(df.stock.unique())
    
    data = c_dataobj.get_data(ldt_timestamps, equities, ls_keys)[0]
    data['cash'] = float(investment)
    for equity in equities:
        data['shares_'+equity] = 0
    for row in range(df.shape[0]):
        order = df.ix[row]
        if order['order'] == 'Buy':
            bought = order.shares
            data['shares_'+order.stock][data.index >= order.date] += bought
            cash_paid = bought * data[order.stock][data.index == order.date][0]
            data['cash'][data.index >= order.date] -= cash_paid
        elif order['order'] == 'Sell':
            sold = order.shares
            data['shares_'+order.stock][data.index >= order.date] -= sold
            cash_taken = sold * data[order.stock][data.index == order.date][0]
            data['cash'][data.index >= order.date] += cash_taken
    
    def compute_equities_value(row):
        return (row[:len(equities)].values * row[len(equities)+1:].values).sum() 
    
    data['eq_value'] = data.apply(lambda row: compute_equities_value(row), axis=1)
    data['portfolio'] = data['cash'] + data['eq_value']
    
    portfolio = data['portfolio'].copy()
    dret = tsu.returnize0(portfolio) 
    vol = dret.std()
    daily_ret = dret.mean()
    sharpe = np.sqrt(252)*daily_ret/vol
    cum_ret = data['portfolio'][data.shape[0]-1]/investment - 1 
    
    market = c_dataobj.get_data(ldt_timestamps, ['SPY'], ls_keys)[0]
    original = market.SPY.copy()
    market['dret'] = tsu.returnize0(market.SPY)
    market.SPY = original 
    mvol = market.dret.std()
    mdaily_ret = market.dret.mean()
    msharpe = np.sqrt(252)*mdaily_ret/mvol
    mcum_ret = original[market.shape[0]-1]/original[0] - 1
    
    fig = figure()
    ax = fig.add_subplot(111)
    ax.yaxis.grid(color='gray', linestyle='dashed')
    ax.xaxis.grid(color='gray', linestyle='dashed')
    # the tick labels are rotated further down by fig.autofmt_xdate()
    ax.xaxis.set_major_formatter(DateFormatter('%b %Y'))
    ax.set_title('Fund Performance VS Market (SPY)', 
                    fontsize=16, fontweight="bold")
    ax.set_xlabel('Date', fontsize=16)
    ax.set_ylabel('Normalized Fund Value', fontsize=16)
    
    port = data.portfolio/data.portfolio.max()
    mark = original/original.max()
    
    y_min = min(port.min(), mark.min())
    ax.set_ylim([y_min-0.02, 1.02])
    plt.plot(data.index, port, lw=2., label='Fund')
    plt.plot(data.index, mark, lw=2., label='Market')
    ax.legend(('Fund','Market'), loc='upper left', prop={"size":16})
    fig.autofmt_xdate()
    plt.show()

    data = data.reset_index()
    data.columns.values[0] = 'date'
    
    begin = pd.to_datetime(data.date[0]).strftime('%b %d %Y')    
    end = pd.to_datetime(data.date[data.shape[0]-1]).strftime('%b %d %Y')
    
    print 'Details of the Performance of the portfolio'
    print ''
    print 'Data Range: ', begin, ' - ', end
    print ''
    print 'Sharpe Ratio of Fund: ', sharpe
    print 'Sharpe Ratio of Market: ', msharpe
    print ''
    print 'Total Return of Fund: ', cum_ret
    print 'Total Return of Market: ', mcum_ret
    print ''
    print 'Volatily of Fund: ', vol
    print 'Volatily of Market: ', mvol
    print ''
    print 'Average Daily Return of Fund: ', daily_ret
    print 'Average Daily Return of Market: ', mdaily_ret
    print ''
    
    
    data.to_csv(out_file, index=False)
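
The bookkeeping at the heart of marketsim() can be reduced to a few lines. The following is a minimal sketch in plain Python, under the assumption that every order is filled at that day's closing price with no fees. The function simulate, the day indices and the prices are illustrative and not part of the original code.

```python
def simulate(cash, orders, closes):
    """Return the daily portfolio value (cash + holdings at the close).

    orders: list of (day_index, symbol, side, shares), sorted by day
    closes: dict symbol -> list of closing prices, one per trading day
    """
    n_days = len(next(iter(closes.values())))
    holdings = dict((sym, 0) for sym in closes)
    pending = list(orders)
    values = []
    for day in range(n_days):
        # execute all the orders scheduled for today at today's close
        while pending and pending[0][0] == day:
            _, sym, side, shares = pending.pop(0)
            sign = 1 if side == 'Buy' else -1
            holdings[sym] += sign * shares
            cash -= sign * shares * closes[sym][day]
        equity = sum(holdings[s] * closes[s][day] for s in closes)
        values.append(cash + equity)
    return values


# Hypothetical example: buy 100 shares on day 0 and sell them on day 2.
closes = {'AAA': [9.0, 9.5, 10.0]}
orders = [(0, 'AAA', 'Buy', 100), (2, 'AAA', 'Sell', 100)]
print(simulate(1000.0, orders, closes))  # -> [1000.0, 1050.0, 1100.0]
```

Buying at the close leaves the portfolio value unchanged on the day of the purchase: cash is simply converted into shares, and the gain or loss only shows up on the following days.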


Main block: this part of the script calls the previous two functions after getting and cleaning all the relevant data.

if __name__ == '__main__':
    # test begin = 1 January 2008
    dt_start = dt.datetime(2008, 1, 1)
    # test end = 31 December 2009
    dt_end = dt.datetime(2009, 12, 31)
    # getting only the trading days in the timeframe
    ldt_timestamps = du.getNYSEdays(dt_start, dt_end, dt.timedelta(hours=16))
    # downloading prices for all the stocks contained in the list of S&P-500 in 2012
    dataobj = da.DataAccess('Yahoo')
    ls_symbols = dataobj.get_symbols_from_list('sp5002012')
    ls_symbols.append('SPY')

    ls_keys = ['open', 'high', 'low', 'close', 'volume', 'actual_close']
    ldf_data = dataobj.get_data(ldt_timestamps, ls_symbols, ls_keys)
    d_data = dict(zip(ls_keys, ldf_data))
     
    # taking care of missing values
    for s_key in ls_keys:
        d_data[s_key] = d_data[s_key].fillna(method='ffill')
        d_data[s_key] = d_data[s_key].fillna(method='bfill')
        d_data[s_key] = d_data[s_key].fillna(1.0)
    
    # finding events and preparing trading strategy
    find_events(ls_symbols, d_data)
    # evaluating the strategy against the market
    marketsim(50000, 'event-orders.csv', 'event-values.csv')
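
The three fillna() calls above fill each gap with the last known price, then fill any leading gap with the first known price, and finally set whatever is still missing to 1.0. Here is a rough plain-Python equivalent for a single price series, where None stands for a missing value. The helper fill_missing is an illustration, not part of the original code.

```python
def fill_missing(series, default=1.0):
    """Forward fill, then backward fill, then fall back to a default."""
    filled = list(series)
    last = None
    for i in range(len(filled)):               # forward fill
        if filled[i] is None:
            filled[i] = last
        else:
            last = filled[i]
    following = None
    for i in range(len(filled) - 1, -1, -1):   # backward fill
        if filled[i] is None:
            filled[i] = following
        else:
            following = filled[i]
    # only series with no data at all still contain None here
    return [default if v is None else v for v in filled]


print(fill_missing([None, 9.0, None, 10.0, None]))  # -> [9.0, 9.0, 9.0, 10.0, 10.0]
print(fill_missing([None, None]))                   # -> [1.0, 1.0]
```

A stock with no data at all ends up with a constant price of 1.0, so it never crosses the $10 level and never generates an event.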

This is the output, as promised:

Details of the Performance of the portfolio

Data Range:  Jan 03 2008  -  Dec 22 2009

Sharpe Ratio of Fund:  0.610680695525
Sharpe Ratio of Market:  -0.133639366311

Total Return of Fund:  0.19602
Total Return of Market:  -0.191838897721

Volatily of Fund:  0.0108878915846
Volatily of Market:  0.02205848174

Average Daily Return of Fund:  0.000418849217983
Average Daily Return of Market:  -0.000185699080962
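
For reference, the four metrics above can be computed from a series of daily portfolio values as in the sketch below. It assumes, as marketsim() does, a risk-free rate of zero and 252 trading days per year, and it sets the return of the first day to 0 (which, as far as I understand, is what tsu.returnize0 does). The function performance and the sample values are illustrative.

```python
import math
import statistics


def performance(values, trading_days=252):
    """Return (sharpe, total_return, volatility, mean_daily_return)."""
    # daily returns, with the first day set to 0
    daily = [0.0]
    for i in range(1, len(values)):
        daily.append(values[i] / values[i - 1] - 1.0)
    mean_ret = statistics.mean(daily)
    vol = statistics.stdev(daily)   # sample standard deviation
    sharpe = math.sqrt(trading_days) * mean_ret / vol
    total = values[-1] / values[0] - 1.0
    return sharpe, total, vol, mean_ret


# Hypothetical portfolio values over five trading days.
values = [50000.0, 50400.0, 50150.0, 50900.0, 51200.0]
sharpe, total, vol, mean_ret = performance(values)
print('Sharpe: %.3f  Total return: %.4f  Volatility: %.5f  Mean daily return: %.5f'
      % (sharpe, total, vol, mean_ret))
```

The Sharpe ratio is the average daily return divided by its standard deviation, multiplied by the square root of 252 to express it on a yearly basis.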

[Figure: portfolio]

Well, despite the huge crisis (-19% market return) our trading strategy gained a remarkable +19%! This was just an example, but a powerful one to show the possibilities of event studies in finance.
