Pair Trading - Exploring The Low Risk Statistical Arbitrage Trading Concepts

ncube

Well-Known Member
Definitely.
I haven't got the hang of pair trading yet. (I am watching the Quantopian lectures on the same subject.)
Once I get it, I will definitely do it with a lot more options.

@ncube Regarding auto-merging, it would be better if you included the dates as well. (The first file I downloaded only had serial numbers.)
I am also curious to know how auto-merging would work. (This is a topic of its own.)
My suggestion would be to dump the details from a directory into a database or HDF file: scan the directory for changes, mark each new file, append it, and so on.
Do you have any free tools for it? (I am developing one of my own since I can't find any comprehensive tool.)
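One way the directory-scan idea above could be sketched, assuming SQLite as the database and an EOD file layout of symbol,date,open,high,low,close,volume. The function name `import_new_files` and the `prices` and `seen_files` tables are hypothetical; this is only an illustration, not a tool anyone in the thread has shared:

```python
import csv
import sqlite3
from pathlib import Path


def import_new_files(folder, db_path):
    """Load every not-yet-seen .txt file in `folder` into SQLite.

    Assumes rows of symbol,date,open,high,low,close,volume (an assumed
    layout; adjust the column positions for your downloader).
    Returns the names of the files imported on this run.
    """
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS seen_files (name TEXT PRIMARY KEY)")
    con.execute(
        "CREATE TABLE IF NOT EXISTS prices ("
        "symbol TEXT, date TEXT, close REAL, PRIMARY KEY (symbol, date))"
    )
    seen = {row[0] for row in con.execute("SELECT name FROM seen_files")}
    imported = []
    for path in sorted(Path(folder).glob("*.txt")):
        if path.name in seen:
            continue  # already appended on an earlier run
        with path.open(newline="") as fh:
            rows = [(r[0], r[1], float(r[5])) for r in csv.reader(fh) if r]
        # INSERT OR REPLACE keeps a re-downloaded day from creating duplicates
        con.executemany("INSERT OR REPLACE INTO prices VALUES (?, ?, ?)", rows)
        con.execute("INSERT INTO seen_files VALUES (?)", (path.name,))
        imported.append(path.name)
    con.commit()
    con.close()
    return imported
```

Running it a second time on the same folder imports nothing, because the file names are already marked in `seen_files`.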

That's it for the day
Good night
I just use an NSE EOD data downloader tool; there are many such tools available for free on the net. Once the files are downloaded, my own Python scripts process the data and upload it to my backtesting system. I am using an open-source backtesting framework called "Backtrader", which does not require dates to be included as it works on indexed data. Hence my Python code is customized for it.

Anyway you guys can explore other options too, always better to have alternatives to choose from..:)
 

ncube

Well-Known Member
I think we should all come to a conclusion on these before proceeding to anything more complicated:

1) Data Format
2) Filters/Strategy
3) Execution
4) Options

Good night
Guys, to keep this thread readable, focused and simple, can we have a separate thread to discuss tools and coding-related topics? I think it is better to restrict this thread to simple Pair Trading concepts that help beginner intraday traders explore a low-risk strategy. Otherwise people without much coding expertise will not be able to follow the advanced issues discussed here and will lose interest :)

Suggestions are welcome. If many feel it helps, no issue, it can be taken up in this thread itself.
 

VJAY

Well-Known Member
Yes, I am with you....
 

VJAY

Well-Known Member
Yes, you are correct, it will be an issue. Let's modify the code a bit to write the updated master file to a different folder. This way one can run the script any number of times; however, at EOD they need to put both the master file and the eod.txt file in the same folder first, and then copy the updated master file to the original folder.

1. Add the following function into the cell with other functions:
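import pandas as pd

def update_eod(masterfile, eodfile):
    # Master file: first column is the row index, one column per stock symbol
    master = pd.read_csv(masterfile, index_col=[0])
    # EOD file has no header: column 0 is the symbol, column 5 the price used
    eod = pd.read_csv(eodfile, header=None, index_col=[0], usecols=[0, 5])
    # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
    df = pd.concat([master, eod.T]).dropna(axis=1).reset_index(drop=True)
    df.to_csv('C://master/stockdata.csv')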

2. Add a new cell after the cell which reads the master file and add the following Python statement:
update_eod('C://master/stockdata.csv','C://master/eod.txt')

View attachment 27047

How it works:
1. Every day download the EOD NSE bhavcopy text file and rename it as eod.txt, place it in a folder named "master" along with the old master file.
2. Then just run the cell with the function update_eod to update the master file and then comment this statement with # so that it is not run next time. Remove the # only when you need to update the master file.
3. Once the master file is updated, copy it back to the folder used earlier.
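
To see what the merge in update_eod does, here is a small self-contained illustration with made-up symbols and prices, using in-memory text instead of the real files. It assumes the master file has one column per symbol and that the EOD file rows are symbol,date,open,high,low,close,volume. `pd.concat` stands in for `DataFrame.append`, which newer pandas versions no longer have.

```python
import io
import pandas as pd

# Made-up master file: numeric index, one column per stock symbol
master_csv = ",AAA,BBB\n0,100.0,200.0\n1,101.0,198.0\n"
# Made-up EOD file with no header row (assumed column layout, see above)
eod_txt = (
    "AAA,20180301,101,103,100,102.5,5000\n"
    "BBB,20180301,198,199,195,196.0,7000\n"
    "CCC,20180301,50,51,49,50.5,100\n"
)

master = pd.read_csv(io.StringIO(master_csv), index_col=[0])
eod = pd.read_csv(io.StringIO(eod_txt), header=None, index_col=[0], usecols=[0, 5])

# eod.T turns the symbol rows into a single row with one column per symbol,
# so it lines up with the master columns. CCC is not in the master, so its
# earlier rows are NaN and dropna(axis=1) removes the whole column.
updated = pd.concat([master, eod.T]).dropna(axis=1).reset_index(drop=True)
print(updated)
```

Note that `dropna(axis=1)` works both ways: a symbol that is in the master file but missing from that day's EOD file is also dropped from the updated master.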

View attachment 27048

I have updated the script in my Google Drive:
https://drive.google.com/drive/folders/1a40Ih__otr99E1TAQe7WEjsTEsiBDKQQ?usp=sharing
Dear ncube,
Does this code update your master file (the stockdata file) from our EOD file, where no dates are available and there are only index numbers? I have not tested it; I am asking first because sometimes we (who know nothing about code/tech things) cause problems with the original code and spoil what we have...:)
 

UberMachine

Well-Known Member
Great s
Good stuff.
I also tried backtrader, but since I use the universe and pipeline concepts, I have created my own set of scripts. (backtrader also has a multiple-feeds option, but I am thinking of using zipline.)

OK. We can have that discussion in a new thread some time later.
 

ncube

Well-Known Member
@VJAY, the new PairTrading.ipynb is updated with only one function, update_eod(). It is required only for members who want to use the script to apply the daily EOD NSE bhavcopy update. If you are doing the updates manually for a few stocks, it is not required. You can comment out the call to this function by putting a # in front of it as follows, and it will not get executed:
#update_eod('C://master/stockdata.csv','C://master/eod.txt')

Also, if someone wants to use dates as the index instead of numbers because it is easier to update manually, you can replace the first column in the stockdata.csv file with dates for all the rows. The Python script that I have written just takes the first column as the index, so if it holds dates then the index will be dates. To get the trading-day dates, download the historical data Excel sheet for one of the stocks from the NSE website (covering the number of rows in the stockdata.csv file) and copy only the dates into the first column of stockdata.csv.
 

UberMachine

Well-Known Member
@ncube @VJAY
Got it. Got a working version.

Let me sum up
  1. We are picking 2 correlated stocks
  2. Test them for co-integration so that mean reversion works
  3. We use spread to measure mean reversion
  4. If the spread increases or decreases beyond a certain threshold (indicated by zscore), we place simultaneous orders on both of them

So, to generalize
  1. Select some pair of stocks that have both correlation and co-integration
  2. Use some measure to quantify the difference
  3. Place a suitable order if the measure is out of threshold

Things to resolve (parameters to decide)

  1. How to select pairs?
  2. What is the look back window?
  3. How to set entry and exit prices?
  4. What threshold to use?
  5. How to backtest this strategy?
And finally, how to make profit out of it?

Hope I am moving in the right direction
 

VJAY

Well-Known Member
Nice write-up UB... As I am also a starter in this method, I hope ncube bro will give some insights. One question from me too: how can this method be backtested on one pair? Can it be done through Python? Backtest results will be helpful for sticking with those pairs... my views.
 

ncube

Well-Known Member
@UberMachine, I have already explained these points in my earlier posts; maybe you can read this thread once from the beginning. That is one reason why I want to keep this thread simple and clean, so that people new to pair trading can find the information and understand the basic strategy without getting overwhelmed by complex concepts that are not required for day trading the pairs. Once they are familiar with the basics, they can explore other complex concepts and methods to automate it.

Just to briefly answer your queries:
How to select pairs?
>> Select pairs whose co-integration significance value (p-value) is less than 0.05; the lower the value, the better.
What is the look back window?
>> In the script I have shared I have used 200 days for co-integration and 20 days for the zScore calculations. These values are fine for day trading.
How to set entry and exit prices?
>> I have suggested the first 30-minute candle breakout strategy to trade the pairs. If one understands the core idea of pair trading, they can come up with their own strategy.
What threshold to use?
>> Usually a zScore beyond +/- 2 SD is good for initiating trades. However, this is more applicable to position trading, as one needs enough buffer; for day trading even a lower threshold should be fine.
How to backtest this strategy?
>> One can use the zScore values together with the stock closing prices for backtesting. However, for day trading I do not see much benefit in doing this, as one can easily do direct walk-forward testing for a few days to understand the characteristics of the pair. One can just select a few good co-integrating pairs and trade them till the co-integration falls apart.
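
For anyone who wants to see the zScore idea in code, here is a minimal sketch using the 20-day window and the +/- 2 threshold mentioned above. The spread definition (log price ratio), the function name `zscore_signals` and the long/short convention are assumptions for illustration; the script shared in this thread may do it differently.

```python
import numpy as np
import pandas as pd


def zscore_signals(price_a, price_b, window=20, threshold=2.0):
    """Return the rolling z-score of the spread and a +1/-1/0 signal.

    The spread is taken as the log price ratio (an assumed definition).
    """
    spread = np.log(price_a) - np.log(price_b)
    mean = spread.rolling(window).mean()
    std = spread.rolling(window).std()
    z = (spread - mean) / std
    signal = pd.Series(0, index=z.index)
    signal[z > threshold] = -1   # spread stretched up: short A, long B
    signal[z < -threshold] = 1   # spread stretched down: long A, short B
    return z, signal


# Made-up prices: A wobbles around 100 and jumps on the last day, B stays flat
days = pd.RangeIndex(30)
price_a = pd.Series([100.0 + (0.1 if i % 2 else -0.1) for i in days], index=days)
price_a.iloc[-1] = 110.0
price_b = pd.Series(100.0, index=days)

z, signal = zscore_signals(price_a, price_b)
print(z.iloc[-1], signal.iloc[-1])
```

The first 19 z-score values are empty because the 20-day window is not yet full; a lower threshold, as suggested for day trading, can be passed through the `threshold` argument.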

But if one wants to position trade it, they can use reinforcement learning or deep learning models to identify pairs which show similar characteristics. These are complex methods, though, and I don't recommend them to beginners. In trading it's always better to keep things as simple as possible; more complexity does not mean more profit.

Also, as we discussed earlier, let's keep the automation/coding and tool-specific discussion in a separate thread.
 