All of my previous posts have been strictly focused on end of day trading. With this post I want to elaborate on how to test & trade intraday using limit orders. Furthermore I want to discuss the validity of the system presented with you.
The basic idea
- Buying oversold stocks intraday using limit orders.
- A stock counts as oversold when today’s price falls below yesterday’s close minus 0.9 × ATR(21)
- Buy top 5 stocks using ROC(20) for ranking
- All trades are closed at end of day
The test: Too good to be true?
- Strictly uses previous bar information in order to avoid look ahead bias, based on daily bars
- If the open is below the limit price, the open is used as the execution price
- Based on NDX100 stocks (survivorship free: historical index adjustment + delisted stocks)
- No commission / slippage
- Explanation of the key performance indicators being used (link)
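The fill rules listed above can be sketched in a few lines of Python. This is an illustration only, not the AFL code used for the test; the function name and the idea of passing the limit price in directly are assumptions.

```python
from typing import Optional


def simulate_day_trade(limit: float, open_: float, low: float, close: float) -> Optional[float]:
    """Return the intraday return of one limit buy, or None if the order was not filled.

    `limit` has to be computed from the previous bar only (yesterday's close
    minus 0.9 * yesterday's ATR(21)) so that no look-ahead bias creeps in.
    Commission and slippage are ignored, as in the test described above.
    """
    if open_ < limit:
        entry = open_   # gap down through the limit: filled at the open
    elif low < limit:
        entry = limit   # limit reached intraday: filled at the limit price
    else:
        return None     # limit never reached: no trade today
    return close / entry - 1.0   # all trades are closed at the end of the day
```

With a limit of 100, an open of 101, a low of 99 and a close of 102, the order is assumed to fill at 100 and the trade returns 2%.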
So what’s wrong?
The main issue with the system presented is the approach of ranking the trades according to the previous bar’s ROC(20). In reality the limit orders won’t be hit in the order of yesterday’s ROC(20). Using daily bars, there is no way to figure out the “real” sequence in which the limit orders are hit. So I decided to simulate the sequence by randomizing the selection. That can easily be done in AmiBroker by assigning a random variable as POSITIONSCORE. This way I have no influence (bias) on which stocks are selected (as long as the low is lower than the initially defined limit). I’ve run the test 1000 times using randomized selection; below you find the randomized results. The results appear to be pretty stable regardless of how the stocks are picked. Even the “worst” run shows decent performance.
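For readers without AmiBroker, the randomized selection can be sketched in Python. This is not the AFL code behind the results: the data layout (one dict of ticker to intraday return per day, containing only the stocks whose limit was hit) and the sizing rule (one fifth of equity per position, unused slots stay in cash) are assumptions made for illustration.

```python
import random
from typing import Dict, List


def pick_random_fills(day_returns: Dict[str, float], max_positions: int,
                      rng: random.Random) -> List[str]:
    """Pick up to `max_positions` tickers at random among those whose limit was hit."""
    candidates = sorted(day_returns)   # sorted so a seeded run is reproducible
    if len(candidates) <= max_positions:
        return candidates
    return rng.sample(candidates, max_positions)


def run_once(days: List[Dict[str, float]], max_positions: int,
             rng: random.Random) -> float:
    """Compound the daily portfolio returns of one randomized run; returns final equity."""
    equity = 1.0
    for day_returns in days:
        picks = pick_random_fills(day_returns, max_positions, rng)
        # Assumed sizing: each position gets 1/max_positions of current equity.
        day_pnl = sum(day_returns[t] for t in picks) / max_positions
        equity *= 1.0 + day_pnl
    return equity


def monte_carlo(days: List[Dict[str, float]], max_positions: int = 5,
                runs: int = 1000, seed: int = 0) -> List[float]:
    """Repeat the randomized run and return the final equities, worst first."""
    rng = random.Random(seed)
    return sorted(run_once(days, max_positions, rng) for _ in range(runs))
```

The first element of the returned list corresponds to the “worst” run, the last one to the best.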
I’m interested to hear from YOU!
- Has the test faulty assumptions?
- Do you think the system is tradable?
Please enter your comments below. In case you are interested in the AmiBroker AFL code, please send an email to me (fhinbox-trading@yahoo) incl. your thoughts on these two questions.



Another interesting post, Frank. Obviously slippage and brokerage costs are going to have a significant impact on an intraday trading system such as this. But the results appear impressive nevertheless. Why not throw 10K at it as a test for 6 months? If you can automate the trade execution then you will be able to gather some valuable stats on the friction such a system would encounter without a large time commitment.
I have noticed with my systems however that the more frequently they trade the less the real time results match the back testing. That is why I strongly favor infrequent trading when it comes to systems established on back testing.
Cheers
Derry
Frank,
interesting post! The test seems indeed too good for such a simple system — in the sense that an automated artificial learning algorithm of a sophisticated market participant should have picked it up already.
I could think of four drivers of (fake) returns:
1) Spikes in the data. Particularly daily highs and lows can be inaccurate (in fact, almost no data provider always gets them right), as they are often driven by bad ticks in the data. The impact of “fake” extreme highs and lows, of course, would be limit fills that in reality would never happen. This is something you can easily check by comparing the daily highs and lows of an *accurate* 1-min time series with the highs and lows of the daily.
2) Too aggressive limit fill assumptions: just hitting the limit price with the day’s low does not mean you are the first in line in the orderbook to get the fill. Also, extreme prices of the day are often only available for a few minutes or even seconds (think of a V-shaped pattern on the tick chart). If your limit price just matches the low of the day, you will either get no fill, or just a very low number of shares. In addition, if you sit in the order book with a large limit order, other traders will place their orders 1-cent above you, further reducing your fill size. Try testing more conservative limit fill assumptions (ideally by looking at 1 min data at least). I typically use the condition of at least 1 minute-bar completely below a buy limit price to fill the order.
3) Liquidity bias: what stocks are the returns driven by? Particularly if you test without slippage, your system’s parameterization is likely biased toward trading the low liquidity stocks, i.e. the system will generate a disproportionately large share of trades on these stocks. They will exhibit the largest intraday spikes (leading to the most attractive limit fills in simulation). The AAPLs and GOOGs of the world will rarely appear in your list of trades.
Moreover, realistically you cannot assume that you can take a position of more than 1% of average daily volume. The lowest liquidity stocks in the NDX100 will then define your portfolio size, unless you allow positions of different size. Overall the strategy will be very limited in scalability.
4) Omitted slippage: If you close the position at end of day using a market order, you will incur significant slippage (anything between 0.01% on AAPL and 0.5% on the low liquidity stocks for a $100K market order).
All in, I would expect your average trade to drop to the range of around 0.20%, and be limited in position size. Anything beyond that would be a big positive surprise.
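Points 2) and 3) above can be sketched in Python. Both functions are illustrations under stated assumptions: minute bars are given as (high, low) pairs, “completely below” is read as the bar’s high being under the limit, and the 1% participation figure is simply taken from the text.

```python
from typing import Dict, Iterable, Tuple


def conservatively_filled(minute_bars: Iterable[Tuple[float, float]], limit: float) -> bool:
    """True only if at least one 1-minute bar traded entirely below the buy limit.

    Each bar is a (high, low) pair; a bar is completely below the limit when
    even its high is under the limit price.
    """
    return any(high < limit for high, _low in minute_bars)


def max_position_dollars(price: float, avg_daily_volume: float,
                         participation: float = 0.01) -> float:
    """Dollar value of a position capped at `participation` of average daily volume."""
    return price * avg_daily_volume * participation


def equal_size_cap(universe: Dict[str, Tuple[float, float]],
                   participation: float = 0.01) -> float:
    """Largest equal dollar position that respects the cap for every stock.

    `universe` maps ticker -> (price, average daily volume in shares).
    The least liquid stock sets the limit for all positions.
    """
    return min(max_position_dollars(p, v, participation) for p, v in universe.values())
```

A day whose low merely touches the limit fails the first check, which is exactly the case where a daily-bar backtest would still book a fill.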
Regards
Martin Niemann
Martin, good points.
However, I have found that using market-on-close in Interactive Brokers gets the closing price almost every time. I know, not very scientific, but there is very very little slippage. This improves even more when trading NDX stocks. I do think it is possible to sell the close and have slippage not be an issue.
That’s true with IB, but do not forget that once you put in a large market order, you will also impact that closing price. So still the same as the price you get, but lower than what the closing price would be without your order.
Hi Frank,
Great work as usual!
One issue of note that many people in the blogosphere have highlighted is the issue of unrealistic open prices for these types of systems. Basically, the opening prices in your historical data can be suspect because real liquidity might not arrive until a few minutes after the open for many US stocks. This issue will only affect those trades that trigger at the open though, and it may in fact have a positive impact on your results – only real $ will tell. There are some good posts on this issue here:
http://www.puppetmastertrading.com/blog/2008/08/01/execution-quality-at-the-open-close/
and here
http://ibankcoin.com/woodshedderblog/2010/01/13/introducing-the-slippage-project/
cheers
Ramon
Hullo Frank
I have been using a very similar system in real markets during the last year. Well, I bought any stock falling well below the Bollinger bands and sold the next open day, but it’s the same idea.
It has not worked.
After looking at my trades for many months, I saw that slippage wouldn’t be very big; the actual problem was the stocks chosen.
As you say, you randomize the stocks bought to minimize the impact of the selection, but I saw that, on the really volatile days, the stocks bought were the worst of the eligible stocks.
This is normal, ’cause on days when the stock market falls steeply, the first stocks to fall (which are the ones bought) tend to be the stocks that have fallen the most by the end of the day.
I tried to solve this problem by assigning the PositionScore variable so that it favors the stocks nearest the Bollinger band the day before. This helped, but I think the results, mainly on the most volatile days, varied too much to trust them.
Also, there is the problem of reading many stocks at once, while many brokers limit you to 100 simultaneous tickers in real time.
I used the Interactive Brokers Trader Workstation API in Excel to read hundreds of tickers in a while loop, but it does not run very smoothly.
Anyway, it’s been very good to see people fighting like me!
Congratulations for your work!
Frank,
(a) Regarding your comment “I’ve run the test 1000 times using randomized selection, below you find the randomized results” – I typically do that another way, which I think is more robust. I run the system on EACH stock using individual backtests in AmiBroker, and compare the statistics of the system across the individual backtests on the list of stocks. If a substantial share of the stocks show good statistics, that is a good data point. This way, you don’t need to rely on PositionScore to PICK a few stocks and maybe get lucky. Of course, you have removed the element of luck by running it 1000 times, but that is a lot of compute time, which you could supplement by running individual backtests.
(b) I hope you have put a filter such as
BuyPrice > L and BuyPrice = L.
You should probably not do the above, because this will indicate that there was only one tick which matches your buyprice -> typically the open price.
I found that some systems of mine which trade daily and close positions at the end of the day were impacted by the above.
I started then doing things like checking the range of parameters (in your case the ATR sizes) for which this is valid.
(d) i have been running 3 systems like this ( one of them is very close in concept) in real money for the last 3-4 months, and both Martin’s and Ramon’s comments are on the money.
(e) Slippage is significant. The key is to get per trade stats to be as high as possible. I ended up sacrificing quite a bit of performance to get the per trade stats to be high by filtering trades. I was careful not to put too many filters and overfit. It was good that there are a lot of trades, so the results are statistically reliable. But no harm in doing statistical tests / monte carlo as well on the trade sequence.
(f) Execution is key. You can improve execution by using more sophisticated algorithms. I realized that and am now starting to do so. For instance, I would use an OTO (one-triggers-other) order to execute the close at market-on-close, but I realized that this could be done better. Also, the entry orders should definitely have been smarter orders that take the order book and size into account.
hope it helps.
rgds
bgpl
sorry, a typo.. when i said:
b) I hope you have put a filter such as
BuyPrice > L and BuyPrice = L.
You should probably not do the above, because this will indicate that there was only one tick which matches your buyprice -> typically the open price.
I meant: instead of
“You should probably not do the above”
should be
“You should DEFINITELY DO the above”
oh boy. not my day.
BuyPrice > L and BuyPrice = L.
should be
BuyPrice > L and BuyPrice < H.
Ramon, those links point out the issue with using consolidated open prices for testing in NYSE-listed stocks (vs. using the actual NYSE open price). This issue does not exist for Nasdaq-listed stocks, as there is a single opening cross price published. The only possible issue is market impact, which is not going to be huge for the average retail trader trading the Nasdaq 100.
Frank,
Thanks for sharing. I thought there might be some days where it was buying the Open after the fact, but I think I confirmed that is not happening by changing the Buy price to this:
BuyPrice = IIf(Open < LE1, Open*1.15, IIf( Low < LE1, LE1, Close));
You could add some additional reality-check cushion by bumping up the Open price slightly.
BuyPrice = IIf(Open < LE1, Open*1.015, IIf( Low < LE1, LE1, Close));
This didn't change the result much for me, but I'm not able to match your results with non-survivorship-adjusted data. I would have expected my results to be better than yours since I'm cheating.
Terry
typo above… my first formula should not have contained the “*1.15”, it should read:
BuyPrice = IIf(Open < LE1, Open, IIf( Low < LE1, LE1, Close));
A little late with my comment… been out of the country. The curve is too good to be true IMO. As you say, there’s no way to pick the “top 5” ROC tickers because you don’t know in realtime which tickers will be eligible for trades (hit your limit prices) and which won’t. Are you using intraday data or simply daily bars? If daily bars, you can tell IF your limit would have been hit but not WHEN during the day. When you say “randomizing the selection”, does that just mean you’re randomly picking 5 tickers out of all that would have hit your limit, disregarding the ROC? If that’s what you’re doing then the results seem extremely good if they’re actually tradeable… not sure about that though.
A possible issue: How are you getting your data? From IB or from a source like Yahoo? I have found that IB historical data especially is subject to “bad prints”, where it may show an erroneous low that didn’t actually trade (or, if it did, was a single fluky trade way outside the day’s range for some reason). Even a small percentage of these in the data could skew the results significantly. I have automated trading systems that use IB for realtime data but pull historical data from sources (like Google or Yahoo) that use the “official” cleaned data rather than raw data reconstructed from ticks like IB.
A suggestion: if you want to incorporate ROC, why not just set a threshold, like the top 20% of tickers based on ROC are eligible for trading with this system?
…
My gut feeling is that there’s a problem with your results. I don’t doubt the system could be profitable but it seems a little too consistent. Maybe there is some look-ahead bias in the code? How many trades is it making per day on average? What % are on open and what % intraday?
Just read your follow-up post. Clarified things, so don’t feel an obligation to respond to my above posts if you feel you’d be repeating yourself… (My gut still says something may be off though.)
Great blog and great posting. Thank you for sharing your work with us.
I’m just developing and testing a similar dip-buyer system for US stocks with a large daily number of limit orders placed pre-market. I’m using AmiBroker with eoddata.com data to select my stocks and automatically place the limit buy orders and the sell orders. My experiences so far:
In backtesting (the database is not yet cleaned of survivorship bias) the results are dramatically better for less liquid stocks. There is a strong negative correlation between profit and liquidity of the traded stocks.
In forward testing via my IB simulation account with a position size of 5000 dollars per position I get partial fills even in simulation mode. So the situation will be worse when trying to really get the stocks.
Selling the stocks on MOO, even in the IB simulation account, produces huge slippage in both directions compared to the open price listed later on in the common databases. The slippage depends on how the market starts to run on the selling day and can be more than 2% of the open price. Even the simulated MOC price differs by 1-5 cents from the close price listed afterwards.
Putting this together tells me that my wonderful backtesting profits melt away with growing position size and are very unrealistic for low-price stocks if no slippage is considered. So I guess this all may lead to a strategy with only very limited position sizes and with no spectacular profits.
I will continue to test in IB simulation mode and if I have a stable and tested Amibroker version of daily autotrading I will start testing with real money and small position size.
Hello ChartRider,
thanks for your feedback.
Regarding low volume stocks: can’t say much about it, only tested it on NDX100 stocks.
Since I finished this strategy I have tested it on an IB test account as well. I have also seen different results. I’m trying to improve the system to get a closer match between backtest and reality.
How do you backtest the ranking procedure? I’ve found it essential to sync test and reality.
I’ve seen slippage between the CLOSE price and MOC order execution. I’m considering manual execution at the close.
Frank
How do I backtest the ranking procedure? I’m not sure that we have the same problems with ranking procedures in backtesting our systems. But let me talk about the ranking problems that occur to me:
In forward testing or in real trading I’m placing about 100 buy limit orders out of a watch list of 2500 stocks before the market opens. I guess you aren’t asking about the ranking used for this selection, because you only have 100 stocks in the NDX100, so you don’t need to select at this point. During market hours my autotrader has to watch the fills, and after 20 fills have started on a day it cancels all remaining limit orders to prevent me from overleveraging (position size 5% per stock). On almost all days I don’t get 20 triggered stocks (I’m using a higher trigger threshold: minimum 1 * ATR).
When backtesting with AmiBroker, I’m sorry to say I’m not able to model this procedure without cheating. So my cheat is: set MaxOpenPositions to 100 (the number of buy limits) and set PositionScore to the ranking function that gives me my daily selection of the 100 stocks out of the 2500. Then set Buy=true for all stocks, so the backtester buys each day the same 100 stocks that I would have placed limit orders for.
By setting BuyPrice=SellPrice=100 for all bars that would not have been triggered by my trigger conditions, and setting BuyPrice and SellPrice to the correct values that would have been reached when first being triggered and later being sold (with commission and mean slippage), I get a backtest result that is similar to real trading (set AccountMargin=1, PriceBoundChecking=false, TradeDelays=0,0,0,0 of course). Only those few days where more than 20 positions are triggered are not realistic in backtesting. Since I don’t know which of the x+20 stocks would have been triggered first, and therefore would be the ones traded in forward testing or real trading, I average the daily profit/loss of those days and weight it with 20 stocks (I’m not able to do this correction in AmiBroker, so I export the trade list to Access or Excel, where I do these calculations).
My experience with averaging the daily profit/loss and weighting it with 20 stocks tells me that the averaged profit is higher than the uncorrected profit calculated in pure AmiBroker. It seems that these few days with too many triggered orders are some of the worst days of the system (exception: the day of the flash crash in May 2010).
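The averaging correction described above can be sketched in Python instead of Access or Excel. This is one reading of the description, not the commenter’s actual spreadsheet: the function name is hypothetical and the input is assumed to be the list of per-trade profit/loss values of a single day.

```python
from typing import List


def corrected_daily_pnl(trade_pnls: List[float], max_fills: int = 20) -> float:
    """Daily P/L when at most `max_fills` of the triggered orders could be taken.

    On days with no more than `max_fills` triggered trades, every trade counts.
    On days with more, the unknown fill order is replaced by the average trade:
    mean P/L of all triggered trades, weighted with `max_fills` positions.
    """
    if len(trade_pnls) <= max_fills:
        return sum(trade_pnls)
    return sum(trade_pnls) / len(trade_pnls) * max_fills
```

A day with 40 triggered trades of +10 each would be booked as 200 rather than 400.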
P.S.: you wrote that you use ATR(21) as one basis of your trigger threshold. The tests I made in backtesting a lot of stocks over a lot of daily bars showed that an EMA(ATR(1), n) with n>8 and n<25 has much better predictive value than an MA(ATR(1),n) or an ATR(n). Also it is useful to check whether a plain range (high-low) is sometimes more useful than a true range.
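To make the variants in the P.S. concrete, here is a small Python sketch of the building blocks. ATR(1) is taken to be the one-bar true range; the EMA uses the common smoothing factor 2/(n+1) and is seeded with the first value, which is an assumption and may differ from AmiBroker’s own smoothing.

```python
from typing import List, Tuple

Bar = Tuple[float, float, float]  # (high, low, close)


def true_ranges(bars: List[Bar]) -> List[float]:
    """One-bar true range; the first bar has no previous close, so high - low is used."""
    out: List[float] = []
    prev_close = None
    for high, low, close in bars:
        if prev_close is None:
            out.append(high - low)
        else:
            out.append(max(high - low, abs(high - prev_close), abs(low - prev_close)))
        prev_close = close
    return out


def plain_ranges(bars: List[Bar]) -> List[float]:
    """Plain range high - low, ignoring gaps against the previous close."""
    return [high - low for high, low, _close in bars]


def ema(values: List[float], n: int) -> List[float]:
    """Exponential moving average with alpha = 2 / (n + 1), seeded with the first value."""
    alpha = 2.0 / (n + 1)
    out: List[float] = []
    for v in values:
        out.append(v if not out else alpha * v + (1.0 - alpha) * out[-1])
    return out


def sma(values: List[float], n: int) -> List[float]:
    """Simple moving average; the result starts at the first full window."""
    return [sum(values[i - n + 1:i + 1]) / n for i in range(n - 1, len(values))]
```

The EMA(ATR(1), n) variant is then `ema(true_ranges(bars), n)`, and the MA(ATR(1), n) variant is `sma(true_ranges(bars), n)`.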
The issue of partial fills in IB simulation mode (papertrader) is a quirk of the papertrading system. IB papertrader sees only the top of the book, no depth (e.g. would see bids at 20.02 but not 20.01, 20.00, etc), so if there isn’t enough liquidity at the very top then a market order won’t fill. Using larger position sizes I experienced this issue testing my own system on S&P 100 stocks, which are highly liquid, but it resolved as soon as I switched to realmoney trading.
Slippage is a real concern and even on the top few hundred stocks in the S&P, which almost always have .01 spreads, it will account for more frictions than IB’s nominal commissions.
Also keep in mind that the “official” close price is often not the very last trade but a volume-weighted average of the last X seconds of trades. I forget what X equals. So if you are using raw IB data to test vs. “official” data (e.g. from Yahoo!) you will get different results.
I’ll also echo what someone said earlier that bad prints for highs & lows are pretty common with raw IB data. In my own system we use IB data only for realtime and download historical data from sources with the “official” data for all historical data. Also, we apply a filter to the IB realtime data (currently daily range >10%) to weed out erroneous highs or lows.
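A range filter of this kind can be sketched in Python as follows. The exact definition used in the commenter’s system is not given, so measuring the range relative to the day’s low is an assumption, as are the function names.

```python
from typing import List, Tuple


def is_suspect_bar(high: float, low: float, max_range: float = 0.10) -> bool:
    """Flag a bar whose high-low range exceeds `max_range` of the low.

    Non-positive lows are always flagged, since they cannot be real prices.
    """
    if low <= 0:
        return True
    return (high - low) / low > max_range


def drop_suspect_bars(bars: List[Tuple[float, float]],
                      max_range: float = 0.10) -> List[Tuple[float, float]]:
    """Keep only the (high, low) bars that pass the range filter."""
    return [(h, l) for h, l in bars if not is_suspect_bar(h, l, max_range)]
```

Such a filter also throws away genuine wide-range days, so flagged bars are probably better checked against a second data source than deleted blindly.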
One more thing… For daily trading, round-trip commissions on IB for a $40 ticker with .01 spread will cost approximately .06%. That is about 15% annually. You can double it for a $20 ticker and almost double it for a .02 spread. So obvious ways to limit frictions include:
1) Setting a minimum limit for ticker price. Trading $10 tickers will cost roughly 60% annually in frictions so the edge has to be huge to make it worthwhile.
2) Using limit orders instead of market orders. Submitting limit at the bid (for buys) will eliminate the spread if filled, but may not be filled. Submitting limit at bid+.01 will eliminate risk of getting burned crossing a .02-.03 spread, but there’s still a chance that the market will move away or you’ll get frontrun by HFT. This is a complicated issue with no easy solutions, but in daily trading execution is just as important as the system itself.
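The annual figures above follow from simple arithmetic, sketched here in Python. The assumptions are mine: one round trip per trading day, 250 trading days per year, costs simply added up rather than compounded, and a friction that is fixed in cents per share, so its percentage scales inversely with the share price.

```python
def round_trip_cost_pct(price: float, ref_price: float = 40.0,
                        ref_cost_pct: float = 0.06) -> float:
    """Round-trip friction in percent, scaled from the $40 / 0.06% reference case."""
    return ref_cost_pct * ref_price / price


def annual_friction_pct(cost_pct: float, round_trips_per_year: int = 250) -> float:
    """Total yearly friction in percent, assuming one round trip per trading day."""
    return cost_pct * round_trips_per_year
```

For a $40 ticker this gives 0.06% × 250 = 15% per year; for a $10 ticker the per-trade cost is four times higher (0.24%), which gives the 60% mentioned in point 1).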
Hello Scrilla_Gorilla,
THANK you very much for your detailed feedback.
I’m currently trading the system in an IB test account. Hope to report back positive results.
Regards,
Frank
Hello All,
I had been trading a very similar system for the first half of this year. I had a Java API client written for Interactive Brokers which would submit a limit order when the stock traded within a few % of the entry price (to avoid the too-many-limit-orders issue). It would then issue an MOC order if I was filled. Once a certain number of orders were filled it would stop entering any new orders. I generally looked at 2-3 standard deviations below yesterday’s close. The good news is it worked very well during the fat finger crash on May 6th, earning 13% on my $500k account. The bad news is that the issues mentioned in previous posts (described by Martin Niemann above in particular) drag down results substantially compared to backtesting. Fake low price spikes in the data are probably the biggest factor. Many if not most large winning trades turned out to be low price data errors (with the exception of May 6th when the market did fall substantially) or were untradeable or later corrected erroneous trades (they usually show as a single minute price spike well below the rest of the day’s trading range). You can confirm this by looking at the recent intraday pricing on IB for big falling stocks. Also, on most occasions fills at the open where it gapped below my entry price were worse than the open price (my system would issue a Peg-Mid order if a gap occurred), and the price would often reverse within a few seconds (similar to the first issue of being an untradeable price spike down). Intraday mean reversion certainly does exist, but my experience is that real returns per trade and win-loss ratio are substantially lower than the backtested results. I did trade across a very wide range of stocks, so perhaps these issues are less of a factor in the main trading stocks, but of course fewer opportunities would exist so you would need to risk more per trade.
I now trade mean reversion across several days instead of intraday, using the closing price as my basis for backtesting which very rarely has the same data issues as the low price. In fact I now believe the close price is the only reliable price to base any trading strategy on – and I have used many data sources. My MOC orders always hit right around the close price.
Hello David,
You address some issues that certainly do exist. I’ve traded the system in an IB test account for a few weeks, and after making some tweaks here and there the results are getting closer to the back-tested ones. The most important issue in my eyes: how can one address the fact that “bad” stocks drop and hit the limit order before the “good” stocks tend to fall?
Furthermore I will test (re-write) the system to run on SP500 stocks. Will write a post about it.
Regards,
Hi Frank,
Yes, a key issue with this strategy is not knowing how many candidate stocks you will have each day and so how to allocate your capital. I took a conservative approach and would put a few percent of equity on each falling stock, giving myself the chance to capture as many as possible to get as close to the average result of all stocks for that day. Having said that, I didn’t really find a major effect in terms of the order in which they fell. On many days the later falling ones were the worst. It seemed more the case that there was a varying time each day when the general market would turn upwards (when it did turn upwards) and most stocks purchased close to that time did the best. I was trading a much larger set of stocks than just the NDX, though. I haven’t investigated it but on further thought I would be confident that ETFs would work best with this strategy. In my experience they have the best mean reversion tendencies, at least on a multiday basis. You could allocate 5% of equity to the first trade, 5% of remaining equity to the next, and so on. So with smaller sizes on each subsequent stock you could theoretically trade all the candidate stocks of that day.
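The shrinking allocation scheme at the end of this comment can be sketched in Python. The function name is hypothetical, and profits or losses on open positions are ignored for simplicity.

```python
from typing import List


def sequential_allocations(equity: float, n_trades: int,
                           fraction: float = 0.05) -> List[float]:
    """Position sizes when each new trade gets `fraction` of the remaining equity."""
    remaining = equity
    sizes: List[float] = []
    for _ in range(n_trades):
        size = remaining * fraction
        sizes.append(size)
        remaining -= size   # what is left funds the following trades
    return sizes
```

With $100,000 the first three positions come out at $5,000, $4,750 and $4,512.50. The sizes shrink geometrically, so the total never exceeds the starting equity no matter how many stocks trigger.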
Hello David,
thanks for your feedback. Let me give you some thoughts:
- I submit all orders (limit entry, limit exit, MOC) before the market opens, so I don’t change orders or entry amounts during the day. What you suggest is a good idea; however, it requires submitting orders during the day and therefore some kind of auto-trading capability. I haven’t dug into that (yet).
- In my most recent version of this system I focus on SP500 stocks, trading just the top 50 according to my proprietary ranking mechanism. There it is almost irrelevant whether you pick from top to bottom. So I agree with your observation.
- Testing this on ETFs would be interesting. Are you thinking of country/sector ETFs?
Regards,
Frank
Hi, I have looked at a similar system in detail. Indeed, when using EOD data there is a problem, because when selecting your signals, e.g. using mtrandom(), it will select the signals according to some distribution profile in which the signals appear during the day. In reality, on a very negative day you will buy all your positions at the open or close to the open, while the EOD backtest will also add some signals that occur later in the day, therefore taking less of a hit on the downside.
I backtested this idea in the 1 minute timeframe and it still gives good results using the last few years of Nasdaq 100 stocks. I tried to approach the EOD results by allowing a maximum number of signals during the earlier stages of the day in an attempt to spread the entries, but did not manage to get a better result. I think this is because when you look at the distribution profile of the signals over the entire day for all stocks over a few years, you get a nice profile with about 70% of all signals within the first hour. However, per individual day there is more scatter in this. This, I guess, makes it difficult to approach the EOD result. But the result in the 1 minute timeframe is about 1/2 as good, so not bad at all.
Hi Frank,
How do you do risk control? Since no use of stops was mentioned, I assume losses are potentially unlimited with this system.
Hello Andrey,
no risk control – just a simple test.
Trading the system one needs to consider that.
Frank
I guess this is another driver of unrealistic returns, since risk controls tend to reduce results considerably – often to the point of mediocrity. Ignoring this renders the whole discussion moot, IMHO.