getSymbols
I've noticed an intermittent problem in the historical data retrieved by the quantmod package in R. Following is my test code:
library(quantmod)

# Download the history for one symbol from Yahoo and return it.
getyahoo <- function(symbol, sdate) {
  # getSymbols() assigns the result to an object named after the symbol
  getSymbols(symbol, src = "yahoo", from = sdate)
  dd <- get(symbol)
  #dd <- na.omit(dd) # uncomment this line to delete NAs
  return(dd)
}

symbol <- "QQQ"
sdate <- "1900-01-01"
count <- 50
ierr <- NULL   # iterations in which NAs were returned

for (i in 1:count) {
  sss <- getyahoo(symbol, sdate)
  ttt <- sss[!is.na(Ad(sss)), ]      # candidate fix: drop rows with NA Adjusted
  isna <- which(is.na(Ad(sss)))
  if (length(isna) > 0) {
    print(paste0(i, ": ", length(isna), " of ", NROW(sss), " => ", NROW(ttt)))
    print(sss[isna, ])
    ierr <- c(ierr, i)
  }
  # Check that no NAs remain after the fix
  isna <- which(is.na(Ad(ttt)))
  if (length(isna) > 0) {
    print("***** FIX FAILED *****")
  }
}

print(paste0("ERRORS IN ", length(ierr), " OUT OF ", count))
print(ierr)
Following is the output from a test run of the above code:
source("testna1.R")
[1] "13: 2 of 4770 => 4768"
QQQ.Open QQQ.High QQQ.Low QQQ.Close QQQ.Volume QQQ.Adjusted
2018-01-01 NA NA NA NA NA NA
2018-01-15 NA NA NA NA NA NA
[1] "17: 2 of 4770 => 4768"
QQQ.Open QQQ.High QQQ.Low QQQ.Close QQQ.Volume QQQ.Adjusted
2018-01-01 NA NA NA NA NA NA
2018-01-15 NA NA NA NA NA NA
[1] "26: 2 of 4770 => 4768"
QQQ.Open QQQ.High QQQ.Low QQQ.Close QQQ.Volume QQQ.Adjusted
2018-01-01 NA NA NA NA NA NA
2018-01-15 NA NA NA NA NA NA
[1] "35: 2 of 4770 => 4768"
QQQ.Open QQQ.High QQQ.Low QQQ.Close QQQ.Volume QQQ.Adjusted
2018-01-01 NA NA NA NA NA NA
2018-01-15 NA NA NA NA NA NA
[1] "49: 2 of 4770 => 4768"
QQQ.Open QQQ.High QQQ.Low QQQ.Close QQQ.Volume QQQ.Adjusted
2018-01-01 NA NA NA NA NA NA
2018-01-15 NA NA NA NA NA NA
[1] "ERRORS IN 5 OUT OF 50"
[1] 13 17 26 35 49
Warning messages:
1: QQQ contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them.
2: QQQ contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them.
3: QQQ contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them.
4: QQQ contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them.
5: QQQ contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them.
As you can see at the end, two dates on which the market was closed (2018-01-01 and 2018-01-15) came back as rows of NAs in 5 out of 50 executions. I typically see this in about 5 to 15 out of 50 executions, with no discernible pattern. Uncommenting the statement "dd <- na.omit(dd)" seems to fix the problem by dropping the rows with NAs. All 6 columns appear to be NA when the problem occurs, but it is also possible to key on just the column you are using: "dd <- dd[!is.na(Ad(dd)),]" for Adjusted prices or "dd <- dd[!is.na(Cl(dd)),]" for Closing prices. In all cases the warnings still appear, but the filtered data no longer contains the NA rows.
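To make the comparison concrete without depending on a download, here is a minimal sketch on a small made-up series. The dates, the numbers, and the names vals, by_row, by_adj and by_cl are invented for illustration only. It shows that na.omit() and the column-keyed filters keep the same rows when whole rows are NA. It does not explain why the NA rows show up in the first place.

```r
library(quantmod)  # also attaches xts and zoo

# Made-up stand-in for a download that came back with two all-NA rows.
dates <- as.Date(c("2017-12-29", "2018-01-01", "2018-01-02",
                   "2018-01-12", "2018-01-15", "2018-01-16"))
vals <- matrix(as.numeric(1:36), nrow = 6, ncol = 6)
vals[c(2, 5), ] <- NA   # the two non-trading days
colnames(vals) <- paste0("QQQ.", c("Open", "High", "Low",
                                   "Close", "Volume", "Adjusted"))
dd <- xts(vals, order.by = dates)

by_row <- na.omit(dd)            # drops any row with an NA in any column
by_adj <- dd[!is.na(Ad(dd)), ]   # drops rows whose Adjusted value is NA
by_cl  <- dd[!is.na(Cl(dd)), ]   # drops rows whose Close value is NA

# Sanity check: only rows in which every column is NA were dropped
all_na <- rowSums(is.na(coredata(dd))) == ncol(dd)
stopifnot(NROW(by_row) == NROW(dd) - sum(all_na))
stopifnot(all(index(by_row) == index(by_adj)))
stopifnot(all(index(by_row) == index(by_cl)))
```

The three results only diverge if a row is missing some columns but not others: na.omit() would drop that row, while the Ad() or Cl() filter keeps it as long as its own column is present.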
Do you have any ideas on what the problem is? At the very least, is the proposed fix of "dd <- na.omit(dd)" valid?
Thanks,
R. Davis