Using open source software for portfolio analysis

From finiki, the Canadian financial wiki

Using open source software for portfolio analysis is a compilation of open source software used to analyze portfolios. Topics covered include regression analysis, Monte Carlo simulation, and other statistical methods.

GnuCash

GnuCash is free and open-source accounting software for personal and small-business finances. It tracks bank accounts, stocks, income, and expenses, and it uses double-entry accounting to keep the books balanced and the reports accurate.[1] GnuCash's data file format is fully open; data is stored either in an XML file (the default) or in an SQL database.[2]

GnuCash has versions that run on Windows, macOS, and many different flavors of Linux and BSD.[3]

History

Programming on GnuCash began in 1997, and its first stable release was in 1998. Small Business Accounting was added in 2001. A Mac installer became available in 2004. A Windows port was released in 2007.[4]

In May 2012, the development of GnuCash for Android was announced.[5] This is an expense-tracking companion app for GnuCash, as opposed to a stand-alone accounting package.

Main features

GnuCash's features include:[6]

  • Double-entry accounting: every transaction must debit one account and credit others by an equal amount. This ensures that the books balance: the difference between income and expenses exactly equals the change in assets net of liabilities.
  • Checkbook-style register, including split transactions, autofill and the ability to mark a transaction cleared or reconciled
  • Scheduled Transactions
  • Reports and graphs: an integrated module displays your financial data as bar charts, pie charts, and scatter plots. GnuCash also includes a full suite of standard and customizable reports, such as Balance Sheet, Profit & Loss, and Portfolio Valuation.
  • Statement reconciliation, which lets the user compare the transactions entered in an account against a bank statement.
  • Stock, Bond, Mutual fund Accounts
  • Small-Business Accounting
  • Quicken Interchange Format (QIF) and Open Financial Exchange (OFX) import: if you are migrating from other financial software, GnuCash can import Intuit® Quicken® QIF files using an import assistant.
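
The double-entry rule in the feature list can be illustrated with a minimal Python sketch. This is not GnuCash's implementation or API: the `post` helper, the account names, and the sign convention (debits positive, credits negative) are assumptions made for illustration only.

```python
from collections import defaultdict
from decimal import Decimal

def post(ledger, splits):
    """Post one transaction given as {account: signed amount}.

    Debits are positive and credits are negative (a convention assumed
    for this sketch), so a valid transaction must sum to zero.
    """
    if sum(splits.values()) != 0:
        raise ValueError("unbalanced transaction")
    for account, amount in splits.items():
        ledger[account] += amount

ledger = defaultdict(Decimal)

# Salary received: debit the bank account, credit income.
post(ledger, {"Assets:Bank": Decimal("3000"),
              "Income:Salary": Decimal("-3000")})

# Split transaction: one credit to the bank, two expense debits.
post(ledger, {"Expenses:Rent": Decimal("1200"),
              "Expenses:Groceries": Decimal("300"),
              "Assets:Bank": Decimal("-1500")})

# Every transaction sums to zero, so the whole ledger does too.
trial_balance = sum(ledger.values())

# Income minus expenses equals the change in assets (no liabilities here).
net_income = -(ledger["Income:Salary"]
               + ledger["Expenses:Rent"]
               + ledger["Expenses:Groceries"])
```

Here `trial_balance` is zero and `net_income` matches the bank balance, which is the balancing property the feature list describes. Using `Decimal` rather than floating point avoids rounding drift in currency amounts.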

R programming language

R is a language and environment for statistical computing and graphics.

RStudio is a free and open source integrated development environment (IDE) for R.

R packages are distributed through CRAN, and the CRAN Task Views group them by topic. The Finance task view lists the packages available for financial analysis.

Asset correlation

This example computes the correlation matrix of the monthly returns of four funds, prints some summary statistics, and draws a set of pairwise scatter plots.

Source code (R):
library(quantmod)  # install.packages(c("quantmod", "psych")) if not yet installed
library(psych)     # provides describe() for summary statistics

# Download price history from Yahoo! Finance; each symbol becomes an xts object
getSymbols(c("XIU.TO", "XBB.TO", "XRE.TO", "GLD"), src = "yahoo", from = "2000-01-01")

# Merge the monthly returns of the adjusted prices into one xts object
returns <- merge(monthlyReturn(XIU.TO$XIU.TO.Adjusted),
                 monthlyReturn(XBB.TO$XBB.TO.Adjusted),
                 monthlyReturn(XRE.TO$XRE.TO.Adjusted),
                 monthlyReturn(GLD$GLD.Adjusted))
# Use short column names
names(returns) <- c("xiu", "xbb", "xre", "gld")

# Correlation matrix, using only the rows where all four funds have data
cor(returns, use = "complete.obs")

# Summary statistics: standard deviation, minimum, maximum, skew, and kurtosis
describe(returns)[, c("sd", "min", "max", "skew", "kurtosis")]

# Scatter plots to visualize the pairwise correlations
pairs(~ xiu + xbb + xre + gld, data = returns, upper.panel = NULL)
Output (correlation matrix, then summary statistics):

            xiu         xbb       xre       gld
xiu  1.00000000 -0.07174554 0.6165985 0.3090652
xbb -0.07174554  1.00000000 0.2208214 0.1644747
xre  0.61659853  0.22082135 1.0000000 0.0768836
gld  0.30906520  0.16447469 0.0768836 1.0000000

      sd   min  max  skew kurtosis
xiu 0.04 -0.17 0.13 -0.75     1.59
xbb 0.01 -0.03 0.05 -0.14     0.47
xre 0.04 -0.23 0.14 -1.36     6.13
gld 0.06 -0.16 0.13 -0.19     0.06

Bond market simulation

Bogleheads forum member linuxizer has developed a preliminary bond market simulator in R.

Downloading fund data

Bogleheads forum member camontgo has written a script to import fund data from Yahoo! Finance.

See: Downloading a Batch of Returns Using Yahoo! Finance, from The Calculating Investor

Multifactor regression

R is the preferred tool for performing multifactor regression analysis; a number of scripts are available in the referenced article under "R".
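
A multifactor regression fits a fund's returns to several factor return series by ordinary least squares. The following is a minimal Python sketch of that idea, not one of the referenced R scripts. It uses only the standard library and synthetic data; the factor names (`mkt`, `smb`, `hml`), the loadings, and the helper functions `solve` and `ols` are assumptions chosen for illustration.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(y, factors):
    """Return [alpha, beta_1, ..., beta_k] from the normal equations X'X b = X'y."""
    rows = [[1.0] + list(f) for f in zip(*factors)]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic monthly data (not real returns): three factors and a fund built
# from known loadings plus noise, so the fit can be compared with the truth.
rng = random.Random(42)
n = 240
mkt = [rng.gauss(0.006, 0.045) for _ in range(n)]
smb = [rng.gauss(0.002, 0.030) for _ in range(n)]
hml = [rng.gauss(0.003, 0.030) for _ in range(n)]
true_alpha, true_mkt, true_smb, true_hml = 0.001, 1.0, 0.4, 0.2
fund = [true_alpha + true_mkt * m + true_smb * s + true_hml * h
        + rng.gauss(0.0, 0.005)
        for m, s, h in zip(mkt, smb, hml)]

alpha, b_mkt, b_smb, b_hml = ols(fund, [mkt, smb, hml])
print(f"alpha={alpha:.4f} mkt={b_mkt:.2f} smb={b_smb:.2f} hml={b_hml:.2f}")
```

The estimated loadings should land close to the values used to build the synthetic fund. With real data, the fund and market series would normally be expressed as returns in excess of the risk-free rate before fitting.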

Statistical distribution moments, value at risk

This example plots a fund's periodic returns against the 2nd (variance), 3rd (skewness), and 4th (kurtosis) moments of its return distribution, and also shows examples of value at risk (VaR). The plots can be seen in the following forum posts: Re: Risk = ?? and Re: Risk = ??

Source code (R):
library(quantmod)  # data download and return calculations
library(e1071)     # skewness() and kurtosis()
library(rugarch)   # GARCH fitting and rolling VaR forecasts

# Download SPY price history from Yahoo! Finance
getSymbols("SPY", src = "yahoo", from = "1993-01-29")
spy.logr <- log(dailyReturn(SPY[, "SPY.Adjusted"]) + 1)  # daily log returns

# Return, variance, skewness, and kurtosis over non-overlapping 21-day windows
period <- 21  # trading days, about one month
spy.retn <- rollapply(spy.logr, period, sum, by = period, align = "right")
spy.var  <- rollapply(spy.logr, period, var, by = period, align = "right")
spy.skew <- rollapply(spy.logr, period, skewness, by = period, align = "right")
spy.kurt <- rollapply(spy.logr, period, kurtosis, by = period, align = "right")

# Scatter plot of the period log returns against one moment, with a fitted
# regression line and its adjusted R-squared
plot_fit <- function(x, label) {
  plot(coredata(spy.retn) ~ coredata(x), xlab = label, ylab = "log returns")
  abline(fit <- lm(spy.retn ~ x), col = "red")
  legend("top", bty = "n",
         legend = paste("R2 =", format(summary(fit)$adj.r.squared, digits = 2)))
}

par(mfrow = c(2, 3))  # six plots in one display: 2 rows x 3 columns
plot_fit(spy.var, "variance")
plot_fit(spy.skew, "skew")
plot_fit(spy.kurt, "kurtosis")

# Fit a GARCH model with Student-t innovations and plot the results
gspec <- ugarchspec(mean.model = list(armaOrder = c(0, 0)), distribution.model = "std")
gfit <- ugarchfit(gspec, spy.logr)
plot(gfit, which = 2)  # series with VaR limits
plot(gfit, which = 3)  # conditional standard deviation
# Re-estimate on a moving window, then plot the VaR forecasts
groll <- ugarchroll(gspec, spy.logr, refit.window = "moving")
plot(groll, which = 4)
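
For readers who want to see the quantities themselves rather than the plots, here is a minimal Python sketch of the moments and of a simple historical value at risk. It is not a port of the R script: it uses population (biased) moment formulas, which may differ slightly from the e1071 defaults, a plain empirical quantile instead of a GARCH model, and synthetic normally distributed returns. The function names and parameters are assumptions made for illustration.

```python
import random

def moments(returns):
    """Return (variance, skewness, excess kurtosis) using population formulas."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    return m2, m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def historical_var(returns, level=0.95):
    """Loss threshold taken from the empirical (1 - level) quantile.

    The result is reported as a positive number when the quantile is a loss.
    """
    ordered = sorted(returns)
    # round() guards against floating-point error in (1 - level) * n
    index = int(round((1.0 - level) * len(ordered), 9))
    return -ordered[index]

# Synthetic daily returns (not real market data), roughly ten years' worth
rng = random.Random(7)
returns = [rng.gauss(0.0004, 0.01) for _ in range(2500)]

variance, skew, kurt = moments(returns)
var_95 = historical_var(returns, level=0.95)
print(f"variance={variance:.6f} skew={skew:.2f} excess kurtosis={kurt:.2f}")
print(f"95% one-day historical VaR: {var_95:.4f}")
```

Because the synthetic returns are normal, the skewness and excess kurtosis should be near zero; real equity returns typically show negative skew and positive excess kurtosis, as the summary statistics elsewhere in this article suggest.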

MATLAB clones

MATLAB® (MATrix LABoratory) is a high-level language and interactive environment for numerical computation, visualization, and programming. It is neither open source nor free. However, it is an industry standard, and open source clones such as Octave have been developed to emulate MATLAB functionality.[note 1]

Below are examples which run in Octave or MATLAB. Bogleheads forum member camontgo developed and maintains this code at The Calculating Investor.

Efficient frontier (mean-variance optimization)

This three-part series calculates the efficient frontier for a set of securities.[note 2] The calculations are not very practical, because the results are too sensitive to small changes in the inputs, but they are educational, since much of financial theory builds on the ideas underlying Markowitz's work.[7]
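
The idea behind the efficient frontier can be shown for the simplest case of two assets. The Python sketch below is not the Octave/MATLAB code from the series; the expected returns, volatilities, and correlation are made-up inputs, and the function names are assumptions. It traces the risk/return curve over a grid of weights and finds the minimum-variance portfolio from the closed-form two-asset formula.

```python
import math

def portfolio(w, mu, sigma, rho):
    """Expected return and volatility with weight w in asset 0, 1 - w in asset 1."""
    ret = w * mu[0] + (1 - w) * mu[1]
    var = ((w * sigma[0]) ** 2
           + ((1 - w) * sigma[1]) ** 2
           + 2 * w * (1 - w) * rho * sigma[0] * sigma[1])
    return ret, math.sqrt(var)

def min_variance_weight(sigma, rho):
    """Weight in asset 0 that minimizes variance of a two-asset portfolio."""
    cov = rho * sigma[0] * sigma[1]
    return (sigma[1] ** 2 - cov) / (sigma[0] ** 2 + sigma[1] ** 2 - 2 * cov)

# Illustrative inputs only: a stock-like asset and a bond-like asset
mu = (0.08, 0.04)      # expected annual returns
sigma = (0.18, 0.06)   # annual volatilities
rho = 0.1              # correlation between the two assets

# Risk/return pairs for weights 0%, 5%, ..., 100% in asset 0
frontier = [portfolio(i / 20, mu, sigma, rho) for i in range(21)]

w_min = min_variance_weight(sigma, rho)
ret_min, vol_min = portfolio(w_min, mu, sigma, rho)

# The efficient part of the curve lies at or above the minimum-variance return
efficient = [(r, v) for r, v in frontier if r >= ret_min]
print(f"minimum-variance weight in asset 0: {w_min:.3f}, volatility {vol_min:.4f}")
```

Rerunning the sketch with slightly different values of `mu`, `sigma`, or `rho` is a quick way to see how strongly the output depends on the inputs.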

Rebalancing bonus

This replicates an analysis by William Bernstein that uses a Monte Carlo simulation of two portfolios to determine the rebalancing bonus. The simulation adds the math needed to model rebalancing with correlated assets; Bernstein's original analysis assumed no correlation.[7]
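
A minimal Python sketch of such a simulation follows. It is not the Octave/MATLAB code from The Calculating Investor. It assumes normally distributed annual returns, generates correlated pairs with a two-asset Cholesky factor, and takes the rebalancing bonus to be the annualized return of the annually rebalanced mix minus the weighted average of the two assets' own annualized returns. That definition, the parameter values, and all function names are assumptions made for illustration.

```python
import math
import random

def correlated_returns(rng, mu, sigma, rho):
    """One period of returns for two assets with correlation rho.

    Uses the 2x2 Cholesky factor of the correlation matrix.
    """
    z1 = rng.gauss(0, 1)
    z2 = rho * z1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
    return mu[0] + sigma[0] * z1, mu[1] + sigma[1] * z2

def rebalancing_bonus(rng, years, mu, sigma, rho, weight=0.5):
    """Rebalancing bonus for one simulated path (definition assumed above)."""
    growth = [1.0, 1.0]  # growth of each asset held on its own
    mix = 1.0            # growth of the annually rebalanced mix
    for _ in range(years):
        r0, r1 = correlated_returns(rng, mu, sigma, rho)
        # Guard against a simulated loss of more than 99%
        r0, r1 = max(r0, -0.99), max(r1, -0.99)
        growth[0] *= 1 + r0
        growth[1] *= 1 + r1
        mix *= 1 + weight * r0 + (1 - weight) * r1

    def annualized(g):
        return g ** (1 / years) - 1

    components = (weight * annualized(growth[0])
                  + (1 - weight) * annualized(growth[1]))
    return annualized(mix) - components

def average_bonus(rho, trials=1000, years=30, seed=0):
    """Average bonus over many paths; inputs are illustrative, not calibrated."""
    rng = random.Random(seed)
    mu, sigma = (0.08, 0.08), (0.20, 0.20)
    total = sum(rebalancing_bonus(rng, years, mu, sigma, rho)
                for _ in range(trials))
    return total / trials

bonus_uncorrelated = average_bonus(0.0)
bonus_correlated = average_bonus(0.8)
print(f"average bonus at rho=0.0: {bonus_uncorrelated:.4f}")
print(f"average bonus at rho=0.8: {bonus_correlated:.4f}")
```

Under these assumptions the bonus shrinks as the correlation rises, and it vanishes when the two assets move in lockstep.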

Market timing

In 1975, Nobel laureate William Sharpe published a study titled “Likely Gains from Market Timing”. In this paper, Sharpe reportedly found that a market timer who switches between 100% stocks and 100% T-bills on an annual basis must be correct about 74% of the time (on average) to beat the market.[8]

This Monte Carlo simulation uses a simple market timing strategy to determine the market timing accuracy required to outperform buy-and-hold. A comparison of market timing to buy-and-hold in terms of both total returns and risk-adjusted returns (measured by the Sharpe Ratio) is performed. The analysis shows that a surprisingly high (and unlikely) degree of accuracy is necessary to beat the market return through market timing.[7]

The factor data is obtained from the Kenneth R. French - Data Library, under Fama/French Factors (direct link). Be sure to remove the file headers and the extra data sets (starting around row 1046). Save the file as F-F_Factors_annual.txt.

This simulation takes a long time to run. To start, reduce the number of iterations from 10,000 to a lower number (such as 100) and ensure that the analysis is functioning correctly.
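
The structure of such a simulation can be sketched in a few lines of Python. This is not the Octave/MATLAB code described above: it draws synthetic, normally distributed stock returns instead of reading the Fama/French data, uses a fixed T-bill rate, and all parameter values and function names are assumptions made for illustration. It therefore should not be expected to reproduce Sharpe's figure.

```python
import random

def simulate(rng, years, accuracy, stock_mu=0.10, stock_sigma=0.20, bill=0.03):
    """Return (timer_wealth, buy_and_hold_wealth) for one simulated path.

    Each year the timer holds either 100% stocks or 100% T-bills and picks
    the better of the two with probability `accuracy`.
    """
    timer = hold = 1.0
    for _ in range(years):
        # Synthetic stock return, floored to avoid a loss of more than 99%
        stock = max(rng.gauss(stock_mu, stock_sigma), -0.99)
        best, worst = max(stock, bill), min(stock, bill)
        correct = rng.random() < accuracy
        timer *= 1 + (best if correct else worst)
        hold *= 1 + stock
    return timer, hold

def win_rate(accuracy, trials=2000, years=30, seed=1):
    """Fraction of simulated paths on which the timer beats buy-and-hold."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        timer, hold = simulate(rng, years, accuracy)
        if timer > hold:
            wins += 1
    return wins / trials

# Sweep the timer's accuracy; start with few trials, as suggested above
for accuracy in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(f"accuracy {accuracy:.0%}: beats buy-and-hold on "
          f"{win_rate(accuracy, trials=500):.0%} of paths")
```

Comparing risk-adjusted returns, as the original analysis does, would additionally require tracking the yearly returns of each strategy and computing a Sharpe ratio for each.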

Python

Python is a rapid-development scripting language that is suitable for many tasks. Add-on libraries such as NumPy and pandas make financial analysis easy. Many IDEs are available.

LibreOffice

LibreOffice is a free and open source office suite that can replace Microsoft Office. Spreadsheets are probably the most widely used financial analysis software.

Several spreadsheets are available in the Bogleheads wiki. See: Using a spreadsheet to maintain a portfolio; see also Google Docs spreadsheets.

Notes

  1. See Matlab Clones for a detailed overview.
  2. The example covariance matrix could be replaced with a real covariance matrix. For example, the R script for downloading Yahoo! Finance returns could be used to get returns for a set of assets, and the covariance matrix calculated from them could be used as the input to these scripts, though the scaling would probably need to be changed.

References

  1. "GnuCash official website". Retrieved December 11, 2017.
  2. "FAQ - GnuCash". Retrieved December 11, 2017.
  3. GnuCash Wiki. Retrieved December 11, 2017.
  4. "GnuCash - Older Announcements". gnucash.org. Retrieved December 11, 2017.
  5. http://www.codinguser.com/2012/05/gnucash-mobile/
  6. "GnuCash Features". Retrieved December 11, 2017.
  7. camontgo PM to LadyGeek.
  8. Market Timing: How good is good enough?, from The Calculating Investor
