Quickstart

Here is a quick rundown of ffn’s capabilities. For a more complete guide, read the source, or check out the API docs.

import ffn

%matplotlib inline

Data Retrieval

The main method for data retrieval is the get function. The get function uses a data provider to download data from an external service and packs that data into a pandas DataFrame for further manipulation.

Note

On import, ffn extends pandas.core.base.PandasObject to add functionality to pandas objects, including DataFrames. This is why methods such as to_log_returns and calc_stats can be called directly on the DataFrames below.

data = ffn.get('agg,hyg,spy,eem,efa', start='2010-01-01', end='2014-01-01')
print(data.head())

                   agg        hyg        spy        eem        efa
 Date
 2010-01-04  74.942818  43.466671  89.225403  33.181232  38.846069
 2010-01-05  75.283783  43.672871  89.461578  33.422070  38.880318
 2010-01-06  75.240242  43.785816  89.524582  33.491989  39.044666
 2010-01-07  75.153183  43.962566  89.902481  33.297764  38.894009
 2010-01-08  75.196701  44.031300  90.201675  33.561905  39.202148

By default, the data is downloaded from Yahoo! Finance and the Adjusted Close is used as the security’s price. Other data sources are available, and you may select other fields as well. Fields are specified in the format {ticker}:{field}. So, to get the Open, High, Low, and Close for aapl, we would do the following:

print(ffn.get('aapl:Open,aapl:High,aapl:Low,aapl:Close', start='2010-01-01', end='2014-01-01').head())

             aaplopen  aaplhigh   aapllow  aaplclose
 Date
 2010-01-04  7.622500  7.660714  7.585000   7.643214
 2010-01-05  7.664286  7.699643  7.616071   7.656429
 2010-01-06  7.656429  7.686786  7.526786   7.534643
 2010-01-07  7.562500  7.571429  7.466071   7.520714
 2010-01-08  7.510714  7.571429  7.466429   7.570714
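To make the {ticker}:{field} format concrete, here is a minimal sketch of how such a request string could be split into ticker/field pairs and turned into column names like the ones in the output above. The helpers parse_ticker_spec and column_name are hypothetical and are not part of ffn; the "Adj Close" default and the lowercase, alphanumeric-only column naming are assumptions inferred from the text and the printed columns.

```python
import re


def parse_ticker_spec(spec, default_field="Adj Close", sep=":"):
    """Split 'aapl:Open' into ('aapl', 'Open').

    Hypothetical helper: the default field name is an assumption based on
    the statement that the Adjusted Close is used when no field is given.
    """
    ticker, _, field = spec.partition(sep)
    return ticker, field or default_field


def column_name(spec):
    """Assumed naming rule: lowercase and drop non-alphanumeric characters."""
    return re.sub(r"[^a-z0-9]", "", spec.lower())


request = "aapl:Open,aapl:High,aapl:Low,aapl:Close"
specs = request.split(",")

parsed = [parse_ticker_spec(s) for s in specs]
columns = [column_name(s) for s in specs]

print(parsed)
print(columns)
```

Under these assumptions the column names come out as aaplopen, aaplhigh, aapllow and aaplclose, which is consistent with the DataFrame header printed above.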

The default data provider is ffn.data.web(), a thin wrapper around pandas’ pandas.io.data provider. Please refer to the appropriate docs for more information (data sources, etc.). The ffn.data.csv() provider is also available for loading data from a local file. Here, we tell ffn.data.get() to use the csv provider. Because we also want to merge the new data with the data we downloaded earlier, we pass the data object as the existing argument, and the new data is merged into the existing DataFrame.

data = ffn.get('dbc', provider=ffn.data.csv, path='test_data.csv', existing=data)
print(data.head())

                   agg        hyg        spy        eem        efa    dbc
 Date
 2010-01-04  74.942818  43.466671  89.225403  33.181232  38.846069  25.24
 2010-01-05  75.283783  43.672871  89.461578  33.422070  38.880318  25.27
 2010-01-06  75.240242  43.785816  89.524582  33.491989  39.044666  25.72
 2010-01-07  75.153183  43.962566  89.902481  33.297764  38.894009  25.40
 2010-01-08  75.196701  44.031300  90.201675  33.561905  39.202148  25.38

As we can see above, the dbc column was added to the DataFrame. Internally, get uses the function ffn.merge, which is useful when you want to merge TimeSeries and DataFrames together. We plan on adding many more data sources over time. If you know your way around Python and would like to contribute a data provider, please feel free to submit a pull request. Contributions are always welcome!

Data Manipulation

Now that we have some data, let’s start manipulating it. In quantitative finance, we are often interested in the returns of a given time series. Let’s calculate the returns by simply calling the to_returns or to_log_returns extension methods.

returns = data.to_log_returns().dropna()
print(returns.head())

                  agg       hyg       spy       eem       efa       dbc
 Date
 2010-01-05  0.004539  0.004733  0.002643  0.007232  0.000881  0.001188
 2010-01-06 -0.000579  0.002583  0.000704  0.002090  0.004218  0.017651
 2010-01-07 -0.001158  0.004029  0.004212 -0.005816 -0.003866 -0.012520
 2010-01-08  0.000579  0.001562  0.003322  0.007901  0.007891 -0.000788
 2010-01-11 -0.000772 -0.000893  0.001395 -0.002085  0.008176 -0.003157
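As a cross-check on what the extension methods compute, here is a library-free sketch of simple and log returns, applied to the first five agg prices printed earlier. It assumes that to_returns is the period-over-period percentage change and that to_log_returns is the natural log of the price ratio, which is consistent with the numbers shown above. The function names are illustrative only.

```python
import math


def simple_returns(prices):
    """Period-over-period returns: p_t / p_{t-1} - 1."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


def log_returns(prices):
    """Log returns: ln(p_t / p_{t-1})."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


# First five agg prices from the data.head() output above
agg = [74.942818, 75.283783, 75.240242, 75.153183, 75.196701]

agg_simple = simple_returns(agg)
agg_log = log_returns(agg)

for s, l in zip(agg_simple, agg_log):
    print(f"simple={s: .6f}  log={l: .6f}")
```

The first observation has no previous price, so it has no return. In the pandas version that row is NaN, which is why the example above calls dropna().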

Let’s plot a histogram of each asset’s returns to see how they are distributed.

ax = returns.hist(figsize=(12, 5))

We can also use the numerous functions packed into numpy, pandas and the like to further analyze the returns. For example, we can use the corr function to get the pairwise correlations between assets.

returns.corr().as_format('.2f')

        agg    hyg    spy    eem    efa    dbc
 agg   1.00  -0.12  -0.33  -0.23  -0.29  -0.18
 hyg  -0.12   1.00   0.77   0.75   0.76   0.49
 spy  -0.33   0.77   1.00   0.88   0.92   0.59
 eem  -0.23   0.75   0.88   1.00   0.90   0.62
 efa  -0.29   0.76   0.92   0.90   1.00   0.61
 dbc  -0.18   0.49   0.59   0.62   0.61   1.00

Here we used the convenience method as_format to have a prettier output. We could also plot a heatmap to better visualize the results.

returns.plot_corr_heatmap();

We used ffn.core.plot_corr_heatmap(), a convenience method that calls ffn.core.plot_heatmap() with sensible default arguments.
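For reference, the pairwise figures that corr reports are, by default in pandas, Pearson correlation coefficients. The sketch below computes the same quantity in plain Python for two short columns taken from the returns.head() output above. Five rows are far too few to reproduce the full-sample values, so the result is only illustrative. The function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def corr_matrix(columns):
    """Pairwise correlations for a dict of name -> list of returns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# First five log returns of agg and spy from the output above
sample = {
    "agg": [0.004539, -0.000579, -0.001158, 0.000579, -0.000772],
    "spy": [0.002643, 0.000704, 0.004212, 0.003322, 0.001395],
}

matrix = corr_matrix(sample)
for name, row in matrix.items():
    print(name, {k: round(v, 2) for k, v in row.items()})
```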

Let’s start looking at how all these securities performed over the period. To achieve this, we will plot rebased time series so that we can see how they each performed relative to each other.

ax = data.rebase().plot(figsize=(12,5))
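Rebasing simply rescales each series so that it starts at a common value, which makes series with very different price levels comparable on one chart. Here is a minimal sketch, assuming a starting value of 100 and using the first five spy and eem prices from the output above. The standalone rebase function is illustrative, not ffn's implementation.

```python
def rebase(prices, value=100.0):
    """Rescale a price series so that its first observation equals `value`."""
    first = prices[0]
    return [p / first * value for p in prices]


# First five prices from the data.head() output above
spy = [89.225403, 89.461578, 89.524582, 89.902481, 90.201675]
eem = [33.181232, 33.422070, 33.491989, 33.297764, 33.561905]

rebased_spy = rebase(spy)
rebased_eem = rebase(eem)

print([round(p, 2) for p in rebased_spy])
print([round(p, 2) for p in rebased_eem])
```

After rebasing, both series start at 100, so the last value of each can be read directly as the cumulative growth over the window.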

Performance Measurement

For a more complete view of each asset’s performance over the period, we can use the ffn.core.calc_stats() method, which creates a ffn.core.GroupStats object. A GroupStats object wraps a set of ffn.core.PerformanceStats objects in a dict and adds some convenience methods.

perf = data.calc_stats()

Now that we have our GroupStats object, we can analyze the performance in greater detail. For example, the plot method yields a graph similar to the one above.

perf.plot();

We can also display a wide array of statistics, all of which are contained in the PerformanceStats objects. The table is wide, so it may render poorly in these docs; it is best viewed in a Notebook. We are also actively working to improve how these stats are displayed.

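# display() prints the table itself and returns None,
# which is why "None" appears as the last line of the output
print(perf.display())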

 Stat                 agg         hyg         spy         eem         efa         dbc
 -------------------  ----------  ----------  ----------  ----------  ----------  ----------
 Start                2010-01-04  2010-01-04  2010-01-04  2010-01-04  2010-01-04  2010-01-04
 End                  2013-12-31  2013-12-31  2013-12-31  2013-12-31  2013-12-31  2013-12-31
 Risk-free rate       0.00%       0.00%       0.00%       0.00%       0.00%       0.00%

 Total Return         16.36%      39.22%      76.92%      5.46%       33.43%      1.66%
 Daily Sharpe         1.11        0.97        0.93        0.18        0.44        0.11
 Daily Sortino        1.84        1.51        1.48        0.29        0.69        0.17
 CAGR                 3.87%       8.65%       15.37%      1.34%       7.50%       0.41%
 Max Drawdown         -5.14%      -10.06%     -18.61%     -30.87%     -25.86%     -24.34%
 Calmar Ratio         0.75        0.86        0.83        0.04        0.29        0.02

 MTD                  -0.56%      0.41%       2.59%       -0.41%      2.18%       0.59%
 3m                   0.02%       3.42%       10.52%      3.48%       6.08%       -0.39%
 6m                   0.57%       5.84%       16.32%      9.55%       18.12%      2.11%
 YTD                  -1.98%      5.75%       32.31%      -3.65%      21.44%      -7.63%
 1Y                   -1.98%      5.75%       32.31%      -3.65%      21.44%      -7.63%
 3Y (ann.)            3.08%       7.83%       16.07%      -2.34%      8.17%       -2.34%
 5Y (ann.)            -           -           -           -           -           -
 10Y (ann.)           -           -           -           -           -           -
 Since Incep. (ann.)  3.87%       8.65%       15.37%      1.34%       7.50%       0.41%

 Daily Sharpe         1.11        0.97        0.93        0.18        0.44        0.11
 Daily Sortino        1.84        1.51        1.48        0.29        0.69        0.17
 Daily Mean (ann.)    3.86%       8.70%       15.73%      4.35%       9.73%       1.83%
 Daily Vol (ann.)     3.48%       8.97%       16.83%      24.56%      22.31%      16.84%
 Daily Skew           -0.40       -0.55       -0.39       -0.12       -0.26       -0.47
 Daily Kurt           2.30        7.50        4.03        3.06        3.64        2.90
 Best Day             0.84%       3.05%       4.65%       7.20%       6.74%       4.34%
 Worst Day            -1.24%      -4.26%      -6.51%      -8.34%      -7.46%      -6.70%

 Monthly Sharpe       1.23        1.11        1.22        0.30        0.60        0.27
 Monthly Sortino      2.49        2.19        2.36        0.53        1.06        0.43
 Monthly Mean (ann.)  3.59%       9.51%       16.99%      6.43%       11.06%      4.61%
 Monthly Vol (ann.)   2.93%       8.56%       13.91%      21.45%      18.41%      17.10%
 Monthly Skew         -0.34       0.14        -0.32       -0.10       -0.37       -0.74
 Monthly Kurt         0.02        1.75        0.24        1.28        0.17        1.16
 Best Month           1.77%       8.49%       10.91%      16.27%      11.61%      9.89%
 Worst Month          -2.00%      -5.30%      -7.95%      -17.89%     -11.19%     -14.62%

 Yearly Sharpe        0.65        2.79        1.10        -0.06       0.50        -0.40
 Yearly Sortino       2.77        inf         inf         -0.11       1.32        -0.58
 Yearly Mean          3.16%       7.85%       16.73%      -1.13%      9.32%       -2.24%
 Yearly Vol           4.86%       2.82%       15.22%      19.05%      18.72%      5.57%
 Yearly Skew          -0.54       1.49        0.22        0.58        -1.69       0.27
 Yearly Kurt          -           -           -           -           -           -
 Best Year            7.70%       11.06%      32.31%      19.05%      21.44%      3.50%
 Worst Year           -1.98%      5.75%       1.89%       -18.79%     -12.23%     -7.63%

 Avg. Drawdown        -0.48%      -1.18%      -1.78%      -5.16%      -4.96%      -5.09%
 Avg. Drawdown Days   16.95       15.70       17.55       78.22       60.04       107.85
 Avg. Up Month        0.83%       1.86%       3.58%       5.87%       4.37%       4.28%
 Avg. Down Month      -0.49%      -2.31%      -3.21%      -3.41%      -4.15%      -3.35%
 Win Year %           66.67%      100.00%     100.00%     33.33%      66.67%      33.33%
 Win 12m %            81.08%      97.30%      94.59%      59.46%      70.27%      45.95%
 None
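Two of the headline figures in this table, CAGR and Max Drawdown, are easy to sketch by hand. The snippet below is a plain-Python illustration, not ffn's code. The 365.25-day year used to annualize is an assumption; it was chosen because it reproduces the spy CAGR shown above from the spy total return and the start and end dates. The drawdown example uses made-up toy prices.

```python
from datetime import date


def max_drawdown(prices):
    """Largest peak-to-trough decline, as a negative fraction."""
    peak = prices[0]
    worst = 0.0
    for p in prices:
        peak = max(peak, p)                 # running high-water mark
        worst = min(worst, p / peak - 1.0)  # decline relative to that mark
    return worst


def cagr(total_return, start, end, days_per_year=365.25):
    """Annualized growth rate implied by a total return over [start, end].

    days_per_year=365.25 is an assumption made for this sketch.
    """
    years = (end - start).days / days_per_year
    return (1.0 + total_return) ** (1.0 / years) - 1.0


# spy total return and dates as reported in the table above
spy_cagr = cagr(0.769155, date(2010, 1, 4), date(2013, 12, 31))
print(f"spy CAGR: {spy_cagr:.2%}")

# Toy price path: the peak is 120 and the lowest later point is 80
toy_dd = max_drawdown([100.0, 120.0, 90.0, 110.0, 80.0])
print(f"toy max drawdown: {toy_dd:.2%}")
```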

We can also access the underlying PerformanceStats for each series, either by index or by name.

# we can also use perf[2] in this case
perf['spy'].display_monthly_returns()

   Year    Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec    YTD
 ------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
   2010  -5.24   3.12   6.09   1.55  -7.95  -5.17   6.83  -4.5    8.96   3.82   0      6.69  13.14
   2011   2.33   3.47   0.01   2.9   -1.12  -1.69  -2     -5.5   -6.94  10.91  -0.41   1.04   1.89
   2012   4.64   4.34   3.22  -0.67  -6.01   4.06   1.18   2.51   2.54  -1.82   0.57   0.89  15.99
   2013   5.12   1.28   3.8    1.92   2.36  -1.33   5.17  -3      3.16   4.63   2.96   2.59  32.31

perf[2].plot_histogram();
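The YTD column in the monthly table above appears to be the compounded product of the twelve monthly returns rather than their sum. The sketch below checks this for the 2010 row. Because the monthly figures are rounded to two decimals, the result only matches the reported 13.14 approximately. The compound helper is illustrative and not part of ffn.

```python
def compound(returns_pct):
    """Compound a sequence of percentage returns into one percentage return."""
    growth = 1.0
    for r in returns_pct:
        growth *= 1.0 + r / 100.0
    return (growth - 1.0) * 100.0


# spy monthly returns for 2010, in percent, from the table above
spy_2010 = [-5.24, 3.12, 6.09, 1.55, -7.95, -5.17, 6.83, -4.5, 8.96, 3.82, 0, 6.69]

ytd_2010 = compound(spy_2010)
print(f"compounded: {ytd_2010:.2f}%  simple sum: {sum(spy_2010):.2f}%")
```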

Most of the stats are also available as pandas objects: see the stats, return_table, and lookback_returns attributes.

perf['spy'].stats

 start                    2010-01-04 00:00:00
 end                      2013-12-31 00:00:00
 rf                                       0.0
 total_return                        0.769155
 cagr                                 0.15375
 max_drawdown                       -0.186055
 calmar                              0.826367
 mtd                                 0.025926
 three_month                         0.105247
 six_month                           0.163183
 ytd                                 0.323077
 one_year                            0.323077
 three_year                           0.16066
 five_year                                NaN
 ten_year                                 NaN
 incep                                0.15375
 daily_sharpe                         0.93439
 daily_sortino                       1.478916
 daily_mean                          0.157279
 daily_vol                           0.168323
 daily_skew                         -0.388777
 daily_kurt                          4.028481
 best_day                            0.046499
 worst_day                          -0.065123
 monthly_sharpe                      1.221065
 monthly_sortino                     2.362922
 monthly_mean                        0.169906
 monthly_vol                         0.139146
 monthly_skew                       -0.319921
 monthly_kurt                        0.235707
 best_month                          0.109147
 worst_month                        -0.079455
 yearly_sharpe                       1.099284
 yearly_sortino                           inf
 yearly_mean                          0.16731
 yearly_vol                          0.152199
 yearly_skew                          0.21847
 yearly_kurt                              NaN
 best_year                           0.323077
 worst_year                           0.01895
 avg_drawdown                       -0.017845
 avg_drawdown_days                  17.550725
 avg_up_month                        0.035827
 avg_down_month                     -0.032066
 win_year_perc                            1.0
 twelve_month_win_perc               0.945946
 dtype: object
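Several of these fields are simple functions of one another, which makes for a useful sanity check. The sketch below assumes that the Sharpe ratios are the annualized mean in excess of rf divided by the annualized volatility, and that the Calmar ratio is the CAGR divided by the absolute maximum drawdown. The published spy numbers are consistent with both assumptions. The helper names are hypothetical.

```python
# Selected values copied from the perf['spy'].stats output above
spy = {
    "rf": 0.0,
    "cagr": 0.15375,
    "max_drawdown": -0.186055,
    "daily_mean": 0.157279,
    "daily_vol": 0.168323,
    "monthly_mean": 0.169906,
    "monthly_vol": 0.139146,
}


def sharpe(mean, vol, rf=0.0):
    """Annualized excess return per unit of annualized volatility."""
    return (mean - rf) / vol


def calmar(cagr, max_drawdown):
    """Annualized growth per unit of worst peak-to-trough loss."""
    return cagr / abs(max_drawdown)


daily_sharpe = sharpe(spy["daily_mean"], spy["daily_vol"], spy["rf"])
monthly_sharpe = sharpe(spy["monthly_mean"], spy["monthly_vol"], spy["rf"])
calmar_ratio = calmar(spy["cagr"], spy["max_drawdown"])

print(f"daily_sharpe   ~ {daily_sharpe:.4f}")
print(f"monthly_sharpe ~ {monthly_sharpe:.4f}")
print(f"calmar         ~ {calmar_ratio:.4f}")
```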

Numerical Routines and Financial Functions

ffn also provides commonly used numerical routines, and we plan to add many more in the future. For example, the ffn.core.calc_mean_var_weights() function determines portfolio weights using a mean-variance approach.

returns.calc_mean_var_weights().as_format('.2%')

 agg    79.52%
 hyg     6.47%
 spy    14.01%
 eem     0.00%
 efa     0.00%
 dbc     0.00%
 dtype: object
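To give a feel for what a mean-variance weighting does, here is a deliberately small sketch: a long-only, fully invested, two-asset portfolio whose weights are chosen by grid search to maximize the Sharpe ratio. This is an illustration of the general idea only. The objective, the constraints, and the grid search are assumptions made for this sketch, and ffn's own routine is not reproduced here. The agg and spy inputs are the annualized daily mean, volatility, and correlation figures reported earlier on this page.

```python
import math


def sharpe_ratio(weights, mean_returns, cov, rf=0.0):
    """Portfolio excess return divided by portfolio volatility."""
    n = len(weights)
    port_mean = sum(w * m for w, m in zip(weights, mean_returns))
    port_var = sum(
        weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    return (port_mean - rf) / math.sqrt(port_var)


def best_two_asset_weights(mean_returns, cov, rf=0.0, steps=100):
    """Grid search over w in [0, 1] for the pair (w, 1 - w)."""
    candidates = ([k / steps, 1.0 - k / steps] for k in range(steps + 1))
    return max(candidates, key=lambda w: sharpe_ratio(w, mean_returns, cov, rf))


# Symmetric toy case: identical, uncorrelated assets should be split evenly
toy_weights = best_two_asset_weights([0.05, 0.05], [[0.04, 0.0], [0.0, 0.04]])

# agg vs spy, using figures reported earlier on this page
agg_vol, spy_vol, rho = 0.0348, 0.1683, -0.33
cov = [
    [agg_vol ** 2, rho * agg_vol * spy_vol],
    [rho * agg_vol * spy_vol, spy_vol ** 2],
]
agg_spy_weights = best_two_asset_weights([0.0386, 0.1573], cov)

print("toy:", toy_weights)
print("agg/spy:", [round(w, 2) for w in agg_spy_weights])
```

The two-asset result is not expected to match the six-asset weights printed above, since the other assets are left out and the search is much cruder.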

Some other interesting functions are the clustering routines, such as a Python implementation of David Varadi’s Fast Threshold Clustering Algorithm (FTCA):

returns.calc_ftca(threshold=0.8)

 {1: ['eem', 'spy', 'efa'], 2: ['agg'], 3: ['dbc'], 4: ['hyg']}
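The following plain-Python sketch shows the clustering procedure as it is commonly described: repeatedly pick the remaining assets with the highest and lowest average correlation, then either group them together or seed two separate clusters, depending on the threshold. It is an illustration under that reading of the algorithm, not ffn's implementation. The function name ftca_sketch is hypothetical, and tie-breaking may differ from the library. The input is the rounded correlation table shown earlier on this page.

```python
def ftca_sketch(corr, threshold=0.5):
    """Cluster assets from a dict-of-dicts correlation matrix."""
    remain = list(corr)
    clusters = {}
    next_id = 0

    def mean_corr(name):
        # Average correlation to the other assets that are still unassigned
        others = [corr[name][o] for o in remain if o != name]
        return sum(others) / len(others)

    while remain:
        if len(remain) == 1:
            next_id += 1
            clusters[next_id] = [remain.pop()]
            break

        ranked = sorted(remain, key=mean_corr)  # stable sort keeps ties in order
        low, high = ranked[0], ranked[-1]
        remain.remove(low)
        remain.remove(high)

        if corr[high][low] > threshold:
            # One cluster around both seeds
            next_id += 1
            members = [low, high] + [
                x for x in remain
                if (corr[x][high] + corr[x][low]) / 2.0 > threshold
            ]
            clusters[next_id] = members
            remain = [x for x in remain if x not in members]
        else:
            # Two clusters: one seeded by `high`, then one seeded by `low`
            for seed in (high, low):
                next_id += 1
                members = [seed] + [x for x in remain if corr[x][seed] > threshold]
                clusters[next_id] = members
                remain = [x for x in remain if x not in members]

    return clusters


names = ["agg", "hyg", "spy", "eem", "efa", "dbc"]
rows = [
    [1.00, -0.12, -0.33, -0.23, -0.29, -0.18],
    [-0.12, 1.00, 0.77, 0.75, 0.76, 0.49],
    [-0.33, 0.77, 1.00, 0.88, 0.92, 0.59],
    [-0.23, 0.75, 0.88, 1.00, 0.90, 0.62],
    [-0.29, 0.76, 0.92, 0.90, 1.00, 0.61],
    [-0.18, 0.49, 0.59, 0.62, 0.61, 1.00],
]
corr = {a: dict(zip(names, row)) for a, row in zip(names, rows)}

clusters = ftca_sketch(corr, threshold=0.8)
print(clusters)
```

With the rounded correlations and a threshold of 0.8, this sketch yields the same grouping as the output above: the three equity funds cluster together, while agg, dbc and hyg each end up alone.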