To revolutionize financial education, research and practice through Python.

Today's talk covers the following topics:

- Data Science & Finance Today
- Open Data Science & The Cloud
- The Importance of the Python Ecosystem
- Why Python for Finance?
- How to Learn and Do Python for Finance?
- A "Complete Package" Example

For instance, Wikipedia defines the field as follows (cf. https://en.wikipedia.org/wiki/Data_science):

Data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics, ...

"Big data is data that does not fit into an Excel spreadsheet." *Unknown*.

“If I had asked people what they wanted, they would have said faster horses.” Henry Ford

Source: http://kdnuggets.com

Source: http://boingboing.net

In Python, you generally start with the **import** of some libraries and packages.

In [2]:

```
from pandas import *
from pylab import *
from pandas_datareader import data as web
from seaborn import *; set()
%matplotlib inline
```

Consider the stochastic differential equation of a **geometric Brownian motion** as used by Black-Scholes-Merton (1973), describing the evolution of a stock index over time, for example:

$$dS_t = r S_t \, dt + \sigma S_t \, dZ_t$$

A possible Euler discretization to simulate the time $T$ value of $S$ is given by the **difference equation**

$$S_T = S_0 \exp\left(\left(r - \frac{1}{2}\sigma^2\right) T + \sigma \sqrt{T} z\right)$$

with $z$ being a standard normally distributed random variable.
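One way to see where the exponent comes from, sketched here under the assumption of constant $r$ and $\sigma$, is to apply Itô's lemma to $\ln S_t$ and integrate from $0$ to $T$:

$$
d \ln S_t = \left(r - \tfrac{1}{2}\sigma^2\right) dt + \sigma \, dZ_t
\quad \Longrightarrow \quad
\ln S_T = \ln S_0 + \left(r - \tfrac{1}{2}\sigma^2\right) T + \sigma Z_T
$$

Because $Z_T$ is normally distributed with mean $0$ and variance $T$, it can be replaced by $\sqrt{T} z$. Exponentiating both sides then yields the expression `S0 * exp((r - 0.5 * sigma ** 2) * T + sigma * sqrt(T) * z)` used in the simulation code.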

The **simulation** of 10,000,000 time $T$ values for the index is fast in Python, and the syntax is concise.

In [3]:

```
%%time
S0 = 100.; r = 0.05; sigma = 0.2; T = 1.
z = standard_normal(10000000)
ST = S0 * exp((r - 0.5 * sigma ** 2) * T + sigma * sqrt(T) * z)
```

CPU times: user 787 ms, sys: 165 ms, total: 952 ms Wall time: 1.33 s

We would expect for the **mean value** $\bar{S}_T = S_0 \cdot e^{r T} \approx 105.1$.

In [4]:

```
S0 * exp(r * T)
```

Out[4]:

105.12710963760242

In [5]:

```
ST.mean()
```

Out[5]:

105.135621517714
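The small gap between the theoretical and the simulated mean is Monte Carlo sampling error. The following is a minimal sketch of how that error could be quantified. The explicit `numpy` import, the seeded generator from `np.random.default_rng`, and the reduced sample size of 1,000,000 are assumptions made for illustration.

```python
import numpy as np

S0, r, sigma, T = 100.0, 0.05, 0.2, 1.0
n = 1_000_000  # smaller than the 10,000,000 draws used in the talk

rng = np.random.default_rng(seed=42)  # fixed seed so the run is repeatable
z = rng.standard_normal(n)
ST = S0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * z)

theoretical_mean = S0 * np.exp(r * T)
simulated_mean = ST.mean()
# standard error of the sample mean: sample standard deviation / sqrt(n)
standard_error = ST.std(ddof=1) / np.sqrt(n)

print('theoretical mean: %.4f' % theoretical_mean)
print('simulated mean:   %.4f' % simulated_mean)
print('standard error:   %.4f' % standard_error)
```

The standard error shrinks with the square root of the number of draws. With 10,000,000 draws it is roughly 0.007, so the difference of about 0.0085 between `Out[4]` and `Out[5]` is a little over one standard error.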

The **histogram** of the simulated values, with the mean value marked by a red line.

In [6]:

```
figure(figsize=(10, 6))
hist(ST, bins=45);
axvline(ST.mean(), color='r');
```

In addition, **option pricing** by Monte Carlo simulation is also easily implemented.

In [7]:

```
%%time
strike = 105. # option strike
payoff = maximum(ST - strike, 0) # payoff at maturity
C0 = exp(-r * T) * payoff.mean() # MCS estimator
print('European call option value is %.2f.' % C0)
```
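A Monte Carlo estimate like this can be checked against the closed-form Black-Scholes-Merton value for a European call. The sketch below assumes no dividends. The helper names `norm_cdf` and `bsm_call_value`, the fixed seed, and the sample size of 1,000,000 are illustrative choices, not part of the original notebook.

```python
from math import erf, exp, log, sqrt

import numpy as np


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bsm_call_value(S0, K, T, r, sigma):
    """Black-Scholes-Merton value of a European call (no dividends)."""
    d1 = (log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S0 * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)


S0, strike, T, r, sigma = 100.0, 105.0, 1.0, 0.05, 0.2

# Monte Carlo estimator with seeded draws so the result is repeatable
rng = np.random.default_rng(seed=42)
z = rng.standard_normal(1_000_000)
ST = S0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * z)
mcs_value = np.exp(-r * T) * np.maximum(ST - strike, 0).mean()

bsm_value = bsm_call_value(S0, strike, T, r, sigma)
print('MCS estimate:  %.3f' % mcs_value)
print('BSM benchmark: %.3f' % bsm_value)
```

The two numbers should agree to within the sampling error of the simulation. Increasing the number of draws narrows the gap.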

We read **historical daily closing data** for the **S&P 500 index**.

In [8]:

```
spx = web.DataReader('^GSPC', data_source='yahoo')['Close']
```

In [9]:

```
spx.tail()
```

Out[9]:

Date | Close |
---|---|
2016-06-14 | 2075.320068 |
2016-06-15 | 2071.500000 |
2016-06-16 | 2077.989990 |
2016-06-17 | 2071.219971 |
2016-06-20 | 2083.250000 |

Name: Close, dtype: float64

A simple method call allows us to **visualize** the time series data.

In [10]:

```
spx.plot(figsize=(10, 6));
```

We also read historical closing data for the **VIX volatility index** ...

In [11]:

```
vix = web.DataReader('^VIX', data_source='yahoo')['Close']
```

In [12]:

```
vix.tail()
```

Out[12]:

Date | Close |
---|---|
2016-06-14 | 20.500000 |
2016-06-15 | 20.139999 |
2016-06-16 | 19.370001 |
2016-06-17 | 19.410000 |
2016-06-20 | 18.370001 |

Name: Close, dtype: float64

... and **visualize** it.

In [13]:

```
vix.plot(figsize=(10, 6));
```

First, we **combine the two data sets** into one.

In [14]:

```
data = DataFrame({'SPX': spx, 'VIX': vix})
```
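Building a `DataFrame` from a dictionary of `Series` aligns the columns on their index, so dates present in only one series show up as `NaN` in the other column. The following sketch illustrates this with two small, made-up series. The values and the column names `A` and `B` are hypothetical.

```python
import pandas as pd

# two hypothetical series with partly different dates
a = pd.Series([1.0, 2.0, 3.0],
              index=pd.to_datetime(['2016-06-14', '2016-06-15', '2016-06-16']))
b = pd.Series([10.0, 30.0],
              index=pd.to_datetime(['2016-06-14', '2016-06-16']))

# the constructor takes the union of both indexes and fills gaps with NaN
combined = pd.DataFrame({'A': a, 'B': b})
print(combined)
```

Whether such gaps should be dropped or filled depends on the analysis. For return calculations, `combined.dropna()` is one simple option.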

In [15]:

```
data.tail()
```

Out[15]:

Date | SPX | VIX |
---|---|---|
2016-06-14 | 2075.320068 | 20.500000 |
2016-06-15 | 2071.500000 | 20.139999 |
2016-06-16 | 2077.989990 | 19.370001 |
2016-06-17 | 2071.219971 | 19.410000 |
2016-06-20 | 2083.250000 | 18.370001 |

Second, let us **plot the data** into a single diagram.

In [16]:

```
data.plot(figsize=(10, 6), secondary_y='VIX', title='');
```

Third, we **calculate the log returns** for the two time series. This task is accomplished by a single vectorized operation ("no looping").

In [17]:

```
rets = log(data / data.shift(1))
```

In [18]:

```
rets.head()
```

Out[18]:

Date | SPX | VIX |
---|---|---|
2010-01-04 | NaN | NaN |
2010-01-05 | 0.003111 | -0.035038 |
2010-01-06 | 0.000545 | -0.009868 |
2010-01-07 | 0.003993 | -0.005233 |
2010-01-08 | 0.002878 | -0.050024 |
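The first row is `NaN` because there is no previous price to divide by. The mechanics can be seen in a minimal sketch on a few made-up prices. The column name `X` and the values are hypothetical, and `numpy` is imported explicitly here instead of relying on the star imports.

```python
import numpy as np
import pandas as pd

# hypothetical closing prices, for illustration only
prices = pd.DataFrame({'X': [100.0, 102.0, 101.0, 105.0]},
                      index=pd.date_range('2016-01-04', periods=4, freq='B'))

# vectorized: divide every price by the previous day's price, then take logs
rets_shift = np.log(prices / prices.shift(1))

# equivalent formulation: first difference of the log prices
rets_diff = np.log(prices).diff()

print(rets_shift)
```

Log returns are additive over time, so their sum over the sample equals the log of the last price divided by the first.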

Now we can, for instance, **calculate the correlation** between the two time series.

In [19]:

```
rets.corr()
```

Out[19]:

 | SPX | VIX |
---|---|---|
SPX | 1.000000 | -0.828457 |
VIX | -0.828457 | 1.000000 |
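By default, `corr()` returns the Pearson correlation, that is, the covariance divided by the product of the two standard deviations. The following sketch reproduces that number by hand on synthetic data. The data, the seed, and the names `rets_demo`, `X` and `Y` are made up for illustration and have nothing to do with the actual index returns.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)
x = rng.standard_normal(1000)
# hypothetical second series, constructed to move against the first
y = -0.8 * x + 0.6 * rng.standard_normal(1000)
rets_demo = pd.DataFrame({'X': x, 'Y': y})

# sample covariance (n - 1 in the denominator, matching pandas' std default)
dx = rets_demo['X'] - rets_demo['X'].mean()
dy = rets_demo['Y'] - rets_demo['Y'].mean()
cov_xy = (dx * dy).sum() / (len(rets_demo) - 1)

# Pearson correlation: covariance over the product of standard deviations
corr_manual = cov_xy / (rets_demo['X'].std() * rets_demo['Y'].std())
corr_pandas = rets_demo.corr().loc['X', 'Y']

print('manual: %.6f, pandas: %.6f' % (corr_manual, corr_pandas))
```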

"Use a picture. It's worth a thousand words." Tass Flanders (1911)

In [20]:

```
# recent seaborn versions require keyword arguments and use height instead of size
jointplot(x='SPX', y='VIX', data=rets, kind='reg', height=7);
```

2nd edition in the making ...

Saturday here in Singapore at PyCon as well ...

Currently in the planning phase:

Master of Science in Financial Data Science and Computational Finance

*Derivatives Analytics with Python*

Python has become the English of programming languages (for finance).

If you only have time to learn one programming language for finance (thoroughly), learn Python.

http://tpq.io | @dyjh | team@tpq.io

- **Python Quant Platform**: http://quant-platform.com
- **Python for Finance**: Python for Finance @ O'Reilly
- **Derivatives Analytics with Python**: Derivatives Analytics @ Wiley Finance
- **Listed Volatility and Variance Derivatives**: Listed VV Derivatives @ Wiley Finance