
It didn't work.

I occasionally see posts suggesting that you can predict stock returns from company fundamentals with machine learning (one by /u/craino the other day), so I thought I'd give it a try.

Here are the details:

I downloaded the Sharadar SF0 dataset from Quandl http://ift.tt/2iEzZWL

It has 63 indicators on 2000+ companies (they say).

I selected the indicators for each company on 12-31-2015.

I then pulled 12-31-2015 and 12-31-2016 closing prices from the Quandl WIKI EOD dataset and used them to compute each company's percent gain during 2016. http://ift.tt/2BPWlfl
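If you want to reproduce that step, here's a minimal sketch using the quandl Python package (assumptions on my part: you have your own API key, and I'm using adjusted closes; note that 12-31-2016 fell on a Saturday, so the last 2016 trading day was 12-30):

```python
import quandl

quandl.ApiConfig.api_key = "YOUR_API_KEY"  # assumption: your own Quandl key

def pct_gain_2016(ticker):
    """Percent gain from the 12-31-2015 close to the last 2016 close."""
    # WIKI EOD prices; 12-31-2016 was a Saturday, so the window ends
    # on 12-30-2016, the last trading day of the year.
    prices = quandl.get("WIKI/" + ticker,
                        start_date="2015-12-31", end_date="2016-12-30")
    start = prices["Adj. Close"].iloc[0]   # 12-31-2015 close
    end = prices["Adj. Close"].iloc[-1]    # 12-30-2016 close
    return (end / start - 1) * 100

print(pct_gain_2016("AAL"))  # the table below shows 11.49 for AAL
```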

I put together a table that looked like this:

Ticker  2016%Gain  ACCOCI  ASSETS  ASSETSC  ASSETSNC  etc.
AAL     11.49      -4732   4.84    998      384
AAN     43.43      -5170   2.65    NA       NA

The table has 65 columns and 1072 rows. Each row is a different company, with up to 63 indicators; some indicators were not available for particular companies (the NAs above).
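Building that table is basically one pivot and one join. A rough sketch with pandas (the file and column names here are hypothetical):

```python
import pandas as pd

# Hypothetical inputs: one row per (ticker, indicator) from SF0,
# plus the 2016 percent gains computed above.
indicators = pd.read_csv("sf0_2015-12-31.csv")  # columns: Ticker, Indicator, Value
gains = pd.read_csv("gains_2016.csv")           # columns: Ticker, 2016%Gain

# Pivot the long indicator data into one column per indicator;
# companies missing an indicator get NA.
wide = (indicators
        .pivot(index="Ticker", columns="Indicator", values="Value")
        .reset_index())

table = gains.merge(wide, on="Ticker", how="inner")
print(table.shape)  # should come out around (1072, 65)
```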

I tried to train a deep learning network on the data. "Training" basically means: search for some kind of relationship that uses the 63 indicators in the table to predict the 2016%Gain.

Here's how the model training went:

[training graph: training (blue) vs. validation (orange) deviance over the course of training]

The way training works is that you split the 1072 rows into two parts: a training set (blue) and a validation set (orange). You can see from the training graph that the blue training deviance (the difference between the actual 2016%Gain and the model's prediction of it) got substantially smaller, which would suggest better predictive value. However, that had no effect on the validation deviance (orange), which is computed on data the training process never sees. This says the model is just overfitting the training data and has no real predictive value.

I'm definitely not a pro at this, just learning. It's possible that someone could come up with a model that works, but I'll say it isn't easy.

Some technical details for those who are interested:
I used H2O's deep learning model to fit a regression on the 2016%Gain. The training frame was 75% of the data, the validation frame the other 25%. The hidden layer sizes were 200, 100, 50, 25, 10. I trained for 10,000 epochs. The activation function was Rectifier with Dropout, with an input dropout ratio of 0.2 and hidden layer dropout ratios all set to 0.4.
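For the curious, here's roughly what that configuration looks like in H2O's Python API. The file name and response column are assumptions about how the table was saved; the model settings are the ones listed above:

```python
import h2o
from h2o.estimators.deeplearning import H2ODeepLearningEstimator

h2o.init()

# Assumption: the 1072 x 65 table was saved as fundamentals.csv,
# with the response column named "2016%Gain".
data = h2o.import_file("fundamentals.csv")
response = "2016%Gain"
predictors = [c for c in data.columns if c not in ("Ticker", response)]

# 75% training / 25% validation split, as described above.
train, valid = data.split_frame(ratios=[0.75], seed=42)

model = H2ODeepLearningEstimator(
    hidden=[200, 100, 50, 25, 10],      # five hidden layers
    epochs=10000,
    activation="RectifierWithDropout",
    input_dropout_ratio=0.2,
    hidden_dropout_ratios=[0.4] * 5,    # one ratio per hidden layer
)
model.train(x=predictors, y=response,
            training_frame=train, validation_frame=valid)

# Training vs. validation deviance per epoch -- this is what the
# training graph above plots.
print(model.scoring_history()[["epochs", "training_deviance",
                               "validation_deviance"]])
```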



Submitted December 08, 2017 at 12:50PM by ron_leflore http://ift.tt/2BOSGhU
