Python Code -- Regression Training and Testing.
(https://www.youtube.com/watch?v=r4mwkS2T9aI&index=4&list=PLQVvvaa0QuDfKTOs3Keq_kaG2P55YRn5v)
In addition to the video, this post adds some implementation notes.
Initially I tried Python 2.7, but it ended up with errors (discussed below), so using Python 3 is recommended.
Open a Command Prompt window and install the following:
- C:\Python27\Scripts\pip install scikit-learn
- C:\Python27\Scripts\pip install quandl
- C:\Python27\Scripts\pip install scipy
- C:\Python27\Scripts\pip install pyparsing
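Also make sure you have installed the Microsoft Visual C++ Compiler for Python 2.7 (listed at the end of this post).
(But using Python 2.7 is not preferable, as it does not have the timestamp() function that the program relies on.)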
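To make that limitation concrete: datetime.timestamp() was added in Python 3.3, so calling it on Python 2.7 fails with an AttributeError. The following is a minimal sketch, not part of the original program, of an equivalent calculation that runs on both versions. The helper name to_unix_seconds is hypothetical, and the sketch assumes the datetime is naive and should be read as UTC.

```python
import calendar
import datetime


def to_unix_seconds(dt):
    """Return seconds since 1970-01-01 00:00:00 for a naive datetime read as UTC.

    Hypothetical helper: it avoids datetime.timestamp(), which only exists
    in Python 3.3 and later.
    """
    return calendar.timegm(dt.timetuple()) + dt.microsecond / 1e6


one_day = 86400  # seconds in a day, the same step the program in this post uses
start = datetime.datetime(2017, 1, 1)
next_day = start + datetime.timedelta(days=1)

# Consecutive calendar days are exactly one_day seconds apart
print(to_unix_seconds(next_day) - to_unix_seconds(start) == one_day)
```

This is only a workaround for the date arithmetic; the other Python 2.7 issues mentioned in this post still apply.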
It is better to install Anaconda3, as it takes care of installing the libraries, so there is no need to install them manually:
https://www.continuum.io/downloads
After installing it, open a Command Prompt window and
type: conda install quandl (the other packages are already included)
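Before running the program, it can help to confirm that the packages are actually importable. The following is a small sketch using only the standard library; the function name missing_packages is hypothetical, and the list of names is an assumption based on the imports used in the program in this post.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names used by the program in this post. Note that the package
# installed as "scikit-learn" is imported as "sklearn".
required = ["pandas", "numpy", "sklearn", "quandl", "matplotlib"]
print("Missing:", missing_packages(required))
```

If the printed list is empty, every package the program imports is available in the current environment.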
The database used for our analysis is linked below.
It is loaded via quandl.get in the program below.
import math, datetime
import numpy as np
import pandas as pd
import quandl
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn import preprocessing, model_selection
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
from matplotlib import style

style.use('ggplot')

df = quandl.get('WIKI/GOOGL')  # Quandl hosts the dataset used for testing
df = df[['Adj. Open', 'Adj. High', 'Adj. Low', 'Adj. Close', 'Adj. Volume']]

# Derived features, both in percent
df['HL_PCT'] = (df['Adj. High'] - df['Adj. Close']) / df['Adj. Close'] * 100.0
df['PCT_change'] = (df['Adj. Close'] - df['Adj. Open']) / df['Adj. Open'] * 100.0
df = df[['Adj. Close', 'HL_PCT', 'PCT_change', 'Adj. Volume']]

forecast_col = 'Adj. Close'
df.fillna(-99999, inplace=True)  # placeholder value for missing data
forecast_out = int(math.ceil(0.01 * len(df)))  # forecast 1% of the dataset length ahead
# Label of each row = the adjusted close forecast_out rows later
df['label'] = df[forecast_col].shift(-forecast_out)

x = np.array(df.drop(['label'], axis=1))
x = preprocessing.scale(x)
x_lately = x[-forecast_out:]  # most recent rows, which have no label yet
x = x[:-forecast_out]         # rows that do have a label

df.dropna(inplace=True)
y = np.array(df['label'])

x_train, x_test, y_train, y_test = model_selection.train_test_split(x, y, test_size=0.2)
clf = LinearRegression()
clf.fit(x_train, y_train)
accuracy = clf.score(x_test, y_test)  # R^2 score on the held-out 20%
forecast_set = clf.predict(x_lately)
print(forecast_set, accuracy, forecast_out)

df['Forecast'] = np.nan
last_date = df.iloc[-1].name
last_unix = last_date.timestamp()  # requires Python 3
one_day = 86400  # seconds in a day
next_unix = last_unix + one_day

# Append one row per predicted day, with only the Forecast column filled in
for i in forecast_set:
    next_date = datetime.datetime.fromtimestamp(next_unix)
    next_unix += one_day
    df.loc[next_date] = [np.nan for _ in range(len(df.columns) - 1)] + [i]

df['Adj. Close'].plot()
df['Forecast'].plot()
plt.legend(loc=4)
plt.xlabel('Date')
plt.ylabel('Price')
plt.show()
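The labelling step in the program above, shift(-forecast_out), can be hard to picture. Here is a minimal library-free sketch of the idea on made-up prices. The function name make_labels and the sample values are hypothetical, and the ratio 0.2 replaces the program's 0.01 only so that this ten-row toy series has two rows to forecast.

```python
import math


def make_labels(prices, forecast_out):
    """Pair each price with the price forecast_out rows later (None if there is none).

    This mimics what df[forecast_col].shift(-forecast_out) produces,
    with None standing in for NaN.
    """
    n = len(prices)
    return [prices[i + forecast_out] if i + forecast_out < n else None
            for i in range(n)]


prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0]  # made-up values
forecast_out = int(math.ceil(0.2 * len(prices)))

labels = make_labels(prices, forecast_out)

# Rows with a known label are used for training and testing
x = prices[:-forecast_out]
y = labels[:-forecast_out]

# The last forecast_out rows have no label yet; these are the rows to predict
x_lately = prices[-forecast_out:]

print("forecast_out:", forecast_out)
print("labels:", labels)
print("x_lately:", x_lately)
```

The point of the split is that x and y always have the same length, while x_lately holds exactly the most recent rows whose future price is still unknown.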
Links referenced above:
- Microsoft Visual C++ Compiler for Python 2.7
- Database: https://www.quandl.com/data/WIKI/GOOGL-Alphabet-Inc-GOOGL-Prices-Dividends-Splits-and-Trading-Volume