Working with dates is an absolute nightmare. How do I use the index instead of the date column as the x-values so that I can plot a linear regression line? The index is numeric, so I would like to regress the price against the index rather than against the date column.
I get this error message when trying to convert the column of date strings to floats, so I chose to ignore the dates and use the index instead:
ValueError: could not convert string to float: '28/07/2017'
Here is the CSV data:
         Date      Time     Open     High      Low     Last  Volume
0  28/07/2017  00:00:00  1.12670  1.14067  1.12626  1.13833  245861
1  31/07/2017  00:00:00  1.13892  1.14552  1.13356  1.14511  179706
2  01/08/2017  00:00:00  1.14457  1.14514  1.13869  1.13973  162943
Here is the code:
# import libraries
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
import numpy as np
from sklearn.linear_model import LinearRegression
data = pd.read_csv('EURCHF_Daily.csv') # load data set
X = data.iloc[:, 0].values.reshape(-1, 1) # .values converts the column into a numpy array
X = X.astype(float) # raises the ValueError above: column 0 holds date strings
Y = data.iloc[:, 1].values.reshape(-1, 1) # -1 lets numpy infer the number of rows; 1 is the number of columns
linear_regressor = LinearRegression() # create object for the class
linear_regressor.fit(X, Y) # perform linear regression
Y_pred = linear_regressor.predict(X) # make predictions
plt.scatter(X, Y)
plt.plot(X, Y_pred, color='red')
plt.show()
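Try the following. I am assuming you want Y to be the Open column:
from sklearn.linear_model import LinearRegression
X = data.index.to_numpy().reshape(-1, 1) # row index (0, 1, 2, ...) as a single-column 2-D array
Y = data['Open'].to_numpy().reshape(-1, 1) # same column as data.iloc[:, 2] when Date is the first column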
linear_regressor = LinearRegression() # create object for the class
linear_regressor.fit(X, Y)
Y_pred = linear_regressor.predict(X)
plt.scatter(X, Y)
plt.plot(X, Y_pred, color='red')
plt.show()
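If you would rather not ignore the dates, here is a minimal sketch of both options side by side: regressing Open against the row index, and against days elapsed since the first row. It is an illustration under a few assumptions, not the only way to do it: the three sample rows from the question are embedded as whitespace-separated text (the real file may use a different delimiter), the dates are assumed to be day-first (%d/%m/%Y), and np.polyfit is swapped in for LinearRegression only to keep the example short. Names such as CSV_TEXT, x_index and x_days are made up for this example.

```python
import io

import numpy as np
import pandas as pd

# Sample rows copied from the question; whitespace-separated here for illustration only.
CSV_TEXT = """Date Time Open High Low Last Volume
28/07/2017 00:00:00 1.12670 1.14067 1.12626 1.13833 245861
31/07/2017 00:00:00 1.13892 1.14552 1.13356 1.14511 179706
01/08/2017 00:00:00 1.14457 1.14514 1.13869 1.13973 162943
"""

data = pd.read_csv(io.StringIO(CSV_TEXT), sep=r"\s+")
y = data["Open"].to_numpy(dtype=float)

# Option 1: use the row index (0, 1, 2, ...) as x, so no date parsing is needed.
x_index = data.index.to_numpy(dtype=float)
slope, intercept = np.polyfit(x_index, y, deg=1)  # degree-1 fit = straight line
y_pred = slope * x_index + intercept

# Option 2: parse the day-first date strings and use days elapsed as x.
# Unlike the index, this accounts for gaps such as weekends.
dates = pd.to_datetime(data["Date"], format="%d/%m/%Y")
x_days = (dates - dates.iloc[0]).dt.days.to_numpy(dtype=float)
slope_per_day, intercept_days = np.polyfit(x_days, y, deg=1)
y_pred_days = slope_per_day * x_days + intercept_days
```

To plot either fit, pass the matching pair to the same calls as above, for example plt.scatter(x_index, y) followed by plt.plot(x_index, y_pred, color='red'). The ValueError in the question comes from calling astype(float) on strings like '28/07/2017'; converting them with pd.to_datetime first, as in option 2, avoids it.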