I have two dataframes: df = pd.DataFrame([{'A': -4, 'B': -3, 'C': -2, 'D': -1, 'E': 2, 'F': 4, 'G': 8, 'H': 6, 'I': -2}]) df2 looks like this (just a cutout; i
I'm new to R and trying to isolate the best performing features from a data set of 247 columns (246 variables + 1 outcome), and 800 or so rows (where each row i
I want to show why a specific model is not appropriate, given a dataset with 6 variables (they are chr variables). The model is y = abc*(x1+x2); a and b from the data
I have month data (Jan, Feb, Mar, etc.) in my dataset and I am generating dummy variables using the pandas library: pd.get_dummies(df['month'], drop_first=True). I want
After performing a regression, you get the residuals and the fitted values for the dependent variable. Plotting them can yield insights into violations of OL
I am trying to get predictions from a multiple-variable model (it's eplt), made up of 7 scores and one final exam score, moy_exam2. I want to predict the latter usi
I'm a beginner with ML and have been following the Coursera intro syllabus. I am trying to implement the exercises using TensorFlow rather than Octave. I have t
I am trying to run a Fama-MacBeth analysis in R, where I am using the 'pmg' function with the following code: Fpmg1 <- pmg(ret ~ HML_OBS + SMB + Mktrf + HML,
I am trying to build the following model but am getting this error when I am finally training the model and trying to get its accuracy. It gets stuck when I am
Hey there, I want to explore the effect of Age and Gender on the points of a test via MLR. Yet when I type model <- lm(punkte ~ Age + Gender, data = df), R gives m
Can someone with expertise explain how the following vectorized form of multiple linear regression is derived from the given independent-variable matrix with int
#This is my model linearMod <- lm( Housing_Training$SalePrice ~ Housing_Training$MSSubClass + Housing_Training$LotFrontage + Housing_Training$LotArea + Hous
I have an expression which does the same calculation. When I try to do the whole calculation in a single expression and store it in variable "a", the expression
The OLSResults of df2 = pd.read_csv("MultipleRegression.csv") X = df2[['Distance', 'CarrierNum', 'Day', 'DayOfBooking']] Y = df2['Price'] X = add_constant(X) f
Just as we use the Normal Equation to find the optimal theta value in Linear Regression, can we use a similar formula for Logistic Regression? If n
If we want to search for the optimal parameters theta for a linear regression model by using the normal equation with: theta = inv(X^T * X) * X^T * y one step
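
The formula quoted in this question can be sketched in NumPy as follows. This is a minimal illustration, not the asker's code: the function name `normal_equation` and the toy data are hypothetical, and the linear system is solved directly rather than forming `inv(X^T * X)` explicitly, which gives the same theta when X^T X is invertible and is usually numerically safer.

```python
import numpy as np

def normal_equation(X, y):
    """Return theta solving (X^T X) theta = X^T y (hypothetical helper)."""
    # Equivalent to inv(X.T @ X) @ X.T @ y when X.T @ X is invertible.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Hypothetical toy data generated from y = 1 + 2*x with no noise.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # first column is the intercept term
y = 1.0 + 2.0 * x

theta = normal_equation(X, y)
print(theta)
```

Because the toy data is noise-free, the recovered theta should match the intercept and slope used to generate it, up to floating-point error.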
I tried to make a linear regression with the lm function, but the output is NA for every independent variable. The dataframe is numeric. I have already tried t
With the following code, I get a plot of how the regression was fitted to my data. The plot also shows vertical (error?) bars. To which number in the sum
I know there is a small difference between $sigma and the concept of root mean squared error. So, I am wondering what is the easiest way to obtain RMSE out of l
I want to use predict() with a polr() model to predict variable z, as per the following code. The first is the df to train the model, followed by the test da
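
The difference mentioned in this question can be shown with a small sketch. It is written in plain Python rather than R, and the function name and data are hypothetical. It assumes a simple regression with an intercept and one slope (p = 2 estimated coefficients): the residual standard error that R reports as `$sigma` divides the residual sum of squares by n - p, while RMSE divides it by n.

```python
import math

def sigma_and_rmse(x, y):
    """Fit y = a + b*x by least squares; return (sigma, rmse). Hypothetical helper."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares of the fitted line.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    p = 2  # estimated coefficients: intercept and slope
    sigma = math.sqrt(rss / (n - p))  # residual standard error
    rmse = math.sqrt(rss / n)         # root mean squared error of the residuals
    return sigma, rmse

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
sigma, rmse = sigma_and_rmse(x, y)
print(sigma, rmse)
```

Under these assumptions the two quantities are related by rmse = sigma * sqrt((n - p) / n), so RMSE is always slightly smaller than sigma for the same fit.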