Calculating Standard Error of Coefficients for Logistic Regression in Spark

user2129946

I know this question has been asked here before, but I couldn't find a correct answer. The answer to the previous post suggests using Statistics.chiSqTest(data), which performs Pearson's chi-square goodness-of-fit test, not the Wald chi-square test for the significance of the coefficients.

I am trying to build the parameter estimate table for a logistic regression in Spark. I was able to get the coefficients and the intercept, but I couldn't find a Spark API that returns the standard errors of the coefficients. The linear regression model summary exposes the coefficient standard errors, but the logistic regression model summary does not. Part of my sample code is as follows.

import org.apache.spark.ml.classification.{BinaryLogisticRegressionSummary, LogisticRegression}

val lr = new LogisticRegression()
  .setMaxIter(10)
  .setRegParam(0.3)
  .setElasticNetParam(0.8)

// Fit the model
val lrModel = lr.fit(training) // Assuming training is my training dataset

val trainingSummary = lrModel.summary
val binarySummary = trainingSummary.asInstanceOf[BinaryLogisticRegressionSummary] // provides the summary information of the fitted model

Is there any way to calculate the standard errors of the coefficients, or to get the variance-covariance matrix of the coefficients, from which the standard errors can be derived?

Jeremy

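You need to use GeneralizedLinearRegression (the GLM API) with the binomial family and the logit link instead of LogisticRegression. Its training summary exposes the coefficient standard errors.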

https://spark.apache.org/docs/2.1.1/ml-classification-regression.html#generalized-linear-regression
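A minimal sketch of that approach, written in spark-shell style. The toy `training` DataFrame and the labels `x1`, `x2` are made up for illustration; substitute your own dataset with `label` and `features` columns. The penalty is left at zero here on the assumption that you want plain maximum-likelihood estimates, since standard errors are not meaningful for the elastic-net fit in the question.

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.GeneralizedLinearRegression
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("glm-standard-errors")
  .getOrCreate()
import spark.implicits._

// Hypothetical toy data (not linearly separable, so the estimates stay finite).
val training = Seq(
  (0.0, Vectors.dense(0.5, 1.2)), (0.0, Vectors.dense(1.0, 0.7)),
  (0.0, Vectors.dense(1.5, 2.1)), (0.0, Vectors.dense(2.0, 0.3)),
  (0.0, Vectors.dense(2.5, 1.8)), (0.0, Vectors.dense(3.5, 1.0)),
  (1.0, Vectors.dense(1.2, 1.5)), (1.0, Vectors.dense(2.2, 0.9)),
  (1.0, Vectors.dense(3.0, 2.2)), (1.0, Vectors.dense(3.2, 0.4)),
  (1.0, Vectors.dense(4.0, 1.6)), (1.0, Vectors.dense(4.5, 1.1))
).toDF("label", "features")

// Binomial family + logit link is a logistic regression fitted by IRLS.
val glr = new GeneralizedLinearRegression()
  .setFamily("binomial")
  .setLink("logit")
  .setMaxIter(25)
  .setRegParam(0.0) // no penalty

val glrModel = glr.fit(training)
val summary = glrModel.summary

// With fitIntercept = true (the default), the last element of each
// summary array refers to the intercept.
val names = Seq("x1", "x2", "(Intercept)")
val estimates = glrModel.coefficients.toArray :+ glrModel.intercept
val stdErrors = summary.coefficientStandardErrors
val zValues = summary.tValues // estimate / standard error
val pValues = summary.pValues // two-sided p-values

// Print the parameter estimate table.
names.indices.foreach { i =>
  println(f"${names(i)}%-12s ${estimates(i)}%10.4f ${stdErrors(i)}%10.4f " +
    f"${zValues(i)}%8.3f ${pValues(i)}%8.4f")
}
```

Squaring each z value gives the Wald chi-square statistic with one degree of freedom. Note that the Spark documentation states GeneralizedLinearRegression currently supports at most 4096 features, so this route may not fit very wide models.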

