I am building a DataFrame row by row and then running a regression on it. For simplicity, the code is:
using DataFrames
using GLM
df = DataFrame(response = Number[])
for i in 1:10
df = vcat(df, DataFrame(response = rand()))
end
fit(LinearModel, @formula(response ~ 1), df)
I get the error:
ERROR: LoadError: MethodError: Cannot `convert` an object of type Array{Number,1} to an object of type GLM.LmResp
This may have arisen from a call to the constructor GLM.LmResp(...),
since type constructors fall back to convert methods.
Stacktrace:
[1] fit(::Type{GLM.LinearModel}, ::Array{Float64,2}, ::Array{Number,1}) at ~/.julia/v0.6/GLM/src/lm.jl:140
[2] #fit#44(::Dict{Any,Any}, ::Array{Any,1}, ::Function, ::Type{GLM.LinearModel}, ::StatsModels.Formula, ::DataFrames.DataFrame) at ~/.julia/v0.6/StatsModels/src/statsmodel.jl:72
[3] fit(::Type{GLM.LinearModel}, ::StatsModels.Formula, ::DataFrames.DataFrame) at ~/.julia/v0.6/StatsModels/src/statsmodel.jl:66
[4] include_from_node1(::String) at ./loading.jl:576
[5] include(::String) at ./sysimg.jl:14
while loading ~/test.jl, in expression starting on line 10
The call to the linear regression is very similar to the one in "Introducing Julia":
linearmodel = fit(LinearModel, @formula(Y1 ~ X1), anscombe)
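What is the problem?
After a few hours, I realized that GLM requires a concrete element type, and Number is an abstract type. The stack trace shows the response being passed to fit as Array{Number,1}, which GLM.LmResp cannot accept. (The documentation of GLM.LmResp, at the time of writing, says almost nothing about this, only "Encapsulates the response for a linear model".) The solution is to change the declaration to a concrete type, such as Float64: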
using DataFrames
using GLM
df = DataFrame(response = Float64[])
for i in 1:10
df = vcat(df, DataFrame(response = rand()))
end
fit(LinearModel, @formula(response ~ 1), df)
Output:
StatsModels.DataFrameRegressionModel{GLM.LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,Base.LinAlg.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}
Formula: response ~ +1
Coefficients:
             Estimate Std.Error t value Pr(>|t|)
(Intercept)  0.408856 0.0969961 4.21518   0.0023
The element type must be concrete. Another abstract type such as Real, as in df = DataFrame(response = Real[]), also fails, but with a more helpful error message:
ERROR: LoadError: `float` not defined on abstractly-typed arrays; please convert to a more specific type
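A quick way to tell whether a column will run into this is to check whether its element type is concrete. The following is a minimal sketch using only Base and plain vectors in place of DataFrame columns. It assumes Julia 1.x, where the check is called isconcretetype; on the Julia 0.6 setup shown in the stack trace the equivalent was isleaftype. The variable names are illustrative only.

```julia
# Assumes Julia 1.x; on Julia 0.6 the equivalent check was `isleaftype`.
abstract_col = Number[0.1, 0.2, 0.3]    # element type Number (abstract)
concrete_col = Float64[0.1, 0.2, 0.3]   # element type Float64 (concrete)

# An abstract element type is what triggers the errors shown above.
println(isconcretetype(eltype(abstract_col)))   # false
println(isconcretetype(eltype(concrete_col)))   # true
println(isconcretetype(Real))                   # false
```

The same check can be applied to a data frame column by passing the column vector to eltype.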
Alternatively, you can convert the column with Real after building the data frame:
using DataFrames
using GLM
df = DataFrame(response = Number[])
for i in 1:10
df = vcat(df, DataFrame(response = rand()))
end
df2 = DataFrame(response = map(Real, df[:response]))
fit(LinearModel, @formula(response ~ 1), df2)
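This works because every element is already a Float64: mapping Real over the column leaves the values unchanged, and map narrows the element type of the resulting array to Float64: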
julia> typeof(df2[:response])
Array{Float64,1}
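The narrowing depends on all elements sharing one concrete type. The sketch below, which assumes Julia 1.x and uses plain vectors rather than DataFrame columns, compares map(Real, ...) with an explicit convert to Vector{Float64}. The variable names are hypothetical, and the explicit conversion is offered as a possible alternative rather than as what the original answer used.

```julia
# Assumes Julia 1.x; plain vectors stand in for a DataFrame column.
v = Number[0.1, 0.2, 0.3]                 # abstractly typed, all values Float64

narrowed = map(Real, v)                   # Real(x) returns each Float64 unchanged
explicit = convert(Vector{Float64}, v)    # states the target type explicitly

println(eltype(narrowed))   # Float64
println(eltype(explicit))   # Float64

# With elements of different concrete types, map does not narrow to Float64:
mixed = map(Real, Number[1, 2.5])
println(isconcretetype(eltype(mixed)))    # false
```

If the column might contain a mix of integers and floats, the explicit conversion is the safer choice, since it does not rely on narrowing.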