Linear regression prediction in Python without the sklearn library: why doesn't my custom linear regression model match sklearn?

Published: 2024/7/23 · python · 豆豆

I am trying to build a simple linear model in Python without using any libraries (other than numpy). This is what I have:

import numpy as np
import pandas

np.random.seed(1)
alpha = 0.1  # learning rate

def h(x, w):
    # Hypothesis: dot product of the weights and one feature row
    return np.dot(w.T, x)

def cost(x, w, y):
    # Half the sum of squared errors over the 47 rows
    totalcost = 0
    for i in range(47):
        diff = h(x[i], w) - y[i]
        squared = diff * diff
        totalcost += squared
    return totalcost / 2

housing_data = np.loadtxt('housing.csv', delimiter=',')
x1 = housing_data[:, 0]
x2 = housing_data[:, 1]
y = housing_data[:, 2]

# Standardize each feature to zero mean and unit variance
avgx1 = np.mean(x1)
stdx1 = np.std(x1)
normx1 = (x1 - avgx1) / stdx1
print('avgx1', avgx1)
print('stdx1', stdx1)

avgx2 = np.mean(x2)
stdx2 = np.std(x2)
normx2 = (x2 - avgx2) / stdx2
print('avgx2', avgx2)
print('stdx2', stdx2)

# Design matrix: a bias column of ones plus the two standardized features
normalizedx = np.ones((47, 3))
normalizedx[:, 1] = normx1
normalizedx[:, 2] = normx2
np.savetxt('normalizedx.csv', normalizedx)

weights = np.ones((3,))

for boom in range(100):
    currentcost = cost(normalizedx, weights, y)
    if boom % 1 == 0:
        print(boom, 'iteration', weights[0], weights[1], weights[2])
        print('cost', currentcost)

    # The weights are updated after every single row
    for i in range(47):
        errordiff = h(normalizedx[i], weights) - y[i]
        weights[0] = weights[0] - alpha * (errordiff) * normalizedx[i][0]
        weights[1] = weights[1] - alpha * (errordiff) * normalizedx[i][1]
        weights[2] = weights[2] - alpha * (errordiff) * normalizedx[i][2]

print(weights)

# Predict for an input of 2100 and 3, normalized the same way as the training data
predictedx = [1, (2100 - avgx1) / stdx1, (3 - avgx2) / stdx2]
firstprediction = np.array(predictedx)
print('firstprediction', firstprediction)
firstprediction = h(firstprediction, weights)
print(firstprediction)

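First, it converges very quickly, after only 14 iterations. Second, it gives a different result from sklearn's linear regression. For reference, my sklearn code is: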

import numpy
import matplotlib.pyplot as plot
import pandas
import sklearn.preprocessing  # makes sklearn.preprocessing.scale and StandardScaler available
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

dataset = pandas.read_csv('housing.csv', header=None)
x = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 2].values

linearregressor = LinearRegression()

# Standardize the features and keep the mean and std for the prediction input
xnorm = sklearn.preprocessing.scale(x)
scalecoef = sklearn.preprocessing.StandardScaler().fit(x)
mean = scalecoef.mean_
std = numpy.sqrt(scalecoef.var_)
print('stf')
print(std)

stuff = linearregressor.fit(xnorm, y)

predictedx = [[(2100 - mean[0]) / std[0], (3 - mean[1]) / std[1]]]
yprediction = linearregressor.predict(predictedx)
print('predictedx', predictedx)
print('predict', yprediction)
print(stuff.coef_, stuff.intercept_)
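Both scripts standardize the features before fitting, so one thing that can be ruled out separately is a mismatch in the normalization itself. The sketch below is only an illustration on made-up feature values (the `features` array is hypothetical, not taken from housing.csv). It shows that `np.std` defaults to the population standard deviation (`ddof=0`), which the scikit-learn documentation describes as the same biased estimate that `StandardScaler` uses.

```python
import numpy as np

# Hypothetical feature values (two columns), used only for illustration.
features = np.array([
    [1000.0, 1.0],
    [1500.0, 2.0],
    [2000.0, 3.0],
    [2500.0, 4.0],
    [3000.0, 5.0],
])

mean = features.mean(axis=0)
std_population = features.std(axis=0)        # ddof=0, NumPy's default
std_sample = features.std(axis=0, ddof=1)    # ddof=1, shown only for comparison

# Same z-score formula as the custom script: (x - mean) / std
normalized = (features - mean) / std_population

print('column means after scaling', normalized.mean(axis=0))
print('column stds after scaling', normalized.std(axis=0))
print('population std', std_population, 'sample std', std_sample)
```

If the means and standard deviations printed by the two scripts agree, the normalization step is probably not where the predictions diverge.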

My custom model predicts a y value of 337,000, while sklearn predicts 355,000. My data has 47 rows and looks like this:

2104,3,3.999e+05
1600,3.299e+05
2400,3.69e+05
1416,2,2.32e+05
3000,4,5.399e+05
1985,2.999e+05
1534,3.149e+05

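I assume that either (a) my gradient descent regression is wrong in some way, or (b) I am not using sklearn correctly.

Is there any other reason why the two would not predict the same output for a given input?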
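One way to probe hypothesis (a) is to compare a batch gradient descent, which averages the gradient over all rows before each weight update, against a closed-form least-squares fit. The sketch below is a minimal illustration under stated assumptions: the data is synthetic (housing.csv is not available here), and `batch_gradient_descent`, the generating weights and the noise level are all hypothetical choices. The custom script above instead updates the weights after every row, and with a constant `alpha` that style of update may not settle exactly at the least-squares minimum.

```python
import numpy as np

def batch_gradient_descent(X, y, alpha=0.1, iterations=2000):
    """Fit weights by averaging the gradient over all rows before each update."""
    m, n = X.shape
    w = np.zeros(n)
    for _ in range(iterations):
        errors = X @ w - y            # residual of every row, computed with the same w
        gradient = X.T @ errors / m   # mean gradient of the squared-error cost
        w = w - alpha * gradient      # update all weights simultaneously
    return w

# Hypothetical synthetic data standing in for housing.csv (47 rows, 2 features).
rng = np.random.default_rng(0)
raw = rng.normal(size=(47, 2))
features = (raw - raw.mean(axis=0)) / raw.std(axis=0)   # standardized features
X = np.column_stack([np.ones(47), features])            # bias column plus features
true_w = np.array([340000.0, 110000.0, -6000.0])        # made-up generating weights
y = X @ true_w + rng.normal(scale=20000.0, size=47)     # made-up noise level

w_gd = batch_gradient_descent(X, y)
w_exact, *_ = np.linalg.lstsq(X, y, rcond=None)         # closed-form reference

print('batch gradient descent', w_gd)
print('closed-form least squares', w_exact)
```

If a batch version like this agrees with sklearn on the real data while the per-row loop does not, the update rule is the more likely source of the gap than the sklearn usage.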
