Feature scaling in Gradient descent with single feature

I am writing code for linear regression in which my model predicts the price of a house from its area. So I have only one feature, the area of the house, and my output is the price. The area is in the range 1000 - 9000 and the prices are in the range 280000 - 800000. How should I perform feature scaling, and how should I handle the output? Specifically, should I bring both the house area and the house price into the range 0 - 1 and then find theta1 and theta2 (I am using the linear regression equation output = theta1 + theta2*input), or should I scale the house prices down to the range 1000 - 9000?

When I apply feature scaling that brings both the input and the output into the range 0 - 1, my model does not give the right answers. I can see that there is a mistake, but I am not able to correct it. Please tell me how I should proceed.

Tags: machine-learning, linear-regression, gradient-descent

asked Mar 24 at 17:10 by K. Bakshi (edited Mar 25 at 2:47)

  • I am using Python.

    – K. Bakshi
    Mar 25 at 2:46

1 Answer

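from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import normalize
import numpy as np

# Synthetic data: area in [1000, 9000), price roughly 100 * area plus noise
X = np.random.randint(1000, 9000, (10,))
Y = np.random.randint(100, 200, (10,)) + 100 * X

# Fit on the raw (unscaled) data
reg = LinearRegression().fit(X.reshape(-1, 1), Y)
print(reg.score(X.reshape(-1, 1), Y))
print(reg.coef_)
# Output from one run (the data is random, so the values vary):
# 0.9999999822251018
# [100.00518473]

# normalize(..., axis=0) divides each column by its L2 norm
# (note: this is not min-max scaling to the range 0 - 1)
X1 = normalize(X.reshape(-1, 1), axis=0)
Y1 = normalize(Y.reshape(-1, 1), axis=0)
reg = LinearRegression().fit(X1, Y1)
print(reg.score(X1, Y1))
print(reg.coef_)
# Output from the same run:
# 0.9999999822251019
# [[0.99982554]]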


This is just ordinary linear regression on non-normalized and normalized data. The quality of the fit is the same in both cases (the score is identical); only the scale of the coefficient changes. Only the title of your question mentions "Gradient Descent". If you use the gradient descent method, the weights will be adjusted automatically.
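
For the 0 - 1 scaling described in the question, the relation between the scaled and unscaled thetas can be worked out by undoing the scaling on both sides of the equation. The following is only a minimal sketch under stated assumptions: the data points are made up (built from price = 215000 + 65 * area so that they span the ranges in the question), the helper names min_max, fit_line and predict are hypothetical, and plain Python is used so that it runs without extra packages.

```python
# Min-max scale both area and price, fit in the scaled space, then map back.
# All data values below are made up for illustration.

def min_max(values):
    """Scale values to the range 0 - 1; also return the minimum and the range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values], lo, hi - lo


def fit_line(xs, ys):
    """Closed-form least squares for y = theta1 + theta2 * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    theta2 = sxy / sxx
    theta1 = mean_y - theta2 * mean_x
    return theta1, theta2


area = [1000.0, 2500.0, 4000.0, 6000.0, 9000.0]
price = [280000.0, 377500.0, 475000.0, 605000.0, 800000.0]

area_s, area_min, area_range = min_max(area)
price_s, price_min, price_range = min_max(price)

# Thetas of the model fitted on the scaled data
t1_s, t2_s = fit_line(area_s, price_s)

# Undo the scaling to express the same line in original units
theta2 = t2_s * price_range / area_range
theta1 = price_min + t1_s * price_range - theta2 * area_min


def predict(new_area):
    """Scale the input, predict in the scaled space, then unscale the output."""
    x_s = (new_area - area_min) / area_range
    return price_min + (t1_s + t2_s * x_s) * price_range


print(theta1, theta2)
print(predict(2000.0))
```

The point of the sketch is that the thetas found on scaled data are not wrong, they are simply expressed in scaled units. A new input has to be scaled with the same minimum and range as the training data, and the prediction has to be unscaled with the price minimum and range before it is compared with real prices.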



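The normal SGD update is:
w = w - alpha * delta



where alpha is the learning rate and delta is the gradient of the loss with respect to w. The weights are adjusted automatically as training goes on, so there is no difference between the two cases in the solution that is reached. The only difference is that the normalized version deals with numbers less than 1, so the computation is easier and a larger learning rate can be used.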
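
As an illustration of the update rule above, here is a minimal sketch of batch gradient descent on min-max scaled data. It is written under assumptions that are not part of the original answer: the data points are made up, the loss is taken to be the mean squared error divided by 2, and the learning rate alpha = 0.5 and the 5000 iterations are arbitrary choices that happen to work for inputs in the range 0 - 1.

```python
# Batch gradient descent for price_s = theta1 + theta2 * area_s on scaled data.
# Data, learning rate and iteration count are illustrative assumptions.

area = [1000.0, 2500.0, 4000.0, 6000.0, 9000.0]
price = [280000.0, 377500.0, 475000.0, 605000.0, 800000.0]


def min_max(values):
    """Scale values to the range 0 - 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


xs, ys = min_max(area), min_max(price)
n = len(xs)

theta1, theta2 = 0.0, 0.0
alpha = 0.5  # learning rate; workable here because the inputs lie in 0 - 1

for _ in range(5000):
    # Prediction error for every sample with the current thetas
    errors = [theta1 + theta2 * x - y for x, y in zip(xs, ys)]
    # Gradient of (1 / (2 * n)) * sum(error ** 2) with respect to each theta
    delta1 = sum(errors) / n
    delta2 = sum(e * x for e, x in zip(errors, xs)) / n
    # The update rule w = w - alpha * delta, applied to both weights
    theta1 -= alpha * delta1
    theta2 -= alpha * delta2

print(theta1, theta2)
```

On this made-up data the scaled prices equal the scaled areas, so the loop settles near theta1 = 0 and theta2 = 1. Running the same loop on the unscaled areas would need a much smaller alpha to keep the updates from diverging, which is the practical reason to scale before using gradient descent.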








answered Mar 25 at 13:23 by Justice_Lords

  • Yes, but how should I scale the input and the output? For example, my input is 2000 and the corresponding output is 200000. If I scale both values down to between 0 and 1, how will it work? The values of theta1 and theta2 will not be correct. My question is about the relation between the scaling of the input values and the scaling of the output values.

    – K. Bakshi
    Mar 27 at 5:22

  • theta2 can also be 100, so that theta1 + x*theta2 = y; theta1 and theta2 can take any value in (-inf, inf).

    – Justice_Lords
    Mar 27 at 6:50