Given a person’s attributes (age, sex, BMI, smoker status, etc.), we have to predict their insurance cost.

- **Age** is a real number
- **Sex** is a binary variable (male or female)
- **bmi** (body mass index) is a real number
- **children** is the number of children a person has
- **smoker** is a binary variable
- **region** is a class variable
- **charges** is the dependent variable (y)
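Before a regression model can use these attributes, the binary and class variables have to be turned into numbers. The following is a minimal sketch of one common way to do that. The record values, the `encode` helper, and the list of region names are assumptions made for illustration, not taken from the dataset description above.

```python
# Hypothetical record; every value here is made up for illustration.
record = {
    "age": 31.0,
    "sex": "female",
    "bmi": 27.4,
    "children": 2,
    "smoker": "no",
    "region": "southwest",
}

# Assumed category names; the article does not list the regions.
REGIONS = ["northeast", "northwest", "southeast", "southwest"]


def encode(rec, regions=REGIONS):
    """Turn one record into a list of numeric features."""
    features = [
        float(rec["age"]),
        1.0 if rec["sex"] == "male" else 0.0,    # binary variable -> 0/1
        float(rec["bmi"]),
        float(rec["children"]),
        1.0 if rec["smoker"] == "yes" else 0.0,  # binary variable -> 0/1
    ]
    # One-hot encode the class variable. The first category is dropped
    # because it is implied when all the other indicators are 0.
    features += [1.0 if rec["region"] == r else 0.0 for r in regions[1:]]
    return features


x = encode(record)
print(x)
```

The target **charges** stays as it is, since it is already a continuous number.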

The prerequisite for this article is Andrew Ng’s linear regression lecture.

**Click here** to check out the course.

- Given a person’s years of experience, we have to predict their salary. Since the dependent variable, salary, is continuous, this is a regression problem.
- We will use three methods to estimate the regression parameters: gradient descent (an optimization method), a statistical method (a formula), and scikit-learn’s linear regression.
- After that, we will compare the parameters learned by all three methods.
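As a rough illustration of the comparison described above, here is a minimal sketch that fits a line with the formula and with gradient descent and then checks that the two agree. The data, the function names, the learning rate, and the number of iterations are all assumptions chosen for this example. The scikit-learn step is left out to keep the sketch free of dependencies; fitting `sklearn.linear_model.LinearRegression` on the same data should give matching parameters.

```python
# Made-up data for illustration: years of experience -> salary (in thousands).
# The points lie exactly on salary = 30 + 5 * years.
years = [1.0, 2.0, 3.0, 4.0, 5.0]
salary = [35.0, 40.0, 45.0, 50.0, 55.0]


def fit_formula(x, y):
    """Least-squares estimates of intercept and slope from the formula."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def fit_gradient_descent(x, y, learning_rate=0.05, iterations=5000):
    """Minimise the mean squared error cost by batch gradient descent."""
    n = len(x)
    intercept, slope = 0.0, 0.0
    for _ in range(iterations):
        errors = [intercept + slope * xi - yi for xi, yi in zip(x, y)]
        grad_intercept = sum(errors) / n
        grad_slope = sum(e * xi for e, xi in zip(errors, x)) / n
        intercept -= learning_rate * grad_intercept
        slope -= learning_rate * grad_slope
    return intercept, slope


formula_params = fit_formula(years, salary)
gd_params = fit_gradient_descent(years, salary)
print("formula:         ", formula_params)
print("gradient descent:", gd_params)
```

On real data the gradient descent estimates only approach the formula's values, and how close they get depends on the learning rate and the number of iterations.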

Let’s say you have a dataset in which one column has a **numeric** data type and contains 1000 data points (rows). Going through every single data point is hard and time-consuming. To overcome this problem, we use descriptive statistics, which describe the data and make the task much simpler. We also use visualizations such as histograms and boxplots to understand the distribution of the data.

Rather than having to understand 1000 rows, a summary statistic is a single number that gives an idea of the whole data.
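To make this concrete, here is a small sketch that reduces a numeric column to a handful of summary numbers using Python's built-in `statistics` module. The `values` list is made up for illustration and is much shorter than the 1000-row column described above.

```python
import statistics

# Made-up numeric column; a real one would have many more rows.
values = [12, 15, 15, 18, 21, 24, 30, 95]

summary = {
    "count": len(values),
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "stdev": statistics.stdev(values),  # sample standard deviation
    "min": min(values),
    "max": max(values),
}

# Quartiles are the cut points a boxplot draws (Python 3.8+).
q1, q2, q3 = statistics.quantiles(values, n=4)

for name, number in summary.items():
    print(f"{name}: {number}")
print("quartiles:", q1, q2, q3)
```

Note how the single large value pulls the mean well above the median. That is the kind of pattern a histogram or boxplot would also reveal.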

There are basically two types of summarizing techniques.

…