Department: School of Data Science

Course Code

DATA130005.01

Course Title

Statistics: Principles, Methods and R (I)

Credit

3

Credit Hours

48

Course Nature

□ Specific General Education Courses □ Core Courses □ General Education Elective Courses □ Basic Courses in General Discipline ☑ Professional Compulsory Courses □ Professional Elective Courses □ Others

Course Objectives

The course covers the fundamental principles and methods of probability and statistics. Data illustration with the statistical package R is an integral part of the course and provides hands-on experience in simulation and data analysis.

Course Description

The topics covered in this course include: introduction to R, probability, independence, conditional probability, Bayes' formula, random variables and distributions, moment generating functions, probability inequalities, the law of large numbers, the central limit theorem, point estimation, maximum likelihood estimation, Fisher information, asymptotic efficiency, hypothesis testing, the Wald test, t-tests, likelihood ratio tests, permutation tests, confidence intervals, and linear regression models.

Course Requirements:

Probability Theory, Linear Algebra

Teaching Methods:

The course is taught mostly through conventional lectures, combined with data-analytic studies in R. Homework assignments will be given to help students review the material and apply their newly acquired knowledge and tools to real-world data problems.

Instructor's Academic Background:

Fengnan Gao is an assistant professor jointly appointed by the School of Data Science and the Shanghai Center for Mathematical Sciences. He completed his PhD under Aad van der Vaart at Leiden University in 2016 and joined Fudan afterwards. He has published papers in the Electronic Journal of Statistics and Stochastic Processes and their Applications. He has presented his work in Amsterdam, North Carolina, Frejus, Cambridge and Eindhoven.


Members of Teaching Team

Name

Gender

Professional Title

Department

Responsibility

Fengnan Gao

Male

Assistant Professor

School of Data Science and Shanghai Center for Mathematical Sciences

Main instructor
















Course Schedule (Please supply the details about each lesson with 48 academic hours in a total of 16 weeks):


Week 1---- Introduction to the course and R

  1. What is R? Installing R, help and documentation

  2. Data objects, data import and export, basic data manipulation

  3. Computing with data, organizing an analysis

Week 2---- Probability and Random variables

  1. Sample space and events, probability, independent events

  2. Conditional probability, Bayes’ formula

  3. Distribution functions and probability functions, mean and variance, moment generating functions

Week 3---- Distributions and Multivariate distributions

  1. Discrete random variables, continuous random variables

  2. Bivariate distributions, marginal distributions, independent random variables, conditional distributions

  3. Multivariate distributions, IID samples, transformations of random variables

Week 4---- Inequalities and Convergence of random variables

  1. Probability inequalities

  2. Inequalities for expectations

  3. Types of convergence

Week 5---- Limit Theorems and Monte Carlo Methods

  1. Law of Large Numbers (LLN) and Central Limit Theorem (CLT)

  2. Monte Carlo integrals

  3. Importance sampling

Week 6---- Introduction to Statistical Inference I

  1. Parametric models

  2. Nonparametric models

  3. Sampling distributions

Week 7---- Mid-term Exam

Week 8---- Introduction to Statistical Inference II & Bootstrap I

  1. Fundamental concepts in inference

  2. Empirical distributions

  3. Simulations

Week 9---- Bootstrap II

  1. Bootstrap variance estimation

  2. Bootstrap confidence intervals - Approximate normal intervals and Pivotal intervals

  3. Bootstrap confidence intervals - Percentile intervals

Week 10---- Point Estimation I

  1. Method of moments estimation

  2. Maximum likelihood estimation

  3. Properties of MLE

Week 11---- Point Estimation II

  1. Asymptotic efficiency

  2. Fisher information

  3. Parametric bootstrap

Week 12---- Hypothesis testing I

  1. EM algorithms

  2. Null and alternative hypotheses

  3. P-values

Week 13---- Hypothesis testing II

  1. Two types of errors

  2. Wald test, t-tests and t-intervals

  3. Likelihood ratio tests

Week 14---- Hypothesis testing III

  1. Asymptotic distribution of likelihood ratio tests

  2. Pearson’s chi-square test

  3. Goodness-of-fit tests

Week 15---- Regression models I

  1. Permutation tests

  2. Simple linear regression

  3. Least squares estimation

Week 16---- Regression models II

  1. Prediction

  2. Multiple linear regression

  3. Model selection


The design of class discussion or exercise, practice, experience and so on:

Students may be asked to do small projects so that they can better understand key concepts and statistical methods for solving real problems.

If you need a TA, please indicate the assignment of assistant:

The TA(s) will assist in grading homework and quizzes.

Grading & Evaluation (Provide a final grade that reflects the formative evaluation process):

The final grade is based on the following components: homework (20%), midterm exam (30%), and final exam (50%). Lateness and poor class attendance will also be taken into account in the final grade.

Teaching Materials & References (Including Author, Title, Publisher and Publishing time):

  1.  Wasserman, L. (2004). All of Statistics. Springer. (Chapters 1-10)

  2.  Casella, G. and Berger, R.L. (2002). Statistical Inference (2nd edition). Duxbury.

  3.  Knight, K. (2000). Mathematical Statistics. Chapman & Hall/CRC.

  4.  Pawitan, Y. (2001). In All Likelihood. Oxford University Press.

  5.  Zuur, A., Ieno, E. and Meesters, E. (2009). A Beginner's Guide to R. Springer.





Fudan University, Statistics: Principles, Methods and R (I). All rights reserved.