
Yiyang Dong

Decision Tree 1 - Regression and Classification Trees

0. Tree-based methods
Tree-based methods involve stratifying (segmenting) the predictor space into a number of simple regions. The mean (for regression) or mode (for classification) of the training data in a region is used as the prediction for test data that falls into that region.
1. Regression Decision Tree
1.1 Motivation
Making predictions via stratification of the feature space:
Divide the predictor space, that is, the set of possible values for $X_1, X_2, \dots, X_p$, into $J$ distinct and non-overlapping regions $R_1, R_2, \dots, R_J$.
For every test observation that falls into the region $R_j$, we make the same prediction, which is simply the mean of the response $Y$ for the training observations in $R_j$.
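The region-mean prediction rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the training values, the single predictor, and the cut point `CUT` are all made up, and a real regression tree would choose its cut points from the data rather than take one as given.

```python
from statistics import mean

# Hypothetical training data: one predictor x and a response y.
train_x = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
train_y = [10.0, 12.0, 11.0, 30.0, 32.0, 31.0]

# Assumed cut point: R_1 = {x < CUT}, R_2 = {x >= CUT}.
CUT = 4.5


def region(x):
    """Return the index of the region that x falls into (0 for R_1, 1 for R_2)."""
    return 0 if x < CUT else 1


def fit_region_means(xs, ys):
    """Compute the mean response of the training observations in each region."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(region(x), []).append(y)
    return {j: mean(values) for j, values in groups.items()}


region_means = fit_region_means(train_x, train_y)


def predict(x):
    """Predict the mean training response of the region containing x."""
    return region_means[region(x)]
```

Every test point that lands in the same region receives the same prediction, so `predict(1.5)` and `predict(3.9)` return the same value.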

Full Stack Notes | Dockers

1. What is Docker?
Docker is a tool for building, running, and shipping applications in an isolated environment. It is similar to a virtual machine in that apps run in the same environment wherever they are deployed, and it has become a standard for software deployment because it makes deployment easier.
2. Containers vs Virtual Machines
Containers are an abstraction at the application layer that packages code and dependencies together. Multiple containers can run on the same machine and share the OS kernel with other containers, each running as an isolated process in user space, which makes them light and fast.
Virtual machines are an abstraction of physical hardware, turning one server into many servers.

4. Likelihoods
$$p(data|\theta)$$
4.1. What is a Likelihood
Imagine that we flip a coin and record its outcome. The simplest (idealised) model to represent this outcome ignores the angle the coin was thrown at, its height above the surface, etc. Because of our ignorance, our model cannot perfectly predict the behaviour of the coin. This uncertainty means that our model is probabilistic rather than deterministic. For a fair coin, where $\theta = Pr(H) = \frac{1}{2}$, and assuming the two flips are independent, we can use our model to calculate the probability of obtaining two heads in a row:
$$Pr(H,H|\theta, Model) = Pr(H|\theta,Model) \times Pr(H|\theta, Model)$$
$$=\theta \times \theta = \theta^2 = \left(\frac{1}{2}\right)^2 = \frac{1}{4}$$
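As a small sketch of the calculation above, the probability of a sequence of flips can be computed as a product of per-flip probabilities. The function name `prob_sequence` and the string encoding of outcomes are illustrative choices, and the code assumes the flips are independent with a fixed $\theta = Pr(H)$.

```python
def prob_sequence(outcomes, theta):
    """Probability of an independent sequence of flips, where theta = Pr(H).

    `outcomes` is a string such as "HH" or "HT" (a hypothetical encoding).
    """
    p = 1.0
    for outcome in outcomes:
        # Each head contributes theta, each tail contributes 1 - theta.
        p *= theta if outcome == "H" else 1.0 - theta
    return p


# Two heads in a row under the fair-coin model: theta * theta.
two_heads_fair = prob_sequence("HH", 0.5)

# Holding the data fixed at "HH" and varying theta traces out theta ** 2.
values_over_theta = [prob_sequence("HH", t) for t in (0.25, 0.5, 0.75)]
```

Holding the data fixed and varying `theta`, as in the last line, is what turns this probability into a likelihood for $\theta$.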

Example 1
Consider data for the Average Cardiovascular Mortality Rate in Los Angeles, California, in cmort.data.
(a) Examine the data, state whether there was anything strange about the files, and whether you made adjustments. 1
(b) Produce a plot of the data versus time. Include either labels on the axes OR a legend explaining the axes. 1
(c) Make observations about the existence or lack of existence of,

SQL Basics

Create Table
    CREATE TABLE tableName (
        col1 <dataType> <Restrictions>,
        col2 <dataType> <Restrictions>,
        col3 <dataType> <Restrictions>,
        <Primary Key or Index definitions>);
Create Table Actors
    CREATE TABLE Actors (
        FirstName VARCHAR(20),
        SecondName VARCHAR(20),
        DoB DATE,
        Gender ENUM('Male','Female','Other'),
        MaritalStatus ENUM('Married','Divorced','Single'),
        NetWorthInMillions DECIMAL);
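To try the Actors table without a database server, one option is Python's built-in `sqlite3` module. This is a sketch under an important assumption: SQLite has no `ENUM` type, so the two `ENUM` columns are approximated here with `TEXT` columns and `CHECK` constraints, which is a swapped-in technique rather than the original MySQL-style definition. The inserted rows are made-up sample data.

```python
import sqlite3

# In-memory database, discarded when the connection closes.
conn = sqlite3.connect(":memory:")

# Assumption: ENUM is replaced by TEXT + CHECK because SQLite lacks ENUM.
conn.execute(
    """
    CREATE TABLE Actors (
        FirstName VARCHAR(20),
        SecondName VARCHAR(20),
        DoB DATE,
        Gender TEXT CHECK (Gender IN ('Male', 'Female', 'Other')),
        MaritalStatus TEXT CHECK (MaritalStatus IN ('Married', 'Divorced', 'Single')),
        NetWorthInMillions DECIMAL
    )
    """
)

# Hypothetical sample row that satisfies both CHECK constraints.
conn.execute(
    "INSERT INTO Actors VALUES (?, ?, ?, ?, ?, ?)",
    ("Jane", "Doe", "1980-01-01", "Female", "Single", 1.5),
)
rows = conn.execute("SELECT FirstName, Gender FROM Actors").fetchall()

# A value outside the allowed set violates the CHECK constraint.
try:
    conn.execute(
        "INSERT INTO Actors VALUES (?, ?, ?, ?, ?, ?)",
        ("John", "Roe", "1975-05-05", "Unknown", "Single", 2.0),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The `CHECK` constraints play the role the `ENUM` restrictions play in the original statement: only the listed values are accepted for those columns.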