# Random Forest

Before moving on to SVM (support vector machine), we first have to learn what a Random Forest is. A random forest is an ensemble of decision trees, so we start with the decision tree.

## 1. What is a decision tree

A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm. (from Wikipedia)

### Two Types of Decision Tree

1. Categorical Variable Decision Tree: the target variable is categorical.

2. Continuous Variable Decision Tree: the target variable is continuous.

Example: Let's say we have to predict whether a customer will pay the renewal premium to an insurance company (yes/no). We know that the customer's income is a significant variable, but the insurance company does not have income details for all customers. Since this variable is important, we can build a decision tree to predict customer income based on occupation, product, and various other variables. In this case, we are predicting values of a continuous variable.
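To make the continuous case concrete, here is a minimal sketch of how a regression tree could pick a single split. The feature (`years_employed`), the income figures, and the function names are all made up for illustration; the split criterion assumed here is the sum of squared errors, which is one common choice and not necessarily the one a given library uses.

```python
def sse(values):
    """Sum of squared errors around the mean of values."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (cost, threshold, left_mean, right_mean) for the best single split."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, sum(left) / len(left), sum(right) / len(right))
    return best


# Hypothetical data: years employed -> income (in thousands)
years_employed = [1, 2, 3, 10, 11, 12]
income = [30, 32, 31, 80, 82, 81]

cost, threshold, low_income, high_income = best_split(years_employed, income)
print(threshold, low_income, high_income)
```

Each side of the split predicts the mean income of the customers that fall into it, which is how a tree produces a continuous prediction.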

### Important Terminology related to Decision Trees

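Root Node: The topmost node, representing the entire sample before any split.

Splitting: Dividing a node into two or more sub-nodes.

Decision Node: A sub-node that splits into further sub-nodes.

Leaf / Terminal Node: A node that does not split.

Pruning: Removing the sub-nodes of a decision node. It is the opposite of splitting.

Branch / Sub-Tree: A subsection of the entire tree.

Parent and Child Node: A node that is divided into sub-nodes is the parent of those sub-nodes, and the sub-nodes are its children.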

Decision trees have two main disadvantages:

1. Overfitting: Overfitting is one of the most common practical difficulties for decision tree models. It is addressed by setting constraints on model parameters and by pruning (discussed in detail below).
2. Not fit for continuous variables: When working with continuous numerical variables, a decision tree loses information when it bins the variables into categories.
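Both points can be seen in a small sketch. The code below grows a toy regression tree with a hypothetical `max_depth` constraint; the data and all names are invented for illustration, and the split rule (minimum sum of squared errors) is an assumption. A deep tree memorizes every training point, while a depth-1 tree collapses each half of the data into a single mean.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build(xs, ys, max_depth):
    """Grow a regression tree; a leaf is a float, a node is (threshold, left, right)."""
    if max_depth == 0 or len(set(ys)) == 1:
        return mean(ys)
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        cost = sse([y for _, y in pairs[:i]]) + sse([y for _, y in pairs[i:]])
        if best is None or cost < best[0]:
            best = (cost, i)
    if best is None:
        return mean(ys)
    i = best[1]
    threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    left = build([x for x, _ in pairs[:i]], [y for _, y in pairs[:i]], max_depth - 1)
    right = build([x for x, _ in pairs[i:]], [y for _, y in pairs[i:]], max_depth - 1)
    return (threshold, left, right)


def predict(tree, x):
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree


def count_leaves(tree):
    if not isinstance(tree, tuple):
        return 1
    return count_leaves(tree[1]) + count_leaves(tree[2])


# Hypothetical noisy data with two clear groups
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [10, 12, 9, 11, 30, 28, 31, 29]

shallow = build(xs, ys, max_depth=1)   # constrained tree
deep = build(xs, ys, max_depth=10)     # effectively unconstrained here

print(count_leaves(shallow), count_leaves(deep))
print(predict(shallow, 3), predict(deep, 3))
```

The deep tree ends up with one leaf per training point, so it reproduces the noise exactly. The constrained tree keeps only two leaves, and every input on the same side of the threshold gets the same prediction, which is the information loss mentioned in the second point.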

## 2. Regression Trees vs Classification Trees

Categories: Programming

# Zero to C-Mips Compiler

As the course project for CS143, I am writing a compiler on my own.

There are five steps:

1. Lexical & Syntax Analysis
2. Semantic Analysis & Type Checking
3. Intermediate Code
4. Translated MIPS Code
5. Optimization
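
As a rough illustration of the first step in the list above, lexical analysis turns source text into a stream of tokens. The sketch below is only an assumption about what such a lexer might look like: the token names and the tiny C-like subset are hypothetical and are not taken from the CS143 assignment.

```python
import re

# Hypothetical token set for a tiny C-like subset (illustration only)
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=;(){}]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Split source into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x = 42 + y;"))
```

The later steps (parsing, type checking, code generation) would consume this token stream rather than the raw characters.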

Categories: Programming

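# Computer Organization: The Processor (1)

Sections 4.1 to 4.3 cover the introduction, logic design conventions, and building a datapath. Let's go through them step by step:

## 4.1 Introduction

Definition of the control unit: It tells the computer's memory, arithmetic/logic unit and input and output devices how to respond to a program's instructions.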
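One way to picture this definition is as a lookup from the instruction's opcode to a set of control signals. The sketch below assumes a textbook-style single-cycle MIPS datapath with only four instruction classes; the dictionary layout and the `decode` function are hypothetical, and `None` marks a don't-care signal.

```python
# Control signals per opcode for a simplified single-cycle MIPS (assumed subset)
CONTROL = {
    0b000000: dict(name="R-format", RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
                   MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),
    0b100011: dict(name="lw", RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
                   MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    0b101011: dict(name="sw", RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
                   MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    0b000100: dict(name="beq", RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
                   MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),
}


def decode(instruction):
    """Return the control signals for a 32-bit instruction word."""
    opcode = (instruction >> 26) & 0x3F  # top 6 bits hold the opcode
    return CONTROL[opcode]


lw_signals = decode(0x8FA80004)   # lw $t0, 4($sp)
add_signals = decode(0x012A4020)  # add $t0, $t1, $t2
print(lw_signals["name"], lw_signals["MemRead"], lw_signals["RegWrite"])
print(add_signals["name"], add_signals["ALUOp"])
```

For `lw`, the signals tell memory to read and the register file to write; for an R-format instruction, memory stays idle and the ALU operation is chosen from the function field.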

## 4.2 Logic Design Conventions

Categories: Programming

# Arch Linux: Cannot Play Audio from Multiple Programs at Once

Categories: Programming

# benchmark optimize(1)

Source code: whetstone.c

Base compiler flags: -std=c89 -DDP  -DROLL -lm

The build produces no warnings and no errors.

## GCC First:

1. Simply compile and run:

Rolled Double  Precision 703148 Kflops ; 2048 Reps

2. 703148 Kflops is too slow, so we add the flag -O4 to optimize the loops (GCC treats any level above -O3 as -O3), then compile and run again:

Rolled Double  Precision 4177105 Kflops ; 2048 Reps

Better now!
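To put a number on the improvement, the two results above can be compared directly. This is just arithmetic on the reported Kflops figures; the variable names are made up.

```python
baseline_kflops = 703148   # gcc with the base flags only
o4_kflops = 4177105        # gcc with -O4 added

speedup = o4_kflops / baseline_kflops
print(f"speedup from -O4: {speedup:.2f}x")
```

So turning on optimization alone makes the benchmark roughly six times faster.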

Now we add more flags:

gcc -std=c89 -DDP  -DROLL -O4 -ffast-math -funroll-all-loops -mavx whetstone.c -fopenmp -lm -o b.out

-ffast-math makes floating-point code faster by relaxing strict IEEE/ISO rules, at the cost of accuracy.

-mavx allows the compiler to use AVX instructions.

5340310 Kflops now!
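
Why does -ffast-math cost accuracy? One reason is that it allows the compiler to reorder floating-point operations as if they were associative, which they are not. The sketch below shows the effect in Python with hand-picked values; it illustrates the principle only and does not reproduce what GCC actually does to whetstone.c.

```python
big = 1e16
small = 1.0

# Evaluated in the order written: small is lost when added to big
strict = (big + small) - big

# Algebraically the same expression, but reassociated
reassociated = (big - big) + small

print(strict, reassociated)
```

The two expressions are equal in exact arithmetic but give different floating-point results, so a compiler that is free to reassociate can change the output of a numeric program.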

## ICC THEN:

1. Simply compile and run:

Rolled Double  Precision 4636137 Kflops ; 2048 Reps

This looks good at first. Adding the flag -O3 does not make the program any faster, so we consider parallel methods.

The flag -xHost improves performance by about 14%.

2. Parallel methods:

First of all we have to run vtune_amplifier_xe. It is located in /opt/intel/vtune_amplifier_xe_xxx/bin64; run /opt/intel/vtune_amplifier_xe_xxx/bin64/amplxe-gui and the application window appears (xxx stands for the version of vtune_amplifier_xe).

Run this command (as root):

root# echo 0 > /proc/sys/kernel/yama/ptrace_scope

Then follow the tutorial hotspots_amplxe_lin.pdf.

The analysis shows the hotspots as well as the CPU utilization, which is poor. Now we have to consider parallelizing the program.

Categories: Programming