Random Forests


Random forests, which are used for both regression and classification, were proposed by Leo Breiman, an eminent American statistician known for his work on decision trees and the CART method. Decision trees indeed suffer from two main flaws:

  • their performance depends too strongly on the training sample,
  • the topology of the tree can change completely when a few extra observations are added.

To overcome these problems, we use several trees. To avoid building identical trees, we add randomness: each tree gets a fragmented view of the problem, obtained by random draws (sketched in code after the list):

  • on the input observations,
  • on the explanatory variables.
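As an illustration (a minimal sketch, not taken from the original page), the two draws for a single tree can be written with NumPy. The function name random_view and the parameter n_features_kept are hypothetical, and note that in most implementations the draw on the variables is actually repeated at every split of the tree; the per-tree view below only conveys the idea:

    import numpy as np

    rng = np.random.default_rng(0)

    def random_view(X, y, n_features_kept):
        """Fragmented view of the data for one tree: a bootstrap sample of the
        observations and a random subset of the explanatory variables."""
        n_obs, n_feat = X.shape
        # random draw on the input observations (bootstrap: with replacement)
        rows = rng.choice(n_obs, size=n_obs, replace=True)
        # random draw on the explanatory variables (without replacement)
        cols = rng.choice(n_feat, size=n_features_kept, replace=False)
        return X[np.ix_(rows, cols)], y[rows], cols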

More precisely, an ensemble of decision trees, each built from a random draw of the observations, is the tree bagging algorithm. Random forests add to tree bagging a random sampling of the variables: random forest = tree bagging + feature sampling.
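To make the equation "random forest = tree bagging + feature sampling" concrete, here is a minimal sketch assuming scikit-learn is available (not part of the original page): BaggingClassifier fits decision trees on bootstrap samples of the observations, and RandomForestClassifier adds the draw on the explanatory variables through max_features.

    from sklearn.datasets import load_iris
    from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)

    # Tree bagging: each tree (the default base estimator is a decision tree)
    # is fitted on a bootstrap sample of the observations.
    bagging = BaggingClassifier(n_estimators=100, random_state=0)

    # Random forest: tree bagging plus a random draw on the explanatory
    # variables, controlled here by max_features.
    forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                    random_state=0)

    for name, model in [("tree bagging", bagging), ("random forest", forest)]:
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")

On a richer dataset than this toy example, the feature sampling typically decorrelates the trees and lowers the variance of the ensemble compared with plain tree bagging.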
