Classification and Regression Trees

CART® is classification and regression tree software that revolutionized the field of advanced analytics and inaugurated the current era of data science. It is one of the most important tools in modern data mining.

Features

  • Linear Combination Splits
  • Optimal tree selection based on area under ROC curve
  • User defined splits for the root node and its children
  • Translating models into Topology
  • Edit and modify the CART trees via FORCE command structures
  • RATIO of the improvements of the primary splitter and the first competitor
  • Scoring of CV models as an Ensemble
  • Report impact of penalties in root node
  • New penalty against biased splits: PENALTY BIAS (PENALTY / BIAS, CONTBIAS, CATBIAS)
  • Automation: Generate models with alternative handling of missing values (Automate MISSING_PENALTY)
  • Automation: Build a model using each splitting rule (six for classification, two for regression) (Automate RULES)
  • Automation: Build a series of models varying the depth of the tree (Automate DEPTH)
  • Automation: Build a series of models changing the minimum required size on parent nodes (Automate ATOM)
  • Automation: Build a series of models changing the minimum required size on child nodes (Automate MINCHILD)
  • Automation: Explore accuracy versus speed trade-off due to potential sampling of records at each node in a tree (Automate SUBSAMPLE)
  • Automation: Generate a series of N unsupervised-learning models (Automate UNSUPERVISED)
  • Automation: Vary the RIN (Regression In the Node) parameter through a series of values (Automate RIN)
  • Automation: Vary the number of "folds" used in cross-validation (Automate CVFOLDS)
  • Automation: Repeat cross-validation process many times to explore the variance of estimates (Automate CVREPEATED)
  • Automation: Build a series of models using a user-supplied list of binning variables for cross-validation (Automate CVBIN)
  • Automation: Check the validity of model performance using Monte Carlo shuffling of the target (Automate TARGETSHUFFLE)
  • Automation: Build two linked models, where the first one predicts the binary event while the second one predicts the amount (Automate RELATED). For example, predicting whether someone will buy and how much they will spend
  • Automation: Produce a variable importance matrix report when possible (Automate VARIMP)
  • Automation: Save the variable importance matrix to a comma-separated file (Automate VARIMPFILE)
  • Automation: Generate models with alternative handling of missing values (Automate MVI)
  • Hotspot detection for Automate UNSUPERVISED
  • Hotspot detection for Automate TARGET
  • Hotspot detection to identify the richest nodes across the multiple trees
  • Differential Lift Modeling (Netlift/Uplift)
  • Profile tab in CART Summary window
  • Multiple user defined lists for linear combinations
  • Constrained trees
  • Ability to create and save dummy variables for every node in the tree during scoring
  • Report basic stats on any variable of user choice at every node in the tree
  • Comparison of learn vs. test performance at every node of every tree in the sequence
  • Automation: Vary the priors for the specified class (Automate PRIORS)
  • Automation: Build a series of models by progressively removing misclassified records, thus increasing the robustness of trees and possibly reducing model complexity (Automate REFINE)
  • Automation: Bagging and ARCing using the legacy code (COMBINE)
  • Automation: Build a series of models limiting the number of nodes in a tree (Automate NODES)
  • Automation: Build a series of models trying each available predictor as the root node splitter (Automate ROOT)
  • Automation: Explore the impact of favoring equal sized child nodes by varying CART’s end cut parameter (Automate POWER)
  • Automation: Explore the impact of penalty on categorical predictors (Automate PENALTY=HLC)
  • Build a Random Forests model utilizing the CART engine to gain alternative handling of missing values via surrogate splits (Automate BOOTSTRAP RSPLIT)
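
The idea behind Monte Carlo target shuffling (Automate TARGETSHUFFLE in the list above) can be illustrated with a small, generic sketch. This is not CART's implementation or command syntax: it is plain Python using only the standard library, the "model" is a single-split stump rather than a full tree, and the names best_stump_accuracy and target_shuffle_test are hypothetical helpers invented for this example. The sketch assumes a binary 0/1 target and one numeric predictor. It fits the model on the real target, refits it many times on randomly shuffled copies of the target, and reports how often a shuffled target does at least as well.

```python
import random


def best_stump_accuracy(xs, ys):
    """Training accuracy of the best single-threshold split on one numeric
    predictor, with each side predicting its own majority class (0 or 1)."""
    n = len(xs)
    pairs = sorted(zip(xs, ys))
    total_pos = sum(ys)
    # Baseline with no split: predict the overall majority class.
    best = max(total_pos, n - total_pos) / n
    left_pos = 0
    for i in range(1, n):
        left_pos += pairs[i - 1][1]
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot place a threshold between equal values
        left_n, right_n = i, n - i
        right_pos = total_pos - left_pos
        correct = (max(left_pos, left_n - left_pos)
                   + max(right_pos, right_n - right_pos))
        best = max(best, correct / n)
    return best


def target_shuffle_test(xs, ys, n_shuffles=200, seed=0):
    """Compare accuracy on the real target with accuracy on shuffled targets.

    Returns (observed_accuracy, p_value), where p_value is the share of
    shuffles that matched or beat the observed accuracy, with the usual
    +1 correction so it is never exactly zero."""
    rng = random.Random(seed)
    observed = best_stump_accuracy(xs, ys)
    shuffled = list(ys)
    at_least_as_good = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # breaks any real link between xs and ys
        if best_stump_accuracy(xs, shuffled) >= observed:
            at_least_as_good += 1
    p_value = (at_least_as_good + 1) / (n_shuffles + 1)
    return observed, p_value


# Synthetic data (an assumption for illustration): the target depends
# entirely on whether the predictor exceeds 0.5.
data_rng = random.Random(42)
xs = [data_rng.random() for _ in range(200)]
ys = [1 if x > 0.5 else 0 for x in xs]

observed, p_value = target_shuffle_test(xs, ys)
print(f"observed accuracy: {observed:.3f}, shuffle p-value: {p_value:.4f}")
```

A small p-value suggests the model's performance is unlikely to be an artifact of the search over splits; a large one suggests the apparent accuracy could be reproduced on noise.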