2,512 questions
6
votes
2
answers
103
views
Reproduce a particular tree from the random forest using DecisionTreeRegressor
I am trying to replicate a specific decision tree trained by a RandomForestRegressor class, using DecisionTreeRegressor.
However, I cannot get the exact results, even when using the exact same ...
0
votes
1
answer
104
views
HDBSCAN Interpretation and Logic
I have made a basic HDBSCAN model (picture output below) but I need to figure out names for the individual clusters. Is there a way I can get something like a decision tree or the parameters for each ...
1
vote
0
answers
88
views
Why is DecisionTree using the same feature and the same condition twice
When trying to fit scikit-learn DecisionTreeClassifier on my data, I am observing some weird behavior.
x[54] (a boolean feature) is used to split the 19 samples into 2 and 17 at the top-left node. Then ...
0
votes
1
answer
96
views
HalvingGridSearchCV cannot fit multi label DecisionTreeClassifier
I'm trying to use HalvingGridSearch to find the best DecisionTree model. My model performs a multi-label prediction on a single example; it is trained on a batch of data of size (n_samples x ...
0
votes
0
answers
71
views
Get rule interpretations in h2o rulefit model
Following the example for h2o rulefit model from the documentation (https://docs.h2o.ai/h2o/latest-stable/h2o-docs/data-science/rulefit.html), I checked the variable importance of the rules or linear ...
3
votes
1
answer
94
views
Generate a simple decision tree program for finding minimums
A new code-generating tool is given only one input - n, the size of an input array. The tool should then generate a simple decision tree program that contains 2 kinds of nodes:
decision node with ...
0
votes
0
answers
48
views
Define a custom tree splitter from sklearn
I'm trying to define a custom splitter using sklearn Classification Trees classes, but I'm getting no results so far. I get no errors, but the tree is not grown. How can I achieve this?
My strategy is ...
0
votes
2
answers
84
views
How is each tree within DecisionTreeClassifier calculating probability of a class?
According to the sklearn docs, if you apply predict_proba to DecisionTreeClassifier:
The predicted class probability is the fraction of samples of the same
class in a leaf.
Let's say that the rows ...
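The documented rule quoted in this excerpt can be illustrated with a minimal sketch. It assumes unweighted samples and uses a hypothetical helper, `leaf_class_probabilities`, which is not part of scikit-learn; the leaf contents are made up for illustration.

```python
from collections import Counter


def leaf_class_probabilities(leaf_labels, classes):
    """Return, for each class, the fraction of a leaf's training samples with that label.

    Hypothetical helper for illustration only; it ignores sample weights.
    """
    counts = Counter(leaf_labels)
    total = len(leaf_labels)
    return [counts[c] / total for c in classes]


# Hypothetical leaf that received 4 training samples: three of class 0, one of class 1.
probs = leaf_class_probabilities([0, 0, 0, 1], classes=[0, 1])
print(probs)  # [0.75, 0.25]
```

Any new row routed to this leaf would then be assigned these same fractions as its class probabilities.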
1
vote
1
answer
54
views
Plotting one Decision Tree of a Random Forest in sklearn
I have come across something strange when plotting a decision tree in sklearn.
I just wanted to compare a Random Forest model consisting of one estimator using bootstrapping and one without ...
0
votes
0
answers
148
views
TypeError: 'int' object is not subscriptable, python, dtreeviz
I'm trying to use the dtreeviz library to visualize a decision tree, but I’m encountering an error:
TypeError: 'int' object is not subscriptable
Here’s the code I’m trying to run:
viz = dtreeviz(...
0
votes
0
answers
38
views
Draw a decision tree while hiding the values of the "value" row
I want to simplify the decision tree output and hide the values in the "value" field. Below is the code I am using:
fig, ax = plt.subplots(figsize=(10, 10))
...
1
vote
0
answers
38
views
How do actions in a DecisionTree work?
I'm currently working on a decision tree for an enemy character. I'm not sure how to use the actions for the decision tree.
Do the actions either:
1.
void AttackAction::ExecuteAction(EnemyBase& ...
0
votes
1
answer
49
views
sklearn plot_tree function does not show the class when the tree has only one node
I use the following code to plot decision trees:
plt.figure(figsize=(12, 12))
plot_tree(estimator,
feature_names=feature_names,
label= 'all',
...
0
votes
1
answer
48
views
Error predicting with REEMtree model: Number of observations in newdata does not match group identifiers
I am using the REEMtree package in R to build a tree with random effects, but when I attempt to make predictions on the test data, I encounter the following error:
Error in predict.REEMtree(...
0
votes
0
answers
55
views
Data structure for summing and taking the max of decision trees
I have a tree structure that I'm using to represent a set of decisions and the resulting payout:
Here there are two choices to be made: 0 can be either a, b or c, and 1 can be either d or e. The ...
2
votes
0
answers
31
views
How to use pal.node.fun with nod.fun in prp decision tree diagram package for R
Currently, I'm working on a project where we desire the labels under the tree diagram to show the accuracy rate (I figured out how to calculate this using values given in the tree data frame). I've ...
2
votes
1
answer
147
views
Is set.seed() needed when building a single decision tree in R?
I am learning how to build a single decision tree and random forests in R. I understand that set.seed() is needed before building a random forest to ensure reproducibility of the results, e.g. if ...
1
vote
1
answer
23
views
Weird info ordering in scikit-learn trees
When plotting scikit-learn trees (on iris data as an example), as in the code below:
from sklearn.datasets import load_iris
from sklearn import tree
iris = load_iris()
X, y = iris.data, iris.target
...
1
vote
1
answer
56
views
How to apply the exported sklearn trained tree to the test data
from sklearn.tree import DecisionTreeRegressor, export_text
cols_X = ['f1', 'f2']
df_train = pd.DataFrame([[1, 3, 4], [2, 5, 1], [7, 8, 7]], columns=['f1', 'f2', 'label'])
df_test = pd.DataFrame([[2, ...
0
votes
0
answers
122
views
Prune function for decision tree
I am creating a decision tree from scratch and implementing pruning. Currently I believe the problem in my code is that when I prune a tree, the new leaf node I create does not get placed into the ...
-3
votes
2
answers
77
views
ValueError: could not convert string to float: '?' while working with MSE
I am using the auto-mpg dataset. The link to the dataset is below:
https://www.kaggle.com/datasets/uciml/autompg-dataset
My code is below:
df = pd.read_csv('data/auto-mpg.csv')
df....
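The error in this question's title means that the literal string '?' was passed to a float conversion. As a minimal sketch of one way to handle such placeholders, the hypothetical helper below (`to_float_or_none`, with made-up sample values) treats '?' as a missing value rather than a number; it is an illustration, not the asker's code.

```python
def to_float_or_none(raw, missing_markers=("?", "")):
    """Convert a string to float, returning None for placeholder values.

    Hypothetical helper for illustration; the set of markers is an assumption.
    """
    text = raw.strip()
    if text in missing_markers:
        return None
    return float(text)


# Made-up column values in which one entry is the '?' placeholder.
column = ["130.0", "165.0", "?", "150.0"]
cleaned = [to_float_or_none(value) for value in column]
print(cleaned)  # [130.0, 165.0, None, 150.0]
```

Rows that come back as None can then be dropped or imputed before any error metric such as MSE is computed.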
0
votes
0
answers
29
views
Plots not generating all the samples and leaving excess vertical space
I'm pretty new to machine learning. I was using fetch_olivetti_faces as my database for practice in my coding class. I ran the code, and it worked since I was following the teacher's instructions. ...
0
votes
1
answer
83
views
How to manually adjust a decision tree obtained from rpart, including surrogate splits?
I built a decision tree with surrogate splits using rpart. Now, after inspection of the tree by a subject matter expert, the tree needs some small manual adjustment (addition of an extra branch).
...
1
vote
1
answer
106
views
User defined impurity in Regression Decision Trees
I am migrating from R to PySpark. I have a process that creates a regression tree that is currently built using R's rpart algorithm.
While configuring this in PySpark, I am unable to see an option to ...
0
votes
0
answers
74
views
I am trying to replicate a decision tree from SPSS in Python using DecisionTreeClassifier
I am trying to replicate a decision tree from SPSS in Python using DecisionTreeClassifier. I am unable to do the following.
Unable to use a feature to force the first split.
If I use the same ...
-1
votes
1
answer
61
views
Verifying data but valid values depends on other columns
I have a Pandas dataframe built like this:
Fruit   Color   Eaten?  Date Eaten
Apple   Red     Yes     14-Mar-2024
Apple   Green   No      14-Mar-2024
Apple   Yellow  Yes
Banana  Red
Banana  Yellow  Yes     14-Mar-2024
I'm trying to ...
0
votes
1
answer
73
views
KeyError in Decision Tree during prediction
I want to create predict and predict_proba methods in my DecisionTreeClassifier implementation, but it gives the error
Traceback (most recent call last):
File "c:\Users\Nijat\project.py", ...
0
votes
1
answer
81
views
rpart() decision tree fails to generate splits (decision tree with only one node (the root node)) [closed]
I'm trying to create a decision tree to predict whether a given loan applicant would default or repay their debt.
I'm using the following dataset
library(readr)
library(dplyr)
library(rpart)
library(...
1
vote
0
answers
245
views
Apply large number of decision tree rules to SQL data
I want to apply rules that I've created from modelling on data using a decision tree to unseen data. I've parsed the rules to get a CASE WHEN statement like so:
CASE
WHEN variable_1 = "Value1&...
-1
votes
1
answer
478
views
How to generate multi-level decision tree from a JSON structure using Python? [closed]
I am currently working on a problem where I have to create a flexible multi-dimensional decision tree from a JSON structure in Python. There are composed decision rules in the JSON and each decision ...
-1
votes
1
answer
95
views
Decision Tree Regressor Output
I have a very simple dataset of employee age and years of experience as features and income as label. The ask is to predict the income level using various regressors and I am using 4: Decision Trees (...
2
votes
2
answers
360
views
Plot a decision tree from HistGradientBoostingClassifier
I have a HistGradientBoostingClassifier model and I want to plot one or more of its decision trees. However, I can't find a native function to do it. I can access the Tree predictor ...
1
vote
2
answers
163
views
Random Forest / Decision Tree Output Probability Design: Using Positive Output Leaf Samples / Total Output Leaf Samples
I am designing a binary classifier random forest model using Python and scikit-learn, in which I would like to retrieve the probability of my test set being one of the two labels. To my understanding, ...
0
votes
0
answers
37
views
Why does the DecisionTreeClassifier's accuracy change when we make these changes?
I was training a model using DecisionTreeClassifier and, just as with the training and testing datasets for the LinearRegression / LogisticRegression algorithms, I use a 2D array for the X values and a 1D array for y ...
1
vote
1
answer
1k
views
Is there a difference in the underlying sklearn 'entropy' and 'log_loss' criteria for decision tree classifiers?
I'm implementing a decision tree classifier using sklearn and testing out different criteria, but I can't seem to find what the difference is between the 'entropy' and 'log_loss' criteria. The ...
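For context, the scikit-learn documentation describes both 'entropy' and 'log_loss' as criteria for the Shannon information gain. The sketch below shows the Shannon entropy of a node's class counts in plain Python; the function name `shannon_entropy` and the example counts are illustrative assumptions, not scikit-learn internals.

```python
import math


def shannon_entropy(class_counts):
    """Shannon entropy (in bits) of a node, given the number of samples per class."""
    total = sum(class_counts)
    entropy = 0.0
    for count in class_counts:
        if count > 0:  # empty classes contribute nothing to the sum
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical nodes: a perfectly mixed binary node and a pure node.
balanced = shannon_entropy([5, 5])
pure = shannon_entropy([10, 0])
print(balanced, pure)  # 1.0 0.0
```

A split is preferred when it lowers the sample-weighted average of this quantity across the child nodes.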
0
votes
1
answer
377
views
How to set decision-tree distance threshold in Deepface library? Regarding "LightFace: A Hybrid Deep Face Recognition Framework" paper
I am reading the deepface library paper, "LightFace: A Hybrid Deep Face Recognition Framework".
Q1. Is there code to explain how to determine the distance threshold?
The decision tree (C4.5) ...
0
votes
0
answers
47
views
Partykit predict function throws warning when predicting with new data
I'm trying to predict on new data with a ctree object. I get this warning message when I run the function:
b1b2_party <- ctree(factor(final_category_bin) ~ ., data = train, control = ctree_control(...
1
vote
1
answer
58
views
Printing a decision tree as an image in Python
I have written a decision tree class in Python which uses the Node class as tree nodes, as shown below:
class Node:
'''
Helper class which implements a single tree node.
'''
def ...
-1
votes
1
answer
54
views
Line of code not compiling in jupyter notebook
I'm trying to build a second dataframe in Jupyter Notebook to train a stronger model.
This is the line of code:
dtc2 = DecisionTreeClassifier(criterion = 'entropy', ccp_alpha=0.04)
I'm getting a type ...
0
votes
1
answer
183
views
Switching binary classification python scikit-learn model to multi-class classification model
I'm currently having trouble switching the following code to fit a multiclass variable (3 levels).
# data import
from ucimlrepo import fetch_ucirepo
import numpy as np
import pandas as pd
import ...
0
votes
1
answer
111
views
Decision Tree prediction for the fail reason
In my experiment, I used Decision Trees to predict whether participants will pass or fail, and I will provide feedback to them based on the reason for their failure. The Decision Tree includes three ...
1
vote
0
answers
79
views
Decision tree using rpart for factor returns only the first node
I am trying to make a decision tree using the rpart function in R. I have the y variable "outcome" and 4 variables as x. All of them are factors. Every tree I tried to make returns only the ...
0
votes
1
answer
136
views
Creating Tensorflow decision forests from individual trees
Is it possible to build a decision forest with TensorFlow from many individual decision trees? Also, is it possible to remove and add individual trees in the decision forest based on some performance criteria? ...
0
votes
1
answer
185
views
How to identify feature names from indices in a decision tree using scikit-learn’s CountVectorizer?
I have the following data for training a model to detect whether a sentence is about:
a cat or dog
NOT about a cat or dog
I ran the following code to train a DecisionTreeClassifier() model then view ...
-4
votes
1
answer
57
views
How does persisting the model increase accuracy?
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score
whitewine_data = pd....
0
votes
0
answers
163
views
XGBoost custom & default objective and evaluation functions
I am training a BDT for binary classification of signal/background (I work in particle physics). My model (implemented in python) looks like:
import xgboost as xgb
train = xgb.DMatrix(data=train_df[...
0
votes
1
answer
116
views
AttributeError: 'RandomForestRegressor' object has no attribute 'tree_'. How do I resolve this?
I am trying to use the random forest model to predict the effects of social media ads based on age and estimated salary. This is my code, but I keep getting an AttributeError.
from sklearn.tree ...
0
votes
1
answer
152
views
How can I limit the depth of a decision tree using C4.5 in Weka?
I am trying to use Weka and the J48 classifier for the C4.5 algorithm. In particular, I am looking to limit the depth of the decision tree produced (1 layer, 2 layers, etc.) but I do not ...
0
votes
1
answer
141
views
Error when importing DecisionTreeClassifier from sklearn
When I try to import a DecisionTreeClassifier from sklearn.tree I receive the following attribute error: AttributeError: module 'numpy' has no attribute 'float'
My code is:
import sklearn
print(...
0
votes
1
answer
156
views
I have loaded a CSV file in the Weka tool but J48 is not highlighted
I uploaded a CSV file in the Weka tool. After preprocessing, I want to apply the J48 classifier, but J48 is not highlighted there.
I want to apply a decision tree classifier, but it is not highlighted.