Aim: Visual Programming with Orange Tool
What is Visual Programming?
Visual programming is a style of programming that lets users create programs by manipulating program elements graphically rather than by specifying them textually.
Dataset Description
Here I have used the heart-disease data set. It contains attributes such as age, gender, chest pain type, cholesterol, fasting blood sugar, etc.
Split our data into training data and testing data
To split our data we can use the Data Sampler widget. With this widget in the Orange tool we can split the data according to our needs, for example by a fixed proportion.
Here is the process for the splitting of data.
This is how data is split into two tables
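The split can also be pictured in a few lines of plain Python. This is only a minimal sketch of a fixed-proportion random split, not Orange's own implementation; the function name `sample_split`, the 70% proportion, the seed, and the toy rows are all assumptions made for illustration.

```python
import random

def sample_split(rows, proportion=0.7, seed=42):
    """Return (data_sample, remaining_data), like the two outputs of a sampler."""
    rng = random.Random(seed)           # fixed seed so the split is repeatable
    indices = list(range(len(rows)))
    rng.shuffle(indices)                # random order of row indices
    cut = round(len(rows) * proportion)
    data_sample = [rows[i] for i in indices[:cut]]
    remaining_data = [rows[i] for i in indices[cut:]]
    return data_sample, remaining_data

# Hypothetical toy rows: (age, cholesterol, has_disease)
rows = [(40 + i, 180 + 3 * i, i % 2) for i in range(10)]
train_rows, test_rows = sample_split(rows, proportion=0.7)
print(len(train_rows), len(test_rows))
```

Every row ends up in exactly one of the two tables, which is what keeps the test data unseen during training.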
Now we want to pass these two tables on as training data and test data. We can do that with the Test and Score widget, available in the Evaluate section.
Connect the Test and Score widget with Data Sampler
Double-click the link between Data Sampler and Test & Score. Then connect Data Sample with Data and Remaining Data with Test Data, as shown in the above image. In Test & Score, select "Test on test data" so that the remaining data is used for evaluation.
What is the effect of splitting data on classification result/classification model?
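When we split our data into test data and training data, the rows are assigned to each part at random. With stratified sampling, each part also keeps roughly the same proportion of each class as the full data set, so the model is trained and tested on representative data.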
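To make the idea of keeping class proportions concrete, here is a small sketch of a stratified split in plain Python. It is an illustration under assumptions, not the code Orange runs: `stratified_split`, the 60/40 class mix, and the 70% proportion are hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_split(rows, label_of, proportion=0.7, seed=0):
    """Split each class separately so both parts keep the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)   # group rows by class label
    train, test = [], []
    for members in by_class.values():
        rng.shuffle(members)                  # random order within the class
        cut = round(len(members) * proportion)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test

# Hypothetical data: 60 rows of class 0 and 40 rows of class 1
labelled_rows = [(i, 0) for i in range(60)] + [(i, 1) for i in range(40)]
strat_train, strat_test = stratified_split(labelled_rows, label_of=lambda r: r[1])
train_counts = Counter(r[1] for r in strat_train)
test_counts = Counter(r[1] for r in strat_test)
print(train_counts, test_counts)
```

Both parts keep the 60:40 class ratio of the full data, which a purely random split only achieves approximately.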


As shown in the above image, we can split the data, train different models such as a tree, Naive Bayes, and logistic regression, and check which one gives better accuracy.
From the above image, we can see that on the test data logistic regression gives better precision than the tree model and the SVM model.
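The accuracy and precision columns used for this comparison can be sketched as follows. This is a simplified version for a single positive class; the numbers Orange reports may be averaged over classes, and the labels and predictions below are made up for illustration, not taken from the heart-disease results.

```python
def accuracy(actual, predicted):
    """Share of rows where the prediction matches the true class."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

def precision(actual, predicted, positive=1):
    """Share of rows predicted positive that really are positive."""
    flagged = [a for a, p in zip(actual, predicted) if p == positive]
    if not flagged:
        return 0.0                      # no positive predictions at all
    return sum(1 for a in flagged if a == positive) / len(flagged)

# Hypothetical true labels and predictions from two models
actual  = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 0, 0, 0, 1, 1]
model_b = [1, 1, 1, 1, 0, 1, 1, 0]

for name, predicted in [("model_a", model_a), ("model_b", model_b)]:
    print(name, accuracy(actual, predicted), precision(actual, predicted))
```

In this toy case both models have the same accuracy but different precision, which is why it helps to look at more than one column when comparing models.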
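How to efficiently use cross-validation in Orange? What is the effect of it on model accuracy/output?
We can split the data into k groups using the Cross validation option of Data Sampler. Then we can use a data table to hold the training set data and the remaining data. After that, we can pass the split data to the Test & Score widget with any model, such as a tree or logistic regression, and summarize the skill of each model by comparing them, as shown in the figure below. Test & Score also offers a Cross validation option of its own, which repeats the split for every fold. Because the score is averaged over k folds, it depends less on one particular split than a single train/test split does.
From the image, we can see that the comparison table gives the probability that the score of the model in the row is higher than that of the model in the column.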
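The k-fold procedure described above can be sketched in plain Python. This is a hedged illustration rather than Orange's implementation: the helper names, the toy labels, and the trivial majority-class "model" are assumptions chosen to keep the example self-contained.

```python
from collections import Counter

def k_fold_indices(n_rows, k):
    """Yield (train_idx, test_idx) pairs; each row is tested exactly once."""
    folds = [list(range(i, n_rows, k)) for i in range(k)]
    for i in range(k):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, folds[i]

def majority_class(labels):
    """Stand-in model: always predict the most common training label."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical class labels for 10 rows
labels = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]

fold_scores = []
for train_idx, test_idx in k_fold_indices(len(labels), k=5):
    guess = majority_class([labels[j] for j in train_idx])
    hits = sum(1 for j in test_idx if labels[j] == guess)
    fold_scores.append(hits / len(test_idx))   # accuracy on this fold

mean_accuracy = sum(fold_scores) / len(fold_scores)
print(fold_scores, mean_accuracy)
```

The per-fold scores vary quite a lot, while their mean is a steadier estimate; that averaging is the main effect of cross-validation on the reported accuracy.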
Question and Answers
1) Is visual programming enough for data analytics?
Ans. No. Visual programming provides an easy-to-use environment, but it doesn't give the developer full control over the program.
2) How can we get much deeper insights into data?
Ans. We can gain much deeper insights by using as much data as we can, automating the discovery of insights, and making machine learning accessible to more people.
Orange File Link