Aim: Perform the following tasks in the Orange tool.
1) How to use workflows in Orange.
2) How to do basic data exploration.
3) How to load your own data, and how to load external data from an API.
4) Learn the widgets in the Orange tool.
What is the Orange tool?
Orange is an open-source data visualization, machine learning, and data mining toolkit. It features a visual programming front end for exploratory, rapid, qualitative data analysis and interactive data visualization.
What can you do with it?
We can perform tasks ranging from basic visualizations to data manipulation, transformation, and data mining. Orange brings all the steps of the process together in a single workflow. We can select particular columns or rows from the dataset, and we can generate different visuals such as box plots, scatter plots, and distributions.
Advantages
- Orange supports effective data analysis through visualizations.
- It has a smart UI for building machine learning and prediction models.
- It is a useful tool for analysing big data.
Disadvantages
- Advanced analysis is not so easy.
- It does not raise an error even when the data is wrong.
- It sometimes slows the system down while working on data.
Comparison of Tools
Dataset Description

Here I have used a hotel booking dataset. It contains the arrival date, departure date, number of adults, children, and babies, as well as the meal and other booking details.
How to use workflows in Orange?

Here I have given a sample workflow and shown how it works. All the information is provided in the above image.
Basic Data Exploration
In the Data section, there are many options that we can apply to our dataset. We can get information about the dataset, such as the number of rows and columns and the size of the data. We can also use Data Sampler to divide the data into two portions, and we can select particular rows or columns.
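The same three exploration steps can be sketched outside the GUI in plain Python. This is only an illustration of what Data Info, Data Sampler, and Select Columns conceptually do, not Orange's own code. The rows, the meal codes, and the function names below are made up for the example.

```python
import random

# Hypothetical rows standing in for a few hotel booking records.
rows = [
    {"adults": 2, "children": 0, "babies": 0, "meal": "BB"},
    {"adults": 1, "children": 1, "babies": 0, "meal": "HB"},
    {"adults": 2, "children": 2, "babies": 1, "meal": "BB"},
    {"adults": 3, "children": 0, "babies": 0, "meal": "FB"},
    {"adults": 2, "children": 1, "babies": 0, "meal": "SC"},
]


def data_info(rows):
    """Return the number of rows and the list of column names."""
    columns = list(rows[0].keys()) if rows else []
    return len(rows), columns


def data_sampler(rows, proportion, seed=0):
    """Split rows into a random sample and the complementary remainder."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    cut = round(len(rows) * proportion)
    sample = [rows[i] for i in sorted(indices[:cut])]
    remainder = [rows[i] for i in sorted(indices[cut:])]
    return sample, remainder


def select_columns(rows, keep):
    """Keep only the named columns in every row."""
    return [{name: row[name] for name in keep} for row in rows]


n_rows, columns = data_info(rows)
sample, remainder = data_sampler(rows, proportion=0.6)
adults_and_meal = select_columns(rows, ["adults", "meal"])

print(n_rows, columns)
print(len(sample), len(remainder))
print(adults_and_meal[0])
```

Like the Data Sampler widget, the sketch returns two outputs: the sampled rows and the rows that were left out.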

Some screenshots of the data exploration steps above:
Data Info
Data Sampler
Select column
Selected column
Load Data
In the Orange tool, we can load our own dataset or use one of the available datasets.

If we want to use our own dataset, we can add the CSV File Import widget, open it, click Browse, and select the dataset we want to use, as shown in the above image.

If you don't have a dataset, you can choose one of the available datasets. Just add the Datasets widget, open it by double-clicking, and select whichever dataset you want.
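For the CSV route, it may help to see what reading a delimited file involves, since the separator is not always a comma. The sketch below uses Python's standard `csv` module to guess the separator and read the rows. It is an assumption-laden illustration, not how the CSV File Import widget is implemented; the sample text and the `read_delimited` helper are hypothetical.

```python
import csv
import io

# Hypothetical semicolon-separated content standing in for a file on disk.
raw_text = "adults;children;meal\n2;0;BB\n1;1;HB\n"


def read_delimited(text, candidates=",;\t "):
    """Guess the separator among the candidates and return it with the rows."""
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    reader = csv.DictReader(io.StringIO(text), dialect=dialect)
    return dialect.delimiter, list(reader)


delimiter, records = read_delimited(raw_text)
print(repr(delimiter))
print(records)
```

Every value comes back as a string, so numeric columns still need converting before analysis. In Orange, column types are set in the import dialog instead.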
Widgets in the Orange Tool
1) File: The File widget reads the input data file and sends the dataset to its output channel. The history of the most recently opened files is maintained in the widget.
2) CSV File Import: This widget reads comma-separated files and sends the dataset to its output channel. File separators can be commas, semicolons, spaces, tabs, or manually defined delimiters.
3) Datasets: The Datasets widget retrieves the selected dataset from the server and sends it to the output. The file is downloaded to local memory and is thus instantly available even without an internet connection.
4) Data Table: This widget receives one or more datasets in its input and presents them as a spreadsheet.
5) Data Info: A simple widget that presents information on the dataset size, features, targets, meta attributes, and location.
6) Data Sampler: The data sampler widget implements several data sampling methods. It outputs a sampled and a complementary dataset.
7) Select Columns: The select columns widget is used to manually compose your data domain. The user can decide which attributes will be used and how.
8) Box Plot: The Box Plot widget shows the distributions of attribute values. It is good practice to check any new data with this widget to quickly discover anomalies, such as duplicate values, outliers, and the like.
9) Distributions: The Distributions widget displays the value distribution of discrete or continuous attributes. If the data contains a class variable, the distribution may be conditioned on the class.
10) Scatter Plot: The Scatter Plot widget provides a 2-dimensional scatter plot visualization for continuous attributes. The data is displayed as a collection of points.
11) Line Plot: A line plot is a type of plot that displays the data as a series of points connected by straight line segments. It only works for numerical data, while categorical data can be used for grouping the data points.
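To make the Box Plot description more concrete, here is a rough sketch of the numbers behind a box plot and of one common way to flag outliers, the 1.5 × IQR rule. The rule, the quartile method, and the sample values are assumptions chosen for illustration; Orange's Box Plot widget may compute and display things differently.

```python
import statistics

# Hypothetical values, e.g. nights per booking; 20 is a planted outlier.
nights = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]


def box_plot_summary(values):
    """Return the quartiles and the points outside the 1.5 * IQR fences."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


summary = box_plot_summary(nights)
print(summary)
```

A value flagged this way is only a candidate anomaly. Whether it is a data entry error or a genuine long stay still has to be checked against the data.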
Box plot
Distribution plot
Linear Projection
Scatter plot
Select Column
Questions and Answers
1) Is this tool easy to use compared to its peers?
Ans. Yes, this tool is easy to use compared to others because it has a very simple UI. It also offers different options for plotting graphs, which we can compare.
2) Are there any other tools available that provide more features than this tool?
Ans. Yes, there are other data analytics tools that provide features like those of the Orange tool, such as KNIME, Xcos, RapidMiner, WEKA, GMDH Shell, ELKI, and many more.
Conclusion: I have learnt how to use the Orange tool for data analysis: how to import our own dataset, how to plot graphs, and how to use many other widgets such as Data Sampler, Data Info, Scatter Plot, Data Table, Select Columns, and Select Rows.
Orange File Link