
As the name suggests, once we are given real-world data, we should first understand it and then be able to draw some insights from it.
These insights can be gained only by analyzing and exploring the data, a process we call Exploratory Data Analysis, or EDA.
The analysis can be done with the basic Python libraries: NumPy, pandas, Matplotlib, and so on. A good knowledge of our data helps us get the answers we need, or develop an intuition for interpreting the results of future modeling.
There are a lot of ways to reach these goals: we can get a basic description of the data, visualize it, identify patterns in it, identify the challenges of using it, etc.
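To make "a basic description of the data" a little more concrete, here is a minimal sketch using pandas. The table is a tiny made-up example (the column names and values are assumptions for illustration only, not a real dataset), but the same calls work on any DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, made up purely for illustration.
df = pd.DataFrame({
    "age": [23, 35, 41, np.nan, 29],
    "income": [32000, 58000, 61000, 45000, 39000],
    "city": ["Pune", "Delhi", "Pune", "Chennai", "Delhi"],
})

# How many rows and columns do we have?
shape = df.shape

# Summary statistics (count, mean, std, min, quartiles, max) for numeric columns.
summary = df.describe()

# Number of missing values per column: one of the "challenges" worth spotting early.
missing = df.isna().sum()

# How often each category appears in a text column.
city_counts = df["city"].value_counts()

print(shape)
print(summary)
print(missing)
print(city_counts)
```

Even on a table this small, the output already shows one missing age and which cities appear more than once, which is the kind of first impression EDA is meant to give.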
Direct definition: Exploratory data analysis is an approach to analyzing data sets by summarizing their main characteristics, often with visualizations. The EDA process is a crucial step before building a model, because it uncovers insights that later become important in developing a robust algorithmic model.
Now let's go through the basic analysis steps we follow once we have real-world data in hand.
1. Choosing a Dataset: First, rely on open-source data to get started with ML. There are mountains of data for machine learning around, and some companies (like Google) are ready to give it away. We have many open sources such as the UCI Machine Learning Repository, Data.gov, Kaggle, etc. From these we can choose a dataset based on our interests.
2. Exploring the Dataset: This is the critical step in the EDA process: understanding the data better by using different Python libraries and checking the relationships between its variables.
3. Data Visualization: This step uses a plotting library such as Matplotlib to present our understanding in pictorial form. This helps us get more insights into the data.
4. The Final Step: This is where our ML algorithms come into play to make predictions on future, unseen data.
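Steps 2 to 4 above can be sketched in a few lines. This is only an illustration under stated assumptions: the data is a made-up table of study hours and exam scores (not from any of the sources listed), the output file name is hypothetical, and a straight-line fit with `np.polyfit` stands in for a real ML algorithm.

```python
import matplotlib
matplotlib.use("Agg")  # draw to a file instead of opening a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical toy data, made up purely for illustration.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "exam_score": [52, 55, 61, 70, 74, 83],
})

# Step 2: explore the relationship between the two columns.
corr = df["hours_studied"].corr(df["exam_score"])
print("correlation:", round(corr, 3))

# Step 3: visualize the same relationship as a scatter plot.
fig, ax = plt.subplots()
ax.scatter(df["hours_studied"], df["exam_score"])
ax.set_xlabel("hours_studied")
ax.set_ylabel("exam_score")
ax.set_title("Exam score vs. hours studied")
fig.savefig("hours_vs_score.png")  # hypothetical output file name

# Step 4 (stand-in): fit a straight line and predict for an unseen value.
slope, intercept = np.polyfit(df["hours_studied"], df["exam_score"], 1)
predicted = slope * 7 + intercept
print("predicted score for 7 hours:", round(predicted, 1))
```

The correlation and the plot tell us whether a straight line is a sensible choice at all, which is exactly why exploration and visualization come before modeling.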
Now that we have a basic understanding of the Python libraries, we are going to have some fun with real-world data in the next blog post.
Get ready to get your hands dirty. Have a great day :) Bye