

Data analysts and strategists around the globe, consciously or subliminally, follow the path of the great fictional genius Sherlock Holmes.

He single-handedly revolutionized the art of reasoning, thereby elevating it to a proper science. Indeed, some of his methods were officially adopted by Scotland Yard.

Like Mr. Holmes, data analysts are not equipped with special capabilities, only the power of induction, whereby one meticulously evolves one’s process and intuitively builds one’s power of inductive reasoning.

Every byte of data presented to a data analyst is much like a well-preserved crime scene: devoid of manipulation, untouched and raw, or at least ideally so. One does not jump to conclusions, or massage the data so it speaks the convenient truth. Each “Why” stacks one upon the other, and before long a pattern is formed, the hypothesis is proved or negated, and a final theory is presented.

EDA is “Elementary, my dear Watson” to building any kind of model.

Exploratory Data Analysis is not just viewing, but closely studying the data: for patterns, for anomalies, and to check assumptions and the hypothesis.

The entire EDA process can be divided as such:
Chapter Three : Character Development (Suspect identification and elimination)
Chapter Four : Decoding & Encoding Characters

However, before anything else, we start with a big “WHAT?”

As Data Scientists or Business Analysts, commencing by asking “WHAT IS THE BUSINESS PROBLEM?” is a very pertinent beginning. WHY ARE WE DOING THIS? WHAT ARE WE SOLVING? WHAT ARE WE PROVING WRONG? WHAT IS THE BUSINESS, and DO WE FULLY UNDERSTAND THE BUSINESS CONTEXT? By asking the right question, one is able to make an informed deduction, thereby eliminating the possibility of a wasteful and ultimately unproductive wild goose chase.

What Mr. Holmes would do next is to mindfully engage with the data presented.

The Setting: “To a great mind nothing is little.”

Leave every little prejudice and bias at the door. Upload every bit of information (the data set), channel the inner mind and activate mindfulness. Closely examine the Head, the Tail, the Shape and everything else with healthy skepticism. Observe meticulously, column by column. Ask yourself, “Are there any NaN values, missing values or special characters?” In the attic of your mind, compartmentalize observations into separate boxes in the form of categorical variables (nominal or ordinal) and numeric variables. Are there variables that do not add value to your observation and only present “noise”? Which is the most important feature or variable that can be leveraged to solve the defined business problem? How many columns (features / variables) and how many rows (observations) make up the data set?

Observing all the circumstantial details is the first step business analysts and data science professionals need to take to be able to replicate Mr. Holmes. Actively engage with the dataset: how much more does the data reveal? What are the clues provided by each variable / function? At this stage, one must only assimilate and not hasten to polish the data. Sherlock said, “Never trust general impressions, my boy, but concentrate yourself upon details.”
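
The first look described under “The Setting” (the head, the tail, the shape, missing values, special characters, and the split into categorical and numeric variables) could be sketched in pandas roughly as follows. This is only an illustration under stated assumptions: the toy data set, its column names, and the helper `first_look` are hypothetical, and treating a constant column as “noise” is just one simple proxy, not a rule from the article. In keeping with the advice to assimilate rather than polish, the sketch only reads the data and never modifies it.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data set; column names and values are illustrative only.
df = pd.DataFrame(
    {
        "customer_id": [101, 102, 103, 104, 105],
        "segment": ["retail", "retail", "corporate", "?", "corporate"],
        "monthly_spend": [250.0, np.nan, 1200.0, 310.0, 980.0],
        "country": ["UK", "UK", "UK", "UK", "UK"],
    }
)


def first_look(frame: pd.DataFrame) -> dict:
    """Collect basic observations about the data without altering it."""
    n_rows, n_cols = frame.shape  # observations, features
    numeric_cols = frame.select_dtypes(include="number").columns.tolist()
    categorical_cols = frame.select_dtypes(exclude="number").columns.tolist()

    # Count cells in non-numeric columns made up solely of punctuation-like
    # characters (for example "?"), a common stand-in for missing values.
    special_only = {
        col: int(frame[col].astype(str).str.fullmatch(r"[^\w\s]+").sum())
        for col in categorical_cols
    }

    return {
        "rows": n_rows,
        "columns": n_cols,
        "missing_per_column": frame.isna().sum().to_dict(),
        "numeric": numeric_cols,
        "categorical": categorical_cols,
        "special_only_cells": special_only,
        # Assumption: a column with a single distinct value adds no signal.
        "constant_columns": [
            col for col in frame.columns if frame[col].nunique(dropna=True) <= 1
        ],
    }


# The Head and the Tail: the first and last few observations.
print(df.head(3))
print(df.tail(3))

report = first_look(df)
for clue, finding in report.items():
    print(f"{clue}: {finding}")
```

On a real data set, the same report points to where a closer, column-by-column look is warranted; which variables are nominal or ordinal, and which one matters most for the business problem, still has to be judged by the analyst.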
