
Friday, April 15, 2016

Learning so-called Hadoop: where to start?

It is hard to know which book to read or which tutorials and courses to take. The system has by now been split into different modules, and most material on Hadoop takes you straight to HDFS, MapReduce/YARN and programming. This can be confusing for the analyst community, who are trying to learn more about analysis rather than system and infrastructure maintenance.

So where does an analyst start? In my opinion, you can look at the following building blocks and start with any of the query tools.

In the current Hadoop ecosystem, HDFS is still the major storage option. On top of it, Snappy, RCFile, Parquet and ORCFile can be used for storage optimisation. Core Hadoop MapReduce released a version 2.0, called YARN, for better performance and scalability. Spark and Tez, as solutions for real-time processing, are able to run on YARN and so work closely with Hadoop. HBase is a leading NoSQL database, especially when a NoSQL database is requested on an already deployed Hadoop cluster. Sqoop is still one of the leading and most mature tools for exchanging data between Hadoop and relational databases. Flume is a mature, distributed and reliable log-collecting tool for moving or collecting data into HDFS. Impala and Presto query the data on HDFS directly for better performance.

So if you are an analyst like me, then Hive, Pig, Impala, Presto, Sqoop and HBase can be a good flow to start taming the beast. Just like in the good ol' days, you can become an analyst first and then, depending on your interest in the infrastructure and admin side, jump into the other systems.

To start learning Hive, you first need to install it. I would recommend following this URL (this one is the best of the couple available out there)

Saturday, March 12, 2016

Customer Loyalty

What does a loyal customer mean?
  - Someone who makes repeat/regular purchases
  - Someone who purchases across product/service lines/categories
  - Someone who refers others
  - Someone who demonstrates immunity to the competition


  • R: Recency
  • F: Frequency
  • M: Monetary Value
And then Customer Lifetime Value (LTV)
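
As a rough illustration of the R, F and M measures above, here is a minimal sketch in Python. The order records, the customer IDs and the reference date are all made up for the example, and the definitions used (days since last order, number of orders, total spend) are just one common way to read Recency, Frequency and Monetary Value.

```python
from datetime import date

# Hypothetical order records: (customer_id, order_date, order_value)
orders = [
    ("C1", date(2016, 3, 1), 120.0),
    ("C1", date(2016, 2, 10), 80.0),
    ("C2", date(2015, 11, 5), 40.0),
    ("C3", date(2016, 3, 10), 300.0),
    ("C3", date(2016, 1, 20), 150.0),
    ("C3", date(2015, 12, 15), 50.0),
]


def rfm(orders, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    summary = {}
    for customer_id, order_date, value in orders:
        last, count, total = summary.get(customer_id, (None, 0, 0.0))
        # Keep the most recent order date seen so far for this customer
        if last is None or order_date > last:
            last = order_date
        summary[customer_id] = (last, count + 1, total + value)
    # Recency is measured in days between the last order and the as_of date
    return {
        cid: ((as_of - last).days, count, total)
        for cid, (last, count, total) in summary.items()
    }


rfm_table = rfm(orders, as_of=date(2016, 3, 12))
for cid, (recency, frequency, monetary) in sorted(rfm_table.items()):
    print(cid, recency, frequency, monetary)
```

A low recency with a high frequency and monetary value (C3 in this made-up data) is the kind of customer the pyramid below is trying to surface.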

Value Pyramid of Customers / Where is the opportunity to create loyal customers?
  • Know your best customers: Who places high-value orders? Who is a repeat buyer?
  • What is the expected value in your segment?
  • What is it about your service and product right now that makes the customer buy?
  • Where does the top percentage of your revenue come from? (Ticket size, geo, category, sub-category, brand)
  • What are they buying, when are they buying, how are they buying?
  • When looking at top buyers, do look into returns and other data silos.
  • Has the purchase/order value grown over time?
  • A) What are their unsolved problems? B) What are their headaches? C) What keeps them up at night?
  • Make it easy for them to try or buy your new products and services.
  • Are you sending birthday/New Year promotion cards? Do you use this opportunity to ask for feedback?
  • "What is one thing we could have done better?"
  • Seek out employee feedback. Make sure you empower employees.
  • Communicate the vision of promotion to the front line.
Create a visualisation such as this:

Total Revenue     80% of Revenue
XXXX              0.8R

CUST ID     REVENUE     80% REVENUE
8           R           0.8R
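
One way to fill in a table like this is to rank customers by revenue and keep adding them until the running total reaches 80% of total revenue. The sketch below is only my reading of the visualisation; the revenue figures and customer IDs are invented, and the function name is hypothetical.

```python
# Hypothetical revenue per customer ID
revenue_by_customer = {
    1: 500.0, 2: 50.0, 3: 200.0, 4: 30.0,
    5: 150.0, 6: 20.0, 7: 40.0, 8: 10.0,
}


def top_revenue_customers(revenue_by_customer, share=0.8):
    """Return (total, target, rows) where rows are the biggest customers
    needed to reach `share` of total revenue, as (id, revenue, running_total)."""
    total = sum(revenue_by_customer.values())
    target = share * total
    ranked = sorted(revenue_by_customer.items(), key=lambda kv: kv[1], reverse=True)
    rows, running = [], 0.0
    for customer_id, revenue in ranked:
        if running >= target:
            break  # the customers collected so far already cover the target
        running += revenue
        rows.append((customer_id, revenue, running))
    return total, target, rows


total, target, rows = top_revenue_customers(revenue_by_customer)
print("Total revenue:", total, "| 80% of revenue:", target)
for customer_id, revenue, running in rows:
    print(customer_id, revenue, running)
```

With this made-up data, three of the eight customers are enough to cover 80% of revenue, which is exactly the kind of concentration the questions above are meant to expose.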

Friday, March 11, 2016

Strategic Planning: Steps

These are notes from my scribe. They won't make sense to most readers who have reached here randomly. If they do, well and good!

Steps involved in strategic planning

Communicate and prepare
       - Announce process
       - Identify resources -> set out the work they do

Meeting #1
      - Explain process
      - Work on the content
                 - SWOT
                 - Mission
                 - Vision
                 - Principles
                 - Goals
                 - Strategic filters

Homework Assignment
       Come back with the initiatives they are going to work on

Meeting #2
     - Why everyone rated the initiatives the way they did as per the strategic filters
     - Initial prioritisation list
     - Initial Owners assigned
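
To make the "rate the initiatives as per the strategic filters" step concrete, here is a small sketch of a weighted scoring sheet. My notes do not record the actual filters, weights or rating scale, so the filter names, the weights and the 1-5 ratings below are all hypothetical.

```python
# Hypothetical strategic filters and their weights (higher = more important)
filters = {"fits mission": 3, "revenue potential": 2, "ease of execution": 1}

# Hypothetical 1-5 ratings agreed in the meeting, per initiative and filter
ratings = {
    "Initiative A": {"fits mission": 5, "revenue potential": 2, "ease of execution": 4},
    "Initiative B": {"fits mission": 3, "revenue potential": 5, "ease of execution": 2},
    "Initiative C": {"fits mission": 2, "revenue potential": 3, "ease of execution": 5},
}


def prioritise(ratings, filters):
    """Return initiatives sorted by weighted score, highest first."""
    scores = {
        name: sum(weight * rating[f] for f, weight in filters.items())
        for name, rating in ratings.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


priority_list = prioritise(ratings, filters)
for name, score in priority_list:
    print(name, score)
```

The sorted output is only the initial prioritisation list; the discussion of why people rated things the way they did is the real point of the meeting.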

Homework Assignment 2
      Analyse high priority
            - Market Validation
            - Financial Analysis
            - Execution Considerations

Meeting #3 - Resource Planning
      - Validate priorities
      - Identify resources
      - Allocate resources

Resource Matrix: Initiative| Cost| Resource

Tools for analysis
- Five Forces

Good matrix:


Why do analytics, data science, big data and digital transformation initiatives fail?

An excerpt from the book The Rebel by Osho explains nicely why most analytics, data science, big data and digital transformation initiatives fail. It is simply because data alone can't change the organization; it is the culture of the organization that needs to change, and no, not after the new is built.

Old and New: Such is human mind

"I have heard about an old church: it was so ancient that people had stopped going there because even strong wind and the church would start swaying. It was so fragile, any moment it could fall. Even the priest had started giving his sermons outside the church, far away in the open ground.

Finally, the board of trustees had a meeting; something had to be done. But the trouble was that the church was very ancient - it was the glory of the town; their town was famous far and wide because of the old church: perhaps it was the oldest church in the world. It was not possible to demolish it and to make a new one. But it was also dangerous to let it remain as it was - it was going to kill someone. Nobody had been going in for years; even the priest was not courageous enough to go in because who knew at what moment the church would simply collapse? So something had to be done.

The board was in a very great dilemma: something had to be done, and nothing should be done because that church is so ancient, and man has been in such deep attachment with things that are ancient. So they passed a resolution with four clauses in it. The first was: "We will make a new church, but it will be exactly the same as the old. It will be made of the same material the old is made of - nothing new will be used in it, so it remains ancient. It will be made in the same place where the old church stands because that place has become holy by ancientness."

The last thing in their resolution was, "we will not demolish the old church until the new is ready." They were all happy that they had come to a conclusion. But who was going to ask those idiots, "how are you going to do it?" The old should not be demolished till the new was ready. And the new had to be made of everything the old was made of, in the same place where the old was standing, with exactly same architecture the old had. Nothing new could be added to it: the same doors, the same windows, the same glass, the same bricks - everything that needed to be used had to be of the old church.

And finally, they decided that the old should not be touched till the new was ready. "When the new is ready, then we can demolish the old."

Such is the human mind: it clings to old, it also wants the new, and then it tries to find some compromise - that at least the new should be like the old. But a few things are impossible, nature just won't allow them."