
Tuesday, June 20, 2017

INFOGRAPHIC: A BEGINNER’S GUIDE TO MACHINE LEARNING ALGORITHMS



We hear the term “machine learning” a lot these days (usually in the context of predictive analysis and artificial intelligence), but machine learning has actually been a field of its own for several decades. Only recently have we been able to really take advantage of machine learning on a broad scale thanks to modern advancements in computing power. But how does machine learning actually work? The answer is simple: algorithms.   
Machine learning is a type of artificial intelligence (AI) in which computers essentially learn concepts on their own without being explicitly programmed. These are computer programmes that alter their “thinking” (or output) once exposed to new data. For machine learning to take place, algorithms are needed: fed into the computer, they give it rules to follow when dissecting data.
Machine learning algorithms are often used in predictive analysis. In business, predictive analysis can be used to tell the business what is most likely to happen in the future. For example, with predictive algorithms, an online T-shirt retailer can use present-day data to predict how many T-shirts they will sell next month.  

REGRESSION OR CLASSIFICATION

While machine learning algorithms can be used for other purposes, we are going to focus on prediction in this guide. Prediction is a process where output variables can be estimated based on input variables. For example, if we input characteristics of a certain house, we can predict the sale price.
Prediction problems are divided into two main categories:
  • Regression Problems: The variable we are trying to predict is numerical (e.g., the price of a house)
  • Classification Problems: The variable we are trying to predict is a “Yes/No” answer (e.g., whether a certain piece of equipment will experience a mechanical failure)
Now that we’ve covered what machine learning can do in terms of predictions, we can discuss the algorithms themselves, which come in three groups: linear models, tree-based models, and neural networks.

WHAT ARE LINEAR MODEL ALGORITHMS

A linear model uses a simple formula to find a “best fit” line through a set of data points. The variable you want to predict (for example, how long it will take to bake a cake) is expressed as an equation of the variables you know (for example, the ingredients). To make a prediction, we simply plug in the variables we know: to find how long the cake will take to bake, we input the ingredients.
For example, to bake our cake, the analysis gives us this equation: t = 0.5x + 0.25y, where t = the time it takes to bake the cake (in hours), x = the weight of the cake batter (in kg), and y = 1 for a chocolate cake and 0 for a non-chocolate cake. So if we have 1 kg of cake batter and want a chocolate cake, we input our numbers: t = 0.5(1) + 0.25(1) = 0.75 hours, or 45 minutes.
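Expressed in code, this toy model is just a one-line function (a direct transcription of the equation above, with time in hours):

```python
def bake_time(batter_kg: float, is_chocolate: bool) -> float:
    # Toy linear model from the text: t = 0.5x + 0.25y, with t in hours
    return 0.5 * batter_kg + 0.25 * (1.0 if is_chocolate else 0.0)

print(bake_time(1.0, True))  # 0.75 hours, i.e. 45 minutes
```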
There are different forms of linear model algorithms, and we’re going to discuss linear regression and logistic regression.

LINEAR REGRESSION

Linear regression, also known as “least squares regression,” is the most basic form of linear model and the simplest choice for regression problems (where the variable we are trying to predict is numerical).
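As a rough illustration, fitting such a line with scikit-learn’s LinearRegression might look like the sketch below; the house sizes and prices are invented for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Invented data: house size in square metres vs. sale price
X = np.array([[50], [70], [90], [110], [130]])
y = np.array([150_000, 200_000, 260_000, 310_000, 370_000])

model = LinearRegression().fit(X, y)   # finds the least-squares "best fit" line
print(model.predict([[100]]))          # predicted sale price for a 100 m^2 house
```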

LOGISTIC REGRESSION

Logistic regression is simply the adaptation of linear regression to classification problems (where the variable we are trying to predict is a “Yes/No” answer). Logistic regression works well for classification because its S-shaped curve squashes any input into a value between 0 and 1, which can be read as the probability of a “Yes.”
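A minimal sketch with scikit-learn’s LogisticRegression, using invented data for the equipment-failure example; predict_proba exposes the probability behind the “Yes/No” answer.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Invented data: machine operating hours vs. whether it failed (1 = failure)
X = np.array([[100], [500], [1000], [1500], [2000], [2500]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)
print(clf.predict([[1200]]))        # predicted class: 0 ("No") or 1 ("Yes")
print(clf.predict_proba([[1200]]))  # probability of each class
```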

DRAWBACKS OF LINEAR REGRESSION AND LOGISTIC REGRESSION

Both linear regression and logistic regression share the same drawbacks. Both have a tendency to “overfit,” which means the model adapts too exactly to the training data at the expense of the ability to generalise to previously unseen data. Because of that, both models are often “regularised,” meaning they carry certain penalties that discourage overfitting. Another drawback of linear models is that, since they are so simple, they tend to have trouble predicting more complex behaviours.
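In scikit-learn, for instance, the regularised variants of linear regression are exposed as Ridge (an L2 penalty) and Lasso (an L1 penalty); a minimal sketch, reusing the invented house data from above:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Invented house data as above; alpha controls the penalty strength
X = np.array([[50], [70], [90], [110], [130]])
y = np.array([150_000, 200_000, 260_000, 310_000, 370_000])

ridge = Ridge(alpha=1.0).fit(X, y)  # L2 penalty: shrinks coefficients toward zero
lasso = Lasso(alpha=0.1).fit(X, y)  # L1 penalty: can zero out weak coefficients
print(ridge.coef_, lasso.coef_)
```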

WHAT ARE TREE-BASED MODELS

Tree-based models help explore a data set and visualise decision rules for prediction. When you hear about tree-based models, picture decision trees, i.e. a sequence of branching operations. Tree-based models are highly accurate, stable, and easy to interpret, and unlike linear models, they can capture non-linear relationships.

DECISION TREE

A decision tree is a graph that uses the branching method to show each possible outcome of a decision. For example, if you want to order a salad that includes lettuce, toppings, and dressing, a decision tree can map all the possible outcomes (or varieties of salads you could end up with).
To create (or train) a decision tree, we take the training data and find which attribute best splits it with respect to the target.
For example, a decision tree can be used in credit card fraud detection. Suppose we find that the attribute that best predicts the risk of fraud is the purchase amount (for example, that someone has made a very large purchase with the card). This becomes the first split (or branching): cards with unusually high purchases versus those without. Then we use the next best attribute (for example, how often the card is used) to create the next split, and continue until we have enough attributes to satisfy our needs.
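A minimal sketch of such a tree with scikit-learn; the purchase amounts, daily usage counts, and fraud labels are all invented for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Invented data: [purchase amount, uses per day]; 1 = fraud, 0 = legitimate
X = np.array([[2500, 1], [30, 5], [4000, 2], [15, 8], [3500, 1], [60, 6]])
y = np.array([1, 0, 1, 0, 1, 0])

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
# Print the learned splits, i.e. the branching rules of the tree
print(export_text(tree, feature_names=["amount", "daily_uses"]))
```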

RANDOM FOREST

A random forest is the average of many decision trees, each of which is trained with a random sample of the data. Each single tree in the forest is weaker than a full decision tree, but by putting them all together, we get better overall performance thanks to diversity.
Random forest is a very popular algorithm in machine learning today. It is very easy to train (or create), and it tends to perform well. Its downside is that it can be slow to output predictions relative to other algorithms, so you might not use it when you need lightning-fast predictions.
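A sketch on the same invented fraud data; each of the forest’s trees sees a random bootstrap sample, and the forest predicts by majority vote.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Invented fraud data as in the decision-tree sketch above
X = np.array([[2500, 1], [30, 5], [4000, 2], [15, 8], [3500, 1], [60, 6]])
y = np.array([1, 0, 1, 0, 1, 0])

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(forest.predict([[2800, 2]]))  # majority vote over 100 randomised trees
```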

GRADIENT BOOSTING

Gradient boosting, like random forest, is also made from “weak” decision trees. The big difference is that in gradient boosting, the trees are trained one after another. Each subsequent tree is trained primarily on the data that previous trees predicted incorrectly. This allows gradient boosting to spend less effort on easy-to-predict cases and more on difficult ones.
Gradient boosting is also pretty fast to train and performs very well. However, small changes in the training data set can create radical changes in the model, so it may not produce the most explainable results.
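A corresponding sketch with scikit-learn’s GradientBoostingClassifier on the same invented data; here the trees are built sequentially, each one correcting its predecessors.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Invented fraud data as above
X = np.array([[2500, 1], [30, 5], [4000, 2], [15, 8], [3500, 1], [60, 6]])
y = np.array([1, 0, 1, 0, 1, 0])

# Trees are trained one after another; learning_rate scales each tree's correction
boost = GradientBoostingClassifier(n_estimators=50, learning_rate=0.1).fit(X, y)
print(boost.predict([[2800, 2]]))
```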

WHAT ARE NEURAL NETWORKS

Neural networks in biology are interconnected neurons that exchange messages with each other. This idea has been adapted to machine learning in the form of artificial neural networks (ANNs). Deep learning, a term that pops up often, simply refers to several layers of artificial neural networks stacked one after the other.
ANNs are a family of models loosely inspired by the workings of the human brain. No other algorithms handle extremely complex tasks, such as image recognition, as well as neural networks do. However, just like the human brain, the model takes a very long time to train and requires a lot of power (just think about how much we eat to keep our brains working).
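A minimal sketch of a small neural network with scikit-learn’s MLPClassifier, again on the invented fraud data (real image-recognition networks are far larger and typically built with dedicated deep learning libraries).

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Invented fraud data as above
X = np.array([[2500, 1], [30, 5], [4000, 2], [15, 8], [3500, 1], [60, 6]])
y = np.array([1, 0, 1, 0, 1, 0])

# Two hidden layers of "neurons"; stacking more layers is what the
# text calls deep learning. The scaler matters because neural
# networks are sensitive to the scale of their inputs.
nn = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(8, 8), max_iter=2000, random_state=0),
).fit(X, y)
print(nn.predict([[2800, 2]]))
```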
Dataiku - Top Prediction Algorithms

Saturday, June 17, 2017

Companies That Learn Together, Earn Together




You and your customers may work in different sectors of an industry, operate within different organizational structures, or be located in different geographies, but you have one very important thing in common: you both need to learn in order to adapt and succeed. So why not learn alongside your customers and strengthen your relationships in the process? That’s the idea behind a trend that can best be described as customer-centric co-learning, where organizations and their customers come together to learn side by side in an executive education–type setting.
General Electric is already famous for the leadership development programs for company managers that it offers at its leadership center in Crotonville, New York. It is also one of the best examples of a company that has embraced—and benefitted from—a customer-centric co-learning model. Over the past few years, I’ve had the privilege of teaching at more than half a dozen of GE’s co-learning events around the globe, and I can tell you firsthand that the inherent value of co-learning with customers is immense. Not unlike the benefits of co-creation, which show that developing beta products with active users works much better than traditional iterative improvement techniques, co-learning offers the opportunity to work collaboratively with key influencers to strengthen business relationships in ways that can’t necessarily be achieved otherwise.
In GE’s case, here’s how it works:
The company identifies its top customers and partners and invites them to send a small cohort of executives to participate in the learning experience. Most events I’ve been involved with are capped at about 120 participants (10 senior executives from 12 customer companies), which is a small enough group to remain intimate, while also offering diverse perspectives and viewpoints.
The course, which typically runs for one week, is designed to focus on one area or theme that is particularly relevant for the specific audience. The company assembles a team of presenters—educators and thought leaders who specialize in that area—while tapping GE representatives to moderate and facilitate group work. During the course, teams learn how to apply relevant concepts within their own organizations. At face value, this setup may very well look like any other executive education event—but it’s not.
Here’s why it works. Customer-centric co-learning enables organizations to do the following:
  • Get closer to their customers. The opportunity to spend a week, or even a few days, interacting with customers outside of the normal course of business doesn’t come around often. But when it does, it enables both parties to learn things about the other they may not otherwise know—things that may not be directly connected to their business dealings at the moment, but that lead to a deeper understanding and enable companies to better serve their customers in the short and long term. It is also worth noting that greater trust and collegiality are by-products of this sort of transparency.
  • Break free from sales tensions. These engagements are about learning and relationship building—not making another sale (although that may happen as a longer-term result). Sales conversations are prohibited during these programs, unless initiated by the customer, freeing participants from their everyday vendor–customer mode and encouraging customers to share freely without needing to fend off sales pitches.
  • Encourage outside-in thinking. When customers and partners from various industries come together, everyone benefits from different viewpoints. Too often leaders are limited by their own perspectives, but a diverse group of peers can help one another see the bigger picture and adopt a more macro lens that can help them on a strategic level.
  • Demonstrate their investment in customer success. Including customers in such a high-caliber learning event clearly demonstrates that an organization values its partnerships. It sends the message that the organization is vested, and invested, in its customers’ success. It’s not unusual for companies to commit substantial resources to make this all happen, and while there is no guaranteed hard return on investment, it pays dividends in building long-term business relationships.
For all of these reasons, I’m convinced that companies that learn together, earn together. While GE is not alone in its co-learning endeavors, most organizations have yet to recognize its value, let alone take the plunge. But those willing to try will reap the rewards.

Tuesday, November 10, 2015

Cost Curve, Experience Curve and Learning Curve



These are strategic concepts which many of us learned about in a college microeconomics 101 course. They may be a bit theoretical, but nevertheless often provide helpful insights into how an industry functions.

The industry cost curve concept is a classic framework helpful to understand pricing. It fundamentally states that producers of commodity products are willing to supply products as long as the price is higher than their cost of production. The graph of an industry’s cost curve charts capacity on the horizontal axis and unit costs on the vertical axis, incrementally listing various competitors in order of increasing costs. The theory states that the competitive market price is determined by overall market demand and the unit costs of the next available supplier.
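That marginal-supplier logic is compact enough to sketch in code; the supplier names, capacities, and unit costs below are invented for illustration.

```python
# Stack suppliers in order of increasing unit cost and read off the
# unit cost of the supplier needed to serve the last unit of demand.
suppliers = [("A", 100, 20.0), ("B", 80, 25.0), ("C", 120, 32.0), ("D", 60, 40.0)]
demand = 250  # total market demand, in units of capacity

def market_price(suppliers, demand):
    cumulative = 0
    for name, capacity, unit_cost in sorted(suppliers, key=lambda s: s[2]):
        cumulative += capacity
        if cumulative >= demand:
            return unit_cost  # the marginal supplier sets the price
    return None  # demand exceeds total industry capacity

print(market_price(suppliers, demand))  # 32.0: supplier C is the marginal producer
```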



In reality, a number of questions make this a lot more complicated than it seems at first blush. Key questions would include: Are the products really interchangeable and do various user segments attach a different perceived value to different suppliers’ products? Are there significant entry and exit barriers? Do the locations of suppliers vs. users matter? What about non-commodity products? Some of these questions can be included in a more complex analysis, using linear programming. But in the end, a qualitative element will always remain part of the analysis. Nevertheless, the industry cost curve provides an interesting starting point for many pricing related questions.



Related to the cost curve is the experience curve. The terms “experience curve” and “learning curve” are used mostly interchangeably (although the former more often refers to organizations becoming better at doing things, while the latter generally applies more to individuals). The fundamental rule states that the more you do something, the less time it will take the next time around. Or in other terms: every time the cumulative quantity of an item produced doubles, the cost decreases at a predictable rate (as little as a few percentage points, or as much as 30% or more). Experience curves have been observed empirically in a variety of industries, and you can find formulas to express the relationship (Wikipedia has good sections on both cost curves and experience curves).
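The usual power-law form of that doubling rule is easy to express in code; the 80% learning rate below (a 20% cost drop per doubling of cumulative volume) is just an example value.

```python
import math

def unit_cost(n, first_unit_cost, learning_rate=0.80):
    # Experience-curve power law: each doubling of cumulative volume n
    # multiplies unit cost by learning_rate (0.80 = a 20% drop)
    b = math.log(learning_rate, 2)  # exponent; negative for rates below 1
    return first_unit_cost * n ** b

print(unit_cost(1, 100.0))  # 100.0 - cost of the first unit
print(unit_cost(2, 100.0))  # ~80.0 - one doubling: 20% cheaper
print(unit_cost(4, 100.0))  # ~64.0 - two doublings
```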



What drives the experience curve effects? A variety of factors play a role:
– Labor efficiency
– Standardization and specialization
– Technology and automation
– Better use of equipment
– Changes in the resource mix
– Product redesign



The link back to the cost curve is that you would typically expect the largest producers to have the lowest unit costs (i.e., they appear on the left of the cost curve). Strategically, it can be a major advantage for a large competitor to take full advantage of experience curve effects, generating above-average profits, gaining market share, or both. Of course, we all know that cost leadership is not the only way to achieve a strong, sustainable position.

[Chart: long-run total costs]