Pattern Languages are languages built from entities called patterns which, when combined, form solutions to complex problems. Each pattern describes a problem and offers solutions. Pattern languages are a way of expressing complex solutions derived from experience so that others can gain a better understanding of them.
Pattern Languages were originally promoted by Christopher Alexander to describe the architecture of buildings and towns. These ideas were later adopted by Object Oriented Programming (OOP) practitioners to describe the design of OOP programs, where they were named Design Patterns. They were extended further into other domains such as SOA (http://www.manageability.org/blog/stuff/pattern-language-interoperability/view) and High Scalability (http://www.manageability.org/blog/stuff/patterns-for-infinite-scalability/view).
In the domain of Machine Learning (ML) there is an emerging practice called “Deep Learning”. In ML one encounters many new terms, such as Artificial Neural Networks, Random Forests, Support Vector Machines and Non-negative Matrix Factorization. These usually refer to a specific kind of algorithm. Deep Learning (DL), however, is not really one kind of algorithm; rather, it is a whole class of algorithms that tend to exhibit similar ‘patterns’. DL systems are Artificial Neural Networks (ANN) that are constructed with multiple layers (sometimes called Multi-Layer Perceptrons). The idea is not entirely new, since it was first proposed back in the 1960s. However, interest in the domain has exploded with the help of advancing hardware technology (e.g. GPUs). Since 2011, DL systems have been exhibiting impressive results in the field.
The confusion with DL arises when one realizes that there are actually many implementations and that it is not just a single kind of algorithm. There are the conventional Feedforward Networks (aka Fully Connected Networks), Convolutional Networks (ConvNet), Recurrent Neural Networks (RNN) and the less used Restricted Boltzmann Machines (RBM). They all share a common trait in that these networks are constructed using a hierarchy of layers. One common pattern, for example, is the employment of differentiable layers. This constraint on the construction of DL systems leads to an incremental way to evolve the network into something that learns classification: because every layer is differentiable, the error at the output can be traced back through the hierarchy and each layer adjusted in small steps. Many such patterns have been discovered recently, and it would be very useful for practitioners to have a compilation of them at their disposal. In the next few weeks we will be sharing more details of this Pattern Language.
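To make the differentiable-layers pattern mentioned above a little more concrete, here is a minimal sketch in Python with NumPy. It is only an illustration under stated assumptions, not a reference implementation: the class names (Dense, Tanh, Sigmoid), the function train_xor, the XOR toy data, the layer sizes and the learning rate are all hypothetical choices made for this example. Each layer exposes a forward pass and a backward pass, and that shared contract is what lets the layers be stacked into a hierarchy and nudged incrementally towards a classifier.

```python
import numpy as np


class Dense:
    """Fully connected layer: y = x @ W + b (hypothetical helper)."""

    def __init__(self, n_in, n_out, rng):
        self.W = rng.normal(0.0, 0.5, size=(n_in, n_out))
        self.b = np.zeros(n_out)

    def forward(self, x):
        self.x = x  # cache the input for the backward pass
        return x @ self.W + self.b

    def backward(self, grad_out, lr):
        # Gradient w.r.t. the input, computed before the weights change.
        grad_in = grad_out @ self.W.T
        # Plain gradient-descent update of this layer's parameters.
        self.W -= lr * (self.x.T @ grad_out)
        self.b -= lr * grad_out.sum(axis=0)
        return grad_in


class Tanh:
    """Element-wise tanh; derivative is 1 - tanh(x)^2."""

    def forward(self, x):
        self.y = np.tanh(x)
        return self.y

    def backward(self, grad_out, lr):
        return grad_out * (1.0 - self.y ** 2)


class Sigmoid:
    """Element-wise logistic function; derivative is y * (1 - y)."""

    def forward(self, x):
        self.y = 1.0 / (1.0 + np.exp(-x))
        return self.y

    def backward(self, grad_out, lr):
        return grad_out * self.y * (1.0 - self.y)


def train_xor(steps=2000, lr=0.5, seed=0):
    """Train a small stack of differentiable layers on XOR (toy data)."""
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)
    layers = [Dense(2, 8, rng), Tanh(), Dense(8, 1, rng), Sigmoid()]
    losses = []
    for _ in range(steps):
        out = X
        for layer in layers:            # forward through the hierarchy
            out = layer.forward(out)
        losses.append(float(np.mean((out - y) ** 2)))
        grad = 2.0 * (out - y) / len(X)  # d(mean squared error)/d(out)
        for layer in reversed(layers):   # backward, one layer at a time
            grad = layer.backward(grad, lr)
    return layers, losses


if __name__ == "__main__":
    _, losses = train_xor()
    print("first loss:", losses[0], "last loss:", losses[-1])
```

Note that the training loop never needs to know what kind of layer it is driving. Any component that can report how its output changes with respect to its input can be dropped into the list, which is one reason this constraint shows up across otherwise different DL architectures.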
Pattern languages are an ideal vehicle for describing and understanding Deep Learning. One would like to believe that Deep Learning has a solid foundation based on advanced mathematics. Most academic research papers will conjure up high-falutin math such as path integrals, tensors, Hilbert spaces, measure theory etc., but don’t let the math distract you from the reality that our understanding is minimal. Mathematics, you see, has its inherent limitations. Physical scientists have known this for centuries. We formulate theories in such a way that the structures are mathematically convenient. The Gaussian distribution, for example, is prevalent not because it’s some magical construct that reality has gifted to us. It is prevalent because it is mathematically convenient.
Pattern languages have been leveraged in many fuzzy domains. The original pattern language revolved around the discussion of architecture (i.e. buildings and towns). There are pattern languages that focus on user interfaces, on usability, on interaction design and on software process. None of these has concise mathematical underpinnings, yet we do extract real value from them. In fact, the specification of a pattern language is not too far off from the creation of a new algebra in mathematics. Algebras are strictly consistent, but they are purely abstract and need not have any connection with reality. Pattern languages, by contrast, are connected with reality, but their consistency rules are more relaxed. In our attempt to understand the complex world of machine learning (or learning in general) we cannot always leapfrog into mathematics. The reality may be that our current mathematics is woefully incapable of describing what is happening.
Visit www.deeplearningpatterns.com for ongoing updates.