1000000500010@unknown@formal@none@1@S@⌊δArtificial neural networkδ⌋@@@@1@3@@oe@26-8-2013 1000000500020@unknown@formal@none@1@S@An ⌊∗artificial neural network (ANN)∗⌋, often just called a "neural network" (NN), is a ⌊>mathematical model>⌋ or ⌊>computational model>⌋ based on ⌊>biological neural networks>⌋.@@@@1@24@@oe@26-8-2013 1000000500030@unknown@formal@none@1@S@It consists of an interconnected group of ⌊>artificial neuron>⌋s and processes information using a ⌊>connectionist>⌋ approach to ⌊>computation>⌋.@@@@1@18@@oe@26-8-2013 1000000500040@unknown@formal@none@1@S@In most cases an ANN is an ⌊>adaptive system>⌋ that changes its structure based on external or internal information that flows through the network during the learning phase.@@@@1@28@@oe@26-8-2013 1000000500050@unknown@formal@none@1@S@In more practical terms neural networks are ⌊>non-linear>⌋ ⌊>statistical>⌋ ⌊>data modeling>⌋ tools.@@@@1@12@@oe@26-8-2013 1000000500060@unknown@formal@none@1@S@They can be used to model complex relationships between inputs and outputs or to ⌊>find patterns>⌋ in data.@@@@1@18@@oe@26-8-2013 1000000500070@unknown@formal@none@1@S@⌊=Background¦2=⌋@@@@1@1@@oe@26-8-2013 1000000500080@unknown@formal@none@1@S@There is no precise agreed-upon definition among researchers as to what a ⌊>neural network>⌋ is, but most would agree that it involves a network of simple processing elements (⌊>neurons>⌋), which can exhibit complex global behavior, determined by the connections between the processing elements and element parameters.@@@@1@46@@oe@26-8-2013 1000000500090@unknown@formal@none@1@S@The original inspiration for the technique was from examination of the ⌊>central nervous system>⌋ and the neurons (and their ⌊>axons>⌋, ⌊>dendrites>⌋ and ⌊>synapses>⌋) which constitute one of its most significant information processing elements (see ⌊>Neuroscience>⌋).@@@@1@35@@oe@26-8-2013 1000000500100@unknown@formal@none@1@S@In a neural network model, simple ⌊>nodes>⌋ (called variously "neurons", "neurodes", "PEs" ("processing elements") or "units") are connected together to form a network of nodes — hence the term "neural network."@@@@1@31@@oe@26-8-2013 1000000500110@unknown@formal@none@1@S@While a neural network does not have to be adaptive per se, its practical use comes with algorithms designed to alter the strength (weights) of the connections in the network to produce a desired signal flow.@@@@1@36@@oe@26-8-2013 1000000500120@unknown@formal@none@1@S@These networks are also similar to the ⌊>biological neural networks>⌋ in the sense that functions are performed collectively and in parallel by the units, rather than there being a clear delineation of subtasks to which various units are assigned (see also ⌊>connectionism>⌋).@@@@1@42@@oe@26-8-2013 1000000500130@unknown@formal@none@1@S@Currently, the term Artificial Neural Network (ANN) tends to refer mostly to neural network models employed in ⌊>statistics>⌋, ⌊>cognitive psychology>⌋ and ⌊>artificial intelligence>⌋.@@@@1@23@@oe@26-8-2013 1000000500140@unknown@formal@none@1@S@⌊>Neural network>⌋ models designed with emulation of the ⌊>central nervous system>⌋ (CNS) in mind are a subject of ⌊>theoretical neuroscience>⌋ (⌊>computational neuroscience>⌋).@@@@1@22@@oe@26-8-2013 1000000500150@unknown@formal@none@1@S@In modern ⌊>software implementations>⌋ of artificial neural networks the approach inspired by biology has more or less been abandoned for a more practical approach based on statistics and signal processing.@@@@1@30@@oe@26-8-2013 1000000500160@unknown@formal@none@1@S@In some of these systems neural networks, or parts of neural networks (such as ⌊>artificial neuron>⌋s) are used as components in larger systems that combine both adaptive and non-adaptive elements.@@@@1@30@@oe@26-8-2013 1000000500170@unknown@formal@none@1@S@While the more general approach of such ⌊>adaptive systems>⌋ is more suitable for real-world problem solving, it has far less to do with the traditional artificial intelligence connectionist models.@@@@1@29@@oe@26-8-2013 1000000500180@unknown@formal@none@1@S@What they do, however, have in common is the principle of non-linear, distributed, parallel and local processing and adaptation.@@@@1@19@@oe@26-8-2013 1000000500190@unknown@formal@none@1@S@⌊=Models¦3=⌋@@@@1@1@@oe@26-8-2013 1000000500200@unknown@formal@none@1@S@Neural network models in artificial intelligence are usually referred to as artificial neural networks (ANNs); these are essentially simple mathematical models defining a function ⌊× f : X \\rightarrow Y ×⌋.@@@@1@31@@oe@26-8-2013 1000000500210@unknown@formal@none@1@S@Each type of ANN model corresponds to a ⌊/class/⌋ of such functions.@@@@1@12@@oe@26-8-2013 1000000500220@unknown@formal@none@1@S@⌊=The ⌊/network/⌋ in ⌊/artificial neural network/⌋¦4=⌋@@@@1@6@@oe@26-8-2013 1000000500230@unknown@formal@none@1@S@The word ⌊/network/⌋ in the term 'artificial neural network' arises because the function ⌊×f(x)×⌋ is defined as a composition of other functions ⌊×g_i(x)×⌋, which can further be defined as a composition of other functions.@@@@1@34@@oe@26-8-2013 1000000500240@unknown@formal@none@1@S@This can be conveniently represented as a network structure, with arrows depicting the dependencies between variables.@@@@1@16@@oe@26-8-2013 1000000500250@unknown@formal@none@1@S@A widely used type of composition is the ⌊/nonlinear weighted sum/⌋, where ⌊×f (x) = K \\left(\\sum_i w_i g_i(x)\\right) ×⌋, where ⌊×K×⌋ is some predefined function, such as the ⌊>hyperbolic tangent>⌋.@@@@1@31@@oe@26-8-2013 1000000500260@unknown@formal@none@1@S@It will be convenient for the following to refer to a collection of functions ⌊×g_i×⌋ as simply a vector ⌊×g = (g_1, g_2, \\ldots, g_n)×⌋.@@@@1@25@@oe@26-8-2013 1000000500270@unknown@formal@none@1@S@This figure depicts such a decomposition of ⌊×f×⌋, with dependencies between variables indicated by arrows.@@@@1@15@@oe@26-8-2013 1000000500280@unknown@formal@none@1@S@These can be interpreted in two ways.@@@@1@7@@oe@26-8-2013 1000000500290@unknown@formal@none@1@S@The first view is the functional view: the input ⌊×x×⌋ is transformed into a 3-dimensional vector ⌊×h×⌋, which is then transformed into a 2-dimensional vector ⌊×g×⌋, which is finally transformed into ⌊×f×⌋.@@@@1@32@@oe@26-8-2013 1000000500300@unknown@formal@none@1@S@This view is most commonly encountered in the context of ⌊>optimization>⌋.@@@@1@11@@oe@26-8-2013 1000000500310@unknown@formal@none@1@S@The second view is the probabilistic view: the ⌊>random variable>⌋ ⌊×F = f(G) ×⌋ depends upon the random variable ⌊×G = g(H)×⌋, which depends upon ⌊×H=h(X)×⌋, which depends upon the random variable ⌊×X×⌋.@@@@1@33@@oe@26-8-2013 1000000500320@unknown@formal@none@1@S@This view is most commonly encountered in the context of ⌊>graphical models>⌋.@@@@1@12@@oe@26-8-2013 1000000500330@unknown@formal@none@1@S@The two views are largely equivalent.@@@@1@6@@oe@26-8-2013 1000000500340@unknown@formal@none@1@S@In either case, for this particular network architecture, the components of individual layers are independent of each other (e.g., the components of ⌊×g×⌋ are independent of each other given their input ⌊×h×⌋).@@@@1@32@@oe@26-8-2013 1000000500350@unknown@formal@none@1@S@This naturally enables a degree of parallelism in the implementation.@@@@1@10@@oe@26-8-2013 1000000500360@unknown@formal@none@1@S@Networks such as the previous one are commonly called ⌊>feedforward>⌋, because their graph is a ⌊>directed acyclic graph>⌋.@@@@1@18@@oe@26-8-2013 1000000500370@unknown@formal@none@1@S@Networks with ⌊>cycles>⌋ are commonly called ⌊>recurrent>⌋.@@@@1@7@@oe@26-8-2013 1000000500380@unknown@formal@none@1@S@Such networks are commonly depicted in the manner shown at the top of the figure, where ⌊×f×⌋ is shown as being dependent upon itself.@@@@1@24@@oe@26-8-2013 1000000500390@unknown@formal@none@1@S@However, there is an implied temporal dependence which is not shown.@@@@1@11@@oe@26-8-2013 1000000500400@unknown@formal@none@1@S@What this actually means in practice is that the value of ⌊×f×⌋ at some point in time ⌊×t×⌋ depends upon the values of ⌊×f×⌋ at zero or at one or more other points in time.@@@@1@35@@oe@26-8-2013 1000000500410@unknown@formal@none@1@S@The graphical model at the bottom of the figure illustrates the case: the value of ⌊×f×⌋ at time ⌊×t×⌋ only depends upon its last value.@@@@1@25@@oe@26-8-2013 1000000500420@unknown@formal@none@1@S@⌊=Learning¦3=⌋@@@@1@1@@oe@26-8-2013 1000000500430@unknown@formal@none@1@S@However interesting such functions may be in themselves, what has attracted the most interest in neural networks is the possibility of ⌊/learning/⌋, which in practice means the following:@@@@1@28@@oe@26-8-2013 1000000500440@unknown@formal@none@1@S@Given a specific ⌊/task/⌋ to solve, and a ⌊/class/⌋ of functions ⌊×F×⌋, learning means using a set of ⌊/observations/⌋, in order to find ⌊×f^* \\in F×⌋ which solves the task in an ⌊/optimal sense/⌋.@@@@1@34@@oe@26-8-2013 1000000500450@unknown@formal@none@1@S@This entails defining a ⌊>cost function>⌋ ⌊×C : F \\rightarrow \\mathbb{R}×⌋ such that, for the optimal solution ⌊×f^*×⌋, ⌊×C(f^*) \\leq C(f)×⌋ ⌊×\\forall f \\in F×⌋ (no solution has a cost less than the cost of the optimal solution).@@@@1@38@@oe@26-8-2013 1000000500460@unknown@formal@none@1@S@The ⌊>cost function>⌋ ⌊×C×⌋ is an important concept in learning, as it is a measure of how far away we are from an optimal solution to the problem that we want to solve.@@@@1@33@@oe@26-8-2013 1000000500470@unknown@formal@none@1@S@Learning algorithms search through the solution space in order to find a function that has the smallest possible cost.@@@@1@19@@oe@26-8-2013 1000000500480@unknown@formal@none@1@S@For applications where the solution is dependent on some data, the cost must necessarily be a ⌊/function of the observations/⌋, otherwise we would not be modelling anything related to the data.@@@@1@31@@oe@26-8-2013 1000000500490@unknown@formal@none@1@S@It is frequently defined as a ⌊>statistic>⌋ to which only approximations can be made.@@@@1@14@@oe@26-8-2013 1000000500500@unknown@formal@none@1@S@As a simple example consider the problem of finding the model ⌊×f×⌋ which minimizes ⌊×C=E\\left[(f(x) - y)^2\\right]×⌋, for data pairs ⌊×(x,y)×⌋ drawn from some distribution ⌊×\\mathcal{D}×⌋.@@@@1@26@@oe@26-8-2013 1000000500510@unknown@formal@none@1@S@In practical situations we would only have ⌊×N×⌋ samples from ⌊×\\mathcal{D}×⌋ and thus, for the above example, we would only minimize ⌊×\\hat{C}=\\frac{1}{N}\\sum_{i=1}^N (f(x_i)-y_i)^2×⌋.@@@@1@23@@oe@26-8-2013 1000000500520@unknown@formal@none@1@S@Thus, the cost is minimized over a sample of the data rather than the true data distribution.@@@@1@17@@oe@26-8-2013 1000000500530@unknown@formal@none@1@S@When ⌊×N \\rightarrow \\infty×⌋ some form of online learning must be used, where the cost is partially minimized as each new example is seen.@@@@1@24@@oe@26-8-2013 1000000500540@unknown@formal@none@1@S@While online learning is often used when ⌊×\\mathcal{D}×⌋ is fixed, it is most useful in the case where the distribution changes slowly over time.@@@@1@24@@oe@26-8-2013 1000000500550@unknown@formal@none@1@S@In neural network methods, some form of online learning is frequently also used for finite datasets.@@@@1@16@@oe@26-8-2013 1000000500560@unknown@formal@none@1@S@⌊=Choosing a cost function¦4=⌋@@@@1@4@@oe@26-8-2013 1000000500570@unknown@formal@none@1@S@While it is possible to arbitrarily define some ⌊>ad hoc>⌋ cost function, frequently a particular cost will be used either because it has desirable properties (such as convexity) or because it arises naturally from a particular formulation of the problem (i.e., In a probabilistic formulation the posterior probability of the model can be used as an inverse cost).@@@@1@58@@oe@26-8-2013 1000000500580@unknown@formal@none@1@S@⌊∗Ultimately, the cost function will depend on the task we wish to perform∗⌋.@@@@1@13@@oe@26-8-2013 1000000500590@unknown@formal@none@1@S@The three main categories of learning tasks are overviewed below.@@@@1@10@@oe@26-8-2013 1000000500600@unknown@formal@none@1@S@⌊=Learning paradigms¦3=⌋@@@@1@2@@oe@26-8-2013 1000000500610@unknown@formal@none@1@S@There are three major learning paradigms, each corresponding to a particular abstract learning task.@@@@1@14@@oe@26-8-2013 1000000500620@unknown@formal@none@1@S@These are ⌊>supervised learning>⌋, ⌊>unsupervised learning>⌋ and ⌊>reinforcement learning>⌋.@@@@1@9@@oe@26-8-2013 1000000500630@unknown@formal@none@1@S@Usually any given type of network architecture can be employed in any of those tasks.@@@@1@15@@oe@26-8-2013 1000000500640@unknown@formal@none@1@S@⌊=Supervised learning¦4=⌋@@@@1@2@@oe@26-8-2013 1000000500650@unknown@formal@none@1@S@In ⌊>supervised learning>⌋, we are given a set of example pairs ⌊× (x, y), x \\in X, y \\in Y×⌋ and the aim is to find a function ⌊× f : X \\rightarrow Y ×⌋ in the allowed class of functions that matches the examples.@@@@1@45@@oe@26-8-2013 1000000500660@unknown@formal@none@1@S@In other words, we wish to ⌊/infer/⌋ the mapping implied by the data; the cost function is related to the mismatch between our mapping and the data and it implicitly contains prior knowledge about the problem domain.@@@@1@37@@oe@26-8-2013 1000000500670@unknown@formal@none@1@S@A commonly used cost is the ⌊>mean-squared error>⌋ which tries to minimize the average error between the network's output, f(x), and the target value y over all the example pairs.@@@@1@30@@oe@26-8-2013 1000000500680@unknown@formal@none@1@S@When one tries to minimise this cost using ⌊>gradient descent>⌋ for the class of neural networks called ⌊>Multi-Layer Perceptrons>⌋, one obtains the common and well-known ⌊>backpropagation algorithm>⌋ for training neural networks.@@@@1@31@@oe@26-8-2013 1000000500690@unknown@formal@none@1@S@Tasks that fall within the paradigm of supervised learning are ⌊>pattern recognition>⌋ (also known as classification) and ⌊>regression>⌋ (also known as function approximation).@@@@1@23@@oe@26-8-2013 1000000500700@unknown@formal@none@1@S@The supervised learning paradigm is also applicable to sequential data (e.g., for speech and gesture recognition).@@@@1@16@@oe@26-8-2013 1000000500710@unknown@formal@none@1@S@This can be thought of as learning with a "teacher," in the form of a function that provides continuous feedback on the quality of solutions obtained thus far.@@@@1@28@@oe@26-8-2013 1000000500720@unknown@formal@none@1@S@⌊=Unsupervised learning¦4=⌋@@@@1@2@@oe@26-8-2013 1000000500730@unknown@formal@none@1@S@In ⌊>unsupervised learning>⌋ we are given some data ⌊×x×⌋, and the cost function to be minimized can be any function of the data ⌊×x×⌋ and the network's output, ⌊×f×⌋.@@@@1@29@@oe@26-8-2013 1000000500740@unknown@formal@none@1@S@The cost function is dependent on the task (what we are trying to model) and our ⌊/a priori/⌋ assumptions (the implicit properties of our model, its parameters and the observed variables).@@@@1@31@@oe@26-8-2013 1000000500750@unknown@formal@none@1@S@As a trivial example, consider the model ⌊×f(x) = a×⌋, where ⌊×a×⌋ is a constant and the cost ⌊×C=E[(x - f(x))^2]×⌋.@@@@1@21@@oe@26-8-2013 1000000500760@unknown@formal@none@1@S@Minimizing this cost will give us a value of ⌊×a×⌋ that is equal to the mean of the data.@@@@1@19@@oe@26-8-2013 1000000500770@unknown@formal@none@1@S@The cost function can be much more complicated.@@@@1@8@@oe@26-8-2013 1000000500780@unknown@formal@none@1@S@Its form depends on the application: For example in compression it could be related to the ⌊>mutual information>⌋ between x and y.@@@@1@22@@oe@26-8-2013 1000000500790@unknown@formal@none@1@S@In statistical modelling, it could be related to the ⌊>posterior probability>⌋ of the model given the data.@@@@1@17@@oe@26-8-2013 1000000500800@unknown@formal@none@1@S@(Note that in both of those examples those quantities would be maximized rather than minimised).@@@@1@15@@oe@26-8-2013 1000000500810@unknown@formal@none@1@S@Tasks that fall within the paradigm of unsupervised learning are in general ⌊>estimation>⌋ problems; the applications include ⌊>clustering>⌋, the estimation of ⌊>statistical distributions>⌋, ⌊>compression>⌋ and ⌊>filtering>⌋.@@@@1@26@@oe@26-8-2013 1000000500820@unknown@formal@none@1@S@⌊=Reinforcement learning¦4=⌋@@@@1@2@@oe@26-8-2013 1000000500830@unknown@formal@none@1@S@In ⌊>reinforcement learning>⌋, data ⌊×x×⌋ is usually not given, but generated by an agent's interactions with the environment.@@@@1@18@@oe@26-8-2013 1000000500840@unknown@formal@none@1@S@At each point in time ⌊×t×⌋, the agent performs an action ⌊×y_t×⌋ and the environment generates an observation ⌊×x_t×⌋ and an instantaneous cost ⌊×c_t×⌋, according to some (usually unknown) dynamics.@@@@1@30@@oe@26-8-2013 1000000500850@unknown@formal@none@1@S@The aim is to discover a ⌊/policy/⌋ for selecting actions that minimizes some measure of a long-term cost, i.e. the expected cumulative cost.@@@@1@23@@oe@26-8-2013 1000000500860@unknown@formal@none@1@S@The environment's dynamics and the long-term cost for each policy are usually unknown, but can be estimated.@@@@1@17@@oe@26-8-2013 1000000500870@unknown@formal@none@1@S@More formally, the environment is modeled as a ⌊>Markov decision process>⌋ (MDP) with states ⌊×{s_1,...,s_n}\\in ×⌋S and actions ⌊×{a_1,...,a_m} \\in A×⌋ with the following probability distributions: the instantaneous cost distribution ⌊×P(c_t|s_t)×⌋, the observation distribution ⌊×P(x_t|s_t)×⌋ and the transition ⌊×P(s_{t+1}|s_t, a_t)×⌋, while a policy is defined as conditional distribution over actions given the observations.@@@@1@53@@oe@26-8-2013 1000000500880@unknown@formal@none@1@S@Taken together, the two define a ⌊>Markov chain>⌋ (MC).@@@@1@9@@oe@26-8-2013 1000000500890@unknown@formal@none@1@S@The aim is to discover the policy that minimizes the cost, i.e. the MC for which the cost is minimal.@@@@1@20@@oe@26-8-2013 1000000500900@unknown@formal@none@1@S@ANNs are frequently used in reinforcement learning as part of the overall algorithm.@@@@1@13@@oe@26-8-2013 1000000500910@unknown@formal@none@1@S@Tasks that fall within the paradigm of reinforcement learning are control problems, ⌊>games>⌋ and other ⌊>sequential decision making>⌋ tasks.@@@@1@19@@oe@26-8-2013 1000000500920@unknown@formal@none@1@S@See also: ⌊>dynamic programming>⌋, ⌊>stochastic control>⌋@@@@1@6@@oe@26-8-2013 1000000500930@unknown@formal@none@1@S@⌊=Learning algorithms¦3=⌋@@@@1@2@@oe@26-8-2013 1000000500940@unknown@formal@none@1@S@Training a neural network model essentially means selecting one model from the set of allowed models (or, in a ⌊>Bayesian>⌋ framework, determining a distribution over the set of allowed models) that minimises the cost criterion.@@@@1@35@@oe@26-8-2013 1000000500950@unknown@formal@none@1@S@There are numerous algorithms available for training neural network models; most of them can be viewed as a straightforward application of ⌊>optimization>⌋ theory and ⌊>statistical estimation>⌋.@@@@1@26@@oe@26-8-2013 1000000500960@unknown@formal@none@1@S@Most of the algorithms used in training artificial neural networks are employing some form of ⌊>gradient descent>⌋.@@@@1@17@@oe@26-8-2013 1000000500970@unknown@formal@none@1@S@This is done by simply taking the derivative of the cost function with respect to the network parameters and then changing those parameters in a ⌊>gradient-related>⌋ direction.@@@@1@27@@oe@26-8-2013 1000000500980@unknown@formal@none@1@S@⌊>Evolutionary methods>⌋, ⌊>simulated annealing>⌋, and ⌊>expectation-maximization>⌋ and ⌊>non-parametric methods>⌋ are among other commonly used methods for training neural networks.@@@@1@19@@oe@26-8-2013 1000000500990@unknown@formal@none@1@S@See also ⌊>machine learning>⌋.@@@@1@4@@oe@26-8-2013 1000000501000@unknown@formal@none@1@S@Temporal perceptual learning rely on finding temporal relationships in sensory signal streams.@@@@1@12@@oe@26-8-2013 1000000501010@unknown@formal@none@1@S@In an environment, statistically salient temporal correlations can be found by monitoring the arrival times of sensory signals.@@@@1@18@@oe@26-8-2013 1000000501020@unknown@formal@none@1@S@This is done by the ⌊>perceptual network>⌋.@@@@1@7@@oe@26-8-2013 1000000501030@unknown@formal@none@1@S@⌊=Employing artificial neural networks¦2=⌋@@@@1@4@@oe@26-8-2013 1000000501040@unknown@formal@none@1@S@Perhaps the greatest advantage of ANNs is their ability to be used as an arbitrary function approximation mechanism which 'learns' from observed data.@@@@1@23@@oe@26-8-2013 1000000501050@unknown@formal@none@1@S@However, using them is not so straightforward and a relatively good understanding of the underlying theory is essential.@@@@1@18@@oe@26-8-2013 1000000501060@unknown@formal@none@1@S@⌊•⌊#Choice of model: This will depend on the data representation and the application.@@@@1@13@@oe@26-8-2013 1000000501070@unknown@formal@none@1@S@Overly complex models tend to lead to problems with learning.#⌋@@@@1@10@@oe@26-8-2013 1000000501080@unknown@formal@none@1@S@⌊#Learning algorithm: There are numerous tradeoffs between learning algorithms.@@@@1@9@@oe@26-8-2013 1000000501090@unknown@formal@none@1@S@Almost any algorithm will work well with the ⌊/correct ⌊>hyperparameter>⌋s/⌋ for training on a particular fixed dataset.@@@@1@17@@oe@26-8-2013 1000000501100@unknown@formal@none@1@S@However selecting and tuning an algorithm for training on unseen data requires a significant amount of experimentation.#⌋@@@@1@17@@oe@26-8-2013 1000000501110@unknown@formal@none@1@S@⌊#Robustness: If the model, cost function and learning algorithm are selected appropriately the resulting ANN can be extremely robust.#⌋•⌋@@@@1@19@@oe@26-8-2013 1000000501120@unknown@formal@none@1@S@With the correct implementation ANNs can be used naturally in ⌊>online learning>⌋ and large dataset applications.@@@@1@16@@oe@26-8-2013 1000000501130@unknown@formal@none@1@S@Their simple implementation and the existence of mostly local dependencies exhibited in the structure allows for fast, parallel implementations in hardware.@@@@1@21@@oe@26-8-2013 1000000501140@unknown@formal@none@1@S@⌊=Applications¦2=⌋@@@@1@1@@oe@26-8-2013 1000000501150@unknown@formal@none@1@S@The utility of artificial neural network models lies in the fact that they can be used to infer a function from observations.@@@@1@22@@oe@26-8-2013 1000000501160@unknown@formal@none@1@S@This is particularly useful in applications where the complexity of the data or task makes the design of such a function by hand impractical.@@@@1@24@@oe@26-8-2013 1000000501170@unknown@formal@none@1@S@⌊=Real life applications¦3=⌋@@@@1@3@@oe@26-8-2013 1000000501180@unknown@formal@none@1@S@The tasks to which artificial neural networks are applied tend to fall within the following broad categories:@@@@1@17@@oe@26-8-2013 1000000501190@unknown@formal@none@1@S@⌊•⌊#⌊>Function approximation>⌋, or ⌊>regression analysis>⌋, including ⌊>time series prediction>⌋ and modeling.#⌋@@@@1@11@@oe@26-8-2013 1000000501200@unknown@formal@none@1@S@⌊#⌊>Classification>⌋, including ⌊>pattern>⌋ and sequence recognition, ⌊>novelty detection>⌋ and sequential decision making.#⌋@@@@1@12@@oe@26-8-2013 1000000501210@unknown@formal@none@1@S@⌊#⌊>Data processing>⌋, including filtering, clustering, blind source separation and compression.#⌋•⌋@@@@1@10@@oe@26-8-2013 1000000501220@unknown@formal@none@1@S@Application areas include system identification and control (vehicle control, process control), game-playing and decision making (backgammon, chess, racing), pattern recognition (radar systems, face identification, object recognition and more), sequence recognition (gesture, speech, handwritten text recognition), medical diagnosis, financial applications (automated trading systems), ⌊>data mining>⌋ (or knowledge discovery in databases, "KDD"), visualization and ⌊>e-mail spam>⌋ filtering.@@@@1@55@@oe@26-8-2013 1000000501230@unknown@formal@none@1@S@⌊=Neural network software¦2=⌋@@@@1@3@@oe@26-8-2013 1000000501240@unknown@formal@none@1@S@⌊∗Neural network software∗⌋ is used to ⌊>simulate>⌋, ⌊>research>⌋, ⌊>develop>⌋ and apply artificial neural networks, ⌊>biological neural network>⌋s and in some cases a wider array of ⌊>adaptive system>⌋s.@@@@1@27@@oe@26-8-2013 1000000501250@unknown@formal@none@1@S@See also ⌊>logistic regression>⌋.@@@@1@4@@oe@26-8-2013 1000000501260@unknown@formal@none@1@S@⌊=Types of neural networks¦2=⌋@@@@1@4@@oe@26-8-2013 1000000501270@unknown@formal@none@1@S@⌊=Feedforward neural network¦3=⌋@@@@1@3@@oe@26-8-2013 1000000501280@unknown@formal@none@1@S@The feedforward neural network was the first and arguably simplest type of artificial neural network devised.@@@@1@16@@oe@26-8-2013 1000000501290@unknown@formal@none@1@S@In this network, the information moves in only one direction, forward, from the input nodes, through the hidden nodes (if any) and to the output nodes.@@@@1@26@@oe@26-8-2013 1000000501300@unknown@formal@none@1@S@There are no cycles or loops in the network.@@@@1@9@@oe@26-8-2013 1000000501310@unknown@formal@none@1@S@⌊=Radial basis function (RBF) network¦3=⌋@@@@1@5@@oe@26-8-2013 1000000501320@unknown@formal@none@1@S@Radial Basis Functions are powerful techniques for interpolation in multidimensional space.@@@@1@11@@oe@26-8-2013 1000000501330@unknown@formal@none@1@S@A RBF is a function which has built into a distance criterion with respect to a centre.@@@@1@17@@oe@26-8-2013 1000000501340@unknown@formal@none@1@S@Radial basis functions have been applied in the area of neural networks where they may be used as a replacement for the sigmoidal hidden layer transfer characteristic in Multi-Layer Perceptrons.@@@@1@30@@oe@26-8-2013 1000000501350@unknown@formal@none@1@S@RBF networks have two layers of processing: In the first, input is mapped onto each RBF in the 'hidden' layer.@@@@1@20@@oe@26-8-2013 1000000501360@unknown@formal@none@1@S@The RBF chosen is usually a Gaussian.@@@@1@7@@oe@26-8-2013 1000000501370@unknown@formal@none@1@S@In regression problems the output layer is then a linear combination of hidden layer values representing mean predicted output.@@@@1@19@@oe@26-8-2013 1000000501380@unknown@formal@none@1@S@The interpretation of this output layer value is the same as a regression model in statistics.@@@@1@16@@oe@26-8-2013 1000000501390@unknown@formal@none@1@S@In classification problems the output layer is typically a ⌊>sigmoid function>⌋ of a linear combination of hidden layer values, representing a posterior probability.@@@@1@23@@oe@26-8-2013 1000000501400@unknown@formal@none@1@S@Performance in both cases is often improved by shrinkage techniques, known as ⌊>ridge regression>⌋ in classical statistics and known to correspond to a prior belief in small parameter values (and therefore smooth output functions) in a Bayesian framework.@@@@1@38@@oe@26-8-2013 1000000501410@unknown@formal@none@1@S@RBF networks have the advantage of not suffering from local minima in the same way as Multi-Layer Perceptrons.@@@@1@18@@oe@26-8-2013 1000000501420@unknown@formal@none@1@S@This is because the only parameters that are adjusted in the learning process are the linear mapping from hidden layer to output layer.@@@@1@23@@oe@26-8-2013 1000000501430@unknown@formal@none@1@S@Linearity ensures that the error surface is quadratic and therefore has a single easily found minimum.@@@@1@16@@oe@26-8-2013 1000000501440@unknown@formal@none@1@S@In regression problems this can be found in one matrix operation.@@@@1@11@@oe@26-8-2013 1000000501450@unknown@formal@none@1@S@In classification problems the fixed non-linearity introduced by the sigmoid output function is most efficiently dealt with using ⌊>iteratively re-weighted least squares>⌋.@@@@1@22@@oe@26-8-2013 1000000501460@unknown@formal@none@1@S@RBF networks have the disadvantage of requiring good coverage of the input space by radial basis functions.@@@@1@17@@oe@26-8-2013 1000000501470@unknown@formal@none@1@S@RBF centres are determined with reference to the distribution of the input data, but without reference to the prediction task.@@@@1@20@@oe@26-8-2013 1000000501480@unknown@formal@none@1@S@As a result, representational resources may be wasted on areas of the input space that are irrelevant to the learning task.@@@@1@21@@oe@26-8-2013 1000000501490@unknown@formal@none@1@S@A common solution is to associate each data point with its own centre, although this can make the linear system to be solved in the final layer rather large, and requires shrinkage techniques to avoid overfitting.@@@@1@36@@oe@26-8-2013 1000000501500@unknown@formal@none@1@S@Associating each input datum with an RBF leads naturally to kernel methods such as ⌊>Support Vector Machine>⌋s and Gaussian Processes (the RBF is the kernel function).@@@@1@26@@oe@26-8-2013 1000000501510@unknown@formal@none@1@S@All three approaches use a non-linear kernel function to project the input data into a space where the learning problem can be solved using a linear model.@@@@1@27@@oe@26-8-2013 1000000501520@unknown@formal@none@1@S@Like Gaussian Processes, and unlike SVMs, RBF networks are typically trained in a Maximum Likelihood framework by maximizing the probability (minimizing the error) of the data under the model.@@@@1@29@@oe@26-8-2013 1000000501530@unknown@formal@none@1@S@SVMs take a different approach to avoiding overfitting by maximizing instead a margin.@@@@1@13@@oe@26-8-2013 1000000501540@unknown@formal@none@1@S@RBF networks are outperformed in most classification applications by SVMs.@@@@1@10@@oe@26-8-2013 1000000501550@unknown@formal@none@1@S@In regression applications they can be competitive when the dimensionality of the input space is relatively small.@@@@1@17@@oe@26-8-2013 1000000501560@unknown@formal@none@1@S@⌊=Kohonen self-organizings network¦3=⌋@@@@1@3@@oe@26-8-2013 1000000501570@unknown@formal@none@1@S@The self-organizing map (SOM) invented by ⌊>Teuvo Kohonen>⌋ uses a form of ⌊>unsupervised learning>⌋.@@@@1@14@@oe@26-8-2013 1000000501580@unknown@formal@none@1@S@A set of artificial neurons learn to map points in an input space to coordinates in an output space.@@@@1@19@@oe@26-8-2013 1000000501590@unknown@formal@none@1@S@The input space can have different dimensions and topology from the output space, and the SOM will attempt to preserve these.@@@@1@21@@oe@26-8-2013 1000000501600@unknown@formal@none@1@S@⌊=Recurrent network¦3=⌋@@@@1@2@@oe@26-8-2013 1000000501610@unknown@formal@none@1@S@Contrary to feedforward networks, ⌊>recurrent neural network>⌋s (RNs) are models with bi-directional data flow.@@@@1@14@@oe@26-8-2013 1000000501620@unknown@formal@none@1@S@While a feedforward network propagates data linearly from input to output, RNs also propagate data from later processing stages to earlier stages.@@@@1@22@@oe@26-8-2013 1000000501630@unknown@formal@none@1@S@⌊=Simple recurrent network¦4=⌋@@@@1@3@@oe@26-8-2013 1000000501640@unknown@formal@none@1@S@A ⌊/simple recurrent network/⌋ (SRN) is a variation on the Multi-Layer Perceptron, sometimes called an "Elman network" due to its invention by ⌊>Jeff Elman>⌋.@@@@1@24@@oe@26-8-2013 1000000501650@unknown@formal@none@1@S@A three-layer network is used, with the addition of a set of "context units" in the input layer.@@@@1@18@@oe@26-8-2013 1000000501660@unknown@formal@none@1@S@There are connections from the middle (hidden) layer to these context units fixed with a weight of one.@@@@1@18@@oe@26-8-2013 1000000501670@unknown@formal@none@1@S@At each time step, the input is propagated in a standard feed-forward fashion, and then a learning rule (usually back-propagation) is applied.@@@@1@22@@oe@26-8-2013 1000000501680@unknown@formal@none@1@S@The fixed back connections result in the context units always maintaining a copy of the previous values of the hidden units (since they propagate over the connections before the learning rule is applied).@@@@1@33@@oe@26-8-2013 1000000501690@unknown@formal@none@1@S@Thus the network can maintain a sort of state, allowing it to perform such tasks as sequence-prediction that are beyond the power of a standard Multi-Layer Perceptron.@@@@1@27@@oe@26-8-2013 1000000501700@unknown@formal@none@1@S@In a ⌊/fully recurrent network/⌋, every neuron receives inputs from every other neuron in the network.@@@@1@16@@oe@26-8-2013 1000000501710@unknown@formal@none@1@S@These networks are not arranged in layers.@@@@1@7@@oe@26-8-2013 1000000501720@unknown@formal@none@1@S@Usually only a subset of the neurons receive external inputs in addition to the inputs from all the other neurons, and another disjunct subset of neurons report their output externally as well as sending it to all the neurons.@@@@1@39@@oe@26-8-2013 1000000501730@unknown@formal@none@1@S@These distinctive inputs and outputs perform the function of the input and output layers of a feed-forward or simple recurrent network, and also join all the other neurons in the recurrent processing.@@@@1@32@@oe@26-8-2013 1000000501740@unknown@formal@none@1@S@⌊=Hopfield network¦4=⌋@@@@1@2@@oe@26-8-2013 1000000501750@unknown@formal@none@1@S@The ⌊>Hopfield network>⌋ is a recurrent neural network in which all connections are symmetric.@@@@1@14@@oe@26-8-2013 1000000501760@unknown@formal@none@1@S@Invented by ⌊>John Hopfield>⌋ in 1982, this network guarantees that its dynamics will converge.@@@@1@14@@oe@26-8-2013 1000000501770@unknown@formal@none@1@S@If the connections are trained using ⌊>Hebbian learning>⌋ then the Hopfield network can perform as robust content-addressable (or ⌊>associative>⌋) memory, resistant to connection alteration.@@@@1@24@@oe@26-8-2013 1000000501780@unknown@formal@none@1@S@⌊=Echo state network¦4=⌋@@@@1@3@@oe@26-8-2013 1000000501790@unknown@formal@none@1@S@The ⌊>echo state network>⌋ (ESN) is a ⌊>recurrent neural network>⌋ with a sparsely connected random hidden layer.@@@@1@17@@oe@26-8-2013 1000000501800@unknown@formal@none@1@S@The weights of output neurons are the only part of the network that can change and be learned.@@@@1@18@@oe@26-8-2013 1000000501810@unknown@formal@none@1@S@ESN are good to (re)produce temporal patterns.@@@@1@7@@oe@26-8-2013 1000000501820@unknown@formal@none@1@S@⌊=Long short term memory network¦4=⌋@@@@1@5@@oe@26-8-2013 1000000501830@unknown@formal@none@1@S@The ⌊>Long short term memory>⌋ is an artificial neural net structure that unlike traditional RNNs doesn't have the problem of vanishing gradients.@@@@1@22@@oe@26-8-2013 1000000501840@unknown@formal@none@1@S@It can therefore use long delays and can handle signals that have a mix of low and high frequency components.@@@@1@20@@oe@26-8-2013 1000000501850@unknown@formal@none@1@S@⌊=Stochastic neural networks¦3=⌋@@@@1@3@@oe@26-8-2013 1000000501860@unknown@formal@none@1@S@A ⌊>stochastic neural network>⌋ differs from a typical neural network because it introduces random variations into the network.@@@@1@18@@oe@26-8-2013 1000000501870@unknown@formal@none@1@S@In a probabilistic view of neural networks, such random variations can be viewed as a form of ⌊>statistical sampling>⌋, such as ⌊>Monte Carlo sampling>⌋.@@@@1@24@@oe@26-8-2013 1000000501880@unknown@formal@none@1@S@⌊=Boltzmann machine¦4=⌋@@@@1@2@@oe@26-8-2013 1000000501890@unknown@formal@none@1@S@The ⌊>Boltzmann machine>⌋ can be thought of as a noisy Hopfield network.@@@@1@12@@oe@26-8-2013 1000000501900@unknown@formal@none@1@S@Invented by ⌊>Geoff Hinton>⌋ and ⌊>Terry Sejnowski>⌋ in 1985, the Boltzmann machine is important because it is one of the first neural networks to demonstrate learning of latent variables (hidden units).@@@@1@31@@oe@26-8-2013 1000000501910@unknown@formal@none@1@S@Boltzmann machine learning was at first slow to simulate, but the ⌊>contrastive divergence algorithm>⌋ of Geoff Hinton (circa 2000) allows models such as Boltzmann machines and ⌊/products of experts/⌋ to be trained much faster.@@@@1@34@@oe@26-8-2013 1000000501920@unknown@formal@none@1@S@⌊=Modular neural networks¦3=⌋@@@@1@3@@oe@26-8-2013 1000000501930@unknown@formal@none@1@S@Biological studies showed that the human brain functions not as a single massive network, but as a collection of small networks.@@@@1@21@@oe@26-8-2013 1000000501940@unknown@formal@none@1@S@This realisation gave birth to the concept of ⌊>modular neural networks>⌋, in which several small networks cooperate or compete to solve problems.@@@@1@22@@oe@26-8-2013 1000000501950@unknown@formal@none@1@S@⌊=Committee of machines¦4=⌋@@@@1@3@@oe@26-8-2013 1000000501960@unknown@formal@none@1@S@A ⌊>committee of machines>⌋ (CoM) is a collection of different neural networks that together "vote" on a given example.@@@@1@19@@oe@26-8-2013 1000000501970@unknown@formal@none@1@S@This generally gives a much better result compared to other neural network models.@@@@1@13@@oe@26-8-2013 1000000501980@unknown@formal@none@1@S@In fact in many cases, starting with the same architecture and training but using different initial random weights gives vastly different networks.@@@@1@22@@oe@26-8-2013 1000000501990@unknown@formal@none@1@S@A CoM tends to stabilize the result.@@@@1@7@@oe@26-8-2013 1000000502000@unknown@formal@none@1@S@The CoM is similar to the general ⌊>machine learning>⌋ ⌊/⌊>bagging>⌋/⌋ method, except that the necessary variety of machines in the committee is obtained by training from different random starting weights rather than training on different randomly selected subsets of the training data.@@@@1@42@@oe@26-8-2013 1000000502010@unknown@formal@none@1@S@⌊=Associative neural network (ASNN)¦4=⌋@@@@1@4@@oe@26-8-2013 1000000502020@unknown@formal@none@1@S@The ASNN is an extension of the ⌊/committee of machines/⌋ that goes beyond a simple/weighted average of different models.@@@@1@19@@oe@26-8-2013 1000000502030@unknown@formal@none@1@S@⌊>ASNN>⌋ represents a combination of an ensemble of feed-forward neural networks and the k-nearest neighbor technique (⌊>kNN>⌋).@@@@1@17@@oe@26-8-2013 1000000502040@unknown@formal@none@1@S@It uses the correlation between ensemble responses as a measure of ⌊∗distance∗⌋ amid the analyzed cases for the kNN.@@@@1@19@@oe@26-8-2013 1000000502050@unknown@formal@none@1@S@This corrects the bias of the neural network ensemble.@@@@1@9@@oe@26-8-2013 1000000502060@unknown@formal@none@1@S@An associative neural network has a memory that can coincide with the training set.@@@@1@14@@oe@26-8-2013 1000000502070@unknown@formal@none@1@S@If new data becomes available, the network instantly improves its predictive ability and provides data approximation (self-learn the data) without a need to retrain the ensemble.@@@@1@26@@oe@26-8-2013 1000000502080@unknown@formal@none@1@S@Another important feature of ASNN is the possibility to interpret neural network results by analysis of correlations between data cases in the space of models.@@@@1@25@@oe@26-8-2013 1000000502090@unknown@formal@none@1@S@The method is demonstrated at ⌊> www.vcclab.org>⌋, where you can either use it online or download it.@@@@1@17@@oe@26-8-2013 1000000502100@unknown@formal@none@1@S@⌊=Other types of networks¦3=⌋@@@@1@4@@oe@26-8-2013 1000000502110@unknown@formal@none@1@S@These special networks do not fit in any of the previous categories.@@@@1@12@@oe@26-8-2013 1000000502120@unknown@formal@none@1@S@⌊=Holographic associative memory¦4=⌋@@@@1@3@@oe@26-8-2013 1000000502130@unknown@formal@none@1@S@⌊>⌊/Holographic associative memory/⌋>⌋ represents a family of analog, correlation-based, associative, stimulus-response memories, where information is mapped onto the phase orientation of complex numbers operating.@@@@1@24@@oe@26-8-2013 1000000502140@unknown@formal@none@1@S@⌊=Instantaneously trained networks¦4=⌋@@@@1@3@@oe@26-8-2013 1000000502150@unknown@formal@none@1@S@⌊/⌊>Instantaneously trained neural networks>⌋/⌋ (ITNNs) were inspired by the phenomenon of short-term learning that seems to occur instantaneously.@@@@1@18@@oe@26-8-2013 1000000502160@unknown@formal@none@1@S@In these networks the weights of the hidden and the output layers are mapped directly from the training vector data.@@@@1@20@@oe@26-8-2013 1000000502170@unknown@formal@none@1@S@Ordinarily, they work on binary data, but versions for continuous data that require small additional processing are also available.@@@@1@19@@oe@26-8-2013 1000000502180@unknown@formal@none@1@S@⌊=Spiking neural networks¦4=⌋@@@@1@3@@oe@26-8-2013 1000000502190@unknown@formal@none@1@S@⌊>Spiking neural network>⌋s (SNNs) are models which explicitly take into account the timing of inputs.@@@@1@15@@oe@26-8-2013 1000000502200@unknown@formal@none@1@S@The network input and output are usually represented as series of spikes (delta function or more complex shapes).@@@@1@18@@oe@26-8-2013 1000000502210@unknown@formal@none@1@S@SNNs have an advantage of being able to process information in the ⌊>time domain>⌋ (signals that vary over time).@@@@1@19@@oe@26-8-2013 1000000502220@unknown@formal@none@1@S@They are often implemented as recurrent networks.@@@@1@7@@oe@26-8-2013 1000000502230@unknown@formal@none@1@S@SNNs are also a form of ⌊>pulse computer>⌋.@@@@1@8@@oe@26-8-2013 1000000502240@unknown@formal@none@1@S@Networks of spiking neurons — and the temporal correlations of neural assemblies in such networks — have been used to model figure/ground separation and region linking in the visual system (see e.g. Reitboeck et.al.in Haken and Stadler: Synergetics of the Brain.@@@@1@41@@oe@26-8-2013 1000000502250@unknown@formal@none@1@S@Berlin, 1989).@@@@1@2@@oe@26-8-2013 1000000502260@unknown@formal@none@1@S@Gerstner and Kistler have a freely available online textbook on ⌊> Spiking Neuron Models>⌋.@@@@1@14@@oe@26-8-2013 1000000502270@unknown@formal@none@1@S@Spiking neural networks with axonal conduction delays exhibit polychronization, and hence could have a potentially unlimited memory capacity.@@@@1@18@@oe@26-8-2013 1000000502280@unknown@formal@none@1@S@In June 2005 ⌊>IBM>⌋ announced construction of a ⌊>Blue Gene>⌋ ⌊>supercomputer>⌋ dedicated to the simulation of a large recurrent spiking neural network .@@@@1@23@@oe@26-8-2013 1000000502290@unknown@formal@none@1@S@⌊=Dynamic neural networks¦4=⌋@@@@1@3@@oe@26-8-2013 1000000502300@unknown@formal@none@1@S@⌊>Dynamic neural network>⌋s not only deal with nonlinear multivariate behaviour, but also include (learning of) time-dependent behaviour such as various transient phenomena and delay effects.@@@@1@25@@oe@26-8-2013 1000000502310@unknown@formal@none@1@S@⌊=Cascading neural networks¦4=⌋@@@@1@3@@oe@26-8-2013 1000000502320@unknown@formal@none@1@S@⌊/Cascade-Correlation/⌋ is an architecture and ⌊>supervised learning>⌋ ⌊>algorithm>⌋ developed by ⌊>Scott Fahlman>⌋ and ⌊>Christian Lebiere>⌋.@@@@1@15@@oe@26-8-2013 1000000502330@unknown@formal@none@1@S@Instead of just adjusting the weights in a network of fixed topology, Cascade-Correlation begins with a minimal network, then automatically trains and adds new hidden units one by one, creating a multi-layer structure.@@@@1@33@@oe@26-8-2013 1000000502340@unknown@formal@none@1@S@Once a new hidden unit has been added to the network, its input-side weights are frozen.@@@@1@16@@oe@26-8-2013 1000000502350@unknown@formal@none@1@S@This unit then becomes a permanent feature-detector in the network, available for producing outputs or for creating other, more complex feature detectors.@@@@1@22@@oe@26-8-2013 1000000502360@unknown@formal@none@1@S@The Cascade-Correlation architecture has several advantages over existing algorithms: it learns very quickly, the network determines its own size and topology, it retains the structures it has built even if the training set changes, and it requires no ⌊>back-propagation>⌋ of error signals through the connections of the network.@@@@1@48@@oe@26-8-2013 1000000502370@unknown@formal@none@1@S@See: ⌊>Cascade correlation algorithm>⌋.@@@@1@4@@oe@26-8-2013 1000000502380@unknown@formal@none@1@S@⌊=Neuro-fuzzy networks¦4=⌋@@@@1@2@@oe@26-8-2013 1000000502390@unknown@formal@none@1@S@A neuro-fuzzy network is a ⌊>fuzzy inference system>⌋ in the body of an artificial neural network.@@@@1@16@@oe@26-8-2013 1000000502400@unknown@formal@none@1@S@Depending on the ⌊/FIS/⌋ type, there are several layers that simulate the processes involved in a ⌊/fuzzy inference/⌋ like fuzzification, inference, aggregation and defuzzification.@@@@1@24@@oe@26-8-2013 1000000502410@unknown@formal@none@1@S@Embedding an ⌊/FIS/⌋ in a general structure of an ⌊/ANN/⌋ has the benefit of using available ⌊/ANN/⌋ training methods to find the parameters of a fuzzy system.@@@@1@27@@oe@26-8-2013 1000000502420@unknown@formal@none@1@S@⌊=Holosemantic neural networks¦4=⌋@@@@1@3@@oe@26-8-2013 1000000502430@unknown@formal@none@1@S@The holosemantic neural network invented by Manfred Hoffleisch uses a kind a genetic algorithm to build a multidimensional structure.@@@@1@19@@oe@26-8-2013 1000000502440@unknown@formal@none@1@S@It takes into account the timing of inputs.@@@@1@8@@oe@26-8-2013 1000000502450@unknown@formal@none@1@S@⌊=Compositional pattern-producing networks¦4=⌋@@@@1@3@@oe@26-8-2013 1000000502460@unknown@formal@none@1@S@⌊>Compositional pattern-producing network>⌋s (CPPNs) are a variation of ANNs which differ in their set of activation functions and how they are applied.@@@@1@22@@oe@26-8-2013 1000000502470@unknown@formal@none@1@S@While typical ANNs often contain only ⌊>sigmoid function>⌋s (and sometimes ⌊>Gaussian function>⌋s), CPPNs can include both types of functions and many others.@@@@1@22@@oe@26-8-2013 1000000502480@unknown@formal@none@1@S@Furthermore, unlike typical ANNs, CPPNs are applied across the entire space of possible inputs so that they can represent a complete image.@@@@1@22@@oe@26-8-2013 1000000502490@unknown@formal@none@1@S@Since they are compositions of functions, CPPNs in effect encode images at infinite resolution and can be sampled for a particular display at whatever resolution is optimal.@@@@1@27@@oe@26-8-2013 1000000502500@unknown@formal@none@1@S@⌊=Theoretical properties¦2=⌋@@@@1@2@@oe@26-8-2013 1000000502510@unknown@formal@none@1@S@⌊=Computational power¦3=⌋@@@@1@2@@oe@26-8-2013 1000000502520@unknown@formal@none@1@S@The multi-layer perceptron (MLP) is a universal function approximator, as proven by the ⌊>Cybenko theorem>⌋.@@@@1@15@@oe@26-8-2013 1000000502530@unknown@formal@none@1@S@However, the proof is not constructive regarding the number of neurons required or the settings of the weights.@@@@1@18@@oe@26-8-2013 1000000502540@unknown@formal@none@1@S@Work by ⌊>Hava T. Siegelmann>⌋ and ⌊>Eduardo D. Sontag>⌋ has provided a proof that a specific recurrent architecture with rational valued weights (as opposed to the commonly used floating point approximations) has the full power of a ⌊>Universal Turing Machine>⌋.@@@@1@40@@oe@26-8-2013 1000000502550@unknown@formal@none@1@S@They have further shown that the use of irrational values for weights results in a machine with trans-Turing power.@@@@1@19@@oe@26-8-2013 1000000502560@unknown@formal@none@1@S@⌊=Capacity¦3=⌋@@@@1@1@@oe@26-8-2013 1000000502570@unknown@formal@none@1@S@Artificial neural network models have a property called 'capacity', which roughly corresponds to their ability to model any given function.@@@@1@20@@oe@26-8-2013 1000000502580@unknown@formal@none@1@S@It is related to the amount of information that can be stored in the network and to the notion of complexity.@@@@1@21@@oe@26-8-2013 1000000502590@unknown@formal@none@1@S@⌊=Convergence¦3=⌋@@@@1@1@@oe@26-8-2013 1000000502600@unknown@formal@none@1@S@Nothing can be said in general about convergence since it depends on a number of factors.@@@@1@16@@oe@26-8-2013 1000000502610@unknown@formal@none@1@S@Firstly, there may exist many local minima.@@@@1@7@@oe@26-8-2013 1000000502620@unknown@formal@none@1@S@This depends on the cost function and the model.@@@@1@9@@oe@26-8-2013 1000000502630@unknown@formal@none@1@S@Secondly, the optimization method used might not be guaranteed to converge when far away from a local minimum.@@@@1@18@@oe@26-8-2013 1000000502640@unknown@formal@none@1@S@Thirdly, for a very large amount of data or parameters, some methods become impractical.@@@@1@14@@oe@26-8-2013 1000000502650@unknown@formal@none@1@S@In general, it has been found that theoretical guarantees regarding convergence are not always a very reliable guide to practical application.@@@@1@21@@oe@26-8-2013 1000000502660@unknown@formal@none@1@S@⌊=Generalisation and statistics¦3=⌋@@@@1@3@@oe@26-8-2013 1000000502670@unknown@formal@none@1@S@In applications where the goal is to create a system that generalises well in unseen examples, the problem of overtraining has emerged.@@@@1@22@@oe@26-8-2013 1000000502680@unknown@formal@none@1@S@This arises in overcomplex or overspecified systems when the capacity of the network significantly exceeds the needed free parameters.@@@@1@19@@oe@26-8-2013 1000000502690@unknown@formal@none@1@S@There are two schools of thought for avoiding this problem: The first is to use cross-validation and similar techniques to check for the presence of overtraining and optimally select hyperparameters such as to minimise the generalisation error.@@@@1@37@@oe@26-8-2013 1000000502700@unknown@formal@none@1@S@The second is to use some form of ⌊/regularisation/⌋.@@@@1@9@@oe@26-8-2013 1000000502710@unknown@formal@none@1@S@This is a concept that emerges naturally in a probabilistic (Bayesian) framework, where the regularisation can be performed by putting a larger prior probability over simpler models; but also in statistical learning theory, where the goal is to minimise over two quantities: the 'empirical risk' and the 'structural risk', which roughly correspond to the error over the training set and the predicted error in unseen data due to overfitting.@@@@1@69@@oe@26-8-2013 1000000502720@unknown@formal@none@1@S@Supervised neural networks that use an ⌊>MSE>⌋ cost function can use formal statistical methods to determine the confidence of the trained model.@@@@1@22@@oe@26-8-2013 1000000502730@unknown@formal@none@1@S@The MSE on a validation set can be used as an estimate for variance.@@@@1@14@@oe@26-8-2013 1000000502740@unknown@formal@none@1@S@This value can then be used to calculate the ⌊>confidence interval>⌋ of the output of the network, assuming a ⌊>normal distribution>⌋.@@@@1@21@@oe@26-8-2013 1000000502750@unknown@formal@none@1@S@A confidence analysis made this way is statistically valid as long as the output ⌊>probability distribution>⌋ stays the same and the network is not modified.@@@@1@25@@oe@26-8-2013 1000000502760@unknown@formal@none@1@S@By assigning a softmax activation function on the output layer of the neural network (or a softmax component in a component-based neural network) for categorical target variables, the outputs can be interpreted as posterior probabilities.@@@@1@35@@oe@26-8-2013 1000000502770@unknown@formal@none@1@S@This is very useful in classification as it gives a certainty measure on classifications.@@@@1@14@@oe@26-8-2013 1000000502780@unknown@formal@none@1@S@The softmax activation function: ⌊×y_i=\\frac{e^{x_i}}{\\sum_{j=1}^c e^{x_j}}×⌋@@@@1@6@@oe@26-8-2013 1000000502790@unknown@formal@none@1@S@⌊=Dynamic properties¦3=⌋@@@@1@2@@oe@26-8-2013 1000000502800@unknown@formal@none@1@S@Various techniques originally developed for studying disordered magnetic systems (i.e. the ⌊>spin glass>⌋) have been successfully applied to simple neural network architectures, such as the Hopfield network.@@@@1@27@@oe@26-8-2013 1000000502810@unknown@formal@none@1@S@Influential work by E. Gardner and B. Derrida has revealed many interesting properties about perceptrons with real-valued synaptic weights, while later work by W. Krauth and M. Mezard has extended these principles to binary-valued synapses.@@@@1@35@@oe@26-8-2013 1000000600010@unknown@formal@none@1@S@⌊δAssociation for Computational Linguisticsδ⌋@@@@1@4@@oe@26-8-2013 1000000600020@unknown@formal@none@1@S@The ⌊∗Association for Computational Linguistics∗⌋ (⌊∗ACL∗⌋) is the international scientific and professional society for people working on problems involving ⌊>natural language and computation>⌋.@@@@1@23@@oe@26-8-2013 1000000600030@unknown@formal@none@1@S@An annual meeting is held each summer in locations where significant ⌊>computational linguistics>⌋ research is carried out.@@@@1@17@@oe@26-8-2013 1000000600040@unknown@formal@none@1@S@It was founded in ⌊>1962>⌋, originally named the ⌊∗Association for Machine Translation and Computational Linguistics∗⌋ (⌊∗AMTCL∗⌋).@@@@1@16@@oe@26-8-2013 1000000600050@unknown@formal@none@1@S@It became the ACL in ⌊>1968>⌋.@@@@1@6@@oe@26-8-2013 1000000600060@unknown@formal@none@1@S@The ACL has ⌊>Europe>⌋an and ⌊>North America>⌋n chapters, the ⌊>European Chapter of the Association for Computational Linguistics>⌋ (EACL) and the ⌊>North American Chapter of the Association for Computational Linguistics>⌋ (NAACL).@@@@1@30@@oe@26-8-2013 1000000600070@unknown@formal@none@1@S@The ACL journal, ⌊>⌊/Computational Linguistics/⌋>⌋, continues to be the primary forum for research on computational linguistics and ⌊>natural language processing>⌋.@@@@1@20@@oe@26-8-2013 1000000600080@unknown@formal@none@1@S@Since ⌊>1988>⌋, the journal has been published for the ACL by ⌊>MIT Press>⌋.@@@@1@13@@oe@26-8-2013 1000000600090@unknown@formal@none@1@S@The ACL book series, ⌊>⌊/Studies in Natural Language Processing/⌋>⌋, is published by ⌊>Cambridge University Press>⌋.@@@@1@15@@oe@26-8-2013 1000000700010@unknown@formal@none@1@S@⌊δBLEUδ⌋@@@@1@1@@oe@26-8-2013 1000000700020@unknown@formal@none@1@S@⌊⇥⌊/This page is about the evaluation metric for machine translation.@@@@1@10@@oe@26-8-2013 1000000700030@unknown@formal@none@1@S@For other meanings, please see ⌊>Bleu>⌋./⌋⇥⌋@@@@1@6@@oe@26-8-2013 1000000700040@unknown@formal@none@1@S@⌊∗BLEU∗⌋ (⌊∗Bilingual Evaluation Understudy∗⌋) is a method for evaluating the quality of text which has been translated from one ⌊>natural language>⌋ to another using ⌊>machine translation>⌋.@@@@1@26@@oe@26-8-2013 1000000700050@unknown@formal@none@1@S@BLEU was one of the first ⌊>software metric>⌋s to report high ⌊>correlation>⌋ with human judgements of quality.@@@@1@17@@oe@26-8-2013 1000000700060@unknown@formal@none@1@S@The metric is currently one of the most popular in the field.@@@@1@12@@oe@26-8-2013 1000000700070@unknown@formal@none@1@S@The central idea behind the metric is that, "the closer a machine translation is to a professional human translation, the better it is".@@@@1@23@@oe@26-8-2013 1000000700080@unknown@formal@none@1@S@The metric calculates scores for individual segments, generally ⌊>sentence>⌋s, and then averages these scores over the whole ⌊>corpus>⌋ in order to reach a final score.@@@@1@25@@oe@26-8-2013 1000000700090@unknown@formal@none@1@S@It has been shown to correlate highly with human judgements of quality at the corpus level.@@@@1@16@@oe@26-8-2013 1000000700100@unknown@formal@none@1@S@The quality of translation is indicated as a number between 0 and 1 and is measured as statistical closeness to a given set of good quality human reference translations.@@@@1@29@@oe@26-8-2013 1000000700110@unknown@formal@none@1@S@Therefore, it does not directly take into account translation intelligibility or grammatical correctness.@@@@1@13@@oe@26-8-2013 1000000700120@unknown@formal@none@1@S@The metric works by measuring the ⌊>n-gram>⌋ co-occurrence between a given translation and the set of reference translations and then taking the weighted ⌊>geometric mean>⌋.@@@@1@25@@oe@26-8-2013 1000000700130@unknown@formal@none@1@S@BLEU is specifically designed to approximate human judgement on a ⌊>corpus>⌋ level and performs badly if used to evaluate the quality of isolated sentences.@@@@1@24@@oe@26-8-2013 1000000700140@unknown@formal@none@1@S@⌊=Algorithm¦2=⌋@@@@1@1@@oe@26-8-2013 1000000700150@unknown@formal@none@1@S@BLEU uses a modified form of ⌊>precision>⌋ to compare a candidate translation against multiple reference translations.@@@@1@16@@oe@26-8-2013 1000000700160@unknown@formal@none@1@S@The metric modifies simple precision since machine translation systems have been known to generate more words than appear in a reference text.@@@@1@22@@oe@26-8-2013 1000000700170@unknown@formal@none@1@S@This is illustrated in the following example from Papineni et al. (2002),@@@@1@12@@oe@26-8-2013 1000000700180@unknown@formal@none@1@S@In this example, the candidate text is given a unigram precision of,@@@@1@12@@oe@26-8-2013 1000000700190@unknown@formal@none@1@S@⌊⇥⌊×P = \\frac{m}{w_{t}} = \\frac{7}{7} = 1×⌋⇥⌋@@@@1@7@@oe@26-8-2013 1000000700200@unknown@formal@none@1@S@Of the seven words in the candidate translation, all of them appear in the reference translations.@@@@1@16@@oe@26-8-2013 1000000700210@unknown@formal@none@1@S@This presents a problem for a metric, as the candidate translation above is complete nonsense, retaining none of the content of either of the references.@@@@1@25@@oe@26-8-2013 1000000700220@unknown@formal@none@1@S@The modification that BLEU makes is fairly straightforward.@@@@1@8@@oe@26-8-2013 1000000700230@unknown@formal@none@1@S@For each word in the candidate translation, the algorithm takes the maximum total count in the reference translations.@@@@1@18@@oe@26-8-2013 1000000700240@unknown@formal@none@1@S@Taking the example above, the word 'the' appears twice in reference 1, and once in reference 2.@@@@1@17@@oe@26-8-2013 1000000700250@unknown@formal@none@1@S@The largest value is taken, in this case '2' as the "maximum reference count".@@@@1@14@@oe@26-8-2013 1000000700260@unknown@formal@none@1@S@For each of the words in the candidate translation, the count of the word is compared against the maximum reference count, and the lowest value is taken.@@@@1@27@@oe@26-8-2013 1000000700270@unknown@formal@none@1@S@In this case, the count of the word 'the' in the candidate translation is '7', while the maximum reference count for the word is '2'.@@@@1@25@@oe@26-8-2013 1000000700280@unknown@formal@none@1@S@This "modified count" is then divided by the total number of words in the candidate translation.@@@@1@16@@oe@26-8-2013 1000000700290@unknown@formal@none@1@S@In the above example, the modified unigram precision score would be,@@@@1@11@@oe@26-8-2013 1000000700300@unknown@formal@none@1@S@⌊⇥⌊×P = \\frac{2}{7}×⌋⇥⌋@@@@1@3@@oe@26-8-2013 1000000700310@unknown@formal@none@1@S@The above method is used to calculate scores for each ⌊×n×⌋.@@@@1@11@@oe@26-8-2013 1000000700320@unknown@formal@none@1@S@The value of ⌊×n×⌋ which has the "highest correlation with monolingual human judgements" was found to be 4.@@@@1@18@@oe@26-8-2013 1000000700330@unknown@formal@none@1@S@The unigram scores are found to account for the adequacy of the translation, in other words, how much information is retained in the translation.@@@@1@24@@oe@26-8-2013 1000000700340@unknown@formal@none@1@S@The longer ⌊×n×⌋-gram scores account for the fluency of the translation, or to what extent it reads like "good English".@@@@1@20@@oe@26-8-2013 1000000700350@unknown@formal@none@1@S@The modification made to precision does not solve the problem of short translations.@@@@1@13@@oe@26-8-2013 1000000700360@unknown@formal@none@1@S@Short translations can produce very high precision scores, even using modified precision.@@@@1@12@@oe@26-8-2013 1000000700370@unknown@formal@none@1@S@An example of a candidate translation for the same references as above might be:@@@@1@14@@oe@26-8-2013 1000000700380@unknown@formal@none@1@S@⌊⇥the cat⇥⌋@@@@1@2@@oe@26-8-2013 1000000700390@unknown@formal@none@1@S@In this example, the modified unigram precision would be,@@@@1@9@@oe@26-8-2013 1000000700400@unknown@formal@none@1@S@⌊⇥⌊×P = \\frac{1}{2} + \\frac{1}{2} = \\frac{2}{2}×⌋⇥⌋@@@@1@7@@oe@26-8-2013 1000000700410@unknown@formal@none@1@S@as the word 'the' and the word 'cat' appear once each in the candidate, and the total number of words is two.@@@@1@22@@oe@26-8-2013 1000000700420@unknown@formal@none@1@S@The modified bigram precision would be ⌊×1 / 1×⌋ as the bigram, "the cat" appears once in the candidate.@@@@1@19@@oe@26-8-2013 1000000700430@unknown@formal@none@1@S@It has been pointed out that precision is usually twinned with ⌊>recall>⌋ to overcome this problem , as the unigram recall of this example would be ⌊×2 / 6×⌋ or ⌊×2 / 7×⌋.@@@@1@33@@oe@26-8-2013 1000000700440@unknown@formal@none@1@S@The problem being that as there are multiple reference translations, a bad translation could easily have an inflated recall, such as a translation which consisted of all the words in each of the references.@@@@1@34@@oe@26-8-2013 1000000700450@unknown@formal@none@1@S@In order to produce a score for the whole corpus, the modified precision scores for the segments are combined using the ⌊>geometric mean>⌋, multiplied by a brevity penalty, whose purpose is to prevent very short candidates from receiving too high a score.@@@@1@42@@oe@26-8-2013 1000000700460@unknown@formal@none@1@S@Let ⌊×r×⌋ be the total length of the reference corpus, and ⌊×c×⌋ the total length of the translation corpus.@@@@1@19@@oe@26-8-2013 1000000700470@unknown@formal@none@1@S@If ⌊×c \\leq r×⌋, the brevity penalty applies and is defined to be ⌊×e^{(1-r/c)}×⌋.@@@@1@14@@oe@26-8-2013 1000000700480@unknown@formal@none@1@S@(In the case of multiple reference sentences, ⌊×r×⌋ is taken to be the sum of the lengths of the sentences whose lengths are closest to the lengths of the candidate sentences.@@@@1@31@@oe@26-8-2013 1000000700490@unknown@formal@none@1@S@However, in the version of the metric used by ⌊>NIST>⌋, the short reference sentence is used.)@@@@1@16@@oe@26-8-2013 1000000700500@unknown@formal@none@1@S@⌊=Performance¦2=⌋@@@@1@1@@oe@26-8-2013 1000000700510@unknown@formal@none@1@S@BLEU has frequently been reported as correlating well with human judgement, and certainly remains a benchmark for any new evaluation metric to beat.@@@@1@23@@oe@26-8-2013 1000000700520@unknown@formal@none@1@S@There are however a number of criticisms that have been voiced.@@@@1@11@@oe@26-8-2013 1000000700530@unknown@formal@none@1@S@It has been noted that while in theory capable of evaluating any language, BLEU does not in the present form work on languages without word boundaries.@@@@1@26@@oe@26-8-2013 1000000700540@unknown@formal@none@1@S@It has been argued that although BLEU certainly has significant advantages, there is no guarantee that an increase in BLEU score is an indicator of improved translation quality.@@@@1@28@@oe@26-8-2013 1000000700550@unknown@formal@none@1@S@As BLEU scores are taken at the corpus level, it is difficult to give a textual example.@@@@1@17@@oe@26-8-2013 1000000700560@unknown@formal@none@1@S@Nevertheless, they highlight two instances where BLEU seriously underperformed.@@@@1@9@@oe@26-8-2013 1000000700570@unknown@formal@none@1@S@These were the 2005 ⌊>NIST>⌋ evaluations where a number of different machine translation systems were tested, and their study of the ⌊>SYSTRAN>⌋ engine versus two engines using ⌊>statistical machine translation>⌋ (SMT) techniques.@@@@1@32@@oe@26-8-2013 1000000700580@unknown@formal@none@1@S@In the 2005 NIST evaluation, they report that the scores generated by BLEU failed to correspond to the scores produced in the human evaluations.@@@@1@24@@oe@26-8-2013 1000000700590@unknown@formal@none@1@S@The system which was ranked highest by the human judges was only ranked 6th by BLEU.@@@@1@16@@oe@26-8-2013 1000000700600@unknown@formal@none@1@S@In their study, they compared SMT systems with SYSTRAN, a knowledge based system.@@@@1@13@@oe@26-8-2013 1000000700610@unknown@formal@none@1@S@The scores from BLEU for SYSTRAN were substantially worse than the scores given to SYSTRAN by the human judges.@@@@1@19@@oe@26-8-2013 1000000700620@unknown@formal@none@1@S@They note that the SMT systems were trained using BLEU minimum error rate training, and point out that this could be one of the reasons behind the difference.@@@@1@28@@oe@26-8-2013 1000000700630@unknown@formal@none@1@S@They conclude by recommending that BLEU be used in a more restricted manner, for comparing the results from two similar systems, and for tracking "broad, incremental changes to a single system".@@@@1@31@@oe@26-8-2013 1000000800010@unknown@formal@none@1@S@⌊δBabel Fish (website)δ⌋@@@@1@3@@oe@26-8-2013 1000000800020@unknown@formal@none@1@S@⌊∗Babel Fish∗⌋ is a ⌊>web>⌋-based application on ⌊>Yahoo!>⌋ that ⌊>machine translates>⌋ text or web pages from one of several languages into another.@@@@1@22@@oe@26-8-2013 1000000800030@unknown@formal@none@1@S@Developed by ⌊>AltaVista>⌋, the application is named after the fictional animal used for instantaneous ⌊>language>⌋ ⌊>translation>⌋ in ⌊>Douglas Adams>⌋'s series ⌊/⌊>The Hitchhiker's Guide to the Galaxy>⌋./⌋@@@@1@26@@oe@26-8-2013 1000000800040@unknown@formal@none@1@S@In turn the fish is a reference to the ⌊>biblical>⌋ account of the city of ⌊>Babel>⌋ and the ⌊>various languages>⌋ said to have arisen there.@@@@1@25@@oe@26-8-2013 1000000800050@unknown@formal@none@1@S@The translation technology for Babel Fish is provided by ⌊>SYSTRAN>⌋, whose technology also powers a number of other sites and portals.@@@@1@21@@oe@26-8-2013 1000000800060@unknown@formal@none@1@S@It translates among ⌊>English>⌋, ⌊>Simplified Chinese>⌋, ⌊>Traditional Chinese>⌋, ⌊>Dutch>⌋, ⌊>French>⌋, ⌊>German>⌋, ⌊>Greek>⌋, ⌊>Italian>⌋, ⌊>Japanese>⌋, ⌊>Korean>⌋, ⌊>Portuguese>⌋, ⌊>Russian>⌋, and ⌊>Spanish>⌋.@@@@1@19@@oe@26-8-2013 1000000800070@unknown@formal@none@1@S@The service makes no claim to produce a perfect translation.@@@@1@10@@oe@26-8-2013 1000000800080@unknown@formal@none@1@S@A number of humour sites have sprung up that use the Babel Fish service to translate back and forth between one or more languages (a so-called ⌊>round-trip translation>⌋).@@@@1@28@@oe@26-8-2013 1000000800090@unknown@formal@none@1@S@After a long existence at babelfish.altavista.com, the site was moved on May 9 2008 to babelfish.yahoo.com.@@@@1@16@@oe@26-8-2013 1000000900010@unknown@formal@none@1@S@⌊δBioinformaticsδ⌋@@@@1@1@@oe@26-8-2013 1000000900020@unknown@formal@none@1@S@⌊∗Bioinformatics∗⌋ and ⌊∗computational biology∗⌋ involve the use of techniques including ⌊>applied mathematics>⌋, ⌊>informatics>⌋, ⌊>statistics>⌋, ⌊>computer science>⌋, ⌊>artificial intelligence>⌋, ⌊>chemistry>⌋, and ⌊>biochemistry>⌋ to solve ⌊>biological>⌋ problems usually on the ⌊>molecular>⌋ level.@@@@1@30@@oe@26-8-2013 1000000900030@unknown@formal@none@1@S@The core principle of these techniques is using computing resources in order to solve problems on scales of magnitude far too great for human discernment.@@@@1@25@@oe@26-8-2013 1000000900040@unknown@formal@none@1@S@Research in computational biology often overlaps with ⌊>systems biology>⌋.@@@@1@9@@oe@26-8-2013 1000000900050@unknown@formal@none@1@S@Major research efforts in the field include ⌊>sequence alignment>⌋, ⌊>gene finding>⌋, ⌊>genome assembly>⌋, ⌊>protein structure alignment>⌋, ⌊>protein structure prediction>⌋, prediction of ⌊>gene expression>⌋ and ⌊>protein-protein interactions>⌋, and the modeling of ⌊>evolution>⌋.@@@@1@31@@oe@26-8-2013 1000000900060@unknown@formal@none@1@S@⌊=Introduction¦2=⌋@@@@1@1@@oe@26-8-2013 1000000900070@unknown@formal@none@1@S@The terms ⌊∗⌊/bioinformatics/⌋∗⌋ and ⌊/⌊>computational biology>⌋/⌋ are often used interchangeably.@@@@1@10@@oe@26-8-2013 1000000900080@unknown@formal@none@1@S@However ⌊/bioinformatics/⌋ more properly refers to the creation and advancement of algorithms, computational and statistical techniques, and theory to solve formal and practical problems arising from the management and analysis of biological data.@@@@1@33@@oe@26-8-2013 1000000900090@unknown@formal@none@1@S@Computational biology, on the other hand, refers to hypothesis-driven investigation of a specific biological problem using computers, carried out with experimental or simulated data, with the primary goal of discovery and the advancement of biological knowledge.@@@@1@36@@oe@26-8-2013 1000000900100@unknown@formal@none@1@S@Put more simply, bioinformatics is concerned with the information while computational biology is concerned with the hypotheses.@@@@1@17@@oe@26-8-2013 1000000900110@unknown@formal@none@1@S@A similar distinction is made by ⌊>National Institutes of Health>⌋ in their ⌊> working definitions of Bioinformatics and Computational Biology>⌋, where it is further emphasized that there is a tight coupling of developments and knowledge between the more hypothesis-driven research in computational biology and technique-driven research in bioinformatics.@@@@1@48@@oe@26-8-2013 1000000900120@unknown@formal@none@1@S@Bioinformatics is also often specified as an applied subfield of the more general discipline of ⌊>Biomedical informatics>⌋.@@@@1@17@@oe@26-8-2013 1000000900130@unknown@formal@none@1@S@A common thread in projects in bioinformatics and computational biology is the use of mathematical tools to extract useful information from data produced by high-throughput biological techniques such as ⌊>genome sequencing>⌋.@@@@1@31@@oe@26-8-2013 1000000900140@unknown@formal@none@1@S@A representative problem in bioinformatics is the assembly of high-quality genome sequences from fragmentary "shotgun" DNA ⌊>sequencing>⌋.@@@@1@17@@oe@26-8-2013 1000000900150@unknown@formal@none@1@S@Other common problems include the study of ⌊>gene regulation>⌋ to perform ⌊>expression profiling>⌋ using data from ⌊>microarrays>⌋ or ⌊>mass spectrometry>⌋.@@@@1@20@@oe@26-8-2013 1000000900160@unknown@formal@none@1@S@⌊=Major research areas¦2=⌋@@@@1@3@@oe@26-8-2013 1000000900170@unknown@formal@none@1@S@⌊=Sequence analysis¦3=⌋@@@@1@2@@oe@26-8-2013 1000000900180@unknown@formal@none@1@S@Since the ⌊>Phage Φ-X174>⌋ was ⌊>sequenced>⌋ in 1977, the ⌊>DNA sequence>⌋s of hundreds of organisms have been decoded and stored in databases.@@@@1@22@@oe@26-8-2013 1000000900190@unknown@formal@none@1@S@The information is analyzed to determine genes that encode ⌊>polypeptides>⌋, as well as regulatory sequences.@@@@1@15@@oe@26-8-2013 1000000900200@unknown@formal@none@1@S@A comparison of genes within a ⌊>species>⌋ or between different species can show similarities between protein functions, or relations between species (the use of ⌊>molecular systematics>⌋ to construct ⌊>phylogenetic tree>⌋s).@@@@1@30@@oe@26-8-2013 1000000900210@unknown@formal@none@1@S@With the growing amount of data, it long ago became impractical to analyze DNA sequences manually.@@@@1@16@@oe@26-8-2013 1000000900220@unknown@formal@none@1@S@Today, ⌊>computer program>⌋s are used to search the ⌊>genome>⌋ of thousands of organisms, containing billions of ⌊>nucleotide>⌋s.@@@@1@17@@oe@26-8-2013 1000000900230@unknown@formal@none@1@S@These programs would compensate for mutations (exchanged, deleted or inserted bases) in the DNA sequence, in order to identify sequences that are related, but not identical.@@@@1@26@@oe@26-8-2013 1000000900240@unknown@formal@none@1@S@A variant of this ⌊>sequence alignment>⌋ is used in the sequencing process itself.@@@@1@13@@oe@26-8-2013 1000000900250@unknown@formal@none@1@S@The so-called ⌊>shotgun sequencing>⌋ technique (which was used, for example, by ⌊>The Institute for Genomic Research>⌋ to sequence the first bacterial genome, ⌊/Haemophilus influenzae/⌋) does not give a sequential list of nucleotides, but instead the sequences of thousands of small DNA fragments (each about 600-800 nucleotides long).@@@@1@47@@oe@26-8-2013 1000000900260@unknown@formal@none@1@S@The ends of these fragments overlap and, when aligned in the right way, make up the complete genome.@@@@1@18@@oe@26-8-2013 1000000900270@unknown@formal@none@1@S@Shotgun sequencing yields sequence data quickly, but the task of assembling the fragments can be quite complicated for larger genomes.@@@@1@20@@oe@26-8-2013 1000000900280@unknown@formal@none@1@S@In the case of the ⌊>Human Genome Project>⌋, it took several months of CPU time (on a circa-2000 vintage ⌊>DEC Alpha>⌋ computer) to assemble the fragments.@@@@1@26@@oe@26-8-2013 1000000900290@unknown@formal@none@1@S@Shotgun sequencing is the method of choice for virtually all genomes sequenced today, and genome assembly algorithms are a critical area of bioinformatics research.@@@@1@24@@oe@26-8-2013 1000000900300@unknown@formal@none@1@S@Another aspect of bioinformatics in sequence analysis is the automatic ⌊>search for genes>⌋ and regulatory sequences within a genome.@@@@1@19@@oe@26-8-2013 1000000900310@unknown@formal@none@1@S@Not all of the nucleotides within a genome are genes.@@@@1@10@@oe@26-8-2013 1000000900320@unknown@formal@none@1@S@Within the genome of higher organisms, large parts of the DNA do not serve any obvious purpose.@@@@1@17@@oe@26-8-2013 1000000900330@unknown@formal@none@1@S@This so-called ⌊>junk DNA>⌋ may, however, contain unrecognized functional elements.@@@@1@10@@oe@26-8-2013 1000000900340@unknown@formal@none@1@S@Bioinformatics helps to bridge the gap between genome and ⌊>proteome>⌋ projects--for example, in the use of DNA sequences for protein identification.@@@@1@21@@oe@26-8-2013 1000000900350@unknown@formal@none@1@S@⌊/See also:/⌋ ⌊>sequence analysis>⌋, ⌊>sequence profiling tool>⌋, ⌊>sequence motif>⌋.@@@@1@9@@oe@26-8-2013 1000000900360@unknown@formal@none@1@S@⌊=Genome annotation¦3=⌋@@@@1@2@@oe@26-8-2013 1000000900370@unknown@formal@none@1@S@In the context of ⌊>genomics>⌋, ⌊∗annotation∗⌋ is the process of marking the genes and other biological features in a DNA sequence.@@@@1@21@@oe@26-8-2013 1000000900380@unknown@formal@none@1@S@The first genome annotation software system was designed in 1995 by Dr. Owen White, who was part of the team that sequenced and analyzed the first genome of a free-living organism to be decoded, the bacterium ⌊/⌊>Haemophilus influenzae>⌋/⌋.@@@@1@38@@oe@26-8-2013 1000000900390@unknown@formal@none@1@S@Dr. White built a software system to find the genes (places in the DNA sequence that encode a protein), the transfer RNA, and other features, and to make initial assignments of function to those genes.@@@@1@35@@oe@26-8-2013 1000000900400@unknown@formal@none@1@S@Most current genome annotation systems work similarly, but the programs available for analysis of genomic DNA are constantly changing and improving.@@@@1@21@@oe@26-8-2013 1000000900410@unknown@formal@none@1@S@⌊=Computational evolutionary biology¦3=⌋@@@@1@3@@oe@26-8-2013 1000000900420@unknown@formal@none@1@S@⌊>Evolutionary biology>⌋ is the study of the origin and descent of ⌊>species>⌋, as well as their change over time.@@@@1@19@@oe@26-8-2013 1000000900430@unknown@formal@none@1@S@Informatics has assisted evolutionary biologists in several key ways; it has enabled researchers to:@@@@1@14@@oe@26-8-2013 1000000900440@unknown@formal@none@1@S@⌊•⌊#trace the evolution of a large number of organisms by measuring changes in their ⌊>DNA>⌋, rather than through ⌊>physical taxonomy>⌋ or physiological observations alone,#⌋@@@@1@24@@oe@26-8-2013 1000000900450@unknown@formal@none@1@S@⌊#more recently, compare entire ⌊>genomes>⌋, which permits the study of more complex evolutionary events, such as ⌊>gene duplication>⌋, ⌊>lateral gene transfer>⌋, and the prediction of factors important in bacterial ⌊>speciation>⌋,#⌋@@@@1@30@@oe@26-8-2013 1000000900460@unknown@formal@none@1@S@⌊#build complex computational models of populations to predict the outcome of the system over time#⌋@@@@1@15@@oe@26-8-2013 1000000900470@unknown@formal@none@1@S@⌊#track and share information on an increasingly large number of species and organisms#⌋•⌋@@@@1@13@@oe@26-8-2013 1000000900480@unknown@formal@none@1@S@Future work endeavours to reconstruct the now more complex ⌊>tree of life>⌋.@@@@1@12@@oe@26-8-2013 1000000900490@unknown@formal@none@1@S@The area of research within ⌊>computer science>⌋ that uses ⌊>genetic algorithm>⌋s is sometimes confused with ⌊>computational evolutionary biology>⌋, but the two areas are unrelated.@@@@1@24@@oe@26-8-2013 1000000900500@unknown@formal@none@1@S@⌊=Measuring biodiversity¦3=⌋@@@@1@2@@oe@26-8-2013 1000000900510@unknown@formal@none@1@S@⌊>Biodiversity>⌋ of an ecosystem might be defined as the total genomic complement of a particular environment, from all of the species present, whether it is a biofilm in an abandoned mine, a drop of sea water, a scoop of soil, or the entire ⌊>biosphere>⌋ of the planet ⌊>Earth>⌋.@@@@1@48@@oe@26-8-2013 1000000900520@unknown@formal@none@1@S@Databases are used to collect the ⌊>species>⌋ names, descriptions, distributions, genetic information, status and size of ⌊>population>⌋s, ⌊>habitat>⌋ needs, and how each organism interacts with other species.@@@@1@27@@oe@26-8-2013 1000000900530@unknown@formal@none@1@S@Specialized ⌊>software>⌋ programs are used to find, visualize, and analyze the information, and most importantly, communicate it to other people.@@@@1@20@@oe@26-8-2013 1000000900540@unknown@formal@none@1@S@Computer simulations model such things as population dynamics, or calculate the cumulative genetic health of a breeding pool (in ⌊>agriculture>⌋) or endangered population (in ⌊>conservation>⌋).@@@@1@25@@oe@26-8-2013 1000000900550@unknown@formal@none@1@S@One very exciting potential of this field is that entire ⌊>DNA>⌋ sequences, or ⌊>genome>⌋s of ⌊>endangered species>⌋ can be preserved, allowing the results of Nature's genetic experiment to be remembered ⌊/⌊>in silico>⌋/⌋, and possibly reused in the future, even if that species is eventually lost.@@@@1@45@@oe@26-8-2013 1000000900560@unknown@formal@none@1@S@⌊/Important projects:/⌋ ⌊> Species 2000 project>⌋; ⌊> uBio Project>⌋.@@@@1@9@@oe@26-8-2013 1000000900570@unknown@formal@none@1@S@⌊=Analysis of gene expression¦3=⌋@@@@1@4@@oe@26-8-2013 1000000900580@unknown@formal@none@1@S@The ⌊>expression>⌋ of many genes can be determined by measuring ⌊>mRNA>⌋ levels with multiple techniques including ⌊>microarrays>⌋, ⌊>expressed cDNA sequence tag>⌋ (EST) sequencing, ⌊>serial analysis of gene expression>⌋ (SAGE) tag sequencing, ⌊>massively parallel signature sequencing>⌋ (MPSS), or various applications of multiplexed in-situ hybridization.@@@@1@43@@oe@26-8-2013 1000000900590@unknown@formal@none@1@S@All of these techniques are extremely noise-prone and/or subject to bias in the biological measurement, and a major research area in computational biology involves developing statistical tools to separate ⌊>signal>⌋ from ⌊>noise>⌋ in high-throughput gene expression studies.@@@@1@37@@oe@26-8-2013 1000000900600@unknown@formal@none@1@S@Such studies are often used to determine the genes implicated in a disorder: one might compare microarray data from cancerous ⌊>epithelial>⌋ cells to data from non-cancerous cells to determine the transcripts that are up-regulated and down-regulated in a particular population of cancer cells.@@@@1@43@@oe@26-8-2013 1000000900610@unknown@formal@none@1@S@⌊=Analysis of regulation¦3=⌋@@@@1@3@@oe@26-8-2013 1000000900620@unknown@formal@none@1@S@Regulation is the complex orchestration of events starting with an extracellular signal such as a ⌊>hormone>⌋ and leading to an increase or decrease in the activity of one or more ⌊>protein>⌋s.@@@@1@31@@oe@26-8-2013 1000000900630@unknown@formal@none@1@S@Bioinformatics techniques have been applied to explore various steps in this process.@@@@1@12@@oe@26-8-2013 1000000900640@unknown@formal@none@1@S@For example, ⌊>promoter analysis>⌋ involves the identification and study of ⌊>sequence motif>⌋s in the DNA surrounding the coding region of a gene.@@@@1@22@@oe@26-8-2013 1000000900650@unknown@formal@none@1@S@These motifs influence the extent to which that region is transcribed into mRNA.@@@@1@13@@oe@26-8-2013 1000000900660@unknown@formal@none@1@S@Expression data can be used to infer gene regulation: one might compare ⌊>microarray>⌋ data from a wide variety of states of an organism to form hypotheses about the genes involved in each state.@@@@1@33@@oe@26-8-2013 1000000900670@unknown@formal@none@1@S@In a single-cell organism, one might compare stages of the ⌊>cell cycle>⌋, along with various stress conditions (heat shock, starvation, etc.).@@@@1@21@@oe@26-8-2013 1000000900680@unknown@formal@none@1@S@One can then apply ⌊>clustering algorithms>⌋ to that expression data to determine which genes are co-expressed.@@@@1@16@@oe@26-8-2013 1000000900690@unknown@formal@none@1@S@For example, the upstream regions (promoters) of co-expressed genes can be searched for over-represented ⌊>regulatory elements>⌋.@@@@1@16@@oe@26-8-2013 1000000900700@unknown@formal@none@1@S@⌊=Analysis of protein expression¦3=⌋@@@@1@4@@oe@26-8-2013 1000000900710@unknown@formal@none@1@S@⌊>Protein microarray>⌋s and high throughput (HT) ⌊>mass spectrometry>⌋ (MS) can provide a snapshot of the proteins present in a biological sample.@@@@1@21@@oe@26-8-2013 1000000900720@unknown@formal@none@1@S@Bioinformatics is very much involved in making sense of protein microarray and HT MS data; the former approach faces similar problems as with microarrays targeted at mRNA, the latter involves the problem of matching large amounts of mass data against predicted masses from protein sequence databases, and the complicated statistical analysis of samples where multiple, but incomplete peptides from each protein are detected.@@@@1@63@@oe@26-8-2013 1000000900730@unknown@formal@none@1@S@⌊=Analysis of mutations in cancer¦3=⌋@@@@1@5@@oe@26-8-2013 1000000900740@unknown@formal@none@1@S@In cancer, the genomes of affected cells are rearranged in complex or even unpredictable ways.@@@@1@15@@oe@26-8-2013 1000000900750@unknown@formal@none@1@S@Massive sequencing efforts are used to identify previously unknown ⌊>point mutation>⌋s in a variety of ⌊>gene>⌋s in ⌊>cancer>⌋.@@@@1@18@@oe@26-8-2013 1000000900760@unknown@formal@none@1@S@Bioinformaticians continue to produce specialized automated systems to manage the sheer volume of sequence data produced, and they create new algorithms and software to compare the sequencing results to the growing collection of ⌊>human genome>⌋ sequences and ⌊>germline>⌋ polymorphisms.@@@@1@39@@oe@26-8-2013 1000000900770@unknown@formal@none@1@S@New physical detection technology are employed, such as ⌊>oligonucleotide>⌋ microarrays to identify chromosomal gains and losses (called ⌊>comparative genomic hybridization>⌋), and ⌊>single nucleotide polymorphism>⌋ arrays to detect known ⌊/point mutations/⌋.@@@@1@30@@oe@26-8-2013 1000000900780@unknown@formal@none@1@S@These detection methods simultaneously measure several hundred thousand sites throughout the genome, and when used in high-throughput to measure thousands of samples, generate ⌊>terabyte>⌋s of data per experiment.@@@@1@28@@oe@26-8-2013 1000000900790@unknown@formal@none@1@S@Again the massive amounts and new types of data generate new opportunities for bioinformaticians.@@@@1@14@@oe@26-8-2013 1000000900800@unknown@formal@none@1@S@The data is often found to contain considerable variability, or ⌊>noise>⌋, and thus ⌊>Hidden Markov model>⌋ and ⌊>change-point analysis>⌋ methods are being developed to infer real ⌊>copy number>⌋ changes.@@@@1@29@@oe@26-8-2013 1000000900810@unknown@formal@none@1@S@Another type of data that requires novel informatics development is the analysis of lesions found to be recurrent among many tumors .@@@@1@22@@oe@26-8-2013 1000000900820@unknown@formal@none@1@S@⌊=Prediction of protein structure¦3=⌋@@@@1@4@@oe@26-8-2013 1000000900830@unknown@formal@none@1@S@Protein structure prediction is another important application of bioinformatics.@@@@1@9@@oe@26-8-2013 1000000900840@unknown@formal@none@1@S@The ⌊>amino acid>⌋ sequence of a protein, the so-called ⌊>primary structure>⌋, can be easily determined from the sequence on the gene that codes for it.@@@@1@25@@oe@26-8-2013 1000000900850@unknown@formal@none@1@S@In the vast majority of cases, this primary structure uniquely determines a structure in its native environment.@@@@1@17@@oe@26-8-2013 1000000900860@unknown@formal@none@1@S@(Of course, there are exceptions, such as the ⌊>bovine spongiform encephalopathy>⌋ - aka ⌊>Mad Cow Disease>⌋ - ⌊>prion>⌋.)@@@@1@18@@oe@26-8-2013 1000000900870@unknown@formal@none@1@S@Knowledge of this structure is vital in understanding the function of the protein.@@@@1@13@@oe@26-8-2013 1000000900880@unknown@formal@none@1@S@For lack of better terms, structural information is usually classified as one of ⌊/⌊>secondary>⌋/⌋, ⌊/⌊>tertiary>⌋/⌋ and ⌊/⌊>quaternary>⌋/⌋ structure.@@@@1@18@@oe@26-8-2013 1000000900890@unknown@formal@none@1@S@A viable general solution to such predictions remains an open problem.@@@@1@11@@oe@26-8-2013 1000000900900@unknown@formal@none@1@S@As of now, most efforts have been directed towards heuristics that work most of the time.@@@@1@16@@oe@26-8-2013 1000000900910@unknown@formal@none@1@S@One of the key ideas in bioinformatics is the notion of ⌊>homology>⌋.@@@@1@12@@oe@26-8-2013 1000000900920@unknown@formal@none@1@S@In the genomic branch of bioinformatics, homology is used to predict the function of a gene: if the sequence of gene ⌊/A/⌋, whose function is known, is homologous to the sequence of gene ⌊/B,/⌋ whose function is unknown, one could infer that B may share A's function.@@@@1@47@@oe@26-8-2013 1000000900930@unknown@formal@none@1@S@In the structural branch of bioinformatics, homology is used to determine which parts of a protein are important in structure formation and interaction with other proteins.@@@@1@26@@oe@26-8-2013 1000000900940@unknown@formal@none@1@S@In a technique called homology modeling, this information is used to predict the structure of a protein once the structure of a homologous protein is known.@@@@1@26@@oe@26-8-2013 1000000900950@unknown@formal@none@1@S@This currently remains the only way to predict protein structures reliably.@@@@1@11@@oe@26-8-2013 1000000900960@unknown@formal@none@1@S@One example of this is the similar protein homology between hemoglobin in humans and the hemoglobin in legumes (⌊>leghemoglobin>⌋).@@@@1@19@@oe@26-8-2013 1000000900970@unknown@formal@none@1@S@Both serve the same purpose of transporting oxygen in the organism.@@@@1@11@@oe@26-8-2013 1000000900980@unknown@formal@none@1@S@Though both of these proteins have completely different amino acid sequences, their protein structures are virtually identical, which reflects their near identical purposes.@@@@1@23@@oe@26-8-2013 1000000900990@unknown@formal@none@1@S@Other techniques for predicting protein structure include protein threading and ⌊/de novo/⌋ (from scratch) physics-based modeling.@@@@1@16@@oe@26-8-2013 1000000901000@unknown@formal@none@1@S@⌊/See also:/⌋ ⌊>structural motif>⌋ and ⌊>structural domain>⌋.@@@@1@7@@oe@26-8-2013 1000000901010@unknown@formal@none@1@S@⌊=Comparative genomics¦3=⌋@@@@1@2@@oe@26-8-2013 1000000901020@unknown@formal@none@1@S@The core of comparative genome analysis is the establishment of the correspondence between ⌊>genes>⌋ (orthology analysis) or other genomic features in different organisms.@@@@1@23@@oe@26-8-2013 1000000901030@unknown@formal@none@1@S@It is these intergenomic maps that make it possible to trace the evolutionary processes responsible for the divergence of two genomes.@@@@1@21@@oe@26-8-2013 1000000901040@unknown@formal@none@1@S@A multitude of evolutionary events acting at various organizational levels shape genome evolution.@@@@1@13@@oe@26-8-2013 1000000901050@unknown@formal@none@1@S@At the lowest level, point mutations affect individual nucleotides.@@@@1@9@@oe@26-8-2013 1000000901060@unknown@formal@none@1@S@At a higher level, large chromosomal segments undergo duplication, lateral transfer, inversion, transposition, deletion and insertion.@@@@1@16@@oe@26-8-2013 1000000901070@unknown@formal@none@1@S@Ultimately, whole genomes are involved in processes of hybridization, polyploidization and ⌊>endosymbiosis>⌋, often leading to rapid speciation.@@@@1@17@@oe@26-8-2013 1000000901080@unknown@formal@none@1@S@The complexity of genome evolution poses many exciting challenges to developers of mathematical models and algorithms, who have recourse to a spectra of algorithmic, statistical and mathematical techniques, ranging from exact, ⌊>heuristics>⌋, fixed parameter and ⌊>approximation algorithms>⌋ for problems based on parsimony models to ⌊>Markov Chain Monte Carlo>⌋ algorithms for ⌊>Bayesian analysis>⌋ of problems based on probabilistic models.@@@@1@58@@oe@26-8-2013 1000000901090@unknown@formal@none@1@S@Many of these studies are based on the homology detection and protein families computation.@@@@1@14@@oe@26-8-2013 1000000901100@unknown@formal@none@1@S@⌊=Modeling biological systems¦3=⌋@@@@1@3@@oe@26-8-2013 1000000901110@unknown@formal@none@1@S@Systems biology involves the use of ⌊>computer simulation>⌋s of ⌊>cellular>⌋ subsystems (such as the ⌊>networks of metabolites>⌋ and ⌊>enzyme>⌋s which comprise ⌊>metabolism>⌋, ⌊>signal transduction>⌋ pathways and ⌊>gene regulatory network>⌋s) to both analyze and visualize the complex connections of these cellular processes.@@@@1@41@@oe@26-8-2013 1000000901120@unknown@formal@none@1@S@⌊>Artificial life>⌋ or virtual evolution attempts to understand evolutionary processes via the computer simulation of simple (artificial) life forms.@@@@1@19@@oe@26-8-2013 1000000901130@unknown@formal@none@1@S@⌊=High-throughput image analysis¦3=⌋@@@@1@3@@oe@26-8-2013 1000000901140@unknown@formal@none@1@S@Computational technologies are used to accelerate or fully automate the processing, quantification and analysis of large amounts of high-information-content ⌊>biomedical imagery>⌋.@@@@1@21@@oe@26-8-2013 1000000901150@unknown@formal@none@1@S@Modern image analysis systems augment an observer's ability to make measurements from a large or complex set of images, by improving ⌊>accuracy>⌋, ⌊>objectivity>⌋, or speed.@@@@1@25@@oe@26-8-2013 1000000901160@unknown@formal@none@1@S@A fully developed analysis system may completely replace the observer.@@@@1@10@@oe@26-8-2013 1000000901170@unknown@formal@none@1@S@Although these systems are not unique to biomedical imagery, biomedical imaging is becoming more important for both ⌊>diagnostics>⌋ and research.@@@@1@20@@oe@26-8-2013 1000000901180@unknown@formal@none@1@S@Some examples are:@@@@1@3@@oe@26-8-2013 1000000901190@unknown@formal@none@1@S@⌊•⌊#high-throughput and high-fidelity quantification and sub-cellular localization (⌊>high-content screening>⌋, ⌊>cytohistopathology>⌋)#⌋@@@@1@10@@oe@26-8-2013 1000000901200@unknown@formal@none@1@S@⌊#⌊>morphometrics>⌋#⌋@@@@1@1@@oe@26-8-2013 1000000901210@unknown@formal@none@1@S@⌊#clinical image analysis and visualization#⌋@@@@1@5@@oe@26-8-2013 1000000901220@unknown@formal@none@1@S@⌊#determining the real-time air-flow patterns in breathing lungs of living animals#⌋@@@@1@11@@oe@26-8-2013 1000000901230@unknown@formal@none@1@S@⌊#quantifying occlusion size in real-time imagery from the development of and recovery during arterial injury#⌋@@@@1@15@@oe@26-8-2013 1000000901240@unknown@formal@none@1@S@⌊#making behavioral observations from extended video recordings of laboratory animals#⌋@@@@1@10@@oe@26-8-2013 1000000901250@unknown@formal@none@1@S@⌊#infrared measurements for metabolic activity determination#⌋•⌋@@@@1@6@@oe@26-8-2013 1000000901260@unknown@formal@none@1@S@⌊=Protein-protein docking¦3=⌋@@@@1@2@@oe@26-8-2013 1000000901270@unknown@formal@none@1@S@In the last two decades, tens of thousands of protein three-dimensional structures have been determined by ⌊>X-ray crystallography>⌋ and ⌊>Protein nuclear magnetic resonance spectroscopy>⌋ (protein NMR).@@@@1@26@@oe@26-8-2013 1000000901280@unknown@formal@none@1@S@One central question for the biological scientist is whether it is practical to predict possible protein-protein interactions only based on these 3D shapes, without doing ⌊>protein-protein interaction>⌋ experiments.@@@@1@28@@oe@26-8-2013 1000000901290@unknown@formal@none@1@S@A variety of methods have been developed to tackle the ⌊>Protein-protein docking>⌋ problem, though it seems that there is still much place to work on in this field.@@@@1@28@@oe@26-8-2013 1000000901300@unknown@formal@none@1@S@⌊=Software and Tools¦3=⌋@@@@1@3@@oe@26-8-2013 1000000901310@unknown@formal@none@1@S@Software tools for bioinformatics range from simple command-line tools, to more complex graphical programs and standalone web-services.@@@@1@17@@oe@26-8-2013 1000000901320@unknown@formal@none@1@S@The computational biology tool best-known among biologists is probably ⌊>BLAST>⌋, an algorithm for determining the similarity of arbitrary sequences against other sequences, possibly from curated databases of protein or DNA sequences.@@@@1@31@@oe@26-8-2013 1000000901330@unknown@formal@none@1@S@The ⌊>NCBI>⌋ provides a popular web-based implementation that searches their databases.@@@@1@11@@oe@26-8-2013 1000000901340@unknown@formal@none@1@S@BLAST is one of a number of ⌊>generally available programs>⌋ for doing sequence alignment.@@@@1@14@@oe@26-8-2013 1000000901350@unknown@formal@none@1@S@⌊=Web Services in Bioinformatics¦3=⌋@@@@1@4@@oe@26-8-2013 1000000901360@unknown@formal@none@1@S@⌊>SOAP>⌋ and ⌊>REST>⌋-based interfaces have been developed for a wide variety of bioinformatics applications allowing an application running on one computer in one part of the world to use algorithms, data and computing resources on servers in other parts of the world.@@@@1@42@@oe@26-8-2013 1000000901370@unknown@formal@none@1@S@The main advantages lay in the end user not having to deal with software and database maintenance overheads.@@@@1@18@@oe@26-8-2013 1000000901380@unknown@formal@none@1@S@Basic bioinformatics services are classified by the ⌊>EBI>⌋ into three categories: ⌊>SSS>⌋ (Sequence Search Services), ⌊>MSA>⌋ (Multiple Sequence Alignment) and ⌊>BSA>⌋ (Biological Sequence Analysis).@@@@1@24@@oe@26-8-2013 1000000901390@unknown@formal@none@1@S@The availability of these ⌊>service-oriented>⌋ bioinformatics resources demonstrate the applicability of web based bioinformatics solutions, and range from a collection of standalone tools with a common data format under a single, standalone or web-based interface, to integrative, distributed and extensible ⌊>bioinformatics workflow management systems>⌋.@@@@1@44@@oe@26-8-2013 1000001000010@unknown@formal@none@1@S@⌊δBusiness intelligenceδ⌋@@@@1@2@@oe@26-8-2013 1000001000020@unknown@formal@none@1@S@⌊∗Business intelligence∗⌋ (⌊∗BI∗⌋) refers to technologies, applications and practices for the collection, integration, analysis, and presentation of business ⌊>information>⌋ and sometimes to the information itself.@@@@1@25@@oe@26-8-2013 1000001000030@unknown@formal@none@1@S@The purpose of business intelligence--a term that dates at least to 1958--is to support better business decision making.@@@@1@18@@oe@26-8-2013 1000001000040@unknown@formal@none@1@S@Thus, BI is also described as a ⌊>decision support system>⌋ (DSS):@@@@1@11@@oe@26-8-2013 1000001000050@unknown@formal@none@1@S@⌊"BI is sometimes used interchangeably with briefing books, report and query tools and executive information systems.@@@@1@16@@oe@26-8-2013 1000001000060@unknown@formal@none@1@S@In general, business intelligence systems are data-driven DSS."⌋@@@@1@8@@oe@26-8-2013 1000001000070@unknown@formal@none@1@S@BI systems provide historical, current, and predictive views of business operations, most often using data that has been gathered into a ⌊>data warehouse>⌋ or a ⌊>data mart>⌋ and occasionally working from operational data.@@@@1@33@@oe@26-8-2013 1000001000080@unknown@formal@none@1@S@Software elements support the use of this information by assisting in the extraction, analysis, and reporting of information.@@@@1@18@@oe@26-8-2013 1000001000090@unknown@formal@none@1@S@Applications tackle sales, production, financial, and many other sources of business data for purposes that include, notably, ⌊>business performance management>⌋.@@@@1@20@@oe@26-8-2013 1000001000100@unknown@formal@none@1@S@Information may be gathered on comparable companies to produce ⌊>benchmarks>⌋.@@@@1@10@@oe@26-8-2013 1000001000110@unknown@formal@none@1@S@⌊=History¦2=⌋@@@@1@1@@oe@26-8-2013 1000001000120@unknown@formal@none@1@S@Prior to the start of the ⌊>Information Age>⌋ in the late 20th century, businesses had to collect data from non-automated sources.@@@@1@21@@oe@26-8-2013 1000001000130@unknown@formal@none@1@S@Businesses then lacked the computing resources necessary to properly analyze the data, and as a result, companies often made business decisions primarily on the basis of ⌊>intuition>⌋.@@@@1@27@@oe@26-8-2013 1000001000140@unknown@formal@none@1@S@As businesses automated systems the amount of data increased but its collection remained difficult due to the inability of information to be moved between or within systems.@@@@1@27@@oe@26-8-2013 1000001000150@unknown@formal@none@1@S@Analysis of information informed for long-term decision making, but was slow and often required the use of instinct or expertise to make short-term decisions.@@@@1@24@@oe@26-8-2013 1000001000160@unknown@formal@none@1@S@Business intelligence was defined in 1958 by ⌊>Hans Peter Luhn>⌋, who wrote,@@@@1@12@@oe@26-8-2013 1000001000170@unknown@formal@none@1@S@⌊"In this paper, business is a collection of activities carried on for whatever purpose, be it science, technology, commerce, industry, law, government, defense, et cetera.@@@@1@25@@oe@26-8-2013 1000001000180@unknown@formal@none@1@S@The communication facility serving the conduct of a business (in the broad sense) may be referred to as an intelligence system.@@@@1@21@@oe@26-8-2013 1000001000190@unknown@formal@none@1@S@The notion of intelligence is also defined here, in a more general sense, as "the ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal.""⌋@@@@1@35@@oe@26-8-2013 1000001000200@unknown@formal@none@1@S@In 1989 Howard Dresner, later a ⌊>Gartner Group>⌋ analyst, popularized BI as an umbrella term to describe "concepts and methods to improve business decision making by using fact-based support systems."@@@@1@30@@oe@26-8-2013 1000001000210@unknown@formal@none@1@S@In modern businesses the use of standards, automation and specialized software, including ⌊>analytical tools>⌋, allows large volumes of data to be ⌊>extracted, transformed, loaded>⌋ and ⌊>warehoused>⌋ to greatly increase the speed at which information becomes available for decision-making.@@@@1@38@@oe@26-8-2013 1000001000220@unknown@formal@none@1@S@⌊=Key intelligence topics¦3=⌋@@@@1@3@@oe@26-8-2013 1000001000230@unknown@formal@none@1@S@Business intelligence often uses ⌊>key performance indicators>⌋ (KPIs) to assess the present state of business and to prescribe a course of action.@@@@1@22@@oe@26-8-2013 1000001000240@unknown@formal@none@1@S@Examples of KPIs are things such as lead conversion rate (in sales) and inventory turnover (in inventory management).@@@@1@18@@oe@26-8-2013 1000001000250@unknown@formal@none@1@S@Prior to the widespread adoption of computer and web applications, when information had to be manually input and calculated, performance data was often not available for weeks or months.@@@@1@29@@oe@26-8-2013 1000001000260@unknown@formal@none@1@S@Recently, banks have tried to make data available at shorter intervals and have reduced delays.@@@@1@15@@oe@26-8-2013 1000001000270@unknown@formal@none@1@S@The KPI methodology was further expanded with the Chief Performance Officer methodology which incorporated KPIs and root cause analysis into a single methodology.@@@@1@23@@oe@26-8-2013 1000001000280@unknown@formal@none@1@S@Businesses that face higher operational/⌊>credit risk>⌋ loading, such as ⌊>credit card>⌋ companies and "wealth management" services, often make KPI-related data available weekly.@@@@1@22@@oe@26-8-2013 1000001000290@unknown@formal@none@1@S@In some cases, companies may even offer a daily analysis of data.@@@@1@12@@oe@26-8-2013 1000001000300@unknown@formal@none@1@S@This fast pace requires analysts to use ⌊>IT>⌋ ⌊>system>⌋s to process this large volume of data.@@@@1@16@@oe@26-8-2013 1000001100010@unknown@formal@none@1@S@⌊δChatterbotδ⌋@@@@1@1@@oe@26-8-2013 1000001100020@unknown@formal@none@1@S@A ⌊∗chatterbot∗⌋ (or chatbot) is a type of conversational agent, a ⌊>computer program>⌋ designed to simulate an intelligent ⌊>conversation>⌋ with one or more human users via auditory or textual methods.@@@@1@30@@oe@26-8-2013 1000001100030@unknown@formal@none@1@S@In other words, a chatterbot is a computer program with artificial intelligence to talk to people through voices or typed words.@@@@1@21@@oe@26-8-2013 1000001100040@unknown@formal@none@1@S@Though many appear to be intelligently interpreting the human input prior to providing a response, most chatterbots simply scan for keywords within the input and pull a reply with the most matching keywords or the most similar wording pattern from a local ⌊>database>⌋.@@@@1@43@@oe@26-8-2013 1000001100050@unknown@formal@none@1@S@Chatterbots may also be referred to as ⌊/talk bots/⌋, ⌊/chat bots/⌋, or ⌊/chatterboxes/⌋.@@@@1@13@@oe@26-8-2013 1000001100060@unknown@formal@none@1@S@⌊=Method of operation¦2=⌋@@@@1@3@@oe@26-8-2013 1000001100070@unknown@formal@none@1@S@A good understanding of a conversation is required to carry on a meaningful dialog but most chatterbots do not attempt this.@@@@1@21@@oe@26-8-2013 1000001100080@unknown@formal@none@1@S@Instead they "converse" by recognizing cue words or phrases from the human user, which allows them to use pre-prepared or pre-calculated responses which can move the conversation on in an apparently meaningful way without requiring them to know what they are talking about.@@@@1@43@@oe@26-8-2013 1000001100090@unknown@formal@none@1@S@For example, if a human types, "I am feeling very worried lately," the chatterbot may be programmed to recognize the phrase "I am" and respond by replacing it with "Why are you" plus a question mark at the end, giving the answer, "Why are you feeling very worried lately?"@@@@1@49@@oe@26-8-2013 1000001100100@unknown@formal@none@1@S@A similar approach using keywords would be for the program to answer any comment including ⌊/(Name of celebrity)/⌋ with "I think they're great, don't you?"@@@@1@25@@oe@26-8-2013 1000001100110@unknown@formal@none@1@S@Humans, especially those unfamiliar with chatterbots, sometimes find the resulting conversations engaging.@@@@1@12@@oe@26-8-2013 1000001100120@unknown@formal@none@1@S@Critics of chatterbots call this engagement the ⌊>ELIZA effect>⌋.@@@@1@9@@oe@26-8-2013 1000001100130@unknown@formal@none@1@S@Some programs classified as chatterbots use other principles.@@@@1@8@@oe@26-8-2013 1000001100140@unknown@formal@none@1@S@One example is ⌊>Jabberwacky>⌋, which attempts to model the way humans learn new facts and language.@@@@1@16@@oe@26-8-2013 1000001100150@unknown@formal@none@1@S@⌊>ELLA>⌋ attempts to use ⌊>natural language processing>⌋ to make more useful responses from a human's input.@@@@1@16@@oe@26-8-2013 1000001100160@unknown@formal@none@1@S@Some programs that use natural language conversation, such as ⌊>SHRDLU>⌋, are not generally classified as chatterbots because they link their speech ability to knowledge of a simulated world.@@@@1@28@@oe@26-8-2013 1000001100170@unknown@formal@none@1@S@This type of link requires a more complex ⌊>artificial intelligence>⌋ (eg., a "vision" system) than standard chatterbots have.@@@@1@18@@oe@26-8-2013 1000001100180@unknown@formal@none@1@S@⌊=Early chatterbots¦2=⌋@@@@1@2@@oe@26-8-2013 1000001100190@unknown@formal@none@1@S@The classic early chatterbots are ⌊>ELIZA>⌋ and ⌊>PARRY>⌋.@@@@1@8@@oe@26-8-2013 1000001100200@unknown@formal@none@1@S@More recent programs are ⌊>Racter>⌋, ⌊>Verbot>⌋s, ⌊>A.L.I.C.E.>⌋, and ⌊>ELLA>⌋.@@@@1@9@@oe@26-8-2013 1000001100210@unknown@formal@none@1@S@The growth of chatterbots as a research field has created an expansion in their purposes.@@@@1@15@@oe@26-8-2013 1000001100220@unknown@formal@none@1@S@While ELIZA and PARRY were used exclusively to simulate typed conversation, ⌊>Racter>⌋ was used to "write" a story called ⌊/The Policeman's Beard is Half Constructed/⌋.@@@@1@25@@oe@26-8-2013 1000001100230@unknown@formal@none@1@S@ELLA includes a collection of games and functional features to further extend the potential of chatterbots.@@@@1@16@@oe@26-8-2013 1000001100240@unknown@formal@none@1@S@The term "ChatterBot" was coined by ⌊>Michael Mauldin>⌋ (Creator of the first ⌊>Verbot>⌋, Julia) in 1994 to describe these conversational programs.@@@@1@21@@oe@26-8-2013 1000001100250@unknown@formal@none@1@S@⌊=Malicious chatterbots¦2=⌋@@@@1@2@@oe@26-8-2013 1000001100260@unknown@formal@none@1@S@Malicious chatterbots are frequently used to fill chat rooms with spam and advertising, or to entice people into revealing personal information, such as bank account numbers.@@@@1@26@@oe@26-8-2013 1000001100270@unknown@formal@none@1@S@They are commonly found on ⌊>Yahoo! Messenger>⌋, ⌊>Windows Live Messenger>⌋, ⌊>AOL Instant Messenger>⌋ and other ⌊>instant messaging>⌋ protocols.@@@@1@18@@oe@26-8-2013 1000001100280@unknown@formal@none@1@S@There has been a published report of a chatterbot used in a fake personal ad on a dating service's website.@@@@1@20@@oe@26-8-2013 1000001100290@unknown@formal@none@1@S@⌊=Chatterbots in modern AI¦2=⌋@@@@1@4@@oe@26-8-2013 1000001100300@unknown@formal@none@1@S@Most modern AI research focuses on practical engineering tasks.@@@@1@9@@oe@26-8-2013 1000001100310@unknown@formal@none@1@S@This is known as weak AI and is distinguished from ⌊>strong AI>⌋, which would require ⌊>sapience>⌋ and reasoning abilities.@@@@1@19@@oe@26-8-2013 1000001100320@unknown@formal@none@1@S@One pertinent field of AI research is natural language.@@@@1@9@@oe@26-8-2013 1000001100330@unknown@formal@none@1@S@Usually weak AI fields employ specialised software or programming languages created for them.@@@@1@13@@oe@26-8-2013 1000001100340@unknown@formal@none@1@S@For example, one of the 'most-human' natural language chatterbots, ⌊>A.L.I.C.E.>⌋, uses a programming language called AIML that is specific to its program, and its various clones, named Alicebots.@@@@1@28@@oe@26-8-2013 1000001100350@unknown@formal@none@1@S@Nevertheless, A.L.I.C.E. is still based on pattern matching without any reasoning.@@@@1@11@@oe@26-8-2013 1000001100360@unknown@formal@none@1@S@This is the same technique ⌊>ELIZA>⌋, the first chatterbot, was using back in 1966.@@@@1@14@@oe@26-8-2013 1000001100370@unknown@formal@none@1@S@Australian company MyCyberTwin also deals in strong AI, allowing users to create and sustain their own virtual personalities online.@@@@1@19@@oe@26-8-2013 1000001100380@unknown@formal@none@1@S@MyCyberTwin.com also works in a corporate setting, allowing companies to set up Virtual AI Assistants.@@@@1@15@@oe@26-8-2013 1000001100390@unknown@formal@none@1@S@Another notable program, known as ⌊>Jabberwacky>⌋, also deals in strong AI, as it is claimed to learn new responses based on user interactions, rather than being driven from a static database like many other existing chatterbots.@@@@1@36@@oe@26-8-2013 1000001100400@unknown@formal@none@1@S@Although such programs show initial promise, many of the existing results in trying to tackle the problem of natural language still appear fairly poor, and it seems reasonable to state that there is currently no general purpose conversational artificial intelligence.@@@@1@40@@oe@26-8-2013 1000001100410@unknown@formal@none@1@S@This has led some software developers to focus more on the practical aspect of chatterbot technology - information retrieval.@@@@1@19@@oe@26-8-2013 1000001100420@unknown@formal@none@1@S@A common rebuttal often used within the AI community against criticism of such approaches asks, "How do we know that humans don't also just follow some cleverly devised rules?" (in the way that Chatterbots do).@@@@1@35@@oe@26-8-2013 1000001100430@unknown@formal@none@1@S@Two famous examples of this line of argument against the rationale for the basis of the Turing test are John Searle's ⌊>Chinese room>⌋ argument and Ned Block's ⌊>Blockhead argument>⌋.@@@@1@29@@oe@26-8-2013 1000001100440@unknown@formal@none@1@S@⌊=Chatterbots/Virtual Assistants in Commercial Environments¦2=⌋@@@@1@5@@oe@26-8-2013 1000001100450@unknown@formal@none@1@S@Automated Conversational Systems have progressed and evolved far from the original designs of the first widely used chatbots.@@@@1@18@@oe@26-8-2013 1000001100460@unknown@formal@none@1@S@In the UK, large commercial entities such as Lloyds TSB, Royal Bank of Scotland, Renault, Citroën and One Railway are already utilizing Virtual Assistants to reduce expenditures on Call Centres and provide a first point of contact that can inform the user exactly of points of interest, provide support, capture data from the user and promote products for sale.@@@@1@59@@oe@26-8-2013 1000001100470@unknown@formal@none@1@S@In the UK, new projects and research are being conducted to introduce a Virtual Assistant into the classroom to assist the teacher.@@@@1@22@@oe@26-8-2013 1000001100480@unknown@formal@none@1@S@This project is the first of its kind and the chatbot VA in question is based on the Yhaken chatbot design.@@@@1@21@@oe@26-8-2013 1000001100490@unknown@formal@none@1@S@The Yhaken template provides a further move forward in Automated Conversational Systems with features such as complex conversational routing and responses, well defined personality, a complex hierarchical construct with additional external reference points, emotional responses and in depth small talk, all to make the experience more interactive and involving for the user.@@@@1@52@@oe@26-8-2013 1000001100500@unknown@formal@none@1@S@⌊=Annual contests for chatterbots¦2=⌋@@@@1@4@@oe@26-8-2013 1000001100510@unknown@formal@none@1@S@Many organizations tries to encourage and support developers all over the world to develop chatterbots that able to do variety of tasks and compete with each other through ⌊>turing test>⌋s and more.@@@@1@32@@oe@26-8-2013 1000001100520@unknown@formal@none@1@S@Annual contests are organized at the following links:@@@@1@8@@oe@26-8-2013 1000001100530@unknown@formal@none@1@S@⌊•⌊#⌊>The Chatterbox Challenge>⌋#⌋@@@@1@3@@oe@26-8-2013 1000001100540@unknown@formal@none@1@S@⌊#⌊>The Loebner Prize>⌋#⌋•⌋@@@@1@3@@oe@26-8-2013 1000001200010@unknown@formal@none@1@S@⌊δCluster analysisδ⌋@@@@1@2@@oe@26-8-2013 1000001200020@unknown@formal@none@1@S@⌊∗Clustering∗⌋ is the ⌊>classification>⌋ of objects into different groups, or more precisely, the ⌊>partitioning>⌋ of a ⌊>data set>⌋ into ⌊>subset>⌋s (clusters), so that the data in each subset (ideally) share some common trait - often proximity according to some defined ⌊>distance measure>⌋.@@@@1@42@@oe@26-8-2013 1000001200030@unknown@formal@none@1@S@Data clustering is a common technique for ⌊>statistical>⌋ ⌊>data analysis>⌋, which is used in many fields, including ⌊>machine learning>⌋, ⌊>data mining>⌋, ⌊>pattern recognition>⌋, ⌊>image analysis>⌋ and ⌊>bioinformatics>⌋.@@@@1@27@@oe@26-8-2013 1000001200040@unknown@formal@none@1@S@The computational task of classifying the data set into ⌊/k/⌋ clusters is often referred to as ⌊∗⌊/k/⌋∗⌋⌊∗-clustering∗⌋.@@@@1@17@@oe@26-8-2013 1000001200050@unknown@formal@none@1@S@Besides the term ⌊/data clustering/⌋ (or just ⌊/clustering/⌋), there are a number of terms with similar meanings, including ⌊/cluster analysis/⌋, ⌊/automatic classification/⌋, ⌊/numerical taxonomy/⌋, ⌊/botryology/⌋ and ⌊/typological analysis/⌋.@@@@1@28@@oe@26-8-2013 1000001200060@unknown@formal@none@1@S@⌊=Types of clustering¦2=⌋@@@@1@3@@oe@26-8-2013 1000001200070@unknown@formal@none@1@S@Data clustering algorithms can be ⌊>hierarchical>⌋.@@@@1@6@@oe@26-8-2013 1000001200080@unknown@formal@none@1@S@Hierarchical algorithms find successive clusters using previously established clusters.@@@@1@9@@oe@26-8-2013 1000001200090@unknown@formal@none@1@S@Hierarchical algorithms can be agglomerative ("bottom-up") or divisive ("top-down").@@@@1@9@@oe@26-8-2013 1000001200100@unknown@formal@none@1@S@Agglomerative algorithms begin with each element as a separate cluster and merge them into successively larger clusters.@@@@1@17@@oe@26-8-2013 1000001200110@unknown@formal@none@1@S@Divisive algorithms begin with the whole set and proceed to divide it into successively smaller clusters.@@@@1@16@@oe@26-8-2013 1000001200120@unknown@formal@none@1@S@⌊>Partitional>⌋ algorithms typically determine all clusters at once, but can also be used as divisive algorithms in the ⌊>hierarchical>⌋ clustering.@@@@1@20@@oe@26-8-2013 1000001200130@unknown@formal@none@1@S@⌊/Two-way clustering/⌋, ⌊/co-clustering/⌋ or ⌊>biclustering>⌋ are clustering methods where not only the objects are clustered but also the features of the objects, i.e., if the data is represented in a ⌊>data matrix>⌋, the rows and columns are clustered simultaneously.@@@@1@39@@oe@26-8-2013 1000001200140@unknown@formal@none@1@S@Another important distinction is whether the clustering uses symmetric or asymmetric distances.@@@@1@12@@oe@26-8-2013 1000001200150@unknown@formal@none@1@S@A property of ⌊>Euclidean space>⌋ is that distances are symmetric (the distance from object⌊/ A/⌋ to ⌊/B/⌋ is the same as the distance from ⌊/B/⌋ to ⌊/A/⌋).@@@@1@27@@oe@26-8-2013 1000001200160@unknown@formal@none@1@S@In other applications (e.g., sequence-alignment methods, see Prinzie & Van den Poel (2006)), this is not the case.@@@@1@18@@oe@26-8-2013 1000001200170@unknown@formal@none@1@S@⌊=Distance measure¦2=⌋@@@@1@2@@oe@26-8-2013 1000001200180@unknown@formal@none@1@S@An important step in any clustering is to select a ⌊>distance measure>⌋, which will determine how the ⌊/similarity/⌋ of two elements is calculated.@@@@1@23@@oe@26-8-2013 1000001200190@unknown@formal@none@1@S@This will influence the shape of the clusters, as some elements may be close to one another according to one distance and further away according to another.@@@@1@27@@oe@26-8-2013 1000001200200@unknown@formal@none@1@S@For example, in a 2-dimensional space, the distance between the point (x=1, y=0) and the origin (x=0, y=0) is always 1 according to the usual norms, but the distance between the point (x=1, y=1) and the origin can be 2,⌊×\\sqrt 2×⌋ or 1 if you take respectively the 1-norm, 2-norm or infinity-norm distance.@@@@1@53@@oe@26-8-2013 1000001200210@unknown@formal@none@1@S@Common distance functions:@@@@1@3@@oe@26-8-2013 1000001200220@unknown@formal@none@1@S@⌊•⌊#The ⌊>Euclidean distance>⌋ (also called distance ⌊>as the crow flies>⌋ or 2-norm distance).@@@@1@13@@oe@26-8-2013 1000001200230@unknown@formal@none@1@S@A review of cluster analysis in health psychology research found that the most common distance measure in published studies in that research area is the Euclidean distance or the squared Euclidean distance.#⌋@@@@1@32@@oe@26-8-2013 1000001200240@unknown@formal@none@1@S@⌊#The ⌊>Manhattan distance>⌋ (also called taxicab norm or 1-norm)#⌋@@@@1@9@@oe@26-8-2013 1000001200250@unknown@formal@none@1@S@⌊#The ⌊>maximum norm>⌋#⌋@@@@1@3@@oe@26-8-2013 1000001200260@unknown@formal@none@1@S@⌊#The ⌊>Mahalanobis distance>⌋ corrects data for different scales and correlations in the variables#⌋@@@@1@13@@oe@26-8-2013 1000001200270@unknown@formal@none@1@S@⌊#The angle between two vectors can be used as a distance measure when clustering high dimensional data.@@@@1@17@@oe@26-8-2013 1000001200280@unknown@formal@none@1@S@See ⌊>Inner product space>⌋.#⌋@@@@1@4@@oe@26-8-2013 1000001200290@unknown@formal@none@1@S@⌊#The ⌊>Hamming distance>⌋ (sometimes edit distance) measures the minimum number of substitutions required to change one member into another.#⌋•⌋@@@@1@19@@oe@26-8-2013 1000001200300@unknown@formal@none@1@S@⌊=Hierarchical clustering¦2=⌋@@@@1@2@@oe@26-8-2013 1000001200310@unknown@formal@none@1@S@⌊=Creating clusters¦3=⌋@@@@1@2@@oe@26-8-2013 1000001200320@unknown@formal@none@1@S@Hierarchical clustering builds (agglomerative), or breaks up (divisive), a hierarchy of clusters.@@@@1@12@@oe@26-8-2013 1000001200330@unknown@formal@none@1@S@The traditional representation of this hierarchy is a ⌊>tree>⌋ (called a ⌊>dendrogram>⌋), with individual elements at one end and a single cluster containing every element at the other.@@@@1@28@@oe@26-8-2013 1000001200340@unknown@formal@none@1@S@Agglomerative algorithms begin at the top of the tree, whereas divisive algorithms begin at the root.@@@@1@16@@oe@26-8-2013 1000001200350@unknown@formal@none@1@S@(In the figure, the arrows indicate an agglomerative clustering.)@@@@1@9@@oe@26-8-2013 1000001200360@unknown@formal@none@1@S@Cutting the tree at a given height will give a clustering at a selected precision.@@@@1@15@@oe@26-8-2013 1000001200370@unknown@formal@none@1@S@In the following example, cutting after the second row will yield clusters {a} {b c} {d e} {f}.@@@@1@18@@oe@26-8-2013 1000001200380@unknown@formal@none@1@S@Cutting after the third row will yield clusters {a} {b c} {d e f}, which is a coarser clustering, with a smaller number of larger clusters.@@@@1@26@@oe@26-8-2013 1000001200390@unknown@formal@none@1@S@⌊=Agglomerative hierarchical clustering¦3=⌋@@@@1@3@@oe@26-8-2013 1000001200400@unknown@formal@none@1@S@For example, suppose this data is to be clustered, and the ⌊>euclidean distance>⌋ is the ⌊>distance metric>⌋.@@@@1@17@@oe@26-8-2013 1000001200410@unknown@formal@none@1@S@The hierarchical clustering ⌊>dendrogram>⌋ would be as such:@@@@1@8@@oe@26-8-2013 1000001200420@unknown@formal@none@1@S@This method builds the hierarchy from the individual elements by progressively merging clusters.@@@@1@13@@oe@26-8-2013 1000001200430@unknown@formal@none@1@S@In our example, we have six elements {a} {b} {c} {d} {e} and {f}.@@@@1@14@@oe@26-8-2013 1000001200440@unknown@formal@none@1@S@The first step is to determine which elements to merge in a cluster.@@@@1@13@@oe@26-8-2013 1000001200450@unknown@formal@none@1@S@Usually, we want to take the two closest elements, according to the chosen distance.@@@@1@14@@oe@26-8-2013 1000001200460@unknown@formal@none@1@S@Optionally, one can also construct a ⌊>distance matrix>⌋ at this stage, where the number in the ⌊/i/⌋-th row ⌊/j/⌋-th column is the distance between the ⌊/i/⌋-th and ⌊/j/⌋-th elements.@@@@1@29@@oe@26-8-2013 1000001200470@unknown@formal@none@1@S@Then, as clustering progresses, rows and columns are merged as the clusters are merged and the distances updated.@@@@1@18@@oe@26-8-2013 1000001200480@unknown@formal@none@1@S@This is a common way to implement this type of clustering, and has the benefit of caching distances between clusters.@@@@1@20@@oe@26-8-2013 1000001200490@unknown@formal@none@1@S@A simple agglomerative clustering algorithm is described in the ⌊>single linkage clustering>⌋ page; it can easily be adapted to different types of linkage (see below).@@@@1@25@@oe@26-8-2013 1000001200500@unknown@formal@none@1@S@Suppose we have merged the two closest elements ⌊/b/⌋ and ⌊/c/⌋, we now have the following clusters {⌊/a/⌋}, {⌊/b/⌋, ⌊/c/⌋}, {⌊/d/⌋}, {⌊/e/⌋} and {⌊/f/⌋}, and want to merge them further.@@@@1@30@@oe@26-8-2013 1000001200510@unknown@formal@none@1@S@To do that, we need to take the distance between {a} and {b c}, and therefore define the distance between two clusters.@@@@1@22@@oe@26-8-2013 1000001200520@unknown@formal@none@1@S@Usually the distance between two clusters ⌊×\\mathcal{A}×⌋ and ⌊×\\mathcal{B}×⌋ is one of the following:@@@@1@14@@oe@26-8-2013 1000001200530@unknown@formal@none@1@S@⌊•⌊#The maximum distance between elements of each cluster (also called complete linkage clustering):#⌋•⌋@@@@1@13@@oe@26-8-2013 1000001200540@unknown@formal@none@1@S@⌊⇥⌊⇥⌊×\\max \\{\\, d(x,y) : x \\in \\mathcal{A},\\, y \\in \\mathcal{B}\\,\\}×⌋⇥⌋⇥⌋@@@@1@10@@oe@26-8-2013 1000001200550@unknown@formal@none@1@S@⌊•⌊#The minimum distance between elements of each cluster (also called ⌊>single linkage clustering>⌋):#⌋•⌋@@@@1@13@@oe@26-8-2013 1000001200560@unknown@formal@none@1@S@⌊⇥⌊⇥⌊×\\min \\{\\, d(x,y) : x \\in \\mathcal{A},\\, y \\in \\mathcal{B} \\,\\}×⌋⇥⌋⇥⌋@@@@1@11@@oe@26-8-2013 1000001200570@unknown@formal@none@1@S@⌊•⌊#The mean distance between elements of each cluster (also called average linkage clustering, used e.g. in ⌊>UPGMA>⌋):#⌋•⌋@@@@1@17@@oe@26-8-2013 1000001200580@unknown@formal@none@1@S@⌊⇥⌊⇥⌊×{1 \\over {|\\mathcal{A}|\\cdot|\\mathcal{B}|}}\\sum_{x \\in \\mathcal{A}}\\sum_{ y \\in \\mathcal{B}} d(x,y)×⌋⇥⌋⇥⌋@@@@1@9@@oe@26-8-2013 1000001200590@unknown@formal@none@1@S@⌊•⌊#The sum of all intra-cluster variance#⌋@@@@1@6@@oe@26-8-2013 1000001200600@unknown@formal@none@1@S@⌊#The increase in variance for the cluster being merged (⌊>Ward's criterion>⌋)#⌋@@@@1@11@@oe@26-8-2013 1000001200610@unknown@formal@none@1@S@⌊#The probability that candidate clusters spawn from the same distribution function (V-linkage)#⌋•⌋@@@@1@12@@oe@26-8-2013 1000001200620@unknown@formal@none@1@S@Each agglomeration occurs at a greater distance between clusters than the previous agglomeration, and one can decide to stop clustering either when the clusters are too far apart to be merged (distance criterion) or when there is a sufficiently small number of clusters (number criterion).@@@@1@45@@oe@26-8-2013 1000001200630@unknown@formal@none@1@S@⌊=Concept clustering¦3=⌋@@@@1@2@@oe@26-8-2013 1000001200640@unknown@formal@none@1@S@Another variation of the agglomerative clustering approach is ⌊>conceptual clustering>⌋.@@@@1@10@@oe@26-8-2013 1000001200650@unknown@formal@none@1@S@⌊=Partitional clustering¦2=⌋@@@@1@2@@oe@26-8-2013 1000001200660@unknown@formal@none@1@S@⌊=⌊/K/⌋-means and derivatives¦3=⌋@@@@1@3@@oe@26-8-2013 1000001200670@unknown@formal@none@1@S@⌊=⌊/K/⌋-means clustering¦4=⌋@@@@1@2@@oe@26-8-2013 1000001200680@unknown@formal@none@1@S@The ⌊>⌊/K/⌋-means algorithm>⌋ assigns each point to the cluster whose center (also called centroid) is nearest.@@@@1@16@@oe@26-8-2013 1000001200690@unknown@formal@none@1@S@The center is the average of all the points in the cluster — that is, its coordinates are the arithmetic mean for each dimension separately over all the points in the cluster...@@@@1@32@@oe@26-8-2013 1000001200700@unknown@formal@none@1@S@⌊⇥⌊/Example:/⌋ The data set has three dimensions and the cluster has two points: ⌊/X/⌋ = (⌊/x/⌋⌊,1,⌋, ⌊/x/⌋⌊,2,⌋, ⌊/x/⌋⌊,3,⌋) and ⌊/Y/⌋ = (⌊/y/⌋⌊,1,⌋, ⌊/y/⌋⌊,2,⌋, ⌊/y/⌋⌊,3,⌋).@@@@1@24@@oe@26-8-2013 1000001200710@unknown@formal@none@1@S@Then the centroid ⌊/Z/⌋ becomes ⌊/Z/⌋ = (⌊/z/⌋⌊,1,⌋, ⌊/z/⌋⌊,2,⌋, ⌊/z/⌋⌊,3,⌋), where ⌊/z/⌋⌊,1,⌋ = (⌊/x/⌋⌊,1,⌋ + ⌊/y/⌋⌊,1,⌋)/2 and ⌊/z/⌋⌊,2,⌋ = (⌊/x/⌋⌊,2,⌋ + ⌊/y/⌋⌊,2,⌋)/2 and ⌊/z/⌋⌊,3,⌋ = (⌊/x/⌋⌊,3,⌋ + ⌊/y/⌋⌊,3,⌋)/2.⇥⌋@@@@1@22@@oe@26-8-2013 1000001200720@unknown@formal@none@1@S@The algorithm steps are (J. MacQueen, 1967):@@@@1@7@@oe@26-8-2013 1000001200730@unknown@formal@none@1@S@⌊•⌊#Choose the number of clusters, ⌊/k/⌋.#⌋@@@@1@6@@oe@26-8-2013 1000001200740@unknown@formal@none@1@S@⌊#Randomly generate ⌊/k/⌋ clusters and determine the cluster centers, or directly generate ⌊/k/⌋ random points as cluster centers.#⌋@@@@1@18@@oe@26-8-2013 1000001200750@unknown@formal@none@1@S@⌊#Assign each point to the nearest cluster center.#⌋@@@@1@8@@oe@26-8-2013 1000001200760@unknown@formal@none@1@S@⌊#Recompute the new cluster centers.#⌋@@@@1@5@@oe@26-8-2013 1000001200770@unknown@formal@none@1@S@⌊#Repeat the two previous steps until some convergence criterion is met (usually that the assignment hasn't changed).#⌋•⌋@@@@1@17@@oe@26-8-2013 1000001200780@unknown@formal@none@1@S@The main advantages of this algorithm are its simplicity and speed which allows it to run on large datasets.@@@@1@19@@oe@26-8-2013 1000001200790@unknown@formal@none@1@S@Its disadvantage is that it does not yield the same result with each run, since the resulting clusters depend on the initial random assignments.@@@@1@24@@oe@26-8-2013 1000001200800@unknown@formal@none@1@S@It minimizes intra-cluster variance, but does not ensure that the result has a global minimum of variance.@@@@1@17@@oe@26-8-2013 1000001200810@unknown@formal@none@1@S@⌊=Fuzzy ⌊/c/⌋-means clustering¦4=⌋@@@@1@3@@oe@26-8-2013 1000001200820@unknown@formal@none@1@S@In ⌊>fuzzy clustering>⌋, each point has a degree of belonging to clusters, as in ⌊>fuzzy logic>⌋, rather than belonging completely to just one cluster.@@@@1@24@@oe@26-8-2013 1000001200830@unknown@formal@none@1@S@Thus, points on the edge of a cluster, may be ⌊/in the cluster/⌋ to a lesser degree than points in the center of cluster.@@@@1@24@@oe@26-8-2013 1000001200840@unknown@formal@none@1@S@For each point ⌊/x/⌋ we have a coefficient giving the degree of being in the ⌊/k/⌋th cluster ⌊×u_k(x)×⌋.@@@@1@18@@oe@26-8-2013 1000001200850@unknown@formal@none@1@S@Usually, the sum of those coefficients is defined to be 1:@@@@1@11@@oe@26-8-2013 1000001200860@unknown@formal@none@1@S@⌊⇥⌊×\\forall x \\sum_{k=1}^{\\mathrm{num.}\\ \\mathrm{clusters}} u_k(x) \\ =1.×⌋⇥⌋@@@@1@7@@oe@26-8-2013 1000001200870@unknown@formal@none@1@S@With fuzzy ⌊/c/⌋-means, the centroid of a cluster is the mean of all points, weighted by their degree of belonging to the cluster:@@@@1@23@@oe@26-8-2013 1000001200880@unknown@formal@none@1@S@⌊⇥⌊×\\mathrm{center}_k = {{\\sum_x u_k(x)^m x} \\over {\\sum_x u_k(x)^m}}.×⌋⇥⌋@@@@1@8@@oe@26-8-2013 1000001200890@unknown@formal@none@1@S@The degree of belonging is related to the inverse of the distance to the cluster@@@@1@15@@oe@26-8-2013 1000001200900@unknown@formal@none@1@S@⌊⇥⌊×u_k(x) = {1 \\over d(\\mathrm{center}_k,x)},×⌋⇥⌋@@@@1@5@@oe@26-8-2013 1000001200910@unknown@formal@none@1@S@then the coefficients are normalized and fuzzyfied with a real parameter ⌊×m>1×⌋ so that their sum is 1.@@@@1@18@@oe@26-8-2013 1000001200920@unknown@formal@none@1@S@So@@@@1@1@@oe@26-8-2013 1000001200930@unknown@formal@none@1@S@⌊⇥⌊×u_k(x) = \\frac{1}{\\sum_j \\left(\\frac{d(\\mathrm{center}_k,x)}{d(\\mathrm{center}_j,x)}\\right)^{2/(m-1)}}.×⌋⇥⌋@@@@1@4@@oe@26-8-2013 1000001200940@unknown@formal@none@1@S@For ⌊/m/⌋ equal to 2, this is equivalent to normalising the coefficient linearly to make their sum 1.@@@@1@18@@oe@26-8-2013 1000001200950@unknown@formal@none@1@S@When ⌊/m/⌋ is close to 1, then cluster center closest to the point is given much more weight than the others, and the algorithm is similar to ⌊/k/⌋-means.@@@@1@28@@oe@26-8-2013 1000001200960@unknown@formal@none@1@S@The fuzzy ⌊/c/⌋-means algorithm is very similar to the ⌊/k/⌋-means algorithm:@@@@1@11@@oe@26-8-2013 1000001200970@unknown@formal@none@1@S@⌊•⌊#Choose a number of clusters.#⌋@@@@1@5@@oe@26-8-2013 1000001200980@unknown@formal@none@1@S@⌊#Assign randomly to each point coefficients for being in the clusters.#⌋@@@@1@11@@oe@26-8-2013 1000001200990@unknown@formal@none@1@S@⌊#Repeat until the algorithm has converged (that is, the coefficients' change between two iterations is no more than ⌊×\\epsilon×⌋, the given sensitivity threshold) :@@@@1@24@@oe@26-8-2013 1000001201000@unknown@formal@none@1@S@⌊•⌊#Compute the centroid for each cluster, using the formula above.#⌋@@@@1@10@@oe@26-8-2013 1000001201010@unknown@formal@none@1@S@⌊#For each point, compute its coefficients of being in the clusters, using the formula above.#⌋•⌋#⌋•⌋@@@@1@15@@oe@26-8-2013 1000001201020@unknown@formal@none@1@S@The algorithm minimizes intra-cluster variance as well, but has the same problems as ⌊/k/⌋-means, the minimum is a local minimum, and the results depend on the initial choice of weights.@@@@1@30@@oe@26-8-2013 1000001201030@unknown@formal@none@1@S@The ⌊>Expectation-maximization algorithm>⌋ is a more statistically formalized method which includes some of these ideas: partial membership in classes.@@@@1@19@@oe@26-8-2013 1000001201040@unknown@formal@none@1@S@It has better convergence properties and is in general preferred to fuzzy-c-means.@@@@1@12@@oe@26-8-2013 1000001201050@unknown@formal@none@1@S@⌊=QT clustering algorithm¦4=⌋@@@@1@3@@oe@26-8-2013 1000001201060@unknown@formal@none@1@S@QT (quality threshold) clustering (Heyer et al, 1999) is an alternative method of partitioning data, invented for gene clustering.@@@@1@19@@oe@26-8-2013 1000001201070@unknown@formal@none@1@S@It requires more computing power than ⌊/k/⌋-means, but does not require specifying the number of clusters ⌊/a priori/⌋, and always returns the same result when run several times.@@@@1@28@@oe@26-8-2013 1000001201080@unknown@formal@none@1@S@The algorithm is:@@@@1@3@@oe@26-8-2013 1000001201090@unknown@formal@none@1@S@⌊•⌊#The user chooses a maximum diameter for clusters.#⌋@@@@1@8@@oe@26-8-2013 1000001201100@unknown@formal@none@1@S@⌊#Build a candidate cluster for each point by including the closest point, the next closest, and so on, until the diameter of the cluster surpasses the threshold.#⌋@@@@1@27@@oe@26-8-2013 1000001201110@unknown@formal@none@1@S@⌊#Save the candidate cluster with the most points as the first true cluster, and remove all points in the cluster from further consideration.@@@@1@23@@oe@26-8-2013 1000001201120@unknown@formal@none@1@S@Must clarify what happens if more than 1 cluster has the maximum number of points ?#⌋@@@@1@16@@oe@26-8-2013 1000001201130@unknown@formal@none@1@S@⌊#⌊>Recurse>⌋ with the reduced set of points.#⌋•⌋@@@@1@7@@oe@26-8-2013 1000001201140@unknown@formal@none@1@S@The distance between a point and a group of points is computed using complete linkage, i.e. as the maximum distance from the point to any member of the group (see the "Agglomerative hierarchical clustering" section about distance between clusters).@@@@1@39@@oe@26-8-2013 1000001201150@unknown@formal@none@1@S@⌊=Locality-sensitive hashing¦3=⌋@@@@1@2@@oe@26-8-2013 1000001201160@unknown@formal@none@1@S@⌊>Locality-sensitive hashing>⌋ can be used for clustering.@@@@1@7@@oe@26-8-2013 1000001201170@unknown@formal@none@1@S@Feature space vectors are sets, and the metric used is the ⌊>Jaccard distance>⌋.@@@@1@13@@oe@26-8-2013 1000001201180@unknown@formal@none@1@S@The feature space can be considered high-dimensional.@@@@1@7@@oe@26-8-2013 1000001201190@unknown@formal@none@1@S@The ⌊/min-wise independent permutations/⌋ LSH scheme (sometimes MinHash) is then used to put similar items into buckets.@@@@1@17@@oe@26-8-2013 1000001201200@unknown@formal@none@1@S@With just one set of hashing methods, there are only clusters of very similar elements.@@@@1@15@@oe@26-8-2013 1000001201210@unknown@formal@none@1@S@By seeding the hash functions several times (eg 20), it is possible to get bigger clusters.@@@@1@16@@oe@26-8-2013 1000001201220@unknown@formal@none@1@S@⌊=Graph-theoretic methods¦3=⌋@@@@1@2@@oe@26-8-2013 1000001201230@unknown@formal@none@1@S@⌊>Formal concept analysis>⌋ is a technique for generating clusters of objects and attributes, given a ⌊>bipartite graph>⌋ representing the relations between the objects and attributes.@@@@1@25@@oe@26-8-2013 1000001201240@unknown@formal@none@1@S@Other methods for generating ⌊/overlapping clusters/⌋ (a ⌊>cover>⌋ rather than a ⌊>partition>⌋) are discussed by Jardine and Sibson (1968) and Cole and Wishart (1970).@@@@1@24@@oe@26-8-2013 1000001201250@unknown@formal@none@1@S@⌊=Elbow criterion¦2=⌋@@@@1@2@@oe@26-8-2013 1000001201260@unknown@formal@none@1@S@The elbow criterion is a common ⌊>rule of thumb>⌋ to determine what number of clusters should be chosen, for example for ⌊/k/⌋-means and agglomerative hierarchical clustering.@@@@1@26@@oe@26-8-2013 1000001201270@unknown@formal@none@1@S@It should also be noted that the initial assignment of cluster seeds has bearing on the final model performance.@@@@1@19@@oe@26-8-2013 1000001201280@unknown@formal@none@1@S@Thus, it is appropriate to re-run the cluster analysis multiple times.@@@@1@11@@oe@26-8-2013 1000001201290@unknown@formal@none@1@S@The elbow criterion says that you should choose a number of clusters so that adding another cluster doesn't add sufficient information.@@@@1@21@@oe@26-8-2013 1000001201300@unknown@formal@none@1@S@More precisely, if you graph the percentage of variance explained by the clusters against the number of clusters, the first clusters will add much information (explain a lot of variance), but at some point the marginal gain will drop, giving an angle in the graph (the elbow).@@@@1@47@@oe@26-8-2013 1000001201310@unknown@formal@none@1@S@This elbow cannot always be unambiguously identified.@@@@1@7@@oe@26-8-2013 1000001201320@unknown@formal@none@1@S@Percentage of variance explained is the ratio of the between-group variance to the total variance.@@@@1@15@@oe@26-8-2013 1000001201330@unknown@formal@none@1@S@On the following graph, the elbow is indicated by the red circle.@@@@1@12@@oe@26-8-2013 1000001201340@unknown@formal@none@1@S@The number of clusters chosen should therefore be 4.@@@@1@9@@oe@26-8-2013 1000001201350@unknown@formal@none@1@S@⌊=Spectral clustering¦2=⌋@@@@1@2@@oe@26-8-2013 1000001201360@unknown@formal@none@1@S@Given a set of data points A, the ⌊>similarity matrix>⌋ may be defined as a matrix ⌊×S×⌋ where ⌊×S_{ij}×⌋ represents a measure of the similarity between points ⌊×i, j\\in A×⌋.@@@@1@30@@oe@26-8-2013 1000001201370@unknown@formal@none@1@S@Spectral clustering techniques make use of the ⌊>spectrum>⌋ of the similarity matrix of the data to perform ⌊>dimensionality reduction>⌋ for clustering in fewer dimensions.@@@@1@24@@oe@26-8-2013 1000001201380@unknown@formal@none@1@S@One such technique is the ⌊/⌊>Shi-Malik algorithm>⌋/⌋, commonly used for ⌊>image segmentation>⌋.@@@@1@12@@oe@26-8-2013 1000001201390@unknown@formal@none@1@S@It partitions points into two sets ⌊×(S_1,S_2)×⌋ based on the ⌊>eigenvector>⌋ ⌊×v×⌋ corresponding to the second-smallest ⌊>eigenvalue>⌋ of the ⌊>Laplacian matrix>⌋@@@@1@21@@oe@26-8-2013 1000001201400@unknown@formal@none@1@S@⌊⇥⌊×L = I - D^{-1/2}SD^{-1/2}×⌋⇥⌋@@@@1@5@@oe@26-8-2013 1000001201410@unknown@formal@none@1@S@of ⌊×S×⌋, where ⌊×D×⌋ is the diagonal matrix@@@@1@8@@oe@26-8-2013 1000001201420@unknown@formal@none@1@S@⌊⇥⌊×D_{ii} = \\sum_{j} S_{ij}.×⌋⇥⌋@@@@1@4@@oe@26-8-2013 1000001201430@unknown@formal@none@1@S@This partitioning may be done in various ways, such as by taking the median ⌊×m×⌋ of the components in ⌊×v×⌋, and placing all points whose component in ⌊×v×⌋ is greater than ⌊×m×⌋ in ⌊×S_1×⌋, and the rest in ⌊×S_2×⌋.@@@@1@39@@oe@26-8-2013 1000001201440@unknown@formal@none@1@S@The algorithm can be used for hierarchical clustering by repeatedly partitioning the subsets in this fashion.@@@@1@16@@oe@26-8-2013 1000001201450@unknown@formal@none@1@S@A related algorithm is the ⌊/⌊>Meila-Shi algorithm>⌋/⌋, which takes the ⌊>eigenvector>⌋s corresponding to the ⌊/k/⌋ largest ⌊>eigenvalue>⌋s of the matrix ⌊×P = SD^{-1}×⌋ for some ⌊/k/⌋, and then invokes another (e.g. ⌊/k/⌋-means) to cluster points by their respective ⌊/k/⌋ components in these eigenvectors.@@@@1@43@@oe@26-8-2013 1000001201460@unknown@formal@none@1@S@⌊=Applications¦2=⌋@@@@1@1@@oe@26-8-2013 1000001201470@unknown@formal@none@1@S@⌊=Biology¦3=⌋@@@@1@1@@oe@26-8-2013 1000001201480@unknown@formal@none@1@S@In ⌊>biology>⌋ ⌊∗clustering∗⌋ has many applications@@@@1@6@@oe@26-8-2013 1000001201490@unknown@formal@none@1@S@⌊•⌊#In imaging, data clustering may take different form based on the data dimensionality.@@@@1@13@@oe@26-8-2013 1000001201500@unknown@formal@none@1@S@For example, the ⌊> SOCR EM Mixture model segmentation activity and applet>⌋ shows how to obtain point, region or volume classification using the online ⌊>SOCR>⌋ computational libraries.#⌋@@@@1@27@@oe@26-8-2013 1000001201510@unknown@formal@none@1@S@⌊#In the fields of ⌊>plant>⌋ and ⌊>animal>⌋ ⌊>ecology>⌋, clustering is used to describe and to make spatial and temporal comparisons of communities (assemblages) of organisms in heterogeneous environments; it is also used in ⌊>plant systematics>⌋ to generate artificial ⌊>phylogenies>⌋ or clusters of organisms (individuals) at the species, genus or higher level that share a number of attributes#⌋@@@@1@57@@oe@26-8-2013 1000001201520@unknown@formal@none@1@S@⌊#In computational biology and ⌊>bioinformatics>⌋:@@@@1@5@@oe@26-8-2013 1000001201530@unknown@formal@none@1@S@⌊•⌊#In ⌊>transcriptomics>⌋, clustering is used to build groups of ⌊>genes>⌋ with related expression patterns (also known as coexpressed genes).@@@@1@19@@oe@26-8-2013 1000001201540@unknown@formal@none@1@S@Often such groups contain functionally related proteins, such as ⌊>enzyme>⌋s for a specific ⌊>pathway>⌋, or genes that are co-regulated.@@@@1@19@@oe@26-8-2013 1000001201550@unknown@formal@none@1@S@High throughput experiments using ⌊>expressed sequence tag>⌋s (ESTs) or ⌊>DNA microarray>⌋s can be a powerful tool for ⌊>genome annotation>⌋, a general aspect of ⌊>genomics>⌋.#⌋@@@@1@24@@oe@26-8-2013 1000001201560@unknown@formal@none@1@S@⌊#In ⌊>sequence analysis>⌋, clustering is used to group homologous sequences into ⌊>gene families>⌋.@@@@1@13@@oe@26-8-2013 1000001201570@unknown@formal@none@1@S@This is a very important concept in bioinformatics, and ⌊>evolutionary biology>⌋ in general.@@@@1@13@@oe@26-8-2013 1000001201580@unknown@formal@none@1@S@See evolution by ⌊>gene duplication>⌋.#⌋@@@@1@5@@oe@26-8-2013 1000001201590@unknown@formal@none@1@S@⌊#In high-throughput genotyping platforms clustering algorithms are used to automatically assign ⌊>genotypes>⌋.#⌋•⌋#⌋•⌋@@@@1@12@@oe@26-8-2013 1000001201600@unknown@formal@none@1@S@⌊=Medicine¦3=⌋@@@@1@1@@oe@26-8-2013 1000001201610@unknown@formal@none@1@S@In ⌊>medical imaging>⌋, such as ⌊>PET scans>⌋, cluster analysis can be used to differentiate between different types of ⌊>tissue>⌋ and ⌊>blood>⌋ in a three dimensional image.@@@@1@26@@oe@26-8-2013 1000001201620@unknown@formal@none@1@S@In this application, actual position does not matter, but the ⌊>voxel>⌋ intensity is considered as a ⌊>vector>⌋, with a dimension for each image that was taken over time.@@@@1@28@@oe@26-8-2013 1000001201630@unknown@formal@none@1@S@This technique allows, for example, accurate measurement of the rate a radioactive tracer is delivered to the area of interest, without a separate sampling of ⌊>arterial>⌋ blood, an intrusive technique that is most common today.@@@@1@35@@oe@26-8-2013 1000001201640@unknown@formal@none@1@S@⌊=Market research¦3=⌋@@@@1@2@@oe@26-8-2013 1000001201650@unknown@formal@none@1@S@Cluster analysis is widely used in ⌊>market research>⌋ when working with multivariate data from ⌊>surveys>⌋ and test panels.@@@@1@18@@oe@26-8-2013 1000001201660@unknown@formal@none@1@S@Market researchers use cluster analysis to partition the general ⌊>population>⌋ of ⌊>consumers>⌋ into market segments and to better understand the relationships between different groups of consumers/potential ⌊>customers>⌋.@@@@1@27@@oe@26-8-2013 1000001201670@unknown@formal@none@1@S@⌊•⌊#Segmenting the market and determining ⌊>target market>⌋s#⌋@@@@1@7@@oe@26-8-2013 1000001201680@unknown@formal@none@1@S@⌊#⌊>Product positioning>⌋#⌋@@@@1@2@@oe@26-8-2013 1000001201690@unknown@formal@none@1@S@⌊#⌊>New product development>⌋#⌋@@@@1@3@@oe@26-8-2013 1000001201700@unknown@formal@none@1@S@⌊#Selecting test markets (see : ⌊>experimental techniques>⌋)#⌋•⌋@@@@1@7@@oe@26-8-2013 1000001201710@unknown@formal@none@1@S@⌊=Other applications¦3=⌋@@@@1@2@@oe@26-8-2013 1000001201720@unknown@formal@none@1@S@⌊∗Social network analysis∗⌋: In the study of ⌊>social networks>⌋, clustering may be used to recognize ⌊>communities>⌋ within large groups of people.@@@@1@21@@oe@26-8-2013 1000001201730@unknown@formal@none@1@S@⌊∗Image segmentation∗⌋: Clustering can be used to divide a ⌊>digital>⌋ ⌊>image>⌋ into distinct regions for ⌊>border detection>⌋ or ⌊>object recognition>⌋.@@@@1@20@@oe@26-8-2013 1000001201740@unknown@formal@none@1@S@⌊∗Data mining∗⌋: Many ⌊>data mining>⌋ applications involve partitioning data items into related subsets; the marketing applications discussed above represent some examples.@@@@1@21@@oe@26-8-2013 1000001201750@unknown@formal@none@1@S@Another common application is the division of documents, such as ⌊>World Wide Web>⌋ pages, into genres.@@@@1@16@@oe@26-8-2013 1000001201760@unknown@formal@none@1@S@⌊∗Search result grouping∗⌋: In the process of intelligent grouping of the files and websites, clustering may be used to create a more relevant set of search results compared to normal search engines like ⌊>Google>⌋.@@@@1@34@@oe@26-8-2013 1000001201770@unknown@formal@none@1@S@There are currently a number of web based clustering tools such as ⌊>Clusty>⌋.@@@@1@13@@oe@26-8-2013 1000001201780@unknown@formal@none@1@S@⌊∗Slippy map optimization∗⌋: ⌊>Flickr>⌋'s map of photos and other map sites use clustering to reduce the number of markers on a map.@@@@1@22@@oe@26-8-2013 1000001201790@unknown@formal@none@1@S@This makes it both faster and reduces the amount of visual clutter.@@@@1@12@@oe@26-8-2013 1000001201800@unknown@formal@none@1@S@⌊∗IMRT segmentation∗⌋: Clustering can be used to divide a fluence map into distinct regions for conversion into deliverable fields in MLC-based Radiation Therapy.@@@@1@23@@oe@26-8-2013 1000001201810@unknown@formal@none@1@S@⌊∗Grouping of Shopping Items∗⌋: Clustering can be used to group all the shopping items available on the web into a set of unique products.@@@@1@24@@oe@26-8-2013 1000001201820@unknown@formal@none@1@S@For example, all the items on eBay can be grouped into unique products.@@@@1@13@@oe@26-8-2013 1000001201830@unknown@formal@none@1@S@(eBay doesn't have the concept of a SKU)@@@@1@8@@oe@26-8-2013 1000001201840@unknown@formal@none@1@S@⌊∗⌊>Mathematical chemistry>⌋∗⌋: To find structural similarity, etc., for example, 3000 chemical compounds were clustered in the space of 90 ⌊>topological indices>⌋.@@@@1@21@@oe@26-8-2013 1000001201850@unknown@formal@none@1@S@⌊∗Petroleum Geology∗⌋: Cluster Analysis is used to reconstruct missing bottom hole core data or missing log curves in order to evaluate reservoir properties.@@@@1@23@@oe@26-8-2013 1000001201860@unknown@formal@none@1@S@⌊=Comparisons between data clusterings¦2=⌋@@@@1@4@@oe@26-8-2013 1000001201870@unknown@formal@none@1@S@There have been several suggestions for a measure of similarity between two clusterings.@@@@1@13@@oe@26-8-2013 1000001201880@unknown@formal@none@1@S@Such a measure can be used to compare how well different data clustering algorithms perform on a set of data.@@@@1@20@@oe@26-8-2013 1000001201890@unknown@formal@none@1@S@Many of these measures are derived from the ⌊>matching matrix>⌋ (aka ⌊>confusion matrix>⌋), e.g., the ⌊>Rand measure>⌋ and the Fowlkes-Mallows ⌊/B/⌋⌊/⌊,k,⌋/⌋ measures.@@@@1@22@@oe@26-8-2013 1000001201900@unknown@formal@none@1@S@⌊>Marina Meila>⌋'s Variation of Information metric is a more recent approach for measuring distance between clusterings.@@@@1@16@@oe@26-8-2013 1000001201910@unknown@formal@none@1@S@It uses ⌊>mutual information>⌋ and ⌊>entropy>⌋ to approximate the distance between two clusterings across the lattice of possible clusterings.@@@@1@19@@oe@26-8-2013 1000001201920@unknown@formal@none@1@S@⌊=Algorithms¦2=⌋@@@@1@1@@oe@26-8-2013 1000001201930@unknown@formal@none@1@S@In recent years considerable effort has been put into improving algorithm performance (Z. Huang, 1998).@@@@1@15@@oe@26-8-2013 1000001201940@unknown@formal@none@1@S@Among the most popular are ⌊/CLARANS/⌋ (Ng and Han,1994), ⌊/⌊>DBSCAN>⌋/⌋ (Ester et al., 1996) and ⌊/BIRCH/⌋ (Zhang et al., 1996).@@@@1@20@@oe@26-8-2013 1000001300010@unknown@formal@none@1@S@⌊δComputational linguisticsδ⌋@@@@1@2@@oe@26-8-2013 1000001300020@unknown@formal@none@1@S@⌊∗Computational linguistics∗⌋ is an ⌊>interdisciplinary>⌋ field dealing with the ⌊>statistical>⌋ and/or rule-based modeling of ⌊>natural language>⌋ from a computational perspective.@@@@1@20@@oe@26-8-2013 1000001300030@unknown@formal@none@1@S@This modeling is not limited to any particular field of ⌊>linguistics>⌋.@@@@1@11@@oe@26-8-2013 1000001300040@unknown@formal@none@1@S@Traditionally, computational linguistics was usually performed by ⌊>computer scientist>⌋s who had specialized in the application of computers to the processing of a ⌊>natural language>⌋.@@@@1@24@@oe@26-8-2013 1000001300050@unknown@formal@none@1@S@Recent research has shown that human language is much more complex than previously thought, so computational linguists often work as members of interdisciplinary teams, including linguists (specifically trained in linguistics), language experts (persons with some level of ability in the languages relevant to a given project), and computer scientists.@@@@1@49@@oe@26-8-2013 1000001300060@unknown@formal@none@1@S@In general computational linguistics draws upon the involvement of linguists, ⌊>computer scientists>⌋, experts in ⌊>artificial intelligence>⌋, ⌊>cognitive psychologists>⌋, ⌊>math>⌋ematicians, and ⌊>logic>⌋ians, amongst others.@@@@1@23@@oe@26-8-2013 1000001300070@unknown@formal@none@1@S@⌊=Origins¦2=⌋@@@@1@1@@oe@26-8-2013 1000001300080@unknown@formal@none@1@S@Computational linguistics as a field predates ⌊>artificial intelligence>⌋, a field under which it is often grouped.@@@@1@16@@oe@26-8-2013 1000001300090@unknown@formal@none@1@S@Computational linguistics originated with efforts in the ⌊>United States>⌋ in the 1950s to use computers to automatically translate texts from foreign languages, particularly ⌊>Russian>⌋ scientific journals, into English.@@@@1@28@@oe@26-8-2013 1000001300100@unknown@formal@none@1@S@Since computers had proven their ability to do ⌊>arithmetic>⌋ much faster and more accurately than humans, it was thought to be only a short matter of time before the technical details could be taken care of that would allow them the same remarkable capacity to process language.@@@@1@47@@oe@26-8-2013 1000001300110@unknown@formal@none@1@S@When ⌊>machine translation>⌋ (also known as mechanical translation) failed to yield accurate translations right away, automated processing of human languages was recognized as far more complex than had originally been assumed.@@@@1@31@@oe@26-8-2013 1000001300120@unknown@formal@none@1@S@Computational linguistics was born as the name of the new field of study devoted to developing ⌊>algorithm>⌋s and ⌊>software>⌋ for intelligently processing language data.@@@@1@24@@oe@26-8-2013 1000001300130@unknown@formal@none@1@S@When artificial intelligence came into existence in the 1960s, the field of computational linguistics became that sub-division of artificial intelligence dealing with human-level comprehension and production of natural languages.@@@@1@29@@oe@26-8-2013 1000001300140@unknown@formal@none@1@S@In order to translate one language into another, it was observed that one had to understand the ⌊>grammar>⌋ of both languages, including both ⌊>morphology>⌋ (the grammar of word forms) and ⌊>syntax>⌋ (the grammar of sentence structure).@@@@1@36@@oe@26-8-2013 1000001300150@unknown@formal@none@1@S@In order to understand syntax, one had to also understand the ⌊>semantics>⌋ and the ⌊>lexicon>⌋ (or 'vocabulary'), and even to understand something of the ⌊>pragmatics>⌋ of language use.@@@@1@28@@oe@26-8-2013 1000001300160@unknown@formal@none@1@S@Thus, what started as an effort to translate between languages evolved into an entire discipline devoted to understanding how to represent and process natural languages using computers.@@@@1@27@@oe@26-8-2013 1000001300170@unknown@formal@none@1@S@⌊=Subfields¦2=⌋@@@@1@1@@oe@26-8-2013 1000001300180@unknown@formal@none@1@S@Computational linguistics can be divided into major areas depending upon the medium of the language being processed, whether spoken or textual; and upon the task being performed, whether analyzing language (recognition) or synthesizing language (generation).@@@@1@35@@oe@26-8-2013 1000001300190@unknown@formal@none@1@S@⌊>Speech recognition>⌋ and ⌊>speech synthesis>⌋ deal with how spoken language can be understood or created using computers.@@@@1@17@@oe@26-8-2013 1000001300200@unknown@formal@none@1@S@Parsing and generation are sub-divisions of computational linguistics dealing respectively with taking language apart and putting it together.@@@@1@18@@oe@26-8-2013 1000001300210@unknown@formal@none@1@S@Machine translation remains the sub-division of computational linguistics dealing with having computers translate between languages.@@@@1@15@@oe@26-8-2013 1000001300220@unknown@formal@none@1@S@Some of the areas of research that are studied by computational linguistics include:@@@@1@13@@oe@26-8-2013 1000001300230@unknown@formal@none@1@S@⌊•⌊#Computer aided ⌊>corpus linguistics>⌋#⌋@@@@1@4@@oe@26-8-2013 1000001300240@unknown@formal@none@1@S@⌊#Design of ⌊>parser>⌋s or ⌊>chunkers>⌋ for ⌊>natural language>⌋s#⌋@@@@1@8@@oe@26-8-2013 1000001300250@unknown@formal@none@1@S@⌊#Design of taggers like ⌊>POS-taggers (part-of-speech taggers)>⌋#⌋@@@@1@7@@oe@26-8-2013 1000001300260@unknown@formal@none@1@S@⌊#Definition of specialized logics like resource logics for ⌊>NLP>⌋#⌋@@@@1@9@@oe@26-8-2013 1000001300270@unknown@formal@none@1@S@⌊#Research in the relation between formal and natural languages in general#⌋@@@@1@11@@oe@26-8-2013 1000001300280@unknown@formal@none@1@S@⌊#⌊>Machine translation>⌋, e.g. by a translating computer#⌋@@@@1@7@@oe@26-8-2013 1000001300290@unknown@formal@none@1@S@⌊#⌊>Computational complexity>⌋ of natural language, largely modeled on ⌊>automata theory>⌋, with the application of ⌊>context-sensitive grammar>⌋ and ⌊>linearly-bounded>⌋ ⌊>Turing machine>⌋s.#⌋•⌋@@@@1@20@@oe@26-8-2013 1000001300300@unknown@formal@none@1@S@The ⌊>Association for Computational Linguistics>⌋ defines computational linguistics as:@@@@1@9@@oe@26-8-2013 1000001300310@unknown@formal@none@1@S@⌊⇥...the scientific study of ⌊>language>⌋ from a computational perspective.@@@@1@9@@oe@26-8-2013 1000001300320@unknown@formal@none@1@S@Computational linguists are interested in providing ⌊>computational model>⌋s of various kinds of linguistic phenomena.⇥⌋@@@@1@14@@oe@26-8-2013 1000001400010@unknown@formal@none@1@S@⌊δComputer programδ⌋@@@@1@2@@oe@26-8-2013 1000001400020@unknown@formal@none@1@S@⌊∗Computer programs∗⌋ (also ⌊∗⌊>software programs>⌋∗⌋, or just ⌊∗programs∗⌋) are ⌊>instructions>⌋ for a ⌊>computer>⌋.@@@@1@13@@oe@26-8-2013 1000001400030@unknown@formal@none@1@S@A computer requires programs to function, and a computer program does nothing unless its instructions are executed by a ⌊>central processor>⌋.@@@@1@21@@oe@26-8-2013 1000001400040@unknown@formal@none@1@S@Computer programs are usually ⌊>executable>⌋ programs or the ⌊>source code>⌋ from which executable programs are derived (e.g., ⌊>compiled>⌋).@@@@1@18@@oe@26-8-2013 1000001400050@unknown@formal@none@1@S@Computer source code is often written by professional ⌊>computer programmer>⌋s.@@@@1@10@@oe@26-8-2013 1000001400060@unknown@formal@none@1@S@Source code is written in a ⌊>programming language>⌋ that usually follows one of two main ⌊>paradigms>⌋: ⌊>imperative>⌋ or ⌊>declarative>⌋ programming.@@@@1@20@@oe@26-8-2013 1000001400070@unknown@formal@none@1@S@Source code may be converted into an ⌊>executable file>⌋ (sometimes called an executable program) by a ⌊>compiler>⌋.@@@@1@17@@oe@26-8-2013 1000001400080@unknown@formal@none@1@S@Alternatively, computer programs may be executed by a ⌊>central processing unit>⌋ with the aid of an ⌊>interpreter>⌋, or may be ⌊>embedded>⌋ directly into ⌊>hardware>⌋.@@@@1@24@@oe@26-8-2013 1000001400090@unknown@formal@none@1@S@Computer programs may be categorized along functional lines: ⌊>system software>⌋ and ⌊>application software>⌋.@@@@1@13@@oe@26-8-2013 1000001400100@unknown@formal@none@1@S@And many computer programs may run simultaneously on a single computer, a process known as ⌊>multitasking>⌋.@@@@1@16@@oe@26-8-2013 1000001400110@unknown@formal@none@1@S@⌊=Programming¦2=⌋@@@@1@1@@oe@26-8-2013 1000001400120@unknown@formal@none@1@S@main() {output_string("Hello world!");}@@@@1@3@@oe@26-8-2013 1000001400130@unknown@formal@none@1@S@Source code of a program written in the ⌊>C programming language>⌋@@@@1@11@@oe@26-8-2013 1000001400140@unknown@formal@none@1@S@⌊>Computer programming>⌋ is the iterative process of writing or editing ⌊>source code>⌋.@@@@1@12@@oe@26-8-2013 1000001400150@unknown@formal@none@1@S@Editing source code involves testing, analyzing, and refining.@@@@1@8@@oe@26-8-2013 1000001400160@unknown@formal@none@1@S@A person who practices this skill is referred to as a computer ⌊>programmer>⌋ or software developer.@@@@1@16@@oe@26-8-2013 1000001400170@unknown@formal@none@1@S@The sometimes lengthy process of computer programming is usually referred to as ⌊>software development>⌋.@@@@1@14@@oe@26-8-2013 1000001400180@unknown@formal@none@1@S@The term ⌊>software engineering>⌋ is becoming popular as the process is seen as an ⌊>engineering>⌋ discipline.@@@@1@16@@oe@26-8-2013 1000001400190@unknown@formal@none@1@S@⌊=Paradigms¦3=⌋@@@@1@1@@oe@26-8-2013 1000001400200@unknown@formal@none@1@S@Computer programs can be categorized by the ⌊>programming language>⌋ ⌊>paradigm>⌋ used to produce them.@@@@1@14@@oe@26-8-2013 1000001400210@unknown@formal@none@1@S@Two of the main paradigms are ⌊>imperative>⌋ and ⌊>declarative>⌋.@@@@1@9@@oe@26-8-2013 1000001400220@unknown@formal@none@1@S@Programs written using an imperative language specify an ⌊>algorithm>⌋ using declarations, expressions, and statements.@@@@1@14@@oe@26-8-2013 1000001400230@unknown@formal@none@1@S@A declaration associates a ⌊>variable>⌋ name with a ⌊>datatype>⌋.@@@@1@9@@oe@26-8-2013 1000001400240@unknown@formal@none@1@S@For example: ⌊◊ var x: integer; ◊⌋.@@@@1@7@@oe@26-8-2013 1000001400250@unknown@formal@none@1@S@An expression yields a value.@@@@1@5@@oe@26-8-2013 1000001400260@unknown@formal@none@1@S@For example: ⌊◊ 2 + 2 ◊⌋ yields 4.@@@@1@9@@oe@26-8-2013 1000001400270@unknown@formal@none@1@S@Finally, a statement might assign an expression to a variable or use the value of a variable to alter the program's control flow.@@@@1@23@@oe@26-8-2013 1000001400280@unknown@formal@none@1@S@For example: ⌊◊x := 2 + 2; if x = 4 then do_something();◊⌋@@@@1@13@@oe@26-8-2013 1000001400290@unknown@formal@none@1@S@One criticism of imperative languages is the side-effect of an assignment statement on a class of variables called non-local variables.@@@@1@20@@oe@26-8-2013 1000001400300@unknown@formal@none@1@S@Programs written using a declarative language specify the properties that have to be met by the output and do not specify any implementation details.@@@@1@24@@oe@26-8-2013 1000001400310@unknown@formal@none@1@S@Two broad categories of declarative languages are ⌊>functional language>⌋s and ⌊>logical language>⌋s.@@@@1@12@@oe@26-8-2013 1000001400320@unknown@formal@none@1@S@The principle behind functional languages (like ⌊>Haskell>⌋) is to not allow side-effects, which makes it easier to reason about programs like mathematical functions.@@@@1@23@@oe@26-8-2013 1000001400330@unknown@formal@none@1@S@The principle behind logical languages (like ⌊>Prolog>⌋) is to define the problem to be solved — the goal — and leave the detailed solution to the Prolog system itself.@@@@1@29@@oe@26-8-2013 1000001400340@unknown@formal@none@1@S@The goal is defined by providing a list of subgoals.@@@@1@10@@oe@26-8-2013 1000001400350@unknown@formal@none@1@S@Then each subgoal is defined by further providing a list of its subgoals, etc.@@@@1@14@@oe@26-8-2013 1000001400360@unknown@formal@none@1@S@If a path of subgoals fails to find a solution, then that subgoal is ⌊>backtracked>⌋ and another path is systematically attempted.@@@@1@21@@oe@26-8-2013 1000001400370@unknown@formal@none@1@S@The form in which a program is created may be textual or visual.@@@@1@13@@oe@26-8-2013 1000001400380@unknown@formal@none@1@S@In a ⌊>visual language>⌋ program, elements are graphically manipulated rather than textually specified.@@@@1@13@@oe@26-8-2013 1000001400390@unknown@formal@none@1@S@⌊=Compilation or interpretation¦3=⌋@@@@1@3@@oe@26-8-2013 1000001400400@unknown@formal@none@1@S@A ⌊/computer program/⌋ in the form of a ⌊>human-readable>⌋, computer programming language is called ⌊>source code>⌋.@@@@1@16@@oe@26-8-2013 1000001400410@unknown@formal@none@1@S@Source code may be converted into an ⌊>executable image>⌋ by a ⌊>compiler>⌋ or executed immediately with the aid of an ⌊>interpreter>⌋.@@@@1@21@@oe@26-8-2013 1000001400420@unknown@formal@none@1@S@Compiled computer programs are commonly referred to as executables, binary images, or simply as ⌊>binaries>⌋ — a reference to the ⌊>binary>⌋ ⌊>file format>⌋ used to store the executable code.@@@@1@29@@oe@26-8-2013 1000001400430@unknown@formal@none@1@S@Compilers are used to translate source code from a programming language into either ⌊>object code>⌋ or ⌊>machine code>⌋.@@@@1@18@@oe@26-8-2013 1000001400440@unknown@formal@none@1@S@Object code needs further processing to become machine code, and machine code is the ⌊>Central Processing Unit>⌋'s native ⌊>code>⌋, ready for execution.@@@@1@22@@oe@26-8-2013 1000001400450@unknown@formal@none@1@S@Interpreted computer programs are either decoded and then immediately executed or are decoded into some efficient intermediate representation for future execution.@@@@1@21@@oe@26-8-2013 1000001400460@unknown@formal@none@1@S@⌊>BASIC>⌋, ⌊>Perl>⌋, and ⌊>Python>⌋ are examples of immediately executed computer programs.@@@@1@11@@oe@26-8-2013 1000001400470@unknown@formal@none@1@S@Alternatively, ⌊>Java>⌋ computer programs are compiled ahead of time and stored as a machine independent code called ⌊>bytecode>⌋.@@@@1@18@@oe@26-8-2013 1000001400480@unknown@formal@none@1@S@Bytecode is then executed upon request by an interpreter called a ⌊>virtual machine>⌋.@@@@1@13@@oe@26-8-2013 1000001400490@unknown@formal@none@1@S@The main disadvantage of interpreters is computer programs run slower than if compiled.@@@@1@13@@oe@26-8-2013 1000001400500@unknown@formal@none@1@S@Interpreting code is slower than running the compiled version because the interpreter must ⌊>decode>⌋ each ⌊>statement>⌋ each time it is loaded and then perform the desired action.@@@@1@27@@oe@26-8-2013 1000001400510@unknown@formal@none@1@S@On the other hand, software development may be quicker using an interpreter because testing is immediate when the compilation step is omitted.@@@@1@22@@oe@26-8-2013 1000001400520@unknown@formal@none@1@S@Another disadvantage of interpreters is the interpreter must be present on the computer at the time the computer program is executed.@@@@1@21@@oe@26-8-2013 1000001400530@unknown@formal@none@1@S@Alternatively, compiled computer programs need not have the compiler present at the time of execution.@@@@1@15@@oe@26-8-2013 1000001400540@unknown@formal@none@1@S@No properties of a programming language require it to be exclusively compiled or exclusively interpreted.@@@@1@15@@oe@26-8-2013 1000001400550@unknown@formal@none@1@S@The categorization usually reflects the most popular method of language execution.@@@@1@11@@oe@26-8-2013 1000001400560@unknown@formal@none@1@S@For example, BASIC is thought of as an interpreted language and C a compiled language, despite the existence of BASIC compilers and C interpreters.@@@@1@24@@oe@26-8-2013 1000001400570@unknown@formal@none@1@S@⌊=Self-modifying programs¦3=⌋@@@@1@2@@oe@26-8-2013 1000001400580@unknown@formal@none@1@S@A computer program in ⌊>execution>⌋ is normally treated as being different from the ⌊>data>⌋ the program operates on.@@@@1@18@@oe@26-8-2013 1000001400590@unknown@formal@none@1@S@However, in some cases this distinction is blurred when a computer program modifies itself.@@@@1@14@@oe@26-8-2013 1000001400600@unknown@formal@none@1@S@The modified computer program is subsequently executed as part of the same program.@@@@1@13@@oe@26-8-2013 1000001400610@unknown@formal@none@1@S@⌊>Self-modifying code>⌋ is possible for programs written in ⌊>Lisp>⌋, ⌊>COBOL>⌋, and ⌊>Prolog>⌋.@@@@1@12@@oe@26-8-2013 1000001400620@unknown@formal@none@1@S@⌊=Execution and storage¦2=⌋@@@@1@3@@oe@26-8-2013 1000001400630@unknown@formal@none@1@S@Typically, computer programs are stored in ⌊>non-volatile memory>⌋ until requested either directly or indirectly to be ⌊>executed>⌋ by the computer user.@@@@1@21@@oe@26-8-2013 1000001400640@unknown@formal@none@1@S@Upon such a request, the program is loaded into ⌊>random access memory>⌋, by a computer program called an ⌊>operating system>⌋, where it can be accessed directly by the central processor.@@@@1@30@@oe@26-8-2013 1000001400650@unknown@formal@none@1@S@The central processor then executes ("runs") the program, instruction by instruction, until termination.@@@@1@13@@oe@26-8-2013 1000001400660@unknown@formal@none@1@S@A program in execution is called a ⌊>process>⌋.@@@@1@8@@oe@26-8-2013 1000001400670@unknown@formal@none@1@S@Termination is either by normal self-termination or by error — software or hardware error.@@@@1@14@@oe@26-8-2013 1000001400680@unknown@formal@none@1@S@⌊=Embedded programs¦3=⌋@@@@1@2@@oe@26-8-2013 1000001400690@unknown@formal@none@1@S@Some computer programs are embedded into hardware.@@@@1@7@@oe@26-8-2013 1000001400700@unknown@formal@none@1@S@A ⌊>stored-program computer>⌋ requires an initial computer program stored in its ⌊>read-only memory>⌋ to ⌊>boot>⌋.@@@@1@15@@oe@26-8-2013 1000001400710@unknown@formal@none@1@S@The boot process is to identify and initialize all aspects of the system, from ⌊>CPU registers>⌋ to ⌊>device controllers>⌋ to ⌊>memory>⌋ contents.@@@@1@22@@oe@26-8-2013 1000001400720@unknown@formal@none@1@S@Following the initialization process, this initial computer program loads the ⌊>operating system>⌋ and sets the ⌊>program counter>⌋ to begin normal operations.@@@@1@21@@oe@26-8-2013 1000001400730@unknown@formal@none@1@S@Independent of the host computer, a ⌊>hardware device>⌋ might have embedded ⌊>firmware>⌋ to control its operation.@@@@1@16@@oe@26-8-2013 1000001400740@unknown@formal@none@1@S@Firmware is used when the computer program is rarely or never expected to change, or when the program must not be lost when the power is off.@@@@1@27@@oe@26-8-2013 1000001400750@unknown@formal@none@1@S@⌊=Manual programming¦3=⌋@@@@1@2@@oe@26-8-2013 1000001400760@unknown@formal@none@1@S@Computer programs historically were manually input to the central processor via switches.@@@@1@12@@oe@26-8-2013 1000001400770@unknown@formal@none@1@S@An instruction was represented by a configuration of on/off settings.@@@@1@10@@oe@26-8-2013 1000001400780@unknown@formal@none@1@S@After setting the configuration, an execute button was pressed.@@@@1@9@@oe@26-8-2013 1000001400790@unknown@formal@none@1@S@This process was then repeated.@@@@1@5@@oe@26-8-2013 1000001400800@unknown@formal@none@1@S@Computer programs also historically were manually input via ⌊>paper tape>⌋ or ⌊>punched cards>⌋.@@@@1@13@@oe@26-8-2013 1000001400810@unknown@formal@none@1@S@After the medium was loaded, the starting address was set via switches and the execute button pressed.@@@@1@17@@oe@26-8-2013 1000001400820@unknown@formal@none@1@S@⌊=Automatic program generation¦3=⌋@@@@1@3@@oe@26-8-2013 1000001400830@unknown@formal@none@1@S@⌊>Generative programming>⌋ is a style of ⌊>computer programming>⌋ that creates ⌊>source code>⌋ through ⌊>generic>⌋ ⌊>classes>⌋, ⌊>prototypes>⌋, ⌊>template>⌋s, ⌊>aspect>⌋s, and ⌊>code generator>⌋s to improve ⌊>programmer>⌋ productivity.@@@@1@25@@oe@26-8-2013 1000001400840@unknown@formal@none@1@S@Source code is generated with ⌊>programming tool>⌋s such as a ⌊>template processor>⌋ or an ⌊>Integrated Development Environment>⌋.@@@@1@17@@oe@26-8-2013 1000001400850@unknown@formal@none@1@S@The simplest form of source code generator is a ⌊>macro>⌋ processor, such as the ⌊>C preprocessor>⌋, which replaces patterns in source code according to relatively simple rules.@@@@1@27@@oe@26-8-2013 1000001400860@unknown@formal@none@1@S@⌊>Software engine>⌋s output source code or ⌊>markup code>⌋ that simultaneously become the input to another ⌊>computer process>⌋.@@@@1@17@@oe@26-8-2013 1000001400870@unknown@formal@none@1@S@The analogy is that of one process driving another process, with the computer code being burned as fuel.@@@@1@18@@oe@26-8-2013 1000001400880@unknown@formal@none@1@S@⌊>Application server>⌋s are software engines that deliver applications to ⌊>client computer>⌋s.@@@@1@11@@oe@26-8-2013 1000001400890@unknown@formal@none@1@S@For example, a ⌊>Wiki>⌋ is an application server that allows users to build ⌊>dynamic content>⌋ assembled from ⌊>articles>⌋.@@@@1@18@@oe@26-8-2013 1000001400900@unknown@formal@none@1@S@Wikis generate ⌊>HTML>⌋, ⌊>CSS>⌋, ⌊>Java>⌋, and ⌊>Javascript>⌋ which are then ⌊>interpreted>⌋ by a ⌊>web browser>⌋.@@@@1@15@@oe@26-8-2013 1000001400910@unknown@formal@none@1@S@⌊=Simultaneous execution¦3=⌋@@@@1@2@@oe@26-8-2013 1000001400920@unknown@formal@none@1@S@Many operating systems support ⌊>multitasking>⌋ which enables many computer programs to appear to be running simultaneously on a single computer.@@@@1@20@@oe@26-8-2013 1000001400930@unknown@formal@none@1@S@Operating systems may run multiple programs through ⌊>process scheduling>⌋ — a software mechanism to ⌊>switch>⌋ the CPU among processes frequently so that users can ⌊>interact>⌋ with each program while it is running.@@@@1@32@@oe@26-8-2013 1000001400940@unknown@formal@none@1@S@Within hardware, modern day multiprocessor computers or computers with multicore processors may run multiple programs.@@@@1@15@@oe@26-8-2013 1000001400950@unknown@formal@none@1@S@⌊=Functional categories¦2=⌋@@@@1@2@@oe@26-8-2013 1000001400960@unknown@formal@none@1@S@Computer programs may be categorized along functional lines.@@@@1@8@@oe@26-8-2013 1000001400970@unknown@formal@none@1@S@These functional categories are ⌊>system software>⌋ and ⌊>application software>⌋.@@@@1@9@@oe@26-8-2013 1000001400980@unknown@formal@none@1@S@System software includes the ⌊>operating system>⌋ which couples the ⌊>computer's hardware>⌋ with the application software.@@@@1@15@@oe@26-8-2013 1000001400990@unknown@formal@none@1@S@The purpose of the operating system is to provide an environment in which application software executes in a convenient and efficient manner.@@@@1@22@@oe@26-8-2013 1000001401000@unknown@formal@none@1@S@In addition to the operating system, system software includes ⌊>utility programs>⌋ that help manage and tune the computer.@@@@1@18@@oe@26-8-2013 1000001401010@unknown@formal@none@1@S@If a computer program is not system software then it is application software.@@@@1@13@@oe@26-8-2013 1000001401020@unknown@formal@none@1@S@Application software includes ⌊>middleware>⌋, which couples the system software with the ⌊>user interface>⌋.@@@@1@13@@oe@26-8-2013 1000001401030@unknown@formal@none@1@S@Application software also includes utility programs that help users solve application problems, like the need for sorting.@@@@1@17@@oe@26-8-2013 1000001500010@unknown@formal@none@1@S@⌊δComputer scienceδ⌋@@@@1@2@@oe@26-8-2013 1000001500020@unknown@formal@none@1@S@⌊∗Computer science∗⌋ (or ⌊∗computing science∗⌋) is the study and the ⌊>science>⌋ of the theoretical foundations of ⌊>information>⌋ and ⌊>computation>⌋ and their implementation and application in ⌊>computer system>⌋s.@@@@1@27@@oe@26-8-2013 1000001500030@unknown@formal@none@1@S@Computer science has many sub-fields; some emphasize the computation of specific results (such as ⌊>computer graphics>⌋), while others relate to properties of ⌊>computational problem>⌋s (such as ⌊>computational complexity theory>⌋).@@@@1@29@@oe@26-8-2013 1000001500040@unknown@formal@none@1@S@Still others focus on the challenges in implementing computations.@@@@1@9@@oe@26-8-2013 1000001500050@unknown@formal@none@1@S@For example, ⌊>programming language theory>⌋ studies approaches to describing computations, while ⌊>computer programming>⌋ applies specific ⌊>programming language>⌋s to solve specific computational problems.@@@@1@22@@oe@26-8-2013 1000001500060@unknown@formal@none@1@S@A further subfield, ⌊>human-computer interaction>⌋, focuses on the challenges in making computers and computations useful, usable and universally accessible to ⌊>people>⌋.@@@@1@21@@oe@26-8-2013 1000001500070@unknown@formal@none@1@S@⌊=History¦2=⌋@@@@1@1@@oe@26-8-2013 1000001500080@unknown@formal@none@1@S@The early foundations of what would become computer science predate the invention of the modern ⌊>digital computer>⌋.@@@@1@17@@oe@26-8-2013 1000001500090@unknown@formal@none@1@S@Machines for calculating fixed numerical tasks, such as the ⌊>abacus>⌋, have existed since antiquity.@@@@1@14@@oe@26-8-2013 1000001500100@unknown@formal@none@1@S@⌊>Wilhelm Schickard>⌋ built the first mechanical calculator in 1623.@@@@1@9@@oe@26-8-2013 1000001500110@unknown@formal@none@1@S@⌊>Charles Babbage>⌋ designed a ⌊>difference engine>⌋ in ⌊>Victorian>⌋ times (between 1837 and 1901) helped by ⌊>Ada Lovelace>⌋.@@@@1@17@@oe@26-8-2013 1000001500120@unknown@formal@none@1@S@Around 1900, the ⌊>IBM>⌋ corporation sold ⌊>punch-card machines>⌋.@@@@1@8@@oe@26-8-2013 1000001500130@unknown@formal@none@1@S@However, all of these machines were constrained to perform a single task, or at best some subset of all possible tasks.@@@@1@21@@oe@26-8-2013 1000001500140@unknown@formal@none@1@S@During the 1940s, as newer and more powerful computing machines were developed, the term ⌊/computer/⌋ came to refer to the machines rather than their human predecessors.@@@@1@26@@oe@26-8-2013 1000001500150@unknown@formal@none@1@S@As it became clear that computers could be used for more than just mathematical calculations, the field of computer science broadened to study ⌊>computation>⌋ in general.@@@@1@26@@oe@26-8-2013 1000001500160@unknown@formal@none@1@S@Computer science began to be established as a distinct academic discipline in the 1960s, with the creation of the first computer science departments and degree programs.@@@@1@26@@oe@26-8-2013 1000001500170@unknown@formal@none@1@S@Since practical computers became available, many applications of computing have become distinct areas of study in their own right.@@@@1@19@@oe@26-8-2013 1000001500180@unknown@formal@none@1@S@Many initially believed it impossible that "computers themselves could actually be a scientific field of study" (Levy 1984, p. 11), though it was in the "late fifties" (Levy 1984, p.11) that it gradually became accepted among the greater academic population.@@@@1@40@@oe@26-8-2013 1000001500190@unknown@formal@none@1@S@It is the now well-known IBM brand that formed part of the computer science revolution during this time.@@@@1@18@@oe@26-8-2013 1000001500200@unknown@formal@none@1@S@'IBM' (short for International Business Machines) released the IBM 704 and later the IBM 709 computers, which were widely used during the exploration period of such devices.@@@@1@27@@oe@26-8-2013 1000001500210@unknown@formal@none@1@S@"Still, working with the IBM [computer] was frustrating...if you had misplaced as much as one letter in one instruction, the program would crash, and you would have to start the whole process over again" (Levy 1984, p.13).@@@@1@37@@oe@26-8-2013 1000001500220@unknown@formal@none@1@S@During the late 1950s, the computer science discipline was very much in its developmental stages, and such issues were commonplace.@@@@1@20@@oe@26-8-2013 1000001500230@unknown@formal@none@1@S@Time has seen significant improvements in the useability and effectiveness of computer science technology.@@@@1@14@@oe@26-8-2013 1000001500240@unknown@formal@none@1@S@Modern society has seen a significant shift from computers being used solely by experts or professionals to a more widespread user base.@@@@1@22@@oe@26-8-2013 1000001500250@unknown@formal@none@1@S@By the 1990s, computers became accepted as being the norm within everyday life.@@@@1@13@@oe@26-8-2013 1000001500260@unknown@formal@none@1@S@During this time data entry was a primary component of the use of computers, many preferring to streamline their business practices through the use of a computer.@@@@1@27@@oe@26-8-2013 1000001500270@unknown@formal@none@1@S@This also gave the additional benefit of removing the need of large amounts of documentation and file records which consumed much-needed physical space within offices.@@@@1@25@@oe@26-8-2013 1000001500280@unknown@formal@none@1@S@⌊=Major achievements¦2=⌋@@@@1@2@@oe@26-8-2013 1000001500290@unknown@formal@none@1@S@Despite its relatively short history as a formal academic discipline, computer science has made a number of fundamental contributions to ⌊>science>⌋ and ⌊>society>⌋.@@@@1@23@@oe@26-8-2013 1000001500300@unknown@formal@none@1@S@These include:@@@@1@2@@oe@26-8-2013 1000001500310@unknown@formal@none@1@S@⌊:Applications within computer science:⌋@@@@1@4@@oe@26-8-2013 1000001500320@unknown@formal@none@1@S@⌊•⌊#A formal definition of ⌊>computation>⌋ and ⌊>computability>⌋, and proof that there are computationally ⌊>unsolvable>⌋ and ⌊>intractable>⌋ problems.#⌋@@@@1@17@@oe@26-8-2013 1000001500330@unknown@formal@none@1@S@⌊#The concept of a ⌊>programming language>⌋, a tool for the precise expression of methodological information at various levels of abstraction.#⌋•⌋@@@@1@20@@oe@26-8-2013 1000001500340@unknown@formal@none@1@S@⌊:Applications outside of computing:⌋@@@@1@4@@oe@26-8-2013 1000001500350@unknown@formal@none@1@S@⌊•⌊#Sparked the ⌊>Digital Revolution>⌋ which led to the current ⌊>Information Age>⌋ and the ⌊>Internet>⌋.#⌋@@@@1@14@@oe@26-8-2013 1000001500360@unknown@formal@none@1@S@⌊#In ⌊>cryptography>⌋, ⌊>breaking the Enigma machine>⌋ was an important factor contributing to the Allied victory in World War II.#⌋@@@@1@19@@oe@26-8-2013 1000001500370@unknown@formal@none@1@S@⌊#⌊>Scientific computing>⌋ enabled advanced study of the mind and mapping the human genome was possible with ⌊>Human Genome Project>⌋.@@@@1@19@@oe@26-8-2013 1000001500380@unknown@formal@none@1@S@⌊>Distributed computing>⌋ projects like ⌊>Folding\shome>⌋ explore ⌊>protein folding>⌋.#⌋@@@@1@8@@oe@26-8-2013 1000001500390@unknown@formal@none@1@S@⌊#⌊>Algorithmic trading>⌋ has increased the ⌊>efficiency>⌋ and ⌊>liquidity>⌋ of financial markets by using ⌊>artificial intelligence>⌋, ⌊>machine learning>⌋ and other ⌊>statistical>⌋ and ⌊>numerical>⌋ techniques on a large scale.#⌋•⌋@@@@1@27@@oe@26-8-2013 1000001500400@unknown@formal@none@1@S@⌊=Relationship with other fields¦2=⌋@@@@1@4@@oe@26-8-2013 1000001500410@unknown@formal@none@1@S@Despite its name, a significant amount of computer science does not involve the study of computers themselves.@@@@1@17@@oe@26-8-2013 1000001500420@unknown@formal@none@1@S@Because of this, several alternative names have been proposed.@@@@1@9@@oe@26-8-2013 1000001500430@unknown@formal@none@1@S@Danish scientist ⌊>Peter Naur>⌋ suggested the term ⌊/datalogy/⌋, to reflect the fact that the scientific discipline revolves around data and data treatment, while not necessarily involving computers.@@@@1@27@@oe@26-8-2013 1000001500440@unknown@formal@none@1@S@The first scientific institution to use the term was the Department of Datalogy at the University of Copenhagen, founded in 1969, with Peter Naur being the first professor in datalogy.@@@@1@30@@oe@26-8-2013 1000001500450@unknown@formal@none@1@S@The term is used mainly in the Scandinavian countries.@@@@1@9@@oe@26-8-2013 1000001500460@unknown@formal@none@1@S@Also, in the early days of computing, a number of terms for the and practitioners of the field of computing were suggested in the ⌊/Communications are of the ACM/⌋—⌊/turingineer/⌋, ⌊/turologist/⌋, ⌊/flow-charts-man/⌋, ⌊/applied meta-mathematician/⌋, and ⌊/applied epistemologist/⌋.@@@@1@36@@oe@26-8-2013 1000001500470@unknown@formal@none@1@S@Three months later in the same journal, ⌊/comptologist/⌋ was suggested, followed next year by ⌊/hypologist/⌋.@@@@1@15@@oe@26-8-2013 1000001500480@unknown@formal@none@1@S@Recently the term ⌊/computics/⌋ has been suggested.@@@@1@7@@oe@26-8-2013 1000001500490@unknown@formal@none@1@S@⌊/Informatik/⌋ was a term used in Europe with more frequency.@@@@1@10@@oe@26-8-2013 1000001500500@unknown@formal@none@1@S@The renowned computer scientist ⌊>Edsger Dijkstra>⌋ stated, "Computer science is no more about computers than astronomy is about telescopes."@@@@1@19@@oe@26-8-2013 1000001500510@unknown@formal@none@1@S@The design and deployment of computers and computer systems is generally considered the province of disciplines other than computer science.@@@@1@20@@oe@26-8-2013 1000001500520@unknown@formal@none@1@S@For example, the study of ⌊>computer hardware>⌋ is usually considered part of ⌊>computer engineering>⌋, while the study of commercial ⌊>computer system>⌋s and their deployment is often called ⌊>information technology>⌋ or ⌊>information systems>⌋.@@@@1@32@@oe@26-8-2013 1000001500530@unknown@formal@none@1@S@Computer science is sometimes criticized as being insufficiently scientific, a view espoused in the statement "Science is to computer science as hydrodynamics is to plumbing", credited to ⌊>Stan Kelly-Bootle>⌋ and others.@@@@1@31@@oe@26-8-2013 1000001500540@unknown@formal@none@1@S@However, there has been much cross-fertilization of ideas between the various computer-related disciplines.@@@@1@13@@oe@26-8-2013 1000001500550@unknown@formal@none@1@S@Computer science research has also often crossed into other disciplines, such as ⌊>cognitive science>⌋, ⌊>economics>⌋, ⌊>mathematics>⌋, ⌊>physics>⌋ (see ⌊>quantum computing>⌋), and ⌊>linguistics>⌋.@@@@1@22@@oe@26-8-2013 1000001500560@unknown@formal@none@1@S@Computer science is considered by some to have a much closer relationship with ⌊>mathematics>⌋ than many scientific disciplines.@@@@1@18@@oe@26-8-2013 1000001500570@unknown@formal@none@1@S@Early computer science was strongly influenced by the work of mathematicians such as ⌊>Kurt Gödel>⌋ and ⌊>Alan Turing>⌋, and there continues to be a useful interchange of ideas between the two fields in areas such as ⌊>mathematical logic>⌋, ⌊>category theory>⌋, ⌊>domain theory>⌋, and ⌊>algebra>⌋.@@@@1@44@@oe@26-8-2013 1000001500580@unknown@formal@none@1@S@The relationship between computer science and ⌊>software engineering>⌋ is a contentious issue, which is further muddied by ⌊>disputes>⌋ over what the term "software engineering" means, and how computer science is defined.@@@@1@31@@oe@26-8-2013 1000001500590@unknown@formal@none@1@S@⌊>David Parnas>⌋, taking a cue from the relationship between other engineering and science disciplines, has claimed that the principal focus of computer science is studying the properties of computation in general, while the principal focus of software engineering is the design of specific computations to achieve practical goals, making the two separate but complementary disciplines.@@@@1@55@@oe@26-8-2013 1000001500600@unknown@formal@none@1@S@The academic, political, and funding aspects of computer science tend to have roots as to whether a department in the U.S. formed with either a mathematical emphasis or an engineering emphasis.@@@@1@31@@oe@26-8-2013 1000001500610@unknown@formal@none@1@S@In general, electrical engineering-based computer science departments have tended to succeed as computer science and/or engineering departments.@@@@1@17@@oe@26-8-2013 1000001500620@unknown@formal@none@1@S@Computer science departments with a mathematics emphasis and with a numerical orientation consider alignment ⌊>computational science>⌋.@@@@1@16@@oe@26-8-2013 1000001500630@unknown@formal@none@1@S@Both types of departments tend to make efforts to bridge the field educationally if not across all research.@@@@1@18@@oe@26-8-2013 1000001500640@unknown@formal@none@1@S@⌊=Fields of computer science¦2=⌋@@@@1@4@@oe@26-8-2013 1000001500650@unknown@formal@none@1@S@Computer science searches for concepts and ⌊>formal proof>⌋s to explain and describe computational systems of interest.@@@@1@16@@oe@26-8-2013 1000001500660@unknown@formal@none@1@S@As with all sciences, these theories can then be utilised to synthesize practical engineering applications, which in turn may suggest new systems to be studied and analysed.@@@@1@27@@oe@26-8-2013 1000001500670@unknown@formal@none@1@S@While the ⌊>ACM Computing Classification System>⌋ can be used to split computer science up into different topics of fields, a more descriptive breakdown follows:@@@@1@24@@oe@26-8-2013 1000001500680@unknown@formal@none@1@S@⌊=Mathematical foundations¦3=⌋@@@@1@2@@oe@26-8-2013 1000001500690@unknown@formal@none@1@S@⌊:⌊>Mathematical logic>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001500700@unknown@formal@none@1@S@⌊⇥Boolean logic and other ways of modeling logical queries; the uses and limitations of formal proof methods.⇥⌋@@@@1@17@@oe@26-8-2013 1000001500710@unknown@formal@none@1@S@⌊:⌊>Number theory>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001500720@unknown@formal@none@1@S@⌊⇥Theory of proofs and heuristics for finding proofs in the simple domain of integers.@@@@1@14@@oe@26-8-2013 1000001500730@unknown@formal@none@1@S@Used in ⌊>cryptography>⌋ as well as a test domain in ⌊>artificial intelligence>⌋.⇥⌋@@@@1@12@@oe@26-8-2013 1000001500740@unknown@formal@none@1@S@⌊:⌊>Graph theory>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001500750@unknown@formal@none@1@S@⌊⇥Foundations for data structures and searching algorithms.⇥⌋@@@@1@7@@oe@26-8-2013 1000001500760@unknown@formal@none@1@S@⌊:⌊>Type theory>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001500770@unknown@formal@none@1@S@⌊⇥Formal analysis of the types of data, and the use of these types to understand properties of programs, especially program safety.⇥⌋@@@@1@21@@oe@26-8-2013 1000001500780@unknown@formal@none@1@S@⌊:⌊>Category theory>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001500790@unknown@formal@none@1@S@⌊⇥Category theory provides a means of capturing all of math and computation in a single synthesis.⇥⌋@@@@1@16@@oe@26-8-2013 1000001500800@unknown@formal@none@1@S@⌊:⌊>Computational geometry>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001500810@unknown@formal@none@1@S@⌊⇥The study of ⌊>algorithm>⌋s to solve problems stated in terms of ⌊>geometry>⌋.⇥⌋@@@@1@12@@oe@26-8-2013 1000001500820@unknown@formal@none@1@S@⌊:⌊>Numerical analysis>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001500830@unknown@formal@none@1@S@⌊⇥Foundations for algorithms in discrete mathematics, as well as the study of the limitations of floating point computation, including ⌊>round-off>⌋ errors.⇥⌋@@@@1@21@@oe@26-8-2013 1000001500840@unknown@formal@none@1@S@⌊=Theory of computation¦3=⌋@@@@1@3@@oe@26-8-2013 1000001500850@unknown@formal@none@1@S@⌊:⌊>Automata theory>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001500860@unknown@formal@none@1@S@⌊⇥Different logical structures for solving problems.⇥⌋@@@@1@6@@oe@26-8-2013 1000001500870@unknown@formal@none@1@S@⌊:⌊>Computability theory>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001500880@unknown@formal@none@1@S@⌊⇥What is calculable with the current models of computers.@@@@1@9@@oe@26-8-2013 1000001500890@unknown@formal@none@1@S@Proofs developed by ⌊>Alan Turing>⌋ and others provide insight into the possibilities of what can be computed and what cannot.⇥⌋@@@@1@20@@oe@26-8-2013 1000001500900@unknown@formal@none@1@S@⌊:⌊>Computational complexity theory>⌋:⌋@@@@1@3@@oe@26-8-2013 1000001500910@unknown@formal@none@1@S@⌊⇥Fundamental bounds (especially time and storage space) on classes of computations; in practice, study of which problems a computer can solve with reasonable resources (while computability theory studies which problems can be solved at all).⇥⌋@@@@1@35@@oe@26-8-2013 1000001500920@unknown@formal@none@1@S@⌊:⌊>Quantum computing theory>⌋:⌋@@@@1@3@@oe@26-8-2013 1000001500930@unknown@formal@none@1@S@⌊⇥Representation and manipulation of data using the quantum properties of particles and quantum mechanism.⇥⌋@@@@1@14@@oe@26-8-2013 1000001500940@unknown@formal@none@1@S@⌊=Algorithms and data structures¦3=⌋@@@@1@4@@oe@26-8-2013 1000001500950@unknown@formal@none@1@S@⌊:⌊>Analysis of algorithms>⌋:⌋@@@@1@3@@oe@26-8-2013 1000001500960@unknown@formal@none@1@S@⌊⇥Time and space complexity of algorithms.⇥⌋@@@@1@6@@oe@26-8-2013 1000001500970@unknown@formal@none@1@S@⌊:⌊>Algorithms>⌋:⌋@@@@1@1@@oe@26-8-2013 1000001500980@unknown@formal@none@1@S@⌊⇥Formal logical processes used for computation, and the efficiency of these processes.⇥⌋@@@@1@12@@oe@26-8-2013 1000001500990@unknown@formal@none@1@S@⌊=Programming languages and compilers¦3=⌋@@@@1@4@@oe@26-8-2013 1000001501000@unknown@formal@none@1@S@⌊:⌊>Compiler>⌋s:⌋@@@@1@1@@oe@26-8-2013 1000001501010@unknown@formal@none@1@S@⌊⇥Ways of translating computer programs, usually from ⌊>higher level>⌋ languages to ⌊>lower level>⌋ ones.⇥⌋@@@@1@14@@oe@26-8-2013 1000001501020@unknown@formal@none@1@S@⌊:⌊>Interpreter>⌋s:⌋@@@@1@1@@oe@26-8-2013 1000001501030@unknown@formal@none@1@S@⌊⇥A program that takes in as input a computer program and executes it.⇥⌋@@@@1@13@@oe@26-8-2013 1000001501040@unknown@formal@none@1@S@⌊:⌊>Programming language>⌋s:⌋@@@@1@2@@oe@26-8-2013 1000001501050@unknown@formal@none@1@S@⌊⇥Formal language paradigms for expressing algorithms, and the properties of these languages (e.g., what problems they are suited to solve).⇥⌋@@@@1@20@@oe@26-8-2013 1000001501060@unknown@formal@none@1@S@⌊=Concurrent, parallel, and distributed systems¦3=⌋@@@@1@5@@oe@26-8-2013 1000001501070@unknown@formal@none@1@S@⌊:⌊>Concurrency>⌋:⌋@@@@1@1@@oe@26-8-2013 1000001501080@unknown@formal@none@1@S@⌊⇥The theory and practice of simultaneous computation; data safety in any multitasking or multithreaded environment.⇥⌋@@@@1@15@@oe@26-8-2013 1000001501090@unknown@formal@none@1@S@⌊:⌊>Distributed computing>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501100@unknown@formal@none@1@S@⌊⇥Computing using multiple computing devices over a network to accomplish a common objective or task and thereby reducing the latency involved in single processor contributions for any task.⇥⌋@@@@1@28@@oe@26-8-2013 1000001501110@unknown@formal@none@1@S@⌊:⌊>Parallel computing>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501120@unknown@formal@none@1@S@⌊⇥Computing using multiple concurrent threads of execution.⇥⌋@@@@1@7@@oe@26-8-2013 1000001501130@unknown@formal@none@1@S@⌊=Software engineering¦3=⌋@@@@1@2@@oe@26-8-2013 1000001501140@unknown@formal@none@1@S@⌊:⌊>Algorithm design>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501150@unknown@formal@none@1@S@⌊⇥Using ideas from algorithm theory to creatively design solutions to real tasks⇥⌋@@@@1@12@@oe@26-8-2013 1000001501160@unknown@formal@none@1@S@⌊:⌊>Computer programming>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501170@unknown@formal@none@1@S@⌊⇥The practice of using a programming language to implement algorithms⇥⌋@@@@1@10@@oe@26-8-2013 1000001501180@unknown@formal@none@1@S@⌊:⌊>Formal methods>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501190@unknown@formal@none@1@S@⌊⇥Mathematical approaches for describing and reasoning about software designs.⇥⌋@@@@1@9@@oe@26-8-2013 1000001501200@unknown@formal@none@1@S@⌊:⌊>Reverse engineering>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501210@unknown@formal@none@1@S@⌊⇥The application of the scientific method to the understanding of arbitrary existing software⇥⌋@@@@1@13@@oe@26-8-2013 1000001501220@unknown@formal@none@1@S@⌊:⌊>Software development>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501230@unknown@formal@none@1@S@⌊⇥The principles and practice of designing, developing, and testing programs, as well as proper engineering practices.⇥⌋@@@@1@16@@oe@26-8-2013 1000001501240@unknown@formal@none@1@S@⌊=System architecture¦3=⌋@@@@1@2@@oe@26-8-2013 1000001501250@unknown@formal@none@1@S@⌊:⌊>Computer architecture>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501260@unknown@formal@none@1@S@⌊⇥The design, organization, optimization and verification of a computer system, mostly about ⌊>CPU>⌋s and ⌊>memory>⌋ subsystems (and the bus connecting them).⇥⌋@@@@1@21@@oe@26-8-2013 1000001501270@unknown@formal@none@1@S@⌊:⌊>Computer organization>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501280@unknown@formal@none@1@S@⌊⇥The implementation of computer architectures, in terms of descriptions of their specific ⌊>electrical circuit>⌋ry⇥⌋@@@@1@14@@oe@26-8-2013 1000001501290@unknown@formal@none@1@S@⌊:⌊>Operating system>⌋s:⌋@@@@1@2@@oe@26-8-2013 1000001501300@unknown@formal@none@1@S@⌊⇥Systems for managing computer programs and providing the basis of a useable system.⇥⌋@@@@1@13@@oe@26-8-2013 1000001501310@unknown@formal@none@1@S@⌊=Communications¦3=⌋@@@@1@1@@oe@26-8-2013 1000001501320@unknown@formal@none@1@S@⌊:⌊>Computer audio>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501330@unknown@formal@none@1@S@⌊⇥Algorithms and data structures for the creation, manipulation, storage, and transmission of ⌊>digital audio>⌋ recordings.@@@@1@15@@oe@26-8-2013 1000001501340@unknown@formal@none@1@S@Also important in ⌊>voice recognition>⌋ applications.⇥⌋@@@@1@6@@oe@26-8-2013 1000001501350@unknown@formal@none@1@S@⌊:⌊>Networking>⌋:⌋@@@@1@1@@oe@26-8-2013 1000001501360@unknown@formal@none@1@S@⌊⇥Algorithms and protocols for communicating data across different shared or dedicated media, often including ⌊>error correction>⌋.⇥⌋@@@@1@16@@oe@26-8-2013 1000001501370@unknown@formal@none@1@S@⌊:⌊>Cryptography>⌋:⌋@@@@1@1@@oe@26-8-2013 1000001501380@unknown@formal@none@1@S@⌊⇥Applies results from complexity, probability and number theory to invent and break codes.⇥⌋@@@@1@13@@oe@26-8-2013 1000001501390@unknown@formal@none@1@S@⌊=Databases¦3=⌋@@@@1@1@@oe@26-8-2013 1000001501400@unknown@formal@none@1@S@⌊:⌊>Data mining>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501410@unknown@formal@none@1@S@⌊⇥Data mining is the extraction of relevant data from all sources of data.⇥⌋@@@@1@13@@oe@26-8-2013 1000001501420@unknown@formal@none@1@S@⌊:⌊>Relational databases>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501430@unknown@formal@none@1@S@⌊⇥Study of algorithms for searching and processing information in documents and databases; closely related to ⌊>information retrieval>⌋.⇥⌋@@@@1@17@@oe@26-8-2013 1000001501440@unknown@formal@none@1@S@⌊:⌊>OLAP>⌋:⌋@@@@1@1@@oe@26-8-2013 1000001501450@unknown@formal@none@1@S@⌊⇥Online Analytical Processing, or OLAP, is an approach to quickly provide answers to analytical queries that are multi-dimensional in nature.@@@@1@20@@oe@26-8-2013 1000001501460@unknown@formal@none@1@S@OLAP is part of the broader category ⌊>business intelligence>⌋, which also encompasses relational reporting and data mining.⇥⌋@@@@1@17@@oe@26-8-2013 1000001501470@unknown@formal@none@1@S@⌊=Artificial intelligence¦3=⌋@@@@1@2@@oe@26-8-2013 1000001501480@unknown@formal@none@1@S@⌊:⌊>Artificial intelligence>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501490@unknown@formal@none@1@S@⌊⇥The implementation and study of systems that exhibit an autonomous intelligence or behaviour of their own.⇥⌋@@@@1@16@@oe@26-8-2013 1000001501500@unknown@formal@none@1@S@⌊:⌊>Artificial life>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501510@unknown@formal@none@1@S@⌊⇥The study of digital organisms to learn about biological systems and evolution.⇥⌋@@@@1@12@@oe@26-8-2013 1000001501520@unknown@formal@none@1@S@⌊:⌊>Automated reasoning>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501530@unknown@formal@none@1@S@⌊⇥Solving engines, such as used in ⌊>Prolog>⌋, which produce steps to a result given a query on a fact and rule database.⇥⌋@@@@1@22@@oe@26-8-2013 1000001501540@unknown@formal@none@1@S@⌊:⌊>Computer vision>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501550@unknown@formal@none@1@S@⌊⇥Algorithms for identifying three dimensional objects from one or more two dimensional pictures.⇥⌋@@@@1@13@@oe@26-8-2013 1000001501560@unknown@formal@none@1@S@⌊:⌊>Machine learning>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501570@unknown@formal@none@1@S@⌊⇥Automated creation of a set of rules and axioms based on input.⇥⌋@@@@1@12@@oe@26-8-2013 1000001501580@unknown@formal@none@1@S@⌊:⌊>Natural language processing>⌋/⌊>Computational linguistics>⌋:⌋@@@@1@4@@oe@26-8-2013 1000001501590@unknown@formal@none@1@S@⌊⇥Automated understanding and generation of human language⇥⌋@@@@1@7@@oe@26-8-2013 1000001501600@unknown@formal@none@1@S@⌊:⌊>Robotics>⌋:⌋@@@@1@1@@oe@26-8-2013 1000001501610@unknown@formal@none@1@S@⌊⇥Algorithms for controlling the behavior of robots.⇥⌋@@@@1@7@@oe@26-8-2013 1000001501620@unknown@formal@none@1@S@⌊=Visual rendering (or Computer graphics)¦3=⌋@@@@1@5@@oe@26-8-2013 1000001501630@unknown@formal@none@1@S@⌊:⌊>Computer graphics>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501640@unknown@formal@none@1@S@⌊⇥Algorithms both for generating visual images synthetically, and for integrating or altering visual and spatial information sampled from the real world.⇥⌋@@@@1@21@@oe@26-8-2013 1000001501650@unknown@formal@none@1@S@⌊:⌊>Image processing>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501660@unknown@formal@none@1@S@⌊⇥Determining information from an image through computation.⇥⌋@@@@1@7@@oe@26-8-2013 1000001501670@unknown@formal@none@1@S@⌊=Human-Computer Interaction¦3=⌋@@@@1@2@@oe@26-8-2013 1000001501680@unknown@formal@none@1@S@⌊:⌊>Human computer interaction>⌋:⌋@@@@1@3@@oe@26-8-2013 1000001501690@unknown@formal@none@1@S@⌊⇥The study of making computers and computations useful, usable and universally accessible to ⌊>people>⌋, including the study and design of computer interfaces through which people use computers.⇥⌋@@@@1@27@@oe@26-8-2013 1000001501700@unknown@formal@none@1@S@⌊=Scientific computing¦3=⌋@@@@1@2@@oe@26-8-2013 1000001501710@unknown@formal@none@1@S@⌊:⌊>Bioinformatics>⌋:⌋@@@@1@1@@oe@26-8-2013 1000001501720@unknown@formal@none@1@S@⌊⇥The use of computer science to maintain, analyse, and store ⌊>biological data>⌋, and to assist in solving biological problems such as ⌊>protein folding>⌋, function prediction and ⌊>phylogeny>⌋.⇥⌋@@@@1@27@@oe@26-8-2013 1000001501730@unknown@formal@none@1@S@⌊:⌊>Cognitive Science>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501740@unknown@formal@none@1@S@⌊⇥Computational modelling of real minds⇥⌋@@@@1@5@@oe@26-8-2013 1000001501750@unknown@formal@none@1@S@⌊:⌊>Computational chemistry>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501760@unknown@formal@none@1@S@⌊⇥Computational modelling of theoretical chemistry in order to determine chemical structures and properties⇥⌋@@@@1@13@@oe@26-8-2013 1000001501770@unknown@formal@none@1@S@⌊:⌊>Computational neuroscience>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501780@unknown@formal@none@1@S@⌊⇥Computational modelling of real brains⇥⌋@@@@1@5@@oe@26-8-2013 1000001501790@unknown@formal@none@1@S@⌊:⌊>Computational physics>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501800@unknown@formal@none@1@S@⌊⇥Numerical simulations of large non-analytic systems⇥⌋@@@@1@6@@oe@26-8-2013 1000001501810@unknown@formal@none@1@S@⌊:⌊>Numerical algorithms>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501820@unknown@formal@none@1@S@⌊⇥Algorithms for the numerical solution of mathematical problems such as ⌊>root-finding>⌋, ⌊>integration>⌋, the ⌊>solution of ordinary differential equations>⌋ and the approximation/evaluation of ⌊>special functions>⌋.⇥⌋@@@@1@24@@oe@26-8-2013 1000001501830@unknown@formal@none@1@S@⌊:⌊>Symbolic mathematics>⌋:⌋@@@@1@2@@oe@26-8-2013 1000001501840@unknown@formal@none@1@S@⌊⇥Manipulation and solution of expressions in symbolic form, also known as ⌊>Computer algebra>⌋.⇥⌋@@@@1@13@@oe@26-8-2013 1000001501850@unknown@formal@none@1@S@⌊=Didactics of computer science/informatics¦3=⌋@@@@1@4@@oe@26-8-2013 1000001501860@unknown@formal@none@1@S@The subfield didactics of computer science focuses on cognitive approaches of developing competencies of computer science and specific strategies for analysis, design, implementation and evaluation of excellent lessons in computer science.@@@@1@31@@oe@26-8-2013 1000001501870@unknown@formal@none@1@S@⌊=Computer science education¦2=⌋@@@@1@3@@oe@26-8-2013 1000001501880@unknown@formal@none@1@S@Some universities teach computer science as a theoretical study of computation and algorithmic reasoning.@@@@1@14@@oe@26-8-2013 1000001501890@unknown@formal@none@1@S@These programs often feature the ⌊>theory of computation>⌋, ⌊>analysis of algorithms>⌋, ⌊>formal methods>⌋, ⌊>concurrency theory>⌋, ⌊>databases>⌋, ⌊>computer graphics>⌋ and ⌊>systems analysis>⌋, among others.@@@@1@23@@oe@26-8-2013 1000001501900@unknown@formal@none@1@S@They typically also teach ⌊>computer programming>⌋, but treat it as a vessel for the support of other fields of computer science rather than a central focus of high-level study.@@@@1@29@@oe@26-8-2013 1000001501910@unknown@formal@none@1@S@Other colleges and universities, as well as ⌊>secondary school>⌋s and vocational programs that teach computer science, emphasize the practice of advanced ⌊>computer programming>⌋ rather than the theory of algorithms and computation in their computer science curricula.@@@@1@36@@oe@26-8-2013 1000001501920@unknown@formal@none@1@S@Such curricula tend to focus on those skills that are important to workers entering the software industry.@@@@1@17@@oe@26-8-2013 1000001501930@unknown@formal@none@1@S@The practical aspects of computer programming are often referred to as ⌊>software engineering>⌋.@@@@1@13@@oe@26-8-2013 1000001501940@unknown@formal@none@1@S@However, there is a lot of ⌊>disagreement>⌋ over what the term "software engineering" actually means, and whether it is the same thing as programming.@@@@1@24@@oe@26-8-2013 1000001600010@unknown@formal@none@1@S@⌊δComputer softwareδ⌋@@@@1@2@@oe@26-8-2013 1000001600020@unknown@formal@none@1@S@⌊∗Computer software,∗⌋ or just ⌊∗software∗⌋ is a general term used to describe a collection of ⌊>computer program>⌋s, ⌊>procedures>⌋ and documentation that perform some tasks on a computer system.@@@@1@28@@oe@26-8-2013 1000001600030@unknown@formal@none@1@S@The term includes ⌊>application software>⌋ such as ⌊>word processor>⌋s which perform productive tasks for users, ⌊>system software>⌋ such as ⌊>operating system>⌋s, which interface with ⌊>hardware>⌋ to provide the necessary services for application software, and ⌊>middleware>⌋ which controls and co-ordinates ⌊>distributed systems>⌋.@@@@1@41@@oe@26-8-2013 1000001600040@unknown@formal@none@1@S@"Software" is sometimes used in a broader context to mean anything which is not hardware but which is ⌊/used/⌋ with hardware, such as film, tapes and records.@@@@1@27@@oe@26-8-2013 1000001600050@unknown@formal@none@1@S@⌊=Relationship to computer hardware¦2=⌋@@@@1@4@@oe@26-8-2013 1000001600060@unknown@formal@none@1@S@⌊>Computer>⌋ software is so called to distinguish it from ⌊>computer hardware>⌋, which encompasses the physical interconnections and devices required to store and execute (or run) the software.@@@@1@27@@oe@26-8-2013 1000001600070@unknown@formal@none@1@S@At the lowest level, software consists of a ⌊>machine language>⌋ specific to an individual processor.@@@@1@15@@oe@26-8-2013 1000001600080@unknown@formal@none@1@S@A machine language consists of groups of binary values signifying processor instructions which change the state of the computer from its preceding state.@@@@1@23@@oe@26-8-2013 1000001600090@unknown@formal@none@1@S@Software is an ordered sequence of instructions for changing the state of the computer hardware in a particular sequence.@@@@1@19@@oe@26-8-2013 1000001600100@unknown@formal@none@1@S@It is usually written in ⌊>high-level programming language>⌋s that are easier and more efficient for humans to use (closer to ⌊>natural language>⌋) than machine language.@@@@1@25@@oe@26-8-2013 1000001600110@unknown@formal@none@1@S@High-level languages are ⌊>compiled>⌋ or ⌊>interpreted>⌋ into machine language object code.@@@@1@11@@oe@26-8-2013 1000001600120@unknown@formal@none@1@S@Software may also be written in an ⌊>assembly language>⌋, essentially, a mnemonic representation of a machine language using a natural language alphabet.@@@@1@22@@oe@26-8-2013 1000001600130@unknown@formal@none@1@S@Assembly language must be assembled into object code via an ⌊>assembler>⌋.@@@@1@11@@oe@26-8-2013 1000001600140@unknown@formal@none@1@S@The term "software" was first used in this sense by ⌊>John W. Tukey>⌋ in ⌊>1958>⌋.@@@@1@15@@oe@26-8-2013 1000001600150@unknown@formal@none@1@S@In ⌊>computer science>⌋ and ⌊>software engineering>⌋, ⌊∗computer software∗⌋ is all computer programs.@@@@1@12@@oe@26-8-2013 1000001600160@unknown@formal@none@1@S@The theory that is the basis for most modern software was first proposed by ⌊>Alan Turing>⌋ in his ⌊>1935>⌋ essay ⌊/Computable numbers with an application to the Entscheidungsproblem/⌋.@@@@1@28@@oe@26-8-2013 1000001600170@unknown@formal@none@1@S@⌊=Types¦2=⌋@@@@1@1@@oe@26-8-2013 1000001600180@unknown@formal@none@1@S@Practical ⌊>computer system>⌋s divide ⌊>software system>⌋s into three major classes: ⌊>system software>⌋, ⌊>programming software>⌋ and ⌊>application software>⌋, although the distinction is arbitrary, and often blurred.@@@@1@25@@oe@26-8-2013 1000001600190@unknown@formal@none@1@S@⌊•⌊#⌊∗⌊>System software>⌋∗⌋ helps run the ⌊>computer hardware>⌋ and ⌊>computer system>⌋.@@@@1@10@@oe@26-8-2013 1000001600200@unknown@formal@none@1@S@It includes ⌊>operating system>⌋s, ⌊>device driver>⌋s, diagnostic tools, ⌊>server>⌋s, ⌊>windowing system>⌋s, ⌊>utilities>⌋ and more.@@@@1@14@@oe@26-8-2013 1000001600210@unknown@formal@none@1@S@The purpose of systems software is to insulate the applications programmer as much as possible from the details of the particular computer complex being used, especially memory and other hardware features, and such as accessory devices as communications, printers, readers, displays, keyboards, etc.#⌋@@@@1@43@@oe@26-8-2013 1000001600220@unknown@formal@none@1@S@⌊#⌊∗⌊>Programming software>⌋∗⌋ usually provides tools to assist a ⌊>programmer>⌋ in writing ⌊>computer program>⌋s, and software using different ⌊>programming language>⌋s in a more convenient way.@@@@1@24@@oe@26-8-2013 1000001600230@unknown@formal@none@1@S@The tools include ⌊>text editors>⌋, ⌊>compilers>⌋, ⌊>interpreters>⌋, ⌊>linkers>⌋, ⌊>debuggers>⌋, and so on.@@@@1@12@@oe@26-8-2013 1000001600240@unknown@formal@none@1@S@An ⌊>Integrated development environment>⌋ (IDE) merges those tools into a software bundle, and a programmer may not need to type multiple ⌊>command>⌋s for compiling, interpreting, debugging, tracing, and etc., because the IDE usually has an advanced ⌊/⌊>graphical user interface>⌋,/⌋ or GUI.#⌋@@@@1@41@@oe@26-8-2013 1000001600250@unknown@formal@none@1@S@⌊#⌊∗⌊>Application software>⌋∗⌋ allows end users to accomplish one or more specific (non-computer related) ⌊>task>⌋s.@@@@1@14@@oe@26-8-2013 1000001600260@unknown@formal@none@1@S@Typical applications include ⌊>industrial>⌋ ⌊>automation>⌋, ⌊>business software>⌋, ⌊>educational software>⌋, ⌊>medical software>⌋, ⌊>database>⌋s, and ⌊>computer games>⌋.@@@@1@15@@oe@26-8-2013 1000001600270@unknown@formal@none@1@S@Businesses are probably the biggest users of application software, but almost every field of human activity now uses some form of application software#⌋•⌋@@@@1@23@@oe@26-8-2013 1000001600280@unknown@formal@none@1@S@⌊=Program and library¦2=⌋@@@@1@3@@oe@26-8-2013 1000001600290@unknown@formal@none@1@S@A ⌊>program>⌋ may not be sufficiently complete for execution by a ⌊>computer>⌋.@@@@1@12@@oe@26-8-2013 1000001600300@unknown@formal@none@1@S@In particular, it may require additional software from a ⌊>software library>⌋ in order to be complete.@@@@1@16@@oe@26-8-2013 1000001600310@unknown@formal@none@1@S@Such a library may include software components used by ⌊>stand-alone>⌋ programs, but which cannot work on their own.@@@@1@18@@oe@26-8-2013 1000001600320@unknown@formal@none@1@S@Thus, programs may include standard routines that are common to many programs, extracted from these libraries.@@@@1@16@@oe@26-8-2013 1000001600330@unknown@formal@none@1@S@Libraries may also ⌊/include/⌋ 'stand-alone' programs which are activated by some ⌊>computer event>⌋ and/or perform some function (e.g., of computer 'housekeeping') but do not return data to their calling program.@@@@1@30@@oe@26-8-2013 1000001600340@unknown@formal@none@1@S@Libraries may be ⌊>called>⌋ by one to many other programs; programs may call zero to many other programs.@@@@1@18@@oe@26-8-2013 1000001600350@unknown@formal@none@1@S@⌊=Three layers¦2=⌋@@@@1@2@@oe@26-8-2013 1000001600360@unknown@formal@none@1@S@Users often see things differently than programmers.@@@@1@7@@oe@26-8-2013 1000001600370@unknown@formal@none@1@S@People who use modern general purpose computers (as opposed to ⌊>embedded system>⌋s, ⌊>analog computer>⌋s, ⌊>supercomputer>⌋s, etc.) usually see three layers of software performing a variety of tasks: platform, application, and user software.@@@@1@32@@oe@26-8-2013 1000001600380@unknown@formal@none@1@S@⌊:Platform software:⌋@@@@1@2@@oe@26-8-2013 1000001600390@unknown@formal@none@1@S@⌊⇥⌊>Platform>⌋ includes the ⌊>firmware>⌋, ⌊>device driver>⌋s, an ⌊>operating system>⌋, and typically a ⌊>graphical user interface>⌋ which, in total, allow a user to interact with the computer and its ⌊>peripheral>⌋s (associated equipment).@@@@1@31@@oe@26-8-2013 1000001600400@unknown@formal@none@1@S@Platform software often comes bundled with the computer.@@@@1@8@@oe@26-8-2013 1000001600410@unknown@formal@none@1@S@On a ⌊>PC>⌋ you will usually have the ability to change the platform software.⇥⌋@@@@1@14@@oe@26-8-2013 1000001600420@unknown@formal@none@1@S@⌊:Application software:⌋@@@@1@2@@oe@26-8-2013 1000001600430@unknown@formal@none@1@S@⌊⇥⌊>Application software>⌋ or Applications are what most people think of when they think of software.@@@@1@15@@oe@26-8-2013 1000001600440@unknown@formal@none@1@S@Typical examples include office suites and video games.@@@@1@8@@oe@26-8-2013 1000001600450@unknown@formal@none@1@S@Application software is often purchased separately from computer hardware.@@@@1@9@@oe@26-8-2013 1000001600460@unknown@formal@none@1@S@Sometimes applications are bundled with the computer, but that does not change the fact that they run as independent applications.@@@@1@20@@oe@26-8-2013 1000001600470@unknown@formal@none@1@S@Applications are almost always independent programs from the operating system, though they are often tailored for specific platforms.@@@@1@18@@oe@26-8-2013 1000001600480@unknown@formal@none@1@S@Most users think of compilers, databases, and other "system software" as applications.⇥⌋@@@@1@12@@oe@26-8-2013 1000001600490@unknown@formal@none@1@S@⌊:User-written software:⌋@@@@1@2@@oe@26-8-2013 1000001600500@unknown@formal@none@1@S@⌊⇥⌊>End-user development>⌋ tailors systems to meet users' specific needs.@@@@1@9@@oe@26-8-2013 1000001600510@unknown@formal@none@1@S@User software include spreadsheet templates, word processor macros, scientific simulations, and scripts for graphics and animations.@@@@1@16@@oe@26-8-2013 1000001600520@unknown@formal@none@1@S@Even email filters are a kind of user software.@@@@1@9@@oe@26-8-2013 1000001600530@unknown@formal@none@1@S@Users create this software themselves and often overlook how important it is.@@@@1@12@@oe@26-8-2013 1000001600540@unknown@formal@none@1@S@Depending on how competently the user-written software has been integrated into purchased application packages, many users may not be aware of the distinction between the purchased packages, and what has been added by fellow co-workers.⇥⌋@@@@1@35@@oe@26-8-2013 1000001600550@unknown@formal@none@1@S@⌊=Operation¦2=⌋@@@@1@1@@oe@26-8-2013 1000001600560@unknown@formal@none@1@S@Computer software has to be "loaded" into the ⌊>computer's storage>⌋ (such as a ⌊/⌊>hard drive>⌋/⌋, ⌊/memory/⌋, or ⌊/⌊>RAM>⌋/⌋).@@@@1@18@@oe@26-8-2013 1000001600570@unknown@formal@none@1@S@Once the software has loaded, the computer is able to ⌊/execute/⌋ the software.@@@@1@13@@oe@26-8-2013 1000001600580@unknown@formal@none@1@S@This involves passing ⌊>instructions>⌋ from the application software, through the system software, to the ⌊>hardware>⌋ which ultimately receives the instruction as ⌊>machine code>⌋.@@@@1@23@@oe@26-8-2013 1000001600590@unknown@formal@none@1@S@Each instruction causes the computer to carry out an operation -- moving ⌊>data>⌋, carrying out a ⌊>computation>⌋, or altering the ⌊>control flow>⌋ of instructions.@@@@1@24@@oe@26-8-2013 1000001600600@unknown@formal@none@1@S@Data movement is typically from one place in memory to another.@@@@1@11@@oe@26-8-2013 1000001600610@unknown@formal@none@1@S@Sometimes it involves moving data between memory and registers which enable high-speed data access in the CPU.@@@@1@17@@oe@26-8-2013 1000001600620@unknown@formal@none@1@S@Moving data, especially large amounts of it, can be costly.@@@@1@10@@oe@26-8-2013 1000001600630@unknown@formal@none@1@S@So, this is sometimes avoided by using "pointers" to data instead.@@@@1@11@@oe@26-8-2013 1000001600640@unknown@formal@none@1@S@Computations include simple operations such as incrementing the value of a variable data element.@@@@1@14@@oe@26-8-2013 1000001600650@unknown@formal@none@1@S@More complex computations may involve many operations and data elements together.@@@@1@11@@oe@26-8-2013 1000001600660@unknown@formal@none@1@S@Instructions may be performed sequentially, conditionally, or iteratively.@@@@1@8@@oe@26-8-2013 1000001600670@unknown@formal@none@1@S@Sequential instructions are those operations that are performed one after another.@@@@1@11@@oe@26-8-2013 1000001600680@unknown@formal@none@1@S@Conditional instructions are performed such that different sets of instructions execute depending on the value(s) of some data.@@@@1@18@@oe@26-8-2013 1000001600690@unknown@formal@none@1@S@In some languages this is known as an "if" statement.@@@@1@10@@oe@26-8-2013 1000001600700@unknown@formal@none@1@S@Iterative instructions are performed repetitively and may depend on some data value.@@@@1@12@@oe@26-8-2013 1000001600710@unknown@formal@none@1@S@This is sometimes called a "loop."@@@@1@6@@oe@26-8-2013 1000001600720@unknown@formal@none@1@S@Often, one instruction may "call" another set of instructions that are defined in some other program or ⌊>module>⌋.@@@@1@18@@oe@26-8-2013 1000001600730@unknown@formal@none@1@S@When more than one computer processor is used, instructions may be executed simultaneously.@@@@1@13@@oe@26-8-2013 1000001600740@unknown@formal@none@1@S@A simple example of the way software operates is what happens when a user selects an entry such as "Copy" from a menu.@@@@1@23@@oe@26-8-2013 1000001600750@unknown@formal@none@1@S@In this case, a conditional instruction is executed to copy text from data in a 'document' area residing in memory, perhaps to an intermediate storage area known as a 'clipboard' data area.@@@@1@32@@oe@26-8-2013 1000001600760@unknown@formal@none@1@S@If a different menu entry such as "Paste" is chosen, the software may execute the instructions to copy the text from the clipboard data area to a specific location in the same or another document in memory.@@@@1@37@@oe@26-8-2013 1000001600770@unknown@formal@none@1@S@Depending on the application, even the example above could become complicated.@@@@1@11@@oe@26-8-2013 1000001600780@unknown@formal@none@1@S@The field of ⌊>software engineering>⌋ endeavors to manage the complexity of how software operates.@@@@1@14@@oe@26-8-2013 1000001600790@unknown@formal@none@1@S@This is especially true for software that operates in the context of a large or powerful ⌊>computer system>⌋.@@@@1@18@@oe@26-8-2013 1000001600800@unknown@formal@none@1@S@Currently, almost the only limitations on the use of computer software in applications is the ingenuity of the designer/programmer.@@@@1@19@@oe@26-8-2013 1000001600810@unknown@formal@none@1@S@Consequently, large areas of activities (such as playing grand master level chess) formerly assumed to be incapable of software simulation are now routinely programmed.@@@@1@24@@oe@26-8-2013 1000001600820@unknown@formal@none@1@S@The only area that has so far proved reasonably secure from software simulation is the realm of human art— especially, pleasing music and literature.@@@@1@24@@oe@26-8-2013 1000001600830@unknown@formal@none@1@S@Kinds of software by operation: ⌊>computer program>⌋ as ⌊>executable>⌋, ⌊>source code>⌋ or ⌊>script>⌋, ⌊>configuration>⌋.@@@@1@14@@oe@26-8-2013 1000001600840@unknown@formal@none@1@S@⌊=Quality and reliability¦2=⌋@@@@1@3@@oe@26-8-2013 1000001600850@unknown@formal@none@1@S@⌊>Software reliability>⌋ considers the errors, faults, and failures related to the design, implementation and operation of software.@@@@1@17@@oe@26-8-2013 1000001600860@unknown@formal@none@1@S@⌊∗See∗⌋ ⌊>Software auditing>⌋, ⌊>Software quality>⌋, ⌊>Software testing>⌋, and ⌊>Software reliability>⌋.@@@@1@10@@oe@26-8-2013 1000001600870@unknown@formal@none@1@S@⌊=License¦2=⌋@@@@1@1@@oe@26-8-2013 1000001600880@unknown@formal@none@1@S@⌊>Software license>⌋ gives the user the right to use the software in the licensed environment, some software comes with the license when purchased off the shelf, or an OEM license when bundled with hardware.@@@@1@34@@oe@26-8-2013 1000001600890@unknown@formal@none@1@S@Other software comes with a ⌊>free software licence>⌋, granting the recipient the rights to modify and redistribute the software.@@@@1@19@@oe@26-8-2013 1000001600900@unknown@formal@none@1@S@Software can also be in the form of ⌊>freeware>⌋ or ⌊>shareware>⌋.@@@@1@11@@oe@26-8-2013 1000001600910@unknown@formal@none@1@S@See also ⌊>License Management>⌋.@@@@1@4@@oe@26-8-2013 1000001600920@unknown@formal@none@1@S@⌊=Patents¦2=⌋@@@@1@1@@oe@26-8-2013 1000001600930@unknown@formal@none@1@S@The issue of ⌊>software patent>⌋s is controversial.@@@@1@7@@oe@26-8-2013 1000001600940@unknown@formal@none@1@S@Some believe that they hinder ⌊>software development>⌋, while others argue that software patents provide an important incentive to spur software innovation.@@@@1@21@@oe@26-8-2013 1000001600950@unknown@formal@none@1@S@See ⌊>software patent debate>⌋.@@@@1@4@@oe@26-8-2013 1000001600960@unknown@formal@none@1@S@⌊=Ethics and rights for software users¦2=⌋@@@@1@6@@oe@26-8-2013 1000001600970@unknown@formal@none@1@S@Being a new part of society, the idea of what rights users of software should have is not very developed.@@@@1@20@@oe@26-8-2013 1000001600980@unknown@formal@none@1@S@Some, such as the ⌊>free software community>⌋, believe that software users should be free to modify and redistribute the software they use.@@@@1@22@@oe@26-8-2013 1000001600990@unknown@formal@none@1@S@They argue that these rights are necessary so that each individual can control their computer, and so that everyone can cooperate, if they choose, to work together as a community and control the direction that software progresses in.@@@@1@38@@oe@26-8-2013 1000001601000@unknown@formal@none@1@S@Others believe that software authors should have the power to say what rights the user will get.@@@@1@17@@oe@26-8-2013 1000001601010@unknown@formal@none@1@S@⌊=Software companies and non-profit organizations¦2=⌋@@@@1@5@@oe@26-8-2013 1000001601020@unknown@formal@none@1@S@Examples of non-profit software organizations : ⌊>Free Software Foundation>⌋, ⌊>GNU Project>⌋, ⌊>Mozilla Foundation>⌋@@@@1@13@@oe@26-8-2013 1000001601030@unknown@formal@none@1@S@Examples of large software companies are: ⌊>Microsoft>⌋, ⌊>IBM>⌋, ⌊>Oracle>⌋, ⌊>SAP>⌋ and ⌊>HP>⌋.@@@@1@12@@oe@26-8-2013 1000001700010@unknown@formal@none@1@S@⌊δCorpus linguisticsδ⌋@@@@1@2@@oe@26-8-2013 1000001700020@unknown@formal@none@1@S@⌊∗Corpus linguistics∗⌋ is the ⌊>study of language>⌋ as expressed in ⌊>sample>⌋s ⌊/(⌊>corpora>⌋)/⌋ or "real world" text.@@@@1@16@@oe@26-8-2013 1000001700030@unknown@formal@none@1@S@This method represents a ⌊>digest>⌋ive approach to deriving a set of abstract rules by which a ⌊>natural language>⌋ is governed or else relates to another language.@@@@1@26@@oe@26-8-2013 1000001700040@unknown@formal@none@1@S@Originally done by hand, corpora are largely derived by an automated process, which is corrected.@@@@1@15@@oe@26-8-2013 1000001700050@unknown@formal@none@1@S@Computational methods had once been viewed as a ⌊>holy grail>⌋ of ⌊>linguistic>⌋ research, which would ultimately manifest a ⌊>ruleset>⌋ for ⌊>natural language processing>⌋ and ⌊>machine translation>⌋ at a high level.@@@@1@30@@oe@26-8-2013 1000001700060@unknown@formal@none@1@S@Such has not been the case, and since the ⌊>cognitive revolution>⌋, cognitive linguistics has been largely critical of many claimed practical uses for corpora.@@@@1@24@@oe@26-8-2013 1000001700070@unknown@formal@none@1@S@However, as ⌊>computation>⌋ capacity and speed have increased, the use of corpora to study language and term relationships en masse has gained some respectability.@@@@1@24@@oe@26-8-2013 1000001700080@unknown@formal@none@1@S@The corpus approach runs counter to ⌊>Noam Chomsky>⌋'s view that real language is riddled with performance-related errors, thus requiring careful analysis of small speech samples obtained in a highly controlled laboratory setting.@@@@1@32@@oe@26-8-2013 1000001700090@unknown@formal@none@1@S@Corpus linguistics does away with Chomsky's ⌊/competence/performance/⌋ split; adherents believe that reliable language analysis best occurs on field-collected samples, in natural contexts and with minimal experimental interference.@@@@1@27@@oe@26-8-2013 1000001700100@unknown@formal@none@1@S@⌊=History¦2=⌋@@@@1@1@@oe@26-8-2013 1000001700110@unknown@formal@none@1@S@A landmark in modern corpus linguistics was the publication by ⌊>Henry Kucera>⌋ and ⌊>Nelson Francis>⌋ of ⌊/Computational Analysis of Present-Day American English/⌋ in 1967, a work based on the analysis of the ⌊>Brown Corpus>⌋, a carefully compiled selection of current American English, totalling about a million words drawn from a wide variety of sources.@@@@1@54@@oe@26-8-2013 1000001700120@unknown@formal@none@1@S@Kucera and Francis subjected it to a variety of computational analyses, from which they compiled a rich and variegated opus, combining elements of linguistics, language teaching, ⌊>psychology>⌋, ⌊>statistics>⌋, and ⌊>sociology>⌋.@@@@1@30@@oe@26-8-2013 1000001700130@unknown@formal@none@1@S@A further key publication was ⌊>Randolph Quirk>⌋'s 'Towards a description of English Usage' (1960, Transactions of the Philological Society, 40-61) in which he introduced ⌊/The Survey of English Usage/⌋.@@@@1@29@@oe@26-8-2013 1000001700140@unknown@formal@none@1@S@Shortly thereafter, Boston publisher ⌊>Houghton-Mifflin>⌋ approached Kucera to supply a million word, three-line citation base for its new ⌊/⌊>American Heritage Dictionary>⌋/⌋, the first ⌊>dictionary>⌋ to be compiled using corpus linguistics.@@@@1@30@@oe@26-8-2013 1000001700150@unknown@formal@none@1@S@The AHD made the innovative step of combining prescriptive elements (how language ⌊/should/⌋ be used) with descriptive information (how it actually ⌊/is/⌋ used).@@@@1@23@@oe@26-8-2013 1000001700160@unknown@formal@none@1@S@Other publishers followed suit.@@@@1@4@@oe@26-8-2013 1000001700170@unknown@formal@none@1@S@The British publisher Collins' ⌊>COBUILD>⌋ ⌊>monolingual learner's dictionary>⌋, designed for users learning ⌊>English as a foreign language>⌋, was compiled using the ⌊>Bank of English>⌋.@@@@1@24@@oe@26-8-2013 1000001700180@unknown@formal@none@1@S@The ⌊>Brown Corpus>⌋ has also spawned a number of similarly structured corpora: the ⌊>LOB Corpus>⌋ (1960s ⌊>British English>⌋), Kolhapur (⌊>Indian English>⌋), Wellington (⌊>New Zealand English>⌋), Australian Corpus of English (⌊>Australian English>⌋), the Frown Corpus (⌊>early 1990s>⌋ ⌊>American English>⌋), and the FLOB Corpus (1990s British English).@@@@1@45@@oe@26-8-2013 1000001700190@unknown@formal@none@1@S@Other corpora represent many languages, varieties and modes, and include the ⌊>International Corpus of English>⌋, and the ⌊>British National Corpus>⌋, a 100 million word collection of a range of spoken and written texts, created in the 1990s by a consortium of publishers, universities (⌊>Oxford>⌋ and ⌊>Lancaster>⌋) and the ⌊>British Library>⌋.@@@@1@50@@oe@26-8-2013 1000001700200@unknown@formal@none@1@S@For contemporary American English, work has stalled on the ⌊>American National Corpus>⌋, but the 360 million word ⌊>Corpus of Contemporary American English (COCA)>⌋ (1990-present) is now available.@@@@1@27@@oe@26-8-2013 1000001700210@unknown@formal@none@1@S@⌊=Methods¦2=⌋@@@@1@1@@oe@26-8-2013 1000001700220@unknown@formal@none@1@S@This means dealing with real input data, where descriptions based on a linguist's intuition are not usually helpful.@@@@1@18@@oe@26-8-2013