10010010@unknown@formal@none@1@S@Algorithm@@@@1@1@@oe@19-8-2009
10010020@unknown@formal@none@1@S@In [[mathematics]], [[computing]], [[linguistics]] and related disciplines, an '''algorithm''' is a sequence of instructions, often used for [[calculation]] and [[data processing]].@@@@1@21@@oe@19-8-2009
10010030@unknown@formal@none@1@S@It is formally a type of [[effective method]] in which a list of well-defined instructions for completing a task will, when given an initial state, proceed through a well-defined series of successive states, eventually terminating in an end-state.@@@@1@38@@oe@19-8-2009
10010040@unknown@formal@none@1@S@The transition from one state to the next is not necessarily [[deterministic]]; some algorithms, known as [[probabilistic algorithms]], incorporate randomness.@@@@1@20@@oe@19-8-2009
10010050@unknown@formal@none@1@S@A partial formalization of the concept began with attempts to solve the [[Entscheidungsproblem]] (the "decision problem") posed by [[David Hilbert]] in 1928.@@@@1@22@@oe@19-8-2009
10010060@unknown@formal@none@1@S@Subsequent formalizations were framed as attempts to define "[[effective calculability]]" (Kleene 1943:274) or "effective method" (Rosser 1939:225); those formalizations included the Gödel-Herbrand-Kleene [[Recursion (computer science)|recursive function]]s of 1930, 1934 and 1935, [[Alonzo Church]]'s [[lambda calculus]] of 1936, [[Emil Post]]'s "Formulation I" of 1936, and [[Alan Turing]]'s [[Turing machines]] of 1936-7 and 1939.@@@@1@52@@oe@19-8-2009
10010070@unknown@formal@none@1@S@==Etymology==@@@@1@1@@oe@19-8-2009
10010080@unknown@formal@none@1@S@[[Muhammad ibn Mūsā al-Khwārizmī|Al-Khwārizmī]], [[Persian people|Persian]] [[astronomer]] and [[mathematician]], wrote a [[treatise]] in [[Arabic]] in 825 AD, ''On Calculation with Hindu Numerals''.@@@@1@22@@oe@19-8-2009
10010090@unknown@formal@none@1@S@(See [[algorism]]).@@@@1@2@@oe@19-8-2009
10010100@unknown@formal@none@1@S@It was translated into [[Latin]] in the 12th century as ''Algoritmi de numero Indorum'' (al-Daffa 1977), which title was likely intended to mean "Algoritmi on the numbers of the Indians", where "Algoritmi" was the translator's rendition of the author's name; but people misunderstanding the title treated ''Algoritmi'' as a Latin plural and this led to the word "algorithm" (Latin ''algorismus'') coming to mean "calculation method".@@@@1@65@@oe@19-8-2009
10010110@unknown@formal@none@1@S@The intrusive "th" is most likely due to a [[false cognate]] with the [[Greek language|Greek]] {{lang|grc|ἀριθμός}} (''arithmos'') meaning "number".@@@@1@19@@oe@19-8-2009
10010120@unknown@formal@none@1@S@== Why algorithms are necessary: an informal definition ==@@@@1@9@@oe@19-8-2009
10010130@unknown@formal@none@1@S@No generally accepted ''formal'' definition of "algorithm" exists yet.@@@@1@9@@oe@19-8-2009
10010140@unknown@formal@none@1@S@An informal definition could be "an algorithm is a computer program that calculates something."@@@@1@14@@oe@19-8-2009
10010150@unknown@formal@none@1@S@For some people, a program is only an algorithm if it stops eventually.@@@@1@13@@oe@19-8-2009
10010160@unknown@formal@none@1@S@For others, a program is only an algorithm if it stops before a given number of calculation steps.@@@@1@18@@oe@19-8-2009
10010170@unknown@formal@none@1@S@A prototypical example of an "algorithm" is Euclid's algorithm to determine the greatest common divisor of two integers greater than one: "subtract the smaller number from the larger one, repeat until you get a zero or a one".@@@@1@38@@oe@19-8-2009
10010180@unknown@formal@none@1@S@This procedure is known to always stop, and the number of subtractions needed is always smaller than the larger of the two numbers.@@@@1@23@@oe@19-8-2009
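The subtraction procedure described above can be sketched in Python. This is a minimal illustration, not a canonical formulation: the function name `gcd_by_subtraction` and the step counter are assumptions added here, and the loop stops when the two numbers become equal, which is the point at which one more subtraction would give zero.

```python
from typing import Tuple


def gcd_by_subtraction(a: int, b: int) -> Tuple[int, int]:
    """Return (gcd, subtractions_used) for two positive integers.

    Repeatedly replaces the larger number by the difference of the two.
    The name and the step counter are illustrative assumptions.
    """
    if a < 1 or b < 1:
        raise ValueError("both inputs must be positive integers")
    steps = 0
    while a != b:
        if a > b:
            a -= b  # subtract the smaller number from the larger one
        else:
            b -= a
        steps += 1
    return a, steps


if __name__ == "__main__":
    divisor, steps = gcd_by_subtraction(48, 18)
    print(divisor, steps)
```

Each pass strictly decreases the larger number while both stay positive, which is why the loop cannot run forever.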
10010190@unknown@formal@none@1@S@We can derive clues to the issues involved and an informal meaning of the word from the following quotation from {{Harvtxt|Boolos|Jeffrey|1974, 1999}} (boldface added):@@@@1@24@@oe@19-8-2009
10010200@unknown@formal@none@1@S@
No human being can write fast enough, or long enough, or small enough to list all members of an enumerably infinite set by writing out their names, one after another, in some notation.@@@@1@34@@oe@19-8-2009
10010210@unknown@formal@none@1@S@But humans can do something equally useful, in the case of certain enumerably infinite sets: They can give '''explicit instructions for determining the nth member of the set''', for arbitrary finite n.@@@@1@32@@oe@19-8-2009
10010220@unknown@formal@none@1@S@Such instructions are to be given quite explicitly, in a form in which '''they could be followed by a computing machine''', or by a '''human who is capable of carrying out only very elementary operations on symbols'''
@@@@1@38@@oe@19-8-2009
10010230@unknown@formal@none@1@S@The words "enumerably infinite" mean "countable using integers perhaps extending to infinity".@@@@1@12@@oe@19-8-2009
10010240@unknown@formal@none@1@S@Thus Boolos and Jeffrey are saying that an algorithm ''implies'' instructions for a process that "creates" output integers from an ''arbitrary'' "input" integer or integers that, in theory, can be chosen from 0 to infinity.@@@@1@35@@oe@19-8-2009
10010250@unknown@formal@none@1@S@Thus we might expect an algorithm to be an algebraic equation such as '''y = m + n''' — two arbitrary "input variables" '''m''' and '''n''' that produce an output '''y'''.@@@@1@31@@oe@19-8-2009
10010260@unknown@formal@none@1@S@As we see in [[Algorithm characterizations]], the word algorithm implies much more than this, something on the order of (for our addition example):@@@@1@23@@oe@19-8-2009
10010270@unknown@formal@none@1@S@:Precise instructions (in language understood by "the computer") for a "fast, efficient, good" ''process'' that specifies the "moves" of "the computer" (machine or human, equipped with the necessary internally-contained information and capabilities) to find, decode, and then munch arbitrary input integers/symbols '''m''' and '''n''', symbols '''+''' and '''=''' ... and (reliably, correctly, "effectively") produce, in a "reasonable" [[time]], output-integer '''y''' at a specified place and in a specified format.@@@@1@69@@oe@19-8-2009
10010280@unknown@formal@none@1@S@The concept of ''algorithm'' is also used to define the notion of [[decidability (logic)|decidability]].@@@@1@14@@oe@19-8-2009
10010290@unknown@formal@none@1@S@That notion is central for explaining how [[formal system]]s come into being starting from a small set of [[axiom]]s and rules.@@@@1@21@@oe@19-8-2009
10010300@unknown@formal@none@1@S@In [[logic]], the time that an algorithm requires to complete cannot be measured, as it is not apparently related to our customary physical dimension.@@@@1@24@@oe@19-8-2009
10010310@unknown@formal@none@1@S@From such uncertainties, which characterize ongoing work, stems the unavailability of a definition of ''algorithm'' that suits both concrete (in some sense) and abstract usage of the term.@@@@1@28@@oe@19-8-2009
10010320@unknown@formal@none@1@S@:''For a detailed presentation of the various points of view around the definition of "algorithm" see [[Algorithm characterizations]].@@@@1@18@@oe@19-8-2009
10010330@unknown@formal@none@1@S@For examples of simple addition algorithms specified in the detailed manner described in [[Algorithm characterizations]], see [[Algorithm examples]].''@@@@1@18@@oe@19-8-2009
10010340@unknown@formal@none@1@S@== Formalization of algorithms ==@@@@1@5@@oe@19-8-2009
10010350@unknown@formal@none@1@S@Algorithms are essential to the way [[computer]]s process information, because a [[computer program]] is essentially an algorithm that tells the computer what specific steps to perform (in what specific order) in order to carry out a specified task, such as calculating employees’ paychecks or printing students’ report cards.@@@@1@48@@oe@19-8-2009
10010360@unknown@formal@none@1@S@Thus, an algorithm can be considered to be any sequence of operations that can be performed by a [[Turing completeness|Turing-complete]] system.@@@@1@21@@oe@19-8-2009
10010370@unknown@formal@none@1@S@Authors who assert this thesis include Savage (1987) and Gurevich (2000):@@@@1@11@@oe@19-8-2009
10010380@unknown@formal@none@1@S@
...Turing's informal argument in favor of his thesis justifies a stronger thesis: every algorithm can be simulated by a Turing machine (Gurevich 2000:1)...according to Savage [1987], an algorithm is a computational process defined by a Turing machine.@@@@1@38@@oe@19-8-2009
10010390@unknown@formal@none@1@S@(Gurevich 2000:3)
@@@@1@3@@oe@19-8-2009
10010400@unknown@formal@none@1@S@Typically, when an algorithm is associated with processing information, data are read from an input source or device, written to an output sink or device, and/or stored for further processing.@@@@1@30@@oe@19-8-2009
10010410@unknown@formal@none@1@S@Stored data are regarded as part of the internal state of the entity performing the algorithm.@@@@1@16@@oe@19-8-2009
10010420@unknown@formal@none@1@S@In practice, the state is stored in a [[data structure]], but an algorithm requires the internal data only for specific operation sets called [[abstract data type]]s.@@@@1@26@@oe@19-8-2009
10010430@unknown@formal@none@1@S@For any such computational process, the algorithm must be rigorously defined: specified in the way it applies in all possible circumstances that could arise.@@@@1@24@@oe@19-8-2009
10010440@unknown@formal@none@1@S@That is, any conditional steps must be systematically dealt with, case-by-case; the criteria for each case must be clear (and computable).@@@@1@21@@oe@19-8-2009
10010450@unknown@formal@none@1@S@Because an algorithm is a precise list of precise steps, the order of computation will almost always be critical to the functioning of the algorithm.@@@@1@25@@oe@19-8-2009
10010460@unknown@formal@none@1@S@Instructions are usually assumed to be listed explicitly, and are described as starting "from the top" and going "down to the bottom", an idea that is described more formally by ''[[control flow|flow of control]]''.@@@@1@34@@oe@19-8-2009
10010470@unknown@formal@none@1@S@So far, this discussion of the formalization of an algorithm has assumed the premises of [[imperative programming]].@@@@1@17@@oe@19-8-2009
10010480@unknown@formal@none@1@S@This is the most common conception, and it attempts to describe a task by discrete, "mechanical" means.@@@@1@17@@oe@19-8-2009
10010490@unknown@formal@none@1@S@Unique to this conception of formalized algorithms is the [[assignment operation]], setting the value of a variable.@@@@1@17@@oe@19-8-2009
10010500@unknown@formal@none@1@S@It derives from the intuition of "[[memory]]" as a scratchpad.@@@@1@10@@oe@19-8-2009
10010510@unknown@formal@none@1@S@There is an example below of such an assignment.@@@@1@9@@oe@19-8-2009
10010520@unknown@formal@none@1@S@For some alternative conceptions of what constitutes an algorithm see [[functional programming]] and [[logic programming]].@@@@1@15@@oe@19-8-2009
10010530@unknown@formal@none@1@S@=== Termination ===@@@@1@3@@oe@19-8-2009
10010540@unknown@formal@none@1@S@Some writers restrict the definition of ''algorithm'' to procedures that eventually finish.@@@@1@12@@oe@19-8-2009
10010550@unknown@formal@none@1@S@In such a category Kleene places the "''decision procedure'' or ''decision method'' or ''algorithm'' for the question" (Kleene 1952:136).@@@@1@19@@oe@19-8-2009
10010560@unknown@formal@none@1@S@Others, including Kleene, include procedures that could run forever without stopping; such a procedure has been called a "computational method" (Knuth 1997:5) or "''calculation procedure'' or ''algorithm''" (Kleene 1952:137); however, Kleene notes that such a method must eventually exhibit "some object" (Kleene 1952:137).@@@@1@43@@oe@19-8-2009
10010570@unknown@formal@none@1@S@Minsky makes the pertinent observation, with regard to determining whether an algorithm will eventually terminate (from a particular starting state):@@@@1@20@@oe@19-8-2009
10010580@unknown@formal@none@1@S@
But if the length of the process is not known in advance, then "trying" it may not be decisive, because if the process does go on forever — then at no time will we ever be sure of the answer (Minsky 1967:105).
@@@@1@43@@oe@19-8-2009
10010590@unknown@formal@none@1@S@As it happens, no other method can do any better, as was shown by [[Alan Turing]] with his celebrated result on the undecidability of the so-called [[halting problem]].@@@@1@28@@oe@19-8-2009
10010600@unknown@formal@none@1@S@There is no algorithmic procedure for determining of arbitrary algorithms whether or not they terminate from given starting states.@@@@1@19@@oe@19-8-2009
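Minsky's point about "trying" a process can be illustrated with a short Python sketch. It runs a step function under a step budget and reports either the number of steps taken or no verdict at all. The names `run_with_budget`, `collatz_step` and `is_one` are hypothetical, chosen only for this illustration. An exhausted budget does not show that the process runs forever, only that it had not stopped yet.

```python
from typing import Callable, Optional


def run_with_budget(
    step: Callable[[int], int],
    is_final: Callable[[int], bool],
    state: int,
    max_steps: int,
) -> Optional[int]:
    """Run `step` from `state` until `is_final` holds.

    Returns the number of steps used, or None if the budget ran out.
    None means "no verdict", not "never terminates".
    """
    steps = 0
    while not is_final(state):
        if steps == max_steps:
            return None  # budget exhausted: termination remains unknown
        state = step(state)
        steps += 1
    return steps


def collatz_step(n: int) -> int:
    """One step of the 3n + 1 iteration, used here only as a test process."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def is_one(n: int) -> bool:
    return n == 1


if __name__ == "__main__":
    print(run_with_budget(collatz_step, is_one, 6, 100))
    print(run_with_budget(collatz_step, is_one, 6, 5))
```

Raising the budget can turn a None into a step count, but no finite budget can turn it into a proof of non-termination.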
10010610@unknown@formal@none@1@S@The analysis of algorithms for their likelihood of termination is called [[termination analysis]].@@@@1@13@@oe@19-8-2009
10010620@unknown@formal@none@1@S@See the examples of (im-)"proper" subtraction at [[partial function]] for more about what can happen when an algorithm fails for certain of its input numbers — e.g., (i) non-termination, (ii) production of "junk" (output in the wrong format to be considered a number) or no number(s) at all (halt ends the computation with no output), (iii) wrong number(s), or (iv) a combination of these.@@@@1@64@@oe@19-8-2009
10010630@unknown@formal@none@1@S@Kleene proposed that the production of "junk" or failure to produce a number is solved by having the algorithm detect these instances and produce e.g., an error message (he suggested "0"), or preferably, force the algorithm into an endless loop (Kleene 1952:322).@@@@1@42@@oe@19-8-2009
10010640@unknown@formal@none@1@S@Davis does this to his subtraction algorithm — he fixes his algorithm in a second example so that it is proper subtraction (Davis 1958:12-15).@@@@1@24@@oe@19-8-2009
10010650@unknown@formal@none@1@S@Along with the logical outcomes "true" and "false" Kleene also proposes the use of a third logical symbol "u" — undecided (Kleene 1952:326) — thus an algorithm will always produce ''something'' when confronted with a "proposition".@@@@1@36@@oe@19-8-2009
10010660@unknown@formal@none@1@S@The problem of wrong answers must be solved with an independent "proof" of the algorithm e.g., using induction:@@@@1@18@@oe@19-8-2009
10010670@unknown@formal@none@1@S@
We normally require auxiliary evidence for this (that the algorithm correctly defines a [[mu recursive function]]), e.g., in the form of an inductive proof that, for each argument value, the computation terminates with a unique value (Minsky 1967:186).
BI is sometimes used interchangeably with briefing books, report and query tools and executive information systems.@@@@1@17@@danf@17-8-2009
10100060@unknown@formal@none@1@S@In general, business intelligence systems are data-driven DSS.
@@@@1@8@@danf@17-8-2009
10100070@unknown@formal@none@1@S@BI systems provide historical, current, and predictive views of business operations, most often using data that has been gathered into a [[data warehouse]] or a [[data mart]] and occasionally working from operational data.@@@@1@33@@danf@17-8-2009
10100080@unknown@formal@none@1@S@Software elements support the use of this information by assisting in the extraction, analysis, and reporting of information.@@@@1@18@@danf@17-8-2009
10100090@unknown@formal@none@1@S@Applications tackle sales, production, financial, and many other sources of business data for purposes that include, notably, [[business performance management]].@@@@1@20@@danf@17-8-2009
10100100@unknown@formal@none@1@S@Information may be gathered on comparable companies to produce [[benchmarking|benchmarks]].@@@@1@10@@danf@17-8-2009
10100110@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10100120@unknown@formal@none@1@S@Prior to the start of the [[Information Age]] in the late 20th century, businesses had to collect data from non-automated sources.@@@@1@21@@danf@17-8-2009
10100130@unknown@formal@none@1@S@Businesses then lacked the computing resources necessary to properly analyze the data, and as a result, companies often made business decisions primarily on the basis of [[intuition (knowledge)|intuition]].@@@@1@28@@danf@17-8-2009
10100140@unknown@formal@none@1@S@As businesses automated systems, the amount of data increased, but its collection remained difficult because information could not be moved between or within systems.@@@@1@24@@danf@17-8-2009
10100150@unknown@formal@none@1@S@Analysis of information informed long-term decision making, but it was slow, and short-term decisions often had to rely on instinct or expertise.@@@@1@22@@danf@17-8-2009
10100160@unknown@formal@none@1@S@Business intelligence was defined in 1958 by [[Hans Peter Luhn]], who wrote,@@@@1@12@@danf@17-8-2009
10100170@unknown@formal@none@1@S@
In this paper, business is a collection of activities carried on for whatever purpose, be it science, technology, commerce, industry, law, government, defense, et cetera.@@@@1@26@@danf@17-8-2009
10100180@unknown@formal@none@1@S@The communication facility serving the conduct of a business (in the broad sense) may be referred to as an intelligence system.@@@@1@21@@danf@17-8-2009
10100190@unknown@formal@none@1@S@The notion of intelligence is also defined here, in a more general sense, as "the ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal."
@@@@1@35@@danf@17-8-2009
10100200@unknown@formal@none@1@S@In 1989 Howard Dresner, later a [[Gartner Group]] analyst, popularized BI as an umbrella term to describe "concepts and methods to improve business decision making by using fact-based support systems."@@@@1@30@@danf@17-8-2009
10100210@unknown@formal@none@1@S@In modern businesses the use of standards, automation and specialized software, including [[Online analytical processing|analytical tools]], allows large volumes of data to be [[Extract, transform, load|extracted, transformed, loaded]] and [[Data warehouse|warehoused]] to greatly increase the speed at which information becomes available for decision-making.@@@@1@43@@danf@17-8-2009
10100220@unknown@formal@none@1@S@===Key intelligence topics===@@@@1@3@@danf@17-8-2009
10100230@unknown@formal@none@1@S@Business intelligence often uses [[key performance indicators]] (KPIs) to assess the present state of business and to prescribe a course of action.@@@@1@22@@danf@17-8-2009
10100240@unknown@formal@none@1@S@Examples of KPIs include lead conversion rate (in sales) and inventory turnover (in inventory management).@@@@1@15@@danf@17-8-2009
10100250@unknown@formal@none@1@S@Prior to the widespread adoption of computer and web applications, when information had to be manually input and calculated, performance data was often not available for weeks or months.@@@@1@29@@danf@17-8-2009
10100260@unknown@formal@none@1@S@Recently, banks have tried to make data available at shorter intervals and have reduced delays.@@@@1@15@@danf@17-8-2009
10100270@unknown@formal@none@1@S@The KPI methodology was further expanded with the Chief Performance Officer methodology, which incorporated KPIs and root cause analysis into a single methodology.@@@@1@23@@danf@17-8-2009
10100280@unknown@formal@none@1@S@Businesses that face higher operational/[[credit risk]] loading, such as [[credit card]] companies and "wealth management" services, often make KPI-related data available weekly.@@@@1@22@@danf@17-8-2009
10100290@unknown@formal@none@1@S@In some cases, companies may even offer a daily analysis of data.@@@@1@12@@danf@17-8-2009
10100300@unknown@formal@none@1@S@This fast pace requires analysts to use [[information technology|IT]] [[system]]s to process this large volume of data.@@@@1@17@@danf@17-8-2009
10110010@unknown@formal@none@1@S@Chatterbot@@@@1@1@@danf@17-8-2009
10110020@unknown@formal@none@1@S@A '''chatterbot''' (or chatbot) is a type of conversational agent, a [[computer program]] designed to simulate an intelligent [[conversation]] with one or more human users via auditory or textual methods.@@@@1@30@@danf@17-8-2009
10110030@unknown@formal@none@1@S@In other words, a chatterbot is a computer program with artificial intelligence to talk to people through voices or typed words.@@@@1@21@@danf@17-8-2009
10110040@unknown@formal@none@1@S@Though many appear to be intelligently interpreting the human input prior to providing a response, most chatterbots simply scan for keywords within the input and pull a reply with the most matching keywords or the most similar wording pattern from a local [[database]].@@@@1@43@@danf@17-8-2009
10110050@unknown@formal@none@1@S@Chatterbots may also be referred to as ''talk bots'', ''chat bots'', or ''chatterboxes''.@@@@1@13@@danf@17-8-2009
10110060@unknown@formal@none@1@S@== Method of operation ==@@@@1@5@@danf@17-8-2009
10110070@unknown@formal@none@1@S@A good understanding of a conversation is required to carry on a meaningful dialog but most chatterbots do not attempt this.@@@@1@21@@danf@17-8-2009
10110080@unknown@formal@none@1@S@Instead they "converse" by recognizing cue words or phrases from the human user, which allows them to use pre-prepared or pre-calculated responses which can move the conversation on in an apparently meaningful way without requiring them to know what they are talking about.@@@@1@43@@danf@17-8-2009
10110090@unknown@formal@none@1@S@For example, if a human types, "I am feeling very worried lately," the chatterbot may be programmed to recognize the phrase "I am" and respond by replacing it with "Why are you" plus a question mark at the end, giving the answer, "Why are you feeling very worried lately?"@@@@1@49@@danf@17-8-2009
10110100@unknown@formal@none@1@S@A similar approach using keywords would be for the program to answer any comment including ''(Name of celebrity)'' with "I think they're great, don't you?"@@@@1@25@@danf@17-8-2009
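The two rules described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code of any actual chatterbot: the function names, the `CELEBRITIES` keyword set and the fallback reply "Please go on." are hypothetical and were chosen only for the example.

```python
import re
from typing import Optional

# Hypothetical keyword list; a real program would use its own.
CELEBRITIES = {"elvis", "madonna"}


def reflect_i_am(utterance: str) -> Optional[str]:
    """Turn 'I am X.' into 'Why are you X?'; return None if it does not apply."""
    match = re.match(r"\s*I am\s+(.*?)[.!?]*\s*$", utterance, flags=re.IGNORECASE)
    if match is None:
        return None
    return f"Why are you {match.group(1)}?"


def respond(utterance: str) -> str:
    """Pick a canned reply by pattern or keyword, with no understanding involved."""
    reflected = reflect_i_am(utterance)
    if reflected is not None:
        return reflected
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    if words & CELEBRITIES:
        return "I think they're great, don't you?"
    return "Please go on."  # assumed fallback when no rule matches


if __name__ == "__main__":
    print(respond("I am feeling very worried lately."))
    print(respond("Have you heard the new Madonna record?"))
```

Neither rule inspects the meaning of the input, which is why the replies can seem apt while the program knows nothing about the topic.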
10110110@unknown@formal@none@1@S@Humans, especially those unfamiliar with chatterbots, sometimes find the resulting conversations engaging.@@@@1@12@@danf@17-8-2009
10110120@unknown@formal@none@1@S@Critics of chatterbots call this engagement the [[ELIZA effect]].@@@@1@9@@danf@17-8-2009
10110130@unknown@formal@none@1@S@Some programs classified as chatterbots use other principles.@@@@1@8@@danf@17-8-2009
10110140@unknown@formal@none@1@S@One example is [[Jabberwacky]], which attempts to model the way humans learn new facts and language.@@@@1@16@@danf@17-8-2009
10110150@unknown@formal@none@1@S@[[Ellaz Systems|ELLA]] attempts to use [[natural language processing]] to make more useful responses from a human's input.@@@@1@17@@danf@17-8-2009
10110160@unknown@formal@none@1@S@Some programs that use natural language conversation, such as [[SHRDLU]], are not generally classified as chatterbots because they link their speech ability to knowledge of a simulated world.@@@@1@28@@danf@17-8-2009
10110170@unknown@formal@none@1@S@This type of link requires a more complex [[artificial intelligence]] (e.g., a "vision" system) than standard chatterbots have.@@@@1@18@@danf@17-8-2009
10110180@unknown@formal@none@1@S@== Early chatterbots ==@@@@1@4@@danf@17-8-2009
10110190@unknown@formal@none@1@S@The classic early chatterbots are [[ELIZA]] and [[PARRY]].@@@@1@8@@danf@17-8-2009
10110200@unknown@formal@none@1@S@More recent programs are [[Racter]], [[Verbot]]s, [[Artificial Linguistic Internet Computer Entity|A.L.I.C.E.]], and [[Ellaz Systems|ELLA]].@@@@1@14@@danf@17-8-2009
10110210@unknown@formal@none@1@S@The growth of chatterbots as a research field has created an expansion in their purposes.@@@@1@15@@danf@17-8-2009
10110220@unknown@formal@none@1@S@While ELIZA and PARRY were used exclusively to simulate typed conversation, [[Racter]] was used to "write" a story called ''The Policeman's Beard is Half Constructed''.@@@@1@25@@danf@17-8-2009
10110230@unknown@formal@none@1@S@ELLA includes a collection of games and functional features to further extend the potential of chatterbots.@@@@1@16@@danf@17-8-2009
10110240@unknown@formal@none@1@S@The term "ChatterBot" was coined by [[Michael Loren Mauldin|Michael Mauldin]] (creator of the first [[Verbot]], Julia) in 1994 to describe these conversational programs.@@@@1@23@@danf@17-8-2009
10110250@unknown@formal@none@1@S@== Malicious chatterbots ==@@@@1@4@@danf@17-8-2009
10110260@unknown@formal@none@1@S@Malicious chatterbots are frequently used to fill chat rooms with spam and advertising, or to entice people into revealing personal information, such as bank account numbers.@@@@1@26@@danf@17-8-2009
10110270@unknown@formal@none@1@S@They are commonly found on [[Yahoo! Messenger]], [[Windows Live Messenger]], [[AOL Instant Messenger]] and other [[instant messaging]] protocols.@@@@1@18@@danf@17-8-2009
10110280@unknown@formal@none@1@S@There has been a published report of a chatterbot used in a fake personal ad on a dating service's website.@@@@1@20@@danf@17-8-2009
10110290@unknown@formal@none@1@S@==Chatterbots in modern AI==@@@@1@4@@danf@17-8-2009
10110300@unknown@formal@none@1@S@Most modern AI research focuses on practical engineering tasks.@@@@1@9@@danf@17-8-2009
10110310@unknown@formal@none@1@S@This is known as weak AI and is distinguished from [[strong AI]], which would require [[sapience]] and reasoning abilities.@@@@1@19@@danf@17-8-2009
10110320@unknown@formal@none@1@S@One pertinent field of AI research is natural language.@@@@1@9@@danf@17-8-2009
10110330@unknown@formal@none@1@S@Usually weak AI fields employ specialised software or programming languages created for them.@@@@1@13@@danf@17-8-2009
10110340@unknown@formal@none@1@S@For example, one of the 'most-human' natural language chatterbots, [[Artificial Linguistic Internet Computer Entity|A.L.I.C.E.]], uses a programming language called AIML that is specific to its program, and its various clones, named Alicebots.@@@@1@32@@danf@17-8-2009
10110350@unknown@formal@none@1@S@Nevertheless, A.L.I.C.E. is still based on pattern matching without any reasoning.@@@@1@11@@danf@17-8-2009
10110360@unknown@formal@none@1@S@This is the same technique [[ELIZA]], the first chatterbot, was using back in 1966.@@@@1@14@@danf@17-8-2009
10110370@unknown@formal@none@1@S@Australian company MyCyberTwin also deals in strong AI, allowing users to create and sustain their own virtual personalities online.@@@@1@19@@danf@17-8-2009
10110380@unknown@formal@none@1@S@MyCyberTwin.com also works in a corporate setting, allowing companies to set up Virtual AI Assistants.@@@@1@15@@danf@17-8-2009
10110390@unknown@formal@none@1@S@Another notable program, known as [[Jabberwacky]], also deals in strong AI, as it is claimed to learn new responses based on user interactions, rather than being driven from a static database like many other existing chatterbots.@@@@1@36@@danf@17-8-2009
10110400@unknown@formal@none@1@S@Although such programs show initial promise, many of the existing results in trying to tackle the problem of natural language still appear fairly poor, and it seems reasonable to state that there is currently no general purpose conversational artificial intelligence.@@@@1@40@@danf@17-8-2009
10110410@unknown@formal@none@1@S@This has led some software developers to focus more on the practical aspect of chatterbot technology - information retrieval.@@@@1@19@@danf@17-8-2009
10110420@unknown@formal@none@1@S@A common rebuttal often used within the AI community against criticism of such approaches asks, "How do we know that humans don't also just follow some cleverly devised rules?" (in the way that chatterbots do).@@@@1@35@@danf@17-8-2009
10110430@unknown@formal@none@1@S@Two famous examples of this line of argument against the rationale for the basis of the Turing test are John Searle's [[Chinese room]] argument and Ned Block's [[Intentional stance|Blockhead argument]].@@@@1@30@@danf@17-8-2009
10110440@unknown@formal@none@1@S@==Chatterbots/Virtual Assistants in Commercial Environments==@@@@1@5@@danf@17-8-2009
10110450@unknown@formal@none@1@S@Automated Conversational Systems have progressed and evolved far from the original designs of the first widely used chatbots.@@@@1@18@@danf@17-8-2009
10110460@unknown@formal@none@1@S@In the UK, large commercial entities such as Lloyds TSB, Royal Bank of Scotland, Renault, Citroën and One Railway are already utilizing Virtual Assistants to reduce expenditures on Call Centres and provide a first point of contact that can inform the user exactly of points of interest, provide support, capture data from the user and promote products for sale.@@@@1@59@@danf@17-8-2009
10110470@unknown@formal@none@1@S@In the UK, new projects and research are being conducted to introduce a Virtual Assistant into the classroom to assist the teacher.@@@@1@22@@danf@17-8-2009
10110480@unknown@formal@none@1@S@This project is the first of its kind and the chatbot VA in question is based on the Yhaken [http://www.elzware.com] chatbot design.@@@@1@22@@danf@17-8-2009
10110490@unknown@formal@none@1@S@The Yhaken template provides a further move forward in Automated Conversational Systems with features such as complex conversational routing and responses, well defined personality, a complex hierarchical construct with additional external reference points, emotional responses and in-depth small talk, all to make the experience more interactive and involving for the user.@@@@1@51@@danf@17-8-2009
10110500@unknown@formal@none@1@S@==Annual contests for chatterbots==@@@@1@4@@danf@17-8-2009
10110510@unknown@formal@none@1@S@Many organizations try to encourage and support developers all over the world to develop chatterbots that are able to do a variety of tasks and compete with each other through [[Turing test]]s and more.@@@@1@34@@danf@17-8-2009
10110520@unknown@formal@none@1@S@Annual contests are organized at the following links:@@@@1@8@@danf@17-8-2009
10110530@unknown@formal@none@1@S@*[http://www.chatterboxchallenge.com The Chatterbox Challenge]@@@@1@4@@danf@17-8-2009
10110540@unknown@formal@none@1@S@*[http://www.loebner.net/Prizef/loebner-prize.html The Loebner Prize]@@@@1@4@@danf@17-8-2009
10120010@unknown@formal@none@1@S@Computational linguistics@@@@1@2@@danf@17-8-2009
10120020@unknown@formal@none@1@S@'''Computational linguistics''' is an [[interdisciplinary]] field dealing with the [[Statistics|statistical]] and/or rule-based modeling of [[natural language]] from a computational perspective.@@@@1@20@@danf@17-8-2009
10120030@unknown@formal@none@1@S@This modeling is not limited to any particular field of [[linguistics]].@@@@1@11@@danf@17-8-2009
10120040@unknown@formal@none@1@S@Traditionally, computational linguistics was usually performed by [[computer scientist]]s who had specialized in the application of computers to the processing of a [[natural language]].@@@@1@24@@danf@17-8-2009
10120050@unknown@formal@none@1@S@Recent research has shown that human language is much more complex than previously thought, so computational linguists often work as members of interdisciplinary teams, including linguists (specifically trained in linguistics), language experts (persons with some level of ability in the languages relevant to a given project), and computer scientists.@@@@1@49@@danf@17-8-2009
10120060@unknown@formal@none@1@S@In general, computational linguistics draws upon the involvement of linguists, [[computer science|computer scientists]], experts in [[artificial intelligence]], [[cognitive psychology|cognitive psychologists]], [[math]]ematicians, and [[logic]]ians, amongst others.@@@@1@25@@danf@17-8-2009
10120070@unknown@formal@none@1@S@==Origins==@@@@1@1@@danf@17-8-2009
10120080@unknown@formal@none@1@S@Computational linguistics as a field predates [[artificial intelligence]], a field under which it is often grouped.@@@@1@16@@danf@17-8-2009
10120090@unknown@formal@none@1@S@Computational linguistics originated with efforts in the [[United States]] in the 1950s to use computers to automatically translate texts from foreign languages, particularly [[Russian language|Russian]] scientific journals, into English.@@@@1@29@@danf@17-8-2009
10120100@unknown@formal@none@1@S@Since computers had proven their ability to do [[arithmetic]] much faster and more accurately than humans, it was thought that only a short time would be needed to work out the technical details that would give them the same remarkable capacity to process language.@@@@1@44@@danf@17-8-2009
10120110@unknown@formal@none@1@S@When [[machine translation]] (also known as mechanical translation) failed to yield accurate translations right away, automated processing of human languages was recognized as far more complex than had originally been assumed.@@@@1@31@@danf@17-8-2009
10120120@unknown@formal@none@1@S@Computational linguistics was born as the name of the new field of study devoted to developing [[algorithm]]s and [[software]] for intelligently processing language data.@@@@1@24@@danf@17-8-2009
10120130@unknown@formal@none@1@S@When artificial intelligence came into existence in the 1960s, the field of computational linguistics became that sub-division of artificial intelligence dealing with human-level comprehension and production of natural languages.@@@@1@29@@danf@17-8-2009
10120140@unknown@formal@none@1@S@In order to translate one language into another, it was observed that one had to understand the [[grammar]] of both languages, including both [[morphology (linguistics)|morphology]] (the grammar of word forms) and [[syntax]] (the grammar of sentence structure).@@@@1@37@@danf@17-8-2009
10120150@unknown@formal@none@1@S@In order to understand syntax, one had to also understand the [[semantics]] and the [[lexicon]] (or 'vocabulary'), and even to understand something of the [[pragmatics]] of language use.@@@@1@28@@danf@17-8-2009
10120160@unknown@formal@none@1@S@Thus, what started as an effort to translate between languages evolved into an entire discipline devoted to understanding how to represent and process natural languages using computers.@@@@1@27@@danf@17-8-2009
10120170@unknown@formal@none@1@S@==Subfields==@@@@1@1@@danf@17-8-2009
10120180@unknown@formal@none@1@S@Computational linguistics can be divided into major areas depending upon the medium of the language being processed, whether spoken or textual; and upon the task being performed, whether analyzing language (recognition) or synthesizing language (generation).@@@@1@35@@danf@17-8-2009
10120190@unknown@formal@none@1@S@[[Speech recognition]] and [[speech synthesis]] deal with how spoken language can be understood or created using computers.@@@@1@17@@danf@17-8-2009
10120200@unknown@formal@none@1@S@Parsing and generation are sub-divisions of computational linguistics dealing respectively with taking language apart and putting it together.@@@@1@18@@danf@17-8-2009
10120210@unknown@formal@none@1@S@Machine translation remains the sub-division of computational linguistics dealing with having computers translate between languages.@@@@1@15@@danf@17-8-2009
10120220@unknown@formal@none@1@S@Some of the areas of research that are studied by computational linguistics include:@@@@1@13@@danf@17-8-2009
10120230@unknown@formal@none@1@S@*Computer aided [[corpus linguistics]]@@@@1@4@@danf@17-8-2009
10120240@unknown@formal@none@1@S@*Design of [[parser]]s or [[phrase chunking|chunkers]] for [[natural language]]s@@@@1@9@@danf@17-8-2009
10120250@unknown@formal@none@1@S@*Design of taggers like [[Part-of-speech tagging|POS-taggers (part-of-speech taggers)]]@@@@1@8@@danf@17-8-2009
10120260@unknown@formal@none@1@S@*Definition of specialized logics like resource logics for [[Natural language processing|NLP]]@@@@1@11@@danf@17-8-2009
10120270@unknown@formal@none@1@S@*Research in the relation between formal and natural languages in general@@@@1@11@@danf@17-8-2009
10120280@unknown@formal@none@1@S@*[[Machine translation]], e.g. by a translating computer@@@@1@7@@danf@17-8-2009
10120290@unknown@formal@none@1@S@*[[Computational complexity]] of natural language, largely modeled on [[automata theory]], with the application of [[context-sensitive grammar]] and [[Linear bounded automaton|linearly-bounded]] [[Turing machine]]s.@@@@1@22@@danf@17-8-2009
10120300@unknown@formal@none@1@S@The [[Association for Computational Linguistics]] defines computational linguistics as:@@@@1@9@@danf@17-8-2009
10120310@unknown@formal@none@1@S@:...the scientific study of [[language]] from a computational perspective.@@@@1@9@@danf@17-8-2009
10120320@unknown@formal@none@1@S@Computational linguists are interested in providing [[computational model]]s of various kinds of linguistic phenomena.@@@@1@14@@danf@17-8-2009
10130010@unknown@formal@none@1@S@Computer program@@@@1@2@@danf@17-8-2009
10130020@unknown@formal@none@1@S@'''Computer programs''' (also '''[[Computer software|software programs]]''', or just '''programs''') are [[Instruction (computer science)|instructions]] for a [[computer]].@@@@1@16@@danf@17-8-2009
10130030@unknown@formal@none@1@S@A computer requires programs to function, and a computer program does nothing unless its instructions are executed by a [[Central processing unit|central processor]].@@@@1@23@@danf@17-8-2009
10130040@unknown@formal@none@1@S@Computer programs are usually [[executable]] programs or the [[source code]] from which executable programs are derived (e.g., [[compiler|compiled]]).@@@@1@18@@danf@17-8-2009
10130050@unknown@formal@none@1@S@Computer source code is often written by professional [[computer programmer]]s.@@@@1@10@@danf@17-8-2009
10130060@unknown@formal@none@1@S@Source code is written in a [[programming language]] that usually follows one of two main [[Programming paradigm|paradigms]]: [[imperative programming|imperative]] or [[declarative language|declarative]] programming.@@@@1@23@@danf@17-8-2009
10130070@unknown@formal@none@1@S@Source code may be converted into an [[executable file]] (sometimes called an executable program) by a [[compiler]].@@@@1@17@@danf@17-8-2009
10130080@unknown@formal@none@1@S@Alternatively, computer programs may be executed by a [[central processing unit]] with the aid of an [[Interpreter (computing)|interpreter]], or may be [[firmware|embedded]] directly into [[Computer hardware|hardware]].@@@@1@26@@danf@17-8-2009
10130090@unknown@formal@none@1@S@Computer programs may be categorized along functional lines: [[system software]] and [[application software]].@@@@1@13@@danf@17-8-2009
10130100@unknown@formal@none@1@S@Many computer programs may run simultaneously on a single computer, a process known as [[computer multitasking|multitasking]].@@@@1@17@@danf@17-8-2009
10130110@unknown@formal@none@1@S@==Programming==@@@@1@1@@danf@17-8-2009
10130120@unknown@formal@none@1@S@ main() {output_string("Hello world!");} @@@@1@5@@danf@17-8-2009
10130160@unknown@formal@none@1@S@Source code of a program written in the [[C programming language]]@@@@1@11@@danf@17-8-2009
10130170@unknown@formal@none@1@S@[[Computer programming]] is the iterative process of writing or editing [[source code]].@@@@1@12@@danf@17-8-2009
10130180@unknown@formal@none@1@S@Editing source code involves testing, analyzing, and refining.@@@@1@8@@danf@17-8-2009
10130190@unknown@formal@none@1@S@A person who practices this skill is referred to as a computer [[programmer]] or software developer.@@@@1@16@@danf@17-8-2009
10130200@unknown@formal@none@1@S@The sometimes lengthy process of computer programming is usually referred to as [[software development]].@@@@1@14@@danf@17-8-2009
10130210@unknown@formal@none@1@S@The term [[software engineering]] is becoming popular as the process is seen as an [[engineering]] discipline.@@@@1@16@@danf@17-8-2009
10130220@unknown@formal@none@1@S@=== Paradigms ===@@@@1@3@@danf@17-8-2009
10130230@unknown@formal@none@1@S@Computer programs can be categorized by the [[programming language]] [[Programming paradigm|paradigm]] used to produce them.@@@@1@15@@danf@17-8-2009
10130240@unknown@formal@none@1@S@Two of the main paradigms are [[imperative programming|imperative]] and [[declarative language|declarative]].@@@@1@11@@danf@17-8-2009
10130250@unknown@formal@none@1@S@Programs written using an imperative language specify an [[algorithm]] using declarations, expressions, and statements.@@@@1@14@@danf@17-8-2009
10130260@unknown@formal@none@1@S@A declaration associates a [[variable]] name with a [[datatype]].@@@@1@9@@danf@17-8-2009
10130270@unknown@formal@none@1@S@For example: var x: integer; .@@@@1@7@@danf@17-8-2009
10130280@unknown@formal@none@1@S@An expression yields a value.@@@@1@5@@danf@17-8-2009
10130290@unknown@formal@none@1@S@For example: 2 + 2 yields 4.@@@@1@9@@danf@17-8-2009
10130300@unknown@formal@none@1@S@Finally, a statement might assign an expression to a variable or use the value of a variable to alter the program's control flow.@@@@1@23@@danf@17-8-2009
10130310@unknown@formal@none@1@S@For example: x := 2 + 2; if x = 4 then do_something();@@@@1@13@@danf@17-8-2009
10130315@unknown@formal@none@1@S@One criticism of imperative languages is the side-effect of an assignment statement on a class of variables called non-local variables.@@@@1@20@@danf@17-8-2009
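The three building blocks described above, and the side-effect criticism, can be illustrated with a minimal sketch. It is written in Python rather than the Pascal-like notation of the inline examples, and every name in it (`counter`, `do_something`, `bump`) is hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; all names here are assumptions, not from the article.
counter: int = 0  # declaration: associates the name `counter` with a datatype


def do_something() -> str:
    """Placeholder for whatever action the program takes."""
    return "did something"


def bump() -> int:
    """Assign to a non-local variable: a side effect visible outside the function."""
    global counter
    counter = counter + 1  # statement: assigns the value of an expression
    return counter


x: int = 2 + 2  # the expression 2 + 2 yields 4, which is assigned to x

result = "skipped"
if x == 4:  # the value of x alters the program's control flow
    result = do_something()

first = bump()
second = bump()  # same call, different result, because `counter` changed
```

The two calls to `bump()` return different values even though the call looks identical, which is the kind of hidden dependency on non-local state that the criticism refers to.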
10130320@unknown@formal@none@1@S@Programs written using a declarative language specify the properties that have to be met by the output and do not specify any implementation details.@@@@1@24@@danf@17-8-2009
10130330@unknown@formal@none@1@S@Two broad categories of declarative languages are [[functional language]]s and [[logical language]]s.@@@@1@12@@danf@17-8-2009
10130340@unknown@formal@none@1@S@The principle behind functional languages (like [[Haskell (programming language)|Haskell]]) is to not allow side-effects, which makes it easier to reason about programs like mathematical functions.@@@@1@25@@danf@17-8-2009
10130350@unknown@formal@none@1@S@The principle behind logical languages (like [[Prolog]]) is to define the problem to be solved — the goal — and leave the detailed solution to the Prolog system itself.@@@@1@29@@danf@17-8-2009
10130360@unknown@formal@none@1@S@The goal is defined by providing a list of subgoals.@@@@1@10@@danf@17-8-2009
10130370@unknown@formal@none@1@S@Then each subgoal is defined by further providing a list of its subgoals, etc.@@@@1@14@@danf@17-8-2009
10130380@unknown@formal@none@1@S@If a path of subgoals fails to find a solution, then that subgoal is [[Backtracking|backtracked]] and another path is systematically attempted.@@@@1@21@@danf@17-8-2009
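The goal and subgoal search described above can be sketched in a few lines. This is a simplified illustration in Python, not how a Prolog system is implemented: it assumes goals are plain names without variables, so there is no unification, and it assumes the rule base has no cycles. The rule base `RULES` and the goal names are hypothetical.

```python
# Hypothetical rule base: each goal maps to alternative lists of subgoals.
# An empty list of subgoals means the goal is a fact that holds outright.
RULES = {
    "travel": [["have_car", "have_fuel"], ["have_ticket"]],
    "have_car": [[]],
    "have_fuel": [],  # no alternatives: this subgoal can never be established
    "have_ticket": [["have_money"]],
    "have_money": [[]],
}


def solve(goal, trace):
    """Return True if `goal` can be established, trying alternatives in order."""
    for subgoals in RULES.get(goal, []):
        trace.append((goal, tuple(subgoals)))  # record the path being attempted
        if all(solve(sub, trace) for sub in subgoals):
            return True
        # This path failed, so fall through to the next alternative (backtrack).
    return False


trace = []
found = solve("travel", trace)
```

Here the first path for `travel` fails at `have_fuel`, so the search backtracks and succeeds through `have_ticket`; `trace` records both attempts in order.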
10130390@unknown@formal@none@1@S@The form in which a program is created may be textual or visual.@@@@1@13@@danf@17-8-2009
10130400@unknown@formal@none@1@S@In a [[visual language]] program, elements are graphically manipulated rather than textually specified.@@@@1@13@@danf@17-8-2009
10130410@unknown@formal@none@1@S@===Compilation or interpretation===@@@@1@3@@danf@17-8-2009
10130420@unknown@formal@none@1@S@A ''computer program'' in the form of a [[human-readable]] computer programming language is called [[source code]].@@@@1@16@@danf@17-8-2009
10130430@unknown@formal@none@1@S@Source code may be converted into an [[executable file|executable image]] by a [[compiler]] or executed immediately with the aid of an [[Interpreter (computing)|interpreter]].@@@@1@23@@danf@17-8-2009
10130440@unknown@formal@none@1@S@Compiled computer programs are commonly referred to as executables, binary images, or simply as [[binary file|binaries]] — a reference to the [[binary numeral system|binary]] [[file format]] used to store the executable code.@@@@1@32@@danf@17-8-2009
10130450@unknown@formal@none@1@S@Compilers are used to translate source code from a programming language into either [[object file|object code]] or [[machine code]].@@@@1@19@@danf@17-8-2009
10130460@unknown@formal@none@1@S@Object code needs further processing to become machine code, and machine code is the [[Central processing unit|Central Processing Unit]]'s native [[microcode|code]], ready for execution.@@@@1@24@@danf@17-8-2009
10130470@unknown@formal@none@1@S@Interpreted computer programs are either decoded and then immediately executed or are decoded into some efficient intermediate representation for future execution.@@@@1@21@@danf@17-8-2009
10130480@unknown@formal@none@1@S@[[BASIC]], [[Perl]], and [[Python (programming language)|Python]] are examples of immediately executed computer programs.@@@@1@13@@danf@17-8-2009
10130490@unknown@formal@none@1@S@By contrast, [[Java (programming language)|Java]] computer programs are compiled ahead of time and stored as a machine-independent code called [[bytecode]].@@@@1@20@@danf@17-8-2009
10130500@unknown@formal@none@1@S@Bytecode is then executed upon request by an interpreter called a [[virtual machine]].@@@@1@13@@danf@17-8-2009
10130510@unknown@formal@none@1@S@The main disadvantage of interpreters is that computer programs run slower than when compiled.@@@@1@13@@danf@17-8-2009
10130520@unknown@formal@none@1@S@Interpreting code is slower than running the compiled version because the interpreter must [[decode]] each [[Statement (programming)|statement]] each time it is loaded and then perform the desired action.@@@@1@28@@danf@17-8-2009
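The cost of repeated decoding can be made concrete with a toy example. The sketch below is in Python and uses an invented two-instruction language (`add n`, `mul n`); the function names and the `decodes` counter are assumptions for illustration. It compares decoding each statement every time it is reached against decoding each statement once before execution.

```python
decodes = {"count": 0}  # counts how many times a statement is decoded


def decode(statement):
    """Split a textual statement such as 'add 3' into an operation and operand."""
    decodes["count"] += 1
    op, arg = statement.split()
    return op, int(arg)


def execute(op, arg, acc):
    """Perform the desired action for one decoded statement."""
    if op == "add":
        return acc + arg
    if op == "mul":
        return acc * arg
    raise ValueError(f"unknown operation: {op}")


def interpret(program, repeats):
    """Decode every statement each time it is reached, then act on it."""
    acc = 0
    for _ in range(repeats):
        for statement in program:
            op, arg = decode(statement)
            acc = execute(op, arg, acc)
    return acc


def translate(program):
    """Decode each statement once, ahead of execution."""
    return [decode(statement) for statement in program]


def run_translated(decoded, repeats):
    """Run statements that were already decoded."""
    acc = 0
    for _ in range(repeats):
        for op, arg in decoded:
            acc = execute(op, arg, acc)
    return acc


PROGRAM = ["add 2", "mul 3"]

interpreted_result = interpret(PROGRAM, repeats=3)
interpreted_decodes = decodes["count"]

decodes["count"] = 0
translated_result = run_translated(translate(PROGRAM), repeats=3)
translated_decodes = decodes["count"]
```

Both runs compute the same result, but the first decodes six statements and the second only two; in a loop that runs many times, the difference grows with the number of repetitions.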
10130530@unknown@formal@none@1@S@On the other hand, software development may be quicker using an interpreter because testing is immediate when the compilation step is omitted.@@@@1@22@@danf@17-8-2009
10130540@unknown@formal@none@1@S@Another disadvantage of interpreters is that the interpreter must be present on the computer at the time the computer program is executed.@@@@1@21@@danf@17-8-2009
10130550@unknown@formal@none@1@S@In contrast, compiled computer programs need not have the compiler present at the time of execution.@@@@1@15@@danf@17-8-2009
10130560@unknown@formal@none@1@S@No properties of a programming language require it to be exclusively compiled or exclusively interpreted.@@@@1@15@@danf@17-8-2009
10130570@unknown@formal@none@1@S@The categorization usually reflects the most popular method of language execution.@@@@1@11@@danf@17-8-2009
10130580@unknown@formal@none@1@S@For example, BASIC is thought of as an interpreted language and C a compiled language, despite the existence of BASIC compilers and C interpreters.@@@@1@24@@danf@17-8-2009
10130590@unknown@formal@none@1@S@===Self-modifying programs===@@@@1@2@@danf@17-8-2009
10130600@unknown@formal@none@1@S@A computer program in [[execution (computers)|execution]] is normally treated as being different from the [[data (computing)|data]] the program operates on.@@@@1@20@@danf@17-8-2009
10130610@unknown@formal@none@1@S@However, in some cases this distinction is blurred when a computer program modifies itself.@@@@1@14@@danf@17-8-2009
10130620@unknown@formal@none@1@S@The modified computer program is subsequently executed as part of the same program.@@@@1@13@@danf@17-8-2009
10130630@unknown@formal@none@1@S@[[Self-modifying code]] is possible for programs written in [[Lisp programming language|Lisp]], [[cobol|COBOL]], and [[Prolog]].@@@@1@14@@danf@17-8-2009
10130640@unknown@formal@none@1@S@==Execution and storage==@@@@1@3@@danf@17-8-2009
10130650@unknown@formal@none@1@S@Typically, computer programs are stored in [[non-volatile memory]] until requested either directly or indirectly to be [[execution (computers)|executed]] by the computer user.@@@@1@22@@danf@17-8-2009
10130660@unknown@formal@none@1@S@Upon such a request, a computer program called an [[operating system]] loads the program into [[random access memory]], where it can be accessed directly by the central processor.@@@@1@30@@danf@17-8-2009
10130670@unknown@formal@none@1@S@The central processor then executes ("runs") the program, instruction by instruction, until termination.@@@@1@13@@danf@17-8-2009
10130680@unknown@formal@none@1@S@A program in execution is called a [[Process (computing)|process]].@@@@1@9@@danf@17-8-2009
10130690@unknown@formal@none@1@S@Termination is either by normal self-termination or by error — software or hardware error.@@@@1@14@@danf@17-8-2009
10130700@unknown@formal@none@1@S@===Embedded programs===@@@@1@2@@danf@17-8-2009
10130710@unknown@formal@none@1@S@Some computer programs are embedded into hardware.@@@@1@7@@danf@17-8-2009
10130720@unknown@formal@none@1@S@A [[stored-program computer]] requires an initial computer program stored in its [[read-only memory]] to [[booting|boot]].@@@@1@15@@danf@17-8-2009
10130730@unknown@formal@none@1@S@The boot process is to identify and initialize all aspects of the system, from [[Processor register|CPU registers]] to [[Device driver|device controllers]] to [[Volatile memory|memory]] contents.@@@@1@25@@danf@17-8-2009
10130740@unknown@formal@none@1@S@Following the initialization process, this initial computer program loads the [[operating system]] and sets the [[program counter]] to begin normal operations.@@@@1@21@@danf@17-8-2009
10130750@unknown@formal@none@1@S@Independent of the host computer, a [[Peripheral|hardware device]] might have embedded [[firmware]] to control its operation.@@@@1@16@@danf@17-8-2009
10130760@unknown@formal@none@1@S@Firmware is used when the computer program is rarely or never expected to change, or when the program must not be lost when the power is off.@@@@1@27@@danf@17-8-2009
10130770@unknown@formal@none@1@S@===Manual programming===@@@@1@2@@danf@17-8-2009
10130780@unknown@formal@none@1@S@Computer programs historically were manually input to the central processor via switches.@@@@1@12@@danf@17-8-2009
10130790@unknown@formal@none@1@S@An instruction was represented by a configuration of on/off settings.@@@@1@10@@danf@17-8-2009
10130800@unknown@formal@none@1@S@After setting the configuration, an execute button was pressed.@@@@1@9@@danf@17-8-2009
10130810@unknown@formal@none@1@S@This process was then repeated.@@@@1@5@@danf@17-8-2009
10130820@unknown@formal@none@1@S@Computer programs also historically were manually input via [[paper tape]] or [[punched cards]].@@@@1@13@@danf@17-8-2009
10130830@unknown@formal@none@1@S@After the medium was loaded, the starting address was set via switches and the execute button pressed.@@@@1@17@@danf@17-8-2009
10130840@unknown@formal@none@1@S@===Automatic program generation===@@@@1@3@@danf@17-8-2009
10130850@unknown@formal@none@1@S@[[Generative programming]] is a style of [[computer programming]] that creates [[source code]] through [[generic programming|generic]] [[class (computer science)|classes]], [[Prototype-based programming|prototypes]], [[template (programming)|template]]s, [[aspect (computer science)|aspect]]s, and [[Code generation (compiler)|code generator]]s to improve [[programmer]] productivity.@@@@1@34@@danf@17-8-2009
10130860@unknown@formal@none@1@S@Source code is generated with [[programming tool]]s such as a [[template processor]] or an [[Integrated development environment|Integrated Development Environment]].@@@@1@19@@danf@17-8-2009
10130870@unknown@formal@none@1@S@The simplest form of source code generator is a [[Macro (computer science)|macro]] processor, such as the [[C preprocessor]], which replaces patterns in source code according to relatively simple rules.@@@@1@29@@danf@17-8-2009
10130880@unknown@formal@none@1@S@[[Software engine]]s output source code or [[Markup language|markup code]] that simultaneously become the input to another [[Process (computing)|computer process]].@@@@1@19@@danf@17-8-2009
10130890@unknown@formal@none@1@S@The analogy is that of one process driving another process, with the computer code being burned as fuel.@@@@1@18@@danf@17-8-2009
10130900@unknown@formal@none@1@S@[[Application server]]s are software engines that deliver applications to [[client computer]]s.@@@@1@11@@danf@17-8-2009
10130910@unknown@formal@none@1@S@For example, a [[Wiki software|Wiki]] is an application server that allows users to build [[dynamic web page|dynamic content]] assembled from [[article (publishing)|articles]].@@@@1@22@@danf@17-8-2009
10130920@unknown@formal@none@1@S@Wikis generate [[HTML]], [[CSS]], [[Java (programming language)|Java]], and [[Javascript]] which are then [[Interpreter (computing)|interpreted]] by a [[web browser]].@@@@1@18@@danf@17-8-2009
10130930@unknown@formal@none@1@S@=== Simultaneous execution===@@@@1@3@@danf@17-8-2009
10130940@unknown@formal@none@1@S@Many operating systems support [[computer multitasking|multitasking]] which enables many computer programs to appear to be running simultaneously on a single computer.@@@@1@21@@danf@17-8-2009
10130950@unknown@formal@none@1@S@Operating systems may run multiple programs through [[process scheduling]] — a software mechanism to [[Context switch|switch]] the CPU among processes frequently so that users can [[Time-sharing|interact]] with each program while it is running.@@@@1@33@@danf@17-8-2009
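The idea of switching one processor among several processes can be sketched with a round-robin queue. This Python sketch is a simplification under stated assumptions: each "process" is a generator that gives up control voluntarily after one step, whereas a real operating system scheduler typically preempts processes on a timer. The names `process` and `round_robin` are hypothetical.

```python
from collections import deque


def process(name, steps):
    """A toy process that does `steps` units of work, pausing after each one."""
    for i in range(1, steps + 1):
        yield f"{name} step {i}"


def round_robin(processes):
    """Give each ready process one step in turn until all have finished."""
    ready = deque(processes)
    log = []
    while ready:
        current = ready.popleft()  # switch to the next ready process
        try:
            log.append(next(current))  # let it run for one step
        except StopIteration:
            continue  # the process has terminated; do not requeue it
        ready.append(current)  # still running: send it to the back of the queue
    return log


log = round_robin([process("A", 2), process("B", 3)])
```

The resulting `log` interleaves the steps of A and B, which is why both programs appear to be running at the same time even though only one step executes at any moment.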
10130960@unknown@formal@none@1@S@Within hardware, modern day multiprocessor computers or computers with multicore processors may run multiple programs.@@@@1@15@@danf@17-8-2009
10130970@unknown@formal@none@1@S@== Functional categories ==@@@@1@4@@danf@17-8-2009
10130980@unknown@formal@none@1@S@Computer programs may be categorized along functional lines.@@@@1@8@@danf@17-8-2009
10130990@unknown@formal@none@1@S@These functional categories are [[system software]] and [[application software]].@@@@1@9@@danf@17-8-2009
10131000@unknown@formal@none@1@S@System software includes the [[operating system]] which couples the [[computer hardware|computer's hardware]] with the application software.@@@@1@16@@danf@17-8-2009
10131010@unknown@formal@none@1@S@The purpose of the operating system is to provide an environment in which application software executes in a convenient and efficient manner.@@@@1@22@@danf@17-8-2009
10131020@unknown@formal@none@1@S@In addition to the operating system, system software includes [[Utility software|utility programs]] that help manage and tune the computer.@@@@1@19@@danf@17-8-2009
10131030@unknown@formal@none@1@S@If a computer program is not system software then it is application software.@@@@1@13@@danf@17-8-2009
10131040@unknown@formal@none@1@S@Application software includes [[middleware]], which couples the system software with the [[user interface]].@@@@1@13@@danf@17-8-2009
10131050@unknown@formal@none@1@S@Application software also includes utility programs that help users solve application problems, like the need for sorting.@@@@1@17@@danf@17-8-2009
10140010@unknown@formal@none@1@S@Computer science@@@@1@2@@danf@17-8-2009
10140020@unknown@formal@none@1@S@'''Computer science''' (or '''computing science''') is the study and the [[science]] of the theoretical foundations of [[information]] and [[computation]] and their implementation and application in [[computer|computer system]]s.@@@@1@27@@danf@17-8-2009
10140030@unknown@formal@none@1@S@Computer science has many sub-fields; some emphasize the computation of specific results (such as [[computer graphics]]), while others relate to properties of [[computational problem]]s (such as [[computational complexity theory]]).@@@@1@29@@danf@17-8-2009
10140040@unknown@formal@none@1@S@Still others focus on the challenges in implementing computations.@@@@1@9@@danf@17-8-2009
10140050@unknown@formal@none@1@S@For example, [[programming language theory]] studies approaches to describing computations, while [[computer programming]] applies specific [[programming language]]s to solve specific computational problems.@@@@1@22@@danf@17-8-2009
10140060@unknown@formal@none@1@S@A further subfield, [[human-computer interaction]], focuses on the challenges in making computers and computations useful, usable and universally accessible to [[humans|people]].@@@@1@21@@danf@17-8-2009
10140070@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009
10140080@unknown@formal@none@1@S@The early foundations of what would become computer science predate the invention of the modern [[digital computer]].@@@@1@17@@danf@17-8-2009
10140090@unknown@formal@none@1@S@Machines for calculating fixed numerical tasks, such as the [[abacus]], have existed since antiquity.@@@@1@14@@danf@17-8-2009
10140100@unknown@formal@none@1@S@[[Wilhelm Schickard]] built the first mechanical calculator in 1623.@@@@1@9@@danf@17-8-2009
10140110@unknown@formal@none@1@S@[[Charles Babbage]] designed a [[difference engine]] in [[Victorian era|Victorian]] times (between 1837 and 1901) helped by [[Ada Lovelace]].@@@@1@18@@danf@17-8-2009
10140120@unknown@formal@none@1@S@Around 1900, the [[IBM]] corporation sold [[Key_punch|punch-card machines]].@@@@1@8@@danf@17-8-2009
10140130@unknown@formal@none@1@S@However, all of these machines were constrained to perform a single task, or at best some subset of all possible tasks.@@@@1@21@@danf@17-8-2009
10140140@unknown@formal@none@1@S@During the 1940s, as newer and more powerful computing machines were developed, the term ''computer'' came to refer to the machines rather than their human predecessors.@@@@1@26@@danf@17-8-2009
10140150@unknown@formal@none@1@S@As it became clear that computers could be used for more than just mathematical calculations, the field of computer science broadened to study [[computation]] in general.@@@@1@26@@danf@17-8-2009
10140160@unknown@formal@none@1@S@Computer science began to be established as a distinct academic discipline in the 1960s, with the creation of the first computer science departments and degree programs.@@@@1@26@@danf@17-8-2009
10140170@unknown@formal@none@1@S@Since practical computers became available, many applications of computing have become distinct areas of study in their own right.@@@@1@19@@danf@17-8-2009
10140180@unknown@formal@none@1@S@Many initially believed it impossible that "computers themselves could actually be a scientific field of study" (Levy 1984, p. 11), though it was in the "late fifties" (Levy 1984, p. 11) that it gradually became accepted among the greater academic population.@@@@1@40@@danf@17-8-2009
10140190@unknown@formal@none@1@S@It is the now well-known IBM brand that formed part of the computer science revolution during this time.@@@@1@18@@danf@17-8-2009
10140200@unknown@formal@none@1@S@'IBM' (short for International Business Machines) released the IBM 704 and later the IBM 709 computers, which were widely used during the exploration period of such devices.@@@@1@27@@danf@17-8-2009
10140210@unknown@formal@none@1@S@"Still, working with the IBM [computer] was frustrating...if you had misplaced as much as one letter in one instruction, the program would crash, and you would have to start the whole process over again" (Levy 1984, p. 13).@@@@1@37@@danf@17-8-2009
10140220@unknown@formal@none@1@S@During the late 1950s, the computer science discipline was very much in its developmental stages, and such issues were commonplace.@@@@1@20@@danf@17-8-2009
10140230@unknown@formal@none@1@S@Time has seen significant improvements in the usability and effectiveness of computer science technology.@@@@1@14@@danf@17-8-2009
10140240@unknown@formal@none@1@S@Modern society has seen a significant shift from computers being used solely by experts or professionals to a more widespread user base.@@@@1@22@@danf@17-8-2009
10140250@unknown@formal@none@1@S@By the 1990s, computers became accepted as being the norm within everyday life.@@@@1@13@@danf@17-8-2009
10140260@unknown@formal@none@1@S@During this time, data entry was a primary component of the use of computers, with many preferring to streamline their business practices through the use of a computer.@@@@1@27@@danf@17-8-2009
10140270@unknown@formal@none@1@S@This also gave the additional benefit of removing the need for large amounts of documentation and file records, which consumed much-needed physical space within offices.@@@@1@25@@danf@17-8-2009
10140280@unknown@formal@none@1@S@== Major achievements ==@@@@1@4@@danf@17-8-2009
10140290@unknown@formal@none@1@S@Despite its relatively short history as a formal academic discipline, computer science has made a number of fundamental contributions to [[science]] and [[society]].@@@@1@23@@danf@17-8-2009
10140300@unknown@formal@none@1@S@These include:@@@@1@2@@danf@17-8-2009
10140310@unknown@formal@none@1@S@;Applications within computer science@@@@1@4@@danf@17-8-2009
10140320@unknown@formal@none@1@S@* A formal definition of [[computation]] and [[computability]], and proof that there are computationally [[Undecidable problem|unsolvable]] and [[Intractable#Intractability|intractable]] problems.@@@@1@19@@danf@17-8-2009
10140330@unknown@formal@none@1@S@* The concept of a [[programming language]], a tool for the precise expression of methodological information at various levels of abstraction.@@@@1@21@@danf@17-8-2009
10140340@unknown@formal@none@1@S@;Applications outside of computing@@@@1@4@@danf@17-8-2009
10140350@unknown@formal@none@1@S@* Sparked the [[Digital Revolution]] which led to the current [[Information Age]] and the [[Internet]].@@@@1@15@@danf@17-8-2009
10140360@unknown@formal@none@1@S@* In [[cryptography]], [[Cryptanalysis of the Enigma|breaking the Enigma machine]] was an important factor contributing to the Allied victory in World War II.@@@@1@23@@danf@17-8-2009
10140370@unknown@formal@none@1@S@* [[Scientific computing]] enabled advanced study of the mind, and mapping the human genome became possible with the [[Human Genome Project]].@@@@1@20@@danf@17-8-2009
10140380@unknown@formal@none@1@S@[[Distributed computing]] projects like [[Folding\shome]] explore [[protein folding]].@@@@1@8@@danf@17-8-2009
10140390@unknown@formal@none@1@S@* [[Algorithmic trading]] has increased the [[Economic efficiency|efficiency]] and [[Market liquidity|liquidity]] of financial markets by using [[artificial intelligence]], [[machine learning]] and other [[statistics|statistical]] and [[Numerical analysis|numerical]] techniques on a large scale.@@@@1@31@@danf@17-8-2009
10140400@unknown@formal@none@1@S@== Relationship with other fields ==@@@@1@6@@danf@17-8-2009
10140410@unknown@formal@none@1@S@Despite its name, a significant amount of computer science does not involve the study of computers themselves.@@@@1@17@@danf@17-8-2009
10140420@unknown@formal@none@1@S@Because of this, several alternative names have been proposed.@@@@1@9@@danf@17-8-2009
10140430@unknown@formal@none@1@S@Danish scientist [[Peter Naur]] suggested the term ''datalogy'', to reflect the fact that the scientific discipline revolves around data and data treatment, while not necessarily involving computers.@@@@1@27@@danf@17-8-2009
10140440@unknown@formal@none@1@S@The first scientific institution to use the term was the Department of Datalogy at the University of Copenhagen, founded in 1969, with Peter Naur being the first professor in datalogy.@@@@1@30@@danf@17-8-2009
10140450@unknown@formal@none@1@S@The term is used mainly in the Scandinavian countries.@@@@1@9@@danf@17-8-2009
10140460@unknown@formal@none@1@S@Also, in the early days of computing, a number of terms for the practitioners of the field of computing were suggested in the ''Communications of the ACM'': ''turingineer'', ''turologist'', ''flow-charts-man'', ''applied meta-mathematician'', and ''applied epistemologist''.@@@@1@36@@danf@17-8-2009
10140470@unknown@formal@none@1@S@Three months later in the same journal, ''comptologist'' was suggested, followed the next year by ''hypologist''.@@@@1@15@@danf@17-8-2009
10140480@unknown@formal@none@1@S@Recently the term ''computics'' has been suggested.@@@@1@7@@danf@17-8-2009
10140490@unknown@formal@none@1@S@''Informatik'' was a term used more frequently in Europe.@@@@1@10@@danf@17-8-2009
10140500@unknown@formal@none@1@S@The renowned computer scientist [[Edsger W. Dijkstra|Edsger Dijkstra]] stated, "Computer science is no more about computers than astronomy is about telescopes."@@@@1@21@@danf@17-8-2009
10140510@unknown@formal@none@1@S@The design and deployment of computers and computer systems is generally considered the province of disciplines other than computer science.@@@@1@20@@danf@17-8-2009
10140520@unknown@formal@none@1@S@For example, the study of [[computer hardware]] is usually considered part of [[computer engineering]], while the study of commercial [[computer system]]s and their deployment is often called [[information technology]] or [[information systems]].@@@@1@32@@danf@17-8-2009
10140530@unknown@formal@none@1@S@Computer science is sometimes criticized as being insufficiently scientific, a view espoused in the statement "Science is to computer science as hydrodynamics is to plumbing", credited to [[Stan Kelly-Bootle]] and others.@@@@1@31@@danf@17-8-2009
10140540@unknown@formal@none@1@S@However, there has been much cross-fertilization of ideas between the various computer-related disciplines.@@@@1@13@@danf@17-8-2009
10140550@unknown@formal@none@1@S@Computer science research has also often crossed into other disciplines, such as [[cognitive science]], [[economics]], [[mathematics]], [[physics]] (see [[quantum computing]]), and [[linguistics]].@@@@1@22@@danf@17-8-2009
10140560@unknown@formal@none@1@S@Computer science is considered by some to have a much closer relationship with [[mathematics]] than many scientific disciplines.@@@@1@18@@danf@17-8-2009
10140570@unknown@formal@none@1@S@Early computer science was strongly influenced by the work of mathematicians such as [[Kurt Gödel]] and [[Alan Turing]], and there continues to be a useful interchange of ideas between the two fields in areas such as [[mathematical logic]], [[category theory]], [[domain theory]], and [[algebra]].@@@@1@44@@danf@17-8-2009
10140580@unknown@formal@none@1@S@The relationship between computer science and [[software engineering]] is a contentious issue, which is further muddied by [[Debates within software engineering|disputes]] over what the term "software engineering" means, and how computer science is defined.@@@@1@34@@danf@17-8-2009
10140590@unknown@formal@none@1@S@[[David Parnas]], taking a cue from the relationship between other engineering and science disciplines, has claimed that the principal focus of computer science is studying the properties of computation in general, while the principal focus of software engineering is the design of specific computations to achieve practical goals, making the two separate but complementary disciplines.@@@@1@55@@danf@17-8-2009
10140600@unknown@formal@none@1@S@The academic, political, and funding aspects of computer science tend to depend on whether a department in the U.S. was formed with a mathematical emphasis or an engineering emphasis.@@@@1@29@@danf@17-8-2009
10140610@unknown@formal@none@1@S@In general, electrical engineering-based computer science departments have tended to succeed as computer science and/or engineering departments.@@@@1@17@@danf@17-8-2009
10140620@unknown@formal@none@1@S@Computer science departments with a mathematics emphasis and with a numerical orientation consider alignment with [[computational science]].@@@@1@17@@danf@17-8-2009
10140630@unknown@formal@none@1@S@Both types of departments tend to make efforts to bridge the field educationally if not across all research.@@@@1@18@@danf@17-8-2009
10140640@unknown@formal@none@1@S@== Fields of computer science ==@@@@1@6@@danf@17-8-2009
10140650@unknown@formal@none@1@S@Computer science searches for concepts and [[formal proof]]s to explain and describe computational systems of interest.@@@@1@16@@danf@17-8-2009
10140660@unknown@formal@none@1@S@As with all sciences, these theories can then be utilised to synthesize practical engineering applications, which in turn may suggest new systems to be studied and analysed.@@@@1@27@@danf@17-8-2009
10140670@unknown@formal@none@1@S@While the [[ACM Computing Classification System]] can be used to split computer science up into different topics or fields, a more descriptive breakdown follows:@@@@1@24@@danf@17-8-2009
10140680@unknown@formal@none@1@S@=== Mathematical foundations ===@@@@1@4@@danf@17-8-2009
10140690@unknown@formal@none@1@S@; [[Mathematical logic]]@@@@1@3@@danf@17-8-2009
10140700@unknown@formal@none@1@S@: Boolean logic and other ways of modeling logical queries; the uses and limitations of formal proof methods.@@@@1@18@@danf@17-8-2009
10140710@unknown@formal@none@1@S@; [[Number theory]]@@@@1@3@@danf@17-8-2009
10140720@unknown@formal@none@1@S@: Theory of proofs and heuristics for finding proofs in the simple domain of integers.@@@@1@15@@danf@17-8-2009
10140730@unknown@formal@none@1@S@Used in [[cryptography]] as well as a test domain in [[artificial intelligence]].@@@@1@12@@danf@17-8-2009
10140740@unknown@formal@none@1@S@; [[Graph theory]]@@@@1@3@@danf@17-8-2009
10140750@unknown@formal@none@1@S@: Foundations for data structures and searching algorithms.@@@@1@8@@danf@17-8-2009
10140760@unknown@formal@none@1@S@; [[Type theory]]@@@@1@3@@danf@17-8-2009
10140770@unknown@formal@none@1@S@: Formal analysis of the types of data, and the use of these types to understand properties of programs, especially program safety.@@@@1@22@@danf@17-8-2009
10140780@unknown@formal@none@1@S@; [[Category theory]]@@@@1@3@@danf@17-8-2009
10140790@unknown@formal@none@1@S@: Category theory provides a means of capturing all of math and computation in a single synthesis.@@@@1@17@@danf@17-8-2009
10140800@unknown@formal@none@1@S@; [[Computational geometry]]@@@@1@3@@danf@17-8-2009
10140810@unknown@formal@none@1@S@: The study of [[algorithm]]s to solve problems stated in terms of [[geometry]].@@@@1@13@@danf@17-8-2009
10140820@unknown@formal@none@1@S@; [[Numerical analysis]]@@@@1@3@@danf@17-8-2009
10140830@unknown@formal@none@1@S@: Foundations for algorithms in discrete mathematics, as well as the study of the limitations of floating point computation, including [[round-off]] errors.@@@@1@22@@danf@17-8-2009
10140840@unknown@formal@none@1@S@=== Theory of computation ===@@@@1@5@@danf@17-8-2009
10140850@unknown@formal@none@1@S@; [[Automata theory]]@@@@1@3@@danf@17-8-2009
10140860@unknown@formal@none@1@S@: Different logical structures for solving problems.@@@@1@7@@danf@17-8-2009
10140870@unknown@formal@none@1@S@; [[Computability theory (computer science)|Computability theory]]@@@@1@6@@danf@17-8-2009
10140880@unknown@formal@none@1@S@: What is calculable with the current models of computers.@@@@1@10@@danf@17-8-2009
10140890@unknown@formal@none@1@S@Proofs developed by [[Alan Turing]] and others provide insight into the possibilities of what can be computed and what cannot.@@@@1@20@@danf@17-8-2009
10140900@unknown@formal@none@1@S@; [[Computational complexity theory]]@@@@1@4@@danf@17-8-2009
10140910@unknown@formal@none@1@S@: Fundamental bounds (especially time and storage space) on classes of computations; in practice, study of which problems a computer can solve with reasonable resources (while computability theory studies which problems can be solved at all).@@@@1@36@@danf@17-8-2009
10140920@unknown@formal@none@1@S@; [[Quantum computing|Quantum computing theory]]@@@@1@5@@danf@17-8-2009
10140930@unknown@formal@none@1@S@: Representation and manipulation of data using the quantum properties of particles and quantum mechanics.@@@@1@15@@danf@17-8-2009
10140940@unknown@formal@none@1@S@=== Algorithms and data structures ===@@@@1@6@@danf@17-8-2009
10140950@unknown@formal@none@1@S@; [[Analysis of algorithms]]@@@@1@4@@danf@17-8-2009
10140960@unknown@formal@none@1@S@: Time and space complexity of algorithms.@@@@1@7@@danf@17-8-2009
10140970@unknown@formal@none@1@S@; [[Algorithms]]@@@@1@2@@danf@17-8-2009
10140980@unknown@formal@none@1@S@: Formal logical processes used for computation, and the efficiency of these processes.@@@@1@13@@danf@17-8-2009
10140990@unknown@formal@none@1@S@=== Programming languages and compilers ===@@@@1@6@@danf@17-8-2009
10141000@unknown@formal@none@1@S@; [[Compiler]]s@@@@1@2@@danf@17-8-2009
10141010@unknown@formal@none@1@S@: Ways of translating computer programs, usually from [[high-level programming language|higher level]] languages to [[low-level programming language|lower level]] ones.@@@@1@19@@danf@17-8-2009
10141020@unknown@formal@none@1@S@; [[Interpreter (computing)|Interpreter]]s@@@@1@3@@danf@17-8-2009
10141030@unknown@formal@none@1@S@: A program that takes in as input a computer program and executes it.@@@@1@14@@danf@17-8-2009
10141040@unknown@formal@none@1@S@; [[Programming language]]s@@@@1@3@@danf@17-8-2009
10141050@unknown@formal@none@1@S@: Formal language paradigms for expressing algorithms, and the properties of these languages (e.g., what problems they are suited to solve).@@@@1@21@@danf@17-8-2009
10141060@unknown@formal@none@1@S@=== Concurrent, parallel, and distributed systems ===@@@@1@7@@danf@17-8-2009
10141070@unknown@formal@none@1@S@; [[Concurrency (computer science)|Concurrency]]@@@@1@4@@danf@17-8-2009
10141080@unknown@formal@none@1@S@: The theory and practice of simultaneous computation; data safety in any multitasking or multithreaded environment.@@@@1@16@@danf@17-8-2009
10141090@unknown@formal@none@1@S@; [[Distributed computing]]@@@@1@3@@danf@17-8-2009
10141100@unknown@formal@none@1@S@: Computing using multiple computing devices over a network to accomplish a common objective or task and thereby reducing the latency involved in single processor contributions for any task.@@@@1@29@@danf@17-8-2009
10141110@unknown@formal@none@1@S@; [[Parallel computing]]@@@@1@3@@danf@17-8-2009
10141120@unknown@formal@none@1@S@: Computing using multiple concurrent threads of execution.@@@@1@8@@danf@17-8-2009
10141130@unknown@formal@none@1@S@=== Software engineering ===@@@@1@4@@danf@17-8-2009
10141140@unknown@formal@none@1@S@; [[Algorithm design]]@@@@1@3@@danf@17-8-2009
10141150@unknown@formal@none@1@S@: Using ideas from algorithm theory to creatively design solutions to real tasks@@@@1@13@@danf@17-8-2009
10141160@unknown@formal@none@1@S@; [[Computer programming]]@@@@1@3@@danf@17-8-2009
10141170@unknown@formal@none@1@S@: The practice of using a programming language to implement algorithms@@@@1@11@@danf@17-8-2009
10141180@unknown@formal@none@1@S@; [[Formal methods]]@@@@1@3@@danf@17-8-2009
10141190@unknown@formal@none@1@S@: Mathematical approaches for describing and reasoning about software designs.@@@@1@10@@danf@17-8-2009
10141200@unknown@formal@none@1@S@; [[Reverse engineering]]@@@@1@3@@danf@17-8-2009
10141210@unknown@formal@none@1@S@: The application of the scientific method to the understanding of arbitrary existing software@@@@1@14@@danf@17-8-2009
10141220@unknown@formal@none@1@S@; [[Software development]]@@@@1@3@@danf@17-8-2009
10141230@unknown@formal@none@1@S@: The principles and practice of designing, developing, and testing programs, as well as proper engineering practices.@@@@1@17@@danf@17-8-2009
10141240@unknown@formal@none@1@S@=== System architecture ===@@@@1@4@@danf@17-8-2009
10141250@unknown@formal@none@1@S@; [[Computer architecture]]@@@@1@3@@danf@17-8-2009
10141260@unknown@formal@none@1@S@: The design, organization, optimization and verification of a computer system, mostly about [[CPU]]s and [[memory (computers)|memory]] subsystems (and the bus connecting them).@@@@1@23@@danf@17-8-2009
10141270@unknown@formal@none@1@S@; [[Computer organization]]@@@@1@3@@danf@17-8-2009
10141280@unknown@formal@none@1@S@: The implementation of computer architectures, in terms of descriptions of their specific [[electrical circuit]]ry@@@@1@15@@danf@17-8-2009
10141290@unknown@formal@none@1@S@; [[Operating system]]s@@@@1@3@@danf@17-8-2009
10141300@unknown@formal@none@1@S@: Systems for managing computer programs and providing the basis of a useable system.@@@@1@14@@danf@17-8-2009
10141310@unknown@formal@none@1@S@=== Communications ===@@@@1@3@@danf@17-8-2009
10141320@unknown@formal@none@1@S@; [[Computer audio]]@@@@1@3@@danf@17-8-2009
10141330@unknown@formal@none@1@S@: Algorithms and data structures for the creation, manipulation, storage, and transmission of [[digital audio]] recordings.@@@@1@16@@danf@17-8-2009
10141340@unknown@formal@none@1@S@Also important in [[voice recognition]] applications.@@@@1@6@@danf@17-8-2009
10141350@unknown@formal@none@1@S@; [[Computer networking|Networking]]@@@@1@3@@danf@17-8-2009
10141360@unknown@formal@none@1@S@: Algorithms and protocols for communicating data across different shared or dedicated media, often including [[error correction]].@@@@1@17@@danf@17-8-2009
10141370@unknown@formal@none@1@S@; [[Cryptography]]@@@@1@2@@danf@17-8-2009
10141380@unknown@formal@none@1@S@: Applies results from complexity, probability and number theory to invent and break codes.@@@@1@14@@danf@17-8-2009
10141390@unknown@formal@none@1@S@=== Databases ===@@@@1@3@@danf@17-8-2009
10141400@unknown@formal@none@1@S@; [[Data mining]]@@@@1@3@@danf@17-8-2009
10141410@unknown@formal@none@1@S@: Data mining is the extraction of relevant data from all sources of data.@@@@1@14@@danf@17-8-2009
10141420@unknown@formal@none@1@S@; [[Relational databases]]@@@@1@3@@danf@17-8-2009
10141430@unknown@formal@none@1@S@: Study of algorithms for searching and processing information in documents and databases; closely related to [[information retrieval]].@@@@1@18@@danf@17-8-2009
10141440@unknown@formal@none@1@S@; [[OLAP]]@@@@1@2@@danf@17-8-2009
10141450@unknown@formal@none@1@S@: Online Analytical Processing, or OLAP, is an approach to quickly provide answers to analytical queries that are multi-dimensional in nature.@@@@1@21@@danf@17-8-2009
10141460@unknown@formal@none@1@S@OLAP is part of the broader category [[business intelligence]], which also encompasses relational reporting and data mining.@@@@1@17@@danf@17-8-2009
10141470@unknown@formal@none@1@S@=== Artificial intelligence ===@@@@1@4@@danf@17-8-2009
10141480@unknown@formal@none@1@S@; [[Artificial intelligence]]@@@@1@3@@danf@17-8-2009
10141490@unknown@formal@none@1@S@: The implementation and study of systems that exhibit an autonomous intelligence or behaviour of their own.@@@@1@17@@danf@17-8-2009
10141500@unknown@formal@none@1@S@; [[Artificial life]]@@@@1@3@@danf@17-8-2009
10141510@unknown@formal@none@1@S@: The study of digital organisms to learn about biological systems and evolution.@@@@1@13@@danf@17-8-2009
10141520@unknown@formal@none@1@S@; [[Automated reasoning]]@@@@1@3@@danf@17-8-2009
10141530@unknown@formal@none@1@S@: Solving engines, such as used in [[Prolog]], which produce steps to a result given a query on a fact and rule database.@@@@1@23@@danf@17-8-2009
10141540@unknown@formal@none@1@S@; [[Computer vision]]@@@@1@3@@danf@17-8-2009
10141550@unknown@formal@none@1@S@: Algorithms for identifying three dimensional objects from one or more two dimensional pictures.@@@@1@14@@danf@17-8-2009
10141560@unknown@formal@none@1@S@; [[Machine learning]]@@@@1@3@@danf@17-8-2009
10141570@unknown@formal@none@1@S@: Automated creation of a set of rules and axioms based on input.@@@@1@13@@danf@17-8-2009
10141580@unknown@formal@none@1@S@; [[Natural language processing]]/[[Computational linguistics]]@@@@1@5@@danf@17-8-2009
10141590@unknown@formal@none@1@S@: Automated understanding and generation of human language@@@@1@8@@danf@17-8-2009
10141600@unknown@formal@none@1@S@; [[Robotics]]@@@@1@2@@danf@17-8-2009
10141610@unknown@formal@none@1@S@: Algorithms for controlling the behavior of robots.@@@@1@8@@danf@17-8-2009
10141620@unknown@formal@none@1@S@=== Visual rendering (or Computer graphics) ===@@@@1@7@@danf@17-8-2009
10141630@unknown@formal@none@1@S@; [[Computer graphics]]@@@@1@3@@danf@17-8-2009
10141640@unknown@formal@none@1@S@: Algorithms both for generating visual images synthetically, and for integrating or altering visual and spatial information sampled from the real world.@@@@1@22@@danf@17-8-2009
10141650@unknown@formal@none@1@S@; [[Image processing]]@@@@1@3@@danf@17-8-2009
10141660@unknown@formal@none@1@S@: Determining information from an image through computation.@@@@1@8@@danf@17-8-2009
10141670@unknown@formal@none@1@S@=== Human-Computer Interaction ===@@@@1@4@@danf@17-8-2009
10141680@unknown@formal@none@1@S@; [[Human computer interaction]]@@@@1@4@@danf@17-8-2009
10141690@unknown@formal@none@1@S@: The study of making computers and computations useful, usable and universally accessible to [[user (computing)|people]], including the study and design of computer interfaces through which people use computers.@@@@1@29@@danf@17-8-2009
10141700@unknown@formal@none@1@S@=== Scientific computing ===@@@@1@4@@danf@17-8-2009
10141710@unknown@formal@none@1@S@; [[Bioinformatics]]@@@@1@2@@danf@17-8-2009
10141720@unknown@formal@none@1@S@: The use of computer science to maintain, analyse, and store [[biological data]], and to assist in solving biological problems such as [[protein folding]], function prediction and [[phylogeny]].@@@@1@28@@danf@17-8-2009
10141730@unknown@formal@none@1@S@; [[Cognitive Science]]@@@@1@3@@danf@17-8-2009
10141740@unknown@formal@none@1@S@: Computational modelling of real minds@@@@1@6@@danf@17-8-2009
10141750@unknown@formal@none@1@S@; [[Computational chemistry]]@@@@1@3@@danf@17-8-2009
10141760@unknown@formal@none@1@S@: Computational modelling of theoretical chemistry in order to determine chemical structures and properties@@@@1@14@@danf@17-8-2009
10141770@unknown@formal@none@1@S@; [[Computational neuroscience]]@@@@1@3@@danf@17-8-2009
10141780@unknown@formal@none@1@S@: Computational modelling of real brains@@@@1@6@@danf@17-8-2009
10141790@unknown@formal@none@1@S@; [[Computational physics]]@@@@1@3@@danf@17-8-2009
10141800@unknown@formal@none@1@S@: Numerical simulations of large non-analytic systems@@@@1@7@@danf@17-8-2009
10141810@unknown@formal@none@1@S@; [[Numerical analysis|Numerical algorithms]]@@@@1@4@@danf@17-8-2009
10141820@unknown@formal@none@1@S@: Algorithms for the numerical solution of mathematical problems such as [[Root-finding algorithm|root-finding]], [[Numerical integration|integration]], the [[Numerical ordinary differential equations|solution of ordinary differential equations]] and the approximation/evaluation of [[special functions]].@@@@1@30@@danf@17-8-2009
10141830@unknown@formal@none@1@S@; [[Symbolic mathematics]]@@@@1@3@@danf@17-8-2009
10141840@unknown@formal@none@1@S@: Manipulation and solution of expressions in symbolic form, also known as [[Computer algebra]].@@@@1@14@@danf@17-8-2009
10141850@unknown@formal@none@1@S@=== Didactics of computer science/informatics ===@@@@1@6@@danf@17-8-2009
10141860@unknown@formal@none@1@S@The subfield didactics of computer science focuses on cognitive approaches of developing competencies of computer science and specific strategies for analysis, design, implementation and evaluation of excellent lessons in computer science.@@@@1@31@@danf@17-8-2009
10141870@unknown@formal@none@1@S@== Computer science education ==@@@@1@5@@danf@17-8-2009
10141880@unknown@formal@none@1@S@Some universities teach computer science as a theoretical study of computation and algorithmic reasoning.@@@@1@14@@danf@17-8-2009
10141890@unknown@formal@none@1@S@These programs often feature the [[theory of computation]], [[analysis of algorithms]], [[formal methods]], [[Concurrency (computer science)|concurrency theory]], [[databases]], [[computer graphics]] and [[systems analysis]], among others.@@@@1@25@@danf@17-8-2009
10141900@unknown@formal@none@1@S@They typically also teach [[computer programming]], but treat it as a vessel for the support of other fields of computer science rather than a central focus of high-level study.@@@@1@29@@danf@17-8-2009
10141910@unknown@formal@none@1@S@Other colleges and universities, as well as [[secondary school]]s and vocational programs that teach computer science, emphasize the practice of advanced [[computer programming]] rather than the theory of algorithms and computation in their computer science curricula.@@@@1@36@@danf@17-8-2009
10141920@unknown@formal@none@1@S@Such curricula tend to focus on those skills that are important to workers entering the software industry.@@@@1@17@@danf@17-8-2009
10141930@unknown@formal@none@1@S@The practical aspects of computer programming are often referred to as [[software engineering]].@@@@1@13@@danf@17-8-2009
10141940@unknown@formal@none@1@S@However, there is a lot of [[Debates within software engineering|disagreement]] over what the term "software engineering" actually means, and whether it is the same thing as programming.@@@@1@27@@danf@17-8-2009
10150010@unknown@formal@none@1@S@Corpus linguistics@@@@1@2@@danf@17-8-2009
10150020@unknown@formal@none@1@S@'''Corpus linguistics''' is the [[study of language]] as expressed in [[sample]]s ''([[Text corpus|corpora]])'' or "real world" text.@@@@1@17@@danf@17-8-2009
10150030@unknown@formal@none@1@S@This method represents a [[digest]]ive approach to deriving a set of abstract rules by which a [[natural language]] is governed or else relates to another language.@@@@1@26@@danf@17-8-2009
10150040@unknown@formal@none@1@S@Originally done by hand, corpora are largely derived by an automated process, which is corrected.@@@@1@15@@danf@17-8-2009
10150050@unknown@formal@none@1@S@Computational methods had once been viewed as a [[holy grail]] of [[linguistics|linguistic]] research, which would ultimately manifest a [[ruleset]] for [[natural language processing]] and [[machine translation]] at a high level.@@@@1@30@@danf@17-8-2009
10150060@unknown@formal@none@1@S@Such has not been the case, and since the [[cognitive revolution]], cognitive linguistics has been largely critical of many claimed practical uses for corpora.@@@@1@24@@danf@17-8-2009
10150070@unknown@formal@none@1@S@However, as [[computation]] capacity and speed have increased, the use of corpora to study language and term relationships en masse has gained some respectability.@@@@1@24@@danf@17-8-2009
10150080@unknown@formal@none@1@S@The corpus approach runs counter to [[Noam Chomsky]]'s view that real language is riddled with performance-related errors, thus requiring careful analysis of small speech samples obtained in a highly controlled laboratory setting.@@@@1@32@@danf@17-8-2009
10150090@unknown@formal@none@1@S@Corpus linguistics does away with Chomsky's ''competence/performance'' split; adherents believe that reliable language analysis best occurs on field-collected samples, in natural contexts and with minimal experimental interference.@@@@1@27@@danf@17-8-2009
10150100@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009
10150110@unknown@formal@none@1@S@A landmark in modern corpus linguistics was the publication by [[Henry Kucera]] and [[Nelson Francis]] of ''Computational Analysis of Present-Day American English'' in 1967, a work based on the analysis of the [[Brown Corpus]], a carefully compiled selection of current American English, totalling about a million words drawn from a wide variety of sources.@@@@1@54@@danf@17-8-2009
10150120@unknown@formal@none@1@S@Kucera and Francis subjected it to a variety of computational analyses, from which they compiled a rich and variegated opus, combining elements of linguistics, language teaching, [[psychology]], [[statistics]], and [[sociology]].@@@@1@30@@danf@17-8-2009
10150130@unknown@formal@none@1@S@A further key publication was [[Randolph Quirk]]'s 'Towards a description of English Usage' (1960, Transactions of the Philological Society, 40-61) in which he introduced ''The Survey of English Usage''.@@@@1@29@@danf@17-8-2009
10150140@unknown@formal@none@1@S@Shortly thereafter, Boston publisher [[Houghton-Mifflin]] approached Kucera to supply a million word, three-line citation base for its new ''[[The American Heritage Dictionary of the English Language|American Heritage Dictionary]]'', the first [[dictionary]] to be compiled using corpus linguistics.@@@@1@37@@danf@17-8-2009
10150150@unknown@formal@none@1@S@The AHD made the innovative step of combining prescriptive elements (how language ''should'' be used) with descriptive information (how it actually ''is'' used).@@@@1@23@@danf@17-8-2009
10150160@unknown@formal@none@1@S@Other publishers followed suit.@@@@1@4@@danf@17-8-2009
10150170@unknown@formal@none@1@S@The British publisher Collins' [[COBUILD]] [[monolingual learner's dictionary]], designed for users learning [[English language learning and teaching|English as a foreign language]], was compiled using the [[Bank of English]].@@@@1@28@@danf@17-8-2009
10150180@unknown@formal@none@1@S@The [[Brown Corpus]] has also spawned a number of similarly structured corpora: the [[LOB Corpus]] (1960s [[British English]]), Kolhapur ([[Indian English]]), Wellington ([[New Zealand English]]), Australian Corpus of English ([[Australian English]]), the Frown Corpus ([[early 1990s]] [[American English]]), and the FLOB Corpus (1990s British English).@@@@1@45@@danf@17-8-2009
10150190@unknown@formal@none@1@S@Other corpora represent many languages, varieties and modes, and include the [[International Corpus of English]], and the [[British National Corpus]], a 100 million word collection of a range of spoken and written texts, created in the 1990s by a consortium of publishers, universities ([[Oxford University|Oxford]] and [[Lancaster University|Lancaster]]) and the [[British Library]].@@@@1@52@@danf@17-8-2009
10150200@unknown@formal@none@1@S@For contemporary American English, work has stalled on the [[American National Corpus]], but the 360 million word [[Corpus of Contemporary American English (COCA)]] (1990-present) is now available.@@@@1@27@@danf@17-8-2009
10150210@unknown@formal@none@1@S@== Methods ==@@@@1@3@@danf@17-8-2009
10150220@unknown@formal@none@1@S@This means dealing with real input data, where descriptions based on a linguist's intuition are not usually helpful.@@@@1@18@@danf@17-8-2009
10160010@unknown@formal@none@1@S@Cross-platform@@@@1@1@@danf@17-8-2009
10160020@unknown@formal@none@1@S@'''Cross-platform''' (also known as '''multi-platform''') is a term used in computing to refer to [[computer program]]s, [[operating system]]s, [[computer language]]s, [[programming language]]s, or other [[computer software]] and their implementations which can be made to work on multiple [[computer platform]]s.@@@@1@39@@danf@17-8-2009
10160030@unknown@formal@none@1@S@“Cross-platform” and “multi-platform” both refer to the idea that a given piece of computer software is able to be run on more than one computer platform.@@@@1@26@@danf@17-8-2009
10160040@unknown@formal@none@1@S@There are two major types of cross-platform software: one requires building for each platform that it supports (e.g., software written in a compiled language, such as [[Pascal (programming language)|Pascal]]), and the other can be run directly on any platform which supports it (e.g., software written in an [[interpreted language]] such as [[Perl]], [[Python (programming language)|Python]], or [[shell script]], or software written in a language which compiles to [[bytecode]] that is then redistributed, as is the case with [[Java (programming language)|Java]] and languages used in the [[.NET Framework]] such as [[Chrome (programming language)|Chrome]]).@@@@1@92@@danf@17-8-2009
10160050@unknown@formal@none@1@S@For example, a cross-platform [[application software|application]] may run on [[Microsoft Windows]] on the [[x86 architecture]], [[Linux]] on the [[x86 architecture]] and [[Mac OS X]] on either the [[PowerPC]] or [[x86]] based [[Apple Macintosh]] systems.@@@@1@34@@danf@17-8-2009
10160060@unknown@formal@none@1@S@A cross-platform [[application software|application]] may run on as many as all existing platforms, or on as few as two platforms.@@@@1@20@@danf@17-8-2009
10160070@unknown@formal@none@1@S@== Platforms ==@@@@1@3@@danf@17-8-2009
10160080@unknown@formal@none@1@S@A platform is a combination of hardware and software used to run software applications.@@@@1@14@@danf@17-8-2009
10160090@unknown@formal@none@1@S@A platform can be described simply as an operating system or computer architecture, or it could be the combination of both.@@@@1@21@@danf@17-8-2009
10160100@unknown@formal@none@1@S@Probably the most familiar platform is [[Microsoft Windows]] running on the [[x86 architecture]].@@@@1@13@@danf@17-8-2009
10160110@unknown@formal@none@1@S@Other well-known desktop computer platforms include [[Linux]] and [[Mac OS X]] (both of which are themselves cross-platform).@@@@1@17@@danf@17-8-2009
10160120@unknown@formal@none@1@S@There are, however, many devices such as [[cellular telephones]] that are also effectively computer platforms but less commonly thought about in that way.@@@@1@23@@danf@17-8-2009
10160130@unknown@formal@none@1@S@[[Application software]] can be written to depend on the features of a particular platform—either the hardware, operating system, or virtual machine it runs on.@@@@1@24@@danf@17-8-2009
10160140@unknown@formal@none@1@S@The [[Java Platform|Java platform]] is a [[virtual machine]] platform which runs on many operating systems and hardware types, and is a common platform for software to be written for.@@@@1@29@@danf@17-8-2009
10160150@unknown@formal@none@1@S@=== Hardware platforms ===@@@@1@4@@danf@17-8-2009
10160160@unknown@formal@none@1@S@A '''hardware platform''' can refer to a computer’s [[computer architecture|architecture]] or [[processor architecture]].@@@@1@13@@danf@17-8-2009
10160170@unknown@formal@none@1@S@For example, the [[x86]] and [[x86-64]] [[CPU]]s make up one of the most common [[computer architecture]]s in use in home machines today.@@@@1@22@@danf@17-8-2009
10160180@unknown@formal@none@1@S@These machines commonly run [[Microsoft Windows]], though they can run other [[operating system]]s as well, including [[Linux]], [[OpenBSD]], [[NetBSD]], [[Mac OS X]] and [[FreeBSD]].@@@@1@24@@danf@17-8-2009
10160190@unknown@formal@none@1@S@=== Software platforms ===@@@@1@4@@danf@17-8-2009
10160200@unknown@formal@none@1@S@A software platform can be either an [[operating system]] or a programming environment, though more commonly it is a combination of both.@@@@1@22@@danf@17-8-2009
10160210@unknown@formal@none@1@S@A notable exception to this is [[Java (programming language)|Java]], which uses an [[operating system]] independent [[virtual machine]] for its [[compiled]] code, known in the world of Java as [[bytecode]].@@@@1@29@@danf@17-8-2009
10160220@unknown@formal@none@1@S@Examples of software platforms include:@@@@1@5@@danf@17-8-2009
10160230@unknown@formal@none@1@S@* [[MS-DOS]] ([[x86]]), [[DR-DOS]] ([[x86]]), [[FreeDOS]] ([[x86]]) etc.@@@@1@8@@danf@17-8-2009
10160240@unknown@formal@none@1@S@* [[Microsoft Windows]] ([[x86]], [[x64]])@@@@1@5@@danf@17-8-2009
10160250@unknown@formal@none@1@S@* [[Linux]] (x86, x64, [[PowerPC]], various other architectures)@@@@1@8@@danf@17-8-2009
10160260@unknown@formal@none@1@S@* [[Mac OS X]] (PowerPC, x86)@@@@1@6@@danf@17-8-2009
10160270@unknown@formal@none@1@S@* [[OS/2]], [[eComStation]]@@@@1@3@@danf@17-8-2009
10160280@unknown@formal@none@1@S@* [[AmigaOS]] ([[m68k]]), [[AROS]] (x86, PowerPC, m68k), [[MorphOS]] (PowerPC)@@@@1@9@@danf@17-8-2009
10160290@unknown@formal@none@1@S@* [[Java (programming language)|Java]]@@@@1@4@@danf@17-8-2009
10160300@unknown@formal@none@1@S@==== Java platform ====@@@@1@4@@danf@17-8-2009
10160310@unknown@formal@none@1@S@As previously noted, the [[Java platform]] is an exception to the general rule that an [[operating system]] is a software platform.@@@@1@21@@danf@17-8-2009
10160320@unknown@formal@none@1@S@The Java language provides a [[virtual machine]], or a “virtual CPU” which runs all of the code that is written for the language.@@@@1@23@@danf@17-8-2009
10160330@unknown@formal@none@1@S@This enables the same [[executable]] [[binary file|binary]] to run on all systems which support the Java software, through the [[Java Virtual Machine]].@@@@1@22@@danf@17-8-2009
10160340@unknown@formal@none@1@S@Java [[executable]]s do not run directly on the [[operating system]]; that is, neither [[Microsoft Windows|Windows]] nor [[Linux]] executes Java programs directly.@@@@1@21@@danf@17-8-2009
10160350@unknown@formal@none@1@S@Because of this, however, Java is limited in that it does not directly support system-specific functionality.@@@@1@16@@danf@17-8-2009
10160360@unknown@formal@none@1@S@[[Java Native Interface|JNI]] can be used to access system specific functions, but then the code is likely no longer portable.@@@@1@20@@danf@17-8-2009
10160370@unknown@formal@none@1@S@Java programs can run on at least the [[Microsoft Windows]], [[Mac OS X]], [[Linux]], and [[Solaris Operating System|Solaris]] operating systems, and so the language is limited to functionality that exists on all these systems.@@@@1@34@@danf@17-8-2009
10160380@unknown@formal@none@1@S@This includes things such as [[computer networking]] and [[Internet socket]]s, but not necessarily raw hardware [[input/output]].@@@@1@16@@danf@17-8-2009
10160390@unknown@formal@none@1@S@== Cross-platform software ==@@@@1@4@@danf@17-8-2009
10160400@unknown@formal@none@1@S@In order for software to be considered '''cross-platform''', it must be able to function on more than one [[computer architecture]] or [[operating system]].@@@@1@23@@danf@17-8-2009
10160410@unknown@formal@none@1@S@This can be a time-consuming task given that different [[operating system]]s have different [[application programming interface]]s (APIs); for example, [[Linux]] uses a different API for [[application software]] than [[Microsoft Windows|Windows]] does.@@@@1@31@@danf@17-8-2009
10160420@unknown@formal@none@1@S@The fact that a particular [[operating system]] may run on different [[computer architecture]]s does not mean that software written for that operating system will automatically work on all [[computer architecture|architecture]]s that the operating system supports.@@@@1@35@@danf@17-8-2009
10160430@unknown@formal@none@1@S@One example, as of August 2006, was [[OpenOffice.org]], which did not natively run on the [[AMD64]] or [[EM64T]] lines of processors implementing the [[x86-64]] [[64-bit]] standards for computers; this has since changed, and the OpenOffice.org suite of software is “mostly” ported to these 64-bit systems [http://wiki.services.openoffice.org/wiki/Porting_to_x86-64_(AMD64,_EM64T)].@@@@1@46@@danf@17-8-2009
10160440@unknown@formal@none@1@S@Similarly, the fact that a program is written in a popular programming language such as [[C (programming language)|C]] or [[C++]] does not mean it will run on all [[operating systems]] that support that [[programming language]].@@@@1@35@@danf@17-8-2009
10160450@unknown@formal@none@1@S@=== Web applications ===@@@@1@4@@danf@17-8-2009
10160460@unknown@formal@none@1@S@[[Web application]]s are typically described as cross-platform because, ideally, they are accessible from any of various [[web browser]]s within different operating systems.@@@@1@22@@danf@17-8-2009
10160470@unknown@formal@none@1@S@Such applications generally employ a [[client-server]] system architecture, and vary widely in complexity and functionality.@@@@1@15@@danf@17-8-2009
10160480@unknown@formal@none@1@S@This wide variability significantly complicates the goal of cross-platform capability, which is routinely at odds with the goal of advanced functionality.@@@@1@21@@danf@17-8-2009
10160490@unknown@formal@none@1@S@==== Basic applications ====@@@@1@4@@danf@17-8-2009
10160500@unknown@formal@none@1@S@Basic web applications perform all or most processing on a [[Stateless server|stateless]] [[web server]], and pass the result to the client web browser.@@@@1@23@@danf@17-8-2009
10160510@unknown@formal@none@1@S@All user interaction with the application consists of simple exchanges of data requests and server responses.@@@@1@16@@danf@17-8-2009
10160520@unknown@formal@none@1@S@These types of applications were the norm in the early phases of [[World Wide Web]] application development.@@@@1@17@@danf@17-8-2009
10160530@unknown@formal@none@1@S@Such applications follow a simple [[Transaction processing|transaction]] model, identical to that of serving [[static web page]]s.@@@@1@16@@danf@17-8-2009
10160540@unknown@formal@none@1@S@Today, they are still relatively common, especially where cross-platform compatibility and simplicity are deemed more critical than advanced functionality.@@@@1@19@@danf@17-8-2009
10160550@unknown@formal@none@1@S@==== Advanced applications ====@@@@1@4@@danf@17-8-2009
10160560@unknown@formal@none@1@S@Prominent examples of advanced web applications include the Web interface to [[Gmail]], [[A9.com]], and the maps.live.com section of [[Live Search]].@@@@1@20@@danf@17-8-2009
10160570@unknown@formal@none@1@S@Such advanced applications routinely depend on additional features found only in the more recent versions of popular web browsers.@@@@1@19@@danf@17-8-2009
10160580@unknown@formal@none@1@S@These dependencies include [[Ajax (programming)|Ajax]], [[JavaScript]], [[Dynamic HTML|“Dynamic” HTML]], [[SVG]], and other components of [[rich internet application]]s.@@@@1@17@@danf@17-8-2009
10160590@unknown@formal@none@1@S@Older versions of popular browsers tend to lack support for certain features.@@@@1@12@@danf@17-8-2009
10160600@unknown@formal@none@1@S@==== Design strategies ====@@@@1@4@@danf@17-8-2009
10160610@unknown@formal@none@1@S@Because of the competing interests of cross-platform compatibility and advanced functionality, numerous alternative web application design strategies have emerged.@@@@1@19@@danf@17-8-2009
10160620@unknown@formal@none@1@S@Such strategies include:@@@@1@3@@danf@17-8-2009
10160630@unknown@formal@none@1@S@=====Graceful degradation=====@@@@1@2@@danf@17-8-2009
10160640@unknown@formal@none@1@S@Graceful degradation attempts to provide the same or similar functionality to all users and platforms, while diminishing that functionality to a ‘least common denominator’ for more limited client browsers.@@@@1@29@@danf@17-8-2009
10160650@unknown@formal@none@1@S@For example, a user attempting to use a limited-feature browser to access Gmail may notice that Gmail switches to “Basic Mode,” with reduced functionality.@@@@1@24@@danf@17-8-2009
10160660@unknown@formal@none@1@S@Some view this strategy as a lesser form of cross-platform capability.@@@@1@11@@danf@17-8-2009
10160670@unknown@formal@none@1@S@=====Separation of functionality=====@@@@1@3@@danf@17-8-2009
10160680@unknown@formal@none@1@S@Separation of functionality attempts to simply omit those subsets of functionality that are not available from within certain client browsers or operating systems, while still delivering a ‘complete’ application to the user (see also [[Separation of concerns]]).@@@@1@37@@danf@17-8-2009
10160690@unknown@formal@none@1@S@=====Multiple codebase=====@@@@1@2@@danf@17-8-2009
10160700@unknown@formal@none@1@S@Multiple codebase applications present different versions of an application depending on the specific client in use.@@@@1@16@@danf@17-8-2009
10160710@unknown@formal@none@1@S@This strategy is arguably the most complicated and expensive way to fulfill cross-platform capability, since even different versions of the same client browser (within the same operating system) can differ dramatically from each other.@@@@1@34@@danf@17-8-2009
10160720@unknown@formal@none@1@S@This is further complicated by the support for “plugins” which may or may not be present for any given installation of a particular browser version.@@@@1@25@@danf@17-8-2009
10160730@unknown@formal@none@1@S@=====Third party libraries=====@@@@1@3@@danf@17-8-2009
10160740@unknown@formal@none@1@S@Third party libraries attempt to simplify cross-platform capability by ‘hiding’ the complexities of client differentiation behind a single, unified API.@@@@1@20@@danf@17-8-2009
10160750@unknown@formal@none@1@S@==== Testing strategies ====@@@@1@4@@danf@17-8-2009
10160760@unknown@formal@none@1@S@One complicated aspect of cross-platform web application design is the need for [[software testing]].@@@@1@14@@danf@17-8-2009
10160770@unknown@formal@none@1@S@In addition to the complications mentioned previously, there is the additional restriction that some browsers prohibit installation of different versions of the same browser on the same operating system.@@@@1@29@@danf@17-8-2009
10160780@unknown@formal@none@1@S@Techniques such as [[full virtualization]] are sometimes used as a workaround for this problem.@@@@1@14@@danf@17-8-2009
10160790@unknown@formal@none@1@S@=== Traditional applications ===@@@@1@4@@danf@17-8-2009
10160800@unknown@formal@none@1@S@Although web applications are becoming increasingly popular, many computer users still use traditional [[application software]] which does not rely on a client/web-server architecture.@@@@1@23@@danf@17-8-2009
10160810@unknown@formal@none@1@S@The distinction between “traditional” and “web” applications is not always unambiguous, however, because applications have many different features, installation methods and architectures; and some of these can overlap and occur in ways that blur the distinction.@@@@1@36@@danf@17-8-2009
10160820@unknown@formal@none@1@S@Nevertheless, this simplifying distinction is a common and useful generalization.@@@@1@10@@danf@17-8-2009
10160830@unknown@formal@none@1@S@==== Binary software ====@@@@1@4@@danf@17-8-2009
10160840@unknown@formal@none@1@S@Traditionally in modern computing, application software has been distributed to end-users as '''binary images''', which are stored in [[executable]]s, a specific type of [[binary file]].@@@@1@25@@danf@17-8-2009
10160850@unknown@formal@none@1@S@Such [[executable]]s only support the [[operating system]] and [[computer architecture]] that they were built for—which means that making a “cross-platform executable” would be something of a massive task, and is generally not done.@@@@1@33@@danf@17-8-2009
10160860@unknown@formal@none@1@S@For software that is distributed as a [[binary file|binary]] [[executable]], such as software written in [[C (programming language)|C]] or [[C++]], the programmer must [[software build|build the software]] for each different [[operating system]] and [[computer architecture]].@@@@1@35@@danf@17-8-2009
10160870@unknown@formal@none@1@S@For example, [[Mozilla]] [[Mozilla Firefox|Firefox]], an open-source web browser, is available on [[Microsoft Windows]], [[Mac OS X]] (both [[PowerPC]] and [[x86]] through something Apple calls a '''[[Universal binary]]'''), and [[Linux]] on multiple computer architectures.@@@@1@34@@danf@17-8-2009
10160880@unknown@formal@none@1@S@The three platforms (in this case, [[Microsoft Windows|Windows]], [[Mac OS X]], and [[Linux]]) are separate [[executable]] distributions, although they come from the same [[source code]].@@@@1@25@@danf@17-8-2009
10160890@unknown@formal@none@1@S@In the context of binary software, cross-platform programs are written as source code and then “translated” for each system that they run on by compiling them on the different platforms.@@@@1@30@@danf@17-8-2009
10160900@unknown@formal@none@1@S@Also, software can be [[porting|ported]] to a new [[computer architecture]] or [[operating system]] so that the program becomes more cross-platform than it already is.@@@@1@24@@danf@17-8-2009
10160910@unknown@formal@none@1@S@For example, a program such as Firefox, which already runs on Windows on the x86 family, can be modified and re-built to run on Linux on the x86 (and potentially other architectures) as well.@@@@1@34@@danf@17-8-2009
10160920@unknown@formal@none@1@S@As an alternative to porting, cross-platform virtualization allows applications compiled for one CPU and operating system to run on a system with a different CPU and/or operating system, without modification to the source code or binaries.@@@@1@36@@danf@17-8-2009
10160930@unknown@formal@none@1@S@As an example, [[Apple Computer|Apple's]] [[Rosetta (software)|Rosetta]] software, which is built into [[Intel]]-based Apple Macintosh computers, runs applications compiled for the previous generation of Macs that used [[PowerPC]] CPUs.@@@@1@29@@danf@17-8-2009
10160940@unknown@formal@none@1@S@Another example is IBM PowerVM Lx86, which allows Linux/x86 applications to run unmodified on the Linux/Power operating system.@@@@1@18@@danf@17-8-2009
10160950@unknown@formal@none@1@S@==== Scripts and [[interpreted language]]s ====@@@@1@6@@danf@17-8-2009
10160960@unknown@formal@none@1@S@A script can be considered to be cross-platform if the [[scripting language]] is available on multiple platforms and the script only uses the facilities provided by the language.@@@@1@28@@danf@17-8-2009
10160970@unknown@formal@none@1@S@That is, a script written in [[Python (programming language)|Python]] for a [[Unix-like]] system will likely run with little or no modification on [[Microsoft Windows|Windows]], because Python also runs on [[Microsoft Windows|Windows]]; there is also more than one implementation of Python that will run the same scripts (e.g., [[IronPython]] for [[.NET Framework|.NET]]).@@@@1@51@@danf@17-8-2009
10160980@unknown@formal@none@1@S@The same goes for many of the [[open source]] [[programming language]]s that are available and are [[scripting language]]s.@@@@1@18@@danf@17-8-2009
10160990@unknown@formal@none@1@S@Unlike [[binary file|binary]] [[executable]]s, the same script can be used on all computers that have software to interpret the script.@@@@1@20@@danf@17-8-2009
10161000@unknown@formal@none@1@S@This is because the script is generally stored in [[plain text]] in a [[text file]].@@@@1@15@@danf@17-8-2009
10161010@unknown@formal@none@1@S@There may be some issues, however, such as the type of [[newline|new line character]] that sits between the lines.@@@@1@19@@danf@17-8-2009
10161020@unknown@formal@none@1@S@Generally, however, little or no work has to be done to make a script written for one system run on another.@@@@1@21@@danf@17-8-2009
10161030@unknown@formal@none@1@S@Some quite popular cross-platform scripting or [[interpreted language]]s are:@@@@1@9@@danf@17-8-2009
10161040@unknown@formal@none@1@S@* [[bash]]—A [[Unix shell]] commonly run on [[Linux]] and other modern [[Unix-like]] systems, as well as on [[Microsoft Windows|Windows]] via the [[Cygwin]] [[POSIX]] compatibility layer.@@@@1@25@@danf@17-8-2009
10161050@unknown@formal@none@1@S@* [[Python (programming language)|Python]]—A modern [[scripting language]] where the focus is on [[rapid application development]] and ease-of-writing, instead of program run-time efficiency.@@@@1@22@@danf@17-8-2009
10161060@unknown@formal@none@1@S@* [[Perl]]—A scripting language first released in 1987.@@@@1@8@@danf@17-8-2009
10161070@unknown@formal@none@1@S@Used for [[Common Gateway Interface|CGI]] [[WWW]] programming, small [[system administration]] tasks, and more.@@@@1@13@@danf@17-8-2009
10161080@unknown@formal@none@1@S@* [[PHP]]—A [[scripting language]] most popular in use on the [[WWW]] for [[web application]]s.@@@@1@14@@danf@17-8-2009
10161090@unknown@formal@none@1@S@* [[Ruby (programming language)|Ruby]]—A scripting language whose purpose is to be object-oriented and easy to read.@@@@1@16@@danf@17-8-2009
10161100@unknown@formal@none@1@S@Can also be used on the web through [[Ruby on Rails]].@@@@1@11@@danf@17-8-2009
10161110@unknown@formal@none@1@S@* [[Tcl]] - A dynamic programming language, suitable for a wide range of uses, including web and desktop applications, networking, administration, testing and many more.@@@@1@25@@danf@17-8-2009
10161120@unknown@formal@none@1@S@==== Video games ====@@@@1@4@@danf@17-8-2009
10161130@unknown@formal@none@1@S@Cross-platform is a term that can also apply to [[video game]]s.@@@@1@11@@danf@17-8-2009
10161140@unknown@formal@none@1@S@Such games are released on a range of [[video game console]]s and [[handheld game console]]s, which are specialized [[computer]]s dedicated to the task of playing games (and thus are a platform like any other computer).@@@@1@35@@danf@17-8-2009
10161150@unknown@formal@none@1@S@Examples of these games include:@@@@1@5@@danf@17-8-2009
10161160@unknown@formal@none@1@S@* [[Miner 2049er]], the first major multiplatform game@@@@1@8@@danf@17-8-2009
10161170@unknown@formal@none@1@S@* [[Phantasy Star Online]]@@@@1@4@@danf@17-8-2009
10161180@unknown@formal@none@1@S@* [[Lara Croft Tomb Raider: Legend]]@@@@1@6@@danf@17-8-2009
10161190@unknown@formal@none@1@S@* [[FIFA Series]]@@@@1@3@@danf@17-8-2009
10161200@unknown@formal@none@1@S@* [[Shadow of Legend]]@@@@1@4@@danf@17-8-2009
10161210@unknown@formal@none@1@S@… which are spread across a variety of platforms, such as the [[Nintendo GameCube]], [[PlayStation 2]], [[Xbox]], [[Personal computer|PC]], and [[mobile devices]].@@@@1@22@@danf@17-8-2009
10161220@unknown@formal@none@1@S@In some cases, depending on the hardware of a particular system, it may take longer than expected to create a video game across multiple platforms.@@@@1@25@@danf@17-8-2009
10161230@unknown@formal@none@1@S@So, a video game may only get released on a few platforms and then later released on the remaining platforms.@@@@1@20@@danf@17-8-2009
10161240@unknown@formal@none@1@S@Typically, this is what occurs when a new system is released, because the [[Video game developer|developer]]s of the video game need to become acquainted with the hardware and software associated with the new console.@@@@1@34@@danf@17-8-2009
10161250@unknown@formal@none@1@S@Some games may not become cross-platform because of licensing agreements between the [[Video game developer|developer]]s and the maker of the [[video game console]] which state that the game will only be made for one particular console.@@@@1@36@@danf@17-8-2009
10161260@unknown@formal@none@1@S@As an example, [[Disney]] could create a new game and wish to release it on the latest [[Nintendo]] and [[Sony]] game consoles.@@@@1@22@@danf@17-8-2009
10161270@unknown@formal@none@1@S@If [[Disney]] licenses the game with [[Sony]] first, [[Disney]] may be required to release the game only on [[Sony|Sony’s]] console for a short time, or indefinitely—effectively preventing the game from being cross-platform, at least for a period of time.@@@@1@39@@danf@17-8-2009
10161280@unknown@formal@none@1@S@Several developers have developed ways to play games online while using different platforms.@@@@1@13@@danf@17-8-2009
10161290@unknown@formal@none@1@S@Epic Games, Microsoft and Valve Software all have this technology, which allows Xbox 360 gamers and PS3 gamers to play with PC gamers, allowing gamers to finally decide which platform is the best for a game.@@@@1@36@@danf@17-8-2009
10161300@unknown@formal@none@1@S@The first game released to allow this interactivity between PC and console games was [[Quake 3]].@@@@1@16@@danf@17-8-2009
10161310@unknown@formal@none@1@S@Games that feature cross-platform online play include:@@@@1@7@@danf@17-8-2009
10161320@unknown@formal@none@1@S@*[[Champions Online]]@@@@1@2@@danf@17-8-2009
10161330@unknown@formal@none@1@S@*[[Lost Planet: Colonies]]@@@@1@3@@danf@17-8-2009
10161340@unknown@formal@none@1@S@*[[Phantasy Star Online]]@@@@1@3@@danf@17-8-2009
10161350@unknown@formal@none@1@S@*[[Shadowrun (2007 video game)|Shadowrun]]@@@@1@4@@danf@17-8-2009
10161360@unknown@formal@none@1@S@*[[UNO (Xbox Live Arcade)|UNO]]@@@@1@4@@danf@17-8-2009
10161370@unknown@formal@none@1@S@*[[Final Fantasy XI Online]]@@@@1@4@@danf@17-8-2009
10161380@unknown@formal@none@1@S@== Platform independent software ==@@@@1@5@@danf@17-8-2009
10161390@unknown@formal@none@1@S@Software that is platform independent does not rely on any special features of any single platform, or, if it does, handles those special features such that it can deal with multiple platforms.@@@@1@32@@danf@17-8-2009
10161400@unknown@formal@none@1@S@All [[algorithm]]s, such as the [[quicksort]] algorithm, can be implemented on different platforms.@@@@1@13@@danf@17-8-2009
10161410@unknown@formal@none@1@S@== Cross-platform programming ==@@@@1@4@@danf@17-8-2009
10161420@unknown@formal@none@1@S@Cross-platform programming is the practice of actively writing software that will work on more than one platform.@@@@1@17@@danf@17-8-2009
10161430@unknown@formal@none@1@S@=== Approaches to cross-platform programming ===@@@@1@6@@danf@17-8-2009
10161440@unknown@formal@none@1@S@There are different ways of approaching the problem of writing a cross-platform application program.@@@@1@14@@danf@17-8-2009
10161450@unknown@formal@none@1@S@One such approach is simply to create multiple versions of the same program in different ''source trees''—in other words, the [[Microsoft Windows|Windows]] version of a program might have one set of source code files and the [[Apple Macintosh|Macintosh]] version might have another, while a FOSS *nix system might have another.@@@@1@50@@danf@17-8-2009
10161460@unknown@formal@none@1@S@While this is a straightforward approach to the problem, it has the potential to be considerably more expensive in development cost, development time, or both, especially for corporate entities.@@@@1@29@@danf@17-8-2009
10161470@unknown@formal@none@1@S@The idea behind this is to create two or more different programs that have the ability to behave similarly to each other.@@@@1@22@@danf@17-8-2009
10161480@unknown@formal@none@1@S@It is also possible that this means of developing a cross-platform application will result in more problems with bug tracking and fixing, because the different ''source trees'' would have different programmers, and thus different defects in each version.@@@@1@38@@danf@17-8-2009
10161490@unknown@formal@none@1@S@The smaller the programming team, the quicker the bug fixes tend to be.@@@@1@13@@danf@17-8-2009
10161500@unknown@formal@none@1@S@Another approach that is used is to depend on pre-existing software that hides the differences between the platforms—called [[abstraction]] of the platform—such that the program itself is unaware of the platform it is running on.@@@@1@35@@danf@17-8-2009
10161510@unknown@formal@none@1@S@It could be said that such programs are ''platform agnostic''.@@@@1@10@@danf@17-8-2009
10161520@unknown@formal@none@1@S@Programs that run on the [[Java (Sun)|Java]] [[Virtual Machine]] ([[Java Virtual Machine|JVM]]) are built in this fashion.@@@@1@17@@danf@17-8-2009
10161530@unknown@formal@none@1@S@Some applications mix various methods of cross-platform programming to create the final application.@@@@1@13@@danf@17-8-2009
10161540@unknown@formal@none@1@S@An example of this is the [[Firefox]] [[web browser]], which uses [[abstraction]] to build some of the lower-level components, separate source subtrees for implementing platform-specific features (like the GUI), and the implementation of more than one [[scripting language]] to help facilitate ease of portability.@@@@1@44@@danf@17-8-2009
10161550@unknown@formal@none@1@S@[[Firefox]] implements [[XUL]], [[Cascading Style Sheets|CSS]] and [[JavaScript]] for extending the browser, in addition to classic [[Netscape]]-style browser plugins.@@@@1@19@@danf@17-8-2009
10161560@unknown@formal@none@1@S@Much of the browser itself is written in XUL, CSS, and JavaScript, as well.@@@@1@14@@danf@17-8-2009
10161570@unknown@formal@none@1@S@=== Cross-platform programming toolkits ===@@@@1@5@@danf@17-8-2009
10161580@unknown@formal@none@1@S@There are a number of tools which are available to help facilitate the process of cross-platform programming:@@@@1@17@@danf@17-8-2009
10161590@unknown@formal@none@1@S@* [[Simple DirectMedia Layer]]—An [[open source]] cross-platform multimedia library written in C that creates an abstraction over various platforms’ graphics, sound, and input [[Application programming interface|API]]s.@@@@1@26@@danf@17-8-2009
10161600@unknown@formal@none@1@S@It runs on many operating systems including Linux, Windows and Mac OS X and is aimed at games and multimedia applications.@@@@1@21@@danf@17-8-2009
10161610@unknown@formal@none@1@S@* [[Cairo (graphics)|Cairo]]—A [[free software]] library used to provide a vector graphics-based, device-independent API.@@@@1@14@@danf@17-8-2009
10161620@unknown@formal@none@1@S@It is designed to provide primitives for 2-dimensional drawing across a number of different backends.@@@@1@15@@danf@17-8-2009
10161630@unknown@formal@none@1@S@Cairo is written in C and has bindings for many programming languages.@@@@1@12@@danf@17-8-2009
10161640@unknown@formal@none@1@S@* ''ParaGUI''—ParaGUI is a cross-platform high-level application framework and GUI library.@@@@1@11@@danf@17-8-2009
10161650@unknown@formal@none@1@S@It can be compiled on various platforms (Linux, Win32, BeOS, Mac OS, ...).@@@@1@13@@danf@17-8-2009
10161660@unknown@formal@none@1@S@ParaGUI is based on the Simple DirectMedia Layer (SDL).@@@@1@9@@danf@17-8-2009
10161670@unknown@formal@none@1@S@ParaGUI is targeted at cross-platform multimedia applications and embedded devices operating on framebuffer displays.@@@@1@14@@danf@17-8-2009
10161680@unknown@formal@none@1@S@* [[wxWidgets]]—An open source widget toolkit that is also an [[application framework]].@@@@1@12@@danf@17-8-2009
10161690@unknown@formal@none@1@S@It runs on [[Unix-like]] systems with [[X11]], Microsoft Windows and Mac OS X. It permits applications written to use it to run on all of the systems that it supports, if the application does not use any [[operating system]]-specific programming in addition to it.@@@@1@44@@danf@17-8-2009
10161700@unknown@formal@none@1@S@* [[Qt (toolkit)|Qt]]—An application framework and [[widget toolkit]] for [[Unix-like]] systems with [[X11]], Microsoft Windows, Mac OS X, and other systems—available under both [[open source]] and commercial licenses.@@@@1@28@@danf@17-8-2009
10161710@unknown@formal@none@1@S@* [[GTK+]]—An open source widget toolkit for Unix-like systems with X11 and Microsoft Windows.@@@@1@14@@danf@17-8-2009
10161720@unknown@formal@none@1@S@* [[FLTK]]—Another open source cross-platform toolkit, but more lightweight because it restricts itself to the GUI.@@@@1@16@@danf@17-8-2009
10161730@unknown@formal@none@1@S@* [[Mozilla application framework|Mozilla]]—An open source platform for building Mac, Windows and Linux applications.@@@@1@14@@danf@17-8-2009
10161740@unknown@formal@none@1@S@* [[Mono (software)|Mono]] (and more generally, [[Microsoft .NET]])—A cross-platform framework for applications and programming languages.@@@@1@15@@danf@17-8-2009
10161750@unknown@formal@none@1@S@* ''molib''—A robust commercial application toolkit library that abstracts the system calls through C++ objects (such as the file system, database system and thread implementation).@@@@1@25@@danf@17-8-2009
10161760@unknown@formal@none@1@S@This allows for the creation of applications that compile and run under Microsoft Windows, Mac OS X, GNU/Linux, and other systems (Sun OS, AIX, HP-UX, 32/64 bit, SMP).@@@@1@28@@danf@17-8-2009
10161770@unknown@formal@none@1@S@Use in concert with ''the sandbox'' to create GUI-based applications.@@@@1@10@@danf@17-8-2009
10161780@unknown@formal@none@1@S@* [[fpGUI]] - An open source widget toolkit that is completely implemented in Object Pascal.@@@@1@15@@danf@17-8-2009
10161790@unknown@formal@none@1@S@It currently supports Linux, Windows and, partially, Windows CE.@@@@1@9@@danf@17-8-2009
10161795@unknown@formal@none@1@S@fpGUI does not rely on any large libraries; instead, it talks directly to Xlib (Linux) or GDI (Windows).@@@@1@18@@danf@17-8-2009
10161800@unknown@formal@none@1@S@The framework is compiled with the Free Pascal compiler.@@@@1@9@@danf@17-8-2009
10161810@unknown@formal@none@1@S@Mac OS support is also in the works.@@@@1@8@@danf@17-8-2009
10161820@unknown@formal@none@1@S@* [[Tcl/Tk]] - Tcl (Tool Command Language) is a dynamic programming language, suitable for a wide range of uses, including web and desktop applications, networking, administration, testing and many more.@@@@1@30@@danf@17-8-2009
10161830@unknown@formal@none@1@S@Open source and business-friendly, Tcl is a mature yet evolving language that is cross-platform, easily deployed and highly extensible.@@@@1@19@@danf@17-8-2009
10161840@unknown@formal@none@1@S@Tk is a graphical user interface toolkit that takes developing desktop applications to a higher level than conventional approaches.@@@@1@19@@danf@17-8-2009
10161850@unknown@formal@none@1@S@Tk is the standard GUI not only for Tcl, but for many other dynamic languages, and can produce rich, native applications that run unchanged across Windows, Mac OS X, Linux and more.@@@@1@32@@danf@17-8-2009
10161860@unknown@formal@none@1@S@The combination of Tcl and the Tk GUI toolkit is referred to as Tcl/Tk.@@@@1@14@@danf@17-8-2009
10161870@unknown@formal@none@1@S@* [[XVT]] is a cross-platform toolkit for creating enterprise and desktop applications in C/C++ on Windows, Linux and Unix (Solaris, HPUX, AIX), and Mac.@@@@1@24@@danf@17-8-2009
10161880@unknown@formal@none@1@S@The most recent release is 5.8, from April 2007.@@@@1@9@@danf@17-8-2009
10161890@unknown@formal@none@1@S@=== Cross-platform development environments ===@@@@1@5@@danf@17-8-2009
10161900@unknown@formal@none@1@S@Cross-platform applications can also be built using proprietary [[Integrated development environment|IDE]]s, or so-called [[Rapid Application Development]] tools.@@@@1@17@@danf@17-8-2009
10161910@unknown@formal@none@1@S@There are a number of development environments which allow developers to build and deploy applications across multiple platforms:@@@@1@18@@danf@17-8-2009
10161920@unknown@formal@none@1@S@* [[Eclipse (software)|Eclipse]]—An open source [[software framework]] and [[Integrated development environment|IDE]] extendable through plug-ins including the C++ Development Toolkit.@@@@1@19@@danf@17-8-2009
10161930@unknown@formal@none@1@S@Eclipse is available on any operating system with a modern Java virtual machine (including Windows, Linux, Mac OS X, Sun, HP-UX, and other systems).@@@@1@24@@danf@17-8-2009
10161940@unknown@formal@none@1@S@* [[IntelliJ IDEA]]—A proprietary [[Integrated development environment|IDE]]@@@@1@7@@danf@17-8-2009
10161950@unknown@formal@none@1@S@* [[NetBeans]]—An open source [[software framework]] and [[Integrated development environment|IDE]] extendable through plug-ins.@@@@1@13@@danf@17-8-2009
10161960@unknown@formal@none@1@S@NetBeans is available on any operating system with a modern Java virtual machine (including Windows, Linux, Mac OS X, Sun, HP-UX, and other systems).@@@@1@24@@danf@17-8-2009
10161970@unknown@formal@none@1@S@Similar to Eclipse in features and functionality.@@@@1@7@@danf@17-8-2009
10161980@unknown@formal@none@1@S@Promoted by [[Sun Microsystems]].@@@@1@4@@danf@17-8-2009
10161990@unknown@formal@none@1@S@* [[Omnis Studio]]—A proprietary [[Integrated development environment|IDE]] or Rapid Application Development tool for creating enterprise and web applications for Windows, Linux, and Mac OS X.@@@@1@25@@danf@17-8-2009
10162000@unknown@formal@none@1@S@* [[Runtime Revolution]]—a proprietary [[Integrated development environment|IDE]], compiler engine and CGI builder that [[cross compile]]s to [[Microsoft Windows|Windows]], [[Mac OS X]] ([[PowerPC|PPC]], [[Intel]]), [[Linux]], [[Solaris Operating System|Solaris]], [[BSD]], and [[Irix]].@@@@1@30@@danf@17-8-2009
10162010@unknown@formal@none@1@S@*[[Code::Blocks]]—A free/open source, cross-platform IDE.@@@@1@5@@danf@17-8-2009
10162020@unknown@formal@none@1@S@It is developed in C++ using wxWidgets.@@@@1@7@@danf@17-8-2009
10162030@unknown@formal@none@1@S@Using a plugin architecture, its capabilities and features are defined by the provided plugins.@@@@1@14@@danf@17-8-2009
10162040@unknown@formal@none@1@S@*[[Lazarus (software)]]—Lazarus is a cross-platform visual IDE developed for and supported by the open source Free Pascal compiler.@@@@1@18@@danf@17-8-2009
10162050@unknown@formal@none@1@S@It aims to provide a Rapid Application Development Delphi Clone for Pascal and Object Pascal developers.@@@@1@16@@danf@17-8-2009
10162060@unknown@formal@none@1@S@*[[REALbasic]]—REALbasic (RB) is an object-oriented dialect of the BASIC programming language developed and commercially marketed by REAL Software, Inc. of Austin, Texas, for Mac OS X, Microsoft Windows, and Linux.@@@@1@30@@danf@17-8-2009
10162070@unknown@formal@none@1@S@== Criticisms of cross-platform development ==@@@@1@6@@danf@17-8-2009
10162080@unknown@formal@none@1@S@There are certain issues associated with cross-platform development.@@@@1@8@@danf@17-8-2009
10162090@unknown@formal@none@1@S@Some of these include:@@@@1@4@@danf@17-8-2009
10162100@unknown@formal@none@1@S@* Testing cross-platform applications may be considerably more complicated, since different platforms can exhibit slightly different behaviors or subtle bugs.@@@@1@20@@danf@17-8-2009
10162110@unknown@formal@none@1@S@This problem has led some developers to deride cross-platform development as “Write Once, Debug Everywhere”, a take on Sun’s [[Write once, run anywhere|“Write Once, Run Anywhere”]] marketing slogan.@@@@1@28@@danf@17-8-2009
10162120@unknown@formal@none@1@S@* Developers are often restricted to using the [[lowest common denominator]] subset of features which are available on all platforms.@@@@1@20@@danf@17-8-2009
10162130@unknown@formal@none@1@S@This may hinder the application's performance or prohibit developers from using platforms’ most advanced features.@@@@1@15@@danf@17-8-2009
10162140@unknown@formal@none@1@S@* Different platforms often have different user interface conventions, which cross-platform applications do not always accommodate.@@@@1@16@@danf@17-8-2009
10162150@unknown@formal@none@1@S@For example, applications developed for Mac OS X and [[GNOME]] are supposed to place the most important button on the right-hand side of windows and dialogs, whereas Microsoft Windows and [[KDE]] have the opposite convention.@@@@1@35@@danf@17-8-2009
10162160@unknown@formal@none@1@S@Though many of these differences are subtle, a cross-platform application which does not conform appropriately to these conventions may feel clunky or alien to the user.@@@@1@26@@danf@17-8-2009
10162170@unknown@formal@none@1@S@When working quickly, such opposing conventions may even result in [[data loss]], such as in a [[dialog box]] confirming whether the user wants to save or discard changes to a file.@@@@1@31@@danf@17-8-2009
10162180@unknown@formal@none@1@S@* Code written in scripting languages or for virtual machines must be translated into native executable code each time the application is executed, imposing a performance penalty.@@@@1@27@@danf@17-8-2009
10162190@unknown@formal@none@1@S@This performance hit can be alleviated using advanced techniques like [[just-in-time compilation]]; but even using such techniques, some performance overhead may be unavoidable.@@@@1@23@@danf@17-8-2009
10170010@unknown@formal@none@1@S@Data@@@@1@1@@danf@17-8-2009
10170020@unknown@formal@none@1@S@'''Data''' (singular: '''datum''') are collections of descriptors of natural phenomena, including the results of [[experience]], [[observation]] or [[experiment]], or a set of [[premise]]s.@@@@1@23@@danf@17-8-2009
10170030@unknown@formal@none@1@S@These may consist of [[number]]s, [[word]]s, or [[image]]s, particularly as [[measurement]]s or observations of a set of [[variable]]s.@@@@1@18@@danf@17-8-2009
10170040@unknown@formal@none@1@S@==Etymology==@@@@1@1@@danf@17-8-2009
10170050@unknown@formal@none@1@S@The word ''data'' is the plural of [[Latin]] ''[[datum]]'', [[Grammatical gender|neuter]] past [[participle]] of ''dare'', "to give", hence "something given".@@@@1@20@@danf@17-8-2009
10170060@unknown@formal@none@1@S@The [[past participle]] of "to give" has been used for millennia, in the sense of a statement accepted at face value; one of the works of [[Euclid]], circa 300 BC, was the ''Dedomena'' (in Latin, ''Data'').@@@@1@36@@danf@17-8-2009
10170070@unknown@formal@none@1@S@In discussions of problems in [[geometry]], [[mathematics]], [[engineering]], and so on, the terms ''givens'' and ''data'' are used interchangeably.@@@@1@19@@danf@17-8-2009
10170080@unknown@formal@none@1@S@Such usage is the origin of ''data'' as a concept in [[computer science]]: ''data'' are numbers, words, images, etc., accepted as they stand.@@@@1@23@@danf@17-8-2009
10170090@unknown@formal@none@1@S@It is pronounced dey-tuh, dat-uh, or dah-tuh.@@@@1@7@@danf@17-8-2009
10170100@unknown@formal@none@1@S@[[Experimental data]] are data generated within the context of a scientific investigation.@@@@1@12@@danf@17-8-2009
10170110@unknown@formal@none@1@S@Mathematically, data can be grouped in many ways.@@@@1@8@@danf@17-8-2009
10170120@unknown@formal@none@1@S@==Usage in English==@@@@1@3@@danf@17-8-2009
10170130@unknown@formal@none@1@S@In [[English language|English]], the word ''datum'' is still used in the general sense of "something given", and more specifically in [[cartography]], [[geography]], [[geology]], [[NMR]] and [[technical drawing|drafting]] to mean a reference point, reference line, or reference surface.@@@@1@37@@danf@17-8-2009
10170140@unknown@formal@none@1@S@More generally speaking, any measurement or result can be called a (single) ''datum'', but ''data point'' is more common.@@@@1@19@@danf@17-8-2009
10170150@unknown@formal@none@1@S@Both ''datums'' (see usage in [[datum]] article) and the originally Latin plural ''data'' are used as the plural of ''datum'' in English, but ''data'' is more commonly treated as a [[mass noun]] and used in the [[Grammatical number|singular]], especially in day-to-day usage.@@@@1@42@@danf@17-8-2009
10170160@unknown@formal@none@1@S@For example, "This is all the data from the experiment".@@@@1@10@@danf@17-8-2009
10170170@unknown@formal@none@1@S@This usage is inconsistent with the rules of Latin grammar and traditional English, which would instead suggest "These are all the data from the experiment".@@@@1@25@@danf@17-8-2009
10170180@unknown@formal@none@1@S@Some British and UN academic, scientific, and professional [[style guides]] (e.g., see page 43 of the [http://whqlibdoc.who.int/hq/2004/WHO_IMD_PUB_04.1.pdf World Health Organization Style Guide]) request that authors treat ''data'' as a plural noun.@@@@1@31@@danf@17-8-2009
10170190@unknown@formal@none@1@S@Other international organizations, such as the IEEE Computer Society, allow its usage as either a mass noun or plural based on author preference.@@@@1@23@@danf@17-8-2009
10170200@unknown@formal@none@1@S@It is now usually treated as a singular mass noun in informal usage, but usage in scientific publications shows a strong UK/U.S. divide.@@@@1@23@@danf@17-8-2009
10170210@unknown@formal@none@1@S@U.S. usage tends to treat ''data'' in the singular, including in serious and academic publishing, although some major newspapers (such as the [[New York Times]]) regularly use it in the plural.@@@@1@31@@danf@17-8-2009
10170220@unknown@formal@none@1@S@"The plural usage is still common, as this headline from the New York Times attests: “Data Are Elusive on the Homeless.”@@@@1@21@@danf@17-8-2009
10170230@unknown@formal@none@1@S@Sometimes scientists think of data as plural, as in ''These data do not support the conclusions.''@@@@1@16@@danf@17-8-2009
10170240@unknown@formal@none@1@S@But more often scientists and researchers think of data as a singular mass entity like information, and most people now follow this in general usage.@@@@1@25@@danf@17-8-2009
10170245@unknown@formal@none@1@S@"[http://www.bartleby.com/61/51/D0035100.html] UK usage now widely accepts treating ''data'' as singular in standard English, including everyday newspaper usage at least in non-scientific use.@@@@1@22@@danf@17-8-2009
10170250@unknown@formal@none@1@S@UK scientific publishing usually still prefers treating it as a plural.@@@@1@11@@danf@17-8-2009
10170260@unknown@formal@none@1@S@Some UK university style guides recommend using ''data'' for both singular and plural use and some recommend treating it only as a singular in connection with computers.@@@@1@27@@danf@17-8-2009
10170270@unknown@formal@none@1@S@==Uses of ''data'' in science and computing==@@@@1@7@@danf@17-8-2009
10170280@unknown@formal@none@1@S@''Raw data'' are [[number]]s, [[character (computing)|characters]], [[image]]s or other outputs from devices that convert physical quantities into symbols, in a very broad sense.@@@@1@23@@danf@17-8-2009
10170290@unknown@formal@none@1@S@Such data are typically further [[data processing|processed]] by a human or [[input]] into a [[computer]], [[Computer storage|stored]] and processed there, or transmitted ([[output]]) to another human or computer.@@@@1@28@@danf@17-8-2009
10170300@unknown@formal@none@1@S@''Raw data'' is a relative term; data processing commonly occurs by stages, and the "processed data" from one stage may be considered the "raw data" of the next.@@@@1@28@@danf@17-8-2009
10170310@unknown@formal@none@1@S@Mechanical computing devices are classified according to the means by which they represent data.@@@@1@14@@danf@17-8-2009
10170320@unknown@formal@none@1@S@An [[analog computer]] represents a datum as a voltage, distance, position, or other physical quantity.@@@@1@15@@danf@17-8-2009
10170330@unknown@formal@none@1@S@A [[digital computer]] represents a datum as a sequence of symbols drawn from a fixed [[alphabet]].@@@@1@16@@danf@17-8-2009
10170340@unknown@formal@none@1@S@The most common digital computers use a binary alphabet, that is, an alphabet of two characters, typically denoted "0" and "1".@@@@1@21@@danf@17-8-2009
10170350@unknown@formal@none@1@S@More familiar representations, such as numbers or letters, are then constructed from the binary alphabet.@@@@1@15@@danf@17-8-2009
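As a minimal sketch of how familiar representations can be built from a binary alphabet, the Python fragment below writes an integer and a short piece of text as strings of "0" and "1". The function names and the choice of 8-bit groups with UTF-8 encoding are assumptions made for this illustration, not part of the article.

```python
def int_to_bits(n: int, width: int = 8) -> str:
    """Write a non-negative integer using the two symbols '0' and '1',
    zero-padded on the left to at least `width` symbols."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    return format(n, f"0{width}b")


def text_to_bits(text: str) -> list[str]:
    """Encode text as UTF-8 bytes (an assumed encoding) and write each byte as 8 bits."""
    return [int_to_bits(byte) for byte in text.encode("utf-8")]


def bits_to_text(bits: list[str]) -> str:
    """Reverse text_to_bits: parse each group of bits as a byte, then decode."""
    return bytes(int(group, 2) for group in bits).decode("utf-8")


if __name__ == "__main__":
    print(int_to_bits(5))
    print(text_to_bits("data"))
    print(bits_to_text(text_to_bits("data")))
```

The round trip through `text_to_bits` and `bits_to_text` illustrates that the binary form carries the same datum as the original text; only the alphabet differs.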
10170360@unknown@formal@none@1@S@Some special forms of data are distinguished.@@@@1@7@@danf@17-8-2009
10170370@unknown@formal@none@1@S@A [[computer program]] is a collection of data, which can be interpreted as instructions.@@@@1@14@@danf@17-8-2009
10170380@unknown@formal@none@1@S@Most computer languages make a distinction between programs and the other data on which programs operate, but in some languages, notably [[Lisp programming language|Lisp]] and similar languages, programs are essentially indistinguishable from other data.@@@@1@34@@danf@17-8-2009
10170390@unknown@formal@none@1@S@It is also useful to distinguish [[metadata]], that is, a description of other data.@@@@1@14@@danf@17-8-2009
10170400@unknown@formal@none@1@S@A similar yet earlier term for metadata is "ancillary data."@@@@1@10@@danf@17-8-2009
10170410@unknown@formal@none@1@S@The prototypical example of metadata is the library catalog, which is a description of the contents of books.@@@@1@18@@danf@17-8-2009
10170420@unknown@formal@none@1@S@==Meaning of data, information and knowledge==@@@@1@6@@danf@17-8-2009
10170430@unknown@formal@none@1@S@The terms [[information]] and [[knowledge]] are frequently used for overlapping concepts.@@@@1@11@@danf@17-8-2009
10170440@unknown@formal@none@1@S@The main difference is in the level of [[abstraction]] being considered.@@@@1@11@@danf@17-8-2009
10170450@unknown@formal@none@1@S@Data is the lowest level of abstraction, information is the next level, and finally, knowledge is the highest level among all three.@@@@1@22@@danf@17-8-2009
10170460@unknown@formal@none@1@S@For example, the height of Mt. Everest is generally considered as "data", a book on Mt. Everest geological characteristics may be considered as "information", and a report containing practical information on the best way to reach Mt. Everest's peak may be considered as "knowledge".@@@@1@44@@danf@17-8-2009
10170470@unknown@formal@none@1@S@Information as a concept bears a diversity of meanings, from everyday usage to technical settings.@@@@1@15@@danf@17-8-2009
10170480@unknown@formal@none@1@S@Generally speaking, the concept of information is closely related to notions of constraint, communication, control, data, form, instruction, knowledge, meaning, mental stimulus, pattern, perception, and representation.@@@@1@26@@danf@17-8-2009
10170490@unknown@formal@none@1@S@Beynon-Davies uses the concept of a [[sign]] to distinguish between [[data]] and [[information]].@@@@1@13@@danf@17-8-2009
10170500@unknown@formal@none@1@S@Data are symbols.@@@@1@3@@danf@17-8-2009
10170510@unknown@formal@none@1@S@Information occurs when symbols are used to refer to something.@@@@1@10@@danf@17-8-2009
10180010@unknown@formal@none@1@S@Data analysis@@@@1@2@@danf@17-8-2009
10180020@unknown@formal@none@1@S@'''Data analysis''' is the process of looking at and summarizing '''[[data]]''' with the intent to extract useful [[information]] and develop conclusions.@@@@1@21@@danf@17-8-2009
10180030@unknown@formal@none@1@S@Data analysis is closely related to [[data mining]], but data mining tends to focus on larger data sets, with less emphasis on making [[inference]], and often uses data that was originally collected for a different purpose.@@@@1@36@@danf@17-8-2009
10180040@unknown@formal@none@1@S@In [[statistics|statistical applications]], some people divide data analysis into [[descriptive statistics]], [[exploratory data analysis]] (EDA) and [[confirmatory data analysis]] (CDA), where EDA focuses on discovering new features in the data, and CDA on confirming or falsifying existing hypotheses.@@@@1@38@@danf@17-8-2009
10180050@unknown@formal@none@1@S@Data analysis assumes different aspects, and possibly different names, in different fields.@@@@1@12@@danf@17-8-2009
10180060@unknown@formal@none@1@S@The term ''data analysis'' is also used as a synonym for [[data modeling]], which is unrelated to the subject of this article.@@@@1@22@@danf@17-8-2009
10180070@unknown@formal@none@1@S@==Nuclear and particle physics==@@@@1@4@@danf@17-8-2009
10180080@unknown@formal@none@1@S@In [[nuclear physics|nuclear]] and [[particle physics]] the data usually originate from the [[particle detector|experimental apparatus]] via a [[data acquisition]] system.@@@@1@20@@danf@17-8-2009
10180090@unknown@formal@none@1@S@They are then processed, in a step usually called ''data reduction'', to apply calibrations and to extract physically significant information.@@@@1@20@@danf@17-8-2009
10180100@unknown@formal@none@1@S@Data reduction is most often, especially in large particle physics experiments, an automatic, batch-mode operation carried out by software written ad-hoc.@@@@1@21@@danf@17-8-2009
10180110@unknown@formal@none@1@S@The resulting data ''n-tuples'' are then scrutinized by the physicists, using specialized software tools like [[ROOT]] or [[Physics Analysis Workstation|PAW]], comparing the results of the experiment with theory.@@@@1@28@@danf@17-8-2009
10180120@unknown@formal@none@1@S@The theoretical models are often difficult to compare directly with the results of the experiments, so they are used instead as input for [[Monte Carlo method|Monte Carlo simulation]] software like [[Geant4]] that predicts the response of the detector to a given theoretical event, producing '''simulated events''' which are then compared to experimental data.@@@@1@53@@danf@17-8-2009
10180130@unknown@formal@none@1@S@See also: [[Computational physics]].@@@@1@4@@danf@17-8-2009
10180140@unknown@formal@none@1@S@==Social sciences==@@@@1@2@@danf@17-8-2009
10180150@unknown@formal@none@1@S@[[Qualitative data analysis]] (QDA) or [[qualitative research]] is the analysis of non-numerical data, for example words, photographs, observations, etc.@@@@1@19@@danf@17-8-2009
10180160@unknown@formal@none@1@S@==Information technology==@@@@1@2@@danf@17-8-2009
10180170@unknown@formal@none@1@S@A special case is the [[Data analysis (information technology in othm )|data analysis in information technology audits]].@@@@1@17@@danf@17-8-2009
10180180@unknown@formal@none@1@S@==Business==@@@@1@1@@danf@17-8-2009
10180190@unknown@formal@none@1@S@See@@@@1@1@@danf@17-8-2009
10180200@unknown@formal@none@1@S@* [[Analytics]]@@@@1@2@@danf@17-8-2009
10180210@unknown@formal@none@1@S@* [[Business intelligence]]@@@@1@3@@danf@17-8-2009
10180220@unknown@formal@none@1@S@* [[Data mining]]@@@@1@3@@danf@17-8-2009
10190010@unknown@formal@none@1@S@Database@@@@1@1@@danf@17-8-2009
10190020@unknown@formal@none@1@S@A '''database''' is a [[structure]]d collection of records or [[data]].@@@@1@10@@danf@17-8-2009
10190030@unknown@formal@none@1@S@A [[computer]] database relies upon [[software]] to organize the storage of data.@@@@1@12@@danf@17-8-2009
10190040@unknown@formal@none@1@S@The software models the database structure in what are known as [[database model]]s.@@@@1@13@@danf@17-8-2009
10190050@unknown@formal@none@1@S@The model in most common use today is the [[relational model]].@@@@1@11@@danf@17-8-2009
10190060@unknown@formal@none@1@S@Other models such as the [[hierarchical model]] and the [[network model]] use a more explicit representation of relationships (see below for explanation of the various database models).@@@@1@27@@danf@17-8-2009
10190070@unknown@formal@none@1@S@Database management systems (DBMS) are the software used to organize and maintain the database.@@@@1@14@@danf@17-8-2009
10190080@unknown@formal@none@1@S@These are categorized according to the [[database model]] that they support.@@@@1@11@@danf@17-8-2009
10190090@unknown@formal@none@1@S@The model tends to determine the query languages that are available to access the database.@@@@1@15@@danf@17-8-2009
10190100@unknown@formal@none@1@S@A great deal of the internal engineering of a DBMS, however, is independent of the data model, and is concerned with managing factors such as performance, concurrency, integrity, and recovery from [[hardware failure]]s.@@@@1@33@@danf@17-8-2009
10190110@unknown@formal@none@1@S@In these areas there are large differences between products.@@@@1@9@@danf@17-8-2009
10190120@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10190130@unknown@formal@none@1@S@The earliest known use of the term '''''data base''''' was in November 1963, when the [[System Development Corporation]] sponsored a symposium under the title ''Development and Management of a Computer-centered Data Base''.@@@@1@32@@danf@17-8-2009
10190140@unknown@formal@none@1@S@'''Database''' as a single word became common in Europe in the early 1970s and by the end of the decade it was being used in major American newspapers.@@@@1@28@@danf@17-8-2009
10190150@unknown@formal@none@1@S@(The abbreviation DB, however, survives.)@@@@1@5@@danf@17-8-2009
10190160@unknown@formal@none@1@S@The first database management systems were developed in the 1960s.@@@@1@10@@danf@17-8-2009
10190170@unknown@formal@none@1@S@A pioneer in the field was [[Charles Bachman]].@@@@1@8@@danf@17-8-2009
10190180@unknown@formal@none@1@S@Bachman's early papers show that his aim was to make more effective use of the new direct access storage devices becoming available: until then, data processing had been based on [[punch card|punched cards]] and [[magnetic tape]], so that serial processing was the dominant activity.@@@@1@44@@danf@17-8-2009
10190190@unknown@formal@none@1@S@Two key [[data model]]s arose at this time: [[CODASYL]] developed the [[network model]] based on Bachman's ideas, and (apparently independently) the [[hierarchical model]] was used in a system developed by [[North American Rockwell]], later adopted by [[IBM]] as the cornerstone of their [[Information Management System|IMS]] product.@@@@1@46@@danf@17-8-2009
10190200@unknown@formal@none@1@S@While IMS along with the CODASYL [[IDMS]] were the big, high visibility databases developed in the 1960s, several others were also born in that decade, some of which have a significant installed base today.@@@@1@34@@danf@17-8-2009
10190210@unknown@formal@none@1@S@Two worthy of mention are the [[Pick operating system|PICK]] and [[MUMPS]] databases, with the former developed originally as an operating system with an embedded database and the latter as a programming language and database for the development of healthcare systems.@@@@1@40@@danf@17-8-2009
10190220@unknown@formal@none@1@S@The [[relational model]] was proposed by [[Edgar F. Codd|E. F. Codd]] in 1970.@@@@1@13@@danf@17-8-2009
10190230@unknown@formal@none@1@S@He criticized existing models for confusing the abstract description of information structure with descriptions of physical access mechanisms.@@@@1@18@@danf@17-8-2009
10190240@unknown@formal@none@1@S@For a long while, however, the relational model remained of academic interest only.@@@@1@13@@danf@17-8-2009
10190250@unknown@formal@none@1@S@While CODASYL products (IDMS) and hierarchical model products (IMS) were conceived as practical engineering solutions taking account of the technology as it existed at the time, the relational model took a much more theoretical perspective, arguing (correctly) that hardware and software technology would catch up in time.@@@@1@47@@danf@17-8-2009
10190260@unknown@formal@none@1@S@Among the first implementations were [[Michael Stonebraker]]'s [[Ingres (database)|Ingres]] at [[University of California, Berkeley|Berkeley]], and the [[System R]] project at IBM.@@@@1@21@@danf@17-8-2009
10190270@unknown@formal@none@1@S@Both of these were research prototypes, announced during 1976.@@@@1@9@@danf@17-8-2009
10190280@unknown@formal@none@1@S@The first commercial products, [[Oracle database|Oracle]] and [[IBM DB2|DB2]], did not appear until around 1980.@@@@1@15@@danf@17-8-2009
10190290@unknown@formal@none@1@S@The first successful database product for microcomputers was [[dBASE]] for the [[CP/M]] and [[PC-DOS]]/[[MS-DOS]] operating systems.@@@@1@16@@danf@17-8-2009
10190300@unknown@formal@none@1@S@During the 1980s, research activity focused on [[distributed database]] systems and [[database machine]]s.@@@@1@13@@danf@17-8-2009
10190310@unknown@formal@none@1@S@Another important theoretical idea was the [[Functional Data Model]], but apart from some specialized applications in genetics, molecular biology, and fraud investigation, the world took little notice.@@@@1@27@@danf@17-8-2009
10190320@unknown@formal@none@1@S@In the 1990s, attention shifted to [[OODB|object-oriented databases]].@@@@1@8@@danf@17-8-2009
10190330@unknown@formal@none@1@S@These had some success in fields where it was necessary to handle more complex data than relational systems could easily cope with, such as [[spatial database]]s, engineering data (including software [[Software repository|repositories]]), and multimedia data.@@@@1@35@@danf@17-8-2009
10190340@unknown@formal@none@1@S@Some of these ideas were adopted by the relational vendors, who integrated new features into their products as a result.@@@@1@20@@danf@17-8-2009
10190350@unknown@formal@none@1@S@The 1990s also saw the spread of [[Open Source]] databases, such as [[PostgreSQL]] and [[MySQL]].@@@@1@15@@danf@17-8-2009
10190360@unknown@formal@none@1@S@In the 2000s, the fashionable area for innovation is the [[XML database]].@@@@1@12@@danf@17-8-2009
10190370@unknown@formal@none@1@S@As with object databases, this has spawned a new collection of start-up companies, but at the same time the key ideas are being integrated into the established relational products.@@@@1@29@@danf@17-8-2009
10190380@unknown@formal@none@1@S@[[XML databases]] aim to remove the traditional divide between documents and data, allowing all of an organization's information resources to be held in one place, whether they are highly structured or not.@@@@1@32@@danf@17-8-2009
10190390@unknown@formal@none@1@S@==Database models==@@@@1@2@@danf@17-8-2009
10190400@unknown@formal@none@1@S@Various techniques are used to model data structure.@@@@1@8@@danf@17-8-2009
10190410@unknown@formal@none@1@S@Most database systems are built around one particular data model, although it is increasingly common for products to offer support for more than one model.@@@@1@25@@danf@17-8-2009
10190420@unknown@formal@none@1@S@For any one [[logical model]] various physical implementations may be possible, and most products will offer the user some level of control in tuning the [[physical implementation]], since the choices that are made have a significant effect on performance.@@@@1@39@@danf@17-8-2009
10190430@unknown@formal@none@1@S@Here are three examples:@@@@1@4@@danf@17-8-2009
10190440@unknown@formal@none@1@S@===Hierarchical model===@@@@1@2@@danf@17-8-2009
10190450@unknown@formal@none@1@S@In a [[hierarchical model]], data is organized into an inverted tree-like structure, implying multiple downward links in each node to describe the nesting, and a sort field to keep the records in a particular order in each same-level list.@@@@1@39@@danf@17-8-2009
10190460@unknown@formal@none@1@S@This structure arranges the various data elements in a hierarchy and helps to establish logical relationships among data elements of multiple files.@@@@1@22@@danf@17-8-2009
10190470@unknown@formal@none@1@S@Each unit in the model is a record which is also known as a node.@@@@1@15@@danf@17-8-2009
10190480@unknown@formal@none@1@S@In such a model, each record on one level can be related to multiple records on the next lower level.@@@@1@20@@danf@17-8-2009
10190490@unknown@formal@none@1@S@A record that has subsidiary records is called a parent and the subsidiary records are called children.@@@@1@17@@danf@17-8-2009
10190500@unknown@formal@none@1@S@Data elements in this model are well suited for one-to-many relationships with other data elements in the database.@@@@1@18@@danf@17-8-2009
10190510@unknown@formal@none@1@S@This model is advantageous when the data elements are inherently hierarchical.@@@@1@11@@danf@17-8-2009
10190520@unknown@formal@none@1@S@The disadvantage is that in order to prepare the database it becomes necessary to identify the requisite groups of files that are to be logically integrated.@@@@1@26@@danf@17-8-2009
10190530@unknown@formal@none@1@S@Hence, a hierarchical data model may not always be flexible enough to accommodate the dynamic needs of an organization.@@@@1@19@@danf@17-8-2009
10190540@unknown@formal@none@1@S@===Network model===@@@@1@2@@danf@17-8-2009
10190550@unknown@formal@none@1@S@The [[network model]] tends to store records with links to other records.@@@@1@12@@danf@17-8-2009
10190560@unknown@formal@none@1@S@Each record in the database can have multiple parents, i.e., the relationships among data elements can be many-to-many.@@@@1@18@@danf@17-8-2009
10190570@unknown@formal@none@1@S@Associations are tracked via "pointers".@@@@1@5@@danf@17-8-2009
10190580@unknown@formal@none@1@S@These pointers can be node numbers or disk addresses.@@@@1@9@@danf@17-8-2009
10190590@unknown@formal@none@1@S@Most network databases tend to also include some form of hierarchical model.@@@@1@12@@danf@17-8-2009
10190600@unknown@formal@none@1@S@Databases can be translated from hierarchical model to network and vice versa.@@@@1@12@@danf@17-8-2009
10190610@unknown@formal@none@1@S@The main difference between the network model and hierarchical model is that in a network model, a child can have a number of parents whereas in a hierarchical model, a child can have only one parent.@@@@1@36@@danf@17-8-2009
10190620@unknown@formal@none@1@S@The network model has an advantage over the hierarchical model in that it promotes greater flexibility and data accessibility, since records at a lower level can be accessed without accessing the records above them.@@@@1@34@@danf@17-8-2009
10190630@unknown@formal@none@1@S@This model is more efficient than the hierarchical model, is easier to understand, and can be applied to many real-world problems that require routine transactions.@@@@1@25@@danf@17-8-2009
10190640@unknown@formal@none@1@S@The disadvantages are that it is a complex process to design and develop a network database; it has to be refined frequently; it requires that the relationships among all the records be defined before development starts, and changes often demand major programming efforts; and operation and maintenance of the network model are expensive and time consuming.@@@@1@56@@danf@17-8-2009
10190650@unknown@formal@none@1@S@Examples of database engines that have network model capabilities are [[RDM Embedded]] and [[RDM Server]].@@@@1@15@@danf@17-8-2009
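The parent and child distinction between the hierarchical and network models described above can be sketched with a toy in-memory structure. The class `RecordGraph`, its `single_parent` flag, and the record names are hypothetical and exist only for this illustration; real hierarchical and network DBMS products store such links as node numbers or disk addresses rather than Python sets.

```python
from collections import defaultdict


class RecordGraph:
    """Toy store of records linked by parent -> child pointers (illustrative only)."""

    def __init__(self, single_parent: bool):
        # single_parent=True mimics the hierarchical rule (one parent per child);
        # single_parent=False mimics the network rule (any number of parents).
        self.single_parent = single_parent
        self.parents = defaultdict(set)   # child  -> set of parent records
        self.children = defaultdict(set)  # parent -> set of child records

    def link(self, parent: str, child: str) -> None:
        """Record that `child` is subsidiary to `parent`."""
        other_parents = self.parents[child] - {parent}
        if self.single_parent and other_parents:
            raise ValueError(f"{child!r} already has a parent in a hierarchical model")
        self.parents[child].add(parent)
        self.children[parent].add(child)


if __name__ == "__main__":
    network = RecordGraph(single_parent=False)
    network.link("department", "employee")
    network.link("project", "employee")      # allowed: a second parent
    print(sorted(network.parents["employee"]))

    hierarchy = RecordGraph(single_parent=True)
    hierarchy.link("department", "employee")
    try:
        hierarchy.link("project", "employee")  # rejected: only one parent allowed
    except ValueError as err:
        print(err)
```

In both cases a parent may have many children; the two models differ only in whether a child may appear under more than one parent.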
10190660@unknown@formal@none@1@S@===Relational model===@@@@1@2@@danf@17-8-2009
10190670@unknown@formal@none@1@S@The basic data structure of the relational model is a table where information about a particular entity (say, an employee) is represented in columns and rows.@@@@1@26@@danf@17-8-2009
10190680@unknown@formal@none@1@S@The columns enumerate the various attributes of an entity (e.g. employee_name, address, phone_number).@@@@1@13@@danf@17-8-2009
10190690@unknown@formal@none@1@S@Rows (also called records) represent instances of an entity (e.g. specific employees).@@@@1@12@@danf@17-8-2009
10190700@unknown@formal@none@1@S@The "relation" in "relational database" comes from the mathematical notion of [[Relation (mathematics)|relations]] from the field of [[set theory]].@@@@1@19@@danf@17-8-2009
10190710@unknown@formal@none@1@S@A relation is a set of [[tuple]]s, so rows are sometimes called tuples.@@@@1@13@@danf@17-8-2009
10190720@unknown@formal@none@1@S@All tables in a relational database adhere to three basic rules.@@@@1@11@@danf@17-8-2009
10190730@unknown@formal@none@1@S@* The ordering of columns is immaterial@@@@1@7@@danf@17-8-2009
10190740@unknown@formal@none@1@S@* Identical rows are not allowed in a table@@@@1@9@@danf@17-8-2009
10190750@unknown@formal@none@1@S@* Each row has a single (separate) value for each of its columns (each tuple has an atomic value).@@@@1@19@@danf@17-8-2009
10190760@unknown@formal@none@1@S@If the same value occurs in two different records (from the same table or different tables) it can imply a relationship between those records.@@@@1@24@@danf@17-8-2009
10190770@unknown@formal@none@1@S@Relationships between records are often categorized by their [[Cardinality (data modeling)|cardinality]] (1:1, (0), 1:M, M:M).@@@@1@15@@danf@17-8-2009
10190780@unknown@formal@none@1@S@Tables can have a designated column or set of columns that act as a "key" to select rows from that table with the same or similar key values.@@@@1@28@@danf@17-8-2009
10190790@unknown@formal@none@1@S@A "primary key" is a key that has a unique value for each row in the table.@@@@1@17@@danf@17-8-2009
10190800@unknown@formal@none@1@S@Keys are commonly used to join or combine data from two or more tables.@@@@1@14@@danf@17-8-2009
10190810@unknown@formal@none@1@S@For example, an ''employee'' table may contain a column named ''address'' which contains a value that matches the key of an ''address'' table.@@@@1@23@@danf@17-8-2009
10190820@unknown@formal@none@1@S@Keys are also critical in the creation of indexes, which facilitate fast retrieval of data from large tables.@@@@1@18@@danf@17-8-2009
10190830@unknown@formal@none@1@S@It is not necessary to define all the keys in advance; a column can be used as a key even if it was not originally intended to be one.@@@@1@29@@danf@17-8-2009
10190840@unknown@formal@none@1@S@====Relational operations====@@@@1@2@@danf@17-8-2009
10190850@unknown@formal@none@1@S@Users (or programs) request data from a relational database by sending it a [[query]] that is written in a special language, usually a dialect of [[SQL]].@@@@1@26@@danf@17-8-2009
10190860@unknown@formal@none@1@S@Although SQL was originally intended for end-users, it is much more common for SQL queries to be embedded into software that provides an easier user interface.@@@@1@26@@danf@17-8-2009
10190870@unknown@formal@none@1@S@Many web applications, such as [[Wikipedia]], perform SQL queries when generating pages.@@@@1@12@@danf@17-8-2009
10190880@unknown@formal@none@1@S@In response to a query, the database returns a result set, which is the list of rows constituting the answer.@@@@1@20@@danf@17-8-2009
10190890@unknown@formal@none@1@S@The simplest query is just to return all the rows from a table, but more often, the rows are filtered in some way to return just the answer wanted.@@@@1@29@@danf@17-8-2009
10190900@unknown@formal@none@1@S@Often, data from multiple tables are combined into one, by doing a [[Join (SQL)|join]].@@@@1@14@@danf@17-8-2009
10190910@unknown@formal@none@1@S@There are a number of relational operations in addition to join.@@@@1@11@@danf@17-8-2009
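A small sketch of such a query, using Python's standard `sqlite3` module with an in-memory database. The table layout, column names (`address_id`, `city`) and sample rows are assumptions invented for this example; they loosely follow the ''employee'' and ''address'' tables mentioned earlier. The query filters rows and joins the two tables on a key, and the database returns a result set of matching rows.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two small, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE address (
            address_id INTEGER PRIMARY KEY,
            city       TEXT NOT NULL
        );
        CREATE TABLE employee (
            employee_id   INTEGER PRIMARY KEY,
            employee_name TEXT NOT NULL,
            address_id    INTEGER REFERENCES address(address_id)
        );
        INSERT INTO address VALUES (1, 'Austin'), (2, 'Berkeley');
        INSERT INTO employee VALUES
            (10, 'Ada', 1), (11, 'Edgar', 2), (12, 'Grace', 1);
        """
    )
    return conn


def employees_in(conn: sqlite3.Connection, city: str) -> list[str]:
    """Join employee to address on the shared key and keep only rows for `city`."""
    rows = conn.execute(
        """
        SELECT e.employee_name
        FROM employee AS e
        JOIN address AS a ON a.address_id = e.address_id
        WHERE a.city = ?
        ORDER BY e.employee_name
        """,
        (city,),  # parameter binding instead of string formatting
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(conn.execute("SELECT * FROM employee").fetchall())  # simplest query: all rows
    print(employees_in(conn, "Austin"))                       # filtered and joined
    conn.close()
```

The value stored in `employee.address_id` matches the primary key of `address`, which is what lets the join combine data from the two tables.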
10190920@unknown@formal@none@1@S@====Normal forms====@@@@1@2@@danf@17-8-2009
10190930@unknown@formal@none@1@S@Relations are classified based upon the types of anomalies to which they are vulnerable.@@@@1@14@@danf@17-8-2009
10190940@unknown@formal@none@1@S@A database that is in the first normal form is vulnerable to all types of anomalies, while a database that is in the domain/key normal form has no modification anomalies.@@@@1@30@@danf@17-8-2009
10190950@unknown@formal@none@1@S@Normal forms are hierarchical in nature.@@@@1@6@@danf@17-8-2009
10190960@unknown@formal@none@1@S@That is, the lowest level is the first normal form, and the database cannot meet the requirements for higher level normal forms without first having met all the requirements of the lesser normal form.@@@@1@34@@danf@17-8-2009
10190970@unknown@formal@none@1@S@==Database Management Systems==@@@@1@3@@danf@17-8-2009
10190980@unknown@formal@none@1@S@===Relational database management systems===@@@@1@4@@danf@17-8-2009
10190990@unknown@formal@none@1@S@An RDBMS implements the features of the relational model outlined above.@@@@1@11@@danf@17-8-2009
10191000@unknown@formal@none@1@S@In this context, [[Christopher J. Date|Date]]'s '''Information Principle''' states:@@@@1@9@@danf@17-8-2009
10191010@unknown@formal@none@1@S@The entire information content of the database is represented in one and only one way.@@@@1@16@@danf@17-8-2009
10191020@unknown@formal@none@1@S@Namely as explicit values in column positions (attributes) and rows in relations ([[tuple]]s). Therefore, there are no explicit pointers between related tables.@@@@1@22@@danf@17-8-2009
10191030@unknown@formal@none@1@S@===Post-relational database models===@@@@1@3@@danf@17-8-2009
10191040@unknown@formal@none@1@S@Several products have been identified as [[post-relational]] because the data model incorporates [[relations]] but is not constrained by the Information Principle, which requires that all information be represented by [[data values]] in relations.@@@@1@32@@danf@17-8-2009
10191050@unknown@formal@none@1@S@Products using a post-relational data model typically employ a model that actually pre-dates the [[relational model]].@@@@1@16@@danf@17-8-2009
10191060@unknown@formal@none@1@S@These might be identified as a [[directed graph]] with [[tree data structure|trees]] on the [[data structure|nodes]].@@@@1@16@@danf@17-8-2009
10191070@unknown@formal@none@1@S@Examples of models that could be classified as post-relational are [[Pick operating system|PICK]] aka [[Multidimensional database|MultiValue]], and [[MUMPS]].@@@@1@18@@danf@17-8-2009
10191080@unknown@formal@none@1@S@===Object database models===@@@@1@3@@danf@17-8-2009
10191090@unknown@formal@none@1@S@In recent years, the [[object-oriented]] paradigm has been applied to database technology, creating a new programming model known as [[object database]]s.@@@@1@21@@danf@17-8-2009
10191100@unknown@formal@none@1@S@These databases attempt to bring the database world and the application programming world closer together, in particular by ensuring that the database uses the same [[type system]] as the application program.@@@@1@31@@danf@17-8-2009
10191110@unknown@formal@none@1@S@This aims to avoid the overhead (sometimes referred to as the ''[[Object-Relational impedance mismatch|impedance mismatch]]'') of converting information between its representation in the database (for example as rows in tables) and its representation in the application program (typically as objects).@@@@1@40@@danf@17-8-2009
10191120@unknown@formal@none@1@S@At the same time, object databases attempt to introduce the key ideas of object programming, such as [[encapsulation]] and [[polymorphism (computer science)|polymorphism]], into the world of databases.@@@@1@27@@danf@17-8-2009
10191130@unknown@formal@none@1@S@A variety of ways have been tried for storing objects in a database.@@@@1@14@@danf@17-8-2009
10191140@unknown@formal@none@1@S@Some products have approached the problem from the application programming end, by making the objects manipulated by the program [[Persistence (computer science)|persistent]].@@@@1@22@@danf@17-8-2009
10191150@unknown@formal@none@1@S@This also typically requires the addition of some kind of query language, since conventional programming languages do not have the ability to find objects based on their information content.@@@@1@29@@danf@17-8-2009
10191160@unknown@formal@none@1@S@Others have attacked the problem from the database end, by defining an object-oriented data model for the database, and defining a database programming language that allows full programming capabilities as well as traditional query facilities.@@@@1@35@@danf@17-8-2009
10191170@unknown@formal@none@1@S@==DBMS internals==@@@@1@2@@danf@17-8-2009
10191180@unknown@formal@none@1@S@===Storage and physical database design===@@@@1@5@@danf@17-8-2009
10191190@unknown@formal@none@1@S@Database tables and indexes are typically stored in memory or on hard disk in one of many forms: ordered or unordered [[flat file database|flat files]], [[ISAM]], [[heap (data structure)|heaps]], [[hash table|hash buckets]] or [[B+ tree]]s.@@@@1@31@@danf@17-8-2009
10191200@unknown@formal@none@1@S@These have various advantages and disadvantages discussed further in the main article on this topic.@@@@1@15@@danf@17-8-2009
10191210@unknown@formal@none@1@S@The most commonly used are B+ trees and ISAM.@@@@1@9@@danf@17-8-2009
10191220@unknown@formal@none@1@S@Other important design choices relate to the clustering of data by category (such as grouping data by month, or location), creating pre-computed views known as materialized views, and partitioning data by range or hash.@@@@1@33@@danf@17-8-2009
10191230@unknown@formal@none@1@S@Memory management and storage topology can also be important design choices for database designers.@@@@1@15@@danf@17-8-2009
10191240@unknown@formal@none@1@S@Just as normalization is used to reduce storage requirements and improve the extensibility of the database, conversely denormalization is often used to reduce join complexity and reduce execution time for queries.@@@@1@31@@danf@17-8-2009
10191250@unknown@formal@none@1@S@====Indexing====@@@@1@1@@danf@17-8-2009
10191260@unknown@formal@none@1@S@All of these databases can take advantage of [[Index (database)|indexing]] to increase their speed.@@@@1@14@@danf@17-8-2009
10191270@unknown@formal@none@1@S@This technology has advanced tremendously since its early uses in the 1960s and 1970s.@@@@1@14@@danf@17-8-2009
10191280@unknown@formal@none@1@S@The most common kind of index is a sorted list of the contents of some particular table column, with pointers to the row associated with the value.@@@@1@27@@danf@17-8-2009
10191290@unknown@formal@none@1@S@An index allows a set of table rows matching some criterion to be located quickly.@@@@1@15@@danf@17-8-2009
10191300@unknown@formal@none@1@S@Typically, indexes are also stored in the various forms of data-structure mentioned above (such as [[B-tree]]s, [[hash table|hash]]es, and [[linked lists]]).@@@@1@21@@danf@17-8-2009
10191310@unknown@formal@none@1@S@Usually, a specific technique is chosen by the database designer to increase efficiency in the particular case of the type of index required.@@@@1@23@@danf@17-8-2009
10191320@unknown@formal@none@1@S@Relational DBMSs have the advantage that indexes can be created or dropped without changing the existing applications that make use of them.@@@@1@20@@danf@17-8-2009
10191330@unknown@formal@none@1@S@The database chooses between many different strategies based on which one it estimates will run the fastest.@@@@1@17@@danf@17-8-2009
10191340@unknown@formal@none@1@S@In other words, indexes are transparent to the application or end-user querying the database; while they affect performance, any [[SQL]] statement will compute the same result with or without an index.@@@@1@35@@danf@17-8-2009
10191350@unknown@formal@none@1@S@The RDBMS will produce a plan of how to execute the query, which is generated by estimating the run times of the different algorithms and selecting the quickest.@@@@1@28@@danf@17-8-2009
10191360@unknown@formal@none@1@S@Some of the key algorithms that deal with [[join (SQL)|joins]] are [[nested loop join]], [[sort-merge join]] and [[hash join]].@@@@1@19@@danf@17-8-2009
10191370@unknown@formal@none@1@S@Which of these is chosen depends on whether an index exists, what type it is, and its [[Cardinality (SQL statements)|cardinality]].@@@@1@20@@danf@17-8-2009
10191380@unknown@formal@none@1@S@An index speeds up access to data, but it has disadvantages as well.@@@@1@13@@danf@17-8-2009
10191390@unknown@formal@none@1@S@First, every index increases the amount of storage on the hard drive necessary for the database file, and second, the index must be updated each time the data are altered, and this costs time.@@@@1@34@@danf@17-8-2009
10191400@unknown@formal@none@1@S@(Thus an index saves time in the reading of data, but it costs time in entering and altering data.@@@@1@19@@danf@17-8-2009
10191410@unknown@formal@none@1@S@It thus depends on the use to which the data are to be put whether an index is on the whole a net plus or minus in the quest for efficiency.)@@@@1@31@@danf@17-8-2009
10191420@unknown@formal@none@1@S@A special case of an index is a primary index, or primary key, which is distinguished in that the primary index must ensure a unique reference to a record.@@@@1@29@@danf@17-8-2009
10191430@unknown@formal@none@1@S@Often, for this purpose one simply uses a running index number (ID number).@@@@1@13@@danf@17-8-2009
10191440@unknown@formal@none@1@S@Primary indexes play a significant role in relational databases, and they can speed up access to data considerably.@@@@1@18@@danf@17-8-2009
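The transparency of indexes can be illustrated with a small sketch using Python's standard `sqlite3` module. The `supplier` table, its contents and the index name `idx_supplier_name` are hypothetical, and the exact wording of the plan text varies between SQLite versions, so it is printed rather than assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The INTEGER PRIMARY KEY column plays the role of the running ID number.
conn.execute(
    "CREATE TABLE supplier (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO supplier (name, city) VALUES (?, ?)",
    [(f"supplier-{i}", f"city-{i % 10}") for i in range(1000)],
)

query = "SELECT id, name, city FROM supplier WHERE name = ?"


def plan(sql, args):
    """Return the textual query plan that SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail


def lookup(name):
    """Run the same, unchanged query text."""
    return conn.execute(query, (name,)).fetchall()


before_plan = plan(query, ("supplier-42",))
before_rows = lookup("supplier-42")

# Adding an index requires no change to the query itself.
conn.execute("CREATE INDEX idx_supplier_name ON supplier (name)")

after_plan = plan(query, ("supplier-42",))
after_rows = lookup("supplier-42")

print(before_plan)
print(after_plan)
```

The query returns the same rows before and after the index is created; only the plan chosen by the database changes.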
10191450@unknown@formal@none@1@S@===Transactions and concurrency===@@@@1@3@@danf@17-8-2009
10191460@unknown@formal@none@1@S@In addition to their data model, most practical databases ("transactional databases") attempt to enforce [[database transaction]]s.@@@@1@18@@danf@17-8-2009
10191470@unknown@formal@none@1@S@Ideally, the database software should enforce the [[ACID]] rules, summarized here:@@@@1@11@@danf@17-8-2009
10191480@unknown@formal@none@1@S@* [[Atomicity]]: Either all the tasks in a transaction must be done, or none of them.@@@@1@16@@danf@17-8-2009
10191490@unknown@formal@none@1@S@The transaction must be completed, or else it must be undone (rolled back).@@@@1@13@@danf@17-8-2009
10191500@unknown@formal@none@1@S@* [[Database consistency|Consistency]]: Every transaction must preserve the integrity constraints — the declared consistency rules — of the database.@@@@1@19@@danf@17-8-2009
10191510@unknown@formal@none@1@S@It cannot place the data in a contradictory state.@@@@1@9@@danf@17-8-2009
10191520@unknown@formal@none@1@S@* [[Isolation]]: Two simultaneous transactions cannot interfere with one another.@@@@1@10@@danf@17-8-2009
10191530@unknown@formal@none@1@S@Intermediate results within a transaction are not visible to other transactions.@@@@1@11@@danf@17-8-2009
10191540@unknown@formal@none@1@S@* [[Durability (computer science)|Durability]]: Completed transactions cannot be aborted later or their results discarded.@@@@1@14@@danf@17-8-2009
10191550@unknown@formal@none@1@S@They must persist through (for instance) restarts of the DBMS after crashes.@@@@1@12@@danf@17-8-2009
10191560@unknown@formal@none@1@S@In practice, many DBMSs allow most of these rules to be selectively relaxed for better performance.@@@@1@16@@danf@17-8-2009
10191570@unknown@formal@none@1@S@[[Concurrency control]] is a method used to ensure that transactions are executed in a safe manner and follow the ACID rules.@@@@1@21@@danf@17-8-2009
10191580@unknown@formal@none@1@S@The DBMS must be able to ensure that only [[serializability|serializable]], [[serializability#correctness - recoverability|recoverable]] schedules are allowed, and that no actions of committed transactions are lost while undoing aborted transactions.@@@@1@30@@danf@17-8-2009
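Atomicity and consistency can be sketched with Python's standard `sqlite3` module. The `account` table, the names and the balances are hypothetical; the `CHECK` constraint stands in for a declared integrity rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # integrity constraint
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint; the credit made earlier
        # in the same transaction has been rolled back as well.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


ok_small = transfer(conn, "alice", "bob", 30)
after_small = balances(conn)

ok_large = transfer(conn, "alice", "bob", 500)  # more than alice has
after_large = balances(conn)

print(ok_small, after_small)
print(ok_large, after_large)
```

The second transfer fails as a whole: the credit to the receiving account is undone together with the rejected debit, so the balances are unchanged.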
10191590@unknown@formal@none@1@S@===Replication===@@@@1@1@@danf@17-8-2009
10191600@unknown@formal@none@1@S@Replication of databases is closely related to transactions.@@@@1@8@@danf@17-8-2009
10191610@unknown@formal@none@1@S@If a database can log its individual actions, it is possible to create a duplicate of the data in real time.@@@@1@21@@danf@17-8-2009
10191620@unknown@formal@none@1@S@The duplicate can be used to improve performance or availability of the whole database system.@@@@1@15@@danf@17-8-2009
10191630@unknown@formal@none@1@S@Common replication concepts include:@@@@1@4@@danf@17-8-2009
10191640@unknown@formal@none@1@S@* Master/Slave Replication: All write requests are performed on the master and then replicated to the slaves.@@@@1@17@@danf@17-8-2009
10191650@unknown@formal@none@1@S@* Quorum: The results of read and write requests are calculated by querying a "majority" of replicas.@@@@1@17@@danf@17-8-2009
10191660@unknown@formal@none@1@S@* Multimaster: Two or more replicas synchronize with each other via a transaction identifier.@@@@1@13@@danf@17-8-2009
10191670@unknown@formal@none@1@S@Parallel synchronous replication of databases enables transactions to be replicated on multiple servers simultaneously, which provides a method for backup and security as well as data availability.@@@@1@27@@danf@17-8-2009
10191680@unknown@formal@none@1@S@===Security===@@@@1@1@@danf@17-8-2009
10191690@unknown@formal@none@1@S@[[Database security]] denotes the system, processes, and procedures that protect a database from unintended activity.@@@@1@15@@danf@17-8-2009
10191700@unknown@formal@none@1@S@Security is usually enforced through '''access control''', '''auditing''', and '''encryption'''.@@@@1@10@@danf@17-8-2009
10191710@unknown@formal@none@1@S@* Access control ensures and restricts who can connect and what can be done to the database.@@@@1@17@@danf@17-8-2009
10191720@unknown@formal@none@1@S@* Auditing logs what action or change has been performed, when and by whom.@@@@1@14@@danf@17-8-2009
10191730@unknown@formal@none@1@S@* Encryption: Since security has become a major issue in recent years, many commercial database vendors provide built-in encryption mechanisms.@@@@1@20@@danf@17-8-2009
10191740@unknown@formal@none@1@S@Data is encoded natively into the tables and deciphered "on the fly" when a query comes in.@@@@1@17@@danf@17-8-2009
10191745@unknown@formal@none@1@S@Connections can also be secured and encrypted if required, using mechanisms such as DSA, MD5, SSL or a legacy encryption standard.@@@@1@17@@danf@17-8-2009
10191750@unknown@formal@none@1@S@Enforcing security is one of the major tasks of the DBA.@@@@1@11@@danf@17-8-2009
10191760@unknown@formal@none@1@S@In the United Kingdom, legislation protecting the public from unauthorized disclosure of personal information held on databases falls under the Office of the Information Commissioner.@@@@1@25@@danf@17-8-2009
10191770@unknown@formal@none@1@S@United Kingdom based organizations holding personal data in electronic format (databases for example) are required to register with the Data Commissioner.@@@@1@21@@danf@17-8-2009
10191780@unknown@formal@none@1@S@===Locking===@@@@1@1@@danf@17-8-2009
10191790@unknown@formal@none@1@S@[[Lock (computer science)|Locking]] is how the database handles multiple concurrent operations.@@@@1@11@@danf@17-8-2009
10191800@unknown@formal@none@1@S@This is how concurrency and a basic form of integrity are managed within the database system.@@@@1@18@@danf@17-8-2009
10191810@unknown@formal@none@1@S@Such locks can be applied at the row level, or at other levels such as a page (a basic data block), an extent (an array of multiple pages) or even an entire table.@@@@1@29@@danf@17-8-2009
10191820@unknown@formal@none@1@S@This helps maintain the integrity of the data by ensuring that only one process at a time can modify the '''same''' data.@@@@1@22@@danf@17-8-2009
10191830@unknown@formal@none@1@S@On basic filesystem files or folders, only one lock at a time can be set, restricting usage to one process only.@@@@1@24@@danf@17-8-2009
10191840@unknown@formal@none@1@S@A database, by contrast, can set and hold multiple locks at the same time on different levels of the physical data structure.@@@@1@21@@danf@17-8-2009
10191850@unknown@formal@none@1@S@How locks are set and how long they last is determined by the database engine's locking scheme, based on the SQL or transactions submitted by users.@@@@1@23@@danf@17-8-2009
10191860@unknown@formal@none@1@S@Generally speaking, no activity on the database should translate into no or very light locking.@@@@1@16@@danf@17-8-2009
10191870@unknown@formal@none@1@S@In most DBMSs on the market, locks are generally either '''shared''' or '''exclusive'''.@@@@1@14@@danf@17-8-2009
10191880@unknown@formal@none@1@S@An exclusive lock means that no other lock can be acquired on the data object for as long as the exclusive lock lasts.@@@@1@20@@danf@17-8-2009
10191890@unknown@formal@none@1@S@Exclusive locks are usually set while the database needs to change data, like during an UPDATE or DELETE operation.@@@@1@19@@danf@17-8-2009
10191900@unknown@formal@none@1@S@Several shared locks can be held on the same data structure at the same time.@@@@1@14@@danf@17-8-2009
10191910@unknown@formal@none@1@S@Shared locks are usually used while the database is reading data, during a SELECT operation.@@@@1@15@@danf@17-8-2009
10191920@unknown@formal@none@1@S@The number and nature of locks, and the time for which a lock holds a data block, can have a huge impact on database performance.@@@@1@22@@danf@17-8-2009
10191930@unknown@formal@none@1@S@Bad locking can lead to disastrous performance (usually the result of poor SQL requests or an inadequate physical database structure).@@@@1@20@@danf@17-8-2009
10191940@unknown@formal@none@1@S@Default locking behavior is enforced by the '''isolation level''' of the dataserver.@@@@1@12@@danf@17-8-2009
10191950@unknown@formal@none@1@S@Changing the isolation level will affect how shared or exclusive locks must be set on the data for the entire database system.@@@@1@22@@danf@17-8-2009
10191960@unknown@formal@none@1@S@Default isolation is generally 1, where data cannot be read while it is being modified, which prevents "ghost data" from being returned to the end user.@@@@1@23@@danf@17-8-2009
10191970@unknown@formal@none@1@S@At some point, intensive or inappropriate exclusive locking can lead to a "deadlock" situation between two locks.@@@@1@18@@danf@17-8-2009
10191980@unknown@formal@none@1@S@In a deadlock, neither lock can be released because each is trying to acquire resources held by the other.@@@@1@18@@danf@17-8-2009
10191990@unknown@formal@none@1@S@The database has a fail-safe mechanism and will automatically "sacrifice" one of the locks, releasing the resource.@@@@1@18@@danf@17-8-2009
10192000@unknown@formal@none@1@S@As a result, the processes or transactions involved in the deadlock will be rolled back.@@@@1@14@@danf@17-8-2009
10192010@unknown@formal@none@1@S@Databases can also be locked for other reasons, like access restrictions for given levels of user.@@@@1@16@@danf@17-8-2009
10192020@unknown@formal@none@1@S@Databases are also locked for routine database maintenance, which prevents changes being made during the maintenance.@@@@1@16@@danf@17-8-2009
10192030@unknown@formal@none@1@S@(See [http://publib.boulder.ibm.com/infocenter/rbhelp/v6r3/index.jsp?topic=/com.ibm.redbrick.doc6.3/wag/wag80.htm IBM] for more detail.)@@@@1@6@@danf@17-8-2009
10192040@unknown@formal@none@1@S@===Architecture===@@@@1@1@@danf@17-8-2009
10192050@unknown@formal@none@1@S@Depending on the intended use, there are a number of database architectures in use.@@@@1@14@@danf@17-8-2009
10192060@unknown@formal@none@1@S@Many databases use a combination of strategies.@@@@1@7@@danf@17-8-2009
10192070@unknown@formal@none@1@S@On-line transaction processing (OLTP) systems often use a row-oriented datastore architecture, while data-warehouse and other retrieval-focused applications like [[Google]]'s [[BigTable]], or bibliographic database (library catalogue) systems, may use a column-oriented datastore architecture.@@@@1@31@@danf@17-8-2009
10192080@unknown@formal@none@1@S@Document-oriented databases, XML databases and knowledge bases, as well as frame databases and RDF stores (also known as triple stores), may also use a combination of these architectures in their implementation.@@@@1@23@@danf@17-8-2009
10192090@unknown@formal@none@1@S@Finally, it should be noted that not all databases have or need a database 'schema' (so-called schema-less databases).@@@@1@19@@danf@17-8-2009
10192100@unknown@formal@none@1@S@==Applications of databases==@@@@1@3@@danf@17-8-2009
10192110@unknown@formal@none@1@S@Databases are used in many applications, spanning virtually the entire range of [[computer software]].@@@@1@14@@danf@17-8-2009
10192120@unknown@formal@none@1@S@Databases are the preferred method of storage for large multiuser applications, where coordination between many users is needed.@@@@1@18@@danf@17-8-2009
10192130@unknown@formal@none@1@S@Even individual users find them convenient, and many electronic mail programs and personal organizers are based on standard database technology.@@@@1@20@@danf@17-8-2009
10192140@unknown@formal@none@1@S@Software database drivers are available for most database platforms so that [[application software]] can use a common [[Application Programming Interface]] to retrieve the information stored in a database.@@@@1@28@@danf@17-8-2009
10192150@unknown@formal@none@1@S@Two commonly used database APIs are [[Java Database Connectivity|JDBC]] and [[ODBC]].@@@@1@11@@danf@17-8-2009
10192160@unknown@formal@none@1@S@For example, a suppliers database contains data relating to suppliers, such as:@@@@1@12@@danf@17-8-2009
10192170@unknown@formal@none@1@S@*supplier name@@@@1@2@@danf@17-8-2009
10192180@unknown@formal@none@1@S@*supplier code@@@@1@2@@danf@17-8-2009
10192190@unknown@formal@none@1@S@*supplier address@@@@1@2@@danf@17-8-2009
10192200@unknown@formal@none@1@S@Databases are also often used by schools to teach and grade students.@@@@1@12@@danf@17-8-2009
10192210@unknown@formal@none@1@S@==Links to DBMS products==@@@@1@4@@danf@17-8-2009
10192220@unknown@formal@none@1@S@*[[4th Dimension (Software)|4D]]@@@@1@3@@danf@17-8-2009
10192230@unknown@formal@none@1@S@*[[ADABAS]]@@@@1@1@@danf@17-8-2009
10192240@unknown@formal@none@1@S@*[[Alpha Five]]@@@@1@2@@danf@17-8-2009
10192250@unknown@formal@none@1@S@*[[Apache Derby]] (Java, also known as IBM Cloudscape and Sun Java DB)@@@@1@12@@danf@17-8-2009
10192260@unknown@formal@none@1@S@*[[BerkeleyDB]]@@@@1@1@@danf@17-8-2009
10192270@unknown@formal@none@1@S@*[[CouchDB]]@@@@1@1@@danf@17-8-2009
10192280@unknown@formal@none@1@S@*[[CSQL]]@@@@1@1@@danf@17-8-2009
10192290@unknown@formal@none@1@S@*[[Datawasp]]@@@@1@1@@danf@17-8-2009
10192300@unknown@formal@none@1@S@*[[Db4objects]]@@@@1@1@@danf@17-8-2009
10192310@unknown@formal@none@1@S@*[[dBase]]@@@@1@1@@danf@17-8-2009
10192320@unknown@formal@none@1@S@*[[FileMaker]]@@@@1@1@@danf@17-8-2009
10192330@unknown@formal@none@1@S@*[[Firebird (database server)]]@@@@1@3@@danf@17-8-2009
10192340@unknown@formal@none@1@S@*[[H2 (DBMS)|H2]] (Java)@@@@1@3@@danf@17-8-2009
10192350@unknown@formal@none@1@S@*[[Hsqldb]] (Java)@@@@1@2@@danf@17-8-2009
10192360@unknown@formal@none@1@S@*[[IBM DB2]]@@@@1@2@@danf@17-8-2009
10192370@unknown@formal@none@1@S@*[[Information Management System|IBM IMS (Information Management System)]]@@@@1@7@@danf@17-8-2009
10192380@unknown@formal@none@1@S@*[[IBM UniVerse]]@@@@1@2@@danf@17-8-2009
10192390@unknown@formal@none@1@S@*[[Informix]]@@@@1@1@@danf@17-8-2009
10192400@unknown@formal@none@1@S@*[[Ingres (database)|Ingres]]@@@@1@2@@danf@17-8-2009
10192410@unknown@formal@none@1@S@*[[Interbase]]@@@@1@1@@danf@17-8-2009
10192420@unknown@formal@none@1@S@*[[InterSystems Caché]]@@@@1@2@@danf@17-8-2009
10192430@unknown@formal@none@1@S@*[[MaxDB]] (formerly SapDB)@@@@1@3@@danf@17-8-2009
10192440@unknown@formal@none@1@S@*[[Microsoft Access]]@@@@1@2@@danf@17-8-2009
10192450@unknown@formal@none@1@S@*[[Microsoft SQL Server]]@@@@1@3@@danf@17-8-2009
10192460@unknown@formal@none@1@S@*[[Model 204]]@@@@1@2@@danf@17-8-2009
10192470@unknown@formal@none@1@S@*[[MySQL]]@@@@1@1@@danf@17-8-2009
10192480@unknown@formal@none@1@S@*[[Nomad software|Nomad]]@@@@1@2@@danf@17-8-2009
10192490@unknown@formal@none@1@S@*[[Objectivity/DB]]@@@@1@1@@danf@17-8-2009
10192500@unknown@formal@none@1@S@*[[ObjectStore]]@@@@1@1@@danf@17-8-2009
10192510@unknown@formal@none@1@S@*[[Virtuoso Universal Server|OpenLink Virtuoso]]@@@@1@4@@danf@17-8-2009
10192520@unknown@formal@none@1@S@*[[OpenOffice.org Base]]@@@@1@2@@danf@17-8-2009
10192530@unknown@formal@none@1@S@*[[Oracle Database]]@@@@1@2@@danf@17-8-2009
10192540@unknown@formal@none@1@S@*[[Paradox (database)]]@@@@1@2@@danf@17-8-2009
10192550@unknown@formal@none@1@S@*[[Polyhedra DBMS]]@@@@1@2@@danf@17-8-2009
10192560@unknown@formal@none@1@S@*[[PostgreSQL]]@@@@1@1@@danf@17-8-2009
10192570@unknown@formal@none@1@S@*[[Progress 4GL]]@@@@1@2@@danf@17-8-2009
10192580@unknown@formal@none@1@S@*[[RDM Embedded]]@@@@1@2@@danf@17-8-2009
10192590@unknown@formal@none@1@S@*[[ScimoreDB]]@@@@1@1@@danf@17-8-2009
10192600@unknown@formal@none@1@S@*[[Sedna (database)|Sedna]]@@@@1@2@@danf@17-8-2009
10192610@unknown@formal@none@1@S@*[[SQLite]]@@@@1@1@@danf@17-8-2009
10192620@unknown@formal@none@1@S@*[[Superbase database|Superbase]]@@@@1@2@@danf@17-8-2009
10192630@unknown@formal@none@1@S@*[[Sybase]]@@@@1@1@@danf@17-8-2009
10192640@unknown@formal@none@1@S@*[[Teradata]]@@@@1@1@@danf@17-8-2009
10192650@unknown@formal@none@1@S@*[[Vertica]]@@@@1@1@@danf@17-8-2009
10192660@unknown@formal@none@1@S@*[[Visual FoxPro]]@@@@1@2@@danf@17-8-2009
10200010@unknown@formal@none@1@S@Cluster analysis@@@@1@2@@danf@17-8-2009
10200020@unknown@formal@none@1@S@'''Clustering''' is the [[Statistical classification|classification]] of objects into different groups, or more precisely, the [[partition of a set|partitioning]] of a [[data set]] into [[subset]]s (clusters), so that the data in each subset (ideally) share some common trait - often proximity according to some defined [[metric (mathematics)|distance measure]].@@@@1@47@@danf@17-8-2009
10200030@unknown@formal@none@1@S@Data clustering is a common technique for [[statistics|statistical]] [[data analysis]], which is used in many fields, including [[machine learning]], [[data mining]], [[pattern recognition]], [[image analysis]] and [[bioinformatics]].@@@@1@27@@danf@17-8-2009
10200040@unknown@formal@none@1@S@The computational task of classifying the data set into ''k'' clusters is often referred to as '''''k''-clustering'''.@@@@1@17@@danf@17-8-2009
10200050@unknown@formal@none@1@S@Besides the term ''data clustering'' (or just ''clustering''), there are a number of terms with similar meanings, including ''cluster analysis'', ''automatic classification'', ''numerical taxonomy'', ''botryology'' and ''typological analysis''.@@@@1@28@@danf@17-8-2009
10200060@unknown@formal@none@1@S@== Types of clustering ==@@@@1@5@@danf@17-8-2009
10200070@unknown@formal@none@1@S@Data clustering algorithms can be [[hierarchical]] or partitional.@@@@1@6@@danf@17-8-2009
10200080@unknown@formal@none@1@S@Hierarchical algorithms find successive clusters using previously established clusters.@@@@1@9@@danf@17-8-2009
10200090@unknown@formal@none@1@S@Hierarchical algorithms can be agglomerative ("bottom-up") or divisive ("top-down").@@@@1@9@@danf@17-8-2009
10200100@unknown@formal@none@1@S@Agglomerative algorithms begin with each element as a separate cluster and merge them into successively larger clusters.@@@@1@17@@danf@17-8-2009
10200110@unknown@formal@none@1@S@Divisive algorithms begin with the whole set and proceed to divide it into successively smaller clusters.@@@@1@16@@danf@17-8-2009
10200120@unknown@formal@none@1@S@[[partition of a set|Partitional]] algorithms typically determine all clusters at once, but can also be used as divisive algorithms in [[hierarchical]] clustering.@@@@1@23@@danf@17-8-2009
10200130@unknown@formal@none@1@S@''Two-way clustering'', ''co-clustering'' or [[biclustering]] are clustering methods where not only the objects are clustered but also the features of the objects, i.e., if the data is represented in a [[data matrix (statistics)|data matrix]], the rows and columns are clustered simultaneously.@@@@1@41@@danf@17-8-2009
10200140@unknown@formal@none@1@S@Another important distinction is whether the clustering uses symmetric or asymmetric distances.@@@@1@12@@danf@17-8-2009
10200150@unknown@formal@none@1@S@A property of [[Euclidean space]] is that distances are symmetric (the distance from object ''A'' to ''B'' is the same as the distance from ''B'' to ''A'').@@@@1@27@@danf@17-8-2009
10200160@unknown@formal@none@1@S@In other applications (e.g., sequence-alignment methods, see Prinzie & Van den Poel (2006)), this is not the case.@@@@1@18@@danf@17-8-2009
10200170@unknown@formal@none@1@S@== Distance measure ==@@@@1@4@@danf@17-8-2009
10200180@unknown@formal@none@1@S@An important step in any clustering is to select a [[Distance|distance measure]], which will determine how the ''similarity'' of two elements is calculated.@@@@1@23@@danf@17-8-2009
10200190@unknown@formal@none@1@S@This will influence the shape of the clusters, as some elements may be close to one another according to one distance and further away according to another.@@@@1@27@@danf@17-8-2009
10200200@unknown@formal@none@1@S@For example, in a 2-dimensional space, the distance between the point (x=1, y=0) and the origin (x=0, y=0) is always 1 according to the usual norms, but the distance between the point (x=1, y=1) and the origin can be 2, <math>\sqrt{2}</math> or 1 if you take respectively the 1-norm, 2-norm or infinity-norm distance.@@@@1@53@@danf@17-8-2009
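The arithmetic in this example can be checked with a short Python sketch. The function name `norm_distances` is only illustrative.

```python
import math


def norm_distances(p, q):
    """Distance between points p and q under the 1-, 2- and infinity-norms."""
    diffs = [abs(a - b) for a, b in zip(p, q)]
    return {
        "1-norm": sum(diffs),                              # Manhattan distance
        "2-norm": math.sqrt(sum(d * d for d in diffs)),    # Euclidean distance
        "infinity-norm": max(diffs),                       # maximum norm
    }


origin = (0, 0)
on_axis = norm_distances((1, 0), origin)   # all three norms agree here
diagonal = norm_distances((1, 1), origin)  # the norms disagree here

print(on_axis)
print(diagonal)
```

For the point (1, 0) the three distances coincide, while for (1, 1) they are 2, the square root of 2, and 1 respectively.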
10200210@unknown@formal@none@1@S@Common distance functions:@@@@1@3@@danf@17-8-2009
10200220@unknown@formal@none@1@S@* The [[Euclidean distance]] (also called distance [[as the crow flies]] or 2-norm distance).@@@@1@14@@danf@17-8-2009
10200230@unknown@formal@none@1@S@A review of cluster analysis in health psychology research found that the most common distance measure in published studies in that research area is the Euclidean distance or the squared Euclidean distance.@@@@1@32@@danf@17-8-2009
10200240@unknown@formal@none@1@S@* The [[Manhattan distance]] (also called taxicab norm or 1-norm)@@@@1@10@@danf@17-8-2009
10200250@unknown@formal@none@1@S@* The [[Maximum_norm|maximum norm]]@@@@1@4@@danf@17-8-2009
10200260@unknown@formal@none@1@S@* The [[Mahalanobis distance]] corrects data for different scales and correlations in the variables@@@@1@14@@danf@17-8-2009
10200270@unknown@formal@none@1@S@* The angle between two vectors can be used as a distance measure when clustering high dimensional data.@@@@1@18@@danf@17-8-2009
10200280@unknown@formal@none@1@S@See [[Inner product space]].@@@@1@4@@danf@17-8-2009
10200290@unknown@formal@none@1@S@* The [[Hamming distance]] (sometimes edit distance) measures the minimum number of substitutions required to change one member into another.@@@@1@20@@danf@17-8-2009
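Two of the less familiar measures in the list above, the Hamming distance and the angle between two vectors, can be sketched in Python as follows. The function names and the sample inputs are illustrative assumptions, not part of the article.

```python
import math


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def angle_between(u, v):
    """Angle in radians between two non-zero vectors, from their inner product."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    cosine = dot / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, cosine)))


substitutions = hamming_distance("karolin", "kathrin")
right_angle = angle_between((1, 0), (0, 1))
same_direction = angle_between((1, 1), (2, 2))

print(substitutions, right_angle, same_direction)
```

Vectors that point in the same direction have an angle close to zero regardless of their lengths, which is why this measure is sometimes preferred for high-dimensional data.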
10200300@unknown@formal@none@1@S@==Hierarchical clustering==@@@@1@2@@danf@17-8-2009
10200310@unknown@formal@none@1@S@===Creating clusters===@@@@1@2@@danf@17-8-2009
10200320@unknown@formal@none@1@S@Hierarchical clustering builds (agglomerative), or breaks up (divisive), a hierarchy of clusters.@@@@1@12@@danf@17-8-2009
10200330@unknown@formal@none@1@S@The traditional representation of this hierarchy is a [[tree data structure|tree]] (called a [[dendrogram]]), with individual elements at one end and a single cluster containing every element at the other.@@@@1@30@@danf@17-8-2009
10200340@unknown@formal@none@1@S@Agglomerative algorithms begin at the leaves of the tree, whereas divisive algorithms begin at the root.@@@@1@16@@danf@17-8-2009
10200350@unknown@formal@none@1@S@(In the figure, the arrows indicate an agglomerative clustering.)@@@@1@9@@danf@17-8-2009
10200360@unknown@formal@none@1@S@Cutting the tree at a given height will give a clustering at a selected precision.@@@@1@15@@danf@17-8-2009
10200370@unknown@formal@none@1@S@In the following example, cutting after the second row will yield clusters {a} {b c} {d e} {f}.@@@@1@18@@danf@17-8-2009
10200380@unknown@formal@none@1@S@Cutting after the third row will yield clusters {a} {b c} {d e f}, which is a coarser clustering, with a smaller number of larger clusters.@@@@1@26@@danf@17-8-2009
10200390@unknown@formal@none@1@S@===Agglomerative hierarchical clustering===@@@@1@3@@danf@17-8-2009
10200400@unknown@formal@none@1@S@For example, suppose this data is to be clustered, and the [[euclidean distance]] is the [[Metric (mathematics)|distance metric]].@@@@1@18@@danf@17-8-2009
10200410@unknown@formal@none@1@S@The hierarchical clustering [[dendrogram]] would be as such:@@@@1@8@@danf@17-8-2009
10200420@unknown@formal@none@1@S@This method builds the hierarchy from the individual elements by progressively merging clusters.@@@@1@13@@danf@17-8-2009
10200430@unknown@formal@none@1@S@In our example, we have six elements {a} {b} {c} {d} {e} and {f}.@@@@1@14@@danf@17-8-2009
10200440@unknown@formal@none@1@S@The first step is to determine which elements to merge in a cluster.@@@@1@13@@danf@17-8-2009
10200450@unknown@formal@none@1@S@Usually, we want to take the two closest elements, according to the chosen distance.@@@@1@14@@danf@17-8-2009
10200460@unknown@formal@none@1@S@Optionally, one can also construct a [[distance matrix]] at this stage, where the number in the ''i''-th row ''j''-th column is the distance between the ''i''-th and ''j''-th elements.@@@@1@29@@danf@17-8-2009
10200470@unknown@formal@none@1@S@Then, as clustering progresses, rows and columns are merged as the clusters are merged and the distances updated.@@@@1@18@@danf@17-8-2009
10200480@unknown@formal@none@1@S@This is a common way to implement this type of clustering, and has the benefit of caching distances between clusters.@@@@1@20@@danf@17-8-2009
10200490@unknown@formal@none@1@S@A simple agglomerative clustering algorithm is described in the [[single linkage clustering]] page; it can easily be adapted to different types of linkage (see below).@@@@1@25@@danf@17-8-2009
10200500@unknown@formal@none@1@S@Suppose we have merged the two closest elements ''b'' and ''c''; we now have the following clusters {''a''}, {''b'', ''c''}, {''d''}, {''e''} and {''f''}, and want to merge them further.@@@@1@30@@danf@17-8-2009
10200510@unknown@formal@none@1@S@To do that, we need to take the distance between {a} and {b c}, and therefore define the distance between two clusters.@@@@1@22@@danf@17-8-2009
10200520@unknown@formal@none@1@S@Usually the distance between two clusters <math>\mathcal{A}</math> and <math>\mathcal{B}</math> is one of the following:@@@@1@14@@danf@17-8-2009
10200530@unknown@formal@none@1@S@* The maximum distance between elements of each cluster (also called complete linkage clustering):@@@@1@14@@danf@17-8-2009
10200540@unknown@formal@none@1@S@:: <math> \max \{\, d(x,y) : x \in \mathcal{A},\, y \in \mathcal{B} \,\} </math>@@@@1@12@@danf@17-8-2009
10200550@unknown@formal@none@1@S@* The minimum distance between elements of each cluster (also called [[single linkage clustering]]):@@@@1@14@@danf@17-8-2009
10200560@unknown@formal@none@1@S@:: <math> \min \{\, d(x,y) : x \in \mathcal{A},\, y \in \mathcal{B} \,\} </math>@@@@1@13@@danf@17-8-2009
10200570@unknown@formal@none@1@S@* The mean distance between elements of each cluster (also called average linkage clustering, used e.g. in [[UPGMA]]):@@@@1@18@@danf@17-8-2009
10200580@unknown@formal@none@1@S@:: <math> {1 \over {|\mathcal{A}| \cdot |\mathcal{B}|}} \sum_{x \in \mathcal{A}} \sum_{y \in \mathcal{B}} d(x,y) </math>@@@@1@11@@danf@17-8-2009
10200590@unknown@formal@none@1@S@* The sum of all intra-cluster variance@@@@1@7@@danf@17-8-2009
10200600@unknown@formal@none@1@S@* The increase in variance for the cluster being merged ([[Ward's criterion]])@@@@1@12@@danf@17-8-2009
10200610@unknown@formal@none@1@S@* The probability that candidate clusters spawn from the same distribution function (V-linkage)@@@@1@13@@danf@17-8-2009
10200620@unknown@formal@none@1@S@Each agglomeration occurs at a greater distance between clusters than the previous agglomeration, and one can decide to stop clustering either when the clusters are too far apart to be merged (distance criterion) or when there is a sufficiently small number of clusters (number criterion).@@@@1@45@@danf@17-8-2009
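The merging procedure and the first three linkage definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `cluster_distance` and `agglomerate`, and the parameters `max_distance` (distance criterion) and `num_clusters` (number criterion), are assumptions made for this example. It recomputes all cluster distances at each step instead of caching them in a distance matrix.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def cluster_distance(a, b, linkage="single"):
    """Distance between two clusters (lists of points) under the chosen linkage."""
    pairwise = [dist(x, y) for x in a for y in b]
    if linkage == "single":      # minimum distance between elements
        return min(pairwise)
    if linkage == "complete":    # maximum distance between elements
        return max(pairwise)
    if linkage == "average":     # mean distance between elements
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerate(points, linkage="single", max_distance=None, num_clusters=1):
    """Merge the two closest clusters until a stopping criterion is met."""
    clusters = [[p] for p in points]  # start with one cluster per element
    while len(clusters) > num_clusters:  # number criterion
        (i, j), d = min(
            (((i, j), cluster_distance(clusters[i], clusters[j], linkage))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if max_distance is not None and d > max_distance:  # distance criterion
            break
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters
```

Calling `agglomerate(points, num_clusters=3)` corresponds to cutting the dendrogram at the height where three clusters remain; passing `max_distance` instead cuts it at a given distance.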
10200630@unknown@formal@none@1@S@=== Concept clustering ===@@@@1@4@@danf@17-8-2009
10200640@unknown@formal@none@1@S@Another variation of the agglomerative clustering approach is [[conceptual clustering]].@@@@1@10@@danf@17-8-2009
10200650@unknown@formal@none@1@S@==Partitional clustering==@@@@1@2@@danf@17-8-2009
10200660@unknown@formal@none@1@S@===''K''-means and derivatives===@@@@1@3@@danf@17-8-2009
10200670@unknown@formal@none@1@S@====''K''-means clustering====@@@@1@2@@danf@17-8-2009
10200680@unknown@formal@none@1@S@The [[K-means algorithm|''K''-means algorithm]] assigns each point to the cluster whose center (also called centroid) is nearest.@@@@1@17@@danf@17-8-2009
10200690@unknown@formal@none@1@S@The center is the average of all the points in the cluster; that is, its coordinates are the arithmetic mean for each dimension separately over all the points in the cluster.@@@@1@32@@danf@17-8-2009
10200700@unknown@formal@none@1@S@:''Example:'' The data set has three dimensions and the cluster has two points: ''X'' = (''x''<sub>1</sub>, ''x''<sub>2</sub>, ''x''<sub>3</sub>) and ''Y'' = (''y''<sub>1</sub>, ''y''<sub>2</sub>, ''y''<sub>3</sub>).@@@@1@24@@danf@17-8-2009
10200710@unknown@formal@none@1@S@Then the centroid ''Z'' becomes ''Z'' = (''z''<sub>1</sub>, ''z''<sub>2</sub>, ''z''<sub>3</sub>), where ''z''<sub>1</sub> = (''x''<sub>1</sub> + ''y''<sub>1</sub>)/2 and ''z''<sub>2</sub> = (''x''<sub>2</sub> + ''y''<sub>2</sub>)/2 and ''z''<sub>3</sub> = (''x''<sub>3</sub> + ''y''<sub>3</sub>)/2.@@@@1@22@@danf@17-8-2009
10200720@unknown@formal@none@1@S@The algorithm steps are (J. MacQueen, 1967):@@@@1@7@@danf@17-8-2009
10200730@unknown@formal@none@1@S@* Choose the number of clusters, ''k''.@@@@1@7@@danf@17-8-2009
10200740@unknown@formal@none@1@S@* Randomly generate ''k'' clusters and determine the cluster centers, or directly generate ''k'' random points as cluster centers.@@@@1@19@@danf@17-8-2009
10200750@unknown@formal@none@1@S@* Assign each point to the nearest cluster center.@@@@1@9@@danf@17-8-2009
10200760@unknown@formal@none@1@S@* Recompute the new cluster centers.@@@@1@6@@danf@17-8-2009
10200770@unknown@formal@none@1@S@* Repeat the two previous steps until some convergence criterion is met (usually that the assignment hasn't changed).@@@@1@18@@danf@17-8-2009
10200780@unknown@formal@none@1@S@The main advantages of this algorithm are its simplicity and speed, which allow it to run on large datasets.@@@@1@19@@danf@17-8-2009
10200790@unknown@formal@none@1@S@Its disadvantage is that it does not yield the same result with each run, since the resulting clusters depend on the initial random assignments.@@@@1@24@@danf@17-8-2009
10200800@unknown@formal@none@1@S@It minimizes intra-cluster variance, but does not ensure that the result has a global minimum of variance.@@@@1@17@@danf@17-8-2009
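The ''K''-means steps listed above can be sketched as follows. This is an illustrative sketch, not MacQueen's original formulation: the names `centroid` and `k_means`, and the parameters `initial_centers`, `seed` and `max_iter`, are assumptions made for this example. Initial centers are drawn at random from the data points unless they are supplied explicitly, and the loop stops when the assignment no longer changes.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Arithmetic mean of the points, computed separately for each dimension."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, initial_centers=None, seed=0, max_iter=100):
    """Return (centers, assignment), where assignment[i] is the cluster of points[i]."""
    rng = random.Random(seed)
    # Use the supplied centers, or pick k random data points as starting centers.
    centers = list(initial_centers) if initial_centers else rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Assign each point to the nearest cluster center.
        new_assignment = [
            min(range(k), key=lambda c: dist(p, centers[c])) for p in points
        ]
        if new_assignment == assignment:  # convergence: nothing changed
            break
        assignment = new_assignment
        # Recompute the cluster centers.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = centroid(members)
    return centers, assignment
```

For the two-point example above, `centroid([(1, 2, 3), (3, 4, 5)])` gives `(2.0, 3.0, 4.0)`. Because the result depends on the starting centers, running the function with several values of `seed` and keeping the run with the lowest intra-cluster variance is a common precaution.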
10200810@unknown@formal@none@1@S@====Fuzzy ''c''-means clustering====@@@@1@3@@danf@17-8-2009
10200820@unknown@formal@none@1@S@In [[fuzzy clustering]], each point has a degree of belonging to clusters, as in [[fuzzy logic]], rather than belonging completely to just one cluster.@@@@1@24@@danf@17-8-2009
10200830@unknown@formal@none@1@S@Thus, points on the edge of a cluster may be ''in the cluster'' to a lesser degree than points in the center of the cluster.@@@@1@24@@danf@17-8-2009
10200840@unknown@formal@none@1@S@For each point ''x'' we have a coefficient <math>u_k(x)</math> giving the degree of being in the ''k''th cluster.@@@@1@18@@danf@17-8-2009
10200850@unknown@formal@none@1@S@Usually, the sum of those coefficients is defined to be 1:@@@@1@11@@danf@17-8-2009
10200860@unknown@formal@none@1@S@: <math> \forall x \quad \sum_{k} u_k(x) = 1. </math>@@@@1@8@@danf@17-8-2009
10200870@unknown@formal@none@1@S@With fuzzy ''c''-means, the centroid of a cluster is the mean of all points, weighted by their degree of belonging to the cluster:@@@@1@23@@danf@17-8-2009
10200880@unknown@formal@none@1@S@: <math> \mathrm{center}_k = {{\sum_x u_k(x)^m \, x} \over {\sum_x u_k(x)^m}}. </math>@@@@1@8@@danf@17-8-2009
10200890@unknown@formal@none@1@S@The degree of belonging is related to the inverse of the distance to the cluster center:@@@@1@15@@danf@17-8-2009
10200900@unknown@formal@none@1@S@: <math> u_k(x) = {1 \over d(\mathrm{center}_k, x)}, </math>@@@@1@5@@danf@17-8-2009
10200910@unknown@formal@none@1@S@then the coefficients are normalized and fuzzified with a real parameter <math>m > 1</math> so that their sum is 1.@@@@1@18@@danf@17-8-2009
10200920@unknown@formal@none@1@S@So@@@@1@1@@danf@17-8-2009
10200930@unknown@formal@none@1@S@: <math> u_k(x) = \frac{1}{\sum_j \left( \frac{d(\mathrm{center}_k, x)}{d(\mathrm{center}_j, x)} \right)^{2/(m-1)}}. </math>@@@@1@4@@danf@17-8-2009
10200940@unknown@formal@none@1@S@For ''m'' equal to 2, the exponent is 2, so this is equivalent to linearly normalising the squared coefficients to make their sum 1.@@@@1@18@@danf@17-8-2009
10200950@unknown@formal@none@1@S@When ''m'' is close to 1, the cluster center closest to the point is given much more weight than the others, and the algorithm is similar to ''k''-means.@@@@1@28@@danf@17-8-2009
10200960@unknown@formal@none@1@S@The fuzzy ''c''-means algorithm is very similar to the ''k''-means algorithm:@@@@1@11@@danf@17-8-2009
10200970@unknown@formal@none@1@S@* Choose a number of clusters.@@@@1@6@@danf@17-8-2009
10200980@unknown@formal@none@1@S@* Assign randomly to each point coefficients for being in the clusters.@@@@1@12@@danf@17-8-2009
10200990@unknown@formal@none@1@S@* Repeat until the algorithm has converged (that is, the coefficients' change between two iterations is no more than <math>\varepsilon</math>, the given sensitivity threshold):@@@@1@25@@danf@17-8-2009
10201000@unknown@formal@none@1@S@** Compute the centroid for each cluster, using the formula above.@@@@1@11@@danf@17-8-2009
10201010@unknown@formal@none@1@S@** For each point, compute its coefficients of being in the clusters, using the formula above.@@@@1@16@@danf@17-8-2009
10201020@unknown@formal@none@1@S@The algorithm minimizes intra-cluster variance as well, but has the same problems as ''k''-means: the minimum is a local minimum, and the results depend on the initial choice of weights.@@@@1@30@@danf@17-8-2009
10201030@unknown@formal@none@1@S@The [[Expectation-maximization algorithm]] is a more statistically formalized method which includes some of these ideas: partial membership in classes.@@@@1@19@@danf@17-8-2009
10201040@unknown@formal@none@1@S@It has better convergence properties and is in general preferred to fuzzy-c-means.@@@@1@12@@danf@17-8-2009
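The fuzzy ''c''-means loop can be sketched as below, using the standard membership update in which each coefficient is one over a sum of distance ratios raised to the power 2/(''m'' − 1), and centroids weighted by the coefficients raised to the power ''m''. The function names, the `eps` and `max_iter` parameters, and the choice to start from given initial centers rather than from random coefficients are all assumptions made to keep the example deterministic.

```python
from math import dist  # Euclidean distance, Python 3.8+


def memberships(point, centers, m=2.0):
    """Degrees of belonging of one point to every cluster; they sum to 1."""
    d = [dist(point, c) for c in centers]
    if 0.0 in d:
        # The point coincides with a center: share full membership among those centers.
        hits = [1.0 if x == 0.0 else 0.0 for x in d]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((dk / dj) ** exponent for dj in d) for dk in d]


def weighted_centroid(points, weights, m=2.0):
    """Mean of all points, weighted by membership raised to the power m."""
    w = [u ** m for u in weights]
    total = sum(w)
    return tuple(
        sum(wi * x for wi, x in zip(w, coords)) / total for coords in zip(*points)
    )


def fuzzy_c_means(points, centers, m=2.0, eps=1e-6, max_iter=100):
    """Alternate centroid and membership updates until the change is at most eps."""
    u = [memberships(p, centers, m) for p in points]
    for _ in range(max_iter):
        centers = [
            weighted_centroid(points, [row[k] for row in u], m)
            for k in range(len(centers))
        ]
        new_u = [memberships(p, centers, m) for p in points]
        change = max(abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb))
        u = new_u
        if change <= eps:
            break
    return centers, u
```

For a point at distance 1 from one center and 3 from another, `memberships((0, 0), [(1, 0), (3, 0)])` yields coefficients of about 0.9 and 0.1 when ''m'' = 2.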
10201050@unknown@formal@none@1@S@====QT clustering algorithm====@@@@1@3@@danf@17-8-2009
10201060@unknown@formal@none@1@S@QT (quality threshold) clustering (Heyer et al, 1999) is an alternative method of partitioning data, invented for gene clustering.@@@@1@19@@danf@17-8-2009
10201070@unknown@formal@none@1@S@It requires more computing power than ''k''-means, but does not require specifying the number of clusters ''a priori'', and always returns the same result when run several times.@@@@1@28@@danf@17-8-2009
10201080@unknown@formal@none@1@S@The algorithm is:@@@@1@3@@danf@17-8-2009
10201090@unknown@formal@none@1@S@* The user chooses a maximum diameter for clusters.@@@@1@9@@danf@17-8-2009
10201100@unknown@formal@none@1@S@* Build a candidate cluster for each point by including the closest point, the next closest, and so on, until the diameter of the cluster surpasses the threshold.@@@@1@28@@danf@17-8-2009
10201110@unknown@formal@none@1@S@* Save the candidate cluster with the most points as the first true cluster, and remove all points in the cluster from further consideration.@@@@1@24@@danf@17-8-2009
10201130@unknown@formal@none@1@S@* [[Recursion|Recurse]] with the reduced set of points.@@@@1@8@@danf@17-8-2009
10201140@unknown@formal@none@1@S@The distance between a point and a group of points is computed using complete linkage, i.e. as the maximum distance from the point to any member of the group (see the "Agglomerative hierarchical clustering" section about distance between clusters).@@@@1@39@@danf@17-8-2009
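The QT procedure can be sketched as follows. The names `candidate_cluster` and `qt_cluster` are assumptions for this example, the recursion is written as a loop, and a candidate stops growing just before the next point would push its diameter over the threshold. The description above does not say how ties between equally large candidates are resolved; this sketch simply keeps the first one, which is an arbitrary choice.

```python
from math import dist  # Euclidean distance, Python 3.8+


def candidate_cluster(seed_index, points, max_diameter):
    """Grow a cluster around one point without exceeding the maximum diameter."""
    cluster = [points[seed_index]]
    remaining = points[:seed_index] + points[seed_index + 1:]
    while remaining:
        # Complete linkage: distance from a point to the cluster is the
        # maximum distance from the point to any member.
        best = min(remaining, key=lambda p: max(dist(p, q) for q in cluster))
        if max(dist(best, q) for q in cluster) > max_diameter:
            break
        cluster.append(best)
        remaining.remove(best)
    return cluster


def qt_cluster(points, max_diameter):
    """Repeatedly keep the largest candidate cluster and remove its points."""
    points = list(points)
    clusters = []
    while points:
        candidates = [
            candidate_cluster(i, points, max_diameter) for i in range(len(points))
        ]
        best = max(candidates, key=len)  # ties: first candidate wins (assumption)
        clusters.append(best)
        for p in best:
            points.remove(p)
    return clusters
```

Building a candidate for every remaining point on every pass is what makes the method more expensive than ''k''-means, while the absence of random initialization is what makes the result repeatable.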
10201150@unknown@formal@none@1@S@=== Locality-sensitive hashing ===@@@@1@4@@danf@17-8-2009
10201160@unknown@formal@none@1@S@[[Locality-sensitive hashing]] can be used for clustering.@@@@1@7@@danf@17-8-2009
10201170@unknown@formal@none@1@S@Feature space vectors are sets, and the metric used is the [[Jaccard distance]].@@@@1@13@@danf@17-8-2009
10201180@unknown@formal@none@1@S@The feature space can be considered high-dimensional.@@@@1@7@@danf@17-8-2009
10201190@unknown@formal@none@1@S@The ''min-wise independent permutations'' LSH scheme (sometimes MinHash) is then used to put similar items into buckets.@@@@1@17@@danf@17-8-2009
10201200@unknown@formal@none@1@S@With just one set of hashing methods, there are only clusters of very similar elements.@@@@1@15@@danf@17-8-2009
10201210@unknown@formal@none@1@S@By seeding the hash functions several times (e.g. 20), it is possible to get bigger clusters.@@@@1@16@@danf@17-8-2009
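One way to picture the bucketing scheme is the sketch below. It is only an illustration under several assumptions: the seeded hash is built from SHA-1 via `hashlib`, each round uses `hashes_per_round` min-hashes as the bucket key, items that share a bucket in any round are merged with a small union-find structure, and every input set is non-empty. None of these choices comes from the text above.

```python
import hashlib
from collections import defaultdict


def seeded_hash(value, seed):
    """Deterministic 64-bit hash of a value under a given seed."""
    digest = hashlib.sha1(f"{seed}:{value!r}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(item_set, seed):
    """Smallest seeded hash over the elements of a non-empty set."""
    return min(seeded_hash(x, seed) for x in item_set)


def lsh_clusters(named_sets, hashes_per_round=2, rounds=20):
    """Group names whose sets share a MinHash bucket in at least one round."""
    parent = {name: name for name in named_sets}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for r in range(rounds):  # each round reseeds the hash functions
        buckets = defaultdict(list)
        for name, items in named_sets.items():
            signature = tuple(
                minhash(items, r * hashes_per_round + h)
                for h in range(hashes_per_round)
            )
            buckets[signature].append(name)
        for names in buckets.values():
            for other in names[1:]:
                parent[find(other)] = find(names[0])

    clusters = defaultdict(set)
    for name in named_sets:
        clusters[find(name)].add(name)
    return list(clusters.values())
```

With `rounds=1` only sets that agree on every min-hash of the single signature end up together, which matches the observation that one set of hash functions finds only very similar elements; more rounds give similar sets more chances to collide.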
10201220@unknown@formal@none@1@S@=== Graph-theoretic methods ===@@@@1@4@@danf@17-8-2009
10201230@unknown@formal@none@1@S@[[Formal concept analysis]] is a technique for generating clusters of objects and attributes, given a [[bipartite graph]] representing the relations between the objects and attributes.@@@@1@25@@danf@17-8-2009
10201240@unknown@formal@none@1@S@Other methods for generating ''overlapping clusters'' (a [[Cover (topology)|cover]] rather than a [[partition of a set|partition]]) are discussed by Jardine and Sibson (1968) and Cole and Wishart (1970).@@@@1@28@@danf@17-8-2009
10201250@unknown@formal@none@1@S@== Elbow criterion ==@@@@1@4@@danf@17-8-2009
10201260@unknown@formal@none@1@S@The elbow criterion is a common [[rule of thumb]] to determine what number of clusters should be chosen, for example for ''k''-means and agglomerative hierarchical clustering.@@@@1@26@@danf@17-8-2009
10201270@unknown@formal@none@1@S@It should also be noted that the initial assignment of cluster seeds has bearing on the final model performance.@@@@1@19@@danf@17-8-2009
10201280@unknown@formal@none@1@S@Thus, it is appropriate to re-run the cluster analysis multiple times.@@@@1@11@@danf@17-8-2009
10201290@unknown@formal@none@1@S@The elbow criterion says that you should choose a number of clusters so that adding another cluster doesn't add sufficient information.@@@@1@21@@danf@17-8-2009
10201300@unknown@formal@none@1@S@More precisely, if you graph the percentage of variance explained by the clusters against the number of clusters, the first clusters will add much information (explain a lot of variance), but at some point the marginal gain will drop, giving an angle in the graph (the elbow).@@@@1@47@@danf@17-8-2009
10201310@unknown@formal@none@1@S@This elbow cannot always be unambiguously identified.@@@@1@7@@danf@17-8-2009
10201320@unknown@formal@none@1@S@Percentage of variance explained is the ratio of the between-group variance to the total variance.@@@@1@15@@danf@17-8-2009
10201330@unknown@formal@none@1@S@On the following graph, the elbow is indicated by the red circle.@@@@1@12@@danf@17-8-2009
10201340@unknown@formal@none@1@S@The number of clusters chosen should therefore be 4.@@@@1@9@@danf@17-8-2009
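The quantity plotted for the elbow criterion can be computed as in the sketch below, which takes the percentage of variance explained to be the between-group sum of squares divided by the total sum of squares. The function names are assumptions for this example, and the clustering itself is expected to come from some other routine.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Arithmetic mean of the points, computed separately for each dimension."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def variance_explained(clusters):
    """Ratio of between-group variance to total variance for a clustering.

    `clusters` is a list of clusters, each a non-empty list of points.
    """
    all_points = [p for cluster in clusters for p in cluster]
    grand_mean = centroid(all_points)
    total = sum(dist(p, grand_mean) ** 2 for p in all_points)
    between = sum(
        len(cluster) * dist(centroid(cluster), grand_mean) ** 2
        for cluster in clusters
    )
    return between / total
```

Evaluating this ratio for clusterings with 1, 2, 3, ... clusters and plotting it against the number of clusters gives the curve on which the elbow is sought; the ratio is 0 for a single cluster and 1 when every point is its own cluster.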
10201350@unknown@formal@none@1@S@== Spectral clustering ==@@@@1@4@@danf@17-8-2009
10201360@unknown@formal@none@1@S@Given a set of data points ''A'', the [[similarity matrix]] may be defined as a matrix <math>S</math> where <math>S_{ij}</math> represents a measure of the similarity between points <math>i, j \in A</math>.@@@@1@30@@danf@17-8-2009
10201370@unknown@formal@none@1@S@Spectral clustering techniques make use of the [[Spectrum of a matrix|spectrum]] of the similarity matrix of the data to perform [[dimensionality reduction]] for clustering in fewer dimensions.@@@@1@27@@danf@17-8-2009
10201380@unknown@formal@none@1@S@One such technique is the ''[[Shi-Malik algorithm]]'', commonly used for [[segmentation (image processing)|image segmentation]].@@@@1@14@@danf@17-8-2009
10201390@unknown@formal@none@1@S@It partitions points into two sets <math>(B_1, B_2)</math> based on the [[eigenvector]] <math>v</math> corresponding to the second-smallest [[eigenvalue]] of the [[Laplacian matrix]]@@@@1@21@@danf@17-8-2009
10201400@unknown@formal@none@1@S@: <math>L = I - D^{-1/2} S D^{-1/2}</math>@@@@1@5@@danf@17-8-2009
10201410@unknown@formal@none@1@S@of <math>S</math>, where <math>D</math> is the diagonal matrix@@@@1@8@@danf@17-8-2009
10201420@unknown@formal@none@1@S@: <math>D_{ii} = \sum_{j} S_{ij}.</math>@@@@1@4@@danf@17-8-2009
10201430@unknown@formal@none@1@S@This partitioning may be done in various ways, such as by taking the median <math>m</math> of the components in <math>v</math>, and placing all points whose component in <math>v</math> is greater than <math>m</math> in <math>B_1</math>, and the rest in <math>B_2</math>.@@@@1@39@@danf@17-8-2009
10201440@unknown@formal@none@1@S@The algorithm can be used for hierarchical clustering by repeatedly partitioning the subsets in this fashion.@@@@1@16@@danf@17-8-2009
10201450@unknown@formal@none@1@S@A related algorithm is the ''[[Meila-Shi algorithm]]'', which takes the [[eigenvector]]s corresponding to the ''k'' largest [[eigenvalue]]s of the matrix <math>P = SD^{-1}</math> for some ''k'', and then invokes another algorithm (e.g. ''k''-means) to cluster points by their respective ''k'' components in these eigenvectors.@@@@1@43@@danf@17-8-2009
10201460@unknown@formal@none@1@S@==Applications==@@@@1@1@@danf@17-8-2009
10201470@unknown@formal@none@1@S@=== Biology ===@@@@1@3@@danf@17-8-2009
10201480@unknown@formal@none@1@S@In [[biology]] '''clustering''' has many applications:@@@@1@6@@danf@17-8-2009
10201490@unknown@formal@none@1@S@*In imaging, data clustering may take different form based on the data dimensionality.@@@@1@13@@danf@17-8-2009
10201500@unknown@formal@none@1@S@For example, the [http://wiki.stat.ucla.edu/socr/index.php/SOCR_EduMaterials_Activities_2D_PointSegmentation_EM_Mixture SOCR EM Mixture model segmentation activity and applet] shows how to obtain point, region or volume classification using the online [[SOCR]] computational libraries.@@@@1@27@@danf@17-8-2009
10201510@unknown@formal@none@1@S@*In the fields of [[plant]] and [[animal]] [[ecology]], clustering is used to describe and to make spatial and temporal comparisons of communities (assemblages) of organisms in heterogeneous environments; it is also used in [[Systematics|plant systematics]] to generate artificial [[Phylogeny|phylogenies]] or clusters of organisms (individuals) at the species, genus or higher level that share a number of attributes.@@@@1@57@@danf@17-8-2009
10201520@unknown@formal@none@1@S@*In computational biology and [[bioinformatics]]:@@@@1@5@@danf@17-8-2009
10201530@unknown@formal@none@1@S@** In [[transcriptome|transcriptomics]], clustering is used to build groups of [[genes]] with related expression patterns (also known as coexpressed genes).@@@@1@20@@danf@17-8-2009
10201540@unknown@formal@none@1@S@Often such groups contain functionally related proteins, such as [[enzyme]]s for a specific [[metabolic pathway|pathway]], or genes that are co-regulated.@@@@1@20@@danf@17-8-2009
10201550@unknown@formal@none@1@S@High throughput experiments using [[expressed sequence tag]]s (ESTs) or [[DNA microarray]]s can be a powerful tool for [[genome annotation]], a general aspect of [[genomics]].@@@@1@24@@danf@17-8-2009
10201560@unknown@formal@none@1@S@** In [[sequence analysis]], clustering is used to group homologous sequences into [[list of gene families|gene families]].@@@@1@17@@danf@17-8-2009
10201570@unknown@formal@none@1@S@This is a very important concept in bioinformatics, and [[evolutionary biology]] in general.@@@@1@13@@danf@17-8-2009
10201580@unknown@formal@none@1@S@See evolution by [[gene duplication]].@@@@1@5@@danf@17-8-2009
10201590@unknown@formal@none@1@S@** In high-throughput genotyping platforms clustering algorithms are used to automatically assign [[genotypes]].@@@@1@13@@danf@17-8-2009
10201600@unknown@formal@none@1@S@=== Medicine ===@@@@1@3@@danf@17-8-2009
10201610@unknown@formal@none@1@S@In [[medical imaging]], such as [[PET scan|PET scans]], cluster analysis can be used to differentiate between different types of [[tissue (biology)|tissue]] and [[blood]] in a three dimensional image.@@@@1@28@@danf@17-8-2009
10201620@unknown@formal@none@1@S@In this application, actual position does not matter, but the [[voxel]] intensity is considered as a [[coordinate vector|vector]], with a dimension for each image that was taken over time.@@@@1@29@@danf@17-8-2009
10201630@unknown@formal@none@1@S@This technique allows, for example, accurate measurement of the rate at which a radioactive tracer is delivered to the area of interest, without a separate sampling of [[arterial]] blood, an intrusive technique that is most common today.@@@@1@35@@danf@17-8-2009
10201640@unknown@formal@none@1@S@=== Market research ===@@@@1@4@@danf@17-8-2009
10201650@unknown@formal@none@1@S@Cluster analysis is widely used in [[market research]] when working with multivariate data from [[Statistical survey|surveys]] and test panels.@@@@1@19@@danf@17-8-2009
10201660@unknown@formal@none@1@S@Market researchers use cluster analysis to partition the general [[population]] of [[consumers]] into market segments and to better understand the relationships between different groups of consumers/potential [[customers]].@@@@1@27@@danf@17-8-2009
10201670@unknown@formal@none@1@S@* Segmenting the market and determining [[target market]]s@@@@1@8@@danf@17-8-2009
10201680@unknown@formal@none@1@S@* [[positioning (marketing)|Product positioning]]@@@@1@4@@danf@17-8-2009
10201690@unknown@formal@none@1@S@* [[New product development]]@@@@1@4@@danf@17-8-2009
10201700@unknown@formal@none@1@S@* Selecting test markets (see [[experimental techniques]])@@@@1@8@@danf@17-8-2009
10201710@unknown@formal@none@1@S@=== Other applications ===@@@@1@4@@danf@17-8-2009
10201720@unknown@formal@none@1@S@'''Social network analysis''': In the study of [[social networks]], clustering may be used to recognize [[communities]] within large groups of people.@@@@1@21@@danf@17-8-2009
10201730@unknown@formal@none@1@S@'''Image segmentation''': Clustering can be used to divide a [[digital]] [[image]] into distinct regions for [[border detection]] or [[object recognition]].@@@@1@20@@danf@17-8-2009
10201740@unknown@formal@none@1@S@'''Data mining''': Many [[data mining]] applications involve partitioning data items into related subsets; the marketing applications discussed above represent some examples.@@@@1@21@@danf@17-8-2009
10201750@unknown@formal@none@1@S@Another common application is the division of documents, such as [[World Wide Web]] pages, into genres.@@@@1@16@@danf@17-8-2009
10201760@unknown@formal@none@1@S@'''Search result grouping''': In the process of intelligent grouping of the files and websites, clustering may be used to create a more relevant set of search results compared to normal search engines like [[Google]].@@@@1@34@@danf@17-8-2009
10201770@unknown@formal@none@1@S@There are currently a number of web based clustering tools such as [[Clusty]].@@@@1@13@@danf@17-8-2009
10201780@unknown@formal@none@1@S@'''Slippy map optimization''': [[Flickr]]'s map of photos and other map sites use clustering to reduce the number of markers on a map.@@@@1@22@@danf@17-8-2009
10201790@unknown@formal@none@1@S@This both makes the map faster and reduces the amount of visual clutter.@@@@1@12@@danf@17-8-2009
10201800@unknown@formal@none@1@S@'''IMRT segmentation''': Clustering can be used to divide a fluence map into distinct regions for conversion into deliverable fields in MLC-based Radiation Therapy.@@@@1@23@@danf@17-8-2009
10201810@unknown@formal@none@1@S@'''Grouping of Shopping Items''': Clustering can be used to group all the shopping items available on the web into a set of unique products.@@@@1@24@@danf@17-8-2009
10201820@unknown@formal@none@1@S@For example, all the items on eBay can be grouped into unique products.@@@@1@13@@danf@17-8-2009
10201825@unknown@formal@none@1@S@(eBay doesn't have the concept of a SKU)@@@@1@8@@danf@17-8-2009
10201830@unknown@formal@none@1@S@'''[[Mathematical chemistry]]''': Clustering is used to find structural similarity among compounds; for example, 3000 chemical compounds were clustered in the space of 90 [[topological index|topological indices]].@@@@1@22@@danf@17-8-2009
10201840@unknown@formal@none@1@S@'''Petroleum Geology''': Cluster Analysis is used to reconstruct missing bottom hole core data or missing log curves in order to evaluate reservoir properties.@@@@1@23@@danf@17-8-2009
10201850@unknown@formal@none@1@S@== Comparisons between data clusterings ==@@@@1@6@@danf@17-8-2009
10201860@unknown@formal@none@1@S@There have been several suggestions for a measure of similarity between two clusterings.@@@@1@13@@danf@17-8-2009
10201870@unknown@formal@none@1@S@Such a measure can be used to compare how well different data clustering algorithms perform on a set of data.@@@@1@20@@danf@17-8-2009
10201880@unknown@formal@none@1@S@Many of these measures are derived from the [[matching matrix]] (also known as the [[confusion matrix]]), e.g., the [[Rand index|Rand measure]] and the Fowlkes-Mallows ''B''<sub>''k''</sub> measures.@@@@1@23@@danf@17-8-2009
10201890@unknown@formal@none@1@S@[[Marina Meila]]'s Variation of Information metric is a more recent approach for measuring distance between clusterings.@@@@1@16@@danf@17-8-2009
10201900@unknown@formal@none@1@S@It uses [[Mutual information|mutual information]] and [[entropy]] to approximate the distance between two clusterings across the lattice of possible clusterings.@@@@1@20@@danf@17-8-2009
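As an illustration of such a measure, the Rand measure can be computed directly from two label lists: it is the fraction of point pairs on which the two clusterings agree, either by grouping the pair together in both or by separating it in both. The function name `rand_index` and the label-list representation are assumptions made for this sketch.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs treated the same way by both clusterings.

    labels_a[i] and labels_b[i] are the cluster labels of point i
    in the two clusterings; at least two points are required.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)
```

The measure depends only on how the points are grouped, not on the label names, so `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` is 1.0.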
10201910@unknown@formal@none@1@S@==Algorithms==@@@@1@1@@danf@17-8-2009
10201920@unknown@formal@none@1@S@In recent years considerable effort has been put into improving algorithm performance (Z. Huang, 1998).@@@@1@15@@danf@17-8-2009
10201930@unknown@formal@none@1@S@Among the most popular are ''CLARANS'' (Ng and Han, 1994), ''[[DBSCAN]]'' (Ester et al., 1996) and ''BIRCH'' (Zhang et al., 1996).@@@@1@20@@danf@17-8-2009
10210010@unknown@formal@none@1@S@Data mining@@@@1@2@@danf@17-8-2009
10210020@unknown@formal@none@1@S@'''Data mining''' is the process of [[sorting]] through large amounts of data and picking out relevant information.@@@@1@17@@danf@17-8-2009
10210030@unknown@formal@none@1@S@It is usually used by [[business intelligence]] organizations, and [[financial analyst]]s, but is increasingly being used in the sciences to extract information from the enormous [[data set]]s generated by modern experimental and observational methods.@@@@1@34@@danf@17-8-2009
10210040@unknown@formal@none@1@S@It has been described as "the nontrivial extraction of implicit, previously unknown, and potentially useful [[information]] from [[data]]" and "the science of extracting useful information from large [[data set]]s or [[database]]s."@@@@1@31@@danf@17-8-2009
10210050@unknown@formal@none@1@S@Data mining in relation to [[enterprise resource planning]] is the statistical and logical analysis of large sets of transaction data, looking for patterns that can aid decision making.@@@@1@29@@danf@17-8-2009
10210060@unknown@formal@none@1@S@==Background==@@@@1@1@@danf@17-8-2009
10210070@unknown@formal@none@1@S@Traditionally, business analysts have performed the task of extracting useful [[information]] from recorded [[data]], but the increasing volume of data in modern business and science calls for computer-based approaches.@@@@1@29@@danf@17-8-2009
10210080@unknown@formal@none@1@S@As [[data set]]s have grown in size and complexity, there has been a shift away from direct hands-on data analysis toward indirect, automatic data analysis using more complex and sophisticated tools.@@@@1@31@@danf@17-8-2009
10210090@unknown@formal@none@1@S@The modern technologies of [[computers]], [[networks]], and [[sensors]] have made [[data collection]] and organization much easier.@@@@1@16@@danf@17-8-2009
10210100@unknown@formal@none@1@S@However, the captured data needs to be converted into [[information]] and [[knowledge]] to become useful.@@@@1@15@@danf@17-8-2009
10210110@unknown@formal@none@1@S@Data mining is the entire process of applying computer-based [[methodology]], including new techniques for [[knowledge discovery]], to data.@@@@1@18@@danf@17-8-2009
10210120@unknown@formal@none@1@S@Data mining identifies trends within data that go beyond simple analysis.@@@@1@11@@danf@17-8-2009
10210130@unknown@formal@none@1@S@Through the use of sophisticated algorithms, non-statistician users have the opportunity to identify key attributes of business processes and target opportunities.@@@@1@21@@danf@17-8-2009
10210140@unknown@formal@none@1@S@However, abdicating control of this process from the statistician to the machine may result in false positives or no useful results at all.@@@@1@22@@danf@17-8-2009
10210150@unknown@formal@none@1@S@Although data mining is a relatively new term, the technology is not.@@@@1@12@@danf@17-8-2009
10210160@unknown@formal@none@1@S@For many years, businesses have used powerful computers to sift through volumes of data such as supermarket scanner data to produce market research reports (although reporting is not considered to be data mining).@@@@1@33@@danf@17-8-2009
10210170@unknown@formal@none@1@S@Continuous innovations in computer processing power, disk storage, and statistical software are dramatically increasing the accuracy and usefulness of data analysis.@@@@1@21@@danf@17-8-2009
10210180@unknown@formal@none@1@S@Web 2.0 technologies have generated a colossal amount of user-generated data and media, making it hard to aggregate and consume information in a meaningful way without getting overloaded.@@@@1@28@@danf@17-8-2009
10210190@unknown@formal@none@1@S@Given the size of the data on the Internet, and the difficulty in contextualizing it, it is unclear whether the traditional approach to data mining is computationally viable.@@@@1@28@@danf@17-8-2009
10210200@unknown@formal@none@1@S@The term data mining is often used to apply to the two separate processes of knowledge discovery and [[prediction]].@@@@1@19@@danf@17-8-2009
10210210@unknown@formal@none@1@S@Knowledge discovery provides explicit information that has a readable form and can be understood by a user.@@@@1@17@@danf@17-8-2009
10210220@unknown@formal@none@1@S@[[Forecasting]], or [[predictive modeling]] provides predictions of future events and may be transparent and readable in some approaches (e.g., rule-based systems) and opaque in others such as [[neural network]]s.@@@@1@29@@danf@17-8-2009
10210230@unknown@formal@none@1@S@Moreover, some data-mining systems such as neural networks are inherently geared towards prediction and pattern recognition, rather than knowledge discovery.@@@@1@20@@danf@17-8-2009
10210240@unknown@formal@none@1@S@[[Metadata]], or data about a given data set, are often expressed in a condensed ''data-minable'' format, or one that facilitates the practice of data mining.@@@@1@25@@danf@17-8-2009
10210250@unknown@formal@none@1@S@Common examples include executive summaries and scientific abstracts.@@@@1@8@@danf@17-8-2009
10210260@unknown@formal@none@1@S@Data mining relies on the use of real world data.@@@@1@10@@danf@17-8-2009
10210270@unknown@formal@none@1@S@This data is extremely vulnerable to [[collinearity]] precisely because data from the real world may have unknown interrelations.@@@@1@18@@danf@17-8-2009
10210280@unknown@formal@none@1@S@An unavoidable weakness of data mining is that the critical data that may expose any relationship might have never been observed.@@@@1@21@@danf@17-8-2009
10210290@unknown@formal@none@1@S@Alternative approaches using an experiment-based approach such as [[Choice Modelling]] for human-generated data may be used.@@@@1@16@@danf@17-8-2009
10210300@unknown@formal@none@1@S@Inherent correlations are either controlled for or removed altogether through the construction of an [[experimental design]].@@@@1@16@@danf@17-8-2009
10210310@unknown@formal@none@1@S@Recently, there were some efforts to define a standard for data mining, for example the [[CRISP-DM]] standard for analysis processes or the [[Java Data-Mining]] Standard.@@@@1@25@@danf@17-8-2009
10210320@unknown@formal@none@1@S@Independent of these standardization efforts, freely available open-source software systems like [[RapidMiner]] and [[Weka (machine learning)| Weka]] have become an informal standard for defining data-mining processes.@@@@1@26@@danf@17-8-2009
10210330@unknown@formal@none@1@S@==Privacy concerns==@@@@1@2@@danf@17-8-2009
10210340@unknown@formal@none@1@S@There are also [[privacy]] and [[human rights]] concerns associated with data mining, specifically regarding the source of the data analyzed.@@@@1@20@@danf@17-8-2009
10210350@unknown@formal@none@1@S@Data mining provides information that may be difficult to obtain otherwise.@@@@1@11@@danf@17-8-2009
10210360@unknown@formal@none@1@S@When the data collected involves individual people, there are many questions concerning privacy, legality, and ethics.@@@@1@16@@danf@17-8-2009
10210370@unknown@formal@none@1@S@In particular, data mining government or commercial data sets for national security or law enforcement purposes has raised privacy concerns.@@@@1@20@@danf@17-8-2009
10210380@unknown@formal@none@1@S@==Notable uses of data mining==@@@@1@5@@danf@17-8-2009
10210390@unknown@formal@none@1@S@===Combatting Terrorism===@@@@1@2@@danf@17-8-2009
10210400@unknown@formal@none@1@S@Data mining has been cited as the method by which the U.S. Army unit [[Able Danger]] had identified the [[September 11, 2001 attacks]] leader, [[Mohamed Atta]], and three other 9/11 hijackers as possible members of an [[Al Qaeda]] cell operating in the U.S. more than a year before the attack.@@@@1@50@@danf@17-8-2009
10210410@unknown@formal@none@1@S@It has been suggested that both the [[Central Intelligence Agency]] and the [[Canadian Security Intelligence Service]] have employed this method.@@@@1@20@@danf@17-8-2009
10210420@unknown@formal@none@1@S@Previous US government data mining programs intended to stop terrorism include the Terrorism Information Awareness (TIA) program, Computer-Assisted Passenger Prescreening System (CAPPS II), Analysis, Dissemination, Visualization, Insight, and Semantic Enhancement (ADVISE), Multistate Anti-Terrorism Information Exchange (MATRIX), and the Secure Flight program [http://www.msnbc.msn.com/id/20604775/ Security-MSNBC].@@@@1@44@@danf@17-8-2009
10210430@unknown@formal@none@1@S@These programs have been discontinued due to controversy over whether they violate the US Constitution's 4th amendment.@@@@1@17@@danf@17-8-2009
10210440@unknown@formal@none@1@S@===Games===@@@@1@1@@danf@17-8-2009
10210450@unknown@formal@none@1@S@Since the early 1960s, with the availability of [[Oracle machine|oracle]]s for certain [[combinatorial game]]s, also called [[tablebase]]s (e.g. for 3x3-chess) with any beginning configuration, small-board [[dots-and-boxes]], small-board-hex, and certain endgames in chess, dots-and-boxes, and hex, a new area for data mining has been opened up.@@@@1@45@@danf@17-8-2009
10210460@unknown@formal@none@1@S@This is the extraction of human-usable strategies from these oracles.@@@@1@10@@danf@17-8-2009
10210470@unknown@formal@none@1@S@Current pattern recognition approaches do not seem to have the high level of abstraction required to be applied successfully.@@@@1@19@@danf@17-8-2009
10210480@unknown@formal@none@1@S@Instead, extensive experimentation with the tablebases, combined with an intensive study of tablebase-answers to well designed problems and with knowledge of prior art, i.e. pre-tablebase knowledge, is used to yield insightful patterns.@@@@1@32@@danf@17-8-2009
10210490@unknown@formal@none@1@S@[[Berlekamp]] in dots-and-boxes etc. and [[John Nunn]] in [[chess]] [[Chess endgame|endgames]] are notable examples of researchers doing this work, though they were not and are not involved in tablebase generation.@@@@1@30@@danf@17-8-2009
10210500@unknown@formal@none@1@S@===Business===@@@@1@1@@danf@17-8-2009
10210510@unknown@formal@none@1@S@Data mining in [[customer relationship management]] applications can contribute significantly to the bottom line.@@@@1@14@@danf@17-8-2009
10210520@unknown@formal@none@1@S@Rather than contacting every prospect or customer through a call center or by mail, only prospects that are predicted to have a high likelihood of responding to an offer are contacted.@@@@1@31@@danf@17-8-2009
10210530@unknown@formal@none@1@S@More sophisticated methods may be used to optimize across campaigns, predicting which channel and which offer an individual is most likely to respond to across all potential offers.@@@@1@28@@danf@17-8-2009
10210540@unknown@formal@none@1@S@Finally, in cases where many people will take an action without an offer, uplift modeling can be used to determine which people will have the greatest increase in responding if given an offer.@@@@1@33@@danf@17-8-2009
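The idea behind uplift can be illustrated with a minimal sketch. It assumes a hypothetical record layout (`segment`, `offered`, `responded`) and simply compares response rates with and without an offer inside each segment; real uplift models are more elaborate, and none of these names come from the source.

```python
# Minimal uplift sketch: difference in response rate between customers who
# received an offer and those who did not, computed per segment.
# The record fields and the sample data below are hypothetical.

def response_rate(records):
    """Fraction of records with a response; 0.0 for an empty group."""
    if not records:
        return 0.0
    return sum(r["responded"] for r in records) / len(records)

def uplift_by_segment(records):
    """Map each segment to (rate with offer) - (rate without offer)."""
    groups = {}
    for r in records:
        by_offer = groups.setdefault(r["segment"], {True: [], False: []})
        by_offer[r["offered"]].append(r)
    return {
        segment: response_rate(g[True]) - response_rate(g[False])
        for segment, g in groups.items()
    }

RECORDS = [
    {"segment": "loyal", "offered": True, "responded": 1},
    {"segment": "loyal", "offered": True, "responded": 1},
    {"segment": "loyal", "offered": False, "responded": 1},
    {"segment": "loyal", "offered": False, "responded": 1},
    {"segment": "new", "offered": True, "responded": 1},
    {"segment": "new", "offered": True, "responded": 0},
    {"segment": "new", "offered": False, "responded": 0},
    {"segment": "new", "offered": False, "responded": 0},
]

UPLIFT = uplift_by_segment(RECORDS)
```

In this made-up sample the "loyal" segment responds whether or not it receives an offer, so its uplift is zero, while the "new" segment shows a positive uplift and would be the better target.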
10210550@unknown@formal@none@1@S@[[Data clustering]] can also be used to automatically discover the segments or groups within a customer data set.@@@@1@18@@danf@17-8-2009
10210560@unknown@formal@none@1@S@Businesses employing data mining quickly see a return on investment, but they also recognize that the number of predictive models can quickly become very large.@@@@1@25@@danf@17-8-2009
10210570@unknown@formal@none@1@S@Rather than one model to predict which customers will [[Churning (stock trade)|churn]], a business could build a separate model for each region and customer type.@@@@1@25@@danf@17-8-2009
10210580@unknown@formal@none@1@S@Then instead of sending an offer to all people that are likely to churn, it may only want to send offers to customers that will likely take the offer.@@@@1@29@@danf@17-8-2009
10210590@unknown@formal@none@1@S@And finally, it may also want to determine which customers are going to be profitable over a window of time and only send the offers to those that are likely to be profitable.@@@@1@33@@danf@17-8-2009
10210600@unknown@formal@none@1@S@In order to maintain this quantity of models, they need to manage model versions and move to ''automated data mining''.@@@@1@20@@danf@17-8-2009
10210610@unknown@formal@none@1@S@Data mining can also be helpful to human-resources departments in identifying the characteristics of their most successful employees.@@@@1@18@@danf@17-8-2009
10210620@unknown@formal@none@1@S@Information obtained, such as universities attended by highly successful employees, can help HR focus recruiting efforts accordingly.@@@@1@17@@danf@17-8-2009
10210630@unknown@formal@none@1@S@Additionally, Strategic Enterprise Management applications help a company translate corporate-level goals, such as profit and margin share targets, into operational decisions, such as production plans and workforce levels.@@@@1@28@@danf@17-8-2009
10210640@unknown@formal@none@1@S@Another example of data mining, often called the [[market basket analysis]], relates to its use in retail sales.@@@@1@18@@danf@17-8-2009
10210650@unknown@formal@none@1@S@If a clothing store records the purchases of customers, a data-mining system could identify those customers who favour silk shirts over cotton ones.@@@@1@23@@danf@17-8-2009
10210660@unknown@formal@none@1@S@Although some explanations of relationships may be difficult, taking advantage of them is easier.@@@@1@14@@danf@17-8-2009
10210670@unknown@formal@none@1@S@The example deals with [[association rule]]s within transaction-based data.@@@@1@9@@danf@17-8-2009
10210680@unknown@formal@none@1@S@Not all data are transaction based and logical or inexact [[rule]]s may also be present within a [[database]].@@@@1@18@@danf@17-8-2009
10210690@unknown@formal@none@1@S@In a manufacturing application, an inexact rule may state that 73% of products which have a specific defect or problem will develop a secondary problem within the next six months.@@@@1@30@@danf@17-8-2009
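A percentage such as the 73% in the manufacturing example corresponds to what is usually called the confidence of a rule. The following sketch, using made-up purchase records and hypothetical function names, shows how support and confidence could be computed for a rule of the form "antecedent implies consequent"; it illustrates the measures only and is not a full rule-mining algorithm.

```python
# Support and confidence of an association rule over transaction data.
# BASKETS is hypothetical sample data, not taken from any real store.

def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    if not transactions:
        return 0.0
    wanted = set(itemset)
    hits = sum(1 for t in transactions if wanted <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Among transactions containing antecedent, the fraction that also contain consequent."""
    base = support(transactions, antecedent)
    if base == 0.0:
        return 0.0
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base

BASKETS = [
    {"silk shirt", "tie"},
    {"silk shirt", "tie", "belt"},
    {"cotton shirt", "belt"},
    {"silk shirt", "belt"},
]

SILK_SUPPORT = support(BASKETS, {"silk shirt"})
SILK_TIE_CONFIDENCE = confidence(BASKETS, {"silk shirt"}, {"tie"})
```

Here three of the four baskets contain a silk shirt, and two of those three also contain a tie, so the rule "silk shirt implies tie" holds with a confidence of two thirds.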
10210700@unknown@formal@none@1@S@Related to an integrated-circuit production line, an example of data mining is described in the paper "Mining IC Test Data to Optimize VLSI Testing."@@@@1@24@@danf@17-8-2009
10210710@unknown@formal@none@1@S@In this paper the application of data mining and decision analysis to the problem of die-level functional test is described.@@@@1@20@@danf@17-8-2009
10210720@unknown@formal@none@1@S@Experiments mentioned in this paper demonstrate the ability of applying a system of mining historical die-test data to create a probabilistic model of patterns of die failure which are then utilized to decide in real time which die to test next and when to stop testing.@@@@1@46@@danf@17-8-2009
10210730@unknown@formal@none@1@S@This system has been shown, based on experiments with historical test data, to have the potential to improve profits on mature IC products.@@@@1@23@@danf@17-8-2009
10210740@unknown@formal@none@1@S@===Science and engineering===@@@@1@3@@danf@17-8-2009
10210750@unknown@formal@none@1@S@In recent years, data mining has been widely used in areas of science and engineering, such as [[bioinformatic]]s, [[genetic]]s, [[medicine]], [[education]], and [[electrical power]] engineering.@@@@1@25@@danf@17-8-2009
10210760@unknown@formal@none@1@S@In the study of human genetics, an important goal is to understand the mapping relationship between the inter-individual variation in human [[DNA]] sequences and variability in disease susceptibility.@@@@1@28@@danf@17-8-2009
10210770@unknown@formal@none@1@S@In lay terms, it is to find out how the changes in an individual's DNA sequence affect the risk of developing common diseases such as [[cancer]].@@@@1@26@@danf@17-8-2009
10210780@unknown@formal@none@1@S@This is very important to help improve the diagnosis, prevention and treatment of the diseases.@@@@1@15@@danf@17-8-2009
10210790@unknown@formal@none@1@S@One data mining technique used to perform this task is known as [[multifactor dimensionality reduction]].@@@@1@15@@danf@17-8-2009
10210800@unknown@formal@none@1@S@In the area of electrical power engineering, data mining techniques have been widely used for [[condition monitoring]] of high voltage electrical equipment.@@@@1@22@@danf@17-8-2009
10210810@unknown@formal@none@1@S@The purpose of condition monitoring is to obtain valuable information on the health status of the equipment's [[insulation]].@@@@1@18@@danf@17-8-2009
10210820@unknown@formal@none@1@S@[[Data clustering]] techniques such as the [[self-organizing map]] (SOM) have been applied to the vibration monitoring and analysis of transformer on-load tap-changers (OLTCs).@@@@1@23@@danf@17-8-2009
10210830@unknown@formal@none@1@S@Using vibration monitoring, it can be observed that each tap change operation generates a signal that contains information about the condition of the tap changer contacts and the drive mechanisms.@@@@1@30@@danf@17-8-2009
10210840@unknown@formal@none@1@S@Obviously, different tap positions will generate different signals.@@@@1@8@@danf@17-8-2009
10210850@unknown@formal@none@1@S@However, there was considerable variability amongst normal condition signals for the exact same tap position.@@@@1@15@@danf@17-8-2009
10210860@unknown@formal@none@1@S@SOM has been applied to detect abnormal conditions and to estimate the nature of the abnormalities.@@@@1@16@@danf@17-8-2009
10210870@unknown@formal@none@1@S@Data mining techniques have also been applied for [[dissolved gas analysis]] (DGA) on [[power transformer]]s.@@@@1@15@@danf@17-8-2009
10210880@unknown@formal@none@1@S@DGA, as a diagnostic for power transformers, has been available for many years.@@@@1@13@@danf@17-8-2009
10210890@unknown@formal@none@1@S@Data mining techniques such as SOM have been applied to analyse data and to determine trends which are not obvious to the standard DGA ratio techniques such as the Duval Triangle.@@@@1@31@@danf@17-8-2009
10210900@unknown@formal@none@1@S@Another area of application for data mining in science/engineering is within educational research, where data mining has been used to study the factors leading students to choose to engage in behaviors which reduce their learning and to understand the factors influencing university student retention.@@@@1@44@@danf@17-8-2009
10210910@unknown@formal@none@1@S@Other applications of data mining techniques include [[biomedical]] data mining facilitated by domain ontologies, mining clinical trial data, and [[traffic analysis]] using SOM.@@@@1@23@@danf@17-8-2009
10220010@unknown@formal@none@1@S@Data set@@@@1@2@@danf@17-8-2009
10220020@unknown@formal@none@1@S@A '''data set''' (or '''dataset''') is a collection of [[data]], usually presented in tabular form.@@@@1@15@@danf@17-8-2009
10220030@unknown@formal@none@1@S@Each column represents a particular variable.@@@@1@6@@danf@17-8-2009
10220040@unknown@formal@none@1@S@Each row corresponds to a given member of the data set in question.@@@@1@13@@danf@17-8-2009
10220050@unknown@formal@none@1@S@It lists values for each of the variables, such as height and weight of an object or values of random numbers.@@@@1@21@@danf@17-8-2009
10220060@unknown@formal@none@1@S@Each value is known as a [[datum]].@@@@1@7@@danf@17-8-2009
10220070@unknown@formal@none@1@S@The data set may comprise data for one or more members, corresponding to the number of rows.@@@@1@17@@danf@17-8-2009
10220080@unknown@formal@none@1@S@Historically, the term originated in the [[mainframe computer|mainframe field]], where it had a [[Data set (IBM mainframe)|well-defined meaning]], very close to contemporary ''[[computer file]]''.@@@@1@24@@danf@17-8-2009
10220090@unknown@formal@none@1@S@This topic is not covered here.@@@@1@6@@danf@17-8-2009
10220100@unknown@formal@none@1@S@In the simplest case, there is only one variable, and then the data set consists of a single column of values, often represented as a list.@@@@1@26@@danf@17-8-2009
10220110@unknown@formal@none@1@S@The values may be numbers, such as [[real number]]s or [[integer]]s, for example representing a person's height in centimeters, but may also be [[nominal data]] (i.e., not consisting of [[numerical]] values), for example representing a person's ethnicity.@@@@1@37@@danf@17-8-2009
10220120@unknown@formal@none@1@S@More generally, values may be of any of the kinds described as a [[level of measurement]].@@@@1@16@@danf@17-8-2009
10220130@unknown@formal@none@1@S@For each variable, the values will normally all be of the same kind.@@@@1@13@@danf@17-8-2009
10220140@unknown@formal@none@1@S@However, there may also be "[[missing values]]", which need to be indicated in some way.@@@@1@15@@danf@17-8-2009
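The tabular structure described above can be sketched in a few lines. The example assumes a hypothetical two-variable data set in which each row is a dictionary and `None` marks a missing value; the variable names and figures are invented for illustration.

```python
# A small data set: each row is one member, each key is one variable.
# The variable names and values are hypothetical.
ROWS = [
    {"height_cm": 172.0, "weight_kg": 68.5},
    {"height_cm": 181.5, "weight_kg": None},  # None marks a missing value
    {"height_cm": 165.0, "weight_kg": 59.0},
]

def column(rows, variable):
    """Return all values of one variable, in row order."""
    return [row.get(variable) for row in rows]

def count_missing(rows, variable):
    """Number of rows in which the variable has no recorded value."""
    return sum(1 for value in column(rows, variable) if value is None)

def mean_ignoring_missing(rows, variable):
    """Mean of the recorded values, or None if every value is missing."""
    values = [v for v in column(rows, variable) if v is not None]
    if not values:
        return None
    return sum(values) / len(values)

HEIGHTS = column(ROWS, "height_cm")
MISSING_WEIGHTS = count_missing(ROWS, "weight_kg")
MEAN_WEIGHT = mean_ignoring_missing(ROWS, "weight_kg")
```

Representing a missing value with an explicit marker, rather than omitting the row, keeps the row count equal to the number of members and leaves it to each calculation to decide how to treat the gap.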
10220150@unknown@formal@none@1@S@In [[statistics]] data sets usually come from actual observations obtained by [[sampling (statistics)|sampling]] a [[statistical population]], and each row corresponds to the observations on one element of that population.@@@@1@29@@danf@17-8-2009
10220160@unknown@formal@none@1@S@Data sets may further be generated by [[algorithms]] for the purpose of testing certain kinds of [[software]].@@@@1@17@@danf@17-8-2009
10220170@unknown@formal@none@1@S@Some modern statistical analysis software, such as [[PSPP]], still presents its data in the classical dataset fashion.@@@@1@17@@danf@17-8-2009
10220180@unknown@formal@none@1@S@== Classic data sets ==@@@@1@5@@danf@17-8-2009
10220190@unknown@formal@none@1@S@Several classic [[data set]]s have been used extensively in the [[statistical]] literature:@@@@1@12@@danf@17-8-2009
10220200@unknown@formal@none@1@S@* [[Iris flower data set]] - multivariate data set introduced by [[Ronald Fisher]] (1936).@@@@1@14@@danf@17-8-2009
10220210@unknown@formal@none@1@S@* ''[[Categorical data analysis]]'' - Data sets used in the book, ''An Introduction to Categorical Data Analysis'', by Agresti are [http://lib.stat.cmu.edu/datasets/agresti provided on-line by StatLib.]@@@@1@25@@danf@17-8-2009
10220220@unknown@formal@none@1@S@*''[[Robust statistics]]'' - Data sets used in ''Robust Regression and Outlier Detection'' (Rousseeuw and Leroy, 1986). [http://www.uni-koeln.de/themen/Statistik/data/rousseeuw/ Provided on-line at the University of Cologne.]@@@@1@24@@danf@17-8-2009
10220230@unknown@formal@none@1@S@*''[[Time series]]'' - Data used in Chatfield's book, ''The Analysis of Time Series'', are [http://lib.stat.cmu.edu/modules.php?op=modload&name=PostWrap&file=index&page=datasets/ provided on-line by StatLib.]@@@@1@19@@danf@17-8-2009
10220240@unknown@formal@none@1@S@*''Extreme values'' - Data used in the book, ''An Introduction to the Statistical Modeling of Extreme Values'' are [http://homes.stat.unipd.it/coles/public_html/ismev/ismev.dat provided on-line by Stuart Coles], the book's author.@@@@1@27@@danf@17-8-2009
10220250@unknown@formal@none@1@S@*''Bayesian Data Analysis'' - Data used in the book, ''[[Bayesian]] Data Analysis'', are [http://www.stat.columbia.edu/~gelman/book/data/ provided on-line by Andrew Gelman], one of the book's authors.@@@@1@24@@danf@17-8-2009
10220260@unknown@formal@none@1@S@* The [ftp://ftp.ics.uci.edu/pub/machine-learning-databases/liver-disorders Bupa liver data], used in several papers in the machine learning (data mining) literature.@@@@1@17@@danf@17-8-2009
10230010@unknown@formal@none@1@S@ELIZA@@@@1@1@@danf@17-8-2009
10230020@unknown@formal@none@1@S@'''ELIZA''' is a [[computer program]] by [[Joseph Weizenbaum]], designed in [[1966]], which parodied a [[Rogerian psychotherapy|Rogerian therapist]], largely by rephrasing many of the patient's statements as questions and posing them to the patient.@@@@1@33@@danf@17-8-2009
10230030@unknown@formal@none@1@S@Thus, for example, the response to "My head hurts" might be "Why do you say your head hurts?"@@@@1@18@@danf@17-8-2009
10230040@unknown@formal@none@1@S@The response to "My mother hates me" might be "Who else in your family hates you?"@@@@1@16@@danf@17-8-2009
10230050@unknown@formal@none@1@S@ELIZA was named after Eliza Doolittle, a working-class character in [[George Bernard Shaw|George Bernard Shaw's]] play ''[[Pygmalion (play)|Pygmalion]]'', who is taught to speak with an [[upper class]] [[accent (linguistics)|accent]].@@@@1@29@@danf@17-8-2009
10230060@unknown@formal@none@1@S@==Overview==@@@@1@1@@danf@17-8-2009
10230070@unknown@formal@none@1@S@It is sometimes inaccurately said that ELIZA simulates a therapist.@@@@1@10@@danf@17-8-2009
10230080@unknown@formal@none@1@S@Weizenbaum said that ELIZA provided a "[[parody]]" of "the responses of a non-directional psychotherapist in an initial psychiatric interview."@@@@1@19@@danf@17-8-2009
10230090@unknown@formal@none@1@S@He chose the context of psychotherapy to "sidestep the problem of giving the program a data base of real-world knowledge", the therapeutic situation being one of the few real human situations in which a human being can reply to a statement with a question that indicates very little specific knowledge of the topic under discussion.@@@@1@55@@danf@17-8-2009
10230100@unknown@formal@none@1@S@For example, it is a context in which the question "Who is your favorite composer?" can be answered acceptably with responses such as "What about your own favorite composer?" or "Does that question interest you?"@@@@1@35@@danf@17-8-2009
10230110@unknown@formal@none@1@S@First implemented in Weizenbaum's own [[SLIP (programming language)|SLIP]] list-processing language, ELIZA worked by simple [[parsing]] and substitution of key words into canned phrases.@@@@1@23@@danf@17-8-2009
10230120@unknown@formal@none@1@S@Depending upon the initial entries by the user, the illusion of a human writer could be instantly dispelled, or could continue through several interchanges.@@@@1@24@@danf@17-8-2009
10230130@unknown@formal@none@1@S@It was sometimes so convincing that there are many anecdotes about people becoming very emotionally caught up in dealing with ELIZA for several minutes until the machine's true lack of understanding became apparent.@@@@1@33@@danf@17-8-2009
10230140@unknown@formal@none@1@S@This was likely due to people's tendency to attach meanings to words which the computer never put there.@@@@1@18@@danf@17-8-2009
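The keyword-and-substitution approach can be illustrated with a short sketch. This is not Weizenbaum's SLIP program or his script; the rules, pronoun table and default reply below are hypothetical, chosen only so that the two example exchanges quoted earlier come out as described.

```python
import re

# Pronoun swaps applied to the fragment that is echoed back to the user.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are"}

# Illustrative rules, tried in order. Each pairs a keyword pattern with a
# canned phrase; "{0}" receives the reflected text captured by the pattern.
RULES = [
    (re.compile(r"\bmy (?:mother|father|sister|brother) (.+)", re.IGNORECASE),
     "Who else in your family {0}?"),
    (re.compile(r"\b(my .+)", re.IGNORECASE),
     "Why do you say {0}?"),
]

DEFAULT_REPLY = "Please go on."

def reflect(fragment):
    """Swap first-person words for second-person ones, word by word."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)

def respond(statement):
    """Return the canned phrase of the first matching rule, or a default."""
    text = statement.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT_REPLY
```

With these rules, `respond("My head hurts")` gives "Why do you say your head hurts?" and `respond("My mother hates me")` gives "Who else in your family hates you?". Any statement that contains no keyword falls through to the default reply, which is where the lack of understanding tends to show.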
10230150@unknown@formal@none@1@S@In 1966, interactive computing (via a teletype) was new.@@@@1@9@@danf@17-8-2009
10230160@unknown@formal@none@1@S@It was 15 years before the personal computer became familiar to the general public, and three decades before most people encountered attempts at [[natural language processing]] in Internet services like [[Ask.com]] or PC help systems such as Microsoft Office's [[Office Assistant|Clippy]].@@@@1@41@@danf@17-8-2009
10230170@unknown@formal@none@1@S@Although those programs included years of research and work (while ''[[Ecala]]'' eclipsed the functionality of ''ELIZA'' after less than two weeks of work by a single programmer), ''ELIZA'' remains a milestone simply because it was the first time a programmer had attempted such a human-machine interaction with the goal of creating the illusion (however brief) of human-''human'' interaction.@@@@1@58@@danf@17-8-2009
10230180@unknown@formal@none@1@S@In an excerpt from ''Computer Power and Human Reason'' (1976), reprinted in ''The New Media Reader'' (edited by Noah Wardrip-Fruin and Nick Montfort), Weizenbaum describes how quickly and deeply people became emotionally involved with the computer program, taking offence when he asked to view the transcripts, saying it was an invasion of their privacy, and even asking him to leave the room while they were working with ELIZA.@@@@1@68@@danf@17-8-2009
10230190@unknown@formal@none@1@S@==Influence on games==@@@@1@3@@danf@17-8-2009
10230200@unknown@formal@none@1@S@ELIZA influenced a number of early [[computer games]] by demonstrating additional kinds of [[interface design]]s.@@@@1@15@@danf@17-8-2009
10230210@unknown@formal@none@1@S@[[Don Daglow]] wrote an enhanced version of the program called ''Ecala'' on a [[PDP-10]] [[mainframe computer]] at [[Pomona College]] in [[1973]] before writing what was possibly the second or third computer [[role-playing game]], ''[[Dungeon (computer game)|Dungeon]]'' ([[1975]]) (The first was probably "[[dnd (computer game)|dnd]]", written on and for the PLATO system in 1974, and the second may have been [[Moria]], written in 1975).@@@@1@63@@danf@17-8-2009
10230220@unknown@formal@none@1@S@It is likely that ''ELIZA'' was also on the system where [[Will Crowther]] created ''[[Colossal Cave Adventure|Adventure]]'', the 1975 game that spawned the [[interactive fiction]] genre.@@@@1@26@@danf@17-8-2009
10230230@unknown@formal@none@1@S@But both these games appeared some nine years after the original ''ELIZA''.@@@@1@12@@danf@17-8-2009
10230240@unknown@formal@none@1@S@==Response and legacy==@@@@1@3@@danf@17-8-2009
10230250@unknown@formal@none@1@S@Lay responses to ELIZA disturbed Weizenbaum and motivated him to write his book ''Computer Power and Human Reason: From Judgment to Calculation'', in which he explains the limits of computers and argues that anthropomorphic views of computers amount to a reduction of the human being, and of any life form for that matter.@@@@1@53@@danf@17-8-2009
10230260@unknown@formal@none@1@S@There are many programs based on ELIZA in different languages in addition to ''Ecala''.@@@@1@14@@danf@17-8-2009
10230270@unknown@formal@none@1@S@For example, in 1980, a company called "Don't Ask Software", founded by Randy Simon, created a version for the Apple II, Atari, and Commodore PCs, which verbally abused the user based on the user's input.@@@@1@35@@danf@17-8-2009
10230280@unknown@formal@none@1@S@In Spain, Jordi Perez developed the famous ZEBAL in 1993, written in [[Clipper programming language|Clipper]] for MS-DOS.@@@@1@17@@danf@17-8-2009
10230290@unknown@formal@none@1@S@Other versions adapted ELIZA around a religious theme, such as ones featuring Jesus (both serious and comedic) and another Apple II variant called ''I Am Buddha''.@@@@1@26@@danf@17-8-2009
10230300@unknown@formal@none@1@S@The 1980 game ''[[The Prisoner (computer game)|The Prisoner]]'' incorporated ELIZA-style interaction within its gameplay.@@@@1@14@@danf@17-8-2009
10230310@unknown@formal@none@1@S@ELIZA has also inspired a [[podcast]] called "The Eliza Podcast", in which the host engages in self-analysis using a computer generated voice prompting with questions in the same style as the ELIZA program.@@@@1@33@@danf@17-8-2009
10230320@unknown@formal@none@1@S@==Implementations==@@@@1@1@@danf@17-8-2009
10230330@unknown@formal@none@1@S@* Using [[JavaScript]]: http://www.manifestation.com/neurotoys/eliza.php3@@@@1@4@@danf@17-8-2009
10230340@unknown@formal@none@1@S@* Source code in [[Java (programming language)|Java]]: http://chayden.net/eliza/Eliza.html@@@@1@8@@danf@17-8-2009
10230350@unknown@formal@none@1@S@* Another [[Java (programming language)|Java]]-implementation of ELIZA: http://www.wedesoft.demon.co.uk/eliza/@@@@1@8@@danf@17-8-2009
10230360@unknown@formal@none@1@S@* Using [[C (programming language)|C]] on the [[TI-89]]: http://kaikostack.com/ti89_en.htm#eliza@@@@1@9@@danf@17-8-2009
10230370@unknown@formal@none@1@S@* Using [[z80#The Z80 assembly language|z80 Assembly]] on the [[TI-83#TI-83 Plus|TI-83 Plus]]: http://www.ticalc.org/archives/files/fileinfo/354/35463.html@@@@1@13@@danf@17-8-2009
10230380@unknown@formal@none@1@S@* A [[perl module]] [http://search.cpan.org/dist/Chatbot-Eliza/ Chatbot::Eliza] — [http://www.terrence.com/perl/eliza/eliza.cgi example implementation]@@@@1@10@@danf@17-8-2009
10230390@unknown@formal@none@1@S@* Trans-Tex Software has released shareware versions for Classic Mac OS and Mac OS X: http://www.tex-edit.com/index.html#Eliza@@@@1@16@@danf@17-8-2009
10230400@unknown@formal@none@1@S@* doctor.el (circa [[1985]]) in [[Emacs]].@@@@1@6@@danf@17-8-2009
10230410@unknown@formal@none@1@S@* Source code in [[Tcl]]: [http://wiki.tcl.tk/9235 http://wiki.tcl.tk/9235]@@@@1@7@@danf@17-8-2009
10230420@unknown@formal@none@1@S@* The [http://www.indyproject.org Indy] [[Delphi]] oriented TCP/IP components suite has an Eliza implementation as demo.@@@@1@15@@danf@17-8-2009
10230430@unknown@formal@none@1@S@*[http://www.cs.bham.ac.uk/research/projects/cogaff/eliza Pop-11 Eliza] in the [[poplog]] system.@@@@1@7@@danf@17-8-2009
10230440@unknown@formal@none@1@S@Goes back to about 1976, when it was used for teaching AI at [[Sussex University]].@@@@1@15@@danf@17-8-2009
10230450@unknown@formal@none@1@S@Now part of the free open source Poplog system.@@@@1@9@@danf@17-8-2009
10230460@unknown@formal@none@1@S@* Source code in [[BASIC]]: http://www.atariarchives.org/bigcomputergames/showpage.php?page=22@@@@1@6@@danf@17-8-2009
10230470@unknown@formal@none@1@S@* ECC-Eliza for Windows (actual program is for DOS, but unpacker is for Windows) (rename .txt to .exe before running): http://www5.domaindlx.com/ecceliza1/ecceliza.txt.@@@@1@21@@danf@17-8-2009
10230480@unknown@formal@none@1@S@More recent version at http://web.archive.org/web/20041117123025/http://www5.domaindlx.com/ecceliza1/ecceliza.txt.@@@@1@5@@danf@17-8-2009
10240010@unknown@formal@none@1@S@English language@@@@1@2@@danf@17-8-2009
10240020@unknown@formal@none@1@S@'''English''' is an [[Indo-European languages|Indo-European]], [[West Germanic languages|West Germanic language]] originating in [[England]], and is the [[first language]] for most people in the [[United Kingdom]], the [[United States]], [[Canada]], [[Australia]], [[New Zealand]], [[Republic of Ireland|Ireland]], and the [[Anglophone Caribbean]].@@@@1@39@@danf@17-8-2009
10240030@unknown@formal@none@1@S@It is used extensively as a [[second language]] and as an [[official language]] throughout the world, especially in [[Commonwealth of Nations|Commonwealth]] countries and in many [[international organization]]s.@@@@1@27@@danf@17-8-2009
10240040@unknown@formal@none@1@S@==Significance==@@@@1@1@@danf@17-8-2009
10240050@unknown@formal@none@1@S@Modern English, sometimes described as the first global [[lingua franca]], is the [[Linguistic imperialism|dominant]] [[international auxiliary language|international language]] in [[communication]]s, [[science]], [[business]], [[aviation]], [[entertainment]], [[radio]] and [[diplomacy]].@@@@1@27@@danf@17-8-2009
10240060@unknown@formal@none@1@S@The initial reason for its enormous spread beyond the bounds of the [[British Isles]] where it was originally a native tongue was the [[British Empire]], and by the late nineteenth century its influence had won a truly global reach.@@@@1@39@@danf@17-8-2009
10240070@unknown@formal@none@1@S@It is the dominant language in the [[United States]] and the growing economic and cultural influence of that [[federal union]] as a global [[superpower]] since [[World War II]] has significantly accelerated adoption of English as a language across the planet.@@@@1@40@@danf@17-8-2009
10240080@unknown@formal@none@1@S@A working knowledge of English has become a requirement in a number of fields, occupations and professions such as medicine, and as a consequence over a billion people speak English to at least a basic level (see [[English language learning and teaching]]).@@@@1@42@@danf@17-8-2009
10240090@unknown@formal@none@1@S@Linguists such as [[David Crystal]] recognize that one impact of this massive growth of English, in common with other global languages, has been to reduce native [[Natural language#Linguistic diversity|linguistic diversity]] in many parts of the world historically, most particularly in [[Australasia]] and [[North America]], and its huge influence continues to play an important role in [[language attrition]].@@@@1@57@@danf@17-8-2009
10240100@unknown@formal@none@1@S@By a similar token, [[historical linguistics|historical linguists]], aware of the complex and fluid dynamics of [[language change]], are always alive to the potential English contains through the vast size and spread of the communities that use it and its natural internal variety, such as in its [[English-based creole languages|creoles]] and [[pidgin]]s, to produce a new [[language family|family]] of distinct languages over time.@@@@1@62@@danf@17-8-2009
10240110@unknown@formal@none@1@S@English is one of six official languages of the [[United Nations]].@@@@1@11@@danf@17-8-2009
10240120@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10240130@unknown@formal@none@1@S@English is a [[West Germanic languages|West Germanic]] language that originated from the [[Anglo-Frisian languages|Anglo-Frisian]] dialects brought to [[Great Britain|Britain]] by Germanic settlers and Roman auxiliary troops from various parts of what is now northwest Germany and the Northern [[Netherlands]].@@@@1@39@@danf@17-8-2009
10240140@unknown@formal@none@1@S@Initially, [[Old English language|Old English]] was a diverse group of dialects, reflecting the varied origins of the Anglo-Saxon Kingdoms of [[England]].@@@@1@21@@danf@17-8-2009
10240150@unknown@formal@none@1@S@One of these dialects, Late West Saxon, eventually came to dominate.@@@@1@11@@danf@17-8-2009
10240160@unknown@formal@none@1@S@The original Old English language was then influenced by two waves of invasion.@@@@1@13@@danf@17-8-2009
10240170@unknown@formal@none@1@S@The first was by speakers of languages of the [[North Germanic languages|Scandinavian]] branch of the Germanic family; they conquered and colonized parts of Britain in the 8th and 9th centuries.@@@@1@30@@danf@17-8-2009
10240180@unknown@formal@none@1@S@The second was by the [[Normans]] in the 11th century, who spoke Old Norman and ultimately developed an English variety of this called [[Anglo-Norman]].@@@@1@24@@danf@17-8-2009
10240190@unknown@formal@none@1@S@These two invasions caused English to become "mixed" to some degree (though it was never a truly mixed language in the strict linguistic sense of the word; mixed languages arise from the cohabitation of speakers of different languages, who develop a hybrid tongue for basic communication).@@@@1@46@@danf@17-8-2009
10240200@unknown@formal@none@1@S@Cohabitation with the Scandinavians resulted in a significant grammatical simplification and lexical supplementation of the Anglo-Frisian core of English; the later [[Normans|Norman]] occupation led to the grafting onto that Germanic core of a more elaborate layer of words from the [[Italic languages|Italic]] branch of the European languages.@@@@1@47@@danf@17-8-2009
10240210@unknown@formal@none@1@S@This Norman influence entered English largely through the courts and government.@@@@1@11@@danf@17-8-2009
10240220@unknown@formal@none@1@S@Thus, English developed into a "borrowing" language of great flexibility and with a huge vocabulary.@@@@1@15@@danf@17-8-2009
10240230@unknown@formal@none@1@S@== Classification and related languages ==@@@@1@6@@danf@17-8-2009
10240240@unknown@formal@none@1@S@The English language belongs to the western sub-branch of the [[Germanic languages|Germanic branch]] of the [[Indo-European languages|Indo-European]] family of languages.@@@@1@20@@danf@17-8-2009
10240250@unknown@formal@none@1@S@The closest living relative of English is [[Scots language|Scots]], spoken primarily in Scotland and parts of Northern Ireland, which is viewed by linguists as either a separate language or a group of dialects of English.@@@@1@35@@danf@17-8-2009
10240260@unknown@formal@none@1@S@The next closest relative to English after Scots is [[Frisian languages|Frisian]], spoken in the Northern Netherlands and Northwest Germany.@@@@1@19@@danf@17-8-2009
10240270@unknown@formal@none@1@S@Other less closely related living [[West Germanic languages]] include [[Dutch language|Dutch]], [[Low German]], [[German language|German]] and [[Afrikaans]].@@@@1@17@@danf@17-8-2009
10240280@unknown@formal@none@1@S@The [[North Germanic languages]] of Scandinavia are less closely related to English than the West Germanic languages.@@@@1@17@@danf@17-8-2009
10240290@unknown@formal@none@1@S@Many [[French language|French]] words are also intelligible to an English speaker (though pronunciations are often quite different) because English absorbed a large vocabulary from [[Norman language|Norman]] and French, via [[Anglo-Norman]] after the Norman Conquest and directly from French in subsequent centuries.@@@@1@41@@danf@17-8-2009
10240300@unknown@formal@none@1@S@As a result, a large portion of English vocabulary is derived from French, with some minor spelling differences (word endings, use of old French spellings, etc.), as well as occasional divergences in meaning, in so-called "faux amis", or [[false friend]]s.@@@@1@40@@danf@17-8-2009
10240310@unknown@formal@none@1@S@The pronunciation of French loanwords in English has become completely anglicized and follows a typically Germanic pattern of stress.@@@@1@19@@danf@17-8-2009
10240320@unknown@formal@none@1@S@== Geographical distribution ==@@@@1@4@@danf@17-8-2009
10240330@unknown@formal@none@1@S@Approximately 375 million people speak English as their first language.@@@@1@10@@danf@17-8-2009
10240340@unknown@formal@none@1@S@English today is probably the third largest language by number of native speakers, after [[Mandarin (linguistics)|Mandarin Chinese]] and [[Spanish language|Spanish]].@@@@1@20@@danf@17-8-2009
10240350@unknown@formal@none@1@S@However, when combining native and non-native speakers it is probably the most commonly spoken language in the world, though possibly second to a combination of the [[Chinese language]]s, depending on whether or not distinctions in the latter are classified as "languages" or "dialects."@@@@1@43@@danf@17-8-2009
10240360@unknown@formal@none@1@S@Estimates that include [[second language]] speakers vary greatly from 470 million to over a billion depending on how [[literacy]] or mastery is defined.@@@@1@23@@danf@17-8-2009
10240370@unknown@formal@none@1@S@There are some who claim that non-native speakers now outnumber native speakers by a ratio of 3 to 1.@@@@1@19@@danf@17-8-2009
10240380@unknown@formal@none@1@S@The countries with the highest populations of native English speakers are, in descending order: United States (215 million), United Kingdom (58 million), Canada (18.2 million), Australia (15.5 million), [[Republic of Ireland|Ireland]] (3.8 million), South Africa (3.7 million), and New Zealand (3.0-3.7 million).@@@@1@42@@danf@17-8-2009
10240390@unknown@formal@none@1@S@Countries such as [[Jamaica]] and [[Nigeria]] also have millions of native speakers of [[dialect continuum|dialect continua]] ranging from an [[English-based creole languages|English-based creole]] to a more standard version of English.@@@@1@30@@danf@17-8-2009
10240400@unknown@formal@none@1@S@Of those nations where English is spoken as a second language, India has the most such speakers ('[[Indian English]]') and linguistics professor [[David Crystal]] claims that, combining native and non-native speakers, India now has more people who speak or understand English than any other country in the world.@@@@1@48@@danf@17-8-2009
10240410@unknown@formal@none@1@S@Following India is the [[People's Republic of China]].@@@@1@8@@danf@17-8-2009
10240420@unknown@formal@none@1@S@===Countries in order of total speakers===@@@@1@6@@danf@17-8-2009
10240430@unknown@formal@none@1@S@English is the primary language in [[Anguilla]], [[Antigua and Barbuda]], Australia ([[Australian English]]), the [[The Bahamas|Bahamas]], [[Barbados]], [[Bermuda]], [[Belize]] ([[Belizean Kriol language|Belizean Kriol]]), the [[British Indian Ocean Territory]], the [[British Virgin Islands]], Canada ([[Canadian English]]), the [[Cayman Islands]], the [[Falkland Islands]], [[Gibraltar]], [[Grenada]], [[Guam]], [[Guernsey]] ([[Channel Island English]]), [[Guyana]], Ireland ([[Hiberno-English]]), [[Isle of Man]] ([[Manx English]]), Jamaica ([[Jamaican English]]), [[Jersey]], [[Montserrat]], [[Nauru]], New Zealand ([[New Zealand English]]), [[Pitcairn Islands]], [[Saint Helena]], [[Saint Kitts and Nevis]], [[Saint Vincent and the Grenadines]], [[Singapore]], [[South Georgia and the South Sandwich Islands]], [[Trinidad and Tobago]], the [[Turks and Caicos Islands]], the United Kingdom, the [[United States Virgin Islands|U.S. Virgin Islands]], and the United States.@@@@1@110@@danf@17-8-2009
10240440@unknown@formal@none@1@S@In many other countries, where English is not the most spoken language, it is an official language; these countries include [[Botswana]], [[Cameroon]], [[Dominica]], [[Fiji]], the [[Federated States of Micronesia]], [[Ghana]], [[The Gambia|Gambia]], [[India]], [[Kenya]], [[Kiribati]], [[Lesotho]], [[Liberia]], [[Madagascar]], [[Malta]], the [[Marshall Islands]], [[Mauritius]], [[Namibia]], [[Nigeria]], [[Pakistan]], [[Palau]], [[Papua New Guinea]], the [[Philippines]], [[Puerto Rico]], [[Rwanda]], the [[Solomon Islands]], [[Saint Lucia]], [[Samoa]], [[Seychelles]], [[Sierra Leone]], [[Sri Lanka]], [[Swaziland]], [[Tanzania]], [[Uganda]], [[Zambia]], and [[Zimbabwe]].@@@@1@72@@danf@17-8-2009
10240450@unknown@formal@none@1@S@It is also one of the 11 official languages that are given equal status in South Africa ([[South African English]]).@@@@1@20@@danf@17-8-2009
10240460@unknown@formal@none@1@S@English is also the official language in current [[dependent territory|dependent territories]] of Australia ([[Norfolk Island]], [[Christmas Island]] and [[Cocos Island]]) and of the United States ([[Northern Mariana Islands]], [[American Samoa]] and [[Puerto Rico]]), and in the former British colony of [[Hong Kong]].@@@@1@42@@danf@17-8-2009
10240470@unknown@formal@none@1@S@English is an important language in several former [[colony|colonies]] and [[protectorate]]s of the United Kingdom but falls short of official status, such as in [[Malaysia]], [[Brunei]], [[United Arab Emirates]] and [[Bahrain]].@@@@1@31@@danf@17-8-2009
10240480@unknown@formal@none@1@S@English is also not an official language in either the United States or the United Kingdom.@@@@1@16@@danf@17-8-2009
10240490@unknown@formal@none@1@S@Although the United States federal government has no official languages, English has been given official status by 30 of the 50 state governments.@@@@1@23@@danf@17-8-2009
10240500@unknown@formal@none@1@S@===English as a global language===@@@@1@5@@danf@17-8-2009
10240510@unknown@formal@none@1@S@Because English is so widely spoken, it has often been referred to as a "[[world language]]", the ''[[lingua franca]]'' of the modern era.@@@@1@23@@danf@17-8-2009
10240520@unknown@formal@none@1@S@While English is not an official language in most countries, it is currently the language most often taught as a [[second language]] around the world.@@@@1@25@@danf@17-8-2009
10240530@unknown@formal@none@1@S@Some linguists believe that it is no longer the exclusive cultural sign of "native English speakers", but is rather a language that is absorbing aspects of cultures worldwide as it continues to grow.@@@@1@33@@danf@17-8-2009
10240540@unknown@formal@none@1@S@It is, by international treaty, the official language for aerial and maritime communications.@@@@1@13@@danf@17-8-2009
10240550@unknown@formal@none@1@S@English is an official language of the [[United Nations]] and many other international organizations, including the [[International Olympic Committee]].@@@@1@19@@danf@17-8-2009
10240560@unknown@formal@none@1@S@English is the language most often studied as a foreign language in the European Union (by 89% of schoolchildren), followed by French (32%), German (18%), and Spanish (8%).@@@@1@28@@danf@17-8-2009
10240570@unknown@formal@none@1@S@In the EU, a large fraction of the population reports being able to converse to some extent in English.@@@@1@19@@danf@17-8-2009
10240580@unknown@formal@none@1@S@Among non-English speaking countries, a large percentage of the population claimed to be able to converse in English in the [[Netherlands]] (87%), [[Sweden]] (85%), [[Denmark]] (83%), [[Luxembourg]] (66%), [[Finland]] (60%), [[Slovenia]] (56%), [[Austria]] (53%), [[Belgium]] (52%), and [[Germany]] (51%).@@@@1@39@@danf@17-8-2009
10240590@unknown@formal@none@1@S@[[Norway]] and [[Iceland]] also have a large majority of competent English-speakers.@@@@1@11@@danf@17-8-2009
10240600@unknown@formal@none@1@S@[[Book]]s, [[magazine]]s, and [[newspaper]]s written in English are available in many countries around the world.@@@@1@15@@danf@17-8-2009
10240610@unknown@formal@none@1@S@English is also the most commonly used language in the [[science]]s.@@@@1@11@@danf@17-8-2009
10240620@unknown@formal@none@1@S@In 1997, the [[Science Citation Index]] reported that 95% of its articles were written in English, even though only half of them came from authors in English-speaking countries.@@@@1@28@@danf@17-8-2009
10240630@unknown@formal@none@1@S@=== Dialects and regional varieties ===@@@@1@6@@danf@17-8-2009
10240640@unknown@formal@none@1@S@The expansion of the British Empire and—since WWII—the primacy of the United States have spread English throughout the globe.@@@@1@19@@danf@17-8-2009
10240650@unknown@formal@none@1@S@Because of that global spread, English has developed a host of [[List of dialects of the English language|English dialects]] and English-based [[creole language]]s and [[pidgin]]s.@@@@1@25@@danf@17-8-2009
10240660@unknown@formal@none@1@S@The major [[Variety (linguistics)|varieties]] of English include, in most cases, several subvarieties, such as [[Cockney]] within [[British English]]; [[Newfoundland English]] within [[Canadian English]]; and [[African American Vernacular English]] ("Ebonics") and [[Southern American English]] within [[American English]].@@@@1@36@@danf@17-8-2009
10240670@unknown@formal@none@1@S@English is a [[pluricentric language]], without a central language authority like France's [[Académie française]]; and, although no variety is clearly considered the only standard, there are a number of accents considered to be more prestigious, such as [[Received Pronunciation]] in Britain.@@@@1@41@@danf@17-8-2009
10240680@unknown@formal@none@1@S@[[Scots language|Scots]] developed—largely independently—from the same origins, but following the [[Acts of Union 1707]] a process of [[language attrition]] began, whereby successive generations adopted more and more features from English, causing dialectalisation.@@@@1@32@@danf@17-8-2009
10240690@unknown@formal@none@1@S@Whether it is now a separate language or a [[dialect]] of English better described as [[Scottish English]] is in dispute.@@@@1@20@@danf@17-8-2009
10240700@unknown@formal@none@1@S@The pronunciation, grammar and lexis of the traditional forms differ, sometimes substantially, from other varieties of English.@@@@1@17@@danf@17-8-2009
10240710@unknown@formal@none@1@S@Because of the wide use of English as a second language, English speakers have many different [[Accent (linguistics)|accents]], which often signal the speaker's native dialect or language.@@@@1@27@@danf@17-8-2009
10240720@unknown@formal@none@1@S@For the more distinctive characteristics of regional accents, see [[Regional accents of English]], and for the more distinctive characteristics of regional dialects, see [[List of dialects of the English language]].@@@@1@30@@danf@17-8-2009
10240730@unknown@formal@none@1@S@Just as English itself has borrowed words from many different languages over its history, English [[loanword]]s now appear in a great many languages around the world, indicative of the technological and cultural influence of its speakers.@@@@1@36@@danf@17-8-2009
10240740@unknown@formal@none@1@S@Several [[pidgin]]s and [[creole language]]s have formed using an English base, such as [[Jamaican (language)|Jamaican Patois]], [[Nigerian Pidgin]], and [[Tok Pisin]].@@@@1@21@@danf@17-8-2009
10240750@unknown@formal@none@1@S@There are many words in English coined to describe forms of particular non-English languages that contain a very high proportion of English words.@@@@1@23@@danf@17-8-2009
10240760@unknown@formal@none@1@S@[[Franglais]], for example, is used to describe French with a very high English word content; it is found on the [[Channel Islands]].@@@@1@22@@danf@17-8-2009
10240770@unknown@formal@none@1@S@Another variant, spoken in the border bilingual regions of Québec in Canada, is called [[Franglais#Frenglish|Frenglish]].@@@@1@15@@danf@17-8-2009
10240780@unknown@formal@none@1@S@In [[Wales]], which is part of the United Kingdom, the languages of [[Welsh language|Welsh]] and English are sometimes mixed together by fluent or comfortable Welsh speakers, the result of which is called [[Welsh English|Wenglish]].@@@@1@34@@danf@17-8-2009
10240790@unknown@formal@none@1@S@=== Constructed varieties of English ===@@@@1@6@@danf@17-8-2009
10240800@unknown@formal@none@1@S@* [[Basic English]] is simplified for easy international use.@@@@1@9@@danf@17-8-2009
10240810@unknown@formal@none@1@S@It is used by manufacturers and other international businesses to write manuals and communicate.@@@@1@14@@danf@17-8-2009
10240820@unknown@formal@none@1@S@Some English schools in Asia teach it as a practical subset of English for use by beginners.@@@@1@17@@danf@17-8-2009
10240830@unknown@formal@none@1@S@* [[Special English]] is a simplified version of English used by the [[Voice of America]].@@@@1@15@@danf@17-8-2009
10240840@unknown@formal@none@1@S@It uses a vocabulary of only 1500 words.@@@@1@8@@danf@17-8-2009
10240850@unknown@formal@none@1@S@* [[English spelling reform|English reform]] is an attempt to improve collectively upon the English language.@@@@1@15@@danf@17-8-2009
10240860@unknown@formal@none@1@S@* [[Seaspeak]] and the related [[NATO phonetic alphabet|Airspeak]] and Policespeak, all based on restricted vocabularies, were designed by [[Edward Johnson]] in the 1980s to aid international cooperation and communication in specific areas.@@@@1@32@@danf@17-8-2009
10240870@unknown@formal@none@1@S@There is also a [[tunnelspeak]] for use in the [[Channel Tunnel]].@@@@1@11@@danf@17-8-2009
10240880@unknown@formal@none@1@S@* [[Euro-English]] is a concept of standardising English for use as a second language in continental Europe.@@@@1@17@@danf@17-8-2009
10240890@unknown@formal@none@1@S@* [[Manually Coded English]] — a variety of systems have been developed to represent the English language with hand signals, designed primarily for use in deaf education.@@@@1@27@@danf@17-8-2009
10240900@unknown@formal@none@1@S@These should not be confused with true sign languages such as [[British Sign Language]] and [[American Sign Language]] used in Anglophone countries, which are independent and not based on English.@@@@1@30@@danf@17-8-2009
10240910@unknown@formal@none@1@S@* [[E-Prime]] excludes forms of the verb ''to be''.@@@@1@9@@danf@17-8-2009
10240920@unknown@formal@none@1@S@Euro-English (also ''EuroEnglish'' or ''Euro-English'') terms are English translations of European concepts that are not native to English-speaking countries.@@@@1@19@@danf@17-8-2009
10240930@unknown@formal@none@1@S@Because of the United Kingdom's (and even the Republic of Ireland's) involvement in the European Union, the usage focuses on non-British concepts.@@@@1@22@@danf@17-8-2009
10240940@unknown@formal@none@1@S@This kind of Euro-English was parodied when English was "made" one of the constituent languages of [[Europanto]].@@@@1@17@@danf@17-8-2009
10240950@unknown@formal@none@1@S@== Phonology ==@@@@1@3@@danf@17-8-2009
10240960@unknown@formal@none@1@S@=== Vowels ===@@@@1@3@@danf@17-8-2009
10240970@unknown@formal@none@1@S@'''Notes:'''@@@@1@1@@danf@17-8-2009
10240980@unknown@formal@none@1@S@It is the [[vowel]]s that differ most from region to region.@@@@1@11@@danf@17-8-2009
10240990@unknown@formal@none@1@S@Where symbols appear in pairs, the first corresponds to American English, [[General American]] accent; the second corresponds to British English, [[Received Pronunciation]].@@@@1@22@@danf@17-8-2009
10241000@unknown@formal@none@1@S@# American English lacks this sound; words with this sound are pronounced with {{IPA | /ɑ/}} or {{IPA | /ɔ/}}.@@@@1@20@@danf@17-8-2009
10241010@unknown@formal@none@1@S@See [[Phonological history of English low back vowels#Lot-cloth split|''Lot-cloth split'']].@@@@1@10@@danf@17-8-2009
10241020@unknown@formal@none@1@S@# Some dialects of North American English do not have this vowel.@@@@1@12@@danf@17-8-2009
10241030@unknown@formal@none@1@S@See [[Phonological history of English low back vowels#Cot-caught merger|''Cot-caught merger'']].@@@@1@10@@danf@17-8-2009
10241040@unknown@formal@none@1@S@# The North American variation of this sound is a [[r-colored vowel|rhotic vowel]].@@@@1@13@@danf@17-8-2009
10241050@unknown@formal@none@1@S@# Many speakers of North American English do not distinguish between these two unstressed vowels.@@@@1@15@@danf@17-8-2009
10241060@unknown@formal@none@1@S@For them, ''roses'' and ''Rosa's'' are pronounced the same, and the symbol usually used is [[schwa]] {{IPA | /ə/}}.@@@@1@19@@danf@17-8-2009
10241070@unknown@formal@none@1@S@# This sound is often transcribed with {{IPA | /i/}} or with {{IPA | /ɪ/}}.@@@@1@15@@danf@17-8-2009
10241080@unknown@formal@none@1@S@# The diphthongs {{IPA | /eɪ/}} and {{IPA | /oʊ/}} are monophthongal for many General American speakers, as {{IPA | /eː/}} and {{IPA | /oː/}}.@@@@1@25@@danf@17-8-2009
10241090@unknown@formal@none@1@S@# The letter <''U''> can represent either {{IPA|/u/}} or the [[iotation|iotated]] vowel {{IPA|/ju/}}.@@@@1@13@@danf@17-8-2009
10241100@unknown@formal@none@1@S@In British Received Pronunciation, if this iotated vowel {{IPA|/ju/}} occurs after {{IPA|/t/}}, {{IPA|/d/}}, {{IPA|/s/}} or {{IPA|/z/}}, it often triggers palatalization of the preceding consonant, turning it to {{IPA|/ʨ/}}, {{IPA|/ʥ/}}, {{IPA|/ɕ/}} and {{IPA|/ʑ/}} respectively, as in ''tune'', ''during'', ''sugar'', and ''azure''.@@@@1@40@@danf@17-8-2009
10241110@unknown@formal@none@1@S@In American English, palatalization does not generally happen unless the {{IPA|/ju/}} is followed by ''r'', with the result that {{IPA|/(t, d, s, z)jur/}} turn to {{IPA|/tʃɚ/}}, {{IPA|/dʒɚ/}}, {{IPA|/ʃɚ/}} and {{IPA|/ʒɚ/}} respectively, as in ''nature'', ''verdure'', ''sure'', and ''treasure''.@@@@1@38@@danf@17-8-2009
10241120@unknown@formal@none@1@S@# [[Vowel length]] plays a phonetic role in the majority of English dialects, and is said to be phonemic in a few dialects, such as [[Australian English]] and [[New Zealand English]].@@@@1@31@@danf@17-8-2009
10241130@unknown@formal@none@1@S@In certain dialects of the modern English language, for instance [[General American]], there is allophonic vowel length: vowel phonemes are realized as long vowel allophones before voiced consonant phonemes in the coda of a syllable.@@@@1@35@@danf@17-8-2009
10241140@unknown@formal@none@1@S@Before the [[Great Vowel Shift]], vowel length was phonemically contrastive.@@@@1@10@@danf@17-8-2009
10241150@unknown@formal@none@1@S@# This sound only occurs in non-rhotic accents.@@@@1@8@@danf@17-8-2009
10241160@unknown@formal@none@1@S@In some accents, this sound may be, instead of {{IPA|/ʊə/}}, {{IPA|/ɔ:/}}.@@@@1@11@@danf@17-8-2009
10241170@unknown@formal@none@1@S@See [[English-language vowel changes before historic r]].@@@@1@7@@danf@17-8-2009
10241180@unknown@formal@none@1@S@# This sound only occurs in non-rhotic accents.@@@@1@8@@danf@17-8-2009
10241190@unknown@formal@none@1@S@In some accents, the schwa offglide of {{IPA|/ɛə/}} may be dropped, monophthongising and lengthening the sound to {{IPA|/ɛ:/}}.@@@@1@18@@danf@17-8-2009
10241200@unknown@formal@none@1@S@See also [[IPA chart for English dialects]] for more vowel charts.@@@@1@11@@danf@17-8-2009
10241210@unknown@formal@none@1@S@=== Consonants ===@@@@1@3@@danf@17-8-2009
10241220@unknown@formal@none@1@S@This is the English consonantal system using symbols from the [[International Phonetic Alphabet]] (IPA).@@@@1@14@@danf@17-8-2009
10241230@unknown@formal@none@1@S@# The [[velar nasal]] {{IPA | [ŋ]}} is a non-phonemic allophone of /n/ in some northerly British accents, appearing only before /k/ and /g/.@@@@1@24@@danf@17-8-2009
10241240@unknown@formal@none@1@S@In all other dialects it is a separate phoneme, although it only occurs in [[syllable coda]]s.@@@@1@16@@danf@17-8-2009
10241250@unknown@formal@none@1@S@# The [[alveolar tap]] {{IPA | [ɾ]}} is an allophone of /t/ and /d/ in unstressed syllables in [[North American English]] and [[Australian English]].@@@@1@24@@danf@17-8-2009
10241260@unknown@formal@none@1@S@This is the sound of ''tt'' or ''dd'' in the words ''latter'' and ''ladder'', which are homophones for many speakers of North American English.@@@@1@24@@danf@17-8-2009
10241270@unknown@formal@none@1@S@In some accents such as [[Scottish English]] and [[Indian English]] it replaces {{IPA|/ɹ/}}.@@@@1@13@@danf@17-8-2009
10241280@unknown@formal@none@1@S@This is the same sound represented by single ''r'' in most varieties of [[Spanish language|Spanish]].@@@@1@15@@danf@17-8-2009
10241290@unknown@formal@none@1@S@# In some dialects, such as [[Cockney]], the interdentals /θ/ and /ð/ are usually merged with /f/ and /v/, and in others, like [[African American Vernacular English]], /ð/ is merged with dental /d/.@@@@1@33@@danf@17-8-2009
10241300@unknown@formal@none@1@S@In some Irish varieties, /θ/ and /ð/ become the corresponding dental plosives, which then contrast with the usual alveolar plosives.@@@@1@20@@danf@17-8-2009
10241310@unknown@formal@none@1@S@# The sounds {{IPA | /ʃ/, /ʒ/, and /ɹ/}} are labialised in some dialects.@@@@1@14@@danf@17-8-2009
10241320@unknown@formal@none@1@S@Labialisation is never contrastive in initial position and therefore is sometimes not transcribed.@@@@1@13@@danf@17-8-2009
10241330@unknown@formal@none@1@S@Most speakers of [[General American]] realize ''r'' (always rhoticized) as the [[retroflex approximant]] {{IPA|/ɻ/}}, whereas the same is realized in [[Scottish English]], etc. as the [[alveolar trill]].@@@@1@28@@danf@17-8-2009
10241340@unknown@formal@none@1@S@# The [[voiceless palatal fricative]] /ç/ is in most accents just an [[allophone]] of /h/ before /j/; for instance ''human'' /çjuːmən/.@@@@1@21@@danf@17-8-2009
10241350@unknown@formal@none@1@S@However, in some accents (see [[Phonological history of English consonant clusters|this]]), the /j/ is dropped, but the initial consonant is the same.@@@@1@22@@danf@17-8-2009
10241360@unknown@formal@none@1@S@# The [[voiceless velar fricative]] /x/ is used by Scottish or Welsh speakers of English for Scots/Gaelic words such as ''loch'' {{IPA | /lɒx/}} or by some speakers for loanwords from German and Hebrew like ''Bach'' {{IPA|/bax/}} or ''Chanukah'' /xanuka/. /x/ is also used in South African English.@@@@1@48@@danf@17-8-2009
10241370@unknown@formal@none@1@S@In some dialects such as [[Scouse]] ([[Liverpool]]) either {{IPA|[x]}} or the [[affricate consonant|affricate]] {{IPA|[kx]}} may be used as an [[allophone]] of /k/ in words such as ''docker'' {{IPA | [dɒkxə]}}.@@@@1@30@@danf@17-8-2009
10241380@unknown@formal@none@1@S@Most native speakers have a great deal of trouble pronouncing it correctly when learning a foreign language.@@@@1@17@@danf@17-8-2009
10241390@unknown@formal@none@1@S@Most speakers use the sounds [k] and [h] instead.@@@@1@9@@danf@17-8-2009
10241400@unknown@formal@none@1@S@# Voiceless w {{IPA | [ʍ]}} is found in Scottish and Irish English, as well as in some varieties of American, New Zealand, and English English.@@@@1@26@@danf@17-8-2009
10241410@unknown@formal@none@1@S@In most other dialects it is merged with /w/; in some dialects of Scots it is merged with /f/.@@@@1@19@@danf@17-8-2009
10241420@unknown@formal@none@1@S@==== Voicing and aspiration ====@@@@1@5@@danf@17-8-2009
10241430@unknown@formal@none@1@S@[[Voice (phonetics)|Voicing]] and [[aspiration (phonetics)|aspiration]] of [[stop consonant]]s in English depend on dialect and context, but a few general rules can be given:@@@@1@23@@danf@17-8-2009
10241440@unknown@formal@none@1@S@* Voiceless [[stop consonant|plosives]] and [[affricate consonant|affricates]] (/{{IPA | p}}/, /{{IPA | t}}/, /{{IPA | k}}/, and /{{IPA | tʃ}}/) are aspirated when they are word-initial or begin a stressed syllable — compare ''pin'' {{IPA | [pʰɪn]}} and ''spin'' {{IPA | [spɪn]}}, ''crap'' {{IPA | [kʰɹ̥æp]}} and ''scrap'' {{IPA | [skɹæp]}}.@@@@1@51@@danf@17-8-2009
10241450@unknown@formal@none@1@S@** In some dialects, aspiration extends to unstressed syllables as well.@@@@1@11@@danf@17-8-2009
10241460@unknown@formal@none@1@S@** In other dialects, such as [[Indian English]], all voiceless stops remain unaspirated.@@@@1@13@@danf@17-8-2009
10241470@unknown@formal@none@1@S@* Word-initial voiced plosives may be devoiced in some dialects.@@@@1@10@@danf@17-8-2009
10241480@unknown@formal@none@1@S@* Word-terminal voiceless plosives may be unreleased or accompanied by a glottal stop in some dialects (e.g. many varieties of [[American English]]) — examples: ''tap'' [{{IPA |tʰæp̚}}], ''sack'' [{{IPA |sæk̚}}].@@@@1@30@@danf@17-8-2009
10241490@unknown@formal@none@1@S@* Word-terminal voiced plosives may be devoiced in some dialects (e.g. some varieties of [[American English]]) — examples: ''sad'' [{{IPA |sæd̥}}], ''bag'' [{{IPA |bæɡ̊}}].@@@@1@24@@danf@17-8-2009
10241500@unknown@formal@none@1@S@In other dialects they are fully voiced in final position, but only partially voiced in initial position.@@@@1@17@@danf@17-8-2009
10241510@unknown@formal@none@1@S@=== Supra-segmental features ===@@@@1@4@@danf@17-8-2009
10241520@unknown@formal@none@1@S@==== Tone groups ====@@@@1@4@@danf@17-8-2009
10241530@unknown@formal@none@1@S@English is an [[Intonation (linguistics)|intonation language]]. This means that the [[pitch (music)|pitch]] of the [[human voice|voice]] is used [[Syntax|syntactically]], for example, to convey [[surprise (emotion)|surprise]] and [[irony]], or to change a [[sentence (linguistics)|statement]] into a [[question]].@@@@1@36@@danf@17-8-2009
10241540@unknown@formal@none@1@S@In English, intonation patterns are on groups of words, which are called tone groups, tone units, intonation groups or sense groups.@@@@1@21@@danf@17-8-2009
10241550@unknown@formal@none@1@S@Tone groups are said on a single breath and, as a consequence, are of limited length, averaging about five words or lasting roughly two seconds.@@@@1@25@@danf@17-8-2009
10241560@unknown@formal@none@1@S@For example:@@@@1@2@@danf@17-8-2009
10241570@unknown@formal@none@1@S@: -{{IPA | /duː juː niːd ˈɛnɪˌθɪŋ/}} ''Do you need anything?''@@@@1@11@@danf@17-8-2009
10241580@unknown@formal@none@1@S@: -{{IPA | /aɪ dəʊnt | nəʊ/}} ''I don't, no''@@@@1@10@@danf@17-8-2009
10241590@unknown@formal@none@1@S@: -{{IPA | /aɪ dəʊnt nəʊ/}} ''I don't know'' (contracted to, for example, -{{IPA | /aɪ dəʊnəʊ/}} or {{IPA | /aɪ dənəʊ/}} ''I dunno'' in fast or colloquial speech that de-emphasises the pause between don't and know even further)@@@@1@39@@danf@17-8-2009
10241600@unknown@formal@none@1@S@==== Characteristics of intonation ====@@@@1@5@@danf@17-8-2009
10241610@unknown@formal@none@1@S@English is a strongly stressed language, in that certain syllables, both within words and within phrases, get a relative prominence/loudness during pronunciation while the others do not.@@@@1@27@@danf@17-8-2009
10241620@unknown@formal@none@1@S@The former kind of syllables are said to be ''accentuated/stressed'' and the latter are ''unaccentuated/unstressed''.@@@@1@15@@danf@17-8-2009
10241630@unknown@formal@none@1@S@All good dictionaries of English mark the accentuated syllable(s) by placing an apostrophe-like ( {{IPA | ˈ}} ) sign either before (as in [[International Phonetic Alphabet|IPA]], [[Oxford English Dictionary]], or [[Merriam-Webster]] dictionaries) or after (as in many other dictionaries) the syllable where the stress accent falls.@@@@1@46@@danf@17-8-2009
10241640@unknown@formal@none@1@S@Hence in a sentence, each tone group can be subdivided into syllables, which can either be stressed (strong) or unstressed (weak).@@@@1@21@@danf@17-8-2009
10241650@unknown@formal@none@1@S@The most strongly stressed syllable is called the nuclear syllable.@@@@1@10@@danf@17-8-2009
10241660@unknown@formal@none@1@S@For example:@@@@1@2@@danf@17-8-2009
10241670@unknown@formal@none@1@S@: ''That | was | the | '''best''' | thing | you | could | have | '''done'''!''@@@@1@18@@danf@17-8-2009
10241680@unknown@formal@none@1@S@Here, all syllables are unstressed, except the syllables/words ''best'' and ''done'', which are stressed.@@@@1@14@@danf@17-8-2009
10241690@unknown@formal@none@1@S@''Best'' is stressed more strongly and, therefore, is the nuclear syllable.@@@@1@11@@danf@17-8-2009
10241700@unknown@formal@none@1@S@The nuclear syllable carries the main point the speaker wishes to make.@@@@1@12@@danf@17-8-2009
10241710@unknown@formal@none@1@S@For example:@@@@1@2@@danf@17-8-2009
10241720@unknown@formal@none@1@S@: ''John'' had not stolen that money. (...@@@@1@8@@danf@17-8-2009
10241730@unknown@formal@none@1@S@Someone else had.)@@@@1@3@@danf@17-8-2009
10241740@unknown@formal@none@1@S@: John ''had not'' stolen that money. (...@@@@1@8@@danf@17-8-2009
10241750@unknown@formal@none@1@S@Someone said he had. or ...@@@@1@6@@danf@17-8-2009
10241760@unknown@formal@none@1@S@Not at that time, but later he did.)@@@@1@8@@danf@17-8-2009
10241770@unknown@formal@none@1@S@: John had not ''stolen'' that money. (...@@@@1@8@@danf@17-8-2009
10241780@unknown@formal@none@1@S@He acquired the money by some other means.)@@@@1@8@@danf@17-8-2009
10241790@unknown@formal@none@1@S@: John had not stolen ''that'' money. (...@@@@1@8@@danf@17-8-2009
10241800@unknown@formal@none@1@S@He had stolen some other money.)@@@@1@6@@danf@17-8-2009
10241810@unknown@formal@none@1@S@: John had not stolen that ''money''. (...@@@@1@8@@danf@17-8-2009
10241820@unknown@formal@none@1@S@He had stolen something else.)@@@@1@5@@danf@17-8-2009
10241830@unknown@formal@none@1@S@Also@@@@1@1@@danf@17-8-2009
10241840@unknown@formal@none@1@S@: ''I'' did not tell her that. (...@@@@1@8@@danf@17-8-2009
10241850@unknown@formal@none@1@S@Someone else told her)@@@@1@4@@danf@17-8-2009
10241860@unknown@formal@none@1@S@: I ''did not'' tell her that. (...@@@@1@8@@danf@17-8-2009
10241870@unknown@formal@none@1@S@You said I did. or ... but now I will)@@@@1@10@@danf@17-8-2009
10241880@unknown@formal@none@1@S@: I did not ''tell'' her that. (...@@@@1@8@@danf@17-8-2009
10241890@unknown@formal@none@1@S@I did not say it; she could have inferred it, etc)@@@@1@11@@danf@17-8-2009
10241900@unknown@formal@none@1@S@: I did not tell ''her'' that. (...@@@@1@8@@danf@17-8-2009
10241910@unknown@formal@none@1@S@I told someone else)@@@@1@4@@danf@17-8-2009
10241920@unknown@formal@none@1@S@: I did not tell her ''that''. (...@@@@1@8@@danf@17-8-2009
10241930@unknown@formal@none@1@S@I told her something else)@@@@1@5@@danf@17-8-2009
10241940@unknown@formal@none@1@S@This can also be used to express emotion:@@@@1@8@@danf@17-8-2009
10241950@unknown@formal@none@1@S@: ''Oh'' really? (...I did not know that)@@@@1@8@@danf@17-8-2009
10241960@unknown@formal@none@1@S@: Oh ''really''? (...I disbelieve you. or ...@@@@1@8@@danf@17-8-2009
10241970@unknown@formal@none@1@S@That's blatantly obvious)@@@@1@3@@danf@17-8-2009
10241980@unknown@formal@none@1@S@The nuclear syllable is spoken more loudly than the others and has a characteristic '''change of pitch'''.@@@@1@17@@danf@17-8-2009
10241990@unknown@formal@none@1@S@The changes of pitch most commonly encountered in English are the '''rising pitch''' and the '''falling pitch''', although the '''fall-rising pitch''' and/or the '''rise-falling pitch''' are sometimes used.@@@@1@28@@danf@17-8-2009
10242000@unknown@formal@none@1@S@In this opposition between falling and rising pitch, which plays a larger role in English than in most other languages, falling pitch conveys certainty and rising pitch uncertainty.@@@@1@28@@danf@17-8-2009
10242010@unknown@formal@none@1@S@This can have a crucial impact on meaning, specifically in relation to polarity, the positive–negative opposition; thus, falling pitch means "polarity known", while rising pitch means "polarity unknown".@@@@1@28@@danf@17-8-2009
10242020@unknown@formal@none@1@S@This underlies the rising pitch of yes/no questions.@@@@1@8@@danf@17-8-2009
10242030@unknown@formal@none@1@S@For example:@@@@1@2@@danf@17-8-2009
10242040@unknown@formal@none@1@S@: ''When do you want to be paid?''@@@@1@8@@danf@17-8-2009
10242050@unknown@formal@none@1@S@: ''Now?''@@@@1@2@@danf@17-8-2009
10242060@unknown@formal@none@1@S@(Rising pitch.@@@@1@2@@danf@17-8-2009
10242070@unknown@formal@none@1@S@In this case, it denotes a question: "Can I be paid now?" or "Do you desire to pay now?")@@@@1@19@@danf@17-8-2009
10242080@unknown@formal@none@1@S@: ''Now.''@@@@1@2@@danf@17-8-2009
10242090@unknown@formal@none@1@S@(Falling pitch.@@@@1@2@@danf@17-8-2009
10242100@unknown@formal@none@1@S@In this case, it denotes a statement: "I choose to be paid now.")@@@@1@13@@danf@17-8-2009
10242110@unknown@formal@none@1@S@== Grammar ==@@@@1@3@@danf@17-8-2009
10242120@unknown@formal@none@1@S@English grammar has minimal [[inflection]] compared with most other [[Indo-European languages]].@@@@1@11@@danf@17-8-2009
10242130@unknown@formal@none@1@S@For example, Modern English, unlike Modern German or Dutch and the [[Romance languages]], lacks [[grammatical gender]] and [[Agreement (linguistics)|adjectival agreement]].@@@@1@20@@danf@17-8-2009
10242140@unknown@formal@none@1@S@[[Grammatical case|Case]] marking has almost disappeared from the language and mainly survives in [[pronoun]]s.@@@@1@14@@danf@17-8-2009
10242150@unknown@formal@none@1@S@The patterning of [[Strong inflection|strong]] (e.g. ''speak/spoke/spoken'') versus [[Germanic weak verb|weak verbs]] inherited from its Germanic origins has declined in importance in modern English, and the remnants of inflection (such as [[plural]] marking) have become more regular.@@@@1@37@@danf@17-8-2009
10242160@unknown@formal@none@1@S@At the same time, the language has become more [[Isolating language|analytic]], and has developed features such as [[modal verb]]s and [[word order]] as resources for conveying meaning.@@@@1@27@@danf@17-8-2009
10242170@unknown@formal@none@1@S@[[Auxiliary verb]]s mark constructions such as questions, negative polarity, the [[Grammatical voice|passive voice]] and progressive [[grammatical aspect|aspect]].@@@@1@17@@danf@17-8-2009
10242180@unknown@formal@none@1@S@== Vocabulary ==@@@@1@3@@danf@17-8-2009
10242190@unknown@formal@none@1@S@The English vocabulary has changed considerably over the centuries.@@@@1@9@@danf@17-8-2009
10242200@unknown@formal@none@1@S@As in many languages deriving from [[Proto-Indo-European language|Proto-Indo-European]] (PIE), many of the most common words in English trace their origin (through the Germanic branch) back to PIE.@@@@1@27@@danf@17-8-2009
10242210@unknown@formal@none@1@S@Such words include the basic pronouns ''I'', from [[Old English language|Old English]] ''ic'' (cf. Latin ''ego'', Greek ''ego'', Sanskrit ''aham''), ''me'' (cf. Latin ''me'', Greek ''eme'', Sanskrit ''mam''), numbers (e.g. ''one'', ''two'', ''three'', cf. Latin ''unus, duo, tres'', Greek ''oinos'' "ace (on dice)", ''duo, treis''), common family relationships such as mother, father, brother, sister etc. (cf. Greek "meter", Latin "mater", Sanskrit "matṛ"; ''mother''), names of many animals (cf. Sanskrit ''mus'', Greek ''mys'', Latin ''mus''; ''mouse''), and many common verbs (cf. Greek ''gignōmi'', Latin ''gnoscere'', Hittite ''kanes''; ''to know'').@@@@1@88@@danf@17-8-2009
10242220@unknown@formal@none@1@S@Germanic words (generally words of Old English or to a lesser extent Norse origin) tend to be shorter than the Latinate words of English, and more common in ordinary speech.@@@@1@30@@danf@17-8-2009
10242230@unknown@formal@none@1@S@This includes nearly all the basic pronouns, prepositions, conjunctions, modal verbs etc. that form the basis of English syntax and grammar.@@@@1@21@@danf@17-8-2009
10242240@unknown@formal@none@1@S@The longer Latinate words are often regarded as more elegant or educated.@@@@1@12@@danf@17-8-2009
10242250@unknown@formal@none@1@S@However, the excessive use of Latinate words is considered at times to be either pretentious or an attempt to [[obfuscation|obfuscate]] an issue.@@@@1@22@@danf@17-8-2009
10242260@unknown@formal@none@1@S@[[George Orwell]]'s [[essay]] "[[Politics and the English Language]]" is critical of this, as well as other perceived misuse of the language.@@@@1@21@@danf@17-8-2009
10242270@unknown@formal@none@1@S@An English speaker is in many cases able to choose between Germanic and Latinate [[synonym]]s: ''come'' or ''arrive''; ''sight'' or ''vision''; ''freedom'' or ''liberty''.@@@@1@24@@danf@17-8-2009
10242280@unknown@formal@none@1@S@In some cases there is a choice between a Germanic derived word (''oversee''), a Latin derived word (''supervise''), and a French word derived from the same Latin word (''survey'').@@@@1@29@@danf@17-8-2009
10242290@unknown@formal@none@1@S@Such synonyms harbor a variety of different meanings and nuances, enabling the speaker to express fine variations or shades of thought.@@@@1@21@@danf@17-8-2009
10242300@unknown@formal@none@1@S@Familiarity with the [[etymology]] of groups of synonyms can give English speakers greater control over their [[Register (sociolinguistics)|linguistic register]].@@@@1@19@@danf@17-8-2009
10242310@unknown@formal@none@1@S@See: [[List of Germanic and Latinate equivalents in English]].@@@@1@9@@danf@17-8-2009
10242320@unknown@formal@none@1@S@An exception to this and a peculiarity perhaps unique to English is that the nouns for meats are commonly different from, and unrelated to, those for the animals from which they are produced, the animal commonly having a Germanic name and the meat having a French-derived one.@@@@1@47@@danf@17-8-2009
10242330@unknown@formal@none@1@S@Examples include: ''[[deer]]'' and ''[[venison]]''; ''[[cattle|cow]]'' and ''[[beef]]''; ''swine''/''[[pig]]'' and ''[[pork]]'', or ''[[domestic sheep|sheep]]'' and ''[[lamb and mutton|mutton]]''.@@@@1@18@@danf@17-8-2009
10242340@unknown@formal@none@1@S@This is assumed to be a result of the aftermath of the Norman invasion, where a French-speaking elite were the consumers of the meat, produced by Anglo-Saxon lower classes.@@@@1@29@@danf@17-8-2009
10242350@unknown@formal@none@1@S@Since the majority of words used in informal settings will normally be Germanic, such words are often the preferred choices when a speaker wishes to make a point in an argument in a very direct way.@@@@1@36@@danf@17-8-2009
10242360@unknown@formal@none@1@S@A majority of Latinate words (or at least a majority of content words) will normally be used in more formal speech and writing, such as a [[court]]room or an [[encyclopedia]] article.@@@@1@31@@danf@17-8-2009
10242370@unknown@formal@none@1@S@However, there are other Latinate words that are used normally in everyday speech and do not sound formal; these are mainly words for concepts that no longer have Germanic words, and are generally assimilated better and in many cases do not appear Latinate.@@@@1@43@@danf@17-8-2009
10242380@unknown@formal@none@1@S@For instance, the words ''mountain'', ''valley'', ''river'', ''aunt'', ''uncle'', ''move'', ''use'', ''push'' and ''stay'' are all Latinate.@@@@1@17@@danf@17-8-2009
10242390@unknown@formal@none@1@S@English easily accepts technical terms into common usage and often imports new words and phrases.@@@@1@15@@danf@17-8-2009
10242400@unknown@formal@none@1@S@Examples of this phenomenon include: ''[[HTTP cookie|cookie]]'', ''[[Internet]]'' and ''[[Uniform Resource Locator|URL]]'' (technical terms), as well as ''[[genre]]'', ''[[über]]'', ''[[lingua franca]]'' and ''amigo'' (imported words/phrases from French, German, modern Latin, and Spanish, respectively).@@@@1@33@@danf@17-8-2009
10242410@unknown@formal@none@1@S@In addition, [[slang]] often provides new meanings for old words and phrases.@@@@1@12@@danf@17-8-2009
10242420@unknown@formal@none@1@S@In fact, this fluidity is so pronounced that a distinction often needs to be made between formal forms of English and contemporary usage.@@@@1@23@@danf@17-8-2009
10242430@unknown@formal@none@1@S@See also: [[sociolinguistics]].@@@@1@3@@danf@17-8-2009
10242440@unknown@formal@none@1@S@=== Number of words in English ===@@@@1@7@@danf@17-8-2009
10242450@unknown@formal@none@1@S@The ''General Explanations'' at the beginning of the ''Oxford English Dictionary'' states:@@@@1@12@@danf@17-8-2009
10242460@unknown@formal@none@1@S@The vocabulary of English is undoubtedly vast, but assigning a specific number to its size is more a matter of definition than of calculation.@@@@1@24@@danf@17-8-2009
10242470@unknown@formal@none@1@S@Unlike other languages, such as [[Académie française|French]], [[List of language regulators|German]], [[Real Academia Española|Spanish]] and [[Accademia della Crusca|Italian]], there is no [[List of language regulators|Academy]] to define officially accepted words and spellings.@@@@1@32@@danf@17-8-2009
10242480@unknown@formal@none@1@S@[[Neologism]]s are coined regularly in medicine, science and technology and other fields, and new [[slang]] is constantly developed.@@@@1@18@@danf@17-8-2009
10242490@unknown@formal@none@1@S@Some of these new words enter wide usage; others remain restricted to small circles.@@@@1@14@@danf@17-8-2009
10242500@unknown@formal@none@1@S@Foreign words used in immigrant communities often make their way into wider English usage.@@@@1@14@@danf@17-8-2009
10242510@unknown@formal@none@1@S@Archaic, dialectal, and regional words might or might not be widely considered as "English".@@@@1@14@@danf@17-8-2009
10242520@unknown@formal@none@1@S@The ''[[Oxford English Dictionary]],'' 2nd edition ''(OED2)'' includes over 600,000 definitions, following a rather inclusive policy:@@@@1@16@@danf@17-8-2009
10242530@unknown@formal@none@1@S@The editors of ''[[Webster's Dictionary|Webster's Third New International Dictionary, Unabridged]]'' (475,000 main headwords), in their preface, estimate the number to be much higher.@@@@1@23@@danf@17-8-2009
10242540@unknown@formal@none@1@S@It is estimated that about 25,000 words are added to the language each year.@@@@1@14@@danf@17-8-2009
10242550@unknown@formal@none@1@S@=== Word origins ===@@@@1@4@@danf@17-8-2009
10242560@unknown@formal@none@1@S@One of the consequences of the French influence is that the vocabulary of English is, to a certain extent, divided between those words which are [[Germanic languages|Germanic]] (mostly West Germanic, with a smaller influence from the North Germanic branch) and those which are "Latinate" (Latin-derived, either directly or from Norman French or other Romance languages).@@@@1@55@@danf@17-8-2009
10242570@unknown@formal@none@1@S@Numerous sets of statistics have been proposed to demonstrate the origins of English vocabulary.@@@@1@14@@danf@17-8-2009
10242580@unknown@formal@none@1@S@None, as yet, is considered definitive by most linguists.@@@@1@9@@danf@17-8-2009
10242590@unknown@formal@none@1@S@A computerised survey of about 80,000 words in the old ''Shorter Oxford Dictionary'' (3rd ed.) was published in ''Ordered Profusion'' by Thomas Finkenstaedt and Dieter Wolff (1973) that estimated the origin of English words as follows:@@@@1@36@@danf@17-8-2009
10242600@unknown@formal@none@1@S@*''[[Langues d'oïl|Langue d'oïl]]'', including French and [[Old Norman]]: [[List of English words of French origin|28.3%]]@@@@1@15@@danf@17-8-2009
10242610@unknown@formal@none@1@S@*Latin, including modern scientific and technical Latin: 28.24%@@@@1@8@@danf@17-8-2009
10242620@unknown@formal@none@1@S@*Other [[Germanic languages]] (including words directly inherited from [[Old English language|Old English]]): 25%@@@@1@13@@danf@17-8-2009
10242630@unknown@formal@none@1@S@*Greek: 5.32%@@@@1@2@@danf@17-8-2009
10242640@unknown@formal@none@1@S@*No etymology given: 4.03%@@@@1@4@@danf@17-8-2009
10242650@unknown@formal@none@1@S@*Derived from proper names: 3.28%@@@@1@5@@danf@17-8-2009
10242660@unknown@formal@none@1@S@*All other languages contributed less than 1%@@@@1@7@@danf@17-8-2009
10242670@unknown@formal@none@1@S@A survey by [[Joseph M. Williams]] in ''Origins of the English Language'' of 10,000 words taken from several thousand business letters gave this set of statistics:@@@@1@26@@danf@17-8-2009
10242680@unknown@formal@none@1@S@*French (langue d'oïl): 41%@@@@1@4@@danf@17-8-2009
10242690@unknown@formal@none@1@S@*"Native" English: 33%@@@@1@3@@danf@17-8-2009
10242700@unknown@formal@none@1@S@*Latin: 15%@@@@1@2@@danf@17-8-2009
10242710@unknown@formal@none@1@S@*Danish: 2%@@@@1@2@@danf@17-8-2009
10242720@unknown@formal@none@1@S@*Dutch: 1%@@@@1@2@@danf@17-8-2009
10242730@unknown@formal@none@1@S@*Other: 10%@@@@1@2@@danf@17-8-2009
10242740@unknown@formal@none@1@S@However, 83% of the 1,000 most-common, and all of the 100 most-common, English words are Germanic.@@@@1@16@@danf@17-8-2009
10242750@unknown@formal@none@1@S@==== Dutch origins ====@@@@1@4@@danf@17-8-2009
10242760@unknown@formal@none@1@S@Words describing the navy, types of ships, and other objects or activities on the water are often of Dutch origin.@@@@1@20@@danf@17-8-2009
10242770@unknown@formal@none@1@S@''Yacht'' (''jacht'') and ''cruiser'' (''kruiser'') are examples.@@@@1@7@@danf@17-8-2009
10242780@unknown@formal@none@1@S@==== French origins ====@@@@1@4@@danf@17-8-2009
10242790@unknown@formal@none@1@S@There are many [[List of English words of French origin|words of French origin in English]], such as ''competition'', ''art'', ''table'', ''publicity'', ''police'', ''role'', ''routine'', ''machine'', ''force'', and many others that have been and are being [[anglicisation|anglicised]]; they are now pronounced according to English rules of [[phonology]], rather than French.@@@@1@49@@danf@17-8-2009
10242800@unknown@formal@none@1@S@A large portion of English vocabulary is of French or [[Langues d'oïl]] origin, most derived from, or transmitted via, the [[Anglo-Norman language|Anglo-Norman]] spoken by the [[upper class]]es in [[England]] for several hundred years after the [[Norman conquest of England]].@@@@1@39@@danf@17-8-2009
10242810@unknown@formal@none@1@S@== Writing system ==@@@@1@4@@danf@17-8-2009
10242820@unknown@formal@none@1@S@English has been written using the [[Latin alphabet]] since around the ninth century.@@@@1@13@@danf@17-8-2009
10242830@unknown@formal@none@1@S@(Before that, Old English had been written using [[Anglo-Saxon runes]].)@@@@1@10@@danf@17-8-2009
10242840@unknown@formal@none@1@S@The spelling system, or [[orthography]], is multilayered, with elements of French, Latin and Greek spelling on top of the native Germanic system; it has grown to vary significantly from the [[phonology]] of the language.@@@@1@34@@danf@17-8-2009
10242850@unknown@formal@none@1@S@The spelling of words often diverges considerably from how they are spoken.@@@@1@12@@danf@17-8-2009
10242860@unknown@formal@none@1@S@Though letters and sounds may not correspond in isolation, spelling rules that take into account syllable structure, phonetics, and accents are 75% or more reliable.@@@@1@25@@danf@17-8-2009
10242870@unknown@formal@none@1@S@Some phonics spelling advocates claim that English is more than 80% phonetic.@@@@1@12@@danf@17-8-2009
10242880@unknown@formal@none@1@S@In general, [[history of the English language|the English language]], being the product of many other languages and having only been codified orthographically in the 16th century, has fewer consistent relationships between sounds and letters than many other languages.@@@@1@38@@danf@17-8-2009
10242890@unknown@formal@none@1@S@The consequence of this orthographic history is that reading can be challenging.@@@@1@12@@danf@17-8-2009
10242900@unknown@formal@none@1@S@It takes longer for students to become completely fluent readers of English than of many other languages, including French, Greek, and Spanish.@@@@1@22@@danf@17-8-2009
10242910@unknown@formal@none@1@S@=== Basic sound-letter correspondence ===@@@@1@5@@danf@17-8-2009
10242920@unknown@formal@none@1@S@Only the consonant letters are pronounced in a relatively regular way:@@@@1@11@@danf@17-8-2009
10242930@unknown@formal@none@1@S@=== Written accents ===@@@@1@4@@danf@17-8-2009
10242940@unknown@formal@none@1@S@Unlike most other Germanic languages, English has almost no [[diacritic]]s except in foreign [[loanword]]s (like the [[acute accent]] in ''café''), and in the uncommon use of a [[diaeresis]] mark (often in formal writing) to indicate that two vowels are pronounced separately, rather than as one sound (e.g. ''naïve, Zoë'').@@@@1@49@@danf@17-8-2009
10242950@unknown@formal@none@1@S@It is almost always acceptable to leave out the marks, especially in digital communications where the [[QWERTY]] keyboard lacks any marked letters, but it depends on the context where the word is used.@@@@1@33@@danf@17-8-2009
10242960@unknown@formal@none@1@S@Some English words retain the diacritic to distinguish them from others, such as ''[[Animé (oleo-resin)|animé]], [[Investigative journalism|exposé]], [[Lamé (fencing)|lamé]], [[öre]], [[øre]], [[pâté]], [[piqué]],'' and ''[[rosé]]'', though these are sometimes also dropped (''[[résumé]]/resumé'' is usually spelled ''resume'' in the United States).@@@@1@40@@danf@17-8-2009
10242970@unknown@formal@none@1@S@There are loan words that occasionally take a diacritic which is absent from the original word to represent their pronunciation, such as ''maté'', from Spanish ''[[yerba mate]]'', following the French usage, but they are extremely rare.@@@@1@36@@danf@17-8-2009
10242980@unknown@formal@none@1@S@== Formal written English ==@@@@1@5@@danf@17-8-2009
10242990@unknown@formal@none@1@S@A version of the language almost universally agreed upon by educated English speakers around the world is called [[formal written English]].@@@@1@21@@danf@17-8-2009
10243000@unknown@formal@none@1@S@It takes virtually the same form no matter where in the English-speaking world it is written.@@@@1@16@@danf@17-8-2009
10243010@unknown@formal@none@1@S@In spoken English, by contrast, there are a vast number of differences between [[dialect]]s, [[Accent (linguistics)|accents]], and varieties of [[slang]], colloquial and regional expressions.@@@@1@24@@danf@17-8-2009
10243020@unknown@formal@none@1@S@In spite of this, local variations in the formal written version of the language are quite limited, being restricted largely to the [[American and British English spelling differences|spelling differences between British and American English]].@@@@1@34@@danf@17-8-2009
10243030@unknown@formal@none@1@S@== Basic and simplified versions ==@@@@1@6@@danf@17-8-2009
10243040@unknown@formal@none@1@S@To make English easier to read, there are some simplified versions of the language.@@@@1@14@@danf@17-8-2009
10243050@unknown@formal@none@1@S@One basic version is named ''[[Basic English]]'', a [[constructed language]] with a small number of words created by [[Charles Kay Ogden]] and described in his book ''Basic English: A General Introduction with Rules and Grammar'' (1930).@@@@1@36@@danf@17-8-2009
10243060@unknown@formal@none@1@S@The language is based on a simplified version of English.@@@@1@10@@danf@17-8-2009
10243070@unknown@formal@none@1@S@Ogden said that it would take seven years to learn English, seven months for [[Esperanto]], and seven weeks for Basic English, comparable with [[Ido]].@@@@1@24@@danf@17-8-2009
10243080@unknown@formal@none@1@S@Thus Basic English is used by companies who need to make complex books for international use, and by language schools that need to give people some knowledge of English in a short time.@@@@1@33@@danf@17-8-2009
10243090@unknown@formal@none@1@S@Ogden did not put any words into Basic English that could be said with a few other words and he worked to make the words work for speakers of any other language.@@@@1@32@@danf@17-8-2009
10243100@unknown@formal@none@1@S@He put his set of words through a large number of tests and adjustments.@@@@1@14@@danf@17-8-2009
10243110@unknown@formal@none@1@S@He also made the grammar simpler, but tried to keep the grammar normal for English users.@@@@1@16@@danf@17-8-2009
10243120@unknown@formal@none@1@S@The concept gained its greatest publicity just after the [[World War II|Second World War]] as a tool for world peace.@@@@1@20@@danf@17-8-2009
10243130@unknown@formal@none@1@S@Although it was not built into a program, similar simplifications were devised for various international uses.@@@@1@16@@danf@17-8-2009
10243140@unknown@formal@none@1@S@Another version, [[Simplified English]], is a [[Controlled natural language|controlled language]] that was originally developed for [[aerospace]] industry maintenance manuals.@@@@1@19@@danf@17-8-2009
10243150@unknown@formal@none@1@S@It offers a carefully limited and standardised subset of English.@@@@1@10@@danf@17-8-2009
10243160@unknown@formal@none@1@S@Simplified English has a lexicon of approved words and those words can only be used in certain ways.@@@@1@18@@danf@17-8-2009
10243170@unknown@formal@none@1@S@For example, the word ''close'' can be used in the phrase "Close the door" but not "do not go close to the landing gear".@@@@1@24@@danf@17-8-2009
10250010@unknown@formal@none@1@S@Esperanto@@@@1@1@@danf@17-8-2009
10250020@unknown@formal@none@1@S@is by far the most widely spoken [[constructed language|constructed]] [[international auxiliary language]] in the world.@@@@1@15@@danf@17-8-2009
10250030@unknown@formal@none@1@S@Its name derives from ''Doktoro Esperanto,'' the [[pseudonym]] under which [[L. L. Zamenhof]] published the first book detailing Esperanto, the ''[[Unua Libro]],'' in 1887.@@@@1@24@@danf@17-8-2009
10250040@unknown@formal@none@1@S@The word ''esperanto'' means 'one who hopes' in the language itself.@@@@1@11@@danf@17-8-2009
10250050@unknown@formal@none@1@S@Zamenhof's goal was to create an easy and flexible language that would serve as a universal [[second language]] to foster peace and international understanding.@@@@1@24@@danf@17-8-2009
10250060@unknown@formal@none@1@S@Esperanto has had continuous usage by a community estimated at between 100,000 and 2 million speakers for over a century.@@@@1@20@@danf@17-8-2009
10250070@unknown@formal@none@1@S@By most estimates, there are approximately one thousand [[Native Esperanto speakers|native speakers]].@@@@1@12@@danf@17-8-2009
10250080@unknown@formal@none@1@S@However, no country has adopted the language [[official language|officially]].@@@@1@9@@danf@17-8-2009
10250090@unknown@formal@none@1@S@Today, Esperanto is employed in world travel, correspondence, cultural exchange, conventions, literature, language instruction, television, and radio broadcasting.@@@@1@18@@danf@17-8-2009
10250100@unknown@formal@none@1@S@Also, there is an [[Esperanto Wikipedia]] that contains over 100,000 articles as of June 2008.@@@@1@15@@danf@17-8-2009
10250110@unknown@formal@none@1@S@There is evidence that [[Propaedeutic value of Esperanto|learning Esperanto may provide a good foundation for learning languages in general]].@@@@1@19@@danf@17-8-2009
10250120@unknown@formal@none@1@S@Some state education systems offer basic instruction and elective courses in Esperanto.@@@@1@12@@danf@17-8-2009
10250130@unknown@formal@none@1@S@Esperanto is also the language of instruction in one university, the [[Akademio Internacia de la Sciencoj San Marino|Akademio Internacia de la Sciencoj]] in [[San Marino]].@@@@1@25@@danf@17-8-2009
10250140@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009
10250150@unknown@formal@none@1@S@Esperanto was developed in the late 1870s and early 1880s by [[ophthalmology|ophthalmologist]] [[L. L. Zamenhof|Dr. Ludovic Lazarus Zamenhof]], an [[Ashkenazi Jew]] from [[Bialystok]], now in [[Poland]] and previously in the [[Polish-Lithuanian Commonwealth]], but at the time part of the [[Russian Empire]].@@@@1@41@@danf@17-8-2009
10250160@unknown@formal@none@1@S@After some ten years of development, which Zamenhof spent translating literature into the language as well as writing original [[prose]] and [[Poetry|verse]], the [[Unua Libro|first book of Esperanto grammar]] was published in [[Warsaw]] in July 1887.@@@@1@36@@danf@17-8-2009
10250170@unknown@formal@none@1@S@The number of speakers grew rapidly over the next few decades, at first primarily in the [[Russian Empire]] and [[Eastern Europe]], then in [[Western Europe]], the [[Americas]], [[China]], and [[Japan]].@@@@1@30@@danf@17-8-2009
10250180@unknown@formal@none@1@S@In the early years, speakers of Esperanto kept in contact primarily through correspondence and [[magazine|periodicals]], but in 1905 the first [[World Congress of Esperanto|world congress of Esperanto speakers]] was held in [[Boulogne-sur-Mer]], [[France]].@@@@1@33@@danf@17-8-2009
10250190@unknown@formal@none@1@S@Since then world congresses have been held in different countries every year, except during the two [[world war|World Wars]].@@@@1@19@@danf@17-8-2009
10250200@unknown@formal@none@1@S@Since the Second World War, they have been attended by an average of over 2000 and up to 6000 people.@@@@1@20@@danf@17-8-2009
10250210@unknown@formal@none@1@S@===Relation to 20th-century totalitarianism===@@@@1@4@@danf@17-8-2009
10250220@unknown@formal@none@1@S@As a potential vehicle for international understanding, Esperanto attracted the suspicion of many [[totalitarian]] states.@@@@1@15@@danf@17-8-2009
10250230@unknown@formal@none@1@S@The situation was especially pronounced in [[Nazi Germany]] and in the [[Soviet Union]] under [[Joseph Stalin]].@@@@1@16@@danf@17-8-2009
10250240@unknown@formal@none@1@S@In Germany, there was additional motivation to persecute Esperanto because Zamenhof was a Jew.@@@@1@14@@danf@17-8-2009
10250250@unknown@formal@none@1@S@In his work ''[[Mein Kampf]],'' [[Hitler]] mentioned Esperanto as an example of a language that would be used by an [[International Jewry|International]] [[Jewish conspiracy|Jewish Conspiracy]] once they achieved [[world domination]].@@@@1@30@@danf@17-8-2009
10250260@unknown@formal@none@1@S@[[Esperantist]]s were executed during [[the Holocaust]], with Zamenhof's family in particular singled out for execution.@@@@1@15@@danf@17-8-2009
10250270@unknown@formal@none@1@S@In the early years of the Soviet Union, Esperanto was given a measure of government support, and an officially recognized Soviet Esperanto Association came into being.@@@@1@26@@danf@17-8-2009
10250280@unknown@formal@none@1@S@However, in 1937, Stalin reversed this policy.@@@@1@7@@danf@17-8-2009
10250290@unknown@formal@none@1@S@He denounced Esperanto as "the language of spies" and had Esperantists executed.@@@@1@12@@danf@17-8-2009
10250300@unknown@formal@none@1@S@The use of Esperanto remained illegal until 1956.@@@@1@8@@danf@17-8-2009
10250310@unknown@formal@none@1@S@==Official use==@@@@1@2@@danf@17-8-2009
10250320@unknown@formal@none@1@S@Esperanto has never been an official language of any recognized country.@@@@1@11@@danf@17-8-2009
10250330@unknown@formal@none@1@S@However, there were plans at the beginning of the 20th century to establish [[Moresnet|Neutral Moresnet]] as the world's first Esperanto state.@@@@1@21@@danf@17-8-2009
10250340@unknown@formal@none@1@S@In China, there was talk in some circles after the 1911 [[Xinhai Revolution]] about officially replacing [[Chinese language|Chinese]] with Esperanto as a means to dramatically bring the country into the twentieth century, though this policy proved untenable.@@@@1@37@@danf@17-8-2009
10250350@unknown@formal@none@1@S@In the summer of 1924, the [[American Radio Relay League]] adopted Esperanto as its official [[international auxiliary language]], and hoped that the language would be used by [[Amateur radio|radio amateurs]] in international communications, but its actual use for radio communications was negligible.@@@@1@42@@danf@17-8-2009
10250360@unknown@formal@none@1@S@In addition, the self-proclaimed [[artificial island]] [[micronation]] of [[Republic of Rose Island|Rose Island]] used Esperanto as its official language in 1968.@@@@1@21@@danf@17-8-2009
10250370@unknown@formal@none@1@S@Esperanto is the working language of several [[non-profit organization|non-profit]] international organizations such as the ''[[Sennacieca Asocio Tutmonda]]'', but most others are specifically Esperanto organizations.@@@@1@24@@danf@17-8-2009
10250380@unknown@formal@none@1@S@The largest of these, the [[World Esperanto Association]], has an official consultative relationship with the [[United Nations]] and [[UNESCO]].@@@@1@19@@danf@17-8-2009
10250390@unknown@formal@none@1@S@The U.S. Army has published military phrasebooks in Esperanto, to be used in [[Military simulation|wargames]] by mock enemy forces.@@@@1@19@@danf@17-8-2009
10250400@unknown@formal@none@1@S@Esperanto is also the first language of teaching and administration of the [[Akademio Internacia de la Sciencoj San Marino|International Academy of Sciences San Marino]], which is sometimes called an "Esperanto University".@@@@1@31@@danf@17-8-2009
10250410@unknown@formal@none@1@S@== Linguistic properties ==@@@@1@4@@danf@17-8-2009
10250420@unknown@formal@none@1@S@=== Classification ===@@@@1@3@@danf@17-8-2009
10250430@unknown@formal@none@1@S@As a [[constructed language]], Esperanto is not [[Genealogy|genealogically]] related to any [[ethnic group|ethnic]] language.@@@@1@14@@danf@17-8-2009
10250440@unknown@formal@none@1@S@It has been described as "a language [[lexicon|lexically]] predominantly [[Romance languages|Romanic]], [[morphology (linguistics)|morphologically]] intensively [[agglutination|agglutinative]] and to a certain degree [[isolating languages|isolating]] in character".@@@@1@24@@danf@17-8-2009
10250450@unknown@formal@none@1@S@The [[phonology]], [[grammar]], [[vocabulary]], and [[semantics]] are based on the western [[Indo-European languages]].@@@@1@13@@danf@17-8-2009
10250460@unknown@formal@none@1@S@The [[phoneme|phonemic inventory]] is essentially [[Slavic languages|Slavic]], as is much of the semantics, while the [[vocabulary]] derives primarily from the [[Romance languages]], with a lesser contribution from the [[Germanic languages]].@@@@1@30@@danf@17-8-2009
10250470@unknown@formal@none@1@S@[[Pragmatics]] and other aspects of the language not specified by Zamenhof's original documents were influenced by the native languages of early speakers, primarily [[Russian language|Russian]], [[Polish language|Polish]], [[German language|German]], and [[French language|French]].@@@@1@32@@danf@17-8-2009
10250480@unknown@formal@none@1@S@[[Linguistic typology|Typologically]], Esperanto has [[preposition]]s and a [[information flow|pragmatic word order]] that by default is ''[[Subject Verb Object]]'' and ''[[Word order|Adjective Noun]]''.@@@@1@22@@danf@17-8-2009
10250490@unknown@formal@none@1@S@New words are formed through extensive [[prefix (linguistics)|prefix]]ing and [[suffix]]ing.@@@@1@10@@danf@17-8-2009
10250500@unknown@formal@none@1@S@=== Writing system ===@@@@1@4@@danf@17-8-2009
10250510@unknown@formal@none@1@S@Esperanto is written with a modified version of the [[Latin alphabet]], including six [[Letter (alphabet)|letters]] with [[diacritic]]s: [[c-circumflex|ĉ]], [[g-circumflex|ĝ]], [[h-circumflex|ĥ]], [[j-circumflex|ĵ]], [[s-circumflex|ŝ]] and [[u-breve|ŭ]] (that is, ''c, g, h, j, s'' [[circumflex]], and ''u'' [[breve]]).@@@@1@35@@danf@17-8-2009
10250520@unknown@formal@none@1@S@The alphabet does not include the letters ''q, w, x,'' or ''y'' except in unassimilated foreign names.@@@@1@17@@danf@17-8-2009
10250530@unknown@formal@none@1@S@The 28-letter alphabet is: '''a b c ĉ d e f g ĝ h ĥ i j ĵ k l m n o p r s ŝ t u ŭ v z'''@@@@1@32@@danf@17-8-2009
10250540@unknown@formal@none@1@S@All letters are pronounced approximately as in the [[IPA]], with the exception of ''c'' and the accented letters:@@@@1@18@@danf@17-8-2009
10250550@unknown@formal@none@1@S@Two [[ASCII]]-compatible writing conventions are in use.@@@@1@7@@danf@17-8-2009
10250560@unknown@formal@none@1@S@These substitute [[Digraph (orthography)|digraph]]s for the accented letters.@@@@1@8@@danf@17-8-2009
10250570@unknown@formal@none@1@S@The original "h-convention" (''ch, gh, hh, jh, sh, u'') is based on English 'ch' and 'sh', while a more recent "[[x-convention]]" (''cx, gx, hx, jx, sx, ux'') is useful for alphabetic word sorting on a [[computer]] (''cx'' comes correctly after ''cu'', ''sx'' after ''sv'', etc.) as well as for simple conversion back into the standard [[orthography]].@@@@1@56@@danf@17-8-2009
10250580@unknown@formal@none@1@S@Another scheme represents the superscripted letters by a [[caret]] (^), as for example: c^ or ^c.@@@@1@16@@danf@17-8-2009
10250590@unknown@formal@none@1@S@=== Phonology ===@@@@1@3@@danf@17-8-2009
10250600@unknown@formal@none@1@S@:''(For help with the phonetic symbols, see [[Help:IPA]])''@@@@1@8@@danf@17-8-2009
10250610@unknown@formal@none@1@S@Esperanto has 22 [[consonant]]s, five [[vowel]]s, and two [[semivowel]]s, which combine with the vowels to form six [[diphthong]]s.@@@@1@18@@danf@17-8-2009
10250620@unknown@formal@none@1@S@(The consonant {{IPA|/j/}} and semivowel {{IPA|/i̯/}} are both written ''j''.)@@@@1@10@@danf@17-8-2009
10250625@unknown@formal@none@1@S@[[tone (linguistics)|Tone]] is not used to distinguish meanings of words.@@@@1@10@@danf@17-8-2009
10250630@unknown@formal@none@1@S@[[Stress (linguistics)|Stress]] is always on the penultimate vowel, unless a final vowel ''o'' is [[Elision|elided]], a practice which occurs mostly in [[poetry]].@@@@1@22@@danf@17-8-2009
10250640@unknown@formal@none@1@S@For example, ''familio'' "family" is stressed {{IPA2|fa.mi.ˈli.o}}, but when found without the final o, ''famili’,'' the stress does not shift: {{IPA|[fa.mi.ˈli]}}.@@@@1@21@@danf@17-8-2009
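The stress rule can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the function names are hypothetical, only ''a, e, i, o, u'' are counted as vowels (so the semivowels ''j'' and ''ŭ'' are skipped), and a trailing apostrophe is taken to mark an elided final ''o''.

```python
VOWELS = set("aeiou")  # the semivowels j and ŭ are deliberately excluded


def stressed_vowel_index(word: str):
    """Return the index of the stressed vowel, or None if there is no vowel."""
    lowered = word.lower()
    # A trailing apostrophe is assumed to mark an elided final -o.
    elided = lowered.endswith(("'", "’"))
    positions = [i for i, ch in enumerate(lowered) if ch in VOWELS]
    if not positions:
        return None
    if elided or len(positions) == 1:
        # After elision the stress stays where it was, which is now
        # the last written vowel.
        return positions[-1]
    return positions[-2]  # penultimate vowel


def mark_stress(word: str) -> str:
    """Show the stressed vowel in upper case."""
    i = stressed_vowel_index(word)
    if i is None:
        return word
    return word[:i] + word[i].upper() + word[i + 1:]


for w in ("familio", "famili’", "heroo", "praavo"):
    print(w, "->", mark_stress(w))
```

The upper-case marking is only a display convenience for this sketch; it is not a convention used in Esperanto writing.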
10250650@unknown@formal@none@1@S@==== Consonants ====@@@@1@3@@danf@17-8-2009
10250660@unknown@formal@none@1@S@The 22 consonants are:@@@@1@4@@danf@17-8-2009
10250670@unknown@formal@none@1@S@The sound {{IPA|/r/}} is usually [[alveolar trill|rolled]], but may be [[alveolar flap|tapped]] {{IPA|[ɾ]}}.@@@@1@13@@danf@17-8-2009
10250680@unknown@formal@none@1@S@The {{IPA|/v/}} has a normative pronunciation like an [[English language|English]] ''v,'' but is sometimes somewhere between a ''v'' and a ''w,'' {{IPA|[ʋ]}}, depending on the language background of the speaker.@@@@1@30@@danf@17-8-2009
10250690@unknown@formal@none@1@S@A semivowel {{IPA|/u̯/}} normally occurs only in [[diphthong]]s after the vowels {{IPA|/a/}} and {{IPA|/e/}}, not as a consonant {{IPA|*/w/}}.@@@@1@19@@danf@17-8-2009
10250700@unknown@formal@none@1@S@Common, if debated, [[assimilation (linguistics)|assimilation]] includes the pronunciation of {{IPA|/nk/}} as {{IPA|[ŋk]}}, as in English ''sink,'' and {{IPA|/kz/}} as {{IPA|[gz]}}, like the ''x'' in English ''example''.@@@@1@26@@danf@17-8-2009
10250710@unknown@formal@none@1@S@A large number of consonant clusters can occur, up to three in initial position and four in medial position, as in ''instrui'' "to teach".@@@@1@24@@danf@17-8-2009
10250720@unknown@formal@none@1@S@Final clusters are uncommon except in foreign names, poetic elision of final ''o,'' and a very few basic words such as ''cent'' "hundred" and ''post'' "after".@@@@1@26@@danf@17-8-2009
10250730@unknown@formal@none@1@S@====Vowels====@@@@1@1@@danf@17-8-2009
10250740@unknown@formal@none@1@S@Esperanto has the five [[cardinal vowels]] of [[Spanish language|Spanish]], [[Swahili language|Swahili]], and [[Modern Greek]].@@@@1@14@@danf@17-8-2009
10250750@unknown@formal@none@1@S@There are six falling diphthongs: ''uj, oj, ej, aj, aŭ, eŭ'' ({{IPA|/ui̯, oi̯, ei̯, ai̯, au̯, eu̯/}}).@@@@1@17@@danf@17-8-2009
10250760@unknown@formal@none@1@S@With only five vowels, a good deal of variation is tolerated.@@@@1@11@@danf@17-8-2009
10250770@unknown@formal@none@1@S@For instance, {{IPA|/e/}} commonly ranges from {{IPA|[e]}} (French ''é'') to {{IPA|[ɛ]}} (French ''è'').@@@@1@13@@danf@17-8-2009
10250780@unknown@formal@none@1@S@The details often depend on the speaker's native language.@@@@1@9@@danf@17-8-2009
10250790@unknown@formal@none@1@S@A [[glottal stop]] may occur between adjacent vowels in some people's speech, especially when the two vowels are the same, as in ''heroo'' "hero" ({{IPA|[he.ˈro.o]}} or {{IPA|[he.ˈro.ʔo]}}) and ''praavo'' "great-grandfather" ({{IPA|[pra.ˈa.vo]}} or {{IPA|[pra.ˈʔa.vo]}}).@@@@1@33@@danf@17-8-2009
10250800@unknown@formal@none@1@S@=== Grammar ===@@@@1@3@@danf@17-8-2009
10250810@unknown@formal@none@1@S@Esperanto words are [[Derivation (linguistics)|derived]] by stringing together [[prefix (linguistics)|prefix]]es, [[Root (linguistics)|roots]], and [[suffix]]es.@@@@1@14@@danf@17-8-2009
10250820@unknown@formal@none@1@S@This process is regular, so that people can create new words as they speak and be understood.@@@@1@17@@danf@17-8-2009
10250830@unknown@formal@none@1@S@[[Compound (linguistics)|Compound]] words are formed with a modifier-first, [[head (linguistics)|head-final]] order, the same order as English "birdsong" ''vs.'' "songbird".@@@@1@19@@danf@17-8-2009
10250840@unknown@formal@none@1@S@The different [[Part of speech|parts of speech]] are marked by their own suffixes: all [[common noun]]s end in ''-o,'' all [[adjective]]s in ''-a,'' all derived adverbs in ''-e,'' and all [[verb]]s in one of six [[Grammatical tense|tense]] and [[Grammatical mood|mood]] suffixes, such as [[present tense]] ''-as.''@@@@1@46@@danf@17-8-2009
10250850@unknown@formal@none@1@S@[[Grammatical number|Plural]] nouns end in ''-oj'' (pronounced "oy"), whereas [[direct object]]s end in ''-on.''@@@@1@14@@danf@17-8-2009
10250860@unknown@formal@none@1@S@Plural direct objects end with the combination ''-ojn'' (pronounced to rhyme with "coin"), that is, ''-o'' for a noun, plus ''-j'' for plural, plus ''-n'' for direct object.@@@@1@28@@danf@17-8-2009
10250870@unknown@formal@none@1@S@Adjectives [[Grammatical number#Effect of number on verbs and other parts of speech|agree]] with their nouns; their endings are plural ''-aj'' (pronounced "eye"), direct-object ''-an,'' and plural direct-object ''-ajn'' (pronounced to rhyme with "fine").@@@@1@33@@danf@17-8-2009
10250880@unknown@formal@none@1@S@The suffix ''-n'' is used to indicate the goal of movement and a few other things, in addition to the direct object.@@@@1@22@@danf@17-8-2009
10250890@unknown@formal@none@1@S@See [[Esperanto grammar]] for details.@@@@1@5@@danf@17-8-2009
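The way noun and adjective endings stack can be illustrated with a short Python sketch. The function name `decline` and its parameters are hypothetical, and the sketch covers only the regular endings described above (part of speech, then plural ''-j'', then ''-n'').

```python
def decline(root: str, kind: str = "noun",
            plural: bool = False, accusative: bool = False) -> str:
    """Build a noun or adjective form by stacking the regular endings."""
    base = {"noun": "o", "adjective": "a"}[kind]  # part-of-speech ending
    ending = base
    if plural:
        ending += "j"  # plural marker
    if accusative:
        ending += "n"  # direct-object marker
    return root + ending


# An adjective agrees with its noun, so both receive the same markers.
for plural in (False, True):
    for accusative in (False, True):
        print(decline("bon", "adjective", plural, accusative),
              decline("libr", "noun", plural, accusative))
```

Run as written, the loop prints the four combinations of a "good book" phrase, from ''bona libro'' to ''bonajn librojn''.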
10250900@unknown@formal@none@1@S@The six verb [[inflection]]s consist of three tenses and three moods.@@@@1@11@@danf@17-8-2009
10250910@unknown@formal@none@1@S@They are [[present tense]] ''-as,'' [[future tense]] ''-os,'' [[past tense]] ''-is,'' [[infinitive|infinitive mood]] ''-i,'' [[conditional mood]] ''-us,'' and [[jussive mood]] ''-u'' (used for wishes and commands).@@@@1@26@@danf@17-8-2009
10250920@unknown@formal@none@1@S@Verbs are not marked for person or number.@@@@1@8@@danf@17-8-2009
10250930@unknown@formal@none@1@S@For instance: ''kanti'' "to sing"; ''mi kantas'' "I sing"; ''mi kantis'' "I sang"; ''mi kantos'' "I will sing"; ''li kantas'' "he sings"; ''vi kantas'' "you sing".@@@@1@26@@danf@17-8-2009
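Because verbs are not marked for person or number, conjugation reduces to a lookup of six endings. The sketch below is illustrative only; the table and function names are assumptions rather than any established library.

```python
# The six verb endings listed above, keyed by tense or mood.
VERB_ENDINGS = {
    "infinitive": "i",
    "present": "as",
    "past": "is",
    "future": "os",
    "conditional": "us",
    "jussive": "u",
}


def conjugate(root: str, form: str) -> str:
    """Attach the ending for the given tense or mood to a verb root."""
    return root + VERB_ENDINGS[form]


# The same form serves every person and number.
for pronoun in ("mi", "li", "vi"):
    print(pronoun, conjugate("kant", "present"))
```

The subject pronoun carries all person and number information, so no further table is needed.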
10250940@unknown@formal@none@1@S@Word order is comparatively free: Adjectives may precede or follow nouns, and subjects, verbs and objects (marked by the suffix ''-n)'' may occur in any order.@@@@1@26@@danf@17-8-2009
10250950@unknown@formal@none@1@S@However, the [[article (grammar)|article]] ''la'' "the" and [[demonstrative]]s such as ''tiu'' "this, that" almost always come before the noun, and a [[preposition]] such as ''ĉe'' "at" ''must'' come before it.@@@@1@30@@danf@17-8-2009
10250960@unknown@formal@none@1@S@Similarly, the negative ''ne'' "not" and [[conjunction]]s such as ''kaj'' "both, and" and ''ke'' "that" must precede the [[phrase]] or [[clause]] they introduce.@@@@1@23@@danf@17-8-2009
10250970@unknown@formal@none@1@S@In [[copula]]r (A = B) clauses, word order is just as important as it is in English clauses like "people are dogs" ''vs.'' "dogs are people".@@@@1@26@@danf@17-8-2009
10250980@unknown@formal@none@1@S@====Correlatives====@@@@1@1@@danf@17-8-2009
10250990@unknown@formal@none@1@S@A [[correlative]] is a word used to ask or answer a question of who, where, what, when, or how.@@@@1@19@@danf@17-8-2009
10251000@unknown@formal@none@1@S@Correlatives in Esperanto are set out in a systematic manner that correlates a basic [[idea]] (quantity, manner, time, ''etc.'') to a function (questioning, indicating, negating, ''etc.'').@@@@1@26@@danf@17-8-2009
10251010@unknown@formal@none@1@S@Examples:@@@@1@1@@danf@17-8-2009
10251020@unknown@formal@none@1@S@*''Kio estas tio?''@@@@1@3@@danf@17-8-2009
10251030@unknown@formal@none@1@S@"What is that?"@@@@1@3@@danf@17-8-2009
10251040@unknown@formal@none@1@S@*''Kioma estas la horo?''@@@@1@4@@danf@17-8-2009
10251050@unknown@formal@none@1@S@"What time is it?"@@@@1@4@@danf@17-8-2009
10251060@unknown@formal@none@1@S@Note ''kioma'' rather than ''Kiu estas la horo?'' "which is the hour?", when asking for the ranking order of the hour on the clock.@@@@1@24@@danf@17-8-2009
10251070@unknown@formal@none@1@S@*''Io falis el la ŝranko'' "Something fell out of the cupboard."@@@@1@11@@danf@17-8-2009
10251080@unknown@formal@none@1@S@*''Homoj tiaj kiel mi ne konadas timon.''@@@@1@7@@danf@17-8-2009
10251090@unknown@formal@none@1@S@"Men such as me know no fear."@@@@1@7@@danf@17-8-2009
10251100@unknown@formal@none@1@S@Correlatives are declined if the case demands it:@@@@1@8@@danf@17-8-2009
10251110@unknown@formal@none@1@S@*''Vi devas elekti ian vorton pli simplan'' "You should choose a (some kind of) simpler word."@@@@1@16@@danf@17-8-2009
10251120@unknown@formal@none@1@S@''Ia'' receives ''-n'' because it's part of the [[direct object]].@@@@1@10@@danf@17-8-2009
10251130@unknown@formal@none@1@S@*''Kian libron vi volas?''@@@@1@4@@danf@17-8-2009
10251140@unknown@formal@none@1@S@"What sort of book do you want?"@@@@1@7@@danf@17-8-2009
10251150@unknown@formal@none@1@S@Contrast this with, ''Kiun libron vi volas?''@@@@1@7@@danf@17-8-2009
10251160@unknown@formal@none@1@S@"Which book do you want?"@@@@1@5@@danf@17-8-2009
10251170@unknown@formal@none@1@S@=== Vocabulary ===@@@@1@3@@danf@17-8-2009
10251180@unknown@formal@none@1@S@The core vocabulary of Esperanto was defined by ''Lingvo internacia'', published by Zamenhof in 1887.@@@@1@15@@danf@17-8-2009
10251190@unknown@formal@none@1@S@It comprised 900 roots, which could be expanded into tens of thousands of words with prefixes, suffixes, and compounding.@@@@1@19@@danf@17-8-2009
10251200@unknown@formal@none@1@S@In 1894, Zamenhof published the first Esperanto [[dictionary]], ''Universala Vortaro'', with a larger set of roots.@@@@1@16@@danf@17-8-2009
10251210@unknown@formal@none@1@S@However, the rules of the language allowed speakers to borrow new roots as needed, recommending only that they look for the most international forms, and then derive related meanings from these.@@@@1@31@@danf@17-8-2009
10251220@unknown@formal@none@1@S@Since then, many words have been borrowed, primarily but not solely from the Western European languages.@@@@1@16@@danf@17-8-2009
10251230@unknown@formal@none@1@S@Not all proposed borrowings catch on, but many do, especially [[technical terminology|technical]] and [[science|scientific]] terms.@@@@1@15@@danf@17-8-2009
10251240@unknown@formal@none@1@S@Terms for everyday use, on the other hand, are more likely to be derived from existing roots—for example ''komputilo'' (a computer) from ''komputi'' (to compute) plus the suffix ''-ilo'' (tool)—or to be covered by extending the meanings of existing words (for example ''muso'' (a mouse), as in English, now also means a computer input device).@@@@1@55@@danf@17-8-2009
10251250@unknown@formal@none@1@S@There are frequent debates among Esperanto speakers about whether a particular borrowing is justified or whether the need can be met by deriving from or extending the meaning of existing words.@@@@1@31@@danf@17-8-2009
10251260@unknown@formal@none@1@S@In addition to the root words and the rules for combining them, a learner of Esperanto must memorize some idiomatic compounds that are not entirely straightforward.@@@@1@26@@danf@17-8-2009
10251270@unknown@formal@none@1@S@For example, ''eldoni'', literally "to give out", is used for "to publish" (a [[calque]] of words in several European languages with the same derivation), and ''vortaro'', literally "a collection of words", means "a glossary" or "a dictionary".@@@@1@37@@danf@17-8-2009
10251280@unknown@formal@none@1@S@Such forms are modeled after usage in some European languages, and speakers of other languages may find them illogical.@@@@1@19@@danf@17-8-2009
10251290@unknown@formal@none@1@S@Fossilized derivations inherited from Esperanto's source languages may be similarly obscure, such as the opaque connection the root word ''centralo'' "power station" has with ''centro'' "center".@@@@1@26@@danf@17-8-2009
10251300@unknown@formal@none@1@S@Compounds with ''-um-'' are overtly arbitrary, and must be learned individually, as ''-um-'' has no defined meaning.@@@@1@17@@danf@17-8-2009
10251310@unknown@formal@none@1@S@It turns ''dekstren'' "to the right" into ''dekstrumen'' "clockwise", and ''komuna'' "common/shared" into ''komunumo'' "community", for example.@@@@1@17@@danf@17-8-2009
10251320@unknown@formal@none@1@S@Nevertheless, there are not nearly as many idiomatic or [[slang]] words in Esperanto as in ethnic languages, as these tend to make international communication difficult, working against Esperanto's main goal.@@@@1@30@@danf@17-8-2009
10251330@unknown@formal@none@1@S@===Useful phrases===@@@@1@2@@danf@17-8-2009
10251340@unknown@formal@none@1@S@Here are some useful Esperanto phrases, with [[help:IPA|IPA]] transcriptions:@@@@1@9@@danf@17-8-2009
10251350@unknown@formal@none@1@S@* Hello: ''Saluton'' {{IPA|/sa.ˈlu.ton/}}@@@@1@4@@danf@17-8-2009
10251360@unknown@formal@none@1@S@* What is your name?: ''Kiel vi nomiĝas?''@@@@1@8@@danf@17-8-2009
10251370@unknown@formal@none@1@S@{{IPA|/ˈki.el vi no.ˈmi.ʤas/}}@@@@1@3@@danf@17-8-2009
10251380@unknown@formal@none@1@S@* My name is...: ''Mi nomiĝas...''@@@@1@6@@danf@17-8-2009
10251390@unknown@formal@none@1@S@{{IPA|/mi no.ˈmi.ʤas/}}@@@@1@2@@danf@17-8-2009
10251400@unknown@formal@none@1@S@* How much (is it/are they)?: ''Kiom (estas)?''@@@@1@8@@danf@17-8-2009
10251410@unknown@formal@none@1@S@{{IPA|/ˈki.om ˈes.tas/}}@@@@1@2@@danf@17-8-2009
10251420@unknown@formal@none@1@S@* Here you are: ''Jen'' {{IPA|/jen/}}@@@@1@6@@danf@17-8-2009
10251430@unknown@formal@none@1@S@* Do you speak Esperanto?: ''Ĉu vi parolas Esperanton?''@@@@1@9@@danf@17-8-2009
10251440@unknown@formal@none@1@S@{{IPA|/ˈʧu vi pa.ˈro.las es.pe.ˈran.ton/}}@@@@1@4@@danf@17-8-2009
10251450@unknown@formal@none@1@S@* I do not understand you: ''Mi ne komprenas vin'' {{IPA|/mi ˈne kom.ˈpre.nas vin/}}@@@@1@14@@danf@17-8-2009
10251460@unknown@formal@none@1@S@* I like ''this'' one: ''Ĉi tiu plaĉas al mi'' {{IPA|/ʧi ˈti.u ˈpla.ʧas al ˈmi/}} or ''Mi ŝatas tiun ĉi'' {{IPA|/mi ˈʃa.tas ˈti.un ˈʧi/}}@@@@1@24@@danf@17-8-2009
10251470@unknown@formal@none@1@S@* Thank you: ''Dankon'' {{IPA|/ˈdan.kon/}}@@@@1@5@@danf@17-8-2009
10251480@unknown@formal@none@1@S@* You're welcome: ''Ne dankinde'' {{IPA|/ˈne dan.ˈkin.de/}}@@@@1@7@@danf@17-8-2009
10251490@unknown@formal@none@1@S@* Please: ''Bonvolu'' {{IPA|/bon.ˈvo.lu/}} or ''mi petas'' {{IPA|/mi ˈpe.tas/}}@@@@1@9@@danf@17-8-2009
10251500@unknown@formal@none@1@S@* Here's to your health: ''Je via sano'' {{IPA|/je ˈvi.a ˈsa.no/}}@@@@1@11@@danf@17-8-2009
10251510@unknown@formal@none@1@S@* Bless you!/Gesundheit!: ''Sanon!''@@@@1@4@@danf@17-8-2009
10251520@unknown@formal@none@1@S@{{IPA|/ˈsa.non/}}@@@@1@1@@danf@17-8-2009
10251530@unknown@formal@none@1@S@* Congratulations!: ''Gratulon!''@@@@1@3@@danf@17-8-2009
10251540@unknown@formal@none@1@S@{{IPA|/ɡra.ˈtu.lon/}}@@@@1@1@@danf@17-8-2009
10251550@unknown@formal@none@1@S@* Okay: ''Bone'' {{IPA|/ˈbo.ne/}} or ''Ĝuste'' {{IPA|/ˈʤus.te/}}@@@@1@7@@danf@17-8-2009
10251560@unknown@formal@none@1@S@* Yes: ''Jes'' {{IPA|/ˈjes/}}@@@@1@4@@danf@17-8-2009
10251570@unknown@formal@none@1@S@* No: ''Ne'' {{IPA|/ˈne/}}@@@@1@4@@danf@17-8-2009
10251580@unknown@formal@none@1@S@* It is a nice day: ''Estas bela tago'' {{IPA|/ˈes.tas ˈbe.la ˈta.ɡo/}}@@@@1@12@@danf@17-8-2009
10251590@unknown@formal@none@1@S@* I love you: ''Mi amas vin'' {{IPA|/mi ˈa.mas vin/}}@@@@1@10@@danf@17-8-2009
10251600@unknown@formal@none@1@S@* Goodbye: ''Ĝis (la) (revido)'' {{IPA|/ʤis la re.ˈvi.do/}}@@@@1@8@@danf@17-8-2009
10251610@unknown@formal@none@1@S@* One beer, please: ''Unu bieron, mi petas.''@@@@1@8@@danf@17-8-2009
10251620@unknown@formal@none@1@S@{{IPA|/ˈu.nu bi.ˈe.ron, mi ˈpe.tas/}}@@@@1@4@@danf@17-8-2009
10251630@unknown@formal@none@1@S@* What is that?: ''Kio estas tio?''@@@@1@7@@danf@17-8-2009
10251640@unknown@formal@none@1@S@{{IPA|/ˈki.o ˈes.tas ˈti.o/}}@@@@1@3@@danf@17-8-2009
10251650@unknown@formal@none@1@S@* That is...: ''Tio estas...''@@@@1@5@@danf@17-8-2009
10251660@unknown@formal@none@1@S@{{IPA|/ˈti.o ˈes.tas/}}@@@@1@2@@danf@17-8-2009
10251670@unknown@formal@none@1@S@* How are you?: ''Kiel vi (fartas)?''@@@@1@7@@danf@17-8-2009
10251680@unknown@formal@none@1@S@{{IPA|/ˈki.el vi ˈfar.tas/}}@@@@1@3@@danf@17-8-2009
10251690@unknown@formal@none@1@S@* Good morning!: ''Bonan matenon!''@@@@1@5@@danf@17-8-2009
10251700@unknown@formal@none@1@S@{{IPA|/ˈbo.nan ma.ˈte.non/}}@@@@1@2@@danf@17-8-2009
10251710@unknown@formal@none@1@S@* Good evening!: ''Bonan vesperon!''@@@@1@5@@danf@17-8-2009
10251720@unknown@formal@none@1@S@{{IPA|/ˈbo.nan ves.ˈpe.ron/}}@@@@1@2@@danf@17-8-2009
10251730@unknown@formal@none@1@S@* Good night!: ''Bonan nokton!''@@@@1@5@@danf@17-8-2009
10251740@unknown@formal@none@1@S@{{IPA|/ˈbo.nan ˈnok.ton/}}@@@@1@2@@danf@17-8-2009
10251750@unknown@formal@none@1@S@* Peace!: ''Pacon!''@@@@1@3@@danf@17-8-2009
10251760@unknown@formal@none@1@S@{{IPA|/ˈpa.tson/}}@@@@1@1@@danf@17-8-2009
10251770@unknown@formal@none@1@S@=== Sample text ===@@@@1@4@@danf@17-8-2009
10251780@unknown@formal@none@1@S@The following short extract gives an idea of the character of Esperanto.@@@@1@12@@danf@17-8-2009
10251790@unknown@formal@none@1@S@(Pronunciation is covered above.@@@@1@4@@danf@17-8-2009
10251800@unknown@formal@none@1@S@The main point for English speakers to remember is that the letter 'J' has the sound of the letter 'Y' in English.)@@@@1@22@@danf@17-8-2009
10251810@unknown@formal@none@1@S@* Esperanto text@@@@1@3@@danf@17-8-2009
10251820@unknown@formal@none@1@S@:''En multaj lokoj de Ĉinio estis temploj de drako-reĝo. Dum trosekeco oni preĝis en la temploj, ke la drako-reĝo donu pluvon al la homa mondo.@@@@1@25@@danf@17-8-2009
10251830@unknown@formal@none@1@S@Tiam drako estis simbolo de la supernatura estaĵo. Kaj pli poste, ĝi fariĝis prapatro de la plej altaj regantoj kaj simbolis la absolutan aŭtoritaton de feŭda imperiestro.@@@@1@27@@danf@17-8-2009
10251840@unknown@formal@none@1@S@La imperiestro pretendis, ke li estas filo de la drako. Ĉiuj liaj vivbezonaĵoj portis la nomon drako kaj estis ornamitaj per diversaj drakofiguroj.@@@@1@23@@danf@17-8-2009
10251850@unknown@formal@none@1@S@Nun ĉie en Ĉinio videblas drako-ornamentaĵoj kaj cirkulas legendoj pri drakoj.''@@@@1@11@@danf@17-8-2009
10251860@unknown@formal@none@1@S@*English Translation:@@@@1@2@@danf@17-8-2009
10251870@unknown@formal@none@1@S@:In many places in China there were temples of the dragon king.@@@@1@12@@danf@17-8-2009
10251880@unknown@formal@none@1@S@During times of drought, people prayed in the temples, that the dragon king would give rain to the human world.@@@@1@20@@danf@17-8-2009
10251890@unknown@formal@none@1@S@At that time the dragon was a symbol of the supernatural.@@@@1@11@@danf@17-8-2009
10251900@unknown@formal@none@1@S@Later on, it became the ancestor of the highest rulers and symbolised the absolute authority of the feudal emperor.@@@@1@19@@danf@17-8-2009
10251910@unknown@formal@none@1@S@The emperor claimed to be the son of the dragon.@@@@1@10@@danf@17-8-2009
10251920@unknown@formal@none@1@S@All of his personal possessions carried the name ''dragon'' and were decorated with various dragon figures.@@@@1@16@@danf@17-8-2009
10251930@unknown@formal@none@1@S@Now everywhere in China dragon decorations can be seen and there circulate legends about dragons.@@@@1@15@@danf@17-8-2009
10251940@unknown@formal@none@1@S@== Education ==@@@@1@3@@danf@17-8-2009
10251950@unknown@formal@none@1@S@The majority of Esperanto speakers learn the language through self-directed study, online tutorials, and correspondence courses taught by volunteers.@@@@1@19@@danf@17-8-2009
10251960@unknown@formal@none@1@S@In more recent years, teaching websites like ''[[lernu!]]'' have become popular.@@@@1@11@@danf@17-8-2009
10251970@unknown@formal@none@1@S@Esperanto instruction is occasionally available at schools, such as a [[Esperanto#Esperanto and language acquisition|pilot project involving four primary schools]] under the supervision of the [[University of Manchester]], and by one count at 69 universities.@@@@1@34@@danf@17-8-2009
10251980@unknown@formal@none@1@S@However, outside of [[China]] and [[Hungary]], these mostly involve informal arrangements rather than dedicated departments or state sponsorship.@@@@1@18@@danf@17-8-2009
10251990@unknown@formal@none@1@S@[[Eötvös Loránd University]] in Budapest had a department of Interlinguistics and Esperanto from 1966 to 2004, after which time instruction moved to vocational colleges; there are state examinations for Esperanto instructors.@@@@1@31@@danf@17-8-2009
10252000@unknown@formal@none@1@S@Various educators have estimated that Esperanto can be learned in anywhere from one quarter to one twentieth the amount of time required for other languages.@@@@1@25@@danf@17-8-2009
10252010@unknown@formal@none@1@S@Some argue, however, that this is only true for native speakers of Western European languages.@@@@1@15@@danf@17-8-2009
10252020@unknown@formal@none@1@S@[[Claude Piron]], a psychologist formerly at the [[University of Geneva]] and Chinese-English-Russian-Spanish translator for the United Nations, argued that Esperanto is far more "brain friendly" than many ethnic languages.@@@@1@29@@danf@17-8-2009
10252030@unknown@formal@none@1@S@"Esperanto relies entirely on innate reflexes [and] differs from all other languages in that you can always trust your natural tendency to generalize patterns. [...]@@@@1@25@@danf@17-8-2009
10252040@unknown@formal@none@1@S@The same [[neuropsychology|neuropsychological]] law [— called by] [[Jean Piaget]] ''generalizing assimilation'' — applies to word formation as well as to grammar."@@@@1@21@@danf@17-8-2009
10252050@unknown@formal@none@1@S@=== Language acquisition ===@@@@1@4@@danf@17-8-2009
10252060@unknown@formal@none@1@S@Four primary schools in Britain, with some 230 pupils, are currently following a course in "propedeutic Esperanto", under the supervision of the University of Manchester.@@@@1@25@@danf@17-8-2009
10252070@unknown@formal@none@1@S@That is, Esperanto is taught in order to raise language awareness and accelerate subsequent learning of foreign languages.@@@@1@16@@danf@17-8-2009
10252080@unknown@formal@none@1@S@Several studies demonstrate that studying Esperanto before another foreign language speeds and improves learning the second language to a greater extent than other languages which have been investigated.@@@@1@28@@danf@17-8-2009
10252090@unknown@formal@none@1@S@This appears to be because learning subsequent foreign languages is easier than learning one's first, while the use of a grammatically simple and culturally flexible auxiliary language like Esperanto lessens the first-language learning hurdle.@@@@1@34@@danf@17-8-2009
10252100@unknown@formal@none@1@S@In one study, a group of European [[secondary school]] students studied Esperanto for one year, then French for three years, and ended up with a significantly better command of French than a control group, who studied French for all four years.@@@@1@41@@danf@17-8-2009
10252110@unknown@formal@none@1@S@Similar results were found when the course of study was reduced to two years, of which six months was spent learning Esperanto.@@@@1@22@@danf@17-8-2009
10252120@unknown@formal@none@1@S@Results are not yet available from a study in Australia to see if similar benefits would occur for learning East Asian languages, but the pupils taking Esperanto did better and enjoyed the subject more than those taking other languages.@@@@1@39@@danf@17-8-2009
10252130@unknown@formal@none@1@S@== Community ==@@@@1@3@@danf@17-8-2009
10252140@unknown@formal@none@1@S@=== Geography and demography ===@@@@1@5@@danf@17-8-2009
10252150@unknown@formal@none@1@S@Esperanto speakers are more numerous in Europe and East [[Asia]] than in the Americas, [[Africa]], and [[Oceania]], and more numerous in [[urban area|urban]] than in [[rural]] areas.@@@@1@27@@danf@17-8-2009
10252160@unknown@formal@none@1@S@Esperanto is particularly prevalent in the northern and eastern countries of Europe; in China, [[Korea]], Japan, and [[Iran]] within Asia; in [[Brazil]], [[Argentina]], and [[Mexico]] in the Americas; and in [[Togo]] in Africa.@@@@1@33@@danf@17-8-2009
10252170@unknown@formal@none@1@S@====Number of speakers====@@@@1@3@@danf@17-8-2009
10252180@unknown@formal@none@1@S@An estimate of the number of Esperanto speakers was made by the late [[Sidney S. Culbert]], a [[retirement|retired]] [[psychology]] [[professor]] at the [[University of Washington]] and a longtime Esperantist, who tracked down and tested Esperanto speakers in sample areas in dozens of countries over a period of twenty years.@@@@1@49@@danf@17-8-2009
10252190@unknown@formal@none@1@S@Culbert concluded that between one and two million people speak Esperanto at [[ILR or Foreign Service Level language ability measures|Foreign Service Level 3]], "professionally proficient" (able to communicate moderately complex ideas without hesitation, and to follow speeches, radio broadcasts, etc.).@@@@1@40@@danf@17-8-2009
10252200@unknown@formal@none@1@S@Culbert's estimate was not made for Esperanto alone, but formed part of his listing of estimates for all languages of over 1 million speakers, published annually in the [[World Almanac|World Almanac and Book of Facts]].@@@@1@35@@danf@17-8-2009
10252210@unknown@formal@none@1@S@Culbert's most detailed account of his methodology is found in a 1989 letter to David Wolff.@@@@1@17@@danf@17-8-2009
10252220@unknown@formal@none@1@S@Since Culbert never published detailed intermediate results for particular countries and regions, it is difficult to independently gauge the accuracy of his results.@@@@1@23@@danf@17-8-2009
10252230@unknown@formal@none@1@S@In the Almanac, his estimates for numbers of language speakers were rounded to the nearest million, thus the number for Esperanto speakers is shown as 2 million.@@@@1@27@@danf@17-8-2009
10252240@unknown@formal@none@1@S@This latter figure appears in ''[[Ethnologue]]''.@@@@1@6@@danf@17-8-2009
10252250@unknown@formal@none@1@S@Assuming that this figure is accurate, that means that about 0.03% of the world's population speaks the language.@@@@1@18@@danf@17-8-2009
10252260@unknown@formal@none@1@S@This falls short of Zamenhof's goal of a [[international auxiliary language|universal language]], but it represents a level of popularity unmatched by any other constructed language.@@@@1@25@@danf@17-8-2009
10252270@unknown@formal@none@1@S@Marcus Sikosek (now [[Ziko van Dijk]]) has challenged this figure of 1.6 million as exaggerated.@@@@1@15@@danf@17-8-2009
10252280@unknown@formal@none@1@S@He estimated that even if Esperanto speakers were evenly distributed, assuming one million Esperanto speakers worldwide would lead one to expect about 180 in the city of [[Cologne, Germany|Cologne]].@@@@1@29@@danf@17-8-2009
10252290@unknown@formal@none@1@S@Van Dijk finds only 30 [[fluency|fluent]] speakers in that city, and similarly smaller than expected figures in several other places thought to have a larger-than-average concentration of Esperanto speakers.@@@@1@29@@danf@17-8-2009
10252300@unknown@formal@none@1@S@He also notes that there are a total of about 20,000 members of the various Esperanto organizations (other estimates are higher).@@@@1@21@@danf@17-8-2009
10252310@unknown@formal@none@1@S@Though there are undoubtedly many Esperanto speakers who are not members of any Esperanto organization, he thinks it unlikely that there are fifty times more speakers than organization members.@@@@1@29@@danf@17-8-2009
10252320@unknown@formal@none@1@S@[[Finnish people|Finnish]] [[linguistics|linguist]] Jouko Lindstedt, an expert on native-born Esperanto speakers, presented the following scheme to show the overall proportions of language capabilities within the Esperanto community:@@@@1@27@@danf@17-8-2009
10252330@unknown@formal@none@1@S@* ''1,000 have Esperanto as their native language@@@@1@8@@danf@17-8-2009
10252340@unknown@formal@none@1@S@* ''10,000 speak it fluently@@@@1@5@@danf@17-8-2009
10252350@unknown@formal@none@1@S@* ''100,000 can use it actively@@@@1@6@@danf@17-8-2009
10252360@unknown@formal@none@1@S@* ''1,000,000 understand a large amount passively@@@@1@7@@danf@17-8-2009
10252370@unknown@formal@none@1@S@* ''10,000,000 have studied it to some extent at some time.''@@@@1@11@@danf@17-8-2009
10252380@unknown@formal@none@1@S@In the absence of Dr. Culbert's detailed sampling data, or any other census data, it is impossible to state the number of speakers with certainty.@@@@1@25@@danf@17-8-2009
10252390@unknown@formal@none@1@S@Few observers, probably, would challenge the following statement from the [[website]] of the [[World Esperanto Association]]:@@@@1@16@@danf@17-8-2009
10252400@unknown@formal@none@1@S@:Numbers of [[textbook]]s sold and membership of local societies put the number of people with some knowledge of the language in the hundreds of thousands and possibly millions.@@@@1@28@@danf@17-8-2009
10252410@unknown@formal@none@1@S@====Native speakers====@@@@1@2@@danf@17-8-2009
10252420@unknown@formal@none@1@S@Ethnologue reports estimates that there are 200 to 2000 native Esperanto speakers ''(denaskuloj),'' who have learned the language from birth from their Esperanto-speaking parents.@@@@1@24@@danf@17-8-2009
10252430@unknown@formal@none@1@S@This usually happens when Esperanto is the chief or only common language in an international family, but sometimes in a family of devoted Esperantists.@@@@1@24@@danf@17-8-2009
10252440@unknown@formal@none@1@S@The most famous native speaker of Esperanto is businessman [[George Soros]].@@@@1@11@@danf@17-8-2009
10252450@unknown@formal@none@1@S@Also notable is young Holocaust victim [[Petr Ginz]], whose drawing of the planet Earth as viewed from the moon was carried aboard the Space Shuttle ''[[Space Shuttle Columbia|Columbia]]'' in 2003 ([[STS-107]]).@@@@1@31@@danf@17-8-2009
10252460@unknown@formal@none@1@S@=== Culture ===@@@@1@3@@danf@17-8-2009
10252470@unknown@formal@none@1@S@Esperanto speakers can access an international [[culture]], including a large body of original as well as translated [[Esperanto literature|literature]].@@@@1@19@@danf@17-8-2009
10252480@unknown@formal@none@1@S@There are over 25,000 Esperanto books, both originals and translations, as well as several regularly distributed [[List of Esperanto magazines|Esperanto magazines]].@@@@1@21@@danf@17-8-2009
10252490@unknown@formal@none@1@S@Esperanto speakers use the language for free accommodations with [[Esperantist]]s in 92 countries using the [[Pasporta Servo]] or to develop [[pen pal]] friendships abroad through the Esperanto Pen Pal Service.@@@@1@30@@danf@17-8-2009
10252500@unknown@formal@none@1@S@Every year, 1,500-3,000 Esperanto speakers meet for the [[World Congress of Esperanto]] ''(Universala Kongreso de Esperanto)''.@@@@1@16@@danf@17-8-2009
10252510@unknown@formal@none@1@S@The [[European Esperanto Union]] ''(Eŭropa Esperanto-Unio)'' brings together the national Esperanto associations of the EU member states and holds congresses every two years.@@@@1@22@@danf@17-8-2009
10252520@unknown@formal@none@1@S@The most recent was in [[Maribor, Slovenia]], in July-August 2007.@@@@1@10@@danf@17-8-2009
10252530@unknown@formal@none@1@S@It attracted 256 delegates from 28 countries, including 2 members of the [[European Parliament]], Ms. [[Małgorzata Handzlik]] of [[Poland]] and Ms. [[Ljudmila Novak]] of [[Slovenia]].@@@@1@25@@danf@17-8-2009
10252540@unknown@formal@none@1@S@Historically, much [[Esperanto music]], such as that of ''Kaj Tiel Plu'', has been in various folk traditions.@@@@1@17@@danf@17-8-2009
10252550@unknown@formal@none@1@S@In recent decades, more rock and other modern genres have appeared, an example being the Swedish band ''Persone''.@@@@1@18@@danf@17-8-2009
10252560@unknown@formal@none@1@S@There are also shared [[tradition]]s, such as [[Zamenhof Day]], and shared [[behaviour]] patterns.@@@@1@13@@danf@17-8-2009
10252570@unknown@formal@none@1@S@[[Esperantist]]s speak primarily in Esperanto at [[World Esperanto Congress|international Esperanto meetings]].@@@@1@11@@danf@17-8-2009
10252580@unknown@formal@none@1@S@Detractors of Esperanto occasionally criticize it as "having no culture".@@@@1@10@@danf@17-8-2009
10252590@unknown@formal@none@1@S@Proponents, such as Prof. [[Humphrey Tonkin]] of the [[University of Hartford]], observe that Esperanto is "culturally neutral by design, as it was intended to be a facilitator between cultures, not to be the carrier of any one national culture."@@@@1@39@@danf@17-8-2009
10252610@unknown@formal@none@1@S@The late [[Scotland|Scottish]] Esperanto author [[William Auld]] wrote extensively on the subject, arguing that Esperanto is "the expression of a [[Esperanto as an international language|common human culture]], unencumbered by national frontiers.@@@@1@32@@danf@17-8-2009
10252620@unknown@formal@none@1@S@Thus it is considered a culture on its own."@@@@1@9@@danf@17-8-2009
10252630@unknown@formal@none@1@S@Others point to Esperanto's potential for strengthening a common European identity, as it combines features of several [[Esperanto etymology|European languages]].@@@@1@20@@danf@17-8-2009
10252640@unknown@formal@none@1@S@====In popular culture====@@@@1@3@@danf@17-8-2009
10252650@unknown@formal@none@1@S@Esperanto has been used in a number of films and novels.@@@@1@11@@danf@17-8-2009
10252660@unknown@formal@none@1@S@Typically, this is done either to add the exotic flavour of a foreign language without representing any particular ethnicity, or to avoid going to the trouble of inventing a new language.@@@@1@31@@danf@17-8-2009
10252670@unknown@formal@none@1@S@The [[Charlie Chaplin]] film ''[[The Great Dictator]]'' (1940) showed [[Warsaw ghetto|Jewish ghetto]] shops designated in Esperanto, each with the general Esperanto suffix ''-ejo'' (meaning "place for..."), in order to convey the atmosphere of some 'foreign' [[Eastern Europe|East European]] country without referencing any particular East European language.@@@@1@46@@danf@17-8-2009
10252680@unknown@formal@none@1@S@Two full-length [[feature film]]s have been produced with [[dialogue]] entirely in Esperanto: ''[[Angoroj]],'' in 1964, and ''[[Incubus (1965 film)|Incubus]],'' a 1965 [[B-movie]] horror film.@@@@1@24@@danf@17-8-2009
10252690@unknown@formal@none@1@S@[[Canada|Canadian]] actor [[William Shatner]] learned Esperanto to a limited level so that he could star in ''Incubus''.@@@@1@17@@danf@17-8-2009
10252700@unknown@formal@none@1@S@Other amateur productions have been made, such as a dramatisation of the novel ''Gerda Malaperis'' (Gerda Has Disappeared).@@@@1@18@@danf@17-8-2009
10252710@unknown@formal@none@1@S@A number of "mainstream" films in national languages have used Esperanto in some way, such as ''[[Gattaca]]'' (1997), in which Esperanto can be overheard on the public address system.@@@@1@29@@danf@17-8-2009
10252720@unknown@formal@none@1@S@In the 1994 film ''[[Street Fighter]]'', Esperanto is the native language of the fictional country of [[Shadaloo]], and in a barracks scene the soldiers of villain [[M. Bison]] sing a rousing Russian Army-style chorus, the "Bison Troopers Marching Song", in the language.@@@@1@42@@danf@17-8-2009
10252730@unknown@formal@none@1@S@Esperanto is also spoken and appears on signs in the film ''[[Blade: Trinity]]''.@@@@1@13@@danf@17-8-2009
10252740@unknown@formal@none@1@S@In the British comedy ''[[Red Dwarf]]'', [[Arnold Rimmer]] is seen attempting to learn Esperanto in a number of early episodes, including ''[[Kryten (Red Dwarf episode)|Kryten]]''.@@@@1@25@@danf@17-8-2009
10252750@unknown@formal@none@1@S@In the first season, signs on the titular spacecraft are in both English and Esperanto.@@@@1@15@@danf@17-8-2009
10252760@unknown@formal@none@1@S@Esperanto is used as the universal language in the far future of [[Harry Harrison]]'s ''[[Stainless Steel Rat]]'' and ''[[Deathworld]]'' stories.@@@@1@20@@danf@17-8-2009
10252770@unknown@formal@none@1@S@In a 1969 guest appearance on ''[[The Tonight Show]]'', [[Jay Silverheels]] of ''[[The Lone Ranger]]'' fame appeared in character as [[Tonto]] for a comedy sketch with [[Johnny Carson]], and claimed Esperanto skills as he sought new employment.@@@@1@37@@danf@17-8-2009
10252780@unknown@formal@none@1@S@The sketch ended with a statement of his ideal situation: "Tonto, to [[Toronto, Canada|Toronto]], for Esperanto, and pronto!"@@@@1@18@@danf@17-8-2009
10252790@unknown@formal@none@1@S@In the ''[[Danny Phantom]]'' episode "Public Enemies", Danny, Tucker, and Sam come across a ghost wolf who speaks Esperanto, which at first only Tucker can understand.@@@@1@27@@danf@17-8-2009
10252800@unknown@formal@none@1@S@=== In science ===@@@@1@4@@danf@17-8-2009
10252810@unknown@formal@none@1@S@In 1921 the [[French Academy of Sciences]] recommended using Esperanto for international scientific communication.@@@@1@14@@danf@17-8-2009
10252820@unknown@formal@none@1@S@A few scientists and mathematicians, such as [[Maurice René Fréchet|Maurice Fréchet]] (mathematics), [[John C. Wells]] (linguistics), [[Helmar Frank]] (pedagogy and cybernetics), and [[Nobel Prize in Economics|Nobel laureate]] [[Reinhard Selten]] (economics) have published part of their work in Esperanto.@@@@1@38@@danf@17-8-2009
10252830@unknown@formal@none@1@S@Frank and Selten were among the founders of the [[Akademio Internacia de la Sciencoj San Marino|International Academy of Sciences]] in [[San Marino]], sometimes called the "Esperanto University", where Esperanto is the primary language of teaching and administration.@@@@1@37@@danf@17-8-2009
10252840@unknown@formal@none@1@S@=== Goals of the movement ===@@@@1@6@@danf@17-8-2009
10252850@unknown@formal@none@1@S@Zamenhof's intention was to create an easy-to-learn language to foster international understanding.@@@@1@12@@danf@17-8-2009
10252860@unknown@formal@none@1@S@It was to serve as an international auxiliary language, that is, as a universal second language, not to replace ethnic languages.@@@@1@21@@danf@17-8-2009
10252870@unknown@formal@none@1@S@This goal was widely shared among Esperanto speakers in the early decades of the movement.@@@@1@15@@danf@17-8-2009
10252880@unknown@formal@none@1@S@Later, Esperanto speakers began to see the language and the culture that had grown up around it as ends in themselves, even if Esperanto is never adopted by the United Nations or other international organizations.@@@@1@35@@danf@17-8-2009
10252890@unknown@formal@none@1@S@Those Esperanto speakers who want to see Esperanto adopted officially or on a large scale worldwide are commonly called ''[[Finvenkismo|finvenkistoj]]'', from ''fina venko'', meaning "final victory", or ''pracelistoj'', from ''pracelo'', meaning "original goal".@@@@1@33@@danf@17-8-2009
10252900@unknown@formal@none@1@S@Those who focus on the intrinsic value of the language are commonly called ''[[Raumism|raŭmistoj]]'', from [[Rauma, Finland|Rauma]], [[Finland]], where a declaration on the near-term unlikelihood of the "fina venko" and the value of Esperanto culture was made at the International Youth Congress in 1980.@@@@1@44@@danf@17-8-2009
10252910@unknown@formal@none@1@S@These categories are, however, not mutually exclusive.@@@@1@7@@danf@17-8-2009
10252920@unknown@formal@none@1@S@The [[Prague Manifesto (Esperanto)|Prague Manifesto]] (1996) presents the views of the mainstream of the Esperanto movement and of its main organisation, the World Esperanto Association ([[World Esperanto Association|UEA]]).@@@@1@28@@danf@17-8-2009
10252930@unknown@formal@none@1@S@=== Symbols and flags ===@@@@1@5@@danf@17-8-2009
10252940@unknown@formal@none@1@S@In 1893, C. Rjabinis and P. Deullin designed and manufactured a lapel pin for Esperantists to identify each other.@@@@1@19@@danf@17-8-2009
10252950@unknown@formal@none@1@S@The design was a circular pin with a white background and a five-pointed green star.@@@@1@16@@danf@17-8-2009
10252960@unknown@formal@none@1@S@The theme of the design was the hope of the [[Continent#Number of continents|five continents]] being united by a common language.@@@@1@20@@danf@17-8-2009
10252970@unknown@formal@none@1@S@The earliest flag, and the one most commonly used today, features a green five-pointed star against a white canton, upon a field of green.@@@@1@24@@danf@17-8-2009
10252980@unknown@formal@none@1@S@It was proposed to Zamenhof by [[Ireland|Irishman]] Richard Geoghegan, author of the first Esperanto textbook for English speakers, in 1887.@@@@1@20@@danf@17-8-2009
10252990@unknown@formal@none@1@S@In 1905, delegates to the first conference of Esperantists at Boulogne-sur-Mer unanimously approved a version that differed from the modern flag only by the superimposition of an "E" over the green star.@@@@1@32@@danf@17-8-2009
10253000@unknown@formal@none@1@S@Other variants include that for Christian Esperantists, with a white [[Christian cross]] superimposed upon the green star, and that for Leftists, with [[Red flag|the color of the field changed from green to red]].@@@@1@33@@danf@17-8-2009
10253010@unknown@formal@none@1@S@In 1987, a second flag design was chosen in a contest organized by the UEA celebrating the first centennial of the language.@@@@1@22@@danf@17-8-2009
10253020@unknown@formal@none@1@S@It featured a white background with two stylised curved "E"s facing each other.@@@@1@13@@danf@17-8-2009
10253030@unknown@formal@none@1@S@Dubbed the "jubilea simbolo" ([[Esperanto jubilee symbol|jubilee symbol]]), it attracted criticism from some Esperantists, who nicknamed it the "melono" (melon) because of the design's elliptical shape.@@@@1@27@@danf@17-8-2009
10253040@unknown@formal@none@1@S@It is still in use, though to a lesser degree than the traditional symbol, known as the "verda stelo" (green star).@@@@1@21@@danf@17-8-2009
10253050@unknown@formal@none@1@S@=== Religion ===@@@@1@3@@danf@17-8-2009
10253060@unknown@formal@none@1@S@Esperanto has served an important role in several religions, such as [[Oomoto]] from Japan and [[Baha'i]] from Iran, and has been encouraged by others.@@@@1@24@@danf@17-8-2009
10253070@unknown@formal@none@1@S@==== Oomoto ====@@@@1@3@@danf@17-8-2009
10253080@unknown@formal@none@1@S@The [[Oomoto]] religion encourages the use of Esperanto among its followers and includes Zamenhof as one of its deified spirits.@@@@1@20@@danf@17-8-2009
10253090@unknown@formal@none@1@S@==== Bahá'í Faith ====@@@@1@3@@danf@17-8-2009
10253100@unknown@formal@none@1@S@The [[Bahá'í Faith]] encourages the [[Bahá'í Faith and auxiliary language|use of an auxiliary international language]].@@@@1@15@@danf@17-8-2009
10253110@unknown@formal@none@1@S@While the Bahá'í Faith endorses no specific language, some Bahá'ís see Esperanto as having great potential in this role.@@@@1@16@@danf@17-8-2009
10253120@unknown@formal@none@1@S@[[Lidja Zamenhof]], the daughter of Esperanto founder [[L. L. Zamenhof]], became a Bahá'í.@@@@1@13@@danf@17-8-2009
10253130@unknown@formal@none@1@S@Various volumes of [[Bahá'í literature]] and other Bahá'í books have been translated into Esperanto.@@@@1@15@@danf@17-8-2009
10253140@unknown@formal@none@1@S@==== Spiritism ====@@@@1@3@@danf@17-8-2009
10253150@unknown@formal@none@1@S@Esperanto is also actively promoted, at least in [[Brazil]], by followers of [[Spiritism]].@@@@1@13@@danf@17-8-2009
10253160@unknown@formal@none@1@S@The Brazilian Spiritist Federation publishes Esperanto coursebooks and translations of [[Spiritist Codification|Spiritism's basic books]], and encourages Spiritists to become Esperantists.@@@@1@19@@danf@17-8-2009
10253170@unknown@formal@none@1@S@==== Bible translations ====@@@@1@4@@danf@17-8-2009
10253180@unknown@formal@none@1@S@The first translation of the [[Bible]] into Esperanto was a translation of the [[Tanach]] or Old Testament done by [[L. L. Zamenhof]].@@@@1@22@@danf@17-8-2009
10253190@unknown@formal@none@1@S@The translation was reviewed and compared with translations in other languages by a group of British clergy and scholars before being published by the [[British and Foreign Bible Society]] in 1910.@@@@1@30@@danf@17-8-2009
10253200@unknown@formal@none@1@S@In 1926 this was published along with a New Testament translation, in an edition commonly called the "Londona Biblio".@@@@1@19@@danf@17-8-2009
10253210@unknown@formal@none@1@S@In the 1960s, the ''Internacia Asocio de Bibliistoj kaj Orientalistoj'' tried to organize a new, ecumenical Esperanto Bible version.@@@@1@19@@danf@17-8-2009
10253220@unknown@formal@none@1@S@Since then, the Dutch Lutheran pastor Gerrit Berveling has translated the [[Deuterocanonical]] or apocryphal books in addition to new translations of the Gospels, some of the New Testament epistles, and some books of the Tanakh or Old Testament.@@@@1@38@@danf@17-8-2009
10253230@unknown@formal@none@1@S@These have been published in various separate booklets, or serialized in ''Dia Regno'', but the [[Deuterocanonical]] books have appeared in recent editions of the Londona Biblio.@@@@1@26@@danf@17-8-2009
10253240@unknown@formal@none@1@S@==== Christianity ====@@@@1@3@@danf@17-8-2009
10253250@unknown@formal@none@1@S@Two Roman Catholic popes, [[Pope John Paul II|John Paul II]] and [[Pope Benedict XVI|Benedict XVI]], have regularly used Esperanto in their multilingual ''[[urbi et orbi]]'' blessings at Easter and Christmas each year since Easter 1994.@@@@1@35@@danf@17-8-2009
10253260@unknown@formal@none@1@S@Christian Esperanto organizations include two that were formed early in the history of Esperanto, the [[International Union of Catholic Esperantists]] and the [[List of Esperanto organizations#Religion|International Christian Esperantists League]].@@@@1@29@@danf@17-8-2009
10253270@unknown@formal@none@1@S@An issue of "The Friend" describes the activities of the [[Quaker]] Esperanto Society.@@@@1@13@@danf@17-8-2009
10253280@unknown@formal@none@1@S@There are instances of Christian apologists and teachers who use Esperanto as a medium.@@@@1@14@@danf@17-8-2009
10253290@unknown@formal@none@1@S@[[Nigeria]]n [[Pastor]] Bayo Afolaranmi's "[http://groups.yahoo.com/group/spiritanutrajxo/ Spirita nutraĵo]" (spiritual food) Yahoo mailing list, for example, has hosted weekly messages since 2003.@@@@1@20@@danf@17-8-2009
10253300@unknown@formal@none@1@S@[[Chick Publications]], publisher of [[Fundamentalist Christianity|Protestant fundamentalist]] themed evangelistic tracts, has published a number of comic book style tracts by [[Jack T. Chick]] translated into Esperanto, including "This Was Your Life!"@@@@1@31@@danf@17-8-2009
10253310@unknown@formal@none@1@S@("Jen Via Tuto Vivo!")@@@@1@4@@danf@17-8-2009
10253320@unknown@formal@none@1@S@==== Islam ====@@@@1@3@@danf@17-8-2009
10253330@unknown@formal@none@1@S@[[Ayatollah Khomeini]] of [[Iran]] called on Muslims to learn Esperanto and praised its use as a medium for better understanding among peoples of different religious backgrounds.@@@@1@26@@danf@17-8-2009
10253340@unknown@formal@none@1@S@After he suggested that Esperanto replace English as an international [[lingua franca]], it began to be used in the seminaries of [[Qom]].@@@@1@22@@danf@17-8-2009
10253350@unknown@formal@none@1@S@An Esperanto translation of the [[Qur'an]] was published by the state shortly thereafter.@@@@1@13@@danf@17-8-2009
10253360@unknown@formal@none@1@S@In 1981, Khomeini and the Iranian government began to oppose Esperanto after realising that followers of the [[Bahá'í Faith]] were interested in it.@@@@1@23@@danf@17-8-2009
10253370@unknown@formal@none@1@S@== Criticism ==@@@@1@3@@danf@17-8-2009
10253380@unknown@formal@none@1@S@Esperanto was conceived as a language of international communication, more precisely as a universal [[second language]].@@@@1@16@@danf@17-8-2009
10253390@unknown@formal@none@1@S@Since publication, there has been debate over whether it is possible for Esperanto to attain this position, and whether it would be an improvement for international communication if it did.@@@@1@30@@danf@17-8-2009
10253400@unknown@formal@none@1@S@There have been a number of attempts to reform the language, the best known of which is [[Ido]], which caused a schism in the community beginning in 1907.@@@@1@34@@danf@17-8-2009
10253410@unknown@formal@none@1@S@Since Esperanto is a planned language, there have been many, often passionate, criticisms of minor points which are too numerous to cover here, such as Zamenhof's choice of the word ''edzo'' over something like ''spozo'' for "husband, spouse", or his choice of the Classic Greek and Old Latin singular and plural endings ''-o, -oj, -a, -aj'' over their Medieval contractions ''-o, -i, -a, -e.''@@@@1@64@@danf@17-8-2009
10253420@unknown@formal@none@1@S@(Both these changes were adopted by the Ido reform, though Ido dispensed with adjectival agreement altogether.)@@@@1@16@@danf@17-8-2009
10253430@unknown@formal@none@1@S@See the links [[Esperanto#Criticism|below]] for examples of more general criticism.@@@@1@10@@danf@17-8-2009
10253440@unknown@formal@none@1@S@The more common points include:@@@@1@5@@danf@17-8-2009
10253450@unknown@formal@none@1@S@* Esperanto has failed the expectations of its founder to become a universal second language.@@@@1@15@@danf@17-8-2009
10253460@unknown@formal@none@1@S@Although many promoters of Esperanto stress the few successes it has had, the fact remains that well over a century since its publication, the portion of the world that speaks Esperanto, and the number of primary and secondary schools which teach it, remain minuscule.@@@@1@44@@danf@17-8-2009
10253470@unknown@formal@none@1@S@It simply cannot compete with English in this regard.@@@@1@9@@danf@17-8-2009
10253480@unknown@formal@none@1@S@* The vocabulary and grammar are based on major European languages, and are not universal.@@@@1@15@@danf@17-8-2009
10253490@unknown@formal@none@1@S@Often this criticism is specific to a few points such as adjectival agreement and the accusative case (generally such obvious details are all that reform projects suggest changing), but sometimes it is more general: Both the grammar and the 'international' vocabulary are difficult for many Asians, among others, and give an unfair advantage to speakers of European languages.@@@@1@58@@danf@17-8-2009
10253500@unknown@formal@none@1@S@One attempt to address this issue is [[Lojban]], which draws from the six populous languages [[Arabic language|Arabic]], [[Chinese language|Chinese]], [[English language|English]], [[Hindi]], [[Russian language|Russian]], and [[Spanish language|Spanish]], and whose grammar is designed for computer parsing.@@@@1@35@@danf@17-8-2009
10253510@unknown@formal@none@1@S@* The vocabulary, diacritic letters, and grammar are too dissimilar from the major Western European languages, and therefore Esperanto is not as easy as it could be for speakers of those languages to learn.@@@@1@34@@danf@17-8-2009
10253520@unknown@formal@none@1@S@Attempts to address this issue include the younger planned languages [[Ido]] and [[Interlingua]].@@@@1@13@@danf@17-8-2009
10253530@unknown@formal@none@1@S@* Esperanto phonology is unimaginatively provincial, being essentially [[Belorussian language|Belorussian]] with regularized stress, leaving out only the [[nasal vowel]]s, [[palatalization|palatalized consonants]], and /dz/.@@@@1@23@@danf@17-8-2009
10253540@unknown@formal@none@1@S@For example, Esperanto has phonemes such as {{IPA|/x/, /ʒ/, /ts/, /eu̯/}} ''(ĥ, ĵ, c, eŭ)'' which are rare as distinct phonemes outside Europe.@@@@1@23@@danf@17-8-2009
10253550@unknown@formal@none@1@S@(Note that none of these are found in initial position in English.)@@@@1@12@@danf@17-8-2009
10253560@unknown@formal@none@1@S@* Esperanto has no culture.@@@@1@5@@danf@17-8-2009
10253570@unknown@formal@none@1@S@Although it has a large international literature, Esperanto does not encapsulate a specific culture.@@@@1@14@@danf@17-8-2009
10253580@unknown@formal@none@1@S@* Esperanto is culturally European.@@@@1@5@@danf@17-8-2009
10253590@unknown@formal@none@1@S@This is due to the European derivation of its vocabulary, and more insidiously, its [[semantics]]; both infuse the language with a European world view.@@@@1@24@@danf@17-8-2009
10253600@unknown@formal@none@1@S@* The vocabulary is too large.@@@@1@6@@danf@17-8-2009
10253610@unknown@formal@none@1@S@Rather than deriving new words from existing roots, large numbers of new roots are adopted into the language by speakers who regard them as international when they are in fact only European.@@@@1@30@@danf@17-8-2009
10253620@unknown@formal@none@1@S@This makes the language much more difficult for non-Europeans than it needs to be.@@@@1@14@@danf@17-8-2009
10253630@unknown@formal@none@1@S@* Esperanto is [[sexism|sexist]].@@@@1@4@@danf@17-8-2009
10253640@unknown@formal@none@1@S@As in English, there is no neutral pronoun for ''s/he,'' and most kin terms and titles are masculine by default and only feminine when so specified.@@@@1@26@@danf@17-8-2009
10253650@unknown@formal@none@1@S@There have been many attempts to address this issue, of which one of the better known is [[Riism]].@@@@1@18@@danf@17-8-2009
10253660@unknown@formal@none@1@S@* Esperanto is, looks, or sounds artificial.@@@@1@7@@danf@17-8-2009
10253670@unknown@formal@none@1@S@This criticism is primarily due to the letters with circumflex diacritics, which some find odd or cumbersome, and to the lack of fluent speakers: Few Esperantists have spent much time with fluent, let alone native, speakers, and many learn Esperanto relatively late in life, and so speak haltingly, which can create a negative impression among non-speakers.@@@@1@56@@danf@17-8-2009
10253680@unknown@formal@none@1@S@Among fluent speakers, Esperanto sounds no more artificial than any other language.@@@@1@12@@danf@17-8-2009
10253690@unknown@formal@none@1@S@Others claim that an artificial language will necessarily be deficient, due to its very nature, but the [[Hungarian Academy of Sciences]] has found that Esperanto fulfills all the requirements of a living language.@@@@1@33@@danf@17-8-2009
10253700@unknown@formal@none@1@S@== Modifications ==@@@@1@3@@danf@17-8-2009
10253710@unknown@formal@none@1@S@Though Esperanto itself has changed little since the publication of the ''[[Fundamento de Esperanto]]'' (Foundation of Esperanto), a number of reform projects have been proposed over the years, starting with [[Reformed Esperanto|Zamenhof's proposals in 1894]] and [[Ido]] in 1907.@@@@1@39@@danf@17-8-2009
10253720@unknown@formal@none@1@S@Several later constructed languages, such as Fasile, were based on Esperanto.@@@@1@11@@danf@17-8-2009
10253730@unknown@formal@none@1@S@In modern times, attempts have been made to eliminate perceived sexism in the language.@@@@1@14@@danf@17-8-2009
10253740@unknown@formal@none@1@S@One example of this is [[Riism]].@@@@1@6@@danf@17-8-2009
10253750@unknown@formal@none@1@S@However, as Esperanto has become a living language, changes are as difficult to implement as in ethnic languages.@@@@1@18@@danf@17-8-2009
10260010@unknown@formal@none@1@S@Formal grammar@@@@1@2@@danf@17-8-2009
10260020@unknown@formal@none@1@S@In [[formal semantics]], [[computer science]] and [[linguistics]], a '''formal grammar''' (also called '''formation rules''') is a precise description of a [[formal language]] – that is, of a [[set]] of [[String (computer science)|strings]] over some [[Alphabet (computer science)|alphabet]].@@@@1@37@@danf@17-8-2009
10260030@unknown@formal@none@1@S@In other words, a grammar describes which of the possible sequences of symbols (strings) in a language constitute valid words or statements in that language, but it does not describe their [[semantics]] (i.e. what they mean).@@@@1@36@@danf@17-8-2009
10260040@unknown@formal@none@1@S@The branch of mathematics that is concerned with the properties of formal grammars and languages is called [[formal language theory]].@@@@1@20@@danf@17-8-2009
10260050@unknown@formal@none@1@S@A grammar is usually regarded as a means to [[generate]] all the valid strings of a language; it can also be used as the basis for a [[recognizer]] that determines for any given string whether it is [[grammatical]] (i.e. belongs to the language).@@@@1@43@@danf@17-8-2009
10260060@unknown@formal@none@1@S@To describe such recognizers, formal language theory uses separate formalisms, known as [[automata theory|automata]].@@@@1@14@@danf@17-8-2009
10260070@unknown@formal@none@1@S@A grammar can also be used to [[analyze]] the strings of a language – i.e. to describe their internal structure.@@@@1@20@@danf@17-8-2009
10260080@unknown@formal@none@1@S@In computer science, this process is known as [[parsing]].@@@@1@9@@danf@17-8-2009
10260090@unknown@formal@none@1@S@In most languages the semantics is largely [[compositional semantics|compositional]], i.e. the meaning of an utterance is structured according to its [[syntax]]; therefore, the first step in describing the meaning of an utterance is to analyze it and look at its analyzed form (known as its [[parse tree]] in computer science, and as its [[deep structure]] in [[generative grammar]]).@@@@1@57@@danf@17-8-2009
10260100@unknown@formal@none@1@S@== Background ==@@@@1@3@@danf@17-8-2009
10260110@unknown@formal@none@1@S@=== Formal language ===@@@@1@4@@danf@17-8-2009
10260120@unknown@formal@none@1@S@A ''formal language'' is an organized [[set]] of [[symbol]]s the essential feature of which is that it can be precisely defined in terms of just the shapes and locations of those symbols.@@@@1@32@@danf@17-8-2009
10260130@unknown@formal@none@1@S@Such a language can be defined, then, without any [[reference]] to any [[meaning (linguistics)|meaning]]s of any of its expressions; it can exist before any [[formal interpretation]] is assigned to it -- that is, before it has any meaning.@@@@1@38@@danf@17-8-2009
10260140@unknown@formal@none@1@S@First-order logic is expressed in some formal language.@@@@1@9@@danf@17-8-2009
10260150@unknown@formal@none@1@S@A formal grammar determines which symbols and sets of symbols are [[Formula (mathematical logic)|formula]]s in a formal language.@@@@1@18@@danf@17-8-2009
10260160@unknown@formal@none@1@S@=== Formal systems ===@@@@1@4@@danf@17-8-2009
10260170@unknown@formal@none@1@S@A ''formal system'' (also called a ''logical calculus'', or a ''logical system'') consists of a formal language together with a [[deductive apparatus]] (also called a ''deductive system'').@@@@1@27@@danf@17-8-2009
10260180@unknown@formal@none@1@S@The deductive apparatus may consist of a set of [[transformation rule]]s (also called ''inference rules'') or a set of [[axiom]]s, or have both.@@@@1@23@@danf@17-8-2009
10260190@unknown@formal@none@1@S@A formal system is used to [[Proof theory|derive]] one expression from one or more other expressions.@@@@1@16@@danf@17-8-2009
10260200@unknown@formal@none@1@S@=== Formal proofs ===@@@@1@4@@danf@17-8-2009
10260210@unknown@formal@none@1@S@A ''formal proof'' is a sequence of well-formed formulas of a formal language, the last one of which is a [[theorem]] of a formal system.@@@@1@25@@danf@17-8-2009
10260220@unknown@formal@none@1@S@The theorem is a [[syntactic consequence]] of all the wffs preceding it in the proof.@@@@1@15@@danf@17-8-2009
10260230@unknown@formal@none@1@S@For a wff to qualify as part of a proof, it must be the result of applying a rule of the deductive apparatus of some formal system to the previous wffs in the proof sequence.@@@@1@35@@danf@17-8-2009
10260240@unknown@formal@none@1@S@=== Formal interpretations ===@@@@1@4@@danf@17-8-2009
10260250@unknown@formal@none@1@S@An ''interpretation'' of a formal system is the assignment of meanings to the symbols, and truth-values to the sentences of a formal system.@@@@1@23@@danf@17-8-2009
10260260@unknown@formal@none@1@S@The study of formal interpretations is called [[formal semantics]].@@@@1@9@@danf@17-8-2009
10260270@unknown@formal@none@1@S@''Giving an interpretation'' is synonymous with ''constructing a [[Structure (mathematical logic)|model]]''.@@@@1@11@@danf@17-8-2009
10260280@unknown@formal@none@1@S@== Formal grammars ==@@@@1@4@@danf@17-8-2009
10260290@unknown@formal@none@1@S@A grammar mainly consists of a set of rules for transforming strings.@@@@1@12@@danf@17-8-2009
10260300@unknown@formal@none@1@S@(If it ''only'' consisted of these rules, it would be a [[semi-Thue system]].)@@@@1@13@@danf@17-8-2009
10260310@unknown@formal@none@1@S@To generate a string in the language, one begins with a string consisting of only a single ''start symbol'', and then successively applies the rules (any number of times, in any order) to rewrite this string.@@@@1@36@@danf@17-8-2009
10260320@unknown@formal@none@1@S@The language consists of all the strings that can be generated in this manner.@@@@1@14@@danf@17-8-2009
10260330@unknown@formal@none@1@S@Any particular sequence of legal choices taken during this rewriting process yields one particular string in the language.@@@@1@18@@danf@17-8-2009
10260340@unknown@formal@none@1@S@If there are multiple ways of generating the same single string, then the grammar is said to be [[ambiguous grammar|ambiguous]].@@@@1@20@@danf@17-8-2009
10260350@unknown@formal@none@1@S@For example, assume the alphabet consists of <math>a</math> and <math>b</math>, the start symbol is <math>S</math> and we have the following rules:@@@@1@21@@danf@17-8-2009
10260360@unknown@formal@none@1@S@: 1. <math>S \rightarrow aSb</math>@@@@1@5@@danf@17-8-2009
10260370@unknown@formal@none@1@S@: 2. <math>S \rightarrow ba</math>@@@@1@5@@danf@17-8-2009
10260380@unknown@formal@none@1@S@then we start with <math>S</math>, and can choose a rule to apply to it.@@@@1@14@@danf@17-8-2009
10260390@unknown@formal@none@1@S@If we choose rule 1, we obtain the string <math>aSb</math>.@@@@1@10@@danf@17-8-2009
10260400@unknown@formal@none@1@S@If we choose rule 1 again, we replace <math>S</math> with <math>aSb</math> and obtain the string <math>aaSbb</math>.@@@@1@16@@danf@17-8-2009
10260410@unknown@formal@none@1@S@This process can be repeated at will until all occurrences of ''S'' are removed, and only symbols from the alphabet remain (i.e., <math>a</math> and <math>b</math>).@@@@1@25@@danf@17-8-2009
10260420@unknown@formal@none@1@S@For example, if we now choose rule 2, we replace <math>S</math> with <math>ba</math> and obtain the string <math>aababb</math>, and are done.@@@@1@21@@danf@17-8-2009
10260430@unknown@formal@none@1@S@We can write this series of choices more briefly, using symbols: <math>S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aababb</math>.@@@@1@18@@danf@17-8-2009
10260440@unknown@formal@none@1@S@The language of the grammar is the set of all the strings that can be generated using this process: <math>\{ba, abab, aababb, aaababbb, \ldots\}</math>.@@@@1@26@@danf@17-8-2009
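The rewriting process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two rules are taken to be S → aSb and S → ba over the alphabet {a, b}, uppercase letters stand for symbols that must still be rewritten, and the names `RULES` and `generate` are hypothetical helpers rather than part of any library.

```python
from collections import deque

# Assumed rules for illustration: S -> aSb and S -> ba.
# Convention (an assumption of this sketch): uppercase letters are
# nonterminals, lowercase letters are alphabet symbols.
RULES = [("S", "aSb"), ("S", "ba")]


def generate(rules, start="S", max_len=8):
    """Return every fully rewritten string of length <= max_len.

    Assumes no rule shortens a string, so any intermediate string longer
    than max_len can be discarded safely.
    """
    seen = {start}
    queue = deque([start])
    language = set()
    while queue:
        current = queue.popleft()
        if not any(ch.isupper() for ch in current):
            language.add(current)  # only alphabet symbols remain
            continue
        for left, right in rules:
            pos = current.find(left)
            while pos != -1:  # try the rule at every position
                rewritten = current[:pos] + right + current[pos + len(left):]
                if len(rewritten) <= max_len and rewritten not in seen:
                    seen.add(rewritten)
                    queue.append(rewritten)
                pos = current.find(left, pos + 1)
    return language


if __name__ == "__main__":
    print(sorted(generate(RULES), key=len))
```

With the length bound of 8 the search stops after four strings; raising `max_len` yields longer members of the same language. The breadth-first order is a design choice that explores every sequence of legal rule choices rather than one particular derivation.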
10260450@unknown@formal@none@1@S@=== Formal definition ===@@@@1@4@@danf@17-8-2009
10260460@unknown@formal@none@1@S@In the classic formalization of generative grammars first proposed by [[Noam Chomsky]] in the 1950s, a grammar ''G'' consists of the following components:@@@@1@23@@danf@17-8-2009
10260470@unknown@formal@none@1@S@* A finite set <math>N</math> of ''[[nonterminal symbol]]s''.@@@@1@8@@danf@17-8-2009
10260480@unknown@formal@none@1@S@* A finite set <math>\Sigma</math> of ''[[terminal symbol]]s'' that is [[Disjoint sets|disjoint]] from <math>N</math>.@@@@1@14@@danf@17-8-2009
10260490@unknown@formal@none@1@S@* A finite set <math>P</math> of ''production rules'', each of the form@@@@1@12@@danf@17-8-2009
10260500@unknown@formal@none@1@S@:: <math>(\Sigma \cup N)^{*} N (\Sigma \cup N)^{*} \rightarrow (\Sigma \cup N)^{*}</math>@@@@1@13@@danf@17-8-2009
10260510@unknown@formal@none@1@S@:where <math>{}^{*}</math> is the [[Kleene star]] operator and <math>\cup</math> denotes [[union (set theory)|set union]].@@@@1@14@@danf@17-8-2009
10260520@unknown@formal@none@1@S@That is, each production rule maps from one string of symbols to another, where the first string contains at least one nonterminal symbol.@@@@1@23@@danf@17-8-2009
10260530@unknown@formal@none@1@S@When the second string is the [[empty string]] – that is, when it contains no symbols at all – it is often denoted by a special symbol to avoid confusion, commonly <math>\Lambda</math>, <math>e</math> or <math>\epsilon</math>.@@@@1@42@@danf@17-8-2009
10260540@unknown@formal@none@1@S@* A distinguished symbol <math>S \in N</math> that is the ''start symbol''.@@@@1@12@@danf@17-8-2009
10260550@unknown@formal@none@1@S@A grammar is formally defined as the ordered quad-tuple <math>(N, \Sigma, P, S)</math>.@@@@1@13@@danf@17-8-2009
10260560@unknown@formal@none@1@S@Such a formal grammar is often called a ''rewriting system'' or a ''phrase structure grammar'' in the literature.@@@@1@18@@danf@17-8-2009
10260570@unknown@formal@none@1@S@The operation of a grammar can be defined in terms of relations on strings:@@@@1@14@@danf@17-8-2009
10260580@unknown@formal@none@1@S@* Given a grammar <math>G = (N, \Sigma, P, S)</math>, the binary relation <math>\Rightarrow_G</math> (pronounced as "G derives in one step") on strings in <math>(\Sigma \cup N)^{*}</math> is defined by:@@@@1@30@@danf@17-8-2009
10260590@unknown@formal@none@1@S@<math>x \Rightarrow_G y \iff \exists u, v, p, q \in (\Sigma \cup N)^{*}: (x = upv) \wedge (p \rightarrow q \in P) \wedge (y = uqv)</math>@@@@1@28@@danf@17-8-2009
10260600@unknown@formal@none@1@S@* the relation <math>\Rightarrow_G^{*}</math> (pronounced as ''G derives in zero or more steps'') is defined as the [[reflexive transitive closure]] of <math>\Rightarrow_G</math>@@@@1@23@@danf@17-8-2009
10260610@unknown@formal@none@1@S@* the ''language'' of <math>G</math>, denoted as <math>\boldsymbol{L}(G)</math>, is defined as all those strings over <math>\Sigma</math> that can be generated by starting with the start symbol <math>S</math> and then applying the production rules in <math>P</math> until no more nonterminal symbols are present; that is, the set <math>\{ w \in \Sigma^{*} \mid S \Rightarrow_G^{*} w \}</math>.@@@@1@55@@danf@17-8-2009
10260620@unknown@formal@none@1@S@Note that the grammar <math>G = (N, \Sigma, P, S)</math> is effectively the [[semi-Thue system]] <math>(N \cup \Sigma, P)</math>, rewriting strings in exactly the same way; the only difference is that we distinguish specific ''nonterminal'' symbols which must be rewritten in rewrite rules, and are only interested in rewritings from the designated start symbol <math>S</math> to strings without nonterminal symbols.@@@@1@61@@danf@17-8-2009
10260630@unknown@formal@none@1@S@=== Example ===@@@@1@3@@danf@17-8-2009
10260640@unknown@formal@none@1@S@''For these examples, formal languages are specified using [[set-builder notation]].''@@@@1@10@@danf@17-8-2009
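10260650@unknown@formal@none@1@S@Consider the grammar <math>G</math> where <math>N = \{S, B\}</math>, <math>\Sigma = \{a, b, c\}</math>, <math>S</math> is the start symbol, and <math>P</math> consists of the following production rules:@@@@1@31@@danf@17-8-2009
10260660@unknown@formal@none@1@S@: 1. <math>S \rightarrow aBSc</math>@@@@1@5@@danf@17-8-2009
10260670@unknown@formal@none@1@S@: 2. <math>S \rightarrow abc</math>@@@@1@5@@danf@17-8-2009
10260680@unknown@formal@none@1@S@: 3. <math>Ba \rightarrow aB</math>@@@@1@5@@danf@17-8-2009
10260690@unknown@formal@none@1@S@: 4. <math>Bb \rightarrow bb</math>@@@@1@6@@danf@17-8-2009
10260700@unknown@formal@none@1@S@Some examples of the derivation of strings in <math>L(G)</math> are:@@@@1@10@@danf@17-8-2009
10260710@unknown@formal@none@1@S@* <math>S \Rightarrow_2 \mathbf{abc}</math>@@@@1@4@@danf@17-8-2009
10260720@unknown@formal@none@1@S@* <math>S \Rightarrow_1 \mathbf{aBSc} \Rightarrow_2 aB\mathbf{abc}c \Rightarrow_3 a\mathbf{aB}bcc \Rightarrow_4 aa\mathbf{bb}cc</math>@@@@1@10@@danf@17-8-2009
10260730@unknown@formal@none@1@S@* <math>S \Rightarrow_1 \mathbf{aBSc} \Rightarrow_1 aB\mathbf{aBSc}c \Rightarrow_2 aBaB\mathbf{abc}cc \Rightarrow_3 a\mathbf{aB}Babccc \Rightarrow_3 aaB\mathbf{aB}bccc \Rightarrow_3 aa\mathbf{aB}Bbccc \Rightarrow_4 aaaB\mathbf{bb}ccc \Rightarrow_4 aaa\mathbf{bb}bccc</math>@@@@1@19@@danf@17-8-2009
10260740@unknown@formal@none@1@S@:(Note on notation: <math>L \Rightarrow_i R</math> reads "''L'' generates ''R'' by means of production ''i''" and the generated part is each time indicated in bold.)@@@@1@25@@danf@17-8-2009
10260750@unknown@formal@none@1@S@This grammar defines the language <math>L(G) = \{ a^{n}b^{n}c^{n} \mid n \ge 1 \}</math> where <math>a^{n}</math> denotes a string of ''n'' consecutive <math>a</math>'s.@@@@1@25@@danf@17-8-2009
10260760@unknown@formal@none@1@S@Thus, the language is the set of strings that consist of 1 or more <math>a</math>'s, followed by the same number of <math>b</math>'s, followed by the same number of <math>c</math>'s.@@@@1@29@@danf@17-8-2009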
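The derivation relation and the language of the example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it encodes nonterminals as uppercase letters and terminals as lowercase letters, and it takes the four productions S → aBSc, S → abc, Ba → aB and Bb → bb (the usual textbook grammar for this language) as given. The function names <code>derive_one_step</code> and <code>language_up_to</code> are hypothetical.

```python
from collections import deque

# Assumed encoding of the example grammar: uppercase = nonterminal,
# lowercase = terminal. The rule list itself is an assumption of this sketch.
NONTERMINALS = {"S", "B"}
RULES = [("S", "aBSc"), ("S", "abc"), ("Ba", "aB"), ("Bb", "bb")]
START = "S"


def derive_one_step(string, rules=RULES):
    """Return every string reachable from `string` by applying one rule once."""
    results = set()
    for lhs, rhs in rules:
        pos = string.find(lhs)
        while pos != -1:
            results.add(string[:pos] + rhs + string[pos + len(lhs):])
            pos = string.find(lhs, pos + 1)
    return results


def language_up_to(max_len, start=START, rules=RULES):
    """Collect all terminal strings of length <= max_len derivable from `start`.

    No rule of this particular grammar shortens a string, so intermediate
    strings longer than max_len can be discarded without losing any result.
    """
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        current = queue.popleft()
        if not any(symbol in NONTERMINALS for symbol in current):
            words.add(current)  # only terminals left: a member of the language
            continue
        for nxt in derive_one_step(current, rules):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return words


print(sorted(language_up_to(9), key=len))
```

The breadth-first search simply explores the "derives in one step" relation exhaustively; the length bound is what keeps the exploration finite, and it is safe only because of the non-shortening property noted in the docstring.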
10260770@unknown@formal@none@1@S@=== The Chomsky hierarchy ===@@@@1@5@@danf@17-8-2009
10260780@unknown@formal@none@1@S@When [[Noam Chomsky]] first formalized generative grammars in 1956, he classified them into types now known as the [[Chomsky hierarchy]].@@@@1@20@@danf@17-8-2009
10260790@unknown@formal@none@1@S@The difference between these types is that they have increasingly strict production rules and can express fewer formal languages.@@@@1@19@@danf@17-8-2009
10260800@unknown@formal@none@1@S@Two important types are ''[[context-free grammar]]s'' (Type 2) and ''[[regular grammar]]s'' (Type 3).@@@@1@13@@danf@17-8-2009
10260810@unknown@formal@none@1@S@The languages that can be described with such a grammar are called ''[[context-free language]]s'' and ''[[regular language]]s'', respectively.@@@@1@18@@danf@17-8-2009
10260820@unknown@formal@none@1@S@Although much less powerful than unrestricted grammars (Type 0), which can in fact express any language that can be accepted by a [[Turing machine]], these two restricted types of grammars are most often used because [[parsing|parser]]s for them can be efficiently implemented.@@@@1@42@@danf@17-8-2009
10260830@unknown@formal@none@1@S@For example, all regular languages can be recognized by a [[finite state machine]], and for useful subsets of context-free grammars there are well-known algorithms to generate efficient [[LL parser]]s and [[LR parser]]s to recognize the corresponding languages those grammars generate.@@@@1@40@@danf@17-8-2009
10260840@unknown@formal@none@1@S@==== Context-free grammars ====@@@@1@4@@danf@17-8-2009
10260850@unknown@formal@none@1@S@A ''[[context-free grammar]]'' is a grammar in which the left-hand side of each production rule consists of only a single nonterminal symbol.@@@@1@22@@danf@17-8-2009
10260860@unknown@formal@none@1@S@This restriction is non-trivial; not all languages can be generated by context-free grammars.@@@@1@13@@danf@17-8-2009
10260870@unknown@formal@none@1@S@Those that can are called ''context-free languages''.@@@@1@7@@danf@17-8-2009
10260880@unknown@formal@none@1@S@The language defined above, <math>\{ a^{n}b^{n}c^{n} \mid n \ge 1 \}</math>, is not a context-free language, and this can be strictly proven using the [[pumping lemma for context-free languages]], but for example the language <math>\{ a^{n}b^{n} \mid n \ge 1 \}</math> (at least 1 <math>a</math> followed by the same number of <math>b</math>'s) is context-free, as it can be defined by the grammar <math>G_2</math> with <math>N = \{S\}</math>, <math>\Sigma = \{a, b\}</math>, <math>S</math> the start symbol, and the following production rules:@@@@1@74@@danf@17-8-2009
10260890@unknown@formal@none@1@S@: 1. <math>S \rightarrow aSb</math>@@@@1@5@@danf@17-8-2009
10260900@unknown@formal@none@1@S@: 2. <math>S \rightarrow ab</math>@@@@1@5@@danf@17-8-2009
10260910@unknown@formal@none@1@S@A context-free language can be recognized in <math>O(n^3)</math> time (''see'' [[Big O notation]]) by an algorithm such as [[Earley's algorithm]].@@@@1@20@@danf@17-8-2009
10260920@unknown@formal@none@1@S@That is, for every context-free language, a machine can be built that takes a string as input and determines in <math>O(n^3)</math> time whether the string is a member of the language, where <math>n</math> is the length of the string.@@@@1@39@@danf@17-8-2009
10260930@unknown@formal@none@1@S@Further, some important subsets of the context-free languages can be recognized in linear time using other algorithms.@@@@1@17@@danf@17-8-2009
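As a rough illustration of cubic-time recognition, the sketch below uses the CYK algorithm rather than Earley's algorithm, because CYK is shorter to write down. It assumes the language of strings made of some number of a's followed by the same number of b's, and it assumes a hand-converted Chomsky normal form of the grammar S → aSb, S → ab, namely S → AT, S → AB, T → SB, A → a, B → b. The helper nonterminals A, B and T and the function name <code>cyk_recognize</code> are hypothetical.

```python
# Assumed Chomsky-normal-form grammar for { a^n b^n | n >= 1 }.
UNARY = [("A", "a"), ("B", "b")]                      # X -> terminal
BINARY = [("S", ("A", "T")), ("S", ("A", "B")), ("T", ("S", "B"))]  # X -> Y Z


def cyk_recognize(word, unary=UNARY, binary=BINARY, start="S"):
    """Return True if `word` is generated by the grammar (CYK algorithm)."""
    n = len(word)
    if n == 0:
        return False  # this grammar does not generate the empty string
    # table[i][length] = set of nonterminals that derive word[i:i + length]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = {lhs for lhs, terminal in unary if terminal == ch}
    for length in range(2, n + 1):             # substring length
        for i in range(n - length + 1):        # substring start
            for split in range(1, length):     # where the two halves meet
                left = table[i][split]
                right = table[i + split][length - split]
                for lhs, (first, second) in binary:
                    if first in left and second in right:
                        table[i][length].add(lhs)
    return start in table[0][n]


print(cyk_recognize("aabb"), cyk_recognize("aab"))
```

The three nested loops over length, start position and split point are where the cubic dependence on the input length comes from; the loop over rules contributes a factor that depends only on the grammar.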
10260940@unknown@formal@none@1@S@==== Regular grammars ====@@@@1@4@@danf@17-8-2009
10260950@unknown@formal@none@1@S@In [[regular grammar]]s, the left-hand side is again only a single nonterminal symbol, but now the right-hand side is also restricted: it may be the empty string, or a single terminal symbol, or a single terminal symbol followed by a nonterminal symbol, but nothing else.@@@@1@46@@danf@17-8-2009
10260960@unknown@formal@none@1@S@(Sometimes a broader definition is used: one can allow longer strings of terminals or single nonterminals without anything else, making languages [[syntactic sugar|easier to denote]] while still defining the same class of languages.)@@@@1@33@@danf@17-8-2009
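10260970@unknown@formal@none@1@S@The language defined above is not regular, but the language <math>\{ a^{n}b^{m} \mid m, n \ge 1 \}</math> (at least 1 <math>a</math> followed by at least 1 <math>b</math>, where the numbers may be different) is, as it can be defined by the grammar <math>G_3</math> with <math>N = \{S, A, B\}</math>, <math>\Sigma = \{a, b\}</math>, <math>S</math> the start symbol, and the following production rules:@@@@1@63@@danf@17-8-2009
10260980@unknown@formal@none@1@S@:# <math>S \rightarrow aA</math>@@@@1@4@@danf@17-8-2009
10260990@unknown@formal@none@1@S@:# <math>A \rightarrow aA</math>@@@@1@4@@danf@17-8-2009
10261000@unknown@formal@none@1@S@:# <math>A \rightarrow bB</math>@@@@1@4@@danf@17-8-2009
10261010@unknown@formal@none@1@S@:# <math>B \rightarrow bB</math>@@@@1@4@@danf@17-8-2009
10261020@unknown@formal@none@1@S@:# <math>B \rightarrow \epsilon</math>@@@@1@4@@danf@17-8-2009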
10261030@unknown@formal@none@1@S@All languages generated by a regular grammar can be recognized in linear time by a [[finite state machine]].@@@@1@18@@danf@17-8-2009
10261040@unknown@formal@none@1@S@Although, in practice, regular grammars are commonly expressed using [[regular expression]]s, some forms of regular expression used in practice do not strictly generate the regular languages and, because of those deviations, do not show linear-time recognition performance.@@@@1@37@@danf@17-8-2009
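The correspondence between a regular grammar and a finite state machine can be sketched as follows. The example assumes the language of one or more a's followed by one or more b's and a grammar with the rules S → aA, A → aA, A → bB, B → bB, B → ε; each nonterminal becomes a state, each rule of the form X → tY becomes a transition, and a nonterminal with an empty-string rule becomes an accepting state. The names <code>TRANSITIONS</code>, <code>ACCEPTING</code> and <code>accepts</code> are hypothetical.

```python
# Assumed regular grammar: S -> aA, A -> aA, A -> bB, B -> bB, B -> (empty).
# A rule X -> tY is read as "in state X, on input t, move to state Y".
TRANSITIONS = {
    ("S", "a"): "A",
    ("A", "a"): "A",
    ("A", "b"): "B",
    ("B", "b"): "B",
}
ACCEPTING = {"B"}  # B has a rule producing the empty string


def accepts(word, start="S"):
    """Run the finite state machine over `word`, one step per input symbol."""
    state = start
    for ch in word:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False  # no rule applies: the word cannot be derived
    return state in ACCEPTING


print(accepts("aaabb"), accepts("ba"))
```

Each input symbol is examined exactly once, which is why recognition takes time linear in the length of the string. This particular grammar happens to be deterministic; a grammar with several rules for the same state and symbol would first need the usual subset construction.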
10261050@unknown@formal@none@1@S@=== Other forms of generative grammars ===@@@@1@7@@danf@17-8-2009
10261060@unknown@formal@none@1@S@Many extensions and variations on Chomsky's original hierarchy of formal grammars have been developed more recently, both by linguists and by computer scientists, usually either in order to increase their expressive power or in order to make them easier to analyze or [[parsing|parse]].@@@@1@43@@danf@17-8-2009
10261070@unknown@formal@none@1@S@Some forms of grammars developed include:@@@@1@6@@danf@17-8-2009
10261080@unknown@formal@none@1@S@* [[Tree-adjoining grammar]]s increase the expressiveness of conventional generative grammars by allowing rewrite rules to operate on [[parse tree]]s instead of just strings.@@@@1@23@@danf@17-8-2009
10261090@unknown@formal@none@1@S@* [[Affix grammar]]s and [[attribute grammar]]s allow rewrite rules to be augmented with semantic attributes and operations, useful both for increasing grammar expressiveness and for constructing practical language translation tools.@@@@1@30@@danf@17-8-2009
10261100@unknown@formal@none@1@S@== Analytic grammars ==@@@@1@4@@danf@17-8-2009
10261110@unknown@formal@none@1@S@Though there is a large body of literature on [[parsing]] [[algorithms]], most of these algorithms assume that the language to be parsed is initially ''described'' by means of a ''generative'' formal grammar, and that the goal is to transform this generative grammar into a working parser.@@@@1@44@@danf@17-8-2009
10261120@unknown@formal@none@1@S@Strictly speaking, a generative grammar does not in any way correspond to the algorithm used to parse a language, and various algorithms have different restrictions on the form of production rules that are considered well-formed.@@@@1@35@@danf@17-8-2009
10261130@unknown@formal@none@1@S@An alternative approach is to formalize the language in terms of an analytic grammar in the first place, which more directly corresponds to the structure and semantics of a parser for the language.@@@@1@33@@danf@17-8-2009
10261140@unknown@formal@none@1@S@Examples of analytic grammar formalisms include the following:@@@@1@8@@danf@17-8-2009
10261150@unknown@formal@none@1@S@* [[The Language Machine]] directly implements unrestricted analytic grammars.@@@@1@9@@danf@17-8-2009
10261160@unknown@formal@none@1@S@Substitution rules are used to transform an input to produce outputs and behaviour.@@@@1@13@@danf@17-8-2009
10261170@unknown@formal@none@1@S@The system can also produce [http://languagemachine.sourceforge.net/picturebook.html the lm-diagram] which shows what happens when the rules of an unrestricted analytic grammar are being applied.@@@@1@23@@danf@17-8-2009
10261180@unknown@formal@none@1@S@* [[Top-down parsing language]] (TDPL): a highly minimalist analytic grammar formalism developed in the early 1970s to study the behavior of [[Top-down parsing|top-down parsers]].@@@@1@24@@danf@17-8-2009
10261190@unknown@formal@none@1@S@* [[Link grammar]]s: a form of analytic grammar designed for [[linguistics]], which derives syntactic structure by examining the positional relationships between pairs of words.@@@@1@24@@danf@17-8-2009
10261200@unknown@formal@none@1@S@* [[Parsing expression grammar]]s (PEGs): a more recent generalization of TDPL designed around the practical [[expressiveness]] needs of [[programming language]] and [[compiler]] writers.@@@@1@23@@danf@17-8-2009
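To give a feel for how an analytic grammar reads as a recognizer, here is a minimal sketch in the style of a parsing expression grammar. It assumes the single rule <code>S <- 'a' S 'b' / 'a' 'b'</code>, where <code>/</code> is ordered choice: the second alternative is tried only if the first one fails. The combinators <code>lit</code>, <code>seq</code> and <code>choice</code> are hypothetical helpers written for this illustration, not the API of any particular PEG tool.

```python
def lit(text):
    """Match the literal `text` at the current position."""
    def parse(s, pos):
        return pos + len(text) if s.startswith(text, pos) else None
    return parse


def seq(*parsers):
    """Match each parser in turn; fail as soon as one of them fails."""
    def parse(s, pos):
        for p in parsers:
            pos = p(s, pos)
            if pos is None:
                return None
        return pos
    return parse


def choice(*parsers):
    """Ordered choice: return the result of the first parser that succeeds."""
    def parse(s, pos):
        for p in parsers:
            end = p(s, pos)
            if end is not None:
                return end
        return None
    return parse


def rule_s(s, pos):
    # S <- 'a' S 'b' / 'a' 'b'
    return choice(
        seq(lit("a"), rule_s, lit("b")),
        seq(lit("a"), lit("b")),
    )(s, pos)


def matches(s):
    """True if rule S consumes the whole input."""
    return rule_s(s, 0) == len(s)


print(matches("aabb"), matches("aab"))
```

Each parser returns the position after the text it consumed, or <code>None</code> on failure, so the rule is at once the description of the language and the procedure that recognizes it.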
10270010@unknown@formal@none@1@S@Free software@@@@1@2@@danf@17-8-2009
10270020@unknown@formal@none@1@S@'''Free software''' or software libre is [[software]] that can be used, studied, and modified without restriction, and which can be copied and redistributed in modified or unmodified form either without restriction, or with minimal restrictions only to ensure that further recipients can also do these things.@@@@1@46@@danf@17-8-2009
10270030@unknown@formal@none@1@S@In practice, for software to be distributed as free software, the human readable form of the program (the "[[source code]]") must be made available to the recipient along with a notice granting the above permissions.@@@@1@35@@danf@17-8-2009
10270040@unknown@formal@none@1@S@Such a notice is a "[[free software licence]]", or, in theory, could be a notice saying that the source code is released into the [[public domain]].@@@@1@26@@danf@17-8-2009
10270050@unknown@formal@none@1@S@The [[free software movement]] was conceived in 1983 by [[Richard Stallman]] to make these freedoms available to every computer user.@@@@1@20@@danf@17-8-2009
10270060@unknown@formal@none@1@S@From the late 1990s onward, [[alternative terms for free software]] came into use.@@@@1@13@@danf@17-8-2009
10270070@unknown@formal@none@1@S@"'''[[Open source software]]'''" is the most common such alternative term.@@@@1@10@@danf@17-8-2009
10270080@unknown@formal@none@1@S@Others include "'''software [[Gratis versus Libre|libre]]'''", "free, libre and open-source software" ("'''[[FOSS]]'''", or, with "libre", "'''FLOSS'''").@@@@1@16@@danf@17-8-2009
10270090@unknown@formal@none@1@S@The antonym of free software is "''[[proprietary software]]''" or ''non-free software''.@@@@1@11@@danf@17-8-2009
10270100@unknown@formal@none@1@S@Free software is distinct from "[[freeware]]" which is [[proprietary software]] made available free of charge.@@@@1@15@@danf@17-8-2009
10270110@unknown@formal@none@1@S@Users usually cannot study, modify, or redistribute freeware.@@@@1@8@@danf@17-8-2009
10270120@unknown@formal@none@1@S@Since free software may be freely redistributed, it generally is available at little or no cost.@@@@1@16@@danf@17-8-2009
10270130@unknown@formal@none@1@S@Free software business models are usually based on adding value such as support, training, customization, integration, or certification.@@@@1@18@@danf@17-8-2009
10270140@unknown@formal@none@1@S@At the same time, some business models which work with [[proprietary software]] are not compatible with free software, such as those that depend on a user paying for a licence in order to lawfully use a software product.@@@@1@38@@danf@17-8-2009
10270150@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009
10270160@unknown@formal@none@1@S@In the 1950s, 1960s, and 1970s, it was normal for computer users to have the freedoms that are provided by free software.@@@@1@22@@danf@17-8-2009
10270170@unknown@formal@none@1@S@[[Software]] was commonly shared by individuals who used computers and by hardware manufacturers who were glad that people were making software that made their hardware useful.@@@@1@26@@danf@17-8-2009
10270180@unknown@formal@none@1@S@In the 1970s and early 1980s, the [[software industry]] began using technical measures (such as only distributing [[Executable|binary copies]] of [[computer programs]]) to prevent [[computer users]] from being able to study and modify software.@@@@1@34@@danf@17-8-2009
10270190@unknown@formal@none@1@S@In 1980 [[copyright]] law was extended to computer programs.@@@@1@9@@danf@17-8-2009
10270200@unknown@formal@none@1@S@In 1983, [[Richard Stallman]], longtime member of the [[hacker (free and open source software)|hacker]] community at the [[MIT Artificial Intelligence Laboratory]], announced the [[GNU project]], saying that he had become frustrated with the effects of the change in culture of the computer industry and its users.@@@@1@46@@danf@17-8-2009
10270210@unknown@formal@none@1@S@Software development for the [[GNU operating system]] began in January 1984, and the [[Free Software Foundation]] (FSF) was founded in October 1985.@@@@1@22@@danf@17-8-2009
10270220@unknown@formal@none@1@S@Stallman developed a free software definition and the concept of "[[copyleft]]", designed to ensure software freedom for all.@@@@1@18@@danf@17-8-2009
10270230@unknown@formal@none@1@S@Free software is a widespread international concept, producing software used by individuals, large organizations, and governmental administrations.@@@@1@17@@danf@17-8-2009
10270240@unknown@formal@none@1@S@Free software has a very high market penetration in server-side Internet applications such as the [[Apache web server]], [[MySQL]] database, and [[PHP]] scripting language.@@@@1@24@@danf@17-8-2009
10270250@unknown@formal@none@1@S@Completely free computing environments are available as large packages of basic system software, such as the many [[GNU/Linux distribution]]s and [[FreeBSD]].@@@@1@21@@danf@17-8-2009
10270260@unknown@formal@none@1@S@Free software [[Software development|developers]] have also created free versions of almost all commonly used desktop applications, including Web browsers, office productivity suites, and multimedia players.@@@@1@25@@danf@17-8-2009
10270270@unknown@formal@none@1@S@In many categories, however, free software for individual [[workstation]]s or home users has only a fraction of the market share of its proprietary competitors.@@@@1@30@@danf@17-8-2009
10270280@unknown@formal@none@1@S@Most free software is distributed [[online]] without charge, or [[off-line]] at the [[marginal cost]] of distribution, but this pricing model is not required, and people may sell copies of free software programs for any price.@@@@1@35@@danf@17-8-2009
10270290@unknown@formal@none@1@S@The economic viability of free software has been recognised by large corporations such as [[IBM]], [[Red Hat]], and [[Sun Microsystems]].@@@@1@20@@danf@17-8-2009
10270300@unknown@formal@none@1@S@Many companies whose core business is not in the IT sector choose free software for their Internet information and sales sites, due to the lower initial capital investment and ability to freely customize the application packages.@@@@1@36@@danf@17-8-2009
10270310@unknown@formal@none@1@S@Also, some non-software industries are beginning to use techniques similar to those used in free software development for their research and development process; scientists, for example, are looking towards more open development processes, and hardware such as microchips are beginning to be developed with specifications released under [[copyleft]] licenses (see the [[OpenCores]] project, for instance).@@@@1@55@@danf@17-8-2009
10270320@unknown@formal@none@1@S@[[Creative Commons]] and the [[free culture movement]] have also been largely influenced by the free software movement.@@@@1@17@@danf@17-8-2009
10270330@unknown@formal@none@1@S@===Naming===@@@@1@1@@danf@17-8-2009
10270340@unknown@formal@none@1@S@The FSF recommends using the term "free software" rather than "open source software" because that term and the associated marketing campaign focuses on the technical issues of software development, avoiding the issue of user freedoms.@@@@1@35@@danf@17-8-2009
10270350@unknown@formal@none@1@S@"[[Libre]]" is used to avoid the ambiguity of the word "free".@@@@1@11@@danf@17-8-2009
10270360@unknown@formal@none@1@S@However, amongst English speakers, ''libre'' is used primarily within the free software movement.@@@@1@14@@danf@17-8-2009
10270370@unknown@formal@none@1@S@== Definition ==@@@@1@3@@danf@17-8-2009
10270380@unknown@formal@none@1@S@The first formal definition of free software was published by FSF in February 1986.@@@@1@14@@danf@17-8-2009
10270390@unknown@formal@none@1@S@That definition, written by Richard Stallman, is still maintained today and states that software is free software if people who receive a copy of the software have the following four freedoms:@@@@1@31@@danf@17-8-2009
10270400@unknown@formal@none@1@S@* Freedom 0: The freedom to run the program for any purpose.@@@@1@12@@danf@17-8-2009
10270410@unknown@formal@none@1@S@* Freedom 1: The freedom to study and modify the program.@@@@1@11@@danf@17-8-2009
10270420@unknown@formal@none@1@S@* Freedom 2: The freedom to copy the program so you can help your neighbor.@@@@1@15@@danf@17-8-2009
10270430@unknown@formal@none@1@S@* Freedom 3: The freedom to improve the program, and release your improvements to the public, so that the whole community benefits.@@@@1@22@@danf@17-8-2009
10270440@unknown@formal@none@1@S@Freedoms 1 and 3 require [[source code]] to be available because studying and modifying software without its source code is highly impractical.@@@@1@22@@danf@17-8-2009
10270450@unknown@formal@none@1@S@Thus, free software means that [[user (computing)|computer users]] have the freedom to cooperate with whom they choose, and to control the software they use.@@@@1@24@@danf@17-8-2009
10270460@unknown@formal@none@1@S@To summarize this into a remark distinguishing ''[[Gratis versus Libre|libre]]'' (freedom) software from ''[[Gratis versus Libre|gratis]]'' (zero price) software, [[Richard Stallman]] said: "''Free software is a matter of liberty, not price.@@@@1@31@@danf@17-8-2009
10270470@unknown@formal@none@1@S@To understand the concept, you should think of 'free' as in '[[free speech]]', not as in '[[free beer]]'''".@@@@1@18@@danf@17-8-2009
10270480@unknown@formal@none@1@S@In the late 1990s, other groups published their own definitions which describe an almost identical set of software.@@@@1@18@@danf@17-8-2009
10270490@unknown@formal@none@1@S@The most notable are the [[Debian Free Software Guidelines]], published in 1997, and the [[Open Source Definition]], published in 1998.@@@@1@19@@danf@17-8-2009
10270500@unknown@formal@none@1@S@The BSD-based operating systems, such as [[FreeBSD]], [[OpenBSD]], and [[NetBSD]], do not have their own formal definitions of free software.@@@@1@20@@danf@17-8-2009
10270510@unknown@formal@none@1@S@Users of these systems generally find the same set of software to be acceptable, but sometimes see copyleft as restrictive.@@@@1@20@@danf@17-8-2009
10270520@unknown@formal@none@1@S@They generally advocate [[permissive free software licenses]], which allow others to make software based on their source code, and then release the modified result as proprietary software.@@@@1@27@@danf@17-8-2009
10270530@unknown@formal@none@1@S@Their view is that this permissive approach is more free.@@@@1@10@@danf@17-8-2009
10270540@unknown@formal@none@1@S@The [[Kerberos (protocol)|Kerberos]], [[X.org]], and [[Apache License|Apache]] software licenses are substantially similar in intent and implementation.@@@@1@16@@danf@17-8-2009
10270550@unknown@formal@none@1@S@All of these software packages originated in academic institutions interested in wide technology transfer ([[University of California]], [[Massachusetts Institute of Technology|MIT]], and [[University of Illinois at Urbana-Champaign|UIUC]]).@@@@1@27@@danf@17-8-2009
10270560@unknown@formal@none@1@S@== Examples of free software ==@@@@1@6@@danf@17-8-2009
10270570@unknown@formal@none@1@S@The [[Free Software Directory]] is a free software project that maintains a large database of free software packages.@@@@1@18@@danf@17-8-2009
10270580@unknown@formal@none@1@S@===Notable free software===@@@@1@3@@danf@17-8-2009
10270590@unknown@formal@none@1@S@* [[Graphical user interface|GUI]] related@@@@1@5@@danf@17-8-2009
10270600@unknown@formal@none@1@S@**[[X Window System]]@@@@1@3@@danf@17-8-2009
10270610@unknown@formal@none@1@S@**[[GNOME]]@@@@1@1@@danf@17-8-2009
10270620@unknown@formal@none@1@S@**[[KDE]]@@@@1@1@@danf@17-8-2009
10270630@unknown@formal@none@1@S@**[[Xfce]] desktop environments@@@@1@3@@danf@17-8-2009
10270640@unknown@formal@none@1@S@* [[OpenOffice.org]] office suite@@@@1@4@@danf@17-8-2009
10270650@unknown@formal@none@1@S@* [[Mozilla Application Suite|Mozilla]] and [[Mozilla Firefox|Firefox]] web browsers.@@@@1@9@@danf@17-8-2009
10270660@unknown@formal@none@1@S@* Typesetting and document preparation systems@@@@1@6@@danf@17-8-2009
10270670@unknown@formal@none@1@S@**[[TeX]]@@@@1@1@@danf@17-8-2009
10270680@unknown@formal@none@1@S@**[[LaTeX]]@@@@1@1@@danf@17-8-2009
10270690@unknown@formal@none@1@S@* Graphics tools like [[GIMP]] image graphics editor and [[Blender (software)|Blender]] 3D animation program.@@@@1@14@@danf@17-8-2009
10270700@unknown@formal@none@1@S@* [[Text editor]]s like [[vi]] or [[emacs]].@@@@1@7@@danf@17-8-2009
10270710@unknown@formal@none@1@S@* [[ogg]] is a free software multimedia container, used to hold [[ogg vorbis]] sound and [[ogg theora]] video.@@@@1@18@@danf@17-8-2009
10270720@unknown@formal@none@1@S@* [[Relational database]] systems@@@@1@4@@danf@17-8-2009
10270730@unknown@formal@none@1@S@**[[MySQL]]@@@@1@1@@danf@17-8-2009
10270740@unknown@formal@none@1@S@**[[PostgreSQL]]@@@@1@1@@danf@17-8-2009
10270750@unknown@formal@none@1@S@* [[GNU Compiler Collection|GCC]] compilers, [[GDB]] debugger and the [[GNU C Library]].@@@@1@12@@danf@17-8-2009
10270760@unknown@formal@none@1@S@====Programming languages====@@@@1@2@@danf@17-8-2009
10270770@unknown@formal@none@1@S@*[[Java (programming language)|Java]]@@@@1@3@@danf@17-8-2009
10270780@unknown@formal@none@1@S@*[[Perl]]@@@@1@1@@danf@17-8-2009
10270790@unknown@formal@none@1@S@*[[PHP]]@@@@1@1@@danf@17-8-2009
10270800@unknown@formal@none@1@S@*[[Python (programming language)|Python]]@@@@1@3@@danf@17-8-2009
10270810@unknown@formal@none@1@S@*[[Lua (programming language)|Lua]]@@@@1@3@@danf@17-8-2009
10270820@unknown@formal@none@1@S@*[[Ruby programming language|Ruby]]@@@@1@3@@danf@17-8-2009
10270830@unknown@formal@none@1@S@*[[Tcl]]@@@@1@1@@danf@17-8-2009
10270840@unknown@formal@none@1@S@====Servers====@@@@1@1@@danf@17-8-2009
10270850@unknown@formal@none@1@S@*[[Apache HTTP Server|Apache web server]]@@@@1@5@@danf@17-8-2009
10270860@unknown@formal@none@1@S@*[[BIND]] name server@@@@1@3@@danf@17-8-2009
10270870@unknown@formal@none@1@S@*[[Sendmail]] mail transport@@@@1@3@@danf@17-8-2009
10270880@unknown@formal@none@1@S@*[[Samba software|Samba]] file server.@@@@1@4@@danf@17-8-2009
10270890@unknown@formal@none@1@S@====Operating systems====@@@@1@2@@danf@17-8-2009
10270900@unknown@formal@none@1@S@*[[GNU/Linux]]@@@@1@1@@danf@17-8-2009
10270910@unknown@formal@none@1@S@*[[Berkeley Software Distribution|BSD]]@@@@1@3@@danf@17-8-2009
10270920@unknown@formal@none@1@S@*[[Darwin (operating system)|Darwin]]@@@@1@3@@danf@17-8-2009
10270930@unknown@formal@none@1@S@*[[OpenSolaris]]@@@@1@1@@danf@17-8-2009
10270940@unknown@formal@none@1@S@== Free software licenses ==@@@@1@5@@danf@17-8-2009
10270950@unknown@formal@none@1@S@All free software licenses must grant people all the freedoms discussed above.@@@@1@12@@danf@17-8-2009
10270960@unknown@formal@none@1@S@However, unless the applications' licenses are compatible, combining programs by mixing source code or directly linking binaries is problematic, because of license technicalities.@@@@1@23@@danf@17-8-2009
10270970@unknown@formal@none@1@S@Programs indirectly connected together may avoid this problem.@@@@1@8@@danf@17-8-2009
10270980@unknown@formal@none@1@S@The majority of free software uses a small set of licenses.@@@@1@11@@danf@17-8-2009
10270990@unknown@formal@none@1@S@The most popular of these licenses are:@@@@1@7@@danf@17-8-2009
10271000@unknown@formal@none@1@S@* the [[GNU General Public License]]@@@@1@6@@danf@17-8-2009
10271010@unknown@formal@none@1@S@* the [[GNU Lesser General Public License]]@@@@1@7@@danf@17-8-2009
10271020@unknown@formal@none@1@S@* the [[BSD License]]@@@@1@4@@danf@17-8-2009
10271030@unknown@formal@none@1@S@* the [[Mozilla Public License]]@@@@1@5@@danf@17-8-2009
10271040@unknown@formal@none@1@S@* the [[MIT License]]@@@@1@4@@danf@17-8-2009
10271050@unknown@formal@none@1@S@* the [[Apache License]]@@@@1@4@@danf@17-8-2009
10271060@unknown@formal@none@1@S@The Free Software Foundation and the Open Source Initiative both publish lists of licenses that they find to comply with their own definitions of free software and open-source software respectively.@@@@1@30@@danf@17-8-2009
10271070@unknown@formal@none@1@S@* [[List of FSF approved software licenses]]@@@@1@7@@danf@17-8-2009
10271080@unknown@formal@none@1@S@* [[List of OSI approved software licenses]]@@@@1@7@@danf@17-8-2009
10271090@unknown@formal@none@1@S@These lists are necessarily incomplete, because a license need not be known by either organization in order to provide these freedoms.@@@@1@21@@danf@17-8-2009
10271100@unknown@formal@none@1@S@Apart from these two organizations, the [[Debian]] project is seen by some to provide useful advice on whether particular licenses comply with their [[Debian Free Software Guidelines]].@@@@1@27@@danf@17-8-2009
10271110@unknown@formal@none@1@S@Debian doesn't publish a list of ''approved'' licenses, so its judgments have to be tracked by checking what software they have allowed into their software archives.@@@@1@26@@danf@17-8-2009
10271120@unknown@formal@none@1@S@That is summarized at the Debian web site.@@@@1@8@@danf@17-8-2009
10271130@unknown@formal@none@1@S@However, it is rare for a license to be judged compliant with the FSF guidelines but not the OSI guidelines, or [[vice versa]] (the [[Netscape Public License]] used for early versions of Mozilla being an exception), so exact definitions of the terms have not become hot issues.@@@@1@46@@danf@17-8-2009
10271140@unknown@formal@none@1@S@=== Permissive and copyleft licenses ===@@@@1@6@@danf@17-8-2009
10271150@unknown@formal@none@1@S@The FSF categorizes licenses in the following ways:@@@@1@8@@danf@17-8-2009
10271160@unknown@formal@none@1@S@* [[Public domain]] software - the copyright has expired, the work was not copyrighted or the author has abandoned the copyright.@@@@1@21@@danf@17-8-2009
10271170@unknown@formal@none@1@S@Since public-domain software lacks copyright protection, it may be freely incorporated into any work, whether proprietary or free.@@@@1@18@@danf@17-8-2009
10271180@unknown@formal@none@1@S@* [[permissive free software licences|Permissive licenses]], also called BSD-style because they are applied to much of the software distributed with the [[Berkeley Software Distribution|BSD]] operating systems.@@@@1@26@@danf@17-8-2009
10271190@unknown@formal@none@1@S@The author retains copyright solely to disclaim warranty and require proper attribution of modified works, but permits redistribution and modification in ''any'' work, even a proprietary one.@@@@1@26@@danf@17-8-2009
10271200@unknown@formal@none@1@S@* [[Copyleft]] licenses, the [[GNU General Public License]] being the most prominent.@@@@1@12@@danf@17-8-2009
10271210@unknown@formal@none@1@S@The author retains copyright and permits redistribution and modification provided all such redistribution is licensed under the same license.@@@@1@19@@danf@17-8-2009
10271220@unknown@formal@none@1@S@Additions and modifications by others must also be licensed under the same 'copyleft' license whenever they are distributed with part of the original licensed product.@@@@1@25@@danf@17-8-2009
10271230@unknown@formal@none@1@S@== Security and reliability==@@@@1@4@@danf@17-8-2009
10271240@unknown@formal@none@1@S@There is debate over the [[computer security|security]] of free software in comparison to proprietary software, with a major issue being [[security through obscurity]].@@@@1@23@@danf@17-8-2009
10271250@unknown@formal@none@1@S@A popular quantitative test in computer security is using relative counting of known unpatched security flaws.@@@@1@16@@danf@17-8-2009
10271260@unknown@formal@none@1@S@Generally, users of this method advise avoiding products which lack fixes for known security flaws, at least until a fix is available.@@@@1@22@@danf@17-8-2009
10271270@unknown@formal@none@1@S@Some claim that this method is biased by counting more vulnerabilities for the free software, since its source code is accessible and its community is more forthcoming about what problems exist.@@@@1@31@@danf@17-8-2009
10271280@unknown@formal@none@1@S@Free software advocates counter that even if proprietary software does not have "published" flaws, flaws could still exist and possibly be known to malicious users.@@@@1@25@@danf@17-8-2009
10271290@unknown@formal@none@1@S@The ability of users to view and modify the source code allows many more people to potentially analyse the code and possibly to have a higher rate of finding bugs and flaws than an average-sized corporation could manage.@@@@1@39@@danf@17-8-2009
10271300@unknown@formal@none@1@S@Users having access to the source code also makes creating and deploying [[spyware]] far more difficult.@@@@1@16@@danf@17-8-2009
10271310@unknown@formal@none@1@S@[[David A. Wheeler]] has published research concluding that free software is quantitatively more reliable than proprietary software.@@@@1@17@@danf@17-8-2009
10271320@unknown@formal@none@1@S@== Adoption ==@@@@1@3@@danf@17-8-2009
10271330@unknown@formal@none@1@S@Free software played a part in the development of the Internet, the World Wide Web and the infrastructure of [[dot-com companies]].@@@@1@21@@danf@17-8-2009
10271340@unknown@formal@none@1@S@Free software allows users to cooperate in enhancing and refining the programs they use; free software is a [[pure public good]] rather than a [[private good]].@@@@1@26@@danf@17-8-2009
10271350@unknown@formal@none@1@S@Companies that contribute to free software can increase commercial [[innovation]] amidst the void of [[patent]] [[cross licensing]] lawsuits.@@@@1@18@@danf@17-8-2009
10271360@unknown@formal@none@1@S@(See [[Mpeg2#Patent holders|mpeg2 patent holders]])@@@@1@5@@danf@17-8-2009
10271370@unknown@formal@none@1@S@Under the free software business model, free software vendors may charge a fee for distribution and offer pay support and software customization services.@@@@1@23@@danf@17-8-2009
10271380@unknown@formal@none@1@S@Proprietary software uses a different business model, where a customer of the proprietary software pays a fee for a license to use the software.@@@@1@24@@danf@17-8-2009
10271390@unknown@formal@none@1@S@This license may grant the customer the ability to configure some or no parts of the software themselves.@@@@1@18@@danf@17-8-2009
10271400@unknown@formal@none@1@S@Often some level of support is included in the purchase of proprietary software, but additional support services (especially for enterprise applications) are usually available for an additional fee.@@@@1@28@@danf@17-8-2009
10271410@unknown@formal@none@1@S@Some proprietary software vendors will also customize software for a fee.@@@@1@11@@danf@17-8-2009
10271420@unknown@formal@none@1@S@Free software is generally available at little to no cost and can result in permanently lower costs compared to [[proprietary software]].@@@@1@21@@danf@17-8-2009
10271430@unknown@formal@none@1@S@With free software, businesses can fit software to their specific needs by changing the software themselves or by hiring programmers to modify it for them.@@@@1@25@@danf@17-8-2009
10271440@unknown@formal@none@1@S@Free software often has no warranty, and more importantly, generally does not assign legal liability to anyone.@@@@1@17@@danf@17-8-2009
10271450@unknown@formal@none@1@S@However, warranties are permitted between any two parties upon the condition of the software and its usage.@@@@1@17@@danf@17-8-2009
10271460@unknown@formal@none@1@S@Such an agreement is made separately from the free software license.@@@@1@11@@danf@17-8-2009
10271470@unknown@formal@none@1@S@== Controversies ==@@@@1@3@@danf@17-8-2009
10271480@unknown@formal@none@1@S@=== Binary blobs ===@@@@1@4@@danf@17-8-2009
10271490@unknown@formal@none@1@S@In 2006, [[OpenBSD]] started the first campaign against the use of [[binary blobs]] in [[kernel (computer science)|kernels]].@@@@1@17@@danf@17-8-2009
10271500@unknown@formal@none@1@S@Blobs are usually freely distributable [[device driver]]s for hardware from vendors that do not reveal driver source code to users or developers.@@@@1@22@@danf@17-8-2009
10271510@unknown@formal@none@1@S@This restricts the users' freedom to effectively modify the software and distribute modified versions.@@@@1@14@@danf@17-8-2009
10271520@unknown@formal@none@1@S@Also, since the blobs are undocumented and may have [[computer bug|bugs]], they pose a security risk to any [[operating system]] whose kernel includes them.@@@@1@24@@danf@17-8-2009
10271530@unknown@formal@none@1@S@The proclaimed aim of the campaign against blobs is to collect hardware documentation that allows developers to write free software drivers for that hardware, ultimately enabling all free operating systems to become or remain blob-free.@@@@1@35@@danf@17-8-2009
10271540@unknown@formal@none@1@S@The issue of binary blobs in the [[Linux kernel]] and other device drivers motivated some developers in Ireland to launch [[gNewSense]], a GNU/Linux distribution with all the binary blobs removed.@@@@1@30@@danf@17-8-2009
10271550@unknown@formal@none@1@S@The project received support from the [[Free Software Foundation]].@@@@1@9@@danf@17-8-2009
10271560@unknown@formal@none@1@S@=== BitKeeper ===@@@@1@3@@danf@17-8-2009
10271570@unknown@formal@none@1@S@[[Larry McVoy]] invited high-profile free software projects to use his proprietary [[versioning system]], [[BitKeeper]], free of charge, in order to attract paying users.@@@@1@23@@danf@17-8-2009
10271580@unknown@formal@none@1@S@In 2002, Linux coordinator [[Linus Torvalds]] decided to use BitKeeper to develop the Linux kernel, a free software project, claiming no free software alternative met his needs.@@@@1@27@@danf@17-8-2009
10271590@unknown@formal@none@1@S@This controversial decision drew criticism from several sources, including the Free Software Foundation's founder Richard Stallman.@@@@1@16@@danf@17-8-2009
10271600@unknown@formal@none@1@S@Following the apparent [[reverse engineering]] of BitKeeper's protocols, McVoy withdrew permission for gratis use by free software projects, leading the Linux kernel community to develop a free software replacement in [[Git (software)|Git]].@@@@1@32@@danf@17-8-2009
10271610@unknown@formal@none@1@S@=== Patent deals ===@@@@1@4@@danf@17-8-2009
10271620@unknown@formal@none@1@S@In November 2006, the [[Microsoft]] and [[Novell]] software corporations announced a controversial partnership involving, among other things, patent protection for some customers of Novell under certain conditions.@@@@1@27@@danf@17-8-2009
10280010@unknown@formal@none@1@S@Freeware@@@@1@1@@danf@17-8-2009
10280020@unknown@formal@none@1@S@'''Freeware''' is computer [[software]] that is available for use at no cost or for an optional fee.@@@@1@17@@danf@17-8-2009
10280030@unknown@formal@none@1@S@Freeware is often made available in a binary-only, [[proprietary software|proprietary]] form, thus making it distinct from [[free software]].@@@@1@18@@danf@17-8-2009
10280040@unknown@formal@none@1@S@Proprietary freeware allows authors to contribute something for the benefit of the community, while at the same time allowing them to retain control of the source code and preserve its business potential.@@@@1@32@@danf@17-8-2009
10280050@unknown@formal@none@1@S@Freeware is different from [[shareware]], where the user is obliged to pay (e.g. after some trial period or for additional functionality).@@@@1@21@@danf@17-8-2009
10280060@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009
10280070@unknown@formal@none@1@S@The term ''freeware'' was coined by [[Andrew Fluegelman]] when he wanted to sell a communications program named [[PC-Talk]] that he had created but for which he did not wish to use traditional methods of distribution because of their cost.@@@@1@39@@danf@17-8-2009
10280080@unknown@formal@none@1@S@Fluegelman actually distributed PC-Talk via a process now referred to as [[shareware]].@@@@1@12@@danf@17-8-2009
10280090@unknown@formal@none@1@S@Current use of the term freeware does not necessarily match the original concept by Andrew Fluegelman.@@@@1@16@@danf@17-8-2009
10280100@unknown@formal@none@1@S@== Criteria ==@@@@1@3@@danf@17-8-2009
10280110@unknown@formal@none@1@S@The only criterion for being classified as freeware is that the software must be fully functional for an unlimited time with no monetary cost.@@@@1@24@@danf@17-8-2009
10280120@unknown@formal@none@1@S@The software license may impose one or more other restrictions on the type of use including personal use, individual use, non-profit use, non-commercial use, academic use, commercial use or any combination of these.@@@@1@33@@danf@17-8-2009
10280130@unknown@formal@none@1@S@For instance, the license may be "free for personal, non-commercial use."@@@@1@11@@danf@17-8-2009
10280140@unknown@formal@none@1@S@Everything created with the freeware programs can be distributed at no cost (for example graphics, documents, or sounds made by the user).@@@@1@22@@danf@17-8-2009
10290010@unknown@formal@none@1@S@French language@@@@1@2@@danf@17-8-2009
10290020@unknown@formal@none@1@S@'''French''' (''français'') is today spoken around the world by 72 to 130 million people as a [[first language|native]] language, and by about 190 to 600 million people as a [[second language|second]] or third language, with significant speakers in 54 countries.@@@@1@40@@danf@17-8-2009
10290030@unknown@formal@none@1@S@Most native speakers of the language live in [[France]], where the language originated.@@@@1@13@@danf@17-8-2009
10290040@unknown@formal@none@1@S@The rest live in [[Canada]], [[Belgium]] and [[Switzerland]].@@@@1@8@@danf@17-8-2009
10290050@unknown@formal@none@1@S@French is a descendant of the [[Latin]] language of the [[Roman Empire]], as are languages such as [[Portuguese language|Portuguese]], [[Spanish language|Spanish]], [[Italian language|Italian]], [[Catalan language|Catalan]] and [[Romanian language|Romanian]].@@@@1@28@@danf@17-8-2009
10290060@unknown@formal@none@1@S@Its development was also influenced by the native [[Celtic languages]] of Roman [[Gaul]] and by the [[Germanic languages|Germanic]] language of the post-Roman [[Frankish]] invaders.@@@@1@24@@danf@17-8-2009
10290070@unknown@formal@none@1@S@It is an [[official language]] in [[List of countries where French is an official language|29 countries]], most of which form what is called in French ''La [[Francophonie]]'', the community of French-speaking nations.@@@@1@32@@danf@17-8-2009
10290080@unknown@formal@none@1@S@It is an official language of all [[United Nations]] agencies and a [[List of international organisations which have French as an official language|large number of international organizations]].@@@@1@27@@danf@17-8-2009
10290090@unknown@formal@none@1@S@According to the [[European Union]], 129 million (26% of the 497,198,740) people in 27 member states speak French, of whom 59 million (12%) speak it natively and 69 million (14%) claim to speak it as a second language, which makes it the third most spoken second language in the Union, after English and German respectively.@@@@1@55@@danf@17-8-2009
10290100@unknown@formal@none@1@S@== Geographic distribution==@@@@1@3@@danf@17-8-2009
10290110@unknown@formal@none@1@S@===Europe===@@@@1@1@@danf@17-8-2009
10290120@unknown@formal@none@1@S@====Legal status in France====@@@@1@4@@danf@17-8-2009
10290130@unknown@formal@none@1@S@Per the [[Constitution of France]], French has been the official language since 1992 (although previous legal texts have made it official since 1539, see [[ordinance of Villers-Cotterêts]]).@@@@1@27@@danf@17-8-2009
10290140@unknown@formal@none@1@S@[[France]] mandates the use of French in official government publications, public [[education]] outside of specific cases (though these dispositions are often ignored) and legal [[contract]]s; [[advertisement]]s must bear a translation of foreign words.@@@@1@33@@danf@17-8-2009
10290150@unknown@formal@none@1@S@In addition to French, there are also a variety of regional languages.@@@@1@12@@danf@17-8-2009
10290160@unknown@formal@none@1@S@France has signed the European Charter for Regional Languages but has not ratified it since that would go against the 1958 Constitution.@@@@1@22@@danf@17-8-2009
10290170@unknown@formal@none@1@S@====Switzerland====@@@@1@1@@danf@17-8-2009
10290180@unknown@formal@none@1@S@French is one of the four official languages of [[Switzerland]] (along with [[German language|German]], [[Italian language|Italian]], and [[Romansh language|Romansh]]) and is spoken in the part of Switzerland called ''[[Romandie]]''.@@@@1@29@@danf@17-8-2009
10290190@unknown@formal@none@1@S@French is the native language of about 20% of the Swiss population.@@@@1@12@@danf@17-8-2009
10290200@unknown@formal@none@1@S@====Belgium====@@@@1@1@@danf@17-8-2009
10290210@unknown@formal@none@1@S@In [[Belgium]], French is the official language of [[Wallonia]] (excluding the [[East Cantons]], which are [[German language|German-speaking]]) and one of the two official languages—along with [[Dutch language|Dutch]]—of the [[Brussels-Capital Region]] where it is spoken by the majority of the population, though often not as their primary language.@@@@1@47@@danf@17-8-2009
10290220@unknown@formal@none@1@S@French and German are neither official languages nor recognised minority languages in the [[Flemish Region]], although along the borders with the Walloon and Brussels-Capital regions there are a dozen [[municipalities with language facilities]] for French-speakers; a mirroring situation exists for the Walloon Region with respect to the Dutch and German languages.@@@@1@51@@danf@17-8-2009
10290230@unknown@formal@none@1@S@In total, native French-speakers make up about 40% of the country's population; the remaining 60% speak Dutch, and 59% of the latter claim to speak French as a second language.@@@@1@30@@danf@17-8-2009
10290240@unknown@formal@none@1@S@French is thus known by an estimated 75% of all Belgians, either as a mother tongue or as a second or third language.@@@@1@23@@danf@17-8-2009
10290250@unknown@formal@none@1@S@====Monaco and Andorra====@@@@1@3@@danf@17-8-2009
10290260@unknown@formal@none@1@S@Although [[Monégasque language|Monégasque]] is the national language of the [[Principality of Monaco]], French is the only official language, and French nationals make up some 47% of the population.@@@@1@28@@danf@17-8-2009
10290270@unknown@formal@none@1@S@[[Catalan language|Catalan]] is the only official language of [[Andorra]]; however, French is commonly used due to the proximity to France.@@@@1@20@@danf@17-8-2009
10290280@unknown@formal@none@1@S@French nationals make up 7% of the population.@@@@1@8@@danf@17-8-2009
10290290@unknown@formal@none@1@S@====Italy====@@@@1@1@@danf@17-8-2009
10290300@unknown@formal@none@1@S@French is also an official language, along with [[Italian language|Italian]], in the province of [[Aosta Valley]], [[Italy]].@@@@1@17@@danf@17-8-2009
10290310@unknown@formal@none@1@S@In addition, a number of [[Franco-Provençal language|Franco-Provençal]] dialects are spoken in the province, although they do not have official recognition.@@@@1@20@@danf@17-8-2009
10290320@unknown@formal@none@1@S@====Luxembourg====@@@@1@1@@danf@17-8-2009
10290330@unknown@formal@none@1@S@French is one of three official languages of [[Luxembourg|the Grand Duchy of Luxembourg]];@@@@1@13@@danf@17-8-2009
10290340@unknown@formal@none@1@S@the other official languages of Luxembourg are:@@@@1@7@@danf@17-8-2009
10290350@unknown@formal@none@1@S@*[[German language|German]]@@@@1@2@@danf@17-8-2009
10290360@unknown@formal@none@1@S@*[[Lëtzebuergesch|Luxembourgish]].@@@@1@1@@danf@17-8-2009
10290370@unknown@formal@none@1@S@Luxembourgish is the natively spoken language of Luxembourg.@@@@1@8@@danf@17-8-2009
10290380@unknown@formal@none@1@S@Luxembourg's education system is trilingual: the first years of primary school are in Luxembourgish, before changing to German, while in secondary school the language of instruction changes to French.@@@@1@29@@danf@17-8-2009
10290390@unknown@formal@none@1@S@====The Channel Islands====@@@@1@3@@danf@17-8-2009
10290400@unknown@formal@none@1@S@Although [[Jersey]] and [[Guernsey]], the two bailiwicks collectively referred to as the [[Channel Islands]], are separate entities, both use French to some degree, mostly in an administrative capacity.@@@@1@28@@danf@17-8-2009
10290410@unknown@formal@none@1@S@[[Jersey Legal French]] is the standardized variety used in Jersey.@@@@1@10@@danf@17-8-2009
10290420@unknown@formal@none@1@S@===The Americas===@@@@1@2@@danf@17-8-2009
10290430@unknown@formal@none@1@S@====Legal status in Canada====@@@@1@4@@danf@17-8-2009
10290440@unknown@formal@none@1@S@About 7 million [[Canadian]]s are native French-speakers, of whom 6 million live in [[Quebec]], and French is one of [[Canada]]'s two official languages (the other being [[English language|English]]).@@@@1@28@@danf@17-8-2009
10290450@unknown@formal@none@1@S@Various provisions of the [[Canadian Charter of Rights and Freedoms]] deal with Canadians' right to access services in both languages, including the right to a publicly funded education in the minority language of each province, where numbers warrant in a given locality.@@@@1@42@@danf@17-8-2009
10290460@unknown@formal@none@1@S@By [[law]], the federal government must operate and provide services in both English and French, proceedings of the [[Parliament of Canada]] must be translated into both these languages, and most products sold in Canada must have labeling in both languages.@@@@1@40@@danf@17-8-2009
10290470@unknown@formal@none@1@S@Overall, about 13% of Canadians have knowledge of French only, while 18% have knowledge of both English and French.@@@@1@19@@danf@17-8-2009
10290480@unknown@formal@none@1@S@In contrast, over 82% of the population of Quebec speaks French natively, and almost 96% speak it as either their first or second language.@@@@1@24@@danf@17-8-2009
10290490@unknown@formal@none@1@S@It has been the sole official language of Quebec since 1974.@@@@1@11@@danf@17-8-2009
10290500@unknown@formal@none@1@S@The legal status of French was further strengthened with the 1977 adoption of the [[Charter of the French Language]] (popularly known as ''Bill 101''), which guarantees that every person has a right to have the civil administration, the health and social services, corporations, and enterprises in Quebec communicate with him in French.@@@@1@52@@danf@17-8-2009
10290510@unknown@formal@none@1@S@While the Charter mandates that certain provincial government services, such as those relating to health and education, be offered to the English minority in its language, where numbers warrant, its primary purpose is to cement the role of French as the primary language used in the public sphere.@@@@1@48@@danf@17-8-2009
10290520@unknown@formal@none@1@S@[[Image:Knowledge French EU map.png|right|thumb|240px|Knowledge of French in the European Union and candidate countries]] The provision of the Charter that has arguably had the most significant impact mandates French-language [[education]] unless a child's parents or siblings have received the majority of their own primary education in English within Canada, with minor exceptions.@@@@1@51@@danf@17-8-2009
10290530@unknown@formal@none@1@S@This measure has reversed a historical trend whereby a large number of immigrant children would attend English schools.@@@@1@18@@danf@17-8-2009
10290540@unknown@formal@none@1@S@In so doing, the Charter has greatly contributed to the "visage français" (French face) of Montreal in spite of its growing immigrant population.@@@@1@23@@danf@17-8-2009
10290550@unknown@formal@none@1@S@Other provisions of the Charter have been ruled unconstitutional over the years, including those mandating French-only commercial signs, court proceedings, and debates in the legislature.@@@@1@25@@danf@17-8-2009
10290560@unknown@formal@none@1@S@Though none of these provisions are still in effect today, some continued to be on the books for a time even after courts had ruled them unconstitutional as a result of the government's decision to invoke the so-called [[Section Thirty-three of the Canadian Charter of Rights and Freedoms|notwithstanding clause]] of the Canadian constitution to override constitutional requirements.@@@@1@57@@danf@17-8-2009
10290570@unknown@formal@none@1@S@In 1993, the Charter was rewritten to allow signage in other languages so long as French was markedly "predominant."@@@@1@19@@danf@17-8-2009
10290580@unknown@formal@none@1@S@Another section of the Charter guarantees every person the right to work in French, meaning the right to have all communications with one's superiors and coworkers in French, as well as the right not to be required to know another language as a condition of hiring, unless this is warranted by the nature of one's duties, such as by reason of extensive interaction with people located outside the province or similar reasons.@@@@1@72@@danf@17-8-2009
10290590@unknown@formal@none@1@S@This section has not been as effective as had originally been hoped, and has faded somewhat from public consciousness.@@@@1@19@@danf@17-8-2009
10290600@unknown@formal@none@1@S@As of 2006, approximately 65% of the workforce on the island of Montreal predominantly used French in the workplace.@@@@1@19@@danf@17-8-2009
10290610@unknown@formal@none@1@S@The only other province that recognizes French as an official language is [[New Brunswick]], which is officially bilingual, like the nation as a whole.@@@@1@24@@danf@17-8-2009
10290620@unknown@formal@none@1@S@Outside of [[Quebec]], the highest number of Francophones in Canada, 485,000, excluding those who claim multiple mother tongues, reside in [[Ontario]], whereas [[New Brunswick]], home to the vast majority of [[Acadians]], has the highest ''percentage'' of Francophones after [[Quebec]], 33%, or 237,000.@@@@1@42@@danf@17-8-2009
10290630@unknown@formal@none@1@S@In [[Ontario]], [[Nova Scotia]], [[Prince Edward Island]], and [[Manitoba]], French does not have full official status, although the provincial governments do provide some French-language services in all communities where significant numbers of Francophones live.@@@@1@34@@danf@17-8-2009
10290640@unknown@formal@none@1@S@Canada's three northern territories ([[Yukon]], [[Northwest Territories]], and [[Nunavut]]) all recognize French as an official language as well.@@@@1@18@@danf@17-8-2009
10290650@unknown@formal@none@1@S@All provinces make some effort to accommodate the needs of their Francophone [[citizen]]s, although the level and quality of French-language service vary significantly from province to province.@@@@1@27@@danf@17-8-2009
10290660@unknown@formal@none@1@S@The Ontario [[French Language Services Act]], adopted in 1986, guarantees French language services in that province in regions where the Francophone population exceeds 10% of the total population, as well as communities with Francophone populations exceeding 5,000, and certain other designated areas; this has the most effect in the north and east of the province, as well as in other larger centres such as [[Ottawa]], [[Toronto]], [[Hamilton, Ontario|Hamilton]], [[Mississauga, Ontario|Mississauga]], [[London, Ontario|London]], [[Kitchener, Ontario|Kitchener]], [[St. Catharines, Ontario|St. Catharines]], [[Greater Sudbury]] and [[Windsor, Ontario|Windsor]].@@@@1@83@@danf@17-8-2009
10290670@unknown@formal@none@1@S@However, the French Language Services Act does not confer the status of "official bilingualism" on these cities, as that designation carries with it implications which go beyond the provision of services in both languages.@@@@1@34@@danf@17-8-2009
10290680@unknown@formal@none@1@S@The City of Ottawa's language policy (by-law 2001-170) allows employees to work in their official language of choice and be supervised in the language of choice.@@@@1@26@@danf@17-8-2009
10290690@unknown@formal@none@1@S@Canada has the status of member state in the Francophonie, while the provinces of Quebec and New Brunswick are recognized as participating governments.@@@@1@23@@danf@17-8-2009
10290700@unknown@formal@none@1@S@Ontario is currently seeking to become a full member on its own.@@@@1@12@@danf@17-8-2009
10290710@unknown@formal@none@1@S@====Haiti====@@@@1@1@@danf@17-8-2009
10290720@unknown@formal@none@1@S@French is an official language of [[Haiti]], although it is mostly spoken by the [[upper class]], while [[Haitian Creole]] (a [[French-based creole language]]) is more widely spoken as a [[mother tongue]].@@@@1@31@@danf@17-8-2009
10290730@unknown@formal@none@1@S@====French overseas territories====@@@@1@3@@danf@17-8-2009
10290740@unknown@formal@none@1@S@French is also the official language in France's overseas territories of [[French Guiana]], [[Guadeloupe]], [[Martinique]], [[Saint Barthélemy]], [[Saint Martin (France)|St. Martin]] and [[Saint-Pierre and Miquelon]].@@@@1@25@@danf@17-8-2009
10290750@unknown@formal@none@1@S@====The United States====@@@@1@3@@danf@17-8-2009
10290760@unknown@formal@none@1@S@Although it has no official recognition on a federal level, French is the third most-spoken language in the United States, after [[English language|English]] and [[Spanish language|Spanish]], and the second most-spoken in the states of [[Louisiana]], [[Maine]], [[Vermont]] and [[New Hampshire]].@@@@1@40@@danf@17-8-2009
10290770@unknown@formal@none@1@S@Louisiana is home to two distinct dialects, [[Cajun French]] and [[Louisiana Creole French|Creole French]].@@@@1@14@@danf@17-8-2009
10290780@unknown@formal@none@1@S@===Africa===@@@@1@1@@danf@17-8-2009
10290790@unknown@formal@none@1@S@A majority of the world's French-speaking population lives in Africa.@@@@1@10@@danf@17-8-2009
10290800@unknown@formal@none@1@S@According to the 2007 report by the Organisation internationale de la Francophonie, an estimated 115 million African people spread across 31 francophone African countries can speak French either as a [[first language|first]] or [[second language]].@@@@1@35@@danf@17-8-2009
10290810@unknown@formal@none@1@S@French is mostly a second language in Africa, but in some areas it has become a first language, such as in the region of [[Abidjan]], [[Côte d'Ivoire]] and in [[Libreville]], [[Gabon]].@@@@1@31@@danf@17-8-2009
10290820@unknown@formal@none@1@S@It is impossible to speak of a single form of [[African French]], but rather of diverse forms of African French which have developed due to the contact with many indigenous [[African languages]].@@@@1@32@@danf@17-8-2009
10290830@unknown@formal@none@1@S@In the territories of the [[Indian Ocean]], the French language is often spoken alongside French-derived creole languages, the major exception being [[Madagascar]].@@@@1@22@@danf@17-8-2009
10290840@unknown@formal@none@1@S@There, a Malayo-Polynesian language ([[Malagasy]]) is spoken alongside French.@@@@1@9@@danf@17-8-2009
10290850@unknown@formal@none@1@S@The French language has also met competition from English, since English has been the official language in [[Mauritius]] and the [[Seychelles]] for a long time and has recently become an official language of Madagascar.@@@@1@34@@danf@17-8-2009
10290860@unknown@formal@none@1@S@[[Sub-Saharan Africa]] is the region where the French language is most likely to expand due to the expansion of education, and it is also where the language has evolved most in recent years.@@@@1@33@@danf@17-8-2009
10290870@unknown@formal@none@1@S@Some vernacular forms of French in Africa can be difficult to understand for French speakers from other countries but written forms of the language are very closely related to those of the rest of the French-speaking world.@@@@1@37@@danf@17-8-2009
10290880@unknown@formal@none@1@S@French is an official language of many African countries, most of them former French or [[Belgian colonial empire|Belgian colonies]]:@@@@1@19@@danf@17-8-2009
10290890@unknown@formal@none@1@S@:*[[Benin]]@@@@1@1@@danf@17-8-2009
10290900@unknown@formal@none@1@S@:*[[Burkina Faso]]@@@@1@2@@danf@17-8-2009
10290910@unknown@formal@none@1@S@:*[[Burundi]]@@@@1@1@@danf@17-8-2009
10290920@unknown@formal@none@1@S@:*[[Cameroon]]@@@@1@1@@danf@17-8-2009
10290930@unknown@formal@none@1@S@:*[[Central African Republic]]@@@@1@3@@danf@17-8-2009
10290940@unknown@formal@none@1@S@:*[[Chad]]@@@@1@1@@danf@17-8-2009
10290950@unknown@formal@none@1@S@:*[[Comoros]]@@@@1@1@@danf@17-8-2009
10290960@unknown@formal@none@1@S@:*[[Congo (Brazzaville)]]@@@@1@2@@danf@17-8-2009
10290970@unknown@formal@none@1@S@:*[[Côte d'Ivoire]]@@@@1@2@@danf@17-8-2009
10290980@unknown@formal@none@1@S@:*[[Democratic Republic of the Congo]]@@@@1@5@@danf@17-8-2009
10290990@unknown@formal@none@1@S@:*[[Djibouti]]@@@@1@1@@danf@17-8-2009
10291000@unknown@formal@none@1@S@:*[[Equatorial Guinea]] (former colony of [[Spain]])@@@@1@6@@danf@17-8-2009
10291010@unknown@formal@none@1@S@:*[[Gabon]]@@@@1@1@@danf@17-8-2009
10291020@unknown@formal@none@1@S@:*[[Guinea]]@@@@1@1@@danf@17-8-2009
10291030@unknown@formal@none@1@S@:*[[Madagascar]]@@@@1@1@@danf@17-8-2009
10291040@unknown@formal@none@1@S@:*[[Mali]]@@@@1@1@@danf@17-8-2009
10291050@unknown@formal@none@1@S@:*[[Niger]]@@@@1@1@@danf@17-8-2009
10291060@unknown@formal@none@1@S@:*[[Rwanda]]@@@@1@1@@danf@17-8-2009
10291070@unknown@formal@none@1@S@:*[[Senegal]]@@@@1@1@@danf@17-8-2009
10291080@unknown@formal@none@1@S@:*[[Seychelles]]@@@@1@1@@danf@17-8-2009
10291090@unknown@formal@none@1@S@:*[[Togo]]@@@@1@1@@danf@17-8-2009
10291100@unknown@formal@none@1@S@In addition, French is an administrative language and is commonly used, though not on an official basis, in [[Mauritius]] and in the [[Maghreb]] states:@@@@1@24@@danf@17-8-2009
10291110@unknown@formal@none@1@S@:*[[Mauritania]]@@@@1@1@@danf@17-8-2009
10291120@unknown@formal@none@1@S@:*[[Algeria]]@@@@1@1@@danf@17-8-2009
10291130@unknown@formal@none@1@S@:*[[Morocco]]@@@@1@1@@danf@17-8-2009
10291140@unknown@formal@none@1@S@:*[[Tunisia]].@@@@1@1@@danf@17-8-2009
10291150@unknown@formal@none@1@S@Various reforms have been implemented in recent decades in Algeria to improve the status of [[Arabic language|Arabic]] relative to French, especially in education.@@@@1@23@@danf@17-8-2009
10291160@unknown@formal@none@1@S@While the predominant European language in [[Egypt]] is [[English language|English]], French is considered to be a more sophisticated language by some elements of the Egyptian upper and upper-middle classes; for this reason, a typical educated Egyptian will learn French in addition to English at some point in his or her education.@@@@1@51@@danf@17-8-2009
10291170@unknown@formal@none@1@S@The perception of sophistication may be related to the use of French as the [[Noble court|royal court]] language of Egypt during the nineteenth century.@@@@1@24@@danf@17-8-2009
10291180@unknown@formal@none@1@S@Egypt participates in [[La Francophonie]].@@@@1@5@@danf@17-8-2009
10291190@unknown@formal@none@1@S@French is also the official language of [[Mayotte]] and [[Réunion]], two [[Overseas departments and territories of France|overseas territories]] of France located in the [[Indian Ocean]], as well as an administrative and educational language in [[Mauritius]], along with [[English language|English]].@@@@1@39@@danf@17-8-2009
10291200@unknown@formal@none@1@S@===Asia===@@@@1@1@@danf@17-8-2009
10291210@unknown@formal@none@1@S@====Lebanon====@@@@1@1@@danf@17-8-2009
10291220@unknown@formal@none@1@S@French was the official language in [[Lebanon]] along with [[Arabic language|Arabic]] until 1941, when the country declared independence from [[France]].@@@@1@20@@danf@17-8-2009
10291230@unknown@formal@none@1@S@French is still seen as an official language by the [[Lebanese people]] as it is widely used by the Lebanese, especially for administrative purposes, and is taught in schools as a primary language along with [[Arabic]].@@@@1@36@@danf@17-8-2009
10291240@unknown@formal@none@1@S@====Southeast Asia====@@@@1@2@@danf@17-8-2009
10291250@unknown@formal@none@1@S@French is an administrative language in [[Laos]] and [[Cambodia]].@@@@1@9@@danf@17-8-2009
10291260@unknown@formal@none@1@S@French was historically spoken by the elite in the leased territory [[Guangzhouwan]] in southern [[China]].@@@@1@15@@danf@17-8-2009
10291270@unknown@formal@none@1@S@In colonial [[Vietnam]], the elites spoke French and many who worked for the French spoke a French creole known as "[[Tây Bồi]]" (now extinct).@@@@1@24@@danf@17-8-2009
10291280@unknown@formal@none@1@S@====India====@@@@1@1@@danf@17-8-2009
10291290@unknown@formal@none@1@S@French has official status in the Indian [[Union Territory]] of [[Puducherry|Pondicherry]], along with the regional language [[Tamil language|Tamil]], and some students in Tamil Nadu may opt for French as their third or fourth language (usually behind [[English language|English]], Tamil and [[Hindi]]).@@@@1@41@@danf@17-8-2009
10291300@unknown@formal@none@1@S@French is also commonly taught as a third language in secondary school in most cities of [[Maharashtra]] State including [[Mumbai]] as part of the Secondary (X-SSC) and Higher secondary School (XII-HSC) certificate examinations.@@@@1@33@@danf@17-8-2009
10291310@unknown@formal@none@1@S@===Oceania===@@@@1@1@@danf@17-8-2009
10291320@unknown@formal@none@1@S@French is also a second official language of the [[Pacific Island]] nation of [[Vanuatu]], along with France's territories of [[French Polynesia]], [[Wallis & Futuna]] and [[New Caledonia]].@@@@1@27@@danf@17-8-2009
10291330@unknown@formal@none@1@S@==Dialects==@@@@1@1@@danf@17-8-2009
10291340@unknown@formal@none@1@S@*[[Acadian French]]@@@@1@2@@danf@17-8-2009
10291350@unknown@formal@none@1@S@*[[African French]]@@@@1@2@@danf@17-8-2009
10291360@unknown@formal@none@1@S@*[[Aostan French]]@@@@1@2@@danf@17-8-2009
10291370@unknown@formal@none@1@S@*[[Belgian French]]@@@@1@2@@danf@17-8-2009
10291380@unknown@formal@none@1@S@*[[Cajun French]]@@@@1@2@@danf@17-8-2009
10291390@unknown@formal@none@1@S@*[[Canadian French]]@@@@1@2@@danf@17-8-2009
10291400@unknown@formal@none@1@S@*[[Cambodian French]]@@@@1@2@@danf@17-8-2009
10291410@unknown@formal@none@1@S@*Guyana French (see [[French Guiana]])@@@@1@5@@danf@17-8-2009
10291420@unknown@formal@none@1@S@*[[Indian French]]@@@@1@2@@danf@17-8-2009
10291430@unknown@formal@none@1@S@*[[Jersey Legal French]]@@@@1@3@@danf@17-8-2009
10291440@unknown@formal@none@1@S@*[[Lao French]]@@@@1@2@@danf@17-8-2009
10291450@unknown@formal@none@1@S@*[[Levantine French]] (most commonly referred to as Lebanese French, very similar to [[Maghreb French]])@@@@1@14@@danf@17-8-2009
10291460@unknown@formal@none@1@S@*[[Louisiana Creole French]]@@@@1@3@@danf@17-8-2009
10291470@unknown@formal@none@1@S@*[[Maghreb French]] (see also North African French)@@@@1@7@@danf@17-8-2009
10291480@unknown@formal@none@1@S@*[[Meridional French]]@@@@1@2@@danf@17-8-2009
10291490@unknown@formal@none@1@S@*[[Metropolitan France|Metropolitan French]]@@@@1@3@@danf@17-8-2009
10291500@unknown@formal@none@1@S@*[[Caldoche|New Caledonian French]]@@@@1@3@@danf@17-8-2009
10291510@unknown@formal@none@1@S@*[[Newfoundland French]]@@@@1@2@@danf@17-8-2009
10291520@unknown@formal@none@1@S@*Oceanic French@@@@1@2@@danf@17-8-2009
10291530@unknown@formal@none@1@S@*[[Quebec French]]@@@@1@2@@danf@17-8-2009
10291540@unknown@formal@none@1@S@*[[South East Asian French]]@@@@1@4@@danf@17-8-2009
10291550@unknown@formal@none@1@S@*[[Swiss French]]@@@@1@2@@danf@17-8-2009
10291560@unknown@formal@none@1@S@*[[Vietnamese French (dialect)|Vietnamese French]]@@@@1@4@@danf@17-8-2009
10291570@unknown@formal@none@1@S@*West Indian French@@@@1@3@@danf@17-8-2009
10291580@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10291590@unknown@formal@none@1@S@==Sounds==@@@@1@1@@danf@17-8-2009
10291600@unknown@formal@none@1@S@{{IPA notice}}@@@@1@2@@danf@17-8-2009
10291610@unknown@formal@none@1@S@Although there are many French regional accents, only one version of the language, which has no commonly used special name but has been termed ''[[français neutre]]'' (neutral French), is normally chosen as a model for foreign learners.@@@@1@37@@danf@17-8-2009
10291620@unknown@formal@none@1@S@* Voiced stops (i.e. {{IPA|/b d g/}}) are typically produced fully voiced throughout.@@@@1@13@@danf@17-8-2009
10291630@unknown@formal@none@1@S@* Voiceless stops (i.e. {{IPA|/p t k/}}) are unaspirated.@@@@1@9@@danf@17-8-2009
10291640@unknown@formal@none@1@S@* Nasals: The velar nasal {{IPA|/ŋ/}} occurs only in final position in borrowed (usually English) words: parking, camping, swing.@@@@1@19@@danf@17-8-2009
10291650@unknown@formal@none@1@S@The palatal nasal {{IPA|/ɲ/}} can occur in word-initial position (e.g. gnon), but it is most frequently found in intervocalic, onset position or word-finally (e.g. montagne).@@@@1@26@@danf@17-8-2009
10291660@unknown@formal@none@1@S@* Fricatives: French has three pairs of homorganic fricatives distinguished by voicing, i.e. labiodental {{IPA|/f/–/v/}}, dental {{IPA|/s/–/z/}}, and palato-alveolar {{IPA|/ʃ/–/ʒ/}}.@@@@1@20@@danf@17-8-2009
10291670@unknown@formal@none@1@S@Notice that {{IPA|/s/–/z/}} are dental, like the plosives {{IPA|/t/–/d/}}, and the nasal {{IPA|/n/}}.@@@@1@13@@danf@17-8-2009
10291680@unknown@formal@none@1@S@* French has one rhotic whose pronunciation varies considerably among speakers and phonetic contexts.@@@@1@14@@danf@17-8-2009
10291690@unknown@formal@none@1@S@In general it is described as a voiced uvular fricative as in {{IPA|[ʁu]}} ''roue'' "wheel".@@@@1@15@@danf@17-8-2009
10291700@unknown@formal@none@1@S@Vowels are often lengthened before this segment.@@@@1@7@@danf@17-8-2009
10291710@unknown@formal@none@1@S@It can be reduced to an approximant, particularly in final position (e.g. "fort") or reduced to zero in some word-final positions.@@@@1@21@@danf@17-8-2009
10291720@unknown@formal@none@1@S@For other speakers, a uvular trill is also fairly common, and an apical trill {{IPA|[r]}} occurs in some dialects.@@@@1@19@@danf@17-8-2009
10291730@unknown@formal@none@1@S@* Lateral and central approximants: The lateral approximant {{IPA|/l/}} is unvelarised in both onset (''lire'') and coda position (''il'').@@@@1@19@@danf@17-8-2009
10291740@unknown@formal@none@1@S@In the onset, the central approximants {{IPA|[w]}}, {{IPA|[ɥ]}}, and {{IPA|[j]}} each correspond to a high vowel, {{IPA|/u/}}, {{IPA|/y/}}, and {{IPA|/i/}} respectively.@@@@1@21@@danf@17-8-2009
10291750@unknown@formal@none@1@S@There are a few minimal pairs where the approximant and corresponding vowel contrast, but there are also many cases where they are in free variation.@@@@1@25@@danf@17-8-2009
10291760@unknown@formal@none@1@S@Contrasts between {{IPA|/j/}} and {{IPA|/i/}} occur in final position as in {{IPA|/pɛj/}} ''paye'' "pay" vs. {{IPA|/pɛi/}} ''pays'' "country".@@@@1@18@@danf@17-8-2009
10291770@unknown@formal@none@1@S@French pronunciation follows strict rules based on spelling, but French spelling is often based more on history than phonology.@@@@1@19@@danf@17-8-2009
10291780@unknown@formal@none@1@S@The rules for pronunciation vary between dialects, but the standard rules are:@@@@1@12@@danf@17-8-2009
10291790@unknown@formal@none@1@S@* final consonants: Final single consonants, in particular ''s'', ''x'', ''z'', ''t'', ''d'', ''n'' and ''m'', are normally silent.@@@@1@19@@danf@17-8-2009
10291800@unknown@formal@none@1@S@(The final letters ''c'', ''r'', ''f'' and ''l'', however, are normally pronounced.)@@@@1@12@@danf@17-8-2009
10291810@unknown@formal@none@1@S@**When the following word begins with a vowel, though, a silent consonant ''may'' once again be pronounced, to provide a ''[[liaison (linguistics)|liaison]]'' or "link" between the two words.@@@@1@28@@danf@17-8-2009
10291820@unknown@formal@none@1@S@Some liaisons are ''mandatory'', for example the ''s'' in ''les amants'' or ''vous avez''; some are ''optional'', depending on [[dialect]] and [[register (linguistics)|register]], for example the first ''s'' in ''deux cents euros'' or ''euros irlandais''; and some are ''forbidden'', for example the ''s'' in ''beaucoup d'hommes aiment''.@@@@1@47@@danf@17-8-2009
10291830@unknown@formal@none@1@S@The ''t'' of ''et'' is never pronounced and the silent final consonant of a noun is only pronounced in the plural and in [[set phrase]]s like ''pied-à-terre''.@@@@1@27@@danf@17-8-2009
10291840@unknown@formal@none@1@S@Note that in the case of a word ending in ''d'' as in ''pied-à-terre'', the consonant ''t'' is pronounced instead.@@@@1@20@@danf@17-8-2009
10291850@unknown@formal@none@1@S@** Doubling a final ''n'' and adding a silent ''e'' at the end of a word (e.g. ''chien'' → ''chienne'') makes it clearly pronounced.@@@@1@24@@danf@17-8-2009
10291860@unknown@formal@none@1@S@Doubling a final ''l'' and adding a silent ''e'' (e.g. ''gentil'' → ''gentille'') adds a [j] sound.@@@@1@17@@danf@17-8-2009
10291870@unknown@formal@none@1@S@* [[elision (French)|elision]] or vowel dropping: Some monosyllabic function words ending in ''a'' or ''e'', such as ''je'' and ''que'', drop their final vowel when placed before a word that begins with a vowel sound (thus avoiding a [[hiatus (linguistics)|hiatus]]).@@@@1@40@@danf@17-8-2009
10291880@unknown@formal@none@1@S@The missing vowel is replaced by an apostrophe (e.g. ''je ai'' is instead pronounced and spelt ''j'ai'').@@@@1@17@@danf@17-8-2009
10291890@unknown@formal@none@1@S@This gives for example the same pronunciation for ''l'homme qu'il a vu'' ("the man whom he saw") and ''l'homme qui l'a vu'' ("the man who saw him").@@@@1@27@@danf@17-8-2009
10291900@unknown@formal@none@1@S@==Orthography==@@@@1@1@@danf@17-8-2009
10291910@unknown@formal@none@1@S@* [[Nasal vowel|Nasal]]: ''[[n]]'' and ''[[m]]''.@@@@1@6@@danf@17-8-2009
10291920@unknown@formal@none@1@S@When ''n'' or ''m'' follows a vowel or diphthong, the ''n'' or ''m'' becomes silent and causes the preceding vowel to become nasalized (i.e. pronounced with the soft palate extended downward so as to allow part of the air to leave through the nostrils).@@@@1@44@@danf@17-8-2009
10291930@unknown@formal@none@1@S@Exceptions are when the ''n'' or ''m'' is doubled, or immediately followed by a vowel.@@@@1@15@@danf@17-8-2009
10291940@unknown@formal@none@1@S@The prefixes ''en-'' and ''em-'' are always nasalized.@@@@1@8@@danf@17-8-2009
10291950@unknown@formal@none@1@S@The rules get more complex than this and may vary between dialects.@@@@1@12@@danf@17-8-2009
10291960@unknown@formal@none@1@S@* [[digraph (orthography)|Digraphs]]: French does not introduce extra letters or [[diacritic]]s to specify its large range of vowel sounds and [[diphthongs]]; rather, it uses specific combinations of vowels, sometimes with following consonants, to show which sound is intended.@@@@1@38@@danf@17-8-2009
10291970@unknown@formal@none@1@S@* [[Consonant length|Gemination]]: Within words, double consonants are generally not pronounced as geminates in modern French (but geminates can be heard in the cinema or TV news from as recently as the 1970s, and in very refined elocution they may still occur).@@@@1@42@@danf@17-8-2009
10291980@unknown@formal@none@1@S@For example, ''illusion'' is pronounced {{IPA|[ilyzjɔ̃]}} and not {{IPA|[illyzjɔ̃]}}.@@@@1@9@@danf@17-8-2009
10291990@unknown@formal@none@1@S@But gemination does occur between words.@@@@1@6@@danf@17-8-2009
10292000@unknown@formal@none@1@S@For example, ''une info'' ("a piece of news") is pronounced {{IPA|[ynɛ̃fo]}}, whereas ''une nympho'' ("a nympho") is pronounced {{IPA|[ynnɛ̃fo]}}.@@@@1@19@@danf@17-8-2009
10292010@unknown@formal@none@1@S@* [[Diacritic|Accents]] are used sometimes for pronunciation, sometimes to distinguish similar words, and sometimes for etymology alone.@@@@1@17@@danf@17-8-2009
10292020@unknown@formal@none@1@S@**Accents that affect pronunciation@@@@1@4@@danf@17-8-2009
10292030@unknown@formal@none@1@S@***The [[acute accent]] (''l'accent aigu''), ''é'' (e.g. ''école''—school), means that the vowel is pronounced {{IPA|/e/}} instead of the default {{IPA|/ə/}}.@@@@1@20@@danf@17-8-2009
10292040@unknown@formal@none@1@S@***The [[grave accent]] (''l'accent grave''), ''è'' (e.g. ''élève''—pupil) means that the vowel is pronounced {{IPA|/ɛ/}} instead of the default {{IPA|/ə/}}.@@@@1@20@@danf@17-8-2009
10292050@unknown@formal@none@1@S@***The [[circumflex]] (''l'accent circonflexe'') ''ê'' (e.g. ''forêt''—forest) shows that an ''e'' is pronounced {{IPA|/ɛ/}} and that an ''o'' is pronounced {{IPA|/o/}}.@@@@1@21@@danf@17-8-2009
10292060@unknown@formal@none@1@S@In standard French it also signifies a pronunciation of {{IPA|/ɑ/}} for the letter ''a'', but this differentiation is disappearing.@@@@1@19@@danf@17-8-2009
10292070@unknown@formal@none@1@S@In the 18th century, the circumflex was used in place of ''s'' where that letter was not to be pronounced.@@@@1@20@@danf@17-8-2009
10292080@unknown@formal@none@1@S@Thus, ''forest'' became ''forêt'' and ''hospital'' became ''hôpital''.@@@@1@8@@danf@17-8-2009
10292090@unknown@formal@none@1@S@***The [[Umlaut (diacritic)|diaeresis]] (''le tréma'') (e.g. ''naïf''—naive, ''Noël''—Christmas), as in English, specifies that this vowel is pronounced separately from the preceding one rather than combined with it, and is not a [[schwa]].@@@@1@32@@danf@17-8-2009
10292100@unknown@formal@none@1@S@***The [[cedilla]] (''la cédille'') ''ç'' (e.g. ''garçon''—boy) means that the letter ''c'' is pronounced {{IPA|/s/}} in front of the hard vowels ''a'', ''o'' and ''u'' (''c'' is otherwise {{IPA|/k/}} before a hard vowel).@@@@1@33@@danf@17-8-2009
10292110@unknown@formal@none@1@S@''C'' is always pronounced {{IPA|/s/}} in front of the soft vowels ''e'', ''i'', and ''y'', thus ''ç'' is never found in front of soft vowels.@@@@1@25@@danf@17-8-2009
10292120@unknown@formal@none@1@S@**Accents with no pronunciation effect@@@@1@5@@danf@17-8-2009
10292130@unknown@formal@none@1@S@***The circumflex does not affect the pronunciation of the letters ''i'' or ''u'', and in most dialects, ''a'' as well.@@@@1@20@@danf@17-8-2009
10292140@unknown@formal@none@1@S@It usually indicates that an ''s'' came after it long ago, as in ''hôtel''.@@@@1@14@@danf@17-8-2009
10292150@unknown@formal@none@1@S@***All other accents are used only to distinguish similar words, as in the case of distinguishing the adverbs ''là'' and ''où'' ("there", "where") from the article ''la'' and the conjunction ''ou'' ("the" fem. sing., "or") respectively.@@@@1@36@@danf@17-8-2009
10292160@unknown@formal@none@1@S@==Grammar==@@@@1@1@@danf@17-8-2009
10292170@unknown@formal@none@1@S@French grammar shares several notable features with most other Romance languages, including:@@@@1@12@@danf@17-8-2009
10292180@unknown@formal@none@1@S@* the loss of Latin's [[declension]]s@@@@1@6@@danf@17-8-2009
10292190@unknown@formal@none@1@S@* only two [[grammatical gender]]s@@@@1@5@@danf@17-8-2009
10292200@unknown@formal@none@1@S@* the development of grammatical [[article (grammar)|article]]s from Latin [[demonstrative]]s@@@@1@10@@danf@17-8-2009
10292210@unknown@formal@none@1@S@* new [[tense]]s formed from auxiliaries@@@@1@6@@danf@17-8-2009
10292220@unknown@formal@none@1@S@French word order is [[Subject Verb Object]], except when the object is a pronoun, in which case the word order is [[Subject Object Verb]].@@@@1@24@@danf@17-8-2009
10292230@unknown@formal@none@1@S@Some rare archaisms allow for different word orders.@@@@1@8@@danf@17-8-2009
10292240@unknown@formal@none@1@S@==Vocabulary==@@@@1@1@@danf@17-8-2009
10292250@unknown@formal@none@1@S@The majority of French words derive from [[Vulgar Latin]] or were constructed from Latin or Greek roots.@@@@1@17@@danf@17-8-2009
10292260@unknown@formal@none@1@S@There are often pairs of words, one form being "popular" (noun) and the other one "savant" (adjective), both originating from Latin.@@@@1@21@@danf@17-8-2009
10292270@unknown@formal@none@1@S@Examples:@@@@1@1@@danf@17-8-2009
10292280@unknown@formal@none@1@S@* brother: ''frère'' / ''fraternel'' from Latin ''frater''@@@@1@8@@danf@17-8-2009
10292290@unknown@formal@none@1@S@* finger: ''doigt'' / ''digital'' from Latin ''digitus''@@@@1@8@@danf@17-8-2009
10292300@unknown@formal@none@1@S@* faith: ''foi'' / ''fidèle'' from Latin ''fides''@@@@1@8@@danf@17-8-2009
10292310@unknown@formal@none@1@S@* cold: ''froid'' / ''frigide'' from Latin ''frigidus''@@@@1@8@@danf@17-8-2009
10292320@unknown@formal@none@1@S@* eye: ''œil'' / ''oculaire'' from Latin ''oculus''@@@@1@8@@danf@17-8-2009
10292330@unknown@formal@none@1@S@In some examples there is a common word from Vulgar Latin and a more savant word borrowed directly from [[Medieval Latin]] or even [[Ancient Greek]].@@@@1@25@@danf@17-8-2009
10292340@unknown@formal@none@1@S@* '''Cheval'''—Concours '''équestre'''—'''Hippo'''drome@@@@1@3@@danf@17-8-2009
10292350@unknown@formal@none@1@S@The French words which have developed from Latin are usually less recognisable than [[Italian language|Italian]] words of Latin origin because as French evolved from [[Vulgar Latin]], the unstressed final [[syllable]] of many words was dropped or elided into the following word.@@@@1@41@@danf@17-8-2009
10292360@unknown@formal@none@1@S@It is estimated that 12% (4,200) of common French words found in a typical [[dictionary]] such as the ''Petit Larousse'' or ''Micro-Robert Plus'' (35,000 words) are of foreign origin.@@@@1@29@@danf@17-8-2009
10292370@unknown@formal@none@1@S@About 25% (1,054) of these foreign words come from [[English language|English]] and are fairly recent borrowings.@@@@1@16@@danf@17-8-2009
10292380@unknown@formal@none@1@S@The others are some 707 words from [[Italian language|Italian]], 550 from ancient [[Germanic languages]], 481 from ancient [[Gallo-Romance languages]], 215 from [[Arabic language|Arabic]], 164 from [[German language|German]], 160 from [[Celtic languages]], 159 from [[Spanish language|Spanish]], 153 from [[Dutch language|Dutch]], 112 from [[Persian language|Persian]] and [[Sanskrit language|Sanskrit]], 101 from [[Native American languages]], 89 from other [[Asian languages]], 56 from other [[Afro-Asiatic languages]], 55 from [[Slavic languages]] and [[Baltic languages]], 10 from [[Basque language|Basque]] and 144 (about three percent) from other languages.@@@@1@80@@danf@17-8-2009
10292390@unknown@formal@none@1@S@===Numerals===@@@@1@1@@danf@17-8-2009
10292400@unknown@formal@none@1@S@The French counting system is partially [[vigesimal]]: [[20 (number)|twenty]] (''{{lang|fr|vingt}}'') is used as a base number in the names of numbers from 60 to 99.@@@@1@25@@danf@17-8-2009
10292410@unknown@formal@none@1@S@The French word for ''eighty'', for example, is ''{{lang|fr|quatre-vingts}}'', which literally means "four twenties", and ''{{lang|fr|soixante-quinze}}'' (literally "sixty-fifteen") means 75.@@@@1@20@@danf@17-8-2009
10292420@unknown@formal@none@1@S@This reform arose after the [[French Revolution]] to unify the different counting systems (mostly vigesimal near the coast, due to Celtic (via [[Basque language|Basque]]) and Viking influence).@@@@1@27@@danf@17-8-2009
10292430@unknown@formal@none@1@S@This system is comparable to the archaic English use of ''score'', as in "fourscore and seven" (87), or "threescore and ten" (70).@@@@1@22@@danf@17-8-2009
10292440@unknown@formal@none@1@S@[[Belgian French]] and [[Swiss French]] are different in this respect.@@@@1@10@@danf@17-8-2009
10292450@unknown@formal@none@1@S@In Belgium and Switzerland 70 and 90 are ''{{lang|fr|septante}}'' and ''{{lang|fr|nonante}}''.@@@@1@11@@danf@17-8-2009
10292460@unknown@formal@none@1@S@In Switzerland, depending on the local dialect, 80 can be ''{{lang|fr|quatre-vingts}}'' (Geneva, Neuchâtel, Jura) or ''{{lang|fr|huitante}}'' (Vaud, Valais, Fribourg).@@@@1@19@@danf@17-8-2009
10292470@unknown@formal@none@1@S@''Octante'' had been used in Switzerland in the past, but is now considered archaic.@@@@1@14@@danf@17-8-2009
10292480@unknown@formal@none@1@S@In Belgium, however, ''quatre-vingts'' is universally used.@@@@1@7@@danf@17-8-2009
10292490@unknown@formal@none@1@S@==Writing system==@@@@1@2@@danf@17-8-2009
10292500@unknown@formal@none@1@S@French is written using the 26 letters of the [[Latin alphabet]], plus five diacritics (the [[circumflex]] accent, [[acute accent]], [[grave accent]], [[Umlaut (diacritic)|diaeresis]], and [[cedilla]]) and the two [[Ligature (typography)|ligatures]] (œ) and (æ).@@@@1@33@@danf@17-8-2009
10292510@unknown@formal@none@1@S@French spelling, like English spelling, tends to preserve obsolete pronunciation rules.@@@@1@11@@danf@17-8-2009
10292520@unknown@formal@none@1@S@This is mainly due to extreme phonetic changes since the Old French period, without a corresponding change in spelling.@@@@1@19@@danf@17-8-2009
10292530@unknown@formal@none@1@S@Moreover, some conscious changes were made to restore Latin orthography:@@@@1@10@@danf@17-8-2009
10292540@unknown@formal@none@1@S@* Old French ''doit'' > French ''doigt'' "finger" (Latin ''digitus'')@@@@1@10@@danf@17-8-2009
10292550@unknown@formal@none@1@S@* Old French ''pie'' > French ''pied'' "foot" (Latin ''pes'' (stem: ''ped-''))@@@@1@12@@danf@17-8-2009
10292560@unknown@formal@none@1@S@As a result, it is difficult to predict the spelling on the basis of the sound alone.@@@@1@17@@danf@17-8-2009
10292570@unknown@formal@none@1@S@Final consonants are generally silent, except when the following word begins with a vowel.@@@@1@14@@danf@17-8-2009
10292580@unknown@formal@none@1@S@For example, all of these words end in a vowel sound: ''pied'', ''aller'', ''les'', ''finit'', ''beaux''.@@@@1@16@@danf@17-8-2009
10292590@unknown@formal@none@1@S@The same words followed by a vowel, however, may sound the consonants, as they do in these examples: ''beaux-arts'', ''les amis'', ''pied-à-terre''.@@@@1@22@@danf@17-8-2009
10292600@unknown@formal@none@1@S@On the other hand, a given spelling will almost always lead to a predictable sound, and the [[Académie française]] works hard to enforce and update this correspondence.@@@@1@27@@danf@17-8-2009
10292610@unknown@formal@none@1@S@In particular, a given vowel combination or diacritic predictably leads to one phoneme.@@@@1@13@@danf@17-8-2009
10292620@unknown@formal@none@1@S@The diacritics have '''phonetic''', '''semantic''', and '''etymological''' significance.@@@@1@8@@danf@17-8-2009
10292630@unknown@formal@none@1@S@* [[acute accent]] (''é''): Over an ''e'', indicates the sound of a short ''ai'' in English, with no [[diphthong]].@@@@1@19@@danf@17-8-2009
10292640@unknown@formal@none@1@S@An ''é'' in modern French is often used where a combination of ''e'' and a consonant, usually ''s,'' would have been used formerly: ''écouter'' < ''escouter''.@@@@1@26@@danf@17-8-2009
10292650@unknown@formal@none@1@S@This type of accent mark is called ''accent aigu'' in French.@@@@1@11@@danf@17-8-2009
10292660@unknown@formal@none@1@S@* [[grave accent]] (''à'', ''è'', ''ù''): Over ''a'' or ''u'', used only to distinguish homophones: ''à'' ("to") vs. ''a'' ("has"), ''ou'' ("or") vs. ''où'' ("where").@@@@1@25@@danf@17-8-2009
10292670@unknown@formal@none@1@S@Over an ''e'', indicates the sound {{IPA|/ɛ/}}.@@@@1@7@@danf@17-8-2009
10292680@unknown@formal@none@1@S@* [[circumflex]] (''â'', ''ê'', ''î'', ''ô'', ''û''): Over an ''a'', ''e'' or ''o'', indicates the sound {{IPA|/ɑ/}}, {{IPA|/ɛ/}} or {{IPA|/o/}}, respectively (the distinction ''a'' {{IPA|/a/}} vs. ''â'' {{IPA|/ɑ/}} tends to disappear in many dialects).@@@@1@34@@danf@17-8-2009
10292690@unknown@formal@none@1@S@Most often indicates the historical deletion of an adjacent letter (usually an ''s'' or a vowel): ''château'' < ''castel'', ''fête'' < ''feste'', ''sûr'' < ''seur'', ''dîner'' < ''disner''.@@@@1@28@@danf@17-8-2009
10292700@unknown@formal@none@1@S@It has also come to be used to distinguish homophones: ''du'' ("of the") vs. ''dû'' (past participle of ''devoir'' "to have to do something (pertaining to an act)"; note that ''dû'' is in fact written thus because of a dropped ''e'': ''deu'').@@@@1@42@@danf@17-8-2009
10292710@unknown@formal@none@1@S@(''See [[Use of the circumflex in French]]'')@@@@1@7@@danf@17-8-2009
10292720@unknown@formal@none@1@S@* [[Umlaut (diacritic)|diaeresis]] or ''tréma'' (''ë'', ''ï'', ''ü'', ''ÿ''): Indicates that a vowel is to be pronounced separately from the preceding one: ''naïve'', ''Noël''.@@@@1@24@@danf@17-8-2009
10292730@unknown@formal@none@1@S@A diaeresis on ''y'' only occurs in some proper names and in modern editions of old French texts.@@@@1@18@@danf@17-8-2009
10292740@unknown@formal@none@1@S@Some proper names in which ''ÿ'' appears include ''Aÿ'' (commune in ''canton de la Marne'' formerly ''Aÿ-Champagne''), ''Rue des Cloÿs'' (alley in the 18th arrondissement of Paris), ''Croÿ'' (family name and hotel on the Boulevard Raspail, Paris), ''[[Château du Feÿ]]'' (near Joigny), ''Ghÿs'' (name of Flemish origin spelt ''Ghijs'' where ''ij'' in handwriting looked like ''ÿ'' to French clerks), ''l'Haÿ-les-Roses'' (commune between Paris and Orly airport), Pierre Louÿs (author), Moÿ (place in ''commune de l'Aisne'' and family name), and ''Le Blanc de Nicolaÿ'' (an insurance company in eastern France).@@@@1@89@@danf@17-8-2009
10292750@unknown@formal@none@1@S@The diaeresis on ''u'' appears only in the biblical proper names ''Archélaüs'', ''Capharnaüm'', ''Emmaüs'', ''Ésaü'' and ''Saül''.@@@@1@17@@danf@17-8-2009
10292760@unknown@formal@none@1@S@Nevertheless, since the 1990 orthographic rectifications (which are not applied at all by most French people), the diaeresis in words containing ''guë'' (such as ''aiguë'' or ''ciguë'') may be moved onto the ''u'': ''aigüe'', ''cigüe''.@@@@1@35@@danf@17-8-2009
10292770@unknown@formal@none@1@S@Words coming from German retain the old Umlaut (''ä'', ''ö'' and ''ü'') if applicable but use French pronunciation, such as ''kärcher'' (trade mark of a pressure washer).@@@@1@27@@danf@17-8-2009
10292780@unknown@formal@none@1@S@* [[cedilla]] (''ç''): Indicates that an etymological ''c'' is pronounced {{IPA|/s/}} when it would otherwise be pronounced {{IPA|/k/}}.@@@@1@18@@danf@17-8-2009
10292790@unknown@formal@none@1@S@Thus ''je lance'' "I throw" (with ''c'' = {{IPA|[s]}} before ''e''), ''je lan'''ç'''ais'' "I was throwing" (''c'' would be pronounced {{IPA|[k]}} before ''a'' without the cedilla).@@@@1@26@@danf@17-8-2009
10292800@unknown@formal@none@1@S@The c cedilla (ç) softens the hard /k/ sound to /s/ before the vowels '''a''', '''o''' or '''u''', for example '''ça''' /sa/.@@@@1@22@@danf@17-8-2009
10292810@unknown@formal@none@1@S@C cedilla is never used before the vowels '''e''' or '''i''' since these two vowels always produce a soft /s/ sound ('''ce''', '''ci''').@@@@1@23@@danf@17-8-2009
10292820@unknown@formal@none@1@S@There are two [[ligatures]], which have various origins.@@@@1@8@@danf@17-8-2009
10292830@unknown@formal@none@1@S@* The ligature ''[[œ]]'' is a mandatory contraction of ''oe'' in certain words.@@@@1@13@@danf@17-8-2009
10292840@unknown@formal@none@1@S@Some of these are native French words, with the pronunciation {{IPA|/œ/}} or {{IPA|/ø/}}, e.g. ''sœur'' "sister" {{IPA|/sœʁ/}}, ''œuvre'' "work (of art)" {{IPA|/œvʁ/}}.@@@@1@22@@danf@17-8-2009
10292850@unknown@formal@none@1@S@Note that it usually appears in the combination ''œu''; ''œil'' is an exception.@@@@1@13@@danf@17-8-2009
10292860@unknown@formal@none@1@S@Many of these words were originally written with the [[Digraph (orthography)|digraph]] ''eu''; the ''o'' in the ligature represents a sometimes artificial attempt to imitate the Latin spelling: Latin ''bovem'' > Old French ''buef''/''beuf'' > Modern French ''bœuf''. ''Œ'' is also used in words of Greek origin, as the Latin rendering of the Greek diphthong ''οι'', e.g. ''cœlacanthe'' "coelacanth".@@@@1@58@@danf@17-8-2009
10292870@unknown@formal@none@1@S@These words used to be pronounced with the vowel {{IPA|/e/}}, but in recent years a spelling pronunciation with {{IPA|/ø/}} has taken hold, e.g. ''œsophage'' {{IPA|/ezɔfaʒ/}} or {{IPA|/øzɔfaʒ/}}.@@@@1@27@@danf@17-8-2009
10292880@unknown@formal@none@1@S@The pronunciation with {{IPA|/e/}} is often seen to be more correct.@@@@1@11@@danf@17-8-2009
10292890@unknown@formal@none@1@S@The ligature œ is not used in some occurrences of the letter combination ''oe'', for example, when ''o'' is part of a prefix (''coexister'').@@@@1@24@@danf@17-8-2009
10292900@unknown@formal@none@1@S@* The ligature ''[[æ]]'' is rare and appears in some words of Latin and Greek origin like ''ægosome'', ''ægyrine'', ''æschne'', ''cæcum'', ''nævus'' or ''uræus''.@@@@1@24@@danf@17-8-2009
10292910@unknown@formal@none@1@S@The vowel quality is identical to é {{IPA|/e/}}.@@@@1@8@@danf@17-8-2009
10292920@unknown@formal@none@1@S@French writing, as with any language, is affected by the spoken language.@@@@1@12@@danf@17-8-2009
10292930@unknown@formal@none@1@S@In Old French, the plural for ''animal'' was ''animals''.@@@@1@9@@danf@17-8-2009
10292940@unknown@formal@none@1@S@Common speakers pronounced a ''u'' before a word ending in ''l'' as the plural.@@@@1@14@@danf@17-8-2009
10292950@unknown@formal@none@1@S@This resulted in ''animauls''.@@@@1@4@@danf@17-8-2009
10292960@unknown@formal@none@1@S@As the French language evolved this vanished and the form ''animaux'' (''aux'' pronounced {{IPA|/o/}}) was admitted.@@@@1@16@@danf@17-8-2009
10292970@unknown@formal@none@1@S@The same is true for ''cheval'' pluralized as ''chevaux'' and many others.@@@@1@12@@danf@17-8-2009
10292980@unknown@formal@none@1@S@Also ''castel'' pl. ''castels'' became ''château'' pl. ''châteaux''.@@@@1@8@@danf@17-8-2009
10292990@unknown@formal@none@1@S@==Samples==@@@@1@1@@danf@17-8-2009
10300010@unknown@formal@none@1@S@German language@@@@1@2@@danf@17-8-2009
10300020@unknown@formal@none@1@S@The '''German language''' ({{lang|de|''Deutsch''}}) is a [[West Germanic languages|West Germanic language]] and one of the world's [[world language|major languages]].@@@@1@19@@danf@17-8-2009
10300030@unknown@formal@none@1@S@German is closely related to and classified alongside [[English language|English]] and [[Dutch language|Dutch]].@@@@1@13@@danf@17-8-2009
10300040@unknown@formal@none@1@S@Around the world, German is spoken by approximately 100 million [[First language|native speakers]] and also about 80 million non-native speakers, and [[Standard German]] is widely taught in schools, universities, and [[Goethe Institute]]s worldwide.@@@@1@33@@danf@17-8-2009
10300050@unknown@formal@none@1@S@==Geographic distribution==@@@@1@2@@danf@17-8-2009
10300060@unknown@formal@none@1@S@===Europe===@@@@1@1@@danf@17-8-2009
10300070@unknown@formal@none@1@S@German is spoken primarily in [[Languages of Germany|Germany]] (95%), [[Languages of Austria|Austria]] (89%) and [[Linguistic geography of Switzerland|Switzerland]] (64%), which together with [[Liechtenstein]] and [[Luxembourg]] ([[D-A-CH-Li-Lux]]) constitute the countries where German is the majority language.@@@@1@35@@danf@17-8-2009
10300080@unknown@formal@none@1@S@Other European German-speaking communities are found in [[Italy]] ([[Province of Bolzano-Bozen|Bolzano-Bozen]]), in the [[German speaking community in Belgium|East Cantons]] of [[Belgium]], in the [[France|French]] region of [[Alsace]], which has often changed hands between Germany and France, and in some border villages of the former [[South Jutland County]] (in German, ''Nordschleswig'', in Danish, ''Sønderjylland'') of [[Denmark]].@@@@1@55@@danf@17-8-2009
10300090@unknown@formal@none@1@S@Some German-speaking communities still survive in parts of [[Romania]], the [[Czech Republic]], [[Poland]], [[Hungary]], and above all [[Russia]] and [[Kazakhstan]], although forced expulsions after World War II and massive emigration to Germany in the 1980s and 1990s have depopulated most of these communities.@@@@1@43@@danf@17-8-2009
10300100@unknown@formal@none@1@S@It is also spoken by German-speaking foreign populations and some of their descendants in [[Portugal]], [[Spain]], Italy, [[Morocco]], [[Egypt]], [[Israel]], [[Cyprus]], [[Turkey]], [[Greece]], [[United Kingdom]], [[Netherlands]], [[Scandinavia]], [[Siberia]] in Russia, Hungary, Romania, [[Bulgaria]], and the former [[Yugoslavia]] ([[Bosnia and Herzegovina|Bosnia]], [[Serbia]], [[Republic of Macedonia|Macedonia]], [[Croatia]] and [[Slovenia]]).@@@@1@47@@danf@17-8-2009
10300110@unknown@formal@none@1@S@In Luxembourg and the surrounding areas, large parts of the native population speak German dialects, and some people also master standard German (especially in Luxembourg), although in the [[France|French]] regions of [[Alsace]] (German: ''Elsass'') and [[Lorraine (region)|Lorraine]] (German: ''Lothringen'') [[French language|French]] has replaced the local German dialects as the official language, even though it has not been fully replaced on the street.@@@@1@62@@danf@17-8-2009
10300120@unknown@formal@none@1@S@===Overseas===@@@@1@1@@danf@17-8-2009
10300130@unknown@formal@none@1@S@Outside of Europe and the former [[Soviet Union]], the largest German-speaking communities are to be found in the [[United States]], [[Canada]], [[Brazil]] and [[Argentina]], where millions of Germans migrated in the last 200 years; but the vast majority of their descendants no longer speak German.@@@@1@45@@danf@17-8-2009
10300140@unknown@formal@none@1@S@Additionally, German-speaking communities can be found in the former [[List of former German colonies|German colony]] of [[Namibia]] (independent from [[South Africa]] since 1990), as well as in the other countries of German emigration such as [[Canada]], [[Mexico]], [[Dominican Republic]], [[Paraguay]], [[Uruguay]], [[Chile]], [[Peru]], [[Venezuela]] (where [[Alemán Coloniero]] developed), South Africa and [[Australia]].@@@@1@52@@danf@17-8-2009
10300150@unknown@formal@none@1@S@====South America====@@@@1@2@@danf@17-8-2009
10300160@unknown@formal@none@1@S@In Brazil the largest concentrations of German speakers are in [[Rio Grande do Sul]] (where [[Riograndenser Hunsrückisch]] was developed), [[Santa Catarina (state)|Santa Catarina]], [[Paraná (state)|Paraná]], and [[Espírito Santo]], and there are large German-speaking descendant communities in Argentina, Uruguay and Chile.@@@@1@40@@danf@17-8-2009
10300170@unknown@formal@none@1@S@In the 20th century, over 100,000 German [[Refugee|political refugees]] and invited entrepreneurs settled in [[Latin America]], in countries such as [[Costa Rica]], [[Panama]], Venezuela and the Dominican Republic, to establish German-speaking enclaves, and there has reportedly been a small [[German immigration to Puerto Rico]].@@@@1@44@@danf@17-8-2009
10300180@unknown@formal@none@1@S@====North America====@@@@1@2@@danf@17-8-2009
10300190@unknown@formal@none@1@S@The United States has the largest concentration of German speakers outside of Europe; an indication of this presence can be found in the names of such villages and towns as [[New Leipzig, North Dakota|New Leipzig]], [[Munich, North Dakota|Munich]], [[Karlsruhe, North Dakota|Karlsruhe]], and [[Strasburg, North Dakota|Strasburg]], [[North Dakota]], and [[New Braunfels]], Texas.@@@@1@51@@danf@17-8-2009
10300200@unknown@formal@none@1@S@Though over the course of the 20th century many of the descendants of 18th and 19th-century immigrants ceased speaking German at home, small populations of elderly (as well as some younger) speakers can be found in [[Pennsylvania]] ([[Amish]], [[Hutterites]], [[Dunkards]] and some [[Mennonites]] historically spoke [[Pennsylvania German language|Pennsylvania Dutch]] (a [[West Central German]] variety) and [[Hutterite German]]), [[Kansas]] (Mennonites and [[Volga German]]s), North Dakota (Hutterite Germans, Mennonites, [[History of Germans in Russia and the Soviet Union|Russian German]]s, Volga Germans, and [[Baltic Germans]]), [[South Dakota]], [[Montana]], [[Texas]] ([[Texas German]]), [[Wisconsin]], [[Indiana]], [[Louisiana]] and [[Oklahoma]].@@@@1@93@@danf@17-8-2009
10300210@unknown@formal@none@1@S@Early twentieth century immigration was often to [[St. Louis, Missouri|St. Louis]], [[Chicago]], [[New York]], [[Pittsburgh]] and [[Cincinnati]].@@@@1@17@@danf@17-8-2009
10300220@unknown@formal@none@1@S@Most of the post–[[World War II]] wave are in the New York, [[Philadelphia]], [[Los Angeles]], [[San Francisco]] and Chicago [[urban area]]s, and in [[Florida]], [[Arizona]] and [[California]] where large communities of retired German, Swiss and Austrian expatriates live.@@@@1@38@@danf@17-8-2009
10300230@unknown@formal@none@1@S@The [[German Americans|American population of German ancestry]] is above 60 million.@@@@1@11@@danf@17-8-2009
10300240@unknown@formal@none@1@S@The German language is the third largest language in the U.S. after English and [[Spanish language|Spanish]].@@@@1@16@@danf@17-8-2009
10300250@unknown@formal@none@1@S@In Canada there are people of German ancestry throughout the country and especially in the western cities such as [[Kelowna]].@@@@1@20@@danf@17-8-2009
10300260@unknown@formal@none@1@S@German is also spoken in [[Ontario]] and southern [[Nova Scotia]].@@@@1@10@@danf@17-8-2009
10300270@unknown@formal@none@1@S@There is a large and vibrant community in the city of [[Kitchener, Ontario]].@@@@1@13@@danf@17-8-2009
10300280@unknown@formal@none@1@S@German immigrants were instrumental in the country's three largest urban areas: [[Montreal]], [[Toronto]] and [[Vancouver]], but post-WWII immigrants managed to preserve a fluency in the German language in their respective neighborhoods and sections.@@@@1@33@@danf@17-8-2009
10300290@unknown@formal@none@1@S@In the first half of the 20th century, over a million [[German-Canadian]]s made the language one of Canada's most spoken after [[French language|French]].@@@@1@23@@danf@17-8-2009
10300300@unknown@formal@none@1@S@In Mexico there are also large populations of German ancestry, mainly in the cities of [[Mexico City]], [[Puebla]], [[Mazatlán]], [[Tapachula]], and larger populations scattered in the states of [[Chihuahua]], [[Durango]], and [[Zacatecas]].@@@@1@32@@danf@17-8-2009
10300310@unknown@formal@none@1@S@German ancestry is also said to be found in neighboring towns around [[Guadalajara, Jalisco]] and much of Northern Mexico, where German influence was absorbed into Mexican culture.@@@@1@27@@danf@17-8-2009
10300320@unknown@formal@none@1@S@Standard German is spoken by the affluent German communities in Puebla, Mexico City, [[Nuevo Leon]], [[San Luis Potosi]] and [[Quintana Roo]].@@@@1@21@@danf@17-8-2009
10300330@unknown@formal@none@1@S@German immigration in the twentieth century was small, but produced German-speaking communities in Central America (i.e.@@@@1@16@@danf@17-8-2009
10300340@unknown@formal@none@1@S@[[Guatemala]], [[Honduras]] and [[Nicaragua]]) and the Caribbean Islands like the [[Dominican Republic]].@@@@1@12@@danf@17-8-2009
10300350@unknown@formal@none@1@S@'''Dialects in North America:'''@@@@1@4@@danf@17-8-2009
10300360@unknown@formal@none@1@S@The dialects of German which are or were primarily spoken in colonies or communities founded by German speaking people resemble the dialects of the regions the founders came from.@@@@1@29@@danf@17-8-2009
10300370@unknown@formal@none@1@S@For example, Pennsylvania German resembles dialects of the [[Rhenish Palatinate|Palatinate]], and Hutterite German resembles dialects of [[Carinthia (state)|Carinthia]].@@@@1@18@@danf@17-8-2009
10300380@unknown@formal@none@1@S@[[Texas German]] is a dialect spoken in the areas of Texas settled by the [[Adelsverein]], such as New Braunfels and Fredericksburg.@@@@1@21@@danf@17-8-2009
10300390@unknown@formal@none@1@S@In the [[Amana Colonies]] in the state of Iowa [[Amana German]] is spoken.@@@@1@13@@danf@17-8-2009
10300400@unknown@formal@none@1@S@[[Plautdietsch]] is a large [[minority language]] spoken in Northern Mexico by the [[Mennonite]] communities, with more than 200,000 speakers in the country.@@@@1@22@@danf@17-8-2009
10300410@unknown@formal@none@1@S@[[Hutterite German]] is an Upper German dialect of the [[Austro-Bavarian]] variety of the German language, which is spoken by Hutterite communities in Canada and the United States.@@@@1@27@@danf@17-8-2009
10300420@unknown@formal@none@1@S@Hutterite is spoken in the U.S. states of [[Washington]], Montana, North Dakota and South Dakota, and [[Minnesota]]; and in the Canadian provinces of [[Alberta]], [[Saskatchewan]] and [[Manitoba]].@@@@1@27@@danf@17-8-2009
10300430@unknown@formal@none@1@S@Its speakers belong to some Schmiedleit, Lehrerleit, and Dariusleit Hutterite groups, but there are also speakers among the older generations of Prairieleit (the descendants of those Hutterites who chose not to settle in colonies).@@@@1@34@@danf@17-8-2009
10300440@unknown@formal@none@1@S@Hutterite children who grow up in the colonies first learn and speak Hutterite German before learning English, the standard language of the surrounding areas, in the public school.@@@@1@28@@danf@17-8-2009
10300450@unknown@formal@none@1@S@Many colonies, though, continue with German Grammar School, separate from the public school, throughout a student's elementary education.@@@@1@18@@danf@17-8-2009
10300460@unknown@formal@none@1@S@====Creoles====@@@@1@1@@danf@17-8-2009
10300470@unknown@formal@none@1@S@There is an important German creole being studied and recovered, named [[Unserdeutsch]], spoken in the former German colony of [[Papua New Guinea]], across [[Micronesia]] and in northern Australia (i.e. coastal parts of [[Queensland]] and [[Western Australia]]), by a few elderly people.@@@@1@41@@danf@17-8-2009
10300480@unknown@formal@none@1@S@The risk of its extinction is serious and efforts to revive interest in the language are being implemented by scholars.@@@@1@20@@danf@17-8-2009
10300490@unknown@formal@none@1@S@====Internet====@@@@1@1@@danf@17-8-2009
10300500@unknown@formal@none@1@S@According to [[Global Reach]] (2004), 6.9% of the Internet population is German-speaking.@@@@1@12@@danf@17-8-2009
10300510@unknown@formal@none@1@S@According to [[Netz-tipp]] (2002), 7.7% of webpages are written in German, making it second only to English in the European language group.@@@@1@22@@danf@17-8-2009
10300520@unknown@formal@none@1@S@They also report that 12% of Google's users use its German interface.@@@@1@12@@danf@17-8-2009
10300530@unknown@formal@none@1@S@Older statistics: Babel (1998) found somewhat similar demographics.@@@@1@8@@danf@17-8-2009
10300540@unknown@formal@none@1@S@FUNREDES (1998) and Vilaweb (2000) both found that German is the third most popular language used by websites, after English and Japanese.@@@@1@22@@danf@17-8-2009
10300550@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10300560@unknown@formal@none@1@S@The history of the language begins with the [[High German consonant shift]] during the [[migration period]], separating [[High German]] dialects from common [[West Germanic]].@@@@1@24@@danf@17-8-2009
10300570@unknown@formal@none@1@S@The earliest testimonies of [[Old High German]] are from scattered [[Elder Futhark]] inscriptions, especially in [[Alemannic]], from the 6th century; the earliest glosses (''[[Abrogans]]'') date to the 8th and the oldest coherent texts (the ''[[Hildebrandslied]]'', the ''[[Muspilli]]'' and the [[Merseburg Incantations]]) to the 9th century.@@@@1@45@@danf@17-8-2009
10300580@unknown@formal@none@1@S@[[Old Saxon]] at this time belongs to the [[Ingvaeonic|North Sea Germanic]] cultural sphere, and [[Low Saxon]] would fall under German rather than [[Anglo-Frisian]] influence during the [[Holy Roman Empire]].@@@@1@29@@danf@17-8-2009
10300590@unknown@formal@none@1@S@As Germany was divided into many different [[state]]s, the only force working for a unification or [[standard language|standardization]] of German during a period of several hundred years was the general preference of writers trying to write in a way that could be understood in the largest possible area.@@@@1@48@@danf@17-8-2009
10300600@unknown@formal@none@1@S@When [[Martin Luther]] translated the [[Bible]] (the [[New Testament]] in 1522 and the [[Old Testament]], published in parts and completed in 1534) he based his translation mainly on the bureaucratic standard language used in Saxony (''sächsische Kanzleisprache'') also known as ''Meißner-Deutsch'' (Meißner-German), which was the most widely understood language at this time, because the region it was spoken in was quite influential amongst the German states.@@@@1@66@@danf@17-8-2009
10300610@unknown@formal@none@1@S@This language was based on Eastern Upper and Eastern Central German dialects and preserved much of the grammatical system of Middle High German (unlike the spoken German dialects in Central and Upper Germany that already at that time began to lose the [[genitive case]] and the preterite tense).@@@@1@48@@danf@17-8-2009
10300620@unknown@formal@none@1@S@In the beginning, copies of the Bible had a long list for each region, which translated words unknown in the region into the regional dialect.@@@@1@25@@danf@17-8-2009
10300630@unknown@formal@none@1@S@[[Roman Catholics]] rejected Luther's translation in the beginning and tried to create their own Catholic standard (''gemeines Deutsch'') — which, however, only differed from 'Protestant German' in some minor details.@@@@1@30@@danf@17-8-2009
10300640@unknown@formal@none@1@S@It took until the middle of the 18th century to create a standard that was widely accepted, thus ending the period of [[Early New High German]].@@@@1@26@@danf@17-8-2009
10300650@unknown@formal@none@1@S@In 1901 the 2nd Orthographical Conference ended with a complete standardization of the German language in written form, while the ''Deutsche Bühnensprache'' (literally: ''German stage-language'') had already established pronunciation rules for German three years earlier, which were later to become obligatory for general German pronunciation.@@@@1@45@@danf@17-8-2009
10300660@unknown@formal@none@1@S@German used to be the language of commerce and government in the [[Habsburg Empire]], which encompassed a large area of Central and Eastern Europe.@@@@1@24@@danf@17-8-2009
10300670@unknown@formal@none@1@S@Until the mid-19th century it was essentially the language of townspeople throughout most of the Empire.@@@@1@16@@danf@17-8-2009
10300680@unknown@formal@none@1@S@Its use indicated that the speaker was a [[merchant]] or an urbanite, regardless of nationality.@@@@1@15@@danf@17-8-2009
10300690@unknown@formal@none@1@S@Some cities, such as [[Prague]] (German: ''Prag'') and [[Budapest]] ([[Buda]], German: ''Ofen''), were gradually [[Germanization|Germanized]] in the years after their incorporation into the Habsburg domain.@@@@1@25@@danf@17-8-2009
10300700@unknown@formal@none@1@S@Others, such as [[Bratislava]] (German: ''Pressburg''), were originally settled during the Habsburg period and were primarily German at that time.@@@@1@20@@danf@17-8-2009
10300710@unknown@formal@none@1@S@A few cities such as [[Milan]] (German: ''Mailand'') remained primarily non-German.@@@@1@11@@danf@17-8-2009
10300720@unknown@formal@none@1@S@However, most cities, such as Prague, Budapest, Bratislava (German: ''Pressburg''), [[Zagreb]] (German: ''Agram''), and [[Ljubljana]] (German: ''Laibach''), were primarily German during this time, though they were surrounded by territory that spoke other languages.@@@@1@33@@danf@17-8-2009
10300730@unknown@formal@none@1@S@Until about 1800, standard German was almost only a written language.@@@@1@11@@danf@17-8-2009
10300740@unknown@formal@none@1@S@At this time, people in urban [[northern Germany]], who spoke dialects very different from Standard German, learned it almost like a foreign language and tried to pronounce it as close to the spelling as possible.@@@@1@35@@danf@17-8-2009
10300750@unknown@formal@none@1@S@Prescriptive pronunciation guides used to consider northern [[German phonology|German pronunciation]] to be the standard.@@@@1@14@@danf@17-8-2009
10300760@unknown@formal@none@1@S@However, the actual pronunciation of standard German varies from region to region.@@@@1@12@@danf@17-8-2009
10300770@unknown@formal@none@1@S@Media and written works are almost all produced in standard German (often called ''Hochdeutsch'' in German) which is understood in all areas where German is spoken, except by [[Nursery school|pre-school]] children in areas which speak only dialect, for example [[Switzerland]] and [[Austria]].@@@@1@42@@danf@17-8-2009
10300780@unknown@formal@none@1@S@However, in this age of television, even they now usually learn to understand Standard German before school age.@@@@1@18@@danf@17-8-2009
10300790@unknown@formal@none@1@S@The first dictionary of the [[Brothers Grimm]], the 16 parts of which were issued between 1852 and 1860, remains the most comprehensive guide to the words of the German language.@@@@1@30@@danf@17-8-2009
10300800@unknown@formal@none@1@S@In 1860, grammatical and orthographic rules first appeared in the ''[[Duden Handbook]]''.@@@@1@12@@danf@17-8-2009
10300810@unknown@formal@none@1@S@In 1901, this was declared the standard definition of the German language.@@@@1@12@@danf@17-8-2009
10300820@unknown@formal@none@1@S@Official revisions of some of these rules were not issued until 1998, when the [[German spelling reform of 1996]] was officially promulgated by governmental representatives of all German-speaking countries.@@@@1@29@@danf@17-8-2009
10300830@unknown@formal@none@1@S@Since the reform, German spelling has been in an eight-year transitional period where the reformed spelling is taught in most schools, while traditional and reformed spellings co-exist in the media.@@@@1@30@@danf@17-8-2009
10300840@unknown@formal@none@1@S@See [[German spelling reform of 1996]] for an overview of the public debate concerning the reform with some major newspapers and magazines and several known writers refusing to adopt it.@@@@1@30@@danf@17-8-2009
10300850@unknown@formal@none@1@S@The German spelling reform of 1996 led to public controversy, indeed to considerable dispute.@@@@1@14@@danf@17-8-2009
10300860@unknown@formal@none@1@S@The parliaments of some states (Bundesländer) would not accept it ([[North Rhine-Westphalia|North Rhine Westphalia]] and Bavaria).@@@@1@16@@danf@17-8-2009
10300870@unknown@formal@none@1@S@The dispute landed at one point in the highest court, which made short work of it, ruling that the states had to decide for themselves and that only in schools could the reform be made the official rule - everybody else could continue writing as they had learned it.@@@@1@49@@danf@17-8-2009
10300880@unknown@formal@none@1@S@After 10 years, without any intervention by the federal parliament, a major yet incomplete revision was introduced in 2006, just in time for the new school year.@@@@1@27@@danf@17-8-2009
10300890@unknown@formal@none@1@S@In 2007, some venerable spellings will be finally invalidated even though they caused little or no trouble.@@@@1@17@@danf@17-8-2009
10300900@unknown@formal@none@1@S@The only sure and easily recognizable sign that a text complies with the reform is the -ss at the end of words, as in ''dass'' and ''muss''.@@@@1@27@@danf@17-8-2009
10300910@unknown@formal@none@1@S@Classic spelling forbade this ending, instead using ''daß'' and ''muß''.@@@@1@10@@danf@17-8-2009
10300920@unknown@formal@none@1@S@The controversy revolved around the question of whether a language is part of the culture which must be preserved or a means of communicating information which has to allow for growth.@@@@1@31@@danf@17-8-2009
10300930@unknown@formal@none@1@S@(The reformers seemed to be unimpressed by the fact that a considerable part of that culture - namely the entire German literature of the 20th century - is in the old spelling.)@@@@1@32@@danf@17-8-2009
10300940@unknown@formal@none@1@S@The increasing use of English in Germany's higher education system, as well as in business and in popular culture, has led various German academics to state, not necessarily from an entirely negative perspective, that German is a language in decline in its native country.@@@@1@44@@danf@17-8-2009
10300950@unknown@formal@none@1@S@For example, Ursula Kimpel, of the [[University of Tübingen]], said in 2005 that “German universities are offering more courses in English because of the large number of students coming from abroad.@@@@1@31@@danf@17-8-2009
10300960@unknown@formal@none@1@S@German is unfortunately a language in decline.@@@@1@7@@danf@17-8-2009
10300970@unknown@formal@none@1@S@We need and want our professors to be able to teach effectively in English.”@@@@1@14@@danf@17-8-2009
10300980@unknown@formal@none@1@S@==Standard German==@@@@1@2@@danf@17-8-2009
10300990@unknown@formal@none@1@S@Standard German originated not as a traditional dialect of a specific region, but as a [[written language]].@@@@1@17@@danf@17-8-2009
10301000@unknown@formal@none@1@S@However, there are places where the traditional regional dialects have been replaced by standard German; this is the case in vast stretches of Northern Germany, but also in major cities in other parts of the country.@@@@1@36@@danf@17-8-2009
10301010@unknown@formal@none@1@S@Standard German differs regionally, between German-speaking countries, in [[vocabulary]] and some instances of [[pronunciation]], and even [[grammar]] and [[orthography]].@@@@1@19@@danf@17-8-2009
10301020@unknown@formal@none@1@S@This variation must not be confused with the variation of local dialects.@@@@1@12@@danf@17-8-2009
10301030@unknown@formal@none@1@S@Even though the regional varieties of standard German are only to a certain degree influenced by the local dialects, they are very distinct.@@@@1@23@@danf@17-8-2009
10301040@unknown@formal@none@1@S@German is thus considered a pluricentric language.@@@@1@7@@danf@17-8-2009
10301050@unknown@formal@none@1@S@In most regions, the speakers use a continuum of mixtures from more dialectal varieties to more standard varieties according to situation.@@@@1@21@@danf@17-8-2009
10301060@unknown@formal@none@1@S@In the German-speaking parts of Switzerland, mixtures of dialect and standard are very seldom used, and the use of standard German is largely restricted to the written language.@@@@1@28@@danf@17-8-2009
10301070@unknown@formal@none@1@S@Therefore, this situation has been called a ''medial [[diglossia]]''.@@@@1@9@@danf@17-8-2009
10301080@unknown@formal@none@1@S@[[Swiss Standard German]] is used in the Swiss education system.@@@@1@10@@danf@17-8-2009
10301090@unknown@formal@none@1@S@===Official status===@@@@1@2@@danf@17-8-2009
10301100@unknown@formal@none@1@S@Standard German is the only [[official language]] in Liechtenstein and Austria; it shares official status in [[Germany]] (with [[Danish language|Danish]], [[Frisian languages|Frisian]] and [[Sorbian languages|Sorbian]] as minority languages), Switzerland (with [[French language|French]], [[Italian language|Italian]] and [[Romansh language|Romansh]]), Belgium (with [[Dutch language|Dutch]] and French) and Luxembourg (with French and [[Luxembourgish language|Luxembourgish]]).@@@@1@50@@danf@17-8-2009
10301110@unknown@formal@none@1@S@It is used as a local official language in Italy ([[Province of Bolzano-Bozen]]), as well as in the cities of [[Sopron]] (Hungary), Krahule ([[Slovakia]]) and several cities in Romania.@@@@1@29@@danf@17-8-2009
10301120@unknown@formal@none@1@S@It is the official language (with Italian) of the [[Vatican City|Vatican]] [[Swiss Guard]].@@@@1@13@@danf@17-8-2009
10301130@unknown@formal@none@1@S@German has an officially recognized status as regional or auxiliary language in Denmark ([[South Jutland]] region), France (Alsace and [[Moselle]] regions), Italy (Gressoney valley), Namibia, [[Poland]] ([[Bilingual communes in Poland|Opole]] region), and Russia (Asowo and Halbstadt).@@@@1@36@@danf@17-8-2009
10301140@unknown@formal@none@1@S@German is one of the 23 official [[languages of the European Union]].@@@@1@12@@danf@17-8-2009
10301150@unknown@formal@none@1@S@It is the language with the largest number of native speakers in the [[European Union]] and the second-most spoken language in Europe, just behind English and well ahead of French.@@@@1@30@@danf@17-8-2009
10301160@unknown@formal@none@1@S@===German as a foreign language===@@@@1@5@@danf@17-8-2009
10301170@unknown@formal@none@1@S@German is the third most taught [[foreign language]] in the English speaking world after French and Spanish.@@@@1@17@@danf@17-8-2009
10301180@unknown@formal@none@1@S@German is the main language of about 90–95 million people in Europe (as of 2004), or 13.3% of all Europeans, being the second most spoken native language in Europe after [[Russian language|Russian]], above French (66.5 million speakers in 2004) and English (64.2 million speakers in 2004).@@@@1@46@@danf@17-8-2009
10301190@unknown@formal@none@1@S@It is therefore the most spoken first language in the EU.@@@@1@11@@danf@17-8-2009
10301200@unknown@formal@none@1@S@It is the second most known foreign language in the EU.@@@@1@11@@danf@17-8-2009
10301210@unknown@formal@none@1@S@It is one of the official languages of the European Union, and one of the three [[working language]]s of [[European Commission|the European Commission]], along with English and French.@@@@1@28@@danf@17-8-2009
10301220@unknown@formal@none@1@S@Thirty-two percent of citizens of the EU-15 countries say they can converse in German (either as a mother tongue or as a second or foreign language).@@@@1@26@@danf@17-8-2009
10301230@unknown@formal@none@1@S@This is assisted by the widespread availability of German TV by cable or satellite.@@@@1@14@@danf@17-8-2009
10301240@unknown@formal@none@1@S@German was once, and still remains to some extent, a [[lingua franca]] in Central, Eastern and [[Northern Europe]].@@@@1@18@@danf@17-8-2009
10301250@unknown@formal@none@1@S@==Dialects==@@@@1@1@@danf@17-8-2009
10301260@unknown@formal@none@1@S@German is a member of the [[West Germanic language|western branch]] of the [[Germanic languages|Germanic]] [[Language family|family of languages]], which in turn is part of the [[Indo-European language family]].@@@@1@28@@danf@17-8-2009
10301270@unknown@formal@none@1@S@The German dialect continuum is traditionally divided most broadly into [[High German languages|High German]] and Low German.@@@@1@17@@danf@17-8-2009
10301280@unknown@formal@none@1@S@The variation among the German dialects is considerable, with only neighbouring dialects being mutually intelligible.@@@@1@15@@danf@17-8-2009
10301290@unknown@formal@none@1@S@Some dialects are not intelligible to people who only know standard German.@@@@1@12@@danf@17-8-2009
10301300@unknown@formal@none@1@S@However, all German dialects belong to the dialect continuum of High German and Low Saxon languages.@@@@1@16@@danf@17-8-2009
10301310@unknown@formal@none@1@S@Until roughly the end of the Second World War, there was a dialect continuum of all the continental West Germanic languages because nearly any pair of neighbouring dialects were perfectly mutually intelligible.@@@@1@32@@danf@17-8-2009
10301320@unknown@formal@none@1@S@=== Low German ===@@@@1@4@@danf@17-8-2009
10301330@unknown@formal@none@1@S@Low Saxon varieties (spoken on German territory) are considered linguistically a language separate from the German language by some, but just a dialect by others.@@@@1@25@@danf@17-8-2009
10301340@unknown@formal@none@1@S@Sometimes, Low Saxon and [[Low Franconian]] are grouped together because both are unaffected by the High German consonant shift.@@@@1@19@@danf@17-8-2009
10301350@unknown@formal@none@1@S@The share of the population able to speak Low Saxon, or to understand it, has decreased continuously since WWII.@@@@1@19@@danf@17-8-2009
10301360@unknown@formal@none@1@S@Currently the effort to maintain a residual presence in cultural life is negligible.@@@@1@13@@danf@17-8-2009
10301370@unknown@formal@none@1@S@[[Middle Low German]] was the [[lingua franca]] of the [[Hanseatic League]].@@@@1@11@@danf@17-8-2009
10301380@unknown@formal@none@1@S@It was the predominant language in Northern Germany.@@@@1@8@@danf@17-8-2009
10301390@unknown@formal@none@1@S@This changed in the 16th century.@@@@1@6@@danf@17-8-2009
10301400@unknown@formal@none@1@S@In 1534 the [[Luther Bible]] by Martin Luther was printed.@@@@1@10@@danf@17-8-2009
10301410@unknown@formal@none@1@S@This translation is considered to be an important step in the evolution of Early New High German.@@@@1@17@@danf@17-8-2009
10301420@unknown@formal@none@1@S@It aimed to be understandable to a broad audience and was based mainly on Central and [[Upper German]] varieties.@@@@1@19@@danf@17-8-2009
10301430@unknown@formal@none@1@S@The Early New High German language gained more prestige than Low Saxon and became the language of science and literature.@@@@1@20@@danf@17-8-2009
10301440@unknown@formal@none@1@S@Other factors were that around the same time, the Hanseatic league lost its importance as new trade routes to [[Asia]] and the [[Americas]] were established, and that the most powerful German states of that period were located in Middle and Southern Germany.@@@@1@42@@danf@17-8-2009
10301450@unknown@formal@none@1@S@The 18th and 19th centuries were marked by mass [[education]], the language of the schools being standard German.@@@@1@18@@danf@17-8-2009
10301460@unknown@formal@none@1@S@Low Saxon was gradually pushed back until it was spoken only by the uneducated and at home.@@@@1@17@@danf@17-8-2009
10301470@unknown@formal@none@1@S@Today Low Saxon can be divided into two groups: Low Saxon varieties with a reasonable standard German influx and varieties of Standard German with a Low Saxon influence known as [[Missingsch]].@@@@1@31@@danf@17-8-2009
10301480@unknown@formal@none@1@S@=== High German ===@@@@1@4@@danf@17-8-2009
10301490@unknown@formal@none@1@S@High German is divided into [[Central German]] and [[Upper German language|Upper German]].@@@@1@12@@danf@17-8-2009
10301500@unknown@formal@none@1@S@Central German dialects include [[Ripuarian]], [[Moselle Franconian]], [[Hessian language|Hessian]], [[Thuringian]], [[South Franconian]], [[Lorraine Franconian]] and [[Upper Saxon dialect|Upper Saxon]].@@@@1@19@@danf@17-8-2009
10301510@unknown@formal@none@1@S@They are spoken in the southeastern Netherlands, eastern Belgium, Luxembourg, parts of France, and in Germany approximately between the River [[Main]] and the southern edge of the Lowlands.@@@@1@28@@danf@17-8-2009
10301520@unknown@formal@none@1@S@Modern Standard German is mostly based on Central German, but it should be noted that the common (but not linguistically correct) German term for modern Standard German is ''Hochdeutsch'', that is, ''High German''.@@@@1@33@@danf@17-8-2009
10301530@unknown@formal@none@1@S@The Moselle Franconian varieties spoken in Luxembourg have been officially standardised and institutionalised and are therefore usually considered a separate language known as [[Luxembourgish language|Luxembourgish]].@@@@1@25@@danf@17-8-2009
10301540@unknown@formal@none@1@S@Upper German dialects include [[Alemannic German|Alemannic]] (for instance [[Swiss German (linguistics)|Swiss German]]), [[Swabian German|Swabian]], [[East Franconian German|East Franconian]], [[Alsatian]] and [[Austro-Bavarian]].@@@@1@21@@danf@17-8-2009
10301550@unknown@formal@none@1@S@They are spoken in parts of Alsace, southern Germany, Liechtenstein, Austria, and in the German-speaking parts of Switzerland and Italy.@@@@1@20@@danf@17-8-2009
10301560@unknown@formal@none@1@S@[[Wymysorys]] is a High German dialect of Poland, and [[Sathmarisch]] and [[Siebenbürgisch]] are High German dialects of Romania.@@@@1@18@@danf@17-8-2009
10301570@unknown@formal@none@1@S@The High German varieties spoken by [[Ashkenazi Jew]]s (mostly in the former [[Soviet Union]]) have several unique features, and are usually considered as a separate language, [[Yiddish]].@@@@1@27@@danf@17-8-2009
10301580@unknown@formal@none@1@S@It is the only Germanic language that does not use the [[Latin alphabet]] as its [[official script|standard script]].@@@@1@18@@danf@17-8-2009
10301590@unknown@formal@none@1@S@===German dialects versus varieties of standard German===@@@@1@7@@danf@17-8-2009
10301600@unknown@formal@none@1@S@In German [[linguistics]], German [[dialect]]s are distinguished from [[variety (linguistics)|varieties]] of [[standard German]].@@@@1@13@@danf@17-8-2009
10301610@unknown@formal@none@1@S@*The ''German dialects'' are the traditional local varieties.@@@@1@8@@danf@17-8-2009
10301620@unknown@formal@none@1@S@They are traditionally traced back to the different German tribes.@@@@1@10@@danf@17-8-2009
10301630@unknown@formal@none@1@S@Many of them are hardly understandable to someone who knows only standard German, since they often differ from standard German in [[lexicon]], [[phonology]] and [[syntax]].@@@@1@25@@danf@17-8-2009
10301640@unknown@formal@none@1@S@If a narrow definition of [[language]] based on [[mutual intelligibility]] is used, many German dialects are considered to be separate languages (for instance in the [[Ethnologue]]).@@@@1@26@@danf@17-8-2009
10301650@unknown@formal@none@1@S@However, such a point of view is unusual in German linguistics.@@@@1@11@@danf@17-8-2009
10301660@unknown@formal@none@1@S@*The ''varieties of standard German'' refer to the different local varieties of the [[pluricentric language|pluricentric]] standard German.@@@@1@17@@danf@17-8-2009
10301670@unknown@formal@none@1@S@They only differ slightly in lexicon and phonology.@@@@1@8@@danf@17-8-2009
10301680@unknown@formal@none@1@S@In certain regions, they have replaced the traditional German dialects, especially in Northern Germany.@@@@1@14@@danf@17-8-2009
10301690@unknown@formal@none@1@S@==Grammar==@@@@1@1@@danf@17-8-2009
10301700@unknown@formal@none@1@S@German is an [[Fusional language|inflected language]].@@@@1@6@@danf@17-8-2009
10301710@unknown@formal@none@1@S@===Noun inflection===@@@@1@2@@danf@17-8-2009
10301720@unknown@formal@none@1@S@[[German nouns]] inflect into:@@@@1@4@@danf@17-8-2009
10301730@unknown@formal@none@1@S@* one of four [[Grammatical case|case]]s: [[nominative]], [[genitive]], [[dative case|dative]], and [[accusative case|accusative]].@@@@1@13@@danf@17-8-2009
10301740@unknown@formal@none@1@S@* one of three [[grammatical gender|genders]]: masculine, feminine, or neuter.@@@@1@10@@danf@17-8-2009
10301750@unknown@formal@none@1@S@Word endings sometimes reveal grammatical gender; for instance, nouns ending in '''...ung''' ([[-ing]]), '''...e''', '''...schaft''' ([[-ship]]), '''...keit''' or '''...heit''' ([[-hood]]) are feminine, while nouns ending in '''...chen''' or '''...lein''' ([[diminutive]] forms) are neuter and nouns ending in '''...ismus''' ([[-ism]]) are masculine.@@@@1@41@@danf@17-8-2009
10301760@unknown@formal@none@1@S@Others are controversial, sometimes depending on the region in which it is spoken.@@@@1@13@@danf@17-8-2009
10301770@unknown@formal@none@1@S@Additionally, ambiguous endings exist, such as '''...er''' ([[-er]]), e.g. ''Feier (feminine)'', engl. ''celebration, party'', and ''Arbeiter (masculine)'', engl. ''labourer''.@@@@1@19@@danf@17-8-2009
10301780@unknown@formal@none@1@S@Sentences can usually be reorganized to avoid a misunderstanding.@@@@1@9@@danf@17-8-2009
10301790@unknown@formal@none@1@S@* two numbers: singular and plural@@@@1@6@@danf@17-8-2009
10301800@unknown@formal@none@1@S@Although German is usually cited as an outstanding example of a highly inflected language, the degree of inflection is considerably less than in [[Old German]], or in other old [[Indo-European languages]] such as [[Latin]], [[Ancient Greek]], or [[Sanskrit]].@@@@1@38@@danf@17-8-2009
10301810@unknown@formal@none@1@S@The three genders have collapsed in the plural, which now behaves, grammatically, somewhat as a fourth gender.@@@@1@17@@danf@17-8-2009
10301820@unknown@formal@none@1@S@With four cases and three genders plus plural there are 16 distinct possible combinations of case and gender/number, but presently there are only six forms of the [[Article (grammar)|definite article]] used for the 16 possibilities.@@@@1@35@@danf@17-8-2009
10301830@unknown@formal@none@1@S@Inflection for case on the noun itself is required in the singular for strong masculine and neuter nouns in the genitive and sometimes in the dative.@@@@1@26@@danf@17-8-2009
10301840@unknown@formal@none@1@S@Both of these cases are losing ground to substitutes in [[Natural language|informal speech]].@@@@1@13@@danf@17-8-2009
10301850@unknown@formal@none@1@S@The dative ending is considered somewhat old-fashioned in many contexts and often dropped, but it is still used in sayings and in formal speech or in written language.@@@@1@28@@danf@17-8-2009
10301860@unknown@formal@none@1@S@Weak masculine nouns share a common case ending for genitive, dative and accusative in the singular.@@@@1@16@@danf@17-8-2009
10301870@unknown@formal@none@1@S@Feminines are not declined in the singular.@@@@1@7@@danf@17-8-2009
10301880@unknown@formal@none@1@S@The plural does have an inflection for the dative.@@@@1@9@@danf@17-8-2009
10301890@unknown@formal@none@1@S@In total, seven inflectional endings (not counting plural markers) exist in German: ''-s, -es, -n, -ns, -en, -ens, -e''.@@@@1@19@@danf@17-8-2009
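The figure of six definite-article forms for the 16 combinations of case and gender/number can be checked mechanically. The following Python sketch tabulates the definite articles of standard German and counts the distinct forms; the names `CASES`, `COLUMNS` and `DEFINITE_ARTICLE` are illustrative choices only and do not belong to any library.

```python
# Definite articles of standard German, indexed by case and by
# gender/number. All names in this sketch are illustrative assumptions.
CASES = ["nominative", "accusative", "dative", "genitive"]
COLUMNS = ["masculine", "feminine", "neuter", "plural"]

DEFINITE_ARTICLE = {
    "nominative": ["der", "die", "das", "die"],
    "accusative": ["den", "die", "das", "die"],
    "dative":     ["dem", "der", "dem", "den"],
    "genitive":   ["des", "der", "des", "der"],
}

# Every pairing of a case with a gender (or the plural).
combinations = [(case, column) for case in CASES for column in COLUMNS]

# The set of distinct article forms used across all pairings.
forms = {DEFINITE_ARTICLE[case][COLUMNS.index(column)]
         for case, column in combinations}

print(len(combinations))  # 16
print(sorted(forms))      # ['das', 'dem', 'den', 'der', 'des', 'die']
```

Several cells share a form (for example ''der'' serves as nominative masculine, dative and genitive feminine, and genitive plural), which is why the article alone does not always identify case and gender.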
10301900@unknown@formal@none@1@S@In the German orthography, nouns and most words with the syntactical function of nouns are capitalised, which is supposed to make it easier for readers to find out what function a word has within the sentence (''Am Freitag bin ich einkaufen gegangen.'' — "On Friday I went shopping."; ''Eines Tages war er endlich da.'' — "One day he finally showed up".)@@@@1@61@@danf@17-8-2009
10301910@unknown@formal@none@1@S@This spelling convention is almost unique to German today (shared perhaps only by the closely related [[Luxemburgish language]]), although it was historically common in other languages (e.g., Danish and English), too.@@@@1@31@@danf@17-8-2009
10301920@unknown@formal@none@1@S@Like most Germanic languages, German forms left-branching noun [[compound (linguistics)|compound]]s, where the first noun modifies the category given by the second, for example: ''Hundehütte'' (eng. ''dog hut''; specifically: ''doghouse'').@@@@1@29@@danf@17-8-2009
10301930@unknown@formal@none@1@S@Unlike English, where newer compounds or combinations of longer nouns are often written in ''open'' form with separating spaces, German (like the other Germanic languages) nearly always uses the ''closed'' form without spaces, for example: ''Baumhaus'' (Eng. ''tree house'').@@@@1@39@@danf@17-8-2009
10301940@unknown@formal@none@1@S@Like English, German allows arbitrarily long compounds, but these are rare.@@@@1@11@@danf@17-8-2009
10301950@unknown@formal@none@1@S@(''See also'' [[English compounds]].)@@@@1@4@@danf@17-8-2009
10301960@unknown@formal@none@1@S@The longest German word verified to be actually in (albeit very limited) use is [[Rinderkennzeichnungs- und Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz|Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz]], which, literally translated, breaks up into: Rind (cattle) - Fleisch (meat) - Etikettierung(s) (labelling) - Überwachung(s) (supervision) - Aufgaben (duties) - Übertragung(s) (assignment) - Gesetz (law), so "Beef labelling supervision duty assignment law".@@@@1@50@@danf@17-8-2009
10301970@unknown@formal@none@1@S@===Verb inflection===@@@@1@2@@danf@17-8-2009
10301980@unknown@formal@none@1@S@Standard German verbs inflect into:@@@@1@5@@danf@17-8-2009
10301990@unknown@formal@none@1@S@* one of two conjugation classes, [[weak verb|weak]] and [[strong verb|strong]] (like English).@@@@1@13@@danf@17-8-2009
10302000@unknown@formal@none@1@S@(There is actually a third class, known as mixed verbs, which exhibit inflections combining features of both the strong and weak patterns.)@@@@1@22@@danf@17-8-2009
10302010@unknown@formal@none@1@S@* three persons: 1st, 2nd, 3rd.@@@@1@6@@danf@17-8-2009
10302020@unknown@formal@none@1@S@* two numbers: singular and plural@@@@1@6@@danf@17-8-2009
10302030@unknown@formal@none@1@S@* three [[Grammatical mood|mood]]s: Indicative, Subjunctive, Imperative@@@@1@7@@danf@17-8-2009
10302040@unknown@formal@none@1@S@* two [[Grammatical voice|genera verbi]]: active and passive; the passive is a compound form and can be divided into static and dynamic.@@@@1@22@@danf@17-8-2009
10302050@unknown@formal@none@1@S@* two simple tenses ([[present tense|present]], [[preterite]]) and four compound tenses ([[perfect tense|perfect]], [[pluperfect]], [[Future tense|future]] and [[Future perfect tense|future perfect]])@@@@1@21@@danf@17-8-2009
10302060@unknown@formal@none@1@S@* distinction between [[grammatical aspect]]s is rendered by combined use of subjunctive and/or preterite marking: with neither marking the verb is plain indicative; the subjunctive alone conveys second-hand information; subjunctive plus preterite marking forms the conditional; and the preterite alone is either plain indicative (in the past) or functions as a (literal) alternative for second-hand information or for the conditional, when one of them would otherwise be indistinguishable.@@@@1@68@@danf@17-8-2009
10302070@unknown@formal@none@1@S@* a distinction between perfect and [[Continuous and progressive aspects|progressive aspect]] is, and at every stage of development has been, available as a productive category in the older language and in nearly all documented dialects but, strangely enough, is nowadays rigorously excluded from written usage in its present standardised form.@@@@1@50@@danf@17-8-2009
10302080@unknown@formal@none@1@S@* disambiguation of completed vs. uncompleted forms is widely observed and regularly generated by common prefixes (blicken - to look, erblicken - to see [unrelated form: sehen - to see]).@@@@1@30@@danf@17-8-2009
10302090@unknown@formal@none@1@S@====Verb prefixes====@@@@1@2@@danf@17-8-2009
10302100@unknown@formal@none@1@S@There are also many ways to expand, and sometimes radically change, the meaning of a base verb through a relatively small number of prefixes.@@@@1@24@@danf@17-8-2009
10302110@unknown@formal@none@1@S@Some of those prefixes have a meaning themselves (Example: zer- refers to the destruction of things, as in zerreißen = to tear apart, zerbrechen = to break apart, zerschneiden = to cut apart); others do not have more than the vaguest meaning in and of themselves (Example: ver- , as in versuchen = to try, vernehmen = to interrogate, verteilen = to distribute, verstehen = to understand).@@@@1@53@@danf@17-8-2009
10302120@unknown@formal@none@1@S@More examples: haften = to stick, verhaften = to imprison; kaufen = to buy, verkaufen = to sell; hören = to hear, aufhören = to cease; fahren = to drive, erfahren = to get to know, to hear about something.@@@@1@24@@danf@17-8-2009
10302130@unknown@formal@none@1@S@=====Separable prefixes=====@@@@1@2@@danf@17-8-2009
10302140@unknown@formal@none@1@S@Many [[German verbs]] have a separable prefix, often with an adverbial function.@@@@1@12@@danf@17-8-2009
10302150@unknown@formal@none@1@S@In [[finite verb]] forms this is split off and moved to the end of the clause, and is hence considered by some to be a "resultative particle".@@@@1@27@@danf@17-8-2009
10302160@unknown@formal@none@1@S@For example, ''mitgehen'' meaning "to go with" would be split giving ''Gehen Sie mit?''@@@@1@14@@danf@17-8-2009
10302170@unknown@formal@none@1@S@(Literally: "Go you with?"; idiomatically: "Are you going along?").@@@@1@9@@danf@17-8-2009
10302180@unknown@formal@none@1@S@Indeed, several [[parenthetic]]al clauses may occur between a finite verb and its separated prefix; e.g.@@@@1@15@@danf@17-8-2009
10302190@unknown@formal@none@1@S@:''Er '''kam''' am Freitagabend nach einem harten Arbeitstag und dem üblichen Ärger, der ihn schon seit Jahren immer wieder an seinem Arbeitsplatz plagt, mit fraglicher Freude auf ein Mahl, das seine Frau ihm, wie er hoffte, bereits aufgetischt hatte, endlich zu Hause '''an''' ''.@@@@1@44@@danf@17-8-2009
10302200@unknown@formal@none@1@S@A literal translation of this example might look like this:@@@@1@10@@danf@17-8-2009
10302210@unknown@formal@none@1@S@:He '''arr-''' on Friday evening, after a hard day at work and the usual annoyances that had been plaguing him at his workplace for years, with questionable anticipation of a meal which, as he hoped, his wife had already served, finally '''-ived''' at home.@@@@1@44@@danf@17-8-2009
10302220@unknown@formal@none@1@S@===Word order===@@@@1@2@@danf@17-8-2009
10302230@unknown@formal@none@1@S@German requires that a verbal element (main verb or [[auxiliary verb]]) appear second in the sentence, preceded by the most important topical phrase.@@@@1@23@@danf@17-8-2009
10302240@unknown@formal@none@1@S@The second most important phrase appears at the end of the sentence.@@@@1@12@@danf@17-8-2009
10302250@unknown@formal@none@1@S@For a sentence without an auxiliary, this gives several options:@@@@1@10@@danf@17-8-2009
10302260@unknown@formal@none@1@S@: ''{{lang|de|Der alte Mann gibt mir das Buch heute.}}''@@@@1@9@@danf@17-8-2009
10302265@unknown@formal@none@1@S@(The old man gives me the book today)@@@@1@8@@danf@17-8-2009
10302270@unknown@formal@none@1@S@: ''{{lang|de|Der alte Mann gibt mir heute das Buch.}}''@@@@1@9@@danf@17-8-2009
10302280@unknown@formal@none@1@S@: ''{{lang|de|Das Buch gibt mir der alte Mann heute.}}''@@@@1@9@@danf@17-8-2009
10302290@unknown@formal@none@1@S@: ''{{lang|de|Das Buch gibt der alte Mann heute mir.}}'' ([[stress (linguistics)|stress]] on ''mir'')@@@@1@13@@danf@17-8-2009
10302300@unknown@formal@none@1@S@: ''{{lang|de|Das Buch gibt heute der alte Mann mir.}}'' (as well)@@@@1@11@@danf@17-8-2009
10302310@unknown@formal@none@1@S@: ''{{lang|de|Das Buch gibt der alte Mann mir heute.}}''@@@@1@9@@danf@17-8-2009
10302320@unknown@formal@none@1@S@: ''{{lang|de|Das Buch gibt heute mir der alte Mann.}}''@@@@1@9@@danf@17-8-2009
10302330@unknown@formal@none@1@S@: ''{{lang|de|Das Buch gibt mir heute der alte Mann.}}''@@@@1@9@@danf@17-8-2009
10302340@unknown@formal@none@1@S@: ''{{lang|de|Heute gibt mir der alte Mann das Buch.}}''@@@@1@9@@danf@17-8-2009
10302350@unknown@formal@none@1@S@: ''{{lang|de|Heute gibt mir das Buch der alte Mann.}}''@@@@1@9@@danf@17-8-2009
10302360@unknown@formal@none@1@S@: ''{{lang|de|Heute gibt der alte Mann mir das Buch.}}''@@@@1@9@@danf@17-8-2009
10302370@unknown@formal@none@1@S@: ''{{lang|de|Mir gibt der alte Mann das Buch heute.}}''@@@@1@9@@danf@17-8-2009
10302380@unknown@formal@none@1@S@: ''{{lang|de|Mir gibt heute der alte Mann das Buch.}}''@@@@1@9@@danf@17-8-2009
10302390@unknown@formal@none@1@S@: ''{{lang|de|Mir gibt der alte Mann heute das Buch.}}''@@@@1@9@@danf@17-8-2009
10302400@unknown@formal@none@1@S@The position of a noun as a subject or object in a German sentence does not affect the meaning of the sentence as it would in English.@@@@1@26@@danf@17-8-2009
10302410@unknown@formal@none@1@S@In a [[Sentence (linguistics)|declarative sentence]] in English, if the subject does not occur before the predicate, the sentence could well be misunderstood.@@@@1@22@@danf@17-8-2009
10302420@unknown@formal@none@1@S@For example, in the sentence "Man bites dog" it is clear who did what to whom.@@@@1@16@@danf@17-8-2009
10302430@unknown@formal@none@1@S@To exchange the place of the subject with that of the object — "Dog bites man" — changes the meaning completely.@@@@1@21@@danf@17-8-2009
10302440@unknown@formal@none@1@S@In other words the word order in a sentence conveys significant information.@@@@1@12@@danf@17-8-2009
10302450@unknown@formal@none@1@S@In German, nouns and articles are declined, as in Latin, thus indicating whether a noun is the [[subject (linguistics)|subject]] or [[object (linguistics)|object]] of the verb's action.@@@@1@26@@danf@17-8-2009
10302460@unknown@formal@none@1@S@The above example in German would be ''{{lang|de|Ein Mann beißt den Hund}}'' or ''{{lang|de|Den Hund beißt ein Mann}}'' with both having exactly the same meaning.@@@@1@25@@danf@17-8-2009
10302470@unknown@formal@none@1@S@If the articles are omitted, which is sometimes done in headlines (''{{lang|de|Mann beißt Hund}}''), the syntax applies as in English — the first noun is the subject and the noun following the predicate is the object.@@@@1@36@@danf@17-8-2009
10302480@unknown@formal@none@1@S@Except for emphasis, adverbs of time have to appear in the third place in the sentence, just after the predicate.@@@@1@20@@danf@17-8-2009
10302490@unknown@formal@none@1@S@Otherwise the speaker would be recognised as non-German.@@@@1@8@@danf@17-8-2009
10302500@unknown@formal@none@1@S@For instance, the German word order, rendered in Modern English words, is: "We're going tomorrow to town." (''{{lang|de|Wir gehen morgen in die Stadt.}}'')@@@@1@23@@danf@17-8-2009
10302510@unknown@formal@none@1@S@====Auxiliary verbs====@@@@1@2@@danf@17-8-2009
10302520@unknown@formal@none@1@S@When an [[auxiliary verb]] is present, the auxiliary appears in second position, and the main verb appears at the end.@@@@1@20@@danf@17-8-2009
10302530@unknown@formal@none@1@S@This occurs notably in the creation of the [[perfect tense]].@@@@1@10@@danf@17-8-2009
10302540@unknown@formal@none@1@S@Many word orders are still possible, e.g.:@@@@1@7@@danf@17-8-2009
10302550@unknown@formal@none@1@S@:''{{lang|de|Der alte Mann hat mir das Buch gestern gegeben.}}''@@@@1@9@@danf@17-8-2009
10302555@unknown@formal@none@1@S@(The old man gave me the book yesterday.)@@@@1@8@@danf@17-8-2009
10302560@unknown@formal@none@1@S@:''{{lang|de|Der alte Mann hat mir gestern das Buch gegeben.}}''@@@@1@9@@danf@17-8-2009
10302570@unknown@formal@none@1@S@:''{{lang|de|Das Buch hat mir der alte Mann gestern gegeben.}}''@@@@1@9@@danf@17-8-2009
10302580@unknown@formal@none@1@S@:''{{lang|de|Das Buch hat mir gestern der alte Mann gegeben.}}''@@@@1@9@@danf@17-8-2009
10302590@unknown@formal@none@1@S@:''{{lang|de|Gestern hat mir der alte Mann das Buch gegeben.}}''@@@@1@9@@danf@17-8-2009
10302600@unknown@formal@none@1@S@:''{{lang|de|Gestern hat mir das Buch der alte Mann gegeben.}}''@@@@1@9@@danf@17-8-2009
10302610@unknown@formal@none@1@S@The word order is generally less rigid than in Modern English except for nouns (see below).@@@@1@16@@danf@17-8-2009
10302620@unknown@formal@none@1@S@There are two common [[word order]]s; one is for main [[clause]]s and another for [[subordinate clause]]s.@@@@1@16@@danf@17-8-2009
10302630@unknown@formal@none@1@S@In normal positive sentences the ''inflected'' verb always has position 2; in questions, exclamations and wishes it always has position 1.@@@@1@21@@danf@17-8-2009
10302640@unknown@formal@none@1@S@In subordinate clauses the verb is supposed to occur at the very end, but in speech this rule is often disregarded.@@@@1@21@@danf@17-8-2009
10302650@unknown@formal@none@1@S@For example, in a [[Dependent clause|subordinate clause]] introduced by "weil" ("because") the verb quite often occupies the same position as in a [[Independent clause|main clause]].@@@@1@25@@danf@17-8-2009
10302660@unknown@formal@none@1@S@The correct way of saying "because I'm broke" is ''"{{lang|de|…weil ich pleite bin.}}"''.@@@@1@13@@danf@17-8-2009
10302670@unknown@formal@none@1@S@In the vernacular you may hear instead ''"{{lang|de|…weil ich bin pleite.}}"''@@@@1@11@@danf@17-8-2009
10302675@unknown@formal@none@1@S@This phenomenon may be caused by mixing the word-order pattern used for the word ''{{lang|de|weil}}'' with the pattern used for an alternative word for "because", ''{{lang|de|denn}}'', which is used with the main clause order (''"{{lang|de|…denn ich bin pleite.}}"'').@@@@1@38@@danf@17-8-2009
10302680@unknown@formal@none@1@S@====Modal verbs====@@@@1@2@@danf@17-8-2009
10302690@unknown@formal@none@1@S@Sentences using modal verbs place the infinitive at the end.@@@@1@10@@danf@17-8-2009
10302700@unknown@formal@none@1@S@For example, the sentence in Modern English "Should he go home?" would be rearranged in German to say "Should he (to) home go?" (''{{lang|de|Soll er nach Hause gehen?}}'').@@@@1@28@@danf@17-8-2009
10302710@unknown@formal@none@1@S@Thus in sentences with several subordinate or relative clauses the infinitives are clustered at the end.@@@@1@16@@danf@17-8-2009
10302720@unknown@formal@none@1@S@Compare the similar clustering of prepositions in the following English sentence: "What did you bring that book that I don't like to be read to out of up for?"@@@@1@29@@danf@17-8-2009
10302730@unknown@formal@none@1@S@====Multiple infinitives====@@@@1@2@@danf@17-8-2009
10302740@unknown@formal@none@1@S@The number of infinitives at the end is usually restricted to two, causing the third infinitive or auxiliary verb that would have gone at the very end to be placed instead at the beginning of the chain of verbs.@@@@1@39@@danf@17-8-2009
10302750@unknown@formal@none@1@S@For example, the sentence "Should he move into the house that he has just had renovated?" would be rearranged to "Should he into the house move, that he just renovated had?".@@@@1@31@@danf@17-8-2009
10302755@unknown@formal@none@1@S@(''{{lang|de|Soll er in das Haus einziehen, das er gerade hat renovieren lassen?}}'').@@@@1@12@@danf@17-8-2009
10302760@unknown@formal@none@1@S@The older form would have been (''{{lang|de|Soll er in das Haus, das er gerade hat renovieren lassen, einziehen?}}'').@@@@1@18@@danf@17-8-2009
10302770@unknown@formal@none@1@S@If there are more than three infinitives, all except the first two are relocated to the beginning of the chain.@@@@1@20@@danf@17-8-2009
10302780@unknown@formal@none@1@S@Needless to say the rule is not rigorously applied.@@@@1@9@@danf@17-8-2009
10302790@unknown@formal@none@1@S@==Vocabulary==@@@@1@1@@danf@17-8-2009
10302800@unknown@formal@none@1@S@Most German vocabulary is derived from the Germanic branch of the Indo-European language family, although there are significant minorities of words derived from Latin and [[Greek language|Greek]], and a smaller number from French and, most recently, English.@@@@1@37@@danf@17-8-2009
10302810@unknown@formal@none@1@S@At the same time, the effectiveness of the German language in forming equivalents for foreign words from its inherited Germanic stem repertory is great.@@@@1@24@@danf@17-8-2009
10302820@unknown@formal@none@1@S@Thus, [[Notker Labeo]] was able to translate Aristotelian treatises in pure (Old High) German in the decades after the year 1000.@@@@1@21@@danf@17-8-2009
10302830@unknown@formal@none@1@S@Overall, German has fewer Romance-language loanwords than does English.@@@@1@9@@danf@17-8-2009
10302840@unknown@formal@none@1@S@The coining of new, autochthonous words gave German a vocabulary of an estimated 40,000 words as early as the ninth century.@@@@1@21@@danf@17-8-2009
10302850@unknown@formal@none@1@S@In comparison, Latin, with a written tradition of nearly 2,500 years in an empire which ruled the Mediterranean, has grown to no more than 45,000 words today.@@@@1@27@@danf@17-8-2009
10302860@unknown@formal@none@1@S@Even today, many low-key scholarly movements try to promote the ''[[Ersatz]]'' (substitution) of virtually all foreign words with ancient, dialectal, or [[neologism|neologistic]] German alternatives.@@@@1@24@@danf@17-8-2009
10302870@unknown@formal@none@1@S@It is claimed that this would also help in spreading modern or scientific notions among the less educated, and thus democratise public life, too.@@@@1@24@@danf@17-8-2009
10302880@unknown@formal@none@1@S@Jurisprudence in Germany, for example, uses perhaps the "purest" tongue in terms of "Germanness", but also the most cumbersome, to be found today.@@@@1@23@@danf@17-8-2009
10302890@unknown@formal@none@1@S@In the modern scientific German vocabulary data base in Leipzig (as of July 2003) there are nine million words and word groups in 35 million sentences (out of a corpus of 500 million words).@@@@1@34@@danf@17-8-2009
10302900@unknown@formal@none@1@S@==Writing system==@@@@1@2@@danf@17-8-2009
10302910@unknown@formal@none@1@S@=== Present ===@@@@1@3@@danf@17-8-2009
10302920@unknown@formal@none@1@S@German is written using the Latin alphabet.@@@@1@7@@danf@17-8-2009
10302930@unknown@formal@none@1@S@In addition to the 26 standard letters, German has three vowels with [[Umlaut (diacritic)|Umlaut]], namely ''ä'', ''ö'' and ''ü'', as well as the Eszett or ''[[scharfes s]]'' (sharp s), ''[[ß]]''.@@@@1@30@@danf@17-8-2009
10302940@unknown@formal@none@1@S@Before the German spelling reform of 1996, ''ß'' replaced ''ss'' after [[Vowel length|long vowels]] and diphthongs and before consonants, word-, or partial-word-endings.@@@@1@22@@danf@17-8-2009
10302950@unknown@formal@none@1@S@In reformed spelling, ''ß'' replaces ''ss'' only after long vowels and diphthongs.@@@@1@12@@danf@17-8-2009
10302960@unknown@formal@none@1@S@Since there is no [[capital ß]], it is always written as SS when capitalization is required.@@@@1@16@@danf@17-8-2009
10302970@unknown@formal@none@1@S@For example, ''Maßband'' (tape measure) is capitalized ''MASSBAND''.@@@@1@8@@danf@17-8-2009
10302980@unknown@formal@none@1@S@An exception is the use of ß in legal documents and forms when capitalizing names.@@@@1@15@@danf@17-8-2009
10302990@unknown@formal@none@1@S@To avoid confusion with similar names, a "ß" is to be used instead of "SS".@@@@1@15@@danf@17-8-2009
10303000@unknown@formal@none@1@S@(So: "KREßLEIN" instead of "KRESSLEIN".)@@@@1@5@@danf@17-8-2009
10303010@unknown@formal@none@1@S@A capital ß has been proposed and included in [[Unicode]], but it is not yet recognized as standard German.@@@@1@19@@danf@17-8-2009
10303020@unknown@formal@none@1@S@In [[Switzerland]], ß is not used at all.@@@@1@8@@danf@17-8-2009
10303030@unknown@formal@none@1@S@Umlaut vowels (ä, ö, ü) are commonly transcribed as ae, oe, and ue if the umlauts are not available on the keyboard used.@@@@1@23@@danf@17-8-2009
10303040@unknown@formal@none@1@S@In the same manner ß can be transcribed as ss. German readers understand those transcriptions (although they look unusual), but they are avoided if the regular umlauts are available because they are considered a makeshift, not proper spelling.@@@@1@38@@danf@17-8-2009
10303050@unknown@formal@none@1@S@(In Westphalia, city and family names exist where the extra e has a vowel lengthening effect, e.g. ''Raesfeld'' [ˈraːsfɛlt] and ''Coesfeld'' [ˈkoːsfɛlt], but this use of the letter e after a/o/u does not occur in the present-day spelling of words other than [[proper noun]]s.@@@@1@44@@danf@17-8-2009
10303060@unknown@formal@none@1@S@)@@@@1@1@@danf@17-8-2009
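The keyboard fallback for umlauts and ß described above amounts to a fixed character substitution. A minimal Python sketch follows; the names `FALLBACK` and `transcribe` are hypothetical, and the capitalised mappings (Ä to Ae, and so on) are an assumption for word-initial capitals rather than a rule stated in the text.

```python
# Fallback spelling for text typed without umlauts or ß.
# FALLBACK and transcribe are illustrative names, not a library API.
FALLBACK = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",  # assumption: word-initial capitals
    "ß": "ss",
})

def transcribe(text: str) -> str:
    """Return text with umlauts and ß replaced by their ASCII fallbacks."""
    return text.translate(FALLBACK)

print(transcribe("Ärzte"))    # Aerzte
print(transcribe("Maßband"))  # Massband

# Python's built-in upper-casing already writes ß as SS.
print("Maßband".upper())      # MASSBAND
```

The substitution cannot be reversed reliably: as the Westphalian names ''Raesfeld'' and ''Coesfeld'' show, an ''ae'' or ''oe'' in the output need not stem from an umlaut.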
10303070@unknown@formal@none@1@S@There is still no general agreement on exactly where these umlauts occur in the sorting sequence.@@@@1@16@@danf@17-8-2009
10303080@unknown@formal@none@1@S@Telephone directories treat them by replacing them with the base vowel followed by an e, whereas dictionaries use just the base vowel.@@@@1@22@@danf@17-8-2009
10303090@unknown@formal@none@1@S@As an example, in a [[Telephone directory|telephone book]] ''Ärzte'' occurs after ''Adressenverlage'' but before ''Anlagenbauer'' (because Ä is replaced by Ae).@@@@1@21@@danf@17-8-2009
10303100@unknown@formal@none@1@S@In a dictionary ''Ärzte'' occurs after ''Arzt'' but before ''Asbest'' (because Ä is treated as A).@@@@1@16@@danf@17-8-2009
10303110@unknown@formal@none@1@S@In some older dictionaries or indexes, initial ''Sch'' and ''St'' are treated as separate letters and are listed as separate entries after ''S''.@@@@1@23@@danf@17-8-2009
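The two sorting conventions for umlauts described above can be expressed as two sort keys. The sketch below is a simplified illustration in Python; the function names `phone_book_key` and `dictionary_key` are hypothetical, and the treatment of ß and of ''Sch''/''St'' is deliberately left out because the text gives no rule that both conventions share.

```python
# Two simplified sort keys for German words containing umlauts.
# Both function names are illustrative assumptions.
_PHONE_BOOK = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})
_DICTIONARY = str.maketrans({"ä": "a", "ö": "o", "ü": "u"})

def phone_book_key(word: str) -> str:
    """Telephone-directory convention: umlaut = base vowel followed by e."""
    return word.lower().translate(_PHONE_BOOK)

def dictionary_key(word: str) -> str:
    """Dictionary convention: umlaut = base vowel."""
    return word.lower().translate(_DICTIONARY)

print(sorted(["Anlagenbauer", "Ärzte", "Adressenverlage"], key=phone_book_key))
# ['Adressenverlage', 'Ärzte', 'Anlagenbauer']

print(sorted(["Asbest", "Ärzte", "Arzt"], key=dictionary_key))
# ['Arzt', 'Ärzte', 'Asbest']
```

Both outputs reproduce the orderings given in the examples above.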
10303120@unknown@formal@none@1@S@=== Past ===@@@@1@3@@danf@17-8-2009
10303130@unknown@formal@none@1@S@Until the early 20th century, German was mostly printed in [[blackletter]] [[typefaces]] (mostly in [[fraktur (typeface)|Fraktur]], but also in [[Schwabacher]]) and written in corresponding [[Penmanship|handwriting]] (for example [[Kurrent]] and [[Sütterlin]]).@@@@1@30@@danf@17-8-2009
10303140@unknown@formal@none@1@S@These variants of the Latin alphabet are very different from the serif or [[Sans-serif|sans serif]] [[Antiqua]] typefaces used today, and particularly the handwritten forms are difficult for the untrained to read.@@@@1@31@@danf@17-8-2009
10303150@unknown@formal@none@1@S@The printed forms however were claimed by some to be actually more readable when used for printing [[Germanic language]]s.@@@@1@19@@danf@17-8-2009
10303160@unknown@formal@none@1@S@The [[Nazis]] initially promoted Fraktur and Schwabacher since they were considered [[Aryan]], although they later abolished them in 1941 by claiming that these letters were Jewish.@@@@1@26@@danf@17-8-2009
10303170@unknown@formal@none@1@S@The latter fact is not widely known anymore; today the letters are often associated with the Nazis and are no longer commonly used.@@@@1@23@@danf@17-8-2009
10303180@unknown@formal@none@1@S@The Fraktur script remains present in everyday life through road signs, pub signs, beer brands and other forms of advertisement, where it is used to convey a certain rusticity and antiquity.@@@@1@31@@danf@17-8-2009
10303190@unknown@formal@none@1@S@Proper use of the [[long s]] (''langes s''), [[Long s|ſ]], is essential for writing German text in [[Fraktur (script)|Fraktur]] typefaces.@@@@1@20@@danf@17-8-2009
10303200@unknown@formal@none@1@S@Many [[Antiqua script|Antiqua]] typefaces also include the [[long s]].@@@@1@9@@danf@17-8-2009
10303210@unknown@formal@none@1@S@A specific set of rules applies to the use of long s in German text, but it is now rarely used in Antiqua typesetting.@@@@1@24@@danf@17-8-2009
10303220@unknown@formal@none@1@S@Any lower case "s" at the beginning of a syllable would be a long s, as opposed to a terminal s or short s (the more common variation of the letter s), which marks the end of a syllable; for example, in differentiating between the words ''Wachſtube'' (=guard-house) and ''Wachstube'' (=tube of floor polish).@@@@1@54@@danf@17-8-2009
10303230@unknown@formal@none@1@S@One can easily decide which "s" to use by appropriate hyphenation ("Wach-ſtube" vs. "Wachs-tube").@@@@1@14@@danf@17-8-2009
10303240@unknown@formal@none@1@S@The long s only appears in [[lower case]].@@@@1@8@@danf@17-8-2009
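The hyphenation test described above can be sketched in Python. The function below takes a word that has already been split into syllables with hyphens and turns a lower-case s at the start of a syllable into a long s; the name `apply_long_s` is hypothetical, and the sketch covers only the syllable-initial case stated in the text, not the full set of rules for long s.

```python
# Sketch of the syllable-initial long-s rule.
# apply_long_s is an illustrative name; the input must already be
# hyphenated into syllables, which this code does not do itself.
LONG_S = "\u017f"  # the character ſ

def apply_long_s(hyphenated: str) -> str:
    """Join the syllables, writing a syllable-initial s as long s."""
    syllables = hyphenated.split("-")
    converted = [
        LONG_S + syllable[1:] if syllable.startswith("s") else syllable
        for syllable in syllables
    ]
    return "".join(converted)

print(apply_long_s("Wach-stube"))  # Wachſtube  (guard-house)
print(apply_long_s("Wachs-tube"))  # Wachstube  (tube of floor polish)
```

Because only a lower-case s is matched, a capital S at the start of a word is left unchanged, in line with the statement that the long s appears only in lower case.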
10303250@unknown@formal@none@1@S@The widespread ignorance of the correct use of the Fraktur scripts shows, however, in the many mistakes made, such as the frequent erroneous use of the round s instead of the [[long s]] at the beginning of a syllable, the failure to employ the mandatory [[Typographical ligature|ligature]]s of Fraktur, or the use of letter-forms closer to Antiqua for certain especially hard-to-read Fraktur letters.@@@@1@63@@danf@17-8-2009
10303260@unknown@formal@none@1@S@==Phonology==@@@@1@1@@danf@17-8-2009
10303270@unknown@formal@none@1@S@===Vowels===@@@@1@1@@danf@17-8-2009
10303280@unknown@formal@none@1@S@German vowels (excluding diphthongs; see below) come in ''short'' and ''long'' varieties, as detailed in the following table:@@@@1@18@@danf@17-8-2009
10303290@unknown@formal@none@1@S@Short {{IPA|/ɛ/}} is realised as {{IPA|[ɛ]}} in stressed syllables (including [[secondary stress]]), but as {{IPA|[ǝ]}} in unstressed syllables.@@@@1@18@@danf@17-8-2009
10303300@unknown@formal@none@1@S@Note that stressed short {{IPA|/ɛ/}} can be spelled either with ''e'' or with ''ä'' (''hätte'' 'would have' and ''Kette'' 'chain', for instance, rhyme).@@@@1@23@@danf@17-8-2009
10303310@unknown@formal@none@1@S@In general, the short vowels are open and the long vowels are closed.@@@@1@13@@danf@17-8-2009
10303320@unknown@formal@none@1@S@The one exception is the open {{IPA|/ɛː/}} sound of long Ä; in some varieties of standard German, {{IPA|/ɛː/}} and {{IPA|/eː/}} have merged into {{IPA|[eː]}}, removing this anomaly.@@@@1@27@@danf@17-8-2009
10303330@unknown@formal@none@1@S@In that case, pairs like ''Bären/Beeren'' 'bears/berries' or ''Ähre/Ehre'' 'spike/honour' become homophonous.@@@@1@12@@danf@17-8-2009
10303340@unknown@formal@none@1@S@In many varieties of standard German, an unstressed {{IPA|/ɛr/}} is not pronounced as {{IPA|[ər]}}, but vocalised to {{IPA|[ɐ]}}.@@@@1@18@@danf@17-8-2009
10303350@unknown@formal@none@1@S@Whether any particular vowel letter represents the long or short phoneme is not completely predictable, although the following regularities exist:@@@@1@20@@danf@17-8-2009
10303360@unknown@formal@none@1@S@* If a vowel (other than ''i'') is at the end of a syllable or followed by a single consonant, it is usually pronounced long (e.g. ''Hof'' [hoːf]).@@@@1@28@@danf@17-8-2009
10303370@unknown@formal@none@1@S@* If the vowel is followed by a double consonant (e.g. ''ff'', ''ss'' or ''tt''), ''ck'', ''tz'' or a [[consonant cluster]] (e.g. ''st'' or ''nd''), it is nearly always short (e.g. ''hoffen'' [ˈhɔfǝn]).@@@@1@33@@danf@17-8-2009
10303380@unknown@formal@none@1@S@Double consonants are used only for this function of marking preceding vowels as short; the consonant itself is never pronounced lengthened or doubled.@@@@1@23@@danf@17-8-2009
10303390@unknown@formal@none@1@S@Both of these rules have exceptions (e.g. ''hat'' [hat] 'has' is short despite the first rule; ''Kloster'' {{IPA|[kloːstər]}}, '[[cloister]]'; ''Mond'' {{IPA|[moːnt]}}, '[[moon]]' are long despite the second rule).@@@@1@28@@danf@17-8-2009
10303400@unknown@formal@none@1@S@For an ''i'' that is neither in the combination ''ie'' (making it long) nor followed by a double consonant or cluster (making it short), there is no general rule.@@@@1@29@@danf@17-8-2009
10303410@unknown@formal@none@1@S@In some cases, there are regional differences: In central Germany (Hessen), the ''o'' in the [[Noun#Proper nouns and common nouns|proper name]] "Hoffmann" is pronounced long while most other Germans would pronounce it short; the same applies to the ''e'' in the geographical name "Mecklenburg" for people in that region.@@@@1@49@@danf@17-8-2009
10303420@unknown@formal@none@1@S@The word ''Städte'' 'cities', is pronounced with a short vowel {{IPA|[ˈʃtɛtə]}} by some (Jan Hofer, ARD Television) and with a long vowel {{IPA|[ˈʃtɛːtə]}} by others (Marietta Slomka, ZDF Television).@@@@1@29@@danf@17-8-2009
10303430@unknown@formal@none@1@S@Finally, a vowel followed by ''ch'' can be short (''Fach'' {{IPA|[fax]}} 'compartment', ''Küche'' {{IPA|[ˈkʏçǝ]}} 'kitchen') or long (''Suche'' {{IPA|[ˈzuːxǝ]}} 'search', ''Bücher'' {{IPA|[ˈbyːçər]}} 'books') almost at random.@@@@1@26@@danf@17-8-2009
10303440@unknown@formal@none@1@S@Thus, ''Lache'' is homographous: {{IPA|[ˈlaːxǝ]}} 'puddle' and {{IPA|[ˈlaxǝ]}} 'manner of laughing' (coll.), 'laugh!'@@@@1@13@@danf@17-8-2009
10303450@unknown@formal@none@1@S@(Imp.).@@@@1@1@@danf@17-8-2009
10303460@unknown@formal@none@1@S@German vowels can form the following digraphs (in writing) and diphthongs (in pronunciation); note that the pronunciation of some of them (ei, äu, eu) is very different from what one would expect when considering the component letters:@@@@1@37@@danf@17-8-2009
10303470@unknown@formal@none@1@S@Additionally, the digraph ''ie'' generally represents the phoneme {{IPA|/iː/}}, which is not a diphthong.@@@@1@14@@danf@17-8-2009
10303480@unknown@formal@none@1@S@In many varieties, a /r/ at the end of a syllable is vocalised.@@@@1@13@@danf@17-8-2009
10303490@unknown@formal@none@1@S@However, a sequence of a vowel followed by such a vocalised /r/ is not considered a diphthong: Bär {{IPA|[bɛːɐ̯]}} 'bear', er {{IPA|[eːɐ̯]}} 'he', wir {{IPA|[viːɐ̯]}} 'we', Tor {{IPA|[toːɐ̯]}} 'gate', kurz {{IPA|[kʊɐ̯ts]}} 'short', Wörter {{IPA|[vœɐ̯tɐ]}} 'words'.@@@@1@35@@danf@17-8-2009
10303500@unknown@formal@none@1@S@In most varieties of standard German, word stems that begin with a vowel are preceded by a [[glottal stop]] [ʔ].@@@@1@20@@danf@17-8-2009
10303510@unknown@formal@none@1@S@===Consonants===@@@@1@1@@danf@17-8-2009
10303520@unknown@formal@none@1@S@* '''c''' standing by itself is not a German letter.@@@@1@10@@danf@17-8-2009
10303530@unknown@formal@none@1@S@In borrowed words, it is usually pronounced [ʦ] (before ä, äu, e, i, ö, ü, y) or [k] (before a, o, u, or before consonants).@@@@1@25@@danf@17-8-2009
10303540@unknown@formal@none@1@S@The combination '''ck''' is, as in English, used to indicate that the preceding vowel is short.@@@@1@16@@danf@17-8-2009
10303550@unknown@formal@none@1@S@* '''ch''' occurs most often and is pronounced either [ç] (after ä, ai, äu, e, ei, eu, i, ö, ü and after consonants) or [x] (after a, au, o, u).@@@@1@30@@danf@17-8-2009
10303560@unknown@formal@none@1@S@Ch never occurs at the beginning of an originally German word.@@@@1@11@@danf@17-8-2009
10303570@unknown@formal@none@1@S@In borrowed words with initial Ch there is no single agreement on the pronunciation.@@@@1@14@@danf@17-8-2009
10303580@unknown@formal@none@1@S@For example, the word ''"Chemie"'' (chemistry) can be pronounced [keːˈmiː], [çeːˈmiː] or [ʃeːˈmiː] depending on dialect.@@@@1@16@@danf@17-8-2009
10303590@unknown@formal@none@1@S@* '''dsch''' is pronounced [ʤ] (like ''j'' in ''Jungle'') but appears in a few [[loanwords]] only.@@@@1@16@@danf@17-8-2009
10303600@unknown@formal@none@1@S@* '''f''' is pronounced [f] as in "''f''ather".@@@@1@8@@danf@17-8-2009
10303610@unknown@formal@none@1@S@* '''h''' is pronounced [h] like in "''h''ome" at the beginning of a syllable.@@@@1@14@@danf@17-8-2009
10303620@unknown@formal@none@1@S@After a vowel it is silent and only lengthens the vowel (e.g. ''"Reh"'' = [[roe deer]]).@@@@1@16@@danf@17-8-2009
10303630@unknown@formal@none@1@S@* '''j''' is pronounced [j] in Germanic words (''"Jahr"'' [jaːɐ]).@@@@1@10@@danf@17-8-2009
10303640@unknown@formal@none@1@S@In younger loanwords, it follows more or less the respective languages' pronunciations.@@@@1@12@@danf@17-8-2009
10303650@unknown@formal@none@1@S@* '''l''' is always pronounced [l], never [ɫ] (the English "[[Dark L]]").@@@@1@12@@danf@17-8-2009
10303660@unknown@formal@none@1@S@* '''q''' only exists in combination with '''u''' and appears both in Germanic and Latin words (''"quer"''; ''"Qualität"'').@@@@1@18@@danf@17-8-2009
10303670@unknown@formal@none@1@S@It is pronounced [kv].@@@@1@4@@danf@17-8-2009
10303680@unknown@formal@none@1@S@* '''r''' is pronounced as a [[Guttural R|guttural sound]] (a [[uvular trill]], [ʀ]) in front of a vowel or consonant (''"Rasen"'' [ʀaːzən]; ''"Burg"'' like [buʀg]).@@@@1@25@@danf@17-8-2009
10303690@unknown@formal@none@1@S@In spoken German however, it is commonly vocalised after a vowel (''"er"'' being pronounced rather like ['ɛɐ] - ''"Burg"'' [buɐg]).@@@@1@20@@danf@17-8-2009
10303700@unknown@formal@none@1@S@In some southern non-standard varieties, the '''r''' is pronounced as a tongue-tip r (the [[alveolar trill]]).@@@@1@16@@danf@17-8-2009
10303710@unknown@formal@none@1@S@* '''s''' in Germany is pronounced [z] (as in "''Z''ebra") if it forms the [[syllable onset]] (e.g. Sohn [zoːn]), otherwise [s] (e.g. Bus [bʊs]).@@@@1@24@@danf@17-8-2009
10303720@unknown@formal@none@1@S@In Austria, it is always pronounced [s].@@@@1@7@@danf@17-8-2009
10303730@unknown@formal@none@1@S@A '''ss''' [s] indicates that the preceding vowel is short. '''st''' and '''sp''' at the beginning of words of German origin are pronounced [ʃt] and [ʃp], respectively.@@@@1@27@@danf@17-8-2009
10303740@unknown@formal@none@1@S@* '''ß''' (a letter unique to German called "Eszett") was a ligature of a double '''s''' ''and'' of a '''sz''' and is always pronounced [s].@@@@1@25@@danf@17-8-2009
10303750@unknown@formal@none@1@S@Originating in [[Blackletter]] typeface, it traditionally replaced '''ss''' at the end of a syllable (e.g. ''"ich muss"'' → ''"ich muß"''; ''"ich müsste"'' → ''"ich müßte"''); within a word it contrasts with '''ss''' [s] in indicating that the preceding vowel is long (compare ''"in Maßen"'' [in 'maːsən] "with moderation" and ''"in Massen"'' [in 'masən] "in loads").@@@@1@55@@danf@17-8-2009
10303760@unknown@formal@none@1@S@The use of '''ß''' has recently been limited by the latest German spelling reform and is no longer used for '''ss''' at the end of a syllable; Switzerland and Liechtenstein already abolished it in 1934.@@@@1@35@@danf@17-8-2009
10303770@unknown@formal@none@1@S@* '''sch''' is pronounced [ʃ] (like "sh" in "Shine").@@@@1@9@@danf@17-8-2009
10303780@unknown@formal@none@1@S@* '''v''' is pronounced [f] in words of Germanic origin (e.g. ''"Vater"'' [ˈfaːtɐ]) and [v] in most other words (e.g. ''"Vase"'' [ˈvaːzǝ]).@@@@1@22@@danf@17-8-2009
10303790@unknown@formal@none@1@S@* '''w''' is pronounced [v] like in "''v''acation" (e.g. ''"was"'' [vas]).@@@@1@11@@danf@17-8-2009
10303800@unknown@formal@none@1@S@* '''y''' only appears in loanwords and is traditionally considered a vowel.@@@@1@12@@danf@17-8-2009
10303810@unknown@formal@none@1@S@* '''z''' is always pronounced [ʦ] (e.g. ''"zog"'' [ʦoːk]).@@@@1@9@@danf@17-8-2009
10303820@unknown@formal@none@1@S@A '''tz''' indicates that the preceding vowel is short.@@@@1@9@@danf@17-8-2009
10303830@unknown@formal@none@1@S@====Consonant shifts====@@@@1@2@@danf@17-8-2009
10303840@unknown@formal@none@1@S@German does not have any [[dental fricative]]s (as English '''th''').@@@@1@10@@danf@17-8-2009
10303850@unknown@formal@none@1@S@The '''th''' sounds, which the English language has inherited from [[Anglo-Saxons|Anglo Saxon]], survived on the continent up to Old High German and then disappeared in German with the consonant shifts between the 8th and the 10th century.@@@@1@37@@danf@17-8-2009
10303860@unknown@formal@none@1@S@It is sometimes possible to find parallels between English and German by replacing the English '''th''' with '''d''' in German: "Thank" → in German "Dank", "this" and "that" → "dies" and "das", "[[thou]]" (old 2nd person singular pronoun) → "du", "think" → "denken", "thirsty" → "durstig" and many other examples.@@@@1@50@@danf@17-8-2009
10303870@unknown@formal@none@1@S@Likewise, the '''gh''' in [[Germanic languages|Germanic]] English words, pronounced in several different ways in modern English (as an '''f''', or not at all), can often be linked to German '''ch''': "to laugh" → "lachen", "through" and "thorough" → "durch", "high" → "hoch", "naught" → "nichts", etc.@@@@1@46@@danf@17-8-2009
10303880@unknown@formal@none@1@S@==Cognates with English==@@@@1@3@@danf@17-8-2009
10303890@unknown@formal@none@1@S@There are many thousands of German words that are [[cognate]] to English words (in fact a sizeable fraction of native German and English vocabulary, although for various reasons much of it is not immediately obvious).@@@@1@35@@danf@17-8-2009
10303900@unknown@formal@none@1@S@Most of the words in the following table have almost the same meaning as in English.@@@@1@16@@danf@17-8-2009
10303910@unknown@formal@none@1@S@Compound word cognates@@@@1@3@@danf@17-8-2009
10303920@unknown@formal@none@1@S@When these cognates have slightly different consonants, this is often due to the High German consonant shift.@@@@1@17@@danf@17-8-2009
10303930@unknown@formal@none@1@S@Hence the affinity of English words with those of German dialects is more evident:@@@@1@14@@danf@17-8-2009
10303940@unknown@formal@none@1@S@There are cognates whose meanings in either language have changed through the centuries.@@@@1@13@@danf@17-8-2009
10303950@unknown@formal@none@1@S@It is sometimes difficult for both English and German speakers to discern the relationship.@@@@1@14@@danf@17-8-2009
10303960@unknown@formal@none@1@S@On the other hand, once the definitions are made clear, then the logical relation becomes obvious.@@@@1@16@@danf@17-8-2009
10303970@unknown@formal@none@1@S@Sometimes the generality or specificity of word pairs may be opposite in the two languages.@@@@1@15@@danf@17-8-2009
10303980@unknown@formal@none@1@S@German and English also share many borrowings from other languages, especially Latin, French and Greek.@@@@1@15@@danf@17-8-2009
10303990@unknown@formal@none@1@S@Most of these words have the same meaning, while a few have subtle differences in meaning.@@@@1@16@@danf@17-8-2009
10304000@unknown@formal@none@1@S@As many of these words have been borrowed by numerous languages, not only German and English, they are called ''[[internationalism (linguistics)|internationalisms]]'' in German linguistics.@@@@1@24@@danf@17-8-2009
10304010@unknown@formal@none@1@S@For reference, a good number of these borrowed words are of the neuter gender.@@@@1@14@@danf@17-8-2009
10304020@unknown@formal@none@1@S@==Words borrowed by English==@@@@1@4@@danf@17-8-2009
10304030@unknown@formal@none@1@S@:''For a list of German loanwords in English, see [[:Category:German loanwords]]''@@@@1@11@@danf@17-8-2009
10304040@unknown@formal@none@1@S@In the English language, there are also many words taken from German without any letter change, e.g.:@@@@1@17@@danf@17-8-2009
10304050@unknown@formal@none@1@S@==Names for German in other languages==@@@@1@6@@danf@17-8-2009
10304060@unknown@formal@none@1@S@:''See also: [[Deutsch]], [[Names for the Dutch language|Dutch]], [[Deitsch]], [[Dietsch]], [[Teuton]], [[Teutonic]], [[Allemanic]], [[Alleman]], [[Theodisca]]''@@@@1@15@@danf@17-8-2009
10304070@unknown@formal@none@1@S@The names that countries have for the language differ from region to region.@@@@1@13@@danf@17-8-2009
10304080@unknown@formal@none@1@S@In Italian the sole name for German is still ''tedesco'', from the Latin ''[[theodiscus]]'', meaning "vernacular".@@@@1@16@@danf@17-8-2009
10304090@unknown@formal@none@1@S@A possible explanation for the use of words meaning "mute" (e.g., ''nemoj'' in Russian, ''němý'' in Czech, ''nem'' in [[Serbian language|Serbian]]) to refer to German (and also to Germans) in Slavic languages is that Germans were the first people [[Slavic peoples|Slavic tribes]] encountered with whom they could not communicate.@@@@1@49@@danf@17-8-2009
10304100@unknown@formal@none@1@S@[[Romanian language|Romanian]] used to use the Slavonic term "nemţeşte", but "germană" is now widely used.@@@@1@15@@danf@17-8-2009
10304110@unknown@formal@none@1@S@Hungarian "német" is also of Slavonic origin.@@@@1@7@@danf@17-8-2009
10304120@unknown@formal@none@1@S@The [[Arabic language|Arabic]] name for Austria, النمسا ("an-namsa"), is derived from the Slavonic term.@@@@1@14@@danf@17-8-2009
10304130@unknown@formal@none@1@S@Note also that though the Russian term for the language is ''немецкий'' ''(nemetskij)'', the country is ''Германия'' ''(Germania)''.@@@@1@18@@danf@17-8-2009
10304140@unknown@formal@none@1@S@However, in certain other [[Slavic languages]], such as Czech, the country name (''Německo'') is similar to the name of the language, ''německý'' (jazyk).@@@@1@23@@danf@17-8-2009
10304150@unknown@formal@none@1@S@[[Finns]] and [[Estonians]] use the term ''saksa'', originally from the [[Saxon people|Saxon]] tribe.@@@@1@13@@danf@17-8-2009
10304160@unknown@formal@none@1@S@[[Scandinavians]] use derivatives of the word ''Tyskland/Þýskaland'' (from Theodisca) for the country and ''tysk(a)/þýska'' for the language.@@@@1@17@@danf@17-8-2009
10304170@unknown@formal@none@1@S@[[Hebrew language|Hebrew]] traditionally (nowadays this is not the case) used the Biblical term אַשְׁכֲּנָז ([[Ashkenaz]]) (Genesis 10:3) to refer to Germany, or to certain parts of it, and the [[Ashkenazi]] Jews are those who originate from Germany and [[Eastern Europe]] and formerly spoke Yiddish as their native language, derived from [[Middle High German]].@@@@1@53@@danf@17-8-2009
10304180@unknown@formal@none@1@S@Modern Hebrew uses גֶּרְמָנִי ''germaní'' (Or גֶּרְמָנִית ''germanít'' for the language).@@@@1@11@@danf@17-8-2009
10304190@unknown@formal@none@1@S@The French term is ''allemand'', the Spanish term is ''alemán'', the [[Catalan language|Catalan]] term is ''alemany'', and the [[Portuguese language|Portuguese]] term is ''alemão''; all derive from the ancient [[Alamanni]] tribal alliance, meaning literally "''All Men''".@@@@1@35@@danf@17-8-2009
10304200@unknown@formal@none@1@S@The [[Latvian language|Latvian]] term ''vācu'' means "tinny" and refers disparagingly to the iron-clad [[Teutonic Knights]] that colonized the Baltic in the Middle Ages.@@@@1@23@@danf@17-8-2009
10304210@unknown@formal@none@1@S@The [[Scottish Gaelic]] term for the German language, ''Gearmailtis'', is formed in the standard way of adding ''-(a)is'' to the end of the country name.@@@@1@25@@danf@17-8-2009
10304220@unknown@formal@none@1@S@See [[Names for Germany]] for further details on the origins of these and other terms.@@@@1@15@@danf@17-8-2009
10310010@unknown@formal@none@1@S@GNU General Public License@@@@1@4@@danf@17-8-2009
10310020@unknown@formal@none@1@S@The '''GNU General Public License''' ('''GNU GPL''' or simply '''GPL''') is a widely used [[free software license]], originally written by [[Richard Stallman]] for the [[GNU project]].@@@@1@26@@danf@17-8-2009
10310030@unknown@formal@none@1@S@The GPL is the most popular and well-known example of the type of strong [[copyleft]] license that requires derived works to be available under the same copyleft.@@@@1@27@@danf@17-8-2009
10310040@unknown@formal@none@1@S@Under this philosophy, the GPL is said to grant the recipients of a [[computer program]] the rights of the [[free software definition]] and uses copyleft to ensure the freedoms are preserved, even when the work is changed or added to.@@@@1@40@@danf@17-8-2009
10310050@unknown@formal@none@1@S@This is in distinction to [[permissive free software licenses]], of which the [[BSD licenses]] are the standard examples.@@@@1@18@@danf@17-8-2009
10310060@unknown@formal@none@1@S@The [[GNU Lesser General Public License]] (LGPL) is a modified, more permissive, version of the GPL, originally intended for some [[library (computing)|software libraries]].@@@@1@23@@danf@17-8-2009
10310070@unknown@formal@none@1@S@There is also a [[GNU Free Documentation License]], which was originally intended for use with documentation for GNU software, but has also been adopted for other uses, such as the [[Wikipedia]] project.@@@@1@32@@danf@17-8-2009
10310080@unknown@formal@none@1@S@The [[Affero General Public License]] (GNU AGPL) is a similar license with a focus on networking server software.@@@@1@18@@danf@17-8-2009
10310090@unknown@formal@none@1@S@The GNU AGPL is similar to the GNU General Public License, except that it additionally covers the use of the software over a computer network, requiring that the complete source code be made available to any network user of the AGPLed work, for example a web application.@@@@1@47@@danf@17-8-2009
10310100@unknown@formal@none@1@S@The Free Software Foundation recommends that this license be considered for any software that will commonly be run over the network.@@@@1@21@@danf@17-8-2009
10310110@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10310120@unknown@formal@none@1@S@The GPL was written by [[Richard Stallman]] in 1989 for use with programs released as part of the [[GNU project]].@@@@1@20@@danf@17-8-2009
10310130@unknown@formal@none@1@S@The original GPL was based on a unification of similar licenses used for early versions of [[GNU Emacs]], the [[GNU Debugger]] and the [[GNU Compiler Collection]].@@@@1@26@@danf@17-8-2009
10310140@unknown@formal@none@1@S@These licenses contained similar provisions to the modern GPL, but were specific to each program, rendering them incompatible, despite being the same license.@@@@1@23@@danf@17-8-2009
10310150@unknown@formal@none@1@S@Stallman's goal was to produce one license that could be used for any project, thus making it possible for many projects to share code.@@@@1@24@@danf@17-8-2009
10310160@unknown@formal@none@1@S@An important vote of confidence in the GPL came from [[Linus Torvalds]]' adoption of the license for the [[History of the Linux kernel|Linux kernel]] in 1992, switching from an earlier license that prohibited commercial distribution.@@@@1@35@@danf@17-8-2009
10310170@unknown@formal@none@1@S@As of August 2007, the GPL accounted for nearly 65% of the 43,442 free software projects listed on [[Freshmeat]], and [[As of 2006|as of January 2006]], about 68% of the projects listed on [[SourceForge.net]].@@@@1@34@@danf@17-8-2009
10310180@unknown@formal@none@1@S@Similarly, a 2001 survey of [[Red Hat Linux]] 7.1 found that 50% of the source code was licensed under the GPL and a 1997 survey of [[Ibiblio|MetaLab]], then the largest free software archive, showed that the GPL accounted for about half of the licenses used.@@@@1@45@@danf@17-8-2009
10310190@unknown@formal@none@1@S@One survey of a large repository of open-source software reported that in July 1997, about half the software packages with explicit license terms used the GPL.@@@@1@26@@danf@17-8-2009
10310200@unknown@formal@none@1@S@Prominent free software programs licensed under the GPL include the [[Linux kernel]] and the [[GNU Compiler Collection]] (GCC).@@@@1@18@@danf@17-8-2009
10310210@unknown@formal@none@1@S@Some other free software programs are [[dual-licensed]] under multiple licenses, often with one of the licenses being the GPL.@@@@1@19@@danf@17-8-2009
10310220@unknown@formal@none@1@S@Some observers believe that the strong [[copyleft]] provided by the GPL was crucial to the success of Linux, giving the programmers who contributed to it the confidence that their work would benefit the whole world and remain free, rather than being exploited by software companies that would not have to give anything back to the community.@@@@1@56@@danf@17-8-2009
10310230@unknown@formal@none@1@S@The second version of the license, version 2, was released in 1991.@@@@1@12@@danf@17-8-2009
10310240@unknown@formal@none@1@S@Over the following 15 years, some members of the [[free software community|FOSS (Free and Open Source Software) community]] came to believe that some software and hardware vendors were finding loopholes in the GPL, allowing GPL-licensed software to be exploited in ways that were contrary to the intentions of the programmers.@@@@1@50@@danf@17-8-2009
10310250@unknown@formal@none@1@S@These concerns included [[tivoization]] (the inclusion of GPL-licensed software in hardware that will refuse to run modified versions of its software); the use of unpublished, modified versions of GPL software behind web interfaces; and patent deals between [[Microsoft]] and Linux and Unix distributors that may represent an attempt to use patents as a weapon against competition from Linux.@@@@1@58@@danf@17-8-2009
10310260@unknown@formal@none@1@S@Version 3 was developed to attempt to address these concerns.@@@@1@10@@danf@17-8-2009
10310270@unknown@formal@none@1@S@It was [http://www.fsf.org/news/gplv3_launched officially released] on [[June 29]], [[2007]].@@@@1@9@@danf@17-8-2009
10310280@unknown@formal@none@1@S@==Versions==@@@@1@1@@danf@17-8-2009
10310290@unknown@formal@none@1@S@===Version 1===@@@@1@2@@danf@17-8-2009
10310300@unknown@formal@none@1@S@Version 1 of the GNU GPL, released in January 1989, prevented what were then the two main ways that software distributors restricted the freedoms that define free software.@@@@1@28@@danf@17-8-2009
10310310@unknown@formal@none@1@S@The first problem was that distributors might publish [[binary file]]s only – executable, but not readable or modifiable by humans.@@@@1@20@@danf@17-8-2009
10310320@unknown@formal@none@1@S@To prevent this, GPLv1 said that any vendor distributing binaries must also make the human readable source code available under the same licensing terms.@@@@1@24@@danf@17-8-2009
10310330@unknown@formal@none@1@S@The second problem was that distributors might add additional restrictions, either by adding restrictions to the license, or by combining the software with other software which had other restrictions on its distribution.@@@@1@32@@danf@17-8-2009
10310340@unknown@formal@none@1@S@If this was done, then the union of the two sets of restrictions would apply to the combined work, thus unacceptable restrictions could be added.@@@@1@25@@danf@17-8-2009
10310350@unknown@formal@none@1@S@To prevent this, GPLv1 said that modified versions, as a whole, had to be distributed under the terms in GPLv1.@@@@1@20@@danf@17-8-2009
10310360@unknown@formal@none@1@S@Therefore, software distributed under the terms of GPLv1 could be combined with software under more permissive terms, as this would not change the terms under which the whole could be distributed, but software distributed under GPLv1 could not be combined with software distributed under a more restrictive license, as this would conflict with the requirement that the whole be distributable under the terms of GPLv1.@@@@1@65@@danf@17-8-2009
10310370@unknown@formal@none@1@S@===Version 2===@@@@1@2@@danf@17-8-2009
10310380@unknown@formal@none@1@S@According to Richard Stallman, the major change in GPLv2 was the "Liberty or Death" clause, as he calls it - Section 7.@@@@1@22@@danf@17-8-2009
10310390@unknown@formal@none@1@S@This section says that if someone has restrictions imposed that ''prevent'' him or her from distributing GPL-covered software in a way that respects other users' freedom (for example, if a legal ruling states that he or she can only distribute the software in binary form), he or she cannot distribute it at all.@@@@1@53@@danf@17-8-2009
10310400@unknown@formal@none@1@S@By 1990, it was becoming apparent that a less restrictive license would be strategically useful for some software libraries; when version 2 of the GPL (GPLv2) was released in June 1991, therefore, a second license, the Library General Public License (LGPL), was introduced at the same time and numbered with version 2 to show that both were complementary.@@@@1@58@@danf@17-8-2009
10310410@unknown@formal@none@1@S@The version numbers diverged in 1999 when version 2.1 of the LGPL was released, which renamed it the [[GNU Lesser General Public License]] to reflect its place in the GNU philosophy.@@@@1@31@@danf@17-8-2009
10310420@unknown@formal@none@1@S@===Version 3===@@@@1@2@@danf@17-8-2009
10310430@unknown@formal@none@1@S@In late 2005, the [[Free Software Foundation]] (FSF) announced work on version 3 of the GPL (GPLv3).@@@@1@17@@danf@17-8-2009
10310440@unknown@formal@none@1@S@On [[January 16]], [[2006]], the first "discussion draft" of GPLv3 was published, and the public consultation began.@@@@1@17@@danf@17-8-2009
10310450@unknown@formal@none@1@S@The public consultation was originally planned for nine to fifteen months but finally stretched to eighteen months with four drafts being published.@@@@1@22@@danf@17-8-2009
10310460@unknown@formal@none@1@S@The official GPLv3 was released by FSF on [[June 29]], [[2007]].@@@@1@11@@danf@17-8-2009
10310470@unknown@formal@none@1@S@GPLv3 was written by [[Richard Stallman]], with legal counsel from [[Eben Moglen]] and [[Software Freedom Law Center]].@@@@1@17@@danf@17-8-2009
10310480@unknown@formal@none@1@S@According to Stallman, the most important changes are in relation to [[Software patents and free software|software patents]], [[free software license]] compatibility, the definition of "source code", and hardware restrictions on software modification ("[[tivoization]]").@@@@1@33@@danf@17-8-2009
10310490@unknown@formal@none@1@S@Other changes relate to internationalisation, how license violations are handled, and how additional permissions can be granted by the copyright holder.@@@@1@21@@danf@17-8-2009
10310500@unknown@formal@none@1@S@Other notable changes include allowing authors to add certain additional conditions or requirements to their contributions.@@@@1@16@@danf@17-8-2009
10310510@unknown@formal@none@1@S@One of those new optional requirements, sometimes referred to as the Affero clause, is intended to fulfill a request regarding [[software as a service]]; permitting the addition of this requirement makes GPLv3 compatible with the [[Affero General Public License]].@@@@1@39@@danf@17-8-2009
10310520@unknown@formal@none@1@S@The public consultation process was coordinated by the Free Software Foundation with assistance from [[Software Freedom Law Center]], [[Free Software Foundation Europe]], and other free software groups.@@@@1@27@@danf@17-8-2009
10310530@unknown@formal@none@1@S@Comments were collected from the public via the gplv3.fsf.org web portal.@@@@1@11@@danf@17-8-2009
10310540@unknown@formal@none@1@S@That portal runs purpose-written software called [[stet (software)|stet]].@@@@1@8@@danf@17-8-2009
10310550@unknown@formal@none@1@S@These comments were passed to four committees comprising approximately 130 people, including supporters and detractors of FSF's goals.@@@@1@18@@danf@17-8-2009
10310560@unknown@formal@none@1@S@Those committees researched the comments submitted by the public and passed their summaries to Stallman for a decision on what the license would do.@@@@1@24@@danf@17-8-2009
10310570@unknown@formal@none@1@S@During the public consultation process, 962 comments were submitted for the first draft.@@@@1@13@@danf@17-8-2009
10310580@unknown@formal@none@1@S@By the end, a total of 2,636 comments had been submitted.@@@@1@11@@danf@17-8-2009
10310590@unknown@formal@none@1@S@The third draft was released on [[March 28]], [[2007]].@@@@1@9@@danf@17-8-2009
10310600@unknown@formal@none@1@S@This draft included language intended to prevent patent cross-licenses like the controversial [[Novell#Agreement with Microsoft|Microsoft-Novell patent agreement]] and restricted the anti-tivoization clauses to a legal definition of a "User" or "consumer product."@@@@1@32@@danf@17-8-2009
10310610@unknown@formal@none@1@S@It also explicitly removed the section on "Geographical Limitations", whose probable removal had been announced at the launch of the public consultation.@@@@1@22@@danf@17-8-2009
10310620@unknown@formal@none@1@S@The fourth discussion draft, which was the last, was released on [[May 31]], [[2007]].@@@@1@14@@danf@17-8-2009
10310630@unknown@formal@none@1@S@It introduced [[Apache Software License]] compatibility, clarified the role of outside contractors, and made an exception to permit the Microsoft-Novell agreement, saying in section 11 paragraph 6 that@@@@1@28@@danf@17-8-2009
10310640@unknown@formal@none@1@S@This aims to make future such deals ineffective.@@@@1@8@@danf@17-8-2009
10310650@unknown@formal@none@1@S@The license is also meant to cause Microsoft to extend the patent licenses it grants to Novell customers for the use of GPLv3 software to ''all'' users of that GPLv3 software; this is possible only if Microsoft is legally a "conveyor" of the GPLv3 software.@@@@1@45@@danf@17-8-2009
10310660@unknown@formal@none@1@S@Others, notably some high-profile developers of the [[Linux kernel]], commented to the mass media and made public statements about their objections to parts of discussion drafts 1 and 2.@@@@1@29@@danf@17-8-2009
10310670@unknown@formal@none@1@S@== Terms and conditions ==@@@@1@5@@danf@17-8-2009
10310680@unknown@formal@none@1@S@The terms and conditions of the GPL are available to anybody receiving a copy of the work that has a GPL applied to it ("the licensee").@@@@1@26@@danf@17-8-2009
10310690@unknown@formal@none@1@S@Any licensee who adheres to the terms and conditions is given permission to modify the work, as well as to copy and redistribute the work or any derivative version.@@@@1@29@@danf@17-8-2009
10310700@unknown@formal@none@1@S@The licensee is allowed to charge a fee for this service, or do this free of charge.@@@@1@17@@danf@17-8-2009
10310710@unknown@formal@none@1@S@This latter point distinguishes the GPL from software licenses that prohibit commercial redistribution.@@@@1@13@@danf@17-8-2009
10310720@unknown@formal@none@1@S@The FSF argues that free software should not place restrictions on commercial use, and the GPL explicitly states that GPL works may be sold at any price.@@@@1@27@@danf@17-8-2009
10310730@unknown@formal@none@1@S@The GPL additionally states that a distributor may not impose "further restrictions on the rights granted by the GPL".@@@@1@19@@danf@17-8-2009
10310740@unknown@formal@none@1@S@This forbids activities such as distribution of the software under a non-disclosure agreement or contract.@@@@1@15@@danf@17-8-2009
10310750@unknown@formal@none@1@S@Distributors under the GPL also grant a license for any of their patents practiced by the software, to practice those patents in GPL software.@@@@1@24@@danf@17-8-2009
10310760@unknown@formal@none@1@S@Section three of the license requires that programs distributed as pre-compiled binaries are accompanied by a copy of the source code, a written offer to distribute the source code via the same mechanism as the pre-compiled binary or the written offer to obtain the source code that you got when you received the pre-compiled binary under the GPL.@@@@1@58@@danf@17-8-2009
10310770@unknown@formal@none@1@S@=== Copyleft ===@@@@1@3@@danf@17-8-2009
10310780@unknown@formal@none@1@S@The distribution rights granted by the GPL for modified versions of the work are not unconditional.@@@@1@16@@danf@17-8-2009
10310790@unknown@formal@none@1@S@When someone distributes a GPL'd work plus their own modifications, the requirements for distributing the whole work cannot be any greater than the requirements that are in the GPL.@@@@1@29@@danf@17-8-2009
10310800@unknown@formal@none@1@S@This requirement is known as copyleft.@@@@1@6@@danf@17-8-2009
10310810@unknown@formal@none@1@S@It earns its legal power from the use of [[copyright]] on software programs.@@@@1@13@@danf@17-8-2009
10310820@unknown@formal@none@1@S@Because a GPL work is copyrighted, a licensee has no right to redistribute it, not even in modified form (barring [[fair use]]), except under the terms of the license.@@@@1@29@@danf@17-8-2009
10310830@unknown@formal@none@1@S@One is only required to adhere to the terms of the GPL if one wishes to exercise rights normally restricted by copyright law, such as redistribution.@@@@1@26@@danf@17-8-2009
10310840@unknown@formal@none@1@S@Conversely, if one distributes copies of the work without abiding by the terms of the GPL (for instance, by keeping the source code secret), he or she can be [[lawsuit|sued]] by the original author under copyright law.@@@@1@37@@danf@17-8-2009
10310850@unknown@formal@none@1@S@Copyleft thus uses copyright law to accomplish the opposite of its usual purpose: instead of imposing restrictions, it grants rights to other people, in a way that ensures the rights cannot subsequently be taken away.@@@@1@35@@danf@17-8-2009
10310860@unknown@formal@none@1@S@It also ensures that unlimited redistribution rights are not granted, should any legal flaw (or "[[computer bug|bug]]") be found in the copyleft statement.@@@@1@23@@danf@17-8-2009
10310870@unknown@formal@none@1@S@Many distributors of GPL'ed programs bundle the source code with the [[executable]]s.@@@@1@12@@danf@17-8-2009
10310880@unknown@formal@none@1@S@An alternative method of satisfying the copyleft is to provide a written offer to provide the source code on a physical medium (such as a CD) upon request.@@@@1@28@@danf@17-8-2009
10310890@unknown@formal@none@1@S@In practice, many GPL'ed programs are distributed over the [[Internet]], and the source code is made available over [[File Transfer Protocol|FTP]].@@@@1@21@@danf@17-8-2009
10310900@unknown@formal@none@1@S@For Internet distribution, this complies with the license.@@@@1@8@@danf@17-8-2009
10310910@unknown@formal@none@1@S@Copyleft applies only when a person seeks to redistribute the program.@@@@1@11@@danf@17-8-2009
10310920@unknown@formal@none@1@S@One is allowed to make private modified versions, without any obligation to divulge the modifications as long as the modified software is not distributed to anyone else.@@@@1@27@@danf@17-8-2009
10310930@unknown@formal@none@1@S@Note that the copyleft applies only to the software and not to its output (unless that output is itself a derivative work of the program); for example, a public web portal running a modified derivative of a GPL'ed [[content management system]] is not required to distribute its changes to the underlying software.@@@@1@52@@danf@17-8-2009
10310940@unknown@formal@none@1@S@==Licensing and contractual issues==@@@@1@4@@danf@17-8-2009
10310950@unknown@formal@none@1@S@The GPL was designed as a [[license]], rather than a [[contract]].@@@@1@11@@danf@17-8-2009
10310960@unknown@formal@none@1@S@In some [[Common Law]] jurisdictions, the legal distinction between a license and a contract is an important one: contracts are enforceable by [[contract law]], whereas licenses are enforced under [[copyright law]].@@@@1@31@@danf@17-8-2009
10310970@unknown@formal@none@1@S@However, this distinction is not useful in the many jurisdictions where there are no differences between contracts and licenses, such as [[Civil law (legal system)|Civil Law]] systems.@@@@1@27@@danf@17-8-2009
10310980@unknown@formal@none@1@S@Those who do not agree to the GPL's terms and conditions do not have permission, under copyright law, to copy or distribute GPL licensed software or derivative works.@@@@1@28@@danf@17-8-2009
10310990@unknown@formal@none@1@S@However, they may still use the software however they like.@@@@1@10@@danf@17-8-2009
10311000@unknown@formal@none@1@S@== Copyright holders ==@@@@1@4@@danf@17-8-2009
10311010@unknown@formal@none@1@S@The text of the GPL is itself copyrighted, and the copyright is held by the [[Free Software Foundation]] (FSF).@@@@1@19@@danf@17-8-2009
10311020@unknown@formal@none@1@S@However, the FSF does not hold the copyright for a work released under the GPL, unless an author explicitly assigns copyrights to the FSF (which seldom happens except for programs that are part of the [[GNU]] project).@@@@1@37@@danf@17-8-2009
10311030@unknown@formal@none@1@S@Only the individual copyright holders have the authority to sue when a license violation takes place.@@@@1@16@@danf@17-8-2009
10311040@unknown@formal@none@1@S@The FSF permits people to create new licenses based on the GPL, as long as the derived licenses do not use the GPL preamble without permission.@@@@1@26@@danf@17-8-2009
10311050@unknown@formal@none@1@S@This is discouraged, however, since such a license is generally incompatible with the GPL.@@@@1@14@@danf@17-8-2009
10311060@unknown@formal@none@1@S@(See the [http://www.fsf.org/licenses/gpl-faq.html#ModifyGPL GPL FAQ] for more information.)@@@@1@8@@danf@17-8-2009
10311070@unknown@formal@none@1@S@Other licenses created by the GNU project include the [[GNU Lesser General Public License]] and the [[GNU Free Documentation License]].@@@@1@20@@danf@17-8-2009
10311080@unknown@formal@none@1@S@== The GPL in court ==@@@@1@6@@danf@17-8-2009
10311090@unknown@formal@none@1@S@A key dispute related to the GPL is whether or not non-GPL software can [[library linking|dynamically link]] to GPL libraries.@@@@1@20@@danf@17-8-2009
10311100@unknown@formal@none@1@S@The GPL is clear in requiring that all [[derivative work]]s of GPL'ed code must themselves be GPL'ed.@@@@1@17@@danf@17-8-2009
10311110@unknown@formal@none@1@S@However, it is not clear whether an executable that dynamically links to GPL code should be considered a derivative work.@@@@1@20@@danf@17-8-2009
10311120@unknown@formal@none@1@S@The free/open-source software community is split on this issue.@@@@1@9@@danf@17-8-2009
10311130@unknown@formal@none@1@S@The FSF asserts that such an executable is indeed a derivative work if the executable and GPL code "make function calls to each other and share data structures," with others agreeing, while some (e.g. [[Linus Torvalds]]) agree that dynamic linking can create derived works but disagree over the circumstances.@@@@1@49@@danf@17-8-2009
10311150@unknown@formal@none@1@S@On the other hand, some experts have argued that the question is still open: one [[Novell]] lawyer has written that dynamic linking not being derivative "makes sense" but is not "clear-cut," and [[Lawrence Rosen]] has claimed that a court of law would "probably" exclude dynamic linking from derivative works although "there are also good arguments" on the other side and "the outcome is not clear" (on a later occasion, he argued that "market-based" factors are more important than the linking technique).@@@@1@81@@danf@17-8-2009
10311160@unknown@formal@none@1@S@This is ultimately a question not of the GPL ''per se'', but of how copyright law defines derivative works.@@@@1@19@@danf@17-8-2009
10311170@unknown@formal@none@1@S@In ''[[Galoob v. Nintendo]]'' the [[Ninth Circuit Court of Appeals]] defined a derivative work as having "'form' or permanence" and noted that "the infringing work must incorporate a portion of the copyrighted work in some form," but there have been no clear court decisions to resolve this particular conflict.@@@@1@49@@danf@17-8-2009
10311180@unknown@formal@none@1@S@Since there is no record of anyone circumventing the GPL by dynamic linking and contesting when threatened with lawsuits by the copyright holder, the restriction appears ''[[de facto]]'' enforceable even if not yet proven ''[[de jure]]''.@@@@1@36@@danf@17-8-2009
10311190@unknown@formal@none@1@S@In 2002, MySQL AB sued Progress NuSphere for copyright and trademark infringement in [[U.S. District Court for the District of Massachusetts|United States district court]].@@@@1@24@@danf@17-8-2009
10311200@unknown@formal@none@1@S@NuSphere had allegedly violated MySQL's copyright by linking code for the Gemini table type into the MySQL server.@@@@1@18@@danf@17-8-2009
10311210@unknown@formal@none@1@S@After a preliminary hearing before Judge [[Patti Saris]] on [[February 27]], [[2002]], the parties entered settlement talks and eventually settled.@@@@1@20@@danf@17-8-2009
10311220@unknown@formal@none@1@S@At the hearing, Judge Saris "saw no reason" that the GPL would not be enforceable.@@@@1@15@@danf@17-8-2009
10311230@unknown@formal@none@1@S@In August 2003, the [[SCO Group]] stated that they believed the GPL to have no legal validity, and that they intended to take up lawsuits over sections of code supposedly copied from SCO Unix into the [[Linux kernel]].@@@@1@38@@danf@17-8-2009
10311240@unknown@formal@none@1@S@This was a problematic stand for them, as they had distributed Linux and other GPL'ed code in their [[Caldera OpenLinux]] distribution, and there is little evidence that they had any legal right to do so except under the terms of the GPL.@@@@1@42@@danf@17-8-2009
10311250@unknown@formal@none@1@S@For more information, see [[SCO-Linux controversies]] and [[SCO v. IBM]].@@@@1@10@@danf@17-8-2009
10311260@unknown@formal@none@1@S@In April 2004 the [[netfilter/iptables]] project was granted a preliminary [[injunction]] against Sitecom Germany by [[Munich]] District Court after Sitecom refused to desist from distributing Netfilter's GPL'ed software in violation of the terms of the GPL.@@@@1@36@@danf@17-8-2009
10311270@unknown@formal@none@1@S@In July 2004, the German court confirmed this injunction as a final ruling against Sitecom.@@@@1@15@@danf@17-8-2009
10311280@unknown@formal@none@1@S@The court's justification for its decision exactly mirrored the predictions given earlier by the FSF's [[Eben Moglen]]:@@@@1@17@@danf@17-8-2009
10311290@unknown@formal@none@1@S@: ''Defendant has infringed on the copyright of plaintiff by offering the software 'netfilter/iptables' for download and by advertising its distribution, without adhering to the license conditions of the GPL.@@@@1@30@@danf@17-8-2009
10311300@unknown@formal@none@1@S@Said actions would only be permissible if defendant had a license grant...@@@@1@12@@danf@17-8-2009
10311310@unknown@formal@none@1@S@This is independent of the questions whether the licensing conditions of the GPL have been effectively agreed upon between plaintiff and defendant or not.@@@@1@24@@danf@17-8-2009
10311320@unknown@formal@none@1@S@If the GPL were not agreed upon by the parties, defendant would notwithstanding lack the necessary rights to copy, distribute, and make the software 'netfilter/iptables' publicly available.''@@@@1@27@@danf@17-8-2009
10311330@unknown@formal@none@1@S@This ruling was important because it was the first time that a court had confirmed that violating terms of the GPL was an act of copyright violation.@@@@1@27@@danf@17-8-2009
10311340@unknown@formal@none@1@S@However, the case was not as crucial a test for the GPL as some have concluded.@@@@1@16@@danf@17-8-2009
10311350@unknown@formal@none@1@S@In the case, the enforceability of GPL itself was not under attack.@@@@1@12@@danf@17-8-2009
10311360@unknown@formal@none@1@S@Instead, the court was merely attempting to discern if the license itself was in effect.@@@@1@15@@danf@17-8-2009
10311370@unknown@formal@none@1@S@In May of [[2005]], [[Wallace versus International Business Machines et al|Daniel Wallace]] filed suit against the [[Free Software Foundation]] (FSF) in the [[U.S. District Court for the Southern District of Indiana|Southern District of Indiana]], contending that the GPL is an illegal attempt to fix prices at zero.@@@@1@47@@danf@17-8-2009
10311380@unknown@formal@none@1@S@The suit was dismissed in March 2006, on the grounds that Wallace had failed to state a valid anti-trust claim; the court noted that "the GPL encourages, rather than discourages, free competition and the distribution of computer operating systems, the benefits of which directly pass to consumers."@@@@1@47@@danf@17-8-2009
10311390@unknown@formal@none@1@S@Wallace was denied the possibility of further amending his complaint, and was ordered to pay the FSF's legal expenses.@@@@1@19@@danf@17-8-2009
10311400@unknown@formal@none@1@S@On September 8, 2005, Seoul Central District Court ruled that the GPL had no legal relevance to a case dealing with [[trade secret]]s derived from GPL-licensed work.@@@@1@27@@danf@17-8-2009
10311410@unknown@formal@none@1@S@Defendants argued that since it is impossible to maintain a trade secret while being compliant with the GPL and distributing the work, they were not in breach of trade secret.@@@@1@30@@danf@17-8-2009
10311420@unknown@formal@none@1@S@This argument was considered without ground.@@@@1@6@@danf@17-8-2009
10311430@unknown@formal@none@1@S@On September 6, 2006, the [[gpl-violations.org]] project prevailed in court litigation against D-Link Germany GmbH regarding D-Link's inappropriate and copyright infringing use of parts of the Linux Operating System Kernel.@@@@1@30@@danf@17-8-2009
10311440@unknown@formal@none@1@S@The judgment finally provided the on-record, legal precedent that the GPL is valid and legally binding, and that it will stand up in German court.@@@@1@25@@danf@17-8-2009
10311450@unknown@formal@none@1@S@In late 2007, the developers of [[BusyBox]] and the [[Software Freedom Law Center]] embarked upon a program to gain GPL compliance from distributors of BusyBox in [[embedded system]]s, suing those who would not comply.@@@@1@34@@danf@17-8-2009
10311460@unknown@formal@none@1@S@These were claimed to be the first US uses of courts for enforcement of GPL obligations.@@@@1@16@@danf@17-8-2009
10311470@unknown@formal@none@1@S@''See'' [[BusyBox#GPL lawsuits]].@@@@1@3@@danf@17-8-2009
10311480@unknown@formal@none@1@S@== Compatibility and multi-licensing==@@@@1@4@@danf@17-8-2009
10311490@unknown@formal@none@1@S@Many of the most common free software licenses, such as the original [[MIT License|MIT/X license]], the [[BSD license]] (in its current 3-clause form), and the [[GNU Lesser General Public License|LGPL]], are "GPL-[[License compatibility|compatible]]".@@@@1@33@@danf@17-8-2009
10311500@unknown@formal@none@1@S@That is, their code can be combined with a program under the GPL without conflict (the new combination would have the GPL applied to the whole).@@@@1@26@@danf@17-8-2009
10311510@unknown@formal@none@1@S@However, some free/open source software licenses are not GPL-compatible.@@@@1@9@@danf@17-8-2009
10311520@unknown@formal@none@1@S@Many GPL proponents have strongly advocated that free/open source software developers use only GPL-compatible licenses, because doing otherwise makes it difficult to reuse software in larger wholes.@@@@1@27@@danf@17-8-2009
10311530@unknown@formal@none@1@S@Note that this issue only arises in concurrent use of licenses which impose conditions on their manner of combination.@@@@1@19@@danf@17-8-2009
10311540@unknown@formal@none@1@S@Some licenses, such as the BSD license, impose no conditions on the manner of their combination.@@@@1@16@@danf@17-8-2009
10311550@unknown@formal@none@1@S@Also see the [[list of FSF approved software licenses]] for examples of compatible and incompatible licenses.@@@@1@16@@danf@17-8-2009
10311560@unknown@formal@none@1@S@A number of businesses use [[dual-licensing]] to distribute a GPL version and sell a [[proprietary software|proprietary]] license to companies wishing to combine the package with proprietary code, using dynamic linking or not.@@@@1@32@@danf@17-8-2009
10311570@unknown@formal@none@1@S@Examples of such companies include [[MySQL AB]], [[Trolltech]] ([[Qt (toolkit)|Qt toolkit]]), [[Namesys]] ([[ReiserFS]]) and [[Red Hat]] ([[Cygwin]]).@@@@1@17@@danf@17-8-2009
10311580@unknown@formal@none@1@S@== Adoption ==@@@@1@3@@danf@17-8-2009
10311590@unknown@formal@none@1@S@The Open Source License Resource Center maintained by [[Black Duck Software]] shows that GPL is the license used in about 70% of all open source software.@@@@1@26@@danf@17-8-2009
10311600@unknown@formal@none@1@S@The vast majority of projects are released under GPL 2 with 3000 open source projects having migrated to GPL 3.@@@@1@20@@danf@17-8-2009
10311610@unknown@formal@none@1@S@==Criticism==@@@@1@1@@danf@17-8-2009
10311620@unknown@formal@none@1@S@In [[2001]] [[Microsoft]] [[CEO]] [[Steve Ballmer]] referred to Linux as "a cancer that attaches itself in an intellectual property sense to everything it touches."@@@@1@24@@danf@17-8-2009
10311630@unknown@formal@none@1@S@Critics of Microsoft claim that the real reason Microsoft dislikes the GPL is that the GPL resists proprietary vendors' attempts to "[[embrace, extend and extinguish]]".@@@@1@25@@danf@17-8-2009
10311640@unknown@formal@none@1@S@Microsoft has released [[Microsoft Windows Services for UNIX]] which contains GPL-licensed code.@@@@1@12@@danf@17-8-2009
10311650@unknown@formal@none@1@S@In response to Microsoft's attacks on the GPL, several prominent Free Software developers and advocates released a joint statement supporting the license.@@@@1@22@@danf@17-8-2009
10311660@unknown@formal@none@1@S@The GPL has been described as being [[Copyleft#Is copyleft .22viral.22.3F|"viral"]] by many of its critics because the GPL only allows conveyance of whole programs, which means that programmers are not allowed to convey programs that [[GPL linking exception|link]] to libraries having GPL-incompatible licenses.@@@@1@43@@danf@17-8-2009
10311670@unknown@formal@none@1@S@The so-called "viral" effect of this is that under such circumstances disparately licensed software cannot be combined unless one of the licenses is changed.@@@@1@24@@danf@17-8-2009
10311680@unknown@formal@none@1@S@Although theoretically either license could be changed, in the "viral" scenario the GPL cannot be practically changed (because the software may have so many contributors, some of whom will likely refuse), whereas the license of the other software ''can'' be practically changed.@@@@1@42@@danf@17-8-2009
10311690@unknown@formal@none@1@S@This is part of a [[BSD and GPL licensing|philosophical difference]] between the GPL and permissive free software licenses such as the [[BSD licenses|BSD-style licenses]], which do not put such a requirement on modified versions.@@@@1@34@@danf@17-8-2009
10311700@unknown@formal@none@1@S@While proponents of the GPL believe that free software should ensure that its freedoms are preserved all the way from the developer to the user, others believe that intermediaries between the developer and the user should be free to redistribute the software as non-free software.@@@@1@45@@danf@17-8-2009
10311710@unknown@formal@none@1@S@More specifically, the GPL requires that redistribution occur subject to the GPL, whereas more "permissive" licenses allow redistribution to occur under licenses more restrictive than the original license.@@@@1@28@@danf@17-8-2009
10311720@unknown@formal@none@1@S@While the GPL does allow commercial distribution of GPL software, the market price will settle near the price of distribution—near zero—since the purchasers may redistribute the software and its source code for their cost of redistribution.@@@@1@36@@danf@17-8-2009
10311730@unknown@formal@none@1@S@This could be seen to inhibit commercial use of GPL'ed code by others wishing to use that code for proprietary purposes—if they don't wish to avail themselves of GPL'ed code, they will have to re-implement it themselves.@@@@1@37@@danf@17-8-2009
10311740@unknown@formal@none@1@S@Microsoft has included anti-GPL terms in their open source software.@@@@1@10@@danf@17-8-2009
10311750@unknown@formal@none@1@S@In addition, the [[FreeBSD]] project has stated that "a less publicized and unintended use of the GPL is that it is very favorable to large companies that want to undercut software companies.@@@@1@32@@danf@17-8-2009
10311760@unknown@formal@none@1@S@In other words, the GPL is well suited for use as a marketing weapon, potentially reducing overall economic benefit and contributing to monopolistic behavior".@@@@1@24@@danf@17-8-2009
10311770@unknown@formal@none@1@S@It's not clear that there are any cases of this happening in practice, however.@@@@1@14@@danf@17-8-2009
10311780@unknown@formal@none@1@S@The GPL has no [[Indemnity|indemnification]] clause explicitly protecting maintainers and developers from litigation resulting from unscrupulous contribution.@@@@1@17@@danf@17-8-2009
10311790@unknown@formal@none@1@S@(If a developer submits existing patented or copyrighted work to a GPL project claiming it as their own contribution, all the project maintainers and even other developers can be held legally responsible for damages to the copyright or patent holder.)@@@@1@40@@danf@17-8-2009
10311800@unknown@formal@none@1@S@Lack of indemnification is one criticism that led Mozilla to create the [[Mozilla Public License]] rather than use the GPL or LGPL.@@@@1@22@@danf@17-8-2009
10311810@unknown@formal@none@1@S@However, Mozilla later relicensed their work under a GPL/LGPL/MPL triple license, due to problems with the GPL-incompatibility of the MPL.@@@@1@20@@danf@17-8-2009
10311820@unknown@formal@none@1@S@Some software developers have found the extensive scope of the GPL to be too restrictive.@@@@1@15@@danf@17-8-2009
10311830@unknown@formal@none@1@S@For example, Bjørn Reese and Daniel Stenberg describe how the downstream effects of the GPL on later developers create a "quodque pro quo" (Latin, "Everything in return for something").@@@@1@29@@danf@17-8-2009
10311840@unknown@formal@none@1@S@For that reason, in 2001 they abandoned the GPLv2 in favor of less restrictive copyleft licenses.@@@@1@16@@danf@17-8-2009
10311850@unknown@formal@none@1@S@A more specific example of the downstream effects of the GPL can be observed through the frame of incompatible licenses.@@@@1@20@@danf@17-8-2009
10311860@unknown@formal@none@1@S@Sun Microsystems' ZFS, because it is licensed under the GPL-incompatible CDDL and covered by several Sun patents, cannot link to the GPL-licensed Linux kernel.@@@@1@24@@danf@17-8-2009
10311870@unknown@formal@none@1@S@Some have also argued that the GPL could, and should, be shorter.@@@@1@12@@danf@17-8-2009
10320010@unknown@formal@none@1@S@Google@@@@1@1@@danf@17-8-2009
10320020@unknown@formal@none@1@S@'''Google Inc.''' is an [[United States|American]] [[public company|public corporation]], earning revenue from [[AdWords|advertising]] related to its [[Google search|Internet search]], [[Gmail|web-based e-mail]], [[Google Maps|online mapping]], [[Google Apps|office productivity]], [[Orkut|social networking]], and [[YouTube|video sharing]] services as well as selling advertising-free versions of the [[Google Search Appliance|same technologies]].@@@@1@45@@danf@17-8-2009
10320030@unknown@formal@none@1@S@Google's headquarters, the [[Googleplex]], is located in [[Mountain View, California]].@@@@1@10@@danf@17-8-2009
10320040@unknown@formal@none@1@S@As of [[June 30]] [[2008]] the company has 19,604 full-time employees.@@@@1@11@@danf@17-8-2009
10320050@unknown@formal@none@1@S@As of [[October 31]], [[2007]], it is the largest American company (by [[market capitalization]]) that is not part of the [[Dow Jones Industrial Average]].@@@@1@24@@danf@17-8-2009
10320060@unknown@formal@none@1@S@Google was co-founded by [[Larry Page]] and [[Sergey Brin]] while they were students at [[Stanford University]] and the company was first incorporated as a [[privately held company]] on [[September 7]], [[1998]].@@@@1@31@@danf@17-8-2009
10320070@unknown@formal@none@1@S@Google's [[initial public offering]] took place on [[August 19]], [[2004]], raising [[United States dollar|US$]]1.67 billion, making it worth US$23 billion.@@@@1@20@@danf@17-8-2009
10320080@unknown@formal@none@1@S@Google has continued its growth through a series of new product developments, [[List of Google acquisitions|acquisitions]], and [[Google#Partnerships|partnerships]].@@@@1@18@@danf@17-8-2009
10320090@unknown@formal@none@1@S@[[Google#Environmentalism|Environmentalism]], [[Google.org|philanthropy]], and [[Google#Corporate affairs and culture|positive employee relations]] have been important tenets during Google's growth, the latter resulting in being identified multiple times as [[Fortune Magazine|Fortune Magazine's]] #1 Best Place to Work.@@@@1@33@@danf@17-8-2009
10320100@unknown@formal@none@1@S@The company's unofficial slogan is "[[Don't be evil]]", although [[criticism of Google]] includes concerns regarding the [[privacy]] of personal information, [[copyright]], [[censorship by Google|censorship]], and discontinuation of services.@@@@1@28@@danf@17-8-2009
10320110@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10320120@unknown@formal@none@1@S@Google began in January 1996, as a research project by [[Larry Page]], who was soon joined by [[Sergey Brin]], two [[Doctor of Philosophy|Ph.D.]] students at [[Stanford University]] in [[California]].@@@@1@29@@danf@17-8-2009
10320130@unknown@formal@none@1@S@They hypothesized that a search engine that analyzed the relationships between websites would produce better ranking of results than existing techniques, which ranked results according to the number of times the search term appeared on a page.@@@@1@37@@danf@17-8-2009
10320140@unknown@formal@none@1@S@Their search engine was originally nicknamed "BackRub" because the system checked [[backlinks]] to estimate a site's importance.@@@@1@17@@danf@17-8-2009
10320150@unknown@formal@none@1@S@A small search engine called Rankdex was already exploring a similar strategy.@@@@1@12@@danf@17-8-2009
10320160@unknown@formal@none@1@S@Convinced that the pages with the most links to them from other highly relevant web pages must be the most relevant pages associated with the search, Page and Brin tested their thesis as part of their studies, and laid the foundation for their search engine.@@@@1@45@@danf@17-8-2009
10320170@unknown@formal@none@1@S@Originally, the search engine used the [[Stanford University]] website with the domain ''google.stanford.edu''.@@@@1@13@@danf@17-8-2009
10320180@unknown@formal@none@1@S@The domain ''google.com'' was registered on [[September 15]], [[1997]], and the company was incorporated as ''Google Inc.'' on [[September 7]], [[1998]] at a friend's garage in [[Menlo Park, California]].@@@@1@29@@danf@17-8-2009
10320190@unknown@formal@none@1@S@The total initial investment raised for the new company amounted to almost US$1.1 million, including a US$100,000 check by [[Andy Bechtolsheim]], one of the founders of [[Sun Microsystems]].@@@@1@28@@danf@17-8-2009
10320200@unknown@formal@none@1@S@In March 1999, the company moved into offices in [[Palo Alto, California|Palo Alto]], home to several other noted [[Silicon Valley]] technology startups.@@@@1@22@@danf@17-8-2009
10320210@unknown@formal@none@1@S@After quickly outgrowing two other sites, the company leased a complex of buildings in [[Mountain View, Santa Clara County, California|Mountain View]] at 1600 Amphitheatre Parkway from [[Silicon Graphics]] (SGI) in 2003.@@@@1@31@@danf@17-8-2009
10320220@unknown@formal@none@1@S@The company has remained at this location ever since, and the complex has since come to be known as the [[Googleplex]] (a play on the word [[googolplex]]).@@@@1@27@@danf@17-8-2009
10320230@unknown@formal@none@1@S@In 2006, Google bought the property from SGI for US$319 million.@@@@1@11@@danf@17-8-2009
10320240@unknown@formal@none@1@S@The Google search engine attracted a loyal following among the growing number of Internet users, who liked its simple design and usability.@@@@1@22@@danf@17-8-2009
10320250@unknown@formal@none@1@S@In 2000, Google began selling [[advertising|advertisements]] associated with search [[keyword (internet search)|keywords]].@@@@1@12@@danf@17-8-2009
10320260@unknown@formal@none@1@S@The ads were text-based to maintain an uncluttered page design and to maximize page loading speed.@@@@1@16@@danf@17-8-2009
10320270@unknown@formal@none@1@S@Keywords were sold based on a combination of price bid and clickthroughs, with bidding starting at US$.05 per click.@@@@1@19@@danf@17-8-2009
10320280@unknown@formal@none@1@S@This model of selling keyword advertising was pioneered by [[Yahoo! Search Marketing|Goto.com]] (later renamed Overture Services, before being acquired by [[Yahoo!]] and rebranded as [[Yahoo! Search Marketing]]).@@@@1@27@@danf@17-8-2009
10320290@unknown@formal@none@1@S@While many of its [[dot-com]] rivals failed in the new Internet marketplace, Google quietly rose in stature while generating revenue.@@@@1@20@@danf@17-8-2009
10320300@unknown@formal@none@1@S@The name "Google" originated from a common misspelling of the word "[[googol]]", which refers to 10<sup>100</sup>, the number represented by a 1 followed by one hundred zeros.@@@@1@27@@danf@17-8-2009
10320310@unknown@formal@none@1@S@Having found its way increasingly into everyday language, the verb "[[google (verb)|google]]", was added to the ''[[Merriam-Webster|Merriam Webster Collegiate Dictionary]]'' and the ''[[Oxford English Dictionary]]'' in 2006, meaning "to use the Google search engine to obtain information on the Internet."@@@@1@40@@danf@17-8-2009
10320320@unknown@formal@none@1@S@A [[patent]] describing part of Google's ranking mechanism ([[PageRank]]) was granted on [[September 4]], [[2001]].@@@@1@15@@danf@17-8-2009
10320330@unknown@formal@none@1@S@The patent was officially assigned to Stanford University and lists Lawrence Page as the inventor.@@@@1@15@@danf@17-8-2009
10320340@unknown@formal@none@1@S@===Financing and initial public offering===@@@@1@5@@danf@17-8-2009
10320350@unknown@formal@none@1@S@The first funding for Google as a company was secured in 1998, in the form of a US$100,000 contribution from [[Andy Bechtolsheim]], co-founder of [[Sun Microsystems]], given to a corporation which did not yet exist.@@@@1@35@@danf@17-8-2009
10320360@unknown@formal@none@1@S@Around six months later, a much larger round of funding was announced, with the major investors being rival venture capital firms [[Kleiner Perkins Caufield & Byers]] and [[Sequoia Capital]].@@@@1@29@@danf@17-8-2009
10320370@unknown@formal@none@1@S@Google's [[IPO]] took place on [[August 19]], [[2004]].@@@@1@8@@danf@17-8-2009
10320380@unknown@formal@none@1@S@19,605,052 [[stock|shares]] were offered at a price of US$85 per share.@@@@1@11@@danf@17-8-2009
10320390@unknown@formal@none@1@S@Of that, 14,142,135 (another mathematical reference as [[square root of two|√2]] ≈ 1.4142135) were floated by Google, and the remaining 5,462,917 were offered by existing stockholders.@@@@1@26@@danf@17-8-2009
10320400@unknown@formal@none@1@S@The sale of US$1.67 billion gave Google a [[market capitalization]] of more than US$23 billion.@@@@1@15@@danf@17-8-2009
10320410@unknown@formal@none@1@S@The vast majority of Google's 271 million shares remained under Google's control.@@@@1@12@@danf@17-8-2009
10320420@unknown@formal@none@1@S@Many of Google's employees became instant [[paper millionaires]].@@@@1@8@@danf@17-8-2009
10320430@unknown@formal@none@1@S@[[Yahoo!]], a competitor of Google, also benefited from the IPO because it owned 8.4 million shares of Google as of [[August 9]], [[2004]], ten days before the IPO.@@@@1@28@@danf@17-8-2009
10320440@unknown@formal@none@1@S@Google's stock performance after its first IPO launch has gone well, with shares hitting US$700 for the first time on [[October 31]], [[2007]], due to strong sales and earnings in the advertising market, as well as the release of new features such as the [[Google Desktop|desktop search function]] and its iGoogle personalized home page.@@@@1@54@@danf@17-8-2009
10320450@unknown@formal@none@1@S@The surge in stock price is fueled primarily by individual investors, as opposed to large institutional investors and [[mutual fund]]s.@@@@1@20@@danf@17-8-2009
10320460@unknown@formal@none@1@S@The company is listed on the [[NASDAQ]] stock exchange under the [[ticker]] symbol '''GOOG''' and on the [[London Stock Exchange]] under the ticker symbol '''GGEA'''.@@@@1@25@@danf@17-8-2009
10320470@unknown@formal@none@1@S@===Growth===@@@@1@1@@danf@17-8-2009
10320480@unknown@formal@none@1@S@While the company's primary business interest is in the web content arena, Google has begun experimenting with other markets, such as [[radio]] and print publications.@@@@1@25@@danf@17-8-2009
10320490@unknown@formal@none@1@S@On [[January 17]], [[2006]], Google announced its purchase of a radio advertising company, "dMarc", which provides an automated system that allows companies to advertise on the radio.@@@@1@27@@danf@17-8-2009
10320500@unknown@formal@none@1@S@This will allow Google to combine two niche advertising media—the Internet and radio—with Google's ability to laser-focus on the tastes of consumers.@@@@1@22@@danf@17-8-2009
10320510@unknown@formal@none@1@S@Google has also begun an experiment in selling advertisements from its advertisers in offline newspapers and magazines, with select advertisements in the [[Chicago Sun-Times]].@@@@1@24@@danf@17-8-2009
10320520@unknown@formal@none@1@S@They have been filling unsold space in the newspaper that would have normally been used for in-house advertisements.@@@@1@18@@danf@17-8-2009
10320530@unknown@formal@none@1@S@Google was added to the [[S&P 500 index]] on [[March 30]], [[2006]].@@@@1@12@@danf@17-8-2009
10320540@unknown@formal@none@1@S@It replaced [[Burlington Resources]], a major oil producer based in [[Houston]] which was acquired by [[ConocoPhillips]].@@@@1@16@@danf@17-8-2009
10320550@unknown@formal@none@1@S@===Acquisitions===@@@@1@1@@danf@17-8-2009
10320560@unknown@formal@none@1@S@Since 2001, Google has acquired several small start-up companies, often consisting of innovative teams and products.@@@@1@16@@danf@17-8-2009
10320570@unknown@formal@none@1@S@One of the earlier companies that Google bought was [[Pyra Labs]].@@@@1@11@@danf@17-8-2009
10320580@unknown@formal@none@1@S@They were the creators of [[Blogger (service)|Blogger]], a weblog publishing platform, first launched in 1999.@@@@1@15@@danf@17-8-2009
10320590@unknown@formal@none@1@S@This acquisition led to many premium features becoming free.@@@@1@9@@danf@17-8-2009
10320600@unknown@formal@none@1@S@Pyra Labs was originally formed by [[Evan Williams (blogger)|Evan Williams]], yet he left Google in 2004.@@@@1@16@@danf@17-8-2009
10320610@unknown@formal@none@1@S@In early 2006, Google acquired Upstartle, a company responsible for the online word processor, [[Writely]].@@@@1@15@@danf@17-8-2009
10320620@unknown@formal@none@1@S@The technology in this product was used by Google to eventually create [[Google Docs & Spreadsheets]].@@@@1@16@@danf@17-8-2009
10320630@unknown@formal@none@1@S@In 2004, Google acquired a company called [[Keyhole, Inc.]], which developed a product called ''Earth Viewer'' that was renamed [[Google Earth]] in 2005.@@@@1@23@@danf@17-8-2009
10320640@unknown@formal@none@1@S@In February 2006, software company Adaptive Path sold Measure Map, a [[weblog]] statistics application, to Google.@@@@1@16@@danf@17-8-2009
10320650@unknown@formal@none@1@S@Registration to the service has since been temporarily disabled.@@@@1@9@@danf@17-8-2009
10320660@unknown@formal@none@1@S@The last update regarding the future of Measure Map was made on [[April 6]], [[2006]] and outlined many of the service's known issues.@@@@1@23@@danf@17-8-2009
10320670@unknown@formal@none@1@S@In late 2006, Google bought online video site [[YouTube]] for US$1.65 billion in stock.@@@@1@14@@danf@17-8-2009
10320680@unknown@formal@none@1@S@Shortly after, on [[October 31]], [[2006]], Google announced that it had also acquired [[JotSpot]], a developer of wiki technology for collaborative Web sites.@@@@1@23@@danf@17-8-2009
10320690@unknown@formal@none@1@S@On [[April 13]], [[2007]], Google reached an agreement to acquire [[DoubleClick]].@@@@1@11@@danf@17-8-2009
10320700@unknown@formal@none@1@S@Google agreed to buy the company for US$3.1 billion.@@@@1@9@@danf@17-8-2009
10320710@unknown@formal@none@1@S@On [[July 9]], [[2007]], Google announced that it had signed a definitive agreement to acquire enterprise messaging security and compliance company [[Postini]].@@@@1@22@@danf@17-8-2009
10320720@unknown@formal@none@1@S@===Partnerships===@@@@1@1@@danf@17-8-2009
10320730@unknown@formal@none@1@S@In 2005, Google entered into partnerships with other companies and government agencies to improve production and services.@@@@1@17@@danf@17-8-2009
10320740@unknown@formal@none@1@S@Google announced a partnership with [[NASA Ames Research Center]] to build offices and work on research projects involving large-scale data management, [[nanotechnology]], [[distributed computing]], and the entrepreneurial space industry.@@@@1@29@@danf@17-8-2009
10320750@unknown@formal@none@1@S@Google also entered into a partnership with [[Sun Microsystems]] in October to help share and distribute each other's technologies.@@@@1@19@@danf@17-8-2009
10320760@unknown@formal@none@1@S@The company entered into a partnership with [[Time Warner]]'s [[AOL]], to enhance each other's video search services.@@@@1@17@@danf@17-8-2009
10320770@unknown@formal@none@1@S@The same year, the company became a major financial investor in the new [[.mobi]] [[top-level domain]] for mobile devices, in conjunction with several other companies, including [[Microsoft]], [[Nokia]], and [[Ericsson]].@@@@1@30@@danf@17-8-2009
10320780@unknown@formal@none@1@S@In September 2007, Google launched "AdSense for Mobile", a service for its publishing partners which provides the ability to monetize their mobile websites through the targeted placement of mobile text ads, and acquired the mobile social networking site, ''Zingku.mobi'', to "provide people worldwide with direct access to Google applications, and ultimately the information they want and need, right from their mobile devices."@@@@1@62@@danf@17-8-2009
10320790@unknown@formal@none@1@S@In 2006, Google and [[News Corporation|News Corp.]]'s Fox Interactive Media entered into a US$900 million agreement to provide search and advertising on the popular social networking site, [[MySpace]].@@@@1@28@@danf@17-8-2009
10320800@unknown@formal@none@1@S@On November 5, 2007, Google announced the [[Open Handset Alliance]] to develop an open platform for mobile services called [[Google Android|Android]].@@@@1@21@@danf@17-8-2009
10320810@unknown@formal@none@1@S@In March 2008, Google, [[Sprint]], [[Intel]], [[Comcast]], [[Time Warner Cable]], [[Bright House Networks]], and [[Clearwire]] together founded [[Xohm]] to provide wireless [[telecommunication]] service.@@@@1@23@@danf@17-8-2009
10320820@unknown@formal@none@1@S@==Products and services==@@@@1@3@@danf@17-8-2009
10320830@unknown@formal@none@1@S@Google has created services and tools for the general public and business environment alike, including Web applications, advertising networks and solutions for businesses.@@@@1@23@@danf@17-8-2009
10320840@unknown@formal@none@1@S@===Advertising===@@@@1@1@@danf@17-8-2009
10320850@unknown@formal@none@1@S@Most of Google's revenue is derived from advertising programs.@@@@1@9@@danf@17-8-2009
10320860@unknown@formal@none@1@S@For the 2006 fiscal year, the company reported US$10.492 billion in total advertising revenues and only US$112 million in licensing and other revenues.@@@@1@23@@danf@17-8-2009
10320870@unknown@formal@none@1@S@Google [[AdWords]] allows Web advertisers to display advertisements in Google's search results and the Google Content Network, through either a cost-per-click or cost-per-view scheme.@@@@1@24@@danf@17-8-2009
10320880@unknown@formal@none@1@S@Through Google [[AdSense]], website owners can also display adverts on their own sites and earn money every time ads are clicked.@@@@1@21@@danf@17-8-2009
10320890@unknown@formal@none@1@S@===Web-based software===@@@@1@2@@danf@17-8-2009
10320900@unknown@formal@none@1@S@The [[Google search|Google web search engine]] is the company's most popular service.@@@@1@12@@danf@17-8-2009
10320910@unknown@formal@none@1@S@As of August 2007, Google is the most used [[search engine]] on the web with a 53.6% market share, ahead of [[Yahoo!]] (19.9%) and [[Live Search]] (12.9%).@@@@1@27@@danf@17-8-2009
10320920@unknown@formal@none@1@S@Google indexes billions of Web pages, so that users can search for the information they desire, through the use of [[keyword (Internet search)|keywords]] and [[operators]].@@@@1@25@@danf@17-8-2009
10320930@unknown@formal@none@1@S@Google has also applied its Web Search technology to other search services, including Image Search, [[Google News]], the price comparison site [[Google Product Search]], the interactive [[Usenet]] archive [[Google Groups]], [[Google Maps]], and more.@@@@1@34@@danf@17-8-2009
10320940@unknown@formal@none@1@S@In 2004, Google launched its own free web-based e-mail service, known as [[Gmail]] (or Google Mail in some jurisdictions).@@@@1@19@@danf@17-8-2009
10320950@unknown@formal@none@1@S@Gmail features [[e-mail filtering|spam-filtering technology]] and the capability to use Google technology to search e-mail.@@@@1@15@@danf@17-8-2009
10320960@unknown@formal@none@1@S@The service generates revenue by displaying advertisements and links from the [[AdWords]] service that are tailored to the choice of the user and/or content of the e-mail messages displayed on screen.@@@@1@31@@danf@17-8-2009
10320970@unknown@formal@none@1@S@In early 2006, the company launched [[Google Video]], which not only allows users to search and view freely available videos but also offers users and media publishers the ability to publish their content, including television shows on [[CBS]], [[NBA]] basketball games, and music videos.@@@@1@44@@danf@17-8-2009
10320980@unknown@formal@none@1@S@In August 2007, Google announced that it would shut down its video rental and sale program and offer refunds and [[Google Checkout]] credits to consumers who had purchased videos to own.@@@@1@31@@danf@17-8-2009
10320990@unknown@formal@none@1@S@On [[February 28]], [[2008]] Google launched the [[Google Sites]] [[wiki]] as a [[Google Apps]] component.@@@@1@15@@danf@17-8-2009
10321000@unknown@formal@none@1@S@Google has also developed several desktop applications, including [[Google Earth]], an interactive mapping program powered by satellite and aerial imagery that covers the vast majority of the planet.@@@@1@28@@danf@17-8-2009
10321010@unknown@formal@none@1@S@Google Earth is generally considered to be remarkably accurate and extremely detailed.@@@@1@12@@danf@17-8-2009
10321020@unknown@formal@none@1@S@Many major cities have such detailed images that one can zoom in close enough to see vehicles and pedestrians clearly.@@@@1@20@@danf@17-8-2009
10321030@unknown@formal@none@1@S@Consequently, there have been some concerns about national security implications.@@@@1@10@@danf@17-8-2009
10321040@unknown@formal@none@1@S@Specifically, some countries and militaries contend the software can be used to pinpoint with near-precision accuracy the physical location of critical infrastructure, commercial and residential buildings, bases, government agencies, and so on.@@@@1@32@@danf@17-8-2009
10321050@unknown@formal@none@1@S@However, the satellite images are not necessarily frequently updated, and all of them are available at no charge through other products and even government sources.@@@@1@25@@danf@17-8-2009
10321060@unknown@formal@none@1@S@Examples include [[NASA]] and the [[NGA|National Geospatial-Intelligence Agency]].@@@@1@8@@danf@17-8-2009
10321070@unknown@formal@none@1@S@Some counter this argument by stating that Google Earth makes it easier to access and research the images.@@@@1@18@@danf@17-8-2009
10321080@unknown@formal@none@1@S@Many other products are available through [[Google Labs]], which is a collection of incomplete applications that are still being tested for use by the general public.@@@@1@26@@danf@17-8-2009
10321090@unknown@formal@none@1@S@Google has promoted its products in various ways.@@@@1@8@@danf@17-8-2009
10321100@unknown@formal@none@1@S@In [[London]], ''Google Space'' was set up at [[Heathrow Airport]], showcasing several products, including Gmail, Google Earth and Picasa.@@@@1@19@@danf@17-8-2009
10321110@unknown@formal@none@1@S@Also, a similar page was launched for American college students, under the name ''College Life, Powered by Google.''@@@@1@18@@danf@17-8-2009
10321120@unknown@formal@none@1@S@In 2007, some reports surfaced that Google was planning the release of its own mobile phone, possibly a competitor to [[Apple Inc.|Apple]]'s [[iPhone]].@@@@1@23@@danf@17-8-2009
10321130@unknown@formal@none@1@S@The project, called [[Android (mobile phone platform)|Android]], provides a standard development kit that will allow any "Android" phone to run software developed for the Android SDK, no matter the phone manufacturer.@@@@1@31@@danf@17-8-2009
10321140@unknown@formal@none@1@S@In October 2007, Google SMS service was launched in [[India]], allowing users to get business listings, movie showtimes, and information by sending an [[SMS]].@@@@1@24@@danf@17-8-2009
10321150@unknown@formal@none@1@S@===Enterprise products===@@@@1@2@@danf@17-8-2009
10321160@unknown@formal@none@1@S@In 2007, Google launched [[Google Apps|Google Apps Premier Edition]], a version of Google Apps targeted primarily at the business user.@@@@1@20@@danf@17-8-2009
10321170@unknown@formal@none@1@S@It includes such extras as more disk space for e-mail, API access, and premium support, for a price of US$50 per user per year.@@@@1@24@@danf@17-8-2009
10321180@unknown@formal@none@1@S@A large implementation of Google Apps with 38,000 users is at [[Lakehead University]] in [[Thunder Bay, Ontario|Thunder Bay]], Ontario, Canada.@@@@1@20@@danf@17-8-2009
10321190@unknown@formal@none@1@S@==Platform==@@@@1@1@@danf@17-8-2009
10321200@unknown@formal@none@1@S@Google runs its services on several [[server farm]]s, each comprising thousands of low-cost commodity computers running stripped-down versions of [[Linux]].@@@@1@20@@danf@17-8-2009
10321210@unknown@formal@none@1@S@While the company divulges no details of its hardware, a 2006 estimate cites 450,000 servers, "racked up in clusters at data centers around the world."@@@@1@25@@danf@17-8-2009
10321220@unknown@formal@none@1@S@==Corporate affairs and culture==@@@@1@4@@danf@17-8-2009
10321230@unknown@formal@none@1@S@Google is known for its relaxed corporate culture, of which its playful variations on [[Google logo#History of the Google Doodle|its own corporate logo]] are an indicator.@@@@1@26@@danf@17-8-2009
10321240@unknown@formal@none@1@S@In 2007 and 2008, ''[[Fortune Magazine]]'' placed Google at the top of its list of the hundred best places to work.@@@@1@21@@danf@17-8-2009
10321250@unknown@formal@none@1@S@Google's corporate philosophy embodies such casual principles as "you can make money without doing evil," "you can be serious without a suit," and "work should be challenging and the challenge should be fun."@@@@1@33@@danf@17-8-2009
10321260@unknown@formal@none@1@S@Google has been criticized for having salaries below industry standards.@@@@1@10@@danf@17-8-2009
10321270@unknown@formal@none@1@S@For example, some [[system administrator]]s earn no more than US$35,000 per year – considered to be quite low for the [[San Francisco Bay Area|Bay Area]] job market.@@@@1@27@@danf@17-8-2009
10321280@unknown@formal@none@1@S@However, Google's stock performance following its [[Initial public offering|IPO]] has enabled many early employees to be competitively compensated by participation in the corporation's remarkable equity growth.@@@@1@26@@danf@17-8-2009
10321290@unknown@formal@none@1@S@Google implemented other employee incentives in 2005, such as the [[Google Founders' Award]], in addition to offering higher salaries to new employees.@@@@1@22@@danf@17-8-2009
10321300@unknown@formal@none@1@S@Google's workplace amenities, culture, global popularity, and strong brand recognition have also attracted potential applicants.@@@@1@15@@danf@17-8-2009
10321310@unknown@formal@none@1@S@After the company's [[IPO]] in August 2004, it was reported that founders [[Sergey Brin]] and [[Larry Page]], and CEO [[Eric E. Schmidt|Eric Schmidt]], requested that their base salary be cut to US$1.00.@@@@1@32@@danf@17-8-2009
10321320@unknown@formal@none@1@S@Subsequent offers by the company to increase their salaries have been turned down, primarily because, "their primary compensation continues to come from returns on their ownership stakes in Google.@@@@1@29@@danf@17-8-2009
10321330@unknown@formal@none@1@S@As significant stockholders, their personal wealth is tied directly to sustained stock price appreciation and performance, which provides direct alignment with stockholder interests."@@@@1@23@@danf@17-8-2009
10321340@unknown@formal@none@1@S@Prior to 2004, Schmidt was making US$250,000 per year, and Page and Brin each earned a salary of US$150,000.@@@@1@19@@danf@17-8-2009
10321350@unknown@formal@none@1@S@They have all declined recent offers of bonuses and increases in compensation by Google's board of directors.@@@@1@17@@danf@17-8-2009
10321360@unknown@formal@none@1@S@In a 2007 report of the United States' richest people, [[Forbes]] reported that [[Sergey Brin]] and [[Larry Page]] were tied for #5 with a net worth of US$18.5 billion each.@@@@1@30@@danf@17-8-2009
10321370@unknown@formal@none@1@S@In 2007 and through early 2008, Google saw the departure of several top executives.@@@@1@14@@danf@17-8-2009
10321380@unknown@formal@none@1@S@Justin Rosenstein, Google’s product manager, left in June of 2007.@@@@1@10@@danf@17-8-2009
10321390@unknown@formal@none@1@S@Shortly thereafter, Gideon Yu, former chief financial officer of [[YouTube]], a Google unit, joined [[Facebook]] along with Benjamin Ling, a high-ranking engineer, who left in October 2007.@@@@1@27@@danf@17-8-2009
10321400@unknown@formal@none@1@S@In March 2008, two senior Google leaders announced their desire to pursue other opportunities.@@@@1@14@@danf@17-8-2009
10321410@unknown@formal@none@1@S@Sheryl Sandberg, ex-VP of global online sales and operations, began her position as COO of [[Facebook]], while Ash ElDifrawi, former head of brand advertising, left to become CMO of [[Netshops]] Inc.@@@@1@31@@danf@17-8-2009
10321420@unknown@formal@none@1@S@===Googleplex===@@@@1@1@@danf@17-8-2009
10321430@unknown@formal@none@1@S@Google's headquarters in Mountain View, California, is referred to as "the [[Googleplex]]" in a play on words; a [[googolplex]] being 1 followed by a googol of zeros, and the HQ being a [[complex]] of buildings (cf. [[movie theater|multiplex]], cineplex, etc.).@@@@1@40@@danf@17-8-2009
10321440@unknown@formal@none@1@S@The lobby is decorated with a [[piano]], [[lava lamps]], old server clusters, and a projection of search queries on the wall.@@@@1@21@@danf@17-8-2009
10321450@unknown@formal@none@1@S@The hallways are full of exercise balls and [[bicycle]]s.@@@@1@9@@danf@17-8-2009
10321460@unknown@formal@none@1@S@Each employee has access to the corporate recreation center.@@@@1@9@@danf@17-8-2009
10321470@unknown@formal@none@1@S@Recreational amenities are scattered throughout the campus and include a workout room with weights and rowing machines, locker rooms, washers and dryers, a massage room, assorted [[video game]]s, [[Foosball]], a [[piano|baby grand piano]], a pool table, and [[ping pong]].@@@@1@39@@danf@17-8-2009
10321480@unknown@formal@none@1@S@In addition to the [[Recreation room|rec room]], there are snack rooms stocked with various foods and drinks.@@@@1@17@@danf@17-8-2009
10321490@unknown@formal@none@1@S@In 2006, Google moved into office space in [[New York City]], at 111 [[Eighth Avenue|Eighth Ave.]] in Manhattan.@@@@1@18@@danf@17-8-2009
10321500@unknown@formal@none@1@S@The office was specially designed and built for Google and houses its largest advertising sales team, which has been instrumental in securing large partnerships, most recently deals with [[MySpace]] and [[AOL]].@@@@1@31@@danf@17-8-2009
10321510@unknown@formal@none@1@S@In 2003, they added an engineering staff in New York City, which has been responsible for more than 100 engineering projects, including [[Google Maps]], [[Google Spreadsheet]]s, and others.@@@@1@28@@danf@17-8-2009
10321520@unknown@formal@none@1@S@It is estimated that the building costs Google US$10 million per year to rent and is similar in design and functionality to its [[Mountain View, California|Mountain View]] headquarters, including [[foosball]], [[air hockey]], and ping-pong tables, as well as a video game area.@@@@1@42@@danf@17-8-2009
10321530@unknown@formal@none@1@S@In November 2006, Google opened offices on [[Carnegie Mellon]]'s campus in [[Pittsburgh, Pennsylvania|Pittsburgh]].@@@@1@13@@danf@17-8-2009
10321540@unknown@formal@none@1@S@By late 2006, Google also established a new headquarters for its AdWords division in [[Ann Arbor, Michigan]].@@@@1@17@@danf@17-8-2009
10321550@unknown@formal@none@1@S@The size of Google's search system is presently undisclosed.@@@@1@9@@danf@17-8-2009
10321560@unknown@formal@none@1@S@The best estimates place the total number of the company's servers at 450,000, spread over twenty-five locations throughout the world, including major [[network operations center|operations centers]] in [[Dublin]] (European Operations [[Headquarters]]) and [[Atlanta, Georgia]].@@@@1@34@@danf@17-8-2009
10321570@unknown@formal@none@1@S@Google is also in the process of constructing a major operations center in [[The Dalles, Oregon]], on the banks of the [[Columbia River]].@@@@1@23@@danf@17-8-2009
10321580@unknown@formal@none@1@S@The site, also referred to by the media as ''Project 02'', was chosen due to the availability of inexpensive [[hydroelectric power]] and a large surplus of [[fiber optic]] cable, remnants of the dot-com boom of the late 1990s.@@@@1@38@@danf@17-8-2009
10321590@unknown@formal@none@1@S@The computing center is estimated to be the size of two [[American football|football fields]], and it has created hundreds of construction jobs, causing local real estate prices to increase 40%.@@@@1@30@@danf@17-8-2009
10321600@unknown@formal@none@1@S@Upon completion, the center is expected to create 60 to 200 permanent jobs in the town of 12,000 people.@@@@1@19@@danf@17-8-2009
10321610@unknown@formal@none@1@S@Google is taking steps to ensure that its operations are environmentally sound.@@@@1@12@@danf@17-8-2009
10321620@unknown@formal@none@1@S@In October 2006, the company announced plans to install thousands of [[Photovoltaic module|solar panels]] to provide up to 1.6 [[megawatt]]s of [[electricity]], enough to satisfy approximately 30% of the campus' energy needs.@@@@1@31@@danf@17-8-2009
10321630@unknown@formal@none@1@S@The system will be the largest solar power system constructed on a [[United States|U.S.]] corporate campus and one of the largest on any corporate site in the world.@@@@1@28@@danf@17-8-2009
10321640@unknown@formal@none@1@S@In June 2007, Google announced that it plans to become [[carbon neutral]] by 2008 by investing in energy efficiency and renewable energy sources and by purchasing carbon offsets, such as funding projects that capture and burn [[methane]] from animal waste at Mexican and Brazilian farms.@@@@1@45@@danf@17-8-2009
10321650@unknown@formal@none@1@S@===Innovation time off===@@@@1@3@@danf@17-8-2009
10321660@unknown@formal@none@1@S@As a motivation technique (usually called [[ITO|Innovation Time Off]]), all Google engineers are encouraged to spend 20% of their work time (one day per week) on projects that interest them.@@@@1@30@@danf@17-8-2009
10321670@unknown@formal@none@1@S@Some of Google's newer services, such as [[Gmail]], [[Google News]], [[Orkut]], and [[AdSense]] originated from these independent endeavors.@@@@1@18@@danf@17-8-2009
10321680@unknown@formal@none@1@S@In a talk at [[Stanford University]], [[Marissa Mayer]], Google's Vice President of Search Products and User Experience, stated that her analysis showed that half of the new product launches originated from the 20% time.@@@@1@34@@danf@17-8-2009
10321690@unknown@formal@none@1@S@===Easter eggs and April Fool's Day jokes===@@@@1@7@@danf@17-8-2009
10321700@unknown@formal@none@1@S@Google has a tradition of creating [[April Fool's Day]] jokes—such as [[Google's hoaxes#2000|Google MentalPlex]], which allegedly featured the use of mental power to search the web.@@@@1@26@@danf@17-8-2009
10321710@unknown@formal@none@1@S@In 2002, they claimed that [[pigeons]] were the [[Google's hoaxes#2002: Pigeon Rank|secret]] behind their growing [[search engine]].@@@@1@17@@danf@17-8-2009
10321720@unknown@formal@none@1@S@In 2004, they featured [[Google's hoaxes#2004: Google Lunar/Copernicus Center|Google Lunar]] (which claimed to feature jobs on the [[moon]]), and in 2005, a [[fiction|fictitious]] brain-boosting drink, termed [[Google's hoaxes#2005: Google Gulp|Google Gulp]], was announced.@@@@1@33@@danf@17-8-2009
10321730@unknown@formal@none@1@S@In 2006, they came up with [[Google's hoaxes#2006: Google Romance|Google Romance]], a hypothetical [[online dating]] service.@@@@1@16@@danf@17-8-2009
10321740@unknown@formal@none@1@S@In 2007, Google announced two joke products.@@@@1@7@@danf@17-8-2009
10321750@unknown@formal@none@1@S@The first was a free wireless Internet service called [[TiSP]] (Toilet Internet Service Provider) in which one obtained a connection by flushing one end of a [[fiber-optic]] cable down their toilet and waiting only an hour for a "Plumbing Hardware Dispatcher (PHD)" to connect it to the Internet.@@@@1@48@@danf@17-8-2009
10321760@unknown@formal@none@1@S@Additionally, Google's [[Gmail]] page displayed an announcement for [[Gmail Paper]], which allows users of their free email service to have email messages printed and shipped to a snail mail address.@@@@1@30@@danf@17-8-2009
10321770@unknown@formal@none@1@S@Google's services contain a number of [[Easter egg (virtual)|Easter eggs]]; for instance, the Language Tools page offers the search interface in the [[Swedish Chef]]'s "Bork bork bork," [[Pig Latin]], "Hacker" (actually [[leetspeak]]), [[Elmer Fudd]], and [[Klingon language|Klingon]].@@@@1@37@@danf@17-8-2009
10321780@unknown@formal@none@1@S@In addition, the search engine calculator provides the [[Answer to Life, the Universe, and Everything]] from [[Douglas Adams]]' ''[[The Hitchhiker's Guide to the Galaxy]]''.@@@@1@24@@danf@17-8-2009
10321790@unknown@formal@none@1@S@As Google's search box can be used as a unit converter (as well as a calculator), some non-standard units are built in, such as the [[Smoot]].@@@@1@26@@danf@17-8-2009
10321800@unknown@formal@none@1@S@Google also routinely modifies its logo in accordance with various holidays or special events throughout the year, such as [[Christmas]], [[Mother's Day]], or the [[birthday]]s of various notable individuals.@@@@1@29@@danf@17-8-2009
10321810@unknown@formal@none@1@S@===IPO and culture===@@@@1@3@@danf@17-8-2009
10321820@unknown@formal@none@1@S@Many people speculated that Google's [[initial public offering|IPO]] would inevitably lead to changes in the company's culture, because of shareholder pressure for employee benefit reductions and short-term advances, or because a large number of the company's employees would suddenly become millionaires on paper.@@@@1@43@@danf@17-8-2009
10321830@unknown@formal@none@1@S@In a report given to potential investors, co-founders Sergey Brin and Larry Page promised that the IPO would not change the company's culture.@@@@1@23@@danf@17-8-2009
10321840@unknown@formal@none@1@S@Later Mr. Page said, "We think a lot about how to maintain our culture and the fun elements.@@@@1@18@@danf@17-8-2009
10321850@unknown@formal@none@1@S@We spent a lot of time getting our offices right.@@@@1@10@@danf@17-8-2009
10321860@unknown@formal@none@1@S@We think it's important to have a high density of people.@@@@1@11@@danf@17-8-2009
10321870@unknown@formal@none@1@S@People are packed together everywhere.@@@@1@5@@danf@17-8-2009
10321880@unknown@formal@none@1@S@We all share offices.@@@@1@4@@danf@17-8-2009
10321890@unknown@formal@none@1@S@We like this set of buildings because it's more like a densely packed university campus than a typical suburban office park."@@@@1@21@@danf@17-8-2009
10321900@unknown@formal@none@1@S@However, many analysts are finding that as Google grows, the company is becoming more "corporate".@@@@1@15@@danf@17-8-2009
10321910@unknown@formal@none@1@S@In 2005, articles in ''[[The New York Times]]'' and other sources began suggesting that Google had lost its anti-corporate, "no evil" philosophy.@@@@1@22@@danf@17-8-2009
10321920@unknown@formal@none@1@S@In an effort to maintain the company's unique culture, Google designated a Chief Culture Officer in 2006, who also serves as the Director of Human Resources.@@@@1@26@@danf@17-8-2009
10321930@unknown@formal@none@1@S@The purpose of the Chief Culture Officer is to develop and maintain the culture and work on ways to keep true to the core values that the company was founded on in the beginning—a flat organization, a lack of hierarchy, a collaborative environment.@@@@1@43@@danf@17-8-2009
10321940@unknown@formal@none@1@S@===Philanthropy===@@@@1@1@@danf@17-8-2009
10321950@unknown@formal@none@1@S@In 2004, Google formed a for-profit philanthropic wing, [[Google.org]], with a start-up fund of US$1 billion.@@@@1@16@@danf@17-8-2009
10321960@unknown@formal@none@1@S@The express mission of the organization is to create awareness about [[climate change]], global public health, and [[global poverty]].@@@@1@19@@danf@17-8-2009
10321970@unknown@formal@none@1@S@One of its first projects is to develop a viable [[plug-in hybrid]] [[electric vehicle]] that can attain 100 [[fuel economy in automobiles|mpg]].@@@@1@22@@danf@17-8-2009
10321980@unknown@formal@none@1@S@The founding and current director is Dr. [[Larry Brilliant]].@@@@1@9@@danf@17-8-2009
10321990@unknown@formal@none@1@S@==Criticism==@@@@1@1@@danf@17-8-2009
10322000@unknown@formal@none@1@S@As it has grown, Google has found itself the focus of several controversies related to its business practices and services.@@@@1@20@@danf@17-8-2009
10322010@unknown@formal@none@1@S@For example, [[Google Book Search]]'s effort to digitize millions of books and make the full text searchable has led to [[copyright]] disputes with the [[Authors Guild]].@@@@1@26@@danf@17-8-2009
10322020@unknown@formal@none@1@S@Google's cooperation with the governments of [[People's Republic of China|China]], and to a lesser extent [[France]] and [[Germany]] (regarding [[Holocaust denial]]), to filter search results in accordance with regional laws and regulations has led to claims of [[censorship by Google|censorship]].@@@@1@40@@danf@17-8-2009
10322030@unknown@formal@none@1@S@Google's persistent [[HTTP cookie|cookie]] and other information collection practices have led to concerns over user [[Google and privacy issues|privacy]].@@@@1@19@@danf@17-8-2009
10322040@unknown@formal@none@1@S@As of [[December 11]], [[2007]], Google, like the [[Microsoft]] search engine, stores "personal information for 18 months" and by comparison, [[Yahoo!]] and [[AOL]] ([[Time Warner]]) "retain search requests for 13 months."@@@@1@31@@danf@17-8-2009
10322050@unknown@formal@none@1@S@A number of [[India]]n state governments have raised concerns about the security risks posed by geographic details provided by [[Google Earth]]'s satellite imaging.@@@@1@23@@danf@17-8-2009
10322060@unknown@formal@none@1@S@Google has also been criticized by advertisers regarding its inability to combat [[click fraud]], when a person or automated script is used to generate a charge on an advertisement without really having an interest in the product.@@@@1@37@@danf@17-8-2009
10322070@unknown@formal@none@1@S@Industry reports in 2006 claimed that approximately 14 to 20 percent of clicks were in fact fraudulent or invalid.@@@@1@19@@danf@17-8-2009
10322080@unknown@formal@none@1@S@Further, Google has faced allegations of [[sexism]] and [[ageism]] from former employees.@@@@1@12@@danf@17-8-2009
10322090@unknown@formal@none@1@S@Google has also faced accusations in [[Harper's Magazine]] of excessive energy usage, and of employing its "[[Don't be evil]]" motto as well as its very public energy-saving campaigns as a means of covering up or making up for the massive amounts of energy its servers actually require.@@@@1@47@@danf@17-8-2009
10322100@unknown@formal@none@1@S@Also, US District Court Judge [[Louis Stanton]], on [[July 1]], 2008, ordered Google to give [[YouTube]] user data and logs to [[Viacom]] to support its case in a billion-dollar [[copyright]] lawsuit against Google.@@@@1@33@@danf@17-8-2009
10322110@unknown@formal@none@1@S@Google and [[Viacom]], however, on [[July 14]], 2008, agreed in a [[compromise]] to protect [[YouTube]] users' personal data in the $1 billion (£497 million) copyright lawsuit.@@@@1@26@@danf@17-8-2009
10322120@unknown@formal@none@1@S@Google agreed it will make user information and internet protocol addresses from its YouTube subsidiary anonymous before handing over the data to Viacom.@@@@1@23@@danf@17-8-2009
10322130@unknown@formal@none@1@S@The privacy deal also applied to other litigants including the [[FA Premier League]], the Rodgers & Hammerstein Organisation and the [[Scottish Premier League]].@@@@1@23@@danf@17-8-2009
10322140@unknown@formal@none@1@S@The deal, however, did not extend the anonymity to employees, since Viacom seeks to prove that Google staff are aware of the uploading of illegal material to the site.@@@@1@29@@danf@17-8-2009
10322150@unknown@formal@none@1@S@The parties therefore will further meet on the matter lest the data be made available to the court.@@@@1@18@@danf@17-8-2009
10330010@unknown@formal@none@1@S@Google Translate@@@@1@2@@danf@17-8-2009
10330020@unknown@formal@none@1@S@'''Google Translate''' is a service provided by [[Google|Google Inc.]] to translate a section of text, or a webpage, into another language, with limits on the number of paragraphs, or range of technical terms, translated.@@@@1@34@@danf@17-8-2009
10330030@unknown@formal@none@1@S@For some languages, users are asked for alternate translations, such as for technical terms, to be included for future updates to the translation process.@@@@1@24@@danf@17-8-2009
10330040@unknown@formal@none@1@S@Unlike other translation services such as [[Babel Fish (website)|Babel Fish]], [[AOL]], and [[Yahoo!|Yahoo]], which use [[SYSTRAN]], Google uses its own translation software.@@@@1@22@@danf@17-8-2009
10330050@unknown@formal@none@1@S@== Functions ==@@@@1@3@@danf@17-8-2009
10330060@unknown@formal@none@1@S@The service also includes translation of an entire Web page.@@@@1@10@@danf@17-8-2009
10330070@unknown@formal@none@1@S@The translation is limited in number of paragraphs per webpage (such as indicated by break-tags <br>); however, if text on a webpage is separated by horizontal blank-line images (auto-wrapped without using any <br>), a long webpage can be translated containing several thousand words.@@@@1@43@@danf@17-8-2009
10330080@unknown@formal@none@1@S@Google Translate, like other automatic translation tools, has its limitations.@@@@1@10@@danf@17-8-2009
10330090@unknown@formal@none@1@S@While it can help the reader to understand the general content of a foreign language text, it does not deliver accurate translations and does not produce publication-standard content; for example, it often translates words out of context and deliberately does not apply any [[Grammar|grammatical]] rules.@@@@1@45@@danf@17-8-2009
10330100@unknown@formal@none@1@S@== Approach ==@@@@1@3@@danf@17-8-2009
10330110@unknown@formal@none@1@S@Google Translate is based on an approach called [[statistical machine translation]], and more specifically, on research by [[Franz-Josef Och]], who won the [[DARPA]] contest for speed machine translation in 2003.@@@@1@30@@danf@17-8-2009
10330120@unknown@formal@none@1@S@Och is now the head of Google's machine translation department.@@@@1@10@@danf@17-8-2009
10330130@unknown@formal@none@1@S@According to Och, a solid base for developing a usable statistical machine translation system for a new pair of languages from scratch would consist of a bilingual [[text corpus]] (or [[parallel text|parallel collection]]) of more than a million words and two monolingual corpora, each of more than a billion words.@@@@1@51@@danf@17-8-2009
10330140@unknown@formal@none@1@S@Statistical [[Mathematical model|models]] from this data are then used to translate between those languages.@@@@1@14@@danf@17-8-2009
10330150@unknown@formal@none@1@S@To acquire this huge amount of linguistic data, Google used [[United Nations]] documents.@@@@1@13@@danf@17-8-2009
10330160@unknown@formal@none@1@S@The same document is normally available in all six official UN languages; thus Google now has a hexalingual corpus of 20 billion words' worth of human translations.@@@@1@27@@danf@17-8-2009
10330170@unknown@formal@none@1@S@The availability of Arabic and Chinese as official UN languages is probably one of the reasons why Google Translate initially focused on the development of translation between English and those languages, and not, for example, [[Japanese language|Japanese]] and [[German language|German]], which are not official languages at the UN.@@@@1@48@@danf@17-8-2009
10330180@unknown@formal@none@1@S@Google representatives have been very active at domestic conferences in Japan in the field asking researchers to provide them with bilingual corpora.@@@@1@22@@danf@17-8-2009
10330190@unknown@formal@none@1@S@== Options ==@@@@1@3@@danf@17-8-2009
10330200@unknown@formal@none@1@S@(in chronological order)@@@@1@3@@danf@17-8-2009
10330210@unknown@formal@none@1@S@*Beginning@@@@1@1@@danf@17-8-2009
10330220@unknown@formal@none@1@S@**English to Arabic@@@@1@3@@danf@17-8-2009
10330230@unknown@formal@none@1@S@**English to French@@@@1@3@@danf@17-8-2009
10330240@unknown@formal@none@1@S@**English to German@@@@1@3@@danf@17-8-2009
10330250@unknown@formal@none@1@S@**English to Spanish@@@@1@3@@danf@17-8-2009
10330260@unknown@formal@none@1@S@**French to English@@@@1@3@@danf@17-8-2009
10330270@unknown@formal@none@1@S@**German to English@@@@1@3@@danf@17-8-2009
10330280@unknown@formal@none@1@S@**Spanish to English@@@@1@3@@danf@17-8-2009
10330290@unknown@formal@none@1@S@**Arabic to English@@@@1@3@@danf@17-8-2009
10330300@unknown@formal@none@1@S@*2nd stage@@@@1@2@@danf@17-8-2009
10330310@unknown@formal@none@1@S@**English to Portuguese@@@@1@3@@danf@17-8-2009
10330320@unknown@formal@none@1@S@**Portuguese to English@@@@1@3@@danf@17-8-2009
10330330@unknown@formal@none@1@S@*3rd stage@@@@1@2@@danf@17-8-2009
10330340@unknown@formal@none@1@S@**English to Italian@@@@1@3@@danf@17-8-2009
10330350@unknown@formal@none@1@S@**Italian to English@@@@1@3@@danf@17-8-2009
10330360@unknown@formal@none@1@S@*4th stage@@@@1@2@@danf@17-8-2009
10330370@unknown@formal@none@1@S@**English to Chinese (Simplified) BETA@@@@1@5@@danf@17-8-2009
10330380@unknown@formal@none@1@S@**English to Japanese BETA@@@@1@4@@danf@17-8-2009
10330390@unknown@formal@none@1@S@**English to Korean BETA@@@@1@4@@danf@17-8-2009
10330400@unknown@formal@none@1@S@**Chinese (Simplified) to English BETA@@@@1@5@@danf@17-8-2009
10330410@unknown@formal@none@1@S@**Japanese to English BETA@@@@1@4@@danf@17-8-2009
10330420@unknown@formal@none@1@S@**Korean to English BETA@@@@1@4@@danf@17-8-2009
10330430@unknown@formal@none@1@S@*5th stage@@@@1@2@@danf@17-8-2009
10330440@unknown@formal@none@1@S@**English to Russian BETA@@@@1@4@@danf@17-8-2009
10330450@unknown@formal@none@1@S@**Russian to English BETA@@@@1@4@@danf@17-8-2009
10330460@unknown@formal@none@1@S@*6th stage@@@@1@2@@danf@17-8-2009
10330470@unknown@formal@none@1@S@**English to Arabic BETA@@@@1@4@@danf@17-8-2009
10330480@unknown@formal@none@1@S@**Arabic to English BETA@@@@1@4@@danf@17-8-2009
10330490@unknown@formal@none@1@S@*7th stage (launched February, 2007)@@@@1@5@@danf@17-8-2009
10330500@unknown@formal@none@1@S@**English to Chinese (Traditional) BETA@@@@1@5@@danf@17-8-2009
10330510@unknown@formal@none@1@S@**Chinese (Traditional) to English BETA@@@@1@5@@danf@17-8-2009
10330520@unknown@formal@none@1@S@**Chinese (Simplified to Traditional) BETA@@@@1@5@@danf@17-8-2009
10330530@unknown@formal@none@1@S@**Chinese (Traditional to Simplified) BETA@@@@1@5@@danf@17-8-2009
10330540@unknown@formal@none@1@S@*8th stage (launched October, 2007)@@@@1@5@@danf@17-8-2009
10330550@unknown@formal@none@1@S@** all 25 language pairs use Google's machine translation system@@@@1@10@@danf@17-8-2009
10330560@unknown@formal@none@1@S@*9th stage@@@@1@2@@danf@17-8-2009
10330570@unknown@formal@none@1@S@**English to Hindi BETA@@@@1@4@@danf@17-8-2009
10330580@unknown@formal@none@1@S@**Hindi to English BETA@@@@1@4@@danf@17-8-2009
10330590@unknown@formal@none@1@S@*10th stage (as of this stage, translation can be done between any two languages)@@@@1@14@@danf@17-8-2009
10330600@unknown@formal@none@1@S@**Bulgarian@@@@1@1@@danf@17-8-2009
10330610@unknown@formal@none@1@S@**Croatian@@@@1@1@@danf@17-8-2009
10330620@unknown@formal@none@1@S@**Czech@@@@1@1@@danf@17-8-2009
10330630@unknown@formal@none@1@S@**Danish@@@@1@1@@danf@17-8-2009
10330640@unknown@formal@none@1@S@**Dutch@@@@1@1@@danf@17-8-2009
10330650@unknown@formal@none@1@S@**Finnish@@@@1@1@@danf@17-8-2009
10330660@unknown@formal@none@1@S@**Greek@@@@1@1@@danf@17-8-2009
10330670@unknown@formal@none@1@S@**Norwegian@@@@1@1@@danf@17-8-2009
10330680@unknown@formal@none@1@S@**Polish@@@@1@1@@danf@17-8-2009
10330690@unknown@formal@none@1@S@**Romanian@@@@1@1@@danf@17-8-2009
10330700@unknown@formal@none@1@S@**Swedish@@@@1@1@@danf@17-8-2009
10340010@unknown@formal@none@1@S@Grammar@@@@1@1@@danf@17-8-2009
10340020@unknown@formal@none@1@S@'''Grammar''' is the field of [[linguistics]] that covers the [[rules]] governing the use of any given [[natural language|natural language]].@@@@1@19@@danf@17-8-2009
10340030@unknown@formal@none@1@S@It includes [[morphology (linguistics)|morphology]] and [[syntax]], often complemented by [[phonetics]], [[phonology]], [[semantics]], and [[pragmatics]].@@@@1@14@@danf@17-8-2009
10340040@unknown@formal@none@1@S@Each language has its own distinct grammar.@@@@1@7@@danf@17-8-2009
10340050@unknown@formal@none@1@S@"English grammar" is the rules of the English language itself.@@@@1@10@@danf@17-8-2009
10340060@unknown@formal@none@1@S@"''An'' English grammar" is a specific study or analysis of these rules.@@@@1@12@@danf@17-8-2009
10340070@unknown@formal@none@1@S@A [[reference book]] describing the grammar of a language is called a "reference grammar" or simply "a grammar".@@@@1@18@@danf@17-8-2009
10340080@unknown@formal@none@1@S@A fully explicit grammar exhaustively describing the [[grammaticality|grammatical]] constructions of a language is called a descriptive grammar, as opposed to [[linguistic prescription]], which tries to enforce rules governing how a language is to be used.@@@@1@36@@danf@17-8-2009
10340090@unknown@formal@none@1@S@[[Grammatical framework]]s are approaches to constructing grammars.@@@@1@7@@danf@17-8-2009
10340100@unknown@formal@none@1@S@The standard framework of [[generative grammar]] is the [[transformational grammar]] model developed by [[Noam Chomsky]] and his followers from the 1950s to 1980s.@@@@1@23@@danf@17-8-2009
10340110@unknown@formal@none@1@S@==Etymology==@@@@1@1@@danf@17-8-2009
10340120@unknown@formal@none@1@S@The word "grammar" derives from [[Greek language|Greek]] ''γραμματική τέχνη'' (''grammatike techne''), which means "art of letters," from ''γράμμα'' (''gramma''), "letter," and that from ''γράφειν'' (''graphein''), "to draw, to write".@@@@1@29@@danf@17-8-2009
10340130@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10340140@unknown@formal@none@1@S@The first systematic grammars originate in [[Iron Age India]], with [[Panini (grammarian)|Panini]] (4th c. BC) and his commentators [[Pingala]] (ca. 200 BC), [[Katyayana]], and [[Patanjali]] (2nd c. BC).@@@@1@28@@danf@17-8-2009
10340150@unknown@formal@none@1@S@In the West, grammar emerges as a discipline in [[Hellenism]] from the 3rd c. BC forward with authors like [[Rhyanus]] and [[Aristarchus of Samothrace]], the oldest extant work being the ''[[Art of Grammar]]'' ({{lang|grc|Τέχνη Γραμματική}}), attributed to [[Dionysius Thrax]] (ca. 100 BC).@@@@1@42@@danf@17-8-2009
10340160@unknown@formal@none@1@S@[[Latin grammar]] developed by following Greek models from the 1st century BC, due to the work of authors such as [[Orbilius Pupillus]], [[Remmius Palaemon]], [[Marcus Valerius Probus]], [[Verrius Flaccus]], [[Aemilius Asper]].@@@@1@31@@danf@17-8-2009
10340170@unknown@formal@none@1@S@Tamil grammatical tradition also began around the 1st century BC with the [[Tolkāppiyam]].@@@@1@13@@danf@17-8-2009
10340180@unknown@formal@none@1@S@A grammar of [[Old Irish|Irish]] originated in the 7th century with the [[Auraicept na n-Éces]].@@@@1@15@@danf@17-8-2009
10340190@unknown@formal@none@1@S@[[Arabic grammar]] emerges from the 8th century with the work of [[Ibn Abi Ishaq]] and his students.@@@@1@17@@danf@17-8-2009
10340200@unknown@formal@none@1@S@The first treatises on [[Hebrew grammar]] appear in the [[High Middle Ages]], in the context of [[Mishnah]] (exegesis of the [[Hebrew Bible]]).@@@@1@22@@danf@17-8-2009
10340210@unknown@formal@none@1@S@The [[Karaite]] tradition originates in [[Abbasid]] [[Baghdad]].@@@@1@7@@danf@17-8-2009
10340220@unknown@formal@none@1@S@The ''[[Diqduq]]'' (10th century) is one of the earliest grammatical commentaries on the Hebrew Bible.@@@@1@15@@danf@17-8-2009
10340230@unknown@formal@none@1@S@[[Ibn Barun]] in the 12th century compares the Hebrew language with [[Arabic language|Arabic]] in the [[Islamic grammatical tradition]].@@@@1@18@@danf@17-8-2009
10340240@unknown@formal@none@1@S@Belonging to the ''trivium'' of the seven [[liberal arts]], grammar was taught as a core discipline throughout the [[Middle Ages]], following the influence of authors from [[Late Antiquity]], such as [[Priscian]].@@@@1@31@@danf@17-8-2009
10340250@unknown@formal@none@1@S@Treatment of vernaculars begins gradually during the [[High Middle Ages]], with isolated works such as the [[First Grammatical Treatise]], but becomes influential only in the [[Renaissance]] and [[Baroque]] periods.@@@@1@29@@danf@17-8-2009
10340260@unknown@formal@none@1@S@In [[1486]], [[Antonio de Nebrija]] published ''Las introduciones Latinas contrapuesto el romance al Latin'', and the first [[Spanish grammar]], ''Gramática de la lengua castellana'', in 1492.@@@@1@26@@danf@17-8-2009
10340270@unknown@formal@none@1@S@During the 16th century [[Italian Renaissance]], the ''Questione della lingua'' was the discussion on the status and ideal form of the [[Italian language]], initiated by [[Dante]]'s ''[[de vulgari eloquentia]]'' ([[Pietro Bembo]], ''Prose della volgar lingua'' Venice 1525).@@@@1@37@@danf@17-8-2009
10340280@unknown@formal@none@1@S@Grammars of non-European languages began to be compiled for the purposes of [[evangelization]] and [[Bible translation]] from the 16th century onward, such as ''Grammatica o Arte de la Lengua General de los Indios de los Reynos del Perú'' (1560), a [[Quechua]] grammar by [[Fray Domingo de Santo Tomás]].@@@@1@49@@danf@17-8-2009
10340290@unknown@formal@none@1@S@In 1643 there appeared [[Ivan Uzhevych]]'s ''Grammatica sclavonica'' and, in 1762, the ''Short Introduction to English Grammar'' of [[Robert Lowth]] was also published.@@@@1@23@@danf@17-8-2009
10340300@unknown@formal@none@1@S@The ''Grammatisch-Kritisches Wörterbuch der hochdeutschen Mundart'', a [[High German]] grammar in five volumes by [[Johann Christoph Adelung]], appeared as early as 1774.@@@@1@22@@danf@17-8-2009
10340310@unknown@formal@none@1@S@From the latter part of the 18th century, grammar came to be understood as a subfield of the emerging discipline of modern [[linguistics]].@@@@1@23@@danf@17-8-2009
10340320@unknown@formal@none@1@S@The Serbian grammar by [[Vuk Stefanović Karadžić]] arrived in 1814, while the ''Deutsche Grammatik'' of the [[Brothers Grimm]] was first published in 1818.@@@@1@23@@danf@17-8-2009
10340330@unknown@formal@none@1@S@The ''Comparative Grammar'' of [[Franz Bopp]], the starting point of modern [[comparative linguistics]], came out in 1833.@@@@1@17@@danf@17-8-2009
10340340@unknown@formal@none@1@S@In the [[USA]], the Society for the Promotion of Good Grammar has designated March 4, 2008 as National Grammar Day.@@@@1@20@@danf@17-8-2009
10340350@unknown@formal@none@1@S@==Development of grammars==@@@@1@3@@danf@17-8-2009
10340360@unknown@formal@none@1@S@Grammars evolve through usage, and grammars also develop due to separations of the human population.@@@@1@15@@danf@17-8-2009
10340370@unknown@formal@none@1@S@With the advent of written [[Knowledge representation|representation]]s, formal rules about language usage tend to appear also.@@@@1@16@@danf@17-8-2009
10340380@unknown@formal@none@1@S@Formal grammars are [[codification (linguistics)|codifications]] of usage that are developed by repeated documentation over time, and by [[observation]] as well.@@@@1@20@@danf@17-8-2009
10340390@unknown@formal@none@1@S@As the rules become established and developed, the prescriptive concept of grammatical correctness can arise.@@@@1@15@@danf@17-8-2009
10340400@unknown@formal@none@1@S@This often creates a discrepancy between contemporary usage and that which has been accepted over time as being correct.@@@@1@19@@danf@17-8-2009
10340410@unknown@formal@none@1@S@Linguists tend to believe that prescriptive grammars do not have any justification beyond their authors' aesthetic tastes; however, prescriptions are considered in [[sociolinguistics]] as part of the explanation for why some people say "I didn't do nothing", some say "I didn't do anything", and some say one or the other depending on social context.@@@@1@54@@danf@17-8-2009
10340420@unknown@formal@none@1@S@The formal study of grammar is an important part of [[education]] for children from a young age through advanced [[learning]], though the rules taught in schools are not a "grammar" in the sense most [[linguistics|linguists]] use the term, as they are often [[prescriptive]] rather than [[descriptive]].@@@@1@46@@danf@17-8-2009
10340430@unknown@formal@none@1@S@[[Constructed language]]s (also called planned languages or conlangs) are more common in the modern day.@@@@1@15@@danf@17-8-2009
10340440@unknown@formal@none@1@S@Many have been designed to aid human [[communication]] (for example, naturalistic [[Interlingua]], schematic [[Esperanto]], and the highly logic-compatible artificial language [[Lojban]]).@@@@1@21@@danf@17-8-2009
10340450@unknown@formal@none@1@S@Each of these languages has its own grammar.@@@@1@8@@danf@17-8-2009
10340460@unknown@formal@none@1@S@No clear line can be drawn between syntax and morphology.@@@@1@10@@danf@17-8-2009
10340470@unknown@formal@none@1@S@[[Analytic languages]] use [[syntax]] to convey information that is encoded via [[inflection]] in [[synthetic language]]s.@@@@1@15@@danf@17-8-2009
10340480@unknown@formal@none@1@S@In other words, word order is not significant and [[morphology (linguistics)|morphology]] is highly significant in a purely synthetic language, whereas morphology is not significant and syntax is highly significant in an analytic language.@@@@1@33@@danf@17-8-2009
10340490@unknown@formal@none@1@S@[[Chinese language|Chinese]] and [[Afrikaans language|Afrikaans]], for example, are highly analytic, and meaning is therefore very context-dependent.@@@@1@18@@danf@17-8-2009
10340500@unknown@formal@none@1@S@(Both do have some inflections, and have had more in the past; thus, they are becoming even less synthetic and more "purely" analytic over time.)@@@@1@25@@danf@17-8-2009
10340510@unknown@formal@none@1@S@[[Latin]], which is highly [[synthetic language|synthetic]], uses [[affix]]es and [[inflection]]s to convey the same information that Chinese does with [[syntax]].@@@@1@20@@danf@17-8-2009
10340520@unknown@formal@none@1@S@Because Latin words are quite (though not completely) self-contained, an intelligible Latin [[Sentence (linguistics)|sentence]] can be made from elements that are placed in a largely arbitrary order.@@@@1@27@@danf@17-8-2009
10340530@unknown@formal@none@1@S@Latin has complex affixation and simple syntax, while Chinese has the opposite.@@@@1@14@@danf@17-8-2009
10340540@unknown@formal@none@1@S@==Grammar frameworks==@@@@1@2@@danf@17-8-2009
10340550@unknown@formal@none@1@S@Various "grammar frameworks" have been developed in [[theoretical linguistics]] since the mid 20th century, in particular under the influence of the idea of a "[[Universal grammar]]" in the USA.@@@@1@29@@danf@17-8-2009
10340560@unknown@formal@none@1@S@Of these, the main divisions are:@@@@1@6@@danf@17-8-2009
10340570@unknown@formal@none@1@S@*[[Transformational grammar]] (TG)@@@@1@3@@danf@17-8-2009
10340580@unknown@formal@none@1@S@*[[Principles and Parameters|Principles and Parameters Theory]] (P&P)@@@@1@7@@danf@17-8-2009
10340590@unknown@formal@none@1@S@*[[Lexical functional grammar|Lexical-functional Grammar]] (LFG)@@@@1@5@@danf@17-8-2009
10340600@unknown@formal@none@1@S@*[[Generalised Phrase Structure Grammar|Generalized Phrase Structure Grammar]] (GPSG)@@@@1@8@@danf@17-8-2009
10340610@unknown@formal@none@1@S@*[[Head-Driven Phrase Structure Grammar]] (HPSG)@@@@1@5@@danf@17-8-2009
10340620@unknown@formal@none@1@S@*[[Dependency grammar]]s (DG)@@@@1@3@@danf@17-8-2009
10340630@unknown@formal@none@1@S@*[[Role and reference grammar]] (RRG)@@@@1@5@@danf@17-8-2009
10350010@unknown@formal@none@1@S@Hidden Markov model@@@@1@3@@danf@17-8-2009
10350020@unknown@formal@none@1@S@A '''hidden Markov model''' ('''HMM''') is a [[statistical model]] in which the system being modeled is assumed to be a [[Markov process]] with unknown parameters, and the challenge is to determine the hidden parameters from the [[observable]] parameters.@@@@1@38@@danf@17-8-2009
10350030@unknown@formal@none@1@S@The extracted model parameters can then be used to perform further analysis, for example for [[pattern recognition]] applications.@@@@1@18@@danf@17-8-2009
10350040@unknown@formal@none@1@S@An HMM can be considered as the simplest [[dynamic Bayesian network]].@@@@1@11@@danf@17-8-2009
10350050@unknown@formal@none@1@S@In a regular [[Markov model]], the state is directly visible to the observer, and therefore the state transition probabilities are the only parameters.@@@@1@23@@danf@17-8-2009
10350060@unknown@formal@none@1@S@In a ''hidden'' Markov model, the state is not directly visible, but variables influenced by the state are visible.@@@@1@19@@danf@17-8-2009
10350070@unknown@formal@none@1@S@Each state has a probability distribution over the possible output tokens.@@@@1@11@@danf@17-8-2009
10350080@unknown@formal@none@1@S@Therefore the sequence of tokens generated by an HMM gives some information about the sequence of states.@@@@1@17@@danf@17-8-2009
10350090@unknown@formal@none@1@S@Hidden Markov models are especially known for their application in [[time| temporal]] pattern recognition such as [[speech recognition|speech]], [[handwriting recognition|handwriting]], [[gesture recognition]], [[musical score]] following, [[partial discharge]]s and [[bioinformatics]].@@@@1@29@@danf@17-8-2009
10350100@unknown@formal@none@1@S@== Architecture of a hidden Markov model ==@@@@1@8@@danf@17-8-2009
10350110@unknown@formal@none@1@S@The diagram below shows the general architecture of an instantiated HMM.@@@@1@11@@danf@17-8-2009
10350120@unknown@formal@none@1@S@Each oval shape represents a random variable that can adopt a number of values.@@@@1@14@@danf@17-8-2009
10350130@unknown@formal@none@1@S@The random variable <math>x(t)</math> is the hidden state at time <math>t</math> (with the model from the above diagram, <math>x(t) \in \{x_1, x_2, x_3\}</math>).@@@@1@23@@danf@17-8-2009
10350140@unknown@formal@none@1@S@The random variable <math>y(t)</math> is the observation at time <math>t</math> (<math>y(t) \in \{y_1, y_2, y_3, y_4\}</math>).@@@@1@16@@danf@17-8-2009
10350150@unknown@formal@none@1@S@The arrows in the diagram (often called a [[Trellis (graph)|trellis diagram]]) denote conditional dependencies.@@@@1@14@@danf@17-8-2009
10350160@unknown@formal@none@1@S@From the diagram, it is clear that the value of the hidden variable <math>x(t)</math> (at time <math>t</math>) ''only'' depends on the value of the hidden variable <math>x(t-1)</math>: the values at time <math>t-2</math> and before have no influence.@@@@1@38@@danf@17-8-2009
10350170@unknown@formal@none@1@S@This is called the [[Markov property]].@@@@1@6@@danf@17-8-2009
10350180@unknown@formal@none@1@S@Similarly, the value of the observed variable <math>y(t)</math> only depends on the value of the hidden variable <math>x(t)</math> (both at time <math>t</math>).@@@@1@22@@danf@17-8-2009
10350190@unknown@formal@none@1@S@==Probability of an observed sequence==@@@@1@5@@danf@17-8-2009
10350200@unknown@formal@none@1@S@The probability of observing a sequence <math>Y = y(0), y(1), \dots, y(L-1)</math> of length <math>L</math> is given by@@@@1@14@@danf@17-8-2009
10350210@unknown@formal@none@1@S@:<math>P(Y) = \sum_{X} P(Y \mid X)\, P(X),</math>@@@@1@2@@danf@17-8-2009
10350220@unknown@formal@none@1@S@where the sum runs over all possible hidden node sequences <math>X = x(0), x(1), \dots, x(L-1)</math>.@@@@1@14@@danf@17-8-2009
10350230@unknown@formal@none@1@S@Brute force calculation of <math>P(Y)</math> is intractable for most real-life problems, as the number of possible hidden node sequences is typically extremely high.@@@@1@23@@danf@17-8-2009
10350240@unknown@formal@none@1@S@The calculation can however be sped up enormously using the [[Viterbi algorithm|forward algorithm]] or the equivalent backward algorithm.@@@@1@18@@danf@17-8-2009
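As an illustration of the speed-up described above, the following minimal sketch computes the probability of an observed sequence in two ways: by summing over every hidden state sequence, and by the forward recursion, which keeps only one running value per state. The two-state weather model, its probabilities, and the function names are hypothetical values chosen for this sketch, not part of the article.

```python
from itertools import product

# Hypothetical toy model (assumed for illustration only).
STATES = ["Rainy", "Sunny"]
START_P = {"Rainy": 0.6, "Sunny": 0.4}
TRANS_P = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
EMIT_P = {
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}


def brute_force_probability(obs):
    """Sum P(Y | X) P(X) over every hidden sequence X (exponential cost)."""
    total = 0.0
    for path in product(STATES, repeat=len(obs)):
        p = START_P[path[0]] * EMIT_P[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= TRANS_P[path[t - 1]][path[t]] * EMIT_P[path[t]][obs[t]]
        total += p
    return total


def forward_probability(obs):
    """Forward recursion: cost grows linearly with the sequence length."""
    # alpha[s] is the joint probability of the observations so far
    # and of being in hidden state s at the current time step.
    alpha = {s: START_P[s] * EMIT_P[s][obs[0]] for s in STATES}
    for o in obs[1:]:
        alpha = {
            s: EMIT_P[s][o] * sum(alpha[r] * TRANS_P[r][s] for r in STATES)
            for s in STATES
        }
    return sum(alpha.values())


observed = ["walk", "shop", "clean"]
p_brute = brute_force_probability(observed)
p_forward = forward_probability(observed)
```

Both functions return the same value for the same input; the brute-force version enumerates every possible hidden sequence (the number of states raised to the sequence length), whereas the forward version does a fixed amount of work per time step.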
10350250@unknown@formal@none@1@S@==Using hidden Markov models==@@@@1@4@@danf@17-8-2009
10350260@unknown@formal@none@1@S@There are three [[canonical]] problems associated with HMMs:@@@@1@8@@danf@17-8-2009
10350270@unknown@formal@none@1@S@* Given the parameters of the model, compute the probability of a particular output sequence, and the probabilities of the hidden state values given that output sequence.@@@@1@27@@danf@17-8-2009
10350280@unknown@formal@none@1@S@This problem is solved by the [[forward-backward algorithm]].@@@@1@8@@danf@17-8-2009
10350290@unknown@formal@none@1@S@* Given the parameters of the model, find the most likely sequence of hidden states that could have generated a given output sequence.@@@@1@23@@danf@17-8-2009
10350300@unknown@formal@none@1@S@This problem is solved by the [[Viterbi algorithm]].@@@@1@8@@danf@17-8-2009
10350310@unknown@formal@none@1@S@* Given an output sequence or a set of such sequences, find the most likely set of state transition and output probabilities.@@@@1@22@@danf@17-8-2009
10350320@unknown@formal@none@1@S@In other words, discover the parameters of the HMM given a dataset of sequences.@@@@1@14@@danf@17-8-2009
10350330@unknown@formal@none@1@S@This problem is solved by the [[Baum-Welch algorithm]].@@@@1@8@@danf@17-8-2009
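The second problem in the list above, finding the most likely hidden state sequence, can be sketched as follows. This is a minimal illustration of the Viterbi recursion under assumed inputs: the two-state weather model, its probabilities, and the function name are hypothetical and chosen only for this sketch.

```python
# Hypothetical toy model (assumed for illustration only).
STATES = ["Rainy", "Sunny"]
START_P = {"Rainy": 0.6, "Sunny": 0.4}
TRANS_P = {
    "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
    "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
}
EMIT_P = {
    "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
    "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
}


def viterbi(obs):
    """Return (probability, path) of the most likely hidden state sequence."""
    # best[s] holds the highest-probability path that ends in state s,
    # together with its probability, for the observations seen so far.
    best = {s: (START_P[s] * EMIT_P[s][obs[0]], [s]) for s in STATES}
    for o in obs[1:]:
        new_best = {}
        for s in STATES:
            # Choose the predecessor state that maximises the path probability.
            prob, path = max(
                ((best[r][0] * TRANS_P[r][s], best[r][1]) for r in STATES),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * EMIT_P[s][o], path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])


best_prob, best_path = viterbi(["walk", "shop", "clean"])
```

The recursion has the same shape as the forward computation of the sequence probability, except that the sum over predecessor states is replaced by a maximum and the winning path is carried along.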
10350340@unknown@formal@none@1@S@=== A concrete example ===@@@@1@5@@danf@17-8-2009
10350350@unknown@formal@none@1@S@''This example is further elaborated in the [[Viterbi algorithm]] page.''@@@@1@10@@danf@17-8-2009
10350360@unknown@formal@none@1@S@===Applications of hidden Markov models===@@@@1@5@@danf@17-8-2009
10350370@unknown@formal@none@1@S@* [[Cryptanalysis]]@@@@1@2@@danf@17-8-2009
10350380@unknown@formal@none@1@S@* [[Speech recognition]]@@@@1@3@@danf@17-8-2009
10350390@unknown@formal@none@1@S@* [[Machine translation]]@@@@1@3@@danf@17-8-2009
10350400@unknown@formal@none@1@S@* [[Partial discharge]]@@@@1@3@@danf@17-8-2009
10350410@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009
10350420@unknown@formal@none@1@S@Hidden Markov Models were first described in a series of statistical papers by [[Leonard E. Baum]] and other authors in the second half of the 1960s.@@@@1@26@@danf@17-8-2009
10350430@unknown@formal@none@1@S@One of the first applications of HMMs was [[speech recognition]], starting in the mid-1970s.@@@@1@14@@danf@17-8-2009
10350440@unknown@formal@none@1@S@In the second half of the 1980s, HMMs began to be applied to the analysis of biological sequences, in particular [[DNA]].@@@@1@21@@danf@17-8-2009
10350450@unknown@formal@none@1@S@Since then, they have become ubiquitous in the field of [[bioinformatics]].@@@@1@11@@danf@17-8-2009
10360010@unknown@formal@none@1@S@HTML@@@@1@1@@danf@17-8-2009
10360020@unknown@formal@none@1@S@'''HTML''', an [[Acronym and initialism|initialism]] of '''HyperText Markup Language''', is the predominant [[markup language]] for [[web page]]s.@@@@1@17@@danf@17-8-2009
10360030@unknown@formal@none@1@S@It provides a means to describe the structure of text-based information in a document — by denoting certain text as links, headings, paragraphs, lists, and so on — and to supplement that text with ''interactive forms'', embedded ''images'', and other objects.@@@@1@41@@danf@17-8-2009
10360040@unknown@formal@none@1@S@HTML is written in the form of tags, surrounded by [[Brackets#Angle brackets or chevrons .3C .3E|angle brackets]].@@@@1@17@@danf@17-8-2009
10360050@unknown@formal@none@1@S@HTML can also describe, to some degree, the appearance and [[semantics]] of a document, and can include embedded [[scripting language]] code (such as JavaScript) which can affect the behavior of [[Web browser]]s and other HTML processors.@@@@1@36@@danf@17-8-2009
10360060@unknown@formal@none@1@S@HTML is also often used to refer to content in specific languages, such as a [[MIME type]] text/html, or even more broadly as a generic term for HTML, whether in its [[XML]]-descended form (such as [[XHTML]] 1.0 and later) or its form descended directly from [[SGML]] (such as HTML 4.01 and earlier).@@@@1@52@@danf@17-8-2009
10360070@unknown@formal@none@1@S@By convention, HTML format data files use a file extension .html or .htm.@@@@1@13@@danf@17-8-2009
10360080@unknown@formal@none@1@S@==History of HTML==@@@@1@3@@danf@17-8-2009
10360090@unknown@formal@none@1@S@===Origins===@@@@1@1@@danf@17-8-2009
10360100@unknown@formal@none@1@S@In 1980, physicist [[Tim Berners-Lee]], who was an independent contractor at [[CERN]], proposed and prototyped [[ENQUIRE]], a system for CERN researchers to use and share documents.@@@@1@26@@danf@17-8-2009
10360110@unknown@formal@none@1@S@In 1989, Berners-Lee and CERN data systems engineer [[Robert Cailliau]] each submitted separate proposals for an [[Internet]]-based [[hypertext]] system providing similar functionality.@@@@1@22@@danf@17-8-2009
10360120@unknown@formal@none@1@S@The following year, they collaborated on a joint proposal, the WorldWideWeb (W3) project, which was accepted by CERN.@@@@1@18@@danf@17-8-2009
10360130@unknown@formal@none@1@S@===First specifications===@@@@1@2@@danf@17-8-2009
10360140@unknown@formal@none@1@S@The first publicly available description of HTML was a document called ''HTML Tags'', first mentioned on the Internet by Berners-Lee in late 1991.@@@@1@23@@danf@17-8-2009
10360150@unknown@formal@none@1@S@It describes 22 elements comprising the initial, relatively simple design of HTML.@@@@1@12@@danf@17-8-2009
10360160@unknown@formal@none@1@S@Thirteen of these elements still exist in HTML 4.@@@@1@9@@danf@17-8-2009
10360170@unknown@formal@none@1@S@Berners-Lee considered HTML to be, at the time, an application of [[SGML]], but it was not formally defined as such until the mid-1993 publication, by the [[Internet Engineering Task Force|IETF]], of the first proposal for an HTML specification: Berners-Lee and [[Dan Connolly]]'s "Hypertext Markup Language (HTML)" Internet-Draft, which included an SGML [[Document Type Definition]] to define the grammar.@@@@1@58@@danf@17-8-2009
10360180@unknown@formal@none@1@S@The draft expired after six months, but was notable for its acknowledgment of the [[Mosaic (web browser)|NCSA Mosaic]] browser's custom tag for embedding in-line images, reflecting the IETF's philosophy of basing standards on successful prototypes.@@@@1@35@@danf@17-8-2009
10360190@unknown@formal@none@1@S@Similarly, Dave Raggett's competing Internet-Draft, "HTML+ (Hypertext Markup Format)", from late 1993, suggested standardizing already-implemented features like tables and fill-out forms.@@@@1@21@@danf@17-8-2009
10360200@unknown@formal@none@1@S@After the HTML and HTML+ drafts expired in early 1994, the IETF created an HTML Working Group, which in 1995 completed "HTML 2.0", the first HTML specification intended to be treated as a standard on which future implementations should be based.@@@@1@41@@danf@17-8-2009
10360210@unknown@formal@none@1@S@Published as [[Request for Comments]] 1866, HTML 2.0 included ideas from the HTML and HTML+ drafts.@@@@1@16@@danf@17-8-2009
10360220@unknown@formal@none@1@S@There was no "HTML 1.0"; the 2.0 designation was intended to distinguish the new edition from previous drafts.@@@@1@18@@danf@17-8-2009
10360230@unknown@formal@none@1@S@Further development under the auspices of the IETF was stalled by competing interests.@@@@1@13@@danf@17-8-2009
10360240@unknown@formal@none@1@S@Since 1996, the HTML specifications have been maintained, with input from commercial software vendors, by the [[World Wide Web Consortium]] (W3C).@@@@1@21@@danf@17-8-2009
10360250@unknown@formal@none@1@S@However, in 2000, HTML also became an international standard ([[International Organization for Standardization|ISO]]/[[International Electrotechnical Commission|IEC]] 15445:2000).@@@@1@16@@danf@17-8-2009
10360260@unknown@formal@none@1@S@The last HTML specification published by the W3C is the HTML 4.01 Recommendation, published in late 1999.@@@@1@17@@danf@17-8-2009
10360270@unknown@formal@none@1@S@Its issues and errors were last acknowledged by errata published in 2001.@@@@1@12@@danf@17-8-2009
10360280@unknown@formal@none@1@S@===Version history of the standard===@@@@1@5@@danf@17-8-2009
10360290@unknown@formal@none@1@S@====HTML versions====@@@@1@2@@danf@17-8-2009
10360300@unknown@formal@none@1@S@'''July, 1993:''' [http://www.w3.org/MarkUp/draft-ietf-iiir-html-01.txt Hypertext Markup Language] was published as an [[Internet Engineering Task Force|IETF]] working draft (that is, not yet a standard).@@@@1@21@@danf@17-8-2009
10360310@unknown@formal@none@1@S@'''November, 1995:''' [http://tools.ietf.org/html/rfc1866 HTML 2.0] published as IETF [[Request for Comments]]:@@@@1@11@@danf@17-8-2009
10360320@unknown@formal@none@1@S@* RFC 1866,@@@@1@3@@danf@17-8-2009
10360330@unknown@formal@none@1@S@* supplemented by RFC 1867 (form-based file upload) that same month,@@@@1@11@@danf@17-8-2009
10360340@unknown@formal@none@1@S@* RFC 1942 (tables) in ''May 1996'',@@@@1@7@@danf@17-8-2009
10360350@unknown@formal@none@1@S@* RFC 1980 (client-side image maps) in ''August 1996'', and@@@@1@10@@danf@17-8-2009
10360360@unknown@formal@none@1@S@* RFC 2070 ([[internationalization and localization|internationalization]]) in ''January 1997'';@@@@1@9@@danf@17-8-2009
10360370@unknown@formal@none@1@S@Ultimately, all were declared obsolete/historic by RFC 2854 in ''June 2000''.@@@@1@11@@danf@17-8-2009
10360380@unknown@formal@none@1@S@'''April 1995''': [http://www.w3.org/MarkUp/html3/ HTML 3.0], proposed as a standard to the IETF.@@@@1@12@@danf@17-8-2009
10360390@unknown@formal@none@1@S@It included many of the capabilities that were in Raggett's HTML+ proposal, such as support for tables, text flow around figures, and the display of complex mathematical formulas.@@@@1@28@@danf@17-8-2009
10360400@unknown@formal@none@1@S@A demonstration appeared in W3C's own [[Arena (web browser)|Arena browser]].@@@@1@10@@danf@17-8-2009
10360410@unknown@formal@none@1@S@HTML 3.0 did not succeed for several reasons.@@@@1@8@@danf@17-8-2009
10360420@unknown@formal@none@1@S@The pace of browser development, as well as the number of interested parties, had outstripped the resources of the IETF.@@@@1@20@@danf@17-8-2009
10360430@unknown@formal@none@1@S@Netscape continued to introduce HTML elements that specified the visual appearance of documents, contrary to the goals of the newly-formed W3C, which sought to limit HTML to describing logical structure.@@@@1@30@@danf@17-8-2009
10360440@unknown@formal@none@1@S@Microsoft, a newcomer at the time, played to all sides by creating its own tags, implementing Netscape's elements for compatibility, and supporting W3C features such as Cascading Style Sheets.@@@@1@29@@danf@17-8-2009
10360450@unknown@formal@none@1@S@'''[[January 14]], [[1997]]:''' [http://www.w3.org/TR/REC-html32 HTML 3.2], published as a [[W3C Recommendation]].@@@@1@11@@danf@17-8-2009
10360460@unknown@formal@none@1@S@It was the first version developed and standardized exclusively by the W3C, as the IETF had closed its HTML Working Group in September 1996.@@@@1@24@@danf@17-8-2009
10360470@unknown@formal@none@1@S@The new version dropped math formulas entirely, reconciled overlap among various proprietary extensions, and adopted most of Netscape's visual markup tags.@@@@1@21@@danf@17-8-2009
10360480@unknown@formal@none@1@S@Netscape's [[blink element]] and Microsoft's [[marquee element]] were omitted due to a mutual agreement between the two companies.@@@@1@18@@danf@17-8-2009
10360490@unknown@formal@none@1@S@The ability to include mathematical formulas in HTML would not be standardized until years later in [[MathML]].@@@@1@17@@danf@17-8-2009
10360500@unknown@formal@none@1@S@'''[[December 18]], [[1997]]:''' [http://www.w3.org/TR/REC-html40-971218/ HTML 4.0], published as a W3C Recommendation.@@@@1@11@@danf@17-8-2009
10360510@unknown@formal@none@1@S@It offers three "flavors":@@@@1@4@@danf@17-8-2009
10360520@unknown@formal@none@1@S@* Strict, in which deprecated elements are forbidden,@@@@1@8@@danf@17-8-2009
10360530@unknown@formal@none@1@S@* Transitional, in which deprecated elements are allowed,@@@@1@8@@danf@17-8-2009
10360540@unknown@formal@none@1@S@* Frameset, in which mostly only [[Framing (World Wide Web)|frame]]-related elements are allowed;@@@@1@14@@danf@17-8-2009
10360550@unknown@formal@none@1@S@HTML 4.0 (initially code-named "Cougar") likewise adopted many browser-specific element types and attributes, but at the same time sought to phase out Netscape's visual markup features by marking them as [[deprecation|deprecated]] in favor of style sheets.@@@@1@36@@danf@17-8-2009
10360560@unknown@formal@none@1@S@Minor editorial revisions to the HTML 4.0 specification were published in 1998 without incrementing the version number, and further minor revisions were published as HTML 4.01.@@@@1@24@@danf@17-8-2009
10360570@unknown@formal@none@1@S@'''[[April 24]], [[1998]]:''' [http://www.w3.org/TR/1998/REC-html40-19980424/ HTML 4.0] was reissued with minor edits without incrementing the version number.@@@@1@16@@danf@17-8-2009
10360580@unknown@formal@none@1@S@'''[[December 24]], [[1999]]:''' [http://www.w3.org/TR/html401 HTML 4.01], published as a W3C Recommendation.@@@@1@11@@danf@17-8-2009
10360590@unknown@formal@none@1@S@It offers the same three flavors as HTML 4.0, and its last [http://www.w3.org/MarkUp/html4-updates/errata errata] were published [[May 12]], [[2001]].@@@@1@19@@danf@17-8-2009
10360600@unknown@formal@none@1@S@HTML 4.01 and ISO/IEC 15445:2000 are the most recent and final versions of HTML.@@@@1@14@@danf@17-8-2009
10360610@unknown@formal@none@1@S@'''[[May 15]], [[2000]]:''' [https://www.cs.tcd.ie/15445/15445.HTML ISO/IEC 15445:2000] ("[[International Organization for Standardization|ISO]] HTML", based on HTML 4.01 Strict), published as an ISO/IEC international standard.@@@@1@22@@danf@17-8-2009
10360620@unknown@formal@none@1@S@'''[[January 22]], [[2008]]:''' [http://www.w3.org/TR/html5/ HTML 5], published as a Working Draft by W3C.@@@@1@13@@danf@17-8-2009
10360630@unknown@formal@none@1@S@====XHTML versions====@@@@1@2@@danf@17-8-2009
10360640@unknown@formal@none@1@S@XHTML is a separate language that began as a reformulation of HTML 4.01 using XML 1.0.@@@@1@16@@danf@17-8-2009
10360650@unknown@formal@none@1@S@It continues to be developed:@@@@1@5@@danf@17-8-2009
10360660@unknown@formal@none@1@S@* [http://www.w3.org/TR/xhtml1/ XHTML 1.0], published [[January 26]], [[2000]] as a W3C Recommendation, later revised and republished [[August 1]], [[2002]].@@@@1@19@@danf@17-8-2009
10360670@unknown@formal@none@1@S@It offers the same three flavors as HTML 4.0 and 4.01, reformulated in XML, with minor restrictions.@@@@1@17@@danf@17-8-2009
10360680@unknown@formal@none@1@S@* [http://www.w3.org/TR/xhtml11/ XHTML 1.1], published [[May 31]], [[2001]] as a W3C Recommendation.@@@@1@12@@danf@17-8-2009
10360690@unknown@formal@none@1@S@It is based on XHTML 1.0 Strict, but includes minor changes, can be customized, and is reformulated using modules from [http://www.w3.org/TR/xhtml-modularization Modularization of XHTML], which was published [[April 10]], [[2001]] as a W3C Recommendation.@@@@1@34@@danf@17-8-2009
10360700@unknown@formal@none@1@S@* [http://www.w3.org/TR/xhtml2/ XHTML 2.0] is still a W3C Working Draft.@@@@1@10@@danf@17-8-2009
10360710@unknown@formal@none@1@S@XHTML 2.0 is incompatible with XHTML 1.x; it would therefore be more accurate to characterize it as an XHTML-inspired new language than as an update to XHTML 1.x.@@@@1@26@@danf@17-8-2009
10360720@unknown@formal@none@1@S@* XHTML 5, which is an update to XHTML 1.x, is being defined alongside [[HTML 5]] in the [http://www.w3.org/html/wg/html5/ HTML 5 draft].@@@@1@22@@danf@17-8-2009
10360730@unknown@formal@none@1@S@==HTML markup==@@@@1@2@@danf@17-8-2009
10360740@unknown@formal@none@1@S@HTML markup consists of several key components, including ''elements'' (and their ''attributes''), character-based ''data types'', and ''character references'' and ''entity references''.@@@@1@21@@danf@17-8-2009
10360750@unknown@formal@none@1@S@Another important component is the ''document type declaration''.@@@@1@8@@danf@17-8-2009
10360760@unknown@formal@none@1@S@HTML [[Hello world program|Hello World]]: @@@@1@18@@danf@17-8-2009
10360770@unknown@formal@none@1@S@===Elements===@@@@1@1@@danf@17-8-2009
10360780@unknown@formal@none@1@S@:''See [[HTML element]]s for more detailed descriptions.''@@@@1@7@@danf@17-8-2009
10360790@unknown@formal@none@1@S@Elements are the basic structure for HTML markup.@@@@1@8@@danf@17-8-2009
10360800@unknown@formal@none@1@S@Elements have two basic properties: attributes and content.@@@@1@8@@danf@17-8-2009
10360810@unknown@formal@none@1@S@Each attribute and each element's content has certain restrictions that must be followed for an HTML document to be considered valid.@@@@1@21@@danf@17-8-2009
10360820@unknown@formal@none@1@S@An element usually has a start tag (e.g. <element-name>) and an end tag (e.g. </element-name>).@@@@1@15@@danf@17-8-2009
10360830@unknown@formal@none@1@S@The element's attributes are contained in the start tag and content is located between the tags (e.g. <element-name attribute="value">Content</element-name>).@@@@1@18@@danf@17-8-2009
10360840@unknown@formal@none@1@S@Some elements, such as <br>, do not have any content and must not have a closing tag.@@@@1@17@@danf@17-8-2009
10360850@unknown@formal@none@1@S@Listed below are several types of markup elements used in HTML.@@@@1@11@@danf@17-8-2009
10360860@unknown@formal@none@1@S@'''Structural''' markup describes the purpose of text.@@@@1@7@@danf@17-8-2009
10360870@unknown@formal@none@1@S@For example, <h2>Golf</h2> establishes "Golf" as a second-level [[heading]], which would be rendered in a browser in a manner similar to the "HTML markup" title at the start of this section.@@@@1@31@@danf@17-8-2009
10360880@unknown@formal@none@1@S@Structural markup does not denote any specific rendering, but most Web browsers have standardized on how elements should be formatted.@@@@1@20@@danf@17-8-2009
10360890@unknown@formal@none@1@S@Text may be further styled with [[Cascading Style Sheets]] (CSS).@@@@1@10@@danf@17-8-2009
10360900@unknown@formal@none@1@S@'''Presentational''' markup describes the appearance of the text, regardless of its function.@@@@1@12@@danf@17-8-2009
10360910@unknown@formal@none@1@S@For example, <b>boldface</b> indicates that visual output devices should render "boldface" in bold text, but gives no indication of what devices that are unable to do this (such as aural devices that read the text aloud) should do.@@@@1@37@@danf@17-8-2009
10360920@unknown@formal@none@1@S@In the case of both <b>bold</b> and <i>italic</i>, there are elements which usually have an equivalent visual rendering but are more semantic in nature, namely <strong>strong emphasis</strong> and <em>emphasis</em> respectively.@@@@1@30@@danf@17-8-2009
10360930@unknown@formal@none@1@S@It is easier to see how an aural user agent should interpret the latter two elements.@@@@1@16@@danf@17-8-2009
10360940@unknown@formal@none@1@S@However, they are not equivalent to their presentational counterparts: it would be undesirable for a screen-reader to emphasize the name of a book, for instance, but on a screen such a name would be italicized.@@@@1@35@@danf@17-8-2009
10360950@unknown@formal@none@1@S@Most presentational markup elements have become [[Deprecation|deprecated]] under the HTML 4.0 specification, in favor of [[Cascading Style Sheets|CSS]] based style design.@@@@1@21@@danf@17-8-2009
10360960@unknown@formal@none@1@S@'''Hypertext''' markup links parts of the document to other documents.@@@@1@10@@danf@17-8-2009
10360970@unknown@formal@none@1@S@HTML up through version [[XHTML]] 1.1 requires the use of an anchor element to create a hyperlink in the flow of text: <a>Wikipedia</a>.@@@@1@23@@danf@17-8-2009
10360980@unknown@formal@none@1@S@However, the href attribute must also be set to a valid [[Uniform Resource Locator|URL]]; for example, the HTML code <a href="http://en.wikipedia.org/">Wikipedia</a> will render the word "[http://en.wikipedia.org/ Wikipedia]" as a [[hyperlink]].@@@@1@31@@danf@17-8-2009
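10360985@unknown@formal@none@1@S@To make an image act as a link, the anchor element encloses an img element, using the following syntax: <a href="url"><img src="image.gif" alt="alternative text"></a>@@@@1@16@@danf@17-8-2009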
10360990@unknown@formal@none@1@S@===Attributes===@@@@1@1@@danf@17-8-2009
10361000@unknown@formal@none@1@S@Most of the attributes of an element are name-value pairs, separated by "=", and written within the start tag of an element, after the element's name.@@@@1@26@@danf@17-8-2009
10361010@unknown@formal@none@1@S@The value may be enclosed in single or double quotes, although values consisting of certain characters can be left unquoted in HTML (but not XHTML).@@@@1@25@@danf@17-8-2009
10361020@unknown@formal@none@1@S@Leaving attribute values unquoted is considered unsafe.@@@@1@7@@danf@17-8-2009
10361030@unknown@formal@none@1@S@In contrast with name-value pair attributes, there are some attributes that affect the element simply by their presence in the start tag of the element (like the ismap attribute for the img element).@@@@1@33@@danf@17-8-2009
10361040@unknown@formal@none@1@S@Most elements can take any of several common attributes:@@@@1@9@@danf@17-8-2009
10361050@unknown@formal@none@1@S@* The id attribute provides a document-wide unique identifier for an element.@@@@1@12@@danf@17-8-2009
10361060@unknown@formal@none@1@S@This can be used by stylesheets to provide presentational properties, by browsers to focus attention on the specific element, or by scripts to alter the contents or presentation of an element.@@@@1@31@@danf@17-8-2009
10361070@unknown@formal@none@1@S@* The class attribute provides a way of classifying similar elements for presentation purposes.@@@@1@14@@danf@17-8-2009
10361080@unknown@formal@none@1@S@For example, an HTML document might use the designation class="notation" to indicate that all elements with this class value are subordinate to the main text of the document.@@@@1@28@@danf@17-8-2009
10361090@unknown@formal@none@1@S@Such elements might be gathered together and presented as footnotes on a page instead of appearing in the place where they occur in the HTML source.@@@@1@26@@danf@17-8-2009
10361100@unknown@formal@none@1@S@* An author may use the style attribute to assign presentational properties to a particular element.@@@@1@15@@danf@17-8-2009
10361110@unknown@formal@none@1@S@It is considered better practice to use an element's id or class attributes to select the element with a stylesheet, though sometimes this can be too cumbersome for a simple ad hoc application of styled properties.@@@@1@35@@danf@17-8-2009
10361120@unknown@formal@none@1@S@* The title attribute is used to attach subtextual explanation to an element.@@@@1@13@@danf@17-8-2009
10361130@unknown@formal@none@1@S@In most browsers this attribute is displayed as what is often referred to as a [[tooltip]].@@@@1@16@@danf@17-8-2009
10361140@unknown@formal@none@1@S@The generic inline element span can be used to demonstrate these various attributes:@@@@1@13@@danf@17-8-2009
10361150@unknown@formal@none@1@S@::<span id="anId" class="aClass" style="color:blue;" title="Hypertext Markup Language">HTML</span>@@@@1@8@@danf@17-8-2009
10361160@unknown@formal@none@1@S@This example displays as HTML; in most browsers, pointing the cursor at the abbreviation should display the title text "Hypertext Markup Language."@@@@1@28@@danf@17-8-2009
10361170@unknown@formal@none@1@S@Most elements also take the language-related attributes lang and dir.@@@@1@10@@danf@17-8-2009
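The attribute syntax described above can be observed with a small parser. The following is a minimal sketch using Python's standard `html.parser` module; the class name `AttributeLister` and the sample markup (including the file name `map.png`) are illustrative assumptions, not part of any specification.

```python
from html.parser import HTMLParser


class AttributeLister(HTMLParser):
    """Collects (tag, attributes) pairs for every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples with the quotes removed;
        # an attribute given without a value (such as ismap) has value None.
        self.seen.append((tag, dict(attrs)))


# Hypothetical markup: double-quoted and single-quoted values, plus one
# attribute that takes effect merely by being present.
markup = (
    '<span id="anId" class="aClass" style="color:blue" '
    "title='Hypertext Markup Language'>HTML</span>"
    '<img src="map.png" ismap>'
)

lister = AttributeLister()
lister.feed(markup)
lister.close()
for tag, attributes in lister.seen:
    print(tag, attributes)
```

Both quoting styles yield the same kind of name-value pair, while the presence-only attribute is reported with no value at all.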
10361180@unknown@formal@none@1@S@===Character and entity references===@@@@1@4@@danf@17-8-2009
10361190@unknown@formal@none@1@S@As of version 4.0, HTML defines a set of [[List of XML and HTML character entity references|252]] [[character entity reference]]s and a set of 1,114,050 [[numeric character reference]]s, both of which allow individual characters to be written via simple markup, rather than literally.@@@@1@43@@danf@17-8-2009
10361200@unknown@formal@none@1@S@A literal character and its markup counterpart are considered equivalent and are rendered identically.@@@@1@14@@danf@17-8-2009
10361210@unknown@formal@none@1@S@The ability to "escape" characters in this way allows for the characters < and & (when written as &lt; and &amp;, respectively) to be interpreted as character data, rather than markup.@@@@1@31@@danf@17-8-2009
10361220@unknown@formal@none@1@S@For example, a literal < normally indicates the start of a tag, and & normally indicates the start of a character entity reference or numeric character reference; writing it as &amp; or &#x26; or &#38; allows & to be included in the content of elements or the values of attributes.@@@@1@50@@danf@17-8-2009
10361230@unknown@formal@none@1@S@The double-quote character ("), when used to quote an attribute value, must also be escaped as &quot; or &#x22; or &#34; when it appears within the attribute value itself.@@@@1@29@@danf@17-8-2009
10361240@unknown@formal@none@1@S@The single-quote character ('), when used to quote an attribute value, must also be escaped as &#x27; or &#39; (should NOT be escaped as &apos; except in XHTML documents) when it appears within the attribute value itself.@@@@1@37@@danf@17-8-2009
10361250@unknown@formal@none@1@S@However, since document authors often overlook the need to escape these characters, browsers tend to be very forgiving, treating them as markup only when subsequent text appears to confirm that intent.@@@@1@31@@danf@17-8-2009
10361260@unknown@formal@none@1@S@Escaping also allows for characters that are not easily typed or that aren't even available in the document's [[character encoding]] to be represented within the element and attribute content.@@@@1@29@@danf@17-8-2009
10361270@unknown@formal@none@1@S@For example, the acute-accented e (é), a character typically found only on Western European keyboards, can be written in any HTML document as the entity reference &eacute; or as the numeric references &#233; or &#xE9;.@@@@1@35@@danf@17-8-2009
10361280@unknown@formal@none@1@S@The characters comprising those references (that is, the &, the ;, the letters in eacute, and so on) are available on all keyboards and are supported in all character encodings, whereas the literal é is not.@@@@1@36@@danf@17-8-2009
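The escaping rules in this section can be tried out with Python's standard `html` module. This is a minimal sketch; the sample string is an assumption chosen only to contain the characters discussed above.

```python
import html

# Sample text containing the characters that need escaping in markup.
text = 'Fish & Chips <"caf\u00e9">'

# quote=True also escapes the double and single quote characters, which
# matters when the text is placed inside a quoted attribute value.
escaped = html.escape(text, quote=True)
print(escaped)

# Unescaping restores the literal characters.
restored = html.unescape(escaped)
print(restored == text)

# An entity reference and two numeric references for the same character.
for reference in ("&eacute;", "&#233;", "&#xE9;"):
    print(reference, "->", html.unescape(reference))
```

Note that `html.escape` writes the single quote as a numeric reference rather than as a named one, which matches the advice above for HTML documents.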
10361290@unknown@formal@none@1@S@===Data types===@@@@1@2@@danf@17-8-2009
10361300@unknown@formal@none@1@S@HTML defines several [[data type]]s for element content, such as script data and stylesheet data, and a plethora of types for attribute values, including IDs, names, URIs, numbers, units of length, languages, media descriptors, colors, character encodings, dates and times, and so on.@@@@1@43@@danf@17-8-2009
10361310@unknown@formal@none@1@S@All of these data types are specializations of character data.@@@@1@10@@danf@17-8-2009
10361320@unknown@formal@none@1@S@===The Document Type Declaration===@@@@1@4@@danf@17-8-2009
10361330@unknown@formal@none@1@S@In order to enable [[Document Type Definition]] (DTD)-based validation with SGML tools and in order to avoid the [[quirks mode]] in browsers, HTML documents can start with a [[Document Type Declaration]] (informally, a "DOCTYPE").@@@@1@34@@danf@17-8-2009
10361340@unknown@formal@none@1@S@The DTD to which the DOCTYPE refers contains machine-readable grammar specifying the permitted and prohibited content for a document conforming to such a DTD.@@@@1@24@@danf@17-8-2009
10361350@unknown@formal@none@1@S@Browsers do not necessarily read the DTD, however.@@@@1@8@@danf@17-8-2009
10361360@unknown@formal@none@1@S@The most popular graphical browsers use DOCTYPE declarations (or the lack thereof) and other data at the beginning of sources to determine which rendering mode to use.@@@@1@27@@danf@17-8-2009
10361370@unknown@formal@none@1@S@For example:@@@@1@2@@danf@17-8-2009
10361380@unknown@formal@none@1@S@:<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">@@@@1@7@@danf@17-8-2009
10361390@unknown@formal@none@1@S@This declaration references the Strict DTD of HTML 4.01, which does not have presentational elements like <font>, leaving formatting to Cascading Style Sheets and the span and div tags.@@@@1@29@@danf@17-8-2009
10361400@unknown@formal@none@1@S@SGML-based validators read the DTD in order to properly parse the document and to perform validation.@@@@1@16@@danf@17-8-2009
10361410@unknown@formal@none@1@S@In modern browsers, the HTML 4.01 Strict doctype activates standards layout mode for [[Cascading Style Sheets|CSS]] as opposed to [[quirks mode]].@@@@1@21@@danf@17-8-2009
10361420@unknown@formal@none@1@S@In addition, HTML 4.01 provides Transitional and Frameset DTDs.@@@@1@9@@danf@17-8-2009
10361430@unknown@formal@none@1@S@The Transitional DTD was intended to gradually phase in the changes made in the Strict DTD, while the Frameset DTD was intended for those documents which contained frames.@@@@1@28@@danf@17-8-2009
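As an illustration of how software might read a Document Type Declaration, the sketch below maps the public identifiers of the three HTML 4.01 DTDs to their flavors. The function name `html401_flavor` and the regular expression are assumptions made for this example; real browsers and validators apply considerably more detailed rules.

```python
import re

# Public identifiers of the three HTML 4.01 DTDs.
HTML401_FLAVORS = {
    "-//W3C//DTD HTML 4.01//EN": "Strict",
    "-//W3C//DTD HTML 4.01 Transitional//EN": "Transitional",
    "-//W3C//DTD HTML 4.01 Frameset//EN": "Frameset",
}

# Matches only a DOCTYPE with a PUBLIC identifier at the start of the text.
DOCTYPE_PATTERN = re.compile(
    r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]+)"', re.IGNORECASE
)


def html401_flavor(source):
    """Return the HTML 4.01 flavor named by a leading DOCTYPE, or None."""
    match = DOCTYPE_PATTERN.match(source.lstrip())
    if match is None:
        return None
    return HTML401_FLAVORS.get(match.group(1))


strict_page = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    '"http://www.w3.org/TR/html4/strict.dtd">\n'
    "<title>Example</title><p>Hello"
)
print(html401_flavor(strict_page))
print(html401_flavor("<p>No declaration here"))
```

A document without a declaration yields `None` here; a browser would typically fall back to quirks mode in that situation.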
10361440@unknown@formal@none@1@S@==Semantic HTML==@@@@1@2@@danf@17-8-2009
10361450@unknown@formal@none@1@S@There is no official specification called "Semantic HTML", though the strict flavors of HTML discussed [[#Current flavors of HTML|below]] are a push in that direction.@@@@1@25@@danf@17-8-2009
10361460@unknown@formal@none@1@S@Rather, semantic HTML refers to an objective and a practice to create documents with HTML that contain only the author's intended meaning, without any reference to how this meaning is presented or conveyed.@@@@1@33@@danf@17-8-2009
10361470@unknown@formal@none@1@S@A classic example is the distinction between the emphasis element (<em>) and the italics element (<i>).@@@@1@16@@danf@17-8-2009
10361480@unknown@formal@none@1@S@Often the emphasis element is displayed in italics, so the presentation is typically the same.@@@@1@15@@danf@17-8-2009
10361490@unknown@formal@none@1@S@However, emphasizing something is different from listing the title of a book, for example, which may also be displayed in italics.@@@@1@21@@danf@17-8-2009
10361500@unknown@formal@none@1@S@In purely semantic HTML, a book title would use a different element than emphasized text uses (for example a <span>), because they are meaningfully different things.@@@@1@26@@danf@17-8-2009
10361510@unknown@formal@none@1@S@The goal of semantic HTML requires two things of authors:@@@@1@10@@danf@17-8-2009
10361520@unknown@formal@none@1@S@# To avoid the use of presentational markup (elements, attributes, and other entities).@@@@1@13@@danf@17-8-2009
10361530@unknown@formal@none@1@S@# To use available markup to differentiate the meanings of phrases and structure in the document.@@@@1@16@@danf@17-8-2009
10361540@unknown@formal@none@1@S@So for example, the book title from above would need to have its own element and class specified, such as <cite class="booktitle">The Grapes of Wrath</cite>.@@@@1@25@@danf@17-8-2009
10361545@unknown@formal@none@1@S@Here, the <cite> element is used because it most closely matches the meaning of this phrase in the text.@@@@1@19@@danf@17-8-2009
10361550@unknown@formal@none@1@S@However, the <cite> element is not specific enough to this task, since we mean to cite specifically a book title as opposed to a newspaper article or an academic journal.@@@@1@30@@danf@17-8-2009
10361560@unknown@formal@none@1@S@Semantic HTML also requires complementary specifications and software compliance with these specifications.@@@@1@12@@danf@17-8-2009
10361570@unknown@formal@none@1@S@Primarily, the development and proliferation of [[Cascading Style Sheets|CSS]] has led to increasing support for semantic HTML, because CSS provides designers with a rich language to alter the presentation of semantic-only documents.@@@@1@32@@danf@17-8-2009
10361580@unknown@formal@none@1@S@With the development of CSS, the need to include presentational properties in a document has virtually disappeared.@@@@1@17@@danf@17-8-2009
10361590@unknown@formal@none@1@S@With the advent and refinement of CSS and the increasing support for it in Web browsers, subsequent editions of HTML increasingly stress only using markup that suggests the semantic structure and phrasing of the document, like headings, paragraphs, quotes, and lists, instead of using markup which is written for visual purposes only, like <font>, <b> (bold), and <i> (italics).@@@@1@59@@danf@17-8-2009
10361600@unknown@formal@none@1@S@Some of these elements are not permitted in certain varieties of HTML, like HTML 4.01 Strict.@@@@1@16@@danf@17-8-2009
10361610@unknown@formal@none@1@S@CSS provides a way to separate document semantics from the content's presentation, by keeping everything relevant to presentation defined in a CSS file.@@@@1@23@@danf@17-8-2009
10361620@unknown@formal@none@1@S@See [[separation of style and content]].@@@@1@6@@danf@17-8-2009
10361630@unknown@formal@none@1@S@Semantic HTML offers many advantages.@@@@1@5@@danf@17-8-2009
10361640@unknown@formal@none@1@S@First, it ensures consistency in style across elements that have the same meaning.@@@@1@13@@danf@17-8-2009
10361650@unknown@formal@none@1@S@Every heading, every quotation, every similar element receives the same presentation properties.@@@@1@12@@danf@17-8-2009
10361660@unknown@formal@none@1@S@Second, semantic HTML frees authors from the need to concern themselves with presentation details.@@@@1@14@@danf@17-8-2009
10361670@unknown@formal@none@1@S@When writing the number two, for example, should it be written out in words ("two"), or should it be written as a numeral (2)?@@@@1@24@@danf@17-8-2009
10361680@unknown@formal@none@1@S@A semantic markup might enter something like <number>2</number> and leave presentation details to the stylesheet designers.@@@@1@16@@danf@17-8-2009
10361690@unknown@formal@none@1@S@Similarly, an author might wonder where to break out quotations into separate indented blocks of text: with purely semantic HTML, such details would be left up to stylesheet designers.@@@@1@29@@danf@17-8-2009
10361700@unknown@formal@none@1@S@Authors would simply indicate quotations when they occur in the text, and not concern themselves with presentation.@@@@1@17@@danf@17-8-2009
10361710@unknown@formal@none@1@S@A third advantage is device independence and repurposing of documents.@@@@1@10@@danf@17-8-2009
10361720@unknown@formal@none@1@S@A semantic HTML document can be paired with any number of stylesheets to provide output to computer screens (through Web browsers), high-resolution printers, handheld devices, aural browsers or braille devices for those with visual impairments, and so on.@@@@1@38@@danf@17-8-2009
10361730@unknown@formal@none@1@S@To accomplish this, nothing needs to be changed in a well-coded semantic HTML document.@@@@1@14@@danf@17-8-2009
10361740@unknown@formal@none@1@S@Readily available stylesheets make this a simple matter of pairing a semantic HTML document with the appropriate stylesheets.@@@@1@18@@danf@17-8-2009
10361750@unknown@formal@none@1@S@(Of course, the stylesheet's selectors need to match the appropriate properties in the HTML document.)@@@@1@15@@danf@17-8-2009
10361760@unknown@formal@none@1@S@Some aspects of authoring documents make separating semantics from style (in other words, meaning from presentation) difficult.@@@@1@17@@danf@17-8-2009
10361770@unknown@formal@none@1@S@Some elements are hybrids, using presentation in their very meaning.@@@@1@10@@danf@17-8-2009
10361780@unknown@formal@none@1@S@For example, a table displays content in a tabular form.@@@@1@10@@danf@17-8-2009
10361790@unknown@formal@none@1@S@Often such content conveys the meaning only when presented in this way.@@@@1@12@@danf@17-8-2009
10361800@unknown@formal@none@1@S@Repurposing a table for an aural device typically involves somehow presenting the table, an inherently visual element, in an audible form.@@@@1@22@@danf@17-8-2009
10361810@unknown@formal@none@1@S@Conversely, song lyrics, which are inherently meant for audible presentation, are frequently presented in textual form on a Web page.@@@@1@24@@danf@17-8-2009
10361820@unknown@formal@none@1@S@For these types of elements, the meaning is not so easily separated from their presentation.@@@@1@15@@danf@17-8-2009
10361830@unknown@formal@none@1@S@However, for a great many of the elements used and meanings conveyed in HTML, the translation is relatively smooth.@@@@1@19@@danf@17-8-2009
10361840@unknown@formal@none@1@S@==Delivery of HTML==@@@@1@3@@danf@17-8-2009
10361850@unknown@formal@none@1@S@HTML documents can be delivered by the same means as any other computer file; however, they are most often delivered in one of two ways: by [[HTTP]] from a Web server or by e-mail.@@@@1@31@@danf@17-8-2009
10361860@unknown@formal@none@1@S@===Publishing HTML with HTTP===@@@@1@4@@danf@17-8-2009
10361870@unknown@formal@none@1@S@The [[World Wide Web]] is composed primarily of HTML documents transmitted from a [[Web server]] to a Web browser using the [[Hypertext Transfer Protocol]] (HTTP).@@@@1@25@@danf@17-8-2009
10361880@unknown@formal@none@1@S@However, HTTP can be used to serve images, sound, and other content in addition to HTML.@@@@1@16@@danf@17-8-2009
10361890@unknown@formal@none@1@S@To allow the Web browser to know how to handle the document it received, an indication of the [[file format]] of the document must be transmitted along with the document.@@@@1@30@@danf@17-8-2009
10361900@unknown@formal@none@1@S@This vital [[metadata]] includes the [[MIME]] type (text/html for HTML 4.01 and earlier, application/xhtml+xml for XHTML 1.0 and later) and the character encoding (see [[Character encodings in HTML]]).@@@@1@28@@danf@17-8-2009
10361910@unknown@formal@none@1@S@In modern browsers, the MIME type that is sent with the HTML document affects how the document is interpreted.@@@@1@19@@danf@17-8-2009
10361920@unknown@formal@none@1@S@A document sent with an XHTML MIME type, or ''served as application/xhtml+xml'', is expected to be [[XML#Well-formed documents|well-formed]] XML, and a syntax error causes the browser to fail to render the document.@@@@1@32@@danf@17-8-2009
10361930@unknown@formal@none@1@S@The same document sent with an HTML MIME type, or ''served as text/html'', might be displayed successfully, since Web browsers are more lenient with HTML.@@@@1@25@@danf@17-8-2009
10361940@unknown@formal@none@1@S@However, XHTML parsed in this way is not considered either proper XHTML or HTML, but so-called [[tag soup]].@@@@1@18@@danf@17-8-2009
10361950@unknown@formal@none@1@S@If the MIME type is not recognized as HTML, the Web browser should not attempt to render the document as HTML, even if the document is prefaced with a correct Document Type Declaration.@@@@1@33@@danf@17-8-2009
10361960@unknown@formal@none@1@S@Nevertheless, some Web browsers do examine the contents or URL of the document and attempt to infer the file type, despite this being forbidden by the HTTP 1.1 specification.@@@@1@29@@danf@17-8-2009
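The effect of the MIME type on parsing can be sketched as follows. This is an illustrative model under stated assumptions, not how any particular browser is implemented: the function `process`, the class `TagCollector`, and the sample body are hypothetical, and the MIME type is assumed to arrive without parameters such as a charset.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Lenient HTML parsing: records start tags and never raises on tag soup."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def process(body, mime_type):
    """Choose a parsing strategy from the MIME type alone."""
    if mime_type == "application/xhtml+xml":
        # Served as XML: the document must be well-formed or it is rejected.
        try:
            ET.fromstring(body)
        except ET.ParseError:
            return "rejected: not well-formed XML"
        return "accepted as XML"
    if mime_type == "text/html":
        collector = TagCollector()
        collector.feed(body)
        collector.close()
        return "parsed leniently, start tags: " + ", ".join(collector.tags)
    # Any other type is not rendered as HTML, whatever the content looks like.
    return "not treated as HTML"


# The br element is never closed and the p element has no end tag.
body = "<html><body><p>One<br>Two</body></html>"
print(process(body, "application/xhtml+xml"))
print(process(body, "text/html"))
print(process(body, "text/plain"))
```

The same bytes are rejected under the XHTML MIME type and accepted under `text/html`, which is the contrast the preceding paragraphs describe.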
10361970@unknown@formal@none@1@S@===HTML e-mail===@@@@1@2@@danf@17-8-2009
10361980@unknown@formal@none@1@S@Most graphical [[e-mail]] clients allow the use of a subset of HTML (often ill-defined) to provide formatting and [[semantic web|semantic]] markup capabilities not available with [[plain text]], like emphasized text, block quotations for replies, and diagrams or mathematical formulas that could not easily be described otherwise.@@@@1@46@@danf@17-8-2009
10361990@unknown@formal@none@1@S@Many of these clients include both a [[GUI]] editor for composing HTML e-mail messages and a rendering engine for displaying received HTML messages.@@@@1@23@@danf@17-8-2009
10362000@unknown@formal@none@1@S@Use of HTML in e-mail is controversial because of compatibility issues, because it can be used in [[phishing]]/privacy attacks, because it can confuse [[E-Mail spam|spam]] filters, and because the message size is larger than plain text.@@@@1@36@@danf@17-8-2009
10362010@unknown@formal@none@1@S@===Naming conventions===@@@@1@2@@danf@17-8-2009
10362020@unknown@formal@none@1@S@The most common [[filename extension]] for [[computer file|files]] containing HTML is .html.@@@@1@12@@danf@17-8-2009
10362030@unknown@formal@none@1@S@A common abbreviation of this is .htm; it originates from older operating systems and file systems, such as the [[DOS]] versions from the 1980s and early 1990s and [[File Allocation Table|FAT]], which limit file extensions to three letters.@@@@1@38@@danf@17-8-2009
10362040@unknown@formal@none@1@S@Both forms are widely supported by browsers.@@@@1@7@@danf@17-8-2009
10362050@unknown@formal@none@1@S@==Current flavors of HTML==@@@@1@4@@danf@17-8-2009
10362060@unknown@formal@none@1@S@After its inception, HTML and its associated protocols gained acceptance relatively quickly.@@@@1@12@@danf@17-8-2009
10362070@unknown@formal@none@1@S@However, no clear standards existed in the early years of the language.@@@@1@12@@danf@17-8-2009
10362080@unknown@formal@none@1@S@Though its creators originally conceived of HTML as a semantic language devoid of presentation details, practical uses pushed many presentational elements and attributes into the language, driven largely by the various browser vendors.@@@@1@33@@danf@17-8-2009
10362090@unknown@formal@none@1@S@The latest standards surrounding HTML reflect efforts to overcome the sometimes chaotic development of the language and to create a rational foundation for building both meaningful and well-presented documents.@@@@1@29@@danf@17-8-2009
10362100@unknown@formal@none@1@S@To return HTML to its role as a semantic language, the [[World Wide Web Consortium|W3C]] has developed style languages such as [[Cascading Style Sheets|CSS]] and [[Extensible Stylesheet Language|XSL]] to shoulder the burden of presentation.@@@@1@34@@danf@17-8-2009
10362110@unknown@formal@none@1@S@In conjunction, the HTML specification has slowly reined in the presentational elements.@@@@1@12@@danf@17-8-2009
10362120@unknown@formal@none@1@S@There are two axes differentiating various flavors of HTML as currently specified: SGML-based HTML versus XML-based HTML (referred to as XHTML) on the one axis, and strict versus transitional (loose) versus frameset on the other axis.@@@@1@36@@danf@17-8-2009
10362130@unknown@formal@none@1@S@===SGML-based versus XML-based HTML===@@@@1@4@@danf@17-8-2009
10362140@unknown@formal@none@1@S@One difference in the latest HTML specifications lies in the distinction between the SGML-based specification and the XML-based specification.@@@@1@19@@danf@17-8-2009
10362150@unknown@formal@none@1@S@The XML-based specification is usually called XHTML to distinguish it clearly from the more traditional definition; however, the root element name continues to be 'html' even in the XHTML-specified HTML.@@@@1@30@@danf@17-8-2009
10362160@unknown@formal@none@1@S@The W3C intended XHTML 1.0 to be identical to HTML 4.01 except where limitations of XML over the more complex SGML require workarounds.@@@@1@23@@danf@17-8-2009
10362170@unknown@formal@none@1@S@Because XHTML and HTML are closely related, they are sometimes documented in parallel.@@@@1@13@@danf@17-8-2009
10362180@unknown@formal@none@1@S@In such circumstances, some authors conflate the two names as (X)HTML or X(HTML).@@@@1@13@@danf@17-8-2009
10362190@unknown@formal@none@1@S@Like HTML 4.01, XHTML 1.0 has three sub-specifications: strict, loose, and frameset.@@@@1@12@@danf@17-8-2009
10362200@unknown@formal@none@1@S@Aside from the different opening declarations for a document, the differences between an HTML 4.01 and XHTML 1.0 document—in each of the corresponding DTDs—are largely syntactic.@@@@1@26@@danf@17-8-2009
10362210@unknown@formal@none@1@S@The underlying syntax of HTML allows many shortcuts that XHTML does not, such as elements with optional opening or closing tags, and even EMPTY elements which must not have an end tag.@@@@1@32@@danf@17-8-2009
10362220@unknown@formal@none@1@S@By contrast, XHTML requires all elements to have both an opening tag and a closing tag.@@@@1@15@@danf@17-8-2009
10362230@unknown@formal@none@1@S@XHTML, however, also introduces a new shortcut: an XHTML tag may be opened and closed within the same tag, by including a slash before the end of the tag like this: <br/>.@@@@1@32@@danf@17-8-2009
10362240@unknown@formal@none@1@S@The introduction of this shorthand, which is not used in the SGML declaration for HTML 4.01, may confuse earlier software unfamiliar with this new convention.@@@@1@25@@danf@17-8-2009
10362250@unknown@formal@none@1@S@To understand the subtle differences between HTML and XHTML, consider the transformation of a valid and well-formed XHTML 1.0 document that adheres to Appendix C (see below) into a valid HTML 4.01 document.@@@@1@33@@danf@17-8-2009
10362260@unknown@formal@none@1@S@To make this translation requires the following steps:@@@@1@8@@danf@17-8-2009
10362270@unknown@formal@none@1@S@# '''The language for an element should be specified with a lang attribute rather than the XHTML xml:lang attribute.'''@@@@1@19@@danf@17-8-2009
10362280@unknown@formal@none@1@S@XHTML uses XML's built-in language-defining attribute.@@@@1@8@@danf@17-8-2009
10362290@unknown@formal@none@1@S@# '''Remove the XML namespace (xmlns=URI).'''@@@@1@6@@danf@17-8-2009
10362300@unknown@formal@none@1@S@HTML has no facilities for namespaces.@@@@1@6@@danf@17-8-2009
10362310@unknown@formal@none@1@S@# '''Change the document type declaration''' from XHTML 1.0 to HTML 4.01 (see [[#The Document Type Definition|DTD section]] for further explanation).@@@@1@21@@danf@17-8-2009
10362320@unknown@formal@none@1@S@# If present, '''remove the XML declaration.'''@@@@1@7@@danf@17-8-2009
10362330@unknown@formal@none@1@S@(Typically this is: <?xml version="1.0" encoding="utf-8"?>).@@@@1@6@@danf@17-8-2009
10362340@unknown@formal@none@1@S@# '''Ensure that the document’s MIME type is set to text/html.'''@@@@1@11@@danf@17-8-2009
10362350@unknown@formal@none@1@S@For both HTML and XHTML, this comes from the HTTP Content-Type header sent by the server.@@@@1@16@@danf@17-8-2009
10362360@unknown@formal@none@1@S@# '''Change the XML empty-element syntax to an HTML style empty element''' (<br/> to <br>).@@@@1@15@@danf@17-8-2009
10362370@unknown@formal@none@1@S@Those are the main changes necessary to translate a document from XHTML 1.0 to HTML 4.01.@@@@1@16@@danf@17-8-2009
10362380@unknown@formal@none@1@S@To translate from HTML to XHTML would also require the addition of any omitted opening or closing tags.@@@@1@18@@danf@17-8-2009
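The textual steps above can be sketched in Python. This is only an illustration under stated assumptions: the function name `xhtml_to_html` is hypothetical, the input is assumed to be an Appendix C style XHTML 1.0 document that already carries a `lang` attribute next to every `xml:lang`, the target is assumed to be HTML 4.01 Strict, and regular expressions stand in for a real parser. The MIME type step is not covered, because it is set by the server rather than in the document.

```python
import re

# Element names declared EMPTY in HTML 4.01 (they take no end tag).
EMPTY_ELEMENTS = ("area", "base", "basefont", "br", "col", "frame",
                  "hr", "img", "input", "isindex", "link", "meta", "param")

# Assumption: the target is the Strict variant of HTML 4.01.
HTML401_STRICT = ('<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" '
                  '"http://www.w3.org/TR/html4/strict.dtd">')


def xhtml_to_html(source: str) -> str:
    """Apply the textual translation steps to an XHTML 1.0 string."""
    # Remove the XML declaration, if present.
    out = re.sub(r"^\s*<\?xml[^>]*\?>\s*", "", source)
    # Change the document type declaration.
    out = re.sub(r"<!DOCTYPE[^>]*>", HTML401_STRICT, out, count=1)
    # Remove the XML namespace attribute.
    out = re.sub(r'\s+xmlns="[^"]*"', "", out)
    # Drop xml:lang; the matching lang attribute is assumed to be present.
    out = re.sub(r'\s+xml:lang="[^"]*"', "", out)
    # Change <br /> style tags to <br> for the EMPTY elements only.
    names = "|".join(EMPTY_ELEMENTS)
    out = re.sub(r"<(%s)\b([^>]*?)\s*/>" % names, r"<\1\2>", out)
    return out
```

A production converter would parse the document instead of pattern-matching it, since attribute values and comments can contain text that these patterns would rewrite by mistake.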
10362390@unknown@formal@none@1@S@Whether coding in HTML or XHTML, it may be simplest to always include the optional tags rather than to remember which tags can be omitted.@@@@1@29@@danf@17-8-2009
10362400@unknown@formal@none@1@S@A well-formed XHTML document adheres to all the syntax requirements of XML.@@@@1@12@@danf@17-8-2009
10362410@unknown@formal@none@1@S@A valid document adheres to the content specification for XHTML, which describes the document structure.@@@@1@15@@danf@17-8-2009
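The well-formedness half of this distinction can be checked with any XML parser. The following is a minimal sketch using Python's standard library; the helper name `is_well_formed` is hypothetical. It says nothing about validity, which needs a validating parser and the XHTML DTD, neither of which `xml.etree.ElementTree` provides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, i.e. is well-formed."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True
```

For instance, `is_well_formed('<p>one<br />two</p>')` holds, while the HTML-style `'<p>one<br>two</p>'` is rejected because the `br` element is never closed.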
10362420@unknown@formal@none@1@S@The W3C recommends several conventions to ensure an easy migration between HTML and XHTML (see [http://www.w3.org/TR/xhtml1/#guidelines HTML Compatibility Guidelines]).@@@@1@19@@danf@17-8-2009
10362430@unknown@formal@none@1@S@The following steps can be applied to XHTML 1.0 documents only:@@@@1@11@@danf@17-8-2009
10362440@unknown@formal@none@1@S@* Include both xml:lang and lang attributes on any elements assigning language.@@@@1@12@@danf@17-8-2009
10362450@unknown@formal@none@1@S@* Use the empty-element syntax only for elements specified as empty in HTML.@@@@1@13@@danf@17-8-2009
10362460@unknown@formal@none@1@S@* Include an extra space in empty-element tags: for example <br /> instead of <br/>.@@@@1@14@@danf@17-8-2009
10362470@unknown@formal@none@1@S@* Include explicit close tags for elements that permit content but are left empty (for example, <div></div>, not <div />).@@@@1@20@@danf@17-8-2009
10362480@unknown@formal@none@1@S@* Omit the XML declaration.@@@@1@5@@danf@17-8-2009
10362490@unknown@formal@none@1@S@If an author carefully follows the W3C’s compatibility guidelines, a user agent should be able to interpret the document equally as HTML or XHTML.@@@@1@22@@danf@17-8-2009
10362500@unknown@formal@none@1@S@For documents that are XHTML 1.0 and have been made compatible in this way, the W3C permits them to be served either as HTML (with a text/html [[MIME type]]), or as XHTML (with an application/xhtml+xml or application/xml MIME type).@@@@1@39@@danf@17-8-2009
10362510@unknown@formal@none@1@S@When delivered as XHTML, browsers should use an XML parser, which adheres strictly to the XML specifications for parsing the document's contents.@@@@1@22@@danf@17-8-2009
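Several of the compatibility conventions listed above are mechanical enough to check automatically. The sketch below is one hypothetical way to do so in Python: the function name `compatibility_warnings` and the warning texts are assumptions made for illustration, and the regular expressions only approximate real tag parsing.

```python
import re

# Elements declared EMPTY in HTML 4.01.
EMPTY_ELEMENTS = {"area", "base", "basefont", "br", "col", "frame",
                  "hr", "img", "input", "isindex", "link", "meta", "param"}

# A tag written with the XML empty-element syntax, e.g. <br /> or <div/>.
SELF_CLOSING = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b([^<>]*?)(\s*)/>")


def compatibility_warnings(source: str) -> list:
    """List departures from the compatibility conventions."""
    warnings = []
    # The XML declaration should be omitted.
    if source.lstrip().startswith("<?xml"):
        warnings.append("XML declaration present")
    for match in SELF_CLOSING.finditer(source):
        name, _attrs, space = match.groups()
        if name.lower() not in EMPTY_ELEMENTS:
            # e.g. <div /> should be written <div></div>.
            warnings.append("empty-element syntax on non-empty element: " + name)
        elif not space:
            # e.g. <br/> should be written <br />.
            warnings.append("no space before '/>' in: " + name)
    # xml:lang and lang should appear together on an element.
    for tag in re.findall(r"<[A-Za-z][^<>]*>", source):
        has_xml_lang = re.search(r"\sxml:lang=", tag) is not None
        has_lang = re.search(r"\slang=", tag) is not None
        if has_xml_lang != has_lang:
            warnings.append("xml:lang and lang not paired in: " + tag)
    return warnings
```

An empty result only means that none of these particular conventions was visibly broken; it does not show that the document is valid XHTML 1.0 or HTML 4.01.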
10362520@unknown@formal@none@1@S@===Transitional versus strict===@@@@1@4@@danf@17-8-2009
10362530@unknown@formal@none@1@S@The latest SGML-based specification HTML 4.01 and the earliest XHTML version include three sub-specifications: Strict, Transitional (once called Loose), and Frameset.@@@@1@21@@danf@17-8-2009
10362540@unknown@formal@none@1@S@The Strict variant represents the standard proper, whereas the Transitional and Frameset variants were developed to assist in the transition from earlier versions of HTML (including HTML 3.2).@@@@1@28@@danf@17-8-2009
10362550@unknown@formal@none@1@S@The Transitional and Frameset variants allow for [[presentational markup]] whereas the Strict variant encourages the use of style sheets through its omission of most presentational markup.@@@@1@26@@danf@17-8-2009
10362560@unknown@formal@none@1@S@The primary differences which make the Transitional variant more permissive than the Strict variant (the differences are the same in HTML 4 and XHTML 1.0) are:@@@@1@26@@danf@17-8-2009
10362570@unknown@formal@none@1@S@* '''A looser content model'''@@@@1@5@@danf@17-8-2009
10362580@unknown@formal@none@1@S@** Inline elements and plain text (#PCDATA) are allowed directly in: body, blockquote, form, noscript and noframes@@@@1@17@@danf@17-8-2009
10362590@unknown@formal@none@1@S@* '''Presentation related elements'''@@@@1@4@@danf@17-8-2009
10362600@unknown@formal@none@1@S@** underline (u)@@@@1@3@@danf@17-8-2009
10362610@unknown@formal@none@1@S@** strike-through (s, strike)@@@@1@3@@danf@17-8-2009
10362620@unknown@formal@none@1@S@** center@@@@1@2@@danf@17-8-2009
10362630@unknown@formal@none@1@S@** font@@@@1@2@@danf@17-8-2009
10362640@unknown@formal@none@1@S@** basefont@@@@1@2@@danf@17-8-2009
10362650@unknown@formal@none@1@S@* '''Presentation related attributes'''@@@@1@4@@danf@17-8-2009
10362660@unknown@formal@none@1@S@** background and bgcolor attributes for body element.@@@@1@8@@danf@17-8-2009
10362670@unknown@formal@none@1@S@** align attribute on div, form, paragraph (p), and heading (h1...h6) elements@@@@1@12@@danf@17-8-2009
10362680@unknown@formal@none@1@S@** align, noshade, size, and width attributes on hr element@@@@1@10@@danf@17-8-2009
10362690@unknown@formal@none@1@S@** align, border, vspace, and hspace attributes on img and object elements@@@@1@12@@danf@17-8-2009
10362700@unknown@formal@none@1@S@** align attribute on legend and caption elements@@@@1@8@@danf@17-8-2009
10362710@unknown@formal@none@1@S@** align and bgcolor on table element@@@@1@7@@danf@17-8-2009
10362720@unknown@formal@none@1@S@** nowrap, bgcolor, width, height on td and th elements@@@@1@10@@danf@17-8-2009
10362730@unknown@formal@none@1@S@** bgcolor attribute on tr element@@@@1@6@@danf@17-8-2009
10362740@unknown@formal@none@1@S@** clear attribute on br element@@@@1@6@@danf@17-8-2009
10362750@unknown@formal@none@1@S@** compact attribute on dl, dir and menu elements@@@@1@9@@danf@17-8-2009
10362760@unknown@formal@none@1@S@** type, compact, and start attributes on ol and ul elements@@@@1@11@@danf@17-8-2009
10362770@unknown@formal@none@1@S@** type and value attributes on li element@@@@1@8@@danf@17-8-2009
10362780@unknown@formal@none@1@S@** width attribute on pre element@@@@1@6@@danf@17-8-2009
10362790@unknown@formal@none@1@S@* '''Additional elements in Transitional specification'''@@@@1@6@@danf@17-8-2009
10362800@unknown@formal@none@1@S@** menu list (no substitute, though unordered list is recommended; may return in XHTML 2.0 specification)@@@@1@16@@danf@17-8-2009
10362810@unknown@formal@none@1@S@** dir list (no substitute, though unordered list is recommended)@@@@1@10@@danf@17-8-2009
10362820@unknown@formal@none@1@S@** isindex (element requires server-side support and is typically added to documents server-side)@@@@1@13@@danf@17-8-2009
10362830@unknown@formal@none@1@S@** applet (deprecated in favor of object element)@@@@1@8@@danf@17-8-2009
10362840@unknown@formal@none@1@S@* '''The language attribute on script element''' (presumably redundant with type attribute, though this is maintained for legacy reasons).@@@@1@19@@danf@17-8-2009
10362850@unknown@formal@none@1@S@* '''Frame related entities'''@@@@1@4@@danf@17-8-2009
10362860@unknown@formal@none@1@S@** frameset element (used in place of body for frameset DTD)@@@@1@11@@danf@17-8-2009
10362870@unknown@formal@none@1@S@** frame element@@@@1@3@@danf@17-8-2009
10362880@unknown@formal@none@1@S@** iframe@@@@1@2@@danf@17-8-2009
10362890@unknown@formal@none@1@S@** noframes@@@@1@2@@danf@17-8-2009
10362900@unknown@formal@none@1@S@** target attribute on anchor, client-side image-map (area), link, form, and base elements@@@@1@13@@danf@17-8-2009
10362910@unknown@formal@none@1@S@===Frameset versus transitional===@@@@1@3@@danf@17-8-2009
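10362920@unknown@formal@none@1@S@In addition to the above transitional differences, the frameset specifications (whether XHTML 1.0 or HTML 4.01) specify a different content model, in which a frameset element replaces body and contains frame elements or nested framesets, with an optional noframes element.@@@@1@32@@danf@17-8-2009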
10362930@unknown@formal@none@1@S@=== Summary of flavors ===@@@@1@5@@danf@17-8-2009
10362940@unknown@formal@none@1@S@As this list demonstrates, the loose flavors of the specification are maintained for legacy support.@@@@1@15@@danf@17-8-2009
10362950@unknown@formal@none@1@S@However, contrary to popular misconceptions, the move to XHTML does not imply a removal of this legacy support.@@@@1@18@@danf@17-8-2009
10362960@unknown@formal@none@1@S@Rather, the X in XML stands for extensible, and the W3C is modularizing the entire specification and opening it up to independent extensions.@@@@1@23@@danf@17-8-2009
10362970@unknown@formal@none@1@S@The primary achievement in the move from XHTML 1.0 to XHTML 1.1 is the modularization of the entire specification.@@@@1@19@@danf@17-8-2009
10362980@unknown@formal@none@1@S@The strict version of HTML is deployed in XHTML 1.1 through a set of modular extensions to the base XHTML 1.1 specification.@@@@1@22@@danf@17-8-2009
10362990@unknown@formal@none@1@S@Likewise someone looking for the loose (transitional) or frameset specifications will find similar extended XHTML 1.1 support (much of it is contained in the legacy or frame modules).@@@@1@28@@danf@17-8-2009
10363000@unknown@formal@none@1@S@The modularization also allows for separate features to develop on their own timetable.@@@@1@13@@danf@17-8-2009
10363010@unknown@formal@none@1@S@For example, XHTML 1.1 will allow quicker migration to emerging XML standards such as [[MathML]] (a presentational and semantic math language based on XML) and [[XForms]] (a new, highly advanced web-form technology intended to replace the existing HTML forms).@@@@1@40@@danf@17-8-2009
10363020@unknown@formal@none@1@S@In summary, the HTML 4.01 specification primarily reined in all the various HTML implementations into a single clear written specification based on SGML.@@@@1@23@@danf@17-8-2009
10363030@unknown@formal@none@1@S@XHTML 1.0 ported this specification, as is, to the new XML-defined specification.@@@@1@13@@danf@17-8-2009
10363040@unknown@formal@none@1@S@Next, XHTML 1.1 takes advantage of the extensible nature of XML and modularizes the whole specification.@@@@1@16@@danf@17-8-2009
10363050@unknown@formal@none@1@S@XHTML 2.0 will be the first step in adding new features to the specification in a standards-body-based approach.@@@@1@18@@danf@17-8-2009
10363060@unknown@formal@none@1@S@== Hypertext features not in HTML ==@@@@1@7@@danf@17-8-2009
10363070@unknown@formal@none@1@S@HTML lacks some of the features found in earlier hypertext systems, such as [[typed link]]s, [[transclusion]], [[source tracking]], [[fat link]]s, and more.@@@@1@22@@danf@17-8-2009
10363080@unknown@formal@none@1@S@Even some hypertext features that were in early versions of HTML have been ignored by most popular web browsers until recently, such as the [[Hyperlink|link]] element and in-browser Web page editing.@@@@1@31@@danf@17-8-2009
10363090@unknown@formal@none@1@S@Sometimes Web services or browser manufacturers remedy these shortcomings.@@@@1@9@@danf@17-8-2009
10363100@unknown@formal@none@1@S@For instance, [[wiki]]s and [[content management system]]s allow surfers to edit the Web pages they visit.@@@@1@16@@danf@17-8-2009
10370010@unknown@formal@none@1@S@IBM@@@@1@1@@danf@17-8-2009
10370020@unknown@formal@none@1@S@'''International Business Machines Corporation''', abbreviated '''IBM''' and nicknamed '''"Big Blue"''', is a [[multinational corporation|multinational]] [[computer]] [[technology]] and [[consulting]] [[corporation]] headquartered in [[Armonk, New York]], [[United States of America|USA]].@@@@1@29@@danf@17-8-2009
10370030@unknown@formal@none@1@S@The company is one of the few [[information technology]] companies with a continuous history dating back to the 19th century.@@@@1@20@@danf@17-8-2009
10370040@unknown@formal@none@1@S@IBM manufactures and sells computer [[computer hardware|hardware]] and [[computer software|software]], and offers infrastructure services, [[Internet hosting service|hosting services]], and [[consultant|consulting services]] in areas ranging from [[mainframe computer]]s to [[nanotechnology]].@@@@1@29@@danf@17-8-2009
10370050@unknown@formal@none@1@S@IBM has been known through most of its recent history as the world's largest computer company; with over 388,000 employees worldwide, IBM is the largest [[information technology]] employer in the world.@@@@1@31@@danf@17-8-2009
10370060@unknown@formal@none@1@S@Despite falling behind [[Hewlett-Packard]] in total revenue since 2006, it remains the most profitable.@@@@1@14@@danf@17-8-2009
10370070@unknown@formal@none@1@S@IBM holds more [[patent]]s than any other U.S. based technology company.@@@@1@11@@danf@17-8-2009
10370080@unknown@formal@none@1@S@It has engineers and consultants in over 170 countries and [[IBM Research]] has eight laboratories worldwide.@@@@1@16@@danf@17-8-2009
10370090@unknown@formal@none@1@S@IBM employees have earned three [[Nobel Prize]]s, four [[Turing Award]]s, five [[National Medal of Technology|National Medals of Technology]], and five [[National Medal of Science|National Medals of Science]].@@@@1@27@@danf@17-8-2009
10370100@unknown@formal@none@1@S@As a chip maker, IBM has been among the [[Worldwide Top 20 Semiconductor Sales Leaders]] in past years, and in 2007 IBM ranked second in the list of largest software companies in the world.@@@@1@34@@danf@17-8-2009
10370110@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10370120@unknown@formal@none@1@S@The company which became IBM was founded in 1896 as the Tabulating Machine Company by [[Herman Hollerith]], in [[Broome County, New York]] ([[Endicott, New York]], where it still maintains very limited operations).@@@@1@32@@danf@17-8-2009
10370130@unknown@formal@none@1@S@It was incorporated as [[Computing Tabulating Recording Corporation (CTR)]] on [[June 16]], [[1911]], and was listed on the [[New York Stock Exchange]] in 1916.@@@@1@24@@danf@17-8-2009
10370140@unknown@formal@none@1@S@IBM adopted its current name in 1924, when it became a [[Fortune 500]] company.@@@@1@14@@danf@17-8-2009
10370150@unknown@formal@none@1@S@In the 1950s, IBM became the dominant vendor in the emerging [[computer]] industry with the release of the [[IBM 701]] and other models in the [[IBM 700/7000 series]] of [[mainframes]].@@@@1@30@@danf@17-8-2009
10370160@unknown@formal@none@1@S@The company's dominance became even more pronounced in the 1960s and 1970s with the [[IBM System/360]] and [[IBM System/370]] mainframes. However, antitrust actions by the [[United States Department of Justice]], the rise of [[minicomputer]] companies like [[Digital Equipment Corporation]] and [[Data General]], and the introduction of the [[microprocessor]] all contributed to the dilution of IBM's position in the industry, eventually leading the company to diversify into other areas including personal computers, software, and services.@@@@1@73@@danf@17-8-2009
10370170@unknown@formal@none@1@S@In 1981, IBM introduced the [[IBM Personal Computer]], the original version and progenitor of the [[IBM PC compatible]] hardware [[platform (computing)|platform]].@@@@1@23@@danf@17-8-2009
10370180@unknown@formal@none@1@S@Descendants of the IBM PC compatibles make up the majority of [[microcomputer]]s on the market today.@@@@1@16@@danf@17-8-2009
10370190@unknown@formal@none@1@S@IBM sold its PC division to the Chinese company [[Lenovo]] on [[May 1]], [[2005]] for $655 million in cash and $600 million in Lenovo stock.@@@@1@25@@danf@17-8-2009
10370200@unknown@formal@none@1@S@On [[January 25]], [[2007]], [[Ricoh]] announced the purchase of the IBM Printing Systems Division for $725 million and an investment in a three-year joint venture to form a new Ricoh subsidiary, [[InfoPrint Solutions Company]]; Ricoh will own a 51% share, and IBM will own a 49% share in ''InfoPrint''.@@@@1@45@@danf@17-8-2009
10370210@unknown@formal@none@1@S@===Controversies===@@@@1@1@@danf@17-8-2009
10370220@unknown@formal@none@1@S@The author [[Edwin Black]] has alleged that, during [[World War II]], IBM CEO [[Thomas J. Watson]] used overseas subsidiaries to provide the [[Third Reich]] with [[Unit record equipment|unit record]] [[data processing]] machines, supplies and services that helped the [[Nazis]] to efficiently track down European Jews, with sizable profits for the company.@@@@1@51@@danf@17-8-2009
10370230@unknown@formal@none@1@S@IBM denies that it had control over these subsidiaries after the Nazis took power.@@@@1@14@@danf@17-8-2009
10370240@unknown@formal@none@1@S@A lawsuit against IBM based on these allegations was dismissed.@@@@1@10@@danf@17-8-2009
10370250@unknown@formal@none@1@S@In support of the Allied war effort in World War II, from 1943 to 1945 IBM produced approximately 346,500 M1 Carbine (Caliber .30 carbine) light rifles for the U.S. Military.@@@@1@30@@danf@17-8-2009
10370260@unknown@formal@none@1@S@==Current projects==@@@@1@2@@danf@17-8-2009
10370270@unknown@formal@none@1@S@===Eclipse===@@@@1@1@@danf@17-8-2009
10370280@unknown@formal@none@1@S@Eclipse is a platform-independent, [[Java (programming language)|Java]]-based [[software framework]].@@@@1@9@@danf@17-8-2009
10370290@unknown@formal@none@1@S@Eclipse was originally a [[Proprietary software|proprietary]] product developed by IBM as a successor of the [[VisualAge]] family of tools.@@@@1@19@@danf@17-8-2009
10370300@unknown@formal@none@1@S@Eclipse has subsequently been released as [[free software|free]]/[[open source]] software under the [[Eclipse Public License]].@@@@1@15@@danf@17-8-2009
10370310@unknown@formal@none@1@S@===developerWorks===@@@@1@1@@danf@17-8-2009
10370320@unknown@formal@none@1@S@developerWorks is a website run by [[IBM]] for [[software developer]]s and IT professionals.@@@@1@13@@danf@17-8-2009
10370330@unknown@formal@none@1@S@It contains a large number of how-to articles and tutorials, as well as software downloads and code samples, discussion forums, podcasts, blogs, wikis, and other resources for developers and technical professionals.@@@@1@31@@danf@17-8-2009
10370340@unknown@formal@none@1@S@Subjects range from open, industry-standard technologies like [[Java (programming language)|Java]], [[Linux]], [[Service-oriented architecture|SOA]] and [[web services]], [[web development]], [[Ajax (programming)|Ajax]], [[PHP]], and [[XML]] to IBM's products ([[WebSphere]], [[Rational Software|Rational]], [[Lotus Software|Lotus]], [[Tivoli Systems, Inc.|Tivoli]] and [[IBM DB2|DB2]]).@@@@1@37@@danf@17-8-2009
10370350@unknown@formal@none@1@S@In 2007 developerWorks was inducted into the Jolt Hall of Fame.@@@@1@11@@danf@17-8-2009
10370360@unknown@formal@none@1@S@===alphaWorks===@@@@1@1@@danf@17-8-2009
10370370@unknown@formal@none@1@S@alphaWorks is IBM's source for emerging software technologies.@@@@1@8@@danf@17-8-2009
10370380@unknown@formal@none@1@S@These technologies include:@@@@1@3@@danf@17-8-2009
10370390@unknown@formal@none@1@S@*'''Flexible Internet Evaluation Report Architecture''' - A highly flexible architecture for the design, display, and reporting of Internet surveys.@@@@1@19@@danf@17-8-2009
10370400@unknown@formal@none@1@S@*'''[[IBM History Flow tool|IBM History Flow Visualization Application]]''' - A tool for visualizing dynamic, evolving documents and the interactions of multiple collaborating authors.@@@@1@23@@danf@17-8-2009
10370410@unknown@formal@none@1@S@*'''IBM [[Linux]] on POWER Performance Simulator''' - A tool that provides users of Linux on Power a set of performance models for IBM's POWER processors.@@@@1@25@@danf@17-8-2009
10370420@unknown@formal@none@1@S@*'''Database File Archive And Restoration Management''' - An application for archiving and restoring hard disk files using file references stored in a database.@@@@1@23@@danf@17-8-2009
10370430@unknown@formal@none@1@S@*'''Policy Management for Autonomic Computing''' - A policy-based autonomic management infrastructure that simplifies the automation of IT and business processes.@@@@1@20@@danf@17-8-2009
10370440@unknown@formal@none@1@S@*'''FairUCE''' - A spam filter that verifies sender identity instead of filtering content.@@@@1@13@@danf@17-8-2009
10370450@unknown@formal@none@1@S@*'''Unstructured Information Management Architecture (UIMA) SDK''' - A Java SDK that supports the implementation, composition, and deployment of applications working with unstructured information.@@@@1@23@@danf@17-8-2009
10370460@unknown@formal@none@1@S@*'''Accessibility Browser''' - A web-browser specifically designed to assist people with visual impairments, to be released as open-source software.@@@@1@19@@danf@17-8-2009
10370470@unknown@formal@none@1@S@Also known as the "A-Browser," the technology will aim to eliminate the need for a mouse, relying instead completely on voice-controls, buttons and predefined shortcut keys.@@@@1@26@@danf@17-8-2009
10370480@unknown@formal@none@1@S@===Semiconductor design and manufacturing===@@@@1@4@@danf@17-8-2009
10370490@unknown@formal@none@1@S@Virtually all modern [[video game console|console gaming systems]] use [[IC design|microprocessors developed]] by IBM.@@@@1@14@@danf@17-8-2009
10370500@unknown@formal@none@1@S@The [[Xbox 360]] contains the [[Xenon (processor)|Xenon]] tri-core processor, which was designed and produced by IBM in less than 24 months.@@@@1@21@@danf@17-8-2009
10370510@unknown@formal@none@1@S@Sony's [[PlayStation 3]] features the [[Cell microprocessor| Cell BE microprocessor]] designed jointly by IBM, [[Toshiba]], and [[Sony]].@@@@1@17@@danf@17-8-2009
10370520@unknown@formal@none@1@S@[[Nintendo]]'s [[History of video game consoles (seventh generation)|seventh-generation]] console, [[Wii]], features an IBM chip codenamed [[Broadway (microprocessor)|Broadway]].@@@@1@17@@danf@17-8-2009
10370530@unknown@formal@none@1@S@The older [[Nintendo GameCube]] also utilizes the [[Gekko (microprocessor)|Gekko]] processor, designed by IBM.@@@@1@13@@danf@17-8-2009
10370540@unknown@formal@none@1@S@In May 2002, IBM and Butterfly.net, Inc. announced the Butterfly Grid, a commercial [[grid computing|grid]] for the online video gaming market.@@@@1@21@@danf@17-8-2009
10370550@unknown@formal@none@1@S@In March 2006, IBM announced separate agreements with Hoplon Infotainment, Online Game Services Incorporated (OGSI), and RenderRocket to provide on-demand content management and [[blade server]] computing resources.@@@@1@27@@danf@17-8-2009
10370560@unknown@formal@none@1@S@===Open Client Offering===@@@@1@3@@danf@17-8-2009
10370570@unknown@formal@none@1@S@IBM announced that it will launch new software, called "Open Client Offering", which is to run on [[Microsoft]]'s [[Microsoft Windows|Windows]], [[Linux]] and [[Apple Inc.|Apple]]'s [[Macintosh]].@@@@1@25@@danf@17-8-2009
10370580@unknown@formal@none@1@S@The company states that its new product allows businesses to offer employees a choice of using the same software on Windows and its alternatives.@@@@1@24@@danf@17-8-2009
10370590@unknown@formal@none@1@S@This means that "Open Client Offering" is intended to cut the cost of managing Linux or Apple systems relative to Windows.@@@@1@19@@danf@17-8-2009
10370600@unknown@formal@none@1@S@Companies will not need to pay Microsoft for licenses, since their operations will no longer rely on Windows-based software.@@@@1@27@@danf@17-8-2009
10370610@unknown@formal@none@1@S@One alternative to Microsoft's Office is software based on the Open Document Format, whose development IBM supports.@@@@1@15@@danf@17-8-2009
10370620@unknown@formal@none@1@S@It is to be used for tasks such as word processing and presentations, along with collaboration through [[Lotus Notes]], instant messaging and blog tools, as well as an [[Internet Explorer]] competitor, the [[Firefox]] web browser.@@@@1@36@@danf@17-8-2009
10370630@unknown@formal@none@1@S@IBM plans to install Open Client on 5 percent of its desktop PCs.@@@@1@13@@danf@17-8-2009
10370640@unknown@formal@none@1@S@===UC2: Unified Communications and Collaboration===@@@@1@5@@danf@17-8-2009
10370650@unknown@formal@none@1@S@'''UC2''' (''Unified Communications and Collaboration'') is an IBM and [[Cisco]] joint project based on [[Eclipse (software)|Eclipse]] and [[OSGi]].@@@@1@18@@danf@17-8-2009
10370660@unknown@formal@none@1@S@It will offer the numerous Eclipse application developers a unified platform for an easier work environment.@@@@1@16@@danf@17-8-2009
10370670@unknown@formal@none@1@S@Software based on the UC2 platform will provide major enterprises with easy-to-use communication solutions, such as the Lotus-based [[Sametime]].@@@@1@20@@danf@17-8-2009
10370680@unknown@formal@none@1@S@In the future, Sametime users will benefit from additional functions such as [[click-to-call]] and [[Voicemail|voice mailing]].@@@@1@17@@danf@17-8-2009
10370690@unknown@formal@none@1@S@===Internal programs===@@@@1@2@@danf@17-8-2009
10370700@unknown@formal@none@1@S@[[Extreme Blue]] is a company initiative that uses experienced IBM engineers, talented interns, and business managers to develop high-value technology.@@@@1@20@@danf@17-8-2009
10370710@unknown@formal@none@1@S@The project is designed to analyze emerging business needs and the technologies that can solve them.@@@@1@16@@danf@17-8-2009
10370720@unknown@formal@none@1@S@These projects mostly involve rapid-prototyping of high-profile software and hardware projects.@@@@1@11@@danf@17-8-2009
10370730@unknown@formal@none@1@S@In May 2007, IBM unveiled [[Project Big Green]], a re-direction of $1 billion per year across its businesses to increase energy efficiency.@@@@1@23@@danf@17-8-2009
10370740@unknown@formal@none@1@S@==IBM Software Group==@@@@1@3@@danf@17-8-2009
10370750@unknown@formal@none@1@S@This group is one of the major divisions of IBM.@@@@1@10@@danf@17-8-2009
10370760@unknown@formal@none@1@S@The various brands include:@@@@1@4@@danf@17-8-2009
10370770@unknown@formal@none@1@S@* [[IBM Information Management Software|Information Management Software]] — database servers and tools, text analytics, content management, business process management and business intelligence.@@@@1@22@@danf@17-8-2009
10370780@unknown@formal@none@1@S@* [[Lotus Software]] — Groupware, collaboration and business software.@@@@1@9@@danf@17-8-2009
10370790@unknown@formal@none@1@S@Acquired in 1995.@@@@1@3@@danf@17-8-2009
10370800@unknown@formal@none@1@S@* [[Rational Software]] — Software development and application lifecycle management.@@@@1@10@@danf@17-8-2009
10370810@unknown@formal@none@1@S@Acquired in 2002.@@@@1@3@@danf@17-8-2009
10370820@unknown@formal@none@1@S@* [[Tivoli Software]] — Systems management.@@@@1@6@@danf@17-8-2009
10370830@unknown@formal@none@1@S@Acquired in 1996.@@@@1@3@@danf@17-8-2009
10370840@unknown@formal@none@1@S@* [[IBM WebSphere|WebSphere]] — Integration and application infrastructure software.@@@@1@9@@danf@17-8-2009
10370850@unknown@formal@none@1@S@==Environmental record==@@@@1@2@@danf@17-8-2009
10370860@unknown@formal@none@1@S@IBM has a long history of dealing with its environmental problems.@@@@1@11@@danf@17-8-2009
10370870@unknown@formal@none@1@S@It established a corporate policy on environmental protection in 1971, with the support of a comprehensive global environmental management system.@@@@1@20@@danf@17-8-2009
10370880@unknown@formal@none@1@S@According to IBM’s figures, its total hazardous waste has decreased by 44 percent over the past five years, and by 94.6 percent since 1987.@@@@1@25@@danf@17-8-2009
10370890@unknown@formal@none@1@S@IBM's total hazardous waste calculation consists of waste from both non-manufacturing and manufacturing operations.@@@@1@14@@danf@17-8-2009
10370900@unknown@formal@none@1@S@Waste from manufacturing operations includes waste recycled in closed-loop systems, where process chemicals are recovered for subsequent reuse rather than being disposed of and replaced with new chemical materials.@@@@1@28@@danf@17-8-2009
10370910@unknown@formal@none@1@S@Over the years, IBM has redesigned processes to eliminate almost all closed loop recycling and now uses more environmental-friendly materials in their place.@@@@1@23@@danf@17-8-2009
10370920@unknown@formal@none@1@S@IBM was recognized as one of the "Top 20 Best Workplaces for Commuters" by the U.S. Environmental Protection Agency ([[EPA]]) in 2005.@@@@1@22@@danf@17-8-2009
10370930@unknown@formal@none@1@S@This was to recognize the Fortune 500 companies that provided their employees with excellent commuter benefits that helped reduce traffic and air pollution.@@@@1@23@@danf@17-8-2009
10370940@unknown@formal@none@1@S@However, the birthplace of IBM, [[Endicott, New York|Endicott]], suffered IBM's pollution for decades.@@@@1@13@@danf@17-8-2009
10370950@unknown@formal@none@1@S@IBM used liquid cleaning agents in its circuit board assembly operation for more than two decades, and six spill and leak incidents were recorded, including one 1979 leak of 4,100 gallons from an underground tank.@@@@1@35@@danf@17-8-2009
10370960@unknown@formal@none@1@S@These left behind volatile organic compounds in the town's soil and aquifer.@@@@1@12@@danf@17-8-2009
10370970@unknown@formal@none@1@S@Trace elements of volatile organic compounds have been identified in Endicott’s drinking water, but the levels are within regulatory limits.@@@@1@20@@danf@17-8-2009
10370980@unknown@formal@none@1@S@Also, since 1980, IBM has pumped out 78,000 gallons of chemicals, including trichloroethane, Freon, benzene and perchloroethene, into the air and has allegedly caused several cancer cases among the villagers.@@@@1@30@@danf@17-8-2009
10370990@unknown@formal@none@1@S@IBM Endicott has been identified by the Department of Environmental Conservation as the major source of pollution, though traces of contaminants from a local dry cleaner and other polluters were also found.@@@@1@32@@danf@17-8-2009
10371000@unknown@formal@none@1@S@Despite the amount of pollutant, state health officials cannot say whether air or water pollution in Endicott has actually caused any health problems.@@@@1@23@@danf@17-8-2009
10371010@unknown@formal@none@1@S@Village officials say tests show that the water is safe to drink.@@@@1@12@@danf@17-8-2009
10371020@unknown@formal@none@1@S@=== Solar power ===@@@@1@4@@danf@17-8-2009
10371030@unknown@formal@none@1@S@Tokyo Ohka Kogyo Co., Ltd. (TOK) and IBM are collaborating to establish new, low-cost methods for bringing the next generation of solar energy products to market, that is, [[CIGS]] (Copper-Indium-Gallium-Selenide) [[solar cell]] modules.@@@@1@33@@danf@17-8-2009
10371040@unknown@formal@none@1@S@Use of [[thin film]] technology, such as CIGS, has great promise in reducing the overall cost of solar cells and further enabling their widespread adoption.@@@@1@25@@danf@17-8-2009
10371050@unknown@formal@none@1@S@IBM is exploring four main areas of photovoltaic research: using current technologies to develop cheaper and more efficient [[silicon]] [[solar cell]]s, developing new solution processed [[thin film]] photovoltaic devices, [[concentrator photovoltaics]], and future generation photovoltaic architectures based upon [[nanostructures]] such as [[semiconductor quantum dot]]s and [[nanowire]]s.@@@@1@46@@danf@17-8-2009
10371060@unknown@formal@none@1@S@Dr. Supratik Guha is the leading scientist in IBM photovoltaics.@@@@1@10@@danf@17-8-2009
10371070@unknown@formal@none@1@S@==Corporate culture of IBM==@@@@1@4@@danf@17-8-2009
10371080@unknown@formal@none@1@S@'''Big Blue''' is a nickname for IBM; several theories exist regarding its origin.@@@@1@13@@danf@17-8-2009
10371090@unknown@formal@none@1@S@One theory, substantiated by people who worked for IBM at the time, is that IBM field reps coined the term in the 1960s, referring to the color of the mainframes IBM installed in the 1960s and early 1970s.@@@@1@38@@danf@17-8-2009
10371100@unknown@formal@none@1@S@"All blue" was a term used to describe a loyal IBM customer, and business writers later picked up the term.@@@@1@20@@danf@17-8-2009
10371110@unknown@formal@none@1@S@Another theory suggests that Big Blue simply refers to the Company's [[logo]].@@@@1@12@@danf@17-8-2009
10371120@unknown@formal@none@1@S@A third theory suggests that Big Blue refers to a former company dress code that required many IBM employees to wear only white shirts; many also wore blue suits.@@@@1@29@@danf@17-8-2009
10371130@unknown@formal@none@1@S@In any event, IBM keyboards, typewriters, and some other manufactured devices have played on the "Big Blue" concept, using the color for enter keys and carriage returns.@@@@1@27@@danf@17-8-2009
10371140@unknown@formal@none@1@S@===Sales===@@@@1@1@@danf@17-8-2009
10371150@unknown@formal@none@1@S@IBM has often been described as having a sales-centric or a sales-oriented business culture.@@@@1@14@@danf@17-8-2009
10371160@unknown@formal@none@1@S@Traditionally, many IBM executives and general managers are chosen from the sales force.@@@@1@13@@danf@17-8-2009
10371170@unknown@formal@none@1@S@The current CEO, [[Sam Palmisano]], for example, joined the company as a salesman and, unusually for CEOs of major corporations, has no MBA or postgraduate qualification.@@@@1@26@@danf@17-8-2009
10371180@unknown@formal@none@1@S@Middle and top management are often enlisted to give direct support to salesmen when pitching sales to important customers.@@@@1@19@@danf@17-8-2009
10371190@unknown@formal@none@1@S@===The uniform===@@@@1@2@@danf@17-8-2009
10371200@unknown@formal@none@1@S@A dark (or gray) suit, white shirt, and a "sincere" tie was the public uniform for IBM employees for most of the 20th Century.@@@@1@24@@danf@17-8-2009
10371210@unknown@formal@none@1@S@During IBM's management transformation in the 1990s, CEO [[Lou Gerstner]] relaxed these codes, normalizing the dress and behavior of IBM employees to resemble their counterparts in other large technology companies.@@@@1@30@@danf@17-8-2009
10371220@unknown@formal@none@1@S@===IBM company values and "Jam"===@@@@1@5@@danf@17-8-2009
10371230@unknown@formal@none@1@S@In 2003, IBM embarked on an ambitious project to rewrite company values.@@@@1@12@@danf@17-8-2009
10371240@unknown@formal@none@1@S@Using its ''Jam'' technology, the company hosted Intranet-based online discussions on key business issues with 50,000 employees over 3 days.@@@@1@20@@danf@17-8-2009
10371250@unknown@formal@none@1@S@The discussions were analyzed by sophisticated text analysis software (eClassifier) to mine online comments for themes.@@@@1@16@@danf@17-8-2009
10371260@unknown@formal@none@1@S@As a result of the 2003 Jam, the company values were updated to reflect three modern business, marketplace and employee views: "Dedication to every client's success", "Innovation that matters - for our company and for the world", and "Trust and personal responsibility in all relationships".@@@@1@45@@danf@17-8-2009
10371270@unknown@formal@none@1@S@In 2004, another Jam was conducted during which 52,000 employees exchanged best practices for 72 hours.@@@@1@16@@danf@17-8-2009
10371280@unknown@formal@none@1@S@They focused on finding actionable ideas to support implementation of the values previously identified.@@@@1@14@@danf@17-8-2009
10371290@unknown@formal@none@1@S@A new post-Jam Ratings event was developed to allow IBMers to select key ideas that support the values.@@@@1@18@@danf@17-8-2009
10371300@unknown@formal@none@1@S@The board of directors cited this Jam when awarding Palmisano a pay rise in the spring of 2005.@@@@1@18@@danf@17-8-2009
10371310@unknown@formal@none@1@S@In July and September 2006, Palmisano launched another jam called [https://www.globalinnovationjam.com/ InnovationJam].@@@@1@12@@danf@17-8-2009
10371320@unknown@formal@none@1@S@InnovationJam was the largest online brainstorming session ever with more than 150,000 participants from 104 countries.@@@@1@16@@danf@17-8-2009
10371330@unknown@formal@none@1@S@The participants were IBM employees, members of IBM employees' families, universities, partners, and customers.@@@@1@14@@danf@17-8-2009
10371340@unknown@formal@none@1@S@InnovationJam was divided into two sessions (one in July and one in September) of 72 hours each and generated more than 46,000 ideas.@@@@1@23@@danf@17-8-2009
10371350@unknown@formal@none@1@S@In November 2006, IBM announced that it would invest US$100 million in the 10 best ideas from InnovationJam.@@@@1@18@@danf@17-8-2009
10371360@unknown@formal@none@1@S@===Open source===@@@@1@2@@danf@17-8-2009
10371370@unknown@formal@none@1@S@IBM has been influenced by the [[Open Source Initiative]], and began supporting [[Linux]] in 1998.@@@@1@15@@danf@17-8-2009
10371380@unknown@formal@none@1@S@The company invests billions of dollars in services and software based on [[Linux]] through the IBM [[Linux Technology Center]], which includes over 300 [[Linux kernel]] developers.@@@@1@26@@danf@17-8-2009
10371390@unknown@formal@none@1@S@IBM has also released code under different [[open-source license]]s, such as the platform-independent software framework [[Eclipse (software)|Eclipse]] (worth approximately US$40 million at the time of the donation) and the [[Java (programming language)|Java]]-based [[relational database management system]] (RDBMS) [[Apache Derby]].@@@@1@39@@danf@17-8-2009
10371400@unknown@formal@none@1@S@IBM's open source involvement has not been trouble-free, however (see ''[[SCO v. IBM]]'').@@@@1@13@@danf@17-8-2009
10371410@unknown@formal@none@1@S@== Corporate affairs ==@@@@1@4@@danf@17-8-2009
10371420@unknown@formal@none@1@S@=== Diversity and workforce issues ===@@@@1@6@@danf@17-8-2009
10371430@unknown@formal@none@1@S@IBM's efforts to promote workforce diversity and equal opportunity date back at least to [[World War I]], when the company hired disabled veterans.@@@@1@23@@danf@17-8-2009
10371440@unknown@formal@none@1@S@IBM was the only technology company ranked in ''Working Mother'' magazine's Top 10 for 2004, and one of two technology companies in 2005 (the other company being Hewlett-Packard).@@@@1@28@@danf@17-8-2009
10371450@unknown@formal@none@1@S@On [[September 21]], [[1953]], [[Thomas J. Watson]], the CEO at the time, sent out a very controversial letter to all IBM employees stating that IBM needed to hire the best people, regardless of their race, ethnic origin, or gender.@@@@1@39@@danf@17-8-2009
10371460@unknown@formal@none@1@S@In 1984, IBM added sexual preference to this list.@@@@1@9@@danf@17-8-2009
10371470@unknown@formal@none@1@S@Watson stated that this policy would give IBM a competitive advantage because IBM would then be able to hire talented people its competitors would turn down.@@@@1@26@@danf@17-8-2009
10371480@unknown@formal@none@1@S@The company has traditionally resisted [[trade union|labor union]] organizing, although unions represent some IBM workers outside the United States.@@@@1@19@@danf@17-8-2009
10371490@unknown@formal@none@1@S@In the 1990s, two major [[pension]] program changes, including a conversion to a cash balance plan, resulted in an employee [[class action]] lawsuit alleging [[age discrimination]].@@@@1@26@@danf@17-8-2009
10371500@unknown@formal@none@1@S@IBM employees won the lawsuit and arrived at a partial settlement, although appeals are still underway.@@@@1@16@@danf@17-8-2009
10371510@unknown@formal@none@1@S@IBM also settled a major overtime class-action lawsuit in 2006.@@@@1@10@@danf@17-8-2009
10371520@unknown@formal@none@1@S@Historically IBM has had a good reputation for long-term staff retention, with few large-scale layoffs.@@@@1@15@@danf@17-8-2009
10371530@unknown@formal@none@1@S@In more recent years there have been a number of broad sweeping cuts to the workforce as IBM attempts to adapt to changing market conditions and a declining profit base.@@@@1@30@@danf@17-8-2009
10371540@unknown@formal@none@1@S@After posting weaker than expected revenues in the first quarter of 2005, IBM eliminated 14,500 positions from its workforce, predominantly in Europe.@@@@1@22@@danf@17-8-2009
10371550@unknown@formal@none@1@S@In May 2005, IBM Ireland told staff that the MD (Micro-electronics Division) facility would close by the end of 2005 and offered staff a settlement.@@@@1@25@@danf@17-8-2009
10371560@unknown@formal@none@1@S@However, all staff who wished to stay with the company were redeployed within IBM Ireland.@@@@1@15@@danf@17-8-2009
10371570@unknown@formal@none@1@S@Production moved to a company called Amkor in Singapore, which purchased IBM's Microelectronics business in Singapore; it is widely agreed that IBM promised this company full load capacity in return for the purchase of the facility.@@@@1@36@@danf@17-8-2009
10371580@unknown@formal@none@1@S@On [[June 8]] [[2005]], IBM Canada Ltd. eliminated approximately 700 positions.@@@@1@11@@danf@17-8-2009
10371590@unknown@formal@none@1@S@IBM presents these cuts as part of a strategy to "rebalance" its portfolio of professional skills and businesses.@@@@1@18@@danf@17-8-2009
10371600@unknown@formal@none@1@S@[[IBM India]] and other IBM offices in [[China]], the [[Philippines]] and [[Costa Rica]] have been witnessing a recruitment boom and steady growth in the number of employees due to lower wages.@@@@1@31@@danf@17-8-2009
10371610@unknown@formal@none@1@S@On [[October 10]] [[2005]], IBM became the first major company in the world to formally commit to not using [[genetic testing|genetic information]] in its employment decisions.@@@@1@26@@danf@17-8-2009
10371620@unknown@formal@none@1@S@This came just a few months after IBM announced its support of the [[National Geographic Society]]'s [[The Genographic Project|Genographic Project]].@@@@1@20@@danf@17-8-2009
10371630@unknown@formal@none@1@S@==== Gay rights ====@@@@1@4@@danf@17-8-2009
10371640@unknown@formal@none@1@S@IBM provides employees' same-sex partners with benefits and provides an anti-discrimination clause.@@@@1@12@@danf@17-8-2009
10371650@unknown@formal@none@1@S@The [[Human Rights Campaign]] has consistently rated IBM 100% on its index of gay-friendliness since 2003 (in 2002, the year it began compiling its report on major companies, IBM scored 86%).@@@@1@31@@danf@17-8-2009
10371660@unknown@formal@none@1@S@===Logos===@@@@1@1@@danf@17-8-2009
10371670@unknown@formal@none@1@S@[[Logo]]s designed in the 1970s tended to be sensitive to the technical limitations of photocopiers, which were then being widely deployed.@@@@1@21@@danf@17-8-2009
10371680@unknown@formal@none@1@S@A logo with large solid areas tended to be poorly copied by copiers in the 1970s, so companies preferred logos that avoided large solid areas.@@@@1@25@@danf@17-8-2009
10371690@unknown@formal@none@1@S@The 1972 IBM logos are an example of this tendency.@@@@1@10@@danf@17-8-2009
10371700@unknown@formal@none@1@S@With the advent of digital copiers in the mid-1980s this technical restriction had largely disappeared; at roughly the same time, the 13-bar logo was abandoned for almost the opposite reason: it was difficult to render accurately on the low-resolution digital printers (240 dots per inch) of the time.@@@@1@48@@danf@17-8-2009
10371710@unknown@formal@none@1@S@===Board of directors===@@@@1@3@@danf@17-8-2009
10371720@unknown@formal@none@1@S@Current members of the [[board of directors]] of IBM are:@@@@1@10@@danf@17-8-2009
10371730@unknown@formal@none@1@S@*Cathleen Black President, [[Hearst Corporation|Hearst Magazines]]@@@@1@6@@danf@17-8-2009
10371740@unknown@formal@none@1@S@*[[William Brody]] President, [[Johns Hopkins University]]@@@@1@6@@danf@17-8-2009
10371750@unknown@formal@none@1@S@*[[Ken Chenault]] Chairman and CEO, [[American Express]] Company@@@@1@8@@danf@17-8-2009
10371760@unknown@formal@none@1@S@*Juergen Dormann Chairman of the Board, ABB Ltd@@@@1@8@@danf@17-8-2009
10371770@unknown@formal@none@1@S@*[[Michael Eskew]] Chairman and CEO, [[United Parcel Service]], Inc.@@@@1@9@@danf@17-8-2009
10371780@unknown@formal@none@1@S@*[[Shirley Ann Jackson]] President, [[Rensselaer Polytechnic Institute]]@@@@1@7@@danf@17-8-2009
10371790@unknown@formal@none@1@S@*Minoru Makihara Senior Corporate Advisor and former Chairman, [[Mitsubishi Corporation]]@@@@1@10@@danf@17-8-2009
10371800@unknown@formal@none@1@S@*Lucio Noto Managing Partner, Midstream Partners LLC@@@@1@7@@danf@17-8-2009
10371810@unknown@formal@none@1@S@*[[James W. Owens]] Chairman and CEO, [[Caterpillar Inc.]]@@@@1@8@@danf@17-8-2009
10371820@unknown@formal@none@1@S@*[[Samuel J. Palmisano]] Chairman, President and CEO, IBM@@@@1@8@@danf@17-8-2009
10371830@unknown@formal@none@1@S@*Joan Spero President, [[Doris Duke]] Charitable Foundation@@@@1@7@@danf@17-8-2009
10371840@unknown@formal@none@1@S@*Sidney Taurell Chairman and CEO, [[Eli Lilly and Company]]@@@@1@9@@danf@17-8-2009
10371850@unknown@formal@none@1@S@*[[Lorenzo Zambrano]] Chairman and CEO, [[Cemex]] SAB de CV@@@@1@9@@danf@17-8-2009
10380010@unknown@formal@none@1@S@Information@@@@1@1@@danf@17-8-2009
10380020@unknown@formal@none@1@S@'''Information''' as a [[Conveyed concept|concept]] has a diversity of meanings, from everyday usage to technical settings.@@@@1@16@@danf@17-8-2009
10380030@unknown@formal@none@1@S@Generally speaking, the concept of information is closely related to notions of [[constraint]], [[communication]], [[control system|control]], [[data]], [[form]], [[instruction]], [[knowledge]], [[Meaning (linguistics)|meaning]], [[stimulation|mental stimulus]], [[pattern]], [[perception]], and [[knowledge representation|representation]].@@@@1@29@@danf@17-8-2009
10380040@unknown@formal@none@1@S@Many people speak about the [[Information Age]] as the advent of the Knowledge Age or [[knowledge society]], the [[information society]], the [[Information revolution]], and [[Information technology|information technologies]], and even though [[informatics]], [[information science]] and [[computer science]] are often in the spotlight, the word "information" is often used without careful consideration of the various meanings it has acquired.@@@@1@57@@danf@17-8-2009
10380050@unknown@formal@none@1@S@== Etymology ==@@@@1@3@@danf@17-8-2009
10380060@unknown@formal@none@1@S@According to the [[Oxford English Dictionary]], the earliest historical meaning of the word ''information'' in [[English language|English]] was the act of ''informing'', or giving form or shape to the mind, as in education, instruction, or training.@@@@1@36@@danf@17-8-2009
10380070@unknown@formal@none@1@S@A quote from 1387: "Five books come down from heaven for information of mankind."@@@@1@14@@danf@17-8-2009
10380080@unknown@formal@none@1@S@It was also used for an ''item'' of training, ''e.g.'' a particular instruction.@@@@1@13@@danf@17-8-2009
10380090@unknown@formal@none@1@S@"Melibee had heard the great skills and reasons of Dame Prudence, and her wise information and techniques."@@@@1@17@@danf@17-8-2009
10380100@unknown@formal@none@1@S@(1386)@@@@1@1@@danf@17-8-2009
10380110@unknown@formal@none@1@S@The English word was apparently derived by adding the common "noun of action" ending "''-ation''" (descended through French from Latin "''-tio''") to the earlier verb ''to inform'', in the sense of to give form to the mind, to discipline, instruct, teach: "Men so wise should go and inform their kings."@@@@1@50@@danf@17-8-2009
10380120@unknown@formal@none@1@S@(1330) ''Inform'' itself comes (via French) from the Latin verb ''informare'', to give form to, to form an idea of.@@@@1@20@@danf@17-8-2009
10380125@unknown@formal@none@1@S@Furthermore, Latin itself already contained the word ''informatio'', meaning concept or idea, but the extent to which this may have influenced the development of the word ''information'' in English is unclear.@@@@1@31@@danf@17-8-2009
10380130@unknown@formal@none@1@S@As a final note, the ancient Greek word for ''form'' was [[eidos]], and this word was famously used in a technical philosophical sense by [[Plato]] (and later Aristotle) to denote the ideal identity or essence of something (see [[Theory of forms]]).@@@@1@41@@danf@17-8-2009
10380140@unknown@formal@none@1@S@"Eidos" can also be associated with [[thought]], [[proposition]] or even [[concept]].@@@@1@11@@danf@17-8-2009
10380150@unknown@formal@none@1@S@== Information as a message ==@@@@1@6@@danf@17-8-2009
10380160@unknown@formal@none@1@S@'''Information''' is the state of a system of interest.@@@@1@9@@danf@17-8-2009
10380170@unknown@formal@none@1@S@A message is information materialized.@@@@1@5@@danf@17-8-2009
10380180@unknown@formal@none@1@S@Information is a quality of a [[message]] from a [[sender]] to one or more receivers.@@@@1@15@@danf@17-8-2009
10380190@unknown@formal@none@1@S@Information is always ''about'' something (size of a parameter, occurrence of an event, etc.).@@@@1@14@@danf@17-8-2009
10380200@unknown@formal@none@1@S@Viewed in this manner, information does not have to be accurate.@@@@1@11@@danf@17-8-2009
10380210@unknown@formal@none@1@S@It may be a truth or a lie, or just the sound of a falling tree.@@@@1@16@@danf@17-8-2009
10380220@unknown@formal@none@1@S@Even a disruptive noise used to inhibit the flow of communication and create misunderstanding would in this view be a form of information.@@@@1@23@@danf@17-8-2009
10380230@unknown@formal@none@1@S@However, generally speaking, if the ''amount'' of information in the received message increases, the message is more accurate.@@@@1@18@@danf@17-8-2009
10380240@unknown@formal@none@1@S@This model assumes there is a definite [[sender]] and at least one receiver.@@@@1@13@@danf@17-8-2009
10380250@unknown@formal@none@1@S@Many refinements of the model assume the existence of a common language understood by the sender and at least one of the receivers.@@@@1@23@@danf@17-8-2009
10380260@unknown@formal@none@1@S@An important variation identifies information as that which would be communicated by a message if it were sent from a sender to a receiver capable of understanding the message.@@@@1@29@@danf@17-8-2009
10380270@unknown@formal@none@1@S@Notably, it is not required that the sender be capable of understanding the message, or even cognizant that there is a message.@@@@1@22@@danf@17-8-2009
10380280@unknown@formal@none@1@S@Thus, information is something that can be extracted from an environment, e.g., through observation, reading or measurement.@@@@1@17@@danf@17-8-2009
10380290@unknown@formal@none@1@S@Information is a term with many meanings depending on context, but is as a rule closely related to such concepts as meaning, knowledge, instruction, communication, representation, and mental stimulus.@@@@1@29@@danf@17-8-2009
10380300@unknown@formal@none@1@S@Simply stated, information is a message received and understood.@@@@1@9@@danf@17-8-2009
10380310@unknown@formal@none@1@S@In terms of data, it can be defined as a collection of facts from which conclusions may be drawn.@@@@1@19@@danf@17-8-2009
10380320@unknown@formal@none@1@S@There are many other aspects of information since it is the knowledge acquired through study or experience or instruction.@@@@1@19@@danf@17-8-2009
10380330@unknown@formal@none@1@S@But overall, information is the result of processing, manipulating and organizing data in a way that adds to the knowledge of the person receiving it.@@@@1@25@@danf@17-8-2009
10380340@unknown@formal@none@1@S@[[Communication theory]] provides a numerical measure of the uncertainty of an outcome.@@@@1@12@@danf@17-8-2009
10380350@unknown@formal@none@1@S@For example, we can say that "the signal contained thousands of bits of information".@@@@1@14@@danf@17-8-2009
10380360@unknown@formal@none@1@S@Communication theory tends to use the concept of [[information entropy]], generally attributed to [[C.E. Shannon]] (see below).@@@@1@17@@danf@17-8-2009
10380370@unknown@formal@none@1@S@Another form of information is [[Fisher information]], a concept of [[R.A. Fisher]].@@@@1@12@@danf@17-8-2009
10380380@unknown@formal@none@1@S@This is used in application of statistics to [[estimation theory]] and to science in general.@@@@1@15@@danf@17-8-2009
10380390@unknown@formal@none@1@S@Fisher information is thought of as the amount of information that a message carries about an unobservable parameter.@@@@1@18@@danf@17-8-2009
10380400@unknown@formal@none@1@S@It can be computed from knowledge of the [[likelihood function]] defining the system.@@@@1@13@@danf@17-8-2009
10380410@unknown@formal@none@1@S@For example, with a normal likelihood function, the Fisher information is the reciprocal of the variance of the law.@@@@1@19@@danf@17-8-2009
10380420@unknown@formal@none@1@S@In the absence of knowledge of the likelihood law, the Fisher information may be computed from normally distributed score data as the reciprocal of their second moment.@@@@1@27@@danf@17-8-2009
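As a minimal sketch of the normal case described above, assume the unobservable parameter is the mean of a normal law whose standard deviation is known. Under that assumption the Fisher information equals the expected squared score, which should agree with the reciprocal of the variance. The function names below are illustrative only and do not come from any library.

```python
import random


def normal_score(x, mu, sigma):
    """Derivative of the normal log-likelihood with respect to the mean mu."""
    return (x - mu) / sigma ** 2


def fisher_information_estimate(mu, sigma, n=100_000, seed=0):
    """Monte Carlo estimate of E[score^2] for a normal law with known sigma.

    Assumption: the parameter of interest is the mean, so the exact
    value is 1 / sigma**2 (the reciprocal of the variance).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(mu, sigma)
        total += normal_score(x, mu, sigma) ** 2
    return total / n


if __name__ == "__main__":
    sigma = 2.0
    print("estimated:", fisher_information_estimate(0.0, sigma))
    print("exact 1/variance:", 1.0 / sigma ** 2)
```

The estimate is only approximate and depends on the sample size; a smaller variance gives a larger value, reflecting that each observation says more about the mean.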
10380430@unknown@formal@none@1@S@Even though information and data are often used interchangeably, they are actually very different.@@@@1@14@@danf@17-8-2009
10380440@unknown@formal@none@1@S@Data is a set of unrelated information, and as such is of no use until it is properly evaluated.@@@@1@19@@danf@17-8-2009
10380450@unknown@formal@none@1@S@Upon evaluation, once there is some significant relation between data, and they show some relevance, then they are converted into information.@@@@1@21@@danf@17-8-2009
10380460@unknown@formal@none@1@S@Now this same data can be used for different purposes.@@@@1@10@@danf@17-8-2009
10380470@unknown@formal@none@1@S@Thus, until the data convey some information, they are not useful.@@@@1@11@@danf@17-8-2009
10380480@unknown@formal@none@1@S@=== Measuring information entropy ===@@@@1@5@@danf@17-8-2009
10380490@unknown@formal@none@1@S@The view of information as a message came into prominence with the publication in 1948 of an influential paper by [[Claude Shannon]], "[[A Mathematical Theory of Communication]]."@@@@1@27@@danf@17-8-2009
10380500@unknown@formal@none@1@S@This paper provides the foundations of [[information theory]] and endows the word ''information'' not only with a technical meaning but also a measure.@@@@1@23@@danf@17-8-2009
10380510@unknown@formal@none@1@S@If the sending device is equally likely to send any one of a set of ''N'' messages, then the preferred measure of "the information produced when one message is chosen from the set" is the base two [[logarithm]] of ''N'' (this measure is called ''[[self-information]]'').@@@@1@45@@danf@17-8-2009
10380520@unknown@formal@none@1@S@In this paper, Shannon continues:@@@@1@5@@danf@17-8-2009
10380530@unknown@formal@none@1@S@A complementary way of measuring information is provided by [[algorithmic information theory]].@@@@1@12@@danf@17-8-2009
10380540@unknown@formal@none@1@S@In brief, this measures the information content of a list of symbols based on how predictable they are, or more specifically how easy it is to compute the list through a [[computer program|program]]: the information content of a sequence is the number of bits of the shortest program that computes it.@@@@1@51@@danf@17-8-2009
10380550@unknown@formal@none@1@S@The sequence below would have a very low algorithmic information measurement since it is a very predictable pattern, and as the pattern continues the measurement would not change.@@@@1@28@@danf@17-8-2009
10380560@unknown@formal@none@1@S@Shannon information would give the same information measurement for each symbol, since they are [[statistical randomness|statistically random]], and each new symbol would increase the measurement.@@@@1@25@@danf@17-8-2009
10380570@unknown@formal@none@1@S@:123456789101112131415161718192021@@@@1@1@@danf@17-8-2009
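The two measures discussed in this section can be contrasted in a short Python sketch, under stated assumptions. The first function computes the base two logarithm of the number of equally likely messages. The second uses the zlib-compressed length of a byte string as a rough, computable stand-in for the length of the shortest program that produces it; this is only an upper-bound proxy, since the true algorithmic information content cannot be computed in general. All helper names are hypothetical.

```python
import math
import random
import zlib


def self_information_bits(n_messages):
    """Bits of information when one of n equally likely messages is chosen."""
    if n_messages < 1:
        raise ValueError("need at least one possible message")
    return math.log2(n_messages)


def counting_sequence(limit):
    """Digits of 1, 2, ..., limit written one after another, as ASCII bytes."""
    return "".join(str(i) for i in range(1, limit + 1)).encode("ascii")


def random_bytes(n, seed=0):
    """Pseudo-random bytes, seeded so that the sketch is repeatable."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


def compressed_size(data):
    """Length in bytes after zlib compression (a crude complexity proxy)."""
    return len(zlib.compress(data, 9))


if __name__ == "__main__":
    print(self_information_bits(8))  # 3.0 bits for 8 equally likely messages
    pattern = counting_sequence(5000)
    noise = random_bytes(len(pattern))
    print(len(pattern), compressed_size(pattern), compressed_size(noise))
```

The counting sequence is produced by a one-line program however long it grows, so a general-purpose compressor greatly overstates its algorithmic information content; the comparison with pseudo-random bytes only indicates the direction of the difference.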
10380580@unknown@formal@none@1@S@It is important to recognize the limitations of traditional information theory and algorithmic information theory from the perspective of human meaning.@@@@1@21@@danf@17-8-2009
10380590@unknown@formal@none@1@S@For example, when referring to the meaning content of a message Shannon noted “Frequently the messages have ''meaning…'' these semantic aspects of communication are irrelevant to the engineering problem.@@@@1@29@@danf@17-8-2009
10380600@unknown@formal@none@1@S@The significant aspect is that the actual message is one selected ''from a set of possible messages''” (emphasis in original).@@@@1@20@@danf@17-8-2009
10380610@unknown@formal@none@1@S@In information theory signals are part of a process, not a substance; they do something, they do not contain any specific meaning.@@@@1@22@@danf@17-8-2009
10380620@unknown@formal@none@1@S@Combining algorithmic information theory and information theory we can conclude that the most random signal contains the most information as it can be interpreted in any way and cannot be compressed.@@@@1@31@@danf@17-8-2009
10380630@unknown@formal@none@1@S@Michael Reddy noted that "'signals' of the [[mathematical theory]] are 'patterns that can be exchanged'.@@@@1@15@@danf@17-8-2009
10380640@unknown@formal@none@1@S@There is no message contained in the signal, the signals convey the ability to select from a set of possible messages."@@@@1@21@@danf@17-8-2009
10380650@unknown@formal@none@1@S@In information theory "the system must be designed to operate for each possible selection, not just the one which will actually be chosen since this is unknown at the time of design".@@@@1@32@@danf@17-8-2009
10380660@unknown@formal@none@1@S@== Information as a pattern ==@@@@1@6@@danf@17-8-2009
10380670@unknown@formal@none@1@S@Information is any represented [[pattern]].@@@@1@5@@danf@17-8-2009
10380680@unknown@formal@none@1@S@This view assumes neither accuracy nor directly communicating parties, but instead assumes a separation between an object and its representation.@@@@1@20@@danf@17-8-2009
10380690@unknown@formal@none@1@S@Consider the following example: [[economic statistics]] represent an [[Economics|economy]], however inaccurately.@@@@1@11@@danf@17-8-2009
10380700@unknown@formal@none@1@S@What are commonly referred to as data in [[computing]], [[statistics]], and other fields, are forms of information in this sense.@@@@1@20@@danf@17-8-2009
10380710@unknown@formal@none@1@S@The [[electromagnetism|electro-magnetic]] patterns in a [[computer network]] and connected [[peripheral device|device]]s are related to something other than the pattern itself, such as [[Character (computing)|text characters]] to be displayed and [[Computer keyboard|keyboard]] input.@@@@1@32@@danf@17-8-2009
10380720@unknown@formal@none@1@S@[[Signal (information theory)|Signal]]s, [[Sign (linguistics)|sign]]s, and [[symbol]]s are also in this category.@@@@1@12@@danf@17-8-2009
10380730@unknown@formal@none@1@S@On the other hand, according to [[semiotics]], data are symbols with a certain syntax and information is data with certain semantics.@@@@1@21@@danf@17-8-2009
10380740@unknown@formal@none@1@S@[[Painting]] and [[drawing]] contain information to the extent that they represent something such as an assortment of objects on a table, a [[profile]], or a [[landscape]].@@@@1@26@@danf@17-8-2009
10380750@unknown@formal@none@1@S@In other words, when a pattern of something is transposed to a pattern of something else, the latter is information.@@@@1@20@@danf@17-8-2009
10380760@unknown@formal@none@1@S@This would be the case whether or not there was anyone to perceive it.@@@@1@14@@danf@17-8-2009
10380770@unknown@formal@none@1@S@But if information can be defined merely as a pattern, does that mean that neither [[utility]] nor meaning are necessary components of information?@@@@1@23@@danf@17-8-2009
10380780@unknown@formal@none@1@S@Arguably a distinction must be made between raw unprocessed data and information which possesses utility, [[value (economics)|value]] or some quantum of meaning.@@@@1@22@@danf@17-8-2009
10380790@unknown@formal@none@1@S@On this view, information may indeed be characterized as a pattern; but this is a [[necessary]] condition, not a [[sufficient]] one.@@@@1@21@@danf@17-8-2009
10380800@unknown@formal@none@1@S@An individual entry in a telephone book, which follows a specific pattern formed by name, address and telephone number, does not become "informative" in some sense unless and until it possesses some degree of utility, value or meaning.@@@@1@38@@danf@17-8-2009
10380810@unknown@formal@none@1@S@For example, someone might look up a girlfriend's number or order a takeaway.@@@@1@13@@danf@17-8-2009
10380820@unknown@formal@none@1@S@The vast majority of numbers will never be construed as "information" in any meaningful sense.@@@@1@15@@danf@17-8-2009
10380830@unknown@formal@none@1@S@The gap between data and information is only closed by a behavioral bridge whereby some value, utility or meaning is added to transform mere data or pattern into information.@@@@1@29@@danf@17-8-2009
10380840@unknown@formal@none@1@S@When one constructs a representation of an object, one can selectively extract from the object ([[sampling (case studies)|sampling]]) or use a [[system]] of signs to replace ([[encode|encoding]]), or both.@@@@1@29@@danf@17-8-2009
10380850@unknown@formal@none@1@S@The sampling and encoding result in representation.@@@@1@7@@danf@17-8-2009
10380860@unknown@formal@none@1@S@An example of the former is a "sample" of a product; an example of the latter is "verbal description" of a product.@@@@1@22@@danf@17-8-2009
10380870@unknown@formal@none@1@S@Both contain information about the product, however inaccurate.@@@@1@8@@danf@17-8-2009
10380880@unknown@formal@none@1@S@When one interprets representation, one can predict a broader pattern from a limited number of observations (inference) or understand the relation between patterns of two different things ([[decode|decoding]]).@@@@1@28@@danf@17-8-2009
10380890@unknown@formal@none@1@S@One example of the former is to sip a [[soup]] to know if it is spoiled; an example of the latter is examining footprints to determine the animal and its condition.@@@@1@31@@danf@17-8-2009
10380900@unknown@formal@none@1@S@In both cases, information sources are not constructed or presented by some "sender" of information.@@@@1@15@@danf@17-8-2009
10380910@unknown@formal@none@1@S@Regardless, information is dependent upon, but usually unrelated to and separate from, the medium or media used to express it.@@@@1@20@@danf@17-8-2009
10380920@unknown@formal@none@1@S@In other words, the position of a theoretical series of bits, or even the output once interpreted by a [[computer]] or similar device, is unimportant, except when someone or something is present to interpret the information.@@@@1@36@@danf@17-8-2009
10380930@unknown@formal@none@1@S@Therefore, a quantity of information is totally distinct from its medium.@@@@1@11@@danf@17-8-2009
10380940@unknown@formal@none@1@S@== Information as sensory input ==@@@@1@6@@danf@17-8-2009
10380950@unknown@formal@none@1@S@Often information is viewed as a type of [[input]] to an [[organism]] or designed device.@@@@1@15@@danf@17-8-2009
10380960@unknown@formal@none@1@S@Inputs are of two kinds.@@@@1@5@@danf@17-8-2009
10380970@unknown@formal@none@1@S@Some inputs are important to the function of the organism (for example, food) or device ([[energy]]) by themselves.@@@@1@18@@danf@17-8-2009
10380980@unknown@formal@none@1@S@In his book ''Sensory Ecology,'' Dusenbery called these causal inputs.@@@@1@10@@danf@17-8-2009
10380990@unknown@formal@none@1@S@Other inputs (information) are important only because they are associated with causal inputs and can be used to predict the occurrence of a causal input at a later time (and perhaps another place).@@@@1@33@@danf@17-8-2009
10381000@unknown@formal@none@1@S@Some information is important because of association with other information but eventually there must be a connection to a causal input.@@@@1@21@@danf@17-8-2009
10381010@unknown@formal@none@1@S@In practice, information is usually carried by weak stimuli that must be detected by specialized sensory systems and amplified by energy inputs before they can be functional to the organism or device.@@@@1@32@@danf@17-8-2009
10381020@unknown@formal@none@1@S@For example, light is often a causal input to plants but provides information to animals.@@@@1@15@@danf@17-8-2009
10381030@unknown@formal@none@1@S@The colored light reflected from a flower is too weak to do much photosynthetic work but the visual system of the bee detects it and the bee's nervous system uses the information to guide the bee to the flower, where the bee often finds nectar or pollen, which are causal inputs, serving a nutritional function.@@@@1@55@@danf@17-8-2009
10381040@unknown@formal@none@1@S@Information is any type of sensory input.@@@@1@7@@danf@17-8-2009
10381050@unknown@formal@none@1@S@When an organism with a [[nervous system]] receives an input, it transforms the input into an electrical signal.@@@@1@18@@danf@17-8-2009
10381060@unknown@formal@none@1@S@This is regarded as information by some.@@@@1@7@@danf@17-8-2009
10381070@unknown@formal@none@1@S@The idea of representation is still relevant, but in a slightly different manner.@@@@1@13@@danf@17-8-2009
10381080@unknown@formal@none@1@S@That is, while [[abstract painting]] does not represent anything concretely, when the viewer sees the painting, it is nevertheless transformed into electrical signals that create a representation of the painting.@@@@1@30@@danf@17-8-2009
10381090@unknown@formal@none@1@S@Defined this way, information does not have to be related to truth, communication, or representation of an object.@@@@1@18@@danf@17-8-2009
10381100@unknown@formal@none@1@S@[[Entertainment]] in general is not intended to be informative.@@@@1@9@@danf@17-8-2009
10381110@unknown@formal@none@1@S@[[Music]], the [[performing arts]], [[amusement park]]s, works of [[fiction]] and so on are thus forms of information in this sense, but they are not necessarily forms of information according to some definitions given above.@@@@1@34@@danf@17-8-2009
10381120@unknown@formal@none@1@S@Consider another example: food supplies both nutrition and taste for those who eat it.@@@@1@14@@danf@17-8-2009
10381130@unknown@formal@none@1@S@If information is equated to sensory input, then nutrition is not information but taste is.@@@@1@15@@danf@17-8-2009
10381140@unknown@formal@none@1@S@== Information as an influence which leads to a transformation ==@@@@1@11@@danf@17-8-2009
10381150@unknown@formal@none@1@S@Information is any type of pattern that influences the formation or transformation of other patterns.@@@@1@15@@danf@17-8-2009
10381160@unknown@formal@none@1@S@In this sense, there is no need for a conscious mind to perceive, much less appreciate, the pattern.@@@@1@18@@danf@17-8-2009
10381170@unknown@formal@none@1@S@Consider, for example, [[DNA]].@@@@1@4@@danf@17-8-2009
10381180@unknown@formal@none@1@S@The sequence of [[nucleotide]]s is a pattern that influences the formation and development of an organism without any need for a conscious mind.@@@@1@23@@danf@17-8-2009
10381190@unknown@formal@none@1@S@[[Systems theory]] at times seems to refer to information in this sense, assuming information does not necessarily involve any conscious mind, and patterns circulating (due to [[feedback]]) in the system can be called information.@@@@1@34@@danf@17-8-2009
10381200@unknown@formal@none@1@S@In other words, it can be said that information in this sense is something potentially perceived as representation, though not created or presented for that purpose.@@@@1@26@@danf@17-8-2009
10381210@unknown@formal@none@1@S@When [[Marshall McLuhan]] speaks of [[media (communication)|media]] and their effects on human cultures, he refers to the structure of [[cultural artifact|artifacts]] that in turn shape our behaviors and mindsets.@@@@1@29@@danf@17-8-2009
10381220@unknown@formal@none@1@S@Also, [[pheromone]]s are often said to be "information" in this sense.@@@@1@11@@danf@17-8-2009
10381230@unknown@formal@none@1@S@(See also [[Gregory Bateson]].)@@@@1@4@@danf@17-8-2009
10381240@unknown@formal@none@1@S@== Information as a property in physics ==@@@@1@8@@danf@17-8-2009
10381250@unknown@formal@none@1@S@In 2003, J. D. Bekenstein claimed there is a growing trend in [[physics]] to define the physical world as being made of information itself (and thus information is defined in this way).@@@@1@32@@danf@17-8-2009
10381260@unknown@formal@none@1@S@Information has a well-defined meaning in physics.@@@@1@7@@danf@17-8-2009
10381270@unknown@formal@none@1@S@Examples of this include the phenomenon of [[quantum entanglement]] where particles can interact without reference to their separation or the speed of light.@@@@1@23@@danf@17-8-2009
10381280@unknown@formal@none@1@S@Information itself cannot travel faster than light even if the information is transmitted indirectly.@@@@1@14@@danf@17-8-2009
10381290@unknown@formal@none@1@S@This could mean that all attempts at physically observing a particle with an "entangled" relationship to another are slowed down, even though the particles are not connected in any way other than by the information they carry.@@@@1@37@@danf@17-8-2009
10381300@unknown@formal@none@1@S@Another link is demonstrated by the [[Maxwell's demon]] thought experiment.@@@@1@10@@danf@17-8-2009
10381310@unknown@formal@none@1@S@In this experiment, a direct relationship between information and another physical property, [[entropy]], is demonstrated.@@@@1@15@@danf@17-8-2009
10381320@unknown@formal@none@1@S@A consequence is that it is impossible to destroy information without increasing the entropy of a system; in practical terms this often means generating heat.@@@@1@25@@danf@17-8-2009
10381330@unknown@formal@none@1@S@Another, more philosophical, outcome is that information could be thought of as interchangeable with [[Energy#Transformations_of_energy|energy]].@@@@1@15@@danf@17-8-2009
10381340@unknown@formal@none@1@S@Thus, in the study of [[logic gates]], the theoretical lower bound of thermal energy released by an ''AND gate'' is higher than for the ''NOT gate'' (because information is destroyed in an ''AND gate'' and simply converted in a ''NOT gate'').@@@@1@41@@danf@17-8-2009
10381350@unknown@formal@none@1@S@Physical information is of particular importance in the theory of [[quantum computers]].@@@@1@12@@danf@17-8-2009
10381360@unknown@formal@none@1@S@== Information as records ==@@@@1@5@@danf@17-8-2009
10381370@unknown@formal@none@1@S@Records are a specialized form of information.@@@@1@7@@danf@17-8-2009
10381380@unknown@formal@none@1@S@Essentially, records are information produced consciously or as by-products of business activities or transactions and retained because of their value.@@@@1@20@@danf@17-8-2009
10381390@unknown@formal@none@1@S@Primarily their value is as evidence of the activities of the organization but they may also be retained for their informational value.@@@@1@22@@danf@17-8-2009
10381400@unknown@formal@none@1@S@Sound [[records management]] ensures that the integrity of records is preserved for as long as they are required.@@@@1@18@@danf@17-8-2009
10381410@unknown@formal@none@1@S@The international standard on records management, ISO 15489, defines records as "information created, received, and maintained as evidence and information by an organization or person, in pursuance of legal obligations or in the transaction of business".@@@@1@36@@danf@17-8-2009
10381420@unknown@formal@none@1@S@The International Council on Archives (ICA) Committee on Electronic Records defined a record as "a specific piece of recorded information generated, collected or received in the initiation, conduct or completion of an activity and that comprises sufficient content, context and structure to provide proof or evidence of that activity".@@@@1@49@@danf@17-8-2009
10381430@unknown@formal@none@1@S@Records may be retained because of their business value, as part of the [[corporate memory]] of the organization or to meet legal, fiscal or accountability requirements imposed on the organization.@@@@1@30@@danf@17-8-2009
10381440@unknown@formal@none@1@S@Willis (2005) expressed the view that sound management of business records and information delivered "…six key requirements for good [[corporate governance]]…transparency; accountability; due process; compliance; meeting statutory and common law requirements; and security of personal and corporate information."@@@@1@38@@danf@17-8-2009
10381450@unknown@formal@none@1@S@== Information and semiotics ==@@@@1@5@@danf@17-8-2009
10381460@unknown@formal@none@1@S@Beynon-Davies explains the multi-faceted concept of information in terms of that of signs and sign-systems.@@@@1@15@@danf@17-8-2009
10381470@unknown@formal@none@1@S@Signs themselves can be considered in terms of four inter-dependent levels, layers or branches of [[semiotics]]: pragmatics, semantics, syntactics and empirics.@@@@1@21@@danf@17-8-2009
10381480@unknown@formal@none@1@S@These four layers serve to connect the social world on the one hand with the physical or technical world on the other.@@@@1@22@@danf@17-8-2009
10381490@unknown@formal@none@1@S@[[Pragmatics]] is concerned with the purpose of communication.@@@@1@8@@danf@17-8-2009
10381500@unknown@formal@none@1@S@Pragmatics links the issue of signs with that of intention.@@@@1@10@@danf@17-8-2009
10381510@unknown@formal@none@1@S@The focus of pragmatics is on the intentions of human agents underlying communicative behaviour.@@@@1@14@@danf@17-8-2009
10381520@unknown@formal@none@1@S@In other words, intentions link language to action.@@@@1@8@@danf@17-8-2009
10381530@unknown@formal@none@1@S@[[Semantics]] is concerned with the meaning of a message conveyed in a communicative act.@@@@1@14@@danf@17-8-2009
10381535@unknown@formal@none@1@S@Semantics considers the content of communication.@@@@1@6@@danf@17-8-2009
10381540@unknown@formal@none@1@S@Semantics is the study of the meaning of signs - the association between signs and behaviour.@@@@1@16@@danf@17-8-2009
10381550@unknown@formal@none@1@S@Semantics can be considered as the study of the link between symbols and their referents or concepts; particularly the way in which signs relate to human behaviour.@@@@1@27@@danf@17-8-2009
10381560@unknown@formal@none@1@S@Syntactics is concerned with the formalism used to represent a message.@@@@1@11@@danf@17-8-2009
10381570@unknown@formal@none@1@S@Syntactics as an area studies the form of communication in terms of the logic and grammar of sign systems.@@@@1@19@@danf@17-8-2009
10381580@unknown@formal@none@1@S@Syntactics is devoted to the study of the form rather than the content of signs and sign-systems.@@@@1@17@@danf@17-8-2009
10381590@unknown@formal@none@1@S@Empirics is the study of the signals used to carry a message; the physical characteristics of the medium of communication.@@@@1@20@@danf@17-8-2009
10381600@unknown@formal@none@1@S@Empirics is devoted to the study of communication channels and their characteristics, e.g., sound, light, electronic transmission etc.@@@@1@18@@danf@17-8-2009
10381610@unknown@formal@none@1@S@Communication normally exists within the context of some social situation.@@@@1@10@@danf@17-8-2009
10381620@unknown@formal@none@1@S@The social situation sets the context for the intentions conveyed (pragmatics) and the form in which communication takes place.@@@@1@19@@danf@17-8-2009
10381630@unknown@formal@none@1@S@In a communicative situation intentions are expressed through messages which comprise collections of inter-related signs taken from a language which is mutually understood by the agents involved in the communication.@@@@1@30@@danf@17-8-2009
10381640@unknown@formal@none@1@S@Mutual understanding implies that agents involved understand the chosen language in terms of its agreed syntax (syntactics) and semantics.@@@@1@19@@danf@17-8-2009
10381650@unknown@formal@none@1@S@The sender codes the message in the language and sends the message as signals along some communication channel (empirics).@@@@1@19@@danf@17-8-2009
10381660@unknown@formal@none@1@S@The chosen communication channel will have inherent properties which determine outcomes such as the speed with which communication can take place and over what distance.@@@@1@25@@danf@17-8-2009
10390010@unknown@formal@none@1@S@Information extraction@@@@1@2@@danf@17-8-2009
10390020@unknown@formal@none@1@S@In [[natural language processing]], '''information extraction''' (IE) is a type of [[information retrieval]] whose goal is to automatically extract structured information, i.e. categorized and contextually and semantically well-defined data from a certain domain, from unstructured [[machine-readable]] documents.@@@@1@37@@danf@17-8-2009
10390030@unknown@formal@none@1@S@An example of information extraction is the extraction of instances of corporate mergers, more formally , from an online news sentence such as: "Yesterday, New-York based Foo Inc. announced their acquisition of Bar Corp."@@@@1@36@@danf@17-8-2009
10390040@unknown@formal@none@1@S@A broad goal of IE is to allow computation to be done on the previously unstructured data.@@@@1@17@@danf@17-8-2009
10390050@unknown@formal@none@1@S@A more specific goal is to allow logical reasoning to draw inferences based on the logical content of the input data.@@@@1@21@@danf@17-8-2009
10390060@unknown@formal@none@1@S@The significance of IE is determined by the growing amount of information available in unstructured (i.e. without [[metadata]]) form, for instance on the Internet.@@@@1@24@@danf@17-8-2009
10390070@unknown@formal@none@1@S@This knowledge can be made more accessible by means of transformation into [[relational database|relational form]], or by marking-up with [[XML]] tags.@@@@1@21@@danf@17-8-2009
10390080@unknown@formal@none@1@S@An intelligent agent monitoring a news data feed requires IE to transform unstructured data into something that can be reasoned with.@@@@1@21@@danf@17-8-2009
10390090@unknown@formal@none@1@S@A typical application of IE is to scan a set of documents written in a [[natural language]] and populate a database with the information extracted.@@@@1@25@@danf@17-8-2009
10390100@unknown@formal@none@1@S@Current approaches to IE use [[natural language processing]] techniques that focus on very restricted domains.@@@@1@15@@danf@17-8-2009
10390110@unknown@formal@none@1@S@For example, the ''[[Message Understanding Conference]]'' (MUC) is a competition-based conference that focused on the following domains in the past:@@@@1@20@@danf@17-8-2009
10390120@unknown@formal@none@1@S@*MUC-1 (1987), MUC-2 (1989): Naval operations messages.@@@@1@7@@danf@17-8-2009
10390130@unknown@formal@none@1@S@*MUC-3 (1991), MUC-4 (1992): Terrorism in Latin American countries.@@@@1@9@@danf@17-8-2009
10390140@unknown@formal@none@1@S@*MUC-5 (1993): Joint ventures and microelectronics domain.@@@@1@7@@danf@17-8-2009
10390150@unknown@formal@none@1@S@*MUC-6 (1995): News articles on management changes.@@@@1@7@@danf@17-8-2009
10390160@unknown@formal@none@1@S@*MUC-7 (1998): Satellite launch reports.@@@@1@5@@danf@17-8-2009
10390170@unknown@formal@none@1@S@Natural language texts may need some form of [[text simplification]] to produce more easily machine-readable text from which sentences can be extracted.@@@@1@22@@danf@17-8-2009
10390180@unknown@formal@none@1@S@Typical subtasks of IE are:@@@@1@5@@danf@17-8-2009
10390190@unknown@formal@none@1@S@* [[Named Entity Recognition]]: recognition of entity names (for people and organizations), place names, temporal expressions, and certain types of numerical expressions.@@@@1@22@@danf@17-8-2009
10390200@unknown@formal@none@1@S@* [[Coreference]]: identification of chains of [[noun phrase]]s that refer to the same object.@@@@1@14@@danf@17-8-2009
10390210@unknown@formal@none@1@S@For example, [[Anaphora (linguistics)|anaphora]] is a type of coreference.@@@@1@9@@danf@17-8-2009
10390220@unknown@formal@none@1@S@* [[Terminology extraction]]: finding the relevant terms for a given [[text corpus|corpus]]@@@@1@12@@danf@17-8-2009
10390230@unknown@formal@none@1@S@* Relation Extraction: identification of relations between entities, such as:@@@@1@10@@danf@17-8-2009
10390240@unknown@formal@none@1@S@**PERSON works for ORGANIZATION (extracted from the sentence "Bill works for IBM.")@@@@1@12@@danf@17-8-2009
10390250@unknown@formal@none@1@S@**PERSON located in LOCATION (extracted from the sentence "Bill is in France.")@@@@1@12@@danf@17-8-2009
10400010@unknown@formal@none@1@S@Information retrieval@@@@1@2@@danf@17-8-2009
10400020@unknown@formal@none@1@S@'''Information retrieval''' ('''IR''') is the science of searching for documents, for [[information]] within documents and for [[Metadata (computing)|metadata]] about documents, as well as that of searching [[relational database]]s and the [[World Wide Web]].@@@@1@33@@danf@17-8-2009
10400030@unknown@formal@none@1@S@There is overlap in the usage of the terms data retrieval, [[document retrieval]], information retrieval, and [[text retrieval]], but each also has its own body of literature, theory, [[Praxis (process)|praxis]] and technologies.@@@@1@32@@danf@17-8-2009
10400040@unknown@formal@none@1@S@IR is [[interdisciplinary]], based on [[computer science]], [[mathematics]], [[library science]], [[information science]], [[information architecture]], [[cognitive psychology]], [[linguistics]], [[statistics]] and [[physics]].@@@@1@20@@danf@17-8-2009
10400050@unknown@formal@none@1@S@Automated information retrieval systems are used to reduce what has been called "[[information overload]]".@@@@1@14@@danf@17-8-2009
10400060@unknown@formal@none@1@S@Many universities and [[public library|public libraries]] use IR systems to provide access to books, journals and other documents.@@@@1@18@@danf@17-8-2009
10400070@unknown@formal@none@1@S@Web [[Web search engine|search engine]]s are the most visible [[Information retrieval applications|IR applications]].@@@@1@13@@danf@17-8-2009
10400080@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009
10400090@unknown@formal@none@1@S@The idea of using computers to search for relevant pieces of information was popularized in an article ''[[As We May Think]]'' by [[Vannevar Bush]] in 1945.@@@@1@26@@danf@17-8-2009
10400100@unknown@formal@none@1@S@The first implementations of information retrieval systems were introduced in the 1950s and 1960s.@@@@1@14@@danf@17-8-2009
10400110@unknown@formal@none@1@S@By 1990 several different techniques had been shown to perform well on small text corpora (several thousand documents).@@@@1@18@@danf@17-8-2009
10400120@unknown@formal@none@1@S@In 1992 the US Department of Defense, along with the [[National Institute of Standards and Technology]] (NIST), cosponsored the [[Text Retrieval Conference]] (TREC) as part of the TIPSTER text program.@@@@1@30@@danf@17-8-2009
10400130@unknown@formal@none@1@S@The aim was to support the information retrieval community by supplying the infrastructure needed to evaluate text retrieval methodologies on a very large text collection.@@@@1@25@@danf@17-8-2009
10400140@unknown@formal@none@1@S@This catalyzed research on methods that [[scalability|scale]] to huge corpora.@@@@1@10@@danf@17-8-2009
10400150@unknown@formal@none@1@S@The introduction of web [[Web search engine|search engine]]s has boosted the need for very large scale retrieval systems even further.@@@@1@20@@danf@17-8-2009
10400160@unknown@formal@none@1@S@The use of digital methods for storing and retrieving information has led to the phenomenon of [[digital obsolescence]], where a digital resource ceases to be readable because the physical media, the reader required to read the media, the hardware, or the software that runs on it, is no longer available.@@@@1@50@@danf@17-8-2009
10400170@unknown@formal@none@1@S@The information is initially easier to retrieve than if it were on paper, but is then effectively lost.@@@@1@18@@danf@17-8-2009
10400180@unknown@formal@none@1@S@=== Timeline ===@@@@1@3@@danf@17-8-2009
10400190@unknown@formal@none@1@S@* 1890: Hollerith tabulating machines were used to analyze the US census.@@@@1@12@@danf@17-8-2009
10400200@unknown@formal@none@1@S@([[Herman Hollerith]]).@@@@1@2@@danf@17-8-2009
10400210@unknown@formal@none@1@S@* 1945: [[Vannevar Bush]]'s ''[[As We May Think]]'' appeared in ''[[Atlantic Monthly]]''@@@@1@12@@danf@17-8-2009
10400220@unknown@formal@none@1@S@* Late 1940s: The US military confronted problems of indexing and retrieval of wartime scientific research documents captured from Germans.@@@@1@20@@danf@17-8-2009
10400230@unknown@formal@none@1@S@* 1947: [[Hans Peter Luhn]] (research engineer at IBM since 1941) began work on a mechanized, punch card based system for searching chemical compounds.@@@@1@24@@danf@17-8-2009
10400240@unknown@formal@none@1@S@* 1950: The term "information retrieval" may have been coined by [[Calvin Mooers]].@@@@1@13@@danf@17-8-2009
10400250@unknown@formal@none@1@S@* 1950s: Growing concern in the US for a "science gap" with the USSR motivated, encouraged funding, and provided a backdrop for mechanized literature searching systems ([[Allen Kent]] et al) and the invention of citation indexing ([[Eugene Garfield]]).@@@@1@38@@danf@17-8-2009
10400260@unknown@formal@none@1@S@* 1955: Allen Kent joined [[Case Western Reserve University]], and eventually became associate director of the Center for Documentation and Communications Research.@@@@1@22@@danf@17-8-2009
10400270@unknown@formal@none@1@S@That same year, Kent and colleagues published a paper in American Documentation describing the precision and recall measures, as well as detailing a proposed "framework" for evaluating an IR system, which includes statistical sampling methods for determining the number of relevant documents not retrieved.@@@@1@44@@danf@17-8-2009
10400280@unknown@formal@none@1@S@* 1958: The International Conference on Scientific Information in Washington, DC included consideration of IR systems as a solution to problems identified.@@@@1@22@@danf@17-8-2009
10400290@unknown@formal@none@1@S@See: Proceedings of the International Conference on Scientific Information, 1958 (National Academy of Sciences, Washington, DC, 1959)@@@@1@17@@danf@17-8-2009
10400300@unknown@formal@none@1@S@* 1959: Hans Peter Luhn published "Auto-encoding of documents for information retrieval."@@@@1@12@@danf@17-8-2009
10400310@unknown@formal@none@1@S@* 1960: Melvin Earl (Bill) Maron and J. L. Kuhns published "On relevance, probabilistic indexing, and information retrieval" in Journal of the ACM 7(3):216-244, July 1960.@@@@1@26@@danf@17-8-2009
10400320@unknown@formal@none@1@S@* Early 1960s: [[Gerard Salton]] began work on IR at Harvard, later moved to Cornell.@@@@1@15@@danf@17-8-2009
10400330@unknown@formal@none@1@S@* 1962: [[Cyril W. Cleverdon]] published early findings of the Cranfield studies, developing a model for IR system evaluation.@@@@1@19@@danf@17-8-2009
10400340@unknown@formal@none@1@S@See: Cyril W. Cleverdon, "Report on the Testing and Analysis of an Investigation into the Comparative Efficiency of Indexing Systems".@@@@1@20@@danf@17-8-2009
10400350@unknown@formal@none@1@S@Cranfield Coll. of Aeronautics, Cranfield, England, 1962.@@@@1@7@@danf@17-8-2009
10400360@unknown@formal@none@1@S@* 1962: Kent published Information Analysis and Retrieval@@@@1@8@@danf@17-8-2009
10400370@unknown@formal@none@1@S@* 1963: Weinberg report "Science, Government and Information" gave a full articulation of the idea of a "crisis of scientific information."@@@@1@21@@danf@17-8-2009
10400380@unknown@formal@none@1@S@The report was named after Dr. [[Alvin Weinberg]].@@@@1@8@@danf@17-8-2009
10400390@unknown@formal@none@1@S@* 1963: [[Joseph Becker]] and [[Robert M. Hayes]] published text on information retrieval.@@@@1@13@@danf@17-8-2009
10400400@unknown@formal@none@1@S@Becker, Joseph; Hayes, Robert Mayo.@@@@1@5@@danf@17-8-2009
10400410@unknown@formal@none@1@S@Information storage and retrieval: tools, elements, theories.@@@@1@7@@danf@17-8-2009
10400420@unknown@formal@none@1@S@New York, Wiley (1963).@@@@1@4@@danf@17-8-2009
10400430@unknown@formal@none@1@S@* 1964: [[Karen Spärck Jones]] finished her thesis at Cambridge, ''Synonymy and Semantic Classification'', and continued work on [[computational linguistics]] as it applies to IR@@@@1@25@@danf@17-8-2009
10400440@unknown@formal@none@1@S@* 1964: The [[National Bureau of Standards]] sponsored a symposium titled "Statistical Association Methods for Mechanized Documentation."@@@@1@17@@danf@17-8-2009
10400450@unknown@formal@none@1@S@Several highly significant papers were presented, including what is believed to be G. Salton's first published reference to the SMART system.@@@@1@21@@danf@17-8-2009
10400460@unknown@formal@none@1@S@* Mid-1960s: National Library of Medicine developed [[MEDLARS]] Medical Literature Analysis and Retrieval System, the first major machine-readable database and batch retrieval system@@@@1@23@@danf@17-8-2009
10400470@unknown@formal@none@1@S@* Mid-1960s: Project Intrex at MIT@@@@1@6@@danf@17-8-2009
10400480@unknown@formal@none@1@S@* 1965: [[J. C. R. Licklider]] published ''Libraries of the Future''@@@@1@11@@danf@17-8-2009
10400490@unknown@formal@none@1@S@* 1966: [[Don Swanson]] was involved in studies at University of Chicago on Requirements for Future Catalogs@@@@1@17@@danf@17-8-2009
10400500@unknown@formal@none@1@S@* 1968: Gerard Salton published ''Automatic Information Organization and Retrieval''.@@@@1@10@@danf@17-8-2009
10400510@unknown@formal@none@1@S@* 1968: [[J. W. Sammon]]'s RADC Tech report "Some Mathematics of Information Storage and Retrieval..." outlined the vector model.@@@@1@19@@danf@17-8-2009
10400520@unknown@formal@none@1@S@* 1969: Sammon's "A nonlinear mapping for data structure analysis" (IEEE Transactions on Computers) was the first proposal for visualization interface to an IR system.@@@@1@25@@danf@17-8-2009
10400530@unknown@formal@none@1@S@* Late 1960s: [[F. W. Lancaster]] completed evaluation studies of the MEDLARS system and published the first edition of his text on information retrieval@@@@1@24@@danf@17-8-2009
10400540@unknown@formal@none@1@S@* Early 1970s: first online systems--NLM's AIM-TWX, MEDLINE; Lockheed's Dialog; SDC's ORBIT@@@@1@12@@danf@17-8-2009
10400550@unknown@formal@none@1@S@* Early 1970s: [[Theodor Nelson]] promoted the concept of [[hypertext]] and published ''Computer Lib/Dream Machines''@@@@1@15@@danf@17-8-2009
10400560@unknown@formal@none@1@S@* 1971: [[N. Jardine]] and [[C. J. Van Rijsbergen]] published "The use of hierarchic clustering in information retrieval", which articulated the "cluster hypothesis."@@@@1@23@@danf@17-8-2009
10400570@unknown@formal@none@1@S@(Information Storage and Retrieval, 7(5), pp. 217-240, Dec 1971)@@@@1@9@@danf@17-8-2009
10400580@unknown@formal@none@1@S@*1975: Three highly influential publications by Salton fully articulated his vector processing framework and term discrimination model:@@@@1@17@@danf@17-8-2009
10400590@unknown@formal@none@1@S@** A Theory of Indexing (Society for Industrial and Applied Mathematics)@@@@1@11@@danf@17-8-2009
10400600@unknown@formal@none@1@S@** "A theory of term importance in automatic text analysis", (JASIS v. 26)@@@@1@13@@danf@17-8-2009
10400610@unknown@formal@none@1@S@** "A vector space model for automatic indexing", (CACM 18:11)@@@@1@10@@danf@17-8-2009
10400620@unknown@formal@none@1@S@* 1978: The First [[Association for Computing Machinery|ACM]] [[SIGIR]] conference.@@@@1@10@@danf@17-8-2009
10400630@unknown@formal@none@1@S@* 1979: C. J. Van Rijsbergen published ''Information Retrieval'' (Butterworths).@@@@1@10@@danf@17-8-2009
10400640@unknown@formal@none@1@S@Heavy emphasis on probabilistic models.@@@@1@5@@danf@17-8-2009
10400650@unknown@formal@none@1@S@* 1980: First international ACM SIGIR conference, joint with British Computer Society IR group in Cambridge@@@@1@16@@danf@17-8-2009
10400660@unknown@formal@none@1@S@* 1982: [[Nicholas J. Belkin|Belkin]], Oddy, and Brooks proposed the ASK (Anomalous State of Knowledge) viewpoint for information retrieval.@@@@1@19@@danf@17-8-2009
10400670@unknown@formal@none@1@S@This was an important concept, though their automated analysis tool proved ultimately disappointing.@@@@1@13@@danf@17-8-2009
10400680@unknown@formal@none@1@S@* 1983: Salton (and M. McGill) published Introduction to Modern Information Retrieval (McGraw-Hill), with heavy emphasis on vector models.@@@@1@19@@danf@17-8-2009
10400690@unknown@formal@none@1@S@* Mid-1980s: Efforts to develop end user versions of commercial IR systems.@@@@1@12@@danf@17-8-2009
10400700@unknown@formal@none@1@S@* 1985-1993: Key papers on and experimental systems for visualization interfaces.@@@@1@11@@danf@17-8-2009
10400710@unknown@formal@none@1@S@* Work by [[D. B. Crouch]], [[Robert R. Korfhage]], [[M. Chalmers]], [[A. Spoerri]] and others.@@@@1@15@@danf@17-8-2009
10400720@unknown@formal@none@1@S@* 1989: First [[World Wide Web]] proposals by [[Tim Berners-Lee]] at [[CERN]].@@@@1@12@@danf@17-8-2009
10400730@unknown@formal@none@1@S@* 1992: First TREC conference.@@@@1@5@@danf@17-8-2009
10400740@unknown@formal@none@1@S@* 1997: Publication of [[Robert R. Korfhage|Korfhage]]'s ''Information Storage and Retrieval'' with emphasis on visualization and multi-reference point systems.@@@@1@19@@danf@17-8-2009
10400750@unknown@formal@none@1@S@* Late 1990s: Web [[Web search engine|search engine]] implementation of many features formerly found only in experimental IR systems@@@@1@19@@danf@17-8-2009
10400760@unknown@formal@none@1@S@== Overview ==@@@@1@3@@danf@17-8-2009
10400770@unknown@formal@none@1@S@An information retrieval process begins when a user enters a query into the system.@@@@1@14@@danf@17-8-2009
10400780@unknown@formal@none@1@S@Queries are formal statements of [[information need]]s, for example search strings in web search engines.@@@@1@15@@danf@17-8-2009
10400790@unknown@formal@none@1@S@In information retrieval a query does not uniquely identify a single object in the collection.@@@@1@15@@danf@17-8-2009
10400800@unknown@formal@none@1@S@Instead, several objects may match the query, perhaps with different degrees of [[relevance|relevancy]].@@@@1@13@@danf@17-8-2009
10400810@unknown@formal@none@1@S@An object is an entity which keeps or stores information in a database.@@@@1@13@@danf@17-8-2009
10400820@unknown@formal@none@1@S@User queries are matched to objects stored in the database.@@@@1@10@@danf@17-8-2009
10400830@unknown@formal@none@1@S@Depending on the [[Information retrieval applications|application]] the data objects may be, for example, text documents, images or videos.@@@@1@18@@danf@17-8-2009
10400840@unknown@formal@none@1@S@Often the documents themselves are not kept or stored directly in the IR system, but are instead represented in the system by document surrogates.@@@@1@24@@danf@17-8-2009
10400850@unknown@formal@none@1@S@Most IR systems compute a numeric score for how well each object in the database matches the query, and rank the objects according to this value.@@@@1@26@@danf@17-8-2009
10400860@unknown@formal@none@1@S@The top ranking objects are then shown to the user.@@@@1@10@@danf@17-8-2009
10400870@unknown@formal@none@1@S@The process may then be iterated if the user wishes to refine the query.@@@@1@14@@danf@17-8-2009
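The score-and-rank loop described above can be illustrated with a minimal Python sketch. The scoring function here (counting query-term occurrences) is an assumption chosen for brevity, not a method prescribed by the article, and the names `score` and `rank` are hypothetical.

```python
from collections import Counter


def score(query: str, document: str) -> int:
    """Toy relevance score: total occurrences of the query terms in the document."""
    doc_terms = Counter(document.lower().split())
    return sum(doc_terms[term] for term in query.lower().split())


def rank(query: str, documents: list[str], top_k: int = 3) -> list[tuple[int, str]]:
    """Score every document and return the top_k (score, document) pairs."""
    scored = [(score(query, doc), doc) for doc in documents]
    # Sort by score only; Python's sort is stable, so ties keep collection order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

A real system would replace `score` with one of the retrieval models listed later in the article; the ranking step stays the same.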
10400880@unknown@formal@none@1@S@== Performance measures ==@@@@1@4@@danf@17-8-2009
10400890@unknown@formal@none@1@S@Many different measures for evaluating the performance of information retrieval systems have been proposed.@@@@1@14@@danf@17-8-2009
10400900@unknown@formal@none@1@S@The measures require a collection of documents and a query.@@@@1@10@@danf@17-8-2009
10400910@unknown@formal@none@1@S@All common measures described here assume a ground truth notion of relevancy: every document is known to be either relevant or non-relevant to a particular query.@@@@1@26@@danf@17-8-2009
10400920@unknown@formal@none@1@S@In practice queries may be [[ill-posed]] and there may be different shades of relevancy.@@@@1@14@@danf@17-8-2009
10400930@unknown@formal@none@1@S@=== Precision ===@@@@1@3@@danf@17-8-2009
10400940@unknown@formal@none@1@S@Precision is the fraction of the documents retrieved that are [[Relevance (information retrieval)|relevant]] to the user's information need.@@@@1@18@@danf@17-8-2009
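10400950@unknown@formal@none@1@S@:<math>\mbox{precision}=\frac{|\{\mbox{relevant documents}\}\cap\{\mbox{retrieved documents}\}|}{|\{\mbox{retrieved documents}\}|}</math>@@@@1@6@@danf@17-8-2009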
10400960@unknown@formal@none@1@S@In [[binary classification]], precision is analogous to [[positive predictive value]].@@@@1@10@@danf@17-8-2009
10400970@unknown@formal@none@1@S@Precision takes all retrieved documents into account.@@@@1@7@@danf@17-8-2009
10400980@unknown@formal@none@1@S@It can also be evaluated at a given cut-off rank, considering only the topmost results returned by the system.@@@@1@19@@danf@17-8-2009
10400990@unknown@formal@none@1@S@This measure is called ''precision at n'' or ''P\sn''.@@@@1@9@@danf@17-8-2009
10401000@unknown@formal@none@1@S@Note that the meaning and usage of "precision" in the field of Information Retrieval differs from the definition of [[accuracy and precision]] within other branches of science and technology.@@@@1@29@@danf@17-8-2009
10401010@unknown@formal@none@1@S@=== Recall ===@@@@1@3@@danf@17-8-2009
10401020@unknown@formal@none@1@S@Recall is the fraction of the documents that are relevant to the query that are successfully retrieved.@@@@1@17@@danf@17-8-2009
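10401030@unknown@formal@none@1@S@:<math>\mbox{recall}=\frac{|\{\mbox{relevant documents}\}\cap\{\mbox{retrieved documents}\}|}{|\{\mbox{relevant documents}\}|}</math>@@@@1@5@@danf@17-8-2009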
10401040@unknown@formal@none@1@S@In binary classification, recall is called [[sensitivity (tests)|sensitivity]].@@@@1@8@@danf@17-8-2009
10401050@unknown@formal@none@1@S@So it can be looked at as ''the probability that a relevant document is retrieved by the query''.@@@@1@18@@danf@17-8-2009
10401060@unknown@formal@none@1@S@It is trivial to achieve recall of 100% by returning all documents in response to any query.@@@@1@17@@danf@17-8-2009
10401070@unknown@formal@none@1@S@Recall alone is therefore not a sufficient measure; the number of non-relevant documents retrieved must also be taken into account, for example by computing the precision.@@@@1@23@@danf@17-8-2009
10401080@unknown@formal@none@1@S@=== Fall-Out ===@@@@1@3@@danf@17-8-2009
10401090@unknown@formal@none@1@S@The proportion of non-relevant documents that are retrieved, out of all non-relevant documents available:@@@@1@14@@danf@17-8-2009
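10401100@unknown@formal@none@1@S@:<math>\mbox{fall-out}=\frac{|\{\mbox{non-relevant documents}\}\cap\{\mbox{retrieved documents}\}|}{|\{\mbox{non-relevant documents}\}|}</math>@@@@1@6@@danf@17-8-2009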
10401110@unknown@formal@none@1@S@In binary classification, fall-out is closely related to [[specificity (tests)|specificity]].@@@@1@10@@danf@17-8-2009
10401120@unknown@formal@none@1@S@More precisely: <math>\mbox{fall-out}=1-\mbox{specificity}</math>.@@@@1@3@@danf@17-8-2009
10401130@unknown@formal@none@1@S@It can be looked at as ''the probability that a non-relevant document is retrieved by the query''.@@@@1@17@@danf@17-8-2009
10401140@unknown@formal@none@1@S@It is trivial to achieve fall-out of 0% by returning zero documents in response to any query.@@@@1@17@@danf@17-8-2009
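The three set-based measures above (precision, recall and fall-out) can be computed directly from sets of document identifiers. The following Python sketch assumes non-empty denominators and uses hypothetical function names; it is an illustration rather than a reference implementation.

```python
def precision(relevant: set, retrieved: set) -> float:
    """Fraction of retrieved documents that are relevant (retrieved must be non-empty)."""
    return len(relevant & retrieved) / len(retrieved)


def recall(relevant: set, retrieved: set) -> float:
    """Fraction of relevant documents that were retrieved (relevant must be non-empty)."""
    return len(relevant & retrieved) / len(relevant)


def fall_out(relevant: set, retrieved: set, collection: set) -> float:
    """Fraction of non-relevant documents that were retrieved."""
    non_relevant = collection - relevant
    return len(non_relevant & retrieved) / len(non_relevant)
```

For a ten-document collection `set(range(10))` with relevant documents `{0, 1, 2, 3}` and retrieved documents `{2, 3, 4, 5, 6}`, this gives a precision of 0.4, a recall of 0.5 and a fall-out of 0.5.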
10401150@unknown@formal@none@1@S@=== F-measure ===@@@@1@3@@danf@17-8-2009
10401160@unknown@formal@none@1@S@The weighted [[harmonic mean]] of precision and recall, the traditional F-measure or balanced F-score is:@@@@1@15@@danf@17-8-2009
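10401170@unknown@formal@none@1@S@:<math>F = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}</math>@@@@1@11@@danf@17-8-2009
10401180@unknown@formal@none@1@S@This is also known as the <math>F_1</math> measure, because recall and precision are evenly weighted.@@@@1@15@@danf@17-8-2009
10401190@unknown@formal@none@1@S@The general formula for non-negative real <math>\beta</math> is:@@@@1@8@@danf@17-8-2009
10401200@unknown@formal@none@1@S@:<math>F_\beta = \frac{(1 + \beta^2) \cdot \mathrm{precision} \cdot \mathrm{recall}}{\beta^2 \cdot \mathrm{precision} + \mathrm{recall}}</math>@@@@1@15@@danf@17-8-2009
10401210@unknown@formal@none@1@S@Two other commonly used F measures are the <math>F_2</math> measure, which weights recall twice as much as precision, and the <math>F_{0.5}</math> measure, which weights precision twice as much as recall.@@@@1@30@@danf@17-8-2009
10401220@unknown@formal@none@1@S@The F-measure was derived by van Rijsbergen (1979) so that <math>F_\beta</math> "measures the effectiveness of retrieval with respect to a user who attaches <math>\beta</math> times as much importance to recall as precision".@@@@1@32@@danf@17-8-2009
10401230@unknown@formal@none@1@S@It is based on van Rijsbergen's effectiveness measure <math>E = 1 - \frac{1}{\frac{\alpha}{P} + \frac{1-\alpha}{R}}</math>, where ''P'' denotes precision and ''R'' recall.@@@@1@13@@danf@17-8-2009
10401240@unknown@formal@none@1@S@Their relationship is <math>F_\beta = 1 - E</math> where <math>\alpha = \frac{1}{1 + \beta^2}</math>.@@@@1@10@@danf@17-8-2009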
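As a numerical check on the weighted harmonic mean and its relation to van Rijsbergen's effectiveness measure, the following Python sketch computes both. The function names are hypothetical, and returning 0.0 when precision and recall are both zero is a convention assumed here, not one stated in the article.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall; beta > 1 favours recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # assumed convention for the degenerate case
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


def effectiveness(precision: float, recall: float, beta: float = 1.0) -> float:
    """van Rijsbergen's E with alpha = 1 / (1 + beta^2); needs non-zero inputs."""
    alpha = 1 / (1 + beta * beta)
    return 1 - 1 / (alpha / precision + (1 - alpha) / recall)
```

With precision 0.4 and recall 0.5, `f_measure(0.4, 0.5, beta=2.0)` and `1 - effectiveness(0.4, 0.5, beta=2.0)` both evaluate to 1/2.1, approximately 0.476.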
10401250@unknown@formal@none@1@S@=== Average precision ===@@@@1@7@@danf@17-8-2009
10401260@unknown@formal@none@1@S@The precision and recall are based on the whole list of documents returned by the system.@@@@1@16@@danf@17-8-2009
10401270@unknown@formal@none@1@S@Average precision emphasizes returning more relevant documents earlier.@@@@1@8@@danf@17-8-2009
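10401280@unknown@formal@none@1@S@It is the average of the precisions computed after truncating the list after each of the relevant documents in turn:@@@@1@18@@danf@17-8-2009
10401290@unknown@formal@none@1@S@:<math>\operatorname{AveP} = \frac{\sum_{r=1}^N (P(r) \times \mathrm{rel}(r))}{\mbox{number of relevant documents}}</math>@@@@1@11@@danf@17-8-2009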
10401300@unknown@formal@none@1@S@where ''r'' is the rank, ''N'' the number retrieved, ''rel()'' a binary function on the relevance of a given rank, and ''P()'' precision at a given cut-off rank.@@@@1@28@@danf@17-8-2009
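Average precision as described here can be sketched in Python by walking down the ranked list and accumulating the precision at each rank where a relevant document appears. The input representation (a list of booleans, one per rank) and the function name are assumptions made for illustration.

```python
def average_precision(ranked_relevance: list[bool], num_relevant: int) -> float:
    """Average of P(r) over the ranks r at which a relevant document appears.

    ranked_relevance[i] is rel(i + 1); num_relevant is the number of relevant
    documents for the query, whether or not they were retrieved.
    """
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # P(rank) * rel(rank), with rel(rank) = 1
    return total / num_relevant if num_relevant else 0.0
```

For a result list whose first and third entries are relevant, with two relevant documents in total, the value is (1/1 + 2/3) / 2 = 5/6. Moving the second relevant document further down the list lowers the result, which is the sense in which the measure rewards returning relevant documents earlier.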
10401310@unknown@formal@none@1@S@== Model types ==@@@@1@4@@danf@17-8-2009
10401320@unknown@formal@none@1@S@[[Image:Information-Retrieval-Models.png|thumb|500px|categorization of IR-models (translated from [http://de.wikipedia.org/wiki/Informationsrückgewinnung#Klassifikation_von_Modellen_zur_Repr.C3.A4sentation_nat.C3.BCrlichsprachlicher_Dokumente German entry], original source [http://www.logos-verlag.de/cgi-bin/engbuchmid?isbn=0514&lng=eng&id= Dominik Kuropka])]]@@@@1@13@@danf@17-8-2009
10401325@unknown@formal@none@1@S@For the information retrieval to be efficient, the documents are typically transformed into a suitable representation.@@@@1@16@@danf@17-8-2009
10401330@unknown@formal@none@1@S@There are several representations.@@@@1@4@@danf@17-8-2009
10401340@unknown@formal@none@1@S@The picture on the right illustrates the relationship of some common models.@@@@1@12@@danf@17-8-2009
10401350@unknown@formal@none@1@S@In the picture, the models are categorized according to two dimensions: the mathematical basis and the properties of the model.@@@@1@20@@danf@17-8-2009
10401360@unknown@formal@none@1@S@=== First dimension: mathematical basis ===@@@@1@6@@danf@17-8-2009
10401370@unknown@formal@none@1@S@* ''Set-theoretic models'' represent documents as sets of words or phrases.@@@@1@11@@danf@17-8-2009
10401380@unknown@formal@none@1@S@Similarities are usually derived from set-theoretic operations on those sets.@@@@1@10@@danf@17-8-2009
10401390@unknown@formal@none@1@S@Common models are:@@@@1@3@@danf@17-8-2009
10401400@unknown@formal@none@1@S@** [[Standard Boolean model]]@@@@1@4@@danf@17-8-2009
10401410@unknown@formal@none@1@S@** [[Extended Boolean model]]@@@@1@4@@danf@17-8-2009
10401420@unknown@formal@none@1@S@** [[Fuzzy retrieval]]@@@@1@3@@danf@17-8-2009
10401430@unknown@formal@none@1@S@* ''Algebraic models'' represent documents and queries usually as vectors, matrices or tuples.@@@@1@13@@danf@17-8-2009
10401440@unknown@formal@none@1@S@The similarity of the query vector and document vector is represented as a scalar value.@@@@1@15@@danf@17-8-2009
10401450@unknown@formal@none@1@S@** [[Vector space model]]@@@@1@4@@danf@17-8-2009
10401460@unknown@formal@none@1@S@** [[Generalized vector space model]]@@@@1@5@@danf@17-8-2009
10401470@unknown@formal@none@1@S@** Topic-based vector space model (literature: [http://www.kuropka.net/files/TVSM.pdf], [http://www.logos-verlag.de/cgi-bin/engbuchmid?isbn=0514&lng=eng&id=])@@@@1@8@@danf@17-8-2009
10401480@unknown@formal@none@1@S@** [[Extended Boolean model]]@@@@1@4@@danf@17-8-2009
10401490@unknown@formal@none@1@S@** Enhanced topic-based vector space model (literature: [http://kuropka.net/files/HPI_Evaluation_of_eTVSM.pdf], [http://www.logos-verlag.de/cgi-bin/engbuchmid?isbn=0514&lng=eng&id=])@@@@1@9@@danf@17-8-2009
10401500@unknown@formal@none@1@S@** Latent semantic indexing aka [[latent semantic analysis]]@@@@1@8@@danf@17-8-2009
10401510@unknown@formal@none@1@S@* ''Probabilistic models'' treat the process of document retrieval as a probabilistic inference.@@@@1@13@@danf@17-8-2009
10401520@unknown@formal@none@1@S@Similarities are computed as probabilities that a document is relevant for a given query.@@@@1@14@@danf@17-8-2009
10401530@unknown@formal@none@1@S@Probabilistic theorems such as [[Bayes' theorem]] are often used in these models.@@@@1@12@@danf@17-8-2009
10401540@unknown@formal@none@1@S@** [[Binary independence retrieval]]@@@@1@4@@danf@17-8-2009
10401550@unknown@formal@none@1@S@** [[Probabilistic relevance model (BM25)]]@@@@1@5@@danf@17-8-2009
10401560@unknown@formal@none@1@S@** Uncertain inference@@@@1@3@@danf@17-8-2009
10401570@unknown@formal@none@1@S@** [[Language model]]s@@@@1@3@@danf@17-8-2009
10401580@unknown@formal@none@1@S@** [[Divergence-from-randomness model]]@@@@1@3@@danf@17-8-2009
10401590@unknown@formal@none@1@S@** [[Latent Dirichlet allocation]]@@@@1@4@@danf@17-8-2009
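To make the difference between the first two families concrete, the following Python sketch computes a set-theoretic similarity (the Jaccard coefficient on word sets) and an algebraic one (the cosine of the angle between raw term-count vectors). Both are simple stand-ins chosen for illustration; they are not the specific models linked in the list above, and whitespace tokenization is an assumption.

```python
import math
from collections import Counter


def jaccard(query: str, document: str) -> float:
    """Set-theoretic view: overlap of the two word sets."""
    q, d = set(query.lower().split()), set(document.lower().split())
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def cosine(query: str, document: str) -> float:
    """Algebraic view: cosine between term-count vectors, a single scalar."""
    q, d = Counter(query.lower().split()), Counter(document.lower().split())
    dot = sum(q[term] * d[term] for term in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0
```

The Jaccard coefficient ignores how often a term occurs, whereas the cosine measure is sensitive to term counts; practical vector space systems usually replace raw counts with weights.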
10401600@unknown@formal@none@1@S@=== Second dimension: properties of the model ===@@@@1@8@@danf@17-8-2009
10401610@unknown@formal@none@1@S@* ''Models without term-interdependencies'' treat different terms/words as independent.@@@@1@9@@danf@17-8-2009
10401620@unknown@formal@none@1@S@This fact is usually represented in vector space models by the [[orthogonality]] assumption of term vectors or in probabilistic models by an [[independency]] assumption for term variables.@@@@1@27@@danf@17-8-2009
10401630@unknown@formal@none@1@S@* ''Models with immanent term interdependencies'' allow a representation of interdependencies between terms.@@@@1@13@@danf@17-8-2009
10401640@unknown@formal@none@1@S@However the degree of the interdependency between two terms is defined by the model itself.@@@@1@15@@danf@17-8-2009
10401650@unknown@formal@none@1@S@It is usually directly or indirectly derived (e.g. by [[dimension reduction|dimensional reduction]]) from the [[co-occurrence]] of those terms in the whole set of documents.@@@@1@24@@danf@17-8-2009
10401660@unknown@formal@none@1@S@* ''Models with transcendent term interdependencies'' allow a representation of interdependencies between terms, but they do not specify how the interdependency between two terms is defined.@@@@1@26@@danf@17-8-2009
10401670@unknown@formal@none@1@S@They rely on an external source for the degree of interdependency between two terms.@@@@1@13@@danf@17-8-2009
10401680@unknown@formal@none@1@S@(For example, a human or sophisticated algorithms.)@@@@1@7@@danf@17-8-2009
10401690@unknown@formal@none@1@S@== Major figures ==@@@@1@4@@danf@17-8-2009
10401700@unknown@formal@none@1@S@* [[Gerard Salton]]@@@@1@3@@danf@17-8-2009
10401710@unknown@formal@none@1@S@* [[Hans Peter Luhn]]@@@@1@4@@danf@17-8-2009
10401720@unknown@formal@none@1@S@* [http://ciir.cs.umass.edu/personnel/croft.html W. Bruce Croft]@@@@1@5@@danf@17-8-2009
10401730@unknown@formal@none@1@S@* [[Karen Spärck Jones]]@@@@1@4@@danf@17-8-2009
10401740@unknown@formal@none@1@S@* [[C. J. van Rijsbergen]]@@@@1@5@@danf@17-8-2009
10401750@unknown@formal@none@1@S@* [http://www.soi.city.ac.uk/~ser/homepage.html Stephen E. Robertson]@@@@1@5@@danf@17-8-2009
10401760@unknown@formal@none@1@S@== Awards in the field ==@@@@1@6@@danf@17-8-2009
10401770@unknown@formal@none@1@S@* [[Tony Kent Strix award]]@@@@1@5@@danf@17-8-2009
10401780@unknown@formal@none@1@S@* [[Gerard Salton Award]]@@@@1@4@@danf@17-8-2009
10410010@unknown@formal@none@1@S@Information theory@@@@1@2@@danf@17-8-2009
10410020@unknown@formal@none@1@S@'''Information theory''' is a branch of [[applied mathematics]] and [[electrical engineering]] involving the quantification of [[information]].@@@@1@16@@danf@17-8-2009
10410030@unknown@formal@none@1@S@Historically, information theory was developed to find fundamental limits on compressing and reliably [[communication|communicating]] data.@@@@1@15@@danf@17-8-2009
10410040@unknown@formal@none@1@S@Since its inception it has broadened to find applications in many other areas, including [[statistical inference]], [[natural language processing]], [[cryptography]] generally, [[networks]] other than communication networks (as in [[neurobiology]]), the evolution and function of molecular codes, model selection in ecology, thermal physics, [[quantum computing]], plagiarism detection and other forms of [[data analysis]].@@@@1@53@@danf@17-8-2009
10410050@unknown@formal@none@1@S@A key measure of information in the theory is known as [[information entropy]], which is usually expressed by the average number of bits needed for storage or communication.@@@@1@28@@danf@17-8-2009
10410060@unknown@formal@none@1@S@Intuitively, entropy quantifies the uncertainty involved when encountering a [[random variable]].@@@@1@11@@danf@17-8-2009
10410070@unknown@formal@none@1@S@For example, a fair coin flip (2 equally likely outcomes) will have less entropy than a roll of a die (6 equally likely outcomes).@@@@1@24@@danf@17-8-2009
10410080@unknown@formal@none@1@S@Applications of fundamental topics of information theory include [[lossless data compression]] (e.g. [[ZIP (file format)|ZIP files]]), [[lossy data compression]] (e.g. [[MP3]]s), and [[channel capacity|channel coding]] (e.g. for [[DSL]] lines).@@@@1@29@@danf@17-8-2009
10410110@unknown@formal@none@1@S@The field is at the intersection of [[mathematics]], [[statistics]], [[computer science]], [[physics]], [[neurobiology]], and [[electrical engineering]].@@@@1@16@@danf@17-8-2009
10410120@unknown@formal@none@1@S@Its impact has been crucial to the success of the [[Voyager program|Voyager]] missions to deep space, the invention of the CD, the feasibility of mobile phones, the development of the [[Internet]], the study of [[linguistics]] and of human perception, the understanding of [[black hole]]s, and numerous other fields.@@@@1@48@@danf@17-8-2009
10410130@unknown@formal@none@1@S@Important sub-fields of information theory are source coding, channel coding, algorithmic complexity theory, algorithmic information theory, and measures of information.@@@@1@20@@danf@17-8-2009
10410140@unknown@formal@none@1@S@==Overview==@@@@1@1@@danf@17-8-2009
10410150@unknown@formal@none@1@S@The main concepts of information theory can be grasped by considering the most widespread means of human communication: language.@@@@1@19@@danf@17-8-2009
10410160@unknown@formal@none@1@S@Two important aspects of a good language are as follows: First, the most common words (e.g., "a", "the", "I") should be shorter than less common words (e.g., "benefit", "generation", "mediocre"), so that sentences will not be too long.@@@@1@38@@danf@17-8-2009
10410170@unknown@formal@none@1@S@Such a tradeoff in word length is analogous to [[data compression]] and is the essential aspect of [[source coding]].@@@@1@19@@danf@17-8-2009
10410180@unknown@formal@none@1@S@Second, if part of a sentence is unheard or misheard due to noise (e.g., a passing car), the listener should still be able to glean the meaning of the underlying message.@@@@1@33@@danf@17-8-2009
10410190@unknown@formal@none@1@S@Such robustness is as essential for an electronic communication system as it is for a language; properly building such robustness into communications is done by [[Channel capacity|channel coding]].@@@@1@28@@danf@17-8-2009
10410200@unknown@formal@none@1@S@Source coding and channel coding are the fundamental concerns of information theory.@@@@1@12@@danf@17-8-2009
10410210@unknown@formal@none@1@S@Note that these concerns have nothing to do with the ''importance'' of messages.@@@@1@13@@danf@17-8-2009
10410220@unknown@formal@none@1@S@For example, a platitude such as "Thank you; come again" takes about as long to say or write as the urgent plea, "Call an ambulance!" while clearly the latter is more important and more meaningful.@@@@1@35@@danf@17-8-2009
10410230@unknown@formal@none@1@S@Information theory, however, does not consider message importance or meaning, as these are matters of the quality of data rather than the quantity and readability of data, the latter of which is determined solely by probabilities.@@@@1@36@@danf@17-8-2009
10410240@unknown@formal@none@1@S@Information theory is generally considered to have been founded in 1948 by [[Claude Elwood Shannon|Claude Shannon]] in his seminal work, "[[A Mathematical Theory of Communication]]."@@@@1@25@@danf@17-8-2009
10410250@unknown@formal@none@1@S@The central paradigm of classical information theory is the engineering problem of the transmission of information over a noisy channel.@@@@1@20@@danf@17-8-2009
10410260@unknown@formal@none@1@S@The most fundamental results of this theory are Shannon's [[source coding theorem]], which establishes that, on average, the number of ''bits'' needed to represent the result of an uncertain event is given by its [[information entropy|entropy]]; and Shannon's [[noisy-channel coding theorem]], which states that ''reliable'' communication is possible over ''noisy'' channels provided that the rate of communication is below a certain threshold called the channel capacity.@@@@1@66@@danf@17-8-2009
10410270@unknown@formal@none@1@S@The channel capacity can be approached in practice by using appropriate encoding and decoding systems.@@@@1@15@@danf@17-8-2009
10410280@unknown@formal@none@1@S@Information theory is closely associated with a collection of pure and applied disciplines that have been investigated and reduced to engineering practice under a variety of rubrics throughout the world over the past half century or more: [[adaptive system]]s, [[anticipatory system]]s, [[artificial intelligence]], [[complex system]]s, [[complexity science]], [[cybernetics]], [[informatics]], [[machine learning]], along with [[systems science]]s of many descriptions.@@@@1@58@@danf@17-8-2009
10410290@unknown@formal@none@1@S@Information theory is a broad and deep mathematical theory, with equally broad and deep applications, amongst which is the vital field of [[coding theory]].@@@@1@24@@danf@17-8-2009
10410300@unknown@formal@none@1@S@Coding theory is concerned with finding explicit methods, called ''codes'', of increasing the efficiency and reducing the net error rate of data communication over a noisy channel to near the limit that Shannon proved is the maximum possible for that channel.@@@@1@41@@danf@17-8-2009
10410310@unknown@formal@none@1@S@These codes can be roughly subdivided into [[data compression]] (source coding) and [[error-correction]] (channel coding) techniques.@@@@1@16@@danf@17-8-2009
10410320@unknown@formal@none@1@S@In the latter case, it took many years to find the methods Shannon's work proved were possible.@@@@1@17@@danf@17-8-2009
10410330@unknown@formal@none@1@S@A third class of information theory codes are cryptographic algorithms (both [[code (cryptography)|code]]s and [[cipher]]s).@@@@1@15@@danf@17-8-2009
10410340@unknown@formal@none@1@S@Concepts, methods and results from coding theory and information theory are widely used in [[cryptography]] and [[cryptanalysis]].@@@@1@17@@danf@17-8-2009
10410350@unknown@formal@none@1@S@''See the article [[ban (information)]] for a historical application.''@@@@1@9@@danf@17-8-2009
10410360@unknown@formal@none@1@S@Information theory is also used in [[information retrieval]], [[intelligence (information gathering)|intelligence gathering]], [[gambling]], [[statistics]], and even in [[musical composition]].@@@@1@19@@danf@17-8-2009
10410370@unknown@formal@none@1@S@==Historical background==@@@@1@2@@danf@17-8-2009
10410380@unknown@formal@none@1@S@The landmark event that established the discipline of information theory, and brought it to immediate worldwide attention, was the publication of [[Claude E. Shannon]]'s classic paper "[[A Mathematical Theory of Communication]]" in the ''[[Bell System Technical Journal]]'' in July and October of 1948.@@@@1@43@@danf@17-8-2009
10410390@unknown@formal@none@1@S@Prior to this paper, limited information theoretic ideas had been developed at Bell Labs, all implicitly assuming events of equal probability.@@@@1@21@@danf@17-8-2009
10410400@unknown@formal@none@1@S@[[Harry Nyquist]]'s 1924 paper, ''Certain Factors Affecting Telegraph Speed,'' contains a theoretical section quantifying "intelligence" and the "line speed" at which it can be transmitted by a communication system, giving the relation <math>W = K \log m</math>, where ''W'' is the speed of transmission of intelligence, ''m'' is the number of different voltage levels to choose from at each time step, and ''K'' is a constant.@@@@1@66@@danf@17-8-2009
10410410@unknown@formal@none@1@S@[[Ralph Hartley]]'s 1928 paper, ''Transmission of Information,'' uses the word ''information'' as a measurable quantity, reflecting the receiver's ability to distinguish that one sequence of symbols from any other, thus quantifying information as <math>H = \log S^n = n \log S</math>, where ''S'' was the number of possible symbols, and ''n'' the number of symbols in a transmission.@@@@1@58@@danf@17-8-2009
10410420@unknown@formal@none@1@S@The natural unit of information was therefore the decimal digit, much later renamed the [[ban (information)|hartley]] in his honour as a unit or scale or measure of information.@@@@1@28@@danf@17-8-2009
10410430@unknown@formal@none@1@S@[[Alan Turing]] in 1940 used similar ideas as part of the statistical analysis of the breaking of the German Second World War [[Cryptanalysis of the Enigma|Enigma]] ciphers.@@@@1@27@@danf@17-8-2009
10410440@unknown@formal@none@1@S@Much of the mathematics behind information theory with events of different probabilities was developed for the field of [[thermodynamics]] by [[Ludwig Boltzmann]] and [[J. Willard Gibbs]].@@@@1@26@@danf@17-8-2009
10410450@unknown@formal@none@1@S@Connections between information-theoretic entropy and thermodynamic entropy, including the important contributions by [[Rolf Landauer]] in the 1960s, are explored in ''[[Entropy in thermodynamics and information theory]]''.@@@@1@26@@danf@17-8-2009
10410460@unknown@formal@none@1@S@In Shannon's revolutionary and groundbreaking paper, the work for which had been substantially completed at Bell Labs by the end of 1944, Shannon for the first time introduced the qualitative and quantitative model of communication as a statistical process underlying information theory, opening with the assertion that@@@@1@47@@danf@17-8-2009
10410470@unknown@formal@none@1@S@:"The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."@@@@1@22@@danf@17-8-2009
10410480@unknown@formal@none@1@S@With it came the ideas of@@@@1@6@@danf@17-8-2009
10410490@unknown@formal@none@1@S@* the [[information entropy]] and [[redundancy (information theory)|redundancy]] of a source, and its relevance through the [[source coding theorem]];@@@@1@19@@danf@17-8-2009
10410500@unknown@formal@none@1@S@* the [[mutual information]], and the [[channel capacity]] of a noisy channel, including the promise of perfect loss-free communication given by the [[noisy-channel coding theorem]];@@@@1@25@@danf@17-8-2009
10410510@unknown@formal@none@1@S@* the practical result of the [[Shannon–Hartley law]] for the channel capacity of a Gaussian channel; and of course@@@@1@19@@danf@17-8-2009
10410520@unknown@formal@none@1@S@* the [[bit]]—a new way of seeing the most fundamental unit of information@@@@1@13@@danf@17-8-2009
10410530@unknown@formal@none@1@S@==Ways of measuring information==@@@@1@4@@danf@17-8-2009
10410540@unknown@formal@none@1@S@Information theory is based on [[probability theory]] and [[statistics]].@@@@1@9@@danf@17-8-2009
10410550@unknown@formal@none@1@S@The most important quantities of information are [[Information entropy|entropy]], the information in a [[random variable]], and [[mutual information]], the amount of information in common between two random variables.@@@@1@28@@danf@17-8-2009
10410560@unknown@formal@none@1@S@The former quantity indicates how easily message data can be [[data compression|compressed]] while the latter can be used to find the communication rate across a [[Channel (communications)|channel]].@@@@1@27@@danf@17-8-2009
10410570@unknown@formal@none@1@S@The choice of logarithmic base in the following formulae determines the [[units of measurement|unit]] of [[information entropy]] that is used.@@@@1@20@@danf@17-8-2009
10410580@unknown@formal@none@1@S@The most common unit of information is the [[bit]], based on the [[binary logarithm]].@@@@1@14@@danf@17-8-2009
10410590@unknown@formal@none@1@S@Other units include the [[nat (information)|nat]], which is based on the [[natural logarithm]], and the [[deciban|hartley]], which is based on the [[common logarithm]].@@@@1@23@@danf@17-8-2009
10410600@unknown@formal@none@1@S@In what follows, an expression of the form <math>p \log p</math> is considered by convention to be equal to zero whenever <math>p = 0</math>.@@@@1@23@@danf@17-8-2009
10410605@unknown@formal@none@1@S@This is justified because <math>\lim_{p \rightarrow 0+} p \log p = 0</math> for any logarithmic base.@@@@1@16@@danf@17-8-2009
10410610@unknown@formal@none@1@S@===Entropy===@@@@1@1@@danf@17-8-2009
10410620@unknown@formal@none@1@S@The '''[[information entropy|entropy]]''', <math>H</math>, of a discrete random variable <math>X</math> is a measure of the amount of ''uncertainty'' associated with the value of <math>X</math>.@@@@1@24@@danf@17-8-2009
10410630@unknown@formal@none@1@S@Suppose one transmits 1000 bits (0s and 1s).@@@@1@8@@danf@17-8-2009
10410640@unknown@formal@none@1@S@If these bits are known ahead of transmission (to be a certain value with absolute probability), logic dictates that no information has been transmitted.@@@@1@24@@danf@17-8-2009
10410650@unknown@formal@none@1@S@If, however, each is equally and independently likely to be 0 or 1, 1000 bits (in the information theoretic sense) have been transmitted.@@@@1@23@@danf@17-8-2009
10410660@unknown@formal@none@1@S@Between these two extremes, information can be quantified as follows.@@@@1@10@@danf@17-8-2009
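10410670@unknown@formal@none@1@S@If <math>\mathbb{X}</math> is the set of all messages <math>\{x_1, \ldots, x_n\}</math> that <math>X</math> could be, and <math>p(x)</math> is the probability that <math>X</math> takes the value <math>x \in \mathbb{X}</math>, then the entropy of <math>X</math> is defined:@@@@1@29@@danf@17-8-2009
10410680@unknown@formal@none@1@S@:<math>H(X) = \mathbb{E}_{X} [I(x)] = -\sum_{x \in \mathbb{X}} p(x) \log p(x).</math>@@@@1@12@@danf@17-8-2009
10410690@unknown@formal@none@1@S@(Here, <math>I(x)</math> is the [[self-information]], which is the entropy contribution of an individual message.)@@@@1@14@@danf@17-8-2009
10410700@unknown@formal@none@1@S@An important property of entropy is that it is maximized when all the messages in the message space are equiprobable, <math>p(x) = 1/n</math> (i.e., most unpredictable), in which case <math>H(X) = \log n</math>.@@@@1@28@@danf@17-8-2009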
10410710@unknown@formal@none@1@S@The special case of information entropy for a random variable with two outcomes is the '''[[binary entropy function]]''':@@@@1@18@@danf@17-8-2009
10410720@unknown@formal@none@1@S@:<math>H_\mathrm{b}(p) = - p \log p - (1-p)\log (1-p).</math>@@@@1@9@@danf@17-8-2009
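The definition of entropy and the binary special case can be checked numerically with a short Python sketch. The function names are hypothetical; base 2 is assumed by default so that results are in bits, and terms with zero probability are skipped in line with the convention that such terms contribute zero.

```python
import math


def entropy(probabilities, base: float = 2.0) -> float:
    """Shannon entropy of a discrete distribution given as probabilities."""
    # Zero-probability outcomes are skipped: p log p is taken to be 0 at p = 0.
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)


def binary_entropy(p: float, base: float = 2.0) -> float:
    """Entropy of a two-outcome variable with probabilities p and 1 - p."""
    return entropy([p, 1 - p], base)
```

A fair coin, `entropy([0.5, 0.5])`, gives 1 bit, while a fair six-sided die, `entropy([1/6] * 6)`, gives log2(6), about 2.585 bits. A sequence of bits known in advance corresponds to `binary_entropy(1.0)`, which is zero.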
10410730@unknown@formal@none@1@S@===Joint entropy===@@@@1@2@@danf@17-8-2009
10410740@unknown@formal@none@1@S@The '''[[joint entropy]]''' of two discrete random variables <math>X</math> and <math>Y</math> is merely the entropy of their pairing: <math>(X, Y)</math>.@@@@1@20@@danf@17-8-2009
10410750@unknown@formal@none@1@S@This implies that if <math>X</math> and <math>Y</math> are [[statistical independence|independent]], then their joint entropy is the sum of their individual entropies.@@@@1@21@@danf@17-8-2009
10410760@unknown@formal@none@1@S@For example, if <math>(X, Y)</math> represents the position of a [[chess]] piece, with <math>X</math> the row and <math>Y</math> the column, then the joint entropy of the row of the piece and the column of the piece will be the entropy of the position of the piece.@@@@1@45@@danf@17-8-2009
10410770@unknown@formal@none@1@S@:<math>H(X, Y) = \mathbb{E}_{X,Y} [-\log p(x,y)] = - \sum_{x, y} p(x, y) \log p(x, y)</math>@@@@1@16@@danf@17-8-2009
10410780@unknown@formal@none@1@S@Despite similar notation, joint entropy should not be confused with '''[[cross entropy]]'''.@@@@1@12@@danf@17-8-2009
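The chess example can be worked through in Python. The sketch below assumes, purely for illustration, that the piece is equally likely to be on any of the 64 squares, so that row and column are independent and uniform over eight values each; the variable names are hypothetical.

```python
import math
from itertools import product


def entropy(probabilities) -> float:
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Assumed distributions: row and column each uniform over 8 values.
rows = {r: 1 / 8 for r in range(8)}
cols = {c: 1 / 8 for c in range(8)}

# Independence means the joint probability is the product of the marginals.
joint = {(r, c): rows[r] * cols[c] for r, c in product(rows, cols)}

h_row = entropy(rows.values())
h_col = entropy(cols.values())
h_joint = entropy(joint.values())
```

Under these assumptions `h_row` and `h_col` are 3 bits each and `h_joint` is 6 bits, the sum of the two. If row and column were dependent, the joint entropy would be smaller than that sum.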
10410790@unknown@formal@none@1@S@===Conditional entropy (equivocation)===@@@@1@3@@danf@17-8-2009
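10410800@unknown@formal@none@1@S@The '''[[conditional entropy]]''' or '''conditional uncertainty''' of <math>X</math> given random variable <math>Y</math> (also called the '''equivocation''' of <math>X</math> about <math>Y</math>) is the average conditional entropy over <math>Y</math>:@@@@1@27@@danf@17-8-2009
10410810@unknown@formal@none@1@S@:<math>H(X|Y) = \mathbb{E}_Y [H(X|y)] = -\sum_{y \in Y} p(y) \sum_{x \in X} p(x|y) \log p(x|y) = -\sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(y)}.</math>@@@@1@22@@danf@17-8-2009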
10410820@unknown@formal@none@1@S@Because entropy can be conditioned on a random variable or on that random variable being a certain value, care should be taken not to confuse these two definitions of conditional entropy, the former of which is in more common use.@@@@1@40@@danf@17-8-2009
10410830@unknown@formal@none@1@S@A basic property of this form of conditional entropy is that:@@@@1@11@@danf@17-8-2009
10410840@unknown@formal@none@1@S@: <math>H(X|Y) = H(X,Y) - H(Y).</math>@@@@1@8@@danf@17-8-2009
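The basic property that conditional entropy equals joint entropy minus the entropy of the conditioning variable can be verified numerically. In the Python sketch below, a joint distribution is represented as a dictionary mapping `(x, y)` pairs to probabilities; that representation and the function names are assumptions made for illustration.

```python
import math
from collections import defaultdict


def entropy(probabilities) -> float:
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def marginal_y(joint: dict) -> dict:
    """p(y), obtained by summing p(x, y) over x."""
    p_y = defaultdict(float)
    for (_, y), p in joint.items():
        p_y[y] += p
    return dict(p_y)


def conditional_entropy(joint: dict) -> float:
    """H(X|Y) = -sum over (x, y) of p(x, y) * log2(p(x, y) / p(y))."""
    p_y = marginal_y(joint)
    return -sum(p * math.log2(p / p_y[y]) for (_, y), p in joint.items() if p > 0)
```

For the joint distribution `{(0, 0): 0.5, (0, 1): 0.25, (1, 1): 0.25}`, the joint entropy is 1.5 bits and the entropy of ''Y'' is 1 bit, so the conditional entropy computed either way is 0.5 bits.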
10410850@unknown@formal@none@1@S@===Mutual information (transinformation)===@@@@1@3@@danf@17-8-2009
10410860@unknown@formal@none@1@S@'''[[Mutual information]]''' measures the amount of information that can be obtained about one random variable by observing another.@@@@1@18@@danf@17-8-2009
10410870@unknown@formal@none@1@S@It is important in communication where it can be used to maximize the amount of information shared between sent and received signals.@@@@1@22@@danf@17-8-2009
10410880@unknown@formal@none@1@S@The mutual information of <math>X</math> relative to <math>Y</math> is given by:@@@@1@11@@danf@17-8-2009
10410890@unknown@formal@none@1@S@:<math>I(X;Y) = \mathbb{E}_{X,Y} [SI(x,y)] = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\, p(y)}</math>@@@@1@10@@danf@17-8-2009
10410900@unknown@formal@none@1@S@where <math>SI</math> (''S''pecific mutual ''I''nformation) is the [[pointwise mutual information]].@@@@1@10@@danf@17-8-2009
10410910@unknown@formal@none@1@S@A basic property of the mutual information is that@@@@1@9@@danf@17-8-2009
10410920@unknown@formal@none@1@S@: <math>I(X;Y) = H(X) - H(X|Y).</math>@@@@1@6@@danf@17-8-2009
10410930@unknown@formal@none@1@S@That is, knowing ''Y'', we can save an average of <math>I(X; Y)</math> bits in encoding ''X'' compared to not knowing ''Y''.@@@@1@21@@danf@17-8-2009
10410940@unknown@formal@none@1@S@Mutual information is [[symmetric function|symmetric]]:@@@@1@5@@danf@17-8-2009
10410950@unknown@formal@none@1@S@: <math>I(X;Y) = I(Y;X) = H(X) + H(Y) - H(X,Y).</math>@@@@1@10@@danf@17-8-2009
10410960@unknown@formal@none@1@S@Mutual information can be expressed as the average [[Kullback–Leibler divergence]] (information gain) of the [[posterior probability|posterior probability distribution]] of ''X'' given the value of ''Y'' to the [[prior probability|prior distribution]] on ''X'':@@@@1@32@@danf@17-8-2009
10410970@unknown@formal@none@1@S@:<math>I(X;Y) = \mathbb{E}_{p(y)}[D_{\mathrm{KL}}(p(X|Y=y) \| p(X))].</math>@@@@1@10@@danf@17-8-2009
10410980@unknown@formal@none@1@S@In other words, this is a measure of how much, on the average, the probability distribution on ''X'' will change if we are given the value of ''Y''.@@@@1@28@@danf@17-8-2009
10410990@unknown@formal@none@1@S@This is often recalculated as the divergence from the product of the marginal distributions to the actual joint distribution:@@@@1@19@@danf@17-8-2009
10411000@unknown@formal@none@1@S@:<math>I(X;Y) = D_{\mathrm{KL}}(p(X,Y) \| p(X)p(Y)).</math>@@@@1@7@@danf@17-8-2009
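The properties described above can be checked numerically. The following is a minimal sketch, assuming a small hypothetical joint distribution `joint`: it computes the mutual information as the divergence of the joint distribution from the product of its marginals, then compares it with ''H''(''X'') + ''H''(''Y'') − ''H''(''X'',''Y'') and with the value obtained after swapping the roles of ''X'' and ''Y''. All names are illustrative.

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def marginals(joint):
    """Return the marginal distributions (p_x, p_y) of a joint table."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return p_x, p_y

def mutual_information(joint):
    """I(X;Y) in bits: divergence of p(x,y) from p(x)p(y)."""
    p_x, p_y = marginals(joint)
    return sum(
        p * math.log2(p / (p_x[x] * p_y[y]))
        for (x, y), p in joint.items()
        if p > 0
    )

# Hypothetical joint distribution (an assumption for illustration).
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 1): 0.25}
swapped = {(y, x): p for (x, y), p in joint.items()}

p_x, p_y = marginals(joint)
mi = mutual_information(joint)
mi_from_entropies = (
    entropy(p_x.values()) + entropy(p_y.values()) - entropy(joint.values())
)

# A product distribution with the same marginals: X and Y independent.
independent = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}
```

In this sketch `mutual_information(swapped)` matches `mi`, illustrating symmetry, and the mutual information of `independent` is zero, as expected for independent variables.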
10411010@unknown@formal@none@1@S@Mutual information is closely related to the [[likelihood-ratio test|log-likelihood ratio test]] in the context of contingency tables and the [[multinomial distribution]] and to [[Pearson's chi-square test|Pearson's χ2 test]]: mutual information can be considered a statistic for assessing independence between a pair of variables, and has a well-specified asymptotic distribution.@@@@1@49@@danf@17-8-2009
10411020@unknown@formal@none@1@S@===Kullback–Leibler divergence (information gain)===@@@@1@4@@danf@17-8-2009
10411030@unknown@formal@none@1@S@The '''[[Kullback–Leibler divergence]]''' (or '''information divergence''', '''information gain''', or '''relative entropy''') is a way of comparing two distributions: a "true" [[probability distribution]] ''p(X)'', and an arbitrary probability distribution ''q(X)''.@@@@1@29@@danf@17-8-2009
10411040@unknown@formal@none@1@S@If we compress data in a manner that assumes ''q(X)'' is the distribution underlying some data, when, in reality, ''p(X)'' is the correct distribution, the Kullback–Leibler divergence is the average number of additional bits per datum necessary for compression.@@@@1@39@@danf@17-8-2009
10411050@unknown@formal@none@1@S@It is thus defined as@@@@1@4@@danf@17-8-2009
10411060@unknown@formal@none@1@S@:<math>D_{\mathrm{KL}}(p(X) \| q(X)) = \sum_{x \in X} p(x) \log \frac{p(x)}{q(x)}.</math>@@@@1@24@@danf@17-8-2009
10411070@unknown@formal@none@1@S@Although it is sometimes used as a 'distance metric', it is not a true [[Metric (mathematics)|metric]] since it is not symmetric and does not satisfy the [[triangle inequality]] (making it a semi-quasimetric).@@@@1@32@@danf@17-8-2009
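A short sketch may make the "additional bits" interpretation and the lack of symmetry concrete. The distributions `p` and `q` below are hypothetical, and the cross-entropy helper assumes that ''q'' is nonzero wherever ''p'' is nonzero.

```python
import math

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Average code length in bits when data from p is coded for q.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """D_KL(p || q) in bits; infinite if q assigns zero where p does not."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log2(pi / qi)
    return total

p = [0.5, 0.25, 0.25]   # "true" distribution (illustrative)
q = [0.8, 0.1, 0.1]     # assumed distribution (illustrative)

extra_bits = cross_entropy(p, q) - entropy(p)
forward = kl_divergence(p, q)
reverse = kl_divergence(q, p)
```

Here `extra_bits` equals `forward` up to rounding, while `forward` and `reverse` differ, which is why the divergence is not a true metric.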
10411080@unknown@formal@none@1@S@===Other quantities===@@@@1@2@@danf@17-8-2009
10411090@unknown@formal@none@1@S@Other important information-theoretic quantities include [[Rényi entropy]] (a generalization of entropy) and [[differential entropy]] (a generalization of quantities of information to continuous distributions).@@@@1@24@@danf@17-8-2009
10411100@unknown@formal@none@1@S@==Coding theory==@@@@1@2@@danf@17-8-2009
10411110@unknown@formal@none@1@S@[[Coding theory]] is one of the most important and direct applications of information theory.@@@@1@14@@danf@17-8-2009
10411120@unknown@formal@none@1@S@It can be subdivided into [[data compression|source coding]] theory and [[error correction|channel coding]] theory.@@@@1@14@@danf@17-8-2009
10411130@unknown@formal@none@1@S@Using a statistical description for data, information theory quantifies the number of bits needed to describe the data, which is the information entropy of the source.@@@@1@26@@danf@17-8-2009
10411140@unknown@formal@none@1@S@* Data compression (source coding): There are two formulations for the compression problem:@@@@1@13@@danf@17-8-2009
10411150@unknown@formal@none@1@S@#[[lossless data compression]]: the data must be reconstructed exactly;@@@@1@9@@danf@17-8-2009
10411160@unknown@formal@none@1@S@#[[lossy data compression]]: allocates bits needed to reconstruct the data, within a specified fidelity level measured by a distortion function.@@@@1@20@@danf@17-8-2009
10411170@unknown@formal@none@1@S@This subset of information theory is called [[rate–distortion theory]].@@@@1@9@@danf@17-8-2009
10411180@unknown@formal@none@1@S@* Error-correcting codes (channel coding): While data compression removes as much [[redundancy (information theory)|redundancy]] as possible, an error correcting code adds just the right kind of redundancy (i.e. [[error correction]]) needed to transmit the data efficiently and faithfully across a noisy channel.@@@@1@42@@danf@17-8-2009
10411190@unknown@formal@none@1@S@This division of coding theory into compression and transmission rests on the information transmission theorems, or source–channel separation theorems, which justify the use of bits as the universal currency for information in many contexts.@@@@1@35@@danf@17-8-2009
10411200@unknown@formal@none@1@S@However, these theorems only hold in the situation where one transmitting user wishes to communicate to one receiving user.@@@@1@19@@danf@17-8-2009
10411210@unknown@formal@none@1@S@In scenarios with more than one transmitter (the multiple-access channel), more than one receiver (the [[broadcast channel]]) or intermediary "helpers" (the [[relay channel]]), or more general [[computer network|networks]], compression followed by transmission may no longer be optimal.@@@@1@37@@danf@17-8-2009
10411220@unknown@formal@none@1@S@[[Network information theory]] refers to these multi-agent communication models.@@@@1@9@@danf@17-8-2009
10411230@unknown@formal@none@1@S@===Source theory===@@@@1@2@@danf@17-8-2009
10411240@unknown@formal@none@1@S@Any process that generates successive messages can be considered a '''[[Communication source|source]]''' of information.@@@@1@14@@danf@17-8-2009
10411250@unknown@formal@none@1@S@A memoryless source is one in which each message is an [[Independent identically-distributed random variables|independent identically-distributed random variable]], whereas the properties of [[ergodic theory|ergodicity]] and [[stationary process|stationarity]] impose more general constraints.@@@@1@31@@danf@17-8-2009
10411260@unknown@formal@none@1@S@All such sources are [[stochastic process|stochastic]].@@@@1@6@@danf@17-8-2009
10411270@unknown@formal@none@1@S@These terms are well studied in their own right outside information theory.@@@@1@12@@danf@17-8-2009
10411280@unknown@formal@none@1@S@====Rate====@@@@1@1@@danf@17-8-2009
10411290@unknown@formal@none@1@S@Information [[Entropy rate|'''rate''']] is the average entropy per symbol.@@@@1@9@@danf@17-8-2009
10411300@unknown@formal@none@1@S@For memoryless sources, this is merely the entropy of each symbol, while, in the case of a stationary stochastic process, it is@@@@1@22@@danf@17-8-2009
10411310@unknown@formal@none@1@S@:<math>r = \lim_{n \to \infty} H(X_n | X_{n-1}, X_{n-2}, \ldots, X_1);</math>@@@@1@7@@danf@17-8-2009
10411320@unknown@formal@none@1@S@that is, the conditional entropy of a symbol given all the previous symbols generated.@@@@1@14@@danf@17-8-2009
10411330@unknown@formal@none@1@S@For the more general case of a process that is not necessarily stationary, the ''average rate'' is@@@@1@17@@danf@17-8-2009
10411340@unknown@formal@none@1@S@:<math>r = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \ldots, X_n);</math>@@@@1@10@@danf@17-8-2009
10411350@unknown@formal@none@1@S@that is, the limit of the joint entropy per symbol.@@@@1@10@@danf@17-8-2009
10411360@unknown@formal@none@1@S@For stationary sources, these two expressions give the same result.@@@@1@10@@danf@17-8-2009
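The agreement of the two expressions for stationary sources can be illustrated with a small example. The sketch below assumes a hypothetical two-state Markov source (transition matrix `P`, started in its stationary distribution `PI`); a Markov source is not discussed in the text above and is used here only because its rates are easy to compute. For such a source the conditional entropy of a symbol given all previous symbols reduces to the entropy of the next symbol given the current one.

```python
import itertools
import math

# Hypothetical two-state Markov source (an assumption for illustration).
P = [[0.9, 0.1],
     [0.5, 0.5]]
# Stationary distribution of P: it satisfies PI = PI * P.
PI = [5 / 6, 1 / 6]

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def conditional_rate():
    """Entropy of the next symbol given the current one, in bits."""
    return sum(PI[i] * entropy(P[i]) for i in range(2))

def joint_entropy_per_symbol(n):
    """H(X_1, ..., X_n) / n, by enumerating all 2**n sequences."""
    total = 0.0
    for seq in itertools.product(range(2), repeat=n):
        p = PI[seq[0]]
        for a, b in zip(seq, seq[1:]):
            p *= P[a][b]
        total -= p * math.log2(p)
    return total / n

rate = conditional_rate()
per_symbol_4 = joint_entropy_per_symbol(4)
per_symbol_12 = joint_entropy_per_symbol(12)
```

As ''n'' grows, the joint entropy per symbol approaches `rate`: `per_symbol_12` is closer to it than `per_symbol_4`.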
10411370@unknown@formal@none@1@S@It is common in information theory to speak of the "rate" or "entropy" of a language.@@@@1@16@@danf@17-8-2009
10411380@unknown@formal@none@1@S@This is appropriate, for example, when the source of information is English prose.@@@@1@13@@danf@17-8-2009
10411390@unknown@formal@none@1@S@The rate of a source of information is related to its [[redundancy (information theory)|redundancy]] and how well it can be [[data compression|compressed]], the subject of '''source coding'''.@@@@1@27@@danf@17-8-2009
10411400@unknown@formal@none@1@S@===Channel capacity===@@@@1@2@@danf@17-8-2009
10411410@unknown@formal@none@1@S@Communication over a channel, such as an [[Ethernet]] cable, is the primary motivation of information theory.@@@@1@14@@danf@17-8-2009
10411420@unknown@formal@none@1@S@As anyone who has ever used a telephone (mobile or landline) knows, however, such channels often fail to produce exact reconstruction of a signal; noise, periods of silence, and other forms of signal corruption often degrade quality.@@@@1@36@@danf@17-8-2009
10411430@unknown@formal@none@1@S@How much information can one hope to communicate over a noisy (or otherwise imperfect) channel?@@@@1@15@@danf@17-8-2009
10411440@unknown@formal@none@1@S@Consider the communications process over a discrete channel.@@@@1@8@@danf@17-8-2009
10411450@unknown@formal@none@1@S@A simple model of the process is shown below:@@@@1@9@@danf@17-8-2009
10411460@unknown@formal@none@1@S@Here ''X'' represents the space of messages transmitted, and ''Y'' the space of messages received during a unit time over our channel.@@@@1@22@@danf@17-8-2009
10411470@unknown@formal@none@1@S@Let ''p''(''y''|''x'') be the [[conditional probability]] distribution function of ''Y'' given ''X''.@@@@1@12@@danf@17-8-2009
10411480@unknown@formal@none@1@S@We will consider ''p''(''y''|''x'') to be an inherent fixed property of our communications channel (representing the nature of the '''[[Signal noise|noise]]''' of our channel).@@@@1@24@@danf@17-8-2009
10411490@unknown@formal@none@1@S@Then the joint distribution of ''X'' and ''Y'' is completely determined by our channel and by our choice of ''p''(''x''), the marginal distribution of messages we choose to send over the channel.@@@@1@32@@danf@17-8-2009
10411500@unknown@formal@none@1@S@Under these constraints, we would like to maximize the rate of information, or the '''[[Signal (electrical engineering)|signal]]''', we can communicate over the channel.@@@@1@23@@danf@17-8-2009
10411510@unknown@formal@none@1@S@The appropriate measure for this is the [[mutual information]], and this maximum mutual information is called the '''[[channel capacity]]''' and is given by:@@@@1@23@@danf@17-8-2009
10411520@unknown@formal@none@1@S@:<math>C = \max_{p(x)} I(X;Y).</math>@@@@1@6@@danf@17-8-2009
10411530@unknown@formal@none@1@S@This capacity has the following property related to communicating at information rate ''R'' (where ''R'' is usually bits per symbol).@@@@1@20@@danf@17-8-2009
10411540@unknown@formal@none@1@S@For any information rate ''R < C'' and coding error ε > 0, for large enough ''N'', there exists a code of length ''N'' and rate ≥ R and a decoding algorithm, such that the maximal probability of block error is ≤ ε; that is, it is always possible to transmit with arbitrarily small block error.@@@@1@56@@danf@17-8-2009
10411550@unknown@formal@none@1@S@In addition, for any rate ''R > C'', it is impossible to transmit with arbitrarily small block error.@@@@1@18@@danf@17-8-2009
10411560@unknown@formal@none@1@S@'''[[Channel code|Channel coding]]''' is concerned with finding such nearly optimal [[error detection and correction|codes]] that can be used to transmit data over a noisy channel with a small coding error at a rate near the channel capacity.@@@@1@37@@danf@17-8-2009
10411570@unknown@formal@none@1@S@====Channel capacity of particular model channels====@@@@1@6@@danf@17-8-2009
10411580@unknown@formal@none@1@S@* A continuous-time analog communications channel subject to Gaussian noise — see [[Shannon–Hartley theorem]].@@@@1@14@@danf@17-8-2009
10411590@unknown@formal@none@1@S@* A [[binary symmetric channel]] (BSC) with crossover probability ''p'' is a binary input, binary output channel that flips the input bit with probability ''p''.@@@@1@26@@danf@17-8-2009
10411600@unknown@formal@none@1@S@The BSC has a capacity of <math>1 - H_\mathrm{b}(p)</math> bits per channel use, where <math>H_\mathrm{b}</math> is the [[binary entropy function]].@@@@1@20@@danf@17-8-2009
10411620@unknown@formal@none@1@S@* A binary erasure channel (BEC) with erasure probability ''p'' is a binary input, ternary output channel.@@@@1@19@@danf@17-8-2009
10411630@unknown@formal@none@1@S@The possible channel outputs are ''0'', ''1'', and a third symbol 'e' called an erasure.@@@@1@15@@danf@17-8-2009
10411640@unknown@formal@none@1@S@The erasure represents complete loss of information about an input bit.@@@@1@11@@danf@17-8-2009
10411650@unknown@formal@none@1@S@The capacity of the BEC is ''1 - p'' bits per channel use.@@@@1@13@@danf@17-8-2009
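The two closed-form capacities above can be sketched in a few lines. In addition, the example below checks the BSC value against the definition of capacity as the maximum mutual information over input distributions, using a simple grid search over the probability ''q'' of sending a 1. The function names and the grid resolution are assumptions made for this illustration.

```python
import math

def binary_entropy(p):
    """Binary entropy function H_b(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

def bec_capacity(p):
    """Capacity of a binary erasure channel with erasure probability p."""
    return 1.0 - p

def bsc_mutual_information(q, p):
    """I(X;Y) = H(Y) - H(Y|X) for input P(X=1) = q over a BSC(p)."""
    r = q * (1 - p) + (1 - q) * p   # P(Y = 1)
    return binary_entropy(r) - binary_entropy(p)

def bsc_capacity_by_search(p, steps=1000):
    """Maximize I(X;Y) over a grid of input distributions."""
    return max(bsc_mutual_information(i / steps, p) for i in range(steps + 1))
```

For example, `bsc_capacity(0.11)` is roughly half a bit per channel use, and the grid search returns the same value because the maximum is attained at the uniform input ''q'' = 1/2, which lies on the grid.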
10411670@unknown@formal@none@1@S@==Applications to other fields==@@@@1@4@@danf@17-8-2009
10411680@unknown@formal@none@1@S@===Intelligence uses and secrecy applications===@@@@1@5@@danf@17-8-2009
10411690@unknown@formal@none@1@S@Information-theoretic concepts apply to [[cryptography]] and [[cryptanalysis]].@@@@1@8@@danf@17-8-2009
10411700@unknown@formal@none@1@S@[[Turing]]'s information unit, the [[Ban (information)|ban]], was used in the [[Ultra]] project, breaking the German [[Enigma machine]] code and hastening the [[Victory in Europe Day|end of WWII in Europe]].@@@@1@29@@danf@17-8-2009
10411710@unknown@formal@none@1@S@Shannon himself defined an important concept now called the [[unicity distance]].@@@@1@11@@danf@17-8-2009
10411720@unknown@formal@none@1@S@Based on the [[redundancy (information theory)|redundancy]] of the [[plaintext]], it attempts to give a minimum amount of [[ciphertext]] necessary to ensure unique decipherability.@@@@1@23@@danf@17-8-2009
10411730@unknown@formal@none@1@S@Information theory leads us to believe it is much more difficult to keep secrets than it might first appear.@@@@1@19@@danf@17-8-2009
10411740@unknown@formal@none@1@S@A [[brute force attack]] can break systems based on [[public-key cryptography|asymmetric key algorithms]] or on most commonly used methods of [[symmetric-key algorithm|symmetric key algorithms]] (sometimes called secret key algorithms), such as [[block cipher]]s.@@@@1@33@@danf@17-8-2009
10411750@unknown@formal@none@1@S@The security of all such methods currently comes from the assumption that no known attack can break them in a practical amount of time.@@@@1@24@@danf@17-8-2009
10411760@unknown@formal@none@1@S@[[Information theoretic security]] refers to methods such as the [[one-time pad]] that are not vulnerable to such brute force attacks.@@@@1@20@@danf@17-8-2009
10411770@unknown@formal@none@1@S@In such cases, the positive conditional [[mutual information]] between the [[plaintext]] and [[ciphertext]] (conditioned on the [[key (cryptography)| key]]) can ensure proper transmission, while the unconditional mutual information between the plaintext and ciphertext remains zero, resulting in absolutely secure communications.@@@@1@40@@danf@17-8-2009
10411780@unknown@formal@none@1@S@In other words, an eavesdropper would not be able to improve his or her guess of the plaintext by gaining knowledge of the ciphertext but not of the key.@@@@1@29@@danf@17-8-2009
10411790@unknown@formal@none@1@S@However, as in any other cryptographic system, care must be used to correctly apply even information-theoretically secure methods; the [[Venona project]] was able to crack the one-time pads of the [[Soviet Union]] due to their improper reuse of key material.@@@@1@40@@danf@17-8-2009
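The danger of reusing one-time pad key material can be shown with a toy example. The sketch below assumes the usual XOR formulation of the one-time pad over bytes; the messages and helper names are hypothetical. When the same key encrypts two messages, XORing the two ciphertexts cancels the key and leaves the XOR of the two plaintexts, which an eavesdropper can then attack without ever learning the key.

```python
import secrets

def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

def otp_encrypt(message, key):
    """One-time pad: XOR the message with a key of the same length."""
    return xor_bytes(message, key)

def otp_decrypt(ciphertext, key):
    """Decryption is the same XOR operation."""
    return xor_bytes(ciphertext, key)

m1 = b"ATTACK AT DAWN"
m2 = b"RETREAT AT TEN"
key = secrets.token_bytes(len(m1))   # uniformly random key

c1 = otp_encrypt(m1, key)
c2 = otp_encrypt(m2, key)            # improper reuse of the same key

# The key cancels out: c1 XOR c2 == m1 XOR m2.
leak = xor_bytes(c1, c2)
```

Used once, the key makes `c1` statistically independent of `m1`; used twice, `leak` depends only on the plaintexts.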
10411800@unknown@formal@none@1@S@===Pseudorandom number generation===@@@@1@3@@danf@17-8-2009
10411810@unknown@formal@none@1@S@[[Pseudorandom number generator]]s are widely available in computer language libraries and application programs.@@@@1@13@@danf@17-8-2009
10411820@unknown@formal@none@1@S@They are, almost universally, unsuited to cryptographic use as they do not evade the deterministic nature of modern computer equipment and software.@@@@1@22@@danf@17-8-2009
10411830@unknown@formal@none@1@S@A class of improved random number generators is termed [[Cryptographically secure pseudorandom number generator]]s, but even they require [[random seed]]s external to the software to work as intended.@@@@1@28@@danf@17-8-2009
10411840@unknown@formal@none@1@S@These can be obtained via [[extractor]]s, if done carefully.@@@@1@9@@danf@17-8-2009
10411850@unknown@formal@none@1@S@The measure of sufficient randomness in extractors is [[min-entropy]], a value related to Shannon entropy through [[Rényi entropy]]; Rényi entropy is also used in evaluating randomness in cryptographic systems.@@@@1@29@@danf@17-8-2009
10411860@unknown@formal@none@1@S@Although related, the distinctions among these measures mean that a [[random variable]] with high Shannon entropy is not necessarily satisfactory for use in an extractor and so for cryptographic uses.@@@@1@30@@danf@17-8-2009
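A small numerical sketch shows how far the two measures can diverge. It uses the standard definition of min-entropy as the negative logarithm of the largest probability, and a hypothetical distribution in which one outcome has probability 1/2 while the remaining probability is spread evenly over 2<sup>16</sup> other outcomes.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def min_entropy(probs):
    """Min-entropy in bits: -log2 of the most likely outcome."""
    return -math.log2(max(probs))

# Hypothetical source: one heavy outcome plus 2**k equally rare ones.
k = 16
probs = [0.5] + [0.5 / 2**k] * (2**k)

h_shannon = shannon_entropy(probs)   # 9 bits for k = 16
h_min = min_entropy(probs)           # 1 bit
```

Although the Shannon entropy is 9 bits, an adversary guesses the output correctly half the time, which is what the 1-bit min-entropy reflects.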
10411870@unknown@formal@none@1@S@===Miscellaneous applications===@@@@1@2@@danf@17-8-2009
10411880@unknown@formal@none@1@S@Information theory also has applications in [[Gambling and information theory|gambling and investing]], [[black hole information paradox|black holes]], [[bioinformatics]], and [[music]].@@@@1@20@@danf@17-8-2009
10420010@unknown@formal@none@1@S@Italian language@@@@1@2@@danf@17-8-2009
10420020@unknown@formal@none@1@S@'''Italian''' (''italiano'', or ''lingua italiana'') is a [[Romance languages|Romance language]] spoken as a [[first language]] by about 63 million people, primarily in [[Italy]].@@@@1@23@@danf@17-8-2009
10420030@unknown@formal@none@1@S@In [[Switzerland]], Italian is one of four [[Linguistic geography of Switzerland|official language]]s.@@@@1@12@@danf@17-8-2009
10420040@unknown@formal@none@1@S@It is also the official language of [[San Marino]].@@@@1@9@@danf@17-8-2009
10420050@unknown@formal@none@1@S@It is the primary language of the [[Vatican City]].@@@@1@9@@danf@17-8-2009
10420060@unknown@formal@none@1@S@Standard Italian, adopted by the state after the [[unification of Italy]], is based on [[Tuscan dialect|Tuscan]] and is somewhat intermediate between [[Italo-Western|Italo-Dalmatian languages]] of the [[Mezzogiorno|South]] and [[Northern Italian dialects]] of the [[Northern Italy|North]].@@@@1@34@@danf@17-8-2009
10420070@unknown@formal@none@1@S@Unlike most other Romance languages, Italian has retained the contrast between short and [[consonant length|long consonants]] which existed in Latin.@@@@1@20@@danf@17-8-2009
10420080@unknown@formal@none@1@S@As in most [[Romance languages]], [[stress (linguistics)|stress]] is distinctive.@@@@1@9@@danf@17-8-2009
10420090@unknown@formal@none@1@S@Of the Romance languages, Italian is considered to be one of the closest to [[Latin]] in terms of [[vocabulary]].@@@@1@19@@danf@17-8-2009
10420100@unknown@formal@none@1@S@According to Ethnologue, lexical similarity is 89% with [[French language|French]], 87% with [[Catalan language|Catalan]], 85% with [[Sardinian language|Sardinian]], 82% with [[Spanish language|Spanish]], 78% with Rheto-Romance, and 77% with Romanian.@@@@1@29@@danf@17-8-2009
10420110@unknown@formal@none@1@S@It is affectionately called ''il parlar gentile'' (the gentle language) by its speakers.@@@@1@13@@danf@17-8-2009
10420120@unknown@formal@none@1@S@==Writing system==@@@@1@2@@danf@17-8-2009
10420130@unknown@formal@none@1@S@Italian is written using the [[Latin alphabet]].@@@@1@7@@danf@17-8-2009
10420140@unknown@formal@none@1@S@The letters ''J'', ''K'', ''W'', ''X'' and ''Y'' are not considered part of the standard [[Italian alphabet]], but appear in loanwords (such as ''jeans'', ''whisky'', ''taxi'').@@@@1@26@@danf@17-8-2009
10420150@unknown@formal@none@1@S@''X'' has become a commonly used letter in genuine Italian words with the prefix ''extra-''.@@@@1@15@@danf@17-8-2009
10420160@unknown@formal@none@1@S@''J'' in Italian is an old-fashioned orthographic variant of ''I'', appearing in the first name "Jacopo" as well as in some Italian place names, e.g., the towns of [[Bajardo]], [[Bojano]], [[Joppolo]], [[Jesolo]], [[Jesi]], among numerous others, and in the alternate spelling ''Mar Jonio'' (also spelled ''Mar Ionio'') for the [[Ionian Sea]].@@@@1@51@@danf@17-8-2009
10420170@unknown@formal@none@1@S@''J'' may also appear in many words from different dialects, but its use is discouraged in contemporary Italian, and it is not part of the standard 21-letter contemporary Italian alphabet.@@@@1@30@@danf@17-8-2009
10420180@unknown@formal@none@1@S@Each of these foreign letters had an Italian equivalent spelling: ''gi'' for ''j'', ''c'' or ''ch'' for ''k'', ''u'' or ''v'' for ''w'' (depending on what sound it makes), ''s'', ''ss'', or ''cs'' for ''x'', and ''i'' for ''y''.@@@@1@39@@danf@17-8-2009
10420190@unknown@formal@none@1@S@* Italian uses the [[acute accent]] over the letter ''E'' (as in ''perché'', why/because) to indicate a front mid-close vowel, and the [[grave accent]] (as in ''tè'', tea) to indicate a front mid-open vowel.@@@@1@34@@danf@17-8-2009
10420200@unknown@formal@none@1@S@The [[grave accent]] is also used on letters ''A'', ''I'', ''O'', and ''U'' to mark [[stress (linguistics)|stress]] when it falls on the final vowel of a word (for instance ''gioventù'', youth).@@@@1@31@@danf@17-8-2009
10420210@unknown@formal@none@1@S@Typically, the penultimate syllable is stressed.@@@@1@6@@danf@17-8-2009
10420220@unknown@formal@none@1@S@If syllables other than the last one are stressed, the accent is not mandatory, unlike in [[Spanish language|Spanish]], and, in virtually all cases, it is omitted.@@@@1@26@@danf@17-8-2009
10420230@unknown@formal@none@1@S@In some cases, when the word is ambiguous (as ''principi''), the accent mark is sometimes used in order to disambiguate its meaning (in this case, ''prìncipi'', princes, or ''princìpi'', principles).@@@@1@30@@danf@17-8-2009
10420240@unknown@formal@none@1@S@This is, however, not compulsory.@@@@1@5@@danf@17-8-2009
10420250@unknown@formal@none@1@S@Rare words with three or more syllables can confuse Italians themselves, and the pronunciation of [[Istanbul]] is a common example of a word in which placement of stress is not clearly established.@@@@1@32@@danf@17-8-2009
10420260@unknown@formal@none@1@S@Turkish, like French, tends to put the stress on the final syllable, but Italian does not.@@@@1@14@@danf@17-8-2009
10420270@unknown@formal@none@1@S@So we can hear "Istànbul" or "Ìstanbul".@@@@1@7@@danf@17-8-2009
10420280@unknown@formal@none@1@S@Another instance is the American state of [[Florida]]: the correct way to pronounce it in Italian is as in Spanish, "Florìda", but because there is an Italian word with the same meaning ("flourishing"), "flòrida", and because of the influence of English, most Italians pronounce it that way.@@@@1@46@@danf@17-8-2009
10420290@unknown@formal@none@1@S@Dictionaries give the latter as an alternative pronunciation.@@@@1@8@@danf@17-8-2009
10420300@unknown@formal@none@1@S@* The letter ''H'' at the beginning of a word is used to distinguish ''ho'', ''hai'', ''ha'', ''hanno'' (present indicative of ''avere'', 'to have') from ''o'' ('or'), ''ai'' ('to the'), ''a'' ('to'), ''anno'' ('year').@@@@1@34@@danf@17-8-2009
10420310@unknown@formal@none@1@S@In the spoken language this letter is always silent for the cases given above.@@@@1@14@@danf@17-8-2009
10420320@unknown@formal@none@1@S@''H'' is also used in combinations with other letters (see below), but no [[phoneme]] {{IPA|[h]}} exists in Italian.@@@@1@18@@danf@17-8-2009
10420330@unknown@formal@none@1@S@In foreign words that have entered common use, like "hotel" or "hovercraft", the ''H'' is commonly silent, so they are pronounced as {{IPA|/oˈtɛl/}} and {{IPA|/ˈɔverkraft/}}.@@@@1@24@@danf@17-8-2009
10420340@unknown@formal@none@1@S@* The letter ''Z'' represents {{IPA|/ʣ/}}, for example: ''Zanzara'' {{IPA|/dzan'dzaɾa/}} (mosquito), or {{IPA|/ʦ/}}, for example: ''Nazione'' {{IPA|/naˈttsjone/}} (nation), depending on context, though there are few [[minimal pair]]s.@@@@1@27@@danf@17-8-2009
10420350@unknown@formal@none@1@S@The same goes for ''S'', which can represent {{IPA|/s/}} or {{IPA|/z/}}.@@@@1@11@@danf@17-8-2009
10420360@unknown@formal@none@1@S@However, these two phonemes are in [[complementary distribution]] everywhere except between two vowels in the same word, and even in such environment there are extremely few minimal pairs, so that this distinction is being lost in many varieties.@@@@1@38@@danf@17-8-2009
10420370@unknown@formal@none@1@S@* The letters ''C'' and ''G'' represent [[affricate]]s: [[Voiceless postalveolar affricate|{{IPA|/ʧ/}}]] as in "chair" and [[Voiced postalveolar affricate|{{IPA|/ʤ/}}]] as in "gem", respectively, before the [[front vowel]]s ''I'' and ''E''.@@@@1@29@@danf@17-8-2009
10420380@unknown@formal@none@1@S@They are pronounced as [[plosive]]s {{IPA|/k/}}, {{IPA|/g/}} (as in "call" and "gall") otherwise.@@@@1@13@@danf@17-8-2009
10420390@unknown@formal@none@1@S@Front/back vowel rules for ''C'' and ''G'' are similar in [[French language|French]], [[Romanian language|Romanian]], [[Spanish language|Spanish]], and to some extent [[English language|English]] (including [[Old English]]).@@@@1@25@@danf@17-8-2009
10420400@unknown@formal@none@1@S@[[Swedish language|Swedish]] and [[Norwegian language|Norwegian]] have similar rules for ''K'' and ''G''.@@@@1@12@@danf@17-8-2009
10420410@unknown@formal@none@1@S@(See also [[palatalization]].)@@@@1@3@@danf@17-8-2009
10420420@unknown@formal@none@1@S@* However, an ''H'' can be added between ''C'' or ''G'' and ''E'' or ''I'' to represent a plosive, and an ''I'' can be added between ''C'' or ''G'' and ''A'', ''O'' or ''U'' to signal that the consonant is an affricate.@@@@1@42@@danf@17-8-2009
10420430@unknown@formal@none@1@S@For example:@@@@1@2@@danf@17-8-2009
10420440@unknown@formal@none@1@S@:Note that the ''H'' is [[silent letter|silent]] in the digraphs ''[[ch (digraph)|CH]]'' and ''[[gh (digraph)|GH]]'', as also the ''I'' in ''cia'', ''cio'', ''ciu'' and even ''cie'' is not pronounced as a separate vowel, unless it carries the primary stress.@@@@1@39@@danf@17-8-2009
10420450@unknown@formal@none@1@S@For example, it is silent in ''[[ciao]]'' {{IPA|/ˈʧa.o/}} and ''cielo'' {{IPA|/ˈʧɛ.lo/}}, but it is pronounced in ''farmacia'' {{IPA|/ˌfaɾ.ma.ˈʧi.a/}} and ''farmacie'' {{IPA|/ˌfaɾ.ma.ˈʧi.e/}}.@@@@1@21@@danf@17-8-2009
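The spelling rules for ''C'' and ''G'' described above can be summarized as a small lookup. The following is a deliberately simplified sketch: the function name is hypothetical, it looks only at the single letter that follows, and it ignores doubled consonants and the special digraphs ''GN'', ''GL'' and ''SC''.

```python
# Letters that trigger the affricate reading of C and G (front vowels,
# including their accented forms).
FRONT_VOWELS = {"e", "i", "è", "é", "ì"}

def c_g_sound(word, index):
    """Classify the C or G at word[index] as 'affricate' or 'plosive'.

    Simplified rule: affricate before E or I, plosive otherwise
    (including before H, as in the digraphs CH and GH).
    """
    letter = word[index].lower()
    if letter not in ("c", "g"):
        raise ValueError("expected the letter C or G at this position")
    following = word[index + 1].lower() if index + 1 < len(word) else ""
    return "affricate" if following in FRONT_VOWELS else "plosive"

examples = {
    "ciao": c_g_sound("ciao", 0),      # C before I
    "chi": c_g_sound("chi", 0),        # H keeps C a plosive
    "casa": c_g_sound("casa", 0),      # C before A
    "gelato": c_g_sound("gelato", 0),  # G before E
    "ghetto": c_g_sound("ghetto", 0),  # H keeps G a plosive
    "giorno": c_g_sound("giorno", 0),  # I marks G as an affricate
}
```

In ''ciao'' and ''giorno'' the ''I'' serves mainly as a spelling device, which is why the rule can be applied letter by letter without knowing whether the vowel is itself pronounced.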
10420460@unknown@formal@none@1@S@* There are three other special [[digraph (orthography)|digraphs]] in Italian: ''[[gn (digraph)|GN]]'', ''GL'' and ''SC''.@@@@1@15@@danf@17-8-2009
10420470@unknown@formal@none@1@S@''GN'' represents [[Palatal nasal|{{IPA|/ɲ/}}]].@@@@1@4@@danf@17-8-2009
10420480@unknown@formal@none@1@S@''GL'' represents [[Palatal lateral approximant|{{IPA|/ʎ/}}]] only before ''i'', and never at the beginning of a word, except in the [[personal pronoun]] and [[definite article]] ''gli''.@@@@1@25@@danf@17-8-2009
10420490@unknown@formal@none@1@S@(Compare with [[Spanish language|Spanish]] ''ñ'' and ''ll'', [[Portuguese language|Portuguese]] ''nh'' and ''lh''.)@@@@1@12@@danf@17-8-2009
10420500@unknown@formal@none@1@S@''SC'' represents fricative [[Voiceless postalveolar fricative|{{IPA|/ʃ/}}]] before ''i'' or ''e''.@@@@1@10@@danf@17-8-2009
10420510@unknown@formal@none@1@S@Except in the speech of some Northern Italians, all of these are normally [[geminate]] between vowels.@@@@1@16@@danf@17-8-2009
10420520@unknown@formal@none@1@S@* In general, all letters or digraphs represent phonemes rather clearly, and, in standard varieties of Italian, there is little allophonic variation.@@@@1@22@@danf@17-8-2009
10420530@unknown@formal@none@1@S@The most notable exceptions are assimilation of /n/ in point of articulation before consonants, assimilatory voicing of /s/ to following voiced consonants, and vowel length (vowels are long in stressed open syllables, and short elsewhere) — compare with the enormous number of [[allophone]]s of the English phoneme /t/.@@@@1@48@@danf@17-8-2009
10420540@unknown@formal@none@1@S@Spelling is clearly phonemic and difficult to mistake given a clear pronunciation.@@@@1@12@@danf@17-8-2009
10420550@unknown@formal@none@1@S@Exceptions are generally only found in foreign borrowings.@@@@1@8@@danf@17-8-2009
10420560@unknown@formal@none@1@S@There are fewer cases of [[dyslexia]] than among speakers of languages such as English, and the concept of a spelling bee is strange to Italians.@@@@1@26@@danf@17-8-2009
10420570@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10420580@unknown@formal@none@1@S@The history of the Italian language is long, but the modern standard of the language was largely shaped by relatively recent events.@@@@1@22@@danf@17-8-2009
10420590@unknown@formal@none@1@S@The earliest surviving texts which can definitely be called Italian (or more accurately, vernacular, as opposed to its predecessor [[Vulgar Latin]]) are legal formulae from the region of [[province of Benevento|Benevento]] dating from 960-963.@@@@1@34@@danf@17-8-2009
10420600@unknown@formal@none@1@S@What would come to be thought of as Italian was first formalized in the first years of the 14th century through the works of [[Dante Alighieri]], who mixed southern Italian languages, especially [[Sicilian language|Sicilian]], with his native Tuscan in his epic poems known collectively as the ''[[Divine Comedy|Commedia]],'' to which [[Giovanni Boccaccio]] later affixed the title ''Divina''.@@@@1@57@@danf@17-8-2009
10420610@unknown@formal@none@1@S@Dante's much-loved works were read throughout Italy and his written dialect became the "canonical standard" that all educated Italians could understand.@@@@1@21@@danf@17-8-2009
10420620@unknown@formal@none@1@S@Dante is still credited with standardizing the Italian language and, thus, the dialect of [[Tuscany]] became the basis for what would become the official language of Italy.@@@@1@27@@danf@17-8-2009
10420630@unknown@formal@none@1@S@Italy has always had a distinctive dialect for each city since the cities were until recently thought of as [[city-state]]s.@@@@1@20@@danf@17-8-2009
10420640@unknown@formal@none@1@S@The standard language itself now shows considerable [[variety (linguistics)|variety]], however.@@@@1@8@@danf@17-8-2009
10420650@unknown@formal@none@1@S@As Tuscan-derived Italian came to be used throughout the nation, features of local speech were naturally adopted, producing various versions of Regional Italian.@@@@1@23@@danf@17-8-2009
10420660@unknown@formal@none@1@S@The most characteristic differences, for instance, between [[Romanesco|Roman Italian]] and [[Milanese|Milanese Italian]] are the [[consonant length|gemination]] of initial consonants and the pronunciation of stressed "e", and of "s" in some cases (e.g. ''va bene'' "all right": is pronounced {{IPA|[va ˈbːɛne]}} by a Roman, {{IPA|[va ˈbene]}} by a Milanese; ''a casa'' "at home": Roman {{IPA|[a ˈkːasa]}}, Milanese {{IPA|[a ˈkaza]}}).@@@@1@58@@danf@17-8-2009
10420670@unknown@formal@none@1@S@In contrast to the [[Northern Italian language|dialects of northern Italy]], [[southern Italian]] dialects were largely untouched by the Franco-[[Occitan language|Occitan]] influences introduced to Italy, mainly by [[bard]]s from [[France]], during the [[Middle Ages]].@@@@1@33@@danf@17-8-2009
10420680@unknown@formal@none@1@S@Even in the case of Northern Italian dialects, however, scholars are careful not to overstate the effects of outsiders on the natural indigenous developments of the languages.@@@@1@27@@danf@17-8-2009
10420690@unknown@formal@none@1@S@(See [[La Spezia-Rimini Line]].)@@@@1@4@@danf@17-8-2009
10420700@unknown@formal@none@1@S@The economic might and relatively advanced development of [[Tuscany]] at the time ([[Late Middle Ages]]) gave its dialect weight, though Venetian remained widespread in medieval Italian commercial life.@@@@1@28@@danf@17-8-2009
10420710@unknown@formal@none@1@S@Also, the increasing cultural relevance of [[Florence, Italy|Florence]] during the periods of '[[Humanism|Umanesimo (Humanism)]]' and the [[Renaissance|Rinascimento (Renaissance)]] made its ''volgare'' (dialect), or rather a refined version of it, a standard in the arts.@@@@1@34@@danf@17-8-2009
10420720@unknown@formal@none@1@S@The re-discovery of Dante's ''[[De vulgari eloquentia]]'' and a renewed interest in linguistics in the 16th century sparked a debate which raged throughout Italy concerning which criteria should be chosen to establish a modern Italian standard to be used as both a literary and a spoken language.@@@@1@48@@danf@17-8-2009
10420730@unknown@formal@none@1@S@Scholars were divided into three factions: the [[purism|purists]], headed by [[Pietro Bembo]], who in his ''[[Gli Asolani]]'' claimed that the language might only be based on the great literary classics (notably [[Petrarch]] and Boccaccio, but not Dante, as Bembo believed that the ''Divine Comedy'' was not dignified enough because it used elements from other dialects); [[Niccolò Machiavelli]] and other [[Florence|Florentine]]s, who preferred the version spoken by ordinary people in their own times; and the [[Courtier|courtiers]], like [[Baldassarre Castiglione]] and [[Gian Giorgio Trissino]], who insisted that each local vernacular must contribute to the new standard.@@@@1@94@@danf@17-8-2009
10420740@unknown@formal@none@1@S@Eventually Bembo's ideas prevailed, the result being the publication of the first Italian dictionary in 1612 and the foundation of the [[Accademia della Crusca]] in Florence (1582-3), the official legislative body of the Italian language.@@@@1@35@@danf@17-8-2009
10420750@unknown@formal@none@1@S@Italian literature's first modern novel, [[The Betrothed|''I Promessi Sposi'']] (The Betrothed), by [[Alessandro Manzoni]], further defined the standard by "rinsing" his Milanese "in the waters of the [[Arno River|Arno]]" ([[Florence]]'s river), as he states in the Preface to his 1840 edition.@@@@1@41@@danf@17-8-2009
10420760@unknown@formal@none@1@S@After unification a huge number of civil servants and soldiers recruited from all over the country introduced many more words and idioms from their home dialects ("[[ciao]]" is [[Venetian language|Venetian]], "[[panettone]]" is [[Milanese]] etc.).@@@@1@34@@danf@17-8-2009
10420770@unknown@formal@none@1@S@==Classification==@@@@1@1@@danf@17-8-2009
10420780@unknown@formal@none@1@S@Italian is most closely related to the other two Italo-Dalmatian languages, [[Sicilian language|Sicilian]] and the extinct [[Dalmatian language|Dalmatian]].@@@@1@18@@danf@17-8-2009
10420790@unknown@formal@none@1@S@The three are part of the [[Italo-Western languages|Italo-Western]] grouping of the [[Romance languages]], which are a subgroup of the [[Italic languages|Italic]] branch of [[Indo-European language family|Indo-European]].@@@@1@26@@danf@17-8-2009
10420800@unknown@formal@none@1@S@==Geographic distribution==@@@@1@2@@danf@17-8-2009
10420810@unknown@formal@none@1@S@The total number of native speakers of Italian is between 60 and 70 million.@@@@1@14@@danf@17-8-2009
10420820@unknown@formal@none@1@S@Speakers who use Italian as a second or cultural language are estimated at around 110-120 million.@@@@1@16@@danf@17-8-2009
10420830@unknown@formal@none@1@S@Italian is the official language of [[Italy]] and [[San Marino]], and one of the official languages of [[Switzerland]], spoken mainly in [[Canton Ticino|Ticino]] and [[Graubünden|Grigioni]] cantons, a region referred to as [[Italian Switzerland]].@@@@1@33@@danf@17-8-2009
10420840@unknown@formal@none@1@S@It is also the second official language in some areas of [[Istria]], in [[Slovenia]] and [[Croatia]], where an Italian minority exists.@@@@1@21@@danf@17-8-2009
10420850@unknown@formal@none@1@S@It is the primary language of the [[Vatican City]] and is widely used and taught in [[Monaco]] and [[Malta]].@@@@1@19@@danf@17-8-2009
10420860@unknown@formal@none@1@S@It is also widely understood in France with over one million speakers (especially in [[Corsica]] and the [[County of Nice]], areas that historically spoke [[Italian dialects]] before annexation to [[France]]), and in [[Albania]].@@@@1@33@@danf@17-8-2009
10420870@unknown@formal@none@1@S@Italian is also spoken by some in former Italian colonies in [[Africa]] ([[Libya]], [[Somalia]] and [[Eritrea]]).@@@@1@16@@danf@17-8-2009
10420880@unknown@formal@none@1@S@However, its use has sharply dropped off since the colonial period.@@@@1@11@@danf@17-8-2009
10420890@unknown@formal@none@1@S@In [[Eritrea]], [[Italian Language|Italian]] is widely understood.@@@@1@7@@danf@17-8-2009
10420900@unknown@formal@none@1@S@In fact, for fifty years, during the colonial period, Italian was the language of instruction, but [[as of 1997]], there is only one Italian language school remaining, with 470 pupils.@@@@1@30@@danf@17-8-2009
10420910@unknown@formal@none@1@S@In [[Somalia]] Italian used to be a major language but due to the civil war and lack of education only the older generation still uses it.@@@@1@26@@danf@17-8-2009
10420920@unknown@formal@none@1@S@Italian and [[Italian dialects]] are widely used by Italian immigrants and many of their descendants (see ''[[Italians]]'') living throughout [[Western Europe]] (especially [[France]], [[Germany]], [[Belgium]], [[Switzerland]], the [[Britalian|United Kingdom]] and [[Luxembourg]]), the [[Italian Americans|United States]], [[Italian Canadians|Canada]], [[Italian Australians|Australia]], and [[Latin America]] (especially [[Uruguay]], [[Italian Brazilians|Brazil]], [[Argentina]], and [[Venezuela]]).@@@@1@49@@danf@17-8-2009
10420930@unknown@formal@none@1@S@In the United States, Italian speakers are most commonly found in four cities: [[Boston]] (7,000), [[Chicago]] (12,000), [[New York City]] (140,000), and [[Philadelphia]] (15,000).@@@@1@24@@danf@17-8-2009
10420940@unknown@formal@none@1@S@In Canada there are large Italian-speaking communities in [[Montreal]] (120,000) and [[Toronto]] (195,000).@@@@1@13@@danf@17-8-2009
10420950@unknown@formal@none@1@S@Italian is the second most commonly-spoken language in Australia, where 353,605 [[Italian Australian]]s, or 1.9% of the population, reported speaking Italian at home in the 2001 [[Census in Australia|Census]].@@@@1@29@@danf@17-8-2009
10420960@unknown@formal@none@1@S@In 2001 there were 130,000 Italian speakers in [[Melbourne]], and 90,000 in [[Sydney]].@@@@1@13@@danf@17-8-2009
10420970@unknown@formal@none@1@S@===Italian language education===@@@@1@3@@danf@17-8-2009
10420980@unknown@formal@none@1@S@Italian is widely taught in many schools around the world, but rarely as the first non-native language of pupils; in fact, Italian generally is the fourth or fifth most taught second-language in the world.@@@@1@34@@danf@17-8-2009
10420990@unknown@formal@none@1@S@In [[anglophone]] parts of [[Canada]], Italian is, after [[French language|French]], the third most taught language.@@@@1@15@@danf@17-8-2009
10421000@unknown@formal@none@1@S@In [[francophone]] Canada it is third after [[English language|English]].@@@@1@9@@danf@17-8-2009
10421010@unknown@formal@none@1@S@In the [[United States]] and the [[United Kingdom]], Italian ranks fourth (after [[Spanish language|Spanish]]-French-[[German language|German]] and French-German-Spanish respectively).@@@@1@18@@danf@17-8-2009
10421020@unknown@formal@none@1@S@Throughout the world, Italian is the fifth most taught non-native language, after [[English language|English]], French, Spanish, and German.@@@@1@18@@danf@17-8-2009
10421030@unknown@formal@none@1@S@In the [[European Union]], Italian is spoken as a mother tongue by 13% of the population (64 million, mainly in Italy itself) and as a second language by 3% (14 million); among EU member states, it is most likely to be desired (and therefore learned) as a second language in [[Malta]] (61%), [[Croatia]] (14%), [[Slovenia]] (12%), [[Austria]] (11%), [[Romania]] (8%), [[France]] (6%), and [[Greece]] (6%).@@@@1@65@@danf@17-8-2009
10421040@unknown@formal@none@1@S@It is also an important second language in [[Albania]] and [[Switzerland]], which are not EU members or candidates.@@@@1@18@@danf@17-8-2009
10421050@unknown@formal@none@1@S@===Influence and derived languages===@@@@1@4@@danf@17-8-2009
10421060@unknown@formal@none@1@S@From the late 19th to the mid 20th century, thousands of Italians settled in Argentina, Uruguay and southern Brazil, where they formed a very strong physical and cultural presence (see the [[Italian diaspora]]).@@@@1@33@@danf@17-8-2009
10421070@unknown@formal@none@1@S@In some cases, colonies were established where variants of [[Italian dialects]] were used, and some continue to use a derived dialect.@@@@1@21@@danf@17-8-2009
10421080@unknown@formal@none@1@S@Examples are [[Rio Grande do Sul]], [[Brazil]], where [[Talian]] is used, and the town of [[Chipilo]] near Puebla, [[Mexico]]; each continues to use a derived form of [[Venetian language|Venetian]] dating back to the 19th century.@@@@1@35@@danf@17-8-2009
10421090@unknown@formal@none@1@S@Another example is [[Cocoliche]], an Italian-Spanish [[pidgin]] once spoken in [[Argentina]] and especially in [[Buenos Aires]], and [[Lunfardo]].@@@@1@18@@danf@17-8-2009
10421100@unknown@formal@none@1@S@[[Rioplatense Spanish]], and particularly the speech of the city of Buenos Aires, has intonation patterns that resemble those of Italian dialects, due to the fact that Argentina had a constant, large influx of Italian settlers since the second half of the nineteenth century; initially primarily from Northern Italy then, since the beginning of the twentieth century, mostly from Southern Italy.@@@@1@60@@danf@17-8-2009
10421110@unknown@formal@none@1@S@===Lingua Franca===@@@@1@2@@danf@17-8-2009
10421120@unknown@formal@none@1@S@Starting in late [[medieval]] times, Italian language variants (especially Tuscan and Venetian) replaced Latin to become the primary commercial language for much of Europe and the Mediterranean Sea.@@@@1@28@@danf@17-8-2009
10421130@unknown@formal@none@1@S@This became solidified during the [[Renaissance]] with the strength of Italian banking and the rise of [[Renaissance humanism|humanism]] in the arts.@@@@1@21@@danf@17-8-2009
10421140@unknown@formal@none@1@S@During the period of the Renaissance, Italy held artistic sway over the rest of Europe.@@@@1@15@@danf@17-8-2009
10421150@unknown@formal@none@1@S@All educated European gentlemen were expected to make the [[Grand Tour]], visiting Italy to see its great historical monuments and works of art.@@@@1@23@@danf@17-8-2009
10421160@unknown@formal@none@1@S@It thus became expected that educated Europeans would learn at least some Italian; the English poet [[John Milton]], for instance, wrote some of his early poetry in Italian.@@@@1@28@@danf@17-8-2009
10421170@unknown@formal@none@1@S@In England, Italian became the second most common modern language to be learned, after [[French language|French]] (though the classical languages, [[Latin]] and [[Greek language|Greek]], came first).@@@@1@26@@danf@17-8-2009
10421180@unknown@formal@none@1@S@However, by the late eighteenth century, Italian tended to be replaced by [[German language|German]] as the second modern language on the curriculum.@@@@1@22@@danf@17-8-2009
10421190@unknown@formal@none@1@S@Yet Italian [[loanword]]s continue to be used in most other [[European languages]] in matters of art and music.@@@@1@18@@danf@17-8-2009
10421200@unknown@formal@none@1@S@Today, the Italian language continues to be used as a [[lingua franca]] in some environments.@@@@1@15@@danf@17-8-2009
10421210@unknown@formal@none@1@S@Within the [[Catholic church]], Italian is known by a large part of the ecclesiastical hierarchy, and is used in place of [[Latin]] in some official documents.@@@@1@26@@danf@17-8-2009
10421220@unknown@formal@none@1@S@The presence of Italian as the primary language in the [[Vatican City]] indicates not only use within the [[Holy See]], but also throughout the world where an episcopal seat is present.@@@@1@31@@danf@17-8-2009
10421230@unknown@formal@none@1@S@It continues to be used in [[music]] and [[opera]].@@@@1@9@@danf@17-8-2009
10421240@unknown@formal@none@1@S@Italian is also sometimes used as a means of communication in some sports (such as [[Football (association)|football]] and [[motorsports]]) and in the [[design]] and [[fashion]] industries.@@@@1@26@@danf@17-8-2009
10421250@unknown@formal@none@1@S@==Dialects==@@@@1@1@@danf@17-8-2009
10421260@unknown@formal@none@1@S@In Italy, all [[Romance languages]] spoken as the vernacular, other than standard Italian and other unrelated, non-Italian languages, are termed "Italian dialects".@@@@1@22@@danf@17-8-2009
10421270@unknown@formal@none@1@S@Many Italian dialects are, in fact, historical languages in their own right.@@@@1@12@@danf@17-8-2009
10421280@unknown@formal@none@1@S@These include recognized language groups such as [[Friulian language|Friulian]], [[Neapolitan language|Neapolitan]], [[Sardinian language|Sardinian]], [[Sicilian language|Sicilian]], [[Venetian language|Venetian]], and others, and regional variants of these languages such as [[Calabrian languages|Calabrian]].@@@@1@29@@danf@17-8-2009
10421290@unknown@formal@none@1@S@The division between dialect and language has been used by scholars (such as by [[Francesco Bruni]]) to distinguish between the languages that made up the Italian [[koine]], and those which had very little or no part in it, such as [[Albanian language|Albanian]], [[Greek language|Greek]], [[German language|German]], [[Ladin language|Ladin]], and [[Occitan language|Occitan]], which are still spoken by minorities.@@@@1@57@@danf@17-8-2009
10421300@unknown@formal@none@1@S@Dialects are generally not used for mass communication and are usually limited to native speakers in informal contexts.@@@@1@18@@danf@17-8-2009
10421310@unknown@formal@none@1@S@In the past, speaking in dialect was often deprecated as a sign of poor education.@@@@1@15@@danf@17-8-2009
10421320@unknown@formal@none@1@S@Younger generations, especially those under 35 (though it may vary in different areas), speak almost exclusively standard Italian in all situations, usually with local accents and idioms.@@@@1@27@@danf@17-8-2009
10421330@unknown@formal@none@1@S@Regional differences can be recognized by various factors: the openness of vowels, the length of the consonants, and influence of the local dialect (for example, ''annà'' replaces ''andare'' in the area of Rome for the infinitive "to go").@@@@1@38@@danf@17-8-2009
10421340@unknown@formal@none@1@S@==Sounds==@@@@1@1@@danf@17-8-2009
10421350@unknown@formal@none@1@S@{{IPA notice|lang=it}}@@@@1@2@@danf@17-8-2009
10421360@unknown@formal@none@1@S@===Vowels===@@@@1@1@@danf@17-8-2009
10421370@unknown@formal@none@1@S@Italian has seven [[vowel]] phonemes: {{IPA|/a/}}, {{IPA|/e/}}, {{IPA|/ɛ/}}, {{IPA|/i/}}, {{IPA|/o/}}, {{IPA|/ɔ/}}, {{IPA|/u/}}.@@@@1@12@@danf@17-8-2009
10421380@unknown@formal@none@1@S@The pairs {{IPA|/e/}}-{{IPA|/ɛ/}} and {{IPA|/o/}}-{{IPA|/ɔ/}} are seldom distinguished in writing and often confused, even though most varieties of Italian employ both phonemes consistently.@@@@1@23@@danf@17-8-2009
10421390@unknown@formal@none@1@S@Compare, for example: "perché" {{IPA|[perˈkɛ]}} (why, because) and "senti" {{IPA|[ˈsenti]}} (you listen, you are listening, listen!), employed by some northern speakers, with {{IPA|[perˈke]}} and {{IPA|[ˈsɛnti]}}, as pronounced by most central and southern speakers.@@@@1@33@@danf@17-8-2009
10421400@unknown@formal@none@1@S@As a result, the usage is strongly indicative of a person's origin.@@@@1@12@@danf@17-8-2009
10421410@unknown@formal@none@1@S@The standard (Tuscan) usage of these vowels is listed in vocabularies, and employed outside Tuscany mainly by specialists, especially actors and very few (television) journalists.@@@@1@25@@danf@17-8-2009
10421420@unknown@formal@none@1@S@These are truly different [[phonemes]], however: compare {{IPA|/ˈpeska/}} (fishing) and {{IPA|/ˈpɛska/}} (peach), both spelled ''pesca''.@@@@1@15@@danf@17-8-2009
10421430@unknown@formal@none@1@S@Similarly {{IPA|/ˈbotte/}} ('barrel') and {{IPA|/ˈbɔtte/}} ('beatings'), both spelled ''botte'', discriminate {{IPA|/o/}} and {{IPA|/ɔ/}}.@@@@1@13@@danf@17-8-2009
10421440@unknown@formal@none@1@S@In general, each vowel in a vowel combination is pronounced separately.@@@@1@11@@danf@17-8-2009
10421450@unknown@formal@none@1@S@[[Diphthong]]s exist (e.g. ''uo'', ''iu'', ''ie'', ''ai''), but are limited to an unstressed ''u'' or ''i'' before or after a stressed vowel.@@@@1@22@@danf@17-8-2009
10421460@unknown@formal@none@1@S@The unstressed ''u'' in a diphthong approximates the English semivowel ''w'', the unstressed ''i'' approximates the semivowel ''y''.@@@@1@18@@danf@17-8-2009
10421470@unknown@formal@none@1@S@E.g.: ''buono'' {{IPA|[ˈbwɔno]}}, ''ieri'' {{IPA|[ˈjɛri]}}.@@@@1@5@@danf@17-8-2009
10421480@unknown@formal@none@1@S@[[Triphthong]]s exist in Italian as well, like "contin''uia''mo" ("we continue").@@@@1@10@@danf@17-8-2009
10421490@unknown@formal@none@1@S@Three vowel combinations exist only in the form semiconsonant ({{IPA|/j/}} or {{IPA|/w/}}), followed by a vowel, followed by a desinence vowel (usually {{IPA|/i/}}), as in ''miei'', ''suoi'', or two semiconsonants followed by a vowel, as the group ''-uia-'' exemplified above, or ''-iuo-'' in the word ''aiuola''.@@@@1@46@@danf@17-8-2009
10421500@unknown@formal@none@1@S@===Mobile diphthongs===@@@@1@2@@danf@17-8-2009
10421510@unknown@formal@none@1@S@Many Latin words with a short ''e'' or ''o'' have Italian counterparts with a mobile diphthong (''ie'' and ''uo'' respectively).@@@@1@20@@danf@17-8-2009
10421520@unknown@formal@none@1@S@When the vowel sound is stressed, it is pronounced and written as a diphthong; when not stressed, it is pronounced and written as a single vowel.@@@@1@26@@danf@17-8-2009
10421530@unknown@formal@none@1@S@So Latin ''focus'' gave rise to Italian ''fuoco'' (meaning both "fire" and "optical focus"): when unstressed, as in ''focale'' ("focal") the "o" remains alone.@@@@1@24@@danf@17-8-2009
10421540@unknown@formal@none@1@S@Latin ''pes'' (more precisely its accusative form ''pedem'') is the source of Italian ''piede'' (foot): but unstressed "e" was left unchanged in ''pedone'' (pedestrian) and ''pedale'' (pedal).@@@@1@27@@danf@17-8-2009
10421550@unknown@formal@none@1@S@From Latin ''iocus'' comes Italian ''giuoco'' ("play", "game"), though in this case ''gioco'' is more common: ''giocare'' means "to play (a game)".@@@@1@22@@danf@17-8-2009
10421560@unknown@formal@none@1@S@From Latin ''homo'' comes Italian ''uomo'' (man), but also ''umano'' (human) and ''ominide'' (hominid).@@@@1@14@@danf@17-8-2009
10421570@unknown@formal@none@1@S@From Latin ''ovum'' comes Italian ''uovo'' (egg) and ''ovaie'' (ovaries).@@@@1@10@@danf@17-8-2009
10421580@unknown@formal@none@1@S@(The same phenomenon occurs in [[Spanish language|Spanish]]: ''juego'' (play, game) and ''jugar'' (to play), ''nieve'' (snow) and ''nevar'' (to snow)).@@@@1@20@@danf@17-8-2009
10421590@unknown@formal@none@1@S@===Consonants===@@@@1@1@@danf@17-8-2009
10421600@unknown@formal@none@1@S@Two symbols in a table cell denote the voiceless and voiced consonant, respectively.@@@@1@13@@danf@17-8-2009
10421610@unknown@formal@none@1@S@Nasals undergo assimilation when followed by a consonant, e.g., when preceding a velar ({{IPA|/k/}} or {{IPA|/g/}}) only {{IPA|[ŋ]}} appears, etc.@@@@1@20@@danf@17-8-2009
10421620@unknown@formal@none@1@S@Italian has geminate, or double, consonants, which are distinguished by [[Consonant length|length]].@@@@1@12@@danf@17-8-2009
10421630@unknown@formal@none@1@S@Length is distinctive for all consonants except for {{IPA|/ʃ/}}, {{IPA|/ʦ/}}, {{IPA|/ʣ/}}, {{IPA|/ʎ/}} and {{IPA|/ɲ/}}, which are always geminate, and {{IPA|/z/}}, which is always single.@@@@1@24@@danf@17-8-2009
10421640@unknown@formal@none@1@S@Geminate plosives and affricates are realized as lengthened closures.@@@@1@9@@danf@17-8-2009
10421650@unknown@formal@none@1@S@Geminate fricatives, nasals, and {{IPA|/l/}} are realized as lengthened [[continuant]]s.@@@@1@10@@danf@17-8-2009
10421660@unknown@formal@none@1@S@The flap consonant {{IPA|/ɾː/}} is typically dialectal, and it is called ''erre moscia''.@@@@1@13@@danf@17-8-2009
10421670@unknown@formal@none@1@S@The correct standard pronunciation is {{IPA|[r]}}.@@@@1@6@@danf@17-8-2009
10421680@unknown@formal@none@1@S@Of special interest to the linguistic study of Italian is the ''[[Tuscan gorgia|Gorgia Toscana]]'', or "Tuscan Throat", the weakening or [[lenition]] of certain [[:wiktionary:intervocalic|intervocalic]] consonants in [[Tuscan dialect]]s.@@@@1@28@@danf@17-8-2009
10421690@unknown@formal@none@1@S@See also [[Syntactic doubling]].@@@@1@4@@danf@17-8-2009
10421700@unknown@formal@none@1@S@===Assimilation===@@@@1@1@@danf@17-8-2009
10421710@unknown@formal@none@1@S@Italian has few diphthongs, so most unfamiliar diphthongs that are heard in foreign words (in particular, those beginning with vowel "a", "e", or "o") will be assimilated as the corresponding [[diaeresis]] (i.e., the vowel sounds will be pronounced separately).@@@@1@39@@danf@17-8-2009
10421720@unknown@formal@none@1@S@Italian [[phonotactics]] do not usually permit polysyllabic nouns and verbs to end with consonants, excepting poetry and song, so foreign words may receive extra terminal vowel sounds.@@@@1@27@@danf@17-8-2009
10421730@unknown@formal@none@1@S@==Grammar==@@@@1@1@@danf@17-8-2009
10421740@unknown@formal@none@1@S@===Common variations in the writing systems===@@@@1@6@@danf@17-8-2009
10421750@unknown@formal@none@1@S@Some variations in the usage of the writing system may be present in practical use.@@@@1@15@@danf@17-8-2009
10421760@unknown@formal@none@1@S@These are scorned by educated people, but they are so common in certain contexts that knowledge of them may be useful.@@@@1@21@@danf@17-8-2009
10421770@unknown@formal@none@1@S@* Usage of ''x'' instead of ''per'': this is very common among teenagers and in [[Text messaging|SMS]] abbreviations.@@@@1@18@@danf@17-8-2009
10421780@unknown@formal@none@1@S@The multiplication operator is pronounced "per" in Italian, and so it is sometimes used to replace the word "per", which means "for"; thus, for example, "per te" ("for you") is shortened to "x te" (compare with English "4 U").@@@@1@39@@danf@17-8-2009
10421790@unknown@formal@none@1@S@Words containing ''per'' can also have it replaced with ''x'': for example, ''perché'' (both "why" and "because") is often shortened as ''xché'' or ''xké'' or ''x''' (see below).@@@@1@28@@danf@17-8-2009
10421800@unknown@formal@none@1@S@This usage might be useful to jot down quick notes or to fit more text into the low character limit of an SMS, but it is considered unacceptable in formal writing.@@@@1@31@@danf@17-8-2009
10421810@unknown@formal@none@1@S@* Usage of foreign letters such as ''k'', ''j'' and ''y'', especially in nicknames and SMS language: ''ke'' instead of ''che'', ''Giusy'' instead of ''Giuseppina'' (or sometimes ''Giuseppe'').@@@@1@28@@danf@17-8-2009
10421820@unknown@formal@none@1@S@This is curiously mirrored in the usage of ''i'' in English names such as ''Staci'' instead of ''Stacey'', or in the usage of ''c'' in [[Northern Europe]] (''Jacob'' instead of ''Jakob'').@@@@1@31@@danf@17-8-2009
10421830@unknown@formal@none@1@S@The use of "k" instead of "ch" or "c" to represent a plosive sound is documented in some historical texts from before the standardization of the Italian language; however, that usage is no longer standard in Italian.@@@@1@37@@danf@17-8-2009
10421840@unknown@formal@none@1@S@Possibly because it is associated with the [[German language]], the letter "k" has sometimes also been used in satire to suggest that a political figure is an authoritarian or even a "pseudo-nazi": [[Francesco Cossiga]] was famously nicknamed ''Kossiga'' by rioting students during his tenure as minister of internal affairs.@@@@1@49@@danf@17-8-2009
10421850@unknown@formal@none@1@S@[Cf. the [[alternative political spelling#"K" replacing "C"|politicized spelling ''Amerika'']] in the USA.]@@@@1@12@@danf@17-8-2009
10421860@unknown@formal@none@1@S@* Usage of the following abbreviations is limited to the electronic communications media and is deprecated in all other cases: '''nn''' instead of ''non'' (not), '''cmq''' instead of ''comunque'' (anyway, however), '''cm''' instead of ''come'' (how, like, as), '''d''' instead of ''di'' (of), '''(io/loro) sn''' instead of ''(io/loro) sono'' (I am/they are), '''(io) dv''' instead of ''(io) devo'' (I must/I have to) or instead of ''dove'' (where), '''(tu) 6''' instead of ''(tu) sei'' (you are).@@@@1@75@@danf@17-8-2009
10421870@unknown@formal@none@1@S@* Inexperienced typists often replace accents with apostrophes, such as in ''perche''' instead of ''perché''.@@@@1@15@@danf@17-8-2009
10421880@unknown@formal@none@1@S@Uppercase ''[[È]]'' is particularly rare, as it is absent from the [[Keyboard layout#Italian|Italian keyboard layout]], and is very often written as ''E''' (even though there are [[:it:Aiuto:Manuale di stile#Scrivere .C3.88|several ways]] of producing the uppercase È on a computer).@@@@1@39@@danf@17-8-2009
10421890@unknown@formal@none@1@S@This never happens in books or other professionally typeset material.@@@@1@10@@danf@17-8-2009
10421900@unknown@formal@none@1@S@==Samples==@@@@1@1@@danf@17-8-2009
10421910@unknown@formal@none@1@S@==Examples==@@@@1@1@@danf@17-8-2009
10421920@unknown@formal@none@1@S@*Cheers: "Salute!"@@@@1@2@@danf@17-8-2009
10421930@unknown@formal@none@1@S@*English: ''inglese'' {{IPA|/iŋˈglese/}}@@@@1@3@@danf@17-8-2009
10421940@unknown@formal@none@1@S@*Good-bye: ''arrivederci'' {{IPA|/arriveˈdertʃi/}}@@@@1@3@@danf@17-8-2009
10421950@unknown@formal@none@1@S@*Hello: ''[[ciao]]'' {{IPA|/ˈtʃao/}}@@@@1@3@@danf@17-8-2009
10421960@unknown@formal@none@1@S@*Good day: ''buon giorno'' {{IPA|/bwɔnˈdʒorno/}}@@@@1@5@@danf@17-8-2009
10421970@unknown@formal@none@1@S@*Good evening: ''buona sera'' {{IPA|/bwɔnaˈsera/}}@@@@1@5@@danf@17-8-2009
10421980@unknown@formal@none@1@S@*Yes: ''sì'' {{IPA|/si/}}@@@@1@3@@danf@17-8-2009
10421990@unknown@formal@none@1@S@*No: ''no'' {{IPA|/nɔ/}}@@@@1@3@@danf@17-8-2009
10422000@unknown@formal@none@1@S@*How are you?: ''come stai'' {{IPA|/ˈkome ˈstai/}} (informal); ''come sta'' {{IPA|/ˈkome ˈsta/}} (formal)@@@@1@13@@danf@17-8-2009
10422010@unknown@formal@none@1@S@*Sorry: ''mi dispiace'' {{IPA|/mi disˈpjatʃe/}}@@@@1@5@@danf@17-8-2009
10422020@unknown@formal@none@1@S@*Excuse me: scusa {{IPA|/ˈskuza/}} (informal); scusi {{IPA|/ˈskuzi/}} (formal)@@@@1@8@@danf@17-8-2009
10422030@unknown@formal@none@1@S@*Again: ''di nuovo'', {{IPA|/di ˈnwɔvo/}}; ''ancora'' {{IPA|/aŋˈkora/}}@@@@1@7@@danf@17-8-2009
10422040@unknown@formal@none@1@S@*Always: ''sempre'' {{IPA|/ˈsɛmpre/}}@@@@1@3@@danf@17-8-2009
10422050@unknown@formal@none@1@S@*When: ''quando'' {{IPA|/ˈkwando/}}@@@@1@3@@danf@17-8-2009
10422060@unknown@formal@none@1@S@*Where: ''dove'' {{IPA|/ˈdove/}}@@@@1@3@@danf@17-8-2009
10422070@unknown@formal@none@1@S@*Why/Because: ''perché'' {{IPA|/perˈke/}}@@@@1@3@@danf@17-8-2009
10422080@unknown@formal@none@1@S@*How: ''come'' {{IPA|/ˈkome/}}@@@@1@3@@danf@17-8-2009
10422090@unknown@formal@none@1@S@*How much is it?: ''quanto costa?''@@@@1@6@@danf@17-8-2009
10422100@unknown@formal@none@1@S@{{IPA|/ˈkwanto/}}@@@@1@1@@danf@17-8-2009
10422110@unknown@formal@none@1@S@*Thank you!: ''grazie!''@@@@1@3@@danf@17-8-2009
10422120@unknown@formal@none@1@S@{{IPA|/ˈgrattsie/}}@@@@1@1@@danf@17-8-2009
10422130@unknown@formal@none@1@S@*Bon appetit: ''buon appetito'' {{IPA|/ˌbwɔn appeˈtito/}}@@@@1@6@@danf@17-8-2009
10422140@unknown@formal@none@1@S@*You're welcome!: ''prego!''@@@@1@3@@danf@17-8-2009
10422150@unknown@formal@none@1@S@{{IPA|/ˈprɛgo/}}@@@@1@1@@danf@17-8-2009
10422160@unknown@formal@none@1@S@*I love you: ''Ti amo'' {{IPA|/ti ˈamo/}}, ''Ti voglio bene'' {{IPA|/ti ˈvɔʎʎo ˈbɛne/}}.@@@@1@13@@danf@17-8-2009
10422170@unknown@formal@none@1@S@The difference is that "Ti amo" is used in a romantic relationship, and "Ti voglio bene" on any other occasion (to parents, relatives, friends...)@@@@1@24@@danf@17-8-2009
10422180@unknown@formal@none@1@S@Counting to twenty:@@@@1@3@@danf@17-8-2009
10422190@unknown@formal@none@1@S@*One: ''uno'' {{IPA|/ˈuno/}}@@@@1@3@@danf@17-8-2009
10422200@unknown@formal@none@1@S@*Two: ''due'' {{IPA|/ˈdue/}}@@@@1@3@@danf@17-8-2009
10422210@unknown@formal@none@1@S@*Three: ''tre'' {{IPA|/tre/}}@@@@1@3@@danf@17-8-2009
10422220@unknown@formal@none@1@S@*Four: ''quattro'' {{IPA|/ˈkwattro/}}@@@@1@3@@danf@17-8-2009
10422230@unknown@formal@none@1@S@*Five: ''cinque'' {{IPA|/ˈʧiŋkwe/}}@@@@1@3@@danf@17-8-2009
10422240@unknown@formal@none@1@S@*Six: ''sei'' {{IPA|/ˈsɛi/}}@@@@1@3@@danf@17-8-2009
10422250@unknown@formal@none@1@S@*Seven: ''sette'' {{IPA|/ˈsɛtte/}}@@@@1@3@@danf@17-8-2009
10422260@unknown@formal@none@1@S@*Eight: ''otto'' {{IPA|/ˈɔtto/}}@@@@1@3@@danf@17-8-2009
10422270@unknown@formal@none@1@S@*Nine: ''nove'' {{IPA|/ˈnɔve/}}@@@@1@3@@danf@17-8-2009
10422280@unknown@formal@none@1@S@*Ten: ''dieci'' {{IPA|/ˈdjɛʧi/}}@@@@1@3@@danf@17-8-2009
10422290@unknown@formal@none@1@S@*Eleven: ''undici'' {{IPA|/ˈundiʧi/}}@@@@1@3@@danf@17-8-2009
10422300@unknown@formal@none@1@S@*Twelve: ''dodici'' {{IPA|/ˈdodiʧi/}}@@@@1@3@@danf@17-8-2009
10422310@unknown@formal@none@1@S@*Thirteen: ''tredici'' {{IPA|/ˈtrediʧi/}}@@@@1@3@@danf@17-8-2009
10422320@unknown@formal@none@1@S@*Fourteen: ''quattordici'' {{IPA|/kwatˈtordiʧi/}}@@@@1@3@@danf@17-8-2009
10422330@unknown@formal@none@1@S@*Fifteen: ''quindici'' {{IPA|/ˈkwindiʧi/}}@@@@1@3@@danf@17-8-2009
10422340@unknown@formal@none@1@S@*Sixteen: ''sedici'' {{IPA|/ˈsediʧi/}}@@@@1@3@@danf@17-8-2009
10422350@unknown@formal@none@1@S@*Seventeen: ''diciassette'' {{IPA|/diʧasˈsɛtte/}}@@@@1@3@@danf@17-8-2009
10422360@unknown@formal@none@1@S@*Eighteen: ''diciotto'' {{IPA|/diˈʧɔtto/}}@@@@1@3@@danf@17-8-2009
10422370@unknown@formal@none@1@S@*Nineteen: ''diciannove'' {{IPA|/diʧanˈnɔve/}}@@@@1@3@@danf@17-8-2009
10422380@unknown@formal@none@1@S@*Twenty: ''venti'' {{IPA|/ˈventi/}}@@@@1@3@@danf@17-8-2009
10422390@unknown@formal@none@1@S@The days of the week:@@@@1@5@@danf@17-8-2009
10422400@unknown@formal@none@1@S@*Monday: ''lunedì'' {{IPA|/luneˈdi/}}@@@@1@3@@danf@17-8-2009
10422410@unknown@formal@none@1@S@*Tuesday: ''martedì'' {{IPA|/marteˈdi/}}@@@@1@3@@danf@17-8-2009
10422420@unknown@formal@none@1@S@*Wednesday: ''mercoledì'' {{IPA|/merkoleˈdi/}}@@@@1@3@@danf@17-8-2009
10422430@unknown@formal@none@1@S@*Thursday: ''giovedì'' {{IPA|/dʒoveˈdi/}}@@@@1@3@@danf@17-8-2009
10422440@unknown@formal@none@1@S@*Friday: ''venerdì'' {{IPA|/venerˈdi/}}@@@@1@3@@danf@17-8-2009
10422450@unknown@formal@none@1@S@*Saturday: ''sabato'' {{IPA|/ˈsabato/}}@@@@1@3@@danf@17-8-2009
10422460@unknown@formal@none@1@S@*Sunday: ''domenica'' {{IPA|/doˈmenika/}}@@@@1@3@@danf@17-8-2009
10422470@unknown@formal@none@1@S@==Sample texts==@@@@1@2@@danf@17-8-2009
10422480@unknown@formal@none@1@S@There is a recording of [[Dante]]'s [[Divine Comedy]] read by [[Lino Pertile]] available at http://etcweb.princeton.edu/dante/pdp/@@@@1@15@@danf@17-8-2009
10430010@unknown@formal@none@1@S@Japanese language@@@@1@2@@danf@17-8-2009
10430020@unknown@formal@none@1@S@{{Nihongo|'''Japanese'''|日本語 / にほんご |3=}} is a language spoken by over 130 million people in [[Japan]] and in Japanese emigrant communities.@@@@1@20@@danf@17-8-2009
10430030@unknown@formal@none@1@S@It is related to the [[Ryukyuan languages]], but whatever [[Classification of the Japanese language|relationships with other languages]] it may have remain undemonstrated.@@@@1@22@@danf@17-8-2009
10430040@unknown@formal@none@1@S@It is an [[agglutinative language]] and is distinguished by a complex system of [[Honorific speech in Japanese|honorifics]] reflecting the hierarchical nature of Japanese society, with verb forms and particular vocabulary to indicate the relative status of speaker, listener and the third person mentioned in conversation whether he is there or not.@@@@1@51@@danf@17-8-2009
10430050@unknown@formal@none@1@S@The sound inventory of Japanese is relatively small, and it has a lexically distinct [[Japanese pitch accent|pitch-accent]] system.@@@@1@18@@danf@17-8-2009
10430060@unknown@formal@none@1@S@It is a [[mora (linguistics)|mora]]-timed language.@@@@1@6@@danf@17-8-2009
10430070@unknown@formal@none@1@S@The Japanese language is written with a combination of three different types of scripts: [[Chinese characters]] called ''[[kanji]]'' (漢字 / かんじ), and two [[syllabary|syllabic]] scripts made up of modified [[Chinese characters]], ''[[hiragana]]'' (平仮名 / ひらがな) and ''[[katakana]]'' (片仮名 / カタカナ).@@@@1@40@@danf@17-8-2009
10430080@unknown@formal@none@1@S@The [[Latin alphabet]], ''[[rōmaji]]'' (ローマ字), is also often used in modern Japanese, especially for company names and logos, advertising, and when entering Japanese text into a computer.@@@@1@27@@danf@17-8-2009
10430090@unknown@formal@none@1@S@Western style [[Arabic numerals]] are generally used for numbers, but traditional [[Sino-Japanese vocabulary|Sino-Japanese]] numerals are also commonplace.@@@@1@17@@danf@17-8-2009
10430100@unknown@formal@none@1@S@Japanese [[vocabulary]] has been heavily influenced by [[loanword]]s from other languages.@@@@1@11@@danf@17-8-2009
10430110@unknown@formal@none@1@S@A vast number of words were borrowed from [[Chinese language|Chinese]], or created from Chinese models, over a period of at least 1,500 years.@@@@1@23@@danf@17-8-2009
10430120@unknown@formal@none@1@S@Since the late 19th century, Japanese has borrowed a considerable number of words from [[Indo-European languages]], primarily [[English language|English]].@@@@1@19@@danf@17-8-2009
10430130@unknown@formal@none@1@S@Because of the special trade relationship between Japan and first [[Portugal]] in the 16th century, and then mainly the [[Netherlands]] in the 17th century, [[Portuguese language|Portuguese]], [[German language|German]] and [[Dutch language|Dutch]] have also been influential.@@@@1@35@@danf@17-8-2009
10430140@unknown@formal@none@1@S@== Geographic distribution ==@@@@1@4@@danf@17-8-2009
10430150@unknown@formal@none@1@S@Although Japanese is spoken almost exclusively in Japan, it has been and sometimes still is spoken elsewhere.@@@@1@17@@danf@17-8-2009
10430160@unknown@formal@none@1@S@When [[Imperial Japan|Japan]] occupied [[Korea]], [[Taiwan]], parts of the [[Chinese mainland]], and various Pacific islands before and during [[World War II]], locals in [[Greater East Asia Co-Prosperity Sphere|those countries]] were forced to learn Japanese in empire-building programs.@@@@1@37@@danf@17-8-2009
10430170@unknown@formal@none@1@S@As a result, there are many people in these countries who can speak Japanese in addition to the local languages.@@@@1@20@@danf@17-8-2009
10430180@unknown@formal@none@1@S@Japanese emigrant communities (the largest of which are to be found in [[Brazil]]) sometimes employ Japanese as their primary language.@@@@1@20@@danf@17-8-2009
10430190@unknown@formal@none@1@S@Approximately 5% of Hawaii residents speak Japanese, with Japanese ancestry the largest single ancestry in the state (over 24% of the population).@@@@1@22@@danf@17-8-2009
10430200@unknown@formal@none@1@S@Japanese emigrants can also be found in [[Peru]], [[Argentina]], [[Australia]] (especially [[Sydney]], [[Brisbane]], and [[Melbourne]]), the [[United States]] (notably [[California]], where 1.2% of the population has Japanese ancestry, and [[Hawaii]]), and the [[Philippines]] (particularly in [[Davao]] and [[Laguna (province)|Laguna]]).@@@@1@39@@danf@17-8-2009
10430210@unknown@formal@none@1@S@Their descendants, who are known as {{transl|ja|''[[nikkei]]''}} ({{lang|ja|日系}}, literally Japanese descendants), however, rarely speak Japanese fluently after the second generation.@@@@1@20@@danf@17-8-2009
10430220@unknown@formal@none@1@S@There are estimated to be several million non-Japanese studying the language as well.@@@@1@13@@danf@17-8-2009
10430230@unknown@formal@none@1@S@=== Official status ===@@@@1@4@@danf@17-8-2009
10430240@unknown@formal@none@1@S@Japanese is the de facto official language of Japan.@@@@1@9@@danf@17-8-2009
10430250@unknown@formal@none@1@S@There is a form of the language considered standard: {{nihongo|''hyōjungo''|標準語|}} Standard Japanese, or {{nihongo|''kyōtsūgo''|共通語|}} the common language.@@@@1@17@@danf@17-8-2009
10430260@unknown@formal@none@1@S@The meanings of the two terms are almost the same.@@@@1@10@@danf@17-8-2009
10430270@unknown@formal@none@1@S@{{transl|ja|''Hyōjungo''}} or {{transl|ja|''kyōtsūgo''}} is a concept that forms the counterpart of dialect.@@@@1@12@@danf@17-8-2009
10430280@unknown@formal@none@1@S@This normative language emerged after the {{nihongo|[[Meiji Restoration]]|明治維新|meiji ishin|1868}} from the language spoken in uptown [[Tokyo]], out of the need for a common means of communication.@@@@1@26@@danf@17-8-2009
10430290@unknown@formal@none@1@S@{{transl|ja|''Hyōjungo''}} is taught in schools and used on television and in official communications, and is the version of Japanese discussed in this article.@@@@1@23@@danf@17-8-2009
10430300@unknown@formal@none@1@S@Formerly, standard {{nihongo|Japanese in writing|文語|[[Bungo (Japanese language)|bungo]]|"literary language"}} was different from {{nihongo|colloquial language|口語|[[Kogo (Japanese language)|kōgo]]}}.@@@@1@15@@danf@17-8-2009
10430310@unknown@formal@none@1@S@The two systems have different rules of grammar and some variance in vocabulary.@@@@1@13@@danf@17-8-2009
10430320@unknown@formal@none@1@S@{{transl|ja|''Bungo''}} was the main method of writing Japanese until about 1900; since then {{transl|ja|''kōgo''}} gradually extended its influence and the two methods were both used in writing until the 1940s.@@@@1@30@@danf@17-8-2009
10430330@unknown@formal@none@1@S@{{transl|ja|''Bungo''}} still has some relevance for historians, literary scholars, and lawyers (many Japanese laws that survived [[World War II]] are still written in {{transl|ja|''bungo''}}, although there are ongoing efforts to modernize their language).@@@@1@33@@danf@17-8-2009
10430340@unknown@formal@none@1@S@{{transl|ja|''Kōgo''}} is the predominant method of both speaking and writing Japanese today, although {{transl|ja|''bungo''}} grammar and vocabulary are occasionally used in modern Japanese for effect.@@@@1@25@@danf@17-8-2009
10430350@unknown@formal@none@1@S@=== Dialects ===@@@@1@3@@danf@17-8-2009
10430360@unknown@formal@none@1@S@Dozens of dialects are spoken in Japan.@@@@1@7@@danf@17-8-2009
10430370@unknown@formal@none@1@S@The profusion is due to many factors, including the length of time the [[Japanese Archipelago|archipelago]] has been inhabited, its mountainous island terrain, and Japan's long history of both external and internal isolation.@@@@1@32@@danf@17-8-2009
10430380@unknown@formal@none@1@S@Dialects typically differ in terms of [[Japanese pitch accent|pitch accent]], inflectional [[morphology (linguistics)|morphology]], [[vocabulary]], and particle usage.@@@@1@17@@danf@17-8-2009
10430390@unknown@formal@none@1@S@Some even differ in [[vowel]] and [[consonant]] inventories, although this is uncommon.@@@@1@12@@danf@17-8-2009
10430400@unknown@formal@none@1@S@The main distinction in Japanese accents is between {{nihongo|Tokyo-type|東京式|Tōkyō-shiki}} and {{nihongo|Kyoto-Osaka-type|京阪式|Keihan-shiki}}, though Kyūshū-type dialects form a third, smaller group.@@@@1@19@@danf@17-8-2009
10430410@unknown@formal@none@1@S@Within each type are several subdivisions.@@@@1@6@@danf@17-8-2009
10430420@unknown@formal@none@1@S@Kyoto-Osaka-type dialects are in the central region, with borders roughly formed by [[Toyama Prefecture|Toyama]], [[Kyoto Prefecture|Kyōto]], [[Hyōgo Prefecture|Hyōgo]], and [[Mie Prefecture|Mie]] Prefectures; most [[Shikoku]] dialects are also that type.@@@@1@29@@danf@17-8-2009
10430430@unknown@formal@none@1@S@The final category of dialects comprises those descended from the Eastern dialect of [[Old Japanese]]; these dialects are spoken on [[Hachijōjima|Hachijō-jima island]] and a few other islands.@@@@1@27@@danf@17-8-2009
10430440@unknown@formal@none@1@S@Dialects from peripheral regions, such as [[Tōhoku Region|Tōhoku]] or [[Tsushima Island|Tsushima]], may be unintelligible to speakers from other parts of the country.@@@@1@22@@danf@17-8-2009
10430450@unknown@formal@none@1@S@The several dialects of [[Kagoshima Prefecture|Kagoshima]] in southern [[Kyūshū]] are famous for being unintelligible not only to speakers of standard Japanese but to speakers of nearby dialects elsewhere in Kyūshū as well.@@@@1@32@@danf@17-8-2009
10430460@unknown@formal@none@1@S@This is probably due in part to the Kagoshima dialects' peculiarities of pronunciation, which include the existence of closed syllables (i.e., syllables that end in a consonant, such as {{IPA|/kob/}} or {{IPA|/koʔ/}} for Standard Japanese {{IPA|/kumo/}} "spider").@@@@1@37@@danf@17-8-2009
10430470@unknown@formal@none@1@S@The dialect group of [[Kansai region|Kansai]] is spoken and known by many Japanese, and the [[Osaka]] dialect in particular is associated with comedy (see [[Kansai dialect]]).@@@@1@26@@danf@17-8-2009
10430480@unknown@formal@none@1@S@Dialects of Tōhoku and North [[Kantō region|Kantō]] are stereotypically associated with farmers.@@@@1@12@@danf@17-8-2009
10430490@unknown@formal@none@1@S@The [[Ryūkyūan languages]], spoken in [[Okinawa Prefecture|Okinawa]] and the [[Amami Islands]] (which are politically part of [[Kagoshima Prefecture|Kagoshima]]), are distinct enough to be considered a separate branch of the [[Japonic languages|Japonic]] family.@@@@1@32@@danf@17-8-2009
10430500@unknown@formal@none@1@S@However, many ordinary Japanese people tend to regard the Ryūkyūan languages as dialects of Japanese.@@@@1@15@@danf@17-8-2009
10430510@unknown@formal@none@1@S@Not only is each language unintelligible to Japanese speakers, but most are unintelligible to those who speak other Ryūkyūan languages.@@@@1@20@@danf@17-8-2009
10430520@unknown@formal@none@1@S@Recently, Standard Japanese has become prevalent nationwide (including the Ryūkyū islands) due to [[education]], [[mass media]], and the growth of mobility networks within Japan, as well as economic integration.@@@@1@29@@danf@17-8-2009
10430530@unknown@formal@none@1@S@== Sounds ==@@@@1@3@@danf@17-8-2009
10430540@unknown@formal@none@1@S@{{IPA notice}}@@@@1@2@@danf@17-8-2009
10430550@unknown@formal@none@1@S@Japanese vowels are "pure" sounds.@@@@1@5@@danf@17-8-2009
10430560@unknown@formal@none@1@S@The only unusual vowel is the high back vowel {{IPA|/ɯ/}}, which is like {{IPA|/u/}}, but [[roundedness|compressed]] instead of rounded.@@@@1@19@@danf@17-8-2009
10430570@unknown@formal@none@1@S@Japanese has five vowels, and [[vowel length]] is phonemic, so each one has both a short and a long version.@@@@1@20@@danf@17-8-2009
10430580@unknown@formal@none@1@S@Some Japanese consonants have several [[allophone]]s, which may give the impression of a larger inventory of sounds.@@@@1@17@@danf@17-8-2009
10430590@unknown@formal@none@1@S@However, some of these allophones have since become phonemic.@@@@1@9@@danf@17-8-2009
10430600@unknown@formal@none@1@S@For example, in the Japanese language up to and including the first half of the twentieth century, the phonemic sequence {{IPA|/ti/}} was [[palatalization|palatalized]] and realized phonetically as {{IPA|[tɕi]}}, approximately ''chi''; however, now {{IPA|/ti/}} and {{IPA|/tɕi/}} are distinct, as evidenced by words like ''tī'' {{IPA|[tiː]}} "Western-style tea" and ''chii'' {{IPA|[tɕii]}} "social status."@@@@1@51@@danf@17-8-2009
10430610@unknown@formal@none@1@S@The 'r' of the Japanese language (technically a [[lateral consonant|lateral]] [[apical consonant|apical]] postalveolar flap) is of particular interest, sounding to most English speakers like something between an 'l' and a [[retroflex consonant|retroflex]] 'r', depending on its position in a word.@@@@1@40@@danf@17-8-2009
10430620@unknown@formal@none@1@S@The syllabic structure and the [[phonotactics]] are very simple: the only [[consonant cluster]]s allowed within a syllable consist of one of a subset of the consonants plus {{IPA|/j/}}.@@@@1@28@@danf@17-8-2009
10430630@unknown@formal@none@1@S@This type of cluster occurs only in onsets.@@@@1@8@@danf@17-8-2009
10430640@unknown@formal@none@1@S@However, consonant clusters across syllables are allowed as long as the two consonants are a nasal followed by a [[homo-organic]] consonant.@@@@1@21@@danf@17-8-2009
10430650@unknown@formal@none@1@S@[[Consonant length]] (gemination) is also phonemic.@@@@1@6@@danf@17-8-2009
10430660@unknown@formal@none@1@S@== Grammar ==@@@@1@3@@danf@17-8-2009
10430670@unknown@formal@none@1@S@=== Sentence structure ===@@@@1@4@@danf@17-8-2009
10430680@unknown@formal@none@1@S@Japanese word order is classified as [[Subject Object Verb]].@@@@1@9@@danf@17-8-2009
10430690@unknown@formal@none@1@S@However, unlike many [[Indo-European language]]s, Japanese requires only that the verb come last for a sentence to be intelligible.@@@@1@19@@danf@17-8-2009
10430700@unknown@formal@none@1@S@This is because the Japanese [[sentence element]]s are marked with [[Japanese particles|particles]] that identify their grammatical functions.@@@@1@17@@danf@17-8-2009
10430710@unknown@formal@none@1@S@The basic sentence structure is [[topic-comment]].@@@@1@6@@danf@17-8-2009
10430720@unknown@formal@none@1@S@For example, {{transl|ja|''Kochira-wa Tanaka-san desu''}} ({{lang|ja|こちらは田中さんです}}).@@@@1@6@@danf@17-8-2009
10430730@unknown@formal@none@1@S@{{transl|ja|''Kochira''}} ("this") is the topic of the sentence, indicated by the particle ''-wa''.@@@@1@13@@danf@17-8-2009
10430740@unknown@formal@none@1@S@The verb is {{transl|ja|''desu''}}, a [[copula]], commonly translated as "to be" or "it is" (though there are other verbs that can be translated as "to be").@@@@1@26@@danf@17-8-2009
10430750@unknown@formal@none@1@S@As a phrase, {{transl|ja|''Tanaka-san desu''}} is the comment.@@@@1@8@@danf@17-8-2009
10430760@unknown@formal@none@1@S@This sentence loosely translates to "As for this person, (it) is Mr./Mrs./Miss Tanaka."@@@@1@13@@danf@17-8-2009
10430770@unknown@formal@none@1@S@Thus Japanese, like [[Chinese language|Chinese]], [[Korean language|Korean]], and many other Asian languages, is often called a [[topic-prominent language]], which means it has a strong tendency to indicate the topic separately from the subject, and the two do not always coincide.@@@@1@40@@danf@17-8-2009
10430780@unknown@formal@none@1@S@The sentence {{transl|ja|''Zō-wa hana-ga nagai (desu)''}} ({{lang|ja|象は鼻が長いです}}) literally means, "As for elephants, (their) noses are long".@@@@1@15@@danf@17-8-2009
10430790@unknown@formal@none@1@S@The topic is {{transl|ja|''zō''}} "elephant", and the subject is {{transl|ja|''hana''}} "nose".@@@@1@11@@danf@17-8-2009
10430800@unknown@formal@none@1@S@Japanese is a [[pro-drop language]], meaning that the subject or object of a sentence need not be stated if it is obvious from context.@@@@1@24@@danf@17-8-2009
10430810@unknown@formal@none@1@S@In addition, it is commonly felt, particularly in spoken Japanese, that the shorter a sentence is, the better.@@@@1@18@@danf@17-8-2009
10430820@unknown@formal@none@1@S@As a result of this grammatical permissiveness and tendency towards brevity, Japanese speakers tend naturally to omit words from sentences, rather than refer to them with [[pronoun]]s.@@@@1@27@@danf@17-8-2009
10430830@unknown@formal@none@1@S@In the context of the above example, {{transl|ja|''hana-ga nagai''}} would mean "[their] noses are long," while {{transl|ja|''nagai''}} by itself would mean "[they] are long."@@@@1@24@@danf@17-8-2009
10430840@unknown@formal@none@1@S@A single verb can be a complete sentence: {{transl|ja|''Yatta!''}}@@@@1@9@@danf@17-8-2009
10430850@unknown@formal@none@1@S@"[I / we / they / etc] did [it]!".@@@@1@9@@danf@17-8-2009
10430860@unknown@formal@none@1@S@In addition, since adjectives can form the predicate in a Japanese sentence (below), a single adjective can be a complete sentence: {{transl|ja|''Urayamashii!''}}@@@@1@22@@danf@17-8-2009
10430870@unknown@formal@none@1@S@"[I'm] jealous [of it]!".@@@@1@4@@danf@17-8-2009
10430880@unknown@formal@none@1@S@While the language has some words that are typically translated as pronouns, these are not used as frequently as pronouns in some [[Indo-European language]]s, and function differently.@@@@1@27@@danf@17-8-2009
10430890@unknown@formal@none@1@S@Instead, Japanese typically relies on special verb forms and auxiliary verbs to indicate the direction of benefit of an action: "down" to indicate the out-group gives a benefit to the in-group; and "up" to indicate the in-group gives a benefit to the out-group.@@@@1@43@@danf@17-8-2009
10430900@unknown@formal@none@1@S@Here, the in-group includes the speaker and the out-group does not, and their boundary depends on context.@@@@1@17@@danf@17-8-2009
10430910@unknown@formal@none@1@S@For example, {{transl|ja|''oshiete moratta''}} (literally, "explained" with a benefit from the out-group to the in-group) means "[he/she/they] explained it to [me/us]".@@@@1@21@@danf@17-8-2009
10430920@unknown@formal@none@1@S@Similarly, {{transl|ja|''oshiete ageta''}} (literally, "explained" with a benefit from the in-group to the out-group) means "[I/we] explained [it] to [him/her/them]".@@@@1@20@@danf@17-8-2009
10430930@unknown@formal@none@1@S@Such beneficiary auxiliary verbs thus serve a function comparable to that of pronouns and prepositions in Indo-European languages to indicate the actor and the recipient of an action.@@@@1@28@@danf@17-8-2009
10430940@unknown@formal@none@1@S@Japanese "pronouns" also function differently from most modern Indo-European pronouns (and more like nouns) in that they can take modifiers as any other noun may.@@@@1@25@@danf@17-8-2009
10430950@unknown@formal@none@1@S@For instance, one cannot say in English:@@@@1@7@@danf@17-8-2009
10430960@unknown@formal@none@1@S@:@@@@1@1@@danf@17-8-2009
10430970@unknown@formal@none@1@S@*The amazed he ran down the street. (grammatically incorrect)@@@@1@9@@danf@17-8-2009
10430980@unknown@formal@none@1@S@But one ''can'' grammatically say essentially the same thing in Japanese:@@@@1@11@@danf@17-8-2009
10430990@unknown@formal@none@1@S@: {{transl|ja|''Odoroita kare-wa michi-o hashitte itta.''}} (grammatically correct)@@@@1@8@@danf@17-8-2009
10431000@unknown@formal@none@1@S@This is partly due to the fact that these words evolved from regular nouns, such as {{transl|ja|''kimi''}} "you" ({{lang|ja|君}} "lord"), {{transl|ja|''anata''}} "you" ({{lang|ja|あなた}} "that side, yonder"), and {{transl|ja|''boku''}} "I" ({{lang|ja|僕}} "servant").@@@@1@31@@danf@17-8-2009
10431010@unknown@formal@none@1@S@This is why some linguists do not classify Japanese "pronouns" as pronouns, but rather as referential nouns.@@@@1@17@@danf@17-8-2009
10431020@unknown@formal@none@1@S@Japanese personal pronouns are generally used only in situations requiring special emphasis as to who is doing what to whom.@@@@1@20@@danf@17-8-2009
10431030@unknown@formal@none@1@S@The choice of words used as pronouns is correlated with the sex of the speaker and the social situation in which they are spoken: men and women alike in a formal situation generally refer to themselves as {{transl|ja|''watashi''}} ({{lang|ja|私}} "private") or {{transl|ja|''watakushi''}} (also {{lang|ja|私}}), while men in rougher or intimate conversation are much more likely to use the word {{transl|ja|''ore''}} ({{lang|ja|俺}} "oneself", "myself") or {{transl|ja|''boku''}}.@@@@1@65@@danf@17-8-2009
10431040@unknown@formal@none@1@S@Similarly, different words such as {{transl|ja|''anata''}}, {{transl|ja|''kimi''}}, and {{transl|ja|''omae''}} ({{lang|ja|お前}}, more formally {{lang|ja|御前}} "the one before me") may be used to refer to a listener depending on the listener's relative social position and the degree of familiarity between the speaker and the listener.@@@@1@43@@danf@17-8-2009
10431050@unknown@formal@none@1@S@When used in different social relationships, the same word may have positive (intimate or respectful) or negative (distant or disrespectful) connotations.@@@@1@21@@danf@17-8-2009
10431060@unknown@formal@none@1@S@Japanese speakers often use the title of the person referred to where a pronoun would be used in English.@@@@1@19@@danf@17-8-2009
10431070@unknown@formal@none@1@S@For example, when speaking to one's teacher, it is appropriate to use {{transl|ja|''sensei''}} ({{lang|ja|先生}}, teacher), but inappropriate to use {{transl|ja|''anata''}}.@@@@1@20@@danf@17-8-2009
10431080@unknown@formal@none@1@S@This is because {{transl|ja|''anata''}} is used to refer to people of equal or lower status, and one's teacher is considered to have higher status.@@@@1@24@@danf@17-8-2009
10431090@unknown@formal@none@1@S@For English-speaking learners of Japanese, a frequent beginner's mistake is to include {{transl|ja|''watashi-wa''}} or {{transl|ja|''anata-wa''}} at the beginning of sentences as one would with ''I'' or ''you'' in English.@@@@1@29@@danf@17-8-2009
10431100@unknown@formal@none@1@S@Though these sentences are not grammatically incorrect, even in formal settings they would be considered unnatural, comparable in English to repeatedly using a noun where a [[pronoun]] would suffice.@@@@1@29@@danf@17-8-2009
10431110@unknown@formal@none@1@S@=== Inflection and conjugation ===@@@@1@5@@danf@17-8-2009
10431120@unknown@formal@none@1@S@Japanese nouns have no grammatical number or gender, and take no articles.@@@@1@12@@danf@17-8-2009
10431130@unknown@formal@none@1@S@The noun {{transl|ja|''hon''}} ({{lang|ja|本}}) may refer to a single book or several books; {{transl|ja|''hito''}} ({{lang|ja|人}}) can mean "person" or "people"; and {{transl|ja|''ki''}} ({{lang|ja|木}}) can be "tree" or "trees".@@@@1@28@@danf@17-8-2009
10431140@unknown@formal@none@1@S@Where number is important, it can be indicated by providing a quantity (often with a [[Japanese counter word|counter word]]) or (rarely) by adding a suffix.@@@@1@25@@danf@17-8-2009
10431150@unknown@formal@none@1@S@Words for people are usually understood as singular.@@@@1@8@@danf@17-8-2009
10431160@unknown@formal@none@1@S@Thus {{transl|ja|''Tanaka-san''}} usually means ''Mr./Mrs./Miss Tanaka''.@@@@1@6@@danf@17-8-2009
10431170@unknown@formal@none@1@S@Words that refer to people and animals can be made to indicate a group of individuals through the addition of a collective suffix (a noun suffix that indicates a group), such as {{transl|ja|''-tachi''}}, but this is not a true plural: the meaning is closer to the English phrase "and company".@@@@1@50@@danf@17-8-2009
10431180@unknown@formal@none@1@S@A group described as {{transl|ja|''Tanaka-san-tachi''}} may include people not named Tanaka.@@@@1@11@@danf@17-8-2009
10431190@unknown@formal@none@1@S@Some Japanese nouns are effectively plural, such as {{transl|ja|''hitobito''}} "people" and {{transl|ja|''wareware''}} "we/us", while the word {{transl|ja|''tomodachi''}} "friend" is considered singular, although plural in form.@@@@1@25@@danf@17-8-2009
10431200@unknown@formal@none@1@S@Verbs are [[Japanese verb conjugations|conjugated]] to show tenses, of which there are two: past and present, or non-past, which is used for the present and the future.@@@@1@27@@danf@17-8-2009
10431210@unknown@formal@none@1@S@For verbs that represent an ongoing process, the {{transl|ja|''-te iru''}} form indicates a continuous (or progressive) aspect.@@@@1@17@@danf@17-8-2009
10431220@unknown@formal@none@1@S@For others that represent a change of state, the {{transl|ja|''-te iru''}} form indicates a perfect aspect.@@@@1@16@@danf@17-8-2009
10431230@unknown@formal@none@1@S@For example, {{transl|ja|''kite iru''}} means "He has come (and is still here)", but {{transl|ja|''tabete iru''}} means "He is eating".@@@@1@19@@danf@17-8-2009
10431240@unknown@formal@none@1@S@Questions (both with an interrogative pronoun and yes/no questions) have the same structure as affirmative sentences, but with intonation rising at the end.@@@@1@23@@danf@17-8-2009
10431250@unknown@formal@none@1@S@In the formal register, the question particle {{transl|ja|''-ka''}} is added.@@@@1@10@@danf@17-8-2009
10431260@unknown@formal@none@1@S@For example, {{transl|ja|''Ii desu''}} ({{lang|ja|いいです。}}) "It is OK" becomes {{transl|ja|''Ii desu-ka''}} ({{lang|ja|いいですか?}}) "Is it OK?".@@@@1@15@@danf@17-8-2009
10431270@unknown@formal@none@1@S@In a more informal tone, the particle {{transl|ja|''-no''}} ({{lang|ja|の}}) is sometimes added instead to show the speaker's personal interest: {{transl|ja|''Dōshite konai-no?''}}@@@@1@21@@danf@17-8-2009
10431280@unknown@formal@none@1@S@"Why aren't (you) coming?".@@@@1@4@@danf@17-8-2009
10431290@unknown@formal@none@1@S@Some simple queries are formed simply by mentioning the topic with an interrogative intonation to call for the hearer's attention: {{transl|ja|''Kore-wa?''}}@@@@1@21@@danf@17-8-2009
10431300@unknown@formal@none@1@S@"(What about) this?"; {{transl|ja|''Namae-wa?''}} ({{lang|ja|名前は?}}) "(What's your) name?".@@@@1@8@@danf@17-8-2009
10431310@unknown@formal@none@1@S@Negatives are formed by inflecting the verb.@@@@1@7@@danf@17-8-2009
10431320@unknown@formal@none@1@S@For example, {{transl|ja|''Pan-o taberu''}} ({{lang|ja|パンを食べる。}}) "I will eat bread" or "I eat bread" becomes {{transl|ja|''Pan-o tabenai''}} ({{lang|ja|パンを食べない。}}) "I will not eat bread" or "I do not eat bread".@@@@1@28@@danf@17-8-2009
10431330@unknown@formal@none@1@S@The so-called {{transl|ja|''-te''}} verb form is used for a variety of purposes: either progressive or perfect aspect (see above); combining verbs in a temporal sequence ({{transl|ja|''Asagohan-o tabete sugu dekakeru''}} "I'll eat breakfast and leave at once"), simple commands, conditional statements and permissions ({{transl|ja|''Dekakete-mo ii?''}} "May I go out?"), etc.@@@@1@49@@danf@17-8-2009
10431340@unknown@formal@none@1@S@The word {{transl|ja|''da''}} (plain), {{transl|ja|''desu''}} (polite) is the [[copula]] verb.@@@@1@10@@danf@17-8-2009
10431350@unknown@formal@none@1@S@It corresponds approximately to the English ''be'', but often takes on other roles, including a marker for tense, when the verb is conjugated into its past form {{transl|ja|''datta''}} (plain), {{transl|ja|''deshita''}} (polite).@@@@1@31@@danf@17-8-2009
10431360@unknown@formal@none@1@S@This is necessary because only {{transl|ja|''keiyōshi''}} adjectives and verbs can carry tense in Japanese.@@@@1@14@@danf@17-8-2009
10431370@unknown@formal@none@1@S@Two additional common verbs are used to indicate existence ("there is") or, in some contexts, possession: {{transl|ja|''aru''}} (negative {{transl|ja|''nai''}}) and {{transl|ja|''iru''}} (negative {{transl|ja|''inai''}}), for inanimate and animate things, respectively.@@@@1@29@@danf@17-8-2009
10431380@unknown@formal@none@1@S@For example, {{transl|ja|''Neko ga iru''}} "There's a cat", {{transl|ja|''Ii kangae-ga nai''}} "[I] haven't got a good idea".@@@@1@17@@danf@17-8-2009
10431390@unknown@formal@none@1@S@Note that the negative forms of the verbs {{transl|ja|''iru''}} and {{transl|ja|''aru''}} are actually ''i''-adjectives and inflect as such, e.g. {{transl|ja|''Neko ga inakatta''}} "There was no cat".@@@@1@26@@danf@17-8-2009
10431400@unknown@formal@none@1@S@The verb "to do" ({{transl|ja|''suru''}}, polite form {{transl|ja|''shimasu''}}) is often used to make verbs from nouns ({{transl|ja|''ryōri suru''}} "to cook", {{transl|ja|''benkyō suru''}} "to study", etc.) and has been productive in creating modern slang words.@@@@1@34@@danf@17-8-2009
10431410@unknown@formal@none@1@S@Japanese also has a huge number of compound verbs to express concepts that are described in English using a verb and a preposition (e.g. {{transl|ja|''tobidasu''}} "to fly out, to flee," from {{transl|ja|''tobu''}} "to fly, to jump" + {{transl|ja|''dasu''}} "to put out, to emit").@@@@1@43@@danf@17-8-2009
10431420@unknown@formal@none@1@S@There are three types of [[Japanese adjectives|adjective]] (see also [[Japanese adjectives]]):@@@@1@11@@danf@17-8-2009
10431430@unknown@formal@none@1@S@# {{lang|ja|形容詞}} {{transl|ja|''keiyōshi''}}, or {{transl|ja|''i''}} adjectives, which have a [[Japanese verb conjugations|conjugating]] ending {{transl|ja|''i''}} ({{lang|ja|い}}) (such as {{lang|ja|あつい}} {{transl|ja|''atsui''}} "to be hot") which can become past ({{lang|ja|あつかった}} {{transl|ja|''atsukatta''}} "it was hot"), or negative ({{lang|ja|あつくない}} {{transl|ja|''atsuku nai''}} "it is not hot").@@@@1@40@@danf@17-8-2009
10431440@unknown@formal@none@1@S@Note that {{transl|ja|''nai''}} is also an {{transl|ja|''i''}} adjective, which can become past ({{lang|ja|あつくなかった}} {{transl|ja|''atsuku nakatta''}} "it was not hot").@@@@1@19@@danf@17-8-2009
10431450@unknown@formal@none@1@S@#: {{lang|ja|暑い日}} {{transl|ja|''atsui hi''}} "a hot day".@@@@1@7@@danf@17-8-2009
10431460@unknown@formal@none@1@S@# {{lang|ja|形容動詞}} {{transl|ja|''keiyōdōshi''}}, or {{transl|ja|''na''}} adjectives, which are followed by a form of the [[copula]], usually {{transl|ja|''na''}}.@@@@1@17@@danf@17-8-2009
10431470@unknown@formal@none@1@S@For example {{transl|ja|''hen''}} (strange)@@@@1@4@@danf@17-8-2009
10431480@unknown@formal@none@1@S@#: {{lang|ja|変なひと}} {{transl|ja|''hen na hito''}} "a strange person".@@@@1@8@@danf@17-8-2009
10431490@unknown@formal@none@1@S@# {{lang|ja|連体詞}} {{transl|ja|''rentaishi''}}, also called true adjectives, such as {{transl|ja|''ano''}} "that"@@@@1@11@@danf@17-8-2009
10431500@unknown@formal@none@1@S@#: {{lang|ja|あの山}} {{transl|ja|''ano yama''}} "that mountain".@@@@1@6@@danf@17-8-2009
10431510@unknown@formal@none@1@S@Both {{transl|ja|''keiyōshi''}} and {{transl|ja|''keiyōdōshi''}} may [[predicate (grammar)|predicate]] sentences.@@@@1@8@@danf@17-8-2009
10431520@unknown@formal@none@1@S@For example,@@@@1@2@@danf@17-8-2009
10431530@unknown@formal@none@1@S@: {{lang|ja|ご飯が熱い。}} {{transl|ja|''Gohan-ga atsui.''}}@@@@1@4@@danf@17-8-2009
10431540@unknown@formal@none@1@S@"The rice is hot."@@@@1@4@@danf@17-8-2009
10431550@unknown@formal@none@1@S@: {{lang|ja|彼は変だ。}} {{transl|ja|''Kare-wa hen da.''}}@@@@1@5@@danf@17-8-2009
10431560@unknown@formal@none@1@S@"He's strange."@@@@1@2@@danf@17-8-2009
10431570@unknown@formal@none@1@S@Both inflect, though they do not show the full range of conjugation found in true verbs.@@@@1@16@@danf@17-8-2009
10431580@unknown@formal@none@1@S@The {{transl|ja|''rentaishi''}} in Modern Japanese are few in number, and unlike the other words, are limited to directly modifying nouns.@@@@1@20@@danf@17-8-2009
10431590@unknown@formal@none@1@S@They never predicate sentences.@@@@1@4@@danf@17-8-2009
10431600@unknown@formal@none@1@S@Examples include {{transl|ja|''ookina''}} "big", {{transl|ja|''kono''}} "this", {{transl|ja|''iwayuru''}} "so-called" and {{transl|ja|''taishita''}} "amazing".@@@@1@11@@danf@17-8-2009
10431610@unknown@formal@none@1@S@Both {{transl|ja|''keiyōdōshi''}} and {{transl|ja|''keiyōshi''}} form [[adverb]]s, by following with {{transl|ja|''ni''}} in the case of {{transl|ja|''keiyōdōshi''}}:@@@@1@15@@danf@17-8-2009
10431620@unknown@formal@none@1@S@: {{lang|ja|変になる}} {{transl|ja|''hen ni naru''}} "become strange",@@@@1@7@@danf@17-8-2009
10431630@unknown@formal@none@1@S@and by changing {{transl|ja|''i''}} to {{transl|ja|''ku''}} in the case of {{transl|ja|''keiyōshi''}}:@@@@1@11@@danf@17-8-2009
10431640@unknown@formal@none@1@S@: {{lang|ja|熱くなる}} {{transl|ja|''atsuku naru''}} "become hot".@@@@1@6@@danf@17-8-2009
10431650@unknown@formal@none@1@S@The grammatical function of nouns is indicated by [[postposition]]s, also called [[Japanese particles|particles]].@@@@1@13@@danf@17-8-2009
10431660@unknown@formal@none@1@S@These include, for example:@@@@1@4@@danf@17-8-2009
10431670@unknown@formal@none@1@S@* '''{{lang|ja|が}} {{transl|ja|''ga''}}''' for the [[nominative case]].@@@@1@7@@danf@17-8-2009
10431680@unknown@formal@none@1@S@Not necessarily a subject.@@@@1@4@@danf@17-8-2009
10431690@unknown@formal@none@1@S@: {{lang|ja|''彼'''が'''やった。''}}{{transl|ja|''Kare '''ga''' yatta.''}}@@@@1@4@@danf@17-8-2009
10431700@unknown@formal@none@1@S@"'''He''' did it."@@@@1@3@@danf@17-8-2009
10431710@unknown@formal@none@1@S@* '''{{lang|ja|に}} {{transl|ja|''ni''}}''' for the [[dative case]].@@@@1@7@@danf@17-8-2009
10431720@unknown@formal@none@1@S@: {{lang|ja|田中さん'''に'''あげて下さい。}} {{transl|ja|''Tanaka-san '''ni''' agete kudasai''}} "Please give it to '''Mr. Tanaka'''."@@@@1@12@@danf@17-8-2009
10431730@unknown@formal@none@1@S@It is also used for the [[lative]] case, indicating a motion to a location.@@@@1@14@@danf@17-8-2009
10431740@unknown@formal@none@1@S@: {{lang|ja|''日本'' '''に'''行きたい。}} {{transl|ja|'''''Nihon''' '''ni''' ikitai''}} "I want to go ''to'' '''Japan'''."@@@@1@12@@danf@17-8-2009
10431750@unknown@formal@none@1@S@* '''{{lang|ja|の}} {{transl|ja|''no''}}''' for the [[genitive case]], or nominalizing phrases.@@@@1@10@@danf@17-8-2009
10431760@unknown@formal@none@1@S@: {{lang|ja|私'''の'''カメラ。}} {{transl|ja|''watashi '''no''' kamera''}} "'''my''' camera"@@@@1@7@@danf@17-8-2009
10431770@unknown@formal@none@1@S@: {{lang|ja|スキーに行く'''の'''が好きです。}} {{transl|ja|''Sukī-ni iku '''no''' ga suki desu''}} "(I) like go'''ing''' skiing."@@@@1@12@@danf@17-8-2009
10431780@unknown@formal@none@1@S@* '''{{lang|ja|を}} {{transl|ja|''o''}}''' for the [[accusative case]].@@@@1@7@@danf@17-8-2009
10431790@unknown@formal@none@1@S@Not necessarily an object.@@@@1@4@@danf@17-8-2009
10431800@unknown@formal@none@1@S@: {{lang|ja|何'''を'''食べますか。}} {{transl|ja|''Nani '''o''' tabemasu ka?''}}@@@@1@6@@danf@17-8-2009
10431810@unknown@formal@none@1@S@"'''What''' will (you) eat?"@@@@1@4@@danf@17-8-2009
10431820@unknown@formal@none@1@S@* '''{{lang|ja|は}} {{transl|ja|''wa''}}''' for the topic.@@@@1@6@@danf@17-8-2009
10431830@unknown@formal@none@1@S@It can co-exist with case markers above except {{transl|ja|''no''}}, and it overrides {{transl|ja|''ga''}} and {{transl|ja|''o''}}.@@@@1@15@@danf@17-8-2009
10431840@unknown@formal@none@1@S@: {{lang|ja|私'''は'''タイ料理がいいです。}} {{transl|ja|''Watashi '''wa''' tai-ryōri ga ii desu.''}}@@@@1@8@@danf@17-8-2009
10431850@unknown@formal@none@1@S@"As for me, Thai food is good."@@@@1@7@@danf@17-8-2009
10431860@unknown@formal@none@1@S@The nominative marker {{transl|ja|''ga''}} after {{transl|ja|''watashi''}} is hidden under {{transl|ja|''wa''}}.@@@@1@10@@danf@17-8-2009
10431865@unknown@formal@none@1@S@(Note that English generally makes no distinction between sentence topic and subject.)@@@@1@12@@danf@17-8-2009
10431867@unknown@formal@none@1@S@Note: The difference between {{transl|ja|'''''wa'''''}} and {{transl|ja|'''''ga'''''}} goes beyond the English distinction between sentence topic and subject.@@@@1@17@@danf@17-8-2009
10431870@unknown@formal@none@1@S@While {{transl|ja|''wa''}} indicates the topic, which the rest of the sentence describes or acts upon, it carries the implication that the subject indicated by {{transl|ja|''wa''}} is not unique, or may be part of a larger group.@@@@1@36@@danf@17-8-2009
10431880@unknown@formal@none@1@S@: {{transl|ja|''Ikeda-san '''wa''' yonjū-ni sai da.''}}@@@@1@6@@danf@17-8-2009
10431890@unknown@formal@none@1@S@"As for Mr. Ikeda, he is forty-two years old."@@@@1@9@@danf@17-8-2009
10431900@unknown@formal@none@1@S@Others in the group may also be of that age.@@@@1@10@@danf@17-8-2009
10431910@unknown@formal@none@1@S@Absence of {{transl|ja|''wa''}} often means the subject is the [[focus (linguistics)|focus]] of the sentence.@@@@1@14@@danf@17-8-2009
10431920@unknown@formal@none@1@S@: {{transl|ja|''Ikeda-san '''ga''' yonjū-ni sai da.''}}@@@@1@6@@danf@17-8-2009
10431930@unknown@formal@none@1@S@"It is Mr. Ikeda who is forty-two years old."@@@@1@9@@danf@17-8-2009
10431940@unknown@formal@none@1@S@This is a reply to an implicit or explicit question such as "Who in this group is forty-two years old?"@@@@1@20@@danf@17-8-2009
10431950@unknown@formal@none@1@S@=== Politeness ===@@@@1@3@@danf@17-8-2009
10431960@unknown@formal@none@1@S@Unlike most western languages, Japanese has an extensive grammatical system to express politeness and formality.@@@@1@15@@danf@17-8-2009
10431970@unknown@formal@none@1@S@Most relationships are not equal in Japanese [[society]].@@@@1@8@@danf@17-8-2009
10431980@unknown@formal@none@1@S@The differences in social position are determined by a variety of factors including job, age, experience, or even psychological state (e.g., a person asking a favour tends to do so politely).@@@@1@31@@danf@17-8-2009
10431990@unknown@formal@none@1@S@The person in the lower position is expected to use a polite form of speech, whereas the other might use a more plain form.@@@@1@24@@danf@17-8-2009
10432000@unknown@formal@none@1@S@Strangers will also speak to each other politely.@@@@1@8@@danf@17-8-2009
10432010@unknown@formal@none@1@S@Japanese children rarely use polite speech until they are teens, at which point they are expected to begin speaking in a more adult manner.@@@@1@24@@danf@17-8-2009
10432020@unknown@formal@none@1@S@''See [[uchi-soto]]''.@@@@1@2@@danf@17-8-2009
10432030@unknown@formal@none@1@S@Whereas {{transl|ja|''teineigo''}} ({{lang|ja|丁寧語}}) (polite language) is commonly an [[inflection]]al system, {{transl|ja|''sonkeigo''}} ({{lang|ja|尊敬語}}) (respectful language) and {{transl|ja|''kenjōgo''}} ({{lang|ja|謙譲語}}) (humble language) often employ many special honorific and humble alternate verbs: {{transl|ja|''iku''}} "go" becomes {{transl|ja|''ikimasu''}} in polite form, but is replaced by {{transl|ja|''irassharu''}} in honorific speech and {{transl|ja|''ukagau''}} or {{transl|ja|''mairu''}} in humble speech.@@@@1@50@@danf@17-8-2009
10432040@unknown@formal@none@1@S@The difference between honorific and humble speech is particularly pronounced in the Japanese language.@@@@1@14@@danf@17-8-2009
10432050@unknown@formal@none@1@S@Humble language is used to talk about oneself or one's own group (company, family) whilst honorific language is mostly used when describing the interlocutor and his/her group.@@@@1@27@@danf@17-8-2009
10432060@unknown@formal@none@1@S@For example, the {{transl|ja|''-san''}} suffix ("Mr.", "Mrs." or "Miss") is an example of honorific language.@@@@1@15@@danf@17-8-2009
10432070@unknown@formal@none@1@S@It is not used to talk about oneself or when talking about someone from one's company to an external person, since the company is the speaker's "group".@@@@1@27@@danf@17-8-2009
10432080@unknown@formal@none@1@S@When speaking directly to one's superior in one's company or when speaking with other employees within one's company about a superior, a Japanese person will use vocabulary and inflections of the honorific register to refer to the in-group superior and his or her speech and actions.@@@@1@46@@danf@17-8-2009
10432090@unknown@formal@none@1@S@When speaking to a person from another company (i.e., a member of an out-group), however, a Japanese person will use the plain or the humble register to refer to the speech and actions of his or her own in-group superiors.@@@@1@40@@danf@17-8-2009
10432100@unknown@formal@none@1@S@In short, the register used in Japanese to refer to the person, speech, or actions of any particular individual varies depending on the relationship (either in-group or out-group) between the speaker and listener, as well as depending on the relative status of the speaker, listener, and third-person referents.@@@@1@48@@danf@17-8-2009
10432110@unknown@formal@none@1@S@For this reason, the Japanese system for explicit indication of social register is known as a system of "relative honorifics."@@@@1@20@@danf@17-8-2009
10432120@unknown@formal@none@1@S@This stands in stark contrast to the [[Korean language|Korean]] system of "absolute honorifics," in which the same register is used to refer to a particular individual (e.g. one's father, one's company president, etc.) in any context regardless of the relationship between the speaker and interlocutor.@@@@1@45@@danf@17-8-2009
10432130@unknown@formal@none@1@S@Thus, polite Korean speech can sound very presumptuous when translated verbatim into Japanese, as in Korean it is acceptable and normal to say things like "Our '''Mr.''' Company-President..." when communicating with a member of an out-group, which would be very inappropriate in a Japanese social context.@@@@1@46@@danf@17-8-2009
10432140@unknown@formal@none@1@S@Most [[noun]]s in the Japanese language may be made polite by the addition of {{transl|ja|''o-''}} or {{transl|ja|''go-''}} as a prefix.@@@@1@20@@danf@17-8-2009
10432145@unknown@formal@none@1@S@{{transl|ja|''o-''}} is generally used for words of native Japanese origin, whereas {{transl|ja|''go-''}} is affixed to words of Chinese derivation.@@@@1@19@@danf@17-8-2009
10432150@unknown@formal@none@1@S@In some cases, the prefix has become a fixed part of the word, and is included even in regular speech, such as {{transl|ja|''gohan''}} 'cooked rice; meal.'@@@@1@26@@danf@17-8-2009
10432160@unknown@formal@none@1@S@Such a construction often indicates deference to either the item's owner or to the object itself.@@@@1@16@@danf@17-8-2009
10432170@unknown@formal@none@1@S@For example, the word {{transl|ja|''tomodachi''}} 'friend,' would become {{transl|ja|''o-tomodachi''}} when referring to the friend of someone of higher status (though mothers often use this form to refer to their children's friends).@@@@1@31@@danf@17-8-2009
10432180@unknown@formal@none@1@S@On the other hand, a polite speaker may sometimes refer to {{transl|ja|''mizu''}} 'water' as {{transl|ja|''o-mizu''}} in order to show politeness.@@@@1@20@@danf@17-8-2009
10432190@unknown@formal@none@1@S@Most Japanese people employ politeness to indicate a lack of familiarity.@@@@1@11@@danf@17-8-2009
10432200@unknown@formal@none@1@S@That is, they use polite forms for new acquaintances, but if a relationship becomes more intimate, they no longer use them.@@@@1@21@@danf@17-8-2009
10432210@unknown@formal@none@1@S@This occurs regardless of age, social class, or gender.@@@@1@9@@danf@17-8-2009
10432220@unknown@formal@none@1@S@== Vocabulary ==@@@@1@3@@danf@17-8-2009
10432230@unknown@formal@none@1@S@The original language of Japan, or at least the original language of a certain population that was ancestral to a significant portion of the historical and present Japanese nation, was the so-called {{transl|ja|''yamato kotoba''}} ({{lang|ja|大和言葉}} or infrequently {{lang|ja|大和詞}}, i.e. "[[Yamato people|Yamato]] words"), which in scholarly contexts is sometimes referred to as {{transl|ja|''wa-go''}} ({{lang|ja|和語}} or rarely {{lang|ja|倭語}}, i.e. the "{{transl|ja|[[Wa (Japan)|Wa]]}} words").@@@@1@61@@danf@17-8-2009
10432240@unknown@formal@none@1@S@In addition to words from this original language, present-day Japanese includes a great number of words that were either borrowed from [[Chinese language|Chinese]] or constructed from Chinese roots following Chinese patterns.@@@@1@31@@danf@17-8-2009
10432250@unknown@formal@none@1@S@These words, known as {{transl|ja|''[[Sino-Japanese vocabulary|kango]]''}} ({{lang|ja|漢語}}), entered the language from the fifth century onwards via contact with Chinese culture.@@@@1@20@@danf@17-8-2009
10432260@unknown@formal@none@1@S@According to the [[Japanese dictionary]] ''Shinsen-kokugojiten'' (新選国語辞典), [[Sino-Japanese vocabulary|Chinese-based words]] comprise 49.1% of the total vocabulary, ''wago'' 33.8% and other foreign words 8.8%.@@@@1@23@@danf@17-8-2009
10432270@unknown@formal@none@1@S@Like Latin-derived words in English, {{transl|ja|''[[Sino-Japanese vocabulary|kango]]''}} words typically are perceived as somewhat formal or academic compared to equivalent Yamato words.@@@@1@21@@danf@17-8-2009
10432280@unknown@formal@none@1@S@Indeed, it is generally fair to say that an English word derived from Latin/French roots typically corresponds to a Sino-Japanese word in Japanese, whereas a simpler Anglo-Saxon word would best be translated by a Yamato equivalent.@@@@1@36@@danf@17-8-2009
10432290@unknown@formal@none@1@S@A much smaller number of words has been borrowed from [[Korean language|Korean]] and [[Ainu language|Ainu]].@@@@1@15@@danf@17-8-2009
10432300@unknown@formal@none@1@S@Japanese has also borrowed a number of words from other languages, particularly ones of European extraction, which are called {{transl|ja|''[[gairaigo]]''}}.@@@@1@20@@danf@17-8-2009
10432310@unknown@formal@none@1@S@This began with [[Japanese words of Portuguese origin|borrowings from Portuguese]] in the 16th century, followed by borrowing from [[Dutch language|Dutch]] during Japan's [[sakoku|long isolation]] of the [[Edo period]].@@@@1@28@@danf@17-8-2009
10432320@unknown@formal@none@1@S@With the [[Meiji Restoration]] and the reopening of Japan in the 19th century, borrowing occurred from [[German language|German]], [[French language|French]] and [[English language|English]].@@@@1@23@@danf@17-8-2009
10432330@unknown@formal@none@1@S@Currently, words of English origin are the most commonly borrowed.@@@@1@10@@danf@17-8-2009
10432340@unknown@formal@none@1@S@In the Meiji era, the Japanese also coined many neologisms using Chinese roots and morphology to translate Western concepts.@@@@1@19@@danf@17-8-2009
10432350@unknown@formal@none@1@S@Many of these pseudo-Chinese words were imported into [[Chinese language|Chinese]], [[Korean language|Korean]], and [[Vietnamese language|Vietnamese]] via their [[kanji]] in the late 19th and early 20th centuries.@@@@1@26@@danf@17-8-2009
10432360@unknown@formal@none@1@S@For example, {{lang|ja|政治}} {{transl|ja|''seiji''}} ("politics"), and {{lang|ja|化学}} {{transl|ja|''kagaku''}} ("chemistry") are words derived from Chinese roots that were first created and used by the Japanese, and only later borrowed into Chinese and other East Asian languages.@@@@1@35@@danf@17-8-2009
10432370@unknown@formal@none@1@S@As a result, Japanese, Chinese, Korean, and Vietnamese share a large common corpus of vocabulary in the same way a large number of Greek- and Latin-derived words are shared among modern European languages, although many academic words formed from such roots were certainly coined by native speakers of other languages, such as English.@@@@1@53@@danf@17-8-2009
10432380@unknown@formal@none@1@S@In the past few decades, {{transl|ja|''[[wasei-eigo]]''}} (made-in-Japan English) has become a prominent phenomenon.@@@@1@13@@danf@17-8-2009
10432390@unknown@formal@none@1@S@Words such as {{transl|ja|''wanpatān''}} {{lang|ja|ワンパターン}} (< ''one'' + ''pattern'', "to be in a rut", "to have a one-track mind") and {{transl|ja|''sukinshippu''}} {{lang|ja|スキンシップ}} (< ''skin'' + ''-ship'', "physical contact"), although coined by compounding English roots, are nonsensical in most non-Japanese contexts; exceptions exist in nearby languages such as Korean however, which often use words such as skinship and rimokon (remote control) in the same way as in Japanese.@@@@1@66@@danf@17-8-2009
10432400@unknown@formal@none@1@S@Additionally, many native Japanese words have become commonplace in English, due to the popularity of many Japanese cultural exports.@@@@1@19@@danf@17-8-2009
10432410@unknown@formal@none@1@S@Words such as [[futon]], [[haiku]], [[judo]], [[kamikaze]], [[karaoke]], [[karate]], [[ninja]], [[origami]], [[rickshaw]] (from {{lang|ja|人力車}} {{transl|ja|''jinrikisha''}}), [[samurai]], [[sayonara]], [[sumo]], [[sushi]], [[tsunami]], [[tycoon]] and many others have become part of the English language.@@@@1@31@@danf@17-8-2009
10432420@unknown@formal@none@1@S@See [[list of English words of Japanese origin]] for more.@@@@1@10@@danf@17-8-2009
10432430@unknown@formal@none@1@S@== Writing system ==@@@@1@4@@danf@17-8-2009
10432440@unknown@formal@none@1@S@Literacy was introduced to Japan in the form of the [[Chinese writing system]], by way of [[Baekje]] before the 5th century.@@@@1@21@@danf@17-8-2009
10432450@unknown@formal@none@1@S@Using this writing system, the Japanese emperor [[Emperor Yūryaku|Yūryaku]] sent a letter to the Chinese emperor [[Emperor Shun of Liu Song|Shun of Liu Song]] in 478 CE.@@@@1@27@@danf@17-8-2009
10432460@unknown@formal@none@1@S@After the ruin of Baekje, Japan invited scholars from China to learn more of the Chinese writing system.@@@@1@18@@danf@17-8-2009
10432470@unknown@formal@none@1@S@Japanese Emperors gave an official rank to Chinese scholars (続守言/薩弘格/袁晋卿) and spread the use of Chinese characters from the 7th century to the 8th century.@@@@1@25@@danf@17-8-2009
10432480@unknown@formal@none@1@S@At first, the Japanese wrote in [[Classical Chinese]], with Japanese names represented by characters used for their meanings and not their sounds.@@@@1@22@@danf@17-8-2009
10432490@unknown@formal@none@1@S@Later, during the seventh century CE, the Chinese-sounding phoneme principle was used to write pure Japanese poetry and prose (comparable to Akkadian's retention of Sumerian cuneiform), but some Japanese words were still written with characters for their meaning and not the original Chinese sound.@@@@1@44@@danf@17-8-2009
10432500@unknown@formal@none@1@S@This is when the history of Japanese as a written language begins in its own right.@@@@1@16@@danf@17-8-2009
10432510@unknown@formal@none@1@S@By this time, the Japanese language was already distinct from the [[Ryukyuan languages]].@@@@1@13@@danf@17-8-2009
10432520@unknown@formal@none@1@S@The Korean settlers and their descendants used Kudara-on or Baekje pronunciation (百済音), which was also called Tsushima-pronunciation (対馬音) or [[Go-on]] (呉音).@@@@1@21@@danf@17-8-2009
10432530@unknown@formal@none@1@S@An example of this mixed style is the [[Kojiki]], which was written in 712 AD.@@@@1@15@@danf@17-8-2009
10432540@unknown@formal@none@1@S@They then started to use Chinese characters to write Japanese in a style known as {{transl|ja|''man'yōgana''}}, a syllabic script which used Chinese characters for their sounds in order to transcribe the words of Japanese speech syllable by syllable.@@@@1@38@@danf@17-8-2009
10432550@unknown@formal@none@1@S@Over time, a writing system evolved.@@@@1@6@@danf@17-8-2009
10432560@unknown@formal@none@1@S@[[Chinese characters]] ([[kanji]]) were used to write either words borrowed from Chinese, or Japanese words with the same or similar meanings.@@@@1@21@@danf@17-8-2009
10432570@unknown@formal@none@1@S@Chinese characters were also used to write grammatical elements, were simplified, and eventually became two syllabic scripts: [[hiragana]] and [[katakana]].@@@@1@20@@danf@17-8-2009
10432580@unknown@formal@none@1@S@Modern Japanese is written in a mixture of three main systems: [[kanji]], characters of Chinese origin used to represent both Chinese [[loanword]]s into Japanese and a number of native Japanese [[morpheme]]s; and two [[syllabary|syllabaries]]: [[hiragana]] and [[katakana]].@@@@1@37@@danf@17-8-2009
10432590@unknown@formal@none@1@S@The [[Latin alphabet]] is also sometimes used.@@@@1@7@@danf@17-8-2009
10432600@unknown@formal@none@1@S@Arabic numerals are much more common than the kanji when used in counting, but kanji numerals are still used in compounds, such as {{lang|ja|統一}} {{transl|ja|''tōitsu''}} ("unification").@@@@1@26@@danf@17-8-2009
10432610@unknown@formal@none@1@S@''[[Hiragana]]'' are used for words without kanji representation, for words no longer written in kanji, and also following kanji to show conjugational endings.@@@@1@23@@danf@17-8-2009
10432620@unknown@formal@none@1@S@Because of the way verbs (and adjectives) in Japanese are [[conjugated]], kanji alone cannot fully convey Japanese tense and mood, as kanji cannot be subject to variation when written without losing its meaning.@@@@1@33@@danf@17-8-2009
10432630@unknown@formal@none@1@S@For this reason, hiragana are suffixed to the ends of kanji to show verb and adjective conjugations.@@@@1@17@@danf@17-8-2009
10432640@unknown@formal@none@1@S@Hiragana used in this way are called [[okurigana]].@@@@1@8@@danf@17-8-2009
10432650@unknown@formal@none@1@S@Hiragana are also written in a superscript called [[furigana]] above or beside a kanji to show the proper reading.@@@@1@19@@danf@17-8-2009
10432660@unknown@formal@none@1@S@This is done to facilitate learning, as well as to clarify particularly old or obscure (or sometimes invented) readings.@@@@1@19@@danf@17-8-2009
10432670@unknown@formal@none@1@S@''[[Katakana]]'', like hiragana, are a syllabary; katakana are primarily used to write foreign words, plant and animal names, and for emphasis.@@@@1@21@@danf@17-8-2009
10432680@unknown@formal@none@1@S@For example "Australia" has been adapted as {{transl|ja|''Ōsutoraria''}} ({{lang|ja|オーストラリア}}), and "supermarket" has been adapted and shortened into {{transl|ja|''sūpā''}} ({{lang|ja|スーパー}}).@@@@1@19@@danf@17-8-2009
10432690@unknown@formal@none@1@S@The [[Latin alphabet]] (in Japanese referred to as [[romaji|''Rōmaji'']] ({{lang|ja|ローマ字}}), literally "Roman letters") is used for some loan words like "CD" and "DVD", and also for some Japanese creations like "Sony".@@@@1@31@@danf@17-8-2009
10432700@unknown@formal@none@1@S@Historically, attempts to limit the number of kanji in use commenced in the mid-19th century, but did not become a matter of government intervention until after Japan's defeat in the Second World War.@@@@1@33@@danf@17-8-2009
10432710@unknown@formal@none@1@S@During the period of post-war occupation (and influenced by the views of some U.S. officials), various schemes including the complete abolition of kanji and exclusive use of rōmaji were considered.@@@@1@30@@danf@17-8-2009
10432720@unknown@formal@none@1@S@The {{transl|ja|''[[jōyō kanji]]''}} ("common use kanji", originally called {{transl|ja|''[[tōyō kanji]]''}} [kanji for general use]) scheme arose as a compromise solution.@@@@1@20@@danf@17-8-2009
10432730@unknown@formal@none@1@S@Japanese students begin to learn kanji from their first year at elementary school.@@@@1@13@@danf@17-8-2009
10432740@unknown@formal@none@1@S@A guideline created by the Japanese Ministry of Education, the list of {{transl|ja|''[[kyōiku kanji]]''}} ("education kanji", a subset of {{transl|ja|''[[jōyō kanji]]''}}), specifies the 1,006 simple characters a child is to learn by the end of sixth grade.@@@@1@37@@danf@17-8-2009
10432750@unknown@formal@none@1@S@Children continue to study another 939 characters in junior high school, covering in total 1,945 {{transl|ja|''[[jōyō kanji]]''}}.@@@@1@17@@danf@17-8-2009
10432760@unknown@formal@none@1@S@The official list of {{transl|ja|''[[jōyō kanji]]''}} was revised several times, but the total number of officially sanctioned characters remained largely unchanged.@@@@1@21@@danf@17-8-2009
10432770@unknown@formal@none@1@S@As for kanji for personal names, the circumstances are somewhat complicated.@@@@1@11@@danf@17-8-2009
10432780@unknown@formal@none@1@S@{{transl|ja|''[[Jōyō kanji]]''}} and {{transl|ja|''[[jinmeiyō kanji]]''}} (an appendix of additional characters for names) are approved for registering personal names.@@@@1@18@@danf@17-8-2009
10432790@unknown@formal@none@1@S@Names containing unapproved characters are denied registration.@@@@1@7@@danf@17-8-2009
10432800@unknown@formal@none@1@S@However, as with the list of {{transl|ja|''[[jōyō kanji]]''}}, criteria for inclusion were often arbitrary and led to many common and popular characters being disapproved for use.@@@@1@26@@danf@17-8-2009
10432810@unknown@formal@none@1@S@Under popular pressure and following a court decision holding the exclusion of common characters unlawful, the list of {{transl|ja|''[[jinmeiyō kanji]]''}} was substantially extended from 92 in 1951 (the year it was first decreed) to 983 in 2004.@@@@1@37@@danf@17-8-2009
10432820@unknown@formal@none@1@S@Furthermore, families whose names are not on these lists were permitted to continue using the older forms.@@@@1@17@@danf@17-8-2009
10432830@unknown@formal@none@1@S@Many writers rely on [[newspaper]] circulation to publish their work with officially sanctioned characters.@@@@1@14@@danf@17-8-2009
10432840@unknown@formal@none@1@S@This distribution method is more efficient than traditional [[pen]] and [[paper]] publications.@@@@1@12@@danf@17-8-2009
10432850@unknown@formal@none@1@S@==Study by non-native speakers==@@@@1@4@@danf@17-8-2009
10432860@unknown@formal@none@1@S@Many major universities throughout the world provide Japanese language courses, and a number of secondary and even primary schools worldwide offer courses in the language.@@@@1@25@@danf@17-8-2009
10432870@unknown@formal@none@1@S@International interest in the Japanese language dates from the 1800s but has become more prevalent following Japan's economic bubble of the 1980s and the global popularity of [[Japanese pop culture]] (such as [[anime]] and [[video games]]) since the 1990s.@@@@1@39@@danf@17-8-2009
10432880@unknown@formal@none@1@S@About 2.3 million people studied the language worldwide in 2003: 900,000 South [[Koreans]], 389,000 [[People's Republic of China|Chinese]], 381,000 [[Australians]], and 140,000 [[United States|Americans]] studied Japanese in lower and higher educational institutions.@@@@1@32@@danf@17-8-2009
10432890@unknown@formal@none@1@S@In Japan, more than 90,000 foreign students study at [[List of universities in Japan|Japanese universities]] and Japanese [[language school]]s, including 77,000 Chinese and 15,000 South Koreans in 2003.@@@@1@28@@danf@17-8-2009
10432900@unknown@formal@none@1@S@In addition, local governments and some [[non-profit organisation|NPO]] groups provide free Japanese language classes for foreign residents, including [[Japanese Brazilians]] and foreigners married to Japanese nationals.@@@@1@26@@danf@17-8-2009
10432910@unknown@formal@none@1@S@In the United Kingdom, studies are supported by the [[British Association for Japanese Studies]].@@@@1@14@@danf@17-8-2009
10432920@unknown@formal@none@1@S@In Ireland, Japanese is offered as a language in the [[Leaving Certificate]] in some schools.@@@@1@15@@danf@17-8-2009
10432930@unknown@formal@none@1@S@The Japanese government provides standardised tests to measure spoken and written comprehension of Japanese for second language learners; the most prominent is the [[Japanese Language Proficiency Test]] (JLPT).@@@@1@28@@danf@17-8-2009
10432940@unknown@formal@none@1@S@The Japan External Trade Organization ([[JETRO]]) organises the ''Business Japanese Proficiency Test'', which tests the learner's ability to understand Japanese in a business setting.@@@@1@24@@danf@17-8-2009
10432950@unknown@formal@none@1@S@When learning Japanese in a college setting, students are usually first taught how to pronounce [[romaji]].@@@@1@16@@danf@17-8-2009
10432960@unknown@formal@none@1@S@From that point, they are taught the two main syllabaries, with [[kanji]] usually being introduced in the second semester.@@@@1@19@@danf@17-8-2009
10432970@unknown@formal@none@1@S@Focus is usually first on polite (distal) speech, which students who might interact with native speakers would be expected to use.@@@@1@21@@danf@17-8-2009
10432980@unknown@formal@none@1@S@Casual speech and formal speech usually follow polite speech, as well as the usage of honorifics.@@@@1@16@@danf@17-8-2009
10440010@unknown@formal@none@1@S@Java (programming language)@@@@1@3@@danf@17-8-2009
10440020@unknown@formal@none@1@S@'''Java''' is a [[programming language]] originally developed by [[Sun Microsystems]] and released in 1995 as a core component of Sun Microsystems' [[Java (Sun)|Java platform]].@@@@1@24@@danf@17-8-2009
10440030@unknown@formal@none@1@S@The language derives much of its [[Syntax of programming languages|syntax]] from [[C (programming language)|C]] and [[C++]] but has a simpler [[object model]] and fewer low-level facilities.@@@@1@26@@danf@17-8-2009
10440040@unknown@formal@none@1@S@Java applications are typically [[compiler|compiled]] to [[bytecode]] that can run on any [[Java virtual machine]] (JVM) regardless of [[computer architecture]].@@@@1@20@@danf@17-8-2009
10440050@unknown@formal@none@1@S@The original and [[reference implementation]] Java [[compiler]]s, virtual machines, and [[library (computing)|class libraries]] were developed by Sun from 1995.@@@@1@19@@danf@17-8-2009
10440060@unknown@formal@none@1@S@As of May 2007, in compliance with the specifications of the [[Java Community Process]], Sun made available most of their Java technologies as [[free software]] under the [[GNU General Public License]].@@@@1@31@@danf@17-8-2009
10440070@unknown@formal@none@1@S@Others have also developed alternative implementations of these Sun technologies, such as the [[GNU Compiler for Java]] and [[GNU Classpath]].@@@@1@20@@danf@17-8-2009
10440080@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009
10440090@unknown@formal@none@1@S@The Java language was created by [[James Gosling]] in June 1991 for use in one of his many [[set-top box]] projects.@@@@1@21@@danf@17-8-2009
10440100@unknown@formal@none@1@S@The language was initially called ''Oak'', after an [[oak tree]] that stood outside Gosling's office; it also went by the name ''Green'' and was later renamed ''Java'', a name chosen from a list of random words.@@@@1@36@@danf@17-8-2009
10440110@unknown@formal@none@1@S@Gosling's goals were to implement a [[virtual machine]] and a language that had a familiar C/C++ style of notation.@@@@1@19@@danf@17-8-2009
10440120@unknown@formal@none@1@S@The first public implementation was Java 1.0 in 1995.@@@@1@9@@danf@17-8-2009
10440130@unknown@formal@none@1@S@It promised "[[Write once, run anywhere|Write Once, Run Anywhere]]" (WORA), providing no-cost runtimes on popular platforms.@@@@1@16@@danf@17-8-2009
10440140@unknown@formal@none@1@S@It was fairly secure and its security was configurable, allowing network and file access to be restricted.@@@@1@17@@danf@17-8-2009
10440150@unknown@formal@none@1@S@Major web browsers soon incorporated the ability to run secure Java ''[[applet]]s'' within web pages.@@@@1@15@@danf@17-8-2009
10440160@unknown@formal@none@1@S@Java quickly became popular.@@@@1@4@@danf@17-8-2009
10440170@unknown@formal@none@1@S@With the advent of ''Java 2'', new versions had multiple configurations built for different types of platforms.@@@@1@17@@danf@17-8-2009
10440180@unknown@formal@none@1@S@For example, ''[[J2EE]]'' was for enterprise applications and the greatly stripped down version ''[[J2ME]]'' was for mobile applications.@@@@1@18@@danf@17-8-2009
10440190@unknown@formal@none@1@S@''[[J2SE]]'' was the designation for the Standard Edition.@@@@1@8@@danf@17-8-2009
10440200@unknown@formal@none@1@S@In 2006, for marketing purposes, new ''J2'' versions were renamed ''Java EE'', ''Java ME'', and ''Java SE'', respectively.@@@@1@18@@danf@17-8-2009
10440210@unknown@formal@none@1@S@In 1997, Sun Microsystems approached the [[International Organization for Standardization#JTC1|ISO/IEC JTC1 standards body]] and later [[Ecma International]] to formalize Java, but it soon withdrew from the process.@@@@1@27@@danf@17-8-2009
10440220@unknown@formal@none@1@S@Java remains a [[de facto]] standard that is controlled through the [[Java Community Process]].@@@@1@14@@danf@17-8-2009
10440230@unknown@formal@none@1@S@At one time, Sun made most of its Java implementations available without charge although they were [[proprietary software]].@@@@1@18@@danf@17-8-2009
10440240@unknown@formal@none@1@S@Sun's revenue from Java was generated by the selling of licenses for specialized products such as the Java Enterprise System.@@@@1@20@@danf@17-8-2009
10440250@unknown@formal@none@1@S@Sun distinguishes between its [[Software Development Kit|Software Development Kit (SDK)]] and its [[HotSpot|Runtime Environment (JRE)]], which is a subset of the SDK; the primary distinction is that the JRE lacks the compiler, utility programs, and header files.@@@@1@37@@danf@17-8-2009
10440260@unknown@formal@none@1@S@On [[13 November]] [[2006]], Sun released much of Java as [[free software|free]] and [[open-source software|open-source]] software under the terms of the [[GNU General Public License]] (GPL).@@@@1@26@@danf@17-8-2009
10440270@unknown@formal@none@1@S@On [[8 May]] [[2007]] Sun finished the process, making all of Java's core code free and open-source, aside from a small portion of code to which Sun did not hold the copyright.@@@@1@32@@danf@17-8-2009
10440280@unknown@formal@none@1@S@== Philosophy ==@@@@1@3@@danf@17-8-2009
10440290@unknown@formal@none@1@S@=== Primary goals ===@@@@1@4@@danf@17-8-2009
10440300@unknown@formal@none@1@S@There were five primary goals in the creation of the Java language:@@@@1@12@@danf@17-8-2009
10440310@unknown@formal@none@1@S@# It should use the [[object-oriented programming]] methodology.@@@@1@8@@danf@17-8-2009
10440320@unknown@formal@none@1@S@# It should allow the same program to be [[execution (computers)|executed]] on multiple [[operating system]]s.@@@@1@15@@danf@17-8-2009
10440330@unknown@formal@none@1@S@# It should contain built-in support for using [[computer network]]s.@@@@1@10@@danf@17-8-2009
10440340@unknown@formal@none@1@S@# It should be designed to execute code from [[remote procedure call|remote source]]s securely.@@@@1@14@@danf@17-8-2009
10440350@unknown@formal@none@1@S@# It should be easy to use by selecting what were considered the good parts of other object-oriented languages.@@@@1@19@@danf@17-8-2009
10440360@unknown@formal@none@1@S@=== Platform independence ===@@@@1@4@@danf@17-8-2009
10440370@unknown@formal@none@1@S@One characteristic, [[Cross-platform|platform independence]], means that [[computer program|program]]s written in the Java language must run similarly on any supported hardware/operating-system platform.@@@@1@21@@danf@17-8-2009
10440380@unknown@formal@none@1@S@One should be able to write a program once, compile it once, and run it anywhere.@@@@1@16@@danf@17-8-2009
10440390@unknown@formal@none@1@S@This is achieved by most Java [[compiler]]s by compiling the Java language code ''halfway'' (to [[Java bytecode]]) – simplified machine instructions specific to the Java platform.@@@@1@26@@danf@17-8-2009
10440400@unknown@formal@none@1@S@The code is then run on a [[virtual machine]] (VM), a program written in native code on the host hardware that [[Interpreter (computing)|interprets]] and executes generic Java bytecode.@@@@1@28@@danf@17-8-2009
10440410@unknown@formal@none@1@S@(In some JVM versions, bytecode can also be compiled to native code, either before or during program execution, resulting in faster execution.)@@@@1@22@@danf@17-8-2009
10440420@unknown@formal@none@1@S@Further, standardized libraries are provided to allow access to features of the host machines (such as graphics, [[thread (computer science)|threading]] and [[Computer network|networking]]) in unified ways.@@@@1@26@@danf@17-8-2009
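As a rough illustration of this unified access, the following sketch queries the host through standard system properties and starts a thread without any platform-specific code. The class name PlatformInfo and the method describeHost are invented for this example; the printed text differs from host to host, so no output is shown.

```java
import java.io.File;

// Hypothetical example class; only standard-library calls are used.
class PlatformInfo {
    // Builds a description of the host from portable system properties.
    static String describeHost() {
        return System.getProperty("os.name") + " on " + System.getProperty("os.arch")
                + ", file separator \"" + File.separator + "\"";
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(describeHost());
        // The same Thread API is used whatever threading facility the host provides.
        Thread worker = new Thread(() -> System.out.println("hello from a worker thread"));
        worker.start();
        worker.join(); // wait for the worker to finish
    }
}
```

The same compiled class file should run unchanged on any platform with a compatible JVM; only the values reported differ.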
10440430@unknown@formal@none@1@S@Although there is an explicit compilation stage, the Java bytecode is at some point either interpreted or converted to native [[machine code]] by the [[Just-in-time compilation|JIT compiler]].@@@@1@28@@danf@17-8-2009
10440440@unknown@formal@none@1@S@The first implementations of the language used an interpreted virtual machine to achieve [[Porting|portability]].@@@@1@14@@danf@17-8-2009
10440450@unknown@formal@none@1@S@These implementations produced programs that ran slower than programs compiled to native executables, such as those written in C or C++, so the language suffered a reputation for poor performance.@@@@1@29@@danf@17-8-2009
10440460@unknown@formal@none@1@S@More recent JVM implementations produce programs that run significantly faster than before, using multiple techniques.@@@@1@15@@danf@17-8-2009
10440470@unknown@formal@none@1@S@One technique, known as ''just-in-time compilation'' (JIT), translates the Java bytecode into native code at the time that the program is run, which results in a program that executes faster than interpreted code but also incurs compilation overhead during execution.@@@@1@40@@danf@17-8-2009
10440480@unknown@formal@none@1@S@More sophisticated VMs use ''[[dynamic recompilation]]'', in which the VM can analyze the behavior of the running program and selectively recompile and optimize critical parts of the program.@@@@1@28@@danf@17-8-2009
10440490@unknown@formal@none@1@S@Dynamic recompilation can achieve optimizations superior to static compilation because the dynamic compiler can base optimizations on knowledge about the runtime environment and the set of loaded classes, and can identify the ''hot spots'' (parts of the program, often inner loops, that take up the most execution time).@@@@1@48@@danf@17-8-2009
10440500@unknown@formal@none@1@S@JIT compilation and dynamic recompilation allow Java programs to take advantage of the speed of native code without losing portability.@@@@1@20@@danf@17-8-2009
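The notion of a ''hot spot'' can be made concrete with a small sketch. The class HotLoop below is hypothetical: it calls a short method many times per round and reports the elapsed time of each round. On a JVM with a JIT compiler the later rounds will often be faster than the first, but this depends on the JVM, its settings and the machine, so no timings are claimed here.

```java
// Hypothetical example: a small method called often enough to become a hot spot.
class HotLoop {
    // Sums the squares 1*1 + 2*2 + ... + n*n with a plain inner loop.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int round = 1; round <= 5; round++) {
            long start = System.nanoTime();
            long checksum = 0;
            for (int call = 0; call < 20_000; call++) {
                checksum += sumOfSquares(1_000);
            }
            long micros = (System.nanoTime() - start) / 1_000;
            // Printing the checksum keeps the loop from being optimized away entirely.
            System.out.println("round " + round + ": " + micros + " us, checksum " + checksum);
        }
    }
}
```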
10440510@unknown@formal@none@1@S@Another technique, commonly known as ''static compilation'', is to compile directly into native code like a more traditional compiler.@@@@1@19@@danf@17-8-2009
10440520@unknown@formal@none@1@S@Static Java compilers, such as [[GCJ]], translate the Java language code to native [[object code]], removing the intermediate bytecode stage.@@@@1@20@@danf@17-8-2009
10440530@unknown@formal@none@1@S@This achieves good performance compared to interpretation, but at the expense of portability; the output of these compilers can only be run on a single [[Computer architecture|architecture]].@@@@1@27@@danf@17-8-2009
10440540@unknown@formal@none@1@S@Some see avoiding the VM in this manner as defeating the point of developing in Java; however it can be useful to provide both a generic [[bytecode]] version, as well as an optimised native code version of an application.@@@@1@39@@danf@17-8-2009
10440550@unknown@formal@none@1@S@=== Implementations ===@@@@1@3@@danf@17-8-2009
10440560@unknown@formal@none@1@S@Sun Microsystems officially licenses the Java Standard Edition platform for [[Microsoft Windows]], [[Linux]], and [[Solaris (operating system)|Solaris]].@@@@1@17@@danf@17-8-2009
10440570@unknown@formal@none@1@S@Through a network of third-party vendors and licensees, alternative Java environments are available for these and other platforms.@@@@1@18@@danf@17-8-2009
10440580@unknown@formal@none@1@S@To qualify as a certified Java licensee, an implementation on any particular platform must pass a rigorous suite of validation and compatibility tests.@@@@1@23@@danf@17-8-2009
10440590@unknown@formal@none@1@S@This method enables a guaranteed level of compliance across platforms through a trusted set of commercial and non-commercial partners.@@@@1@19@@danf@17-8-2009
10440600@unknown@formal@none@1@S@Sun's trademark license for usage of the Java brand insists that all implementations be "compatible".@@@@1@15@@danf@17-8-2009
10440610@unknown@formal@none@1@S@This resulted in a legal dispute with [[Microsoft]] after Sun claimed that the Microsoft implementation did not support the [[Java remote method invocation|RMI]] and [[Java Native Interface|JNI]] interfaces and had added platform-specific features of their own.@@@@1@36@@danf@17-8-2009
10440620@unknown@formal@none@1@S@Sun sued in 1997, and in 2001 won a settlement of $20 million as well as a court order enforcing the terms of the license from Sun.@@@@1@27@@danf@17-8-2009
10440630@unknown@formal@none@1@S@As a result, Microsoft no longer ships Java with [[Microsoft Windows|Windows]], and in recent versions of Windows, [[Internet Explorer]] cannot support Java applets without a third-party plugin.@@@@1@27@@danf@17-8-2009
10440640@unknown@formal@none@1@S@However, Sun and others have made available Java run-time systems at no cost for those and other versions of Windows.@@@@1@20@@danf@17-8-2009
10440650@unknown@formal@none@1@S@Platform-independent Java is essential to the [[Java Enterprise Edition]] strategy, and an even more rigorous validation is required to certify an implementation.@@@@1@22@@danf@17-8-2009
10440660@unknown@formal@none@1@S@This environment enables portable server-side applications, such as [[Web service]]s, [[servlet]]s, and [[Enterprise JavaBean]]s, as well as [[Embedded system]]s based on [[OSGi]], using [[Embedded Java]] environments.@@@@1@27@@danf@17-8-2009
10440670@unknown@formal@none@1@S@Through the new [[GlassFish]] project, Sun is working to create a fully functional, unified [[open-source]] implementation of the Java EE technologies.@@@@1@21@@danf@17-8-2009
10440680@unknown@formal@none@1@S@=== Automatic memory management ===@@@@1@5@@danf@17-8-2009
10440690@unknown@formal@none@1@S@One of the ideas behind Java's automatic memory management model is that programmers be spared the burden of having to perform manual memory management.@@@@1@24@@danf@17-8-2009
10440700@unknown@formal@none@1@S@In some languages the programmer allocates memory for the creation of objects stored on the [[heap]] and the responsibility of later deallocating that memory also resides with the programmer.@@@@1@29@@danf@17-8-2009
10440710@unknown@formal@none@1@S@If the programmer forgets to deallocate memory or writes code that fails to do so, a [[memory leak]] occurs and the program can consume an arbitrarily large amount of memory.@@@@1@30@@danf@17-8-2009
10440720@unknown@formal@none@1@S@Additionally, if the program attempts to deallocate the region of memory more than once, the result is undefined and the program may become unstable and may crash.@@@@1@27@@danf@17-8-2009
10440730@unknown@formal@none@1@S@Finally, in non-garbage-collected environments, user code carries a certain degree of overhead and complexity to track and finalize allocations.@@@@1@22@@danf@17-8-2009
10440740@unknown@formal@none@1@S@Often developers may box themselves into certain designs to provide reasonable assurances that memory leaks will not occur.@@@@1@18@@danf@17-8-2009
10440750@unknown@formal@none@1@S@In Java, this potential problem is avoided by [[automatic garbage collection]].@@@@1@11@@danf@17-8-2009
10440760@unknown@formal@none@1@S@The programmer determines when objects are created, and the Java runtime is responsible for managing the [[object lifetime|object's lifecycle]].@@@@1@19@@danf@17-8-2009
10440770@unknown@formal@none@1@S@The program or other objects can reference an object by holding a reference to it (which, from a low-level point of view, is its address on the heap).@@@@1@28@@danf@17-8-2009
10440780@unknown@formal@none@1@S@When no references to an object remain, the [[unreachable object]] becomes eligible for release, and the Java garbage collector may free it automatically at any time.@@@@1@32@@danf@17-8-2009
10440790@unknown@formal@none@1@S@Memory leaks may still occur if a programmer's code holds a reference to an object that is no longer needed; in other words, they can still occur, but at higher conceptual levels.@@@@1@31@@danf@17-8-2009
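A leak of this kind can be sketched as follows. The class LeakyCache and its methods are hypothetical; the point is only that objects referenced from a long-lived collection remain reachable, so the garbage collector cannot free them until the references are dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example of a leak caused by references that are never released.
class LeakyCache {
    // A long-lived collection: everything added here stays reachable.
    private static final List<byte[]> RETAINED = new ArrayList<>();

    // Simulates work that allocates a buffer and accidentally keeps a reference to it.
    static void handleRequest() {
        byte[] buffer = new byte[1024];
        RETAINED.add(buffer); // the buffer is no longer needed, but is still referenced
    }

    static int retainedCount() {
        return RETAINED.size();
    }

    // Dropping the references makes the buffers unreachable and eligible for collection.
    static void release() {
        RETAINED.clear();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            handleRequest();
        }
        System.out.println("buffers still reachable: " + retainedCount());
        release();
        System.out.println("buffers reachable after release: " + retainedCount());
    }
}
```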
10440800@unknown@formal@none@1@S@The use of garbage collection in a language can also affect programming paradigms.@@@@1@13@@danf@17-8-2009
10440810@unknown@formal@none@1@S@If, for example, the developer assumes that the cost of memory allocation/recollection is low, they may choose to more freely construct objects instead of pre-initializing, holding and reusing them.@@@@1@29@@danf@17-8-2009
10440820@unknown@formal@none@1@S@At the small cost of potential performance penalties (inner-loop construction of large or complex objects), this facilitates thread isolation (no need to synchronize, as different threads work on different object instances) and data hiding.@@@@1@30@@danf@17-8-2009
10440830@unknown@formal@none@1@S@The use of transient immutable value-objects minimizes side-effect programming.@@@@1@9@@danf@17-8-2009
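A transient immutable value object might look like the sketch below. The class Vec2 is invented for illustration: its fields are final, and an operation such as translate returns a new instance instead of modifying the existing one, leaving the short-lived intermediate objects to the garbage collector.

```java
// Hypothetical immutable value object: state is fixed at construction time.
final class Vec2 {
    private final int x;
    private final int y;

    Vec2(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // Returns a new object; the receiver is left unchanged (no side effects).
    Vec2 translate(int dx, int dy) {
        return new Vec2(x + dx, y + dy);
    }

    public static void main(String[] args) {
        Vec2 origin = new Vec2(0, 0);
        Vec2 moved = origin.translate(3, 4);
        System.out.println("origin: " + origin.x() + "," + origin.y());
        System.out.println("moved:  " + moved.x() + "," + moved.y());
    }
}
```

Because an instance never changes after construction, it can be shared between threads without synchronization.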
10440840@unknown@formal@none@1@S@Similar functionality can be implemented in [[C++]]; for example, a memory management model for specific classes can be designed to improve speed and considerably lower memory fragmentation. The possible costs are runtime overhead comparable to that of Java's garbage collector, and added development time and application complexity if one favors a manual implementation over an existing third-party library.@@@@1@69@@danf@17-8-2009
10440850@unknown@formal@none@1@S@In Java, garbage collection is built-in and virtually invisible to the developer.@@@@1@12@@danf@17-8-2009
10440860@unknown@formal@none@1@S@That is, developers may have no notion of when garbage collection will take place as it may not necessarily correlate with any actions being explicitly performed by the code they write.@@@@1@31@@danf@17-8-2009
10440870@unknown@formal@none@1@S@Depending on intended application, this can be beneficial or disadvantageous: the programmer is freed from performing low-level tasks, but at the same time loses the option of writing lower level code.@@@@1@31@@danf@17-8-2009
10440880@unknown@formal@none@1@S@Additionally, the garbage collection capability demands some attention to tuning the JVM, as large heaps can cause apparently random stalls in performance.@@@@1@22@@danf@17-8-2009
10440890@unknown@formal@none@1@S@Java does not support [[pointer (computing)|pointer arithmetic]] as is supported in, for example, C++.@@@@1@14@@danf@17-8-2009
10440900@unknown@formal@none@1@S@This is because the garbage collector may relocate referenced objects, invalidating such pointers.@@@@1@13@@danf@17-8-2009
10440910@unknown@formal@none@1@S@Another reason that Java forbids this is that type safety and security can no longer be guaranteed if arbitrary manipulation of pointers is allowed.@@@@1@24@@danf@17-8-2009
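In place of pointer arithmetic, Java code reaches array elements through indices that the runtime checks on every access. The sketch below (the class SafeIndexing and method read are hypothetical) shows an out-of-range index being rejected with an exception instead of reading an arbitrary memory location.

```java
// Hypothetical example: array access is bounds-checked, not computed from raw addresses.
class SafeIndexing {
    // Reads one element, reporting an invalid index instead of touching other memory.
    static String read(int[] data, int index) {
        try {
            return "value " + data[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            return "rejected index " + index;
        }
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30};
        System.out.println(read(data, 1)); // a valid index
        System.out.println(read(data, 3)); // one past the end of the array
    }
}
```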
10440920@unknown@formal@none@1@S@== Syntax ==@@@@1@3@@danf@17-8-2009
10440930@unknown@formal@none@1@S@The syntax of Java is largely derived from [[C++]].@@@@1@9@@danf@17-8-2009
10440940@unknown@formal@none@1@S@Unlike C++, which combines the syntax for structured, generic, and object-oriented programming, Java was built exclusively as an object-oriented language.@@@@1@21@@danf@17-8-2009
10440950@unknown@formal@none@1@S@As a result, almost everything is an object and all code is written inside a class.@@@@1@16@@danf@17-8-2009
10440960@unknown@formal@none@1@S@The exceptions are the intrinsic data types (integer and floating-point numbers, boolean values, and characters), which are not classes for performance reasons.@@@@1@22@@danf@17-8-2009
10440970@unknown@formal@none@1@S@=== Hello, world program ===@@@@1@5@@danf@17-8-2009
10440980@unknown@formal@none@1@S@This is a minimal [[Hello world program]] in Java with [[syntax highlighting]]:@@@@1@12@@danf@17-8-2009
10440990@unknown@formal@none@1@S@@@@@1@19@@danf@17-8-2009
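A minimal sketch consistent with the explanation that follows (a public class named Hello whose main method prints through System.out.println) might read as shown here. The exact greeting text is an assumption.

```java
// Hello.java
public class Hello {
    // Entry point called by the Java launcher.
    public static void main(String[] args) {
        System.out.println("Hello, world!"); // greeting text assumed for this sketch
    }
}
```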
10441000@unknown@formal@none@1@S@To execute this program, the code is saved as a file named Hello.java.@@@@1@14@@danf@17-8-2009
10441010@unknown@formal@none@1@S@It must first be compiled into bytecode using a [[Java compiler]], which produces a file named Hello.class.@@@@1@17@@danf@17-8-2009
10441020@unknown@formal@none@1@S@This class is then ''launched''.@@@@1@5@@danf@17-8-2009
10441030@unknown@formal@none@1@S@The above example merits a bit of explanation.@@@@1@8@@danf@17-8-2009
10441040@unknown@formal@none@1@S@* All executable statements in Java are written inside a class, including stand-alone programs.@@@@1@14@@danf@17-8-2009
10441050@unknown@formal@none@1@S@* Source files are by convention named the same as the class they contain, appending the mandatory suffix ''.java''.@@@@1@19@@danf@17-8-2009
10441060@unknown@formal@none@1@S@A '''class''' that is declared '''public''' is required to follow this convention.@@@@1@12@@danf@17-8-2009
10441070@unknown@formal@none@1@S@(In this case, the class '''Hello''' is public, therefore the source must be stored in a file called ''Hello.java'').@@@@1@19@@danf@17-8-2009
10441080@unknown@formal@none@1@S@* The compiler will generate a class file for each class defined in the source file.@@@@1@16@@danf@17-8-2009
10441090@unknown@formal@none@1@S@The name of the class file is the name of the class, with ''.class'' appended.@@@@1@15@@danf@17-8-2009
10441100@unknown@formal@none@1@S@For class file generation, anonymous classes are treated as if their name were the concatenation of the name of their enclosing class, a ''$'', and an integer.@@@@1@27@@danf@17-8-2009
10441110@unknown@formal@none@1@S@* The [[Java keywords|keyword]] '''public''' denotes that a method can be called from code in other classes, or that a class may be used by classes outside the class hierarchy.@@@@1@30@@danf@17-8-2009
10441120@unknown@formal@none@1@S@* The keyword '''static''' indicates that the method is a [[class method|static method]], associated with the class rather than object instances.@@@@1@21@@danf@17-8-2009
10441130@unknown@formal@none@1@S@* The keyword '''void''' indicates that the main method does not return any value to the caller.@@@@1@17@@danf@17-8-2009
10441140@unknown@formal@none@1@S@* The method name "main" is not a keyword in the Java language.@@@@1@13@@danf@17-8-2009
10441150@unknown@formal@none@1@S@It is simply the name of the method the Java launcher calls to pass control to the program.@@@@1@18@@danf@17-8-2009
10441160@unknown@formal@none@1@S@Java classes that run in managed environments such as applets and [[Enterprise Java Beans]] do not use or need a main() method.@@@@1@22@@danf@17-8-2009
10441170@unknown@formal@none@1@S@* The main method must accept an [[array]] of '''{{Javadoc:SE|java/lang|String}}''' objects.@@@@1@11@@danf@17-8-2009
10441180@unknown@formal@none@1@S@By convention, it is referenced as '''args''' although any other legal identifier name can be used.@@@@1@16@@danf@17-8-2009
10441190@unknown@formal@none@1@S@Since Java 5, the main method can also use [[varargs|variable arguments]], in the form of public static void main(String... args), allowing the main method to be invoked with an arbitrary number of String arguments.@@@@1@34@@danf@17-8-2009
10441200@unknown@formal@none@1@S@The effect of this alternate declaration is semantically identical (the args parameter is still an array of String objects), but allows an alternate syntax for creating and passing the array.@@@@1@30@@danf@17-8-2009
10441210@unknown@formal@none@1@S@* The Java launcher launches Java by loading a given class (specified on the command line) and starting its public static void main(String[]) method.@@@@1@24@@danf@17-8-2009
10441220@unknown@formal@none@1@S@Stand-alone programs must declare this method explicitly.@@@@1@7@@danf@17-8-2009
10441230@unknown@formal@none@1@S@The String[] args parameter is an [[array]] of {{Javadoc:SE|java/lang|String}} objects containing any arguments passed to the class.@@@@1@17@@danf@17-8-2009
10441240@unknown@formal@none@1@S@The parameters to main are often passed by means of a [[command line]].@@@@1@13@@danf@17-8-2009
10441250@unknown@formal@none@1@S@* The printing facility is part of the Java standard library: The '''{{Javadoc:SE|java/lang|System}}''' class defines a public static field called '''{{Javadoc:SE|name=out|java/lang|System|out}}'''.@@@@1@21@@danf@17-8-2009
10441260@unknown@formal@none@1@S@The out object is an instance of the {{Javadoc:SE|java/io|PrintStream}} class and provides the method '''{{Javadoc:SE|name=println(String)|java/io|PrintStream|println(java.lang.String)}}''' for writing a line of text, followed by a line break, to [[standard streams|standard out]].@@@@1@29@@danf@17-8-2009
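The equivalence between the array form and the variable-arguments form of the parameter list, mentioned in the list above, can be sketched as follows. The class ArgsDemo and the helper describe are hypothetical names used only for this illustration.

```java
// Hypothetical example: a varargs parameter is still an array of String objects.
class ArgsDemo {
    // Declared with varargs, as main may be since Java 5.
    static String describe(String... args) {
        return args.length + " argument(s): " + String.join(", ", args);
    }

    public static void main(String... args) {
        System.out.println(describe(args));                    // whatever the launcher passed
        System.out.println(describe("a", "b"));                // array created implicitly
        System.out.println(describe(new String[] {"a", "b"})); // explicit array, same effect
    }
}
```

Run as a stand-alone program, the first line reflects the command-line arguments given to the launcher, while the last two lines are identical.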
10441270@unknown@formal@none@1@S@=== A more comprehensive example ===@@@@1@6@@danf@17-8-2009
10441280@unknown@formal@none@1@S@@@@@1@312@@danf@17-8-2009
10441290@unknown@formal@none@1@S@* The '''[[Java keywords#import|import]]''' statement imports the '''{{Javadoc:SE|javax/swing|JOptionPane}}''' class from the '''{{Javadoc:SE|package=javax.swing|javax/swing}}''' package.@@@@1@13@@danf@17-8-2009
10441300@unknown@formal@none@1@S@* The '''OddEven''' class declares a single '''[[Java keywords#private|private]]''' [[field (computer science)|field]] of type '''int''' named '''input'''.@@@@1@17@@danf@17-8-2009
10441310@unknown@formal@none@1@S@Every instance of the OddEven class has its own copy of the input field.@@@@1@14@@danf@17-8-2009
10441320@unknown@formal@none@1@S@The private declaration means that no other class can access (read or write) the input field.@@@@1@16@@danf@17-8-2009
10441330@unknown@formal@none@1@S@* '''OddEven()''' is a '''public''' [[constructor (computer science)|constructor]].@@@@1@8@@danf@17-8-2009
10441340@unknown@formal@none@1@S@Constructors have the same name as the enclosing class they are declared in, and unlike a method, have no [[return type]].@@@@1@21@@danf@17-8-2009
10441350@unknown@formal@none@1@S@A constructor is used to initialize an [[object (computer science)|object]] that is a newly created instance of the class.@@@@1@19@@danf@17-8-2009
10441360@unknown@formal@none@1@S@The dialog returns a String that is converted to an int by the '''{{Javadoc:SE|java/lang|Integer|parseInt(String)}}''' method.@@@@1@15@@danf@17-8-2009
10441370@unknown@formal@none@1@S@* The '''calculate()''' method is declared without the static keyword.@@@@1@10@@danf@17-8-2009
10441380@unknown@formal@none@1@S@This means that the method is invoked using a specific instance of the OddEven class.@@@@1@15@@danf@17-8-2009
10441390@unknown@formal@none@1@S@(The [[reference (computer science)|reference]] used to invoke the method is passed as an undeclared parameter of type OddEven named '''[[Java keywords#this|this]]'''.)@@@@1@21@@danf@17-8-2009
10441400@unknown@formal@none@1@S@The method tests the expression input % 2 == 0 using the '''[[Java keywords#if|if]]''' keyword to see if the remainder of dividing the input field belonging to the instance of the class by two is zero.@@@@1@36@@danf@17-8-2009
10441410@unknown@formal@none@1@S@If this expression is true, then it prints '''Even'''; if this expression is false it prints '''Odd'''.@@@@1@17@@danf@17-8-2009
10441420@unknown@formal@none@1@S@(The input field can be equivalently accessed as this.input, which explicitly uses the undeclared this parameter.)@@@@1@16@@danf@17-8-2009
10441430@unknown@formal@none@1@S@* '''OddEven number = new OddEven();''' declares a local object [[reference (computer science)|reference]] variable in the main method named number.@@@@1@20@@danf@17-8-2009
10441440@unknown@formal@none@1@S@This variable can hold a reference to an object of type OddEven.@@@@1@12@@danf@17-8-2009
10441450@unknown@formal@none@1@S@The declaration initializes number by first creating an instance of the OddEven class, using the '''[[Java keywords#new|new]]''' keyword and the OddEven() constructor, and then assigning this instance to the variable.@@@@1@30@@danf@17-8-2009
10441460@unknown@formal@none@1@S@* The statement '''number.showDialog();''' calls the showDialog method.@@@@1@8@@danf@17-8-2009
10441470@unknown@formal@none@1@S@The instance of OddEven referenced by the number [[local variable]] is used to invoke the method and is passed to it as the undeclared this parameter.@@@@1@28@@danf@17-8-2009
10441480@unknown@formal@none@1@S@* For simplicity, [[error handling]] has been ignored in this example.@@@@1@11@@danf@17-8-2009
10441490@unknown@formal@none@1@S@Entering a value that is not a number will cause the program to crash.@@@@1@14@@danf@17-8-2009
10441500@unknown@formal@none@1@S@This can be avoided by catching and handling the {{Javadoc:SE|java/lang|NumberFormatException}} thrown by Integer.parseInt(String).@@@@1@13@@danf@17-8-2009
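The error handling described in the last point could be sketched as follows. This is not the example discussed above: to keep it self-contained, the dialog is replaced by a plain String argument, and the class name OddEvenSketch and the method readInput are assumptions made for this illustration.

```java
// Hypothetical variant of the odd/even example with the parse error handled.
class OddEvenSketch {
    private int input; // each instance has its own copy of this field

    // Stores the parsed number; returns false instead of crashing on invalid text.
    boolean readInput(String text) {
        try {
            input = Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Instance method: tests the remainder of dividing this.input by two.
    String calculate() {
        return (input % 2 == 0) ? "Even" : "Odd";
    }

    public static void main(String[] args) {
        OddEvenSketch number = new OddEvenSketch();
        for (String text : new String[] {"42", "7", "seven"}) {
            if (number.readInput(text)) {
                System.out.println(text + " is " + number.calculate());
            } else {
                System.out.println(text + " is not a whole number");
            }
        }
    }
}
```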
10441510@unknown@formal@none@1@S@=== Applet ===@@@@1@3@@danf@17-8-2009
10441520@unknown@formal@none@1@S@Java applets are programs that are embedded in other applications, typically in a Web page displayed in a [[Web browser]].@@@@1@20@@danf@17-8-2009
10441530@unknown@formal@none@1@S@@@@@1@25@@danf@17-8-2009
10441540@unknown@formal@none@1@S@The '''import''' statements direct the [[Java compiler]] to include the '''{{Javadoc:SE|package=java.applet|java/applet|Applet}}''' and '''{{Javadoc:SE|package=java.awt|java/awt|Graphics}}''' classes in the compilation.@@@@1@17@@danf@17-8-2009
10441550@unknown@formal@none@1@S@The import statement allows these classes to be referenced in the [[source code]] using the ''simple class name'' (e.g. Applet) instead of the ''fully qualified class name'' (e.g. java.applet.Applet).@@@@1@29@@danf@17-8-2009
10441560@unknown@formal@none@1@S@The Hello class '''extends''' ([[subclass (computer science)|subclasses]]) the '''Applet''' class; the Applet class provides the framework for the host application to display and control the [[Object lifetime|lifecycle]] of the applet.@@@@1@30@@danf@17-8-2009
10441570@unknown@formal@none@1@S@The Applet class is an [[Abstract Windowing Toolkit]] (AWT) {{Javadoc:SE|java/awt|Component}}, which provides the applet with the capability to display a [[graphical user interface]] (GUI) and respond to user [[event-driven programming|events]].@@@@1@30@@danf@17-8-2009
10441580@unknown@formal@none@1@S@The Hello class [[method overriding (programming)|overrides]] the '''{{Javadoc:SE|name=paint(Graphics)|java/awt|Container|paint(java.awt.Graphics)}}''' method inherited from the {{Javadoc:SE|java/awt|Container}} [[superclass (computer science)|superclass]] to provide the code to display the applet.@@@@1@24@@danf@17-8-2009
10441590@unknown@formal@none@1@S@The paint() method is passed a '''Graphics''' object that contains the graphic context used to display the applet.@@@@1@18@@danf@17-8-2009
10441600@unknown@formal@none@1@S@The paint() method calls the graphic context '''{{Javadoc:SE|name=drawString(String, int, int)|java/awt|Graphics|drawString(java.lang.String,%20int,%20int)}}''' method to display the '''"Hello, world!"''' string at a [[pixel]] offset of ('''65, 95''') from the upper-left corner in the applet's display.@@@@1@32@@danf@17-8-2009
10441610@unknown@formal@none@1@S@@@@@1@16@@danf@17-8-2009
10441620@unknown@formal@none@1@S@An applet is placed in an [[HTML]] document using the '''<applet>''' [[HTML element]].@@@@1@13@@danf@17-8-2009
10441630@unknown@formal@none@1@S@The applet tag has three attributes set: '''code="Hello"''' specifies the name of the Applet class and '''width="200" height="200"''' sets the pixel width and height of the applet.@@@@1@27@@danf@17-8-2009
10441640@unknown@formal@none@1@S@Applets may also be embedded in HTML using either the object or embed element, although support for these elements by Web browsers is inconsistent.@@@@1@24@@danf@17-8-2009
10441650@unknown@formal@none@1@S@However, the applet tag is deprecated, so the object tag is preferred where supported.@@@@1@14@@danf@17-8-2009
10441660@unknown@formal@none@1@S@The host application, typically a Web browser, instantiates the '''Hello''' applet and creates an {{Javadoc:SE|java/applet|AppletContext}} for the applet.@@@@1@18@@danf@17-8-2009
10441670@unknown@formal@none@1@S@Once the applet has initialized itself, it is added to the AWT display hierarchy.@@@@1@14@@danf@17-8-2009
10441680@unknown@formal@none@1@S@The paint method is called by the AWT [[event dispatching thread]] whenever the display needs the applet to draw itself.@@@@1@20@@danf@17-8-2009
10441690@unknown@formal@none@1@S@=== Servlet ===@@@@1@3@@danf@17-8-2009
10441700@unknown@formal@none@1@S@Java Servlet technology provides Web developers with a simple, consistent mechanism for extending the functionality of a Web server and for accessing existing business systems.@@@@1@25@@danf@17-8-2009
10441710@unknown@formal@none@1@S@Servlets are [[server-side]] Java EE components that generate responses (typically [[HTML]] pages) to requests (typically [[HTTP]] requests) from [[client (computing)|client]]s.@@@@1@20@@danf@17-8-2009
10441720@unknown@formal@none@1@S@A servlet can almost be thought of as an applet that runs on the server side, without a user interface.@@@@1@18@@danf@17-8-2009
10441730@unknown@formal@none@1@S@@@@@1@35@@danf@17-8-2009
10441740@unknown@formal@none@1@S@The '''import''' statements direct the Java compiler to include all of the public classes and [[interface (Java)|interfaces]] from the '''{{Javadoc:SE|package=java.io|java/io}}''' and '''{{Javadoc:EE|package=javax.servlet|javax/servlet}}''' [[Java package|packages]] in the compilation.@@@@1@27@@danf@17-8-2009
10441750@unknown@formal@none@1@S@The '''Hello''' class '''extends''' the '''{{Javadoc:EE|javax/servlet|GenericServlet}}''' class; the GenericServlet class provides the interface for the [[server (computing)|server]] to forward requests to the servlet and control the servlet's lifecycle.@@@@1@28@@danf@17-8-2009
10441760@unknown@formal@none@1@S@The Hello class overrides the '''{{Javadoc:EE|name=service(ServletRequest, ServletResponse)|javax/servlet|Servlet|service(javax.servlet.ServletRequest,javax.servlet.ServletResponse)}}''' method defined by the {{Javadoc:EE|javax/servlet|Servlet}} [[Interface (Java)|interface]] to provide the code for the service request handler.@@@@1@23@@danf@17-8-2009
10441770@unknown@formal@none@1@S@The service() method is passed a '''{{Javadoc:EE|javax/servlet|ServletRequest}}''' object that contains the request from the client and a '''{{Javadoc:EE|javax/servlet|ServletResponse}}''' object used to create the response returned to the client.@@@@1@28@@danf@17-8-2009
10441780@unknown@formal@none@1@S@The service() method declares that it '''throws''' the [[exception handling|exceptions]] {{Javadoc:EE|javax/servlet|ServletException}} and {{Javadoc:SE|java/io|IOException}} if a problem prevents it from responding to the request.@@@@1@23@@danf@17-8-2009
10441790@unknown@formal@none@1@S@The '''{{Javadoc:EE|name=setContentType(String)|javax/servlet|ServletResponse|setContentType(java.lang.String)}}''' method in the response object is called to set the [[MIME]] content type of the returned data to '''"text/html"'''.@@@@1@21@@danf@17-8-2009
10441800@unknown@formal@none@1@S@The '''{{Javadoc:EE|name=getWriter()|javax/servlet|ServletResponse|getWriter()}}''' method in the response returns a '''{{Javadoc:SE|java/io|PrintWriter}}''' object that is used to write the data that is sent to the client.@@@@1@23@@danf@17-8-2009
10441810@unknown@formal@none@1@S@The '''{{Javadoc:SE|name=println(String)|java/io|PrintWriter|println(java.lang.String)}}''' method is called to write the '''"Hello, world!"''' string to the response and then the '''{{Javadoc:SE|name=close()|java/io|PrintWriter|close()}}''' method is called to close the print writer, which causes the data that has been written to the stream to be returned to the client.@@@@1@43@@danf@17-8-2009
10441820@unknown@formal@none@1@S@=== JavaServer Page ===@@@@1@4@@danf@17-8-2009
10441830@unknown@formal@none@1@S@JavaServer Pages (JSPs) are [[server-side]] Java EE components that generate responses, typically [[HTML]] pages, to [[HTTP]] requests from [[client (computing)|client]]s.@@@@1@20@@danf@17-8-2009
10441840@unknown@formal@none@1@S@JSPs embed Java code in an HTML page by using the special [[delimiter]]s <% and %>.@@@@1@18@@danf@17-8-2009
10441850@unknown@formal@none@1@S@A JSP is compiled to a Java ''servlet'', a Java application in its own right, the first time it is accessed.@@@@1@21@@danf@17-8-2009
10441860@unknown@formal@none@1@S@After that, the generated servlet creates the response.@@@@1@8@@danf@17-8-2009
10441870@unknown@formal@none@1@S@=== Swing application ===@@@@1@4@@danf@17-8-2009
10441880@unknown@formal@none@1@S@Swing is a graphical user interface [[library (computer science)|library]] for the Java SE platform.@@@@1@14@@danf@17-8-2009
10441890@unknown@formal@none@1@S@This example Swing application creates a single window with "Hello, world!" inside:@@@@1@12@@danf@17-8-2009
10441900@unknown@formal@none@1@S@@@@@1@38@@danf@17-8-2009
10441910@unknown@formal@none@1@S@The first '''import''' statement directs the Java compiler to include the {{Javadoc:SE|java/awt|BorderLayout}} class from the {{Javadoc:SE|package=java.awt|java/awt}} package in the compilation; the second '''import''' includes all of the public classes and interfaces from the '''{{Javadoc:SE|package=javax.swing|javax/swing}}''' package.@@@@1@35@@danf@17-8-2009
10441920@unknown@formal@none@1@S@The '''Hello''' class '''extends''' the '''{{Javadoc:SE|javax/swing|JFrame}}''' class; the JFrame class implements a [[window (computing)|window]] with a [[title bar]] and a close [[Widget (computing)|control]].@@@@1@23@@danf@17-8-2009
10441930@unknown@formal@none@1@S@The '''Hello()''' [[constructor (computer science)|constructor]] initializes the frame by first calling the superclass constructor, passing the parameter "hello", which is used as the window's title.@@@@1@25@@danf@17-8-2009
10441940@unknown@formal@none@1@S@It then calls the '''{{Javadoc:SE|name=setDefaultCloseOperation(int)|javax/swing|JFrame|setDefaultCloseOperation(int)}}''' method inherited from JFrame to set the default operation when the close control on the title bar is selected to '''{{Javadoc:SE|javax/swing|WindowConstants|EXIT_ON_CLOSE}}'''; this causes the JFrame to be disposed of when the frame is closed (as opposed to merely hidden), which allows the JVM to exit and the program to terminate.@@@@1@56@@danf@17-8-2009
10441950@unknown@formal@none@1@S@Next, the [[Layout manager|layout]] of the frame is set to a BorderLayout; this tells Swing how to arrange the components that will be added to the frame.@@@@1@27@@danf@17-8-2009
10441960@unknown@formal@none@1@S@A '''{{Javadoc:SE|javax/swing|JLabel}}''' is created for the string '''"Hello, world!"''' and the '''{{Javadoc:SE|name=add(Component)|java/awt|Container|add(java.awt.Component)}}''' method inherited from the {{Javadoc:SE|java/awt|Container}} superclass is called to add the label to the frame.@@@@1@27@@danf@17-8-2009
10441970@unknown@formal@none@1@S@The '''{{Javadoc:SE|name=pack()|java/awt|Window|pack()}}''' method inherited from the {{Javadoc:SE|java/awt|Window}} superclass is called to size the window and lay out its contents, in the manner indicated by the BorderLayout.@@@@1@26@@danf@17-8-2009
10441980@unknown@formal@none@1@S@The '''main()''' method is called by the JVM when the program starts.@@@@1@12@@danf@17-8-2009
10441990@unknown@formal@none@1@S@It [[Instance (programming)|instantiates]] a new '''Hello''' frame and causes it to be displayed by calling the '''{{Javadoc:SE|name=setVisible(boolean)|java/awt|Component|setVisible(boolean)}}''' method inherited from the {{Javadoc:SE|java/awt|Component}} superclass with the boolean parameter '''true'''.@@@@1@28@@danf@17-8-2009
10442000@unknown@formal@none@1@S@Note that once the frame is displayed, exiting the main method does not cause the program to terminate because the AWT [[event dispatching thread]] remains active until all of the Swing top-level windows have been disposed.@@@@1@36@@danf@17-8-2009
10442010@unknown@formal@none@1@S@== Criticism ==@@@@1@3@@danf@17-8-2009
10442020@unknown@formal@none@1@S@[[Java performance|Java's performance]] has improved substantially since the early versions, and performance of [[JIT compiler]]s relative to native compilers has in some tests been shown to be quite similar.@@@@1@29@@danf@17-8-2009
10442030@unknown@formal@none@1@S@The performance of the compilers does not necessarily indicate the performance of the compiled code; only careful testing can reveal the true performance issues in any system.@@@@1@27@@danf@17-8-2009
10442040@unknown@formal@none@1@S@The default [[look and feel]] of [[Graphical User Interface|GUI]] applications written in Java using the [[Swing (Java)|Swing]] toolkit is very different from native applications.@@@@1@24@@danf@17-8-2009
10442050@unknown@formal@none@1@S@It is possible to specify a different look and feel through the [[pluggable look and feel]] system of Swing.@@@@1@19@@danf@17-8-2009
10442060@unknown@formal@none@1@S@Clones of [[Microsoft Windows|Windows]], [[GTK]] and [[Motif (widget toolkit)|Motif]] are supplied by Sun.@@@@1@13@@danf@17-8-2009
10442070@unknown@formal@none@1@S@[[Apple Computer|Apple]] also provides an [[Aqua (theme)|Aqua]] look and feel for [[Mac OS X]].@@@@1@14@@danf@17-8-2009
10442080@unknown@formal@none@1@S@Though prior implementations of these looks and feels have been considered lacking, Swing in Java SE 6 addresses this problem by using more native [[Widget (computing)|widget]] drawing routines of the underlying platforms.@@@@1@32@@danf@17-8-2009
10442090@unknown@formal@none@1@S@Alternatively, third party toolkits such as [[wx4j]], [[Qt (toolkit)|Qt Jambi]] or [[Standard Widget Toolkit|SWT]] may be used for increased integration with the native windowing system.@@@@1@25@@danf@17-8-2009
10442100@unknown@formal@none@1@S@As in C++ and some other object-oriented languages, variables of Java's [[primitive type]]s were not originally objects.@@@@1@17@@danf@17-8-2009
10442110@unknown@formal@none@1@S@Values of primitive types are either stored directly in fields (for objects) or on the [[Stack-based memory allocation|stack]] (for methods) rather than on the heap, as is the common case for objects (but see [[Escape analysis]]).@@@@1@36@@danf@17-8-2009
10442120@unknown@formal@none@1@S@This was a conscious decision by Java's designers for performance reasons.@@@@1@11@@danf@17-8-2009
10442130@unknown@formal@none@1@S@Because of this, Java was not considered to be a pure object-oriented programming language.@@@@1@14@@danf@17-8-2009
10442140@unknown@formal@none@1@S@However, as of Java 5.0, [[Object type|autoboxing]] lets programmers write code as if primitive values were instances of their wrapper classes, converting freely between the two forms for improved flexibility.@@@@1@36@@danf@17-8-2009
10442150@unknown@formal@none@1@S@Java suppresses several features (such as [[operator overloading]] and [[multiple inheritance]]) for ''classes'' in order to simplify the language, to "save the programmers from themselves", and to prevent possible errors and anti-pattern design.@@@@1@33@@danf@17-8-2009
10442160@unknown@formal@none@1@S@This has been a source of criticism, relating to a lack of low-level features, but some of these limitations may be worked around.@@@@1@23@@danf@17-8-2009
10442170@unknown@formal@none@1@S@Java ''interfaces'' have always had multiple inheritance.@@@@1@7@@danf@17-8-2009
10442180@unknown@formal@none@1@S@== Resources ==@@@@1@3@@danf@17-8-2009
10442190@unknown@formal@none@1@S@=== Java Runtime Environment ===@@@@1@5@@danf@17-8-2009
10442200@unknown@formal@none@1@S@The Java Runtime Environment, or ''JRE'', is the software required to run any [[Application software|application]] deployed on the Java Platform.@@@@1@20@@danf@17-8-2009
10442210@unknown@formal@none@1@S@[[End-user]]s commonly use a JRE in [[Software package (programming)|software package]]s and Web browser [[plugin]]s.@@@@1@14@@danf@17-8-2009
10442220@unknown@formal@none@1@S@Sun also distributes a superset of the JRE called the Java 2 [[SDK]] (more commonly known as the JDK), which includes development tools such as the [[Java compiler]], [[Javadoc]], [[JAR (file format)|Jar]] and [[debugger]].@@@@1@34@@danf@17-8-2009
10442230@unknown@formal@none@1@S@One of the unique advantages of the concept of a runtime engine is that errors (exceptions) should not 'crash' the system.@@@@1@21@@danf@17-8-2009
10442240@unknown@formal@none@1@S@Moreover, in runtime engine environments such as Java, tools exist that attach to the runtime engine and, each time an exception of interest occurs, record the debugging information that was in memory when the exception was thrown (stack and heap values).@@@@1@45@@danf@17-8-2009
10442250@unknown@formal@none@1@S@These [[Automated Exception Handling]] tools provide 'root-cause' information for exceptions in Java programs that run in production, testing or development environments.@@@@1@21@@danf@17-8-2009
10442260@unknown@formal@none@1@S@==== Components ====@@@@1@3@@danf@17-8-2009
10442270@unknown@formal@none@1@S@* Java [[Library (computer science)|libraries]] are the compiled [[byte code]]s of [[source code]] developed by the JRE implementor to support application development in Java.@@@@1@24@@danf@17-8-2009
10442280@unknown@formal@none@1@S@Examples of these libraries are:@@@@1@5@@danf@17-8-2009
10442290@unknown@formal@none@1@S@** The core libraries, which include:@@@@1@6@@danf@17-8-2009
10442300@unknown@formal@none@1@S@*** Collection libraries that implement [[data structure]]s such as [[List (computing)|lists]], [[associative array|dictionaries]], [[tree structure|trees]] and [[Set (computer science)|sets]]@@@@1@19@@danf@17-8-2009
10442310@unknown@formal@none@1@S@*** [[XML]] Processing (Parsing, Transforming, Validating) libraries@@@@1@7@@danf@17-8-2009
10442320@unknown@formal@none@1@S@*** Security@@@@1@2@@danf@17-8-2009
10442330@unknown@formal@none@1@S@*** [[i18n|Internationalization and localization]] libraries@@@@1@5@@danf@17-8-2009
10442340@unknown@formal@none@1@S@** The integration libraries, which allow the application writer to communicate with external systems.@@@@1@14@@danf@17-8-2009
10442350@unknown@formal@none@1@S@These libraries include:@@@@1@3@@danf@17-8-2009
10442360@unknown@formal@none@1@S@*** The [[Java Database Connectivity]] (JDBC) [[Application Programming Interface|API]] for database access@@@@1@12@@danf@17-8-2009
10442370@unknown@formal@none@1@S@*** [[Java Naming and Directory Interface]] (JNDI) for lookup and discovery@@@@1@11@@danf@17-8-2009
10442380@unknown@formal@none@1@S@*** [[Java remote method invocation|RMI]] and [[CORBA]] for distributed application development@@@@1@11@@danf@17-8-2009
10442390@unknown@formal@none@1@S@** [[User Interface]] libraries, which include:@@@@1@6@@danf@17-8-2009
10442400@unknown@formal@none@1@S@*** The (heavyweight, or [[native mode|native]]) [[Abstract Windowing Toolkit]] (AWT), which provides [[graphical user interface|GUI]] components, the means for laying out those components and the means for handling events from those components@@@@1@32@@danf@17-8-2009
10442410@unknown@formal@none@1@S@*** The (lightweight) [[Swing (Java)|Swing]] libraries, which are built on AWT but provide (non-native) implementations of the AWT widgetry@@@@1@19@@danf@17-8-2009
10442420@unknown@formal@none@1@S@*** APIs for audio capture, processing, and playback@@@@1@8@@danf@17-8-2009
10442430@unknown@formal@none@1@S@* A platform dependent implementation of [[Java virtual machine]] (JVM) that is the means by which the byte codes of the Java libraries and third party applications are executed@@@@1@29@@danf@17-8-2009
10442440@unknown@formal@none@1@S@* Plugins, which enable [[Java applet|applet]]s to be run in [[Web browser]]s@@@@1@12@@danf@17-8-2009
10442450@unknown@formal@none@1@S@* [[Java Web Start]], which allows Java applications to be efficiently distributed to [[end user]]s across the [[Internet]]@@@@1@18@@danf@17-8-2009
10442460@unknown@formal@none@1@S@* Licensing and documentation@@@@1@4@@danf@17-8-2009
10442470@unknown@formal@none@1@S@=== APIs ===@@@@1@3@@danf@17-8-2009
10442480@unknown@formal@none@1@S@Sun has defined three platforms targeting different application environments and segmented many of its [[application programming interface|API]]s so that they belong to one of the platforms.@@@@1@26@@danf@17-8-2009
10442490@unknown@formal@none@1@S@The platforms are:@@@@1@3@@danf@17-8-2009
10442500@unknown@formal@none@1@S@* [[Java Platform, Micro Edition]] (Java ME) — targeting environments with limited resources,@@@@1@13@@danf@17-8-2009
10442510@unknown@formal@none@1@S@* [[Java Platform, Standard Edition]] (Java SE) — targeting workstation environments, and@@@@1@12@@danf@17-8-2009
10442520@unknown@formal@none@1@S@* [[Java Platform, Enterprise Edition]] (Java EE) — targeting large distributed enterprise or Internet environments.@@@@1@15@@danf@17-8-2009
10442530@unknown@formal@none@1@S@The [[Class (computer science)|classes]] in the Java APIs are organized into separate groups called [[Java package|packages]].@@@@1@16@@danf@17-8-2009
10442540@unknown@formal@none@1@S@Each package contains a set of related [[Interface (Java)|interface]]s, classes and [[exception handling|exceptions]].@@@@1@13@@danf@17-8-2009
10442550@unknown@formal@none@1@S@Refer to the separate platforms for a description of the packages available.@@@@1@12@@danf@17-8-2009
10442560@unknown@formal@none@1@S@The set of APIs is controlled by Sun Microsystems in cooperation with others through the [[Java Community Process]] program.@@@@1@19@@danf@17-8-2009
10442570@unknown@formal@none@1@S@Companies or individuals participating in this process can influence the design and development of the APIs.@@@@1@16@@danf@17-8-2009
10442580@unknown@formal@none@1@S@This process has been a subject of controversy.@@@@1@8@@danf@17-8-2009
10450010@unknown@formal@none@1@S@Language@@@@1@1@@danf@17-8-2009
10450020@unknown@formal@none@1@S@A '''language''' is a dynamic set of visual, auditory, or tactile [[symbol]]s of [[communication]] and the elements used to manipulate them.@@@@1@21@@danf@17-8-2009
10450030@unknown@formal@none@1@S@''Language'' can also refer to the use of such systems as a general [[phenomenon]].@@@@1@14@@danf@17-8-2009
10450040@unknown@formal@none@1@S@Language is considered to be an exclusively human mode of communication; although other animals make use of quite sophisticated communicative systems, none of these are known to make use of all of the properties that linguists use to define language.@@@@1@40@@danf@17-8-2009
10450050@unknown@formal@none@1@S@== Properties of language ==@@@@1@5@@danf@17-8-2009
10450060@unknown@formal@none@1@S@A set of agreed-upon symbols is only one feature of language; all languages must define the structural relationships between these symbols in a system of [[grammar]].@@@@1@26@@danf@17-8-2009
10450070@unknown@formal@none@1@S@Rules of grammar are what distinguish language from other forms of communication.@@@@1@12@@danf@17-8-2009
10450080@unknown@formal@none@1@S@They allow a finite set of symbols to be manipulated to create a potentially infinite number of grammatical utterances.@@@@1@19@@danf@17-8-2009
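As a rough illustration of how a finite rule set can yield an unbounded number of utterances, the sketch below uses a hypothetical toy grammar. The rule names `S`, `NP`, `VP`, `N`, `V` and the small vocabulary are assumptions made for this example and do not describe any real language. Because a noun phrase may contain a verb phrase, which may in turn contain another noun phrase, the set of distinct sentences keeps growing as deeper derivations are allowed.

```python
import itertools

# Hypothetical toy grammar: a finite set of rewrite rules over a finite vocabulary.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "that", "VP"]],  # second rule is recursive via VP
    "VP": [["V"], ["V", "NP"]],                          # second rule is recursive via NP
    "N": [["dog"], ["cat"]],
    "V": [["sees"], ["sleeps"]],
}


def expand(symbol, depth):
    """Return the set of word tuples derivable from `symbol` using at most
    `depth` nested rule applications."""
    if symbol not in GRAMMAR:      # a terminal word stands for itself
        return {(symbol,)}
    if depth == 0:                 # no rule applications left
        return set()
    results = set()
    for rule in GRAMMAR[symbol]:
        parts = [expand(s, depth - 1) for s in rule]
        # Combine every way of expanding each symbol on the rule's right-hand side.
        for combo in itertools.product(*parts):
            results.add(tuple(word for part in combo for word in part))
    return results


# The number of distinct sentences grows with the permitted derivation depth.
sizes = [len(expand("S", d)) for d in range(3, 6)]
```

The grammar itself never changes; only the depth bound does, which is the sense in which a finite system can describe a potentially infinite set of sentences.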
10450090@unknown@formal@none@1@S@Another property of language is that its symbols are [[arbitrary]].@@@@1@10@@danf@17-8-2009
10450100@unknown@formal@none@1@S@Any concept or grammatical rule can be mapped onto a symbol.@@@@1@11@@danf@17-8-2009
10450110@unknown@formal@none@1@S@Most languages make use of sound, but the combinations of sounds used do not have any ''inherent'' meaning – they are merely an agreed-upon convention to represent a certain thing by users of that language.@@@@1@35@@danf@17-8-2009
10450120@unknown@formal@none@1@S@For instance, there is nothing about the [[Spanish language|Spanish]] [[word]] ''{{lang|es|nada}}'' itself that forces Spanish speakers to convey the idea of "nothing".@@@@1@22@@danf@17-8-2009
10450130@unknown@formal@none@1@S@Another set of sounds (for example, the English word ''nothing'') could equally be used to represent the same concept, but all Spanish speakers have acquired or learned to correlate this meaning for this particular sound pattern.@@@@1@36@@danf@17-8-2009
10450140@unknown@formal@none@1@S@For [[Slovene language|Slovenian]], [[Croatian language|Croatian]], [[Serbian language|Serbian/Kosovan]] or [[Bosnian language|Bosnian]] speakers on the other hand, ''{{lang|hr|nada}}'' means something else; it means "hope".@@@@1@22@@danf@17-8-2009
10450150@unknown@formal@none@1@S@==The study of language==@@@@1@4@@danf@17-8-2009
10450160@unknown@formal@none@1@S@===Linguistics===@@@@1@1@@danf@17-8-2009
10450170@unknown@formal@none@1@S@[[Linguistics]] is the [[science|scientific]] and [[philosophy|philosophical]] study of language, encompassing a number of sub-fields.@@@@1@14@@danf@17-8-2009
10450180@unknown@formal@none@1@S@At the core of [[theoretical linguistics]] are the study of language structure ([[grammar]]) and the study of meaning ([[semantics]]).@@@@1@19@@danf@17-8-2009
10450190@unknown@formal@none@1@S@The first of these encompasses [[morphology (linguistics)|morphology]] (the formation and composition of [[word]]s), [[syntax]] (the rules that determine how words combine into [[phrase]]s and [[Sentence (linguistics)|sentences]]) and [[phonology]] (the study of sound systems and abstract sound units).@@@@1@37@@danf@17-8-2009
10450200@unknown@formal@none@1@S@[[Phonetics]] is a related branch of linguistics concerned with the actual properties of speech sounds ([[phone]]s), non-speech sounds, and how they are produced and [[speech perception|perceived]].@@@@1@26@@danf@17-8-2009
10450210@unknown@formal@none@1@S@[[Theoretical linguistics]] is mostly concerned with developing models of linguistic knowledge.@@@@1@11@@danf@17-8-2009
10450220@unknown@formal@none@1@S@The fields that are generally considered as the core of theoretical linguistics are [[syntax]], [[phonology]], [[Morphology (linguistics)|morphology]], and [[semantics]].@@@@1@19@@danf@17-8-2009
10450230@unknown@formal@none@1@S@[[Applied linguistics]] attempts to put linguistic theories into practice through areas like [[translation]], [[Stylistics (linguistics)|stylistics]], [[literary criticism]] and [[Literary theory|theory]], [[discourse analysis]], [[speech therapy]], speech pathology and [[Second language acquisition|foreign language teaching]].@@@@1@32@@danf@17-8-2009
10450240@unknown@formal@none@1@S@===History===@@@@1@1@@danf@17-8-2009
10450250@unknown@formal@none@1@S@The historical record of [[linguistics]] begins in [[India]] with [[Pāṇini]], the [[5th century BCE]] grammarian who formulated 3,959 rules of [[Sanskrit language|Sanskrit]] [[morphology (linguistics)|morphology]], known as the ''{{IAST|[[Aṣṭādhyāyī]]}}'' (अष्टाध्यायी), and with [[Tolkāppiyar]], the [[3rd century BCE]] grammarian of the [[Tamil language|Tamil]] work [[Tolkāppiyam]]. Pāṇini's grammar is highly systematized and technical.@@@@1@49@@danf@17-8-2009
10450260@unknown@formal@none@1@S@Inherent in its analytic approach are the concepts of the [[phoneme]], the [[morpheme]], and the [[Root (linguistics)|root]]; Western linguists only recognized the phoneme some two millennia later.@@@@1@27@@danf@17-8-2009
10450270@unknown@formal@none@1@S@Tolkāppiyar's work is perhaps the first to describe [[articulatory phonetics]] for a language.@@@@1@13@@danf@17-8-2009
10450280@unknown@formal@none@1@S@Its classification of the alphabet into [[consonant]]s and [[vowel]]s, and its grouping of elements such as nouns and verbs into classes, were also breakthroughs at the time.@@@@1@28@@danf@17-8-2009
10450290@unknown@formal@none@1@S@In the [[Middle East]], the [[Persian Empire|Persian]] linguist [[Sibawayh]] (سیبویه) made a detailed and professional description of [[Arabic language|Arabic]] in 760 CE in his monumental work, ''Al-kitab fi al-nahw'' (الكتاب في النحو, ''The Book on Grammar''), bringing many [[Linguistics|linguistic]] aspects of language to light.@@@@1@44@@danf@17-8-2009
10450300@unknown@formal@none@1@S@In his book, he distinguished [[phonetics]] from [[phonology]].@@@@1@8@@danf@17-8-2009
10450310@unknown@formal@none@1@S@Later in the West, the success of [[science]], [[mathematics]], and other [[formal system]]s in the 20th century led many to attempt a formalization of the study of language as a "semantic code".@@@@1@32@@danf@17-8-2009
10450320@unknown@formal@none@1@S@This resulted in the [[academic discipline]] of [[linguistics]], the founding of which is attributed to [[Ferdinand de Saussure]].@@@@1@18@@danf@17-8-2009
10450330@unknown@formal@none@1@S@In the 20th century, substantial contributions to the understanding of language came from [[Ferdinand de Saussure]], [[Hjelmslev]], [[Émile Benveniste]] and [[Roman Jakobson]], whose work is characterized as being highly [[systematic]].@@@@1@29@@danf@17-8-2009
10450340@unknown@formal@none@1@S@== Human languages ==@@@@1@4@@danf@17-8-2009
10450350@unknown@formal@none@1@S@Human languages are usually referred to as natural languages, and the science of studying them falls under the purview of [[linguistics]].@@@@1@21@@danf@17-8-2009
10450360@unknown@formal@none@1@S@A common progression for natural languages is that they are considered to be first spoken, then written, and then an understanding and explanation of their grammar is attempted.@@@@1@28@@danf@17-8-2009
10450370@unknown@formal@none@1@S@Languages live, die, move from place to place, and change with time.@@@@1@12@@danf@17-8-2009
10450380@unknown@formal@none@1@S@Any language that ceases to change or develop is categorized as a [[dead language]].@@@@1@14@@danf@17-8-2009
10450390@unknown@formal@none@1@S@Conversely, any language that is in a continuous state of change is a ''living language'', also known as a [[modern language]].@@@@1@24@@danf@17-8-2009
10450400@unknown@formal@none@1@S@Making a principled distinction between one language and another is usually impossible.@@@@1@12@@danf@17-8-2009
10450410@unknown@formal@none@1@S@For instance, there are a few [[dialect]]s of [[German language|German]] similar to some dialects of [[Dutch language|Dutch]].@@@@1@17@@danf@17-8-2009
10450420@unknown@formal@none@1@S@The transition between languages within the same [[language family]] is sometimes gradual (see [[dialect continuum]]).@@@@1@15@@danf@17-8-2009
10450430@unknown@formal@none@1@S@Some like to make parallels with [[biology]], where it is not possible to make a well-defined distinction between one species and the next.@@@@1@23@@danf@17-8-2009
10450440@unknown@formal@none@1@S@In either case, the ultimate difficulty may stem from the [[interaction]]s between languages and [[population]]s.@@@@1@15@@danf@17-8-2009
10450450@unknown@formal@none@1@S@(See [[Dialect]] or [[August Schleicher]] for a longer discussion.)@@@@1@9@@danf@17-8-2009
10450460@unknown@formal@none@1@S@The concepts of [[Ausbausprache - Abstandsprache - Dachsprache|Ausbausprache, Abstandsprache and Dachsprache]] are used to make finer distinctions about the degrees of difference between languages or dialects.@@@@1@26@@danf@17-8-2009
10450470@unknown@formal@none@1@S@==Artificial languages==@@@@1@2@@danf@17-8-2009
10450480@unknown@formal@none@1@S@=== Constructed languages ===@@@@1@4@@danf@17-8-2009
10450490@unknown@formal@none@1@S@Some individuals and groups have constructed their own artificial languages, for practical, experimental, personal, or ideological reasons.@@@@1@17@@danf@17-8-2009
10450500@unknown@formal@none@1@S@International auxiliary languages are generally constructed languages that strive to be easier to learn than natural languages; other constructed languages strive to be more logical ("loglangs") than natural languages; a prominent example of this is [[Lojban]].@@@@1@36@@danf@17-8-2009
10450510@unknown@formal@none@1@S@Some writers, such as [[J. R. R. Tolkien]], have created fantasy languages, for literary, [[Artistic language|artistic]] or personal reasons.@@@@1@19@@danf@17-8-2009
10450520@unknown@formal@none@1@S@The fantasy language of the [[Klingon]] race, including its vocabulary and grammar, has in recent years been developed by fans of the Star Trek series.@@@@1@25@@danf@17-8-2009
10450530@unknown@formal@none@1@S@Constructed languages are not necessarily restricted to the properties shared by natural languages.@@@@1@13@@danf@17-8-2009
10450540@unknown@formal@none@1@S@This part of ISO 639 also includes identifiers that denote constructed (or artificial) languages.@@@@1@14@@danf@17-8-2009
10450550@unknown@formal@none@1@S@In order to qualify for inclusion the language must have a literature and it must be designed for the purpose of human communication.@@@@1@23@@danf@17-8-2009
10450560@unknown@formal@none@1@S@Specifically excluded are reconstructed languages and computer programming languages.@@@@1@9@@danf@17-8-2009
10450570@unknown@formal@none@1@S@===International auxiliary languages===@@@@1@3@@danf@17-8-2009
10450580@unknown@formal@none@1@S@Some languages, most of them constructed, are meant specifically for communication between people of different nationalities or language groups as an easy-to-learn second language.@@@@1@22@@danf@17-8-2009
10450590@unknown@formal@none@1@S@Several of these languages have been constructed by individuals or groups.@@@@1@11@@danf@17-8-2009
10450600@unknown@formal@none@1@S@Natural, pre-existing languages may also be used in this way - their developers merely catalogued and standardized their vocabulary and identified their grammatical rules.@@@@1@24@@danf@17-8-2009
10450610@unknown@formal@none@1@S@These languages are called ''naturalistic.''@@@@1@5@@danf@17-8-2009
10450620@unknown@formal@none@1@S@One such language, [[Latino Sine Flexione]], is a simplified form of Latin.@@@@1@12@@danf@17-8-2009
10450630@unknown@formal@none@1@S@Two others, [[Occidental language|Occidental]] and [[Novial]], were drawn from several Western languages.@@@@1@12@@danf@17-8-2009
10450640@unknown@formal@none@1@S@To date, the most successful auxiliary language is [[Esperanto]], invented by Polish ophthalmologist [[L. L. Zamenhof|Zamenhof]].@@@@1@16@@danf@17-8-2009
10450650@unknown@formal@none@1@S@It has a relatively large community, roughly estimated at 2 million speakers worldwide, and a large body of literature and songs; it is the only known constructed language to have [[Native Esperanto speakers|native speakers]], such as the Hungarian-born American businessman [[George Soros]].@@@@1@42@@danf@17-8-2009
10450660@unknown@formal@none@1@S@Other auxiliary languages with a relatively large number of speakers and literature are [[Interlingua]] and [[Ido]].@@@@1@16@@danf@17-8-2009
10450670@unknown@formal@none@1@S@===Controlled languages===@@@@1@2@@danf@17-8-2009
10450680@unknown@formal@none@1@S@Controlled natural languages are subsets of natural languages whose grammars and dictionaries have been restricted in order to reduce or eliminate both ambiguity and complexity.@@@@1@25@@danf@17-8-2009
10450690@unknown@formal@none@1@S@The purpose behind the development and implementation of a controlled natural language typically is to aid non-native speakers of a natural language in understanding it, or to ease computer processing of a natural language.@@@@1@34@@danf@17-8-2009
10450700@unknown@formal@none@1@S@An example of a widely used controlled natural language is [[Simplified English]], which was originally developed for [[aerospace]] industry maintenance manuals.@@@@1@21@@danf@17-8-2009
10450710@unknown@formal@none@1@S@== Formal languages ==@@@@1@4@@danf@17-8-2009
10450720@unknown@formal@none@1@S@[[Mathematics]] and [[computer science]] use artificial entities called formal languages (including [[programming language]]s and [[markup language]]s, and some that are more theoretical in nature).@@@@1@24@@danf@17-8-2009
10450730@unknown@formal@none@1@S@These often take the form of [[character string]]s, produced by a combination of [[formal grammar]] and semantics of arbitrary complexity.@@@@1@20@@danf@17-8-2009
10450740@unknown@formal@none@1@S@=== Programming languages ===@@@@1@4@@danf@17-8-2009
10450750@unknown@formal@none@1@S@A programming language is an extreme case of a formal language that can be used to control the behavior of a machine, particularly a computer, to perform specific tasks.@@@@1@29@@danf@17-8-2009
10450760@unknown@formal@none@1@S@Programming languages are defined using syntactic and semantic rules, to determine structure and meaning respectively.@@@@1@15@@danf@17-8-2009
10450770@unknown@formal@none@1@S@Programming languages are used to facilitate communication about the task of organizing and manipulating information, and to express algorithms precisely.@@@@1@20@@danf@17-8-2009
10450780@unknown@formal@none@1@S@Some authors restrict the term "programming language" to those languages that can express all possible algorithms; sometimes the term "computer language" is used for artificial languages that are more limited.@@@@1@30@@danf@17-8-2009
10450790@unknown@formal@none@1@S@== Animal communication ==@@@@1@4@@danf@17-8-2009
10450800@unknown@formal@none@1@S@The term "[[animal language]]s" is often used for non-human languages.@@@@1@10@@danf@17-8-2009
10450810@unknown@formal@none@1@S@Linguists do not consider these to be "language", but describe them as [[animal communication]], because the interaction between animals in such communication is fundamentally different in its underlying principles from human language.@@@@1@32@@danf@17-8-2009
10450820@unknown@formal@none@1@S@Nevertheless, some scholars have tried to disprove this mainstream premise through experiments on training chimpanzees to talk.@@@@1@17@@danf@17-8-2009
10450830@unknown@formal@none@1@S@[[Karl von Frisch]] received the Nobel Prize in 1973 for his proof of the language and dialects of the bees.@@@@1@20@@danf@17-8-2009
10450840@unknown@formal@none@1@S@In several publicized instances, non-human animals have been taught to understand certain features of human language.@@@@1@16@@danf@17-8-2009
10450850@unknown@formal@none@1@S@[[Chimpanzee]]s, [[gorilla]]s, and [[orangutan]]s have been taught hand signs based on [[American Sign Language]].@@@@1@14@@danf@17-8-2009
10450860@unknown@formal@none@1@S@The [[African Grey Parrot]], which possesses the ability to mimic human speech with a high degree of accuracy, is suspected of having sufficient intelligence to comprehend some of the speech it mimics.@@@@1@32@@danf@17-8-2009
10450870@unknown@formal@none@1@S@Most species of [[parrot]], despite expert mimicry, are believed to have no linguistic comprehension at all.@@@@1@16@@danf@17-8-2009
10450880@unknown@formal@none@1@S@While proponents of animal communication systems have debated levels of [[semantics]], these systems have not been found to have anything approaching human language [[syntax]].@@@@1@24@@danf@17-8-2009
10460010@unknown@formal@none@1@S@Language model@@@@1@2@@danf@17-8-2009
10460020@unknown@formal@none@1@S@A statistical '''language model''' assigns a [[probability]] to a sequence of ''m'' words by means of a [[probability distribution]].@@@@1@20@@danf@17-8-2009
10460030@unknown@formal@none@1@S@Language modeling is used in many [[natural language processing]] applications such as [[speech recognition]], [[machine translation]], [[part-of-speech tagging]], [[parsing]] and [[information retrieval]].@@@@1@22@@danf@17-8-2009
10460040@unknown@formal@none@1@S@In [[speech recognition]] and in [[data compression]], such a model tries to capture the properties of a language, and to predict the next word in a speech sequence.@@@@1@28@@danf@17-8-2009
10460050@unknown@formal@none@1@S@When used in information retrieval, a language model is associated with a [[document]] in a collection.@@@@1@16@@danf@17-8-2009
10460060@unknown@formal@none@1@S@With query ''Q'' as input, retrieved documents are ranked based on the probability that the document's language model would generate the terms of the query, ''P(Q|Md)''.@@@@1@26@@danf@17-8-2009
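The ranking described above can be sketched as follows. This is a minimal illustration, assuming a unigram document model estimated by relative frequency and no smoothing; the helper names `unigram_model` and `query_likelihood` and the three sample documents are hypothetical.

```python
from collections import Counter


def unigram_model(text):
    """Maximum-likelihood unigram model: word -> relative frequency in the document."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}


def query_likelihood(query, model):
    """P(Q | Md) under a unigram model: the product of the per-term probabilities."""
    probability = 1.0
    for term in query.lower().split():
        # Terms the document never uses get probability 0 (no smoothing here).
        probability *= model.get(term, 0.0)
    return probability


# Hypothetical three-document collection.
docs = {
    "d1": "the cat sat on the mat",
    "d2": "the dog chased the cat",
    "d3": "stock prices fell sharply",
}
models = {name: unigram_model(text) for name, text in docs.items()}

# Rank documents by the probability that their model generates the query.
ranking = sorted(
    docs,
    key=lambda name: query_likelihood("the cat", models[name]),
    reverse=True,
)
```

Without smoothing, any document that lacks a single query term scores zero, which is one practical face of the sparseness problem.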
10460070@unknown@formal@none@1@S@Estimating the probability of sequences can become difficult in [[corpora]], in which [[phrase]]s or [[Sentence (linguistics)|sentence]]s can be arbitrarily long and hence some sequences are not observed during [[training]] of the language model ([[data sparseness problem]] of [[overfitting]]).@@@@1@38@@danf@17-8-2009
10460080@unknown@formal@none@1@S@For that reason these models are often approximated using smoothed [[N-gram]] models.@@@@1@12@@danf@17-8-2009
10460090@unknown@formal@none@1@S@== N-gram models ==@@@@1@4@@danf@17-8-2009
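10460100@unknown@formal@none@1@S@In an n-gram model, the probability of observing the sentence <math>w_1,\ldots,w_m</math> is approximated as@@@@1@15@@danf@17-8-2009
10460110@unknown@formal@none@1@S@<math>P(w_1,\ldots,w_m) \approx \prod_{i=1}^{m} P(w_i \mid w_{i-(n-1)},\ldots,w_{i-1})</math>@@@@1@9@@danf@17-8-2009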
10460120@unknown@formal@none@1@S@Here, it is assumed that the probability of observing the ''i''th word ''w''<sub>''i''</sub> in the context history of the preceding ''i''−1 words can be approximated by the probability of observing it in the shortened context history of the preceding ''n''−1 words (''n''th-order [[Markov property]]).@@@@1@45@@danf@17-8-2009
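10460130@unknown@formal@none@1@S@The conditional probability can be calculated from n-gram frequency counts: <math>P(w_i \mid w_{i-(n-1)},\ldots,w_{i-1}) = \frac{\mathrm{count}(w_{i-(n-1)},\ldots,w_{i-1},w_i)}{\mathrm{count}(w_{i-(n-1)},\ldots,w_{i-1})}</math>@@@@1@15@@danf@17-8-2009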
10460140@unknown@formal@none@1@S@The terms '''bigram''' and '''trigram''' language model denote n-gram language models with ''n'' = 2 and ''n'' = 3, respectively.@@@@1@16@@danf@17-8-2009
10460150@unknown@formal@none@1@S@=== Example ===@@@@1@3@@danf@17-8-2009
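10460160@unknown@formal@none@1@S@In a bigram (n=2) language model, the probability of the sentence ''I saw the red house'' is approximated as <math>P(I,saw,the,red,house) \approx P(I)\,P(saw \mid I)\,P(the \mid saw)\,P(red \mid the)\,P(house \mid red)</math>@@@@1@28@@danf@17-8-2009
10460170@unknown@formal@none@1@S@whereas in a trigram (n=3) language model, the approximation is <math>P(I,saw,the,red,house) \approx P(I)\,P(saw \mid I)\,P(the \mid I,saw)\,P(red \mid saw,the)\,P(house \mid the,red)</math>@@@@1@19@@danf@17-8-2009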
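The bigram case can be sketched in a few lines. This is an illustrative example only: the three-sentence corpus is invented, the estimates are plain frequency ratios with no smoothing, and the first word of a sentence is scored by its unigram relative frequency. The function names `bigram_prob` and `sentence_prob` are hypothetical.

```python
from collections import Counter

# Hypothetical toy training corpus (lower-cased, whitespace-tokenized).
corpus = [
    "i saw the red house",
    "i saw the dog",
    "the red house is big",
]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    words = sentence.split()
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))  # adjacent word pairs

total_words = sum(unigrams.values())


def bigram_prob(prev, word):
    """Estimate P(word | prev) as count(prev, word) / count(prev)."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]


def sentence_prob(sentence):
    """Bigram approximation: P(w1) times the product of P(w_i | w_{i-1})."""
    words = sentence.split()
    probability = unigrams[words[0]] / total_words
    for prev, word in zip(words, words[1:]):
        probability *= bigram_prob(prev, word)
    return probability
```

With these counts, `sentence_prob("i saw the big house")` is zero because the pair ("the", "big") never occurs in the corpus, even though every individual word does; this is the situation that smoothing is meant to address.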
10470010@unknown@formal@none@1@S@Latent semantic analysis@@@@1@3@@danf@17-8-2009
10470020@unknown@formal@none@1@S@'''Latent semantic analysis (LSA)''' is a technique in [[natural language processing]], in particular in [[vectorial semantics]], of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms.@@@@1@41@@danf@17-8-2009
10470030@unknown@formal@none@1@S@LSA was patented in [[1988]] ([http://patft.uspto.gov/netacgi/nph-Parser?patentnumber=4839853 US Patent 4,839,853]) by [[Scott Deerwester]], [[Susan Dumais]], [[George Furnas]], [[Richard Harshman]], [[Thomas Landauer]], [[Karen Lochbaum]] and [[Lynn Streeter]].@@@@1@25@@danf@17-8-2009
10470040@unknown@formal@none@1@S@In the context of its application to [[information retrieval]], it is sometimes called '''latent semantic indexing (LSI)'''.@@@@1@17@@danf@17-8-2009
10470050@unknown@formal@none@1@S@== Occurrence matrix ==@@@@1@4@@danf@17-8-2009
10470060@unknown@formal@none@1@S@LSA can use a [[term-document matrix]], which describes the occurrences of terms in documents; it is a [[sparse matrix]] whose rows correspond to [[terminology|terms]] (typically [[stemming|stemmed]] words that appear in the documents) and whose columns correspond to documents.@@@@1@38@@danf@17-8-2009
10470070@unknown@formal@none@1@S@A typical example of the weighting of the elements of the matrix is [[tf-idf]] (term frequency–inverse document frequency): the element of the matrix is proportional to the number of times the terms appear in each document, where rare terms are upweighted to reflect their relative importance.@@@@1@46@@danf@17-8-2009
10470080@unknown@formal@none@1@S@This matrix is also common to standard semantic models, though it is not necessarily explicitly expressed as a matrix, since the mathematical properties of matrices are not always used.@@@@1@29@@danf@17-8-2009
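A small weighted occurrence matrix can be built as in the sketch below. It assumes one common tf-idf variant (raw term count multiplied by the logarithm of the inverse document frequency); other weightings exist, and the three sample documents are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical collection; each document is already tokenized by whitespace.
docs = [
    "doctor treats patient",
    "physician treats patient",
    "stock market falls",
]
tokenized = [doc.split() for doc in docs]
n_docs = len(tokenized)

# Rows of the matrix are terms, columns are documents.
terms = sorted({word for doc in tokenized for word in doc})
counts = [Counter(doc) for doc in tokenized]

# Document frequency: in how many documents each term appears.
doc_freq = {term: sum(term in doc for doc in tokenized) for term in terms}

# Entry (term, document) = term count * log(N / document frequency),
# so terms that occur in few documents receive a larger weight.
matrix = [
    [counts[j][term] * math.log(n_docs / doc_freq[term]) for j in range(n_docs)]
    for term in terms
]
```

Most entries are zero because each document uses only a few of the terms, which is why the matrix is described as sparse. In the first column, "doctor" (found in one document) outweighs "treats" (found in two).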
10470090@unknown@formal@none@1@S@LSA transforms the occurrence matrix into a relation between the terms and some ''concepts'', and a relation between those concepts and the documents.@@@@1@23@@danf@17-8-2009
10470100@unknown@formal@none@1@S@Thus the terms and documents are now indirectly related through the concepts.@@@@1@12@@danf@17-8-2009
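The tf-idf weighting described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the toy documents, the function name `tfidf_matrix`, and the choice of log(N/df) as the inverse document frequency are all assumptions made for the example, and other idf variants are in common use.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Build a term-document matrix with tf-idf weights.

    docs is a list of token lists. Returns (terms, matrix) where
    matrix[i][j] is the weight of terms[i] in document j, so rows are
    terms and columns are documents.
    """
    counts = [Counter(doc) for doc in docs]
    terms = sorted(set().union(*docs))
    n_docs = len(docs)
    matrix = []
    for term in terms:
        # Document frequency: number of documents containing the term.
        df = sum(1 for c in counts if term in c)
        # Assumed idf variant: log(N / df); it is 0 for a term found everywhere.
        idf = math.log(n_docs / df)
        matrix.append([c[term] * idf for c in counts])
    return terms, matrix


# Hypothetical, already-tokenized toy corpus.
docs = [
    ["car", "truck", "car"],
    ["car", "truck", "engine"],
    ["car", "flower", "flower"],
]
terms, X = tfidf_matrix(docs)

for term, row in zip(terms, X):
    print(term, [round(w, 3) for w in row])
```

In this toy corpus "car" occurs in every document, so its row is all zeros, while "flower", which occurs twice in a single document, receives the largest weight.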
10470110@unknown@formal@none@1@S@== Applications ==@@@@1@3@@danf@17-8-2009
10470120@unknown@formal@none@1@S@The new concept space typically can be used to:@@@@1@9@@danf@17-8-2009
10470130@unknown@formal@none@1@S@* Compare the documents in the concept space ([[data clustering]], [[document classification]]).@@@@1@12@@danf@17-8-2009
10470140@unknown@formal@none@1@S@* Find similar documents across languages, after analyzing a base set of translated documents ([[cross language retrieval]]).@@@@1@17@@danf@17-8-2009
10470150@unknown@formal@none@1@S@* Find relations between terms ([[synonymy]] and [[polysemy]]).@@@@1@8@@danf@17-8-2009
10470160@unknown@formal@none@1@S@* Given a query of terms, translate it into the concept space, and find matching documents ([[information retrieval]]).@@@@1@18@@danf@17-8-2009
10470170@unknown@formal@none@1@S@Synonymy and polysemy are fundamental problems in [[natural language processing]]:@@@@1@10@@danf@17-8-2009
10470180@unknown@formal@none@1@S@* Synonymy is the phenomenon where different words describe the same idea.@@@@1@12@@danf@17-8-2009
10470190@unknown@formal@none@1@S@Thus, a query in a search engine may fail to retrieve a relevant document that does not contain the words which appeared in the query.@@@@1@25@@danf@17-8-2009
10470200@unknown@formal@none@1@S@For example, a search for "doctors" may not return a document containing the word "physicians", even though the words have the same meaning.@@@@1@23@@danf@17-8-2009
10470210@unknown@formal@none@1@S@* Polysemy is the phenomenon where the same word has multiple meanings.@@@@1@12@@danf@17-8-2009
10470220@unknown@formal@none@1@S@So a search may retrieve irrelevant documents containing the desired words in the wrong meaning.@@@@1@15@@danf@17-8-2009
10470230@unknown@formal@none@1@S@For example, a botanist and a computer scientist looking for the word "tree" probably desire different sets of documents.@@@@1@19@@danf@17-8-2009
10470240@unknown@formal@none@1@S@== Rank lowering ==@@@@1@4@@danf@17-8-2009
10470250@unknown@formal@none@1@S@After the construction of the occurrence matrix, LSA finds a low-[[rank (matrix theory)|rank]] approximation to the [[term-document matrix]].@@@@1@18@@danf@17-8-2009
10470260@unknown@formal@none@1@S@There could be various reasons for these approximations:@@@@1@8@@danf@17-8-2009
10470270@unknown@formal@none@1@S@* The original term-document matrix is presumed too large for the computing resources; in this case, the approximated low rank matrix is interpreted as an ''approximation'' (a "least and necessary evil").@@@@1@31@@danf@17-8-2009
10470280@unknown@formal@none@1@S@* The original term-document matrix is presumed ''noisy'': for example, anecdotal instances of terms are to be eliminated.@@@@1@18@@danf@17-8-2009
10470290@unknown@formal@none@1@S@From this point of view, the approximated matrix is interpreted as a ''de-noisified matrix'' (a better matrix than the original).@@@@1@20@@danf@17-8-2009
10470300@unknown@formal@none@1@S@* The original term-document matrix is presumed overly [[Sparse matrix|sparse]] relative to the "true" term-document matrix.@@@@1@16@@danf@17-8-2009
10470310@unknown@formal@none@1@S@That is, the original matrix lists only the words actually ''in'' each document, whereas we might be interested in all words ''related to'' each document--generally a much larger set due to [[synonymy]].@@@@1@32@@danf@17-8-2009
10470320@unknown@formal@none@1@S@The consequence of the rank lowering is that some dimensions are combined and depend on more than one term:@@@@1@19@@danf@17-8-2009
10470330@unknown@formal@none@1@S@:: {(car), (truck), (flower)} --> {(1.3452 * car + 0.2828 * truck), (flower)}@@@@1@13@@danf@17-8-2009
10470340@unknown@formal@none@1@S@This mitigates synonymy, as the rank lowering is expected to merge the dimensions associated with terms that have similar meanings.@@@@1@20@@danf@17-8-2009
10470350@unknown@formal@none@1@S@It also mitigates polysemy, since components of polysemous words that point in the "right" direction are added to the components of words that share a similar meaning.@@@@1@27@@danf@17-8-2009
10470360@unknown@formal@none@1@S@Conversely, components that point in other directions tend to either simply cancel out, or, at worst, to be smaller than components in the directions corresponding to the intended sense.@@@@1@29@@danf@17-8-2009
10470370@unknown@formal@none@1@S@== Derivation ==@@@@1@3@@danf@17-8-2009
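10470380@unknown@formal@none@1@S@Let <math>X</math> be a matrix where element <math>(i,j)</math> describes the occurrence of term <math>i</math> in document <math>j</math> (this can be, for example, the frequency).@@@@1@24@@danf@17-8-2009
10470385@unknown@formal@none@1@S@<math>X</math> will look like this:@@@@1@5@@danf@17-8-2009
10470390@unknown@formal@none@1@S@:<math>\begin{matrix} & \textbf{d}_j \\ & \downarrow \\ \textbf{t}_i^T \rightarrow & \begin{bmatrix} x_{1,1} & \dots & x_{1,n} \\ \vdots & \ddots & \vdots \\ x_{m,1} & \dots & x_{m,n} \end{bmatrix} \end{matrix}</math>@@@@1@33@@danf@17-8-2009
10470400@unknown@formal@none@1@S@Now a row in this matrix will be a vector corresponding to a term, giving its relation to each document:@@@@1@20@@danf@17-8-2009
10470410@unknown@formal@none@1@S@:<math>\textbf{t}_i^T = \begin{bmatrix} x_{i,1} & \dots & x_{i,n} \end{bmatrix}</math>@@@@1@9@@danf@17-8-2009
10470420@unknown@formal@none@1@S@Likewise, a column in this matrix will be a vector corresponding to a document, giving its relation to each term:@@@@1@20@@danf@17-8-2009
10470430@unknown@formal@none@1@S@:<math>\textbf{d}_j = \begin{bmatrix} x_{1,j} \\ \vdots \\ x_{m,j} \end{bmatrix}</math>@@@@1@9@@danf@17-8-2009
10470440@unknown@formal@none@1@S@Now the [[dot product]] <math>\textbf{t}_i^T \textbf{t}_p</math> between two term vectors gives the [[correlation]] between the terms over the documents.@@@@1@19@@danf@17-8-2009
10470450@unknown@formal@none@1@S@The [[matrix product]] <math>X X^T</math> contains all these dot products.@@@@1@10@@danf@17-8-2009
10470460@unknown@formal@none@1@S@Element <math>(i,p)</math> (which is equal to element <math>(p,i)</math>) contains the dot product <math>\textbf{t}_i^T \textbf{t}_p</math> (<math>= \textbf{t}_p^T \textbf{t}_i</math>).@@@@1@18@@danf@17-8-2009
10470470@unknown@formal@none@1@S@Likewise, the matrix <math>X^T X</math> contains the dot products between all the document vectors, giving their correlation over the terms: <math>\textbf{d}_j^T \textbf{d}_q = \textbf{d}_q^T \textbf{d}_j</math>.@@@@1@25@@danf@17-8-2009
10470480@unknown@formal@none@1@S@Now assume that there exists a decomposition of <math>X</math> such that <math>U</math> and <math>V</math> are [[orthonormal matrix|orthonormal matrices]] and <math>\Sigma</math> is a [[diagonal matrix]].@@@@1@24@@danf@17-8-2009
10470490@unknown@formal@none@1@S@This is called a [[singular value decomposition]] (SVD):@@@@1@8@@danf@17-8-2009
10470500@unknown@formal@none@1@S@:<math>X = U \Sigma V^T</math>@@@@1@7@@danf@17-8-2009
10470510@unknown@formal@none@1@S@The matrix products giving us the term and document correlations then become@@@@1@12@@danf@17-8-2009
10470520@unknown@formal@none@1@S@:<math>\begin{matrix} X X^T &=& (U \Sigma V^T) (U \Sigma V^T)^T = (U \Sigma V^T) (V \Sigma^T U^T) = U \Sigma V^T V \Sigma^T U^T = U \Sigma \Sigma^T U^T \\ X^T X &=& (U \Sigma V^T)^T (U \Sigma V^T) = (V \Sigma^T U^T) (U \Sigma V^T) = V \Sigma^T U^T U \Sigma V^T = V \Sigma^T \Sigma V^T \end{matrix}</math>@@@@1@61@@danf@17-8-2009
10470530@unknown@formal@none@1@S@Since <math>\Sigma \Sigma^T</math> and <math>\Sigma^T \Sigma</math> are diagonal we see that <math>U</math> must contain the [[eigenvector]]s of <math>X X^T</math>, while <math>V</math> must be the eigenvectors of <math>X^T X</math>.@@@@1@28@@danf@17-8-2009
10470540@unknown@formal@none@1@S@Both products have the same non-zero eigenvalues, given by the non-zero entries of <math>\Sigma \Sigma^T</math>, or equally, by the non-zero entries of <math>\Sigma^T \Sigma</math>.@@@@1@23@@danf@17-8-2009
10470550@unknown@formal@none@1@S@Now the decomposition looks like this:@@@@1@6@@danf@17-8-2009
10470560@unknown@formal@none@1@S@:<math>\underbrace{\begin{bmatrix} x_{1,1} & \dots & x_{1,n} \\ \vdots & \ddots & \vdots \\ x_{m,1} & \dots & x_{m,n} \end{bmatrix}}_{X} = \underbrace{\begin{bmatrix} \textbf{u}_1 & \dots & \textbf{u}_l \end{bmatrix}}_{U} \cdot \underbrace{\begin{bmatrix} \sigma_1 & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & \sigma_l \end{bmatrix}}_{\Sigma} \cdot \underbrace{\begin{bmatrix} \textbf{v}_1^T \\ \vdots \\ \textbf{v}_l^T \end{bmatrix}}_{V^T}</math>@@@@1@141@@danf@17-8-2009
10470570@unknown@formal@none@1@S@The values <math>\sigma_1, \dots, \sigma_l</math> are called the singular values, and <math>\textbf{u}_1, \dots, \textbf{u}_l</math> and <math>\textbf{v}_1, \dots, \textbf{v}_l</math> the left and right singular vectors.@@@@1@24@@danf@17-8-2009
10470580@unknown@formal@none@1@S@Notice how the only part of <math>U</math> that contributes to <math>\textbf{t}_i</math> is the <math>i</math>'th row.@@@@1@15@@danf@17-8-2009
10470590@unknown@formal@none@1@S@Let this row vector be called <math>\hat{\textbf{t}}_i^T</math>.@@@@1@8@@danf@17-8-2009
10470600@unknown@formal@none@1@S@Likewise, the only part of <math>V^T</math> that contributes to <math>\textbf{d}_j</math> is the <math>j</math>'th column, <math>\hat{\textbf{d}}_j</math>.@@@@1@16@@danf@17-8-2009
10470610@unknown@formal@none@1@S@These are ''not'' the eigenvectors, but ''depend'' on ''all'' the eigenvectors.@@@@1@11@@danf@17-8-2009
10470620@unknown@formal@none@1@S@It turns out that when you select the <math>k</math> largest singular values, and their corresponding singular vectors from <math>U</math> and <math>V</math>, you get the rank <math>k</math> approximation to <math>X</math> with the smallest error ([[Frobenius norm]]).@@@@1@35@@danf@17-8-2009
10470630@unknown@formal@none@1@S@The amazing thing about this approximation is that not only does it have a minimal error, but it translates the term and document vectors into a concept space.@@@@1@28@@danf@17-8-2009
10470640@unknown@formal@none@1@S@The vector <math>\hat{\textbf{t}}_i</math> then has <math>k</math> entries, each giving the occurrence of term <math>i</math> in one of the <math>k</math> concepts.@@@@1@21@@danf@17-8-2009
10470650@unknown@formal@none@1@S@Likewise, the vector <math>\hat{\textbf{d}}_j</math> gives the relation between document <math>j</math> and each concept.@@@@1@14@@danf@17-8-2009
10470660@unknown@formal@none@1@S@We write this approximation as@@@@1@5@@danf@17-8-2009
10470670@unknown@formal@none@1@S@:<math>X_k = U_k \Sigma_k V_k^T</math>@@@@1@5@@danf@17-8-2009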
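As a rough illustration of the rank lowering, the sketch below keeps the k largest singular values of a small matrix and forms the product of the truncated factors. It assumes NumPy is available; the matrix values and the helper name `truncated_svd` are made up for the example. It also checks numerically that the Frobenius error equals the square root of the sum of the squared discarded singular values.

```python
import numpy as np

# Hypothetical 4-term x 3-document count matrix (rows: terms, columns: documents).
X = np.array([
    [2.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 3.0],
    [0.0, 1.0, 1.0],
])


def truncated_svd(X, k):
    """Return the factors (U_k, s_k, Vt_k) of the rank-k truncated SVD of X."""
    # np.linalg.svd returns the singular values sorted in descending order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


k = 2
U_k, s_k, Vt_k = truncated_svd(X, k)

# Rank-k approximation: product of the truncated factors.
X_k = U_k @ np.diag(s_k) @ Vt_k

# Frobenius error of the approximation versus the discarded singular values.
s_all = np.linalg.svd(X, compute_uv=False)
error = np.linalg.norm(X - X_k, "fro")
expected_error = np.sqrt(np.sum(s_all[k:] ** 2))

print(np.round(X_k, 3))
print(error, expected_error)
```

Row i of `U_k` then plays the role of the reduced term vector for term i, and column j of `Vt_k` the role of the reduced document vector for document j.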
10470680@unknown@formal@none@1@S@You can now do the following:@@@@1@6@@danf@17-8-2009
10470690@unknown@formal@none@1@S@* See how related documents <math>j</math> and <math>q</math> are in the concept space by comparing the vectors <math>\hat{\textbf{d}}_j</math> and <math>\hat{\textbf{d}}_q</math> (typically by [[vector space model|cosine similarity]]).@@@@1@28@@danf@17-8-2009
10470700@unknown@formal@none@1@S@This gives you a clustering of the documents.@@@@1@8@@danf@17-8-2009
10470710@unknown@formal@none@1@S@* Compare terms <math>i</math> and <math>p</math> by comparing the vectors <math>\hat{\textbf{t}}_i</math> and <math>\hat{\textbf{t}}_p</math>, giving you a clustering of the terms in the concept space.@@@@1@26@@danf@17-8-2009
10470720@unknown@formal@none@1@S@* Given a query, view this as a mini document, and compare it to your documents in the concept space.@@@@1@20@@danf@17-8-2009
10470730@unknown@formal@none@1@S@To do the latter, you must first translate your query into the concept space.@@@@1@14@@danf@17-8-2009
10470740@unknown@formal@none@1@S@It is then intuitive that you must use the same transformation that you use on your documents:@@@@1@17@@danf@17-8-2009
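10470750@unknown@formal@none@1@S@:<math>\textbf{d}_j = U_k \Sigma_k \hat{\textbf{d}}_j</math>@@@@1@6@@danf@17-8-2009
10470760@unknown@formal@none@1@S@:<math>\hat{\textbf{d}}_j = \Sigma_k^{-1} U_k^T \textbf{d}_j</math>@@@@1@6@@danf@17-8-2009
10470770@unknown@formal@none@1@S@This means that if you have a query vector <math>\textbf{q}</math>, you must do the translation <math>\hat{\textbf{q}} = \Sigma_k^{-1} U_k^T \textbf{q}</math> before you compare it with the document vectors in the concept space.@@@@1@33@@danf@17-8-2009
10470780@unknown@formal@none@1@S@You can do the same for pseudo term vectors:@@@@1@9@@danf@17-8-2009
10470790@unknown@formal@none@1@S@:<math>\textbf{t}_i^T = \hat{\textbf{t}}_i^T \Sigma_k V_k^T</math>@@@@1@6@@danf@17-8-2009
10470800@unknown@formal@none@1@S@:<math>\hat{\textbf{t}}_i^T = \textbf{t}_i^T V_k \Sigma_k^{-1}</math>@@@@1@10@@danf@17-8-2009
10470810@unknown@formal@none@1@S@:<math>\hat{\textbf{t}}_i = \Sigma_k^{-1} V_k^T \textbf{t}_i</math>@@@@1@6@@danf@17-8-2009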
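The query translation described in this section can be sketched as follows. The code assumes NumPy and reuses a made-up 4-term by 3-document matrix; the names `fold_in` and `cosine` and the query vector are hypothetical. A query is treated as a column of term counts, multiplied by the transpose of the truncated left singular vectors, and divided by the retained singular values, after which it can be compared with the reduced document vectors by cosine similarity.

```python
import numpy as np

# Hypothetical 4-term x 3-document count matrix (rows: terms, columns: documents).
X = np.array([
    [2.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 3.0],
    [0.0, 1.0, 1.0],
])


def fold_in(q, U_k, s_k):
    """Map a term-space vector q into the k-dimensional concept space.

    Computes inverse(Sigma_k) @ U_k^T @ q; because Sigma_k is diagonal,
    the inverse is an element-wise division by the singular values.
    """
    return (U_k.T @ q) / s_k


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

# Reduced document vectors: row j of doc_hat is column j of Vt_k.
doc_hat = Vt_k.T

# Hypothetical query mentioning the first two terms once each.
q = np.array([1.0, 1.0, 0.0, 0.0])
q_hat = fold_in(q, U_k, s_k)

scores = [cosine(q_hat, d) for d in doc_hat]
print(scores)
```

Folding in a column of the original matrix reproduces the corresponding reduced document vector, which is a convenient sanity check on the transformation.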
10470820@unknown@formal@none@1@S@== Implementation ==@@@@1@3@@danf@17-8-2009
10470830@unknown@formal@none@1@S@The [[Singular Value Decomposition|SVD]] is typically computed using large matrix methods (for example, [[Lanczos method]]s) but may also be computed incrementally and with greatly reduced resources via a [[neural network]]-like approach, which does not require the large, full-rank matrix to be held in memory ([http://www.dcs.shef.ac.uk/~genevieve/gorrell_webb.pdf Gorrell and Webb, 2005]).@@@@1@49@@danf@17-8-2009
10470840@unknown@formal@none@1@S@A fast, incremental, low-memory, large-matrix SVD algorithm has recently been developed ([http://www.merl.com/publications/TR2006-059/ Brand, 2006]).@@@@1@14@@danf@17-8-2009
10470850@unknown@formal@none@1@S@Unlike Gorrell and Webb's (2005) stochastic approximation, Brand's (2006) algorithm provides an exact solution.@@@@1@14@@danf@17-8-2009
10470860@unknown@formal@none@1@S@== Limitations ==@@@@1@3@@danf@17-8-2009
10470870@unknown@formal@none@1@S@LSA has two drawbacks:@@@@1@4@@danf@17-8-2009
10470880@unknown@formal@none@1@S@* The resulting dimensions might be difficult to interpret.@@@@1@9@@danf@17-8-2009
10470890@unknown@formal@none@1@S@For instance, in@@@@1@3@@danf@17-8-2009
10470900@unknown@formal@none@1@S@:: {(car), (truck), (flower)} --> {(1.3452 * car + 0.2828 * truck), (flower)}@@@@1@13@@danf@17-8-2009
10470910@unknown@formal@none@1@S@:the (1.3452 * car + 0.2828 * truck) component could be interpreted as "vehicle".@@@@1@14@@danf@17-8-2009
10470920@unknown@formal@none@1@S@However, it is very likely that cases close to@@@@1@9@@danf@17-8-2009
10470930@unknown@formal@none@1@S@:: {(car), (bottle), (flower)} --> {(1.3452 * car + 0.2828 * bottle), (flower)}@@@@1@13@@danf@17-8-2009
10470940@unknown@formal@none@1@S@:will occur.@@@@1@2@@danf@17-8-2009
10470950@unknown@formal@none@1@S@This leads to results which can be justified on the mathematical level, but have no interpretable meaning in natural language.@@@@1@20@@danf@17-8-2009
10470960@unknown@formal@none@1@S@* The [[probabilistic model]] of LSA does not match observed data: LSA assumes that words and documents form a joint [[normal distribution|Gaussian]] model ([[ergodic hypothesis]]), while a [[Poisson distribution]] has been observed.@@@@1@32@@danf@17-8-2009
10470970@unknown@formal@none@1@S@Thus, a newer alternative is [[probabilistic latent semantic analysis]], based on a [[multinomial distribution|multinomial]] model, which is reported to give better results than standard LSA.@@@@1@26@@danf@17-8-2009
10480010@unknown@formal@none@1@S@Linguistics@@@@1@1@@danf@17-8-2009
10480020@unknown@formal@none@1@S@'''Linguistics''' is the [[science|scientific]] study of [[language]], encompassing a number of sub-fields.@@@@1@12@@danf@17-8-2009
10480030@unknown@formal@none@1@S@An important topical division is between the study of language structure ([[grammar]]) and the study of [[Meaning (linguistics)|meaning]] ([[semantics]]).@@@@1@19@@danf@17-8-2009
10480040@unknown@formal@none@1@S@Grammar encompasses [[morphology (linguistics)|morphology]] (the formation and composition of [[word]]s), [[syntax]] (the rules that determine how words combine into [[phrase]]s and [[Sentence (linguistics)|sentences]]) and [[phonology]] (the study of sound systems and abstract sound units).@@@@1@34@@danf@17-8-2009
10480050@unknown@formal@none@1@S@[[Phonetics]] is a related branch of linguistics concerned with the actual properties of speech sounds ([[phone]]s), non-speech sounds, and how they are produced and [[speech perception|perceived]].@@@@1@26@@danf@17-8-2009
10480060@unknown@formal@none@1@S@Over the twentieth century, following the work of [[Noam Chomsky]], linguistics came to be dominated by the [[Generative grammar|Generativist school]], which is chiefly concerned with explaining how human beings [[language acquisition|acquire language]] and the biological constraints on this acquisition; generative theory is [[Language module|modularist]] in character.@@@@1@46@@danf@17-8-2009
10480070@unknown@formal@none@1@S@While this remains the dominant paradigm, other linguistic theories have increasingly gained in popularity — [[cognitive linguistics]] being a prominent example.@@@@1@21@@danf@17-8-2009
10480080@unknown@formal@none@1@S@There are many sub-fields in linguistics, which may or may not be dominated by a particular theoretical approach: [[evolutionary linguistics]], for example, attempts to account for the origins of language; [[historical linguistics]] explores language change; and [[sociolinguistics]] looks at the relation between linguistic variation and social structures.@@@@1@47@@danf@17-8-2009
10480090@unknown@formal@none@1@S@A variety of intellectual disciplines are relevant to the study of language.@@@@1@12@@danf@17-8-2009
10480100@unknown@formal@none@1@S@Although certain linguists have downplayed the relevance of some other fields, linguistics — like other sciences — is highly interdisciplinary and draws on work from such fields as [[psychology]], [[informatics]], [[computer science]], [[philosophy]], [[biology]], [[human anatomy]], [[neuroscience]], [[sociology]], [[anthropology]], and [[acoustics]].@@@@1@41@@danf@17-8-2009
10480110@unknown@formal@none@1@S@==Names for the discipline==@@@@1@4@@danf@17-8-2009
10480120@unknown@formal@none@1@S@Before the twentieth century, the term "[[philology]]" (first attested in 1716) was commonly used to refer to the science of language, which was then predominantly historical in focus.@@@@1@30@@danf@17-8-2009
10480130@unknown@formal@none@1@S@Since [[Ferdinand de Saussure]]'s insistence on the importance of [[Synchronic analysis (linguistics)|synchronic analysis]], however, this focus has shifted and the term "philology" is now generally used for the "study of a language's grammar, history and literary tradition", especially in the [[USA]], where it was never as popular as elsewhere in the sense "science of language".@@@@1@55@@danf@17-8-2009
10480140@unknown@formal@none@1@S@The term "linguistics" dates from 1847, although "linguist" in the sense of "a student of language" dates from 1641.@@@@1@18@@danf@17-8-2009
10480150@unknown@formal@none@1@S@It is now the usual academic term in English for the scientific study of language.@@@@1@15@@danf@17-8-2009
10480160@unknown@formal@none@1@S@==Fundamental concerns and divisions==@@@@1@4@@danf@17-8-2009
10480170@unknown@formal@none@1@S@Linguistics concerns itself with describing and explaining the nature of human language.@@@@1@12@@danf@17-8-2009
10480180@unknown@formal@none@1@S@Relevant to this are the questions of what is universal to language, how language can vary, and how human beings come to know languages.@@@@1@24@@danf@17-8-2009
10480190@unknown@formal@none@1@S@All humans (setting aside extremely pathological cases) achieve competence in whatever language is spoken (or signed, in the case of [[sign language|signed languages]]) around them when growing up, with apparently little need for explicit conscious instruction.@@@@1@36@@danf@17-8-2009
10480200@unknown@formal@none@1@S@While non-humans acquire their own communication systems, they do not acquire human language in this way (although many non-human animals can learn to respond to language, or can even be trained to use it to a degree).@@@@1@37@@danf@17-8-2009
10480210@unknown@formal@none@1@S@Therefore, linguists assume, the ability to acquire and use language is an innate, biologically-based potential of modern human beings, similar to the ability to walk.@@@@1@25@@danf@17-8-2009
10480220@unknown@formal@none@1@S@There is no consensus, however, as to the extent of this innate potential, or its domain-specificity (the degree to which such innate abilities are specific to language), with some theorists claiming that there is a very large set of highly abstract and specific binary settings coded into the human brain, while others claim that the ability to learn language is a product of general human cognition.@@@@1@66@@danf@17-8-2009
10480230@unknown@formal@none@1@S@It is, however, generally agreed that there are no strong ''genetic'' differences underlying the differences between languages: an individual will acquire whatever language(s) they are exposed to as a child, regardless of parentage or ethnic origin.@@@@1@36@@danf@17-8-2009
10480240@unknown@formal@none@1@S@Linguistic structures are pairings of meaning and form (which may consist of sound patterns, movements of the hand, written symbols, and so on); such pairings are known as [[Ferdinand de Saussure|Saussurean]] [[linguistic sign|signs]].@@@@1@33@@danf@17-8-2009
10480250@unknown@formal@none@1@S@Linguists may specialize in some sub-area of linguistic structure, which can be arranged in the following terms, from form to meaning:@@@@1@21@@danf@17-8-2009
10480260@unknown@formal@none@1@S@* '''[[Phonetics]]''', the study of the physical properties of speech (or signed) production and perception@@@@1@15@@danf@17-8-2009
10480270@unknown@formal@none@1@S@* '''[[Phonology]]''', the study of sounds (adjusted appropriately for signed languages) as discrete, abstract elements in the speaker's mind that distinguish meaning@@@@1@22@@danf@17-8-2009
10480280@unknown@formal@none@1@S@* '''[[Morphology (linguistics)|Morphology]]''', the study of internal structures of [[word]]s and how they can be modified@@@@1@16@@danf@17-8-2009
10480290@unknown@formal@none@1@S@* '''[[Syntax]]''', the study of how words combine to form grammatical [[sentence]]s@@@@1@12@@danf@17-8-2009
10480300@unknown@formal@none@1@S@* '''[[Semantics]]''', the study of the meaning of words ([[lexical semantics]]) and fixed word combinations ([[phraseology]]), and how these combine to form the [[meaning]]s of sentences@@@@1@26@@danf@17-8-2009
10480310@unknown@formal@none@1@S@* '''[[Pragmatics]]''', the study of how [[utterance]]s are used (literally, figuratively, or otherwise) in [[speech acts|communicative acts]]@@@@1@17@@danf@17-8-2009
10480320@unknown@formal@none@1@S@* '''[[Discourse analysis]]''', the analysis of language use in [[texts]] (spoken, written, or signed)@@@@1@14@@danf@17-8-2009
10480330@unknown@formal@none@1@S@Many linguists would agree that these divisions overlap considerably, and the independent significance of each of these areas is not universally acknowledged.@@@@1@22@@danf@17-8-2009
10480340@unknown@formal@none@1@S@Regardless of any particular linguist's position, each area has core concepts that foster significant scholarly inquiry and research.@@@@1@18@@danf@17-8-2009
10480350@unknown@formal@none@1@S@Intersecting with these domains are fields arranged around the kind of external factors that are considered.@@@@1@16@@danf@17-8-2009
10480360@unknown@formal@none@1@S@For example:@@@@1@2@@danf@17-8-2009
10480370@unknown@formal@none@1@S@* [[Linguistic typology]], the study of the common properties of diverse unrelated languages, properties that may, given sufficient attestation, be assumed to be innate to human language capacity.@@@@1@28@@danf@17-8-2009
10480380@unknown@formal@none@1@S@* [[Stylistics (linguistics)|Stylistics]], the study of linguistic factors that place a discourse in context.@@@@1@14@@danf@17-8-2009
10480390@unknown@formal@none@1@S@* [[Developmental linguistics]], the study of the development of linguistic ability in an individual, particularly [[Language acquisition|the acquisition of language]] in childhood.@@@@1@22@@danf@17-8-2009
10480400@unknown@formal@none@1@S@* [[Historical linguistics]] or Diachronic linguistics, the study of language change.@@@@1@11@@danf@17-8-2009
10480410@unknown@formal@none@1@S@* [[Language geography]], the study of the spatial patterns of languages.@@@@1@11@@danf@17-8-2009
10480420@unknown@formal@none@1@S@* [[Evolutionary linguistics]], the study of the origin and subsequent development of language.@@@@1@13@@danf@17-8-2009
10480430@unknown@formal@none@1@S@* [[Psycholinguistics]], the study of the cognitive processes and representations underlying language use.@@@@1@13@@danf@17-8-2009
10480440@unknown@formal@none@1@S@* [[Sociolinguistics]], the study of social patterns and norms of linguistic variability.@@@@1@12@@danf@17-8-2009
10480450@unknown@formal@none@1@S@* [[Clinical linguistics]], the application of linguistic theory to the area of [[Speech-Language Pathology]].@@@@1@14@@danf@17-8-2009
10480460@unknown@formal@none@1@S@* [[Neurolinguistics]], the study of the brain networks that underlie grammar and communication.@@@@1@13@@danf@17-8-2009
10480470@unknown@formal@none@1@S@* [[Biolinguistics]], the study of natural as well as human-taught communication systems in animals compared to human language.@@@@1@18@@danf@17-8-2009
10480480@unknown@formal@none@1@S@* [[Computational linguistics]], the study of computational implementations of linguistic structures.@@@@1@11@@danf@17-8-2009
10480490@unknown@formal@none@1@S@* [[Applied linguistics]], the study of language-related issues applied in everyday life, notably language policies, planning, and education.@@@@1@19@@danf@17-8-2009
10480500@unknown@formal@none@1@S@[[Constructed language]] fits under Applied linguistics.@@@@1@6@@danf@17-8-2009
10480510@unknown@formal@none@1@S@The related discipline of [[semiotics]] investigates the relationship between signs and what they signify.@@@@1@14@@danf@17-8-2009
10480520@unknown@formal@none@1@S@From the perspective of semiotics, language can be seen as a sign or symbol, with the world as its representation.@@@@1@20@@danf@17-8-2009
10480530@unknown@formal@none@1@S@==Variation and universality==@@@@1@3@@danf@17-8-2009
10480540@unknown@formal@none@1@S@Much modern linguistic research, particularly within the [[paradigm]] of [[generative grammar]], has concerned itself with trying to account for differences between languages of the world.@@@@1@25@@danf@17-8-2009
10480550@unknown@formal@none@1@S@This has worked on the assumption that if human linguistic ability is narrowly constrained by human biology, then all languages must share certain fundamental properties.@@@@1@25@@danf@17-8-2009
10480560@unknown@formal@none@1@S@In [[generative grammar|generativist theory]], the collection of fundamental properties all languages share is referred to as [[universal grammar]] (UG).@@@@1@19@@danf@17-8-2009
10480570@unknown@formal@none@1@S@The specific characteristics of this universal grammar are a much debated topic.@@@@1@12@@danf@17-8-2009
10480580@unknown@formal@none@1@S@[[Linguistic typology|Typologists]] and non-generativist linguists usually refer simply to [[linguistic universal|language universals]], or ''universals of language''.@@@@1@16@@danf@17-8-2009
10480590@unknown@formal@none@1@S@Similarities between languages can have a number of different origins.@@@@1@10@@danf@17-8-2009
10480600@unknown@formal@none@1@S@In the simplest case, universal properties may be due to universal aspects of human experience.@@@@1@15@@danf@17-8-2009
10480610@unknown@formal@none@1@S@For example, all humans experience water, and all human languages have a word for water.@@@@1@15@@danf@17-8-2009
10480620@unknown@formal@none@1@S@Other similarities may be due to common descent: the [[Latin language]] spoken by the [[Ancient Rome|Ancient Romans]] developed into Spanish in Spain and Italian in Italy; similarities between Spanish and Italian are thus in many cases due to both being descended from Latin.@@@@1@43@@danf@17-8-2009
10480630@unknown@formal@none@1@S@In other cases, [[Language contact|contact between languages]] — particularly where many speakers are bilingual — can lead to much borrowing of structures, as well as words.@@@@1@26@@danf@17-8-2009
10480640@unknown@formal@none@1@S@Similarity may also, of course, be due to coincidence.@@@@1@9@@danf@17-8-2009
10480650@unknown@formal@none@1@S@English ''much'' and Spanish ''mucho'' are not descended from the same form or borrowed from one language to the other; nor is the similarity due to innate linguistic knowledge (see [[False cognate]]).@@@@1@32@@danf@17-8-2009
10480660@unknown@formal@none@1@S@Arguments in favor of language universals have also come from documented cases of [[sign language]]s (such as [[Al-Sayyid Bedouin Sign Language]]) developing in communities of congenitally deaf people, independently of spoken language.@@@@1@32@@danf@17-8-2009
10480670@unknown@formal@none@1@S@The properties of these sign languages conform generally to many of the properties of spoken languages.@@@@1@16@@danf@17-8-2009
10480680@unknown@formal@none@1@S@Other known and suspected sign language [[language isolate|isolates]] include [[Kata Kolok]], [[Nicaraguan Sign Language]], and [[Providence Island Sign Language]].@@@@1@19@@danf@17-8-2009
10480690@unknown@formal@none@1@S@== Structures ==@@@@1@3@@danf@17-8-2009
10480700@unknown@formal@none@1@S@It has been observed that languages tend to be organized around [[grammatical categories]] such as noun and verb, [[nominative case|nominative]] and [[accusative case|accusative]], or present and past, though, importantly, not exclusively so.@@@@1@32@@danf@17-8-2009
10480710@unknown@formal@none@1@S@The grammar of a language is organized around such fundamental categories, though many languages express the relationships between words and syntax in other discrete ways (cf. some Bantu languages for noun/verb relations, ergative/absolutive systems for case relations, several Native American languages for tense/aspect relations).@@@@1@44@@danf@17-8-2009
10480720@unknown@formal@none@1@S@In addition to making substantial use of discrete categories, language has the important property that it organizes elements into recursive structures; this allows, for example, a noun phrase to contain another noun phrase (as in “the chimpanzee’s lips”) or a clause to contain a clause (as in “I think that it’s raining”).@@@@1@52@@danf@17-8-2009
10480730@unknown@formal@none@1@S@Though recursion in grammar was implicitly recognized much earlier (for example by [[Otto Jespersen|Jespersen]]), the importance of this aspect of language became more widely appreciated after the 1957 publication of [[Noam Chomsky]]’s book “[[Syntactic Structures]]”, which presented a formal grammar of a fragment of English.@@@@1@45@@danf@17-8-2009
10480740@unknown@formal@none@1@S@Prior to this, the most detailed descriptions of linguistic systems were of phonological or morphological systems.@@@@1@16@@danf@17-8-2009
10480750@unknown@formal@none@1@S@Chomsky used a [[context-free grammar]] augmented with transformations.@@@@1@8@@danf@17-8-2009
10480760@unknown@formal@none@1@S@Since then, following the trend of Chomskyan linguistics, context-free grammars have been written for substantial fragments of various languages (for example [[Generalised phrase structure grammar|GPSG]], for English), but it has been demonstrated that human languages include cross-serial dependencies, which cannot be handled adequately by context-free grammars.@@@@1@46@@danf@17-8-2009
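The recursive structure mentioned in this section, where a noun phrase may contain another noun phrase, can be illustrated with a small recursive data type. This is only a toy sketch: the class name `NounPhrase`, its fields, and the fixed determiner "the" are assumptions for the example, not part of any particular grammar formalism.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NounPhrase:
    """A toy noun phrase: a head noun with an optional possessor noun phrase."""

    head: str
    possessor: Optional["NounPhrase"] = None

    def render(self) -> str:
        # Base case: a simple noun phrase with a determiner.
        if self.possessor is None:
            return f"the {self.head}"
        # Recursive case: the possessor is itself a full noun phrase.
        return f"{self.possessor.render()}'s {self.head}"

    def depth(self) -> int:
        """Number of noun phrases nested inside this one, counting itself."""
        if self.possessor is None:
            return 1
        return 1 + self.possessor.depth()


chimpanzee = NounPhrase("chimpanzee")
lips = NounPhrase("lips", possessor=chimpanzee)
colour = NounPhrase("colour", possessor=lips)

print(lips.render())
print(colour.render())
```

Because the possessor slot accepts any noun phrase, the nesting can in principle continue to any depth, which is the property the text calls recursion.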
10480770@unknown@formal@none@1@S@==Some selected sub-fields ==@@@@1@4@@danf@17-8-2009
10480780@unknown@formal@none@1@S@'''Diachronic linguistics'''@@@@1@2@@danf@17-8-2009
10480790@unknown@formal@none@1@S@Studying languages at a particular point in time (usually the present) is "synchronic", while diachronic linguistics examines how language changes through time, sometimes over centuries.@@@@1@25@@danf@17-8-2009
10480800@unknown@formal@none@1@S@It enjoys both a rich history and a strong theoretical foundation for the study of [[language change]].@@@@1@17@@danf@17-8-2009
10480810@unknown@formal@none@1@S@In universities in the United States, the historical perspective is often out of fashion.@@@@1@14@@danf@17-8-2009
10480820@unknown@formal@none@1@S@The shift in focus to a non-historic perspective started with [[Ferdinand de Saussure|Saussure]] and became predominant with [[Noam Chomsky]].@@@@1@19@@danf@17-8-2009
10480830@unknown@formal@none@1@S@Explicitly historical perspectives include [[historical-comparative linguistics]] and [[etymology]].@@@@1@8@@danf@17-8-2009
10480840@unknown@formal@none@1@S@'''Contextual linguistics'''@@@@1@2@@danf@17-8-2009
10480850@unknown@formal@none@1@S@Contextual linguistics may include the study of linguistics in interaction with other academic disciplines.@@@@1@14@@danf@17-8-2009
10480860@unknown@formal@none@1@S@The interdisciplinary areas of linguistics consider how language interacts with the rest of the world.@@@@1@15@@danf@17-8-2009
10480870@unknown@formal@none@1@S@[[Sociolinguistics]], [[anthropological linguistics]], and [[linguistic anthropology]] are seen as areas that bridge the gap between linguistics and society as a whole.@@@@1@21@@danf@17-8-2009
10480880@unknown@formal@none@1@S@[[Psycholinguistics]] and [[neurolinguistics]] relate linguistics to the [[medical science]]s.@@@@1@9@@danf@17-8-2009
10480890@unknown@formal@none@1@S@Other cross-disciplinary areas of linguistics include [[evolutionary linguistics]], [[computational linguistics]] and [[cognitive science]].@@@@1@13@@danf@17-8-2009
10480900@unknown@formal@none@1@S@'''Applied linguistics'''@@@@1@2@@danf@17-8-2009
10480910@unknown@formal@none@1@S@Linguists are largely concerned with finding and [[descriptive linguistics|describing]] the generalities and varieties both within particular languages and among all languages.@@@@1@21@@danf@17-8-2009
10480920@unknown@formal@none@1@S@[[Applied linguistics]] takes the results of those findings and “applies” them to other areas.@@@@1@14@@danf@17-8-2009
10480930@unknown@formal@none@1@S@Often “applied linguistics” refers to the use of linguistic research in language teaching, but results of linguistic research are used in many other areas, as well.@@@@1@26@@danf@17-8-2009
10480940@unknown@formal@none@1@S@Today in the age of information technology, many areas of applied linguistics attempt to involve the use of computers.@@@@1@19@@danf@17-8-2009
10480950@unknown@formal@none@1@S@[[Speech synthesis]] and [[speech recognition]] use phonetic and phonemic knowledge to provide voice interfaces to computers.@@@@1@16@@danf@17-8-2009
10480960@unknown@formal@none@1@S@Applications of [[computational linguistics]] in [[machine translation]], [[computer-assisted translation]], and [[natural language processing]] are areas of applied linguistics which have come to the forefront.@@@@1@24@@danf@17-8-2009
10480970@unknown@formal@none@1@S@Their influence has had an effect on theories of syntax and semantics, as modeling syntactic and semantic theories on computers constrains those theories.@@@@1@23@@danf@17-8-2009
10480980@unknown@formal@none@1@S@==Description and prescription==@@@@1@3@@danf@17-8-2009
10480990@unknown@formal@none@1@S@''Main articles: [[Descriptive linguistics]], [[Linguistic prescription]]''@@@@1@6@@danf@17-8-2009
10481000@unknown@formal@none@1@S@Linguistics is '''descriptive'''; linguists describe and explain features of language without making subjective judgments on whether a particular feature is "right" or "wrong".@@@@1@23@@danf@17-8-2009
10481010@unknown@formal@none@1@S@This is analogous to practice in other sciences: a [[zoologist]] studies the animal kingdom without making subjective judgments on whether a particular animal is better or worse than another.@@@@1@29@@danf@17-8-2009
10481020@unknown@formal@none@1@S@'''Prescription''', on the other hand, is an attempt to promote particular linguistic usages over others, often favouring a particular dialect or "[[acrolect]]".@@@@1@22@@danf@17-8-2009
10481030@unknown@formal@none@1@S@This may have the aim of establishing a [[Standard language|linguistic standard]], which can aid communication over large geographical areas.@@@@1@19@@danf@17-8-2009
10481040@unknown@formal@none@1@S@It may also, however, be an attempt by speakers of one language or dialect to exert influence over speakers of other languages or dialects (see [[Linguistic imperialism]]).@@@@1@27@@danf@17-8-2009
10481050@unknown@formal@none@1@S@An extreme version of prescriptivism can be found among [[censorship|censors]], who attempt to eradicate words and structures which they consider to be destructive to society.@@@@1@25@@danf@17-8-2009
10481060@unknown@formal@none@1@S@== Speech and writing ==@@@@1@5@@danf@17-8-2009
10481070@unknown@formal@none@1@S@Most contemporary linguists work under the assumption that [[spoken language|spoken]] (or signed) language is more fundamental than [[written language]].@@@@1@19@@danf@17-8-2009
10481080@unknown@formal@none@1@S@This is because:@@@@1@3@@danf@17-8-2009
10481090@unknown@formal@none@1@S@* Speech appears to be a human "universal", whereas there have been many [[culture]]s and speech communities that lack written communication;@@@@1@21@@danf@17-8-2009
10481100@unknown@formal@none@1@S@* Speech evolved before human beings discovered writing;@@@@1@8@@danf@17-8-2009
10481110@unknown@formal@none@1@S@* People learn to speak and process spoken languages more easily and much earlier than writing.@@@@1@16@@danf@17-8-2009
10481120@unknown@formal@none@1@S@Linguists nonetheless agree that the study of written language can be worthwhile and valuable.@@@@1@14@@danf@17-8-2009
10481130@unknown@formal@none@1@S@For research that relies on [[corpus linguistics]] and [[computational linguistics]], written language is often much more convenient for processing large amounts of linguistic data.@@@@1@24@@danf@17-8-2009
10481140@unknown@formal@none@1@S@Large corpora of spoken language are difficult to create and hard to find, and are typically [[transcription (linguistics)|transcribed]] and written.@@@@1@20@@danf@17-8-2009
10481150@unknown@formal@none@1@S@Additionally, linguists have turned to text-based discourse occurring in various formats of [[computer-mediated communication]] as a viable site for linguistic inquiry.@@@@1@21@@danf@17-8-2009
10481160@unknown@formal@none@1@S@The study of [[writing systems]] themselves is in any case considered a branch of linguistics.@@@@1@15@@danf@17-8-2009
10481170@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009
10481180@unknown@formal@none@1@S@Some of the earliest linguistic activities can be traced to [[Iron Age India]] with the analysis of [[Sanskrit]].@@@@1@18@@danf@17-8-2009
10481190@unknown@formal@none@1@S@The [[Pratishakhya]]s (from ca. the 8th century BC) constitute as it were a proto-linguistic ''ad hoc'' collection of observations about mutations to a given [[corpus linguistics|corpus]] particular to a given [[Shakha|Vedic school]].@@@@1@32@@danf@17-8-2009
10481200@unknown@formal@none@1@S@Systematic study of these texts gives rise to the [[Vedanga]] discipline of [[Vyakarana]], the earliest surviving account of which is the work of {{IAST|[[Pānini]]}} (c. 520 – 460 BC), who, however, looks back on what are probably several generations of grammarians, whose opinions he occasionally refers to.@@@@1@47@@danf@17-8-2009
10481210@unknown@formal@none@1@S@{{IAST|Pānini}} formulates close to 4,000 rules which together form a compact [[generative grammar]] of Sanskrit.@@@@1@15@@danf@17-8-2009
10481220@unknown@formal@none@1@S@Inherent in his analytic approach are the concepts of the [[phoneme]], the [[morpheme]] and the [[root]].@@@@1@16@@danf@17-8-2009
10481230@unknown@formal@none@1@S@Due to its focus on brevity, his grammar has a highly unintuitive structure, reminiscent of contemporary "machine language" (as opposed to "human readable" programming languages).@@@@1@25@@danf@17-8-2009
10481240@unknown@formal@none@1@S@Indian linguistics maintained a high level for several centuries; [[Mahābhāṣya|Patanjali]] in the 2nd century BC still actively criticizes Panini.@@@@1@19@@danf@17-8-2009
10481250@unknown@formal@none@1@S@In the later centuries BC, however, Panini's grammar came to be seen as prescriptive, and commentators came to be fully dependent on it.@@@@1@23@@danf@17-8-2009
10481260@unknown@formal@none@1@S@[[Bhartrihari]] (c. 450 – 510) theorized the act of speech as being made up of four stages: first, conceptualization of an idea; second, its verbalization and sequencing (articulation); third, delivery of speech into atmospheric air; and fourth, the interpretation of speech by the listener, the interpreter.@@@@1@46@@danf@17-8-2009
10481270@unknown@formal@none@1@S@In the [[Middle East]], the [[Persian language|Persian]] linguist [[Sibawayh]] made a detailed and professional description of [[Arabic language|Arabic]] in 760, in his monumental work, ''Al-kitab fi al-nahw'' (الكتاب في النحو, ''The Book on Grammar''), bringing many linguistic aspects of language to light.@@@@1@42@@danf@17-8-2009
10481280@unknown@formal@none@1@S@In his book he distinguished [[phonetics]] from [[phonology]].@@@@1@8@@danf@17-8-2009
10481290@unknown@formal@none@1@S@Western linguistics begins in Classical Antiquity with grammatical speculation such as [[Plato]]'s ''[[Cratylus]]''.@@@@1@13@@danf@17-8-2009
10481300@unknown@formal@none@1@S@[[William Jones (philologist)|Sir William Jones]] noted that [[Sanskrit]] shared many common features with classical [[Latin]] and [[Ancient Greek|Greek]], notably verb roots and grammatical structures, such as the [[case system]].@@@@1@29@@danf@17-8-2009
10481310@unknown@formal@none@1@S@This led to the theory that all languages sprang from a common source and to the discovery of the [[Indo-European]] [[language family]].@@@@1@22@@danf@17-8-2009
10481320@unknown@formal@none@1@S@He began the study of [[comparative linguistics]], which would uncover more language families and branches.@@@@1@15@@danf@17-8-2009
10481330@unknown@formal@none@1@S@Some 19th-century linguists were [[Jakob Grimm]], who devised a principle of consonantal shifts in pronunciation – known as [[Grimm's Law]] – in 1822; [[Karl Verner]], who formulated [[Verner's Law]]; [[August Schleicher]], who created the "Stammbaumtheorie" ("family tree"); and [[Johannes Schmidt (linguist)|Johannes Schmidt]], who developed the "Wellentheorie" ("wave model") in 1872.@@@@1@50@@danf@17-8-2009
10481340@unknown@formal@none@1@S@[[Ferdinand de Saussure]] was the founder of modern structural linguistics.@@@@1@10@@danf@17-8-2009
10481350@unknown@formal@none@1@S@[[Edward Sapir]], a leader in American structural linguistics, was one of the first who explored the relations between language studies and anthropology.@@@@1@22@@danf@17-8-2009
10481360@unknown@formal@none@1@S@His methodology had strong influence on all his successors.@@@@1@9@@danf@17-8-2009
10481370@unknown@formal@none@1@S@[[Noam Chomsky|Noam Chomsky's]] formal model of language, [[transformational-generative grammar]], developed under the influence of his teacher [[Zellig Harris]], who was in turn strongly influenced by [[Leonard Bloomfield]], has been the dominant model since the 1960s.@@@@1@35@@danf@17-8-2009
10481380@unknown@formal@none@1@S@[[Noam Chomsky]] remains a pop-linguistic figure.@@@@1@6@@danf@17-8-2009
10481390@unknown@formal@none@1@S@Linguists (working in frameworks such as [[Head-Driven Phrase Structure Grammar]] (HPSG) or [[Lexical Functional Grammar]] (LFG)) are increasingly seen to stress the importance of formalization and formal rigor in linguistic description, and may distance themselves somewhat from Chomsky's more recent work (the "Minimalist" program for [[Transformational grammar]]), connecting more closely to his earlier works.@@@@1@54@@danf@17-8-2009
10481400@unknown@formal@none@1@S@Other linguists working in [[Optimality Theory]] state generalizations in terms of violable constraints that interact with each other, and abandon the traditional rule-based formalism first pioneered by early work in generativist linguistics.@@@@1@32@@danf@17-8-2009
10481410@unknown@formal@none@1@S@Functionalist linguists working in [[functional grammar]] and [[Cognitive Linguistics]] tend to stress the non-autonomy of linguistic knowledge and the non-universality of linguistic structures, thus differing significantly from the Chomskyan school.@@@@1@30@@danf@17-8-2009
10481420@unknown@formal@none@1@S@They reject Chomskyan intuitive introspection as a scientific method, relying instead on typological evidence.@@@@1@14@@danf@17-8-2009
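The notion of ranked, violable constraints mentioned above for Optimality Theory can be illustrated with a small sketch. It is only a toy model under stated assumptions: the constraint names (NoCoda, Dep, Max) are standard in the literature, but the input /pat/, the three candidates, and their violation counts are chosen here purely for illustration, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy ranking, highest-ranked constraint first.
RANKING: List[str] = ["NoCoda", "Dep", "Max"]

# Illustrative violation counts for candidate outputs of the input /pat/.
CANDIDATES: Dict[str, Dict[str, int]] = {
    "pat":  {"NoCoda": 1, "Dep": 0, "Max": 0},  # faithful, but keeps a syllable coda
    "pa":   {"NoCoda": 0, "Dep": 0, "Max": 1},  # deletes the final consonant
    "pata": {"NoCoda": 0, "Dep": 1, "Max": 0},  # inserts a vowel
}


def violation_profile(candidate: str, ranking: List[str]) -> Tuple[int, ...]:
    """Return the candidate's violation counts ordered by constraint rank."""
    marks = CANDIDATES[candidate]
    return tuple(marks[constraint] for constraint in ranking)


def optimal(ranking: List[str]) -> str:
    """Pick the candidate whose profile is lexicographically smallest.

    A single violation of a higher-ranked constraint outweighs any number
    of violations of lower-ranked ones, which is what tuple comparison does.
    """
    return min(CANDIDATES, key=lambda cand: violation_profile(cand, ranking))


if __name__ == "__main__":
    print(optimal(RANKING))                     # ranking used in this sketch
    print(optimal(["Max", "Dep", "NoCoda"]))    # a different ranking
```

Every candidate violates some constraint; the winner is simply the one that does best on the highest-ranked constraints, so re-ranking the same constraints selects a different output.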
10490010@unknown@formal@none@1@S@Linux@@@@1@1@@danf@17-8-2009
10490020@unknown@formal@none@1@S@'''Linux''' (commonly pronounced {{IPAEng|ˈlɪnəks}} in English; variants exist) is a [[Unix-like]] computer [[operating system]].@@@@1@14@@danf@17-8-2009
10490030@unknown@formal@none@1@S@Linux is one of the most prominent examples of [[free software]] and [[open source]] development: typically all underlying [[source code]] can be freely modified, used, and redistributed by anyone.@@@@1@29@@danf@17-8-2009
10490040@unknown@formal@none@1@S@The name "Linux" comes from the [[Linux kernel]], originally written in 1991 by [[Linus Torvalds]].@@@@1@15@@danf@17-8-2009
10490050@unknown@formal@none@1@S@The system's [[system utility|utilities]] and [[library (computer science)|libraries]] usually come from the [[GNU operating system]], announced in 1983 by [[Richard Stallman]].@@@@1@21@@danf@17-8-2009
10490060@unknown@formal@none@1@S@The GNU contribution is the basis for the alternative name '''GNU/Linux'''.@@@@1@11@@danf@17-8-2009
10490070@unknown@formal@none@1@S@Predominantly known for its use in [[server (computing)|server]]s, Linux is supported by corporations such as [[Dell]], [[Hewlett-Packard]], [[IBM]], [[Novell]], [[Oracle Corporation]], [[Red Hat]], and [[Sun Microsystems]].@@@@1@26@@danf@17-8-2009
10490080@unknown@formal@none@1@S@It is used as an operating system for a wide variety of computer [[hardware]], including [[desktop computer]]s, [[supercomputers]], video game systems, such as the [[PlayStation 2]] and [[PlayStation 3]], several [[arcade games]], and [[embedded devices]] such as [[mobile phone]]s, [[routers]], and [[stage lighting]] systems.@@@@1@44@@danf@17-8-2009
10490090@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009
10490100@unknown@formal@none@1@S@The [[Unix]] operating system was conceived and implemented in the 1960s and first released in 1970.@@@@1@16@@danf@17-8-2009
10490110@unknown@formal@none@1@S@Its wide availability and [[Porting|portability]] meant that it was widely adopted, copied and modified by academic institutions and businesses, with its design being influential on authors of other systems.@@@@1@29@@danf@17-8-2009
10490120@unknown@formal@none@1@S@The [[GNU Project]], started in 1984, had the goal of creating a "''complete Unix-compatible software system''" made entirely of [[free software]].@@@@1@21@@danf@17-8-2009
10490130@unknown@formal@none@1@S@In 1985, [[Richard Stallman]] created the [[Free Software Foundation]] and developed the [[GNU General Public License]] (GNU GPL).@@@@1@18@@danf@17-8-2009
10490140@unknown@formal@none@1@S@Many of the programs required in an OS (such as libraries, [[compiler]]s, [[text editor]]s, a [[Unix shell]], and a windowing system) were completed by the early 1990s, although low level elements such as [[device driver]]s, [[daemon (computer software)|daemon]]s, and the [[Kernel (computer science)|kernel]] were stalled and incomplete.@@@@1@47@@danf@17-8-2009
10490150@unknown@formal@none@1@S@Linus Torvalds has said that if the GNU kernel had been available at the time (1991), he would not have decided to write his own.@@@@1@25@@danf@17-8-2009
10490160@unknown@formal@none@1@S@=== MINIX ===@@@@1@3@@danf@17-8-2009
10490170@unknown@formal@none@1@S@[[MINIX]], a Unix-like system intended for academic use, was released by [[Andrew S. Tanenbaum]] in 1987.@@@@1@16@@danf@17-8-2009
10490180@unknown@formal@none@1@S@While source code for the system was available, modification and redistribution were restricted (that is not the case today).@@@@1@19@@danf@17-8-2009
10490190@unknown@formal@none@1@S@In addition, MINIX's [[16-bit]] design was not well adapted to the [[32-bit]] design of the increasingly cheap and popular [[Intel 386]] architecture for personal computers.@@@@1@25@@danf@17-8-2009
10490200@unknown@formal@none@1@S@In 1991, Torvalds began to work on a non-commercial replacement for MINIX while he was attending the [[University of Helsinki]].@@@@1@20@@danf@17-8-2009
10490210@unknown@formal@none@1@S@This eventually became the [[Linux kernel]].@@@@1@6@@danf@17-8-2009
10490220@unknown@formal@none@1@S@In 1992, Tanenbaum posted an article on [[Usenet]] claiming Linux was obsolete.@@@@1@12@@danf@17-8-2009
10490230@unknown@formal@none@1@S@In the article, he criticized the operating system as being [[Monolithic kernel|monolithic]] in design and being tied closely to the x86 architecture and thus not portable, which he described as "a fundamental error."@@@@1@33@@danf@17-8-2009
10490240@unknown@formal@none@1@S@Tanenbaum suggested that those who wanted a modern operating system should look into one based on the [[microkernel]] model.@@@@1@19@@danf@17-8-2009
10490250@unknown@formal@none@1@S@The posting elicited responses from Torvalds and [[Ken Thompson]], one of the founders of [[Unix]], which resulted in a well-known debate over the microkernel and monolithic kernel designs.@@@@1@28@@danf@17-8-2009
10490260@unknown@formal@none@1@S@Linux was dependent on the MINIX [[user space]] at first.@@@@1@10@@danf@17-8-2009
10490270@unknown@formal@none@1@S@With code from the GNU system freely available, it was advantageous to use it with the fledgling OS.@@@@1@18@@danf@17-8-2009
10490275@unknown@formal@none@1@S@Code licensed under the GNU GPL can be used in other projects, so long as they also are released under the same or a compatible license.@@@@1@26@@danf@17-8-2009
10490280@unknown@formal@none@1@S@In order to make the Linux kernel compatible with the components from the GNU Project, Torvalds initiated a switch from his original license (which prohibited commercial redistribution) to the GNU GPL.@@@@1@31@@danf@17-8-2009
10490290@unknown@formal@none@1@S@Linux and GNU developers worked to integrate GNU components with Linux to make a fully functional and free operating system.@@@@1@20@@danf@17-8-2009
10490300@unknown@formal@none@1@S@=== Commercial and popular uptake ===@@@@1@6@@danf@17-8-2009
10490310@unknown@formal@none@1@S@Today Linux is used in numerous domains, from [[embedded system]]s to [[supercomputer]]s, and has secured a place in [[server (computing)|server]] installations with the popular [[LAMP (software bundle)|LAMP]] application stack.@@@@1@29@@danf@17-8-2009
10490320@unknown@formal@none@1@S@Torvalds continues to direct the development of the kernel.@@@@1@9@@danf@17-8-2009
10490330@unknown@formal@none@1@S@Stallman heads the Free Software Foundation, which in turn supports the GNU components.@@@@1@13@@danf@17-8-2009
10490340@unknown@formal@none@1@S@Finally, individuals and corporations develop third-party non-GNU components.@@@@1@8@@danf@17-8-2009
10490350@unknown@formal@none@1@S@These third-party components comprise a vast body of work and may include both kernel modules and user applications and libraries.@@@@1@20@@danf@17-8-2009
10490360@unknown@formal@none@1@S@Linux vendors and communities combine and distribute the kernel, GNU components, and non-GNU components, with additional package management software in the form of [[Linux distribution]]s.@@@@1@25@@danf@17-8-2009
10490370@unknown@formal@none@1@S@== Design ==@@@@1@3@@danf@17-8-2009
10490380@unknown@formal@none@1@S@Linux is a modular [[Unix-like]] operating system.@@@@1@7@@danf@17-8-2009
10490390@unknown@formal@none@1@S@It derives much of its basic design from principles established in Unix during the 1970s and 1980s.@@@@1@17@@danf@17-8-2009
10490400@unknown@formal@none@1@S@Linux uses a [[monolithic kernel]], the [[Linux kernel]], which handles process control, networking, and [[peripheral]] and [[file system]] access.@@@@1@19@@danf@17-8-2009
10490410@unknown@formal@none@1@S@[[Device drivers]] are integrated directly with the kernel.@@@@1@8@@danf@17-8-2009
10490420@unknown@formal@none@1@S@Much of Linux's higher-level functionality is provided by separate projects which interface with the kernel.@@@@1@15@@danf@17-8-2009
10490430@unknown@formal@none@1@S@The GNU [[Userland (computing)|userland]] is an important part of most Linux systems, providing the [[shell (computing)|shell]] and [[Unix tool]]s which carry out many basic operating system tasks.@@@@1@27@@danf@17-8-2009
10490440@unknown@formal@none@1@S@On top of these tools, a Linux system can provide a [[graphical user interface]], usually running in the [[X Window System]].@@@@1@21@@danf@17-8-2009
10490450@unknown@formal@none@1@S@=== User interface ===@@@@1@4@@danf@17-8-2009
10490460@unknown@formal@none@1@S@Linux can be controlled through one or more of the following: a text-based [[command line interface]] (CLI), a [[graphical user interface]] (GUI) (usually the default for desktops), or controls on the device itself (common on embedded machines).@@@@1@37@@danf@17-8-2009
10490470@unknown@formal@none@1@S@On desktop machines, [[KDE]], [[GNOME]] and [[Xfce]] are the most popular user interfaces, though a variety of other user interfaces exist.@@@@1@21@@danf@17-8-2009
10490480@unknown@formal@none@1@S@Most popular user interfaces run on top of the [[X Window System]] (X), which provides [[network transparency]], enabling a graphical application running on one machine to be displayed and controlled from another.@@@@1@32@@danf@17-8-2009
10490490@unknown@formal@none@1@S@Other GUIs include [[X window manager]]s such as [[FVWM]], [[Enlightenment (window manager)|Enlightenment]] and [[Window Maker]].@@@@1@15@@danf@17-8-2009
10490500@unknown@formal@none@1@S@The window manager provides a means to control the placement and appearance of individual application windows, and interacts with the X window system.@@@@1@23@@danf@17-8-2009
10490510@unknown@formal@none@1@S@A Linux system usually provides a [[CLI]] of some sort through a [[Shell (computing)|shell]], which is the traditional way of interacting with a Unix system.@@@@1@25@@danf@17-8-2009
10490520@unknown@formal@none@1@S@A Linux distribution specialized for servers may use the CLI as its only interface.@@@@1@14@@danf@17-8-2009
10490530@unknown@formal@none@1@S@A “headless system” run without even a monitor can be controlled by the command line via a protocol such as [[Secure Shell|SSH]] or [[telnet]].@@@@1@24@@danf@17-8-2009
10490540@unknown@formal@none@1@S@Most low-level Linux components, including the GNU [[Userland (computing)|Userland]], use the CLI exclusively.@@@@1@13@@danf@17-8-2009
10490550@unknown@formal@none@1@S@The CLI is particularly suited for automation of repetitive or delayed tasks, and provides very simple [[inter-process communication]].@@@@1@18@@danf@17-8-2009
10490560@unknown@formal@none@1@S@A graphical [[terminal emulator]] program is often used to access the CLI from a Linux desktop.@@@@1@16@@danf@17-8-2009
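The simple inter-process communication that the CLI provides is typically a pipe, in which the output of one program becomes the input of the next. The following is a minimal sketch of that idea, not a description of any particular shell. To stay self-contained it launches two small Python child processes rather than standard tools, and the function name `pipeline` is hypothetical.

```python
import subprocess
import sys


def pipeline(text: str) -> str:
    """Send text through two child processes joined by a pipe.

    This mirrors a shell command of the form `upper | sort`.
    """
    # Stage 1: upper-case everything read from standard input.
    upper = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )
    # Stage 2: sort the lines; its stdin is wired to stage 1's stdout.
    sorter = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; sys.stdout.writelines(sorted(sys.stdin))"],
        stdin=upper.stdout, stdout=subprocess.PIPE, text=True,
    )
    upper.stdout.close()   # the parent no longer needs its copy of the pipe
    upper.stdin.write(text)
    upper.stdin.close()    # signal end of input to stage 1
    output, _ = sorter.communicate()
    upper.wait()
    return output


if __name__ == "__main__":
    print(pipeline("pear\napple\n"), end="")
```

Neither stage knows anything about the other; each only reads standard input and writes standard output, which is why small command-line programs combine so easily in scripts and scheduled jobs.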
10490570@unknown@formal@none@1@S@== Development ==@@@@1@3@@danf@17-8-2009
10490580@unknown@formal@none@1@S@The primary difference between Linux and many other popular contemporary operating systems is that the [[Linux kernel]] and other components are [[free software|free]] and [[open source software]].@@@@1@27@@danf@17-8-2009
10490590@unknown@formal@none@1@S@Linux is not the only such operating system, although it is the best-known and most widely used.@@@@1@17@@danf@17-8-2009
10490600@unknown@formal@none@1@S@Some [[free software license|free]] and [[open source license|open source]] software licenses are based on the principle of [[copyleft]], a kind of reciprocity: any work derived from a copyleft piece of software must also be copyleft itself.@@@@1@36@@danf@17-8-2009
10490610@unknown@formal@none@1@S@The most common free software license, the [[GNU GPL]], is a form of copyleft, and is used for the Linux kernel and many of the components from the [[GNU project]].@@@@1@30@@danf@17-8-2009
10490620@unknown@formal@none@1@S@As an operating system [[underdog (competition)|underdog]] competing with mainstream operating systems, Linux cannot rely on a [[monopoly]] advantage; in order for Linux to be convenient for users, Linux aims for [[interoperability]] with other operating systems and established computing standards.@@@@1@39@@danf@17-8-2009
10490630@unknown@formal@none@1@S@Linux systems adhere to [[POSIX]], [[Single UNIX Specification|SUS]], [[International Organization for Standardization|ISO]] and [[American National Standards Institute|ANSI]] standards where possible, although to date only one Linux distribution has been POSIX.1 certified, Linux-FT.@@@@1@32@@danf@17-8-2009
10490640@unknown@formal@none@1@S@Free software projects, although developed in a [[Collaboration|collaborative]] fashion, are often produced independently of each other.@@@@1@16@@danf@17-8-2009
10490650@unknown@formal@none@1@S@However, given that the software licenses explicitly permit redistribution, this provides a basis for larger scale projects that collect the software produced by stand-alone projects and make it available all at once in the form of a [[Linux distribution]].@@@@1@39@@danf@17-8-2009
10490660@unknown@formal@none@1@S@A [[Linux distribution]], commonly called a “distro”, is a project that manages a remote collection of Linux-based software, and facilitates installation of a Linux operating system.@@@@1@26@@danf@17-8-2009
10490670@unknown@formal@none@1@S@Distributions are maintained by individuals, loose-knit teams, volunteer organizations, and commercial entities.@@@@1@12@@danf@17-8-2009
10490680@unknown@formal@none@1@S@They include system software and [[application software]] in the form of ''packages'', and distribution-specific software for initial system installation and configuration as well as later package upgrades and installs.@@@@1@29@@danf@17-8-2009
10490690@unknown@formal@none@1@S@A distribution is responsible for the default configuration of installed Linux systems, system security, and more generally integration of the different software packages into a coherent whole.@@@@1@27@@danf@17-8-2009
10490700@unknown@formal@none@1@S@=== Community ===@@@@1@3@@danf@17-8-2009
10490710@unknown@formal@none@1@S@Linux is largely driven by its developer and user communities.@@@@1@10@@danf@17-8-2009
10490720@unknown@formal@none@1@S@Some vendors develop and fund their distributions on a volunteer basis, [[Debian]] being a well-known example.@@@@1@16@@danf@17-8-2009
10490730@unknown@formal@none@1@S@Others maintain a community version of their commercial distributions, as [[Red Hat]] does with [[Fedora (Linux distribution)|Fedora]].@@@@1@17@@danf@17-8-2009
10490740@unknown@formal@none@1@S@In many cities and regions, local associations known as [[Linux Users Group]]s (LUGs) seek to promote Linux and by extension free software.@@@@1@22@@danf@17-8-2009
10490750@unknown@formal@none@1@S@They hold meetings and provide free demonstrations, training, technical support, and operating system installation to new users.@@@@1@17@@danf@17-8-2009
10490760@unknown@formal@none@1@S@There are also many [[Internet]] communities that seek to provide support to Linux users and developers.@@@@1@16@@danf@17-8-2009
10490770@unknown@formal@none@1@S@Most distributions and open source projects have [[IRC]] chatrooms or [[newsgroup]]s.@@@@1@11@@danf@17-8-2009
10490780@unknown@formal@none@1@S@[[Online forum]]s are another means for support, with notable examples being [[LinuxQuestions.org]] and the [[Gentoo Linux|Gentoo]] forums.@@@@1@17@@danf@17-8-2009
10490790@unknown@formal@none@1@S@Linux distributions host [[mailing list]]s; commonly there will be a specific topic such as usage or development for a given list.@@@@1@21@@danf@17-8-2009
10490800@unknown@formal@none@1@S@There are several technology websites with a Linux focus.@@@@1@9@@danf@17-8-2009
10490810@unknown@formal@none@1@S@[[Linux Weekly News]] is a weekly digest of Linux-related news; the [[Linux Journal]] is an online magazine of Linux articles published monthly; [[Slashdot]] is a technology-related news website with many stories on Linux and open source software; [[Groklaw]] has written in depth about Linux-related legal proceedings; and there are many articles relevant to the Linux kernel and its relationship with [[GNU]] on the [[GNU Project|GNU project's]] website.@@@@1@67@@danf@17-8-2009
10490820@unknown@formal@none@1@S@Print [[magazine]]s on Linux often come with [[cover disk]]s containing software or even complete Linux distributions.@@@@1@16@@danf@17-8-2009
10490830@unknown@formal@none@1@S@Although Linux is generally available free of charge, several large corporations have established business models that involve selling, supporting, and contributing to Linux and free software.@@@@1@26@@danf@17-8-2009
10490840@unknown@formal@none@1@S@These include [[Dell]], [[IBM]], [[Hewlett-Packard|HP]], [[Sun Microsystems]], [[Novell]], and [[Red Hat]].@@@@1@11@@danf@17-8-2009
10490850@unknown@formal@none@1@S@The free software licenses on which Linux is based explicitly accommodate and encourage commercialization; the relationship between Linux as a whole and individual vendors may be seen as [[symbiosis|symbiotic]].@@@@1@29@@danf@17-8-2009
10490860@unknown@formal@none@1@S@One common business model of commercial suppliers is charging for support, especially for business users.@@@@1@15@@danf@17-8-2009
10490870@unknown@formal@none@1@S@A number of companies also offer a specialized business version of their distribution, which adds proprietary support packages and tools to administer higher numbers of installations or to simplify administrative tasks.@@@@1@31@@danf@17-8-2009
10490880@unknown@formal@none@1@S@Another business model is to give away the software in order to sell hardware.@@@@1@14@@danf@17-8-2009
10490890@unknown@formal@none@1@S@=== Programming on Linux ===@@@@1@5@@danf@17-8-2009
10490900@unknown@formal@none@1@S@Most Linux distributions support dozens of [[programming language]]s.@@@@1@8@@danf@17-8-2009
10490910@unknown@formal@none@1@S@The most common collection of utilities for building both Linux applications and operating system programs is found within the [[GNU toolchain]], which includes the [[GNU Compiler Collection]] (GCC) and the [[GNU build system]].@@@@1@33@@danf@17-8-2009
10490920@unknown@formal@none@1@S@Amongst others, GCC provides compilers for [[Ada (programming language)|Ada]], [[C (programming language)|C]], [[C++]], [[Java (programming language)|Java]], and [[Fortran]].@@@@1@18@@danf@17-8-2009
10490930@unknown@formal@none@1@S@The Linux kernel itself is written to be compiled with GCC.@@@@1@11@@danf@17-8-2009
10490940@unknown@formal@none@1@S@[[Proprietary software|Proprietary]] compilers for Linux include the [[Intel C++ Compiler]] and IBM XL C/C++ Compiler.@@@@1@15@@danf@17-8-2009
10490950@unknown@formal@none@1@S@Most distributions also include support for [[Perl]], [[Ruby programming language|Ruby]], [[Python programming language|Python]] and other [[Dynamic programming language|dynamic languages]].@@@@1@19@@danf@17-8-2009
10490960@unknown@formal@none@1@S@Examples of languages that are less common, but still well-supported, are [[C Sharp (programming language)|C#]] via the [[Mono (software)|Mono]] project, sponsored by [[Novell]], and [[Scheme programming language|Scheme]].@@@@1@27@@danf@17-8-2009
10490970@unknown@formal@none@1@S@A number of [[Java Virtual Machine]]s and development kits run on Linux, including the original Sun Microsystems JVM ([[HotSpot]]), and IBM's J2SE RE, as well as many open-source projects like [[Kaffe]].@@@@1@31@@danf@17-8-2009
10490980@unknown@formal@none@1@S@The two main frameworks for developing graphical applications are those of [[GNOME]] and [[KDE]].@@@@1@14@@danf@17-8-2009
10490990@unknown@formal@none@1@S@These projects are based on the [[GTK+]] and [[Qt (toolkit)|Qt]] [[widget toolkit]]s, respectively, which can also be used independently of the larger framework.@@@@1@23@@danf@17-8-2009
10491000@unknown@formal@none@1@S@Both support a wide variety of languages.@@@@1@7@@danf@17-8-2009
10491010@unknown@formal@none@1@S@There are a number of [[Integrated development environment]]s available including [[Anjuta]], [[Code::Blocks]], [[Eclipse (computing)|Eclipse]], [[KDevelop]], [[Lazarus (software)|Lazarus]], [[MonoDevelop]], [[NetBeans]], and [[Omnis Studio]] while the long-established editors [[Vim (text editor)|Vim]] and [[Emacs]] remain popular.@@@@1@33@@danf@17-8-2009
10491020@unknown@formal@none@1@S@== Uses ==@@@@1@3@@danf@17-8-2009
10491030@unknown@formal@none@1@S@As well as those designed for general purpose use on desktops and servers, distributions may be specialized for different purposes including: [[computer architecture]] support, [[Embedded Linux|embedded systems]], stability, security, localization to a specific region or language, targeting of specific user groups, support for [[real-time computing|real-time]] applications, or commitment to a given desktop environment.@@@@1@53@@danf@17-8-2009
10491040@unknown@formal@none@1@S@Furthermore, some distributions deliberately include only [[free software]].@@@@1@8@@danf@17-8-2009
10491050@unknown@formal@none@1@S@Currently, over three hundred distributions are actively developed, with about a dozen distributions being most popular for general-purpose use.@@@@1@19@@danf@17-8-2009
10491060@unknown@formal@none@1@S@Linux is a widely [[porting|ported]] operating system.@@@@1@7@@danf@17-8-2009
10491070@unknown@formal@none@1@S@While the Linux kernel was originally designed only for [[Intel 80386]] [[microprocessor]]s, it now runs on a more diverse range of [[computer architecture]]s than any other operating system: in the hand-held [[ARM architecture|ARM]]-based [[iPAQ]] and the [[mainframe computer|mainframe]] [[IBM]] [[System z9]], in devices ranging from [[mobile phone]]s to [[supercomputer]]s.@@@@1@49@@danf@17-8-2009
10491080@unknown@formal@none@1@S@Specialized distributions exist for less mainstream architectures.@@@@1@7@@danf@17-8-2009
10491090@unknown@formal@none@1@S@The [[ELKS]] kernel [[fork (software development)|fork]] can run on [[Intel 8086]] or [[Intel 80286]] [[16-bit]] microprocessors, while the [[µClinux]] kernel fork may run on systems without a [[memory management unit]].@@@@1@30@@danf@17-8-2009
10491100@unknown@formal@none@1@S@The kernel also runs on architectures that were only ever intended to use a manufacturer-created operating system, such as [[Macintosh]] computers, [[Personal digital assistant|PDA]]s, [[video game console]]s, [[Digital audio player|portable music players]], and [[mobile phone]]s.@@@@1@35@@danf@17-8-2009
10491110@unknown@formal@none@1@S@=== Desktop ===@@@@1@3@@danf@17-8-2009
10491120@unknown@formal@none@1@S@Although there is a lack of Linux ports for some [[Mac OS X]] and [[Microsoft Windows]] programs in domains such as [[desktop publishing]] and [[professional audio]], applications equivalent to those available for Mac and Windows are available for Linux.@@@@1@39@@danf@17-8-2009
10491130@unknown@formal@none@1@S@Most Linux distributions provide a program for browsing a list of thousands of [[free software]] applications that have already been tested and configured for a specific distribution.@@@@1@27@@danf@17-8-2009
10491140@unknown@formal@none@1@S@These free programs can be downloaded and installed with one mouse click, and a digital signature guarantees that no one has added a virus or spyware to these programs.@@@@1@29@@danf@17-8-2009
10491150@unknown@formal@none@1@S@Many [[free software]] titles that are popular on Windows, such as [[Pidgin (software)|Pidgin]], [[Mozilla Firefox]], [[Openoffice.org]], and [[GIMP]], are available for Linux.@@@@1@22@@danf@17-8-2009
10491160@unknown@formal@none@1@S@A growing amount of proprietary desktop software is also supported under Linux, examples being [[Adobe Flash Player]], [[Adobe Acrobat|Acrobat Reader]], [[Matlab]], [[Nero Burning ROM]], [[Opera (Internet suite)|Opera]], [[RealPlayer]], and [[Skype]].@@@@1@30@@danf@17-8-2009
10491170@unknown@formal@none@1@S@In the field of animation and visual effects, most high-end software, such as AutoDesk Maya, Softimage XSI and Apple Shake, is available for Linux, Windows and/or Mac OS X.@@@@1@29@@danf@17-8-2009
10491180@unknown@formal@none@1@S@[[CrossOver]] is a proprietary solution based on the open source [[Wine (software)|Wine]] project that supports running older Windows versions of [[Microsoft Office]] and [[Adobe Photoshop]] versions through CS2.@@@@1@28@@danf@17-8-2009
10491190@unknown@formal@none@1@S@[[Microsoft Office 2007]] and Adobe Photoshop CS3 are known not to work.@@@@1@12@@danf@17-8-2009
10491200@unknown@formal@none@1@S@Besides the free Windows compatibility layer [[Wine (software)|Wine]], most distributions offer [[Dual boot]] and [[X86 virtualization]] for running both Linux and Windows on the same computer.@@@@1@26@@danf@17-8-2009
10491210@unknown@formal@none@1@S@Linux's open nature allows distributed teams to [[L10n|localize]] Linux distributions for use in locales where localizing proprietary systems would not be cost-effective.@@@@1@22@@danf@17-8-2009
10491220@unknown@formal@none@1@S@For example, the [[Sinhalese language]] version of the [[Knoppix]] distribution was available for a long time before [[Microsoft Windows XP]] was translated to Sinhalese.@@@@1@24@@danf@17-8-2009
10491230@unknown@formal@none@1@S@In this case the Lanka Linux User Group played a major part in developing the localized system by combining the knowledge of university professors, [[linguist]]s, and local developers.@@@@1@28@@danf@17-8-2009
10491240@unknown@formal@none@1@S@The performance of Linux on the desktop has been a controversial topic, with at least one key Linux kernel developer, Con Kolivas, accusing the Linux community of favouring performance on servers.@@@@1@31@@danf@17-8-2009
10491250@unknown@formal@none@1@S@He quit Linux development because he was frustrated with this lack of focus on the desktop, and then gave a 'tell all' interview on the topic.@@@@1@26@@danf@17-8-2009
10491260@unknown@formal@none@1@S@=== Servers and supercomputers ===@@@@1@5@@danf@17-8-2009
10491270@unknown@formal@none@1@S@Historically, Linux has mainly been used as a [[Server (computing)|server]] operating system, and has risen to prominence in that area; [[Netcraft]] reported in September 2006 that eight of the ten most reliable internet hosting companies run Linux on their [[web server]]s.@@@@1@41@@danf@17-8-2009
10491280@unknown@formal@none@1@S@This is due to its relative stability and long uptime, and the fact that desktop software with a graphical user interface for servers is often unneeded.@@@@1@26@@danf@17-8-2009
10491290@unknown@formal@none@1@S@Enterprise and non-enterprise Linux distributions may be found running on servers.@@@@1@11@@danf@17-8-2009
10491300@unknown@formal@none@1@S@Linux is the cornerstone of the [[LAMP (software bundle)|LAMP]] server-software combination (Linux, [[Apache HTTP Server|Apache]], [[MySQL]], [[Perl]]/[[PHP]]/[[Python (programming language)|Python]]) which has achieved popularity among developers, and which is one of the more common platforms for website hosting.@@@@1@37@@danf@17-8-2009
10491310@unknown@formal@none@1@S@Linux is commonly used as an operating system for [[supercomputer]]s.@@@@1@10@@danf@17-8-2009
10491320@unknown@formal@none@1@S@As of [[November 2007]], out of the top 500 systems, 426 (85.2%) run Linux.@@@@1@14@@danf@17-8-2009
10491330@unknown@formal@none@1@S@=== Embedded devices ===@@@@1@4@@danf@17-8-2009
10491340@unknown@formal@none@1@S@Due to its low cost and ability to be easily modified, an [[embedded Linux]] is often used in [[embedded systems]].@@@@1@20@@danf@17-8-2009
10491350@unknown@formal@none@1@S@Linux has become a major competitor to the proprietary [[Symbian OS]] found in the majority of smartphones — 16.7% of [[smartphone]]s sold worldwide during 2006 were using Linux — and it is an alternative to the proprietary [[Windows CE]] and [[Palm OS]] operating systems on [[mobile device]]s.@@@@1@47@@danf@17-8-2009
10491360@unknown@formal@none@1@S@From 2007, cell phones and PDAs running Linux and built on open source platforms became a trend; examples include the [[Nokia N810]], [[Openmoko]]'s [[Neo1973]] and the ongoing [[Google Android]] project.@@@@1@30@@danf@17-8-2009
10491370@unknown@formal@none@1@S@The popular [[TiVo]] digital video recorder uses a customized version of Linux.@@@@1@12@@danf@17-8-2009
10491380@unknown@formal@none@1@S@Several network [[firewall]] and [[router]] standalone products, including several from [[Linksys]], use Linux internally, using its advanced firewall and routing capabilities.@@@@1@21@@danf@17-8-2009
10491390@unknown@formal@none@1@S@The [[Korg OASYS]] and the [[Yamaha Motif|Yamaha Motif XS]] [[music workstation]]s also run Linux.@@@@1@14@@danf@17-8-2009
10491400@unknown@formal@none@1@S@Furthermore, Linux is used in the leading [[stage lighting]] control system, the FlyingPig/HighEnd WholeHogIII Console.@@@@1@15@@danf@17-8-2009
10491410@unknown@formal@none@1@S@=== Market share and uptake ===@@@@1@6@@danf@17-8-2009
10491420@unknown@formal@none@1@S@Many quantitative studies of open source software focus on topics including market share and reliability, with numerous studies specifically examining Linux.@@@@1@21@@danf@17-8-2009
10491430@unknown@formal@none@1@S@The Linux market is growing rapidly, and the revenue of servers, desktops, and packaged software running Linux is expected to exceed $35.7 billion by 2008.@@@@1@25@@danf@17-8-2009
10491440@unknown@formal@none@1@S@[[International Data Corporation|IDC]]'s report for Q1 2007 says that Linux now holds 12.7% of the overall server market.@@@@1@18@@danf@17-8-2009
10491450@unknown@formal@none@1@S@This estimate was based on the number of Linux servers sold by various companies.@@@@1@14@@danf@17-8-2009
10491460@unknown@formal@none@1@S@Desktop adoption of Linux is approximately 1%.@@@@1@7@@danf@17-8-2009
10491470@unknown@formal@none@1@S@In comparison, [[List of Microsoft operating systems|Microsoft operating systems]] hold more than 90%.@@@@1@13@@danf@17-8-2009
10491480@unknown@formal@none@1@S@The frictional cost of switching operating systems and lack of support for certain hardware and application programs designed for [[Microsoft Windows]] have been two factors that have inhibited adoption.@@@@1@29@@danf@17-8-2009
10491490@unknown@formal@none@1@S@Proponents and analysts attribute the relative success of Linux to its security, reliability, low cost, and freedom from [[vendor lock-in]].@@@@1@20@@danf@17-8-2009
10491500@unknown@formal@none@1@S@More recently, Google has begun to fund [[Wine (software)|Wine]], which acts as a compatibility layer, allowing users to run some Windows programs under Linux.@@@@1@24@@danf@17-8-2009
10491510@unknown@formal@none@1@S@The [[OLPC XO-1|XO laptop]] project of One Laptop Per Child is creating a new and potentially much larger Linux community, planned to reach [http://www.laptop.org/en/vision/mission/index.shtml several hundred million schoolchildren] and their families and communities in developing countries.@@@@1@36@@danf@17-8-2009
10491515@unknown@formal@none@1@S@[http://wiki.laptop.org/go/countries Six countries] have ordered a million or more units each for delivery in 2007 to distribute to schoolchildren at no charge.@@@@1@22@@danf@17-8-2009
10491520@unknown@formal@none@1@S@[[Google]], [[Red Hat]], and [[eBay]] are major supporters of the project.@@@@1@11@@danf@17-8-2009
10491530@unknown@formal@none@1@S@== Copyright and naming ==@@@@1@5@@danf@17-8-2009
10491540@unknown@formal@none@1@S@The Linux kernel and most GNU software are [[software license|license]]d under the [[GNU General Public License]] (GPL).@@@@1@17@@danf@17-8-2009
10491550@unknown@formal@none@1@S@The GPL requires that anyone who distributes the Linux kernel must make the source code (and any modifications) available to the recipient under the same terms.@@@@1@26@@danf@17-8-2009
10491560@unknown@formal@none@1@S@In 1997, Linus Torvalds stated, “Making Linux GPL'd was definitely the best thing I ever did.”@@@@1@16@@danf@17-8-2009
10491570@unknown@formal@none@1@S@Other key components of a Linux system may use other licenses; many libraries use the [[GNU Lesser General Public License]] (LGPL), a more permissive variant of the GPL, and the [[X Window System]] uses the [[MIT License]].@@@@1@37@@danf@17-8-2009
10491580@unknown@formal@none@1@S@Torvalds has publicly stated that he would not move the Linux kernel (currently licensed under GPL version 2) to version 3 of the GPL, released in mid-2007, specifically citing some provisions in the new license which prohibit the use of the software in [[digital rights management]].@@@@1@46@@danf@17-8-2009
10491590@unknown@formal@none@1@S@A 2001 study of [[Red Hat Linux]] 7.1 found that this distribution contained 30 million [[source lines of code]].@@@@1@19@@danf@17-8-2009
10491600@unknown@formal@none@1@S@Using the [[COCOMO|Constructive Cost Model]], the study estimated that this distribution required about eight thousand man-years of development time.@@@@1@19@@danf@17-8-2009
10491610@unknown@formal@none@1@S@According to the study, if all this software had been developed by conventional [[proprietary software|proprietary]] means, it would have cost about 1.08 billion dollars (year 2000 U.S. dollars) to develop in the United States.@@@@1@34@@danf@17-8-2009
10491620@unknown@formal@none@1@S@Most of the code (71%) was written in the [[C (programming language)|C]] [[computer programming|programming]] [[programming language|language]], but many other languages were used, including [[C++]], [[assembly language]], [[Perl]], [[Python (programming language)|Python]], [[Fortran]], and various [[shell script]]ing languages.@@@@1@36@@danf@17-8-2009
10491630@unknown@formal@none@1@S@Slightly over half of all lines of code were licensed under the GPL.@@@@1@13@@danf@17-8-2009
10491640@unknown@formal@none@1@S@The Linux kernel itself was 2.4 million lines of code, or 8% of the total.@@@@1@15@@danf@17-8-2009
10491650@unknown@formal@none@1@S@In a later study, the same analysis was performed for Debian GNU/Linux version 4.0.@@@@1@14@@danf@17-8-2009
10491660@unknown@formal@none@1@S@This distribution contained over 283 million source lines of code, and the study estimated that it would have cost 5.4 billion Euros to develop by conventional means.@@@@1@27@@danf@17-8-2009
10491670@unknown@formal@none@1@S@In the United States, the name ''Linux'' is a [[trademark]] registered to Linus Torvalds.@@@@1@14@@danf@17-8-2009
10491680@unknown@formal@none@1@S@Initially, nobody registered it, but on [[August 15]] [[1994]], William R. Della Croce, Jr. filed for the trademark ''Linux'', and then demanded royalties from Linux distributors.@@@@1@26@@danf@17-8-2009
10491690@unknown@formal@none@1@S@In 1996, Torvalds and some affected organizations sued him to have the trademark assigned to Torvalds, and in 1997 the case was settled.@@@@1@23@@danf@17-8-2009
10491700@unknown@formal@none@1@S@The licensing of the trademark has since been handled by the [[Linux Mark Institute]].@@@@1@14@@danf@17-8-2009
10491710@unknown@formal@none@1@S@Torvalds has stated that he only trademarked the name to prevent someone else from using it, but was bound in 2005 by [[United States trademark law]] to take active measures to enforce the trademark.@@@@1@34@@danf@17-8-2009
10491720@unknown@formal@none@1@S@As a result, the LMI sent out a number of letters to distribution vendors requesting that a fee be paid for the use of the name, and a number of companies have complied.@@@@1@33@@danf@17-8-2009
10491730@unknown@formal@none@1@S@=== GNU/Linux ===@@@@1@3@@danf@17-8-2009
10491740@unknown@formal@none@1@S@The [[Free Software Foundation]] views Linux distributions which use GNU software as [[GNU variants]] and asks that such operating systems be referred to as ''GNU/Linux'' or ''a Linux-based GNU system''.@@@@1@30@@danf@17-8-2009
10491750@unknown@formal@none@1@S@However, the media and population at large refer to this family of operating systems simply as ''Linux''.@@@@1@17@@danf@17-8-2009
10491760@unknown@formal@none@1@S@While some distributors make a point of using the aggregate form, most notably [[Debian]] with the ''[[Debian GNU/Linux]]'' distribution, the term's use outside of the enthusiast community is limited.@@@@1@29@@danf@17-8-2009
10491770@unknown@formal@none@1@S@The distinction between the Linux kernel and distributions based on it plus the GNU system is a source of confusion to many newcomers, and the naming remains controversial, as many large Linux distributions (e.g. [[Ubuntu]] and [[SuSE]] Linux) are simply using the ''Linux'' name, rather than ''GNU/Linux''.@@@@1@47@@danf@17-8-2009
10500010@unknown@formal@none@1@S@List of chatterbots@@@@1@3@@danf@17-8-2009
10500020@unknown@formal@none@1@S@==Chatterbot Directories==@@@@1@2@@danf@17-8-2009
10500040@unknown@formal@none@1@S@*[http://www.simonlaven.com Chatterbot Central] at [http://www.simonlaven.com The Simon Laven Page]@@@@1@9@@danf@17-8-2009
10500050@unknown@formal@none@1@S@*[http://www.aidreams.co.uk/chatterbotcollection/index.htm The Chatterbot Collection]@@@@1@4@@danf@17-8-2009
10500060@unknown@formal@none@1@S@*[http://www.aihub.org AI Hub] - A directory of news, programs, and links all related to chatterbots and Artificial Intelligence@@@@1@18@@danf@17-8-2009
10500070@unknown@formal@none@1@S@*[http://www.chatterboxchallenge.com/bots_dir.php The Chatterbox Challenge Bots Directory] at [http://www.chatterboxchallenge.com The Chatterbox Challenge]@@@@1@11@@danf@17-8-2009
10500080@unknown@formal@none@1@S@==Classic Chatterbots==@@@@1@2@@danf@17-8-2009
10500090@unknown@formal@none@1@S@*[[Dr. Sbaitso]]@@@@1@2@@danf@17-8-2009
10500100@unknown@formal@none@1@S@*[[ELIZA]]@@@@1@1@@danf@17-8-2009
10500110@unknown@formal@none@1@S@*[[PARRY]]@@@@1@1@@danf@17-8-2009
10500120@unknown@formal@none@1@S@*[[Racter]]@@@@1@1@@danf@17-8-2009
10500130@unknown@formal@none@1@S@==General Chatterbots==@@@@1@2@@danf@17-8-2009
10500140@unknown@formal@none@1@S@*[[Artificial Linguistic Internet Computer Entity|A.L.I.C.E.]] and other [[Alicebot]]/pandorabot-based ([http://www.titane.ca/concordia/dfar251/igod/main.html iGod], [http://www.mousebreaker.com/games/chatbot/play.php Mitsuku], [http://www.friendbot.co.uk FriendBot], etc.)@@@@1@15@@danf@17-8-2009
10500150@unknown@formal@none@1@S@*[[Albert One]]@@@@1@2@@danf@17-8-2009
10500160@unknown@formal@none@1@S@*[[ALIMbot]]@@@@1@1@@danf@17-8-2009
10500170@unknown@formal@none@1@S@*[[CHAT and TIPS]]@@@@1@3@@danf@17-8-2009
10500180@unknown@formal@none@1@S@*[http://www.chat-bot.com Chat-bot]@@@@1@2@@danf@17-8-2009
10500190@unknown@formal@none@1@S@*[[Claude Chatterbot|Claude]]@@@@1@2@@danf@17-8-2009
10500200@unknown@formal@none@1@S@*[http://www.dadorac.com Dadorac]@@@@1@2@@danf@17-8-2009
10500210@unknown@formal@none@1@S@*[http://www.dai2.co.uk/ DAI2] - A dynamic artificial intelligence which learns from its surrounding community@@@@1@13@@danf@17-8-2009
10500220@unknown@formal@none@1@S@*[http://www.elbot.com/ Elbot]@@@@1@2@@danf@17-8-2009
10500230@unknown@formal@none@1@S@*[[Ella Chatterbot|Ella]]@@@@1@2@@danf@17-8-2009
10500240@unknown@formal@none@1@S@*[[Fred Chatterbot|Fred]]@@@@1@2@@danf@17-8-2009
10500250@unknown@formal@none@1@S@*[[Jabberwacky]]@@@@1@1@@danf@17-8-2009
10500260@unknown@formal@none@1@S@*[http://www.abenteuermedien.de/jabberwock Jabberwock]@@@@1@2@@danf@17-8-2009
10500270@unknown@formal@none@1@S@*[http://www.jeeney.com/ Jeeney AI]@@@@1@3@@danf@17-8-2009
10500280@unknown@formal@none@1@S@*[http://www.jixperts.com?lang=en JIxperts] – collection of wiki chatterbots.@@@@1@7@@danf@17-8-2009
10500290@unknown@formal@none@1@S@*[http://www.iaindustrie.fr.nf KAR Intelligent Computer]@@@@1@4@@danf@17-8-2009
10500300@unknown@formal@none@1@S@*[http://www.leeds-city-guide.com/kyle Kyle] – A unique learning Artificial Intelligence chatbot, which employs contextual learning algorithms.@@@@1@14@@danf@17-8-2009
10500310@unknown@formal@none@1@S@*[[MegaHal]]@@@@1@1@@danf@17-8-2009
10500320@unknown@formal@none@1@S@*[[Mr Know-It-All]]@@@@1@2@@danf@17-8-2009
10500330@unknown@formal@none@1@S@*Oliverbot@@@@1@1@@danf@17-8-2009
10500340@unknown@formal@none@1@S@*[http://uk.geocities.com/mattbrown1101/ Poseidon]@@@@1@2@@danf@17-8-2009
10500350@unknown@formal@none@1@S@*[http://www.infradrive.com/robomatic.php RoboMatic X1] - A chatbot which controls the user's PC through chatting by their voice or by typing.@@@@1@19@@danf@17-8-2009
10500360@unknown@formal@none@1@S@*[http://www.cooldictionary.com/splotchy.mpl Splotchy]@@@@1@2@@danf@17-8-2009
10500370@unknown@formal@none@1@S@*[[Starship Titanic#Spookitalk|Spookitalk]] - A chatterbot used for [[Non-player character|NPC]]s in [[Douglas Adams]]' ''Starship Titanic'' video game.@@@@1@16@@danf@17-8-2009
10500380@unknown@formal@none@1@S@*[http://www.onebigspace.com/ Thomas]@@@@1@2@@danf@17-8-2009
10500390@unknown@formal@none@1@S@*[[Ultra Hal Assistant]]@@@@1@3@@danf@17-8-2009
10500400@unknown@formal@none@1@S@*[[Verbot]]@@@@1@1@@danf@17-8-2009
10500410@unknown@formal@none@1@S@*[http://www.yhaken.com/ Yhaken]@@@@1@2@@danf@17-8-2009
10500420@unknown@formal@none@1@S@*[http://www.scientiobot.com ScientioBot] - A new technology chatterbot using concept mining techniques accessible via a free web service.@@@@1@17@@danf@17-8-2009
10500430@unknown@formal@none@1@S@*[http://nicole.jetaylor.net NICOLE] A simple chatterbot with the ability to learn new phrases.@@@@1@12@@danf@17-8-2009
10500440@unknown@formal@none@1@S@==[[Instant messenger|IM]] Chatterbots==@@@@1@3@@danf@17-8-2009
10500450@unknown@formal@none@1@S@*DAI2 is also available on the MSN / Windows Live network as dai2\sdai2.co.uk@@@@1@13@@danf@17-8-2009
10500460@unknown@formal@none@1@S@*[http://www.dnreg.org/bot/ MSN Quickbot]@@@@1@3@@danf@17-8-2009
10500470@unknown@formal@none@1@S@*[http://www.smarterchild.com SmarterChild]@@@@1@2@@danf@17-8-2009
10500480@unknown@formal@none@1@S@*[http://www.spleak.com Spleak]@@@@1@2@@danf@17-8-2009
10500490@unknown@formal@none@1@S@*[http://www.mrmovie.com MrMovie] - searches for actors/movies/DVDs in IM (Skype, AOL/AIM or MSN/Live)@@@@1@12@@danf@17-8-2009
10500500@unknown@formal@none@1@S@*[[Inside Messenger Bot|InsideMessenger]]@@@@1@3@@danf@17-8-2009
10500510@unknown@formal@none@1@S@*[http://www.inocu.jt-online.co.uk Inocu] - (MSN/Live)@@@@1@4@@danf@17-8-2009
10500520@unknown@formal@none@1@S@*[http://www.friendbot.co.uk FriendBot-An AIM Chatterbot]@@@@1@4@@danf@17-8-2009
10500530@unknown@formal@none@1@S@*[http://www.amsn-project.net/plugins.php amsnEliza plugin for aMSN]@@@@1@5@@danf@17-8-2009
10500540@unknown@formal@none@1@S@*[[Inside Messenger Bot|TrixieMouse]]@@@@1@3@@danf@17-8-2009
10500550@unknown@formal@none@1@S@*[http://www.infobot.pl/ Infobot] - Polish informational bot for Gadu-gadu, Skype and Jabber@@@@1@11@@danf@17-8-2009
10500560@unknown@formal@none@1@S@==AIML Chatterbots==@@@@1@2@@danf@17-8-2009
10500570@unknown@formal@none@1@S@*[http://www.taik.fi/turingenigma Alan] - In ''Turing Enigma'' Alan Turing's spirit has infiltrated the World War II encrypting device Enigma.@@@@1@18@@danf@17-8-2009
10500580@unknown@formal@none@1@S@*[http://www.dustyant.com/projects/deebot/ Deeb0t]@@@@1@2@@danf@17-8-2009
10500590@unknown@formal@none@1@S@*[http://www.pandorabots.com/pandora/talk?botid=b0dafd24ee35a477 Chomsky] A chatbot that uses a smiley face to convey emotions.@@@@1@12@@danf@17-8-2009
10500600@unknown@formal@none@1@S@It uses the information in Wikipedia to build its conversations and has links to Wikipedia articles.@@@@1@16@@danf@17-8-2009
10500610@unknown@formal@none@1@S@*[[John Lennon Artificial Intelligence Project]]@@@@1@5@@danf@17-8-2009
10500620@unknown@formal@none@1@S@*[[SitePal]]@@@@1@1@@danf@17-8-2009
10500630@unknown@formal@none@1@S@==JFred Chatterbots==@@@@1@2@@danf@17-8-2009
10500640@unknown@formal@none@1@S@*[[The Turing Hub]]@@@@1@3@@danf@17-8-2009
10500650@unknown@formal@none@1@S@==Educational Chatterbots==@@@@1@2@@danf@17-8-2009
10500660@unknown@formal@none@1@S@*[http://www.philocomp.net/?pageref=ai&page=elizabeth Elizabeth] Aims to teach AI techniques and concepts, starting from chatterbot design.@@@@1@13@@danf@17-8-2009
10500670@unknown@formal@none@1@S@Accompanied by self-teaching materials, as used at the University of Leeds.@@@@1@11@@danf@17-8-2009
10500680@unknown@formal@none@1@S@==Non-English Chatterbots==@@@@1@2@@danf@17-8-2009
10500690@unknown@formal@none@1@S@*[http://www.geocities.com/brizglace/amanda.htm Amanda] - (French) with source code for Windows.@@@@1@9@@danf@17-8-2009
10500700@unknown@formal@none@1@S@*[[Proteus]]@@@@1@1@@danf@17-8-2009
10500710@unknown@formal@none@1@S@*[msnim:chat?contact=senhorbot\shotmail.com Senhor Bot] (Brazilian bot for MSN)@@@@1@7@@danf@17-8-2009
10500720@unknown@formal@none@1@S@[[Category:Chatterbots|*]]@@@@1@1@@danf@17-8-2009
10500730@unknown@formal@none@1@S@[[bn:চ্যাটারবটসমূহের তালিকা]]@@@@1@2@@danf@17-8-2009
10510010@unknown@formal@none@1@S@Loebner prize@@@@1@2@@danf@17-8-2009
10510020@unknown@formal@none@1@S@The '''Loebner Prize''' is an annual competition that awards prizes to the [[Chatterbot]] considered by the judges to be the most [[Artificial intelligence|humanlike]] of those entered.@@@@1@26@@danf@17-8-2009
10510030@unknown@formal@none@1@S@The format of the competition is that of a standard [[Turing test]].@@@@1@12@@danf@17-8-2009
10510040@unknown@formal@none@1@S@In the Loebner Prize, as in a Turing test, a human judge is faced with two computer screens.@@@@1@18@@danf@17-8-2009
10510050@unknown@formal@none@1@S@One is under the control of a computer; the other is under the control of a human.@@@@1@17@@danf@17-8-2009
10510060@unknown@formal@none@1@S@The judge poses questions to the two screens and receives answers.@@@@1@11@@danf@17-8-2009
10510070@unknown@formal@none@1@S@Based upon the answers, the judge must decide which screen is controlled by the human and which is controlled by the computer program.@@@@1@23@@danf@17-8-2009
10510080@unknown@formal@none@1@S@The contest was begun in 1990 by [[Hugh Loebner]] in conjunction with the [[Cambridge Center for Behavioral Studies]] of [[Massachusetts]], [[United States]].@@@@1@22@@danf@17-8-2009
10510090@unknown@formal@none@1@S@It has since been associated with [[Flinders University]], [[Dartmouth College]], the [[Science Museum (London)|Science Museum]] in [[London]], and most recently the [[University of Reading]].@@@@1@24@@danf@17-8-2009
10510100@unknown@formal@none@1@S@Within the field of artificial intelligence, the Loebner Prize is somewhat controversial; the most prominent critic, [[Marvin Minsky]], has called it a publicity stunt that does not help the field along.@@@@1@31@@danf@17-8-2009
10510110@unknown@formal@none@1@S@==Prizes==@@@@1@1@@danf@17-8-2009
10510120@unknown@formal@none@1@S@The prizes for each year include:@@@@1@6@@danf@17-8-2009
10510130@unknown@formal@none@1@S@* $2,000 for the most human-seeming of all chatterbots for that year - awarded every year.@@@@1@16@@danf@17-8-2009
10510140@unknown@formal@none@1@S@In 2005, the prize was increased to $3,000, and the prize was $2,250 in 2006.@@@@1@15@@danf@17-8-2009
10510150@unknown@formal@none@1@S@In 2008 the prize will be $3,000.@@@@1@7@@danf@17-8-2009
10510160@unknown@formal@none@1@S@* $25,000 for the first chatterbot that judges cannot distinguish from a real human in a text-only Turing test, and that can convince judges that the other (human) entity they are talking to simultaneously is a computer.@@@@1@37@@danf@17-8-2009
10510165@unknown@formal@none@1@S@''(to be awarded once only)''@@@@1@5@@danf@17-8-2009
10510170@unknown@formal@none@1@S@* $100,000 to the first chatterbot that judges cannot distinguish from a real human in a Turing test that includes deciphering and understanding text, visual, and auditory input.@@@@1@28@@danf@17-8-2009
10510175@unknown@formal@none@1@S@''(to be awarded once only)''@@@@1@5@@danf@17-8-2009
10510180@unknown@formal@none@1@S@The Loebner Prize dissolves once the $100,000 prize is won.@@@@1@10@@danf@17-8-2009
10510190@unknown@formal@none@1@S@==2008 Loebner Prize==@@@@1@3@@danf@17-8-2009
10510200@unknown@formal@none@1@S@The 2008 Competition is to be held on Sunday, [[12 October]], at the University of Reading, [[United Kingdom|UK]].@@@@1@18@@danf@17-8-2009
10510210@unknown@formal@none@1@S@The event, which is being co-directed by [[Kevin Warwick]], will include a direct challenge on the [[Turing test]] as originally proposed by [[Alan Turing]].@@@@1@24@@danf@17-8-2009
10510220@unknown@formal@none@1@S@The first place winner will receive $3000.00 and a bronze medal.@@@@1@11@@danf@17-8-2009
10510230@unknown@formal@none@1@S@==2007 Loebner Prize==@@@@1@3@@danf@17-8-2009
10510240@unknown@formal@none@1@S@The 2007 Competition was held on Sunday, [[21 October]] in [[New York City]].@@@@1@13@@danf@17-8-2009
10510250@unknown@formal@none@1@S@The participants in the contest were:@@@@1@6@@danf@17-8-2009
10510260@unknown@formal@none@1@S@* [[Rollo Carpenter]] from Icogno, creator of [[Jabberwacky]]@@@@1@8@@danf@17-8-2009
10510270@unknown@formal@none@1@S@* Noah Duncan, private entry, creator of Cletus@@@@1@8@@danf@17-8-2009
10510280@unknown@formal@none@1@S@* Robert Medeksza from Zabaware, creator of [[Ultra Hal Assistant]]@@@@1@10@@danf@17-8-2009
10510290@unknown@formal@none@1@S@No bot passed the Turing test, but the judges ranked the bots by which seemed "most human".@@@@1@17@@danf@17-8-2009
10510300@unknown@formal@none@1@S@The results of the contest were:@@@@1@6@@danf@17-8-2009
10510310@unknown@formal@none@1@S@* 1st place: Robert Medeksza@@@@1@5@@danf@17-8-2009
10510320@unknown@formal@none@1@S@* 2nd place: Noah Duncan@@@@1@5@@danf@17-8-2009
10510330@unknown@formal@none@1@S@* 3rd place: Rollo Carpenter@@@@1@5@@danf@17-8-2009
10510340@unknown@formal@none@1@S@The winner received $2250 and the Annual Medal.@@@@1@8@@danf@17-8-2009
10510350@unknown@formal@none@1@S@The runners up received $250 each.@@@@1@6@@danf@17-8-2009
10510360@unknown@formal@none@1@S@==2006 Loebner Prize==@@@@1@3@@danf@17-8-2009
10510370@unknown@formal@none@1@S@On Wednesday, [[August 30]], the finalists for the 2006 Loebner Prize were announced.@@@@1@13@@danf@17-8-2009
10510380@unknown@formal@none@1@S@The finalists were:@@@@1@3@@danf@17-8-2009
10510390@unknown@formal@none@1@S@* Rollo Carpenter@@@@1@3@@danf@17-8-2009
10510400@unknown@formal@none@1@S@* Richard Churchill and Marie-Claire Jenkins@@@@1@6@@danf@17-8-2009
10510410@unknown@formal@none@1@S@* Noah Duncan@@@@1@3@@danf@17-8-2009
10510420@unknown@formal@none@1@S@* Robert Medeksza@@@@1@3@@danf@17-8-2009
10510430@unknown@formal@none@1@S@The contest was held on Sunday, [[17 September]] at the Torrington Theatre, [[University College London]].@@@@1@15@@danf@17-8-2009
10510440@unknown@formal@none@1@S@==Winners==@@@@1@1@@danf@17-8-2009
10520010@unknown@formal@none@1@S@Machine learning@@@@1@2@@danf@17-8-2009
10520020@unknown@formal@none@1@S@As a broad subfield of [[artificial intelligence]], '''machine learning''' is concerned with the design and development of [[algorithm]]s and techniques that allow computers to "learn".@@@@1@25@@danf@17-8-2009
10520030@unknown@formal@none@1@S@At a general level, there are two types of learning: [[Inductive reasoning|inductive]], and [[Deductive reasoning|deductive]].@@@@1@15@@danf@17-8-2009
10520040@unknown@formal@none@1@S@Inductive machine learning methods extract rules and patterns out of massive data sets.@@@@1@13@@danf@17-8-2009
10520050@unknown@formal@none@1@S@The major focus of machine learning research is to extract information from data automatically, by computational and statistical methods.@@@@1@19@@danf@17-8-2009
10520060@unknown@formal@none@1@S@Hence, machine learning is closely related not only to [[data mining]] and [[statistics]], but also to [[theoretical computer science]].@@@@1@19@@danf@17-8-2009
10520070@unknown@formal@none@1@S@==Applications==@@@@1@1@@danf@17-8-2009
10520080@unknown@formal@none@1@S@Machine learning has a wide spectrum of applications including [[natural language processing]], [[syntactic pattern recognition]], [[search engines]], [[diagnosis|medical diagnosis]], [[bioinformatics]], [[brain-machine interfaces]] and [[cheminformatics]], detecting [[credit card fraud]], [[stock market]] analysis, classifying [[DNA sequence]]s, [[speech recognition|speech]] and [[handwriting recognition]], [[object recognition]] in [[computer vision]], [[strategy game|game playing]] and [[robot locomotion]].@@@@1@50@@danf@17-8-2009
10520090@unknown@formal@none@1@S@== Human interaction ==@@@@1@4@@danf@17-8-2009
10520100@unknown@formal@none@1@S@Some machine learning systems attempt to eliminate the need for human intuition in the analysis of the data, while others adopt a collaborative approach between human and machine.@@@@1@28@@danf@17-8-2009
10520110@unknown@formal@none@1@S@Human intuition cannot be entirely eliminated since the designer of the system must specify how the data is to be represented and what mechanisms will be used to search for a characterization of the data.@@@@1@35@@danf@17-8-2009
10520120@unknown@formal@none@1@S@Machine learning can be viewed as an attempt to automate parts of the [[scientific method]].@@@@1@15@@danf@17-8-2009
10520130@unknown@formal@none@1@S@Some statistical machine learning researchers create methods within the framework of [[Bayesian statistics]].@@@@1@13@@danf@17-8-2009
10520140@unknown@formal@none@1@S@== Algorithm types ==@@@@1@4@@danf@17-8-2009
10520150@unknown@formal@none@1@S@Machine learning [[algorithm]]s are organized into a [[taxonomy]], based on the desired outcome of the algorithm.@@@@1@16@@danf@17-8-2009
10520160@unknown@formal@none@1@S@Common algorithm types include:@@@@1@4@@danf@17-8-2009
10520170@unknown@formal@none@1@S@* [[Supervised learning]] — in which the algorithm generates a function that maps inputs to desired outputs.@@@@1@17@@danf@17-8-2009
10520180@unknown@formal@none@1@S@One standard formulation of the supervised learning task is the [[statistical classification|classification]] problem: the learner is required to learn (to approximate) the behavior of a function which maps a vector into one of several classes by looking at several input-output examples of the function.@@@@1@48@@danf@17-8-2009
10520190@unknown@formal@none@1@S@* [[Unsupervised learning]] — in which the algorithm models a set of inputs; labeled examples are not available.@@@@1@18@@danf@17-8-2009
10520200@unknown@formal@none@1@S@* [[Semi-supervised learning]] — which combines both labeled and unlabeled examples to generate an appropriate function or classifier.@@@@1@18@@danf@17-8-2009
10520210@unknown@formal@none@1@S@* [[Reinforcement learning]] — in which the algorithm learns a policy of how to act given an observation of the world.@@@@1@21@@danf@17-8-2009
10520220@unknown@formal@none@1@S@Every action has some impact in the environment, and the environment provides feedback that guides the learning algorithm.@@@@1@18@@danf@17-8-2009
10520230@unknown@formal@none@1@S@* [[Transduction (machine learning)|Transduction]] — similar to supervised learning, but does not explicitly construct a function: instead, tries to predict new outputs based on training inputs, training outputs, and test inputs which are available while training.@@@@1@36@@danf@17-8-2009
10520240@unknown@formal@none@1@S@* [[Learning to learn]] — in which the algorithm learns its own [[inductive bias]] based on previous experience.@@@@1@18@@danf@17-8-2009
10520250@unknown@formal@none@1@S@The computational analysis of machine learning algorithms and their performance is a branch of [[theoretical computer science]] known as [[computational learning theory]].@@@@1@22@@danf@17-8-2009
10520260@unknown@formal@none@1@S@== Machine learning topics ==@@@@1@5@@danf@17-8-2009
10520270@unknown@formal@none@1@S@:''This list represents the topics covered on a typical machine learning course.''@@@@1@12@@danf@17-8-2009
10520280@unknown@formal@none@1@S@;Prerequisites@@@@1@1@@danf@17-8-2009
10520290@unknown@formal@none@1@S@*[[Bayesian theory]]@@@@1@2@@danf@17-8-2009
10520300@unknown@formal@none@1@S@;Modeling [[conditional probability|conditional probability density functions]]: [[Regression analysis|regression]] and [[Statistical classification|classification]]@@@@1@12@@danf@17-8-2009
10520310@unknown@formal@none@1@S@*[[Artificial neural network]]s@@@@1@3@@danf@17-8-2009
10520320@unknown@formal@none@1@S@*[[Decision tree]]s@@@@1@2@@danf@17-8-2009
10520330@unknown@formal@none@1@S@*[[Gene expression programming]]@@@@1@3@@danf@17-8-2009
10520340@unknown@formal@none@1@S@*[[Genetic algorithms]]@@@@1@2@@danf@17-8-2009
10520350@unknown@formal@none@1@S@*[[Genetic programming]]@@@@1@2@@danf@17-8-2009
10520360@unknown@formal@none@1@S@*[[Holographic associative memory]]@@@@1@3@@danf@17-8-2009
10520370@unknown@formal@none@1@S@*[[Inductive Logic Programming]]@@@@1@3@@danf@17-8-2009
10520380@unknown@formal@none@1@S@*[[Kriging|Gaussian process regression]]@@@@1@3@@danf@17-8-2009
10520390@unknown@formal@none@1@S@*[[Linear discriminant analysis]]@@@@1@3@@danf@17-8-2009
10520400@unknown@formal@none@1@S@*[[Nearest neighbor (pattern recognition)|K-nearest neighbor]]@@@@1@5@@danf@17-8-2009
10520410@unknown@formal@none@1@S@*[[Minimum message length]]@@@@1@3@@danf@17-8-2009
10520420@unknown@formal@none@1@S@*[[Perceptron]]@@@@1@1@@danf@17-8-2009
10520430@unknown@formal@none@1@S@*[[Quadratic classifier]]@@@@1@2@@danf@17-8-2009
10520440@unknown@formal@none@1@S@*[[Radial basis function network]]s@@@@1@4@@danf@17-8-2009
10520450@unknown@formal@none@1@S@*[[Support vector machine]]s@@@@1@3@@danf@17-8-2009
10520460@unknown@formal@none@1@S@;Algorithms for estimating model parameters:@@@@1@5@@danf@17-8-2009
10520470@unknown@formal@none@1@S@*[[Dynamic programming]]@@@@1@2@@danf@17-8-2009
10520480@unknown@formal@none@1@S@*[[Expectation-maximization algorithm]]@@@@1@2@@danf@17-8-2009
10520490@unknown@formal@none@1@S@;Modeling [[probability density function]]s through [[generative model]]s:@@@@1@7@@danf@17-8-2009
10520500@unknown@formal@none@1@S@*[[Graphical model]]s including [[Bayesian network]]s and [[Markov network|Markov random fields]]@@@@1@10@@danf@17-8-2009
10520510@unknown@formal@none@1@S@*[[Generative topographic map]]@@@@1@3@@danf@17-8-2009
10520520@unknown@formal@none@1@S@;Approximate inference techniques@@@@1@3@@danf@17-8-2009
10520530@unknown@formal@none@1@S@*[[Monte Carlo method]]s@@@@1@3@@danf@17-8-2009
10520540@unknown@formal@none@1@S@*[[Variational Bayes]]@@@@1@2@@danf@17-8-2009
10520550@unknown@formal@none@1@S@*[[Variable-order Markov model]]s@@@@1@3@@danf@17-8-2009
10520560@unknown@formal@none@1@S@*[[Variable-order Bayesian network]]s@@@@1@3@@danf@17-8-2009
10520570@unknown@formal@none@1@S@*[[Loopy belief propagation]]@@@@1@3@@danf@17-8-2009
10520580@unknown@formal@none@1@S@;Optimization@@@@1@1@@danf@17-8-2009
10520590@unknown@formal@none@1@S@*Most of the methods listed above either use [[Optimization (mathematics)|optimization]] or are instances of optimization algorithms@@@@1@16@@danf@17-8-2009
10520600@unknown@formal@none@1@S@;Meta-learning (ensemble methods)@@@@1@3@@danf@17-8-2009
10520610@unknown@formal@none@1@S@*[[Boosting]]@@@@1@1@@danf@17-8-2009
10520620@unknown@formal@none@1@S@*[[Bootstrap aggregating]]@@@@1@2@@danf@17-8-2009
10520630@unknown@formal@none@1@S@*[[Random forest]]@@@@1@2@@danf@17-8-2009
10520640@unknown@formal@none@1@S@*[[Weighted majority algorithm]]@@@@1@3@@danf@17-8-2009
10520650@unknown@formal@none@1@S@;Inductive transfer and learning to learn@@@@1@6@@danf@17-8-2009
10520660@unknown@formal@none@1@S@*[[Inductive transfer]]@@@@1@2@@danf@17-8-2009
10520670@unknown@formal@none@1@S@*[[Reinforcement learning]]@@@@1@2@@danf@17-8-2009
10520680@unknown@formal@none@1@S@*[[Temporal difference learning]]@@@@1@3@@danf@17-8-2009
10520690@unknown@formal@none@1@S@*[[Monte-Carlo method]]@@@@1@2@@danf@17-8-2009
10530010@unknown@formal@none@1@S@Machine translation@@@@1@2@@danf@17-8-2009
10530020@unknown@formal@none@1@S@'''Machine translation''', sometimes referred to by the abbreviation '''MT''', is a sub-field of [[computational linguistics]] that investigates the use of [[computer software]] to [[translation|translate]] text or speech from one [[natural language]] to another.@@@@1@33@@danf@17-8-2009
10530030@unknown@formal@none@1@S@At its basic level, MT performs simple [[substitution]] of words in one natural language for words in another.@@@@1@18@@danf@17-8-2009
10530040@unknown@formal@none@1@S@Using [[corpus linguistics|corpus]] techniques, more complex translations may be attempted, allowing for better handling of differences in [[linguistic typology]], phrase [[recognition]], and translation of [[idiom]]s, as well as the isolation of anomalies.@@@@1@32@@danf@17-8-2009
10530050@unknown@formal@none@1@S@Current machine translation software often allows for customisation by domain or [[profession]] (such as [[meteorology|weather reports]]) — improving output by limiting the scope of allowable substitutions.@@@@1@26@@danf@17-8-2009
10530060@unknown@formal@none@1@S@This technique is particularly effective in domains where formal or formulaic language is used.@@@@1@14@@danf@17-8-2009
10530070@unknown@formal@none@1@S@It follows then that machine translation of government and legal documents more readily produces usable output than conversation or less standardised text.@@@@1@22@@danf@17-8-2009
10530080@unknown@formal@none@1@S@Improved output quality can also be achieved by human intervention: for example, some systems are able to translate more accurately if the user has [[word sense disambiguation|unambiguously identified]] which words in the text are names.@@@@1@35@@danf@17-8-2009
10530090@unknown@formal@none@1@S@With the assistance of these techniques, MT has proven useful as a tool to assist human translators, and in some cases can even produce output that can be used "as is".@@@@1@31@@danf@17-8-2009
10530100@unknown@formal@none@1@S@However, current systems are unable to produce output of the same quality as a human translator, particularly where the text to be translated uses casual language.@@@@1@26@@danf@17-8-2009
10530110@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10530120@unknown@formal@none@1@S@The history of machine translation begins in the 1950s, after [[World War II]].@@@@1@13@@danf@17-8-2009
10530130@unknown@formal@none@1@S@The [[Georgetown-IBM experiment|Georgetown experiment]] (1954) involved fully-automatic translation of over sixty [[Russian language|Russian]] sentences into [[English language|English]].@@@@1@17@@danf@17-8-2009
10530140@unknown@formal@none@1@S@The experiment was a great success and ushered in an era of substantial funding for machine-translation research.@@@@1@17@@danf@17-8-2009
10530150@unknown@formal@none@1@S@The authors claimed that within three to five years, machine translation would be a solved problem.@@@@1@16@@danf@17-8-2009
10530160@unknown@formal@none@1@S@Real progress was much slower, however, and after the [[ALPAC|ALPAC report]] (1966), which found that the ten-year-long research had failed to fulfill expectations, funding was greatly reduced.@@@@1@27@@danf@17-8-2009
10530170@unknown@formal@none@1@S@Beginning in the late 1980s, as [[computation]]al power increased and became less expensive, more interest was shown in [[statistical machine translation|statistical models for machine translation]].@@@@1@25@@danf@17-8-2009
10530180@unknown@formal@none@1@S@The idea of using digital computers for translation of natural languages was proposed as early as 1946 by A. D. Booth and possibly others.@@@@1@24@@danf@17-8-2009
10530190@unknown@formal@none@1@S@The Georgetown experiment was by no means the first such application, and a demonstration was made in 1954 on the APEXC machine at Birkbeck College (London Univ.) of a rudimentary translation of English into French.@@@@1@35@@danf@17-8-2009
10530200@unknown@formal@none@1@S@Several papers on the topic were published at the time, and even articles in popular journals (see for example Wireless World, Sept. 1955, Cleave and Zacharov).@@@@1@26@@danf@17-8-2009
10530210@unknown@formal@none@1@S@A similar application, also pioneered at Birkbeck College at the time, was reading and composing Braille texts by computer.@@@@1@19@@danf@17-8-2009
10530220@unknown@formal@none@1@S@Recently, the Internet has emerged as a global information infrastructure, revolutionizing access to information, as well as fast information transfer and exchange.@@@@1@22@@danf@17-8-2009
10530230@unknown@formal@none@1@S@With Internet and e-mail technology, people need to communicate rapidly over long distances and across continental boundaries.@@@@1@17@@danf@17-8-2009
10530240@unknown@formal@none@1@S@Not all Internet users, however, can use their own language to communicate globally with people who speak other languages.@@@@1@19@@danf@17-8-2009
10530250@unknown@formal@none@1@S@Machine translation software may therefore, in the near future, let people around the world communicate with one another in their own mother tongue.@@@@1@23@@danf@17-8-2009
10530260@unknown@formal@none@1@S@==Translation process==@@@@1@2@@danf@17-8-2009
10530270@unknown@formal@none@1@S@The [[translation process]] may be stated as:@@@@1@7@@danf@17-8-2009
10530280@unknown@formal@none@1@S@# [[Decoding]] the [[meaning (linguistic)|meaning]] of the [[source text]]; and@@@@1@10@@danf@17-8-2009
10530290@unknown@formal@none@1@S@# Re-[[encoding]] this [[meaning (linguistic)|meaning]] in the [[target language]].@@@@1@9@@danf@17-8-2009
10530300@unknown@formal@none@1@S@Behind this ostensibly simple procedure lies a complex [[cognitive]] operation.@@@@1@10@@danf@17-8-2009
10530310@unknown@formal@none@1@S@To decode the meaning of the [[source text]] in its entirety, the translator must interpret and analyse all the features of the text, a process that requires in-depth knowledge of the [[grammar]], [[semantics]], [[syntax]], [[idiom]]s, etc., of the [[source language]], as well as the [[culture]] of its speakers.@@@@1@48@@danf@17-8-2009
10530320@unknown@formal@none@1@S@The translator needs the same in-depth knowledge to re-encode the meaning in the [[target language]].@@@@1@15@@danf@17-8-2009
10530330@unknown@formal@none@1@S@Therein lies the challenge in machine translation: how to program a computer that will "understand" a text as a person does, and that will "create" a new text in the [[target language]] that "sounds" as if it has been written by a person.@@@@1@43@@danf@17-8-2009
10530340@unknown@formal@none@1@S@This problem may be approached in a number of ways.@@@@1@10@@danf@17-8-2009
10530350@unknown@formal@none@1@S@==Approaches==@@@@1@1@@danf@17-8-2009
10530360@unknown@formal@none@1@S@Machine translation can use a method based on [[Expert System|linguistic rules]], which means that words will be translated in a linguistic way — the most suitable (orally speaking) words of the target language will replace the ones in the source language.@@@@1@41@@danf@17-8-2009
10530370@unknown@formal@none@1@S@It is often argued that the success of machine translation requires the problem of [[natural language processing|natural language understanding]] to be solved first.@@@@1@23@@danf@17-8-2009
10530380@unknown@formal@none@1@S@Generally, rule-based methods parse a text, usually creating an intermediary, symbolic representation, from which the text in the target language is generated.@@@@1@22@@danf@17-8-2009
10530390@unknown@formal@none@1@S@According to the nature of the intermediary representation, an approach is described as [[interlingual machine translation]] or [[transfer-based machine translation]].@@@@1@20@@danf@17-8-2009
10530400@unknown@formal@none@1@S@These methods require extensive [[lexicon]]s with [[morphology (linguistics)|morphological]], [[syntax|syntactic]], and [[semantics|semantic]] information, and large sets of rules.@@@@1@17@@danf@17-8-2009
10530410@unknown@formal@none@1@S@Given enough data, machine translation programs often work well enough for a [[native speaker]] of one language to get the approximate meaning of what is written by a native speaker of another.@@@@1@32@@danf@17-8-2009
10530420@unknown@formal@none@1@S@The difficulty is getting enough data of the right kind to support the particular method.@@@@1@15@@danf@17-8-2009
10530430@unknown@formal@none@1@S@For example, the large multilingual [[Text corpus|corpus]] of data needed for statistical methods to work is not necessary for the grammar-based methods.@@@@1@22@@danf@17-8-2009
10530440@unknown@formal@none@1@S@But then, the grammar methods need a skilled linguist to carefully design the grammar that they use.@@@@1@17@@danf@17-8-2009
10530450@unknown@formal@none@1@S@To translate between closely related languages, a technique referred to as [[shallow-transfer machine translation]] may be used.@@@@1@17@@danf@17-8-2009
10530460@unknown@formal@none@1@S@===Rule-based===@@@@1@1@@danf@17-8-2009
10530470@unknown@formal@none@1@S@The rule-based machine translation paradigm includes transfer-based machine translation, interlingual machine translation and dictionary-based machine translation paradigms.@@@@1@17@@danf@17-8-2009
10530480@unknown@formal@none@1@S@'''''Transfer-based machine translation'''''@@@@1@3@@danf@17-8-2009
10530490@unknown@formal@none@1@S@'''''Interlingual'''''@@@@1@1@@danf@17-8-2009
10530500@unknown@formal@none@1@S@Interlingual machine translation is one instance of rule-based machine-translation approaches.@@@@1@10@@danf@17-8-2009
10530510@unknown@formal@none@1@S@In this approach, the source language, i.e. the text to be translated, is transformed into an interlingual, i.e. source-/target-language-independent representation.@@@@1@20@@danf@17-8-2009
10530520@unknown@formal@none@1@S@The target language is then generated out of the [[interlinguistics|interlingua]].@@@@1@10@@danf@17-8-2009
10530530@unknown@formal@none@1@S@'''''Dictionary-based'''''@@@@1@1@@danf@17-8-2009
10530540@unknown@formal@none@1@S@Machine translation can use a method based on [[dictionary]] entries, which means that the words will be translated as they are by a dictionary.@@@@1@24@@danf@17-8-2009
10530550@unknown@formal@none@1@S@===Statistical===@@@@1@1@@danf@17-8-2009
10530560@unknown@formal@none@1@S@Statistical machine translation tries to generate translations using [[statistical methods]] based on bilingual text corpora, such as the [[Hansard#Canadian hansard and machine translation|Canadian Hansard]] corpus, the English-French record of the Canadian parliament, and [[EUROPARL]], the record of the [[European Parliament]].@@@@1@40@@danf@17-8-2009
10530570@unknown@formal@none@1@S@Where such corpora are available, impressive results can be achieved translating texts of a similar kind, but such corpora are still very rare.@@@@1@23@@danf@17-8-2009
10530580@unknown@formal@none@1@S@The first statistical machine translation software was [[CANDIDE]] from [[IBM]].@@@@1@10@@danf@17-8-2009
10530590@unknown@formal@none@1@S@Google used [[SYSTRAN]] for several years, but switched to a statistical translation method in October 2007.@@@@1@16@@danf@17-8-2009
10530600@unknown@formal@none@1@S@Recently, it improved its translation capabilities by training its system on approximately 200 billion words from [[United Nations]] materials.@@@@1@19@@danf@17-8-2009
10530610@unknown@formal@none@1@S@Accuracy of the translation has improved.@@@@1@6@@danf@17-8-2009
10530620@unknown@formal@none@1@S@===Example-based===@@@@1@1@@danf@17-8-2009
10530630@unknown@formal@none@1@S@The example-based machine translation (EBMT) approach is often characterised by its use of a bilingual [[corpus]] as its main knowledge base at run-time.@@@@1@23@@danf@17-8-2009
10530640@unknown@formal@none@1@S@It is essentially translation by [[analogy]] and can be viewed as an implementation of the [[case-based reasoning]] approach to [[machine learning]].@@@@1@21@@danf@17-8-2009
10530650@unknown@formal@none@1@S@==Major issues==@@@@1@2@@danf@17-8-2009
10530660@unknown@formal@none@1@S@===Disambiguation===@@@@1@1@@danf@17-8-2009
10530670@unknown@formal@none@1@S@Word sense disambiguation concerns finding a suitable translation when a word can have more than one meaning.@@@@1@17@@danf@17-8-2009
10530680@unknown@formal@none@1@S@The problem was first raised in the 1950s by [[Yehoshua Bar-Hillel]].@@@@1@11@@danf@17-8-2009
10530690@unknown@formal@none@1@S@He pointed out that without a "universal encyclopedia", a machine would never be able to distinguish between the two meanings of a word.@@@@1@23@@danf@17-8-2009
10530700@unknown@formal@none@1@S@Today there are numerous approaches designed to overcome this problem.@@@@1@10@@danf@17-8-2009
10530710@unknown@formal@none@1@S@They can be approximately divided into "shallow" approaches and "deep" approaches.@@@@1@11@@danf@17-8-2009
10530720@unknown@formal@none@1@S@Shallow approaches assume no knowledge of the text.@@@@1@8@@danf@17-8-2009
10530730@unknown@formal@none@1@S@They simply apply statistical methods to the words surrounding the ambiguous word.@@@@1@12@@danf@17-8-2009
10530740@unknown@formal@none@1@S@Deep approaches presume a comprehensive knowledge of the word.@@@@1@9@@danf@17-8-2009
10530750@unknown@formal@none@1@S@So far, shallow approaches have been more successful.@@@@1@8@@danf@17-8-2009
10530760@unknown@formal@none@1@S@===Named entities===@@@@1@2@@danf@17-8-2009
10530770@unknown@formal@none@1@S@This issue is related to [[named entity recognition]] in [[information extraction]].@@@@1@11@@danf@17-8-2009
10530780@unknown@formal@none@1@S@==Applications==@@@@1@1@@danf@17-8-2009
10530790@unknown@formal@none@1@S@There are now many [[software]] programs for translating natural language, several of them [[online]], such as the [[SYSTRAN]] system which powers both [[Google]] translate and [[AltaVista]]'s [[Babel Fish (website)|Babel Fish]] as well as [[Promt]] that powers online translation services at Voila.fr and Orange.fr.@@@@1@43@@danf@17-8-2009
10530800@unknown@formal@none@1@S@Although no system provides the holy grail of "fully automatic high quality machine translation" (FAHQMT), many systems produce reasonable output.@@@@1@20@@danf@17-8-2009
10530810@unknown@formal@none@1@S@Despite their inherent limitations, MT programs are used around the world.@@@@1@11@@danf@17-8-2009
10530820@unknown@formal@none@1@S@Probably the largest institutional user is the [[European Commission]].@@@@1@9@@danf@17-8-2009
10530830@unknown@formal@none@1@S@[[Toggletext]] uses a transfer-based system (known as Kataku) to translate between [[English language|English]] and [[Indonesian language|Indonesian]].@@@@1@16@@danf@17-8-2009
10530840@unknown@formal@none@1@S@[[Google]] has claimed that promising results were obtained using a proprietary statistical machine translation engine.@@@@1@15@@danf@17-8-2009
10530850@unknown@formal@none@1@S@The statistical translation engine used in the [[Google tools#anchor_language_tools|Google language tools]] for Arabic <-> English and Chinese <-> English achieved an overall BLEU-4 score of 0.4281, ahead of runner-up IBM's score of 0.3954 (Summer 2006), in tests conducted by the National Institute of Standards and Technology.@@@@1@46@@danf@17-8-2009
10530860@unknown@formal@none@1@S@[[Uwe Muegge]] has implemented a demo website that uses a [[controlled language]] in combination with the [[Google tools#anchor_language_tools|Google tool]] to produce fully automatic, high-quality machine translations of his English, German, and French web sites.@@@@1@34@@danf@17-8-2009
10530870@unknown@formal@none@1@S@With the recent focus on terrorism, military sources in the United States have been investing significant amounts of money in natural language engineering.@@@@1@23@@danf@17-8-2009
10530880@unknown@formal@none@1@S@''In-Q-Tel'' (a [[venture capital]] fund, largely funded by the US Intelligence Community, to stimulate new technologies through private sector entrepreneurs) brought up companies like [[Language Weaver]].@@@@1@26@@danf@17-8-2009
10530890@unknown@formal@none@1@S@Currently the military community is interested in translation and processing of languages like [[Arabic language|Arabic]], [[Pashto language|Pashto]], and [[Dari language|Dari]].@@@@1@20@@danf@17-8-2009
10530900@unknown@formal@none@1@S@The Information Processing Technology Office in [[DARPA]] hosts programs like [[DARPA TIDES program|TIDES]] and [[Babylon translator|Babylon Translator]].@@@@1@17@@danf@17-8-2009
10530910@unknown@formal@none@1@S@The US Air Force has awarded a $1 million contract to develop language translation technology.@@@@1@15@@danf@17-8-2009
10530920@unknown@formal@none@1@S@== Evaluation ==@@@@1@3@@danf@17-8-2009
10530930@unknown@formal@none@1@S@There are various means for evaluating the performance of machine-translation systems.@@@@1@11@@danf@17-8-2009
10530940@unknown@formal@none@1@S@The oldest is the use of human judges to assess a translation's quality.@@@@1@13@@danf@17-8-2009
10530950@unknown@formal@none@1@S@Even though human evaluation is time-consuming, it is still the most reliable way to compare different systems such as rule-based and statistical systems.@@@@1@23@@danf@17-8-2009
10530960@unknown@formal@none@1@S@[[Automate]]d means of evaluation include [[Bilingual evaluation understudy|BLEU]], [[NIST (metric)|NIST]] and [[METEOR]].@@@@1@12@@danf@17-8-2009
10530970@unknown@formal@none@1@S@Relying exclusively on machine translation ignores that communication in [[natural language|human language]] is [[wiktionary:context|context]]-embedded, and that it takes a human to adequately comprehend the context of the original text.@@@@1@29@@danf@17-8-2009
10530980@unknown@formal@none@1@S@Even purely human-generated translations are prone to error.@@@@1@8@@danf@17-8-2009
10530990@unknown@formal@none@1@S@Therefore, to ensure that a machine-generated translation will be of publishable quality and useful to a human, it must be reviewed and edited by a human.@@@@1@26@@danf@17-8-2009
10531000@unknown@formal@none@1@S@It has, however, been asserted that in certain applications, e.g. product descriptions written in a [[controlled language]], a [[dictionary-based machine translation|dictionary-based machine-translation]] system has produced satisfactory translations that require no human intervention.@@@@1@32@@danf@17-8-2009
10540010@unknown@formal@none@1@S@Metadata@@@@1@1@@danf@17-8-2009
10540020@unknown@formal@none@1@S@'''Metadata''' ('''meta data''', or sometimes '''metainformation''') is "data about data", of any sort in any media.@@@@1@16@@danf@17-8-2009
10540030@unknown@formal@none@1@S@An item of metadata may describe an individual [[datum]], or content item, or a collection of data including multiple content items and hierarchical levels, for example a [[database schema]].@@@@1@29@@danf@17-8-2009
10540040@unknown@formal@none@1@S@== Purpose ==@@@@1@3@@danf@17-8-2009
10540050@unknown@formal@none@1@S@Metadata provides context for data.@@@@1@5@@danf@17-8-2009
10540060@unknown@formal@none@1@S@Metadata is used to facilitate the understanding, characterization, use and management of data.@@@@1@13@@danf@17-8-2009
10540070@unknown@formal@none@1@S@The metadata required for effective data management varies with the type of data and context of use.@@@@1@17@@danf@17-8-2009
10540080@unknown@formal@none@1@S@In a [[library]], where the data is the content of the titles stocked, metadata about a title would typically include a description of the content, the [[author]], the publication date and the physical location.@@@@1@34@@danf@17-8-2009
10540090@unknown@formal@none@1@S@== Examples of Metadata ==@@@@1@5@@danf@17-8-2009
10540100@unknown@formal@none@1@S@=== Camera ===@@@@1@3@@danf@17-8-2009
10540110@unknown@formal@none@1@S@In the context of a [[camera]], where the data is the photographic image, metadata would typically include the date the [[photograph]] was taken and details of the camera settings (lens, focal length, aperture, shutter timing, white balance, etc.).@@@@1@38@@danf@17-8-2009
10540120@unknown@formal@none@1@S@=== Digital Music Player ===@@@@1@5@@danf@17-8-2009
10540130@unknown@formal@none@1@S@On a digital portable music player, the album names, song titles and album art embedded in the music files are used to generate the artist and song listings, and are considered the metadata.@@@@1@33@@danf@17-8-2009
10540140@unknown@formal@none@1@S@=== Information system ===@@@@1@4@@danf@17-8-2009
10540150@unknown@formal@none@1@S@In the context of an [[information system]], where the data is the content of the [[computer]] files, metadata about an individual data item would typically include the name of the field and its length.@@@@1@34@@danf@17-8-2009
10540160@unknown@formal@none@1@S@Metadata about a collection of data items, a computer file, might typically include the name of the file, the type of file and the name of the data administrator.@@@@1@29@@danf@17-8-2009
10540180@unknown@formal@none@1@S@=== Real world location ===@@@@1@5@@danf@17-8-2009
10540190@unknown@formal@none@1@S@If we consider a particular place in the real world, this may be described by data, for example:@@@@1@18@@danf@17-8-2009
10540200@unknown@formal@none@1@S@* 1 "E83BJ"@@@@1@3@@danf@17-8-2009
10540210@unknown@formal@none@1@S@* 2 "17"@@@@1@3@@danf@17-8-2009
10540220@unknown@formal@none@1@S@* 3 "Sunny"@@@@1@3@@danf@17-8-2009
10540230@unknown@formal@none@1@S@To make sense of and use this data, context is important, and can be provided by metadata.@@@@1@17@@danf@17-8-2009
10540240@unknown@formal@none@1@S@The metadata for the above three items of data might include:@@@@1@11@@danf@17-8-2009
10540250@unknown@formal@none@1@S@* 1.1 "Post Code" – This is a brief description (or name) of the data item "E83BJ"@@@@1@17@@danf@17-8-2009
10540260@unknown@formal@none@1@S@* 1.2 "The unique identifier of a postal district" – This is another description (a definition) of "E83BJ"@@@@1@18@@danf@17-8-2009
10540270@unknown@formal@none@1@S@* 1.3 "27 June 2006" – This could also help describe "E83BJ", for example by giving the date it was last updated@@@@1@22@@danf@17-8-2009
10540280@unknown@formal@none@1@S@* 2 "Average temperature in degrees Celsius" – This is a possible description of "17"@@@@1@15@@danf@17-8-2009
10540290@unknown@formal@none@1@S@* 3 "Yesterday's weather" – This is a description of "sunny"@@@@1@11@@danf@17-8-2009
10540300@unknown@formal@none@1@S@An item of metadata is itself data and therefore may have its own metadata.@@@@1@14@@danf@17-8-2009
10540310@unknown@formal@none@1@S@For example, "Post Code" might have the following metadata:@@@@1@9@@danf@17-8-2009
10540320@unknown@formal@none@1@S@* 1.1.1 "data item name"@@@@1@5@@danf@17-8-2009
10540330@unknown@formal@none@1@S@* 1.1.2 "5 characters, starting with A – Z"@@@@1@9@@danf@17-8-2009
10540340@unknown@formal@none@1@S@"27 June 2006" might have the following metadata:@@@@1@8@@danf@17-8-2009
10540350@unknown@formal@none@1@S@* 1.3.1 "date last changed"@@@@1@5@@danf@17-8-2009
10540360@unknown@formal@none@1@S@* 1.3.2 "dd MMM yyyy"@@@@1@5@@danf@17-8-2009
10540370@unknown@formal@none@1@S@== Levels ==@@@@1@3@@danf@17-8-2009
10540380@unknown@formal@none@1@S@The hierarchy of metadata descriptions can go on forever, but usually context or semantic understanding makes extensively detailed explanations unnecessary.@@@@1@20@@danf@17-8-2009
10540390@unknown@formal@none@1@S@The role played by any particular [[datum]] depends on the context.@@@@1@11@@danf@17-8-2009
10540400@unknown@formal@none@1@S@For example, when considering the geography of London, "E83BJ" would be a datum and "Post Code" would be a metadatum.@@@@1@20@@danf@17-8-2009
10540410@unknown@formal@none@1@S@But, when considering the data management of an automated system that manages geographical data, "Post Code" might be a datum and then "data item name" and "5 characters, starting with A – Z" would be metadata.@@@@1@36@@danf@17-8-2009
10540420@unknown@formal@none@1@S@In any particular context, metadata characterizes the data it describes, not the entity described by that data.@@@@1@17@@danf@17-8-2009
10540430@unknown@formal@none@1@S@So, in relation to "E83BJ", the datum "is in London" is a further description of the place in the real world which has the post code "E83BJ", not of the code itself.@@@@1@32@@danf@17-8-2009
10540440@unknown@formal@none@1@S@Therefore, although it is providing information connected to "E83BJ" (telling us that this is the post code of a place in London), this would not normally be considered metadata, as it is describing "E83BJ" ''qua'' place in the real world and not ''qua'' data.@@@@1@44@@danf@17-8-2009
10540450@unknown@formal@none@1@S@== Definitions ==@@@@1@3@@danf@17-8-2009
10540460@unknown@formal@none@1@S@=== Etymology ===@@@@1@3@@danf@17-8-2009
10540470@unknown@formal@none@1@S@[[Meta]] is a classical Greek preposition (μετ’ αλλων εταιρων) and prefix (μεταβασις) conveying the following senses in English, depending upon the case of the associated noun: among; along with; with; by means of; in the midst of; after; behind.@@@@1@39@@danf@17-8-2009
10540480@unknown@formal@none@1@S@In [[epistemology]], the word means "about (its own category)"; thus metadata is "data about the data".@@@@1@16@@danf@17-8-2009
10540490@unknown@formal@none@1@S@=== Varying definitions ===@@@@1@4@@danf@17-8-2009
10540500@unknown@formal@none@1@S@The term was introduced intuitively, without a formal definition.@@@@1@9@@danf@17-8-2009
10540510@unknown@formal@none@1@S@Because of that, today there are various definitions.@@@@1@8@@danf@17-8-2009
10540520@unknown@formal@none@1@S@The most common one is the literal translation:@@@@1@8@@danf@17-8-2009
10540530@unknown@formal@none@1@S@* "Data about data are referred to as metadata."@@@@1@9@@danf@17-8-2009
10540540@unknown@formal@none@1@S@Example: "12345" is data, and with no additional context is meaningless.@@@@1@11@@danf@17-8-2009
10540550@unknown@formal@none@1@S@When "12345" is given a meaningful name (metadata) of "[[ZIP code]]", one can understand (at least in the [[United States]], and further placing "ZIP code" within the context of a [[postal address]]) that "12345" refers to the [[General Electric]] plant in [[Schenectady, New York]].@@@@1@44@@danf@17-8-2009
10540560@unknown@formal@none@1@S@Because for most people the difference between data and [[information]] is merely a [[philosophical]] one of no practical relevance, other definitions are:@@@@1@22@@danf@17-8-2009
10540570@unknown@formal@none@1@S@* Metadata is information about data.@@@@1@6@@danf@17-8-2009
10540580@unknown@formal@none@1@S@* Metadata is information about information.@@@@1@6@@danf@17-8-2009
10540590@unknown@formal@none@1@S@* Metadata contains information about that data or other data.@@@@1@10@@danf@17-8-2009
10540600@unknown@formal@none@1@S@There are more sophisticated definitions, such as:@@@@1@7@@danf@17-8-2009
10540610@unknown@formal@none@1@S@*"Metadata is structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities."@@@@1@24@@danf@17-8-2009
10540620@unknown@formal@none@1@S@* "[Metadata is a set of] optional structured descriptions that are publicly available to explicitly assist in locating objects."@@@@1@19@@danf@17-8-2009
10540630@unknown@formal@none@1@S@These are used more rarely because they tend to concentrate on one purpose of metadata — to find "objects", "entities" or "resources" — and ignore others, such as using metadata to optimize [[data compression|compression algorithms]], or to perform additional computations using the data.@@@@1@43@@danf@17-8-2009
10540640@unknown@formal@none@1@S@The metadata concept has been extended into the world of systems to include any "data about data": the names of tables, columns, programs, and the like.@@@@1@26@@danf@17-8-2009
10540650@unknown@formal@none@1@S@Different views of this "system metadata" are detailed below, but beyond that is the recognition that metadata can describe all aspects of systems: data, activities, people and organizations involved, locations of data and processes, access methods, limitations, timing and events, as well as motivation and rules.@@@@1@46@@danf@17-8-2009
10540660@unknown@formal@none@1@S@Fundamentally, then, metadata is "the data that describe the structure and workings of an organization's use of information, and which describe the systems it uses to manage that information".@@@@1@29@@danf@17-8-2009
10540670@unknown@formal@none@1@S@To model metadata is to build an "[[Enterprise modeling|Enterprise model]]" of the information technology industry itself.@@@@1@16@@danf@17-8-2009
10540680@unknown@formal@none@1@S@=== Metadata and Markup ===@@@@1@5@@danf@17-8-2009
10540690@unknown@formal@none@1@S@In the context of the web and the work of the [[W3C]] in providing the markup technologies [[HTML]], [[XML]] and [[SGML]], the concept of metadata has a specific context that is perhaps clearer than in other information domains.@@@@1@38@@danf@17-8-2009
10540700@unknown@formal@none@1@S@With markup technologies there is metadata, markup and data content.@@@@1@10@@danf@17-8-2009
10540710@unknown@formal@none@1@S@The metadata describes characteristics about the data, while the markup identifies the specific type of data content and acts as a container for that document instance.@@@@1@26@@danf@17-8-2009
10540720@unknown@formal@none@1@S@This page in Wikipedia is itself an example of such usage, where the textual information is data; how it is packaged, linked, referenced, styled and displayed is markup; and aspects and characteristics of that markup are metadata set globally across Wikipedia.@@@@1@41@@danf@17-8-2009
10540730@unknown@formal@none@1@S@In the context of markup the metadata is architected to allow optimization of document instances to contain only a minimum amount of metadata, while the metadata itself is likely referenced externally such as in a [[schema]] definition ([[XSD]]) instance.@@@@1@39@@danf@17-8-2009
10540740@unknown@formal@none@1@S@Also it should be noted that markup provides specialised mechanisms that handle referential data, again avoiding confusion over what is metadata or data, and allowing optimizations.@@@@1@26@@danf@17-8-2009
10540750@unknown@formal@none@1@S@The reference and ID mechanisms in markup allow reference links between related data items, and links to data items that can then be repeated about a data item, such as an address or product details.@@@@1@35@@danf@17-8-2009
10540760@unknown@formal@none@1@S@These are then all themselves simply more data items and markup instances rather than metadata.@@@@1@15@@danf@17-8-2009
10540770@unknown@formal@none@1@S@Similarly there are concepts such as classifications, ontologies and associations for which markup mechanisms are provided.@@@@1@16@@danf@17-8-2009
10540780@unknown@formal@none@1@S@A data item can then be linked to such categories via markup, thus providing a clean delineation between what is metadata and actual data instances.@@@@1@25@@danf@17-8-2009
10540790@unknown@formal@none@1@S@Therefore the concepts and descriptions in a classification would be metadata, but the actual classification entry for a data item is simply another data instance.@@@@1@25@@danf@17-8-2009
10540800@unknown@formal@none@1@S@Some examples can illustrate the points here.@@@@1@7@@danf@17-8-2009
10540810@unknown@formal@none@1@S@Items in bold are data content, items in italic are metadata, and normal text items are all markup.@@@@1@18@@danf@17-8-2009
10540820@unknown@formal@none@1@S@The two examples show in-line use of metadata within markup relating to a data instance (XML) compared to simple markup (HTML).@@@@1@21@@danf@17-8-2009
10540830@unknown@formal@none@1@S@A simple [[HTML]] instance example:@@@@1@5@@danf@17-8-2009
10540840@unknown@formal@none@1@S@<span style="normalText">'''Example'''</span>@@@@1@2@@danf@17-8-2009
10540850@unknown@formal@none@1@S@And then an [[XML]] instance example with metadata:@@@@1@8@@danf@17-8-2009
10540860@unknown@formal@none@1@S@'''John'''@@@@1@2@@danf@17-8-2009
10540870@unknown@formal@none@1@S@Here the inline assertion that a person's middle name may be an empty data item is metadata about the data item.@@@@1@21@@danf@17-8-2009
10540880@unknown@formal@none@1@S@Such definitions, however, are usually not placed inline in XML.@@@@1@10@@danf@17-8-2009
10540890@unknown@formal@none@1@S@Instead these definitions are moved away into the [[schema]] definition that contains the metadata for the entire document instance.@@@@1@19@@danf@17-8-2009
10540900@unknown@formal@none@1@S@This again illustrates another important aspect of metadata in the context of markup.@@@@1@13@@danf@17-8-2009
10540910@unknown@formal@none@1@S@The metadata is optimally defined only once for a collection of data instances.@@@@1@13@@danf@17-8-2009
10540920@unknown@formal@none@1@S@Hence repeated items of markup are rarely metadata, but rather more markup data instances themselves.@@@@1@15@@danf@17-8-2009
10540930@unknown@formal@none@1@S@=== Hierarchies of metadata ===@@@@1@5@@danf@17-8-2009
10540940@unknown@formal@none@1@S@When structured into a hierarchical arrangement, metadata is more properly called an [[Ontology (computer science)|ontology]] or [[schema]].@@@@1@17@@danf@17-8-2009
10540950@unknown@formal@none@1@S@Both terms describe "what exists" for some purpose or to enable some action.@@@@1@13@@danf@17-8-2009
10540960@unknown@formal@none@1@S@For instance, the arrangement of subject headings in a library catalog serves not only as a guide to finding books on a particular subject in the stacks, but also as a guide to what subjects "exist" in the library's own ontology and how more specialized topics are related to or derived from the more general subject headings.@@@@1@57@@danf@17-8-2009
10540970@unknown@formal@none@1@S@Metadata is frequently stored in a central location and used to help organizations standardize their data.@@@@1@16@@danf@17-8-2009
10540980@unknown@formal@none@1@S@This information is typically stored in a [[metadata registry]].@@@@1@9@@danf@17-8-2009
10540990@unknown@formal@none@1@S@=== Difference between data and metadata ===@@@@1@7@@danf@17-8-2009
10541000@unknown@formal@none@1@S@Usually it is not possible to distinguish between (plain) data and metadata because:@@@@1@13@@danf@17-8-2009
10541010@unknown@formal@none@1@S@*Something can be data and metadata at the same time.@@@@1@10@@danf@17-8-2009
10541020@unknown@formal@none@1@S@The headline of an article is both its title (metadata) and part of its text (data).@@@@1@16@@danf@17-8-2009
10541030@unknown@formal@none@1@S@* Data and metadata can change their roles.@@@@1@8@@danf@17-8-2009
10541040@unknown@formal@none@1@S@A poem, as such, would be regarded as data, but if there were a song that used it as lyrics, the whole poem could be attached to an audio file of the song as metadata.@@@@1@35@@danf@17-8-2009
10541050@unknown@formal@none@1@S@Thus, the labeling depends on the point of view.@@@@1@9@@danf@17-8-2009
10541060@unknown@formal@none@1@S@These considerations apply no matter which of the above definitions is considered, except where explicit markup is used to denote what is data and what is metadata.@@@@1@27@@danf@17-8-2009
10541070@unknown@formal@none@1@S@== Use ==@@@@1@3@@danf@17-8-2009
10541080@unknown@formal@none@1@S@Metadata has many different applications; this section lists some of the most common.@@@@1@13@@danf@17-8-2009
10541090@unknown@formal@none@1@S@Metadata is used to speed up and enrich searching for resources.@@@@1@11@@danf@17-8-2009
10541100@unknown@formal@none@1@S@In general, search queries using metadata can save users from performing more complex filter operations manually.@@@@1@16@@danf@17-8-2009
10541110@unknown@formal@none@1@S@It is now common for web browsers (with the notable exception of Mozilla Firefox), P2P applications and media management software to automatically download and locally cache metadata, to improve the speed at which files can be accessed and searched.@@@@1@39@@danf@17-8-2009
10541120@unknown@formal@none@1@S@Metadata may also be associated with files manually.@@@@1@8@@danf@17-8-2009
10541130@unknown@formal@none@1@S@This is often the case with documents which are scanned into a document storage repository such as FileNet or Documentum.@@@@1@20@@danf@17-8-2009
10541140@unknown@formal@none@1@S@Once the documents have been converted into an electronic format a user brings the image up in a viewer application, manually reads the document and keys values into an online application to be stored in a metadata repository.@@@@1@38@@danf@17-8-2009
10541150@unknown@formal@none@1@S@Metadata provides additional information to users of the data it describes.@@@@1@11@@danf@17-8-2009
10541160@unknown@formal@none@1@S@This information may be descriptive ("These pictures were taken by children in the school's third grade class.") or algorithmic ("Checksum=139F").@@@@1@20@@danf@17-8-2009
10541170@unknown@formal@none@1@S@Metadata helps to bridge the [[semantic gap]].@@@@1@7@@danf@17-8-2009
10541180@unknown@formal@none@1@S@By telling a computer how data items are related and how these relations can be evaluated automatically, it becomes possible to process even more complex filter and search operations.@@@@1@29@@danf@17-8-2009
10541190@unknown@formal@none@1@S@For example, if a search engine understands that "Van Gogh" was a "Dutch painter", it can answer a search query on "Dutch painters" with a link to a web page about Vincent Van Gogh, although the exact words "Dutch painters" never occur on that page.@@@@1@45@@danf@17-8-2009
10541200@unknown@formal@none@1@S@This approach, called knowledge representation, is of special interest to the [[semantic web]] and [[artificial intelligence]].@@@@1@16@@danf@17-8-2009
10541210@unknown@formal@none@1@S@Certain metadata is designed to optimize [[lossy compression]].@@@@1@8@@danf@17-8-2009
10541220@unknown@formal@none@1@S@For example, if a video has metadata that allows a computer to tell foreground from background, the latter can be compressed more aggressively to achieve a higher compression rate.@@@@1@29@@danf@17-8-2009
10541230@unknown@formal@none@1@S@Some metadata is intended to enable variable content presentation.@@@@1@9@@danf@17-8-2009
10541240@unknown@formal@none@1@S@For example, if a picture has metadata that indicates the most important region (the one where there is a person), an image viewer on a small screen, such as a mobile phone's, can narrow the picture to that region and thus show the user the most interesting details.@@@@1@48@@danf@17-8-2009
10541250@unknown@formal@none@1@S@A similar kind of metadata is intended to allow blind people to access diagrams and pictures, by converting them for special output devices or reading their description using [[speech synthesis|text-to-speech]] software.@@@@1@31@@danf@17-8-2009
10541260@unknown@formal@none@1@S@Other descriptive metadata can be used to automate workflows.@@@@1@9@@danf@17-8-2009
10541270@unknown@formal@none@1@S@For example, if a "smart" software tool knows the content and structure of data, it can convert it automatically and pass it to another "smart" tool as input.@@@@1@28@@danf@17-8-2009
10541280@unknown@formal@none@1@S@As a result, users are spared the many [[cut, copy and paste|copy-and-paste]] operations required when analyzing data with "dumb" tools.@@@@1@20@@danf@17-8-2009
10541290@unknown@formal@none@1@S@Metadata is becoming an increasingly important part of [[electronic discovery]].@@@@1@10@@danf@17-8-2009
10541295@unknown@formal@none@1@S@[http://www.lexbe.com/hp/indepth-e-discovery-rule-metadata.htm] Application and file system metadata derived from [[electronic document]]s and files can be important evidence.@@@@1@16@@danf@17-8-2009
10541300@unknown@formal@none@1@S@Recent changes to the [[Federal Rules of Civil Procedure]] make metadata routinely discoverable as part of [[Civil law (common law)|civil litigation]].@@@@1@21@@danf@17-8-2009
10541310@unknown@formal@none@1@S@Parties to litigation are required to maintain and produce metadata as part of [[discovery (law)|discovery]], and [[spoliation of evidence|spoliation]] of metadata can lead to sanctions.@@@@1@25@@danf@17-8-2009
10541320@unknown@formal@none@1@S@Metadata has become important on the [[World Wide Web]] because of the need to find useful information from the mass of information available.@@@@1@23@@danf@17-8-2009
10541330@unknown@formal@none@1@S@Manually-created metadata adds value because it ensures consistency.@@@@1@8@@danf@17-8-2009
10541340@unknown@formal@none@1@S@If a web page about a certain topic contains a word or phrase, then all web pages about that topic should contain that same word or phrase.@@@@1@27@@danf@17-8-2009
10541350@unknown@formal@none@1@S@Metadata also ensures variety, so that if a topic goes by two names each will be used.@@@@1@17@@danf@17-8-2009
10541360@unknown@formal@none@1@S@For example, an article about "[[sport utility vehicle]]s" would also be [[tag (metadata)|tagged]] "4 wheel drives", "4WDs" and "four wheel drives", as this is how SUVs are known in some countries.@@@@1@31@@danf@17-8-2009
10541370@unknown@formal@none@1@S@Examples of metadata for an [[Compact Disc|audio CD]] include the [[MusicBrainz]] project and [[All Media Guide]]'s [[Allmusic]].@@@@1@17@@danf@17-8-2009
10541380@unknown@formal@none@1@S@Similarly, [[MP3]] files have metadata tags in a format called [[ID3]].@@@@1@11@@danf@17-8-2009
10541390@unknown@formal@none@1@S@== Types of metadata ==@@@@1@5@@danf@17-8-2009
10541400@unknown@formal@none@1@S@Metadata can be classified by:@@@@1@5@@danf@17-8-2009
10541410@unknown@formal@none@1@S@* Content.@@@@1@2@@danf@17-8-2009
10541420@unknown@formal@none@1@S@Metadata can either describe the ''resource'' itself (for example, name and size of a file) or the ''content'' of the resource (for example, "This video shows a boy playing football").@@@@1@30@@danf@17-8-2009
10541430@unknown@formal@none@1@S@* Mutability.@@@@1@2@@danf@17-8-2009
10541440@unknown@formal@none@1@S@With respect to the whole resource, metadata can be either ''immutable'' (for example, the "Title" of a video does not change as the video itself is being played) or ''mutable'' (the "Scene description" does change).@@@@1@35@@danf@17-8-2009
10541450@unknown@formal@none@1@S@* Logical function.@@@@1@3@@danf@17-8-2009
10541460@unknown@formal@none@1@S@There are three layers of logical function: at the bottom the ''subsymbolic'' layer that contains the raw data itself, then the ''symbolic'' layer with metadata describing the raw data, and on the top the ''logical'' layer containing metadata that allows logical reasoning using the symbolic layer.@@@@1@46@@danf@17-8-2009
10541470@unknown@formal@none@1@S@== Important issues ==@@@@1@4@@danf@17-8-2009
10541480@unknown@formal@none@1@S@To successfully develop and use metadata, several important issues should be treated with care:@@@@1@14@@danf@17-8-2009
10541490@unknown@formal@none@1@S@=== Metadata risks ===@@@@1@4@@danf@17-8-2009
10541500@unknown@formal@none@1@S@[[Microsoft Office]] files include metadata beyond their printable content, such as the original author's name, the creation date of the document, and the amount of time spent editing it.@@@@1@29@@danf@17-8-2009
10541510@unknown@formal@none@1@S@Unintentional disclosure can be awkward or even, in professional practices requiring confidentiality, raise malpractice concerns.@@@@1@15@@danf@17-8-2009
10541520@unknown@formal@none@1@S@Some of a Microsoft Office document's metadata can be seen by clicking ''File'' then ''Properties'' from the program's menu.@@@@1@19@@danf@17-8-2009
10541530@unknown@formal@none@1@S@Other metadata is not visible except through external analysis of a file, such as is done in forensics.@@@@1@18@@danf@17-8-2009
10541540@unknown@formal@none@1@S@The author of the Microsoft Word-based [[Melissa (computer worm)|Melissa]] computer virus in 1999 was caught due to Word metadata that uniquely identified the computer used to create the original infected document.@@@@1@31@@danf@17-8-2009
10541550@unknown@formal@none@1@S@=== Metadata lifecycle ===@@@@1@4@@danf@17-8-2009
10541560@unknown@formal@none@1@S@Even in the early phases of planning and designing it is necessary to keep track of all metadata created.@@@@1@19@@danf@17-8-2009
10541570@unknown@formal@none@1@S@It is not economical to start attaching metadata only after the production process has been completed.@@@@1@16@@danf@17-8-2009
10541580@unknown@formal@none@1@S@For example, if metadata created by a digital camera at recording time is not stored immediately, it may have to be restored afterwards manually with great effort.@@@@1@27@@danf@17-8-2009
10541590@unknown@formal@none@1@S@Therefore, it is necessary for different groups of resource producers to cooperate using compatible methods and standards.@@@@1@17@@danf@17-8-2009
10541600@unknown@formal@none@1@S@* Manipulation.@@@@1@2@@danf@17-8-2009
10541610@unknown@formal@none@1@S@Metadata must adapt if the resource it describes changes.@@@@1@9@@danf@17-8-2009
10541620@unknown@formal@none@1@S@It should be merged when two resources are merged.@@@@1@9@@danf@17-8-2009
10541630@unknown@formal@none@1@S@These operations are seldom performed by today's software; for example, image editing programs usually do not keep track of the [[Exchangeable image file format|Exif]] metadata created by digital cameras.@@@@1@29@@danf@17-8-2009
10541640@unknown@formal@none@1@S@* Destruction.@@@@1@2@@danf@17-8-2009
10541650@unknown@formal@none@1@S@It can be useful to keep metadata even after the resource it describes has been destroyed, for example in change histories within a text document or to archive file deletions due to digital rights management.@@@@1@35@@danf@17-8-2009
10541660@unknown@formal@none@1@S@None of today's metadata standards consider this phase.@@@@1@8@@danf@17-8-2009
10541670@unknown@formal@none@1@S@=== Storage ===@@@@1@3@@danf@17-8-2009
10541680@unknown@formal@none@1@S@Metadata can be stored either ''internally'', in the same file as the data, or ''externally'', in a separate file.@@@@1@19@@danf@17-8-2009
10541690@unknown@formal@none@1@S@Metadata that is embedded with content is called ''embedded metadata''.@@@@1@10@@danf@17-8-2009
10541700@unknown@formal@none@1@S@A data repository typically stores the metadata ''detached'' from the data.@@@@1@11@@danf@17-8-2009
10541710@unknown@formal@none@1@S@Both ways have advantages and disadvantages:@@@@1@6@@danf@17-8-2009
10541720@unknown@formal@none@1@S@*Internal storage allows transferring metadata together with the data it describes; thus, metadata is always at hand and can be manipulated easily.@@@@1@22@@danf@17-8-2009
10541730@unknown@formal@none@1@S@This method creates high redundancy and does not allow holding metadata together.@@@@1@12@@danf@17-8-2009
10541740@unknown@formal@none@1@S@* External storage allows bundling metadata, for example in a database, for more efficient searching.@@@@1@15@@danf@17-8-2009
10541750@unknown@formal@none@1@S@There is no redundancy and metadata can be transferred simultaneously when using [[streaming media|streaming]].@@@@1@14@@danf@17-8-2009
10541760@unknown@formal@none@1@S@However, as most formats use [[Uniform Resource Identifier|URI]]s for that purpose, the method of how the metadata is linked to its data should be treated with care.@@@@1@27@@danf@17-8-2009
10541770@unknown@formal@none@1@S@What if a resource does not have a URI (resources on a local hard disk or web pages that are created on-the-fly using a content management system)?@@@@1@27@@danf@17-8-2009
10541780@unknown@formal@none@1@S@What if metadata can only be evaluated if there is a connection to the Web, especially when using [[Resource Description Framework|RDF]]?@@@@1@21@@danf@17-8-2009
10541790@unknown@formal@none@1@S@How can one detect that a resource has been replaced by another with the same name but different content?@@@@1@19@@danf@17-8-2009
10541800@unknown@formal@none@1@S@Moreover, there is the question of data format: storing metadata in a human-readable format such as XML can be useful because users can understand and edit it without specialized tools.@@@@1@30@@danf@17-8-2009
10541810@unknown@formal@none@1@S@On the other hand, these formats are not optimized for storage capacity; it may be useful to store metadata in a binary, non-human-readable format instead to speed up transfer and save memory.@@@@1@32@@danf@17-8-2009
10541820@unknown@formal@none@1@S@== Criticisms ==@@@@1@3@@danf@17-8-2009
10541830@unknown@formal@none@1@S@Although the majority of computer scientists see metadata as a chance for better interoperability, some critics argue:@@@@1@17@@danf@17-8-2009
10541840@unknown@formal@none@1@S@*Metadata is too expensive and time-consuming.@@@@1@6@@danf@17-8-2009
10541850@unknown@formal@none@1@S@The argument is that companies will not produce metadata without need because it costs extra money, and private users also will not produce complex metadata because its creation is very time-consuming.@@@@1@31@@danf@17-8-2009
10541860@unknown@formal@none@1@S@* Metadata is too complicated.@@@@1@5@@danf@17-8-2009
10541870@unknown@formal@none@1@S@Private users will not create metadata because existing formats, especially [[MPEG-7]], are too complicated.@@@@1@14@@danf@17-8-2009
10541880@unknown@formal@none@1@S@As long as there are no automatic tools for creating metadata, it will not be created.@@@@1@16@@danf@17-8-2009
10541890@unknown@formal@none@1@S@* Metadata is subjective and depends on context.@@@@1@8@@danf@17-8-2009
10541900@unknown@formal@none@1@S@Most probably, two people will attach different metadata to the same resource due to their different points of view.@@@@1@19@@danf@17-8-2009
10541910@unknown@formal@none@1@S@Moreover, metadata can be misinterpreted due to its dependency on context.@@@@1@11@@danf@17-8-2009
10541920@unknown@formal@none@1@S@For example, searching for "post-modern art" may miss a certain item because the expression was not in use at the time when that work of art was created, or searching for "pictures taken at 1:00" may produce confusing results due to local time differences.@@@@1@44@@danf@17-8-2009
10541930@unknown@formal@none@1@S@* There is no end to metadata.@@@@1@7@@danf@17-8-2009
10541940@unknown@formal@none@1@S@For example, when annotating a soccer match with metadata, one can describe all the players and their actions in time and stop there.@@@@1@23@@danf@17-8-2009
10541950@unknown@formal@none@1@S@One can also describe the advertisements in the background and the clothes the players wear.@@@@1@15@@danf@17-8-2009
10541960@unknown@formal@none@1@S@One can also describe each fan in the stands and the clothes they wear.@@@@1@14@@danf@17-8-2009
10541970@unknown@formal@none@1@S@All of this metadata can be interesting to one party or another — such as the spectators, sponsors or a counter-terrorist unit of the police — and even for a simple resource the amount of possible metadata can be gigantic.@@@@1@40@@danf@17-8-2009
10541980@unknown@formal@none@1@S@* Metadata is useless.@@@@1@4@@danf@17-8-2009
10541990@unknown@formal@none@1@S@Many of today's search engines are very efficient at finding text.@@@@1@11@@danf@17-8-2009
10542000@unknown@formal@none@1@S@Other techniques for finding pictures, videos and music (namely query-by-example) will become more and more powerful in the future.@@@@1@19@@danf@17-8-2009
10542010@unknown@formal@none@1@S@Thus, there is no real need for metadata.@@@@1@8@@danf@17-8-2009
10542020@unknown@formal@none@1@S@Opponents of metadata sometimes use the term [[metacrap]] to refer to the unsolved problems of metadata in some scenarios.@@@@1@19@@danf@17-8-2009
10542030@unknown@formal@none@1@S@These people are also referred to as "Meta Haters."@@@@1@9@@danf@17-8-2009
10542040@unknown@formal@none@1@S@== Types ==@@@@1@3@@danf@17-8-2009
10542050@unknown@formal@none@1@S@In general, there are two distinct classes of metadata: structural or control metadata and guide metadata.@@@@1@16@@danf@17-8-2009
10542060@unknown@formal@none@1@S@Structural metadata is used to describe the structure of computer systems such as tables, columns and indexes.@@@@1@17@@danf@17-8-2009
10542070@unknown@formal@none@1@S@Guide metadata is used to help humans find specific items and is usually expressed as a set of keywords in a natural language.@@@@1@23@@danf@17-8-2009
10542080@unknown@formal@none@1@S@Metadata can be divided into three distinct categories:@@@@1@8@@danf@17-8-2009
10542090@unknown@formal@none@1@S@* Descriptive@@@@1@2@@danf@17-8-2009
10542100@unknown@formal@none@1@S@* Administrative@@@@1@2@@danf@17-8-2009
10542110@unknown@formal@none@1@S@* Structural@@@@1@2@@danf@17-8-2009
10542120@unknown@formal@none@1@S@=== Relational database metadata ===@@@@1@5@@danf@17-8-2009
10542130@unknown@formal@none@1@S@Each [[relational database]] system has its own mechanisms for storing metadata.@@@@1@11@@danf@17-8-2009
10542140@unknown@formal@none@1@S@Examples of relational-database metadata include:@@@@1@5@@danf@17-8-2009
10542150@unknown@formal@none@1@S@*Tables of all tables in a database, their names, sizes and the number of rows in each table.@@@@1@18@@danf@17-8-2009
10542160@unknown@formal@none@1@S@* Tables of columns in each database, what tables they are used in, and the type of data stored in each column.@@@@1@22@@danf@17-8-2009
10542170@unknown@formal@none@1@S@In database terminology, this set of metadata is referred to as the [[database catalog|catalog]].@@@@1@14@@danf@17-8-2009
10542180@unknown@formal@none@1@S@The [[SQL]] standard specifies a uniform means to access the catalog, called the INFORMATION_SCHEMA, but not all databases implement it, even if they implement other aspects of the SQL standard.@@@@1@30@@danf@17-8-2009
10542190@unknown@formal@none@1@S@For an example of database-specific metadata access methods, see [[Oracle metadata]].@@@@1@11@@danf@17-8-2009
10542200@unknown@formal@none@1@S@=== Data warehouse metadata ===@@@@1@5@@danf@17-8-2009
10542210@unknown@formal@none@1@S@[[Data warehouse]] metadata systems are sometimes separated into two sections:@@@@1@10@@danf@17-8-2009
10542220@unknown@formal@none@1@S@# '''back room''' metadata that are used for [[Extract, transform, load]] functions to get [[OLTP]] data into a data warehouse@@@@1@20@@danf@17-8-2009
10542230@unknown@formal@none@1@S@# '''front room''' metadata that are used to label screens and create reports@@@@1@13@@danf@17-8-2009
10542240@unknown@formal@none@1@S@Kimball lists the following types of metadata in a data warehouse (See also [http://www.fortunecity.com/skyscraper/oracle/699/orahtml/dbmsmag/9803d05.html]):@@@@1@14@@danf@17-8-2009
10542250@unknown@formal@none@1@S@* [[source system]] metadata@@@@1@4@@danf@17-8-2009
10542260@unknown@formal@none@1@S@** source specifications, such as [[repository|repositories]], and source [[logical schema]]s@@@@1@10@@danf@17-8-2009
10542270@unknown@formal@none@1@S@** source descriptive information, such as ownership descriptions, update frequencies, legal limitations, and [[access method]]s@@@@1@15@@danf@17-8-2009
10542280@unknown@formal@none@1@S@** process information, such as job schedules and extraction code@@@@1@10@@danf@17-8-2009
10542290@unknown@formal@none@1@S@* [[data staging]] metadata@@@@1@4@@danf@17-8-2009
10542300@unknown@formal@none@1@S@** [[data acquisition]] information, such as [[data transmission]] scheduling and results, and file usage@@@@1@14@@danf@17-8-2009
10542310@unknown@formal@none@1@S@** [[dimension table]] management, such as definitions of dimensions, and [[surrogate key]] assignments@@@@1@13@@danf@17-8-2009
10542320@unknown@formal@none@1@S@** [[Program transformation|transformation]] and [[aggregation]], such as [[data enhancement]] and mapping, [[DBMS]] load scripts, and aggregate definitions@@@@1@17@@danf@17-8-2009
10542330@unknown@formal@none@1@S@** audit, job logs and documentation, such as [[data lineage]] records, [[data transform]] logs@@@@1@14@@danf@17-8-2009
10542340@unknown@formal@none@1@S@* DBMS metadata, such as:@@@@1@5@@danf@17-8-2009
10542350@unknown@formal@none@1@S@** DBMS system table contents@@@@1@5@@danf@17-8-2009
10542360@unknown@formal@none@1@S@** processing hints@@@@1@3@@danf@17-8-2009
10542370@unknown@formal@none@1@S@Michael Brackett defines metadata (what he calls "Data resource data") as "any data about the organization's data resource".@@@@1@18@@danf@17-8-2009
10542380@unknown@formal@none@1@S@Adrienne Tannenbaum defines metadata as "the detailed description of instance data.@@@@1@11@@danf@17-8-2009
10542390@unknown@formal@none@1@S@The format and characteristics of populated instance data: instances and values, dependent on the role of the metadata recipient".@@@@1@19@@danf@17-8-2009
10542400@unknown@formal@none@1@S@These definitions are characteristic of the "data about data" definition.@@@@1@10@@danf@17-8-2009
10542410@unknown@formal@none@1@S@=== Business Intelligence metadata ===@@@@1@5@@danf@17-8-2009
10542420@unknown@formal@none@1@S@[[Business Intelligence]] is the process of analyzing large amounts of corporate data, usually stored in large databases such as the [[Data Warehouse]], tracking business performance, detecting patterns and trends, and helping enterprise business users make better decisions.@@@@1@37@@danf@17-8-2009
10542430@unknown@formal@none@1@S@Business Intelligence metadata describes how data is queried, filtered, analyzed, and displayed in Business Intelligence software tools, such as reporting tools, OLAP tools and data mining tools.@@@@1@27@@danf@17-8-2009
10542440@unknown@formal@none@1@S@Examples:@@@@1@1@@danf@17-8-2009
10542450@unknown@formal@none@1@S@* [[Online analytical processing|OLAP]] metadata: The descriptions and structures of Dimensions, Cubes, Measures (Metrics), Hierarchies, Levels, Drill Paths@@@@1@18@@danf@17-8-2009
10542460@unknown@formal@none@1@S@* Reporting metadata: The descriptions and structures of Reports, Charts, Queries, DataSets, Filters, Variables, Expressions@@@@1@15@@danf@17-8-2009
10542470@unknown@formal@none@1@S@* [[Data Mining]] metadata: The descriptions and structures of DataSets, Algorithms, Queries@@@@1@12@@danf@17-8-2009
10542480@unknown@formal@none@1@S@Business Intelligence metadata can be used to understand how corporate financial reports reported to [[Wall Street]] are calculated, and how the revenue, expense and profit are aggregated from individual sales transactions stored in the data warehouse.@@@@1@36@@danf@17-8-2009
10542490@unknown@formal@none@1@S@A good understanding of Business Intelligence metadata is required to solve complex problems such as compliance with corporate governance standards, including [[Sarbanes Oxley]] (SOX) or Basel II.@@@@1@27@@danf@17-8-2009
10542500@unknown@formal@none@1@S@=== General IT metadata ===@@@@1@5@@danf@17-8-2009
10542510@unknown@formal@none@1@S@In contrast, David Marco, another metadata theorist, defines metadata as "all physical data and knowledge from inside and outside an organization, including information about the physical data, technical and business processes, rules and constraints of the data, and structures of the data used by a corporation."@@@@1@46@@danf@17-8-2009
10542520@unknown@formal@none@1@S@Others have included web services, systems and interfaces.@@@@1@8@@danf@17-8-2009
10542530@unknown@formal@none@1@S@In fact, the entire [[Zachman framework]] (see [[Enterprise Architecture]]) can be represented as metadata.@@@@1@14@@danf@17-8-2009
10542540@unknown@formal@none@1@S@Notice that such definitions expand metadata's scope considerably, to encompass most or all of the data required by the [[Management Information System]]s capability.@@@@1@23@@danf@17-8-2009
10542550@unknown@formal@none@1@S@In this sense, the concept of metadata has significant overlaps with the [[ITIL]] concept of a Configuration Management Database ([[CMDB]]), and also with disciplines such as [[Enterprise Architecture]] and [[IT portfolio management]].@@@@1@32@@danf@17-8-2009
10542560@unknown@formal@none@1@S@This broader definition of metadata has precedent.@@@@1@7@@danf@17-8-2009
10542570@unknown@formal@none@1@S@Third generation corporate repository products (such as those eventually merged into the CA Advantage line) not only store information about data definitions (COBOL copybooks, DBMS schema), but also about the programs accessing those data structures, and the [[Job Control Language]] and batch job infrastructure dependencies as well.@@@@1@47@@danf@17-8-2009
10542580@unknown@formal@none@1@S@These products (some of which are still in production) can provide a very complete picture of a mainframe computing environment, supporting exactly the kinds of impact analysis required for ITIL-based processes such as [[ITIL#Incident Management|Incident]] and [[Change Management (ITIL)|Change Management]].@@@@1@40@@danf@17-8-2009
10542590@unknown@formal@none@1@S@The [[ITIL]] [http://www.tso.co.uk/itil/ Back Catalogue] includes the ''Data Management'' volume which recognizes the role of these metadata products on the mainframe, posing the [[CMDB]] as the distributed computing equivalent.@@@@1@29@@danf@17-8-2009
10542600@unknown@formal@none@1@S@CMDB vendors, however, have generally not expanded their scope to include data definitions, and metadata solutions are also available in the distributed world.@@@@1@23@@danf@17-8-2009
10542610@unknown@formal@none@1@S@Determining the appropriate role and scope for each is thus a challenge for large IT organizations requiring the services of both.@@@@1@21@@danf@17-8-2009
10542620@unknown@formal@none@1@S@Since metadata is pervasive, centralized attempts at tracking it need to focus on the most highly leveraged assets.@@@@1@18@@danf@17-8-2009
10542630@unknown@formal@none@1@S@Enterprise Assets may only constitute a small percentage of the entire IT portfolio.@@@@1@13@@danf@17-8-2009
10542640@unknown@formal@none@1@S@Some practitioners have successfully managed IT metadata using the [[Dublin Core]] metamodel.@@@@1@12@@danf@17-8-2009
10542650@unknown@formal@none@1@S@==== IT metadata management products ====@@@@1@6@@danf@17-8-2009
10542660@unknown@formal@none@1@S@First generation data dictionary/metadata repository tools would be those only supporting a specific [[DBMS]], such as [[IDMS]]'s IDD (integrated data dictionary), the [[Information Management System|IMS]] Data Dictionary, and [[ADABAS]]'s Predict.@@@@1@30@@danf@17-8-2009
10542670@unknown@formal@none@1@S@Second generation would be ASG's DATAMANAGER product which could support many different file and DBMS types.@@@@1@16@@danf@17-8-2009
10542680@unknown@formal@none@1@S@Third generation repository products became briefly popular in the early 1990s along with the rise of widespread use of [[RDBMS]] engines such as IBM's [[IBM DB2|DB2]].@@@@1@26@@danf@17-8-2009
10542690@unknown@formal@none@1@S@Fourth generation products link the repository with more [[Extract, transform, load]] tools and can be connected with architectural modeling tools.@@@@1@20@@danf@17-8-2009
10542700@unknown@formal@none@1@S@Examples include [http://www.adaptive.com/products/mm.html Adaptive Metadata Manager] from Adaptive, [http://www.asg.com/products/product_details.asp?code=ROC&id=50 Rochade] from ASG, [http://www.infolibcorp.com/productsOverview.html InfoLibrarian Metadata Integration Framework] and [[Troux Technologies]] Metis Server product.@@@@1@23@@danf@17-8-2009
10542710@unknown@formal@none@1@S@=== File system metadata ===@@@@1@5@@danf@17-8-2009
10542720@unknown@formal@none@1@S@Nearly all [[file system]]s keep metadata about files [[out-of-band]].@@@@1@9@@danf@17-8-2009
10542730@unknown@formal@none@1@S@Some systems keep metadata in [[directory (file systems)|directory]] entries; others in specialized structures like [[inode]]s or even in the name of a file.@@@@1@23@@danf@17-8-2009
10542740@unknown@formal@none@1@S@Metadata can range from simple [[timestamp]]s, [[mode bit]]s, and other special-purpose information used by the implementation itself, to [[icon (computing)|icon]]s and free-text comments, to arbitrary [[attribute-value pair]]s.@@@@1@27@@danf@17-8-2009
10542750@unknown@formal@none@1@S@With more complex and open-ended metadata, it becomes useful to search for files based on the metadata contents.@@@@1@18@@danf@17-8-2009
10542760@unknown@formal@none@1@S@The [[Unix]] [[find]] utility was an early example, although inefficient when scanning hundreds of thousands of files on a modern computer system.@@@@1@22@@danf@17-8-2009
10542770@unknown@formal@none@1@S@[[Apple Computer]]'s [[Mac OS X]] operating system supports cataloguing and searching for file metadata through a feature known as [[Spotlight (software)|Spotlight]], as of [[Mac OS X v10.4|version 10.4]].@@@@1@28@@danf@17-8-2009
10542780@unknown@formal@none@1@S@[[Microsoft]] developed similar functionality with the [[Instant Search]] system in [[Windows Vista]]; it is also present in [[SharePoint Server]].@@@@1@19@@danf@17-8-2009
10542790@unknown@formal@none@1@S@[[Linux]] implements file metadata using [[extended file attributes]].@@@@1@8@@danf@17-8-2009
10542800@unknown@formal@none@1@S@=== Image metadata ===@@@@1@4@@danf@17-8-2009
10542810@unknown@formal@none@1@S@Examples of image files containing metadata include [[Exchangeable image file format]] (EXIF) and [[Tagged Image File Format]] (TIFF).@@@@1@18@@danf@17-8-2009
10542820@unknown@formal@none@1@S@Having metadata about images embedded in TIFF or EXIF files is one way of acquiring additional data about an image.@@@@1@20@@danf@17-8-2009
10542830@unknown@formal@none@1@S@[[Tag (metadata)|Tagging]] pictures with subjects, related emotions, and other descriptive phrases helps Internet users find pictures easily rather than having to search through entire image collections.@@@@1@26@@danf@17-8-2009
10542840@unknown@formal@none@1@S@A prime example of an image tagging service is [[Flickr]], where users upload images and then describe the contents.@@@@1@19@@danf@17-8-2009
10542850@unknown@formal@none@1@S@Other patrons of the site can then search for those tags.@@@@1@11@@danf@17-8-2009
10542860@unknown@formal@none@1@S@Flickr uses a [[folksonomy]]: a free-text keyword system in which the community defines the vocabulary through use rather than through a [[controlled vocabulary]].@@@@1@23@@danf@17-8-2009
10542870@unknown@formal@none@1@S@Users can also tag photos for organization purposes using Adobe's [[Extensible Metadata Platform]] (XMP) language, for example.@@@@1@17@@danf@17-8-2009
10542880@unknown@formal@none@1@S@Digital photography is increasingly making use of technical metadata tags describing the conditions of exposure.@@@@1@15@@danf@17-8-2009
10542890@unknown@formal@none@1@S@Photographers shooting [[RAW image format|Camera RAW]] file formats can use applications such as [[Adobe Bridge]] or Apple Computer's [[Aperture (photography software)|Aperture]] to work with camera metadata for post-processing.@@@@1@28@@danf@17-8-2009
10542900@unknown@formal@none@1@S@=== Audio metadata ===@@@@1@4@@danf@17-8-2009
10542910@unknown@formal@none@1@S@Audio metadata generally relates to how the data should be written in order for a processor to process it efficiently.@@@@1@20@@danf@17-8-2009
10542920@unknown@formal@none@1@S@These technologies are usually seen in audio engine programming, such as Microsoft [[Resource Interchange File Format|RIFF (Resource Interchange File Format)]] technologies for .wav files.@@@@1@24@@danf@17-8-2009
10542930@unknown@formal@none@1@S@Codecs generally develop their own metadata standards for compression purposes.@@@@1@10@@danf@17-8-2009
10542940@unknown@formal@none@1@S@=== Program metadata ===@@@@1@4@@danf@17-8-2009
10542950@unknown@formal@none@1@S@Metadata is casually used to describe the controlling data used in software architectures that are more abstract or configurable.@@@@1@19@@danf@17-8-2009
10542960@unknown@formal@none@1@S@Most '''[[executable|executable file]]''' formats include what may be termed "metadata" that specifies certain, usually configurable, behavioral [[runtime]] characteristics.@@@@1@18@@danf@17-8-2009
10542970@unknown@formal@none@1@S@However, it is difficult if not impossible to precisely distinguish program "metadata" from general aspects of [[Von Neumann architecture|stored-program computing architecture]]; if the machine reads it and acts upon it, it is a computational [[Instruction (computer science)|instruction]], and the prefix "meta" has little significance.@@@@1@44@@danf@17-8-2009
10542980@unknown@formal@none@1@S@In [[Java (programming language)|Java]], the [[Class (file format)|class file format]] contains metadata used by the [[Java compiler]] and the [[Java virtual machine]] to [[dynamic linking|dynamically link]] [[class (computer science)|classes]] and to support [[reflection (computer science)|reflection]].@@@@1@35@@danf@17-8-2009
10542990@unknown@formal@none@1@S@The [[J2SE]] 5.0 version of Java included a [[metadata facility for Java|metadata facility]] to allow additional annotations that are used by [[development tool]]s.@@@@1@23@@danf@17-8-2009
10543000@unknown@formal@none@1@S@In [[MS-DOS]], the [[COM file]] format does ''not'' include metadata, while the [[EXE]] file and Windows [[Portable Executable|PE]] formats do.@@@@1@20@@danf@17-8-2009
10543010@unknown@formal@none@1@S@These metadata can include the company that published the program, the date the program was created, the version number and more.@@@@1@21@@danf@17-8-2009
10543020@unknown@formal@none@1@S@In the [[.NET Framework|Microsoft .NET]] executable format, extra metadata is included to allow [[Reflection (computer science)|reflection]] at runtime.@@@@1@18@@danf@17-8-2009
10543030@unknown@formal@none@1@S@=== Existing software metadata ===@@@@1@5@@danf@17-8-2009
10543040@unknown@formal@none@1@S@The [[Object Management Group]] (OMG) has defined a metadata format for representing entire existing applications for the purposes of [[software mining]], [[software modernization]] and software assurance.@@@@1@26@@danf@17-8-2009
10543050@unknown@formal@none@1@S@This specification, called the OMG [[Knowledge Discovery Metamodel]] (KDM), is the OMG's foundation for "modeling in reverse".@@@@1@17@@danf@17-8-2009
10543060@unknown@formal@none@1@S@KDM is a common language-independent intermediate representation that provides an integrated view of an entire enterprise application, including its behavior (program flow), data, and structure.@@@@1@25@@danf@17-8-2009
10543070@unknown@formal@none@1@S@One of the applications of KDM is Business Rules Mining.@@@@1@10@@danf@17-8-2009
10543080@unknown@formal@none@1@S@[[Knowledge Discovery Metamodel]] includes a fine-grained low-level representation (called "micro KDM"), suitable for performing static analysis of programs.@@@@1@18@@danf@17-8-2009
10543090@unknown@formal@none@1@S@=== Document metadata ===@@@@1@4@@danf@17-8-2009
10543100@unknown@formal@none@1@S@Most programs that create documents, including Microsoft [[SharePoint]], [[Microsoft Office Word|Microsoft Word]] and other [[Microsoft Office]] products, save metadata with the document files.@@@@1@23@@danf@17-8-2009
10543110@unknown@formal@none@1@S@These metadata can contain the name of the person who created the file (obtained from the operating system), the name of the person who last edited the file, how many times the file has been printed, and even how many revisions have been made on the file.@@@@1@47@@danf@17-8-2009
10543120@unknown@formal@none@1@S@Other saved material, such as deleted text (saved in case of an undelete command), document comments and the like, is also commonly referred to as "metadata", and the inadvertent inclusion of this material in distributed files has sometimes led to undesirable disclosures.@@@@1@42@@danf@17-8-2009
10543130@unknown@formal@none@1@S@Document metadata is particularly important in legal environments, where litigants can request this sensitive information, which can include many elements of private, potentially detrimental data.@@@@1@25@@danf@17-8-2009
10543140@unknown@formal@none@1@S@This data has been linked to multiple lawsuits that have caused legal complications for corporations.@@@@1@15@@danf@17-8-2009
10543150@unknown@formal@none@1@S@Many law firms today use "Metadata Management Software", also known as "Metadata Removal Tools".@@@@1@14@@danf@17-8-2009
10543160@unknown@formal@none@1@S@This software can be used to clean documents before they are sent outside the firm.@@@@1@15@@danf@17-8-2009
10543170@unknown@formal@none@1@S@This process, known as metadata management, protects law firms from potentially damaging leaks of sensitive data through [[Electronic Discovery]].@@@@1@19@@danf@17-8-2009
10543180@unknown@formal@none@1@S@For a list of executable formats, see [[object file]].@@@@1@9@@danf@17-8-2009
10543190@unknown@formal@none@1@S@=== Metamodels ===@@@@1@3@@danf@17-8-2009
10543200@unknown@formal@none@1@S@Metadata on Models are called [[Metamodel]]s.@@@@1@6@@danf@17-8-2009
10543210@unknown@formal@none@1@S@In [[Model Driven Engineering]], a [[Model (abstract)|Model]] has to conform to a given [[Metamodel]].@@@@1@14@@danf@17-8-2009
10543220@unknown@formal@none@1@S@According to the [[model-driven architecture|MDA]] guide, a metamodel is a model and each model conforms to a given metamodel.@@@@1@19@@danf@17-8-2009
10543230@unknown@formal@none@1@S@[[Meta-modeling]] allows strict and agile automatic processing of models and metamodels.@@@@1@11@@danf@17-8-2009
10543240@unknown@formal@none@1@S@The [[Object Management Group]] (OMG) defines 4 layers of meta-modeling.@@@@1@10@@danf@17-8-2009
10543250@unknown@formal@none@1@S@Each level of modeling is defined and validated by the next layer:@@@@1@11@@danf@17-8-2009
10543260@unknown@formal@none@1@S@* M0: instance object, data row, record -> "John Smith"@@@@1@10@@danf@17-8-2009
10543270@unknown@formal@none@1@S@* M1: model, schema -> "Customer" UML Class or database Table@@@@1@11@@danf@17-8-2009
10543280@unknown@formal@none@1@S@* M2: metamodel -> [[Unified Modeling Language]] (UML), [[Common Warehouse Metamodel]] (CWM), [[Knowledge Discovery Metamodel]] (KDM)@@@@1@16@@danf@17-8-2009
10543290@unknown@formal@none@1@S@* M3: meta-metamodel -> [[Meta-Object Facility]] (MOF)@@@@1@7@@danf@17-8-2009
10543300@unknown@formal@none@1@S@=== Meta-metadata ===@@@@1@3@@danf@17-8-2009
10543310@unknown@formal@none@1@S@Since metadata are also data, it is possible to have metadata of metadata, or "meta-metadata."@@@@1@15@@danf@17-8-2009
10543320@unknown@formal@none@1@S@Machine-generated meta-metadata, such as the inverted index created by a free-text search engine, is generally not considered metadata, though.@@@@1@19@@danf@17-8-2009
10543330@unknown@formal@none@1@S@=== Digital library metadata ===@@@@1@5@@danf@17-8-2009
10543340@unknown@formal@none@1@S@There are three categories of metadata that are frequently used to describe objects in a digital library:@@@@1@17@@danf@17-8-2009
10543350@unknown@formal@none@1@S@# '''descriptive''' - Information describing the intellectual content of the object, such as [[MARC]] cataloguing records, finding aids or similar schemes.@@@@1@21@@danf@17-8-2009
10543360@unknown@formal@none@1@S@It is typically used for bibliographic purposes and for search and retrieval.@@@@1@12@@danf@17-8-2009
10543370@unknown@formal@none@1@S@# '''structural''' - Information that ties each object to others to make up logical units (e.g., information that relates individual images of pages from a book to the others that make up the book).@@@@1@34@@danf@17-8-2009
10543380@unknown@formal@none@1@S@# '''administrative''' - Information used to manage the object or control access to it.@@@@1@14@@danf@17-8-2009
10543390@unknown@formal@none@1@S@This may include information on how it was scanned, its storage format, [[copyright]] and licensing information, and information necessary for the [[digital preservation|long-term preservation]] of the digital objects.@@@@1@28@@danf@17-8-2009
10543400@unknown@formal@none@1@S@=== Geospatial metadata ===@@@@1@4@@danf@17-8-2009
10543410@unknown@formal@none@1@S@Metadata that describe geographic objects (such as datasets, maps, features, or simply documents with a geospatial component) have a history going back to at least 1994 (see [http://libraries.mit.edu/guides/subjects/metadata/standards/fgdc.html MIT Library page on FGDC Metadata]).@@@@1@34@@danf@17-8-2009
10543420@unknown@formal@none@1@S@This class of metadata is described more fully on the [[Geospatial metadata]] page.@@@@1@13@@danf@17-8-2009
10550010@unknown@formal@none@1@S@Microsoft Windows@@@@1@2@@danf@17-8-2009
10550020@unknown@formal@none@1@S@'''Microsoft Windows''' is a series of [[software]] [[operating system]]s produced by [[Microsoft]].@@@@1@12@@danf@17-8-2009
10550030@unknown@formal@none@1@S@Microsoft first introduced an operating environment named ''Windows'' in November 1985 as an add-on to [[MS-DOS]] in response to the growing interest in [[graphical user interface]]s (GUIs).@@@@1@27@@danf@17-8-2009
10550040@unknown@formal@none@1@S@Microsoft Windows came to [[Market dominance|dominate]] the world's [[personal computer]] market, overtaking [[Mac OS]], which had been introduced previously.@@@@1@19@@danf@17-8-2009
10550050@unknown@formal@none@1@S@At the 2004 [[International Data Corporation|IDC]] Directions conference, it was stated that Windows had approximately 90% of the [[Client (computing)|client]] operating system market.@@@@1@23@@danf@17-8-2009
10550060@unknown@formal@none@1@S@The most recent client version of Windows is [[Windows Vista]]; the current [[Server (computing)|server]] version is [[Windows Server 2008]].@@@@1@19@@danf@17-8-2009
10550070@unknown@formal@none@1@S@==Versions==@@@@1@1@@danf@17-8-2009
10550080@unknown@formal@none@1@S@The term ''Windows'' collectively describes any or all of several generations of Microsoft (MS) operating system (OS) products.@@@@1@18@@danf@17-8-2009
10550090@unknown@formal@none@1@S@These products are generally categorized as follows:@@@@1@7@@danf@17-8-2009
10550100@unknown@formal@none@1@S@===16-bit operating environments===@@@@1@3@@danf@17-8-2009
10550110@unknown@formal@none@1@S@The early versions of Windows were often thought of as just graphical user interfaces, mostly because they ran on top of [[MS-DOS]] and used it for [[file system]] services.@@@@1@29@@danf@17-8-2009
10550120@unknown@formal@none@1@S@However, even the earliest 16-bit Windows versions already assumed many typical operating system functions, notably, having their own [[executable file format]] and providing their own [[device driver]]s (timer, graphics, printer, mouse, keyboard and sound) for applications.@@@@1@36@@danf@17-8-2009
10550130@unknown@formal@none@1@S@Unlike [[MS-DOS]], Windows allowed users to execute multiple graphical applications at the same time, through [[computer multitasking|cooperative multitasking]].@@@@1@18@@danf@17-8-2009
10550140@unknown@formal@none@1@S@Finally, Windows implemented an elaborate, segment-based, software virtual memory scheme, which allowed it to run applications larger than available memory: code segments and [[resource (Windows)|resource]]s were swapped in and thrown away when memory became scarce, and data segments moved in memory when a given application had relinquished processor control, typically waiting for user input.@@@@1@54@@danf@17-8-2009
10550150@unknown@formal@none@1@S@16-bit Windows versions include [[Windows 1.0]] (1985), [[Windows 2.0]] (1987) and its close relatives, ''[[Windows 2.1x|Windows/286-Windows/386]]''.@@@@1@16@@danf@17-8-2009
10550160@unknown@formal@none@1@S@===Hybrid 16/32-bit operating environments===@@@@1@4@@danf@17-8-2009
10550170@unknown@formal@none@1@S@[[Windows 2.1x|Windows/386]] introduced a 32-bit [[protected mode]] [[kernel (computer science)|kernel]] and [[virtual machine]] monitor.@@@@1@14@@danf@17-8-2009
10550180@unknown@formal@none@1@S@For the duration of a Windows session, it created one or more [[virtual 8086 mode|virtual 8086 environments]] and provided device virtualization for the video card, keyboard, mouse, timer and [[interrupt]] controller inside each of them.@@@@1@35@@danf@17-8-2009
10550190@unknown@formal@none@1@S@The user-visible consequence was that it became possible to preemptively multitask multiple MS-DOS environments in separate windows, although graphical MS-DOS applications required full screen mode.@@@@1@25@@danf@17-8-2009
10550200@unknown@formal@none@1@S@Also, Windows applications were multi-tasked cooperatively inside one such virtual 8086 environment.@@@@1@12@@danf@17-8-2009
10550210@unknown@formal@none@1@S@[[Windows 3.0]] (1990) and [[Windows 3.1x|Windows 3.1]] (1992) improved the design, mostly because of [[virtual memory]] and loadable virtual device drivers ([[VxD]]s) which allowed them to share arbitrary devices between multitasked DOS windows.@@@@1@33@@danf@17-8-2009
10550220@unknown@formal@none@1@S@Also, Windows applications could now run in protected mode (when Windows was running in Standard or 386 Enhanced Mode), which gave them access to several megabytes of memory and removed the obligation to participate in the software virtual memory scheme.@@@@1@40@@danf@17-8-2009
10550230@unknown@formal@none@1@S@They still ran inside the same address space, where the segmented memory provided a degree of protection, and multi-tasked cooperatively.@@@@1@20@@danf@17-8-2009
10550240@unknown@formal@none@1@S@For Windows 3.0, Microsoft also rewrote critical operations from [[C (programming language)|C]] into [[Assembly language|assembly]], making this release faster and less memory-hungry than its predecessors.@@@@1@25@@danf@17-8-2009
10550250@unknown@formal@none@1@S@===Hybrid 16/32-bit operating systems===@@@@1@4@@danf@17-8-2009
10550260@unknown@formal@none@1@S@With the introduction of the [[32-bit]] [[Windows 3.1x|Windows for Workgroups 3.11]], Windows was able to stop relying on DOS for file management.@@@@1@22@@danf@17-8-2009
10550270@unknown@formal@none@1@S@Leveraging this, [[Windows 95]] introduced [[Long filename|Long File Names]], reducing the [[8.3 filename]] DOS environment to the role of a [[boot loader]].@@@@1@22@@danf@17-8-2009
10550280@unknown@formal@none@1@S@MS-DOS was now bundled with Windows; this notably made it (partially) aware of long file names when its utilities were run from within Windows.@@@@1@24@@danf@17-8-2009
10550290@unknown@formal@none@1@S@The most important novelty was the possibility of running 32-bit multi-threaded preemptively multitasked graphical programs.@@@@1@15@@danf@17-8-2009
10550300@unknown@formal@none@1@S@However, the necessity of keeping compatibility with 16-bit programs meant the GUI components were still 16-bit only and not fully reentrant, which resulted in reduced performance and stability.@@@@1@28@@danf@17-8-2009
10550310@unknown@formal@none@1@S@There were three releases of Windows 95 (the first in 1995, then subsequent bug-fix versions in 1996 and 1997, only released to OEMs, which added extra features such as [[File Allocation Table|FAT32]] and primitive USB support).@@@@1@36@@danf@17-8-2009
10550320@unknown@formal@none@1@S@Microsoft's next OS was [[Windows 98]]; there were two versions of this (the first in 1998 and the second, named "Windows 98 Second Edition", in 1999).@@@@1@26@@danf@17-8-2009
10550330@unknown@formal@none@1@S@In 2000, Microsoft released [[Windows Me]] (''Me'' standing for ''Millennium Edition''), which used the same core as Windows 98 but adopted some aspects of Windows 2000 and removed the option to boot into DOS mode.@@@@1@35@@danf@17-8-2009
10550340@unknown@formal@none@1@S@It also added a new feature called System Restore, allowing the user to roll the computer's settings back to an earlier date.@@@@1@22@@danf@17-8-2009
10550350@unknown@formal@none@1@S@===32-bit operating systems===@@@@1@3@@danf@17-8-2009
10550360@unknown@formal@none@1@S@The NT family of Windows systems was fashioned and marketed for higher-reliability business use, and was unencumbered by any MS-DOS legacy.@@@@1@21@@danf@17-8-2009
10550370@unknown@formal@none@1@S@The first release was [[Windows NT 3.1]] (1993, numbered "3.1" to match the Windows version and to one-up [[OS/2]] 2.1, IBM's flagship OS, which was co-developed by Microsoft and was Windows NT's main competitor at the time), which was followed by [[Windows NT 3.5|NT 3.5]] (1994), [[Windows NT 3.51|NT 3.51]] (1995), [[Windows NT 4.0|NT 4.0]] (1996), and [[Windows 2000]] (essentially NT 5.0).@@@@1@62@@danf@17-8-2009
10550380@unknown@formal@none@1@S@NT 4.0 was the first in this line to implement the "Windows 95" user interface (and the first to include Windows 95's built-in 32-bit runtimes).@@@@1@25@@danf@17-8-2009
10550390@unknown@formal@none@1@S@Microsoft then moved to combine their consumer and business operating systems.@@@@1@11@@danf@17-8-2009
10550400@unknown@formal@none@1@S@[[Windows XP]], coming in both home and professional versions (and later niche market versions for [[tablet PC]]s and [[media center]]s), improved stability, user experience and backwards compatibility.@@@@1@27@@danf@17-8-2009
10550410@unknown@formal@none@1@S@Then, [[Windows Server 2003]] brought [[Windows Server]] up to date with Windows XP.@@@@1@13@@danf@17-8-2009
10550420@unknown@formal@none@1@S@Since then, a new version, [[Windows Vista]], was released, and [[Windows Server 2008]], released on [[February 27]], [[2008]], brings [[Windows Server]] up to date with [[Windows Vista]].@@@@1@27@@danf@17-8-2009
10550430@unknown@formal@none@1@S@[[Windows CE]], Microsoft's offering in the mobile and embedded markets, is also a true 32-bit operating system that offers various services for all sub-operating workstations.@@@@1@25@@danf@17-8-2009
10550440@unknown@formal@none@1@S@===64-bit operating systems===@@@@1@3@@danf@17-8-2009
10550450@unknown@formal@none@1@S@[[Windows NT]] included support for several different platforms before the [[X86 architecture|x86]]-based [[personal computer]] became dominant in the professional world.@@@@1@20@@danf@17-8-2009
10550460@unknown@formal@none@1@S@Versions of NT from 3.1 to 4.0 variously supported [[PowerPC]], [[DEC Alpha]] and [[MIPS Technologies|MIPS]] R4000, some of which were 64-bit processors, although the operating system treated them as 32-bit processors.@@@@1@31@@danf@17-8-2009
10550470@unknown@formal@none@1@S@With the introduction of the [[Intel]] [[Itanium]] architecture, which is referred to as [[IA-64]], Microsoft released new versions of Windows to support it.@@@@1@23@@danf@17-8-2009
10550480@unknown@formal@none@1@S@Itanium versions of [[Windows XP]] and [[Windows Server 2003]] were released at the same time as their mainstream x86 (32-bit) counterparts.@@@@1@21@@danf@17-8-2009
10550490@unknown@formal@none@1@S@On [[April 25]], [[2005]], Microsoft released [[Windows XP Professional x64 Edition]] and x64 versions of Windows Server 2003 to support the [[x86-64|AMD64/Intel64]] (or ''x64'' in Microsoft terminology) architecture.@@@@1@28@@danf@17-8-2009
10550500@unknown@formal@none@1@S@Microsoft dropped support for the Itanium version of Windows XP in 2005.@@@@1@12@@danf@17-8-2009
10550510@unknown@formal@none@1@S@[[Windows Vista]] is the first end-user version of Windows that Microsoft has released simultaneously in 32-bit and x64 editions.@@@@1@19@@danf@17-8-2009
10550520@unknown@formal@none@1@S@Windows Vista does not support the Itanium architecture.@@@@1@8@@danf@17-8-2009
10550530@unknown@formal@none@1@S@The modern 64-bit Windows family comprises AMD64/Intel64 versions of [[Windows Vista]], together with [[Windows Server 2003]] and [[Windows Server 2008]] in both Itanium and x64 editions.@@@@1@26@@danf@17-8-2009
10550540@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10550550@unknown@formal@none@1@S@Microsoft has taken two parallel routes in its operating systems.@@@@1@10@@danf@17-8-2009
10550560@unknown@formal@none@1@S@One route has been for the home user and the other has been for the professional IT user.@@@@1@18@@danf@17-8-2009
10550570@unknown@formal@none@1@S@The dual routes have generally led to home versions having greater [[multimedia]] support and less functionality in networking and security, and professional versions having inferior multimedia support and better networking and security.@@@@1@32@@danf@17-8-2009
10550580@unknown@formal@none@1@S@The first version of Microsoft Windows, [[Windows 1.0|version 1.0]], released in November 1985, was intended to compete with Apple's operating system but lacked a degree of functionality and achieved little popularity.@@@@1@31@@danf@17-8-2009
10550590@unknown@formal@none@1@S@Windows 1.0 is not a complete operating system; rather, it extends MS-DOS.@@@@1@12@@danf@17-8-2009
10550600@unknown@formal@none@1@S@Microsoft Windows version 2.0 was released in November 1987 and was slightly more popular than its predecessor.@@@@1@17@@danf@17-8-2009
10550610@unknown@formal@none@1@S@Windows 2.03 (released January 1988) changed the OS from tiled windows to overlapping windows.@@@@1@14@@danf@17-8-2009
10550620@unknown@formal@none@1@S@This change led Apple Computer to file a suit against Microsoft alleging infringement of Apple's copyrights.@@@@1@16@@danf@17-8-2009
10550630@unknown@formal@none@1@S@Microsoft Windows version 3.0, released in 1990, was the first Microsoft Windows version to achieve broad commercial success, selling 2 million copies in the first six months.[http://www.islandnet.com/~kpolsson/compsoft/soft1991.htm][http://www.thocp.net/companies/microsoft/microsoft_company.htm]@@@@1@27@@danf@17-8-2009
10550635@unknown@formal@none@1@S@It featured improvements to the user interface and to multitasking capabilities.@@@@1@11@@danf@17-8-2009
10550640@unknown@formal@none@1@S@It received a facelift in Windows 3.1, made generally available on [[March 1]], [[1992]].@@@@1@14@@danf@17-8-2009
10550650@unknown@formal@none@1@S@Windows 3.1 support ended on [[December 31]], [[2001]].@@@@1@8@@danf@17-8-2009
10550660@unknown@formal@none@1@S@In July 1993, Microsoft released [[Windows NT]] based on a new kernel.@@@@1@12@@danf@17-8-2009
10550670@unknown@formal@none@1@S@NT was considered to be the professional OS and was the first Windows version to utilize [[preemptive multitasking]].@@@@1@18@@danf@17-8-2009
10550680@unknown@formal@none@1@S@Windows NT would later be retooled to also function as a home operating system, with Windows XP.@@@@1@17@@danf@17-8-2009
10550690@unknown@formal@none@1@S@On August 24, 1995, Microsoft released [[Windows 95]], a major new consumer version that made further changes to the user interface and also used [[preemptive multitasking]].@@@@1@26@@danf@17-8-2009
10550700@unknown@formal@none@1@S@Windows 95 was designed to replace not only Windows 3.1, but also Windows for Workgroups and MS-DOS.@@@@1@17@@danf@17-8-2009
10550710@unknown@formal@none@1@S@It was also the first Windows operating system to use Plug and Play capabilities.@@@@1@14@@danf@17-8-2009
10550720@unknown@formal@none@1@S@The changes Windows 95 brought to the desktop were revolutionary, as opposed to the evolutionary changes in Windows 98 and Windows Me.@@@@1@22@@danf@17-8-2009
10550730@unknown@formal@none@1@S@Mainstream support for [[Windows 95]] ended on [[December 31]], [[2000]] and extended support for [[Windows 95]] ended on [[December 31]], [[2001]].@@@@1@21@@danf@17-8-2009
10550740@unknown@formal@none@1@S@The next in the consumer line was Microsoft [[Windows 98]], released on June 25th, 1998.@@@@1@15@@danf@17-8-2009
10550750@unknown@formal@none@1@S@It was substantially criticized for its slowness and for its unreliability compared with [[Windows 95]], but many of its basic problems were later rectified with the release of [[Windows 98]] Second Edition in 1999.@@@@1@34@@danf@17-8-2009
10550760@unknown@formal@none@1@S@Mainstream support for [[Windows 98]] ended on [[June 30]], [[2002]] and extended support for [[Windows 98]] ended on [[July 11]], [[2006]].@@@@1@21@@danf@17-8-2009
10550770@unknown@formal@none@1@S@As part of its "professional" line, Microsoft released [[Windows 2000]] in February 2000.@@@@1@13@@danf@17-8-2009
10550780@unknown@formal@none@1@S@The consumer version following Windows 98 was [[Windows Me]] (Windows Millennium Edition).@@@@1@12@@danf@17-8-2009
10550790@unknown@formal@none@1@S@Released in September 2000, [[Windows Me]] implemented a number of new technologies for Microsoft: most notably publicized was "[[Universal Plug and Play]]."@@@@1@22@@danf@17-8-2009
10550800@unknown@formal@none@1@S@In October 2001, Microsoft released [[Windows XP]], a version built on the Windows NT [[Kernel (computer science)|kernel]] that also retained the consumer-oriented usability of Windows 95 and its successors.@@@@1@29@@danf@17-8-2009
10550810@unknown@formal@none@1@S@This new version was widely praised in computer magazines.@@@@1@9@@danf@17-8-2009
10550820@unknown@formal@none@1@S@It shipped in two distinct editions, "Home" and "Professional", the former lacking many of the superior security and networking features of the Professional edition.@@@@1@24@@danf@17-8-2009
10550830@unknown@formal@none@1@S@Additionally, the first "Media Center" edition was released in 2002, with an emphasis on support for DVD and TV functionality including program recording and a remote control.@@@@1@27@@danf@17-8-2009
10550840@unknown@formal@none@1@S@Mainstream support for [[Windows XP]] will continue until [[April 14]], [[2009]] and extended support will continue until [[April 8]], [[2014]].@@@@1@20@@danf@17-8-2009
10550850@unknown@formal@none@1@S@In April 2003, [[Windows Server 2003]] was introduced, replacing the [[Windows 2000]] line of server products with a number of new features and a strong focus on security; this was followed in December 2005 by Windows Server 2003 R2.@@@@1@39@@danf@17-8-2009
10550860@unknown@formal@none@1@S@On [[January 30]], [[2007]] Microsoft released [[Windows Vista]].@@@@1@8@@danf@17-8-2009
10550870@unknown@formal@none@1@S@It contains a number of [[Features new to Windows Vista|new features]], from a redesigned shell and user interface to significant [[Technical features new to Windows Vista|technical changes]], with a particular focus on [[Security and safety features new to Windows Vista|security features]].@@@@1@41@@danf@17-8-2009
10550880@unknown@formal@none@1@S@It is available in a number of [[Windows Vista editions and pricing|different editions]], and has been subject to [[Criticism of Windows Vista|some criticism]].@@@@1@23@@danf@17-8-2009
10550890@unknown@formal@none@1@S@==Timeline of releases==@@@@1@3@@danf@17-8-2009
10550900@unknown@formal@none@1@S@==Security==@@@@1@1@@danf@17-8-2009
10550910@unknown@formal@none@1@S@[[Computer security|Security]] has been a hot topic with Windows for many years, and even Microsoft itself has been the victim of security breaches.@@@@1@23@@danf@17-8-2009
10550920@unknown@formal@none@1@S@Consumer versions of Windows were originally designed for ease-of-use on a single-user PC without a network connection, and did not have security features built in from the outset.@@@@1@28@@danf@17-8-2009
10550930@unknown@formal@none@1@S@[[Windows NT]] and its successors are designed for security (including on a network) and multi-user PCs, but were not initially designed with Internet security in mind, since Internet use was less prevalent when NT was first developed in the early 1990s.@@@@1@41@@danf@17-8-2009
10550940@unknown@formal@none@1@S@These design issues, combined with flawed code (such as [[buffer overflow]]s) and the popularity of Windows, mean that it is a frequent target of [[computer worm|worm]] and [[computer virus|virus]] writers.@@@@1@30@@danf@17-8-2009
10550950@unknown@formal@none@1@S@In June 2005, [[Bruce Schneier]]'s ''Counterpane Internet Security'' reported that it had seen over 1,000 new viruses and worms in the previous six months.@@@@1@24@@danf@17-8-2009
10550960@unknown@formal@none@1@S@Microsoft releases security patches through its [[Windows Update]] service approximately once a month (usually the second Tuesday of the month), although critical updates are made available at shorter intervals when necessary.@@@@1@31@@danf@17-8-2009
10550970@unknown@formal@none@1@S@In Windows 2000 (SP3 and later), Windows XP and Windows Server 2003, updates can be automatically downloaded and installed if the user chooses to do so.@@@@1@26@@danf@17-8-2009
10550980@unknown@formal@none@1@S@As a result, Service Pack 2 for Windows XP, as well as Service Pack 1 for Windows Server 2003, were installed by users more quickly than they otherwise might have been.@@@@1@31@@danf@17-8-2009
10550990@unknown@formal@none@1@S@===Windows Defender===@@@@1@2@@danf@17-8-2009
10551000@unknown@formal@none@1@S@On [[6 January]] [[2005]], Microsoft released a beta version of Microsoft AntiSpyware, based upon the previously released [[GIANT Company Software|Giant]] AntiSpyware.@@@@1@21@@danf@17-8-2009
10551010@unknown@formal@none@1@S@On [[14 February]], [[2006]], Microsoft AntiSpyware became [[Windows Defender]] with the release of beta 2.@@@@1@15@@danf@17-8-2009
10551020@unknown@formal@none@1@S@Windows Defender is a freeware program designed to protect against spyware and other unwanted software.@@@@1@15@@danf@17-8-2009
10551030@unknown@formal@none@1@S@[[Windows XP]] and [[Windows Server 2003]] users who have [[Windows Genuine Advantage|genuine]] copies of Microsoft Windows can freely download the program from Microsoft's web site, and Windows Defender ships as part of [[Windows Vista]].@@@@1@34@@danf@17-8-2009
10551040@unknown@formal@none@1@S@===Third-party analysis===@@@@1@2@@danf@17-8-2009
10551050@unknown@formal@none@1@S@In an article based on a report by Symantec, internetnews.com has described Microsoft Windows as having the "fewest number of patches and the shortest average patch development time of the five operating systems it monitored in the last six months of 2006."@@@@1@42@@danf@17-8-2009
10551060@unknown@formal@none@1@S@And the number of vulnerabilities found in Windows has significantly increased— Windows: 12+, Red Hat + Fedora: 2, Mac OS X: 1, HP-UX: 2, Solaris: 1.@@@@1@26@@danf@17-8-2009
10551070@unknown@formal@none@1@S@A study conducted by [[Kevin Mitnick]] and marketing communications firm Avantgarde in 2004 found that an unprotected and unpatched Windows XP system with Service Pack 1 lasted only 4 minutes on the Internet before it was compromised, and an unprotected and also unpatched [[Windows Server 2003]] system was compromised after being connected to the Internet for 8 hours.@@@@1@58@@danf@17-8-2009
10551080@unknown@formal@none@1@S@However, it is important to note that this study does not apply to Windows XP systems running the Service Pack 2 update (released in late 2004), which vastly improved the security of Windows XP.@@@@1@34@@danf@17-8-2009
10551090@unknown@formal@none@1@S@The computer that was running Windows XP Service Pack 2 was not compromised.@@@@1@13@@danf@17-8-2009
10551100@unknown@formal@none@1@S@The [[AOL]] National Cyber Security Alliance Online Safety Study of October 2004 determined that 80% of Windows users were infected by at least one [[spyware]]/[[adware]] product.@@@@1@26@@danf@17-8-2009
10551110@unknown@formal@none@1@S@Much documentation is available describing how to increase the security of Microsoft Windows products.@@@@1@14@@danf@17-8-2009
10551120@unknown@formal@none@1@S@Typical suggestions include deploying Microsoft Windows behind a hardware or software [[firewall]], running [[anti-virus]] and [[anti-spyware]] software, and installing patches as they become available through [[Windows Update]].@@@@1@27@@danf@17-8-2009
10551130@unknown@formal@none@1@S@==Windows Lifecycle Policy==@@@@1@3@@danf@17-8-2009
10551140@unknown@formal@none@1@S@Microsoft has stopped releasing updates and hotfixes for many old Windows operating systems, including all versions of Windows 9x and earlier versions of Windows NT.@@@@1@25@@danf@17-8-2009
10551150@unknown@formal@none@1@S@Windows versions prior to [[Windows XP|XP]] are no longer supported, with the exception of [[Windows 2000]], which is currently in the Extended Support Period that will end on [[July 13]], [[2010]].@@@@1@31@@danf@17-8-2009
10551160@unknown@formal@none@1@S@Windows XP versions prior to SP2 are no longer supported either.@@@@1@11@@danf@17-8-2009
10551170@unknown@formal@none@1@S@Also, support for [[Windows XP 64-bit Edition]] ended after the release of the more recent [[Windows XP Professional x64 Edition]].@@@@1@20@@danf@17-8-2009
10551180@unknown@formal@none@1@S@No new updates are created for unsupported versions of Windows.@@@@1@10@@danf@17-8-2009
10551190@unknown@formal@none@1@S@==Emulation software==@@@@1@2@@danf@17-8-2009
10551200@unknown@formal@none@1@S@Emulation allows the use of some Windows applications without using Microsoft Windows.@@@@1@12@@danf@17-8-2009
10551210@unknown@formal@none@1@S@These include:@@@@1@2@@danf@17-8-2009
10551220@unknown@formal@none@1@S@* [[Wine (software)|Wine]] - a [[free and open source software]] implementation of the [[Windows API]], allowing one to run many Windows applications on x86-based platforms, including [[Linux]].@@@@1@27@@danf@17-8-2009
10551230@unknown@formal@none@1@S@Wine is technically not an emulator but a "compatibility layer"; while an emulator effectively 'pretends' to be a different CPU, Wine instead makes use of Windows-style APIs to 'simulate' the Windows environment directly.@@@@1@33@@danf@17-8-2009
10551240@unknown@formal@none@1@S@** [[CrossOver]] - A Wine package with licensed fonts.@@@@1@9@@danf@17-8-2009
10551250@unknown@formal@none@1@S@Its developers are regular contributors to Wine, and focus on Wine running officially supported applications.@@@@1@15@@danf@17-8-2009
10551260@unknown@formal@none@1@S@** [[Cedega]] - [[TransGaming Technologies]]' proprietary [[Fork (software development)|fork]] of Wine, designed specifically for running games written for Microsoft Windows under Linux.@@@@1@22@@danf@17-8-2009
10551270@unknown@formal@none@1@S@** [[Darwine]] - This project intends to port and develop Wine as well as other supporting tools that will allow [[Darwin (operating system)|Darwin]] and [[Mac OS X]] users to run Microsoft Windows applications, and to provide [[Win32]] [[Application Programming Interface|API]] compatibility at application source code level.@@@@1@46@@danf@17-8-2009
10551280@unknown@formal@none@1@S@* [[ReactOS]] - An open-source OS that is intended to run the same software as Windows, originally designed to imitate Windows NT 4.0, now aiming at Windows XP compatibility.@@@@1@29@@danf@17-8-2009
10551290@unknown@formal@none@1@S@It has been in the [[development stage]] since 1996.@@@@1@9@@danf@17-8-2009
10560010@unknown@formal@none@1@S@Morphology (linguistics)@@@@1@2@@danf@17-8-2009
10560020@unknown@formal@none@1@S@'''Morphology''' is the field of [[linguistics]] that studies the internal structure of words.@@@@1@13@@danf@17-8-2009
10560030@unknown@formal@none@1@S@(Words as units in the lexicon are the subject matter of [[lexicology]].)@@@@1@12@@danf@17-8-2009
10560040@unknown@formal@none@1@S@While words are generally accepted as being (with [[clitic]]s) the smallest units of [[syntax]], it is clear that in most (if not all) languages, words can be related to other words by rules.@@@@1@33@@danf@17-8-2009
10560050@unknown@formal@none@1@S@For example, [[English language|English]] speakers recognize that the words ''dog'', ''dogs'', and ''dog-catcher'' are closely related.@@@@1@16@@danf@17-8-2009
10560060@unknown@formal@none@1@S@English speakers recognize these relations from their tacit knowledge of the rules of word-formation in English.@@@@1@16@@danf@17-8-2009
10560070@unknown@formal@none@1@S@They intuit that ''dog'' is to ''dogs'' as ''cat'' is to ''cats''; similarly, ''dog'' is to ''dog-catcher'' as ''dish'' is to ''dishwasher''.@@@@1@22@@danf@17-8-2009
10560080@unknown@formal@none@1@S@The rules understood by the speaker reflect specific patterns (or regularities) in the way words are formed from smaller units and how those smaller units interact in speech.@@@@1@28@@danf@17-8-2009
10560090@unknown@formal@none@1@S@In this way, morphology is the branch of linguistics that studies patterns of word-formation within and across languages, and attempts to formulate rules that model the knowledge of the speakers of those languages.@@@@1@33@@danf@17-8-2009
10560100@unknown@formal@none@1@S@==History ==@@@@1@2@@danf@17-8-2009
10560110@unknown@formal@none@1@S@The history of morphological analysis dates back to the [[ancient India]]n linguist [[Pāṇini]], who formulated the 3,959 rules of [[Sanskrit]] morphology in the text ''Aṣṭādhyāyī'' by using a Constituency Grammar.@@@@1@30@@danf@17-8-2009
10560120@unknown@formal@none@1@S@The Graeco-Roman grammatical tradition also engaged in morphological analysis.@@@@1@9@@danf@17-8-2009
10560130@unknown@formal@none@1@S@The term ''morphology'' was coined by [[August Schleicher]] in [[1859]].@@@@1@10@@danf@17-8-2009
10560140@unknown@formal@none@1@S@== Fundamental concepts ==@@@@1@4@@danf@17-8-2009
10560150@unknown@formal@none@1@S@=== Lexemes and word forms ===@@@@1@6@@danf@17-8-2009
10560160@unknown@formal@none@1@S@The distinction between these two senses of "word" is arguably the most important one in morphology.@@@@1@16@@danf@17-8-2009
10560170@unknown@formal@none@1@S@The first sense of "word," the one in which ''dog'' and ''dogs'' are "the same word," is called '''[[lexeme]]'''.@@@@1@19@@danf@17-8-2009
10560180@unknown@formal@none@1@S@The second sense is called '''word-form'''.@@@@1@6@@danf@17-8-2009
10560190@unknown@formal@none@1@S@We thus say that ''dog'' and ''dogs'' are different forms of the same lexeme.@@@@1@14@@danf@17-8-2009
10560200@unknown@formal@none@1@S@''Dog'' and ''dog-catcher'', on the other hand, are different lexemes; for example, they refer to two different kinds of entities.@@@@1@20@@danf@17-8-2009
10560210@unknown@formal@none@1@S@The form of a word that is chosen conventionally to represent the canonical form of a word is called a [[lemma (linguistics)|lemma]], or '''citation form'''.@@@@1@25@@danf@17-8-2009
10560220@unknown@formal@none@1@S@==== Prosodic word vs. morphological word ====@@@@1@7@@danf@17-8-2009
10560230@unknown@formal@none@1@S@Here are examples from other languages of the failure of a single phonological word to coincide with a single morphological word-form.@@@@1@21@@danf@17-8-2009
10560240@unknown@formal@none@1@S@In Latin, one way to express the concept of 'NOUN-PHRASE1 and NOUN-PHRASE2' (as in "apples and oranges") is to suffix '-que' to the second noun phrase: "apples oranges-and", as it were.@@@@1@33@@danf@17-8-2009
10560250@unknown@formal@none@1@S@An extreme level of this theoretical quandary posed by some phonological words is provided by the Kwak'wala language.@@@@1@18@@danf@17-8-2009
10560260@unknown@formal@none@1@S@In Kwak'wala, as in a great many other languages, meaning relations between nouns, including possession and "semantic case", are formulated by affixes instead of by independent "words".@@@@1@27@@danf@17-8-2009
10560270@unknown@formal@none@1@S@The three word English phrase, "with his club", where 'with' identifies its dependent noun phrase as an instrument and 'his' denotes a possession relation, would consist of two words or even just one word in many languages.@@@@1@37@@danf@17-8-2009
10560280@unknown@formal@none@1@S@But affixation for semantic relations in Kwak'wala differs dramatically (from the viewpoint of those whose language is not Kwak'wala) from such affixation in other languages for this reason: the affixes phonologically attach not to the lexeme they pertain to semantically, but to the ''preceding'' lexeme.@@@@1@45@@danf@17-8-2009
10560290@unknown@formal@none@1@S@Consider the following example (in Kwak'wala, sentences begin with what corresponds to an English verb):@@@@1@15@@danf@17-8-2009
10560300@unknown@formal@none@1@S@kwixʔid-i-da bəgwanəma<sub>i</sub>-χ-a q'asa-s-is<sub>i</sub> t'alwagwayu@@@@1@4@@danf@17-8-2009
10560310@unknown@formal@none@1@S@Morpheme by morpheme translation:@@@@1@4@@danf@17-8-2009
10560320@unknown@formal@none@1@S@kwixʔid-i-da = clubbed-PIVOT-DETERMINER@@@@1@4@@danf@17-8-2009
10560330@unknown@formal@none@1@S@bəgwanəma-χ-a = man-ACCUSATIVE-DETERMINER@@@@1@4@@danf@17-8-2009
10560340@unknown@formal@none@1@S@q'asa-s-is = otter-INSTRUMENTAL-3.PERSON.SINGULAR-POSSESSIVE@@@@1@4@@danf@17-8-2009
10560350@unknown@formal@none@1@S@t'alwagwayu = club.@@@@1@3@@danf@17-8-2009
10560360@unknown@formal@none@1@S@"the man clubbed the otter with his club"@@@@1@8@@danf@17-8-2009
10560370@unknown@formal@none@1@S@(Notation notes:@@@@1@2@@danf@17-8-2009
10560380@unknown@formal@none@1@S@1. accusative case marks an entity that something is done to.@@@@1@11@@danf@17-8-2009
10560390@unknown@formal@none@1@S@2. determiners are words such as "the", "this", "that".@@@@1@9@@danf@17-8-2009
10560400@unknown@formal@none@1@S@3. the concept of "pivot" is a theoretical construct that is not relevant to this discussion.)@@@@1@16@@danf@17-8-2009
10560410@unknown@formal@none@1@S@That is, to the speaker of Kwak'wala, the sentence does not contain the "words" 'him-the-otter' or 'with-his-club'. Instead, the markers -''i-da'' (PIVOT-'the'), referring to ''man'', attach not to ''bəgwanəma'' ('man'), but instead to the "verb"; the markers -''χ-a'' (ACCUSATIVE-'the'), referring to ''otter'', attach to ''bəgwanəma'' instead of to ''q'asa'' ('otter'), etc.@@@@1@53@@danf@17-8-2009
10560420@unknown@formal@none@1@S@To summarize differently: a speaker of Kwak'wala does ''not'' perceive the sentence to consist of these phonological words:@@@@1@18@@danf@17-8-2009
10560430@unknown@formal@none@1@S@kwixʔid i-da-bəgwanəma χ-a-q'asa s-is<sub>i</sub>-t'alwagwayu@@@@1@4@@danf@17-8-2009
10560440@unknown@formal@none@1@S@"clubbed PIVOT-the-man<sub>i</sub> hit-the-otter with-his<sub>i</sub>-club"@@@@1@4@@danf@17-8-2009
10560450@unknown@formal@none@1@S@A central publication on this topic is the recent volume edited by Dixon and Aikhenvald (2007), examining the mismatch between prosodic-phonological and grammatical definitions of "word" in various Amazonian, Australian Aboriginal, Caucasian, Eskimo, Indo-European, Native North American, and West African languages, and in sign languages.@@@@1@45@@danf@17-8-2009
10560460@unknown@formal@none@1@S@Apparently, a wide variety of languages make use of the hybrid linguistic unit called the clitic, which has the grammatical features of independent words but the prosodic-phonological lack of freedom of bound morphemes.@@@@1@33@@danf@17-8-2009
10560470@unknown@formal@none@1@S@The intermediate status of clitics poses a considerable challenge to linguistic theory.@@@@1@12@@danf@17-8-2009
10560480@unknown@formal@none@1@S@=== Inflection vs. word-formation ===@@@@1@5@@danf@17-8-2009
10560490@unknown@formal@none@1@S@Given the notion of a lexeme, it is possible to distinguish two kinds of morphological rules.@@@@1@16@@danf@17-8-2009
10560500@unknown@formal@none@1@S@Some morphological rules relate different forms of the same lexeme, while other rules relate different lexemes.@@@@1@16@@danf@17-8-2009
10560510@unknown@formal@none@1@S@Rules of the first kind are called '''[[Inflection|inflectional rules]]''', while those of the second kind are called '''[[word formation|word-formation]]'''.@@@@1@19@@danf@17-8-2009
10560520@unknown@formal@none@1@S@The English plural, as illustrated by ''dog'' and ''dogs'', is an inflectional rule; compounds like ''dog-catcher'' or ''dishwasher'' provide an example of a word-formation rule.@@@@1@25@@danf@17-8-2009
10560530@unknown@formal@none@1@S@Informally, word-formation rules form "new words" (that is, new lexemes), while inflection rules yield variant forms of the "same" word (lexeme).@@@@1@21@@danf@17-8-2009
10560540@unknown@formal@none@1@S@There is a further distinction between two kinds of word-formation: [[Derivation (linguistics)|derivation]] and [[Compound (linguistics)|compounding]].@@@@1@15@@danf@17-8-2009
10560550@unknown@formal@none@1@S@Compounding is a process of word-formation that involves combining complete word-forms into a single '''compound''' form; ''dog-catcher'' is therefore a compound, because both ''dog'' and ''catcher'' are complete word-forms in their own right before the compounding process has been applied, and are subsequently treated as one form.@@@@1@47@@danf@17-8-2009
10560560@unknown@formal@none@1@S@Derivation involves [[affix]]ing [[bound morpheme|bound]] (non-independent) forms to existing lexemes, whereby the addition of the affix '''derives''' a new lexeme.@@@@1@20@@danf@17-8-2009
10560570@unknown@formal@none@1@S@One example of derivation is clear in this case: the word ''independent'' is derived from the word ''dependent'' by prefixing it with the derivational prefix ''in-'', while ''dependent'' itself is derived from the verb ''depend''.@@@@1@35@@danf@17-8-2009
10560580@unknown@formal@none@1@S@The distinction between inflection and word-formation is not at all clear-cut.@@@@1@11@@danf@17-8-2009
10560590@unknown@formal@none@1@S@There are many examples where linguists fail to agree whether a given rule is inflection or word-formation.@@@@1@17@@danf@17-8-2009
10560600@unknown@formal@none@1@S@The next section will attempt to clarify this distinction.@@@@1@9@@danf@17-8-2009
10560610@unknown@formal@none@1@S@=== Paradigms and morphosyntax ===@@@@1@5@@danf@17-8-2009
10560620@unknown@formal@none@1@S@A '''paradigm''' is the complete set of related word-forms associated with a given lexeme.@@@@1@14@@danf@17-8-2009
10560630@unknown@formal@none@1@S@The familiar examples of paradigms are the [[Grammatical conjugation|conjugations]] of verbs, and the [[declension]]s of nouns.@@@@1@16@@danf@17-8-2009
10560640@unknown@formal@none@1@S@Accordingly, the word-forms of a lexeme may be arranged conveniently into tables, by classifying them according to shared inflectional categories such as [[grammatical tense|tense]], [[grammatical aspect|aspect]], [[grammatical mood|mood]], [[grammatical number|number]], [[grammatical gender|gender]] or [[grammatical case|case]].@@@@1@35@@danf@17-8-2009
10560650@unknown@formal@none@1@S@For example, the personal pronouns in English can be organized into tables, using the categories of person (1st., 2nd., 3rd.), number (singular vs. plural), gender (masculine, feminine, neuter), and [[grammatical case|case]] (subjective, objective, and possessive).@@@@1@35@@danf@17-8-2009
10560660@unknown@formal@none@1@S@See [[English personal pronouns]] for the details.@@@@1@7@@danf@17-8-2009
10560670@unknown@formal@none@1@S@The inflectional categories used to group word-forms into paradigms cannot be chosen arbitrarily; they must be categories that are relevant to stating the [[syntax|syntactic rules]] of the language.@@@@1@28@@danf@17-8-2009
10560680@unknown@formal@none@1@S@For example, person and number are categories that can be used to define paradigms in English, because English has [[Agreement (linguistics)|grammatical agreement]] rules that require the verb in a sentence to appear in an inflectional form that matches the person and number of the subject.@@@@1@45@@danf@17-8-2009
10560690@unknown@formal@none@1@S@In other words, the syntactic rules of English care about the difference between ''dog'' and ''dogs'', because the choice between these two forms determines which form of the verb is to be used.@@@@1@33@@danf@17-8-2009
10560700@unknown@formal@none@1@S@In contrast, however, no syntactic rule of English cares about the difference between ''dog'' and ''dog-catcher'', or ''dependent'' and ''independent''.@@@@1@20@@danf@17-8-2009
10560710@unknown@formal@none@1@S@The first two are just nouns, and the second two just adjectives, and they generally behave like any other noun or adjective behaves.@@@@1@23@@danf@17-8-2009
10560720@unknown@formal@none@1@S@An important difference between inflection and word-formation is that inflected word-forms of lexemes are organized into paradigms, which are defined by the requirements of syntactic rules, whereas the rules of word-formation are not restricted by any corresponding requirements of syntax.@@@@1@40@@danf@17-8-2009
10560730@unknown@formal@none@1@S@Inflection is therefore said to be relevant to syntax, and word-formation is not.@@@@1@13@@danf@17-8-2009
10560740@unknown@formal@none@1@S@The part of morphology that covers the relationship between [[syntax]] and morphology is called morphosyntax, and it concerns itself with inflection and paradigms, but not with word-formation or compounding.@@@@1@29@@danf@17-8-2009
10560750@unknown@formal@none@1@S@=== Allomorphy ===@@@@1@3@@danf@17-8-2009
10560760@unknown@formal@none@1@S@In the exposition above, morphological rules are described as analogies between word-forms: ''dog'' is to ''dogs'' as ''cat'' is to ''cats'', and as ''dish'' is to ''dishes''.@@@@1@27@@danf@17-8-2009
10560770@unknown@formal@none@1@S@In this case, the analogy applies both to the form of the words and to their meaning: in each pair, the first word means "one of X", while the second "two or more of X", and the difference is always the plural form ''-s'' affixed to the second word, signaling the key distinction between singular and plural entities.@@@@1@58@@danf@17-8-2009
10560780@unknown@formal@none@1@S@One of the largest sources of complexity in morphology is that this one-to-one correspondence between meaning and form does not apply to every case in the language.@@@@1@27@@danf@17-8-2009
10560790@unknown@formal@none@1@S@In English, we have word form pairs like ''ox/oxen'', ''goose/geese'', and ''sheep/sheep'', where the difference between the singular and the plural is signaled in a way that departs from the regular pattern, or is not signaled at all.@@@@1@38@@danf@17-8-2009
10560800@unknown@formal@none@1@S@Even cases considered "regular", with the final ''-s'', are not so simple; the ''-s'' in ''dogs'' is not pronounced the same way as the ''-s'' in ''cats'', and in a plural like ''dishes'', an "extra" vowel appears before the ''-s''.@@@@1@40@@danf@17-8-2009
10560810@unknown@formal@none@1@S@These cases, where the same distinction is effected by alternative forms of a "word", are called '''[[allomorph]]y'''.@@@@1@17@@danf@17-8-2009
10560820@unknown@formal@none@1@S@Phonological rules constrain which sounds can appear next to each other in a language, and morphological rules, when applied blindly, would often violate phonological rules, by resulting in sound sequences that are prohibited in the language in question.@@@@1@38@@danf@17-8-2009
10560830@unknown@formal@none@1@S@For example, to form the plural of ''dish'' by simply appending an ''-s'' to the end of the word would result in the form *{{IPA|[dɪʃs]}}, which is not permitted by the [[phonotactics]] of English.@@@@1@34@@danf@17-8-2009
10560840@unknown@formal@none@1@S@In order to "rescue" the word, a vowel sound is inserted between the root and the plural marker, and {{IPA|[dɪʃəz]}} results.@@@@1@21@@danf@17-8-2009
10560850@unknown@formal@none@1@S@Similar rules apply to the pronunciation of the ''-s'' in ''dogs'' and ''cats'': it depends on the quality (voiced vs. unvoiced) of the final preceding [[phoneme]].@@@@1@26@@danf@17-8-2009
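The selection of the plural allomorph described above can be sketched in a few lines of Python. This is only an illustrative sketch: the phoneme sets `SIBILANTS` and `VOICELESS` and the function names are assumptions made for this example, and the sets are deliberately simplified rather than a full account of English phonology.

```python
# Simplified, hypothetical phoneme classes (assumptions for illustration only).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ"}


def plural_allomorph(final_phoneme):
    """Return the plural allomorph selected by the stem's final phoneme."""
    if final_phoneme in SIBILANTS:
        # A vowel is inserted to avoid a prohibited cluster such as *[ʃs].
        return "əz"
    if final_phoneme in VOICELESS:
        # Voiceless final phoneme: voiceless allomorph, as in "cats".
        return "s"
    # Voiced final phoneme (including vowels): voiced allomorph, as in "dogs".
    return "z"


def pluralize(phonemes):
    """Append the appropriate plural allomorph to a list of stem phonemes."""
    return phonemes + [plural_allomorph(phonemes[-1])]


if __name__ == "__main__":
    for stem in (["d", "ɪ", "ʃ"], ["k", "æ", "t"], ["d", "ɒ", "g"]):
        print("".join(pluralize(stem)))
```

The three allomorphs share one meaning (plural); only the final phoneme of the stem decides which form surfaces.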
10560860@unknown@formal@none@1@S@=== Lexical morphology ===@@@@1@4@@danf@17-8-2009
10560870@unknown@formal@none@1@S@[[Lexical morphology]] is the branch of morphology that deals with the [[lexicon]], which, morphologically conceived, is the collection of [[lexeme]]s in a language.@@@@1@23@@danf@17-8-2009
10560880@unknown@formal@none@1@S@As such, it concerns itself primarily with word-formation: derivation and compounding.@@@@1@11@@danf@17-8-2009
10560890@unknown@formal@none@1@S@== Models of morphology ==@@@@1@5@@danf@17-8-2009
10560900@unknown@formal@none@1@S@There are three principal approaches to morphology, which each try to capture the distinctions above in different ways.@@@@1@18@@danf@17-8-2009
10560910@unknown@formal@none@1@S@These are:@@@@1@2@@danf@17-8-2009
10560920@unknown@formal@none@1@S@* [[Morpheme-based morphology]], which makes use of an [[Item-and-Arrangment (Morphology)|Item-and-Arrangement]] approach.@@@@1@11@@danf@17-8-2009
10560930@unknown@formal@none@1@S@* [[Lexeme-based morphology]], which normally makes use of an [[Item-and-Process (Morphology)|Item-and-Process]] approach.@@@@1@12@@danf@17-8-2009
10560940@unknown@formal@none@1@S@* [[Word-based morphology]], which normally makes use of a [[Word-and-paradigm morphology|Word-and-Paradigm]] approach.@@@@1@12@@danf@17-8-2009
10560950@unknown@formal@none@1@S@Note that while the associations indicated between the concepts in each item in that list are very strong, they are not absolute.@@@@1@22@@danf@17-8-2009
10560960@unknown@formal@none@1@S@=== Morpheme-based morphology ===@@@@1@4@@danf@17-8-2009
10560970@unknown@formal@none@1@S@In [[morpheme-based morphology]], word-forms are analyzed as arrangements of [[morpheme]]s.@@@@1@10@@danf@17-8-2009
10560980@unknown@formal@none@1@S@A '''morpheme''' is defined as the minimal meaningful unit of a language.@@@@1@12@@danf@17-8-2009
10560990@unknown@formal@none@1@S@In a word like ''independently'', we say that the morphemes are ''in-'', ''depend'', ''-ent'', and ''-ly''; ''depend'' is the [[root (linguistics)|root]] and the other morphemes are, in this case, derivational affixes.@@@@1@31@@danf@17-8-2009
10561000@unknown@formal@none@1@S@In a word like ''dogs'', we say that ''dog'' is the root, and that ''-s'' is an inflectional morpheme.@@@@1@19@@danf@17-8-2009
10561010@unknown@formal@none@1@S@This way of analyzing word-forms, as if they were made of morphemes put after each other like beads on a string, is called [[Item-and-Arrangment (Morphology)|Item-and-Arrangement]].@@@@1@25@@danf@17-8-2009
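The "beads on a string" view can be illustrated with a small Python sketch, in which a word-form is nothing more than an ordered list of morphemes. The `Morpheme` class, its `kind` labels, and the `arrange` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    form: str  # the written shape of the morpheme, without hyphens
    kind: str  # "root", "derivational" or "inflectional"


def arrange(morphemes):
    """Item-and-Arrangement: a word-form is its morphemes in sequence."""
    return "".join(m.form for m in morphemes)


# Segmentations taken from the examples in the text.
INDEPENDENTLY = [
    Morpheme("in", "derivational"),
    Morpheme("depend", "root"),
    Morpheme("ent", "derivational"),
    Morpheme("ly", "derivational"),
]
DOGS = [Morpheme("dog", "root"), Morpheme("s", "inflectional")]

if __name__ == "__main__":
    print(arrange(INDEPENDENTLY))
    print(arrange(DOGS))
```

Plain concatenation works for these examples, but it has no natural way to represent pairs such as ''goose/geese'', which is one reason the approach is not the only one in use.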
10561020@unknown@formal@none@1@S@The morpheme-based approach is the first one that beginners to morphology usually think of, and which laymen tend to find the most obvious.@@@@1@23@@danf@17-8-2009
10561030@unknown@formal@none@1@S@This is so to such an extent that very often beginners think that morphemes are an inevitable, fundamental notion of morphology, and many five-minute explanations of morphology are, in fact, five-minute explanations of morpheme-based morphology.@@@@1@35@@danf@17-8-2009
10561040@unknown@formal@none@1@S@This is, however, not so.@@@@1@5@@danf@17-8-2009
10561050@unknown@formal@none@1@S@The fundamental idea of morphology is that the words of a language are related to each other by different kinds of rules.@@@@1@22@@danf@17-8-2009
10561060@unknown@formal@none@1@S@Analyzing words as sequences of morphemes is a way of describing these relations, but is not the only way.@@@@1@19@@danf@17-8-2009
10561070@unknown@formal@none@1@S@In actual academic linguistics, morpheme-based morphology certainly has many adherents, but is by no means the dominant approach.@@@@1@18@@danf@17-8-2009
10561080@unknown@formal@none@1@S@=== Lexeme-based morphology ===@@@@1@4@@danf@17-8-2009
10561090@unknown@formal@none@1@S@[[Lexeme-based morphology]] is (usually) an [[Item-and-Process (Morphology)|Item-and-Process]] approach.@@@@1@8@@danf@17-8-2009
10561100@unknown@formal@none@1@S@Instead of analyzing a word-form as a set of morphemes arranged in sequence, a word-form is said to be the result of applying rules that ''alter'' a word-form or stem in order to produce a new one.@@@@1@37@@danf@17-8-2009
10561110@unknown@formal@none@1@S@An inflectional rule takes a stem, changes it as is required by the rule, and outputs a word-form; a derivational rule takes a stem, changes it as per its own requirements, and outputs a derived stem; a compounding rule takes word-forms, and similarly outputs a compound stem.@@@@1@47@@danf@17-8-2009
10561120@unknown@formal@none@1@S@=== Word-based morphology ===@@@@1@4@@danf@17-8-2009
10561130@unknown@formal@none@1@S@[[Word-based morphology]] is (usually) a [[Word-and-paradigm morphology|Word-and-paradigm]] approach.@@@@1@8@@danf@17-8-2009
10561140@unknown@formal@none@1@S@This theory takes paradigms as a central notion.@@@@1@8@@danf@17-8-2009
10561150@unknown@formal@none@1@S@Instead of stating rules to combine morphemes into word-forms, or to generate word-forms from stems, word-based morphology states generalizations that hold between the forms of inflectional paradigms.@@@@1@27@@danf@17-8-2009
10561160@unknown@formal@none@1@S@The major point behind this approach is that many such generalizations are hard to state with either of the other approaches.@@@@1@21@@danf@17-8-2009
10561170@unknown@formal@none@1@S@The examples are usually drawn from [[fusional language]]s, where a given "piece" of a word, which a morpheme-based theory would call an inflectional morpheme, corresponds to a combination of grammatical categories, for example, "third person plural."@@@@1@36@@danf@17-8-2009
10561180@unknown@formal@none@1@S@Morpheme-based theories usually have no problems with this situation, since one just says that a given morpheme has two categories.@@@@1@20@@danf@17-8-2009
10561190@unknown@formal@none@1@S@Item-and-Process theories, on the other hand, often break down in cases like these, because they all too often assume that there will be two separate rules here, one for third person, and the other for plural, but the distinction between them turns out to be artificial.@@@@1@46@@danf@17-8-2009
10561200@unknown@formal@none@1@S@Word-and-Paradigm approaches treat these as whole words that are related to each other by [[analogy|analogical]] rules.@@@@1@16@@danf@17-8-2009
10561210@unknown@formal@none@1@S@Words can be categorized based on the pattern they fit into.@@@@1@11@@danf@17-8-2009
10561220@unknown@formal@none@1@S@This applies both to existing words and to new ones.@@@@1@10@@danf@17-8-2009
10561230@unknown@formal@none@1@S@Application of a pattern different from the one that has been used historically can give rise to a new word, such as ''older'' replacing ''elder'' (where ''older'' follows the normal pattern of [[adjective|adjectival]] [[comparative]]s) and ''cows'' replacing ''kine'' (where ''cows'' fits the regular pattern of plural formation).@@@@1@47@@danf@17-8-2009
10561240@unknown@formal@none@1@S@While a Word-and-Paradigm approach can explain this easily, other approaches have difficulty with phenomena such as this.@@@@1@17@@danf@17-8-2009
10561250@unknown@formal@none@1@S@== Morphological typology ==@@@@1@4@@danf@17-8-2009
10561260@unknown@formal@none@1@S@In the 19th century, philologists devised a now classic classification of languages according to their morphology.@@@@1@16@@danf@17-8-2009
10561270@unknown@formal@none@1@S@According to this typology, some languages are [[isolating language|isolating]], and have little to no morphology; others are [[agglutinating language|agglutinative]], and their words tend to have lots of easily-separable morphemes; while others yet are inflectional or [[fusional language|fusional]], because their inflectional morphemes are said to be "fused" together.@@@@1@47@@danf@17-8-2009
10561280@unknown@formal@none@1@S@This leads to one bound morpheme conveying multiple pieces of information.@@@@1@11@@danf@17-8-2009
10561290@unknown@formal@none@1@S@The classic example of an isolating language is [[Chinese language|Chinese]]; the classic example of an agglutinative language is [[Turkish language|Turkish]]; both [[Latin language|Latin]] and [[Greek language|Greek]] are classic examples of fusional languages.@@@@1@32@@danf@17-8-2009
10561300@unknown@formal@none@1@S@Considering the variability of the world's languages, it becomes clear that this classification is not at all clear-cut, and many languages do not neatly fit any one of these types, and some fit in more than one.@@@@1@37@@danf@17-8-2009
10561310@unknown@formal@none@1@S@A continuum of complex morphology of language may be adopted when considering languages.@@@@1@13@@danf@17-8-2009
10561320@unknown@formal@none@1@S@The three models of morphology stem from attempts to analyze languages that more or less match different categories in this typology.@@@@1@21@@danf@17-8-2009
10561330@unknown@formal@none@1@S@The Item-and-Arrangement approach fits very naturally with agglutinative languages; while the Item-and-Process and Word-and-Paradigm approaches usually address fusional languages.@@@@1@19@@danf@17-8-2009
10561340@unknown@formal@none@1@S@The reader should also note that the classical typology mostly applies to inflectional morphology.@@@@1@14@@danf@17-8-2009
10561350@unknown@formal@none@1@S@There is very little fusion going on with word-formation.@@@@1@9@@danf@17-8-2009
10561360@unknown@formal@none@1@S@Languages may be classified as synthetic or analytic in their word formation, depending on the preferred way of expressing notions that are not inflectional: either by using word-formation (synthetic), or by using syntactic phrases (analytic).@@@@1@35@@danf@17-8-2009
10570010@unknown@formal@none@1@S@Named entity recognition@@@@1@3@@danf@17-8-2009
10570020@unknown@formal@none@1@S@'''Named entity recognition''' (NER) (also known as '''entity identification (EI)''' and '''entity extraction''') is a subtask of [[information extraction]] that seeks to locate and classify atomic elements in text into predefined categories such as the names of persons, organizations, locations, expressions of times, quantities, monetary values, percentages, etc.@@@@1@48@@danf@17-8-2009
10570030@unknown@formal@none@1@S@For example, a NER system producing [[Message Understanding Conference|MUC]]-style output might [[Metadata|tag]] the sentence,@@@@1@14@@danf@17-8-2009
10570040@unknown@formal@none@1@S@:''Jim bought 300 shares of Acme Corp. in 2006.''@@@@1@9@@danf@17-8-2009
10570050@unknown@formal@none@1@S@:''<ENAMEX TYPE="PERSON">Jim</ENAMEX> bought <NUMEX TYPE="QUANTITY">300</NUMEX> shares of <ENAMEX TYPE="ORGANIZATION">Acme Corp.</ENAMEX> in <TIMEX TYPE="DATE">2006</TIMEX>''.@@@@1@13@@danf@17-8-2009
10570060@unknown@formal@none@1@S@NER systems have been created that use linguistic [[formal grammar|grammar]]-based techniques as well as [[statistical model]]s.@@@@1@16@@danf@17-8-2009
10570070@unknown@formal@none@1@S@Hand-crafted grammar-based systems typically obtain better results, but at the cost of months of work by experienced [[Linguistics|linguists]].@@@@1@18@@danf@17-8-2009
10570080@unknown@formal@none@1@S@Statistical NER systems typically require a large amount of manually [[annotation|annotated]] training data.@@@@1@13@@danf@17-8-2009
10570090@unknown@formal@none@1@S@Since about 1998, there has been a great deal of interest in entity identification in the [[molecular biology]], [[bioinformatics]], and medical [[natural language processing]] communities.@@@@1@25@@danf@17-8-2009
10570100@unknown@formal@none@1@S@The most common entity of interest in that domain has been names of genes and gene products.@@@@1@17@@danf@17-8-2009
10570110@unknown@formal@none@1@S@==Named entity types==@@@@1@3@@danf@17-8-2009
10570120@unknown@formal@none@1@S@In the expression ''named entity'', the word ''named'' restricts the task to those entities for which one or many [[rigid designator]]s, as defined by [[Saul Kripke|Kripke]], stand for the referent.@@@@1@30@@danf@17-8-2009
10570130@unknown@formal@none@1@S@For instance, the ''automotive company created by Henry Ford in 1903'' is referred to as ''Ford'' or ''Ford Motor Company''.@@@@1@20@@danf@17-8-2009
10570140@unknown@formal@none@1@S@Rigid designators include proper names as well as certain natural kind terms like biological species and substances.@@@@1@17@@danf@17-8-2009
10570150@unknown@formal@none@1@S@There is a general agreement to include [[temporal expressions]] and some numerical expressions such as money and measures in named entities.@@@@1@21@@danf@17-8-2009
10570160@unknown@formal@none@1@S@While some instances of these types are good examples of rigid designators (e.g., the year 2001) there are also many invalid ones (e.g., I take my vacations in “June”).@@@@1@29@@danf@17-8-2009
10570170@unknown@formal@none@1@S@In the first case, the year ''2001'' refers to the ''2001st year of the Gregorian calendar''.@@@@1@16@@danf@17-8-2009
10570180@unknown@formal@none@1@S@In the second case, the month ''June'' may refer to the month of an undefined year (''past June'', ''next June'', ''June 2020'', etc.).@@@@1@23@@danf@17-8-2009
10570190@unknown@formal@none@1@S@It is arguable that the named entity definition is loosened in such cases for practical reasons.@@@@1@16@@danf@17-8-2009
10570200@unknown@formal@none@1@S@At least two [[Hierarchy|hierarchies]] of named entity types have been proposed in the literature.@@@@1@14@@danf@17-8-2009
10570210@unknown@formal@none@1@S@[[BBN Technologies|BBN]] categories [http://www.ldc.upenn.edu/Catalog/docs/LDC2005T33/BBN-Types-Subtypes.html], proposed in 2002, are used for [[Question Answering]] and consist of 29 types and 64 subtypes.@@@@1@20@@danf@17-8-2009
10570220@unknown@formal@none@1@S@Sekine's extended hierarchy [http://nlp.cs.nyu.edu/ene/], proposed in 2002, is made of 200 subtypes.@@@@1@12@@danf@17-8-2009
10570230@unknown@formal@none@1@S@==Evaluation==@@@@1@1@@danf@17-8-2009
10570240@unknown@formal@none@1@S@Benchmarking and evaluations have been performed in the ''[[Message Understanding Conference]]s'' (MUC) organized by [[DARPA]], ''International Conference on Language Resources and Evaluation (LREC)'', ''Computational Natural Language Learning ([[CoNLL]])'' workshops, ''Automatic Content Extraction'' (ACE) organized by [[NIST]], the ''[[Multilingual Entity Task Conference]]'' (MET), ''Information Retrieval and Extraction Exercise'' (IREX) and in ''HAREM'' (Portuguese language only).@@@@1@54@@danf@17-8-2009
10570250@unknown@formal@none@1@S@[http://aclweb.org/aclwiki/index.php?title=Named_Entity_Recognition_%28State_of_the_art%29 State-of-the-art systems] produce near-human performance.@@@@1@6@@danf@17-8-2009
10570260@unknown@formal@none@1@S@For instance, the best system entering [http://www.itl.nist.gov/iad/894.02/related_projects/muc/proceedings/muc_7_toc.html MUC-7] scored 93.39% in [[Information_retrieval#F-measure|F-measure]] while human annotators scored 97.60% and 96.95%.@@@@1@19@@danf@17-8-2009
10580010@unknown@formal@none@1@S@Natural language@@@@1@2@@danf@17-8-2009
10580020@unknown@formal@none@1@S@In the [[philosophy of language]], a '''natural language''' (or '''ordinary language''') is a [[language]] that is spoken, [[writing|written]], or [[Sign language|signed]] by [[human]]s for general-purpose communication, as distinguished from [[formal language]]s (such as [[Programming language|computer-programming languages]] or the "languages" used in the study of formal [[logic]], especially [[mathematical logic]]) and from [[constructed language]]s.@@@@1@53@@danf@17-8-2009
10580030@unknown@formal@none@1@S@== Defining natural language ==@@@@1@5@@danf@17-8-2009
10580040@unknown@formal@none@1@S@Though the exact definition is debatable, natural language is often contrasted with artificial or [[constructed languages]] such as [[Esperanto]], [[Latino Sine Flexione]], and [[Occidental language|Occidental]].@@@@1@25@@danf@17-8-2009
10580050@unknown@formal@none@1@S@Linguists have an incomplete understanding of all aspects of the rules underlying natural languages, and these rules are therefore objects of study.@@@@1@22@@danf@17-8-2009
10580060@unknown@formal@none@1@S@The understanding of natural languages reveals much about not only how language works (in terms of [[syntax]], [[semantics]], [[phonetics]], [[phonology]], etc.), but also about how the human [[mind]] and the human [[brain]] process language.@@@@1@34@@danf@17-8-2009
10580070@unknown@formal@none@1@S@In linguistic terms, 'natural language' only applies to a language that has evolved naturally, and the study of natural language primarily involves native (first language) speakers.@@@@1@26@@danf@17-8-2009
10580080@unknown@formal@none@1@S@The theory of [[universal grammar]] proposes that all natural languages have certain underlying rules which constrain the structure of the specific grammar for any given language.@@@@1@26@@danf@17-8-2009
10580090@unknown@formal@none@1@S@While [[grammarians]], writers of dictionaries, and language policy-makers all have a certain influence on the evolution of language, their ability to influence what people think they 'ought' to say is distinct from what people actually say.@@@@1@36@@danf@17-8-2009
10580100@unknown@formal@none@1@S@Natural language applies to the latter, and is thus a 'descriptive' rather than a 'prescriptive' term.@@@@1@16@@danf@17-8-2009
10580110@unknown@formal@none@1@S@Thus non-standard language varieties (such as [[African American Vernacular English]]) are considered to be natural while standard language varieties (such as [[Standard American English]]) which are more 'prescripted' can be considered to be at least somewhat artificial or constructed.@@@@1@39@@danf@17-8-2009
10580120@unknown@formal@none@1@S@== Native language learning ==@@@@1@5@@danf@17-8-2009
10580130@unknown@formal@none@1@S@The [[learning]] of one's own [[native language]], typically that of one's [[parent]]s, normally occurs spontaneously in early human [[childhood]] and is [[Biology|biologically]] driven.@@@@1@23@@danf@17-8-2009
10580140@unknown@formal@none@1@S@A crucial role in this process is played by the [[Nervous system|neural]] activity of a portion of the human [[brain]] known as [[Broca's area]].@@@@1@24@@danf@17-8-2009
10580150@unknown@formal@none@1@S@There are approximately 7,000 current human languages, and many, if not most, seem to share certain properties, leading to the belief in the existence of [[Universal Grammar]], as shown by [[generative grammar]] studies pioneered by the work of [[Noam Chomsky]].@@@@1@40@@danf@17-8-2009
10580160@unknown@formal@none@1@S@Recently, it has been demonstrated that a dedicated network in the human brain (crucially involving [[Broca's area]], a portion of the left inferior frontal gyrus) is selectively activated by complex verbal structures (but not simple ones) of those languages that meet the Universal Grammar requirements.@@@@1@45@@danf@17-8-2009
10580170@unknown@formal@none@1@S@== Origins of natural language ==@@@@1@6@@danf@17-8-2009
10580180@unknown@formal@none@1@S@There is disagreement among anthropologists on when language was first used by humans (or their ancestors).@@@@1@16@@danf@17-8-2009
10580190@unknown@formal@none@1@S@Estimates range from about two million (2,000,000) years ago, during the time of ''[[Homo habilis]]'', to as recently as forty thousand (40,000) years ago, during the time of [[Cro-Magnon]] man.@@@@1@30@@danf@17-8-2009
10580200@unknown@formal@none@1@S@However, recent evidence suggests modern human language was invented or evolved in Africa prior to the dispersal of humans from Africa around 50,000 years ago.@@@@1@25@@danf@17-8-2009
10580210@unknown@formal@none@1@S@Since all people, including the most isolated indigenous groups such as the [[Andamanese]] or the [[Tasmanian aboriginals]], possess language, it must have been present in the ancestral populations in Africa before the human population split into various groups to colonize the rest of the world.@@@@1@45@@danf@17-8-2009
10580220@unknown@formal@none@1@S@Some claim that all natural languages came out of one single language, known as [[Adamic]].@@@@1@15@@danf@17-8-2009
10580230@unknown@formal@none@1@S@== Linguistic diversity ==@@@@1@4@@danf@17-8-2009
10580240@unknown@formal@none@1@S@As of early 2007, there are 6,912 known living human languages.@@@@1@11@@danf@17-8-2009
10580250@unknown@formal@none@1@S@A "living language" is simply one which is in wide use by a specific group of living people.@@@@1@18@@danf@17-8-2009
10580260@unknown@formal@none@1@S@The exact number of known living languages will vary from 5,000 to 10,000, depending generally on the precision of one's definition of "language", and in particular on how one classifies [[dialects]].@@@@1@31@@danf@17-8-2009
10580270@unknown@formal@none@1@S@There are also many dead or [[extinct language]]s.@@@@1@8@@danf@17-8-2009
10580280@unknown@formal@none@1@S@There is no [[dialect#.22Dialect.22 or .22language.22|clear distinction]] between a language and a [[dialect]], notwithstanding linguist [[Max Weinreich]]'s famous [[aphorism]] that "[[a language is a dialect with an army and navy]]."@@@@1@30@@danf@17-8-2009
10580290@unknown@formal@none@1@S@In other words, the distinction may hinge on political considerations as much as on cultural differences, distinctive [[writing system]]s, or degree of [[mutual intelligibility]].@@@@1@24@@danf@17-8-2009
10580300@unknown@formal@none@1@S@It is probably impossible to accurately enumerate the living languages because our worldwide knowledge is incomplete, and it is a "moving target", as explained in greater detail by the [[Ethnologue]]'s Introduction, p. 7 - 8.@@@@1@35@@danf@17-8-2009
10580310@unknown@formal@none@1@S@With the 15th edition, the 103 newly added languages are not new but reclassified due to refinements in the definition of language.@@@@1@22@@danf@17-8-2009
10580320@unknown@formal@none@1@S@Although widely considered an [[encyclopedia]], the [[Ethnologue]] actually presents itself as an incomplete catalog, including only named languages that its editors are able to document.@@@@1@25@@danf@17-8-2009
10580330@unknown@formal@none@1@S@With each edition, the number of catalogued languages has grown.@@@@1@10@@danf@17-8-2009
10580340@unknown@formal@none@1@S@Beginning with the 14th edition (2000), an attempt was made to include all known living languages.@@@@1@16@@danf@17-8-2009
10580350@unknown@formal@none@1@S@SIL used an internal 3-letter code fashioned after [[airport code]]s to identify languages.@@@@1@13@@danf@17-8-2009
10580360@unknown@formal@none@1@S@This was the precursor to the modern [[ISO 639-3]] standard, to which SIL contributed.@@@@1@14@@danf@17-8-2009
10580370@unknown@formal@none@1@S@The standard allows for over 14,000 languages.@@@@1@7@@danf@17-8-2009
10580380@unknown@formal@none@1@S@In turn, the 15th edition was revised to conform to the pending ISO 639-3 standard.@@@@1@15@@danf@17-8-2009
10580390@unknown@formal@none@1@S@Of the catalogued languages, 497 have been flagged as "nearly extinct" due to trends in their usage.@@@@1@17@@danf@17-8-2009
10580400@unknown@formal@none@1@S@Per the 15th edition, 6,912 living languages are shared by over 5.7 billion speakers. (p. 15)@@@@1@16@@danf@17-8-2009
10580410@unknown@formal@none@1@S@== Taxonomy ==@@@@1@3@@danf@17-8-2009
10580420@unknown@formal@none@1@S@The [[Taxonomic classification|classification]] of natural languages can be performed on the basis of different underlying principles (different closeness notions, respecting different properties and relations between languages); important directions of present classifications are:@@@@1@32@@danf@17-8-2009
10580430@unknown@formal@none@1@S@* paying attention to the historical evolution of languages results in a genetic classification of languages—which is based on genetic relatedness of languages,@@@@1@23@@danf@17-8-2009
10580440@unknown@formal@none@1@S@* paying attention to the internal structure of languages ([[grammar]]) results in a typological classification of languages—which is based on similarity of one or more components of the language's grammar across languages,@@@@1@32@@danf@17-8-2009
10580450@unknown@formal@none@1@S@* and respecting geographical closeness and contacts between language-speaking communities results in areal groupings of languages.@@@@1@16@@danf@17-8-2009
10580460@unknown@formal@none@1@S@The different classifications do not match each other and are not expected to, but the correlation between them is an important point for many [[linguistics|linguistic]] research works.@@@@1@27@@danf@17-8-2009
10580470@unknown@formal@none@1@S@(There is a parallel to the classification of [[species]] in biological [[phylogenetics]] here: consider [[monophyletic]] vs. [[polyphyletic]] groups of species.)@@@@1@20@@danf@17-8-2009
10580480@unknown@formal@none@1@S@The task of genetic classification belongs to the field of [[historical-comparative linguistics]], that of typological classification to [[linguistic typology]].@@@@1@18@@danf@17-8-2009
10580490@unknown@formal@none@1@S@See also [[Taxonomy]], and [[Taxonomic classification]] for the general idea of classification and taxonomies.@@@@1@14@@danf@17-8-2009
10580500@unknown@formal@none@1@S@==== Genetic classification ====@@@@1@4@@danf@17-8-2009
10580510@unknown@formal@none@1@S@The world's languages have been grouped into families of languages that are believed to have common ancestors.@@@@1@17@@danf@17-8-2009
10580520@unknown@formal@none@1@S@Some of the major families are the [[Indo-European languages]], the [[Afro-Asiatic languages]], the [[Austronesian languages]], and the [[Sino-Tibetan languages]].@@@@1@19@@danf@17-8-2009
10580530@unknown@formal@none@1@S@The shared features of languages from one family can be due to shared ancestry.@@@@1@14@@danf@17-8-2009
10580540@unknown@formal@none@1@S@(Compare with [[homology (biology)|homology]] in biology.)@@@@1@6@@danf@17-8-2009
10580550@unknown@formal@none@1@S@==== Typological classification ====@@@@1@4@@danf@17-8-2009
10580560@unknown@formal@none@1@S@An example of a typological classification is the classification of languages on the basis of the basic order of the [[verb]], the [[subject (grammar)|subject]] and the [[object (grammar)|object]] in a [[sentence (linguistics)|sentence]] into several types: [[SVO language|SVO]], [[SOV language|SOV]], [[VSO language|VSO]], and so on.@@@@1@44@@danf@17-8-2009
10580570@unknown@formal@none@1@S@([[English language|English]], for instance, belongs to the [[SVO language]] type.)@@@@1@10@@danf@17-8-2009
10580580@unknown@formal@none@1@S@The shared features of languages of one type (= from one typological class) may have arisen completely independently.@@@@1@18@@danf@17-8-2009
10580590@unknown@formal@none@1@S@(Compare with [[analogy (biology)|analogy]] in biology.)@@@@1@6@@danf@17-8-2009
10580595@unknown@formal@none@1@S@Their co-occurrence might be due to the universal laws governing the structure of natural languages—[[language universal]]s.@@@@1@16@@danf@17-8-2009
10580600@unknown@formal@none@1@S@==== Areal classification ====@@@@1@4@@danf@17-8-2009
10580610@unknown@formal@none@1@S@The following language groupings can serve as some linguistically significant examples of areal linguistic units, or ''[[sprachbund]]s'': [[Balkan linguistic union]], or the bigger group of [[European languages]]; [[Caucasian languages]]; [[East Asian languages]].@@@@1@32@@danf@17-8-2009
10580620@unknown@formal@none@1@S@Although the members of each group are not closely [[genetic relatedness of languages|genetically related]], there is a reason for them to share similar features, namely: their speakers have been in contact for a long time within a common community and the languages ''converged'' in the course of history.@@@@1@48@@danf@17-8-2009
10580630@unknown@formal@none@1@S@These are called "[[areal feature (linguistics)|areal feature]]s".@@@@1@7@@danf@17-8-2009
10580640@unknown@formal@none@1@S@One should be careful about the underlying classification principle for groups of languages which apparently have a geographical name: besides areal linguistic units, the [[taxa]] of the genetic classification ([[language family|language families]]) are often given names which themselves or parts of which refer to geographical areas.@@@@1@46@@danf@17-8-2009
10580650@unknown@formal@none@1@S@== Controlled languages ==@@@@1@4@@danf@17-8-2009
10580660@unknown@formal@none@1@S@Controlled natural languages are subsets of natural languages whose grammars and dictionaries have been restricted in order to reduce or eliminate both ambiguity and complexity.@@@@1@25@@danf@17-8-2009
10580670@unknown@formal@none@1@S@The purpose behind the development and implementation of a controlled natural language typically is to aid non-native speakers of a natural language in understanding it, or to ease computer processing of a natural language.@@@@1@34@@danf@17-8-2009
10580680@unknown@formal@none@1@S@An example of a widely used controlled natural language is [[Simplified English]], which was originally developed for [[aerospace]] industry maintenance manuals.@@@@1@21@@danf@17-8-2009
10580690@unknown@formal@none@1@S@== Constructed languages and international auxiliary languages ==@@@@1@8@@danf@17-8-2009
10580700@unknown@formal@none@1@S@Constructed [[international auxiliary language]]s such as [[Esperanto]] and [[Interlingua]] that have [[native speaker]]s are by some also considered natural languages.@@@@1@20@@danf@17-8-2009
10580710@unknown@formal@none@1@S@However, constructed languages, while they are clearly languages, are not generally considered natural languages.@@@@1@14@@danf@17-8-2009
10580720@unknown@formal@none@1@S@The problem is that other languages have been used to communicate and evolve in a natural way, while Esperanto has been selectively designed by [[L.L. Zamenhof]] from natural languages, not grown from the natural fluctuations in vocabulary and syntax.@@@@1@39@@danf@17-8-2009
10580730@unknown@formal@none@1@S@Nor has Esperanto been naturally "standardized" by children's natural tendency to correct for illogical grammar structures in their parents' language, which can be seen in the development of [[pidgin]] languages into [[creole language]]s (as explained by Steven Pinker in [[The Language Instinct]]).@@@@1@42@@danf@17-8-2009
10580740@unknown@formal@none@1@S@The possible exception to this are true native speakers of such languages.@@@@1@12@@danf@17-8-2009
10580750@unknown@formal@none@1@S@A more substantive basis for this designation is that the vocabulary, grammar, and orthography of Interlingua are natural; they have been standardized and presented by a [[International Auxiliary Language Association|linguistic research body]], but they predated it and are not themselves considered a product of human invention.@@@@1@46@@danf@17-8-2009
10580760@unknown@formal@none@1@S@Most experts, however, consider Interlingua to be naturalistic rather than natural.@@@@1@11@@danf@17-8-2009
10580770@unknown@formal@none@1@S@[[Latino Sine Flexione]], a second auxiliary language, is also naturalistic in content but is no longer widely spoken.@@@@1@18@@danf@17-8-2009
10580780@unknown@formal@none@1@S@==Natural Language Processing==@@@@1@3@@danf@17-8-2009
10580790@unknown@formal@none@1@S@Natural language processing (NLP) is a subfield of artificial intelligence and computational linguistics.@@@@1@13@@danf@17-8-2009
10580800@unknown@formal@none@1@S@It studies the problems of automated generation and understanding of natural human languages.@@@@1@13@@danf@17-8-2009
10580810@unknown@formal@none@1@S@Natural-language-generation systems convert information from computer databases into normal-sounding human language.@@@@1@11@@danf@17-8-2009
10580820@unknown@formal@none@1@S@Natural-language-understanding systems convert samples of human language into more formal representations that are easier for computer programs to manipulate.@@@@1@19@@danf@17-8-2009
10580830@unknown@formal@none@1@S@== Modalities ==@@@@1@3@@danf@17-8-2009
10580840@unknown@formal@none@1@S@Natural language manifests itself in modalities other than speech.@@@@1@9@@danf@17-8-2009
10580850@unknown@formal@none@1@S@=== Sign languages ===@@@@1@4@@danf@17-8-2009
10580860@unknown@formal@none@1@S@In linguistic terms, sign languages are as rich and complex as any oral language, despite the previously common misconception that they are not "real languages".@@@@1@25@@danf@17-8-2009
10580870@unknown@formal@none@1@S@Professional linguists have studied many sign languages and found them to have every linguistic component required to be classed as true natural languages.@@@@1@23@@danf@17-8-2009
10580880@unknown@formal@none@1@S@Sign languages are not [[pantomime]], much as most spoken language is not [[onomatopoeic]].@@@@1@13@@danf@17-8-2009
10580890@unknown@formal@none@1@S@The signs do tend to exploit iconicity (visual connections with their referents) more than what is common in spoken language, but they are above all conventional and hence generally incomprehensible to non-speakers, just like spoken words and morphemes.@@@@1@38@@danf@17-8-2009
10580900@unknown@formal@none@1@S@They are not a visual rendition of an oral language either.@@@@1@11@@danf@17-8-2009
10580910@unknown@formal@none@1@S@They have complex grammars of their own, and can be used to discuss any topic, from the simple and concrete to the lofty and abstract.@@@@1@25@@danf@17-8-2009
10580920@unknown@formal@none@1@S@=== Written languages ===@@@@1@4@@danf@17-8-2009
10580930@unknown@formal@none@1@S@In a sense, written language should be distinguished from natural language.@@@@1@11@@danf@17-8-2009
10580940@unknown@formal@none@1@S@Until recently in the developed world, it was common for many people to be fluent in [[spoken language|spoken]] or [[sign language|signed languages]] and yet remain illiterate; this is still the case in poor countries today.@@@@1@35@@danf@17-8-2009
10580950@unknown@formal@none@1@S@Furthermore, natural [[language acquisition]] during childhood is largely spontaneous, while [[literacy]] must usually be intentionally acquired.@@@@1@16@@danf@17-8-2009
10590010@unknown@formal@none@1@S@Natural language processing@@@@1@3@@danf@17-8-2009
10590020@unknown@formal@none@1@S@'''Natural language processing''' ('''NLP''') is a subfield of [[artificial intelligence]] and [[computational linguistics]].@@@@1@13@@danf@17-8-2009
10590030@unknown@formal@none@1@S@It studies the problems of automated generation and understanding of [[natural language|natural human languages]].@@@@1@14@@danf@17-8-2009
10590040@unknown@formal@none@1@S@Natural-language-generation systems convert information from computer databases into normal-sounding human language.@@@@1@11@@danf@17-8-2009
10590050@unknown@formal@none@1@S@Natural-language-understanding systems convert samples of human language into more formal representations that are easier for [[computer]] programs to manipulate.@@@@1@19@@danf@17-8-2009
10590060@unknown@formal@none@1@S@==Tasks and limitations==@@@@1@3@@danf@17-8-2009
10590070@unknown@formal@none@1@S@In theory, natural-language processing is a very attractive method of [[human-computer interaction]].@@@@1@12@@danf@17-8-2009
10590080@unknown@formal@none@1@S@Early systems such as [[SHRDLU]], working in restricted "[[blocks world]]s" with restricted vocabularies, worked extremely well, leading researchers to excessive optimism, which was soon lost when the systems were extended to more realistic situations with real-world [[ambiguity]] and [[complexity]].@@@@1@39@@danf@17-8-2009
10590090@unknown@formal@none@1@S@Natural-language understanding is sometimes referred to as an [[AI-complete]] problem, because natural-language recognition seems to require extensive knowledge about the outside world and the ability to manipulate it.@@@@1@28@@danf@17-8-2009
10590100@unknown@formal@none@1@S@The definition of "[[understanding]]" is one of the major problems in natural-language processing.@@@@1@13@@danf@17-8-2009
10590110@unknown@formal@none@1@S@==Concrete problems==@@@@1@2@@danf@17-8-2009
10590120@unknown@formal@none@1@S@Some examples of the problems faced by natural-language-understanding systems:@@@@1@9@@danf@17-8-2009
10590130@unknown@formal@none@1@S@* The sentences ''We gave the monkeys the bananas because they were hungry'' and ''We gave the monkeys the bananas because they were over-ripe'' have the same surface grammatical structure.@@@@1@30@@danf@17-8-2009
10590140@unknown@formal@none@1@S@However, the pronoun ''they'' refers to ''monkeys'' in one sentence and ''bananas'' in the other, and it is impossible to tell which without a knowledge of the properties of monkeys and bananas.@@@@1@32@@danf@17-8-2009
10590150@unknown@formal@none@1@S@* A string of words may be interpreted in different ways.@@@@1@11@@danf@17-8-2009
10590160@unknown@formal@none@1@S@For example, the string ''Time flies like an arrow'' may be interpreted in a variety of ways:@@@@1@17@@danf@17-8-2009
10590170@unknown@formal@none@1@S@**The common [[simile]]: ''[[time]]'' moves quickly just like an arrow does;@@@@1@11@@danf@17-8-2009
10590180@unknown@formal@none@1@S@**measure the speed of flies like you would measure that of an arrow (thus interpreted as an imperative) - i.e. ''(You should) time flies as you would (time) an arrow.'';@@@@1@30@@danf@17-8-2009
10590190@unknown@formal@none@1@S@**measure the speed of flies like an arrow would - i.e. ''Time flies in the same way that an arrow would (time them).'';@@@@1@23@@danf@17-8-2009
10590200@unknown@formal@none@1@S@**measure the speed of flies that are like arrows - i.e. ''Time those flies that are like arrows'';@@@@1@18@@danf@17-8-2009
10590210@unknown@formal@none@1@S@**all of a type of flying insect, "time-flies," collectively enjoys a single arrow (compare ''Fruit flies like a banana'');@@@@1@19@@danf@17-8-2009
10590220@unknown@formal@none@1@S@**each of a type of flying insect, "time-flies," individually enjoys a different arrow (similar comparison applies);@@@@1@16@@danf@17-8-2009
10590230@unknown@formal@none@1@S@**A concrete object, for example the magazine, ''[[Time (magazine)|Time]]'', travels through the air in an arrow-like manner.@@@@1@17@@danf@17-8-2009
10590240@unknown@formal@none@1@S@English is particularly challenging in this regard because it has little [[inflectional morphology]] to distinguish between [[parts of speech]].@@@@1@19@@danf@17-8-2009
10590250@unknown@formal@none@1@S@* English and several other languages don't specify which word an adjective applies to.@@@@1@14@@danf@17-8-2009
10590260@unknown@formal@none@1@S@For example, consider the string "pretty little girls' school".@@@@1@9@@danf@17-8-2009
10590270@unknown@formal@none@1@S@** Does the school look little?@@@@1@6@@danf@17-8-2009
10590280@unknown@formal@none@1@S@** Do the girls look little?@@@@1@6@@danf@17-8-2009
10590290@unknown@formal@none@1@S@** Do the girls look pretty?@@@@1@6@@danf@17-8-2009
10590300@unknown@formal@none@1@S@** Does the school look pretty?@@@@1@6@@danf@17-8-2009
10590310@unknown@formal@none@1@S@* We will often imply additional information in spoken language by the way we place stress on words.@@@@1@18@@danf@17-8-2009
10590320@unknown@formal@none@1@S@The sentence "I never said she stole my money" demonstrates the importance stress can play in a sentence, and thus the inherent difficulty a natural language processor can have in parsing it.@@@@1@32@@danf@17-8-2009
10590330@unknown@formal@none@1@S@Depending on which word the speaker places the stress, this sentence could have several distinct meanings:@@@@1@16@@danf@17-8-2009
10590340@unknown@formal@none@1@S@** "'''I''' never said she stole my money" - Someone else said it, but ''I'' didn't.@@@@1@16@@danf@17-8-2009
10590350@unknown@formal@none@1@S@** "I '''never''' said she stole my money" - I simply didn't ever say it.@@@@1@15@@danf@17-8-2009
10590360@unknown@formal@none@1@S@** "I never '''said''' she stole my money" - I might have implied it in some way, but I never explicitly said it.@@@@1@23@@danf@17-8-2009
10590370@unknown@formal@none@1@S@** "I never said '''she''' stole my money" - I said someone took it; I didn't say it was she.@@@@1@20@@danf@17-8-2009
10590380@unknown@formal@none@1@S@** "I never said she '''stole''' my money" - I just said she probably borrowed it.@@@@1@16@@danf@17-8-2009
10590390@unknown@formal@none@1@S@** "I never said she stole '''my''' money" - I said she stole someone else's money.@@@@1@16@@danf@17-8-2009
10590400@unknown@formal@none@1@S@** "I never said she stole my '''money'''" - I said she stole something, but not my money.@@@@1@18@@danf@17-8-2009
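As a rough illustration of the ''Time flies like an arrow'' example above, the following sketch counts parse trees with a CKY-style chart. The grammar, the lexicon and every name in it (LEXICON, BINARY_RULES, count_parses) are assumptions made up for this illustration and cover only a few of the readings listed above; a root of "VP" stands for an imperative reading.

```python
from collections import defaultdict

# Toy lexicon (an assumption for illustration): each word may carry several
# categories, which is the source of the ambiguity.
LEXICON = {
    "time": {"N", "V", "NP"},
    "flies": {"N", "V", "NP"},
    "like": {"V", "P"},
    "an": {"Det"},
    "arrow": {"N"},
}

# Toy binary rules written as (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("NP", "Det", "N"),
    ("NP", "N", "N"),
    ("NP", "NP", "PP"),
    ("VP", "V", "NP"),
    ("VP", "V", "PP"),
    ("VP", "VP", "PP"),
    ("PP", "P", "NP"),
]


def count_parses(words):
    """Count parse trees per root symbol for the whole word sequence."""
    n = len(words)
    chart = defaultdict(int)  # (start, end, symbol) -> number of trees
    for i, word in enumerate(words):
        for category in LEXICON[word]:
            chart[(i, i + 1, category)] = 1
    # Fill longer spans from shorter ones, trying every split point.
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            end = start + length
            for split in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    chart[(start, end, parent)] += (
                        chart[(start, split, left)] * chart[(split, end, right)]
                    )
    return {symbol: chart[(0, n, symbol)] for symbol in ("S", "VP")}


readings = count_parses("time flies like an arrow".split())
print(readings)
```

Even this very small grammar yields two declarative and two imperative analyses for the five-word string; choosing among them requires knowledge that the grammar alone does not contain.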
10590410@unknown@formal@none@1@S@==Subproblems==@@@@1@1@@danf@17-8-2009
10590420@unknown@formal@none@1@S@; [[Speech segmentation]]:@@@@1@3@@danf@17-8-2009
10590430@unknown@formal@none@1@S@In most spoken languages, the sounds representing successive letters blend into each other, so the conversion of the analog signal to discrete characters can be a very difficult process.@@@@1@29@@danf@17-8-2009
10590440@unknown@formal@none@1@S@Also, in [[natural speech]] there are hardly any pauses between successive words; the location of those boundaries usually must take into account [[grammatical]] and [[semantic]] constraints, as well as the [[context]].@@@@1@31@@danf@17-8-2009
10590450@unknown@formal@none@1@S@; [[Text segmentation]]:@@@@1@3@@danf@17-8-2009
10590460@unknown@formal@none@1@S@Some written languages like [[Chinese language|Chinese]], [[Japanese language|Japanese]] and [[Thai language|Thai]] do not have single-word boundaries either, so any significant text [[parsing]] usually requires the identification of word boundaries, which is often a non-trivial task.@@@@1@35@@danf@17-8-2009
10590470@unknown@formal@none@1@S@; [[Word sense disambiguation]]:@@@@1@4@@danf@17-8-2009
10590480@unknown@formal@none@1@S@Many words have more than one [[meaning]]; we have to select the meaning which makes the most sense in context.@@@@1@20@@danf@17-8-2009
10590490@unknown@formal@none@1@S@; [[Syntactic ambiguity]]:@@@@1@3@@danf@17-8-2009
10590500@unknown@formal@none@1@S@The [[grammar]] for [[natural language]]s is [[ambiguous]], i.e. there are often multiple possible [[parse tree]]s for a given sentence.@@@@1@19@@danf@17-8-2009
10590510@unknown@formal@none@1@S@Choosing the most appropriate one usually requires [[semantics|semantic]] and contextual information.@@@@1@11@@danf@17-8-2009
10590520@unknown@formal@none@1@S@Specific problem components of syntactic ambiguity include [[sentence boundary disambiguation]].@@@@1@10@@danf@17-8-2009
10590530@unknown@formal@none@1@S@; Imperfect or irregular input :@@@@1@6@@danf@17-8-2009
10590540@unknown@formal@none@1@S@Foreign or regional accents and vocal impediments in speech; typing or grammatical errors, [[Optical character recognition|OCR]] errors in texts.@@@@1@19@@danf@17-8-2009
10590550@unknown@formal@none@1@S@; [[Speech acts]] and plans:@@@@1@5@@danf@17-8-2009
10590560@unknown@formal@none@1@S@A sentence can often be considered an action by the speaker.@@@@1@11@@danf@17-8-2009
10590570@unknown@formal@none@1@S@The sentence structure, alone, may not contain enough information to define this action.@@@@1@13@@danf@17-8-2009
10590580@unknown@formal@none@1@S@For instance, a question is actually the speaker requesting some sort of response from the listener.@@@@1@16@@danf@17-8-2009
10590590@unknown@formal@none@1@S@The desired response may be verbal, physical, or some combination.@@@@1@10@@danf@17-8-2009
10590600@unknown@formal@none@1@S@For example, "Can you pass the class?" is a request for a simple yes-or-no answer, while "Can you pass the salt?" is requesting a physical action to be performed.@@@@1@29@@danf@17-8-2009
10590610@unknown@formal@none@1@S@It is not appropriate to respond with "Yes, I can pass the salt," without the accompanying action (although "No" or "I can't reach the salt" would explain a lack of action).@@@@1@31@@danf@17-8-2009
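To make the text segmentation subproblem above more concrete, here is a minimal sketch of greedy "maximum matching", one simple baseline for finding word boundaries in unspaced text. The vocabulary and the unspaced English test string are assumptions chosen for illustration; real systems for Chinese, Japanese or Thai use much larger lexicons and usually statistical models.

```python
def max_match(text, vocabulary):
    """Greedily split text into the longest vocabulary words, left to right."""
    words = []
    position = 0
    longest = max(len(word) for word in vocabulary)
    while position < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(longest, len(text) - position), 0, -1):
            candidate = text[position:position + size]
            # A single character is accepted as an unknown word (fallback).
            if candidate in vocabulary or size == 1:
                words.append(candidate)
                position += size
                break
    return words


VOCABULARY = {"the", "theta", "table", "bled", "down", "own", "there"}
print(max_match("thetabledownthere", VOCABULARY))
```

The greedy strategy returns "theta bled own there" rather than the intended "the table down there", which shows why identifying word boundaries generally needs grammatical and contextual information as well as a dictionary.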
10590620@unknown@formal@none@1@S@== Statistical NLP ==@@@@1@4@@danf@17-8-2009
10590630@unknown@formal@none@1@S@Statistical natural-language processing uses [[stochastic]], [[probabilistic]] and [[statistical]] methods to resolve some of the difficulties discussed above, especially those which arise because longer sentences are highly ambiguous when processed with realistic grammars, yielding thousands or millions of possible analyses.@@@@1@39@@danf@17-8-2009
10590640@unknown@formal@none@1@S@Methods for disambiguation often involve the use of [[corpus linguistics | corpora]] and [[Markov model]]s.@@@@1@15@@danf@17-8-2009
10590650@unknown@formal@none@1@S@Statistical NLP comprises all quantitative approaches to automated language processing, including probabilistic modeling, [[information theory]], and [[linear algebra]].@@@@1@18@@danf@17-8-2009
10590660@unknown@formal@none@1@S@The technology for statistical NLP comes mainly from [[machine learning]] and [[data mining]], both of which are fields of [[artificial intelligence]] that involve learning from data.@@@@1@26@@danf@17-8-2009
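As a small sketch of the corpus-based disambiguation mentioned above, the code below trains a bigram (first-order Markov) model with add-one smoothing and uses it to choose between two candidate word sequences. The four training sentences, the candidates and all function names are assumptions invented for this example.

```python
import math
from collections import Counter

CORPUS = [
    "there is a cat on the mat",
    "there is a book on the table",
    "their book is on the table",
    "their cat is on the mat",
]


def train_bigram_model(sentences):
    """Collect context and bigram counts, with sentence boundary markers."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocabulary = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocabulary.update(tokens)
        for previous, current in zip(tokens, tokens[1:]):
            context_counts[previous] += 1
            bigram_counts[(previous, current)] += 1
    return context_counts, bigram_counts, len(vocabulary)


def log_probability(sentence, model):
    """Log-probability of a sentence under the add-one-smoothed bigram model."""
    context_counts, bigram_counts, vocabulary_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for previous, current in zip(tokens, tokens[1:]):
        numerator = bigram_counts[(previous, current)] + 1
        denominator = context_counts[previous] + vocabulary_size
        total += math.log(numerator / denominator)
    return total


MODEL = train_bigram_model(CORPUS)
CANDIDATES = ["their is a book", "there is a book"]
best = max(CANDIDATES, key=lambda s: log_probability(s, MODEL))
print(best)
```

The two candidates differ only in the first word, so the decision rests on how often "is" follows "there" or "their" in the corpus; the model prefers "there is a book".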
10590670@unknown@formal@none@1@S@==Major tasks in NLP==@@@@1@4@@danf@17-8-2009
10590680@unknown@formal@none@1@S@* [[Automatic summarization]]@@@@1@3@@danf@17-8-2009
10590690@unknown@formal@none@1@S@* [[Foreign language reading aid]]@@@@1@5@@danf@17-8-2009
10590700@unknown@formal@none@1@S@* [[Foreign language writing aid]]@@@@1@5@@danf@17-8-2009
10590710@unknown@formal@none@1@S@* [[Information extraction]]@@@@1@3@@danf@17-8-2009
10590720@unknown@formal@none@1@S@* [[Information retrieval]]@@@@1@3@@danf@17-8-2009
10590730@unknown@formal@none@1@S@* [[Machine translation]]@@@@1@3@@danf@17-8-2009
10590740@unknown@formal@none@1@S@* [[Named entity recognition]]@@@@1@4@@danf@17-8-2009
10590750@unknown@formal@none@1@S@* [[Natural language generation]]@@@@1@4@@danf@17-8-2009
10590760@unknown@formal@none@1@S@* [[Natural language understanding]]@@@@1@4@@danf@17-8-2009
10590770@unknown@formal@none@1@S@* [[Optical character recognition]]@@@@1@4@@danf@17-8-2009
10590780@unknown@formal@none@1@S@* [[Question answering]]@@@@1@3@@danf@17-8-2009
10590790@unknown@formal@none@1@S@* [[Speech recognition]]@@@@1@3@@danf@17-8-2009
10590800@unknown@formal@none@1@S@* [[Spoken dialogue system]]@@@@1@4@@danf@17-8-2009
10590810@unknown@formal@none@1@S@* [[Text simplification]]@@@@1@3@@danf@17-8-2009
10590820@unknown@formal@none@1@S@* [[Text to speech]]@@@@1@4@@danf@17-8-2009
10590830@unknown@formal@none@1@S@* [[Text-proofing]]@@@@1@2@@danf@17-8-2009
10590840@unknown@formal@none@1@S@== Evaluation of natural language processing ==@@@@1@7@@danf@17-8-2009
10590850@unknown@formal@none@1@S@===Objectives===@@@@1@1@@danf@17-8-2009
10590860@unknown@formal@none@1@S@The goal of NLP evaluation is to measure one or more ''qualities'' of an algorithm or a system, in order to determine if (or to what extent) the system answers the goals of its designers, or the needs of its users.@@@@1@41@@danf@17-8-2009
10590870@unknown@formal@none@1@S@Research in NLP evaluation has received considerable attention, because the definition of proper evaluation criteria is one way to specify precisely an NLP problem, going thus beyond the vagueness of tasks defined only as ''language understanding'' or ''language generation''.@@@@1@39@@danf@17-8-2009
10590880@unknown@formal@none@1@S@A precise set of evaluation criteria, which includes mainly evaluation data and evaluation metrics, enables several teams to compare their solutions to a given NLP problem.@@@@1@26@@danf@17-8-2009
10590890@unknown@formal@none@1@S@===Short history of evaluation in NLP===@@@@1@6@@danf@17-8-2009
10590900@unknown@formal@none@1@S@The first evaluation campaign on written texts seems to be a campaign dedicated to message understanding in 1987 (Pallet 1998).@@@@1@20@@danf@17-8-2009
10590910@unknown@formal@none@1@S@Then, the Parseval/GEIG project compared phrase-structure grammars (Black 1991).@@@@1@9@@danf@17-8-2009
10590920@unknown@formal@none@1@S@A series of campaigns within the Tipster project was conducted on tasks such as summarization, translation and searching (Hirshman 1998).@@@@1@18@@danf@17-8-2009
10590930@unknown@formal@none@1@S@In 1994, in Germany, the Morpholympics compared German taggers.@@@@1@9@@danf@17-8-2009
10590940@unknown@formal@none@1@S@Then, the Senseval and Romanseval campaigns were conducted with the objectives of semantic disambiguation.@@@@1@14@@danf@17-8-2009
10590950@unknown@formal@none@1@S@In 1996, the Sparkle campaign compared syntactic parsers in four different languages (English, French, German and Italian).@@@@1@17@@danf@17-8-2009
10590960@unknown@formal@none@1@S@In France, the Grace project compared a set of 21 taggers for French in 1997 (Adda 1999).@@@@1@17@@danf@17-8-2009
10590970@unknown@formal@none@1@S@In 2004, during the [[Technolangue/Easy]] project, 13 parsers for French were compared.@@@@1@12@@danf@17-8-2009
10590980@unknown@formal@none@1@S@Large-scale evaluations of dependency parsers were performed in the context of the CoNLL shared tasks in 2006 and 2007.@@@@1@19@@danf@17-8-2009
10590990@unknown@formal@none@1@S@In Italy, the EVALITA campaign was conducted in 2007 to compare various tools for Italian (see the [http://evalita.itc.it evalita web site]).@@@@1@19@@danf@17-8-2009
10591000@unknown@formal@none@1@S@In France, within the ANR-Passage project (end of 2007), 10 parsers for French were compared (see the [http://atoll.inria.fr/passage/ passage web site]).@@@@1@19@@danf@17-8-2009
10591010@unknown@formal@none@1@S@Adda G., Mariani J., Paroubek P., Rajman M. 1999 L'action GRACE d'évaluation de l'assignation des parties du discours pour le français. Langues vol-2@@@@1@23@@danf@17-8-2009
10591030@unknown@formal@none@1@S@Black E., Abney S., Flickinger D., Gdaniec C., Grishman R., Harrison P., Hindle D., Ingria R., Jelinek F., Klavans J., Liberman M., Marcus M., Reukos S., Santoni B., Strzalkowski T. 1991 A procedure for quantitatively comparing the syntactic coverage of English grammars. DARPA Speech and Natural Language Workshop@@@@1@48@@danf@17-8-2009
10591050@unknown@formal@none@1@S@Hirshman L. 1998 Language understanding evaluation: lessons learned from MUC and ATIS. LREC Granada@@@@1@14@@danf@17-8-2009
10591070@unknown@formal@none@1@S@Pallet D.S. 1998 The NIST role in automatic speech recognition benchmark tests. LREC Granada@@@@1@14@@danf@17-8-2009
10591090@unknown@formal@none@1@S@===Different types of evaluation===@@@@1@4@@danf@17-8-2009
10591100@unknown@formal@none@1@S@Depending on the evaluation procedures, a number of distinctions are traditionally made in NLP evaluation.@@@@1@15@@danf@17-8-2009
10591110@unknown@formal@none@1@S@* Intrinsic vs. extrinsic evaluation@@@@1@5@@danf@17-8-2009
10591120@unknown@formal@none@1@S@Intrinsic evaluation considers an isolated NLP system and characterizes its performance mainly with respect to a ''gold standard'' result, pre-defined by the evaluators.@@@@1@23@@danf@17-8-2009
10591130@unknown@formal@none@1@S@Extrinsic evaluation, also called ''evaluation in use'', considers the NLP system in a more complex setting, either as an embedded system or serving a precise function for a human user.@@@@1@30@@danf@17-8-2009
10591140@unknown@formal@none@1@S@The extrinsic performance of the system is then characterized in terms of its utility with respect to the overall task of the complex system or the human user.@@@@1@28@@danf@17-8-2009
10591150@unknown@formal@none@1@S@* Black-box vs. glass-box evaluation@@@@1@5@@danf@17-8-2009
10591160@unknown@formal@none@1@S@Black-box evaluation requires one to run an NLP system on a given data set and to measure a number of parameters related to the quality of the process (speed, reliability, resource consumption) and, most importantly, to the quality of the result (e.g. the accuracy of data annotation or the fidelity of a translation).@@@@1@53@@danf@17-8-2009
10591170@unknown@formal@none@1@S@Glass-box evaluation looks at the design of the system, the algorithms that are implemented, the linguistic resources it uses (e.g. vocabulary size), etc.@@@@1@23@@danf@17-8-2009
10591180@unknown@formal@none@1@S@Given the complexity of NLP problems, it is often difficult to predict performance only on the basis of glass-box evaluation, but this type of evaluation is more informative with respect to error analysis or future developments of a system.@@@@1@39@@danf@17-8-2009
10591190@unknown@formal@none@1@S@* Automatic vs. manual evaluation@@@@1@5@@danf@17-8-2009
10591200@unknown@formal@none@1@S@In many cases, automatic procedures can be defined to evaluate an NLP system by comparing its output with the gold standard (or desired) one.@@@@1@24@@danf@17-8-2009
10591210@unknown@formal@none@1@S@Although the cost of producing the gold standard can be quite high, automatic evaluation can be repeated as often as needed without much additional cost (on the same input data).@@@@1@30@@danf@17-8-2009
10591220@unknown@formal@none@1@S@However, for many NLP problems, the definition of a gold standard is a complex task, and can prove impossible when inter-annotator agreement is insufficient.@@@@1@24@@danf@17-8-2009
10591230@unknown@formal@none@1@S@Manual evaluation is performed by human judges, who are instructed to estimate the quality of a system, or most often of a sample of its output, based on a number of criteria.@@@@1@32@@danf@17-8-2009
10591240@unknown@formal@none@1@S@Although, thanks to their linguistic competence, human judges can be considered as the reference for a number of language processing tasks, there is also considerable variation across their ratings.@@@@1@29@@danf@17-8-2009
10591250@unknown@formal@none@1@S@This is why automatic evaluation is sometimes referred to as ''objective'' evaluation, while the human kind appears to be more ''subjective.''@@@@1@21@@danf@17-8-2009
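A minimal sketch of the automatic, gold-standard-based evaluation described above: the system output is compared with the gold annotations, and precision, recall and F1 are computed. The (start, end, label) annotation tuples are made-up data, and the function name is an assumption; these metrics are one common choice, not the only way to score a system.

```python
def precision_recall_f1(predicted, gold):
    """Compare predicted annotations with the gold standard (exact match)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical named-entity annotations: (start token, end token, label).
GOLD = {(0, 2, "PER"), (5, 6, "LOC"), (9, 11, "ORG"), (12, 13, "LOC")}
PREDICTED = {(0, 2, "PER"), (5, 6, "ORG"), (9, 11, "ORG")}

precision, recall, f1 = precision_recall_f1(PREDICTED, GOLD)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here two of the three predictions match the gold standard exactly, giving a precision of 2/3, a recall of 2/4 and an F1 of 4/7. Once the gold standard exists, the same comparison can be rerun on every new system version at almost no cost.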
10591260@unknown@formal@none@1@S@=== Shared tasks (Campaigns)===@@@@1@4@@danf@17-8-2009
10591270@unknown@formal@none@1@S@* [[BioCreative]]@@@@1@2@@danf@17-8-2009
10591280@unknown@formal@none@1@S@* [[Message Understanding Conference]]@@@@1@4@@danf@17-8-2009
10591290@unknown@formal@none@1@S@* [[Technolangue/Easy]]@@@@1@2@@danf@17-8-2009
10591300@unknown@formal@none@1@S@* [[Text Retrieval Conference]]@@@@1@4@@danf@17-8-2009
10591310@unknown@formal@none@1@S@==Standardization in NLP==@@@@1@3@@danf@17-8-2009
10591320@unknown@formal@none@1@S@An ISO sub-committee is working to ease interoperability between [[Lexical resource]]s and NLP programs.@@@@1@16@@danf@17-8-2009
10591330@unknown@formal@none@1@S@The sub-committee is part of [[ISO/TC37]] and is called ISO/TC37/SC4.@@@@1@10@@danf@17-8-2009
10591340@unknown@formal@none@1@S@Some ISO standards are already published but most of them are under construction, mainly on lexicon representation (see [[lexical markup framework|LMF]]), annotation and data category registry.@@@@1@26@@danf@17-8-2009
10600010@unknown@formal@none@1@S@Neural network@@@@1@2@@danf@17-8-2009
10600020@unknown@formal@none@1@S@Traditionally, the term '''neural network''' had been used to refer to a network or circuit of [[neuron|biological neurons]].@@@@1@18@@danf@17-8-2009
10600030@unknown@formal@none@1@S@The modern usage of the term often refers to [[artificial neural network]]s, which are composed of [[artificial neuron]]s or nodes.@@@@1@20@@danf@17-8-2009
10600040@unknown@formal@none@1@S@Thus the term has two distinct usages:@@@@1@7@@danf@17-8-2009
10600050@unknown@formal@none@1@S@# [[Biological neural network]]s are made up of real biological neurons that are connected or functionally-related in the [[peripheral nervous system]] or the [[central nervous system]].@@@@1@26@@danf@17-8-2009
10600060@unknown@formal@none@1@S@In the field of [[neuroscience]], they are often identified as groups of neurons that perform a specific physiological function in laboratory analysis.@@@@1@22@@danf@17-8-2009
10600070@unknown@formal@none@1@S@# [[Artificial neural network]]s are made up of interconnecting artificial neurons (programming constructs that mimic the properties of biological neurons).@@@@1@20@@danf@17-8-2009
10600080@unknown@formal@none@1@S@Artificial neural networks may either be used to gain an understanding of biological neural networks, or for solving artificial intelligence problems without necessarily creating a model of a real biological system.@@@@1@31@@danf@17-8-2009
10600090@unknown@formal@none@1@S@This article focuses on the relationship between the two concepts; for detailed coverage of the two different concepts refer to the separate articles: [[Biological neural network]] and [[Artificial neural network]].@@@@1@30@@danf@17-8-2009
10600100@unknown@formal@none@1@S@==Characterization==@@@@1@1@@danf@17-8-2009
10600110@unknown@formal@none@1@S@In general a biological neural network is composed of a group or groups of chemically connected or functionally associated neurons.@@@@1@20@@danf@17-8-2009
10600120@unknown@formal@none@1@S@A single neuron may be connected to many other neurons and the total number of neurons and connections in a network may be extensive.@@@@1@24@@danf@17-8-2009
10600130@unknown@formal@none@1@S@Connections, called [[synapses]], are usually formed from [[axons]] to [[dendrites]], though dendrodendritic microcircuits and other connections are possible.@@@@1@18@@danf@17-8-2009
10600140@unknown@formal@none@1@S@Apart from the electrical signaling, there are other forms of signaling that arise from [[neurotransmitter]] diffusion, which have an effect on electrical signaling.@@@@1@23@@danf@17-8-2009
10600150@unknown@formal@none@1@S@As such, neural networks are extremely complex.@@@@1@7@@danf@17-8-2009
10600160@unknown@formal@none@1@S@[[Artificial intelligence]] and [[cognitive modeling]] try to simulate some properties of neural networks.@@@@1@13@@danf@17-8-2009
10600170@unknown@formal@none@1@S@Although the two are similar in their techniques, the former aims to solve particular tasks, while the latter aims to build mathematical models of biological neural systems.@@@@1@26@@danf@17-8-2009
10600180@unknown@formal@none@1@S@In the [[artificial intelligence]] field, artificial neural networks have been applied successfully to [[speech recognition]], [[image analysis]] and adaptive [[control]], in order to construct [[software agents]] (in [[Video game|computer and video games]]) or [[autonomous robot]]s.@@@@1@35@@danf@17-8-2009
10600190@unknown@formal@none@1@S@Most of the currently employed artificial neural networks for artificial intelligence are based on [[statistical estimation]], [[Optimization (mathematics)|optimization]] and [[control theory]].@@@@1@21@@danf@17-8-2009
10600200@unknown@formal@none@1@S@The [[cognitive modelling]] field involves the physical or mathematical modeling of the behaviour of neural systems, ranging from the individual neural level (e.g. modelling the spike response curves of neurons to a stimulus), through the neural cluster level (e.g. modelling the release and effects of dopamine in the basal ganglia) to the complete organism (e.g. behavioural modelling of the organism's response to stimuli).@@@@1@63@@danf@17-8-2009
10600210@unknown@formal@none@1@S@==The brain, neural networks and computers==@@@@1@6@@danf@17-8-2009
10600220@unknown@formal@none@1@S@Neural networks, as used in artificial intelligence, have traditionally been viewed as simplified models of neural processing in the brain, even though the relation between this model and brain biological architecture is debated.@@@@1@33@@danf@17-8-2009
10600230@unknown@formal@none@1@S@A subject of current research in theoretical neuroscience is the question surrounding the degree of complexity and the properties that individual neural elements should have to reproduce something resembling animal intelligence.@@@@1@31@@danf@17-8-2009
10600240@unknown@formal@none@1@S@Historically, computers evolved from the [[von Neumann architecture]], which is based on sequential processing and execution of explicit instructions.@@@@1@19@@danf@17-8-2009
10600250@unknown@formal@none@1@S@On the other hand, the origins of neural networks are based on efforts to model information processing in biological systems, which may rely largely on parallel processing as well as implicit instructions based on recognition of patterns of 'sensory' input from external sources.@@@@1@43@@danf@17-8-2009
10600260@unknown@formal@none@1@S@In other words, at its very heart a neural network is a complex statistical processor (as opposed to being tasked to sequentially process and execute).@@@@1@25@@danf@17-8-2009
10600270@unknown@formal@none@1@S@==Neural networks and artificial intelligence==@@@@1@5@@danf@17-8-2009
10600280@unknown@formal@none@1@S@An ''artificial neural network'' (ANN), also called a ''simulated neural network'' (SNN) or commonly just ''neural network'' (NN), is an interconnected group of [[artificial neuron]]s that uses a [[mathematical model|mathematical or computational model]] for [[information processing]] based on a [[connectionism|connectionistic]] approach to [[computation]].@@@@1@43@@danf@17-8-2009
10600290@unknown@formal@none@1@S@In most cases an ANN is an [[adaptive system]] that changes its structure based on external or internal information that flows through the network.@@@@1@24@@danf@17-8-2009
10600300@unknown@formal@none@1@S@In more practical terms neural networks are [[non-linear]] [[statistical]] [[data modeling]] or [[decision making]] tools.@@@@1@15@@danf@17-8-2009
10600310@unknown@formal@none@1@S@They can be used to model complex relationships between inputs and outputs or to [[Pattern recognition|find patterns]] in data.@@@@1@19@@danf@17-8-2009
10600320@unknown@formal@none@1@S@===Background===@@@@1@1@@danf@17-8-2009
10600330@unknown@formal@none@1@S@An [[artificial neural network]] involves a network of simple processing elements ([[artificial neurons]]) which can exhibit complex global behaviour, determined by the connections between the processing elements and element parameters.@@@@1@30@@danf@17-8-2009
10600340@unknown@formal@none@1@S@One classical type of artificial neural network is the [[Hopfield net]].@@@@1@11@@danf@17-8-2009
10600350@unknown@formal@none@1@S@In a neural network model simple [[Node (neural networks)|nodes]], which can be called variously "neurons", "neurodes", "Processing Elements" (PE) or "units", are connected together to form a network of nodes — hence the term "neural network".@@@@1@36@@danf@17-8-2009
10600360@unknown@formal@none@1@S@While a neural network does not have to be adaptive ''per se'', its practical use comes with algorithms designed to alter the strength (weights) of the connections in the network to produce a desired signal flow.@@@@1@36@@danf@17-8-2009
10600370@unknown@formal@none@1@S@In modern [[Neural network software|software implementations]] of artificial neural networks the approach inspired by biology has more or less been abandoned for a more practical approach based on statistics and signal processing.@@@@1@32@@danf@17-8-2009
10600380@unknown@formal@none@1@S@In some of these systems neural networks, or parts of neural networks (such as [[artificial neuron]]s) are used as components in larger systems that combine both adaptive and non-adaptive elements.@@@@1@30@@danf@17-8-2009
10600390@unknown@formal@none@1@S@The concept of a neural network appears to have first been proposed by [[Alan Turing]] in his 1948 paper "Intelligent Machinery".@@@@1@21@@danf@17-8-2009
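The idea described above, of an algorithm that alters connection weights until the network produces a desired output, can be sketched with a single artificial neuron trained by the classic perceptron rule. The task (logical AND), the learning rate, the number of epochs and all names are assumptions chosen for illustration.

```python
def neuron_output(weights, bias, inputs):
    """Threshold unit: fires (1) when the weighted input sum exceeds zero."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(examples, epochs=20, learning_rate=1):
    """Adjust weights and bias in proportion to each prediction error."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - neuron_output(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Truth table of logical AND, used as the desired behaviour.
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_EXAMPLES)
predictions = [neuron_output(weights, bias, inputs) for inputs, _ in AND_EXAMPLES]
print(predictions)
```

Because AND is linearly separable, the rule settles on weights that reproduce the whole truth table; a single unit of this kind cannot learn functions that are not linearly separable, such as XOR.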
10600400@unknown@formal@none@1@S@===Applications===@@@@1@1@@danf@17-8-2009
10600410@unknown@formal@none@1@S@The utility of artificial neural network models lies in the fact that they can be used to infer a function from observations and also to use it.@@@@1@27@@danf@17-8-2009
10600420@unknown@formal@none@1@S@This is particularly useful in applications where the complexity of the data or task makes the design of such a function by hand impractical.@@@@1@24@@danf@17-8-2009
10600430@unknown@formal@none@1@S@====Real life applications====@@@@1@3@@danf@17-8-2009
10600440@unknown@formal@none@1@S@The tasks to which artificial neural networks are applied tend to fall within the following broad categories:@@@@1@17@@danf@17-8-2009
10600450@unknown@formal@none@1@S@*[[Function approximation]], or [[regression analysis]], including [[time series prediction]] and modelling.@@@@1@11@@danf@17-8-2009
10600460@unknown@formal@none@1@S@*[[Statistical classification|Classification]], including [[Pattern recognition|pattern]] and sequence recognition, novelty detection and sequential decision making.@@@@1@14@@danf@17-8-2009
10600470@unknown@formal@none@1@S@*[[Data processing]], including filtering, clustering, [[blind signal separation]] and compression.@@@@1@10@@danf@17-8-2009
10600480@unknown@formal@none@1@S@Application areas include system identification and control (vehicle control, process control), game-playing and decision making (backgammon, chess, racing), pattern recognition (radar systems, face identification, object recognition, etc.), sequence recognition (gesture, speech, handwritten text recognition), medical diagnosis, financial applications, [[data mining]] (or knowledge discovery in databases, "KDD"), visualization and [[e-mail spam]] filtering.@@@@1@51@@danf@17-8-2009
10600490@unknown@formal@none@1@S@===Neural network software===@@@@1@3@@danf@17-8-2009
10600500@unknown@formal@none@1@S@''Main article:'' [[Neural network software]]@@@@1@5@@danf@17-8-2009
10600510@unknown@formal@none@1@S@'''Neural network software''' is used to [[Simulation|simulate]], [[research]], [[Software development|develop]] and apply [[artificial neural network]]s, [[biological neural network]]s and in some cases a wider array of [[adaptive system]]s.@@@@1@28@@danf@17-8-2009
10600520@unknown@formal@none@1@S@====Learning paradigms====@@@@1@2@@danf@17-8-2009
10600530@unknown@formal@none@1@S@There are three major learning paradigms, each corresponding to a particular abstract learning task.@@@@1@14@@danf@17-8-2009
10600540@unknown@formal@none@1@S@These are [[supervised learning]], [[unsupervised learning]] and [[reinforcement learning]].@@@@1@9@@danf@17-8-2009
10600550@unknown@formal@none@1@S@Usually any given type of network architecture can be employed in any of those tasks.@@@@1@15@@danf@17-8-2009
10600560@unknown@formal@none@1@S@;Supervised learning@@@@1@2@@danf@17-8-2009
10600570@unknown@formal@none@1@S@In [[supervised learning]], we are given a set of example pairs, each consisting of an input and its desired output, and the aim is to find a function in the allowed class of functions that matches the examples.@@@@1@39@@danf@17-8-2009
10600580@unknown@formal@none@1@S@In other words, we wish to ''infer'' the mapping implied by the data; the cost function is related to the mismatch between our mapping and the data.@@@@1@29@@danf@17-8-2009
10600590@unknown@formal@none@1@S@;Unsupervised learning@@@@1@2@@danf@17-8-2009
10600600@unknown@formal@none@1@S@In [[unsupervised learning]] we are given some data and a cost function to be minimized, which can be any function of the data and the network's output.@@@@1@30@@danf@17-8-2009
10600610@unknown@formal@none@1@S@The cost function is determined by the task formulation.@@@@1@9@@danf@17-8-2009
10600620@unknown@formal@none@1@S@Most applications fall within the domain of [[estimation problems]] such as [[statistical modeling]], [[Data compression|compression]], [[Mail filter|filtering]], [[blind source separation]] and [[data clustering|clustering]].@@@@1@23@@danf@17-8-2009
10600630@unknown@formal@none@1@S@;Reinforcement learning@@@@1@2@@danf@17-8-2009
10600640@unknown@formal@none@1@S@In [[reinforcement learning]], data is usually not given, but generated by an agent's interactions with the environment.@@@@1@18@@danf@17-8-2009
10600650@unknown@formal@none@1@S@At each point in time, the agent performs an action and the environment generates an observation and an instantaneous cost, according to some (usually unknown) dynamics.@@@@1@30@@danf@17-8-2009
10600660@unknown@formal@none@1@S@The aim is to discover a ''policy'' for selecting actions that minimises some measure of a long-term cost, i.e. the expected cumulative cost.@@@@1@23@@danf@17-8-2009
10600670@unknown@formal@none@1@S@The environment's dynamics and the long-term cost for each policy are usually unknown, but can be estimated.@@@@1@17@@danf@17-8-2009
10600680@unknown@formal@none@1@S@ANNs are frequently used in reinforcement learning as part of the overall algorithm.@@@@1@13@@danf@17-8-2009
10600690@unknown@formal@none@1@S@Tasks that fall within the paradigm of reinforcement learning are [[control]] problems, [[game]]s and other [[sequential decision making]] tasks.@@@@1@19@@danf@17-8-2009
10600700@unknown@formal@none@1@S@====Learning algorithms====@@@@1@2@@danf@17-8-2009
10600710@unknown@formal@none@1@S@There are many algorithms for training neural networks; most of them can be viewed as a straightforward application of [[Optimization (mathematics)|optimization]] theory and [[statistical estimation]].@@@@1@25@@danf@17-8-2009
10600720@unknown@formal@none@1@S@[[Evolutionary computation]] methods, [[simulated annealing]], [[Expectation-Maximization|expectation maximization]] and [[non-parametric methods]] are among other commonly used methods for training neural networks.@@@@1@20@@danf@17-8-2009
10600730@unknown@formal@none@1@S@See also [[machine learning]].@@@@1@4@@danf@17-8-2009
10600740@unknown@formal@none@1@S@Recent developments in this field have also seen [[particle swarm optimization]] and other [[swarm intelligence]] techniques used in the training of neural networks.@@@@1@25@@danf@17-8-2009
10600750@unknown@formal@none@1@S@==Neural networks and neuroscience==@@@@1@4@@danf@17-8-2009
10600760@unknown@formal@none@1@S@Theoretical and [[computational neuroscience]] is the field concerned with the theoretical analysis and computational modeling of biological neural systems.@@@@1@19@@danf@17-8-2009
10600770@unknown@formal@none@1@S@Since neural systems are intimately related to cognitive processes and behaviour, the field is closely related to cognitive and behavioural modeling.@@@@1@21@@danf@17-8-2009
10600780@unknown@formal@none@1@S@The aim of the field is to create models of biological neural systems in order to understand how biological systems work.@@@@1@21@@danf@17-8-2009
10600790@unknown@formal@none@1@S@To gain this understanding, neuroscientists strive to make a link between observed biological processes (data), biologically plausible mechanisms for neural processing and learning ([[biological neural network]] models) and theory (statistical learning theory and [[information theory]]).@@@@1@35@@danf@17-8-2009
10600800@unknown@formal@none@1@S@=== Types of models ===@@@@1@5@@danf@17-8-2009
10600810@unknown@formal@none@1@S@Many models are used in the field, each defined at a different level of abstraction and trying to model different aspects of neural systems.@@@@1@24@@danf@17-8-2009
10600820@unknown@formal@none@1@S@They range from models of the short-term behaviour of [[biological neuron models|individual neurons]], through models of how the dynamics of neural circuitry arise from interactions between individual neurons, to models of how behaviour can arise from abstract neural modules that represent complete subsystems.@@@@1@43@@danf@17-8-2009
10600830@unknown@formal@none@1@S@These include models of the long-term and short-term plasticity of neural systems and its relation to learning and memory, from the individual neuron to the system level.@@@@1@27@@danf@17-8-2009
10600840@unknown@formal@none@1@S@===Current research===@@@@1@2@@danf@17-8-2009
10600850@unknown@formal@none@1@S@While initially research had been concerned mostly with the electrical characteristics of neurons, a particularly important part of the investigation in recent years has been the exploration of the role of [[neuromodulators]] such as [[dopamine]], [[acetylcholine]], and [[serotonin]] on behaviour and learning.@@@@1@42@@danf@17-8-2009
10600860@unknown@formal@none@1@S@[[Biophysics|Biophysical]] models, such as [[BCM theory]], have been important in understanding mechanisms for [[synaptic plasticity]], and have had applications in both computer science and neuroscience.@@@@1@25@@danf@17-8-2009
10600870@unknown@formal@none@1@S@Research is ongoing in understanding the computational algorithms used in the brain, with some recent biological evidence for [[radial basis networks]] and [[neural backpropagation]] as mechanisms for processing data.@@@@1@29@@danf@17-8-2009
10600880@unknown@formal@none@1@S@==History of the neural network analogy==@@@@1@6@@danf@17-8-2009
10600890@unknown@formal@none@1@S@The concept of neural networks started in the late 1800s as an effort to describe how the human mind works.@@@@1@19@@danf@17-8-2009
10600900@unknown@formal@none@1@S@These ideas started being applied to computational models with the [[Perceptron]].@@@@1@11@@danf@17-8-2009
10600910@unknown@formal@none@1@S@In the early 1950s, [[Friedrich Hayek]] was one of the first to posit the idea of [[spontaneous order]] in the brain arising out of decentralized networks of simple units (neurons).@@@@1@29@@danf@17-8-2009
10600920@unknown@formal@none@1@S@In the late 1940s, [[Donald Hebb]] made one of the first hypotheses for a mechanism of neural plasticity (i.e. learning), [[Hebbian learning]].@@@@1@22@@danf@17-8-2009
10600930@unknown@formal@none@1@S@Hebbian learning is considered to be a 'typical' unsupervised learning rule and it (and variants of it) was an early model for [[long term potentiation]].@@@@1@25@@danf@17-8-2009
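10600940@unknown@formal@none@1@S@The [[Perceptron]] is essentially a linear classifier for classifying data <math>x \in \mathbb{R}^n</math> specified by parameters <math>w \in \mathbb{R}^n, b \in \mathbb{R}</math> and an output function <math>f = w'x + b</math>.@@@@1@32@@danf@17-8-2009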
10600950@unknown@formal@none@1@S@Its parameters are adapted with an ad-hoc rule similar to stochastic steepest gradient descent.@@@@1@14@@danf@17-8-2009
10600960@unknown@formal@none@1@S@Because the [[inner product]] is a [[linear operator]] in the input space, the Perceptron can only perfectly classify a set of data for which different classes are [[linearly separable]] in the input space, while it often fails completely for non-separable data.@@@@1@41@@danf@17-8-2009
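The limitation to linearly separable data can be illustrated with a minimal sketch of a perceptron trained by the classic error-driven update rule. The function name <code>train_perceptron</code>, the learning rate, the epoch limit and the toy AND/XOR data sets are illustrative assumptions, not details taken from the original Perceptron work.

```python
def train_perceptron(samples, epochs=100, lr=1.0):
    """Train a perceptron on (inputs, label) pairs with labels in {-1, +1}.

    Returns (weights, bias, converged); `converged` is True if a full pass
    over the data produced no misclassification within `epochs` passes.
    """
    n = len(samples[0][0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in samples:
            # Linear output: inner product of weights and inputs, plus bias.
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified or on the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                errors += 1
        if errors == 0:
            return w, b, True
    return w, b, False


# Hypothetical toy data: logical AND is linearly separable, XOR is not.
AND_DATA = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
XOR_DATA = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]

and_weights, and_bias, and_converged = train_perceptron(AND_DATA)
xor_weights, xor_bias, xor_converged = train_perceptron(XOR_DATA)
```

Under these assumptions the run on the AND data stops with every example classified correctly, while the run on the XOR data exhausts its epoch limit, since no single line separates the two XOR classes.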
10600970@unknown@formal@none@1@S@While the development of the algorithm initially generated some enthusiasm, partly because of its apparent relation to biological mechanisms, the later discovery of this inadequacy caused such models to be abandoned until the introduction of non-linear models into the field.@@@@1@40@@danf@17-8-2009
10600980@unknown@formal@none@1@S@The [[Cognitron]] (1975) was an early multilayered neural network with a training algorithm.@@@@1@13@@danf@17-8-2009
10600990@unknown@formal@none@1@S@The actual structure of the network and the methods used to set the interconnection weights change from one neural strategy to another, each with its advantages and disadvantages.@@@@1@28@@danf@17-8-2009
10601000@unknown@formal@none@1@S@Networks can propagate information in one direction only, or they can bounce back and forth until self-activation at a node occurs and the network settles on a final state.@@@@1@29@@danf@17-8-2009
10601010@unknown@formal@none@1@S@Bi-directional flow of inputs between neurons/nodes was introduced with the [[Hopfield net|Hopfield network]] (1982), and specialization of these node layers for specific purposes was introduced through the first [[hybrid neural network|hybrid network]].@@@@1@35@@danf@17-8-2009
10601020@unknown@formal@none@1@S@The [[connectionism|parallel distributed processing]] of the mid-1980s became popular under the name [[connectionism]].@@@@1@13@@danf@17-8-2009
10601030@unknown@formal@none@1@S@The rediscovery of the [[backpropagation]] algorithm was probably the main reason behind the repopularisation of neural networks after the publication of "Learning Internal Representations by Error Propagation" in 1986 (though backpropagation itself dates from 1974).@@@@1@35@@danf@17-8-2009
10601040@unknown@formal@none@1@S@The original network utilised multiple layers of weight-sum units of the type <math>f = g(w'x + b)</math>, where <math>g</math> was a [[sigmoid function]] or [[logistic function]] such as used in [[logistic regression]].@@@@1@32@@danf@17-8-2009
10601050@unknown@formal@none@1@S@Training was done by a form of stochastic steepest gradient descent.@@@@1@11@@danf@17-8-2009
10601060@unknown@formal@none@1@S@The employment of the chain rule of differentiation in deriving the appropriate parameter updates results in an algorithm that seems to 'backpropagate errors', hence the nomenclature.@@@@1@26@@danf@17-8-2009
10601070@unknown@formal@none@1@S@However, it is essentially a form of gradient descent.@@@@1@9@@danf@17-8-2009
10601080@unknown@formal@none@1@S@Determining the optimal parameters in a model of this type is not trivial, and steepest gradient descent methods cannot be relied upon to give the solution without a good starting point.@@@@1@31@@danf@17-8-2009
10601090@unknown@formal@none@1@S@In recent times, networks with the same architecture as the backpropagation network are referred to as [[Multilayer perceptron|Multi-Layer Perceptrons]].@@@@1@19@@danf@17-8-2009
10601100@unknown@formal@none@1@S@This name does not impose any limitations on the type of algorithm used for learning.@@@@1@15@@danf@17-8-2009
10601110@unknown@formal@none@1@S@The backpropagation network generated much enthusiasm at the time and there was much controversy about whether such learning could be implemented in the brain or not, partly because a mechanism for reverse signalling was not obvious at the time, but most importantly because there was no plausible source for the 'teaching' or 'target' signal.@@@@1@54@@danf@17-8-2009
10601120@unknown@formal@none@1@S@==Criticism==@@@@1@1@@danf@17-8-2009
10601130@unknown@formal@none@1@S@[[A. K. Dewdney]], a former ''[[Scientific American]]'' columnist, wrote in 1997, ''“Although neural nets do solve a few toy problems, their powers of computation are so limited that I am surprised anyone takes them seriously as a general problem-solving tool.”''@@@@1@40@@danf@17-8-2009
10601140@unknown@formal@none@1@S@(Dewdney, p.82)@@@@1@2@@danf@17-8-2009
10601150@unknown@formal@none@1@S@Arguments against Dewdney's position are that neural nets have been successfully used to solve many complex and diverse tasks, ranging from autonomously flying aircraft[http://www.nasa.gov/centers/dryden/news/NewsReleases/2003/03-49.html] to detecting credit card fraud[http://www.visa.ca/en/about/visabenefits/innovation.cfm].@@@@1@29@@danf@17-8-2009
10601160@unknown@formal@none@1@S@Technology writer [[Roger Bridgman]] commented on Dewdney's statements about neural nets:@@@@1@11@@danf@17-8-2009
10601170@unknown@formal@none@1@S@Neural networks, for instance, are in the dock not only because they have been hyped to high heaven, (what hasn't?) but also because you could create a successful net without understanding how it worked: the bunch of numbers that captures its behaviour would in all probability be "an opaque, unreadable table...valueless as a scientific resource".@@@@1@56@@danf@17-8-2009
10601180@unknown@formal@none@1@S@In spite of his emphatic declaration that science is not technology, Dewdney seems here to pillory neural nets as bad science when most of those devising them are just trying to be good engineers.@@@@1@34@@danf@17-8-2009
10601190@unknown@formal@none@1@S@An unreadable table that a useful machine could read would still be well worth having.@@@@1@15@@danf@17-8-2009
10610010@unknown@formal@none@1@S@N-gram@@@@1@1@@danf@17-8-2009
10610020@unknown@formal@none@1@S@An '''''n''-gram''' is a sub-sequence of ''n'' items from a given [[sequence]].@@@@1@12@@danf@17-8-2009
10610025@unknown@formal@none@1@S@''n''-grams are used in various areas of statistical [[natural language processing]] and genetic sequence analysis.@@@@1@15@@danf@17-8-2009
10610030@unknown@formal@none@1@S@The items in question can be letters, words or [[base pairs]] according to the application.@@@@1@15@@danf@17-8-2009
10610040@unknown@formal@none@1@S@An ''n''-gram of size 1 is a "[[unigram]]"; size 2 is a "[[bigram]]" (or, more etymologically sound but less commonly used, a "digram"); size 3 is a "[[trigram]]"; and size 4 or more is simply called an "''n''-gram".@@@@1@38@@danf@17-8-2009
10610050@unknown@formal@none@1@S@Some [[language model]]s built from n-grams are "(''n'' − 1)-order [[Markov_chain|Markov model]]s".@@@@1@10@@danf@17-8-2009
10610060@unknown@formal@none@1@S@==Examples==@@@@1@1@@danf@17-8-2009
10610070@unknown@formal@none@1@S@Here are examples of '''''word''''' level 3-grams and 4-grams (and counts of the number of times they appeared) from the [[N-gram#Google_use_of_N-gram|Google n-gram corpus]].@@@@1@23@@danf@17-8-2009
10610080@unknown@formal@none@1@S@*ceramics collectables collectibles (55)@@@@1@4@@danf@17-8-2009
10610090@unknown@formal@none@1@S@*ceramics collectables fine (130)@@@@1@4@@danf@17-8-2009
10610100@unknown@formal@none@1@S@*ceramics collected by (52)@@@@1@4@@danf@17-8-2009
10610110@unknown@formal@none@1@S@*ceramics collectible pottery (50)@@@@1@4@@danf@17-8-2009
10610120@unknown@formal@none@1@S@*ceramics collectibles cooking (45)@@@@1@4@@danf@17-8-2009
10610130@unknown@formal@none@1@S@4-grams@@@@1@1@@danf@17-8-2009
10610140@unknown@formal@none@1@S@*serve as the incoming (92)@@@@1@5@@danf@17-8-2009
10610150@unknown@formal@none@1@S@*serve as the incubator (99)@@@@1@5@@danf@17-8-2009
10610160@unknown@formal@none@1@S@*serve as the independent (794)@@@@1@5@@danf@17-8-2009
10610170@unknown@formal@none@1@S@*serve as the index (223)@@@@1@5@@danf@17-8-2009
10610180@unknown@formal@none@1@S@*serve as the indication (72)@@@@1@5@@danf@17-8-2009
10610190@unknown@formal@none@1@S@*serve as the indicator (120)@@@@1@5@@danf@17-8-2009
10610200@unknown@formal@none@1@S@==''n''-gram models==@@@@1@2@@danf@17-8-2009
10610210@unknown@formal@none@1@S@An '''''n''-gram model''' models sequences, notably natural languages, using the statistical properties of ''n''-grams.@@@@1@14@@danf@17-8-2009
10610220@unknown@formal@none@1@S@This idea can be traced to an experiment in [[Claude Shannon]]'s work in [[information theory]].@@@@1@15@@danf@17-8-2009
10610230@unknown@formal@none@1@S@His question was, given a sequence of letters (for example, the sequence "for ex"), what is the [[likelihood]] of the next letter?@@@@1@22@@danf@17-8-2009
10610240@unknown@formal@none@1@S@From training data, one can derive a [[probability distribution]] for the next letter given a history of size <math>n - 1</math>: ''a'' = 0.4, ''b'' = 0.00001, ''c'' = 0, ...; where the probabilities of all possible "next-letters" sum to 1.0.@@@@1@39@@danf@17-8-2009
10610250@unknown@formal@none@1@S@More concisely, an ''n''-gram model predicts <math>x_{i}</math> based on <math>x_{i-1}, x_{i-2}, \dots, x_{i-(n-1)}</math>.@@@@1@13@@danf@17-8-2009
10610260@unknown@formal@none@1@S@In probability terms, this is <math>P(x_{i} \mid x_{i-1}, x_{i-2}, \dots, x_{i-(n-1)})</math>.@@@@1@13@@danf@17-8-2009
10610270@unknown@formal@none@1@S@When used for [[language model|language modeling]], independence assumptions are made so that each word depends only on the last ''n'' − 1 words.@@@@1@21@@danf@17-8-2009
10610280@unknown@formal@none@1@S@This [[Markov model]] is used as an approximation of the true underlying language.@@@@1@13@@danf@17-8-2009
10610290@unknown@formal@none@1@S@This assumption is important because it massively simplifies the problem of learning the language model from data.@@@@1@17@@danf@17-8-2009
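A minimal sketch of how such a model might be estimated from data by counting is given here. It assumes the convention that an ''n''-gram model conditions each item on the preceding ''n'' − 1 items; the function names <code>train_ngram_model</code> and <code>probability</code> and the toy sentence are hypothetical, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def train_ngram_model(tokens, n):
    """Count how often each token follows each history of n - 1 tokens."""
    counts = defaultdict(Counter)
    for i in range(n - 1, len(tokens)):
        history = tuple(tokens[i - n + 1:i])
        counts[history][tokens[i]] += 1
    return counts


def probability(counts, history, token):
    """Maximum-likelihood estimate of P(token | history), unsmoothed."""
    followers = counts.get(tuple(history), Counter())
    total = sum(followers.values())
    return followers[token] / total if total else 0.0


# Hypothetical toy corpus, used only to exercise the functions.
tokens = "the dog saw the cat and the dog ran".split()
bigram_counts = train_ngram_model(tokens, 2)
p_dog = probability(bigram_counts, ["the"], "dog")
p_cat = probability(bigram_counts, ["the"], "cat")
```

In this toy corpus "the" is followed by "dog" twice and by "cat" once, so the two estimates are 2/3 and 1/3 and sum to 1 over the observed continuations.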
10610300@unknown@formal@none@1@S@In addition, because of the open nature of language, it is common to group words unknown to the language model together.@@@@1@21@@danf@17-8-2009
10610310@unknown@formal@none@1@S@''n''-gram models are widely used in statistical [[natural language processing]].@@@@1@10@@danf@17-8-2009
10610320@unknown@formal@none@1@S@In [[speech recognition]], [[phonemes]] and sequences of phonemes are modeled using an ''n''-gram distribution.@@@@1@14@@danf@17-8-2009
10610330@unknown@formal@none@1@S@For parsing, words are modeled such that each ''n''-gram is composed of ''n'' words.@@@@1@14@@danf@17-8-2009
10610340@unknown@formal@none@1@S@For [[language recognition]], sequences of letters are modeled for different languages.@@@@1@11@@danf@17-8-2009
10610350@unknown@formal@none@1@S@For a sequence of words, (for example "the dog smelled like a skunk"), the trigrams would be: "the dog smelled", "dog smelled like", "smelled like a", and "like a skunk".@@@@1@30@@danf@17-8-2009
10610360@unknown@formal@none@1@S@For sequences of characters, the 3-grams (sometimes referred to as "trigrams") that can be generated from "good morning" are "goo", "ood", "od ", "d m", " mo", "mor" and so forth.@@@@1@31@@danf@17-8-2009
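The two examples above can be reproduced with a short sketch. The helper name <code>ngrams</code> is an assumption made for illustration; it relies only on slicing, so it works on a list of words as well as on a string of characters.

```python
def ngrams(items, n):
    """Return every contiguous sub-sequence of length n, in order."""
    return [items[i:i + n] for i in range(len(items) - n + 1)]


# Word-level trigrams: each result is a list of three words.
word_trigrams = [" ".join(gram)
                 for gram in ngrams("the dog smelled like a skunk".split(), 3)]

# Character-level 3-grams: each result is a three-character substring,
# with spaces kept as ordinary characters.
char_trigrams = ngrams("good morning", 3)
```

A sequence of length ''m'' yields ''m'' − ''n'' + 1 such ''n''-grams, so the six-word sentence gives four word trigrams and the twelve-character string gives ten character 3-grams.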
10610370@unknown@formal@none@1@S@Some practitioners preprocess strings to remove spaces; most simply collapse whitespace to a single space while preserving paragraph marks.@@@@1@19@@danf@17-8-2009
10610380@unknown@formal@none@1@S@Punctuation is also commonly reduced or removed by preprocessing.@@@@1@9@@danf@17-8-2009
10610385@unknown@formal@none@1@S@''n''-grams can also be used for sequences of words or, in fact, for almost any type of data.@@@@1@18@@danf@17-8-2009
10610390@unknown@formal@none@1@S@They have been used for example for extracting features for clustering large sets of satellite earth images and for determining what part of the Earth a particular image came from.@@@@1@30@@danf@17-8-2009
10610400@unknown@formal@none@1@S@They have also been very successful as the first pass in genetic sequence search and in the identification of which species short sequences of DNA were taken from.@@@@1@28@@danf@17-8-2009
10610410@unknown@formal@none@1@S@N-gram models are often criticized because they lack any explicit representation of long range dependency.@@@@1@15@@danf@17-8-2009
10610420@unknown@formal@none@1@S@While it is true that the only explicit dependency range is (n-1) tokens for an n-gram model, it is also true that the effective range of dependency is significantly longer than this although long range correlations drop exponentially with distance for any Markov model.@@@@1@44@@danf@17-8-2009
10610430@unknown@formal@none@1@S@Alternative Markov language models that incorporate some degree of local state can exhibit very long range dependencies.@@@@1@17@@danf@17-8-2009
10610440@unknown@formal@none@1@S@This is often done using hand-crafted state variables that represent, for instance, the position in a sentence, the general topic of discourse or a grammatical state variable.@@@@1@27@@danf@17-8-2009
10610450@unknown@formal@none@1@S@Some of the best parsers of English currently in existence are roughly of this form.@@@@1@15@@danf@17-8-2009
10610460@unknown@formal@none@1@S@Another criticism that has been leveled is that Markov models of language, including n-gram models, do not explicitly capture the performance/competence distinction introduced by [[Noam Chomsky]].@@@@1@26@@danf@17-8-2009
10610470@unknown@formal@none@1@S@This criticism fails to explain why parsers that are the best at parsing text seem to uniformly lack any such distinction and most even lack any clear distinction between semantics and syntax.@@@@1@32@@danf@17-8-2009
10610480@unknown@formal@none@1@S@Most proponents of n-gram and related language models opt for a fairly pragmatic approach to language modeling that emphasizes empirical results over theoretical purity.@@@@1@24@@danf@17-8-2009
10610490@unknown@formal@none@1@S@==''n''-grams for approximate matching==@@@@1@4@@danf@17-8-2009
10610500@unknown@formal@none@1@S@''n''-grams can also be used for efficient approximate matching.@@@@1@9@@danf@17-8-2009
10610510@unknown@formal@none@1@S@By converting a sequence of items to a set of ''n''-grams, it can be embedded in a [[vector space]] (in other words, represented as a [[histogram]]), thus allowing the sequence to be compared to other sequences in an efficient manner.@@@@1@40@@danf@17-8-2009
10610520@unknown@formal@none@1@S@For example, if we convert strings with only letters in the English alphabet into 3-grams, we get a <math>26^3</math>-dimensional space (the first dimension measures the number of occurrences of "aaa", the second "aab", and so forth for all possible combinations of three letters).@@@@1@43@@danf@17-8-2009
10610530@unknown@formal@none@1@S@Using this representation, we lose information about the string.@@@@1@9@@danf@17-8-2009
10610540@unknown@formal@none@1@S@For example, both the strings "abcba" and "bcbab" give rise to exactly the same 2-grams.@@@@1@15@@danf@17-8-2009
10610550@unknown@formal@none@1@S@However, we know empirically that if two strings of real text have a similar vector representation (as measured by [[dot product|cosine distance]]) then they are likely to be similar.@@@@1@29@@danf@17-8-2009
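The comparison can be sketched as follows, representing each string as a sparse count vector of its character ''n''-grams and comparing the vectors by the cosine of the angle between them. The function names and the example strings other than "abcba" and "bcbab" are illustrative assumptions.

```python
from collections import Counter
from math import sqrt


def char_ngram_counts(text, n):
    """Sparse histogram of the character n-grams occurring in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[gram] * b[gram] for gram in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# "abcba" and "bcbab" contain the same 2-grams, so their vectors coincide.
same = cosine_similarity(char_ngram_counts("abcba", 2),
                         char_ngram_counts("bcbab", 2))
# Strings with no 2-gram in common have orthogonal vectors.
disjoint = cosine_similarity(char_ngram_counts("aaaa", 2),
                             char_ngram_counts("bbbb", 2))
```

Storing only the ''n''-grams that actually occur avoids materialising the full vector space, which for 3-grams over the English alphabet would already have 17,576 dimensions.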
10610560@unknown@formal@none@1@S@Other metrics have also been applied to vectors of ''n''-grams with varying, sometimes better, results.@@@@1@15@@danf@17-8-2009
10610570@unknown@formal@none@1@S@For example [[z-score]]s have been used to compare documents by examining how many standard deviations each ''n''-gram differs from its mean occurrence in a large collection, or [[text corpus]], of documents (which form the "background" vector).@@@@1@36@@danf@17-8-2009
10610580@unknown@formal@none@1@S@In the event of small counts, the [[g-score]] may give better results for comparing alternative models.@@@@1@16@@danf@17-8-2009
10610590@unknown@formal@none@1@S@It is also possible to take a more principled approach to the statistics of ''n''-grams, modeling similarity as the likelihood that two strings came from the same source directly in terms of a problem in [[Bayesian inference]].@@@@1@37@@danf@17-8-2009
10610600@unknown@formal@none@1@S@==Other applications==@@@@1@2@@danf@17-8-2009
10610610@unknown@formal@none@1@S@''n''-grams find use in several areas of computer science, [[computational linguistics]], and applied mathematics.@@@@1@14@@danf@17-8-2009
10610620@unknown@formal@none@1@S@They have been used to:@@@@1@5@@danf@17-8-2009
10610630@unknown@formal@none@1@S@* design [[kernel (mathematics)|kernels]] that allow [[machine learning]] algorithms such as [[support vector machine]]s to learn from string data@@@@1@19@@danf@17-8-2009
10610640@unknown@formal@none@1@S@* find likely candidates for the correct spelling of a misspelled word@@@@1@12@@danf@17-8-2009
10610650@unknown@formal@none@1@S@* improve compression in [[data compression|compression algorithms]] where a small area of data requires ''n''-grams of greater length@@@@1@18@@danf@17-8-2009
10610660@unknown@formal@none@1@S@* assess the probability of a given word sequence appearing in text of a language of interest in pattern recognition systems, [[speech recognition]], OCR ([[optical character recognition]]), [[Intelligent Character Recognition]] ([[ICR]]), [[machine translation]] and similar applications@@@@1@36@@danf@17-8-2009
10610670@unknown@formal@none@1@S@* improve retrieval in [[information retrieval]] systems when it is hoped to find similar "documents" (a term for which the conventional meaning is sometimes stretched, depending on the data set) given a single query document and a database of reference documents@@@@1@41@@danf@17-8-2009
10610680@unknown@formal@none@1@S@* improve retrieval performance in genetic sequence analysis as in the [[BLAST]] family of programs@@@@1@15@@danf@17-8-2009
10610690@unknown@formal@none@1@S@* identify the language a text is in or the species a small sequence of DNA was taken from@@@@1@19@@danf@17-8-2009
10610700@unknown@formal@none@1@S@* predict letters or words at random in order to create text, as in the [[dissociated press]] algorithm.@@@@1@18@@danf@17-8-2009
10610710@unknown@formal@none@1@S@== Bias-versus-variance trade-off ==@@@@1@4@@danf@17-8-2009
10610720@unknown@formal@none@1@S@What goes into picking the ''n'' for the ''n''-gram?@@@@1@9@@danf@17-8-2009
10610730@unknown@formal@none@1@S@There is the problem of balancing the weight given to ''infrequent grams'' (for example, if a proper name appeared in the training data) against that given to ''frequent grams''.@@@@1@23@@danf@17-8-2009
10610740@unknown@formal@none@1@S@Also, items not seen in the training data will be given a [[probability]] of 0.0 without [[smoothing]].@@@@1@17@@danf@17-8-2009
10610750@unknown@formal@none@1@S@For unseen but plausible data from a sample, one can introduce [[pseudocount]]s.@@@@1@12@@danf@17-8-2009
10610760@unknown@formal@none@1@S@Pseudocounts are generally motivated on Bayesian grounds.@@@@1@7@@danf@17-8-2009
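A minimal sketch of pseudocount (additive) smoothing follows. It assumes a fixed, known vocabulary size and adds the same pseudocount to every item; the function name <code>smoothed_probability</code> and the toy counts are hypothetical.

```python
from collections import Counter


def smoothed_probability(counts, item, vocabulary_size, pseudocount=1.0):
    """Additive smoothing: pretend every item was seen `pseudocount` extra times."""
    total = sum(counts.values())
    return (counts[item] + pseudocount) / (total + pseudocount * vocabulary_size)


# Hypothetical training data: "a" seen twice, "b" once, "c" never.
counts = Counter("aab")
vocabulary = ["a", "b", "c"]
probabilities = {item: smoothed_probability(counts, item, len(vocabulary))
                 for item in vocabulary}
```

With a pseudocount of 1 the unseen item "c" receives probability 1/6 instead of 0, while the estimates still sum to 1 over the vocabulary. Larger pseudocounts pull the estimates further towards a uniform distribution.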
10610770@unknown@formal@none@1@S@=== Smoothing techniques ===@@@@1@4@@danf@17-8-2009
10610780@unknown@formal@none@1@S@* [[Linear interpolation]] (e.g., taking the [[weighted mean]] of the unigram, bigram, and trigram)@@@@1@14@@danf@17-8-2009
10610790@unknown@formal@none@1@S@* [[Good-Turing]] discounting@@@@1@3@@danf@17-8-2009
10610800@unknown@formal@none@1@S@* [[Witten-Bell discounting]]@@@@1@3@@danf@17-8-2009
10610810@unknown@formal@none@1@S@* [[Katz's back-off model]] (trigram)@@@@1@5@@danf@17-8-2009
10610820@unknown@formal@none@1@S@==Google use of N-gram==@@@@1@4@@danf@17-8-2009
10610830@unknown@formal@none@1@S@[[Google]] uses n-gram models for a variety of R&D projects, such as [[statistical machine translation]], [[speech recognition]], [[Spell checker|checking spelling]], [[entity detection]], and [[information extraction|data mining]].@@@@1@26@@danf@17-8-2009
10610840@unknown@formal@none@1@S@In September of 2006 [http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html Google announced] that they made their n-grams [http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13 public] at the [[Linguistic Data Consortium]] ([http://www.ldc.upenn.edu/ LDC]).@@@@1@21@@danf@17-8-2009
10620010@unknown@formal@none@1@S@Noun@@@@1@1@@danf@17-8-2009
10620020@unknown@formal@none@1@S@In [[linguistics]], a '''noun''' is a member of a large, [[open class (linguistics)|open]] [[lexical category]] whose members can occur as the main word in the [[subject (grammar)|subject]] of a [[clause]], the [[object (grammar)|object]] of a [[verb]], or the object of a [[preposition]].@@@@1@42@@danf@17-8-2009
10620030@unknown@formal@none@1@S@Lexical categories are defined in terms of how their members combine with other kinds of expressions.@@@@1@16@@danf@17-8-2009
10620040@unknown@formal@none@1@S@The syntactic rules for nouns differ from language to language.@@@@1@10@@danf@17-8-2009
10620050@unknown@formal@none@1@S@In [[English language|English]], nouns may be defined as those words which can occur with articles and [[adjective|attributive adjectives]] and can function as the [[phrase|head]] of a [[noun phrase]].@@@@1@28@@danf@17-8-2009
10620060@unknown@formal@none@1@S@In [[traditional grammar|traditional]] English grammar, the noun is one of the eight [[parts of speech]].@@@@1@15@@danf@17-8-2009
10620070@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10620080@unknown@formal@none@1@S@The word comes from the [[Latin]] ''nomen'' meaning "[[name]]".@@@@1@9@@danf@17-8-2009
10620090@unknown@formal@none@1@S@Word classes like nouns were first described by the Sanskrit grammarian [[Panini (grammarian)|{{IAST|Pāṇini}}]] and ancient Greeks like [[Dionysios Thrax]]; and were defined in terms of their [[morphology (linguistics)|morphological]] properties.@@@@1@29@@danf@17-8-2009
10620100@unknown@formal@none@1@S@For example, in Ancient Greek, nouns inflect for [[case (grammar)|grammatical case]], such as dative or accusative.@@@@1@16@@danf@17-8-2009
10620110@unknown@formal@none@1@S@[[Verb]]s, on the other hand, inflect for [[grammatical tense|tenses]], such as past, present or future, while nouns do not.@@@@1@19@@danf@17-8-2009
10620120@unknown@formal@none@1@S@[[Aristotle]] also had a notion of ''onomata'' (nouns) and ''rhemata'' (verbs) which, however, does not exactly correspond with modern notions of nouns and verbs.@@@@1@24@@danf@17-8-2009
10620130@unknown@formal@none@1@S@Vinokurova 2005 has a more detailed discussion of the historical origin of the notion of a noun.@@@@1@17@@danf@17-8-2009
10620140@unknown@formal@none@1@S@==Different definitions of nouns==@@@@1@4@@danf@17-8-2009
10620150@unknown@formal@none@1@S@Expressions of [[natural language]] have properties at different levels.@@@@1@9@@danf@17-8-2009
10620160@unknown@formal@none@1@S@They have ''formal'' properties, like what kinds of [[morphology (linguistics)|morphological]] [[prefix]]es or [[suffix]]es they take and what kinds of other expressions they combine with; but they also have [[semantics|semantic]] properties, i.e. properties pertaining to their meaning.@@@@1@36@@danf@17-8-2009
10620170@unknown@formal@none@1@S@The definition of a noun at the outset of this page is thus a ''formal'', traditional grammatical definition.@@@@1@18@@danf@17-8-2009
10620180@unknown@formal@none@1@S@That definition is, for the most part, considered uncontroversial and gives language users a practical means of distinguishing most nouns from non-nouns.@@@@1@24@@danf@17-8-2009
10620190@unknown@formal@none@1@S@However, it has the disadvantage that it does not apply to nouns in all languages.@@@@1@15@@danf@17-8-2009
10620200@unknown@formal@none@1@S@For example, in [[Russian language|Russian]], there are no definite articles, so one cannot define nouns as words that are modified by definite articles.@@@@1@23@@danf@17-8-2009
10620210@unknown@formal@none@1@S@There have also been several attempts to define nouns in terms of their [[semantics|semantic]] properties.@@@@1@14@@danf@17-8-2009
10620220@unknown@formal@none@1@S@Many of these are controversial, but some are discussed below.@@@@1@10@@danf@17-8-2009
10620230@unknown@formal@none@1@S@===Names for things===@@@@1@3@@danf@17-8-2009
10620240@unknown@formal@none@1@S@In [[Traditional grammar|traditional school grammars]], one often encounters the definition of nouns that they are all and only those expressions that refer to a ''person'', ''place'', ''thing'', ''event'', ''substance'', ''quality'', or ''idea'', etc.@@@@1@33@@danf@17-8-2009
10620250@unknown@formal@none@1@S@This is a ''semantic'' definition.@@@@1@5@@danf@17-8-2009
10620260@unknown@formal@none@1@S@It has been criticized by contemporary linguists as being uninformative.@@@@1@10@@danf@17-8-2009
10620270@unknown@formal@none@1@S@Contemporary linguists generally agree that one cannot successfully define nouns (or other grammatical categories) in terms of what sort of ''object in the world'' they ''[[reference|refer]] to'' or ''[[signification|signify]]''.@@@@1@29@@danf@17-8-2009
10620280@unknown@formal@none@1@S@Part of the [[conundrum]] is that the definition makes use of relatively ''general'' nouns ("thing", "phenomenon", "event") to define what nouns ''are''.@@@@1@22@@danf@17-8-2009
10620290@unknown@formal@none@1@S@The existence of such ''general'' nouns demonstrates that nouns refer to entities that are organized in [[taxonomy|taxonomic]] [[hierarchies]].@@@@1@18@@danf@17-8-2009
10620300@unknown@formal@none@1@S@But other kinds of expressions are also organized into such structured taxonomic relationships.@@@@1@13@@danf@17-8-2009
10620310@unknown@formal@none@1@S@For example, the verbs "stroll", "saunter", "stride", and "tread" are more specific words than the more ''general'' "walk".@@@@1@17@@danf@17-8-2009
10620320@unknown@formal@none@1@S@Moreover, "walk" is more specific than the verb "move", which, in turn, is less general than "change".@@@@1@17@@danf@17-8-2009
10620330@unknown@formal@none@1@S@But it is unlikely that such taxonomic relationships can be used to ''define'' nouns and verbs.@@@@1@16@@danf@17-8-2009
10620340@unknown@formal@none@1@S@We cannot ''define'' verbs as those words that refer to "changes" or "states", for example, because the nouns ''change'' and ''state'' probably refer to such things, but, of course, aren't verbs.@@@@1@31@@danf@17-8-2009
10620350@unknown@formal@none@1@S@Similarly, nouns like "invasion", "meeting", or "collapse" refer to things that are "done" or "happen".@@@@1@15@@danf@17-8-2009
10620360@unknown@formal@none@1@S@In fact, an influential [[theory]] has it that verbs like "kill" or "die" refer to events, which is among the sort of thing that nouns are supposed to refer to.@@@@1@30@@danf@17-8-2009
10620370@unknown@formal@none@1@S@The point being made here is not that this view of verbs is wrong, but rather that this property of verbs is a poor basis for a ''definition'' of this category, just like the property of ''having wheels'' is a poor basis for a definition of cars (some things that have wheels, such as my suitcase or a jumbo jet, aren't cars).@@@@1@62@@danf@17-8-2009
10620380@unknown@formal@none@1@S@Similarly, adjectives like "yellow" or "difficult" might be thought to refer to qualities, and adverbs like "outside" or "upstairs" seem to refer to places, which are also among the sorts of things nouns can refer to.@@@@1@36@@danf@17-8-2009
10620390@unknown@formal@none@1@S@But verbs, adjectives and adverbs are not nouns, and nouns aren't verbs, adjectives or adverbs.@@@@1@15@@danf@17-8-2009
10620400@unknown@formal@none@1@S@One might argue that "definitions" of this sort really rely on speakers' prior intuitive knowledge of what nouns, verbs and adjectives are, and so don't really add anything over and above this.@@@@1@32@@danf@17-8-2009
10620410@unknown@formal@none@1@S@Speakers' intuitive knowledge of such things might plausibly be based on ''formal'' criteria, such as the aforementioned traditional grammatical definition of English nouns.@@@@1@23@@danf@17-8-2009
10620420@unknown@formal@none@1@S@===Prototypically referential expressions===@@@@1@3@@danf@17-8-2009
10620430@unknown@formal@none@1@S@Another semantic definition of nouns is that they are ''prototypically referential.''@@@@1@11@@danf@17-8-2009
10620440@unknown@formal@none@1@S@That definition is also not very helpful in distinguishing actual nouns from verbs.@@@@1@13@@danf@17-8-2009
10620450@unknown@formal@none@1@S@But it may still correctly identify a core property of nounhood.@@@@1@11@@danf@17-8-2009
10620460@unknown@formal@none@1@S@For example, we will tend to use nouns like "fool" and "car" when we wish to refer to fools and cars, respectively.@@@@1@22@@danf@17-8-2009
10620470@unknown@formal@none@1@S@The notion that this is '''prototypical''' reflects the fact that such nouns can be used, even though nothing with the corresponding property is referred to:@@@@1@25@@danf@17-8-2009
10620480@unknown@formal@none@1@S@:John is no '''fool'''.@@@@1@4@@danf@17-8-2009
10620490@unknown@formal@none@1@S@:If I had a '''car''', I'd go to Marrakech.@@@@1@9@@danf@17-8-2009
10620500@unknown@formal@none@1@S@The first sentence above doesn't refer to any fools, nor does the second one refer to any particular car.@@@@1@19@@danf@17-8-2009
10620510@unknown@formal@none@1@S@===Predicates with identity criteria===@@@@1@4@@danf@17-8-2009
10620520@unknown@formal@none@1@S@The British logician [[Peter Thomas Geach]] proposed a very subtle semantic definition of nouns.@@@@1@14@@danf@17-8-2009
10620530@unknown@formal@none@1@S@He noticed that adjectives like "same" can modify nouns, but no other kinds of parts of speech, like [[verbs]] or [[adjectives]].@@@@1@21@@danf@17-8-2009
10620540@unknown@formal@none@1@S@Not only that, but there also don't seem to be any ''other'' expressions with similar meaning that can modify verbs and adjectives.@@@@1@22@@danf@17-8-2009
10620550@unknown@formal@none@1@S@Consider the following examples.@@@@1@4@@danf@17-8-2009
10620560@unknown@formal@none@1@S@: Good: John and Bill participated in the '''same''' fight.@@@@1@10@@danf@17-8-2009
10620570@unknown@formal@none@1@S@: Bad:@@@@1@2@@danf@17-8-2009
10620580@unknown@formal@none@1@S@*John and Bill '''samely''' fought.@@@@1@5@@danf@17-8-2009
10620590@unknown@formal@none@1@S@There is no English adverb "samely".@@@@1@6@@danf@17-8-2009
10620600@unknown@formal@none@1@S@In some other languages, like Czech, however, there are adverbs corresponding to "samely".@@@@1@13@@danf@17-8-2009
10620610@unknown@formal@none@1@S@Hence, in Czech, the translation of the last sentence would be fine; however, it would mean that John and Bill fought ''in the same way'': not that they participated in the ''same fight''.@@@@1@33@@danf@17-8-2009
10620620@unknown@formal@none@1@S@Geach proposed that we could explain this if nouns denote logical [[predicate (grammar)|predicate]]s with '''identity criteria'''.@@@@1@16@@danf@17-8-2009
10620630@unknown@formal@none@1@S@An identity criterion would allow us to conclude, for example, that "person x at time 1 is ''the same person'' as person y at time 2".@@@@1@26@@danf@17-8-2009
10620640@unknown@formal@none@1@S@Different nouns can have different identity criteria.@@@@1@7@@danf@17-8-2009
10620650@unknown@formal@none@1@S@A well-known example of this is due to Gupta:@@@@1@9@@danf@17-8-2009
10620660@unknown@formal@none@1@S@:National Airlines transported 2 million '''passengers''' in 1979.@@@@1@8@@danf@17-8-2009
10620670@unknown@formal@none@1@S@:National Airlines transported (at least) 2 million '''persons''' in 1979.@@@@1@10@@danf@17-8-2009
10620680@unknown@formal@none@1@S@Given that, in general, all passengers are persons, the last sentence above ought to follow logically from the first one.@@@@1@20@@danf@17-8-2009
10620690@unknown@formal@none@1@S@But it doesn't.@@@@1@3@@danf@17-8-2009
10620700@unknown@formal@none@1@S@It is easy to imagine, for example, that on average, every person who travelled with National Airlines in 1979 travelled with them twice.@@@@1@23@@danf@17-8-2009
10620710@unknown@formal@none@1@S@In that case, one would say that the airline transported 2 million ''passengers'' but only 1 million ''persons''.@@@@1@18@@danf@17-8-2009
10620720@unknown@formal@none@1@S@Thus, the way that we count ''passengers'' isn't necessarily the same as the way that we count ''persons''.@@@@1@18@@danf@17-8-2009
10620730@unknown@formal@none@1@S@Put somewhat differently: At two different times, ''you'' may correspond to two distinct ''passengers'', even though you are one and the same person.@@@@1@23@@danf@17-8-2009
10620740@unknown@formal@none@1@S@For a precise definition of ''identity criteria'', see Gupta.@@@@1@9@@danf@17-8-2009
10620750@unknown@formal@none@1@S@Recently, Baker has proposed that Geach's definition of nouns in terms of identity criteria allows us to ''explain'' the characteristic properties of nouns.@@@@1@23@@danf@17-8-2009
10620760@unknown@formal@none@1@S@He argues that nouns can co-occur with (in-)definite articles and numerals, and are "prototypically referential" ''because'' they are all and only those [[parts of speech]] that provide identity criteria.@@@@1@29@@danf@17-8-2009
10620770@unknown@formal@none@1@S@Baker's proposals are quite new, and linguists are still evaluating them.@@@@1@11@@danf@17-8-2009
10620780@unknown@formal@none@1@S@==Classification of nouns in English==@@@@1@5@@danf@17-8-2009
10620790@unknown@formal@none@1@S@===Proper nouns and common nouns===@@@@1@5@@danf@17-8-2009
10620800@unknown@formal@none@1@S@''Proper nouns'' (also called ''[[proper name]]s'') are nouns representing unique entities (such as ''London'', ''Universe'' or ''John''), as distinguished from common nouns which describe a class of entities (such as ''city'', ''planet'' or ''person'').@@@@1@34@@danf@17-8-2009
10620810@unknown@formal@none@1@S@In [[English language|English]] and most other languages that use the [[Latin alphabet]], proper nouns are usually [[capitalization|capitalized]].@@@@1@17@@danf@17-8-2009
10620820@unknown@formal@none@1@S@Languages differ in whether most elements of multiword proper nouns are capitalized (e.g., American English ''House of Representatives'') or only the initial element (e.g., Slovenian ''Državni zbor'' 'National Assembly').@@@@1@29@@danf@17-8-2009
10620830@unknown@formal@none@1@S@In [[German language|German]], nouns of all types are capitalized.@@@@1@9@@danf@17-8-2009
10620840@unknown@formal@none@1@S@The convention of capitalizing ''all'' nouns was previously used in English, but ended circa 1800.@@@@1@15@@danf@17-8-2009
10620850@unknown@formal@none@1@S@In America, the shift in capitalization is recorded in several noteworthy documents.@@@@1@12@@danf@17-8-2009
10620860@unknown@formal@none@1@S@The end (but not the beginning) of the [[United States Declaration of Independence#Annotated text of the Declaration|Declaration of Independence]] (1776) and all of the [[United States Constitution|Constitution]] (1787) show nearly all nouns capitalized, the [[United States Bill of Rights#Text of the Bill of Rights|Bill of Rights]] (1789) capitalizes a few common nouns but not most of them, and the [[Thirteenth Amendment to the United States Constitution|Thirteenth Constitutional Amendment]] (1865) only capitalizes proper nouns.@@@@1@73@@danf@17-8-2009
10620870@unknown@formal@none@1@S@Sometimes the same word can function as both a common noun and a proper noun, where one such entity is special.@@@@1@21@@danf@17-8-2009
10620880@unknown@formal@none@1@S@For example, the common noun ''god'' denotes all deities, while the proper noun ''God'' refers to the [[monotheism|monotheistic]] [[God]] specifically.@@@@1@20@@danf@17-8-2009
10620890@unknown@formal@none@1@S@Owing to the essentially arbitrary nature of [[Orthography|orthographic]] classification and the existence of variant authorities and adopted [[Style guide|''house styles'']], questionable capitalization of words is not uncommon, even in respected newspapers and magazines.@@@@1@33@@danf@17-8-2009
10620900@unknown@formal@none@1@S@Most publishers, however, properly require ''consistency'', at least within the same document, in applying their specified standard.@@@@1@17@@danf@17-8-2009
10620910@unknown@formal@none@1@S@The common meaning of the word or words constituting a proper noun may be unrelated to the object to which the proper noun refers.@@@@1@24@@danf@17-8-2009
10620920@unknown@formal@none@1@S@For example, someone might be named "Tiger Smith" despite being neither a [[tiger]] nor a [[smith (metalwork)|smith]].@@@@1@17@@danf@17-8-2009
10620930@unknown@formal@none@1@S@For this reason, proper nouns are usually not [[translation|translated]] between languages, although they may be [[transliteration|transliterated]].@@@@1@16@@danf@17-8-2009
10620940@unknown@formal@none@1@S@For example, the German surname ''Knödel'' becomes ''Knodel'' or ''Knoedel'' in English (not the literal ''Dumpling'').@@@@1@16@@danf@17-8-2009
10620950@unknown@formal@none@1@S@However, the [[Transliteration|transcription]] of place names and the names of [[monarch]]s, [[pope]]s, and non-contemporary [[author]]s is common and sometimes universal.@@@@1@20@@danf@17-8-2009
10620960@unknown@formal@none@1@S@For instance, the [[Portuguese language|Portuguese]] word ''Lisboa'' becomes ''[[Lisbon]]'' in [[English language|English]]; the English ''London'' becomes ''Londres'' in French; and the [[ancient Greek|Greek]] ''Aristotelēs'' becomes [[Aristotle]] in English.@@@@1@28@@danf@17-8-2009
10620970@unknown@formal@none@1@S@===Countable and uncountable nouns===@@@@1@4@@danf@17-8-2009
10620980@unknown@formal@none@1@S@''Count nouns'' are common nouns that can take a [[plural]], can combine with [[numerals]] or [[quantifiers]] (e.g. "one", "two", "several", "every", "most"), and can take an indefinite article ("a" or "an").@@@@1@31@@danf@17-8-2009
10620990@unknown@formal@none@1@S@Examples of count nouns are "chair", "nose", and "occasion".@@@@1@9@@danf@17-8-2009
10621000@unknown@formal@none@1@S@''Mass nouns'' (or ''non-count nouns'') differ from count nouns in precisely that respect: they can't take a plural or combine with number words or quantifiers.@@@@1@25@@danf@17-8-2009
10621010@unknown@formal@none@1@S@Examples from English include "laughter", "cutlery", "helium", and "furniture".@@@@1@9@@danf@17-8-2009
10621020@unknown@formal@none@1@S@For example, it is not possible to refer to "a furniture" or "three furnitures".@@@@1@14@@danf@17-8-2009
10621030@unknown@formal@none@1@S@This is true even though the pieces of furniture comprising "furniture" could be counted.@@@@1@14@@danf@17-8-2009
10621040@unknown@formal@none@1@S@Thus the distinction between mass and count nouns shouldn't be made in terms of what sorts of things the nouns ''refer'' to, but rather in terms of how the nouns ''present'' these entities.@@@@1@33@@danf@17-8-2009
10621050@unknown@formal@none@1@S@===Collective nouns===@@@@1@2@@danf@17-8-2009
10621060@unknown@formal@none@1@S@''Collective nouns'' are nouns that refer to ''groups'' consisting of more than one individual or entity, even when they are inflected for the [[Grammatical number|singular]].@@@@1@25@@danf@17-8-2009
10621070@unknown@formal@none@1@S@Examples include "committee", "herd", and "school" (of herring).@@@@1@8@@danf@17-8-2009
10621080@unknown@formal@none@1@S@These nouns have slightly different grammatical properties from other nouns.@@@@1@10@@danf@17-8-2009
10621090@unknown@formal@none@1@S@For example, the [[noun phrases]] that they [[head (syntax)|head]] can serve as the [[subject (grammar)|subject]] of a [[collective predicate]], even when they are inflected for the singular.@@@@1@27@@danf@17-8-2009
10621100@unknown@formal@none@1@S@A [[collective predicate]] is a predicate that normally can't take a singular subject.@@@@1@13@@danf@17-8-2009
10621110@unknown@formal@none@1@S@An example of the latter is "talked to each other".@@@@1@10@@danf@17-8-2009
10621120@unknown@formal@none@1@S@:Good: The '''boys''' talked to each other.@@@@1@7@@danf@17-8-2009
10621130@unknown@formal@none@1@S@:Bad: *The '''boy''' talked to each other.@@@@1@7@@danf@17-8-2009
10621140@unknown@formal@none@1@S@:Good: The '''committee''' talked to each other.@@@@1@7@@danf@17-8-2009
10621150@unknown@formal@none@1@S@===Concrete nouns and abstract nouns===@@@@1@5@@danf@17-8-2009
10621160@unknown@formal@none@1@S@''Concrete nouns'' refer to [[physical bodies]] that can be observed with at least one of the [[sense]]s.@@@@1@17@@danf@17-8-2009
10621170@unknown@formal@none@1@S@Examples include "chair", "apple", and "Janet".@@@@1@6@@danf@17-8-2009
10621180@unknown@formal@none@1@S@''Abstract nouns'', on the other hand, refer to [[abstract object]]s, that is, ideas or concepts, such as "justice" or "hate".@@@@1@20@@danf@17-8-2009
10621190@unknown@formal@none@1@S@While this distinction is sometimes useful, the boundary between the two of them is not always clear; consider, for example, the noun "art".@@@@1@23@@danf@17-8-2009
10621200@unknown@formal@none@1@S@In English, many abstract nouns are formed by adding noun-forming suffixes ("-ness", "-ity", "-tion") to adjectives or verbs.@@@@1@18@@danf@17-8-2009
10621210@unknown@formal@none@1@S@Examples are "happiness", "circulation" and "serenity".@@@@1@6@@danf@17-8-2009
10621220@unknown@formal@none@1@S@==Nouns and pronouns==@@@@1@3@@danf@17-8-2009
10621230@unknown@formal@none@1@S@[[Noun phrase]]s can typically be replaced by [[pronoun]]s, such as "he", "it", "which", and "those", in order to avoid repetition or explicit identification, or for other reasons.@@@@1@27@@danf@17-8-2009
10621240@unknown@formal@none@1@S@For example, in the sentence "Janet thought that he was weird", the word "he" is a pronoun standing in place of the name of the person in question.@@@@1@28@@danf@17-8-2009
10621250@unknown@formal@none@1@S@The English word ''one'' can replace parts of [[noun phrase]]s, and it sometimes stands in for a noun.@@@@1@18@@danf@17-8-2009
10621260@unknown@formal@none@1@S@An example is given below:@@@@1@5@@danf@17-8-2009
10621270@unknown@formal@none@1@S@: John's car is newer than ''the one'' that Bill has.@@@@1@11@@danf@17-8-2009
10621280@unknown@formal@none@1@S@But ''one'' can also stand in for bigger subparts of a noun phrase.@@@@1@13@@danf@17-8-2009
10621290@unknown@formal@none@1@S@In the following example, ''one'' stands in for ''new car''.@@@@1@10@@danf@17-8-2009
10621300@unknown@formal@none@1@S@: This new car is cheaper than ''that one''.@@@@1@9@@danf@17-8-2009
10621310@unknown@formal@none@1@S@==Substantive as a word for "noun"==@@@@1@6@@danf@17-8-2009
10621320@unknown@formal@none@1@S@Starting with old [[Latin language|Latin]] grammars, many European languages use some form of the word ''substantive'' as the basic term for noun.@@@@1@22@@danf@17-8-2009
10621330@unknown@formal@none@1@S@Nouns in the dictionaries of such languages are marked by the abbreviation "s" instead of "n", which may be used for proper nouns instead.@@@@1@24@@danf@17-8-2009
10621340@unknown@formal@none@1@S@This corresponds to those grammars in which nouns and adjectives phase into each other in more areas than, for example, the English term [[Predicative_adjective#Predicative_adjective|predicate adjective]] entails.@@@@1@26@@danf@17-8-2009
10621350@unknown@formal@none@1@S@In French and Spanish, for example, adjectives frequently act as nouns referring to people who have the characteristics of the adjective.@@@@1@21@@danf@17-8-2009
10621360@unknown@formal@none@1@S@An example in English is:@@@@1@5@@danf@17-8-2009
10621370@unknown@formal@none@1@S@: The ''poor'' you have always with you.@@@@1@8@@danf@17-8-2009
10621380@unknown@formal@none@1@S@Similarly, an adjective can also be used for a whole group or organization of people:@@@@1@15@@danf@17-8-2009
10621390@unknown@formal@none@1@S@: The Socialist ''International''.@@@@1@4@@danf@17-8-2009
10621400@unknown@formal@none@1@S@Hence, these words are substantives that are usually adjectives in English.@@@@1@11@@danf@17-8-2009
10630010@unknown@formal@none@1@S@Ontology (information science)@@@@1@3@@danf@17-8-2009
10630020@unknown@formal@none@1@S@In both [[computer science]] and [[information science]], an '''ontology''' is a formal representation of a set of concepts within a [[Domain of discourse|domain]] and the relationships between those concepts.@@@@1@29@@danf@17-8-2009
10630030@unknown@formal@none@1@S@It is used to [[Reasoning|reason]] about the properties of that domain, and may be used to define the domain.@@@@1@19@@danf@17-8-2009
10630040@unknown@formal@none@1@S@Ontologies are used in [[artificial intelligence]], the [[Semantic Web]], [[software engineering]], [[biomedical informatics]], [[library science]], and [[information architecture]] as a form of [[knowledge representation]] about the world or some part of it.@@@@1@32@@danf@17-8-2009
10630050@unknown@formal@none@1@S@Common components of ontologies include:@@@@1@5@@danf@17-8-2009
10630060@unknown@formal@none@1@S@* Individuals: instances or objects (the basic or "ground level" objects)@@@@1@11@@danf@17-8-2009
10630070@unknown@formal@none@1@S@* [[Class]]es: [[set (computer science)|set]]s, collections, concepts or types of objects@@@@1@11@@danf@17-8-2009
10630080@unknown@formal@none@1@S@* [[Attribute (computing)|Attribute]]s: properties, features, characteristics, or parameters that objects (and classes) can have@@@@1@14@@danf@17-8-2009
10630090@unknown@formal@none@1@S@* [[Relation (mathematics)|Relations]]: ways that classes and objects can be related to one another@@@@1@14@@danf@17-8-2009
10630100@unknown@formal@none@1@S@* Function terms: complex structures formed from certain relations that can be used in place of an individual term in a statement@@@@1@22@@danf@17-8-2009
10630110@unknown@formal@none@1@S@* Restrictions: formally stated descriptions of what must be true in order for some assertion to be accepted as input@@@@1@20@@danf@17-8-2009
10630120@unknown@formal@none@1@S@* Rules: statements in the form of an if-then (antecedent-consequent) sentence that describe the logical inferences that can be drawn from an assertion in a particular form@@@@1@27@@danf@17-8-2009
10630130@unknown@formal@none@1@S@* Axioms: assertions (including rules) in a logical form that together comprise the overall theory that the ontology describes in its domain of application.@@@@1@24@@danf@17-8-2009
10630140@unknown@formal@none@1@S@This definition differs from that of "axioms" in generative grammar and formal logic.@@@@1@13@@danf@17-8-2009
10630150@unknown@formal@none@1@S@In these disciplines, axioms include only statements asserted as ''a priori'' knowledge.@@@@1@12@@danf@17-8-2009
10630160@unknown@formal@none@1@S@As used here, "axioms" also include the theory derived from axiomatic statements.@@@@1@12@@danf@17-8-2009
10630170@unknown@formal@none@1@S@* [[Event (philosophy)|Events]]: the changing of attributes or relations@@@@1@9@@danf@17-8-2009
10630180@unknown@formal@none@1@S@Ontologies are commonly encoded using [[ontology language]]s.@@@@1@7@@danf@17-8-2009
10630190@unknown@formal@none@1@S@== Elements ==@@@@1@3@@danf@17-8-2009
10630200@unknown@formal@none@1@S@Contemporary ontologies share many structural similarities, regardless of the language in which they are expressed.@@@@1@15@@danf@17-8-2009
10630210@unknown@formal@none@1@S@As mentioned above, most ontologies describe individuals (instances), classes (concepts), attributes, and relations.@@@@1@13@@danf@17-8-2009
10630220@unknown@formal@none@1@S@In this section each of these components is discussed in turn.@@@@1@11@@danf@17-8-2009
10630230@unknown@formal@none@1@S@=== Individuals ===@@@@1@3@@danf@17-8-2009
10630240@unknown@formal@none@1@S@Individuals (instances) are the basic, "ground level" components of an ontology.@@@@1@11@@danf@17-8-2009
10630250@unknown@formal@none@1@S@The individuals in an ontology may include concrete objects such as people, animals, tables, automobiles, molecules, and planets, as well as abstract individuals such as numbers and words.@@@@1@28@@danf@17-8-2009
10630260@unknown@formal@none@1@S@Strictly speaking, an ontology need not include any individuals, but one of the general purposes of an ontology is to provide a means of classifying individuals, even if those individuals are not explicitly part of the ontology.@@@@1@37@@danf@17-8-2009
10630270@unknown@formal@none@1@S@In formal extensional ontologies, only the utterances of words and numbers are considered individuals – the numbers and names themselves are classes.@@@@1@22@@danf@17-8-2009
10630280@unknown@formal@none@1@S@In a 4D ontology, an individual is identified by its spatio-temporal extent.@@@@1@12@@danf@17-8-2009
10630290@unknown@formal@none@1@S@Examples of formal extensional ontologies are [[ISO 15926]] and the model in development by the [[IDEAS Group]].@@@@1@17@@danf@17-8-2009
10630300@unknown@formal@none@1@S@=== Classes ===@@@@1@3@@danf@17-8-2009
10630310@unknown@formal@none@1@S@Classes – concepts that are also called ''type'', ''sort'', ''category'', and ''kind'' – are abstract groups, sets, or collections of objects.@@@@1@21@@danf@17-8-2009
10630320@unknown@formal@none@1@S@They may contain individuals, other classes, or a combination of both.@@@@1@11@@danf@17-8-2009
10630330@unknown@formal@none@1@S@Some examples of classes:@@@@1@4@@danf@17-8-2009
10630340@unknown@formal@none@1@S@* ''Person'', the class of all people@@@@1@7@@danf@17-8-2009
10630350@unknown@formal@none@1@S@* ''Vehicle'', the class of all vehicles@@@@1@7@@danf@17-8-2009
10630360@unknown@formal@none@1@S@* ''Car'', the class of all cars@@@@1@7@@danf@17-8-2009
10630370@unknown@formal@none@1@S@* ''Class'', representing the class of all classes@@@@1@8@@danf@17-8-2009
10630380@unknown@formal@none@1@S@* ''Thing'', representing the class of all things@@@@1@8@@danf@17-8-2009
10630390@unknown@formal@none@1@S@Ontologies vary on whether classes can contain other classes, whether a class can belong to itself, whether there is a universal class (that is, a class containing everything), etc.@@@@1@29@@danf@17-8-2009
10630400@unknown@formal@none@1@S@Sometimes restrictions along these lines are made in order to avoid certain well-known [[paradox]]es.@@@@1@14@@danf@17-8-2009
10630410@unknown@formal@none@1@S@The classes of an ontology may be [[extensional]] or [[intensional]] in nature.@@@@1@12@@danf@17-8-2009
10630420@unknown@formal@none@1@S@A class is extensional if and only if it is characterized solely by its membership.@@@@1@15@@danf@17-8-2009
10630430@unknown@formal@none@1@S@More precisely, a class C is extensional if and only if for any class C', if C' has exactly the same members as C, then C and C' are identical.@@@@1@30@@danf@17-8-2009
10630440@unknown@formal@none@1@S@If a class does not satisfy this condition, then it is intensional.@@@@1@12@@danf@17-8-2009
10630450@unknown@formal@none@1@S@While extensional classes are better behaved and better understood mathematically, as well as less problematic philosophically, they do not permit the fine-grained distinctions that ontologies often need to make.@@@@1@29@@danf@17-8-2009
10630460@unknown@formal@none@1@S@For example, an ontology may want to distinguish between the class of all creatures with a kidney and the class of all creatures with a heart, even if these classes happen to have exactly the same members.@@@@1@37@@danf@17-8-2009
10630470@unknown@formal@none@1@S@In the upper ontologies mentioned above, the classes are defined intensionally.@@@@1@11@@danf@17-8-2009
10630480@unknown@formal@none@1@S@Intensionally defined classes usually have necessary conditions associated with membership in each class.@@@@1@13@@danf@17-8-2009
10630490@unknown@formal@none@1@S@Some classes may also have sufficient conditions, and in those cases the combination of necessary and sufficient conditions makes that class a fully ''defined'' class.@@@@1@25@@danf@17-8-2009
10630500@unknown@formal@none@1@S@Importantly, a class can subsume or be subsumed by other classes; a class subsumed by another is called a ''subclass'' of the subsuming class.@@@@1@24@@danf@17-8-2009
10630510@unknown@formal@none@1@S@For example, ''Vehicle'' subsumes ''Car'', since (necessarily) anything that is a member of the latter class is a member of the former.@@@@1@22@@danf@17-8-2009
10630520@unknown@formal@none@1@S@The subsumption relation is used to create a hierarchy of classes, typically with a maximally general class like ''Thing'' at the top, and very specific classes like ''2002 Ford Explorer'' at the bottom.@@@@1@33@@danf@17-8-2009
10630530@unknown@formal@none@1@S@The critically important consequence of the subsumption relation is the inheritance of properties from the parent (subsuming) class to the child (subsumed) class.@@@@1@23@@danf@17-8-2009
10630540@unknown@formal@none@1@S@Thus, anything that is necessarily true of a parent class is also necessarily true of all of its subsumed child classes.@@@@1@21@@danf@17-8-2009
10630550@unknown@formal@none@1@S@In some ontologies, a class is only allowed to have one parent (''single inheritance''), but in most ontologies, classes are allowed to have any number of parents (''multiple inheritance''), and in the latter case all necessary properties of each parent are inherited by the subsumed child class.@@@@1@47@@danf@17-8-2009
10630560@unknown@formal@none@1@S@Thus a particular class of animal (''HouseCat'') may be a child of the class ''Cat'' and also a child of the class ''Pet''.@@@@1@23@@danf@17-8-2009
10630570@unknown@formal@none@1@S@A partition is a set of related classes and associated rules that allow objects to be placed into the appropriate class.@@@@1@21@@danf@17-8-2009
10630580@unknown@formal@none@1@S@For example, to the right is the partial diagram of an ontology that has a partition of the ''Car'' class into the classes ''2-Wheel Drive'' and ''4-Wheel Drive''.@@@@1@28@@danf@17-8-2009
10630590@unknown@formal@none@1@S@The partition rule determines if a particular car is placed in the ''2-Wheel Drive'' or the ''4-Wheel Drive'' class.@@@@1@19@@danf@17-8-2009
10630600@unknown@formal@none@1@S@If the partition rule(s) guarantee that a single ''Car'' cannot be in both classes, then the partition is called a disjoint partition.@@@@1@22@@danf@17-8-2009
10630610@unknown@formal@none@1@S@If the partition rules ensure that every concrete object in the super-class is an instance of at least one of the partition classes, then the partition is called an exhaustive partition.@@@@1@31@@danf@17-8-2009
10630620@unknown@formal@none@1@S@=== Attributes ===@@@@1@3@@danf@17-8-2009
10630630@unknown@formal@none@1@S@Objects in the ontology can be described by assigning attributes to them.@@@@1@12@@danf@17-8-2009
10630640@unknown@formal@none@1@S@Each attribute has at least a name and a value, and is used to store information that is specific to the object it is attached to.@@@@1@26@@danf@17-8-2009
10630650@unknown@formal@none@1@S@For example, the Ford Explorer object has attributes such as:@@@@1@10@@danf@17-8-2009
10630660@unknown@formal@none@1@S@* ''Name'': Ford Explorer@@@@1@4@@danf@17-8-2009
10630670@unknown@formal@none@1@S@* ''Number-of-doors'': 4@@@@1@3@@danf@17-8-2009
10630680@unknown@formal@none@1@S@* ''Engine'': {4.0L, 4.6L}@@@@1@4@@danf@17-8-2009
10630690@unknown@formal@none@1@S@* ''Transmission'': 6-speed@@@@1@3@@danf@17-8-2009
10630700@unknown@formal@none@1@S@The value of an attribute can be a complex [[data type]]; in this example, the value of the attribute called ''Engine'' is a list of values, not just a single value.@@@@1@31@@danf@17-8-2009
10630710@unknown@formal@none@1@S@If attributes were not defined for the concepts, the result would be either a [[taxonomy]] (if [[hyponym]] relationships exist between concepts) or a '''controlled vocabulary'''.@@@@1@25@@danf@17-8-2009
10630720@unknown@formal@none@1@S@These are useful, but are not considered true ontologies.@@@@1@9@@danf@17-8-2009
10630730@unknown@formal@none@1@S@===Relationships===@@@@1@1@@danf@17-8-2009
10630740@unknown@formal@none@1@S@An important use of attributes is to describe the relationships (also known as relations) between objects in the ontology.@@@@1@19@@danf@17-8-2009
10630750@unknown@formal@none@1@S@Typically a relation is an attribute whose value is another object in the ontology.@@@@1@14@@danf@17-8-2009
10630760@unknown@formal@none@1@S@For example, in the ontology that contains the Ford Explorer and the [[Ford Bronco]], the Ford Bronco object might have the following attribute:@@@@1@23@@danf@17-8-2009
10630770@unknown@formal@none@1@S@* ''Successor'': Ford Explorer@@@@1@4@@danf@17-8-2009
10630780@unknown@formal@none@1@S@This tells us that the Explorer is the model that replaced the Bronco.@@@@1@13@@danf@17-8-2009
10630790@unknown@formal@none@1@S@Much of the power of ontologies comes from the ability to describe these relations.@@@@1@14@@danf@17-8-2009
10630800@unknown@formal@none@1@S@Together, the set of relations describes the [[semantics]] of the domain.@@@@1@11@@danf@17-8-2009
10630810@unknown@formal@none@1@S@The most important type of relation is the [[subsumption]] relation (''is-[[superclass]]-of'', the converse of ''[[is-a]]'', ''is-subtype-of'' or ''is-[[subclass]]-of'').@@@@1@18@@danf@17-8-2009
10630820@unknown@formal@none@1@S@This defines which objects are members of classes of objects.@@@@1@10@@danf@17-8-2009
10630830@unknown@formal@none@1@S@For example, we have already seen that the Ford Explorer ''is-a'' 4-wheel drive, which in turn ''is-a'' Car.@@@@1@18@@danf@17-8-2009
10630840@unknown@formal@none@1@S@The addition of the is-a relationships has created a hierarchical [[taxonomy]]; a tree-like structure (or, more generally, a [[partially ordered set]]) that clearly depicts how objects relate to one another.@@@@1@30@@danf@17-8-2009
10630850@unknown@formal@none@1@S@In such a structure, each object is the 'child' of a 'parent class' (some languages restrict the is-a relationship to one parent for all nodes, but many do not).@@@@1@29@@danf@17-8-2009
10630860@unknown@formal@none@1@S@Another common type of relation is the [[meronymy]] relation, written as ''part-of'', that represents how objects combine together to form composite objects.@@@@1@22@@danf@17-8-2009
10630870@unknown@formal@none@1@S@For example, if we extended our example ontology to include objects like Steering Wheel, we would say that "Steering Wheel is-part-of Ford Explorer" since a steering wheel is one of the components of a Ford Explorer.@@@@1@36@@danf@17-8-2009
10630880@unknown@formal@none@1@S@If we introduce meronymy relationships to our ontology, we find that this simple and elegant tree structure quickly becomes complex and significantly more difficult to interpret manually.@@@@1@27@@danf@17-8-2009
10630890@unknown@formal@none@1@S@It is not difficult to understand why: an entity that is described as 'part of' another entity might also be 'part of' a third entity.@@@@1@25@@danf@17-8-2009
10630900@unknown@formal@none@1@S@Consequently, entities may have more than one parent.@@@@1@8@@danf@17-8-2009
10630910@unknown@formal@none@1@S@The structure that emerges is known as a [[directed acyclic graph]] (DAG).@@@@1@12@@danf@17-8-2009
10630920@unknown@formal@none@1@S@As well as the standard is-a and part-of relations, ontologies often include additional types of relation that further refine the semantics they model.@@@@1@23@@danf@17-8-2009
10630930@unknown@formal@none@1@S@These relations are often domain-specific and are used to answer particular types of question.@@@@1@14@@danf@17-8-2009
10630940@unknown@formal@none@1@S@For example, in the domain of automobiles, we might define a ''made-in'' relationship which tells us where each car is built.@@@@1@21@@danf@17-8-2009
10630950@unknown@formal@none@1@S@So the Ford Explorer is ''made-in'' [[Louisville, Kentucky|Louisville]].@@@@1@8@@danf@17-8-2009
10630960@unknown@formal@none@1@S@The ontology may also know that Louisville is-in [[Kentucky]] and Kentucky is-a state of the [[United States|USA]].@@@@1@17@@danf@17-8-2009
10630970@unknown@formal@none@1@S@Software using this ontology could now answer a question like "which cars are made in the U.S.?"@@@@1@17@@danf@17-8-2009
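One way such a question could be answered is sketched below in Python. The triples restate the example relations from the text; the relation name `state-of` and the functions `reachable` and `cars_made_in` are assumptions made for this illustration and do not come from any particular ontology toolkit.

```python
# (subject, relation, object) triples restating the example in the text.
TRIPLES = {
    ("Ford Explorer", "is-a", "4-wheel drive"),
    ("4-wheel drive", "is-a", "Car"),
    ("Ford Explorer", "made-in", "Louisville"),
    ("Louisville", "is-in", "Kentucky"),
    ("Kentucky", "state-of", "USA"),  # hypothetical relation name
}


def reachable(start, relations):
    """Everything reachable from `start` by following the given relations."""
    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for subj, rel, obj in TRIPLES:
            if subj == node and rel in relations and obj not in seen:
                seen.add(obj)
                frontier.append(obj)
    return seen


def cars_made_in(region):
    """List every Car whose made-in place lies inside `region`."""
    result = set()
    for subj, rel, place in TRIPLES:
        if rel != "made-in":
            continue
        is_car = "Car" in reachable(subj, {"is-a"})
        in_region = place == region or region in reachable(place, {"is-in", "state-of"})
        if is_car and in_region:
            result.add(subj)
    return sorted(result)
```

The answer follows from chaining relations: the Ford Explorer is a Car through two ''is-a'' steps, and Louisville lies in the USA through ''is-in'' followed by the state relation.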
10630980@unknown@formal@none@1@S@== Domain ontologies and upper ontologies ==@@@@1@7@@danf@17-8-2009
10630990@unknown@formal@none@1@S@A domain ontology (or domain-specific ontology) models a specific domain, or part of the world.@@@@1@15@@danf@17-8-2009
10631000@unknown@formal@none@1@S@It represents the particular meanings of terms as they apply to that domain.@@@@1@13@@danf@17-8-2009
10631010@unknown@formal@none@1@S@For example, the word ''[[card]]'' has many different meanings.@@@@1@9@@danf@17-8-2009
10631020@unknown@formal@none@1@S@An ontology about the domain of [[poker]] would model the "[[playing card]]" meaning of the word, while an ontology about the domain of [[computer hardware]] would model the "[[punch card]]" and "[[video card]]" meanings.@@@@1@34@@danf@17-8-2009
10631030@unknown@formal@none@1@S@An [[Upper ontology (computer science)|upper ontology]] (or foundation ontology) is a model of the common objects that are generally applicable across a wide range of domain ontologies.@@@@1@27@@danf@17-8-2009
10631040@unknown@formal@none@1@S@It contains a [[core glossary]] in whose terms objects in a set of domains can be described.@@@@1@17@@danf@17-8-2009
10631050@unknown@formal@none@1@S@There are several standardized upper ontologies available for use, including [[Dublin Core]], [[General Formal Ontology|GFO]], [[OpenCyc]]/[[ResearchCyc]], [[Suggested Upper Merged Ontology|SUMO]], and [http://www.loa-cnr.it/DOLCE.html DOLCE].@@@@1@23@@danf@17-8-2009
10631060@unknown@formal@none@1@S@[[WordNet]], while considered an upper ontology by some, is not an ontology: it is a unique combination of a [[taxonomy]] and a controlled vocabulary (see above, under Attributes).@@@@1@28@@danf@17-8-2009
10631070@unknown@formal@none@1@S@The [[Gellish]] ontology is an example of a combination of an upper and a domain ontology.@@@@1@16@@danf@17-8-2009
10631080@unknown@formal@none@1@S@Since domain ontologies represent concepts in very specific and often eclectic ways, they are often incompatible.@@@@1@16@@danf@17-8-2009
10631090@unknown@formal@none@1@S@As systems that rely on domain ontologies expand, they often need to merge domain ontologies into a more general representation.@@@@1@20@@danf@17-8-2009
10631100@unknown@formal@none@1@S@This presents a challenge to the ontology designer.@@@@1@8@@danf@17-8-2009
10631110@unknown@formal@none@1@S@Different ontologies in the same domain can also arise due to different perceptions of the domain based on cultural background, education, ideology, or because a different representation language was chosen.@@@@1@30@@danf@17-8-2009
10631120@unknown@formal@none@1@S@At present, merging ontologies is a largely manual process and therefore time-consuming and expensive.@@@@1@14@@danf@17-8-2009
10631130@unknown@formal@none@1@S@Using a foundation ontology to provide a common definition of core terms can make this process manageable.@@@@1@17@@danf@17-8-2009
10631140@unknown@formal@none@1@S@There are studies on generalized techniques for merging ontologies, but this area of research is still largely theoretical.@@@@1@18@@danf@17-8-2009
10631150@unknown@formal@none@1@S@== Ontology languages ==@@@@1@4@@danf@17-8-2009
10631160@unknown@formal@none@1@S@An [[ontology language]] is a [[formal language]] used to encode the ontology.@@@@1@12@@danf@17-8-2009
10631170@unknown@formal@none@1@S@There are a number of such languages for ontologies, both proprietary and standards-based:@@@@1@13@@danf@17-8-2009
10631180@unknown@formal@none@1@S@* [[Web Ontology Language|OWL]] is a language for making ontological statements, developed as a follow-on from [[Resource Description Framework|RDF]] and [[RDFS]], as well as earlier ontology language projects including [[Ontology Inference Layer|OIL]], [[DARPA Agent Markup Language|DAML]] and [[DAMLplusOIL|DAML+OIL]].@@@@1@38@@danf@17-8-2009
10631190@unknown@formal@none@1@S@OWL is intended to be used over the [[World Wide Web]], and all its elements (classes, properties and individuals) are defined as RDF [[resource (Web)|resources]], and identified by [[Uniform Resource Identifier|URI]]s.@@@@1@31@@danf@17-8-2009
10631200@unknown@formal@none@1@S@* [[KIF]] is a syntax for [[first-order logic]] that is based on [[S-expression]]s.@@@@1@13@@danf@17-8-2009
10631210@unknown@formal@none@1@S@* The [[Cyc]] project has its own ontology language called [[CycL]], based on [[first-order predicate calculus]] with some higher-order extensions.@@@@1@20@@danf@17-8-2009
10631220@unknown@formal@none@1@S@* [[Rule Interchange Format]] (RIF) and [[F-Logic]] combine ontologies and rules.@@@@1@11@@danf@17-8-2009
10631230@unknown@formal@none@1@S@* The [[Gellish]] language includes rules for its own extension and thus integrates an ontology with an ontology language.@@@@1@19@@danf@17-8-2009
10631240@unknown@formal@none@1@S@== Relation to the philosophical term ==@@@@1@7@@danf@17-8-2009
10631250@unknown@formal@none@1@S@The term ''ontology'' has its origin in [[ontology|philosophy]], where it is the name of one fundamental branch of [[metaphysics]], concerned with analyzing various types or modes of ''existence'', often with special attention to the relations between particulars and universals, between intrinsic and extrinsic properties, and between essence and existence.@@@@1@49@@danf@17-8-2009
10631260@unknown@formal@none@1@S@According to [[Tom Gruber]] at [[Stanford University]], the meaning of ''ontology'' in the context of computer science is “a description of the concepts and relationships that can exist for an [[Software agent|agent]] or a community of agents.”@@@@1@37@@danf@17-8-2009
10631270@unknown@formal@none@1@S@He goes on to specify that an ontology is generally written, “as a set of definitions of formal vocabulary.”@@@@1@19@@danf@17-8-2009
10631280@unknown@formal@none@1@S@What ontology has in common in both computer science and philosophy is the representation of entities, ideas, and events, along with their properties and relations, according to a system of categories.@@@@1@31@@danf@17-8-2009
10631290@unknown@formal@none@1@S@In both fields, one finds considerable work on problems of ontological relativity (e.g. [[Quine]] and [[Kripke]] in philosophy, [[John F. Sowa|Sowa]] and [[Nicola Guarino|Guarino]] in computer science; see Sowa, John F., "Top-level ontological categories", International Journal of Human-Computer Studies, v. 43 (November/December 1995), pp. 669-85), and debates concerning whether a normative ontology is viable (e.g. debates over [[foundationalism]] in philosophy, debates over the [[Cyc]] project in AI).@@@@1@67@@danf@17-8-2009
10631330@unknown@formal@none@1@S@Differences between the two are largely matters of focus.@@@@1@9@@danf@17-8-2009
10631340@unknown@formal@none@1@S@Philosophers are less concerned with establishing fixed, controlled vocabularies than are researchers in computer science, while computer scientists are less involved in discussions of first principles (such as debating whether there are such things as fixed essences, or whether entities must be ontologically more primary than processes).@@@@1@47@@danf@17-8-2009
10631350@unknown@formal@none@1@S@During the second half of the 20th century, philosophers extensively debated the possible methods or approaches to building ontologies, without actually ''building'' any very elaborate ontologies themselves.@@@@1@27@@danf@17-8-2009
10631360@unknown@formal@none@1@S@By contrast, computer scientists were building some large and robust ontologies (such as [[WordNet]] and [[Cyc]]) with comparatively little debate over ''how'' they were built.@@@@1@25@@danf@17-8-2009
10631370@unknown@formal@none@1@S@In the early years of the 21st century, the interdisciplinary project of [[cognitive science]] has been bringing the two circles of scholars closer together.@@@@1@24@@danf@17-8-2009
10631380@unknown@formal@none@1@S@For example, there is talk of a "computational turn in philosophy" which includes philosophers analyzing the formal ontologies of computer science (sometimes even working directly with the software), while researchers in computer science have been making more references to those philosophers who work on ontology (sometimes with direct consequences for their methods).@@@@1@52@@danf@17-8-2009
10631390@unknown@formal@none@1@S@Still, many scholars in both fields are uninvolved in this trend of cognitive science, and continue to work independently of one another, pursuing separately their different concerns.@@@@1@27@@danf@17-8-2009
10631400@unknown@formal@none@1@S@==Resources==@@@@1@1@@danf@17-8-2009
10631410@unknown@formal@none@1@S@===Examples of published ontologies ===@@@@1@5@@danf@17-8-2009
10631420@unknown@formal@none@1@S@* [[Dublin Core]], a simple ontology for documents and publishing.@@@@1@10@@danf@17-8-2009
10631430@unknown@formal@none@1@S@* [[Cyc]] for formal representation of the universe of discourse.@@@@1@10@@danf@17-8-2009
10631440@unknown@formal@none@1@S@* [[Suggested Upper Merged Ontology]], which is a formal upper ontology@@@@1@11@@danf@17-8-2009
10631450@unknown@formal@none@1@S@* [http://www.ifomis.org/bfo/ Basic Formal Ontology (BFO)], a formal upper ontology designed to support scientific research@@@@1@15@@danf@17-8-2009
10631460@unknown@formal@none@1@S@* [[Gellish English dictionary]], an ontology comprising a dictionary and a taxonomy, with an upper ontology and a lower ontology that focuses on industrial and business applications in engineering, technology and procurement.@@@@1@32@@danf@17-8-2009
10631470@unknown@formal@none@1@S@* [http://www.fb10.uni-bremen.de/anglistik/langpro/webspace/jb/gum/index.htm Generalized Upper Model], a linguistically-motivated ontology for mediating between client systems and natural language technology@@@@1@17@@danf@17-8-2009
10631480@unknown@formal@none@1@S@* [[WordNet]] Lexical reference system@@@@1@5@@danf@17-8-2009
10631490@unknown@formal@none@1@S@* [[OBO Foundry]]: a suite of interoperable reference ontologies in biomedicine.@@@@1@11@@danf@17-8-2009
10631500@unknown@formal@none@1@S@* The [[Ontology for Biomedical Investigations]] is an open access, integrated ontology for the description of biological and clinical investigations.@@@@1@20@@danf@17-8-2009
10631510@unknown@formal@none@1@S@* [http://colab.cim3.net/file/work/SICoP/ontac/COSMO/ COSMO]: An OWL ontology that is a merger of the basic elements of the OpenCyc and SUMO ontologies, with additional elements.@@@@1@23@@danf@17-8-2009
10631520@unknown@formal@none@1@S@* [[Gene Ontology]] for [[genomics]]@@@@1@5@@danf@17-8-2009
10631530@unknown@formal@none@1@S@* [http://pir.georgetown.edu/pro/ PRO], the Protein Ontology of the Protein Information Resource, Georgetown University.@@@@1@13@@danf@17-8-2009
10631540@unknown@formal@none@1@S@* [http://proteinontology.info/ Protein Ontology] for [[proteomics]]@@@@1@6@@danf@17-8-2009
10631550@unknown@formal@none@1@S@* [http://sig.biostr.washington.edu/projects/fm/AboutFM.html Foundational Model of Anatomy] for human anatomy@@@@1@9@@danf@17-8-2009
10631560@unknown@formal@none@1@S@* [[SBO]], the Systems Biology Ontology, for computational models in biology@@@@1@11@@danf@17-8-2009
10631570@unknown@formal@none@1@S@* [http://www.plantontology.org/ Plant Ontology] for plant structures and growth/development stages, etc.@@@@1@11@@danf@17-8-2009
10631580@unknown@formal@none@1@S@* [[CIDOC|CIDOC CRM]] (Conceptual Reference Model) - an ontology for "[[cultural heritage]] information".@@@@1@13@@danf@17-8-2009
10631590@unknown@formal@none@1@S@* [http://www.linguistics-ontology.org/gold.html GOLD] ('''G'''eneral '''O'''ntology for [[descriptive linguistics|'''L'''inguistic '''D'''escription]])@@@@1@9@@danf@17-8-2009
10631600@unknown@formal@none@1@S@* [http://www.landcglobal.com/pages/linkbase.php Linkbase] A formal representation of the biomedical domain, founded upon [http://www.ifomis.org/bfo/ Basic Formal Ontology (BFO)].@@@@1@17@@danf@17-8-2009
10631610@unknown@formal@none@1@S@* [http://www.loa-cnr.it/Ontologies.html Foundational, Core and Linguistic Ontologies]@@@@1@7@@danf@17-8-2009
10631620@unknown@formal@none@1@S@* [[ThoughtTreasure]] ontology@@@@1@3@@danf@17-8-2009
10631630@unknown@formal@none@1@S@* [[LPL]] Lawson Pattern Language@@@@1@5@@danf@17-8-2009
10631640@unknown@formal@none@1@S@* [[TIME-ITEM]] Topics for Indexing Medical Education@@@@1@7@@danf@17-8-2009
10631650@unknown@formal@none@1@S@* [[POPE]] Purdue Ontology for Pharmaceutical Engineering@@@@1@7@@danf@17-8-2009
10631660@unknown@formal@none@1@S@* [[IDEAS Group]] A formal ontology for enterprise architecture being developed by the Australian, Canadian, UK and U.S. Defence Depts. [http://www.ideasgroup.org The IDEAS Group Website]@@@@1@25@@danf@17-8-2009
10631670@unknown@formal@none@1@S@* [http://www.eden-study.org/articles/2007/problems-ontology-programs_ao.pdf program abstraction taxonomy]@@@@1@5@@danf@17-8-2009
10631680@unknown@formal@none@1@S@* [http://sweet.jpl.nasa.gov/ SWEET] Semantic Web for Earth and Environmental Terminology@@@@1@10@@danf@17-8-2009
10631690@unknown@formal@none@1@S@* [http://www.cellcycleontology.org/ CCO] The Cell-Cycle Ontology is an application ontology that represents the cell cycle@@@@1@15@@danf@17-8-2009
10631700@unknown@formal@none@1@S@===Ontology libraries===@@@@1@2@@danf@17-8-2009
10631710@unknown@formal@none@1@S@The development of ontologies for the Web has led to the emergence of services providing lists or directories of ontologies with search facilities.@@@@1@23@@danf@17-8-2009
10631720@unknown@formal@none@1@S@Such directories have been called ontology libraries.@@@@1@7@@danf@17-8-2009
10631730@unknown@formal@none@1@S@The following are static libraries of human-selected ontologies.@@@@1@8@@danf@17-8-2009
10631740@unknown@formal@none@1@S@* The [http://www.daml.org/ontologies/ DAML Ontology Library] maintains a legacy of ontologies in DAML.@@@@1@13@@danf@17-8-2009
10631750@unknown@formal@none@1@S@* The [http://protegewiki.stanford.edu/index.php/Protege_Ontology_Library Protege Ontology Library] contains a set of OWL, Frame-based and other format ontologies.@@@@1@16@@danf@17-8-2009
10631760@unknown@formal@none@1@S@* [http://www.schemaweb.info/ SchemaWeb] is a directory of RDF schemata expressed in RDFS, OWL and DAML+OIL.@@@@1@15@@danf@17-8-2009
10631770@unknown@formal@none@1@S@The following are both directories and search engines.@@@@1@8@@danf@17-8-2009
10631780@unknown@formal@none@1@S@They include crawlers searching the Web for well-formed ontologies.@@@@1@9@@danf@17-8-2009
10631790@unknown@formal@none@1@S@* [[Swoogle]] is a directory and search engine for all RDF resources available on the Web, including ontologies.@@@@1@18@@danf@17-8-2009
10631800@unknown@formal@none@1@S@* The [http://olp.dfki.de/OntoSelect/ OntoSelect] Ontology Library offers similar services for RDF/S, DAML and OWL ontologies.@@@@1@15@@danf@17-8-2009
10631810@unknown@formal@none@1@S@* [http://www.w3.org/2004/ontaria/ Ontaria] is a "searchable and browsable directory of semantic web data", with a focus on RDF vocabularies with OWL ontologies.@@@@1@22@@danf@17-8-2009
10631820@unknown@formal@none@1@S@* The [http://www.obofoundry.org/ OBO Foundry / Bioportal] is a suite of interoperable reference ontologies in biology and biomedicine.@@@@1@18@@danf@17-8-2009
10640010@unknown@formal@none@1@S@OpenOffice.org@@@@1@1@@danf@17-8-2009
10640020@unknown@formal@none@1@S@'''OpenOffice.org''' ('''OO.o''' or '''OOo''') is a [[cross-platform]] [[office suite|office application suite]] available for a number of different computer [[operating system]]s.@@@@1@20@@danf@17-8-2009
10640030@unknown@formal@none@1@S@It supports the ISO standard '''[[OpenDocument]] Format (ODF)''' for data interchange as its default [[file format]], as well as [[Microsoft Office]] 97–2003 formats and the [[Microsoft Office]] 2007 format (in version 3), among many others.@@@@1@35@@danf@17-8-2009
10640040@unknown@formal@none@1@S@OpenOffice.org was originally derived from [[StarOffice]], an office suite developed by [[StarDivision]] and acquired by [[Sun Microsystems]] in August 1999.@@@@1@20@@danf@17-8-2009
10640050@unknown@formal@none@1@S@The [[source code]] of the suite was released in July 2000 with the aim of reducing the dominant [[market share]] of [[Microsoft Office]] by providing a free, open and high-quality alternative; later versions of StarOffice are based upon OpenOffice.org with additional proprietary components.@@@@1@43@@danf@17-8-2009
10640060@unknown@formal@none@1@S@OpenOffice.org is [[free software]], available under the [[GNU Lesser General Public License]] (LGPL).@@@@1@13@@danf@17-8-2009
10640070@unknown@formal@none@1@S@The project and software are informally referred to as ''OpenOffice'', but this term is a [[trademark]] held by another party, requiring the project to adopt ''OpenOffice.org'' as its formal name.@@@@1@30@@danf@17-8-2009
10640080@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009
10640090@unknown@formal@none@1@S@Originally developed as the [[proprietary software]] application suite StarOffice by the German company [[StarDivision]], the code was purchased in 1999 by Sun Microsystems.@@@@1@23@@danf@17-8-2009
10640100@unknown@formal@none@1@S@In August 1999 version 5.2 of StarOffice was made available free of charge.@@@@1@13@@danf@17-8-2009
10640110@unknown@formal@none@1@S@On [[July 19]], [[2000]], Sun Microsystems announced that it was making the source code of StarOffice available for download under both the LGPL and the [[Sun Industry Standards Source License]] (SISSL) with the intention of building an open source development community around the software.@@@@1@44@@danf@17-8-2009
10640120@unknown@formal@none@1@S@The new project was known as OpenOffice.org, and its website went live on [[October 13]], [[2000]].@@@@1@16@@danf@17-8-2009
10640130@unknown@formal@none@1@S@Work on version 2.0 began in early 2003 with the following goals: better interoperability with Microsoft Office; better performance, with improved speed and lower memory usage; greater [[Scripting language|scripting]] capabilities; better integration, particularly with [[GNOME]]; an easier-to-find and use database front-end for creating reports, forms and queries; a new built-in [[SQL]] database; and improved [[usability]].@@@@1@55@@danf@17-8-2009
10640140@unknown@formal@none@1@S@A [[beta version]] was released on [[March 4]], [[2005]].@@@@1@9@@danf@17-8-2009
10640150@unknown@formal@none@1@S@On [[September 2]], [[2005]] Sun announced that it was retiring the SISSL.@@@@1@12@@danf@17-8-2009
10640160@unknown@formal@none@1@S@As a consequence, the OpenOffice.org Community Council announced that it would no longer [[dual license]] the office suite, and future versions would use only the LGPL.@@@@1@26@@danf@17-8-2009
10640170@unknown@formal@none@1@S@On [[October 20]], [[2005]], OpenOffice.org 2.0 was formally released to the public.@@@@1@12@@danf@17-8-2009
10640180@unknown@formal@none@1@S@Eight weeks after the release of Version 2.0, an update, OpenOffice.org 2.0.1, was released.@@@@1@14@@danf@17-8-2009
10640190@unknown@formal@none@1@S@It fixed minor bugs and introduced new features.@@@@1@8@@danf@17-8-2009
10640200@unknown@formal@none@1@S@As of the 2.0.3 release, OpenOffice.org changed its release cycle from 18 months to releasing updates, feature enhancements and bug fixes every three months.@@@@1@23@@danf@17-8-2009
10640210@unknown@formal@none@1@S@Currently, new versions including new features ("feature releases") are released every six months, alternating with "bug fix releases" issued between two feature releases, so that a release appears every three months.@@@@1@31@@danf@17-8-2009
10640220@unknown@formal@none@1@S@=== StarOffice ===@@@@1@3@@danf@17-8-2009
10640230@unknown@formal@none@1@S@Sun subsidizes the development of OpenOffice.org in order to use it as a base for its commercial [[proprietary software|proprietary]] StarOffice application software.@@@@1@22@@danf@17-8-2009
10640240@unknown@formal@none@1@S@Releases of StarOffice since version 6.0 have been based on the OpenOffice.org source code, with some additional proprietary components, including:@@@@1@20@@danf@17-8-2009
10640250@unknown@formal@none@1@S@* Additional bundled fonts (especially [[CJK|East Asian language]] fonts).@@@@1@9@@danf@17-8-2009
10640260@unknown@formal@none@1@S@* [[Adabas D]] database.@@@@1@4@@danf@17-8-2009
10640270@unknown@formal@none@1@S@* Additional document [[Template (word processing)|templates]].@@@@1@6@@danf@17-8-2009
10640280@unknown@formal@none@1@S@* [[Clip art]].@@@@1@3@@danf@17-8-2009
10640290@unknown@formal@none@1@S@* Sorting functionality for Asian versions.@@@@1@6@@danf@17-8-2009
10640300@unknown@formal@none@1@S@* Additional file filters.@@@@1@4@@danf@17-8-2009
10640310@unknown@formal@none@1@S@* Migration assessment tool (Enterprise Edition).@@@@1@6@@danf@17-8-2009
10640320@unknown@formal@none@1@S@* Macro migration tool (Enterprise Edition).@@@@1@6@@danf@17-8-2009
10640330@unknown@formal@none@1@S@* Configuration management tool (Enterprise Edition).@@@@1@6@@danf@17-8-2009
10640340@unknown@formal@none@1@S@OpenOffice.org therefore inherited many features from the original StarOffice upon which it was based, including the [[OpenOffice.org XML]] file format, which it retained until version 2, when it was replaced by the ISO standard [[OpenDocument]] Format (ODF).@@@@1@37@@danf@17-8-2009
10640350@unknown@formal@none@1@S@== Features ==@@@@1@3@@danf@17-8-2009
10640360@unknown@formal@none@1@S@According to its [[mission statement]], the OpenOffice.org project aims "''To create, as a community, the leading international office suite that will run on all major platforms and provide access to all functionality and data through open-component based APIs and an XML-based file format.''"@@@@1@43@@danf@17-8-2009
10640370@unknown@formal@none@1@S@OpenOffice.org aims to compete with Microsoft Office and emulate its look and feel where suitable.@@@@1@15@@danf@17-8-2009
10640380@unknown@formal@none@1@S@It can read and write most of the [[file formats]] found in Microsoft Office, and many other applications, an essential feature of the suite for many users.@@@@1@27@@danf@17-8-2009
10640390@unknown@formal@none@1@S@OpenOffice.org has been found to be able to open files of older versions of Microsoft Office and damaged files that newer versions of Microsoft Office itself cannot open.@@@@1@28@@danf@17-8-2009
10640400@unknown@formal@none@1@S@However, it cannot open older Word for Macintosh (MCW) files.@@@@1@10@@danf@17-8-2009
10640410@unknown@formal@none@1@S@=== Platforms ===@@@@1@3@@danf@17-8-2009
10640420@unknown@formal@none@1@S@Platforms for which OO.o is available include [[Microsoft Windows]], [[Linux]], [[Solaris Operating System|Solaris]], [[BSD]], [[OpenVMS]], [[OS/2]] and [[IRIX]].@@@@1@18@@danf@17-8-2009
10640430@unknown@formal@none@1@S@The current primary development platforms are Microsoft Windows, Linux and Solaris.@@@@1@11@@danf@17-8-2009
10640440@unknown@formal@none@1@S@A port for [[Mac OS X]] exists for OS X machines which have the [[X Window System]] component installed.@@@@1@19@@danf@17-8-2009
10640450@unknown@formal@none@1@S@A port to OS X's native [[Aqua (user interface)|Aqua user interface]] is in progress, and is scheduled for completion for the 3.0 milestone.@@@@1@23@@danf@17-8-2009
10640460@unknown@formal@none@1@S@[[NeoOffice]] is an independent [[Fork (software development)|fork]] of OpenOffice, specially adapted for Mac OS X.@@@@1@15@@danf@17-8-2009
10640470@unknown@formal@none@1@S@=== Version compatibility ===@@@@1@4@@danf@17-8-2009
10640480@unknown@formal@none@1@S@*Windows 95: up to v1.1.5@@@@1@5@@danf@17-8-2009
10640490@unknown@formal@none@1@S@*Windows 98-Vista: up to v2.4, development releases of v3.0@@@@1@9@@danf@17-8-2009
10640500@unknown@formal@none@1@S@*Mac OS 10.2: up to v1.1.2@@@@1@6@@danf@17-8-2009
10640510@unknown@formal@none@1@S@*Mac OS 10.3: up to v2.1@@@@1@6@@danf@17-8-2009
10640520@unknown@formal@none@1@S@*Mac OS 10.4-10.5: up to v2.4, development releases of v3.0 ([[Apple-Intel architecture|Intel]] only)@@@@1@13@@danf@17-8-2009
10640530@unknown@formal@none@1@S@*OS/2 and eComStation: up to v2.0.4@@@@1@6@@danf@17-8-2009
10640540@unknown@formal@none@1@S@=== Components ===@@@@1@3@@danf@17-8-2009
10640550@unknown@formal@none@1@S@OpenOffice.org is a collection of applications that work together closely to provide the features expected from a modern office suite.@@@@1@20@@danf@17-8-2009
10640560@unknown@formal@none@1@S@Many of the components are designed to mirror those available in Microsoft Office.@@@@1@13@@danf@17-8-2009
10640570@unknown@formal@none@1@S@The components available include:@@@@1@4@@danf@17-8-2009
10640580@unknown@formal@none@1@S@*[[QuickStart]]er@@@@1@1@@danf@17-8-2009
10640590@unknown@formal@none@1@S@:A small program for Windows and Linux that runs when the computer starts.@@@@1@13@@danf@17-8-2009
10640600@unknown@formal@none@1@S@It loads the core files and libraries for OpenOffice.org during computer startup and allows the suite applications to start more quickly when selected later.@@@@1@24@@danf@17-8-2009
10640610@unknown@formal@none@1@S@The amount of time it takes to open OpenOffice.org applications was a common complaint in version 1.0 of the suite.@@@@1@20@@danf@17-8-2009
10640620@unknown@formal@none@1@S@Substantial improvements were made in this area for version 2.2.@@@@1@10@@danf@17-8-2009
10640630@unknown@formal@none@1@S@*The [[Macro (computer science)|macro]] recorder@@@@1@5@@danf@17-8-2009
10640640@unknown@formal@none@1@S@:Is used to record user actions and replay them later to help with automating tasks, using [[OpenOffice.org Basic]] (see [[OpenOffice.org#OpenOffice.org Basic|below]]).@@@@1@21@@danf@17-8-2009
10640650@unknown@formal@none@1@S@It is not possible to download these components individually on Windows, though they can be installed separately.@@@@1@17@@danf@17-8-2009
10640660@unknown@formal@none@1@S@Most Linux distributions break the components into individual packages which may be downloaded and installed separately.@@@@1@16@@danf@17-8-2009
10640670@unknown@formal@none@1@S@=== OpenOffice.org Basic ===@@@@1@4@@danf@17-8-2009
10640680@unknown@formal@none@1@S@OpenOffice.org Basic is a programming language similar to Microsoft [[Visual Basic for Applications]] (VBA) and based on [[StarOffice Basic]].@@@@1@19@@danf@17-8-2009
10640690@unknown@formal@none@1@S@In addition to the macros, the upcoming Novell edition of OpenOffice.org 2.0 supports running Microsoft VBA macros, a feature expected to be incorporated into the mainstream version soon.@@@@1@28@@danf@17-8-2009
10640700@unknown@formal@none@1@S@OpenOffice.org Basic is available in the Writer and Calc applications.@@@@1@10@@danf@17-8-2009
10640710@unknown@formal@none@1@S@It is written in functions called subroutines or macros, with each macro performing a different task, such as counting the words in a paragraph.@@@@1@24@@danf@17-8-2009
10640720@unknown@formal@none@1@S@OpenOffice.org Basic is especially useful in doing repetitive tasks that have not been integrated in the program.@@@@1@17@@danf@17-8-2009
10640730@unknown@formal@none@1@S@As the OpenOffice.org database, called "Base", uses documents created under the Writer application for reports and forms, one could say that Base can also be programmed with OpenOffice.org Basic.@@@@1@29@@danf@17-8-2009
10640740@unknown@formal@none@1@S@== File formats ==@@@@1@4@@danf@17-8-2009
10640750@unknown@formal@none@1@S@OpenOffice.org pioneered the ISO/IEC standard [[OpenDocument]] file formats (ODF), which it uses natively, by default.@@@@1@15@@danf@17-8-2009
10640760@unknown@formal@none@1@S@It also supports reading (and in some cases writing) a large number of legacy proprietary file formats (e.g.: [[WordPerfect]] through libwpd, [[StarOffice]], [[Lotus software]], [[Microsoft Works|MS Works]] through libwps, [[Rich Text Format]]), most notably including [[Microsoft Office]] formats; the OpenDocument specification was "approved for release as an ISO and IEC International Standard" under the name ISO/IEC 26300:2006.@@@@1@57@@danf@17-8-2009
10640770@unknown@formal@none@1@S@=== Microsoft Office interoperability ===@@@@1@5@@danf@17-8-2009
10640780@unknown@formal@none@1@S@In response to Microsoft's recent movement towards using the [[Office Open XML]] format in [[Microsoft Office 2007]], [[Novell]] has released an [[Office Open XML]] converter for OOo under a liberal [[BSD license]] (along with [[GNU GPL]] and [[LGPL]] licensed libraries), that will be submitted for inclusion into the OpenOffice.org project.@@@@1@50@@danf@17-8-2009
10640790@unknown@formal@none@1@S@This allows OOo to read and write Microsoft Office Open XML word processing documents (.docx).@@@@1@15@@danf@17-8-2009
10640800@unknown@formal@none@1@S@Currently it works only with the latest Novell edition of OpenOffice.org.@@@@1@11@@danf@17-8-2009
10640810@unknown@formal@none@1@S@[[Sun Microsystems]] has developed an ODF plugin for Microsoft Office which enables users of Microsoft Office Word, Excel and PowerPoint to read and write ODF documents.@@@@1@26@@danf@17-8-2009
10640820@unknown@formal@none@1@S@The plugin currently works with Microsoft Office 2003, Microsoft Office XP and Microsoft Office 2000.@@@@1@15@@danf@17-8-2009
10640830@unknown@formal@none@1@S@Support for Microsoft Office 2007 is only available in combination with Microsoft Office 2007 SP1.@@@@1@15@@danf@17-8-2009
10640840@unknown@formal@none@1@S@Several software companies (including Microsoft and Novell) are working on an add-in for Microsoft Office that allows reading and writing ODF files.@@@@1@22@@danf@17-8-2009
10640850@unknown@formal@none@1@S@Currently it works only for Microsoft Word 2007 / XP / 2003.@@@@1@12@@danf@17-8-2009
10640860@unknown@formal@none@1@S@Microsoft provides a compatibility pack to read and write Office Open XML files with Office 2000, XP and 2003.@@@@1@19@@danf@17-8-2009
10640870@unknown@formal@none@1@S@The compatibility pack can also be used as a stand-alone converter with Microsoft Office 97.@@@@1@15@@danf@17-8-2009
10640880@unknown@formal@none@1@S@This might be helpful to convert older Microsoft Office files via Office Open XML to ODF if a direct conversion doesn't work as expected.@@@@1@24@@danf@17-8-2009
10640890@unknown@formal@none@1@S@The Office compatibility pack, however, does not install for Office 2000 or Office XP on [[Windows 9x]].@@@@1@17@@danf@17-8-2009
10640900@unknown@formal@none@1@S@Note that some office applications built with Microsoft components may refuse to import OpenOffice data.@@@@1@15@@danf@17-8-2009
10640910@unknown@formal@none@1@S@[[The Sage Group]]'s Simply Accounting, for example, can import Excel's .xls files, but refuses to accept OpenOffice.org-generated .xls files for the reason that the OOo .xls files are not "genuine Microsoft" .xls files.@@@@1@33@@danf@17-8-2009
10640920@unknown@formal@none@1@S@== Development ==@@@@1@3@@danf@17-8-2009
10640930@unknown@formal@none@1@S@=== Overview ===@@@@1@3@@danf@17-8-2009
10640940@unknown@formal@none@1@S@The OpenOffice.org [[Application Programming Interface|API]] is based on a component technology known as [[Universal Network Objects]] (UNO).@@@@1@17@@danf@17-8-2009
10640950@unknown@formal@none@1@S@It consists of a wide range of interfaces defined in a [[CORBA]]-like [[interface description language]].@@@@1@15@@danf@17-8-2009
10640960@unknown@formal@none@1@S@The [[document file format]] used is based on [[XML]] and several export and import filters.@@@@1@15@@danf@17-8-2009
10640970@unknown@formal@none@1@S@All external formats read by OpenOffice.org are converted back and forth from an internal XML representation.@@@@1@16@@danf@17-8-2009
10640980@unknown@formal@none@1@S@By using [[data compression|compression]] when saving [[XML]] to disk, files are generally smaller than the equivalent binary Microsoft Office documents.@@@@1@20@@danf@17-8-2009
10640990@unknown@formal@none@1@S@The native file format for storing documents in version 1.0 was used as the basis of the [[OASIS (organization)|OASIS]] OpenDocument file format standard, which has become the default file format in version 2.0.@@@@1@33@@danf@17-8-2009
10641000@unknown@formal@none@1@S@Development versions of the suite are released every few weeks on the developer zone of the OpenOffice.org website.@@@@1@18@@danf@17-8-2009
10641010@unknown@formal@none@1@S@The releases are meant for those who wish to test new features or are simply curious about forthcoming changes; they are not suitable for production use.@@@@1@26@@danf@17-8-2009
10641020@unknown@formal@none@1@S@=== Native desktop integration ===@@@@1@5@@danf@17-8-2009
10641030@unknown@formal@none@1@S@OpenOffice.org 1.0 was criticized for not having the [[look and feel]] of applications developed natively for the platforms on which it runs.@@@@1@22@@danf@17-8-2009
10641040@unknown@formal@none@1@S@Starting with version 2.0, OpenOffice.org uses native [[widget toolkit]], icons, and font-rendering libraries across a variety of platforms, to better match native applications and provide a smoother experience for the user.@@@@1@31@@danf@17-8-2009
10641050@unknown@formal@none@1@S@There are projects underway to further improve this integration on both [[GNOME]] and [[KDE]].@@@@1@14@@danf@17-8-2009
10641060@unknown@formal@none@1@S@This issue has been particularly pronounced on Mac OS X, whose standard user interface looks noticeably different from either Windows or [[X11]]-based desktop environments and requires the use of programming toolkits unfamiliar to most OpenOffice.org developers.@@@@1@36@@danf@17-8-2009
10641070@unknown@formal@none@1@S@There are two implementations of OpenOffice.org available for OS X:@@@@1@10@@danf@17-8-2009
10641080@unknown@formal@none@1@S@;OpenOffice.org Mac OS X (X11):@@@@1@5@@danf@17-8-2009
10641090@unknown@formal@none@1@S@This official implementation requires the installation of [[X11.app]] or [[XDarwin]], and is a close port of the well-tested Unix version.@@@@1@20@@danf@17-8-2009
10641100@unknown@formal@none@1@S@It is functionally equivalent to the Unix version, and its user interface resembles the [[look and feel]] of that version; for example, the application uses its own [[menu bar]] instead of the OS X menu at the top of the screen.@@@@1@41@@danf@17-8-2009
10641110@unknown@formal@none@1@S@It also requires system fonts to be converted to X11 format for OpenOffice.org to use them (which can be done during application installation).@@@@1@23@@danf@17-8-2009
10641120@unknown@formal@none@1@S@;OpenOffice.org Aqua:@@@@1@2@@danf@17-8-2009
10641130@unknown@formal@none@1@S@An [[Aqua (GUI)|Aqua]] version is also being developed under the aegis of OpenOffice.org; after a completed first step using [[Carbon (API)|Carbon]], it switched to [[Cocoa (API)|Cocoa]] technology, and a beta version is currently available.@@@@1@34@@danf@17-8-2009
10641140@unknown@formal@none@1@S@Sun Microsystems is collaborating with OOo to further development of the Aqua version of OpenOffice.org for Mac.@@@@1@17@@danf@17-8-2009
10641150@unknown@formal@none@1@S@=== Future ===@@@@1@3@@danf@17-8-2009
10641160@unknown@formal@none@1@S@Currently, a developer preview of OpenOffice.org 3 (OOo-dev 3.0) is available for download.@@@@1@13@@danf@17-8-2009
10641170@unknown@formal@none@1@S@Among the planned features for OOo 3.0, set to be released by September 2008, are:@@@@1@15@@danf@17-8-2009
10641180@unknown@formal@none@1@S@* Personal Information Manager ([[Personal Information Manager|PIM]]), probably based on [[Mozilla Thunderbird|Thunderbird]]/[[Lightning (software)|Lightning]]@@@@1@13@@danf@17-8-2009
10641190@unknown@formal@none@1@S@* PDF import into Draw (to maintain correct layout of the original PDF)@@@@1@13@@danf@17-8-2009
10641200@unknown@formal@none@1@S@* [[OOXML]] document support for opening documents created in [[Office 2007]]@@@@1@11@@danf@17-8-2009
10641210@unknown@formal@none@1@S@* Support for [[Mac OS X]] [[Aqua (user interface)|Aqua]] platform@@@@1@10@@danf@17-8-2009
10641220@unknown@formal@none@1@S@* Extensions, to add third party functionality.@@@@1@7@@danf@17-8-2009
10641230@unknown@formal@none@1@S@* Presenter screen in Impress with multi-screen support@@@@1@8@@danf@17-8-2009
10641240@unknown@formal@none@1@S@=== Other projects ===@@@@1@4@@danf@17-8-2009
10641250@unknown@formal@none@1@S@A number of products are [http://wiki.services.openoffice.org/wiki/DerivedWorks derived from OpenOffice.org].@@@@1@9@@danf@17-8-2009
10641260@unknown@formal@none@1@S@Among the more well-known ones are Sun StarOffice and NeoOffice.@@@@1@10@@danf@17-8-2009
10641270@unknown@formal@none@1@S@The OpenOffice.org site also lists a large variety of [http://wiki.services.openoffice.org/wiki/OpenOffice.org_Solutions complementary products] including groupware solutions.@@@@1@15@@danf@17-8-2009
10641280@unknown@formal@none@1@S@==== NeoOffice ====@@@@1@3@@danf@17-8-2009
10641290@unknown@formal@none@1@S@[[NeoOffice]] is an independent [[porting|port]] that integrates with [[Mac OS X|OS X]]’s [[Aqua (GUI)|Aqua]] user interface using [[Java platform|Java]], [[Carbon (API)|Carbon]] and (increasingly) [[Cocoa (API)|Cocoa]] toolkits.@@@@1@26@@danf@17-8-2009
10641300@unknown@formal@none@1@S@NeoOffice adheres fairly closely to OS X UI standards (for example, using native pull-down menus), and has direct access to OS X’s installed fonts and printers.@@@@1@26@@danf@17-8-2009
10641310@unknown@formal@none@1@S@Its releases lag behind the official OpenOffice.org X11 releases, due to its small development team and the concurrent development of the technology used to port the user interface.@@@@1@28@@danf@17-8-2009
10641320@unknown@formal@none@1@S@Other projects run alongside the main OpenOffice.org project and are easier to contribute to.@@@@1@14@@danf@17-8-2009
10641330@unknown@formal@none@1@S@These include documentation, [[internationalisation and localisation]] and the API.@@@@1@9@@danf@17-8-2009
10641340@unknown@formal@none@1@S@==== OpenGroupware.org ====@@@@1@3@@danf@17-8-2009
10641350@unknown@formal@none@1@S@[[OpenGroupware.org]] is a set of extension programs to allow the sharing of OpenOffice.org documents, calendars, address books, [[e-mail]]s, [[instant messenger|instant messaging]] and blackboards, and provide access to other [[collaborative software|groupware]] applications.@@@@1@31@@danf@17-8-2009
10641360@unknown@formal@none@1@S@There is also an effort to create and share assorted document templates and other useful additions at OOExtras.@@@@1@18@@danf@17-8-2009
10641370@unknown@formal@none@1@S@A set of [[Perl]] extensions is available through the [[CPAN]] in order to allow OpenOffice.org document processing by external programs.@@@@1@20@@danf@17-8-2009
10641380@unknown@formal@none@1@S@These libraries do not use the OpenOffice.org API.@@@@1@8@@danf@17-8-2009
10641390@unknown@formal@none@1@S@They directly read or write the OpenOffice.org files using Perl standard file [[codec|compression/decompression]], XML access and [[UTF-8]] encoding modules.@@@@1@19@@danf@17-8-2009
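The direct file access described above (decompression, XML access and UTF-8 decoding, without the OpenOffice.org API) can be sketched in a few lines. This is a minimal illustration in Python rather than Perl, and it is not the code of the CPAN modules. It assumes the OpenDocument convention that a document is a ZIP package whose body is stored in `content.xml`; the helper names `build_sample_package` and `extract_paragraphs` are hypothetical, and the sample package is a stripped-down stand-in built in memory, not a complete document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import List

# XML namespaces used by OpenDocument content files.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def build_sample_package() -> bytes:
    """Build a tiny stand-in for a document package, entirely in memory.

    Hypothetical helper: a real document also carries a manifest,
    styles and metadata, which are omitted here.
    """
    content = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:text="{TEXT_NS}">'
        "<office:body><office:text>"
        "<text:p>First paragraph</text:p>"
        "<text:p>Second <text:span>styled</text:span> paragraph</text:p>"
        "</office:text></office:body>"
        "</office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("content.xml", content.encode("utf-8"))
    return buffer.getvalue()


def extract_paragraphs(package_bytes: bytes) -> List[str]:
    """Decompress content.xml and return the text of each paragraph."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("content.xml"))
    # itertext() flattens nested inline elements such as text:span.
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


if __name__ == "__main__":
    for paragraph in extract_paragraphs(build_sample_package()):
        print(paragraph)
```

To try the same function on a real file, read the file's bytes and pass them to `extract_paragraphs`; documents with tables, lists or tracked changes would need more careful traversal than this sketch performs.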
10641400@unknown@formal@none@1@S@==== Portable ====@@@@1@3@@danf@17-8-2009
10641410@unknown@formal@none@1@S@A distribution of OpenOffice.org called OpenOffice.org Portable is designed to run the suite from a [[USB flash drive]].@@@@1@18@@danf@17-8-2009
10641420@unknown@formal@none@1@S@==== OxygenOffice Professional ====@@@@1@4@@danf@17-8-2009
10641430@unknown@formal@none@1@S@An enhancement of OpenOffice.org (current version: 2.4), providing:@@@@1@8@@danf@17-8-2009
10641440@unknown@formal@none@1@S@* Possibility to run Visual Basic for Application (VBA) macros in Calc (for testing)@@@@1@14@@danf@17-8-2009
10641450@unknown@formal@none@1@S@* Improved Calc HTML export@@@@1@5@@danf@17-8-2009
10641460@unknown@formal@none@1@S@* Enhanced Access support for Base@@@@1@6@@danf@17-8-2009
10641470@unknown@formal@none@1@S@* Security fixes@@@@1@3@@danf@17-8-2009
10641480@unknown@formal@none@1@S@* Enhanced performance@@@@1@3@@danf@17-8-2009
10641490@unknown@formal@none@1@S@* Enhanced color-palette@@@@1@3@@danf@17-8-2009
10641500@unknown@formal@none@1@S@* Enhanced help menu, additional User’s Manual, and extended tips for beginners@@@@1@12@@danf@17-8-2009
10641510@unknown@formal@none@1@S@Optionally it provides, free for personal and professional use:@@@@1@9@@danf@17-8-2009
10641520@unknown@formal@none@1@S@* More than 3,200 graphics, both clip art and photos.@@@@1@10@@danf@17-8-2009
10641530@unknown@formal@none@1@S@* Several templates and sample documents@@@@1@6@@danf@17-8-2009
10641540@unknown@formal@none@1@S@* Over 90 free fonts.@@@@1@5@@danf@17-8-2009
10641550@unknown@formal@none@1@S@* Additional tools like OOoWikipedia@@@@1@5@@danf@17-8-2009
10641560@unknown@formal@none@1@S@====Extensions====@@@@1@1@@danf@17-8-2009
10641570@unknown@formal@none@1@S@Since version 2.0.4, OpenOffice.org has supported extensions in a similar manner to [[Mozilla Firefox]].@@@@1@14@@danf@17-8-2009
10641580@unknown@formal@none@1@S@Extensions make it easy to add new functionality to an existing OpenOffice.org installation.@@@@1@13@@danf@17-8-2009
10641590@unknown@formal@none@1@S@The [http://extensions.services.openoffice.org/most_pop_ext OpenOffice.org Extension Repository] already lists more than 80 extensions.@@@@1@11@@danf@17-8-2009
10641600@unknown@formal@none@1@S@Developers can easily build new extensions for OpenOffice.org, for example by using the [http://wiki.services.openoffice.org/wiki/OpenOffice_NetBeans_Integration OpenOffice.org API Plugin for NetBeans].@@@@1@19@@danf@17-8-2009
10641610@unknown@formal@none@1@S@==== The OpenOffice.org Bibliographic Project ====@@@@1@6@@danf@17-8-2009
10641620@unknown@formal@none@1@S@This aims to incorporate a powerful [[reference management software]] into the suite.@@@@1@12@@danf@17-8-2009
10641630@unknown@formal@none@1@S@The new major addition is slated for inclusion with the standard OpenOffice.org release in late 2007 to mid-2008, or possibly later depending upon the availability of programmers.@@@@1@27@@danf@17-8-2009
10641640@unknown@formal@none@1@S@=== Security ===@@@@1@3@@danf@17-8-2009
10641650@unknown@formal@none@1@S@OpenOffice.org includes a security team, and as of June 2008 the security organization [[Secunia]] reports no known unpatched security flaws for the software.@@@@1@23@@danf@17-8-2009
10641660@unknown@formal@none@1@S@[[Kaspersky Lab]] has shown a [[proof of concept]] virus for OpenOffice.org.@@@@1@11@@danf@17-8-2009
10641670@unknown@formal@none@1@S@This shows OOo viruses are possible, but there is no known virus "in the wild".@@@@1@15@@danf@17-8-2009
10641680@unknown@formal@none@1@S@In a private meeting of the French Ministry of Defense, macro-related security issues were raised.@@@@1@15@@danf@17-8-2009
10641690@unknown@formal@none@1@S@OpenOffice.org developers have responded and noted that the supposed vulnerability had not been announced through "well defined procedures" for disclosure and that the ministry had revealed nothing specific.@@@@1@28@@danf@17-8-2009
10641700@unknown@formal@none@1@S@However, the developers have been in talks with the researcher concerning the supposed vulnerability.@@@@1@14@@danf@17-8-2009
10641710@unknown@formal@none@1@S@As with Microsoft Word, documents created in OpenOffice can contain [[metadata]] which may include a complete history of what was changed, when and by whom.@@@@1@25@@danf@17-8-2009
10641720@unknown@formal@none@1@S@== Ownership ==@@@@1@3@@danf@17-8-2009
10641730@unknown@formal@none@1@S@The project and software are informally referred to as ''OpenOffice'', but project organizers report that this term is a [[trademark]] held by another party, requiring them to adopt ''OpenOffice.org'' as its formal name.@@@@1@33@@danf@17-8-2009
10641740@unknown@formal@none@1@S@(Due to a similar trademark issue, the [[Brazilian Portuguese]] version of the suite is distributed under the name ''BrOffice.org''.)@@@@1@19@@danf@17-8-2009
10641750@unknown@formal@none@1@S@Development is managed by staff members of StarOffice.@@@@1@8@@danf@17-8-2009
10641760@unknown@formal@none@1@S@Some delay and difficulty in implementing external contributions to the core codebase (even those from the project's corporate sponsors) has been noted.@@@@1@22@@danf@17-8-2009
10641770@unknown@formal@none@1@S@Currently, there are [http://wiki.services.openoffice.org/wiki/DerivedWorks several derived and/or proprietary works based on OOo], with some of them being:@@@@1@17@@danf@17-8-2009
10641780@unknown@formal@none@1@S@* Sun Microsystems' [[StarOffice]], with various complementary add-ons.@@@@1@8@@danf@17-8-2009
10641790@unknown@formal@none@1@S@* IBM's [[Lotus Symphony]], with a new interface based on [[Eclipse (software)|Eclipse]] (based on OO.o 1.x).@@@@1@16@@danf@17-8-2009
10641800@unknown@formal@none@1@S@* OpenOffice.org Novell edition, integrated with [[Novell Evolution|Evolution]] and with a [[OOXML]] filter.@@@@1@13@@danf@17-8-2009
10641810@unknown@formal@none@1@S@* Beijing [[Redflag]] Chinese 2000's [[RedOffice]], fully localized in Chinese characters.@@@@1@11@@danf@17-8-2009
10641820@unknown@formal@none@1@S@* Planamesa's [[NeoOffice]] for [[Mac OS X]] with Aqua support via Java.@@@@1@12@@danf@17-8-2009
10641830@unknown@formal@none@1@S@On [[May 23]], [[2007]], the OpenOffice.org community and Redflag Chinese 2000 Software Co, Ltd. announced a joint development effort focused on integrating the new features that have been added in the RedOffice localization of OpenOffice.org, as well as quality assurance and work on the core applications.@@@@1@46@@danf@17-8-2009
10641840@unknown@formal@none@1@S@Additionally, Redflag Chinese 2000 made public its commitment to the global OO.o community stating it would "strengthen its support of the development of the world's leading free and open source productivity suite", adding around 50 engineers (that have been working on RedOffice since 2006) to the project.@@@@1@47@@danf@17-8-2009
10641850@unknown@formal@none@1@S@On [[September 10]], [[2007]], the OO.o community announced that [[IBM]] had joined to support the development of OpenOffice.org.@@@@1@18@@danf@17-8-2009
10641860@unknown@formal@none@1@S@"IBM will be making initial code contributions that it has been developing as part of its Lotus Notes product, including accessibility enhancements, and will be making ongoing contributions to the feature richness and code quality of OpenOffice.org.@@@@1@37@@danf@17-8-2009
10641870@unknown@formal@none@1@S@Besides working with the community on the free productivity suite's software, IBM will also leverage OpenOffice.org technology in its products" as has been seen with [[Lotus Symphony]].@@@@1@27@@danf@17-8-2009
10641880@unknown@formal@none@1@S@Sean Poulley, the vice president of business and strategy in IBM's [[Lotus Software]] division said that IBM plans to take a leadership role in the OpenOffice.org community together with other companies such as Sun Microsystems.@@@@1@35@@danf@17-8-2009
10641890@unknown@formal@none@1@S@IBM will work within the leadership structure that exists.@@@@1@9@@danf@17-8-2009
10641900@unknown@formal@none@1@S@On [[October 02]], [[2007]], [[Michael Meeks]] announced (drawing a response from Sun's [[Simon Phipps]] and Mathias Bauer) a derived OpenOffice.org work, under the wing of his employer [[Novell]], with the purpose of including new features and fixes that do not get easily integrated in the OOo-build up-stream core.@@@@1@48@@danf@17-8-2009
10641910@unknown@formal@none@1@S@The work is called Go-OO (http://go-oo.org/), a name under which alternative OO.o software has been available for five years.@@@@1@19@@danf@17-8-2009
10641920@unknown@formal@none@1@S@The new features are shared with Novell's edition of OOo and include:@@@@1@12@@danf@17-8-2009
10641930@unknown@formal@none@1@S@* [[Visual Basic for Applications|VBA]] macros support.@@@@1@7@@danf@17-8-2009
10641940@unknown@formal@none@1@S@* Faster start up time.@@@@1@5@@danf@17-8-2009
10641950@unknown@formal@none@1@S@* "A [[Linear programming|linear optimization]] solver to optimize a cell value based on arbitrary constraints built into Calc".@@@@1@18@@danf@17-8-2009
10641960@unknown@formal@none@1@S@* Support for multimedia content in documents, using the [[gstreamer]] multimedia framework.@@@@1@12@@danf@17-8-2009
10641970@unknown@formal@none@1@S@* Support for importing [[Microsoft Works]] formats, [[WordPerfect]] graphics (WPG format) and T602 files.@@@@1@14@@danf@17-8-2009
10641980@unknown@formal@none@1@S@[http://wiki.services.openoffice.org/wiki/Contributing_Patches Details about the patch handling including metrics] can be found on the OpenOffice.org site.@@@@1@15@@danf@17-8-2009
10641990@unknown@formal@none@1@S@== Reactions ==@@@@1@3@@danf@17-8-2009
10642000@unknown@formal@none@1@S@''Federal Computer Week'' listed OpenOffice.org as one of the "5 stars of open-source products."@@@@1@14@@danf@17-8-2009
10642010@unknown@formal@none@1@S@In contrast, OpenOffice.org was used in [[2005]] by ''[[The Guardian]]'' newspaper to illustrate what it claims are the limitations of open-source software, although the article does finish by stating that the software may be better than MS Word for books.@@@@1@40@@danf@17-8-2009
10642020@unknown@formal@none@1@S@=== Market share ===@@@@1@4@@danf@17-8-2009
10642030@unknown@formal@none@1@S@It is extremely difficult to estimate the market share of OpenOffice.org due to the fact that OpenOffice.org can be freely distributed via download sites including mirrors, peer-to-peer networks, CDs, Linux distros, etc.@@@@1@32@@danf@17-8-2009
10642040@unknown@formal@none@1@S@Nevertheless, the OpenOffice.org project tries to capture key adoption data in a market share analysis.@@@@1@15@@danf@17-8-2009
10642050@unknown@formal@none@1@S@Although Microsoft Office retains 95% of the general market as measured by revenue, OpenOffice.org and StarOffice have secured 14% of the large enterprise market as of 2004 and 19% of the small to midsize business market in 2005.@@@@1@38@@danf@17-8-2009
10642060@unknown@formal@none@1@S@The OpenOffice.org web site reports more than 98 million downloads.@@@@1@10@@danf@17-8-2009
10642070@unknown@formal@none@1@S@Other large scale users of OpenOffice.org include [[Ministry of Defence (Singapore)|Singapore’s Ministry of Defence]], and [[Bristol]] City Council in the UK.@@@@1@21@@danf@17-8-2009
10642080@unknown@formal@none@1@S@In [[France]], OpenOffice.org has attracted the attention of both local and national government administrations who wish to rationalize their software procurement, as well as have stable, standard file formats for archival purposes.@@@@1@32@@danf@17-8-2009
10642090@unknown@formal@none@1@S@It is now the official office suite for the [[French Gendarmerie]].@@@@1@11@@danf@17-8-2009
10642100@unknown@formal@none@1@S@Several government organizations in India, such as [[IIT Bombay]] (a renowned technical institute), the [[Supreme Court of India]] and the [[Allahabad High Court]], which use Linux, rely completely on OpenOffice.org for their administration.@@@@1@33@@danf@17-8-2009
10642110@unknown@formal@none@1@S@On [[October 4]], [[2005]], Sun and [[Google]] announced a strategic partnership.@@@@1@11@@danf@17-8-2009
10642120@unknown@formal@none@1@S@As part of this agreement, Sun will add a Google search bar to OpenOffice.org, Sun and Google will engage in joint marketing activities as well as joint research and development, and Google will help distribute OpenOffice.org.@@@@1@36@@danf@17-8-2009
10642130@unknown@formal@none@1@S@Google is currently distributing StarOffice as part of the [[Google Pack]].@@@@1@11@@danf@17-8-2009
10642140@unknown@formal@none@1@S@Besides StarOffice, there are still a number of OpenOffice.org derived commercial products.@@@@1@12@@danf@17-8-2009
10642150@unknown@formal@none@1@S@Most of them are developed under the [[SISSL]] license (which is valid up to OpenOffice.org 2.0 Beta 2).@@@@1@18@@danf@17-8-2009
10642160@unknown@formal@none@1@S@In general they are targeted at local or niche markets, with proprietary add-ons such as a speech recognition module, automatic database connection, or better [[CJK]] support.@@@@1@26@@danf@17-8-2009
10642170@unknown@formal@none@1@S@In July 2007 Everex, a division of First International Computer and the 9th largest PC supplier in the U.S., began shipping systems preloaded with OpenOffice.org 2.2 into Wal-Mart and Sam's Club throughout North America.@@@@1@34@@danf@17-8-2009
10642180@unknown@formal@none@1@S@In September 2007 IBM announced that it would supply and support OpenOffice.org branded as [[Lotus Symphony]], and integrated into Lotus Notes.@@@@1@21@@danf@17-8-2009
10642190@unknown@formal@none@1@S@IBM also announced 35 developers would be assigned to work on OpenOffice.org, and that it would join the OpenOffice.org foundation.@@@@1@20@@danf@17-8-2009
10642200@unknown@formal@none@1@S@Commentators noted parallels between IBM's 2000 support of Linux and this announcement.@@@@1@12@@danf@17-8-2009
10642210@unknown@formal@none@1@S@=== Java controversy ===@@@@1@4@@danf@17-8-2009
10642220@unknown@formal@none@1@S@In the past OpenOffice.org was criticized for an increasing dependency on the [[Java Runtime Environment]] which was not [[free software]].@@@@1@20@@danf@17-8-2009
10642230@unknown@formal@none@1@S@That Sun Microsystems is both the creator of Java and the chief supporter of OpenOffice.org drew accusations of ulterior motives for this technology choice.@@@@1@24@@danf@17-8-2009
10642240@unknown@formal@none@1@S@Version 1 depended on the [[Java Runtime Environment]] (JRE) being present on the user’s computer for some auxiliary functions, but version 2 increased the suite’s use of Java requiring a JRE.@@@@1@31@@danf@17-8-2009
10642250@unknown@formal@none@1@S@In response, [[Red Hat]] increased their efforts to improve [[free Java implementations]].@@@@1@12@@danf@17-8-2009
10642260@unknown@formal@none@1@S@Red Hat’s [[Fedora (Linux distribution)|Fedora Core]] 4 (released on [[June 13]], [[2005]]) included a beta version of OpenOffice.org version 2, running on [[GNU Compiler for Java|GCJ]] and [[GNU Classpath]].@@@@1@29@@danf@17-8-2009
10642270@unknown@formal@none@1@S@The issue of OpenOffice.org’s use of Java came to the fore in May 2005, when [[Richard Stallman]] appeared to call for a [[fork (software)|fork]] of the application in a posting on the [[Free Software Foundation]] website.@@@@1@36@@danf@17-8-2009
10642280@unknown@formal@none@1@S@This led to discussions within the OpenOffice.org community and between Sun staff and developers involved in [[GNU Classpath]], a free replacement for Sun’s Java implementation.@@@@1@25@@danf@17-8-2009
10642290@unknown@formal@none@1@S@Later that year, the OpenOffice.org developers also placed into their development guidelines various requirements to ensure that future versions of OpenOffice.org could be run on free implementations of Java and fixed the issues which previously prevented OpenOffice.org 2.0 from using free software Java implementations.@@@@1@44@@danf@17-8-2009
10642300@unknown@formal@none@1@S@On [[November 13]], [[2006]], Sun committed to releasing Java under the [[GNU General Public License]] in the near future.@@@@1@19@@danf@17-8-2009
10642310@unknown@formal@none@1@S@This process would end OpenOffice.org's dependence on [[non-free]] software.@@@@1@9@@danf@17-8-2009
10642320@unknown@formal@none@1@S@Between November 2006 and May 2007, Sun Microsystems made available most of their Java technologies under the GNU General Public License, in compliance with the specifications of the Java Community Process, thus making almost all of Sun's Java also free software.@@@@1@41@@danf@17-8-2009
10642330@unknown@formal@none@1@S@The following areas of OpenOffice.org 2.0 depend on the JRE being present:@@@@1@12@@danf@17-8-2009
10642340@unknown@formal@none@1@S@* The [[media player (application software)|media player]] on Unix-like systems@@@@1@10@@danf@17-8-2009
10642350@unknown@formal@none@1@S@* All document wizards in Writer@@@@1@6@@danf@17-8-2009
10642360@unknown@formal@none@1@S@* Accessibility tools@@@@1@3@@danf@17-8-2009
10642370@unknown@formal@none@1@S@* Report Autopilot@@@@1@3@@danf@17-8-2009
10642380@unknown@formal@none@1@S@* [[JDBC]] driver support@@@@1@4@@danf@17-8-2009
10642390@unknown@formal@none@1@S@* [[Hsqldb|HSQL]] database engine, which is used in OpenOffice.org Base@@@@1@10@@danf@17-8-2009
10642400@unknown@formal@none@1@S@* [[XSLT]] filters@@@@1@3@@danf@17-8-2009
10642410@unknown@formal@none@1@S@* [[BeanShell]], the [[NetBeans]] scripting language and the Java UNO bridge@@@@1@11@@danf@17-8-2009
10642420@unknown@formal@none@1@S@* Export filters to the Aportis.doc (.pdb) format for the [[Palm OS]] or [[Pocket Word]] (.psw) format for the [[Pocket PC]]@@@@1@21@@danf@17-8-2009
10642430@unknown@formal@none@1@S@* Export filter to [[LaTeX]]@@@@1@5@@danf@17-8-2009
10642440@unknown@formal@none@1@S@* Export filter to [[MediaWiki]]'s [[wikitext]]@@@@1@6@@danf@17-8-2009
10642450@unknown@formal@none@1@S@A common point of confusion is that [[mail merge]] to generate emails requires the Java API JavaMail in [[StarOffice]]; however, as of version 2.0.1, OpenOffice.org uses a [[Python (programming language)|Python]]-component instead.@@@@1@31@@danf@17-8-2009
10642460@unknown@formal@none@1@S@=== Complementary software ===@@@@1@4@@danf@17-8-2009
10642470@unknown@formal@none@1@S@OpenOffice.org provides replacements for MS Office's [[Microsoft Word]], [[Microsoft Excel]], [[Microsoft PowerPoint]], [[Microsoft Access]], [[Equation Editor|Microsoft Equation Editor]] and [[Microsoft Visio]].@@@@1@21@@danf@17-8-2009
10642480@unknown@formal@none@1@S@To match the functionality of the rest of MS Office, OOo can be complemented with other open source programs such as:@@@@1@21@@danf@17-8-2009
10642490@unknown@formal@none@1@S@* [[Novell Evolution|Evolution]] or [[Mozilla Thunderbird|Thunderbird]]/[[Lightning (software)|Lightning]] for a PIM like [[Microsoft Outlook]].@@@@1@13@@danf@17-8-2009
10642500@unknown@formal@none@1@S@* [[OpenProj]] (which seeks integration with OOo, but might be limited due to licensing issues) for [[Microsoft Project]].@@@@1@18@@danf@17-8-2009
10642510@unknown@formal@none@1@S@* [[Scribus]] for [[Microsoft Publisher]]@@@@1@5@@danf@17-8-2009
10642520@unknown@formal@none@1@S@* [[O3spaces]] for [[Sharepoint]]@@@@1@4@@danf@17-8-2009
10642530@unknown@formal@none@1@S@Microsoft also provides Administrative Template Files ("adm files") that allow MS Office to be configured using Windows Group Policy.@@@@1@19@@danf@17-8-2009
10642540@unknown@formal@none@1@S@Equivalent functionality for OpenOffice.org is provided by [http://openoffice-enterprise.com/ OpenOffice-Enterprise], a commercial product from Open Office Technology, Inc.@@@@1@17@@danf@17-8-2009
10642550@unknown@formal@none@1@S@=== Issues ===@@@@1@3@@danf@17-8-2009
10642560@unknown@formal@none@1@S@OpenOffice.org has been criticized for slow start times and extensive CPU and RAM usage in comparison to other competitive software such as Microsoft Office.@@@@1@24@@danf@17-8-2009
10642570@unknown@formal@none@1@S@Tests comparing OpenOffice.org 2.2 and Microsoft Office 2007 found that OpenOffice.org took approximately 2 times the processing time and memory to load itself along with a blank file, and approximately 4.7 times the processing time and 3.9 times the memory to open an extremely large spreadsheet file.@@@@1@47@@danf@17-8-2009
10642580@unknown@formal@none@1@S@Critics have pointed to excessive code bloat and OpenOffice.org's loading of the [[Java Virtual Machine|Java Runtime Environment]] as possible reasons for the slow speeds and excessive memory usage.@@@@1@28@@danf@17-8-2009
10642590@unknown@formal@none@1@S@However, since OpenOffice.org 2.2 the performance of OpenOffice.org has been improved dramatically.@@@@1@12@@danf@17-8-2009
10642600@unknown@formal@none@1@S@One of the greatest challenges is its ability to be truly cross compatible with other applications.@@@@1@16@@danf@17-8-2009
10642610@unknown@formal@none@1@S@Since OpenOffice.org is forced to reverse engineer proprietary binary formats due to the unavailability of open specifications, slight formatting incompatibilities tend to exist when files are saved in a non-native format.@@@@1@31@@danf@17-8-2009
10642620@unknown@formal@none@1@S@For example, a complex .doc document formatted under OpenOffice.org is usually not displayed with the correct format when opened with Microsoft Office.@@@@1@22@@danf@17-8-2009
10642630@unknown@formal@none@1@S@== Retail ==@@@@1@3@@danf@17-8-2009
10642640@unknown@formal@none@1@S@The [[free software license]] under which OpenOffice.org is distributed allows unlimited use of the software for both home and business use, including unlimited redistribution of the software.@@@@1@27@@danf@17-8-2009
10642650@unknown@formal@none@1@S@Several businesses sell the OpenOffice.org suite on auction websites such as [[eBay]], offering value-added services such as 24/7 technical support, download mirrors, and CD mailing.@@@@1@25@@danf@17-8-2009
10642660@unknown@formal@none@1@S@However, often the 24/7 support offered is not provided by the company selling the software, but rather by the official OpenOffice.org mailing list.@@@@1@23@@danf@17-8-2009
10650010@unknown@formal@none@1@S@Parsing@@@@1@1@@danf@17-8-2009
10650020@unknown@formal@none@1@S@In [[computer science]] and [[linguistics]], '''parsing''', or, more formally, '''syntactic analysis''', is the process of analyzing a sequence of [[Token (parser)|tokens]] to determine grammatical structure with respect to a given (more or less) [[formal grammar]].@@@@1@35@@danf@17-8-2009
10650030@unknown@formal@none@1@S@A '''parser''' is thus one of the components in an [[interpreter]] or [[compiler]], where it captures the implied hierarchy of the input text and transforms it into a form suitable for further processing (often some kind of [[parse tree]], [[abstract syntax tree]] or other hierarchical structure) and normally checks for syntax errors at the same time.@@@@1@56@@danf@17-8-2009
10650040@unknown@formal@none@1@S@The parser often uses a separate [[lexical analyser]] to create tokens from the sequence of input characters.@@@@1@17@@danf@17-8-2009
10650050@unknown@formal@none@1@S@Parsers may be programmed by hand or may be semi-automatically generated (in some programming language) by a tool (such as [[Yet Another Compiler Compiler|Yacc]]) from a grammar written in [[Backus-Naur form]].@@@@1@31@@danf@17-8-2009
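The division of labour described above, a lexical analyser feeding tokens to a hand-programmed parser that builds a hierarchical structure and reports syntax errors, can be illustrated with a minimal sketch. The arithmetic grammar, the tuple-based tree shape and the names `tokenize`, `Parser` and `parse` are assumptions chosen for the example; a production parser would normally also track source positions.

```python
import re

# Grammar handled by this sketch, in an informal BNF-like notation:
#   expr   ::= term   (("+" | "-") term)*
#   term   ::= factor (("*" | "/") factor)*
#   factor ::= NUMBER | "(" expr ")"

TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Lexical analysis: turn a character sequence into a list of tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Recursive-descent parser: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())  # left-associative tree node
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            return int(token)
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")


def parse(text):
    """Tokenize and parse in one step, returning a nested-tuple tree."""
    return Parser(tokenize(text)).parse()


if __name__ == "__main__":
    print(parse("1 + 2 * (3 - 4)"))
```

Each nested tuple has the form `(operator, left, right)`, so operator precedence and grouping appear as tree depth rather than as parentheses. A generator such as Yacc would derive equivalent parsing logic from the grammar itself instead of from hand-written methods.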
10650060@unknown@formal@none@1@S@Parsing is also an earlier term for the diagramming of sentences of natural languages, and is still used for the diagramming of [[Inflection|inflected]] languages, such as the [[Romance languages|Romance languages]] or [[Latin]].@@@@1@32@@danf@17-8-2009
10650070@unknown@formal@none@1@S@Parsers can also be constructed as executable specifications of grammars in functional programming languages.@@@@1@14@@danf@17-8-2009
10650080@unknown@formal@none@1@S@Frost, Hafiz and Callaghan have built on the work of others to construct a set of [[higher-order function]]s (called [[parser combinators]]) which allow top-down parsers with polynomial time and space complexity to be constructed as executable specifications of ambiguous grammars containing left-recursive productions.@@@@1@43@@danf@17-8-2009
10650090@unknown@formal@none@1@S@The [http://www.cs.uwindsor.ca/~hafiz/proHome.html X-SAIGA] site has more about the algorithms and implementation details.@@@@1@12@@danf@17-8-2009
10650100@unknown@formal@none@1@S@== Human languages ==@@@@1@4@@danf@17-8-2009
10650110@unknown@formal@none@1@S@:''Also see [[:Category:Natural language parsing]]''@@@@1@5@@danf@17-8-2009
10650120@unknown@formal@none@1@S@In some [[machine translation]] and [[natural language processing]] systems, human languages are parsed by computer programs.@@@@1@16@@danf@17-8-2009
10650130@unknown@formal@none@1@S@Human sentences are not easily parsed by programs, as there is substantial [[syntactic ambiguity|ambiguity]] in the structure of human language.@@@@1@20@@danf@17-8-2009
10650140@unknown@formal@none@1@S@In order to parse natural language data, researchers must first agree on the [[grammar]] to be used.@@@@1@17@@danf@17-8-2009
10650150@unknown@formal@none@1@S@The choice of syntax is affected by both [[linguistic]] and computational concerns; for instance, some parsing systems use [[lexical functional grammar]], but in general, parsing for grammars of this type is known to be [[NP-complete]].@@@@1@35@@danf@17-8-2009
10650160@unknown@formal@none@1@S@[[Head-driven phrase structure grammar]] is another linguistic formalism which has been popular in the parsing community, but other research efforts have focused on less complex formalisms such as the one used in the Penn [[Treebank]].@@@@1@35@@danf@17-8-2009
10650170@unknown@formal@none@1@S@[[Shallow parsing]] aims to find only the boundaries of major constituents such as noun phrases.@@@@1@15@@danf@17-8-2009
10650180@unknown@formal@none@1@S@Another popular strategy for avoiding linguistic controversy is [[dependency grammar]] parsing.@@@@1@11@@danf@17-8-2009
10650190@unknown@formal@none@1@S@Most modern parsers are at least partly [[statistics|statistical]]; that is, they rely on a corpus of training data which has already been annotated (parsed by hand).@@@@1@26@@danf@17-8-2009
10650200@unknown@formal@none@1@S@This approach allows the system to gather information about the frequency with which various constructions occur in specific contexts.@@@@1@19@@danf@17-8-2009
10650210@unknown@formal@none@1@S@''(See [[machine learning]].)''@@@@1@3@@danf@17-8-2009
10650220@unknown@formal@none@1@S@Approaches which have been used include straightforward [[PCFG]]s (probabilistic context free grammars), [[maximum entropy]], and [[neural net]]s.@@@@1@17@@danf@17-8-2009
10650230@unknown@formal@none@1@S@Most of the more successful systems use ''lexical'' statistics (that is, they consider the identities of the words involved, as well as their [[part of speech]]).@@@@1@26@@danf@17-8-2009
10650240@unknown@formal@none@1@S@However, such systems are vulnerable to [[overfitting]] and require some kind of smoothing to be effective.@@@@1@16@@danf@17-8-2009
10650250@unknown@formal@none@1@S@Parsing algorithms for natural language cannot rely on the grammar having 'nice' properties as with manually-designed grammars for programming languages.@@@@1@20@@danf@17-8-2009
10650260@unknown@formal@none@1@S@As mentioned earlier, some grammar formalisms are very computationally difficult to parse; in general, even if the desired structure is not [[context-free]], some kind of context-free approximation to the grammar is used to perform a first pass.@@@@1@37@@danf@17-8-2009
10650265@unknown@formal@none@1@S@Algorithms which use context-free grammars often rely on some variant of the [[CKY algorithm]], usually with some [[heuristic (computer science)|heuristic]] to prune away unlikely analyses to save time.@@@@1@28@@danf@17-8-2009
10650270@unknown@formal@none@1@S@''(See [[chart parsing]].)''@@@@1@3@@danf@17-8-2009
10650280@unknown@formal@none@1@S@However, some systems trade accuracy for speed using, e.g., linear-time versions of the [[Shift-reduce parsing|shift-reduce]] algorithm.@@@@1@16@@danf@17-8-2009
10650290@unknown@formal@none@1@S@A somewhat recent development has been [[parse reranking]] in which the parser proposes some large number of analyses, and a more complex system selects the best option.@@@@1@27@@danf@17-8-2009
10650300@unknown@formal@none@1@S@The resulting parse is normally a branching structure of one part and its subparts.@@@@1@14@@danf@17-8-2009
10650310@unknown@formal@none@1@S@== Programming languages ==@@@@1@4@@danf@17-8-2009
10650320@unknown@formal@none@1@S@The most common use of a parser is as a component of a [[compiler]] or [[interpreter]].@@@@1@16@@danf@17-8-2009
10650330@unknown@formal@none@1@S@This parses the [[source code]] of a [[computer programming language]] to create some form of internal representation.@@@@1@17@@danf@17-8-2009
10650340@unknown@formal@none@1@S@Programming languages tend to be specified in terms of a [[context-free grammar]] because fast and efficient parsers can be written for them.@@@@1@22@@danf@17-8-2009
10650350@unknown@formal@none@1@S@Parsers are written by hand or generated by [[parser generator]]s.@@@@1@10@@danf@17-8-2009
10650360@unknown@formal@none@1@S@Context-free grammars are limited in the extent to which they can express all of the requirements of a language.@@@@1@19@@danf@17-8-2009
10650370@unknown@formal@none@1@S@Informally, the reason is that the memory of such a language is limited.@@@@1@13@@danf@17-8-2009
10650380@unknown@formal@none@1@S@The grammar cannot remember the presence of a construct over an arbitrarily long input; this is necessary for a language in which, for example, a name must be declared before it may be referenced.@@@@1@34@@danf@17-8-2009
10650390@unknown@formal@none@1@S@More powerful grammars that can express this constraint, however, cannot be parsed efficiently.@@@@1@13@@danf@17-8-2009
10650400@unknown@formal@none@1@S@Thus, it is a common strategy to create a relaxed parser for a context-free grammar which accepts a superset of the desired language constructs (that is, it accepts some invalid constructs); later, the unwanted constructs can be filtered out.@@@@1@39@@danf@17-8-2009
10650410@unknown@formal@none@1@S@===Overview of process===@@@@1@3@@danf@17-8-2009
10650420@unknown@formal@none@1@S@[[image:Parser_Flow.gif|right|Flow of data in a typical parser]] The following example demonstrates the common case of parsing a computer language with two levels of grammar: lexical and syntactic.@@@@1@27@@danf@17-8-2009
10650430@unknown@formal@none@1@S@The first stage is token generation, or [[lexical analysis]], by which the input character stream is split into meaningful symbols defined by a grammar of [[regular expression]]s.@@@@1@27@@danf@17-8-2009
10650440@unknown@formal@none@1@S@For example, a calculator program would look at an input such as "12*(3+4)^2" and split it into the tokens 12, *, (, 3, +, 4, ), ^, and 2, each of which is a meaningful symbol in the context of an arithmetic expression.@@@@1@43@@danf@17-8-2009
10650450@unknown@formal@none@1@S@The lexer would contain rules to tell it that the characters *, +, ^, ( and ) mark the start of a new token, so meaningless tokens like "12*" or "(3" will not be generated.@@@@1@35@@danf@17-8-2009
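The token-generation step described above can be sketched in Python. This is only an illustration under stated assumptions: the token grammar (unsigned integers plus the single-character symbols `+ - * / ^ ( )`) and the names `TOKEN_PATTERN` and `tokenize` are hypothetical and do not come from any particular calculator program.

```python
import re

# Hypothetical token grammar: an unsigned integer or one operator/parenthesis,
# optionally preceded by whitespace.
TOKEN_PATTERN = re.compile(r"\s*(\d+|[-+*/^()])")


def tokenize(text):
    """Split an arithmetic expression into a list of token strings."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_PATTERN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


print(tokenize("12*(3+4)^2"))
# prints ['12', '*', '(', '3', '+', '4', ')', '^', '2']
```

Because each operator character is a token in its own right, fragments such as "12*" or "(3" can never be emitted as a single token.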
10650460@unknown@formal@none@1@S@The next stage is parsing or syntactic analysis, which is checking that the tokens form an allowable expression.@@@@1@18@@danf@17-8-2009
10650470@unknown@formal@none@1@S@This is usually done with reference to a [[context-free grammar]] which recursively defines components that can make up an expression and the order in which they must appear.@@@@1@28@@danf@17-8-2009
10650480@unknown@formal@none@1@S@However, not all rules defining programming languages can be expressed by context-free grammars alone, for example type validity and proper declaration of identifiers.@@@@1@23@@danf@17-8-2009
10650490@unknown@formal@none@1@S@These rules can be formally expressed with [[attribute grammar]]s.@@@@1@9@@danf@17-8-2009
10650500@unknown@formal@none@1@S@The final phase is [[Semantic analysis (computer science)|semantic parsing]] or analysis, which is working out the implications of the expression just validated and taking the appropriate action.@@@@1@27@@danf@17-8-2009
10650510@unknown@formal@none@1@S@In the case of a calculator or interpreter, the action is to evaluate the expression or program; a compiler, on the other hand, would generate some kind of code.@@@@1@29@@danf@17-8-2009
10650520@unknown@formal@none@1@S@Attribute grammars can also be used to define these actions.@@@@1@10@@danf@17-8-2009
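The three stages described above (token generation, syntactic analysis, and evaluation) can be combined in a small recursive-descent calculator. The sketch below is a minimal illustration, not a reference implementation: the grammar, the class name `Parser` and the function `evaluate` are assumptions chosen for this example, and `^` is assumed to be right-associative exponentiation.

```python
import re


def tokenize(text):
    # Simplified lexer: characters outside the token grammar are silently skipped.
    return re.findall(r"\d+|[-+*/^()]", text)


class Parser:
    """Recursive-descent parser for the hypothetical grammar
        expr   := term (('+' | '-') term)*
        term   := factor (('*' | '/') factor)*
        factor := atom ('^' factor)?
        atom   := NUMBER | '(' expr ')'
    Each method checks the syntax and evaluates the sub-expression it parses.
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        base = self.atom()
        if self.peek() == "^":
            self.advance()
            return base ** self.factor()  # right-associative
        return base

    def atom(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            return int(token)
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


print(evaluate("12*(3+4)^2"))  # prints 588
```

A compiler would follow the same structure but emit code or build a tree in each method instead of computing a value.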
10650530@unknown@formal@none@1@S@==Types of parsers==@@@@1@3@@danf@17-8-2009
10650540@unknown@formal@none@1@S@The task of the parser is essentially to determine if and how the input can be derived from the start symbol of the grammar.@@@@1@24@@danf@17-8-2009
10650550@unknown@formal@none@1@S@This can be done in essentially two ways:@@@@1@8@@danf@17-8-2009
10650560@unknown@formal@none@1@S@*[[Top-down parsing]] - Top-down parsing can be viewed as an attempt to find left-most derivations of an input-stream by searching for [[parse tree|parse-trees]] using a top-down expansion of the given [[formal grammar]] rules.@@@@1@33@@danf@17-8-2009
10650570@unknown@formal@none@1@S@Tokens are consumed from left to right.@@@@1@7@@danf@17-8-2009
10650580@unknown@formal@none@1@S@Inclusive choice is used to accommodate [[ambiguity]] by expanding all alternative right-hand-sides of grammar rules.@@@@1@15@@danf@17-8-2009
10650590@unknown@formal@none@1@S@[[LL parser]]s and [[recursive-descent parser]]s are examples of top-down parsers, which cannot accommodate [[left recursion|left-recursive]] productions.@@@@1@16@@danf@17-8-2009
10650600@unknown@formal@none@1@S@Although it has been believed that simple implementations of top-down parsing cannot accommodate direct and indirect left-recursion and may require exponential time and space complexity while parsing ambiguous [[context-free grammar]]s, more sophisticated algorithms for top-down parsing have been created by Frost, Hafiz, and Callaghan which accommodate [[ambiguity]] and [[left recursion]] in polynomial time and which generate polynomial-size representations of the potentially-exponential number of parse trees.@@@@1@65@@danf@17-8-2009
10650610@unknown@formal@none@1@S@Their algorithm is able to produce both left-most and right-most derivations of an input w.r.t. a given CFG.@@@@1@18@@danf@17-8-2009
10650620@unknown@formal@none@1@S@*[[Bottom-up parsing]] - A parser can start with the input and attempt to rewrite it to the start symbol.@@@@1@19@@danf@17-8-2009
10650630@unknown@formal@none@1@S@Intuitively, the parser attempts to locate the most basic elements, then the elements containing these, and so on.@@@@1@18@@danf@17-8-2009
10650640@unknown@formal@none@1@S@[[LR parser]]s are examples of bottom-up parsers.@@@@1@7@@danf@17-8-2009
10650650@unknown@formal@none@1@S@Another term used for this type of parsing is shift-reduce parsing.@@@@1@11@@danf@17-8-2009
10650660@unknown@formal@none@1@S@Another important distinction is whether the parser generates a ''leftmost derivation'' or a ''rightmost derivation'' (see [[context-free grammar]]).@@@@1@18@@danf@17-8-2009
10650670@unknown@formal@none@1@S@LL parsers will generate a leftmost [[derivation]] and LR parsers will generate a rightmost derivation (although usually in reverse).@@@@1@19@@danf@17-8-2009
10650680@unknown@formal@none@1@S@== Examples of parsers ==@@@@1@5@@danf@17-8-2009
10650690@unknown@formal@none@1@S@=== Top-down parsers ===@@@@1@4@@danf@17-8-2009
10650700@unknown@formal@none@1@S@Some of the parsers that use [[top-down parsing]] include:@@@@1@9@@danf@17-8-2009
10650710@unknown@formal@none@1@S@* [[Recursive descent parser]]@@@@1@4@@danf@17-8-2009
10650720@unknown@formal@none@1@S@* [[LL parser]] ('''L'''eft-to-right, '''L'''eftmost derivation)@@@@1@6@@danf@17-8-2009
10650730@unknown@formal@none@1@S@* [http://www.cs.uwindsor.ca/~hafiz/proHome.html X-SAIGA] - eXecutable SpecificAtIons of GrAmmars.@@@@1@8@@danf@17-8-2009
10650740@unknown@formal@none@1@S@Contains publications related to a top-down parsing algorithm that supports left-recursion and ambiguity in polynomial time and space.@@@@1@18@@danf@17-8-2009
10650750@unknown@formal@none@1@S@=== Bottom-up parsers ===@@@@1@4@@danf@17-8-2009
10650760@unknown@formal@none@1@S@Some of the parsers that use [[bottom-up parsing]] include:@@@@1@9@@danf@17-8-2009
10650770@unknown@formal@none@1@S@* Precedence parser@@@@1@3@@danf@17-8-2009
10650780@unknown@formal@none@1@S@** [[Operator-precedence parser]]@@@@1@3@@danf@17-8-2009
10650790@unknown@formal@none@1@S@** [[Simple precedence parser]]@@@@1@4@@danf@17-8-2009
10650800@unknown@formal@none@1@S@* BC (bounded context) parsing@@@@1@5@@danf@17-8-2009
10650810@unknown@formal@none@1@S@* [[LR parser]] ('''L'''eft-to-right, '''R'''ightmost derivation)@@@@1@6@@danf@17-8-2009
10650820@unknown@formal@none@1@S@** [[SLR parser|Simple LR (SLR) parser]]@@@@1@6@@danf@17-8-2009
10650830@unknown@formal@none@1@S@** [[LALR parser]]@@@@1@3@@danf@17-8-2009
10650840@unknown@formal@none@1@S@** [[Canonical LR parser|Canonical LR (LR(1)) parser]]@@@@1@7@@danf@17-8-2009
10650850@unknown@formal@none@1@S@** [[GLR parser]]@@@@1@3@@danf@17-8-2009
10650860@unknown@formal@none@1@S@* [[CYK algorithm|CYK parser]]@@@@1@4@@danf@17-8-2009
10660010@unknown@formal@none@1@S@Lexical category@@@@1@2@@danf@17-8-2009
10660020@unknown@formal@none@1@S@In [[grammar]], a '''lexical category''' (also '''word class''', '''lexical class''', or in traditional grammar '''part of speech''') is a linguistic category of words (or more precisely ''lexical items''), which is generally defined by the [[syntactic]] or [[morphology (linguistics)|morphological]] behaviour of the lexical item in question.@@@@1@45@@danf@17-8-2009
10660030@unknown@formal@none@1@S@Common linguistic categories include ''noun'' and ''verb'', among others.@@@@1@9@@danf@17-8-2009
10660040@unknown@formal@none@1@S@There are [[open class word|open word classes]], which constantly acquire new members, and [[closed class word|closed word classes]], which acquire new members infrequently if at all.@@@@1@26@@danf@17-8-2009
10660050@unknown@formal@none@1@S@Different languages may have different lexical categories, or they might associate different properties to the same one.@@@@1@17@@danf@17-8-2009
10660060@unknown@formal@none@1@S@For example, [[Japanese language|Japanese]] has at least three classes of adjectives where English has one; Chinese and Japanese have [[measure word]]s while European languages have nothing resembling them; many languages don't have a distinction between adjectives and adverbs, or adjectives and nouns, etc.@@@@1@43@@danf@17-8-2009
10660070@unknown@formal@none@1@S@Many linguists argue that the formal distinctions between parts of speech must be made within the framework of a specific language or language family, and should not be carried over to other languages or language families.@@@@1@36@@danf@17-8-2009
10660080@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10660090@unknown@formal@none@1@S@The classification of words into lexical categories is found from the earliest moments in the [[history of linguistics]].@@@@1@18@@danf@17-8-2009
10660100@unknown@formal@none@1@S@In the ''[[Nirukta]]'', written in the [[5th century BCE|5th]] or [[6th century BCE]], the [[Sanskrit grammarian]] [[Yāska]] defined four main categories of words:@@@@1@23@@danf@17-8-2009
10660110@unknown@formal@none@1@S@# nāma - [[noun]]s or substantives@@@@1@6@@danf@17-8-2009
10660120@unknown@formal@none@1@S@# ākhyāta - [[verb]]s@@@@1@4@@danf@17-8-2009
10660130@unknown@formal@none@1@S@# upasarga - pre-verbs or [[prefix]]es@@@@1@6@@danf@17-8-2009
10660140@unknown@formal@none@1@S@# nipāta - [[Grammatical particle|particle]]s, invariant words (perhaps [[prepositions]])@@@@1@9@@danf@17-8-2009
10660150@unknown@formal@none@1@S@These four were grouped into two large classes: [[inflection|inflected]] (nouns and verbs) and uninflected (pre-verbs and particles).@@@@1@17@@danf@17-8-2009
10660160@unknown@formal@none@1@S@A century or two later, the [[Classical Greece|Greek]] scholar [[Plato]] wrote in the [[Cratylus (dialogue)|''Cratylus'' dialog]] that "... sentences are, I conceive, a combination of verbs [''rhēma''] and nouns [''ónoma'']".@@@@1@30@@danf@17-8-2009
10660170@unknown@formal@none@1@S@Another class, "conjunctions" (covering [[Grammatical conjunction|conjunction]]s, [[pronoun]]s, and the [[article (grammar)|article]]), was later added by [[Aristotle]].@@@@1@16@@danf@17-8-2009
10660180@unknown@formal@none@1@S@By the end of the [[2nd century BCE]], the classification scheme had been expanded into eight categories, seen in the ''[[Art of Grammar|Tékhnē grammatiké]]'':@@@@1@24@@danf@17-8-2009
10660190@unknown@formal@none@1@S@# Noun: a part of speech inflected for case, signifying a concrete or abstract entity@@@@1@15@@danf@17-8-2009
10660200@unknown@formal@none@1@S@# Verb: a part of speech without case inflection, but inflected for tense, person and number, signifying an activity or process performed or undergone@@@@1@24@@danf@17-8-2009
10660210@unknown@formal@none@1@S@# Participle: a part of speech sharing the features of the verb and the noun@@@@1@15@@danf@17-8-2009
10660220@unknown@formal@none@1@S@# Article: a part of speech inflected for case and preposed or postposed to nouns (the relative pronoun is meant by the postposed article)@@@@1@24@@danf@17-8-2009
10660230@unknown@formal@none@1@S@# Pronoun: a part of speech substitutable for a noun and marked for person@@@@1@14@@danf@17-8-2009
10660240@unknown@formal@none@1@S@# Preposition: a part of speech placed before other words in composition and in syntax@@@@1@15@@danf@17-8-2009
10660250@unknown@formal@none@1@S@# Adverb: a part of speech without inflection, in modification of or in addition to a verb@@@@1@17@@danf@17-8-2009
10660260@unknown@formal@none@1@S@# Conjunction: a part of speech binding together the discourse and filling gaps in its interpretation@@@@1@16@@danf@17-8-2009
10660270@unknown@formal@none@1@S@The [[Latin grammar]]ian [[Priscian]] ([[floruit|fl.]] [[500 CE]]) modified the above eight-fold system, substituting "[[interjection]]" for "article".@@@@1@16@@danf@17-8-2009
10660280@unknown@formal@none@1@S@It wasn't until 1767 that the [[adjective]] was taken as a separate class.@@@@1@13@@danf@17-8-2009
10660290@unknown@formal@none@1@S@Traditional English grammar is patterned after the European tradition above, and is still taught in schools and used in [[dictionaries]].@@@@1@20@@danf@17-8-2009
10660300@unknown@formal@none@1@S@It names eight parts of speech: [[noun]], [[verb]], [[adjective]], [[adverb]], [[pronoun]], [[preposition]], [[Grammatical conjunction|conjunction]], and [[interjection]] (sometimes called an exclamation).@@@@1@20@@danf@17-8-2009
10660310@unknown@formal@none@1@S@==Controversies==@@@@1@1@@danf@17-8-2009
10660320@unknown@formal@none@1@S@Since the Greek grammarians of the 2nd century BCE, parts of speech have been defined by [[morphology (linguistics)|morphological]], [[syntax|syntactic]] and [[semantics|semantic]] criteria.@@@@1@22@@danf@17-8-2009
10660330@unknown@formal@none@1@S@However, there is currently no generally agreed-upon classification scheme that can apply to all languages, or even a set of criteria upon which such a scheme should be based.@@@@1@29@@danf@17-8-2009
10660340@unknown@formal@none@1@S@Linguists recognize that the above list of eight word classes is simplified and artificial.@@@@1@14@@danf@17-8-2009
10660350@unknown@formal@none@1@S@For example, "adverb" is to some extent a catch-all class that includes words with many different functions.@@@@1@17@@danf@17-8-2009
10660360@unknown@formal@none@1@S@Some have even argued that the most basic of category distinctions, that of nouns and verbs, is unfounded, or not applicable to certain languages.@@@@1@24@@danf@17-8-2009
10660370@unknown@formal@none@1@S@==Functional classification==@@@@1@2@@danf@17-8-2009
10660380@unknown@formal@none@1@S@Common ways of delimiting words by function include:@@@@1@8@@danf@17-8-2009
10660390@unknown@formal@none@1@S@* '''[[Open word classes]]:'''@@@@1@4@@danf@17-8-2009
10660400@unknown@formal@none@1@S@**[[adjective]]s@@@@1@1@@danf@17-8-2009
10660410@unknown@formal@none@1@S@**[[adverb]]s@@@@1@1@@danf@17-8-2009
10660420@unknown@formal@none@1@S@**[[interjection]]s@@@@1@1@@danf@17-8-2009
10660430@unknown@formal@none@1@S@**[[noun]]s@@@@1@1@@danf@17-8-2009
10660440@unknown@formal@none@1@S@**[[verb]]s (except [[auxiliary verb]]s)@@@@1@4@@danf@17-8-2009
10660450@unknown@formal@none@1@S@* '''[[Closed word classes]]:'''@@@@1@4@@danf@17-8-2009
10660460@unknown@formal@none@1@S@**[[auxiliary verb]]s@@@@1@2@@danf@17-8-2009
10660470@unknown@formal@none@1@S@**[[clitic]]s@@@@1@1@@danf@17-8-2009
10660480@unknown@formal@none@1@S@**[[coverb]]s@@@@1@1@@danf@17-8-2009
10660490@unknown@formal@none@1@S@**[[Grammatical conjunction|conjunction]]s@@@@1@2@@danf@17-8-2009
10660500@unknown@formal@none@1@S@**[[determiner (class)|Determiner]]s ([[article (grammar)|article]]s, [[quantifier]]s, [[demonstrative adjective]]s, and [[possessive adjective]]s)@@@@1@10@@danf@17-8-2009
10660510@unknown@formal@none@1@S@**[[grammatical particle|particle]]s@@@@1@2@@danf@17-8-2009
10660520@unknown@formal@none@1@S@**[[measure word]]s@@@@1@2@@danf@17-8-2009
10660530@unknown@formal@none@1@S@**[[adposition]]s (prepositions, postpositions, and circumpositions)@@@@1@5@@danf@17-8-2009
10660540@unknown@formal@none@1@S@**[[preverb]]s@@@@1@1@@danf@17-8-2009
10660550@unknown@formal@none@1@S@**[[pronoun]]s@@@@1@1@@danf@17-8-2009
10660560@unknown@formal@none@1@S@**[[Contraction (grammar)|contraction]]s@@@@1@2@@danf@17-8-2009
10660570@unknown@formal@none@1@S@**[[Names of numbers in English#Cardinal numbers|cardinal numbers]]@@@@1@7@@danf@17-8-2009
10660580@unknown@formal@none@1@S@==English==@@@@1@1@@danf@17-8-2009
10660590@unknown@formal@none@1@S@[[English language|English]] frequently does not [[marker (linguistics)|mark]] words as belonging to one part of speech or another.@@@@1@17@@danf@17-8-2009
10660600@unknown@formal@none@1@S@Words like ''neigh'', ''break'', ''outlaw'', ''laser'', ''microwave'' and ''telephone'' might all be either verb forms or nouns.@@@@1@17@@danf@17-8-2009
10660610@unknown@formal@none@1@S@Although ''-ly'' is an adverb marker, not all adverbs end in ''-ly'' and not all words ending in ''-ly'' are adverbs.@@@@1@21@@danf@17-8-2009
10660620@unknown@formal@none@1@S@For instance, ''tomorrow'', ''slow'', ''fast'', ''crosswise'' can all be adverbs, while ''early'', ''friendly'', ''ugly'' are all adjectives (though ''early'' can also function as an adverb).@@@@1@25@@danf@17-8-2009
10660630@unknown@formal@none@1@S@In certain circumstances, even words with primarily grammatical functions can be used as verbs or nouns, as in "We must look to the ''hows'' and not just the ''whys''" or "Miranda was ''to-ing and fro-ing'' and not paying attention".@@@@1@39@@danf@17-8-2009
10670010@unknown@formal@none@1@S@Part-of-speech tagging@@@@1@2@@danf@17-8-2009
10670020@unknown@formal@none@1@S@'''Part-of-speech tagging''' ('''POS tagging''' or '''POST'''), also called '''grammatical tagging''', is the process of marking up the words in a text as corresponding to a particular [[parts of speech|part of speech]], based on both their definition and their context, i.e., their relationship with adjacent and related words in a [[phrase]], [[sentence]], or [[paragraph]].@@@@1@53@@danf@17-8-2009
10670030@unknown@formal@none@1@S@A simplified form of this is commonly taught to school-age children, in the identification of words as [[noun]]s, [[verb]]s, [[adjective]]s, [[adverb]]s, etc.@@@@1@22@@danf@17-8-2009
10670040@unknown@formal@none@1@S@Once performed by hand, POS tagging is now done in the context of [[computational linguistics]], using [[algorithms]] which associate discrete terms, as well as hidden parts of speech, in accordance with a set of descriptive tags.@@@@1@36@@danf@17-8-2009
10670050@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10670060@unknown@formal@none@1@S@Research on part-of-speech tagging has been closely tied to [[corpus linguistics]].@@@@1@11@@danf@17-8-2009
10670070@unknown@formal@none@1@S@The first major corpus of English for computer analysis was the [[Brown Corpus]] developed at [[Brown University]] by [[Henry Kucera]] and [[Nelson Francis]], in the mid-1960s.@@@@1@26@@danf@17-8-2009
10670080@unknown@formal@none@1@S@It consists of about 1,000,000 words of running English prose text, made up of 500 samples from randomly chosen publications.@@@@1@20@@danf@17-8-2009
10670090@unknown@formal@none@1@S@Each sample is 2,000 or more words (ending at the first sentence-end after 2,000 words, so that the corpus contains only complete sentences).@@@@1@23@@danf@17-8-2009
10670100@unknown@formal@none@1@S@The [[Brown Corpus]] was painstakingly "tagged" with part-of-speech markers over many years.@@@@1@12@@danf@17-8-2009
10670110@unknown@formal@none@1@S@A first approximation was done with a program by Greene and Rubin, which consisted of a huge handmade list of what categories could co-occur at all.@@@@1@26@@danf@17-8-2009
10670120@unknown@formal@none@1@S@For example, article then noun can occur, but article then verb (arguably) cannot.@@@@1@13@@danf@17-8-2009
10670130@unknown@formal@none@1@S@The program got about 70% correct.@@@@1@6@@danf@17-8-2009
10670140@unknown@formal@none@1@S@Its results were repeatedly reviewed and corrected by hand, and later users sent in errata, so that by the late 1970s the tagging was nearly perfect (allowing for some cases even human speakers might not agree on).@@@@1@37@@danf@17-8-2009
10670150@unknown@formal@none@1@S@This corpus has been used for innumerable studies of word-frequency and of part-of-speech, and inspired the development of similar "tagged" corpora in many other languages.@@@@1@25@@danf@17-8-2009
10670160@unknown@formal@none@1@S@Statistics derived by analyzing it formed the basis for most later part-of-speech tagging systems, such as CLAWS and [[VOLSUNGA]].@@@@1@19@@danf@17-8-2009
10670170@unknown@formal@none@1@S@However, by this time (2005) it has been superseded by larger corpora such as the 100 million word [[British National Corpus]].@@@@1@21@@danf@17-8-2009
10670180@unknown@formal@none@1@S@For some time, part-of-speech tagging was considered an inseparable part of [[natural language processing]], because there are certain cases where the correct part of speech cannot be decided without understanding the [[semantics]] or even the [[pragmatics]] of the context.@@@@1@39@@danf@17-8-2009
10670190@unknown@formal@none@1@S@This is extremely expensive, especially because analyzing the higher levels is much harder when multiple part-of-speech possibilities must be considered for each word.@@@@1@23@@danf@17-8-2009
10670200@unknown@formal@none@1@S@In the mid-1980s, researchers in Europe began to use [[hidden Markov model]]s (HMMs) to disambiguate parts of speech, when working to tag the [[Lancaster-Oslo-Bergen Corpus]] of British English.@@@@1@28@@danf@17-8-2009
10670210@unknown@formal@none@1@S@HMMs involve counting cases (such as from the Brown Corpus), and making a table of the probabilities of certain sequences.@@@@1@20@@danf@17-8-2009
10670220@unknown@formal@none@1@S@For example, once you've seen an article such as 'the', perhaps the next word is a noun 40% of the time, an adjective 40%, and a number 20%.@@@@1@28@@danf@17-8-2009
10670230@unknown@formal@none@1@S@Knowing this, a program can decide that "can" in "the can" is far more likely to be a noun than a verb or a modal.@@@@1@25@@danf@17-8-2009
10670240@unknown@formal@none@1@S@The same method can of course be used to benefit from knowledge about following words.@@@@1@15@@danf@17-8-2009
10670250@unknown@formal@none@1@S@More advanced ("higher order") HMMs learn the probabilities not only of pairs, but triples or even larger sequences.@@@@1@18@@danf@17-8-2009
10670260@unknown@formal@none@1@S@So, for example, if you've just seen an article and a verb, the next item may be very likely a preposition, article, or noun, but much less likely another verb.@@@@1@30@@danf@17-8-2009
10670270@unknown@formal@none@1@S@When several ambiguous words occur together, the possibilities multiply.@@@@1@9@@danf@17-8-2009
10670280@unknown@formal@none@1@S@However, it is easy to enumerate every combination and to assign a relative probability to each one, by multiplying together the probabilities of each choice in turn.@@@@1@27@@danf@17-8-2009
10670290@unknown@formal@none@1@S@The combination with highest probability is then chosen.@@@@1@8@@danf@17-8-2009
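The enumerate-and-multiply procedure just described can be sketched as follows. All numbers are made-up illustrative values, not statistics from the Brown Corpus or from CLAWS, and the names `TRANSITION`, `EMISSION`, `sequence_probability` and `best_tagging_by_enumeration` are hypothetical.

```python
from itertools import product

START = "<s>"

# Hypothetical probability of a tag given the previous tag.
TRANSITION = {
    (START, "ART"): 0.5,
    ("ART", "NOUN"): 0.8, ("ART", "VERB"): 0.01, ("ART", "MODAL"): 0.01,
    ("NOUN", "NOUN"): 0.1, ("NOUN", "VERB"): 0.4,
    ("VERB", "NOUN"): 0.3, ("VERB", "VERB"): 0.05,
    ("MODAL", "NOUN"): 0.05, ("MODAL", "VERB"): 0.6,
}

# Hypothetical probability of a word given its tag; the inner keys list
# every part of speech the word may take.
EMISSION = {
    "the": {"ART": 0.6},
    "can": {"NOUN": 0.01, "VERB": 0.005, "MODAL": 0.3},
    "rusts": {"NOUN": 0.001, "VERB": 0.002},
}


def sequence_probability(words, tags):
    """Multiply together the probabilities of each choice in turn."""
    prob = 1.0
    prev = START
    for word, tag in zip(words, tags):
        prob *= TRANSITION.get((prev, tag), 0.0) * EMISSION[word][tag]
        prev = tag
    return prob


def best_tagging_by_enumeration(words):
    """Enumerate every combination of candidate tags and keep the most probable."""
    candidates = product(*(EMISSION[word] for word in words))
    return max(candidates, key=lambda tags: sequence_probability(words, tags))


print(best_tagging_by_enumeration(["the", "can", "rusts"]))
# prints ('ART', 'NOUN', 'VERB')
```

The number of candidate sequences is the product of the number of possible tags per word, so the cost grows exponentially with the number of ambiguous words in a row.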
10670300@unknown@formal@none@1@S@The European group developed CLAWS, a tagging program that did exactly this, and achieved accuracy in the 93-95% range.@@@@1@19@@danf@17-8-2009
10670310@unknown@formal@none@1@S@It is worth remembering, as [[Eugene Charniak]] points out in ''Statistical techniques for natural language parsing'' [http://www.cs.brown.edu/people/ec/home.html], that merely assigning the most common tag to each known word and the tag "proper noun" to all unknowns will approach 90% accuracy because many words are unambiguous.@@@@1@45@@danf@17-8-2009
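The most-common-tag baseline mentioned above is simple enough to sketch directly. The tiny training list, the tag names and the function names `train_baseline` and `tag_baseline` are hypothetical; a real baseline would be trained on a full tagged corpus.

```python
from collections import Counter, defaultdict


def train_baseline(tagged_words):
    """Record, for every known word, the tag it carries most often."""
    counts = defaultdict(Counter)
    for word, tag in tagged_words:
        counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag_baseline(words, model, unknown_tag="PROPER_NOUN"):
    """Tag known words with their most common tag and unknown words as proper nouns."""
    return [(word, model.get(word.lower(), unknown_tag)) for word in words]


# Hypothetical hand-tagged training data.
training = [
    ("the", "ART"), ("can", "MODAL"), ("can", "MODAL"), ("can", "NOUN"),
    ("dog", "NOUN"), ("runs", "VERB"),
]
model = train_baseline(training)
print(tag_baseline(["The", "can", "Zelda"], model))
# prints [('The', 'ART'), ('can', 'MODAL'), ('Zelda', 'PROPER_NOUN')]
```

The baseline ignores context entirely, which is why it mislabels "can" in "the can"; the sequence models discussed in this section exist to fix exactly such cases.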
10670320@unknown@formal@none@1@S@CLAWS pioneered the field of HMM-based part of speech tagging, but was quite expensive since it enumerated all possibilities.@@@@1@19@@danf@17-8-2009
10670330@unknown@formal@none@1@S@It sometimes had to resort to backup methods when there were simply too many (the [[Brown Corpus]] contains a case with 17 ambiguous words in a row, and there are words such as "still" that can represent as many as 7 distinct parts of speech).@@@@1@45@@danf@17-8-2009
10670340@unknown@formal@none@1@S@In 1987, [[Steve DeRose]] and [[Ken Church]] independently developed [[dynamic programming]] algorithms to solve the same problem in vastly less time.@@@@1@21@@danf@17-8-2009
10670350@unknown@formal@none@1@S@Their methods were similar to the [[Viterbi algorithm]] known for some time in other fields.@@@@1@15@@danf@17-8-2009
10670360@unknown@formal@none@1@S@DeRose used a table of pairs, while Church used a table of triples and an ingenious method of estimating the values for triples that were rare or nonexistent in the Brown Corpus (actual measurement of triple probabilities would require a much larger corpus).@@@@1@43@@danf@17-8-2009
10670370@unknown@formal@none@1@S@Both methods achieved accuracy over 95%.@@@@1@6@@danf@17-8-2009
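A dynamic-programming tagger in the style of the Viterbi algorithm over tag pairs can be sketched as below. This is not DeRose's or Church's actual program; the probability tables are made-up illustrative values and the names `TRANSITION`, `EMISSION` and `viterbi` are hypothetical.

```python
START = "<s>"

# Hypothetical probability of a tag given the previous tag.
TRANSITION = {
    (START, "ART"): 0.5,
    ("ART", "NOUN"): 0.8, ("ART", "VERB"): 0.01, ("ART", "MODAL"): 0.01,
    ("NOUN", "NOUN"): 0.1, ("NOUN", "VERB"): 0.4,
    ("VERB", "NOUN"): 0.3, ("VERB", "VERB"): 0.05,
    ("MODAL", "NOUN"): 0.05, ("MODAL", "VERB"): 0.6,
}

# Hypothetical probability of a word given its tag.
EMISSION = {
    "the": {"ART": 0.6},
    "can": {"NOUN": 0.01, "VERB": 0.005, "MODAL": 0.3},
    "rusts": {"NOUN": 0.001, "VERB": 0.002},
}


def viterbi(words):
    """Return the most probable tag sequence without enumerating all sequences."""
    # best[tag] holds (probability, path) for the best sequence ending in tag.
    best = {START: (1.0, [])}
    for word in words:
        new_best = {}
        for tag, emit in EMISSION[word].items():
            # Keep only the best way of reaching this tag from any previous tag.
            prob, path = max(
                (
                    (prev_prob * TRANSITION.get((prev, tag), 0.0) * emit, prev_path)
                    for prev, (prev_prob, prev_path) in best.items()
                ),
                key=lambda item: item[0],
            )
            new_best[tag] = (prob, path + [tag])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


print(viterbi(["the", "can", "rusts"]))
# prints ['ART', 'NOUN', 'VERB']
```

Only one candidate path per tag survives each step, so the work grows linearly with sentence length (times the square of the number of tags per word) rather than exponentially.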
10670380@unknown@formal@none@1@S@DeRose's 1990 dissertation at [[Brown University]] included analyses of the specific error types, probabilities, and other related data, and replicated his work for Greek, where it proved similarly effective.@@@@1@29@@danf@17-8-2009
10670390@unknown@formal@none@1@S@These findings were surprisingly disruptive to the field of [[Natural Language Processing]].@@@@1@12@@danf@17-8-2009
10670400@unknown@formal@none@1@S@The accuracy reported was higher than the typical accuracy of very sophisticated algorithms that integrated part of speech choice with many higher levels of linguistic analysis: syntax, morphology, semantics, and so on.@@@@1@32@@danf@17-8-2009
10670410@unknown@formal@none@1@S@CLAWS, DeRose's and Church's methods did fail for some of the known cases where semantics is required, but those proved negligibly rare.@@@@1@22@@danf@17-8-2009
10670420@unknown@formal@none@1@S@This convinced many in the field that part-of-speech tagging could usefully be separated out from the other levels of processing; this in turn simplified the theory and practice of computerized language analysis, and encouraged researchers to find ways to separate out other pieces as well.@@@@1@45@@danf@17-8-2009
10670430@unknown@formal@none@1@S@Markov Models are now the standard method for part-of-speech assignment.@@@@1@10@@danf@17-8-2009
10670440@unknown@formal@none@1@S@The methods already discussed involve working from a pre-existing corpus to learn tag probabilities.@@@@1@14@@danf@17-8-2009
10670450@unknown@formal@none@1@S@It is, however, also possible to [[Bootstrapping (linguistics)|bootstrap]] using "unsupervised" tagging.@@@@1@11@@danf@17-8-2009
10670460@unknown@formal@none@1@S@Unsupervised tagging techniques use an untagged corpus for their training data and produce the tagset by induction.@@@@1@17@@danf@17-8-2009
10670470@unknown@formal@none@1@S@That is, they observe patterns in word use, and derive part-of-speech categories themselves.@@@@1@13@@danf@17-8-2009
10670480@unknown@formal@none@1@S@For example, statistics readily reveal that "the", "a", and "an" occur in similar contexts, while "eat" occurs in very different ones.@@@@1@21@@danf@17-8-2009
10670490@unknown@formal@none@1@S@With sufficient iteration, similarity classes of words emerge that are remarkably similar to those human linguists would expect; and the differences themselves sometimes suggest valuable new insights.@@@@1@27@@danf@17-8-2009
10670500@unknown@formal@none@1@S@These two categories can be further subdivided into rule-based, stochastic, and neural approaches.@@@@1@13@@danf@17-8-2009
10670510@unknown@formal@none@1@S@Some current major algorithms for '''part-of-speech tagging''' include the [[Viterbi algorithm]], [[Brill Tagger]], and the [[Baum-Welch algorithm]] (also known as the forward-backward algorithm).@@@@1@23@@danf@17-8-2009
10670520@unknown@formal@none@1@S@[[Hidden Markov model]] and [[visible Markov model]] taggers can both be implemented using the [[Viterbi algorithm]].@@@@1@16@@danf@17-8-2009
10680010@unknown@formal@none@1@S@Pattern recognition@@@@1@2@@danf@17-8-2009
10680020@unknown@formal@none@1@S@'''Pattern recognition''' is a sub-topic of [[machine learning]].@@@@1@8@@danf@17-8-2009
10680030@unknown@formal@none@1@S@It can be defined as@@@@1@5@@danf@17-8-2009
10680040@unknown@formal@none@1@S@:"the act of taking in raw data and taking an action based on the [[Category (taxonomy)|category]] of the data".@@@@1@19@@danf@17-8-2009
10680050@unknown@formal@none@1@S@Most research in pattern recognition is about methods for [[supervised learning]] and [[unsupervised learning]].@@@@1@14@@danf@17-8-2009
10680060@unknown@formal@none@1@S@Pattern recognition aims to classify [[data]] ([[pattern]]s) based on either ''[[A priori and a posteriori (philosophy)|a priori]]'' knowledge or on [[statistics|statistical]] information extracted from the patterns.@@@@1@26@@danf@17-8-2009
10680070@unknown@formal@none@1@S@The patterns to be classified are usually groups of measurements or observations, defining points in an appropriate [[space (mathematics)|multidimensional space]].@@@@1@20@@danf@17-8-2009
10680080@unknown@formal@none@1@S@This is in contrast to '''[[pattern matching]]''', where the pattern is rigidly specified.@@@@1@13@@danf@17-8-2009
10680090@unknown@formal@none@1@S@==Overview==@@@@1@1@@danf@17-8-2009
10680100@unknown@formal@none@1@S@A complete pattern recognition system consists of a [[sensor]] that gathers the observations to be classified or described; a [[feature extraction]] mechanism that computes numeric or symbolic information from the observations; and a [[statistical classification|classification]] or description scheme that does the actual job of classifying or describing observations, relying on the extracted features.@@@@1@53@@danf@17-8-2009
10680110@unknown@formal@none@1@S@The classification or description scheme is usually based on the availability of a set of patterns that have already been classified or described.@@@@1@23@@danf@17-8-2009
10680120@unknown@formal@none@1@S@This set of patterns is termed the [[training set]] and the resulting learning strategy is characterized as [[supervised learning]].@@@@1@19@@danf@17-8-2009
10680130@unknown@formal@none@1@S@Learning can also be [[unsupervised learning|unsupervised]], in the sense that the system is not given an ''a priori'' labeling of patterns; instead, it establishes the classes itself based on the statistical regularities of the patterns.@@@@1@35@@danf@17-8-2009
10680140@unknown@formal@none@1@S@The classification or description scheme usually uses one of the following approaches: [[statistical classification|statistical]] (or decision theoretic), [[syntactic pattern recognition|syntactic]] (or structural).@@@@1@22@@danf@17-8-2009
10680150@unknown@formal@none@1@S@Statistical pattern recognition is based on statistical characterisations of patterns, assuming that the patterns are generated by a [[probabilistic]] system.@@@@1@20@@danf@17-8-2009
10680160@unknown@formal@none@1@S@Syntactical (or structural) pattern recognition is based on the structural interrelationships of features.@@@@1@13@@danf@17-8-2009
10680170@unknown@formal@none@1@S@A wide range of algorithms can be applied for pattern recognition, from very simple [[Naive Bayes classifier|Bayesian classifiers]] to much more powerful [[Artificial neural network|neural networks]].@@@@1@26@@danf@17-8-2009
10680180@unknown@formal@none@1@S@An intriguing problem in pattern recognition yet to be solved is the relationship between the problem to be solved (data to be classified) and the performance of various pattern recognition algorithms (classifiers).@@@@1@32@@danf@17-8-2009
10680190@unknown@formal@none@1@S@Pattern recognition is more complex when templates are used to generate variants.@@@@1@12@@danf@17-8-2009
10680200@unknown@formal@none@1@S@For example, in English, sentences often follow the "N-VP" (noun - verb phrase) pattern, but some knowledge of the English language is required to detect the pattern.@@@@1@27@@danf@17-8-2009
10680210@unknown@formal@none@1@S@Pattern recognition is studied in many fields, including [[psychology]], [[ethology]], and [[computer science]].@@@@1@13@@danf@17-8-2009
10680220@unknown@formal@none@1@S@[[Holographic associative memory]] is another type of pattern matching scheme where a small target pattern can be found in a large set of learned patterns based on cognitive meta-weight.@@@@1@29@@danf@17-8-2009
10680230@unknown@formal@none@1@S@==Uses==@@@@1@1@@danf@17-8-2009
10680240@unknown@formal@none@1@S@Within medical science pattern recognition creates the basis for [[computer-aided diagnosis]] (CAD) systems.@@@@1@13@@danf@17-8-2009
10680250@unknown@formal@none@1@S@CAD describes a procedure that supports the doctor's interpretations and findings.@@@@1@11@@danf@17-8-2009
10680260@unknown@formal@none@1@S@Typical applications are automatic [[speech recognition]], [[document classification|classification of text into several categories]] (e.g. spam/non-spam email messages), the [[handwriting recognition|automatic recognition of handwritten postal codes]] on postal envelopes, or the [[facial recognition system|automatic recognition of images]] of human faces.@@@@1@39@@danf@17-8-2009
10680270@unknown@formal@none@1@S@The last two examples form the subtopic [[image analysis]] of pattern recognition that deals with digital images as input to pattern recognition systems.@@@@1@23@@danf@17-8-2009
10690010@unknown@formal@none@1@S@Phrase@@@@1@1@@danf@17-8-2009
10690020@unknown@formal@none@1@S@In [[grammar]], a '''phrase''' is a group of [[word]]s that functions as a single unit in the [[syntax]] of a [[Sentence (linguistics)|sentence]].@@@@1@22@@danf@17-8-2009
10690030@unknown@formal@none@1@S@For example ''the house at the end of the street'' (example 1) is a phrase.@@@@1@15@@danf@17-8-2009
10690040@unknown@formal@none@1@S@It acts like a noun.@@@@1@5@@danf@17-8-2009
10690050@unknown@formal@none@1@S@It contains the phrase ''at the end of the street'' (example 2), a prepositional phrase which acts like an adjective.@@@@1@20@@danf@17-8-2009
10690060@unknown@formal@none@1@S@Example 2 could be replaced by ''white'', to make the phrase ''the white house''.@@@@1@14@@danf@17-8-2009
10690070@unknown@formal@none@1@S@Examples 1 and 2 contain the phrase ''the end of the street'' (example 3) which acts like a noun.@@@@1@19@@danf@17-8-2009
10690080@unknown@formal@none@1@S@It could be replaced by ''the cross-roads'' to give ''the house at the cross-roads''.@@@@1@14@@danf@17-8-2009
10690090@unknown@formal@none@1@S@Most phrases have a central word which defines the type of phrase.@@@@1@12@@danf@17-8-2009
10690100@unknown@formal@none@1@S@This word is called the [[head (linguistics)|head]] of the phrase.@@@@1@10@@danf@17-8-2009
10690110@unknown@formal@none@1@S@In English the head is often the first word of the phrase.@@@@1@12@@danf@17-8-2009
10690120@unknown@formal@none@1@S@Some phrases, however, can be headless.@@@@1@6@@danf@17-8-2009
10690130@unknown@formal@none@1@S@For example, ''the rich'' is a noun phrase composed of a determiner and an adjective, but no noun.@@@@1@18@@danf@17-8-2009
10690140@unknown@formal@none@1@S@Phrases may be classified by the type of head they take:@@@@1@11@@danf@17-8-2009
10690150@unknown@formal@none@1@S@*[[Prepositional phrase]] (PP) with a [[preposition]] as head (e.g. ''in love'', ''over the rainbow'').@@@@1@14@@danf@17-8-2009
10690160@unknown@formal@none@1@S@Languages that use [[postposition]]s instead have [[postpositional phrase]]s.@@@@1@8@@danf@17-8-2009
10690170@unknown@formal@none@1@S@The two types are sometimes collectively referred to as [[adpositional phrase]]s.@@@@1@11@@danf@17-8-2009
10690180@unknown@formal@none@1@S@*[[Noun phrase]] (NP) with a [[noun]] as head (e.g. ''the black cat'', ''a cat on the mat'')@@@@1@17@@danf@17-8-2009
10690190@unknown@formal@none@1@S@*[[Verb phrase]] (VP) with a [[verb]] as head (e.g. ''eat cheese'', ''jump up and down'')@@@@1@15@@danf@17-8-2009
10690200@unknown@formal@none@1@S@*[[Adjectival phrase]] with an [[adjective]] as head (e.g. ''full of toys'')@@@@1@11@@danf@17-8-2009
10690210@unknown@formal@none@1@S@*[[Adverbial phrase]] with [[adverb]] as head (e.g. ''very carefully'')@@@@1@9@@danf@17-8-2009
10690220@unknown@formal@none@1@S@== Formal definition ==@@@@1@4@@danf@17-8-2009
10690230@unknown@formal@none@1@S@A '''phrase''' is a [[syntax|syntactic]] structure which has syntactic properties derived from its [[head (linguistics)|head]].@@@@1@15@@danf@17-8-2009
10690240@unknown@formal@none@1@S@== Complexity ==@@@@1@3@@danf@17-8-2009
10690250@unknown@formal@none@1@S@A complex phrase consists of several words, whereas a simple phrase consists of only one word.@@@@1@16@@danf@17-8-2009
10690260@unknown@formal@none@1@S@This terminology is especially often used with [[verb]] phrases:@@@@1@9@@danf@17-8-2009
10690270@unknown@formal@none@1@S@* simple past and present are simple verbs, which require just one verb@@@@1@13@@danf@17-8-2009
10690280@unknown@formal@none@1@S@* complex verbs have one or two [[grammatical aspect|aspect]]s added, hence require two or three additional words@@@@1@17@@danf@17-8-2009
10690290@unknown@formal@none@1@S@"Complex", which is phrase-level, is often confused with "[[compound (linguistics)|compound]]", which is [[word]]-level.@@@@1@13@@danf@17-8-2009
10690300@unknown@formal@none@1@S@However, there are certain phenomena that formally seem to be phrases but semantically are more like compounds, like "women's magazines", which has the form of a possessive noun phrase, but which refers (just like a compound) to one specific [[lexeme]] (i.e. a magazine for women and not some magazine owned by a woman).@@@@1@53@@danf@17-8-2009
10690310@unknown@formal@none@1@S@== Semiotic approaches to the concept of "phrase" ==@@@@1@9@@danf@17-8-2009
10690320@unknown@formal@none@1@S@In more [[semiotic]] approaches to language, such as the more cognitivist versions of [[construction grammar]], a phrasal structure is not only a certain formal combination of word types whose features are inherited from the head.@@@@1@35@@danf@17-8-2009
10690330@unknown@formal@none@1@S@Here each phrasal structure also expresses some type of [[concept]]ual content, be it specific or abstract.@@@@1@16@@danf@17-8-2009
10700010@unknown@formal@none@1@S@Portuguese language@@@@1@2@@danf@17-8-2009
10700020@unknown@formal@none@1@S@'''Portuguese''' (''língua portuguesa'') is a [[Romance language]] that originated in what is now [[Galicia (Spain)]] and [[Portugal|northern Portugal]] from the [[Latin language|Latin]] spoken by [[Romanization (cultural)|romanized]] [[Pre-Roman peoples of the Iberian Peninsula]] (namely the [[Gallaeci]], the [[Lusitanians]], the [[Celtici]] and the [[Conii]]) about 2000 years ago.@@@@1@46@@danf@17-8-2009
10700030@unknown@formal@none@1@S@It spread worldwide in the 15th and 16th centuries as Portugal established a [[Portuguese Empire|colonial and commercial empire]] (1415–1999) which spanned from [[Brazil]] in the [[Americas]] to [[Goa]] in [[India]] and [[Macau]] in [[China]]; in fact, it was used exclusively on the island of [[Sri Lanka]] as the [[lingua franca]] for almost 350 years.@@@@1@54@@danf@17-8-2009
10700040@unknown@formal@none@1@S@During that time, many [[Portuguese Creole|creole languages based on Portuguese]] also appeared around the world, especially in [[Africa]], [[Asia]], and the [[Caribbean]].@@@@1@22@@danf@17-8-2009
10700050@unknown@formal@none@1@S@Today it is one of the world's major languages, [[List of languages by number of native speakers|ranked 6th]] according to number of native speakers (approximately 177 million).@@@@1@27@@danf@17-8-2009
10700060@unknown@formal@none@1@S@It is the language with the largest number of speakers in [[South America]], spoken by nearly all of Brazil's population, which amounts to over 51% of the continent's population even though it is the only Portuguese-speaking nation in [[the Americas]].@@@@1@40@@danf@17-8-2009
10700070@unknown@formal@none@1@S@It is also a major lingua franca in Portugal's former colonial possessions in Africa.@@@@1@14@@danf@17-8-2009
10700080@unknown@formal@none@1@S@It is the official language of ten countries (see the table on the right), also being co-official with [[Spanish language|Spanish]] and [[French language|French]] in [[Equatorial Guinea]], with [[Standard Cantonese|Cantonese]] [[Chinese language|Chinese]] in the Chinese special administrative region of [[Macau]], and with [[Tetum]] in [[East Timor]].@@@@1@45@@danf@17-8-2009
10700090@unknown@formal@none@1@S@There are sizable communities of Portuguese-speakers in various regions of North America, notably in the [[United States]] ([[New Jersey]], [[New England]] and south [[Florida]]) and in [[Ontario]], [[Canada]].@@@@1@28@@danf@17-8-2009
10700100@unknown@formal@none@1@S@[[Spain|Spanish]] author [[Miguel de Cervantes]] once called Portuguese "the sweet language", while Brazilian writer [[Olavo Bilac]] poetically described it as ''a última flor do Lácio, inculta e bela'': "the last flower of [[Latium]], wild and beautiful".@@@@1@36@@danf@17-8-2009
10700110@unknown@formal@none@1@S@==Geographic distribution==@@@@1@2@@danf@17-8-2009
10700120@unknown@formal@none@1@S@Today, Portuguese is the [[official language]] of [[Angola]], [[Brazil]], [[Cape Verde]], [[Guinea-Bissau]], [[Portugal]], [[São Tomé and Príncipe]] and [[Mozambique]].@@@@1@19@@danf@17-8-2009
10700130@unknown@formal@none@1@S@It is also one of the official languages of [[Equatorial Guinea]] (with [[Spanish language|Spanish]] and [[French language|French]]), the [[Special Administrative Region of the People's Republic of China|Chinese special administrative region]] of [[Macau]] (with [[Chinese language|Chinese]]), and [[East Timor]], (with [[Tetum]]).@@@@1@40@@danf@17-8-2009
10700140@unknown@formal@none@1@S@It is a [[First language|native language]] of most of the population in Portugal (100%), Brazil (99%), Angola (60%), and São Tomé and Príncipe (50%), and it is spoken by a [[plurality]] of the population of Mozambique (40%), though only 6.5% are native speakers.@@@@1@43@@danf@17-8-2009
10700150@unknown@formal@none@1@S@No data is available for Cape Verde, but almost all the population is bilingual, and the monolingual population speaks [[Cape Verdean Creole]].@@@@1@22@@danf@17-8-2009
10700160@unknown@formal@none@1@S@Small Portuguese-speaking communities subsist in former overseas colonies of Portugal such as Macau, where it is spoken as a first language by 0.6% of the population, and East Timor.@@@@1@29@@danf@17-8-2009
10700170@unknown@formal@none@1@S@[[Uruguay]] gave Portuguese a status equal to Spanish in its educational system along the northern border with Brazil.@@@@1@18@@danf@17-8-2009
10700180@unknown@formal@none@1@S@In the rest of the country, it is taught as an obligatory subject beginning in the 6th grade.@@@@1@18@@danf@17-8-2009
10700190@unknown@formal@none@1@S@It is also spoken by substantial immigrant communities, though not official, in [[Andorra]], [[France]], [[Luxembourg]], [[Jersey]] (with a statistically significant Portuguese-speaking community of approximately 10,000 people), [[Paraguay]], [[Namibia]], [[South Africa]], [[Switzerland]], [[Venezuela]] and in the [[U.S.]] states of [[California]], [[Connecticut]], [[Florida]], [[Massachusetts]], [[New Jersey]], [[New York]] and [[Rhode Island]].@@@@1@49@@danf@17-8-2009
10700200@unknown@formal@none@1@S@In some parts of India, such as [[Goa]] and [[Daman and Diu]], Portuguese is still spoken.@@@@1@16@@danf@17-8-2009
10700210@unknown@formal@none@1@S@There are also significant populations of Portuguese speakers in [[Canada]] (mainly concentrated in and around [[Toronto]]), [[Bermuda]] and [[Netherlands Antilles]].@@@@1@20@@danf@17-8-2009
10700220@unknown@formal@none@1@S@Portuguese is an official language of several international organizations.@@@@1@9@@danf@17-8-2009
10700230@unknown@formal@none@1@S@The [[Community of Portuguese Language Countries]] (with the Portuguese acronym CPLP) consists of the eight independent countries that have Portuguese as an official language.@@@@1@24@@danf@17-8-2009
10700240@unknown@formal@none@1@S@It is also an official language of the [[European Union]], [[Mercosul]], the [[Organization of American States]], the [[Organization of Ibero-American States]], the [[Union of South American Nations]], and the [[African Union]] (one of the working languages) and one of the official languages of other organizations.@@@@1@45@@danf@17-8-2009
10700250@unknown@formal@none@1@S@The Portuguese language is gaining popularity in Africa, Asia, and South America as a second language for study.@@@@1@18@@danf@17-8-2009
10700260@unknown@formal@none@1@S@Portuguese and Spanish are the fastest-growing European languages, and, according to estimates by UNESCO, Portuguese is the language with the highest potential for growth as an international language in southern Africa and South America.@@@@1@34@@danf@17-8-2009
10700270@unknown@formal@none@1@S@The Portuguese-speaking African countries are expected to have a combined population of 83 million by 2050.@@@@1@16@@danf@17-8-2009
10700280@unknown@formal@none@1@S@Since 1991, when Brazil signed into the economic market of Mercosul with other South American nations, such as Argentina, Uruguay, and Paraguay, there has been an increase in interest in the study of Portuguese in those South American countries.@@@@1@39@@danf@17-8-2009
10700290@unknown@formal@none@1@S@The demographic weight of Brazil in the continent will continue to strengthen the presence of the language in the region.@@@@1@20@@danf@17-8-2009
10700300@unknown@formal@none@1@S@Although in the early 21st century, after Macau was ceded to China in 1999, the use of Portuguese was in decline in Asia, it is becoming a language of opportunity there; mostly because of East Timor's boost in the number of speakers in the last five years but also because of increased Chinese diplomatic and financial ties with Portuguese-speaking countries.@@@@1@60@@danf@17-8-2009
10700310@unknown@formal@none@1@S@In July 2007, President Teodoro Obiang Nguema announced his government's decision to make Portuguese [[Equatorial Guinea]]'s third official language, in order to meet the requirements to apply for full membership of the [[Community of Portuguese Language Countries]].@@@@1@37@@danf@17-8-2009
10700320@unknown@formal@none@1@S@This upgrading from its current Associate Observer condition would result in Equatorial Guinea being able to access several professional and academic exchange programs and the facilitation of cross-border circulation of citizens.@@@@1@31@@danf@17-8-2009
10700330@unknown@formal@none@1@S@Its application is currently being assessed by other CPLP members.@@@@1@10@@danf@17-8-2009
10700340@unknown@formal@none@1@S@In March 1994 the [[Bosque de Portugal]] (Portugal's Woods) was founded in the Brazilian city of [[Curitiba]].@@@@1@17@@danf@17-8-2009
10700350@unknown@formal@none@1@S@The park houses the Portuguese Language Memorial, which honors the Portuguese immigrants and the countries that adopted the Portuguese language.@@@@1@20@@danf@17-8-2009
10700360@unknown@formal@none@1@S@Originally there were seven nations represented with pillars, but the independence of [[East Timor]] brought yet another pillar for that nation in 2007.@@@@1@23@@danf@17-8-2009
10700370@unknown@formal@none@1@S@In March 2006, the [[Museum of the Portuguese Language]], an interactive museum about the Portuguese language, was founded in [[São Paulo]], Brazil, the city with the largest number of Portuguese speakers in the world.@@@@1@34@@danf@17-8-2009
10700380@unknown@formal@none@1@S@==Dialects==@@@@1@1@@danf@17-8-2009
10700390@unknown@formal@none@1@S@Portuguese is a [[pluricentric language]] with two main groups of [[dialect]]s, those of [[Brazil]] and those of the [[Old World]].@@@@1@20@@danf@17-8-2009
10700400@unknown@formal@none@1@S@For historical reasons, the dialects of Africa and Asia are generally closer to those of Portugal than the Brazilian dialects, although in some aspects of their phonetics, especially the pronunciation of unstressed vowels, they resemble [[Brazilian Portuguese]] more than [[European Portuguese]].@@@@1@41@@danf@17-8-2009
10700410@unknown@formal@none@1@S@They have not been studied as widely as European and Brazilian Portuguese.@@@@1@12@@danf@17-8-2009
10700420@unknown@formal@none@1@S@Audio samples of some dialects of Portuguese are available below.@@@@1@10@@danf@17-8-2009
10700430@unknown@formal@none@1@S@There are some differences between the areas but these are the best approximations possible.@@@@1@14@@danf@17-8-2009
10700440@unknown@formal@none@1@S@For example, the ''caipira'' dialect has some differences from the one of Minas Gerais, but in general it is very close.@@@@1@21@@danf@17-8-2009
10700450@unknown@formal@none@1@S@A good example of Brazilian Portuguese may be found in the capital city, [[Brasília]], because of the generalized population from all parts of the country.@@@@1@25@@danf@17-8-2009
10700460@unknown@formal@none@1@S@'''[[Angola]]'''@@@@1@1@@danf@17-8-2009
10700470@unknown@formal@none@1@S@# ''Benguelense'' — [[Benguela]] province.@@@@1@5@@danf@17-8-2009
10700480@unknown@formal@none@1@S@# [http://www.instituto-camoes.pt/cvc/hlp/geografia/som85.html ''Luandense''] — [[Luanda]] province.@@@@1@6@@danf@17-8-2009
10700490@unknown@formal@none@1@S@# ''Sulista'' — South of Angola.@@@@1@6@@danf@17-8-2009
10700500@unknown@formal@none@1@S@'''[[Brazil]]'''@@@@1@1@@danf@17-8-2009
10700510@unknown@formal@none@1@S@# ''[[Caipira]]'' — States of [[São Paulo (state)|São Paulo]] (countryside; the city of São Paulo and the eastern areas of the state have their own dialect, called ''paulistano''); southern [[Minas Gerais]], northern [[Paraná (state)|Paraná]], [[Goiás]] and [[Mato Grosso do Sul]].@@@@1@40@@danf@17-8-2009
10700520@unknown@formal@none@1@S@# ''Cearense'' — [[Ceará]].@@@@1@4@@danf@17-8-2009
10700530@unknown@formal@none@1@S@# ''Baiano'' — [[Bahia]].@@@@1@4@@danf@17-8-2009
10700540@unknown@formal@none@1@S@# [http://www.instituto-camoes.pt/cvc/hlp/geografia/som90.html ''Fluminense''] — Variants spoken in the states of [[Rio de Janeiro (state)|Rio de Janeiro]] and [[Espírito Santo]] (excluding the city of Rio de Janeiro and its adjacent metropolitan areas, which have their own dialect, called ''[[carioca]]'').@@@@1@38@@danf@17-8-2009
10700550@unknown@formal@none@1@S@# ''[[Gaucho|Gaúcho]]'' — [[Rio Grande do Sul]].@@@@1@7@@danf@17-8-2009
10700560@unknown@formal@none@1@S@(There are many distinct accents in Rio Grande do Sul, mainly due to the heavy influx of European immigrants of diverse origins, who have settled several colonies throughout the state.)@@@@1@30@@danf@17-8-2009
10700570@unknown@formal@none@1@S@# ''[[Mineiro]]'' — [[Minas Gerais]] (not prevalent in the [[Triângulo Mineiro]], southern and southeastern [[Minas Gerais]]).@@@@1@16@@danf@17-8-2009
10700580@unknown@formal@none@1@S@# [http://www.instituto-camoes.pt/cvc/hlp/geografia/som91.html ''Nordestino''] — [[Northeast Region, Brazil|northeastern states of Brazil]] ([[Pernambuco]] and [[Rio Grande do Norte]] have a particular way of speaking).@@@@1@22@@danf@17-8-2009
10700590@unknown@formal@none@1@S@# ''Nortista'' — [[Amazon Basin]] states.@@@@1@6@@danf@17-8-2009
10700600@unknown@formal@none@1@S@# ''Paulistano'' — Variants spoken around [[São Paulo]] city and the eastern areas of São Paulo state.@@@@1@17@@danf@17-8-2009
10700610@unknown@formal@none@1@S@# ''Sertanejo'' — States of [[Goiás]] and [[Mato Grosso]] (the city of [[Cuiabá]] has a particular way of speaking).@@@@1@19@@danf@17-8-2009
10700620@unknown@formal@none@1@S@# ''Sulista'' — Variants spoken in the areas between the northern regions of [[Rio Grande do Sul]] and southern regions of São Paulo state.@@@@1@24@@danf@17-8-2009
10700630@unknown@formal@none@1@S@(The cities of [[Curitiba]], [[Florianópolis]], and [[Itapetininga]] have fairly distinct accents as well.)@@@@1@13@@danf@17-8-2009
10700640@unknown@formal@none@1@S@'''[[Portugal]]'''@@@@1@1@@danf@17-8-2009
10700650@unknown@formal@none@1@S@# [http://www.instituto-camoes.pt/cvc/hlp/geografia/som69.html ''Açoriano''] (Azorean) — [[Azores]].@@@@1@6@@danf@17-8-2009
10700660@unknown@formal@none@1@S@# [http://www.instituto-camoes.pt/cvc/hlp/geografia/som40.html ''Alentejano''] — [[Alentejo]]@@@@1@5@@danf@17-8-2009
10700670@unknown@formal@none@1@S@# [http://www.instituto-camoes.pt/cvc/hlp/geografia/som44.html ''Algarvio''] — [[Algarve]] (there is a particular dialect in a small part of western Algarve).@@@@1@17@@danf@17-8-2009
10700680@unknown@formal@none@1@S@# [http://www.instituto-camoes.pt/cvc/hlp/geografia/som1.html ''Alto-Minhoto''] — North of [[Braga]] (hinterland).@@@@1@8@@danf@17-8-2009
10700690@unknown@formal@none@1@S@# [http://www.instituto-camoes.pt/cvc/hlp/geografia/som49.html ''Baixo-Beirão''; ''Alto-Alentejano''] — Central Portugal (hinterland).@@@@1@8@@danf@17-8-2009
10700700@unknown@formal@none@1@S@# [http://www.instituto-camoes.pt/cvc/hlp/geografia/som9.html ''Beirão''] — Central Portugal.@@@@1@6@@danf@17-8-2009
10700710@unknown@formal@none@1@S@# [http://www.instituto-camoes.pt/cvc/hlp/geografia/som22.html ''Estremenho''] — Regions of [[Coimbra]] and [[Lisbon]] (the Lisbon dialect has some peculiar features not shared with the one of Coimbra).@@@@1@23@@danf@17-8-2009
10700720@unknown@formal@none@1@S@# [http://www.instituto-camoes.pt/cvc/hlp/geografia/som60.html ''Madeirense''] (Madeiran) — [[Madeira]].@@@@1@6@@danf@17-8-2009
10700730@unknown@formal@none@1@S@# [http://www.instituto-camoes.pt/cvc/hlp/geografia/som14.html ''Nortenho''] — Regions of Braga and [[Porto]].@@@@1@9@@danf@17-8-2009
10700740@unknown@formal@none@1@S@# [http://www.instituto-camoes.pt/cvc/hlp/geografia/som6.html ''Transmontano''] — [[Trás-os-Montes e Alto Douro]].@@@@1@8@@danf@17-8-2009
10700750@unknown@formal@none@1@S@'''Other countries'''@@@@1@2@@danf@17-8-2009
10700760@unknown@formal@none@1@S@* '''[[Cape Verde]]''' — [http://www.instituto-camoes.pt/cvc/hlp/geografia/som87.html ''Português cabo-verdiano''] ([[Cape Verdean Portuguese]])@@@@1@10@@danf@17-8-2009
10700770@unknown@formal@none@1@S@* '''[[Daman and Diu]]''', India — ''Damaense''.@@@@1@7@@danf@17-8-2009
10700780@unknown@formal@none@1@S@* '''[[East Timor]]''' — [http://www.instituto-camoes.pt/cvc/hlp/geografia/som84.html ''Timorense''] ([[East Timorese Portuguese|East Timorese]])@@@@1@10@@danf@17-8-2009
10700790@unknown@formal@none@1@S@* '''[[Goa]]''', India — ''Goês''.@@@@1@5@@danf@17-8-2009
10700800@unknown@formal@none@1@S@* '''[[Guinea-Bissau]]''' — [http://www.instituto-camoes.pt/cvc/hlp/geografia/som88.html ''Guineense''] ([[Guinean Portuguese]]).@@@@1@7@@danf@17-8-2009
10700810@unknown@formal@none@1@S@* '''[[Macau]]''', China — [http://www.instituto-camoes.pt/cvc/hlp/geografia/som92.html ''Macaense''] ([[Macanese Portuguese|Macanese]])@@@@1@8@@danf@17-8-2009
10700820@unknown@formal@none@1@S@* '''[[Mozambique]]''' — [http://www.instituto-camoes.pt/cvc/hlp/geografia/som89.html ''Moçambicano''] ([[Mozambican Portuguese|Mozambican]])@@@@1@7@@danf@17-8-2009
10700830@unknown@formal@none@1@S@* '''[[São Tomé and Príncipe]]''' — [http://www.instituto-camoes.pt/cvc/hlp/geografia/som83.html ''Santomense'']@@@@1@8@@danf@17-8-2009
10700840@unknown@formal@none@1@S@* '''[[Uruguay]]''' — [[Riverense Portuñol language|''Dialectos Portugueses del Uruguay (DPU)'']].@@@@1@10@@danf@17-8-2009
10700850@unknown@formal@none@1@S@Differences between dialects are mostly of [[accent (linguistics)|accent]] and [[vocabulary]], but between the Brazilian dialects and other dialects, especially in their most colloquial forms, there can also be some grammatical differences.@@@@1@31@@danf@17-8-2009
10700860@unknown@formal@none@1@S@The [[Portuguese creole|Portuguese-based creole]]s spoken in various parts of Africa, Asia, and the Americas are independent languages which should not be confused with Portuguese itself.@@@@1@25@@danf@17-8-2009
10700870@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10700880@unknown@formal@none@1@S@Arriving in the Iberian Peninsula in 216 BC, the Romans brought with them the [[Latin language]], from which all Romance languages descend.@@@@1@22@@danf@17-8-2009
10700890@unknown@formal@none@1@S@The language was spread by arriving Roman soldiers, settlers and merchants, who built Roman cities mostly near the settlements of previous civilizations.@@@@1@22@@danf@17-8-2009
10700900@unknown@formal@none@1@S@Between AD 409 and 711, as the Roman Empire collapsed in Western Europe, the Iberian Peninsula was conquered by Germanic peoples ([[Migration Period]]).@@@@1@23@@danf@17-8-2009
10700910@unknown@formal@none@1@S@The occupiers, mainly [[Suebi]] and [[Visigoths]], quickly adopted late Roman culture and the [[Vulgar Latin]] dialects of the peninsula.@@@@1@19@@danf@17-8-2009
10700920@unknown@formal@none@1@S@After the [[Moors|Moorish]] invasion of 711, [[Arabic language|Arabic]] became the administrative language in the conquered regions, but most of the population continued to speak a form of [[Romance languages|Romance]] commonly known as [[Mozarabic]].@@@@1@33@@danf@17-8-2009
10700930@unknown@formal@none@1@S@The influence exerted by Arabic on the Romance dialects spoken in the Christian kingdoms of the north was small, affecting mainly their lexicon.@@@@1@23@@danf@17-8-2009
10700940@unknown@formal@none@1@S@The earliest surviving records of a distinctively Portuguese language are administrative documents of the 9th century, still interspersed with many Latin phrases.@@@@1@22@@danf@17-8-2009
10700950@unknown@formal@none@1@S@Today this phase is known as Proto-Portuguese (between the 9th and the 12th centuries).@@@@1@14@@danf@17-8-2009
10700960@unknown@formal@none@1@S@In the first period of Old Portuguese — [[Galician-Portuguese]] Period (from the 12th to the 14th century) — the language gradually came into general use.@@@@1@25@@danf@17-8-2009
10700970@unknown@formal@none@1@S@For some time, it was the language of preference for [[lyric poetry]] in Christian Hispania, much as [[Occitan]] was the language of the [[Occitan literature#Poetry_of_the_troubadours|poetry of the troubadours]].@@@@1@28@@danf@17-8-2009
10700980@unknown@formal@none@1@S@Portugal was formally recognized as an independent kingdom by the [[Kingdom of Leon]] in 1143, with [[Afonso I of Portugal|Afonso Henriques]] as king.@@@@1@23@@danf@17-8-2009
10700990@unknown@formal@none@1@S@In 1290, King [[Denis of Portugal|Dinis]] created the first Portuguese university in Lisbon (the ''Estudos Gerais'', later moved to [[Coimbra]]) and decreed that Portuguese, then simply called the "common language", should be known as the Portuguese language and used officially.@@@@1@40@@danf@17-8-2009
10701000@unknown@formal@none@1@S@In the second period of Old Portuguese, from the 14th to the 16th century, with the [[Age of discovery|Portuguese discoveries]], the language was taken to many regions of [[Asia]], [[Africa]] and the [[Americas]] (nowadays, the great majority of Portuguese speakers live in Brazil, in South America).@@@@1@46@@danf@17-8-2009
10701010@unknown@formal@none@1@S@By the 16th century it had become a ''[[lingua franca]]'' in Asia and Africa, used not only for colonial administration and trade but also for communication between local officials and Europeans of all nationalities.@@@@1@34@@danf@17-8-2009
10701020@unknown@formal@none@1@S@Its spread was helped by mixed marriages between Portuguese and local people, and by its association with [[Roman Catholic]] [[missionary]] efforts, which led to the formation of a [[creole language]] called [[Kristang language|Kristang]] in many parts of Asia (from the word ''cristão'', "Christian").@@@@1@43@@danf@17-8-2009
10701030@unknown@formal@none@1@S@The language continued to be popular in parts of Asia until the 19th century.@@@@1@14@@danf@17-8-2009
10701040@unknown@formal@none@1@S@Some Portuguese-speaking Christian communities in [[India]], [[Sri Lanka]], [[Malaysia]], and [[Indonesia]] preserved their language even after they were isolated from Portugal.@@@@1@21@@danf@17-8-2009
10701050@unknown@formal@none@1@S@The end of the Old Portuguese period was marked by the publication of the ''Cancioneiro Geral'' by [[Garcia de Resende]], in 1516.@@@@1@22@@danf@17-8-2009
10701060@unknown@formal@none@1@S@The early period of Modern Portuguese, which spans from the 16th century to the present day, was characterized by an increase in the number of learned words borrowed from Classical Latin and Classical Greek since the Renaissance, which greatly enriched the lexicon.@@@@1@42@@danf@17-8-2009
10701070@unknown@formal@none@1@S@===Characterization===@@@@1@1@@danf@17-8-2009
10701080@unknown@formal@none@1@S@A distinctive feature of Portuguese is that it preserved the stressed vowels of [[Vulgar Latin]], which became diphthongs in other Romance languages; cf. Fr. ''pierre'', Sp. ''piedra'', It. ''pietra'', Port. ''pedra'', from Lat. ''petra''; or Sp. ''fuego'', It. ''fuoco'', Port. ''fogo'', from Lat. ''focum''.@@@@1@44@@danf@17-8-2009
10701090@unknown@formal@none@1@S@Another characteristic of early Portuguese was the loss of [[:wiktionary:intervocalic|intervocalic]] ''l'' and ''n'', sometimes followed by the merger of the two surrounding vowels, or by the insertion of an [[epenthesis|epenthetic vowel]] between them: cf. Lat. ''salire'', ''tenere'', ''catena'', Sp. ''salir'', ''tener'', ''cadena'', Port. ''sair'', ''ter'', ''cadeia''.@@@@1@46@@danf@17-8-2009
10701100@unknown@formal@none@1@S@When the [[elision|elided]] consonant was ''n'', it often [[nasalization|nasalized]] the preceding vowel: cf. Lat. ''manum'', ''rana'', ''bonum'', Port. ''mão'', ''rãa'', ''bõo'' (now ''mão'', ''rã'', ''bom'').@@@@1@25@@danf@17-8-2009
10701110@unknown@formal@none@1@S@This process was the source of most of the nasal diphthongs which are typical of Portuguese.@@@@1@16@@danf@17-8-2009
10701120@unknown@formal@none@1@S@In particular, the Latin endings ''-anem'', ''-anum'' and ''-onem'' became ''-ão'' in most cases, cf. Lat. ''canem'', ''germanum'', ''rationem'' with Modern Port. ''cão'', ''irmão'', ''razão'', and their plurals ''-anes'', ''-anos'', ''-ones'' normally became ''-ães'', ''-ãos'', ''-ões'', cf. ''cães'', ''irmãos'', ''razões''.@@@@1@40@@danf@17-8-2009
10701130@unknown@formal@none@1@S@===Movement to make Portuguese an official language of the UN===@@@@1@10@@danf@17-8-2009
10701140@unknown@formal@none@1@S@There is a growing number of people in the Portuguese-speaking media and on the internet who are presenting the case to the CPLP and other organizations to run a debate in the [[Lusophone]] community with the purpose of bringing forward a petition to make Portuguese an official language of the United Nations.@@@@1@52@@danf@17-8-2009
10701150@unknown@formal@none@1@S@In October 2005, during the international Convention of the [http://www.elosinternacional.com.br/index.htm Elos Club International ] that took place in Tavira, Portugal, a petition was written and unanimously approved; its text can be found on the internet under the title ''Petição Para Tornar Oficial o Idioma Português na ONU''.@@@@1@47@@danf@17-8-2009
10701160@unknown@formal@none@1@S@Romulo Alexandre Soares, president of the Brazil-Portugal Chamber, highlights that the positioning of Brazil in the international arena as one of the emergent powers of the 21st century, the size of its population, and the presence of the language around the world provide legitimacy and justify a petition to make Portuguese an official language of the UN.@@@@1@57@@danf@17-8-2009
10701170@unknown@formal@none@1@S@==Vocabulary==@@@@1@1@@danf@17-8-2009
10701180@unknown@formal@none@1@S@Most of the lexicon of Portuguese is derived from Latin.@@@@1@10@@danf@17-8-2009
10701190@unknown@formal@none@1@S@Nevertheless, because of the [[Moors|Moorish]] occupation of the [[Iberian Peninsula]] during the Middle Ages, and the participation of Portugal in the [[Age of Discovery]], it has adopted loanwords from all over the world.@@@@1@33@@danf@17-8-2009
10701200@unknown@formal@none@1@S@Very few Portuguese words can be traced to the [[Pre-Roman peoples of the Iberian Peninsula|pre-Roman inhabitants of Portugal]], which included the [[Gallaeci]], [[Lusitanians]], [[Celtici]] and [[Cynetes]].@@@@1@26@@danf@17-8-2009
10701210@unknown@formal@none@1@S@The [[Phoenicians]] and [[Carthaginians]], briefly present, also left some scarce traces.@@@@1@11@@danf@17-8-2009
10701220@unknown@formal@none@1@S@Some notable examples are ''abóbora'' "pumpkin" and ''bezerro'' "year-old calf", from the nearby [[Celtiberian language]] (probably through the Celtici); ''cerveja'' "beer", from [[Celtic languages|Celtic]]; ''saco'' "bag", from [[Phoenician language|Phoenician]]; and ''cachorro'' "dog, puppy", from [[Basque language|Basque]].@@@@1@36@@danf@17-8-2009
10701230@unknown@formal@none@1@S@In the 5th century, the Iberian Peninsula (the [[Ancient Rome|Roman]] [[Hispania]]) was conquered by the [[Germanic peoples|Germanic]] [[Suevi]] and [[Visigoths]].@@@@1@20@@danf@17-8-2009
10701240@unknown@formal@none@1@S@As they adopted the Roman civilization and language, however, these people contributed only a few words to the lexicon, mostly related to warfare — such as ''espora'' "spur", ''estaca'' "stake", and ''guerra'' "war", from [[Gothic language|Gothic]] ''*spaúra'', ''*stakka'', and ''*wirro'', respectively.@@@@1@41@@danf@17-8-2009
10701250@unknown@formal@none@1@S@Between the 9th and 15th centuries Portuguese acquired about 1000 words from [[Arabic language|Arabic]] by influence of [[al-Andalus|Moorish Iberia]].@@@@1@19@@danf@17-8-2009
10701260@unknown@formal@none@1@S@They are often recognizable by the initial Arabic article ''a''(''l'')''-'', and include many common words such as ''aldeia'' "village" from الضيعة ''aldaya'', ''alface'' "lettuce" from الخس ''alkhass'', ''armazém'' "warehouse" from المخزن ''almahazan'', and ''azeite'' "olive oil" from زيت ''azzait''.@@@@1@39@@danf@17-8-2009
10701270@unknown@formal@none@1@S@From Arabic came also the grammatically peculiar word [[Insha'Allah|''oxalá'']] "hopefully".@@@@1@10@@danf@17-8-2009
10701280@unknown@formal@none@1@S@The Mozambican currency name [[Mozambican Metical|''metical'']] was derived from the word مثقال ''miṯqāl'', a unit of weight.@@@@1@17@@danf@17-8-2009
10701290@unknown@formal@none@1@S@The word Mozambique itself is from the Arabic name of sultan Muça Alebique (Musa Alibiki).@@@@1@15@@danf@17-8-2009
10701300@unknown@formal@none@1@S@The name of the Portuguese town of [[Fátima, Portugal|Fátima]] comes from the name of one of the daughters of the prophet [[Muhammad]].@@@@1@22@@danf@17-8-2009
10701310@unknown@formal@none@1@S@Starting in the 15th century, the Portuguese maritime explorations led to the introduction of many loanwords from [[Asia]]n languages.@@@@1@19@@danf@17-8-2009
10701320@unknown@formal@none@1@S@For instance, ''catana'' "cutlass" from Japanese ''katana''; ''corja'' "rabble" from Malay ''kórchchu''; and ''chá'' "tea" from [[Chinese language|Chinese]] ''[[Tea#The word tea|''chá'']]''.@@@@1@21@@danf@17-8-2009
10701330@unknown@formal@none@1@S@From South America came ''batata'' "[[potato]]", from [[Taino]]; ''ananás'' and ''abacaxi'', from [[Tupi-Guarani]] ''naná'' and [[Tupi language|Tupi]] ''ibá cati'', respectively (two species of [[pineapple]]), and ''tucano'' "[[toucan]]" from [[Guarani language|Guarani]] ''tucan''.@@@@1@31@@danf@17-8-2009
10701340@unknown@formal@none@1@S@See [[List of Brazil state name etymologies]], for some more examples.@@@@1@11@@danf@17-8-2009
10701350@unknown@formal@none@1@S@From the 16th to the 19th century, owing to the role of Portugal as intermediary in the [[Atlantic slave trade]] and the establishment of large Portuguese colonies in Angola, Mozambique, and Brazil, Portuguese acquired several words of African and [[indigenous peoples of Brazil|Amerind]] origin, especially names for most of the animals and plants found in those territories.@@@@1@57@@danf@17-8-2009
10701360@unknown@formal@none@1@S@While those terms are mostly used in the former colonies, many became current in European Portuguese as well.@@@@1@18@@danf@17-8-2009
10701370@unknown@formal@none@1@S@From [[Kimbundu language|Kimbundu]], for example, came ''kifumate'' → ''cafuné'' "head caress", ''kusula'' → ''caçula'' "youngest child", ''marimbondo'' "tropical wasp", and ''kubungula'' → ''bungular'' "to dance like a wizard".@@@@1@28@@danf@17-8-2009
10701380@unknown@formal@none@1@S@Finally, it has received a steady influx of loanwords from other European languages.@@@@1@13@@danf@17-8-2009
10701390@unknown@formal@none@1@S@For example, ''melena'' "hair lock", ''fiambre'' "wet-cured ham" (in contrast with ''presunto'' "dry-cured ham" from Latin ''prae-exsuctus'' "dehydrated"), and ''castelhano'' "Castilian", from Spanish; ''colchete''/''crochê'' "bracket"/"crochet", ''paletó'' "jacket", ''batom'' "lipstick", and ''filé''/''filete'' "steak"/"slice" respectively, from French ''crochet'', ''paletot'', ''bâton'', ''filet''; ''macarrão'' "pasta", ''piloto'' "pilot", ''carroça'' "carriage", and ''barraca'' "barrack", from Italian ''maccherone'', ''pilota'', ''carrozza'', ''baracca''; and ''bife'' "steak", ''futebol'', ''revólver'', ''estoque'', ''folclore'', from English ''beef'', ''football'', ''revolver'', ''stock'', ''folklore''.@@@@1@68@@danf@17-8-2009
10701400@unknown@formal@none@1@S@==Classification and related languages==@@@@1@4@@danf@17-8-2009
10701410@unknown@formal@none@1@S@Portuguese belongs to the [[West Iberian languages|West Iberian]] branch of the [[Romance language]]s, and it has special ties with the following members of this group:@@@@1@25@@danf@17-8-2009
10701420@unknown@formal@none@1@S@* [[Galician language|Galician]] and the [[Fala language|Fala]], its closest relatives.@@@@1@10@@danf@17-8-2009
10701430@unknown@formal@none@1@S@See below.@@@@1@2@@danf@17-8-2009
10701440@unknown@formal@none@1@S@* [[Spanish language|Spanish]], the major language closest to Portuguese.@@@@1@9@@danf@17-8-2009
10701450@unknown@formal@none@1@S@(See also [[Differences between Spanish and Portuguese]].)@@@@1@7@@danf@17-8-2009
10701460@unknown@formal@none@1@S@* [[Mirandese language|Mirandese]], another West Iberian language spoken in Portugal.@@@@1@10@@danf@17-8-2009
10701470@unknown@formal@none@1@S@* [[Judeo-Portuguese]] and [[Ladino language|Judeo-Spanish]], languages spoken by [[Sephardic Jew]]s, which remained close to Portuguese and Spanish.@@@@1@17@@danf@17-8-2009
10701480@unknown@formal@none@1@S@Despite the obvious lexical and grammatical similarities between Portuguese and other Romance languages, it is not [[mutually intelligible]] with most of them.@@@@1@22@@danf@17-8-2009
10701490@unknown@formal@none@1@S@Apart from Galician, Portuguese speakers will usually need some formal study of basic grammar and vocabulary, before attaining a reasonable level of comprehension of those languages, and vice-versa.@@@@1@28@@danf@17-8-2009
10701500@unknown@formal@none@1@S@===Galician and the Fala===@@@@1@4@@danf@17-8-2009
10701510@unknown@formal@none@1@S@The closest language to Portuguese is Galician, spoken in the autonomous community of Galicia (northwestern Spain).@@@@1@16@@danf@17-8-2009
10701520@unknown@formal@none@1@S@The two were at one time a single language, known today as [[Galician-Portuguese]], but since the political separation of Portugal from Galicia they have diverged somewhat, especially in pronunciation and vocabulary.@@@@1@31@@danf@17-8-2009
10701530@unknown@formal@none@1@S@Nevertheless, the core vocabulary and grammar of Galician are still noticeably closer to Portuguese than to Spanish.@@@@1@17@@danf@17-8-2009
10701540@unknown@formal@none@1@S@In particular, like Portuguese, it uses the future subjunctive, the personal infinitive, and the synthetic pluperfect (see the section on the grammar of Portuguese, below).@@@@1@25@@danf@17-8-2009
10701550@unknown@formal@none@1@S@Mutual intelligibility (estimated at 85% by R. A. Hall, Jr., 1989) is good between Galicians and northern Portuguese, but poorer between Galicians and speakers from central Portugal.@@@@1@27@@danf@17-8-2009
10701560@unknown@formal@none@1@S@The Fala language is another descendant of Galician-Portuguese, spoken by a small number of people in the Spanish towns of Valverdi du Fresnu, As Ellas and Sa Martín de Trebellu (autonomous community of [[Extremadura]], near the border with Portugal).@@@@1@39@@danf@17-8-2009
10701570@unknown@formal@none@1@S@===Influence on other languages===@@@@1@4@@danf@17-8-2009
10701580@unknown@formal@none@1@S@Many languages have [[loanword|borrowed words]] from Portuguese, such as [[Bahasa Indonesia|Indonesian]], [[Sri Lanka]]n [[Sri Lanka Tamils (native)|Tamil]] and [[Sinhalese language|Sinhalese]] (see [[Sri Lanka Indo-Portuguese language|Sri Lanka Indo-Portuguese]]), [[Malay language|Malay]], [[Bengali language|Bengali]], [[English (language)|English]], [[Hindi]], [[Konkani language|Konkani]], [[Marathi language|Marathi]], [[Tetum language|Tetum]], [[Tsonga language|Xitsonga]], [[Papiamentu]], [[Japanese language|Japanese]], [[Barbadian|Bajan Creole]] (Spoken in Barbados), [[Lanc-Patuá]] (spoken in northern Brazil) and [[Sranan Tongo]] (spoken in Suriname).@@@@1@61@@danf@17-8-2009
10701590@unknown@formal@none@1@S@It left a strong influence on the ''[[Old Tupi|língua brasílica]]'', a [[Tupi-Guarani|Tupi-Guarani language]] which was the most widely spoken in [[Brazil]] until the 18th century, and on the language spoken around [[Sikka]] in [[Flores|Flores Island]], [[Indonesia]].@@@@1@36@@danf@17-8-2009
10701600@unknown@formal@none@1@S@In nearby [[Larantuka]], Portuguese is used for prayers in [[Holy Week]] rituals.@@@@1@12@@danf@17-8-2009
10701610@unknown@formal@none@1@S@The Japanese-Portuguese dictionary ''[[Nippo Jisho]]'' (1603) was the first dictionary of Japanese in a European language, a product of [[Society of Jesus|Jesuit]] missionary activity in [[Japan]].@@@@1@26@@danf@17-8-2009
10701620@unknown@formal@none@1@S@Building on the work of earlier Portuguese missionaries, the ''Dictionarium Anamiticum, Lusitanum et Latinum'' (Annamite-Portuguese-Latin dictionary) of [[Alexandre de Rhodes]] (1651) introduced the modern [[Vietnamese alphabet|orthography of Vietnamese]], which is based on the orthography of 17th-century Portuguese.@@@@1@37@@danf@17-8-2009
10701630@unknown@formal@none@1@S@The [[Romanization]] of [[Chinese language|Chinese]] was also influenced by the Portuguese language (among others), particularly regarding [[List of common Chinese surnames|Chinese surnames]]; one example is ''Mei''.@@@@1@26@@danf@17-8-2009
10701640@unknown@formal@none@1@S@See also [[List of English words of Portuguese origin]], [[Loan words in Indonesian]], [[Japanese words of Portuguese origin]], [[Malay_language#Borrowed_words|Borrowed words in Malay]], [[Sinhala words of Portuguese origin]], [[Loan words in Sri Lankan Tamil#Portuguese|Loan words from Portuguese in Sri Lankan Tamil]].@@@@1@40@@danf@17-8-2009
10701650@unknown@formal@none@1@S@===Derived languages===@@@@1@2@@danf@17-8-2009
10701660@unknown@formal@none@1@S@Beginning in the 16th century, the extensive contacts between Portuguese travelers and settlers, African slaves, and local populations led to the appearance of many [[pidgin]]s with varying amounts of Portuguese influence.@@@@1@31@@danf@17-8-2009
10701670@unknown@formal@none@1@S@As these pidgins became the mother tongue of succeeding generations, they evolved into fully fledged [[creole language]]s, which remained in use in many parts of Asia and Africa until the 18th century.@@@@1@32@@danf@17-8-2009
10701680@unknown@formal@none@1@S@Some Portuguese-based or Portuguese-influenced creoles are still spoken today, by over 3 million people worldwide, especially people of partial [[Portuguese people|Portuguese]] ancestry.@@@@1@22@@danf@17-8-2009
10701690@unknown@formal@none@1@S@== Phonology ==@@@@1@3@@danf@17-8-2009
10701700@unknown@formal@none@1@S@There is a maximum of 9 oral vowels and 19 consonants, though some varieties of the language have fewer phonemes (Brazilian Portuguese has only 8 oral vowel [[phone]]s).@@@@1@28@@danf@17-8-2009
10701710@unknown@formal@none@1@S@There are also five nasal vowels, which some linguists regard as allophones of the oral vowels, ten oral [[diphthong]]s, and five nasal diphthongs.@@@@1@23@@danf@17-8-2009
10701720@unknown@formal@none@1@S@===Vowels===@@@@1@1@@danf@17-8-2009
10701730@unknown@formal@none@1@S@To the seven vowels of [[Vulgar Latin]], European Portuguese has added two [[Mid-centralized vowel|near central vowels]], one of which tends to be [[elision|elided]] in [[relaxed pronunciation|rapid speech]], like the ''e caduc'' of [[French language|French]] (represented either as {{IPA|/ɯ̽/}}, or {{IPA|/ɨ/}}, or {{IPA|/ə/}}).@@@@1@42@@danf@17-8-2009
10701740@unknown@formal@none@1@S@The close-mid vowels {{IPA|/e o/}} and the open-mid vowels {{IPA|/ɛ ɔ/}} are four distinct phonemes, and they alternate in various forms of [[apophony]].@@@@1@23@@danf@17-8-2009
10701750@unknown@formal@none@1@S@Like [[Catalan language|Catalan]], Portuguese uses vowel quality to contrast stressed syllables with unstressed syllables: isolated vowels tend to be [[Vowel#Height|raised]], and in some cases centralized, when unstressed.@@@@1@27@@danf@17-8-2009
10701760@unknown@formal@none@1@S@Nasal diphthongs occur mostly at the end of words.@@@@1@9@@danf@17-8-2009
10701770@unknown@formal@none@1@S@===Consonants===@@@@1@1@@danf@17-8-2009
10701780@unknown@formal@none@1@S@The consonant inventory of Portuguese is fairly conservative.@@@@1@8@@danf@17-8-2009
10701790@unknown@formal@none@1@S@The medieval affricates {{IPA|/ts/}}, {{IPA|/dz/}}, {{IPA|/tʃ/}}, {{IPA|/dʒ/}} merged with the fricatives {{IPA|/s/}}, {{IPA|/z/}}, {{IPA|/ʃ/}}, {{IPA|/ʒ/}}, respectively, but not with each other, and there have been no other significant changes to the consonant phonemes since then.@@@@1@35@@danf@17-8-2009
10701800@unknown@formal@none@1@S@However, some remarkable dialectal variants and [[allophone]]s have appeared, among which:@@@@1@11@@danf@17-8-2009
10701810@unknown@formal@none@1@S@*In many regions of Brazil, {{IPA|/t/}} and {{IPA|/d/}} have the affricate allophones {{IPA|[tʃ]}} and {{IPA|[dʒ]}}, respectively, before {{IPA|/i/}} and {{IPA|/ĩ/}}.@@@@1@20@@danf@17-8-2009
10701820@unknown@formal@none@1@S@([[Quebec French]] has a similar phenomenon, with alveolar affricates instead of postalveolars.@@@@1@12@@danf@17-8-2009
10701830@unknown@formal@none@1@S@[[Japanese language|Japanese]] is another example).@@@@1@5@@danf@17-8-2009
10701840@unknown@formal@none@1@S@*At the end of a syllable, the phoneme {{IPA|/l/}} has the allophone {{IPA|[u̯]}} in Brazilian Portuguese (''[[L-vocalization#L-vocalization|L-vocalization]]'').@@@@1@17@@danf@17-8-2009
10701850@unknown@formal@none@1@S@*In many parts of Brazil and Angola, intervocalic {{IPA|/ɲ/}} is pronounced as a [[nasalization|nasalized]] [[palatal approximant]] {{IPA|[j̃]}} which nasalizes the preceding vowel, so that for instance {{IPA|/ˈniɲu/}} is pronounced {{IPA|[ˈnĩj̃u]}}.@@@@1@30@@danf@17-8-2009
10701860@unknown@formal@none@1@S@*In most of Brazil, the alveolar sibilants {{IPA|/s/}} and {{IPA|/z/}} occur in complementary distribution at the end of syllables, depending on whether the consonant that follows is voiceless or voiced, as in English.@@@@1@33@@danf@17-8-2009
10701870@unknown@formal@none@1@S@But in most of Portugal and parts of Brazil sibilants are postalveolar at the end of syllables, {{IPA|/ʃ/}} before voiceless consonants, and {{IPA|/ʒ/}} before voiced consonants (in [[Ladino language|Judeo-Spanish]], {{IPA|/s/}} is often replaced with {{IPA|/ʃ/}} at the end of syllables, too).@@@@1@41@@danf@17-8-2009
10701880@unknown@formal@none@1@S@*There is considerable dialectal variation in the value of the [[Rhotic consonant|rhotic]] phoneme {{IPA|/ʁ/}}.@@@@1@14@@danf@17-8-2009
10701890@unknown@formal@none@1@S@See [[Guttural R#Portuguese|Guttural R in Portuguese]], for details.@@@@1@8@@danf@17-8-2009
10701900@unknown@formal@none@1@S@==Grammar==@@@@1@1@@danf@17-8-2009
10701910@unknown@formal@none@1@S@A particularly interesting aspect of the grammar of Portuguese is the verb.@@@@1@12@@danf@17-8-2009
10701920@unknown@formal@none@1@S@Morphologically, more verbal inflections from classical Latin have been preserved by Portuguese than by any other major Romance language.@@@@1@19@@danf@17-8-2009
10701930@unknown@formal@none@1@S@See [[Romance copula#Morphological comparison|Romance copula]], for a detailed comparison.@@@@1@9@@danf@17-8-2009
10701940@unknown@formal@none@1@S@It also has some innovations not found in other Romance languages (except Galician and the Fala):@@@@1@16@@danf@17-8-2009
10701950@unknown@formal@none@1@S@* The [[present perfect tense]] has an iterative sense unique among the Romance languages.@@@@1@14@@danf@17-8-2009
10701960@unknown@formal@none@1@S@It denotes an action or a series of actions which began in the past and are expected to keep repeating in the future.@@@@1@23@@danf@17-8-2009
10701970@unknown@formal@none@1@S@For instance, the sentence ''Tenho tentado falar com ela'' would be translated to "I have been trying to talk to her", not "I have tried to talk to her".@@@@1@29@@danf@17-8-2009
10701980@unknown@formal@none@1@S@On the other hand, the correct translation of the question "Have you heard the latest news?" is not ''*Tem ouvido a última notícia?'', but ''Ouviu a última notícia?'', since no repetition is implied.@@@@1@33@@danf@17-8-2009
10701990@unknown@formal@none@1@S@* The future [[Subjunctive mood|subjunctive]] tense, which was developed by medieval [[West Iberian languages|West Iberian Romance]], but has now fallen into disuse in Spanish, is still used in [[vernacular]] Portuguese.@@@@1@30@@danf@17-8-2009
10702000@unknown@formal@none@1@S@It appears in dependent clauses that denote a condition which must be fulfilled in the future, so that the independent clause will occur.@@@@1@23@@danf@17-8-2009
10702010@unknown@formal@none@1@S@Other languages normally employ the present tense under the same circumstances:@@@@1@11@@danf@17-8-2009
10702020@unknown@formal@none@1@S@:''Se ''for'' eleito presidente, mudarei a lei.''@@@@1@7@@danf@17-8-2009
10702030@unknown@formal@none@1@S@:If ''I am'' elected president, I will change the law.@@@@1@10@@danf@17-8-2009
10702040@unknown@formal@none@1@S@:''Quando ''fores'' mais velho, vais entender.''@@@@1@6@@danf@17-8-2009
10702050@unknown@formal@none@1@S@:When ''you are'' older, you will understand.@@@@1@7@@danf@17-8-2009
10702060@unknown@formal@none@1@S@* The personal [[infinitive]]: infinitives can [[inflection|inflect]] according to their subject in [[Grammatical person|person]] and [[Grammatical number|number]], often showing who is expected to perform a certain action; cf. ''É melhor voltares'' "It is better [for you] to go back," ''É melhor voltarmos'' "It is better [for us] to go back."@@@@1@50@@danf@17-8-2009
10702070@unknown@formal@none@1@S@Perhaps for this reason, infinitive clauses replace subjunctive clauses more often in Portuguese than in other Romance languages.@@@@1@18@@danf@17-8-2009
10702080@unknown@formal@none@1@S@==Writing system==@@@@1@2@@danf@17-8-2009
10702090@unknown@formal@none@1@S@Portuguese is written with the [[Latin alphabet]], making use of five [[diacritic]]s to denote stress, vowel height, contraction, nasalization, and other sound changes (acute accent, grave accent, circumflex accent, tilde, and cedilla).@@@@1@32@@danf@17-8-2009
10702100@unknown@formal@none@1@S@[[Brazilian Portuguese]] also uses the diaeresis mark.@@@@1@7@@danf@17-8-2009
10702110@unknown@formal@none@1@S@Accented characters and digraphs are not counted as separate letters for [[collation]] purposes.@@@@1@13@@danf@17-8-2009
10702120@unknown@formal@none@1@S@===Brazilian vs. European spelling===@@@@1@4@@danf@17-8-2009
10702130@unknown@formal@none@1@S@There are some minor differences between the orthographies of Brazil and other Portuguese-speaking countries.@@@@1@14@@danf@17-8-2009
10702140@unknown@formal@none@1@S@One of the most pervasive is the use of acute accents in the European/African/Asian orthography in many words such as ''sinónimo'', where the Brazilian orthography has a circumflex accent, ''sinônimo''.@@@@1@30@@danf@17-8-2009
10702150@unknown@formal@none@1@S@Another important difference is that Brazilian spelling often lacks ''c'' or ''p'' before ''c'', ''ç'', or ''t'', where the European orthography has them; for example, cf. Brazilian ''fato'' with European ''facto'', "fact", or Brazilian ''objeto'' with European ''objecto'', "object".@@@@1@39@@danf@17-8-2009
10702160@unknown@formal@none@1@S@Some of these spelling differences reflect differences in the pronunciation of the words, but others are merely graphic.@@@@1@18@@danf@17-8-2009
10702170@unknown@formal@none@1@S@==Examples==@@@@1@1@@danf@17-8-2009
10702180@unknown@formal@none@1@S@;Excerpt from the Portuguese [[national epic]] ''[[Os Lusíadas]]'', by author [[Luís de Camões]] (I, 33)@@@@1@15@@danf@17-8-2009
10710010@unknown@formal@none@1@S@Predictive analytics@@@@1@2@@danf@17-8-2009
10710020@unknown@formal@none@1@S@'''Predictive analytics''' encompasses a variety of techniques from [[statistics]] and [[data mining]] that analyze current and historical data to make predictions about future events.@@@@1@24@@danf@17-8-2009
10710030@unknown@formal@none@1@S@Such predictions rarely take the form of absolute statements, and are more likely to be expressed as values that correspond to the odds of a particular event or behavior taking place in the future.@@@@1@34@@danf@17-8-2009
10710040@unknown@formal@none@1@S@In business, predictive models exploit patterns found in historical and transactional data to identify risks and opportunities.@@@@1@17@@danf@17-8-2009
10710050@unknown@formal@none@1@S@Models capture relationships among many factors to allow assessment of risk or potential associated with a particular set of conditions, guiding decision making for candidate transactions.@@@@1@26@@danf@17-8-2009
10710060@unknown@formal@none@1@S@One of the most well-known applications is [[credit scoring]], which is used throughout [[financial services]].@@@@1@15@@danf@17-8-2009
10710070@unknown@formal@none@1@S@Scoring models process a customer’s [[credit history]], [[loan application]], customer data, etc., in order to rank-order individuals by their likelihood of making future credit payments on time.@@@@1@27@@danf@17-8-2009
10710080@unknown@formal@none@1@S@Predictive analytics is also used in [[insurance]], [[telecommunications]], [[retail]], [[travel]], [[healthcare]], [[Pharmaceutical company|pharmaceuticals]] and other fields.@@@@1@16@@danf@17-8-2009
10710090@unknown@formal@none@1@S@== Types of predictive analytics ==@@@@1@6@@danf@17-8-2009
10710100@unknown@formal@none@1@S@Generally, predictive analytics is used to mean [[predictive modeling]], scoring of predictive models, and [[forecasting]].@@@@1@15@@danf@17-8-2009
10710110@unknown@formal@none@1@S@However, people are increasingly using the term to describe related analytic disciplines, such as descriptive modeling and decision modeling or optimization.@@@@1@21@@danf@17-8-2009
10710120@unknown@formal@none@1@S@These disciplines also involve rigorous data analysis, and are widely used in business for segmentation and decision making, but have different purposes and the statistical techniques underlying them vary.@@@@1@29@@danf@17-8-2009
10710130@unknown@formal@none@1@S@===Predictive models===@@@@1@2@@danf@17-8-2009
10710140@unknown@formal@none@1@S@Predictive models analyze past performance to assess how likely a customer is to exhibit a specific behavior in the future in order to improve [[marketing effectiveness]].@@@@1@26@@danf@17-8-2009
10710150@unknown@formal@none@1@S@This category also encompasses models that seek out subtle data patterns to answer questions about customer performance, such as fraud detection models.@@@@1@22@@danf@17-8-2009
10710160@unknown@formal@none@1@S@Predictive models often perform calculations during live transactions, for example, to evaluate the risk or opportunity of a given customer or transaction, in order to guide a decision.@@@@1@28@@danf@17-8-2009
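The rank-ordering performed by a scoring model can be sketched in a few lines of Python. This is a minimal illustration and not a production scorecard: the feature names, the weights and the intercept are all hypothetical values chosen for the example, whereas a real model would estimate them from historical data. The sketch assumes a logistic form, which is one common choice among several.

```python
import math

# Hypothetical coefficients for illustration only; a real scoring model
# would fit these to historical repayment data.
WEIGHTS = {"late_payments": -0.8, "years_of_history": 0.15, "utilization": -1.2}
INTERCEPT = 1.0


def on_time_probability(customer):
    """Logistic score: estimated probability that the customer pays on time."""
    z = INTERCEPT + sum(WEIGHTS[name] * customer[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def rank_order(customers):
    """Return customer ids sorted from most to least likely to pay on time."""
    return sorted(
        customers,
        key=lambda cid: on_time_probability(customers[cid]),
        reverse=True,
    )


# Made-up customer records keyed by id.
customers = {
    "A": {"late_payments": 0, "years_of_history": 10, "utilization": 0.2},
    "B": {"late_payments": 3, "years_of_history": 2, "utilization": 0.9},
    "C": {"late_payments": 1, "years_of_history": 5, "utilization": 0.5},
}
ranking = rank_order(customers)
```

Because the score is a probability rather than a yes/no verdict, the same output can drive a live decision (accept above a threshold) or simply order a list of candidate transactions by risk.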
10710170@unknown@formal@none@1@S@===Descriptive models===@@@@1@2@@danf@17-8-2009
10710180@unknown@formal@none@1@S@Descriptive models “describe” relationships in data in a way that is often used to classify customers or prospects into groups.@@@@1@20@@danf@17-8-2009
10710190@unknown@formal@none@1@S@Unlike predictive models that focus on predicting a single customer behavior (such as credit risk), descriptive models identify many different relationships between customers or products.@@@@1@25@@danf@17-8-2009
10710200@unknown@formal@none@1@S@But the descriptive models do not rank-order customers by their likelihood of taking a particular action the way predictive models do.@@@@1@21@@danf@17-8-2009
10710210@unknown@formal@none@1@S@Descriptive models are often used “offline,” for example, to categorize customers by their product preferences and life stage.@@@@1@18@@danf@17-8-2009
10710220@unknown@formal@none@1@S@Descriptive modeling tools can be used to develop agent-based models that can simulate a large number of individualized agents to predict possible futures.@@@@1@23@@danf@17-8-2009
10710230@unknown@formal@none@1@S@===Decision models===@@@@1@2@@danf@17-8-2009
10710240@unknown@formal@none@1@S@Decision models describe the relationship between all the elements of a decision — the known data (including results of predictive models), the decision and the forecast results of the decision — in order to predict the results of decisions involving many variables.@@@@1@42@@danf@17-8-2009
10710250@unknown@formal@none@1@S@These models can be used in optimization, a data-driven approach to improving decision logic that involves maximizing certain outcomes while minimizing others.@@@@1@22@@danf@17-8-2009
10710260@unknown@formal@none@1@S@Decision models are generally used offline, to develop decision logic or a set of business rules that will produce the desired action for every customer or circumstance.@@@@1@27@@danf@17-8-2009
10710270@unknown@formal@none@1@S@== Predictive analytics ==@@@@1@4@@danf@17-8-2009
10710280@unknown@formal@none@1@S@===Definition===@@@@1@1@@danf@17-8-2009
10710290@unknown@formal@none@1@S@Predictive analytics is an area of statistical analysis that deals with extracting information from data and using it to predict future trends and behavior patterns.@@@@1@25@@danf@17-8-2009
10710300@unknown@formal@none@1@S@The core of predictive analytics relies on capturing relationships between explanatory variables and the predicted variables from past occurrences, and exploiting them to predict future outcomes.@@@@1@26@@danf@17-8-2009
10710310@unknown@formal@none@1@S@===Current uses===@@@@1@2@@danf@17-8-2009
10710320@unknown@formal@none@1@S@Although predictive analytics can be put to use in many applications, we outline a few examples where predictive analytics has shown positive impact in recent years.@@@@1@26@@danf@17-8-2009
10710330@unknown@formal@none@1@S@====Analytical Customer Relationship Management (CRM)====@@@@1@5@@danf@17-8-2009
10710340@unknown@formal@none@1@S@Analytical [[Customer Relationship Management]] is a frequent commercial application of predictive analytics.@@@@1@12@@danf@17-8-2009
10710350@unknown@formal@none@1@S@Methods of predictive analytics are applied to customer data to pursue CRM objectives.@@@@1@13@@danf@17-8-2009
10710360@unknown@formal@none@1@S@====Direct marketing====@@@@1@2@@danf@17-8-2009
10710370@unknown@formal@none@1@S@Product [[marketing]] is constantly faced with the challenge of coping with the increasing number of competing products, different consumer preferences and the variety of methods (channels) available to interact with each consumer.@@@@1@32@@danf@17-8-2009
10710380@unknown@formal@none@1@S@Efficient marketing is a process of understanding the amount of variability and tailoring the marketing strategy for greater profitability.@@@@1@19@@danf@17-8-2009
10710390@unknown@formal@none@1@S@Predictive analytics can help identify consumers with a higher likelihood of responding to a particular marketing offer.@@@@1@17@@danf@17-8-2009
10710400@unknown@formal@none@1@S@Models can be built using data from consumers’ past purchasing history and past response rates for each channel.@@@@1@18@@danf@17-8-2009
10710410@unknown@formal@none@1@S@Additional information about the consumers’ demographic, geographic and other characteristics can be used to make more accurate predictions.@@@@1@18@@danf@17-8-2009
10710420@unknown@formal@none@1@S@Targeting only these consumers can lead to a substantial increase in response rate, which in turn can significantly reduce the cost per acquisition.@@@@1@23@@danf@17-8-2009
10710430@unknown@formal@none@1@S@Apart from identifying prospects, predictive analytics can also help to identify the most effective combination of products and marketing channels that should be used to target a given consumer.@@@@1@29@@danf@17-8-2009
10710440@unknown@formal@none@1@S@====Cross-sell====@@@@1@1@@danf@17-8-2009
10710450@unknown@formal@none@1@S@Corporate organizations often collect and maintain abundant data (e.g. customer records, sales transactions), and exploiting hidden relationships in the data can provide a competitive advantage to the organization.@@@@1@28@@danf@17-8-2009
10710460@unknown@formal@none@1@S@For an organization that offers multiple products, an analysis of existing customer behavior can lead to efficient [[cross-selling|cross sell]] of products.@@@@1@21@@danf@17-8-2009
10710470@unknown@formal@none@1@S@This directly leads to higher profitability per customer and strengthening of the customer relationship.@@@@1@14@@danf@17-8-2009
10710480@unknown@formal@none@1@S@Predictive analytics can help analyze customers’ spending, usage and other behavior, and help cross-sell the right product at the right time.@@@@1@21@@danf@17-8-2009
10710490@unknown@formal@none@1@S@====Customer retention====@@@@1@2@@danf@17-8-2009
10710500@unknown@formal@none@1@S@With the number of competing services available, businesses need to focus efforts on maintaining continuous [[consumer satisfaction]].@@@@1@17@@danf@17-8-2009
10710510@unknown@formal@none@1@S@In such a competitive scenario, [[consumer loyalty]] needs to be rewarded and [[customer attrition]] needs to be minimized.@@@@1@18@@danf@17-8-2009
10710520@unknown@formal@none@1@S@Businesses tend to respond to customer attrition on a reactive basis, acting only after the customer has initiated the process to terminate service.@@@@1@23@@danf@17-8-2009
10710530@unknown@formal@none@1@S@At this stage, changing the customer’s decision is almost impossible.@@@@1@10@@danf@17-8-2009
10710540@unknown@formal@none@1@S@Proper application of predictive analytics can lead to a more proactive retention strategy.@@@@1@13@@danf@17-8-2009
10710550@unknown@formal@none@1@S@By frequently examining a customer’s past service usage, service performance, spending and other behavior patterns, predictive models can determine the likelihood of a customer wanting to terminate service sometime in the near future.@@@@1@33@@danf@17-8-2009
10710560@unknown@formal@none@1@S@An intervention with lucrative offers can increase the chance of retaining the customer.@@@@1@13@@danf@17-8-2009
10710570@unknown@formal@none@1@S@Silent attrition, in which a customer slowly but steadily reduces usage, is another problem faced by many companies.@@@@1@18@@danf@17-8-2009
10710580@unknown@formal@none@1@S@Predictive analytics can also help predict this behavior before it occurs, so that the company can take proper actions to increase customer activity.@@@@1@23@@danf@17-8-2009
10710590@unknown@formal@none@1@S@====Underwriting====@@@@1@1@@danf@17-8-2009
10710600@unknown@formal@none@1@S@Many businesses have to account for risk exposure due to their different services and determine the cost needed to cover the risk.@@@@1@22@@danf@17-8-2009
10710610@unknown@formal@none@1@S@For example, auto insurance providers need to accurately determine the amount of premium to charge to cover each automobile and driver.@@@@1@21@@danf@17-8-2009
10710620@unknown@formal@none@1@S@A financial company needs to assess a borrower’s potential and ability to pay before granting a loan.@@@@1@17@@danf@17-8-2009
10710630@unknown@formal@none@1@S@For a health insurance provider, predictive analytics can analyze a few years of past medical claims data, as well as lab, pharmacy and other records where available, to predict how expensive an enrollee is likely to be in the future.@@@@1@40@@danf@17-8-2009
10710640@unknown@formal@none@1@S@Predictive analytics can help in the [[underwriting]] of these quantities by predicting the chances of illness, [[Default (finance)|default]], [[bankruptcy]], etc.@@@@1@20@@danf@17-8-2009
10710650@unknown@formal@none@1@S@Predictive analytics can streamline the process of customer acquisition by predicting the future risk behavior of a customer using application-level data.@@@@1@21@@danf@17-8-2009
10710660@unknown@formal@none@1@S@Proper predictive analytics can lead to proper pricing decisions, which can help mitigate future risk of default.@@@@1@17@@danf@17-8-2009
10710670@unknown@formal@none@1@S@====Collection analytics====@@@@1@2@@danf@17-8-2009
10710680@unknown@formal@none@1@S@Every portfolio has a set of delinquent customers who do not make their payments on time.@@@@1@16@@danf@17-8-2009
10710690@unknown@formal@none@1@S@The financial institution has to undertake collection activities on these customers to recover the amounts due.@@@@1@16@@danf@17-8-2009
10710700@unknown@formal@none@1@S@A lot of collection resources are wasted on customers who are difficult or impossible to recover.@@@@1@16@@danf@17-8-2009
10710710@unknown@formal@none@1@S@Predictive analytics can help optimize the allocation of collection resources by identifying the most effective collection agencies, contact strategies, legal actions and other strategies for each customer, thus significantly increasing recovery while at the same time reducing collection costs.@@@@1@39@@danf@17-8-2009
10710720@unknown@formal@none@1@S@====Fraud detection====@@@@1@2@@danf@17-8-2009
10710730@unknown@formal@none@1@S@Fraud is a big problem for many businesses and can be of various types.@@@@1@14@@danf@17-8-2009
10710740@unknown@formal@none@1@S@Inaccurate credit applications, fraudulent transactions, [[identity theft]]s and false insurance claims are some examples of this problem.@@@@1@17@@danf@17-8-2009
10710750@unknown@formal@none@1@S@These problems plague firms all across the spectrum and some examples of likely victims are [[Credit card fraud|credit card issuers]], insurance companies, retail merchants, manufacturers, business-to-business suppliers and even service providers.@@@@1@31@@danf@17-8-2009
10710760@unknown@formal@none@1@S@This is an area where a predictive model is often used to help weed out the “bads” and reduce a business's exposure to fraud.@@@@1@24@@danf@17-8-2009
10710770@unknown@formal@none@1@S@====Portfolio, product or economy level prediction====@@@@1@6@@danf@17-8-2009
10710780@unknown@formal@none@1@S@Often the focus of analysis is not the consumer but the product, portfolio, firm, industry or even the economy.@@@@1@19@@danf@17-8-2009
10710790@unknown@formal@none@1@S@For example, a retailer might be interested in predicting store-level demand for inventory management purposes.@@@@1@15@@danf@17-8-2009
10710800@unknown@formal@none@1@S@Or the Federal Reserve Board might be interested in predicting the unemployment rate for the next year.@@@@1@17@@danf@17-8-2009
10710810@unknown@formal@none@1@S@These types of problems can be addressed by predictive analytics using time series techniques (see below).@@@@1@16@@danf@17-8-2009
10710830@unknown@formal@none@1@S@==Statistical techniques==@@@@1@2@@danf@17-8-2009
10710840@unknown@formal@none@1@S@The approaches and techniques used to conduct predictive analytics can broadly be grouped into regression techniques and machine learning techniques.@@@@1@20@@danf@17-8-2009
10710850@unknown@formal@none@1@S@===Regression techniques===@@@@1@2@@danf@17-8-2009
10710860@unknown@formal@none@1@S@Regression models are the mainstay of predictive analytics.@@@@1@8@@danf@17-8-2009
10710870@unknown@formal@none@1@S@The focus lies on establishing a mathematical equation as a model to represent the interactions between the different variables in consideration.@@@@1@21@@danf@17-8-2009
10710880@unknown@formal@none@1@S@Depending on the situation, there is a wide variety of models that can be applied while performing predictive analytics.@@@@1@19@@danf@17-8-2009
10710890@unknown@formal@none@1@S@Some of them are briefly discussed below.@@@@1@7@@danf@17-8-2009
10710900@unknown@formal@none@1@S@=====Linear Regression Model=====@@@@1@3@@danf@17-8-2009
10710910@unknown@formal@none@1@S@The linear regression model analyzes the relationship between the response or dependent variable and a set of independent or predictor variables.@@@@1@21@@danf@17-8-2009
10710920@unknown@formal@none@1@S@This relationship is expressed as an equation that predicts the response variable as a linear function of the parameters.@@@@1@19@@danf@17-8-2009
10710930@unknown@formal@none@1@S@These parameters are adjusted so that a measure of fit is optimized.@@@@1@12@@danf@17-8-2009
10710940@unknown@formal@none@1@S@Much of the effort in model fitting is focused on minimizing the size of the residual, as well as ensuring that it is randomly distributed with respect to the model predictions.@@@@1@31@@danf@17-8-2009
10710950@unknown@formal@none@1@S@The goal of regression is to select the parameters of the model so as to minimize the sum of the squared residuals.@@@@1@22@@danf@17-8-2009
10710960@unknown@formal@none@1@S@This is referred to as '''[[ordinary least squares]]''' (OLS) estimation and results in best linear unbiased estimates (BLUE) of the parameters if the [[Gauss–Markov theorem|Gauss–Markov]] assumptions are satisfied.@@@@1@28@@danf@17-8-2009
10710970@unknown@formal@none@1@S@Once the model has been estimated we would be interested to know if the predictor variables belong in the model – i.e. is the estimate of each variable’s contribution reliable?@@@@1@30@@danf@17-8-2009
10710980@unknown@formal@none@1@S@To do this we can check the statistical significance of the model’s coefficients which can be measured using the t-statistic.@@@@1@20@@danf@17-8-2009
10710990@unknown@formal@none@1@S@This amounts to testing whether the coefficient is significantly different from zero.@@@@1@12@@danf@17-8-2009
10711000@unknown@formal@none@1@S@How well the model predicts the dependent variable based on the value of the independent variables can be assessed by using the R² statistic.@@@@1@24@@danf@17-8-2009
10711010@unknown@formal@none@1@S@It measures the predictive power of the model, i.e. the proportion of the total variation in the dependent variable that is “explained” (accounted for) by variation in the independent variables.@@@@1@30@@danf@17-8-2009
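The least-squares fit, the t-statistic of the slope and the R² statistic described above can be sketched for the simplest case of a single predictor. This is a minimal illustration rather than a reference implementation: the function name `ols_fit` and the returned keys are hypothetical, and only the Python standard library is assumed.

```python
import math

def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares (one predictor).

    Hypothetical helper for illustration; returns the estimates together
    with the slope's t-statistic and the R^2 of the fit.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    # Closed-form minimizer of the sum of squared residuals.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)       # unexplained variation
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)  # total variation

    # R^2: share of total variation accounted for by the model.
    r_squared = 1.0 - ss_res / ss_tot

    # t-statistic: slope divided by its standard error (n - 2 degrees of freedom).
    se_slope = math.sqrt(ss_res / (n - 2) / sxx)
    t_slope = slope / se_slope if se_slope > 0 else float("inf")

    return {"intercept": intercept, "slope": slope,
            "r_squared": r_squared, "t_slope": t_slope}

fit = ols_fit([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(fit)
```

A large absolute t-statistic suggests the slope differs from zero, while an R² close to 1 indicates that most of the variation in the response is accounted for by the predictor.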
10711020@unknown@formal@none@1@S@====Discrete choice models====@@@@1@3@@danf@17-8-2009
10711030@unknown@formal@none@1@S@Multivariate regression (above) is generally used when the response variable is continuous and has an unbounded range.@@@@1@17@@danf@17-8-2009
10711040@unknown@formal@none@1@S@Often the response variable may not be continuous but rather discrete.@@@@1@11@@danf@17-8-2009
10711050@unknown@formal@none@1@S@While mathematically it is feasible to apply multivariate regression to discrete ordered dependent variables, some of the assumptions behind the theory of multivariate linear regression no longer hold, and there are other techniques such as discrete choice models which are better suited for this type of analysis.@@@@1@47@@danf@17-8-2009
10711060@unknown@formal@none@1@S@If the dependent variable is discrete, some of those better-suited methods are [[logistic regression]], [[multinomial logit]] and [[probit]] models.@@@@1@19@@danf@17-8-2009
10711070@unknown@formal@none@1@S@Logistic regression and probit models are used when the dependent variable is [[binary numeral system|binary]].@@@@1@15@@danf@17-8-2009
10711080@unknown@formal@none@1@S@=====Logistic regression=====@@@@1@2@@danf@17-8-2009
10711090@unknown@formal@none@1@S@In a classification setting, assigning outcome probabilities to observations can be achieved through the use of a logistic model, a method that transforms information about the binary dependent variable into an unbounded continuous variable and estimates a regular multivariate model (see Allison’s Logistic Regression for more information on the theory of logistic regression).@@@@1@53@@danf@17-8-2009
10711100@unknown@formal@none@1@S@The [[Wald test|Wald]] and [[likelihood-ratio test]]s are used to test the statistical significance of each coefficient b in the model (analogous to the t tests used in OLS regression; see above).@@@@1@31@@danf@17-8-2009
10711110@unknown@formal@none@1@S@A test assessing the goodness-of-fit of a classification model is the [[Hosmer and Lemeshow test]].@@@@1@15@@danf@17-8-2009
10711120@unknown@formal@none@1@S@=====Multinomial logistic regression=====@@@@1@3@@danf@17-8-2009
10711130@unknown@formal@none@1@S@An extension of the [[binary logit model]] to cases where the dependent variable has more than 2 categories is the [[multinomial logit model]].@@@@1@23@@danf@17-8-2009
10711140@unknown@formal@none@1@S@In such cases collapsing the data into two categories might not make good sense or may lead to loss in the richness of the data.@@@@1@25@@danf@17-8-2009
10711150@unknown@formal@none@1@S@The multinomial logit model is the appropriate technique in these cases, especially when the dependent variable categories are not ordered (for example, colors like red, blue, green).@@@@1@27@@danf@17-8-2009
10711160@unknown@formal@none@1@S@Some authors have extended multinomial regression to include feature selection/importance methods such as [[Random multinomial logit]].@@@@1@16@@danf@17-8-2009
10711170@unknown@formal@none@1@S@=====Probit regression=====@@@@1@2@@danf@17-8-2009
10711180@unknown@formal@none@1@S@Probit models offer an alternative to logistic regression for modeling categorical dependent variables.@@@@1@13@@danf@17-8-2009
10711190@unknown@formal@none@1@S@Even though the outcomes tend to be similar, the underlying distributions are different.@@@@1@13@@danf@17-8-2009
10711200@unknown@formal@none@1@S@Probit models are popular in social sciences like economics.@@@@1@9@@danf@17-8-2009
10711210@unknown@formal@none@1@S@A good way to understand the key difference between probit and logit models is to assume that there is a latent variable z.@@@@1@23@@danf@17-8-2009
10711220@unknown@formal@none@1@S@We do not observe z but instead observe y which takes the value 0 or 1.@@@@1@16@@danf@17-8-2009
10711230@unknown@formal@none@1@S@In the logit model we assume that the random error in z follows a logistic distribution.@@@@1@16@@danf@17-8-2009
10711240@unknown@formal@none@1@S@In the probit model we assume that the random error in z follows a standard normal distribution.@@@@1@17@@danf@17-8-2009
10711250@unknown@formal@none@1@S@Note that in social sciences (for example, economics), probit is often used to model situations where the observed variable y is continuous but takes values between 0 and 1.@@@@1@29@@danf@17-8-2009
10711260@unknown@formal@none@1@S@=====Logit vs. Probit=====@@@@1@3@@danf@17-8-2009
10711270@unknown@formal@none@1@S@The probit model has been around longer than the logit model.@@@@1@11@@danf@17-8-2009
10711280@unknown@formal@none@1@S@The two look nearly identical, except that the logistic distribution tends to be slightly flatter-tailed.@@@@1@15@@danf@17-8-2009
10711290@unknown@formal@none@1@S@In fact, one of the reasons the logit model was formulated was that the probit model was extremely hard to compute because it involved calculating difficult integrals.@@@@1@27@@danf@17-8-2009
10711300@unknown@formal@none@1@S@Modern computing, however, has made this computation fairly simple.@@@@1@9@@danf@17-8-2009
10711310@unknown@formal@none@1@S@The coefficients obtained from the logit and probit models are also fairly close.@@@@1@13@@danf@17-8-2009
10711320@unknown@formal@none@1@S@However, the odds ratio makes the logit model easier to interpret.@@@@1@11@@danf@17-8-2009
10711330@unknown@formal@none@1@S@For practical purposes the only reasons for choosing the probit model over the logistic model would be:@@@@1@17@@danf@17-8-2009
10711340@unknown@formal@none@1@S@* There is a strong belief that the underlying distribution is normal@@@@1@12@@danf@17-8-2009
10711350@unknown@formal@none@1@S@* The actual event is not a binary outcome (e.g. bankrupt/not bankrupt) but a proportion (e.g. proportion of population at different debt levels).@@@@1@23@@danf@17-8-2009
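The comparison between the two link functions can be made concrete with a short sketch. It assumes the usual latent-variable reading, in which the probability of observing y = 1 is the cumulative distribution function of the error evaluated at the linear predictor. The function names are hypothetical, and the scale factor 1.702 is a commonly quoted approximation constant that is an assumption of this example, not something stated in the text.

```python
import math

def logistic_cdf(z):
    """P(y = 1) under the logit model for linear predictor z."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """P(y = 1) under the probit model for linear predictor z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Unscaled, the logistic distribution puts more mass in the tails.
print(logistic_cdf(-4.0), normal_cdf(-4.0))

# After rescaling the logit argument (1.702 is an assumed approximation
# constant), the two curves stay close across the whole grid.
grid = [i / 10.0 for i in range(-50, 51)]
max_gap = max(abs(normal_cdf(z) - logistic_cdf(1.702 * z)) for z in grid)
print(max_gap)
```

Because the rescaled curves are so close, fitted probabilities from the two models usually agree well, and the choice between them tends to rest on interpretability or on beliefs about the underlying distribution.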
10711360@unknown@formal@none@1@S@==== Time series models====@@@@1@4@@danf@17-8-2009
10711370@unknown@formal@none@1@S@[[Time series]] models are used for predicting or forecasting the future behavior of variables.@@@@1@14@@danf@17-8-2009
10711380@unknown@formal@none@1@S@These models account for the fact that data points taken over time may have an internal structure (such as autocorrelation, trend or seasonal variation).@@@@1@24@@danf@17-8-2009
10711390@unknown@formal@none@1@S@As a result, standard regression techniques cannot be applied directly to time series data, and methodology has been developed to decompose the trend, seasonal and cyclical components of the series.@@@@1@30@@danf@17-8-2009
10711400@unknown@formal@none@1@S@Modeling the dynamic path of a variable can improve forecasts since the predictable component of the series can be projected into the future.@@@@1@23@@danf@17-8-2009
10711410@unknown@formal@none@1@S@Time series models estimate difference equations containing stochastic components.@@@@1@9@@danf@17-8-2009
10711420@unknown@formal@none@1@S@Two commonly used forms of these models are [[autoregressive model]]s (AR) and [[Moving average (technical analysis)|moving average]] (MA) models.@@@@1@19@@danf@17-8-2009
10711430@unknown@formal@none@1@S@The [[Box-Jenkins]] methodology (1976), developed by George Box and G.M. Jenkins, combines the AR and MA models to produce the [[Autoregressive moving average model|ARMA]] (autoregressive moving average) model, which is the cornerstone of stationary time series analysis.@@@@1@37@@danf@17-8-2009
10711440@unknown@formal@none@1@S@ARIMA (autoregressive integrated moving average) models, on the other hand, are used to describe non-stationary time series.@@@@1@17@@danf@17-8-2009
10711450@unknown@formal@none@1@S@Box and Jenkins suggest differencing a non-stationary time series to obtain a stationary series to which an ARMA model can be applied.@@@@1@22@@danf@17-8-2009
10711460@unknown@formal@none@1@S@Non-stationary time series have a pronounced trend and do not have a constant long-run mean or variance.@@@@1@17@@danf@17-8-2009
10711470@unknown@formal@none@1@S@Box and Jenkins proposed a three-stage methodology that includes model identification, estimation and validation.@@@@1@14@@danf@17-8-2009
10711480@unknown@formal@none@1@S@The identification stage involves determining whether the series is stationary and whether seasonality is present by examining plots of the series and its autocorrelation and partial autocorrelation functions.@@@@1@28@@danf@17-8-2009
10711490@unknown@formal@none@1@S@In the estimation stage, models are estimated using non-linear time series or maximum likelihood estimation procedures.@@@@1@16@@danf@17-8-2009
10711500@unknown@formal@none@1@S@Finally, the validation stage involves diagnostic checking such as plotting the residuals to detect outliers and evidence of model fit.@@@@1@20@@danf@17-8-2009
10711510@unknown@formal@none@1@S@In recent years, time series models have become more sophisticated and attempt to model conditional heteroskedasticity, with models such as ARCH ([[autoregressive conditional heteroskedasticity]]) and GARCH (generalized autoregressive conditional heteroskedasticity) frequently used for financial time series.@@@@1@36@@danf@17-8-2009
10711520@unknown@formal@none@1@S@In addition, time series models are also used to understand inter-relationships among economic variables represented by systems of equations using VAR (vector autoregression) and structural VAR models.@@@@1@27@@danf@17-8-2009
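Two ideas from this section, differencing a trending series and estimating an autoregressive model, can be sketched in a few lines. The example is deliberately limited to a first-order autoregression fitted by least squares; it is not the full Box-Jenkins procedure, and the function names `difference`, `fit_ar1` and `forecast_ar1` are hypothetical.

```python
def difference(series, lag=1):
    """Return the lag-differenced series x_t - x_(t-lag)."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def fit_ar1(series):
    """Least-squares estimate of (const, phi) in x_t = const + phi * x_(t-1) + e_t."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    phi = sxy / sxx
    const = mean_curr - phi * mean_prev
    return const, phi

def forecast_ar1(last_value, const, phi, steps):
    """Project the predictable component forward by iterating the fitted equation."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = const + phi * value
        forecasts.append(value)
    return forecasts

# A series with a linear trend becomes constant after one round of differencing.
print(difference([3, 5, 7, 9, 11]))
```

In practice the differenced series would be checked for stationarity before fitting, and the residuals of the fitted model inspected, as in the identification and validation stages described above.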
10711530@unknown@formal@none@1@S@==== Survival or duration analysis====@@@@1@5@@danf@17-8-2009
10711540@unknown@formal@none@1@S@[[Survival analysis]] is another name for time-to-event analysis.@@@@1@8@@danf@17-8-2009
10711550@unknown@formal@none@1@S@These techniques were primarily developed in the medical and biological sciences, but they are also widely used in the social sciences like economics, as well as in engineering (reliability and failure time analysis).@@@@1@33@@danf@17-8-2009
10711560@unknown@formal@none@1@S@Censoring and non-normality, which are characteristic of survival data, generate difficulty when trying to analyze the data using conventional statistical models such as multiple linear regression.@@@@1@26@@danf@17-8-2009
10711570@unknown@formal@none@1@S@The normal distribution, being a symmetric distribution, takes positive as well as negative values, but duration by its very nature cannot be negative and therefore normality cannot be assumed when dealing with duration/survival data.@@@@1@34@@danf@17-8-2009
10711580@unknown@formal@none@1@S@Hence the normality assumption of regression models is violated.@@@@1@9@@danf@17-8-2009
10711590@unknown@formal@none@1@S@A censored observation is defined as an observation with incomplete information.@@@@1@11@@danf@17-8-2009
10711600@unknown@formal@none@1@S@Censoring introduces distortions into traditional statistical methods and is essentially a defect of the sample data.@@@@1@16@@danf@17-8-2009
10711610@unknown@formal@none@1@S@The assumption is that if the data were not censored it would be representative of the population of interest.@@@@1@19@@danf@17-8-2009
10711620@unknown@formal@none@1@S@In survival analysis, censored observations arise whenever the dependent variable of interest represents the time to a terminal event, and the duration of the study is limited in time.@@@@1@29@@danf@17-8-2009
10711630@unknown@formal@none@1@S@An important concept in survival analysis is the hazard rate.@@@@1@10@@danf@17-8-2009
10711640@unknown@formal@none@1@S@The hazard rate is defined as the probability that the event will occur at time t conditional on surviving until time t.@@@@1@22@@danf@17-8-2009
10711650@unknown@formal@none@1@S@Another concept related to the hazard rate is the survival function which can be defined as the probability of surviving to time t.@@@@1@23@@danf@17-8-2009
10711660@unknown@formal@none@1@S@Most models try to model the hazard rate by choosing the underlying distribution depending on the shape of the hazard function.@@@@1@21@@danf@17-8-2009
10711670@unknown@formal@none@1@S@A distribution whose hazard function slopes upward is said to have positive duration dependence; a decreasing hazard shows negative duration dependence, whereas a constant hazard describes a process with no memory, usually characterized by the exponential distribution.@@@@1@37@@danf@17-8-2009
10711680@unknown@formal@none@1@S@Some of the distributional choices in survival models are: F, gamma, Weibull, log normal, inverse normal, exponential etc.@@@@1@18@@danf@17-8-2009
10711690@unknown@formal@none@1@S@All these distributions are for a non-negative random variable.@@@@1@9@@danf@17-8-2009
10711700@unknown@formal@none@1@S@Duration models can be parametric, non-parametric or semi-parametric.@@@@1@8@@danf@17-8-2009
10711710@unknown@formal@none@1@S@Some of the models commonly used are the Kaplan-Meier estimator (non-parametric) and the Cox proportional hazards model (semi-parametric).@@@@1@18@@danf@17-8-2009
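How censored observations enter a survival estimate can be illustrated with a small sketch of the Kaplan-Meier product-limit estimator. The function name and the input layout (a list of durations plus a parallel list of flags marking whether the event was actually observed) are assumptions made for this example.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each distinct event time.

    durations: time until the event or until censoring, per subject.
    observed:  True if the event occurred, False if the subject was censored.
    """
    event_times = sorted({t for t, o in zip(durations, observed) if o})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t,
        # including those censored at or after t.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, o in zip(durations, observed) if d == t and o)
        # Multiply by the conditional probability of surviving past t.
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve

# Five subjects; the second subject at time 3 and the one at time 8 are censored.
print(kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False]))
```

Censored subjects never trigger a drop in the curve, but they remain in the risk set until their censoring time, which is how the estimator uses their incomplete information.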
10711720@unknown@formal@none@1@S@==== Classification and regression trees====@@@@1@5@@danf@17-8-2009
10711730@unknown@formal@none@1@S@Classification and regression trees (CART) is a [[non-parametric statistics|non-parametric]] technique that produces either classification or regression trees, depending on whether the dependent variable is categorical or numeric, respectively.@@@@1@28@@danf@17-8-2009
10711740@unknown@formal@none@1@S@Trees are formed by a collection of rules based on values of certain variables in the modeling data set:@@@@1@19@@danf@17-8-2009
10711750@unknown@formal@none@1@S@* Rules are selected based on how well splits based on variables’ values can differentiate observations based on the dependent variable@@@@1@21@@danf@17-8-2009
10711760@unknown@formal@none@1@S@* Once a rule is selected and splits a node into two, the same logic is applied to each “child” node (i.e. it is a recursive procedure)@@@@1@27@@danf@17-8-2009
10711770@unknown@formal@none@1@S@* Splitting stops when CART detects no further gain can be made, or some pre-set stopping rules are met@@@@1@19@@danf@17-8-2009
10711780@unknown@formal@none@1@S@Each branch of the tree ends in a terminal node:@@@@1@10@@danf@17-8-2009
10711790@unknown@formal@none@1@S@* Each observation falls into exactly one terminal node@@@@1@9@@danf@17-8-2009
10711800@unknown@formal@none@1@S@* Each terminal node is uniquely defined by a set of rules@@@@1@12@@danf@17-8-2009
10711810@unknown@formal@none@1@S@A very popular method for predictive analytics is Leo Breiman's [[Random forests]] or derived versions of this technique like [[Random multinomial logit]].@@@@1@22@@danf@17-8-2009
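The rule-selection step for a classification tree can be sketched for a single numeric variable. The text does not name a splitting criterion; Gini impurity is used here as one common choice and is an assumption of this example, as are the function names.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(values, labels):
    """Find the rule `value <= threshold` that best separates the classes.

    Returns (threshold, weighted impurity of the two child nodes), or
    (None, parent impurity) when no split improves on the parent node.
    """
    best = (None, gini(labels))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2.0
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best

print(best_split([1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"]))
```

A full tree would apply `best_split` recursively to each child node, over all candidate variables, until no further gain is found or a stopping rule is met.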
10711820@unknown@formal@none@1@S@==== Multivariate adaptive regression splines====@@@@1@5@@danf@17-8-2009
10711830@unknown@formal@none@1@S@[[Multivariate adaptive regression splines]] (MARS) is a [[Non-parametric statistics|non-parametric]] technique that builds flexible models by fitting [[piecewise linear regression]]s.@@@@1@19@@danf@17-8-2009
10711840@unknown@formal@none@1@S@An important concept associated with regression splines is that of a knot.@@@@1@12@@danf@17-8-2009
10711850@unknown@formal@none@1@S@A knot is where one local regression model gives way to another and thus is the point of intersection between two splines.@@@@1@22@@danf@17-8-2009
10711860@unknown@formal@none@1@S@In multivariate adaptive regression splines, [[basis function]]s are the tool used for generalizing the search for knots.@@@@1@17@@danf@17-8-2009
10711870@unknown@formal@none@1@S@Basis functions are a set of functions used to represent the information contained in one or more variables.@@@@1@18@@danf@17-8-2009
10711880@unknown@formal@none@1@S@The MARS model almost always creates the basis functions in pairs.@@@@1@11@@danf@17-8-2009
10711890@unknown@formal@none@1@S@The MARS approach deliberately overfits the model and then prunes to get to the optimal model.@@@@1@16@@danf@17-8-2009
10711900@unknown@formal@none@1@S@The algorithm is computationally very intensive and in practice we are required to specify an upper limit on the number of basis functions.@@@@1@23@@danf@17-8-2009
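The paired basis functions and the role of a knot can be illustrated with hinge functions, the form of basis function usually associated with MARS. The sketch below only evaluates a model with one knot; the knot search and the pruning step are omitted, and the function names and coefficients are hypothetical.

```python
def hinge_pair(x, knot):
    """Return the mirrored pair (max(0, x - knot), max(0, knot - x))."""
    return max(0.0, x - knot), max(0.0, knot - x)

def piecewise_linear(x, intercept, knot, coef_right, coef_left):
    """Evaluate a one-knot model built from a single pair of hinge functions."""
    right, left = hinge_pair(x, knot)
    return intercept + coef_right * right + coef_left * left

# To the right of the knot only the first basis function is active,
# to the left only the second, so the slope changes at the knot.
for x in (1.0, 3.0, 5.0):
    print(x, piecewise_linear(x, intercept=1.0, knot=3.0, coef_right=2.0, coef_left=-0.5))
```

Each additional knot contributes another pair of such terms, which is why a cap on the number of basis functions is needed to keep the search tractable.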
10711910@unknown@formal@none@1@S@=== Machine learning techniques===@@@@1@4@@danf@17-8-2009
10711920@unknown@formal@none@1@S@[[Machine learning]], a branch of artificial intelligence, was originally employed to develop techniques to enable computers to learn.@@@@1@18@@danf@17-8-2009
10711930@unknown@formal@none@1@S@Today, since it includes a number of advanced statistical methods for regression and classification, it finds application in a wide variety of fields including [[medical diagnostics]], [[credit card fraud detection]], [[Face recognition|face]] and [[speech recognition]] and analysis of the [[stock market]].@@@@1@41@@danf@17-8-2009
10711940@unknown@formal@none@1@S@In certain applications it is sufficient to directly predict the dependent variable without focusing on the underlying relationships between variables.@@@@1@20@@danf@17-8-2009
10711950@unknown@formal@none@1@S@In other cases, the underlying relationships can be very complex and the mathematical form of the dependencies unknown.@@@@1@18@@danf@17-8-2009
10711960@unknown@formal@none@1@S@For such cases, machine learning techniques emulate [[human cognition]] and learn from training examples to predict future events.@@@@1@18@@danf@17-8-2009
10711970@unknown@formal@none@1@S@A brief discussion of some of these methods used commonly for predictive analytics is provided below.@@@@1@16@@danf@17-8-2009
10711980@unknown@formal@none@1@S@A detailed study of machine learning can be found in Mitchell (1997).@@@@1@12@@danf@17-8-2009
10711990@unknown@formal@none@1@S@==== Neural networks====@@@@1@3@@danf@17-8-2009
10712000@unknown@formal@none@1@S@[[Neural networks]] are sophisticated [[Nonlinearity|nonlinear]] modeling techniques that are able to [[Model (abstract)|model]] complex functions.@@@@1@15@@danf@17-8-2009
10712010@unknown@formal@none@1@S@They can be applied to problems of [[Time series|prediction]], [[Statistical classification|classification]] or [[Control theory|control]] in a wide spectrum of fields such as [[finance]], [[cognitive psychology]]/[[cognitive neuroscience|neuroscience]], [[medicine]], [[engineering]], and [[physics]].@@@@1@30@@danf@17-8-2009
10712020@unknown@formal@none@1@S@Neural networks are used when the exact nature of the relationship between inputs and output is not known.@@@@1@18@@danf@17-8-2009
10712030@unknown@formal@none@1@S@A key feature of neural networks is that they learn the relationship between inputs and output through training.@@@@1@18@@danf@17-8-2009
10712040@unknown@formal@none@1@S@Two types of training are used by different networks, [[Supervised learning|supervised]] and [[Unsupervised learning|unsupervised]] training, with supervised being the more common.@@@@1@21@@danf@17-8-2009
10712050@unknown@formal@none@1@S@Some examples of neural network training techniques are [[backpropagation]], quick propagation, [[Conjugate gradient method|conjugate gradient descent]], [[Radial basis function|projection operator]], Delta-Bar-Delta etc.@@@@1@22@@danf@17-8-2009
10712060@unknown@formal@none@1@S@These are applied to network architectures such as multilayer [[perceptron]]s, [[Self-organizing map|Kohonen network]]s, [[Hopfield network]]s, etc.@@@@1@16@@danf@17-8-2009
10712070@unknown@formal@none@1@S@====Radial basis functions====@@@@1@3@@danf@17-8-2009
10712080@unknown@formal@none@1@S@A [[radial basis function]] (RBF) is a function which has built into it a distance criterion with respect to a center.@@@@1@21@@danf@17-8-2009
10712090@unknown@formal@none@1@S@Such functions can be used very efficiently for interpolation and for smoothing of data.@@@@1@14@@danf@17-8-2009
10712100@unknown@formal@none@1@S@Radial basis functions have been applied in the area of [[neural network]]s where they are used as a replacement for the sigmoidal transfer function.@@@@1@24@@danf@17-8-2009
10712110@unknown@formal@none@1@S@Such networks have 3 layers: the input layer, the hidden layer with the RBF non-linearity and a linear output layer.@@@@1@20@@danf@17-8-2009
10712120@unknown@formal@none@1@S@The most popular choice for the non-linearity is the Gaussian.@@@@1@10@@danf@17-8-2009
10712130@unknown@formal@none@1@S@RBF networks have the advantage of not being locked into local minima as are the [[feed-forward]] networks such as the multilayer perceptron.@@@@1@22@@danf@17-8-2009
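The three-layer structure described above can be sketched in a few lines of Python. This is only an illustrative forward pass, assuming a Gaussian non-linearity with a single shared width; the function names, the `width` parameter and the example centres and weights are hypothetical and not taken from any particular library.

```python
import math

def gaussian_rbf(x, center, width):
    """Gaussian radial basis function: depends only on the distance from x to center."""
    r2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-r2 / (2.0 * width ** 2))

def rbf_network(x, centers, width, weights, bias=0.0):
    """Forward pass: input layer -> Gaussian hidden layer -> linear output layer."""
    # Hidden layer: one Gaussian unit per centre (assumed values, for illustration).
    hidden = [gaussian_rbf(x, c, width) for c in centers]
    # Output layer: plain weighted sum of the hidden activations plus a bias.
    return bias + sum(w * h for w, h in zip(weights, hidden))

# Example with two hidden units; centres and weights are made up.
centers = [(0.0, 0.0), (3.0, 4.0)]
output = rbf_network((0.0, 0.0), centers, width=1.0, weights=[2.0, -1.0], bias=0.5)
```

Because the output layer is linear, once the centres and width are fixed the output weights can be fitted by ordinary least squares; the sketch above leaves training out entirely.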
10712140@unknown@formal@none@1@S@==== Support vector machines====@@@@1@4@@danf@17-8-2009
10712150@unknown@formal@none@1@S@[[Support Vector Machine]]s (SVM) are used to detect and exploit complex patterns in data by clustering, classifying and ranking the data.@@@@1@21@@danf@17-8-2009
10712160@unknown@formal@none@1@S@They are learning machines that are used to perform binary classifications and regression estimations.@@@@1@14@@danf@17-8-2009
10712170@unknown@formal@none@1@S@They commonly use kernel based methods to apply linear classification techniques to non-linear classification problems.@@@@1@15@@danf@17-8-2009
10712180@unknown@formal@none@1@S@There are a number of types of SVM such as linear, polynomial, sigmoid etc.@@@@1@14@@danf@17-8-2009
10712190@unknown@formal@none@1@S@==== Naïve Bayes====@@@@1@3@@danf@17-8-2009
10712200@unknown@formal@none@1@S@[[Naive Bayes classifier|Naïve Bayes]], based on Bayes' conditional probability rule, is used for performing classification tasks.@@@@1@16@@danf@17-8-2009
10712210@unknown@formal@none@1@S@Naïve Bayes assumes the predictors are statistically independent, which makes it an effective classification tool that is easy to interpret.@@@@1@20@@danf@17-8-2009
10712220@unknown@formal@none@1@S@It is best employed when faced with the problem of the ‘curse of dimensionality’, i.e. when the number of predictors is very high.@@@@1@23@@danf@17-8-2009
10712230@unknown@formal@none@1@S@==== k-nearest neighbours====@@@@1@3@@danf@17-8-2009
10712240@unknown@formal@none@1@S@The [[K-nearest neighbor algorithm|nearest neighbour algorithm]] (kNN) belongs to the class of statistical pattern recognition methods.@@@@1@16@@danf@17-8-2009
10712250@unknown@formal@none@1@S@The method does not impose a priori any assumptions about the distribution from which the modeling sample is drawn.@@@@1@19@@danf@17-8-2009
10712260@unknown@formal@none@1@S@It involves a training set with both positive and negative values.@@@@1@11@@danf@17-8-2009
10712270@unknown@formal@none@1@S@A new sample is classified by calculating the distance to the nearest neighbouring training case.@@@@1@15@@danf@17-8-2009
10712280@unknown@formal@none@1@S@The sign of that point will determine the classification of the sample.@@@@1@12@@danf@17-8-2009
10712290@unknown@formal@none@1@S@In the k-nearest neighbour classifier, the k nearest points are considered and the sign of the majority is used to classify the sample.@@@@1@23@@danf@17-8-2009
10712300@unknown@formal@none@1@S@The performance of the kNN algorithm is influenced by three main factors: (1) the distance measure used to locate the nearest neighbours; (2) the decision rule used to derive a classification from the k-nearest neighbours; and (3) the number of neighbours used to classify the new sample.@@@@1@47@@danf@17-8-2009
10712310@unknown@formal@none@1@S@It can be proved that, unlike other methods, this method is universally asymptotically convergent, i.e. as the size of the training set increases, if the observations are i.i.d., regardless of the distribution from which the sample is drawn, the predicted class will converge to the class assignment that minimizes misclassification error.@@@@1@51@@danf@17-8-2009
10712320@unknown@formal@none@1@S@See Devroye et al.@@@@1@4@@danf@17-8-2009
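The k-nearest neighbour rule described above can be illustrated with a short Python sketch. It assumes Euclidean distance as the distance measure, a simple majority vote as the decision rule, and labels coded as +1 and -1; the function name `knn_classify` and the sample data are hypothetical.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify query by majority vote among its k nearest training cases.

    train is a list of (features, label) pairs with label +1 or -1.
    Euclidean distance is an assumption; other measures can be substituted.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up training set with negative cases near the origin and positive cases further out.
train = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1),
         ((1, 1), 1), ((2, 2), 1), ((2, 1), 1)]

label_far = knn_classify(train, (1.8, 1.6), k=3)
label_near = knn_classify(train, (0.2, 0.1), k=3)
```

With `k=1` the function reduces to the plain nearest-neighbour rule. An odd `k` avoids tied votes when there are only two classes.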
10712330@unknown@formal@none@1@S@==Popular tools==@@@@1@2@@danf@17-8-2009
10712340@unknown@formal@none@1@S@There are numerous tools available in the marketplace which help with the execution of predictive analytics.@@@@1@16@@danf@17-8-2009
10712350@unknown@formal@none@1@S@These range from those which need very little user sophistication to those that are designed for the expert practitioner.@@@@1@19@@danf@17-8-2009
10712360@unknown@formal@none@1@S@The difference between these tools is often in the level of customization and heavy data lifting allowed.@@@@1@17@@danf@17-8-2009
10712370@unknown@formal@none@1@S@For traditional statistical modeling some of the popular tools are [[DAP (software)|DAP]]/[[SAS Institute|SAS]], S-Plus, [[PSPP]]/[[SPSS]] and Stata.@@@@1@17@@danf@17-8-2009
10712380@unknown@formal@none@1@S@For machine learning/data mining type of applications, KnowledgeSEEKER, KnowledgeSTUDIO, Enterprise Miner, GeneXproTools, [[Viscovery]], Clementine, [[KXEN Inc.|KXEN Analytic Framework]], [[InforSense]] and Excel Miner are some of the popularly used options.@@@@1@29@@danf@17-8-2009
10712390@unknown@formal@none@1@S@Classification Tree analysis can be performed using CART software.@@@@1@9@@danf@17-8-2009
10712400@unknown@formal@none@1@S@SOMine is a predictive analytics tool based on [[self-organizing map]]s (SOMs) available from [[Viscovery Software]].@@@@1@15@@danf@17-8-2009
10712410@unknown@formal@none@1@S@[[R (programming_language)|R]] is a very powerful tool that can be used to perform almost any kind of statistical analysis, and is freely downloadable.@@@@1@23@@danf@17-8-2009
10712420@unknown@formal@none@1@S@[[WEKA]] is a freely available [[open source|open-source]] collection of [[machine learning]] methods for pattern classification, regression, clustering, and some types of meta-learning, which can be used for predictive analytics.@@@@1@29@@danf@17-8-2009
10712430@unknown@formal@none@1@S@[[RapidMiner]] is another freely available integrated [[open source|open-source]] software environment for predictive analytics, [[data mining]], and [[machine learning]] fully integrating WEKA and providing an even larger number of methods for predictive analytics.@@@@1@32@@danf@17-8-2009
10712440@unknown@formal@none@1@S@Recently, in an attempt to provide a standard language for expressing predictive models, the [[Predictive Model Markup Language]] (PMML) has been proposed.@@@@1@22@@danf@17-8-2009
10712450@unknown@formal@none@1@S@This XML-based language provides a way for different tools to define predictive models and to share them between PMML-compliant applications.@@@@1@20@@danf@17-8-2009
10712460@unknown@formal@none@1@S@Several tools already produce or consume PMML documents; these include [[ADAPA]], [[IBM DB2]] Warehouse, CART, SAS Enterprise Miner, and [[SPSS]].@@@@1@20@@danf@17-8-2009
10712470@unknown@formal@none@1@S@Predictive analytics has also found its way into the IT lexicon, most notably in the area of IT Automation.@@@@1@19@@danf@17-8-2009
10712480@unknown@formal@none@1@S@Vendors such as [[Stratavia]] and their [[Data Palette]] product offer predictive analytics as part of their automation platform, predicting how resources will behave in the future and automating the environment accordingly.@@@@1@31@@danf@17-8-2009
10712490@unknown@formal@none@1@S@The widespread use of predictive analytics in industry has led to the proliferation of numerous productized solutions firms.@@@@1@18@@danf@17-8-2009
10712500@unknown@formal@none@1@S@Some of them are highly specialized (focusing, for example, on fraud detection, automatic sales-lead generation or response modeling) in a specific domain ([[Fair Isaac]] for credit card scores) or industry vertical (MarketRx in pharmaceuticals).@@@@1@34@@danf@17-8-2009
10712510@unknown@formal@none@1@S@Others provide predictive analytics services in support of a wide range of business problems across industry verticals ([[Fifth C]]).@@@@1@19@@danf@17-8-2009
10712520@unknown@formal@none@1@S@Predictive analytics competitions are also fairly common and often pit academics against industry practitioners (see, for example, KDD Cup).@@@@1@19@@danf@17-8-2009
10712530@unknown@formal@none@1@S@==Conclusion==@@@@1@1@@danf@17-8-2009
10712540@unknown@formal@none@1@S@Predictive analytics adds great value to a business's decision-making capabilities by allowing it to formulate smart policies on the basis of predictions of future outcomes.@@@@1@25@@danf@17-8-2009
10712550@unknown@formal@none@1@S@A broad range of tools and techniques is available for this type of analysis and their selection is determined by the analytical maturity of the firm as well as the specific requirements of the problem being solved.@@@@1@37@@danf@17-8-2009
10712560@unknown@formal@none@1@S@==Education==@@@@1@1@@danf@17-8-2009
10712570@unknown@formal@none@1@S@Predictive analytics is taught at the following institutions:@@@@1@8@@danf@17-8-2009
10712580@unknown@formal@none@1@S@* Ghent University, Belgium: [http://www.mma.UGent.be Master of Marketing Analysis], an 8-month advanced master's degree taught in English with strong emphasis on applications of predictive analytics in Analytical CRM.@@@@1@28@@danf@17-8-2009
10720010@unknown@formal@none@1@S@RapidMiner@@@@1@1@@danf@17-8-2009
10720020@unknown@formal@none@1@S@'''RapidMiner''' (formerly YALE (Yet Another Learning Environment)) is an environment for [[machine learning]] and [[data mining]] experiments.@@@@1@17@@danf@17-8-2009
10720030@unknown@formal@none@1@S@It allows experiments to be made up of a large number of arbitrarily nestable operators, described in [[XML]] files which can easily be created with RapidMiner's [[graphical user interface]].@@@@1@29@@danf@17-8-2009
10720040@unknown@formal@none@1@S@Applications of RapidMiner cover both research and real-world data mining tasks.@@@@1@11@@danf@17-8-2009
10720050@unknown@formal@none@1@S@The initial version was developed by the Artificial Intelligence Unit of [[Dortmund University of Technology|University of Dortmund]] starting in [[2001]].@@@@1@20@@danf@17-8-2009
10720060@unknown@formal@none@1@S@It is distributed under a [[GNU]] license, and has been hosted by [[SourceForge]] since [[2004]].@@@@1@15@@danf@17-8-2009
10720070@unknown@formal@none@1@S@RapidMiner provides more than 400 operators for all main machine learning procedures, including input and output, and data preprocessing and visualization.@@@@1@21@@danf@17-8-2009
10720080@unknown@formal@none@1@S@It is written in the [[Java (programming language)|Java programming language]] and therefore can work on all popular operating systems.@@@@1@19@@danf@17-8-2009
10720090@unknown@formal@none@1@S@It also integrates all learning schemes and attribute evaluators of the [[Weka (machine learning)|Weka]] learning environment.@@@@1@16@@danf@17-8-2009
10720100@unknown@formal@none@1@S@== Properties ==@@@@1@3@@danf@17-8-2009
10720110@unknown@formal@none@1@S@Some properties of RapidMiner are:@@@@1@5@@danf@17-8-2009
10720120@unknown@formal@none@1@S@* written in Java@@@@1@4@@danf@17-8-2009
10720130@unknown@formal@none@1@S@* [[knowledge discovery]] processes are modeled as operator trees@@@@1@9@@danf@17-8-2009
10720140@unknown@formal@none@1@S@* internal XML representation ensures standardized interchange format of data mining experiments@@@@1@12@@danf@17-8-2009
10720150@unknown@formal@none@1@S@* scripting language allows for automatic large-scale experiments@@@@1@8@@danf@17-8-2009
10720160@unknown@formal@none@1@S@* multi-layered data view concept ensures efficient and transparent data handling@@@@1@11@@danf@17-8-2009
10720170@unknown@formal@none@1@S@* [[graphical user interface]], [[command line]] mode ([[Batch file|batch mode]]), and [[Java API]] for using RapidMiner from your own programs@@@@1@20@@danf@17-8-2009
10720180@unknown@formal@none@1@S@* [[plugin]] and [[Extension (computing)|extension]] mechanisms; several plugins already exist@@@@1@10@@danf@17-8-2009
10720190@unknown@formal@none@1@S@* [[plotting]] facility offering a large set of high-dimensional visualization schemes for data and models@@@@1@15@@danf@17-8-2009
10720200@unknown@formal@none@1@S@* applications include [[text mining]], multimedia mining, feature engineering, data stream mining and tracking drifting concepts, development of ensemble methods, and distributed data mining.@@@@1@24@@danf@17-8-2009
10730010@unknown@formal@none@1@S@Russian language@@@@1@2@@danf@17-8-2009
10730020@unknown@formal@none@1@S@'''Russian''' ({{IPA-ru|ˈruskʲɪj jɪˈzɨk}}) is the most geographically widespread language of [[Eurasia]], the most widely spoken of the [[Slavic languages]], and the largest [[native language]] in [[Europe]].@@@@1@26@@danf@17-8-2009
10730030@unknown@formal@none@1@S@Russian belongs to the family of [[Indo-European languages]] and is one of three (or, according to some authorities, four) living members of the [[East Slavic languages]], the others being [[Belarusian language|Belarusian]] and [[Ukrainian language|Ukrainian]] (and possibly [[Rusyn language|Rusyn]], often considered a dialect of Ukrainian).@@@@1@44@@danf@17-8-2009
10730040@unknown@formal@none@1@S@It is also spoken in the countries of the [[Russophone]] world.@@@@1@11@@danf@17-8-2009
10730050@unknown@formal@none@1@S@Written examples of Old East Slavonic are attested from the 10th century onwards.@@@@1@13@@danf@17-8-2009
10730060@unknown@formal@none@1@S@Today Russian is widely used outside [[Russia]].@@@@1@7@@danf@17-8-2009
10730070@unknown@formal@none@1@S@It serves as a means of coding and storing universal knowledge: 60–70% of all world information is published in the English and Russian languages.@@@@1@24@@danf@17-8-2009
10730080@unknown@formal@none@1@S@Over a quarter of the world's scientific literature is published in Russian.@@@@1@12@@danf@17-8-2009
10730090@unknown@formal@none@1@S@Russian is also a necessary accessory of world communications systems (broadcasts, air and space communication, etc.).@@@@1@16@@danf@17-8-2009
10730100@unknown@formal@none@1@S@Due to the status of the [[Soviet Union]] as a [[superpower]], Russian had great political importance in the 20th century.@@@@1@20@@danf@17-8-2009
10730110@unknown@formal@none@1@S@Hence, the language is one of the [[United Nations#Languages|official languages]] of the [[United Nations]].@@@@1@14@@danf@17-8-2009
10730120@unknown@formal@none@1@S@Russian distinguishes between [[consonant]] [[phoneme]]s with [[palatalization|palatal]] [[secondary articulation]] and those without, the so-called ''soft'' and ''hard'' sounds.@@@@1@18@@danf@17-8-2009
10730130@unknown@formal@none@1@S@This distinction is found between pairs of almost all consonants and is one of the most distinguishing features of the language.@@@@1@21@@danf@17-8-2009
10730140@unknown@formal@none@1@S@Another important aspect is the [[vowel reduction|reduction]] of [[stress (linguistics)|unstressed]] [[vowel]]s, which is somewhat similar to [[Unstressed and reduced vowels in English|that of English]].@@@@1@24@@danf@17-8-2009
10730150@unknown@formal@none@1@S@Stress, which is unpredictable, is not normally indicated orthographically.@@@@1@9@@danf@17-8-2009
10730160@unknown@formal@none@1@S@According to the Institute of Russian Language of the Russian Academy of Sciences, an optional [[acute accent]] may, and sometimes should, be used to mark stress.@@@@1@26@@danf@17-8-2009
10730170@unknown@formal@none@1@S@For example, it is used to distinguish between otherwise identical words, especially when context doesn't make it obvious: ''замо́к/за́мок'' (lock/castle), ''сто́ящий/стоя́щий'' (worthwhile/standing), ''чудно́/чу́дно'' (this is odd/this is marvellous), ''молоде́ц/мо́лодец'' (attaboy/fine young man), ''узна́ю/узнаю́'' (I shall learn it/I am learning it), ''отреза́ть/отре́зать'' (infinitive for "cut"/perfective for "cut"); to indicate the proper pronunciation of uncommon words, especially personal and family names (''афе́ра, гу́ру, Гарси́а, Оле́ша, Фе́рми''), and to express the stressed word in the sentence (''Ты́ съел печенье?/Ты съе́л печенье?/Ты съел пече́нье?'' - Was it you who ate the cookie?/Did you eat the cookie?/Was the cookie your meal?).@@@@1@96@@danf@17-8-2009
10730180@unknown@formal@none@1@S@Acute accents are mandatory in lexical dictionaries and books intended to be used either by children or foreign readers.@@@@1@19@@danf@17-8-2009
10730190@unknown@formal@none@1@S@==Classification==@@@@1@1@@danf@17-8-2009
10730200@unknown@formal@none@1@S@Russian is a [[Slavic languages|Slavic language]] in the [[Indo-European Languages|Indo-European family]].@@@@1@11@@danf@17-8-2009
10730210@unknown@formal@none@1@S@From the point of view of the [[spoken language]], its closest relatives are [[Ukrainian language|Ukrainian]] and [[Belarusian language|Belarusian]], the other two national languages in the [[East Slavic languages|East Slavic]] group.@@@@1@30@@danf@17-8-2009
10730220@unknown@formal@none@1@S@In many places in eastern [[Ukraine]] and [[Belarus]], these languages are spoken interchangeably, and in certain areas traditional bilingualism resulted in language mixture, e.g. [[Surzhyk]] in eastern Ukraine and [[Trasianka]] in Belarus.@@@@1@32@@danf@17-8-2009
10730240@unknown@formal@none@1@S@The East Slavic [[Old Novgorod dialect]], although it vanished during the fifteenth or sixteenth century, is sometimes considered to have played a significant role in the formation of the modern Russian language.@@@@1@32@@danf@17-8-2009
10730250@unknown@formal@none@1@S@The vocabulary (mainly abstract and literary words), principles of word formation, and, to some extent, inflections and literary style of Russian have also been influenced by [[Church Slavonic language|Church Slavonic]], a developed and partly adopted form of the [[South Slavic languages|South Slavic]] [[Old Church Slavonic]] language used by the [[Russian Orthodox Church]].@@@@1@52@@danf@17-8-2009
10730260@unknown@formal@none@1@S@However, the East Slavic forms have tended to be used exclusively in the various dialects that are experiencing a rapid decline.@@@@1@21@@danf@17-8-2009
10730270@unknown@formal@none@1@S@In some cases, both the [[East Slavic languages|East Slavic]] and the [[Church Slavonic]] forms are in use, with slightly different meanings.@@@@1@21@@danf@17-8-2009
10730280@unknown@formal@none@1@S@''For details, see [[Russian phonology]] and [[History of the Russian language]].''@@@@1@11@@danf@17-8-2009
10730290@unknown@formal@none@1@S@Russian phonology and syntax (especially in northern dialects) have also been influenced to some extent by the numerous Finnic languages of the [[Finno-Ugric languages|Finno-Ugric subfamily]]: [[Merya language|Merya]], [[Moksha language|Moksha]], [[Muromian language|Muromian]], the language of the [[Meshchera]], [[Veps language|Veps]], et cetera.@@@@1@40@@danf@17-8-2009
10730300@unknown@formal@none@1@S@These languages, some of them now extinct, used to be spoken in the center and in the north of what is now the European part of Russia.@@@@1@27@@danf@17-8-2009
10730310@unknown@formal@none@1@S@They came in contact with Eastern Slavic as far back as the early Middle Ages and eventually served as a substratum for the modern Russian language.@@@@1@26@@danf@17-8-2009
10730320@unknown@formal@none@1@S@The Russian dialects spoken north, north-east and north-west of [[Moscow]] have a considerable number of words of Finno-Ugric origin.@@@@1@19@@danf@17-8-2009
10730330@unknown@formal@none@1@S@Over the course of centuries, the vocabulary and literary style of Russian have also been influenced by Turkic/Caucasian/Central Asian languages, as well as Western/Central European languages such as [[Polish language|Polish]], [[Latin]], [[Dutch language|Dutch]], [[German language|German]], [[French language|French]], and [[English language|English]].@@@@1@40@@danf@17-8-2009
10730340@unknown@formal@none@1@S@According to the [[Defense Language Institute]] in [[Monterey, California]], Russian is classified as a level III language in terms of learning difficulty for native English speakers, requiring approximately 780 hours of immersion instruction to achieve intermediate fluency.@@@@1@37@@danf@17-8-2009
10730350@unknown@formal@none@1@S@It is also regarded by the [[United States Intelligence Community]] as a "hard target" language, due both to the difficulty English speakers have in mastering it and to its critical role in American world policy.@@@@1@35@@danf@17-8-2009
10730360@unknown@formal@none@1@S@==Geographic distribution==@@@@1@2@@danf@17-8-2009
10730370@unknown@formal@none@1@S@Russian is primarily spoken in [[Russia]] and, to a lesser extent, the other countries that were once constituent republics of the [[Soviet Union|USSR]].@@@@1@23@@danf@17-8-2009
10730380@unknown@formal@none@1@S@Until [[1917]], it was the sole official language of the [[Russian Empire]].@@@@1@12@@danf@17-8-2009
10730390@unknown@formal@none@1@S@During the Soviet period, the policy toward the languages of the various other ethnic groups fluctuated in practice.@@@@1@18@@danf@17-8-2009
10730400@unknown@formal@none@1@S@Though each of the constituent republics had its own official language, the unifying role and superior status was reserved for Russian.@@@@1@21@@danf@17-8-2009
10730410@unknown@formal@none@1@S@Following the break-up of [[1991]], several of the newly independent states have encouraged their native languages, which has partly reversed the privileged status of Russian, though its role as the language of post-Soviet national intercourse throughout the region has continued.@@@@1@40@@danf@17-8-2009
10730420@unknown@formal@none@1@S@In [[Latvia]], notably, its official recognition and legality in the classroom have been a topic of considerable debate in a country where more than one-third of the population is Russian-speaking, consisting mostly of post-[[World War II]] immigrants from Russia and other parts of the former [[USSR]] (Belarus, Ukraine).@@@@1@48@@danf@17-8-2009
10730430@unknown@formal@none@1@S@Similarly, in [[Estonia]], the Soviet-era immigrants and their Russian-speaking descendants constitute 25.6% of the country's current population, and 58.6% of the native Estonian population is also able to speak Russian.@@@@1@30@@danf@17-8-2009
10730440@unknown@formal@none@1@S@In all, 67.8% of Estonia's population can speak Russian.@@@@1@9@@danf@17-8-2009
10730450@unknown@formal@none@1@S@In [[Kazakhstan]] and [[Kyrgyzstan]], Russian remains a co-official language with [[Kazakh language|Kazakh]] and [[Kyrgyz language|Kyrgyz]] respectively.@@@@1@16@@danf@17-8-2009
10730460@unknown@formal@none@1@S@Large Russian-speaking communities still exist in northern Kazakhstan, and ethnic Russians comprise 25.6% of Kazakhstan's population.@@@@1@16@@danf@17-8-2009
10730470@unknown@formal@none@1@S@A much smaller Russian-speaking minority in [[Lithuania]] represents less than 1/10 of the country's overall population.@@@@1@16@@danf@17-8-2009
10730480@unknown@formal@none@1@S@Nevertheless, more than half of the population of the [[Baltic states]] is able to hold a conversation in Russian, and almost all have at least some familiarity with the most basic spoken and written phrases.@@@@1@35@@danf@17-8-2009
10730490@unknown@formal@none@1@S@The Russian control of [[Finland]] in 1809–1918, however, has left few Russian speakers in Finland.@@@@1@15@@danf@17-8-2009
10730500@unknown@formal@none@1@S@There are 33,400 Russian speakers in Finland, amounting to 0.6% of the population.@@@@1@13@@danf@17-8-2009
10730510@unknown@formal@none@1@S@5,000 (0.1%) of them are late 19th-century and 20th-century immigrants, and the rest are recent immigrants who arrived in the 1990s and later.@@@@1@23@@danf@17-8-2009
10730520@unknown@formal@none@1@S@In the twentieth century, Russian was widely taught in the schools of the members of the old [[Warsaw Pact]] and in other [[Communist state|countries]] that used to be allies of the USSR.@@@@1@32@@danf@17-8-2009
10730530@unknown@formal@none@1@S@In particular, these countries include [[Poland]], [[Bulgaria]], the [[Czech Republic]], [[Slovakia]], [[Hungary]], [[Romania]], [[Albania]] and [[Cuba]].@@@@1@16@@danf@17-8-2009
10730540@unknown@formal@none@1@S@However, younger generations are usually not fluent in it, because Russian is no longer mandatory in the school system.@@@@1@19@@danf@17-8-2009
10730550@unknown@formal@none@1@S@Russian is currently the most widely taught foreign language in [[Mongolia]].@@@@1@11@@danf@17-8-2009
10730560@unknown@formal@none@1@S@Russian is also spoken in [[Israel]] by at least 750,000 ethnic [[Jew]]ish immigrants from the former [[Soviet Union]] (1999 census).@@@@1@20@@danf@17-8-2009
10730570@unknown@formal@none@1@S@The Israeli [[Mass media|press]] and [[website]]s regularly publish material in Russian.@@@@1@11@@danf@17-8-2009
10730580@unknown@formal@none@1@S@Sizable Russian-speaking communities also exist in [[North America]], especially in large urban centers of the [[United States|U.S.]] and [[Canada]] such as [[New York City]], [[Philadelphia]], [[Boston, Massachusetts|Boston]], [[Los Angeles, California|Los Angeles]], [[San Francisco]], [[Seattle]], [[Toronto]], [[Baltimore]], [[Miami, Florida|Miami]], [[Chicago]], [[Denver]], and the [[Cleveland, Ohio|Cleveland]] suburb of [[Richmond Heights, Ohio|Richmond Heights]].@@@@1@50@@danf@17-8-2009
10730590@unknown@formal@none@1@S@In the former two, Russian-speaking groups total over half a million.@@@@1@11@@danf@17-8-2009
10730600@unknown@formal@none@1@S@In a number of locations they publish their own newspapers and live in self-sufficient neighborhoods (especially the generation of immigrants who started arriving in the early sixties).@@@@1@27@@danf@17-8-2009
10730610@unknown@formal@none@1@S@Only about a quarter of them are ethnic Russians, however.@@@@1@10@@danf@17-8-2009
10730620@unknown@formal@none@1@S@Before the [[dissolution of the Soviet Union]], the overwhelming majority of [[Russophone]]s in North America were Russian-speaking [[Jews]].@@@@1@18@@danf@17-8-2009
10730630@unknown@formal@none@1@S@Afterwards the influx from the countries of the former [[Soviet Union]] changed the statistics somewhat.@@@@1@15@@danf@17-8-2009
10730640@unknown@formal@none@1@S@According to the [[United States 2000 Census]], Russian is the primary language spoken in the homes of over 700,000 individuals living in the United States.@@@@1@25@@danf@17-8-2009
10730650@unknown@formal@none@1@S@Significant Russian-speaking groups also exist in [[Western Europe]].@@@@1@8@@danf@17-8-2009
10730660@unknown@formal@none@1@S@These have been fed by several waves of immigrants since the beginning of the twentieth century, each with its own flavor of language.@@@@1@23@@danf@17-8-2009
10730670@unknown@formal@none@1@S@[[Germany]], the [[United Kingdom]], [[Spain]], [[France]], [[Italy]], [[Belgium]], [[Greece]], [[Brazil]], [[Norway]], [[Austria]], and [[Turkey]] have significant Russian-speaking communities totaling 3 million people.@@@@1@22@@danf@17-8-2009
10730680@unknown@formal@none@1@S@Two thirds of them are actually Russian-speaking descendants of [[German people|Germans]], [[Greeks]], [[Jews]], [[Armenians]], or [[Ukrainians]] who either repatriated after the [[USSR]] collapsed or are just looking for temporary employment.@@@@1@30@@danf@17-8-2009
10730690@unknown@formal@none@1@S@Recent estimates of the total number of speakers of Russian:@@@@1@10@@danf@17-8-2009
10730700@unknown@formal@none@1@S@===Official status===@@@@1@2@@danf@17-8-2009
10730710@unknown@formal@none@1@S@Russian is the official language of [[Russia]].@@@@1@7@@danf@17-8-2009
10730720@unknown@formal@none@1@S@It is also an official language of [[Belarus]], [[Kazakhstan]] and [[Kyrgyzstan]], an unofficial but widely spoken language in [[Ukraine]], and the de facto official language of the [[List of unrecognized countries|unrecognized]] states of [[Transnistria]], [[South Ossetia]] and [[Abkhazia]].@@@@1@38@@danf@17-8-2009
10730730@unknown@formal@none@1@S@Russian is one of the [[United Nations#Languages|six official languages]] of the [[United Nations]].@@@@1@13@@danf@17-8-2009
10730740@unknown@formal@none@1@S@Education in Russian is still a popular choice for both speakers of Russian as a second language (RSL) and native speakers in Russia as well as many of the former Soviet republics.@@@@1@32@@danf@17-8-2009
10730750@unknown@formal@none@1@S@97% of the public school students of Russia, 75% in Belarus, 41% in Kazakhstan, 25% in [[Ukraine]], 23% in Kyrgyzstan, 21% in [[Moldova]], 7% in [[Azerbaijan]], 5% in [[Georgia (country)|Georgia]] and 2% in [[Armenia]] and [[Tajikistan]] receive their education only or mostly in Russian.@@@@1@44@@danf@17-8-2009
10730760@unknown@formal@none@1@S@However, the corresponding percentage of ethnic Russians is 78% in [[Russia]], 10% in [[Belarus]], 26% in [[Kazakhstan]], 17% in [[Ukraine]], 9% in [[Kyrgyzstan]], 6% in [[Republic of Moldova|Moldova]], 2% in [[Azerbaijan]], 1.5% in [[Georgia (country)|Georgia]] and less than 1% in both [[Armenia]] and [[Tajikistan]].@@@@1@44@@danf@17-8-2009
10730770@unknown@formal@none@1@S@Russian-language schooling is also available in Latvia, Estonia and Lithuania, but due to education reforms, the number of subjects taught in Russian has been reduced at the high school level.@@@@1@30@@danf@17-8-2009
10730780@unknown@formal@none@1@S@The language has a co-official status alongside [[Moldovan language|Moldovan]] in the autonomies of [[Gagauzia]] and [[Transnistria]] in [[Moldova]], and in seven [[Romania]]n [[Commune in Romania|communes]] in [[Tulcea County|Tulcea]] and [[Constanţa County|Constanţa]] counties.@@@@1@32@@danf@17-8-2009
10730790@unknown@formal@none@1@S@In these localities, Russian-speaking [[Lipovans]], who are a recognized ethnic minority, make up more than 20% of the population.@@@@1@19@@danf@17-8-2009
10730800@unknown@formal@none@1@S@Thus, according to Romania's minority rights law, education, signage, and access to public administration and the justice system are provided in Russian alongside Romanian.@@@@1@24@@danf@17-8-2009
10730810@unknown@formal@none@1@S@In the [[Crimea|Autonomous Republic of Crimea]] in Ukraine, Russian is an officially recognized language alongside [[Crimean Tatar language|Crimean Tatar]], but in reality it is the only language used by the government, making it a ''[[de facto]]'' official language.@@@@1@38@@danf@17-8-2009
10730820@unknown@formal@none@1@S@===Dialects===@@@@1@1@@danf@17-8-2009
10730830@unknown@formal@none@1@S@Despite leveling after 1900, especially in matters of vocabulary, a number of dialects exist in Russia.@@@@1@16@@danf@17-8-2009
10730840@unknown@formal@none@1@S@Some linguists divide the dialects of the Russian language into two primary regional groupings, "Northern" and "Southern", with [[Moscow]] lying in the zone of transition between the two.@@@@1@28@@danf@17-8-2009
10730850@unknown@formal@none@1@S@Others divide the language into three groupings, Northern, Central and Southern, with Moscow lying in the Central region.@@@@1@18@@danf@17-8-2009
10730860@unknown@formal@none@1@S@[[Dialectology]] within Russia recognizes dozens of smaller-scale variants.@@@@1@8@@danf@17-8-2009
10730870@unknown@formal@none@1@S@The dialects often show distinct and non-standard features of pronunciation and intonation, vocabulary, and grammar.@@@@1@15@@danf@17-8-2009
10730880@unknown@formal@none@1@S@Some of these are relics of ancient usage now completely discarded by the standard language.@@@@1@15@@danf@17-8-2009
10730890@unknown@formal@none@1@S@The [[northern Russian dialects]] and those spoken along the [[Volga River]] typically pronounce unstressed {{IPA|/o/}} clearly (the phenomenon called [[vowel reduction in Russian#Back vowels|okanye]]/оканье).@@@@1@24@@danf@17-8-2009
10730900@unknown@formal@none@1@S@East of Moscow, particularly in [[Ryazan Region]], unstressed {{IPA|/e/}} and {{IPA|/a/}} following [[palatalization|palatalized]] consonants and preceding a stressed syllable are not reduced to {{IPA|[ɪ]}} (like in the Moscow dialect), being instead pronounced as {{IPA|/a/}} in such positions (e.g. несл'''и''' is pronounced as {{IPA|[nʲasˈlʲi]}}, not as {{IPA|[nʲɪsˈlʲi]}}) - this is called [[yakanye]]/ яканье; many southern dialects have a palatalized final {{IPA|/tʲ/}} in 3rd person forms of verbs (this is unpalatalized in the standard dialect) and a fricative {{IPA|[ɣ]}} where the standard dialect has {{IPA|[g]}}.@@@@1@83@@danf@17-8-2009
10730910@unknown@formal@none@1@S@However, in certain areas south of Moscow, e.g. in and around [[Tula, Russia|Tula]], {{IPA|/g/}} is pronounced as in the Moscow and northern dialects unless it precedes a voiceless plosive or a pause.@@@@1@32@@danf@17-8-2009
10730920@unknown@formal@none@1@S@In this position {{IPA|/g/}} is lenited and devoiced to the fricative {{IPA|[x]}}, e.g. друг {{IPA|[drux]}} (in Moscow's dialect, only Бог {{IPA|[box]}}, лёгкий {{IPA|[lʲɵxʲkʲɪj]}}, мягкий {{IPA|[ˈmʲæxʲkʲɪj]}} and some derivatives follow this rule).@@@@1@31@@danf@17-8-2009
10730930@unknown@formal@none@1@S@Some of these features (e.g. a [[debuccalization|debuccalized]] or [[lenition|lenited]] {{IPA|/g/}} and palatalized final {{IPA|/tʲ/}} in 3rd person forms of verbs) are also present in modern [[Ukrainian language|Ukrainian]], indicating either a linguistic continuum or strong influence one way or the other.@@@@1@40@@danf@17-8-2009
10730940@unknown@formal@none@1@S@The city of [[Veliky Novgorod]] has historically displayed a feature called chokanye/tsokanye (чоканье/цоканье), where {{IPA|/ʨ/}} and {{IPA|/ʦ/}} were confused (this is thought to be due to influence from [[Finnish language|Finnish]], which does not distinguish these sounds).@@@@1@36@@danf@17-8-2009
10730950@unknown@formal@none@1@S@So, '''ц'''апля ("heron") has been recorded as 'чапля'.@@@@1@8@@danf@17-8-2009
10730960@unknown@formal@none@1@S@Also, the second palatalization of [[Velar consonant|velar]]s did not occur there, so the so-called '''ě²''' (from the Proto-Slavonic diphthong *ai) did not cause {{IPA|/k, g, x/}} to shift to {{IPA|/ʦ, ʣ, s/}}; therefore where [[Standard Russian]] has '''ц'''епь ("chain"), the form '''к'''епь {{IPA|[kʲepʲ]}} is attested in earlier texts.@@@@1@48@@danf@17-8-2009
10730970@unknown@formal@none@1@S@Among the first to study Russian dialects was [[Mikhail Lomonosov|Lomonosov]] in the eighteenth century.@@@@1@14@@danf@17-8-2009
10730980@unknown@formal@none@1@S@In the nineteenth century, [[Vladimir Dal]] compiled the first dictionary that included dialectal vocabulary.@@@@1@14@@danf@17-8-2009
10730990@unknown@formal@none@1@S@Detailed mapping of Russian dialects began at the turn of the twentieth century.@@@@1@13@@danf@17-8-2009
10731000@unknown@formal@none@1@S@In modern times, the monumental ''Dialectological Atlas of the Russian Language'' (''Диалектологический атлас русского языка'' {{IPA|[dʲɪɐˌlʲɛktəlɐˈgʲiʨɪskʲɪj ˈatləs ˈruskəvə jɪzɨˈka]}}) was published in three folio volumes in 1986–1989, after four decades of preparatory work.@@@@1@33@@danf@17-8-2009
10731010@unknown@formal@none@1@S@The ''standard language'' is based on (but not identical to) the Moscow dialect.@@@@1@13@@danf@17-8-2009
10731020@unknown@formal@none@1@S@===Derived languages===@@@@1@2@@danf@17-8-2009
10731030@unknown@formal@none@1@S@* [[Balachka]], a dialect spoken primarily by [[Cossacks]] in the regions of Don, [[Kuban]] and [[Terek]].@@@@1@16@@danf@17-8-2009
10731040@unknown@formal@none@1@S@* [[Fenya]], a criminal [[argot]] of ancient origin, with Russian grammar, but with distinct vocabulary.@@@@1@15@@danf@17-8-2009
10731050@unknown@formal@none@1@S@* [[Nadsat]], the fictional language spoken in ''[[A Clockwork Orange]]'', uses many Russian words and Russian slang.@@@@1@17@@danf@17-8-2009
10731060@unknown@formal@none@1@S@* [[Surzhyk]] is a language with Russian and Ukrainian features, spoken in some areas of Ukraine.@@@@1@16@@danf@17-8-2009
10731070@unknown@formal@none@1@S@* [[Trasianka]] is a language with Russian and Belarusian features used by a large portion of the rural population in [[Belarus]].@@@@1@21@@danf@17-8-2009
10731080@unknown@formal@none@1@S@* [[Quelia]], a pseudo-pidgin of German and Russian.@@@@1@8@@danf@17-8-2009
10731090@unknown@formal@none@1@S@* [[Runglish]], Russian-English pidgin.@@@@1@4@@danf@17-8-2009
10731100@unknown@formal@none@1@S@This word is also used by English speakers to describe the way in which Russians attempt to speak English using Russian morphology and/or syntax.@@@@1@24@@danf@17-8-2009
10731110@unknown@formal@none@1@S@* [[Russenorsk language|Russenorsk]] is an extinct [[pidgin]] language with mostly Russian vocabulary and mostly [[Norwegian language|Norwegian]] grammar, used for communication between [[Russians]] and [[Norway|Norwegian]] traders in the Pomor trade in [[Finnmark]] and the [[Kola Peninsula]].@@@@1@35@@danf@17-8-2009
10731120@unknown@formal@none@1@S@==Writing system==@@@@1@2@@danf@17-8-2009
10731130@unknown@formal@none@1@S@===Alphabet===@@@@1@1@@danf@17-8-2009
10731140@unknown@formal@none@1@S@Russian is written using a modified version of the [[Cyrillic alphabet|Cyrillic (кириллица)]] alphabet.@@@@1@13@@danf@17-8-2009
10731150@unknown@formal@none@1@S@The Russian alphabet consists of 33 letters.@@@@1@7@@danf@17-8-2009
10731160@unknown@formal@none@1@S@The following table gives their upper case forms, along with [[help:IPA|IPA]] values for each letter's typical sound:@@@@1@17@@danf@17-8-2009
10731170@unknown@formal@none@1@S@Older letters of the Russian alphabet include <ѣ>, which merged to <е> ({{IPA|/e/}}); <і> and <ѵ>, which both merged to <и> ({{IPA|/i/}}); <ѳ>, which merged to <ф> ({{IPA|/f/}}); and <ѧ>, which merged to <я> ({{IPA|/ja/}} or {{IPA|/ʲa/}}).@@@@1@37@@danf@17-8-2009
10731180@unknown@formal@none@1@S@While these older letters have been abandoned at one time or another, they may be used in this and related articles.@@@@1@21@@danf@17-8-2009
10731190@unknown@formal@none@1@S@The [[yer]]s <ъ> and <ь> originally indicated the pronunciation of ''ultra-short'' or ''reduced'' {{IPA|/ŭ/}}, {{IPA|/ĭ/}}.@@@@1@15@@danf@17-8-2009
10731200@unknown@formal@none@1@S@The Russian alphabet has many systems of [[character encoding]].@@@@1@9@@danf@17-8-2009
10731210@unknown@formal@none@1@S@[[KOI8-R]] was designed by the government and was intended to serve as the standard encoding.@@@@1@15@@danf@17-8-2009
10731220@unknown@formal@none@1@S@This encoding is still used in UNIX-like operating systems.@@@@1@9@@danf@17-8-2009
10731230@unknown@formal@none@1@S@Nevertheless, the spread of [[MS-DOS]] and [[Microsoft Windows]] created chaos and ultimately established different encodings as de facto standards.@@@@1@19@@danf@17-8-2009
10731240@unknown@formal@none@1@S@For communication purposes, a number of conversion applications were developed.@@@@1@10@@danf@17-8-2009
10731245@unknown@formal@none@1@S@"[[iconv]]" is an example that is supported by most versions of [[Linux]], [[Macintosh]] and some other [[operating system]]s.@@@@1@18@@danf@17-8-2009
10731250@unknown@formal@none@1@S@Most implementations (especially old ones) of the character encoding for the Russian language are aimed at simultaneous use of English and Russian characters only and do not include support for any other language.@@@@1@33@@danf@17-8-2009
10731260@unknown@formal@none@1@S@Hopes for unifying the character encoding of the Russian alphabet rest on the [[Unicode|Unicode standard]], which was specifically designed for the peaceful coexistence of various languages, including even [[dead language]]s.@@@@1@30@@danf@17-8-2009
10731270@unknown@formal@none@1@S@[[Unicode]] also supports the letters of the [[Early Cyrillic alphabet]], which have many similarities with the [[Greek alphabet]].@@@@1@18@@danf@17-8-2009
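The conversion between legacy Russian encodings and Unicode described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of how iconv itself works: the helper name `transcode` is hypothetical, and the choice of code pages 1251 and 866 as the Windows and MS-DOS encodings is an assumption made for the example.

```python
# Minimal sketch: convert Russian text between legacy encodings and UTF-8.
# `transcode` is a hypothetical helper name chosen for this example.
def transcode(data: bytes, source: str, target: str = "utf-8") -> bytes:
    """Decode bytes from one encoding and re-encode them in another."""
    return data.decode(source).encode(target)

word = "Русский"
koi8 = word.encode("koi8_r")    # KOI8-R bytes
cp1251 = word.encode("cp1251")  # Windows code page 1251 (assumed for the example)
cp866 = word.encode("cp866")    # MS-DOS code page 866 (assumed for the example)

# The same word is a different byte sequence in each legacy encoding.
assert len({koi8, cp1251, cp866}) == 3

# Converting to UTF-8 recovers identical bytes whatever the source encoding was.
utf8 = transcode(koi8, "koi8_r")
assert utf8 == transcode(cp1251, "cp1251") == word.encode("utf-8")

# Decoding with the wrong table does not fail here; it silently yields garbled text.
garbled = koi8.decode("cp1251")
assert garbled != word
```

The last step shows why mismatched encodings were so disruptive: the bytes are still valid in the other code page, so the error only appears as unreadable text.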
10731280@unknown@formal@none@1@S@===Orthography===@@@@1@1@@danf@17-8-2009
10731290@unknown@formal@none@1@S@Russian spelling is reasonably phonemic in practice.@@@@1@7@@danf@17-8-2009
10731300@unknown@formal@none@1@S@It is in fact a balance among phonemics, morphology, etymology, and grammar; and, like that of most living languages, has its share of inconsistencies and controversial points.@@@@1@27@@danf@17-8-2009
10731310@unknown@formal@none@1@S@A number of rigid [[spelling rule]]s introduced between the 1880s and 1910s have been responsible for the latter whilst trying to eliminate the former.@@@@1@24@@danf@17-8-2009
10731320@unknown@formal@none@1@S@The current spelling follows the major reform of 1918, and the final codification of 1956.@@@@1@15@@danf@17-8-2009
10731330@unknown@formal@none@1@S@An update proposed in the late 1990s has met a hostile reception, and has not been formally adopted.@@@@1@18@@danf@17-8-2009
10731340@unknown@formal@none@1@S@The punctuation, originally based on Byzantine Greek, was in the seventeenth and eighteenth centuries reformulated on the French and German models.@@@@1@21@@danf@17-8-2009
10731350@unknown@formal@none@1@S@==Sounds==@@@@1@1@@danf@17-8-2009
10731360@unknown@formal@none@1@S@The phonological system of Russian is inherited from [[Common Slavonic]], but underwent considerable modification in the early historical period, before being largely settled by about 1400.@@@@1@26@@danf@17-8-2009
10731370@unknown@formal@none@1@S@The language possesses five vowels, which are written with different letters depending on whether or not the preceding consonant is [[palatalization|palatalized]].@@@@1@21@@danf@17-8-2009
10731380@unknown@formal@none@1@S@The consonants typically come in plain vs. palatalized pairs, which are traditionally called ''hard'' and ''soft.''@@@@1@16@@danf@17-8-2009
10731390@unknown@formal@none@1@S@(The ''hard'' consonants are often [[velarization|velarized]], especially before back vowels, although in some dialects the velarization is limited to hard {{IPA|/l/}}).@@@@1@21@@danf@17-8-2009
10731400@unknown@formal@none@1@S@The standard language, based on the Moscow dialect, possesses heavy stress and moderate variation in pitch.@@@@1@16@@danf@17-8-2009
10731410@unknown@formal@none@1@S@Stressed vowels are somewhat lengthened, while unstressed vowels tend to be reduced to near-close vowels or an unclear [[schwa]].@@@@1@19@@danf@17-8-2009
10731420@unknown@formal@none@1@S@(See also: [[vowel reduction in Russian]].)@@@@1@6@@danf@17-8-2009
10731430@unknown@formal@none@1@S@The Russian [[syllable]] structure can be quite complex, with both initial and final consonant clusters of up to four consecutive sounds.@@@@1@21@@danf@17-8-2009
10731440@unknown@formal@none@1@S@Using a formula with V standing for the nucleus (vowel) and C for each consonant, the structure can be described as follows:@@@@1@22@@danf@17-8-2009
10731450@unknown@formal@none@1@S@(C)(C)(C)(C)V(C)(C)(C)(C)@@@@1@1@@danf@17-8-2009
10731460@unknown@formal@none@1@S@Clusters of four consonants are not very common, however, especially within a morpheme.@@@@1@13@@danf@17-8-2009
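The template (C)(C)(C)(C)V(C)(C)(C)(C) can be read as a regular expression: up to four consonants, one vowel, up to four consonants. The sketch below checks one-syllable words against it. It is only a rough approximation, since it works from spelling rather than pronunciation (for instance, it treats every non-vowel letter as one consonant sound and ignores the /j/ that some vowel letters can represent); the function names are hypothetical.

```python
import re

VOWELS = set("аеёиоуыэюя")
SIGNS = set("ъь")  # hard and soft signs are skipped: they are not sounds of their own

# (C)(C)(C)(C)V(C)(C)(C)(C) written as a regular expression over a C/V skeleton
SYLLABLE = re.compile(r"C{0,4}VC{0,4}")

def skeleton(word: str) -> str:
    """Reduce a word to a string of C and V symbols, based on its spelling."""
    out = []
    for ch in word.lower():
        if ch in VOWELS:
            out.append("V")
        elif ch in SIGNS:
            continue
        elif ch.isalpha():
            out.append("C")
    return "".join(out)

def fits_template(word: str) -> bool:
    """True if the whole word matches the single-syllable template."""
    return SYLLABLE.fullmatch(skeleton(word)) is not None

assert skeleton("взгляд") == "CCCCVC"   # four-consonant initial cluster
assert fits_template("взгляд")
assert fits_template("монстр")          # CVCCCC: four-consonant final cluster
assert not fits_template("мама")        # CVCV: two syllables, not one
```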
10731470@unknown@formal@none@1@S@===Consonants===@@@@1@1@@danf@17-8-2009
10731480@unknown@formal@none@1@S@Russian is notable for its distinction based on [[palatalization]] of most of the consonants.@@@@1@14@@danf@17-8-2009
10731490@unknown@formal@none@1@S@While {{IPA|/k/, /g/, /x/}} do have palatalized [[allophone]]s {{IPA|[kʲ, gʲ, xʲ]}}, only {{IPA|/kʲ/}} might be considered a phoneme, though it is marginal and generally not considered distinctive (the only native [[minimal pair]] which argues for {{IPA|/kʲ/}} to be a separate phoneme is "это ткёт"/"этот кот").@@@@1@45@@danf@17-8-2009
10731500@unknown@formal@none@1@S@Palatalization means that the center of the tongue is raised during and after the articulation of the consonant.@@@@1@18@@danf@17-8-2009
10731510@unknown@formal@none@1@S@In the case of {{IPA|/tʲ/ and /dʲ/}}, the tongue is raised enough to produce slight frication (affricate sounds).@@@@1@18@@danf@17-8-2009
10731520@unknown@formal@none@1@S@The sounds {{IPA|/t, d, ʦ, s, z, n and rʲ/}} are [[dental consonant|dental]], that is, pronounced with the tip of the tongue against the teeth rather than against the [[alveolar ridge]].@@@@1@31@@danf@17-8-2009
10731530@unknown@formal@none@1@S@==Grammar==@@@@1@1@@danf@17-8-2009
10731540@unknown@formal@none@1@S@Russian has preserved an [[Indo-European languages|Indo-European]] [[Synthetic language|synthetic]]-[[inflection]]al structure, although considerable leveling has taken place.@@@@1@15@@danf@17-8-2009
10731550@unknown@formal@none@1@S@Russian grammar encompasses:@@@@1@3@@danf@17-8-2009
10731560@unknown@formal@none@1@S@* a highly [[Synthetic language|synthetic]] '''morphology'''@@@@1@6@@danf@17-8-2009
10731570@unknown@formal@none@1@S@* a '''syntax''' that, for the literary language, is the conscious fusion of three elements:@@@@1@15@@danf@17-8-2009
10731580@unknown@formal@none@1@S@** a polished [[vernacular]] foundation;@@@@1@5@@danf@17-8-2009
10731590@unknown@formal@none@1@S@** a [[Church Slavonic language|Church Slavonic]] inheritance;@@@@1@7@@danf@17-8-2009
10731600@unknown@formal@none@1@S@** a [[Western Europe]]an style.@@@@1@5@@danf@17-8-2009
10731610@unknown@formal@none@1@S@The spoken language has been influenced by the literary one, but continues to preserve characteristic forms.@@@@1@16@@danf@17-8-2009
10731620@unknown@formal@none@1@S@The dialects show various non-standard grammatical features, some of which are archaisms or descendants of old forms since discarded by the literary language.@@@@1@23@@danf@17-8-2009
10731630@unknown@formal@none@1@S@==Vocabulary==@@@@1@1@@danf@17-8-2009
10731640@unknown@formal@none@1@S@See [[History of the Russian language]] for an account of the successive foreign influences on the Russian language.@@@@1@18@@danf@17-8-2009
10731650@unknown@formal@none@1@S@The total number of words in Russian is difficult to reckon because of the ability to agglutinate and create manifold compounds, diminutives, etc. (see [[Russian grammar#Word Formation|Word Formation]] under [[Russian grammar]]).@@@@1@31@@danf@17-8-2009
10731660@unknown@formal@none@1@S@The number of listed words or entries in some of the major dictionaries published during the last two centuries, and the total vocabulary of [[Pushkin]] (who is credited with greatly augmenting and codifying literary Russian), are as follows:@@@@1@38@@danf@17-8-2009
10731670@unknown@formal@none@1@S@(As a historical aside, [[Vladimir Ivanovich Dal|Dahl]] was, in the second half of the nineteenth century, still insisting that the adjective '''русский''', which was at that time applied uniformly to all the Orthodox Eastern Slavic subjects of the Empire, as well as to its one official language, be spelled '''руский''' with one s, in accordance with ancient tradition and what he termed the "spirit of the language".@@@@1@67@@danf@17-8-2009
10731680@unknown@formal@none@1@S@He was contradicted by the philologist Grot, who distinctly heard the s lengthened or doubled.)@@@@1@15@@danf@17-8-2009
10731690@unknown@formal@none@1@S@=== Proverbs and sayings ===@@@@1@5@@danf@17-8-2009
10731700@unknown@formal@none@1@S@The Russian language is replete with many hundreds of proverbs ('''пословица''' {{IPA|[pɐˈslo.vʲɪ.ʦə]}}) and sayings ('''поговоркa''' {{IPA|[pə.gɐˈvo.rkə]}}).@@@@1@16@@danf@17-8-2009
10731710@unknown@formal@none@1@S@These were already tabulated by the seventeenth century, and collected and studied in the nineteenth and twentieth, with the folk-tales being an especially fertile source.@@@@1@25@@danf@17-8-2009
10731720@unknown@formal@none@1@S@==History and examples==@@@@1@3@@danf@17-8-2009
10731730@unknown@formal@none@1@S@The history of the Russian language may be divided into the following periods:@@@@1@13@@danf@17-8-2009
10731740@unknown@formal@none@1@S@* [[History of the Russian language#Kievan period and feudal breakup|Kievan period and feudal breakup]]@@@@1@14@@danf@17-8-2009
10731750@unknown@formal@none@1@S@* [[History of the Russian language#The Tatar yoke and the Grand Duchy of Lithuania|The Tatar yoke and the Grand Duchy of Lithuania]]@@@@1@22@@danf@17-8-2009
10731760@unknown@formal@none@1@S@* [[History of the Russian language#The Moscovite period (15th–17th centuries)|The Moscovite period (15th–17th centuries)]]@@@@1@14@@danf@17-8-2009
10731770@unknown@formal@none@1@S@* [[History of the Russian language#Empire (18th–19th centuries)|Empire (18th–19th centuries)]]@@@@1@10@@danf@17-8-2009
10731780@unknown@formal@none@1@S@* [[History of the Russian language#Soviet period and beyond (20th century)|Soviet period and beyond (20th century)]]@@@@1@16@@danf@17-8-2009
10731790@unknown@formal@none@1@S@Judging by the historical records, by approximately 1000 AD the predominant ethnic group over much of modern European [[Russia]], [[Ukraine]], and [[Belarus]] was the Eastern branch of the [[Slavic peoples|Slavs]], speaking a closely related group of dialects.@@@@1@37@@danf@17-8-2009
10731800@unknown@formal@none@1@S@The political unification of this region into [[Kievan Rus']] in about 880, from which modern Russia, Ukraine and Belarus trace their origins, established [[Old East Slavic]] as a literary and commercial language.@@@@1@32@@danf@17-8-2009
10731810@unknown@formal@none@1@S@It was soon followed by the adoption of [[Christianity]] in 988 and the introduction of the South Slavic [[Old Church Slavonic]] as the liturgical and official language.@@@@1@27@@danf@17-8-2009
10731820@unknown@formal@none@1@S@Borrowings and [[calque]]s from Byzantine [[Greek language|Greek]] began to enter the [[Old East Slavic]] and spoken dialects at this time, which in their turn modified the [[Old Church Slavonic]] as well.@@@@1@31@@danf@17-8-2009
10731830@unknown@formal@none@1@S@Dialectal differentiation accelerated after the breakup of [[Kievan Rus']] in approximately 1100.@@@@1@12@@danf@17-8-2009
10731840@unknown@formal@none@1@S@[[Ruthenian language|Ruthenian]] emerged on the territories of modern [[Belarus]] and [[Ukraine]], and [[History of the Russian language|medieval Russian]] in modern [[Russia]].@@@@1@21@@danf@17-8-2009
10731850@unknown@formal@none@1@S@They had become definitely distinct by the 13th century, when the land was divided between the [[Grand Duchy of Lithuania]] in the west and, in the east, the independent Novgorod Feudal Republic plus small duchies that were vassals of the Tatars.@@@@1@41@@danf@17-8-2009
10731860@unknown@formal@none@1@S@The official language in Moscow and Novgorod, and later, in the growing Moscow Rus’, was [[Church Slavonic]] which evolved from [[Old Church Slavonic]] and remained [[Diglossia|the literary language]] until the Petrine age, when its usage shrank drastically to biblical and liturgical texts.@@@@1@42@@danf@17-8-2009
10731870@unknown@formal@none@1@S@Russian developed under the strong influence of Church Slavonic until the close of the seventeenth century; afterwards the influence reversed, leading to the corruption of liturgical texts.@@@@1@27@@danf@17-8-2009
10731880@unknown@formal@none@1@S@The political reforms of [[Peter I of Russia|Peter the Great]] were accompanied by a reform of the alphabet, and achieved their goal of secularization and Westernization.@@@@1@26@@danf@17-8-2009
10731890@unknown@formal@none@1@S@Blocks of specialized vocabulary were adopted from the languages of Western Europe.@@@@1@12@@danf@17-8-2009
10731900@unknown@formal@none@1@S@By 1800, a significant portion of the gentry spoke [[French language|French]], less often [[German language|German]], on an everyday basis.@@@@1@19@@danf@17-8-2009
10731910@unknown@formal@none@1@S@Many Russian novels of the 19th century, e.g. Lev Tolstoy’s "War and Peace", contain entire paragraphs and even pages in French with no translation given, on the assumption that educated readers would not need one.@@@@1@35@@danf@17-8-2009
10731920@unknown@formal@none@1@S@The modern literary language is usually considered to date from the time of [[Aleksandr Pushkin]] in the first third of the nineteenth century.@@@@1@23@@danf@17-8-2009
10731930@unknown@formal@none@1@S@Pushkin revolutionized Russian literature by rejecting archaic grammar and vocabulary (so-called "высокий стиль", "high style") in favor of grammar and vocabulary found in the spoken language of the time.@@@@1@29@@danf@17-8-2009
10731940@unknown@formal@none@1@S@Even younger modern readers may experience only slight difficulty understanding some words in Pushkin’s texts, since only a few words used by Pushkin have become archaic or changed meaning.@@@@1@29@@danf@17-8-2009
10731950@unknown@formal@none@1@S@On the other hand, many expressions used by Russian writers of the early 19th century, in particular Pushkin, [[Lermontov]], [[Gogol]] and Griboiädov, have become proverbs or sayings that are frequently found even in modern Russian colloquial speech.@@@@1@37@@danf@17-8-2009
10731960@unknown@formal@none@1@S@The political upheavals of the early twentieth century and the wholesale changes of political ideology gave written Russian its modern appearance after the spelling reform of 1918.@@@@1@27@@danf@17-8-2009
10731970@unknown@formal@none@1@S@Political circumstances and Soviet accomplishments in military, scientific, and technological matters (especially cosmonautics), gave Russian a world-wide prestige, especially during the middle third of the twentieth century.@@@@1@27@@danf@17-8-2009
10740010@unknown@formal@none@1@S@Web search engine@@@@1@3@@danf@17-8-2009
10740020@unknown@formal@none@1@S@A '''Web search engine''' is a [[search engine (computing)|search engine]] designed to search for information on the [[World Wide Web]].@@@@1@20@@danf@17-8-2009
10740030@unknown@formal@none@1@S@Information may consist of [[web page]]s, images and other types of files.@@@@1@12@@danf@17-8-2009
10740040@unknown@formal@none@1@S@Some search engines also mine data available in newsgroups, databases, or [[Web directory|open directories]].@@@@1@14@@danf@17-8-2009
10740050@unknown@formal@none@1@S@Unlike [[Web directories]], which are maintained by human editors, search engines operate algorithmically or use a mixture of [[algorithmic]] and human input.@@@@1@22@@danf@17-8-2009
10740060@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10740070@unknown@formal@none@1@S@Before there were search engines there was a complete list of all webservers.@@@@1@13@@danf@17-8-2009
10740080@unknown@formal@none@1@S@The list was edited by [[Tim Berners-Lee]] and hosted on the CERN webserver.@@@@1@13@@danf@17-8-2009
10740090@unknown@formal@none@1@S@One historical snapshot from 1992 remains.@@@@1@6@@danf@17-8-2009
10740100@unknown@formal@none@1@S@As more and more webservers went online the central list could not keep up.@@@@1@14@@danf@17-8-2009
10740110@unknown@formal@none@1@S@On the NCSA site, new servers were announced under the title "What's New!", but no complete listing existed any more.@@@@1@20@@danf@17-8-2009
10740120@unknown@formal@none@1@S@The very first tool used for searching on the (pre-web) Internet was [[Archie search engine|Archie]].@@@@1@15@@danf@17-8-2009
10740130@unknown@formal@none@1@S@The name stands for "archive" without the "v".@@@@1@8@@danf@17-8-2009
10740140@unknown@formal@none@1@S@It was created in 1990 by [[Alan Emtage]], a student at [[McGill University]] in Montreal.@@@@1@15@@danf@17-8-2009
10740150@unknown@formal@none@1@S@The program downloaded the directory listings of all the files located on public anonymous FTP ([[File Transfer Protocol]]) sites, creating a searchable database of file names; however, Archie did not index the contents of these sites.@@@@1@36@@danf@17-8-2009
10740160@unknown@formal@none@1@S@The rise of [[Gopher (protocol)|Gopher]] (created in 1991 by [[Mark McCahill]] at the [[University of Minnesota]]) led to two new search programs, [[Veronica (computer)|Veronica]] and [[Jughead (computer)|Jughead]].@@@@1@27@@danf@17-8-2009
10740170@unknown@formal@none@1@S@Like Archie, they searched the file names and titles stored in Gopher index systems.@@@@1@14@@danf@17-8-2009
10740180@unknown@formal@none@1@S@Veronica ('''V'''ery '''E'''asy '''R'''odent-'''O'''riented '''N'''et-wide '''I'''ndex to '''C'''omputerized '''A'''rchives) provided a keyword search of most Gopher menu titles in the entire Gopher listings.@@@@1@23@@danf@17-8-2009
10740190@unknown@formal@none@1@S@Jughead ('''J'''onzy's '''U'''niversal '''G'''opher '''H'''ierarchy '''E'''xcavation '''A'''nd '''D'''isplay) was a tool for obtaining menu information from specific Gopher servers.@@@@1@19@@danf@17-8-2009
10740200@unknown@formal@none@1@S@While the name of the search engine "[[Archie search engine|Archie]]" was not a reference to the [[Archie Comics|Archie comic book]] series, "[[Veronica Lodge|Veronica]]" and "[[Jughead Jones|Jughead]]" are characters in the series, thus referencing their predecessor.@@@@1@35@@danf@17-8-2009
10740210@unknown@formal@none@1@S@The first Web search engine was Wandex, a now-defunct index collected by the [[World Wide Web Wanderer]], a [[web crawler]] developed by Matthew Gray at [[Massachusetts Institute of Technology|MIT]] in 1993.@@@@1@31@@danf@17-8-2009
10740220@unknown@formal@none@1@S@Another very early search engine, [[Aliweb]], also appeared in 1993.@@@@1@10@@danf@17-8-2009
10740230@unknown@formal@none@1@S@[[JumpStation]] (released in early 1994) used a crawler to find web pages for searching, but search was limited to the title of web pages only.@@@@1@25@@danf@17-8-2009
10740240@unknown@formal@none@1@S@One of the first "full text" crawler-based search engines was [[WebCrawler]], which came out in 1994.@@@@1@16@@danf@17-8-2009
10740250@unknown@formal@none@1@S@Unlike its predecessors, it let users search for any word in any webpage, which has since become the standard for all major search engines.@@@@1@24@@danf@17-8-2009
10740260@unknown@formal@none@1@S@It was also the first one to be widely known by the public.@@@@1@13@@danf@17-8-2009
10740270@unknown@formal@none@1@S@Also in 1994 [[Lycos]] (which started at [[Carnegie Mellon University]]) was launched, and became a major commercial endeavor.@@@@1@18@@danf@17-8-2009
10740280@unknown@formal@none@1@S@Soon after, many search engines appeared and vied for popularity.@@@@1@10@@danf@17-8-2009
10740290@unknown@formal@none@1@S@These included [[Magellan]], [[Excite]], [[Infoseek]], [[Inktomi]], [[Northern Light Group|Northern Light]], and [[AltaVista]].@@@@1@12@@danf@17-8-2009
10740300@unknown@formal@none@1@S@[[Yahoo!]] was among the most popular ways for people to find web pages of interest, but its search function operated on its [[web directory]], rather than full-text copies of web pages.@@@@1@31@@danf@17-8-2009
10740310@unknown@formal@none@1@S@Information seekers could also browse the directory instead of doing a keyword-based search.@@@@1@13@@danf@17-8-2009
10740320@unknown@formal@none@1@S@In 1996, [[Netscape]] was looking to give a single search engine an exclusive deal to be their featured search engine.@@@@1@20@@danf@17-8-2009
10740330@unknown@formal@none@1@S@There was so much interest that Netscape instead struck a deal with five of the major search engines, under which, for $5 million per year, each search engine would be in a rotation on the Netscape search engine page.@@@@1@39@@danf@17-8-2009
10740340@unknown@formal@none@1@S@The five engines were [[Yahoo!]], [[Magellan]], [[Lycos]], [[Infoseek]] and [[Excite]].@@@@1@10@@danf@17-8-2009
10740350@unknown@formal@none@1@S@Search engines were also known as some of the brightest stars in the Internet investing frenzy that occurred in the late 1990s.@@@@1@22@@danf@17-8-2009
10740360@unknown@formal@none@1@S@Several companies entered the market spectacularly, receiving record gains during their [[initial public offering]]s.@@@@1@14@@danf@17-8-2009
10740370@unknown@formal@none@1@S@Some, such as Northern Light, have taken down their public search engine and are marketing enterprise-only editions.@@@@1@17@@danf@17-8-2009
10740380@unknown@formal@none@1@S@Many search engine companies were caught up in the [[dot-com bubble]], a speculation-driven market boom that peaked in 1999 and ended in 2001.@@@@1@23@@danf@17-8-2009
10740390@unknown@formal@none@1@S@Around 2000, the [[Google Search|Google search engine]] rose to prominence.@@@@1@10@@danf@17-8-2009
10740400@unknown@formal@none@1@S@The company achieved better results for many searches with an innovation called [[PageRank]].@@@@1@13@@danf@17-8-2009
10740410@unknown@formal@none@1@S@This iterative algorithm ranks web pages based on the number and PageRank of other web sites and pages that link there, on the premise that good or desirable pages are linked to more than others.@@@@1@35@@danf@17-8-2009
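The iterative idea behind such a ranking can be sketched as follows. This is a simplified illustration, not Google's implementation: the damping factor of 0.85, the fixed number of iterations, the handling of pages without outgoing links, and the tiny example graph are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively rank pages; `links` maps each page to the pages it links to.

    The damping factor and iteration count are assumed values for illustration.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform ranking

    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on in equal shares to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical four-page web: three pages link to "c", and "c" links back to "a".
graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)

assert max(ranks, key=ranks.get) == "c"   # the most linked-to page ranks first
assert ranks["a"] > ranks["b"]            # a link from a high-ranked page counts for more
assert abs(sum(ranks.values()) - 1.0) < 1e-9
```

Page "a" outranks "b" and "d" even though each has at most one incoming link, because its single link comes from the highly ranked page "c"; this is the sense in which both the number and the rank of linking pages matter.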
10740420@unknown@formal@none@1@S@Google also maintained a minimalist interface to its search engine.@@@@1@10@@danf@17-8-2009
10740430@unknown@formal@none@1@S@In contrast, many of its competitors embedded a search engine in a [[web portal]].@@@@1@14@@danf@17-8-2009
10740440@unknown@formal@none@1@S@By 2000, Yahoo! was providing search services based on [[Inktomi]]'s search engine.@@@@1@12@@danf@17-8-2009
10740450@unknown@formal@none@1@S@Yahoo! acquired [[Inktomi]] in 2002, and [[Overture]] (which owned [[AlltheWeb]] and [[AltaVista]]) in 2003.@@@@1@14@@danf@17-8-2009
10740460@unknown@formal@none@1@S@Yahoo! switched to Google's search engine and used it until 2004, when it launched its own search engine based on the combined technologies of its acquisitions.@@@@1@26@@danf@17-8-2009
10740470@unknown@formal@none@1@S@Microsoft first launched MSN Search (since re-branded [[Live Search]]) in the fall of 1998 using search results from [[Inktomi]].@@@@1@19@@danf@17-8-2009
10740480@unknown@formal@none@1@S@In early 1999 the site began to display listings from [[Looksmart]] blended with results from [[Inktomi]], except for a short time in 1999 when results from [[AltaVista]] were used instead.@@@@1@30@@danf@17-8-2009
10740490@unknown@formal@none@1@S@In 2004, Microsoft began a transition to its own search technology, powered by its own [[web crawler]] (called [[msnbot]]).@@@@1@19@@danf@17-8-2009
10740500@unknown@formal@none@1@S@As of late 2007, Google was by far the most popular Web search engine worldwide.@@@@1@15@@danf@17-8-2009
10740510@unknown@formal@none@1@S@A number of country-specific search engine companies have become prominent; for example [[Baidu]] is the most popular search engine in the [[People's Republic of China]] and [[guruji.com]] in [[India]].@@@@1@29@@danf@17-8-2009
10740520@unknown@formal@none@1@S@==How Web search engines work==@@@@1@5@@danf@17-8-2009
10740530@unknown@formal@none@1@S@A search engine operates in the following order:@@@@1@8@@danf@17-8-2009
10740540@unknown@formal@none@1@S@# [[Web crawling]]@@@@1@3@@danf@17-8-2009
10740550@unknown@formal@none@1@S@# [[Index (search engine)|Indexing]]@@@@1@4@@danf@17-8-2009
10740560@unknown@formal@none@1@S@# [[Web search query|Searching]]@@@@1@4@@danf@17-8-2009
10740570@unknown@formal@none@1@S@Web search engines work by storing information about many web pages, which they retrieve from the WWW itself.@@@@1@18@@danf@17-8-2009
10740580@unknown@formal@none@1@S@These pages are retrieved by a [[Web crawler]] (sometimes also known as a spider) — an automated Web browser which follows every link it sees.@@@@1@25@@danf@17-8-2009
10740590@unknown@formal@none@1@S@Exclusions can be made by the use of [[robots.txt]].@@@@1@9@@danf@17-8-2009
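A crawler of this kind can be sketched with Python's standard library. To keep the example self-contained, it crawls a small in-memory dictionary of pages instead of the real Web; the URLs, the page contents, the `example-bot` user agent and the function names are all hypothetical. Real crawlers add many further concerns (politeness delays, error handling, duplicate detection) that are left out here.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start, fetch, robots, agent="example-bot"):
    """Breadth-first crawl from `start`, skipping URLs that robots.txt excludes."""
    seen, queue, visited = {start}, deque([start]), []
    while queue:
        url = queue.popleft()
        if not robots.can_fetch(agent, url):
            continue                      # excluded by robots.txt
        html = fetch(url)
        if html is None:
            continue                      # page could not be retrieved
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)   # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited

# Hypothetical site held in memory in place of real HTTP requests.
PAGES = {
    "http://example.test/": '<a href="/a.html">A</a> <a href="/private/x.html">X</a>',
    "http://example.test/a.html": '<a href="/">home</a> <a href="b.html">B</a>',
    "http://example.test/b.html": "no links here",
    "http://example.test/private/x.html": '<a href="/secret.html">S</a>',
}
robots = RobotFileParser()
robots.parse(["User-agent: *", "Disallow: /private/"])

visited = crawl("http://example.test/", PAGES.get, robots)
assert "http://example.test/private/x.html" not in visited
```

The page under `/private/` is discovered through a link but never fetched, which is the exclusion mechanism that robots.txt provides.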
10740600@unknown@formal@none@1@S@The contents of each page are then analyzed to determine how it should be [[Search engine indexing|indexed]] (for example, words are extracted from the titles, headings, or special fields called [[meta tags]]).@@@@1@32@@danf@17-8-2009
10740610@unknown@formal@none@1@S@Data about web pages are stored in an index database for use in later queries.@@@@1@15@@danf@17-8-2009
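One common way to store such data is an inverted index, which maps each word to the pages that contain it. The sketch below is a minimal illustration under simplifying assumptions: pages are plain strings, words are split on non-word characters, and the URLs and function names are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

# Hypothetical pages standing in for crawled documents.
PAGES = {
    "http://example.test/a": "Web search engines index pages",
    "http://example.test/b": "A crawler follows every link",
    "http://example.test/c": "Search the index, not the Web",
}
index = build_index(PAGES)

assert index["search"] == {"http://example.test/a", "http://example.test/c"}
assert index["crawler"] == {"http://example.test/b"}
```

Answering a query then becomes a lookup by word rather than a scan of every stored page.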
10740620@unknown@formal@none@1@S@Some search engines, such as [[Google]], store all or part of the source page (referred to as a [[web cache|cache]]) as well as information about the web pages, whereas others, such as [[AltaVista]], store every word of every page they find.@@@@1@41@@danf@17-8-2009
10740630@unknown@formal@none@1@S@This cached page always holds the actual search text since it is the one that was actually indexed, so it can be very useful when the content of the current page has been updated and the search terms are no longer in it.@@@@1@43@@danf@17-8-2009
10740640@unknown@formal@none@1@S@This problem might be considered to be a mild form of [[linkrot]], and Google's handling of it increases [[usability]] by satisfying [[user expectations]] that the search terms will be on the returned webpage.@@@@1@33@@danf@17-8-2009
10740650@unknown@formal@none@1@S@This satisfies the [[principle of least astonishment]] since the user normally expects the search terms to be on the returned pages.@@@@1@21@@danf@17-8-2009
10740660@unknown@formal@none@1@S@Increased search relevance makes these cached pages very useful, even beyond the fact that they may contain data that may no longer be available elsewhere.@@@@1@25@@danf@17-8-2009
10740670@unknown@formal@none@1@S@When a user enters a [[web search query|query]] into a search engine (typically by using [[Keyword (Internet search)|key word]]s), the engine examines its [[inverted index|index]] and provides a listing of best-matching web pages according to its criteria, usually with a short summary containing the document's title and sometimes parts of the text.@@@@1@52@@danf@17-8-2009
10740680@unknown@formal@none@1@S@Most search engines support the use of the [[boolean operators]] AND, OR and NOT to further specify the [[web search query|search query]].@@@@1@22@@danf@17-8-2009
10740690@unknown@formal@none@1@S@Some search engines provide an advanced feature called [[Proximity search (text)|proximity search]] which allows users to define the distance between keywords.@@@@1@21@@danf@17-8-2009
10740700@unknown@formal@none@1@S@The usefulness of a search engine depends on the [[relevance (information retrieval)|relevance]] of the '''result set''' it gives back.@@@@1@19@@danf@17-8-2009
10740710@unknown@formal@none@1@S@While there may be millions of webpages that include a particular word or phrase, some pages may be more relevant, popular, or authoritative than others.@@@@1@25@@danf@17-8-2009
10740720@unknown@formal@none@1@S@Most search engines employ methods to [[rank order|rank]] the results to provide the "best" results first.@@@@1@16@@danf@17-8-2009
10740730@unknown@formal@none@1@S@How a search engine decides which pages are the best matches, and what order the results should be shown in, varies widely from one engine to another.@@@@1@27@@danf@17-8-2009
10740740@unknown@formal@none@1@S@The methods also change over time as Internet usage changes and new techniques evolve.@@@@1@14@@danf@17-8-2009
10740750@unknown@formal@none@1@S@Most Web search engines are commercial ventures supported by [[advertising]] revenue and, as a result, some employ the controversial practice of allowing advertisers to pay money to have their listings ranked higher in search results.@@@@1@35@@danf@17-8-2009
10740760@unknown@formal@none@1@S@Those search engines which do not accept money for their search engine results make money by running search related ads alongside the regular search engine results.@@@@1@26@@danf@17-8-2009
10740770@unknown@formal@none@1@S@The search engines make money every time someone clicks on one of these ads.@@@@1@14@@danf@17-8-2009
10740780@unknown@formal@none@1@S@The vast majority of search engines are run by private companies using proprietary algorithms and closed databases, though [[List of search engines#Open source search engines|some]] are open source.@@@@1@28@@danf@17-8-2009
10740790@unknown@formal@none@1@S@Revenue in the web search portals industry is projected to grow in 2008 by 13.4 percent, with broadband connections expected to rise by 15.1 percent.@@@@1@25@@danf@17-8-2009
10740800@unknown@formal@none@1@S@Between 2008 and 2012, industry revenue is projected to rise by 56 percent as Internet penetration still has some way to go to reach full saturation in American households.@@@@1@29@@danf@17-8-2009
10740810@unknown@formal@none@1@S@Furthermore, broadband services are projected to account for an ever increasing share of domestic Internet users, rising to 118.7 million by 2012, with an increasing share accounted for by fiber-optic and high speed cable lines.@@@@1@35@@danf@17-8-2009
10750010@unknown@formal@none@1@S@Semantics@@@@1@1@@danf@17-8-2009
10750020@unknown@formal@none@1@S@'''Semantics''' is the study of meaning in communication.@@@@1@8@@danf@17-8-2009
10750030@unknown@formal@none@1@S@The word derives from [[Greek language|Greek]] ''σημαντικός'' (''semantikos''), "significant", from ''σημαίνω'' (''semaino''), "to signify, to indicate" and that from ''σήμα'' (''sema''), "sign, mark, token".@@@@1@24@@danf@17-8-2009
10750040@unknown@formal@none@1@S@In [[linguistics]] it is the study of interpretation of signs as used by [[agent]]s or [[community|communities]] within particular circumstances and contexts.@@@@1@21@@danf@17-8-2009
10750050@unknown@formal@none@1@S@It has related meanings in several other fields.@@@@1@8@@danf@17-8-2009
10750060@unknown@formal@none@1@S@Semanticists differ on what constitutes [[Meaning (linguistics)|meaning]] in an expression.@@@@1@10@@danf@17-8-2009
10750070@unknown@formal@none@1@S@For example, in the sentence, "John loves a bagel", the word ''bagel'' may refer to the object itself, which is its ''literal'' meaning or ''[[denotation]]'', but it may also refer to many other figurative associations, such as how it meets John's hunger, etc., which may be its ''[[connotation]]''.@@@@1@48@@danf@17-8-2009
10750080@unknown@formal@none@1@S@Traditionally, the [[formal semantic]] view restricts semantics to its literal meaning, and relegates all figurative associations to [[pragmatics]], but this distinction is increasingly difficult to defend.@@@@1@26@@danf@17-8-2009
10750090@unknown@formal@none@1@S@The degree to which a theorist subscribes to the literal-figurative distinction decreases as one moves from the [[formal semantic]], [[semiotic]], [[pragmatic]], to the [[cognitive semantic]] traditions.@@@@1@26@@danf@17-8-2009
10750100@unknown@formal@none@1@S@The word ''semantic'' in its modern sense is considered to have first appeared in [[French language|French]] as ''sémantique'' in [[Michel Bréal]]'s 1897 book, ''Essai de sémantique''.@@@@1@26@@danf@17-8-2009
10750110@unknown@formal@none@1@S@In [[International Scientific Vocabulary]] semantics is also called ''[[semasiology]]''.@@@@1@9@@danf@17-8-2009
10750120@unknown@formal@none@1@S@The discipline of Semantics is distinct from [[General semantics|Alfred Korzybski's General Semantics]], which is a system for looking at non-immediate, or abstract meanings.@@@@1@23@@danf@17-8-2009
10750130@unknown@formal@none@1@S@==Linguistics==@@@@1@1@@danf@17-8-2009
10750140@unknown@formal@none@1@S@In [[linguistics]], '''semantics''' is the subfield that is devoted to the study of meaning, as inherent at the levels of words, phrases, sentences, and even larger units of [[discourse]] (referred to as ''texts'').@@@@1@33@@danf@17-8-2009
10750150@unknown@formal@none@1@S@The basic area of study is the meaning of [[sign (semiotics)|sign]]s, and the study of relations between different linguistic units: [[homonym]]y, [[synonym]]y, [[antonym]]y, [[polysemy]], [[paronyms]], [[hypernym]]y, [[hyponym]]y, [[meronymy]], [[metonymy]], [[holonymy]], [[exocentric]]ity / [[endocentric]]ity, linguistic [[compound (linguistics)|compounds]].@@@@1@36@@danf@17-8-2009
10750160@unknown@formal@none@1@S@A key concern is how meaning attaches to larger chunks of text, possibly as a result of the composition from smaller units of meaning.@@@@1@24@@danf@17-8-2009
10750170@unknown@formal@none@1@S@Traditionally, semantics has included the study of connotative ''[[word sense|sense]]'' and denotative ''[[reference]]'', [[truth condition]]s, [[argument structure]], [[thematic role]]s, [[discourse analysis]], and the linkage of all of these to syntax.@@@@1@30@@danf@17-8-2009
10750180@unknown@formal@none@1@S@[[Formal semantics|Formal semanticists]] are concerned with the modeling of meaning in terms of the semantics of logic.@@@@1@17@@danf@17-8-2009
10750190@unknown@formal@none@1@S@Thus the sentence ''John loves a bagel'' above can be broken down into its constituents (signs), of which the unit ''loves'' may serve as both syntactic and semantic [[head (linguistics)|head]].@@@@1@30@@danf@17-8-2009
10750200@unknown@formal@none@1@S@In the late 1960s, [[Richard Montague]] proposed a system for defining semantic entries in the lexicon in terms of [[lambda calculus]].@@@@1@21@@danf@17-8-2009
10750210@unknown@formal@none@1@S@Thus, the syntactic [[parsing|parse]] of the sentence above would now indicate ''loves'' as the head, and its entry in the lexicon would point to the arguments as the agent, ''John'', and the object, ''bagel'', with a special role for the article "a" (which Montague called a quantifier).@@@@1@47@@danf@17-8-2009
10750220@unknown@formal@none@1@S@This resulted in the sentence being associated with the logical predicate ''loves (John, bagel)'', thus linking semantics to [[categorial grammar]] models of [[syntax]].@@@@1@23@@danf@17-8-2009
10750230@unknown@formal@none@1@S@The logical predicate thus obtained would be elaborated further, e.g. using truth theory models, which ultimately relate meanings to a set of [[Tarski]]ian universals, which may lie outside the logic.@@@@1@30@@danf@17-8-2009
10750240@unknown@formal@none@1@S@The notion of such meaning atoms or primitives is basic to the [[language of thought]] hypothesis from the 1970s.@@@@1@19@@danf@17-8-2009
10750250@unknown@formal@none@1@S@Despite its elegance, [[Montague grammar]] was limited by the context-dependent variability in word sense, and led to several attempts at incorporating context, such as :@@@@1@25@@danf@17-8-2009
10750260@unknown@formal@none@1@S@*[[situation semantics]] ('80s): Truth-values are incomplete, they get assigned based on context@@@@1@12@@danf@17-8-2009
10750270@unknown@formal@none@1@S@*[[generative lexicon]] ('90s): categories (types) are incomplete, and get assigned based on context@@@@1@13@@danf@17-8-2009
10750280@unknown@formal@none@1@S@===The dynamic turn in semantics===@@@@1@5@@danf@17-8-2009
10750290@unknown@formal@none@1@S@In the [[Noam Chomsky|Chomskian]] tradition in linguistics there was no mechanism for the learning of semantic relations, and the [[Psychological nativism|nativist]] view considered all semantic notions as inborn.@@@@1@28@@danf@17-8-2009
10750300@unknown@formal@none@1@S@Thus, even novel concepts were proposed to have been dormant in some sense.@@@@1@13@@danf@17-8-2009
10750310@unknown@formal@none@1@S@This traditional view was also unable to address many issues such as [[metaphor]] or associative meanings, and [[semantic change]], where meanings within a linguistic community change over time, and [[qualia]] or subjective experience.@@@@1@33@@danf@17-8-2009
10750320@unknown@formal@none@1@S@Another issue not addressed by the nativist model was how perceptual cues are combined in thought, e.g. in [[mental rotation]].@@@@1@20@@danf@17-8-2009
10750330@unknown@formal@none@1@S@This traditional view of semantics, as an innate finite meaning inherent in a [[lexical unit]] that can be composed to generate meanings for larger chunks of discourse, is now being fiercely debated in the emerging domain of [[cognitive linguistics]] and also in the non-[[Jerry Fodor|Fodorian]] camp in [[Philosophy of Language]].@@@@1@50@@danf@17-8-2009
10750340@unknown@formal@none@1@S@The challenge is motivated by@@@@1@5@@danf@17-8-2009
10750350@unknown@formal@none@1@S@* factors internal to language, such as the problem of resolving [[indexical]] or [[anaphora]] (e.g. ''this x'', ''him'', ''last week'').@@@@1@20@@danf@17-8-2009
10750360@unknown@formal@none@1@S@In these situations "context" serves as the input, but the interpreted utterance also modifies the context, so it is also the output.@@@@1@22@@danf@17-8-2009
10750370@unknown@formal@none@1@S@Thus, the interpretation is necessarily dynamic and the meaning of sentences is viewed as context-change potentials instead of [[propositions]].@@@@1@19@@danf@17-8-2009
10750380@unknown@formal@none@1@S@* factors external to language, i.e. language is not a set of labels stuck on things, but "a toolbox, the importance of whose elements lie in the way they function rather than their attachments to things."@@@@1@36@@danf@17-8-2009
10750390@unknown@formal@none@1@S@This view reflects the position of the later [[Wittgenstein]] and his famous ''game'' example, and is related to the positions of [[Willard Van Orman Quine|Quine]], [[Donald Davidson (philosopher)|Davidson]], and others.@@@@1@30@@danf@17-8-2009
10750400@unknown@formal@none@1@S@A concrete example of the latter phenomenon is semantic [[underspecification]] — meanings are not complete without some elements of context.@@@@1@20@@danf@17-8-2009
10750410@unknown@formal@none@1@S@To take an example of a single word, "red", its meaning in a phrase such as ''red book'' is similar to many other usages, and can be viewed as compositional.@@@@1@30@@danf@17-8-2009
10750420@unknown@formal@none@1@S@However, the colours implied in phrases such as "red wine" (very dark), and "red hair" (coppery), or "red soil", or "red skin" are very different.@@@@1@25@@danf@17-8-2009
10750430@unknown@formal@none@1@S@Indeed, these colours by themselves would not be called "red" by native speakers.@@@@1@13@@danf@17-8-2009
10750440@unknown@formal@none@1@S@These instances are contrastive, so "red wine" is so called only in comparison with the other kind of wine (which also is not "white" for the same reasons).@@@@1@28@@danf@17-8-2009
10750450@unknown@formal@none@1@S@This view goes back to [[Ferdinand de Saussure|de Saussure]]:@@@@1@9@@danf@17-8-2009
10750460@unknown@formal@none@1@S@:Each of a set of synonyms like ''redouter'' ('to dread'), ''craindre'' ('to fear'), ''avoir peur'' ('to be afraid') has its particular value only because they stand in contrast with one another.@@@@1@31@@danf@17-8-2009
10750470@unknown@formal@none@1@S@No word has a value that can be identified independently of what else is in its vicinity.@@@@1@17@@danf@17-8-2009
10750480@unknown@formal@none@1@S@and may go back to earlier [[India]]n views on language, especially the [[Nyaya]] view of words as [[Semantic indicator|indicators]] and not carriers of meaning.@@@@1@24@@danf@17-8-2009
10750490@unknown@formal@none@1@S@An attempt to defend a system based on propositional meaning for semantic underspecification can be found in the [[Generative Lexicon]] model of [[James Pustejovsky]], who extends contextual operations (based on type shifting) into the lexicon.@@@@1@35@@danf@17-8-2009
10750500@unknown@formal@none@1@S@Thus meanings are generated on the fly based on finite context.@@@@1@11@@danf@17-8-2009
10750510@unknown@formal@none@1@S@===Prototype theory===@@@@1@2@@danf@17-8-2009
10750520@unknown@formal@none@1@S@Another set of concepts related to fuzziness in semantics is based on [[Prototype Theory|prototype]]s.@@@@1@14@@danf@17-8-2009
10750530@unknown@formal@none@1@S@The work of [[Eleanor Rosch]] and [[George Lakoff]] in the 1970s led to a view that natural categories are not characterizable in terms of necessary and sufficient conditions, but are graded (fuzzy at their boundaries) and inconsistent as to the status of their constituent members.@@@@1@45@@danf@17-8-2009
10750540@unknown@formal@none@1@S@Systems of categories are not objectively "out there" in the world but are rooted in people's experience.@@@@1@17@@danf@17-8-2009
10750550@unknown@formal@none@1@S@These categories evolve as [[learning theory (education)|learned]] concepts of the world — meaning is not an objective truth, but a subjective construct, learned from experience, and language arises out of the "grounding of our conceptual systems in shared [[embodied philosophy|embodiment]] and bodily experience".@@@@1@43@@danf@17-8-2009
10750560@unknown@formal@none@1@S@A corollary of this is that the conceptual categories (i.e. the lexicon) will not be identical for different cultures, or indeed, for every individual in the same culture.@@@@1@28@@danf@17-8-2009
10750570@unknown@formal@none@1@S@This leads to another debate (see the [[Whorf-Sapir hypothesis]] or [[Eskimo words for snow]]).@@@@1@14@@danf@17-8-2009
10750580@unknown@formal@none@1@S@==Computer science==@@@@1@2@@danf@17-8-2009
10750590@unknown@formal@none@1@S@In [[computer science]], where it is considered as an application of [[mathematical logic]], semantics reflects the meaning of programs or functions.@@@@1@21@@danf@17-8-2009
10750600@unknown@formal@none@1@S@In this regard, semantics permits programs to be separated into their syntactical part (grammatical structure) and their semantic part (meaning).@@@@1@20@@danf@17-8-2009
10750610@unknown@formal@none@1@S@For instance, the following statements use different syntaxes (languages), but result in the same semantics:@@@@1@15@@danf@17-8-2009
10750620@unknown@formal@none@1@S@* x += y; ([[C (programming language)|C]], [[Java (programming language)|Java]], etc.)@@@@1@11@@danf@17-8-2009
10750630@unknown@formal@none@1@S@* x := x + y; ([[Pascal (programming language)|Pascal]])@@@@1@9@@danf@17-8-2009
10750640@unknown@formal@none@1@S@* Let x = x + y; (early [[BASIC]])@@@@1@9@@danf@17-8-2009
10750650@unknown@formal@none@1@S@* x = x + y (most BASIC dialects, [[Fortran]])@@@@1@10@@danf@17-8-2009
10750660@unknown@formal@none@1@S@Generally these operations would all perform an arithmetical addition of 'y' to 'x' and store the result in a variable 'x'.@@@@1@21@@danf@17-8-2009
10750670@unknown@formal@none@1@S@Semantics for computer applications falls into three categories:@@@@1@8@@danf@17-8-2009
10750680@unknown@formal@none@1@S@* [[Operational semantics]]: The meaning of a construct is specified by the computation it induces when it is executed on a machine.@@@@1@22@@danf@17-8-2009
10750690@unknown@formal@none@1@S@In particular, it is of interest ''how'' the effect of a computation is produced.@@@@1@14@@danf@17-8-2009
10750700@unknown@formal@none@1@S@* [[Denotational semantics]]: Meanings are modelled by mathematical objects that represent the effect of executing the constructs.@@@@1@17@@danf@17-8-2009
10750710@unknown@formal@none@1@S@Thus ''only'' the effect is of interest, not how it is obtained.@@@@1@12@@danf@17-8-2009
10750720@unknown@formal@none@1@S@* [[Axiomatic semantics]]: Specific properties of the effect of executing the constructs are expressed as ''assertions''.@@@@1@16@@danf@17-8-2009
10750730@unknown@formal@none@1@S@Thus there may be aspects of the executions that are ignored.@@@@1@11@@danf@17-8-2009
10750740@unknown@formal@none@1@S@The '''[[Semantic Web]]''' refers to the extension of the [[World Wide Web]] through the embedding of additional semantic [[metadata]]; s.a.@@@@1@20@@danf@17-8-2009
10750750@unknown@formal@none@1@S@[[Web Ontology Language]] (OWL).@@@@1@4@@danf@17-8-2009
10750760@unknown@formal@none@1@S@==Psychology==@@@@1@1@@danf@17-8-2009
10750770@unknown@formal@none@1@S@In [[psychology]], ''[[semantic memory]]'' is memory for meaning, in other words, the aspect of memory that preserves only the ''gist'', the general significance, of remembered experience, while [[episodic memory]] is memory for the ephemeral details, the individual features, or the unique particulars of experience.@@@@1@44@@danf@17-8-2009
10750780@unknown@formal@none@1@S@Word meaning is measured by the company they keep; the relationships among words themselves in a [[semantic network]].@@@@1@18@@danf@17-8-2009
10750790@unknown@formal@none@1@S@In a network created by people analyzing their understanding of the word (such as [[Wordnet]]) the links and decomposition structures of the network are few in number and kind; and include "part of", "kind of", and similar links.@@@@1@38@@danf@17-8-2009
10750800@unknown@formal@none@1@S@In automated [[ontologies]] the links are computed vectors without explicit meaning.@@@@1@11@@danf@17-8-2009
10750810@unknown@formal@none@1@S@Various automated technologies are being developed to compute the meaning of words: [[latent semantic indexing]] and [[support vector machines]] as well as [[natural language processing]], [[neural networks]] and [[predicate calculus]] techniques.@@@@1@31@@danf@17-8-2009
10750820@unknown@formal@none@1@S@Semantics has been reported to drive the course of psychotherapeutic interventions.@@@@1@11@@danf@17-8-2009
10750830@unknown@formal@none@1@S@Language structure can determine the treatment approach to drug-abusing patients.@@@@1@10@@danf@17-8-2009
10750840@unknown@formal@none@1@S@While working in Europe for the US Information Agency, American psychiatrist Dr. A. James Giannini reported semantic differences in medical approaches to addiction treatment.@@@@1@24@@danf@17-8-2009
10750850@unknown@formal@none@1@S@English speaking countries used the term "drug dependence" to describe a rather passive pathology in their patients.@@@@1@17@@danf@17-8-2009
10750860@unknown@formal@none@1@S@As a result the physician's role was more active.@@@@1@9@@danf@17-8-2009
10750870@unknown@formal@none@1@S@Southern European countries such as Italy and Yugoslavia utilized the concept of "tossicomania" (i.e. toxic mania) to describe a more active rather than passive role of the addict.@@@@1@28@@danf@17-8-2009
10750880@unknown@formal@none@1@S@As a result the treating physician's role shifted to that of a more passive guide than that of an active interventionist.@@@@1@21@@danf@17-8-2009
10760010@unknown@formal@none@1@S@Sentence (linguistics)@@@@1@2@@danf@17-8-2009
10760020@unknown@formal@none@1@S@In [[linguistics]], a '''sentence''' is a grammatical unit of one or more words, bearing minimal syntactic relation to the words that precede or follow it, often preceded and followed in speech by pauses, having one of a small number of characteristic intonation patterns, and typically expressing an independent statement, question, request, command, etc.@@@@1@53@@danf@17-8-2009
10760030@unknown@formal@none@1@S@Sentences are generally characterized in most languages by the presence of a [[finite verb]], e.g. "[[The quick brown fox jumps over the lazy dog]]".@@@@1@24@@danf@17-8-2009
10760050@unknown@formal@none@1@S@==Components of a sentence==@@@@1@4@@danf@17-8-2009
10760060@unknown@formal@none@1@S@A simple ''complete sentence'' consists of a ''[[subject (grammar)|subject]]'' and a ''[[predicate (grammar)|predicate]]''.@@@@1@13@@danf@17-8-2009
10760070@unknown@formal@none@1@S@The subject is typically a [[noun phrase]], though other kinds of phrases (such as [[gerund]] phrases) work as well, and some languages allow subjects to be omitted.@@@@1@27@@danf@17-8-2009
10760080@unknown@formal@none@1@S@The predicate is a finite [[verb phrase]]: it's a finite verb together with zero or more [[object (grammar)|objects]], zero or more [[complement (linguistics)|complements]], and zero or more [[adverbial]]s.@@@@1@28@@danf@17-8-2009
10760090@unknown@formal@none@1@S@See also [[copula]] for the consequences of this verb on the theory of sentence structure.@@@@1@15@@danf@17-8-2009
10760100@unknown@formal@none@1@S@===Clauses===@@@@1@1@@danf@17-8-2009
10760110@unknown@formal@none@1@S@A [[clause]] consists of a subject and a verb.@@@@1@9@@danf@17-8-2009
10760120@unknown@formal@none@1@S@There are two types of clauses: independent and subordinate (dependent).@@@@1@10@@danf@17-8-2009
10760130@unknown@formal@none@1@S@An independent clause consists of a subject and a verb and also demonstrates a complete thought: for example, "I am sad."@@@@1@21@@danf@17-8-2009
10760140@unknown@formal@none@1@S@A subordinate clause consists of a subject and a verb, but demonstrates an incomplete thought: for example, "Because I had to move."@@@@1@22@@danf@17-8-2009
10760150@unknown@formal@none@1@S@==Classification==@@@@1@1@@danf@17-8-2009
10760160@unknown@formal@none@1@S@===By structure===@@@@1@2@@danf@17-8-2009
10760170@unknown@formal@none@1@S@One traditional scheme for classifying [[English language|English]] sentences is by the number and types of [[finite verb|finite]] [[clause]]s:@@@@1@18@@danf@17-8-2009
10760180@unknown@formal@none@1@S@* A ''[[simple sentence]]'' consists of a single [[independent clause]] with no [[dependent clause]]s.@@@@1@14@@danf@17-8-2009
10760190@unknown@formal@none@1@S@* A ''[[compound sentence (linguistics)|compound sentence]]'' consists of multiple independent clauses with no dependent clauses.@@@@1@15@@danf@17-8-2009
10760200@unknown@formal@none@1@S@These clauses are joined together using [[grammatical conjunction|conjunctions]], [[punctuation]], or both.@@@@1@11@@danf@17-8-2009
10760210@unknown@formal@none@1@S@* A ''[[complex sentence]]'' consists of one or more independent clauses with at least one dependent clause.@@@@1@17@@danf@17-8-2009
10760220@unknown@formal@none@1@S@* A ''[[complex-compound sentence]]'' (or ''compound-complex sentence'') consists of multiple independent clauses, at least one of which has at least one dependent clause.@@@@1@23@@danf@17-8-2009
10760230@unknown@formal@none@1@S@===By purpose===@@@@1@2@@danf@17-8-2009
10760240@unknown@formal@none@1@S@Sentences can also be classified based on their purpose:@@@@1@9@@danf@17-8-2009
10760250@unknown@formal@none@1@S@*A ''declarative sentence'' or ''declaration'', the most common type, commonly makes a statement: ''I am going home.''@@@@1@17@@danf@17-8-2009
10760260@unknown@formal@none@1@S@*A ''negative sentence'' or ''[[negation (linguistics)|negation]]'' denies that a statement is true: ''I am not going home.''@@@@1@17@@danf@17-8-2009
10760270@unknown@formal@none@1@S@*An ''interrogative sentence'' or ''[[question]]'' is commonly used to request information — ''When are you going to work?'' — but sometimes not; ''see'' [[rhetorical question]].@@@@1@25@@danf@17-8-2009
10760280@unknown@formal@none@1@S@*An ''exclamatory sentence'' or ''[[exclamation]]'' is generally a more emphatic form of statement: ''What a wonderful day this is!''@@@@1@19@@danf@17-8-2009
10760290@unknown@formal@none@1@S@===Major and minor sentences===@@@@1@4@@danf@17-8-2009
10760300@unknown@formal@none@1@S@A major sentence is a ''regular'' sentence; it has a [[subject (grammar)|subject]] and a [[predicate (grammar)|predicate]].@@@@1@16@@danf@17-8-2009
10760310@unknown@formal@none@1@S@For example: ''I have a ball.''@@@@1@6@@danf@17-8-2009
10760320@unknown@formal@none@1@S@In this sentence one can change the persons: ''We have a ball.''@@@@1@12@@danf@17-8-2009
10760330@unknown@formal@none@1@S@However, a minor sentence is an irregular type of sentence.@@@@1@10@@danf@17-8-2009
10760340@unknown@formal@none@1@S@It does not contain a finite verb.@@@@1@7@@danf@17-8-2009
10760350@unknown@formal@none@1@S@For example, "Mary!"@@@@1@3@@danf@17-8-2009
10760360@unknown@formal@none@1@S@"Yes."@@@@1@1@@danf@17-8-2009
10760370@unknown@formal@none@1@S@"Coffee." etc.@@@@1@2@@danf@17-8-2009
10760380@unknown@formal@none@1@S@Other examples of minor sentences are headings (e.g. the heading of this entry), stereotyped expressions (''Hello!''), emotional expressions (''Wow!''), proverbs, etc.@@@@1@21@@danf@17-8-2009
10760390@unknown@formal@none@1@S@This can also include sentences which do not contain verbs (e.g. ''The more, the merrier.'') in order to intensify the meaning around the nouns (normally found in poetry and catchphrases).@@@@1@30@@danf@17-8-2009
10770010@unknown@formal@none@1@S@Computer software@@@@1@2@@danf@17-8-2009
10770020@unknown@formal@none@1@S@'''Computer software,''' or just '''software''' is a general term used to describe a collection of [[computer program]]s, [[procedures]] and documentation that perform some tasks on a computer system.@@@@1@28@@danf@17-8-2009
10770030@unknown@formal@none@1@S@The term includes [[application software]] such as [[word processor]]s which perform productive tasks for users, [[system software]] such as [[operating system]]s, which interface with [[hardware]] to provide the necessary services for application software, and [[middleware]] which controls and co-ordinates [[Distributed computing|distributed systems]].@@@@1@42@@danf@17-8-2009
10770040@unknown@formal@none@1@S@"Software" is sometimes used in a broader context to mean anything which is not hardware but which is ''used'' with hardware, such as film, tapes and records.@@@@1@27@@danf@17-8-2009
10770050@unknown@formal@none@1@S@==Relationship to computer hardware==@@@@1@4@@danf@17-8-2009
10770060@unknown@formal@none@1@S@[[Computer]] software is so called to distinguish it from [[computer hardware]], which encompasses the physical interconnections and devices required to store and execute (or run) the software.@@@@1@27@@danf@17-8-2009
10770070@unknown@formal@none@1@S@At the lowest level, software consists of a [[machine language]] specific to an individual processor.@@@@1@15@@danf@17-8-2009
10770080@unknown@formal@none@1@S@A machine language consists of groups of binary values signifying processor instructions which change the state of the computer from its preceding state.@@@@1@23@@danf@17-8-2009
10770090@unknown@formal@none@1@S@Software is an ordered sequence of instructions for changing the state of the computer hardware in a particular sequence.@@@@1@19@@danf@17-8-2009
10770100@unknown@formal@none@1@S@It is usually written in [[high-level programming language]]s that are easier and more efficient for humans to use (closer to [[natural language]]) than machine language.@@@@1@25@@danf@17-8-2009
10770110@unknown@formal@none@1@S@High-level languages are [[compiler|compiled]] or [[interpreter (computing)|interpreted]] into machine language object code.@@@@1@12@@danf@17-8-2009
10770120@unknown@formal@none@1@S@Software may also be written in an [[assembly language]], essentially, a mnemonic representation of a machine language using a natural language alphabet.@@@@1@22@@danf@17-8-2009
10770130@unknown@formal@none@1@S@Assembly language must be assembled into object code via an [[assembly language#Assembler|assembler]].@@@@1@12@@danf@17-8-2009
10770140@unknown@formal@none@1@S@The term "software" was first used in this sense by [[John W. Tukey]] in [[1958]].@@@@1@15@@danf@17-8-2009
10770150@unknown@formal@none@1@S@In [[computer science]] and [[software engineering]], '''computer software''' is all computer programs.@@@@1@12@@danf@17-8-2009
10770160@unknown@formal@none@1@S@The theory that is the basis for most modern software was first proposed by [[Alan Turing]] in his [[1936]] essay ''On Computable Numbers, with an Application to the Entscheidungsproblem''.@@@@1@29@@danf@17-8-2009
10770170@unknown@formal@none@1@S@==Types==@@@@1@1@@danf@17-8-2009
10770180@unknown@formal@none@1@S@Practical [[computer system]]s divide [[software system]]s into three major classes: [[system software]], [[programming software]] and [[application software]], although the distinction is arbitrary, and often blurred.@@@@1@25@@danf@17-8-2009
10770190@unknown@formal@none@1@S@*'''[[System software]]''' helps run the [[computer hardware]] and [[computer system]].@@@@1@10@@danf@17-8-2009
10770200@unknown@formal@none@1@S@It includes [[operating system]]s, [[device driver]]s, diagnostic tools, [[Server (computing)|server]]s, [[windowing system]]s, [[software utility|utilities]] and more.@@@@1@16@@danf@17-8-2009
10770210@unknown@formal@none@1@S@The purpose of systems software is to insulate the applications programmer as much as possible from the details of the particular computer complex being used, especially memory and other hardware features, and such accessory devices as communications, printers, readers, displays, keyboards, etc.@@@@1@42@@danf@17-8-2009
10770220@unknown@formal@none@1@S@*'''[[Programming software]]''' usually provides tools to assist a [[programmer]] in writing [[computer program]]s, and software using different [[programming language]]s in a more convenient way.@@@@1@24@@danf@17-8-2009
10770230@unknown@formal@none@1@S@The tools include [[text editors]], [[compilers]], [[interpreter (computing)|interpreters]], [[linkers]], [[debuggers]], and so on.@@@@1@13@@danf@17-8-2009
10770240@unknown@formal@none@1@S@An [[Integrated development environment]] (IDE) merges those tools into a software bundle, and a programmer may not need to type multiple [[command]]s for compiling, interpreting, debugging, tracing, etc., because the IDE usually has an advanced ''[[graphical user interface]],'' or GUI.@@@@1@40@@danf@17-8-2009
10770250@unknown@formal@none@1@S@*'''[[Application software]]''' allows end users to accomplish one or more specific (non-computer related) [[task]]s.@@@@1@14@@danf@17-8-2009
10770260@unknown@formal@none@1@S@Typical applications include [[Industry|industrial]] [[automation]], [[business software]], [[educational software]], [[medical software]], [[database]]s, and [[computer games]].@@@@1@15@@danf@17-8-2009
10770270@unknown@formal@none@1@S@Businesses are probably the biggest users of application software, but almost every field of human activity now uses some form of application software.@@@@1@23@@danf@17-8-2009
10770280@unknown@formal@none@1@S@==Program and library==@@@@1@3@@danf@17-8-2009
10770290@unknown@formal@none@1@S@A [[Computer program|program]] may not be sufficiently complete for execution by a [[computer]].@@@@1@13@@danf@17-8-2009
10770300@unknown@formal@none@1@S@In particular, it may require additional software from a [[software library]] in order to be complete.@@@@1@16@@danf@17-8-2009
10770310@unknown@formal@none@1@S@Such a library may include software components used by [[stand-alone]] programs, but which cannot work on their own.@@@@1@18@@danf@17-8-2009
10770320@unknown@formal@none@1@S@Thus, programs may include standard routines that are common to many programs, extracted from these libraries.@@@@1@16@@danf@17-8-2009
10770330@unknown@formal@none@1@S@Libraries may also ''include'' 'stand-alone' programs which are activated by some [[event-driven programming|computer event]] and/or perform some function (e.g., of computer 'housekeeping') but do not return data to their calling program.@@@@1@31@@danf@17-8-2009
10770340@unknown@formal@none@1@S@Libraries may be [[Execution (computers)|called]] by one to many other programs; programs may call zero to many other programs.@@@@1@19@@danf@17-8-2009
10770350@unknown@formal@none@1@S@==Three layers==@@@@1@2@@danf@17-8-2009
10770360@unknown@formal@none@1@S@Users often see things differently than programmers.@@@@1@7@@danf@17-8-2009
10770370@unknown@formal@none@1@S@People who use modern general purpose computers (as opposed to [[embedded system]]s, [[analog computer]]s, [[supercomputer]]s, etc.) usually see three layers of software performing a variety of tasks: platform, application, and user software.@@@@1@32@@danf@17-8-2009
10770380@unknown@formal@none@1@S@;Platform software:@@@@1@2@@danf@17-8-2009
10770390@unknown@formal@none@1@S@[[Platform (computing)|Platform]] includes the [[firmware]], [[device driver]]s, an [[operating system]], and typically a [[graphical user interface]] which, in total, allow a user to interact with the computer and its [[peripheral]]s (associated equipment).@@@@1@32@@danf@17-8-2009
10770400@unknown@formal@none@1@S@Platform software often comes bundled with the computer.@@@@1@8@@danf@17-8-2009
10770410@unknown@formal@none@1@S@On a [[Personal computer|PC]] you will usually have the ability to change the platform software.@@@@1@15@@danf@17-8-2009
10770420@unknown@formal@none@1@S@;Application software:@@@@1@2@@danf@17-8-2009
10770430@unknown@formal@none@1@S@[[Application software]], or applications, is what most people think of when they think of software.@@@@1@15@@danf@17-8-2009
10770440@unknown@formal@none@1@S@Typical examples include office suites and video games.@@@@1@8@@danf@17-8-2009
10770450@unknown@formal@none@1@S@Application software is often purchased separately from computer hardware.@@@@1@9@@danf@17-8-2009
10770460@unknown@formal@none@1@S@Sometimes applications are bundled with the computer, but that does not change the fact that they run as independent applications.@@@@1@20@@danf@17-8-2009
10770470@unknown@formal@none@1@S@Applications are almost always programs independent of the operating system, though they are often tailored for specific platforms.@@@@1@18@@danf@17-8-2009
10770480@unknown@formal@none@1@S@Most users think of compilers, databases, and other "system software" as applications.@@@@1@12@@danf@17-8-2009
10770490@unknown@formal@none@1@S@;User-written software:@@@@1@2@@danf@17-8-2009
10770500@unknown@formal@none@1@S@[[End-user development]] tailors systems to meet users' specific needs.@@@@1@9@@danf@17-8-2009
10770510@unknown@formal@none@1@S@User software includes spreadsheet templates, word processor macros, scientific simulations, and scripts for graphics and animations.@@@@1@16@@danf@17-8-2009
10770520@unknown@formal@none@1@S@Even email filters are a kind of user software.@@@@1@9@@danf@17-8-2009
10770530@unknown@formal@none@1@S@Users create this software themselves and often overlook how important it is.@@@@1@12@@danf@17-8-2009
10770535@unknown@formal@none@1@S@Depending on how competently the user-written software has been integrated into purchased application packages, many users may not be aware of the distinction between the purchased packages and what has been added by co-workers.@@@@1@34@@danf@17-8-2009
10770540@unknown@formal@none@1@S@==Creation==@@@@1@1@@danf@17-8-2009
10770550@unknown@formal@none@1@S@==Operation==@@@@1@1@@danf@17-8-2009
10770560@unknown@formal@none@1@S@Computer software has to be "loaded" into the [[computer storage|computer's storage]] (such as a ''[[hard drive]]'', ''memory'', or ''[[RAM]]'').@@@@1@19@@danf@17-8-2009
10770570@unknown@formal@none@1@S@Once the software has loaded, the computer is able to ''execute'' the software.@@@@1@13@@danf@17-8-2009
10770580@unknown@formal@none@1@S@This involves passing [[instruction (computer science)|instructions]] from the application software, through the system software, to the [[hardware]] which ultimately receives the instruction as [[machine language|machine code]].@@@@1@26@@danf@17-8-2009
10770590@unknown@formal@none@1@S@Each instruction causes the computer to carry out an operation -- moving [[data (computing)|data]], carrying out a [[computation]], or altering the [[control flow]] of instructions.@@@@1@25@@danf@17-8-2009
10770600@unknown@formal@none@1@S@Data movement is typically from one place in memory to another.@@@@1@11@@danf@17-8-2009
10770610@unknown@formal@none@1@S@Sometimes it involves moving data between memory and registers which enable high-speed data access in the CPU.@@@@1@17@@danf@17-8-2009
10770620@unknown@formal@none@1@S@Moving data, especially large amounts of it, can be costly.@@@@1@10@@danf@17-8-2009
10770630@unknown@formal@none@1@S@So, this is sometimes avoided by using "pointers" to data instead.@@@@1@11@@danf@17-8-2009
10770640@unknown@formal@none@1@S@Computations include simple operations such as incrementing the value of a variable data element.@@@@1@14@@danf@17-8-2009
10770650@unknown@formal@none@1@S@More complex computations may involve many operations and data elements together.@@@@1@11@@danf@17-8-2009
10770660@unknown@formal@none@1@S@Instructions may be performed sequentially, conditionally, or iteratively.@@@@1@8@@danf@17-8-2009
10770670@unknown@formal@none@1@S@Sequential instructions are those operations that are performed one after another.@@@@1@11@@danf@17-8-2009
10770680@unknown@formal@none@1@S@Conditional instructions are performed such that different sets of instructions execute depending on the value(s) of some data.@@@@1@18@@danf@17-8-2009
10770690@unknown@formal@none@1@S@In some languages this is known as an "if" statement.@@@@1@10@@danf@17-8-2009
10770700@unknown@formal@none@1@S@Iterative instructions are performed repetitively and may depend on some data value.@@@@1@12@@danf@17-8-2009
10770710@unknown@formal@none@1@S@This is sometimes called a "loop."@@@@1@6@@danf@17-8-2009
10770720@unknown@formal@none@1@S@Often, one instruction may "call" another set of instructions that are defined in some other program or [[module (programming)|module]].@@@@1@19@@danf@17-8-2009
10770730@unknown@formal@none@1@S@When more than one computer processor is used, instructions may be executed simultaneously.@@@@1@13@@danf@17-8-2009
10770740@unknown@formal@none@1@S@A simple example of the way software operates is what happens when a user selects an entry such as "Copy" from a menu.@@@@1@23@@danf@17-8-2009
10770750@unknown@formal@none@1@S@In this case, a conditional instruction is executed to copy text from data in a 'document' area residing in memory, perhaps to an intermediate storage area known as a 'clipboard' data area.@@@@1@32@@danf@17-8-2009
10770760@unknown@formal@none@1@S@If a different menu entry such as "Paste" is chosen, the software may execute the instructions to copy the text from the clipboard data area to a specific location in the same or another document in memory.@@@@1@37@@danf@17-8-2009
10770770@unknown@formal@none@1@S@Depending on the application, even the example above could become complicated.@@@@1@11@@danf@17-8-2009
10770780@unknown@formal@none@1@S@The field of [[software engineering]] endeavors to manage the complexity of how software operates.@@@@1@14@@danf@17-8-2009
10770790@unknown@formal@none@1@S@This is especially true for software that operates in the context of a large or powerful [[computer system]].@@@@1@18@@danf@17-8-2009
10770800@unknown@formal@none@1@S@Currently, almost the only limitation on the use of computer software in applications is the ingenuity of the designer/programmer.@@@@1@19@@danf@17-8-2009
10770810@unknown@formal@none@1@S@Consequently, large areas of activities (such as playing grand master level chess) formerly assumed to be incapable of software simulation are now routinely programmed.@@@@1@24@@danf@17-8-2009
10770820@unknown@formal@none@1@S@The only area that has so far proved reasonably secure from software simulation is the realm of human art, especially pleasing music and literature.@@@@1@24@@danf@17-8-2009
10770830@unknown@formal@none@1@S@Kinds of software by operation: [[computer program]] as [[executable]], [[source code]] or [[script (computer programming)|script]], [[computer configuration|configuration]].@@@@1@17@@danf@17-8-2009
10770840@unknown@formal@none@1@S@==Quality and reliability==@@@@1@3@@danf@17-8-2009
10770850@unknown@formal@none@1@S@[[Software reliability]] considers the errors, faults, and failures related to the design, implementation and operation of software.@@@@1@17@@danf@17-8-2009
10770860@unknown@formal@none@1@S@'''See''' [[Computer security audit|Software auditing]], [[Software quality]], [[Software testing]], and [[Software reliability]].@@@@1@12@@danf@17-8-2009
10770870@unknown@formal@none@1@S@==License==@@@@1@1@@danf@17-8-2009
10770880@unknown@formal@none@1@S@A [[software license]] gives the user the right to use the software in the licensed environment; some software comes with the license when purchased off the shelf, or with an OEM license when bundled with hardware.@@@@1@36@@danf@17-8-2009
10770890@unknown@formal@none@1@S@Other software comes with a [[free software licence]], granting the recipient the rights to modify and redistribute the software.@@@@1@19@@danf@17-8-2009
10770900@unknown@formal@none@1@S@Software can also be in the form of [[freeware]] or [[shareware]].@@@@1@11@@danf@17-8-2009
10770910@unknown@formal@none@1@S@See also [[License Management]].@@@@1@4@@danf@17-8-2009
10770920@unknown@formal@none@1@S@==Patents==@@@@1@1@@danf@17-8-2009
10770930@unknown@formal@none@1@S@The issue of [[software patent]]s is controversial.@@@@1@7@@danf@17-8-2009
10770940@unknown@formal@none@1@S@Some believe that they hinder [[software development]], while others argue that software patents provide an important incentive to spur software innovation.@@@@1@21@@danf@17-8-2009
10770950@unknown@formal@none@1@S@See [[software patent debate]].@@@@1@4@@danf@17-8-2009
10770960@unknown@formal@none@1@S@==Ethics and rights for software users==@@@@1@6@@danf@17-8-2009
10770970@unknown@formal@none@1@S@Because software is a new part of society, the idea of what rights users of software should have is not very developed.@@@@1@22@@danf@17-8-2009
10770980@unknown@formal@none@1@S@Some, such as the [[free software community]], believe that software users should be free to modify and redistribute the software they use.@@@@1@22@@danf@17-8-2009
10770990@unknown@formal@none@1@S@They argue that these rights are necessary so that each individual can control their computer, and so that everyone can cooperate, if they choose, to work together as a community and control the direction that software progresses in.@@@@1@38@@danf@17-8-2009
10770995@unknown@formal@none@1@S@Others believe that software authors should have the power to say what rights the user will get.@@@@1@17@@danf@17-8-2009
10771000@unknown@formal@none@1@S@==Software companies and non-profit organizations==@@@@1@5@@danf@17-8-2009
10771010@unknown@formal@none@1@S@Examples of non-profit software organizations: [[Free Software Foundation]], [[GNU Project]], [[Mozilla Foundation]].@@@@1@12@@danf@17-8-2009
10771020@unknown@formal@none@1@S@Examples of large software companies are: [[Microsoft]], [[IBM]], [[Oracle_Corporation|Oracle]], [[SAP AG|SAP]] and [[HP]].@@@@1@13@@danf@17-8-2009
10780010@unknown@formal@none@1@S@Spanish language@@@@1@2@@danf@17-8-2009
10780020@unknown@formal@none@1@S@'''Spanish''' or '''Castilian''' (''castellano'') is an [[Indo-European]], [[Romance languages|Romance language]] that originated in northern [[Spain]], and gradually spread in the [[Kingdom of Castile]] and evolved into the principal language of government and trade.@@@@1@33@@danf@17-8-2009
10780030@unknown@formal@none@1@S@It was taken to [[Spanish Empire#Territories in Africa (1898–1975)|Africa]], the [[Spanish colonization of the Americas|Americas]], and [[Spanish East Indies|Asia Pacific]] with the expansion of the [[Spanish Empire]] between the fifteenth and nineteenth centuries.@@@@1@33@@danf@17-8-2009
10780040@unknown@formal@none@1@S@Today, between 322 and 400 million people speak Spanish as a native language, making it the world's second most-spoken language by native speakers (after [[Standard Mandarin|Mandarin Chinese]]).@@@@1@27@@danf@17-8-2009
10780050@unknown@formal@none@1@S@==Hispanosphere==@@@@1@1@@danf@17-8-2009
10780060@unknown@formal@none@1@S@It is estimated that the combined total of native and non-native Spanish speakers is approximately 500 million, likely making it the third most spoken language by total number of speakers (after [[English_language|English]] and [[Chinese_language|Chinese]]).@@@@1@34@@danf@17-8-2009
10780070@unknown@formal@none@1@S@Today, Spanish is an official language of Spain, most [[Latin American]] countries, and [[Equatorial Guinea]]; 21 nations speak it as their primary language.@@@@1@23@@danf@17-8-2009
10780080@unknown@formal@none@1@S@Spanish also is one of [[United Nations#Languages|six official languages]] of the [[United Nations]].@@@@1@13@@danf@17-8-2009
10780090@unknown@formal@none@1@S@[[Mexico]] has the world's largest Spanish-speaking population, and Spanish is the second most widely spoken language in the [[United States]] and the most popular foreign language studied in [[United States|U.S.]] schools and universities.@@@@1@33@@danf@17-8-2009
10780100@unknown@formal@none@1@S@[[Global internet usage]] statistics for 2007 show Spanish as the third most commonly used language on the Internet, after English and [[Chinese language|Chinese]].@@@@1@23@@danf@17-8-2009
10780110@unknown@formal@none@1@S@==Naming and origin==@@@@1@3@@danf@17-8-2009
10780120@unknown@formal@none@1@S@Spaniards tend to call this language {{lang|es|'''''español'''''}} (Spanish) when contrasting it with languages of other states, such as [[French language|French]] and [[English language|English]], but call it {{lang|es|'''''castellano'''''}} (Castilian), that is, the language of the [[Castile (historical region)|Castile]] region, when contrasting it with other [[languages of Spain|languages spoken in Spain]] such as [[Galician language|Galician]], [[Basque language|Basque]], and [[Catalan language|Catalan]].@@@@1@58@@danf@17-8-2009
10780130@unknown@formal@none@1@S@This reasoning also holds true for the language's preferred name in some [[Hispanic America]]n countries.@@@@1@15@@danf@17-8-2009
10780140@unknown@formal@none@1@S@In this manner, the [[Spanish Constitution of 1978]] uses the term {{lang|es|''castellano''}} to define the [[official language]] of the whole Spanish State, as opposed to {{lang|es|''las demás lenguas españolas''}} (lit. ''the other Spanish languages'').@@@@1@34@@danf@17-8-2009
10780150@unknown@formal@none@1@S@Article III reads as follows:@@@@1@5@@danf@17-8-2009
10780160@unknown@formal@none@1@S@The name ''castellano'' is, however, widely used for the language as a whole in Latin America.@@@@1@16@@danf@17-8-2009
10780170@unknown@formal@none@1@S@Some Spanish speakers consider ''{{lang|es|castellano}}'' a generic term with no political or ideological links, much as "Spanish" is in English.@@@@1@20@@danf@17-8-2009
10780180@unknown@formal@none@1@S@Often Latin Americans use it to differentiate their own variety of Spanish from the variety spoken in Spain, or from the variety considered standard in the region.@@@@1@27@@danf@17-8-2009
10780190@unknown@formal@none@1@S@==Classification and related languages==@@@@1@4@@danf@17-8-2009
10780200@unknown@formal@none@1@S@Spanish is closely related to the other [[West Iberian languages|West Iberian]] Romance languages: [[Asturian language|Asturian]] ({{lang|ast|''asturianu''}}), [[Galician language|Galician]] ({{lang|gl|''galego''}}), [[Ladino language|Ladino]] ({{lang|lad|''dzhudezmo/spanyol/kasteyano''}}), and [[Portuguese language|Portuguese]] ({{lang|pt|''português''}}).@@@@1@26@@danf@17-8-2009
10780210@unknown@formal@none@1@S@Catalan, an [[Iberian Romance languages|East Iberian language]] which exhibits many [[Gallo-Romance]] traits, is more similar to the neighbouring [[Occitan language]] ({{lang|oc|''occitan''}}) than to Spanish, or indeed than Spanish and Portuguese are to each other.@@@@1@34@@danf@17-8-2009
10780220@unknown@formal@none@1@S@Spanish and Portuguese share similar grammars and vocabulary as well as a common history of [[Influence of Arabic on other languages|Arabic influence]] while a great part of the peninsula was under [[Timeline of the Muslim presence in the Iberian peninsula|Islamic rule]] (both languages expanded over [[Islamic empire|Islamic territories]]).@@@@1@48@@danf@17-8-2009
10780230@unknown@formal@none@1@S@Their [[lexical similarity]] has been estimated as 89%.@@@@1@8@@danf@17-8-2009
10780240@unknown@formal@none@1@S@See [[Differences between Spanish and Portuguese]] for further information.@@@@1@9@@danf@17-8-2009
10780250@unknown@formal@none@1@S@===Ladino===@@@@1@1@@danf@17-8-2009
10780260@unknown@formal@none@1@S@Ladino, which is essentially medieval Spanish and closer to modern Spanish than any other language, is spoken by many descendants of the [[Sephardi Jews]] who were [[Alhambra decree|expelled from Spain in the 15th century]].@@@@1@34@@danf@17-8-2009
10780270@unknown@formal@none@1@S@Ladino speakers are currently almost exclusively [[Sephardim|Sephardi]] Jews, with family roots in Turkey, Greece or the Balkans: current speakers mostly live in Israel and Turkey, with a few pockets in Latin America.@@@@1@32@@danf@17-8-2009
10780280@unknown@formal@none@1@S@It lacks the [[Amerindian languages|Native American vocabulary]] which was influential during the [[Spanish Empire|Spanish colonial period]], and it retains many archaic features which have since been lost in standard Spanish.@@@@1@30@@danf@17-8-2009
10780290@unknown@formal@none@1@S@It contains, however, other vocabulary which is not found in standard Castilian, including vocabulary from [[Hebrew language|Hebrew]], some French, Greek and [[Turkish language|Turkish]], and other languages spoken where the Sephardim settled.@@@@1@31@@danf@17-8-2009
10780300@unknown@formal@none@1@S@Ladino is in serious danger of extinction because many native speakers today are elderly, among them ''olim'' (immigrants to [[Israel]]) who have not transmitted the language to their children or grandchildren.@@@@1@31@@danf@17-8-2009
10780310@unknown@formal@none@1@S@However, it is experiencing a minor revival among Sephardi communities, especially in music.@@@@1@13@@danf@17-8-2009
10780320@unknown@formal@none@1@S@In the case of the Latin American communities, the danger of extinction is also due to the risk of assimilation by modern Castilian.@@@@1@23@@danf@17-8-2009
10780330@unknown@formal@none@1@S@A related dialect is [[Haketia]], the Judaeo-Spanish of northern Morocco.@@@@1@10@@danf@17-8-2009
10780340@unknown@formal@none@1@S@This too tended to assimilate with modern Spanish during the Spanish occupation of the region.@@@@1@15@@danf@17-8-2009
10780350@unknown@formal@none@1@S@===Vocabulary comparison===@@@@1@2@@danf@17-8-2009
10780360@unknown@formal@none@1@S@Spanish and [[Italian language|Italian]] share a very similar phonological system.@@@@1@10@@danf@17-8-2009
10780370@unknown@formal@none@1@S@At present, the [[lexical similarity]] with Italian is estimated at 82%.@@@@1@11@@danf@17-8-2009
10780380@unknown@formal@none@1@S@As a result, Spanish and Italian are mutually intelligible to various degrees.@@@@1@12@@danf@17-8-2009
10780390@unknown@formal@none@1@S@The lexical similarity with [[Portuguese language|Portuguese]] is greater, 89%, but the vagaries of Portuguese pronunciation make it less easily understood by Hispanophones than Italian.@@@@1@24@@danf@17-8-2009
10780400@unknown@formal@none@1@S@[[Mutual intelligibility]] between Spanish and [[French language|French]] or [[Romanian language|Romanian]] is even lower (lexical similarity being respectively 75% and 71%): comprehension of Spanish by French speakers who have not studied the language is as low as an estimated 45%, the same as for English.@@@@1@44@@danf@17-8-2009
10780410@unknown@formal@none@1@S@The common features of the writing systems of the Romance languages allow for a greater amount of interlingual reading comprehension than oral communication would.@@@@1@24@@danf@17-8-2009
10780420@unknown@formal@none@1@S@ 1. also {{lang|pt|''nós outros''}} in early modern Portuguese (e.g. ''[[The Lusiads]]'')@@@@1@12@@danf@17-8-2009
10780430@unknown@formal@none@1@S@2. {{lang|it|''noi '''altri'''''}} in Southern [[List of languages of Italy|Italian dialects and languages]]@@@@1@13@@danf@17-8-2009
10780440@unknown@formal@none@1@S@3. Alternatively {{lang|fr|''nous '''autres'''''}} @@@@1@5@@danf@17-8-2009
10780460@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10780470@unknown@formal@none@1@S@Spanish evolved from [[Vulgar Latin]], with major [[Arabic influence on the Spanish language|influences from Arabic]] in vocabulary during the [[Al-Andalus|Andalusian]] period and minor surviving influences from [[Basque language|Basque]] and [[Celtiberian language|Celtiberian]], as well as [[Germanic languages]] via the [[Visigoths]].@@@@1@39@@danf@17-8-2009
10780480@unknown@formal@none@1@S@Spanish developed along the remote crossroad strips among the [[Alava]], [[Cantabria]], [[Burgos]], [[Soria]] and [[La Rioja (autonomous community)|La Rioja]] provinces of Northern Spain, as a strongly innovative and differing variant from its nearest cousin, [[Asturian|Leonese speech]], with a higher degree of Basque influence in these regions (see [[Iberian Romance languages]]).@@@@1@50@@danf@17-8-2009
10780490@unknown@formal@none@1@S@Typical features of Spanish diachronic [[phonology]] include [[lenition]] (Latin {{lang|la|''vita''}}, Spanish {{lang|es|''vida''}}), [[palatalization]] (Latin {{lang|la|''annum''}}, Spanish {{lang|es|''año''}}, and Latin {{lang|la|''anellum''}}, Spanish {{lang|es|''anillo''}}) and [[diphthong]]ization ([[stem (linguistics)|stem]]-changing) of short ''e'' and ''o'' from Vulgar Latin (Latin {{lang|la|''terra''}}, Spanish {{lang|es|''tierra''}}; Latin {{lang|la|''novus''}}, Spanish {{lang|es|''nuevo''}}).@@@@1@42@@danf@17-8-2009
10780500@unknown@formal@none@1@S@Similar phenomena can be found in other Romance languages as well.@@@@1@11@@danf@17-8-2009
10780510@unknown@formal@none@1@S@During the {{lang|es|''[[Reconquista]]''}}, this northern dialect from [[Cantabria]] was carried south, and remains a [[minority language]] in northern coastal [[Morocco]].@@@@1@20@@danf@17-8-2009
10780520@unknown@formal@none@1@S@The first Latin-to-Spanish grammar ({{lang|es|''Gramática de la Lengua Castellana''}}) was written in [[Salamanca]], Spain, in 1492, by [[Antonio de Nebrija|Elio Antonio de Nebrija]].@@@@1@23@@danf@17-8-2009
10780530@unknown@formal@none@1@S@When it was presented to [[Isabel de Castilla]], she asked, "What do I want a work like this for, if I already know the language?", to which he replied, "Your highness, the language is the instrument of the Empire."@@@@1@39@@danf@17-8-2009
10780540@unknown@formal@none@1@S@From the 16th century onwards, the language was taken to the [[Americas]] and the [[Spanish East Indies]] via [[Spanish colonization of the Americas|Spanish colonization]].@@@@1@24@@danf@17-8-2009
10780550@unknown@formal@none@1@S@In the 20th century, Spanish was introduced to [[Equatorial Guinea]] and the [[Western Sahara]], and to areas of the United States that had not been part of the Spanish Empire, such as [[Spanish Harlem]] in [[New York City]].@@@@1@38@@danf@17-8-2009
10780560@unknown@formal@none@1@S@For details on borrowed words and other external influences upon Spanish, see [[Influences on the Spanish language]].@@@@1@17@@danf@17-8-2009
10780570@unknown@formal@none@1@S@===Characterization===@@@@1@1@@danf@17-8-2009
10780580@unknown@formal@none@1@S@A defining characteristic of Spanish was the [[diphthong]]ization of the Latin short vowels ''e'' and ''o'' into ''ie'' and ''ue'', respectively, when they were stressed.@@@@1@25@@danf@17-8-2009
10780590@unknown@formal@none@1@S@Similar [[sound law|sound changes]] are found in other Romance languages, but in Spanish they were significant.@@@@1@16@@danf@17-8-2009
10780600@unknown@formal@none@1@S@Some examples:@@@@1@2@@danf@17-8-2009
10780610@unknown@formal@none@1@S@* Lat. {{lang|la|''petra''}} > Sp. {{lang|es|''piedra''}}, It. {{lang|it|''pietra''}}, Fr. {{lang|fr|''pierre''}}, Rom. {{lang|ro|''piatră''}}, Port./Gal. {{lang|pt|''pedra''}} "stone".@@@@1@15@@danf@17-8-2009
10780620@unknown@formal@none@1@S@* Lat. {{lang|la|''moritur''}} > Sp. {{lang|es|''muere''}}, It. {{lang|it|''muore''}}, Fr. {{lang|fr|''meurt''}} / {{lang|fr|''muert''}}, Rom. {{lang|ro|''moare''}}, Port./Gal. {{lang|pt|''morre''}} "die".@@@@1@17@@danf@17-8-2009
10780630@unknown@formal@none@1@S@Peculiar to early Spanish (as in the [[Gascon]] dialect of Occitan, and possibly due to a Basque [[substratum]]) was the mutation of Latin initial ''f-'' into ''h-'' whenever it was followed by a vowel that did not diphthongize.@@@@1@38@@danf@17-8-2009
10780640@unknown@formal@none@1@S@Compare for instance:@@@@1@3@@danf@17-8-2009
10780650@unknown@formal@none@1@S@* Lat. {{lang|la|''filium''}} > It. {{lang|it|''figlio''}}, Port. {{lang|pt|''filho''}}, Gal. {{lang|gl|''fillo''}}, Fr. {{lang|fr|''fils''}}, Occitan {{lang|oc|''filh''}} (but Gascon {{lang|gsc|''hilh''}}), Sp. {{lang|es|''hijo''}} (but Ladino {{lang|lad|''fijo''}});@@@@1@22@@danf@17-8-2009
10780660@unknown@formal@none@1@S@* Lat. {{lang|la|''fabulari''}} > Lad. {{lang|lad|''favlar''}}, Port./Gal. {{lang|pt|''falar''}}, Sp. {{lang|es|''hablar''}};@@@@1@10@@danf@17-8-2009
10780670@unknown@formal@none@1@S@* but Lat. {{lang|la|''focum''}} > It. {{lang|it|''fuoco''}}, Port./Gal. {{lang|pt|''fogo''}}, Sp./Lad. {{lang|es|''fuego''}}.@@@@1@11@@danf@17-8-2009
10780680@unknown@formal@none@1@S@Some [[consonant cluster]]s of Latin also produced characteristically different results in these languages, for example:@@@@1@15@@danf@17-8-2009
10780690@unknown@formal@none@1@S@* Lat. {{lang|la|''clamare''}}, acc. {{lang|la|''flammam''}}, {{lang|la|''plenum''}} > Lad. {{lang|lad|''lyamar''}}, {{lang|lad|''flama''}}, {{lang|lad|''pleno''}}; Sp. {{lang|es|''llamar''}}, {{lang|es|''llama''}}, {{lang|es|''lleno''}}.@@@@1@15@@danf@17-8-2009
10780700@unknown@formal@none@1@S@However, in Spanish there are also the forms {{lang|es|''clamar''}}, {{lang|es|''flama''}}, {{lang|es|''pleno''}}; Port. {{lang|pt|''chamar''}}, {{lang|pt|''chama''}}, {{lang|pt|''cheio''}}; Gal. {{lang|gl|''chamar''}}, {{lang|gl|''chama''}}, {{lang|gl|''cheo''}}.@@@@1@19@@danf@17-8-2009
10780710@unknown@formal@none@1@S@* Lat. acc. {{lang|la|''octo''}}, {{lang|la|''noctem''}}, {{lang|la|''multum''}} > Lad. {{lang|lad|''ocho''}}, {{lang|lad|''noche''}}, {{lang|lad|''muncho''}}; Sp. {{lang|es|''ocho''}}, {{lang|es|''noche''}}, {{lang|es|''mucho''}}; Port. {{lang|pt|''oito''}}, {{lang|pt|''noite''}}, {{lang|pt|''muito''}}; Gal. {{lang|gl|''oito''}}, {{lang|gl|''noite''}}, {{lang|gl|''moito''}}.@@@@1@23@@danf@17-8-2009
10780720@unknown@formal@none@1@S@==Geographic distribution==@@@@1@2@@danf@17-8-2009
10780730@unknown@formal@none@1@S@Spanish is one of the official languages of the [[European Union]], the [[Organization of American States]], the [[Organization of Ibero-American States]], the [[United Nations]], and the [[Union of South American Nations]].@@@@1@31@@danf@17-8-2009
10780740@unknown@formal@none@1@S@===Europe===@@@@1@1@@danf@17-8-2009
10780750@unknown@formal@none@1@S@Spanish is an official language of Spain, the country for which it is named and from which it originated.@@@@1@19@@danf@17-8-2009
10780760@unknown@formal@none@1@S@It is also spoken in [[Gibraltar]], though English is the official language.@@@@1@12@@danf@17-8-2009
10780770@unknown@formal@none@1@S@Likewise, it is spoken in [[Andorra]] though [[Catalan language|Catalan]] is the official language.@@@@1@13@@danf@17-8-2009
10780780@unknown@formal@none@1@S@It is also spoken by small communities in other European countries, such as the [[United Kingdom]], [[France]], and [[Germany]].@@@@1@19@@danf@17-8-2009
10780790@unknown@formal@none@1@S@Spanish is an official language of the [[European Union]].@@@@1@9@@danf@17-8-2009
10780800@unknown@formal@none@1@S@In Switzerland, Spanish is the [[mother tongue]] of 1.7% of the population, making it the largest minority language after the four official languages of the country.@@@@1@26@@danf@17-8-2009
10780810@unknown@formal@none@1@S@===The Americas===@@@@1@2@@danf@17-8-2009
10780820@unknown@formal@none@1@S@====Latin America====@@@@1@2@@danf@17-8-2009
10780830@unknown@formal@none@1@S@Most Spanish speakers are in [[Latin America]]; of the countries with the most Spanish speakers, only [[Spain]] is outside of the [[Americas]].@@@@1@22@@danf@17-8-2009
10780840@unknown@formal@none@1@S@[[Mexico]] has the most native speakers of any country.@@@@1@9@@danf@17-8-2009
10780850@unknown@formal@none@1@S@Nationally, Spanish is the official language of [[Argentina]], [[Bolivia]] (co-official [[Quechua]] and [[Aymara language|Aymara]]), [[Chile]], [[Colombia]], [[Costa Rica]], [[Cuba]], [[Dominican Republic]], [[Ecuador]], [[El Salvador]], [[Guatemala]], [[Honduras]], [[Mexico]], [[Nicaragua]], [[Panama]], [[Paraguay]] (co-official [[Guarani language|Guaraní]]), [[Peru]] (co-official [[Quechua]] and, in some regions, [[Aymara language|Aymara]]), [[Uruguay]], and [[Venezuela]].@@@@1@45@@danf@17-8-2009
10780860@unknown@formal@none@1@S@Spanish is also the official language (co-official with [[English language|English]]) in the U.S. commonwealth of [[Puerto Rico]].@@@@1@17@@danf@17-8-2009
10780870@unknown@formal@none@1@S@Spanish has no official recognition in the former [[British overseas territories|British colony]] of [[Belize]]; however, per the 2000 census, it is spoken by 43% of the population.@@@@1@27@@danf@17-8-2009
10780880@unknown@formal@none@1@S@It is spoken mainly by Hispanic descendants who have remained in the region since the 17th century; however, English is the official language.@@@@1@23@@danf@17-8-2009
10780890@unknown@formal@none@1@S@Spain colonized [[Trinidad and Tobago]] first in [[1498]], leaving the [[Carib]] people the Spanish language.@@@@1@15@@danf@17-8-2009
10780900@unknown@formal@none@1@S@Also the [[Cocoa Panyol]]s, laborers from Venezuela, took their culture and language with them; they are credited with the music of "[[Parang]]" ("[[Parranda]]") on the island.@@@@1@26@@danf@17-8-2009
10780910@unknown@formal@none@1@S@Because of Trinidad's location on the South American coast, the country is much influenced by its Spanish-speaking neighbors.@@@@1@18@@danf@17-8-2009
10780920@unknown@formal@none@1@S@A recent census shows that more than 1,500 inhabitants speak Spanish.@@@@1@11@@danf@17-8-2009
10780930@unknown@formal@none@1@S@In 2004, the government initiated the ''Spanish as a First Foreign Language'' (SAFFL) initiative, with a public launch in March 2005.@@@@1@21@@danf@17-8-2009
10780940@unknown@formal@none@1@S@Government regulations require Spanish to be taught, beginning in primary school, while thirty percent of public employees are to be linguistically competent within five years.@@@@1@25@@danf@17-8-2009
10780950@unknown@formal@none@1@S@The government also announced that Spanish will be the country's second official language by [[2020]], alongside English.@@@@1@17@@danf@17-8-2009
10780960@unknown@formal@none@1@S@Spanish is important in [[Brazil]] because of the country's proximity to, and increased trade with, its Spanish-speaking neighbors, for example as a member of the [[Mercosur]] trading bloc.@@@@1@28@@danf@17-8-2009
10780970@unknown@formal@none@1@S@In 2005, the [[National Congress of Brazil]] approved a bill, signed into law by the [[President of Brazil|President]], making Spanish available as a foreign language in secondary schools.@@@@1@28@@danf@17-8-2009
10780980@unknown@formal@none@1@S@In many border towns and villages (especially on the Uruguayan-Brazilian border), a [[mixed language]] known as [[Riverense Portuñol|Portuñol]] is spoken.@@@@1@20@@danf@17-8-2009
10780990@unknown@formal@none@1@S@====United States====@@@@1@2@@danf@17-8-2009
10781000@unknown@formal@none@1@S@In the 2006 census, 44.3 million people of the U.S. population were [[Hispanic]] or [[Latino]] by origin; 34 million people, 12.2 percent of the population older than 5 years, spoke Spanish at home.@@@@1@33@@danf@17-8-2009
10781005@unknown@formal@none@1@S@Spanish has a [[Spanish in the United States|long history in the United States]] (many south-western states were part of Mexico and Spain), and it recently has been revitalized by much immigration from Latin America.@@@@1@34@@danf@17-8-2009
10781010@unknown@formal@none@1@S@Spanish is the most widely taught foreign language in the country.@@@@1@11@@danf@17-8-2009
10781020@unknown@formal@none@1@S@Although the United States has no formally designated "official languages," Spanish is formally recognized at the state level alongside English; in the U.S. state of [[New Mexico]], 30 per cent of the population speak it.@@@@1@35@@danf@17-8-2009
10781030@unknown@formal@none@1@S@It also has strong influence in metropolitan areas such as Los Angeles, Miami and New York City.@@@@1@17@@danf@17-8-2009
10781040@unknown@formal@none@1@S@Spanish is the dominant spoken language in [[Puerto Rico]], a U.S. territory.@@@@1@12@@danf@17-8-2009
10781050@unknown@formal@none@1@S@In total, the U.S. has the world's fifth-largest Spanish-speaking population.@@@@1@10@@danf@17-8-2009
10781060@unknown@formal@none@1@S@===Asia===@@@@1@1@@danf@17-8-2009
10781070@unknown@formal@none@1@S@Spanish was an official language of the [[Philippines]] but was never spoken by a majority of the population.@@@@1@18@@danf@17-8-2009
10781080@unknown@formal@none@1@S@Movements to teach the language to the masses were started but were stopped by the friars.@@@@1@16@@danf@17-8-2009
10781090@unknown@formal@none@1@S@Its importance fell in the first half of the 20th century following the U.S. occupation and administration of the islands.@@@@1@20@@danf@17-8-2009
10781100@unknown@formal@none@1@S@The introduction of the English language in the Philippine government system put an end to the use of Spanish as the official language.@@@@1@23@@danf@17-8-2009
10781110@unknown@formal@none@1@S@The language lost its official status in 1973 during the [[Ferdinand Marcos]] administration.@@@@1@13@@danf@17-8-2009
10781120@unknown@formal@none@1@S@Spanish is spoken mainly by small communities of Filipino-born Spaniards, Latin Americans, and Filipino [[mestizo]]s (mixed race), descendants of the early colonial Spanish settlers.@@@@1@24@@danf@17-8-2009
10781130@unknown@formal@none@1@S@Throughout the 20th century, the Spanish language declined in importance compared to English and [[Tagalog language|Tagalog]].@@@@1@16@@danf@17-8-2009
10781140@unknown@formal@none@1@S@According to the 1990 Philippine census, there were 2,658 native speakers of Spanish.@@@@1@13@@danf@17-8-2009
10781150@unknown@formal@none@1@S@No figures were provided during the 1995 and 2000 censuses; however, figures for 2000 did specify there were over 600,000 native speakers of [[Chavacano language|Chavacano]], a Spanish-based [[Creole language|creole]] language spoken in [[Cavite]] and [[Zamboanga]].@@@@1@35@@danf@17-8-2009
10781160@unknown@formal@none@1@S@Some other sources put the number of Spanish speakers in the Philippines around two to three million; however, these sources are disputed.@@@@1@22@@danf@17-8-2009
10781170@unknown@formal@none@1@S@There are 4,000 words adopted from Spanish in Tagalog, and around 6,000 in Visayan and other Philippine languages.@@@@1@18@@danf@17-8-2009
10781180@unknown@formal@none@1@S@Today Spanish is offered as a foreign language in Philippine schools and universities.@@@@1@13@@danf@17-8-2009
10781190@unknown@formal@none@1@S@===Africa===@@@@1@1@@danf@17-8-2009
10781200@unknown@formal@none@1@S@In Africa, Spanish is official in the UN-recognised but Moroccan-occupied [[Western Sahara]] (co-official with [[Arabic language|Arabic]]) and [[Equatorial Guinea]] (co-official with [[French language|French]] and [[Portuguese language|Portuguese]]).@@@@1@26@@danf@17-8-2009
10781210@unknown@formal@none@1@S@Today, nearly 200,000 refugee Sahrawis are able to read and write in Spanish, and several thousand have received [[university]] education in foreign countries (mainly [[Cuba]] and [[Spain]]) as part of aid packages.@@@@1@32@@danf@17-8-2009
10781220@unknown@formal@none@1@S@In Equatorial Guinea, Spanish is the predominant language when counting native and non-native speakers (around 500,000 people), while [[Fang language|Fang]] is the most spoken language by number of native speakers.@@@@1@30@@danf@17-8-2009
10781230@unknown@formal@none@1@S@It is also spoken in the Spanish cities in [[Plazas de soberanía|continental North Africa]] ([[Ceuta]] and [[Melilla]]) and in the autonomous community of the [[Canary Islands]] (143,000 and 1,995,833 people, respectively).@@@@1@31@@danf@17-8-2009
10781240@unknown@formal@none@1@S@Within Northern Morocco, a former [[History of Morocco#European influence|Franco-Spanish protectorate]] that is also geographically close to Spain, approximately 20,000 people speak Spanish.@@@@1@22@@danf@17-8-2009
10781250@unknown@formal@none@1@S@It is spoken by some communities in [[Angola]], because of the Cuban influence from the [[Cold War]], and in [[Nigeria]] by the descendants of [[Afro-Cuban]] ex-slaves.@@@@1@26@@danf@17-8-2009
10781260@unknown@formal@none@1@S@In [[Côte d'Ivoire]] and [[Senegal]], Spanish can be learned as a second foreign language in the public education system.@@@@1@19@@danf@17-8-2009
10781270@unknown@formal@none@1@S@In 2008, [[Cervantes Institute]] centers were to be opened in [[Lagos]] and [[Johannesburg]], the first in [[Sub-Saharan Africa]].@@@@1@18@@danf@17-8-2009
10781280@unknown@formal@none@1@S@===Oceania===@@@@1@1@@danf@17-8-2009
10781290@unknown@formal@none@1@S@Among the countries and territories in [[Oceania]], Spanish is also spoken on [[Easter Island]], a territorial possession of Chile.@@@@1@19@@danf@17-8-2009
10781300@unknown@formal@none@1@S@According to the 2001 census, there are approximately 95,000 speakers of Spanish in Australia, 44,000 of whom live in Greater Sydney, where the older [[:Category: Australians of Mexican descent|Mexican]], [[:Category:Australians of Colombian descent|Colombian]], and [[:Category: Australians of Spanish descent|Spanish]] populations and newer [[:Category:Australians of Argentine descent|Argentine]], Salvadoran and [[:Category:Australians of Uruguayan descent|Uruguayan]] communities live.@@@@1@54@@danf@17-8-2009
10781310@unknown@formal@none@1@S@The island nations of [[Guam]], [[Palau]], [[Northern Marianas]], [[Marshall Islands]] and [[Federated States of Micronesia]] all once had Spanish speakers, since [[Marianas Islands|Marianas]] and [[Caroline Islands]] were Spanish colonial possessions until the late 19th century (see [[Spanish-American War]]), but Spanish has since been forgotten.@@@@1@44@@danf@17-8-2009
10781320@unknown@formal@none@1@S@It now only exists as an influence on the local native languages and is also spoken by [[Hispanics in the United States|Hispanic American]] resident populations.@@@@1@25@@danf@17-8-2009
10781330@unknown@formal@none@1@S@==Dialectal variation==@@@@1@2@@danf@17-8-2009
10781340@unknown@formal@none@1@S@There are important variations among the regions of Spain and throughout Spanish-speaking America.@@@@1@13@@danf@17-8-2009
10781350@unknown@formal@none@1@S@In countries in Hispanophone America, it is preferable to use the word ''castellano'' to distinguish their version of the language from that of Spain, thus asserting their autonomy and national identity.@@@@1@31@@danf@17-8-2009
10781360@unknown@formal@none@1@S@In Spain the Castilian dialect's pronunciation is commonly regarded as the national standard, although this dialect's use of slightly different pronouns, called [[Loísmo|{{lang|es|''laísmo''}}]], is deprecated.@@@@1@25@@danf@17-8-2009
10781370@unknown@formal@none@1@S@More accurately, for nearly everyone in Spain, "standard Spanish" means "pronouncing everything exactly as it is written," an ideal which does not correspond to any real dialect, though the northern dialects are the closest to it.@@@@1@36@@danf@17-8-2009
10781380@unknown@formal@none@1@S@In practice, the standard way of speaking Spanish in the media is "written Spanish" for formal speech and "Madrid dialect" (one of the transitional variants between Castilian and Andalusian) for informal speech.@@@@1@32@@danf@17-8-2009
10781390@unknown@formal@none@1@S@===Voseo===@@@@1@1@@danf@17-8-2009
10781400@unknown@formal@none@1@S@Spanish has three [[grammatical person|second-person]] [[grammatical number|singular]] [[pronoun]]s: {{lang|es|''tú''}}, {{lang|es|''usted''}}, and in some parts of Latin America, {{lang|es|''vos''}} (the use of this pronoun and/or its verb forms is called ''voseo'').@@@@1@30@@danf@17-8-2009
10781410@unknown@formal@none@1@S@In those regions where it is used, generally speaking, {{lang|es|''tú''}} and {{lang|es|''vos''}} are informal and used with friends; in other countries, {{lang|es|''vos''}} is considered an archaic form.@@@@1@27@@danf@17-8-2009
10781415@unknown@formal@none@1@S@{{lang|es|''Usted''}} is universally regarded as the formal address (derived from {{lang|es|''vuestra merced''}}, "your grace"), and is used as a mark of respect, as when addressing one's elders or strangers.@@@@1@29@@danf@17-8-2009
10781420@unknown@formal@none@1@S@{{lang|es|''Vos''}} is used extensively as the primary spoken form of the second-person singular pronoun, although with wide differences in social consideration, in many countries of [[Latin America]], including [[Argentina]], [[Chile]], [[Costa Rica]], the central mountain region of [[Ecuador]], the State of [[Chiapas]] in [[Mexico]], [[El Salvador]], [[Guatemala]], [[Honduras]], [[Nicaragua]], [[Paraguay]], [[Uruguay]], the [[Paisa region]] and Caleños of [[Colombia]] and the [[States]] of [[Zulia]] and Trujillo in [[Venezuela]].@@@@1@67@@danf@17-8-2009
10781430@unknown@formal@none@1@S@There are some differences in the verbal endings for ''vos'' in each country.@@@@1@13@@danf@17-8-2009
10781440@unknown@formal@none@1@S@In Argentina, Uruguay, and increasingly in Paraguay and some Central American countries, it is also the standard form used in the [[mass media|media]], but the media in other countries with {{lang|es|''voseo''}} generally continue to use {{lang|es|''usted''}} or {{lang|es|''tú''}} except in advertisements, for instance.@@@@1@43@@danf@17-8-2009
10781445@unknown@formal@none@1@S@{{lang|es|''Vos''}} may also be used regionally in other countries.@@@@1@9@@danf@17-8-2009
10781450@unknown@formal@none@1@S@Depending on country or region, usage may be considered standard or (by better-educated speakers) unrefined.@@@@1@15@@danf@17-8-2009
10781460@unknown@formal@none@1@S@Interpersonal situations in which the use of ''vos'' is acceptable may also differ considerably between regions.@@@@1@16@@danf@17-8-2009
10781470@unknown@formal@none@1@S@===Ustedes===@@@@1@1@@danf@17-8-2009
10781480@unknown@formal@none@1@S@Spanish forms also differ regarding second-person plural pronouns.@@@@1@8@@danf@17-8-2009
10781490@unknown@formal@none@1@S@The Spanish dialects of Latin America have only one form of the second-person plural for daily use, {{lang|es|''ustedes''}} (formal or familiar, as the case may be, though {{lang|es|''vosotros''}} non-formal usage can sometimes appear in poetry and rhetorical or literary style).@@@@1@40@@danf@17-8-2009
10781500@unknown@formal@none@1@S@In Spain there are two forms — {{lang|es|''ustedes''}} (formal) and {{lang|es|''vosotros''}} (familiar).@@@@1@12@@danf@17-8-2009
10781510@unknown@formal@none@1@S@The pronoun {{lang|es|''vosotros''}} is the plural form of {{lang|es|''tú''}} in most of Spain, but in the Americas (and certain southern Spanish cities such as [[Cádiz]] or [[Seville]], and in the [[Canary Islands]]) it is replaced with {{lang|es|''ustedes''}}.@@@@1@37@@danf@17-8-2009
10781520@unknown@formal@none@1@S@It is notable that the use of {{lang|es|''ustedes''}} for the informal plural "you" in southern Spain does not follow the usual rule for pronoun-verb [[agreement (linguistics)|agreement]]; e.g., while the formal form for "you go", {{lang|es|''ustedes van''}}, uses the third-person plural form of the verb, in Cádiz or Seville the informal form is constructed as {{lang|es|''ustedes vais''}}, using the second-person plural of the verb.@@@@1@63@@danf@17-8-2009
10781530@unknown@formal@none@1@S@In the Canary Islands, though, the usual pronoun-verb agreement is preserved in most cases.@@@@1@14@@danf@17-8-2009
10781540@unknown@formal@none@1@S@Some words can be different, even embarrassingly so, in different Hispanophone countries.@@@@1@12@@danf@17-8-2009
10781550@unknown@formal@none@1@S@Most Spanish speakers can recognize other Spanish forms, even in places where they are not commonly used, but Spaniards generally do not recognize specifically American usages.@@@@1@26@@danf@17-8-2009
10781560@unknown@formal@none@1@S@For example, Spanish ''mantequilla'', ''aguacate'' and ''albaricoque'' ("butter", "avocado" and "apricot", respectively) correspond to ''manteca'', ''palta'' and ''damasco'' in Argentina, Chile and Uruguay.@@@@1@23@@danf@17-8-2009
10781570@unknown@formal@none@1@S@The everyday Spanish words ''coger'' (to catch, get, or pick up), ''pisar'' (to step on) and ''concha'' (seashell) are considered extremely rude in parts of Latin America, where the meaning of ''coger'' and ''pisar'' is also "to have sex" and ''concha'' means "vulva".@@@@1@43@@danf@17-8-2009
10781580@unknown@formal@none@1@S@The Puerto Rican word for "bobby pin" (''pinche'') is an obscenity in Mexico, and in [[Nicaragua]] simply means "stingy".@@@@1@19@@danf@17-8-2009
10781590@unknown@formal@none@1@S@Other examples include ''[[taco]]'', which means "swearword" in Spain but is known to the rest of the world as a Mexican dish.@@@@1@22@@danf@17-8-2009
10781600@unknown@formal@none@1@S@''Pija'' in many countries of Latin America is an obscene slang word for "penis", while in [[Spain]] the word also signifies "posh girl" or "snobby".@@@@1@25@@danf@17-8-2009
10781610@unknown@formal@none@1@S@''Coche'', which means "car" in Spain, for the vast majority of Spanish-speakers actually means "baby-stroller"; in Guatemala it means "pig", while ''carro'' means "car" in some Latin American countries and "cart" in others, as well as in Spain.@@@@1@38@@danf@17-8-2009
10781620@unknown@formal@none@1@S@The {{lang|es|[[Real Academia Española]]}} (Royal Spanish Academy), together with the 21 other national ones (see [[Association of Spanish Language Academies]]), exercises a standardizing influence through its publication of dictionaries and widely respected grammar and style guides.@@@@1@36@@danf@17-8-2009
10781630@unknown@formal@none@1@S@Due to this influence and for other sociohistorical reasons, a standardized form of the language ([[Standard Spanish]]) is widely acknowledged for use in literature, academic contexts and the media.@@@@1@29@@danf@17-8-2009
10781640@unknown@formal@none@1@S@==Writing system==@@@@1@2@@danf@17-8-2009
10781650@unknown@formal@none@1@S@Spanish is written using the [[Latin alphabet]], with the addition of the character ''[[ñ]]'' (''eñe'', representing the phoneme {{IPA|/ɲ/}}, a letter distinct from ''n'', although typographically composed of an ''n'' with a [[tilde]]) and the [[digraph (orthography)|digraph]]s ''ch'' ({{lang|es|''che''}}, representing the phoneme {{IPA|/tʃ/}}) and ''ll'' ({{lang|es|''elle''}}, representing the phoneme {{IPA|/ʎ/}}).@@@@1@50@@danf@17-8-2009
10781660@unknown@formal@none@1@S@However, the digraph ''rr'' ({{lang|es|''erre fuerte''}}, "strong ''r''", {{lang|es|''erre doble''}}, "double ''r''", or simply {{lang|es|''erre''}}), which also represents a distinct phoneme {{IPA|/r/}}, is not similarly regarded as a single letter.@@@@1@30@@danf@17-8-2009
10781670@unknown@formal@none@1@S@Since 1994, the digraphs ''ch'' and ''ll'' have been treated as letter pairs for [[collation]] purposes, though they remain a part of the alphabet.@@@@1@24@@danf@17-8-2009
10781680@unknown@formal@none@1@S@Words with ''ch'' are now alphabetically sorted between those with ''ce'' and ''ci'', instead of following ''cz'' as they used to, and similarly for ''ll''.@@@@1@25@@danf@17-8-2009
10781690@unknown@formal@none@1@S@Thus, the Spanish alphabet has the following 29 letters:@@@@1@9@@danf@17-8-2009
10781700@unknown@formal@none@1@S@:a, b, c, ch, d, e, f, g, h, i, j, k, l, ll, m, n, ñ, o, p, q, r, s, t, u, v, w, x, y, z.@@@@1@29@@danf@17-8-2009
10781710@unknown@formal@none@1@S@With the exclusion of a very small number of regional terms such as ''México'' (see [[Toponymy of Mexico]]) and some neologisms like ''software'', pronunciation can be entirely determined from spelling.@@@@1@30@@danf@17-8-2009
10781720@unknown@formal@none@1@S@A typical Spanish word is stressed on the [[syllable]] before the last if it ends with a vowel (not including ''y'') or with a vowel followed by ''n'' or ''s''; it is stressed on the last syllable otherwise.@@@@1@38@@danf@17-8-2009
10781730@unknown@formal@none@1@S@Exceptions to this rule are indicated by placing an [[acute accent]] on the [[stress (linguistics)|stressed vowel]].@@@@1@16@@danf@17-8-2009
10781740@unknown@formal@none@1@S@The acute accent is used, in addition, to distinguish between certain [[homophone]]s, especially when one of them is a stressed word and the other one is a [[clitic]]: compare {{lang|es|''el''}} ("the", masculine singular definite article) with {{lang|es|''él''}} ("he" or "it"), or {{lang|es|''te''}} ("you", object pronoun), {{lang|es|''de''}} (preposition "of" or "from"), and {{lang|es|''se''}} (reflexive pronoun) with {{lang|es|''té''}} ("tea"), {{lang|es|''dé''}} ("give") and {{lang|es|''sé''}} ("I know", or imperative "be").@@@@1@66@@danf@17-8-2009
10781750@unknown@formal@none@1@S@The interrogative pronouns ({{lang|es|''qué''}}, {{lang|es|''cuál''}}, {{lang|es|''dónde''}}, {{lang|es|''quién''}}, etc.) also receive accents in direct or indirect questions, and some demonstratives ({{lang|es|''ése''}}, {{lang|es|''éste''}}, {{lang|es|''aquél''}}, etc.) must be accented when used as pronouns.@@@@1@30@@danf@17-8-2009
10781760@unknown@formal@none@1@S@The conjunction {{lang|es|''o''}} ("or") is written with an accent between numerals so as not to be confused with a zero: e.g., {{lang|es|''10 ó 20''}} should be read as {{lang|es|''diez o veinte''}} rather than {{lang|es|''diez mil veinte''}} ("10,020").@@@@1@37@@danf@17-8-2009
10781770@unknown@formal@none@1@S@Accent marks are frequently omitted in capital letters (a widespread practice in the early days of computers when only lowercase vowels were available with accents), although the [[Real Academia Española|RAE]] advises against this.@@@@1@33@@danf@17-8-2009
10781780@unknown@formal@none@1@S@When ''u'' is written between ''g'' and a front vowel (''e'' or ''i''), if it should be pronounced, it is written with a [[diaeresis (diacritic)|diaeresis]] (''ü'') to indicate that it is not silent as it normally would be (e.g., ''cigüeña'', "stork", is pronounced {{IPA|/θiˈɣweɲa/}}; if it were written ''cigueña'', it would be pronounced {{IPA|/θiˈɣeɲa/}}).@@@@1@54@@danf@17-8-2009
10781790@unknown@formal@none@1@S@Interrogative and exclamatory clauses are introduced with [[Inverted question and exclamation marks|inverted question ( ¿ ) and exclamation ( ¡ ) marks]].@@@@1@22@@danf@17-8-2009
10781800@unknown@formal@none@1@S@==Sounds==@@@@1@1@@danf@17-8-2009
10781810@unknown@formal@none@1@S@The phonemic inventory listed in the following table includes [[phoneme]]s that are preserved only in some dialects, other dialects having merged them (such as ''[[yeísmo]]''); these are marked with an asterisk (*).@@@@1@32@@danf@17-8-2009
10781820@unknown@formal@none@1@S@Sounds in parentheses are [[allophone]]s.@@@@1@5@@danf@17-8-2009
10781830@unknown@formal@none@1@S@By the 16th century, the consonant system of Spanish underwent the following important changes that differentiated it from [[Iberian Romance languages|neighboring Romance languages]] such as [[Portuguese language|Portuguese]] and [[Catalan language|Catalan]]:@@@@1@30@@danf@17-8-2009
10781840@unknown@formal@none@1@S@*Initial {{IPA|/f/}}, when it had evolved into a vacillating {{IPA|/h/}}, was lost in most words (although this etymological ''h-'' is preserved in spelling and in some Andalusian dialects is still aspirated).@@@@1@31@@danf@17-8-2009
10781850@unknown@formal@none@1@S@*The [[bilabial approximant]] {{IPA|/β̞/}} (which was written ''u'' or ''v'') merged with the bilabial occlusive {{IPA|/b/}} (written ''b'').@@@@1@18@@danf@17-8-2009
10781860@unknown@formal@none@1@S@There is no difference between the pronunciation of orthographic ''b'' and ''v'' in contemporary Spanish, excepting emphatic pronunciations that cannot be considered standard or natural.@@@@1@25@@danf@17-8-2009
10781870@unknown@formal@none@1@S@*The [[voiced alveolar fricative]] {{IPA|/z/}} which existed as a separate phoneme in medieval Spanish merged with its voiceless counterpart {{IPA|/s/}}.@@@@1@20@@danf@17-8-2009
10781880@unknown@formal@none@1@S@The phoneme which resulted from this merger is currently spelled ''s''.@@@@1@11@@danf@17-8-2009
10781890@unknown@formal@none@1@S@*The [[voiced postalveolar fricative]] {{IPA|/ʒ/}} merged with its voiceless counterpart {{IPA|/ʃ/}}, which evolved into the modern velar sound {{IPA|/x/}} by the 17th century, now written with ''j'', or ''g'' before ''e, i''.@@@@1@32@@danf@17-8-2009
10781900@unknown@formal@none@1@S@Nevertheless, in most parts of Argentina and in Uruguay, ''y'' and ''ll'' have both evolved to {{IPA|/ʒ/}} or {{IPA|/ʃ/}}.@@@@1@19@@danf@17-8-2009
10781910@unknown@formal@none@1@S@*The [[voiced alveolar affricate]] {{IPA|/dz/}} merged with its voiceless counterpart {{IPA|/ts/}}, which then developed into the interdental {{IPA|/θ/}}, now written ''z'', or ''c'' before ''e, i''.@@@@1@26@@danf@17-8-2009
10781920@unknown@formal@none@1@S@But in [[Andalusia]], the [[Canary Islands]] and the Americas this sound merged with {{IPA|/s/}} as well.@@@@1@16@@danf@17-8-2009
10781930@unknown@formal@none@1@S@See ''[[Ceceo]]'' for further information.@@@@1@5@@danf@17-8-2009
10781940@unknown@formal@none@1@S@The consonant system of Medieval Spanish has been better preserved in [[Ladino language|Ladino]] and in Portuguese, neither of which underwent these shifts.@@@@1@22@@danf@17-8-2009
10781950@unknown@formal@none@1@S@===Lexical stress===@@@@1@2@@danf@17-8-2009
10781960@unknown@formal@none@1@S@Spanish is a [[syllable-timed language]], so each syllable has roughly the same duration regardless of stress.@@@@1@16@@danf@17-8-2009
10781970@unknown@formal@none@1@S@Stress most often occurs on any of the last three syllables of a word, with some rare exceptions at the fourth-to-last syllable.@@@@1@22@@danf@17-8-2009
10781980@unknown@formal@none@1@S@The ''tendencies'' of stress assignment are as follows:@@@@1@8@@danf@17-8-2009
10781990@unknown@formal@none@1@S@* In words ending in vowels and {{IPA|/s/}}, stress most often falls on the penultimate syllable.@@@@1@16@@danf@17-8-2009
10782000@unknown@formal@none@1@S@* In words ending in all other consonants, the stress more often falls on the ultimate syllable.@@@@1@17@@danf@17-8-2009
10782010@unknown@formal@none@1@S@* Preantepenultimate stress occurs rarely and only in words like ''guardándoselos'' ('saving them for him/her') where a clitic follows certain verbal forms.@@@@1@22@@danf@17-8-2009
10782020@unknown@formal@none@1@S@In addition to the many exceptions to these tendencies, there are numerous [[minimal pair]]s which contrast solely on stress.@@@@1@19@@danf@17-8-2009
10782030@unknown@formal@none@1@S@For example, ''sabana'', with penultimate stress, means 'savannah' while ''{{lang|es|sábana}}'', with antepenultimate stress, means 'sheet'; ''{{lang|es|límite}}'' ('boundary'), ''{{lang|es|limite}}'' ('[that] he/she limits') and ''{{lang|es|limité}}'' ('I limited') also contrast solely on stress.@@@@1@30@@danf@17-8-2009
10782040@unknown@formal@none@1@S@Phonological stress may be marked orthographically with an [[acute accent]] (''ácido'', ''distinción'', etc.).@@@@1@13@@danf@17-8-2009
10782050@unknown@formal@none@1@S@This is done according to the mandatory stress rules of [[Spanish orthography]] which are similar to the tendencies above (differing with words like ''distinción'') and are defined so as to unequivocally indicate where the stress lies in a given written word.@@@@1@41@@danf@17-8-2009
10782060@unknown@formal@none@1@S@An acute accent may also be used to differentiate homophones (such as ''[[wikt:té#Spanish|té]]'' for 'tea' and ''[[wikt:te#Spanish|te]]'' for the object pronoun 'you').@@@@1@22@@danf@17-8-2009
10782070@unknown@formal@none@1@S@An amusing example of the significance of intonation in Spanish is the phrase ''{{lang|es|¿Cómo "cómo como"?@@@@1@16@@danf@17-8-2009
10782080@unknown@formal@none@1@S@¡Como como como!}}''@@@@1@3@@danf@17-8-2009
10782090@unknown@formal@none@1@S@("What do you mean / 'how / do I eat'? / I eat / the way / I eat!").@@@@1@19@@danf@17-8-2009
10782100@unknown@formal@none@1@S@==Grammar==@@@@1@1@@danf@17-8-2009
10782110@unknown@formal@none@1@S@Spanish is a relatively [[inflected]] language, with a two-[[Grammatical gender|gender]] system and about fifty [[Grammatical conjugation|conjugated]] forms per [[verb]], but limited inflection of [[noun]]s, [[adjective]]s, and [[determiner]]s.@@@@1@27@@danf@17-8-2009
10782120@unknown@formal@none@1@S@(For a detailed overview of verbs, see [[Spanish verbs]] and [[Spanish irregular verbs]].)@@@@1@13@@danf@17-8-2009
10782130@unknown@formal@none@1@S@It is [[Branching (linguistics)|right-branching]], uses [[preposition]]s, and usually, though not always, places [[adjective]]s after [[noun]]s.@@@@1@15@@danf@17-8-2009
10782140@unknown@formal@none@1@S@Its [[syntax]] is generally [[Subject Verb Object]], though variations are common.@@@@1@11@@danf@17-8-2009
10782150@unknown@formal@none@1@S@It is a [[pro-drop language]] (allows the deletion of pronouns when pragmatically unnecessary) and [[verb framing|verb-framed]].@@@@1@16@@danf@17-8-2009
10782160@unknown@formal@none@1@S@== Samples ==@@@@1@3@@danf@17-8-2009
10790010@unknown@formal@none@1@S@Speech recognition@@@@1@2@@danf@17-8-2009
10790020@unknown@formal@none@1@S@'''Speech recognition''' (also known as '''automatic speech recognition''' or '''computer speech recognition''') converts spoken words to machine-readable input (for example, to keypresses, using the binary code for a string of [[Character (computing)|character]] codes).@@@@1@33@@danf@17-8-2009
10790030@unknown@formal@none@1@S@The term [[speaker recognition|voice recognition]] may also be used to refer to speech recognition, but more precisely refers to '''speaker recognition''', which attempts to identify the person speaking, as opposed to what is being said.@@@@1@35@@danf@17-8-2009
10790040@unknown@formal@none@1@S@Speech recognition applications include voice dialing (e.g., "Call home"), call routing (e.g., "I would like to make a collect call"), [[domotic]] appliance control, content-based spoken audio search (e.g., find a podcast where particular words were spoken), simple data entry (e.g., entering a credit card number), preparation of structured documents (e.g., a radiology report), speech-to-text processing (e.g., [[word processor]]s or [[email]]s), and use in aircraft [[cockpit]]s (usually termed [[Direct Voice Input]]).@@@@1@70@@danf@17-8-2009
10790050@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10790060@unknown@formal@none@1@S@One of the most notable domains for the commercial application of speech recognition in the United States has been health care and in particular the work of the [[medical transcription]]ist (MT).@@@@1@31@@danf@17-8-2009
10790070@unknown@formal@none@1@S@According to industry experts, at its inception, speech recognition (SR) was sold as a way to completely eliminate transcription rather than make the transcription process more efficient, hence it was not accepted.@@@@1@32@@danf@17-8-2009
10790080@unknown@formal@none@1@S@It was also the case that SR at that time was often technically deficient.@@@@1@14@@danf@17-8-2009
10790090@unknown@formal@none@1@S@Additionally, to be used effectively, it required changes to the ways physicians worked and documented clinical encounters, which many if not all were reluctant to do.@@@@1@26@@danf@17-8-2009
10790100@unknown@formal@none@1@S@The biggest limitation to speech recognition automating transcription, however, is seen as the software.@@@@1@14@@danf@17-8-2009
10790110@unknown@formal@none@1@S@The nature of narrative dictation is highly interpretive and often requires judgment that may be provided by a real human but not yet by an automated system.@@@@1@27@@danf@17-8-2009
10790120@unknown@formal@none@1@S@Another limitation has been the extensive amount of time required by the user and/or system provider to train the software.@@@@1@20@@danf@17-8-2009
10790130@unknown@formal@none@1@S@A distinction in ASR is often made between "artificial syntax systems" which are usually domain-specific and "natural language processing" which is usually language-specific.@@@@1@23@@danf@17-8-2009
10790140@unknown@formal@none@1@S@Each of these types of application presents its own particular goals and challenges.@@@@1@13@@danf@17-8-2009
10790150@unknown@formal@none@1@S@==Applications==@@@@1@1@@danf@17-8-2009
10790160@unknown@formal@none@1@S@===Health care===@@@@1@2@@danf@17-8-2009
10790170@unknown@formal@none@1@S@In the [[health care]] domain, even in the wake of improving speech recognition technologies, medical transcriptionists (MTs) have not yet become obsolete.@@@@1@22@@danf@17-8-2009
10790180@unknown@formal@none@1@S@Many experts in the field anticipate that with increased use of speech recognition technology, the services provided may be redistributed rather than replaced.@@@@1@23@@danf@17-8-2009
10790190@unknown@formal@none@1@S@Speech recognition can be implemented at the front end or back end of the medical documentation process.@@@@1@17@@danf@17-8-2009
10790200@unknown@formal@none@1@S@Front-End SR is where the provider dictates into a speech-recognition engine, the recognized words are displayed right after they are spoken, and the dictator is responsible for editing and signing off on the document.@@@@1@34@@danf@17-8-2009
10790210@unknown@formal@none@1@S@It never goes through an MT/editor.@@@@1@6@@danf@17-8-2009
10790220@unknown@formal@none@1@S@Back-End SR or Deferred SR is where the provider dictates into a digital dictation system; the voice is routed through a speech-recognition machine, and the recognized draft document is routed, along with the original voice file, to the MT/editor, who edits the draft and finalizes the report.@@@@1@47@@danf@17-8-2009
10790230@unknown@formal@none@1@S@Deferred SR is currently widely used in the industry.@@@@1@9@@danf@17-8-2009
10790240@unknown@formal@none@1@S@Many [[Electronic Medical Records]] (EMR) applications can be more effective and may be performed more easily when deployed in conjunction with a speech-recognition engine.@@@@1@24@@danf@17-8-2009
10790250@unknown@formal@none@1@S@Searches, queries, and form filling may all be faster to perform by voice than by using a keyboard.@@@@1@18@@danf@17-8-2009
10790290@unknown@formal@none@1@S@===Military===@@@@1@1@@danf@17-8-2009
10790300@unknown@formal@none@1@S@====High-performance fighter aircraft====@@@@1@3@@danf@17-8-2009
10790310@unknown@formal@none@1@S@Substantial efforts have been devoted in the last decade to the test and evaluation of speech recognition in fighter aircraft.@@@@1@20@@danf@17-8-2009
10790320@unknown@formal@none@1@S@Of particular note are the U.S. program in speech recognition for the Advanced Fighter Technology Integration (AFTI)/[[F-16]] aircraft ([[F-16 VISTA]]), the program in France on installing speech recognition systems on [[Mirage (aircraft)|Mirage]] aircraft, and programs in the UK dealing with a variety of aircraft platforms.@@@@1@45@@danf@17-8-2009
10790330@unknown@formal@none@1@S@In these programs, speech recognizers have been operated successfully in fighter aircraft with applications including: setting radio frequencies, commanding an autopilot system, setting steer-point coordinates and weapons release parameters, and controlling flight displays.@@@@1@33@@danf@17-8-2009
10790340@unknown@formal@none@1@S@Generally, only very limited, constrained vocabularies have been used successfully, and a major effort has been devoted to integration of the speech recognizer with the avionics system.@@@@1@27@@danf@17-8-2009
10790350@unknown@formal@none@1@S@Some important conclusions from the work were as follows:@@@@1@9@@danf@17-8-2009
10790360@unknown@formal@none@1@S@#Speech recognition has definite potential for reducing pilot workload, but this potential was not realized consistently.@@@@1@16@@danf@17-8-2009
10790370@unknown@formal@none@1@S@#Achievement of very high recognition accuracy (95% or more) was the most critical factor for making the speech recognition system useful — with lower recognition rates, pilots would not use the system.@@@@1@31@@danf@17-8-2009
10790380@unknown@formal@none@1@S@#More natural vocabulary and grammar, and shorter training times would be useful, but only if very high recognition rates could be maintained.@@@@1@22@@danf@17-8-2009
10790390@unknown@formal@none@1@S@Laboratory research in robust speech recognition for military environments has produced promising results which, if extendable to the cockpit, should improve the utility of speech recognition in high-performance aircraft.@@@@1@29@@danf@17-8-2009
10790400@unknown@formal@none@1@S@Working with Swedish pilots flying in the [[JAS-39]] Gripen cockpit, Englund (2004) found recognition deteriorated with increasing G-loads.@@@@1@18@@danf@17-8-2009
10790410@unknown@formal@none@1@S@It was also concluded that adaptation greatly improved the results in all cases and introducing models for breathing was shown to improve recognition scores significantly.@@@@1@25@@danf@17-8-2009
10790420@unknown@formal@none@1@S@Contrary to what might be expected, no effects of the broken English of the speakers were found.@@@@1@17@@danf@17-8-2009
10790430@unknown@formal@none@1@S@It was evident that spontaneous speech caused problems for the recognizer, as could be expected.@@@@1@15@@danf@17-8-2009
10790440@unknown@formal@none@1@S@A restricted vocabulary, and above all, a proper syntax, could thus be expected to improve recognition accuracy substantially.@@@@1@18@@danf@17-8-2009
10790450@unknown@formal@none@1@S@The [[Eurofighter Typhoon]] currently in service with the UK [[RAF]] employs a speaker-dependent system, i.e. it requires each pilot to create a template.@@@@1@23@@danf@17-8-2009
10790460@unknown@formal@none@1@S@The system is not used for any safety critical or weapon critical tasks, such as weapon release or lowering of the undercarriage, but is used for a wide range of other [[cockpit]] functions.@@@@1@33@@danf@17-8-2009
10790470@unknown@formal@none@1@S@Voice commands are confirmed by visual and/or aural feedback.@@@@1@9@@danf@17-8-2009
10790480@unknown@formal@none@1@S@The system is seen as a major design feature in the reduction of pilot [[workload]], and even allows the pilot to assign targets to himself with two simple voice commands or to any of his wingmen with only five commands.@@@@1@40@@danf@17-8-2009
10790490@unknown@formal@none@1@S@====Helicopters====@@@@1@1@@danf@17-8-2009
10790500@unknown@formal@none@1@S@The problems of achieving high recognition accuracy under stress and noise pertain strongly to the helicopter environment as well as to the fighter environment.@@@@1@24@@danf@17-8-2009
10790510@unknown@formal@none@1@S@The acoustic noise problem is actually more severe in the helicopter environment, not only because of the high noise levels but also because the helicopter pilot generally does not wear a facemask, which would reduce acoustic noise in the microphone.@@@@1@40@@danf@17-8-2009
10790520@unknown@formal@none@1@S@Substantial test and evaluation programs have been carried out in the past decade in speech recognition systems applications in helicopters, notably by the U.S. Army Avionics Research and Development Activity (AVRADA) and by the Royal Aerospace Establishment (RAE) in the UK.@@@@1@41@@danf@17-8-2009
10790530@unknown@formal@none@1@S@Work in France has included speech recognition in the Puma helicopter.@@@@1@11@@danf@17-8-2009
10790540@unknown@formal@none@1@S@There has also been much useful work in Canada.@@@@1@9@@danf@17-8-2009
10790550@unknown@formal@none@1@S@Results have been encouraging, and voice applications have included: control of communication radios; setting of navigation systems; and control of an automated target handover system.@@@@1@25@@danf@17-8-2009
10790560@unknown@formal@none@1@S@As in fighter applications, the overriding issue for voice in helicopters is the impact on pilot effectiveness.@@@@1@17@@danf@17-8-2009
10790570@unknown@formal@none@1@S@Encouraging results are reported for the AVRADA tests, although these represent only a feasibility demonstration in a test environment.@@@@1@19@@danf@17-8-2009
10790580@unknown@formal@none@1@S@Much remains to be done, both in speech recognition and in overall speech technology, in order to consistently achieve performance improvements in operational settings.@@@@1@24@@danf@17-8-2009
10790590@unknown@formal@none@1@S@====Battle management====@@@@1@2@@danf@17-8-2009
10790600@unknown@formal@none@1@S@Battle management command centres generally require rapid access to and control of large, rapidly changing information databases.@@@@1@17@@danf@17-8-2009
10790610@unknown@formal@none@1@S@Commanders and system operators need to query these databases as conveniently as possible, in an eyes-busy environment where much of the information is presented in a display format.@@@@1@28@@danf@17-8-2009
10790620@unknown@formal@none@1@S@Human machine interaction by voice has the potential to be very useful in these environments.@@@@1@15@@danf@17-8-2009
10790630@unknown@formal@none@1@S@A number of efforts have been undertaken to interface commercially available isolated-word recognizers into battle management environments.@@@@1@17@@danf@17-8-2009
10790640@unknown@formal@none@1@S@In one feasibility study, speech recognition equipment was tested in conjunction with an integrated information display for naval battle management applications.@@@@1@21@@danf@17-8-2009
10790650@unknown@formal@none@1@S@Users were very optimistic about the potential of the system, although capabilities were limited.@@@@1@14@@danf@17-8-2009
10790660@unknown@formal@none@1@S@Speech understanding programs sponsored by the Defense Advanced Research Projects Agency (DARPA) in the U.S. have focused on this problem of natural speech interface.@@@@1@24@@danf@17-8-2009
10790670@unknown@formal@none@1@S@Continuous speech recognition (CSR) efforts have focused on a database of large-vocabulary speech designed to be representative of the naval resource management task.@@@@1@23@@danf@17-8-2009
10790680@unknown@formal@none@1@S@Significant advances in the state-of-the-art in CSR have been achieved, and current efforts are focused on integrating speech recognition and natural language processing to allow spoken language interaction with a naval resource management system.@@@@1@34@@danf@17-8-2009
10790690@unknown@formal@none@1@S@====Training air traffic controllers====@@@@1@4@@danf@17-8-2009
10790700@unknown@formal@none@1@S@Training for military (or civilian) air traffic controllers (ATC) represents an excellent application for speech recognition systems.@@@@1@17@@danf@17-8-2009
10790710@unknown@formal@none@1@S@Many ATC training systems currently require a person to act as a "pseudo-pilot", engaging in a voice dialog with the trainee controller, which simulates the dialog which the controller would have to conduct with pilots in a real ATC situation.@@@@1@40@@danf@17-8-2009
10790720@unknown@formal@none@1@S@Speech recognition and synthesis techniques offer the potential to eliminate the need for a person to act as pseudo-pilot, thus reducing training and support personnel.@@@@1@25@@danf@17-8-2009
10790730@unknown@formal@none@1@S@Air controller tasks are also characterized by highly structured speech as the primary output of the controller, hence reducing the difficulty of the speech recognition task.@@@@1@26@@danf@17-8-2009
10790740@unknown@formal@none@1@S@The U.S. Naval Training Equipment Center has sponsored a number of developments of prototype ATC trainers using speech recognition.@@@@1@19@@danf@17-8-2009
10790750@unknown@formal@none@1@S@Generally, the recognition accuracy falls short of providing graceful interaction between the trainee and the system.@@@@1@16@@danf@17-8-2009
10790760@unknown@formal@none@1@S@However, the prototype training systems have demonstrated a significant potential for voice interaction in these systems, and in other training applications.@@@@1@21@@danf@17-8-2009
10790770@unknown@formal@none@1@S@The U.S. Navy has sponsored a large-scale effort in ATC training systems, where a commercial speech recognition unit was integrated with a complex training system including displays and scenario creation.@@@@1@30@@danf@17-8-2009
10790780@unknown@formal@none@1@S@Although the recognizer was constrained in vocabulary, one of the goals of the training programs was to teach the controllers to speak in a constrained language, using specific vocabulary specifically designed for the ATC task.@@@@1@35@@danf@17-8-2009
10790790@unknown@formal@none@1@S@Research in France has focussed on the application of speech recognition in ATC training systems, directed at issues both in speech recognition and in application of task-domain grammar constraints.@@@@1@29@@danf@17-8-2009
10790800@unknown@formal@none@1@S@The USAF, USMC, US Army, and FAA are currently using ATC simulators with speech recognition provided by Adacel Systems Inc (ASI).@@@@1@21@@danf@17-8-2009
10790810@unknown@formal@none@1@S@Adacel's MaxSim software uses speech recognition and synthetic speech to enable the trainee to control aircraft and ground vehicles in the simulation without the need for pseudo pilots.@@@@1@28@@danf@17-8-2009
10790820@unknown@formal@none@1@S@Adacel's ATC In A Box Software provides a synthetic ATC environment for flight simulators.@@@@1@14@@danf@17-8-2009
10790830@unknown@formal@none@1@S@The "real" pilot talks to a virtual controller using speech recognition and the virtual controller responds with synthetic speech.@@@@1@19@@danf@17-8-2009
10790850@unknown@formal@none@1@S@===Telephony and other domains===@@@@1@4@@danf@17-8-2009
10790860@unknown@formal@none@1@S@ASR in the field of telephony is now commonplace and in the field of computer gaming and simulation is becoming more widespread.@@@@1@22@@danf@17-8-2009
10790870@unknown@formal@none@1@S@Despite the high level of integration with word processing in general personal computing, however, ASR in the field of document production has not seen the expected increases in use.@@@@1@29@@danf@17-8-2009
10790880@unknown@formal@none@1@S@Improvements in mobile processor speeds have made speech-enabled Symbian and Windows Mobile smartphones possible.@@@@1@14@@danf@17-8-2009
10790890@unknown@formal@none@1@S@Current speech-to-text programs are too large and require too much CPU power to be practical for the Pocket PC.@@@@1@19@@danf@17-8-2009
10790900@unknown@formal@none@1@S@Speech is used mostly as part of the user interface, for creating pre-defined or custom speech commands.@@@@1@17@@danf@17-8-2009
10790910@unknown@formal@none@1@S@Leading software vendors in this field are: Microsoft Corporation (Microsoft Voice Command); Nuance Communications (Nuance Voice Control); Vito Technology (VITO Voice2Go); Speereo Software (Speereo Voice Translator).@@@@1@26@@danf@17-8-2009
10790920@unknown@formal@none@1@S@===People with Disabilities===@@@@1@3@@danf@17-8-2009
10790930@unknown@formal@none@1@S@People with disabilities are another part of the population that benefit from using speech recognition programs.@@@@1@16@@danf@17-8-2009
10790940@unknown@formal@none@1@S@It is especially useful for people who have difficulty with or are unable to use their hands, from mild repetitive stress injuries to involved disabilities that require alternative input for support with accessing the computer.@@@@1@35@@danf@17-8-2009
10790950@unknown@formal@none@1@S@In fact, people who used the keyboard a lot and developed [[Repetitive Strain Injury|RSI]] became an urgent early market for speech recognition.@@@@1@22@@danf@17-8-2009
10790960@unknown@formal@none@1@S@Speech recognition is used in [[deaf]] [[telephony]], such as [[spinvox]] voice-to-text voicemail, [[relay services]], and [[Telecommunications Relay Service#Captioned_telephone|captioned telephone]].@@@@1@19@@danf@17-8-2009
10790970@unknown@formal@none@1@S@===Further applications===@@@@1@2@@danf@17-8-2009
10790980@unknown@formal@none@1@S@*Automatic translation@@@@1@2@@danf@17-8-2009
10790990@unknown@formal@none@1@S@*Automotive speech recognition (e.g., [[Ford Sync]])@@@@1@6@@danf@17-8-2009
10791000@unknown@formal@none@1@S@*Telematics (e.g. vehicle Navigation Systems)@@@@1@5@@danf@17-8-2009
10791010@unknown@formal@none@1@S@*Court reporting (Realtime Voice Writing)@@@@1@5@@danf@17-8-2009
10791020@unknown@formal@none@1@S@*[[Hands-free computing]]: voice command recognition computer [[user interface]]@@@@1@8@@danf@17-8-2009
10791030@unknown@formal@none@1@S@*[[Home automation]]@@@@1@2@@danf@17-8-2009
10791040@unknown@formal@none@1@S@*[[Interactive voice response]]@@@@1@3@@danf@17-8-2009
10791050@unknown@formal@none@1@S@*[[Mobile telephony]], including mobile email@@@@1@5@@danf@17-8-2009
10791060@unknown@formal@none@1@S@*[[Multimodal interaction]]@@@@1@2@@danf@17-8-2009
10791070@unknown@formal@none@1@S@*[[Pronunciation]] evaluation in computer-aided language learning applications@@@@1@7@@danf@17-8-2009
10791080@unknown@formal@none@1@S@*[[Robotics]]@@@@1@1@@danf@17-8-2009
10791090@unknown@formal@none@1@S@*[[Transcription (linguistics)|Transcription]] (digital speech-to-text).@@@@1@4@@danf@17-8-2009
10791100@unknown@formal@none@1@S@*Speech-to-Text (Transcription of speech into mobile text messages)@@@@1@8@@danf@17-8-2009
10791110@unknown@formal@none@1@S@==Performance of speech recognition systems==@@@@1@5@@danf@17-8-2009
10791120@unknown@formal@none@1@S@The performance of speech recognition systems is usually specified in terms of accuracy and speed.@@@@1@15@@danf@17-8-2009
10791130@unknown@formal@none@1@S@Accuracy is usually rated with [[word error rate]] (WER), whereas speed is measured with the [[real time factor]].@@@@1@18@@danf@17-8-2009
10791140@unknown@formal@none@1@S@Other measures of accuracy include [[Single Word Error Rate]] (SWER) and [[Command Success Rate]] (CSR).@@@@1@15@@danf@17-8-2009
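Word error rate is conventionally computed as (S + D + I) / N, where S, D and I are the numbers of substituted, deleted and inserted words in the recognizer output and N is the number of words in the reference transcript. The following is a minimal sketch of that calculation, not the scoring tool of any particular system; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference length.

    Hypothetical helper: words are split on whitespace and compared exactly.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference
    # prefix processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One word ("two") is dropped from a six-word reference.
    print(word_error_rate("set radio frequency one two one",
                          "set radio frequency one one"))
```

Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.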
10791150@unknown@formal@none@1@S@Most speech recognition users would tend to agree that dictation machines can achieve very high performance in controlled conditions.@@@@1@19@@danf@17-8-2009
10791160@unknown@formal@none@1@S@There is some confusion, however, over the interchangeability of the terms "speech recognition" and "dictation".@@@@1@15@@danf@17-8-2009
10791170@unknown@formal@none@1@S@Commercially available speaker-dependent dictation systems usually require only a short period of training (sometimes also called "enrollment") and may successfully capture continuous speech with a large vocabulary at a normal pace with very high accuracy.@@@@1@35@@danf@17-8-2009
10791180@unknown@formal@none@1@S@Most commercial companies claim that recognition software can achieve between 98% and 99% accuracy if operated under optimal conditions.@@@@1@19@@danf@17-8-2009
10791190@unknown@formal@none@1@S@"Optimal conditions" usually assume that users:@@@@1@6@@danf@17-8-2009
10791200@unknown@formal@none@1@S@* have speech characteristics which match the training data,@@@@1@9@@danf@17-8-2009
10791210@unknown@formal@none@1@S@* can achieve proper speaker adaptation, and@@@@1@7@@danf@17-8-2009
10791220@unknown@formal@none@1@S@* work in a clean noise environment (e.g. quiet office or laboratory space).@@@@1@13@@danf@17-8-2009
10791230@unknown@formal@none@1@S@This explains why some users, especially those whose speech is heavily accented, might achieve recognition rates much lower than expected.@@@@1@20@@danf@17-8-2009
10791240@unknown@formal@none@1@S@Speech recognition in video has become a popular search technology used by several video search companies.@@@@1@16@@danf@17-8-2009
10791250@unknown@formal@none@1@S@Limited vocabulary systems, requiring no training, can recognize a small number of words (for instance, the ten digits) as spoken by most speakers.@@@@1@23@@danf@17-8-2009
10791260@unknown@formal@none@1@S@Such systems are popular for routing incoming phone calls to their destinations in large organizations.@@@@1@15@@danf@17-8-2009
10791270@unknown@formal@none@1@S@Both [[Acoustic Model|acoustic modeling]] and [[language model]]ing are important parts of modern statistically-based speech recognition algorithms.@@@@1@16@@danf@17-8-2009
10791280@unknown@formal@none@1@S@Hidden Markov models (HMMs) are widely used in many systems.@@@@1@10@@danf@17-8-2009
10791290@unknown@formal@none@1@S@Language modeling has many other applications such as [[smart keyboard]] and [[document classification]].@@@@1@13@@danf@17-8-2009
10791300@unknown@formal@none@1@S@===Hidden Markov model (HMM)-based speech recognition===@@@@1@6@@danf@17-8-2009
10791310@unknown@formal@none@1@S@Modern general-purpose speech recognition systems are generally based on [[Hidden Markov Model|HMMs]].@@@@1@12@@danf@17-8-2009
10791320@unknown@formal@none@1@S@These are statistical models which output a sequence of symbols or quantities.@@@@1@12@@danf@17-8-2009
10791330@unknown@formal@none@1@S@One possible reason why HMMs are used in speech recognition is that a speech signal could be viewed as a piecewise stationary signal or a short-time stationary signal.@@@@1@28@@danf@17-8-2009
10791340@unknown@formal@none@1@S@That is, over a short time in the range of 10 milliseconds, speech could be approximated as a [[stationary process]].@@@@1@20@@danf@17-8-2009
10791350@unknown@formal@none@1@S@Speech could thus be thought of as a [[Markov model]] for many stochastic processes.@@@@1@14@@danf@17-8-2009
10791360@unknown@formal@none@1@S@Another reason why HMMs are popular is that they can be trained automatically and are simple and computationally feasible to use.@@@@1@21@@danf@17-8-2009
10791370@unknown@formal@none@1@S@In speech recognition, the hidden Markov model would output a sequence of ''n''-dimensional real-valued vectors (with ''n'' being a small integer, such as 10), outputting one of these every 10 milliseconds.@@@@1@31@@danf@17-8-2009
10791380@unknown@formal@none@1@S@The vectors would consist of [[cepstrum|cepstral]] coefficients, which are obtained by taking a [[Fourier transform]] of a short time window of speech and decorrelating the spectrum using a [[cosine transform]], then taking the first (most significant) coefficients.@@@@1@37@@danf@17-8-2009
10791390@unknown@formal@none@1@S@The hidden Markov model will tend to have in each state a statistical distribution that is a mixture of diagonal covariance Gaussians which will give a likelihood for each observed vector.@@@@1@31@@danf@17-8-2009
10791400@unknown@formal@none@1@S@Each word, or (for more general speech recognition systems), each [[phoneme]], will have a different output distribution; a hidden Markov model for a sequence of words or phonemes is made by concatenating the individual trained hidden Markov models for the separate words and phonemes.@@@@1@44@@danf@17-8-2009
10791410@unknown@formal@none@1@S@Described above are the core elements of the most common, HMM-based approach to speech recognition.@@@@1@15@@danf@17-8-2009
10791420@unknown@formal@none@1@S@Modern speech recognition systems use various combinations of a number of standard techniques in order to improve results over the basic approach described above.@@@@1@24@@danf@17-8-2009
10791430@unknown@formal@none@1@S@A typical large-vocabulary system would need context dependency for the phonemes (so phonemes with different left and right context have different realizations as HMM states); it would use cepstral normalization to normalize for different speaker and recording conditions; for further speaker normalization it might use vocal tract length normalization (VTLN) for male-female normalization and maximum likelihood linear regression (MLLR) for more general speaker adaptation.@@@@1@64@@danf@17-8-2009
10791440@unknown@formal@none@1@S@The features would have so-called delta and delta-delta coefficients to capture speech dynamics and in addition might use heteroscedastic linear discriminant analysis (HLDA); or might skip the delta and delta-delta coefficients and use splicing and an LDA-based projection followed perhaps by heteroscedastic linear discriminant analysis or a global semitied covariance transform (also known as maximum likelihood linear transform, or MLLT).@@@@1@60@@danf@17-8-2009
10791450@unknown@formal@none@1@S@Many systems use so-called discriminative training techniques which dispense with a purely statistical approach to HMM parameter estimation and instead optimize some classification-related measure of the training data.@@@@1@28@@danf@17-8-2009
10791460@unknown@formal@none@1@S@Examples are maximum [[mutual information]] (MMI), minimum classification error (MCE) and minimum phone error (MPE).@@@@1@15@@danf@17-8-2009
10791470@unknown@formal@none@1@S@Decoding of the speech (the term for what happens when the system is presented with a new utterance and must compute the most likely source sentence) would probably use the [[Viterbi algorithm]] to find the best path, and here there is a choice between dynamically creating a combination hidden Markov model which includes both the acoustic and language model information, or combining it statically beforehand (the [[finite state transducer]], or FST, approach).@@@@1@72@@danf@17-8-2009
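The decoding step can be illustrated with a small Viterbi search. This is only a sketch under simplifying assumptions: it uses a two-state HMM with discrete observation symbols rather than the Gaussian-mixture output distributions described above, and the state names, symbols and probabilities are invented for the example.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely state sequence, its joint probability)."""
    # best[t][s] = (probability of the best path ending in state s at
    # time t, predecessor state on that path).
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


# Hypothetical toy model: two hidden states and three quantized
# acoustic symbols. All numbers are illustrative, not measured.
STATES = ("vowel", "consonant")
START_P = {"vowel": 0.6, "consonant": 0.4}
TRANS_P = {
    "vowel": {"vowel": 0.7, "consonant": 0.3},
    "consonant": {"vowel": 0.4, "consonant": 0.6},
}
EMIT_P = {
    "vowel": {"low": 0.5, "mid": 0.4, "high": 0.1},
    "consonant": {"low": 0.1, "mid": 0.3, "high": 0.6},
}

if __name__ == "__main__":
    print(viterbi(["low", "mid", "high"], STATES, START_P, TRANS_P, EMIT_P))
```

A practical decoder would work with log-probabilities to avoid numerical underflow on long utterances and would prune unlikely paths rather than keep every state at every frame.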
10791480@unknown@formal@none@1@S@===Dynamic time warping (DTW)-based speech recognition===@@@@1@6@@danf@17-8-2009
10791490@unknown@formal@none@1@S@Dynamic time warping is an approach that was historically used for speech recognition but has now largely been displaced by the more successful HMM-based approach.@@@@1@25@@danf@17-8-2009
10791500@unknown@formal@none@1@S@Dynamic time warping is an algorithm for measuring similarity between two sequences which may vary in time or speed.@@@@1@19@@danf@17-8-2009
10791510@unknown@formal@none@1@S@For instance, similarities in walking patterns would be detected, even if in one video the person was walking slowly and if in another they were walking more quickly, or even if there were accelerations and decelerations during the course of one observation.@@@@1@42@@danf@17-8-2009
10791520@unknown@formal@none@1@S@DTW has been applied to video, audio, and graphics – indeed, any data which can be turned into a linear representation can be analyzed with DTW.@@@@1@25@@danf@17-8-2009
10791530@unknown@formal@none@1@S@A well known application has been automatic speech recognition, to cope with different speaking speeds.@@@@1@15@@danf@17-8-2009
10791540@unknown@formal@none@1@S@In general, it is a method that allows a computer to find an optimal match between two given sequences (e.g. time series) with certain restrictions, i.e. the sequences are "warped" non-linearly to match each other.@@@@1@35@@danf@17-8-2009
10791550@unknown@formal@none@1@S@This sequence alignment method is often used in the context of hidden Markov models.@@@@1@14@@danf@17-8-2009
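The non-linear alignment described in this section can be sketched with the classic dynamic-programming recurrence. The example below is a minimal illustration for one-dimensional numeric sequences; the function name `dtw_distance` and the absolute-difference local cost are assumptions, and a speech system would compare feature vectors instead of single numbers.

```python
import math


def dtw_distance(a, b):
    """Return the minimum cumulative cost of warping sequence a onto b."""
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances while b repeats an element
                cost[i][j - 1],      # b advances while a repeats an element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


if __name__ == "__main__":
    # The same pattern "spoken" at half speed aligns with zero cost.
    print(dtw_distance([1, 2, 3], [1, 1, 2, 2, 3, 3]))
```

This unconstrained version takes time proportional to the product of the two sequence lengths; the "certain restrictions" mentioned above are commonly imposed as a band that limits how far the warping path may stray from the diagonal.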
10791560@unknown@formal@none@1@S@==Further information==@@@@1@2@@danf@17-8-2009
10791570@unknown@formal@none@1@S@Popular speech recognition conferences held each year or two include ICASSP, Eurospeech/ICSLP (now named Interspeech) and the IEEE ASRU.@@@@1@19@@danf@17-8-2009
10791580@unknown@formal@none@1@S@Conferences in the field of [[Natural Language Processing]], such as ACL, NAACL, EMNLP, and HLT, are beginning to include papers on speech processing.@@@@1@23@@danf@17-8-2009
10791590@unknown@formal@none@1@S@Important journals include the [[IEEE]] Transactions on Speech and Audio Processing (now named [[IEEE]] Transactions on Audio, Speech and Language Processing), Computer Speech and Language, and Speech Communication.@@@@1@28@@danf@17-8-2009
10791600@unknown@formal@none@1@S@Books like "Fundamentals of Speech Recognition" by [[Lawrence Rabiner]] can be useful to acquire basic knowledge but may not be fully up to date (1993).@@@@1@25@@danf@17-8-2009
10791610@unknown@formal@none@1@S@Another good source can be "Statistical Methods for Speech Recognition" by Frederick Jelinek which is a more up to date book (1998).@@@@1@22@@danf@17-8-2009
10791620@unknown@formal@none@1@S@Even more up to date is "Computer Speech", by Manfred R. Schroeder, second edition published in 2004.@@@@1@17@@danf@17-8-2009
10791630@unknown@formal@none@1@S@A good insight into the techniques used in the best modern systems can be gained by paying attention to government sponsored evaluations such as those organised by [[DARPA]] (the largest speech recognition-related project ongoing as of 2007 is the GALE project, which involves both speech recognition and translation components).@@@@1@49@@danf@17-8-2009
10791640@unknown@formal@none@1@S@In terms of freely available resources, the [[HTK (software)|HTK]] book (and the accompanying HTK toolkit) is one place to start to both learn about speech recognition and to start experimenting.@@@@1@30@@danf@17-8-2009
10791650@unknown@formal@none@1@S@Another such resource is [[Carnegie Mellon University]]'s SPHINX toolkit.@@@@1@9@@danf@17-8-2009
10791660@unknown@formal@none@1@S@The AT&T libraries [http://www.research.att.com/projects/mohri/fsm FSM Library], [http://www.research.att.com/projects/mohri/grm GRM library], and [http://www.cs.nyu.edu/~mohri DCD library] are also general software libraries for large-vocabulary speech recognition.@@@@1@22@@danf@17-8-2009
10791670@unknown@formal@none@1@S@A useful review of the area of robustness in ASR is provided by Junqua and Haton (1995).@@@@1@17@@danf@17-8-2009
10800010@unknown@formal@none@1@S@Speech synthesis@@@@1@2@@danf@17-8-2009
10800020@unknown@formal@none@1@S@'''Speech synthesis''' is the artificial production of human [[Speech communication|speech]].@@@@1@10@@danf@17-8-2009
10800030@unknown@formal@none@1@S@A computer system used for this purpose is called a '''speech synthesizer''', and can be implemented in [[software]] or [[Computer hardware|hardware]].@@@@1@21@@danf@17-8-2009
10800040@unknown@formal@none@1@S@A '''text-to-speech (TTS)''' system converts normal language text into speech; other systems render [[symbolic linguistic representation]]s like [[phonetic transcription]]s into speech.@@@@1@21@@danf@17-8-2009
10800050@unknown@formal@none@1@S@Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a [[database]].@@@@1@17@@danf@17-8-2009
10800060@unknown@formal@none@1@S@Systems differ in the size of the stored speech units; a system that stores [[phone]]s or [[diphone]]s provides the largest output range, but may lack clarity.@@@@1@26@@danf@17-8-2009
10800070@unknown@formal@none@1@S@For specific usage domains, the storage of entire words or sentences allows for high-quality output.@@@@1@15@@danf@17-8-2009
10800080@unknown@formal@none@1@S@Alternatively, a synthesizer can incorporate a model of the [[vocal tract]] and other human voice characteristics to create a completely "synthetic" voice output.@@@@1@23@@danf@17-8-2009
10800090@unknown@formal@none@1@S@The quality of a speech synthesizer is judged by its similarity to the human voice, and by its ability to be understood.@@@@1@22@@danf@17-8-2009
10800100@unknown@formal@none@1@S@An intelligible text-to-speech program allows people with [[visual impairment]]s or [[reading disability|reading disabilities]] to listen to written works on a home computer.@@@@1@22@@danf@17-8-2009
10800110@unknown@formal@none@1@S@Many computer operating systems have included speech synthesizers since the early 1980s.@@@@1@12@@danf@17-8-2009
10800120@unknown@formal@none@1@S@== Overview of text processing ==@@@@1@6@@danf@17-8-2009
10800130@unknown@formal@none@1@S@A text-to-speech system (or "engine") is composed of two parts: a [[front-end]] and a back-end.@@@@1@15@@danf@17-8-2009
10800140@unknown@formal@none@1@S@The front-end has two major tasks.@@@@1@6@@danf@17-8-2009
10800150@unknown@formal@none@1@S@First, it converts raw text containing symbols like numbers and abbreviations into the equivalent of written-out words.@@@@1@17@@danf@17-8-2009
10800160@unknown@formal@none@1@S@This process is often called ''text normalization'', ''pre-processing'', or ''[[tokenization]]''.@@@@1@10@@danf@17-8-2009
10800170@unknown@formal@none@1@S@The front-end then assigns [[phonetic transcription]]s to each word, and divides and marks the text into [[prosody (linguistics)|prosodic units]], like [[phrase]]s, [[clause]]s, and [[sentence (linguistics)|sentence]]s.@@@@1@25@@danf@17-8-2009
10800180@unknown@formal@none@1@S@The process of assigning phonetic transcriptions to words is called ''text-to-phoneme'' or ''[[grapheme]]-to-phoneme'' conversion.@@@@1@14@@danf@17-8-2009
10800190@unknown@formal@none@1@S@Phonetic transcriptions and prosody information together make up the symbolic linguistic representation that is output by the front-end.@@@@1@18@@danf@17-8-2009
10800200@unknown@formal@none@1@S@The back-end—often referred to as the ''synthesizer''—then converts the symbolic linguistic representation into sound.@@@@1@14@@danf@17-8-2009
10800210@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009
10800220@unknown@formal@none@1@S@Long before [[electronics|electronic]] [[signal processing]] was invented, there were those who tried to build machines to create human speech.@@@@1@19@@danf@17-8-2009
10800230@unknown@formal@none@1@S@Some early legends of the existence of [[Brazen Head|"speaking heads"]] involved [[Pope Silvester II|Gerbert of Aurillac]] (d. 1003 AD), [[Albertus Magnus]] (1198–1280), and [[Roger Bacon]] (1214–1294).@@@@1@26@@danf@17-8-2009
10800240@unknown@formal@none@1@S@In 1779, the [[Denmark|Danish]] scientist Christian Kratzenstein, working at the [[Russian Academy of Sciences]], built models of the human [[vocal tract]] that could produce the five long [[vowel]] sounds (in [[help:IPA|International Phonetic Alphabet]] notation, they are {{IPA|[aː]}}, {{IPA|[eː]}}, {{IPA|[iː]}}, {{IPA|[oː]}} and {{IPA|[uː]}}).@@@@1@42@@danf@17-8-2009
10800250@unknown@formal@none@1@S@This was followed by the [[bellows]]-operated "acoustic-mechanical speech machine" by [[Wolfgang von Kempelen]] of [[Vienna]], [[Austria]], described in a 1791 paper.@@@@1@21@@danf@17-8-2009
10800260@unknown@formal@none@1@S@This machine added models of the tongue and lips, enabling it to produce [[consonant]]s as well as vowels.@@@@1@18@@danf@17-8-2009
10800270@unknown@formal@none@1@S@In 1837, [[Charles Wheatstone]] produced a "speaking machine" based on von Kempelen's design, and in 1857, M. Faber built the "Euphonia".@@@@1@21@@danf@17-8-2009
10800280@unknown@formal@none@1@S@Wheatstone's design was resurrected in 1923 by Paget.@@@@1@8@@danf@17-8-2009
10800290@unknown@formal@none@1@S@In the 1930s, [[Bell Labs]] developed the [[Vocoder|VOCODER]], a keyboard-operated electronic speech analyzer and synthesizer that was said to be clearly intelligible.@@@@1@22@@danf@17-8-2009
10800300@unknown@formal@none@1@S@[[Homer Dudley]] refined this device into the VODER, which he exhibited at the [[1939 New York World's Fair]].@@@@1@18@@danf@17-8-2009
10800310@unknown@formal@none@1@S@The [[Pattern playback]] was built by [[Franklin S. Cooper|Dr. Franklin S. Cooper]] and his colleagues at [[Haskins Laboratories]] in the late 1940s and completed in 1950.@@@@1@26@@danf@17-8-2009
10800320@unknown@formal@none@1@S@There were several different versions of this hardware device but only one currently survives.@@@@1@14@@danf@17-8-2009
10800330@unknown@formal@none@1@S@The machine converts pictures of the acoustic patterns of speech in the form of a spectrogram back into sound.@@@@1@19@@danf@17-8-2009
10800340@unknown@formal@none@1@S@Using this device, [[Alvin Liberman]] and colleagues were able to discover acoustic cues for the perception of [[phonetic]] segments (consonants and vowels).@@@@1@22@@danf@17-8-2009
10800350@unknown@formal@none@1@S@Early electronic speech synthesizers sounded robotic and were often barely intelligible.@@@@1@11@@danf@17-8-2009
10800360@unknown@formal@none@1@S@However, the quality of synthesized speech has steadily improved, and output from contemporary speech synthesis systems is sometimes indistinguishable from actual human speech.@@@@1@23@@danf@17-8-2009
10800370@unknown@formal@none@1@S@=== Electronic devices ===@@@@1@4@@danf@17-8-2009
10800380@unknown@formal@none@1@S@The first computer-based speech synthesis systems were created in the late 1950s, and the first complete text-to-speech system was finished in 1968.@@@@1@22@@danf@17-8-2009
10800390@unknown@formal@none@1@S@In 1961, physicist [[John Larry Kelly, Jr]] and colleague Louis Gerstman used an [[IBM 704]] computer to synthesize speech, an event among the most prominent in the history of [[Bell Labs]].@@@@1@31@@danf@17-8-2009
10800400@unknown@formal@none@1@S@Kelly's voice recorder synthesizer (vocoder) recreated the song "[[Daisy Bell]]", with musical accompaniment from [[Max Mathews]].@@@@1@16@@danf@17-8-2009
10800410@unknown@formal@none@1@S@Coincidentally, [[Arthur C. Clarke]] was visiting his friend and colleague John Pierce at the Bell Labs Murray Hill facility.@@@@1@19@@danf@17-8-2009
10800420@unknown@formal@none@1@S@Clarke was so impressed by the demonstration that he used it in the climactic scene of his screenplay for his novel ''[[2001: A Space Odyssey (novel)|2001: A Space Odyssey]]'', where the [[HAL 9000]] computer sings the same song as it is being put to sleep by astronaut [[Dave Bowman]].@@@@1@49@@danf@17-8-2009
10800430@unknown@formal@none@1@S@Despite the success of purely electronic speech synthesis, research is still being conducted into mechanical speech synthesizers.@@@@1@17@@danf@17-8-2009
10800440@unknown@formal@none@1@S@== Synthesizer technologies ==@@@@1@4@@danf@17-8-2009
10800450@unknown@formal@none@1@S@The most important qualities of a speech synthesis system are ''naturalness'' and ''[[Intelligibility]]''.@@@@1@13@@danf@17-8-2009
10800460@unknown@formal@none@1@S@Naturalness describes how closely the output sounds like human speech, while intelligibility is the ease with which the output is understood.@@@@1@21@@danf@17-8-2009
10800470@unknown@formal@none@1@S@The ideal speech synthesizer is both natural and intelligible.@@@@1@9@@danf@17-8-2009
10800480@unknown@formal@none@1@S@Speech synthesis systems usually try to maximize both characteristics.@@@@1@9@@danf@17-8-2009
10800490@unknown@formal@none@1@S@The two primary technologies for generating synthetic speech waveforms are ''concatenative synthesis'' and ''[[formant]] synthesis''.@@@@1@15@@danf@17-8-2009
10800500@unknown@formal@none@1@S@Each technology has strengths and weaknesses, and the intended uses of a synthesis system will typically determine which approach is used.@@@@1@21@@danf@17-8-2009
10800510@unknown@formal@none@1@S@=== Concatenative synthesis ===@@@@1@4@@danf@17-8-2009
10800520@unknown@formal@none@1@S@Concatenative synthesis is based on the [[concatenation]] (or stringing together) of segments of recorded speech.@@@@1@15@@danf@17-8-2009
10800530@unknown@formal@none@1@S@Generally, concatenative synthesis produces the most natural-sounding synthesized speech.@@@@1@9@@danf@17-8-2009
10800540@unknown@formal@none@1@S@However, differences between natural variations in speech and the nature of the automated techniques for segmenting the waveforms sometimes result in audible glitches in the output.@@@@1@26@@danf@17-8-2009
10800550@unknown@formal@none@1@S@There are three main sub-types of concatenative synthesis.@@@@1@8@@danf@17-8-2009
10800560@unknown@formal@none@1@S@==== Unit selection synthesis ====@@@@1@6@@danf@17-8-2009
10800570@unknown@formal@none@1@S@Unit selection synthesis uses large [[database]]s of recorded speech.@@@@1@9@@danf@17-8-2009
10800580@unknown@formal@none@1@S@During database creation, each recorded utterance is segmented into some or all of the following: individual [[phone]]s, [[diphone]]s, half-phones, [[syllable]]s, [[morpheme]]s, [[word]]s, [[phrase]]s, and [[Sentence (linguistics)|sentence]]s.@@@@1@26@@danf@17-8-2009
10800590@unknown@formal@none@1@S@Typically, the division into segments is done using a specially modified [[speech recognition|speech recognizer]] set to a "forced alignment" mode with some manual correction afterward, using visual representations such as the [[waveform]] and [[spectrogram]].@@@@1@34@@danf@17-8-2009
10800600@unknown@formal@none@1@S@An [[index (database)|index]] of the units in the speech database is then created based on the segmentation and acoustic parameters like the [[fundamental frequency]] ([[pitch (music)|pitch]]), duration, position in the syllable, and neighboring phones.@@@@1@34@@danf@17-8-2009
10800610@unknown@formal@none@1@S@At [[runtime]], the desired target utterance is created by determining the best chain of candidate units from the database (unit selection).@@@@1@21@@danf@17-8-2009
10800620@unknown@formal@none@1@S@This process is typically achieved using a specially weighted [[decision tree]].@@@@1@11@@danf@17-8-2009
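The search for the best chain of candidate units can be illustrated with a small dynamic-programming sketch. This is not the weighted decision tree mentioned above; it is a Viterbi-style search that trades off a target cost against a join cost, a formulation that is also commonly used for unit selection. The single pitch feature per unit, the cost functions, and the names `select_units` and `join_weight` are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def select_units(targets: Sequence[float],
                 candidates: Sequence[Sequence[float]],
                 join_weight: float = 1.0) -> List[int]:
    """Return, for each target position, the index of the chosen candidate.

    targets[i]    -- desired pitch in Hz at position i (hypothetical feature)
    candidates[i] -- pitches of the database units available at position i
    Target cost = |unit pitch - target pitch|
    Join cost   = join_weight * |pitch difference between adjacent units|
    """
    if not targets:
        return []
    # best[i][k]: lowest total cost of any chain ending in candidate k at position i
    best = [[abs(p - targets[0]) for p in candidates[0]]]
    back: List[List[int]] = [[-1] * len(candidates[0])]
    for i in range(1, len(targets)):
        row, ptr = [], []
        for p in candidates[i]:
            costs = [best[i - 1][j] + join_weight * abs(p - q)
                     for j, q in enumerate(candidates[i - 1])]
            j_min = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[j_min] + abs(p - targets[i]))
            ptr.append(j_min)
        best.append(row)
        back.append(ptr)
    # Trace back from the cheapest unit at the final position.
    k = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = [k]
    for i in range(len(targets) - 1, 0, -1):
        k = back[i][k]
        path.append(k)
    return path[::-1]
```

With a large `join_weight` the search prefers units that connect smoothly even if they match the target less well; with `join_weight=0.0` it simply picks the closest unit at each position.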
10800630@unknown@formal@none@1@S@Unit selection provides the greatest naturalness, because it applies only a small amount of [[digital signal processing]] (DSP) to the recorded speech.@@@@1@22@@danf@17-8-2009
10800640@unknown@formal@none@1@S@DSP often makes recorded speech sound less natural, although some systems use a small amount of signal processing at the point of concatenation to smooth the waveform.@@@@1@27@@danf@17-8-2009
10800650@unknown@formal@none@1@S@The output from the best unit-selection systems is often indistinguishable from real human voices, especially in contexts for which the TTS system has been tuned.@@@@1@25@@danf@17-8-2009
10800660@unknown@formal@none@1@S@However, maximum naturalness typically requires unit-selection speech databases to be very large, in some systems ranging into the [[gigabyte]]s of recorded data, representing dozens of hours of speech.@@@@1@28@@danf@17-8-2009
10800670@unknown@formal@none@1@S@Also, unit selection algorithms have been known to select segments from a place that results in less than ideal synthesis (e.g. minor words become unclear) even when a better choice exists in the database.
==== Diphone synthesis ====@@@@1@5@@danf@17-8-2009
10800690@unknown@formal@none@1@S@Diphone synthesis uses a minimal speech database containing all the [[diphone]]s (sound-to-sound transitions) occurring in a language.@@@@1@17@@danf@17-8-2009
10800700@unknown@formal@none@1@S@The number of diphones depends on the [[phonotactics]] of the language: for example, Spanish has about 800 diphones, and German about 2500.@@@@1@22@@danf@17-8-2009
10800710@unknown@formal@none@1@S@In diphone synthesis, only one example of each diphone is contained in the speech database.@@@@1@15@@danf@17-8-2009
10800720@unknown@formal@none@1@S@At runtime, the target [[prosody]] of a sentence is superimposed on these minimal units by means of [[digital signal processing]] techniques such as [[linear predictive coding]], [[PSOLA]] or [[MBROLA]].@@@@1@29@@danf@17-8-2009
10800730@unknown@formal@none@1@S@The quality of the resulting speech is generally worse than that of unit-selection systems, but more natural-sounding than the output of formant synthesizers.@@@@1@23@@danf@17-8-2009
10800740@unknown@formal@none@1@S@Diphone synthesis suffers from the sonic glitches of concatenative synthesis and the robotic-sounding nature of formant synthesis, and has few of the advantages of either approach other than small size.@@@@1@30@@danf@17-8-2009
10800750@unknown@formal@none@1@S@As such, its use in commercial applications is declining, although it continues to be used in research because there are a number of freely available software implementations.
==== Domain-specific synthesis ====@@@@1@5@@danf@17-8-2009
10800770@unknown@formal@none@1@S@Domain-specific synthesis concatenates prerecorded words and phrases to create complete utterances.@@@@1@11@@danf@17-8-2009
10800780@unknown@formal@none@1@S@It is used in applications where the variety of texts the system will output is limited to a particular domain, like transit schedule announcements or weather reports.@@@@1@27@@danf@17-8-2009
10800790@unknown@formal@none@1@S@The technology is very simple to implement, and has been in commercial use for a long time, in devices like talking clocks and calculators.@@@@1@24@@danf@17-8-2009
10800800@unknown@formal@none@1@S@The level of naturalness of these systems can be very high because the variety of sentence types is limited, and they closely match the prosody and intonation of the original recordings.@@@@1@31@@danf@17-8-2009
10800810@unknown@formal@none@1@S@Because these systems are limited by the words and phrases in their databases, they are not general-purpose and can only synthesize the combinations of words and phrases with which they have been preprogrammed.@@@@1@33@@danf@17-8-2009
10800820@unknown@formal@none@1@S@The blending of words within naturally spoken language however can still cause problems unless the many variations are taken into account.@@@@1@21@@danf@17-8-2009
10800830@unknown@formal@none@1@S@For example, in [[Rhotic and non-rhotic accents|non-rhotic]] dialects of English the "r" in words like "clear" {{IPA|/ˈkliːə/}} is usually only pronounced when the following word has a vowel as its first letter (e.g. "clear out" is realized as {{IPA|/ˌkliːəɹˈɑʊt/}}).@@@@1@39@@danf@17-8-2009
10800840@unknown@formal@none@1@S@Likewise in [[French language|French]], many final consonants become no longer silent if followed by a word that begins with a vowel, an effect called [[Liaison (French)|liaison]].@@@@1@26@@danf@17-8-2009
10800845@unknown@formal@none@1@S@This [[alternation (linguistics)|alternation]] cannot be reproduced by a simple word-concatenation system, which would require additional complexity to be [[context-sensitive]].@@@@1@19@@danf@17-8-2009
10800850@unknown@formal@none@1@S@=== Formant synthesis ===@@@@1@4@@danf@17-8-2009
10800860@unknown@formal@none@1@S@[[Formant]] synthesis does not use human speech samples at runtime.@@@@1@10@@danf@17-8-2009
10800870@unknown@formal@none@1@S@Instead, the synthesized speech output is created using an acoustic model.@@@@1@11@@danf@17-8-2009
10800880@unknown@formal@none@1@S@Parameters such as [[fundamental frequency]], [[phonation|voicing]], and [[noise]] levels are varied over time to create a [[waveform]] of artificial speech.@@@@1@20@@danf@17-8-2009
10800890@unknown@formal@none@1@S@This method is sometimes called ''rules-based synthesis''; however, many concatenative systems also have rules-based components.@@@@1@15@@danf@17-8-2009
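A minimal sketch of the idea, assuming a simple source-filter arrangement: an impulse train at the fundamental frequency (the voiced source) is passed through a cascade of two-pole resonators, one per formant. The function names, the 16 kHz sample rate, and the formant frequencies and bandwidths in the usage note are illustrative assumptions, not values taken from any particular synthesizer. A real formant synthesizer would also vary these parameters over time and add noise sources.

```python
import math
from typing import List, Sequence, Tuple


def resonator_coeffs(freq: float, bandwidth: float, rate: int) -> Tuple[float, float, float]:
    """Coefficients (a, b, c) of a two-pole digital resonator modelling one formant."""
    c = -math.exp(-2.0 * math.pi * bandwidth / rate)
    b = 2.0 * math.exp(-math.pi * bandwidth / rate) * math.cos(2.0 * math.pi * freq / rate)
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to 1
    return a, b, c


def synthesize_vowel(f0: float,
                     formants: Sequence[Tuple[float, float]],
                     duration: float = 0.5,
                     rate: int = 16000) -> List[float]:
    """Filter an impulse train at f0 through cascaded formant resonators.

    formants -- (center frequency in Hz, bandwidth in Hz) pairs
    Returns samples normalized to the range [-1, 1].
    """
    n = int(duration * rate)
    period = max(1, round(rate / f0))
    signal = [1.0 if i % period == 0 else 0.0 for i in range(n)]
    for freq, bw in formants:
        a, b, c = resonator_coeffs(freq, bw, rate)
        y1 = y2 = 0.0  # the two previous output samples
        out = []
        for x in signal:
            y = a * x + b * y1 + c * y2
            out.append(y)
            y2, y1 = y1, y
        signal = out
    peak = max((abs(s) for s in signal), default=0.0) or 1.0
    return [s / peak for s in signal]
```

For example, `synthesize_vowel(120.0, [(730.0, 90.0), (1090.0, 110.0), (2440.0, 170.0)])` produces half a second of a steady, buzzy vowel-like sound; the three formant settings are assumed values roughly in the range of an open vowel.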
10800900@unknown@formal@none@1@S@Many systems based on formant synthesis technology generate artificial, robotic-sounding speech that would never be mistaken for human speech.@@@@1@19@@danf@17-8-2009
10800910@unknown@formal@none@1@S@However, maximum naturalness is not always the goal of a speech synthesis system, and formant synthesis systems have advantages over concatenative systems.@@@@1@22@@danf@17-8-2009
10800920@unknown@formal@none@1@S@Formant-synthesized speech can be reliably intelligible, even at very high speeds, avoiding the acoustic glitches that commonly plague concatenative systems.@@@@1@20@@danf@17-8-2009
10800930@unknown@formal@none@1@S@High-speed synthesized speech is used by the visually impaired to quickly navigate computers using a [[screen reader]].@@@@1@17@@danf@17-8-2009
10800940@unknown@formal@none@1@S@Formant synthesizers are usually smaller programs than concatenative systems because they do not have a database of speech samples.@@@@1@19@@danf@17-8-2009
10800950@unknown@formal@none@1@S@They can therefore be used in [[embedded system]]s, where [[data storage device|memory]] and [[microprocessor]] power are especially limited.@@@@1@18@@danf@17-8-2009
10800960@unknown@formal@none@1@S@Because formant-based systems have complete control of all aspects of the output speech, a wide variety of prosodies and [[Intonation (linguistics)|intonation]]s can be output, conveying not just questions and statements, but a variety of emotions and tones of voice.@@@@1@39@@danf@17-8-2009
10800970@unknown@formal@none@1@S@Examples of non-real-time but highly accurate intonation control in formant synthesis include the work done in the late 1970s for the [[Texas Instruments]] toy [[Speak & Spell (game)|Speak & Spell]], and in the early 1980s [[Sega]] [[Video arcade|arcade]] machines.@@@@1@39@@danf@17-8-2009
10800980@unknown@formal@none@1@S@Creating proper intonation for these projects was painstaking, and the results have yet to be matched by real-time text-to-speech interfaces.@@@@1@20@@danf@17-8-2009
10800990@unknown@formal@none@1@S@=== Articulatory synthesis ===@@@@1@4@@danf@17-8-2009
10801000@unknown@formal@none@1@S@[[Articulatory synthesis]] refers to computational techniques for synthesizing speech based on models of the human [[vocal tract]] and the articulation processes occurring there.@@@@1@23@@danf@17-8-2009
10801010@unknown@formal@none@1@S@The first articulatory synthesizer regularly used for laboratory experiments was developed at [[Haskins Laboratories]] in the mid-1970s by [[Philip Rubin]], Tom Baer, and Paul Mermelstein.@@@@1@25@@danf@17-8-2009
10801020@unknown@formal@none@1@S@This synthesizer, known as ASY, was based on vocal tract models developed at [[Bell Laboratories]] in the 1960s and 1970s by Paul Mermelstein, Cecil Coker, and colleagues.@@@@1@27@@danf@17-8-2009
10801030@unknown@formal@none@1@S@Until recently, articulatory synthesis models had not been incorporated into commercial speech synthesis systems.@@@@1@14@@danf@17-8-2009
10801040@unknown@formal@none@1@S@A notable exception is the [[NeXT]]-based system originally developed and marketed by Trillium Sound Research, a spin-off company of the [[University of Calgary]], where much of the original research was conducted.@@@@1@31@@danf@17-8-2009
10801050@unknown@formal@none@1@S@Following the demise of the various incarnations of NeXT (started by [[Steve Jobs]] in the late 1980s and merged with Apple Computer in 1997), the Trillium software was published under the [[GNU General Public License]], with work continuing as ''gnuspeech''.@@@@1@40@@danf@17-8-2009
10801060@unknown@formal@none@1@S@The system, first marketed in 1994, provides full articulatory-based text-to-speech conversion using a waveguide or transmission-line analog of the human oral and nasal tracts controlled by Carré's "distinctive region model".@@@@1@30@@danf@17-8-2009
10801070@unknown@formal@none@1@S@=== HMM-based synthesis ===@@@@1@4@@danf@17-8-2009
10801080@unknown@formal@none@1@S@HMM-based synthesis is a synthesis method based on [[hidden Markov model]]s.@@@@1@11@@danf@17-8-2009
10801090@unknown@formal@none@1@S@In this system, the [[frequency spectrum]] ([[vocal tract]]), [[fundamental frequency]] (vocal source), and duration ([[prosody]]) of speech are modeled simultaneously by HMMs.@@@@1@22@@danf@17-8-2009
10801100@unknown@formal@none@1@S@Speech [[waveforms]] are generated from HMMs themselves based on the [[maximum likelihood]] criterion.@@@@1@13@@danf@17-8-2009
10801110@unknown@formal@none@1@S@=== Sinewave synthesis ===@@@@1@4@@danf@17-8-2009
10801120@unknown@formal@none@1@S@[[Sinewave synthesis]] is a technique for synthesizing speech by replacing the [[formants]] (main bands of energy) with pure tone whistles.@@@@1@20@@danf@17-8-2009
10801130@unknown@formal@none@1@S@== Challenges ==@@@@1@3@@danf@17-8-2009
10801140@unknown@formal@none@1@S@=== Text normalization challenges ===@@@@1@5@@danf@17-8-2009
10801150@unknown@formal@none@1@S@The process of normalizing text is rarely straightforward.@@@@1@8@@danf@17-8-2009
10801160@unknown@formal@none@1@S@Texts are full of [[Heteronym (linguistics)|heteronym]]s, [[number]]s, and [[abbreviation]]s that all require expansion into a phonetic representation.@@@@1@17@@danf@17-8-2009
10801170@unknown@formal@none@1@S@There are many spellings in English which are pronounced differently based on context.@@@@1@13@@danf@17-8-2009
10801180@unknown@formal@none@1@S@For example, "My latest project is to learn how to better project my voice" contains two pronunciations of "project".@@@@1@19@@danf@17-8-2009
10801190@unknown@formal@none@1@S@Most text-to-speech (TTS) systems do not generate semantic representations of their input texts, as processes for doing so are not reliable, well understood, or computationally effective.@@@@1@26@@danf@17-8-2009
10801200@unknown@formal@none@1@S@As a result, various [[heuristic]] techniques are used to guess the proper way to disambiguate homographs, like examining neighboring words and using statistics about frequency of occurrence.@@@@1@27@@danf@17-8-2009
10801210@unknown@formal@none@1@S@Deciding how to convert numbers is another problem that TTS systems have to address.@@@@1@14@@danf@17-8-2009
10801220@unknown@formal@none@1@S@It is a simple programming challenge to convert a number into words, like "1325" becoming "one thousand three hundred twenty-five."@@@@1@20@@danf@17-8-2009
10801230@unknown@formal@none@1@S@However, numbers occur in many different contexts; when a year or part of an address, "1325" should likely be read as "thirteen twenty-five", or, when part of a [[social security number]], as "one three two five".@@@@1@36@@danf@17-8-2009
10801240@unknown@formal@none@1@S@A TTS system can often infer how to expand a number based on surrounding words, numbers, and punctuation, and sometimes the system provides a way to specify the context if it is ambiguous.@@@@1@33@@danf@17-8-2009
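The three readings of "1325" described above can be sketched as follows. This is a toy illustration that assumes the context label ("cardinal", "year" or "digits") has already been determined by some earlier step; the function names and the context labels are hypothetical, and only numbers below 10,000 are handled.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def two_digits(n: int) -> str:
    """Spell out 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("-" + ONES[ones] if ones else "")


def cardinal(n: int) -> str:
    """Spell out 0..9999 as a quantity."""
    parts = []
    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    if thousands:
        parts.append(ONES[thousands] + " thousand")
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if rest or not parts:
        parts.append(two_digits(rest))
    return " ".join(parts)


def as_year(n: int) -> str:
    """Read a four-digit number as two pairs of digits, as in a year or address."""
    high, low = divmod(n, 100)
    if low == 0:
        # 1900 -> "nineteen hundred", but 2000 -> "two thousand"
        return cardinal(n) if high % 10 == 0 else two_digits(high) + " hundred"
    if low < 10:
        return two_digits(high) + " oh " + ONES[low]
    return two_digits(high) + " " + two_digits(low)


def expand_number(token: str, context: str = "cardinal") -> str:
    """Expand a digit string according to a context label (hypothetical labels)."""
    if context == "digits":
        return " ".join(ONES[int(ch)] for ch in token)
    if context == "year":
        return as_year(int(token))
    return cardinal(int(token))
```

The hard part in practice is choosing the context label, not the expansion itself, which is why the label is passed in explicitly here.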
10801250@unknown@formal@none@1@S@Similarly, abbreviations can be ambiguous.@@@@1@5@@danf@17-8-2009
10801260@unknown@formal@none@1@S@For example, the abbreviation "in" for "inches" must be differentiated from the word "in", and the address "12 St John St." uses the same abbreviation for both "Saint" and "Street".@@@@1@30@@danf@17-8-2009
10801270@unknown@formal@none@1@S@TTS systems with intelligent front ends can make educated guesses about ambiguous abbreviations, while others provide the same result in all cases, resulting in nonsensical (and sometimes comical) outputs.@@@@1@29@@danf@17-8-2009
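As a rough illustration of such an educated guess, the sketch below uses only the neighboring tokens: "St" is read as "Saint" when a capitalized word follows and as "Street" otherwise, and "in" is read as "inches" only after a number. These two rules and the function name are assumptions made for this example; they fail on many real inputs (for instance, "St." at the end of one sentence followed by a capitalized word).

```python
def expand_abbreviations(text: str) -> str:
    """Expand 'St' and 'in' using the previous and next tokens as context."""
    tokens = text.split()
    out = []
    for i, tok in enumerate(tokens):
        core = tok.rstrip(".,")  # the token without trailing punctuation
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        prev = tokens[i - 1] if i > 0 else ""
        if core == "St":
            # A following capitalized word suggests a name: "St John".
            out.append("Saint" if nxt[:1].isupper() else "Street")
        elif core == "in" and prev.replace(".", "", 1).isdigit():
            # A preceding number suggests a unit: "30 in".
            out.append("inches")
        else:
            out.append(tok)
    return " ".join(out)
```

Note that trailing punctuation on an expanded abbreviation is dropped in this sketch.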
10801280@unknown@formal@none@1@S@=== Text-to-phoneme challenges ===@@@@1@4@@danf@17-8-2009
10801290@unknown@formal@none@1@S@Speech synthesis systems use two basic approaches to determine the pronunciation of a word based on its spelling, a process which is often called text-to-phoneme or grapheme-to-phoneme conversion ([[phoneme]] is the term used by linguists to describe distinctive sounds in a language).@@@@1@42@@danf@17-8-2009
10801300@unknown@formal@none@1@S@The simplest approach to text-to-phoneme conversion is the dictionary-based approach, where a large dictionary containing all the words of a language and their correct pronunciations is stored by the program.@@@@1@30@@danf@17-8-2009
10801310@unknown@formal@none@1@S@Determining the correct pronunciation of each word is a matter of looking up each word in the dictionary and replacing the spelling with the pronunciation specified in the dictionary.@@@@1@29@@danf@17-8-2009
10801320@unknown@formal@none@1@S@The other approach is rule-based, in which pronunciation rules are applied to words to determine their pronunciations based on their spellings.@@@@1@21@@danf@17-8-2009
10801330@unknown@formal@none@1@S@This is similar to the "sounding out", or [[synthetic phonics]], approach to learning reading.@@@@1@14@@danf@17-8-2009
10801340@unknown@formal@none@1@S@Each approach has advantages and drawbacks.@@@@1@6@@danf@17-8-2009
10801350@unknown@formal@none@1@S@The dictionary-based approach is quick and accurate, but completely fails if it is given a word which is not in its dictionary.@@@@1@22@@danf@17-8-2009
10801360@unknown@formal@none@1@S@As dictionary size grows, so too do the memory space requirements of the synthesis system.@@@@1@15@@danf@17-8-2009
10801370@unknown@formal@none@1@S@On the other hand, the rule-based approach works on any input, but the complexity of the rules grows substantially as the system takes into account irregular spellings or pronunciations.@@@@1@29@@danf@17-8-2009
10801380@unknown@formal@none@1@S@(Consider that the word "of" is very common in English, yet is the only word in which the letter "f" is pronounced [v].)@@@@1@23@@danf@17-8-2009
10801390@unknown@formal@none@1@S@As a result, nearly all speech synthesis systems use a combination of these approaches.@@@@1@14@@danf@17-8-2009
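A combined approach might look like the following sketch: look the word up in an exception dictionary first, and fall back to letter-to-sound rules when it is missing. The tiny lexicon, the rule tables and the ARPAbet-style phoneme symbols are toy assumptions for illustration only; real rule sets are far larger and context-dependent.

```python
from typing import Dict, List

# Exception dictionary: words whose pronunciation the rules would get wrong.
LEXICON: Dict[str, List[str]] = {
    "of": ["AH", "V"],
    "one": ["W", "AH", "N"],
}

# Letter-to-sound rules: two-letter patterns are tried before single letters.
DIGRAPHS: Dict[str, List[str]] = {
    "sh": ["SH"], "ch": ["CH"], "th": ["TH"], "ph": ["F"], "ng": ["NG"],
}
LETTERS: Dict[str, List[str]] = {
    "a": ["AE"], "b": ["B"], "c": ["K"], "d": ["D"], "e": ["EH"], "f": ["F"],
    "g": ["G"], "h": ["HH"], "i": ["IH"], "j": ["JH"], "k": ["K"], "l": ["L"],
    "m": ["M"], "n": ["N"], "o": ["AA"], "p": ["P"], "q": ["K"], "r": ["R"],
    "s": ["S"], "t": ["T"], "u": ["AH"], "v": ["V"], "w": ["W"],
    "x": ["K", "S"], "y": ["Y"], "z": ["Z"],
}


def apply_rules(word: str) -> List[str]:
    """Rule-based conversion: works on any input, but ignores irregular spellings."""
    phones: List[str] = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            phones += DIGRAPHS[pair]
            i += 2
        elif word[i] in LETTERS:
            phones += LETTERS[word[i]]
            i += 1
        else:
            i += 1  # skip characters the toy rules do not cover
    return phones


def to_phonemes(word: str) -> List[str]:
    """Dictionary lookup first, rule-based fallback for unknown words."""
    key = word.lower()
    if key in LEXICON:
        return list(LEXICON[key])
    return apply_rules(key)
```

Here `to_phonemes("of")` comes from the dictionary and ends in a [v] sound, whereas `apply_rules("of")` alone would produce an "f"; an unknown word such as "fish" is handled entirely by the rules.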
10801400@unknown@formal@none@1@S@Some languages, like [[Spanish language|Spanish]], have a very regular writing system, and the prediction of the pronunciation of words based on their spellings is quite successful.@@@@1@26@@danf@17-8-2009
10801410@unknown@formal@none@1@S@Speech synthesis systems for such languages often use the rule-based method extensively, resorting to dictionaries only for those few words, like foreign names and borrowings, whose pronunciations are not obvious from their spellings.@@@@1@33@@danf@17-8-2009
10801420@unknown@formal@none@1@S@On the other hand, speech synthesis systems for languages like [[English language|English]], which have extremely irregular spelling systems, are more likely to rely on dictionaries, and to use rule-based methods only for unusual words, or words that aren't in their dictionaries.@@@@1@41@@danf@17-8-2009
10801430@unknown@formal@none@1@S@=== Evaluation challenges ===@@@@1@4@@danf@17-8-2009
10801440@unknown@formal@none@1@S@It is very difficult to evaluate speech synthesis systems consistently because there is no objective criterion and different organizations usually use different speech data.@@@@1@24@@danf@17-8-2009
10801450@unknown@formal@none@1@S@The quality of a speech synthesis system depends heavily on the quality of recording.@@@@1@14@@danf@17-8-2009
10801460@unknown@formal@none@1@S@Therefore, evaluating speech synthesis systems is almost the same as evaluating the recording skills.@@@@1@14@@danf@17-8-2009
10801470@unknown@formal@none@1@S@Recently, researchers began evaluating speech synthesis systems using a common speech dataset.@@@@1@12@@danf@17-8-2009
10801480@unknown@formal@none@1@S@This may help people to compare the technologies themselves rather than the recordings.@@@@1@13@@danf@17-8-2009
10801490@unknown@formal@none@1@S@=== Prosodics and emotional content ===@@@@1@6@@danf@17-8-2009
10801500@unknown@formal@none@1@S@A recent study published in the journal ''Speech Communication'' by Amy Drahota and colleagues at the [[University of Portsmouth]], [[UK]], found that listeners to voice recordings could determine, at better than chance levels, whether or not the speaker was smiling.@@@@1@40@@danf@17-8-2009
10801510@unknown@formal@none@1@S@It was suggested that identification of the vocal features which signal emotional content may be used to help make synthesized speech sound more natural.@@@@1@24@@danf@17-8-2009
10801520@unknown@formal@none@1@S@== Dedicated hardware ==@@@@1@4@@danf@17-8-2009
10801530@unknown@formal@none@1@S@*Votrax@@@@1@1@@danf@17-8-2009
10801540@unknown@formal@none@1@S@**SC-01A (analog formant)@@@@1@3@@danf@17-8-2009
10801550@unknown@formal@none@1@S@**SC-02 / SSI-263 / "Arctic 263"@@@@1@6@@danf@17-8-2009
10801560@unknown@formal@none@1@S@*General Instruments SP0256-AL2 (CTS256A-AL2, MEA8000)@@@@1@5@@danf@17-8-2009
10801570@unknown@formal@none@1@S@*Magnevation SpeakJet (www.speechchips.com TTS256)@@@@1@4@@danf@17-8-2009
10801580@unknown@formal@none@1@S@*Savage Innovations SoundGin@@@@1@3@@danf@17-8-2009
10801590@unknown@formal@none@1@S@*National Semiconductor DT1050 Digitalker (Mozer)@@@@1@5@@danf@17-8-2009
10801600@unknown@formal@none@1@S@*Silicon Systems SSI 263 (analog formant)@@@@1@6@@danf@17-8-2009
10801610@unknown@formal@none@1@S@*Texas Instruments@@@@1@2@@danf@17-8-2009
10801620@unknown@formal@none@1@S@**TMS5110A (LPC)@@@@1@2@@danf@17-8-2009
10801630@unknown@formal@none@1@S@**TMS5200@@@@1@1@@danf@17-8-2009
10801640@unknown@formal@none@1@S@*Oki Semiconductor@@@@1@2@@danf@17-8-2009
10801650@unknown@formal@none@1@S@**MSM5205@@@@1@1@@danf@17-8-2009
10801660@unknown@formal@none@1@S@**MSM5218RS (ADPCM)@@@@1@2@@danf@17-8-2009
10801670@unknown@formal@none@1@S@*Toshiba T6721A@@@@1@2@@danf@17-8-2009
10801680@unknown@formal@none@1@S@*Philips PCF8200@@@@1@2@@danf@17-8-2009
10801690@unknown@formal@none@1@S@== Computer operating systems or outlets with speech synthesis ==@@@@1@10@@danf@17-8-2009
10801700@unknown@formal@none@1@S@=== Apple ===@@@@1@3@@danf@17-8-2009
10801710@unknown@formal@none@1@S@The first speech system integrated into an [[operating system]] was [[Apple Computer]]'s [[PlainTalk#The original MacInTalk|MacInTalk]] in 1984.@@@@1@17@@danf@17-8-2009
10801720@unknown@formal@none@1@S@Ever since the 1980s, Macintosh computers have offered text-to-speech capabilities through the MacinTalk software.@@@@1@14@@danf@17-8-2009
10801730@unknown@formal@none@1@S@In the early 1990s, Apple expanded these capabilities by offering system-wide text-to-speech support.@@@@1@13@@danf@17-8-2009
10801740@unknown@formal@none@1@S@With the introduction of faster PowerPC-based computers, the company also included higher-quality voice sampling.@@@@1@14@@danf@17-8-2009
10801750@unknown@formal@none@1@S@Apple also introduced [[speech recognition]] into its systems, which provided a fluid command set.@@@@1@14@@danf@17-8-2009
10801760@unknown@formal@none@1@S@More recently, Apple has added sample-based voices.@@@@1@7@@danf@17-8-2009
10801770@unknown@formal@none@1@S@Starting as a curiosity, the speech system of Apple [[Macintosh (computer)|Macintosh]] has evolved into a cutting-edge, fully supported program, [[PlainTalk]], for people with vision problems.@@@@1@25@@danf@17-8-2009
10801780@unknown@formal@none@1@S@[[VoiceOver]] was included in Mac OS Tiger and more recently Mac OS Leopard.@@@@1@13@@danf@17-8-2009
10801790@unknown@formal@none@1@S@The voice shipping with Mac OS X 10.5 ("Leopard") is called "Alex" and features the taking of realistic-sounding breaths between sentences, as well as improved clarity at high read rates.@@@@1@30@@danf@17-8-2009
10801800@unknown@formal@none@1@S@=== AmigaOS ===@@@@1@3@@danf@17-8-2009
10801810@unknown@formal@none@1@S@The second operating system with advanced speech synthesis capabilities was [[AmigaOS]], introduced in 1985.@@@@1@14@@danf@17-8-2009
10801820@unknown@formal@none@1@S@The voice synthesis was licensed by [[Commodore International]] from a third-party software house (Don't Ask Software, now Softvoice, Inc.) and it featured a complete system of voice emulation, with both male and female voices and "stress" indicator markers, made possible by advanced features of the [[Amiga]] hardware audio [[chipset]].@@@@1@49@@danf@17-8-2009
10801830@unknown@formal@none@1@S@It was divided into a narrator device and a translator library.@@@@1@11@@danf@17-8-2009
10801840@unknown@formal@none@1@S@Amiga [[AmigaOS#Speech synthesis|Speak Handler]] featured a text-to-speech translator.@@@@1@8@@danf@17-8-2009
10801850@unknown@formal@none@1@S@AmigaOS considered speech synthesis a virtual hardware device, so the user could even redirect console output to it.@@@@1@18@@danf@17-8-2009
10801860@unknown@formal@none@1@S@Some Amiga programs, such as word processors, made extensive use of the speech system.@@@@1@14@@danf@17-8-2009
10801870@unknown@formal@none@1@S@=== Microsoft Windows ===@@@@1@4@@danf@17-8-2009
10801880@unknown@formal@none@1@S@Modern [[Microsoft Windows|Windows]] systems use [[Speech Application Programming Interface#SAPI 1-4 API family|SAPI4]]- and [[Speech Application Programming Interface#SAPI 5 API family|SAPI5]]-based speech systems that include a [[speech recognition]] engine (SRE).@@@@1@29@@danf@17-8-2009
10801890@unknown@formal@none@1@S@SAPI 4.0 was available on Microsoft-based operating systems as a third-party add-on for systems like [[Windows 95]] and [[Windows 98]].@@@@1@20@@danf@17-8-2009
10801900@unknown@formal@none@1@S@[[Windows 2000]] added a speech synthesis program called [[Microsoft Narrator|Narrator]], directly available to users.@@@@1@14@@danf@17-8-2009
10801910@unknown@formal@none@1@S@All Windows-compatible programs could make use of speech synthesis features, available through menus once installed on the system.@@@@1@18@@danf@17-8-2009
10801920@unknown@formal@none@1@S@[[Microsoft Speech Server]] is a complete package for voice synthesis and recognition, for commercial applications such as [[call centers]].@@@@1@19@@danf@17-8-2009
10801930@unknown@formal@none@1@S@=== Internet ===@@@@1@3@@danf@17-8-2009
10801940@unknown@formal@none@1@S@Currently, there are a number of [[Application software|applications]], [[plugin]]s and [[gadget]]s that can read messages directly from an [[e-mail client]] and web pages from a [[web browser]].@@@@1@27@@danf@17-8-2009
10801950@unknown@formal@none@1@S@Some specialized [[Computer software|software]] can narrate [[RSS|RSS-feeds]].@@@@1@7@@danf@17-8-2009
10801960@unknown@formal@none@1@S@On one hand, online RSS-narrators simplify information delivery by allowing users to listen to their favourite news sources and to convert them to [[podcast]]s.@@@@1@24@@danf@17-8-2009
10801970@unknown@formal@none@1@S@On the other hand, on-line RSS-readers are available on almost any [[Personal computer|PC]] connected to the Internet.@@@@1@17@@danf@17-8-2009
10801980@unknown@formal@none@1@S@Users can download generated audio files to portable devices, e.g. with the help of [[podcast]] receivers, and listen to them while walking, jogging or commuting to work.@@@@1@27@@danf@17-8-2009
10801990@unknown@formal@none@1@S@A growing field in internet based TTS technology is web-based assistive technology, e.g. Talklets.@@@@1@14@@danf@17-8-2009
10802000@unknown@formal@none@1@S@This web based approach to a traditionally locally installed form of software application can afford many of those requiring software for accessibility reason, the ability to access web content from public machines, or those belonging to others.@@@@1@37@@danf@17-8-2009
10802010@unknown@formal@none@1@S@While responsiveness is not as immediate as that of applications installed locally, the 'access anywhere' nature of it is the key benefit to this approach.@@@@1@25@@danf@17-8-2009
10802020@unknown@formal@none@1@S@=== Others ===@@@@1@3@@danf@17-8-2009
10802030@unknown@formal@none@1@S@* Some models of Texas Instruments home computers produced in 1979 and 1981 ([[TI-99/4A|Texas Instruments TI-99/4 and TI-99/4A]]) were capable of text-to-phoneme synthesis or reciting complete words and phrases (text-to-dictionary), using a very popular Speech Synthesizer peripheral.@@@@1@37@@danf@17-8-2009
10802040@unknown@formal@none@1@S@TI used a proprietary [[codec]] to embed complete spoken phrases into applications, primarily video games.@@@@1@15@@danf@17-8-2009
10802050@unknown@formal@none@1@S@* Various systems run on free and open source platforms such as [[Linux|GNU/Linux]]; they include [[open-source]] programs such as the [[Festival Speech Synthesis System]], which uses diphone-based synthesis (and can use a limited number of [[MBROLA]] voices), and gnuspeech from the [[Free Software Foundation]], which uses articulatory synthesis.@@@@1@50@@danf@17-8-2009
10802060@unknown@formal@none@1@S@Other commercial vendor software also runs on GNU/Linux.@@@@1@8@@danf@17-8-2009
10802070@unknown@formal@none@1@S@* Several commercial companies are also developing speech synthesis systems (listed here for information only, without endorsing any specific product): [http://www.acapela-group.com Acapela Group], [[AT&T]], [[Cepstral]], [[DECtalk]], [[IBM ViaVoice]], [[IVONA|IVONA TTS]], [http://www.loquendo.com Loquendo TTS], [http://www.neospeech.com NeoSpeech TTS], [[Nuance Communications]], Rhetorical Systems, [http://www.svox.com SVOX] and [http://www.yakitome.com YAKiToMe!].@@@@1@51@@danf@17-8-2009
10802080@unknown@formal@none@1@S@* Companies which developed speech synthesis systems but which are no longer in this business include BeST Speech (bought by L&H), [[Lernout & Hauspie]] (bankrupt) and [[SpeechWorks]] (bought by Nuance).@@@@1@29@@danf@17-8-2009
10802090@unknown@formal@none@1@S@== Speech synthesis markup languages ==@@@@1@6@@danf@17-8-2009
10802100@unknown@formal@none@1@S@A number of [[markup language]]s have been established for the rendition of text as speech in an [[XML]]-compliant format.@@@@1@19@@danf@17-8-2009
10802110@unknown@formal@none@1@S@The most recent is [[Speech Synthesis Markup Language]] (SSML), which became a [[W3C recommendation]] in 2004.@@@@1@16@@danf@17-8-2009
10802120@unknown@formal@none@1@S@Older speech synthesis markup languages include Java Speech Markup Language ([[JSML]]) and [[SABLE]].@@@@1@13@@danf@17-8-2009
10802130@unknown@formal@none@1@S@Although each of these was proposed as a standard, none of them has been widely adopted.@@@@1@16@@danf@17-8-2009
10802140@unknown@formal@none@1@S@Speech synthesis markup languages are distinguished from dialogue markup languages.@@@@1@10@@danf@17-8-2009
10802150@unknown@formal@none@1@S@[[VoiceXML]], for example, includes tags related to speech recognition, dialogue management and touchtone dialing, in addition to text-to-speech markup.@@@@1@19@@danf@17-8-2009
10802160@unknown@formal@none@1@S@==Applications==@@@@1@1@@danf@17-8-2009
10802170@unknown@formal@none@1@S@===Accessibility===@@@@1@1@@danf@17-8-2009
10802180@unknown@formal@none@1@S@Speech synthesis has long been a vital [[assistive technology]] tool and its application in this area is significant and widespread.@@@@1@20@@danf@17-8-2009
10802190@unknown@formal@none@1@S@It allows environmental barriers to be removed for people with a wide range of disabilities.@@@@1@15@@danf@17-8-2009
10802200@unknown@formal@none@1@S@The longest-standing application has been the use of [[screenreaders]] for people with [[visual impairment]], but text-to-speech systems are now commonly used by people with [[dyslexia]] and other reading difficulties as well as by pre-literate youngsters.@@@@1@36@@danf@17-8-2009
10802210@unknown@formal@none@1@S@They are also frequently employed to aid those with severe [[speech impairment]], usually through a dedicated [[voice output communication aid]].@@@@1@20@@danf@17-8-2009
10802220@unknown@formal@none@1@S@===News service===@@@@1@2@@danf@17-8-2009
10802230@unknown@formal@none@1@S@Sites such as [[Ananova]] have used speech synthesis to convert written news to audio content, which can be used for mobile applications.@@@@1@22@@danf@17-8-2009
10802240@unknown@formal@none@1@S@===Entertainment===@@@@1@1@@danf@17-8-2009
10802250@unknown@formal@none@1@S@Speech synthesis techniques are also used in entertainment productions such as games and anime.@@@@1@17@@danf@17-8-2009
10802260@unknown@formal@none@1@S@In 2007, Animo Limited announced the development of a software application package based on its speech synthesis software FineSpeech, explicitly geared towards customers in the entertainment industries, able to generate narration and lines of dialogue according to user specifications.@@@@1@39@@danf@17-8-2009
10802270@unknown@formal@none@1@S@Software such as [[Vocaloid]] can generate singing voices via lyrics and melody.@@@@1@12@@danf@17-8-2009
10802280@unknown@formal@none@1@S@This is also the aim of the Singing Computer project (which uses the [[GNU General Public License|GPL]] software [[GNU LilyPond|Lilypond]] and [[Festival Speech Synthesis System|Festival]]), which is intended to help blind people check their lyric input.@@@@1@33@@danf@17-8-2009
10810010@unknown@formal@none@1@S@Statistical classification@@@@1@2@@danf@17-8-2009
10810020@unknown@formal@none@1@S@'''Statistical classification''' is a procedure in which individual items are placed into groups based on quantitative information on one or more characteristics inherent in the items (referred to as traits, variables, characters, etc.) and based on a [[training set]] of previously labeled items.@@@@1@43@@danf@17-8-2009
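10810030@unknown@formal@none@1@S@Formally, the problem can be stated as follows: given training data <math>\{(\mathbf{x}_1, y_1), \dots, (\mathbf{x}_n, y_n)\}</math>, produce a classifier <math>h: \mathcal{X} \rightarrow \mathcal{Y}</math> which maps an object <math>\mathbf{x} \in \mathcal{X}</math> to its classification label <math>y \in \mathcal{Y}</math>.@@@@1@31@@danf@17-8-2009
10810040@unknown@formal@none@1@S@For example, if the problem is filtering spam, then <math>\mathbf{x}_i</math> is some representation of an email and <math>y</math> is either "Spam" or "Non-Spam".@@@@1@23@@danf@17-8-2009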
10810050@unknown@formal@none@1@S@Statistical classification algorithms are typically used in [[pattern recognition]] systems.@@@@1@10@@danf@17-8-2009
10810060@unknown@formal@none@1@S@'''Note:''' in [[community ecology]], the term "classification" is synonymous with what is commonly known (in [[machine learning]]) as [[data clustering|clustering]].@@@@1@20@@danf@17-8-2009
10810070@unknown@formal@none@1@S@See that article for more information about purely [[unsupervised learning|unsupervised]] techniques.@@@@1@11@@danf@17-8-2009
10810080@unknown@formal@none@1@S@* The second problem is to consider classification as an [[estimation]] problem, where the goal is to estimate a function of the form@@@@1@23@@danf@17-8-2009
10810090@unknown@formal@none@1@S@:<math>P({\rm class}|\vec x) = f\left(\vec x; \vec \theta\right)</math> where the feature vector input is <math>\vec x</math>, and the function ''f'' is typically parameterized by some parameters <math>\vec \theta</math>.@@@@1@27@@danf@17-8-2009
10810100@unknown@formal@none@1@S@In the [[Bayesian statistics|Bayesian]] approach to this problem, instead of choosing a single parameter vector <math>\vec \theta</math>, the result is integrated over all possible thetas, with the thetas weighted by how likely they are given the training data ''D'':@@@@1@39@@danf@17-8-2009
10810110@unknown@formal@none@1@S@:<math>P({\rm class}|\vec x) = \int f\left(\vec x; \vec \theta\right) P(\vec \theta|D) \, d\vec \theta</math>@@@@1@11@@danf@17-8-2009
10810120@unknown@formal@none@1@S@* The third problem is related to the second, but the problem is to estimate the [[conditional probability|class-conditional probabilities]] <math>P(\vec x|{\rm class})</math> and then use [[Bayes' rule]] to produce the class probability as in the second problem.@@@@1@37@@danf@17-8-2009
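The third formulation can be illustrated with a minimal sketch, assuming a single categorical feature and simple relative-frequency estimates. The function names `fit` and `posterior` and the toy data are hypothetical and chosen only for illustration; a practical classifier would normally add smoothing and handle many features.

```python
from collections import Counter, defaultdict

def fit(samples):
    """Count classes and feature values from (feature_value, label) pairs."""
    class_counts = Counter(label for _, label in samples)
    cond_counts = defaultdict(Counter)  # cond_counts[label][x] = count of x in class
    for x, label in samples:
        cond_counts[label][x] += 1
    return class_counts, cond_counts

def posterior(x, class_counts, cond_counts):
    """Return P(class | x) for every class using Bayes' rule."""
    total = sum(class_counts.values())
    joint = {}
    for label, n in class_counts.items():
        prior = n / total                       # P(class)
        likelihood = cond_counts[label][x] / n  # P(x | class)
        joint[label] = likelihood * prior
    evidence = sum(joint.values())              # P(x)
    if evidence == 0:
        # Assumption of this sketch: fall back to the priors for unseen values.
        return {label: n / total for label, n in class_counts.items()}
    return {label: p / evidence for label, p in joint.items()}

# Toy training data: one word per email, labelled by hand (made up).
samples = [
    ("offer", "Spam"), ("offer", "Spam"), ("meeting", "Spam"),
    ("meeting", "Non-Spam"), ("meeting", "Non-Spam"), ("offer", "Non-Spam"),
]
class_counts, cond_counts = fit(samples)
print(posterior("offer", class_counts, cond_counts))
```

The class-conditional estimates are plain relative frequencies, so the posterior for a value never seen in training is undefined; returning the priors in that case is a design choice of this sketch, not part of the formulation above.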
10810130@unknown@formal@none@1@S@Examples of classification algorithms include:@@@@1@5@@danf@17-8-2009
10810140@unknown@formal@none@1@S@* [[Linear classifier]]s@@@@1@3@@danf@17-8-2009
10810150@unknown@formal@none@1@S@** [[Fisher's linear discriminant]]@@@@1@4@@danf@17-8-2009
10810160@unknown@formal@none@1@S@** [[Logistic regression]]@@@@1@3@@danf@17-8-2009
10810170@unknown@formal@none@1@S@** [[Naive Bayes classifier]]@@@@1@4@@danf@17-8-2009
10810180@unknown@formal@none@1@S@** [[Perceptron]]@@@@1@2@@danf@17-8-2009
10810190@unknown@formal@none@1@S@** [[Support vector machine]]s@@@@1@4@@danf@17-8-2009
10810200@unknown@formal@none@1@S@* [[Quadratic classifier]]s@@@@1@3@@danf@17-8-2009
10810210@unknown@formal@none@1@S@* [[Nearest_neighbor_(pattern_recognition)|k-nearest neighbor]]@@@@1@3@@danf@17-8-2009
10810220@unknown@formal@none@1@S@* [[Boosting]]@@@@1@2@@danf@17-8-2009
10810230@unknown@formal@none@1@S@* [[Decision tree]]s@@@@1@3@@danf@17-8-2009
10810240@unknown@formal@none@1@S@** [[Random forest]]s@@@@1@3@@danf@17-8-2009
10810250@unknown@formal@none@1@S@* [[Artificial neural networks|Neural network]]s@@@@1@5@@danf@17-8-2009
10810260@unknown@formal@none@1@S@* [[Bayesian network]]s@@@@1@3@@danf@17-8-2009
10810270@unknown@formal@none@1@S@* [[Hidden Markov model]]s@@@@1@4@@danf@17-8-2009
10810280@unknown@formal@none@1@S@An intriguing problem in pattern recognition yet to be solved is the relationship between the problem to be solved (data to be classified) and the performance of various pattern recognition algorithms (classifiers).@@@@1@32@@danf@17-8-2009
10810290@unknown@formal@none@1@S@Van der Walt and Barnard (see reference section) investigated very specific artificial data sets to determine conditions under which certain classifiers perform better and worse than others.@@@@1@27@@danf@17-8-2009
10810300@unknown@formal@none@1@S@Classifier performance depends greatly on the characteristics of the data to be classified.@@@@1@13@@danf@17-8-2009
10810310@unknown@formal@none@1@S@There is no single classifier that works best on all given problems (a phenomenon that may be explained by the [[No free lunch in search and optimization|No-free-lunch theorem]]).@@@@1@28@@danf@17-8-2009
10810320@unknown@formal@none@1@S@Various empirical tests have been performed to compare classifier performance and to find the characteristics of data that determine classifier performance.@@@@1@21@@danf@17-8-2009
10810330@unknown@formal@none@1@S@Determining a suitable classifier for a given problem is, however, still more an art than a science.@@@@1@17@@danf@17-8-2009
10810340@unknown@formal@none@1@S@The most widely used classifiers are the [[Neural Network]] (Multi-layer Perceptron), [[Support Vector Machines]], [[KNN|k-Nearest Neighbours]], Gaussian Mixture Model, Gaussian, [[Naive Bayes]], [[Decision Tree]] and [[Radial Basis Function|RBF]] classifiers.@@@@1@29@@danf@17-8-2009
10810350@unknown@formal@none@1@S@== Evaluation ==@@@@1@3@@danf@17-8-2009
10810360@unknown@formal@none@1@S@The measures [[Precision and Recall]] are popular metrics used to evaluate the quality of a classification system.@@@@1@17@@danf@17-8-2009
10810370@unknown@formal@none@1@S@More recently, [[Receiver Operating Characteristic]] (ROC) curves have been used to evaluate the tradeoff between true- and false-positive rates of classification algorithms.@@@@1@22@@danf@17-8-2009
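As a small illustration of the two measures named above, the following sketch computes precision and recall for one class treated as "positive". The function name `precision_recall` and the example labels are assumptions made for this illustration.

```python
def precision_recall(y_true, y_pred, positive="Spam"):
    """Compute precision and recall for the class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)  # true positives
    fp = sum(1 for t, p in pairs if p == positive and t != positive)  # false positives
    fn = sum(1 for t, p in pairs if p != positive and t == positive)  # false negatives
    # Convention of this sketch: report 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

y_true = ["Spam", "Spam", "Non-Spam", "Non-Spam", "Spam"]
y_pred = ["Spam", "Non-Spam", "Spam", "Non-Spam", "Spam"]
print(precision_recall(y_true, y_pred))
```

Precision is the fraction of items predicted positive that really are positive, while recall is the fraction of truly positive items that were found.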
10810380@unknown@formal@none@1@S@==Application domains==@@@@1@2@@danf@17-8-2009
10810390@unknown@formal@none@1@S@* [[Computer vision]]@@@@1@3@@danf@17-8-2009
10810400@unknown@formal@none@1@S@** [[Medical Imaging]] and Medical Image Analysis@@@@1@7@@danf@17-8-2009
10810410@unknown@formal@none@1@S@** [[Optical character recognition]]@@@@1@4@@danf@17-8-2009
10810420@unknown@formal@none@1@S@* [[Geostatistics]]@@@@1@2@@danf@17-8-2009
10810430@unknown@formal@none@1@S@* [[Speech recognition]]@@@@1@3@@danf@17-8-2009
10810440@unknown@formal@none@1@S@* [[Handwriting recognition]]@@@@1@3@@danf@17-8-2009
10810450@unknown@formal@none@1@S@* [[Biometric]] identification@@@@1@3@@danf@17-8-2009
10810460@unknown@formal@none@1@S@* [[Natural language processing]]@@@@1@4@@danf@17-8-2009
10810470@unknown@formal@none@1@S@* [[Document classification]]@@@@1@3@@danf@17-8-2009
10810480@unknown@formal@none@1@S@* Internet [[search engines]]@@@@1@4@@danf@17-8-2009
10810490@unknown@formal@none@1@S@* [[Credit scoring]]@@@@1@3@@danf@17-8-2009
10820010@unknown@formal@none@1@S@Statistical machine translation@@@@1@3@@danf@17-8-2009
10820020@unknown@formal@none@1@S@'''Statistical machine translation''' ('''SMT''') is a [[machine translation]] paradigm where translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual [[text corpora]].@@@@1@30@@danf@17-8-2009
10820030@unknown@formal@none@1@S@The statistical approach contrasts with the rule-based approaches to [[machine translation]] as well as with [[example-based machine translation]].@@@@1@18@@danf@17-8-2009
10820040@unknown@formal@none@1@S@The first ideas of statistical machine translation were introduced by [[Warren Weaver]] in 1949, including the ideas of applying [[Claude Shannon]]'s [[information theory]].@@@@1@23@@danf@17-8-2009
10820050@unknown@formal@none@1@S@Statistical machine translation was re-introduced in 1991 by researchers at [[IBM]]'s [[Thomas J. Watson Research Center]] and has contributed to the significant resurgence in interest in machine translation in recent years.@@@@1@31@@danf@17-8-2009
10820060@unknown@formal@none@1@S@As of 2006, it is by far the most widely studied machine translation paradigm.@@@@1@13@@danf@17-8-2009
10820070@unknown@formal@none@1@S@==Benefits==@@@@1@1@@danf@17-8-2009
10820080@unknown@formal@none@1@S@The benefits of statistical machine translation over traditional paradigms that are most often cited are the following:@@@@1@17@@danf@17-8-2009
10820090@unknown@formal@none@1@S@* '''Better use of resources'''@@@@1@5@@danf@17-8-2009
10820100@unknown@formal@none@1@S@**There is a great deal of natural language in machine-readable format.@@@@1@11@@danf@17-8-2009
10820110@unknown@formal@none@1@S@**Generally, SMT systems are not tailored to any specific pair of languages.@@@@1@12@@danf@17-8-2009
10820120@unknown@formal@none@1@S@**Rule-based translation systems require the manual development of linguistic rules, which can be costly, and which often do not generalize to other languages.@@@@1@23@@danf@17-8-2009
10820130@unknown@formal@none@1@S@* '''More natural translations'''@@@@1@4@@danf@17-8-2009
10820140@unknown@formal@none@1@S@The ideas behind statistical machine translation come out of [[information theory]].@@@@1@11@@danf@17-8-2009
10820150@unknown@formal@none@1@S@Essentially, the document is translated on the [[probability]] <math>p(e|f)</math> that a string <math>e</math> in native language (for example, English) is the translation of a string <math>f</math> in foreign language (for example, French).@@@@1@32@@danf@17-8-2009
10820160@unknown@formal@none@1@S@Generally, these probabilities are estimated using techniques of [[parameter estimation]].@@@@1@10@@danf@17-8-2009
10820170@unknown@formal@none@1@S@The [[Bayes Theorem]] is applied to <math>p(e|f)</math>, the probability that the foreign string produces the native string, to get <math>p(e|f) \propto p(f|e) p(e)</math>, where the [[translation model]] <math>p(f|e)</math> is the probability that the native string is the translation of the foreign string, and the [[language model]] <math>p(e)</math> is the probability of seeing that native string.@@@@1@55@@danf@17-8-2009
10820180@unknown@formal@none@1@S@Mathematically speaking, finding the best translation <math>\tilde{e}</math> is done by picking the one that gives the highest probability:@@@@1@19@@danf@17-8-2009
10820190@unknown@formal@none@1@S@:<math>\tilde{e} = \arg\max_{e \in e^*} p(e|f) = \arg\max_{e \in e^*} p(f|e) p(e)</math>.@@@@1@15@@danf@17-8-2009
10820200@unknown@formal@none@1@S@For a rigorous implementation of this one would have to perform an exhaustive search by going through all strings in the native language.@@@@1@24@@danf@17-8-2009
10820210@unknown@formal@none@1@S@Performing the search efficiently is the work of a [[machine translation decoder]] that uses the foreign string, heuristics and other methods to limit the search space while maintaining acceptable quality.@@@@1@34@@danf@17-8-2009
10820220@unknown@formal@none@1@S@This trade-off between quality and time usage can also be found in [[speech recognition]].@@@@1@14@@danf@17-8-2009
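The selection rule can be sketched as follows, assuming a tiny hand-written candidate list in place of the search space that a real decoder explores. The probability tables, their values and the function names are made up for illustration only.

```python
import math

# Toy probability tables; all values are invented for illustration.
TRANSLATION_MODEL = {  # p(f | e), keyed by (foreign, native)
    ("la maison", "the house"): 0.6,
    ("la maison", "house the"): 0.6,
    ("la maison", "the home"): 0.3,
}
LANGUAGE_MODEL = {  # p(e)
    "the house": 0.05,
    "house the": 0.0001,
    "the home": 0.02,
}

def score(foreign, native):
    """Return log p(f | e) + log p(e), or -inf if either factor is unknown."""
    p_f_given_e = TRANSLATION_MODEL.get((foreign, native), 0.0)
    p_e = LANGUAGE_MODEL.get(native, 0.0)
    if p_f_given_e == 0.0 or p_e == 0.0:
        return float("-inf")
    return math.log(p_f_given_e) + math.log(p_e)

def best_translation(foreign, candidates):
    """Pick the candidate native string with the highest combined score."""
    return max(candidates, key=lambda native: score(foreign, native))

print(best_translation("la maison", ["house the", "the home", "the house"]))
```

In this toy setting the translation model cannot tell "the house" from "house the"; the language model is what penalizes the ungrammatical word order. Log probabilities are summed rather than multiplying raw probabilities, a common way to avoid numerical underflow on longer strings.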
10820230@unknown@formal@none@1@S@As the translation systems are not able to store all native strings and their translations, a document is typically translated sentence by sentence, but even this is not enough.@@@@1@29@@danf@17-8-2009
10820240@unknown@formal@none@1@S@Language models are typically approximated by smoothed ''n''-gram models, and similar approaches have been applied to translation models, but there is additional complexity due to different sentence lengths and word orders in the languages.@@@@1@34@@danf@17-8-2009
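A smoothed ''n''-gram language model can be sketched for the bigram case with add-one (Laplace) smoothing. This is only one of several smoothing schemes, chosen here for brevity; the function names and the two-sentence corpus are assumptions of the sketch.

```python
from collections import Counter

def train_bigram(sentences):
    """Collect context and bigram counts from whitespace-tokenized sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])                  # every token that has a successor
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return contexts, bigrams, len(vocab)

def bigram_prob(prev, word, contexts, bigrams, vocab_size):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)

contexts, bigrams, vocab_size = train_bigram(["the house", "the home"])
print(bigram_prob("the", "house", contexts, bigrams, vocab_size))  # seen bigram
print(bigram_prob("house", "the", contexts, bigrams, vocab_size))  # unseen bigram
```

Because every count is incremented by one, a word pair that never occurred in training still receives a small non-zero probability, which keeps the product of probabilities for a whole sentence from collapsing to zero.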
10820250@unknown@formal@none@1@S@The statistical translation models were initially [[word]] based (Models 1-5 from [[IBM]]), but significant advances were made with the introduction of [[phrase]] based models.@@@@1@24@@danf@17-8-2009
10820260@unknown@formal@none@1@S@Recent work has incorporated [[syntax]] or quasi-syntactic structures.@@@@1@8@@danf@17-8-2009
10820270@unknown@formal@none@1@S@==Word-based translation==@@@@1@2@@danf@17-8-2009
10820280@unknown@formal@none@1@S@In word-based translation, translated elements are words.@@@@1@7@@danf@17-8-2009
10820290@unknown@formal@none@1@S@Typically, the number of words in translated sentences is different because of compound words, morphology and idioms.@@@@1@17@@danf@17-8-2009
10820300@unknown@formal@none@1@S@The ratio of the lengths of sequences of translated words is called fertility, which tells how many foreign words each native word produces.@@@@1@23@@danf@17-8-2009
10820310@unknown@formal@none@1@S@Simple word-based translation is not able to translate language pairs with fertility rates different from one.@@@@1@16@@danf@17-8-2009
10820320@unknown@formal@none@1@S@To handle, for instance, high fertility rates, a word-based translation system can map a single word to multiple words, but not vice versa.@@@@1@28@@danf@17-8-2009
10820330@unknown@formal@none@1@S@For instance, if we are translating from French to English, each word in English could produce zero or more French words.@@@@1@21@@danf@17-8-2009
10820340@unknown@formal@none@1@S@However, there is no way to group two English words so that they produce a single French word.@@@@1@14@@danf@17-8-2009
10820350@unknown@formal@none@1@S@An example of a word-based translation system is the freely available [[GIZA++]] package ([[GPL]]ed), which includes [[IBM]] models.@@@@1@18@@danf@17-8-2009
10820360@unknown@formal@none@1@S@==Phrase-based translation==@@@@1@2@@danf@17-8-2009
10820370@unknown@formal@none@1@S@In phrase-based translation, the aim is to reduce the restrictions of word-based translation by translating sequences of words to sequences of words, where the lengths can differ.@@@@1@28@@danf@17-8-2009
10820380@unknown@formal@none@1@S@The sequences of words are called, for instance, blocks or phrases; typically they are not linguistic [[phrase]]s but phrases found in the corpus using statistical methods.@@@@1@26@@danf@17-8-2009
10820390@unknown@formal@none@1@S@Restricting the phrases to linguistic phrases has been shown to decrease translation quality.@@@@1@13@@danf@17-8-2009
10820400@unknown@formal@none@1@S@==Syntax-based translation==@@@@1@2@@danf@17-8-2009
10820410@unknown@formal@none@1@S@==Challenges with statistical machine translation==@@@@1@5@@danf@17-8-2009
10820420@unknown@formal@none@1@S@Problems that statistical machine translation has to deal with include:@@@@1@10@@danf@17-8-2009
10820430@unknown@formal@none@1@S@=== Compound words ===@@@@1@4@@danf@17-8-2009
10820440@unknown@formal@none@1@S@=== Idioms ===@@@@1@3@@danf@17-8-2009
10820450@unknown@formal@none@1@S@=== Morphology ===@@@@1@3@@danf@17-8-2009
10820460@unknown@formal@none@1@S@=== Different word orders ===@@@@1@5@@danf@17-8-2009
10820470@unknown@formal@none@1@S@Word order differs between languages.@@@@1@5@@danf@17-8-2009
10820480@unknown@formal@none@1@S@Some classification can be done by naming the typical order of subject (S), verb (V) and object (O) in a sentence and one can talk, for instance, of SVO or VSO languages.@@@@1@32@@danf@17-8-2009
10820490@unknown@formal@none@1@S@There are also additional differences in word orders, for instance, where modifiers for nouns are located.@@@@1@16@@danf@17-8-2009
10820500@unknown@formal@none@1@S@In [[Speech Recognition]], the speech signal and the corresponding textual representation can be mapped to each other in blocks in order.@@@@1@21@@danf@17-8-2009
10820510@unknown@formal@none@1@S@This is not always the case with the same text in two languages.@@@@1@13@@danf@17-8-2009
10820520@unknown@formal@none@1@S@For SMT, the translation model is only able to translate small sequences of words and word order has to be taken into account somehow.@@@@1@24@@danf@17-8-2009
10820530@unknown@formal@none@1@S@A typical solution has been re-ordering models, where a distribution of location changes for each item of translation is approximated from aligned bi-text.@@@@1@22@@danf@17-8-2009
10820540@unknown@formal@none@1@S@Different location changes can be ranked with the help of the language model and the best can be selected.@@@@1@19@@danf@17-8-2009
10820550@unknown@formal@none@1@S@=== Syntax ===@@@@1@3@@danf@17-8-2009
10820560@unknown@formal@none@1@S@=== Out of vocabulary (OOV) words ===@@@@1@7@@danf@17-8-2009
10820570@unknown@formal@none@1@S@SMT systems store different word forms as separate symbols without any relation to each other and word forms or phrases that were not in the training data cannot be translated.@@@@1@30@@danf@17-8-2009
10820580@unknown@formal@none@1@S@The main reasons for out-of-vocabulary words are the limitation of training data, domain changes and morphology.@@@@1@17@@danf@17-8-2009
10830010@unknown@formal@none@1@S@Statistics@@@@1@1@@danf@17-8-2009
10830020@unknown@formal@none@1@S@'''Statistics''' is a [[Mathematics|mathematical science]] pertaining to the collection, analysis, interpretation or explanation, and presentation of [[data]].@@@@1@17@@danf@17-8-2009
10830030@unknown@formal@none@1@S@It is applicable to a wide variety of [[academic discipline]]s, from the [[Natural science|natural]] and [[social science]]s to the [[humanities]], government and business.@@@@1@23@@danf@17-8-2009
10830040@unknown@formal@none@1@S@Statistical methods can be used to summarize or describe a collection of data; this is called '''[[descriptive statistics]]'''.@@@@1@18@@danf@17-8-2009
10830050@unknown@formal@none@1@S@In addition, patterns in the data may be [[mathematical model|modeled]] in a way that accounts for [[random]]ness and uncertainty in the observations, and then used to draw inferences about the process or population being studied; this is called '''[[inferential statistics]]'''.@@@@1@40@@danf@17-8-2009
10830060@unknown@formal@none@1@S@Both descriptive and inferential statistics comprise '''applied statistics'''.@@@@1@8@@danf@17-8-2009
10830070@unknown@formal@none@1@S@There is also a discipline called '''[[mathematical statistics]]''', which is concerned with the theoretical basis of the subject.@@@@1@18@@danf@17-8-2009
10830080@unknown@formal@none@1@S@The word '''''statistics''''' is also the plural of '''''[[statistic]]''''' (singular), which refers to the result of applying a statistical algorithm to a set of data, as in [[economic statistics]], [[crime statistics]], etc.@@@@1@32@@danf@17-8-2009
10830090@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10830110@unknown@formal@none@1@S@''"Five men, [[Hermann Conring|Conring]], [[Gottfried Achenwall|Achenwall]], [[Johann Peter Süssmilch|Süssmilch]], [[John Graunt|Graunt]] and [[William Petty|Petty]] have been honored by different writers as the founder of statistics."'' claims one source (Willcox, Walter (1938) ''The Founder of Statistics''.@@@@1@35@@danf@17-8-2009
10830120@unknown@formal@none@1@S@Review of the [[International Statistical Institute]] 5(4):321-328.)@@@@1@7@@danf@17-8-2009
10830130@unknown@formal@none@1@S@Some scholars pinpoint the origin of statistics to 1662, with the publication of "[[Observations on the Bills of Mortality]]" by John Graunt.@@@@1@22@@danf@17-8-2009
10830140@unknown@formal@none@1@S@Early applications of statistical thinking revolved around the needs of states to base policy on demographic and economic data.@@@@1@19@@danf@17-8-2009
10830150@unknown@formal@none@1@S@The scope of the discipline of statistics broadened in the early 19th century to include the collection and analysis of data in general.@@@@1@23@@danf@17-8-2009
10830160@unknown@formal@none@1@S@Today, statistics is widely employed in government, business, and the natural and social sciences.@@@@1@14@@danf@17-8-2009
10830170@unknown@formal@none@1@S@Because of its empirical roots and its applications, statistics is generally considered not to be a subfield of pure mathematics, but rather a distinct branch of applied mathematics.@@@@1@28@@danf@17-8-2009
10830180@unknown@formal@none@1@S@Its mathematical foundations were laid in the 17th century with the development of [[probability theory]] by [[Pascal]] and [[Fermat]].@@@@1@19@@danf@17-8-2009
10830190@unknown@formal@none@1@S@Probability theory arose from the study of games of chance.@@@@1@10@@danf@17-8-2009
10830200@unknown@formal@none@1@S@The [[method of least squares]] was first described by [[Carl Friedrich Gauss]] around 1794.@@@@1@14@@danf@17-8-2009
10830210@unknown@formal@none@1@S@The use of modern [[computer]]s has expedited large-scale statistical computation, and has also made possible new methods that are impractical to perform manually.@@@@1@23@@danf@17-8-2009
10830220@unknown@formal@none@1@S@==Overview==@@@@1@1@@danf@17-8-2009
10830230@unknown@formal@none@1@S@In applying statistics to a scientific, industrial, or societal problem, one begins with a process or [[statistical population|population]] to be studied.@@@@1@21@@danf@17-8-2009
10830240@unknown@formal@none@1@S@This might be a population of people in a country, of crystal grains in a rock, or of goods manufactured by a particular factory during a given period.@@@@1@28@@danf@17-8-2009
10830250@unknown@formal@none@1@S@It may instead be a process observed at various times; data collected about this kind of "population" constitute what is called a [[time series]].@@@@1@24@@danf@17-8-2009
10830260@unknown@formal@none@1@S@For practical reasons, rather than compiling data about an entire population, one usually studies a chosen subset of the population, called a [[sampling (statistics)|sample]].@@@@1@24@@danf@17-8-2009
10830270@unknown@formal@none@1@S@Data are collected about the sample in an observational or [[experiment]]al setting.@@@@1@12@@danf@17-8-2009
10830280@unknown@formal@none@1@S@The data are then subjected to statistical analysis, which serves two related purposes: description and inference.@@@@1@16@@danf@17-8-2009
10830290@unknown@formal@none@1@S@*[[Descriptive statistics]] can be used to summarize the data, either numerically or graphically, to describe the sample.@@@@1@17@@danf@17-8-2009
10830300@unknown@formal@none@1@S@Basic examples of numerical descriptors include the [[mean]] and [[standard deviation]].@@@@1@11@@danf@17-8-2009
10830310@unknown@formal@none@1@S@Graphical summarizations include various kinds of charts and graphs.@@@@1@9@@danf@17-8-2009
10830320@unknown@formal@none@1@S@*[[Inferential statistics]] is used to model patterns in the data, accounting for randomness and drawing inferences about the larger population.@@@@1@20@@danf@17-8-2009
10830330@unknown@formal@none@1@S@These inferences may take the form of answers to yes/no questions ([[hypothesis testing]]), estimates of numerical characteristics ([[estimation]]), descriptions of association ([[correlation]]), or modeling of relationships ([[regression analysis|regression]]).@@@@1@28@@danf@17-8-2009
10830340@unknown@formal@none@1@S@Other [[mathematical model|modeling]] techniques include [[ANOVA]], [[time series]], and [[data mining]].@@@@1@11@@danf@17-8-2009
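The numerical descriptors mentioned above can be computed with Python's standard `statistics` module. The following is a minimal sketch; the helper name `describe` and the sample values are made up for illustration.

```python
import statistics

def describe(data):
    """Summarize a sample with its mean and two standard deviations."""
    return {
        "mean": statistics.mean(data),
        # Treats the data as the whole population (divides by n).
        "population_sd": statistics.pstdev(data),
        # Treats the data as a sample from a larger population (divides by n - 1).
        "sample_sd": statistics.stdev(data),
    }

sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(describe(sample))
```

Which standard deviation is appropriate depends on whether the data are regarded as the entire population or as a sample from which properties of a larger population are to be inferred.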
10830350@unknown@formal@none@1@S@The concept of correlation is particularly noteworthy.@@@@1@7@@danf@17-8-2009
10830360@unknown@formal@none@1@S@Statistical analysis of a [[data set]] may reveal that two variables (that is, two properties of the population under consideration) tend to vary together, as if they are connected.@@@@1@29@@danf@17-8-2009
10830370@unknown@formal@none@1@S@For example, a study of annual income and age of death among people might find that poor people tend to have shorter lives than affluent people.@@@@1@26@@danf@17-8-2009
10830380@unknown@formal@none@1@S@The two variables are said to be correlated (which is a positive correlation in this case).@@@@1@16@@danf@17-8-2009
10830390@unknown@formal@none@1@S@However, one cannot immediately infer the existence of a causal relationship between the two variables.@@@@1@15@@danf@17-8-2009
10830400@unknown@formal@none@1@S@(See [[Correlation does not imply causation]].)@@@@1@6@@danf@17-8-2009
10830410@unknown@formal@none@1@S@The correlated phenomena could be caused by a third, previously unconsidered phenomenon, called a [[lurking variable]] or [[confounding variable]].@@@@1@19@@danf@17-8-2009
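The tendency of two variables to vary together is commonly quantified with the Pearson correlation coefficient. The sketch below computes it directly from its definition; the function name `pearson_r` and the income and lifespan figures are invented for illustration and are not real data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

income = [10, 20, 30, 40, 50]    # hypothetical annual income, arbitrary units
lifespan = [66, 70, 71, 75, 78]  # hypothetical age at death
print(pearson_r(income, lifespan))
```

A value near +1 or -1 indicates a strong linear association, but, as noted above, it says nothing about whether one variable causes the other.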
10830420@unknown@formal@none@1@S@If the sample is representative of the population, then inferences and conclusions made from the sample can be extended to the population as a whole.@@@@1@25@@danf@17-8-2009
10830430@unknown@formal@none@1@S@A major problem lies in determining the extent to which the chosen sample is representative.@@@@1@15@@danf@17-8-2009
10830440@unknown@formal@none@1@S@Statistics offers methods to estimate and correct for randomness in the sample and in the data collection procedure, as well as methods for designing robust experiments in the first place.@@@@1@30@@danf@17-8-2009
10830450@unknown@formal@none@1@S@(See [[experimental design]].)@@@@1@3@@danf@17-8-2009
10830460@unknown@formal@none@1@S@The fundamental mathematical concept employed in understanding such randomness is [[probability]].@@@@1@11@@danf@17-8-2009
10830470@unknown@formal@none@1@S@[[Mathematical statistics]] (also called [[statistical theory]]) is the branch of [[applied mathematics]] that uses probability theory and [[mathematical analysis|analysis]] to examine the theoretical basis of statistics.@@@@1@26@@danf@17-8-2009
10830480@unknown@formal@none@1@S@The use of any statistical method is valid only when the system or population under consideration satisfies the basic mathematical assumptions of the method.@@@@1@24@@danf@17-8-2009
10830490@unknown@formal@none@1@S@[[Misuse of statistics]] can produce subtle but serious errors in description and interpretation — subtle in the sense that even experienced professionals sometimes make such errors, serious in the sense that they may affect, for instance, social policy, medical practice and the reliability of structures such as bridges.@@@@1@48@@danf@17-8-2009
10830500@unknown@formal@none@1@S@Even when statistics is correctly applied, the results can be difficult for the non-expert to interpret.@@@@1@16@@danf@17-8-2009
10830510@unknown@formal@none@1@S@For example, the [[statistical significance]] of a trend in the data, which measures the extent to which the trend could be caused by random variation in the sample, may not agree with one's intuitive sense of its significance.@@@@1@38@@danf@17-8-2009
10830520@unknown@formal@none@1@S@The set of basic statistical skills (and skepticism) needed by people to deal with information in their everyday lives is referred to as [[statistical literacy]].@@@@1@25@@danf@17-8-2009
10830530@unknown@formal@none@1@S@==Statistical methods==@@@@1@2@@danf@17-8-2009
10830540@unknown@formal@none@1@S@===Experimental and observational studies===@@@@1@4@@danf@17-8-2009
10830550@unknown@formal@none@1@S@A common goal for a statistical research project is to investigate [[causality]], and in particular to draw a conclusion on the effect of changes in the values of predictors or [[independent variable]]s on response or [[dependent variable]]s.@@@@1@37@@danf@17-8-2009
10830560@unknown@formal@none@1@S@There are two major types of causal statistical studies, experimental studies and observational studies.@@@@1@14@@danf@17-8-2009
10830570@unknown@formal@none@1@S@In both types of studies, the effect of differences in an independent variable (or variables) on the behavior of the dependent variable is observed.@@@@1@24@@danf@17-8-2009
10830580@unknown@formal@none@1@S@The difference between the two types lies in how the study is actually conducted.@@@@1@14@@danf@17-8-2009
10830590@unknown@formal@none@1@S@Each can be very effective.@@@@1@5@@danf@17-8-2009
10830600@unknown@formal@none@1@S@An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine if the manipulation has modified the values of the measurements.@@@@1@35@@danf@17-8-2009
10830610@unknown@formal@none@1@S@In contrast, an observational study does not involve experimental manipulation.@@@@1@10@@danf@17-8-2009
10830620@unknown@formal@none@1@S@Instead, data are gathered and correlations between predictors and response are investigated.@@@@1@12@@danf@17-8-2009
10830630@unknown@formal@none@1@S@A famous example of an experimental study is the [[Hawthorne studies]], which attempted to test the effect of changes to the working environment at the Hawthorne plant of the Western Electric Company.@@@@1@32@@danf@17-8-2009
10830640@unknown@formal@none@1@S@The researchers were interested in determining whether increased illumination would increase the productivity of the [[assembly line]] workers.@@@@1@18@@danf@17-8-2009
10830650@unknown@formal@none@1@S@The researchers first measured the productivity in the plant, then modified the illumination in an area of the plant and checked if the changes in illumination affected the productivity.@@@@1@29@@danf@17-8-2009
10830660@unknown@formal@none@1@S@It turned out that the productivity indeed improved (under the experimental conditions).@@@@1@12@@danf@17-8-2009
10830663@unknown@formal@none@1@S@(See [[Hawthorne effect]].)@@@@1@3@@danf@17-8-2009
10830665@unknown@formal@none@1@S@However, the study is heavily criticized today for errors in experimental procedures, specifically for the lack of a [[control group]] and [[double-blind|blindedness]].@@@@1@22@@danf@17-8-2009
10830670@unknown@formal@none@1@S@An example of an observational study is a study which explores the correlation between smoking and lung cancer.@@@@1@18@@danf@17-8-2009
10830680@unknown@formal@none@1@S@This type of study typically uses a survey to collect observations about the area of interest and then performs statistical analysis.@@@@1@21@@danf@17-8-2009
10830690@unknown@formal@none@1@S@In this case, the researchers would collect observations of both smokers and non-smokers, perhaps through a [[case-control study]], and then look for the number of cases of lung cancer in each group.@@@@1@32@@danf@17-8-2009
10830700@unknown@formal@none@1@S@The basic steps of an experiment are:@@@@1@7@@danf@17-8-2009
10830710@unknown@formal@none@1@S@# Planning the research, including determining information sources, research subject selection, and [[ethics|ethical]] considerations for the proposed research and method.@@@@1@20@@danf@17-8-2009
10830720@unknown@formal@none@1@S@# [[Design of experiments]], concentrating on the system model and the interaction of independent and dependent variables.@@@@1@17@@danf@17-8-2009
10830730@unknown@formal@none@1@S@# [[summary statistics|Summarizing a collection of observations]] to feature their commonality by suppressing details.@@@@1@14@@danf@17-8-2009
10830740@unknown@formal@none@1@S@([[Descriptive statistics]])@@@@1@2@@danf@17-8-2009
10830750@unknown@formal@none@1@S@# Reaching consensus about what [[statistical inference|the observations tell]] about the world being observed.@@@@1@14@@danf@17-8-2009
10830760@unknown@formal@none@1@S@([[Statistical inference]])@@@@1@2@@danf@17-8-2009
10830770@unknown@formal@none@1@S@# Documenting / presenting the results of the study.@@@@1@9@@danf@17-8-2009
10830780@unknown@formal@none@1@S@===Levels of measurement===@@@@1@3@@danf@17-8-2009
10830790@unknown@formal@none@1@S@:''See: [[Levels of measurement|Stanley Stevens' "Scales of measurement" (1946): nominal, ordinal, interval, ratio]]''@@@@1@13@@danf@17-8-2009
10830800@unknown@formal@none@1@S@There are four types of measurements or [[level of measurement|levels of measurement]] or measurement scales used in statistics: nominal, ordinal, interval, and ratio.@@@@1@23@@danf@17-8-2009
10830810@unknown@formal@none@1@S@They have different degrees of usefulness in statistical [[research]].@@@@1@9@@danf@17-8-2009
10830820@unknown@formal@none@1@S@Ratio measurements have both a zero value defined and the distances between different measurements defined; they provide the greatest flexibility in statistical methods that can be used for analyzing the data.@@@@1@31@@danf@17-8-2009
10830830@unknown@formal@none@1@S@Interval measurements have meaningful distances between measurements defined, but have no meaningful zero value defined (as in the case with IQ measurements or with temperature measurements in [[Fahrenheit]]).@@@@1@28@@danf@17-8-2009
10830840@unknown@formal@none@1@S@Ordinal measurements have imprecise differences between consecutive values, but have a meaningful order to those values.@@@@1@16@@danf@17-8-2009
10830850@unknown@formal@none@1@S@Nominal measurements have no meaningful rank order among values.@@@@1@9@@danf@17-8-2009
10830860@unknown@formal@none@1@S@Since variables conforming only to nominal or ordinal measurements cannot be reasonably measured numerically, they are sometimes grouped together as categorical variables, whereas ratio and interval measurements are grouped together as quantitative or [[continuous variables]] due to their numerical nature.@@@@1@40@@danf@17-8-2009
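The practical difference between the four levels can be sketched in a few lines of Python. The samples below are hypothetical and chosen only for illustration, and the helper `to_celsius` is an assumption of this sketch. It shows which summaries remain meaningful at each level, in particular that differences on an interval scale survive a change of unit while ratios do not.

```python
from statistics import mean, median, mode

# Hypothetical samples, one per level of measurement.
blood_types = ["A", "O", "O", "B", "AB", "O"]   # nominal: labels only
satisfaction = [1, 2, 2, 3, 5]                   # ordinal: ranks, 1 = lowest
temps_f = [50.0, 68.0, 86.0]                     # interval: degrees Fahrenheit
masses_kg = [2.0, 4.0, 8.0]                      # ratio: a true zero exists


def to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32.0) * 5.0 / 9.0


# Nominal: counting categories (the mode) is the only sensible summary.
most_common_type = mode(blood_types)

# Ordinal: order-based summaries such as the median are meaningful.
middle_rank = median(satisfaction)

# Interval: means and differences are meaningful.
mean_temp_f = mean(temps_f)
temps_c = [to_celsius(t) for t in temps_f]
equal_steps_f = (temps_f[1] - temps_f[0]) == (temps_f[2] - temps_f[1])
equal_steps_c = (temps_c[1] - temps_c[0]) == (temps_c[2] - temps_c[1])

# Interval scales have no meaningful zero, so a ratio depends on the unit.
ratio_f = temps_f[2] / temps_f[0]
ratio_c = temps_c[2] / temps_c[0]

# Ratio: "four times as heavy" means the same thing in any unit of mass.
mass_ratio = masses_kg[2] / masses_kg[0]
```

Equal spacing between the temperatures holds in both units, whereas the ratio of the highest to the lowest temperature changes with the unit. This is the sense in which ratio measurements allow the widest range of statistical methods.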
10830870@unknown@formal@none@1@S@===Statistical techniques===@@@@1@2@@danf@17-8-2009
10830880@unknown@formal@none@1@S@Some well-known statistical [[Statistical hypothesis testing|test]]s and [[procedure]]s for [[research]] [[observation]]s are:@@@@1@12@@danf@17-8-2009
10830890@unknown@formal@none@1@S@* [[Student's t-test]]@@@@1@3@@danf@17-8-2009
10830900@unknown@formal@none@1@S@* [[chi-square test]]@@@@1@3@@danf@17-8-2009
10830910@unknown@formal@none@1@S@* [[Analysis of variance]] (ANOVA)@@@@1@5@@danf@17-8-2009
10830920@unknown@formal@none@1@S@* [[Mann-Whitney U]]@@@@1@3@@danf@17-8-2009
10830930@unknown@formal@none@1@S@* [[Regression analysis]]@@@@1@3@@danf@17-8-2009
10830940@unknown@formal@none@1@S@* [[Factor Analysis]]@@@@1@3@@danf@17-8-2009
10830950@unknown@formal@none@1@S@* [[Correlation]]@@@@1@2@@danf@17-8-2009
10830960@unknown@formal@none@1@S@* [[Pearson product-moment correlation coefficient]]@@@@1@5@@danf@17-8-2009
10830970@unknown@formal@none@1@S@* [[Spearman's rank correlation coefficient]]@@@@1@5@@danf@17-8-2009
10830980@unknown@formal@none@1@S@* [[Time Series Analysis]]@@@@1@4@@danf@17-8-2009
10830990@unknown@formal@none@1@S@==Specialized disciplines==@@@@1@2@@danf@17-8-2009
10831000@unknown@formal@none@1@S@Some fields of inquiry use applied statistics so extensively that they have [[specialized terminology]].@@@@1@14@@danf@17-8-2009
10831010@unknown@formal@none@1@S@These disciplines include:@@@@1@3@@danf@17-8-2009
10831020@unknown@formal@none@1@S@* [[Actuarial science]]@@@@1@3@@danf@17-8-2009
10831030@unknown@formal@none@1@S@* [[Applied information economics]]@@@@1@4@@danf@17-8-2009
10831040@unknown@formal@none@1@S@* [[Biostatistics]]@@@@1@2@@danf@17-8-2009
10831050@unknown@formal@none@1@S@* [[Bootstrapping (statistics)|Bootstrap]] & [[Resampling (statistics)|Jackknife Resampling]]@@@@1@7@@danf@17-8-2009
10831060@unknown@formal@none@1@S@* [[Business statistics]]@@@@1@3@@danf@17-8-2009
10831070@unknown@formal@none@1@S@* [[Data analysis]]@@@@1@3@@danf@17-8-2009
10831080@unknown@formal@none@1@S@* [[Data mining]] (applying statistics and [[pattern recognition]] to discover knowledge from data)@@@@1@13@@danf@17-8-2009
10831090@unknown@formal@none@1@S@* [[Demography]]@@@@1@2@@danf@17-8-2009
10831100@unknown@formal@none@1@S@* [[Economic statistics]] (Econometrics)@@@@1@4@@danf@17-8-2009
10831110@unknown@formal@none@1@S@* [[Energy statistics]]@@@@1@3@@danf@17-8-2009
10831120@unknown@formal@none@1@S@* [[Engineering statistics]]@@@@1@3@@danf@17-8-2009
10831130@unknown@formal@none@1@S@* [[Environmental Statistics]]@@@@1@3@@danf@17-8-2009
10831140@unknown@formal@none@1@S@* [[Epidemiology]]@@@@1@2@@danf@17-8-2009
10831150@unknown@formal@none@1@S@* [[Geography]] and [[Geographic Information Systems]], more specifically in [[Spatial analysis]]@@@@1@11@@danf@17-8-2009
10831160@unknown@formal@none@1@S@* [[Image processing]]@@@@1@3@@danf@17-8-2009
10831170@unknown@formal@none@1@S@* [[Multivariate statistics|Multivariate Analysis]]@@@@1@4@@danf@17-8-2009
10831180@unknown@formal@none@1@S@* [[Psychological statistics]]@@@@1@3@@danf@17-8-2009
10831190@unknown@formal@none@1@S@* [[Quality]]@@@@1@2@@danf@17-8-2009
10831200@unknown@formal@none@1@S@* [[Social statistics]]@@@@1@3@@danf@17-8-2009
10831210@unknown@formal@none@1@S@* [[Statistical literacy]]@@@@1@3@@danf@17-8-2009
10831220@unknown@formal@none@1@S@* [[Statistical modeling]]@@@@1@3@@danf@17-8-2009
10831230@unknown@formal@none@1@S@* [[Statistical survey]]s@@@@1@3@@danf@17-8-2009
10831240@unknown@formal@none@1@S@* Process analysis and [[chemometrics]] (for analysis of data from [[analytical chemistry]] and [[chemical engineering]])@@@@1@15@@danf@17-8-2009
10831250@unknown@formal@none@1@S@* [[Structured data analysis (statistics)]]@@@@1@5@@danf@17-8-2009
10831260@unknown@formal@none@1@S@* [[Survival analysis]]@@@@1@3@@danf@17-8-2009
10831270@unknown@formal@none@1@S@* [[Reliability engineering]]@@@@1@3@@danf@17-8-2009
10831280@unknown@formal@none@1@S@* Statistics in various sports, particularly [[Baseball statistics|baseball]] and [[Cricket statistics|cricket]]@@@@1@11@@danf@17-8-2009
10831290@unknown@formal@none@1@S@Statistics is a key tool in business and manufacturing as well.@@@@1@11@@danf@17-8-2009
10831300@unknown@formal@none@1@S@It is used to understand measurement systems variability, to control processes (as in [[statistical process control]] or SPC), to summarize data, and to make data-driven decisions.@@@@1@26@@danf@17-8-2009
10831310@unknown@formal@none@1@S@In these roles, it is a key tool, and perhaps the only reliable tool.@@@@1@14@@danf@17-8-2009
10831320@unknown@formal@none@1@S@==Statistical computing==@@@@1@2@@danf@17-8-2009
10831330@unknown@formal@none@1@S@The rapid and sustained increases in computing power starting from the second half of the 20th century have had a substantial impact on the practice of statistical science.@@@@1@28@@danf@17-8-2009
10831340@unknown@formal@none@1@S@Early statistical models were almost always from the class of [[linear model]]s, but powerful computers, coupled with suitable numerical [[algorithms]], caused an increased interest in [[nonlinear regression|nonlinear models]] (especially [[neural networks]] and [[decision tree]]s) as well as the creation of new types, such as [[generalized linear model|generalised linear model]]s and [[multilevel model]]s.@@@@1@52@@danf@17-8-2009
10831350@unknown@formal@none@1@S@Increased computing power has also led to the growing popularity of computationally-intensive methods based on [[resampling (statistics)|resampling]], such as permutation tests and the [[bootstrapping (statistics)|bootstrap]], while techniques such as [[Gibbs sampling]] have made Bayesian methods more feasible.@@@@1@37@@danf@17-8-2009
10831360@unknown@formal@none@1@S@The computer revolution has implications for the future of statistics with new emphasis on "experimental" and "empirical" statistics.@@@@1@18@@danf@17-8-2009
10831370@unknown@formal@none@1@S@A large number of both general and special-purpose [[List of statistical packages|statistical software]] packages are now available.@@@@1@17@@danf@17-8-2009
10831380@unknown@formal@none@1@S@== Misuse ==@@@@1@3@@danf@17-8-2009
10831400@unknown@formal@none@1@S@There is a general perception that statistical knowledge is all-too-frequently intentionally [[Misuse of statistics|misused]] by finding ways to interpret only the data that are favorable to the presenter.@@@@1@28@@danf@17-8-2009
10831410@unknown@formal@none@1@S@A famous saying attributed to [[Benjamin Disraeli]] is, "[[Lies, damned lies, and statistics|There are three kinds of lies: lies, damned lies, and statistics]]"; and Harvard President [[Lawrence Lowell]] wrote in 1909 that statistics, ''"like veal pies, are good if you know the person that made them, and are sure of the ingredients"''.@@@@1@52@@danf@17-8-2009
10831420@unknown@formal@none@1@S@If various studies appear to contradict one another, then the public may come to distrust such studies.@@@@1@17@@danf@17-8-2009
10831430@unknown@formal@none@1@S@For example, one study may suggest that a given diet or activity raises [[blood pressure]], while another may suggest that it lowers blood pressure.@@@@1@24@@danf@17-8-2009
10831440@unknown@formal@none@1@S@The discrepancy can arise from subtle variations in experimental design, such as differences in the patient groups or research protocols, that are not easily understood by the non-expert.@@@@1@28@@danf@17-8-2009
10831450@unknown@formal@none@1@S@(Media reports sometimes omit this vital contextual information entirely.)@@@@1@9@@danf@17-8-2009
10831460@unknown@formal@none@1@S@By choosing (or rejecting, or modifying) a certain sample, results can be manipulated.@@@@1@13@@danf@17-8-2009
10831470@unknown@formal@none@1@S@Such manipulations need not be malicious or devious; they can arise from unintentional biases of the researcher.@@@@1@17@@danf@17-8-2009
10831480@unknown@formal@none@1@S@The graphs used to summarize data can also be misleading.@@@@1@10@@danf@17-8-2009
10831490@unknown@formal@none@1@S@Deeper criticisms come from the fact that the hypothesis testing approach, widely used and in many cases required by law or regulation, forces one hypothesis (the [[null hypothesis]]) to be "favored", and can also seem to exaggerate the importance of minor differences in large studies.@@@@1@45@@danf@17-8-2009
10831500@unknown@formal@none@1@S@A difference that is highly statistically significant can still be of no practical significance.@@@@1@14@@danf@17-8-2009
10831510@unknown@formal@none@1@S@(See [[Hypothesis test#Criticism|criticism of hypothesis testing]] and [[Null hypothesis#Controversy|controversy over the null hypothesis]].)@@@@1@13@@danf@17-8-2009
10831520@unknown@formal@none@1@S@One response is to place greater emphasis on the [[p-value|''p''-value]] rather than simply reporting whether a hypothesis is rejected at the given level of significance.@@@@1@25@@danf@17-8-2009
10831530@unknown@formal@none@1@S@The ''p''-value, however, does not indicate the size of the effect.@@@@1@11@@danf@17-8-2009
10831540@unknown@formal@none@1@S@Another increasingly common approach is to report [[confidence interval]]s.@@@@1@9@@danf@17-8-2009
10831550@unknown@formal@none@1@S@Although these are produced from the same calculations as those of hypothesis tests or ''p''-values, they describe both the size of the effect and the uncertainty surrounding it.@@@@1@28@@danf@17-8-2009
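The contrast between statistical and practical significance can be illustrated with a minimal Python sketch. It assumes a large-sample two-sided z-test for a difference in means between two independent groups of equal size with a common, known standard deviation. The function name `two_sample_z` and all numbers are hypothetical, chosen only to show the effect of a very large sample.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_a, mean_b, sd, n, level=0.95):
    """Two-sided z-test and confidence interval for mean_b - mean_a.

    Assumes two independent groups of equal size n with a common,
    known standard deviation sd (a simplification for illustration).
    """
    diff = mean_b - mean_a
    se = sd * sqrt(2.0 / n)                      # standard error of the difference
    z = diff / se
    std_normal = NormalDist()                    # mean 0, standard deviation 1
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(0.5 + level / 2.0)
    ci = (diff - z_crit * se, diff + z_crit * se)
    effect_size = diff / sd                      # standardized difference
    return p_value, ci, effect_size


# Hypothetical study: one million subjects per group, a 0.1-point difference.
p_value, (ci_low, ci_high), effect_size = two_sample_z(
    mean_a=100.0, mean_b=100.1, sd=15.0, n=1_000_000
)
```

With these inputs the ''p''-value is far below conventional significance levels, yet the confidence interval shows that the difference is at most a fraction of a point on a scale whose standard deviation is 15. The interval reports both the size of the effect and its uncertainty, which the ''p''-value alone does not.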
10840010@unknown@formal@none@1@S@Syntax@@@@1@1@@danf@17-8-2009
10840020@unknown@formal@none@1@S@In [[linguistics]], '''syntax''' (from [[Ancient Greek]] {{lang|grc|συν-}} ''syn-'', "together", and {{lang|grc|τάξις}} ''táxis'', "arrangement") is the study of the principles and rules for constructing [[sentence]]s in [[natural language]]s.@@@@1@27@@danf@17-8-2009
10840030@unknown@formal@none@1@S@In addition to referring to the discipline, the term ''syntax'' is also used to refer directly to the rules and principles that govern the sentence structure of any individual language, as in "the [[Irish syntax|syntax of Modern Irish]]".@@@@1@38@@danf@17-8-2009
10840040@unknown@formal@none@1@S@Modern research in syntax attempts to [[descriptive linguistics|describe languages]] in terms of such rules.@@@@1@14@@danf@17-8-2009
10840050@unknown@formal@none@1@S@Many professionals in this discipline attempt to find [[Universal Grammar|general rules]] that apply to all natural languages.@@@@1@17@@danf@17-8-2009
10840060@unknown@formal@none@1@S@The term ''syntax'' is also sometimes used to refer to the rules governing the behavior of mathematical systems, such as [[logic]], artificial formal languages, and computer programming languages.@@@@1@28@@danf@17-8-2009
10840070@unknown@formal@none@1@S@== Early history ==@@@@1@4@@danf@17-8-2009
10840080@unknown@formal@none@1@S@Works on grammar were being written long before modern syntax came about; the ''Aṣṭādhyāyī'' of [[Pāṇini]] is often cited as an example of a pre-modern work that approaches the sophistication of a modern syntactic theory.@@@@1@35@@danf@17-8-2009
10840090@unknown@formal@none@1@S@In the West, the school of thought that came to be known as "traditional grammar" began with the work of [[Dionysius Thrax]].@@@@1@22@@danf@17-8-2009
10840100@unknown@formal@none@1@S@For centuries, work in syntax was dominated by a framework known as {{lang|fr|''grammaire générale''}}, first expounded in 1660 by [[Antoine Arnauld]] in a book of the same title.@@@@1@28@@danf@17-8-2009
10840110@unknown@formal@none@1@S@This system took as its basic premise the assumption that language is a direct reflection of thought processes and therefore there is a single, most natural way to express a thought.@@@@1@31@@danf@17-8-2009
10840120@unknown@formal@none@1@S@That way, coincidentally, was exactly the way it was expressed in French.@@@@1@12@@danf@17-8-2009
10840130@unknown@formal@none@1@S@However, in the 19th century, with the development of [[historical-comparative linguistics]], linguists began to realize the sheer diversity of human language, and to question fundamental assumptions about the relationship between language and logic.@@@@1@33@@danf@17-8-2009
10840140@unknown@formal@none@1@S@It became apparent that there was no such thing as a most natural way to express a thought, and therefore logic could no longer be relied upon as a basis for studying the structure of language.@@@@1@36@@danf@17-8-2009
10840150@unknown@formal@none@1@S@The Port-Royal grammar modeled the study of syntax upon that of logic (indeed, large parts of the [[Port-Royal Logic]] were copied or adapted from the ''Grammaire générale'').@@@@1@27@@danf@17-8-2009
10840160@unknown@formal@none@1@S@Syntactic categories were identified with logical ones, and all sentences were analyzed in terms of "Subject – Copula – Predicate".@@@@1@20@@danf@17-8-2009
10840170@unknown@formal@none@1@S@Initially, this view was adopted even by the early comparative linguists such as [[Franz Bopp]].@@@@1@15@@danf@17-8-2009
10840180@unknown@formal@none@1@S@The central role of syntax within theoretical linguistics became clear only in the 20th century, which could reasonably be called the "century of syntactic theory" as far as linguistics is concerned.@@@@1@31@@danf@17-8-2009
10840190@unknown@formal@none@1@S@For a detailed and critical survey of the history of syntax in the last two centuries, see the monumental work by Graffi (2001).@@@@1@23@@danf@17-8-2009
10840200@unknown@formal@none@1@S@==Modern theories==@@@@1@2@@danf@17-8-2009
10840210@unknown@formal@none@1@S@There are a number of theoretical approaches to the discipline of syntax.@@@@1@12@@danf@17-8-2009
10840220@unknown@formal@none@1@S@Many linguists (e.g. [[Noam Chomsky]]) see syntax as a branch of biology, since they conceive of syntax as the study of linguistic knowledge as embodied in the human [[mind]].@@@@1@29@@danf@17-8-2009
10840240@unknown@formal@none@1@S@Others (e.g. [[Gerald Gazdar]]) take a more [[Philosophy of mathematics#Platonism|Platonistic]] view, since they regard syntax as the study of an abstract [[formal system]].@@@@1@23@@danf@17-8-2009
10840260@unknown@formal@none@1@S@Yet others (e.g. [[Joseph Greenberg]]) consider grammar a taxonomical device to reach broad generalizations across languages.@@@@1@16@@danf@17-8-2009
10840280@unknown@formal@none@1@S@Some of the major approaches to the discipline are listed below.@@@@1@11@@danf@17-8-2009
10840290@unknown@formal@none@1@S@===Generative grammar===@@@@1@2@@danf@17-8-2009
10840300@unknown@formal@none@1@S@The hypothesis of [[generative grammar]] is that language is a structure of the human mind.@@@@1@15@@danf@17-8-2009
10840310@unknown@formal@none@1@S@The goal of generative grammar is to make a complete model of this inner language (known as ''[[i-language]]'').@@@@1@18@@danf@17-8-2009
10840320@unknown@formal@none@1@S@This model could be used to describe all human language and to predict the [[grammaticality]] of any given utterance (that is, to predict whether the utterance would sound correct to native speakers of the language).@@@@1@35@@danf@17-8-2009
10840330@unknown@formal@none@1@S@This approach to language was pioneered by [[Noam Chomsky]].@@@@1@9@@danf@17-8-2009
10840340@unknown@formal@none@1@S@Most generative theories (although not all of them) assume that syntax is based upon the constituent structure of sentences.@@@@1@19@@danf@17-8-2009
10840350@unknown@formal@none@1@S@Generative grammars are among the theories that focus primarily on the form of a sentence, rather than its communicative function.@@@@1@20@@danf@17-8-2009
10840360@unknown@formal@none@1@S@Among the many generative theories of linguistics are:@@@@1@8@@danf@17-8-2009
10840370@unknown@formal@none@1@S@*[[Transformational Grammar]] (TG) (now largely out of date)@@@@1@8@@danf@17-8-2009
10840380@unknown@formal@none@1@S@*[[Government and binding theory]] (GB) (common in the late 1970s and 1980s)@@@@1@12@@danf@17-8-2009
10840390@unknown@formal@none@1@S@*[[Linguistic minimalism|Minimalism]] (MP) (the most recent Chomskyan version of generative grammar)@@@@1@11@@danf@17-8-2009
10840400@unknown@formal@none@1@S@Other theories that find their origin in the generative paradigm are:@@@@1@11@@danf@17-8-2009
10840410@unknown@formal@none@1@S@*[[Generative semantics]] (now largely out of date)@@@@1@7@@danf@17-8-2009
10840420@unknown@formal@none@1@S@*[[Relational grammar]] (RG) (now largely out of date)@@@@1@8@@danf@17-8-2009
10840430@unknown@formal@none@1@S@*[[Arc Pair grammar]]@@@@1@3@@danf@17-8-2009
10840440@unknown@formal@none@1@S@*[[Generalised phrase structure grammar|Generalized phrase structure grammar]] (GPSG; now largely out of date)@@@@1@13@@danf@17-8-2009
10840450@unknown@formal@none@1@S@*[[Head-driven phrase structure grammar]] (HPSG)@@@@1@5@@danf@17-8-2009
10840460@unknown@formal@none@1@S@*[[Lexical-functional grammar]] (LFG)@@@@1@3@@danf@17-8-2009
10840470@unknown@formal@none@1@S@===Categorial grammar ===@@@@1@3@@danf@17-8-2009
10840480@unknown@formal@none@1@S@[[Categorial grammar]] is an approach that attributes the syntactic structure not to rules of grammar, but to the properties of the [[syntactic categories]] themselves.@@@@1@24@@danf@17-8-2009
10840490@unknown@formal@none@1@S@For example, rather than asserting that sentences are constructed by a rule that combines a noun phrase (NP) and a verb phrase (VP) (e.g. the [[phrase structure rule]] S → NP VP), in categorial grammar, such principles are embedded in the category of the [[head (linguistics)|head]] word itself.@@@@1@48@@danf@17-8-2009
10840500@unknown@formal@none@1@S@So the syntactic category for an [[intransitive]] verb is a complex formula representing the fact that the verb acts as a [[functor]] which requires an NP as an input and produces a sentence level structure as an output.@@@@1@38@@danf@17-8-2009
10840510@unknown@formal@none@1@S@This complex category is notated as (NP\\S) instead of V.@@@@1@10@@danf@17-8-2009
10840515@unknown@formal@none@1@S@NP\\S is read as "a category that searches to the left (indicated by \\) for an NP (the element on the left) and outputs a sentence (the element on the right)".@@@@1@31@@danf@17-8-2009
10840520@unknown@formal@none@1@S@The category of [[transitive verb]] is defined as an element that requires two NPs (its subject and its direct object) to form a sentence.@@@@1@24@@danf@17-8-2009
10840530@unknown@formal@none@1@S@This is notated as (NP/(NP\\S)) which means "a category that searches to the right (indicated by /) for an NP (the object), and generates a function (equivalent to the VP) which is (NP\\S), which in turn represents a function that searches to the left for an NP and produces a sentence".@@@@1@51@@danf@17-8-2009
10840540@unknown@formal@none@1@S@[[Tree-adjoining grammar]] is a categorial grammar that adds in partial [[tree structure]]s to the categories.@@@@1@15@@danf@17-8-2009
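The way categories drive sentence construction can be sketched in Python. This is a minimal illustration under stated assumptions, not an implementation of any particular categorial formalism: the `Functor` class, the `combine` function and the four-word lexicon are all hypothetical. Each complex category records the direction in which it searches, the category it searches for, and the category it produces, which avoids committing to one slash notation.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Functor:
    """A complex category: seeks `argument` in `direction`, yields `result`."""
    direction: str            # "left" or "right"
    argument: "Category"
    result: "Category"


Category = Union[str, Functor]

NP, S = "NP", "S"
INTRANSITIVE = Functor("left", NP, S)             # written NP\S in the text
TRANSITIVE = Functor("right", NP, INTRANSITIVE)   # written NP/(NP\S) in the text


def combine(left: Category, right: Category) -> Optional[Category]:
    """Combine two adjacent categories by function application, if possible."""
    if (isinstance(left, Functor) and left.direction == "right"
            and left.argument == right):
        return left.result
    if (isinstance(right, Functor) and right.direction == "left"
            and right.argument == left):
        return right.result
    return None   # the two categories cannot combine


# Hypothetical lexicon: each word carries its own category.
LEXICON = {"Mary": NP, "John": NP, "sleeps": INTRANSITIVE, "sees": TRANSITIVE}

# "sees John" behaves like an intransitive verb; "Mary sees John" is a sentence.
verb_phrase = combine(LEXICON["sees"], LEXICON["John"])
sentence = combine(LEXICON["Mary"], verb_phrase)
```

No separate rule such as S → NP VP appears anywhere in the sketch. The only general operation is function application, and everything specific to a word is stored in its category.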
10840550@unknown@formal@none@1@S@===Dependency grammar===@@@@1@2@@danf@17-8-2009
10840560@unknown@formal@none@1@S@[[Dependency grammar]] is a different type of approach in which structure is determined by the [[relation]]s (such as [[grammatical relation]]s) between a word (a ''[[head (linguistics)|head]]'') and its dependents, rather than being based in constituent structure.@@@@1@36@@danf@17-8-2009
10840570@unknown@formal@none@1@S@For example, syntactic structure is described in terms of whether a particular [[noun]] is the [[subject]] or [[agent]] of the [[verb]], rather than describing the relations in terms of trees (one version of which is the [[parse tree]]) or other structural system.@@@@1@42@@danf@17-8-2009
10840580@unknown@formal@none@1@S@Some dependency-based theories of syntax:@@@@1@5@@danf@17-8-2009
10840590@unknown@formal@none@1@S@*[[Algebraic syntax]]@@@@1@2@@danf@17-8-2009
10840600@unknown@formal@none@1@S@*[[Word grammar]]@@@@1@2@@danf@17-8-2009
10840610@unknown@formal@none@1@S@*[[Operator Grammar]]@@@@1@2@@danf@17-8-2009
10840620@unknown@formal@none@1@S@===Stochastic/probabilistic grammars/network theories ===@@@@1@4@@danf@17-8-2009
10840630@unknown@formal@none@1@S@Theoretical approaches to syntax that are based upon [[probability theory]] are known as [[stochastic grammar]]s.@@@@1@15@@danf@17-8-2009
10840640@unknown@formal@none@1@S@One common implementation of such an approach makes use of a [[neural network]] or [[connectionism]].@@@@1@15@@danf@17-8-2009
10840650@unknown@formal@none@1@S@Some theories based within this approach are:@@@@1@7@@danf@17-8-2009
10840660@unknown@formal@none@1@S@*[[Optimality theory]]@@@@1@2@@danf@17-8-2009
10840670@unknown@formal@none@1@S@*[[Stochastic context-free grammar]]@@@@1@3@@danf@17-8-2009
10840680@unknown@formal@none@1@S@===Functionalist grammars===@@@@1@2@@danf@17-8-2009
10840690@unknown@formal@none@1@S@Functionalist theories, although focused upon form, are driven by explanation based upon the function of a sentence (i.e. its communicative function).@@@@1@21@@danf@17-8-2009
10840700@unknown@formal@none@1@S@Some typical functionalist theories include:@@@@1@5@@danf@17-8-2009
10840710@unknown@formal@none@1@S@*[[Functional grammar]] (Dik)@@@@1@3@@danf@17-8-2009
10840720@unknown@formal@none@1@S@*[[Prague Linguistic Circle]]@@@@1@3@@danf@17-8-2009
10840730@unknown@formal@none@1@S@*[[Systemic functional grammar]]@@@@1@3@@danf@17-8-2009
10840740@unknown@formal@none@1@S@*[[Cognitive grammar]]@@@@1@2@@danf@17-8-2009
10840750@unknown@formal@none@1@S@*[[Construction grammar]] (CxG)@@@@1@3@@danf@17-8-2009
10840760@unknown@formal@none@1@S@*[[Role and reference grammar]] (RRG)@@@@1@5@@danf@17-8-2009
10850010@unknown@formal@none@1@S@SYSTRAN@@@@1@1@@danf@17-8-2009
10850020@unknown@formal@none@1@S@'''SYSTRAN''', founded by Dr. [[Peter Toma]] in [[1968]], is one of the oldest [[machine translation]] companies.@@@@1@16@@danf@17-8-2009
10850030@unknown@formal@none@1@S@SYSTRAN has done extensive work for the [[United States Department of Defense]] and the [[European Commission]].@@@@1@16@@danf@17-8-2009
10850040@unknown@formal@none@1@S@SYSTRAN provides the technology for [[Yahoo!]] and [[AltaVista]]'s [[Babel Fish (website)|Babel Fish]], among others; [[Google]]'s [[List of Google products#anchor_language_tools|language tools]] stopped using it for all of their language combinations circa 2007.@@@@1@31@@danf@17-8-2009
10850050@unknown@formal@none@1@S@Commercial versions of SYSTRAN run on the [[Microsoft Windows]] (including [[Windows Mobile]]), [[Linux]] and [[Solaris (operating system)|Solaris]] operating systems.@@@@1@19@@danf@17-8-2009
10850060@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009
10850070@unknown@formal@none@1@S@With its origin in the [[Georgetown-IBM experiment|Georgetown]] machine translation effort, SYSTRAN was one of the few machine translation systems to survive the major decrease of funding after the [[ALPAC|ALPAC Report]] of the mid-1960s.@@@@1@33@@danf@17-8-2009
10850080@unknown@formal@none@1@S@The company was established in [[La Jolla, San Diego, California|La Jolla]], [[California]] to work on translation of Russian to English text for the [[United States Air Force]] during the "[[Cold War]]".@@@@1@31@@danf@17-8-2009
10850090@unknown@formal@none@1@S@Large numbers of Russian scientific and technical documents were translated using SYSTRAN under the auspices of the USAF Foreign Technology Division (later the National Air and Space Intelligence Center) at [[Wright-Patterson Air Force Base]], Ohio.@@@@1@35@@danf@17-8-2009
10850100@unknown@formal@none@1@S@The quality of the translations, although only approximate, was usually adequate for understanding content.@@@@1@14@@danf@17-8-2009
10850110@unknown@formal@none@1@S@The company was sold in 1986 to the Gachot family, based in [[Paris]], [[France]], and is now publicly traded on the French stock exchange.@@@@1@24@@danf@17-8-2009
10850120@unknown@formal@none@1@S@It has a main office at the [[Grande Arche]] in [[La Defense]] and maintains a secondary office in [[La Jolla, San Diego, California]].@@@@1@23@@danf@17-8-2009
10850130@unknown@formal@none@1@S@== Languages ==@@@@1@3@@danf@17-8-2009
10850140@unknown@formal@none@1@S@Here is a list of the source and target languages SYSTRAN works with.@@@@1@13@@danf@17-8-2009
10850150@unknown@formal@none@1@S@Many of the pairs are to or from English or French.@@@@1@11@@danf@17-8-2009
10850160@unknown@formal@none@1@S@* Russian into English (1968)@@@@1@5@@danf@17-8-2009
10850170@unknown@formal@none@1@S@* English into Russian (1973) for the [[Apollo-Soyuz]] project@@@@1@9@@danf@17-8-2009
10850180@unknown@formal@none@1@S@* English source (1975) for the [[European Commission]]@@@@1@8@@danf@17-8-2009
10850190@unknown@formal@none@1@S@* Arabic@@@@1@2@@danf@17-8-2009
10850200@unknown@formal@none@1@S@* Chinese@@@@1@2@@danf@17-8-2009
10850210@unknown@formal@none@1@S@* Danish@@@@1@2@@danf@17-8-2009
10850220@unknown@formal@none@1@S@* Dutch@@@@1@2@@danf@17-8-2009
10850230@unknown@formal@none@1@S@* French@@@@1@2@@danf@17-8-2009
10850240@unknown@formal@none@1@S@* German@@@@1@2@@danf@17-8-2009
10850250@unknown@formal@none@1@S@* Greek@@@@1@2@@danf@17-8-2009
10850260@unknown@formal@none@1@S@* Hindi@@@@1@2@@danf@17-8-2009
10850270@unknown@formal@none@1@S@* Italian@@@@1@2@@danf@17-8-2009
10850280@unknown@formal@none@1@S@* Japanese@@@@1@2@@danf@17-8-2009
10850290@unknown@formal@none@1@S@* Korean@@@@1@2@@danf@17-8-2009
10850300@unknown@formal@none@1@S@* Norwegian@@@@1@2@@danf@17-8-2009
10850310@unknown@formal@none@1@S@* Serbo-Croatian@@@@1@2@@danf@17-8-2009
10850320@unknown@formal@none@1@S@* Spanish@@@@1@2@@danf@17-8-2009
10850330@unknown@formal@none@1@S@* Swedish@@@@1@2@@danf@17-8-2009
10850340@unknown@formal@none@1@S@* Persian@@@@1@2@@danf@17-8-2009
10850350@unknown@formal@none@1@S@* Polish@@@@1@2@@danf@17-8-2009
10850360@unknown@formal@none@1@S@* Portuguese@@@@1@2@@danf@17-8-2009
10850370@unknown@formal@none@1@S@* Ukrainian@@@@1@2@@danf@17-8-2009
10850380@unknown@formal@none@1@S@* Urdu@@@@1@2@@danf@17-8-2009
10860010@unknown@formal@none@1@S@Text analytics@@@@1@2@@danf@17-8-2009
10860020@unknown@formal@none@1@S@The term '''text analytics''' describes a set of linguistic, lexical, pattern recognition, extraction, tagging/structuring, visualization, and predictive techniques.@@@@1@18@@danf@17-8-2009
10860030@unknown@formal@none@1@S@The term also describes processes that apply these techniques, whether independently or in conjunction with query and analysis of fielded, numerical data, to solve business problems.@@@@1@26@@danf@17-8-2009
10860040@unknown@formal@none@1@S@These techniques and processes discover and present knowledge – facts, business rules, and relationships – that is otherwise locked in textual form, impenetrable to automated processing.@@@@1@26@@danf@17-8-2009
10860050@unknown@formal@none@1@S@A typical application is to scan a set of documents written in a [[natural language]] and either model the document set for predictive classification purposes or populate a database or search index with the information extracted.@@@@1@36@@danf@17-8-2009
10860060@unknown@formal@none@1@S@Current approaches to text analytics use [[natural language processing]] techniques that focus on specialized domains.@@@@1@15@@danf@17-8-2009
10860070@unknown@formal@none@1@S@Typical subtasks are:@@@@1@3@@danf@17-8-2009
10860080@unknown@formal@none@1@S@* [[Named Entity Recognition]]: recognition of entity names (for people and organizations), place names, temporal expressions, and certain types of numerical expressions.@@@@1@22@@danf@17-8-2009
10860090@unknown@formal@none@1@S@* [[Coreference]]: identification of chains of [[noun phrase]]s that refer to the same object.@@@@1@14@@danf@17-8-2009
10860100@unknown@formal@none@1@S@For example, [[Anaphora (linguistics)|anaphora]] is a type of coreference.@@@@1@9@@danf@17-8-2009
10860110@unknown@formal@none@1@S@* [[Relationship Extraction]]: extraction of named relationships between entities in text.@@@@1@11@@danf@17-8-2009