The history of natural language processing (NLP) begins in the 1950s, although earlier work can be found. In 1950, Alan Turing published a famous article entitled "Computing Machinery and Intelligence", which proposes what is now called the Turing test as a criterion of intelligence. This criterion depends on the ability of a computer program to impersonate a human in a real-time written conversation, convincingly enough that the human interlocutor cannot reliably tell, based solely on the content of the conversation, whether they are interacting with a program or with another human. The Georgetown experiment in 1954 involved the fully automatic translation of more than sixty Russian sentences into English. The authors claimed that within three to five years, machine translation would be a solved problem.

Statistical natural language processing relies on stochastic, probabilistic, or simply statistical methods to address some of the difficulties of language analysis, especially those that arise because very long sentences are highly ambiguous when analyzed with realistic grammars, which allow thousands or millions of possible analyses. Disambiguation methods often involve the use of corpora and formal tools such as Markov models. Statistical NLP covers all quantitative approaches to automated language processing, including probabilistic modeling, information theory, and linear algebra. The technology for statistical NLP comes mainly from machine learning and data mining, two fields related to artificial intelligence that both involve learning from data.
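
As a minimal sketch of how a Markov model learned from a corpus can rank competing analyses, the Python example below estimates bigram (first-order Markov) transition probabilities over part-of-speech tags and uses them to pick the more plausible of two candidate tag sequences. The toy corpus, the tag names, and the function names are all invented for illustration; add-one smoothing is an assumption made here to avoid zero probabilities, not something the methods above prescribe.

```python
from collections import Counter
import math

# Toy corpus of part-of-speech tag sequences (invented for illustration).
TAGGED_CORPUS = [
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
    ["DET", "NOUN", "VERB", "ADV"],
    ["PRON", "VERB", "DET", "NOUN"],
    ["DET", "ADJ", "NOUN", "VERB"],
]

START = "<s>"  # artificial start-of-sentence symbol


def train_bigram_model(sequences):
    """Count tag bigrams and their left contexts over the corpus."""
    bigrams, contexts, tags = Counter(), Counter(), set()
    for seq in sequences:
        prev = START
        for tag in seq:
            bigrams[(prev, tag)] += 1
            contexts[prev] += 1
            tags.add(tag)
            prev = tag
    return bigrams, contexts, tags


def log_prob(seq, model):
    """Log-probability of a tag sequence under the bigram model (add-one smoothing)."""
    bigrams, contexts, tags = model
    vocab_size = len(tags)
    total, prev = 0.0, START
    for tag in seq:
        total += math.log((bigrams[(prev, tag)] + 1) / (contexts[prev] + vocab_size))
        prev = tag
    return total


def best_analysis(candidates, model):
    """Return the candidate analysis with the highest model probability."""
    return max(candidates, key=lambda seq: log_prob(seq, model))


model = train_bigram_model(TAGGED_CORPUS)
# Two competing analyses of the same three-word sentence.
candidates = [
    ["DET", "NOUN", "VERB"],
    ["DET", "VERB", "NOUN"],
]
best = best_analysis(candidates, model)
print(best)
```

With a real corpus the same idea scales to thousands of competing analyses: each is scored by the model and only the most probable ones are kept.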

Hebrew (עברית) is a Semitic language belonging to the Afro-Asiatic language family. Today, Modern Hebrew is the language of everyday speech, literature and official dealings, spoken by more than 7 million people living within the borders of Israel and the Palestinian Territories. Hebrew poses many challenges to developers of natural language processing systems because of its particular spelling and rich morphology. A highly advanced software infrastructure, based on linguistic knowledge, is required for natural language applications such as machine translation, speech-to-text conversion, automatic document summarization, and spelling and style checking.

NLP Tasks


NLP therefore covers several classic tasks. It is in fact the heir to a long mathematical and logical tradition of computational modeling. More precisely, the foundations of computing can be said to be twofold: first, encoding data using discrete elements (the famous 0/1); second, encoding the processing itself as algorithms.

NLP also attracts computational linguists interested in semantics. Even if a word has only one grammatical function and fits into a clear syntactic structure, it may still have several meanings. The same word form can carry different meanings (this is the case for several Hebrew words).
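
To make the point about multiple meanings concrete, here is a small Python sketch built around one well-known Hebrew example: the unvocalized spelling ספר can be read as "sefer" (book), "sapar" (barber), "safar" (he counted) or "sfar" (frontier). The lexicon below is a hand-written toy and the function names are hypothetical; it is not taken from any real linguistic resource.

```python
# Hand-written toy lexicon (illustrative assumption, not a real resource).
# Each unvocalized surface form maps to (transliteration, part of speech, gloss).
TOY_LEXICON = {
    "ספר": [
        ("sefer", "noun", "book"),
        ("sapar", "noun", "barber"),
        ("safar", "verb", "he counted"),
        ("sfar", "noun", "frontier"),
    ],
    "שלום": [
        ("shalom", "noun", "peace; hello"),
    ],
}


def readings(surface_form):
    """Return every reading listed for an unvocalized surface form."""
    return TOY_LEXICON.get(surface_form, [])


def senses_for(surface_form, pos):
    """Keep only the readings that match a given part of speech."""
    return [r for r in readings(surface_form) if r[1] == pos]


def is_ambiguous(surface_form):
    """A form is ambiguous when the lexicon lists more than one reading."""
    return len(readings(surface_form)) > 1


for translit, pos, gloss in readings("ספר"):
    print(translit, pos, gloss)
```

Note that `senses_for("ספר", "noun")` still returns three readings: fixing the grammatical function narrows the choice but does not remove the ambiguity, which is why context is needed.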

Advanced natural language processing (NLP) technology gives virtual assistants the power to interact naturally in Hebrew and provide personalized service to clients. Resources for this are available from MILA, on GitHub, and from many other sources online, such as bold360, which offers technology intended to humanize the customer experience. Consider Discount Bank, one of the largest banks in Israel: it knew it had to meet customers' expectations in the digital age in order to satisfy and retain them. Customers expect a simpler, faster, and smarter banking experience, and now prefer to type free text instead of searching through the app's options. To meet these expectations, Discount Bank decided to look for an online technology that would add value for the customer and provide valuable information to the bank. The real challenge was to provide an AI solution that would understand customers' natural language without restriction and translate it into queries or actions on their accounts, so that they can get the information they need simply and easily.

Why are NLP tasks harder in Hebrew than in other languages?

NLP starts with morphology: it is based on the function of words. In primary school, we absorb a simplified form of this process when we learn to tell a noun from a verb or a pronoun. It can be complicated for computers, however, because a word can change function depending on the sentence in which it occurs, and this is especially difficult in Hebrew. The constraints of the Hebrew language are numerous; they stem from its complicated structure and from its rich and distinctive lexicon. For this reason, the Israeli National Institute for Testing and Evaluation has launched a Hebrew Language Project that provides a morphological dictionary, annotated corpora, linguistic models, tokenizers and more: https://hlp.nite.org.il/Default.aspx

Among the most popular open-source tools, we find:

  • The Natural Language Toolkit (NLTK) is a set of NLP tools for the Python language. It offers access to more than 100 text corpora.
  • Stanford NLP Group Software: the Stanford NLP Group is one of the most important research groups in the field of natural language processing, and it makes many tools available. They make it possible to determine the base form of words (segmentation into units), the function of words (part-of-speech tagging) and the structure of sentences (parse trees). In addition, there are tools for more complicated processes, such as deep learning, in which the context of the sentence is taken into account. Stanford CoreNLP bundles most of the basic features.
  • For spoken language processing, there is the CSLU Toolkit. Its tools include, among other functions, speech recognition and text-to-speech (reading text aloud with a synthetic voice). They also include training tools with which children can learn new vocabulary and with which deaf people can practice speaking. The tools are thus suited to schoolchildren, students, researchers and of course anyone else who is interested.
  • VisualText is a set of tools built around an NLP-specific programming language: NLP++.
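
As a rough illustration of why Hebrew spelling and morphology complicate even the first step of processing, the Python sketch below enumerates the possible segmentations of a written token. In Hebrew, short function words such as "and", "the", "in" or "to" are written attached to the following word, so the string בצל can be read either as the single word "onion" or as ב + צל ("in a shadow"). The prefix table and the three-word stem list are toy assumptions made for this example, and the function name is hypothetical; real tools rely on full morphological dictionaries.

```python
# Common one-letter Hebrew prefixes and rough English glosses.
PREFIXES = {
    "ו": "and",
    "ה": "the",
    "ב": "in",
    "ל": "to",
    "כ": "as",
    "מ": "from",
    "ש": "that",
}

# Toy stem lexicon (illustrative assumption, not a real dictionary).
TOY_STEMS = {
    "בצל": "onion",
    "צל": "shadow",
    "בית": "house",
}


def segmentations(token, stems=TOY_STEMS, prefixes=PREFIXES):
    """List every way to split a token into optional prefixes plus a known stem."""
    results = []
    # Reading 1: the whole token is a known stem.
    if token in stems:
        results.append([token])
    # Reading 2: peel off one prefix letter and segment the remainder recursively.
    if len(token) > 1 and token[0] in prefixes:
        for rest in segmentations(token[1:], stems, prefixes):
            results.append([token[0]] + rest)
    return results


for token in ["בצל", "ובבית"]:
    print(token, segmentations(token))
```

A token with more than one segmentation has to be disambiguated from its context, which is one reason labeled corpora and statistical models matter so much for Hebrew.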


The language of Israelis today is not the Hebrew of yesterday. It must be remembered, however, that the Israeli who remains convinced that he speaks the language of the Bible, even if that is not quite the case, still understands it better than a contemporary Greek understands Homer. Moreover, Hebrew has finally become, in some respects, a language like any other, and today it is undergoing an evolution similar to that of other languages: constant, so-called natural change, discrepancies between certain official rules and popular usage, and even differentiation of the language by social category, with the creation of a Hebrew slang. In this context, the availability of multiple open-source projects offering high-performance NLP solutions is an urgent necessity, so that users have several choices on the market; they know very well that, despite its difficulty, this language remains very interesting to learn.
