NLP – Natural Language Processing

Layers

  1. Input and initial processing: Taking in speech or text and breaking it up into smaller pieces for processing. For speech, this step is called phonetic analysis and consists of breaking the speech down into individual sounds, called phonemes. For text, it can include optical character recognition (OCR) and tokenization. OCR recognizes the individual characters when the text arrives as an image rather than as encoded characters. Tokenization breaks continuous text into individual tokens, often words.
  2. Morphological analysis: Breaking down complex words into their components to better understand their meaning. For example, “incomprehensible” breaks down into three parts:
    • “in”: not
    • “comprehens”: to understand or comprehend
    • “ible”: marks the word as an adjective describing whether something can be comprehended
  3. Syntactic analysis: Working out the structure of a sentence by looking at how its words work together. This step is like diagramming a sentence, where you identify the role each word plays.
  4. Semantic interpretation: Working out the meaning of a sentence by combining the meanings of the individual words with their syntactic roles.
  5. Discourse processing: Using the context around a sentence to fully work out what it means.
1. Speech (phonetic/phonological analysis) or text (OCR/tokenization); 2. Morphological analysis; 3. Syntactic analysis; 4. Semantic interpretation; 5. Discourse processing
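
The first two layers can be sketched in a few lines of Python. This is only a toy illustration, not how a production NLP system works: the function names `tokenize` and `split_morphemes` and the small `PREFIXES` and `SUFFIXES` tables are hypothetical, made up for this example. Real morphological analyzers rely on large lexicons or trained models, and this naive version will wrongly split words such as "index".

```python
import re

# Hypothetical, tiny affix tables used only for illustration.
PREFIXES = {"in": "not", "un": "not", "re": "again"}
SUFFIXES = {"ible": "adjective: can be done", "able": "adjective: can be done"}


def tokenize(text):
    """Layer 1 (text input): split continuous text into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def split_morphemes(word):
    """Layer 2: naively split a word into (prefix, stem, suffix) using the toy tables."""
    word = word.lower()
    # Take the first known prefix the word starts with, if any.
    prefix = next((p for p in PREFIXES if word.startswith(p)), "")
    rest = word[len(prefix):]
    # Take the first known suffix the remainder ends with, if any.
    suffix = next((s for s in SUFFIXES if rest.endswith(s)), "")
    stem = rest[:len(rest) - len(suffix)] if suffix else rest
    return prefix, stem, suffix


if __name__ == "__main__":
    for token in tokenize("That report is incomprehensible."):
        print(token, split_morphemes(token))
```

Running the script prints each token next to its naive split; for "incomprehensible" the split is `("in", "comprehens", "ible")`, matching the breakdown in the list above.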

Source: https://trailhead.salesforce.com/pt-BR/content/learn/modules/deep-learning-and-natural-language-processing/start-with-nlp
