
Migrating projects to CAILA


Projects that were created based on patterns or phrase examples can be migrated to the CAILA NLU kernel.

All the CAILA features will be available for the project: intent recognition, system and user-defined entities, slot markup and filling, CAILA API, etc.

This article covers the migration process step by step:

  1. Editing the project configuration file.
  2. Migrating NLU settings.
  3. Configuring activation rules.
  4. Testing and updating the script.

Configuration file

Specify the following parameters in the chatbot.yaml configuration file:

botEngine: v2
language: ru
 
sts:
    noMatchThreshold: 0.2
caila:
    noMatchThreshold: 0.2
  • botEngine – classifier type; specify v2 to use CAILA.
  • language – specify the language for compatibility with the STS classifier.
  • noMatchThreshold – the minimum similarity a phrase must have to one of the classes to count as a match. While developing the NLU service, we determined empirically that 0.2 is the optimal value for this parameter.

Please note that we also set the noMatchThreshold value for the STS classifier. This is required for the project’s backward compatibility.


Migrating NLU settings

Most NLU settings in chatbot.yaml become inactive when the project is migrated to CAILA. Some of them can instead be defined in the NLU settings section when you configure the project.

Let us have a look at the changes in a sample project with an STS classifier:

nlp:
  morphology: myStem
  tokenizer: myStem
 
  vocabulary: common-vocabulary.json
  lengthLimit:
    enabled: true
    symbols: 400
    words: 100000
  timeLimit:
    enabled: true
    timeout: 10000
  spellcheck:
    enabled: true
    dictionary: dict.txt
    frequency: frequency.txt
    minWordLengthForEditDistance: 3
    maxWordEditDistance: 0
  speller:
    dictionary: speller.dict
 
classifier:
  enable: true
  engine: sts
  noMatchThreshold: 0.2
  parameters:
    algorithm: aligner2
  • morphology – parameter is inactive; the text is marked up in CAILA. Can be overridden in advanced NLU settings.
  • tokenizer – parameter is inactive; the text is marked up in CAILA. Can be overridden in advanced NLU settings.
  • vocabulary – parameter is inactive, cannot be overridden.
  • timeLimit – parameter is only active for the q, e, eg tags. For the intent tag, a default value is used that cannot be overridden.
    • enabled – parameter is inactive.
    • timeout – parameter is inactive.
  • spellcheck – spellchecker module; the parameter is inactive. Its dictionary format is not compatible with CAILA, so create and upload a dictionary in the new format.
  • speller – new-format spellchecker module; the parameter is inactive. Its dictionary format is compatible with CAILA, and you can upload the dictionary using the CAILA API.
  • classifier – parameter is only active for the q, e, eg tags. For the intent tag, a default value is used that cannot be overridden.
  • noMatchThreshold – parameter is inactive, the sts.noMatchThreshold parameter is used instead.

Spellchecker module

The built-in spellchecker module can be used to correct spelling errors in client requests. It can be used in combination with a user-defined dictionary. In that case, the project dictionary is used to correct words from the domain scope, and the global module is used for all other words.

If you used a .dict dictionary before, you can migrate it to your project using the CAILA API Direct.


Tokenization and lemmatization

The udpipe tokenizer is used for projects in Russian by default. Tests have shown this is the best tokenization and lemmatization solution.

If your project was created using patterns or an STS classifier, we recommend that you use the morphsrus or mystem tokenizer.


NLU advanced configuration parameters

Open the project for editing. There you can specify the advanced configuration parameters: NLU language, classifier algorithm, timezone, and NLU settings.

Learn more about NLU settings


Configuring activation rules

You can combine patterns, phrase examples from an STS classifier, and the CAILA classifier to detect client intent. To use intents, patterns, and example groups together, specify the state triggering mechanism in the bot script.

Learn more about the activation rule mechanism
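
As an illustration, a state that can be reached through any of the three mechanisms might look like the sketch below. The state name, the /hello intent path, the pattern, and the example phrase are all hypothetical and are not taken from a real project. Which trigger takes priority when several of them match depends on the activation rules configured for the project.

    state: Greeting
        # CAILA intent (hypothetical intent path)
        intent!: /hello
        # pattern
        q!: * (hello|hi) *
        # phrase example for the STS classifier
        e!: hello there
        a: Hello! How can I help you?

Keeping the old q! and e! triggers next to the new intent! trigger is one way to move a state to CAILA gradually: the old triggers can be removed once the intent has been tested.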


Example dictionary

If you used an STS classifier in your project, you can migrate your example dictionary to the updated project.

Open the Intents page. Click Import at the top of the intents tree and upload the .json file.


CatchAll

Note that if the NLU service is used in combination with patterns and classifier phrase examples, the following CatchAll is not used:

    state: CatchAll
        q!: *
        a: I did not get it

Instead, use event: noMatch to handle user requests that your script does not process:

    state: CatchAll
        event: noMatch
        a: You said: {{ $request.query }}
        

Testing and updating the script

Use the test widget built in the script editor to debug your script.

We recommend using the intent activation rule, system and custom entities, slot filling and other features of the CAILA NLU kernel to further update your script.