
Train a Classifier on Languages with Adjusted Tokenization

Cortex XSOAR allows you to customize automations and playbooks to support phishing classifiers for languages other than English. Cortex XSOAR offers adjusted tokenization for the following languages:
  • German
  • French
  • Spanish
  • Portuguese
  • Italian
  • Dutch
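
If you script any part of this setup, it can help to check the target language against the list above before touching the automations. The following is a minimal sketch: `normalize_language` is a hypothetical helper, not part of Cortex XSOAR, and the exact spelling accepted by each automation's `language` argument should be confirmed in that argument's description.

```python
# Languages with adjusted tokenization, as listed above.
SUPPORTED_LANGUAGES = ("German", "French", "Spanish", "Portuguese", "Italian", "Dutch")


def normalize_language(name: str) -> str:
    """Return the listed spelling of a supported language.

    Hypothetical helper: matching is case-insensitive and ignores
    surrounding whitespace. Raises ValueError for anything not listed.
    """
    wanted = name.strip().casefold()
    for language in SUPPORTED_LANGUAGES:
        if language.casefold() == wanted:
            return language
    raise ValueError(f"No adjusted tokenization listed for {name!r}")
```

For example, `normalize_language(" german ")` returns the listed spelling `German`, while a language that is not on the list raises `ValueError`.
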
You need to configure the following automations and playbooks:
  • DBotPreProcessTextData
  • WordTokenizerNLP
  • DBotPredictPhishingWords
  • DBot Create Phishing Classifier V2
  1. Go to Automation.
  2. Configure the language for DBotPreProcessTextData.
    1. Copy the DBotPreProcessTextData automation by selecting Duplicate Automation.
    2. (Optional) Change the name of the duplicated automation to make it distinguishable.
    3. In the Advanced section, type demisto/dl:languages1.0 in the Docker image name field.
    4. In the Arguments section, expand the language argument.
    5. In the Initial value field, set the language in which you want to train the classifier.
    6. Click Save Version.
  3. Configure the language for WordTokenizerNLP.
    1. Copy the WordTokenizerNLP automation by selecting Duplicate Automation.
    2. (Optional) Change the name of the duplicated automation to make it distinguishable.
    3. In the Advanced section, type demisto/dl:languages1.0 in the Docker image name field.
    4. In the Arguments section, expand the language argument.
    5. In the Initial value field, set the language in which you want to train the classifier.
    6. Click Save Version.
  4. Configure the language for DBotPredictPhishingWords.
    1. Copy the DBotPredictPhishingWords automation by selecting Duplicate Automation.
    2. (Optional) Change the name of the duplicated automation to make it distinguishable.
    3. In the Advanced section, type demisto/dl:languages1.0 in the Docker image name field.
    4. In the Arguments section, expand the language argument.
    5. In the Initial value field, set the language in which you want to train the classifier.
    6. Click Save Version.
  5. Go to Playbooks.
  6. Search for the DBot Create Phishing Classifier V2 playbook and update it.
    1. Copy the playbook by selecting Duplicate Playbook.
    2. Select the Pre-process file task.
    3. From the drop-down menu, replace the automation with the duplicated version of DBotPreProcessTextData that you created in step 2.
    4. Click OK, and then click Save Version.
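
The steps above set the same Docker image and the same language on each duplicated automation, so a quick consistency check can catch a copy that was missed. The following is a minimal sketch under stated assumptions: `AutomationCopy`, `find_problems`, the `_German` names, and the placeholder image are all hypothetical, and the settings are typed in by hand instead of being read from Cortex XSOAR.

```python
from __future__ import annotations

from dataclasses import dataclass

# Docker image named in the steps above.
LANGUAGES_IMAGE = "demisto/dl:languages1.0"


@dataclass(frozen=True)
class AutomationCopy:
    """Hypothetical record of one duplicated automation's settings."""
    name: str
    docker_image: str
    language: str


def find_problems(copies: list[AutomationCopy], language: str) -> list[str]:
    """Return one message for each setting that differs from the expected value."""
    problems = []
    for copy in copies:
        if copy.docker_image != LANGUAGES_IMAGE:
            problems.append(f"{copy.name}: Docker image is {copy.docker_image!r}")
        if copy.language != language:
            problems.append(f"{copy.name}: language is {copy.language!r}")
    return problems


# Example settings, entered manually; the third copy was left unconfigured.
copies = [
    AutomationCopy("DBotPreProcessTextData_German", LANGUAGES_IMAGE, "German"),
    AutomationCopy("WordTokenizerNLP_German", LANGUAGES_IMAGE, "German"),
    AutomationCopy("DBotPredictPhishingWords_German", "placeholder/image:tag", "English"),
]
for problem in find_problems(copies, "German"):
    print(problem)
```

An empty result from `find_problems` means every listed copy uses the languages image and the chosen language. It does not confirm that the duplicated playbook points at the duplicated automation, which still needs to be checked in the playbook editor.
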
