End-of-Life (EoL)

Create a Machine Learning Model

Follow these steps to create a machine learning model.
A machine learning model enables Cortex XSOAR to predict the classification of phishing incidents, for example, whether an incident should be classified as legitimate, malicious, or spam. You can use these models in conjunction with your default investigation playbooks, or run commands separately in the War Room. The main goal of the machine learning model is to leverage past phishing incidents to assist with the investigation of future incidents.
  1. Select ML Models > New Model.
  2. Define the Incidents Training Set Scope.
    1. In the
      Model name
      field, type the name of the model that you want to create.
    2. (Optional) Type a meaningful description for the model in the description field.
    3. To choose which incidents are to be used for training the model, in the
      Incident type
      field, from the drop-down list, select the type of incident for training, such as Phishing.
    4. Select the date range from which incidents will be used for the training set. The more incidents, the better the expected results. It is recommended to use a longer period.
    5. In the
      Maximum number of incidents to test
      field, type the number of incidents that will be used to train the model.
      Reduce the number only if the number of incidents is too large and causes performance problems. Use a higher number if you have more samples in your environment. Default is 3000.
  3. Select the field that you want the model to learn to predict.
    1. In the
      Incident field
      from the drop-down list, select the relevant field.
      The Incident field (the classification field) stores the classification of the incident. This is a single-select field that stores the classification or the close reason of incidents. The out-of-the-box fields are “Email Classification” and “Close Reason”, but you can use any other custom field.
      After you select the Incident field, in the
      Field Values
      field you can see the different classification values and the number of occurrences of each value across the selected incident scope.
  4. Set the final classification values.
    1. In the verdict columns, define the names of the verdicts to which you will map your existing classification values.
      This stage lets you control which incident classifications are used in the training, and lets you merge multiple classifications into a single category. A verdict is a group of classification values; each verdict includes one or more classifications. The trained model predicts each new incident as one of those verdicts.
    2. Map your data by associating each verdict with your defined classification values: drag and drop the
      Field Values
      into the respective verdict column.
      Where values remain in the
      Field Values
      column, their corresponding incidents are not involved in the training. You may want to leave unmapped classifications such as
      Internal Phishing Test
      , or any other classifications that you do not want to participate in the training.
      You can drag multiple classification values into a single verdict. The model then treats all the classification values under that verdict as if they had the same classification. This allows you to better define the prediction task of the model and merge smaller groups into a single group.
      This is helpful if you have different subtypes of classifications. For example, if you have classification values of Spear Phishing, Malware, and Ransomware, you may want to map them all into a single verdict called Phishing. To train a model that distinguishes between one classification and the rest, such as phishing versus everything else, map all classifications other than phishing into a single verdict called “Non-Phishing”. The result is two verdicts: one contains phishing, and the other contains everything other than phishing.
      You can have 2-3 different verdicts, and each verdict needs a minimum of 50 incidents. For an example, see Machine Learning Model Example.
  5. (Optional) Change the fields where the email body and email subject are stored in the incident.
    1. In the
      Argument Mapping
      select the equivalent fields for Email body, Email HTML and Email subject.
      By default, training is done based on the Email body, Email HTML, and Email subject.
  6. Train the model by clicking
    Start Training
    You are redirected back to the Machine Learning Models page. The training process takes several minutes, and you can close the page while it runs.
    If training completes successfully, percentage scores appear that reflect the precision of the model for each of the verdicts.
  7. (Optional) View detailed performance information of the model.
    For a more thorough evaluation of the model, click the + button next to the model name.
    If using the phishing incident type, you can now use the model in the machine learning or War Room window, or in the playbook. For more information, see Machine Learning Models Overview.
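
The verdict mapping in step 4 can be reasoned about outside the UI. The following is a minimal Python sketch, not Cortex XSOAR code: the field name `emailclassification`, the classification values, the mapping, and both helper functions are hypothetical and only illustrate the rules described above. Unmapped classification values are excluded from training, several values can share one verdict, and each verdict needs at least 50 incidents.

```python
from collections import Counter

# Minimum number of incidents per verdict, as stated in step 4.
MIN_INCIDENTS_PER_VERDICT = 50

# Hypothetical mapping of classification values to verdicts.
# "Internal Phishing Test" is deliberately left unmapped.
VERDICT_MAP = {
    "Spear Phishing": "Phishing",
    "Malware": "Phishing",
    "Ransomware": "Phishing",
    "Legitimate": "Non-Phishing",
    "Spam": "Non-Phishing",
}


def build_training_set(incidents, verdict_map, field="emailclassification"):
    """Pair each incident with its verdict, skipping unmapped classifications."""
    labeled = []
    for incident in incidents:
        verdict = verdict_map.get(incident.get(field))
        if verdict is not None:
            labeled.append((incident, verdict))
    return labeled


def undersized_verdicts(labeled, minimum=MIN_INCIDENTS_PER_VERDICT):
    """Return the verdicts that have fewer incidents than the minimum."""
    counts = Counter(verdict for _, verdict in labeled)
    return {verdict: n for verdict, n in counts.items() if n < minimum}


# Made-up sample data: 55 phishing-like, 30 spam, 5 internal test incidents.
incidents = (
    [{"emailclassification": "Spear Phishing"}] * 40
    + [{"emailclassification": "Malware"}] * 15
    + [{"emailclassification": "Spam"}] * 30
    + [{"emailclassification": "Internal Phishing Test"}] * 5
)

labeled = build_training_set(incidents, VERDICT_MAP)
print(len(labeled))                  # 85 (the 5 unmapped incidents are dropped)
print(undersized_verdicts(labeled))  # {'Non-Phishing': 30}
```

In this made-up data set, the Non-Phishing verdict falls below the minimum, which suggests widening the date range or merging more classification values into that verdict before training.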
