Train Models (optional)
The tool uses two models to mutate malware in order to increase its evasiveness while maintaining its functionality.
A Generative Adversarial Network (GAN) is used to generate benign-looking imports and sections for malware samples.
Reinforcement Learning (RL) is used to train an agent to choose the best sequence of mutations for a malware sample in order to evade a classifier.
Prerequisites
Local/Remote Model (optional)
Use --help with any of the scripts to learn about their parameters.
Step 1: Train Generative Adversarial Network (GAN)
The Generative Adversarial Network training module can be broken down into the following components:
Feature Extraction Component:
The feature extraction component converts each binary into a simpler representation of the features it possesses, which is what the AI models actually learn from.
Features - the most relevant components extracted from each binary:
Section names
Import libraries and functions
Feature Vector Map - all unique features across the dataset are stored in a map, with each feature assigned an index (Table 1).
Binary Feature Vectors - each binary is then converted to a boolean vector over the feature vector map, marking which features it contains (Table 2); see the sketch after Table 2.
Table 1: Simplified representation of the feature vector map

| Feature | Index |
| --- | --- |
| ConversionListA : inn32.dll | 1 |
| OdQueryStringA : user32.dll | 2 |
| GetTabbedTextExtentA : user32.dll | 3 |
| ... | 4 |
| ... | 5 |
| ... | 6 |
| <function_name> : <dll_name> | 7 |
Table 2: Simplified representation of a binary feature vector

| Feature Index | Boolean |
| --- | --- |
| 1 | 1 |
| 2 | 0 |
| 3 | 0 |
| ... | 1 |
| ... | 1 |
| ... | 0 |
| <feature_index> | <Exist/Doesn't Exist> |
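The following Python sketch illustrates the two structures above. The feature strings are made up for illustration; the real tool extracts them from PE binaries.

```python
# Illustrative sketch of the structures in Tables 1 and 2. The real tool
# extracts section names and imports from PE files; these feature strings
# are hard-coded purely to show the data structures involved.

# Build the feature vector map: every unique feature across the dataset
# gets an index (0-based here; Table 1 shows 1-based indices).
all_features = [
    "ConversionListA:inn32.dll",
    "OdQueryStringA:user32.dll",
    "GetTabbedTextExtentA:user32.dll",
    ".text",
    ".rsrc",
]
feature_map = {feature: index for index, feature in enumerate(all_features)}

def to_feature_vector(binary_features, feature_map):
    """Convert one binary's features into a boolean vector over the map."""
    vector = [0] * len(feature_map)
    for feature in binary_features:
        if feature in feature_map:  # features outside the map are ignored
            vector[feature_map[feature]] = 1
    return vector

sample = {"OdQueryStringA:user32.dll", ".text"}
print(to_feature_vector(sample, feature_map))  # -> [0, 1, 0, 1, 0]
```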
Adversarial Feature Generation Component
Once we have this simpler representation of our binaries, we can train the GAN with it. The GAN is made up of three components (a minimal sketch of how they fit together follows the list):
The Generator - an AI model responsible for generating features that look benign while retaining the sample's malicious functionality.
The Discriminator - an AI model responsible for deciding whether the generator's output looks malicious or benign. The discriminator provides feedback to the generator so that it can improve on its next iteration.
Black-box Detector - a component backed by AI models of the kind used in antivirus (AV) products; it classifies the generator's output and helps the discriminator make its decision.
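The following PyTorch sketch shows how the three components relate. The architectures, dimensions, and the simple black-box stand-in are all assumptions for illustration, not the tool's actual models.

```python
# Minimal PyTorch sketch of the three GAN components described above.
# Dimensions and architectures are illustrative, not the tool's actual ones.
import torch
import torch.nn as nn

FEATURE_DIM = 128   # length of the binary feature vector (assumed)
NOISE_DIM = 16      # random noise appended to the malware features (assumed)

# Generator: takes a malware feature vector plus noise and proposes
# additional benign-looking features. It can only ADD features, so the
# original malicious ones are preserved via torch.maximum.
class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEATURE_DIM + NOISE_DIM, 256),
            nn.ReLU(),
            nn.Linear(256, FEATURE_DIM),
            nn.Sigmoid(),
        )

    def forward(self, malware_features, noise):
        proposed = self.net(torch.cat([malware_features, noise], dim=1))
        return torch.maximum(malware_features, proposed)

# Discriminator: learns to tell the generator's output apart from benign
# feature vectors and feeds that signal back to the generator.
class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEATURE_DIM, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
            nn.Sigmoid(),
        )

    def forward(self, features):
        return self.net(features)

# Black-box detector: a stand-in for the AV-style classifier whose verdicts
# the discriminator learns to imitate. Here it is just a dummy threshold.
def blackbox_detector(features):
    return (features.sum(dim=1) > FEATURE_DIM / 2).float()

# One forward pass, just to show how the pieces fit together.
malware = torch.randint(0, 2, (4, FEATURE_DIM)).float()
noise = torch.rand(4, NOISE_DIM)
adversarial = Generator()(malware, noise)
print(Discriminator()(adversarial).shape, blackbox_detector(adversarial).shape)
```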
To start the GAN training, run the GAN training script (use --help with it to see all available parameters).
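For example, assuming the GAN trainer in your checkout is exposed as a script named main_malgan.py (this name is an assumption; check the repository and rely on --help for the authoritative parameter list):

```
python main_malgan.py --help    # list the available parameters
python main_malgan.py           # start GAN training with default parameters
```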
Once we have the adversarial feature vectors from the GAN, we feed them to the binary_builder.py script, which uses the original feature vector map from Step 1 to map the adversarial features back to import functions and section names. The output from the GAN is stored as RL_Features/adverarial_imports_set.pk and RL_Features/adverarial_sections_set.pk, which are then used for adding imports and sections to the malware during the RL mutation phase.
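Conceptually, the mapping back simply inverts the feature vector map from Step 1. The sketch below illustrates the idea with made-up data; the pickled object layouts are assumptions, and binary_builder.py remains the authoritative implementation.

```python
# Sketch of the idea behind binary_builder.py: invert the feature vector map
# from Step 1 so adversarial feature indices become concrete names again.
# The data and pickled layouts here are assumptions for illustration only.
import os
import pickle

feature_map = {                          # feature -> index, as in Table 1
    "ConversionListA:inn32.dll": 1,
    "OdQueryStringA:user32.dll": 2,
    ".benign_section": 3,
}
index_to_feature = {index: feature for feature, index in feature_map.items()}

adversarial_vector = {1: 1, 2: 0, 3: 1}  # feature index -> present?
recovered = [index_to_feature[i] for i, present in adversarial_vector.items() if present]

imports = {f for f in recovered if ".dll" in f}
sections = set(recovered) - imports

# Store the results where the RL phase expects them (paths from the text above).
os.makedirs("RL_Features", exist_ok=True)
with open("RL_Features/adverarial_imports_set.pk", "wb") as fh:
    pickle.dump(imports, fh)
with open("RL_Features/adverarial_sections_set.pk", "wb") as fh:
    pickle.dump(sections, fh)
```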
Step 2: Train Reinforcement Learning (RL) Agent
The reinforcement learning training module includes the following components:
Malware Environment - the environment that supports the agent by applying the mutations the agent chooses, then classifying and scoring the mutated samples.
Agent - a neural network model that learns the sequence of mutations for malware samples that can evade the classifier. A sketch of how the environment, agent, mutations and classifier interact follows this list.
You can build your own RL agent by following the steps here
Mutations - The various mutations currently available in the tool are as follows:
Add imports (Query GAN model)
Add sections (Query GAN model)
Append bytes to sections
Rename sections
UPX pack
UPX unpack
Add/Remove signature
Append a random number of bytes
You can add your own mutations by following the steps here
Classifier - an AI model that takes a binary as input and classifies it as either malware or a benign file. By default, the tool uses a sample gradient boosting classifier.
You can add your own classifier by following the steps here
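The following Python sketch shows how these pieces interact: an environment that applies a chosen mutation and scores the result with a classifier, and an agent that picks mutations. Everything in it is a toy stand-in (random feature vectors, a gradient boosting classifier fitted on random data, a random policy instead of the neural network agent); it illustrates the control flow only, not the tool's actual environment.

```python
# Toy sketch of the RL loop: the agent picks a mutation, the environment
# "applies" it and scores the result with a classifier. All models and data
# here are placeholders; only the control flow mirrors the description above.
import random
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

MUTATIONS = [
    "add_imports", "add_sections", "append_bytes_to_sections",
    "rename_sections", "upx_pack", "upx_unpack",
    "add_remove_signature", "append_random_bytes",
]

# Stand-in for the default gradient boosting classifier, fitted on random data.
rng = np.random.default_rng(0)
X, y = rng.random((200, 16)), rng.integers(0, 2, 200)
classifier = GradientBoostingClassifier().fit(X, y)

class MalwareEnv:
    """Applies mutations to a (toy) feature vector and scores the result."""
    def __init__(self):
        self.state = rng.random(16)

    def step(self, action):
        # A real environment would rewrite the PE file here; we just perturb
        # the feature vector so the sketch stays self-contained.
        self.state = np.clip(self.state + rng.normal(0, 0.05, 16), 0, 1)
        malware_prob = classifier.predict_proba(self.state.reshape(1, -1))[0, 1]
        evaded = malware_prob < 0.5
        reward = 1.0 if evaded else -malware_prob
        return self.state, reward, evaded

class RandomAgent:
    """Placeholder for the neural-network agent: picks mutations at random."""
    def choose(self, state):
        return random.randrange(len(MUTATIONS))

env, agent = MalwareEnv(), RandomAgent()
state = env.state
for turn in range(10):
    action = agent.choose(state)
    state, reward, done = env.step(action)
    print(f"turn {turn}: {MUTATIONS[action]:>24s} reward={reward:+.2f}")
    if done:
        break
```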
To train the RL agent, run the RL training script.
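For example, assuming the RL trainer is exposed as a script named rl_train.py (again an assumed name; confirm against the repository and use --help):

```
python rl_train.py --help       # list the available parameters
python rl_train.py              # start training the RL agent
```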
The script saves the RL model every 500 episodes so that training can be resumed if it stops. These checkpointed models can then be tested to see which one works best.