Machine learning requires large amounts of data, which is currently a major obstacle to its application in collusion detection: labelled examples are scarce and existing datasets are imbalanced. The project investigates how to augment these datasets with synthetic examples generated using AI techniques.