Introduction
Generating random codes based on probability is a common requirement in testing, simulation, and data sampling. With DataWeave, MuleSoft’s powerful transformation language, you can easily create weighted random code lists. This guide will walk you through the process step by step.
Understanding the Input Data
The input is a JSON array containing objects with two fields: code and probability. The probability value represents the likelihood of each code appearing in the output, expressed as a percentage. For example, if "ABC" has a probability of 60, it should appear approximately 60% of the time in the generated list.
Here’s a sample input:
[
  { "code": "ABC", "probability": 60 },
  { "code": "DEF", "probability": 20 },
  { "code": "HIJ", "probability": 20 }
]
The DataWeave Script Explained
The script below generates 1000 random codes based on the probabilities provided. It relies on core functions such as map, flatten, and orderBy, which belong to the dw::Core module and are available without an import.
Key Steps in the Script:
Define the input data and parameters.
Calculate how many times each code should appear based on its probability.
Create a list of codes repeated according to their calculated counts.
Shuffle the list so that the codes appear in random order.
Example Script:
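// Copies of each code: its percentage share of maxreq
var multiply_items = percentage map ((item) -> ((item default 0) * maxreq) / 100)
// Repeat codes[index] once per copy, then flatten into a single list
var total_codes = flatten(multiply_items map ((item, index) -> (0 to item - 1) map ((i) -> codes[index])))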
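The fragment above leaves codes, percentage, and maxreq undefined and stops before the shuffle step. Here is a minimal sketch of how the full script might look. It assumes the sample input is held in a variable named fileData (a name chosen here for illustration) and that codes and percentage are derived from it. It rounds counts down with floor, and it guards against zero counts because DataWeave ranges can run in descending order.

```dataweave
%dw 2.0
output application/json

// Assumed sample data, mirroring the sample input in this guide
var fileData = [
  { code: "ABC", probability: 60 },
  { code: "DEF", probability: 20 },
  { code: "HIJ", probability: 20 }
]
// Total number of codes to generate
var maxreq = 1000
// Assumption: codes and percentage are parallel arrays taken from the input
var codes = fileData map ((item) -> item.code)
var percentage = fileData map ((item) -> item.probability)

// How many copies of each code (60% of 1000 = 600, and so on), rounded down
var multiply_items = percentage map ((item) -> floor(((item default 0) * maxreq) / 100))

// Repeat each code `count` times; zero counts yield an empty list,
// because a descending range such as `1 to 0` would not be empty
var total_codes = flatten(
  multiply_items map ((count, index) ->
    if (count > 0) ((1 to count) map ((i) -> codes[index])) else []
  )
)
---
// Shuffle by sorting on a fresh random key for each element
total_codes orderBy ((item) -> random())
```

Because the counts are computed up front, the proportions are fixed and only the order changes between runs. If the percentages do not divide evenly into maxreq, rounding down can leave the list slightly shorter than maxreq.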
How the Output Looks
The output is a JSON array of 1000 codes in random order. With the sample input, "ABC" appears 600 times, while "DEF" and "HIJ" appear 200 times each.
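To check that a generated list has the expected proportions, you can count how often each code occurs. The sketch below does this with groupBy and sizeOf. The variable generated is a hypothetical stand-in that holds only ten codes to keep the example short.

```dataweave
%dw 2.0
output application/json

// Hypothetical stand-in for the generated list; in practice, use the script's output
var generated = ["ABC", "ABC", "HIJ", "ABC", "DEF", "ABC", "ABC", "ABC", "DEF", "ABC"]
---
// Group identical codes, then replace each group with its size
(generated groupBy ((code) -> code)) mapObject ((group, code) -> { (code): sizeOf(group) })
```

For the ten-element list here, this gives 7 for "ABC", 2 for "DEF", and 1 for "HIJ". Run against the full 1000-element output, the counts should match the configured percentages.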
Example Output:
"ABC", "ABC", "HIJ", "ABC", "DEF", "ABC", "ABC", "ABC", "DEF", "ABC", ...
Bonus Simplified Script
For a more straightforward approach, use this script:
fileData flatMap( 1 to (($.probability / 100) * maxRequest) map ((item, index) -> $.code) ) orderBy random()
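This version maps each entry's probability directly to a repeat count and shuffles the result in a single expression. It assumes fileData holds the input array and maxRequest holds the total number of codes to generate.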
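For reference, here is one way the simplified approach could be written as a complete script. This is a sketch, not a definitive version: the sample values for fileData and maxRequest are assumptions, and copies is a hypothetical helper added for readability. It also returns an empty list for entries whose count is zero, because a range such as 1 to 0 counts downwards instead of being empty.

```dataweave
%dw 2.0
output application/json

// Assumed sample data and total; replace with your own input
var fileData = [
  { code: "ABC", probability: 60 },
  { code: "DEF", probability: 20 },
  { code: "HIJ", probability: 20 }
]
var maxRequest = 1000

// Hypothetical helper: number of copies for one entry, rounded down
fun copies(entry) = floor(((entry.probability default 0) * maxRequest) / 100)
---
// Expand every entry into its copies, then shuffle on a random sort key
(fileData flatMap ((entry) ->
  if (copies(entry) > 0) ((1 to copies(entry)) map ((i) -> entry.code)) else []
)) orderBy ((item) -> random())
```

Using explicit lambda parameters (entry, i, item) instead of $ makes it clear which value each step refers to, at the cost of a few extra characters.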
Real-World Example
Imagine you’re simulating product purchases for an e-commerce platform. You can use this method to generate random product codes based on their popularity, ensuring your tests reflect real-world shopping behaviour.
Conclusion
DataWeave makes it easy to generate random codes based on probability, enabling realistic testing and simulation scenarios. By leveraging its built-in functions, you can efficiently transform data for various use cases.