Aug 16, 2024

Can GPT generate random numbers?

A random thought came to my mind: can GPT really generate a random number?

The best way to find out is to try it yourself.

The first step is to quickly write some boilerplate around random number generation using the Claude 3.5 Sonnet model.

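import asyncio
import json
import random
from typing import List

import matplotlib.pyplot as plt
from anthropic import AsyncAnthropic

# Assumption: the client setup was not shown in the original snippet.
# AsyncAnthropic() reads the API key from the ANTHROPIC_API_KEY environment variable.
client = AsyncAnthropic()


async def generate_random_number(till: int) -> int:
    # Pick a fresh sampling temperature in [0, 1) for every request.
    temperature = random.random()
    message = await client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=1000,
        temperature=temperature,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": f"Generate a random number between 1 to {till}"
                        + ". Answer in the json format {number: int} only answer a valid json",
                    }
                ],
            }
        ],
    )
    # The prompt asks for JSON only, so the reply text is parsed directly.
    random_number = json.loads(message.content[0].text)
    return random_number.get("number")


async def generate_random_x_times(times: int, till: int) -> List[int]:
    # Fire all requests concurrently and collect the answers in order.
    tasks = [generate_random_number(till) for _ in range(times)]
    random_results = await asyncio.gather(*tasks)
    return random_results


def plot_histogram(random_numbers: List[int]):
    # plot the random_numbers
    # Assumption: the plotting code was omitted in the original; a matplotlib
    # histogram is one way to do it.
    plt.hist(random_numbers, bins=50)
    plt.xlabel("Generated number")
    plt.ylabel("Count")
    plt.show()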

Now, how random were the results?

For a random number between 1 and 10000 (Claude), the model settles on a handful of numbers.

What is funny is that when I run the script again, the exact numbers change, but the majority of them still fall in the 7300 to 7400 range.

Let us try reducing the upper limit to 1000.

What is interesting is that in this case, even multiple runs gave the same distribution.

For an upper limit of 100, the number is always 73, and for 10 it is always 7.
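
To put a number on how concentrated a set of answers is, one option is to tally them and compare against a uniform baseline. The sketch below is not part of the original script: `summarize` is a hypothetical helper, the baseline comes from Python's own `random.randint` rather than from any model, and the constant sample simply mirrors the "always 73" observation with an assumed sample size of 100.

```python
import random
from collections import Counter
from typing import Dict, List


def summarize(numbers: List[int], till: int) -> Dict[str, float]:
    """Describe how concentrated a sample of draws from 1..till is."""
    counts = Counter(numbers)
    _, top_count = counts.most_common(1)[0]
    return {
        # How many different values appeared at all.
        "distinct": len(counts),
        # Share of the sample taken by the single most frequent value.
        "top_share": top_count / len(numbers),
        # Share of the range 1..till that was ever produced.
        "coverage": len(counts) / till,
    }


# Baseline: 100 draws from Python's pseudo-random generator.
baseline = [random.randint(1, 100) for _ in range(100)]
print(summarize(baseline, 100))

# A sample that always returns the same value (assumed size: 100 draws).
print(summarize([73] * 100, 100))
```

A sample that keeps returning one value has a `top_share` of 1.0 and a tiny `coverage`, while the pseudo-random baseline should spread across many distinct values.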

Now let us twist the experiment a little bit and ask Claude to pick a random string from a given set of 10 names.

RANDOM_NAMES = ["Olivia", "Liam", "Emma", "Noah", "Sophia", "James", "Isabella", "Benjamin", "Mia", "Elijah"]

I ran it a couple of times. The results differed between runs but were still centered on the names that appear at the beginning of the list.
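
One way to check this position effect is to count picks by list index instead of by name. This is a minimal sketch, assuming the model's answers have already been collected into a list of strings; `picks_by_position` is a hypothetical helper that is not in the original script, and the input in the usage line is made up purely to show the output shape.

```python
from collections import Counter
from typing import List

RANDOM_NAMES = ["Olivia", "Liam", "Emma", "Noah", "Sophia",
                "James", "Isabella", "Benjamin", "Mia", "Elijah"]


def picks_by_position(picks: List[str], names: List[str]) -> List[int]:
    """Return how often each list position was chosen (index 0 = first name).

    Answers that are not in `names` are ignored.
    """
    counts = Counter(picks)
    return [counts[name] for name in names]


# Illustrative input only, not measured results.
print(picks_by_position(["Liam", "Olivia", "Liam"], RANDOM_NAMES))
```

Shuffling the list before each request and tallying by position again might help tell a preference for early positions apart from a preference for particular names.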

Some takeaways

The number is always 7 for a range of 1 to 10 and 73 for a range of 1 to 100.
The numbers are weirdly centered around the digit 7. Maybe because of the training data?
When asked to choose randomly from a list, the model is more likely to select from the start of the list.

Follow up questions

Would prompt engineering help here?
How are the results with other models? I would guess the behaviour is the same.
How many dollars did I burn while doing this?
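
For the last question, a rough estimate can come from token counts. The sketch below assumes the counts are summed from each response (the Anthropic Messages API reports them as `usage.input_tokens` and `usage.output_tokens`). `estimate_cost` is a hypothetical helper, and both the prices and the token counts in the usage lines are placeholders, not figures from this experiment.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Estimate spend in dollars from total token counts.

    Prices are given in dollars per million tokens.
    """
    total = (
        input_tokens * input_price_per_mtok
        + output_tokens * output_price_per_mtok
    )
    return total / 1_000_000


# Placeholder numbers: 1000 requests, assumed 40 input and 15 output tokens each,
# with assumed prices of 3.0 and 15.0 dollars per million tokens.
requests = 1000
cost = estimate_cost(requests * 40, requests * 15, 3.0, 15.0)
print(f"Estimated cost: ${cost:.4f}")
```

Check the current pricing page for the model before trusting the result.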