Checking for Insert Collisions in Laravel

March 7th, 2024

When you fill a database row with any kind of random or generated data, checking for collisions is a great idea.

Let's take the example of a URL shortener. When a Url gets created, we want to generate an identifier for the short URL (e.g. https://short.test/AdtFq). In reality, you may use a package like Hashids to generate this, but we'll use str()->random() as an example. It's still perfectly acceptable, but more prone to collisions... which we'll obviously solve!

Inserting a random identifer

Ok, so here's the code that generates and fills a Url model with that random identifier before it's saved to the database.

class Url extends Model
{
    //...
    public static function booted()
    {
        static::creating(function (Url $url) {
            $url->identifier = str()->random(5)
        });
    }
}

This will successfuly fill the identifier, but it's likely that somewhere along the way it'll collide with another randomly generated 5 character string, particularly with more database rows.

Sure, we can use unique for the column (and that's a great idea) but it'll still fail, and we'll have to deal with it anyway.

Iterating until there's no collision

Let's build a separate class who's job is to iterate until this string no longer exists in a column for a given model.

Here's the class in full.

class IdentifierGenerator
{
    public function __construct(protected string $model, protected string $column) {}

    public function generate(): string
    {
        do {
            $identifier = str()->random(5);
        } while ($this->model::where($this->column, $identifier)->exists());

        return $identifier;
    }
}

The IdentifierGenerator class simply takes a model and column as it's arguments, and starts by generating a random string. However, this is done within a do while loop, with the condition being that the random string we've generated doesn't already exist.

Simply put, this will just iterate until it generates a random string that doesn't already exist for that model, and then return it.

Ok, so here's how we use it in the Eloquent observer we registered earlier.

static::creating(function (Url $url) {
    $url->identifier = (new IdentifierGenerator(Url::class, 'identifier'))->generate();
});

Ok, but how do we test this?

Tests for this depend on what you're using to generate the value. But if you were using str()->random(), here's what a couple of tests might look like with Pest. I've over-commented these to document what's happening.

test('it generates a hashid when creating', function () {
    // Override what gets returned when we use str()->random()
    str()->createRandomStringsUsing(fn () => 'abc');

    // Create a URL which invokes our Eloquent observer
    $url = Url::create(['url' => 'https://codecourse.com']);

    // Reset the str()->random() return value
    str()->createRandomStringsNormally();

    // We can assert the exact value was stored
    expect($url->identifier)->toBe('abc');
});

test('it avoids using the same identifier', function () {
    // Pre-create an existing URL with a specific identifier so we can check
    // for collisions
    Url::create(['url' => 'https://codecourse.com', 'identifier' => 'abc']);

    // This will create a once() type scenario, where 'abc' will be used for
    // the first generation of a random string, and revert back after
    str()->createRandomStringsUsing(fn () => 'abc');
    str()->createRandomStringsNormally();

    $url = Url::create(['url' => 'https://codecourse.com']);

    expect($url->identifier)->not->toBe('abc');
});

We use str()->createRandomStringsUsing() and str()->createRandomStringsNormally() to control the return value of str()->random() which means we're able to verify with accuracy that our random identifier generator is working.

The second test specifically verifies that the do while loop runs again, since we already have a record with the identifier abc. Perfect.

Adapt it

Now you have the foundation for checking collisions, you can adapt this whether you're using random strings, Hashids, or any other generated data you're inserting and need to be absolutely sure is unique.

Author
Alex Garrett-Smith

Comments

No coments, yet. Be the first to leave a comment.