01. Playing with Redis sets and HyperLogLog

Transcript

00:00
If you want to log unique views for a model inside of Laravel, there are a couple of options that you can take. The first one is to build out your database with perhaps another table that logs these views, and then store something like an IP address inside of that table. And then when it comes to grabbing those views back out of the database,
00:21
you could do this only by the unique IP addresses. Now this solution is absolutely fine, but it's a little bit slower, and it's also a little bit more difficult to maintain. So in this course, we're going to talk about another solution using Redis.
00:37
So the first thing that we're going to do is just take a look at the really basic app that we're going to build, how this works, and how it syncs these back to the database, so we can do things like ordering by views as well, because we can't order by data that only lives in Redis. Then we're just going to play around with Redis to get used to the commands
00:56
that we're going to be using, and then we'll build this whole thing out. Okay, so the first thing is the very simple app that I've built out here. I just have a bunch of articles in the database, and if we just take a look, these do have a view count column,
01:11
which we're not going to be incrementing. We'll take a look at that in a second. Okay, so I'm going to view one of these articles. So let's view this one just here. That's just gone over to another route, and it's showing me the title. Now, if I go back and just give this a refresh, you can see that the view count has increased.
01:28
Now, what we're not doing here is we're not storing this directly in the database, but this is now coming from Redis. So we're taking the article, we're creating a key in Redis that's unique to that article, and then we're using something called HyperLogLog to insert the IP address inside of a set,
01:45
and we'll talk about that in a second. Now, if I click on this again, obviously I'm using the same local IP address. That's not going to log an additional view. Now, all of these are ordered by how many views there are, but we're ordering at the database level. So what we're going to do at the end of the course
02:03
is look at how we can take all the data that we already have in Redis, sync it back to the database on a periodic basis using our scheduler, and then we'll have something that we can actually order by. Okay. So now that we kind of know how this works, let's talk about sets in Redis.
02:20
So I'm going to go ahead and boot up the Redis CLI, and let's go ahead and create our set by using SADD. That's the command to add a member to the set stored at a key. So let's go ahead and say that we have a bunch of articles and we want to add this to a key called articles1. So that's the article with the ID of one.
02:39
Now, into here, what we might do is store an IP address. So I'm going to store 127.0.0.1, and I'm going to hit enter. So that's gone ahead and added the IP address 127.0.0.1 to the set called articles1. Now, what we can do is use SCARD to get the cardinality or the count of this
02:59
by passing in articles1, and sure enough, we get one. Now, sets are unique in Redis, which means that if we rerun the sadd articles1 command and add the IP address in again as a view, we're going to end up with a zero returned, which means that when we grab the cardinality
03:16
of that again, we're still going to end up with one. So you can actually use sets to log all of the IP addresses into this specific key, and that will always give you back a unique count. So you could count on them and get the unique views back.
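The behaviour described here can be modelled in a few lines. This is only a toy sketch in plain Python, not a Redis client: the `store` dictionary and the `sadd` and `scard` functions are hypothetical stand-ins that mimic what SADD and SCARD return for a single member.

```python
# Toy model of Redis set semantics. Nothing here talks to a real Redis server.
store = {}


def sadd(key, member):
    """Mimic SADD for one member: 1 if it was newly added, 0 if already present."""
    members = store.setdefault(key, set())
    if member in members:
        return 0
    members.add(member)
    return 1


def scard(key):
    """Mimic SCARD: the number of members in the set, or 0 for a missing key."""
    return len(store.get(key, set()))


first = sadd("articles1", "127.0.0.1")   # new member
repeat = sadd("articles1", "127.0.0.1")  # duplicate, so the set is unchanged
views = scard("articles1")
print(first, repeat, views)
```

The duplicate add reports zero and the count stays the same, which is why a set gives back unique views.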
03:31
For example, if I add 127.0.0.2 and we do another count on articles1, we get two. So that's now had two views from two unique IP addresses. Now, the issue with this, particularly at scale, is if you're adding lots and lots of IP addresses to this particular set, this is going to be very slow.
03:50
So counting on this is going to be slow. Now, the alternative to this is using HyperLogLog. So let's take a look. So it's a probabilistic data structure that estimates the cardinality or the count of a set
04:03
and it trades accuracy, so that's one thing we need to bear in mind, for space utilization. So let's go ahead and look at using HyperLogLog on the terminal and see how this differs. So if we come over to our command section, these are prefixed by PF.
04:19
So we can use PFADD to add an element to a HyperLogLog key, and we can use PFCOUNT to grab the cardinality or the count of this. So let's go over and just try this instead. I'm going to go ahead and just flush everything out so we can start again.
04:32
And let's go ahead and run these. So we'll say PFADD. And it works in exactly the same way. So we're going to go ahead and say articles1.
04:40
And we're going to add in 127.0.0.1. Hit enter. And we've added that to that key now. Now, if I go ahead and do this again, and then we run PFCOUNT on articles1.
04:54
Sure enough, we get one, because even though that IP address has been logged more than once, we want a unique view count. Let's do it again with 127.0.0.2 and 127.0.0.3. And then let's run PFCOUNT on this again.
05:06
Sure enough, we get three. So pretty accurate. But what we're doing here is we are sacrificing potential accuracy for speed and space utilization.
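To give a feel for where the estimate comes from, here is a simplified HyperLogLog sketch in plain Python. It is an illustration under assumptions, not how Redis implements it: the `TinyHyperLogLog` class, the SHA-256 hash, and the 4096 registers are all choices made for this example (Redis documents its own structure as using up to 12 KB with a standard error of 0.81%).

```python
import hashlib
import math


class TinyHyperLogLog:
    """Simplified HyperLogLog: 2**p registers, each keeping the largest rank seen."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        """Record an item. Returns 1 if a register changed, otherwise 0."""
        digest = hashlib.sha256(item.encode()).digest()
        h = int.from_bytes(digest[:8], "big")      # 64-bit hash value
        width = 64 - self.p
        index = h >> width                         # first p bits choose a register
        rest = h & ((1 << width) - 1)              # remaining bits
        rank = width - rest.bit_length() + 1       # position of the leftmost 1-bit
        if rank > self.registers[index]:
            self.registers[index] = rank
            return 1
        return 0

    def count(self):
        """Estimate how many distinct items have been added."""
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)           # bias correction for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return round(m * math.log(m / zeros))
        return round(raw)


hll = TinyHyperLogLog()
for ip in ["127.0.0.1", "127.0.0.1", "127.0.0.2", "127.0.0.3"]:
    hll.add(ip)
print(hll.count())
```

Each key only ever holds a fixed number of small registers, no matter how many IP addresses are added, which is where the space saving comes from. The cost is that the count is an estimate rather than an exact figure.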
05:16
So these are the commands that we're going to use to integrate this in. But what we're also going to do in the course is take a look at syncing this data from Redis back to the database, like I said, so we can eventually order by view count.
05:31
Because if it doesn't exist in the database, we're going to have a really hard time trying to order by it. So hopefully these commands make sense. Let's go over and start to integrate this into these article models that we have.
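The sync idea can be sketched roughly like this. It uses Python with an in-memory SQLite database purely for illustration, not the Laravel scheduler the course goes on to use; the `redis_counts` dictionary is a hypothetical stand-in for counts read from Redis, and the `articles` table layout and `view_count` column name are assumptions.

```python
import sqlite3

# Hypothetical per-article unique view counts, as if read from Redis.
redis_counts = {1: 3, 2: 7, 3: 1}

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE articles ("
    "id INTEGER PRIMARY KEY, title TEXT, view_count INTEGER NOT NULL DEFAULT 0)"
)
db.executemany(
    "INSERT INTO articles (id, title) VALUES (?, ?)",
    [(1, "First"), (2, "Second"), (3, "Third")],
)


def sync_view_counts(connection, counts):
    """Copy each article's count into its view_count column."""
    connection.executemany(
        "UPDATE articles SET view_count = ? WHERE id = ?",
        [(count, article_id) for article_id, count in counts.items()],
    )
    connection.commit()


sync_view_counts(db, redis_counts)

# Once the counts are in the database, ordering is an ordinary query.
ordered = db.execute(
    "SELECT id, view_count FROM articles ORDER BY view_count DESC"
).fetchall()
print(ordered)  # [(2, 7), (1, 3), (3, 1)]
```

Running something like `sync_view_counts` on a schedule keeps the column close enough to the Redis counts for ordering, without writing to the database on every view.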
4 episodes 22 mins

Overview

If you need to log unique views in Laravel, you might reach for a database table to track IP addresses or another unique piece of data.

Let's take a look at improving performance and reducing complexity by using Redis and the HyperLogLog probabilistic data structure.

Once we're done, we'll set up a periodic command to sync views back to the database for easy ordering, and then create a trait to share functionality between other models.

Alex Garrett-Smith
Hey, I'm the founder of Codecourse!
