Using presence for large groups of users

Pusher provides many powerful features that you can use to build working presence implementations for large-scale use cases, covering hundreds to millions of users.

Introduction

One of our features that many customers get a lot of value from is our presence implementation. This provides a constantly updating state of who is “present” in the channels in your application. In practice the feature provides a list of the Channel occupants when a user subscribes and then sends messages to all occupants of a channel when someone joins or leaves.

This works extremely well for smallish groups where the names of the participants are broadly known to the occupants (maybe sub 100 people). For large-scale use cases (100 to 1,000,000 users), that specific implementation is less effective. Fundamentally, this is because every join or leave is sent to every occupant, so the number of messages distributed grows quadratically with the number of people present in the channel.
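To get a feel for that growth, here is a rough back-of-the-envelope sketch. It assumes a simplified model in which users join an empty channel one at a time and each join is announced to everyone already there; it ignores leave events and the initial member list, so real numbers would be higher. The function name `join_messages` is ours, not part of any Pusher library.

```python
def join_messages(n: int) -> int:
    """Messages sent while n users join an empty channel one at a time.

    Simplified model (an assumption): each join is announced to every user
    already in the channel, so the k-th joiner triggers k - 1 messages.
    """
    return sum(k - 1 for k in range(1, n + 1))


for n in (10, 100, 10_000):
    print(f"{n:>6} users -> {join_messages(n):>12,} join messages")
```

Under this model the total is n(n-1)/2, so doubling the size of the group roughly quadruples the message count.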

This is problematic because:

  • It quickly burns through your message allowance
  • The number of messages arriving causes a performance problem for clients

Aside from the resource issues, sending that volume of state changes isn’t even useful to the end customer from a user experience point of view. As the scale of participation increases, the information shown must become simpler and less detailed. This requires some custom logic depending on the use case. For example, you may just want to periodically show how many people are participating, or even try to pick out the user’s friends from the crowd.

The vanilla presence channels don’t work well at scale, and we limit the number of participants in a room. However, we provide many other powerful features that you can use to build presence implementations for larger groups.

Building a larger presence implementation

At its heart, presence is a directory of which users are subscribed to what. Building a larger presence implementation takes three steps:

  • registering the locations where users are “present”
  • syncing the state of presence information from our service
  • pushing a filtered version of that state to connected users

Registering where users are present

To sync state, you first need to register which users are part of which “rooms”. Do this by subscribing each user to a corresponding private channel based on a naming convention like private-<roomId>-<userId> (e.g. private-roomA-max). This decouples the concept of “presence” from the channel that is being used to send other data. Channels are cheap, so they are a good fit for this kind of scenario.
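As a minimal sketch of that naming convention, the helpers below build and parse marker channel names. The function names are hypothetical, and the parser assumes room IDs never contain a hyphen (user IDs may), which is our assumption rather than a rule of the service.

```python
PREFIX = "private-"


def marker_channel(room_id: str, user_id: str) -> str:
    """Build the per-user marker channel name, e.g. 'private-roomA-max'."""
    if not room_id or not user_id:
        raise ValueError("room_id and user_id must be non-empty")
    if "-" in room_id:
        # Assumption of this sketch: the first hyphen after the prefix
        # separates the room from the user.
        raise ValueError("room_id must not contain '-'")
    return f"{PREFIX}{room_id}-{user_id}"


def parse_marker_channel(name: str):
    """Return (room_id, user_id), or None if name is not a marker channel."""
    if not name.startswith(PREFIX):
        return None
    room_id, sep, user_id = name[len(PREFIX):].partition("-")
    if not sep or not room_id or not user_id:
        return None
    return room_id, user_id


print(marker_channel("roomA", "max"))
print(parse_marker_channel("private-roomA-max"))
```

Keeping both directions in one place means the client (which subscribes) and the server (which interprets channel names) cannot drift apart on the format.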

Synchronising state to your server

Next, you need to synchronise the state of which channels “exist”. Because Channels are lazily created, they only exist when one or more subscribers are occupying them. We store a state of which channels exist at any point in time, and we provide a set of tools to use this information on your server:

  • Push-based tools – webhooks sent when a channel becomes occupied or vacant
  • Pull-based tools – queryable HTTP APIs that return the state of channel existence (with filter patterns)

In the push scenario, you’d receive webhooks for all the events, and update some state on your server to show who is subscribed to what. This is more precise from a timing point of view, but it can become quite a large firehose of information.
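Here is a rough sketch of the push approach: a function that applies one webhook body to an in-memory map of rooms to users. It assumes the body is JSON with an `events` list whose items carry a `name` of `channel_occupied` or `channel_vacated` and a `channel` field; check the webhook documentation for the exact shape. The helper names are hypothetical, and a real handler should also authenticate the request before trusting it.

```python
import json

PREFIX = "private-"


def split_marker(channel):
    """Return (room_id, user_id) for 'private-<roomId>-<userId>', else None."""
    if not channel.startswith(PREFIX):
        return None
    room_id, sep, user_id = channel[len(PREFIX):].partition("-")
    return (room_id, user_id) if sep and room_id and user_id else None


def apply_webhook(rooms, body):
    """Update rooms (dict of room_id -> set of user_ids) from one webhook body."""
    for event in json.loads(body).get("events", []):
        marker = split_marker(event.get("channel", ""))
        if marker is None:
            continue  # not one of our per-user marker channels
        room_id, user_id = marker
        if event.get("name") == "channel_occupied":
            rooms.setdefault(room_id, set()).add(user_id)
        elif event.get("name") == "channel_vacated":
            users = rooms.get(room_id, set())
            users.discard(user_id)
            if not users:
                rooms.pop(room_id, None)  # drop rooms with nobody left


rooms = {}
apply_webhook(rooms, json.dumps({"events": [
    {"name": "channel_occupied", "channel": "private-roomA-max"},
    {"name": "channel_occupied", "channel": "private-roomA-ana"},
]}))
apply_webhook(rooms, json.dumps({"events": [
    {"name": "channel_vacated", "channel": "private-roomA-max"},
]}))
print({room: sorted(users) for room, users in rooms.items()})
```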

Alternatively, you can poll the API periodically, and use the results to alter your internal state.
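A polling version could look something like the sketch below. The `fetch_occupied_channels` argument is a hypothetical callable standing in for whatever HTTP API request you make to list occupied channels (ideally filtered to your marker prefix); it is not a real library function. Each poll rebuilds the state from scratch rather than patching it, so anything missed between polls corrects itself.

```python
def rooms_from_snapshot(channel_names):
    """Group 'private-<roomId>-<userId>' channel names into room -> users."""
    rooms = {}
    for name in channel_names:
        if not name.startswith("private-"):
            continue
        room_id, sep, user_id = name[len("private-"):].partition("-")
        if sep and room_id and user_id:
            rooms.setdefault(room_id, set()).add(user_id)
    return rooms


def poll_once(fetch_occupied_channels):
    """Fetch the current channel list and rebuild the presence state.

    fetch_occupied_channels is a hypothetical callable that returns an
    iterable of occupied channel names.
    """
    return rooms_from_snapshot(fetch_occupied_channels())


# Stand-in data instead of a real API call.
snapshot = ["private-roomA-max", "private-roomA-ana", "private-roomB-li", "private-roomA"]
rooms = poll_once(lambda: snapshot)
print({room: len(users) for room, users in rooms.items()})
```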

Pushing presence state to users

Finally, you need to make sure that the state is pushed to your existing users. This can be as simple as publishing a message about the number of people in the room to the participants every second. You’d generally subscribe users to a shared private or public channel for this broadcast, e.g. private-<roomId>.

For example, users would subscribe to “roomA” and “private-roomA-max” in the example above. The former distributes information they are interested in, including the number of participants. The latter is tied to it by naming convention, and signifies whether or not they are an active participant.
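The broadcast step can be sketched as follows. The event name `presence-count`, the `{"count": ...}` payload, and the choice of a `private-<roomId>` broadcast channel are all illustrative assumptions, and `publish` is a hypothetical callable standing in for whatever server library call you use to trigger an event on a channel.

```python
def count_messages(rooms, event="presence-count"):
    """Turn room -> users state into one (channel, event, data) per room."""
    return [
        (f"private-{room_id}", event, {"count": len(users)})
        for room_id, users in sorted(rooms.items())
    ]


def broadcast_counts(rooms, publish):
    """Send the current participant count to each room's broadcast channel.

    publish is a hypothetical callable taking (channel, event, data).
    """
    for channel, event, data in count_messages(rooms):
        publish(channel, event, data)


# Collect messages in a list instead of sending them anywhere.
sent = []
broadcast_counts(
    {"roomA": {"max", "ana"}, "roomB": {"li"}},
    lambda channel, event, data: sent.append((channel, event, data)),
)
print(sent)
```

Run on a timer, this sends one message per room per tick no matter how many people join or leave in between, which is what keeps the message volume under control.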

Extending this further

There are many ways to extend the example once you have the foundational building blocks. You could look up people’s friends, and send everyone a message telling them which of those are in the same room for a personalised friend roster. You could also create subgroups of users in the channel to get different kinds of information about presence.
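As one possible sketch of the friend roster idea, the function below intersects each participant's friend list with the current room membership. The `friends_of` mapping is a hypothetical stand-in for your own friend lookup, and how you deliver each roster (for instance over that user's own marker channel) is left open.

```python
def friend_rosters(room_members, friends_of):
    """For each user in the room, list which of their friends are also there.

    room_members: set of user IDs currently in the room.
    friends_of:   dict of user ID -> set of friend IDs (hypothetical data
                  source; in practice this comes from your own application).
    """
    return {
        user: sorted(friends_of.get(user, set()) & room_members)
        for user in room_members
    }


rosters = friend_rosters(
    {"max", "ana", "li"},
    {"max": {"ana", "zed"}, "ana": {"max"}},
)
print(rosters["max"])
```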

We are exploring ways that we can make this more visible and useful over time. Let us know what you build.