

3 min read

We have a number of protocol and infrastructure changes rolling out in the next three months, and want to keep everybody in the loop.

This update was also emailed to the developer mailing list, which you can subscribe to here.

TL;DR

  • As of this week, the Bluesky AppView instance now consumes from a Bluesky BGS, instead of directly from the PDS. Devs can access the current streaming API at https://bsky.network/xrpc/com.atproto.sync.subscribeRepos or for WebSocket directly, wss://bsky.network/xrpc/com.atproto.sync.subscribeRepos
    Your existing cursor for bsky.social will not be in sync with bsky.network, so check the live stream first to grab a recent seq before connecting!
  • We are updating the DID document public key syntax to “Multikey” format next week on the main network PLC directory (plc.directory). This change is already live on the sandbox PLC directory.

How will this affect me?

  • Starting today, if you're consuming the firehose, grab a fresh cursor from bsky.network and restart your consumer pointed at that host.

Bluesky BGS

The Bluesky services themselves are moving to a federated deployment, with multiple Bluesky (the company) PDS instances aggregated by a BGS, and the AppView downstream of that. As of yesterday, the Bluesky AppView instance (api.bsky.app) consumes from a Bluesky PBC BGS (bsky.network), which consumes from the Bluesky PDS (bsky.social). Until now, the AppView consumed directly from the PDS.

How close are we to federation?

Technically, the main network BGS could start consuming from independent PDS instances today, the same as the sandbox BGS does. We have configured it not to do so until we finish implementing some more details, and do our own round of security hardening. If you want to bang on the BGS implementation (written in Go, code in the indigo GitHub repository), please do so in the sandbox environment, not the main network.

This change impacts devs in two ways:

  • In the next couple weeks, new Bluesky (company) PDS instances will appear in the main network. Our plan is to optionally abstract this away for most client developers, so they can continue to connect to bsky.social as a virtual PDS. But the actual PDS hostnames will be distinct and will show up in DID documents.
  • Firehose consumers (feed generators, mirrors, metrics dashboards, etc) will need to switch over and consume from the BGS instead of the PDS directly. If they do not, they will miss content from the new (Bluesky) PDS instances.

The firehose subscription endpoint, which works as of today, is https://bsky.network/xrpc/com.atproto.sync.subscribeRepos (or wss:// for WebSocket directly). Note that this endpoint has different sequence numbers. When switching over, we recommend consuming from both the BGS and the PDS for a period to ensure no events are lost, or scrolling the BGS cursor back far enough that the two streams overlap.
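If it helps, here is a minimal consumer sketch in Python (assuming the third-party websockets and cbor2 packages): it connects to the BGS endpoint, optionally resuming from a saved cursor, and prints each event's sequence number so you can see where the new stream starts before committing to a cursor.

```python
import asyncio
import io

import cbor2
import websockets

FIREHOSE = "wss://bsky.network/xrpc/com.atproto.sync.subscribeRepos"

async def tail(cursor=None):
    # Resume from a saved cursor if we have one; otherwise start at the live tip.
    url = FIREHOSE if cursor is None else f"{FIREHOSE}?cursor={cursor}"
    async with websockets.connect(url, max_size=None) as ws:
        async for message in ws:
            # Each binary frame is two concatenated CBOR objects: a header and a body.
            decoder = cbor2.CBORDecoder(io.BytesIO(message))
            header = decoder.decode()  # e.g. {"op": 1, "t": "#commit"}
            body = decoder.decode()
            if header.get("op") == 1:
                # `seq` is the sequence number to persist as your bsky.network cursor.
                print(header.get("t"), body.get("seq"))

asyncio.run(tail())
```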

We encourage developers and operators to switch to the BGS firehose sooner rather than later.

DID Document Formatting Changes

We also want to remind folks that we are planning to update the DID document public key syntax to “Multikey” format next week on the main network PLC directory (plc.directory). These changes are described here, with example documents for testing, and are live now on the sandbox PLC directory.

5 min read

To get future blog posts directly in your email, you can now subscribe to Bluesky’s Developer Mailing List here.

Adding Rate Limits

Now that we have a better sense of user activity on the network, we’re adding some application rate limits. This helps us keep the network secure — for example, by limiting the number of requests a user or bot can make in a given time period, it prevents bad actors from brute-forcing certain requests and helps us limit spammy behavior.

We’re adding a rate limit for the number of created actions per DID. These numbers shouldn’t affect typical Bluesky users, and won’t affect the majority of developers either, but it will affect prolific bots, such as the ones that follow every user or like every post on the network. The limit is 5,000 points per hour and 35,000 points per day, where:

Action Type    Value
CREATE         3 points
UPDATE         2 points
DELETE         1 point

To reiterate, these limits should be high enough to affect no human users, but low enough to constrain abusive or spammy bots. We decided to release this new rate limit immediately, rather than giving developers advance notice, so that we can secure the network from abusive behavior as soon as possible, especially since bad actors might take this blog post as an open invitation!

Per this system, an account may create at most 1,666 records per hour and 11,666 records per day. That means an account can like up to 1,666 records in one hour with no problem. We took the most active human users on the network into account when we set this threshold (you surpassed our expectations!).
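As a rough illustration of the arithmetic, here is a small sketch using the point values and budgets above:

```python
# Point values and budgets from the table and limits above (per DID).
POINTS = {"create": 3, "update": 2, "delete": 1}
HOURLY_BUDGET = 5_000
DAILY_BUDGET = 35_000

def cost(creates=0, updates=0, deletes=0):
    """Total point cost of a batch of record writes."""
    return creates * POINTS["create"] + updates * POINTS["update"] + deletes * POINTS["delete"]

# A bot that likes 1,700 posts in an hour exceeds the hourly budget (5,100 > 5,000),
# while the 1,666 creates mentioned above just fit (4,998 points).
print(cost(creates=1_700), cost(creates=1_666))
```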

In case you missed it, in August, we added some other rate limits as well.

  • Global limit (aggregated across all routes)
    • Rate limited by IP
    • 3000/5 min
  • updateHandle
    • Rate limited by DID
    • 10/5 min
    • 50/day
  • createAccount
    • Rate limited by IP
    • 100/5 min
  • createSession
    • Rate limited by handle
    • 30/5 min
    • 300/day
  • deleteAccount
    • Rate limited by IP
    • 50/5 min
  • resetPassword
    • Rate limited by IP
    • 50/5 min

We’ll also return rate limit headers on each response so developers can dynamically adapt to these limits.
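For example, here is a hedged sketch of adapting to those headers. The header names used here (RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset) follow the common RateLimit header convention and are an assumption; confirm them against a live response from your PDS.

```python
import time

import httpx

# Any XRPC call works for inspecting the headers; resolveHandle is just an example request.
resp = httpx.get(
    "https://bsky.social/xrpc/com.atproto.identity.resolveHandle",
    params={"handle": "bsky.app"},
)
remaining = int(resp.headers.get("RateLimit-Remaining", "1"))
reset_at = int(resp.headers.get("RateLimit-Reset", "0"))  # assumed to be a unix timestamp
if resp.status_code == 429 or remaining == 0:
    # Back off until the window resets before retrying.
    time.sleep(max(0.0, reset_at - time.time()))
```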

In a future update (in about a week), we’re also lowering the applyWrites limit from 200 to 10. This function applies a batch transaction of creates, updates, and deletes. This is part of the PDS distribution upgrade to v3 (read more below) — now that repos are ahistorical, we no longer need a higher limit to account for batch writes. applyWrites is used for transactional writes, and logic that requires more than 10 transactional records is rare.
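For reference, here is a sketch of a batched write that stays within the new 10-write cap. The access token and DID are placeholders you would obtain from com.atproto.server.createSession.

```python
from datetime import datetime, timezone

import httpx

PDS = "https://bsky.social"
ACCESS_TOKEN = "..."        # placeholder: accessJwt from com.atproto.server.createSession
MY_DID = "did:plc:example"  # placeholder: your account's DID

now = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
writes = [
    {
        "$type": "com.atproto.repo.applyWrites#create",
        "collection": "app.bsky.feed.post",
        "value": {"$type": "app.bsky.feed.post", "text": f"batch post {i}", "createdAt": now},
    }
    for i in range(10)  # at most 10 writes per call once the new limit lands
]

resp = httpx.post(
    f"{PDS}/xrpc/com.atproto.repo.applyWrites",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json={"repo": MY_DID, "writes": writes},
)
resp.raise_for_status()
```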

PDS Distribution v3

We’re rolling out v3 of the PDS distribution. This shouldn’t be a breaking change, though we will be wiping the PLC sandbox. PDSs in parallel networks should continue to operate with the new distribution.

Reminder: The PDS distribution auto-updates via the Watchtower companion Docker container, unless you specifically disabled that option. We’re adding the admin upgradeRepoVersion endpoint to the upgraded PDS distribution, so PDS admins can also upgrade their repos by hand.

Handle Invalidations on App View

Last month, we began proxying requests to the App View. In our federation architecture, the App View is the piece of the stack that gives you all your views of data, such as profiles and threads. Initially, we started out by serving all of these requests from our bsky.social PDS, but proxying these to the App View is one way of scaling our infrastructure to handle many more users on the network. (Read our federation architecture overview blog post for more information.)

For some users, this caused an invalid handle error. If you have an invalid handle, the user-facing UI will display this instead of your handle:

[Screenshot of a profile with an invalid handle]

You can use our debugging tool to investigate this: https://bsky-debug.app/handle. Just type your handle in. If it shows no error, try re-setting your handle to the same value you currently have; that should resolve the issue.

If the debugging page shows an error for your handle, follow this guide to make sure you set up your handle properly.
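If you prefer to check programmatically, here is a rough sketch of the same checks (assuming the httpx and dnspython packages; the handle is a placeholder).

```python
import dns.resolver
import httpx

HANDLE = "example.com"  # placeholder: your handle

# 1. What DID does the network resolve this handle to?
resp = httpx.get(
    "https://bsky.social/xrpc/com.atproto.identity.resolveHandle",
    params={"handle": HANDLE},
)
print("resolved DID:", resp.json().get("did") if resp.status_code == 200 else resp.text)

# 2. For DNS verification, the TXT record at _atproto.<handle> should contain "did=<your DID>".
try:
    for record in dns.resolver.resolve(f"_atproto.{HANDLE}", "TXT"):
        print("TXT:", record.to_text())
except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
    print("no _atproto TXT record (the HTTPS /.well-known/atproto-did method may be used instead)")
```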

If that still isn’t working for you, file a support ticket through the app (“Help” button in the left menu on mobile or right side on desktop) and a Bluesky team member will assist you.

Subscribe for Developer Updates

You can subscribe to Bluesky’s Developer Mailing List here to receive future updates in your email. If you received your invite code from the developer waitlist, you’re already subscribed. Each email will have the option to unsubscribe.

We’ll continue to publish updates to our technical blog as well as on the app from @atproto.com.

4 min read

We’re excited to announce that we’re rolling out a new version of atproto repositories that removes history from the canonical structure of repositories, and replaces it with a logical clock. We’ll start rolling out this update next week (August 28, 2023).

For most developers with projects subscribed to the firehose, such as feed generators, this change shouldn’t affect you. It will only affect you if you’re doing commit-aware repo sync (a good rule of thumb: have you ever passed earliest or latest to the com.atproto.sync.getRepo method?) or are explicitly checking the repo version when processing commits.

Removing Repository History

Repositories on the AT Protocol are like Git repositories, but for structured records. Just like Git, each commit to an atproto repository currently includes a pointer to the previous commit. However, this approach has caused a couple of pain points:

  • Record deletions are difficult to process. If a user deletes a record, that commit needs to be erased from their repository to match their intent.
  • Increased storage cost. Maintaining repo history can cause anywhere from a 5-10x increase in repo size.

We attempted to resolve both of these in the current model through rebases (discrete moments when the history of a repository is deleted/mutated, like in Git). However, this is a tricky and sensitive operation that is expensive to conduct and complex to communicate across the network.

Using a Logical Clock for Repositories

To address the above issues, we’re replacing the prev pointer in commits with a logical clock. We originally published our intention to do so a few weeks ago. These are the changes we’re making to the way we handle repository history:

  • Incrementing the repo version to 3
  • Making the prev field on repo commits optional
  • Adding a new required rev (revision) field which is a logical clock
  • Removing or adjusting commit-aware repo sync mechanisms

Note: If you explicitly verify the version of a repo commit or do strict type checking on repo commits (which you shouldn’t — the spec allows unspecified fields!), you will need to make that check inclusive of version 3.
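For instance, a lenient check along these lines (a sketch; `commit` is assumed to be the already-decoded commit object):

```python
def check_commit(commit: dict) -> None:
    version = commit.get("version")
    if version not in (2, 3):  # inclusive check, not `version == 2`
        raise ValueError(f"unsupported repo version: {version}")
    if version == 3 and "rev" not in commit:
        raise ValueError("v3 commits must carry a rev (logical clock)")
    # `prev` may be present or null; do not require it, and ignore unknown fields.
```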

To facilitate backwards compatibility with software that is still running repo v2, we will continue setting the prev field on commits in the interim.

Even though we are still setting the prev field, it should be treated only as a hint: history is no longer a canonical part of the repository.


Repository Revisions

The new sync semantics for the repository rely on a logical clock included in each signed commit.

This “revision” takes the form of a TID and must be monotonically increasing.

The included revision serves a few functions:

Ordering

The clock provides a simple ordering mechanism for encountered repos or commits. If a consumer encounters the same repo from two different sources, each with a valid signature and structure, the revision gives a simple mechanism to determine which is the most recent repository.
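Since revisions are fixed-length TIDs drawn from a base32-sortable alphabet, plain string comparison is enough. A small sketch (the revision values are made up for illustration):

```python
def newer(rev_a: str, rev_b: str) -> str:
    """Return the more recent of two revisions; TIDs sort lexicographically."""
    return rev_a if rev_a > rev_b else rev_b

print(newer("3k2yihcrp6f2c", "3k2yihd4nbk2a"))  # the lexically larger rev is more recent
```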

Sync

When syncing a repository, revisions give a series of signposts that allow you to request everything from a given repo since a previously seen version. Because revisions are ordered and monotonically increasing, the provider does not need the exact revision the consumer asks for (as it would with a commit hash); it can instead provide all repo contents since the latest version it remembers that precedes the requested revision.

The PDS, for instance, will track the revision at which each repo block or record was introduced into a repository. If a consumer asks for every block or record since a given revision, the PDS has a simple mechanism for providing that information, without needing a complicated sync algorithm.
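As a sketch of what that looks like from the consumer side, assuming the updated com.atproto.sync.getRepo accepts a `since` revision parameter (the DID and revision here are placeholders):

```python
import httpx

PDS = "https://bsky.social"
DID = "did:plc:example"          # placeholder: the repo you are syncing
LAST_SEEN_REV = "3k2yihcrp6f2c"  # placeholder: the last revision you processed

resp = httpx.get(
    f"{PDS}/xrpc/com.atproto.sync.getRepo",
    params={"did": DID, "since": LAST_SEEN_REV},
)
resp.raise_for_status()
car_bytes = resp.content  # CAR file containing only content newer than LAST_SEEN_REV
```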

Stale Reads

Finally, a logical clock on the repo gives us a mechanism through which we can detect stale reads. (We actually already snuck this in with an optional revision field on v2 repos!)

Repo revisions may be returned in response headers to most requests. A client will know their own repo’s current revision and can compare that with the upstream service’s revision.

We use this today on the PDS to paper over some read-after-write concerns that are inherent in eventually consistent architectures. Some clients may use these headers to alert their users that their PDS is “out of sync” with other services in the network (for instance an AppView).
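A hedged sketch of that comparison follows; the response header name used here (atproto-repo-rev) is an assumption and should be confirmed against a live response.

```python
import httpx

# Add an Authorization header if the endpoint you query requires auth.
resp = httpx.get(
    "https://api.bsky.app/xrpc/app.bsky.actor.getProfile",
    params={"actor": "did:plc:example"},  # placeholder DID
)
upstream_rev = resp.headers.get("atproto-repo-rev")  # rev the upstream service has indexed
my_rev = "3k2yihd4nbk2a"  # placeholder: the rev your PDS reported after your last write
if upstream_rev is not None and upstream_rev < my_rev:
    print("upstream service has not yet caught up with your latest write")
```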

Available sync methods


If you have questions about these changes, join us on GitHub Discussions here.