Compliance Firehose API
Please note
We have released a new compliance tool for X API v2 called batch compliance. This tool lets you upload large datasets of Post or user IDs and retrieve their compliance status, so you can determine what data requires action to bring your datasets into compliance.
In addition, both batch compliance and the compliance firehose have been updated to support Post edits. For the compliance firehose, a new ‘tweet_edit’ event was added. See the Compliance Data Objects documentation for more details. Learn more about how Edit Post metadata works on the Edit Posts fundamentals page.
Overview
Enterprise
One of X’s core values is to defend and respect the user’s voice. This includes respecting their expectations and intent when they delete, modify, or edit the content they choose to share on X. We believe that this is critically important to the long term health of one of the largest public, real-time information platforms in the world. X puts controls in the hands of its users, giving individuals the ability to control their own X experience. We believe that business consumers that receive X data have a responsibility to honor the expectations and intent of end users.
For more information on the types of compliance events that are possible on the X platform, reference our article, Honoring User Intent on X.
Any developer or company consuming X data via an API holds an obligation to use all reasonable efforts to honor changes to user content. This obligation extends to user events such as deletions, modifications, and changes to sharing options (e.g., content becoming protected or withheld). This also includes when users edit their Posts. Please reference the specific language in the Developer Policy and/or your X Data Agreement to understand how this obligation affects your use of X data.
X offers the following solutions that deliver information about these user compliance events and whether a specific Post or User is publicly available or not. A brief overview of the solutions and their general integration patterns is detailed below:
GET statuses/lookup and GET users/lookup
- Format: REST APIs. See: GET statuses/lookup and GET users/lookup.
- These endpoints always return the latest version of any Post edits. All Post objects describing Posts created after the edit feature was introduced include Post edit metadata, even for Posts that were not edited.
- A request made more than 30 minutes after a Post was created returns the final state of that Post.
- Deliver availability information for specific Posts or Users as defined by the caller as part of the API request.
- May be used for ad-hoc spot-checking on the current availability state of a specific group of Posts / Users.
- Ideal for customers who need a way to check the current state of a specific Post or User at a given moment in time.
- These APIs provide a helpful mechanism for customers who need to check the availability of a piece of Content, for instance when:
  - Displaying Posts
  - Engaging with a Post(s) or User(s) in a 1:1 way
  - Distributing X Content to a 3rd party through an allowed file download
  - Storing Posts for extended periods of time
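As a rough illustration of spot-checking, the sketch below splits stored IDs into batches for a lookup request and reads an ID-keyed response to find Posts that are no longer available. The batch size of 100, the `map=true` parameter, and the response shape (`{"id": {"<post id>": <Post object or null>}}`) are assumptions about GET statuses/lookup that should be confirmed against its reference page. The helper names `batches`, `lookup_params`, and `unavailable_ids` are hypothetical.

```python
from typing import Dict, Iterator, List, Optional

BATCH_SIZE = 100  # assumed per-request ID limit for the lookup endpoints


def batches(ids: List[int], size: int = BATCH_SIZE) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def lookup_params(batch: List[int]) -> Dict[str, str]:
    """Build query parameters for one lookup request (parameter names assumed)."""
    return {"id": ",".join(str(i) for i in batch), "map": "true"}


def unavailable_ids(response: Dict[str, Dict[str, Optional[dict]]]) -> List[int]:
    """Return the IDs whose entry is null in an ID-keyed lookup response."""
    return sorted(int(key) for key, value in response["id"].items() if value is None)
```

Any ID reported by `unavailable_ids` would then be removed or suppressed in the local store before the content is displayed or distributed.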
Compliance Firehose (enterprise only)
- Format: Streaming API. See: Compliance Firehose.
- Delivers a realtime stream of compliance activities on X, including Post edits.
- May be used to maintain compliance state across a set of stored data as new compliance events happen on the platform.
- Ideal for customers consuming and storing large quantities of X data for extended periods of time.
Guides
Compliance Best Practices
Recommendations & Best Practices
- Build Data Storage Schemas That Store Numeric Post ID and User ID: User messages require action to be taken on all Posts from that User. Because all compliance messages are delivered only by numeric ID, it is important to design storage schemas that maintain the relationship between Post and User based on numeric IDs. Data consumers will need to monitor compliance events by both Post ID and User ID and be able to update the local data store appropriately.
- Build Schemas That Address All Compliance Statuses: Depending on how compliance activities will be addressed in various applications, it may be necessary to add other metadata to the data store. For instance, data consumers may decide to add metadata to an existing database to facilitate restricting the display of content in countries affected by a status_withheld message.
- Handling Retweet Deletes: Retweets are a special kind of Post where the original message is nested in an object within the Retweet. In this case, there are two Post IDs referenced in a Retweet: the ID for the Retweet, and the ID for the original message (included in the nested object). When an original message is deleted, a Post delete message is issued for the original ID. Post deletion events typically trigger delete events for all Retweets. However, in some cases not all are sent, and client systems should be tolerant of incomplete Retweet deletions. The deletion of the original ID should be sufficient to delete all subsequent Retweets. It is a best practice to reference the original Post ID when storing Retweets, and to delete all referencing Retweets when a Post delete event is received.
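One way to picture these recommendations is the small SQLite sketch below. It keys every row by numeric Post ID and User ID and records the original Post ID on each Retweet, so that a delete for the original also removes stored Retweets. The table name, column names, and helper functions are hypothetical, and a production store would likely differ.

```python
import sqlite3

SCHEMA = """
CREATE TABLE posts (
    post_id            INTEGER PRIMARY KEY,  -- numeric Post ID
    user_id            INTEGER NOT NULL,     -- numeric ID of the author
    retweet_of_post_id INTEGER,              -- original Post ID when the row is a Retweet
    body               TEXT
);
CREATE INDEX posts_by_user ON posts (user_id);
CREATE INDEX posts_by_original ON posts (retweet_of_post_id);
"""


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def store_post(conn, post_id, user_id, body, retweet_of_post_id=None) -> None:
    conn.execute(
        "INSERT INTO posts (post_id, user_id, retweet_of_post_id, body) VALUES (?, ?, ?, ?)",
        (post_id, user_id, retweet_of_post_id, body),
    )


def delete_post(conn, post_id) -> int:
    """Delete a Post and every stored Retweet that references it; return rows removed."""
    cur = conn.execute(
        "DELETE FROM posts WHERE post_id = ? OR retweet_of_post_id = ?",
        (post_id, post_id),
    )
    return cur.rowcount


def delete_posts_by_user(conn, user_id) -> int:
    """Delete all stored Posts authored by the given user; return rows removed."""
    cur = conn.execute("DELETE FROM posts WHERE user_id = ?", (user_id,))
    return cur.rowcount
```

Because `delete_post` matches on both columns, the store stays consistent even when delete events for individual Retweets never arrive.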
Compliance Data Objects
Compliance Firehose API
Possible types of compliance events will include Post (or “status”) events and User events, for which there are multiple types described below.
Please note:
- Read more about User statuses here and our developer policy around deleted Posts here.
- The Compliance Firehose has been updated to provide ‘tweet_edit’ events.
- Several User delete, protect and suspend events are not necessarily permanent and can toggle between states indefinitely. These include: user_delete, user_undelete, user_protect, user_unprotect, user_suspend and user_unsuspend.
- A user_delete is followed by status deletes 30 days later only if the user has not undeleted their account (user_undelete) in the meantime. If the user reverses the user_delete, the deletes for their Posts 30 days later do not occur.
- A user_suspend remains in effect unless the user is subject to a user_unsuspend event. Suspensions are not subject to any change on a 30 day time period.
Refer to the ‘Recommended Action’ column to understand how to process each type of event in order to respect the privacy and intent of the end user.
Original Message Type | Object | Permanent (Yes/No) | Recommended Action |
---|---|---|---|
delete | Status | Yes | Delete associated Post. |
status_withheld | Status | Yes | Suppress associated Post in specific countries listed in the status_withheld message. |
drop | Status | No | Remove the Post from public view. |
undrop | Status | No | Status may be displayed again and treated as public. |
tweet_edit | Status | Yes | Honor and, where relevant, display the new edit. |
user_delete | User | No | Suppress or delete all Posts by associated user. |
user_undelete | User | No | All Posts by associated user may be displayed again and treated as public. |
user_protect | User | No | Suppress or delete all Posts by associated user. |
user_unprotect | User | No | All Posts by associated user may be displayed again and treated as public. |
user_suspend | User | No | Suppress or delete all Posts by associated user. |
user_unsuspend | User | No | All Posts by associated user may be displayed again and treated as public. |
scrub_geo | User | Yes | Delete all geodata provided by X for all Posts by the user prior to the specified Post in the scrub_geo message. Note that subsequent Posts by a user may contain geodata that may be used. |
user_withheld | User | Yes | Suppress Posts by associated user in specific countries listed in the user_withheld message. |
delete | Favorite | Yes | Delete associated like/favorite. |
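
The table above can be carried into code as a plain lookup keyed by message type and object, as in the sketch below. The `Guidance` type, the `guidance_for` function, and the short action labels are hypothetical names chosen for illustration. Only the message types, objects, and permanence flags come from the table.

```python
from typing import Dict, NamedTuple, Tuple


class Guidance(NamedTuple):
    permanent: bool  # "Permanent (Yes/No)" column
    action: str      # hypothetical label summarizing "Recommended Action"


ACTIONS: Dict[Tuple[str, str], Guidance] = {
    ("delete", "Status"): Guidance(True, "delete_post"),
    ("status_withheld", "Status"): Guidance(True, "suppress_post_in_countries"),
    ("drop", "Status"): Guidance(False, "hide_post"),
    ("undrop", "Status"): Guidance(False, "restore_post"),
    ("tweet_edit", "Status"): Guidance(True, "apply_edit"),
    ("user_delete", "User"): Guidance(False, "hide_user_posts"),
    ("user_undelete", "User"): Guidance(False, "restore_user_posts"),
    ("user_protect", "User"): Guidance(False, "hide_user_posts"),
    ("user_unprotect", "User"): Guidance(False, "restore_user_posts"),
    ("user_suspend", "User"): Guidance(False, "hide_user_posts"),
    ("user_unsuspend", "User"): Guidance(False, "restore_user_posts"),
    ("scrub_geo", "User"): Guidance(True, "delete_geodata"),
    ("user_withheld", "User"): Guidance(True, "suppress_user_posts_in_countries"),
    ("delete", "Favorite"): Guidance(True, "delete_like"),
}


def guidance_for(message_type: str, obj: str) -> Guidance:
    """Look up the guidance for an event; unknown pairs raise so they are not ignored."""
    try:
        return ACTIONS[(message_type, obj)]
    except KeyError:
        raise ValueError(f"unrecognized compliance event: {message_type}/{obj}") from None
```

Keying on the pair rather than the message type alone matters because `delete` appears twice in the table, once for Status and once for Favorite.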
Payload examples
See the payload examples below for each compliance event described in the table above.
Post edit
Post delete
Post withheld
Drop
Undrop
Scrub geo
User delete
User undelete
User withheld
User protect
User unprotect
User suspend
User unsuspend
Integrating Compliance Firehose
The Compliance Firehose is a realtime streaming API that delivers compliance events that occur on the X platform. For an understanding of compliance events and how they are generated on X, please reference our article, Honoring User Intent on X.
It is important to note that Post and User events are delivered independently and that each should be processed independently (i.e. a Post delete doesn’t imply a User event, and vice versa). Several User events are not necessarily permanent and can toggle between states indefinitely. These include: user_delete, user_undelete, user_protect, user_unprotect, user_suspend and user_unsuspend.
A user_delete is followed by status deletes 30 days later only if the user has not undeleted their account (user_undelete) in the meantime. If the user reverses the user_delete, the deletes for their Posts 30 days later do not occur.
A user_suspend remains in effect unless the user is subject to a user_unsuspend event. Suspensions are not subject to any change on a 30 day time period.
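Since these User events can toggle back and forth, one possible approach is to track each restriction as an independent flag and treat a user's Posts as public only while no flag is set. The sketch below illustrates that idea. The class name, method names, and flag labels are hypothetical, and tracking the flags independently is a design assumption rather than documented behavior.

```python
from collections import defaultdict
from typing import DefaultDict, Set

# Events that set a restriction, and the events that clear the same restriction.
SET_EVENTS = {
    "user_delete": "deleted",
    "user_protect": "protected",
    "user_suspend": "suspended",
}
CLEAR_EVENTS = {
    "user_undelete": "deleted",
    "user_unprotect": "protected",
    "user_unsuspend": "suspended",
}


class UserVisibility:
    """Track which restrictions are currently active for each numeric user ID."""

    def __init__(self) -> None:
        self._flags: DefaultDict[int, Set[str]] = defaultdict(set)

    def apply(self, event_type: str, user_id: int) -> None:
        """Apply one toggling User event; other event types are ignored here."""
        if event_type in SET_EVENTS:
            self._flags[user_id].add(SET_EVENTS[event_type])
        elif event_type in CLEAR_EVENTS:
            self._flags[user_id].discard(CLEAR_EVENTS[event_type])

    def is_public(self, user_id: int) -> bool:
        """A user's Posts may be shown only when no restriction is active."""
        return not self._flags.get(user_id)
```

With separate flags, a user_unprotect received for a user who is also suspended does not make that user's Posts visible again.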
It is never suitable to display compliance events directly to users of your software or to otherwise incorporate them into your products or customer experiences. They are intended solely for maintaining compliance and honoring the actions of X users.
Integrating with the Compliance Firehose
To integrate the Compliance Firehose into your system, you will need to build an integration that can do the following:
- Establish a streaming connection to each streaming API partition of the Compliance Firehose
- Handle high data volumes – de-couple stream ingestion from additional processing using asynchronous processes
- Reconnect to the stream partitions automatically when disconnected for any reason
- Process compliance events that are relevant to Post and User data you have stored in accordance with the guidance presented above
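The requirements above can be sketched with one reader thread per partition feeding a shared queue, and a separate worker draining it. This is a minimal illustration, not a production client: `open_stream` is a hypothetical callable that returns an iterable of lines for a partition, `handle` is a hypothetical event handler, and the assumptions that partitions are numbered 1 to 8 and that connection failures surface as `OSError` are not confirmed by this document.

```python
import queue
import threading
from typing import Callable, Iterable, List, Tuple

PARTITIONS = range(1, 9)  # 8 partitions; numbering from 1 is an assumption


def read_partition(partition: int,
                   open_stream: Callable[[int], Iterable[str]],
                   events: "queue.Queue[str]",
                   stop: threading.Event,
                   retry_delay: float = 1.0) -> None:
    """Keep one partition connected and push raw lines onto the shared queue."""
    while not stop.is_set():
        try:
            for line in open_stream(partition):
                if stop.is_set():
                    return
                if line.strip():      # ignore blank lines
                    events.put(line)  # no parsing here, so ingestion stays fast
        except OSError:
            pass                      # connection failure: fall through and reconnect
        stop.wait(retry_delay)        # pause before reconnecting


def process_events(events: "queue.Queue[str]",
                   handle: Callable[[str], None],
                   stop: threading.Event) -> None:
    """Drain the queue on its own thread so slow handling never blocks the readers."""
    while not (stop.is_set() and events.empty()):
        try:
            line = events.get(timeout=0.1)
        except queue.Empty:
            continue
        handle(line)


def start(open_stream: Callable[[int], Iterable[str]],
          handle: Callable[[str], None],
          retry_delay: float = 1.0) -> Tuple[threading.Event, List[threading.Thread]]:
    """Start one reader per partition plus one worker; set the event to stop them."""
    events: "queue.Queue[str]" = queue.Queue()
    stop = threading.Event()
    threads = [
        threading.Thread(target=read_partition,
                         args=(p, open_stream, events, stop, retry_delay))
        for p in PARTITIONS
    ]
    threads.append(threading.Thread(target=process_events, args=(events, handle, stop)))
    for thread in threads:
        thread.start()
    return stop, threads
```

The fixed `retry_delay` keeps the sketch short. A real client would more likely grow the delay between failed attempts.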
Honoring User Intent on X
We believe that respecting the privacy and intent of X users is critically important to the long term health of one of the largest public, real-time information platforms in the world. X puts privacy controls in the hands of its users, giving individuals the ability to control their own X experience. As business consumers of X data, we have a collective responsibility to honor the privacy and actions of end users in order to maintain this environment of trust and respect.
There are a variety of things that can happen to Posts and User accounts that impact how they are displayed on the platform. The actions that affect privacy and intent are defined at both the Status (Post) and User levels. These actions include:
User
Action | Description |
---|---|
Protect Account | An X user can protect or unprotect their account at any time. Protected accounts require manual user approval of every person who is allowed to view their account’s Posts. For more information, see About Public and Protected Posts. |
Delete Account | An X user can decide to delete their account and all associated status messages at any time. X retains the account information for 30 days after deletion in case the user decides to undelete and effectively reactivate their account. |
Scrub Geo | An X user can remove all location data from past Posts at any time. This is known as “scrub geo”. |
Suspend Account | X retains the right to suspend accounts that are in violation of the X Rules or if an account is suspected to have been hacked or compromised. Account suspensions can only be reversed (unsuspend) by X. |
Withhold Account | X retains the right to reactively withhold access to certain content in a specific country from time to time. A withheld account can only be made unwithheld by X. For more information, see Country Withheld Content. |
Status
Action | Description |
---|---|
Delete Status | An X user can delete a status at any point in time. Deleted statuses cannot be reversed and are permanently deleted. |
Withhold Status | X retains the right to reactively withhold access to certain content in a specific country from time to time. A withheld status can only be made unwithheld by X. For more information, see Country Withheld Content. |
Keeping Track of User and Status Changes
The state of a User or Status can change at any time due to one of the actions above, and this impacts how consumers of X data are expected to treat the availability and privacy of all associated content. When these actions happen, a corresponding compliance message is sent that indicates that the state of a Status or User has changed.
API Reference
GET compliance/firehose
Methods
Method | Description |
---|---|
GET /compliance/:stream | Connect to the Data Stream |
Authentication
All requests to the Compliance Firehose API must use HTTP Basic Authentication, constructed from a valid email address and password combination used to log into your account at console.gnip.com. Credentials must be passed as the Authorization header for each request.
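For illustration, an HTTP Basic Authentication header is the Base64 encoding of `email:password` prefixed with `Basic`. The sketch below builds one with the standard library. The function name `basic_auth_header` is hypothetical, and most HTTP clients can construct this header for you.

```python
import base64
from typing import Dict


def basic_auth_header(email: str, password: str) -> Dict[str, str]:
    """Return the Authorization header for HTTP Basic Authentication."""
    credentials = f"{email}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```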
GET /compliance/:stream
Establishes a persistent connection to the Compliance firehose data stream, through which the compliance events will be delivered.
Setting | Details |
---|---|
Request Method | HTTP GET |
Connection Type | Keep-Alive |
URL | Found on the stream’s API Help page of your dashboard, and resembles the following structure: https://gnip-stream.twitter.com/stream/compliance/accounts/:account_name/publishers/twitter/:stream_label.json?partition=1 Note: The “partition” parameter is required. You will need to connect to all 8 partitions, each containing 12.5% of the total volume, to consume the full stream. |
Compression | Gzip. To connect to the stream using Gzip compression, simply send an Accept-Encoding header in the connection request. The header should look like the following: Accept-Encoding: gzip |
Character Encoding | UTF-8 |
Response Format | JSON. The header of your request should specify JSON format for the response. |
Rate Limit | 10 requests per 60 seconds. |
Read Timeout | Set a read timeout on your client, and ensure that it is set to a value beyond 30 seconds. |
Support for Tweet edits | All Tweet edits trigger a “tweet_edit” Compliance event. See the Compliance Data Objects documentation for more details. |
Example Curl Request
The following example request is accomplished using cURL on the command line:
Note: the above request only connects to partition=1 of the Compliance Firehose. You will need to connect to all 8 partitions to consume the entirety of this stream.
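As a sketch of what covering every partition involves, the Python below builds the eight partition URLs from the URL structure shown in the table and incrementally decompresses a gzip-encoded body into lines. The helper names, the 35-second timeout value, partition numbering from 1 to 8, and the assumption that events arrive one per line are illustrative and not confirmed by this page. The chunks would come from whichever HTTP client you use.

```python
import zlib
from typing import Iterable, Iterator, List

URL_TEMPLATE = (
    "https://gnip-stream.twitter.com/stream/compliance/accounts/"
    "{account_name}/publishers/twitter/{stream_label}.json?partition={partition}"
)
PARTITION_COUNT = 8
READ_TIMEOUT_SECONDS = 35                       # any value beyond 30 seconds; 35 is an assumption
REQUEST_HEADERS = {"Accept-Encoding": "gzip"}   # requests gzip compression


def partition_urls(account_name: str, stream_label: str) -> List[str]:
    """Return one connection URL per partition (numbering from 1 is assumed)."""
    return [
        URL_TEMPLATE.format(account_name=account_name,
                            stream_label=stream_label,
                            partition=partition)
        for partition in range(1, PARTITION_COUNT + 1)
    ]


def iter_lines(chunks: Iterable[bytes]) -> Iterator[str]:
    """Incrementally gunzip a chunked response body and yield non-empty lines."""
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)  # 16 + MAX_WBITS selects gzip framing
    buffer = b""
    for chunk in chunks:
        buffer += decompressor.decompress(chunk)
        *complete, buffer = buffer.split(b"\n")  # keep any partial line for the next chunk
        for line in complete:
            text = line.strip()
            if text:
                yield text.decode("utf-8")
```

Decompressing chunk by chunk avoids waiting for the end of a response that, on a streaming connection, never arrives.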
Response Codes
The following responses may be returned by the API for these requests. Most error codes are returned with a string with additional details in the body. For non-200 responses, clients should attempt to reconnect.
Status | Text | Definition |
---|---|---|
200 | Success | The connection was successfully opened, and new activities will be sent through as they arrive. |
401 | Unauthorized | HTTP authentication failed due to invalid credentials. Log in to console.gnip.com with your credentials to ensure you are using them correctly with your request. |
406 | Not Acceptable | Generally, this occurs where your client fails to properly include the headers to accept gzip encoding from the stream, but can occur in other circumstances as well. Will contain a JSON message similar to “This connection requires compression. To enable compression, send an ‘Accept-Encoding: gzip’ header in your request and be ready to uncompress the stream as it is read on the client end.” |
429 | Rate Limited | Your app has exceeded the limit on connection requests. |
503 | Service Unavailable | Twitter server issue. Reconnect using an exponential backoff pattern. If no notice about this issue has been posted on the Twitter API Status Page, contact support. |
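
A reconnect schedule with exponential backoff might look like the sketch below. The 6-second floor is simply 60 seconds divided by the 10 requests allowed, and it assumes the limit is counted per connection, which this page does not specify. The 300-second cap, the optional jitter, and the function name are likewise assumptions.

```python
import random

MIN_INTERVAL_SECONDS = 6.0   # 60 s / 10 requests, per the stated rate limit
MAX_INTERVAL_SECONDS = 300.0  # assumed upper bound on the wait


def reconnect_delay(attempt: int,
                    base: float = MIN_INTERVAL_SECONDS,
                    cap: float = MAX_INTERVAL_SECONDS,
                    jitter: float = 0.0) -> float:
    """Seconds to wait before reconnect attempt `attempt` (1-based), doubling each time."""
    delay = min(cap, base * 2 ** (attempt - 1))
    return delay + random.uniform(0, jitter)  # optional jitter spreads out simultaneous retries
```

The attempt counter would be reset once a connection returns 200. Note that a 401 or 406 points to a problem with the request itself, so retrying unchanged is unlikely to succeed.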
Other Recommendations & Best practices
- Build Data Storage Schemas That Store Numeric Tweet ID and User ID: User messages require action to be taken on all Tweets from that User. Therefore, since all compliance messages are delivered only by numeric ID, it is important to design storage schemas that maintain the relationship between Tweet and User based on numeric IDs. Data consumers will need to monitor compliance events by both Tweet ID and User ID and be able to update the local data store appropriately.
- Build Schemas That Address All Compliance Statuses: Depending on how compliance activities will be addressed in various applications, it may be required to add other metadata to the data store. For instance, data consumers may decide to add metadata to an existing database to facilitate restricting the display of content in countries affected by a status_withheld message.
- Handling Retweet Deletes: Retweets are a special kind of Tweet where the original message is nested in an object within the Retweet. In this case, there are two Tweet IDs referenced in a Retweet: the ID for the Retweet, and the ID for the original message (included in the nested object). When an original message is deleted, a Tweet delete message is issued for the original ID. Subsequent delete messages are NOT issued for all of the Retweets. The deletion of the original ID should be sufficient to delete all subsequent Retweets.