Building a client to consume streaming data
When using a streaming endpoint, there are some general best practices to consider in order to optimize usage.

Client design
When building a solution with the filter stream endpoint, you will need a client that can do the following (a sketch of rule management follows this list):

- Establish an HTTPS streaming connection to the filter stream endpoint.
- Asynchronously send POST requests to the filter stream rules endpoint to add and delete rules from the stream.
- Handle low data volumes – maintain the streaming connection, detecting Post objects and keep-alive signals.
- Handle high data volumes – decouple stream ingestion from additional processing using asynchronous processes, and ensure client-side buffers are flushed regularly.
- Track data volume consumption on the client side.
- Detect stream disconnections, evaluate the situation, and reconnect to the stream automatically.
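As a rough illustration of the rule-management requirement, the sketch below sends POST requests to the filtered stream rules endpoint with Python's requests library. The endpoint URL and the BEARER_TOKEN environment variable are assumptions for this example; adapt them to your own configuration.

```python
import os

import requests

# Assumed rules endpoint and credentials; adjust to your environment.
RULES_URL = "https://api.twitter.com/2/tweets/search/stream/rules"
HEADERS = {"Authorization": f"Bearer {os.environ['BEARER_TOKEN']}"}


def add_rules(rules):
    """Add rules, e.g. [{"value": "cats has:media", "tag": "cat media"}]."""
    resp = requests.post(RULES_URL, headers=HEADERS, json={"add": rules})
    resp.raise_for_status()
    return resp.json()


def delete_rules(rule_ids):
    """Delete existing rules by their IDs."""
    resp = requests.post(RULES_URL, headers=HEADERS, json={"delete": {"ids": rule_ids}})
    resp.raise_for_status()
    return resp.json()
```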
Connecting to a streaming endpoint
Establishing a connection to X API v2 streaming endpoints means making a very long-lived HTTP request and parsing the response incrementally. Conceptually, you can think of it as downloading an infinitely long file over HTTP. Once a connection has been established, the X server will deliver Post events through the connection as long as the connection is open.
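As a sketch of what that long-lived request can look like, the example below opens the connection with Python's requests library and reads the response line by line. The stream URL, the BEARER_TOKEN environment variable, and the handle() hand-off are assumptions for illustration.

```python
import json
import os

import requests

# Assumed stream endpoint and credentials; adjust to your environment.
STREAM_URL = "https://api.twitter.com/2/tweets/search/stream"
HEADERS = {"Authorization": f"Bearer {os.environ['BEARER_TOKEN']}"}


def handle(post):
    # Placeholder: hand the Post off to another thread/process for real work.
    print(post.get("data", {}).get("id"))


def consume_stream():
    # stream=True keeps the response open so the body can be read incrementally.
    with requests.get(STREAM_URL, headers=HEADERS, stream=True, timeout=90) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            # Blank lines are keep-alive signals; skipping them is expected.
            if not line:
                continue
            handle(json.loads(line))


if __name__ == "__main__":
    consume_stream()
```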
Consuming data

Note that the individual fields of JSON objects are not ordered, and not all fields will be present in all circumstances. Similarly, separate activities are not delivered in sorted order, and duplicate messages may be encountered. Keep in mind that over time, new message types may be added and sent through the stream. Thus, your client must tolerate the following (a defensive-parsing sketch follows this list):

- Fields appearing in any order
- Unexpected or missing fields
- Non-sorted Posts
- Duplicate messages
- New arbitrary message types coming down the stream at any time
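A minimal sketch of that kind of defensive handling is shown below; the in-memory dedupe set and the process() placeholder are assumptions standing in for your own storage and processing logic.

```python
import json

seen_ids = set()  # simple in-memory dedupe; a persistent store may suit long-running clients better


def process(post_id, text):
    # Placeholder for your real processing pipeline.
    print(post_id, text[:50])


def handle_message(raw_line: bytes):
    message = json.loads(raw_line)

    # A payload without a "data" object may be an unknown or new message type;
    # skip it rather than failing.
    data = message.get("data")
    if not isinstance(data, dict):
        return

    post_id = data.get("id")
    if post_id is None or post_id in seen_ids:
        return  # missing ID or duplicate delivery
    seen_ids.add(post_id)

    # Access optional fields defensively instead of assuming presence or order.
    process(post_id, data.get("text", ""))
```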
Buffering
The streaming endpoints will send data to you as quickly as it becomes available, which can result in high volumes in many cases. If the X server cannot write new data to the stream right away (for example, if your client is not reading fast enough; see handling disconnections for more), it will buffer the content on its end to allow your client to catch up. However, when this buffer is full, a forced disconnect will be initiated to drop the connection, and the buffered Posts will be dropped and not resent. See below for more details.

One way to identify times when your app is falling behind is to compare the timestamp of the Posts being received with the current time, and track this difference over time. Although stream backups can never be completely eliminated due to potential latency and hiccups over the public internet, they can be largely avoided through proper configuration of your app. To minimize the occurrence of backups (a sketch of decoupled ingestion follows this list):

- Ensure that your client is reading the stream fast enough. Typically, you should not do any real processing work as you read the stream. Read the stream and hand the activity to another thread/process/data store to do your processing asynchronously.
- Ensure that your data center has inbound bandwidth sufficient to accommodate large sustained data volumes, as well as significantly larger spikes (e.g. 5-10x normal volume). For filtered stream, the volume and corresponding bandwidth required on your end are wholly dependent on which Posts your rules match.
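One possible way to satisfy the first point, and to track how far behind real time your Posts are arriving, is sketched below. It assumes the open streaming response from the earlier connection example and that your requests include tweet.fields=created_at so each Post carries a created_at timestamp.

```python
import json
import queue
import threading
from datetime import datetime, timezone

# A bounded queue decouples the fast stream reader from slower processing.
work_queue = queue.Queue(maxsize=10000)


def reader(response):
    """Read the open streaming response and enqueue raw lines immediately."""
    for line in response.iter_lines():
        if line:
            work_queue.put(line)


def worker():
    """Process queued messages and track how far behind the stream we are."""
    while True:
        line = work_queue.get()
        message = json.loads(line)
        created_at = message.get("data", {}).get("created_at")
        if created_at:
            posted = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
            lag_seconds = (datetime.now(timezone.utc) - posted).total_seconds()
            # A value that grows over time means the client is falling behind.
            print(f"lag: {lag_seconds:.1f}s")
        # ... real processing goes here ...
        work_queue.task_done()


# Wiring, assuming `resp` is the open streaming response from the connection example:
# threading.Thread(target=reader, args=(resp,), daemon=True).start()
# threading.Thread(target=worker, daemon=True).start()
```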