Interested in learning more about how the enterprise data formats map to the X API v2 format?
Check out our comparison guides:
Enterprise
Posts are the basic atomic building block of all things X. All X APIs that return Posts provide that data encoded using JavaScript Object Notation (JSON). JSON is based on key-value pairs, with named attributes and associated values. Post objects retrieved from the API include an X User’s “status update”, but Retweets, replies, and quote Tweets are also Post objects. If a Post is related to another Post, as a Retweet, reply, or quote Tweet, the related Post will be identified or embedded in the Post object. Even the simplest Post in the native X data format will have nested JSON objects to represent its other attributes, such as the author, mentioned users, tagged place location, hashtags, cashtag symbols, media, or URL links. This is an important concept to understand when working with X data. The format of the Post data you receive from the X API depends on the type of Post received, the X API you are using, and the format settings.
Enterprise endpoints that return Post objects have been updated to provide the metadata needed to understand the Post’s edit history. Learn more about these metadata on the “Edit Posts” fundamentals page.
In native X format, the JSON payload includes ‘root-level’ attributes and nested JSON objects (represented with the {} notation).
Please note: It is highly recommended to use the Enriched Native format for enterprise data APIs. Enterprise data APIs deliver data in two different formats. The enterprise format closest to the standard v1.1 native format is Native Enriched. The legacy enterprise data format is Activity Streams, originally implemented and used by Gnip as a normalized format across X and other social media data providers at the time. While this format is still available, X has invested in new features and developments only for the Native Enriched format since 2017. The Enriched Native format is exactly what it sounds like: it includes native X objects as well as additional enrichments available for enterprise data products, such as URL unwinding metadata, profile geo, poll metadata, and additional engagement metrics.
- The Enriched Native format includes all new metadata since 2017, such as poll metadata, and additional metrics such as reply_count and quote_count.
- Activity Streams format has not been updated with new metadata or enrichments since the character update in 2017.
Native Enriched | Activity Streams |
---|---|
Link Post object | Link Activity object |
Link User object | Link Actor object |
Link Entities object | Link X entities object |
Link Extended entities object | Link X extended entities object |
Link Geo object | Link Location object |
n/a | Link Gnip object |
Interested in learning more about how the Native Enriched data format maps to the X API v2 format? Check out our comparison guide: Native Enriched compared to X API v2
Root-level attributes include id, created_at, and text. Post objects will also have nested objects to include the user, entities, and extended_entities. Post objects will also have other nested Post objects such as retweeted_status, quoted_status and extended_tweet. The native enriched format will additionally have a matching_rules object.
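As a minimal sketch of how these nested objects can be used, the snippet below labels a Post by checking which of the attributes described here are present. The function name `classify_post`, the label strings, and the order of the checks are illustrative assumptions, not part of the X API.

```python
import json


def classify_post(post: dict) -> str:
    """Return a coarse label for a Post (hypothetical helper, labels are assumptions)."""
    # A Retweet carries the original Post in retweeted_status.
    if "retweeted_status" in post:
        return "retweet"
    # A quote Tweet sets is_quote_status and embeds quoted_status.
    if post.get("is_quote_status") and "quoted_status" in post:
        return "quote"
    # A reply has a non-null in_reply_to_status_id_str.
    if post.get("in_reply_to_status_id_str") is not None:
        return "reply"
    return "original"


# Hand-written sample payload for illustration only.
raw = (
    '{"id_str": "1", "text": "RT @xapi: hello", '
    '"retweeted_status": {"id_str": "2", "text": "hello"}}'
)
post = json.loads(raw)
print(classify_post(post))
```

Checking for Retweets first reflects the fact that a Retweet of a quote Tweet still carries `retweeted_status`; other orderings may suit other use cases.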
Attribute | Type | Description |
---|---|---|
created_at | String | UTC time when this Post was created. Example: “created_at”: “Wed Oct 10 20:19:24 +0000 2018” |
id | Int64 | The integer representation of the unique identifier for this Post. This number is greater than 53 bits and some programming languages may have difficulty/silent defects in interpreting it. Using a signed 64 bit integer for storing this identifier is safe. Use id_str to fetch the identifier to be safe. See X IDs for more information. Example:“id”:1050118621198921728 |
id_str | String | The string representation of the unique identifier for this Post. Implementations should use this rather than the large integer in id . Example:“id_str”:“1050118621198921728” |
text | String | The actual UTF-8 text of the status update. See X-text for details on what characters are currently considered valid. Example: “text”:“To make room for more expression, we will now count all emojis as equal—including those with gender and skin t… https://t.co/MkGjXf9aXm” |
source | String | Utility used to post the Post, as an HTML-formatted string. Posts from the X website have a source value of web. Example: “source”:“X Web Client” |
truncated | Boolean | Indicates whether the value of the text parameter was truncated, for example, as a result of a retweet exceeding the original Post text length limit of 140 characters. Truncated text will end in ellipsis, like this ... Since X now rejects long Posts vs truncating them, the large majority of Posts will have this set to false . Note that while native retweets may have their toplevel text property shortened, the original text will be available under the retweeted_status object and the truncated parameter will be set to the value of the original status (in most cases, false ). Example:“truncated”:true |
in_reply_to_status_id | Int64 | Nullable. If the represented Post is a reply, this field will contain the integer representation of the original Post’s ID. Example: “in_reply_to_status_id”:1051222721923756032 |
in_reply_to_status_id_str | String | Nullable. If the represented Post is a reply, this field will contain the string representation of the original Post’s ID. Example: “in_reply_to_status_id_str”:“1051222721923756032” |
in_reply_to_user_id | Int64 | Nullable. If the represented Post is a reply, this field will contain the integer representation of the original Post’s author ID. This will not necessarily always be the user directly mentioned in the Post. Example: “in_reply_to_user_id”:6253282 |
in_reply_to_user_id_str | String | Nullable. If the represented Post is a reply, this field will contain the string representation of the original Post’s author ID. This will not necessarily always be the user directly mentioned in the Post. Example: “in_reply_to_user_id_str”:“6253282” |
in_reply_to_screen_name | String | Nullable. If the represented Post is a reply, this field will contain the screen name of the original Post’s author. Example: “in_reply_to_screen_name”:“xapi” |
user | User object | The user who posted this Post. See User data dictionary for complete list of attributes. Example highlighting select attributes: { “user”: { “id”: 6253282, “id_str”: “6253282”, “name”: “X API”, “screen_name”: “API”, “location”: “San Francisco, CA”, “url”: “https://developer.x.com”, “description”: “The Real X API. Tweets about API changes, service issues and our Developer Platform. Don’t get an answer? It’s on my website.”, “verified”: true, “followers_count”: 6129794, “friends_count”: 12, “listed_count”: 12899, “favourites_count”: 31, “statuses_count”: 3658, “created_at”: “Wed May 23 06:01:13 +0000 2007”, “utc_offset”: null, “time_zone”: null, “geo_enabled”: false, “lang”: “en”, “contributors_enabled”: false, “is_translator”: false, “profile_background_color”: “null”, “profile_background_image_url”: “null”, “profile_background_image_url_https”: “null”, “profile_background_tile”: null, “profile_link_color”: “null”, “profile_sidebar_border_color”: “null”, “profile_sidebar_fill_color”: “null”, “profile_text_color”: “null”, “profile_use_background_image”: null, “profile_image_url”: “null”, “profile_image_url_https”: “https://pbs.twimg.com/profile_images/942858479592554497/BbazLO9L_normal.jpg”, “profile_banner_url”: “https://pbs.twimg.com/profile_banners/6253282/1497491515”, “default_profile”: false, “default_profile_image”: false, “following”: null, “follow_request_sent”: null, “notifications”: null } } |
coordinates | Coordinates | Nullable. Represents the geographic location of this Post as reported by the user or client application. The inner coordinates array is formatted as geoJSON (longitude first, then latitude). Example: “coordinates”: { “coordinates”: [ -75.14310264, 40.05701649 ], “type”:“Point” } |
place | Places | Nullable. When present, indicates that the Post is associated with (but not necessarily originating from) a Place. Example: “place”: { “attributes”: {}, “bounding_box”: { “coordinates”: [[ [-77.119759,38.791645], [-76.909393,38.791645], [-76.909393,38.995548], [-77.119759,38.995548] ]], “type”:“Polygon” }, “country”:“United States”, “country_code”:“US”, “full_name”:“Washington, DC”, “id”:“01fbe706f872cb32”, “name”:“Washington”, “place_type”:“city”, “url”:“http://api.x.com/1/geo/id/0172cb32.json” } |
quoted_status_id | Int64 | This field only surfaces when the Post is a quote Tweet. This field contains the integer value Post ID of the quoted Tweet. Example: “quoted_status_id”:1050119905717055488 |
quoted_status_id_str | String | This field only surfaces when the Post is a quote Tweet. This is the string representation Post ID of the quoted Tweet. Example: “quoted_status_id_str”:“1050119905717055488” |
is_quote_status | Boolean | Indicates whether this is a Quoted Tweet. Example: “is_quote_status”:false |
quoted_status | Post | This field only surfaces when the Post is a quote Tweet. This attribute contains the Post object of the original Post that was quoted. |
retweeted_status | Post | Users can amplify the broadcast of Posts authored by other users by Retweeting . Retweets can be distinguished from typical Posts by the existence of a retweeted_status attribute. This attribute contains a representation of the original Post that was retweeted. Note that retweets of retweets do not show representations of the intermediary retweet, but only the original Post. (Users can also unretweet a retweet they created by deleting their retweet.) |
quote_count | Integer | Nullable. Indicates approximately how many times this Post has been quoted by X users. Example: “quote_count”:33 Note: This object is only available with the Premium and Enterprise tier products. |
reply_count | Int | Number of times this Post has been replied to. Example: “reply_count”:30 Note: This object is only available with the Premium and Enterprise tier products. |
retweet_count | Int | Number of times this Post has been retweeted. Example: “retweet_count”:160 |
favorite_count | Integer | Nullable. Indicates approximately how many times this Post has been liked by X users. Example: “favorite_count”:295 |
entities | Entities | Entities which have been parsed out of the text of the Post. Additionally see Entities in X Objects. Example: “entities”: { “hashtags”:[], “urls”:[], “user_mentions”:[], “media”:[], “symbols”:[], “polls”:[] } |
extended_entities | Extended Entities | When between one and four native photos, or one video, or one animated GIF are in the Post, contains an array of ‘media’ metadata. This is also available in Quote Tweets. Additionally see Entities in X Objects. Example: “extended_entities”: { “media”:[] } |
favorited | Boolean | Nullable. Indicates whether this Post has been liked by the authenticating user. Example: “favorited”:true |
retweeted | Boolean | Indicates whether this Post has been Retweeted by the authenticating user. Example: “retweeted”:false |
possibly_sensitive | Boolean | Nullable. This field indicates content may be recognized as sensitive. The Post author can select within their own account preferences and choose “Mark media you post as having material that may be sensitive” so each Post created after has this flag set. This may also be judged and labeled by an internal X support agent. Example: “possibly_sensitive”:false |
filter_level | String | Indicates the maximum value of the filter_level parameter which may be used and still stream this Post. So a value of medium will be streamed on none, low, and medium streams. Example: “filter_level”: “low” |
lang | String | Nullable. When present, indicates a BCP 47 language identifier corresponding to the machine-detected language of the Post text, or und if no language could be detected. Example: “lang”: “en” |
edit_history | Object | Unique identifiers indicating all versions of a Post. For Posts with no edits, there will be one ID. For Posts with an edit history, there will be multiple IDs, arranged in ascending order reflecting the order of edits, with the most recent version in the last position of the array. The Post IDs can be used to hydrate and view previous versions of a Post. Example: “edit_history”: { “initial_tweet_id”: “1283764123”, “edit_tweet_ids”: [“1283764123”, “1394263866”] } |
edit_controls | Object | When present, indicates how long a Post is still editable for and the number of remaining edits. Posts are only editable for the first 30 minutes after creation and can be edited up to five times. The Post IDs can be used to hydrate and view previous versions of a Post. Example: “edit_controls”: { “editable_until_ms”: 123, “edits_remaining”: 3 } |
editable | Boolean | When present, indicates if a Post was eligible for edit when published. This field is not dynamic and won’t toggle from True to False when a Post reaches its editable time limit, or maximum number of edits. The following Post features will cause this field to be false: * Post is promoted * Post has a poll * Post is a non-self-thread reply * Post is a retweet (note that Quote Tweets are eligible for edit) * Post is nullcast * Community Post * Superfollow Post * Collaborative Post |
matching_rules | Array of Rule Objects | Present in filtered products such as X Search and PowerTrack. Provides the id and tag associated with the rule that matched the Post. More on matching rules here. With PowerTrack, more than one rule can match a Post. Example: “matching_rules”: [ { “tag”: “xapi emojis”, “id”: 1050118621198921728, “id_str”: “1050118621198921728” } ] |
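The following is a small sketch, not a definitive implementation, of handling three of the attributes in the table above: parsing created_at, preferring id_str over id, and reading the most recent version from edit_history. The `post` dictionary is assembled by hand from the example values in the table, and the variable names are illustrative.

```python
from datetime import datetime

# Values copied from the examples in the table; combined here for illustration.
post = {
    "created_at": "Wed Oct 10 20:19:24 +0000 2018",
    "id": 1050118621198921728,
    "id_str": "1050118621198921728",
    "edit_history": {
        "initial_tweet_id": "1283764123",
        "edit_tweet_ids": ["1283764123", "1394263866"],
    },
}

# Layout inferred from the created_at example; %a and %b assume an English (C) locale.
created = datetime.strptime(post["created_at"], "%a %b %d %H:%M:%S %z %Y")

# IDs can exceed 2**53, beyond which IEEE 754 doubles (for example,
# JavaScript numbers) can no longer represent every integer exactly.
exceeds_double_precision = post["id"] > 2**53

# Prefer the string form as the identifier.
post_id = post["id_str"]

# The most recent version is in the last position of edit_tweet_ids.
edit_ids = post.get("edit_history", {}).get("edit_tweet_ids", [post_id])
latest_version_id = edit_ids[-1]
was_edited = len(edit_ids) > 1

print(created.isoformat(), exceeds_double_precision, latest_version_id, was_edited)
```

Python integers are arbitrary precision, so the risk described for id applies mainly when the payload passes through a language or JSON parser that stores numbers as doubles.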
Attribute | Type | Description |
---|---|---|
current_user_retweet | Object | Perspectival. Only surfaces on methods supporting the include_my_retweet parameter, when set to true. Details the Post ID of the user’s own retweet (if existent) of this Post. Example: “current_user_retweet”: { “id”: 6253282, “id_str”: “6253282” } |
scopes | Object | A set of key-value pairs indicating the intended contextual delivery of the containing Post. Currently used by X’s Promoted Products. Example: “scopes”:{“followers”:false} |
withheld_copyright | Boolean | When present and set to “true”, it indicates that this piece of content has been withheld due to a DMCA complaint . Example: “withheld_copyright”: true |
withheld_in_countries | Array of String | When present, indicates a list of uppercase two-letter country codes this content is withheld from. X supports the following non-country values for this field: “XX” - Content is withheld in all countries “XY” - Content is withheld due to a DMCA request. Example: “withheld_in_countries”: [“GR”, “HK”, “MY”] |
withheld_scope | String | When present, indicates whether the content being withheld is the “status” or a “user.” Example: “withheld_scope”: “status” |
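As an illustration of the withheld fields above, this sketch checks whether content is withheld for a given country. The helper name `is_withheld_in` is hypothetical. Treating “XX” as matching every country follows the description above; “XY” (DMCA) is left for the caller to interpret, which is an assumption of this sketch.

```python
def is_withheld_in(obj: dict, country_code: str) -> bool:
    """Return True if the Post or User object is withheld in the given country.

    Hypothetical helper: "XX" is treated as "withheld in all countries".
    """
    codes = obj.get("withheld_in_countries") or []
    return "XX" in codes or country_code.upper() in codes


# Sample built from the example values in the table.
sample = {"withheld_in_countries": ["GR", "HK", "MY"], "withheld_scope": "status"}

print(is_withheld_in(sample, "gr"))  # country listed explicitly
print(is_withheld_in(sample, "US"))  # country not listed
```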
Field | Type | Description |
---|---|---|
geo | Object | Deprecated. Nullable. Use the coordinates field instead. This deprecated attribute has its coordinates formatted as [lat, long], while all other Post geo is formatted as [long, lat]. |
Attribute | Type | Description |
---|---|---|
id | Int64 | The integer representation of the unique identifier for this User. This number is greater than 53 bits and some programming languages may have difficulty/silent defects in interpreting it. Using a signed 64 bit integer for storing this identifier is safe. Use id_str to fetch the identifier to be safe. See X IDs for more information. Example:“id”: 6253282 |
id_str | String | The string representation of the unique identifier for this User. Implementations should use this rather than the large, possibly un-consumable integer in id . Example:“id_str”: “6253282” |
name | String | The name of the user, as they’ve defined it. Not necessarily a person’s name. Typically capped at 50 characters, but subject to change. Example: “name”: “API” |
screen_name | String | The screen name, handle, or alias that this user identifies themselves with. screen_names are unique but subject to change. Use id_str as a user identifier whenever possible. Typically a maximum of 15 characters long, but some historical accounts may exist with longer names. Example:“screen_name”: “api” |
location | String | Nullable . The user-defined location for this account’s profile. Not necessarily a location, nor machine-parseable. This field will occasionally be fuzzily interpreted by the Search service. Example: “location”: “San Francisco, CA” |
derived | Arrays of Enrichment Objects | Enterprise APIs only. Collection of Enrichment metadata derived for user. Provides the Profile Geo Enrichment metadata. See referenced documentation for more information, including JSON data dictionaries. Example: “derived”: { “locations”: [ { “country”:“United States”, “country_code”:“US”, “locality”:“Denver” } ] } |
url | String | Nullable . A URL provided by the user in association with their profile. Example: “url”: “https://developer.x.com” |
description | String | Nullable . The user-defined UTF-8 string describing their account. Example: “description”: “The Real X API.” |
protected | Boolean | When true, indicates that this user has chosen to protect their Posts. See About Public and Protected Posts . Example: “protected”: true |
verified | Boolean | When true, indicates that the user has a verified account. See Verified Accounts . Example: “verified”: false |
followers_count | Int | The number of followers this account currently has. Under certain conditions of duress, this field will temporarily indicate “0”. Example: “followers_count”: 21 |
friends_count | Int | The number of users this account is following (AKA their “followings”). Under certain conditions of duress, this field will temporarily indicate “0”. Example: “friends_count”: 32 |
listed_count | Int | The number of public lists that this user is a member of. Example: “listed_count”: 9274 |
favourites_count | Int | The number of Posts this user has liked in the account’s lifetime. British spelling used in the field name for historical reasons. Example: “favourites_count”: 13 |
statuses_count | Int | The number of Posts (including retweets) issued by the user. Example: “statuses_count”: 42 |
created_at | String | The UTC datetime that the user account was created on X. Example: “created_at”: “Mon Nov 29 21:18:15 +0000 2010” |
profile_banner_url | String | The HTTPS-based URL pointing to the standard web representation of the user’s uploaded profile banner. By adding a final path element of the URL, it is possible to obtain different image sizes optimized for specific displays. For size variants, please see User Profile Images and Banners . Example: “profile_banner_url”: “https://si0.twimg.com/profile_banners/819797/1348102824” |
profile_image_url_https | String | An HTTPS-based URL pointing to the user’s profile image. Example: “profile_image_url_https”: “https://abs.twimg.com/sticky/default_profile_images/default_profile_normal.png” |
default_profile | Boolean | When true, indicates that the user has not altered the theme or background of their user profile. Example: “default_profile”: false |
default_profile_image | Boolean | When true, indicates that the user has not uploaded their own profile image and a default image is used instead. Example: “default_profile_image”: false |
Field | Type | Description |
---|---|---|
utc_offset | null | Value will be set to null. Still available via GET account/settings |
time_zone | null | Value will be set to null. Still available via GET account/settings as tzinfo_name |
lang | null | Value will be set to null. Still available via GET account/settings as language |
geo_enabled | null | Value will be set to null. Still available via GET account/settings. This field must be true for the current user to attach geographic data when using POST statuses / update |
following | null | Value will be set to null. Still available via GET friendships/lookup |
follow_request_sent | null | Value will be set to null. Still available via GET friendships/lookup |
has_extended_profile | null | Deprecated. Value will be set to null. |
notifications | null | Deprecated. Value will be set to null. |
profile_location | null | Deprecated. Value will be set to null. |
contributors_enabled | null | Deprecated. Value will be set to null. |
profile_image_url | null | Deprecated. Value will be set to null. NOTE: Profile images are only available using the profile_image_url_https field. |
profile_background_color | null | Deprecated. Value will be set to null. |
profile_background_image_url | null | Deprecated. Value will be set to null. |
profile_background_image_url_https | null | Deprecated. Value will be set to null. |
profile_background_tile | null | Deprecated. Value will be set to null. |
profile_link_color | null | Deprecated. Value will be set to null. |
profile_sidebar_border_color | null | Deprecated. Value will be set to null. |
profile_sidebar_fill_color | null | Deprecated. Value will be set to null. |
profile_text_color | null | Deprecated. Value will be set to null. |
profile_use_background_image | null | Deprecated. Value will be set to null. |
is_translator | null | Deprecated. Value will be set to null. |
is_translation_enabled | null | Deprecated. Value will be set to null. |
translator_type | null | Deprecated. Value will be set to null. |
The place object is always present when a Post is geo-tagged with a place. Places are specific, named locations with corresponding geo coordinates. When users decide to assign a location to their Post, they are presented with a list of candidate X Places. When using the API to post, an X Place can be attached by specifying a place_id when posting. Posts associated with Places are not necessarily issued from that location but could also potentially be about that location.
The geo and coordinates objects are only present (non-null) when the Post is assigned an exact location. If an exact location is provided, the coordinates object will provide a [long, lat] array with the geographical coordinates, and an X Place that corresponds to that location will be assigned.
Field | Type | Description |
---|---|---|
id | String | ID representing this place. Note that this is represented as a string, not an integer. Example: “id”:“01a9a39529b27f36” |
url | String | URL representing the location of additional place metadata for this place. Example: “url”:“https://api.x.com/1.1/geo/id/01a9a39529b27f36.json” |
place_type | String | The type of location represented by this place. Example: “place_type”:“city” |
name | String | Short human-readable representation of the place’s name. Example: “name”:“Manhattan” |
full_name | String | Full human-readable representation of the place’s name. Example: “full_name”:“Manhattan, NY” |
country_code | String | Shortened country code representing the country containing this place. Example: “country_code”:“US” |
country | String | Name of the country containing this place. Example: “country”:“United States” |
bounding_box | Object | A bounding box of coordinates which encloses this place. Example: “bounding_box”: { “coordinates”: [ [ [ -74.026675, 40.683935 ], [ -74.026675, 40.877483 ], [ -73.910408, 40.877483 ], [ -73.910408, 40.3935 ] ] ], “type”: “Polygon” } |
attributes | Object | When using PowerTrack, 30-Day and Full-Archive Search APIs, and Volume Streams this hash is null. Example: “attributes”: {} |
Field | Type | Description |
---|---|---|
coordinates | Array of Array of Array of Float | A series of longitude and latitude points, defining a box which will contain the Place entity this bounding box is related to. Each point is an array in the form of [longitude, latitude]. Points are grouped into an array per bounding box. Bounding box arrays are wrapped in one additional array to be compatible with the polygon notation. Example: “coordinates”: [ [ [ -74.026675, 40.683935 ], [ -74.026675, 40.877483 ], [ -73.910408, 40.877483 ], [ -73.910408, 40.3935 ] ] ] |
type | String | The type of data encoded in the coordinates property. This will be “Polygon” for bounding boxes and “Point” for Posts with exact coordinates. Example: “type”:“Polygon” |
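To illustrate the nesting described above (points inside a box array, wrapped in one more array for polygon compatibility), here is a minimal sketch that reduces a bounding_box to its minimum and maximum longitude and latitude. The helper name `bbox_bounds` and the returned tuple order are assumptions made for this example.

```python
from typing import Tuple


def bbox_bounds(bounding_box: dict) -> Tuple[float, float, float, float]:
    """Return (min_lon, min_lat, max_lon, max_lat) for a Place bounding_box.

    Hypothetical helper; assumes a single box wrapped in the outer polygon array.
    """
    ring = bounding_box["coordinates"][0]  # unwrap the outer polygon array
    lons = [point[0] for point in ring]    # each point is [longitude, latitude]
    lats = [point[1] for point in ring]
    return min(lons), min(lats), max(lons), max(lats)


# Coordinates taken from the Washington, DC place example earlier on this page.
box = {
    "type": "Polygon",
    "coordinates": [[
        [-77.119759, 38.791645],
        [-76.909393, 38.791645],
        [-76.909393, 38.995548],
        [-77.119759, 38.995548],
    ]],
}

min_lon, min_lat, max_lon, max_lat = bbox_bounds(box)
center = ((min_lon + max_lon) / 2, (min_lat + max_lat) / 2)
print(center)
```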
Field | Type | Description |
---|---|---|
coordinates | Collection of Float | The latitude and longitude of the Post’s location, as a collection in the form [latitude, longitude]. Example: “geo”: { “type”: “Point”, “coordinates”: [ 54.27784, -0.41068 ] } |
type | String | The type of data encoded in the coordinates property. This will be “Point” for Post coordinates fields. Example: “type”: “Point” |
Field | Type | Description |
---|---|---|
coordinates | Collection of Float | The longitude and latitude of the Post’s location, as a collection in the form [longitude, latitude]. Example: “coordinates”: { “type”: “Point”, “coordinates”: [ -0.41068, 54.27784 ] } |
type | String | The type of data encoded in the coordinates property. This will be “Point” for Post coordinates fields. Example: “type”: “Point” |
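Because the deprecated geo object and the coordinates object order their values differently, a reader has to swap one of them to compare the two. The sketch below normalizes both to (longitude, latitude); the helper name `point_from_post` is hypothetical, and falling back to geo when coordinates is null is an assumption of this example.

```python
from typing import Optional, Tuple


def point_from_post(post: dict) -> Optional[Tuple[float, float]]:
    """Return (longitude, latitude) for a Post with an exact location, else None."""
    coords = post.get("coordinates")
    if coords:
        lon, lat = coords["coordinates"]  # already [longitude, latitude]
        return lon, lat
    geo = post.get("geo")
    if geo:
        lat, lon = geo["coordinates"]     # deprecated order: [latitude, longitude]
        return lon, lat
    return None


# Values taken from the examples in the two tables above.
from_geo = point_from_post({"geo": {"type": "Point", "coordinates": [54.27784, -0.41068]}})
from_coords = point_from_post(
    {"coordinates": {"type": "Point", "coordinates": [-0.41068, 54.27784]}}
)
print(from_geo, from_coords)
```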
Field | Type | Description |
---|---|---|
derived | locations object | Derived location from the profile geo enrichment. Example: “derived”: { “locations”: [ { “country”: “United Kingdom”, “country_code”: “GB”, “locality”: “Yorkshire”, “region”: “England”, “full_name”: “Yorkshire, England, United Kingdom”, “geo”: { “coordinates”: [ -1.5, 54 ], “type”: “point” } } ] } |
The entities section provides arrays of common things included in Posts: hashtags, user mentions, links, stock tickers (symbols), X polls, and attached media. These arrays are convenient for developers when ingesting Posts, since X has essentially pre-processed, or pre-parsed, the text body. Instead of needing to explicitly search and find these entities in the Post body, your parser can go straight to this JSON section and there they are.
Beyond providing parsing conveniences, the entities section also provides useful ‘value-add’ metadata. For example, if you are using the Enhanced URLs enrichment, URL metadata include fully-expanded URLs, as well as associated website titles and descriptions. Another example is when there are user mentions: the entities metadata include the numeric user ID, which is useful when making requests to many X APIs.
Every Post JSON payload includes an entities section, with the minimum set of hashtags, urls, user_mentions, and symbols attributes, even if none of those entities are part of the Post message. For example, if you examine the JSON for a Post with a body of “Hello World!” and no attached media, the Post’s JSON will include the following content, with entity arrays containing zero items: “entities”: { “hashtags”: [], “urls”: [], “user_mentions”: [], “symbols”: [] }
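A short sketch of what this guarantee means for a parser: the four minimum arrays can be read directly, while optional arrays such as media need a default. The payload below is hand-written for illustration, and the helper name `entity_list` is hypothetical.

```python
import json

# Hand-written payload matching the "Hello World!" case described above.
payload = json.loads(
    '{"text": "Hello World!", "entities": '
    '{"hashtags": [], "urls": [], "user_mentions": [], "symbols": []}}'
)


def entity_list(post: dict, kind: str) -> list:
    """Return the named entity array, or an empty list when it is absent."""
    return post.get("entities", {}).get(kind, [])


for kind in ("hashtags", "urls", "user_mentions", "symbols", "media"):
    # "media" is not part of the minimum set, so the default applies here.
    print(kind, len(entity_list(payload, kind)))
```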
The entities and extended_entities sections are both made up of arrays of entity objects. Below you will find descriptions for each of these entity objects, including data dictionaries that describe the object attribute names, types, and short description. We’ll also indicate which PowerTrack Operators match these attributes, and include some sample JSON payloads.
A collection of common entities found in Posts, including hashtags, links, and user mentions. This entities object does include a media attribute, but its implementation in the entities section is only completely accurate for Posts with a single photo. For all Posts with more than one photo, a video, or animated GIF, the reader is directed to the extended_entities section.
The entities structure, data dictionaries for these sub-objects, and the Operators that match them are provided below.
Field | Type | Description |
---|---|---|
hashtags | Array of Hashtag Objects | Represents hashtags which have been parsed out of the Post text. Example: “hashtags”: [ { “indices”: [ 32, 38 ], “text”: “nodejs” } ] |
media | Array of Media Objects | Represents media elements uploaded with the Post. Example: “media”: [ { “display_url”: “pic.x.com/5J1WJSRCy9”, “expanded_url”: “https://x.com/nolan_test/status/930077847535812610/photo/1”, “id”: 9.300778475358126e17, “id_str”: “930077847535812610”, “indices”: [ 13, 36 ], “media_url”: “http://pbs.twimg.com/media/DOhM30VVwAEpIHq.jpg”, “media_url_https”: “https://pbs.twimg.com/media/DOhM30VVwAEpIHq.jpg”, “sizes”: { “thumb”: { “h”: 150, “resize”: “crop”, “w”: 150 }, “large”: { “h”: 1366, “resize”: “fit”, “w”: 2048 }, “medium”: { “h”: 800, “resize”: “fit”, “w”: 1200 }, “small”: { “h”: 454, “resize”: “fit”, “w”: 680 } }, “type”: “photo”, “url”: “https://t.co/5J1WJSRCy9” } ] |
urls | Array of URL Objects | Represents URLs included in the text of a Post. Example (without Enhanced URLs enrichment enabled): “urls”: [ { “indices”: [ 32, 52 ], “url”: “http://t.co/IOwBrTZR”, “display_url”: “youtube.com/watch?v=oHg5SJ…”, “expanded_url”: “http://www.youtube.com/watch?v=oHg5SJYRHA0” } ] Example (with Enhanced URLs enrichment enabled): “urls”: [ { “url”: “https://t.co/D0n7a53c2l”, “expanded_url”: “http://bit.ly/18gECvy”, “display_url”: “bit.ly/18gECvy”, “unwound”: { “url”: “https://www.youtube.com/watch?v=oHg5SJYRHA0”, “status”: 200, “title”: “RickRoll’D”, “description”: “http://www.facebook.com/rickroll548 As long as trolls are still trolling, the Rick will never stop rolling.” }, “indices”: [ 62, 85 ] } ] |
user_mentions | Array of User Mention Objects | Represents other X users mentioned in the text of the Post. Example: “user_mentions”: [ { “name”: “X API”, “indices”: [ 4, 15 ], “screen_name”: “xapi”, “id”: 6253282, “id_str”: “6253282” } ] |
symbols | Array of Symbol Objects | Represents symbols, i.e. $cashtags, included in the text of the Post. Example: “symbols”: [ { “indices”: [ 12, 17 ], “text”: “twtr” } ] |
polls | Array of Poll Objects | Represents X Polls included in the Post. Example: “polls”: [ { “options”: [ { “position”: 1, “text”: “I read documentation once.” }, { “position”: 2, “text”: “I read documentation twice.” }, { “position”: 3, “text”: “I read documentation over and over again.” } ], “end_datetime”: “Thu May 25 22:20:27 +0000 2017”, “duration_minutes”: 60 } ] |
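The two urls examples above differ only in the presence of the unwound object. A minimal sketch of choosing the most resolved link from a URL entity follows; the helper name `best_url` and the fallback order (unwound, then expanded_url, then url) are assumptions of this example rather than documented behavior.

```python
def best_url(url_entity: dict) -> str:
    """Return the most fully resolved URL available on a URL entity (hypothetical helper)."""
    unwound = url_entity.get("unwound") or {}
    return unwound.get("url") or url_entity.get("expanded_url") or url_entity["url"]


# Entities abbreviated from the two examples in the table.
plain = {
    "url": "http://t.co/IOwBrTZR",
    "expanded_url": "http://www.youtube.com/watch?v=oHg5SJYRHA0",
}
enhanced = {
    "url": "https://t.co/D0n7a53c2l",
    "expanded_url": "http://bit.ly/18gECvy",
    "unwound": {"url": "https://www.youtube.com/watch?v=oHg5SJYRHA0", "status": 200},
}

print(best_url(plain))
print(best_url(enhanced))
```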
The entities section will contain a hashtags array containing an object for every hashtag included in the Post body, and include an empty array if no hashtags are present.
The PowerTrack # Operator is used to match on the text attribute. The has:hashtags Operator will match if there is at least one item in the array.
Field | Type | Description |
---|---|---|
indices | Array of Int | An array of integers indicating the offsets within the Post text where the hashtag begins and ends. The first integer represents the location of the # character in the Post text string. The second integer represents the location of the first character after the hashtag. Therefore the difference between the two numbers will be the length of the hashtag name plus one (for the ‘#’ character). Example: “indices”:[32,38] |
text | String | Name of the hashtag, minus the leading ‘#’ character. Example: “text”:“nodejs” |
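The relationship between indices and text can be checked with a slice. The sketch below uses a made-up Post text and a hand-built hashtag object, and it assumes the offsets count Unicode code points, which is how Python indexes strings.

```python
# Hypothetical Post text and hashtag entity, constructed for illustration.
text = "Learning #nodejs today"
hashtag = {"indices": [9, 16], "text": "nodejs"}

start, end = hashtag["indices"]

# The first offset points at '#', the second at the first character after the hashtag.
slice_from_indices = text[start:end]
print(slice_from_indices)

# The span covers the hashtag name plus one character for the leading '#'.
span_matches = (end - start) == len(hashtag["text"]) + 1
print(span_matches)
```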
The entities section will contain a media array containing a single media object if any media object has been ‘attached’ to the Post. If no native media has been attached, there will be no media array in the entities section. For the following reasons the extended_entities section should be used to process Post native media:
+ Media type will always indicate ‘photo’ even in cases of a video and GIF being attached to Post.
+ Even though up to four photos can be attached, only the first one will be listed in the entities section.
The has:media Operator will match if this array is populated.
Field | Type | Description |
---|---|---|
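Following the recommendation above, a parser can read media from extended_entities first and fall back to entities only when needed. This is a sketch under that assumption; the helper name `media_items` and the sample payload are illustrative.

```python
def media_items(post: dict) -> list:
    """Return media objects, preferring extended_entities over entities (hypothetical helper)."""
    extended = post.get("extended_entities", {}).get("media")
    if extended:
        return extended
    # entities.media lists at most the first item and always reports type 'photo'.
    return post.get("entities", {}).get("media", [])


# Hand-written sample: two photos attached, only the first appears under entities.
sample_post = {
    "entities": {"media": [{"id_str": "1", "type": "photo"}]},
    "extended_entities": {
        "media": [{"id_str": "1", "type": "photo"}, {"id_str": "2", "type": "photo"}]
    },
}

print(len(media_items(sample_post)))
```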
display_url | String | URL of the media to display to clients. Example: “display_url”:“pic.x.com/rJC5Pxsu” |
expanded_url | String | An expanded version of display_url. Links to the media display page. Example: “expanded_url”: “http://x.com/yunorno/status/114080493036773378/photo/1” |
id | Int64 | ID of the media expressed as a 64-bit integer. Example: “id”:114080493040967680 |
id_str | String | ID of the media expressed as a string. Example: “id_str”:“114080493040967680” |
indices | Array of Int | An array of integers indicating the offsets within the Post text where the URL begins and ends. The first integer represents the location of the first character of the URL in the Post text. The second integer represents the location of the first non-URL character occurring after the URL (or the end of the string if the URL is the last part of the Post text). Example: “indices”:[15,35] |
media_url | String | An http:// URL pointing directly to the uploaded media file. Example: “media_url”:“http://pbs.twimg.com/media/DOhM30VVwAEpIHq.jpg” For media in direct messages, media_url is the same https URL as media_url_https and must be accessed by signing a request with the user’s access token using OAuth 1.0A. It is not possible to access images via an authenticated x.com session. Please visit this page to learn how to account for this recent change. You cannot directly embed these images in a web page. See Photo Media URL formatting for how to format a photo’s URL, such as media_url_https, based on the available sizes. |
media_url_https | String | An https:// URL pointing directly to the uploaded media file, for embedding on https pages. Example: “media_url_https”:“https://p.twimg.com/AZVLmp-CIAAbkyy.jpg” For media in direct messages, media_url_https must be accessed by signing a request with the user’s access token using OAuth 1.0A. It is not possible to access images via an authenticated x.com session. Please visit this page to learn how to account for this recent change. You cannot directly embed these images in a web page. See Photo Media URL formatting for how to format a photo’s URL, such as media_url_https, based on the available sizes. |
sizes | Size Object | An object showing available sizes for the media file. Example: “sizes”: { “thumb”: { “h”: 150, “resize”: “crop”, “w”: 150 }, “large”: { “h”: 1366, “resize”: “fit”, “w”: 2048 }, “medium”: { “h”: 800, “resize”: “fit”, “w”: 1200 }, “small”: { “h”: 454, “resize”: “fit”, “w”: 680 } } See Photo Media URL formatting for how to format a photo’s URL, such as media_url_https, based on the available sizes. |
source_status_id | Int64 | Nullable. For Posts containing media that was originally associated with a different Post, this ID points to the original Post. Example: “source_status_id”: 205282515685081088 |
source_status_id_str | String | Nullable. For Posts containing media that was originally associated with a different Post, this string-based ID points to the original Post. Example: “source_status_id_str”: “205282515685081088” |
type | String | Type of uploaded media. Possible types include photo, video, and animated_gif. Example: “type”:“photo” |
url | String | Wrapped URL for the media link. This corresponds with the URL embedded directly into the raw Post text, and the values for the indices parameter. Example:“url”:“http://t.co/rJC5Pxsu” |
Field | Type | Description |
thumb | Size Object | Information for a thumbnail-sized version of the media. Example: “thumb”:{“h”:150, “resize”:“crop”, “w”:150} Thumbnail-sized photo media will be limited to fill a 150x150 boundary and cropped. |
large | Size Object | Information for a large-sized version of the media. Example: “large”:{“h”:1366, “resize”:“fit”, “w”:2048} Large-sized photo media will be limited to fit within a 2048x2048 boundary. |
medium | Size Object | Information for a medium-sized version of the media. Example: “medium”:{“h”:800, “resize”:“fit”, “w”:1200} Medium-sized photo media will be limited to fit within a 1200x1200 boundary. |
small | Size Object | Information for a small-sized version of the media. Example: “small”:{“h”:454, “resize”:“fit”, “w”:680} Small-sized photo media will be limited to fit within a 680x680 boundary. |
Field | Type | Description |
w | Int | Width in pixels of this size. Example: “w”:150 |
h | Int | Height in pixels of this size. Example: “h”:150 |
resize | String | Resizing method used to obtain this size. A value of fit means that the media was resized to fit one dimension, keeping its native aspect ratio. A value of crop means that the media was cropped in order to fit a specific resolution. Example: “resize”:“crop” |
The media_url or media_url_https on their own can be loaded, which will result in the medium variant being loaded by default. It is preferable, however, to provide a fully formatted photo media URL when possible.
There are three parts of a photo media URL:
Base URL | The base URL is the media URL without the file extension. For example: “media_url_https”: “https://pbs.twimg.com/media/DOhM30VVwAEpIHq.jpg”, The base URL is then: https://pbs.twimg.com/media/DOhM30VVwAEpIHq |
Format | The format is the type of photo the image is formatted as. Possible formats are jpg or png, which is provided as the extension of the media URL. For example: “media_url_https”: “https://pbs.twimg.com/media/DOhM30VVwAEpIHq.jpg”, The format is then: jpg |
Name | The name is the field name of the size to load. For example: “sizes”: { “thumb”: { “h”: 150, “resize”: “crop”, “w”: 150 }, “large”: { “h”: 1366, “resize”: “fit”, “w”: 2048 }, “medium”: { “h”: 800, “resize”: “fit”, “w”: 1200 }, “small”: { “h”: 454, “resize”: “fit”, “w”: 680 } } The name when loading the large-sized photo would be: large |
Legacy format | The legacy format is deprecated. Photo media loads should all move to the modern format. <base_url>.<format>:<name> For example: https://pbs.twimg.com/media/DOhM30VVwAEpIHq.jpg:large |
Modern format | The modern format for loading photos was established at X in 2015 and has been the de facto standard since 2017. All photo media loads should move to this format. <base_url>?format=<format>&name=<name> For example: https://pbs.twimg.com/media/DOhM30VVwAEpIHq?format=jpg&name=large Note: the items in the query string for the photo media URL are in alphabetical order. If media loading were to add any additional query items, alphabetical ordering would continue to be necessary. For example, if there were a hypothetical new query item called preferred_format, it would go after format and name in the query string. |
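The three parts described above can be combined into a modern-format URL along these lines. This is a sketch only: the function name photo_url is hypothetical, and it assumes the media URL ends in a jpg or png extension as described in the Format row.

```python
from urllib.parse import urlencode


def photo_url(media_url_https: str, name: str = "large") -> str:
    """Build a modern-format photo URL: <base_url>?format=<format>&name=<name>."""
    # Split off the file extension: everything before the last '.' is the base URL.
    base_url, _, fmt = media_url_https.rpartition(".")
    if not base_url or fmt not in ("jpg", "png"):
        raise ValueError(f"unexpected photo media URL: {media_url_https!r}")
    # Query items are emitted in alphabetical order, as the note above requires.
    query = urlencode(sorted({"format": fmt, "name": name}.items()))
    return f"{base_url}?{query}"


example = photo_url("https://pbs.twimg.com/media/DOhM30VVwAEpIHq.jpg", "large")
```

Sorting the query items keeps the ordering rule intact if further items are ever added to the dictionary.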
The entities section will contain a urls array containing an object for every link included in the Post body, and include an empty array if no links are present.
The has:links Operator will match if there is at least one item in the array. The url: Operator is used to match on the expanded_url attribute. If you are using the Expanded URL enrichment, the url: Operator is used to match on the unwound.url (fully unwound URL) attribute. If you are using the Enhanced URL enrichment, the url_title: and url_description: Operators are used to match on the unwound.title and unwound.description attributes.
Field | Type | Description |
display_url | String | URL pasted/typed into Post. Example: “display_url”:“bit.ly/2so49n2” |
expanded_url | String | Expanded version of display_url . Example:“expanded_url”:“http://bit.ly/2so49n2” |
indices | Array of Int | An array of integers representing offsets within the Post text where the URL begins and ends. The first integer represents the location of the first character of the URL in the Post text. The second integer represents the location of the first non-URL character after the end of the URL. Example: “indices”:[30,53] |
url | String | Wrapped URL, corresponding to the value embedded directly into the raw Post text, and the values for the indices parameter. Example: “url”:“https://t.co/yzocNFvJuL” |
unwound attribute:
Field | Type | Description |
url | String | The fully unwound version of the link included in the Post. Example: “url”:“https://blog.x.com/en_us/topics/insights/2016/using-twitter-as-a-go-to-communication-channel-during-severe-weather-events.html” |
status | Int | Final HTTP status of the unwinding process, a ‘200’ indicating success. Example: 200 |
title | String | HTML title for the link. Example: “title”:“Using X as a ‘go-to’ communication channel during severe weather” |
description | String | HTML description for the link. Example: “description”:“Using X as a ‘go-to’ communication channel during severe weather” |
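One way to pick the most useful link from a urls entity is sketched below. It assumes the unwound object, when present, is nested inside the URL entity as the unwound.url naming above suggests; the helper name best_url and the sample values are hypothetical.

```python
from typing import Any, Dict, Optional


def best_url(url_entity: Dict[str, Any]) -> Optional[str]:
    """Prefer the fully unwound URL, then expanded_url, then the wrapped url."""
    unwound = url_entity.get("unwound") or {}
    # Only trust the unwound URL when the unwinding process ended with a 200.
    if unwound.get("status") == 200 and unwound.get("url"):
        return unwound["url"]
    return url_entity.get("expanded_url") or url_entity.get("url")


# Hypothetical entity with a successful unwind.
entity = {
    "url": "https://t.co/yzocNFvJuL",
    "expanded_url": "http://bit.ly/2so49n2",
    "unwound": {"url": "https://example.com/article", "status": 200},
}
chosen = best_url(entity)
```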
The entities section will contain a user_mentions array containing an object for every user mention included in the Post body, and include an empty array if no user mention is present.
The PowerTrack @ Operator is used to match on the screen_name attribute. The has:mentions Operator will match if there is at least one item in the array.
Field | Type | Description |
id | Int64 | ID of the mentioned user, as an integer. Example: “id”:6253282 |
id_str | String | ID of the mentioned user, as a string. Example: “id_str”:“6253282” |
indices | Array of Int | An array of integers representing the offsets within the Post text where the user reference begins and ends. The first integer represents the location of the ‘@’ character of the user mention. The second integer represents the location of the first non-screenname character following the user mention. Example: “indices”:[4,15] |
name | String | Display name of the referenced user. Example: “name”:“API” |
screen_name | String | Screen name of the referenced user. Example: “screen_name”:“api” |
The entities section will contain a symbols array containing an object for every $cashtag included in the Post body, and include an empty array if no symbol is present.
The PowerTrack $ Operator is used to match on the text attribute. The has:symbols Operator will match if there is at least one item in the array.
Field | Type | Description |
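indices | Array of Int | An array of integers indicating the offsets within the Post text where the symbol/cashtag begins and ends. The first integer represents the location of the $ character in the Post text string. The second integer represents the location of the first character after the cashtag. Therefore the difference between the two numbers will be the length of the cashtag name plus one (for the ‘$’ character). Example: “indices”:[12,17] |
text | String | Name of the cashtag, minus the leading ‘$’ character. Example: “text”:“twtr” |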
The entities section will contain a polls array containing a single poll object if the Post contains a poll. If no poll is included, there will be no polls array in the entities section.
Note that these Poll metadata are only available with the following Enterprise APIs:
Field | Type | Description |
options | Array of Option Object | An array of options, each having a poll position, and the text for that position. Example: “options”: [ { “position”: 1, “text”: “I read documentation once.” } ] |
end_datetime | String | Time stamp (UTC) of when poll ends. Example: “end_datetime”: “Thu May 25 22:20:27 +0000 2017” |
duration_minutes | Int | Duration of poll in minutes. Example: “duration_minutes”: 60 |
extended_entities JSON object. The extended_entities object contains a single media array of media objects (see the entities section for its data dictionary). No other entity types, such as hashtags and links, are included in the extended_entities section. The media object in the extended_entities section is identical in structure to the one included in the entities section.
A Post can have only one type of media attached to it. For photos, up to four photos can be attached. For videos and GIFs, one can be attached. Since the media type metadata in the extended_entities section correctly indicates the media type (‘photo’, ‘video’ or ‘animated_gif’), and supports up to 4 photos, it is the preferred metadata source for native media.
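A minimal sketch of that preference is shown below: read media from extended_entities first and fall back to entities only when it is missing. The helper name native_media and the sample Post are made up for illustration.

```python
from typing import Any, Dict, List


def native_media(post: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return media objects, preferring extended_entities over entities."""
    extended = (post.get("extended_entities") or {}).get("media")
    if extended:
        return extended
    # Fallback: entities lists at most one item and always reports 'photo'.
    return (post.get("entities") or {}).get("media") or []


# Hypothetical Post with two attached photos.
post = {
    "entities": {"media": [{"id_str": "1", "type": "photo"}]},
    "extended_entities": {
        "media": [
            {"id_str": "1", "type": "photo"},
            {"id_str": "2", "type": "photo"},
        ]
    },
}
media_types = [m["type"] for m in native_media(post)]
```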
entities section for this Post:
extended_entities section for this Post:
video_info object will be replaced with an additional_media_info object.
The additional_media_info will contain additional media info provided by the publisher, such as title, description and embeddable flag. Video content is made available only to X official clients when embeddable=false. In this case, all video URLs provided in the payload will be X-based, so the user can open the video in an X-owned property by clicking the link.
Here is an example of what the extended entities object will look like in this situation:
entities section that incorrectly has the type set to ‘photo’. Again, the extended_entities section is preferred for all native media types, including ‘video’ and ‘animated_gif’.
Attribute | Type | Description |
id | string | A unique IRI for the post. In more detail, “tag” is the scheme, “search.x.com” represents the domain for the scheme, and 2005 is when the scheme was derived. When storing Posts, this should be used as the unique identifier or primary key. “id”: “tag:search.x.com,2005:1050118621198921728” |
objectType | string | Type of object, always set to “activity” “objectType”: “activity” |
object | object | An object representing the post being posted or shared. For Retweets, this will contain an entire “activity”, with the pertinent fields described in this schema. For original posts, this will contain a “note” object, with the fields described here. “object”: { “objectType”: “note”, “id”: “object:search.x.com,2005:1050118621198921728”, “summary”: “To make room for more expression, we will now count all emojis as equal—including those with gender and skin t… https://t.co/MkGjXf9aXm”, “link”: “http://x.com/API/statuses/1050118621198921728”, “postedTime”: “2018-10-10T20:19:24.000Z” } |
long_object | object | An object representing the full text body if the post text extends beyond 140 characters. “long_object”: { “body”: “To make room for more expression, we will now count all emojis as equal—including those with gender and skin tone modifiers 👍🏻👍🏽👍🏿. This is now reflected in Twitter-Text, our Open Source library. \n\nUsing Twitter-Text? See the forum post for detail: https://t.co/Nx1XZmRCXA”, “display_text_range”: [ 0, 277 ], “twitter_entities”: { “hashtags”: [], “urls”: [ { “url”: “https://t.co/Nx1XZmRCXA”, “expanded_url”: “https://devcommunity.x.com/t/new-update-to-the-twitter-text-library-emoji-character-count/114607”, “display_url”: “devcommunity.com/t/new-update-t…”, “indices”: [ 254, 277 ] } ], “user_mentions”: [], “symbols”: [] } } |
display_text_range | array | if the post text extends beyond 140 characters. “display_text_range”: [ 0, 142 ] |
verb | string | The type of action being taken by the user. Posts, “post” Retweets, “share” Deleted Posts, “delete” The verb is the proper way to distinguish between a Tweet and a true Retweet. However, this only applies to true retweets, and not modified or quoted Tweets, which don’t use X Retweet functionality. For a description of AS verbs click here. For Deletes, note that only a limited number of fields will be included, as shown in the sample payload below. “verb”: “post” |
postedTime | date (ISO 8601) | The time the action occurred, e.g. the time the post was posted. “postedTime”: “2018-10-10T20:19:24.000Z” |
generator | object | An object representing the utility used to post the post. This will contain the name (“displayName”) and a link (“link”) for the source application generating the Post. “generator”: { “displayName”: “X Web Client”, “link”: “http://x.com” } |
provider | object | A JSON object representing the provider of the activity. This will contain an objectType (“service”), the name of the provider (“displayName”), and a link to the provider’s website (“link”). “provider”: { “objectType”: “service”, “displayName”: “X”, “link”: “http://www.x.com” } |
link | string | A Permalink for the post. “link”: “http://x.com/API/statuses/1050118621198921728” |
body | string | The post text. In Retweets, note that X modifies the value of the body at the root level by adding “RT @username” at the beginning, and by truncating the original text and adding an ellipsis at the end. Thus, for Retweets, your app should look at the object.body to ensure that it is extracting the non-modified text of the original Post (being retweeted). “body”: “With Cardiff, Crystal Palace, and Hull City joining the EPL from the Championship it will be a great relegation battle at the end.” |
display_text_range | array | Describes the range of characters within the body text that indicates the displayed Post. Posts with leading @mentions will start at more than 0, and Posts with attached media or that extend beyond 140 characters will indicate the display_text_range in the long_object. “display_text_range”: [ 14, 42 ] or “long_object”: { “display_text_range”: [ 0, 277 ] … } |
actor | object | An object representing the X user who posted. The Actor Object refers to an X User, and contains all metadata relevant to that user. See actor object details |
inReplyTo | object | A JSON object referring to the Post being replied to, if applicable. Contains a link to the Post. “inReplyTo”: { “link”: “http:\/\/x.com\/GOP\/statuses\/349573991561838593” } |
location | object | A JSON object representing the X “Place” where the post was created. This is an object passed through from the X platform. See location object |
twitter_entities | object | The entities object from X’s data format which contains lists of urls, mentions and hashtags. Please reference the X documentation on Entities here Note that in Retweets, X may truncate the values of entities that it extracts at the root level. So, for Retweets, your app should look at object.twitter_entities to ensure that you are using non-truncated values. See twitter_entities object details |
twitter_extended_entities | object | An object from X’s native data format containing “media”. This will be present for any post where the twitter_entities object has data present in the “media” field, and will include multiple photos where present in the post. Note that this is the correct location to retrieve media information for multi-photo posts. Multiple photos are represented by comma-separated JSON objects within the “media” array. See twitter_extended_entities object details |
gnip | object | An object added to the activity payload to indicate the matching rules, and added enriched data based on enrichments active on the stream or product. See gnip object details |
edit_history | Object | Unique identifiers indicating all versions of a Post. For Posts with no edits, there will be one ID. For Posts with an edit history, there will be multiple IDs, arranged in ascending order reflecting the order of edits, with the most recent version in the last position of the array. The Post IDs can be used to hydrate and view previous versions of a Post. Example: “edit_history”: { “initial_tweet_id”: “1283764123”, “edit_tweet_ids”: [“1283764123”, “1394263866”] } |
edit_controls | Object | When present, indicates how long a Post remains editable and the number of remaining edits. Posts are only editable for the first 30 minutes after creation and can be edited up to five times. The Post IDs can be used to hydrate and view previous versions of a Post. Example: “edit_controls”: { “editable_until_ms”: 123, “edits_remaining”: 3 } |
editable | Boolean | When present, indicates if a Post was eligible for edit when published. This field is not dynamic and won’t toggle from True to False when a Post reaches its editable time limit, or maximum number of edits. The following Post features will cause this field to be false: * Post is promoted * Post has a poll * Post is a non-self-thread reply * Post is a retweet (note that Quote Tweets are eligible for edit) * Post is nullcast * Community Post * Superfollow Post * Collaborative Post |
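Two of the rules in this table lend themselves to a short sketch: reading the unmodified text of a Retweet from object.body, and taking the most recent version ID from edit_history. The helper names and the sample activity below are hypothetical, and the payload is trimmed to the fields being read.

```python
from typing import Any, Dict, Optional


def original_body(activity: Dict[str, Any]) -> str:
    """Return the unmodified Post text, looking inside 'object' for Retweets."""
    if activity.get("verb") == "share":
        # For Retweets the root-level body is prefixed with "RT @username"
        # and may be truncated, so read the nested activity instead.
        nested = activity.get("object") or {}
        return nested.get("body", activity.get("body", ""))
    return activity.get("body", "")


def latest_version_id(activity: Dict[str, Any]) -> Optional[str]:
    """Return the ID of the most recent version of a Post, if listed."""
    ids = (activity.get("edit_history") or {}).get("edit_tweet_ids") or []
    # IDs are in ascending edit order, so the last one is the newest.
    return ids[-1] if ids else None


# Hypothetical Retweet activity.
retweet = {
    "verb": "share",
    "body": "RT @api: With Cardiff, Crystal Palace, and Hull City joining...",
    "object": {"verb": "post", "body": "With Cardiff, Crystal Palace, and Hull City joining the EPL"},
    "edit_history": {"initial_tweet_id": "1283764123", "edit_tweet_ids": ["1283764123", "1394263866"]},
}
text = original_body(retweet)
newest = latest_version_id(retweet)
```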
Attribute | Type | Description |
---|---|---|
twitter_lang | string | |
favoritesCount | int | Nullable. Indicates approximately how many times this Post has been liked by X users. “favoritesCount”:298 |
retweetCount | int | Number of times this Post has been retweeted. Example: “retweetCount”:153 |
Field | Type | Description |
geo | object | Point location where the Post was created. |
twitter_filter_level | string | Deprecated field left in for non breaking change |
{ "id": "tag:search.x.com,2005:222222222222", "objectType": "activity", "verb": "post", "body": "Quoting a Tweet: https://t.co/mxiFJ59FlB", "actor": { "displayName": "TheQuoter2" }, "object": { "objectType": "note", "id": "object:search.x.com,2005:111111111", "summary": "https://t.co/mxiFJ59FlB" }, "twitter_entities": {}, "twitter_extended_entities": {}, "gnip": {}, "twitter_quoted_status": { "id": "tag:search.x.com,2005:111111111", "objectType": "activity", "verb": "post", "body": "console.log('Happy birthday, JavaScript!');", "actor": { "displayName": "TheOriginalTweeter" }, "object": { "objectType": "note", "id": "object:search.x.com,2005:111111111" }, "twitter_entities": {} } }
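As a rough illustration of reading this payload, the sketch below pulls the quoting text and the quoted Post's text and author out of an activity. It relies only on the twitter_quoted_status, body and actor.displayName keys visible in the example above; the helper name quoted_summary is made up.

```python
import json
from typing import Any, Dict, Optional

# Trimmed copy of the Quote Tweet payload shown above.
PAYLOAD = """
{
  "id": "tag:search.x.com,2005:222222222222",
  "verb": "post",
  "body": "Quoting a Tweet: https://t.co/mxiFJ59FlB",
  "actor": {"displayName": "TheQuoter2"},
  "twitter_quoted_status": {
    "id": "tag:search.x.com,2005:111111111",
    "verb": "post",
    "body": "console.log('Happy birthday, JavaScript!');",
    "actor": {"displayName": "TheOriginalTweeter"}
  }
}
"""


def quoted_summary(activity: Dict[str, Any]) -> Optional[Dict[str, str]]:
    """Return quoting and quoted text, or None if nothing is quoted."""
    quoted = activity.get("twitter_quoted_status")
    if not quoted:
        return None
    return {
        "quoting_body": activity.get("body", ""),
        "quoted_body": quoted.get("body", ""),
        "quoted_author": (quoted.get("actor") or {}).get("displayName", ""),
    }


summary = quoted_summary(json.loads(PAYLOAD))
```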
Retweeted Quote Tweet:
Attribute | Type | Description |
---|---|---|
objectType | string | Example: “objectType”: “person” |
id | string | The string representation of the unique identifier for this author. Example: “id:x.com:2244994945” |
link | string | Example: “link”: “http://www.x.com/XDevelopers” |
displayName | String | The name of the user, as they’ve defined it. Not necessarily a person’s name. Typically capped at 50 characters, but subject to change. Example: “displayName”: “XDevelopers” |
preferredUsername | string | The screen name, handle, or alias that this user identifies themselves with. Unique but subject to change. Use id as a user identifier whenever possible. Typically a maximum of 15 characters long, but some historical accounts may exist with longer names. Example:“preferredUsername”: “XDevelopers” |
location | object | Example: “location”: { “objectType”: “place”, “displayName”: “127.0.0.1” } |
links | array | Nullable. A URL provided by the user in association with their profile. Example: “links”: [ { “href”: “https://developer.x.com/en/community”, “rel”: “me” } ] |
summary | string | Nullable . The user-defined UTF-8 string describing their account. Example: “summary”: “The voice of the #XDevelopers team…“ |
protected | Boolean | When true, indicates that this user has chosen to protect their Posts. See About Public and Protected Posts. Example: “protected”: true |
verified | Boolean | When true, indicates that the user has a verified account. See Verified Accounts . Example: “verified”: false |
followersCount | Int | The number of followers this account currently has. Under certain conditions of duress, this field will temporarily indicate “0”. Example: “followersCount”: 21 |
friendsCount | Int | The number of users this account is following (AKA their “followings”). Under certain conditions of duress, this field will temporarily indicate “0”. Example: “friendsCount”: 32 |
listedCount | Int | The number of public lists that this user is a member of. Example: “listedCount”: 9274 |
favoritesCount | Int | The number of Posts this user has liked in the account’s lifetime. The native format spells this field favourites_count (British spelling) for historical reasons. Example: “favoritesCount”: 13 |
statusesCount | Int | The number of Posts (including retweets) issued by the user. Example: “statusesCount”: 42 |
postedTime | date | The UTC datetime that the user account was created on X. Example: “postedTime”: “2013-12-14T04:35:55.036Z” |
image | string | An HTTPS-based URL pointing to the user’s profile image. Example: “image”: “https://pbs.twimg.com/profile_images/1283786620521652229/lEODkLTh_normal.jpg” |
Field | Type | Description |
---|---|---|
utcOffset | null | Value will be set to null. Still available via GET account/settings |
twitterTimeZone | null | Value will be set to null. Still available via GET account/settings as tzinfo_name |
languages | null | Value will be set to null. Still available via GET account/settings as language |
Field | Type | Description |
---|---|---|
objectType | string | See here for more detailed information. Example: “objectType”: “place” |
displayName | string | The full name of the location. Example: “displayName”: “United States” |
name | string | Name of the location from X’s place JSON format. |
link | string | A link to the full X JSON representation of the place. “link”: “https://api.x.com/1.1/geo/id/27c45d804c777999.json” |
geo | object | The geo coordinates object from X. Either a polygon, or point. See geo |
countryCode | String | Shortened country code representing the country containing this place. Example: “countryCode”: “US” |
country | String | Name of the country containing this place. Example: “country”: “United States” |
Field | Type | Description |
address | object | Within profileLocation location object within the gnip object. Address of location derived by the profile geo enrichment. Level of granularity will vary. “address”: { “country”: “United States”, “countryCode”: “US”, “locality”: “Providence”, “region”: “Rhode Island”, “subRegion”: “Providence County” } |
geo | object | Within profileLocation location object within the gnip object. Centroid coordinates of the location derived by the profile geo enrichment. “geo”: { “coordinates”: [ -98.5, 39.76 ], “type”: “point” } |
Field | Type | Description |
matching_rules | array | Contains an array of matching rule objects which indicate the rule which the activity matches on. “matching_rules”: [ { “tag”: null, “id”: 1026514022567358500, “id_str”: “1026514022567358464” } ] |
urls | array | Contains an array of the links within the activity, and the expanded url metadata for the URL unwinding enrichment. “urls”: [ { “url”: “https://t.co/tGQqNxxyhU”, “expanded_url”: “https://www.youtube.com/channel/UCwUxW2CV2p5mzjMBqvqLzJA”, “expanded_status”: 200, “expanded_url_title”: “Birdys Daughter”, “expanded_url_description”: “Premium, single-origin, handpicked Jamaica Blue Mountain Coffee” } ] |
profileLocations | array of location objects | Contains the derived location object from the Profile Geo enrichment. “profileLocations”: [ { “address”: { “country”: “Canada”, “countryCode”: “CA”, “locality”: “Toronto”, “region”: “Ontario” }, “displayName”: “Toronto, Ontario, Canada”, “geo”: { “coordinates”: [ -79.4163, 43.70011 ], “type”: “point” }, “objectType”: “place” } ] |
twitter_extended_entities
has:videos, has:images, and has:media. These will match only on media content that was shared via X features. To match on other media hosted off of the X platform, you’ll want to use Operators that match on URL metadata.
So, before we dig into the Historical PowerTrack and Full-Archive Search product details, let’s take a tour of how X, as a product and platform, evolved over time.
X timeline
Below you will find a select timeline of X. Most of these X updates in some way fundamentally affected either user behavior, Post JSON contents, query Operators, or all three. Looking at X as an API platform, the following events in some way affected the JSON payloads that are used to encode Posts. In turn, those JSON details affect how X historical APIs match on them.
Note that this timeline list is generally precise and not exhaustive.
in_reply_to metadata.
retweet_status metadata.
lang: Operator, which is used to match Posts in a specified language. X provides a language classification service (supporting over 50 languages), and X APIs provide this metadata in the JSON that is generated for every Post. So, if a Post is written in Spanish, the “lang” JSON attribute is set to “es”, and a filter built with the lang:es clause will match only Posts classified as Spanish.
The timeline information can also help better interpret the Post data received. Say you were researching the sharing of content about the 2008 and 2012 Summer Olympics. If you applied only the is:retweet Operator to match on Retweets, no data would match in 2008. However, for 2012 there would likely be millions of Retweets. From this you could erroneously conclude that in 2008 Retweets were not a user convention, or that simply no one Retweeted about those Olympics. Since Retweets became a first-class object in 2009, you need to add a ”RT @” rule clause to help identify them in 2008.
Both Retweets and Post language classifying are examples of Post attributes with a long history and many product details. Below we will discuss more details of these and other attribute classes important to matching on and understanding X Data.
has:videos Operator, which matches on Posts with native videos, that clause will not match any Posts before 2015.
However, sharing of videos has been common on X long before 2015. Before then users shared links to videos hosted elsewhere, but in 2015, X built new ‘sharing video’ features directly into the platform. For finding these earlier Posts of interest, you would include a rule clause such as url:”youtube.com”.
Note, with the Search APIs, there are some examples of metadata being ‘backfilled’ as its index was rebuilt. One good example is the cashtag operator: when it was introduced in 2015, the Search index was rebuilt, and in the process the symbol entity was extracted from all Post bodies, including those from early 2006, when $ was used mainly for slang, as in “I hope it $oon!”.
The is:retweet Operator enables users to either include or exclude Retweets. If pulling data from before August 2009, users need two strategies for Retweet matching (or not matching). Before August 2009, the Post message itself needs to be checked, using exact phrase matching, for matches on the “RT @” pattern. For periods after August 2009, the is:retweet Operator is available.
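The two strategies can be sketched as a small rule builder that picks the Retweet clause from the date range being queried. This is an illustration under assumptions: the text only says "August 2009", so the cutover day used here (the 1st) is a placeholder, and the function names are made up. The OR and parentheses follow ordinary PowerTrack grouping syntax.

```python
from datetime import date
from typing import List

# Assumption: the text gives only "August 2009"; the exact day is a placeholder.
NATIVE_RETWEETS_START = date(2009, 8, 1)


def retweet_clauses(from_date: date, to_date: date) -> List[str]:
    """Return the rule clauses needed to match Retweets in a date range."""
    clauses = []
    if from_date < NATIVE_RETWEETS_START:
        # Before native Retweets, match the "RT @" convention as an exact phrase.
        clauses.append('"RT @"')
    if to_date >= NATIVE_RETWEETS_START:
        clauses.append("is:retweet")
    return clauses


def retweet_rule(keyword: str, from_date: date, to_date: date) -> str:
    """Combine a keyword with the Retweet clause(s) for the date range."""
    clauses = retweet_clauses(from_date, to_date)
    if len(clauses) == 1:
        return f"{keyword} {clauses[0]}"
    return f"{keyword} ({' OR '.join(clauses)})"


rule_2008 = retweet_rule("olympics", date(2008, 8, 8), date(2008, 8, 24))
rule_2012 = retweet_rule("olympics", date(2012, 7, 27), date(2012, 8, 12))
```

A range that spans the cutover gets both clauses joined with OR, so Retweets from either era are matched.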
lang: Operator is available for the entire post archive. With Historical PowerTrack, X’s language classification metadata is available in the archive beginning on March 26, 2013.