Please note:

We have released a new version of search Posts and Post counts in X API v2. We encourage you to review what’s new with X API v2. 

These endpoints have been updated to include Post edit metadata. Learn more about this metadata on the “Edit Posts” fundamentals page.

Overview

Enterprise

The enterprise APIs are available within our managed access levels only. To use these APIs, you must first set up an account with our enterprise sales team. To learn more see HERE.

You can view all of the X API search Post offerings HERE.

There are two enterprise search APIs:

  1. 30-Day Search API provides data from the previous 30 days.
  2. Full-Archive Search API provides complete and instant access to the full corpus of X data dating all the way back to the first Post in March 2006.

These RESTful APIs support a single query of up to 2,048 characters per request. Queries are written with the PowerTrack rule syntax; see Rules and filtering for more details. Users can specify any time period, to the granularity of a minute. However, each response is limited to the lesser of your specified maxResults or 31 days, and includes a next token to paginate to the next set of results. If time parameters are not specified, the API returns matching data from the 30 most recent days.

The enterprise search APIs provide low-latency, full-fidelity, query-based access to the Post archive with minute granularity. Post data is served in reverse chronological order, starting with the most recent Post that matches your query. Posts are available from the search API approximately 30 seconds after being published.

These search endpoints provide edited Post metadata. All objects for Posts created since September 29, 2022, include Post edit metadata, even if the Post was never edited. Each time a Post is edited, a new Post ID is created. A Post’s edit history is documented by an array of Post IDs, starting with the original ID.

These endpoints will always return the most recent edit, along with any edit history. Any Post collected after its 30-minute edit window will represent its final version. To learn more about Edit Post metadata, check out the Edit Posts fundamentals page.

Requests include a maxResults parameter that specifies the maximum number of Posts to return per API response. If more Posts match the query than this maximum, a next token is included in the response. These next tokens are used in subsequent requests to page through the entire set of Posts associated with the query.
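The paging loop described above can be sketched in Python. This is a minimal illustration rather than an official client: `send` is a hypothetical callable standing in for the HTTP request, and the name `next` for both the response field and the request parameter is an assumption based on the “next” token mentioned in this document.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_posts(
    send: Callable[[Dict], Dict],
    query: str,
    max_results: int = 500,
    from_date: Optional[str] = None,
    to_date: Optional[str] = None,
) -> Iterator[Dict]:
    """Yield every Post matching `query`, following next tokens page by page.

    `send` is a hypothetical callable that POSTs one JSON body to the search
    endpoint and returns the decoded JSON response.
    """
    body: Dict = {"query": query, "maxResults": max_results}
    if from_date:
        body["fromDate"] = from_date  # yyyymmddhhmm
    if to_date:
        body["toDate"] = to_date
    while True:
        page = send(dict(body))
        yield from page.get("results", [])
        token = page.get("next")  # assumed response field for the next token
        if not token:
            break
        body["next"] = token  # assumed request parameter name


# Stand-in for the real HTTP call: three Posts served two per page.
def fake_send(body: Dict) -> Dict:
    posts = [{"id": 1}, {"id": 2}, {"id": 3}]
    start = int(body.get("next", 0))
    end = start + body["maxResults"]
    page = {"results": posts[start:end]}
    if end < len(posts):
        page["next"] = str(end)
    return page


collected = list(iter_posts(fake_send, "from:TwitterDev lang:en", max_results=2))
print([post["id"] for post in collected])
```

Passing the transport in as a callable keeps the paging logic separate from authentication and error handling, and makes it easy to exercise without network access.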

These enterprise search APIs provide a counts endpoint that enables users to request the data volume associated with their query. 

Request types

The enterprise search APIs support two types of requests:

Search requests (data)

Search requests to the enterprise search APIs allow you to retrieve up to 500 results per response for a given timeframe, with the ability to paginate for additional data. Using the maxResults parameter, you can specify smaller page sizes for display use cases (allowing your user to request more results as needed) or larger page sizes (up to 500) for larger data pulls. The data is delivered in reverse chronological order and compliant at the time of delivery.

Counts requests (Post count)

Counts requests provide the ability to retrieve historical activity counts, which reflect the number of activities that match a given query during the requested timeframe. The response essentially provides a histogram of counts, bucketed by day, hour, or minute (the default bucket is hour). It’s important to note that counts results do not always reflect compliance events (e.g., Post deletes) that happen well after (7+ days) a Post is published; therefore, the counts metric may not always match the number of Posts returned by a data request for the same query.

Billing note: each request made against the data and counts endpoints, including pagination requests, is counted as a billed request. Therefore, if there are multiple pages of results for a single query, paging through X pages of results equates to X requests for billing.

Available operators

Enterprise search APIs support rules with up to 2,048 characters. The enterprise search APIs support the operators listed below. For detailed descriptions see HERE.

Matching on Post contents:
* keyword
* “quoted phrase”
* “keyword1 keyword2”~N
* #
* @
* $
* url:
* lang:

Matching on accounts of interest:
* from:
* to:
* retweets_of:

Post attributes:
* is:retweet
* has:mentions
* has:hashtags
* has:media
* has:videos
* has:images
* has:links
* has:symbols
* is:verified
* -is:nullcast (negation only operator)

Geospatial operators:
* bounding_box:[west_long south_lat east_long north_lat]
* point_radius:[lon lat radius]
* has:geo
* place:
* place_country:
* has:profile_geo
* profile_country:
* profile_region:
* profile_locality:

Notes: Do not embed or nest operators. For example, the quoted “#cats” will resolve to cats with the search APIs. The ‘lang:’ operator and all ‘is:’ and ‘has:’ operators cannot be used as standalone operators and must be combined with another clause (e.g. @XDevelopers has:links).
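The two constraints stated in this section (the 2,048-character limit and the rule against standalone ‘lang:’, ‘is:’ and ‘has:’ operators) can be checked before a request is sent. The sketch below is a rough client-side heuristic with a hypothetical function name; it splits on whitespace only and does not parse the full rule syntax, so the API remains the final authority.

```python
MAX_RULE_LENGTH = 2048
# Operators that the notes above say cannot stand alone.
NON_STANDALONE_PREFIXES = ("lang:", "is:", "has:")


def check_rule(rule: str) -> list:
    """Return a list of problems found in `rule` (empty if none were found)."""
    problems = []
    if len(rule) > MAX_RULE_LENGTH:
        problems.append(
            f"rule is {len(rule)} characters; the limit is {MAX_RULE_LENGTH}"
        )
    # Rough split on whitespace; drop grouping parentheses and negation.
    clauses = [clause.strip("()").lstrip("-") for clause in rule.split()]
    clauses = [clause for clause in clauses if clause and clause != "OR"]
    if clauses and all(c.startswith(NON_STANDALONE_PREFIXES) for c in clauses):
        problems.append(
            "lang:, is: and has: operators must be combined with another clause"
        )
    return problems


print(check_rule("@XDevelopers has:links"))
print(check_rule("has:links lang:en"))
```

The first call prints an empty list; the second reports the standalone-operator problem.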

Search APIs use a limited set of operators due to tokenization/matching functionality. Enterprise real-time and batched historical APIs provide additional operators. See HERE for more details.

For more details, please see the Getting started with operators guide.

Data availability / important dates

When using the Full-Archive search API, keep in mind that the X platform has continued to evolve since 2006. As new features were added, new metadata was added to the underlying JSON objects. For that reason, it is important to understand when the Post attributes that search operators match on were introduced. Below are some of the more fundamental ‘born on’ dates of important groups of metadata. To learn more about when Post attributes were first introduced, see this guide.

  • First Post: 3/21/2006
  • First Native Retweets: 11/6/2009
  • First Geo-tagged Post: 11/19/2009
  • URLs first indexed for filtering: 8/27/2011
  • Enhanced URL expansion metadata (website titles and descriptions): 12/1/2014
  • Profile Geo enrichment metadata and filtering: 2/17/2015
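One way to use these dates is to flag queries whose time window starts before the metadata an operator relies on existed. The sketch below takes the dates from the list above; the pairing of each date with a specific operator is an assumption made for illustration, and the function name is hypothetical.

```python
from datetime import date

# Dates come from the list above. Which operator depends on which metadata
# is an assumption made for this sketch.
BORN_ON = {
    "is:retweet": date(2009, 11, 6),         # first native Retweets
    "has:geo": date(2009, 11, 19),           # first geo-tagged Post
    "url:": date(2011, 8, 27),               # URLs first indexed for filtering
    "profile_country:": date(2015, 2, 17),   # Profile Geo enrichment
}


def operators_predating_metadata(rule: str, from_date: date) -> list:
    """Return operators in `rule` whose metadata did not yet exist on `from_date`."""
    return [
        operator
        for operator, born in BORN_ON.items()
        if operator in rule and from_date < born
    ]


print(operators_predating_metadata("snow has:geo", date(2008, 1, 1)))
print(operators_predating_metadata("snow has:geo", date(2016, 1, 1)))
```

A non-empty result suggests that part of the requested window cannot match on that operator, which may explain unexpectedly low volumes for older time periods.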

Data Updates and Mutability

With the enterprise search APIs, some of the data within a Post is mutable, i.e. can be updated or changed after initial archival.

This mutable data falls into two categories:

  • User object metadata:
    • User’s @handle (numeric ID does not ever change)
    • Bio description
    • Counts: statuses, followers, friends, favorites, lists
    • Profile location
    • Other details such as time zone and language
  • Post statistics - i.e. anything that can be changed on the platform by user actions (examples below):
    • Favorites count
    • Retweet count

In most of these cases, the search APIs will return data as it exists on the platform at query-time, rather than Post generation time. However, in the case of queries using select operators (e.g. from, to, @, is:verified), this may not be the case. Data is updated in our index on a regular basis, with an increased frequency for most recent timeframes. As a result, in some cases, the data returned may not exactly match the current data as displayed on X.com, but matches data at the time it was last indexed.

Note, this issue of inconsistency only applies to queries where the operator applies to mutable data. One example is filtering for usernames, and the best workaround would be to use user numeric IDs rather than @handles for these queries.

Single vs. Multi-threaded Requests

Each customer has a defined rate limit for their search endpoint. The default per-minute rate limit for Full-Archive search is 120 requests per minute, for an average of 2 queries per second (QPS). This average QPS means that, in theory, 2 requests can be made to the API every second. Given the pagination feature of the product, if a one-year query has one million Posts associated with it, spread evenly over the year, at least 2,000 requests would be required (assuming a ‘maxResults’ of 500) to receive all the data. Assuming it takes two seconds per response, that is 4,000 seconds (or just over an hour) to pull all of that data serially/sequentially through a single thread (one request at a time, each using the prior response’s “next” token). Not bad!

Now consider the situation where twelve parallel threads are used to receive data. Assuming an even distribution of the one million Posts over the one-year period, you could split the requests into twelve parallel threads (multi-threaded) and utilize more of the per-second rate limit for the single “job”. In other words, you could run one thread per-month you are interested in and by doing so, data could be retrieved 12x as fast (or ~6 minutes).
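The one-thread-per-month approach can be sketched as follows. `fetch_window` is a hypothetical callable that pages through a single time window and returns its Posts. The windows use the start of the following month as `toDate`, which assumes `toDate` is exclusive; if it is inclusive, Posts from the boundary minute would appear twice and should be de-duplicated by ID.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime


def month_windows(year: int) -> list:
    """Return twelve (fromDate, toDate) pairs in yyyymmddhhmm format."""
    starts = [datetime(year, month, 1) for month in range(1, 13)]
    starts.append(datetime(year + 1, 1, 1))
    fmt = "%Y%m%d%H%M"
    return [(a.strftime(fmt), b.strftime(fmt)) for a, b in zip(starts, starts[1:])]


def fetch_year(fetch_window, year: int, threads: int = 12) -> list:
    """Run `fetch_window(from_date, to_date)` for each month in parallel.

    `fetch_window` is a hypothetical callable that pages through one window
    and returns a list of Posts.
    """
    windows = month_windows(year)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() preserves the order of the windows, so January comes first.
        pages = pool.map(lambda window: fetch_window(*window), windows)
        return [post for page in pages for post in page]


print(month_windows(2018)[0])
```

All threads share the same endpoint rate limit, so the thread count should be chosen with that limit in mind.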

This multi-threaded example applies equally well to the counts endpoint. For example, if you wanted to receive Post counts for a two-year period, you could make a single-threaded request and page back through the counts 31 days at a time. Assuming it takes 2 seconds per response, it would take approximately 48 seconds to make the 24 API requests and retrieve the entire set of counts. However, you also have the option to make multiple one-month requests at a time. When making 12 requests per second, the entire set of counts could be retrieved in approximately 2 seconds.
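The arithmetic behind these estimates can be reproduced with a short calculation. The function below is only a back-of-the-envelope sketch with a hypothetical name; it also reports the floor imposed by the per-minute rate limit, since a thread count alone does not determine how fast requests may be sent.

```python
import math


def pull_estimates(
    total_posts: int,
    max_results: int = 500,
    seconds_per_response: float = 2.0,
    threads: int = 1,
    requests_per_minute: int = 120,
):
    """Return (requests needed, latency-bound seconds, rate-limit-bound seconds)."""
    requests = math.ceil(total_posts / max_results)
    # Each thread waits for one response before sending its next request.
    latency_bound = requests * seconds_per_response / threads
    # The per-minute rate limit caps throughput regardless of thread count.
    rate_limit_bound = requests * 60 / requests_per_minute
    return requests, latency_bound, rate_limit_bound


print(pull_estimates(1_000_000))              # (2000, 4000.0, 1000.0)
print(pull_estimates(1_000_000, threads=12))
```

With twelve threads the latency-bound figure drops to roughly 333 seconds (about 6 minutes), while the rate-limit-bound figure stays at 1,000 seconds under the default limit of 120 requests per minute. The faster of the two is only reachable if your account's rate limit allows that request rate.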

Retry Logic

If you experience a 503 error with the enterprise search APIs, it is likely a transient error and can be resolved by retrying the request a short time later.

If the request fails 4 times in a row, and you have waited at least 10 minutes between failures, use the following steps to troubleshoot:

  • Retry the request after reducing the amount of time it covers. Repeat this down to a 6-hour time window if unsuccessful.
  • If you are ORing a large number of terms together, split them into separate rules and retry each individually.
  • If you are using a large number of exclusions in your rule, reduce the number of negated terms in the rule and retry.
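A retry loop along these lines might look like the sketch below. `send` is a hypothetical callable that performs one request and returns a status code and decoded body. The default of four attempts and a 10-minute wait mirror the thresholds mentioned above; a shorter wait may be reasonable for the first retries.

```python
import time
from typing import Callable, Tuple


def send_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 4,
    wait_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call `send` until it returns a status other than 503 or attempts run out."""
    status, payload = send()
    for _ in range(max_attempts - 1):
        if status != 503:
            break
        sleep(wait_seconds)  # wait before trying again
        status, payload = send()
    return status, payload


# Stand-in for the real request: two transient failures, then success.
responses = iter([(503, {}), (503, {}), (200, {"results": []})])
waits = []
status, payload = send_with_retry(lambda: next(responses), sleep=waits.append)
print(status, len(waits))
```

If the final status is still 503, the caller can move on to the troubleshooting steps above, for example by narrowing the time window or splitting the rule.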

Quick start

Getting started with enterprise Search Posts: 30-Day API

The enterprise Search Posts: 30-Day API provides you with Posts posted within the last 30 days. Posts are matched and sent back to you based on the query you specify in your request. A query is a rule in which you define what the Post you get back should contain. In this tutorial, we will search for Posts originating from the X account @XDevelopers in English.

The Posts you get back in your payload can be in a data format, which provides you with the full Post payload, or it can be in a counts format which gives you numerical count data of matched Posts. We will be using cURL to make requests to the data and counts endpoints.

You will need the following:

Accessing the data endpoint

The data endpoint will provide us with the full Post payload of matched Posts. We will use the from: and lang: operators to find Posts originating from @XDevelopers in English. For more operators click here.

cURL is a command-line tool for getting or sending files using the URL syntax.

Copy the following cURL request into your command line after making changes to the following:

  • Username <USERNAME> e.g. email@domain.com

  • Account name <ACCOUNT-NAME> e.g. john-doe

  • Label <LABEL> e.g. prod

  • fromDate and toDate e.g. "fromDate":"201811010000", "toDate":"201811122359"

After sending your request, you will be prompted for your password.

curl -X POST -u<USERNAME> "https://gnip-api.x.com/search/30day/accounts/<ACCOUNT-NAME>/<LABEL>.json" -d '{"query":"from:TwitterDev lang:en","maxResults":"500","fromDate":"<yyyymmddhhmm>","toDate":"<yyyymmddhhmm>"}'
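For readers who prefer a script to the command line, the same request can be assembled with Python's standard library. This is a sketch that mirrors the cURL command rather than an official client: the function name is hypothetical, the `Content-Type` header is an assumption (the cURL command does not set one explicitly), and `maxResults` is sent as a number here rather than as the quoted string used above.

```python
import base64
import json
import urllib.request


def build_search_request(username, password, account_name, label, body):
    """Build (but do not send) a POST request mirroring the cURL command above."""
    url = f"https://gnip-api.x.com/search/30day/accounts/{account_name}/{label}.json"
    credentials = base64.b64encode(
        f"{username}:{password}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Basic {credentials}",  # same scheme as curl -u
            "Content-Type": "application/json",       # assumed; not set by the cURL example
        },
        method="POST",
    )


request = build_search_request(
    "email@domain.com",
    "<PASSWORD>",
    "john-doe",
    "prod",
    {
        "query": "from:TwitterDev lang:en",
        "maxResults": 500,
        "fromDate": "201811010000",
        "toDate": "201811122359",
    },
)
print(request.get_method(), request.full_url)
# To send it: json.load(urllib.request.urlopen(request))
```

Building the request separately from sending it makes it possible to inspect the URL and body before any billed request is made.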

Data endpoint response payload

The payload you get back from your API request will appear in JSON format, as shown below.

{
	"results": [
		{
			"created_at": "Fri Nov 02 17:18:31 +0000 2018",
			"id": 1058408022936977409,
			"id_str": "1058408022936977409",
			"text": "RT @harmophone: \"The innovative crowdsourcing that the Tagboard, Twitter and TEGNA collaboration enables is surfacing locally relevant conv…",
			"source": "<a href=\"http:\/\/twitter.com\" rel=\"nofollow\">Twitter Web Client<\/a>",
			"truncated": false,
			"in_reply_to_status_id": null,
			"in_reply_to_status_id_str": null,
			"in_reply_to_user_id": null,
			"in_reply_to_user_id_str": null,
			"in_reply_to_screen_name": null,
			"user": {
				"id": 2244994945,
				"id_str": "2244994945",
				"name": "Twitter Dev",
				"screen_name": "TwitterDev",
				"location": "Internet",
				"url": "https:\/\/developer.twitter.com\/",
				"description": "Your official source for Twitter Platform news, updates & events. Need technical help? Visit https:\/\/twittercommunity.com\/ ⌨️ #TapIntoTwitter",
				"translator_type": "null",
				"protected": false,
				"verified": true,
				"followers_count": 503828,
				"friends_count": 1477,
				"listed_count": 1437,
				"favourites_count": 2199,
				"statuses_count": 3380,
				"created_at": "Sat Dec 14 04:35:55 +0000 2013",
				"utc_offset": null,
				"time_zone": null,
				"geo_enabled": true,
				"lang": "en",
				"contributors_enabled": false,
				"is_translator": false,
				"profile_background_color": "null",
				"profile_background_image_url": "null",
				"profile_background_image_url_https": "null",
				"profile_background_tile": null,
				"profile_link_color": "null",
				"profile_sidebar_border_color": "null",
				"profile_sidebar_fill_color": "null",
				"profile_text_color": "null",
				"profile_use_background_image": null,
				"profile_image_url": "null",
				"profile_image_url_https": "https:\/\/pbs.twimg.com\/profile_images\/880136122604507136\/xHrnqf1T_normal.jpg",
				"profile_banner_url": "https:\/\/pbs.twimg.com\/profile_banners\/2244994945\/1498675817",
				"default_profile": false,
				"default_profile_image": false,
				"following": null,
				"follow_request_sent": null,
				"notifications": null
			},
			"geo": null,
			"coordinates": null,
			"place": null,
			"contributors": null,
			"retweeted_status": {
				"created_at": "Tue Oct 30 21:30:25 +0000 2018",
				"id": 1057384253116289025,
				"id_str": "1057384253116289025",
				"text": "\"The innovative crowdsourcing that the Tagboard, Twitter and TEGNA collaboration enables is surfacing locally relev… https:\/\/t.co\/w46U5TRTzQ",
				"source": "<a href=\"http:\/\/twitter.com\" rel=\"nofollow\">Twitter Web Client<\/a>",
				"truncated": true,
				"in_reply_to_status_id": null,
				"in_reply_to_status_id_str": null,
				"in_reply_to_user_id": null,
				"in_reply_to_user_id_str": null,
				"in_reply_to_screen_name": null,
				"user": {
					"id": 175187944,
					"id_str": "175187944",
					"name": "Tyler Singletary",
					"screen_name": "harmophone",
					"location": "San Francisco, CA",
					"url": "http:\/\/medium.com\/@harmophone",
					"description": "SVP Product at @Tagboard. Did some Data, biz, and product @Klout & for @LithiumTech; @BBI board member; @Insightpool advisor. World's worst whiteboarder.",
					"translator_type": "null",
					"protected": false,
					"verified": false,
					"followers_count": 1982,
					"friends_count": 1877,
					"listed_count": 245,
					"favourites_count": 23743,
					"statuses_count": 12708,
					"created_at": "Thu Aug 05 22:59:29 +0000 2010",
					"utc_offset": null,
					"time_zone": null,
					"geo_enabled": false,
					"lang": "en",
					"contributors_enabled": false,
					"is_translator": false,
					"profile_background_color": "null",
					"profile_background_image_url": "null",
					"profile_background_image_url_https": "null",
					"profile_background_tile": null,
					"profile_link_color": "null",
					"profile_sidebar_border_color": "null",
					"profile_sidebar_fill_color": "null",
					"profile_text_color": "null",
					"profile_use_background_image": null,
					"profile_image_url": "null",
					"profile_image_url_https": "https:\/\/pbs.twimg.com\/profile_images\/719985428632240128\/WYFHcK-m_normal.jpg",
					"profile_banner_url": "https:\/\/pbs.twimg.com\/profile_banners\/175187944\/1398653841",
					"default_profile": false,
					"default_profile_image": false,
					"following": null,
					"follow_request_sent": null,
					"notifications": null
				},
				"geo": null,
				"coordinates": null,
				"place": null,
				"contributors": null,
				"is_quote_status": false,
				"extended_tweet": {
					"full_text": "\"The innovative crowdsourcing that the Tagboard, Twitter and TEGNA collaboration enables is surfacing locally relevant conversations in real-time and enabling voters to ask questions during debates,”  -- @adamostrow, @TEGNA\nLearn More: https:\/\/t.co\/ivAFtanfje",
					"display_text_range": [
						0,
						259
					],
					"entities": {
						"hashtags": [],
						"urls": [
							{
								"url": "https:\/\/t.co\/ivAFtanfje",
								"expanded_url": "https:\/\/blog.tagboard.com\/twitter-and-tagboard-collaborate-to-bring-best-election-content-to-news-outlets-with-tagboard-e85fc864bcf4",
								"display_url": "blog.tagboard.com\/twitter-and-ta…",
								"unwound": {
									"url": "https:\/\/blog.tagboard.com\/twitter-and-tagboard-collaborate-to-bring-best-election-content-to-news-outlets-with-tagboard-e85fc864bcf4",
									"status": 200,
									"title": "Twitter and Tagboard Collaborate to Bring Best Election Content to News Outlets With Tagboard…",
									"description": "By Tyler Singletary, Head of Product, Tagboard"
								},
								"indices": [
									236,
									259
								]
							}
						],
						"user_mentions": [
							{
								"screen_name": "adamostrow",
								"name": "Adam Ostrow",
								"id": 5695942,
								"id_str": "5695942",
								"indices": [
									204,
									215
								]
							},
							{
								"screen_name": "TEGNA",
								"name": "TEGNA",
								"id": 34123003,
								"id_str": "34123003",
								"indices": [
									217,
									223
								]
							}
						],
						"symbols": []
					}
				},
				"quote_count": 0,
				"reply_count": 1,
				"retweet_count": 6,
				"favorite_count": 19,
				"entities": {
					"hashtags": [],
					"urls": [
						{
							"url": "https:\/\/t.co\/w46U5TRTzQ",
							"expanded_url": "https:\/\/twitter.com\/i\/web\/status\/1057384253116289025",
							"display_url": "twitter.com\/i\/web\/status\/1…",
							"indices": [
								117,
								140
							]
						}
					],
					"user_mentions": [],
					"symbols": []
				},
				"favorited": false,
				"retweeted": false,
				"possibly_sensitive": false,
				"filter_level": "low",
				"lang": "en"
			},
			"is_quote_status": false,
			"quote_count": 0,
			"reply_count": 0,
			"retweet_count": 0,
			"favorite_count": 0,
			"entities": {
				"hashtags": [],
				"urls": [],
				"user_mentions": [
					{
						"screen_name": "harmophone",
						"name": "Tyler Singletary",
						"id": 175187944,
						"id_str": "175187944",
						"indices": [
							3,
							14
						]
					}
				],
				"symbols": []
			},
			"favorited": false,
			"retweeted": false,
			"filter_level": "low",
			"lang": "en",
			"matching_rules": [
				{
					"tag": null
				}
			]
		}
	],
	"requestParameters": {
		"maxResults": 100,
		"fromDate": "201811010000",
		"toDate": "201811060000"
	}
}
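In the payload above, the top-level `text` of the Retweet is cut off, while the complete text sits under `retweeted_status.extended_tweet.full_text`. A small helper can pick the longest available text. This sketch relies only on the field names visible in the sample payload; other payload shapes may need additional cases.

```python
def display_text(post: dict) -> str:
    """Return the fullest text available for a Post shaped like the sample above."""
    # For a Retweet, the original Post carries the untruncated content.
    source = post.get("retweeted_status", post)
    extended = source.get("extended_tweet")
    if source.get("truncated") and extended:
        return extended["full_text"]
    return source["text"]


# Trimmed-down version of the sample payload.
sample = {
    "text": "RT @harmophone: \"The innovative crowdsourcing that the Tagboard…",
    "truncated": False,
    "retweeted_status": {
        "text": "\"The innovative crowdsourcing … https://t.co/w46U5TRTzQ",
        "truncated": True,
        "extended_tweet": {"full_text": "\"The innovative crowdsourcing … Learn More"},
    },
}
print(display_text(sample))
```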

Accessing the counts endpoint

With the counts endpoint, we’ll retrieve the number of Posts originating from the @XDevelopers account in English grouped by day.

cURL is a command-line tool for getting or sending files using the URL syntax.

Copy the following cURL request into your command line after making changes to the following:

  • Username <USERNAME> e.g. email@domain.com

  • Account name <ACCOUNT-NAME> e.g. john-doe

  • Label <LABEL> e.g. prod

  • fromDate and toDate e.g. "fromDate":"201811010000", "toDate":"201811122359"

After sending your request, you will be prompted for your password.

curl -X POST -u<USERNAME> "https://gnip-api.x.com/search/30day/accounts/<ACCOUNT-NAME>/<LABEL>/counts.json" -d '{"query":"from:TwitterDev lang:en","fromDate":"<yyyymmddhhmm>","toDate":"<yyyymmddhhmm>","bucket":"day"}'

Counts endpoint response payload

The payload you get back from your API request will appear in JSON format, as shown below.

{
	"results": [
		{
			"timePeriod": "201811010000",
			"count": 0
		},
		{
			"timePeriod": "201811020000",
			"count": 1
		},
		{
			"timePeriod": "201811030000",
			"count": 0
		},
		{
			"timePeriod": "201811040000",
			"count": 0
		},
		{
			"timePeriod": "201811050000",
			"count": 0
		}
	],
	"totalCount": 1,
	"requestParameters": {
		"bucket": "day",
		"fromDate": "201811010000",
		"toDate": "201811060000"
	}
}
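The counts payload is straightforward to work with once decoded. The sketch below loads the sample response shown above, turns the buckets into a histogram keyed by `timePeriod`, and checks that the buckets add up to `totalCount` for this response.

```python
import json

response_text = """
{
  "results": [
    {"timePeriod": "201811010000", "count": 0},
    {"timePeriod": "201811020000", "count": 1},
    {"timePeriod": "201811030000", "count": 0},
    {"timePeriod": "201811040000", "count": 0},
    {"timePeriod": "201811050000", "count": 0}
  ],
  "totalCount": 1,
  "requestParameters": {
    "bucket": "day",
    "fromDate": "201811010000",
    "toDate": "201811060000"
  }
}
"""

response = json.loads(response_text)
# Map each bucket's start time to its count.
histogram = {bucket["timePeriod"]: bucket["count"] for bucket in response["results"]}
busiest = max(histogram, key=histogram.get)

assert sum(histogram.values()) == response["totalCount"]
print(busiest, histogram[busiest])  # 201811020000 1
```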

Great job! Now you’ve successfully accessed the enterprise Search Posts: 30-Day API.

Getting started with enterprise Search Posts: Full-Archive API

The enterprise Search Posts: Full-Archive API provides you with Posts since the first one posted in 2006. Posts are matched and sent back to you based on the query you specify in your request. A query is a rule in which you define what the Post you get back should contain. In this tutorial, we will search for Posts originating from the X account @XDevelopers in English.

The Posts you get back in your payload can be in a data format, which provides you with the full Post payload, or it can be in a counts format which gives you numerical count data of matched Posts. We will be using cURL to make requests to the data and counts endpoints.

You will need the following:

Accessing the data endpoint

The data endpoint will provide us with the full Post payload of matched Posts. We will use the from: and lang: operators to find Posts originating from @XDevelopers in English. For more operators click here.

cURL is a command-line tool for getting or sending files using the URL syntax.

Copy the following cURL request into your command line after making changes to the following:

  • Username <USERNAME> e.g. email@domain.com

  • Account name <ACCOUNT-NAME> e.g. john-doe

  • Label <LABEL> e.g. prod

  • fromDate and toDate e.g. "fromDate":"201802010000", "toDate":"201802282359"

After sending your request, you will be prompted for your password.

curl -X POST -u<USERNAME> "https://gnip-api.x.com/search/fullarchive/accounts/<ACCOUNT-NAME>/<LABEL>.json" -d '{"query":"from:TwitterDev lang:en","maxResults":"500","fromDate":"<yyyymmddhhmm>","toDate":"<yyyymmddhhmm>"}'

Data endpoint response payload

The payload you get back from your API request will appear in JSON format, as shown below.

{
	"results": [
		{
			"created_at": "Fri Nov 02 17:18:31 +0000 2018",
			"id": 1058408022936977409,
			"id_str": "1058408022936977409",
			"text": "RT @harmophone: \"The innovative crowdsourcing that the Tagboard, Twitter and TEGNA collaboration enables is surfacing locally relevant conv…",
			"source": "<a href=\"http:\/\/twitter.com\" rel=\"nofollow\">Twitter Web Client<\/a>",
			"truncated": false,
			"in_reply_to_status_id": null,
			"in_reply_to_status_id_str": null,
			"in_reply_to_user_id": null,
			"in_reply_to_user_id_str": null,
			"in_reply_to_screen_name": null,
			"user": {
				"id": 2244994945,
				"id_str": "2244994945",
				"name": "Twitter Dev",
				"screen_name": "TwitterDev",
				"location": "Internet",
				"url": "https:\/\/developer.twitter.com\/",
				"description": "Your official source for Twitter Platform news, updates & events. Need technical help? Visit https:\/\/twittercommunity.com\/ ⌨️ #TapIntoTwitter",
				"translator_type": "null",
				"protected": false,
				"verified": true,
				"followers_count": 503828,
				"friends_count": 1477,
				"listed_count": 1437,
				"favourites_count": 2199,
				"statuses_count": 3380,
				"created_at": "Sat Dec 14 04:35:55 +0000 2013",
				"utc_offset": null,
				"time_zone": null,
				"geo_enabled": true,
				"lang": "en",
				"contributors_enabled": false,
				"is_translator": false,
				"profile_background_color": "null",
				"profile_background_image_url": "null",
				"profile_background_image_url_https": "null",
				"profile_background_tile": null,
				"profile_link_color": "null",
				"profile_sidebar_border_color": "null",
				"profile_sidebar_fill_color": "null",
				"profile_text_color": "null",
				"profile_use_background_image": null,
				"profile_image_url": "null",
				"profile_image_url_https": "https:\/\/pbs.twimg.com\/profile_images\/880136122604507136\/xHrnqf1T_normal.jpg",
				"profile_banner_url": "https:\/\/pbs.twimg.com\/profile_banners\/2244994945\/1498675817",
				"default_profile": false,
				"default_profile_image": false,
				"following": null,
				"follow_request_sent": null,
				"notifications": null
			},
			"geo": null,
			"coordinates": null,
			"place": null,
			"contributors": null,
			"retweeted_status": {
				"created_at": "Tue Oct 30 21:30:25 +0000 2018",
				"id": 1057384253116289025,
				"id_str": "1057384253116289025",
				"text": "\"The innovative crowdsourcing that the Tagboard, Twitter and TEGNA collaboration enables is surfacing locally relev… https:\/\/t.co\/w46U5TRTzQ",
				"source": "<a href=\"http:\/\/twitter.com\" rel=\"nofollow\">Twitter Web Client<\/a>",
				"truncated": true,
				"in_reply_to_status_id": null,
				"in_reply_to_status_id_str": null,
				"in_reply_to_user_id": null,
				"in_reply_to_user_id_str": null,
				"in_reply_to_screen_name": null,
				"user": {
					"id": 175187944,
					"id_str": "175187944",
					"name": "Tyler Singletary",
					"screen_name": "harmophone",
					"location": "San Francisco, CA",
					"url": "http:\/\/medium.com\/@harmophone",
					"description": "SVP Product at @Tagboard. Did some Data, biz, and product @Klout & for @LithiumTech; @BBI board member; @Insightpool advisor. World's worst whiteboarder.",
					"translator_type": "null",
					"protected": false,
					"verified": false,
					"followers_count": 1982,
					"friends_count": 1877,
					"listed_count": 245,
					"favourites_count": 23743,
					"statuses_count": 12708,
					"created_at": "Thu Aug 05 22:59:29 +0000 2010",
					"utc_offset": null,
					"time_zone": null,
					"geo_enabled": false,
					"lang": "en",
					"contributors_enabled": false,
					"is_translator": false,
					"profile_background_color": "null",
					"profile_background_image_url": "null",
					"profile_background_image_url_https": "null",
					"profile_background_tile": null,
					"profile_link_color": "null",
					"profile_sidebar_border_color": "null",
					"profile_sidebar_fill_color": "null",
					"profile_text_color": "null",
					"profile_use_background_image": null,
					"profile_image_url": "null",
					"profile_image_url_https": "https:\/\/pbs.twimg.com\/profile_images\/719985428632240128\/WYFHcK-m_normal.jpg",
					"profile_banner_url": "https:\/\/pbs.twimg.com\/profile_banners\/175187944\/1398653841",
					"default_profile": false,
					"default_profile_image": false,
					"following": null,
					"follow_request_sent": null,
					"notifications": null
				},
				"geo": null,
				"coordinates": null,
				"place": null,
				"contributors": null,
				"is_quote_status": false,
				"extended_tweet": {
					"full_text": "\"The innovative crowdsourcing that the Tagboard, Twitter and TEGNA collaboration enables is surfacing locally relevant conversations in real-time and enabling voters to ask questions during debates,”  -- @adamostrow, @TEGNA\nLearn More: https:\/\/t.co\/ivAFtanfje",
					"display_text_range": [
						0,
						259
					],
					"entities": {
						"hashtags": [],
						"urls": [
							{
								"url": "https:\/\/t.co\/ivAFtanfje",
								"expanded_url": "https:\/\/blog.tagboard.com\/twitter-and-tagboard-collaborate-to-bring-best-election-content-to-news-outlets-with-tagboard-e85fc864bcf4",
								"display_url": "blog.tagboard.com\/twitter-and-ta…",
								"unwound": {
									"url": "https:\/\/blog.tagboard.com\/twitter-and-tagboard-collaborate-to-bring-best-election-content-to-news-outlets-with-tagboard-e85fc864bcf4",
									"status": 200,
									"title": "Twitter and Tagboard Collaborate to Bring Best Election Content to News Outlets With Tagboard…",
									"description": "By Tyler Singletary, Head of Product, Tagboard"
								},
								"indices": [
									236,
									259
								]
							}
						],
						"user_mentions": [
							{
								"screen_name": "adamostrow",
								"name": "Adam Ostrow",
								"id": 5695942,
								"id_str": "5695942",
								"indices": [
									204,
									215
								]
							},
							{
								"screen_name": "TEGNA",
								"name": "TEGNA",
								"id": 34123003,
								"id_str": "34123003",
								"indices": [
									217,
									223
								]
							}
						],
						"symbols": []
					}
				},
				"quote_count": 0,
				"reply_count": 1,
				"retweet_count": 6,
				"favorite_count": 19,
				"entities": {
					"hashtags": [],
					"urls": [
						{
							"url": "https:\/\/t.co\/w46U5TRTzQ",
							"expanded_url": "https:\/\/twitter.com\/i\/web\/status\/1057384253116289025",
							"display_url": "twitter.com\/i\/web\/status\/1…",
							"indices": [
								117,
								140
							]
						}
					],
					"user_mentions": [],
					"symbols": []
				},
				"favorited": false,
				"retweeted": false,
				"possibly_sensitive": false,
				"filter_level": "low",
				"lang": "en"
			},
			"is_quote_status": false,
			"quote_count": 0,
			"reply_count": 0,
			"retweet_count": 0,
			"favorite_count": 0,
			"entities": {
				"hashtags": [],
				"urls": [],
				"user_mentions": [
					{
						"screen_name": "harmophone",
						"name": "Tyler Singletary",
						"id": 175187944,
						"id_str": "175187944",
						"indices": [
							3,
							14
						]
					}
				],
				"symbols": []
			},
			"favorited": false,
			"retweeted": false,
			"filter_level": "low",
			"lang": "en",
			"matching_rules": [
				{
					"tag": null
				}
			]
		}
	],
	"requestParameters": {
		"maxResults": 100,
		"fromDate": "201811010000",
		"toDate": "201811060000"
	}
}

Accessing the counts endpoint

With the counts endpoint, we’ll retrieve the number of Posts originating from the @XDevelopers account in English grouped by day.

cURL is a command-line tool for getting or sending files using the URL syntax.

Copy the following cURL request into your command line after making changes to the following:

  • Username <USERNAME> e.g. email@domain.com

  • Account name <ACCOUNT-NAME> e.g. john-doe

  • Label <LABEL> e.g. prod

  • fromDate and toDate e.g. "fromDate":"201802010000", "toDate":"201802282359"

After sending your request, you will be prompted for your password.

curl -X POST -u<USERNAME> "https://gnip-api.x.com/search/fullarchive/accounts/<ACCOUNT-NAME>/<LABEL>/counts.json" -d '{"query":"from:TwitterDev lang:en","fromDate":"<yyyymmddhhmm>","toDate":"<yyyymmddhhmm>","bucket":"day"}'

Counts endpoint response payload

The payload you get back from your API request will appear in JSON format, as shown below.

{
	"results": [
		{
			"timePeriod": "201811010000",
			"count": 0
		},
		{
			"timePeriod": "201811020000",
			"count": 1
		},
		{
			"timePeriod": "201811030000",
			"count": 0
		},
		{
			"timePeriod": "201811040000",
			"count": 0
		},
		{
			"timePeriod": "201811050000",
			"count": 0
		}
	],
	"totalCount": 1,
	"requestParameters": {
		"bucket": "day",
		"fromDate": "201811010000",
		"toDate": "201811060000"
	}
}

Great job! Now you’ve successfully accessed the enterprise Search Posts: Full-Archive API.

Guides

Building search queries

Enterprise operators

Below is a list of all operators supported in X’s enterprise search APIs:

  • Enterprise 30-day search API
  • Enterprise Full-archive search API

For a side-by-side comparison of available operators by product see HERE.

Operator: keyword
Description: Matches a tokenized keyword within the body or URLs of a Post. This is a tokenized match, meaning that your keyword string will be matched against the tokenized text of the Post body. Tokenization is based on punctuation, symbol, and separator Unicode basic plane characters. For example, a Post with the text “I like coca-cola” would be split into the following tokens: I, like, coca, cola. These tokens would then be compared to the keyword string used in your rule. To match strings containing punctuation (for example, coca-cola), symbol, or separator characters, you must use a quoted exact match as described below.

Note: With the Search API, accented and special characters are normalized to standard latin characters, which can change meanings in foreign languages or return unexpected results:
For example, “músic” will match “music” and vice versa.
For example, common phrases like “Feliz Año Nuevo!” in Spanish, would be indexed as “Feliz Ano Nuevo”, which changes the meaning of the phrase.

Note: This operator will match on both URLs and unwound URLs within a Post.
emoji | Matches an emoji within the body of a Post. Emojis are a tokenized match, meaning that your emoji will be matched against the tokenized text of the Post body – tokenization is based on punctuation, symbol/emoji, and separator Unicode basic plane characters. For example, a Post with the text “I like [emoji]” would be split into the following tokens: I, like, [emoji]. These tokens would then be compared to the emoji used in your rule. Note that if an emoji has a variant, you must use “quotations” to add to a rule.
“exact phrase match” | Matches the tokenized and ordered phrase within the body or urls of a Post. This is a tokenized match, meaning that your keyword string will be matched against the tokenized text of the Post body – tokenization is based on punctuation, symbol, and separator Unicode basic plane characters.

Note: Punctuation is not tokenized and is instead treated as whitespace.
For example, quoted “#hashtag” will match “hashtag” but not #hashtag (use the hashtag # operator without quotes to match on actual hashtags).
For example, quoted “$cashtag” will match “cashtag” but not $cashtag (use the cashtag $ operator without quotes to match on actual cashtags).
For example, “Love Snow” will match “#love #snow”.
For example, “#Love #Snow” will match “love snow”.

Note: This operator will match on both URLs and unwound URLs within a Post.
“keyword1 keyword2”~N | Commonly referred to as a proximity operator, this matches a Post where the keywords are no more than N tokens from each other.

If the keywords are in the opposite order, they cannot be more than N-2 tokens from each other. Can have any number of keywords in quotes. N cannot be greater than 6.

Note that this operator is only available in the enterprise search APIs.
from: | Matches any Post from a specific user.
The value must be the user’s X numeric Account ID or username (excluding the @ character). See HERE or HERE for methods for looking up numeric X Account IDs.
to: | Matches any Post that is in reply to a particular user.

The value must be the user’s numeric Account ID or username (excluding the @ character). See HERE for methods for looking up numeric X Account IDs.
url: | Performs a tokenized (keyword/phrase) match on the expanded URLs of a Post (similar to url_contains). Tokens and phrases containing punctuation or special characters should be double-quoted. For example, url:“/developer”. While generally not recommended, if you want to match on a specific protocol, enclose in double-quotes: url:“https://developer.x.com”.
Note: When using PowerTrack or Historical PowerTrack, this operator will match on URLs contained within the original Post of a Quote Post. For example, if your rule includes url:“developer.x.com”, and a Post contains that URL, any Quote Tweets of that Post will be included in the results. This is not the case when using the Search API.
# | Matches any Post with the given hashtag.

This operator performs an exact match, NOT a tokenized match, meaning the rule “2016” will match posts with the exact hashtag “2016”, but not those with the hashtag “2016election”

Note that the hashtag operator relies on X’s entity extraction to match hashtags, rather than extracting the hashtag from the body itself. See HERE for more information on X Entities JSON attributes.
@ | Matches any Post that mentions the given username.
The to: operator returns a subset match of the @mention operator.
$ | Matches any Post that contains the specified ‘cashtag’ (where the leading character of the token is the ‘$’ character).

Note that the cashtag operator relies on X’s ‘symbols’ entity extraction to match cashtags, rather than trying to extract the cashtag from the body itself. See HERE for more information on X Entities JSON attributes.

Note that this operator is only available in the enterprise search APIs.

retweets_of: | Available alias: retweets_of_user:
Matches Posts that are retweets of a specified user. Accepts both usernames and numeric X Account IDs (NOT Post status IDs). See HERE for methods for looking up numeric X Account IDs.
lang: | Matches Posts that have been classified by X as being of a particular language (if, and only if, the post has been classified). It is important to note that each Post is currently only classified as being of one language, so AND’ing together multiple languages will yield no results.

Note: if no language classification can be made the provided result is ‘und’ (for undefined).

The list below represents the currently supported languages and their corresponding BCP 47 language identifiers:
Amharic: am | German: de | Malayalam: ml | Slovak: sk
Arabic: ar | Greek: el | Maldivian: dv | Slovenian: sl
Armenian: hy | Gujarati: gu | Marathi: mr | Sorani Kurdish: ckb
Basque: eu | Haitian Creole: ht | Nepali: ne | Spanish: es
Bengali: bn | Hebrew: iw | Norwegian: no | Swedish: sv
Bosnian: bs | Hindi: hi | Oriya: or | Tagalog: tl
Bulgarian: bg | Latinized Hindi: hi-Latn | Panjabi: pa | Tamil: ta
Burmese: my | Hungarian: hu | Pashto: ps | Telugu: te
Croatian: hr | Icelandic: is | Persian: fa | Thai: th
Catalan: ca | Indonesian: in | Polish: pl | Tibetan: bo
Czech: cs | Italian: it | Portuguese: pt | Traditional Chinese: zh-TW
Danish: da | Japanese: ja | Romanian: ro | Turkish: tr
Dutch: nl | Kannada: kn | Russian: ru | Ukrainian: uk
English: en | Khmer: km | Serbian: sr | Urdu: ur
Estonian: et | Korean: ko | Simplified Chinese: zh-CN | Uyghur: ug
Finnish: fi | Lao: lo | Sindhi: sd | Vietnamese: vi
French: fr | Latvian: lv | Sinhala: si | Welsh: cy
Georgian: ka | Lithuanian: lt
place: | Matches Posts tagged with the specified location or X place ID (see examples). Multi-word place names (“New York City”, “Palo Alto”) should be enclosed in quotes.

Note: See the GET geo/search public API endpoint for how to obtain X place IDs.

Note: This operator will not match on Retweets, since Retweet’s places are attached to the original Post. It will also not match on places attached to the original Post of a Quote Tweet.
place_country: | Matches Posts where the country code associated with a tagged place matches the given ISO alpha-2 character code.

Valid ISO codes can be found here: http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2

Note: This operator will not match on Retweets, since Retweet’s places are attached to the original Post. It will also not match on places attached to the original Post of a Quote Tweet.
point_radius:[lon lat radius] | Matches against the Exact Location (x,y) of the Post when present, and in X, against a “Place” geo polygon, where the Place is fully contained within the defined region.

* Units of radius supported are miles (mi) and kilometers (km).
* Radius must be less than 25mi.
* Longitude is in the range of ±180
* Latitude is in the range of ±90
* All coordinates are in decimal degrees.
* Rule arguments are contained within brackets, space delimited.

Note: This operator will not match on Retweets, since Retweet’s places are attached to the original Post. It will also not match on places attached to the original Post of a Quote Tweet.
bounding_box:[west_long south_lat east_long north_lat] | Available alias: geo_bounding_box:

Matches against the Exact Location (long, lat) of the Post when present, and in X, against a “Place” geo polygon, where the Place is fully contained within the defined region.

* west_long and south_lat represent the southwest corner of the bounding box, where west_long is the longitude of that point, and south_lat is the latitude.
* east_long and north_lat represent the northeast corner of the bounding box, where east_long is the longitude of that point, and north_lat is the latitude.
* Width and height of the bounding box must be less than 25mi
* Longitude is in the range of ±180
* Latitude is in the range of ±90
* All coordinates are in decimal degrees.
* Rule arguments are contained within brackets, space delimited.
Note: This operator will not match on Retweets, since Retweet’s places are attached to the original Post. It will also not match on places attached to the original Post of a Quote Tweet.
profile_country: | Exact match on the “countryCode” field from the “address” object in the Profile Geo enrichment.
Uses a normalized set of two-letter country codes, based on ISO-3166-1-alpha-2 specification. This operator is provided in lieu of an operator for “country” field from the “address” object to be concise.
profile_region: | Matches on the “region” field from the “address” object in the Profile Geo enrichment.

This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use “one/two”, not “one\/two”. Use double quotes to match substrings that contain whitespace or punctuation.
profile_locality: | Matches on the “locality” field from the “address” object in the Profile Geo enrichment.

This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use “one/two”, not “one\/two”. Use double quotes to match substrings that contain whitespace or punctuation.

NOTE: All is: and has: operators cannot be used as standalone operators when using the Search API, and must be combined with another clause.

For example, @XDevelopers has:links

has:geo | Matches Posts that have Post-specific geo location data provided from X. This can be either “geo” lat-long coordinate, or a “location” in the form of an X “Place”, with corresponding display name, geo polygon, and other fields.



Note: When using the Search API, this operator must be used in conjunction with other operators that don’t include is: or has:.
has:profile_geo | Available alias: has:derived_user_geo

Matches Posts that have any Profile Geo metadata, regardless of the actual value.


Note: When using the Search API, this operator must be used in conjunction with other operators that don’t include is: or has:.
has:links | This operator matches Posts which contain links in the message body.


Note: When using the Search API, this operator must be used in conjunction with other operators that don’t include is: or has:.
is:retweet | Deliver only explicit retweets that match a rule. Can also be negated to exclude retweets that match a rule from delivery, so that only original content is delivered.

This operator looks only for true Retweets, which use X’s retweet functionality. Quoted Tweets and Modified Posts which do not use X’s retweet functionality will not be matched by this operator.



Note: When using the Search API, this operator must be used in conjunction with other operators that don’t include is: or has:.
is:reply | An operator to filter Posts based on whether they are or are not replies to Posts. Deliver only explicit replies that match a rule. Can also be negated to exclude replies that match a rule from delivery.

Note that this operator is available for paid premium and enterprise search and is not available in Sandbox dev environments.



Note: When using the Search API, this operator must be used in conjunction with other operators that don’t include is: or has:.
is:quote | Delivers only Quote Tweets, or Posts that reference another Post, as identified by the “is_quote_status”:true in Post payloads. Can also be negated to exclude Quote Tweets.

Note: When using the Search API, this operator must be used in conjunction with other operators that don’t include is: or has:.
is:verified | Deliver only Posts where the author is “verified” by X. Can also be negated to exclude Posts where the author is verified.


Note: When using the Search API, this operator must be used in conjunction with other operators that don’t include is: or has:.
has:mentions | Matches Posts that mention another X user.


Note: When using the Search API, this operator must be used in conjunction with other operators that don’t include is: or has:.
has:hashtags | Matches Posts that contain a hashtag.


Note: When using the Search API, this operator must be used in conjunction with other operators that don’t include is: or has:.
has:media | Available alias: has:media_link

Matches Posts that contain a media url classified by X. For example, pic.x.com.


Note: When using the Search API, this operator must be used in conjunction with other operators that don’t include is: or has:.
has:images | Matches Posts that contain a media url classified by X. For example, pic.x.com.


Note: When using the Search API, this operator must be used in conjunction with other operators that don’t include is: or has:.
has:videos | Available alias: has:video_link

Matches Posts that contain native X videos, uploaded directly to X. This will not match on videos created with Vine, Periscope, or Posts with links to other video hosting sites.


Note: When using the Search API, this operator must be used in conjunction with other operators that don’t include is: or has:.
has:symbols | Matches Posts that contain a cashtag symbol (with a leading ‘$’ character. For example, $tag). Note that this operator is only available in the enterprise search APIs.


Note: When using the Search API, this operator must be used in conjunction with other operators that don’t include is: or has:.
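To make the tokenized matching described for the keyword and exact phrase match operators more concrete, here is a rough local approximation. It is a sketch under stated assumptions, not the search index's actual tokenizer: the function names are hypothetical, accent folding is approximated with Unicode NFKD decomposition, and emoji are not handled as tokens here (they fall into the symbol category and are treated as separators).

```python
import unicodedata


def normalize(text: str) -> str:
    """Strip combining accents, e.g. 'Año' -> 'Ano' (approximation only)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def tokenize(text: str) -> list[str]:
    """Split on whitespace and on punctuation (P*), symbol (S*), and separator (Z*) characters."""
    tokens, current = [], []
    for ch in normalize(text):
        if ch.isspace() or unicodedata.category(ch)[0] in "PSZ":
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def phrase_matches(phrase: str, post_text: str) -> bool:
    """Case-insensitive check that the phrase tokens appear adjacent and in order."""
    needle = [t.lower() for t in tokenize(phrase)]
    haystack = [t.lower() for t in tokenize(post_text)]
    return bool(needle) and any(
        haystack[i:i + len(needle)] == needle
        for i in range(len(haystack) - len(needle) + 1)
    )


print(tokenize("I like coca-cola"))
print(tokenize("Feliz Año Nuevo!"))
print(phrase_matches("Love Snow", "#love #snow"))
```

A helper like this can be useful for reasoning about why an unquoted keyword such as coca-cola behaves differently from the quoted phrase.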

Full-Archive Search metadata timeline

This article discusses how historical changes in the full-archive roadmap affect the filters needed to find your historical signal of interest. This article, together with a complementary article about Historical PowerTrack, serves as a ‘compare and contrast’ discussion of the two X historical products.

Product overview

The enterprise-tier Full-archive Search was launched in August 2015, and the premium-tier version was launched in February 2018. These search products enable customers to immediately access any publicly available Post. With Full-archive Search you submit a single query and receive a response in classic RESTful fashion. Full-archive Search implements (up to) 500-Posts-per-response pagination, and supports up to a 60-requests-per-minute (rpm) rate-limit for premium, 120 rpm for enterprise. Given these details, Full-archive Search can be used to rapidly retrieve Posts, and at large scale using concurrent requests.

Unlike Historical PowerTrack, whose archive is based on a set of Post flat-files on disk, the Full-archive Search Post archive is much like an on-line database. As with all databases, it supports making queries on its contents. It also makes use of an index to enable high-performance data retrieval. With Full-archive search endpoints, the querying language is made up of PowerTrack Operators, and these Operators each correspond to a Post JSON attribute that is indexed.

Also, like Historical PowerTrack, there are Post attributes that are current to the time a query is made. For example, if you are using Search API to access a Post posted in 2010 today, the user’s profile description, account ‘home’ location, display name, and Post metrics for Favorites and Retweet counts will be updated to today’s values and not what they were in 2010. 

Metadata timelines

Below is a timeline of when Full-archive search endpoint Operators begin matching. In some cases Operator matching began well after a ‘communication convention’ became commonplace on X. For example, @Replies emerged as a user convention in 2006, but did not become a first-class object or event with ‘supporting’ JSON until early 2007. Accordingly, matching on @Replies in 2006 requires an examination of the Post body, rather than relying on the to: and in_reply_to_status_id: PowerTrack Operators.

The details provided here were generated using Full-Archive Search (a product of hundreds of searches). This timeline is not 100% complete or precise. If you identify another filtering/metadata “born on date” fundamental to your use-case, please let us know.

Note that the underlying Search index is subject to being rebuilt. Accordingly, these timeline details are subject to change.

2006

  • March 26 - lang:. An example of Post metadata being backfilled while generating the Search index.
  • July 13 - has:mentions begins matching.
  • October 6 - has:symbols. $cashtags (or symbols) for discussing stock symbols do not become common until early 2009. Until then most usages were probably slang (e.g., $slang).
  • October 26 - has:links begins matching.
  • November 23 - has:hashtags begins matching.

2007

  • January 30 - First first-class @reply (in_reply_to_user_id), reply_to_status_id: begins matching.
  • August 23 - Hashtags emerge as a common convention for organizing topics and conversations. First real use a week later.

2009

  • May 15 - is:retweet. Note that this Operator starts matching with the ‘beta’ release of official Retweets and its “Via @” pattern. During this beta period, the Post verb is ‘post’ and the original Post is not included in the payload.
  • August 13 - Final version of official Retweets is released with “RT @” pattern, a verb set to ‘share’, and the ‘retweet_status’ attribute containing the original Post (thus approximately doubling the JSON payload size).

2010

  • March 6 - has:geo, bounding_box: and point_radius: geo Operators begin matching.
  • August 28 - has:videos (Until February 2015, this Operator matches on Posts with links to select video hosting sites such as youtube.com, vimeo.com, and vivo.com).

2011

  • July 20 - has:media and has:images begin matching. Native photos officially announced August 9, 2010.

2014

  • December 3 - (Approximately) Some Enhanced URL metadata with HTML title and description begins in payloads. Enhanced metadata more fully emerged in May 2016.

2015

  • February 10 - has:videos matches on ‘native’ X videos.
  • February 17 - has:profile_geo, profile_country:, profile_region:, profile_locality: Profile Geo Operators begin matching.
  • February 17 - place_country: and place: Post geo Operators begin matching.

2016

2017

  • February 22 - Poll metadata become available in enriched native format. No associated Operators for these metadata.

2022

  • September 29 - All Post objects created since this date have Edited Post metadata available. All Enterprise endpoints that provide Post objects were updated to provide this metadata starting on this date. The edit metadata provided includes edit_history and edit_controls objects. These metadata will not be returned for Posts that were created before September 29, 2022. Currently, there are no Enterprise Operators available that match these metadata. To learn more about Edit Post metadata, check out the Edit Posts fundamentals page.

Filtering tips

Given all the above timeline information, it is clear that there are a lot of details to consider when writing Search APIs filters. There are two key things to consider:

  • Some metadata have ‘born-on’ dates, so filters can result in false negatives. Such searches include Operators reliant on metadata that did not exist for all or part of the search period. For example, if you are searching for Posts with the has:images Operator, you will not have any matches for periods before July 2011. That is because that Operator matches on native photos (attached to a Post using the X user-interface). For a more complete data set of photo-sharing Posts, filters for before July 2011 would need to contain rule clauses that match on common URLs for photo hosting.
  • Some metadata has been backfilled with metadata from a time after the Post was posted.
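One way to guard against the false negatives described above is to check a query's Operators against the timeline before running it. The sketch below is illustrative only: the `BORN_ON` table copies a subset of the dates from the timeline above, and `operators_with_gaps` and its simple whitespace-and-parenthesis query splitting are hypothetical, not part of the API.

```python
import re
from datetime import date

# First dates on which selected Operators begin matching, copied from the timeline above.
BORN_ON = {
    "has:mentions": date(2006, 7, 13),
    "has:symbols": date(2006, 10, 6),
    "has:links": date(2006, 10, 26),
    "has:hashtags": date(2006, 11, 23),
    "is:retweet": date(2009, 5, 15),
    "has:geo": date(2010, 3, 6),
    "bounding_box:": date(2010, 3, 6),
    "point_radius:": date(2010, 3, 6),
    "has:videos": date(2010, 8, 28),
    "has:media": date(2011, 7, 20),
    "has:images": date(2011, 7, 20),
    "has:profile_geo": date(2015, 2, 17),
    "profile_country:": date(2015, 2, 17),
    "profile_region:": date(2015, 2, 17),
    "profile_locality:": date(2015, 2, 17),
    "place_country:": date(2015, 2, 17),
    "place:": date(2015, 2, 17),
}


def operators_with_gaps(query: str, from_date: date) -> list[str]:
    """Return Operators used in the query that did not yet match on from_date."""
    # Crude splitting on whitespace and parentheses; a leading '-' negation is dropped.
    tokens = [t.lstrip("-") for t in re.split(r"[\s()]+", query) if t]
    flagged = []
    for operator, born in BORN_ON.items():
        used = any(
            t.startswith(operator) if operator.endswith(":") else t == operator
            for t in tokens
        )
        if used and from_date < born:
            flagged.append(operator)
    return flagged


print(operators_with_gaps("(snow OR cold) has:images -is:retweet", date(2009, 1, 1)))
```

Any Operator returned by this check is a hint that the query needs an alternative clause (for example, matching on photo-hosting URLs) for the early part of the requested period.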

There are several attribute types that are commonly focused on when creating PowerTrack queries:

  • X Profiles
  • Original or shared Posts
  • Post language classification
  • Geo-referencing Posts
  • Shared links and media

Some of these have product-specific behavior while others have identical behavior. See below for more details.

X Profiles

The Search APIs serve historical Posts with the user profile data set as it is at the time of retrieval. If you request a Post from 2014, the user’s profile metadata will reflect how it exists at query-time.

Original Posts and Retweets

The PowerTrack is:retweet Operator enables users to either include or exclude Retweets. Users of this Operator need two strategies for matching (or not matching) Retweets, depending on the period. For data before August 2009, the Post message itself needs to be checked, using exact phrase matching, for the “RT @” pattern (if you are filtering on Retweets from May to August 2009, the “Via @” pattern should be included as well). For periods after August 2009, the is:retweet Operator is available.

Post language classifications

For filtering on a Post’s language classification, X’s historical products are quite different. When the Search archive was built, all Posts were backfilled with the X language classification. Therefore the lang: Operator is available for the entire Post archive.

Geo-referencing Posts

There are three primary ways to geo-reference Posts:

  • Geographical references in Post message. Matching on geographic references in the Post message, while often the most challenging method since it depends on local knowledge, is an option for the entire Post archive. Here is an example geo-referenced match from 2006 for the San Francisco area based on a ‘golden gate’ filter.

  • Posts geo-tagged by the user. With the search APIs the ability to start matching on Posts with some Geo Operators started in March 2010, and with others in February 2015:

    • March 6, 2010: has:geo, bounding_box: and point_radius:
    • February 17, 2015: place_country: and place:
  • Account profile ‘home’ location set by the user. Profile Geo Operators are available in both Historical PowerTrack and the Search APIs. With the Search APIs, these Profile Geo metadata are available starting in February 2015. For Posts posted before Profile Geo metadata became available, the bio_location: Operator is available, which can be used to match on non-normalized user input.

Shared links and media

In March 2012, the expanded URL enrichment was introduced. Before this time, the Post payloads included only the URL as provided by the user. So, if the user included a shortened URL it can be challenging to match on (expanded) URLs of interest. With the Search APIs, these metadata are available starting in March 2012.

In July 2016, the enhanced URL enrichment was introduced. This enhanced version provides a web site’s HTML title and description in the Post payload, along with Operators for matching on those. These metadata begin emerging in December 2014.

In September 2016, X introduced ‘native attachments’, where a trailing shared link is not counted against the 140-character Post limit. Both URL enrichments still apply to these shared links.

Here are when related Search Operators begin matching:

  • 2006 October 26 - has:links
  • 2011 July 20 - has:images and has:media
  • 2011 August - url: with the Expanded URLs enrichment. As early as September 2006, (url:"spotify.com" OR url:gnip OR url:microsoft OR url:google OR url:youtube) matches http://twitter.com/Adam/statuses/16602, even though there is no urls[] metadata in twitter_entities and gnip objects. “youtube.com” is an example of message content that, without any urls[] metadata, matches url:youtube.
  • 2015 February 10 - has:videos for native videos. Between 2010/08/28 and 2015/02/10, this Operator matches on Posts with links to select video hosting sites such as youtube.com, vimeo.com, and vivo.com.
  • 2016 May 1 - url_title: and url_description:, based on the Enhanced URLs enrichment, generally available. First Enhanced URL metadata began appearing in December 2014.

Frequently asked questions (FAQ)

General Search Post API questions

Error troubleshooting guide

Code 404 - Not Found

  1. Please ensure you are using the right parameters for each endpoint (e.g. the bucket field can only be used with the counts endpoint, not the data endpoint).
  2. Please double check the :product :account_name and :label fields are correct. You can find your :label field in the GNIP Console (enterprise customers only).

API Reference

Enterprise search APIs

There are two enterprise search APIs:

  • 30-Day Search API - provides Tweets posted within the last 30 days.
  • Full-Archive Search API - provides Tweets from as early as 2006, starting with the first Tweet posted in March 2006.

These search APIs share a common design and the documentation below applies to both. Note that for Tweets created starting September 29, 2022, Tweet objects will include Tweet edit metadata that describes its edit history. See the “Edit Tweets” fundamentals page for more details.

Below you will find important details needed when integrating with the enterprise search APIs:

  • Methods for requesting Tweet data and counts
  • Authentication
  • Pagination
  • API request parameters and example requests
  • API response JSON payloads and example responses
  • HTTP response codes

The enterprise APIs provide low-latency, full-fidelity, query-based access to the Tweet archive. The only difference in the two APIs is the time frame you can search, either the previous 30 days or from as early as 2006. Time frames can be specified with minute granularity. Tweet data is served in reverse chronological order, starting with the most recent Tweet that matches your query. Tweets are available from the search API approximately 30 seconds after being published.

Methods

The base URI for enterprise search is https://gnip-api.x.com/search/.

Method | Description
POST /search/:product/accounts/:account_name/:label | Retrieve Tweets from the past 30 days (30day) or from the full archive (fullarchive) that match the specified PowerTrack rule.
POST /search/:product/accounts/:account_name/:label/counts | Retrieve the number of Tweets from the past 30 days (30day) or from the full archive (fullarchive) that match the specified PowerTrack rule.

Where:

  • :product indicates the search endpoint you are making requests to, either 30day or fullarchive.
  • :account_name is the (case-sensitive) name associated with your account, as displayed at console.gnip.com
  • :label is the (case-sensitive) label associated with your search endpoint, as displayed at console.gnip.com

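For example, if the TwitterDev account has the 30-Day search product with a label of ‘prod’ (short for production), the search endpoints would be:

  • Data endpoint: https://gnip-api.x.com/search/30day/accounts/TwitterDev/prod.json
  • Counts endpoint: https://gnip-api.x.com/search/30day/accounts/TwitterDev/prod/counts.json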

Your complete enterprise search API endpoint is displayed at https://console.gnip.com.

Below there are several example requests using a simple HTTP utility called curl. These examples use URLs with :product, :account_name, and :label. To use these examples, be sure to update the URLs with your details.
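As a rough illustration of how :product, :account_name, and :label combine with the base URI, consider the sketch below. The function name `search_url` is hypothetical, and the `.json` suffix follows the curl example earlier in this document; confirm your exact endpoint in the console before relying on it.

```python
from urllib.parse import quote

BASE_URI = "https://gnip-api.x.com/search/"


def search_url(product: str, account_name: str, label: str, counts: bool = False) -> str:
    """Build a data or counts endpoint URL (hypothetical helper)."""
    if product not in ("30day", "fullarchive"):
        raise ValueError("product must be '30day' or 'fullarchive'")
    # account_name and label are case-sensitive, so they are passed through unchanged
    # apart from percent-encoding of any unsafe characters.
    path = f"{product}/accounts/{quote(account_name, safe='')}/{quote(label, safe='')}"
    if counts:
        path += "/counts"
    return BASE_URI + path + ".json"


print(search_url("30day", "TwitterDev", "prod"))
print(search_url("fullarchive", "TwitterDev", "prod", counts=True))
```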

Authentication

All requests to the enterprise search APIs must use HTTP Basic Authentication, constructed from a valid email address and password combination used to log into your account at https://console.gnip.com. Credentials must be passed as the Authorization header for each request.
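For clients that do not build the header automatically, HTTP Basic Authentication amounts to Base64-encoding the "email:password" pair. The sketch below shows that step; the function name `basic_auth_header` and the sample credentials are placeholders for illustration.

```python
import base64


def basic_auth_header(email: str, password: str) -> dict:
    """Return the Authorization header for HTTP Basic Authentication."""
    credentials = f"{email}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(credentials).decode("ascii")}


# Placeholder credentials; use the email and password for your console account.
headers = basic_auth_header("user@example.com", "secret")
print(headers["Authorization"].startswith("Basic "))
```

Most HTTP libraries (and curl's -u option) perform this encoding for you, so building the header by hand is only needed in lower-level clients.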

Request/response behavior

Using the fromDate and toDate parameters, you can request any time period that the API supports. The 30-Day search API provides Tweets from the most recent 31 days (even though referred to as the ‘30-Day’ API, it makes 31 days available to enable users to make complete-month requests). The Full-Archive search API provides Tweets back to the very first tweet (March 21, 2006). However, a single response will be limited to the lesser of your specified ‘maxResults’ or 31 days. If matching data or your time range exceeds your specified maxResults or 31 days, you will receive a ‘next’ token which you should use to paginate through the remainder of your specified time range.

For example, say you are using Full-Archive search and want all Tweets matching your query from January 1, 2017 to June 30, 2017. You will specify that full six-month time period in your request using the fromDate and toDate parameters. The search API will respond with the first ‘page’ of Tweets, with the number of Tweets matching your maxResults parameter (which defaults to 100). Assuming there are more Tweets (and there most likely will be more), the API will also provide a ‘next’ token that enables you to make a request for the next ‘page’ of data. This process is repeated until the API does not return a ‘next’ token. See the next section for more details.

Pagination

When making both data and count requests it is likely that there is more data than can be returned in a single response. When that is the case the response will include a ‘next’ token. The ‘next’ token is provided as a root-level JSON attribute. Whenever a ‘next’ token is provided, there is additional data to retrieve so you will need to keep making API requests.

Note: The ‘next’ token behavior differs slightly for data and counts requests, and both are described below with example responses provided in the API Reference section.

Data pagination

Requests for data will likely generate more data than can be returned in a single response. Each data request includes a parameter that sets the maximum number of Tweets to return per request. This maxResults parameter defaults to 100 and can be set to a range of 10-500. If your query matches more Tweets than the ‘maxResults’ parameter used in the request, the response will include a ‘next’ token (as a root-level JSON attribute). This ‘next’ token is used in the subsequent request to retrieve the next portion of the matching Tweets for that query (i.e. the next ‘page’). Next tokens will continue to be provided until you have reached the last ‘page’ of results for that query, when no ‘next’ token is provided.

To request the next ‘page’ of data, you must make the exact same query as the original, including query, toDate, and fromDate parameters, if used, and also include a ‘next’ request parameter set to the value from the previous response. This can be utilized with either a GET or POST request. However, the ‘next’ parameter must be URL encoded in the case of a GET request.

You can continue to pass in the ‘next’ element from your previous query until you have received all Tweets from the time period covered by your query. When you receive a response that does not include a ‘next’ element, it means that you have reached the last page and no additional data is available for the specified query and time range.
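The paging loop described above can be sketched as follows. This is a minimal illustration under assumptions, not a reference client: `iter_pages` is a hypothetical helper, the `send` callable stands in for whatever function POSTs the JSON body to your data endpoint and returns the decoded response, and the `results` key for the Tweet array is assumed here for the example.

```python
from typing import Callable, Iterator


def iter_pages(send: Callable[[dict], dict], query: str, from_date: str,
               to_date: str, max_results: int = 100) -> Iterator[dict]:
    """Yield each response page, following 'next' tokens until none is returned."""
    if not 10 <= max_results <= 500:
        raise ValueError("maxResults must be between 10 and 500")
    body = {"query": query, "fromDate": from_date, "toDate": to_date,
            "maxResults": max_results}
    while True:
        page = send(body)
        yield page
        token = page.get("next")
        if not token:
            break
        # Repeat the identical request, adding only the 'next' value.
        body = {**body, "next": token}


def fake_send(body: dict) -> dict:
    """Stand-in transport returning two canned pages, used only for this example."""
    pages = {None: {"results": [{"id": 3}, {"id": 2}], "next": "abc"},
             "abc": {"results": [{"id": 1}]}}
    return pages[body.get("next")]


posts = [post
         for page in iter_pages(fake_send, "snow", "201701010000", "201706300000")
         for post in page["results"]]
print([p["id"] for p in posts])
```

Injecting the transport keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.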

Counts pagination

The ‘counts’ endpoint provides Tweet volumes associated with a query on either a daily, hourly, or per-minute basis. The ‘counts’ API endpoint will return a timestamped array of counts for a maximum of a 31-day payload of counts. If you request more than 31 days of counts you will be provided a ‘next’ token. As with the data ‘next’ tokens, you must make the exact same query as the original and also include a ‘next’ request parameter set to the value from the previous response.

Beyond requesting more than 31 days of counts, there is another scenario when a ‘next’ token is provided. For higher volume queries, there is the potential that the generation of counts will take long enough to trigger a response timeout. When this occurs you will receive less than 31 days of counts but will be provided a ‘next’ token in order to continue making requests for the entire payload of counts. Important: Timeouts will only issue full “buckets” - so 2.5 days would result in 2 full day “buckets”.

Additional notes
  • When using a fromDate or toDate in a search request, you will only get results from within your time range. When you reach the last group of results within your time range, you will not receive a ‘next’ token.
  • The ‘next’ element can be used with any maxResults value between 10 and 500 (with a default value of 100). The maxResults value determines how many Tweets are returned in each response, but does not prevent you from eventually getting all results.
  • The ‘next’ element does not expire. Multiple requests using the same ‘next’ query will receive the same results, regardless of when the request is made.
  • When paging through results using the ‘next’ parameter, you may encounter duplicates at the edges of the query. Your application should be tolerant of these.
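One way to tolerate duplicates at page edges is to track the IDs already seen. The sketch below is an assumption-laden illustration: `dedupe_tweets` is a hypothetical helper, and it assumes each Tweet object in ‘results’ carries a unique `id` attribute (pass a different `key` if your payload format uses another field).

```python
from typing import Iterable, Iterator


def dedupe_tweets(pages: Iterable[dict], key: str = "id") -> Iterator[dict]:
    """Yield each Tweet once, skipping any whose ID appeared on an earlier page."""
    seen = set()
    for page in pages:
        for tweet in page.get("results", []):
            tweet_id = tweet[key]
            if tweet_id in seen:
                continue  # duplicate at a page edge: drop it
            seen.add(tweet_id)
            yield tweet
```

The set of seen IDs grows with the result set, so for very large pulls you may prefer to keep only the IDs from the most recent page.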

Data endpoint

POST /search/:product/:label
Endpoint pattern:

This endpoint returns data for the specified query and time period. If a time period is not specified, the time parameters default to the last 30 days. Note: The same request can also be made with GET instead of POST by encoding the parameters described below into the URL.

Data request parameters
Parameter: query
Description: The equivalent of one PowerTrack rule, with up to 2,048 characters (and no limits on the number of positive and negative clauses).

This parameter should include ALL portions of the PowerTrack rule, including all operators, and portions of the rule should not be separated into other parameters of the query.

Note: Not all PowerTrack operators are supported. Supported Operators are listed HERE.
Required: Yes
Sample value: (snow OR cold OR blizzard) weather

Parameter: tag
Description: Tags can be used to segregate rules and their matching data into different logical groups. If a rule tag is provided the rule tag is included in the ‘matching_rules’ attribute.

It is recommended to assign rule-specific UUIDs to rule tags and maintain desired mappings on the client side.
Required: No
Sample value: 8HYG54ZGTU

Parameter: fromDate
Description: The oldest UTC timestamp (back to 3/21/2006 with Full-Archive search) from which the Tweets will be provided. Timestamp is in minute granularity and is inclusive (i.e. 12:00 includes the 00 minute).

Specified: Using only the fromDate with no toDate parameter will deliver results for the query going back in time from now( ) until the fromDate.

Not Specified: If a fromDate is not specified, the API will deliver all of the results for 30 days prior to now( ) or the toDate (if specified).

If neither the fromDate or toDate parameter is used, the API will deliver all results for the most recent 30 days, starting at the time of the request, going backwards.
Required: No
Sample value: 201207220000

Parameter: toDate
Description: The latest, most recent UTC timestamp to which the Tweets will be provided. Timestamp is in minute granularity and is not inclusive (i.e. 11:59 does not include the 59th minute of the hour).

Specified: Using only the toDate with no fromDate parameter will deliver the most recent 30 days of data prior to the toDate.

Not Specified: If a toDate is not specified, the API will deliver all of the results from now( ) for the query going back in time to the fromDate.

If neither the fromDate or toDate parameter is used, the API will deliver all results for the entire 30-day index, starting at the time of the request, going backwards.
Required: No
Sample value: 201208220000

Parameter: maxResults
Description: The maximum number of search results to be returned by a request. A number between 10 and the system limit (currently 500). By default, a request response will return 100 results.
Required: No
Sample value: 500

Parameter: next
Description: This parameter is used to get the next ‘page’ of results as described HERE. The value used with the parameter is pulled directly from the response provided by the API, and should not be modified.
Required: No
Sample value: NTcxODIyMDMyODMwMjU1MTA0
Additional details
Available Timeframe:
  30-Day: last 31 days
  Full-Archive: March 21, 2006 - Present

Query Format: The equivalent of one PowerTrack rule, with up to 2,048 characters (and no limits on the number of positive and negative clauses).

Note: Not all PowerTrack operators are supported. Refer to Available operators for a list of supported operators.

Rate Limit: Partners will be rate limited at both minute and second granularity. The per minute rate limit will vary by partner as specified in your contract. However, these per-minute rate limits are not intended to be used in a single burst. Regardless of your per minute rate limit, all partners will be limited to a maximum of 20 requests per second, aggregated across all requests for data and/or counts.

Compliance: All data delivered via the Full-Archive Search API is compliant at the time of delivery.

Realtime Availability: Data is available in the index within 30 seconds of generation on the Twitter Platform.
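Because fromDate is inclusive and toDate is not, a one-hour window is expressed with a toDate exactly one hour after the fromDate. The following Python sketch formats timezone-aware datetimes into the UTC yyyymmddhhmm form used by the sample values above; `to_api_timestamp` and `hour_window` are hypothetical helper names, not part of the API.

```python
from datetime import datetime, timedelta, timezone

TIMESTAMP_FORMAT = "%Y%m%d%H%M"  # yyyymmddhhmm, minute granularity


def to_api_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC yyyymmddhhmm string."""
    return moment.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT)


def hour_window(start: datetime) -> dict:
    """Build fromDate/toDate values covering the 60 minutes beginning at start."""
    return {
        "fromDate": to_api_timestamp(start),                       # inclusive
        "toDate": to_api_timestamp(start + timedelta(hours=1)),    # not inclusive
    }


window = hour_window(datetime(2012, 7, 22, 12, 0, tzinfo=timezone.utc))
```

Here `window` requests 12:00 through 12:59 UTC on July 22, 2012, since the 13:00 minute named by toDate is excluded.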
Example data requests and responses
Example POST request
  • Request parameters in a POST request are sent via a JSON-formatted body, as shown below.
  • All portions of the PowerTrack rule being queried for (e.g. keywords, other operators like bounding_box:) should be placed in the ‘query’ parameter.
  • Do not split portions of the rule out as separate parameters in the query URL.

Here is an example POST (using cURL) command for making an initial data request:

    curl -X POST -u<username> "https://gnip-api.x.com/search/:product/accounts/:account_name/:label.json" -d '{"query":"from:twitterDev","maxResults":500,"fromDate":"yyyymmddhhmm","toDate":"yyyymmddhhmm"}'

If the API data response includes a ‘next’ token, below is a subsequent request that consists of the original request, with the ‘next’ parameter set to the provided token:

    curl -X POST -u<username> "https://gnip-api.x.com/search/:product/accounts/:account_name/:label.json" -d '{"query":"from:twitterDev","maxResults":500,"fromDate":"yyyymmddhhmm","toDate":"yyyymmddhhmm",
    "next":"NTcxODIyMDMyODMwMjU1MTA0"}'
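The same POST request can be assembled with the Python standard library. This is a sketch under stated assumptions: `build_search_request` is a hypothetical helper, the credentials are sent as HTTP Basic authentication (what cURL's -u option does by default), and the Content-Type header is an added assumption that the cURL example above does not set. The function only builds the request; sending it (for example with `urllib.request.urlopen`) is left to the caller.

```python
import base64
import json
import urllib.request


def build_search_request(url: str, username: str, password: str,
                         query: str, **optional) -> urllib.request.Request:
    """Build (but do not send) a POST request with a JSON-formatted body."""
    # The whole PowerTrack rule goes in 'query'; other parameters sit beside it.
    body = {"query": query, **optional}
    credentials = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Basic " + credentials.decode("ascii"),
            "Content-Type": "application/json",  # assumption, see lead-in
        },
        method="POST",
    )
```

To request a subsequent page, call the helper again with the same arguments plus `next="<token from the previous response>"`.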
Example GET request
  • Request parameters in a GET request are encoded into the URL, using standard URL encoding.
  • All portions of the PowerTrack rule being queried for (e.g. keywords, other operators like bounding_box:) should be placed in the ‘query’ parameter.
  • Do not split portions of the rule out as separate parameters in the query URL.

Here is an example GET (using cURL) command for making an initial data request:

    curl -u<username> "https://gnip-api.x.com/search/:product/accounts/:account_name/:label.json?query=from%3Atwitterdev&maxResults=500&fromDate=yyyymmddhhmm&toDate=yyyymmddhhmm"
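Hand-encoding a rule is error prone, so the query string for a GET request is usually best generated. A minimal Python sketch, assuming percent-encoding of every reserved character (including spaces as %20) is acceptable to the endpoint; `build_get_url` is a hypothetical helper name.

```python
from urllib.parse import quote, urlencode


def build_get_url(base_url: str, params: dict) -> str:
    """Append URL-encoded request parameters to the endpoint URL."""
    # quote_via=quote percent-encodes spaces as %20 rather than '+';
    # urlencode's default safe='' also encodes '/', ':' and '=' in values
    # such as 'next' tokens.
    return f"{base_url}?{urlencode(params, quote_via=quote)}"


url = build_get_url(
    "https://gnip-api.x.com/search/:product/accounts/:account_name/:label.json",
    {"query": "(snow OR cold OR blizzard) weather", "maxResults": 500},
)
```

The entire rule stays in the single ‘query’ parameter; only its characters are escaped.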
Example data responses

Note that for Tweets created starting September 29, 2022, Tweet objects will include Tweet edit metadata that describes its edit history. See the “Edit Tweets” fundamentals page for more details.

Below is an example response to a data query. This example assumes that there were more than ‘maxResults’ Tweets available so a ‘next’ token is provided for subsequent requests. If ‘maxResults’ or fewer Tweets are associated with your query, no ‘next’ token would be included in the response. The value of the ‘next’ element will change with each query and should be treated as an opaque string. The ‘next’ element will look like the following in the response body:

    {
      "results": [
        {--Tweet 1--},
        {--Tweet 2--},
        ...
        {--Tweet 500--}
      ],
      "next":"NTcxODIyMDMyODMwMjU1MTA0",
      "requestParameters":
        {
          "maxResults":500,
          "fromDate":"201101010000",
          "toDate":"201201010000"
        }
    }

The response to a subsequent request might look like the following (note the new Tweets and different ‘next’ value):

    {
      "results": [
        {--Tweet 501--},
        {--Tweet 502--},
        ...
        {--Tweet 1000--}
      ],
      "next":"R2hCDbpBFR6eLXGwiRF1cQ",
      "requestParameters":
        {
          "maxResults":500,
          "fromDate":"201101010000",
          "toDate":"201201010000"
        }
    }

You can continue to pass in the ‘next’ element from your previous query until you have received all Tweets from the time period covered by your query. When you receive a response that does not include a ‘next’ element, it means that you have reached the last page and no additional data is available in your time range.

Counts endpoint

/search/:stream/counts
Endpoint pattern:

/search/fullarchive/accounts/:account_name/:label/counts.json

This endpoint returns counts (data volumes) for the specified query. If a time period is not specified, the time parameters default to the last 30 days. Data volumes are returned as a timestamped array of daily, hourly (default), or per-minute counts.

Note: This functionality can also be accomplished using a GET request, instead of a POST, by encoding the parameters described below into the URL.

Counts request parameters
Parameter: query
Description: The equivalent of one PowerTrack rule, with up to 2,048 characters (and no limits on the number of positive and negative clauses).

This parameter should include ALL portions of the PowerTrack rule, including all operators, and portions of the rule should not be separated into other parameters of the query.

Note: Not all PowerTrack operators are supported. Refer to Available operators for a list of supported operators.
Required: Yes
Sample value: (snow OR cold OR blizzard) weather

Parameter: fromDate
Description: The oldest UTC timestamp (back to 3/21/2006) from which the Tweets will be provided. Timestamp is in minute granularity and is inclusive (i.e. 12:00 includes the 00 minute).

Specified: Using only the fromDate with no toDate parameter, the API will deliver counts (data volumes) data for the query going back in time from now until the fromDate. If the fromDate is older than 31 days from now( ), you will receive a next token to page through your request.

Not Specified: If a fromDate is not specified, the API will deliver counts (data volumes) for 30 days prior to now( ) or the toDate (if specified).

If neither the fromDate or toDate parameter is used, the API will deliver counts (data volumes) for the most recent 30 days, starting at the time of the request, going backwards.
Required: No
Sample value: 201207220000

Parameter: toDate
Description: The latest, most recent UTC timestamp to which the Tweets will be provided. Timestamp is in minute granularity and is not inclusive (i.e. 11:59 does not include the 59th minute of the hour).

Specified: Using only the toDate with no fromDate parameter will deliver the most recent counts (data volumes) for 30 days prior to the toDate.

Not Specified: If a toDate is not specified, the API will deliver counts (data volumes) for the query going back in time to the fromDate. If the fromDate is more than 31 days from now( ), you will receive a next token to page through your request.

If neither the fromDate or toDate parameter is used, the API will deliver counts (data volumes) for the most recent 30 days, starting at the time of the request, going backwards.
Required: No
Sample value: 201208220000

Parameter: bucket
Description: The unit of time for which count data will be provided. Count data can be returned for every day, hour or minute in the requested timeframe. By default, hourly counts will be provided. Options: ‘day’, ‘hour’, ‘minute’
Required: No
Sample value: minute

Parameter: next
Description: This parameter is used to get the next ‘page’ of results as described HERE. The value used with the parameter is pulled directly from the response provided by the API, and should not be modified.
Required: No
Sample value: NTcxODIyMDMyODMwMjU1MTA0
Additional details
Available Timeframe:
  30-Day: last 31 days
  Full-Archive: March 21, 2006 - Present

Query Format: The equivalent of one PowerTrack rule, with up to 2,048 characters.

Note: Not all PowerTrack operators are supported. Refer to Available operators for a list of supported operators.

Rate Limit: Partners will be rate limited at both minute and second granularity. The per minute rate limit will vary by partner as specified in your contract. However, these per-minute rate limits are not intended to be used in a single burst. Regardless of your per minute rate limit, all partners will be limited to a maximum of 20 requests per second, aggregated across all requests for data and/or counts.

Count Precision: The counts delivered through this endpoint reflect the number of Tweets that occurred and do not reflect any later compliance events (deletions, scrub geos). Some Tweets counted may not be available via data endpoint due to user compliance actions.
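Since the 20 requests per second ceiling is aggregated across data and counts requests, a client can share one throttle between both. The class below is a simple single-threaded sketch, not an official mechanism: `RequestThrottle` is a hypothetical name, it spaces calls at least 1/20 of a second apart, and it does nothing about the contract-specific per-minute limit.

```python
import time
from typing import Callable


class RequestThrottle:
    """Space successive calls at least 1 / max_per_second seconds apart."""

    def __init__(self, max_per_second: int = 20,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest time the next request may start

    def wait(self) -> None:
        """Block until the next request is allowed, then reserve its slot."""
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._min_interval
```

Call `throttle.wait()` immediately before every data or counts request. The clock and sleep functions are injectable so the behaviour can be checked without real delays.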
Example counts requests and responses
Example POST request
  • Request parameters in a POST request are sent via a JSON-formatted body, as shown below.
  • All portions of the PowerTrack rule being queried for (e.g. keywords, other operators like bounding_box:) should be placed in the ‘query’ parameter.
  • Do not split portions of the rule out as separate parameters in the query URL.

Here is an example POST (using cURL) command for making an initial counts request:

    curl -X POST -u<username> "https://gnip-api.x.com/search/:product/accounts/:account_name/:label/counts.json" -d '{"query":"TwitterDev","fromDate":"yyyymmddhhmm","toDate":"yyyymmddhhmm","bucket":"day"}'

If the API counts response includes a ‘next’ token, below is a subsequent request that consists of the original request, with the ‘next’ parameter set to the provided token:

    curl -X POST -u<username> "https://gnip-api.x.com/search/:product/accounts/:account_name/:label/counts.json" -d '{"query":"TwitterDev","fromDate":"yyyymmddhhmm","toDate":"yyyymmddhhmm","bucket":"day",
    "next":"YUcxO87yMDMyODMwMjU1MTA0"}'
Example GET request
  • Request parameters in a GET request are encoded into the URL, using standard URL encoding.
  • All portions of the PowerTrack rule being queried for (e.g. keywords, other operators like bounding_box:) should be placed in the ‘query’ parameter.
  • Do not split portions of the rule out as separate parameters in the query URL.

Here is an example GET (using cURL) command for making an initial counts request:

    curl -u<username> "https://gnip-api.x.com/search/fullarchive/accounts/:account_name/:label/counts.json?query=TwitterDev&bucket=day&fromDate=yyyymmddhhmm&toDate=yyyymmddhhmm"

Example counts responses

Below is an example response to a counts (data volume) query. This example response includes a ‘next’ token, meaning the counts request was for more than 31 days, or that the submitted query had a large enough volume associated with it to trigger a partial response.

The value of the ‘next’ element will change with each query and should be treated as an opaque string. The ‘next’ element will look like the following in the response body:

    {
      "results": [
        { "timePeriod": "201101010000", "count": 32 },
        { "timePeriod": "201101020000", "count": 45 },
        { "timePeriod": "201101030000", "count": 57 },
        { "timePeriod": "201101040000", "count": 123 },
        { "timePeriod": "201101050000", "count": 134 },
        { "timePeriod": "201101060000", "count": 120 },
        { "timePeriod": "201101070000", "count": 43 },
        { "timePeriod": "201101080000", "count": 65 },
        { "timePeriod": "201101090000", "count": 85 },
        { "timePeriod": "201101100000", "count": 32 },
        { "timePeriod": "201101110000", "count": 23 },
        { "timePeriod": "201101120000", "count": 85 },
        { "timePeriod": "201101130000", "count": 32 },
        { "timePeriod": "201101140000", "count": 95 },
        { "timePeriod": "201101150000", "count": 109 },
        { "timePeriod": "201101160000", "count": 34 },
        { "timePeriod": "201101170000", "count": 74 },
        { "timePeriod": "201101180000", "count": 24 },
        { "timePeriod": "201101190000", "count": 90 },
        { "timePeriod": "201101200000", "count": 85 },
        { "timePeriod": "201101210000", "count": 93 },
        { "timePeriod": "201101220000", "count": 48 },
        { "timePeriod": "201101230000", "count": 37 },
        { "timePeriod": "201101240000", "count": 54 },
        { "timePeriod": "201101250000", "count": 52 },
        { "timePeriod": "201101260000", "count": 84 },
        { "timePeriod": "201101270000", "count": 120 },
        { "timePeriod": "201101280000", "count": 34 },
        { "timePeriod": "201101290000", "count": 83 },
        { "timePeriod": "201101300000", "count": 23 },
        { "timePeriod": "201101310000", "count": 12 }
       ],
      "totalCount":2027,
      "next":"NTcxODIyMDMyODMwMjU1MTA0",
      "requestParameters":
        {
          "bucket":"day",
          "fromDate":"201101010000",
          "toDate":"201201010000"
        }
    }

The response to a subsequent request might look like the following (note the new counts timeline and different ‘next’ value):

    {
      "results": [
        { "timePeriod": "201102010000", "count": 45 },
        { "timePeriod": "201102020000", "count": 76 },
         ....
        { "timePeriod": "201103030000", "count": 13 }
     ],
     "totalCount":3288,
     "next":"WE79fnakFanyMDMyODMwMjU1MTA0",
     "requestParameters":
        {
          "bucket":"day",
          "fromDate":"201101010000",
          "toDate":"201201010000"
        }
    }

You can continue to pass in the ‘next’ element from your previous query until you have received all counts from the query time period. When you receive a response that does not include a ‘next’ element, it means that you have reached the last page and no additional counts are available in your time range.
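Once all counts pages have been collected, they can be combined into a single timeline. The sketch below is one possible approach: `merge_counts` is a hypothetical helper, and it sums the per-bucket ‘count’ values itself rather than relying on ‘totalCount’, because the examples above do not establish whether ‘totalCount’ is per-page or cumulative.

```python
from typing import Dict, Iterable


def merge_counts(pages: Iterable[dict]) -> Dict[str, int]:
    """Combine the 'results' buckets of several counts pages into one mapping."""
    merged: Dict[str, int] = {}
    for page in pages:
        for bucket in page.get("results", []):
            # Keying by timePeriod means a bucket repeated on two pages
            # overwrites itself instead of being counted twice.
            merged[bucket["timePeriod"]] = bucket["count"]
    return merged


def total_count(merged: Dict[str, int]) -> int:
    """Total volume across every bucket collected so far."""
    return sum(merged.values())
```

Because timePeriod values are fixed-width yyyymmddhhmm strings, `sorted(merged)` yields the buckets in chronological order.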

HTTP response codes

Status | Text | Description
200 | OK | The request was successful. The JSON response will be similar to the following:
400 | Bad Request | Generally, this response occurs due to the presence of invalid JSON in the request, or where the request failed to send any JSON payload.
401 | Unauthorized | HTTP authentication failed due to invalid credentials. Log in to console.gnip.com with your credentials to ensure you are using them correctly with your request.
404 | Not Found | The resource was not found at the URL to which the request was sent, likely because an incorrect URL was used.
422 | Unprocessable Entity | This is returned due to invalid parameters in the query, e.g. invalid PowerTrack rules.
429 | Unknown Code | Your app has exceeded the limit on connection requests. The corresponding JSON message will look similar to the following:
500 | Internal Server Error | There was an error on the server side. Retry your request using an exponential backoff pattern.
502 | Proxy Error | There was an error on the server side. Retry your request using an exponential backoff pattern.
503 | Service Unavailable | There was an error on the server side. Retry your request using an exponential backoff pattern.
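The exponential backoff pattern recommended for 500, 502, and 503 responses can be sketched as below. The names are hypothetical, `send` is assumed to perform one request and return a (status code, decoded body) pair, and treating 429 as retryable is an assumption of this sketch rather than something the table states.

```python
import time
from typing import Callable, Tuple

# 500/502/503 per the table above; 429 is included as an assumption.
RETRYABLE_STATUSES = {429, 500, 502, 503}


def send_with_backoff(send: Callable[[], Tuple[int, dict]],
                      max_attempts: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, dict]:
    """Call send(), doubling the wait after each retryable failure."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUSES or attempt == max_attempts - 1:
            return status, body  # success, non-retryable error, or out of attempts
        sleep(base_delay * (2 ** attempt))  # waits 1s, 2s, 4s, ... by default
    raise ValueError("max_attempts must be at least 1")
```

Client errors such as 400, 401, 404, and 422 are returned immediately, since repeating an invalid request will not change the outcome.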