diff --git a/docs/35.0.0/api-reference/api-reference.md b/docs/35.0.0/api-reference/api-reference.md
new file mode 100644
index 0000000000..dd4a4ab638
--- /dev/null
+++ b/docs/35.0.0/api-reference/api-reference.md
@@ -0,0 +1,44 @@
+---
+id: api-reference
+title: API reference
+sidebar_label: Overview
+---
+
+
+
+
+This topic is an index to the Apache Druid API documentation.
+
+## HTTP APIs
+* [Druid SQL queries](./sql-api.md) to submit SQL queries using the Druid SQL API.
+* [SQL-based ingestion](./sql-ingestion-api.md) to submit SQL-based batch ingestion requests.
+* [JSON querying](./json-querying-api.md) to submit JSON-based native queries.
+* [Tasks](./tasks-api.md) to manage data ingestion operations.
+* [Supervisors](./supervisor-api.md) to manage supervisors for data ingestion lifecycle and data processing.
+* [Retention rules](./retention-rules-api.md) to define and manage data retention rules across datasources.
+* [Data management](./data-management-api.md) to manage data segments.
+* [Automatic compaction](./automatic-compaction-api.md) to optimize segment sizes after ingestion.
+* [Lookups](./lookups-api.md) to manage and modify key-value datasources.
+* [Service status](./service-status-api.md) to monitor components within the Druid cluster.
+* [Dynamic configuration](./dynamic-configuration-api.md) to configure the behavior of the Coordinator and Overlord processes.
+* [Legacy metadata](./legacy-metadata-api.md) to retrieve datasource metadata.
+
+## Java APIs
+* [SQL JDBC driver](./sql-jdbc.md) to connect to Druid and make Druid SQL queries using the Avatica JDBC driver.
\ No newline at end of file
diff --git a/docs/35.0.0/api-reference/automatic-compaction-api.md b/docs/35.0.0/api-reference/automatic-compaction-api.md
new file mode 100644
index 0000000000..f3744a45f0
--- /dev/null
+++ b/docs/35.0.0/api-reference/automatic-compaction-api.md
@@ -0,0 +1,1592 @@
+---
+id: automatic-compaction-api
+title: Automatic compaction API
+sidebar_label: Automatic compaction
+---
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+
+
+This topic describes the status and configuration API endpoints for [automatic compaction using Coordinator duties](../data-management/automatic-compaction.md#auto-compaction-using-coordinator-duties) in Apache Druid. You can configure automatic compaction in the Druid web console or API.
+
+:::info[Experimental]
+
+Instead of the automatic compaction API, you can use the supervisor API to submit auto-compaction jobs using compaction supervisors. For more information, see [Auto-compaction using compaction supervisors](../data-management/automatic-compaction.md#auto-compaction-using-compaction-supervisors).
+
+:::
+
+In this topic, `http://ROUTER_IP:ROUTER_PORT` is a placeholder for your Router service address and port. Replace it with the information for your deployment. For example, use `http://localhost:8888` for quickstart deployments.
+
+## Manage automatic compaction
+
+### Create or update automatic compaction configuration
+
+Creates or updates the automatic compaction configuration for a datasource. Pass the automatic compaction configuration as a JSON object in the request body.
+
+The automatic compaction configuration requires only the `dataSource` property. Druid fills all other properties with default values if not specified. See [Automatic compaction dynamic configuration](../configuration/index.md#automatic-compaction-dynamic-configuration) for configuration details.
+
+Note that this endpoint returns an HTTP `200 OK` message code even if the datasource name does not exist.
+
+#### URL
+
+`POST` `/druid/coordinator/v1/config/compaction`
+
+#### Responses
+
+
+
+
+
+
+*Successfully submitted auto compaction configuration*
+
+
+
+
+---
+#### Sample request
+
+The following example creates an automatic compaction configuration for the datasource `wikipedia_hour`, which was ingested with `HOUR` segment granularity. This automatic compaction configuration performs compaction on `wikipedia_hour`, resulting in compacted segments that represent a day interval of data.
+
+In this example:
+
+* `wikipedia_hour` is a datasource with `HOUR` segment granularity.
+* `skipOffsetFromLatest` is set to `PT0S`, meaning that no data is skipped.
+* `partitionsSpec` is set to the default `dynamic`, allowing Druid to dynamically determine the optimal partitioning strategy.
+* `type` is set to `index_parallel`, meaning that parallel indexing is used.
+* `segmentGranularity` is set to `DAY`, meaning that each compacted segment is a day of data.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/config/compaction"\
+--header 'Content-Type: application/json' \
+--data '{
+ "dataSource": "wikipedia_hour",
+ "skipOffsetFromLatest": "PT0S",
+ "tuningConfig": {
+ "partitionsSpec": {
+ "type": "dynamic"
+ },
+ "type": "index_parallel"
+ },
+ "granularitySpec": {
+ "segmentGranularity": "DAY"
+ }
+}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/coordinator/v1/config/compaction HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 281
+
+{
+ "dataSource": "wikipedia_hour",
+ "skipOffsetFromLatest": "PT0S",
+ "tuningConfig": {
+ "partitionsSpec": {
+ "type": "dynamic"
+ },
+ "type": "index_parallel"
+ },
+ "granularitySpec": {
+ "segmentGranularity": "DAY"
+ }
+}
+```
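+
+As noted above, only the `dataSource` property is required; Druid fills in defaults for everything else. A minimal request could therefore look like the following sketch:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/config/compaction" \
+--header 'Content-Type: application/json' \
+--data '{"dataSource": "wikipedia_hour"}'
+```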
+
+
+
+
+#### Sample response
+
+A successful request returns an HTTP `200 OK` message code and an empty response body.
+
+
+### Remove automatic compaction configuration
+
+Removes the automatic compaction configuration for a datasource. This updates the compaction status of the datasource to "Not enabled."
+
+#### URL
+
+`DELETE` `/druid/coordinator/v1/config/compaction/{dataSource}`
+
+#### Responses
+
+
+
+
+
+
+*Successfully deleted automatic compaction configuration*
+
+
+
+
+
+*Datasource does not have automatic compaction or invalid datasource name*
+
+
+
+
+---
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl --request DELETE "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/config/compaction/wikipedia_hour"
+```
+
+
+
+
+
+```HTTP
+DELETE /druid/coordinator/v1/config/compaction/wikipedia_hour HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+A successful request returns an HTTP `200 OK` message code and an empty response body.
+
+### Update capacity for compaction tasks
+
+:::info
+This API is now deprecated. Use [Update cluster-level compaction config](#update-cluster-level-compaction-config) instead.
+:::
+
+Updates the capacity for compaction tasks. The minimum number of compaction tasks is 1 and the maximum is 2147483647.
+
+Note that while the maximum number of compaction task slots can theoretically be set to 2147483647, the practical limit is determined by the available cluster capacity and, by default, is capped at 10% of the cluster's total task slots. For example, a cluster with 50 total task slots allows at most 5 compaction task slots unless you increase the ratio.
+
+#### URL
+
+`POST` `/druid/coordinator/v1/config/compaction/taskslots`
+
+#### Query parameters
+
+To limit the maximum number of compaction tasks, use the optional query parameters `ratio` and `max`:
+
+* `ratio` (optional)
+ * Type: Float
+ * Default: 0.1
+  * Limits the ratio of compaction task slots to total task slots.
+* `max` (optional)
+ * Type: Int
+ * Default: 2147483647
+ * Limits the maximum number of task slots for compaction tasks.
+
+#### Responses
+
+
+
+
+
+
+*Successfully updated compaction configuration*
+
+
+
+
+
+*Invalid `max` value*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/config/compaction/taskslots?ratio=0.2&max=250000"
+```
+
+
+
+
+
+```HTTP
+POST /druid/coordinator/v1/config/compaction/taskslots?ratio=0.2&max=250000 HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+A successful request returns an HTTP `200 OK` message code and an empty response body.
+
+## View automatic compaction configuration
+
+### Get all automatic compaction configurations
+
+Retrieves all automatic compaction configurations. Returns a `compactionConfigs` object containing the active automatic compaction configurations of all datasources.
+
+You can use this endpoint to retrieve `compactionTaskSlotRatio` and `maxCompactionTaskSlots` values for managing resource allocation of compaction tasks.
+
+#### URL
+
+`GET` `/druid/coordinator/v1/config/compaction`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved automatic compaction configurations*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/config/compaction"
+```
+
+
+
+
+
+```HTTP
+GET /druid/coordinator/v1/config/compaction HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "compactionConfigs": [
+ {
+ "dataSource": "wikipedia_hour",
+ "taskPriority": 25,
+ "inputSegmentSizeBytes": 100000000000000,
+ "maxRowsPerSegment": null,
+ "skipOffsetFromLatest": "PT0S",
+ "tuningConfig": {
+ "maxRowsInMemory": null,
+ "appendableIndexSpec": null,
+ "maxBytesInMemory": null,
+ "maxTotalRows": null,
+ "splitHintSpec": null,
+ "partitionsSpec": {
+ "type": "dynamic",
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null
+ },
+ "indexSpec": null,
+ "indexSpecForIntermediatePersists": null,
+ "maxPendingPersists": null,
+ "pushTimeout": null,
+ "segmentWriteOutMediumFactory": null,
+ "maxNumConcurrentSubTasks": null,
+ "maxRetry": null,
+ "taskStatusCheckPeriodMs": null,
+ "chatHandlerTimeout": null,
+ "chatHandlerNumRetries": null,
+ "maxNumSegmentsToMerge": null,
+ "totalNumMergeTasks": null,
+ "maxColumnsToMerge": null,
+ "type": "index_parallel",
+ "forceGuaranteedRollup": false
+ },
+ "granularitySpec": {
+ "segmentGranularity": "DAY",
+ "queryGranularity": null,
+ "rollup": null
+ },
+ "dimensionsSpec": null,
+ "metricsSpec": null,
+ "transformSpec": null,
+ "ioConfig": null,
+ "taskContext": null
+ },
+ {
+ "dataSource": "wikipedia",
+ "taskPriority": 25,
+ "inputSegmentSizeBytes": 100000000000000,
+ "maxRowsPerSegment": null,
+ "skipOffsetFromLatest": "PT0S",
+ "tuningConfig": {
+ "maxRowsInMemory": null,
+ "appendableIndexSpec": null,
+ "maxBytesInMemory": null,
+ "maxTotalRows": null,
+ "splitHintSpec": null,
+ "partitionsSpec": {
+ "type": "dynamic",
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null
+ },
+ "indexSpec": null,
+ "indexSpecForIntermediatePersists": null,
+ "maxPendingPersists": null,
+ "pushTimeout": null,
+ "segmentWriteOutMediumFactory": null,
+ "maxNumConcurrentSubTasks": null,
+ "maxRetry": null,
+ "taskStatusCheckPeriodMs": null,
+ "chatHandlerTimeout": null,
+ "chatHandlerNumRetries": null,
+ "maxNumSegmentsToMerge": null,
+ "totalNumMergeTasks": null,
+ "maxColumnsToMerge": null,
+ "type": "index_parallel",
+ "forceGuaranteedRollup": false
+ },
+ "granularitySpec": {
+ "segmentGranularity": "DAY",
+ "queryGranularity": null,
+ "rollup": null
+ },
+ "dimensionsSpec": null,
+ "metricsSpec": null,
+ "transformSpec": null,
+ "ioConfig": null,
+ "taskContext": null
+ }
+ ],
+ "compactionTaskSlotRatio": 0.1,
+ "maxCompactionTaskSlots": 2147483647,
+
+}
+```
+
+
+### Get automatic compaction configuration
+
+Retrieves the automatic compaction configuration for a datasource.
+
+#### URL
+
+`GET` `/druid/coordinator/v1/config/compaction/{dataSource}`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved configuration for datasource*
+
+
+
+
+
+*Invalid datasource or datasource does not have automatic compaction enabled*
+
+
+
+
+---
+
+#### Sample request
+
+The following example retrieves the automatic compaction configuration for datasource `wikipedia_hour`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/config/compaction/wikipedia_hour"
+```
+
+
+
+
+
+```HTTP
+GET /druid/coordinator/v1/config/compaction/wikipedia_hour HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "dataSource": "wikipedia_hour",
+ "taskPriority": 25,
+ "inputSegmentSizeBytes": 100000000000000,
+ "maxRowsPerSegment": null,
+ "skipOffsetFromLatest": "PT0S",
+ "tuningConfig": {
+ "maxRowsInMemory": null,
+ "appendableIndexSpec": null,
+ "maxBytesInMemory": null,
+ "maxTotalRows": null,
+ "splitHintSpec": null,
+ "partitionsSpec": {
+ "type": "dynamic",
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null
+ },
+ "indexSpec": null,
+ "indexSpecForIntermediatePersists": null,
+ "maxPendingPersists": null,
+ "pushTimeout": null,
+ "segmentWriteOutMediumFactory": null,
+ "maxNumConcurrentSubTasks": null,
+ "maxRetry": null,
+ "taskStatusCheckPeriodMs": null,
+ "chatHandlerTimeout": null,
+ "chatHandlerNumRetries": null,
+ "maxNumSegmentsToMerge": null,
+ "totalNumMergeTasks": null,
+ "maxColumnsToMerge": null,
+ "type": "index_parallel",
+ "forceGuaranteedRollup": false
+ },
+ "granularitySpec": {
+ "segmentGranularity": "DAY",
+ "queryGranularity": null,
+ "rollup": null
+ },
+ "dimensionsSpec": null,
+ "metricsSpec": null,
+ "transformSpec": null,
+ "ioConfig": null,
+ "taskContext": null
+}
+```
+
+
+### Get automatic compaction configuration history
+
+Retrieves the history of the automatic compaction configuration for a datasource. Returns an empty list if the datasource does not exist or there is no compaction history for the datasource.
+
+The response contains a list of objects with the following keys:
+* `globalConfig`: A JSON object containing automatic compaction configuration that applies to the entire cluster.
+* `compactionConfig`: A JSON object containing the automatic compaction configuration for the datasource.
+* `auditInfo`: A JSON object containing information about the change made, such as `author`, `comment`, or `ip`.
+* `auditTime`: The date and time when the change was made.
+
+#### URL
+
+`GET` `/druid/coordinator/v1/config/compaction/{dataSource}/history`
+
+#### Query parameters
+* `interval` (optional)
+ * Type: ISO-8601
+ * Limits the results within a specified interval. Use `/` as the delimiter for the interval string.
+* `count` (optional)
+ * Type: Int
+ * Limits the number of results.
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved configuration history*
+
+
+
+
+
+*Invalid `count` value*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/config/compaction/wikipedia_hour/history"
+```
+
+
+
+
+
+```HTTP
+GET /druid/coordinator/v1/config/compaction/wikipedia_hour/history HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
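+
+To filter the history, append the optional query parameters described above. The following sketch limits the results to a one-month interval and at most 10 entries (the interval values are illustrative):
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/config/compaction/wikipedia_hour/history?interval=2023-07-01%2F2023-08-01&count=10"
+```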
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+[
+ {
+ "globalConfig": {
+ "compactionTaskSlotRatio": 0.1,
+ "maxCompactionTaskSlots": 2147483647,
+ "compactionPolicy": {
+ "type": "newestSegmentFirst",
+ "priorityDatasource": "wikipedia"
+ },
+ "useSupervisors": true,
+ "engine": "native"
+ },
+ "compactionConfig": {
+ "dataSource": "wikipedia_hour",
+ "taskPriority": 25,
+ "inputSegmentSizeBytes": 100000000000000,
+ "maxRowsPerSegment": null,
+ "skipOffsetFromLatest": "P1D",
+ "tuningConfig": null,
+ "granularitySpec": {
+ "segmentGranularity": "DAY",
+ "queryGranularity": null,
+ "rollup": null
+ },
+ "dimensionsSpec": null,
+ "metricsSpec": null,
+ "transformSpec": null,
+ "ioConfig": null,
+ "taskContext": null
+ },
+ "auditInfo": {
+ "author": "",
+ "comment": "",
+ "ip": "127.0.0.1"
+ },
+ "auditTime": "2023-07-31T18:15:19.302Z"
+ },
+ {
+ "globalConfig": {
+ "compactionTaskSlotRatio": 0.1,
+ "maxCompactionTaskSlots": 2147483647,
+ "compactionPolicy": {
+ "type": "newestSegmentFirst"
+ },
+ "useSupervisors": false,
+ "engine": "native"
+ },
+ "compactionConfig": {
+ "dataSource": "wikipedia_hour",
+ "taskPriority": 25,
+ "inputSegmentSizeBytes": 100000000000000,
+ "maxRowsPerSegment": null,
+ "skipOffsetFromLatest": "PT0S",
+ "tuningConfig": {
+ "maxRowsInMemory": null,
+ "appendableIndexSpec": null,
+ "maxBytesInMemory": null,
+ "maxTotalRows": null,
+ "splitHintSpec": null,
+ "partitionsSpec": {
+ "type": "dynamic",
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null
+ },
+ "indexSpec": null,
+ "indexSpecForIntermediatePersists": null,
+ "maxPendingPersists": null,
+ "pushTimeout": null,
+ "segmentWriteOutMediumFactory": null,
+ "maxNumConcurrentSubTasks": null,
+ "maxRetry": null,
+ "taskStatusCheckPeriodMs": null,
+ "chatHandlerTimeout": null,
+ "chatHandlerNumRetries": null,
+ "maxNumSegmentsToMerge": null,
+ "totalNumMergeTasks": null,
+ "maxColumnsToMerge": null,
+ "type": "index_parallel",
+ "forceGuaranteedRollup": false
+ },
+ "granularitySpec": {
+ "segmentGranularity": "DAY",
+ "queryGranularity": null,
+ "rollup": null
+ },
+ "dimensionsSpec": null,
+ "metricsSpec": null,
+ "transformSpec": null,
+ "ioConfig": null,
+ "taskContext": null
+ },
+ "auditInfo": {
+ "author": "",
+ "comment": "",
+ "ip": "127.0.0.1"
+ },
+ "auditTime": "2023-07-31T18:16:16.362Z"
+ }
+]
+```
+
+
+## View automatic compaction status
+
+### Get segments awaiting compaction
+
+Returns the total size of segments awaiting compaction for a given datasource. Returns a 404 response if a datasource does not have automatic compaction enabled.
+
+#### URL
+
+`GET` `/druid/coordinator/v1/compaction/progress?dataSource={dataSource}`
+
+#### Query parameter
+* `dataSource` (required)
+ * Type: String
+ * Name of the datasource for this status information.
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved segment size awaiting compaction*
+
+
+
+
+
+*Unknown datasource name or datasource does not have automatic compaction enabled*
+
+
+
+
+---
+
+#### Sample request
+
+The following example retrieves the total size of segments awaiting compaction for the datasource `wikipedia_hour`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/compaction/progress?dataSource=wikipedia_hour"
+```
+
+
+
+
+
+```HTTP
+GET /druid/coordinator/v1/compaction/progress?dataSource=wikipedia_hour HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "remainingSegmentSize": 7615837
+}
+```
+
+
+
+### Get compaction status and statistics
+
+Retrieves an array of `latestStatus` objects representing the status and statistics from the latest automatic compaction run for all datasources with automatic compaction enabled.
+
+#### Compaction status response
+
+The `latestStatus` object has the following properties:
+* `dataSource`: Name of the datasource for this status information.
+* `scheduleStatus`: Automatic compaction scheduling status. Possible values are `NOT_ENABLED` and `RUNNING`. Returns `RUNNING` if the datasource has an active automatic compaction configuration submitted. Otherwise, returns `NOT_ENABLED`.
+* `bytesAwaitingCompaction`: Total bytes of this datasource waiting to be compacted by the automatic compaction (only considering intervals/segments that are eligible for automatic compaction).
+* `bytesCompacted`: Total bytes of this datasource that are already compacted with the spec set in the automatic compaction configuration.
+* `bytesSkipped`: Total bytes of this datasource that are skipped (not eligible for automatic compaction) by the automatic compaction.
+* `segmentCountAwaitingCompaction`: Total number of segments of this datasource waiting to be compacted by the automatic compaction (only considering intervals/segments that are eligible for automatic compaction).
+* `segmentCountCompacted`: Total number of segments of this datasource that are already compacted with the spec set in the automatic compaction configuration.
+* `segmentCountSkipped`: Total number of segments of this datasource that are skipped (not eligible for automatic compaction) by the automatic compaction.
+* `intervalCountAwaitingCompaction`: Total number of intervals of this datasource waiting to be compacted by the automatic compaction (only considering intervals/segments that are eligible for automatic compaction).
+* `intervalCountCompacted`: Total number of intervals of this datasource that are already compacted with the spec set in the automatic compaction configuration.
+* `intervalCountSkipped`: Total number of intervals of this datasource that are skipped (not eligible for automatic compaction) by the automatic compaction.
+
+#### URL
+
+`GET` `/druid/coordinator/v1/compaction/status`
+
+#### Query parameters
+* `dataSource` (optional)
+ * Type: String
+ * Filter the result by name of a specific datasource.
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved `latestStatus` object*
+
+
+
+
+---
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/compaction/status"
+```
+
+
+
+
+
+```HTTP
+GET /druid/coordinator/v1/compaction/status HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
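+
+To check a single datasource, append the optional `dataSource` query parameter. For example:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/compaction/status?dataSource=wikipedia_hour"
+```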
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "latestStatus": [
+ {
+ "dataSource": "wikipedia_api",
+ "scheduleStatus": "RUNNING",
+ "bytesAwaitingCompaction": 0,
+ "bytesCompacted": 0,
+ "bytesSkipped": 64133616,
+ "segmentCountAwaitingCompaction": 0,
+ "segmentCountCompacted": 0,
+ "segmentCountSkipped": 8,
+ "intervalCountAwaitingCompaction": 0,
+ "intervalCountCompacted": 0,
+ "intervalCountSkipped": 1
+ },
+ {
+ "dataSource": "wikipedia_hour",
+ "scheduleStatus": "RUNNING",
+ "bytesAwaitingCompaction": 0,
+ "bytesCompacted": 5998634,
+ "bytesSkipped": 0,
+ "segmentCountAwaitingCompaction": 0,
+ "segmentCountCompacted": 1,
+ "segmentCountSkipped": 0,
+ "intervalCountAwaitingCompaction": 0,
+ "intervalCountCompacted": 1,
+ "intervalCountSkipped": 0
+ }
+ ]
+}
+```
+
+
+## [Experimental] Unified compaction APIs
+
+This section describes the new unified compaction APIs, which work regardless of whether compaction supervisors are enabled in the compaction dynamic configuration (that is, whether `useSupervisors` is `true`).
+
+- If compaction supervisors are disabled, the APIs read or write the compaction dynamic config, the same as the Coordinator-based compaction APIs above.
+- If compaction supervisors are enabled, the APIs read or write the corresponding compaction supervisors. In addition to the APIs described below, you can also use the supervisor APIs to read or write compaction supervisors; they offer greater flexibility and also expose supervisor and task status information.
+
+### Update cluster-level compaction config
+
+Updates the cluster-level configuration for compaction tasks, which applies to all datasources unless explicitly overridden in the datasource compaction config.
+This includes the following fields:
+
+|Config|Description|Default value|
+|------|-----------|-------------|
+|`compactionTaskSlotRatio`|Ratio of the number of task slots taken up by compaction tasks to the total number of task slots across all workers.|0.1|
+|`maxCompactionTaskSlots`|Maximum number of task slots that can be taken up by compaction tasks and their sub-tasks. The minimum number of task slots available for compaction is 1. When using the MSQ engine, or the native engine with range partitioning, a single compaction job occupies more than one task slot; in this case, the minimum is 2 so that at least one compaction job can always run in the cluster.|2147483647 (that is, all task slots)|
+|`compactionPolicy`|Policy to choose intervals for compaction. Currently, the only supported policy is [Newest segment first](#compaction-policy-newestsegmentfirst).|Newest segment first|
+|`useSupervisors`|Whether compaction should run on the Overlord using supervisors instead of Coordinator duties.|false|
+|`engine`|Engine used for running compaction tasks, unless overridden in the datasource-level compaction config. Possible values are `native` and `msq`. The `msq` engine can be used for compaction only if `useSupervisors` is `true`.|`native`|
+
+#### Compaction policy `newestSegmentFirst`
+
+|Field|Description|Default value|
+|-----|-----------|-------------|
+|`type`|This must always be `newestSegmentFirst`||
+|`priorityDatasource`|Datasource to prioritize for compaction. The intervals of this datasource are chosen for compaction before the intervals of any other datasource. Within this datasource, the intervals are prioritized based on the chosen compaction policy.|None|
+
+
+#### URL
+
+`POST` `/druid/indexer/v1/compaction/config/cluster`
+
+#### Responses
+
+
+
+
+
+
+*Successfully updated compaction configuration*
+
+
+
+
+
+*Invalid compaction configuration values*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/config/compaction/cluster" \
+--header 'Content-Type: application/json' \
+--data '{
+ "compactionTaskSlotRatio": 0.5,
+ "maxCompactionTaskSlots": 1500,
+ "compactionPolicy": {
+ "type": "newestSegmentFirst",
+ "priorityDatasource": "wikipedia"
+ },
+ "useSupervisors": true,
+ "engine": "msq"
+}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/compaction/config/cluster HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+
+{
+ "compactionTaskSlotRatio": 0.5,
+ "maxCompactionTaskSlots": 1500,
+ "compactionPolicy": {
+ "type": "newestSegmentFirst",
+ "priorityDatasource": "wikipedia"
+ },
+ "useSupervisors": true,
+ "engine": "msq"
+}
+```
+
+
+
+
+#### Sample response
+
+A successful request returns an HTTP `200 OK` message code and an empty response body.
+
+### Get cluster-level compaction config
+
+Retrieves the cluster-level configuration for compaction tasks, which applies to all datasources unless explicitly overridden in the datasource compaction config.
+This includes all the fields listed in [Update cluster-level compaction config](#update-cluster-level-compaction-config).
+
+#### URL
+
+`GET` `/druid/indexer/v1/compaction/config/cluster`
+
+#### Responses
+
+
+
+
+
+*Successfully retrieved cluster compaction configuration*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/compaction/config/cluster"
+```
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/compaction/config/cluster HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "compactionTaskSlotRatio": 0.5,
+ "maxCompactionTaskSlots": 1500,
+ "compactionPolicy": {
+ "type": "newestSegmentFirst",
+ "priorityDatasource": "wikipedia"
+ },
+ "useSupervisors": true,
+ "engine": "msq"
+}
+```
+
+
+
+### Get automatic compaction configurations for all datasources
+
+Retrieves all datasource compaction configurations.
+
+#### URL
+
+`GET` `/druid/indexer/v1/compaction/config/datasources`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved automatic compaction configurations*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/compaction/config/datasources"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/compaction/config/datasources HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "compactionConfigs": [
+ {
+ "dataSource": "wikipedia_hour",
+ "taskPriority": 25,
+ "inputSegmentSizeBytes": 100000000000000,
+ "maxRowsPerSegment": null,
+ "skipOffsetFromLatest": "PT0S",
+ "tuningConfig": {
+ "maxRowsInMemory": null,
+ "appendableIndexSpec": null,
+ "maxBytesInMemory": null,
+ "maxTotalRows": null,
+ "splitHintSpec": null,
+ "partitionsSpec": {
+ "type": "dynamic",
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null
+ },
+ "indexSpec": null,
+ "indexSpecForIntermediatePersists": null,
+ "maxPendingPersists": null,
+ "pushTimeout": null,
+ "segmentWriteOutMediumFactory": null,
+ "maxNumConcurrentSubTasks": null,
+ "maxRetry": null,
+ "taskStatusCheckPeriodMs": null,
+ "chatHandlerTimeout": null,
+ "chatHandlerNumRetries": null,
+ "maxNumSegmentsToMerge": null,
+ "totalNumMergeTasks": null,
+ "maxColumnsToMerge": null,
+ "type": "index_parallel",
+ "forceGuaranteedRollup": false
+ },
+ "granularitySpec": {
+ "segmentGranularity": "DAY",
+ "queryGranularity": null,
+ "rollup": null
+ },
+ "dimensionsSpec": null,
+ "metricsSpec": null,
+ "transformSpec": null,
+ "ioConfig": null,
+ "taskContext": null
+ },
+ {
+ "dataSource": "wikipedia",
+ "taskPriority": 25,
+ "inputSegmentSizeBytes": 100000000000000,
+ "maxRowsPerSegment": null,
+ "skipOffsetFromLatest": "PT0S",
+ "tuningConfig": {
+ "maxRowsInMemory": null,
+ "appendableIndexSpec": null,
+ "maxBytesInMemory": null,
+ "maxTotalRows": null,
+ "splitHintSpec": null,
+ "partitionsSpec": {
+ "type": "dynamic",
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null
+ },
+ "indexSpec": null,
+ "indexSpecForIntermediatePersists": null,
+ "maxPendingPersists": null,
+ "pushTimeout": null,
+ "segmentWriteOutMediumFactory": null,
+ "maxNumConcurrentSubTasks": null,
+ "maxRetry": null,
+ "taskStatusCheckPeriodMs": null,
+ "chatHandlerTimeout": null,
+ "chatHandlerNumRetries": null,
+ "maxNumSegmentsToMerge": null,
+ "totalNumMergeTasks": null,
+ "maxColumnsToMerge": null,
+ "type": "index_parallel",
+ "forceGuaranteedRollup": false
+ },
+ "granularitySpec": {
+ "segmentGranularity": "DAY",
+ "queryGranularity": null,
+ "rollup": null
+ },
+ "dimensionsSpec": null,
+ "metricsSpec": null,
+ "transformSpec": null,
+ "ioConfig": null,
+ "taskContext": null
+ }
+ ]
+}
+```
+
+
+### Get automatic compaction configuration for a datasource
+
+Retrieves the automatic compaction configuration for a datasource.
+
+#### URL
+
+`GET` `/druid/indexer/v1/compaction/config/datasources/{dataSource}`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved configuration for datasource*
+
+
+
+
+
+*Invalid datasource or datasource does not have automatic compaction enabled*
+
+
+
+
+---
+
+#### Sample request
+
+The following example retrieves the automatic compaction configuration for datasource `wikipedia_hour`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/compaction/config/datasources/wikipedia_hour"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/compaction/config/datasources/wikipedia_hour HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "dataSource": "wikipedia_hour",
+ "taskPriority": 25,
+ "inputSegmentSizeBytes": 100000000000000,
+ "maxRowsPerSegment": null,
+ "skipOffsetFromLatest": "PT0S",
+ "tuningConfig": {
+ "maxRowsInMemory": null,
+ "appendableIndexSpec": null,
+ "maxBytesInMemory": null,
+ "maxTotalRows": null,
+ "splitHintSpec": null,
+ "partitionsSpec": {
+ "type": "dynamic",
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null
+ },
+ "indexSpec": null,
+ "indexSpecForIntermediatePersists": null,
+ "maxPendingPersists": null,
+ "pushTimeout": null,
+ "segmentWriteOutMediumFactory": null,
+ "maxNumConcurrentSubTasks": null,
+ "maxRetry": null,
+ "taskStatusCheckPeriodMs": null,
+ "chatHandlerTimeout": null,
+ "chatHandlerNumRetries": null,
+ "maxNumSegmentsToMerge": null,
+ "totalNumMergeTasks": null,
+ "maxColumnsToMerge": null,
+ "type": "index_parallel",
+ "forceGuaranteedRollup": false
+ },
+ "granularitySpec": {
+ "segmentGranularity": "DAY",
+ "queryGranularity": null,
+ "rollup": null
+ },
+ "dimensionsSpec": null,
+ "metricsSpec": null,
+ "transformSpec": null,
+ "ioConfig": null,
+ "taskContext": null
+}
+```
+
+
+### Create or update automatic compaction configuration for a datasource
+
+Creates or updates the automatic compaction configuration for a datasource. Pass the automatic compaction configuration as a JSON object in the request body.
+
+The automatic compaction configuration requires only the `dataSource` property. Druid fills all other properties with default values if not specified. See [Automatic compaction dynamic configuration](../configuration/index.md#automatic-compaction-dynamic-configuration) for configuration details.
+
+Note that this endpoint returns an HTTP `200 OK` message code even if the datasource name does not exist.
+
+#### URL
+
+`POST` `/druid/indexer/v1/compaction/config/datasources/{dataSource}`
+
+#### Responses
+
+
+
+
+
+
+*Successfully submitted auto compaction configuration*
+
+
+
+
+---
+#### Sample request
+
+The following example creates an automatic compaction configuration for the datasource `wikipedia_hour`, which was ingested with `HOUR` segment granularity. This automatic compaction configuration performs compaction on `wikipedia_hour`, resulting in compacted segments that represent a day interval of data.
+
+In this example:
+
+* `wikipedia_hour` is a datasource with `HOUR` segment granularity.
+* `skipOffsetFromLatest` is set to `PT0S`, meaning that no data is skipped.
+* `partitionsSpec` is set to the default `dynamic`, allowing Druid to dynamically determine the optimal partitioning strategy.
+* `type` is set to `index_parallel`, meaning that parallel indexing is used.
+* `segmentGranularity` is set to `DAY`, meaning that each compacted segment is a day of data.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/compaction/config/datasources/wikipedia_hour"\
+--header 'Content-Type: application/json' \
+--data '{
+ "dataSource": "wikipedia_hour",
+ "skipOffsetFromLatest": "PT0S",
+ "tuningConfig": {
+ "partitionsSpec": {
+ "type": "dynamic"
+ },
+ "type": "index_parallel"
+ },
+ "granularitySpec": {
+ "segmentGranularity": "DAY"
+ }
+}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/compaction/config/datasources/wikipedia_hour HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 281
+
+{
+ "dataSource": "wikipedia_hour",
+ "skipOffsetFromLatest": "PT0S",
+ "tuningConfig": {
+ "partitionsSpec": {
+ "type": "dynamic"
+ },
+ "type": "index_parallel"
+ },
+ "granularitySpec": {
+ "segmentGranularity": "DAY"
+ }
+}
+```
+
+
+
+
+#### Sample response
+
+A successful request returns an HTTP `200 OK` message code and an empty response body.
+
+
+### Delete automatic compaction configuration for a datasource
+
+Removes the automatic compaction configuration for a datasource. This updates the compaction status of the datasource to "Not enabled."
+
+#### URL
+
+`DELETE` `/druid/indexer/v1/compaction/config/datasources/{dataSource}`
+
+#### Responses
+
+
+
+
+
+
+*Successfully deleted automatic compaction configuration*
+
+
+
+
+
+*Datasource does not have automatic compaction or invalid datasource name*
+
+
+
+
+---
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl --request DELETE "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/compaction/config/datasources/wikipedia_hour"
+```
+
+
+
+
+
+```HTTP
+DELETE /druid/indexer/v1/compaction/config/datasources/wikipedia_hour HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+A successful request returns an HTTP `200 OK` message code and an empty response body.
+
+### Get compaction status for all datasources
+
+Retrieves an array of `latestStatus` objects representing the status and statistics from the latest automatic compaction run for all the datasources to which the user has read access.
+The response payload is in the same format as [Compaction status response](#compaction-status-response).
+
+#### URL
+
+`GET` `/druid/indexer/v1/compaction/status/datasources`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved `latestStatus` object*
+
+
+
+
+---
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/compaction/status/datasources"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/compaction/status/datasources HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "latestStatus": [
+ {
+ "dataSource": "wikipedia_api",
+ "scheduleStatus": "RUNNING",
+ "bytesAwaitingCompaction": 0,
+ "bytesCompacted": 0,
+ "bytesSkipped": 64133616,
+ "segmentCountAwaitingCompaction": 0,
+ "segmentCountCompacted": 0,
+ "segmentCountSkipped": 8,
+ "intervalCountAwaitingCompaction": 0,
+ "intervalCountCompacted": 0,
+ "intervalCountSkipped": 1
+ },
+ {
+ "dataSource": "wikipedia_hour",
+ "scheduleStatus": "RUNNING",
+ "bytesAwaitingCompaction": 0,
+ "bytesCompacted": 5998634,
+ "bytesSkipped": 0,
+ "segmentCountAwaitingCompaction": 0,
+ "segmentCountCompacted": 1,
+ "segmentCountSkipped": 0,
+ "intervalCountAwaitingCompaction": 0,
+ "intervalCountCompacted": 1,
+ "intervalCountSkipped": 0
+ }
+ ]
+}
+```
+
+
+### Get compaction status for a single datasource
+
+Retrieves the status and statistics from the latest automatic compaction run for a single datasource. The response payload is in the same format as [Compaction status response](#compaction-status-response) with zero or one entry.
+
+#### URL
+
+`GET` `/druid/indexer/v1/compaction/status/datasources/{dataSource}`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved `latestStatus` object*
+
+
+
+
+---
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/compaction/status/datasources/wikipedia_hour"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/compaction/status/datasources/wikipedia_hour HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "latestStatus": [
+ {
+ "dataSource": "wikipedia_hour",
+ "scheduleStatus": "RUNNING",
+ "bytesAwaitingCompaction": 0,
+ "bytesCompacted": 5998634,
+ "bytesSkipped": 0,
+ "segmentCountAwaitingCompaction": 0,
+ "segmentCountCompacted": 1,
+ "segmentCountSkipped": 0,
+ "intervalCountAwaitingCompaction": 0,
+ "intervalCountCompacted": 1,
+ "intervalCountSkipped": 0
+ }
+ ]
+}
+```
+
diff --git a/docs/35.0.0/api-reference/data-management-api.md b/docs/35.0.0/api-reference/data-management-api.md
new file mode 100644
index 0000000000..fe37c6a814
--- /dev/null
+++ b/docs/35.0.0/api-reference/data-management-api.md
@@ -0,0 +1,607 @@
+---
+id: data-management-api
+title: Data management API
+sidebar_label: Data management
+---
+
+
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+This topic describes the data management API endpoints for Apache Druid.
+This includes information on how to mark segments as used or unused and delete them from Druid.
+
+In this topic, `http://ROUTER_IP:ROUTER_PORT` is a placeholder for your Router service address and port.
+Replace it with the information for your deployment.
+For example, use `http://localhost:8888` for quickstart deployments.
+
+:::info
+- Coordinator APIs for data management are now deprecated. Use the new APIs served by the Overlord instead.
+- Do not use these APIs while an indexing task or kill task is in progress for the same datasource and interval.
+:::
+
+## Segment management
+
+You can mark segments as used by sending POST requests to the datasource, but the Coordinator may subsequently mark segments as unused if they meet any configured [drop rules](../operations/rule-configuration.md#drop-rules).
+Even if these API requests update segments to used, you still need to configure a [load rule](../operations/rule-configuration.md#load-rules) to load them onto Historical processes.
+
+If you use these APIs concurrently with an indexing task or a kill task, the behavior is undefined:
+some segments might be killed while others are marked as used.
+Furthermore, it is possible that all segments could be unused, yet an indexing task might still be able to read data from those segments and complete successfully.
+
+All of the following APIs, except [Segment deletion](#segment-deletion), are served by the Overlord, since it is the service responsible for performing actions on segment metadata on behalf of indexing tasks.
+This makes it the single source of truth for segment metadata, thus ensuring a consistent view across the Druid cluster and allowing the Overlord to cache metadata to improve performance.
+
+### Segment IDs
+
+You must provide segment IDs when using many of the endpoints described in this topic.
+For example, the segment ID `wikipedia_hour_2015-09-12T16:00:00.000Z_2015-09-12T17:00:00.000Z_2023-08-10T04:12:03.860Z` identifies a segment of the `wikipedia_hour` datasource that covers the hour from 16:00 to 17:00 UTC on 2015-09-12 and was created with version `2023-08-10T04:12:03.860Z`.
+For information on segment IDs, see [Segment identification](../design/segments.md#segment-identification).
+For information on finding segment IDs in the web console, see [Segments](../operations/web-console.md#segments).
+
+### Mark a single segment unused
+
+Marks the state of a segment as unused, using the segment ID.
+This is a "soft delete" of the segment from Historicals.
+To undo this action, [mark the segment used](#mark-a-single-segment-as-used).
+
+Note that this endpoint returns an HTTP `200 OK` response code even if the segment ID or datasource doesn't exist.
+Check the response payload to confirm if any segment was actually updated.
+
+#### URL
+
+`DELETE` `/druid/indexer/v1/datasources/{datasource}/segments/{segmentId}`
+
+#### Header
+
+The following headers are required for this request:
+
+```json
+Content-Type: application/json
+Accept: application/json, text/plain
+```
+
+#### Responses
+
+
+
+
+
+
+*Successfully updated segment*
+
+
+
+
+---
+
+#### Sample request
+
+The following example marks the segment `wikipedia_hour_2015-09-12T16:00:00.000Z_2015-09-12T17:00:00.000Z_2023-08-10T04:12:03.860Z` from the datasource `wikipedia_hour` as unused.
+
+
+
+
+
+
+```shell
+curl --request DELETE "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/datasources/wikipedia_hour/segments/wikipedia_hour_2015-09-12T16:00:00.000Z_2015-09-12T17:00:00.000Z_2023-08-10T04:12:03.860Z" \
+--header 'Content-Type: application/json' \
+--header 'Accept: application/json, text/plain'
+```
+
+
+
+
+
+```HTTP
+DELETE /druid/indexer/v1/datasources/wikipedia_hour/segments/wikipedia_hour_2015-09-12T16:00:00.000Z_2015-09-12T17:00:00.000Z_2023-08-10T04:12:03.860Z HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Accept: application/json, text/plain
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "segmentStateChanged": true,
+ "numChangedSegments": 1
+}
+```
+
+
+### Mark a single segment as used
+
+Marks the state of a segment as used, using the segment ID.
+
+#### URL
+
+`POST` `/druid/indexer/v1/datasources/{datasource}/segments/{segmentId}`
+
+#### Header
+
+The following headers are required for this request:
+
+```json
+Content-Type: application/json
+Accept: application/json, text/plain
+```
+
+#### Responses
+
+
+
+
+
+
+*Successfully updated segments*
+
+
+
+
+---
+
+#### Sample request
+
+The following example marks the segment with ID `wikipedia_hour_2015-09-12T18:00:00.000Z_2015-09-12T19:00:00.000Z_2023-08-10T04:12:03.860Z` as used.
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/datasources/wikipedia_hour/segments/wikipedia_hour_2015-09-12T18:00:00.000Z_2015-09-12T19:00:00.000Z_2023-08-10T04:12:03.860Z" \
+--header 'Content-Type: application/json' \
+--header 'Accept: application/json, text/plain'
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/datasources/wikipedia_hour/segments/wikipedia_hour_2015-09-12T18:00:00.000Z_2015-09-12T19:00:00.000Z_2023-08-10T04:12:03.860Z HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Accept: application/json, text/plain
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "segmentStateChanged": true,
+ "numChangedSegments": 1
+}
+```
+
+
+### Mark a group of segments unused
+
+Marks the state of a group of segments as unused, using an array of segment IDs or an interval.
+Pass the array of segment IDs or interval as a JSON object in the request body.
+
+For the interval, specify the start and end times as ISO 8601 strings to identify segments inclusive of the start time and exclusive of the end time.
+Optionally, specify an array of segment versions along with the interval. Druid updates only the segments completely contained
+within the specified interval that match the optional list of versions; partially overlapping segments are not affected.
+
+#### URL
+
+`POST` `/druid/indexer/v1/datasources/{datasource}/markUnused`
+
+#### Request body
+
+The group of segments is sent as a JSON request payload that accepts the following properties:
+
+|Property|Description|Required|Example|
+|--------|-----------|--------|-------|
+|`interval`|ISO 8601 segments interval.|Yes, if `segmentIds` is not specified.|`"2015-09-12T03:00:00.000Z/2015-09-12T05:00:00.000Z"`|
+|`segmentIds`|List of segment IDs.|Yes, if `interval` is not specified.|`["segmentId1", "segmentId2"]`|
+|`versions`|List of segment versions. Must be provided with `interval`.|No.|`["2024-03-14T16:00:04.086Z", "2024-03-12T16:00:04.086Z"]`|
+
+#### Responses
+
+
+
+
+
+
+*Successfully updated segments*
+
+
+
+
+
+*Invalid datasource name*
+
+
+
+
+
+*Invalid request payload*
+
+
+
+
+---
+
+#### Sample request
+
+The following example marks two segments from the `wikipedia_hour` datasource unused based on their segment IDs.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/datasources/wikipedia_hour/markUnused" \
+--header 'Content-Type: application/json' \
+--data '{
+ "segmentIds": [
+ "wikipedia_hour_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2023-08-10T04:12:03.860Z",
+ "wikipedia_hour_2015-09-12T04:00:00.000Z_2015-09-12T05:00:00.000Z_2023-08-10T04:12:03.860Z"
+ ]
+}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/datasources/wikipedia_hour/markUnused HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 230
+
+{
+ "segmentIds": [
+ "wikipedia_hour_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2023-08-10T04:12:03.860Z",
+ "wikipedia_hour_2015-09-12T04:00:00.000Z_2015-09-12T05:00:00.000Z_2023-08-10T04:12:03.860Z"
+ ]
+}
+```
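+
+Instead of listing segment IDs, you can select segments by interval, optionally restricted to specific versions. The following sketch uses the interval from the request body table above and an illustrative version value:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/datasources/wikipedia_hour/markUnused" \
+--header 'Content-Type: application/json' \
+--data '{
+  "interval": "2015-09-12T03:00:00.000Z/2015-09-12T05:00:00.000Z",
+  "versions": ["2023-08-10T04:12:03.860Z"]
+}'
+```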
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "numChangedSegments": 2
+}
+```
+
+
+### Mark a group of segments used
+
+Marks the state of a group of segments as used, using an array of segment IDs or an interval.
+Pass the array of segment IDs or interval as a JSON object in the request body.
+
+For the interval, specify the start and end times as ISO 8601 strings to identify segments inclusive of the start time and exclusive of the end time.
+Optionally, specify an array of segment versions along with the interval. Druid updates only the segments completely contained
+within the specified interval that match the optional list of versions; partially overlapping segments are not affected.
+
+#### URL
+
+`POST` `/druid/indexer/v1/datasources/{datasource}/markUsed`
+
+#### Request body
+
+The group of segments is sent as a JSON request payload that accepts the following properties:
+
+|Property|Description|Required|Example|
+|--------|-----------|--------|-------|
+|`interval`|ISO 8601 segments interval.|Yes, if `segmentIds` is not specified.|`"2015-09-12T03:00:00.000Z/2015-09-12T05:00:00.000Z"`|
+|`segmentIds`|List of segment IDs.|Yes, if `interval` is not specified.|`["segmentId1", "segmentId2"]`|
+|`versions`|List of segment versions. Must be provided with `interval`.|No.|`["2024-03-14T16:00:04.086Z", "2024-03-12T16:00:04.086Z"]`|
+
+#### Responses
+
+
+
+
+
+
+*Successfully updated segments*
+
+
+
+
+
+*Invalid datasource name*
+
+
+
+
+
+*Invalid request payload*
+
+
+
+
+---
+
+#### Sample request
+
+The following example marks two segments from the `wikipedia_hour` datasource used based on their segment IDs.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/datasources/wikipedia_hour/markUsed" \
+--header 'Content-Type: application/json' \
+--data '{
+ "segmentIds": [
+ "wikipedia_hour_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2023-08-10T04:12:03.860Z",
+ "wikipedia_hour_2015-09-12T04:00:00.000Z_2015-09-12T05:00:00.000Z_2023-08-10T04:12:03.860Z"
+ ]
+}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/datasources/wikipedia_hour/markUsed HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 230
+
+{
+ "segmentIds": [
+ "wikipedia_hour_2015-09-12T14:00:00.000Z_2015-09-12T15:00:00.000Z_2023-08-10T04:12:03.860Z",
+ "wikipedia_hour_2015-09-12T04:00:00.000Z_2015-09-12T05:00:00.000Z_2023-08-10T04:12:03.860Z"
+ ]
+}
+```
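+
+You can also mark segments as used by interval instead of by ID. The following sketch reuses the interval from the request body table above:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/datasources/wikipedia_hour/markUsed" \
+--header 'Content-Type: application/json' \
+--data '{
+  "interval": "2015-09-12T03:00:00.000Z/2015-09-12T05:00:00.000Z"
+}'
+```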
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "numChangedSegments": 2
+}
+```
+
+
+### Mark all segments unused
+
+Marks the state of all segments of a datasource as unused.
+This action performs a "soft delete" of the segments from Historicals.
+
+Note that this endpoint returns an HTTP `200 OK` response code even if the datasource doesn't exist.
+Check the response payload to confirm if any segment was actually updated.
+
+#### URL
+
+`DELETE` `/druid/indexer/v1/datasources/{datasource}`
+
+#### Responses
+
+
+
+
+
+
+*Successfully updated segments*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl --request DELETE "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/datasources/wikipedia_hour"
+```
+
+
+
+
+
+```HTTP
+DELETE /druid/indexer/v1/datasources/wikipedia_hour HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "numChangedSegments": 24
+}
+```
+
+
+### Mark all non-overshadowed segments used
+
+Marks the state of all unused segments of a datasource as used, provided they are not already overshadowed by other segments.
+The endpoint returns the number of changed segments.
+
+Note that this endpoint returns an HTTP `200 OK` response code even if the datasource doesn't exist.
+Check the response payload to get the number of segments actually updated.
+
+#### URL
+
+`POST` `/druid/indexer/v1/datasources/{datasource}`
+
+#### Header
+
+The following headers are required for this request:
+
+```json
+Content-Type: application/json
+Accept: application/json, text/plain
+```
+
+#### Responses
+
+
+
+
+
+
+*Successfully updated segments*
+
+
+
+
+---
+
+#### Sample request
+
+The following example updates all unused segments of `wikipedia_hour` to used.
+`wikipedia_hour` contains one unused segment eligible to be marked as used.
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/datasources/wikipedia_hour" \
+--header 'Content-Type: application/json' \
+--header 'Accept: application/json, text/plain'
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/datasources/wikipedia_hour HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Accept: application/json, text/plain
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "numChangedSegments": 1
+}
+```
+
+
+## Segment deletion
+
+### Permanently delete segments
+
+The DELETE endpoint sends a [kill task](../ingestion/tasks.md) for a given interval and datasource. The interval value is an ISO 8601 string delimited by `_`. This request permanently deletes all metadata for unused segments and removes them from deep storage.
+
+Note that this endpoint returns an HTTP `200 OK` response code even if the datasource doesn't exist.
+
+This endpoint supersedes the deprecated endpoint: `DELETE /druid/coordinator/v1/datasources/{datasource}?kill=true&interval={interval}`
+
+#### URL
+
+`DELETE` `/druid/coordinator/v1/datasources/{datasource}/intervals/{interval}`
+
+#### Responses
+
+
+
+
+
+
+*Successfully sent kill task*
+
+
+
+
+---
+
+#### Sample request
+
+The following example sends a kill task to permanently delete segments in the datasource `wikipedia_hour` from the interval `2015-09-12` to `2015-09-13`.
+
+
+
+
+
+
+```shell
+curl --request DELETE "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/datasources/wikipedia_hour/intervals/2015-09-12_2015-09-13"
+```
+
+
+
+
+
+```HTTP
+DELETE /druid/coordinator/v1/datasources/wikipedia_hour/intervals/2015-09-12_2015-09-13 HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+A successful request returns an HTTP `200 OK` and an empty response body.
diff --git a/docs/35.0.0/api-reference/dynamic-configuration-api.md b/docs/35.0.0/api-reference/dynamic-configuration-api.md
new file mode 100644
index 0000000000..cad61e4b88
--- /dev/null
+++ b/docs/35.0.0/api-reference/dynamic-configuration-api.md
@@ -0,0 +1,665 @@
+---
+id: dynamic-configuration-api
+title: Dynamic configuration API
+sidebar_label: Dynamic configuration
+---
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+
+
+This document describes the API endpoints to retrieve and manage dynamic configurations for the [Coordinator](../design/coordinator.md) and [Overlord](../design/overlord.md) in Apache Druid.
+
+In this topic, `http://ROUTER_IP:ROUTER_PORT` is a placeholder for your Router service address and port.
+Replace it with the information for your deployment.
+For example, use `http://localhost:8888` for quickstart deployments.
+
+## Coordinator dynamic configuration
+
+The Coordinator has dynamic configurations to tune certain behavior on the fly, without requiring a service restart.
+For information on the supported properties, see [Coordinator dynamic configuration](../configuration/index.md#dynamic-configuration).
+
+### Get dynamic configuration
+
+Retrieves the current Coordinator dynamic configuration. Returns a JSON object with the dynamic configuration properties.
+
+#### URL
+
+`GET` `/druid/coordinator/v1/config`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved dynamic configuration*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/config"
+```
+
+
+
+
+
+```HTTP
+GET /druid/coordinator/v1/config HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+View the response
+
+```json
+{
+ "millisToWaitBeforeDeleting": 900000,
+ "maxSegmentsToMove": 100,
+ "replicantLifetime": 15,
+ "replicationThrottleLimit": 500,
+ "balancerComputeThreads": 1,
+ "killDataSourceWhitelist": [],
+ "killPendingSegmentsSkipList": [],
+ "maxSegmentsInNodeLoadingQueue": 500,
+ "decommissioningNodes": [],
+ "decommissioningMaxPercentOfMaxSegmentsToMove": 70,
+ "pauseCoordination": false,
+ "replicateAfterLoadTimeout": false,
+ "maxNonPrimaryReplicantsToLoad": 2147483647,
+ "useRoundRobinSegmentAssignment": true,
+ "smartSegmentLoading": true,
+ "debugDimensions": null,
+ "turboLoadingNodes": [],
+ "cloneServers": {}
+
+}
+```
+
+
+
+### Update dynamic configuration
+
+Submits a JSON-based dynamic configuration spec to the Coordinator.
+For information on the supported properties, see [Dynamic configuration](../configuration/index.md#dynamic-configuration).
+
+#### URL
+
+`POST` `/druid/coordinator/v1/config`
+
+#### Header parameters
+
+The endpoint supports a set of optional header parameters to populate the `author` and `comment` fields in the configuration history.
+
+* `X-Druid-Author`
+ * Type: String
+ * Author of the configuration change.
+* `X-Druid-Comment`
+ * Type: String
+ * Description for the update.
+
+#### Responses
+
+
+
+
+
+
+*Successfully updated dynamic configuration*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/config" \
+--header 'Content-Type: application/json' \
+--data '{
+ "millisToWaitBeforeDeleting": 900000,
+ "maxSegmentsToMove": 5,
+ "percentOfSegmentsToConsiderPerMove": 100,
+ "useBatchedSegmentSampler": true,
+ "replicantLifetime": 15,
+ "replicationThrottleLimit": 10,
+ "balancerComputeThreads": 1,
+ "emitBalancingStats": true,
+ "killDataSourceWhitelist": [],
+ "killPendingSegmentsSkipList": [],
+ "maxSegmentsInNodeLoadingQueue": 100,
+ "decommissioningNodes": [],
+ "decommissioningMaxPercentOfMaxSegmentsToMove": 70,
+ "pauseCoordination": false,
+ "replicateAfterLoadTimeout": false,
+ "maxNonPrimaryReplicantsToLoad": 2147483647,
+ "useRoundRobinSegmentAssignment": true,
+ "turboLoadingNodes": [],
+ "cloneServers": {}
+}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/coordinator/v1/config HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 683
+
+{
+ "millisToWaitBeforeDeleting": 900000,
+ "maxSegmentsToMove": 5,
+ "percentOfSegmentsToConsiderPerMove": 100,
+ "useBatchedSegmentSampler": true,
+ "replicantLifetime": 15,
+ "replicationThrottleLimit": 10,
+ "balancerComputeThreads": 1,
+ "emitBalancingStats": true,
+ "killDataSourceWhitelist": [],
+ "killPendingSegmentsSkipList": [],
+ "maxSegmentsInNodeLoadingQueue": 100,
+ "decommissioningNodes": [],
+ "decommissioningMaxPercentOfMaxSegmentsToMove": 70,
+ "pauseCoordination": false,
+ "replicateAfterLoadTimeout": false,
+ "maxNonPrimaryReplicantsToLoad": 2147483647,
+ "useRoundRobinSegmentAssignment": true,
+ "turboLoadingNodes": [],
+ "cloneServers": {}
+}
+```
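+
+To record who made the change in the configuration history, you can include the optional `X-Druid-Author` and `X-Druid-Comment` headers. The following sketch assumes the full configuration spec from the sample above is saved in a local file named `coordinator-dynamic-config.json` (the file name, author, and comment values are hypothetical):
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/config" \
+--header 'Content-Type: application/json' \
+--header 'X-Druid-Author: admin' \
+--header 'X-Druid-Comment: Reduce maxSegmentsToMove' \
+--data @coordinator-dynamic-config.json
+```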
+
+
+
+
+#### Sample response
+
+A successful request returns an HTTP `200 OK` message code and an empty response body.
+
+### Get dynamic configuration history
+
+Retrieves the history of changes to Coordinator dynamic configuration over an interval of time. Returns an empty array if there are no history records available.
+
+#### URL
+
+`GET` `/druid/coordinator/v1/config/history`
+
+#### Query parameters
+
+The endpoint supports a set of optional query parameters to filter results.
+
+* `interval`
+ * Type: String
+ * Limit the results to the specified time interval in ISO 8601 format delimited with `/`. For example, `2023-07-13/2023-07-19`. The default interval is one week. You can change this period by setting `druid.audit.manager.auditHistoryMillis` in the `runtime.properties` file for the Coordinator.
+
+* `count`
+ * Type: Integer
+ * Limit the number of results to the last `n` entries.
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved history*
+
+
+
+
+
+---
+
+#### Sample request
+
+The following example retrieves the dynamic configuration history between `2022-07-13` and `2024-07-19`. The response is limited to 10 entries.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/config/history?interval=2022-07-13%2F2024-07-19&count=10"
+```
+
+
+
+
+
+```HTTP
+GET /druid/coordinator/v1/config/history?interval=2022-07-13/2024-07-19&count=10 HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+[
+ {
+ "key": "coordinator.config",
+ "type": "coordinator.config",
+ "auditInfo": {
+ "author": "",
+ "comment": "",
+ "ip": "127.0.0.1"
+ },
+    "payload": "{\"millisToWaitBeforeDeleting\":900000,\"maxSegmentsToMove\":5,\"replicantLifetime\":15,\"replicationThrottleLimit\":10,\"balancerComputeThreads\":1,\"killDataSourceWhitelist\":[],\"killPendingSegmentsSkipList\":[],\"maxSegmentsInNodeLoadingQueue\":100,\"decommissioningNodes\":[],\"decommissioningMaxPercentOfMaxSegmentsToMove\":70,\"pauseCoordination\":false,\"replicateAfterLoadTimeout\":false,\"maxNonPrimaryReplicantsToLoad\":2147483647,\"useRoundRobinSegmentAssignment\":true,\"smartSegmentLoading\":true,\"debugDimensions\":null}",
+ "auditTime": "2023-10-03T20:59:51.622Z"
+ }
+]
+```
+
+
+## Overlord dynamic configuration
+
+The Overlord has dynamic configurations to tune how Druid assigns tasks to workers.
+For information on the supported properties, see [Overlord dynamic configuration](../configuration/index.md#overlord-dynamic-configuration).
+
+### Get dynamic configuration
+
+Retrieves the current Overlord dynamic configuration.
+Returns a JSON object with the dynamic configuration properties.
+Returns an empty response body if there is no current Overlord dynamic configuration.
+
+#### URL
+
+`GET` `/druid/indexer/v1/worker`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved dynamic configuration*
+
+
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/worker"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/worker HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "type": "default",
+ "selectStrategy": {
+ "type": "fillCapacityWithCategorySpec",
+ "workerCategorySpec": {
+ "categoryMap": {},
+ "strong": true
+ }
+ },
+ "autoScaler": null
+}
+```
+
+
+
+### Update dynamic configuration
+
+Submits a JSON-based dynamic configuration spec to the Overlord.
+For information on the supported properties, see [Overlord dynamic configuration](../configuration/index.md#overlord-dynamic-configuration).
+
+#### URL
+
+`POST` `/druid/indexer/v1/worker`
+
+#### Header parameters
+
+The endpoint supports a set of optional header parameters to populate the `author` and `comment` fields in the configuration history.
+
+* `X-Druid-Author`
+ * Type: String
+ * Author of the configuration change.
+* `X-Druid-Comment`
+ * Type: String
+ * Description for the update.
+
+#### Responses
+
+
+
+
+
+
+*Successfully updated dynamic configuration*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/worker" \
+--header 'Content-Type: application/json' \
+--data '{
+ "type": "default",
+ "selectStrategy": {
+ "type": "fillCapacityWithCategorySpec",
+ "workerCategorySpec": {
+ "categoryMap": {},
+ "strong": true
+ }
+ },
+ "autoScaler": null
+}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/worker HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 196
+
+{
+ "type": "default",
+ "selectStrategy": {
+ "type": "fillCapacityWithCategorySpec",
+ "workerCategorySpec": {
+ "categoryMap": {},
+ "strong": true
+ }
+ },
+ "autoScaler": null
+}
+```
+
+
+
+
+#### Sample response
+
+A successful request returns an HTTP `200 OK` message code and an empty response body.
+
+### Get dynamic configuration history
+
+Retrieves the history of changes to Overlord dynamic configuration over an interval of time. Returns an empty array if there are no history records available.
+
+#### URL
+
+`GET` `/druid/indexer/v1/worker/history`
+
+#### Query parameters
+
+The endpoint supports a set of optional query parameters to filter results.
+
+* `interval`
+ * Type: String
+ * Limit the results to the specified time interval in ISO 8601 format delimited with `/`. For example, `2023-07-13/2023-07-19`. The default interval is one week. You can change this period by setting `druid.audit.manager.auditHistoryMillis` in the `runtime.properties` file for the Overlord.
+
+* `count`
+ * Type: Integer
+ * Limit the number of results to the last `n` entries.
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved history*
+
+
+
+
+---
+
+#### Sample request
+
+The following example retrieves the dynamic configuration history between `2022-07-13` and `2024-07-19`. The response is limited to 10 entries.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/worker/history?interval=2022-07-13%2F2024-07-19&count=10"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/worker/history?interval=2022-07-13%2F2024-07-19&count=10 HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+[
+ {
+ "key": "worker.config",
+ "type": "worker.config",
+ "auditInfo": {
+ "author": "",
+ "comment": "",
+ "ip": "127.0.0.1"
+ },
+ "payload": "{\"type\":\"default\",\"selectStrategy\":{\"type\":\"fillCapacityWithCategorySpec\",\"workerCategorySpec\":{\"categoryMap\":{},\"strong\":true}},\"autoScaler\":null}",
+ "auditTime": "2023-10-03T21:49:49.991Z"
+ }
+]
+```
+
+
+
+### Get an array of worker nodes in the cluster
+
+Retrieves an array of all the worker nodes in the cluster along with their corresponding metadata.
+
+#### URL
+
+`GET` `/druid/indexer/v1/workers`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved worker nodes*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/workers"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/workers HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+[
+ {
+ "worker": {
+ "scheme": "http",
+ "host": "localhost:8091",
+ "ip": "198.51.100.0",
+ "capacity": 2,
+ "version": "0",
+ "category": "_default_worker_category"
+ },
+ "currCapacityUsed": 0,
+ "currParallelIndexCapacityUsed": 0,
+ "availabilityGroups": [],
+ "runningTasks": [],
+ "lastCompletedTaskTime": "2023-09-29T19:13:05.505Z",
+ "blacklistedUntil": null
+ }
+]
+```
+
+
+
+### Get scaling events
+
+Returns Overlord scaling events if autoscaling runners are in use.
+Returns an empty response body if there are no Overlord scaling events.
+
+#### URL
+
+`GET` `/druid/indexer/v1/scaling`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved scaling events*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/scaling"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/scaling HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+A successful request returns a `200 OK` response and an array of scaling events.
diff --git a/docs/35.0.0/api-reference/json-querying-api.md b/docs/35.0.0/api-reference/json-querying-api.md
new file mode 100644
index 0000000000..5d03ec8b31
--- /dev/null
+++ b/docs/35.0.0/api-reference/json-querying-api.md
@@ -0,0 +1,925 @@
+---
+id: json-querying-api
+title: JSON querying API
+sidebar_label: JSON querying
+---
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+
+
+This topic describes the API endpoints to submit JSON-based [native queries](../querying/querying.md) to Apache Druid.
+
+In this topic, `http://ROUTER_IP:ROUTER_PORT` is a placeholder for your Router service address and port. Replace it with the information for your deployment. For example, use `http://localhost:8888` for quickstart deployments.
+
+
+## Submit a query
+
+Submits a JSON-based native query. The body of the request is the native query itself.
+
+Druid supports different types of queries for different use cases. All queries require the following properties:
+* `queryType`: A string representing the type of query. Druid supports the following native query types: `timeseries`, `topN`, `groupBy`, `timeBoundary`, `segmentMetadata`, `datasourceMetadata`, `scan`, and `search`.
+* `dataSource`: A string or object defining the source of data to query. The most common value is the name of the datasource to query. For more information, see [Datasources](../querying/datasource.md).
+
+For additional properties based on your query type or use case, see [available native queries](../querying/querying.md#available-queries).
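+
+For example, the following sketch shows the minimal shape of a native query using the `scan` query type. The `social_media` datasource, interval, and limit are illustrative; substitute values from your deployment.
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/v2" \
+--header 'Content-Type: application/json' \
+--data '{
+  "queryType": "scan",
+  "dataSource": "social_media",
+  "intervals": ["2022-01-01/2024-01-01"],
+  "limit": 10
+}'
+```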
+
+### URL
+
+`POST` `/druid/v2`
+
+### Query parameters
+
+* `pretty` (optional)
+ * Druid returns the response in a pretty-printed format using indentation and line breaks.
+
+### Responses
+
+
+
+
+
+
+*Successfully submitted query*
+
+
+
+
+
+*Error thrown due to bad query. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error.",
+ "errorClass": "Class of exception that caused this error.",
+ "host": "The host on which the error occurred."
+}
+```
+For more information on possible error messages, see [query execution failures](../querying/querying.md#query-execution-failures).
+
+
+
+
+---
+
+### Example query: `topN`
+
+The following example shows a `topN` query. The query analyzes the `social_media` datasource to return the top five users from the `username` dimension with the highest number of views from the `views` metric.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/v2?pretty=null" \
+--header 'Content-Type: application/json' \
+--data '{
+ "queryType": "topN",
+ "dataSource": "social_media",
+ "dimension": "username",
+ "threshold": 5,
+ "metric": "views",
+ "granularity": "all",
+ "aggregations": [
+ {
+ "type": "longSum",
+ "name": "views",
+ "fieldName": "views"
+ }
+ ],
+ "intervals": [
+ "2022-01-01T00:00:00.000/2024-01-01T00:00:00.000"
+ ]
+}'
+```
+
+
+
+
+```HTTP
+POST /druid/v2?pretty=null HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 336
+
+{
+ "queryType": "topN",
+ "dataSource": "social_media",
+ "dimension": "username",
+ "threshold": 5,
+ "metric": "views",
+ "granularity": "all",
+ "aggregations": [
+ {
+ "type": "longSum",
+ "name": "views",
+ "fieldName": "views"
+ }
+ ],
+ "intervals": [
+ "2022-01-01T00:00:00.000/2024-01-01T00:00:00.000"
+ ]
+}
+```
+
+
+
+
+#### Example response: `topN`
+
+
+ View the response
+
+ ```json
+[
+ {
+ "timestamp": "2023-07-03T18:49:54.848Z",
+ "result": [
+ {
+ "views": 11591218026,
+ "username": "gus"
+ },
+ {
+ "views": 11578638578,
+ "username": "miette"
+ },
+ {
+ "views": 11561618880,
+ "username": "leon"
+ },
+ {
+ "views": 11552609824,
+ "username": "mia"
+ },
+ {
+ "views": 11551537517,
+ "username": "milton"
+ }
+ ]
+ }
+]
+ ```
+
+
+### Example query: `groupBy`
+
+The following example submits a JSON query of the `groupBy` type to retrieve the `username` with the highest upvotes-to-posts ratio from the `social_media` datasource.
+
+In this query:
+* The `upvoteSum` aggregation calculates the sum of `upvotes` for each user.
+* The `postCount` aggregation counts the number of posts for each user.
+* The `upvoteToPostRatio` post-aggregation divides `upvoteSum` by `postCount` to calculate the ratio.
+* The result is sorted by `upvoteToPostRatio` in descending order.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/v2" \
+--header 'Content-Type: application/json' \
+--data '{
+ "queryType": "groupBy",
+ "dataSource": "social_media",
+ "dimensions": ["username"],
+ "granularity": "all",
+ "aggregations": [
+ { "type": "doubleSum", "name": "upvoteSum", "fieldName": "upvotes" },
+ { "type": "count", "name": "postCount", "fieldName": "post_title" }
+ ],
+ "postAggregations": [
+ {
+ "type": "arithmetic",
+ "name": "upvoteToPostRatio",
+ "fn": "/",
+ "fields": [
+ { "type": "fieldAccess", "name": "upvoteSum", "fieldName": "upvoteSum" },
+ { "type": "fieldAccess", "name": "postCount", "fieldName": "postCount" }
+ ]
+ }
+ ],
+ "intervals": ["2022-01-01T00:00:00.000/2024-01-01T00:00:00.000"],
+ "limitSpec": {
+ "type": "default",
+ "limit": 1,
+ "columns": [
+ { "dimension": "upvoteToPostRatio", "direction": "descending" }
+ ]
+ }
+}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/v2?pretty=null HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 817
+
+{
+ "queryType": "groupBy",
+ "dataSource": "social_media",
+ "dimensions": ["username"],
+ "granularity": "all",
+ "aggregations": [
+ { "type": "doubleSum", "name": "upvoteSum", "fieldName": "upvotes" },
+ { "type": "count", "name": "postCount", "fieldName": "post_title" }
+ ],
+ "postAggregations": [
+ {
+ "type": "arithmetic",
+ "name": "upvoteToPostRatio",
+ "fn": "/",
+ "fields": [
+ { "type": "fieldAccess", "name": "upvoteSum", "fieldName": "upvoteSum" },
+ { "type": "fieldAccess", "name": "postCount", "fieldName": "postCount" }
+ ]
+ }
+ ],
+ "intervals": ["2022-01-01T00:00:00.000/2024-01-01T00:00:00.000"],
+ "limitSpec": {
+ "type": "default",
+ "limit": 1,
+ "columns": [
+ { "dimension": "upvoteToPostRatio", "direction": "descending" }
+ ]
+ }
+}
+```
+
+
+
+
+#### Example response: `groupBy`
+
+
+ View the response
+
+```json
+[
+ {
+ "version": "v1",
+ "timestamp": "2022-01-01T00:00:00.000Z",
+ "event": {
+ "upvoteSum": 8.0419541E7,
+ "upvoteToPostRatio": 69.53014661762697,
+ "postCount": 1156614,
+ "username": "miette"
+ }
+ }
+]
+```
+
+
+## Get segment information for query
+
+Retrieves an array that contains objects with segment information, including the server locations associated with the query provided in the request body.
+
+### URL
+
+`POST` `/druid/v2/candidates`
+
+### Query parameters
+
+* `pretty` (optional)
+ * Druid returns the response in a pretty-printed format using indentation and line breaks.
+
+### Responses
+
+
+
+
+
+
+*Successfully retrieved segment information*
+
+
+
+
+
+*Error thrown due to bad query. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error.",
+ "errorClass": "Class of exception that caused this error.",
+ "host": "The host on which the error occurred."
+}
+```
+
+For more information on possible error messages, see [query execution failures](../querying/querying.md#query-execution-failures).
+
+
+
+
+---
+
+### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/v2/candidates" \
+--header 'Content-Type: application/json' \
+--data '{
+ "queryType": "topN",
+ "dataSource": "social_media",
+ "dimension": "username",
+ "threshold": 5,
+ "metric": "views",
+ "granularity": "all",
+ "aggregations": [
+ {
+ "type": "longSum",
+ "name": "views",
+ "fieldName": "views"
+ }
+ ],
+ "intervals": [
+ "2022-01-01T00:00:00.000/2024-01-01T00:00:00.000"
+ ]
+}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/v2/candidates HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 336
+
+{
+ "queryType": "topN",
+ "dataSource": "social_media",
+ "dimension": "username",
+ "threshold": 5,
+ "metric": "views",
+ "granularity": "all",
+  "aggregations": [
+    {
+      "type": "longSum",
+      "name": "views",
+      "fieldName": "views"
+    }
+  ],
+  "intervals": [
+    "2022-01-01T00:00:00.000/2024-01-01T00:00:00.000"
+ ]
+}
+```
+
+
+
+
+### Sample response
+
+
+ View the response
+
+ ```json
+[
+ {
+ "interval": "2023-07-03T18:00:00.000Z/2023-07-03T19:00:00.000Z",
+ "version": "2023-07-03T18:51:18.905Z",
+ "partitionNumber": 0,
+ "size": 21563693,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-03T19:00:00.000Z/2023-07-03T20:00:00.000Z",
+ "version": "2023-07-03T19:00:00.657Z",
+ "partitionNumber": 0,
+ "size": 6057236,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-05T21:00:00.000Z/2023-07-05T22:00:00.000Z",
+ "version": "2023-07-05T21:09:58.102Z",
+ "partitionNumber": 0,
+ "size": 223926186,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-05T21:00:00.000Z/2023-07-05T22:00:00.000Z",
+ "version": "2023-07-05T21:09:58.102Z",
+ "partitionNumber": 1,
+ "size": 20244827,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-05T22:00:00.000Z/2023-07-05T23:00:00.000Z",
+ "version": "2023-07-05T22:00:00.524Z",
+ "partitionNumber": 0,
+ "size": 104628051,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-05T22:00:00.000Z/2023-07-05T23:00:00.000Z",
+ "version": "2023-07-05T22:00:00.524Z",
+ "partitionNumber": 1,
+ "size": 1603995,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-05T23:00:00.000Z/2023-07-06T00:00:00.000Z",
+ "version": "2023-07-05T23:21:55.242Z",
+ "partitionNumber": 0,
+ "size": 181506843,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T00:00:00.000Z/2023-07-06T01:00:00.000Z",
+ "version": "2023-07-06T00:02:08.498Z",
+ "partitionNumber": 0,
+ "size": 9170974,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T00:00:00.000Z/2023-07-06T01:00:00.000Z",
+ "version": "2023-07-06T00:02:08.498Z",
+ "partitionNumber": 1,
+ "size": 23969632,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T01:00:00.000Z/2023-07-06T02:00:00.000Z",
+ "version": "2023-07-06T01:13:53.982Z",
+ "partitionNumber": 0,
+ "size": 599895,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T01:00:00.000Z/2023-07-06T02:00:00.000Z",
+ "version": "2023-07-06T01:13:53.982Z",
+ "partitionNumber": 1,
+ "size": 1627041,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T02:00:00.000Z/2023-07-06T03:00:00.000Z",
+ "version": "2023-07-06T02:55:50.701Z",
+ "partitionNumber": 0,
+ "size": 629753,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T02:00:00.000Z/2023-07-06T03:00:00.000Z",
+ "version": "2023-07-06T02:55:50.701Z",
+ "partitionNumber": 1,
+ "size": 1342360,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T04:00:00.000Z/2023-07-06T05:00:00.000Z",
+ "version": "2023-07-06T04:02:36.562Z",
+ "partitionNumber": 0,
+ "size": 2131434,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T05:00:00.000Z/2023-07-06T06:00:00.000Z",
+ "version": "2023-07-06T05:23:27.856Z",
+ "partitionNumber": 0,
+ "size": 797161,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T05:00:00.000Z/2023-07-06T06:00:00.000Z",
+ "version": "2023-07-06T05:23:27.856Z",
+ "partitionNumber": 1,
+ "size": 1176858,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T06:00:00.000Z/2023-07-06T07:00:00.000Z",
+ "version": "2023-07-06T06:46:34.638Z",
+ "partitionNumber": 0,
+ "size": 2148760,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T07:00:00.000Z/2023-07-06T08:00:00.000Z",
+ "version": "2023-07-06T07:38:28.050Z",
+ "partitionNumber": 0,
+ "size": 2040748,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T08:00:00.000Z/2023-07-06T09:00:00.000Z",
+ "version": "2023-07-06T08:27:31.407Z",
+ "partitionNumber": 0,
+ "size": 678723,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T08:00:00.000Z/2023-07-06T09:00:00.000Z",
+ "version": "2023-07-06T08:27:31.407Z",
+ "partitionNumber": 1,
+ "size": 1437866,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T10:00:00.000Z/2023-07-06T11:00:00.000Z",
+ "version": "2023-07-06T10:02:42.079Z",
+ "partitionNumber": 0,
+ "size": 1671296,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T11:00:00.000Z/2023-07-06T12:00:00.000Z",
+ "version": "2023-07-06T11:27:23.902Z",
+ "partitionNumber": 0,
+ "size": 574893,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T11:00:00.000Z/2023-07-06T12:00:00.000Z",
+ "version": "2023-07-06T11:27:23.902Z",
+ "partitionNumber": 1,
+ "size": 1427384,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T12:00:00.000Z/2023-07-06T13:00:00.000Z",
+ "version": "2023-07-06T12:52:00.846Z",
+ "partitionNumber": 0,
+ "size": 2115172,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T14:00:00.000Z/2023-07-06T15:00:00.000Z",
+ "version": "2023-07-06T14:32:33.926Z",
+ "partitionNumber": 0,
+ "size": 589108,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T14:00:00.000Z/2023-07-06T15:00:00.000Z",
+ "version": "2023-07-06T14:32:33.926Z",
+ "partitionNumber": 1,
+ "size": 1392649,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T15:00:00.000Z/2023-07-06T16:00:00.000Z",
+ "version": "2023-07-06T15:53:25.467Z",
+ "partitionNumber": 0,
+ "size": 2037851,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T16:00:00.000Z/2023-07-06T17:00:00.000Z",
+ "version": "2023-07-06T16:02:26.568Z",
+ "partitionNumber": 0,
+ "size": 230400650,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T16:00:00.000Z/2023-07-06T17:00:00.000Z",
+ "version": "2023-07-06T16:02:26.568Z",
+ "partitionNumber": 1,
+ "size": 38209056,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ },
+ {
+ "interval": "2023-07-06T17:00:00.000Z/2023-07-06T18:00:00.000Z",
+ "version": "2023-07-06T17:00:02.391Z",
+ "partitionNumber": 0,
+ "size": 211099463,
+ "locations": [
+ {
+ "name": "localhost:8083",
+ "host": "localhost:8083",
+ "hostAndTlsPort": null,
+ "maxSize": 300000000000,
+ "type": "historical",
+ "tier": "_default_tier",
+ "priority": 0
+ }
+ ]
+ }
+]
+ ```
+
diff --git a/docs/35.0.0/api-reference/legacy-metadata-api.md b/docs/35.0.0/api-reference/legacy-metadata-api.md
new file mode 100644
index 0000000000..d22be18a7e
--- /dev/null
+++ b/docs/35.0.0/api-reference/legacy-metadata-api.md
@@ -0,0 +1,344 @@
+---
+id: legacy-metadata-api
+title: Legacy metadata API
+sidebar_label: Legacy metadata
+---
+
+
+
+This document describes the legacy API endpoints to retrieve datasource metadata from Apache Druid. Use the [SQL metadata tables](../querying/sql-metadata-tables.md) to retrieve datasource metadata instead.
+
+## Segment loading
+
+`GET /druid/coordinator/v1/loadstatus`
+
+Returns the percentage of segments actually loaded in the cluster versus segments that should be loaded in the cluster.
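+
+For example, the following request retrieves the overall load status. As in the other API reference topics, `ROUTER_IP:ROUTER_PORT` is a placeholder for a Router that proxies Coordinator APIs; substitute the address for your deployment.
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/loadstatus"
+```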
+
+`GET /druid/coordinator/v1/loadstatus?simple`
+
+Returns the number of segments left to load until segments that should be loaded in the cluster are available for queries. This does not include segment replication counts.
+
+`GET /druid/coordinator/v1/loadstatus?full`
+
+Returns the number of segments left to load in each tier until segments that should be loaded in the cluster are all available. This includes segment replication counts.
+
+`GET /druid/coordinator/v1/loadstatus?full&computeUsingClusterView`
+
+Returns the number of segments not yet loaded for each tier until all segments loading in the cluster are available.
+The result includes segment replication counts. When computing the number of segments remaining to load, it also factors in the number of available nodes of a service type that can load the segment.
+A segment is considered fully loaded when either of the following is true:
+- Druid has replicated it the number of times configured in the corresponding load rule.
+- The number of replicas for the segment in each tier where it is configured to be replicated equals the number of available nodes of a service type that are currently allowed to load the segment in that tier.
+
+`GET /druid/coordinator/v1/loadqueue`
+
+Returns the ids of segments to load and drop for each Historical process.
+
+`GET /druid/coordinator/v1/loadqueue?simple`
+
+Returns the number of segments to load and drop, as well as the total segment load and drop size in bytes for each Historical process.
+
+`GET /druid/coordinator/v1/loadqueue?full`
+
+Returns the serialized JSON of segments to load and drop for each Historical process.
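+
+For example, the following request returns the simple load queue summary for each Historical process:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/loadqueue?simple"
+```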
+
+## Segment loading by datasource
+
+Note that all _interval_ query parameters are ISO 8601 strings—for example, `2016-06-27/2016-06-28`.
+Also note that these APIs only guarantee that the segments are available at the time of the call.
+Segments can still go missing afterward because of Historical process failures or other reasons.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/loadstatus?forceMetadataRefresh={boolean}&interval={myInterval}`
+
+Returns the percentage of segments actually loaded in the cluster versus segments that should be loaded in the cluster for the given
+datasource over the given interval (or the last two weeks if no interval is given). The `forceMetadataRefresh` parameter is required. See the sample request after this list.
+* Setting `forceMetadataRefresh` to `true` forces the Coordinator to poll the latest segment metadata from the metadata store.
+Note that `forceMetadataRefresh=true` refreshes the Coordinator's metadata cache for all datasources. This can place a heavy load
+on the metadata store but may be necessary to verify the load status of the latest segments.
+* Setting `forceMetadataRefresh` to `false` uses the metadata cached on the Coordinator from the last forced or periodic refresh.
+
+If no used segments are found for the given inputs, this API returns `204 No Content`.
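+
+For example, the following request checks the load status of a hypothetical `social_media` datasource over an explicit, URL-encoded interval without forcing a metadata refresh. The datasource name and interval are illustrative.
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/datasources/social_media/loadstatus?forceMetadataRefresh=false&interval=2023-07-01%2F2023-07-14"
+```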
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/loadstatus?simple&forceMetadataRefresh={boolean}&interval={myInterval}`
+
+Returns the number of segments left to load until segments that should be loaded in the cluster are available for the given datasource
+over the given interval (or the last two weeks if no interval is given). This does not include segment replication counts. The `forceMetadataRefresh` parameter is required.
+* Setting `forceMetadataRefresh` to `true` forces the Coordinator to poll the latest segment metadata from the metadata store.
+Note that `forceMetadataRefresh=true` refreshes the Coordinator's metadata cache for all datasources. This can place a heavy load
+on the metadata store but may be necessary to verify the load status of the latest segments.
+* Setting `forceMetadataRefresh` to `false` uses the metadata cached on the Coordinator from the last forced or periodic refresh.
+
+If no used segments are found for the given inputs, this API returns `204 No Content`.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/loadstatus?full&forceMetadataRefresh={boolean}&interval={myInterval}`
+
+Returns the number of segments left to load in each tier until segments that should be loaded in the cluster are all available for the given datasource over the given interval (or the last two weeks if no interval is given). This includes segment replication counts. The `forceMetadataRefresh` parameter is required.
+* Setting `forceMetadataRefresh` to `true` forces the Coordinator to poll the latest segment metadata from the metadata store.
+Note that `forceMetadataRefresh=true` refreshes the Coordinator's metadata cache for all datasources. This can place a heavy load
+on the metadata store but may be necessary to verify the load status of the latest segments.
+* Setting `forceMetadataRefresh` to `false` uses the metadata cached on the Coordinator from the last forced or periodic refresh.
+
+You can pass the optional query parameter `computeUsingClusterView` to factor in the available cluster services when calculating
+the segments left to load. See [Coordinator Segment Loading](#segment-loading) for details.
+If no used segments are found for the given inputs, this API returns `204 No Content`.
+
+## Metadata store information
+
+:::info
+ Note: Much of this information is available in a simpler, easier-to-use form through the Druid SQL
+ [`sys.segments`](../querying/sql-metadata-tables.md#segments-table) table.
+:::
+
+`GET /druid/coordinator/v1/metadata/segments`
+
+Returns a list of all segments for each datasource enabled in the cluster.
+
+`GET /druid/coordinator/v1/metadata/segments?datasources={dataSourceName1}&datasources={dataSourceName2}`
+
+Returns a list of all segments for one or more specific datasources enabled in the cluster.
+
+`GET /druid/coordinator/v1/metadata/segments?includeOvershadowedStatus`
+
+Returns a list of all segments for each datasource with the full segment metadata and an extra field `overshadowed`.
+
+`GET /druid/coordinator/v1/metadata/segments?includeOvershadowedStatus&includeRealtimeSegments`
+
+Returns a list of all published and realtime segments for each datasource with the full segment metadata and the extra fields `overshadowed`, `realtime`, and `numRows`. Realtime segments are returned only when `druid.centralizedDatasourceSchema.enabled` is set on the Coordinator.
+
+`GET /druid/coordinator/v1/metadata/segments?includeOvershadowedStatus&datasources={dataSourceName1}&datasources={dataSourceName2}`
+
+Returns a list of all segments for one or more specific datasources with the full segment metadata and an extra field `overshadowed`.
+
+`GET /druid/coordinator/v1/metadata/segments?includeOvershadowedStatus&includeRealtimeSegments&datasources={dataSourceName1}&datasources={dataSourceName2}`
+
+Returns a list of all published and realtime segments for the specified datasources with the full segment metadata and the extra fields `overshadowed`, `realtime`, and `numRows`. Realtime segments are returned only when `druid.centralizedDatasourceSchema.enabled` is set on the Coordinator.
+
+`GET /druid/coordinator/v1/metadata/datasources`
+
+Returns a list of the names of datasources with at least one used segment in the cluster, retrieved from the metadata database. Users should call this API to get the eventual state that the system will be in.
+
+`GET /druid/coordinator/v1/metadata/datasources?includeUnused`
+
+Returns a list of the names of datasources, regardless of whether there are used segments belonging to those datasources in the cluster or not.
+
+`GET /druid/coordinator/v1/metadata/datasources?includeDisabled`
+
+Returns a list of the names of datasources, regardless of whether the datasource is disabled or not.
+
+`GET /druid/coordinator/v1/metadata/datasources?full`
+
+Returns a list of all datasources with at least one used segment in the cluster. Returns all metadata about those datasources as stored in the metadata store.
+
+`GET /druid/coordinator/v1/metadata/datasources/{dataSourceName}`
+
+Returns full metadata for a datasource as stored in the metadata store.
+
+`GET /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments`
+
+Returns a list of all segments for a datasource as stored in the metadata store.
+
+`GET /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments?full`
+
+Returns a list of all segments for a datasource with the full segment metadata as stored in the metadata store.
+
+`GET /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments/{segmentId}`
+
+Returns full segment metadata for a specific segment as stored in the metadata store, if the segment is used. If the
+segment is unused, or is unknown, a 404 response is returned.
+
+`GET /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments/{segmentId}?includeUnused=true`
+
+Returns full segment metadata for a specific segment as stored in the metadata store. If it is unknown, a 404 response
+is returned.
+
+`POST /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments`
+
+Returns a list of all segments, overlapping with any of the given intervals, for a datasource as stored in the metadata store. The request body is an array of ISO 8601 interval strings—for example, `["2012-01-01T00:00:00.000/2012-01-03T00:00:00.000", "2012-01-05T00:00:00.000/2012-01-07T00:00:00.000"]`.
+
+`POST /druid/coordinator/v1/metadata/datasources/{dataSourceName}/segments?full`
+
+Returns a list of all segments, overlapping with any of the given intervals, for a datasource with the full segment metadata as stored in the metadata store. The request body is an array of ISO 8601 interval strings—for example, `["2012-01-01T00:00:00.000/2012-01-03T00:00:00.000", "2012-01-05T00:00:00.000/2012-01-07T00:00:00.000"]`.
+
+`POST /druid/coordinator/v1/metadata/dataSourceInformation`
+
+Returns information about the specified datasources, including the datasource schema.
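+
+A minimal sketch of this request, assuming the request body is a JSON array of datasource names; the names shown are illustrative:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/metadata/dataSourceInformation" \
+--header 'Content-Type: application/json' \
+--data '["social_media", "wikipedia_api"]'
+```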
+
+`POST /druid/coordinator/v1/metadata/bootstrapSegments`
+
+Returns information about bootstrap segments for all datasources. The returned set includes all broadcast segments if broadcast rules are configured.
+
+
+
+## Datasources
+
+Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/`—for example, `2016-06-27_2016-06-28`.
+
+`GET /druid/coordinator/v1/datasources`
+
+Returns a list of datasource names found in the cluster as seen by the coordinator. This view is updated every [`druid.coordinator.period`](../configuration/index.md#coordinator-operation).
+
+`GET /druid/coordinator/v1/datasources?simple`
+
+Returns a list of JSON objects containing the name and properties of datasources found in the cluster. Properties include segment count, total segment byte size, replicated total segment byte size, minTime, and maxTime.
+
+`GET /druid/coordinator/v1/datasources?full`
+
+Returns a list of datasource names found in the cluster with all metadata about those datasources.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}`
+
+Returns a JSON object containing the name and properties of a datasource. Properties include segment count, total segment byte size, replicated total segment byte size, minTime, and maxTime.
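+
+For example, the following request returns the properties of a hypothetical datasource named `social_media`:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/datasources/social_media"
+```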
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}?full`
+
+Returns full metadata for a datasource.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals`
+
+Returns a set of segment intervals.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals?simple`
+
+Returns a map of an interval to a JSON object containing the total byte size of segments and number of segments for that interval.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals?full`
+
+Returns a map of an interval to a map of segment metadata to a set of server names that contain the segment for that interval.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}`
+
+Returns a set of segment ids for an interval.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}?simple`
+
+Returns a map of segment intervals contained within the specified interval to a JSON object containing the total byte size of segments and number of segments for an interval.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}?full`
+
+Returns a map of segment intervals contained within the specified interval to a map of segment metadata to a set of server names that contain the segment for an interval.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/intervals/{interval}/serverview`
+
+Returns a map of segment intervals contained within the specified interval to information about the servers that contain the segment for an interval.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/segments`
+
+Returns a list of all segments for a datasource in the cluster.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/segments?full`
+
+Returns a list of all segments for a datasource in the cluster with the full segment metadata.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/segments/{segmentId}`
+
+Returns full segment metadata for a specific segment in the cluster.
+
+`GET /druid/coordinator/v1/datasources/{dataSourceName}/tiers`
+
+Returns the tiers that a datasource exists in.
+
+## Intervals
+
+Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/` as in `2016-06-27_2016-06-28`.
+
+`GET /druid/coordinator/v1/intervals`
+
+Returns all intervals for all datasources with total size and count.
+
+`GET /druid/coordinator/v1/intervals/{interval}`
+
+Returns aggregated total size and count for all intervals that intersect given ISO interval.
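+
+For example, the following request returns the aggregated size and count for all intervals that intersect a one-week interval. Note the `_` delimiter; the interval shown is illustrative.
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/intervals/2023-07-01_2023-07-14"
+```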
+
+`GET /druid/coordinator/v1/intervals/{interval}?simple`
+
+Returns total size and count for each interval within given ISO interval.
+
+`GET /druid/coordinator/v1/intervals/{interval}?full`
+
+Returns total size and count for each datasource for each interval within given ISO interval.
+
+## Server information
+
+`GET /druid/coordinator/v1/servers`
+
+Returns a list of server URLs in the format `{hostname}:{port}`. Note that
+a host running multiple process types appears multiple times, once for each
+port.
+
+`GET /druid/coordinator/v1/servers?simple`
+
+Returns a list of server data objects in which each object has the following keys (see the sample request after this list):
+* `host`: host URL in the form `{hostname}:{port}`
+* `type`: process type (`indexer-executor`, `historical`)
+* `currSize`: storage size currently used
+* `maxSize`: maximum storage size
+* `priority`
+* `tier`
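+
+For example, a sample request for the simple server view looks like the following (using the same Router placeholder as the other examples in this reference):
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/servers?simple"
+```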
+
+
+## Query server
+
+This section documents the API endpoints for the services that reside on Query servers (Brokers) in the suggested [three-server configuration](../design/architecture.md#druid-servers).
+
+### Broker
+
+#### Datasource information
+
+Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/`
+as in `2016-06-27_2016-06-28`.
+
+:::info
+ Note: Much of this information is available in a simpler, easier-to-use form through the Druid SQL
+ [`INFORMATION_SCHEMA.TABLES`](../querying/sql-metadata-tables.md#tables-table),
+ [`INFORMATION_SCHEMA.COLUMNS`](../querying/sql-metadata-tables.md#columns-table), and
+ [`sys.segments`](../querying/sql-metadata-tables.md#segments-table) tables.
+:::
+
+`GET /druid/v2/datasources`
+
+Returns a list of queryable datasources.
+
+`GET /druid/v2/datasources/{dataSourceName}`
+
+Returns the dimensions and metrics of the datasource. Optionally, you can provide the request parameter `full` to get the list of served intervals along with the dimensions and metrics served for those intervals. You can also provide the request parameter `interval` to restrict the lookup to a particular interval.
+
+If no interval is specified, Druid uses a default interval spanning a configurable period before the current time. The default duration of this interval is specified in ISO 8601 duration format by `druid.query.segmentMetadata.defaultHistory`.
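+
+For example, the following request returns the dimensions and metrics of a hypothetical `social_media` datasource over the default interval:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/v2/datasources/social_media"
+```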
+
+`GET /druid/v2/datasources/{dataSourceName}/dimensions`
+
+:::info
+ This API is deprecated and will be removed in future releases. Please use [SegmentMetadataQuery](../querying/segmentmetadataquery.md) instead
+ which provides more comprehensive information and supports all dataSource types including streaming dataSources. It's also encouraged to use [INFORMATION_SCHEMA tables](../querying/sql-metadata-tables.md)
+ if you're using SQL.
+:::
+
+Returns the dimensions of the datasource.
+
+`GET /druid/v2/datasources/{dataSourceName}/metrics`
+
+:::info
+ This API is deprecated and will be removed in future releases. Please use [SegmentMetadataQuery](../querying/segmentmetadataquery.md) instead
+ which provides more comprehensive information and supports all dataSource types including streaming dataSources. It's also encouraged to use [INFORMATION_SCHEMA tables](../querying/sql-metadata-tables.md)
+ if you're using SQL.
+:::
+
+Returns the metrics of the datasource.
+
+`GET /druid/v2/datasources/{dataSourceName}/candidates?intervals={comma-separated-intervals}&numCandidates={numCandidates}`
+
+Returns segment information lists, including server locations, for the given datasource and intervals. If `numCandidates` is not specified, the endpoint returns all servers for each interval.
diff --git a/docs/35.0.0/api-reference/lookups-api.md b/docs/35.0.0/api-reference/lookups-api.md
new file mode 100644
index 0000000000..4a122917b5
--- /dev/null
+++ b/docs/35.0.0/api-reference/lookups-api.md
@@ -0,0 +1,279 @@
+---
+id: lookups-api
+title: Lookups API
+sidebar_label: Lookups
+---
+
+
+
+This document describes the API endpoints to configure, update, retrieve, and manage lookups for Apache Druid.
+
+## Configure lookups
+
+### Bulk update
+
+Lookups can be updated in bulk by posting a JSON object to `/druid/coordinator/v1/lookups/config`. The format of the JSON object is as follows:
+
+```json
+{
+  "<tierName>": {
+    "<lookupName>": {
+      "version": "<version>",
+      "lookupExtractorFactory": {
+        "type": "<someExtractorFactoryType>",
+        "<someExtractorField>": "<someExtractorValue>"
+      }
+    }
+  }
+}
+```
+
+Note that `version` is an arbitrary string assigned by the user. When updating an existing lookup, you must specify a lexicographically higher version than the current one.
+
+For example, a config might look something like:
+
+```json
+{
+ "__default": {
+ "country_code": {
+ "version": "v0",
+ "lookupExtractorFactory": {
+ "type": "map",
+ "map": {
+ "77483": "United States"
+ }
+ }
+ },
+ "site_id": {
+ "version": "v0",
+ "lookupExtractorFactory": {
+ "type": "cachedNamespace",
+ "extractionNamespace": {
+ "type": "jdbc",
+ "connectorConfig": {
+ "createTables": true,
+ "connectURI": "jdbc:mysql:\/\/localhost:3306\/druid",
+ "user": "druid",
+ "password": "diurd"
+ },
+ "table": "lookupTable",
+ "keyColumn": "country_id",
+ "valueColumn": "country_name",
+ "tsColumn": "timeColumn"
+ },
+ "firstCacheTimeout": 120000,
+ "injective": true
+ }
+ },
+ "site_id_customer1": {
+ "version": "v0",
+ "lookupExtractorFactory": {
+ "type": "map",
+ "map": {
+ "847632": "Internal Use Only"
+ }
+ }
+ },
+ "site_id_customer2": {
+ "version": "v0",
+ "lookupExtractorFactory": {
+ "type": "map",
+ "map": {
+ "AHF77": "Home"
+ }
+ }
+ }
+ },
+ "realtime_customer1": {
+ "country_code": {
+ "version": "v0",
+ "lookupExtractorFactory": {
+ "type": "map",
+ "map": {
+ "77483": "United States"
+ }
+ }
+ },
+ "site_id_customer1": {
+ "version": "v0",
+ "lookupExtractorFactory": {
+ "type": "map",
+ "map": {
+ "847632": "Internal Use Only"
+ }
+ }
+ }
+ },
+ "realtime_customer2": {
+ "country_code": {
+ "version": "v0",
+ "lookupExtractorFactory": {
+ "type": "map",
+ "map": {
+ "77483": "United States"
+ }
+ }
+ },
+ "site_id_customer2": {
+ "version": "v0",
+ "lookupExtractorFactory": {
+ "type": "map",
+ "map": {
+ "AHF77": "Home"
+ }
+ }
+ }
+ }
+}
+```
+
+All entries in the map will UPDATE existing entries. No entries will be deleted.
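+
+For example, the following sketch posts a bulk update that raises the `country_code` lookup in the `__default` tier to version `v1`. The tier, lookup name, and mapping are illustrative and follow the example above.
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/lookups/config" \
+--header 'Content-Type: application/json' \
+--data '{
+  "__default": {
+    "country_code": {
+      "version": "v1",
+      "lookupExtractorFactory": {
+        "type": "map",
+        "map": {
+          "77483": "United States"
+        }
+      }
+    }
+  }
+}'
+```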
+
+### Update lookup
+
+A `POST` to a particular lookup extractor factory via `/druid/coordinator/v1/lookups/config/{tier}/{id}` creates or updates that specific extractor factory.
+
+For example, a post to `/druid/coordinator/v1/lookups/config/realtime_customer1/site_id_customer1` might contain the following:
+
+```json
+{
+ "version": "v1",
+ "lookupExtractorFactory": {
+ "type": "map",
+ "map": {
+ "847632": "Internal Use Only"
+ }
+ }
+}
+```
+
+This will replace the `site_id_customer1` lookup in the `realtime_customer1` tier with the definition above.
+
+Assign a unique version identifier each time you update a lookup extractor factory. Otherwise the call will fail.
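+
+For example, the following request submits the update above. The tier and lookup names follow the earlier examples and are illustrative.
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/lookups/config/realtime_customer1/site_id_customer1" \
+--header 'Content-Type: application/json' \
+--data '{
+  "version": "v1",
+  "lookupExtractorFactory": {
+    "type": "map",
+    "map": {
+      "847632": "Internal Use Only"
+    }
+  }
+}'
+```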
+
+### Get all lookups
+
+A `GET` to `/druid/coordinator/v1/lookups/config/all` will return all known lookup specs for all tiers.
+
+### Get lookup
+
+A `GET` to a particular lookup extractor factory is accomplished via `/druid/coordinator/v1/lookups/config/{tier}/{id}`
+
+Using the prior example, a `GET` to `/druid/coordinator/v1/lookups/config/realtime_customer2/site_id_customer2` should return
+
+```json
+{
+ "version": "v1",
+ "lookupExtractorFactory": {
+ "type": "map",
+ "map": {
+ "AHF77": "Home"
+ }
+ }
+}
+```
+
+### Delete lookup
+
+A `DELETE` to `/druid/coordinator/v1/lookups/config/{tier}/{id}` will remove that lookup from the cluster. If it was the last lookup in the tier, the tier is deleted as well.
+
+### Delete tier
+
+A `DELETE` to `/druid/coordinator/v1/lookups/config/{tier}` will remove that tier from the cluster.
+
+### List tier names
+
+A `GET` to `/druid/coordinator/v1/lookups/config` will return a list of known tier names in the dynamic configuration.
+To also discover tiers currently active in the cluster, in addition to those known in the dynamic configuration, add the parameter `discover=true`, as in `/druid/coordinator/v1/lookups/config?discover=true`.
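+
+For example, the following request lists configured tiers and also discovers tiers active in the cluster:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/lookups/config?discover=true"
+```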
+
+### List lookup names
+
+A `GET` to `/druid/coordinator/v1/lookups/config/{tier}` will return a list of known lookup names for that tier.
+
+The endpoints in the following section report the propagation status of configured lookups to the processes that use them, such as Historicals.
+
+## Lookup status
+
+### List load status of all lookups
+
+`GET` `/druid/coordinator/v1/lookups/status` with optional query parameter `detailed`.
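+
+For example, the following request returns the load status of all lookups with detailed information:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/lookups/status?detailed=true"
+```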
+
+### List load status of lookups in a tier
+
+`GET` `/druid/coordinator/v1/lookups/status/{tier}` with optional query parameter `detailed`.
+
+### List load status of single lookup
+
+`GET` `/druid/coordinator/v1/lookups/status/{tier}/{lookup}` with optional query parameter `detailed`.
+
+### List lookup state of all processes
+
+`GET` `/druid/coordinator/v1/lookups/nodeStatus` with optional query parameter `discover` to discover tiers advertised by other Druid nodes, or by default, returning all configured lookup tiers. The default response will also include the lookups which are loaded, being loaded, or being dropped on each node, for each tier, including the complete lookup spec. Add the optional query parameter `detailed=false` to only include the 'version' of the lookup instead of the complete spec.
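+
+For example, the following request returns the lookup state of all processes, including only the lookup versions rather than the complete specs:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/lookups/nodeStatus?detailed=false"
+```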
+
+### List lookup state of processes in a tier
+
+`GET` `/druid/coordinator/v1/lookups/nodeStatus/{tier}`
+
+### List lookup state of single process
+
+`GET` `/druid/coordinator/v1/lookups/nodeStatus/{tier}/{host:port}`
+
+## Internal API
+
+The Peon, Router, Broker, and Historical processes can all consume lookup configuration.
+These processes use an internal API rooted at `/druid/listen/v1/lookups` to list, load, and drop their lookups.
+The return values follow the same convention as the cluster-wide dynamic configuration. Use the following endpoints
+for debugging purposes only.
+
+### Get lookups
+
+A `GET` to the process at `/druid/listen/v1/lookups` will return a JSON map of all the lookups currently active on the process.
+The return value will be a JSON map of lookup names to their extractor factories.
+
+```json
+{
+ "site_id_customer2": {
+ "version": "v1",
+ "lookupExtractorFactory": {
+ "type": "map",
+ "map": {
+ "AHF77": "Home"
+ }
+ }
+ }
+}
+```
+
+### Get lookup
+
+A `GET` to the process at `/druid/listen/v1/lookups/some_lookup_name` will return the LookupExtractorFactory for the lookup identified by `some_lookup_name`.
+The return value will be the JSON representation of the factory.
+
+```json
+{
+ "version": "v1",
+ "lookupExtractorFactory": {
+ "type": "map",
+ "map": {
+ "AHF77": "Home"
+ }
+ }
+}
+```
\ No newline at end of file
diff --git a/docs/35.0.0/api-reference/retention-rules-api.md b/docs/35.0.0/api-reference/retention-rules-api.md
new file mode 100644
index 0000000000..c21e546abd
--- /dev/null
+++ b/docs/35.0.0/api-reference/retention-rules-api.md
@@ -0,0 +1,562 @@
+---
+id: retention-rules-api
+title: Retention rules API
+sidebar_label: Retention rules
+---
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+
+
+This topic describes the API endpoints for managing retention rules in Apache Druid. You can configure retention rules in the Druid web console or API.
+
+Druid uses retention rules to determine what data is retained in the cluster. Druid supports load, drop, and broadcast rules. For more information, see [Using rules to drop and retain data](../operations/rule-configuration.md).
+
+In this topic, `http://ROUTER_IP:ROUTER_PORT` is a placeholder for your Router service address and port. Replace it with the information for your deployment. For example, use `http://localhost:8888` for quickstart deployments.
+
+## Update retention rules for a datasource
+
+Updates one or more retention rules for a datasource. The request body takes an array of retention rule objects. For details on defining retention rules, see the following sources:
+
+* [Load rules](../operations/rule-configuration.md#load-rules)
+* [Drop rules](../operations/rule-configuration.md#drop-rules)
+* [Broadcast rules](../operations/rule-configuration.md#broadcast-rules)
+
+This request overwrites any existing rules for the datasource.
+Druid reads rules in the order in which they appear; for more information, see [rule structure](../operations/rule-configuration.md).
+
+Note that this endpoint returns an HTTP `200 OK` even if the datasource does not exist.
+
+### URL
+
+`POST` `/druid/coordinator/v1/rules/{dataSource}`
+
+### Header parameters
+
+The endpoint supports a set of optional header parameters to populate the `author` and `comment` fields in the `auditInfo` property for audit history.
+
+* `X-Druid-Author` (optional)
+ * Type: String
+ * A string representing the author making the configuration change.
+* `X-Druid-Comment` (optional)
+ * Type: String
+ * A string describing the update.
+
+### Responses
+
+
+
+
+
+
+*Successfully updated retention rules for specified datasource*
+
+
+
+
+---
+
+### Sample request
+
+The following example configures broadcast, load, and drop retention rules for the `kttm1` datasource.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/rules/kttm1" \
+--header 'X-Druid-Author: doc intern' \
+--header 'X-Druid-Comment: submitted via api' \
+--header 'Content-Type: application/json' \
+--data '[
+ {
+ "type": "broadcastForever"
+ },
+ {
+ "type": "loadForever",
+ "tieredReplicants": {
+ "_default_tier": 2
+ },
+ "useDefaultTierForNull": true
+ },
+ {
+ "type": "dropByPeriod",
+ "period": "P1M"
+ }
+]'
+```
+
+
+
+
+
+```HTTP
+POST /druid/coordinator/v1/rules/kttm1 HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+X-Druid-Author: doc intern
+X-Druid-Comment: submitted via api
+Content-Type: application/json
+Content-Length: 273
+
+[
+ {
+ "type": "broadcastForever"
+ },
+ {
+ "type": "loadForever",
+ "tieredReplicants": {
+      "_default_tier": 2
+ },
+ "useDefaultTierForNull": true
+ },
+ {
+ "type": "dropByPeriod",
+ "period": "P1M"
+ }
+]
+```
+
+
+
+
+### Sample response
+
+A successful request returns an HTTP `200 OK` message code and an empty response body.
+
+## Update default retention rules for all datasources
+
+Updates one or more default retention rules for all datasources. Submit retention rules as an array of objects in the request body. For details on defining retention rules, see the following sources:
+
+* [Load rules](../operations/rule-configuration.md#load-rules)
+* [Drop rules](../operations/rule-configuration.md#drop-rules)
+* [Broadcast rules](../operations/rule-configuration.md#broadcast-rules)
+
+This request overwrites any existing rules for all datasources. To remove default retention rules for all datasources, submit an empty rule array in the request body. Rules are read in the order in which they appear; for more information, see [rule structure](../operations/rule-configuration.md).
+
+### URL
+
+`POST` `/druid/coordinator/v1/rules/_default`
+
+### Header parameters
+
+The endpoint supports a set of optional header parameters to populate the `author` and `comment` fields in the `auditInfo` property for audit history.
+
+* `X-Druid-Author` (optional)
+ * Type: String
+ * A string representing the author making the configuration change.
+* `X-Druid-Comment` (optional)
+ * Type: String
+ * A string describing the update.
+
+### Responses
+
+
+
+
+
+
+*Successfully updated default retention rules*
+
+
+
+
+
+*Error with request body*
+
+
+
+
+---
+
+### Sample request
+
+The following example updates the default retention rule for all datasources with a `loadByInterval` rule.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/rules/_default" \
+--header 'Content-Type: application/json' \
+--data '[
+ {
+ "type": "loadByInterval",
+ "tieredReplicants": {},
+ "useDefaultTierForNull": false,
+ "interval": "2010-01-01/2020-01-01"
+ }
+]'
+```
+
+
+
+
+
+```HTTP
+POST /druid/coordinator/v1/rules/_default HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 205
+
+[
+ {
+ "type": "loadByInterval",
+ "tieredReplicants": {},
+ "useDefaultTierForNull": false,
+ "interval": "2010-01-01/2020-01-01"
+ }
+]
+```
+
+
+
+
+### Sample response
+
+A successful request returns an HTTP `200 OK` message code and an empty response body.
+
+## Get an array of all retention rules
+
+Retrieves all current retention rules in the cluster including the default retention rule. Returns an array of objects for each datasource and their associated retention rules.
+
+### URL
+
+`GET` `/druid/coordinator/v1/rules`
+
+### Responses
+
+
+
+
+
+
+*Successfully retrieved retention rules*
+
+
+
+
+---
+
+### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/rules"
+```
+
+
+
+
+
+```HTTP
+GET /druid/coordinator/v1/rules HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "_default": [
+ {
+ "tieredReplicants": {
+ "_default_tier": 2
+ },
+ "type": "loadForever"
+ }
+ ],
+ "social_media": [
+ {
+ "interval": "2023-01-01T00:00:00.000Z/2023-02-01T00:00:00.000Z",
+ "type": "dropByInterval"
+ }
+ ],
+ "wikipedia_api": [],
+}
+ ```
+
+
+## Get an array of retention rules for a datasource
+
+Retrieves an array of rule objects for a single datasource. Returns an empty array if there are no retention rules.
+
+Note that this endpoint returns an HTTP `200 OK` message code even if the datasource doesn't exist.
+
+### URL
+
+`GET` `/druid/coordinator/v1/rules/{dataSource}`
+
+### Query parameters
+
+* `full` (optional)
+ * Includes the default retention rule for the datasource in the response.
+
+### Responses
+
+
+
+
+
+
+*Successfully retrieved retention rules*
+
+
+
+
+---
+
+### Sample request
+
+The following example retrieves the custom retention rules and default retention rules for the datasource named `social_media`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/rules/social_media?full=null"
+```
+
+
+
+
+
+```HTTP
+GET /druid/coordinator/v1/rules/social_media?full=null HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+### Sample response
+
+
+ View the response
+
+ ```json
+[
+ {
+ "interval": "2020-01-01T00:00:00.000Z/2022-02-01T00:00:00.000Z",
+ "type": "dropByInterval"
+ },
+ {
+ "interval": "2010-01-01T00:00:00.000Z/2020-01-01T00:00:00.000Z",
+ "tieredReplicants": {
+ "_default_tier": 2
+ },
+ "type": "loadByInterval"
+ },
+ {
+ "tieredReplicants": {
+ "_default_tier": 2
+ },
+ "type": "loadForever"
+ }
+]
+ ```
+
+
+
+## Get audit history for all datasources
+
+Retrieves the audit history of rules for all datasources over an interval of time. The default interval is 1 week. You can change this period by setting `druid.audit.manager.auditHistoryMillis` in the `runtime.properties` file for the Coordinator.
+
+### URL
+
+`GET` `/druid/coordinator/v1/rules/history`
+
+### Query parameters
+
+Note that you can't combine the following query parameters in a single request.
+
+* `interval` (optional)
+ * Type: ISO 8601 interval
+ * Limits the results to the specified time interval. Delimit the start and end dates with `/`. For example, `2023-07-13/2023-07-19`.
+* `count` (optional)
+ * Type: Int
+ * Limits the results to the last `n` entries.
+
+### Responses
+
+
+
+
+
+
+*Successfully retrieved audit history*
+
+
+
+
+
+*Request in the incorrect format*
+
+
+
+
+
+*`count` query parameter too large*
+
+
+
+
+---
+
+### Sample request
+
+The following example retrieves the audit history for all datasources from `2023-07-13` to `2023-07-19`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/rules/history?interval=2023-07-13%2F2023-07-19"
+```
+
+
+
+
+
+```HTTP
+GET /druid/coordinator/v1/rules/history?interval=2023-07-13/2023-07-19 HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+### Sample response
+
+
+ View the response
+
+ ```json
+[
+ {
+ "key": "social_media",
+ "type": "rules",
+ "auditInfo": {
+ "author": "console",
+ "comment": "test",
+ "ip": "127.0.0.1"
+ },
+ "payload": "[{\"interval\":\"2023-01-01T00:00:00.000Z/2023-02-01T00:00:00.000Z\",\"type\":\"dropByInterval\"}]",
+ "auditTime": "2023-07-13T18:05:33.066Z"
+ },
+ {
+ "key": "social_media",
+ "type": "rules",
+ "auditInfo": {
+ "author": "console",
+ "comment": "test",
+ "ip": "127.0.0.1"
+ },
+ "payload": "[]",
+ "auditTime": "2023-07-18T18:10:21.203Z"
+ },
+ {
+ "key": "wikipedia_api",
+ "type": "rules",
+ "auditInfo": {
+ "author": "console",
+ "comment": "test",
+ "ip": "127.0.0.1"
+ },
+ "payload": "[{\"tieredReplicants\":{\"_default_tier\":2},\"type\":\"loadForever\"}]",
+ "auditTime": "2023-07-18T18:10:44.519Z"
+ },
+ {
+ "key": "wikipedia_api",
+ "type": "rules",
+ "auditInfo": {
+ "author": "console",
+ "comment": "test",
+ "ip": "127.0.0.1"
+ },
+ "payload": "[]",
+ "auditTime": "2023-07-18T18:11:02.110Z"
+ },
+ {
+ "key": "social_media",
+ "type": "rules",
+ "auditInfo": {
+ "author": "console",
+ "comment": "test",
+ "ip": "127.0.0.1"
+ },
+ "payload": "[{\"interval\":\"2023-07-03T18:49:54.848Z/2023-07-03T18:49:55.861Z\",\"type\":\"dropByInterval\"}]",
+ "auditTime": "2023-07-18T18:32:50.060Z"
+ },
+ {
+ "key": "social_media",
+ "type": "rules",
+ "auditInfo": {
+ "author": "console",
+ "comment": "test",
+ "ip": "127.0.0.1"
+ },
+ "payload": "[{\"interval\":\"2020-01-01T00:00:00.000Z/2022-02-01T00:00:00.000Z\",\"type\":\"dropByInterval\"}]",
+ "auditTime": "2023-07-18T18:34:09.657Z"
+ },
+ {
+ "key": "social_media",
+ "type": "rules",
+ "auditInfo": {
+ "author": "console",
+ "comment": "test",
+ "ip": "127.0.0.1"
+ },
+ "payload": "[{\"interval\":\"2020-01-01T00:00:00.000Z/2022-02-01T00:00:00.000Z\",\"type\":\"dropByInterval\"},{\"tieredReplicants\":{\"_default_tier\":2},\"type\":\"loadForever\"}]",
+ "auditTime": "2023-07-18T18:38:37.223Z"
+ },
+ {
+ "key": "social_media",
+ "type": "rules",
+ "auditInfo": {
+ "author": "console",
+ "comment": "test",
+ "ip": "127.0.0.1"
+ },
+ "payload": "[{\"interval\":\"2020-01-01T00:00:00.000Z/2022-02-01T00:00:00.000Z\",\"type\":\"dropByInterval\"},{\"interval\":\"2010-01-01T00:00:00.000Z/2020-01-01T00:00:00.000Z\",\"tieredReplicants\":{\"_default_tier\":2},\"type\":\"loadByInterval\"}]",
+ "auditTime": "2023-07-18T18:49:43.964Z"
+ }
+]
+ ```
+
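+
+To limit the audit history to the most recent entries instead of a time range, use the `count` parameter on its own, since the two parameters can't be combined. For example, the following request retrieves the last five audit entries for all datasources:
+
+```shell
+# Retrieve only the five most recent retention rule audit entries.
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/rules/history?count=5"
+```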
diff --git a/docs/35.0.0/api-reference/service-status-api.md b/docs/35.0.0/api-reference/service-status-api.md
new file mode 100644
index 0000000000..47d2a5a6d3
--- /dev/null
+++ b/docs/35.0.0/api-reference/service-status-api.md
@@ -0,0 +1,1469 @@
+---
+id: service-status-api
+title: Service status API
+sidebar_label: Service status
+---
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+
+
+
+This document describes the API endpoints you can use to retrieve service status and cluster information for Apache Druid.
+
+In this document, `http://SERVICE_IP:SERVICE_PORT` is a placeholder for the address and port of the Druid service you want to query. Replace it with the information for your deployment. For example, to send requests through the Router in a quickstart deployment, use `http://localhost:8888`.
+
+## Common
+
+All services support the following endpoints.
+
+You can use each endpoint with the ports for each type of service. The following table contains port addresses for a local configuration:
+
+|Service|Port address|
+| ------ | ------------ |
+| Coordinator|8081|
+| Overlord|8081|
+| Router|8888|
+| Broker|8082|
+| Historical|8083|
+| Middle Manager|8091|
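+
+For example, on a local deployment that uses these ports, you can call the `/status` endpoint described below on the Coordinator and Broker directly:
+
+```shell
+# Query the Coordinator and Broker directly on their local quickstart ports.
+curl "http://localhost:8081/status"
+curl "http://localhost:8082/status"
+```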
+
+### Get service information
+
+Retrieves the Druid version, loaded extensions, memory used, total memory, and other useful information about the individual service.
+
+Modify the host and port for the endpoint to match the service to query. Refer to the [default service ports](#common) for the port numbers.
+
+#### URL
+
+`GET` `/status`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved service information*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/status"
+```
+
+
+
+
+
+```http
+GET /status HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "version": "26.0.0",
+ "modules": [
+ {
+ "name": "org.apache.druid.common.aws.AWSModule",
+ "artifact": "druid-aws-common",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.common.gcp.GcpModule",
+ "artifact": "druid-gcp-common",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.storage.hdfs.HdfsStorageDruidModule",
+ "artifact": "druid-hdfs-storage",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.indexing.kafka.KafkaIndexTaskModule",
+ "artifact": "druid-kafka-indexing-service",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.query.aggregation.datasketches.theta.SketchModule",
+ "artifact": "druid-datasketches",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.query.aggregation.datasketches.theta.oldapi.OldApiSketchModule",
+ "artifact": "druid-datasketches",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.query.aggregation.datasketches.quantiles.DoublesSketchModule",
+ "artifact": "druid-datasketches",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.query.aggregation.datasketches.tuple.ArrayOfDoublesSketchModule",
+ "artifact": "druid-datasketches",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.query.aggregation.datasketches.hll.HllSketchModule",
+ "artifact": "druid-datasketches",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.query.aggregation.datasketches.kll.KllSketchModule",
+ "artifact": "druid-datasketches",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.msq.guice.MSQExternalDataSourceModule",
+ "artifact": "druid-multi-stage-query",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.msq.guice.MSQIndexingModule",
+ "artifact": "druid-multi-stage-query",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.msq.guice.MSQDurableStorageModule",
+ "artifact": "druid-multi-stage-query",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.msq.guice.MSQServiceClientModule",
+ "artifact": "druid-multi-stage-query",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.msq.guice.MSQSqlModule",
+ "artifact": "druid-multi-stage-query",
+ "version": "26.0.0"
+ },
+ {
+ "name": "org.apache.druid.msq.guice.SqlTaskModule",
+ "artifact": "druid-multi-stage-query",
+ "version": "26.0.0"
+ }
+ ],
+ "memory": {
+ "maxMemory": 268435456,
+ "totalMemory": 268435456,
+ "freeMemory": 139060688,
+ "usedMemory": 129374768,
+ "directMemory": 134217728
+ }
+ }
+ ```
+
+
+### Get service health
+
+Retrieves the online status of the individual Druid service. It is a simple health check to determine if the service is running and accessible. If online, it will always return a boolean `true` value, indicating that the service can receive API calls. This endpoint is suitable for automated health checks.
+
+Modify the host and port for the endpoint to match the service to query. Refer to the [default service ports](#common) for the port numbers.
+
+Additional checks for readiness should use the [Historical segment readiness](#get-segment-readiness) and [Broker query readiness](#get-broker-query-readiness) endpoints.
+
+#### URL
+
+`GET` `/status/health`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved service health*
+
+
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/status/health"
+```
+
+
+
+
+
+```http
+GET /status/health HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ true
+ ```
+
+
+
+
+### Get configuration properties
+
+Retrieves the current configuration properties of the individual service queried.
+
+Modify the host and port for the endpoint to match the service to query. Refer to the [default service ports](#common) for the port numbers.
+
+#### URL
+
+`GET` `/status/properties`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved service configuration properties*
+
+
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/status/properties"
+```
+
+
+
+
+
+```http
+GET /status/properties HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "gopherProxySet": "false",
+ "awt.toolkit": "sun.lwawt.macosx.LWCToolkit",
+ "druid.monitoring.monitors": "[\"org.apache.druid.java.util.metrics.JvmMonitor\"]",
+ "java.specification.version": "11",
+ "sun.cpu.isalist": "",
+ "druid.plaintextPort": "8888",
+ "sun.jnu.encoding": "UTF-8",
+ "druid.indexing.doubleStorage": "double",
+ "druid.metadata.storage.connector.port": "1527",
+ "java.class.path": "/Users/genericUserPath",
+ "log4j.shutdownHookEnabled": "true",
+ "java.vm.vendor": "Homebrew",
+ "sun.arch.data.model": "64",
+ "druid.extensions.loadList": "[\"druid-hdfs-storage\", \"druid-kafka-indexing-service\", \"druid-datasketches\", \"druid-multi-stage-query\"]",
+ "java.vendor.url": "https://github.com/Homebrew/homebrew-core/issues",
+ "druid.router.coordinatorServiceName": "druid/coordinator",
+ "user.timezone": "UTC",
+ "druid.global.http.eagerInitialization": "false",
+ "os.name": "Mac OS X",
+ "java.vm.specification.version": "11",
+ "sun.java.launcher": "SUN_STANDARD",
+ "user.country": "US",
+ "sun.boot.library.path": "/opt/homebrew/Cellar/openjdk@11/11.0.19/libexec/openjdk.jdk/Contents/Home/lib",
+ "sun.java.command": "org.apache.druid.cli.Main server router",
+ "http.nonProxyHosts": "local|*.local|169.254/16|*.169.254/16",
+ "jdk.debug": "release",
+ "druid.metadata.storage.connector.host": "localhost",
+ "sun.cpu.endian": "little",
+ "druid.zk.paths.base": "/druid",
+ "user.home": "/Users/genericUser",
+ "user.language": "en",
+ "java.specification.vendor": "Oracle Corporation",
+ "java.version.date": "2023-04-18",
+ "java.home": "/opt/homebrew/Cellar/openjdk@11/11.0.19/libexec/openjdk.jdk/Contents/Home",
+ "druid.service": "druid/router",
+ "druid.selectors.coordinator.serviceName": "druid/coordinator",
+ "druid.metadata.storage.connector.connectURI": "jdbc:derby://localhost:1527/var/druid/metadata.db;create=true",
+ "file.separator": "/",
+ "druid.selectors.indexing.serviceName": "druid/overlord",
+ "java.vm.compressedOopsMode": "Zero based",
+ "druid.metadata.storage.type": "derby",
+ "line.separator": "\n",
+ "druid.log.path": "/Users/genericUserPath",
+ "java.vm.specification.vendor": "Oracle Corporation",
+ "java.specification.name": "Java Platform API Specification",
+ "druid.indexer.logs.directory": "var/druid/indexing-logs",
+ "java.awt.graphicsenv": "sun.awt.CGraphicsEnvironment",
+ "druid.router.defaultBrokerServiceName": "druid/broker",
+ "druid.storage.storageDirectory": "var/druid/segments",
+ "sun.management.compiler": "HotSpot 64-Bit Tiered Compilers",
+ "ftp.nonProxyHosts": "local|*.local|169.254/16|*.169.254/16",
+ "java.runtime.version": "11.0.19+0",
+ "user.name": "genericUser",
+ "druid.indexer.logs.type": "file",
+ "druid.host": "localhost",
+ "log4j2.is.webapp": "false",
+ "path.separator": ":",
+ "os.version": "12.6.5",
+ "druid.lookup.enableLookupSyncOnStartup": "false",
+ "java.runtime.name": "OpenJDK Runtime Environment",
+ "druid.zk.service.host": "localhost",
+ "file.encoding": "UTF-8",
+ "druid.sql.planner.useGroupingSetForExactDistinct": "true",
+ "druid.router.managementProxy.enabled": "true",
+ "java.vm.name": "OpenJDK 64-Bit Server VM",
+ "java.vendor.version": "Homebrew",
+ "druid.startup.logging.logProperties": "true",
+ "java.vendor.url.bug": "https://github.com/Homebrew/homebrew-core/issues",
+ "log4j.shutdownCallbackRegistry": "org.apache.druid.common.config.Log4jShutdown",
+ "java.io.tmpdir": "var/tmp",
+ "druid.sql.enable": "true",
+ "druid.emitter.logging.logLevel": "info",
+ "java.version": "11.0.19",
+ "user.dir": "/Users/genericUser/Downloads/apache-druid-26.0.0",
+ "os.arch": "aarch64",
+ "java.vm.specification.name": "Java Virtual Machine Specification",
+ "druid.node.type": "router",
+ "java.awt.printerjob": "sun.lwawt.macosx.CPrinterJob",
+ "sun.os.patch.level": "unknown",
+ "java.util.logging.manager": "org.apache.logging.log4j.jul.LogManager",
+ "java.library.path": "/Users/genericUserPath",
+ "java.vendor": "Homebrew",
+ "java.vm.info": "mixed mode",
+ "java.vm.version": "11.0.19+0",
+ "druid.emitter": "noop",
+ "sun.io.unicode.encoding": "UnicodeBig",
+ "druid.storage.type": "local",
+ "java.class.version": "55.0",
+ "socksNonProxyHosts": "local|*.local|169.254/16|*.169.254/16",
+ "druid.server.hiddenProperties": "[\"druid.s3.accessKey\",\"druid.s3.secretKey\",\"druid.metadata.storage.connector.password\", \"password\", \"key\", \"token\", \"pwd\"]"
+}
+```
+
+
+
+### Get node discovery status and cluster integration confirmation
+
+Retrieves a JSON map of the form `{"selfDiscovered": true/false}`, indicating whether the node has received a confirmation from the central node discovery mechanism (currently ZooKeeper) of the Druid cluster that the node has been added to the cluster.
+
+Only consider a Druid node "healthy" or "ready" in automated deployment/container management systems when this endpoint returns `{"selfDiscovered": true}`. Nodes experiencing network issues may become isolated and should not be considered healthy.
+For nodes that use ZooKeeper segment discovery, a response of `{"selfDiscovered": true}` indicates that the node's ZooKeeper client has started receiving data from the ZooKeeper cluster, enabling timely discovery of segments and other nodes.
+
+#### URL
+
+`GET` `/status/selfDiscovered/status`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Node was successfully added to the cluster*
+
+
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/status/selfDiscovered/status"
+```
+
+
+
+
+
+```http
+GET /status/selfDiscovered/status HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "selfDiscovered": true
+ }
+ ```
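+
+In automated tooling, a startup or readiness script can poll this endpoint until the node reports `{"selfDiscovered": true}`. The following is a minimal sketch that assumes `curl` and `grep` are available and tolerates optional whitespace in the response; a JSON-aware tool such as `jq` is more robust:
+
+```shell
+# Poll until the node confirms it has joined the cluster.
+until curl --silent "http://ROUTER_IP:ROUTER_PORT/status/selfDiscovered/status" \
+  | grep -q '"selfDiscovered" *: *true'; do
+  echo "Waiting for the node to join the cluster..."
+  sleep 5
+done
+echo "Node is self-discovered."
+```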
+
+
+
+
+### Get node self-discovery status
+
+Returns an HTTP status code to indicate node discovery within the Druid cluster. This endpoint is similar to the `status/selfDiscovered/status` endpoint, but relies on HTTP status codes alone.
+Use this endpoint for monitoring checks that can't examine the response body, such as AWS load balancer health checks.
+
+#### URL
+
+`GET` `/status/selfDiscovered`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved node status*
+
+
+
+
+
+
+
+*Unsuccessful node self-discovery*
+
+
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/status/selfDiscovered"
+```
+
+
+
+
+
+```http
+GET /status/selfDiscovered HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+A successful response to this endpoint results in an empty response body.
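+
+Because this endpoint communicates through the HTTP status code alone, a health check only needs to test for a successful response. The following is a minimal sketch that uses the `--fail` flag so that `curl` exits nonzero on a non-2xx status:
+
+```shell
+# Exit 0 if the node has self-discovered, nonzero otherwise.
+if curl --silent --fail --output /dev/null "http://ROUTER_IP:ROUTER_PORT/status/selfDiscovered"; then
+  echo "Node has joined the cluster."
+else
+  echo "Node has not yet self-discovered." >&2
+  exit 1
+fi
+```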
+
+## Coordinator
+
+### Get Coordinator leader address
+
+Retrieves the address of the current leader Coordinator of the cluster. If any request is sent to a non-leader Coordinator, the request is automatically redirected to the leader Coordinator.
+
+#### URL
+
+`GET` `/druid/coordinator/v1/leader`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved leader Coordinator address*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/leader"
+```
+
+
+
+
+
+```http
+GET /druid/coordinator/v1/leader HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ http://localhost:8081
+ ```
+
+
+
+### Get Coordinator leader status
+
+Retrieves a JSON object with a `leader` key. Returns `true` if this server is the current leader Coordinator of the cluster. To get the individual address of the leader Coordinator node, see the [leader endpoint](#get-coordinator-leader-address).
+
+Use this endpoint as a load balancer status check when you only want the active leader to be considered in-service at the load balancer.
+
+#### URL
+
+`GET` `/druid/coordinator/v1/isLeader`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Current server is the leader*
+
+
+
+
+
+
+
+*Current server is not the leader*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://COORDINATOR_IP:COORDINATOR_PORT/druid/coordinator/v1/isLeader"
+```
+
+
+
+
+
+```http
+GET /druid/coordinator/v1/isLeader HTTP/1.1
+Host: http://COORDINATOR_IP:COORDINATOR_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "leader": true
+ }
+ ```
+
+
+
+
+### Get Historical cloning status
+
+Retrieves the current status of Historical cloning from the Coordinator.
+
+#### URL
+
+`GET` `/druid/coordinator/v1/config/cloneStatus`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved cloning status*
+
+
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://COORDINATOR_IP:COORDINATOR_PORT/druid/coordinator/v1/config/cloneStatus"
+```
+
+
+
+
+
+```http
+GET /druid/coordinator/v1/config/cloneStatus HTTP/1.1
+Host: http://COORDINATOR_IP:COORDINATOR_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "cloneStatus": [
+ {
+ "sourceServer": "localhost:8089",
+ "targetServer": "localhost:8083",
+ "state": "IN_PROGRESS",
+ "segmentLoadsRemaining": 0,
+ "segmentDropsRemaining": 0,
+ "bytesToLoad": 0
+ }
+ ]
+}
+```
+
+
+
+### Get Broker dynamic configuration view
+
+Retrieves the list of Brokers that have an up-to-date view of the Coordinator dynamic configuration.
+
+#### URL
+
+`GET` `/druid/coordinator/v1/config/syncedBrokers`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved Broker Configuration view*
+
+
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://COORDINATOR_IP:COORDINATOR_PORT/druid/coordinator/v1/config/syncedBrokers"
+```
+
+
+
+
+
+```http
+GET /druid/coordinator/v1/config/syncedBrokers HTTP/1.1
+Host: http://COORDINATOR_IP:COORDINATOR_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+{
+ "syncedBrokers": [
+ {
+ "host": "localhost",
+ "port": 8082,
+ "lastSyncTimestampMillis": 1745756337472
+ }
+ ]
+}
+```
+
+
+
+## Overlord
+
+### Get Overlord leader address
+
+Retrieves the address of the current leader Overlord of the cluster. In a cluster of multiple Overlords, only one Overlord assumes the leading role, while the remaining Overlords remain on standby.
+
+#### URL
+
+`GET` `/druid/indexer/v1/leader`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved leader Overlord address*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/leader"
+```
+
+
+
+
+
+```http
+GET /druid/indexer/v1/leader HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ http://localhost:8081
+ ```
+
+
+
+
+### Get Overlord leader status
+
+Retrieves a JSON object with a `leader` property. The value can be `true` or `false`, indicating if this server is the current leader Overlord of the cluster. To get the individual address of the leader Overlord node, see the [leader endpoint](#get-overlord-leader-address).
+
+Use this endpoint as a load balancer status check when you only want the active leader to be considered in-service at the load balancer.
+
+#### URL
+
+`GET` `/druid/indexer/v1/isLeader`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Current server is the leader*
+
+
+
+
+
+
+
+*Current server is not the leader*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://OVERLORD_IP:OVERLORD_PORT/druid/indexer/v1/isLeader"
+```
+
+
+
+
+
+```http
+GET /druid/indexer/v1/isLeader HTTP/1.1
+Host: http://OVERLORD_IP:OVERLORD_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "leader": true
+ }
+ ```
+
+
+
+
+## Middle Manager
+
+### Get Middle Manager state status
+
+Retrieves the enabled state of the Middle Manager process. Returns a JSON object keyed by the combined `druid.host` and `druid.port` with a boolean `true` or `false` state as the value.
+
+#### URL
+
+`GET` `/druid/worker/v1/enabled`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved Middle Manager state*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT/druid/worker/v1/enabled"
+```
+
+
+
+
+
+```http
+GET /druid/worker/v1/enabled HTTP/1.1
+Host: http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "localhost:8091": true
+ }
+ ```
+
+
+
+### Get active tasks
+
+Retrieves a list of active tasks being run on the Middle Manager. Returns a JSON list of task ID strings. Note that for normal usage, you should use the `/druid/indexer/v1/tasks` [Tasks API](./tasks-api.md) endpoint or one of the task-state-specific variants instead.
+
+#### URL
+
+`GET` `/druid/worker/v1/tasks`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved active tasks*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT/druid/worker/v1/tasks"
+```
+
+
+
+
+
+```http
+GET /druid/worker/v1/tasks HTTP/1.1
+Host: http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ [
+ "index_parallel_wikipedia_mgchefio_2023-06-13T22:18:05.360Z"
+ ]
+ ```
+
+
+
+### Get task log
+
+Retrieves task log output stream by task ID. For normal usage, you should use the `/druid/indexer/v1/task/{taskId}/log`
+[Tasks API](./tasks-api.md) endpoint instead.
+
+#### URL
+
+`GET` `/druid/worker/v1/task/{taskId}/log`
+
+### Shut down running task
+
+Shuts down a running task by ID. For normal usage, you should use the `/druid/indexer/v1/task/{taskId}/shutdown`
+[Tasks API](./tasks-api.md) endpoint instead.
+
+#### URL
+
+`POST` `/druid/worker/v1/task/{taskId}/shutdown`
+
+#### Responses
+
+
+
+
+
+
+
+*Successfully shut down a task*
+
+
+
+
+---
+
+#### Sample request
+
+The following example shuts down a task with specified ID `index_kafka_wikiticker_f7011f8ffba384b_fpeclode`.
+
+
+
+
+
+
+```shell
+curl "http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT/druid/worker/v1/task/index_kafka_wikiticker_f7011f8ffba384b_fpeclode/shutdown"
+```
+
+
+
+
+
+```http
+POST /druid/worker/v1/task/index_kafka_wikiticker_f7011f8ffba384b_fpeclode/shutdown HTTP/1.1
+Host: http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "task":"index_kafka_wikiticker_f7011f8ffba384b_fpeclode"
+ }
+ ```
+
+
+
+### Disable Middle Manager
+
+Disables a Middle Manager, causing it to stop accepting new tasks while it completes all of its existing tasks. Returns a JSON object
+keyed by the combined `druid.host` and `druid.port`.
+
+#### URL
+
+`POST` `/druid/worker/v1/disable`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully disabled Middle Manager*
+
+
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT/druid/worker/v1/disable"
+```
+
+
+
+
+
+```http
+POST /druid/worker/v1/disable HTTP/1.1
+Host: http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "localhost:8091":"disabled"
+ }
+ ```
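+
+A common rolling-restart pattern is to disable a Middle Manager and then wait for its active task list to drain before stopping the process. The following is a minimal sketch that combines this endpoint with the `/druid/worker/v1/tasks` endpoint described above and assumes `jq` is installed:
+
+```shell
+# Disable the Middle Manager, then wait until it reports no active tasks.
+MM="http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT"
+curl --request POST "${MM}/druid/worker/v1/disable"
+until [ "$(curl --silent "${MM}/druid/worker/v1/tasks" | jq 'length')" -eq 0 ]; do
+  echo "Waiting for running tasks to finish..."
+  sleep 30
+done
+echo "Middle Manager is drained and can be stopped."
+```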
+
+
+
+### Enable Middle Manager
+
+Enables a Middle Manager, allowing it to accept new tasks again if it was previously disabled. Returns a JSON object keyed by the combined `druid.host` and `druid.port`.
+
+#### URL
+
+`POST` `/druid/worker/v1/enable`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully enabled Middle Manager*
+
+
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT/druid/worker/v1/enable"
+```
+
+
+
+
+
+```http
+POST /druid/worker/v1/enable HTTP/1.1
+Host: http://MIDDLEMANAGER_IP:MIDDLEMANAGER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "localhost:8091":"enabled"
+ }
+ ```
+
+
+
+## Historical
+
+### Get segment load status
+
+Retrieves a JSON object of the form `{"cacheInitialized":value}`, where value is either `true` or `false`, indicating whether all segments in the local cache have been loaded.
+
+Use this endpoint to know when a Historical service is ready to be queried after a restart.
+
+#### URL
+
+`GET` `/druid/historical/v1/loadstatus`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved status*
+
+
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://HISTORICAL_IP:HISTORICAL_PORT/druid/historical/v1/loadstatus"
+```
+
+
+
+
+
+```http
+GET /druid/historical/v1/loadstatus HTTP/1.1
+Host: http://HISTORICAL_IP:HISTORICAL_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "cacheInitialized": true
+ }
+ ```
+
+
+
+### Get segment readiness
+
+Retrieves a status code to indicate if all segments in the local cache have been loaded. Similar to `/druid/historical/v1/loadstatus`, but instead of returning JSON with a flag, it returns status codes.
+
+#### URL
+
+`GET` `/druid/historical/v1/readiness`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Segments in local cache successfully loaded*
+
+
+
+
+
+
+
+*Segments in local cache have not been loaded*
+
+
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://HISTORICAL_IP:HISTORICAL_PORT/druid/historical/v1/readiness"
+```
+
+
+
+
+
+```http
+GET /druid/historical/v1/readiness HTTP/1.1
+Host: http://HISTORICAL_IP:HISTORICAL_PORT
+```
+
+
+
+
+#### Sample response
+
+A successful response to this endpoint results in an empty response body.
+
+## Load Status
+
+### Get Broker query load status
+
+Retrieves a flag indicating if the Broker knows about all segments in the cluster. Use this endpoint to know when a Broker service is ready to accept queries after a restart.
+
+#### URL
+
+`GET` `/druid/broker/v1/loadstatus`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Segments successfully loaded*
+
+
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://BROKER_IP:BROKER_PORT/druid/broker/v1/loadstatus"
+```
+
+
+
+
+
+```http
+GET /druid/broker/v1/loadstatus HTTP/1.1
+Host: http://BROKER_IP:BROKER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "inventoryInitialized": true
+ }
+ ```
+
+
+
+### Get Broker query readiness
+
+Retrieves a status code to indicate Broker readiness. Readiness signifies the Broker knows about all segments in the cluster and is ready to accept queries after a restart. Similar to `/druid/broker/v1/loadstatus`, but instead of returning a JSON object, it returns status codes.
+
+#### URL
+
+`GET` `/druid/broker/v1/readiness`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Segments successfully loaded*
+
+
+
+
+
+
+
+*Segments have not been loaded*
+
+
+
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://BROKER_IP:BROKER_PORT/druid/broker/v1/readiness"
+```
+
+
+
+
+
+```http
+GET /druid/broker/v1/readiness HTTP/1.1
+Host: http://BROKER_IP:BROKER_PORT
+```
+
+
+
+
+#### Sample response
+
+A successful response to this endpoint results in an empty response body.
diff --git a/docs/35.0.0/api-reference/sql-api.md b/docs/35.0.0/api-reference/sql-api.md
new file mode 100644
index 0000000000..af60cee4c8
--- /dev/null
+++ b/docs/35.0.0/api-reference/sql-api.md
@@ -0,0 +1,1727 @@
+---
+id: sql-api
+title: Druid SQL API
+sidebar_label: Druid SQL
+---
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+
+:::info
+ Apache Druid supports two query languages: Druid SQL and [native queries](../querying/querying.md).
+ This document describes the SQL language.
+:::
+
+In this topic, `http://ROUTER_IP:ROUTER_PORT` is a placeholder for your Router service address and port. Replace it with the information for your deployment. For example, use `http://localhost:8888` for quickstart deployments.
+
+## Query from Historicals
+
+### Submit a query
+
+Submits a SQL-based query in the JSON or text format request body.
+Returns a JSON object with the query results and optional metadata for the results. You can also use this endpoint to query [metadata tables](../querying/sql-metadata-tables.md).
+
+Each query has an associated SQL query ID. You can set this ID manually using the SQL context parameter `sqlQueryId`. If not set, Druid automatically generates `sqlQueryId` and returns it in the response header for `X-Druid-SQL-Query-Id`. Note that you need the `sqlQueryId` to [cancel a query](#cancel-a-query).
+
+#### URL
+
+`POST` `/druid/v2/sql`
+
+#### JSON Format Request body
+
+To send queries in JSON format, the `Content-Type` in the HTTP request MUST be `application/json`.
+If there are multiple `Content-Type` headers, the **first** one is used.
+
+The request body takes the following properties:
+
+* `query`: SQL query string. HTTP requests can include multiple `SET` statements to assign [SQL query context parameter](../querying/sql-query-context.md) values that apply to the query statement. See [SET](../querying/sql.md#set) for details. Context parameters set by `SET` statements take priority over values set in `context`.
+* `resultFormat`: String that indicates the format to return query results. Select one of the following formats:
+ * `object`: Returns a JSON array of JSON objects with the HTTP response header `Content-Type: application/json`.
+ Object field names match the columns returned by the SQL query in the same order as the SQL query.
+
+ * `array`: Returns a JSON array of JSON arrays with the HTTP response header `Content-Type: application/json`.
+ Each inner array has elements matching the columns returned by the SQL query, in order.
+
+ * `objectLines`: Returns newline-delimited JSON objects with the HTTP response header `Content-Type: text/plain`.
+ Newline separation facilitates parsing the entire response set as a stream if you don't have a streaming JSON parser.
+ This format includes a single trailing newline character so you can detect a truncated response.
+
+ * `arrayLines`: Returns newline-delimited JSON arrays with the HTTP response header `Content-Type: text/plain`.
+ Newline separation facilitates parsing the entire response set as a stream if you don't have a streaming JSON parser.
+ This format includes a single trailing newline character so you can detect a truncated response.
+
+ * `csv`: Returns comma-separated values with one row per line. Sent with the HTTP response header `Content-Type: text/csv`.
+ Druid uses double quotes to escape individual field values. For example, a value with a comma returns `"A,B"`.
+ If the field value contains a double quote character, Druid escapes it with a second double quote character.
+ For example, `foo"bar` becomes `foo""bar`.
+ This format includes a single trailing newline character so you can detect a truncated response.
+
+* `header`: Boolean value that determines whether to return information on column names. When set to `true`, Druid returns the column names as the first row of the results. To also get information on the column types, set `typesHeader` or `sqlTypesHeader` to `true`. For a comparative overview of data formats and configurations for the header, see the [Query output format](#query-output-format) table.
+
+* `typesHeader`: Adds Druid runtime type information in the header. Requires `header` to be set to `true`. Complex types, like sketches, are reported as `COMPLEX<typeName>` if a particular complex type name is known for that field, or as `COMPLEX` if the particular type name is unknown or mixed.
+
+* `sqlTypesHeader`: Adds SQL type information in the header. Requires `header` to be set to `true`.
+
+ For compatibility, Druid returns the HTTP header `X-Druid-SQL-Header-Included: yes` when all of the following conditions are met:
+ * The `header` property is set to true.
+ * The version of Druid supports `typesHeader` and `sqlTypesHeader`, regardless of whether either property is set.
+
+* `context`: JSON object containing optional [SQL query context parameters](../querying/sql-query-context.md), such as to set the query ID, time zone, and whether to use an approximation algorithm for distinct count. You can also set the context through the SQL SET command. For more information, see [Druid SQL overview](../querying/sql.md#set).
+
+* `parameters`: List of query parameters for parameterized queries. Each parameter in the array should be a JSON object containing the parameter's SQL data type and parameter value. For more information on using dynamic parameters, see [Dynamic parameters](../querying/sql.md#dynamic-parameters). For a list of supported SQL types, see [Data types](../querying/sql-data-types.md).
+
+ For example:
+
+ ```json
+ {
+ "query": "SELECT \"arrayDouble\", \"stringColumn\" FROM \"array_example\" WHERE ARRAY_CONTAINS(\"arrayDouble\", ?) AND \"stringColumn\" = ?",
+ "parameters": [
+ {"type": "ARRAY", "value": [999.0, null, 5.5]},
+ {"type": "VARCHAR", "value": "bar"}
+ ]
+ }
+ ```
+
+##### Text Format Request body
+
+Druid also allows you to submit SQL queries in text format, which is simpler than the JSON format described above.
+To do this, set the `Content-Type` request header to `text/plain` or `application/x-www-form-urlencoded`, and pass the SQL query in the HTTP request body.
+
+If you use `application/x-www-form-urlencoded`, make sure the SQL query is URL-encoded.
+
+If there are multiple `Content-Type` headers, the **first** one is used.
+
+For text format requests, the `resultFormat` is always `object` and the HTTP response header is `Content-Type: application/json`.
+If you want more control over the query context or response format, use the JSON format request body described above instead.
+
+The following example demonstrates how to submit a SQL query in text format:
+
+```commandline
+echo 'SELECT 1' | curl -H 'Content-Type: text/plain' http://ROUTER_IP:ROUTER_PORT/druid/v2/sql --data @-
+```
+
+You can also use `application/x-www-form-urlencoded` to submit URL-encoded SQL queries, as shown in the following examples:
+
+```commandline
+echo 'SELECT%20%31' | curl http://ROUTER_IP:ROUTER_PORT/druid/v2/sql --data @-
+echo 'SELECT 1' | curl http://ROUTER_IP:ROUTER_PORT/druid/v2/sql --data-urlencode @-
+```
+
+If you don't set a `Content-Type` header, `curl` uses `application/x-www-form-urlencoded` by default.
+
+The first example passes the already URL-encoded query `SELECT%20%31`, which decodes to `SELECT 1`, and `curl` sends it to the server as is.
+The second example passes the raw query `SELECT 1`, and `curl` URL-encodes it to `SELECT%20%31` because of the `--data-urlencode` option before sending it to the server.
+
+#### Responses
+
+
+
+
+
+
+*Successfully submitted query*
+
+
+
+
+
+*Error thrown due to bad query. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error.",
+ "errorClass": "Class of exception that caused this error.",
+ "host": "The host on which the error occurred."
+}
+```
+
+
+
+
+*Request not sent due to unexpected conditions. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error.",
+ "errorClass": "Class of exception that caused this error.",
+ "host": "The host on which the error occurred."
+}
+```
+
+
+
+
+#### Client-side error handling and truncated responses
+
+Druid reports errors that occur before the response body is sent as JSON with an HTTP 500 status code. The errors are reported using the same format as [native Druid query errors](../querying/querying.md#query-errors).
+If an error occurs while Druid is sending the response body, the server handling the request stops the response midstream and logs an error.
+
+This means that when you call the SQL API, you must properly handle response truncation.
+For `object` and `array` formats, truncated responses are invalid JSON.
+For line-oriented formats, Druid includes a newline character as the final character of every complete response. Absence of a final newline character indicates a truncated response.
+
+If you detect a truncated response, treat it as an error.
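+
+For line-oriented formats, one way to detect truncation is to check that the response ends with a newline. The following is a minimal sketch; it assumes the `arrayLines` result format and writes the response to a local file named `results.txt`:
+
+```shell
+# Fetch results in a line-oriented format and verify that the response
+# ends with a newline, indicating it was not truncated mid-stream.
+curl --silent "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql" \
+--header 'Content-Type: application/json' \
+--data '{"query": "SELECT 1", "resultFormat": "arrayLines"}' \
+--output results.txt
+
+if [ -n "$(tail -c 1 results.txt)" ]; then
+  echo "Response appears truncated: no trailing newline." >&2
+  exit 1
+fi
+```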
+
+---
+
+#### Sample request
+
+In the following example, this query demonstrates the following actions:
+- Retrieves all rows from the `wikipedia` datasource.
+- Filters the results where the `user` value is `BlueMoon2662`.
+- Applies the `sqlTimeZone` context parameter to set the time zone of results to `America/Los_Angeles`.
+- Returns descriptors for `header`, `typesHeader`, and `sqlTypesHeader`.
+
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql" \
+--header 'Content-Type: application/json' \
+--data '{
+ "query": "SELECT * FROM wikipedia WHERE user='\''BlueMoon2662'\''",
+ "context" : {"sqlTimeZone" : "America/Los_Angeles"},
+ "header" : true,
+ "typesHeader" : true,
+ "sqlTypesHeader" : true
+}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/v2/sql HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 201
+
+{
+ "query": "SELECT * FROM wikipedia WHERE user='BlueMoon2662'",
+ "context" : {"sqlTimeZone" : "America/Los_Angeles"},
+ "header" : true,
+ "typesHeader" : true,
+ "sqlTypesHeader" : true
+}
+```
+
+
+
+
+You can also specify query-level context parameters directly within the SQL query string using the `SET` command. For more details, see [SET](../querying/sql.md#set).
+
+The following request body is functionally equivalent to the previous example and uses SET instead of the `context` parameter:
+
+```JSON
+{
+ "query": "SET sqlTimeZone='America/Los_Angeles'; SELECT * FROM wikipedia WHERE user='BlueMoon2662'",
+ "header": true,
+ "typesHeader": true,
+ "sqlTypesHeader": true
+}
+```
+
+
+#### Sample response
+
+
+ View the response
+
+```json
+[
+ {
+ "__time": {
+ "type": "LONG",
+ "sqlType": "TIMESTAMP"
+ },
+ "channel": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "cityName": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "comment": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "countryIsoCode": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "countryName": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "isAnonymous": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "isMinor": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "isNew": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "isRobot": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "isUnpatrolled": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "metroCode": {
+ "type": "LONG",
+ "sqlType": "BIGINT"
+ },
+ "namespace": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "page": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "regionIsoCode": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "regionName": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "user": {
+ "type": "STRING",
+ "sqlType": "VARCHAR"
+ },
+ "delta": {
+ "type": "LONG",
+ "sqlType": "BIGINT"
+ },
+ "added": {
+ "type": "LONG",
+ "sqlType": "BIGINT"
+ },
+ "deleted": {
+ "type": "LONG",
+ "sqlType": "BIGINT"
+ }
+ },
+ {
+ "__time": "2015-09-11T17:47:53.259-07:00",
+ "channel": "#ja.wikipedia",
+ "cityName": null,
+ "comment": "/* 対戦通算成績と得失点 */",
+ "countryIsoCode": null,
+ "countryName": null,
+ "isAnonymous": "false",
+ "isMinor": "true",
+ "isNew": "false",
+ "isRobot": "false",
+ "isUnpatrolled": "false",
+ "metroCode": null,
+ "namespace": "Main",
+ "page": "アルビレックス新潟の年度別成績一覧",
+ "regionIsoCode": null,
+ "regionName": null,
+ "user": "BlueMoon2662",
+ "delta": 14,
+ "added": 14,
+ "deleted": 0
+ }
+]
+```
+
+
+### Cancel a query
+
+Cancels a query on the Router or the Broker with the associated `sqlQueryId`. You can set the `sqlQueryId` manually in the query context when you submit the query; otherwise, Druid generates one and returns it in the response header when the query is successfully submitted. Note that Druid does not enforce a unique `sqlQueryId` in the query context. If you've set the same `sqlQueryId` for multiple queries, Druid cancels all requests with that query ID.
+
+When you cancel a query, Druid handles the cancellation in a best-effort manner. Druid immediately marks the query as canceled and aborts the query execution as soon as possible. However, the query may continue running for a short time after you make the cancellation request.
+
+Cancellation requests require READ permission on all resources used in the SQL query.
+
+#### URL
+
+`DELETE` `/druid/v2/sql/{sqlQueryId}`
+
+#### Responses
+
+
+
+
+
+
+*Successfully deleted query*
+
+
+
+
+
+*Authorization failure*
+
+
+
+
+
+*Invalid `sqlQueryId` or query was completed before cancellation request*
+
+
+
+
+---
+
+#### Sample request
+
+The following example cancels a request with the set query ID `request01`.
+
+
+
+
+
+
+```shell
+curl --request DELETE "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql/request01"
+```
+
+
+
+
+
+```HTTP
+DELETE /druid/v2/sql/request01 HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+A successful response results in an `HTTP 202` message code and an empty response body.
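+
+To make a query cancelable by a known ID, set `sqlQueryId` in the query context when you submit it. For example, the canceled query above could have been submitted as follows:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql" \
+--header 'Content-Type: application/json' \
+--data '{
+  "query": "SELECT COUNT(*) FROM wikipedia",
+  "context": {"sqlQueryId": "request01"}
+}'
+```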
+
+### Query output format
+
+The following table shows examples of how Druid returns the column names and data types based on the result format and the type header settings.
+In all cases, `header` is true.
+The examples include the first row of results, where the value of `user` is `BlueMoon2662`.
+
+```
+| Format | typesHeader | sqlTypesHeader | Example output |
+|--------|-------------|----------------|--------------------------------------------------------------------------------------------|
+| object | true | false | [ { "user" : { "type" : "STRING" } }, { "user" : "BlueMoon2662" } ] |
+| object | true | true | [ { "user" : { "type" : "STRING", "sqlType" : "VARCHAR" } }, { "user" : "BlueMoon2662" } ] |
+| object | false | true | [ { "user" : { "sqlType" : "VARCHAR" } }, { "user" : "BlueMoon2662" } ] |
+| object | false | false | [ { "user" : null }, { "user" : "BlueMoon2662" } ] |
+| array | true | false | [ [ "user" ], [ "STRING" ], [ "BlueMoon2662" ] ] |
+| array | true | true | [ [ "user" ], [ "STRING" ], [ "VARCHAR" ], [ "BlueMoon2662" ] ] |
+| array | false | true | [ [ "user" ], [ "VARCHAR" ], [ "BlueMoon2662" ] ] |
+| array | false | false | [ [ "user" ], [ "BlueMoon2662" ] ] |
+| csv | true | false | user STRING BlueMoon2662 |
+| csv | true | true | user STRING VARCHAR BlueMoon2662 |
+| csv | false | true | user VARCHAR BlueMoon2662 |
+| csv | false | false | user BlueMoon2662 |
+```
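+
+For example, the following request is a sketch that reproduces the `array` row with both `typesHeader` and `sqlTypesHeader` set to `true`, selecting only the `user` column from the `wikipedia` datasource used elsewhere in this topic:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql" \
+--header 'Content-Type: application/json' \
+--data '{
+  "query": "SELECT \"user\" FROM wikipedia WHERE \"user\" = '\''BlueMoon2662'\'' LIMIT 1",
+  "resultFormat": "array",
+  "header": true,
+  "typesHeader": true,
+  "sqlTypesHeader": true
+}'
+```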
+
+## Query from deep storage
+
+You can use the `sql/statements` endpoint to query segments that exist only in deep storage and are not loaded onto your Historical processes as determined by your load rules.
+
+Note that at least one segment of a datasource must be available on a Historical process so that the Broker can plan your query. A quick way to check this is to confirm that the datasource is visible in the Druid web console.
+
+
+For more information, see [Query from deep storage](../querying/query-from-deep-storage.md).
+
+### Submit a query
+
+Submit a query for data stored in deep storage. Any data ingested into Druid is placed into deep storage. The query is contained in the `query` field in the JSON object within the request payload.
+
+Note that at least part of a datasource must be available on a Historical process so that Druid can plan your query. Only the user who submits a query can see its results.
+
+#### URL
+
+`POST` `/druid/v2/sql/statements`
+
+#### Request body
+
+Generally, the `sql` and `sql/statements` endpoints support the same response body fields with minor differences. For general information about the available fields, see [Submit a query to the `sql` endpoint](#submit-a-query).
+
+Keep the following in mind when submitting queries to the `sql/statements` endpoint:
+
+- In addition to the context parameters listed in the [SQL-based ingestion reference](../multi-stage-query/reference.md#context-parameters), the `sql/statements` endpoint supports the following context parameters:
+
+ - `executionMode` determines how query results are fetched. Druid currently only supports `ASYNC`. You must manually retrieve your results after the query completes.
+ - `selectDestination` determines where final results get written. By default, results are written to task reports. Set this parameter to `durableStorage` to instruct Druid to write the results from SELECT queries to durable storage, which allows you to fetch larger result sets, as shown in the example after this list. Using `durableStorage` is highly recommended for result sets with more than 3000 rows. Note that this requires you to have [durable storage for MSQ](../operations/durable-storage.md) enabled.
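+
+For example, the following request is a sketch that asks Druid to write the results of a SELECT query to durable storage so that a large result set can be fetched later:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql/statements" \
+--header 'Content-Type: application/json' \
+--data '{
+  "query": "SELECT * FROM wikipedia",
+  "context": {
+    "executionMode": "ASYNC",
+    "selectDestination": "durableStorage"
+  }
+}'
+```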
+
+#### Responses
+
+
+
+
+
+
+*Successfully queried from deep storage*
+
+
+
+
+
+*Error thrown due to bad query. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "Summary of the encountered error.",
+ "errorClass": "Class of exception that caused this error.",
+ "host": "The host on which the error occurred.",
+ "errorCode": "Well-defined error code.",
+ "persona": "Role or persona associated with the error.",
+ "category": "Classification of the error.",
+ "errorMessage": "Summary of the encountered issue with expanded information.",
+ "context": "Additional context about the error."
+}
+```
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql/statements" \
+--header 'Content-Type: application/json' \
+--data '{
+ "query": "SELECT * FROM wikipedia WHERE user='\''BlueMoon2662'\''",
+ "context": {
+ "executionMode":"ASYNC"
+ }
+}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/v2/sql/statements HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 134
+
+{
+ "query": "SELECT * FROM wikipedia WHERE user='BlueMoon2662'",
+ "context": {
+ "executionMode":"ASYNC"
+ }
+}
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "queryId": "query-b82a7049-b94f-41f2-a230-7fef94768745",
+ "state": "ACCEPTED",
+ "createdAt": "2023-07-26T21:16:25.324Z",
+ "schema": [
+ {
+ "name": "__time",
+ "type": "TIMESTAMP",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "channel",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "cityName",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "comment",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "countryIsoCode",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "countryName",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "isAnonymous",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "isMinor",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "isNew",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "isRobot",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "isUnpatrolled",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "metroCode",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "namespace",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "page",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "regionIsoCode",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "regionName",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "user",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "delta",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "added",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "deleted",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ }
+ ],
+ "durationMs": -1
+}
+ ```
+
+
+### Get query status
+
+Retrieves information about the query associated with the given query ID. The response matches the response from the POST API if the query is accepted or running and the execution mode is `ASYNC`. In addition to the fields that this endpoint shares with `POST /sql/statements`, a completed query's status includes the following:
+
+- A `result` object that summarizes information about your results, such as the total number of rows and sample records.
+- A `pages` object that includes the following information for each page of results:
+ - `numRows`: the number of rows in that page of results.
+ - `sizeInBytes`: the size of the page.
+ - `id`: the page number that you can use to reference a specific page when you get query results.
+
+If the optional query parameter `detail` is supplied, then the response also includes the following:
+- A `stages` object that summarizes information about the different stages being used for query execution, such as stage number, phase, start time, duration, input and output information, processing methods, and partitioning.
+- A `counters` object that provides details on the rows, bytes, and files processed at various stages for each worker across different channels, along with sort progress.
+- A `warnings` object that provides details about any warnings.
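+
+Because queries submitted with `executionMode` set to `ASYNC` run asynchronously, a client typically polls this endpoint until the query reaches a terminal state before fetching results. The following is a minimal sketch that assumes `jq` is installed and treats `SUCCESS` and `FAILED` as the terminal states:
+
+```shell
+# Poll the status endpoint until the query finishes.
+QUERY_ID="query-9b93f6f7-ab0e-48f5-986a-3520f84f0804"
+while true; do
+  state=$(curl --silent "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql/statements/${QUERY_ID}" | jq -r '.state')
+  echo "Query state: ${state}"
+  case "${state}" in
+    SUCCESS) break ;;
+    FAILED) echo "Query failed." >&2; exit 1 ;;
+  esac
+  sleep 5
+done
+```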
+
+#### URL
+
+`GET` `/druid/v2/sql/statements/{queryId}`
+
+#### Query parameters
+* `detail` (optional)
+ * Type: Boolean
+ * Default: false
+ * Fetch additional details about the query, which includes the information about different stages, counters for each stage, and any warnings.
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved query status*
+
+
+
+
+
+*Error thrown due to bad query. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "Summary of the encountered error.",
+ "errorCode": "Well-defined error code.",
+ "persona": "Role or persona associated with the error.",
+ "category": "Classification of the error.",
+ "errorMessage": "Summary of the encountered issue with expanded information.",
+ "context": "Additional context about the error."
+}
+```
+
+
+
+
+#### Sample request
+
+The following example retrieves the status of a query with specified ID `query-9b93f6f7-ab0e-48f5-986a-3520f84f0804`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql/statements/query-9b93f6f7-ab0e-48f5-986a-3520f84f0804?detail=true"
+```
+
+
+
+
+
+```HTTP
+GET /druid/v2/sql/statements/query-9b93f6f7-ab0e-48f5-986a-3520f84f0804?detail=true HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "queryId": "query-9b93f6f7-ab0e-48f5-986a-3520f84f0804",
+ "state": "SUCCESS",
+ "createdAt": "2023-07-26T22:57:46.620Z",
+ "schema": [
+ {
+ "name": "__time",
+ "type": "TIMESTAMP",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "channel",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "cityName",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "comment",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "countryIsoCode",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "countryName",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "isAnonymous",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "isMinor",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "isNew",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "isRobot",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "isUnpatrolled",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "metroCode",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "namespace",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "page",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "regionIsoCode",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "regionName",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "user",
+ "type": "VARCHAR",
+ "nativeType": "STRING"
+ },
+ {
+ "name": "delta",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "added",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ },
+ {
+ "name": "deleted",
+ "type": "BIGINT",
+ "nativeType": "LONG"
+ }
+ ],
+ "durationMs": 25591,
+ "result": {
+ "numTotalRows": 1,
+ "totalSizeInBytes": 375,
+ "dataSource": "__query_select",
+ "sampleRecords": [
+ [
+ 1442018873259,
+ "#ja.wikipedia",
+ "",
+ "/* 対戦通算成績と得失点 */",
+ "",
+ "",
+ 0,
+ 1,
+ 0,
+ 0,
+ 0,
+ 0,
+ "Main",
+ "アルビレックス新潟の年度別成績一覧",
+ "",
+ "",
+ "BlueMoon2662",
+ 14,
+ 14,
+ 0
+ ]
+ ],
+ "pages": [
+ {
+ "id": 0,
+ "numRows": 1,
+ "sizeInBytes": 375
+ }
+ ]
+ },
+ "stages": [
+ {
+ "stageNumber": 0,
+ "definition": {
+ "id": "query-9b93f6f7-ab0e-48f5-986a-3520f84f0804_0",
+ "input": [
+ {
+ "type": "table",
+ "dataSource": "wikipedia",
+ "intervals": [
+ "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
+ ],
+ "filter": {
+ "type": "equals",
+ "column": "user",
+ "matchValueType": "STRING",
+ "matchValue": "BlueMoon2662"
+ },
+ "filterFields": [
+ "user"
+ ]
+ }
+ ],
+ "processor": {
+ "type": "scan",
+ "query": {
+ "queryType": "scan",
+ "dataSource": {
+ "type": "inputNumber",
+ "inputNumber": 0
+ },
+ "intervals": {
+ "type": "intervals",
+ "intervals": [
+ "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
+ ]
+ },
+ "virtualColumns": [
+ {
+ "type": "expression",
+ "name": "v0",
+ "expression": "'BlueMoon2662'",
+ "outputType": "STRING"
+ }
+ ],
+ "resultFormat": "compactedList",
+ "limit": 1001,
+ "filter": {
+ "type": "equals",
+ "column": "user",
+ "matchValueType": "STRING",
+ "matchValue": "BlueMoon2662"
+ },
+ "columns": [
+ "__time",
+ "added",
+ "channel",
+ "cityName",
+ "comment",
+ "commentLength",
+ "countryIsoCode",
+ "countryName",
+ "deleted",
+ "delta",
+ "deltaBucket",
+ "diffUrl",
+ "flags",
+ "isAnonymous",
+ "isMinor",
+ "isNew",
+ "isRobot",
+ "isUnpatrolled",
+ "metroCode",
+ "namespace",
+ "page",
+ "regionIsoCode",
+ "regionName",
+ "v0"
+ ],
+ "context": {
+ "__resultFormat": "array",
+ "__user": "allowAll",
+ "executionMode": "async",
+ "finalize": true,
+ "maxNumTasks": 2,
+ "maxParseExceptions": 0,
+ "queryId": "33b53acb-7533-4880-a81b-51c16c489eab",
+ "scanSignature": "[{\"name\":\"__time\",\"type\":\"LONG\"},{\"name\":\"added\",\"type\":\"LONG\"},{\"name\":\"channel\",\"type\":\"STRING\"},{\"name\":\"cityName\",\"type\":\"STRING\"},{\"name\":\"comment\",\"type\":\"STRING\"},{\"name\":\"commentLength\",\"type\":\"LONG\"},{\"name\":\"countryIsoCode\",\"type\":\"STRING\"},{\"name\":\"countryName\",\"type\":\"STRING\"},{\"name\":\"deleted\",\"type\":\"LONG\"},{\"name\":\"delta\",\"type\":\"LONG\"},{\"name\":\"deltaBucket\",\"type\":\"LONG\"},{\"name\":\"diffUrl\",\"type\":\"STRING\"},{\"name\":\"flags\",\"type\":\"STRING\"},{\"name\":\"isAnonymous\",\"type\":\"STRING\"},{\"name\":\"isMinor\",\"type\":\"STRING\"},{\"name\":\"isNew\",\"type\":\"STRING\"},{\"name\":\"isRobot\",\"type\":\"STRING\"},{\"name\":\"isUnpatrolled\",\"type\":\"STRING\"},{\"name\":\"metroCode\",\"type\":\"STRING\"},{\"name\":\"namespace\",\"type\":\"STRING\"},{\"name\":\"page\",\"type\":\"STRING\"},{\"name\":\"regionIsoCode\",\"type\":\"STRING\"},{\"name\":\"regionName\",\"type\":\"STRING\"},{\"name\":\"v0\",\"type\":\"STRING\"}]",
+ "sqlOuterLimit": 1001,
+ "sqlQueryId": "33b53acb-7533-4880-a81b-51c16c489eab",
+ "sqlStringifyArrays": false
+ },
+ "columnTypes": [
+ "LONG",
+ "LONG",
+ "STRING",
+ "STRING",
+ "STRING",
+ "LONG",
+ "STRING",
+ "STRING",
+ "LONG",
+ "LONG",
+ "LONG",
+ "STRING",
+ "STRING",
+ "STRING",
+ "STRING",
+ "STRING",
+ "STRING",
+ "STRING",
+ "STRING",
+ "STRING",
+ "STRING",
+ "STRING",
+ "STRING",
+ "STRING"
+ ],
+ "granularity": {
+ "type": "all"
+ },
+ "legacy": false
+ }
+ },
+ "signature": [
+ {
+ "name": "__boost",
+ "type": "LONG"
+ },
+ {
+ "name": "__time",
+ "type": "LONG"
+ },
+ {
+ "name": "added",
+ "type": "LONG"
+ },
+ {
+ "name": "channel",
+ "type": "STRING"
+ },
+ {
+ "name": "cityName",
+ "type": "STRING"
+ },
+ {
+ "name": "comment",
+ "type": "STRING"
+ },
+ {
+ "name": "commentLength",
+ "type": "LONG"
+ },
+ {
+ "name": "countryIsoCode",
+ "type": "STRING"
+ },
+ {
+ "name": "countryName",
+ "type": "STRING"
+ },
+ {
+ "name": "deleted",
+ "type": "LONG"
+ },
+ {
+ "name": "delta",
+ "type": "LONG"
+ },
+ {
+ "name": "deltaBucket",
+ "type": "LONG"
+ },
+ {
+ "name": "diffUrl",
+ "type": "STRING"
+ },
+ {
+ "name": "flags",
+ "type": "STRING"
+ },
+ {
+ "name": "isAnonymous",
+ "type": "STRING"
+ },
+ {
+ "name": "isMinor",
+ "type": "STRING"
+ },
+ {
+ "name": "isNew",
+ "type": "STRING"
+ },
+ {
+ "name": "isRobot",
+ "type": "STRING"
+ },
+ {
+ "name": "isUnpatrolled",
+ "type": "STRING"
+ },
+ {
+ "name": "metroCode",
+ "type": "STRING"
+ },
+ {
+ "name": "namespace",
+ "type": "STRING"
+ },
+ {
+ "name": "page",
+ "type": "STRING"
+ },
+ {
+ "name": "regionIsoCode",
+ "type": "STRING"
+ },
+ {
+ "name": "regionName",
+ "type": "STRING"
+ },
+ {
+ "name": "v0",
+ "type": "STRING"
+ }
+ ],
+ "shuffleSpec": {
+ "type": "mix"
+ },
+ "maxWorkerCount": 1
+ },
+ "phase": "FINISHED",
+ "workerCount": 1,
+ "partitionCount": 1,
+ "shuffle": "mix",
+ "output": "localStorage",
+ "startTime": "2024-07-31T15:20:21.255Z",
+ "duration": 103
+ },
+ {
+ "stageNumber": 1,
+ "definition": {
+ "id": "query-9b93f6f7-ab0e-48f5-986a-3520f84f0804_1",
+ "input": [
+ {
+ "type": "stage",
+ "stage": 0
+ }
+ ],
+ "processor": {
+ "type": "limit",
+ "limit": 1001
+ },
+ "signature": [
+ {
+ "name": "__boost",
+ "type": "LONG"
+ },
+ {
+ "name": "__time",
+ "type": "LONG"
+ },
+ {
+ "name": "added",
+ "type": "LONG"
+ },
+ {
+ "name": "channel",
+ "type": "STRING"
+ },
+ {
+ "name": "cityName",
+ "type": "STRING"
+ },
+ {
+ "name": "comment",
+ "type": "STRING"
+ },
+ {
+ "name": "commentLength",
+ "type": "LONG"
+ },
+ {
+ "name": "countryIsoCode",
+ "type": "STRING"
+ },
+ {
+ "name": "countryName",
+ "type": "STRING"
+ },
+ {
+ "name": "deleted",
+ "type": "LONG"
+ },
+ {
+ "name": "delta",
+ "type": "LONG"
+ },
+ {
+ "name": "deltaBucket",
+ "type": "LONG"
+ },
+ {
+ "name": "diffUrl",
+ "type": "STRING"
+ },
+ {
+ "name": "flags",
+ "type": "STRING"
+ },
+ {
+ "name": "isAnonymous",
+ "type": "STRING"
+ },
+ {
+ "name": "isMinor",
+ "type": "STRING"
+ },
+ {
+ "name": "isNew",
+ "type": "STRING"
+ },
+ {
+ "name": "isRobot",
+ "type": "STRING"
+ },
+ {
+ "name": "isUnpatrolled",
+ "type": "STRING"
+ },
+ {
+ "name": "metroCode",
+ "type": "STRING"
+ },
+ {
+ "name": "namespace",
+ "type": "STRING"
+ },
+ {
+ "name": "page",
+ "type": "STRING"
+ },
+ {
+ "name": "regionIsoCode",
+ "type": "STRING"
+ },
+ {
+ "name": "regionName",
+ "type": "STRING"
+ },
+ {
+ "name": "v0",
+ "type": "STRING"
+ }
+ ],
+ "shuffleSpec": {
+ "type": "maxCount",
+ "clusterBy": {
+ "columns": [
+ {
+ "columnName": "__boost",
+ "order": "ASCENDING"
+ }
+ ]
+ },
+ "partitions": 1
+ },
+ "maxWorkerCount": 1
+ },
+ "phase": "FINISHED",
+ "workerCount": 1,
+ "partitionCount": 1,
+ "shuffle": "globalSort",
+ "output": "localStorage",
+ "startTime": "2024-07-31T15:20:21.355Z",
+ "duration": 10,
+ "sort": true
+ }
+ ],
+ "counters": {
+ "0": {
+ "0": {
+ "input0": {
+ "type": "channel",
+ "rows": [
+ 24433
+ ],
+ "bytes": [
+ 7393933
+ ],
+ "files": [
+ 22
+ ],
+ "totalFiles": [
+ 22
+ ]
+ }
+ }
+ },
+ "1": {
+ "0": {
+ "sortProgress": {
+ "type": "sortProgress",
+ "totalMergingLevels": -1,
+ "levelToTotalBatches": {},
+ "levelToMergedBatches": {},
+ "totalMergersForUltimateLevel": -1,
+ "triviallyComplete": true,
+ "progressDigest": 1
+ }
+ }
+ }
+ },
+ "warnings": []
+}
+ ```
+
+
+
+### Get query results
+
+Retrieves results for completed queries. Results are separated into pages, so you can use the optional `page` parameter to retrieve one page at a time. Druid returns information about the composition of each page and its page number (`id`). For information about pages, see [Get query status](#get-query-status).
+
+If a page number isn't passed, all results are returned sequentially in the same response. If you have large result sets, you may encounter timeouts based on the value configured for `druid.router.http.readTimeout`.
+
+Getting the query results for an ingestion query returns an empty response.
+
+#### URL
+
+`GET` `/druid/v2/sql/statements/{queryId}/results`
+
+#### Query parameters
+* `page` (optional)
+ * Type: Int
+ * Fetch results based on page numbers. If not specified, all results are returned sequentially starting from page 0 to N in the same response.
+* `resultFormat` (optional)
+ * Type: String
+ * Defines the format in which the results are presented. The following options are supported: `arrayLines`, `objectLines`, `array`, `object`, and `csv`. The default is `object`.
+* `filename` (optional)
+ * Type: String
+ * If set, attaches a `Content-Disposition` header to the response with the value of `attachment; filename={filename}`. The filename must not be longer than 255 characters and must not contain the characters `/`, `\`, `:`, `*`, `?`, `"`, `<`, `>`, `|`, `\0`, `\n`, or `\r`.
+
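+For example, the following request combines these parameters to fetch the first page of results as CSV with a download file name. This is a sketch: the query ID is the sample ID used later in this section, and the parameter values are illustrative.
+
+```shell
+# Fetch page 0 as CSV and attach a Content-Disposition header with a file name.
+# The query ID is a sample; substitute the ID of your completed query.
+curl "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql/statements/query-f3bca219-173d-44d4-bdc7-5002e910352f/results?page=0&resultFormat=csv&filename=results.csv"
+```
+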
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved query results*
+
+
+
+
+
+*Query in progress. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "Summary of the encountered error.",
+ "errorCode": "Well-defined error code.",
+ "persona": "Role or persona associated with the error.",
+ "category": "Classification of the error.",
+ "errorMessage": "Summary of the encountered issue with expanded information.",
+ "context": "Additional context about the error."
+}
+```
+
+
+
+
+
+*Query not found, failed or canceled*
+
+
+
+
+
+*Error thrown due to bad query. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "Summary of the encountered error.",
+ "errorCode": "Well-defined error code.",
+ "persona": "Role or persona associated with the error.",
+ "category": "Classification of the error.",
+ "errorMessage": "Summary of the encountered issue with expanded information.",
+ "context": "Additional context about the error."
+}
+```
+
+
+
+
+---
+
+#### Sample request
+
+The following example retrieves the results of a query with the specified ID `query-f3bca219-173d-44d4-bdc7-5002e910352f`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql/statements/query-f3bca219-173d-44d4-bdc7-5002e910352f/results"
+```
+
+
+
+
+
+```HTTP
+GET /druid/v2/sql/statements/query-f3bca219-173d-44d4-bdc7-5002e910352f/results HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+[
+ {
+ "__time": 1442018818771,
+ "channel": "#en.wikipedia",
+ "cityName": "",
+ "comment": "added project",
+ "countryIsoCode": "",
+ "countryName": "",
+ "isAnonymous": 0,
+ "isMinor": 0,
+ "isNew": 0,
+ "isRobot": 0,
+ "isUnpatrolled": 0,
+ "metroCode": 0,
+ "namespace": "Talk",
+ "page": "Talk:Oswald Tilghman",
+ "regionIsoCode": "",
+ "regionName": "",
+ "user": "GELongstreet",
+ "delta": 36,
+ "added": 36,
+ "deleted": 0
+ },
+ {
+ "__time": 1442018820496,
+ "channel": "#ca.wikipedia",
+ "cityName": "",
+ "comment": "Robot inserta {{Commonscat}} que enllaça amb [[commons:category:Rallicula]]",
+ "countryIsoCode": "",
+ "countryName": "",
+ "isAnonymous": 0,
+ "isMinor": 1,
+ "isNew": 0,
+ "isRobot": 1,
+ "isUnpatrolled": 0,
+ "metroCode": 0,
+ "namespace": "Main",
+ "page": "Rallicula",
+ "regionIsoCode": "",
+ "regionName": "",
+ "user": "PereBot",
+ "delta": 17,
+ "added": 17,
+ "deleted": 0
+ },
+ {
+ "__time": 1442018825474,
+ "channel": "#en.wikipedia",
+ "cityName": "Auburn",
+ "comment": "/* Status of peremptory norms under international law */ fixed spelling of 'Wimbledon'",
+ "countryIsoCode": "AU",
+ "countryName": "Australia",
+ "isAnonymous": 1,
+ "isMinor": 0,
+ "isNew": 0,
+ "isRobot": 0,
+ "isUnpatrolled": 0,
+ "metroCode": 0,
+ "namespace": "Main",
+ "page": "Peremptory norm",
+ "regionIsoCode": "NSW",
+ "regionName": "New South Wales",
+ "user": "60.225.66.142",
+ "delta": 0,
+ "added": 0,
+ "deleted": 0
+ },
+ {
+ "__time": 1442018828770,
+ "channel": "#vi.wikipedia",
+ "cityName": "",
+ "comment": "fix Lỗi CS1: ngày tháng",
+ "countryIsoCode": "",
+ "countryName": "",
+ "isAnonymous": 0,
+ "isMinor": 1,
+ "isNew": 0,
+ "isRobot": 1,
+ "isUnpatrolled": 0,
+ "metroCode": 0,
+ "namespace": "Main",
+ "page": "Apamea abruzzorum",
+ "regionIsoCode": "",
+ "regionName": "",
+ "user": "Cheers!-bot",
+ "delta": 18,
+ "added": 18,
+ "deleted": 0
+ },
+ {
+ "__time": 1442018831862,
+ "channel": "#vi.wikipedia",
+ "cityName": "",
+ "comment": "clean up using [[Project:AWB|AWB]]",
+ "countryIsoCode": "",
+ "countryName": "",
+ "isAnonymous": 0,
+ "isMinor": 0,
+ "isNew": 0,
+ "isRobot": 1,
+ "isUnpatrolled": 0,
+ "metroCode": 0,
+ "namespace": "Main",
+ "page": "Atractus flammigerus",
+ "regionIsoCode": "",
+ "regionName": "",
+ "user": "ThitxongkhoiAWB",
+ "delta": 18,
+ "added": 18,
+ "deleted": 0
+ },
+ {
+ "__time": 1442018833987,
+ "channel": "#vi.wikipedia",
+ "cityName": "",
+ "comment": "clean up using [[Project:AWB|AWB]]",
+ "countryIsoCode": "",
+ "countryName": "",
+ "isAnonymous": 0,
+ "isMinor": 0,
+ "isNew": 0,
+ "isRobot": 1,
+ "isUnpatrolled": 0,
+ "metroCode": 0,
+ "namespace": "Main",
+ "page": "Agama mossambica",
+ "regionIsoCode": "",
+ "regionName": "",
+ "user": "ThitxongkhoiAWB",
+ "delta": 18,
+ "added": 18,
+ "deleted": 0
+ },
+ {
+ "__time": 1442018837009,
+ "channel": "#ca.wikipedia",
+ "cityName": "",
+ "comment": "/* Imperi Austrohongarès */",
+ "countryIsoCode": "",
+ "countryName": "",
+ "isAnonymous": 0,
+ "isMinor": 0,
+ "isNew": 0,
+ "isRobot": 0,
+ "isUnpatrolled": 0,
+ "metroCode": 0,
+ "namespace": "Main",
+ "page": "Campanya dels Balcans (1914-1918)",
+ "regionIsoCode": "",
+ "regionName": "",
+ "user": "Jaumellecha",
+ "delta": -20,
+ "added": 0,
+ "deleted": 20
+ },
+ {
+ "__time": 1442018839591,
+ "channel": "#en.wikipedia",
+ "cityName": "",
+ "comment": "adding comment on notability and possible COI",
+ "countryIsoCode": "",
+ "countryName": "",
+ "isAnonymous": 0,
+ "isMinor": 0,
+ "isNew": 1,
+ "isRobot": 0,
+ "isUnpatrolled": 1,
+ "metroCode": 0,
+ "namespace": "Talk",
+ "page": "Talk:Dani Ploeger",
+ "regionIsoCode": "",
+ "regionName": "",
+ "user": "New Media Theorist",
+ "delta": 345,
+ "added": 345,
+ "deleted": 0
+ },
+ {
+ "__time": 1442018841578,
+ "channel": "#en.wikipedia",
+ "cityName": "",
+ "comment": "Copying assessment table to wiki",
+ "countryIsoCode": "",
+ "countryName": "",
+ "isAnonymous": 0,
+ "isMinor": 0,
+ "isNew": 0,
+ "isRobot": 1,
+ "isUnpatrolled": 0,
+ "metroCode": 0,
+ "namespace": "User",
+ "page": "User:WP 1.0 bot/Tables/Project/Pubs",
+ "regionIsoCode": "",
+ "regionName": "",
+ "user": "WP 1.0 bot",
+ "delta": 121,
+ "added": 121,
+ "deleted": 0
+ },
+ {
+ "__time": 1442018845821,
+ "channel": "#vi.wikipedia",
+ "cityName": "",
+ "comment": "clean up using [[Project:AWB|AWB]]",
+ "countryIsoCode": "",
+ "countryName": "",
+ "isAnonymous": 0,
+ "isMinor": 0,
+ "isNew": 0,
+ "isRobot": 1,
+ "isUnpatrolled": 0,
+ "metroCode": 0,
+ "namespace": "Main",
+ "page": "Agama persimilis",
+ "regionIsoCode": "",
+ "regionName": "",
+ "user": "ThitxongkhoiAWB",
+ "delta": 18,
+ "added": 18,
+ "deleted": 0
+ }
+]
+ ```
+
+
+### Cancel a query
+
+Cancels a running or accepted query.
+
+#### URL
+
+`DELETE` `/druid/v2/sql/statements/{queryId}`
+
+#### Responses
+
+
+
+
+
+
+*No-op operation because the query is not in a cancelable state*
+
+
+
+
+
+*Successfully accepted query for cancellation*
+
+
+
+
+
+*Invalid query ID. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "Summary of the encountered error.",
+ "errorCode": "Well-defined error code.",
+ "persona": "Role or persona associated with the error.",
+ "category": "Classification of the error.",
+ "errorMessage": "Summary of the encountered issue with expanded information.",
+ "context": "Additional context about the error."
+}
+```
+
+
+
+
+---
+
+#### Sample request
+
+The following example cancels a query with the specified ID `query-945c9633-2fa2-49ab-80ae-8221c38c024da`.
+
+
+
+
+
+
+```shell
+curl --request DELETE "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql/statements/query-945c9633-2fa2-49ab-80ae-8221c38c024da"
+```
+
+
+
+
+
+```HTTP
+DELETE /druid/v2/sql/statements/query-945c9633-2fa2-49ab-80ae-8221c38c024da HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+A successful request returns an HTTP `202 ACCEPTED` message code and an empty response body.
diff --git a/docs/35.0.0/api-reference/sql-ingestion-api.md b/docs/35.0.0/api-reference/sql-ingestion-api.md
new file mode 100644
index 0000000000..59942aff8e
--- /dev/null
+++ b/docs/35.0.0/api-reference/sql-ingestion-api.md
@@ -0,0 +1,850 @@
+---
+id: sql-ingestion-api
+title: SQL-based ingestion API
+sidebar_label: SQL-based ingestion
+---
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+
+:::info
+ This page describes SQL-based batch ingestion using the [`druid-multi-stage-query`](../multi-stage-query/index.md)
+ extension, new in Druid 24.0. Refer to the [ingestion methods](../ingestion/index.md#batch) table to determine which
+ ingestion method is right for you.
+:::
+
+The **Query** view in the web console provides a friendly experience for the multi-stage query task engine (MSQ task engine) and multi-stage query architecture. We recommend using the web console if you don't need a programmatic interface.
+
+When using the API for the MSQ task engine, the action you want to take determines the endpoint you use:
+
+- `/druid/v2/sql/task`: Submit a query for ingestion.
+- `/druid/indexer/v1/task`: Interact with a query, including getting its status or details, or canceling the query. This page describes a few of the Overlord Task APIs that you can use with the MSQ task engine. For information about Druid APIs, see the [API reference for Druid](../ingestion/tasks.md).
+
+In this topic, `http://ROUTER_IP:ROUTER_PORT` is a placeholder for your Router service address and port. Replace it with the information for your deployment. For example, use `http://localhost:8888` for quickstart deployments.
+
+## Submit a query
+
+Submits queries to the MSQ task engine.
+
+The `/druid/v2/sql/task` endpoint accepts the following:
+
+- [SQL requests in the JSON-over-HTTP form](sql-api.md#request-body) using the
+`query`, `context`, and `parameters` fields. The endpoint ignores the `resultFormat`, `header`, `typesHeader`, and `sqlTypesHeader` fields.
+- [INSERT](../multi-stage-query/reference.md#insert) and [REPLACE](../multi-stage-query/reference.md#replace) statements.
+- SELECT queries (experimental feature). SELECT query results are collected from workers by the controller, and written into the [task report](#get-the-report-for-a-query-task) as an array of arrays. The behavior and result format of plain SELECT queries (without INSERT or REPLACE) is subject to change.
+
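+For example, rather than embedding context parameters as `SET` statements in the SQL text (as the sample request later in this section does), you can supply them in the `context` field of the request body. The following sketch uses an illustrative `REPLACE` statement; substitute your own query:
+
+```shell
+# Submit an MSQ ingestion query with context parameters in the request body.
+# The REPLACE statement is illustrative only.
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql/task" \
+  --header 'Content-Type: application/json' \
+  --data '{
+    "query": "REPLACE INTO wikipedia OVERWRITE ALL SELECT * FROM wikipedia PARTITIONED BY DAY",
+    "context": {
+      "maxNumTasks": 3,
+      "finalizeAggregations": false
+    }
+  }'
+```
+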
+### URL
+
+`POST` `/druid/v2/sql/task`
+
+### Responses
+
+
+
+
+
+
+*Successfully submitted query*
+
+
+
+
+
+*Error thrown due to bad query. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error.",
+ "errorClass": "Class of exception that caused this error.",
+ "host": "The host on which the error occurred."
+}
+```
+
+
+
+
+*Request not sent due to unexpected conditions. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error.",
+ "errorClass": "Class of exception that caused this error.",
+ "host": "The host on which the error occurred."
+}
+```
+
+
+
+
+---
+
+### Sample request
+
+The following example shows a query that fetches data from an external JSON source and inserts it into a table named `wikipedia`.
+The example specifies two query context parameters:
+
+- `maxNumTasks=3`: Limits the maximum number of parallel tasks to 3.
+- `finalizeAggregations=false`: Ensures that Druid saves the aggregation's intermediate type during ingestion. For more information, see [Rollup](../multi-stage-query/concepts.md#rollup).
+
+
+
+
+
+
+```HTTP
+POST /druid/v2/sql/task HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+
+{
+ "query": "SET maxNumTasks=3;\nSET finalizeAggregations=false;\nINSERT INTO wikipedia\nSELECT\n TIME_PARSE(\"timestamp\") AS __time,\n *\nFROM TABLE(\n EXTERN(\n '{\"type\": \"http\", \"uris\": [\"https://druid.apache.org/data/wikipedia.json.gz\"]}',\n '{\"type\": \"json\"}',\n '[{\"name\": \"added\", \"type\": \"long\"}, {\"name\": \"channel\", \"type\": \"string\"}, {\"name\": \"cityName\", \"type\": \"string\"}, {\"name\": \"comment\", \"type\": \"string\"}, {\"name\": \"commentLength\", \"type\": \"long\"}, {\"name\": \"countryIsoCode\", \"type\": \"string\"}, {\"name\": \"countryName\", \"type\": \"string\"}, {\"name\": \"deleted\", \"type\": \"long\"}, {\"name\": \"delta\", \"type\": \"long\"}, {\"name\": \"deltaBucket\", \"type\": \"string\"}, {\"name\": \"diffUrl\", \"type\": \"string\"}, {\"name\": \"flags\", \"type\": \"string\"}, {\"name\": \"isAnonymous\", \"type\": \"string\"}, {\"name\": \"isMinor\", \"type\": \"string\"}, {\"name\": \"isNew\", \"type\": \"string\"}, {\"name\": \"isRobot\", \"type\": \"string\"}, {\"name\": \"isUnpatrolled\", \"type\": \"string\"}, {\"name\": \"metroCode\", \"type\": \"string\"}, {\"name\": \"namespace\", \"type\": \"string\"}, {\"name\": \"page\", \"type\": \"string\"}, {\"name\": \"regionIsoCode\", \"type\": \"string\"}, {\"name\": \"regionName\", \"type\": \"string\"}, {\"name\": \"timestamp\", \"type\": \"string\"}, {\"name\": \"user\", \"type\": \"string\"}]'\n )\n)\nPARTITIONED BY DAY"
+}
+```
+
+
+
+
+
+
+```shell
+curl --location --request POST 'http://ROUTER_IP:ROUTER_PORT/druid/v2/sql/task' \
+ --header 'Content-Type: application/json' \
+ --data '{
+ "query": "SET maxNumTasks=3;\nSET finalizeAggregations=false;\nINSERT INTO wikipedia\nSELECT\n TIME_PARSE(\"timestamp\") AS __time,\n *\nFROM TABLE(\n EXTERN(\n '\''{\"type\": \"http\", \"uris\": [\"https://druid.apache.org/data/wikipedia.json.gz\"]}'\'',\n '\''{\"type\": \"json\"}'\'',\n '\''[{\"name\": \"added\", \"type\": \"long\"}, {\"name\": \"channel\", \"type\": \"string\"}, {\"name\": \"cityName\", \"type\": \"string\"}, {\"name\": \"comment\", \"type\": \"string\"}, {\"name\": \"commentLength\", \"type\": \"long\"}, {\"name\": \"countryIsoCode\", \"type\": \"string\"}, {\"name\": \"countryName\", \"type\": \"string\"}, {\"name\": \"deleted\", \"type\": \"long\"}, {\"name\": \"delta\", \"type\": \"long\"}, {\"name\": \"deltaBucket\", \"type\": \"string\"}, {\"name\": \"diffUrl\", \"type\": \"string\"}, {\"name\": \"flags\", \"type\": \"string\"}, {\"name\": \"isAnonymous\", \"type\": \"string\"}, {\"name\": \"isMinor\", \"type\": \"string\"}, {\"name\": \"isNew\", \"type\": \"string\"}, {\"name\": \"isRobot\", \"type\": \"string\"}, {\"name\": \"isUnpatrolled\", \"type\": \"string\"}, {\"name\": \"metroCode\", \"type\": \"string\"}, {\"name\": \"namespace\", \"type\": \"string\"}, {\"name\": \"page\", \"type\": \"string\"}, {\"name\": \"regionIsoCode\", \"type\": \"string\"}, {\"name\": \"regionName\", \"type\": \"string\"}, {\"name\": \"timestamp\", \"type\": \"string\"}, {\"name\": \"user\", \"type\": \"string\"}]'\''\n )\n)\nPARTITIONED BY DAY"
+}'
+```
+
+
+
+
+
+
+```python
+import json
+import requests
+
+url = "http://ROUTER_IP:ROUTER_PORT/druid/v2/sql/task"
+
+payload = json.dumps({
+ "query": "SET maxNumTasks=3;\nSET finalizeAggregations=false;\nINSERT INTO wikipedia\nSELECT\n TIME_PARSE(\"timestamp\") AS __time,\n *\nFROM TABLE(\n EXTERN(\n '{\"type\": \"http\", \"uris\": [\"https://druid.apache.org/data/wikipedia.json.gz\"]}',\n '{\"type\": \"json\"}',\n '[{\"name\": \"added\", \"type\": \"long\"}, {\"name\": \"channel\", \"type\": \"string\"}, {\"name\": \"cityName\", \"type\": \"string\"}, {\"name\": \"comment\", \"type\": \"string\"}, {\"name\": \"commentLength\", \"type\": \"long\"}, {\"name\": \"countryIsoCode\", \"type\": \"string\"}, {\"name\": \"countryName\", \"type\": \"string\"}, {\"name\": \"deleted\", \"type\": \"long\"}, {\"name\": \"delta\", \"type\": \"long\"}, {\"name\": \"deltaBucket\", \"type\": \"string\"}, {\"name\": \"diffUrl\", \"type\": \"string\"}, {\"name\": \"flags\", \"type\": \"string\"}, {\"name\": \"isAnonymous\", \"type\": \"string\"}, {\"name\": \"isMinor\", \"type\": \"string\"}, {\"name\": \"isNew\", \"type\": \"string\"}, {\"name\": \"isRobot\", \"type\": \"string\"}, {\"name\": \"isUnpatrolled\", \"type\": \"string\"}, {\"name\": \"metroCode\", \"type\": \"string\"}, {\"name\": \"namespace\", \"type\": \"string\"}, {\"name\": \"page\", \"type\": \"string\"}, {\"name\": \"regionIsoCode\", \"type\": \"string\"}, {\"name\": \"regionName\", \"type\": \"string\"}, {\"name\": \"timestamp\", \"type\": \"string\"}, {\"name\": \"user\", \"type\": \"string\"}]'\n )\n)\nPARTITIONED BY DAY"
+})
+headers = {
+ 'Content-Type': 'application/json'
+}
+
+response = requests.post(url, headers=headers, data=payload)
+
+print(response.text)
+
+```
+
+
+
+
+
+### Sample response
+
+
+ View the response
+
+```json
+{
+ "taskId": "query-431c4a18-9dde-4ec8-ab82-ec7fd17d5a4e",
+ "state": "RUNNING"
+}
+```
+
+
+**Response fields**
+
+| Field | Description |
+|---|---|
+| `taskId` | Controller task ID. You can use Druid's standard [Tasks API](./tasks-api.md) to interact with this controller task. |
+| `state` | Initial state for the query. |
+
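+For example, a minimal sketch for waiting on the controller task might poll the task status endpoint described in the next section until the task leaves the RUNNING state. This sketch assumes `jq` is installed and reuses the sample task ID from the response above:
+
+```shell
+# Poll the controller task returned by /druid/v2/sql/task until it completes.
+TASK_ID="query-431c4a18-9dde-4ec8-ab82-ec7fd17d5a4e"
+while true; do
+  STATUS=$(curl -s "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/${TASK_ID}/status" | jq -r '.status.status')
+  echo "Task status: ${STATUS}"
+  if [ "${STATUS}" != "RUNNING" ]; then
+    break
+  fi
+  sleep 5
+done
+```
+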
+## Get the status for a query task
+
+Retrieves the status of a query task. It returns a JSON object with the task's status code, runner status, task type, datasource, and other relevant metadata.
+
+### URL
+
+`GET` `/druid/indexer/v1/task/{taskId}/status`
+
+### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved task status*
+
+
+
+
+
+
+
+*Cannot find task with ID*
+
+
+
+
+---
+
+### Sample request
+
+The following example shows how to retrieve the status of a task with the ID `query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e`.
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/task/query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e/status HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+
+
+```shell
+curl --location --request GET 'http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e/status'
+```
+
+
+
+
+
+
+```python
+import requests
+
+url = "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e/status"
+
+# Retrieve the status of the controller task.
+response = requests.get(url)
+
+print(response.text)
+```
+
+
+
+
+
+### Sample response
+
+
+ View the response
+
+```json
+{
+ "task": "query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e",
+ "status": {
+ "id": "query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e",
+ "groupId": "query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e",
+ "type": "query_controller",
+ "createdTime": "2022-09-14T22:12:00.183Z",
+ "queueInsertionTime": "1970-01-01T00:00:00.000Z",
+ "statusCode": "RUNNING",
+ "status": "RUNNING",
+ "runnerStatusCode": "RUNNING",
+ "duration": -1,
+ "location": {
+ "host": "localhost",
+ "port": 8100,
+ "tlsPort": -1
+ },
+ "dataSource": "kttm_simple",
+ "errorMsg": null
+ }
+}
+```
+
+
+## Get the report for a query task
+
+Retrieves the task report for a query.
+The report provides detailed information about the query task, including its stages, warnings, and errors.
+
+Keep the following in mind when using the task API to view reports:
+
+- The task report for an entire job is associated with the `query_controller` task. The `query_worker` tasks don't have their own reports; their information is incorporated into the controller report.
+- The task report API may report `404 Not Found` temporarily while the task is in the process of starting up.
+- As an experimental feature, the MSQ task engine supports running SELECT queries. SELECT query results are written into
+the `multiStageQuery.payload.results.results` task report key as an array of arrays. The behavior and result format of plain
+SELECT queries (without INSERT or REPLACE) is subject to change.
+- `multiStageQuery.payload.results.resultsTruncated` indicates whether the results in the report were truncated to keep the report from becoming too large.
+
+For an explanation of the fields in a report, see [Report response fields](#report-response-fields).
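+
+For example, for a SELECT query you can pull the results array out of the report directly. The following sketch assumes `jq` is installed; the task ID shown is the sample ID used in this section, so substitute the controller task ID of your own SELECT query:
+
+```shell
+# Extract the SELECT results array from the controller task report.
+# The task ID is illustrative; use the controller task ID of your SELECT query.
+curl -s "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e/reports" \
+  | jq '.multiStageQuery.payload.results.results'
+```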
+
+### URL
+
+
+`GET` `/druid/indexer/v1/task/{taskId}/reports`
+
+### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved task report*
+
+
+
+
+---
+
+### Sample request
+
+The following example shows how to retrieve the report for a query with the task ID `query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e`.
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/task/query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e/reports HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+
+
+```shell
+curl --location --request GET 'http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e/reports'
+```
+
+
+
+
+
+
+```python
+import requests
+
+url = "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e/reports"
+
+# Retrieve the task report for the query.
+response = requests.get(url)
+
+print(response.text)
+```
+
+
+
+
+
+### Sample response
+
+The response shows an example report for a query.
+
+
+View the response
+
+```json
+{
+ "multiStageQuery": {
+ "type": "multiStageQuery",
+ "taskId": "query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e",
+ "payload": {
+ "status": {
+ "status": "SUCCESS",
+ "startTime": "2022-09-14T22:12:09.266Z",
+ "durationMs": 28227,
+ "workers": {
+ "0": [
+ {
+ "workerId": "query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e-worker0_0",
+ "state": "SUCCESS",
+ "durationMs": 15511,
+ "pendingMs": 137
+ }
+ ]
+ },
+ "pendingTasks": 0,
+ "runningTasks": 2,
+ "segmentLoadWaiterStatus": {
+ "state": "SUCCESS",
+ "dataSource": "kttm_simple",
+ "startTime": "2022-09-14T23:12:09.266Z",
+ "duration": 15,
+ "totalSegments": 1,
+ "usedSegments": 1,
+ "precachedSegments": 0,
+ "onDemandSegments": 0,
+ "pendingSegments": 0,
+ "unknownSegments": 0
+ },
+ "segmentReport": {
+ "shardSpec": "NumberedShardSpec",
+ "details": "Cannot use RangeShardSpec, RangedShardSpec only supports string CLUSTER BY keys. Using NumberedShardSpec instead."
+ }
+ },
+ "stages": [
+ {
+ "stageNumber": 0,
+ "definition": {
+ "id": "71ecb11e-09d7-42f8-9225-1662c8e7e121_0",
+ "input": [
+ {
+ "type": "external",
+ "inputSource": {
+ "type": "http",
+ "uris": [
+ "https://static.imply.io/example-data/kttm-v2/kttm-v2-2019-08-25.json.gz"
+ ],
+ "httpAuthenticationUsername": null,
+ "httpAuthenticationPassword": null
+ },
+ "inputFormat": {
+ "type": "json",
+ "flattenSpec": null,
+ "featureSpec": {},
+ "keepNullColumns": false
+ },
+ "signature": [
+ {
+ "name": "timestamp",
+ "type": "STRING"
+ },
+ {
+ "name": "agent_category",
+ "type": "STRING"
+ },
+ {
+ "name": "agent_type",
+ "type": "STRING"
+ }
+ ]
+ }
+ ],
+ "processor": {
+ "type": "scan",
+ "query": {
+ "queryType": "scan",
+ "dataSource": {
+ "type": "inputNumber",
+ "inputNumber": 0
+ },
+ "intervals": {
+ "type": "intervals",
+ "intervals": [
+ "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
+ ]
+ },
+ "resultFormat": "compactedList",
+ "columns": [
+ "agent_category",
+ "agent_type",
+ "timestamp"
+ ],
+ "context": {
+ "finalize": false,
+ "finalizeAggregations": false,
+ "groupByEnableMultiValueUnnesting": false,
+ "scanSignature": "[{\"name\":\"agent_category\",\"type\":\"STRING\"},{\"name\":\"agent_type\",\"type\":\"STRING\"},{\"name\":\"timestamp\",\"type\":\"STRING\"}]",
+ "sqlInsertSegmentGranularity": "{\"type\":\"all\"}",
+ "sqlQueryId": "3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e",
+ "sqlReplaceTimeChunks": "all"
+ },
+ "granularity": {
+ "type": "all"
+ }
+ }
+ },
+ "signature": [
+ {
+ "name": "__boost",
+ "type": "LONG"
+ },
+ {
+ "name": "agent_category",
+ "type": "STRING"
+ },
+ {
+ "name": "agent_type",
+ "type": "STRING"
+ },
+ {
+ "name": "timestamp",
+ "type": "STRING"
+ }
+ ],
+ "shuffleSpec": {
+ "type": "targetSize",
+ "clusterBy": {
+ "columns": [
+ {
+ "columnName": "__boost"
+ }
+ ]
+ },
+ "targetSize": 3000000
+ },
+ "maxWorkerCount": 1,
+ "shuffleCheckHasMultipleValues": true
+ },
+ "phase": "FINISHED",
+ "workerCount": 1,
+ "partitionCount": 1,
+ "startTime": "2022-09-14T22:12:11.663Z",
+ "duration": 19965,
+ "sort": true
+ },
+ {
+ "stageNumber": 1,
+ "definition": {
+ "id": "71ecb11e-09d7-42f8-9225-1662c8e7e121_1",
+ "input": [
+ {
+ "type": "stage",
+ "stage": 0
+ }
+ ],
+ "processor": {
+ "type": "segmentGenerator",
+ "dataSchema": {
+ "dataSource": "kttm_simple",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "millis",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "timestamp",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "agent_category",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "agent_type",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "arbitrary",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": [
+ "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
+ ]
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "columnMappings": [
+ {
+ "queryColumn": "timestamp",
+ "outputColumn": "timestamp"
+ },
+ {
+ "queryColumn": "agent_category",
+ "outputColumn": "agent_category"
+ },
+ {
+ "queryColumn": "agent_type",
+ "outputColumn": "agent_type"
+ }
+ ],
+ "tuningConfig": {
+ "maxNumWorkers": 1,
+ "maxRowsInMemory": 100000,
+ "rowsPerSegment": 3000000
+ }
+ },
+ "signature": [],
+ "maxWorkerCount": 1
+ },
+ "phase": "FINISHED",
+ "workerCount": 1,
+ "partitionCount": 1,
+ "startTime": "2022-09-14T22:12:31.602Z",
+ "duration": 5891
+ }
+ ],
+ "counters": {
+ "0": {
+ "0": {
+ "input0": {
+ "type": "channel",
+ "rows": [
+ 465346
+ ],
+ "files": [
+ 1
+ ],
+ "totalFiles": [
+ 1
+ ]
+ },
+ "output": {
+ "type": "channel",
+ "rows": [
+ 465346
+ ],
+ "bytes": [
+ 43694447
+ ],
+ "frames": [
+ 7
+ ]
+ },
+ "shuffle": {
+ "type": "channel",
+ "rows": [
+ 465346
+ ],
+ "bytes": [
+ 41835307
+ ],
+ "frames": [
+ 73
+ ]
+ },
+ "sortProgress": {
+ "type": "sortProgress",
+ "totalMergingLevels": 3,
+ "levelToTotalBatches": {
+ "0": 1,
+ "1": 1,
+ "2": 1
+ },
+ "levelToMergedBatches": {
+ "0": 1,
+ "1": 1,
+ "2": 1
+ },
+ "totalMergersForUltimateLevel": 1,
+ "progressDigest": 1
+ }
+ }
+ },
+ "1": {
+ "0": {
+ "input0": {
+ "type": "channel",
+ "rows": [
+ 465346
+ ],
+ "bytes": [
+ 41835307
+ ],
+ "frames": [
+ 73
+ ]
+ },
+ "segmentGenerationProgress": {
+ "type": "segmentGenerationProgress",
+ "rowsProcessed": 465346,
+ "rowsPersisted": 465346,
+ "rowsMerged": 465346
+ }
+ }
+ }
+ }
+ }
+ }
+}
+```
+
+
+
+
+
+The following table describes the response fields when you retrieve a report for an MSQ task engine query using the `/druid/indexer/v1/task/{taskId}/reports` endpoint:
+
+| Field | Description |
+|---|---|
+| `multiStageQuery.taskId` | Controller task ID. |
+| `multiStageQuery.payload.status` | Query status container. |
+| `multiStageQuery.payload.status.status` | RUNNING, SUCCESS, or FAILED. |
+| `multiStageQuery.payload.status.startTime` | Start time of the query in ISO format. Only present if the query has started running. |
+| `multiStageQuery.payload.status.durationMs` | Milliseconds elapsed after the query has started running. -1 denotes that the query hasn't started running yet. |
+| `multiStageQuery.payload.status.workers` | Workers for the controller task. |
+| `multiStageQuery.payload.status.workers.<workerNumber>` | Array of worker tasks, including retries, for the given worker number. |
+| `multiStageQuery.payload.status.workers.<workerNumber>[].workerId` | ID of the worker task. |
+| `multiStageQuery.payload.status.workers.<workerNumber>[].status` | RUNNING, SUCCESS, or FAILED. |
+| `multiStageQuery.payload.status.workers.<workerNumber>[].durationMs` | Milliseconds elapsed between when the worker task was first requested and when it finished. It is -1 for worker tasks with status RUNNING. |
+| `multiStageQuery.payload.status.workers.<workerNumber>[].pendingMs` | Milliseconds elapsed between when the worker task was first requested and when it fully started RUNNING. Actual work time can be calculated using `actualWorkTimeMS = durationMs - pendingMs`. |
+| `multiStageQuery.payload.status.pendingTasks` | Number of tasks that are not fully started. -1 denotes that the number is currently unknown. |
+| `multiStageQuery.payload.status.runningTasks` | Number of currently running tasks. Should be at least 1 since the controller is included. |
+| `multiStageQuery.payload.status.segmentLoadStatus` | Segment loading container. Only present after the segments have been published. |
+| `multiStageQuery.payload.status.segmentLoadStatus.state` | Either INIT, WAITING, SUCCESS, FAILED or TIMED_OUT. |
+| `multiStageQuery.payload.status.segmentLoadStatus.startTime` | Time since which the controller has been waiting for the segments to finish loading. |
+| `multiStageQuery.payload.status.segmentLoadStatus.duration` | The duration in milliseconds that the controller has been waiting for the segments to load. |
+| `multiStageQuery.payload.status.segmentLoadStatus.totalSegments` | The total number of segments generated by the job. This includes tombstone segments (if any). |
+| `multiStageQuery.payload.status.segmentLoadStatus.usedSegments` | The number of segments which are marked as used based on the load rules. Unused segments can be cleaned up at any time. |
+| `multiStageQuery.payload.status.segmentLoadStatus.precachedSegments` | The number of segments which are marked as precached and served by historicals, as per the load rules. |
+| `multiStageQuery.payload.status.segmentLoadStatus.onDemandSegments` | The number of segments which are not loaded on any historical, as per the load rules. |
+| `multiStageQuery.payload.status.segmentLoadStatus.pendingSegments` | The number of segments remaining to be loaded. |
+| `multiStageQuery.payload.status.segmentLoadStatus.unknownSegments` | The number of segments whose status is unknown. |
+| `multiStageQuery.payload.status.segmentReport` | Segment report. Only present if the query is an ingestion. |
+| `multiStageQuery.payload.status.segmentReport.shardSpec` | Contains the shard spec chosen. |
+| `multiStageQuery.payload.status.segmentReport.details` | Contains further reasoning about the shard spec chosen. |
+| `multiStageQuery.payload.status.errorReport` | Error object. Only present if there was an error. |
+| `multiStageQuery.payload.status.errorReport.taskId` | The task that reported the error, if known. May be a controller task or a worker task. |
+| `multiStageQuery.payload.status.errorReport.host` | The hostname and port of the task that reported the error, if known. |
+| `multiStageQuery.payload.status.errorReport.stageNumber` | The stage number that reported the error, if it happened during execution of a specific stage. |
+| `multiStageQuery.payload.status.errorReport.error` | Error object. Contains `errorCode` at a minimum, and may contain other fields as described in the [error code table](../multi-stage-query/reference.md#error-codes). Always present if there is an error. |
+| `multiStageQuery.payload.status.errorReport.error.errorCode` | One of the error codes from the [error code table](../multi-stage-query/reference.md#error-codes). Always present if there is an error. |
+| `multiStageQuery.payload.status.errorReport.error.errorMessage` | User-friendly error message. Not always present, even if there is an error. |
+| `multiStageQuery.payload.status.errorReport.exceptionStackTrace` | Java stack trace in string form, if the error was due to a server-side exception. |
+| `multiStageQuery.payload.stages` | Array of query stages. |
+| `multiStageQuery.payload.stages[].stageNumber` | Each stage has a number that differentiates it from other stages. |
+| `multiStageQuery.payload.stages[].phase` | Either NEW, READING_INPUT, POST_READING, RESULTS_COMPLETE, or FAILED. Only present if the stage has started. |
+| `multiStageQuery.payload.stages[].workerCount` | Number of parallel tasks that this stage is running on. Only present if the stage has started. |
+| `multiStageQuery.payload.stages[].partitionCount` | Number of output partitions generated by this stage. Only present if the stage has started and has computed its number of output partitions. |
+| `multiStageQuery.payload.stages[].startTime` | Start time of this stage. Only present if the stage has started. |
+| `multiStageQuery.payload.stages[].duration` | The number of milliseconds that the stage has been running. Only present if the stage has started. |
+| `multiStageQuery.payload.stages[].sort` | A boolean that is set to `true` if the stage does a sort as part of its execution. |
+| `multiStageQuery.payload.stages[].definition` | The object defining what the stage does. |
+| `multiStageQuery.payload.stages[].definition.id` | The unique identifier of the stage. |
+| `multiStageQuery.payload.stages[].definition.input` | Array of inputs that the stage has. |
+| `multiStageQuery.payload.stages[].definition.broadcast` | Array of input indexes that get broadcasted. Only present if there are inputs that get broadcasted. |
+| `multiStageQuery.payload.stages[].definition.processor` | An object defining the processor logic. |
+| `multiStageQuery.payload.stages[].definition.signature` | The output signature of the stage. |
+
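+As a usage sketch of the field paths above, the following command pulls just the overall status and any error code out of a report. It assumes `jq` is installed, and the task ID is the sample ID used in this section:
+
+```shell
+# Summarize a task report: overall status plus error code, if any.
+curl -s "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-3dc0c45d-34d7-4b15-86c9-cdb2d3ebfc4e/reports" \
+  | jq '{status: .multiStageQuery.payload.status.status, errorCode: .multiStageQuery.payload.status.errorReport.error.errorCode}'
+```
+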
+## Cancel a query task
+
+Cancels a query task.
+Returns a JSON object with the ID of the task that was canceled successfully.
+
+### URL
+
+`POST` `/druid/indexer/v1/task/{taskId}/shutdown`
+
+### Responses
+
+
+
+
+
+
+
+
+*Successfully shut down task*
+
+
+
+
+
+
+
+*Cannot find task with ID or task is no longer running*
+
+
+
+
+---
+
+### Sample request
+
+The following example shows how to cancel a query task with the ID `query-655efe33-781a-4c50-ae84-c2911b42d63c`.
+
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/task/query-655efe33-781a-4c50-ae84-c2911b42d63c/shutdown HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+
+
+```shell
+curl --location --request POST 'http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-655efe33-781a-4c50-ae84-c2911b42d63c/shutdown'
+```
+
+
+
+
+
+
+```python
+import requests
+
+url = "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-655efe33-781a-4c50-ae84-c2911b42d63c/shutdown"
+
+# The shutdown endpoint takes a POST request with no body.
+response = requests.post(url)
+
+print(response.text)
+```
+
+
+
+
+
+### Sample response
+
+The response shows the ID of the task that was canceled.
+
+```json
+{
+ "task": "query-655efe33-781a-4c50-ae84-c2911b42d63c"
+}
+```
\ No newline at end of file
diff --git a/docs/35.0.0/api-reference/sql-jdbc.md b/docs/35.0.0/api-reference/sql-jdbc.md
new file mode 100644
index 0000000000..affe9ea738
--- /dev/null
+++ b/docs/35.0.0/api-reference/sql-jdbc.md
@@ -0,0 +1,251 @@
+---
+id: sql-jdbc
+title: SQL JDBC driver API
+sidebar_label: SQL JDBC driver
+---
+
+
+
+:::info
+ Apache Druid supports two query languages: Druid SQL and [native queries](../querying/querying.md).
+ This document describes the SQL language.
+:::
+
+You can make [Druid SQL](../querying/sql.md) queries using the [Avatica JDBC driver](https://calcite.apache.org/avatica/downloads/).
+We recommend using Avatica JDBC driver version 1.23.0 or later. Note that starting with Avatica 1.21.0, you may need to set the [`transparent_reconnection`](https://calcite.apache.org/avatica/docs/client_reference.html#transparent_reconnection) property to `true` if you notice intermittent query failures.
+
+Once you've downloaded the Avatica client jar, add it to your classpath.
+
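+For example, assuming the driver JAR sits in your working directory, compiling and running one of the example classes from later in this topic might look like the following sketch. The JAR file name is illustrative; use the file you downloaded.
+
+```shell
+# Compile and run a JDBC client class with the Avatica driver on the classpath.
+# avatica-1.23.0.jar is a placeholder for the driver JAR you downloaded.
+javac -cp avatica-1.23.0.jar JdbcListColumns.java
+java -cp avatica-1.23.0.jar:. JdbcListColumns
+```
+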
+Example connection string:
+
+```
+jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnection=true
+```
+
+Or, to use the protobuf protocol instead of JSON:
+
+```
+jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica-protobuf/;transparent_reconnection=true;serialization=protobuf
+```
+
+The `url` is the `/druid/v2/sql/avatica/` endpoint on the Router, which routes JDBC connections to a consistent Broker.
+For more information, see [Connection stickiness](#connection-stickiness).
+
+Set `transparent_reconnection` to `true` so your connection is not interrupted if the pool of Brokers changes membership,
+or if a Broker is restarted.
+
+Set `serialization` to `protobuf` if using the protobuf endpoint.
+
+Note that as of this writing, the latest Avatica version, 1.23.0, does not support passing
+[connection context parameters](../querying/sql-query-context.md) from the JDBC connection string to Druid. These context parameters
+must be passed using a `Properties` object instead. Refer to the Java code below for an example.
+
+Example Java code:
+
+```java
+// Connect to /druid/v2/sql/avatica/ on your Router.
+String url = "jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnection=true";
+
+// Set any connection context parameters you need here.
+// Any property from https://druid.apache.org/docs/latest/querying/sql-query-context.html can go here.
+Properties connectionProperties = new Properties();
+connectionProperties.setProperty("sqlTimeZone", "Etc/UTC");
+//To connect to a Druid deployment protected by basic authentication,
+//you can incorporate authentication details from https://druid.apache.org/docs/latest/operations/security-overview
+connectionProperties.setProperty("user", "admin");
+connectionProperties.setProperty("password", "password1");
+
+try (Connection connection = DriverManager.getConnection(url, connectionProperties)) {
+ try (
+ final Statement statement = connection.createStatement();
+ final ResultSet resultSet = statement.executeQuery(query)
+ ) {
+ while (resultSet.next()) {
+ // process result set
+ }
+ }
+}
+```
+
+For a runnable example that includes a query that you might run, see [Examples](#examples).
+
+It is also possible to use a protocol buffers JDBC connection with Druid, which offers reduced bloat and potential performance
+improvements for larger result sets. To use it, apply the following connection URL instead; everything else remains the same:
+
+```
+String url = "jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica-protobuf/;transparent_reconnection=true;serialization=protobuf";
+```
+
+:::info
+ The protobuf endpoint is also known to work with the official [Golang Avatica driver](https://github.com/apache/calcite-avatica-go).
+:::
+
+Table metadata is available over JDBC using `connection.getMetaData()` or by querying the
+[INFORMATION_SCHEMA tables](../querying/sql-metadata-tables.md). For an example of this, see [Get the metadata for a datasource](#get-the-metadata-for-a-datasource).
+
+## Connection stickiness
+
+Druid's JDBC server does not share connection state between Brokers. This means that if you're using JDBC and have
+multiple Druid Brokers, you should either connect to a specific Broker or use a load balancer with sticky sessions
+enabled. The Druid Router process provides connection stickiness when balancing JDBC requests, and can be used to achieve
+the necessary stickiness even with a normal non-sticky load balancer. Please see the
+[Router](../design/router.md) documentation for more details.
+
+Note that the non-JDBC [JSON over HTTP](sql-api.md#submit-a-query) API is stateless and does not require stickiness.
+
+## Dynamic parameters
+
+You can use [parameterized queries](../querying/sql.md#dynamic-parameters) in JDBC code, as in this example:
+
+```java
+PreparedStatement statement = connection.prepareStatement("SELECT COUNT(*) AS cnt FROM druid.foo WHERE dim1 = ? OR dim1 = ?");
+statement.setString(1, "abc");
+statement.setString(2, "def");
+final ResultSet resultSet = statement.executeQuery();
+```
+
+Sample code that passes an array as a dynamic parameter by joining the values into a string and expanding it with `STRING_TO_ARRAY`:
+```java
+PreparedStatement statement = connection.prepareStatement("select l1 from numfoo where SCALAR_IN_ARRAY(l1, STRING_TO_ARRAY(CAST(? as varchar),','))");
+List li = ImmutableList.of(0, 7);
+String sqlArg = Joiner.on(",").join(li);
+statement.setString(1, sqlArg);
+statement.executeQuery();
+```
+
+Sample code using a native array:
+```java
+PreparedStatement statement = connection.prepareStatement("select l1 from numfoo where SCALAR_IN_ARRAY(l1, ?)");
+Iterable list = ImmutableList.of(0, 7);
+ArrayFactoryImpl arrayFactoryImpl = new ArrayFactoryImpl(TimeZone.getDefault());
+AvaticaType type = ColumnMetaData.scalar(Types.INTEGER, SqlType.INTEGER.name(), Rep.INTEGER);
+Array array = arrayFactoryImpl.createArray(type, list);
+statement.setArray(1, array);
+statement.executeQuery();
+```
+
+## Examples
+
+
+
+The following section contains two complete samples that use the JDBC connector:
+
+- [Get the metadata for a datasource](#get-the-metadata-for-a-datasource) shows you how to query the `INFORMATION_SCHEMA` to get metadata like column names.
+- [Query data](#query-data) runs a select query against the datasource.
+
+You can try out these examples after verifying that you meet the [prerequisites](#prerequisites).
+
+For more information about the connection options, see [Client Reference](https://calcite.apache.org/avatica/docs/client_reference.html).
+
+### Prerequisites
+
+Make sure you meet the following requirements before trying these examples:
+
+- A supported [Java version](../operations/java.md)
+
+- [Avatica JDBC driver](https://calcite.apache.org/avatica/downloads/). You can add the JAR to your `CLASSPATH` directly or manage it externally, such as through Maven and a `pom.xml` file.
+
+- An available Druid instance. You can use the `micro-quickstart` configuration described in [Quickstart (local)](../tutorials/index.md). The examples assume that you are using the quickstart, so no authentication or authorization is expected unless explicitly mentioned.
+
+- The example `wikipedia` datasource from the quickstart is loaded on your Druid instance. If you have a different datasource loaded, you can still try these examples. You'll have to update the table name and column names to match your datasource.
+
+### Get the metadata for a datasource
+
+Metadata, such as column names, is available either through the [`INFORMATION_SCHEMA`](../querying/sql-metadata-tables.md) table or through `connection.getMetaData()`. The following example uses the `INFORMATION_SCHEMA` table to retrieve and print the list of column names for the `wikipedia` datasource that you loaded during a previous tutorial.
+
+```java
+import java.sql.*;
+import java.util.Properties;
+
+public class JdbcListColumns {
+
+ public static void main(String[] args)
+ {
+ // Connect to /druid/v2/sql/avatica/ on your Router.
+ // You can connect to a Broker but must configure connection stickiness if you do.
+ String url = "jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnection=true";
+
+ String query = "SELECT COLUMN_NAME,* FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'wikipedia' and TABLE_SCHEMA='druid'";
+
+ // Set any connection context parameters you need here.
+ // Any property from https://druid.apache.org/docs/latest/querying/sql-query-context.html can go here.
+ Properties connectionProperties = new Properties();
+
+ try (Connection connection = DriverManager.getConnection(url, connectionProperties)) {
+ try (
+ final Statement statement = connection.createStatement();
+ final ResultSet rs = statement.executeQuery(query)
+ ) {
+ while (rs.next()) {
+ String columnName = rs.getString("COLUMN_NAME");
+ System.out.println(columnName);
+ }
+ }
+ } catch (SQLException e) {
+ throw new RuntimeException(e);
+ }
+
+ }
+}
+```
+
+### Query data
+
+Now that you know what columns are available, you can start querying the data. The following example queries the datasource named `wikipedia` for the timestamps and comments from Japan. It also sets the [query context parameter](../querying/sql-query-context.md) `sqlTimeZone`. Optionally, you can also parameterize queries by using [dynamic parameters](#dynamic-parameters).
+
+```java
+import java.sql.*;
+import java.util.Properties;
+
+public class JdbcCountryAndTime {
+
+ public static void main(String[] args)
+ {
+ // Connect to /druid/v2/sql/avatica/ on your Router.
+ // You can connect to a Broker but must configure connection stickiness if you do.
+ String url = "jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnection=true";
+
+ //The query you want to run.
+ String query = "SELECT __time, isRobot, countryName, comment FROM wikipedia WHERE countryName='Japan'";
+
+ // Set any connection context parameters you need here.
+ // Any property from https://druid.apache.org/docs/latest/querying/sql-query-context.html can go here.
+ Properties connectionProperties = new Properties();
+ connectionProperties.setProperty("sqlTimeZone", "America/Los_Angeles");
+
+ try (Connection connection = DriverManager.getConnection(url, connectionProperties)) {
+ try (
+ final Statement statement = connection.createStatement();
+ final ResultSet rs = statement.executeQuery(query)
+ ) {
+ while (rs.next()) {
+ Timestamp timeStamp = rs.getTimestamp("__time");
+ String comment = rs.getString("comment");
+ System.out.println(timeStamp);
+ System.out.println(comment);
+ }
+ }
+ } catch (SQLException e) {
+ throw new RuntimeException(e);
+ }
+
+ }
+}
+```
diff --git a/docs/35.0.0/api-reference/supervisor-api.md b/docs/35.0.0/api-reference/supervisor-api.md
new file mode 100644
index 0000000000..38e68d4e13
--- /dev/null
+++ b/docs/35.0.0/api-reference/supervisor-api.md
@@ -0,0 +1,3652 @@
+---
+id: supervisor-api
+title: Supervisor API
+sidebar_label: Supervisors
+---
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+
+This topic describes the API endpoints to manage and monitor supervisors for Apache Druid.
+This topic uses the Apache Kafka term *offset* to refer to the identifier for records in a partition. If you are using Amazon Kinesis, the equivalent is *sequence number*.
+
+In this topic, `http://ROUTER_IP:ROUTER_PORT` is a placeholder for your Router service address and port. Replace it with the information for your deployment. For example, use `http://localhost:8888` for quickstart deployments.
+
+## Supervisor information
+
+The following table lists the properties of a supervisor object:
+
+|Property|Type|Description|
+|---|---|---|
+|`id`|String|Unique identifier.|
+|`state`|String|Generic state of the supervisor. Available states: `UNHEALTHY_SUPERVISOR`, `UNHEALTHY_TASKS`, `PENDING`, `RUNNING`, `SUSPENDED`, `STOPPING`. See [Supervisor reference](../ingestion/supervisor.md#status-report) for more information.|
+|`detailedState`|String|Detailed state of the supervisor. This property contains a more descriptive, implementation-specific state that may provide more insight into the supervisor's activities than the `state` property. See [Apache Kafka ingestion](../ingestion/kafka-ingestion.md) and [Amazon Kinesis ingestion](../ingestion/kinesis-ingestion.md) for supervisor-specific states.|
+|`healthy`|Boolean|Supervisor health indicator.|
+|`spec`|Object|Container object for the supervisor configuration.|
+|`suspended`|Boolean|Indicates whether the supervisor is in a suspended state.|
+
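+For example, once you retrieve supervisor objects from the `?full` endpoint described later in this topic, you can filter on these properties. The following sketch, which assumes `jq` is installed, lists the IDs of supervisors that are not healthy:
+
+```shell
+# List the IDs of supervisors whose healthy flag is false.
+curl -s "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor?full" | jq '[.[] | select(.healthy == false) | .id]'
+```
+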
+### Get an array of active supervisor IDs
+
+Returns an array of strings representing the names of active supervisors. If there are no active supervisors, it returns an empty array.
+
+#### URL
+
+`GET` `/druid/indexer/v1/supervisor`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved array of active supervisor IDs*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/supervisor HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ [
+ "wikipedia_stream",
+ "social_media"
+ ]
+ ```
+
+
+### Get an array of active supervisor objects
+
+Retrieves an array of active supervisor objects. If there are no active supervisors, it returns an empty array. For reference on the supervisor object properties, see the preceding [table](#supervisor-information).
+
+#### URL
+
+`GET` `/druid/indexer/v1/supervisor?full`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved supervisor objects*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor?full=null"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/supervisor?full=null HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ [
+ {
+ "id": "wikipedia_stream",
+ "state": "RUNNING",
+ "detailedState": "CONNECTING_TO_STREAM",
+ "healthy": true,
+ "spec": {
+ "type": "kafka",
+ "spec": {
+ "dataSchema": {
+ "dataSource": "wikipedia_stream",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9042"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ }
+ },
+ "dataSchema": {
+ "dataSource": "wikipedia_stream",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9042"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "context": null,
+ "suspended": false
+ },
+ "suspended": false
+ },
+ {
+ "id": "social_media",
+ "state": "RUNNING",
+ "detailedState": "RUNNING",
+ "healthy": true,
+ "spec": {
+ "type": "kafka",
+ "spec": {
+ "dataSchema": {
+ "dataSource": "social_media",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9094"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ }
+ },
+ "dataSchema": {
+ "dataSource": "social_media",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9094"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "context": null,
+ "suspended": false
+ },
+ "suspended": false
+ }
+ ]
+ ```
+
+
+### Get an array of supervisor states
+
+Retrieves an array of objects representing active supervisors and their current state. If there are no active supervisors, it returns an empty array. For reference on the supervisor object properties, see the preceding [table](#supervisor-information).
+
+#### URL
+
+`GET` `/druid/indexer/v1/supervisor?state=true`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved supervisor state objects*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor?state=true"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/supervisor?state=true HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ [
+ {
+ "id": "wikipedia_stream",
+ "state": "UNHEALTHY_SUPERVISOR",
+ "detailedState": "UNABLE_TO_CONNECT_TO_STREAM",
+ "healthy": false,
+ "suspended": false
+ },
+ {
+ "id": "social_media",
+ "state": "RUNNING",
+ "detailedState": "RUNNING",
+ "healthy": true,
+ "suspended": false
+ }
+ ]
+ ```
+
+
+
+### Get supervisor specification
+
+Retrieves the specification for a single supervisor. The returned specification includes the `dataSchema`, `ioConfig`, and `tuningConfig` objects.
+
+#### URL
+
+`GET` `/druid/indexer/v1/supervisor/{supervisorId}`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved supervisor spec*
+
+
+
+
+
+*Invalid supervisor ID*
+
+
+
+
+---
+
+#### Sample request
+
+The following example shows how to retrieve the specification of a supervisor with the name `wikipedia_stream`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/wikipedia_stream"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/supervisor/wikipedia_stream HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "type": "kafka",
+ "spec": {
+ "dataSchema": {
+ "dataSource": "social_media",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9094"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ }
+ },
+ "dataSchema": {
+ "dataSource": "social_media",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9094"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "context": null,
+ "suspended": false
+}
+ ```
+
+
+### Get supervisor status
+
+Retrieves the current status report for a single supervisor. The report contains the state of the supervisor tasks and an array of recently thrown exceptions.
+
+For additional information about the status report, see [Supervisor reference](../ingestion/supervisor.md#status-report).
+
+#### URL
+
+`GET` `/druid/indexer/v1/supervisor/{supervisorId}/status`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved supervisor status*
+
+
+
+
+
+*Invalid supervisor ID*
+
+
+
+
+---
+
+#### Sample request
+
+The following example shows how to retrieve the status of a supervisor with the name `social_media`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/status"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/supervisor/social_media/status HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "id": "social_media",
+ "generationTime": "2023-07-05T23:24:43.934Z",
+ "payload": {
+ "dataSource": "social_media",
+ "stream": "social_media",
+ "partitions": 1,
+ "replicas": 1,
+ "durationSeconds": 3600,
+ "activeTasks": [
+ {
+ "id": "index_kafka_social_media_ab72ae4127c591c_flcbhdlh",
+ "startingOffsets": {
+ "0": 3176381
+ },
+ "startTime": "2023-07-05T23:21:39.321Z",
+ "remainingSeconds": 3415,
+ "type": "ACTIVE",
+ "currentOffsets": {
+ "0": 3296632
+ },
+ "lag": {
+ "0": 3
+ }
+ }
+ ],
+ "publishingTasks": [],
+ "latestOffsets": {
+ "0": 3296635
+ },
+ "minimumLag": {
+ "0": 3
+ },
+ "aggregateLag": 3,
+ "offsetsLastUpdated": "2023-07-05T23:24:30.212Z",
+ "suspended": false,
+ "healthy": true,
+ "state": "RUNNING",
+ "detailedState": "RUNNING",
+ "recentErrors": []
+ }
+ }
+ ```
+
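+As a follow-up, the following sketch (assuming the `jq` CLI is installed) extracts the supervisor state and lag figures from the status payload, which can be useful for ad hoc lag checks:
+
+```shell
+# A minimal sketch: report the state, aggregate lag, and per-partition minimum lag
+# for the social_media supervisor. Assumes the jq CLI is installed.
+curl -s "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/status" \
+  | jq '{state: .payload.state, aggregateLag: .payload.aggregateLag, minimumLag: .payload.minimumLag}'
+```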
+
+### Get supervisor health
+
+Retrieves the current health report for a single supervisor. The health of a supervisor is determined by the supervisor's `state` (as returned by the `/status` endpoint) and the `druid.supervisor.*` Overlord configuration thresholds.
+
+#### URL
+
+`GET` `/druid/indexer/v1/supervisor/{supervisorId}/health`
+
+#### Responses
+
+
+
+
+
+*Supervisor is healthy*
+
+
+
+
+
+*Invalid supervisor ID*
+
+
+
+
+
+*Supervisor is unhealthy*
+
+
+
+
+
+---
+
+#### Sample request
+
+The following example shows how to retrieve the health report for a supervisor with the name `social_media`.
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/health"
+```
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/supervisor/social_media/health HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "healthy": false
+ }
+ ```
+
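+For scripted monitoring, the following sketch (assuming the `jq` CLI is installed) reads the `healthy` field from the response and exits with a non-zero status when the supervisor is unhealthy:
+
+```shell
+# A minimal sketch: fail (exit 1) when the social_media supervisor reports itself unhealthy.
+# Assumes the jq CLI is installed; suitable for a cron job or external health check.
+healthy=$(curl -s "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/health" | jq -r '.healthy')
+if [ "$healthy" != "true" ]; then
+  echo "Supervisor social_media is unhealthy" >&2
+  exit 1
+fi
+```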
+
+### Get supervisor ingestion stats
+
+Returns a snapshot of the current ingestion row counters for each task being managed by the supervisor, along with moving averages for the row counters. See [Row stats](../ingestion/tasks.md#row-stats) for more information.
+
+#### URL
+
+`GET` `/druid/indexer/v1/supervisor/{supervisorId}/stats`
+
+#### Responses
+
+
+
+
+
+*Successfully retrieved supervisor stats*
+
+
+
+
+
+*Invalid supervisor ID*
+
+
+
+
+
+---
+
+#### Sample request
+
+The following example shows how to retrieve the current ingestion row counters for a supervisor with the name `custom_data`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/custom_data/stats"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/supervisor/custom_data/stats HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "0": {
+ "index_kafka_custom_data_881d621078f6b7c_ccplchbi": {
+ "movingAverages": {
+ "buildSegments": {
+ "5m": {
+ "processed": 53.401225142603316,
+ "processedBytes": 5226.400757148808,
+ "unparseable": 0.0,
+ "thrownAway": 0.0,
+ "processedWithError": 0.0
+ },
+ "15m": {
+ "processed": 56.92994990102502,
+ "processedBytes": 5571.772059828217,
+ "unparseable": 0.0,
+ "thrownAway": 0.0,
+ "processedWithError": 0.0
+ },
+ "1m": {
+ "processed": 37.134921285556636,
+ "processedBytes": 3634.2766230628677,
+ "unparseable": 0.0,
+ "thrownAway": 0.0,
+ "processedWithError": 0.0
+ }
+ }
+ },
+ "totals": {
+ "buildSegments": {
+ "processed": 665,
+ "processedBytes": 65079,
+ "processedWithError": 0,
+ "thrownAway": 0,
+ "unparseable": 0
+ }
+ }
+ }
+ }
+ }
+ ```
+
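+As a follow-up, the following sketch (assuming the `jq` CLI is installed) pulls the total processed row count for each task from the stats snapshot:
+
+```shell
+# A minimal sketch: list the total processed row count reported for each task
+# managed by the custom_data supervisor. Assumes the jq CLI is installed.
+curl -s "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/custom_data/stats" \
+  | jq '[.[] | to_entries[] | {task: .key, processed: .value.totals.buildSegments.processed}]'
+```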
+
+## Audit history
+
+An audit history provides a comprehensive log of events, including supervisor configuration, creation, suspension, and modification history.
+
+### Get audit history for all supervisors
+
+Retrieves an audit history of specs for all supervisors.
+
+#### URL
+
+`GET` `/druid/indexer/v1/supervisor/history`
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved audit history*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/history"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/supervisor/history HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "social_media": [
+ {
+ "spec": {
+ "type": "kafka",
+ "spec": {
+ "dataSchema": {
+ "dataSource": "social_media",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9094"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ }
+ },
+ "dataSchema": {
+ "dataSource": "social_media",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9094"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "context": null,
+ "suspended": false
+ },
+ "version": "2023-07-03T18:51:02.970Z"
+ }
+ ]
+}
+ ```
+
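+As a follow-up, the following sketch (assuming the `jq` CLI is installed) reduces the audit history to the list of spec versions recorded for each supervisor:
+
+```shell
+# A minimal sketch: list the spec versions recorded in the audit history for each supervisor.
+# Assumes the jq CLI is installed.
+curl -s "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/history" \
+  | jq 'with_entries(.value |= [.[].version])'
+```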
+
+### Get audit history for a specific supervisor
+
+Retrieves an audit history of specs for a single supervisor.
+
+#### URL
+
+`GET` `/druid/indexer/v1/supervisor/{supervisorId}/history`
+
+#### Query parameters
+
+* `count` (optional)
+ * Type: Integer
+ * Limit the number of results to the last `n` entries. Must be greater than 0 if specified.
+
+#### Responses
+
+
+
+
+
+
+*Successfully retrieved supervisor audit history*
+
+
+
+
+
+*Invalid supervisor ID*
+
+
+
+
+---
+
+#### Sample request
+
+The following examples show how to retrieve the audit history of a supervisor with the name `wikipedia_stream`.
+
+**Get all history entries:**
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/wikipedia_stream/history"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/supervisor/wikipedia_stream/history HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+**Get last 10 history entries:**
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/wikipedia_stream/history?count=10"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/supervisor/wikipedia_stream/history?count=10 HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+[
+ {
+ "spec": {
+ "type": "kafka",
+ "spec": {
+ "dataSchema": {
+ "dataSource": "wikipedia_stream",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9042"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ }
+ },
+ "dataSchema": {
+ "dataSource": "wikipedia_stream",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9042"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "context": null,
+ "suspended": false
+ },
+ "version": "2023-07-05T20:59:16.872Z"
+ }
+]
+ ```
+
+
+## Manage supervisors
+
+### Create or update a supervisor
+
+Creates a new supervisor spec or updates an existing one with new configuration and schema information. When updating a supervisor spec, the datasource must remain the same as the previous supervisor.
+
+You can define a supervisor spec for [Apache Kafka](../ingestion/kafka-ingestion.md) or [Amazon Kinesis](../ingestion/kinesis-ingestion.md) streaming ingestion methods.
+
+The following table lists the properties of a supervisor spec:
+
+|Property|Type|Description|Required|
+|--------|----|-----------|--------|
+|`type`|String|The supervisor type. One of `kafka` or `kinesis`.|Yes|
+|`spec`|Object|The container object for the supervisor configuration.|Yes|
+|`ioConfig`|Object|The I/O configuration object to define the connection and I/O-related settings for the supervisor and indexing task.|Yes|
+|`dataSchema`|Object|The schema for the indexing task to use during ingestion. See [`dataSchema`](../ingestion/ingestion-spec.md#dataschema) for more information.|Yes|
+|`tuningConfig`|Object|The tuning configuration object to define performance-related settings for the supervisor and indexing tasks.|No|
+
+When you call this endpoint on an existing supervisor, the running supervisor signals its tasks to stop reading and begin publishing, then exits. Druid uses the configuration in the request body to create a new supervisor with the updated schema while retaining existing publishing tasks, and starts new tasks from the previous task offsets.
+This way, you can apply configuration changes without pausing ingestion.
+
+#### URL
+
+`POST` `/druid/indexer/v1/supervisor`
+
+#### Responses
+
+
+
+
+
+
+*Successfully created a new supervisor or updated an existing supervisor*
+
+
+
+
+
+*Request body content type is not in JSON format*
+
+
+
+
+---
+
+#### Sample request
+
+The following example uses JSON input format to create a supervisor spec for Kafka with a `social_media` datasource and `social_media` topic.
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor" \
+--header 'Content-Type: application/json' \
+--data '{
+ "type": "kafka",
+ "spec": {
+ "ioConfig": {
+ "type": "kafka",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9094"
+ },
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "useEarliestOffset": true
+ },
+ "tuningConfig": {
+ "type": "kafka"
+ },
+ "dataSchema": {
+ "dataSource": "social_media",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso"
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ "username",
+ "post_title",
+ {
+ "type": "long",
+ "name": "views"
+ },
+ {
+ "type": "long",
+ "name": "upvotes"
+ },
+ {
+ "type": "long",
+ "name": "comments"
+ },
+ "edited"
+ ]
+ },
+ "granularitySpec": {
+ "queryGranularity": "none",
+ "rollup": false,
+ "segmentGranularity": "hour"
+ }
+ }
+ }
+}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/supervisor HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 1359
+
+{
+ "type": "kafka",
+ "spec": {
+ "ioConfig": {
+ "type": "kafka",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9094"
+ },
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "useEarliestOffset": true
+ },
+ "tuningConfig": {
+ "type": "kafka"
+ },
+ "dataSchema": {
+ "dataSource": "social_media",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso"
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ "username",
+ "post_title",
+ {
+ "type": "long",
+ "name": "views"
+ },
+ {
+ "type": "long",
+ "name": "upvotes"
+ },
+ {
+ "type": "long",
+ "name": "comments"
+ },
+ "edited"
+ ]
+ },
+ "granularitySpec": {
+ "queryGranularity": "none",
+ "rollup": false,
+ "segmentGranularity": "hour"
+ }
+ }
+ }
+}
+```
+
+
+
+
+#### Sample request with `skipRestartIfUnmodified`
+
+The following example sets the `skipRestartIfUnmodified` query parameter to `true`. When this flag is `true`, the supervisor restarts only if the submitted supervisor spec differs from the spec of the currently running supervisor. If left unset, the flag defaults to `false`.
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor?skipRestartIfUnmodified=true" \
+--header 'Content-Type: application/json' \
+--data '{
+ "type": "kafka",
+ "spec": {
+ "ioConfig": {
+ "type": "kafka",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9094"
+ },
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "useEarliestOffset": true
+ },
+ "tuningConfig": {
+ "type": "kafka"
+ },
+ "dataSchema": {
+ "dataSource": "social_media",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso"
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ "username",
+ "post_title",
+ {
+ "type": "long",
+ "name": "views"
+ },
+ {
+ "type": "long",
+ "name": "upvotes"
+ },
+ {
+ "type": "long",
+ "name": "comments"
+ },
+ "edited"
+ ]
+ },
+ "granularitySpec": {
+ "queryGranularity": "none",
+ "rollup": false,
+ "segmentGranularity": "hour"
+ }
+ }
+ }
+}'
+```
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "id": "social_media"
+}
+ ```
+
+
+### Suspend a running supervisor
+
+Suspends a single running supervisor. Returns the updated supervisor spec, where the `suspended` property is set to `true`. The suspended supervisor continues to emit logs and metrics.
+Indexing tasks remain suspended until you [resume the supervisor](#resume-a-supervisor).
+
+#### URL
+
+`POST` `/druid/indexer/v1/supervisor/{supervisorId}/suspend`
+
+#### Responses
+
+
+
+
+
+
+*Successfully suspended supervisor*
+
+
+
+
+
+*Supervisor already suspended*
+
+
+
+
+
+*Invalid supervisor ID*
+
+
+
+
+---
+
+#### Sample request
+
+The following example shows how to suspend a running supervisor with the name `social_media`.
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/suspend"
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/supervisor/social_media/suspend HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "type": "kafka",
+ "spec": {
+ "dataSchema": {
+ "dataSource": "social_media",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9094"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ }
+ },
+ "dataSchema": {
+ "dataSource": "social_media",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9094"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "context": null,
+ "suspended": true
+}
+ ```
+
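+As a follow-up, the following sketch (assuming the `jq` CLI is installed) suspends the supervisor and prints the `suspended` property from the returned spec, which should be `true`:
+
+```shell
+# A minimal sketch: suspend the social_media supervisor and print the suspended property
+# from the returned spec; a successful suspension prints "true". Assumes the jq CLI is installed.
+curl -s --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/suspend" \
+  | jq '.suspended'
+```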
+
+### Suspend all supervisors
+
+Suspends all supervisors. Note that this endpoint returns an HTTP `200 OK` message code even if there are no supervisors or no running supervisors to suspend.
+
+#### URL
+
+`POST` `/druid/indexer/v1/supervisor/suspendAll`
+
+#### Responses
+
+
+
+
+
+
+*Successfully suspended all supervisors*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/suspendAll"
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/supervisor/suspendAll HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "status": "success"
+}
+ ```
+
+
+### Resume a supervisor
+
+Resumes indexing tasks for a supervisor. Returns an updated supervisor spec with the `suspended` property set to `false`.
+
+#### URL
+
+`POST` `/druid/indexer/v1/supervisor/{supervisorId}/resume`
+
+#### Responses
+
+
+
+
+
+
+*Successfully resumed supervisor*
+
+
+
+
+
+*Supervisor already running*
+
+
+
+
+
+*Invalid supervisor ID*
+
+
+
+
+---
+
+#### Sample request
+
+The following example resumes a previously suspended supervisor with the name `social_media`.
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/resume"
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/supervisor/social_media/resume HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "type": "kafka",
+ "spec": {
+ "dataSchema": {
+ "dataSource": "social_media",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9094"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ }
+ },
+ "dataSchema": {
+ "dataSource": "social_media",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "username",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "post_title",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "long",
+ "name": "views",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "upvotes",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "long",
+ "name": "comments",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": false
+ },
+ {
+ "type": "string",
+ "name": "edited",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": []
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "tuningConfig": {
+ "type": "kafka",
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 150000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null,
+ "intermediatePersistPeriod": "PT10M",
+ "maxPendingPersists": 0,
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "reportParseExceptions": false,
+ "handoffConditionTimeout": 0,
+ "resetOffsetAutomatically": false,
+ "segmentWriteOutMediumFactory": null,
+ "workerThreads": null,
+ "chatRetries": 8,
+ "httpTimeout": "PT10S",
+ "shutdownTimeout": "PT80S",
+ "offsetFetchPeriod": "PT30S",
+ "intermediateHandoffPeriod": "P2147483647D",
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "skipSequenceNumberAvailabilityCheck": false,
+ "repartitionTransitionDuration": "PT120S"
+ },
+ "ioConfig": {
+ "topic": "social_media",
+ "inputFormat": {
+ "type": "json"
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9094"
+ },
+ "autoScalerConfig": null,
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": true,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "configOverrides": null,
+ "idleConfig": null,
+ "stream": "social_media",
+ "useEarliestSequenceNumber": true
+ },
+ "context": null,
+ "suspended": false
+}
+ ```
+
+
+### Resume all supervisors
+
+Resumes all supervisors. Note that this endpoint returns an HTTP `200 OK` code even if there are no supervisors or no suspended supervisors to resume.
+
+#### URL
+
+`POST` `/druid/indexer/v1/supervisor/resumeAll`
+
+#### Responses
+
+
+
+
+
+
+*Successfully resumed all supervisors*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/resumeAll"
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/supervisor/resumeAll HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "status": "success"
+}
+ ```
+
+
+### Reset a supervisor
+
+The supervisor must be running for this endpoint to be available.
+
+Resets the specified supervisor. This endpoint clears supervisor metadata, prompting the supervisor to resume data reading. The supervisor restarts from the earliest or latest available position, depending on the value of the `useEarliestOffset` property.
+After clearing all stored offsets, the supervisor kills and recreates active tasks,
+so that tasks begin reading from valid positions.
+
+Use this endpoint to recover from a stopped state caused by missing offsets. Use it with caution: it may result in skipped messages and lead to data loss or duplicate data.
+
+The indexing service keeps track of the latest persisted offsets to provide exactly-once ingestion guarantees across tasks. Subsequent tasks must start reading from where the previous task completed for Druid to accept the generated segments. If the messages at the expected starting offsets are no longer available, the supervisor refuses to start and in-flight tasks fail. Possible causes for missing messages include the message retention period elapsing or the topic being removed and re-created. Use the `reset` endpoint to recover from this condition.
+
+#### URL
+
+`POST` `/druid/indexer/v1/supervisor/{supervisorId}/reset`
+
+#### Responses
+
+
+
+
+
+
+*Successfully reset supervisor*
+
+
+
+
+
+*Invalid supervisor ID*
+
+
+
+
+---
+
+#### Sample request
+
+The following example shows how to reset a supervisor with the name `social_media`.
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/reset"
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/supervisor/social_media/reset HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "id": "social_media"
+}
+ ```
+
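+After a reset, you can optionally confirm that the supervisor has resumed reading by checking its status. The following sketch assumes the supervisor status endpoint described earlier in this topic and reuses the `social_media` supervisor name:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/status"
+```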
+
+### Reset offsets for a supervisor
+
+The supervisor must be running for this endpoint to be available.
+
+Resets offsets for the specified partitions without resetting the entire set.
+
+This endpoint clears only the stored offsets, prompting the supervisor to resume reading data from the specified offsets.
+If there are no stored offsets, the specified offsets are set in the metadata store.
+
+After resetting stored offsets, the supervisor kills and recreates any active tasks pertaining to the specified partitions,
+so that tasks begin reading from the specified offsets. For partitions that are not specified in this operation, the supervisor resumes from the last stored offset.
+
+Use this endpoint with caution. It can cause skipped messages, leading to data loss or duplicate data.
+
+#### URL
+
+`POST` `/druid/indexer/v1/supervisor/{supervisorId}/resetOffsets`
+
+#### Responses
+
+
+
+
+
+
+*Successfully reset offsets*
+
+
+
+
+
+*Invalid supervisor ID*
+
+
+
+
+---
+
+#### Reset offsets metadata
+
+This section presents the structure and details of the reset offsets metadata payload.
+
+| Field | Type | Description | Required |
+|---------|---------|---------|---------|
+| `type` | String | The type of reset offsets metadata payload. It must match the supervisor's `type`. Possible values: `kafka` or `kinesis`. | Yes |
+| `partitions` | Object | An object representing the reset metadata. See below for details. | Yes |
+
+#### Partitions
+
+The following table defines the fields within the `partitions` object in the reset offsets metadata payload.
+
+| Field | Type | Description | Required |
+|---------|---------|---------|---------|
+| `type` | String | Must be set as `end`. Indicates the end sequence numbers for the reset offsets. | Yes |
+| `stream` | String | The stream to be reset. It must be a valid stream consumed by the supervisor. | Yes |
+| `partitionOffsetMap` | Object | A map of partitions to corresponding offsets for the stream to be reset.| Yes |
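+
+For reference, a payload that combines the fields above might look like the following sketch. The stream name, partition IDs, and offset values are illustrative:
+
+```json
+{
+  "type": "kafka",
+  "partitions": {
+    "type": "end",
+    "stream": "ads_media_stream",
+    "partitionOffsetMap": {
+      "0": 100,
+      "2": 650
+    }
+  }
+}
+```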
+
+#### Sample request
+
+The following example shows how to reset offsets for a Kafka supervisor with the name `social_media`. In this example, the supervisor reads from the Kafka topic `ads_media_stream` and has the stored offsets `{"0": 0, "1": 10, "2": 20, "3": 40}`.
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/resetOffsets" \
+--header 'Content-Type: application/json' \
+--data-raw '{"type":"kafka","partitions":{"type":"end","stream":"ads_media_stream","partitionOffsetMap":{"0":100, "2": 650}}}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/supervisor/social_media/resetOffsets HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+
+{
+ "type": "kafka",
+ "partitions": {
+ "type": "end",
+ "stream": "ads_media_stream",
+ "partitionOffsetMap": {
+ "0": 100,
+ "2": 650
+ }
+ }
+}
+```
+
+The example operation resets offsets only for partitions `0` and `2` to 100 and 650 respectively. After a successful reset,
+when the supervisor's tasks restart, they resume reading from `{"0": 100, "1": 10, "2": 650, "3": 40}`.
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "id": "social_media"
+}
+ ```
+
+
+### Terminate a supervisor
+
+Terminates a supervisor and its associated indexing tasks, triggering the publishing of their segments. When you terminate a supervisor, Druid places a tombstone marker in the metadata store to prevent reloading on restart.
+
+The terminated supervisor still exists in the metadata store and its history can be retrieved.
+
+#### URL
+
+`POST` `/druid/indexer/v1/supervisor/{supervisorId}/terminate`
+
+#### Responses
+
+
+
+
+
+
+*Successfully terminated a supervisor*
+
+
+
+
+
+*Invalid supervisor ID or supervisor not running*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/terminate"
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/supervisor/social_media/terminate HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "id": "social_media"
+}
+ ```
+
+
+### Terminate all supervisors
+
+Terminates all supervisors. Terminated supervisors still exist in the metadata store and their history can be retrieved. Note that this endpoint returns an HTTP `200 OK` code even if there are no supervisors or no running supervisors to terminate.
+
+#### URL
+
+`POST` `/druid/indexer/v1/supervisor/terminateAll`
+
+#### Responses
+
+
+
+
+
+
+*Successfully terminated all supervisors*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/terminateAll"
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/supervisor/terminateAll HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+{
+ "status": "success"
+}
+ ```
+
+
+### Handoff task groups for a supervisor early
+
+Triggers early handoff for the specified task groups of a supervisor. This is a best-effort API and makes no guarantees of handoff execution.
+
+#### URL
+
+`POST` `/druid/indexer/v1/supervisor/{supervisorId}/taskGroups/handoff`
+
+#### Sample request
+
+The following example shows how to hand off task groups `1`, `2`, and `3` early for a supervisor with the name `social_media`.
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/supervisor/social_media/taskGroups/handoff" \
+--header 'Content-Type: application/json' \
+--data-raw '{"taskGroupIds": [1, 2, 3]}'
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/supervisor/social_media/taskGroups/handoff HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+
+{
+ "taskGroupIds": [1, 2, 3],
+}
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+(empty response)
+
+
+### Shut down a supervisor
+
+Shuts down a supervisor. This endpoint is deprecated and will be removed in future releases. Use the equivalent [terminate](#terminate-a-supervisor) endpoint instead.
+
+#### URL
+
+`POST` `/druid/indexer/v1/supervisor/{supervisorId}/shutdown`
diff --git a/docs/35.0.0/api-reference/tasks-api.md b/docs/35.0.0/api-reference/tasks-api.md
new file mode 100644
index 0000000000..f53037f84e
--- /dev/null
+++ b/docs/35.0.0/api-reference/tasks-api.md
@@ -0,0 +1,1663 @@
+---
+id: tasks-api
+title: Tasks API
+sidebar_label: Tasks
+---
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+
+
+
+This document describes the API endpoints for task retrieval, submission, and deletion for Apache Druid. Tasks are individual jobs performed by Druid to complete operations such as ingestion, querying, and compaction.
+
+In this topic, `http://ROUTER_IP:ROUTER_PORT` is a placeholder for the Router service address and port. For example, on the quickstart configuration, use `http://localhost:8888`.
+
+## Task information and retrieval
+
+### Get an array of tasks
+
+Retrieves an array of all tasks in the Druid cluster. Each task object includes information on its ID, status, associated datasource, and other metadata. For definitions of the response properties, see the [Tasks table](../querying/sql-metadata-tables.md#tasks-table).
+
+#### URL
+
+`GET` `/druid/indexer/v1/tasks`
+
+#### Query parameters
+
+The endpoint supports a set of optional query parameters to filter results.
+
+|Parameter|Type|Description|
+|---|---|---|
+|`state`|String|Filter the list of tasks by task state. Valid options are `running`, `complete`, `waiting`, and `pending`.|
+| `datasource`|String| Return tasks filtered by Druid datasource.|
+| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. Use `_` as the delimiter for the interval string. Do not use `/`. For example, `2023-06-27_2023-06-28`.|
+| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.|
+| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.|
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved list of tasks*
+
+
+
+
+
+
+
+*Invalid `state` query parameter value*
+
+
+
+
+
+
+
+*Invalid query parameter*
+
+
+
+
+---
+
+#### Sample request
+
+The following example shows how to retrieve a list of tasks filtered with the following query parameters:
+* State: `complete`
+* Datasource: `wikipedia_api`
+* Time interval: between `2015-09-12` and `2015-09-13`
+* Max entries returned: `10`
+* Task type: `query_worker`
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/tasks/?state=complete&datasource=wikipedia_api&createdTimeInterval=2015-09-12_2015-09-13&max=10&type=query_worker"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/tasks/?state=complete&datasource=wikipedia_api&createdTimeInterval=2015-09-12_2015-09-13&max=10&type=query_worker HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ [
+ {
+ "id": "query-223549f8-b993-4483-b028-1b0d54713cad-worker0_0",
+ "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad",
+ "type": "query_worker",
+ "createdTime": "2023-06-22T22:11:37.012Z",
+ "queueInsertionTime": "1970-01-01T00:00:00.000Z",
+ "statusCode": "SUCCESS",
+ "status": "SUCCESS",
+ "runnerStatusCode": "NONE",
+ "duration": 17897,
+ "location": {
+ "host": "localhost",
+ "port": 8101,
+ "tlsPort": -1
+ },
+ "dataSource": "wikipedia_api",
+ "errorMsg": null
+ },
+ {
+ "id": "query-fa82fa40-4c8c-4777-b832-cabbee5f519f-worker0_0",
+ "groupId": "query-fa82fa40-4c8c-4777-b832-cabbee5f519f",
+ "type": "query_worker",
+ "createdTime": "2023-06-20T22:51:21.302Z",
+ "queueInsertionTime": "1970-01-01T00:00:00.000Z",
+ "statusCode": "SUCCESS",
+ "status": "SUCCESS",
+ "runnerStatusCode": "NONE",
+ "duration": 16911,
+ "location": {
+ "host": "localhost",
+ "port": 8101,
+ "tlsPort": -1
+ },
+ "dataSource": "wikipedia_api",
+ "errorMsg": null
+ },
+ {
+ "id": "query-5419da7a-b270-492f-90e6-920ecfba766a-worker0_0",
+ "groupId": "query-5419da7a-b270-492f-90e6-920ecfba766a",
+ "type": "query_worker",
+ "createdTime": "2023-06-20T22:45:53.909Z",
+ "queueInsertionTime": "1970-01-01T00:00:00.000Z",
+ "statusCode": "SUCCESS",
+ "status": "SUCCESS",
+ "runnerStatusCode": "NONE",
+ "duration": 17030,
+ "location": {
+ "host": "localhost",
+ "port": 8101,
+ "tlsPort": -1
+ },
+ "dataSource": "wikipedia_api",
+ "errorMsg": null
+ }
+ ]
+ ```
+
+
+
+### Get an array of complete tasks
+
+Retrieves an array of completed tasks in the Druid cluster. This is functionally equivalent to `/druid/indexer/v1/tasks?state=complete`. For definitions of the response properties, see the [Tasks table](../querying/sql-metadata-tables.md#tasks-table).
+
+#### URL
+
+`GET` `/druid/indexer/v1/completeTasks`
+
+#### Query parameters
+
+The endpoint supports a set of optional query parameters to filter results.
+
+|Parameter|Type|Description|
+|---|---|---|
+| `datasource`|String| Return tasks filtered by Druid datasource.|
+| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. The interval string should be delimited by `_` instead of `/`. For example, `2023-06-27_2023-06-28`.|
+| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.|
+| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.|
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved list of complete tasks*
+
+
+
+
+
+
+
+*Request sent to incorrect service*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/completeTasks"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/completeTasks HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ [
+ {
+ "id": "query-223549f8-b993-4483-b028-1b0d54713cad-worker0_0",
+ "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad",
+ "type": "query_worker",
+ "createdTime": "2023-06-22T22:11:37.012Z",
+ "queueInsertionTime": "1970-01-01T00:00:00.000Z",
+ "statusCode": "SUCCESS",
+ "status": "SUCCESS",
+ "runnerStatusCode": "NONE",
+ "duration": 17897,
+ "location": {
+ "host": "localhost",
+ "port": 8101,
+ "tlsPort": -1
+ },
+ "dataSource": "wikipedia_api",
+ "errorMsg": null
+ },
+ {
+ "id": "query-223549f8-b993-4483-b028-1b0d54713cad",
+ "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad",
+ "type": "query_controller",
+ "createdTime": "2023-06-22T22:11:28.367Z",
+ "queueInsertionTime": "1970-01-01T00:00:00.000Z",
+ "statusCode": "SUCCESS",
+ "status": "SUCCESS",
+ "runnerStatusCode": "NONE",
+ "duration": 30317,
+ "location": {
+ "host": "localhost",
+ "port": 8100,
+ "tlsPort": -1
+ },
+ "dataSource": "wikipedia_api",
+ "errorMsg": null
+ }
+ ]
+ ```
+
+
+
+### Get an array of running tasks
+
+Retrieves an array of running task objects in the Druid cluster. It is functionally equivalent to `/druid/indexer/v1/tasks?state=running`. For definitions of the response properties, see the [Tasks table](../querying/sql-metadata-tables.md#tasks-table).
+
+#### URL
+
+`GET` `/druid/indexer/v1/runningTasks`
+
+#### Query parameters
+
+The endpoint supports a set of optional query parameters to filter results.
+
+|Parameter|Type|Description|
+|---|---|---|
+| `datasource`|String| Return tasks filtered by Druid datasource.|
+| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. The interval string should be delimited by `_` instead of `/`. For example, `2023-06-27_2023-06-28`.|
+| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.|
+| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.|
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved list of running tasks*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/runningTasks"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/runningTasks HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ [
+ {
+ "id": "query-32663269-ead9-405a-8eb6-0817a952ef47",
+ "groupId": "query-32663269-ead9-405a-8eb6-0817a952ef47",
+ "type": "query_controller",
+ "createdTime": "2023-06-22T22:54:43.170Z",
+ "queueInsertionTime": "2023-06-22T22:54:43.170Z",
+ "statusCode": "RUNNING",
+ "status": "RUNNING",
+ "runnerStatusCode": "RUNNING",
+ "duration": -1,
+ "location": {
+ "host": "localhost",
+ "port": 8100,
+ "tlsPort": -1
+ },
+ "dataSource": "wikipedia_api",
+ "errorMsg": null
+ }
+ ]
+ ```
+
+
+
+### Get an array of waiting tasks
+
+Retrieves an array of waiting tasks in the Druid cluster. It is functionally equivalent to `/druid/indexer/v1/tasks?state=waiting`. For definitions of the response properties, see the [Tasks table](../querying/sql-metadata-tables.md#tasks-table).
+
+#### URL
+
+`GET` `/druid/indexer/v1/waitingTasks`
+
+#### Query parameters
+
+The endpoint supports a set of optional query parameters to filter results.
+
+|Parameter|Type|Description|
+|---|---|---|
+| `datasource`|String| Return tasks filtered by Druid datasource.|
+| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. The interval string should be delimited by `_` instead of `/`. For example, `2023-06-27_2023-06-28`.|
+| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.|
+| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.|
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved list of waiting tasks*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/waitingTasks"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/waitingTasks HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ [
+ {
+ "id": "index_parallel_wikipedia_auto_biahcbmf_2023-06-26T21:08:05.216Z",
+ "groupId": "index_parallel_wikipedia_auto_biahcbmf_2023-06-26T21:08:05.216Z",
+ "type": "index_parallel",
+ "createdTime": "2023-06-26T21:08:05.217Z",
+ "queueInsertionTime": "1970-01-01T00:00:00.000Z",
+ "statusCode": "RUNNING",
+ "status": "RUNNING",
+ "runnerStatusCode": "WAITING",
+ "duration": -1,
+ "location": {
+ "host": null,
+ "port": -1,
+ "tlsPort": -1
+ },
+ "dataSource": "wikipedia_auto",
+ "errorMsg": null
+ },
+ {
+ "id": "index_parallel_wikipedia_auto_afggfiec_2023-06-26T21:08:05.546Z",
+ "groupId": "index_parallel_wikipedia_auto_afggfiec_2023-06-26T21:08:05.546Z",
+ "type": "index_parallel",
+ "createdTime": "2023-06-26T21:08:05.548Z",
+ "queueInsertionTime": "1970-01-01T00:00:00.000Z",
+ "statusCode": "RUNNING",
+ "status": "RUNNING",
+ "runnerStatusCode": "WAITING",
+ "duration": -1,
+ "location": {
+ "host": null,
+ "port": -1,
+ "tlsPort": -1
+ },
+ "dataSource": "wikipedia_auto",
+ "errorMsg": null
+ },
+ {
+ "id": "index_parallel_wikipedia_auto_jmmddihf_2023-06-26T21:08:06.644Z",
+ "groupId": "index_parallel_wikipedia_auto_jmmddihf_2023-06-26T21:08:06.644Z",
+ "type": "index_parallel",
+ "createdTime": "2023-06-26T21:08:06.671Z",
+ "queueInsertionTime": "1970-01-01T00:00:00.000Z",
+ "statusCode": "RUNNING",
+ "status": "RUNNING",
+ "runnerStatusCode": "WAITING",
+ "duration": -1,
+ "location": {
+ "host": null,
+ "port": -1,
+ "tlsPort": -1
+ },
+ "dataSource": "wikipedia_auto",
+ "errorMsg": null
+ }
+ ]
+ ```
+
+
+
+### Get an array of pending tasks
+
+Retrieves an array of pending tasks in the Druid cluster. It is functionally equivalent to `/druid/indexer/v1/tasks?state=pending`. For definitions of the response properties, see the [Tasks table](../querying/sql-metadata-tables.md#tasks-table).
+
+#### URL
+
+`GET` `/druid/indexer/v1/pendingTasks`
+
+#### Query parameters
+
+The endpoint supports a set of optional query parameters to filter results.
+
+|Parameter|Type|Description|
+|---|---|---|
+| `datasource`|String| Return tasks filtered by Druid datasource.|
+| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. The interval string should be delimited by `_` instead of `/`. For example, `2023-06-27_2023-06-28`.|
+| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.|
+| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.|
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved list of pending tasks*
+
+
+
+
+---
+
+#### Sample request
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/pendingTasks"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/pendingTasks HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ [
+ {
+ "id": "query-7b37c315-50a0-4b68-aaa8-b1ef1f060e67",
+ "groupId": "query-7b37c315-50a0-4b68-aaa8-b1ef1f060e67",
+ "type": "query_controller",
+ "createdTime": "2023-06-23T19:53:06.037Z",
+ "queueInsertionTime": "2023-06-23T19:53:06.037Z",
+ "statusCode": "RUNNING",
+ "status": "RUNNING",
+ "runnerStatusCode": "PENDING",
+ "duration": -1,
+ "location": {
+ "host": null,
+ "port": -1,
+ "tlsPort": -1
+ },
+ "dataSource": "wikipedia_api",
+ "errorMsg": null
+ },
+ {
+ "id": "query-544f0c41-f81d-4504-b98b-f9ab8b36ef36",
+ "groupId": "query-544f0c41-f81d-4504-b98b-f9ab8b36ef36",
+ "type": "query_controller",
+ "createdTime": "2023-06-23T19:53:06.616Z",
+ "queueInsertionTime": "2023-06-23T19:53:06.616Z",
+ "statusCode": "RUNNING",
+ "status": "RUNNING",
+ "runnerStatusCode": "PENDING",
+ "duration": -1,
+ "location": {
+ "host": null,
+ "port": -1,
+ "tlsPort": -1
+ },
+ "dataSource": "wikipedia_api",
+ "errorMsg": null
+ }
+ ]
+ ```
+
+
+
+### Get task payload
+
+Retrieves the payload of a task given the task ID. It returns a JSON object with the task ID and payload that includes task configuration details and relevant specifications associated with the execution of the task.
+
+#### URL
+
+`GET` `/druid/indexer/v1/task/{taskId}`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved payload of task*
+
+
+
+
+
+
+
+*Cannot find task with ID*
+
+
+
+
+---
+
+#### Sample request
+
+The following example shows how to retrieve the payload of a task with the specified ID `index_parallel_wikipedia_short_iajoonnd_2023-07-07T17:53:12.174Z`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/index_parallel_wikipedia_short_iajoonnd_2023-07-07T17:53:12.174Z"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/task/index_parallel_wikipedia_short_iajoonnd_2023-07-07T17:53:12.174Z HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "task": "index_parallel_wikipedia_short_iajoonnd_2023-07-07T17:53:12.174Z",
+ "payload": {
+ "type": "index_parallel",
+ "id": "index_parallel_wikipedia_short_iajoonnd_2023-07-07T17:53:12.174Z",
+ "groupId": "index_parallel_wikipedia_short_iajoonnd_2023-07-07T17:53:12.174Z",
+ "resource": {
+ "availabilityGroup": "index_parallel_wikipedia_short_iajoonnd_2023-07-07T17:53:12.174Z",
+ "requiredCapacity": 1
+ },
+ "spec": {
+ "dataSchema": {
+ "dataSource": "wikipedia_short",
+ "timestampSpec": {
+ "column": "time",
+ "format": "iso",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "cityName",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "countryName",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ },
+ {
+ "type": "string",
+ "name": "regionName",
+ "multiValueHandling": "SORTED_ARRAY",
+ "createBitmapIndex": true
+ }
+ ],
+ "dimensionExclusions": [
+ "__time",
+ "time"
+ ],
+ "includeAllDimensions": false,
+ "useSchemaDiscovery": false
+ },
+ "metricsSpec": [],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "DAY",
+ "queryGranularity": {
+ "type": "none"
+ },
+ "rollup": false,
+ "intervals": [
+ "2015-09-12T00:00:00.000Z/2015-09-13T00:00:00.000Z"
+ ]
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "ioConfig": {
+ "type": "index_parallel",
+ "inputSource": {
+ "type": "local",
+ "baseDir": "quickstart/tutorial",
+ "filter": "wikiticker-2015-09-12-sampled.json.gz"
+ },
+ "inputFormat": {
+ "type": "json"
+ },
+ "appendToExisting": false,
+ "dropExisting": false
+ },
+ "tuningConfig": {
+ "type": "index_parallel",
+ "maxRowsPerSegment": 5000000,
+ "appendableIndexSpec": {
+ "type": "onheap",
+ "preserveExistingMetrics": false
+ },
+ "maxRowsInMemory": 25000,
+ "maxBytesInMemory": 0,
+ "skipBytesInMemoryOverheadCheck": false,
+ "maxTotalRows": null,
+ "numShards": null,
+ "splitHintSpec": null,
+ "partitionsSpec": {
+ "type": "dynamic",
+ "maxRowsPerSegment": 5000000,
+ "maxTotalRows": null
+ },
+ "indexSpec": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "indexSpecForIntermediatePersists": {
+ "bitmap": {
+ "type": "roaring"
+ },
+ "dimensionCompression": "lz4",
+ "stringDictionaryEncoding": {
+ "type": "utf8"
+ },
+ "metricCompression": "lz4",
+ "longEncoding": "longs"
+ },
+ "maxPendingPersists": 0,
+ "forceGuaranteedRollup": false,
+ "reportParseExceptions": false,
+ "pushTimeout": 0,
+ "segmentWriteOutMediumFactory": null,
+ "maxNumConcurrentSubTasks": 1,
+ "maxRetry": 3,
+ "taskStatusCheckPeriodMs": 1000,
+ "chatHandlerTimeout": "PT10S",
+ "chatHandlerNumRetries": 5,
+ "maxNumSegmentsToMerge": 100,
+ "totalNumMergeTasks": 10,
+ "logParseExceptions": false,
+ "maxParseExceptions": 2147483647,
+ "maxSavedParseExceptions": 0,
+ "maxColumnsToMerge": -1,
+ "awaitSegmentAvailabilityTimeoutMillis": 0,
+ "maxAllowedLockCount": -1,
+ "partitionDimensions": []
+ }
+ },
+ "context": {
+ "forceTimeChunkLock": true,
+ "useLineageBasedSegmentAllocation": true
+ },
+ "dataSource": "wikipedia_short"
+ }
+}
+ ```
+
+
+
+### Get task status
+
+Retrieves the status of a task given the task ID. It returns a JSON object with the task's status code, runner status, task type, datasource, and other relevant metadata.
+
+#### URL
+
+`GET` `/druid/indexer/v1/task/{taskId}/status`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved task status*
+
+
+
+
+
+
+
+*Cannot find task with ID*
+
+
+
+
+---
+
+#### Sample request
+
+The following example shows how to retrieve the status of a task with the specified ID `query-223549f8-b993-4483-b028-1b0d54713cad`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-223549f8-b993-4483-b028-1b0d54713cad/status"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/task/query-223549f8-b993-4483-b028-1b0d54713cad/status HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "task": "query-223549f8-b993-4483-b028-1b0d54713cad",
+ "status": {
+ "id": "query-223549f8-b993-4483-b028-1b0d54713cad",
+ "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad",
+ "type": "query_controller",
+ "createdTime": "2023-06-22T22:11:28.367Z",
+ "queueInsertionTime": "1970-01-01T00:00:00.000Z",
+ "statusCode": "RUNNING",
+ "status": "RUNNING",
+ "runnerStatusCode": "RUNNING",
+ "duration": -1,
+ "location": {"host": "localhost", "port": 8100, "tlsPort": -1},
+ "dataSource": "wikipedia_api",
+ "errorMsg": null
+ }
+ }
+ ```
+
+
+
+### Get task segments
+
+:::info
+ This API is no longer supported and always returns a `404 Not Found` response.
+ Use the metric `segment/added/bytes` instead to identify the segment IDs committed by a task.
+:::
+
+#### URL
+
+`GET` `/druid/indexer/v1/task/{taskId}/segments`
+
+#### Responses
+
+
+
+
+
+
+```json
+{
+ "error": "Segment IDs committed by a task action are not persisted anymore. Use the metric 'segment/added/bytes' to identify the segments created by a task."
+}
+```
+
+
+
+
+---
+
+#### Sample request
+
+The following example shows a request for the segments of the task with the specified ID `query-52a8aafe-7265-4427-89fe-dc51275cc470`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-52a8aafe-7265-4427-89fe-dc51275cc470/reports"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/task/query-52a8aafe-7265-4427-89fe-dc51275cc470/segments HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+Because this endpoint is no longer supported, the request always returns a `404 Not Found` response with the error message shown above.
+
+### Get task log
+
+Retrieves the event log associated with a task. It returns a list of logged events during the lifecycle of the task. The endpoint is useful for providing information about the execution of the task, including any errors or warnings raised.
+
+Task logs are automatically retrieved from the Middle Manager/Indexer or from long-term storage. For reference, see [Task logs](../ingestion/tasks.md#task-logs).
+
+#### URL
+
+`GET` `/druid/indexer/v1/task/{taskId}/log`
+
+#### Query parameters
+
+* `offset` (optional)
+ * Type: Int
+  * Excludes the specified number of initial entries from the response, as shown in the example after this list.
+
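+For example, to skip entries at the beginning of the log, pass `offset` as a query parameter. The task ID and offset value below are illustrative:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/index_kafka_social_media_0e905aa31037879_nommnaeg/log?offset=1000"
+```
+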
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved task log*
+
+
+
+
+---
+
+#### Sample request
+
+The following example shows how to retrieve the task log of a task with the specified ID `index_kafka_social_media_0e905aa31037879_nommnaeg`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/index_kafka_social_media_0e905aa31037879_nommnaeg/log"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/task/index_kafka_social_media_0e905aa31037879_nommnaeg/log HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ 2023-07-03T22:11:17,891 INFO [qtp1251996697-122] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Sequence[index_kafka_social_media_0e905aa31037879_0] end offsets updated from [{0=9223372036854775807}] to [{0=230985}].
+ 2023-07-03T22:11:17,900 INFO [qtp1251996697-122] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Saved sequence metadata to disk: [SequenceMetadata{sequenceId=0, sequenceName='index_kafka_social_media_0e905aa31037879_0', assignments=[0], startOffsets={0=230985}, exclusiveStartPartitions=[], endOffsets={0=230985}, sentinel=false, checkpointed=true}]
+ 2023-07-03T22:11:17,901 INFO [task-runner-0-priority-0] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Received resume command, resuming ingestion.
+ 2023-07-03T22:11:17,901 INFO [task-runner-0-priority-0] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Finished reading partition[0], up to[230985].
+ 2023-07-03T22:11:17,902 INFO [task-runner-0-priority-0] org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-kafka-supervisor-dcanhmig-1, groupId=kafka-supervisor-dcanhmig] Resetting generation and member id due to: consumer pro-actively leaving the group
+ 2023-07-03T22:11:17,902 INFO [task-runner-0-priority-0] org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-kafka-supervisor-dcanhmig-1, groupId=kafka-supervisor-dcanhmig] Request joining group due to: consumer pro-actively leaving the group
+ 2023-07-03T22:11:17,902 INFO [task-runner-0-priority-0] org.apache.kafka.clients.consumer.KafkaConsumer - [Consumer clientId=consumer-kafka-supervisor-dcanhmig-1, groupId=kafka-supervisor-dcanhmig] Unsubscribed all topics or patterns and assigned partitions
+ 2023-07-03T22:11:17,912 INFO [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Persisted rows[0] and (estimated) bytes[0]
+ 2023-07-03T22:11:17,916 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Flushed in-memory data with commit metadata [AppenderatorDriverMetadata{segments={}, lastSegmentIds={}, callerMetadata={nextPartitions=SeekableStreamEndSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}}}}] for segments:
+ 2023-07-03T22:11:17,917 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Persisted stats: processed rows: [0], persisted rows[0], sinks: [0], total fireHydrants (across sinks): [0], persisted fireHydrants (across sinks): [0]
+ 2023-07-03T22:11:17,919 INFO [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Pushing [0] segments in background
+ 2023-07-03T22:11:17,921 INFO [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Persisted rows[0] and (estimated) bytes[0]
+ 2023-07-03T22:11:17,924 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Flushed in-memory data with commit metadata [AppenderatorDriverMetadata{segments={}, lastSegmentIds={}, callerMetadata={nextPartitions=SeekableStreamStartSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}, exclusivePartitions=[]}, publishPartitions=SeekableStreamEndSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}}}}] for segments:
+ 2023-07-03T22:11:17,924 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Persisted stats: processed rows: [0], persisted rows[0], sinks: [0], total fireHydrants (across sinks): [0], persisted fireHydrants (across sinks): [0]
+ 2023-07-03T22:11:17,925 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-appenderator-merge] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Preparing to push (stats): processed rows: [0], sinks: [0], fireHydrants (across sinks): [0]
+ 2023-07-03T22:11:17,925 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-appenderator-merge] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Push complete...
+ 2023-07-03T22:11:17,929 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-publish] org.apache.druid.indexing.seekablestream.SequenceMetadata - With empty segment set, start offsets [SeekableStreamStartSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}, exclusivePartitions=[]}] and end offsets [SeekableStreamEndSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}}] are the same, skipping metadata commit.
+ 2023-07-03T22:11:17,930 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-publish] org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Published [0] segments with commit metadata [{nextPartitions=SeekableStreamStartSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}, exclusivePartitions=[]}, publishPartitions=SeekableStreamEndSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}}}]
+ 2023-07-03T22:11:17,930 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-publish] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Published 0 segments for sequence [index_kafka_social_media_0e905aa31037879_0] with metadata [AppenderatorDriverMetadata{segments={}, lastSegmentIds={}, callerMetadata={nextPartitions=SeekableStreamStartSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}, exclusivePartitions=[]}, publishPartitions=SeekableStreamEndSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}}}}].
+ 2023-07-03T22:11:17,931 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-publish] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Saved sequence metadata to disk: []
+ 2023-07-03T22:11:17,932 INFO [task-runner-0-priority-0] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Handoff complete for segments:
+ 2023-07-03T22:11:17,932 INFO [task-runner-0-priority-0] org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-kafka-supervisor-dcanhmig-1, groupId=kafka-supervisor-dcanhmig] Resetting generation and member id due to: consumer pro-actively leaving the group
+ 2023-07-03T22:11:17,932 INFO [task-runner-0-priority-0] org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-kafka-supervisor-dcanhmig-1, groupId=kafka-supervisor-dcanhmig] Request joining group due to: consumer pro-actively leaving the group
+ 2023-07-03T22:11:17,933 INFO [task-runner-0-priority-0] org.apache.kafka.common.metrics.Metrics - Metrics scheduler closed
+ 2023-07-03T22:11:17,933 INFO [task-runner-0-priority-0] org.apache.kafka.common.metrics.Metrics - Closing reporter org.apache.kafka.common.metrics.JmxReporter
+ 2023-07-03T22:11:17,933 INFO [task-runner-0-priority-0] org.apache.kafka.common.metrics.Metrics - Metrics reporters closed
+ 2023-07-03T22:11:17,935 INFO [task-runner-0-priority-0] org.apache.kafka.common.utils.AppInfoParser - App info kafka.consumer for consumer-kafka-supervisor-dcanhmig-1 unregistered
+ 2023-07-03T22:11:17,936 INFO [task-runner-0-priority-0] org.apache.druid.curator.announcement.PathChildrenAnnouncer - Unannouncing [/druid/internal-discovery/PEON/localhost:8100]
+ 2023-07-03T22:11:17,972 INFO [task-runner-0-priority-0] org.apache.druid.curator.discovery.CuratorDruidNodeAnnouncer - Unannounced self [{"druidNode":{"service":"druid/middleManager","host":"localhost","bindOnHost":false,"plaintextPort":8100,"port":-1,"tlsPort":-1,"enablePlaintextPort":true,"enableTlsPort":false},"nodeType":"peon","services":{"dataNodeService":{"type":"dataNodeService","tier":"_default_tier","maxSize":0,"type":"indexer-executor","serverType":"indexer-executor","priority":0},"lookupNodeService":{"type":"lookupNodeService","lookupTier":"__default"}}}].
+ 2023-07-03T22:11:17,972 INFO [task-runner-0-priority-0] org.apache.druid.curator.announcement.PathChildrenAnnouncer - Unannouncing [/druid/announcements/localhost:8100]
+ 2023-07-03T22:11:17,996 INFO [task-runner-0-priority-0] org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
+ "id" : "index_kafka_social_media_0e905aa31037879_nommnaeg",
+ "status" : "SUCCESS",
+ "duration" : 3601130,
+ "errorMsg" : null,
+ "location" : {
+ "host" : null,
+ "port" : -1,
+ "tlsPort" : -1
+ }
+ }
+ 2023-07-03T22:11:17,998 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [ANNOUNCEMENTS]
+ 2023-07-03T22:11:18,005 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [SERVER]
+ 2023-07-03T22:11:18,009 INFO [main] org.eclipse.jetty.server.AbstractConnector - Stopped ServerConnector@6491006{HTTP/1.1, (http/1.1)}{0.0.0.0:8100}
+ 2023-07-03T22:11:18,009 INFO [main] org.eclipse.jetty.server.session - node0 Stopped scavenging
+ 2023-07-03T22:11:18,012 INFO [main] org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@742aa00a{/,null,STOPPED}
+ 2023-07-03T22:11:18,014 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [NORMAL]
+ 2023-07-03T22:11:18,014 INFO [main] org.apache.druid.server.coordination.ZkCoordinator - Stopping ZkCoordinator for [DruidServerMetadata{name='localhost:8100', hostAndPort='localhost:8100', hostAndTlsPort='null', maxSize=0, tier='_default_tier', type=indexer-executor, priority=0}]
+ 2023-07-03T22:11:18,014 INFO [main] org.apache.druid.server.coordination.SegmentLoadDropHandler - Stopping...
+ 2023-07-03T22:11:18,014 INFO [main] org.apache.druid.server.coordination.SegmentLoadDropHandler - Stopped.
+ 2023-07-03T22:11:18,014 INFO [main] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Starting graceful shutdown of task[index_kafka_social_media_0e905aa31037879_nommnaeg].
+ 2023-07-03T22:11:18,014 INFO [main] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Stopping forcefully (status: [PUBLISHING])
+ 2023-07-03T22:11:18,019 INFO [LookupExtractorFactoryContainerProvider-MainThread] org.apache.druid.query.lookup.LookupReferencesManager - Lookup Management loop exited. Lookup notices are not handled anymore.
+ 2023-07-03T22:11:18,020 INFO [main] org.apache.druid.query.lookup.LookupReferencesManager - Closed lookup [name].
+ 2023-07-03T22:11:18,020 INFO [Curator-Framework-0] org.apache.curator.framework.imps.CuratorFrameworkImpl - backgroundOperationsLoop exiting
+ 2023-07-03T22:11:18,147 INFO [main] org.apache.zookeeper.ZooKeeper - Session: 0x1000097ceaf0007 closed
+ 2023-07-03T22:11:18,147 INFO [main-EventThread] org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x1000097ceaf0007
+ 2023-07-03T22:11:18,151 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [INIT]
+ Finished peon task
+ ```
+
+
+
+### Get task completion report
+
+Retrieves a [task completion report](../ingestion/tasks.md#task-reports) for a task. It returns a JSON object with information about the number of rows ingested, and any parse exceptions that Druid raised.
+
+#### URL
+
+`GET` `/druid/indexer/v1/task/{taskId}/reports`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved task report*
+
+
+
+
+---
+
+#### Sample request
+
+The following example shows how to retrieve the completion report of a task with the specified ID `query-52a8aafe-7265-4427-89fe-dc51275cc470`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-52a8aafe-7265-4427-89fe-dc51275cc470/reports"
+```
+
+
+
+
+
+```HTTP
+GET /druid/indexer/v1/task/query-52a8aafe-7265-4427-89fe-dc51275cc470/reports HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "ingestionStatsAndErrors": {
+ "type": "ingestionStatsAndErrors",
+ "taskId": "query-52a8aafe-7265-4427-89fe-dc51275cc470",
+ "payload": {
+ "ingestionState": "COMPLETED",
+ "unparseableEvents": {},
+ "rowStats": {
+ "determinePartitions": {
+ "processed": 0,
+ "processedBytes": 0,
+ "processedWithError": 0,
+ "thrownAway": 0,
+ "unparseable": 0
+ },
+ "buildSegments": {
+ "processed": 39244,
+ "processedBytes": 17106256,
+ "processedWithError": 0,
+ "thrownAway": 0,
+ "unparseable": 0
+ }
+ },
+ "errorMsg": null,
+ "segmentAvailabilityConfirmed": false,
+ "segmentAvailabilityWaitTimeMs": 0
+ }
+ }
+ }
+ ```
+
+
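+If you have `jq` installed, you can extract individual statistics from the report. For example, the following sketch pulls the number of rows processed during segment generation, based on the report structure shown above:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-52a8aafe-7265-4427-89fe-dc51275cc470/reports" \
+  | jq '.ingestionStatsAndErrors.payload.rowStats.buildSegments.processed'
+```
+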
+
+## Task operations
+
+### Submit a task
+
+Submits a JSON-based ingestion spec or supervisor spec to the Overlord. It returns the task ID of the submitted task. For information on creating an ingestion spec, refer to the [ingestion spec reference](../ingestion/ingestion-spec.md).
+
+Note that for most batch ingestion use cases, you should use the [SQL-based ingestion API](./sql-ingestion-api.md) instead of JSON-based batch ingestion.
+
+#### URL
+
+`POST` `/druid/indexer/v1/task`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully submitted task*
+
+
+
+
+
+
+
+*Missing information in query*
+
+
+
+
+
+
+
+*Incorrect request body media type*
+
+
+
+
+
+
+
+*Unexpected token or characters in request body*
+
+
+
+
+---
+
+#### Sample request
+
+The following request is an example of submitting a task to create a datasource named `wikipedia_auto`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task" \
+--header 'Content-Type: application/json' \
+--data '{
+ "type" : "index_parallel",
+ "spec" : {
+ "dataSchema" : {
+ "dataSource" : "wikipedia_auto",
+ "timestampSpec": {
+ "column": "time",
+ "format": "iso"
+ },
+ "dimensionsSpec" : {
+ "useSchemaDiscovery": true
+ },
+ "metricsSpec" : [],
+ "granularitySpec" : {
+ "type" : "uniform",
+ "segmentGranularity" : "day",
+ "queryGranularity" : "none",
+ "intervals" : ["2015-09-12/2015-09-13"],
+ "rollup" : false
+ }
+ },
+ "ioConfig" : {
+ "type" : "index_parallel",
+ "inputSource" : {
+ "type" : "local",
+ "baseDir" : "quickstart/tutorial/",
+ "filter" : "wikiticker-2015-09-12-sampled.json.gz"
+ },
+ "inputFormat" : {
+ "type" : "json"
+ },
+ "appendToExisting" : false
+ },
+ "tuningConfig" : {
+ "type" : "index_parallel",
+ "maxRowsPerSegment" : 5000000,
+ "maxRowsInMemory" : 25000
+ }
+ }
+}'
+
+```
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/task HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 952
+
+{
+ "type" : "index_parallel",
+ "spec" : {
+ "dataSchema" : {
+ "dataSource" : "wikipedia_auto",
+ "timestampSpec": {
+ "column": "time",
+ "format": "iso"
+ },
+ "dimensionsSpec" : {
+ "useSchemaDiscovery": true
+ },
+ "metricsSpec" : [],
+ "granularitySpec" : {
+ "type" : "uniform",
+ "segmentGranularity" : "day",
+ "queryGranularity" : "none",
+ "intervals" : ["2015-09-12/2015-09-13"],
+ "rollup" : false
+ }
+ },
+ "ioConfig" : {
+ "type" : "index_parallel",
+ "inputSource" : {
+ "type" : "local",
+ "baseDir" : "quickstart/tutorial/",
+ "filter" : "wikiticker-2015-09-12-sampled.json.gz"
+ },
+ "inputFormat" : {
+ "type" : "json"
+ },
+ "appendToExisting" : false
+ },
+ "tuningConfig" : {
+ "type" : "index_parallel",
+ "maxRowsPerSegment" : 5000000,
+ "maxRowsInMemory" : 25000
+ }
+ }
+}
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "task": "index_parallel_wikipedia_odofhkle_2023-06-23T21:07:28.226Z"
+ }
+ ```
+
+
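+After submitting the task, you can poll its progress with the [task status endpoint](#get-task-status) described earlier, using the task ID returned in the response:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/index_parallel_wikipedia_odofhkle_2023-06-23T21:07:28.226Z/status"
+```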
+
+### Shut down a task
+
+Shuts down a task if it is not already complete. Returns a JSON object with the ID of the task that was shut down successfully.
+
+#### URL
+
+`POST` `/druid/indexer/v1/task/{taskId}/shutdown`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully shut down task*
+
+
+
+
+
+
+
+*Cannot find task with ID or task is no longer running*
+
+
+
+
+---
+
+#### Sample request
+
+The following request shows how to shut down a task with the ID `query-52a8aafe-7265-4427-89fe-dc51275cc470`.
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-52a8aafe-7265-4427-89fe-dc51275cc470/shutdown"
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/task/query-52a8aafe-7265-4427-89fe-dc51275cc470/shutdown HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "task": "query-577a83dd-a14e-4380-bd01-c942b781236b"
+ }
+ ```
+
+
+
+### Shut down all tasks for a datasource
+
+Shuts down all tasks for a specified datasource. If successful, it returns a JSON object with the name of the datasource whose tasks are shut down.
+
+#### URL
+
+`POST` `/druid/indexer/v1/datasources/{datasource}/shutdownAllTasks`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully shut down tasks*
+
+
+
+
+
+
+
+*Error or datasource does not have a running task*
+
+
+
+
+---
+
+#### Sample request
+
+The following request is an example of shutting down all tasks for datasource `wikipedia_auto`.
+
+
+
+
+
+
+```shell
+curl --request POST "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/datasources/wikipedia_auto/shutdownAllTasks"
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/datasources/wikipedia_auto/shutdownAllTasks HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "dataSource": "wikipedia_api"
+ }
+ ```
+
+
+
+## Task management
+
+### Retrieve status objects for tasks
+
+Retrieves a list of task status objects for the task IDs provided as a list of strings in the request body. It returns a set of JSON objects with the status, duration, location of each task, and any error messages.
+
+#### URL
+
+`POST` `/druid/indexer/v1/taskStatus`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully retrieved status objects*
+
+
+
+
+
+
+
+*Missing request body or incorrect request body type*
+
+
+
+
+---
+
+#### Sample request
+
+The following request is an example of retrieving status objects for the task IDs `index_parallel_wikipedia_auto_jndhkpbo_2023-06-26T17:23:05.308Z` and `index_parallel_wikipedia_auto_jbgiianh_2023-06-26T23:17:56.769Z`.
+
+
+
+
+
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/taskStatus" \
+--header 'Content-Type: application/json' \
+--data '["index_parallel_wikipedia_auto_jndhkpbo_2023-06-26T17:23:05.308Z","index_parallel_wikipedia_auto_jbgiianh_2023-06-26T23:17:56.769Z"]'
+```
+
+
+
+
+
+```HTTP
+POST /druid/indexer/v1/taskStatus HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+Content-Type: application/json
+Content-Length: 134
+
+["index_parallel_wikipedia_auto_jndhkpbo_2023-06-26T17:23:05.308Z", "index_parallel_wikipedia_auto_jbgiianh_2023-06-26T23:17:56.769Z"]
+```
+
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "index_parallel_wikipedia_auto_jbgiianh_2023-06-26T23:17:56.769Z": {
+ "id": "index_parallel_wikipedia_auto_jbgiianh_2023-06-26T23:17:56.769Z",
+ "status": "SUCCESS",
+ "duration": 10630,
+ "errorMsg": null,
+ "location": {
+ "host": "localhost",
+ "port": 8100,
+ "tlsPort": -1
+ }
+ },
+ "index_parallel_wikipedia_auto_jndhkpbo_2023-06-26T17:23:05.308Z": {
+ "id": "index_parallel_wikipedia_auto_jndhkpbo_2023-06-26T17:23:05.308Z",
+ "status": "SUCCESS",
+ "duration": 11012,
+ "errorMsg": null,
+ "location": {
+ "host": "localhost",
+ "port": 8100,
+ "tlsPort": -1
+ }
+ }
+ }
+ ```
+
+
+
+### Clean up pending segments for a datasource
+
+Manually cleans up the pending segments table in metadata storage for the specified datasource. It returns a JSON object with the property
+`numDeleted`, indicating the number of rows deleted from the pending segments table. The
+`druid.coordinator.kill.pendingSegments.on` [Coordinator setting](../configuration/index.md#data-management)
+automates this operation to run periodically.
+
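+For example, a minimal sketch of enabling the automated cleanup in the Coordinator's `runtime.properties`, assuming the boolean property named above, would be:
+
+```properties
+druid.coordinator.kill.pendingSegments.on=true
+```
+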
+#### URL
+
+`DELETE` `/druid/indexer/v1/pendingSegments/{datasource}`
+
+#### Responses
+
+
+
+
+
+
+
+
+*Successfully deleted pending segments*
+
+
+
+
+---
+
+#### Sample request
+
+The following request is an example of cleaning up pending segments for the `wikipedia_api` datasource.
+
+
+
+
+
+
+```shell
+curl --request DELETE "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/pendingSegments/wikipedia_api"
+```
+
+
+
+
+
+```HTTP
+DELETE /druid/indexer/v1/pendingSegments/wikipedia_api HTTP/1.1
+Host: http://ROUTER_IP:ROUTER_PORT
+```
+
+
+
+
+#### Sample response
+
+
+ View the response
+
+ ```json
+ {
+ "numDeleted": 2
+ }
+ ```
+
+
\ No newline at end of file
diff --git a/docs/35.0.0/assets/compaction-config.png b/docs/35.0.0/assets/compaction-config.png
new file mode 100644
index 0000000000..9dbcfefa80
Binary files /dev/null and b/docs/35.0.0/assets/compaction-config.png differ
diff --git a/docs/35.0.0/assets/datasources-action-button.png b/docs/35.0.0/assets/datasources-action-button.png
new file mode 100644
index 0000000000..6a52b8444d
Binary files /dev/null and b/docs/35.0.0/assets/datasources-action-button.png differ
diff --git a/docs/35.0.0/assets/druid-architecture.png b/docs/35.0.0/assets/druid-architecture.png
new file mode 100644
index 0000000000..954a87bc1b
Binary files /dev/null and b/docs/35.0.0/assets/druid-architecture.png differ
diff --git a/docs/35.0.0/assets/druid-architecture.svg b/docs/35.0.0/assets/druid-architecture.svg
new file mode 100644
index 0000000000..9d0e67188f
--- /dev/null
+++ b/docs/35.0.0/assets/druid-architecture.svg
@@ -0,0 +1,19 @@
+
+
\ No newline at end of file
diff --git a/docs/35.0.0/assets/druid-column-types.png b/docs/35.0.0/assets/druid-column-types.png
new file mode 100644
index 0000000000..9db56c0681
Binary files /dev/null and b/docs/35.0.0/assets/druid-column-types.png differ
diff --git a/docs/35.0.0/assets/druid-dataflow-2x.png b/docs/35.0.0/assets/druid-dataflow-2x.png
new file mode 100644
index 0000000000..ab1c583e43
Binary files /dev/null and b/docs/35.0.0/assets/druid-dataflow-2x.png differ
diff --git a/docs/35.0.0/assets/druid-dataflow-3.png b/docs/35.0.0/assets/druid-dataflow-3.png
new file mode 100644
index 0000000000..355215cbce
Binary files /dev/null and b/docs/35.0.0/assets/druid-dataflow-3.png differ
diff --git a/docs/35.0.0/assets/druid-manage-1.png b/docs/35.0.0/assets/druid-manage-1.png
new file mode 100644
index 0000000000..0d10c6e7bc
Binary files /dev/null and b/docs/35.0.0/assets/druid-manage-1.png differ
diff --git a/docs/35.0.0/assets/druid-timeline.png b/docs/35.0.0/assets/druid-timeline.png
new file mode 100644
index 0000000000..40380e2794
Binary files /dev/null and b/docs/35.0.0/assets/druid-timeline.png differ
diff --git a/docs/35.0.0/assets/files/kttm-kafka-supervisor.json b/docs/35.0.0/assets/files/kttm-kafka-supervisor.json
new file mode 100644
index 0000000000..2096f9c7cd
--- /dev/null
+++ b/docs/35.0.0/assets/files/kttm-kafka-supervisor.json
@@ -0,0 +1,66 @@
+{
+ "type": "kafka",
+ "spec": {
+ "ioConfig": {
+ "type": "kafka",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9092"
+ },
+ "topic": "kttm",
+ "inputFormat": {
+ "type": "json"
+ },
+ "useEarliestOffset": true
+ },
+ "tuningConfig": {
+ "type": "kafka"
+ },
+ "dataSchema": {
+ "dataSource": "kttm-kafka-supervisor-api",
+ "timestampSpec": {
+ "column": "timestamp",
+ "format": "iso"
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ "session",
+ "number",
+ "client_ip",
+ "language",
+ "adblock_list",
+ "app_version",
+ "path",
+ "loaded_image",
+ "referrer",
+ "referrer_host",
+ "server_ip",
+ "screen",
+ "window",
+ {
+ "type": "long",
+ "name": "session_length"
+ },
+ "timezone",
+ "timezone_offset",
+ {
+ "type": "json",
+ "name": "event"
+ },
+ {
+ "type": "json",
+ "name": "agent"
+ },
+ {
+ "type": "json",
+ "name": "geo_ip"
+ }
+ ]
+ },
+ "granularitySpec": {
+ "queryGranularity": "none",
+ "rollup": false,
+ "segmentGranularity": "day"
+ }
+ }
+ }
+}
\ No newline at end of file
diff --git a/docs/35.0.0/assets/indexing_service.png b/docs/35.0.0/assets/indexing_service.png
new file mode 100644
index 0000000000..a4462a413c
Binary files /dev/null and b/docs/35.0.0/assets/indexing_service.png differ
diff --git a/docs/35.0.0/assets/multi-stage-query/msq-ui-download-query-results.png b/docs/35.0.0/assets/multi-stage-query/msq-ui-download-query-results.png
new file mode 100644
index 0000000000..e428cb2dfd
Binary files /dev/null and b/docs/35.0.0/assets/multi-stage-query/msq-ui-download-query-results.png differ
diff --git a/docs/35.0.0/assets/multi-stage-query/tutorial-msq-convert.png b/docs/35.0.0/assets/multi-stage-query/tutorial-msq-convert.png
new file mode 100644
index 0000000000..f16941af67
Binary files /dev/null and b/docs/35.0.0/assets/multi-stage-query/tutorial-msq-convert.png differ
diff --git a/docs/35.0.0/assets/multi-stage-query/ui-annotated.png b/docs/35.0.0/assets/multi-stage-query/ui-annotated.png
new file mode 100644
index 0000000000..5a98c00d19
Binary files /dev/null and b/docs/35.0.0/assets/multi-stage-query/ui-annotated.png differ
diff --git a/docs/35.0.0/assets/multi-stage-query/ui-empty.png b/docs/35.0.0/assets/multi-stage-query/ui-empty.png
new file mode 100644
index 0000000000..7c30d5a671
Binary files /dev/null and b/docs/35.0.0/assets/multi-stage-query/ui-empty.png differ
diff --git a/docs/35.0.0/assets/native-queries-01.png b/docs/35.0.0/assets/native-queries-01.png
new file mode 100644
index 0000000000..27fd29b632
Binary files /dev/null and b/docs/35.0.0/assets/native-queries-01.png differ
diff --git a/docs/35.0.0/assets/nested-combined-json.png b/docs/35.0.0/assets/nested-combined-json.png
new file mode 100644
index 0000000000..f98bfcf538
Binary files /dev/null and b/docs/35.0.0/assets/nested-combined-json.png differ
diff --git a/docs/35.0.0/assets/nested-display-data-types.png b/docs/35.0.0/assets/nested-display-data-types.png
new file mode 100644
index 0000000000..2776068ee4
Binary files /dev/null and b/docs/35.0.0/assets/nested-display-data-types.png differ
diff --git a/docs/35.0.0/assets/nested-examine-schema.png b/docs/35.0.0/assets/nested-examine-schema.png
new file mode 100644
index 0000000000..11769a162a
Binary files /dev/null and b/docs/35.0.0/assets/nested-examine-schema.png differ
diff --git a/docs/35.0.0/assets/nested-extract-as-type.png b/docs/35.0.0/assets/nested-extract-as-type.png
new file mode 100644
index 0000000000..c54a5eeb62
Binary files /dev/null and b/docs/35.0.0/assets/nested-extract-as-type.png differ
diff --git a/docs/35.0.0/assets/nested-extract-elements.png b/docs/35.0.0/assets/nested-extract-elements.png
new file mode 100644
index 0000000000..9f7076b50d
Binary files /dev/null and b/docs/35.0.0/assets/nested-extract-elements.png differ
diff --git a/docs/35.0.0/assets/nested-group-aggregate.png b/docs/35.0.0/assets/nested-group-aggregate.png
new file mode 100644
index 0000000000..2d1907fe64
Binary files /dev/null and b/docs/35.0.0/assets/nested-group-aggregate.png differ
diff --git a/docs/35.0.0/assets/nested-msq-ingestion-transform.png b/docs/35.0.0/assets/nested-msq-ingestion-transform.png
new file mode 100644
index 0000000000..b46fde8593
Binary files /dev/null and b/docs/35.0.0/assets/nested-msq-ingestion-transform.png differ
diff --git a/docs/35.0.0/assets/nested-msq-ingestion.png b/docs/35.0.0/assets/nested-msq-ingestion.png
new file mode 100644
index 0000000000..0487ee1883
Binary files /dev/null and b/docs/35.0.0/assets/nested-msq-ingestion.png differ
diff --git a/docs/35.0.0/assets/nested-parse-deserialize.png b/docs/35.0.0/assets/nested-parse-deserialize.png
new file mode 100644
index 0000000000..881a67164b
Binary files /dev/null and b/docs/35.0.0/assets/nested-parse-deserialize.png differ
diff --git a/docs/35.0.0/assets/nested-retrieve-json.png b/docs/35.0.0/assets/nested-retrieve-json.png
new file mode 100644
index 0000000000..4f5fa0f969
Binary files /dev/null and b/docs/35.0.0/assets/nested-retrieve-json.png differ
diff --git a/docs/35.0.0/assets/nested-return-json.png b/docs/35.0.0/assets/nested-return-json.png
new file mode 100644
index 0000000000..9a67aaa71d
Binary files /dev/null and b/docs/35.0.0/assets/nested-return-json.png differ
diff --git a/docs/35.0.0/assets/retention-rules.png b/docs/35.0.0/assets/retention-rules.png
new file mode 100644
index 0000000000..59061d5511
Binary files /dev/null and b/docs/35.0.0/assets/retention-rules.png differ
diff --git a/docs/35.0.0/assets/security-model-1.png b/docs/35.0.0/assets/security-model-1.png
new file mode 100644
index 0000000000..55c7f24c54
Binary files /dev/null and b/docs/35.0.0/assets/security-model-1.png differ
diff --git a/docs/35.0.0/assets/security-model-2.png b/docs/35.0.0/assets/security-model-2.png
new file mode 100644
index 0000000000..dcb256bacc
Binary files /dev/null and b/docs/35.0.0/assets/security-model-2.png differ
diff --git a/docs/35.0.0/assets/segmentPropagation.png b/docs/35.0.0/assets/segmentPropagation.png
new file mode 100644
index 0000000000..e1ec82029e
Binary files /dev/null and b/docs/35.0.0/assets/segmentPropagation.png differ
diff --git a/docs/35.0.0/assets/services-overview.png b/docs/35.0.0/assets/services-overview.png
new file mode 100644
index 0000000000..157ce608e5
Binary files /dev/null and b/docs/35.0.0/assets/services-overview.png differ
diff --git a/docs/35.0.0/assets/set-query-context-insert-query.png b/docs/35.0.0/assets/set-query-context-insert-query.png
new file mode 100644
index 0000000000..d156597d2a
Binary files /dev/null and b/docs/35.0.0/assets/set-query-context-insert-query.png differ
diff --git a/docs/35.0.0/assets/set-query-context-open-context-dialog.png b/docs/35.0.0/assets/set-query-context-open-context-dialog.png
new file mode 100644
index 0000000000..765caa0d72
Binary files /dev/null and b/docs/35.0.0/assets/set-query-context-open-context-dialog.png differ
diff --git a/docs/35.0.0/assets/set-query-context-query-view.png b/docs/35.0.0/assets/set-query-context-query-view.png
new file mode 100644
index 0000000000..9d25d3c664
Binary files /dev/null and b/docs/35.0.0/assets/set-query-context-query-view.png differ
diff --git a/docs/35.0.0/assets/set-query-context-run-the-query.png b/docs/35.0.0/assets/set-query-context-run-the-query.png
new file mode 100644
index 0000000000..27f29f8390
Binary files /dev/null and b/docs/35.0.0/assets/set-query-context-run-the-query.png differ
diff --git a/docs/35.0.0/assets/set-query-context-set-context-parameters.png b/docs/35.0.0/assets/set-query-context-set-context-parameters.png
new file mode 100644
index 0000000000..17fa110501
Binary files /dev/null and b/docs/35.0.0/assets/set-query-context-set-context-parameters.png differ
diff --git a/docs/35.0.0/assets/spectator-histogram-size-comparison.png b/docs/35.0.0/assets/spectator-histogram-size-comparison.png
new file mode 100644
index 0000000000..306f45abd8
Binary files /dev/null and b/docs/35.0.0/assets/spectator-histogram-size-comparison.png differ
diff --git a/docs/35.0.0/assets/supervisor-actions.png b/docs/35.0.0/assets/supervisor-actions.png
new file mode 100644
index 0000000000..2797cf69ea
Binary files /dev/null and b/docs/35.0.0/assets/supervisor-actions.png differ
diff --git a/docs/35.0.0/assets/supervisor-info-dialog.png b/docs/35.0.0/assets/supervisor-info-dialog.png
new file mode 100644
index 0000000000..3be424a413
Binary files /dev/null and b/docs/35.0.0/assets/supervisor-info-dialog.png differ
diff --git a/docs/35.0.0/assets/supervisor-view.png b/docs/35.0.0/assets/supervisor-view.png
new file mode 100644
index 0000000000..e3100cdd3b
Binary files /dev/null and b/docs/35.0.0/assets/supervisor-view.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-00.png b/docs/35.0.0/assets/tutorial-batch-data-loader-00.png
new file mode 100644
index 0000000000..793b6c1232
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-00.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-01.png b/docs/35.0.0/assets/tutorial-batch-data-loader-01.png
new file mode 100644
index 0000000000..2ff1d6398b
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-01.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-015.png b/docs/35.0.0/assets/tutorial-batch-data-loader-015.png
new file mode 100644
index 0000000000..fd588caea4
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-015.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-02.png b/docs/35.0.0/assets/tutorial-batch-data-loader-02.png
new file mode 100644
index 0000000000..736188cb13
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-02.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-03.png b/docs/35.0.0/assets/tutorial-batch-data-loader-03.png
new file mode 100644
index 0000000000..74bb8c88fe
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-03.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-04.png b/docs/35.0.0/assets/tutorial-batch-data-loader-04.png
new file mode 100644
index 0000000000..e4237cda8a
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-04.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-05.png b/docs/35.0.0/assets/tutorial-batch-data-loader-05.png
new file mode 100644
index 0000000000..d245dde67a
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-05.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-06.png b/docs/35.0.0/assets/tutorial-batch-data-loader-06.png
new file mode 100644
index 0000000000..285fd57ba2
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-06.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-07.png b/docs/35.0.0/assets/tutorial-batch-data-loader-07.png
new file mode 100644
index 0000000000..481838d789
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-07.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-08.png b/docs/35.0.0/assets/tutorial-batch-data-loader-08.png
new file mode 100644
index 0000000000..b64c5a4e0d
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-08.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-09.png b/docs/35.0.0/assets/tutorial-batch-data-loader-09.png
new file mode 100644
index 0000000000..bec3085f67
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-09.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-10.png b/docs/35.0.0/assets/tutorial-batch-data-loader-10.png
new file mode 100644
index 0000000000..857a5a5c4f
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-10.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-11.png b/docs/35.0.0/assets/tutorial-batch-data-loader-11.png
new file mode 100644
index 0000000000..bf7e304b8a
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-11.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-data-loader-12.png b/docs/35.0.0/assets/tutorial-batch-data-loader-12.png
new file mode 100644
index 0000000000..f195b9ca50
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-data-loader-12.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-submit-task-01.png b/docs/35.0.0/assets/tutorial-batch-submit-task-01.png
new file mode 100644
index 0000000000..01b91427fc
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-submit-task-01.png differ
diff --git a/docs/35.0.0/assets/tutorial-batch-submit-task-02.png b/docs/35.0.0/assets/tutorial-batch-submit-task-02.png
new file mode 100644
index 0000000000..ba7caeb22c
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-batch-submit-task-02.png differ
diff --git a/docs/35.0.0/assets/tutorial-compaction-01.png b/docs/35.0.0/assets/tutorial-compaction-01.png
new file mode 100644
index 0000000000..aeb9bf36fc
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-compaction-01.png differ
diff --git a/docs/35.0.0/assets/tutorial-compaction-02.png b/docs/35.0.0/assets/tutorial-compaction-02.png
new file mode 100644
index 0000000000..836d8a7a7c
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-compaction-02.png differ
diff --git a/docs/35.0.0/assets/tutorial-compaction-03.png b/docs/35.0.0/assets/tutorial-compaction-03.png
new file mode 100644
index 0000000000..d51f8f8a8a
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-compaction-03.png differ
diff --git a/docs/35.0.0/assets/tutorial-compaction-04.png b/docs/35.0.0/assets/tutorial-compaction-04.png
new file mode 100644
index 0000000000..46c5b1d261
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-compaction-04.png differ
diff --git a/docs/35.0.0/assets/tutorial-compaction-05.png b/docs/35.0.0/assets/tutorial-compaction-05.png
new file mode 100644
index 0000000000..e692694aff
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-compaction-05.png differ
diff --git a/docs/35.0.0/assets/tutorial-compaction-06.png b/docs/35.0.0/assets/tutorial-compaction-06.png
new file mode 100644
index 0000000000..55c999f9d1
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-compaction-06.png differ
diff --git a/docs/35.0.0/assets/tutorial-compaction-07.png b/docs/35.0.0/assets/tutorial-compaction-07.png
new file mode 100644
index 0000000000..661e89784c
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-compaction-07.png differ
diff --git a/docs/35.0.0/assets/tutorial-compaction-08.png b/docs/35.0.0/assets/tutorial-compaction-08.png
new file mode 100644
index 0000000000..6e3f1aa037
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-compaction-08.png differ
diff --git a/docs/35.0.0/assets/tutorial-deletion-01.png b/docs/35.0.0/assets/tutorial-deletion-01.png
new file mode 100644
index 0000000000..942f057d7e
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-deletion-01.png differ
diff --git a/docs/35.0.0/assets/tutorial-deletion-02.png b/docs/35.0.0/assets/tutorial-deletion-02.png
new file mode 100644
index 0000000000..516fdf7fe8
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-deletion-02.png differ
diff --git a/docs/35.0.0/assets/tutorial-deletion-03.png b/docs/35.0.0/assets/tutorial-deletion-03.png
new file mode 100644
index 0000000000..666ff7a89e
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-deletion-03.png differ
diff --git a/docs/35.0.0/assets/tutorial-kafka-data-loader-01.png b/docs/35.0.0/assets/tutorial-kafka-data-loader-01.png
new file mode 100644
index 0000000000..7f8d0daacd
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-kafka-data-loader-01.png differ
diff --git a/docs/35.0.0/assets/tutorial-kafka-data-loader-02.png b/docs/35.0.0/assets/tutorial-kafka-data-loader-02.png
new file mode 100644
index 0000000000..8475eeba2b
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-kafka-data-loader-02.png differ
diff --git a/docs/35.0.0/assets/tutorial-kafka-data-loader-03.png b/docs/35.0.0/assets/tutorial-kafka-data-loader-03.png
new file mode 100644
index 0000000000..dc7400404f
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-kafka-data-loader-03.png differ
diff --git a/docs/35.0.0/assets/tutorial-kafka-data-loader-04.png b/docs/35.0.0/assets/tutorial-kafka-data-loader-04.png
new file mode 100644
index 0000000000..5703066959
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-kafka-data-loader-04.png differ
diff --git a/docs/35.0.0/assets/tutorial-kafka-data-loader-05.png b/docs/35.0.0/assets/tutorial-kafka-data-loader-05.png
new file mode 100644
index 0000000000..c920f05658
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-kafka-data-loader-05.png differ
diff --git a/docs/35.0.0/assets/tutorial-kafka-data-loader-06.png b/docs/35.0.0/assets/tutorial-kafka-data-loader-06.png
new file mode 100644
index 0000000000..4fb96dd47c
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-kafka-data-loader-06.png differ
diff --git a/docs/35.0.0/assets/tutorial-kafka-data-loader-07.png b/docs/35.0.0/assets/tutorial-kafka-data-loader-07.png
new file mode 100644
index 0000000000..b3013b735d
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-kafka-data-loader-07.png differ
diff --git a/docs/35.0.0/assets/tutorial-kafka-data-loader-08.png b/docs/35.0.0/assets/tutorial-kafka-data-loader-08.png
new file mode 100644
index 0000000000..b1cdd2df16
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-kafka-data-loader-08.png differ
diff --git a/docs/35.0.0/assets/tutorial-kafka-data-loader-09.png b/docs/35.0.0/assets/tutorial-kafka-data-loader-09.png
new file mode 100644
index 0000000000..e2045ac895
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-kafka-data-loader-09.png differ
diff --git a/docs/35.0.0/assets/tutorial-kafka-data-loader-10.png b/docs/35.0.0/assets/tutorial-kafka-data-loader-10.png
new file mode 100644
index 0000000000..39eaa3750a
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-kafka-data-loader-10.png differ
diff --git a/docs/35.0.0/assets/tutorial-kafka-data-loader-11.png b/docs/35.0.0/assets/tutorial-kafka-data-loader-11.png
new file mode 100644
index 0000000000..7bd3d9a25e
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-kafka-data-loader-11.png differ
diff --git a/docs/35.0.0/assets/tutorial-kafka-data-loader-12.png b/docs/35.0.0/assets/tutorial-kafka-data-loader-12.png
new file mode 100644
index 0000000000..ed952b135b
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-kafka-data-loader-12.png differ
diff --git a/docs/35.0.0/assets/tutorial-kafka-submit-supervisor-01.png b/docs/35.0.0/assets/tutorial-kafka-submit-supervisor-01.png
new file mode 100644
index 0000000000..809c0c6733
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-kafka-submit-supervisor-01.png differ
diff --git a/docs/35.0.0/assets/tutorial-query-01.png b/docs/35.0.0/assets/tutorial-query-01.png
new file mode 100644
index 0000000000..99354cbdfe
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-query-01.png differ
diff --git a/docs/35.0.0/assets/tutorial-query-02.png b/docs/35.0.0/assets/tutorial-query-02.png
new file mode 100644
index 0000000000..4d789f5989
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-query-02.png differ
diff --git a/docs/35.0.0/assets/tutorial-query-03.png b/docs/35.0.0/assets/tutorial-query-03.png
new file mode 100644
index 0000000000..841d36bfe8
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-query-03.png differ
diff --git a/docs/35.0.0/assets/tutorial-query-04.png b/docs/35.0.0/assets/tutorial-query-04.png
new file mode 100644
index 0000000000..7c713e367c
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-query-04.png differ
diff --git a/docs/35.0.0/assets/tutorial-query-05.png b/docs/35.0.0/assets/tutorial-query-05.png
new file mode 100644
index 0000000000..4b3d78d155
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-query-05.png differ
diff --git a/docs/35.0.0/assets/tutorial-query-06.png b/docs/35.0.0/assets/tutorial-query-06.png
new file mode 100644
index 0000000000..cb35a07871
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-query-06.png differ
diff --git a/docs/35.0.0/assets/tutorial-query-07.png b/docs/35.0.0/assets/tutorial-query-07.png
new file mode 100644
index 0000000000..aa94d629f8
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-query-07.png differ
diff --git a/docs/35.0.0/assets/tutorial-query-deepstorage-retention-rule.png b/docs/35.0.0/assets/tutorial-query-deepstorage-retention-rule.png
new file mode 100644
index 0000000000..9dee37bdea
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-query-deepstorage-retention-rule.png differ
diff --git a/docs/35.0.0/assets/tutorial-quickstart-01.png b/docs/35.0.0/assets/tutorial-quickstart-01.png
new file mode 100644
index 0000000000..649708b7c4
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-quickstart-01.png differ
diff --git a/docs/35.0.0/assets/tutorial-quickstart-02.png b/docs/35.0.0/assets/tutorial-quickstart-02.png
new file mode 100644
index 0000000000..5edec67c3f
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-quickstart-02.png differ
diff --git a/docs/35.0.0/assets/tutorial-quickstart-03.png b/docs/35.0.0/assets/tutorial-quickstart-03.png
new file mode 100644
index 0000000000..917f25d040
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-quickstart-03.png differ
diff --git a/docs/35.0.0/assets/tutorial-quickstart-04.png b/docs/35.0.0/assets/tutorial-quickstart-04.png
new file mode 100644
index 0000000000..e847ef550c
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-quickstart-04.png differ
diff --git a/docs/35.0.0/assets/tutorial-quickstart-05.png b/docs/35.0.0/assets/tutorial-quickstart-05.png
new file mode 100644
index 0000000000..da3ed0dfa6
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-quickstart-05.png differ
diff --git a/docs/35.0.0/assets/tutorial-retention-00.png b/docs/35.0.0/assets/tutorial-retention-00.png
new file mode 100644
index 0000000000..a3f84a9fe6
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-retention-00.png differ
diff --git a/docs/35.0.0/assets/tutorial-retention-01.png b/docs/35.0.0/assets/tutorial-retention-01.png
new file mode 100644
index 0000000000..35a97c2626
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-retention-01.png differ
diff --git a/docs/35.0.0/assets/tutorial-retention-02.png b/docs/35.0.0/assets/tutorial-retention-02.png
new file mode 100644
index 0000000000..f38fad0d27
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-retention-02.png differ
diff --git a/docs/35.0.0/assets/tutorial-retention-03.png b/docs/35.0.0/assets/tutorial-retention-03.png
new file mode 100644
index 0000000000..256836a2d4
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-retention-03.png differ
diff --git a/docs/35.0.0/assets/tutorial-retention-04.png b/docs/35.0.0/assets/tutorial-retention-04.png
new file mode 100644
index 0000000000..d39495f87d
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-retention-04.png differ
diff --git a/docs/35.0.0/assets/tutorial-retention-05.png b/docs/35.0.0/assets/tutorial-retention-05.png
new file mode 100644
index 0000000000..638a752fac
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-retention-05.png differ
diff --git a/docs/35.0.0/assets/tutorial-retention-06.png b/docs/35.0.0/assets/tutorial-retention-06.png
new file mode 100644
index 0000000000..f47cbffbb1
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-retention-06.png differ
diff --git a/docs/35.0.0/assets/tutorial-sql-aggregate-query.png b/docs/35.0.0/assets/tutorial-sql-aggregate-query.png
new file mode 100644
index 0000000000..0ffbff60e0
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-sql-aggregate-query.png differ
diff --git a/docs/35.0.0/assets/tutorial-sql-auto-queries.png b/docs/35.0.0/assets/tutorial-sql-auto-queries.png
new file mode 100644
index 0000000000..dc04a8de6f
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-sql-auto-queries.png differ
diff --git a/docs/35.0.0/assets/tutorial-sql-count-distinct-help.png b/docs/35.0.0/assets/tutorial-sql-count-distinct-help.png
new file mode 100644
index 0000000000..5327972d2a
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-sql-count-distinct-help.png differ
diff --git a/docs/35.0.0/assets/tutorial-sql-count-distinct.png b/docs/35.0.0/assets/tutorial-sql-count-distinct.png
new file mode 100644
index 0000000000..5fb9b2ae0b
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-sql-count-distinct.png differ
diff --git a/docs/35.0.0/assets/tutorial-sql-demo-queries.png b/docs/35.0.0/assets/tutorial-sql-demo-queries.png
new file mode 100644
index 0000000000..16fc040a67
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-sql-demo-queries.png differ
diff --git a/docs/35.0.0/assets/tutorial-sql-query-plan.png b/docs/35.0.0/assets/tutorial-sql-query-plan.png
new file mode 100644
index 0000000000..03f3c3cc6e
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-sql-query-plan.png differ
diff --git a/docs/35.0.0/assets/tutorial-sql-result-column-actions.png b/docs/35.0.0/assets/tutorial-sql-result-column-actions.png
new file mode 100644
index 0000000000..16518d4bff
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-sql-result-column-actions.png differ
diff --git a/docs/35.0.0/assets/tutorial-theta-01.png b/docs/35.0.0/assets/tutorial-theta-01.png
new file mode 100644
index 0000000000..2411fbf194
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-theta-01.png differ
diff --git a/docs/35.0.0/assets/tutorial-theta-02.png b/docs/35.0.0/assets/tutorial-theta-02.png
new file mode 100644
index 0000000000..ce849fd36a
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-theta-02.png differ
diff --git a/docs/35.0.0/assets/tutorial-theta-03.png b/docs/35.0.0/assets/tutorial-theta-03.png
new file mode 100644
index 0000000000..316bf7f0b0
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-theta-03.png differ
diff --git a/docs/35.0.0/assets/tutorial-theta-04.png b/docs/35.0.0/assets/tutorial-theta-04.png
new file mode 100644
index 0000000000..21f383af6d
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-theta-04.png differ
diff --git a/docs/35.0.0/assets/tutorial-theta-05.png b/docs/35.0.0/assets/tutorial-theta-05.png
new file mode 100644
index 0000000000..ec2c8df6d3
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-theta-05.png differ
diff --git a/docs/35.0.0/assets/tutorial-theta-06.png b/docs/35.0.0/assets/tutorial-theta-06.png
new file mode 100644
index 0000000000..4048aa2389
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-theta-06.png differ
diff --git a/docs/35.0.0/assets/tutorial-theta-07.png b/docs/35.0.0/assets/tutorial-theta-07.png
new file mode 100644
index 0000000000..369b5914ad
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-theta-07.png differ
diff --git a/docs/35.0.0/assets/tutorial-theta-08.png b/docs/35.0.0/assets/tutorial-theta-08.png
new file mode 100644
index 0000000000..59a6bc051e
Binary files /dev/null and b/docs/35.0.0/assets/tutorial-theta-08.png differ
diff --git a/docs/35.0.0/assets/web-console-0.7-tasks.png b/docs/35.0.0/assets/web-console-0.7-tasks.png
new file mode 100644
index 0000000000..80080ba8ed
Binary files /dev/null and b/docs/35.0.0/assets/web-console-0.7-tasks.png differ
diff --git a/docs/35.0.0/assets/web-console-01-home-view.png b/docs/35.0.0/assets/web-console-01-home-view.png
new file mode 100644
index 0000000000..39b6e8a1a6
Binary files /dev/null and b/docs/35.0.0/assets/web-console-01-home-view.png differ
diff --git a/docs/35.0.0/assets/web-console-02-data-loader-1.png b/docs/35.0.0/assets/web-console-02-data-loader-1.png
new file mode 100644
index 0000000000..ecd18c01f9
Binary files /dev/null and b/docs/35.0.0/assets/web-console-02-data-loader-1.png differ
diff --git a/docs/35.0.0/assets/web-console-03-data-loader-2.png b/docs/35.0.0/assets/web-console-03-data-loader-2.png
new file mode 100644
index 0000000000..bfb7be59cf
Binary files /dev/null and b/docs/35.0.0/assets/web-console-03-data-loader-2.png differ
diff --git a/docs/35.0.0/assets/web-console-04-datasources.png b/docs/35.0.0/assets/web-console-04-datasources.png
new file mode 100644
index 0000000000..fab3cec452
Binary files /dev/null and b/docs/35.0.0/assets/web-console-04-datasources.png differ
diff --git a/docs/35.0.0/assets/web-console-05-retention.png b/docs/35.0.0/assets/web-console-05-retention.png
new file mode 100644
index 0000000000..96278525a8
Binary files /dev/null and b/docs/35.0.0/assets/web-console-05-retention.png differ
diff --git a/docs/35.0.0/assets/web-console-06-segments.png b/docs/35.0.0/assets/web-console-06-segments.png
new file mode 100644
index 0000000000..9e9e9ab985
Binary files /dev/null and b/docs/35.0.0/assets/web-console-06-segments.png differ
diff --git a/docs/35.0.0/assets/web-console-07-supervisors.png b/docs/35.0.0/assets/web-console-07-supervisors.png
new file mode 100644
index 0000000000..70391bd642
Binary files /dev/null and b/docs/35.0.0/assets/web-console-07-supervisors.png differ
diff --git a/docs/35.0.0/assets/web-console-08-supervisor-status.png b/docs/35.0.0/assets/web-console-08-supervisor-status.png
new file mode 100644
index 0000000000..1bcfccdfe6
Binary files /dev/null and b/docs/35.0.0/assets/web-console-08-supervisor-status.png differ
diff --git a/docs/35.0.0/assets/web-console-09-task-status.png b/docs/35.0.0/assets/web-console-09-task-status.png
new file mode 100644
index 0000000000..100e8ada0e
Binary files /dev/null and b/docs/35.0.0/assets/web-console-09-task-status.png differ
diff --git a/docs/35.0.0/assets/web-console-10-servers.png b/docs/35.0.0/assets/web-console-10-servers.png
new file mode 100644
index 0000000000..a3e0084e12
Binary files /dev/null and b/docs/35.0.0/assets/web-console-10-servers.png differ
diff --git a/docs/35.0.0/assets/web-console-11-query-sql.png b/docs/35.0.0/assets/web-console-11-query-sql.png
new file mode 100644
index 0000000000..a144774f46
Binary files /dev/null and b/docs/35.0.0/assets/web-console-11-query-sql.png differ
diff --git a/docs/35.0.0/assets/web-console-12-query-rune.png b/docs/35.0.0/assets/web-console-12-query-rune.png
new file mode 100644
index 0000000000..8c5e270562
Binary files /dev/null and b/docs/35.0.0/assets/web-console-12-query-rune.png differ
diff --git a/docs/35.0.0/assets/web-console-13-lookups.png b/docs/35.0.0/assets/web-console-13-lookups.png
new file mode 100644
index 0000000000..fa0bd0b060
Binary files /dev/null and b/docs/35.0.0/assets/web-console-13-lookups.png differ
diff --git a/docs/35.0.0/comparisons/druid-vs-elasticsearch.md b/docs/35.0.0/comparisons/druid-vs-elasticsearch.md
new file mode 100644
index 0000000000..82752aa7ad
--- /dev/null
+++ b/docs/35.0.0/comparisons/druid-vs-elasticsearch.md
@@ -0,0 +1,39 @@
+---
+id: druid-vs-elasticsearch
+title: "Apache Druid vs Elasticsearch"
+---
+
+
+
+
+We are not experts on search systems. If anything is incorrect about our portrayal, please let us know on the mailing list or through other means.
+
+Elasticsearch is a search system based on Apache Lucene. It provides full text search for schema-free documents
+and provides access to raw event-level data. Elasticsearch is increasingly adding support for analytics and aggregations.
+[Some members of the community](https://groups.google.com/forum/#!msg/druid-development/nlpwTHNclj8/sOuWlKOzPpYJ) have pointed out
+that the resource requirements for data ingestion and aggregation in Elasticsearch are much higher than those of Druid.
+
+Elasticsearch also does not support data summarization/roll-up at ingestion time, which can compact the data that needs to be
+stored up to 100x with real-world data sets. This leads to Elasticsearch having greater storage requirements.
+
+Druid focuses on OLAP work flows. Druid is optimized for high performance (fast aggregation and ingestion) at low cost,
+and supports a wide range of analytic operations. Druid has some basic search support for structured event data, but does not support
+full text search. Druid also does not support completely unstructured data. Measures must be defined in a Druid schema such that
+summarization/roll-up can be done.
diff --git a/docs/35.0.0/comparisons/druid-vs-key-value.md b/docs/35.0.0/comparisons/druid-vs-key-value.md
new file mode 100644
index 0000000000..57f3dec66d
--- /dev/null
+++ b/docs/35.0.0/comparisons/druid-vs-key-value.md
@@ -0,0 +1,46 @@
+---
+id: druid-vs-key-value
+title: "Apache Druid vs. Key/Value Stores (HBase/Cassandra/OpenTSDB)"
+---
+
+
+
+
+Druid is highly optimized for scans and aggregations, and it supports arbitrarily deep drill-downs into data sets. This same functionality
+is supported in key/value stores in two ways:
+
+1. Pre-compute all permutations of possible user queries
+2. Range scans on event data
+
+When pre-computing results, the key is the exact parameters of the query, and the value is the result of the query.
+The queries return extremely quickly, but at the cost of flexibility, as ad-hoc exploratory queries are not possible with
+pre-computing every possible query permutation. Pre-computing all permutations of all ad-hoc queries leads to result sets
+that grow exponentially with the number of columns of a data set, and pre-computing queries for complex real-world data sets
+can require hours of pre-processing time.
+
+The other approach to using key/value stores for aggregations is to use the dimensions of an event as the key and the event measures as the value.
+Aggregations are done by issuing range scans on this data. Timeseries specific databases such as OpenTSDB use this approach.
+One of the limitations here is that the key/value storage model does not have indexes for any kind of filtering other than prefix ranges,
+which can be used to filter a query down to a metric and time range, but cannot resolve complex predicates to narrow the exact data to scan.
+When the number of rows to scan gets large, this limitation can greatly reduce performance. It is also harder to achieve good
+locality with key/value stores because most don’t support pushing down aggregates to the storage layer.
+
+For arbitrary exploration of data (flexible data filtering), Druid's custom column format enables ad-hoc queries without pre-computation. The format
+also enables fast scans on columns, which is important for good aggregation performance.
diff --git a/docs/35.0.0/comparisons/druid-vs-kudu.md b/docs/35.0.0/comparisons/druid-vs-kudu.md
new file mode 100644
index 0000000000..b992a1633d
--- /dev/null
+++ b/docs/35.0.0/comparisons/druid-vs-kudu.md
@@ -0,0 +1,39 @@
+---
+id: druid-vs-kudu
+title: "Apache Druid vs Kudu"
+---
+
+
+
+
+Kudu's storage format enables single-row updates, whereas updating existing Druid segments requires recreating the segment, so theoretically
+updating old values should be higher latency in Druid. However, Kudu's requirements for maintaining extra headroom to store
+updates, as well as organizing data by id instead of time, have the potential to introduce extra latency and to read
+data that is not needed to answer a query at query time.
+
+Druid summarizes/rolls up data at ingestion time, which in practice significantly reduces the raw data that needs to be
+stored (up to 40 times on average) and significantly increases the performance of scanning raw data.
+Druid segments also contain bitmap indexes for fast filtering, which Kudu does not currently support.
+Druid's segment architecture is heavily geared towards fast aggregates and filters, and for OLAP workflows. Appends are very
+fast in Druid, whereas updates of older data are higher latency. This is by design as the data Druid is good for is typically event data,
+and does not need to be updated too frequently. Kudu supports arbitrary primary keys with uniqueness constraints, and
+efficient lookup by ranges of those keys. Kudu chooses not to include an execution engine, but supports sufficient
+operations to allow node-local processing by external execution engines. This means that Kudu can support multiple frameworks on the same data (e.g., MR, Spark, and SQL).
+Druid includes its own query layer that allows it to push down aggregations and computations directly to data processes for faster query processing.
diff --git a/docs/35.0.0/comparisons/druid-vs-redshift.md b/docs/35.0.0/comparisons/druid-vs-redshift.md
new file mode 100644
index 0000000000..3e2c7b9ead
--- /dev/null
+++ b/docs/35.0.0/comparisons/druid-vs-redshift.md
@@ -0,0 +1,62 @@
+---
+id: druid-vs-redshift
+title: "Apache Druid vs Redshift"
+---
+
+
+
+
+### How does Druid compare to Redshift?
+
+To draw a distinction, Redshift started out as ParAccel (Actian), which Amazon licensed and has since heavily modified.
+
+Aside from potential performance differences, there are some functional differences:
+
+### Real-time data ingestion
+
+Because Druid is optimized to provide insight against massive quantities of streaming data, it is able to load and aggregate data in real time.
+
+Generally, traditional data warehouses, including column stores, work only with batch ingestion and are not optimal for regularly streaming data in.
+
+### Druid is a read oriented analytical data store
+
+Druid’s write semantics are not as fluid, and it does not support full joins (we support large-table-to-small-table joins). Redshift provides full SQL support, including joins and insert/update statements.
+
+### Data distribution model
+
+Druid’s data distribution is segment-based and leverages a highly available "deep" storage such as S3 or HDFS. Scaling up (or down) does not require massive copy actions or downtime; in fact, losing any number of Historical processes does not result in data loss because new Historical processes can always be brought up by reading data from "deep" storage.
+
+To contrast, ParAccel’s data distribution model is hash-based. Expanding the cluster requires re-hashing the data across the nodes, making it difficult to perform without taking downtime. Amazon’s Redshift works around this issue with a multi-step process:
+
+* set cluster into read-only mode
+* copy data from cluster to new cluster that exists in parallel
+* redirect traffic to new cluster
+
+### Replication strategy
+
+Druid employs segment-level data distribution meaning that more processes can be added and rebalanced without having to perform a staged swap. The replication strategy also makes all replicas available for querying. Replication is done automatically and without any impact to performance.
+
+ParAccel’s hash-based distribution generally means that replication is conducted via hot spares. This puts a numerical limit on the number of nodes you can lose without losing data, and this replication strategy often does not allow the hot spare to help share query load.
+
+### Indexing strategy
+
+Along with column oriented structures, Druid uses indexing structures to speed up query execution when a filter is provided. Indexing structures do increase storage overhead (and make it more difficult to allow for mutation), but they also significantly speed up queries.
+
+ParAccel does not appear to employ indexing strategies.
diff --git a/docs/35.0.0/comparisons/druid-vs-spark.md b/docs/35.0.0/comparisons/druid-vs-spark.md
new file mode 100644
index 0000000000..4d3a6b43da
--- /dev/null
+++ b/docs/35.0.0/comparisons/druid-vs-spark.md
@@ -0,0 +1,42 @@
+---
+id: druid-vs-spark
+title: "Apache Druid vs Spark"
+---
+
+
+
+
+Druid and Spark are complementary solutions as Druid can be used to accelerate OLAP queries in Spark.
+
+Spark is a general cluster computing framework initially designed around the concept of Resilient Distributed Datasets (RDDs).
+RDDs enable data reuse by persisting intermediate results
+in memory and enable Spark to provide fast computations for iterative algorithms.
+This is especially beneficial for certain work flows such as machine
+learning, where the same operation may be applied over and over
+again until some result is converged upon. The generality of Spark makes it very suitable as an engine to process (clean or transform) data.
+Although Spark provides the ability to query data through Spark SQL, much like Hadoop, the query latencies are not specifically targeted to be interactive (sub-second).
+
+Druid focuses on extremely low latency queries and is ideal for powering applications used by thousands of users, where each query must
+return fast enough that users can interactively explore the data. Druid fully indexes all data and can act as a middle layer between Spark and your application.
+One typical setup seen in production is to process data in Spark, and load the processed data into Druid for faster access.
+
+For more information about using Druid and Spark together, including benchmarks of the two systems, please see:
+
+https://www.linkedin.com/pulse/combining-druid-spark-interactive-flexible-analytics-scale-butani
diff --git a/docs/35.0.0/comparisons/druid-vs-sql-on-hadoop.md b/docs/35.0.0/comparisons/druid-vs-sql-on-hadoop.md
new file mode 100644
index 0000000000..00e4473125
--- /dev/null
+++ b/docs/35.0.0/comparisons/druid-vs-sql-on-hadoop.md
@@ -0,0 +1,82 @@
+---
+id: druid-vs-sql-on-hadoop
+title: "Apache Druid vs SQL-on-Hadoop"
+---
+
+
+
+
+SQL-on-Hadoop engines provide an
+execution engine for various data formats and data stores, and
+many can be made to push computations down to Druid, while providing a SQL interface to Druid.
+
+For a direct comparison between the technologies, and for deciding when to use only one or the other, it basically comes down to your
+product requirements and what the systems were designed to do.
+
+Druid was designed to
+
+1. be an always on service
+1. ingest data in real-time
+1. handle slice-n-dice style ad-hoc queries
+
+SQL-on-Hadoop engines generally sidestep Map/Reduce, instead querying data directly from HDFS or, in some cases, other storage systems.
+Some of these engines (including Impala and Presto) can be co-located with HDFS data nodes and coordinate with them to achieve data locality for queries.
+What does this mean? We can talk about it in terms of three general areas:
+
+1. Queries
+1. Data Ingestion
+1. Query Flexibility
+
+### Queries
+
+Druid segments store data in a custom column format. Segments are scanned directly as part of queries, and each Druid server
+calculates a set of results that are eventually merged at the Broker level. This means the data transferred between servers
+consists of queries and results, and all computation is done internally within the Druid servers.
+
+Most SQL-on-Hadoop engines are responsible for query planning and execution for underlying storage layers and storage formats.
+They are processes that stay on even if there is no query running (eliminating the JVM startup costs from Hadoop MapReduce).
+Some (Impala/Presto) SQL-on-Hadoop engines have daemon processes that can be run where the data is stored, virtually eliminating network transfer costs. There is still
+some latency overhead (e.g. serialization/deserialization time) associated with pulling data from the underlying storage layer into the computation layer. We are unaware of exactly
+how much of a performance impact this makes.
+
+### Data Ingestion
+
+Druid is built to allow for real-time ingestion of data. You can ingest data and query it immediately upon ingestion;
+the latency for an event to be reflected in the data is dominated by how long it takes to deliver the event to Druid.
+
+SQL-on-Hadoop engines, being based on data in HDFS or some other backing store, are limited in their data ingestion rates by the
+rate at which that backing store can make data available. Generally, the backing store is the biggest bottleneck for
+how quickly data can become available.
+
+### Query Flexibility
+
+Druid's query language is fairly low level and maps to how Druid operates internally. Although Druid can be combined with a high level query
+planner to support most SQL queries and analytic SQL queries (minus joins among large tables),
+base Druid is less flexible than SQL-on-Hadoop solutions for generic processing.
+
+SQL-on-Hadoop engines support SQL-style queries with full joins.
+
+## Druid vs Parquet
+
+Parquet is a column storage format that is designed to work with SQL-on-Hadoop engines. Parquet doesn't have a query execution engine, and instead
+relies on external sources to pull data out of it.
+
+Druid's storage format is highly optimized for linear scans. Although Druid has support for nested data, Parquet's storage format is much
+more hierarchical and is designed more for binary chunking. In theory, this should lead to faster scans in Druid.
diff --git a/docs/35.0.0/configuration/extensions.md b/docs/35.0.0/configuration/extensions.md
new file mode 100644
index 0000000000..ae8d5987d2
--- /dev/null
+++ b/docs/35.0.0/configuration/extensions.md
@@ -0,0 +1,178 @@
+---
+id: extensions
+title: "Extensions"
+---
+
+
+
+Druid implements an extension system that allows for adding functionality at runtime. Extensions
+are commonly used to add support for deep storages (like HDFS and S3), metadata stores (like MySQL
+and PostgreSQL), new aggregators, new input formats, and so on.
+
+Production clusters will generally use at least two extensions: one for deep storage and one for a
+metadata store. Many clusters will also use additional extensions.
+
+## Core extensions
+
+Core extensions are maintained by Druid committers.
+
+|Name|Description|Docs|
+|----|-----------|----|
+|druid-avro-extensions|Support for data in Apache Avro data format.|[link](../development/extensions-core/avro.md)|
+|druid-azure-extensions|Microsoft Azure deep storage.|[link](../development/extensions-core/azure.md)|
+|druid-basic-security|Support for Basic HTTP authentication and role-based access control.|[link](../development/extensions-core/druid-basic-security.md)|
+|druid-bloom-filter|Support for providing Bloom filters in druid queries.|[link](../development/extensions-core/bloom-filter.md)|
+|druid-catalog|This extension allows users to configure, update, retrieve, and manage metadata stored in Druid's catalog. |[link](../development/extensions-core/catalog.md)|
+|druid-datasketches|Support for approximate counts and set operations with [Apache DataSketches](https://datasketches.apache.org/).|[link](../development/extensions-core/datasketches-extension.md)|
+|druid-google-extensions|Google Cloud Storage deep storage.|[link](../development/extensions-core/google.md)|
+|druid-hdfs-storage|HDFS deep storage.|[link](../development/extensions-core/hdfs.md)|
+|druid-histogram|Approximate histograms and quantiles aggregator. Deprecated, please use the [DataSketches quantiles aggregator](../development/extensions-core/datasketches-quantiles.md) from the `druid-datasketches` extension instead.|[link](../development/extensions-core/approximate-histograms.md)|
+|druid-kafka-extraction-namespace|Apache Kafka-based namespaced lookup. Requires namespace lookup extension.|[link](../querying/kafka-extraction-namespace.md)|
+|druid-kafka-indexing-service|Supervised exactly-once Apache Kafka ingestion for the indexing service.|[link](../ingestion/kafka-ingestion.md)|
+|druid-kinesis-indexing-service|Supervised exactly-once Kinesis ingestion for the indexing service.|[link](../ingestion/kinesis-ingestion.md)|
+|druid-kerberos|Kerberos authentication for druid processes.|[link](../development/extensions-core/druid-kerberos.md)|
+|druid-lookups-cached-global|A module for [lookups](../querying/lookups.md) providing a jvm-global eager caching for lookups. It provides JDBC and URI implementations for fetching lookup data.|[link](../querying/lookups-cached-global.md)|
+|druid-lookups-cached-single|Per-lookup caching module to support use cases where a lookup needs to be isolated from the global pool of lookups.|[link](../development/extensions-core/druid-lookups.md)|
+|druid-multi-stage-query| Support for the multi-stage query architecture for Apache Druid and the multi-stage query task engine.|[link](../multi-stage-query/index.md)|
+|druid-orc-extensions|Support for data in Apache ORC data format.|[link](../development/extensions-core/orc.md)|
+|druid-parquet-extensions|Support for data in Apache Parquet data format. Requires druid-avro-extensions to be loaded.|[link](../development/extensions-core/parquet.md)|
+|druid-protobuf-extensions| Support for data in Protobuf data format.|[link](../development/extensions-core/protobuf.md)|
+|druid-s3-extensions|Interfacing with data in Amazon S3, and using S3 as deep storage.|[link](../development/extensions-core/s3.md)|
+|druid-ec2-extensions|Interfacing with AWS EC2 for autoscaling middle managers|UNDOCUMENTED|
+|druid-aws-rds-extensions|Support for AWS token based access to AWS RDS DB Cluster.|[link](../development/extensions-core/druid-aws-rds.md)|
+|druid-stats|Statistics related module including variance and standard deviation.|[link](../development/extensions-core/stats.md)|
+|mysql-metadata-storage|MySQL metadata store.|[link](../development/extensions-core/mysql.md)|
+|postgresql-metadata-storage|PostgreSQL metadata store.|[link](../development/extensions-core/postgresql.md)|
+|simple-client-sslcontext|Simple SSLContext provider module to be used by Druid's internal HttpClient when talking to other Druid processes over HTTPS.|[link](../development/extensions-core/simple-client-sslcontext.md)|
+|druid-pac4j|OpenID Connect authentication for druid processes.|[link](../development/extensions-core/druid-pac4j.md)|
+|druid-kubernetes-extensions|Druid cluster deployment on Kubernetes without Zookeeper.|[link](../development/extensions-core/kubernetes.md)|
+|druid-kubernetes-overlord-extensions|Support for launching tasks in k8s without Middle Managers|[link](../development/extensions-core/k8s-jobs.md)|
+
+## Community extensions
+
+:::info
+ Community extensions are not maintained by Druid committers, although we accept patches from community members using these extensions. They may not have been as extensively tested as the core extensions.
+:::
+
+A number of community members have contributed their own extensions to Druid that are not packaged with the default Druid tarball.
+If you'd like to take on maintenance for a community extension, please post on [dev@druid.apache.org](https://lists.apache.org/list.html?dev@druid.apache.org) to let us know!
+
+All of these community extensions can be downloaded using [pull-deps](../operations/pull-deps.md) while specifying a `-c` coordinate option to pull `org.apache.druid.extensions.contrib:{EXTENSION_NAME}:{DRUID_VERSION}`.
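+
+For example, the following sketch pulls the `druid-redis-cache` community extension using that coordinate pattern; the version shown assumes Druid 35.0.0, so substitute the extension and version for your deployment. The general pull-deps invocation also appears in full under Loading community extensions below.
+
+```shell
+java \
+  -cp "lib/*" \
+  -Ddruid.extensions.directory="extensions" \
+  -Ddruid.extensions.hadoopDependenciesDir="hadoop-dependencies" \
+  org.apache.druid.cli.Main tools pull-deps \
+  --no-default-hadoop \
+  -c "org.apache.druid.extensions.contrib:druid-redis-cache:35.0.0"
+```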
+
+|Name|Description|Docs|
+|----|-----------|----|
+|aliyun-oss-extensions|Aliyun OSS deep storage |[link](../development/extensions-contrib/aliyun-oss-extensions.md)|
+|ambari-metrics-emitter|Ambari Metrics Emitter |[link](../development/extensions-contrib/ambari-metrics-emitter.md)|
+|druid-cassandra-storage|Apache Cassandra deep storage.|[link](../development/extensions-contrib/cassandra.md)|
+|druid-cloudfiles-extensions|Rackspace Cloudfiles deep storage.|[link](../development/extensions-contrib/cloudfiles.md)|
+|druid-compressed-bigdecimal|Compressed Big Decimal Type | [link](../development/extensions-contrib/compressed-big-decimal.md)|
+|druid-ddsketch|Support for DDSketch approximate quantiles based on [DDSketch](https://github.com/datadog/sketches-java) | [link](../development/extensions-contrib/ddsketch-quantiles.md)|
+|druid-deltalake-extensions|Support for ingesting Delta Lake tables.|[link](../development/extensions-contrib/delta-lake.md)|
+|druid-distinctcount|DistinctCount aggregator|[link](../development/extensions-contrib/distinctcount.md)|
+|druid-exact-count-bitmap|Support for exact cardinality counting using Roaring Bitmap over a Long column.|[link](../development/extensions-contrib/druid-exact-count-bitmap.md)|
+|druid-iceberg-extensions|Support for ingesting Iceberg tables.|[link](../development/extensions-contrib/iceberg.md)|
+|druid-redis-cache|A cache implementation for Druid based on Redis.|[link](../development/extensions-contrib/redis-cache.md)|
+|druid-time-min-max|Min/Max aggregator for timestamp.|[link](../development/extensions-contrib/time-min-max.md)|
+|sqlserver-metadata-storage|Microsoft SQLServer metadata store.|[link](../development/extensions-contrib/sqlserver.md)|
+|graphite-emitter|Graphite metrics emitter|[link](../development/extensions-contrib/graphite.md)|
+|statsd-emitter|StatsD metrics emitter|[link](../development/extensions-contrib/statsd.md)|
+|kafka-emitter|Kafka metrics emitter|[link](../development/extensions-contrib/kafka-emitter.md)|
+|druid-thrift-extensions|Support thrift ingestion |[link](../development/extensions-contrib/thrift.md)|
+|druid-opentsdb-emitter|OpenTSDB metrics emitter |[link](../development/extensions-contrib/opentsdb-emitter.md)|
+|materialized-view-selection, materialized-view-maintenance|Materialized View|[link](../development/extensions-contrib/materialized-view.md)|
+|druid-moving-average-query|Support for [Moving Average](https://en.wikipedia.org/wiki/Moving_average) and other Aggregate [Window Functions](https://en.wikibooks.org/wiki/Structured_Query_Language/Window_functions) in Druid queries.|[link](../development/extensions-contrib/moving-average-query.md)|
+|druid-influxdb-emitter|InfluxDB metrics emitter|[link](../development/extensions-contrib/influxdb-emitter.md)|
+|druid-momentsketch|Support for approximate quantile queries using the [momentsketch](https://github.com/stanford-futuredata/momentsketch) library|[link](../development/extensions-contrib/momentsketch-quantiles.md)|
+|druid-tdigestsketch|Support for approximate sketch aggregators based on [T-Digest](https://github.com/tdunning/t-digest)|[link](../development/extensions-contrib/tdigestsketch-quantiles.md)|
+|gce-extensions|GCE Extensions|[link](../development/extensions-contrib/gce-extensions.md)|
+|prometheus-emitter|Exposes [Druid metrics](../operations/metrics.md) for [Prometheus](https://prometheus.io/)|[link](../development/extensions-contrib/prometheus.md)|
+|druid-spectator-histogram|Support for efficient approximate percentile queries|[link](../development/extensions-contrib/spectator-histogram.md)|
+|druid-rabbit-indexing-service|Support for creating and managing [RabbitMQ](https://www.rabbitmq.com/) indexing tasks|[link](../development/extensions-contrib/rabbit-stream-ingestion.md)|
+|druid-ranger-security|Support for access control through Apache Ranger.|[link](../development/extensions-contrib/druid-ranger-security.md)|
+
+## Promoting community extensions to core extensions
+
+Please post on [dev@druid.apache.org](https://lists.apache.org/list.html?dev@druid.apache.org) if you'd like an extension to be promoted to core.
+If we see a community extension actively supported by the community, we can promote it to core based on community feedback.
+
+For information on how to create your own extension, see [here](../development/modules.md).
+
+## Loading extensions
+
+### Loading core extensions
+
+Apache Druid bundles all [core extensions](../configuration/extensions.md#core-extensions) out of the box.
+See the [list of extensions](../configuration/extensions.md#core-extensions) for your options. You
+can load bundled extensions by adding their names to your common.runtime.properties
+`druid.extensions.loadList` property. For example, to load the postgresql-metadata-storage and
+druid-hdfs-storage extensions, use the configuration:
+
+```properties
+druid.extensions.loadList=["postgresql-metadata-storage", "druid-hdfs-storage"]
+```
+
+These extensions are located in the `extensions` directory of the distribution.
+
+:::info
+ Druid bundles two sets of configurations: one for the [quickstart](../tutorials/index.md) and
+ one for a [clustered configuration](../tutorials/cluster.md). Make sure you are updating the correct
+ `common.runtime.properties` for your setup.
+:::
+
+:::info
+ Because of licensing, the mysql-metadata-storage extension does not include the required MySQL JDBC driver. For instructions
+ on how to install this library, see the [MySQL extension page](../development/extensions-core/mysql.md).
+:::
+
+### Loading community extensions
+
+You can also load community and third-party extensions not already bundled with Druid. To do this, first download the extension and
+then install it into your `extensions` directory. You can download extensions from their distributors directly, or
+if they are available from Maven, the included [pull-deps](../operations/pull-deps.md) tool can download them for you. To use *pull-deps*,
+specify the full Maven coordinate of the extension in the form `groupId:artifactId:version`. For example,
+for the (hypothetical) extension *com.example:druid-example-extension:1.0.0*, run:
+
+```shell
+java \
+ -cp "lib/*" \
+ -Ddruid.extensions.directory="extensions" \
+ -Ddruid.extensions.hadoopDependenciesDir="hadoop-dependencies" \
+ org.apache.druid.cli.Main tools pull-deps \
+ --no-default-hadoop \
+ -c "com.example:druid-example-extension:1.0.0"
+```
+
+You only have to install the extension once. Then, add `"druid-example-extension"` to
+`druid.extensions.loadList` in `common.runtime.properties` to instruct Druid to load the extension.
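+
+For example, assuming the two core extensions from the earlier example are also in use, the resulting load list might look like this:
+
+```properties
+druid.extensions.loadList=["postgresql-metadata-storage", "druid-hdfs-storage", "druid-example-extension"]
+```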
+
+:::info
+ Make sure all the extension-related configuration properties listed in the [extensions configuration reference](../configuration/index.md#extensions) are set correctly.
+:::
+
+:::info
+ The Maven `groupId` for almost every [community extension](../configuration/extensions.md#community-extensions) is `org.apache.druid.extensions.contrib`. The `artifactId` is the name
+ of the extension, and the version is the latest Druid stable version.
+:::
+
+### Loading extensions from the classpath
+
+If you add your extension jar to the classpath at runtime, Druid loads it as well. This mechanism is relatively easy to reason about,
+but Druid provides no class loader isolation when you use it, so you must ensure that all jars on your classpath are mutually compatible.
diff --git a/docs/35.0.0/configuration/human-readable-byte.md b/docs/35.0.0/configuration/human-readable-byte.md
new file mode 100644
index 0000000000..0f412b69ab
--- /dev/null
+++ b/docs/35.0.0/configuration/human-readable-byte.md
@@ -0,0 +1,98 @@
+---
+id: human-readable-byte
+title: "Human-readable Byte Configuration Reference"
+---
+
+
+
+
+This page documents configuration properties related to bytes.
+
+You can configure these properties in two ways:
+1. as a simple number in bytes
+2. as a number with a unit suffix
+
+## A number in bytes
+
+For example, to configure a cache size of 3G, specify the value in bytes:
+
+```properties
+# 3G bytes = 3_000_000_000 bytes
+druid.cache.sizeInBytes=3000000000
+```
+
+
+## A number with a unit suffix
+
+Writing a large number like the one above makes it easy to add or drop a zero by mistake. To avoid this, Druid also accepts a number with a unit suffix.
+
+For example, for a 1T disk, the configuration can be:
+
+```properties
+druid.segmentCache.locations=[{"path":"/segment-cache-00","maxSize":"1t"},{"path":"/segment-cache-01","maxSize":"1200g"}]
+```
+
+Note that in the above example, both `1t` and `1T` are acceptable because unit suffixes are case-insensitive.
+Only integers are valid as the number part. For example, you can't replace `1200g` with `1.2t`.
+
+### Supported Units
+
+In computing, a unit like `K` is ambiguous: it can mean 1000 or 1024 depending on context. For more information, see [Binary prefix](https://en.wikipedia.org/wiki/Binary_prefix).
+
+To avoid ambiguity, Druid defines the base of each unit as follows:
+
+| Unit | Description | Base |
+|---|---|---|
+| K | Kilo Decimal Byte | 1_000 |
+| M | Mega Decimal Byte | 1_000_000 |
+| G | Giga Decimal Byte | 1_000_000_000 |
+| T | Tera Decimal Byte | 1_000_000_000_000 |
+| P | Peta Decimal Byte | 1_000_000_000_000_000 |
+| Ki | Kilo Binary Byte | 1024 |
+| Mi | Mega Binary Byte | 1024 * 1024 |
+| Gi | Giga Binary Byte | 1024 * 1024 * 1024 |
+| Ti | Tera Binary Byte | 1024 * 1024 * 1024 * 1024 |
+| Pi | Peta Binary Byte | 1024 * 1024 * 1024 * 1024 * 1024 |
+| KiB | Kilo Binary Byte | 1024 |
+| MiB | Mega Binary Byte | 1024 * 1024 |
+| GiB | Giga Binary Byte | 1024 * 1024 * 1024 |
+| TiB | Tera Binary Byte | 1024 * 1024 * 1024 * 1024 |
+| PiB | Peta Binary Byte | 1024 * 1024 * 1024 * 1024 * 1024 |
+
+Units are case-insensitive: `k`, `kib`, `ki`, `KiB`, `Ki`, and `kiB` are all acceptable.
+
+Here are some examples:
+
+```properties
+# 1G bytes = 1_000_000_000 bytes
+druid.cache.sizeInBytes=1g
+```
+
+```properties
+# 256MiB bytes = 256 * 1024 * 1024 bytes
+druid.cache.sizeInBytes=256MiB
+```
+
+```properties
+# 256Mi = 256MiB = 256 * 1024 * 1024 bytes
+druid.cache.sizeInBytes=256Mi
+```
+
+
+
diff --git a/docs/35.0.0/configuration/index.md b/docs/35.0.0/configuration/index.md
new file mode 100644
index 0000000000..8aa5e81846
--- /dev/null
+++ b/docs/35.0.0/configuration/index.md
@@ -0,0 +1,2320 @@
+---
+id: index
+title: "Configuration reference"
+---
+
+
+
+This page documents all of the configuration properties for each Druid service type.
+
+## Recommended configuration file organization
+
+A recommended way of organizing Druid configuration files can be seen in the `conf` directory in the Druid package root, shown below:
+
+```sh
+$ ls -R conf
+druid
+
+conf/druid:
+_common broker coordinator historical middleManager overlord
+
+conf/druid/_common:
+common.runtime.properties log4j2.xml
+
+conf/druid/broker:
+jvm.config runtime.properties
+
+conf/druid/coordinator:
+jvm.config runtime.properties
+
+conf/druid/historical:
+jvm.config runtime.properties
+
+conf/druid/middleManager:
+jvm.config runtime.properties
+
+conf/druid/overlord:
+jvm.config runtime.properties
+```
+
+Each directory has a `runtime.properties` file containing configuration properties for the specific Druid service corresponding to the directory, such as `historical`.
+
+The `jvm.config` files contain JVM flags such as heap sizing properties for each service.
+
+Common properties shared by all services are placed in `_common/common.runtime.properties`.
+
+## Configuration interpolation
+
+Configuration values can be interpolated from system properties, environment variables, or local files. For example:
+
+```properties
+druid.metadata.storage.type=${env:METADATA_STORAGE_TYPE}
+druid.processing.tmpDir=${sys:java.io.tmpdir}
+druid.segmentCache.locations=${file:UTF-8:/config/segment-cache-def.json}
+```
+
+Interpolation is also recursive, so you can do the following:
+
+```properties
+druid.segmentCache.locations=${file:UTF-8:${env:SEGMENT_DEF_LOCATION}}
+```
+
+If the property is not set, an exception is thrown on startup, but you can provide a default value if desired. Note that a default value does not work with file interpolation: an exception is still thrown if the file does not exist.
+
+```properties
+druid.metadata.storage.type=${env:METADATA_STORAGE_TYPE:-mysql}
+druid.processing.tmpDir=${sys:java.io.tmpdir:-/tmp}
+```
+
+If you need to set a variable that is wrapped by `${...}` but do not want it to be interpolated, you can escape it by adding another `$`. For example:
+
+```properties
+config.name=$${value}
+```
+
+## Common configurations
+
+The properties under this section are common configurations that should be shared across all Druid services in a cluster.
+
+### JVM configuration best practices
+
+There are four JVM parameters that we set on all of our services:
+
+* `-Duser.timezone=UTC`: This sets the default timezone of the JVM to UTC. We always set this and do not test with other default timezones, so local timezones might work, but they also might uncover weird and interesting bugs. To issue queries in a non-UTC timezone, see [query granularities](../querying/granularities.md#period-granularities)
+* `-Dfile.encoding=UTF-8`: This is similar to the timezone setting; we test assuming UTF-8. Local encodings might work, but they also might result in weird and interesting bugs.
+* `-Djava.io.tmpdir=`: Various parts of Druid use temporary files to interact with the file system. These files can become quite large, so systems with small `/tmp` directories can cause problems for Druid. Therefore, set the JVM tmp directory to a location with ample space.
+
+ Also consider the following when configuring the JVM tmp directory:
+ * The temp directory should not be volatile tmpfs.
+ * This directory should also have good read and write speed.
+ * Avoid NFS mount.
+ * The `org.apache.druid.java.util.metrics.SysMonitor` requires execute privileges on files in `java.io.tmpdir`. If you are using the system monitor, do not set `java.io.tmpdir` to `noexec`.
+* `-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager`: This allows log4j2 to handle logs for non-log4j2 components (like Jetty) that use standard Java logging.
+
+### Extensions
+
+Many of Druid's external dependencies can be plugged in as modules. Extensions can be provided using the following configs:
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.extensions.directory`|The root extension directory where users can put extension-related files. Druid loads extensions stored under this directory.|`extensions` (This is a relative path to Druid's working directory)|
+|`druid.extensions.hadoopDependenciesDir`|The root Hadoop dependencies directory where users can put Hadoop-related dependency files. Druid loads the dependencies based on the Hadoop coordinate specified in the Hadoop index task.|`hadoop-dependencies` (This is a relative path to Druid's working directory)|
+|`druid.extensions.loadList`|A JSON array of extensions to load from extension directories by Druid. If it is not specified, its value will be `null` and Druid will load all the extensions under `druid.extensions.directory`. If its value is empty list `[]`, then no extensions will be loaded at all. It is also allowed to specify absolute path of other custom extensions not stored in the common extensions directory.|null|
+|`druid.extensions.searchCurrentClassloader`|This is a boolean flag that determines if Druid will search the main classloader for extensions. It defaults to true but can be turned off if you have reason to not automatically add all modules on the classpath.|true|
+|`druid.extensions.useExtensionClassloaderFirst`|This is a boolean flag that determines if Druid extensions should prefer loading classes from their own jars rather than jars bundled with Druid. If false, extensions must be compatible with classes provided by any jars bundled with Druid. If true, extensions may depend on conflicting versions.|false|
+|`druid.extensions.hadoopContainerDruidClasspath`|Hadoop Indexing launches Hadoop jobs and this configuration provides way to explicitly set the user classpath for the Hadoop job. By default, this is computed automatically by Druid based on the Druid service classpath and set of extensions. However, sometimes you might want to be explicit to resolve dependency conflicts between Druid and Hadoop.|null|
+|`druid.extensions.addExtensionsToHadoopContainer`|Only applicable if `druid.extensions.hadoopContainerDruidClasspath` is provided. If set to true, then extensions specified in the loadList are added to Hadoop container classpath. Note that when `druid.extensions.hadoopContainerDruidClasspath` is not provided then extensions are always added to Hadoop container classpath.|false|
+
+### Modules
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.modules.excludeList`|A JSON array of canonical class names (e.g., `"org.apache.druid.somepackage.SomeModule"`) of module classes which shouldn't be loaded, even if they are found in extensions specified by `druid.extensions.loadList`, or in the list of core modules specified to be loaded on a particular Druid service type. Useful when some useful extension contains some module, which shouldn't be loaded on some Druid service type because some dependencies of that module couldn't be satisfied.|[]|
+
+### ZooKeeper
+
+We recommend just setting the base ZK path and the ZK service host, but all ZK paths that Druid uses can be overwritten to absolute paths.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.zk.paths.base`|Base ZooKeeper path.|`/druid`|
+|`druid.zk.service.host`|The ZooKeeper hosts to connect to. This is a REQUIRED property and therefore a host address must be supplied.|none|
+|`druid.zk.service.user`|The username to authenticate with ZooKeeper. This is an optional property.|none|
+|`druid.zk.service.pwd`|The [Password Provider](../operations/password-provider.md) or the string password to authenticate with ZooKeeper. This is an optional property.|none|
+|`druid.zk.service.authScheme`|digest is the only authentication scheme supported. |digest|
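+
+For example, a minimal sketch of the common ZooKeeper settings, assuming hypothetical host names for your ensemble:
+
+```properties
+# Comma-separated ZooKeeper hosts (placeholders; replace with your own ensemble)
+druid.zk.service.host=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
+druid.zk.paths.base=/druid
+```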
+
+#### ZooKeeper behavior
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.zk.service.sessionTimeoutMs`|ZooKeeper session timeout, in milliseconds.|`30000`|
+|`druid.zk.service.connectionTimeoutMs`|ZooKeeper connection timeout, in milliseconds.|`15000`|
+|`druid.zk.service.compress`|Boolean flag for whether or not created Znodes should be compressed.|`true`|
+|`druid.zk.service.acl`|Boolean flag for whether or not to enable ACL security for ZooKeeper. If ACL is enabled, zNode creators will have all permissions.|`false`|
+|`druid.zk.service.pathChildrenCacheStrategy`|Dictates the underlying caching strategy for service announcements. Set to true to let announcers use Apache Curator's PathChildrenCache strategy; otherwise they use the NodeCache strategy. Consider the NodeCache strategy when you are dealing with a huge number of ZooKeeper watches in your cluster.|`true`|
+
+#### Path configuration
+
+Druid interacts with ZooKeeper through a set of standard path configurations. We recommend just setting the base ZooKeeper path, but all ZooKeeper paths that Druid uses can be overwritten to absolute paths.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.zk.paths.base`|Base ZooKeeper path.|`/druid`|
+|`druid.zk.paths.propertiesPath`|ZooKeeper properties path.|`${druid.zk.paths.base}/properties`|
+|`druid.zk.paths.announcementsPath`|Druid service announcement path.|`${druid.zk.paths.base}/announcements`|
+|`druid.zk.paths.liveSegmentsPath`|Current path for where Druid services announce their segments.|`${druid.zk.paths.base}/segments`|
+|`druid.zk.paths.coordinatorPath`|Used by the Coordinator for leader election.|`${druid.zk.paths.base}/coordinator`|
+
+The indexing service also uses its own set of paths. These configs can be included in the common configuration.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.zk.paths.indexer.base`|Base ZooKeeper path for the indexing service.|`${druid.zk.paths.base}/indexer`|
+|`druid.zk.paths.indexer.announcementsPath`|Middle Managers announce themselves here.|`${druid.zk.paths.indexer.base}/announcements`|
+|`druid.zk.paths.indexer.tasksPath`|Used to assign tasks to Middle Managers.|`${druid.zk.paths.indexer.base}/tasks`|
+|`druid.zk.paths.indexer.statusPath`|Parent path for announcement of task statuses.|`${druid.zk.paths.indexer.base}/status`|
+
+If `druid.zk.paths.base` and `druid.zk.paths.indexer.base` are both set, and none of the other `druid.zk.paths.*` or `druid.zk.paths.indexer.*` values are set, then the other properties will be evaluated relative to their respective `base`.
+For example, if `druid.zk.paths.base` is set to `/druid1` and `druid.zk.paths.indexer.base` is set to `/druid2` then `druid.zk.paths.announcementsPath` will default to `/druid1/announcements` while `druid.zk.paths.indexer.announcementsPath` will default to `/druid2/announcements`.
+
+The following path is used for service discovery. It is **not** affected by `druid.zk.paths.base` and **must** be specified separately.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.discovery.curator.path`|Services announce themselves under this ZooKeeper path.|`/druid/discovery`|
+
+### TLS
+
+#### General configuration
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.enablePlaintextPort`|Enable/Disable HTTP connector.|`true`|
+|`druid.enableTlsPort`|Enable/Disable HTTPS connector.|`false`|
+
+Although not recommended, both HTTP and HTTPS connectors can be enabled at the same time. The respective ports are configurable using the `druid.plaintextPort`
+and `druid.tlsPort` properties on each service. See the `Configuration` section of individual services for the valid and default values of these ports.
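+
+For example, a sketch that serves HTTPS only on a service (the TLS port shown is only illustrative; check the service's configuration section for its default):
+
+```properties
+# Disable the plaintext connector and enable the TLS connector
+druid.enablePlaintextPort=false
+druid.enableTlsPort=true
+druid.tlsPort=8281
+```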
+
+#### Jetty server TLS configuration
+
+Druid uses Jetty as an embedded web server. To learn more about TLS/SSL, certificates, and related concepts in Jetty, including explanations of the configuration settings below, see "Configuring SSL/TLS KeyStores" in the [Jetty Operations Guide](https://www.eclipse.org/jetty/documentation.php).
+
+For information about TLS/SSL support in Java in general, see the [Java Secure Socket Extension (JSSE) Reference Guide](https://docs.oracle.com/en/java/javase/17/security/java-secure-socket-extension-jsse-reference-guide.html).
+The [Java Cryptography Architecture
+Standard Algorithm Name Documentation for JDK 17](https://docs.oracle.com/en/java/javase/17/docs/specs/security/standard-names.html) lists all possible
+values for the following properties, among others provided by the Java implementation.
+
+|Property|Description|Default|Required|
+|--------|-----------|-------|--------|
+|`druid.server.https.keyStorePath`|The file path or URL of the TLS/SSL KeyStore.|none|yes|
+|`druid.server.https.keyStoreType`|The type of the KeyStore.|none|yes|
+|`druid.server.https.certAlias`|Alias of TLS/SSL certificate for the connector.|none|yes|
+|`druid.server.https.keyStorePassword`|The [Password Provider](../operations/password-provider.md) or String password for the KeyStore.|none|yes|
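+
+For example, a minimal sketch assuming a hypothetical JKS KeyStore path, alias, and password:
+
+```properties
+# Hypothetical KeyStore location and credentials; replace with your own values.
+# Consider using a Password Provider instead of a plain-text password.
+druid.server.https.keyStorePath=/opt/druid/conf/tls/keystore.jks
+druid.server.https.keyStoreType=jks
+druid.server.https.certAlias=druid
+druid.server.https.keyStorePassword=changeit
+```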
+
+The following table contains optional advanced configuration options. Use caution when changing them.
+
+|Property|Description|Default|Required|
+|--------|-----------|-------|--------|
+|`druid.server.https.keyManagerFactoryAlgorithm`|Algorithm to use for creating KeyManager, more details [here](https://docs.oracle.com/javase/7/docs/technotes/guides/security/jsse/JSSERefGuide.html#KeyManager).|`javax.net.ssl.KeyManagerFactory.getDefaultAlgorithm()`|no|
+|`druid.server.https.keyManagerPassword`|The [Password Provider](../operations/password-provider.md) or String password for the Key Manager.|none|no|
+|`druid.server.https.includeCipherSuites`|List of cipher suite names to include. You can either use the exact cipher suite name or a regular expression.|Jetty's default include cipher list|no|
+|`druid.server.https.excludeCipherSuites`|List of cipher suite names to exclude. You can either use the exact cipher suite name or a regular expression.|Jetty's default exclude cipher list|no|
+|`druid.server.https.includeProtocols`|List of exact protocols names to include.|Jetty's default include protocol list|no|
+|`druid.server.https.excludeProtocols`|List of exact protocols names to exclude.|Jetty's default exclude protocol list|no|
+
+#### Internal client TLS configuration (requires `simple-client-sslcontext` extension)
+
+These properties apply to the SSLContext that will be provided to the internal HTTP client that Druid services use to communicate with each other. These properties require the `simple-client-sslcontext` extension to be loaded. Without it, Druid services will be unable to communicate with each other when TLS is enabled.
+
+|Property|Description|Default|Required|
+|--------|-----------|-------|--------|
+|`druid.client.https.protocol`|SSL protocol to use.|`TLSv1.2`|no|
+|`druid.client.https.trustStoreType`|The type of the key store where trusted root certificates are stored.|`java.security.KeyStore.getDefaultType()`|no|
+|`druid.client.https.trustStorePath`|The file path or URL of the TLS/SSL Key store where trusted root certificates are stored.|none|yes|
+|`druid.client.https.trustStoreAlgorithm`|Algorithm to be used by TrustManager to validate certificate chains|`javax.net.ssl.TrustManagerFactory.getDefaultAlgorithm()`|no|
+|`druid.client.https.trustStorePassword`|The [Password Provider](../operations/password-provider.md) or String password for the Trust Store.|none|yes|
+
+This [document](https://docs.oracle.com/en/java/javase/17/docs/specs/security/standard-names.html) lists all the possible
+values for the above-mentioned configs, among others provided by the Java implementation.
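+
+For example, a sketch of the internal client settings, assuming a hypothetical trust store path and that `simple-client-sslcontext` has been added to `druid.extensions.loadList`:
+
+```properties
+# Hypothetical trust store; replace the path and password with your own values.
+druid.client.https.protocol=TLSv1.2
+druid.client.https.trustStoreType=jks
+druid.client.https.trustStorePath=/opt/druid/conf/tls/truststore.jks
+druid.client.https.trustStorePassword=changeit
+```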
+
+### Authentication and authorization
+
+|Property|Type|Description|Default|Required|
+|--------|-----------|--------|--------|--------|
+|`druid.auth.authenticatorChain`|JSON List of Strings|List of Authenticator type names|["allowAll"]|no|
+|`druid.escalator.type`|String|Type of the Escalator that should be used for internal Druid communications. This Escalator must use an authentication scheme that is supported by an Authenticator in `druid.auth.authenticatorChain`.|`noop`|no|
+|`druid.auth.authorizers`|JSON List of Strings|List of Authorizer type names |["allowAll"]|no|
+|`druid.auth.unsecuredPaths`| List of Strings|List of paths for which security checks will not be performed. All requests to these paths will be allowed.|[]|no|
+|`druid.auth.allowUnauthenticatedHttpOptions`|Boolean|If true, skip authentication checks for HTTP OPTIONS requests. This is needed for certain use cases, such as supporting CORS pre-flight requests. Note that disabling authentication checks for OPTIONS requests will allow unauthenticated users to determine what Druid endpoints are valid (by checking if the OPTIONS request returns a 200 instead of 404), so enabling this option may reveal information about server configuration, including information about what extensions are loaded (if those extensions add endpoints).|false|no|
+
+For more information, please see [Authentication and Authorization](../operations/auth.md).
+
+For configuration options for specific auth extensions, please refer to the extension documentation.
+
+### Startup logging
+
+All services can log debugging information on startup.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.startup.logging.logProperties`|Log all properties on startup (from common.runtime.properties, runtime.properties, and the JVM command line).|false|
+|`druid.startup.logging.maskProperties`|Masks sensitive properties (passwords, for example) containing these words.|["password"]|
+
+Note that some sensitive information may be logged if these settings are enabled.
+
+### Request logging
+
+All services that can serve queries can also log the query requests they see. Broker services can additionally log the SQL requests (both from HTTP and JDBC) they see.
+For an example of setting up request logging, see [Request logging](../operations/request-logging.md).
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.request.logging.type`|How to log every query request. Choices: `noop`, [`file`](#file-request-logging), [`emitter`](#emitter-request-logging), [`slf4j`](#slf4j-request-logging), [`filtered`](#filtered-request-logging), [`composing`](#composing-request-logging), [`switching`](#switching-request-logging)|`noop` (request logging disabled by default)|
+
+To enable sending all the HTTP requests to a log, set `org.apache.druid.jetty.RequestLog` to the `DEBUG` level. See [Logging](../configuration/logging.md) for more information.
+
+#### File request logging
+
+The `file` request logger stores daily request logs on disk.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.request.logging.dir`| Historical, Realtime, and Broker services maintain request logs of all of the requests they get. Because interaction is via POST, normal request logs don't generally capture information about the actual query. This property specifies the directory in which to store the request logs. | none|
+|`druid.request.logging.filePattern`| [Joda datetime format](http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html) for each file.| "yyyy-MM-dd'.log'"|
+|`druid.request.logging.durationToRetain`| Period to retain the request logs on disk. The period should be at least as long as roll period.| none|
+|`druid.request.logging.rollPeriod`| Defines the log rotation period for request logs. The period should be at least `PT1H`. For periods smaller than 1 day, it is recommended to use `"yyyy-MM-dd-HH'.log'"` as the file pattern.| P1D|
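+
+For example, a sketch of a file request logger, assuming a hypothetical log directory:
+
+```properties
+druid.request.logging.type=file
+# Hypothetical directory; replace with a path that has sufficient space.
+druid.request.logging.dir=/var/druid/request-logs
+druid.request.logging.filePattern=yyyy-MM-dd'.log'
+```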
+
+The format of request logs is TSV, one line per request, with five fields: timestamp, remote\_addr, native\_query, query\_context, sql\_query.
+
+For native JSON requests, the `sql_query` field is empty. For example:
+
+```txt
+2019-01-14T10:00:00.000Z 127.0.0.1 {"queryType":"topN","dataSource":{"type":"table","name":"wikiticker"},"virtualColumns":[],"dimension":{"type":"LegacyDimensionSpec","dimension":"page","outputName":"page","outputType":"STRING"},"metric":{"type":"LegacyTopNMetricSpec","metric":"count"},"threshold":10,"intervals":{"type":"LegacySegmentSpec","intervals":["2015-09-12T00:00:00.000Z/2015-09-13T00:00:00.000Z"]},"filter":null,"granularity":{"type":"all"},"aggregations":[{"type":"count","name":"count"}],"postAggregations":[],"context":{"queryId":"74c2d540-d700-4ebd-b4a9-3d02397976aa"},"descending":false} {"query/time":100,"query/bytes":800,"success":true,"identity":"user1"}
+```
+
+For SQL query requests, the `native_query` field is empty. For example:
+
+```txt
+2019-01-14T10:00:00.000Z 127.0.0.1 {"sqlQuery/time":100, "sqlQuery/planningTimeMs":10, "sqlQuery/bytes":600, "success":true, "identity":"user1"} {"query":"SELECT page, COUNT(*) AS Edits FROM wikiticker WHERE TIME_IN_INTERVAL(\"__time\", '2015-09-12/2015-09-13') GROUP BY page ORDER BY Edits DESC LIMIT 10","context":{"sqlQueryId":"c9d035a0-5ffd-4a79-a865-3ffdadbb5fdd","nativeQueryIds":"[490978e4-f5c7-4cf6-b174-346e63cf8863]"}}
+```
+
+#### Emitter request logging
+
+The `emitter` request logger emits every request to the external location specified in the [emitter](#metrics-monitors) configuration.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.request.logging.feed`|Feed name for requests.|none|
+
+#### SLF4J request logging
+
+The `slf4j` request logger logs every request using SLF4J. It serializes native queries into JSON in the log message regardless of the SLF4J format specification. Requests are logged under the class `org.apache.druid.server.log.LoggingRequestLogger`.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.request.logging.setMDC`|If you want to set MDC entries within the log entry, set this value to `true`. Your logging system must be configured to support MDC in order to format this data.|false|
+|`druid.request.logging.setContextMDC`|Set to "true" to add the Druid query `context` to the MDC entries. Only applies when `setMDC` is `true`.|false|
+
+For a native query, the following MDC fields are populated when `setMDC` is `true`:
+
+|MDC field|Description|
+|---------|-----------|
+|`queryId` |The query ID|
+|`sqlQueryId`|The SQL query ID if this query is part of a SQL request|
+|`dataSource`|The datasource the query was against|
+|`queryType` |The type of the query|
+|`hasFilters`|If the query has any filters|
+|`remoteAddr`|The remote address of the requesting client|
+|`duration` |The duration of the query interval|
+|`resultOrdering`|The ordering of results|
+|`descending`|If the query is a descending query|
+
+#### Filtered request logging
+
+The `filtered` request logger filters requests based on the query type or how long a query takes to complete.
+For native queries, the logger only logs requests when the `query/time` metric exceeds the threshold provided in `queryTimeThresholdMs`.
+For SQL queries, it only logs requests when the `sqlQuery/time` metric exceeds the threshold provided in `sqlQueryTimeThresholdMs`.
+See [Metrics](../operations/metrics.md) for more details on query metrics.
+
+Requests that meet the threshold are logged using the request logger type set in `druid.request.logging.delegate.type`.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.request.logging.queryTimeThresholdMs`|Threshold value for the `query/time` metric in milliseconds.|0, i.e., no filtering|
+|`druid.request.logging.sqlQueryTimeThresholdMs`|Threshold value for the `sqlQuery/time` metric in milliseconds.|0, i.e., no filtering|
+|`druid.request.logging.mutedQueryTypes` | Query requests of these types are not logged. Query types are defined as string objects corresponding to the "queryType" value for the specified query in the Druid's [native JSON query API](../querying/querying.md). Misspelled query types will be ignored. Example to ignore scan and timeBoundary queries: `["scan", "timeBoundary"]`| []|
+|`druid.request.logging.delegate.type`|Type of delegate request logger to log requests.|none|
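+
+For example, a sketch that logs only queries slower than one second and delegates them to the `slf4j` request logger:
+
+```properties
+druid.request.logging.type=filtered
+druid.request.logging.queryTimeThresholdMs=1000
+druid.request.logging.sqlQueryTimeThresholdMs=1000
+druid.request.logging.delegate.type=slf4j
+```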
+
+#### Composing request logging
+
+The `composing` request logger emits request logs to multiple request loggers.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.request.logging.loggerProviders`|List of request loggers for emitting request logs.|none|
+
+#### Switching request logging
+
+The `switching` request logger routes native query request logs to one request logger and SQL query request logs to another request logger.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.request.logging.nativeQueryLogger`|Request logger for emitting native query request logs.|none|
+|`druid.request.logging.sqlQueryLogger`|Request logger for emitting SQL query request logs.|none|
+
+### Audit logging
+
+Coordinator and Overlord log changes to lookups, segment load/drop rules, and dynamic configuration changes for auditing.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.audit.manager.type`|Type of audit manager used for handling audited events. Audited events are logged when set to `log` or persisted in metadata store when set to `sql`.|sql|
+|`druid.audit.manager.logLevel`|Log level of audit events with possible values DEBUG, INFO, WARN. This property is used only when `druid.audit.manager.type` is set to `log`.|INFO|
+|`druid.audit.manager.auditHistoryMillis`|Default duration for querying audit history.|1 week|
+|`druid.audit.manager.includePayloadAsDimensionInMetric`|Boolean flag on whether to add `payload` column in service metric.|false|
+|`druid.audit.manager.maxPayloadSizeBytes`|The maximum size of audit payload to store in Druid's metadata store audit table. If the size of the audit payload exceeds this value, the audit log is stored with a message indicating that the payload was omitted. Setting `maxPayloadSizeBytes` to -1 (the default) disables this check, meaning Druid always stores the audit payload regardless of its size. Setting it to any negative number other than `-1` is invalid. [Human-readable format](human-readable-byte.md) is supported. |-1|
+|`druid.audit.manager.skipNullField`|If true, the audit payload stored in metadata store will exclude any field with null value. |false|
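+
+For example, to log audited events instead of persisting them to the metadata store:
+
+```properties
+druid.audit.manager.type=log
+druid.audit.manager.logLevel=INFO
+```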
+
+### Metadata storage
+
+These properties specify the JDBC connection and other configuration around the metadata storage. The only services that connect to the metadata storage with these properties are the [Coordinator](../design/coordinator.md) and [Overlord](../design/overlord.md).
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.metadata.storage.type`|The type of metadata storage to use. One of `mysql`, `postgresql`, or `derby`.|`derby`|
+|`druid.metadata.storage.connector.connectURI`|The JDBC URI for the database to connect to|none|
+|`druid.metadata.storage.connector.user`|The username to connect with.|none|
+|`druid.metadata.storage.connector.password`|The [Password Provider](../operations/password-provider.md) or String password used to connect with.|none|
+|`druid.metadata.storage.connector.createTables`|Whether Druid should create a required metadata table if it doesn't exist.|true|
+|`druid.metadata.storage.tables.base`|The base name for tables.|`druid`|
+|`druid.metadata.storage.tables.dataSource`|The table to use to look for datasources created by [Kafka Indexing Service](../ingestion/kafka-ingestion.md).|`druid_dataSource`|
+|`druid.metadata.storage.tables.pendingSegments`|The table to use to look for pending segments.|`druid_pendingSegments`|
+|`druid.metadata.storage.tables.segments`|The table to use to look for segments.|`druid_segments`|
+|`druid.metadata.storage.tables.rules`|The table to use to look for segment load/drop rules.|`druid_rules`|
+|`druid.metadata.storage.tables.config`|The table to use to look for configs.|`druid_config`|
+|`druid.metadata.storage.tables.tasks`|Used by the indexing service to store tasks.|`druid_tasks`|
+|`druid.metadata.storage.tables.taskLog`|Used by the indexing service to store task logs.|`druid_tasklogs`|
+|`druid.metadata.storage.tables.taskLock`|Used by the indexing service to store task locks.|`druid_tasklocks`|
+|`druid.metadata.storage.tables.supervisors`|Used by the indexing service to store supervisor configurations.|`druid_supervisors`|
+|`druid.metadata.storage.tables.audit`|The table to use for audit history of configuration changes, such as Coordinator rules.|`druid_audit`|
+|`druid.metadata.storage.tables.useShortIndexNames`|Whether to use SHA-based unique index names to ensure all indices are created.|`false`|
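+
+For example, a sketch of a PostgreSQL metadata store with hypothetical connection details; the `postgresql-metadata-storage` extension must also be loaded:
+
+```properties
+# Hypothetical database host and credentials; replace with your own values.
+druid.metadata.storage.type=postgresql
+druid.metadata.storage.connector.connectURI=jdbc:postgresql://db.example.com:5432/druid
+druid.metadata.storage.connector.user=druid
+druid.metadata.storage.connector.password=diurd
+```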
+
+### Deep storage
+
+The configurations concern how to push and pull [Segments](../design/segments.md) from deep storage.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.storage.type`|The type of deep storage to use. One of `local`, `noop`, `s3`, `hdfs`, `c*`.|local|
+
+#### Local deep storage
+
+Local deep storage uses the local filesystem.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.storage.storageDirectory`|Directory on disk to use as deep storage.|`/tmp/druid/localStorage`|
+
+#### Noop deep storage
+
+This deep storage doesn't do anything. There are no configs.
+
+#### S3 deep storage
+
+This deep storage is used to interface with Amazon's S3. Note that the `druid-s3-extensions` extension must be loaded.
+The below table shows some important configurations for S3. See [S3 Deep Storage](../development/extensions-core/s3.md) for full configurations.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.storage.bucket`|S3 bucket name.|none|
+|`druid.storage.baseKey`|S3 object key prefix for storage.|none|
+|`druid.storage.disableAcl`|Boolean flag for ACL. If set to `false`, full control is granted to the bucket owner, which may require additional permissions. See [S3 permissions settings](../development/extensions-core/s3.md#s3-permissions-settings).|false|
+|`druid.storage.archiveBucket`|S3 bucket name for archiving when running the _archive task_.|none|
+|`druid.storage.archiveBaseKey`|S3 object key prefix for archiving.|none|
+|`druid.storage.sse.type`|Server-side encryption type. Should be one of `s3`, `kms`, and `custom`. See the below [Server-side encryption section](../development/extensions-core/s3.md#server-side-encryption) for more details.|None|
+|`druid.storage.sse.kms.keyId`|AWS KMS key ID. This is used only when `druid.storage.sse.type` is `kms` and can be empty to use the default key ID.|None|
+|`druid.storage.sse.custom.base64EncodedKey`|Base64-encoded key. Should be specified if `druid.storage.sse.type` is `custom`.|None|
+|`druid.storage.useS3aSchema`|If true, use the "s3a" filesystem when using Hadoop-based ingestion. If false, the "s3n" filesystem will be used. Only affects Hadoop-based ingestion.|false|
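+
+For example, a minimal sketch with a hypothetical bucket and key prefix:
+
+```properties
+druid.storage.type=s3
+# Hypothetical bucket and prefix; replace with your own values.
+druid.storage.bucket=my-druid-bucket
+druid.storage.baseKey=druid/segments
+```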
+
+#### HDFS deep storage
+
+This deep storage is used to interface with HDFS. You must load the `druid-hdfs-storage` extension.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.storage.storageDirectory`|HDFS directory to use as deep storage.|none|
+
+#### Cassandra deep storage
+
+This deep storage is used to interface with Cassandra. You must load the `druid-cassandra-storage` extension.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.storage.host`|Cassandra host.|none|
+|`druid.storage.keyspace`|Cassandra key space.|none|
+
+#### Centralized datasource schema (Experimental)
+
+This is an [experimental feature](../development/experimental.md) to improve datasource schema management by persisting segment schemas to the metadata store and caching them on the Coordinator.
+Traditionally, Brokers issue segment metadata queries to data nodes and tasks to fetch the schemas of all available segments.
+Each Broker then individually builds the schema of a datasource by combining the schemas of all the segments of that datasource.
+This mechanism is redundant and prone to errors as there is no single source of truth for schemas.
+
+Centralized schema management improves upon this design as follows:
+- Tasks publish segment schema along with segment metadata to the database.
+- Tasks announce schema for realtime segments periodically to the Coordinator.
+- Coordinator caches segment schemas and builds a combined schema for each datasource.
+- Brokers poll the datasource schema cached on the Coordinator rather than building it on their own.
+- Brokers still retain the ability to build a datasource schema if they are unable to fetch it from the Coordinator.
+
+|Property|Description|Default|Required|
+|--------|-----------|-------|--------|
+|`druid.centralizedDatasourceSchema.enabled`|Boolean flag for enabling datasource schema building and caching on the Coordinator. This property should be specified in the common runtime properties.|false|No.|
+|`druid.indexer.fork.property.druid.centralizedDatasourceSchema.enabled`| Set this config when the centralized datasource schema feature is enabled. This property should be specified in the Middle Manager runtime properties.|false|No.|
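+
+For example, a sketch of the properties to set, in the common runtime properties and the Middle Manager runtime properties respectively:
+
+```properties
+# In common.runtime.properties
+druid.centralizedDatasourceSchema.enabled=true
+# In the Middle Manager runtime.properties
+druid.indexer.fork.property.druid.centralizedDatasourceSchema.enabled=true
+```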
+
+If you enable this feature, you can query datasources that are only stored in deep storage and are not loaded on a Historical. For more information, see [Query from deep storage](../querying/query-from-deep-storage.md).
+
+For stale schema cleanup configs, refer to properties with the prefix `druid.coordinator.kill.segmentSchema` in [Metadata Management](#metadata-management).
+
+### Ingestion security configuration
+
+#### HDFS input source
+
+You can set the following property to specify permissible protocols for
+the [HDFS input source](../ingestion/input-sources.md#hdfs-input-source).
+
+|Property|Possible values|Description|Default|
+|--------|---------------|-----------|-------|
+|`druid.ingestion.hdfs.allowedProtocols`|List of protocols|Allowed protocols for the HDFS input source.|`["hdfs"]`|
+
+#### HTTP input source
+
+You can set the following property to specify permissible protocols for
+the [HTTP input source](../ingestion/input-sources.md#http-input-source).
+
+|Property|Possible values|Description|Default|
+|--------|---------------|-----------|-------|
+|`druid.ingestion.http.allowedProtocols`|List of protocols|Allowed protocols for the HTTP input source.|`["http", "https"]`|
+|`druid.ingestion.http.allowedHeaders`|List of headers|A list of permitted request headers for the HTTP input source. By default, the list is empty, which means no headers are allowed in the ingestion specification.|`[]`|
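+
+For example, a sketch that restricts the HTTP input source to HTTPS and allows only an `Authorization` header (shown purely as an illustration):
+
+```properties
+druid.ingestion.http.allowedProtocols=["https"]
+druid.ingestion.http.allowedHeaders=["Authorization"]
+```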
+
+### External data access security configuration
+
+#### JDBC connections to external databases
+
+You can use the following properties to specify permissible JDBC options for:
+
+* [SQL input source](../ingestion/input-sources.md#sql-input-source)
+* [globally cached JDBC lookups](../querying/lookups-cached-global.md#jdbc-lookup)
+* [JDBC Data Fetcher for per-lookup caching](../development/extensions-core/druid-lookups.md#data-fetcher-layer).
+
+These properties do not apply to metadata storage connections.
+
+|Property|Possible values|Description|Default|
+|--------|---------------|-----------|-------|
+|`druid.access.jdbc.enforceAllowedProperties`|Boolean|When true, Druid applies `druid.access.jdbc.allowedProperties` to JDBC connections starting with `jdbc:postgresql:`, `jdbc:mysql:`, or `jdbc:mariadb:`. When false, Druid allows any kind of JDBC connections without JDBC property validation. This config is for backward compatibility especially during upgrades since enforcing allow list can break existing ingestion jobs or lookups based on JDBC. This config is deprecated and will be removed in a future release.|true|
+|`druid.access.jdbc.allowedProperties`|List of JDBC properties|Defines a list of allowed JDBC properties. Druid always enforces the list for all JDBC connections starting with `jdbc:postgresql:`, `jdbc:mysql:`, and `jdbc:mariadb:` if `druid.access.jdbc.enforceAllowedProperties` is set to true. This option is tested against MySQL connector 8.2.0, MariaDB connector 2.7.4, and PostgreSQL connector 42.2.14. Other connector versions might not work.|`["useSSL", "requireSSL", "ssl", "sslmode"]`|
+|`druid.access.jdbc.allowUnknownJdbcUrlFormat`|Boolean|When false, Druid only accepts JDBC connections starting with `jdbc:postgresql:` or `jdbc:mysql:`. When true, Druid allows JDBC connections to any kind of database, but only enforces `druid.access.jdbc.allowedProperties` for PostgreSQL and MySQL/MariaDB.|true|
+
+### Task logging
+
+You can use the `druid.indexer` configuration to set a [long-term storage](#log-long-term-storage) location for task log files, and to set a [retention policy](#log-retention-policy).
+
+For more information about ingestion tasks and the services that generate logs, see the [task reference](../ingestion/tasks.md).
+
+#### Log long-term storage
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.indexer.logs.type`|Where to store task logs. `noop`, [`s3`](#s3-task-logs), [`azure`](#azure-blob-store-task-logs), [`google`](#google-cloud-storage-task-logs), [`hdfs`](#hdfs-task-logs), [`file`](#file-task-logs) |`file`|
+
+##### File task logs
+
+Store task logs in the local filesystem.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.indexer.logs.directory`|Local filesystem path.|log|
+
+##### S3 task logs
+
+Store task logs in S3. Note that the `druid-s3-extensions` extension must be loaded.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.indexer.logs.s3Bucket`|S3 bucket name.|none|
+|`druid.indexer.logs.s3Prefix`|S3 key prefix.|none|
+|`druid.indexer.logs.disableAcl`|Boolean flag for ACL. If set to `false`, full control is granted to the bucket owner. If the task logs bucket is the same as the deep storage (S3) bucket, set this property to true whenever `druid.storage.disableAcl` is set to true.|false|
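+
+For example, a sketch with a hypothetical bucket and prefix for task logs:
+
+```properties
+druid.indexer.logs.type=s3
+druid.indexer.logs.s3Bucket=my-druid-task-logs
+druid.indexer.logs.s3Prefix=druid/task-logs
+```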
+
+##### Azure Blob Store task logs
+
+Store task logs in Azure Blob Store. To enable this feature, load the `druid-azure-extensions` extension, and configure deep storage for Azure. Druid uses the same authentication method configured for deep storage and stores task logs in the same storage account (set in `druid.azure.account`).
+
+| Property | Description | Default |
+|---|---|---|
+| `druid.indexer.logs.container` | The Azure Blob Store container to write logs to. | Must be set. |
+| `druid.indexer.logs.prefix` | The path to prepend to logs. | Must be set. |
+
+##### Google Cloud Storage task logs
+
+Store task logs in Google Cloud Storage.
+
+Note: The `druid-google-extensions` extension must be loaded, and this uses the same storage settings as the Google deep storage module.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.indexer.logs.bucket`|The Google Cloud Storage bucket to write logs to|none|
+|`druid.indexer.logs.prefix`|The path to prepend to logs|none|
+
+##### HDFS task logs
+
+Store task logs in HDFS. Note that the `druid-hdfs-storage` extension must be loaded.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.indexer.logs.directory`|The directory to store logs.|none|
+
+#### Log retention policy
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.indexer.logs.kill.enabled`|Boolean value for whether to enable deletion of old task logs. If set to true, the Overlord periodically submits kill tasks at the interval specified by `druid.indexer.logs.kill.delay`, which delete task logs from the log directory as well as task and task log entries in metadata storage, except for tasks created within the last `druid.indexer.logs.kill.durationToRetain` period. |false|
+|`druid.indexer.logs.kill.durationToRetain`| Required if kill is enabled. Retention duration in milliseconds: task logs and entries in task-related metadata storage tables created within this period are retained; older ones are deleted. |None|
+|`druid.indexer.logs.kill.initialDelay`| Optional. Number of milliseconds after Overlord start when first auto kill is run. |random value less than 300000 (5 mins)|
+|`druid.indexer.logs.kill.delay`|Optional. Number of milliseconds of delay between successive executions of auto kill run. |21600000 (6 hours)|
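+
+For example, a sketch that retains seven days of task logs (7 days = 604800000 milliseconds) and runs the cleanup every six hours:
+
+```properties
+druid.indexer.logs.kill.enabled=true
+druid.indexer.logs.kill.durationToRetain=604800000
+druid.indexer.logs.kill.delay=21600000
+```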
+
+### API error response
+
+You can configure Druid API error responses to hide internal information like the Druid class name, stack trace, thread name, servlet name, code, line/column number, host, or IP address.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.server.http.showDetailedJettyErrors`|When set to true, any error from the Jetty layer / Jetty filter includes the following fields in the JSON response: `servlet`, `message`, `url`, `status`, and `cause`, if it exists. When set to false, the JSON response only includes `message`, `url`, and `status`. The field values remain unchanged.|true|
+|`druid.server.http.errorResponseTransform.strategy`|Error response transform strategy. The strategy controls how Druid transforms error responses from Druid services. When unset or set to `none`, Druid leaves error responses unchanged.|`none`|
+
+#### Error response transform strategy
+
+You can use an error response transform strategy to transform error responses from within Druid services to hide internal information.
+When you specify an error response transform strategy other than `none`, Druid transforms the error responses from Druid services as follows:
+
+* For any query API that fails in the Router service, Druid sets the fields `errorClass` and `host` to null. Druid applies the transformation strategy to the `errorMessage` field.
+* For any SQL query API that fails, for example `POST /druid/v2/sql/...`, Druid sets the fields `errorClass` and `host` to null. Druid applies the transformation strategy to the `errorMessage` field.
+* For any JDBC-related exceptions, Druid turns all checked exceptions into `QueryInterruptedException`; otherwise Druid attempts to keep the original exception type. For example, if the original exception isn't owned by Druid, it becomes `QueryInterruptedException`. Druid applies the transformation strategy to the `errorMessage` field.
+
+##### No error response transform strategy
+
+In this mode, Druid leaves error responses from underlying services unchanged and returns the unchanged errors to the API client.
+This is the default Druid error response mode. To explicitly enable this strategy, set `druid.server.http.errorResponseTransform.strategy` to `none`.
+
+##### Allowed regular expression error response transform strategy
+
+In this mode, Druid validates the error responses from underlying services against a list of regular expressions. Only error messages that match a configured regular expression are returned. To enable this strategy, set `druid.server.http.errorResponseTransform.strategy` to `allowedRegex`.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.server.http.errorResponseTransform.allowedRegex`|The list of regular expressions Druid uses to validate error messages. If the error message matches any of the regular expressions, then Druid includes it in the response unchanged. If the error message does not match any of the regular expressions, Druid replaces the error message with null or with a default message depending on the type of underlying Exception. |`[]`|
+
+For example, consider the following error response:
+
+```json
+{"error":"Plan validation failed","errorMessage":"org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to line 1, column 38: Object 'nonexistent-datasource' not found","errorClass":"org.apache.calcite.tools.ValidationException","host":null}
+```
+
+If `druid.server.http.errorResponseTransform.allowedRegex` is set to `[]`, Druid transforms the query error response to the following:
+
+```json
+{"error":"Plan validation failed","errorMessage":null,"errorClass":null,"host":null}
+```
+
+On the other hand, if `druid.server.http.errorResponseTransform.allowedRegex` is set to `[".*CalciteContextException.*"]` then Druid transforms the query error response to the following:
+
+```json
+{"error":"Plan validation failed","errorMessage":"org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to line 1, column 38: Object 'nonexistent-datasource' not found","errorClass":null,"host":null}
+```
+
+##### Persona based error response transform strategy
+
+In this mode, Druid transforms any exceptions that are targeted at non-user personas. Instead of returning such an exception directly, the strategy logs the exception against a random ID and returns the ID along with a generic error message to the user.
+
+To enable this strategy, set `druid.server.http.errorResponseTransform.strategy` to `persona`.
+
+### Overlord discovery
+
+This config is used to find the [Overlord](../design/overlord.md) using Curator service discovery. Only required if you are actually running an Overlord.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.selectors.indexing.serviceName`|The druid.service name of the Overlord service. To start the Overlord with a different name, set it with this property. |druid/overlord|
+
+### Coordinator discovery
+
+This config is used to find the [Coordinator](../design/coordinator.md) using Curator service discovery. This config is used by the realtime indexing services to get information about the segments loaded in the cluster.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.selectors.coordinator.serviceName`|The druid.service name of the Coordinator service. To start the Coordinator with a different name, set it with this property. |druid/coordinator|
+
+### Announcing segments
+
+You can configure how to announce and unannounce Znodes in ZooKeeper (using Curator). For normal operations you do not need to override any of these configs.
+
+#### Batch data segment announcer
+
+In current Druid, multiple data segments may be announced under the same Znode.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.announcer.segmentsPerNode`|Each Znode contains info for up to this many segments.|50|
+|`druid.announcer.maxBytesPerNode`|Max byte size for Znode. Allowed range is [1024, 1048576].|524288|
+|`druid.announcer.skipDimensionsAndMetrics`|Skip Dimensions and Metrics list from segment announcements. NOTE: Enabling this will also remove the dimensions and metrics list from Coordinator and Broker endpoints.|false|
+|`druid.announcer.skipLoadSpec`|Skip segment LoadSpec from segment announcements. NOTE: Enabling this will also remove the loadspec from Coordinator and Broker endpoints.|false|
+
+If you want to turn off the batch data segment announcer, you can add a property to skip announcing segments. **You do not want to enable this config if you have any services using `batch` for `druid.serverview.type`**
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.announcer.skipSegmentAnnouncementOnZk`|Skip announcing segments to ZooKeeper. Note that the batch server view will not work if this is set to true.|false|
+
+### JavaScript
+
+Druid supports dynamic runtime extension through JavaScript functions. This functionality can be configured through
+the following properties.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.javascript.enabled`|Set to "true" to enable JavaScript functionality. This affects the JavaScript parser, filter, extractionFn, aggregator, post-aggregator, router strategy, and worker selection strategy.|false|
+
+:::info
+ JavaScript-based functionality is disabled by default. Please refer to the Druid [JavaScript programming guide](../development/javascript.md) for guidelines about using Druid's JavaScript functionality, including instructions on how to enable it.
+:::
+
+### Double column storage
+
+Prior to version 0.13.0, Druid's storage layer used a 32-bit float representation to store columns created by the
+doubleSum, doubleMin, and doubleMax aggregators at indexing time.
+Starting from version 0.13.0, the default is 64-bit floats for double columns.
+Using a 64-bit representation for double columns avoids precision loss at the cost of doubling their storage size.
+To keep the old format, set the system-wide property `druid.indexing.doubleStorage=float`.
+You can also use `floatSum`, `floatMin`, and `floatMax` to use 32-bit float representation.
+Support for 64-bit floating point columns was released in Druid 0.11.0, so if you use this feature then older versions of Druid will not be able to read your data segments.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.indexing.doubleStorage`|Set to "float" to use 32-bit double representation for double columns.|double|
+
+### HTTP client
+
+All Druid components can communicate with each other over HTTP.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.global.http.numConnections`|Size of connection pool per destination URL. If there are more HTTP requests than this number that all need to speak to the same URL, then they will queue up.|`20`|
+|`druid.global.http.eagerInitialization`|Indicates whether HTTP connections should be eagerly initialized. If set to true, `numConnections` connections are created upon initialization.|`false`|
+|`druid.global.http.compressionCodec`|Compression codec to communicate with others. May be "gzip" or "identity".|`gzip`|
+|`druid.global.http.readTimeout`|The timeout for data reads.|`PT15M`|
+|`druid.global.http.unusedConnectionTimeout`|The timeout for idle connections in connection pool. The connection in the pool will be closed after this timeout and a new one will be established. This timeout should be less than `druid.global.http.readTimeout`. Set this timeout = ~90% of `druid.global.http.readTimeout`|`PT4M`|
+|`druid.global.http.numMaxThreads`|Maximum number of I/O worker threads|`(number of cores) * 3 / 2 + 1`|
+|`druid.global.http.clientConnectTimeout`|The timeout (in milliseconds) for establishing client connections.|500|
+
+### Common endpoints configuration
+
+This section contains the configuration options for endpoints that are supported by all services.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.server.hiddenProperties`| If property names or substring of property names (case insensitive) is in this list, responses of the `/status/properties` endpoint do not show these properties | `["druid.s3.accessKey","druid.s3.secretKey","druid.metadata.storage.connector.password", "password", "key", "token", "pwd"]` |
+
+## Master server
+
+This section contains the configuration options for the services that reside on Master servers (Coordinators and Overlords) in the suggested [three-server configuration](../design/architecture.md#druid-servers).
+
+### Coordinator
+
+For general Coordinator services information, see [Coordinator service](../design/coordinator.md).
+
+#### Static Configuration
+
+These Coordinator static configurations can be defined in the `coordinator/runtime.properties` file.
+
+##### Coordinator service config
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.host`|The host for the current service. This is used to advertise the current service location as reachable from another service and should generally be specified such that `http://${druid.host}/` could actually talk to this service.|`InetAddress.getLocalHost().getCanonicalHostName()`|
+|`druid.bindOnHost`|Indicates whether the service's internal Jetty server binds on `druid.host`. The default is false, which means it binds to all interfaces.|false|
+|`druid.plaintextPort`|This is the port to actually listen on; unless port mapping is used, this will be the same port as is on `druid.host`|8081|
+|`druid.tlsPort`|TLS port for HTTPS connector, if [druid.enableTlsPort](../operations/tls-support.md) is set then this config will be used. If `druid.host` contains port then that port will be ignored. This should be a non-negative integer.|8281|
+|`druid.service`|The name of the service. This is used as a dimension when emitting metrics and alerts to differentiate between the various services.|`druid/coordinator`|
+|`druid.labels`|Optional JSON object of key-value pairs that define custom labels for the server. These labels are displayed in the web console under the "Services" tab. Example: `druid.labels={"location":"Airtrunk"}` or `druid.labels.location=Airtrunk`|`null`|
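+
+The following sketch shows how these settings might look in `coordinator/runtime.properties`; the hostname and label value are assumptions for illustration:
+
+```properties
+druid.host=coordinator1.example.com
+druid.plaintextPort=8081
+druid.service=druid/coordinator
+# Optional custom labels shown under the web console "Services" tab
+druid.labels={"location":"us-east-1"}
+```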
+
+##### Coordinator operation
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.coordinator.period`|The run period for the Coordinator. The Coordinator operates by maintaining the current state of the world in memory and periodically looking at the set of "used" segments and segments being served to make decisions about whether any changes need to be made to the data topology. This property sets the delay between each of these runs.|`PT60S`|
+|`druid.coordinator.startDelay`|The Coordinator operates on the assumption that it has an up-to-date view of the state of the world when it runs. However, the current ZooKeeper interaction code is written in a way that doesn't allow the Coordinator to know for a fact that it has finished loading the current state of the world. This delay is a hack to give it enough time to believe that it has all the data.|`PT300S`|
+|`druid.coordinator.load.timeout`|The timeout duration for when the Coordinator assigns a segment to a Historical service.|`PT15M`|
+|`druid.coordinator.balancer.strategy`|The [balancing strategy](../design/coordinator.md#balancing-segments-in-a-tier) used by the Coordinator to distribute segments among the Historical servers in a tier. The `cost` strategy distributes segments by minimizing a cost function, `diskNormalized` weights these costs with the disk usage ratios of the servers and `random` distributes segments randomly.|`cost`|
+|`druid.coordinator.loadqueuepeon.http.repeatDelay`|The start and repeat delay (in milliseconds) for the load queue peon, which manages the load/drop queue of segments for any server.|1 minute|
+|`druid.coordinator.loadqueuepeon.http.batchSize`|Number of segment load/drop requests to batch in one HTTP request. Note that it must be smaller than or equal to the `druid.segmentCache.numLoadingThreads` config on Historical service. If this value is not configured, the coordinator uses the value of the `numLoadingThreads` for the respective server. | `druid.segmentCache.numLoadingThreads` |
+|`druid.coordinator.asOverlord.enabled`|Boolean value for whether this Coordinator service should also act as an Overlord. This configuration allows users to simplify a Druid cluster by not having to deploy any standalone Overlord services. If set to true, the Overlord console is available at `http://coordinator-host:port/console.html`; be sure to also set `druid.coordinator.asOverlord.overlordService`.|false|
+|`druid.coordinator.asOverlord.overlordService`|Required if `druid.coordinator.asOverlord.enabled` is `true`. This must be the same value as `druid.service` on standalone Overlord services and `druid.selectors.indexing.serviceName` on Middle Managers.|NULL|
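+
+For example, a sketch of running a combined Coordinator-Overlord; the service name shown matches the default `druid.service` of a standalone Overlord and should be adjusted to your deployment:
+
+```properties
+# Let this Coordinator also act as the Overlord
+druid.coordinator.asOverlord.enabled=true
+# Must match druid.service on Overlords and druid.selectors.indexing.serviceName on Middle Managers
+druid.coordinator.asOverlord.overlordService=druid/overlord
+```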
+
+##### Data management
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.coordinator.period.indexingPeriod`|Period to run data management duties on the Coordinator including launching compact tasks and performing clean up of unused data. It is recommended to keep this value longer than `druid.manager.segments.pollDuration`.|`PT1800S` (30 mins)|
+|`druid.coordinator.kill.pendingSegments.on`|Boolean flag for whether the Coordinator should clean up old entries in the `pendingSegments` table of the metadata store. If set to true, the Coordinator checks the created time of the most recently completed task. If it doesn't exist, it finds the created time of the earliest running/pending/waiting task. Once the created time is found, then for all datasources not in the `killPendingSegmentsSkipList` (see [Dynamic configuration](#dynamic-configuration)), the Coordinator asks the Overlord to clean up entries in the `pendingSegments` table that are 1 day or more older than the found created time. This is done periodically based on the `druid.coordinator.period.indexingPeriod` specified.|true|
+|`druid.coordinator.kill.on`|Boolean flag to enable the Coordinator to submit a kill task for unused segments and delete them permanently from the metadata store and deep storage.|false|
+|`druid.coordinator.kill.period`| The frequency of sending kill tasks to the indexing service. The value must be greater than or equal to `druid.coordinator.period.indexingPeriod`. Only applies if kill is turned on.|Same as `druid.coordinator.period.indexingPeriod`|
+|`druid.coordinator.kill.durationToRetain`|Duration, in ISO 8601 format, relative to the current time that identifies the data interval of segments to retain. When `druid.coordinator.kill.on` is true, any segment with a data interval ending before `now - durationToRetain` is eligible for permanent deletion. For example, if `durationToRetain` is set to `P90D`, unused segments with time intervals ending 90 days in the past are eligible for deletion. If `durationToRetain` is set to a negative ISO 8601 period, segments with future intervals ending before `now - durationToRetain` are also eligible for deletion.|`P90D`|
+|`druid.coordinator.kill.ignoreDurationToRetain`|A way to override `druid.coordinator.kill.durationToRetain` and tell the coordinator that you do not care about the end date of unused segment intervals when it comes to killing them. If true, the coordinator considers all unused segments as eligible to be killed.|false|
+|`druid.coordinator.kill.bufferPeriod`|The amount of time that a segment must be unused before it is able to be permanently removed from metadata and deep storage. This can serve as a buffer period to prevent data loss if data ends up being needed after being marked unused.|`P30D`|
+|`druid.coordinator.kill.maxSegments`|The number of unused segments to kill per kill task. This number must be greater than 0. This only applies when `druid.coordinator.kill.on=true`.|100|
+|`druid.coordinator.kill.maxInterval`|The largest interval, as an [ISO 8601 duration](https://en.wikipedia.org/wiki/ISO_8601#Durations), of segments to delete per kill task. Set to zero, e.g. `PT0S`, for unlimited. This only applies when `druid.coordinator.kill.on=true`.|`P30D`|
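+
+A sketch of enabling permanent deletion of unused segments with a 90-day retention window follows; the values are illustrative, not recommendations:
+
+```properties
+# Permanently delete eligible unused segments from metadata storage and deep storage
+druid.coordinator.kill.on=true
+# Must be >= druid.coordinator.period.indexingPeriod
+druid.coordinator.kill.period=PT1800S
+druid.coordinator.kill.durationToRetain=P90D
+druid.coordinator.kill.bufferPeriod=P30D
+druid.coordinator.kill.maxSegments=100
+```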
+
+##### Metadata management
+
+|Property|Description|Required|Default|
+|--------|-----------|---------|-------|
+|`druid.coordinator.period.metadataStoreManagementPeriod`|How often to run metadata management tasks in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. |No | `PT1H`|
+|`druid.coordinator.kill.supervisor.on`| Boolean value for whether to enable automatic deletion of terminated supervisors. If set to true, Coordinator will periodically remove terminated supervisors from the supervisor table in metadata storage.| No |true|
+|`druid.coordinator.kill.supervisor.period`| How often to do automatic deletion of terminated supervisors in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Value must be equal to or greater than `druid.coordinator.period.metadataStoreManagementPeriod`. Only applies if `druid.coordinator.kill.supervisor.on` is set to true.| No| `P1D`|
+|`druid.coordinator.kill.supervisor.durationToRetain`| Duration of terminated supervisor to be retained from created time in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Only applies if `druid.coordinator.kill.supervisor.on` is set to true.| Yes if `druid.coordinator.kill.supervisor.on` is set to true.| `P90D`|
+|`druid.coordinator.kill.audit.on`| Boolean value for whether to enable automatic deletion of audit logs. If set to true, Coordinator will periodically remove audit logs from the audit table entries in metadata storage.| No | true|
+|`druid.coordinator.kill.audit.period`| How often to do automatic deletion of audit logs in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Value must be equal to or greater than `druid.coordinator.period.metadataStoreManagementPeriod`. Only applies if `druid.coordinator.kill.audit.on` is set to true.| No| `P1D`|
+|`druid.coordinator.kill.audit.durationToRetain`| Duration of audit logs to be retained from created time in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Only applies if `druid.coordinator.kill.audit.on` is set to true.| Yes if `druid.coordinator.kill.audit.on` is set to true.| `P90D`|
+|`druid.coordinator.kill.compaction.on`| Boolean value for whether to enable automatic deletion of compaction configurations. If set to true, Coordinator will periodically remove the compaction configuration of inactive datasources (datasources with no used or unused segments) from the config table in metadata storage. | No |true|
+|`druid.coordinator.kill.compaction.period`| How often to do automatic deletion of compaction configurations in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Value must be equal to or greater than `druid.coordinator.period.metadataStoreManagementPeriod`. Only applies if `druid.coordinator.kill.compaction.on` is set to true.| No| `P1D`|
+|`druid.coordinator.kill.rule.on`| Boolean value for whether to enable automatic deletion of rules. If set to true, Coordinator will periodically remove the rules of inactive datasources (datasources with no used or unused segments) from the rule table in metadata storage.| No | true|
+|`druid.coordinator.kill.rule.period`| How often to do automatic deletion of rules in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Value must be equal to or greater than `druid.coordinator.period.metadataStoreManagementPeriod`. Only applies if `druid.coordinator.kill.rule.on` is set to true.| No| `P1D`|
+|`druid.coordinator.kill.rule.durationToRetain`| Duration of rules to be retained from created time in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Only applies if `druid.coordinator.kill.rule.on` is set to true.| Yes if `druid.coordinator.kill.rule.on` is set to true.| `P90D`|
+|`druid.coordinator.kill.datasource.on`| Boolean value for whether to enable automatic deletion of datasource metadata (Note: datasource metadata only exists for datasources created from a supervisor). If set to true, Coordinator will periodically remove datasource metadata of terminated supervisors from the datasource table in metadata storage. | No | true|
+|`druid.coordinator.kill.datasource.period`| How often to do automatic deletion of datasource metadata in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Value must be equal to or greater than `druid.coordinator.period.metadataStoreManagementPeriod`. Only applies if `druid.coordinator.kill.datasource.on` is set to true.| No| `P1D`|
+|`druid.coordinator.kill.datasource.durationToRetain`| Duration of datasource metadata to be retained from created time in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Only applies if `druid.coordinator.kill.datasource.on` is set to true.| Yes if `druid.coordinator.kill.datasource.on` is set to true.| `P90D`|
+|`druid.coordinator.kill.segmentSchema.on`| Boolean value for whether to enable automatic deletion of unused segment schemas. If set to true, Coordinator will periodically identify segment schemas which are not referenced by any used segment and mark them as unused. At a later point, these unused schemas are deleted. Only applies if [Centralized Datasource schema](#centralized-datasource-schema-experimental) feature is enabled. | No | true|
+|`druid.coordinator.kill.segmentSchema.period`| How often to do automatic deletion of segment schemas in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Value must be equal to or greater than `druid.coordinator.period.metadataStoreManagementPeriod`. Only applies if `druid.coordinator.kill.segmentSchema.on` is set to true.| No| `P1D`|
+|`druid.coordinator.kill.segmentSchema.durationToRetain`| Duration of segment schemas to be retained from the time it was marked as unused in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Only applies if `druid.coordinator.kill.segmentSchema.on` is set to true.| Yes, if `druid.coordinator.kill.segmentSchema.on` is set to true.| `P90D`|
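+
+For instance, a sketch that keeps audit logs for only 30 days instead of the default 90 (values chosen purely for illustration):
+
+```properties
+druid.coordinator.kill.audit.on=true
+druid.coordinator.kill.audit.period=P1D
+druid.coordinator.kill.audit.durationToRetain=P30D
+```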
+
+##### Segment management
+
+|Property|Possible values|Description|Default|
+|--------|---------------|-----------|-------|
+|`druid.serverview.type`|batch or http|Segment discovery method to use. "http" enables discovering segments using HTTP instead of ZooKeeper.|http|
+|`druid.coordinator.segment.awaitInitializationOnStart`|true or false|Whether the Coordinator will wait for its view of segments to fully initialize before starting up. If set to 'true', the Coordinator's HTTP server will not start up, and the Coordinator will not announce itself as available, until the server view is initialized.|true|
+
+##### Metadata retrieval
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.manager.config.pollDuration`|How often the manager polls the config table for updates.|`PT1M`|
+|`druid.manager.segments.pollDuration`|The duration between polls the Coordinator does for updates to the set of active segments. Generally defines the amount of lag time it can take for the Coordinator to notice new segments.|`PT1M`|
+|`druid.manager.segments.useIncrementalCache`|(Experimental) Denotes the usage mode of the segment metadata incremental cache. This cache provides a performance improvement over the polling mechanism currently employed by the Coordinator as it retrieves payloads of only updated segments. Possible cache modes are: (a) `never`: Incremental cache is disabled. (b) `always`: Incremental cache is enabled. Service start-up will be blocked until cache has synced with the metadata store at least once. (c) `ifSynced`: Cache is enabled. This mode does not block service start-up and is a way to retain existing behavior of the Coordinator. If the incremental cache is in modes `always` or `ifSynced`, reads from the cache will block until it has synced with the metadata store at least once after becoming leader. The Coordinator never writes to this cache.|`never`|
+|`druid.manager.rules.pollDuration`|The duration between polls the Coordinator does for updates to the set of active rules. Generally defines the amount of lag time it can take for the Coordinator to notice rules.|`PT1M`|
+|`druid.manager.rules.defaultRule`|The default rule for the cluster|`_default`|
+|`druid.manager.rules.alertThreshold`|The duration after a failed poll upon which an alert should be emitted.|`PT10M`|
+
+#### Dynamic configuration
+
+The Coordinator has dynamic configurations to tune certain behavior on the fly, without requiring a service restart.
+You can configure these parameters using the [web console](../operations/web-console.md) (recommended) or through the [Coordinator dynamic configuration API](../api-reference/dynamic-configuration-api.md#coordinator-dynamic-configuration).
+
+The following table shows the dynamic configuration properties for the Coordinator.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`millisToWaitBeforeDeleting`|How long the Coordinator needs to be the leader before it can start marking overshadowed segments as unused in metadata storage.| 900000 (15 mins)|
+|`smartSegmentLoading`|Enables ["smart" segment loading mode](#smart-segment-loading) which dynamically computes the optimal values of several properties that maximize Coordinator performance.|true|
+|`maxSegmentsToMove`|The maximum number of segments that can be moved in a Historical tier at any given time.|100|
+|`replicantLifetime`|The maximum number of Coordinator runs for which a segment can wait in the load queue of a Historical before Druid raises an alert.|15|
+|`replicationThrottleLimit`|The maximum number of segment replicas that can be assigned to a historical tier in a single Coordinator run. This property prevents Historical services from becoming overwhelmed when loading extra replicas of segments that are already available in the cluster.|500|
+|`balancerComputeThreads`|Thread pool size for computing moving cost of segments during segment balancing. Consider increasing this if you have a lot of segments and moving segments begins to stall.|`num_cores` / 2|
+|`killDataSourceWhitelist`|List of specific data sources for which kill tasks can be issued if `druid.coordinator.kill.on` is true. It can be a comma-separated list of data source names or a JSON array. If `killDataSourceWhitelist` is empty, the Coordinator issues kill tasks for all data sources.|none|
+|`killTaskSlotRatio`|Ratio of total available task slots (including autoscaling, if applicable) that can be used by kill tasks. This value must be between 0 and 1. Only applicable for kill tasks that are spawned automatically by the Coordinator's auto kill duty, which is enabled when `druid.coordinator.kill.on` is true.|0.1|
+|`maxKillTaskSlots`|Maximum number of tasks that will be allowed for kill tasks. This limit only applies for kill tasks that are spawned automatically by the coordinator's auto kill duty, which is enabled when `druid.coordinator.kill.on` is true.|`Integer.MAX_VALUE` - no limit|
+|`killPendingSegmentsSkipList`|List of data sources for which pendingSegments are _NOT_ cleaned up if property `druid.coordinator.kill.pendingSegments.on` is true. This can be a list of comma-separated data sources or a JSON array.|none|
+|`maxSegmentsInNodeLoadingQueue`|The maximum number of segments allowed in the load queue of any given server. Use this parameter to load segments faster if, for example, the cluster contains slow-loading nodes or if there are too many segments to be replicated to a particular node (when faster loading is preferred to better segments distribution). The optimal value depends on the loading speed of segments, acceptable replication time and number of nodes.|500|
+|`useRoundRobinSegmentAssignment`|Boolean flag for whether segments should be assigned to Historical services in a round robin fashion. When disabled, segment assignment is done using the chosen balancer strategy. When enabled, this can speed up segment assignments leaving balancing to move the segments to their optimal locations (based on the balancer strategy) lazily.|true|
+|`decommissioningNodes`|List of Historical servers to decommission. Coordinator will not assign new segments to decommissioning servers, and segments will be moved away from them to be placed on non-decommissioning servers at the maximum rate specified by `maxSegmentsToMove`.|none|
+|`pauseCoordination`|Boolean flag for whether or not the Coordinator should execute its various duties of coordinating the cluster. Setting this to true essentially pauses all coordination work while allowing the API to remain up. Duties that are paused include all classes that implement the `CoordinatorDuty` interface. Such duties include: segment balancing, segment compaction, submitting kill tasks for unused segments (if enabled), logging of used segments in the cluster, marking of newly unused or overshadowed segments, matching and execution of load/drop rules for used segments, unloading segments that are no longer marked as used from Historical servers. An example of when an admin may want to pause coordination would be if they are doing deep storage maintenance on HDFS name nodes with downtime and don't want the Coordinator to be directing Historical nodes to hit the name node with API requests until maintenance is done and the deep store is declared healthy for use again.|false|
+|`replicateAfterLoadTimeout`|Boolean flag for whether or not additional replication is needed for segments that have failed to load due to the expiry of `druid.coordinator.load.timeout`. If this is set to true, the Coordinator will attempt to replicate the failed segment on a different historical server. This helps improve the segment availability if there are a few slow Historicals in the cluster. However, the slow Historical may still load the segment later and the Coordinator may issue drop requests if the segment is over-replicated.|false|
+|`turboLoadingNodes`| Experimental. List of Historical servers to place in turbo loading mode. These servers use a larger thread-pool to load segments faster but at the cost of query performance. For servers specified in `turboLoadingNodes`, `druid.coordinator.loadqueuepeon.http.batchSize` is ignored and the coordinator uses the value of the respective `numLoadingThreads` instead. Please use this config with caution. All servers should eventually be removed from this list once the segment loading on the respective historicals is finished. |none|
+|`cloneServers`| Experimental. Map from target Historical server to source Historical server which should be cloned by the target. The target Historical does not participate in regular segment assignment or balancing. Instead, the Coordinator mirrors any segment assignment made to the source Historical onto the target Historical, so that the target becomes an exact copy of the source. Segments on the target Historical do not count towards replica counts either. If the source disappears, the target remains in the last known state of the source server until removed from the configuration. Use this config with caution. All servers should eventually be removed from this list once the desired state on the respective Historicals is achieved. |none|
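+
+As a sketch, a Coordinator dynamic configuration payload submitted through the API linked above might look like the following; the server name and values are illustrative assumptions:
+
+```json
+{
+  "smartSegmentLoading": true,
+  "killTaskSlotRatio": 0.05,
+  "decommissioningNodes": ["historical-old.example.com:8083"],
+  "pauseCoordination": false
+}
+```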
+
+##### Smart segment loading
+
+The `smartSegmentLoading` mode simplifies Coordinator configuration for segment loading and balancing.
+If you enable this mode, do not provide values for the properties in the table below as the Coordinator computes them automatically.
+Druid computes the values to optimize Coordinator performance, based on the current state of the cluster.
+
+If you enable `smartSegmentLoading` mode, Druid ignores any value you provide for the following properties.
+
+|Property|Computed value|Description|
+|--------|--------------|-----------|
+|`useRoundRobinSegmentAssignment`|true|Speeds up segment assignment.|
+|`maxSegmentsInNodeLoadingQueue`|0|Removes the limit on load queue size.|
+|`replicationThrottleLimit`|5% of used segments, minimum value 100|Prevents aggressive replication when a Historical disappears only intermittently.|
+|`replicantLifetime`|60|Allows segments to wait about an hour (assuming a Coordinator period of 1 minute) in the load queue before an alert is raised. In `smartSegmentLoading` mode, load queues are not limited by size. Segments might therefore be assigned to a load queue even if the corresponding server is slow to load them.|
+|`maxSegmentsToMove`|2% of used segments, minimum value 100, maximum value 1000|Ensures that some segments are always moving in the cluster to keep it well balanced. The maximum value keeps the Coordinator run times bounded.|
+|`balancerComputeThreads`|`num_cores` / 2|Ensures that there are enough threads to perform balancing computations without hogging all Coordinator resources.|
+
+When `smartSegmentLoading` is disabled, Druid uses the configured values of these properties.
+Disable `smartSegmentLoading` only if you want to explicitly set the values of any of the above properties.
+
+##### Lookups dynamic configuration
+
+These configuration options control Coordinator lookup management. For configurations that affect lookup propagation, see [Dynamic configuration for lookups](../querying/lookups.md#dynamic-configuration).
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.manager.lookups.hostDeleteTimeout`|How long to wait for a `DELETE` request to a particular service before considering the `DELETE` a failure.|`PT1S`|
+|`druid.manager.lookups.hostUpdateTimeout`|How long to wait for a `POST` request to a particular service before considering the `POST` a failure.|`PT10S`|
+|`druid.manager.lookups.deleteAllTimeout`|How long to wait for all `DELETE` requests to finish before considering the delete attempt a failure.|`PT10S`|
+|`druid.manager.lookups.updateAllTimeout`|How long to wait for all `POST` requests to finish before considering the attempt a failure.|`PT60S`|
+|`druid.manager.lookups.threadPoolSize`|How many services can be managed concurrently (concurrent `POST` and `DELETE` requests). Requests beyond this limit wait in a queue until a slot becomes available.|10|
+|`druid.manager.lookups.period`|Number of milliseconds between checks for configuration changes.|120000 (2 minutes)|
+
+##### Automatic compaction dynamic configuration
+
+You can set or update [automatic compaction](../data-management/automatic-compaction.md) properties dynamically using the
+[Automatic compaction API](../api-reference/automatic-compaction-api.md) without restarting Coordinators.
+
+For details about segment compaction, see [Segment size optimization](../operations/segment-optimization.md).
+
+You can configure automatic compaction through the following properties:
+
+|Property|Description|Required|
+|--------|-----------|--------|
+|`dataSource`|The datasource name to be compacted.|yes|
+|`taskPriority`|[Priority](../ingestion/tasks.md#lock-priority) of compaction task.|no (default = 25)|
+|`inputSegmentSizeBytes`|Maximum number of total segment bytes processed per compaction task. Since a time chunk must be processed in its entirety, if the segments for a particular time chunk have a total size in bytes greater than this parameter, compaction will not run for that time chunk.|no (default = 100,000,000,000,000 i.e. 100TB)|
+|`skipOffsetFromLatest`|The offset for searching segments to be compacted in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Strongly recommended to set for realtime datasources. See [Data handling with compaction](../data-management/compaction.md#data-handling-with-compaction).|no (default = "P1D")|
+|`tuningConfig`|Tuning config for compaction tasks. See below [Automatic compaction tuningConfig](#automatic-compaction-tuningconfig).|no|
+|`taskContext`|[Task context](../ingestion/tasks.md#context-parameters) for compaction tasks.|no|
+|`granularitySpec`|Custom `granularitySpec`. See [Automatic compaction granularitySpec](#automatic-compaction-granularityspec).|no|
+|`dimensionsSpec`|Custom `dimensionsSpec`. See [Automatic compaction dimensionsSpec](#automatic-compaction-dimensionsspec).|no|
+|`transformSpec`|Custom `transformSpec`. See [Automatic compaction transformSpec](#automatic-compaction-transformspec).|no|
+|`metricsSpec`|Custom [`metricsSpec`](../ingestion/ingestion-spec.md#metricsspec). The compaction task preserves any existing metrics regardless of whether `metricsSpec` is specified. If `metricsSpec` is specified, Druid does not reapply any aggregators matching the metric names specified in `metricsSpec` to rows that already have the associated metrics. For rows that do not already have the metric specified in `metricsSpec`, Druid applies the metric aggregator on the source column, then proceeds to combine the metrics across segments as usual. If `metricsSpec` is not specified, Druid automatically discovers the metrics in the existing segments and combines existing metrics with the same metric name across segments. Aggregators for metrics with the same name are assumed to be compatible for combining across segments, otherwise the compaction task may fail.|no|
+|`ioConfig`|IO config for compaction tasks. See [Automatic compaction ioConfig](#automatic-compaction-ioconfig).|no|
+
+Automatic compaction config example:
+
+```json
+{
+ "dataSource": "wikiticker",
+ "granularitySpec" : {
+ "segmentGranularity" : "none"
+ }
+}
+```
+
+Compaction tasks fail when higher priority tasks cause Druid to revoke their locks. By default, realtime tasks like ingestion have a higher priority than compaction tasks. Frequent conflicts between compaction tasks and realtime tasks can cause the Coordinator's automatic compaction to hang.
+You may see this issue with streaming ingestion from Kafka and Kinesis, which ingest late-arriving data.
+
+To mitigate this problem, set `skipOffsetFromLatest` to a value large enough so that arriving data tends to fall outside the offset value from the current time. This way you can avoid conflicts between compaction tasks and realtime ingestion tasks.
+For example, if you want to skip over segments from thirty days prior to the end time of the most recent segment, assign `"skipOffsetFromLatest": "P30D"`.
+For more information, see [Avoid conflicts with ingestion](../data-management/automatic-compaction.md#avoid-conflicts-with-ingestion).
+
+###### Automatic compaction tuningConfig
+
+Auto-compaction supports a subset of the [tuningConfig for Parallel task](../ingestion/native-batch.md#tuningconfig).
+
+The following table shows the supported configurations for auto-compaction.
+
+|Property|Description|Required|
+|--------|-----------|--------|
+|`type`|The task type. If you're using Coordinator duties for auto-compaction, set it to `index_parallel`. If you're using compaction supervisors, set it to `autocompact`. |yes|
+|`maxRowsInMemory`|Used in determining when intermediate persists to disk should occur. Normally you do not need to set this, but depending on the nature of the data, if rows are small in terms of bytes, you may not want to store a million rows in memory, and you should set this value accordingly.|no (default = 1000000)|
+|`maxBytesInMemory`|Used in determining when intermediate persists to disk should occur. Normally this is computed internally and you do not need to set it. This value represents the number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. The maximum heap memory usage for indexing is `maxBytesInMemory` * (2 + `maxPendingPersists`).|no (default = 1/6 of max JVM memory)|
+|`splitHintSpec`|Used to give a hint to control the amount of data that each first phase task reads. This hint could be ignored depending on the implementation of the input source. See [Split hint spec](../ingestion/native-batch.md#split-hint-spec) for more details.|no (default = size-based split hint spec)|
+|`partitionsSpec`|Defines how to partition data in each time chunk, see [`PartitionsSpec`](../ingestion/native-batch.md#partitionsspec)|no (default = `dynamic`)|
+|`indexSpec`|Defines segment storage format options to be used at indexing time, see [IndexSpec](../ingestion/ingestion-spec.md#indexspec)|no|
+|`indexSpecForIntermediatePersists`|Defines segment storage format options to be used at indexing time for intermediate persisted temporary segments. This can be used to disable dimension/metric compression on intermediate segments to reduce the memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into the final published segment. See [IndexSpec](../ingestion/ingestion-spec.md#indexspec) for possible values.|no|
+|`maxPendingPersists`|Maximum number of persists that can be pending but not started. If this limit would be exceeded by a new intermediate persist, ingestion will block until the currently-running persist finishes. Maximum heap memory usage for indexing scales with `maxRowsInMemory` * (2 + `maxPendingPersists`).|no (default = 0, meaning one persist can be running concurrently with ingestion, and none can be queued up)|
+|`pushTimeout`|Milliseconds to wait for pushing segments. It must be >= 0, where 0 means to wait forever.|no (default = 0)|
+|`segmentWriteOutMediumFactory`|Segment write-out medium to use when creating segments. See [SegmentWriteOutMediumFactory](../ingestion/native-batch.md#segmentwriteoutmediumfactory).|no (default = the value of `druid.peon.defaultSegmentWriteOutMediumFactory.type`)|
+|`maxNumConcurrentSubTasks`|Maximum number of worker tasks that can run in parallel at the same time. The supervisor task spawns worker tasks up to `maxNumConcurrentSubTasks` regardless of the currently available task slots. If this value is set to 1, the supervisor task processes data ingestion on its own instead of spawning worker tasks. If this value is set too large, too many worker tasks can be created, which might block other ingestion. Check [Capacity Planning](../ingestion/native-batch.md#capacity-planning) for more details.|no (default = 1)|
+|`maxRetry`|Maximum number of retries on task failures.|no (default = 3)|
+|`maxNumSegmentsToMerge`|Max limit for the number of segments that a single task can merge at the same time in the second phase. Used only with `hashed` or `single_dim` partitionsSpec.|no (default = 100)|
+|`totalNumMergeTasks`|Total number of tasks to merge segments in the merge phase when `partitionsSpec` is set to `hashed` or `single_dim`.|no (default = 10)|
+|`taskStatusCheckPeriodMs`|Polling period in milliseconds to check running task statuses.|no (default = 1000)|
+|`chatHandlerTimeout`|Timeout for reporting the pushed segments in worker tasks.|no (default = PT10S)|
+|`chatHandlerNumRetries`|Retries for reporting the pushed segments in worker tasks.|no (default = 5)|
+|`engine` | Engine for compaction. Can be either `native` or `msq`. `msq` uses the MSQ task engine and is only supported with [compaction supervisors](../data-management/automatic-compaction.md#auto-compaction-using-compaction-supervisors). | no (default = native)|
+
+###### Automatic compaction granularitySpec
+
+|Field|Description|Required|
+|-----|-----------|--------|
+|`segmentGranularity`|Time chunking period for the segment granularity. Defaults to 'null', which preserves the original segment granularity. Accepts all [Query granularity](../querying/granularities.md) values.|No|
+|`queryGranularity`|The resolution of timestamp storage within each segment. Defaults to 'null', which preserves the original query granularity. Accepts all [Query granularity](../querying/granularities.md) values.|No|
+|`rollup`|Whether to enable ingestion-time rollup or not. Defaults to null, which preserves the original setting. Note that once data is rolled up, individual records can no longer be recovered. |No|
+
+###### Automatic compaction dimensionsSpec
+
+|Field|Description|Required|
+|-----|-----------|--------|
+|`dimensions`| A list of dimension names or objects. Defaults to null, which preserves the original dimensions. Note that setting this will cause segments manually compacted with `dimensionExclusions` to be compacted again.|No|
+
+###### Automatic compaction transformSpec
+
+|Field|Description|Required|
+|-----|-----------|--------|
+|`filter`| Conditionally filters input rows during compaction. Only rows that pass the filter will be included in the compacted segments. Any of Druid's standard [query filters](../querying/filters.md) can be used. Defaults to null, which will not filter any row. |No|
+
+###### Automatic compaction ioConfig
+
+Auto-compaction supports a subset of the [ioConfig for Parallel task](../ingestion/native-batch.md).
+The following table shows the supported configurations for auto-compaction.
+
+|Property|Description|Default|Required|
+|--------|-----------|-------|--------|
+|`dropExisting`|If `true` the compaction task replaces all existing segments fully contained by the umbrella interval of the compacted segments when the task publishes new segments and tombstones. If compaction fails, Druid does not publish any segments or tombstones. WARNING: this functionality is still in beta. Note that changing this config does not cause intervals to be compacted again.|false|no|
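+
+Putting these pieces together, the following sketch shows a fuller automatic compaction configuration for a hypothetical `wikiticker` datasource; the granularity, partitioning, and concurrency values are illustrative assumptions, not recommendations:
+
+```json
+{
+  "dataSource": "wikiticker",
+  "skipOffsetFromLatest": "P1D",
+  "granularitySpec": {
+    "segmentGranularity": "day"
+  },
+  "tuningConfig": {
+    "type": "index_parallel",
+    "partitionsSpec": {
+      "type": "dynamic"
+    },
+    "maxNumConcurrentSubTasks": 2
+  },
+  "ioConfig": {
+    "dropExisting": false
+  }
+}
+```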
+
+### Overlord
+
+For general Overlord service information, see [Overlord](../design/overlord.md).
+
+#### Overlord static configuration
+
+These Overlord static configurations can be defined in the `overlord/runtime.properties` file.
+
+##### Overlord service configs
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.host`|The host for the current service. This is used to advertise the current service location as reachable from another service and should generally be specified such that `http://${druid.host}/` could actually talk to this service.|`InetAddress.getLocalHost().getCanonicalHostName()`|
+|`druid.bindOnHost`|Indicates whether the service's internal Jetty server binds on `druid.host`. The default is false, which means it binds to all interfaces.|false|
+|`druid.plaintextPort`|This is the port to actually listen on; unless port mapping is used, this will be the same port as is on `druid.host`.|8090|
+|`druid.tlsPort`|TLS port for HTTPS connector, if [druid.enableTlsPort](../operations/tls-support.md) is set then this config will be used. If `druid.host` contains port then that port will be ignored. This should be a non-negative Integer.|8290|
+|`druid.service`|The name of the service. This is used as a dimension when emitting metrics and alerts to differentiate between the various services.|`druid/overlord`|
+|`druid.labels`|Optional JSON object of key-value pairs that define custom labels for the server. These labels are displayed in the web console under the "Services" tab. Example: `druid.labels={"location":"Airtrunk"}` or `druid.labels.location=Airtrunk`|`null`|
+
+##### Overlord operations
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.indexer.runner.type`|Indicates whether tasks should be run locally using `local` or in a distributed environment using `remote`. The recommended option is `httpRemote`, which is similar to `remote` but uses HTTP to interact with Middle Managers instead of ZooKeeper.|`httpRemote`|
+|`druid.indexer.storage.type`|Indicates whether incoming tasks should be stored locally (in heap) or in metadata storage. One of `local` or `metadata`. `local` is mainly for internal testing while `metadata` is recommended in production because storing incoming tasks in metadata storage allows for tasks to be resumed if the Overlord should fail.|`local`|
+|`druid.indexer.storage.recentlyFinishedThreshold`|Duration of time to store task results. Default is 24 hours. If you have hundreds of tasks running in a day, consider increasing this threshold.|`PT24H`|
+|`druid.indexer.tasklock.forceTimeChunkLock`|**Setting this to false is still experimental.** If set to true, all tasks are forced to use time chunk locks. If set to false, each task automatically chooses a lock type to use. This configuration can be overwritten by setting `forceTimeChunkLock` in the [task context](../ingestion/tasks.md#context-parameters). See [Task lock system](../ingestion/tasks.md#task-lock-system) for more details about locking in tasks.|true|
+|`druid.indexer.tasklock.batchSegmentAllocation`| If set to true, Druid performs segment allocate actions in batches to improve throughput and reduce the average `task/action/run/time`. See [batching `segmentAllocate` actions](../ingestion/tasks.md#batching-segmentallocate-actions) for details.|true|
+|`druid.indexer.tasklock.batchAllocationWaitTime`|Number of milliseconds after Druid adds the first segment allocate action to a batch, until it executes the batch. Allows the batch to add more requests and improve the average segment allocation run time. This configuration takes effect only if `batchSegmentAllocation` is enabled.|0|
+|`druid.indexer.tasklock.batchAllocationNumThreads`|Number of worker threads to use for batch segment allocation. This represents the maximum number of allocation batches that can be processed in parallel for distinct datasources. Batches for a single datasource are always processed sequentially. This configuration takes effect only if `batchSegmentAllocation` is enabled.|5|
+|`druid.indexer.task.default.context`|Default task context that is applied to all tasks submitted to the Overlord. Defaults in this config override neither the context values the user provides nor `druid.indexer.tasklock.forceTimeChunkLock`.|empty context|
+|`druid.indexer.queue.maxSize`|Maximum number of active tasks at one time.|`Integer.MAX_VALUE`|
+|`druid.indexer.queue.startDelay`|Sleep this long before starting Overlord queue management. This can be useful to give a cluster time to re-orient itself (for example, after a widespread network issue).|`PT1M`|
+|`druid.indexer.queue.restartDelay`|Sleep this long when Overlord queue management throws an exception before trying again.|`PT30S`|
+|`druid.indexer.queue.storageSyncRate`|Sync Overlord state this often with an underlying task persistence mechanism.|`PT1M`|
+|`druid.indexer.queue.maxTaskPayloadSize`|Maximum allowed size in bytes of a single task payload accepted by the Overlord.|none (allow all task payload sizes)|
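+
+As an illustrative sketch, an `overlord/runtime.properties` might combine the service and operation settings above as follows (the hostname is an assumption):
+
+```properties
+druid.host=overlord1.example.com
+druid.plaintextPort=8090
+druid.service=druid/overlord
+
+# Run tasks on Middle Managers over HTTP and persist task state in metadata storage
+druid.indexer.runner.type=httpRemote
+druid.indexer.storage.type=metadata
+druid.indexer.queue.startDelay=PT1M
+```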
+
+The following configs only apply if the Overlord is running in remote mode. For a description of local vs. remote mode, see [Overlord service](../design/overlord.md).
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.indexer.runner.taskAssignmentTimeout`|How long to wait after a task has been assigned to a Middle Manager before throwing an error.|`PT5M`|
+|`druid.indexer.runner.minWorkerVersion`|The minimum Middle Manager version to send tasks to. The version number is a string. This affects the expected behavior during certain operations like comparison against `druid.worker.version`. Specifically, the version comparison follows dictionary order. Use ISO8601 date format for the version to accommodate date comparisons. |"0"|
+| `druid.indexer.runner.parallelIndexTaskSlotRatio`| The ratio of task slots available for parallel indexing supervisor tasks per worker. The specified value must be in the range `[0, 1]`. |1|
+|`druid.indexer.runner.compressZnodes`|Indicates whether or not the Overlord should expect Middle Managers to compress Znodes.|true|
+|`druid.indexer.runner.maxZnodeBytes`|The maximum size Znode in bytes that can be created in ZooKeeper, should be in the range of `[10KiB, 2GiB)`. [Human-readable format](human-readable-byte.md) is supported.| 512 KiB |
+|`druid.indexer.runner.taskCleanupTimeout`|How long to wait before failing a task after a Middle Manager is disconnected from ZooKeeper.|`PT15M`|
+|`druid.indexer.runner.taskShutdownLinkTimeout`|How long to wait on a shutdown request to a Middle Manager before timing out|`PT1M`|
+|`druid.indexer.runner.pendingTasksRunnerNumThreads`|Number of threads to allocate pending-tasks to workers, must be at least 1.|1|
+|`druid.indexer.runner.maxRetriesBeforeBlacklist`|Number of consecutive times a Middle Manager can fail tasks before the worker is blacklisted. Must be at least 1.|5|
+|`druid.indexer.runner.workerBlackListBackoffTime`|How long to wait before a task is whitelisted again. This value should be greater than the value set for `druid.indexer.runner.workerBlackListCleanupPeriod`.|`PT15M`|
+|`druid.indexer.runner.workerBlackListCleanupPeriod`|A duration after which the cleanup thread will start up to clean blacklisted workers.|`PT5M`|
+|`druid.indexer.runner.maxPercentageBlacklistWorkers`|The maximum percentage of workers to blacklist. This must be between 0 and 100.|20|
+
+If autoscaling is enabled, you can set these additional configs:
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.indexer.autoscale.strategy`|Sets the strategy to run when autoscaling is required. One of `noop`, `ec2` or `gce`.|`noop`|
+|`druid.indexer.autoscale.doAutoscale`|If set to true, autoscaling will be enabled.|false|
+|`druid.indexer.autoscale.provisionPeriod`|How often to check whether or not new Middle Managers should be added.|`PT1M`|
+|`druid.indexer.autoscale.terminatePeriod`|How often to check when Middle Managers should be removed.|`PT5M`|
+|`druid.indexer.autoscale.originTime`|The starting reference timestamp that the terminate period increments upon.|`2012-01-01T00:55:00.000Z`|
+|`druid.indexer.autoscale.workerIdleTimeout`|How long a worker can be idle (not running a task) before it can be considered for termination.|`PT90M`|
+|`druid.indexer.autoscale.maxScalingDuration`|How long the Overlord will wait around for a Middle Manager to show up before giving up.|`PT15M`|
+|`druid.indexer.autoscale.numEventsToTrack`|The number of autoscaling related events (node creation and termination) to track.|10|
+|`druid.indexer.autoscale.pendingTaskTimeout`|How long a task can be in "pending" state before the Overlord tries to scale up.|`PT30S`|
+|`druid.indexer.autoscale.workerVersion`|If set, the autoscaler only creates nodes of this version during autoscaling. Overrides dynamic configuration.|null|
+|`druid.indexer.autoscale.workerPort`|The port that Middle Managers will run on.|8080|
+|`druid.indexer.autoscale.workerCapacityHint`| An estimation of the number of task slots available for each worker launched by the auto scaler when there are no workers running. The auto scaler uses the worker capacity hint to launch workers with an adequate capacity to handle pending tasks. When unset or set to a value less than or equal to 0, the auto scaler scales workers equal to the value for `minNumWorkers` in autoScaler config instead. The auto scaler assumes that each worker, either a Middle Manager or indexer, has the same amount of task slots. Therefore, when all your workers have the same capacity (homogeneous capacity), set the value for `autoscale.workerCapacityHint` equal to `druid.worker.capacity`. If your workers have different capacities (heterogeneous capacity), set the value to the average of `druid.worker.capacity` across the workers. For example, if two workers have `druid.worker.capacity=10`, and one has `druid.worker.capacity=4`, set `autoscale.workerCapacityHint=8`. Only applies to `pendingTaskBased` provisioning strategy.|-1|
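+
+A sketch of enabling EC2-based autoscaling with these static configs follows; the periods shown are the defaults and are included only for illustration. The autoscaler environment itself (instance types, worker counts, and so on) is supplied through the Overlord dynamic configuration described later in this section.
+
+```properties
+druid.indexer.autoscale.doAutoscale=true
+druid.indexer.autoscale.strategy=ec2
+druid.indexer.autoscale.provisionPeriod=PT1M
+druid.indexer.autoscale.terminatePeriod=PT5M
+druid.indexer.autoscale.workerIdleTimeout=PT90M
+druid.indexer.autoscale.pendingTaskTimeout=PT30S
+```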
+
+##### Supervisors
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.supervisor.healthinessThreshold`|The number of successful runs before an unhealthy supervisor is again considered healthy.|3|
+|`druid.supervisor.unhealthinessThreshold`|The number of failed runs before the supervisor is considered unhealthy.|3|
+|`druid.supervisor.taskHealthinessThreshold`|The number of consecutive task successes before an unhealthy supervisor is again considered healthy.|3|
+|`druid.supervisor.taskUnhealthinessThreshold`|The number of consecutive task failures before the supervisor is considered unhealthy.|3|
+|`druid.supervisor.storeStackTrace`|Whether full stack traces of supervisor exceptions should be stored and returned by the supervisor `/status` endpoint.|false|
+|`druid.supervisor.maxStoredExceptionEvents`|The maximum number of exception events that can be returned through the supervisor `/status` endpoint.|`max(healthinessThreshold, unhealthinessThreshold)`|
+|`druid.supervisor.idleConfig.enabled`|If `true`, supervisor can become idle if there is no data on input stream/topic for some time.|false|
+|`druid.supervisor.idleConfig.inactiveAfterMillis`|Supervisor is marked as idle if all existing data has been read from input topic and no new data has been published for `inactiveAfterMillis` milliseconds.|`600_000`|
+
+The `druid.supervisor.idleConfig.*` specification in the Overlord runtime properties defines the default behavior for the entire cluster. See [Idle Configuration in Kafka Supervisor IOConfig](../ingestion/kinesis-ingestion.md#io-configuration) to override it for an individual supervisor.
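+
+For example, a sketch of the cluster-wide default idle behavior; the threshold shown equals the default and is included only for illustration:
+
+```properties
+# Allow supervisors to transition to the idle state
+druid.supervisor.idleConfig.enabled=true
+# Mark a supervisor idle after 10 minutes (600000 ms) without new data
+druid.supervisor.idleConfig.inactiveAfterMillis=600000
+```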
+
+##### Segment metadata cache (Experimental)
+
+The following properties pertain to segment metadata caching on the Overlord, which may be used to speed up segment allocation and other metadata operations.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.manager.segments.useIncrementalCache`|Denotes the usage mode of the segment metadata incremental cache. Possible modes are: (a) `never`: Cache is disabled. (b) `always`: Reads are always done from the cache. Service start-up will be blocked until cache has synced with the metadata store at least once. Transactions will block until cache has synced with the metadata store at least once after becoming leader. (c) `ifSynced`: Reads are done from the cache only if it has already synced with the metadata store. This mode does not block service start-up or transactions.|`never`|
+|`druid.manager.segments.pollDuration`|Duration (in ISO 8601 format) between successive syncs of the cache with the metadata store. This property is used only when `druid.manager.segments.useIncrementalCache` is set to `always` or `ifSynced`.|`PT1M` (1 minute)|
+
+##### Auto-kill unused segments (Experimental)
+
+These configs pertain to the new embedded mode of running [kill tasks on the Overlord](../data-management/delete.md#auto-kill-data-on-the-overlord-experimental).
+None of the configs that apply to [auto-kill performed by the Coordinator](../data-management/delete.md#auto-kill-data-using-coordinator-duties) are used by this feature.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.manager.segments.killUnused.enabled`|Boolean flag to enable auto-kill of eligible unused segments on the Overlord. This feature can be used only when [segment metadata caching](#segment-metadata-cache-experimental) is enabled on the Overlord and MUST NOT be enabled if `druid.coordinator.kill.on` is already set to `true` on the Coordinator.|`true`|
+|`druid.manager.segments.killUnused.bufferPeriod`|Period after which a segment marked as unused becomes eligible for auto-kill on the Overlord. This config is effective only if `druid.manager.segments.killUnused.enabled` is set to `true`.|`P30D` (30 days)|
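+
+A sketch of enabling Overlord-side auto-kill together with the segment metadata cache it depends on follows; enable this only if `druid.coordinator.kill.on` is false on the Coordinator, and treat the values as illustrative:
+
+```properties
+# Segment metadata incremental cache, required for Overlord auto-kill
+druid.manager.segments.useIncrementalCache=ifSynced
+druid.manager.segments.pollDuration=PT1M
+
+# Kill eligible unused segments from the Overlord after a 30-day buffer period
+druid.manager.segments.killUnused.enabled=true
+druid.manager.segments.killUnused.bufferPeriod=P30D
+```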
+
+#### Overlord dynamic configuration
+
+The Overlord has dynamic configurations to tune how Druid assigns tasks to workers.
+You can configure these parameters using the [web console](../operations/web-console.md) or through the [Overlord dynamic configuration API](../api-reference/dynamic-configuration-api.md#overlord-dynamic-configuration).
+
+The following table shows the dynamic configuration properties for the Overlord.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`selectStrategy`| Describes how to assign tasks to Middle Managers. The type can be `equalDistribution`, `equalDistributionWithCategorySpec`, `fillCapacity`, `fillCapacityWithCategorySpec`, or `javascript`. | `{"type":"equalDistribution"}` |
+|`autoScaler`| Only used if [autoscaling](#autoscaler) is enabled.| null |
+
+The following is an example of an Overlord dynamic config:
+
+```json
+{
+ "selectStrategy": {
+ "type": "fillCapacity",
+ "affinityConfig": {
+ "affinity": {
+ "datasource1": ["host1:port", "host2:port"],
+ "datasource2": ["host3:port"]
+ }
+ }
+ },
+ "autoScaler": {
+ "type": "ec2",
+ "minNumWorkers": 2,
+ "maxNumWorkers": 12,
+ "envConfig": {
+ "availabilityZone": "us-east-1a",
+ "nodeData": {
+ "amiId": "${AMI}",
+ "instanceType": "c3.8xlarge",
+ "minInstances": 1,
+ "maxInstances": 1,
+ "securityGroupIds": ["${IDs}"],
+ "keyName": "${KEY_NAME}"
+ },
+ "userData": {
+ "impl": "string",
+ "data": "${SCRIPT_COMMAND}",
+ "versionReplacementString": ":VERSION:",
+ "version": null
+ }
+ }
+ }
+}
+```
+
+##### Worker select strategy
+
+The select strategy controls how Druid assigns tasks to workers (Middle Managers).
+At a high level, the select strategy determines the list of eligible workers for a given task using
+either an `affinityConfig` or a `categorySpec`. Then, Druid assigns the task by either trying to distribute load equally
+(`equalDistribution`) or to fill as many workers as possible to capacity (`fillCapacity`).
+There are four options for select strategies:
+
+* [`equalDistribution`](#equaldistribution)
+* [`equalDistributionWithCategorySpec`](#equaldistributionwithcategoryspec)
+* [`fillCapacity`](#fillcapacity)
+* [`fillCapacityWithCategorySpec`](#fillcapacitywithcategoryspec)
+
+A `javascript` option is also available but should only be used for prototyping new strategies.
+
+If an `affinityConfig` is provided (as part of `fillCapacity` and `equalDistribution` strategies) for a given task, the list of workers eligible to be assigned is determined as follows:
+
+* a preferred worker listed in the `affinityConfig` for this datasource, if it has available capacity
+* a non-affinity worker if no affinity is specified for that datasource. Any worker not listed in the `affinityConfig` is considered a non-affinity worker.
+* a non-affinity worker if preferred workers are not available and the affinity is _weak_, i.e. `strong: false`.
+* no worker if preferred workers are not available and the affinity is _strong_, i.e. `strong: true`. In this case, the task remains in "pending" state. The chosen provisioning strategy (e.g. `pendingTaskBased`) may then use the total number of pending tasks to determine if a new node should be provisioned.
+
+Note that every worker listed in the `affinityConfig` will only be used for the assigned datasources and no other.
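+
+As a sketch, a strong affinity configuration for the `equalDistribution` strategy might look like the following (the hostnames are assumptions); with `"strong": true`, tasks for `ds1` remain pending until one of the listed workers has capacity:
+
+```json
+{
+  "selectStrategy": {
+    "type": "equalDistribution",
+    "affinityConfig": {
+      "affinity": {
+        "ds1": ["middleManager1.example.com:8091", "middleManager2.example.com:8091"]
+      },
+      "strong": true
+    }
+  }
+}
+```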
+
+If a `categorySpec` is provided (as part of `fillCapacityWithCategorySpec` and `equalDistributionWithCategorySpec` strategies), then a task of a given datasource may be assigned to:
+
+* any worker if no category config is given for task type
+* any worker if category config is given for task type but no category is given for datasource and there's no default category
+* a preferred worker (based on category config and category for datasource) if available
+* any worker if category config and category are given but no preferred worker is available and category config is `weak`
+* not assigned at all if preferred workers are not available and category config is `strong`
+
+In both cases, Druid determines the list of eligible workers and selects one depending on their load with the goal of either distributing the load equally or filling as few workers as possible.
+
+If you are using auto-scaling, use the `fillCapacity` select strategy, since auto-scaled nodes cannot
+be assigned a category, and you want the work to be concentrated on as few workers as possible so that the empty ones can scale down.
+
+###### `equalDistribution`
+
+Tasks are assigned to the Middle Manager with the most free slots at the time the task begins running.
+This evenly distributes work across your Middle Managers.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`type`|`equalDistribution`|required; must be `equalDistribution`|
+|`affinityConfig`|[`AffinityConfig`](#affinityconfig) object|null (no affinity)|
+|`taskLimits`|[`TaskLimits`](#tasklimits) object|null (no limits)|
+
+###### `equalDistributionWithCategorySpec`
+
+This strategy is a variant of `equalDistribution`, which supports `workerCategorySpec` field rather than `affinityConfig`.
+By specifying `workerCategorySpec`, you can assign tasks to run on different categories of Middle Managers based on the **type** and **dataSource** of the task.
+This strategy doesn't work with `AutoScaler` since the behavior is undefined.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`type`|`equalDistributionWithCategorySpec`|required; must be `equalDistributionWithCategorySpec`|
+|`workerCategorySpec`|[`WorkerCategorySpec`](#workercategoryspec) object|null (no worker category spec)|
+|`taskLimits`|[`TaskLimits`](#tasklimits) object|null (no limits)|
+
+The following example shows tasks of type `index_kafka` that default to running on Middle Managers of category `c1`, except for tasks that write to datasource `ds1`, which run on Middle Managers of category `c2`.
+
+```json
+{
+ "selectStrategy": {
+ "type": "equalDistributionWithCategorySpec",
+ "workerCategorySpec": {
+ "strong": false,
+ "categoryMap": {
+ "index_kafka": {
+ "defaultCategory": "c1",
+ "categoryAffinity": {
+ "ds1": "c2"
+ }
+ }
+ }
+ }
+ }
+}
+```
+
+###### `fillCapacity`
+
+Tasks are assigned to the worker with the most currently-running tasks. This is
+useful when you are auto-scaling Middle Managers since it tends to pack some full and
+leave others empty. The empty ones can be safely terminated.
+
+Note that if `druid.indexer.runner.pendingTasksRunnerNumThreads` is set to _N_ > 1, then this strategy will fill _N_
+Middle Managers up to capacity simultaneously, rather than a single Middle Manager.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`type`| `fillCapacity`|required; must be `fillCapacity`|
+|`affinityConfig`| [`AffinityConfig`](#affinityconfig) object |null (no affinity)|
+|`taskLimits`|[`TaskLimits`](#tasklimits) object|null (no limits)|
+
+###### `fillCapacityWithCategorySpec`
+
+This strategy is a variant of `fillCapacity` that supports `workerCategorySpec` instead of `affinityConfig`.
+Usage is the same as for the `equalDistributionWithCategorySpec` strategy.
+This strategy doesn't work with `AutoScaler` since the behavior is undefined.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`type`|`fillCapacityWithCategorySpec`|required; must be `fillCapacityWithCategorySpec`|
+|`workerCategorySpec`|[`WorkerCategorySpec`](#workercategoryspec) object|null (no worker category spec)|
+|`taskLimits`|[`TaskLimits`](#tasklimits) object|null (no limits)|
+
+
+
+###### `javascript`
+
+Allows defining arbitrary logic for selecting workers to run tasks using a JavaScript function.
+The function is passed the `remoteTaskRunnerConfig`, a map of `workerId` to available workers, and the task to be executed. It returns the `workerId` on which the task should run, or null if the task cannot be run.
+This strategy can be useful for rapid development when the worker selection logic must be changed or tuned often.
+If the selection logic is complex and cannot easily be tested in a JavaScript environment,
+it's better to write a Druid extension module that extends the existing worker selection strategies written in Java.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`type`|`javascript`|required; must be `javascript`|
+|`function`|String representing JavaScript function| |
+
+The following example shows a function that sends tasks of type `index_hadoop` to the workers `middleManager1_hostname:8091` and `middleManager2_hostname:8091`, and all other tasks to other available workers.
+
+```json
+{
+ "type":"javascript",
+ "function":"function (config, zkWorkers, task) {\nvar batch_workers = new java.util.ArrayList();\nbatch_workers.add(\"middleManager1_hostname:8091\");\nbatch_workers.add(\"middleManager2_hostname:8091\");\nworkers = zkWorkers.keySet().toArray();\nvar sortedWorkers = new Array()\n;for(var i = 0; i < workers.length; i++){\n sortedWorkers[i] = workers[i];\n}\nArray.prototype.sort.call(sortedWorkers,function(a, b){return zkWorkers.get(b).getCurrCapacityUsed() - zkWorkers.get(a).getCurrCapacityUsed();});\nvar minWorkerVer = config.getMinWorkerVersion();\nfor (var i = 0; i < sortedWorkers.length; i++) {\n var worker = sortedWorkers[i];\n var zkWorker = zkWorkers.get(worker);\n if(zkWorker.canRunTask(task) && zkWorker.isValidVersion(minWorkerVer)){\n if(task.getType() == 'index_hadoop' && batch_workers.contains(worker)){\n return worker;\n } else {\n if(task.getType() != 'index_hadoop' && !batch_workers.contains(worker)){\n return worker;\n }\n }\n }\n}\nreturn null;\n}"
+}
+```
+
+:::info
+ JavaScript-based functionality is disabled by default. Refer to the Druid [JavaScript programming guide](../development/javascript.md) for guidelines about using Druid's JavaScript functionality, including instructions on how to enable it.
+:::
+
+###### affinityConfig
+
+Use the `affinityConfig` field to pass affinity configuration to the `equalDistribution` and `fillCapacity` strategies.
+If not provided, the default is to have no affinity.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`affinity`|JSON object mapping a datasource String name to a list of indexing service Middle Manager `host:port` values. Druid doesn't perform DNS resolution, so the 'host' value must match what is configured on the Middle Manager and what the Middle Manager announces itself as (examine the Overlord logs to see what your Middle Manager announces itself as).|`{}`|
+|`strong`|When `true` tasks for a datasource must be assigned to affinity-mapped Middle Managers. Tasks remain queued until a slot becomes available. When `false`, Druid may assign tasks for a datasource to other Middle Managers when affinity-mapped Middle Managers are unavailable to run queued tasks.|false|
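+
+For example, the following dynamic worker configuration sketch (the datasource and host names are illustrative) restricts tasks for datasource `ds1` to two specific Middle Managers and enforces the affinity strictly, so tasks wait in the pending queue if those workers have no capacity:
+
+```json
+{
+  "selectStrategy": {
+    "type": "equalDistribution",
+    "affinityConfig": {
+      "affinity": {
+        "ds1": ["middleManager1_hostname:8091", "middleManager2_hostname:8091"]
+      },
+      "strong": true
+    }
+  }
+}
+```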
+
+###### workerCategorySpec
+
+Use the `workerCategorySpec` field to pass a worker category spec to the `equalDistributionWithCategorySpec` and `fillCapacityWithCategorySpec` strategies.
+If not provided, no category-based assignment is applied.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`categoryMap`|A JSON map object mapping a task type String name to a [CategoryConfig](#categoryconfig) object, so that you can specify a category config for each task type.|`{}`|
+|`strong`|With weak workerCategorySpec (the default), tasks for a dataSource may be assigned to other Middle Managers if the Middle Managers specified in `categoryMap` are not able to run all pending tasks in the queue for that dataSource. With strong workerCategorySpec, tasks for a dataSource will only ever be assigned to their specified Middle Managers, and will wait in the pending queue if necessary.|false|
+
+###### `taskLimits`
+
+The `taskLimits` field can be used with the `equalDistribution`, `fillCapacity`, `equalDistributionWithCategorySpec`, and `fillCapacityWithCategorySpec` strategies.
+If not provided, no task limits are applied.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`maxSlotCountByType`|A map where each key is a task type (`String`), and the corresponding value represents the absolute limit on the number of task slots that tasks of this type can occupy. The value is an `Integer` that is greater than or equal to 0. For example, a value of 5 means that tasks of this type can occupy up to 5 task slots in total. If both absolute and ratio limits are specified for the same task type, the effective limit will be the smaller of the absolute limit and the limit derived from the corresponding ratio. `maxSlotCountByType = {"index_parallel": 3, "query_controller": 5}`. In this example, parallel indexing tasks can occupy up to 3 task slots, and query controllers can occupy up to 5 task slots.|`{}`|
+|`maxSlotRatioByType`|A map where each key is a task type (`String`), and the corresponding value is a `Double` which should be in the range [0, 1], representing the ratio of task slots that tasks of this type can occupy. This ratio defines the proportion of total task slots a task type can use, calculated as `ratio * totalSlots`. If both absolute and ratio limits are specified for the same task type, the effective limit will be the smaller of the absolute limit and the limit derived from the corresponding ratio. `maxSlotRatioByType = {"index_parallel": 0.5, "query_controller": 0.25}`. In this example, parallel indexing tasks can occupy up to 50% of the total task slots, and query controllers can occupy up to 25% of the total task slots.|`{}`|
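+
+As a sketch, the following dynamic worker configuration combines the two example maps from the table above with the `equalDistribution` strategy; the task types and values are only illustrative:
+
+```json
+{
+  "selectStrategy": {
+    "type": "equalDistribution",
+    "taskLimits": {
+      "maxSlotCountByType": {
+        "index_parallel": 3,
+        "query_controller": 5
+      },
+      "maxSlotRatioByType": {
+        "index_parallel": 0.5,
+        "query_controller": 0.25
+      }
+    }
+  }
+}
+```
+
+With both maps present, the smaller of the absolute and ratio-derived limits applies: on a cluster with 20 total task slots, `index_parallel` tasks would be limited to min(3, 0.5 × 20) = 3 slots.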
+
+###### CategoryConfig
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`defaultCategory`|The default worker category for a task type.|null|
+|`categoryAffinity`|A JSON map object mapping a datasource String name to a worker category String name. If no category is specified for a datasource, Druid uses the `defaultCategory`. If no category is specified and `defaultCategory` is also null, tasks can run on any available Middle Manager.|null|
+
+##### Autoscaler
+
+Amazon EC2 and Google Compute Engine (GCE) are currently the only supported autoscalers.
+
+The EC2 autoscaler properties are:
+
+|Property| Description|Default|
+|--------|------------|-------|
+|`type`|`ec2`|required; must be `ec2`|
+|`minNumWorkers`| The minimum number of workers that can be in the cluster at any given time.|0|
+|`maxNumWorkers`| The maximum number of workers that can be in the cluster at any given time.|0|
+|`envConfig.availabilityZone` | The Amazon availability zone to run in.|none|
+|`envConfig.nodeData`| A JSON object that describes how to launch new nodes.|none; required|
+| `envConfig.userData`| A JSON object that describes how to configure new nodes. If you have set `druid.indexer.autoscale.workerVersion`, this must have a `versionReplacementString`. Otherwise, a `versionReplacementString` is not necessary.|none; optional|
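+
+The following sketch shows how an EC2 autoscaler section of the Overlord dynamic worker configuration might look. The availability zone, AMI, instance type, and the exact contents of `envConfig.nodeData` and `envConfig.userData` are environment-specific placeholders, not recommended values:
+
+```json
+{
+  "autoScaler": {
+    "type": "ec2",
+    "minNumWorkers": 2,
+    "maxNumWorkers": 12,
+    "envConfig": {
+      "availabilityZone": "us-east-1a",
+      "nodeData": {
+        "amiId": "ami-0123456789abcdef0",
+        "instanceType": "m5.2xlarge",
+        "minInstances": 1,
+        "maxInstances": 1,
+        "securityGroupIds": ["sg-0123456789abcdef0"],
+        "keyName": "your-key-name"
+      },
+      "userData": {
+        "impl": "string",
+        "data": "your-node-startup-script",
+        "versionReplacementString": ":VERSION:",
+        "version": null
+      }
+    }
+  }
+}
+```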
+
+For GCE autoscaler properties, refer to the [gce-extensions](../development/extensions-contrib/gce-extensions.md) documentation.
+
+## Data server
+
+This section contains the configuration options for the services that reside on Data servers (Middle Managers/Peons and Historicals) in the suggested [three-server configuration](../design/architecture.md#druid-servers).
+
+Configuration options for the [Indexer process](../design/indexer.md) are also provided here.
+
+### Middle Manager and Peon
+
+These Middle Manager and Peon configurations can be defined in the `middleManager/runtime.properties` file.
+
+#### Middle Manager service config
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.host`|The host for the current service. This is used to advertise the current service location as reachable from another service and should generally be specified such that `http://${druid.host}/` could actually talk to this service|`InetAddress.getLocalHost().getCanonicalHostName()`|
+|`druid.bindOnHost`|Indicates whether the service's internal Jetty server binds on `druid.host`. The default is false, which means binding to all interfaces.|false|
+|`druid.plaintextPort`|This is the port to actually listen on; unless port mapping is used, this will be the same port as is on `druid.host`|8091|
+|`druid.tlsPort`|TLS port for HTTPS connector, if [druid.enableTlsPort](../operations/tls-support.md) is set then this config will be used. If `druid.host` contains port then that port will be ignored. This should be a non-negative Integer.|8291|
+|`druid.service`|The name of the service. This is used as a dimension when emitting metrics and alerts to differentiate between the various services|`druid/middlemanager`|
+|`druid.labels`|Optional JSON object of key-value pairs that define custom labels for the server. These labels are displayed in the web console under the "Services" tab. Example: `druid.labels={"location":"Airtrunk"}` or `druid.labels.location=Airtrunk`|`null`|
+
+#### Middle Manager configuration
+
+Middle Managers pass their configurations down to their child peons. The Middle Manager requires the following configs:
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.indexer.runner.allowedPrefixes`|Whitelist of prefixes for configs that can be passed down to child peons.|`com.metamx`, `druid`, `org.apache.druid`, `user.timezone`, `file.encoding`, `java.io.tmpdir`, `hadoop`|
+|`druid.indexer.runner.compressZnodes`|Indicates whether or not the Middle Managers should compress Znodes.|true|
+|`druid.indexer.runner.classpath`|Java classpath for the peon.|`System.getProperty("java.class.path")`|
+|`druid.indexer.runner.javaCommand`|Command required to execute java.|java|
+|`druid.indexer.runner.javaOpts`|_DEPRECATED_ A string of -X Java options to pass to the peon's JVM. For parameters that need quoting or contain spaces, use `javaOptsArray` instead.|`''`|
+|`druid.indexer.runner.javaOptsArray`|A JSON array of strings to be passed in as options to the peon's JVM. This is additive to `druid.indexer.runner.javaOpts` and is recommended for properly handling arguments which contain quotes or spaces like `["-XX:OnOutOfMemoryError=kill -9 %p"]`|`[]`|
+|`druid.indexer.runner.maxZnodeBytes`|The maximum size Znode in bytes that can be created in ZooKeeper, should be in the range of [10KiB, 2GiB). [Human-readable format](human-readable-byte.md) is supported.|512KiB|
+|`druid.indexer.runner.startPort`|Starting port used for Peon services, should be greater than 1023 and less than 65536.|8100|
+|`druid.indexer.runner.endPort`|Ending port used for Peon services, should be greater than or equal to `druid.indexer.runner.startPort` and less than 65536.|65535|
+|`druid.indexer.runner.ports`|A JSON array of integers specifying the ports to use for Peon services. If provided and non-empty, ports for Peon services are chosen from this list, and `druid.indexer.runner.startPort`/`druid.indexer.runner.endPort` are ignored.|`[]`|
+|`druid.worker.ip`|The IP of the worker.|`localhost`|
+|`druid.worker.version`|Version identifier for the Middle Manager. The version number is a string. This affects the expected behavior during certain operations like comparison against `druid.indexer.runner.minWorkerVersion`. Specifically, the version comparison follows dictionary order. Use ISO8601 date format for the version to accommodate date comparisons.|0|
+|`druid.worker.capacity`|Maximum number of tasks the Middle Manager can accept.|Number of CPUs on the machine - 1|
+|`druid.worker.baseTaskDirs`|List of base temporary working directories, one of which is assigned per task in a round-robin fashion. This property can be used to allow usage of multiple disks for indexing. This property is recommended in place of and takes precedence over `${druid.indexer.task.baseTaskDir}`. If this configuration is not set, `${druid.indexer.task.baseTaskDir}` is used. For example, `druid.worker.baseTaskDirs=[\"PATH1\",\"PATH2\",...]`.|null|
+|`druid.worker.baseTaskDirSize`|The total number of bytes that can be used by tasks on any single task dir. This value is treated symmetrically across all directories: if this is 500 GB and there are 3 `baseTaskDirs`, then each of those task directories is assumed to allow 500 GB to be used, and a total of 1.5 TB is potentially available across all tasks. The actual amount of storage assigned to each task is discussed in [Configuring task storage sizes](../ingestion/tasks.md#configuring-task-storage-sizes)|`Long.MAX_VALUE`|
+|`druid.worker.category`|A string to name the category that the Middle Manager node belongs to.|`_default_worker_category`|
+|`druid.indexer.fork.property.druid.centralizedDatasourceSchema.enabled`| This config should be set when [Centralized Datasource Schema](#centralized-datasource-schema-experimental) feature is enabled. |false|
+
+#### Peon processing
+
+Processing properties set on the Middle Manager are passed through to Peons.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.processing.buffer.sizeBytes`|This specifies a buffer size (less than 2GiB) for the storage of intermediate results. The computation engine in both the Historical and Realtime processes will use a scratch buffer of this size to do all of their intermediate computations off-heap. Larger values allow for more aggregations in a single pass over the data while smaller values can require more passes depending on the query that is being executed. [Human-readable format](human-readable-byte.md) is supported.|auto (max 1 GiB)|
+|`druid.processing.buffer.poolCacheMaxCount`|Processing buffer pool caches the buffers for later use. This is the maximum count that the cache will grow to. Note that pool can create more buffers than it can cache if necessary.|`Integer.MAX_VALUE`|
+|`druid.processing.formatString`|Realtime and Historical processes use this format string to name their processing threads.|processing-%s|
+|`druid.processing.numMergeBuffers`|The number of direct memory buffers available for merging query results. The buffers are sized by `druid.processing.buffer.sizeBytes`. This property is effectively a concurrency limit for queries that require merging buffers. If you are using any queries that require merge buffers (currently, just groupBy) then you should have at least two of these.|`max(2, druid.processing.numThreads / 4)`|
+|`druid.processing.numThreads`|The number of processing threads to have available for parallel processing of segments. Our rule of thumb is `num_cores - 1`, which means that even under heavy load there will still be one core available to do background tasks like talking with ZooKeeper and pulling down segments. If only one core is available, this property defaults to the value `1`.|Number of cores - 1 (or 1)|
+|`druid.processing.numTimeoutThreads`|The number of processing threads to have available for handling per-segment query timeouts. Setting this value to `0` removes the ability to service per-segment timeouts, irrespective of `perSegmentTimeout` query context parameter. As these threads are just servicing timers, it's recommended to set this value to some small percent (e.g. 5%) of the total query processing cores available to the peon.|0|
+|`druid.processing.fifo`|Enables the processing queue to treat tasks of equal priority in a FIFO manner.|`true`|
+|`druid.processing.tmpDir`|Path where temporary files created while processing a query should be stored. If specified, this configuration takes priority over the default `java.io.tmpdir` path.|path represented by `java.io.tmpdir`|
+|`druid.processing.intermediaryData.storage.type`|Storage type for intermediary segments of data shuffle between native parallel index tasks. Set to `local` to store segment files in the local storage of the Middle Manager or Indexer. Set to `deepstore` to use configured deep storage for better fault tolerance during rolling updates. When the storage type is `deepstore`, Druid stores the data in the `shuffle-data` directory under the configured deep storage path. Druid does not support automated cleanup for the `shuffle-data` directory. You can set up cloud storage lifecycle rules for automated cleanup of data at the `shuffle-data` prefix location.|`local`|
+
+The amount of direct memory needed by Druid is at least
+`druid.processing.buffer.sizeBytes * (druid.processing.numMergeBuffers + druid.processing.numThreads + 1)`. You can
+ensure at least this amount of direct memory is available by providing `-XX:MaxDirectMemorySize=` in
+`druid.indexer.runner.javaOptsArray` as documented above.
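+
+For example, with the hypothetical sizing below, the formula yields at least (2 + 2 + 1) × 256 MiB = 1280 MiB of direct memory per Peon, which you could supply through `javaOptsArray`:
+
+```properties
+# Hypothetical sizing; adjust for your workload.
+# These processing properties set on the Middle Manager are passed through to the Peons.
+druid.processing.numThreads=2
+druid.processing.numMergeBuffers=2
+druid.processing.buffer.sizeBytes=256MiB
+
+# At least (2 merge buffers + 2 processing threads + 1) * 256 MiB = 1280 MiB of direct memory
+druid.indexer.runner.javaOptsArray=["-XX:MaxDirectMemorySize=1280m"]
+```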
+
+#### Peon query configuration
+
+See [general query configuration](#general-query-configuration).
+
+#### Peon caching
+
+You can optionally configure caching to be enabled on the peons by setting caching configs here.
+
+|Property|Possible Values|Description|Default|
+|--------|---------------|-----------|-------|
+|`druid.realtime.cache.useCache`|true, false|Enable the cache on the realtime.|false|
+|`druid.realtime.cache.populateCache`|true, false|Populate the cache on the realtime.|false|
+|`druid.realtime.cache.unCacheable`|All druid query types|All query types to not cache.|`[scan]`|
+|`druid.realtime.cache.maxEntrySize`|positive integer|Maximum cache entry size in bytes.|1_000_000|
+
+See [cache configuration](#cache-configuration) for how to configure cache settings.
+
+#### Additional Peon configuration
+
+Although Peons inherit the configurations of their parent Middle Managers, you can set explicit child Peon configs in the Middle Manager by prefixing them with:
+
+```properties
+druid.indexer.fork.property
+```
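+
+For example, the following illustrative line sets `druid.processing.tmpDir` for the child Peons without changing the Middle Manager's own value (the path is a placeholder):
+
+```properties
+# Passed to each child Peon as druid.processing.tmpDir=/fast-disk/druid/processing
+druid.indexer.fork.property.druid.processing.tmpDir=/fast-disk/druid/processing
+```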
+
+Additional Peon configs include:
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.peon.mode`|One of `local` or `remote`. Setting this property to `local` means you intend to run the Peon as a standalone process, which is not recommended.|`remote`|
+|`druid.indexer.task.baseDir`|Base temporary working directory.|`System.getProperty("java.io.tmpdir")`|
+|`druid.indexer.task.baseTaskDir`|Base temporary working directory for tasks.|`${druid.indexer.task.baseDir}/persistent/task`|
+|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|`org.apache.hadoop:hadoop-client-api:3.3.6`, `org.apache.hadoop:hadoop-client-runtime:3.3.6`|
+|`druid.indexer.task.defaultRowFlushBoundary`|Highest row count before persisting to disk. Used for index-generating tasks.|75000|
+|`druid.indexer.task.directoryLockTimeout`|Wait this long for zombie Peons to exit before giving up on their replacements.|PT10M|
+|`druid.indexer.task.gracefulShutdownTimeout`|Wait this long on Middle Manager restart for restorable tasks to gracefully exit.|PT5M|
+|`druid.indexer.task.hadoopWorkingPath`|Temporary working directory for Hadoop tasks.|`/tmp/druid-indexing`|
+|`druid.indexer.task.restoreTasksOnRestart`|If true, Middle Managers will attempt to stop tasks gracefully on shutdown and restore them on restart.|false|
+|`druid.indexer.task.ignoreTimestampSpecForDruidInputSource`|If true, tasks using the [Druid input source](../ingestion/input-sources.md) will ignore the provided timestampSpec, and will use the `__time` column of the input datasource. This option is provided for compatibility with ingestion specs written before Druid 0.22.0.|false|
+|`druid.indexer.task.storeEmptyColumns`|Boolean value for whether or not to store empty columns during ingestion. When set to true, Druid stores every column specified in the [`dimensionsSpec`](../ingestion/ingestion-spec.md#dimensionsspec). If you use the string-based schemaless ingestion and don't specify any dimensions to ingest, you must also set [`includeAllDimensions`](../ingestion/ingestion-spec.md#dimensionsspec) for Druid to store empty columns. If you set `storeEmptyColumns` to false, Druid SQL queries referencing empty columns will fail. If you intend to leave `storeEmptyColumns` disabled, you should either ingest placeholder data for empty columns or else not query on empty columns. You can overwrite this configuration by setting `storeEmptyColumns` in the [task context](../ingestion/tasks.md#context-parameters).|true|
+|`druid.indexer.task.tmpStorageBytesPerTask`|Maximum number of bytes per task to be used to store temporary files on disk. This config is generally intended for internal usage. Attempts to set it are very likely to be overwritten by the TaskRunner that executes the task, so be sure of what you expect to happen before directly adjusting this configuration parameter. The config is documented here primarily to provide an understanding of what it means if/when someone sees that it has been set. A value of -1 disables this limit. |-1|
+|`druid.indexer.task.allowHadoopTaskExecution`|Whether the cluster allows `index_hadoop` tasks to be executed. The `index_hadoop` task type is deprecated; the default of false forces cluster operators to acknowledge the deprecation and consciously opt in to using `index_hadoop`, with the understanding that it will be removed in the future.|false|
+|`druid.indexer.server.maxChatRequests`|Maximum number of concurrent requests served by a task's chat handler. Set to 0 to disable limiting.|0|
+
+If the Peon is running in remote mode, there must be an Overlord up and running. Peons in remote mode can set the following configurations:
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.peon.taskActionClient.retry.minWait`|The minimum retry time to communicate with Overlord.|`PT5S`|
+|`druid.peon.taskActionClient.retry.maxWait`|The maximum retry time to communicate with Overlord.|`PT1M`|
+|`druid.peon.taskActionClient.retry.maxRetryCount`|The maximum number of retries to communicate with Overlord.|13 (about 10 minutes of retrying)|
+
+##### SegmentWriteOutMediumFactory
+
+When new segments are created, Druid temporarily stores some preprocessed data in some buffers.
+The following types of medium exist for the buffers:
+
+* **Temporary files** (`tmpFile`) are stored under the task working directory (see the `druid.worker.baseTaskDirs` configuration above) and thus share its mounting properties. For example, they could be backed by HDD, SSD, or memory (tmpfs).
+This type of medium may do unnecessary disk I/O and requires some disk space to be available.
+
+* **Off-heap memory** (`offHeapMemory`) creates buffers in off-heap memory of a JVM process that is running a task.
+This type of medium is preferred, but it may require you to allow the JVM to have more off-heap memory by changing the `-XX:MaxDirectMemorySize` configuration. It's not yet well understood how the required off-heap memory size relates to the size of the segments being created, but you shouldn't add more extra off-heap memory than the configured maximum _heap_ size (`-Xmx`) for the same JVM.
+
+* **On-heap memory** (`onHeapMemory`) creates buffers using the allocated heap memory of the JVM process running a task. Using on-heap memory introduces garbage collection overhead and so is not recommended in most cases. This type of medium is most helpful for tasks run on external clusters where it may be difficult to allocate and work with direct memory effectively.
+
+For most types of tasks, `SegmentWriteOutMediumFactory` can be configured per-task (see [Tasks](../ingestion/tasks.md) for more information), but if it's not specified for a task, or it's not supported for a particular task type, then Druid uses the value from the following configuration:
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.peon.defaultSegmentWriteOutMediumFactory.type`|`tmpFile`, `offHeapMemory`, or `onHeapMemory`|`tmpFile`|
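+
+For example, to make off-heap memory the default medium for tasks on this service (remember to budget additional `-XX:MaxDirectMemorySize` as discussed above):
+
+```properties
+druid.peon.defaultSegmentWriteOutMediumFactory.type=offHeapMemory
+```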
+
+### Indexer
+
+#### Indexer process configuration
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.host`|The host for the current process. This is used to advertise the current processes location as reachable from another process and should generally be specified such that `http://${druid.host}/` could actually talk to this process|`InetAddress.getLocalHost().getCanonicalHostName()`|
+|`druid.bindOnHost`|Indicates whether the process's internal Jetty server binds on `druid.host`. The default is false, which means binding to all interfaces.|false|
+|`druid.plaintextPort`|This is the port to actually listen on; unless port mapping is used, this will be the same port as is on `druid.host`|8091|
+|`druid.tlsPort`|TLS port for HTTPS connector, if [druid.enableTlsPort](../operations/tls-support.md) is set then this config will be used. If `druid.host` contains port then that port will be ignored. This should be a non-negative Integer.|8283|
+|`druid.service`|The name of the service. This is used as a dimension when emitting metrics and alerts to differentiate between the various services|`druid/indexer`|
+|`druid.labels`|Optional JSON object of key-value pairs that define custom labels for the server. These labels are displayed in the web console under the "Services" tab. Example: `druid.labels={"location":"Airtrunk"}` or `druid.labels.location=Airtrunk`|`null`|
+
+#### Indexer general configuration
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.worker.version`|Version identifier for the Indexer.|0|
+|`druid.worker.capacity`|Maximum number of tasks the Indexer can accept.|Number of available processors - 1|
+|`druid.worker.baseTaskDirs`|List of base temporary working directories, one of which is assigned per task in a round-robin fashion. This property can be used to allow usage of multiple disks for indexing. This property is recommended in place of and takes precedence over `${druid.indexer.task.baseTaskDir}`. If this configuration is not set, `${druid.indexer.task.baseTaskDir}` is used. Example: `druid.worker.baseTaskDirs=[\"PATH1\",\"PATH2\",...]`.|null|
+|`druid.worker.baseTaskDirSize`|The total number of bytes that can be used by tasks on any single task dir. This value is treated symmetrically across all directories: if this is 500 GB and there are 3 `baseTaskDirs`, then each of those task directories is assumed to allow 500 GB to be used, and a total of 1.5 TB is potentially available across all tasks. The actual amount of storage assigned to each task is discussed in [Configuring task storage sizes](../ingestion/tasks.md#configuring-task-storage-sizes)|`Long.MAX_VALUE`|
+|`druid.worker.globalIngestionHeapLimitBytes`|Total amount of heap available for ingestion processing. This is applied by automatically setting the `maxBytesInMemory` property on tasks.|Configured max JVM heap size / 6|
+|`druid.worker.numConcurrentMerges`|Maximum number of segment persist or merge operations that can run concurrently across all tasks.|`druid.worker.capacity` / 2, rounded down|
+|`druid.indexer.task.baseDir`|Base temporary working directory.|`System.getProperty("java.io.tmpdir")`|
+|`druid.indexer.task.baseTaskDir`|Base temporary working directory for tasks.|`${druid.indexer.task.baseDir}/persistent/tasks`|
+|`druid.indexer.task.defaultHadoopCoordinates`|Hadoop version to use with HadoopIndexTasks that do not request a particular version.|`org.apache.hadoop:hadoop-client-api:3.3.6`, `org.apache.hadoop:hadoop-client-runtime:3.3.6`|
+|`druid.indexer.task.gracefulShutdownTimeout`|Wait this long on Indexer restart for restorable tasks to gracefully exit.|`PT5M`|
+|`druid.indexer.task.hadoopWorkingPath`|Temporary working directory for Hadoop tasks.|`/tmp/druid-indexing`|
+|`druid.indexer.task.restoreTasksOnRestart`|If true, the Indexer will attempt to stop tasks gracefully on shutdown and restore them on restart.|false|
+|`druid.indexer.task.ignoreTimestampSpecForDruidInputSource`|If true, tasks using the [Druid input source](../ingestion/input-sources.md) will ignore the provided timestampSpec, and will use the `__time` column of the input datasource. This option is provided for compatibility with ingestion specs written before Druid 0.22.0.|false|
+|`druid.indexer.task.storeEmptyColumns`|Boolean value for whether or not to store empty columns during ingestion. When set to true, Druid stores every column specified in the [`dimensionsSpec`](../ingestion/ingestion-spec.md#dimensionsspec). If you set `storeEmptyColumns` to false, Druid SQL queries referencing empty columns will fail. If you intend to leave `storeEmptyColumns` disabled, you should either ingest placeholder data for empty columns or else not query on empty columns. You can overwrite this configuration by setting `storeEmptyColumns` in the [task context](../ingestion/tasks.md#context-parameters).|true|
+|`druid.peon.taskActionClient.retry.minWait`|The minimum retry time to communicate with Overlord.|`PT5S`|
+|`druid.peon.taskActionClient.retry.maxWait`|The maximum retry time to communicate with Overlord.|`PT1M`|
+|`druid.peon.taskActionClient.retry.maxRetryCount`|The maximum number of retries to communicate with Overlord.|13 (about 10 minutes of retrying)|
+
+#### Indexer concurrent requests
+
+Druid uses Jetty to serve HTTP requests.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.server.http.numThreads`|Number of threads for HTTP requests. Please see the [Indexer Server HTTP threads](../design/indexer.md#server-http-threads) documentation for more details on how the Indexer uses this configuration.|max(10, (Number of cores * 17) / 16 + 2) + 30|
+|`druid.server.http.queueSize`|Size of the worker queue used by the Jetty server to temporarily store incoming client connections. If this value is set and a request is rejected because the Jetty queue is full, the client observes a request failure: the TCP connection is closed immediately with a completely empty response from the server.|Unbounded|
+|`druid.server.http.maxIdleTime`|The Jetty max idle time for a connection.|`PT5M`|
+|`druid.server.http.enableRequestLimit`|If enabled, requests are not queued in the Jetty queue; an "HTTP 429 Too Many Requests" error response is sent instead.|false|
+|`druid.server.http.defaultQueryTimeout`|Query timeout in millis, beyond which unfinished queries are cancelled.|300000|
+|`druid.server.http.gracefulShutdownTimeout`|The maximum amount of time Jetty waits after receiving a shutdown signal. After this timeout, the threads are forcefully shut down. This allows any queries that are executing to complete (only values greater than zero are valid).|`PT30S`|
+|`druid.server.http.unannouncePropagationDelay`|How long to wait for ZooKeeper unannouncements to propagate before shutting down Jetty. This is a minimum and `druid.server.http.gracefulShutdownTimeout` does not start counting down until after this period elapses.|`PT0S` (do not wait)|
+|`druid.server.http.maxQueryTimeout`|Maximum allowed value (in milliseconds) for `timeout` parameter. See [query-context](../querying/query-context-reference.md) to know more about `timeout`. Query is rejected if the query context `timeout` is greater than this value. |`Long.MAX_VALUE`|
+|`druid.server.http.maxRequestHeaderSize`|Maximum size of a request header in bytes. Larger headers consume more memory and can make a server more vulnerable to denial of service attacks.|8 * 1024|
+|`druid.server.http.enableForwardedRequestCustomizer`|If enabled, adds Jetty ForwardedRequestCustomizer which reads X-Forwarded-* request headers to manipulate servlet request object when Druid is used behind a proxy.|false|
+|`druid.server.http.allowedHttpMethods`|List of HTTP methods that should be allowed in addition to the ones required by Druid APIs. Druid APIs require GET, PUT, POST, and DELETE, which are always allowed. This option is not useful unless you have installed an extension that needs these additional HTTP methods or that adds functionality related to CORS. None of Druid's bundled extensions require these methods.|`[]`|
+|`druid.server.http.contentSecurityPolicy`|Content-Security-Policy header value to set on each non-POST response. Setting this property to an empty string, or omitting it, both result in the default `frame-ancestors: none` being set.|`frame-ancestors 'none'`|
+|`druid.server.http.uriCompliance`|Jetty `UriCompliance` mode for Druid's embedded Jetty servers. To modify, override this config with the string representation of any `UriCompliance` mode that [Jetty supports](https://javadoc.jetty.org/jetty-12/org/eclipse/jetty/http/UriCompliance.html).|LEGACY|
+|`druid.server.http.enforceStrictSNIHostChecking`| If enabled, the Jetty server will enforce strict SNI host checking. This means that if a client connects to the server using TLS but does not provide an SNI hostname, or provides an SNI hostname that does not match the server's configured hostname, a request will get a 400 response. Setting this to false is not recommended in production.|true|
+
+#### Indexer processing resources
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.processing.buffer.sizeBytes`|This specifies a buffer size (less than 2GiB) for the storage of intermediate results. The computation engine in the Indexer processes will use a scratch buffer of this size to do all of their intermediate computations off-heap. Larger values allow for more aggregations in a single pass over the data while smaller values can require more passes depending on the query that is being executed. [Human-readable format](human-readable-byte.md) is supported.|auto (max 1GiB)|
+|`druid.processing.buffer.poolCacheMaxCount`|The processing buffer pool caches buffers for later use. This is the maximum count that the cache will grow to. Note that the pool can create more buffers than it can cache if necessary.|`Integer.MAX_VALUE`|
+|`druid.processing.formatString`|Indexer processes use this format string to name their processing threads.|processing-%s|
+|`druid.processing.numMergeBuffers`|The number of direct memory buffers available for merging query results. The buffers are sized by `druid.processing.buffer.sizeBytes`. This property is effectively a concurrency limit for queries that require merging buffers. If you are using any queries that require merge buffers (currently, just groupBy) then you should have at least two of these.|`max(2, druid.processing.numThreads / 4)`|
+|`druid.processing.numThreads`|The number of processing threads to have available for parallel processing of segments. Our rule of thumb is `num_cores - 1`, which means that even under heavy load there will still be one core available to do background tasks like talking with ZooKeeper and pulling down segments. If only one core is available, this property defaults to the value `1`.|Number of cores - 1 (or 1)|
+|`druid.processing.numTimeoutThreads`|The number of processing threads to have available for handling per-segment query timeouts. Setting this value to `0` removes the ability to service per-segment timeouts, irrespective of `perSegmentTimeout` query context parameter. As these threads are just servicing timers, it's recommended to set this value to some small percent (e.g. 5%) of the total query processing cores available to the indexer.|0|
+|`druid.processing.fifo`|If the processing queue should treat tasks of equal priority in a FIFO manner|`true`|
+|`druid.processing.tmpDir`|Path where temporary files created while processing a query should be stored. If specified, this configuration takes priority over the default `java.io.tmpdir` path.|path represented by `java.io.tmpdir`|
+
+The amount of direct memory needed by Druid is at least
+`druid.processing.buffer.sizeBytes * (druid.processing.numMergeBuffers + druid.processing.numThreads + 1)`. You can
+ensure at least this amount of direct memory is available by providing `-XX:MaxDirectMemorySize=` at the command
+line.
+
+#### Query configurations
+
+See [general query configuration](#general-query-configuration).
+
+#### Indexer caching
+
+You can optionally configure caching to be enabled on the Indexer by setting caching configs here.
+
+|Property|Possible Values|Description|Default|
+|--------|---------------|-----------|-------|
+|`druid.realtime.cache.useCache`|true, false|Enable the cache on the realtime.|false|
+|`druid.realtime.cache.populateCache`|true, false|Populate the cache on the realtime.|false|
+|`druid.realtime.cache.unCacheable`|All druid query types|All query types to not cache.|`[scan]`|
+|`druid.realtime.cache.maxEntrySize`|positive integer|Maximum cache entry size in bytes.|1_000_000|
+
+See [cache configuration](#cache-configuration) for how to configure cache settings.
+
+Note that only local caches such as the `local`-type cache and `caffeine` cache are supported. If a remote cache such as `memcached` is used, it will be ignored.
+
+### Historical
+
+For general Historical service information, see [Historical](../design/historical.md).
+
+These Historical configurations can be defined in the `historical/runtime.properties` file.
+
+#### Historical service configuration
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.host`|The host for the current service. This is used to advertise the current service location as reachable from another service and should generally be specified such that `http://${druid.host}/` could actually talk to this service|`InetAddress.getLocalHost().getCanonicalHostName()`|
+|`druid.bindOnHost`|Indicates whether the service's internal Jetty server binds on `druid.host`. The default is false, which means binding to all interfaces.|false|
+|`druid.plaintextPort`|This is the port to actually listen on; unless port mapping is used, this will be the same port as is on `druid.host`|8083|
+|`druid.tlsPort`|TLS port for HTTPS connector, if [druid.enableTlsPort](../operations/tls-support.md) is set then this config will be used. If `druid.host` contains port then that port will be ignored. This should be a non-negative Integer.|8283|
+|`druid.service`|The name of the service. This is used as a dimension when emitting metrics and alerts to differentiate between the various services|`druid/historical`|
+|`druid.labels`|Optional JSON object of key-value pairs that define custom labels for the server. These labels are displayed in the web console under the "Services" tab. Example: `druid.labels={"location":"Airtrunk"}` or `druid.labels.location=Airtrunk`|`null`|
+
+#### Historical general configuration
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.server.maxSize`|The maximum number of bytes-worth of segments that the service wants assigned to it. The Coordinator service will attempt to assign segments to a Historical service only if this property is greater than the total size of segments served by it. Since this property defines the upper limit on the total segment size that can be assigned to a Historical, it is defaulted to the sum of all `maxSize` values specified within `druid.segmentCache.locations` property. Human-readable format is supported, see [here](human-readable-byte.md). |Sum of `maxSize` values defined within `druid.segmentCache.locations`|
+|`druid.server.tier`| A string to name the distribution tier that the storage service belongs to. Many of the [rules Coordinator services use](../operations/rule-configuration.md) to manage segments can be keyed on tiers. | `_default_tier` |
+|`druid.server.priority`|The priority of the tier in a tiered architecture, which allows control over which services are queried. Higher numbers mean higher priority. The default (no priority) works for an architecture with no cross-replication (tiers that have no data-storage overlap). Data centers typically have equal priority. | 0 |
+
+#### Storing segments
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.segmentCache.locations`|Segments assigned to a Historical service are first stored on the local file system (in a disk cache) and then served by the Historical service. These locations define where that local cache resides. This value cannot be NULL or EMPTY. Here is an example: `druid.segmentCache.locations=[{"path": "/mnt/druidSegments", "maxSize": "10k", "freeSpacePercent": 1.0}]`. `freeSpacePercent` is optional; if provided, it enforces that much free disk partition space while storing segments. However, it depends on the `File.getTotalSpace()` and `File.getFreeSpace()` methods, so enable it only if they work for your file system.| none |
+|`druid.segmentCache.locationSelector.strategy`|The strategy used to select a location from the configured `druid.segmentCache.locations` for segment distribution. Possible values are `leastBytesUsed`, `roundRobin`, `random`, or `mostAvailableSize`. |leastBytesUsed|
+|`druid.segmentCache.deleteOnRemove`|Delete segment files from cache once a service is no longer serving a segment.|true|
+|`druid.segmentCache.dropSegmentDelayMillis`|How long a service delays before completely dropping segment.|30000 (30 seconds)|
+|`druid.segmentCache.infoDir`|Historical services keep track of the segments they are serving so that when the service is restarted they can reload the same segments without waiting for the Coordinator to reassign. This path defines where this metadata is kept. Directory will be created if needed.|`${first_location}/info_dir`|
+|`druid.segmentCache.announceIntervalMillis`|How frequently to announce segments while segments are loading from cache. Set this value to zero to wait for all segments to be loaded before announcing.|5000 (5 seconds)|
+|`druid.segmentCache.numLoadingThreads`|How many segments to drop or load concurrently from deep storage. Note that the work of loading segments involves downloading segments from deep storage, decompressing them, and loading them to a memory-mapped location, so the work is not entirely I/O bound. Depending on CPU and network load, you could increase this config to a higher value.|max(1,Number of cores / 6)|
+|`druid.segmentCache.numBootstrapThreads`|How many segments to load concurrently during historical startup.|`druid.segmentCache.numLoadingThreads`|
+|`druid.segmentCache.lazyLoadOnStart`|Whether or not to load segment columns metadata lazily during historical startup. When set to true, Historical startup time will be dramatically improved by deferring segment loading until the first time that segment takes part in a query, which will incur this cost instead.|false|
+|`druid.segmentCache.numThreadsToLoadSegmentsIntoPageCacheOnDownload`|Number of threads to asynchronously read segment index files into null output stream on each new segment download after the Historical service finishes bootstrapping. Recommended to set to 1 or 2 or leave unspecified to disable. See also `druid.segmentCache.numThreadsToLoadSegmentsIntoPageCacheOnBootstrap`|0|
+|`druid.segmentCache.numThreadsToLoadSegmentsIntoPageCacheOnBootstrap`|Number of threads to asynchronously read segment index files into a null output stream during Historical service bootstrap. This thread pool is terminated after the Historical service finishes bootstrapping. Recommended to set to half of the available cores. If left unspecified, `druid.segmentCache.numThreadsToLoadSegmentsIntoPageCacheOnDownload` is used. If both configs are unspecified, this feature is disabled. Preemptively loading segments into the page cache helps because when a segment is later queried, it is already in the page cache and only a minor page fault is triggered instead of a more costly major page fault, making query latency more consistent. Note that loading a segment into the page cache just does a blind load of the segment index files; the operating system may evict other segments from the page cache when the total segment size on local disk exceeds the usable page cache in RAM, which roughly equals the total available RAM on the host minus the Druid process memory (both heap and direct memory) minus the memory used by other non-Druid processes on the host. It is therefore the user's responsibility to ensure the host has enough RAM to hold all the segments, to avoid random evictions and fully leverage this feature.|`druid.segmentCache.numThreadsToLoadSegmentsIntoPageCacheOnDownload`|
+
+In `druid.segmentCache.locations`, `freeSpacePercent` was added because the `maxSize` setting is only a theoretical limit and assumes that much space will always be available for storing segments. If a Druid bug leaves unaccounted segment files on disk, or some other service writes data to the same disk, this check can fail segment loading early, before the disk fills up completely, leaving the host otherwise usable.
+
+In `druid.segmentCache.locationSelector.strategy`, specify one of `leastBytesUsed`, `roundRobin`, `random`, or `mostAvailableSize` to choose the strategy used to distribute segments across multiple segment cache locations.
+
+|Strategy|Description|
+|--------|-----------|
+|`leastBytesUsed`|Selects a location which has least bytes used in absolute terms.|
+|`roundRobin`|Selects a location in a round robin fashion oblivious to the bytes used or the capacity.|
+|`random`|Selects a segment cache location randomly each time among the available storage locations.|
+|`mostAvailableSize`|Selects a segment cache location that has most free space among the available storage locations.|
+
+Note that if `druid.segmentCache.numLoadingThreads` > 1, multiple threads can download different segments at the same time. In this case, with the `leastBytesUsed` strategy or `mostAvailableSize` strategy, Historicals may select a sub-optimal storage location because each decision is based on a snapshot of the storage location status of when a segment is requested to download.
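+
+As an illustration, a Historical that spreads its segment cache across two hypothetical SSD mount points and always fills the location with the most free space could be configured like this:
+
+```properties
+# Paths and sizes are placeholders; adjust to your hardware.
+druid.segmentCache.locations=[{"path":"/ssd1/druid/segment-cache","maxSize":"300g","freeSpacePercent":5.0},{"path":"/ssd2/druid/segment-cache","maxSize":"300g","freeSpacePercent":5.0}]
+druid.segmentCache.locationSelector.strategy=mostAvailableSize
+```
+
+With this configuration, `druid.server.maxSize` defaults to 600g, the sum of the two `maxSize` values.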
+
+#### Historical query configs
+
+##### Concurrent requests
+
+Druid uses Jetty to serve HTTP requests.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.server.http.numThreads`|Number of threads for HTTP requests.|max(10, (Number of cores * 17) / 16 + 2) + 30|
+|`druid.server.http.queueSize`|Size of the worker queue used by the Jetty server to temporarily store incoming client connections. If this value is set and a request is rejected because the Jetty queue is full, the client observes a request failure: the TCP connection is closed immediately with a completely empty response from the server.|Unbounded|
+|`druid.server.http.maxIdleTime`|The Jetty max idle time for a connection.|`PT5M`|
+|`druid.server.http.enableRequestLimit`|If enabled, requests are not queued in the Jetty queue; an "HTTP 429 Too Many Requests" error response is sent instead.|false|
+|`druid.server.http.defaultQueryTimeout`|Query timeout in millis, beyond which unfinished queries are cancelled.|300000|
+|`druid.server.http.gracefulShutdownTimeout`|The maximum amount of time Jetty waits after receiving a shutdown signal. After this timeout, the threads are forcefully shut down. This allows any queries that are executing to complete (only values greater than zero are valid).|`PT30S`|
+|`druid.server.http.unannouncePropagationDelay`|How long to wait for ZooKeeper unannouncements to propagate before shutting down Jetty. This is a minimum and `druid.server.http.gracefulShutdownTimeout` does not start counting down until after this period elapses.|`PT0S` (do not wait)|
+|`druid.server.http.maxQueryTimeout`|Maximum allowed value (in milliseconds) for `timeout` parameter. See [query-context](../querying/query-context-reference.md) to know more about `timeout`. Query is rejected if the query context `timeout` is greater than this value. |`Long.MAX_VALUE`|
+|`druid.server.http.maxRequestHeaderSize`|Maximum size of a request header in bytes. Larger headers consume more memory and can make a server more vulnerable to denial of service attacks.|8 * 1024|
+|`druid.server.http.contentSecurityPolicy`|Content-Security-Policy header value to set on each non-POST response. Setting this property to an empty string, or omitting it, both result in the default `frame-ancestors: none` being set.|`frame-ancestors 'none'`|
+
+##### Processing
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.processing.buffer.sizeBytes`|This specifies a buffer size (less than 2GiB), for the storage of intermediate results. The computation engine in both the Historical and Realtime processes will use a scratch buffer of this size to do all of their intermediate computations off-heap. Larger values allow for more aggregations in a single pass over the data while smaller values can require more passes depending on the query that is being executed. [Human-readable format](human-readable-byte.md) is supported.|auto (max 1GiB)|
+|`druid.processing.buffer.poolCacheMaxCount`|The processing buffer pool caches buffers for later use. This is the maximum count that the cache will grow to. Note that the pool can create more buffers than it can cache if necessary.|`Integer.MAX_VALUE`|
+|`druid.processing.formatString`|Realtime and Historical processes use this format string to name their processing threads.|processing-%s|
+|`druid.processing.numMergeBuffers`|The number of direct memory buffers available for merging query results. The buffers are sized by `druid.processing.buffer.sizeBytes`. This property is effectively a concurrency limit for queries that require merging buffers. If you are using any queries that require merge buffers (currently, just groupBy) then you should have at least two of these.|`max(2, druid.processing.numThreads / 4)`|
+|`druid.processing.numThreads`|The number of processing threads to have available for parallel processing of segments. Our rule of thumb is `num_cores - 1`, which means that even under heavy load there will still be one core available to do background tasks like talking with ZooKeeper and pulling down segments. If only one core is available, this property defaults to the value `1`.|Number of cores - 1 (or 1)|
+|`druid.processing.numTimeoutThreads`|The number of processing threads to have available for handling per-segment query timeouts. Setting this value to `0` removes the ability to service per-segment timeouts, irrespective of `perSegmentTimeout` query context parameter. As these threads are just servicing timers, it's recommended to set this value to some small percent (e.g. 5%) of the total query processing cores available to the historical.|0|
+|`druid.processing.fifo`|If the processing queue should treat tasks of equal priority in a FIFO manner|`true`|
+|`druid.processing.tmpDir`|Path where temporary files created while processing a query should be stored. If specified, this configuration takes priority over the default `java.io.tmpdir` path.|path represented by `java.io.tmpdir`|
+
+The amount of direct memory needed by Druid is at least
+`druid.processing.buffer.sizeBytes * (druid.processing.numMergeBuffers + druid.processing.numThreads + 1)`. You can
+ensure at least this amount of direct memory is available by providing `-XX:MaxDirectMemorySize=` at the command
+line.
+
+##### Historical query configuration
+
+See [general query configuration](#general-query-configuration).
+
+#### Historical caching
+
+You can optionally configure caching to be enabled on the Historical by setting caching configs here.
+
+|Property|Possible Values|Description|Default|
+|--------|---------------|-----------|-------|
+|`druid.historical.cache.useCache`|true, false|Enable the cache on the Historical.|false|
+|`druid.historical.cache.populateCache`|true, false|Populate the cache on the Historical.|false|
+|`druid.historical.cache.unCacheable`|All druid query types|All query types to not cache.|`[scan]`|
+|`druid.historical.cache.maxEntrySize`|positive integer|Maximum cache entry size in bytes.|1_000_000|
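+
+For example, the following illustrative settings turn on both reading from and populating the cache on Historicals, backed by a local Caffeine cache (the cache size is a placeholder):
+
+```properties
+druid.historical.cache.useCache=true
+druid.historical.cache.populateCache=true
+druid.cache.type=caffeine
+# 256 MiB of heap for cached results; size to your workload.
+druid.cache.sizeInBytes=268435456
+```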
+
+See [cache configuration](#cache-configuration) for how to configure cache settings.
+
+## Query server
+
+This section contains the configuration options for the services that reside on Query servers (Brokers) in the suggested [three-server configuration](../design/architecture.md#druid-servers).
+
+Configuration options for the [Router process](../design/router.md) are also provided here.
+
+### Broker
+
+For general Broker process information, see [here](../design/broker.md).
+
+These Broker configurations can be defined in the `broker/runtime.properties` file.
+
+#### Broker process configs
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.host`|The host for the current process. This is used to advertise the current processes location as reachable from another process and should generally be specified such that `http://${druid.host}/` could actually talk to this process|`InetAddress.getLocalHost().getCanonicalHostName()`|
+|`druid.bindOnHost`|Indicates whether the process's internal Jetty server binds on `druid.host`. The default is false, which means binding to all interfaces.|false|
+|`druid.plaintextPort`|This is the port to actually listen on; unless port mapping is used, this will be the same port as is on `druid.host`|8082|
+|`druid.tlsPort`|TLS port for HTTPS connector, if [druid.enableTlsPort](../operations/tls-support.md) is set then this config will be used. If `druid.host` contains port then that port will be ignored. This should be a non-negative Integer.|8282|
+|`druid.service`|The name of the service. This is used as a dimension when emitting metrics and alerts to differentiate between the various services|`druid/broker`|
+|`druid.labels`|Optional JSON object of key-value pairs that define custom labels for the server. These labels are displayed in the web console under the "Services" tab. Example: `druid.labels={"location":"Airtrunk"}` or `druid.labels.location=Airtrunk`|`null`|
+
+#### Query configuration
+
+##### Query routing
+
+|Property|Possible Values|Description|Default|
+|--------|---------------|-----------|-------|
+|`druid.broker.balancer.type`|`random`, `connectionCount`|Determines how the Broker balances connections to Historical processes. `random` chooses randomly; `connectionCount` picks the process with the fewest active connections.|`random`|
+|`druid.broker.select.tier`|`highestPriority`, `lowestPriority`, `custom`, `preferred`|If segments are cross-replicated across tiers in a cluster, you can tell the broker to prefer to select segments in a tier with a certain priority.|`highestPriority`|
+|`druid.broker.select.tier.custom.priorities`|An array of integer priorities, such as `[-1, 0, 1, 2]`|Select servers in tiers with a custom priority list.|This config only has an effect if `druid.broker.select.tier` is set to `custom`. If `druid.broker.select.tier` is set to `custom` but this config is not specified, the effect is the same as setting `druid.broker.select.tier` to `highestPriority`. Any of the integers in this config can be ignored if there are no corresponding tiers with such priorities. Tiers with priorities explicitly specified in this config always have higher priority than those without, and unspecified tiers fall back to the `highestPriority` strategy among themselves.|
+|`druid.broker.select.tier.preferred.tier`| The preferred tier name, for example `_default_tier` | A non-empty value that specifies the preferred tier from which Historical servers are picked for queries. If there are not enough Historical servers in the preferred tier, servers from other tiers (if there are any) are selected. This config only has an effect if `druid.broker.select.tier` is set to `preferred`. | null |
+|`druid.broker.select.tier.preferred.priority`| `highest`, `lowest` | If there are multiple candidates in the preferred tier, specifies the priority used to pick candidates. By default, the higher the priority of a Historical, the higher the chance it is picked. This config only has an effect if `druid.broker.select.tier` is set to `preferred`.| `highest` |
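+
+For example, a Broker that prefers a hypothetical `hot` tier and falls back to other tiers when the preferred tier cannot serve the needed segments could be configured as:
+
+```properties
+druid.broker.select.tier=preferred
+druid.broker.select.tier.preferred.tier=hot
+druid.broker.select.tier.preferred.priority=highest
+```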
+
+##### Query prioritization and laning
+
+Laning strategies allow you to control capacity utilization for heterogeneous query workloads. With laning, the broker examines and classifies a query for the purpose of assigning it to a lane. Lanes have capacity limits, enforced by the broker, that can be used to ensure sufficient resources are available for other lanes or for interactive queries (with no lane), or to limit overall throughput for queries within the lane. Requests in excess of the capacity are discarded with an HTTP 429 status code.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.query.scheduler.numThreads`|Maximum number of concurrently-running queries. When this parameter is set lower than `druid.server.http.numThreads`, query requests beyond the limit are put into the Jetty request queue. This has the effect of reserving the leftover Jetty threads for non-query requests. When this parameter is set equal to or higher than `druid.server.http.numThreads`, it has no effect.|Unbounded|
+|`druid.query.scheduler.laning.strategy`|Query laning strategy to use to assign queries to a lane in order to control capacities for certain classes of queries.|`none`|
+|`druid.query.scheduler.prioritization.strategy`|Query prioritization strategy to automatically assign priorities.|`manual`|
+
+##### Prioritization strategies
+
+###### Manual prioritization strategy
+
+With this configuration, queries are never assigned a priority automatically, but will preserve a priority manually set on the [query context](../querying/query-context-reference.md) with the `priority` key. This mode can be explicitly set by setting `druid.query.scheduler.prioritization.strategy` to `manual`.
+
+###### Threshold prioritization strategy
+
+This prioritization strategy lowers the priority of queries that cross any of a configurable set of thresholds, such as how far in the past the data is, how large of an interval a query covers, or the number of segments taking part in a query.
+
+This strategy can be enabled by setting `druid.query.scheduler.prioritization.strategy` to `threshold`.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.query.scheduler.prioritization.periodThreshold`|ISO duration threshold for how old data can be queried before automatically adjusting query priority.|none|
+|`druid.query.scheduler.prioritization.durationThreshold`|ISO duration threshold for the maximum duration a query's interval can span before the priority is automatically adjusted.|none|
+|`druid.query.scheduler.prioritization.segmentCountThreshold`|Number threshold for maximum number of segments that can take part in a query before its priority is automatically adjusted.|none|
+|`druid.query.scheduler.prioritization.segmentRangeThreshold`|ISO duration threshold for maximum segment range a query can span before the priority is automatically adjusted.|none|
+|`druid.query.scheduler.prioritization.adjustment`|Amount to reduce the priority of queries which cross any threshold.|none|
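+
+For example, a minimal sketch of a threshold prioritization setup in the Broker's `runtime.properties`; the threshold and adjustment values below are illustrative placeholders, not recommendations:
+
+```properties
+# Automatically deprioritize queries that read old data, span long intervals,
+# or touch many segments (values are illustrative placeholders).
+druid.query.scheduler.prioritization.strategy=threshold
+druid.query.scheduler.prioritization.periodThreshold=P90D
+druid.query.scheduler.prioritization.durationThreshold=P30D
+druid.query.scheduler.prioritization.segmentCountThreshold=100
+druid.query.scheduler.prioritization.adjustment=-5
+```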
+
+##### Laning strategies
+
+###### No laning strategy
+
+In this mode, queries are never assigned a lane, and the concurrent query count will only be limited by `druid.server.http.numThreads` or `druid.query.scheduler.numThreads`, if set. This is the default Druid query scheduler operating mode. Enable this strategy explicitly by setting `druid.query.scheduler.laning.strategy` to `none`.
+
+###### 'High/Low' laning strategy
+
+This laning strategy splits queries with a `priority` below zero into a `low` query lane, automatically. Queries with priority of zero (the default) or above are considered 'interactive'. The limit on `low` queries can be set to some desired percentage of the total capacity (or HTTP thread pool size), reserving capacity for interactive queries. Queries in the `low` lane are _not_ guaranteed their capacity, which may be consumed by interactive queries, but may use up to this limit if total capacity is available.
+
+If the `low` lane is specified in the [query context](../querying/query-context-reference.md) `lane` parameter, this will override the computed lane.
+
+This strategy can be enabled by setting `druid.query.scheduler.laning.strategy=hilo`.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.query.scheduler.laning.maxLowPercent`|Maximum percent of the smaller number of `druid.server.http.numThreads` or `druid.query.scheduler.numThreads`, defining the number of HTTP threads that can be used by queries with a priority lower than 0. Value must be an integer in the range 1 to 100, and will be rounded up|No default, must be set if using this mode|
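+
+For example, the following sketch reserves most HTTP threads for interactive queries while capping low-priority queries; the numbers are arbitrary illustrations, not recommendations:
+
+```properties
+# Cap concurrently running queries below druid.server.http.numThreads,
+# and allow at most 20% of that capacity for queries with priority < 0.
+druid.query.scheduler.numThreads=40
+druid.query.scheduler.laning.strategy=hilo
+druid.query.scheduler.laning.maxLowPercent=20
+```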
+
+##### Guardrails for materialization of subqueries
+
+Druid stores the subquery rows in temporary tables that live in the Java heap. It is a good practice to avoid large subqueries in Druid.
+Therefore, Druid has built-in guardrails to prevent queries from generating subquery results that can exhaust the heap
+space. They can be set at the cluster level or modified per query as desired.
+The cluster admin can set the following guardrails to limit the subquery results:
+
+1. `druid.server.http.maxSubqueryRows` in broker's config to set a default for the entire cluster or `maxSubqueryRows` in the query context to set an upper limit on the number of rows a subquery can generate
+2. `druid.server.http.maxSubqueryBytes` in broker's config to set a default for the entire cluster or `maxSubqueryBytes` in the query context to set an upper limit on the number of bytes a subquery can generate
+
+Limiting the subquery by bytes is an experimental feature as it materializes the results differently.
+
+You can configure `maxSubqueryBytes` to the following values:
+
+* `disabled`: The default setting. It disables the byte-based limit on subqueries, effectively disabling this feature.
+* `auto`: Druid automatically decides the optimal byte-based limit based upon the available heap space and the maximum number of concurrent queries.
+* A positive long value: Manually specifies the number of bytes that the results of the subqueries of a single query can occupy on the heap.
+
+Due to the conversion between the Java objects and the Frame's format, setting `maxSubqueryBytes` can become slow if a subquery generates
+on the order of 10 million rows or more. In those scenarios, disable the `maxSubqueryBytes` setting for such queries, assess the number of rows that the subqueries generate, and override `maxSubqueryRows` to an appropriate value.
+
+If you choose to modify or set any of the above limits, you must also think about the heap size of all Brokers, Historicals, and task Peons that process data for the subqueries to accommodate the subquery results.
+There is no formula to calculate the correct value. Trial and error is the best approach.
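+
+For example, a sketch of cluster-level subquery guardrails in the Broker's `runtime.properties`; the row limit below is a placeholder to adapt to your workload and heap size:
+
+```properties
+# Cluster-wide defaults; queries can also set maxSubqueryRows / maxSubqueryBytes
+# in their query context.
+druid.server.http.maxSubqueryRows=100000
+druid.server.http.maxSubqueryBytes=auto
+```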
+
+###### Manual laning strategy
+
+This laning strategy is best suited for cases where one or more external applications which query Druid are capable of manually deciding what lane a given query should belong to. Configured with a map of lane names to percent or exact max capacities, queries with a matching `lane` parameter in the [query context](../querying/query-context-reference.md) will be subjected to those limits.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.query.scheduler.laning.lanes.{name}`|Maximum percent or exact limit of queries that can concurrently run in the defined lanes. Any number of lanes may be defined like this. The lane names 'total' and 'default' are reserved for internal use.|No default, must define at least one lane with a limit above 0. If `druid.query.scheduler.laning.isLimitPercent` is set to `true`, values must be integers in the range of 1 to 100.|
+|`druid.query.scheduler.laning.isLimitPercent`|If set to `true`, the values set for `druid.query.scheduler.laning.lanes` will be treated as a percent of the smaller number of `druid.server.http.numThreads` or `druid.query.scheduler.numThreads`. Note that in this mode, these lane values across lanes are _not_ required to add up to, and can exceed, 100%.|`false`|
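+
+As a sketch, the lane name `reporting` below is hypothetical; an external application would opt into it by setting `lane` in its query context:
+
+```properties
+# Define a lane limited to 15% of the available HTTP/query threads.
+druid.query.scheduler.laning.strategy=manual
+druid.query.scheduler.laning.lanes.reporting=15
+druid.query.scheduler.laning.isLimitPercent=true
+```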
+
+##### Server configuration
+
+Druid uses Jetty to serve HTTP requests. Each query being processed consumes a single thread from `druid.server.http.numThreads`, so consider defining `druid.query.scheduler.numThreads` to a lower value in order to reserve HTTP threads for responding to health checks, lookup loading, and other non-query, (in most cases) comparatively very short-lived, HTTP requests.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.server.http.numThreads`|Number of threads for HTTP requests.|max(10, (Number of cores * 17) / 16 + 2) + 30|
+|`druid.server.http.queueSize`|Size of the worker queue used by the Jetty server to temporarily store incoming client connections. If this value is set and Jetty rejects a request because the queue is full, the client observes a request failure: the TCP connection is closed immediately with a completely empty response from the server.|Unbounded|
+|`druid.server.http.maxIdleTime`|The Jetty max idle time for a connection.|`PT5M`|
+|`druid.server.http.enableRequestLimit`|If enabled, requests are not queued in the Jetty queue; instead, an "HTTP 429 Too Many Requests" error response is sent.|false|
+|`druid.server.http.defaultQueryTimeout`|Query timeout in millis, beyond which unfinished queries will be cancelled|300000|
+|`druid.server.http.maxScatterGatherBytes`|Maximum number of bytes gathered from data processes, such as Historicals and realtime processes, to execute a query. Queries that exceed this limit fail. This is an advanced configuration that protects the Broker when it is under heavy load and not consuming the gathered data in memory fast enough, which can lead to OOMs. This limit can be further reduced at query time using `maxScatterGatherBytes` in the query context. Note that a large limit is not necessarily bad if the Broker is never under heavy concurrent load, in which case the gathered data is processed quickly, freeing up the memory used. Human-readable format is supported, see [here](human-readable-byte.md).|`Long.MAX_VALUE`|
+|`druid.server.http.maxSubqueryRows`|Maximum number of rows from all subqueries per query. Druid stores the subquery rows in temporary tables that live in the Java heap. `druid.server.http.maxSubqueryRows` is a guardrail to prevent the system from exhausting available heap. When a subquery exceeds the row limit, Druid throws a resource limit exceeded exception: "Subquery generated results beyond maximum." It is a good practice to avoid large subqueries in Druid. However, if you choose to raise the subquery row limit, you must also increase the heap size of all Brokers, Historicals, and task Peons that process data for the subqueries to accommodate the subquery results. There is no formula to calculate the correct value. Trial and error is the best approach.|100000|
+|`druid.server.http.maxSubqueryBytes`|Maximum number of bytes from all subqueries per query. Since the results are stored on the Java heap, `druid.server.http.maxSubqueryBytes` is a guardrail like `druid.server.http.maxSubqueryRows` to prevent the heap space from being exhausted. When a subquery exceeds the byte limit, Druid throws a resource limit exceeded exception. A negative value for the guardrail indicates that Druid won't limit by memory. This can be set to 'disabled', which disables the byte-based limit on results; 'auto', which sets this value automatically taking free heap space into account; or a positive long value specifying the number of bytes that a single query's subquery results can occupy. This is an experimental feature for now as it materializes the results in a different format.|'disabled'|
+|`druid.server.http.gracefulShutdownTimeout`|The maximum amount of time Jetty waits after receiving a shutdown signal. After this timeout, the threads are forcefully shut down. This allows any queries that are executing to complete. Only values greater than zero are valid.|`PT30S`|
+|`druid.server.http.unannouncePropagationDelay`|How long to wait for ZooKeeper unannouncements to propagate before shutting down Jetty. This is a minimum and `druid.server.http.gracefulShutdownTimeout` does not start counting down until after this period elapses.|`PT0S` (do not wait)|
+|`druid.server.http.maxQueryTimeout`|Maximum allowed value (in milliseconds) for `timeout` parameter. See [query-context](../querying/query-context-reference.md) to know more about `timeout`. Query is rejected if the query context `timeout` is greater than this value. |`Long.MAX_VALUE`|
+|`druid.server.http.maxRequestHeaderSize`|Maximum size of a request header in bytes. Larger headers consume more memory and can make a server more vulnerable to denial of service attacks. |8 * 1024|
+|`druid.server.http.contentSecurityPolicy`|Content-Security-Policy header value to set on each non-POST response. Setting this property to an empty string, or omitting it, both result in the default `frame-ancestors: none` being set.|`frame-ancestors 'none'`|
+|`druid.server.http.enableHSTS`|If set to true, druid services will add strict transport security header `Strict-Transport-Security: max-age=63072000; includeSubDomains` to all HTTP responses|`false`|
+
+##### Client configuration
+
+Druid Brokers use an HTTP client to communicate with data servers (Historical servers and real-time tasks). This
+client has the following configuration options.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.broker.http.numConnections`|Size of connection pool for the Broker to connect to Historical and real-time processes. If there are more queries than this number that all need to speak to the same process, then they will queue up.|20|
+|`druid.broker.http.eagerInitialization`|Indicates that http connections from Broker to Historical and Real-time processes should be eagerly initialized. If set to true, `numConnections` connections are created upon initialization|`true`|
+|`druid.broker.http.compressionCodec`|Compression codec the Broker uses to communicate with Historical and real-time processes. May be "gzip" or "identity".|`gzip`|
+|`druid.broker.http.readTimeout`|The timeout for data reads from Historical servers and real-time tasks.|`PT15M`|
+|`druid.broker.http.unusedConnectionTimeout`|The timeout for idle connections in connection pool. The connection in the pool will be closed after this timeout and a new one will be established. This timeout should be less than `druid.broker.http.readTimeout`. Set this timeout = ~90% of `druid.broker.http.readTimeout`|`PT4M`|
+|`druid.broker.http.maxQueuedBytes`|Maximum number of bytes queued per query before exerting [backpressure](../operations/basic-cluster-tuning.md#broker-backpressure) on channels to the data servers. Similar to `druid.server.http.maxScatterGatherBytes`, except that `maxQueuedBytes` triggers [backpressure](../operations/basic-cluster-tuning.md#broker-backpressure) instead of query failure. Set to zero to disable. You can override this setting by using the [`maxQueuedBytes` query context parameter](../querying/query-context-reference.md). Druid supports [human-readable](human-readable-byte.md) format. |25 MB or 2% of maximum Broker heap size, whichever is greater.|
+|`druid.broker.http.numMaxThreads`|Maximum number of I/O worker threads.|`(number of cores) * 3 / 2 + 1`|
+|`druid.broker.http.clientConnectTimeout`|The timeout (in milliseconds) for establishing client connections.|500|
+
+
+##### Retry policy
+
+The Druid Broker can optionally retry queries internally for transient errors.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.broker.retryPolicy.numTries`|Number of tries.|1|
+
+##### Processing
+
+The broker uses processing configs for nested groupBy queries.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.processing.buffer.sizeBytes`|This specifies a buffer size (less than 2GiB) for the storage of intermediate results. The computation engine in both the Historical and Realtime processes will use a scratch buffer of this size to do all of their intermediate computations off-heap. Larger values allow for more aggregations in a single pass over the data while smaller values can require more passes depending on the query that is being executed. [Human-readable format](human-readable-byte.md) is supported.|auto (max 1GiB)|
+|`druid.processing.buffer.poolCacheInitialCount`|Initializes the number of buffers allocated on the intermediate results pool. Note that the pool can create more buffers if necessary.|`0`|
+|`druid.processing.buffer.poolCacheMaxCount`|The processing buffer pool caches buffers for later use; this is the maximum count the cache will grow to. Note that the pool can create more buffers than it can cache if necessary.|`Integer.MAX_VALUE`|
+|`druid.processing.numMergeBuffers`|The number of direct memory buffers available for merging query results. The buffers are sized by `druid.processing.buffer.sizeBytes`. This property is effectively a concurrency limit for queries that require merging buffers. If you are using any queries that require merge buffers (currently, just groupBy) then you should have at least two of these.|`max(2, druid.processing.numThreads / 4)`|
+|`druid.processing.fifo`|If the processing queue should treat tasks of equal priority in a FIFO manner|`true`|
+|`druid.processing.tmpDir`|Path where temporary files created while processing a query should be stored. If specified, this configuration takes priority over the default `java.io.tmpdir` path.|path represented by `java.io.tmpdir`|
+|`druid.processing.merge.useParallelMergePool`|Enable automatic parallel merging for Brokers on a dedicated async ForkJoinPool. If `false`, instead merges will be done serially on the `HTTP` thread pool.|`true`|
+|`druid.processing.merge.parallelism`|Size of ForkJoinPool. Note that the default configuration assumes that the value returned by `Runtime.getRuntime().availableProcessors()` represents 2 hyper-threads per physical core, and multiplies this value by `0.75` in attempt to size `1.5` times the number of _physical_ cores.|`Runtime.getRuntime().availableProcessors() * 0.75` (rounded up)|
+|`druid.processing.merge.defaultMaxQueryParallelism`|Default maximum number of parallel merge tasks per query. Note that the default configuration assumes that the value returned by `Runtime.getRuntime().availableProcessors()` represents 2 hyper-threads per physical core, and multiplies this value by `0.5` in attempt to size to the number of _physical_ cores.|`Runtime.getRuntime().availableProcessors() * 0.5` (rounded up)|
+|`druid.processing.merge.awaitShutdownMillis`|Time to wait for merge ForkJoinPool tasks to complete before ungracefully stopping on process shutdown in milliseconds.|`60_000`|
+|`druid.processing.merge.targetRunTimeMillis`|Ideal run-time of each ForkJoinPool merge task, before forking off a new task to continue merging sequences.|100|
+|`druid.processing.merge.initialYieldNumRows`|Number of rows to yield per ForkJoinPool merge task, before forking off a new task to continue merging sequences.|16384|
+|`druid.processing.merge.smallBatchNumRows`|Size of result batches to operate on in ForkJoinPool merge tasks.|4096|
+
+The amount of direct memory needed by Druid is at least
+`druid.processing.buffer.sizeBytes * (druid.processing.numMergeBuffers + 1)`. You can
+ensure at least this amount of direct memory is available by providing `-XX:MaxDirectMemorySize=` at the command
+line.
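+
+For example, assuming illustrative values of a 500 MiB processing buffer and 4 merge buffers, the requirement works out to at least 500 MiB * (4 + 1) = 2500 MiB of direct memory:
+
+```properties
+# Hypothetical sizing; with these values the Broker's jvm.config should include
+# at least -XX:MaxDirectMemorySize=2500m.
+druid.processing.buffer.sizeBytes=500MiB
+druid.processing.numMergeBuffers=4
+```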
+
+##### Broker query configuration
+
+See [general query configuration](#general-query-configuration).
+
+###### Broker generated query configuration supplementation
+
+The Broker generates queries internally. This configuration section describes how an operator can augment the configuration
+of these queries.
+
+Currently, the only supported augmentation is overriding the default query context. This gives an operator the flexibility
+to adjust it as they see fit. A common use of this configuration is to override the query priority of cluster-generated
+queries so that they don't run at the default priority of 0, as sketched in the example after the table below.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.broker.internal.query.config.context`|A string formatted `key:value` map of a query context to add to internally generated broker queries.|null|
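+
+For example, a sketch that lowers the priority of Broker-generated queries; the exact JSON value is illustrative:
+
+```properties
+# Run internally generated Broker queries at a lower-than-default priority.
+druid.broker.internal.query.config.context={"priority": -1}
+```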
+
+#### SQL
+
+The Druid SQL server is configured through the following properties on the Broker.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.sql.enable`|Whether to enable SQL at all, including background metadata fetching. If false, this overrides all other SQL-related properties and disables SQL metadata, serving, and planning completely.|true|
+|`druid.sql.avatica.enable`|Whether to enable JDBC querying at `/druid/v2/sql/avatica/`.|true|
+|`druid.sql.avatica.maxConnections`|Maximum number of open connections for the Avatica server. These are not HTTP connections, but are logical client connections that may span multiple HTTP connections.|25|
+|`druid.sql.avatica.maxRowsPerFrame`|Maximum acceptable value for the JDBC client `Statement.setFetchSize` method. This setting determines the maximum number of rows that Druid will populate in a single 'fetch' for a JDBC `ResultSet`. Set this property to -1 to enforce no row limit on the server-side and potentially return the entire set of rows on the initial statement execution. If the JDBC client calls `Statement.setFetchSize` with a value other than -1, Druid uses the lesser value of the client-provided limit and `maxRowsPerFrame`. If `maxRowsPerFrame` is smaller than `minRowsPerFrame`, then the `ResultSet` size will be fixed. To handle queries that produce results with a large number of rows, you can increase value of `druid.sql.avatica.maxRowsPerFrame` to reduce the number of fetches required to completely transfer the result set.|5,000|
+|`druid.sql.avatica.minRowsPerFrame`|Minimum acceptable value for the JDBC client `Statement.setFetchSize` method. The value for this property must be greater than 0. If the JDBC client calls `Statement.setFetchSize` with a lesser value, Druid uses `minRowsPerFrame` instead. If `maxRowsPerFrame` is less than `minRowsPerFrame`, Druid uses the minimum value of the two. For handling queries which produce results with a large number of rows, you can increase this value to reduce the number of fetches required to completely transfer the result set.|100|
+|`druid.sql.avatica.maxStatementsPerConnection`|Maximum number of simultaneous open statements per Avatica client connection.|4|
+|`druid.sql.avatica.connectionIdleTimeout`|Avatica client connection idle timeout.|`PT5M`|
+|`druid.sql.avatica.fetchTimeoutMs`|Avatica fetch timeout, in milliseconds. When a request for the next batch of data takes longer than this time, Druid returns an empty result set, causing the client to poll again. This avoids HTTP timeouts for long-running queries. The default of 5 sec. is good for most cases. |5000|
+|`druid.sql.http.enable`|Whether to enable JSON over HTTP querying at `/druid/v2/sql/`.|true|
+|`druid.sql.planner.maxTopNLimit`|Maximum threshold for a [TopN query](../querying/topnquery.md). Higher limits will be planned as [GroupBy queries](../querying/groupbyquery.md) instead.|100000|
+|`druid.sql.planner.metadataRefreshPeriod`|Throttle for metadata refreshes.|`PT1M`|
+|`druid.sql.planner.metadataColumnTypeMergePolicy`|Defines how column types will be chosen when faced with differences between segments when computing the SQL schema. Options are specified as a JSON object, with valid choices of `leastRestrictive` or `latestInterval`. For `leastRestrictive`, Druid will automatically widen the type computed for the schema to a type which data across all segments can be converted into, however planned schema migrations can only take effect once all segments have been re-ingested to the new schema. With `latestInterval`, the column type in most recent time chunks defines the type for the schema. |`leastRestrictive`|
+|`druid.sql.planner.useApproximateCountDistinct`|Whether to use an approximate cardinality algorithm for `COUNT(DISTINCT foo)`.|true|
+|`druid.sql.planner.useGroupingSetForExactDistinct`|Only relevant when `useApproximateCountDistinct` is disabled. If set to true, exact distinct queries are re-written using grouping sets. Otherwise, exact distinct queries are re-written using joins. This should be set to true for group by query with multiple exact distinct aggregations. This flag can be overridden per query.|false|
+|`druid.sql.planner.useApproximateTopN`|Whether to use approximate [TopN queries](../querying/topnquery.md) when a SQL query could be expressed as such. If false, exact [GroupBy queries](../querying/groupbyquery.md) will be used instead.|true|
+|`druid.sql.planner.useLexicographicTopN`|Whether to use [TopN queries](../querying/topnquery.md) with lexicographic dimension ordering. If false, [GroupBy queries](../querying/groupbyquery.md) will be used instead for lexicographic ordering. When both this and `useApproximateTopN` are false, TopN queries are never used.|false|
+|`druid.sql.planner.requireTimeCondition`|Whether to require SQL to have filter conditions on `__time` column so that all generated native queries will have user specified intervals. If true, all queries without filter condition on `__time` column will fail|false|
+|`druid.sql.planner.sqlTimeZone`|Sets the default time zone for the server, which will affect how time functions and timestamp literals behave. Should be a time zone name like "America/Los_Angeles" or offset like "-08:00".|UTC|
+|`druid.sql.planner.metadataSegmentCacheEnable`|Whether to keep a cache of published segments in broker. If true, broker polls coordinator in background to get segments from metadata store and maintains a local cache. If false, coordinator's REST API will be invoked when broker needs published segments info.|false|
+|`druid.sql.planner.metadataSegmentPollPeriod`|How often to poll coordinator for published segments list if `druid.sql.planner.metadataSegmentCacheEnable` is set to true. Poll period is in milliseconds. |60000|
+|`druid.sql.planner.authorizeSystemTablesDirectly`|If true, Druid authorizes queries against any of the system schema tables (`sys` in SQL) as `SYSTEM_TABLE` resources which require `READ` access, in addition to permissions based content filtering.|false|
+|`druid.sql.planner.useNativeQueryExplain`|If true, `EXPLAIN PLAN FOR` will return the explain plan as a JSON representation of equivalent native query(s), else it will return the original version of explain plan generated by Calcite. It can be overridden per query with `useNativeQueryExplain` context key.|true|
+|`druid.sql.planner.maxNumericInFilters`|Max limit for the amount of numeric values that can be compared for a string type dimension when the entire SQL WHERE clause of a query translates to an [OR](../querying/filters.md#or) of [Bound filter](../querying/filters.md#bound-filter). By default, Druid does not restrict the amount of numeric Bound Filters on String columns, although this situation may block other queries from running. Set this property to a smaller value to prevent Druid from running queries that have prohibitively long segment processing times. The optimal limit requires some trial and error; we recommend starting with 100. Users who submit a query that exceeds the limit of `maxNumericInFilters` should instead rewrite their queries to use strings in the `WHERE` clause instead of numbers. For example, `WHERE someString IN ('123', '456')`. If this value is disabled, `maxNumericInFilters` set through query context is ignored.|`-1` (disabled)|
+|`druid.sql.approxCountDistinct.function`|Implementation to use for the [`APPROX_COUNT_DISTINCT` function](../querying/sql-aggregations.md). Without extensions loaded, the only valid value is `APPROX_COUNT_DISTINCT_BUILTIN` (a HyperLogLog, or HLL, based implementation). If the [DataSketches extension](../development/extensions-core/datasketches-extension.md) is loaded, this can also be `APPROX_COUNT_DISTINCT_DS_HLL` (alternative HLL implementation) or `APPROX_COUNT_DISTINCT_DS_THETA`. Theta sketches use significantly more memory than HLL sketches, so you should prefer one of the two HLL implementations.|`APPROX_COUNT_DISTINCT_BUILTIN`|
+
+:::info
+ Previous versions of Druid had properties named `druid.sql.planner.maxQueryCount` and `druid.sql.planner.maxSemiJoinRowsInMemory`.
+ These properties are no longer available. Since Druid 0.18.0, you can use `druid.server.http.maxSubqueryRows` to control the maximum
+ number of rows permitted across all subqueries.
+:::
+
+#### Broker caching
+
+You can optionally enable caching on the Broker by setting the following caching configs.
+
+|Property|Possible Values|Description|Default|
+|--------|---------------|-----------|-------|
+|`druid.broker.cache.useCache`|true, false|Enable the cache on the Broker.|false|
+|`druid.broker.cache.populateCache`|true, false|Populate the cache on the Broker.|false|
+|`druid.broker.cache.useResultLevelCache`|true, false|Enable result level caching on the Broker.|false|
+|`druid.broker.cache.populateResultLevelCache`|true, false|Populate the result level cache on the Broker.|false|
+|`druid.broker.cache.resultLevelCacheLimit`|positive integer|Maximum size of query response that can be cached.|`Integer.MAX_VALUE`|
+|`druid.broker.cache.unCacheable`|All Druid query types|Druid query types that should not be cached.|`[scan]`|
+|`druid.broker.cache.cacheBulkMergeLimit`|positive integer or 0|Queries with more segments than this number will not attempt to fetch from cache at the broker level, leaving potential caching fetches (and cache result merging) to the Historicals|`Integer.MAX_VALUE`|
+|`druid.broker.cache.maxEntrySize`|positive integer|Maximum cache entry size in bytes.|1_000_000|
+
+See [cache configuration](#cache-configuration) for how to configure cache settings.
+
+:::info
+ Note: Even if cache is enabled, for [groupBy](../querying/groupbyquery.md) queries, segment level cache does not work on Brokers.
+ See [Query caching](../querying/caching.md) for more information.
+:::
+
+#### Segment discovery
+
+|Property|Possible Values|Description|Default|
+|--------|---------------|-----------|-------|
+|`druid.serverview.type`|batch or http|Segment discovery method to use. "http" enables discovering segments using HTTP instead of ZooKeeper.|http|
+|`druid.broker.segment.watchedTiers`|List of strings|The Broker watches segment announcements from processes that serve segments to build a cache to relate each process to the segments it serves. This configuration allows the Broker to only consider segments being served from a list of tiers. By default, Broker considers all tiers. This can be used to partition your dataSources in specific Historical tiers and configure brokers in partitions so that they are only queryable for specific dataSources. This config is mutually exclusive from `druid.broker.segment.ignoredTiers` and at most one of these can be configured on a Broker.|none|
+|`druid.broker.segment.ignoredTiers`|List of strings|The Broker watches segment announcements from processes that serve segments to build a cache to relate each process to the segments it serves. This configuration allows the Broker to ignore the segments being served from a list of tiers. By default, Broker considers all tiers. This config is mutually exclusive from `druid.broker.segment.watchedTiers` and at most one of these can be configured on a Broker.|none|
+|`druid.broker.segment.watchedDataSources`|List of strings|The Broker watches segment announcements from processes that serve segments to build a cache of which process serves which segments. This configuration allows the Broker to only consider segments being served from a whitelist of dataSources. By default, the Broker considers all datasources. This can be used to configure Brokers in partitions so that they are only queryable for specific dataSources.|none|
+|`druid.broker.segment.watchRealtimeTasks`|Boolean|The Broker watches segment announcements from processes that serve segments to build a cache to relate each process to the segments it serves. When `watchRealtimeTasks` is true, the Broker watches for segment announcements from both Historicals and realtime processes. To configure a broker to exclude segments served by realtime processes, set `watchRealtimeTasks` to false. |true|
+|`druid.broker.segment.awaitInitializationOnStart`|Boolean|Whether the Broker will wait for its view of segments to fully initialize before starting up. If set to 'true', the Broker's HTTP server will not start up, and the Broker will not announce itself as available, until the server view is initialized. See also `druid.sql.planner.awaitInitializationOnStart`, a related setting.|true|
+
+## Metrics monitors
+
+You can configure Druid services to emit [metrics](../operations/metrics.md) regularly from a number of [monitors](#metrics-monitors-for-each-service) via [emitters](#metrics-emitters). The following table lists general configurations for metrics:
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.monitoring.emissionPeriod`| Frequency that Druid emits metrics.|`PT1M`|
+|[`druid.monitoring.monitors`](#metrics-monitors-for-each-service)|Sets list of Druid monitors used by a service.|none (no monitors)|
+|[`druid.emitter`](#metrics-emitters)|Setting this value initializes one of the emitter modules.|`noop` (metric emission disabled by default)|
+
+### Metrics monitors for each service
+
+Metric monitoring is an essential part of Druid operations.
+Monitors can be enabled by configuring the property `druid.monitoring.monitors` in the common configuration file, `common.runtime.properties`.
+If a monitor is not supported on a certain service, it will simply be ignored while starting up that service.
+
+The following table lists available monitors and the respective services where they are supported:
+
+|Name|Description|Service|
+|----|-----------|-------|
+|`org.apache.druid.client.cache.CacheMonitor`|Emits metrics (to logs) about the segment results cache for Historical and Broker services. Reports typical cache statistics such as hits, misses, rates, and size (bytes and number of entries), as well as timeouts and errors.|Broker, Historical, Indexer, Peon|
+|`org.apache.druid.java.util.metrics.OshiSysMonitor`|Reports on various system activities and statuses using the [OSHI](https://github.com/oshi/oshi), a JNA-based (native) Operating System and Hardware Information library for Java.|Any|
+|`org.apache.druid.java.util.metrics.JvmMonitor`|Reports various JVM-related statistics.|Any|
+|`org.apache.druid.java.util.metrics.JvmCpuMonitor`|Reports statistics of CPU consumption by the JVM.|Any|
+|`org.apache.druid.java.util.metrics.CpuAcctDeltaMonitor`|Reports consumed CPU as per the cpuacct cgroup.|Any|
+|`org.apache.druid.java.util.metrics.JvmThreadsMonitor`|Reports Thread statistics in the JVM, like numbers of total, daemon, started, died threads.|Any|
+|`org.apache.druid.java.util.metrics.CgroupCpuMonitor`|Reports CPU shares and quotas as per the `cpu` cgroup.|Any|
+|`org.apache.druid.java.util.metrics.CgroupCpuSetMonitor`|Reports CPU core/HT and memory node allocations as per the `cpuset` cgroup.|Any|
+|`org.apache.druid.java.util.metrics.CgroupDiskMonitor`|Reports disk statistics as per the `blkio` cgroup.|Any|
+|`org.apache.druid.java.util.metrics.CgroupMemoryMonitor`|Reports memory statistics as per the `memory` cgroup.|Any|
+|`org.apache.druid.java.util.metrics.CgroupV2CpuMonitor`| **EXPERIMENTAL** Reports CPU usage from `cpu.stat` file. Only applicable to `cgroupv2`.|Any|
+|`org.apache.druid.java.util.metrics.CgroupV2DiskMonitor`| **EXPERIMENTAL** Reports disk usage from `io.stat` file. Only applicable to `cgroupv2`.|Any|
+|`org.apache.druid.java.util.metrics.CgroupV2MemoryMonitor`| **EXPERIMENTAL** Reports memory usage from `memory.current` and `memory.max` files. Only applicable to `cgroupv2`.|Any|
+|`org.apache.druid.server.metrics.HistoricalMetricsMonitor`|Reports statistics on Historical services.|Historical|
+|`org.apache.druid.server.metrics.SegmentStatsMonitor` | **EXPERIMENTAL** Reports statistics about segments on Historical services. Not to be used when lazy loading is configured.|Historical|
+|`org.apache.druid.server.metrics.QueryCountStatsMonitor`|Reports how many queries have been successful/failed/interrupted.|Broker, Historical, Router, Indexer, Peon|
+|`org.apache.druid.server.metrics.SubqueryCountStatsMonitor`|Reports how many subqueries have been materialized as rows or bytes and various other statistics related to the subquery execution|Broker|
+|`org.apache.druid.server.emitter.HttpEmittingMonitor`|Reports internal metrics of `http` or `parametrized` emitter (see below). Must not be used with another emitter type. See the description of the metrics here: https://github.com/apache/druid/pull/4973.|Any|
+|`org.apache.druid.server.metrics.TaskCountStatsMonitor`|Reports how many ingestion tasks are currently running/pending/waiting and also the number of successful/failed tasks per emission period.|Overlord|
+|`org.apache.druid.server.metrics.TaskSlotCountStatsMonitor`|Reports metrics about task slot usage per emission period.|Overlord|
+|`org.apache.druid.server.metrics.WorkerTaskCountStatsMonitor`|Reports how many ingestion tasks are currently running/pending/waiting, the number of successful/failed tasks, and metrics about task slot usage for the reporting worker, per emission period. |MiddleManager, Indexer|
+|`org.apache.druid.server.metrics.ServiceStatusMonitor`|Reports a heartbeat for the service.|Any|
+|`org.apache.druid.server.metrics.GroupByStatsMonitor`|Report metrics for groupBy queries like disk and merge buffer utilization. |Broker, Historical, Indexer, Peon|
+
+For example, if you only wanted monitors on all services for system and JVM information, you'd add the following to `common.runtime.properties`:
+
+```properties
+druid.monitoring.monitors=["org.apache.druid.java.util.metrics.OshiSysMonitor","org.apache.druid.java.util.metrics.JvmMonitor"]
+```
+
+All the services in your Druid deployment would have these two monitors.
+
+If you want any service-specific monitors, however, you need to add all the monitors you want to run for that service to the service's `runtime.properties` file, even if they are listed in the common file. The service-specific properties take precedence.
+
+The following example adds the `TaskCountStatsMonitor` and `TaskSlotCountStatsMonitor` as well as the `OshiSysMonitor` and `JvmMonitor` from the previous example to the Overlord service (`coordinator-overlord/runtime.properties`):
+
+```properties
+druid.monitoring.monitors=["org.apache.druid.server.metrics.TaskCountStatsMonitor", "org.apache.druid.server.metrics.TaskSlotCountStatsMonitor", "org.apache.druid.java.util.metrics.OshiSysMonitor","org.apache.druid.java.util.metrics.JvmMonitor"]
+```
+
+If you don't include `OshiSysMonitor` and `JvmMonitor` in the Overlord's `runtime.properties` file, the monitors don't get loaded onto the Overlord despite being specified in the common file.
+
+### Metrics emitters
+
+There are several emitters available:
+
+* `noop` (default) disables metric emission.
+* [`logging`](#logging-emitter-module) emits logs using Log4j2.
+* [`http`](#http-emitter-module) sends `POST` requests of JSON events.
+* [`parametrized`](#parametrized-http-emitter-module) operates like the `http` emitter but fine-tunes the recipient URL based on the event feed.
+* [`composing`](#composing-emitter-module) initializes multiple emitter modules.
+* [`graphite`](#graphite-emitter) emits metrics to a [Graphite](https://graphiteapp.org/) Carbon service.
+* [`switching`](#switching-emitter) initializes and emits to multiple emitter modules based on the event feed.
+
+#### Logging emitter module
+
+To use this emitter module, set `druid.emitter=logging`. The `logging` emitter uses a Log4j2 logger named
+`druid.emitter.logging.loggerClass` to emit events. Each event is logged as a single `json` object with a
+[Marker](https://logging.apache.org/log4j/2.x/manual/markers.html) as the feed of the event. Users may wish to edit the
+log4j config to route these logs to different sources based on the feed of the event.
+
+|Property|Description| Default|
+|--------|-----------|--------|
+|`druid.emitter.logging.loggerClass`|The class used for logging.|`org.apache.druid.java.util.emitter.core.LoggingEmitter`|
+|`druid.emitter.logging.logLevel`|Choices: debug, info, warn, error. The log level at which messages are logged.|info|
+
+#### HTTP emitter module
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.emitter.http.flushMillis`|How often the internal message buffer is flushed (data is sent).|60000|
+|`druid.emitter.http.flushCount`|How many messages the internal message buffer can hold before flushing (sending).|500|
+|`druid.emitter.http.basicAuthentication`|[Password Provider](../operations/password-provider.md) for providing login and password for authentication in `"login:password"` form. For example, `druid.emitter.http.basicAuthentication=admin:adminpassword` uses Default Password Provider which allows plain text passwords.|not specified = no authentication|
+|`druid.emitter.http.flushTimeOut`|The timeout after which an event should be sent to the endpoint, even if internal buffers are not filled, in milliseconds.|not specified = no timeout|
+|`druid.emitter.http.batchingStrategy`|The strategy of how the batch is formatted. "ARRAY" means `[event1,event2]`, "NEWLINES" means `event1\nevent2`, "ONLY_EVENTS" means `event1event2`.|ARRAY|
+|`druid.emitter.http.maxBatchSize`|The maximum batch size, in bytes.|the minimum of (10% of JVM heap size divided by 2) and (5242880 bytes, i.e. 5 MiB)|
+|`druid.emitter.http.batchQueueSizeLimit`|The maximum number of batches in the emitter queue, if there are problems with emitting.|the maximum of 2 and (10% of the JVM heap size divided by 5 MiB)|
+|`druid.emitter.http.minHttpTimeoutMillis`|If the rate at which batches fill implies a timeout smaller than this value, the emitter does not attempt to send the batch to the endpoint, because the send would likely fail at that rate. Configure this based on the `emitter/successfulSending/minTimeMs` metric. Reasonable values are 10ms to 100ms.|0|
+|`druid.emitter.http.recipientBaseUrl`|The base URL to emit messages to. Druid will POST JSON to be consumed at the HTTP endpoint specified by this property.|none, required config|
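+
+For example, a sketch of an HTTP emitter configuration; the receiver URL is a placeholder for your own metrics endpoint:
+
+```properties
+druid.emitter=http
+# POST batched JSON events to a hypothetical metrics collector.
+druid.emitter.http.recipientBaseUrl=http://metrics-collector.example.com/druid
+druid.emitter.http.flushMillis=60000
+druid.emitter.http.flushCount=500
+```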
+
+#### HTTP emitter module TLS overrides
+
+By default, when sending events to a TLS-enabled receiver, the HTTP Emitter uses an SSLContext obtained from the service described at [Druid's internal communication over TLS](../operations/tls-support.md), that is the same SSLContext that would be used for internal communications between Druid services.
+
+In some use cases it may be desirable to have the HTTP Emitter use its own separate truststore configuration. For example, there may be organizational policies that prevent the TLS-enabled metrics receiver's certificate from being added to the same truststore used by Druid's internal HTTP client.
+
+The following properties allow the HTTP Emitter to use its own truststore configuration when building its SSLContext.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.emitter.http.ssl.useDefaultJavaContext`|If set to true, the HttpEmitter will use `SSLContext.getDefault()`, the default Java SSLContext, and all other properties below are ignored.|false|
+|`druid.emitter.http.ssl.trustStorePath`|The file path or URL of the TLS/SSL Key store where trusted root certificates are stored. If this is unspecified, the HTTP Emitter will use the same SSLContext as Druid's internal HTTP client, as described in the beginning of this section, and all other properties below are ignored.|null|
+|`druid.emitter.http.ssl.trustStoreType`|The type of the key store where trusted root certificates are stored.|`java.security.KeyStore.getDefaultType()`|
+|`druid.emitter.http.ssl.trustStoreAlgorithm`|Algorithm to be used by TrustManager to validate certificate chains|`javax.net.ssl.TrustManagerFactory.getDefaultAlgorithm()`|
+|`druid.emitter.http.ssl.trustStorePassword`|The [Password Provider](../operations/password-provider.md) or String password for the Trust Store.|none|
+|`druid.emitter.http.ssl.protocol`|TLS protocol to use.|"TLSv1.2"|
+
+#### Parametrized HTTP emitter module
+
+The parametrized emitter takes the same configs as the [`http` emitter](#http-emitter-module) using the prefix `druid.emitter.parametrized.httpEmitting.`.
+For example:
+
+* `druid.emitter.parametrized.httpEmitting.flushMillis`
+* `druid.emitter.parametrized.httpEmitting.flushCount`
+* `druid.emitter.parametrized.httpEmitting.ssl.trustStorePath`
+
+Do not specify `recipientBaseUrl` with the parametrized emitter.
+Instead use `recipientBaseUrlPattern` described in the table below.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.emitter.parametrized.recipientBaseUrlPattern`|The URL pattern to send an event to, based on the event's feed. For example, `http://foo.bar/{feed}`, that will send event to `http://foo.bar/metrics` if the event's feed is "metrics".|none, required config|
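+
+For example, a sketch that routes events to per-feed endpoints on a hypothetical collector host:
+
+```properties
+druid.emitter=parametrized
+# Metrics go to .../metrics, alerts go to .../alerts, based on the event feed.
+druid.emitter.parametrized.recipientBaseUrlPattern=http://metrics-collector.example.com/{feed}
+druid.emitter.parametrized.httpEmitting.flushMillis=60000
+```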
+
+#### Composing emitter module
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.emitter.composing.emitters`|List of emitter modules to load, such as ["logging","http"].|[]|
+
+#### Graphite emitter
+
+To use Graphite as the emitter, set `druid.emitter=graphite`. For configuration details, see the [Graphite emitter](../development/extensions-contrib/graphite.md) Druid extension.
+
+#### Switching emitter
+
+To use the switching emitter, set `druid.emitter=switching`.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.emitter.switching.emitters`|JSON map of feed to list of emitter modules that will be used for the mapped feed, such as `{"metrics":["http"], "alerts":["logging"]}`|{}|
+|`druid.emitter.switching.defaultEmitters`|JSON list of emitter modules to load that will be used if there is no emitter specifically designated for that event's feed, such as `["logging","http"]`.|[]|
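+
+For example, a sketch that sends metrics over HTTP while keeping alerts in the logs, with logging as the fallback:
+
+```properties
+druid.emitter=switching
+druid.emitter.switching.emitters={"metrics":["http"], "alerts":["logging"]}
+druid.emitter.switching.defaultEmitters=["logging"]
+# The http emitter referenced above still needs its own druid.emitter.http.* configuration.
+```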
+
+
+## Cache configuration
+
+This section describes caching configuration that is common to Broker, Historical, and Middle Manager/Peon processes.
+
+Caching can optionally be enabled on the Broker, Historical, and Middle Manager/Peon processes. See
+[Broker](#broker-caching), [Historical](#historical-caching), and [Peon](#peon-caching) configuration options for how to
+enable it for different processes.
+
+Druid uses a local in-memory cache by default, unless a different type of cache is specified.
+Use the `druid.cache.type` configuration to set a different kind of cache.
+
+Cache settings are set globally, so the same configuration can be re-used
+for both Broker and Historical processes, when defined in the common properties file.
+
+### Cache type
+
+|Property|Possible Values|Description|Default|
+|--------|---------------|-----------|-------|
+|`druid.cache.type`|`local`, `memcached`, `hybrid`, `caffeine`|The type of cache to use for queries. See below for the configuration options for each cache type.|`caffeine`|
+
+#### Local cache
+
+:::info
+ DEPRECATED: Use caffeine (default as of v0.12.0) instead
+:::
+
+The local cache is deprecated in favor of the Caffeine cache, and may be removed in a future version of Druid. The Caffeine cache affords significantly better performance and control over eviction behavior compared to `local` cache, and is recommended in any situation where you are using JRE 8u60 or higher.
+
+A simple in-memory LRU cache. Local cache resides in JVM heap memory, so if you enable it, make sure you increase heap size accordingly.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.cache.sizeInBytes`|Maximum cache size in bytes. Zero disables caching.|0|
+|`druid.cache.initialSize`|Initial size of the hash table backing the cache.|500000|
+|`druid.cache.logEvictionCount`|If non-zero, log cache eviction every `logEvictionCount` items.|0|
+
+#### Caffeine cache
+
+A highly performant local cache implementation for Druid based on [Caffeine](https://github.com/ben-manes/caffeine). Requires JRE 8u60 or higher if using `COMMON_FJP`.
+
+##### Configuration
+
+The following table shows the configuration options known to this module:
+
+|`runtime.properties`|Description|Default|
+|--------------------|-----------|-------|
+|`druid.cache.type`| Set this to `caffeine` or leave the parameter unset.|`caffeine`|
+|`druid.cache.sizeInBytes`|The maximum size of the cache in bytes on heap. It can be configured as described in [here](human-readable-byte.md). |min(1GiB, Runtime.maxMemory / 10)|
+|`druid.cache.expireAfter`|The time (in ms) after an access for which a cache entry may be expired|None (no time limit)|
+|`druid.cache.cacheExecutorFactory`|The executor factory to use for Caffeine maintenance. One of `COMMON_FJP`, `SINGLE_THREAD`, or `SAME_THREAD`|ForkJoinPool common pool (`COMMON_FJP`)|
+|`druid.cache.evictOnClose`|If a close of a namespace (ex: removing a segment from a process) should cause an eager eviction of associated cache values|`false`|
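+
+For example, a sketch of an explicit Caffeine cache configuration with an illustrative size cap:
+
+```properties
+druid.cache.type=caffeine
+# Illustrative cap; defaults to min(1GiB, Runtime.maxMemory / 10) if unset.
+druid.cache.sizeInBytes=256MiB
+```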
+
+##### `druid.cache.cacheExecutorFactory`
+
+The following are the possible values for `druid.cache.cacheExecutorFactory`, which controls how maintenance tasks are run:
+
+* `COMMON_FJP` (default) use the common ForkJoinPool. Should use with [JRE 8u60 or higher](https://github.com/apache/druid/pull/4810#issuecomment-329922810). Older versions of the JRE may have worse performance than newer JRE versions.
+* `SINGLE_THREAD` Use a single-threaded executor.
+* `SAME_THREAD` Cache maintenance is done eagerly.
+
+##### Metrics
+
+In addition to the normal cache metrics, the caffeine cache implementation also reports the following in both `total` and `delta`:
+
+|Metric|Description|Normal value|
+|------|-----------|------------|
+|`query/cache/caffeine/*/requests`|Count of hits or misses.|hit + miss|
+|`query/cache/caffeine/*/loadTime`|Length of time caffeine spends loading new values (unused feature).|0|
+|`query/cache/caffeine/*/evictionBytes`|Size in bytes that have been evicted from the cache|Varies, should tune cache `sizeInBytes` so that `sizeInBytes`/`evictionBytes` is approximately the rate of cache churn you desire.|
+
+#### Memcached
+
+Uses memcached as cache backend. This allows all processes to share the same cache.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.cache.expiration`|Memcached [expiration time](https://code.google.com/p/memcached/wiki/NewCommands#Standard_Protocol).|2592000 (30 days)|
+|`druid.cache.timeout`|Maximum time in milliseconds to wait for a response from Memcached.|500|
+|`druid.cache.hosts`|Comma-separated list of Memcached hosts (`host:port`). You need to specify all nodes when `druid.cache.clientMode` is set to `static`. Dynamic mode [automatically identifies nodes in your cluster](https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/AutoDiscovery.html), so specifying just the configuration endpoint and port is fine.|none|
+|`druid.cache.maxObjectSize`|Maximum object size in bytes for a Memcached object.|52428800 (50 MiB)|
+|`druid.cache.memcachedPrefix`|Key prefix for all keys in Memcached.|druid|
+|`druid.cache.numConnections`| Number of memcached connections to use.|1|
+|`druid.cache.protocol`| Memcached communication protocol. Can be binary or text.|binary|
+|`druid.cache.locator`| Memcached locator. Can be consistent or `array_mod`.|consistent|
+|`druid.cache.enableTls`|Enable TLS based connection for Memcached client. Boolean.|false|
+|`druid.cache.clientMode`|Client Mode. Static mode requires the user to specify individual cluster nodes. Dynamic mode uses [AutoDiscovery](https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/AutoDiscovery.HowAutoDiscoveryWorks.html) feature of AWS Memcached. String. ["static"](https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/AutoDiscovery.Manual.html) or ["dynamic"](https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/AutoDiscovery.Using.ModifyApp.Java.html)|static|
+|`druid.cache.skipTlsHostnameVerification`|Skip TLS Hostname Verification. Boolean.|true|
+
+#### Hybrid
+
+Uses a combination of any two caches as a two-level L1 / L2 cache.
+This may be used to combine a local in-memory cache with a remote memcached cache.
+
+Cache requests will first check L1 cache before checking L2.
+If there is an L1 miss and L2 hit, it will also populate L1.
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.cache.l1.type`|The type of cache to use for L1 cache. See `druid.cache.type` configuration for valid types.|`caffeine`|
+|`druid.cache.l2.type`|The type of cache to use for L2 cache. See `druid.cache.type` configuration for valid types.|`caffeine`|
+|`druid.cache.l1.*`|Any property valid for the given type of L1 cache can be set using this prefix. For instance, if you are using a `caffeine` L1 cache, specify `druid.cache.l1.sizeInBytes` to set its size.|defaults are the same as for the given cache type|
+|`druid.cache.l2.*`|Prefix for L2 cache settings, see description for L1.|defaults are the same as for the given cache type|
+|`druid.cache.useL2`|A boolean indicating whether to query the L2 cache if there is a miss in L1. It makes sense to configure this to `false` on Historical processes, if L2 is a remote cache like `memcached` and this cache is also used on Brokers, because in this case if a query reaches a Historical it means that a Broker didn't find corresponding results in the same remote cache, so a query to the remote cache from the Historical is guaranteed to be a miss.|`true`|
+|`druid.cache.populateL2`|A boolean indicating whether to put results into L2 cache.|`true`|
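+
+For example, a sketch of a hybrid setup that backs a local Caffeine L1 cache with a remote memcached L2 cache; the host is a placeholder:
+
+```properties
+druid.cache.type=hybrid
+druid.cache.l1.type=caffeine
+druid.cache.l1.sizeInBytes=128MiB
+druid.cache.l2.type=memcached
+druid.cache.l2.hosts=memcached.example.com:11211
+```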
+
+## General query configuration
+
+This section describes configurations that control behavior of Druid's query types, applicable to Broker, Historical, and Middle Manager processes.
+
+### Overriding default query context values
+
+You can override any [query context general parameter](../querying/query-context-reference.md#general-parameters) default value by setting the runtime property in the format of `druid.query.default.context.{query_context_key}`.
+The `druid.query.default.context.{query_context_key}` runtime property prefix applies to all current and future query context keys, in the same way as a query context parameter passed with the query. A value specified for the same key in the query context overrides the runtime property value.
+
+The precedence chain for query context values is as follows:
+
+hard-coded default value in Druid code `<-` runtime property not prefixed with `druid.query.default.context`
+`<-` runtime property prefixed with `druid.query.default.context` `<-` context parameter in the query
+
+Note that not every query context key has a runtime property not prefixed with `druid.query.default.context` that can
+override the hard-coded default value. For example, `maxQueuedBytes` has `druid.broker.http.maxQueuedBytes`,
+but `joinFilterRewriteMaxSize` does not. Hence, the only way to override the hard-coded default value of `joinFilterRewriteMaxSize`
+is with the runtime property `druid.query.default.context.joinFilterRewriteMaxSize`.
+
+To further elaborate on the previous example:
+
+* If neither `druid.broker.http.maxQueuedBytes` nor `druid.query.default.context.maxQueuedBytes` is set and the query does not have `maxQueuedBytes` in its context, then the hard-coded value in Druid code is used.
+* If the runtime properties only contain `druid.broker.http.maxQueuedBytes=x` and the query does not have `maxQueuedBytes` in its context, then the value of the property, `x`, is used. However, if the query does have `maxQueuedBytes` in its context, then that value is used instead.
+* If the runtime properties contain only `druid.query.default.context.maxQueuedBytes=y`, or contain both `druid.broker.http.maxQueuedBytes=x` and `druid.query.default.context.maxQueuedBytes=y`, then the value of `druid.query.default.context.maxQueuedBytes`, `y`, is used (given that the query does not have `maxQueuedBytes` in its context). If the query does have `maxQueuedBytes` in its context, then that value is used instead.
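+
+For example, a sketch of runtime properties that set cluster-wide query context defaults; the values are illustrative:
+
+```properties
+# Defaults for all queries unless overridden in the query context.
+druid.query.default.context.timeout=120000
+druid.query.default.context.maxQueuedBytes=26214400
+```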
+
+### TopN query config
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.query.topN.minTopNThreshold`|See [TopN Aliasing](../querying/topnquery.md#aliasing) for details.|1000|
+
+### Search query config
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.query.search.maxSearchLimit`|Maximum number of search results to return.|1000|
+|`druid.query.search.searchStrategy`|Default search query strategy.|`useIndexes`|
+
+### SegmentMetadata query config
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.query.segmentMetadata.defaultHistory`|When no interval is specified in the query, use a default interval of defaultHistory before the end time of the most recent segment, specified in ISO8601 format. This property also controls the duration of the default interval used by `GET` `/druid/v2/datasources/{dataSourceName}` interactions for retrieving datasource dimensions and metrics.|`P1W`|
+|`druid.query.segmentMetadata.defaultAnalysisTypes`|Sets the default analysis types for all segment metadata queries. You can override this when making the query.|`["cardinality", "interval", "minmax"]`|
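+
+For example, to widen the default lookback and trim the default analysis types, you might set the following (values are illustrative):
+
+```
+druid.query.segmentMetadata.defaultHistory=P2W
+druid.query.segmentMetadata.defaultAnalysisTypes=["cardinality","minmax"]
+```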
+
+### GroupBy query config
+
+This section describes the configurations for groupBy queries. You can set the runtime properties in the `runtime.properties` file on Broker, Historical, and Middle Manager processes. You can set the query context parameters through the [query context](../querying/query-context-reference.md).
+
+Supported runtime properties:
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.query.groupBy.maxSelectorDictionarySize`|Maximum amount of heap space (approximately) to use for per-segment string dictionaries. See [groupBy memory tuning and resource limits](../querying/groupbyquery.md#memory-tuning-and-resource-limits) for details.|100000000|
+|`druid.query.groupBy.maxMergingDictionarySize`|Maximum amount of heap space (approximately) to use for per-query string dictionaries. When the dictionary exceeds this size, a spill to disk will be triggered. See [groupBy memory tuning and resource limits](../querying/groupbyquery.md#memory-tuning-and-resource-limits) for details.|100000000|
+|`druid.query.groupBy.maxOnDiskStorage`|Maximum amount of disk space to use, per-query, for spilling result sets to disk when either the merging buffer or the dictionary fills up. Queries that exceed this limit will fail. Set to zero to disable disk spilling.|0 (disabled)|
+|`druid.query.groupBy.defaultOnDiskStorage`|Default amount of disk space to use, per-query, for spilling the result sets to disk when either the merging buffer or the dictionary fills up. Set to zero to disable disk spilling for queries which don't override `maxOnDiskStorage` in their context.|`druid.query.groupBy.maxOnDiskStorage`|
+
+Supported query contexts:
+
+|Key|Description|
+|---|-----------|
+|`maxSelectorDictionarySize`|Can be used to lower the value of `druid.query.groupBy.maxSelectorDictionarySize` for this query.|
+|`maxMergingDictionarySize`|Can be used to lower the value of `druid.query.groupBy.maxMergingDictionarySize` for this query.|
+|`maxOnDiskStorage`|Can be used to set `maxOnDiskStorage` to a value between 0 and `druid.query.groupBy.maxOnDiskStorage` for this query. If this query context override exceeds `druid.query.groupBy.maxOnDiskStorage`, the query will use `druid.query.groupBy.maxOnDiskStorage`. Omitting this from the query context will cause the query to use `druid.query.groupBy.defaultOnDiskStorage` for `maxOnDiskStorage`|
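+
+For example, a groupBy query that needs a larger spill allowance than the cluster default, but still within `druid.query.groupBy.maxOnDiskStorage`, might pass a context like the following sketch (datasource, dimensions, and values are illustrative):
+
+```json
+{
+  "queryType": "groupBy",
+  "dataSource": "wikipedia",
+  "dimensions": ["channel"],
+  "granularity": "all",
+  "intervals": ["2020-01-01/2021-01-01"],
+  "context": {
+    "maxOnDiskStorage": 1000000000,
+    "maxMergingDictionarySize": 50000000
+  }
+}
+```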
+
+### Advanced configurations
+
+Supported runtime properties:
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.query.groupBy.singleThreaded`|Merge results using a single thread.|false|
+|`druid.query.groupBy.bufferGrouperInitialBuckets`|Initial number of buckets in the off-heap hash table used for grouping results. Set to 0 to use a reasonable default (1024).|0|
+|`druid.query.groupBy.bufferGrouperMaxLoadFactor`|Maximum load factor of the off-heap hash table used for grouping results. When the load factor exceeds this size, the table will be grown or spilled to disk. Set to 0 to use a reasonable default (0.7).|0|
+|`druid.query.groupBy.forceHashAggregation`|Force to use hash-based aggregation.|false|
+|`druid.query.groupBy.intermediateCombineDegree`|Number of intermediate processes combined together in the combining tree. Higher degrees require fewer threads, which can improve query performance by reducing the overhead of too many threads if the server has sufficiently powerful CPU cores.|8|
+|`druid.query.groupBy.numParallelCombineThreads`|Hint for the number of parallel combining threads. This should be larger than 1 to turn on the parallel combining feature. The actual number of threads used for parallel combining is min(`druid.query.groupBy.numParallelCombineThreads`, `druid.processing.numThreads`).|1 (disabled)|
+
+Supported query contexts:
+
+|Key|Description|Default|
+|---|-----------|-------|
+|`groupByIsSingleThreaded`|Overrides the value of `druid.query.groupBy.singleThreaded` for this query.| |
+|`bufferGrouperInitialBuckets`|Overrides the value of `druid.query.groupBy.bufferGrouperInitialBuckets` for this query.|none|
+|`bufferGrouperMaxLoadFactor`|Overrides the value of `druid.query.groupBy.bufferGrouperMaxLoadFactor` for this query.|none|
+|`forceHashAggregation`|Overrides the value of `druid.query.groupBy.forceHashAggregation`|none|
+|`intermediateCombineDegree`|Overrides the value of `druid.query.groupBy.intermediateCombineDegree`|none|
+|`numParallelCombineThreads`|Overrides the value of `druid.query.groupBy.numParallelCombineThreads`|none|
+|`sortByDimsFirst`|Sort the results first by dimension values and then by timestamp.|false|
+|`forceLimitPushDown`|When all fields in the orderby are part of the grouping key, the broker will push limit application down to the Historical processes. When the sorting order uses fields that are not in the grouping key, applying this optimization can result in approximate results with unknown accuracy, so this optimization is disabled by default in that case. Enabling this context flag turns on limit push down for limit/orderbys that contain non-grouping key columns.|false|
+
+### Router
+
+#### Router process configs
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.host`|The host for the current process. This is used to advertise the current processes location as reachable from another process and should generally be specified such that `http://${druid.host}/` could actually talk to this process|`InetAddress.getLocalHost().getCanonicalHostName()`|
+|`druid.bindOnHost`|Indicates whether the process's internal Jetty server binds to `druid.host`. The default is false, which means it binds to all interfaces.|false|
+|`druid.plaintextPort`|This is the port to actually listen on; unless port mapping is used, this will be the same port as is on `druid.host`|8888|
+|`druid.tlsPort`|TLS port for HTTPS connector, if [druid.enableTlsPort](../operations/tls-support.md) is set then this config will be used. If `druid.host` contains port then that port will be ignored. This should be a non-negative Integer.|9088|
+|`druid.service`|The name of the service. This is used as a dimension when emitting metrics and alerts to differentiate between the various services|`druid/router`|
+|`druid.labels`|Optional JSON object of key-value pairs that define custom labels for the server. These labels are displayed in the web console under the "Services" tab. Example: `druid.labels={"location":"Airtrunk"}` or `druid.labels.location=Airtrunk`|`null`|
+
+#### Runtime configuration
+
+|Property|Description|Default|
+|--------|-----------|-------|
+|`druid.router.defaultBrokerServiceName`|The default Broker to connect to in case service discovery fails.|`druid/broker`|
+|`druid.router.tierToBrokerMap`|Queries for a certain tier of data are routed to their appropriate Broker. This value should be an ordered JSON map of tiers to Broker names. The priority of Brokers is based on the ordering.|`{"_default_tier": ""}`|
+|`druid.router.defaultRule`|The default rule for all datasources.|`_default`|
+|`druid.router.pollPeriod`|How often to poll for new rules.|`PT1M`|
+|`druid.router.sql.enable`|Enable routing of SQL queries using strategies. When `true`, the Router uses the strategies defined in `druid.router.strategies` to determine the Broker service for a given SQL query. When `false`, the Router uses the `defaultBrokerServiceName`.|`false`|
+|`druid.router.strategies`|Please see [Router Strategies](../design/router.md#router-strategies) for details.|`[{"type":"timeBoundary"},{"type":"priority"}]`|
+|`druid.router.avatica.balancer.type`|Class to use for balancing Avatica queries across Brokers. Please see [Avatica Query Balancing](../design/router.md#avatica-query-balancing).|`rendezvousHash`|
+|`druid.router.managementProxy.enabled`|Enables the Router's [management proxy](../design/router.md#router-as-management-proxy) functionality.|false|
+|`druid.router.http.numConnections`|Size of connection pool for the Router to connect to Broker processes. If there are more queries than this number that all need to speak to the same process, then they will queue up.|`20`|
+|`druid.router.http.eagerInitialization`|Indicates that http connections from Router to Broker should be eagerly initialized. If set to true, `numConnections` connections are created upon initialization|`true`|
+|`druid.router.http.readTimeout`|The timeout for data reads from Broker processes.|`PT15M`|
+|`druid.router.http.numMaxThreads`|Maximum number of worker threads to handle HTTP requests and responses|`(number of cores) * 3 / 2 + 1`|
+|`druid.router.http.numRequestsQueued`|Maximum number of requests that may be queued to a destination|`1024`|
+|`druid.router.http.requestBuffersize`|Size of the content buffer for receiving requests. These buffers are only used for active connections that have requests with bodies that will not fit within the header buffer|`8 * 1024`|
+|`druid.router.http.clientConnectTimeout`|The timeout (in milliseconds) for establishing client connections.|500|
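+
+For example, a Router that sends queries for a `hot` tier to a dedicated Broker service might use runtime properties like the following sketch (service names are illustrative):
+
+```
+druid.router.defaultBrokerServiceName=druid/broker-cold
+druid.router.tierToBrokerMap={"hot":"druid/broker-hot","_default_tier":"druid/broker-cold"}
+druid.router.http.numConnections=50
+```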
diff --git a/docs/35.0.0/configuration/logging.md b/docs/35.0.0/configuration/logging.md
new file mode 100644
index 0000000000..d740f38b09
--- /dev/null
+++ b/docs/35.0.0/configuration/logging.md
@@ -0,0 +1,170 @@
+---
+id: logging
+title: "Logging"
+---
+
+
+
+
+Apache Druid services emit logs to help you debug.
+The same services also emit periodic [metrics](../configuration/index.md#metrics-monitors) about their state.
+To disable metric info logs, set the following runtime property: `-Ddruid.emitter.logging.logLevel=debug`.
+
+Druid uses [log4j2](http://logging.apache.org/log4j/2.x/) for logging.
+The default configuration file log4j2.xml ships with Druid at the following path: `conf/druid/{config}/_common/log4j2.xml`.
+
+By default, Druid uses a `RollingRandomAccessFile` appender that rolls over daily and keeps log files for up to 7 days.
+If that's not suitable for your case, modify `log4j2.xml` accordingly.
+
+The following example `log4j2.xml` is a representative configuration in the style of the micro quickstart: it logs to the console and to a `RollingRandomAccessFile` appender that rolls over daily and deletes log files older than 7 days:
+
+```xml
+<?xml version="1.0" encoding="UTF-8" ?>
+<Configuration status="WARN">
+  <Properties>
+    <!-- Set DRUID_LOG_DIR before starting Druid to change the log directory -->
+    <Property name="druid.log.path" value="log" />
+  </Properties>
+
+  <Appenders>
+    <!-- Console appender, so that peon processes can log to standard output -->
+    <Console name="Console" target="SYSTEM_OUT">
+      <PatternLayout pattern="%d{ISO8601} %p [%t] %c --- %m%n"/>
+    </Console>
+
+    <!-- Daily rolling file appender that keeps 7 days of logs -->
+    <RollingRandomAccessFile name="FileAppender"
+                             fileName="${sys:druid.log.path}/${sys:druid.node.type}.log"
+                             filePattern="${sys:druid.log.path}/${sys:druid.node.type}.%d{yyyyMMdd}.log">
+      <PatternLayout pattern="%d{ISO8601} %p [%t] %c --- %m%n"/>
+      <Policies>
+        <TimeBasedTriggeringPolicy interval="1" modulate="true"/>
+      </Policies>
+      <DefaultRolloverStrategy>
+        <Delete basePath="${sys:druid.log.path}/" maxDepth="1">
+          <IfFileName glob="*.log" />
+          <IfLastModified age="7d" />
+        </Delete>
+      </DefaultRolloverStrategy>
+    </RollingRandomAccessFile>
+  </Appenders>
+
+  <Loggers>
+    <Root level="info">
+      <AppenderRef ref="FileAppender"/>
+    </Root>
+  </Loggers>
+</Configuration>
+```
+
+Peons always output logs to standard output. Middle Managers redirect task logs from standard output to
+[long-term storage](index.md#log-long-term-storage).
+
+:::info
+
+ Druid shares the log4j configuration file among all services, including task peon processes.
+ However, you must define a console appender in the logger for your peon processes.
+ If you don't define a console appender, Druid creates and configures a new console appender
+ that retains the log level, such as `info` or `warn`, but does not retain any other appender
+ configuration, including non-console ones.
+:::
+
+## Log directory
+The included log4j2.xml configuration for Druid and ZooKeeper writes logs to the `log` directory at the root of the distribution.
+
+If you want to change the log directory, set the environment variable `DRUID_LOG_DIR` to the right directory before you start Druid.
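+
+For example, when using one of the all-in-one start commands described below, you might set the variable before launching Druid (the directory shown is an example):
+
+```sh
+export DRUID_LOG_DIR=/var/log/druid
+bin/start-micro-quickstart
+```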
+
+## All-in-one start commands
+
+If you use one of the all-in-one start commands, such as `bin/start-micro-quickstart`, the default configuration for each service has two kinds of log files.
+Log4j2 writes the main log file and rotates it periodically.
+For example, `log/historical.log`.
+
+The secondary log file contains anything that is written by the component
+directly to standard output or standard error without going through log4j2.
+For example, `log/historical.stdout.log`.
+This consists mainly of messages from the
+Java runtime itself.
+This file is not rotated, but it is generally small due to the low volume of messages.
+If necessary, you can truncate it using the Linux command `truncate --size 0 log/historical.stdout.log`.
+
+## Set the logs to asynchronously write
+
+If your logs are really chatty, you can set them to write asynchronously.
+The following sketch of a `log4j2.xml` routes some of the more chatty classes through asynchronous loggers (the logger names shown are examples; substitute whichever classes are noisy in your deployment):
+
+```xml
+<?xml version="1.0" encoding="UTF-8" ?>
+<Configuration status="WARN">
+  <Appenders>
+    <Console name="Console" target="SYSTEM_OUT">
+      <PatternLayout pattern="%d{ISO8601} %p [%t] %c --- %m%n"/>
+    </Console>
+  </Appenders>
+
+  <Loggers>
+    <!-- Chatty classes write asynchronously so logging does not block query or ingestion threads -->
+    <AsyncLogger name="org.apache.druid.curator.inventory.CuratorInventoryManager" level="debug" additivity="false">
+      <AppenderRef ref="Console"/>
+    </AsyncLogger>
+    <AsyncLogger name="org.apache.druid.client.BatchServerInventoryView" level="debug" additivity="false">
+      <AppenderRef ref="Console"/>
+    </AsyncLogger>
+    <AsyncLogger name="org.apache.druid.client.ServerInventoryView" level="debug" additivity="false">
+      <AppenderRef ref="Console"/>
+    </AsyncLogger>
+
+    <Root level="info">
+      <AppenderRef ref="Console"/>
+    </Root>
+  </Loggers>
+</Configuration>
+```
diff --git a/docs/35.0.0/data-management/automatic-compaction.md b/docs/35.0.0/data-management/automatic-compaction.md
new file mode 100644
index 0000000000..1a0803bafb
--- /dev/null
+++ b/docs/35.0.0/data-management/automatic-compaction.md
@@ -0,0 +1,370 @@
+---
+id: automatic-compaction
+title: "Automatic compaction"
+---
+
+
+
+In Apache Druid, compaction is a special type of ingestion task that reads data from a Druid datasource and writes it back into the same datasource. A common use case for this is to [optimally size segments](../operations/segment-optimization.md) after ingestion to improve query performance. Automatic compaction, or auto-compaction, refers to the system for automatic execution of compaction tasks issued by Druid itself. In addition to auto-compaction, you can perform [manual compaction](./manual-compaction.md) using the Overlord APIs.
+
+:::info
+ Auto-compaction skips datasources that have a segment granularity of `ALL`.
+:::
+
+As a best practice, you should set up auto-compaction for all Druid datasources. You can run compaction tasks manually for cases where you want to allocate more system resources. For example, you may choose to run multiple compaction tasks in parallel to compact an existing datasource for the first time. See [Compaction](compaction.md) for additional details and use cases.
+
+This topic guides you through setting up automatic compaction for your Druid cluster. See the [examples](#examples) for common use cases for automatic compaction.
+
+## Auto-compaction syntax
+
+You can configure automatic compaction dynamically without restarting Druid.
+The automatic compaction system uses the following syntax:
+
+```json
+{
+  "dataSource": <task_datasource>,
+  "ioConfig": <IO config>,
+  "dimensionsSpec": <custom dimensionsSpec>,
+  "transformSpec": <custom transformSpec>,
+  "metricsSpec": <custom metricsSpec>,
+  "tuningConfig": <parallel indexing task tuningConfig>,
+  "granularitySpec": <compaction task granularitySpec>,
+  "skipOffsetFromLatest": <time period to avoid compaction>,
+  "taskPriority": <compaction task priority>,
+  "taskContext": <task context>
+}
+```
+
+:::info[Experimental]
+
+The MSQ task engine is available as a compaction engine when you run automatic compaction as a compaction supervisor. For more information, see [Auto-compaction using compaction supervisors](#auto-compaction-using-compaction-supervisors).
+
+:::
+
+For automatic compaction using Coordinator duties, you submit the spec to the [Compaction config UI](#manage-auto-compaction-using-the-web-console) or the [Compaction configuration API](#manage-auto-compaction-using-coordinator-apis).
+
+Most fields in the auto-compaction configuration correlate to a typical [Druid ingestion spec](../ingestion/ingestion-spec.md).
+The following properties only apply to auto-compaction:
+* `skipOffsetFromLatest`
+* `taskPriority`
+* `taskContext`
+
+Since the automatic compaction system provides a management layer on top of manual compaction tasks,
+the auto-compaction configuration does not include task-specific properties found in a typical Druid ingestion spec.
+The following properties are automatically set by the Coordinator:
+* `type`: Set to `compact`.
+* `id`: Generated using the task type, datasource name, interval, and timestamp. The task ID is prefixed with `coordinator-issued`.
+* `context`: Set according to the user-provided `taskContext`.
+
+Compaction tasks typically fetch all [relevant segments](manual-compaction.md#compaction-io-configuration) prior to launching any subtasks,
+_unless_ the following properties are all set to non-null values. It is strongly recommended to set them to non-null values to
+maximize performance and minimize disk usage of the `compact` tasks launched by auto-compaction:
+
+- [`granularitySpec`](manual-compaction.md#compaction-granularity-spec), with non-null values for each of `segmentGranularity`, `queryGranularity`, and `rollup`
+- [`dimensionsSpec`](manual-compaction.md#compaction-dimensions-spec)
+- `metricsSpec`
+
+For more details on each of the specs in an auto-compaction configuration, see [Automatic compaction dynamic configuration](../configuration/index.md#automatic-compaction-dynamic-configuration).
+
+## Auto-compaction using Coordinator duties
+
+You can control how often the Coordinator checks to see if auto-compaction is needed. The Coordinator [indexing period](../configuration/index.md#data-management), `druid.coordinator.period.indexingPeriod`, controls the frequency of compaction tasks.
+The default indexing period is 30 minutes, meaning that the Coordinator first checks for segments to compact at most 30 minutes from when auto-compaction is enabled.
+This time period also affects other Coordinator duties such as cleanup of unused segments and stale pending segments.
+To configure the auto-compaction time period without interfering with `indexingPeriod`, see [Set frequency of compaction runs](#change-compaction-frequency).
+
+At every invocation of auto-compaction, the Coordinator initiates a [segment search](../design/coordinator.md#segment-search-policy-in-automatic-compaction) to determine eligible segments to compact.
+When there are eligible segments to compact, the Coordinator issues compaction tasks based on available worker capacity.
+If a compaction task takes longer than the indexing period, the Coordinator waits for it to finish before resuming the period for segment search.
+
+No additional configuration is needed to run automatic compaction tasks using the Coordinator and native engine. This is the default behavior for Druid.
+You can configure it for a datasource through the web console or programmatically via an API.
+This process differs for manual compaction tasks, which can be submitted from the [Tasks view of the web console](../operations/web-console.md) or the [Tasks API](../api-reference/tasks-api.md).
+
+### Manage auto-compaction using the web console
+
+Use the web console to enable automatic compaction for a datasource as follows:
+
+1. Click **Datasources** in the top-level navigation.
+2. In the **Compaction** column, click the edit icon for the datasource to compact.
+3. In the **Compaction config** dialog, configure the auto-compaction settings. The dialog offers a form view as well as a JSON view. Editing the form updates the JSON specification, and editing the JSON updates the form field, if present. Form fields not present in the JSON indicate default values. You may add additional properties to the JSON for auto-compaction settings not displayed in the form. See [Configure automatic compaction](#auto-compaction-syntax) for supported settings for auto-compaction.
+4. Click **Submit**.
+5. Refresh the **Datasources** view. The **Compaction** column for the datasource changes from “Not enabled” to “Awaiting first run.”
+
+The following screenshot shows the compaction config dialog for a datasource with auto-compaction enabled.
+
+
+To disable auto-compaction for a datasource, click **Delete** from the **Compaction config** dialog. Druid does not retain your auto-compaction configuration.
+
+### Manage auto-compaction using Coordinator APIs
+
+Use the [Automatic compaction API](../api-reference/automatic-compaction-api.md#manage-automatic-compaction) to configure automatic compaction.
+To enable auto-compaction for a datasource, create a JSON object with the desired auto-compaction settings.
+See [Configure automatic compaction](#auto-compaction-syntax) for the syntax of an auto-compaction spec.
+Send the JSON object as a payload in a [`POST` request](../api-reference/automatic-compaction-api.md#create-or-update-automatic-compaction-configuration) to `/druid/coordinator/v1/config/compaction`.
+The following example configures auto-compaction for the `wikipedia` datasource:
+
+```sh
+curl --location --request POST 'http://localhost:8081/druid/coordinator/v1/config/compaction' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+ "dataSource": "wikipedia",
+ "granularitySpec": {
+ "segmentGranularity": "DAY"
+ }
+}'
+```
+
+To disable auto-compaction for a datasource, send a [`DELETE` request](../api-reference/automatic-compaction-api.md#remove-automatic-compaction-configuration) to `/druid/coordinator/v1/config/compaction/{dataSource}`. Replace `{dataSource}` with the name of the datasource for which to disable auto-compaction. For example:
+
+```sh
+curl --location --request DELETE 'http://localhost:8081/druid/coordinator/v1/config/compaction/wikipedia'
+```
+
+### Change compaction frequency
+
+If you want the Coordinator to check for compaction more frequently than its indexing period, create a separate group to handle compaction duties.
+Set the time period of the duty group in the `coordinator/runtime.properties` file.
+The following example shows how to create a duty group named `compaction` and set the auto-compaction period to 1 minute:
+```
+druid.coordinator.dutyGroups=["compaction"]
+druid.coordinator.compaction.duties=["compactSegments"]
+druid.coordinator.compaction.period=PT60S
+```
+
+### View Coordinator duty auto-compaction stats
+
+After the Coordinator has initiated auto-compaction, you can view compaction statistics for the datasource, including the number of bytes, segments, and intervals already compacted and those awaiting compaction. The Coordinator also reports the total bytes, segments, and intervals not eligible for compaction in accordance with its [segment search policy](../design/coordinator.md#segment-search-policy-in-automatic-compaction).
+
+In the web console, the Datasources view displays auto-compaction statistics. The Tasks view shows the task information for compaction tasks that were triggered by the automatic compaction system.
+
+To get statistics by API, send a [`GET` request](../api-reference/automatic-compaction-api.md#view-automatic-compaction-status) to `/druid/coordinator/v1/compaction/status`. To filter the results to a particular datasource, pass the datasource name as a query parameter to the request—for example, `/druid/coordinator/v1/compaction/status?dataSource=wikipedia`.
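+
+For example, the following request returns the compaction status for the `wikipedia` datasource on a quickstart deployment:
+
+```sh
+curl "http://localhost:8081/druid/coordinator/v1/compaction/status?dataSource=wikipedia"
+```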
+
+
+## Avoid conflicts with ingestion
+
+Compaction tasks may be interrupted when they interfere with ingestion. For example, this occurs when an ingestion task needs to write data to a segment for a time interval locked for compaction. If there are continuous failures that prevent compaction from making progress, consider one of the following strategies:
+
+* Enable [concurrent append and replace tasks](#enable-concurrent-append-and-replace) on your datasource and on the ingestion tasks.
+* Set `skipOffsetFromLatest` to reduce the chance of conflicts between ingestion and compaction. See more details in [Skip compaction for latest segments](#skip-compaction-for-latest-segments).
+* Increase the priority value of compaction tasks relative to ingestion tasks. Only recommended for advanced users. This approach can cause ingestion jobs to fail or lag. To change the priority of compaction tasks, set `taskPriority` to the desired priority value in the auto-compaction configuration. For details on the priority values of different task types, see [Lock priority](../ingestion/tasks.md#lock-priority).
+
+### Enable concurrent append and replace
+
+You can use concurrent append and replace to safely replace the existing data in an interval of a datasource while new data is being appended to that interval even during compaction.
+
+To do this, you need to update your datasource to allow concurrent append and replace tasks:
+
+* If you're using the API, include the following `taskContext` property in your API call: `"useConcurrentLocks": true`
+* If you're using the UI, enable **Use concurrent locks** in the **Compaction config** for your datasource.
+
+You'll also need to update your ingestion jobs for the datasource to include the task context `"useConcurrentLocks": true`.
+
+For information on how to do this, see [Concurrent append and replace](../ingestion/concurrent-append-replace.md).
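+
+As a sketch, an auto-compaction configuration that opts into concurrent locks through the task context might look like the following:
+
+```json
+{
+  "dataSource": "wikipedia",
+  "taskContext": {
+    "useConcurrentLocks": true
+  }
+}
+```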
+
+### Skip compaction for latest segments
+
+The Coordinator compacts segments from newest to oldest. In the auto-compaction configuration, you can set a time period, relative to the end time of the most recent segment, for segments that should not be compacted. Assign this value to `skipOffsetFromLatest`. Note that this offset is not relative to the current time but to the latest segment time. For example, if you want to skip over segments from five days prior to the end time of the most recent segment, assign `"skipOffsetFromLatest": "P5D"`.
+
+To set `skipOffsetFromLatest`, consider how frequently you expect the stream to receive late arriving data. If your stream only occasionally receives late arriving data, the auto-compaction system robustly compacts your data even though data is ingested outside the `skipOffsetFromLatest` window. For most realtime streaming ingestion use cases, it is reasonable to set `skipOffsetFromLatest` to a few hours or a day.
+
+## Examples
+
+The following examples demonstrate potential use cases in which auto-compaction may improve your Druid performance. See more details in [Compaction strategies](../data-management/compaction.md#compaction-guidelines). The examples in this section do not change the underlying data.
+
+### Change segment granularity
+
+You have a stream set up to ingest data with `HOUR` segment granularity into the `wikistream` datasource. You notice that your Druid segments are smaller than the [recommended segment size](../operations/segment-optimization.md) of 5 million rows per segment. You wish to automatically compact segments to `DAY` granularity while leaving the latest week of data _not_ compacted because your stream consistently receives data within that time period.
+
+The following auto-compaction configuration compacts existing `HOUR` segments into `DAY` segments while leaving the latest week of data not compacted:
+
+```json
+{
+ "dataSource": "wikistream",
+ "granularitySpec": {
+ "segmentGranularity": "DAY"
+ },
+  "skipOffsetFromLatest": "P1W"
+}
+```
+
+### Update partitioning scheme
+
+For your `wikipedia` datasource, you want to optimize segment access when regularly ingesting data without compromising compute time when querying the data. Your ingestion spec for batch append uses [dynamic partitioning](../ingestion/native-batch.md#dynamic-partitioning) to optimize for write-time operations, while your stream ingestion partitioning is configured by the stream service. You want to implement auto-compaction to reorganize the data with a suitable read-time partitioning using [multi-dimension range partitioning](../ingestion/native-batch.md#multi-dimension-range-partitioning). Based on the dimensions frequently accessed in queries, you wish to partition on the following dimensions: `channel`, `countryName`, `namespace`.
+
+The following auto-compaction configuration updates the `wikipedia` segments to use multi-dimension range partitioning:
+
+```json
+{
+ "dataSource": "wikipedia",
+ "tuningConfig": {
+ "partitionsSpec": {
+ "type": "range",
+ "partitionDimensions": [
+ "channel",
+ "countryName",
+ "namespace"
+ ],
+ "targetRowsPerSegment": 5000000
+ }
+ }
+}
+```
+
+## Auto-compaction using compaction supervisors
+
+:::info[Experimental]
+Compaction supervisors are experimental. For production use, we recommend [auto-compaction using Coordinator duties](#auto-compaction-using-coordinator-duties).
+:::
+
+You can run automatic compaction using compaction supervisors on the Overlord rather than Coordinator duties. Compaction supervisors provide the following benefits over Coordinator duties:
+
+* Can use the supervisor framework to get information about the auto-compaction, such as status or state
+* More easily suspend or resume compaction for a datasource
+* Can use either the native compaction engine or the [MSQ task engine](#use-msq-for-auto-compaction)
+* Submit tasks as soon as a compaction slot is available, making compaction more reactive
+* Track compaction task status to avoid repeatedly re-compacting an interval
+
+
+To use compaction supervisors, update the [compaction dynamic config](../api-reference/automatic-compaction-api.md#update-cluster-level-compaction-config) and set:
+
+* `useSupervisors` to `true` so that compaction tasks can be run as supervisor tasks
+* `engine` to `msq` to use the MSQ task engine as the compaction engine or to `native` (default value) to use the native engine.
+
+Compaction supervisors use the same syntax as auto-compaction using Coordinator duties with one key difference: you submit the auto-compaction as a supervisor spec. In the spec, set the `type` to `autocompact` and include the auto-compaction config in the `spec`.
+
+To submit an automatic compaction task, you can submit a supervisor spec through the [web console](#manage-compaction-supervisors-with-the-web-console) or the [supervisor API](#manage-compaction-supervisors-with-supervisor-apis).
+
+
+### Manage compaction supervisors with the web console
+
+To submit a supervisor spec for MSQ task engine automatic compaction, perform the following steps:
+
+1. In the web console, go to the **Supervisors** tab.
+1. Click **...** > **Submit JSON supervisor**.
+1. In the dialog, include the following:
+ - The type of supervisor spec by setting `"type": "autocompact"`
+ - The compaction configuration by adding it to the `spec` field
+   ```json
+   {
+     "type": "autocompact",
+     "spec": {
+       "dataSource": YOUR_DATASOURCE,
+       "tuningConfig": {...},
+       "granularitySpec": {...},
+       "engine": <native|msq>,
+       ...
+     }
+   }
+   ```
+1. Submit the supervisor.
+
+To stop the automatic compaction task, suspend or terminate the supervisor through the UI or API.
+
+### Manage compaction supervisors with supervisor APIs
+
+Submitting an automatic compaction as a supervisor task uses the same endpoint as supervisor tasks for streaming ingestion.
+
+The following example configures auto-compaction for the `wikipedia` datasource:
+
+```sh
+curl --location --request POST 'http://localhost:8081/druid/indexer/v1/supervisor' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+ "type": "autocompact", // required
+ "suspended": false, // optional
+ "spec": { // required
+ "dataSource": "wikipedia", // required
+ "tuningConfig": {...}, // optional
+ "granularitySpec": {...}, // optional
+    "engine": <native|msq>, // optional
+ ...
+ }
+}'
+```
+
+Note that if you omit `spec.engine`, Druid uses the default compaction engine. You can control the default compaction engine with the `druid.supervisor.compaction.engine` Overlord runtime property. If `spec.engine` and `druid.supervisor.compaction.engine` are omitted, Druid defaults to the native engine.
+
+To stop the automatic compaction task, suspend or terminate the supervisor through the UI or API.
+
+### Use MSQ for auto-compaction
+
+The MSQ task engine is available as a compaction engine if you configure auto-compaction to use compaction supervisors. To use the MSQ task engine for automatic compaction, make sure the following requirements are met:
+
+* [Load the MSQ task engine extension](../multi-stage-query/index.md#load-the-extension).
+* In your Overlord runtime properties, set the following properties:
+ * `druid.supervisor.compaction.enabled` to `true` so that compaction tasks can be run as a supervisor task.
+ * Optionally, set `druid.supervisor.compaction.engine` to `msq` to specify the MSQ task engine as the default compaction engine. If you don't do this, you'll need to set `spec.engine` to `msq` for each compaction supervisor spec where you want to use the MSQ task engine.
+* Have at least two compaction task slots available or set `compactionConfig.taskContext.maxNumTasks` to two or more. The MSQ task engine requires at least two tasks to run, one controller task and one worker task.
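+
+Put together, the Overlord runtime properties from the list above might look like the following sketch:
+
+```
+# Add druid-multi-stage-query to the extensions you already load
+druid.extensions.loadList=["druid-multi-stage-query"]
+druid.supervisor.compaction.enabled=true
+druid.supervisor.compaction.engine=msq
+```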
+
+You can use [MSQ task engine context parameters](../multi-stage-query/reference.md#context-parameters) in `spec.taskContext` when configuring your datasource for automatic compaction, such as setting the maximum number of tasks using the `spec.taskContext.maxNumTasks` parameter. Some of the MSQ task engine context parameters overlap with automatic compaction parameters. When these settings overlap, set one or the other.
+
+
+#### MSQ task engine limitations
+
+
+
+When using the MSQ task engine for auto-compaction, keep the following limitations in mind:
+
+- The `metricSpec` field is only supported for certain aggregators. For more information, see [Supported aggregators](#supported-aggregators).
+- Only dynamic and range-based partitioning are supported.
+- Set `rollup` to `true` if and only if `metricSpec` is non-empty and non-null.
+- You can only partition on string dimensions. However, multi-valued string dimensions are not supported.
+- The `maxTotalRows` config is not supported in `DynamicPartitionsSpec`. Use `maxRowsPerSegment` instead.
+- Segments can only be sorted on `__time` as the first column.
+
+#### Supported aggregators
+
+Auto-compaction using the MSQ task engine supports only aggregators that satisfy the following properties:
+* __Mergeability__: can combine partial aggregates
+* __Idempotency__: produces the same results on repeated runs of the aggregator on previously aggregated values in a column
+
+This is exemplified by the following `longSum` aggregator:
+
+```
+{"name": "added", "type": "longSum", "fieldName": "added"}
+```
+
+Here, `longSum` satisfies mergeability because it can combine partial sums, and the fact that the input and output column are the same (`added`) ensures idempotency.
+
+The following are examples of aggregators that aren't supported because at least one of the required conditions isn't satisfied:
+
+* A `longSum` aggregator where the `added` column rolls up into a `sum_added` column, discarding the input `added` column. This violates idempotency because subsequent runs would no longer find the `added` column:
+ ```
+ {"name": "sum_added", "type": "longSum", "fieldName": "added"}
+ ```
+* Partial sketch aggregators, which cannot themselves combine partial aggregates and require a merging aggregator (such as `HLLSketchMerge` for the `HLLSketchBuild` aggregator below). This violates mergeability:
+ ```
+ {"name": "added", "type": "HLLSketchBuild", "fieldName": "added"}
+ ```
+* The count aggregator, which cannot combine partial aggregates and rolls up into a different `count` column, discarding the input columns. This violates both mergeability and idempotency:
+ ```
+ {"type": "count", "name": "count"}
+ ```
+
+
+
+## Learn more
+
+See the following topics for more information:
+* [Compaction](compaction.md) for an overview of compaction in Druid.
+* [Manual compaction](manual-compaction.md) for how to manually perform compaction tasks.
+* [Segment optimization](../operations/segment-optimization.md) for guidance on evaluating and optimizing Druid segment size.
+* [Coordinator process](../design/coordinator.md#automatic-compaction) for details on how the Coordinator plans compaction tasks.
+
diff --git a/docs/35.0.0/data-management/compaction.md b/docs/35.0.0/data-management/compaction.md
new file mode 100644
index 0000000000..51bf7ee864
--- /dev/null
+++ b/docs/35.0.0/data-management/compaction.md
@@ -0,0 +1,113 @@
+---
+id: compaction
+title: "Compaction"
+description: "Defines compaction and automatic compaction (auto-compaction or autocompaction) for segment optimization. Use cases and strategies for compaction. Describes compaction task configuration."
+---
+
+
+
+Query performance in Apache Druid depends on optimally sized segments. Compaction is one strategy you can use to optimize segment size for your Druid database. Compaction tasks read an existing set of segments for a given time interval and combine the data into a new "compacted" set of segments. In some cases the compacted segments are larger, but there are fewer of them. In other cases the compacted segments may be smaller. Compaction tends to increase performance because optimized segments require less per-segment processing and less memory overhead for ingestion and for querying paths.
+
+## Compaction guidelines
+
+There are several cases where you should consider compaction for segment optimization:
+
+- With streaming ingestion, data can arrive out of chronological order, creating many small segments.
+- Appending data with `appendToExisting` for [native batch](../ingestion/native-batch.md) ingestion can create suboptimal segments.
+- Parallel batch indexing with `index_parallel` can create many small segments.
+- A misconfigured ingestion task can create oversized segments.
+
+By default, compaction does not modify the underlying data of the segments. However, there are cases when you may want to modify data during compaction to improve query performance:
+
+- If, after ingestion, you realize that data for the time interval is sparse, you can use compaction to increase the segment granularity.
+- If you don't need fine-grained granularity for older data, you can use compaction to change older segments to a coarser query granularity. For example, from `minute` to `hour` or `hour` to `day`. This reduces the storage space required for older data.
+- You can change the dimension order to improve sorting and reduce segment size.
+- You can remove unused columns in compaction or implement an aggregation metric for older data.
+- You can change segment rollup from dynamic partitioning with best-effort rollup to hash or range partitioning with perfect rollup. For more information on rollup, see [perfect vs best-effort rollup](../ingestion/rollup.md#perfect-rollup-vs-best-effort-rollup).
+
+Compaction does not improve performance in all situations. For example, if you rewrite your data with each ingestion task, you don't need to use compaction. See [Segment optimization](../operations/segment-optimization.md) for additional guidance to determine if compaction will help in your environment.
+
+## Ways to run compaction
+
+Automatic compaction, also called auto-compaction, works in most use cases and should be your first option.
+
+The Coordinator uses its [segment search policy](../design/coordinator.md#segment-search-policy-in-automatic-compaction) to periodically identify segments for compaction starting from newest to oldest. When the Coordinator discovers segments that have not been compacted or segments that were compacted with a different or changed spec, it submits compaction tasks for the time interval covering those segments.
+
+To learn more, see [Automatic compaction](../data-management/automatic-compaction.md).
+
+In cases where you require more control over compaction, you can manually submit compaction tasks. For example:
+
+- Automatic compaction is running into the limit of task slots available to it, so tasks are waiting for previous automatic compaction tasks to complete. Manual compaction can use all available task slots, therefore you can complete compaction more quickly by submitting more concurrent tasks for more intervals.
+- You want to force compaction for a specific time range or you want to compact data out of chronological order.
+
+See [Setting up a manual compaction task](./manual-compaction.md#setting-up-manual-compaction) for more about manual compaction tasks.
+
+## Data handling with compaction
+
+During compaction, Druid overwrites the original set of segments with the compacted set. Druid also locks the segments for the time interval being compacted to ensure data consistency. By default, compaction tasks do not modify the underlying data. You can configure the compaction task to change the query granularity or add or remove dimensions in the compaction task. This means that the only changes to query results should be the result of intentional, not automatic, changes.
+
+You can set `dropExisting` in `ioConfig` to "true" in the compaction task to configure Druid to replace all existing segments fully contained by the interval. See the suggestion for reindexing with finer granularity under [Implementation considerations](../ingestion/native-batch.md#implementation-considerations) for an example.
+:::info
+ WARNING: `dropExisting` in `ioConfig` is a beta feature.
+:::
+
+If an ingestion task needs to write data to a segment for a time interval locked for compaction, by default the ingestion task supersedes the compaction task and the compaction task fails without finishing. For manual compaction tasks, you can adjust the input spec interval to avoid conflicts between ingestion and compaction. For automatic compaction, you can set the `skipOffsetFromLatest` key to adjust the auto-compaction starting point from the current time to reduce the chance of conflicts between ingestion and compaction.
+Another option is to set the compaction task to higher priority than the ingestion task.
+For more information, see [Avoid conflicts with ingestion](../data-management/automatic-compaction.md#avoid-conflicts-with-ingestion).
+
+### Segment granularity handling
+
+Unless you modify the segment granularity in [`granularitySpec`](manual-compaction.md#compaction-granularity-spec), Druid attempts to retain the granularity for the compacted segments. When segments have different segment granularities with no overlap in interval, Druid creates a separate compaction task for each to retain the segment granularity in the compacted segment.
+
+If segments have different segment granularities before compaction but there is some overlap in interval, Druid attempts to find the start and end of the overlapping interval and uses the closest segment granularity level for the compacted segment.
+
+For example, consider two overlapping segments: segment "A" for the interval 01/01/2020-01/02/2020 with day granularity, and segment "B" for the interval 01/01/2020-02/01/2020 with month granularity. Druid attempts to combine and compact the overlapping segments. In this example, the earliest start time for the two segments is 01/01/2020 and the latest end time is 02/01/2020. Druid compacts the segments together even though they have different segment granularities, and uses month segment granularity for the newly compacted segment even though segment A's original granularity was day.
+
+### Query granularity handling
+
+Unless you modify the query granularity in the [`granularitySpec`](manual-compaction.md#compaction-granularity-spec), Druid retains the query granularity for the compacted segments. If segments have different query granularities before compaction, Druid chooses the finest level of granularity for the resulting compacted segment. For example if a compaction task combines two segments, one with day query granularity and one with minute query granularity, the resulting segment uses minute query granularity.
+
+:::info
+ In Apache Druid 0.21.0 and prior, Druid sets the granularity for compacted segments to the default granularity of `NONE` regardless of the query granularity of the original segments.
+:::
+
+If you configure query granularity in compaction to go from a finer granularity like month to a coarser query granularity like year, then Druid overshadows the original segment with coarser granularity. Because the new segments have a coarser granularity, running a kill task to remove the overshadowed segments for those intervals will cause you to permanently lose the finer granularity data.
+
+### Dimension handling
+
+Apache Druid supports schema changes. Therefore, dimensions can be different across segments even if they are a part of the same datasource. See [Segments with different schemas](../design/segments.md#segments-with-different-schemas). If the input segments have different dimensions, the resulting compacted segment includes all dimensions of the input segments.
+
+Even when the input segments have the same set of dimensions, the dimension order or the data types of dimensions can differ. The dimensions of more recent segments take precedence over those of older segments for data types and ordering, because more recent segments are more likely to have the preferred order and data types.
+
+If you want to control dimension ordering or ensure specific values for dimension types, you can configure a custom `dimensionsSpec` in the compaction task spec.
+
+### Rollup
+
+Druid only rolls up the output segment when `rollup` is set for all input segments.
+See [Roll-up](../ingestion/rollup.md) for more details.
+You can check whether your segments are rolled up by using [Segment Metadata Queries](../querying/segmentmetadataquery.md#analysistypes).
+
+## Learn more
+
+See the following topics for more information:
+- [Segment optimization](../operations/segment-optimization.md) for guidance to determine if compaction will help in your case.
+- [Manual compaction](./manual-compaction.md) for how to run a one-time compaction task.
+- [Automatic compaction](automatic-compaction.md) for how to enable and configure automatic compaction.
+
diff --git a/docs/35.0.0/data-management/delete.md b/docs/35.0.0/data-management/delete.md
new file mode 100644
index 0000000000..e37ba48b54
--- /dev/null
+++ b/docs/35.0.0/data-management/delete.md
@@ -0,0 +1,148 @@
+---
+id: delete
+title: "Data deletion"
+---
+
+
+
+## Delete data for a time range manually
+
+Apache Druid stores data [partitioned by time chunk](../design/storage.md) and supports
+deleting data for time chunks by dropping segments. This is a fast, metadata-only operation.
+
+Deletion by time range happens in two steps:
+
+1. Segments to be deleted must first be marked as ["unused"](../design/storage.md#segment-lifecycle). This can
+ happen when a segment is dropped by a [drop rule](../operations/rule-configuration.md) or when you manually mark a
+ segment unused through the Coordinator API or web console. This is a soft delete: the data is not available for
+   querying, but the segment files remain in deep storage, and the segment records remain in the metadata store.
+2. Once a segment is marked "unused", you can use a [`kill` task](#kill-task) to permanently delete the segment file from
+ deep storage and remove its record from the metadata store. This is a hard delete: the data is unrecoverable unless
+ you have a backup.
+
+For documentation on disabling segments using the Coordinator API, see the
+[Legacy metadata API reference](../api-reference/legacy-metadata-api.md#datasources).
+
+A data deletion tutorial is available at [Tutorial: Deleting data](../tutorials/tutorial-delete-data.md).
+
+## Delete data automatically using drop rules
+
+Druid supports [load and drop rules](../operations/rule-configuration.md), which are used to define intervals of time
+where data should be preserved, and intervals where data should be discarded. Data that falls under a drop rule is
+marked unused, in the same manner as if you [manually mark that time range unused](#delete-data-for-a-time-range-manually). This is a
+fast, metadata-only operation.
+
+Data that is dropped in this way is marked unused, but remains in deep storage. To permanently delete it, use a
+[`kill` task](#kill-task).
+
+## Delete specific records
+
+Druid supports deleting specific records using [reindexing](update.md#reindex) with a filter. The filter specifies which
+data remains after reindexing, so it must be the inverse of the data you want to delete. Because segments must be
+rewritten to delete data in this way, it can be a time-consuming operation.
+
+For example, to delete records where `userName` is `'bob'` with native batch indexing, use a
+[`transformSpec`](../ingestion/ingestion-spec.md#transformspec) with filter `{"type": "not", "field": {"type":
+"selector", "dimension": "userName", "value": "bob"}}`.
+
+To delete the same records using SQL, use [REPLACE](../multi-stage-query/concepts.md#overwrite-data-with-replace) with `WHERE userName <> 'bob'`.
+
+To reindex using [native batch](../ingestion/native-batch.md), use the [`druid` input
+source](../ingestion/input-sources.md#druid-input-source). If needed,
+[`transformSpec`](../ingestion/ingestion-spec.md#transformspec) can be used to filter or modify data during the
+reindexing job. To reindex with SQL, use [`REPLACE OVERWRITE`](../multi-stage-query/reference.md#replace)
+with `SELECT ... FROM <table>`. (Druid does not have `UPDATE` or `ALTER TABLE` statements.) Any SQL SELECT query can be
+used to filter, modify, or enrich the data during the reindexing job.
+
+Data that is deleted in this way is marked unused, but remains in deep storage. To permanently delete it, use a [`kill`
+task](#kill-task).
+
+## Delete an entire table
+
+Deleting an entire table works the same way as [deleting part of a table by time range](#delete-data-for-a-time-range-manually). First,
+mark all segments unused using the Coordinator API or web console. Then, optionally, delete it permanently using a
+[`kill` task](#kill-task).
+
+
+
+## Delete data permanently using `kill` tasks
+
+Data that has been overwritten or soft-deleted still remains as segments that have been marked unused. You can use a
+`kill` task to permanently delete this data.
+
+The available grammar is:
+
+```json
+{
+  "type": "kill",
+  "id": <task_id>,
+  "dataSource": <task_datasource>,
+  "interval": <interval of unused segments to delete>,
+  "versions": <optional list of segment versions to delete>,
+  "context": <task context>,
+  "batchSize": <optional batch size>,
+  "limit": <optional maximum number of segments to delete>,
+  "maxUsedStatusLastUpdatedTime": <optional cutoff timestamp>
+}
+```
+
+Some of the parameters used in the task payload are further explained below:
+
+| Parameter | Default | Explanation |
+|-------------|-----------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `versions` | null (all versions) | List of segment versions within the specified `interval` for the kill task to delete. The default behavior is to delete all unused segment versions in the specified `interval`.|
+| `batchSize` |100 | Maximum number of segments that are deleted in one kill batch. Some operations on the Overlord may get stuck while a `kill` task is in progress due to concurrency constraints (such as in `TaskLockbox`). Thus, a `kill` task splits the list of unused segments to be deleted into smaller batches to yield the Overlord resources intermittently to other task operations.|
+| `limit` | null (no limit) | Maximum number of segments for the kill task to delete.|
+| `maxUsedStatusLastUpdatedTime` | null (no cutoff) | Maximum timestamp used as a cutoff to include unused segments. The kill task only considers segments which lie in the specified `interval` and were marked as unused no later than this time. The default behavior is to kill all unused segments in the `interval` regardless of when they were marked as unused.|
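+
+For example, the following task permanently deletes up to 500 unused segments of the `wikipedia` datasource within the given interval, in batches of 100 (datasource and interval are illustrative):
+
+```json
+{
+  "type": "kill",
+  "dataSource": "wikipedia",
+  "interval": "2020-01-01/2021-01-01",
+  "batchSize": 100,
+  "limit": 500
+}
+```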
+
+
+**WARNING:** The `kill` task permanently removes all information about the affected segments from the metadata store and
+deep storage. This operation cannot be undone.
+
+### Auto-kill data using Coordinator duties
+
+Instead of submitting `kill` tasks manually to permanently delete data for a given interval, you can enable auto-kill of unused segments on the Coordinator.
+The Coordinator runs a duty periodically to identify intervals containing unused segments that are eligible for kill. It then launches a `kill` task for each of these intervals.
+
+Refer to [Data management on the Coordinator](../configuration/index.md#data-management) to configure auto-kill of unused segments on the Coordinator.
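+
+A minimal Coordinator configuration for this duty might look like the following sketch (check the property names and values against the linked configuration reference for your version):
+
+```
+druid.coordinator.kill.on=true
+druid.coordinator.kill.period=P1D
+druid.coordinator.kill.durationToRetain=P90D
+druid.coordinator.kill.maxSegments=100
+```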
+
+### Auto-kill data on the Overlord (Experimental)
+
+:::info
+This is an experimental feature that:
+- Can be used only if [segment metadata caching](../configuration/index.md#segment-metadata-cache-experimental) is enabled on the Overlord.
+- MUST NOT be used if auto-kill of unused segments is already enabled on the Coordinator.
+:::
+
+This is an experimental feature to run kill tasks in an "embedded" mode on the Overlord itself.
+
+These embedded tasks offer several advantages over auto-kill performed by the Coordinator as they:
+- avoid a lot of unnecessary REST API calls to the Overlord from tasks or the Coordinator.
+- kill unused segments as soon as they become eligible.
+- run on the Overlord and do not take up task slots.
+- finish faster as they save on the overhead of launching a task process.
+- kill a small number of segments per task, to ensure that locks on an interval are not held for too long.
+- skip locked intervals to avoid head-of-line blocking in kill tasks.
+- require little to no configuration.
+- can keep up with a large number of unused segments in the cluster.
+- take advantage of the segment metadata cache on the Overlord.
+
+Refer to [Auto-kill unused segments on the Overlord](../configuration/index.md#auto-kill-unused-segments-experimental) to configure auto-kill of unused segments on the Overlord.
+See [Auto-kill metrics](../operations/metrics.md#auto-kill-unused-segments) for the metrics emitted by embedded kill tasks.
diff --git a/docs/35.0.0/data-management/index.md b/docs/35.0.0/data-management/index.md
new file mode 100644
index 0000000000..0e0e09ac89
--- /dev/null
+++ b/docs/35.0.0/data-management/index.md
@@ -0,0 +1,34 @@
+---
+id: index
+title: "Data management"
+sidebar_label: "Overview"
+---
+
+
+
+Apache Druid stores data [partitioned by time chunk](../design/storage.md) in immutable
+files called [segments](../design/segments.md). Data management operations that involve replacing or deleting
+these segments include:
+
+- [Updates](update.md) to existing data.
+- [Deletion](delete.md) of existing data.
+- [Schema changes](schema-changes.md) for new and existing data.
+- [Compaction](compaction.md) and [automatic compaction](automatic-compaction.md), which reindex existing data to
+ optimize storage footprint and performance.
diff --git a/docs/35.0.0/data-management/manual-compaction.md b/docs/35.0.0/data-management/manual-compaction.md
new file mode 100644
index 0000000000..e6e34dba82
--- /dev/null
+++ b/docs/35.0.0/data-management/manual-compaction.md
@@ -0,0 +1,167 @@
+---
+id: manual-compaction
+title: "Manual compaction"
+---
+
+
+
+In Apache Druid, compaction is a special type of ingestion task that reads data from a Druid datasource and writes it back into the same datasource. A common use case for this is to [optimally size segments](../operations/segment-optimization.md) after ingestion to improve query performance.
+
+You can perform manual compaction where you submit a one-time compaction task for a specific interval. Generally, you don't need to do this if you use [automatic compaction](./automatic-compaction.md), which is recommended for most workloads.
+
+## Setting up manual compaction
+
+ Compaction tasks merge all segments for the defined interval according to the following syntax:
+
+```json
+{
+  "type": "compact",
+  "id": <task_id>,
+  "dataSource": <task_datasource>,
+  "ioConfig": <IO config>,
+  "dimensionsSpec": <custom dimensionsSpec>,
+  "transformSpec": <custom transformSpec>,
+  "metricsSpec": <custom metricsSpec>,
+  "tuningConfig": <parallel indexing task tuningConfig>,
+  "granularitySpec": <compaction task granularitySpec>,
+  "context": <task context>
+}
+```
+
+|Field|Description|Required|
+|-----|-----------|--------|
+|`type`|Task type. Set the value to `compact`.|Yes|
+|`id`|Task ID|No|
+|`dataSource`|Data source name to compact|Yes|
+|`ioConfig`|I/O configuration for compaction task. See [Compaction I/O configuration](#compaction-io-configuration) for details.|Yes|
+|`dimensionsSpec`|When set, the compaction task uses the specified `dimensionsSpec` rather than generating one from existing segments. See [Compaction dimensionsSpec](#compaction-dimensions-spec) for details.|No|
+|`transformSpec`|When set, the compaction task uses the specified `transformSpec` rather than using `null`. See [Compaction transformSpec](#compaction-transform-spec) for details.|No|
+|`metricsSpec`|When set, the compaction task uses the specified `metricsSpec` rather than generating one from existing segments.|No|
+|`segmentGranularity`|Deprecated. Use `granularitySpec`.|No|
+|`tuningConfig`|[Tuning configuration](../ingestion/native-batch.md#tuningconfig) for parallel indexing. `awaitSegmentAvailabilityTimeoutMillis` value is not supported for compaction tasks. Leave this parameter at the default value, 0.|No|
+|`granularitySpec`|When set, the compaction task uses the specified `granularitySpec` rather than generating one from existing segments. See [Compaction `granularitySpec`](#compaction-granularity-spec) for details.|No|
+|`context`|[Task context](../ingestion/tasks.md#context-parameters)|No|
+
+:::info
+ Note: Use `granularitySpec` over `segmentGranularity` and only set one of these values. If you specify different values for these in the same compaction spec, the task fails.
+:::
+
+To control the number of result segments per time chunk, you can set [`maxRowsPerSegment`](../ingestion/native-batch.md#partitionsspec) or [`numShards`](../ingestion/native-batch.md#tuningconfig).
+
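+A minimal sketch of a compaction `tuningConfig` that caps segment size with a `dynamic` partitions spec is shown below; the row count is an illustrative value, not a recommendation:
+
+```json
+"tuningConfig": {
+  "type": "index_parallel",
+  "partitionsSpec": {
+    "type": "dynamic",
+    "maxRowsPerSegment": 5000000
+  }
+}
+```
+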
+:::info
+ You can run multiple compaction tasks in parallel. For example, if you want to compact the data for a year, you are not limited to running a single task for the entire year. You can run 12 compaction tasks with month-long intervals.
+:::
+
+A compaction task internally generates an `index` or `index_parallel` task spec for performing compaction work with some fixed parameters. For example, its `inputSource` is always the [`druid` input source](../ingestion/input-sources.md), and `dimensionsSpec` and `metricsSpec` include all dimensions and metrics of the input segments by default.
+
+Compaction tasks typically fetch all [relevant segments](#compaction-io-configuration) prior to launching any subtasks, _unless_ the following properties are all set to non-null values. It is strongly recommended to set them to non-null values to maximize performance and minimize disk usage of the `compact` task:
+
+- [`granularitySpec`](#compaction-granularity-spec), with non-null values for each of `segmentGranularity`, `queryGranularity`, and `rollup`
+- [`dimensionsSpec`](#compaction-dimensions-spec)
+- `metricsSpec`
+
+Compaction tasks exit without doing anything and issue a failure status code in either of the following cases:
+
+- If the interval you specify has no data segments loaded.
+- If the interval you specify is empty.
+
+Note that the metadata between input segments and the resulting compacted segments may differ if the metadata among the input segments differs as well. If all input segments have the same metadata, however, the resulting output segment will have the same metadata as all input segments.
+
+
+## Manual compaction task example
+
+The following JSON illustrates a compaction task to compact _all segments_ within the interval `2020-01-01/2021-01-01` and create new segments:
+
+```json
+{
+ "type": "compact",
+ "dataSource": "wikipedia",
+ "ioConfig": {
+ "type": "compact",
+ "inputSpec": {
+ "type": "interval",
+ "interval": "2020-01-01/2021-01-01"
+ }
+ },
+ "granularitySpec": {
+ "segmentGranularity": "day",
+ "queryGranularity": "hour"
+ }
+}
+```
+
+`granularitySpec` is an optional field.
+If you don't specify `granularitySpec`, Druid retains the original segment and query granularities when compaction is complete.
+
+## Compaction I/O configuration
+
+The compaction `ioConfig` requires specifying `inputSpec` as follows:
+
+|Field|Description|Default|Required|
+|-----|-----------|-------|--------|
+|`type`|Task type. Set the value to `compact`.|none|Yes|
+|`inputSpec`|Specification of the target [interval](#interval-inputspec) or [segments](#segments-inputspec).|none|Yes|
+|`dropExisting`|If `true`, the task replaces all existing segments fully contained by either of the following:<br />- the `interval` in the `interval` type `inputSpec`.<br />- the umbrella interval of the `segments` in the `segment` type `inputSpec`.<br /><br />If compaction fails, Druid does not change any of the existing segments. **WARNING**: `dropExisting` in `ioConfig` is a beta feature. |false|No|
+|`allowNonAlignedInterval`|If `true`, the task allows an explicit [`segmentGranularity`](#compaction-granularity-spec) that is not aligned with the provided [interval](#interval-inputspec) or [segments](#segments-inputspec). This parameter is only used if [`segmentGranularity`](#compaction-granularity-spec) is explicitly provided. This parameter is provided for backwards compatibility. In most scenarios it should not be set, as it can lead to data being accidentally overshadowed. This parameter may be removed in a future release.|false|No|
+
+The compaction task has two kinds of `inputSpec`:
+
+### Interval `inputSpec`
+
+|Field|Description|Required|
+|-----|-----------|--------|
+|`type`|Task type. Set the value to `interval`.|Yes|
+|`interval`|Interval to compact.|Yes|
+
+### Segments `inputSpec`
+
+|Field|Description|Required|
+|-----|-----------|--------|
+|`type`|Task type. Set the value to `segments`.|Yes|
+|`segments`|A list of segment IDs.|Yes|
+
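+For illustration only, an `ioConfig` targeting specific segments might look like the following sketch. The segment IDs are placeholders in the usual `<datasource>_<start>_<end>_<version>` form:
+
+```json
+"ioConfig": {
+  "type": "compact",
+  "inputSpec": {
+    "type": "segments",
+    "segments": [
+      "wikipedia_2020-01-01T00:00:00.000Z_2020-01-02T00:00:00.000Z_2023-09-01T12:00:00.000Z",
+      "wikipedia_2020-01-02T00:00:00.000Z_2020-01-03T00:00:00.000Z_2023-09-01T12:00:00.000Z"
+    ]
+  }
+}
+```
+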
+## Compaction dimensions spec
+
+|Field|Description|Required|
+|-----|-----------|--------|
+|`dimensions`| A list of dimension names or objects. Cannot have the same column in both `dimensions` and `dimensionExclusions`. Defaults to `null`, which preserves the original dimensions.|No|
+|`dimensionExclusions`| The names of dimensions to exclude from compaction. Only names are supported here, not objects. This list is only used if the dimensions list is null or empty; otherwise it is ignored. Defaults to `[]`.|No|
+
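+As a sketch, a `dimensionsSpec` that keeps only a fixed set of dimensions during compaction might look like the following. The dimension names and the typed dimension object are illustrative:
+
+```json
+"dimensionsSpec": {
+  "dimensions": [
+    "channel",
+    "countryName",
+    { "type": "long", "name": "added" }
+  ]
+}
+```
+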
+## Compaction transform spec
+
+|Field|Description|Required|
+|-----|-----------|--------|
+|`filter`|The `filter` conditionally filters input rows during compaction. Only rows that pass the filter are included in the compacted segments. You can use any of Druid's standard [query filters](../querying/filters.md). Defaults to `null`, which does not filter any rows.|No|
+
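+For example, a `transformSpec` that keeps only rows matching a filter might look like the following sketch. The column name and value are illustrative:
+
+```json
+"transformSpec": {
+  "filter": {
+    "type": "selector",
+    "dimension": "channel",
+    "value": "#en.wikipedia"
+  }
+}
+```
+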
+## Compaction granularity spec
+
+|Field|Description|Required|
+|-----|-----------|--------|
+|`segmentGranularity`|Time chunking period for the segment granularity. Defaults to `null`, which preserves the original segment granularity. Accepts all [Query granularity](../querying/granularities.md) values.|No|
+|`queryGranularity`|The resolution of timestamp storage within each segment. Defaults to `null`, which preserves the original query granularity. Accepts all [Query granularity](../querying/granularities.md) values.|No|
+|`rollup`|Enables compaction-time rollup. To preserve the original setting, keep the default value. To enable compaction-time rollup, set the value to `true`. Once the data is rolled up, you can no longer recover individual records.|No|
+
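+For instance, a `granularitySpec` that coarsens segments to monthly granularity, truncates timestamps to the hour, and enables rollup might look like this sketch:
+
+```json
+"granularitySpec": {
+  "segmentGranularity": "month",
+  "queryGranularity": "hour",
+  "rollup": true
+}
+```
+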
+## Learn more
+
+See the following topics for more information:
+* [Compaction](compaction.md) for an overview of compaction in Druid.
+* [Segment optimization](../operations/segment-optimization.md) for guidance on evaluating and optimizing Druid segment size.
+* [Coordinator process](../design/coordinator.md#automatic-compaction) for details on how the Coordinator plans compaction tasks.
+
diff --git a/docs/35.0.0/data-management/schema-changes.md b/docs/35.0.0/data-management/schema-changes.md
new file mode 100644
index 0000000000..0771da3ce2
--- /dev/null
+++ b/docs/35.0.0/data-management/schema-changes.md
@@ -0,0 +1,39 @@
+---
+id: schema-changes
+title: "Schema changes"
+---
+
+
+
+
+## For new data
+
+Apache Druid allows you to provide a new schema for new data without the need to update the schema of any existing data.
+It is sufficient to update your supervisor spec, if using [streaming ingestion](../ingestion/index.md#streaming), or to
+provide the new schema the next time you do a [batch ingestion](../ingestion/index.md#batch). This is made possible by
+the fact that each [segment](../design/segments.md), at the time it is created, stores a
+copy of its own schema. Druid reconciles all of these individual segment schemas automatically at query time.
+
+## For existing data
+
+Schema changes are sometimes necessary for existing data. For example, you may want to change the type of a column in
+previously-ingested data, or drop a column entirely. Druid handles this using [reindexing](update.md), the same method
+it uses to handle updates of existing data. Reindexing involves rewriting all affected segments and can be a
+time-consuming operation.
diff --git a/docs/35.0.0/data-management/update.md b/docs/35.0.0/data-management/update.md
new file mode 100644
index 0000000000..a8c75a5d34
--- /dev/null
+++ b/docs/35.0.0/data-management/update.md
@@ -0,0 +1,78 @@
+---
+id: update
+title: "Data updates"
+---
+
+
+
+## Overwrite
+
+Apache Druid stores data [partitioned by time chunk](../design/storage.md) and supports
+overwriting existing data using time ranges. Data outside the replacement time range is not touched. Overwriting of
+existing data is done using the same mechanisms as [batch ingestion](../ingestion/index.md#batch).
+
+For example:
+
+- [Native batch](../ingestion/native-batch.md) with `appendToExisting: false`, and `intervals` set to a specific
+ time range, overwrites data for that time range.
+- [SQL `REPLACE OVERWRITE [ALL | WHERE ...]`](../multi-stage-query/reference.md#replace) overwrites data for
+ the entire table or for a specified time range.
+
+In both cases, Druid's atomic update mechanism ensures that queries will flip seamlessly from the old data to the new
+data on a time-chunk-by-time-chunk basis.
+
+Ingestion and overwriting cannot run concurrently for the same time range of the same datasource. While an overwrite job
+is ongoing for a particular time range of a datasource, new ingestions for that time range are queued up. Ingestions for
+other time ranges proceed as normal. Read-only queries also proceed as normal, using the pre-existing version of the
+data.
+
+:::info
+ Druid does not support single-record updates by primary key.
+:::
+
+## Reindex
+
+Reindexing is an [overwrite of existing data](#overwrite) where the source of new data is the existing data itself. It
+is used to perform schema changes, repartition data, filter out unwanted data, enrich existing data, and so on. This
+behaves just like any other [overwrite](#overwrite) with regard to atomic updates and locking.
+
+With [native batch](../ingestion/native-batch.md), use the [`druid` input
+source](../ingestion/input-sources.md#druid-input-source). If needed,
+[`transformSpec`](../ingestion/ingestion-spec.md#transformspec) can be used to filter or modify data during the
+reindexing job.
+
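+As a rough sketch (the datasource name and interval are illustrative), the `ioConfig` of such a reindexing job reads from the datasource itself:
+
+```json
+"ioConfig": {
+  "type": "index_parallel",
+  "inputSource": {
+    "type": "druid",
+    "dataSource": "wikipedia",
+    "interval": "2020-01-01/2021-01-01"
+  },
+  "appendToExisting": false
+}
+```
+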
+With SQL, use [`REPLACE OVERWRITE`](../multi-stage-query/reference.md#replace) with `SELECT ... FROM <table>`.
+(Druid does not have `UPDATE` or `ALTER TABLE` statements.) Any SQL SELECT query can be used to filter,
+modify, or enrich the data during the reindexing job.
+
+## Rolled-up datasources
+
+Rolled-up datasources can be effectively updated using appends, without rewrites. When you append a row that has an
+identical set of dimensions to an existing row, queries that use aggregation operators automatically combine those two
+rows together at query time.
+
+[Compaction](compaction.md) or [automatic compaction](automatic-compaction.md) can be used to physically combine these
+matching rows together later on, by rewriting segments in the background.
+
+## Lookups
+
+If you have a dimension where values need to be updated frequently, try first using [lookups](../querying/lookups.md). A
+classic use case of lookups is when you have an ID dimension stored in a Druid segment, and want to map the ID dimension to a
+human-readable string that may need to be updated periodically.
diff --git a/docs/35.0.0/design/architecture.md b/docs/35.0.0/design/architecture.md
new file mode 100644
index 0000000000..04498defb1
--- /dev/null
+++ b/docs/35.0.0/design/architecture.md
@@ -0,0 +1,185 @@
+---
+id: architecture
+title: "Architecture"
+---
+
+
+
+
+Druid has a distributed architecture that is designed to be cloud-friendly and easy to operate. You can configure and scale services independently for maximum flexibility over cluster operations. This design includes enhanced fault tolerance: an outage of one component does not immediately affect other components.
+
+The following diagram shows the services that make up the Druid architecture, their typical arrangement across servers, and how queries and data flow through this architecture.
+
+
+
+The following sections describe the components of this architecture.
+
+## Druid services
+
+Druid has several types of services:
+
+* [Coordinator](../design/coordinator.md) manages data availability on the cluster.
+* [Overlord](../design/overlord.md) controls the assignment of data ingestion workloads.
+* [Broker](../design/broker.md) handles queries from external clients.
+* [Router](../design/router.md) routes requests to Brokers, Coordinators, and Overlords.
+* [Historical](../design/historical.md) stores queryable data.
+* [Middle Manager](../design/middlemanager.md) and [Peon](../design/peons.md) ingest data.
+* [Indexer](../design/indexer.md) serves as an alternative to the Middle Manager + Peon task execution system.
+
+You can view services in the **Services** tab in the web console:
+
+
+
+## Druid servers
+
+You can deploy Druid services according to your preferences. For ease of deployment, we recommend organizing them into three server types: [Master](#master-server), [Query](#query-server), and [Data](#data-server).
+
+### Master server
+
+A Master server manages data ingestion and availability. It is responsible for starting new ingestion jobs and coordinating availability of data on the [Data server](#data-server).
+
+Master servers divide operations between Coordinator and Overlord services.
+
+#### Coordinator service
+
+[Coordinator](../design/coordinator.md) services watch over the Historical services on the Data servers. They are responsible for assigning segments to specific servers, and for ensuring segments are well-balanced across Historicals.
+
+#### Overlord service
+
+[Overlord](../design/overlord.md) services watch over the Middle Manager services on the Data servers and are the controllers of data ingestion into Druid. They are responsible for assigning ingestion tasks to Middle Managers and for coordinating segment publishing.
+
+### Query server
+
+A Query server provides the endpoints that users and client applications interact with, routing queries to Data servers or other Query servers (and optionally proxied Master server requests).
+
+Query servers divide operations between Broker and Router services.
+
+#### Broker service
+
+[Broker](../design/broker.md) services receive queries from external clients and forward those queries to Data servers. When Brokers receive results from those subqueries, they merge those results and return them to the caller. Typically, you query Brokers rather than querying Historical or Middle Manager services on Data servers directly.
+
+#### Router service
+
+[**Router**](../design/router.md) services provide a unified API gateway in front of Brokers, Overlords, and Coordinators.
+
+The Router service also runs the [web console](../operations/web-console.md), a UI for loading data, managing datasources and tasks, and viewing server status and segment information.
+
+### Data server
+
+A Data server executes ingestion jobs and stores queryable data.
+
+Data servers divide operations between Historical and Middle Manager services.
+
+#### Historical service
+
+[**Historical**](../design/historical.md) services handle storage and querying on historical data, including any streaming data that has been in the system long enough to be committed. Historical services download segments from deep storage and respond to queries about these segments. They don't accept writes.
+
+#### Middle Manager service
+
+[**Middle Manager**](../design/middlemanager.md) services handle ingestion of new data into the cluster. They are responsible
+for reading from external data sources and publishing new Druid segments.
+
+##### Peon service
+
+[**Peon**](../design/peons.md) services are task execution engines spawned by Middle Managers. Each Peon runs a separate JVM and is responsible for executing a single task. Peons always run on the same host as the Middle Manager that spawned them.
+
+#### Indexer service (optional)
+
+[**Indexer**](../design/indexer.md) services are an alternative to Middle Managers and Peons. Instead of
+forking separate JVM processes per-task, the Indexer runs tasks as individual threads within a single JVM process.
+
+The Indexer is designed to be easier to configure and deploy compared to the Middle Manager + Peon system and to better enable resource sharing across tasks, which can help streaming ingestion. The Indexer is currently designated [experimental](../development/experimental.md).
+
+Typically, you would deploy one of the following: Middle Managers, [MiddleManager-less ingestion using Kubernetes](../development/extensions-core/k8s-jobs.md), or Indexers. You wouldn't deploy more than one of these options.
+
+## Colocation of services
+
+Colocating Druid services by server type generally results in better utilization of hardware resources for most clusters.
+For very large scale clusters, it can be desirable to split the Druid services such that they run on individual servers to avoid resource contention.
+
+This section describes guidelines and configuration parameters related to service colocation.
+
+### Coordinators and Overlords
+
+The workload on the Coordinator service tends to increase with the number of segments in the cluster. The Overlord's workload also increases based on the number of segments in the cluster, but to a lesser degree than the Coordinator.
+
+In clusters with very high segment counts, it can make sense to separate the Coordinator and Overlord services to provide more resources for the Coordinator's segment balancing workload.
+
+You can run the Coordinator and Overlord services as a single combined service by setting the `druid.coordinator.asOverlord.enabled` property.
+For more information, see [Coordinator Operation](../configuration/index.md#coordinator-operation).
+
+### Historicals and Middle Managers
+
+With higher levels of ingestion or query load, it can make sense to deploy the Historical and Middle Manager services on separate hosts to avoid CPU and memory contention.
+
+The Historical service also benefits from having free memory for memory mapped segments, which can be another reason to deploy the Historical and Middle Manager services separately.
+
+## External dependencies
+
+In addition to its built-in service types, Druid also has three external dependencies. These are intended to be able to
+leverage existing infrastructure, where present.
+
+### Deep storage
+
+Druid uses deep storage to store any data that has been ingested into the system. Deep storage is shared file
+storage accessible by every Druid server. In a clustered deployment, this is typically a distributed object store like S3 or
+HDFS, or a network mounted filesystem. In a single-server deployment, this is typically local disk.
+
+Druid uses deep storage for the following purposes:
+
+- To store all the data you ingest. Segments that get loaded onto Historical services for low latency queries are also kept in deep storage for backup purposes. Additionally, segments that are only in deep storage can be used for [queries from deep storage](../querying/query-from-deep-storage.md).
+- As a way to transfer data in the background between Druid services. Druid stores data in files called _segments_.
+
+Historical services cache data segments on local disk and serve queries from that cache as well as from an in-memory cache.
+Segments on disk for Historical services provide the low latency querying performance Druid is known for.
+
+You can also query directly from deep storage. When you query segments that exist only in deep storage, you trade some performance for the ability to query more of your data without necessarily having to scale your Historical services.
+
+When determining sizing for your storage, keep the following in mind:
+
+- Deep storage needs to be able to hold all the data that you ingest into Druid.
+- On-disk storage for Historical services needs to be able to accommodate the data you want to load onto them to run queries. The data on Historical services should be data you access frequently and need to run low-latency queries for.
+
+Deep storage is an important part of Druid's elastic, fault-tolerant design. Druid bootstraps from deep storage even
+if every single data server is lost and re-provisioned.
+
+For more details, please see the [Deep storage](../design/deep-storage.md) page.
+
+### Metadata storage
+
+The metadata storage holds various shared system metadata such as segment usage information and task information. In a
+clustered deployment, this is typically a traditional RDBMS like PostgreSQL or MySQL. In a single-server
+deployment, it is typically a locally-stored Apache Derby database.
+
+For more details, please see the [Metadata storage](../design/metadata-storage.md) page.
+
+### ZooKeeper
+
+Druid uses ZooKeeper for internal service discovery, coordination, and leader election.
+
+For more details, please see the [ZooKeeper](zookeeper.md) page.
+
+## Learn more
+
+See the following topics for more information:
+
+* [Storage components](storage.md) to learn about data storage in Druid.
+* [Segments](segments.md) to learn about segment files.
+* [Query processing](../querying/query-processing.md) for a high-level overview of how Druid processes queries.
\ No newline at end of file
diff --git a/docs/35.0.0/design/broker.md b/docs/35.0.0/design/broker.md
new file mode 100644
index 0000000000..bbd6b94f2b
--- /dev/null
+++ b/docs/35.0.0/design/broker.md
@@ -0,0 +1,54 @@
+---
+id: broker
+title: "Broker service"
+sidebar_label: "Broker"
+---
+
+
+
+
+The Broker service routes queries in a distributed cluster setup. It interprets the metadata published to ZooKeeper about segment distribution across services and routes queries accordingly. Additionally, the Broker service consolidates result sets from individual services.
+
+## Configuration
+
+For Apache Druid Broker service configuration, see [Broker Configuration](../configuration/index.md#broker).
+
+For basic tuning guidance for the Broker service, see [Basic cluster tuning](../operations/basic-cluster-tuning.md#broker).
+
+## HTTP endpoints
+
+For a list of API endpoints supported by the Broker, see [Broker API](../api-reference/legacy-metadata-api.md#broker).
+
+## Running
+
+```
+org.apache.druid.cli.Main server broker
+```
+
+## Forwarding queries
+
+Most Druid queries contain an interval object that indicates a span of time for which data is requested. Similarly, Druid partitions [segments](../design/segments.md) to contain data for some interval of time and distributes the segments across a cluster. Consider a simple datasource with seven segments where each segment contains data for a given day of the week. Any query issued to the datasource for more than one day of data will hit more than one segment. These segments will likely be distributed across multiple services, and hence, the query will likely hit multiple services.
+
+To determine which services to forward queries to, the Broker service first builds a view of the world from information in ZooKeeper. ZooKeeper maintains information about [Historical](../design/historical.md) and streaming ingestion [Peon](../design/peons.md) services and the segments they are serving. For every datasource in ZooKeeper, the Broker service builds a timeline of segments and the services that serve them. When queries are received for a specific datasource and interval, the Broker service performs a lookup into the timeline associated with the query datasource for the query interval and retrieves the services that contain data for the query. The Broker service then forwards down the query to the selected services.
+
+## Caching
+
+Broker services employ a cache with an LRU cache invalidation strategy. The Broker cache stores per-segment results. The cache can be local to each Broker service or shared across multiple services using an external distributed cache such as [memcached](http://memcached.org/). Each time a Broker service receives a query, it first maps the query to a set of segments. A subset of these segment results may already exist in the cache and the results can be directly pulled from the cache. For any segment results that do not exist in the cache, the Broker service will forward the query to the
+Historical services. Once the Historical services return their results, the Broker will store those results in the cache. Real-time segments are never cached and hence requests for real-time data will always be forwarded to real-time services. Real-time data is perpetually changing and caching the results would be unreliable.
\ No newline at end of file
diff --git a/docs/35.0.0/design/coordinator.md b/docs/35.0.0/design/coordinator.md
new file mode 100644
index 0000000000..bc4c5ebc1c
--- /dev/null
+++ b/docs/35.0.0/design/coordinator.md
@@ -0,0 +1,176 @@
+---
+id: coordinator
+title: "Coordinator service"
+sidebar_label: "Coordinator"
+---
+
+
+
+
+The Coordinator service is primarily responsible for segment management and distribution. More specifically, the
+Coordinator service communicates with Historical services to load or drop segments based on configurations. The Coordinator is responsible for loading new segments, dropping outdated segments, ensuring that segments are "replicated" (that is, loaded on multiple different Historical nodes) the proper (configured) number of times, and moving
+("balancing") segments between Historical nodes to keep the latter evenly loaded.
+
+The Coordinator runs its duties periodically and the time between each run is a configurable parameter. On each
+run, the Coordinator assesses the current state of the cluster before deciding on the appropriate actions to take.
+Similar to the Broker and Historical services, the Coordinator maintains a connection to a ZooKeeper cluster for
+current cluster information. The Coordinator also maintains a connection to a database containing information about
+"used" segments (that is, the segments that *should* be loaded in the cluster) and the loading rules.
+
+Before any unassigned segments are serviced by Historical services, the Historical services for each tier are first
+sorted in terms of capacity, with the least-capacity servers having the highest priority. Unassigned segments are always
+assigned to the services with the least capacity to maintain a level of balance between services. The Coordinator does not
+directly communicate with a Historical service when assigning it a new segment; instead, the Coordinator creates some
+temporary information about the new segment under the load queue path of the Historical service. Once this request is seen,
+the Historical service loads the segment and begins servicing it.
+
+## Configuration
+
+For Apache Druid Coordinator service configuration, see [Coordinator configuration](../configuration/index.md#coordinator).
+
+For basic tuning guidance for the Coordinator service, see [Basic cluster tuning](../operations/basic-cluster-tuning.md#coordinator).
+
+## HTTP endpoints
+
+For a list of API endpoints supported by the Coordinator, see [Service status API reference](../api-reference/service-status-api.md#coordinator).
+
+## Running
+
+```
+org.apache.druid.cli.Main server coordinator
+```
+
+## Rules
+
+Segments can be automatically loaded and dropped from the cluster based on a set of rules. For more information on rules, see [Rule Configuration](../operations/rule-configuration.md).
+
+### Clean up overshadowed segments
+
+On each run, the Coordinator compares the set of used segments in the database with the segments served by some
+Historical nodes in the cluster. The Coordinator sends requests to Historical nodes to unload unused segments or segments
+that are removed from the database.
+
+Segments that are overshadowed (their versions are too old and their data has been replaced by newer segments) are
+marked as unused. During the next Coordinator's run, they will be unloaded from Historical nodes in the cluster.
+
+### Clean up non-overshadowed eternity tombstone segments
+
+On each run, the Coordinator determines and cleans up unneeded eternity tombstone segments for each datasource. A tombstone segment is considered unneeded if it meets all of the following criteria:
+- It is a tombstone segment that starts at -INF or ends at INF (for example, a tombstone with an interval of `-146136543-09-08T08:23:32.096Z/2000-01-01` or `2020-01-01/146140482-04-24T15:36:27.903Z` or `-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z`)
+- It does not overlap with any overshadowed segment
+- It has 0 core partitions
+
+## Segment availability
+
+If a Historical service restarts or becomes unavailable for any reason, the Coordinator notices that a service has gone missing and treats all segments served by that service as being dropped. The segments are then reassigned to other Historical services in the cluster. However, each segment that is dropped is not immediately forgotten. Instead, there is a transitional data structure that stores all dropped segments with an associated lifetime. The lifetime represents a period of time in which the Coordinator will not reassign a dropped segment. Hence, if a Historical service becomes unavailable and available again within a short period of time, the Historical service will start up and serve segments from its cache without any of those segments being reassigned across the cluster.
+
+## Balancing segments in a tier
+
+Druid queries perform optimally when segments are distributed evenly across Historical services. An ideal distribution would ensure that all Historicals participate equally in the query load thus avoiding hot-spots in the system. To some extent, this can be achieved by keeping multiple replicas of a segment in a cluster.
+But in a tier with several Historicals (or a low replication factor), segment replication is not sufficient to attain balance.
+Thus, the Coordinator constantly monitors the set of segments present on each Historical in a tier and employs one of the following strategies to identify segments that may be moved from one Historical to another to retain balance.
+
+- `cost` (default): For a given segment in a tier, this strategy picks the server with the minimum "cost" of placing that segment. The cost is a function of the data interval of the segment and the data intervals of all the segments already present on the candidate server. In essence, this strategy tries to avoid placing segments with adjacent or overlapping data intervals on the same server. This is based on the premise that adjacent-interval segments are more likely to be used together in a query and placing them on the same server may lead to skewed CPU usages of Historicals.
+- `diskNormalized`: A derivative of the `cost` strategy that weights the cost of placing a segment on a server with the disk usage ratio of the server. There are known issues with this strategy, and it is not recommended for a production cluster.
+- `random`: Distributes segments randomly across servers. This is an experimental strategy and is not recommended for a production cluster.
+
+All of the above strategies prioritize moving segments from the Historical with the least available disk space.
+
+## Automatic compaction
+
+The Coordinator manages the [automatic compaction system](../data-management/automatic-compaction.md).
+On each run, the Coordinator compacts segments by merging small segments or splitting a large one. This is useful when your segments are not optimally sized, which may degrade query performance.
+See [Segment size optimization](../operations/segment-optimization.md) for details.
+
+The Coordinator first finds the segments to compact based on the [segment search policy](#segment-search-policy-in-automatic-compaction).
+Once some segments are found, it issues a [compaction task](../ingestion/tasks.md#compact) to compact those segments.
+The maximum number of running compaction tasks is `min(sum of worker capacity * slotRatio, maxSlots)`.
+Note that even if `min(sum of worker capacity * slotRatio, maxSlots) = 0`, at least one compaction task is always submitted
+if compaction is enabled for a dataSource.
+See [Automatic compaction configuration API](../api-reference/automatic-compaction-api.md#manage-automatic-compaction) and [Automatic compaction configuration](../configuration/index.md#automatic-compaction-dynamic-configuration) to enable and configure automatic compaction.
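+
+For example, if the total worker capacity is 10, the slot ratio is 0.1, and the maximum number of compaction task slots is 100, then at most `min(10 * 0.1, 100) = 1` compaction task can run at a time.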
+
+Compaction tasks might fail due to the following reasons:
+
+- If the input segments of a compaction task are removed or overshadowed before it starts, that compaction task fails immediately.
+- If a task of a higher priority acquires a [time chunk lock](../ingestion/tasks.md#locking) for an interval overlapping with the interval of a compaction task, the compaction task fails.
+
+Once a compaction task fails, the Coordinator simply checks the segments in the interval of the failed task again, and issues another compaction task in the next run.
+
+Note that the Compacting Segments Coordinator duty is automatically enabled and runs as part of the Indexing Service duties group. However, you can configure it to run in isolation as a separate Coordinator duty group. This allows you to change the period of the Compacting Segments duty without impacting the period of other Indexing Service duties. To do so, set the following properties, where `<GROUP_NAME>` and `<PERIOD>` are placeholders for your duty group name and run period. For more details, see [custom pluggable Coordinator Duty](../development/modules.md#adding-your-own-custom-pluggable-coordinator-duty).
+```
+druid.coordinator.dutyGroups=[<GROUP_NAME>]
+druid.coordinator.<GROUP_NAME>.duties=["compactSegments"]
+druid.coordinator.<GROUP_NAME>.period=<PERIOD>
+```
+
+## Segment search policy in automatic compaction
+
+At every Coordinator run, this policy looks up time chunks from newest to oldest and checks whether the segments in those time chunks
+need compaction.
+A set of segments needs compaction if all conditions below are satisfied:
+
+* Total size of segments in the time chunk is smaller than or equal to the configured `inputSegmentSizeBytes`.
+* Segments have never been compacted yet, or the compaction spec (in particular `maxTotalRows` or `indexSpec`) has been updated since the last compaction.
+
+Here are some details with an example. Suppose we have two dataSources (`foo`, `bar`) as seen below:
+
+- `foo`
+ - `foo_2017-11-01T00:00:00.000Z_2017-12-01T00:00:00.000Z_VERSION`
+ - `foo_2017-11-01T00:00:00.000Z_2017-12-01T00:00:00.000Z_VERSION_1`
+ - `foo_2017-09-01T00:00:00.000Z_2017-10-01T00:00:00.000Z_VERSION`
+- `bar`
+ - `bar_2017-10-01T00:00:00.000Z_2017-11-01T00:00:00.000Z_VERSION`
+ - `bar_2017-10-01T00:00:00.000Z_2017-11-01T00:00:00.000Z_VERSION_1`
+
+Assuming that each segment is 10 MB and hasn't been compacted yet, this policy first returns the two segments
+`foo_2017-11-01T00:00:00.000Z_2017-12-01T00:00:00.000Z_VERSION` and `foo_2017-11-01T00:00:00.000Z_2017-12-01T00:00:00.000Z_VERSION_1` to compact together because
+`2017-11-01T00:00:00.000Z/2017-12-01T00:00:00.000Z` is the most recent time chunk.
+
+If the Coordinator has enough task slots for compaction, this policy will continue searching for the next segments and return
+`bar_2017-10-01T00:00:00.000Z_2017-11-01T00:00:00.000Z_VERSION` and `bar_2017-10-01T00:00:00.000Z_2017-11-01T00:00:00.000Z_VERSION_1`.
+Finally, `foo_2017-09-01T00:00:00.000Z_2017-10-01T00:00:00.000Z_VERSION` will be picked up even though there is only one segment in the time chunk of `2017-09-01T00:00:00.000Z/2017-10-01T00:00:00.000Z`.
+
+The search start point can be changed by setting `skipOffsetFromLatest`.
+If this is set, the policy ignores any segments that fall within the period from (the end time of the most recent segment - `skipOffsetFromLatest`) up to that end time.
+This is to avoid conflicts between compaction tasks and realtime tasks.
+Note that realtime tasks have a higher priority than compaction tasks by default. Realtime tasks will revoke the locks of compaction tasks if their intervals overlap, resulting in the termination of the compaction task.
+For more information, see [Avoid conflicts with ingestion](../data-management/automatic-compaction.md#avoid-conflicts-with-ingestion).
+
+:::info
+ This policy currently cannot handle the situation when there are a lot of small segments which have the same interval,
+ and their total size exceeds [`inputSegmentSizeBytes`](../configuration/index.md#automatic-compaction-dynamic-configuration).
+ If it finds such segments, it simply skips them.
+:::
+
+## FAQ
+
+1. **Do clients ever contact the Coordinator service?**
+
+ The Coordinator is not involved in a query.
+
+ Historical services never directly contact the Coordinator service. The Coordinator tells the Historical services to load/drop data via ZooKeeper, but the Historical services are completely unaware of the Coordinator.
+
+ Brokers also never contact the Coordinator. Brokers base their understanding of the data topology on metadata exposed by the Historical services via ZooKeeper and are completely unaware of the Coordinator.
+
+2. **Does it matter if the Coordinator service starts up before or after other services?**
+
+ No. If the Coordinator is not started up, no new segments will be loaded in the cluster and outdated segments will not be dropped. However, the Coordinator service can be started up at any time, and after a configurable delay, will start running Coordinator tasks.
+
+ This also means that if you have a working cluster and all of your Coordinators die, the cluster will continue to function, it just won’t experience any changes to its data topology.
diff --git a/docs/35.0.0/design/deep-storage.md b/docs/35.0.0/design/deep-storage.md
new file mode 100644
index 0000000000..0674f32429
--- /dev/null
+++ b/docs/35.0.0/design/deep-storage.md
@@ -0,0 +1,88 @@
+---
+id: deep-storage
+title: "Deep storage"
+---
+
+
+
+
+Deep storage is where segments are stored. It is a storage mechanism that Apache Druid does not provide. This deep storage infrastructure defines the level of durability of your data. As long as Druid processes can see this storage infrastructure and get at the segments stored on it, you will not lose data no matter how many Druid nodes you lose. If segments disappear from this storage layer, then you will lose whatever data those segments represented.
+
+In addition to being the backing store for segments, you can use [query from deep storage](#querying-from-deep-storage) and run queries against segments stored primarily in deep storage. The [load rules](../operations/rule-configuration.md#load-rules) you configure determine whether segments exist primarily in deep storage or in a combination of deep storage and Historical processes.
+
+## Deep storage options
+
+Druid supports multiple options for deep storage, including blob storage from major cloud providers. Select the one that fits your environment.
+
+### Local
+
+Local storage is intended for use in the following situations:
+
+- You have just one server.
+- Or, you have multiple servers, and they all have access to a shared filesystem (for example: NFS).
+
+For multi-server production clusters, rather than local storage with a shared filesystem, we recommend
+cloud-based deep storage ([Amazon S3](#amazon-s3-or-s3-compatible), [Google Cloud Storage](#google-cloud-storage),
+or [Azure Blob Storage](#azure-blob-storage)), S3-compatible storage (like MinIO), or [HDFS](#hdfs). These options are
+generally more convenient, more scalable, and more robust than setting up a shared filesystem.
+
+The following configurations in `common.runtime.properties` apply to local storage:
+
+|Property|Possible Values|Description|Default|
+|--------|---------------|-----------|-------|
+|`druid.storage.type`|`local`||Must be set.|
+|`druid.storage.storageDirectory`|any local directory|Directory for storing segments. Must be different from `druid.segmentCache.locations` and `druid.segmentCache.infoDir`.|`/tmp/druid/localStorage`|
+|`druid.storage.zip`|`true`, `false`|Whether segments in `druid.storage.storageDirectory` are written as directories (`false`) or zip files (`true`).|`false`|
+
+For example:
+
+```
+druid.storage.type=local
+druid.storage.storageDirectory=/tmp/druid/localStorage
+```
+
+The `druid.storage.storageDirectory` must be set to a different path than `druid.segmentCache.locations` or
+`druid.segmentCache.infoDir`.
+
+### Amazon S3 or S3-compatible
+
+See [`druid-s3-extensions`](../development/extensions-core/s3.md).
+
+### Google Cloud Storage
+
+See [`druid-google-extensions`](../development/extensions-core/google.md).
+
+### Azure Blob Storage
+
+See [`druid-azure-extensions`](../development/extensions-core/azure.md).
+
+### HDFS
+
+See [druid-hdfs-storage extension documentation](../development/extensions-core/hdfs.md).
+
+### Additional options
+
+For additional deep storage options, please see our [extensions list](../configuration/extensions.md).
+
+## Querying from deep storage
+
+Querying from deep storage is not as performant as querying segments cached on disk by Historical processes, but it lets you access segments that you may not need frequently or with the extremely low latency that Druid queries traditionally provide. You trade some performance for a lower total storage cost because you can access more of your data without the need to increase the number or capacity of your Historical processes.
+
+For information about how to run queries, see [Query from deep storage](../querying/query-from-deep-storage.md).
\ No newline at end of file
diff --git a/docs/35.0.0/design/extensions-contrib/dropwizard.md b/docs/35.0.0/design/extensions-contrib/dropwizard.md
new file mode 100644
index 0000000000..7e1100dc7c
--- /dev/null
+++ b/docs/35.0.0/design/extensions-contrib/dropwizard.md
@@ -0,0 +1,95 @@
+---
+id: dropwizard
+layout: doc_page
+title: "Dropwizard metrics emitter"
+---
+
+
+
+# Dropwizard Emitter
+
+To use this extension, make sure to [include](../../configuration/extensions.md#loading-extensions) `dropwizard-emitter` in the extensions load list.
+
+## Introduction
+
+This extension integrates the [Dropwizard](http://metrics.dropwizard.io/3.1.0/getting-started/#) metrics library with Druid so that Dropwizard users can easily absorb Druid into their monitoring ecosystem.
+It accumulates Druid metrics as Dropwizard metrics and emits them to various sinks via Dropwizard-supported reporters.
+The currently supported Dropwizard metric types are counter, gauge, meter, timer, and histogram.
+These metrics can be emitted using either the Console or the JMX reporter.
+
+To use this emitter, set
+
+```
+druid.emitter=dropwizard
+```
+
+## Configuration
+
+All the configuration parameters for Dropwizard emitter are under `druid.emitter.dropwizard`.
+
+|property|description|required?|default|
+|--------|-----------|---------|-------|
+|`druid.emitter.dropwizard.reporters`|List of dropwizard reporters to be used. Here is a list of [Supported Reporters](#supported-dropwizard-reporters)|yes|none|
+|`druid.emitter.dropwizard.prefix`|Optional prefix to be used for metrics name|no|none|
+|`druid.emitter.dropwizard.includeHost`|Flag to include the host and port as part of the metric name.|no|yes|
+|`druid.emitter.dropwizard.dimensionMapPath`|Path to JSON file defining the dropwizard metric type, and desired dimensions for every Druid metric|no|Default mapping provided. See below.|
+|`druid.emitter.dropwizard.alertEmitters`| List of emitters where alerts will be forwarded to. |no| empty list (no forwarding)|
+|`druid.emitter.dropwizard.maxMetricsRegistrySize`| Maximum size of metrics registry to be cached at any time. |no| 100 Mb|
+
+
+### Druid to Dropwizard Event Conversion
+
+Each metric emitted using Dropwizard must specify a type, one of `[timer, counter, gauge, meter, histogram]`. The Dropwizard emitter expects this mapping to
+be provided as a JSON file. Additionally, this mapping specifies which dimensions should be included for each metric.
+If the user does not specify their own JSON file, a [default mapping](#default-metrics-mapping) is used.
+All metrics are expected to be mapped. Metrics which are not mapped will be ignored.
+Dropwizard metric path is organized using the following schema:
+
+`<druid metric name> : { "dimensions" : <dimension list>, "type" : <Dropwizard metric type>, "timeUnit" : <time unit> }`
+
+e.g.
+```json
+"query/time" : { "dimensions" : ["dataSource", "type"], "type" : "timer", "timeUnit": "MILLISECONDS"},
+"segment/scan/pending" : { "dimensions" : [], "type" : "gauge"}
+```
+
+For most use-cases, the default mapping is sufficient.
+
+### Supported Dropwizard reporters
+
+#### JMX Reporter
+Used to report druid metrics via JMX.
+```
+
+druid.emitter.dropwizard.reporters=[{"type":"jmx"}]
+
+```
+
+#### Console Reporter
+Used to print Druid Metrics to console logs.
+
+```
+
+druid.emitter.dropwizard.reporters=[{"type":"console","emitIntervalInSecs":30}]
+
+```
+
+### Default Metrics Mapping
+Latest default metrics mapping can be found [here](https://github.com/apache/druid/blob/master/extensions-contrib/dropwizard-emitter/src/main/resources/defaultMetricDimensions.json)
diff --git a/docs/35.0.0/design/historical.md b/docs/35.0.0/design/historical.md
new file mode 100644
index 0000000000..d4a0782ba2
--- /dev/null
+++ b/docs/35.0.0/design/historical.md
@@ -0,0 +1,73 @@
+---
+id: historical
+title: "Historical service"
+sidebar_label: "Historical"
+---
+
+
+
+The Historical service is responsible for storing and querying historical data.
+Historical services cache data segments on local disk and serve queries from that cache as well as from an in-memory cache.
+
+## Configuration
+
+For Apache Druid Historical service configuration, see [Historical configuration](../configuration/index.md#historical).
+
+For basic tuning guidance for the Historical service, see [Basic cluster tuning](../operations/basic-cluster-tuning.md#historical).
+
+## HTTP endpoints
+
+For a list of API endpoints supported by the Historical, please see the [Service status API reference](../api-reference/service-status-api.md#historical).
+
+## Running
+
+```
+org.apache.druid.cli.Main server historical
+```
+
+## Loading and serving segments
+
+Each Historical service copies or pulls segment files from deep storage to local disk in an area called the segment cache. To configure the size and location of the segment cache on each Historical service, set the `druid.segmentCache.locations` property.
+For more information, see [Segment cache size](../operations/basic-cluster-tuning.md#segment-cache-size).
+
+The [Coordinator](../design/coordinator.md) controls the assignment of segments to Historicals and the balance of segments between Historicals. Historical services do not communicate directly with each other, nor do they communicate directly with the Coordinator. Instead, the Coordinator creates ephemeral entries in ZooKeeper in a [load queue path](../configuration/index.md#path-configuration). Each Historical service maintains a connection to ZooKeeper, watching those paths for segment information.
+
+When a Historical service detects a new entry in the ZooKeeper load queue, it checks its own segment cache. If no information about the segment exists there, the Historical service first retrieves metadata from ZooKeeper about the segment, including where the segment is located in deep storage and how to decompress and process it.
+
+For more information about segment metadata and Druid segments in general, see [Segments](../design/segments.md).
+
+After a Historical service pulls down and processes a segment from deep storage, Druid advertises the segment as being available for queries from the Broker. This announcement by the Historical is made via ZooKeeper, in a [served segments path](../configuration/index.md#path-configuration).
+
+For more information about how the Broker determines what data is available for queries, see [Broker](broker.md).
+
+To make data from the segment cache available for querying as soon as possible, Historical services search the local segment cache upon startup and advertise the segments found there.
+
+## Loading and serving segments from cache
+
+The segment cache uses [memory mapping](https://en.wikipedia.org/wiki/Mmap). The cache consumes memory from the underlying operating system so Historicals can hold parts of segment files in memory to increase query performance at the data level. The in-memory segment cache is affected by the size of the Historical JVM, heap / direct memory buffers, and other services on the operating system itself.
+
+At query time, if the required part of a segment file is available in the memory mapped cache or "page cache", the Historical re-uses it and reads it directly from memory. If it is not in the memory-mapped cache, the Historical reads that part of the segment from disk. In this case, there is potential for new data to flush other segment data from memory. This means that the closer free operating system memory is to `druid.server.maxSize`, the more likely segment data will be available in memory, reducing query times. Conversely, the lower the free operating system memory, the more likely a Historical is to read segments from disk.
+
+Note that this memory-mapped segment cache is in addition to other [query-level caches](../querying/caching.md).
+
+## Querying segments
+
+You can configure a Historical service to log and report metrics for every query it services.
+For information on querying Historical services, see [Querying](../querying/querying.md).
diff --git a/docs/35.0.0/design/index.md b/docs/35.0.0/design/index.md
new file mode 100644
index 0000000000..4d4655a9b1
--- /dev/null
+++ b/docs/35.0.0/design/index.md
@@ -0,0 +1,104 @@
+---
+id: index
+title: "Introduction to Apache Druid"
+---
+
+
+
+Apache Druid is a real-time analytics database designed for fast slice-and-dice analytics ("[OLAP](http://en.wikipedia.org/wiki/Online_analytical_processing)" queries) on large data sets. Most often, Druid powers use cases where real-time ingestion, fast query performance, and high uptime are important.
+
+Druid is commonly used as the database backend for GUIs of analytical applications, or for highly-concurrent APIs that need fast aggregations. Druid works best with event-oriented data.
+
+Common application areas for Druid include:
+
+|Use Case|Description|
+|-----------------|-------------------|
+|Clickstream analytics|Analyze user behavior on websites and mobile applications to understand navigation patterns, popular content, and user engagement|
+|Network telemetry analytics|Monitor and analyze network traffic and performance metrics to optimize network efficiency, identify bottlenecks, and ensure quality of service|
+|Server metrics storage|Collect and store performance metrics such as CPU usage, memory usage, disk I/O, and network activity to monitor server health and optimize resource allocation|
+|Supply chain analytics|Use data from various stages of the supply chain to optimize inventory management, streamline logistics, forecast demand, and improve overall operational efficiency|
+|Application performance metrics|Monitor and analyze the performance of software applications to identify areas for improvement, troubleshoot issues, and ensure optimal user experience|
+|Digital marketing/advertising analytics|Track and analyze the effectiveness of digital marketing campaigns and advertising efforts across various channels, such as social media, search engines, and display ads|
+|Business intelligence (BI)/OLAP (Online Analytical Processing)|Use data analysis tools and techniques to gather insights from large datasets, generate reports, and make data-driven decisions to improve business operations and strategy|
+|Customer analytics|Analyze customer data to understand preferences, behavior, and purchasing patterns, enabling personalized marketing strategies, improved customer service, and customer retention efforts|
+|IoT (Internet of Things) analytics|Process and analyze data generated by IoT devices to gain insights into device performance, user behavior, and environmental conditions, facilitating automation, optimization, and predictive maintenance|
+|Financial analytics| Evaluate finance data to gauge financial performance, manage risk, detect fraud, and make informed investment decisions|
+|Healthcare analytics|Analyze healthcare data to improve patient outcomes, optimize healthcare delivery, reduce costs, and identify trends and patterns in diseases and treatments|
+|Social media analytics|Monitor and analyze social media activity, such as likes, shares, comments, and mentions, to understand audience sentiment, track brand perception, and identify influencers|
+
+If you are experimenting with a new use case for Druid or have questions about Druid's capabilities and features, join the [Apache Druid Slack](http://apachedruidworkspace.slack.com/) channel. There, you can connect with Druid experts, ask questions, and get help in real time.
+
+## Key features of Druid
+
+Druid's core architecture combines ideas from data warehouses, timeseries databases, and logsearch systems. Some of
+Druid's key features are:
+
+1. **Columnar storage format.** Druid uses column-oriented storage. This means it only loads the exact columns
+needed for a particular query. This greatly improves speed for queries that retrieve only a few columns. Additionally, to support fast scans and aggregations, Druid optimizes column storage for each column according to its data type.
+2. **Scalable distributed system.** Typical Druid deployments span clusters ranging from tens to hundreds of servers. Druid can ingest data at the rate of millions of records per second while retaining trillions of records and maintaining query latencies ranging from sub-second to a few seconds.
+3. **Massively parallel processing.** Druid can process each query in parallel across the entire cluster.
+4. **Realtime or batch ingestion.** Druid can ingest data either in real time or in batches. Ingested data is immediately available for
+querying.
+5. **Self-healing, self-balancing, easy to operate.** As an operator, you add servers to scale out or
+remove servers to scale down. The Druid cluster re-balances itself automatically in the background without any downtime. If a
+Druid server fails, the system automatically routes data around the damage until the server can be replaced. Druid
+is designed to run continuously without planned downtime for any reason. This is true for configuration changes and software
+updates.
+6. **Cloud-native, fault-tolerant architecture that won't lose data.** After ingestion, Druid safely stores a copy of your data in [deep storage](architecture.md#deep-storage). Deep storage is typically cloud storage, HDFS, or a shared filesystem. You can recover your data from deep storage even in the unlikely case that all Druid servers fail. For a limited failure that affects only a few Druid servers, replication ensures that queries are still possible during system recoveries.
+7. **Indexes for quick filtering.** Druid uses [Roaring](https://roaringbitmap.org/) or
+[CONCISE](https://arxiv.org/pdf/1004.0403) compressed bitmap indexes to create indexes to enable fast filtering and searching across multiple columns.
+8. **Time-based partitioning.** Druid first partitions data by time. You can optionally implement additional partitioning based upon other fields.
+Time-based queries only access the partitions that match the time range of the query which leads to significant performance improvements.
+9. **Approximate algorithms.** Druid includes algorithms for approximate count-distinct, approximate ranking, and
+computation of approximate histograms and quantiles. These algorithms offer bounded memory usage and are often
+substantially faster than exact computations. For situations where accuracy is more important than speed, Druid also
+offers exact count-distinct and exact ranking.
+10. **Automatic summarization at ingest time.** Druid optionally supports data summarization at ingestion time. This
+summarization partially pre-aggregates your data, potentially leading to significant cost savings and performance boosts.
+
+## When to use Druid
+
+Druid is used by many companies of various sizes for many different use cases. For more information see
+[Powered by Apache Druid](/druid-powered).
+
+Druid is likely a good choice if your use case matches a few of the following:
+
+- Insert rates are very high, but updates are less common.
+- Most of your queries are aggregation and reporting queries. For example "group by" queries. You may also have searching and
+scanning queries.
+- You are targeting query latencies of 100ms to a few seconds.
+- Your data has a time component. Druid includes optimizations and design choices specifically related to time.
+- You may have more than one table, but each query hits just one big distributed table. Queries may potentially hit more
+than one smaller "lookup" table.
+- You have high cardinality data columns (for example, URLs or user IDs) and need fast counting and ranking over them.
+- You want to load data from Kafka, HDFS, flat files, or object storage like Amazon S3.
+
+Situations where you would likely _not_ want to use Druid include:
+
+- You need low-latency updates of _existing_ records using a primary key. Druid supports streaming inserts, but not streaming updates. You can perform updates using
+background batch jobs.
+- You are building an offline reporting system where query latency is not very important.
+- You want to do "big" joins, meaning joining one big fact table to another big fact table, and you are okay with these queries
+taking a long time to complete.
+
+## Learn more
+- Try the Druid [Quickstart](../tutorials/index.md).
+- Learn more about Druid components in [Design](../design/architecture.md).
+- Read about new features and improvements in [Druid Releases](https://github.com/apache/druid/releases).
diff --git a/docs/35.0.0/design/indexer.md b/docs/35.0.0/design/indexer.md
new file mode 100644
index 0000000000..4b695b290b
--- /dev/null
+++ b/docs/35.0.0/design/indexer.md
@@ -0,0 +1,96 @@
+---
+id: indexer
+layout: doc_page
+title: "Indexer service"
+sidebar_label: "Indexer"
+---
+
+
+
+:::info
+ The Indexer is an optional and experimental feature. If you're primarily performing batch ingestion, we recommend you use either the MiddleManager and Peon task execution system or [MiddleManager-less ingestion using Kubernetes](../development/extensions-core/k8s-jobs.md). If you're primarily doing streaming ingestion, you may want to try either [MiddleManager-less ingestion using Kubernetes](../development/extensions-core/k8s-jobs.md) or the Indexer service.
+:::
+
+The Apache Druid Indexer service is an alternative to the Middle Manager + Peon task execution system. Instead of forking a separate JVM process per task, the Indexer runs tasks as separate threads within a single JVM process.
+
+The Indexer is designed to be easier to configure and deploy compared to the Middle Manager + Peon system and to better enable resource sharing across tasks.
+
+## Configuration
+
+For Apache Druid Indexer service configuration, see [Indexer Configuration](../configuration/index.md#indexer).
+
+## HTTP endpoints
+
+The Indexer service shares the same HTTP endpoints as the [Middle Manager](../api-reference/service-status-api.md#middle-manager).
+
+## Running
+
+```
+org.apache.druid.cli.Main server indexer
+```
+
+## Task resource sharing
+
+The following resources are shared across all tasks running inside the Indexer service.
+
+### Query resources
+
+The query processing threads and buffers are shared across all tasks. The Indexer serves queries from a single endpoint shared by all tasks.
+
+If [query caching](../configuration/index.md#indexer-caching) is enabled, the query cache is also shared across all tasks.
+
+### Server HTTP threads
+
+The Indexer maintains two equally sized pools of HTTP threads.
+One pool is exclusively used for task control messages between the Overlord and the Indexer ("chat handler threads"). The other pool is used for handling all other HTTP requests.
+
+To configure the number of threads, use the `druid.server.http.numThreads` property. For example, if `druid.server.http.numThreads` is set to 10, there will be 10 chat handler threads and 10 non-chat handler threads.
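+
+For example, a minimal `runtime.properties` snippet sizing these pools might look like the following (the value shown is illustrative, not a recommendation):
+
+```properties
+# Creates 10 chat handler threads and 10 threads for all other HTTP requests.
+druid.server.http.numThreads=10
+```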
+
+In addition to these two pools, the Indexer allocates two separate threads for lookup handling. If lookups are not used, these threads remain idle.
+
+### Memory sharing
+
+The Indexer uses the `druid.worker.globalIngestionHeapLimitBytes` property to impose a global heap limit across all of the tasks it is running.
+
+This global limit is evenly divided across the number of task slots configured by `druid.worker.capacity`.
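+
+As an illustration (the values below are hypothetical, not recommendations), an Indexer with four task slots and a 4 GiB global ingestion heap limit gives each task slot roughly 1 GiB:
+
+```properties
+# Four task slots on this Indexer.
+druid.worker.capacity=4
+# 4 GiB global ingestion heap limit; divided evenly, each task slot gets about 1 GiB.
+druid.worker.globalIngestionHeapLimitBytes=4294967296
+```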
+
+To apply the per-task heap limit, the Indexer overrides `maxBytesInMemory` in task tuning configurations, ignoring the default value and any user-configured value. It also overrides `maxRowsInMemory` to an essentially unlimited value: the Indexer does not support row limits.
+
+By default, `druid.worker.globalIngestionHeapLimitBytes` is set to 1/6th of the available JVM heap. This default is chosen to align with the default value of `maxBytesInMemory` in task tuning configs when using the Middle Manager + Peon system, which is also 1/6th of the JVM heap.
+
+The peak usage for rows held in heap memory relates to the interaction between the `maxBytesInMemory` and `maxPendingPersists` properties in the task tuning configs. When the amount of row data held in-heap by a task reaches the limit specified by `maxBytesInMemory`, a task will persist the in-heap row data. After the persist has been started, the task can again ingest up to `maxBytesInMemory` bytes worth of row data while the persist is running.
+
+This means that the peak in-heap usage for row data can be up to approximately `maxBytesInMemory * (2 + maxPendingPersists)`. The default value of `maxPendingPersists` is 0, which allows for 1 persist to run concurrently with ingestion work.
+
+The remaining portion of the heap is reserved for query processing, segment persist/merge operations, and miscellaneous heap usage.
+
+### Concurrent segment persist/merge limits
+
+To help reduce peak memory usage, the Indexer imposes a limit on the number of concurrent segment persist/merge operations across all running tasks.
+
+By default, the number of concurrent persist/merge operations is limited to `(druid.worker.capacity / 2)`, rounded down. This limit can be configured with the `druid.worker.numConcurrentMerges` property.
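+
+For example (hypothetical values), an Indexer with eight task slots defaults to four concurrent persist/merge operations; the following sketch raises the limit explicitly:
+
+```properties
+druid.worker.capacity=8
+# Default would be 8 / 2 = 4 concurrent persist/merge operations; override to 6.
+druid.worker.numConcurrentMerges=6
+```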
+
+## Current limitations
+
+Separate task logs are not currently supported when using the Indexer; all task log messages will instead be logged in the Indexer service log.
+
+The Indexer currently imposes an identical memory limit on each task. In later releases, the per-task memory limit will be removed and only the global limit will apply. The limit on concurrent merges will also be removed.
+
+In later releases, per-task memory usage will be dynamically managed. Please see https://github.com/apache/druid/issues/7900 for details on future enhancements to the Indexer.
diff --git a/docs/35.0.0/design/indexing-service.md b/docs/35.0.0/design/indexing-service.md
new file mode 100644
index 0000000000..d7dde33ecd
--- /dev/null
+++ b/docs/35.0.0/design/indexing-service.md
@@ -0,0 +1,51 @@
+---
+id: indexing-service
+title: "Indexing Service"
+---
+
+
+
+
+The Apache Druid indexing service is a highly available, distributed service that runs indexing-related tasks.
+
+Indexing [tasks](../ingestion/tasks.md) are responsible for creating and [killing](../ingestion/tasks.md#kill) Druid [segments](../design/segments.md).
+
+The indexing service is composed of three main components: [Peons](../design/peons.md) that can run a single task, [Middle Managers](../design/middlemanager.md) that manage Peons, and an [Overlord](../design/overlord.md) that manages task distribution to Middle Managers.
+Overlords and Middle Managers may run on the same server or on separate servers, while Middle Managers and Peons always run on the same server.
+
+Tasks are managed using API endpoints on the Overlord service. Please see [Tasks API](../api-reference/tasks-api.md) for more information.
+
+
+
+## Overlord
+
+See [Overlord](../design/overlord.md).
+
+## Middle Managers
+
+See [Middle Manager](../design/middlemanager.md).
+
+## Peons
+
+See [Peon](../design/peons.md).
+
+## Tasks
+
+See [Tasks](../ingestion/tasks.md).
diff --git a/docs/35.0.0/design/metadata-storage.md b/docs/35.0.0/design/metadata-storage.md
new file mode 100644
index 0000000000..8071753f3e
--- /dev/null
+++ b/docs/35.0.0/design/metadata-storage.md
@@ -0,0 +1,175 @@
+---
+id: metadata-storage
+title: "Metadata storage"
+---
+
+
+
+
+Apache Druid relies on an external dependency for metadata storage.
+Druid uses the metadata store to house various metadata about the system, but not to store the actual data.
+The metadata store retains all metadata essential for a Druid cluster to work.
+
+The metadata store includes the following:
+- Segment records
+- Rule records
+- Configuration records
+- Task-related tables
+- Audit records
+
+Derby is the default metadata store for Druid; however, it is not suitable for production.
+[MySQL](../development/extensions-core/mysql.md) and [PostgreSQL](../development/extensions-core/postgresql.md) are metadata stores more suitable for production.
+See [Metadata storage configuration](../configuration/index.md#metadata-storage) for the default configuration settings.
+
+:::info
+ We also recommend that you set up a high-availability environment for the metadata store because there is no way to restore lost metadata.
+:::
+
+## Available metadata stores
+
+Druid supports Derby, MySQL, and PostgreSQL for storing metadata. Note that your metadata store must be ACID-compliant. If it isn't ACID-compliant, you can encounter issues, such as tasks failing sporadically.
+
+To avoid issues with upgrades that require schema changes to a large metadata table, consider a metadata store version that supports instant ADD COLUMN semantics.
+See the database-specific docs for guidance on versions.
+
+### MySQL
+
+See [mysql-metadata-storage extension documentation](../development/extensions-core/mysql.md).
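+
+As a minimal sketch (the connection details are placeholders, and the property names follow the extension documentation linked above), a MySQL-backed metadata store is typically configured with properties along these lines:
+
+```properties
+druid.extensions.loadList=["mysql-metadata-storage"]
+druid.metadata.storage.type=mysql
+# Placeholder host, database, and credentials; replace with your own.
+druid.metadata.storage.connector.connectURI=jdbc:mysql://db.example.com:3306/druid
+druid.metadata.storage.connector.user=druid
+druid.metadata.storage.connector.password=diurd
+```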
+
+### PostgreSQL
+
+See [postgresql-metadata-storage](../development/extensions-core/postgresql.md).
+
+
+### Derby
+
+:::info
+ For production clusters, consider using MySQL or PostgreSQL instead of Derby.
+:::
+
+Configure metadata storage with Derby by setting the following properties in your Druid configuration.
+
+```properties
+druid.metadata.storage.type=derby
+druid.metadata.storage.connector.connectURI=jdbc:derby://localhost:1527//opt/var/druid_state/derby;create=true
+```
+
+## Adding custom DBCP properties
+
+You can add custom properties to customize the database connection pool (DBCP) for connecting to the metadata store.
+Define these properties with a `druid.metadata.storage.connector.dbcp.` prefix.
+For example:
+
+```properties
+druid.metadata.storage.connector.dbcp.maxConnLifetimeMillis=1200000
+druid.metadata.storage.connector.dbcp.defaultQueryTimeout=30000
+```
+
+Certain properties cannot be set through `druid.metadata.storage.connector.dbcp.` and must be set with the prefix `druid.metadata.storage.connector.`:
+* `username`
+* `password`
+* `connectURI`
+* `validationQuery`
+* `testOnBorrow`
+
+See [BasicDataSource Configuration](https://commons.apache.org/proper/commons-dbcp/configuration) for a full list of configurable properties.
+
+## Metadata storage tables
+
+This section describes the various tables in metadata storage.
+
+### Segments table
+
+The name of this table is controlled by the `druid.metadata.storage.tables.segments` property.
+
+This table stores metadata about the segments that should be available in the system. (This set of segments is called
+"used segments" elsewhere in the documentation and throughout the project.) The table is polled by the
+[Coordinator](../design/coordinator.md) to determine the set of segments that should be available for querying in the
+system. The table has two main functional columns; the other columns are for indexing purposes.
+
+Value 1 in the `used` column means that the segment should be "used" by the cluster (i.e., it should be loaded and
+available for requests). Value 0 means that the segment should not be loaded into the cluster. We do this as a means of
+unloading segments from the cluster without actually removing their metadata, which allows for simpler rollback if
+that is ever needed. The `used` column has a corresponding `used_status_last_updated` column which denotes the time
+when the `used` status of the segment was last updated. This information can be used by the Coordinator to determine if
+a segment is a candidate for deletion (if automated segment killing is enabled).
+
+The `payload` column stores a JSON blob that has all of the metadata for the segment.
+Some of the data in the `payload` column intentionally duplicates data from other columns in the segments table.
+As an example, the `payload` column may take the following form:
+
+```json
+{
+ "dataSource":"wikipedia",
+ "interval":"2012-05-23T00:00:00.000Z/2012-05-24T00:00:00.000Z",
+ "version":"2012-05-24T00:10:00.046Z",
+ "loadSpec":{
+ "type":"s3_zip",
+ "bucket":"bucket_for_segment",
+ "key":"path/to/segment/on/s3"
+ },
+ "dimensions":"comma-delimited-list-of-dimension-names",
+ "metrics":"comma-delimited-list-of-metric-names",
+ "shardSpec":{"type":"none"},
+ "binaryVersion":9,
+ "size":size_of_segment,
+ "identifier":"wikipedia_2012-05-23T00:00:00.000Z_2012-05-24T00:00:00.000Z_2012-05-23T00:10:00.046Z"
+}
+```
+
+### Rule table
+
+The rule table stores the various rules about where segments should
+land. These rules are used by the [Coordinator](../design/coordinator.md)
+ when making segment (re-)allocation decisions about the cluster.
+
+### Config table
+
+The config table stores runtime configuration objects. We do not have
+many of these yet, and we are not sure if we will keep this mechanism going
+forward, but it is the beginning of a method for changing some configuration
+parameters across the cluster at runtime.
+
+### Task-related tables
+
+Task-related tables are created and used by the [Overlord](../design/overlord.md) and [Middle Manager](../design/middlemanager.md) when managing tasks.
+
+### Audit table
+
+The audit table stores the audit history for configuration changes
+such as rule changes done by [Coordinator](../design/coordinator.md) and other
+config changes.
+
+## Metadata storage access
+
+Only the following processes access the metadata storage:
+
+1. Indexing service processes (if any)
+2. Realtime processes (if any)
+3. Coordinator processes
+
+Thus, you only need to grant these machines access to the metadata storage (for example, in AWS security groups).
+
+## Learn more
+
+See the following topics for more information:
+* [Metadata storage configuration](../configuration/index.md#metadata-storage)
+* [Automated cleanup for metadata records](../operations/clean-metadata-store.md)
+
diff --git a/docs/35.0.0/design/middlemanager.md b/docs/35.0.0/design/middlemanager.md
new file mode 100644
index 0000000000..9037b56a6e
--- /dev/null
+++ b/docs/35.0.0/design/middlemanager.md
@@ -0,0 +1,43 @@
+---
+id: middlemanager
+title: "Middle Manager service"
+sidebar_label: "Middle Manager"
+---
+
+
+
+The Middle Manager service is a worker service that executes submitted tasks. Middle Managers forward tasks to [Peons](../design/peons.md) that run in separate JVMs.
+Druid uses separate JVMs for tasks to isolate resources and logs. Each Peon is capable of running only one task at a time, whereas a Middle Manager may have multiple Peons.
+
+## Configuration
+
+For Apache Druid Middle Manager service configuration, see [Middle Manager and Peons](../configuration/index.md#middle-manager-and-peon).
+
+For basic tuning guidance for the Middle Manager service, see [Basic cluster tuning](../operations/basic-cluster-tuning.md#middle-manager).
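+
+As a brief sketch (the value is illustrative), the most commonly tuned Middle Manager property is the number of task slots, which caps how many Peons the service forks concurrently:
+
+```properties
+# Maximum number of tasks (Peons) this Middle Manager runs at once.
+druid.worker.capacity=4
+```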
+
+## HTTP endpoints
+
+For a list of API endpoints supported by the Middle Manager, see the [Service status API reference](../api-reference/service-status-api.md#middle-manager).
+
+## Running
+
+```
+org.apache.druid.cli.Main server middleManager
+```
diff --git a/docs/35.0.0/design/overlord.md b/docs/35.0.0/design/overlord.md
new file mode 100644
index 0000000000..d8458e750a
--- /dev/null
+++ b/docs/35.0.0/design/overlord.md
@@ -0,0 +1,59 @@
+---
+id: overlord
+title: "Overlord service"
+sidebar_label: "Overlord"
+---
+
+
+
+
+The Overlord service is responsible for accepting tasks, coordinating task distribution, creating locks around tasks, and returning statuses to callers. The Overlord can be configured to run in one of two modes: local or remote (local is the default).
+In local mode, the Overlord is also responsible for creating Peons for executing tasks. When running the Overlord in local mode, all Middle Manager and Peon configurations must be provided as well.
+Local mode is typically used for simple workflows. In remote mode, the Overlord and Middle Manager run as separate services and you can run each on a different server.
+This mode is recommended if you intend to use the indexing service as the single endpoint for all Druid indexing.
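+
+As a sketch, and assuming the `druid.indexer.runner.type` property described in the Overlord configuration reference, switching to remote mode looks like this:
+
+```properties
+# remote dispatches tasks to separate Middle Manager services; local (the default) creates Peons in the Overlord process.
+druid.indexer.runner.type=remote
+```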
+
+## Configuration
+
+For Apache Druid Overlord service configuration, see [Overlord Configuration](../configuration/index.md#overlord).
+
+For basic tuning guidance for the Overlord service, see [Basic cluster tuning](../operations/basic-cluster-tuning.md#overlord).
+
+## HTTP endpoints
+
+For a list of API endpoints supported by the Overlord, please see the [Service status API reference](../api-reference/service-status-api.md#overlord).
+
+## Blacklisted workers
+
+If a Middle Manager has task failures above a threshold, the Overlord blacklists that Middle Manager. No more than 20% of the Middle Managers can be blacklisted. Blacklisted Middle Managers are periodically whitelisted.
+
+The following variables can be used to set the threshold and blacklist timeouts.
+
+```
+druid.indexer.runner.maxRetriesBeforeBlacklist
+druid.indexer.runner.workerBlackListBackoffTime
+druid.indexer.runner.workerBlackListCleanupPeriod
+druid.indexer.runner.maxPercentageBlacklistWorkers
+```
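+
+A hypothetical configuration (the values are illustrative only) might look like:
+
+```properties
+# Blacklist a Middle Manager after 5 consecutive task failures.
+druid.indexer.runner.maxRetriesBeforeBlacklist=5
+# Keep a blacklisted Middle Manager out of rotation for 15 minutes.
+druid.indexer.runner.workerBlackListBackoffTime=PT15M
+# Re-evaluate blacklisted Middle Managers every 5 minutes.
+druid.indexer.runner.workerBlackListCleanupPeriod=PT5M
+# Never blacklist more than 20% of Middle Managers.
+druid.indexer.runner.maxPercentageBlacklistWorkers=20
+```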
+
+## Autoscaling
+
+The autoscaling mechanisms currently in place are tightly coupled with our deployment infrastructure, but the framework should be in place for other implementations. We are highly open to new implementations or extensions of the existing mechanisms. In our own deployments, Middle Manager services are Amazon AWS EC2 nodes and they are provisioned to register themselves in a [galaxy](https://github.com/ning/galaxy) environment.
+
+If autoscaling is enabled, new Middle Managers may be added when a task has been in pending state for too long. Middle Managers may be terminated if they have not run any tasks for a period of time.
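+
+As a rough sketch, and assuming the `druid.indexer.autoscale.*` properties from the Overlord configuration reference (the values here are hypothetical), enabling autoscaling involves settings such as:
+
+```properties
+# Turn on the Overlord's autoscaling logic.
+druid.indexer.autoscale.doAutoscale=true
+# Use the EC2 provisioning strategy.
+druid.indexer.autoscale.strategy=ec2
+# Terminate Middle Managers that have been idle this long.
+druid.indexer.autoscale.workerIdleTimeout=PT90M
+# Provision new Middle Managers when tasks have been pending this long.
+druid.indexer.autoscale.pendingTaskTimeout=PT30S
+```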
diff --git a/docs/35.0.0/design/peons.md b/docs/35.0.0/design/peons.md
new file mode 100644
index 0000000000..b31bd8ec1a
--- /dev/null
+++ b/docs/35.0.0/design/peons.md
@@ -0,0 +1,48 @@
+---
+id: peons
+title: "Peon service"
+sidebar_label: "Peon"
+---
+
+
+
+The Peon service is a task execution engine spawned by the Middle Manager. Each Peon runs a separate JVM and is responsible for executing a single task. Peons always run on the same host as the Middle Manager that spawned them.
+
+## Configuration
+
+For Apache Druid Peon configuration, see [Peon Query Configuration](../configuration/index.md#peon-query-configuration) and [Additional Peon Configuration](../configuration/index.md#additional-peon-configuration).
+
+For basic tuning guidance for Middle Manager tasks, see [Basic cluster tuning](../operations/basic-cluster-tuning.md#task-configurations).
+
+## HTTP endpoints
+
+Peons run a single task in a single JVM. The Middle Manager is responsible for creating Peons for running tasks.
+Peons should rarely run on their own.
+
+## Running
+
+The Peon should seldom run separately from the Middle Manager, except for development purposes.
+
+```
+org.apache.druid.cli.Main internal peon
+```
+
+The task file contains the task JSON object.
+The status file indicates where the task status will be output.
diff --git a/docs/35.0.0/design/router.md b/docs/35.0.0/design/router.md
new file mode 100644
index 0000000000..ffe9358e48
--- /dev/null
+++ b/docs/35.0.0/design/router.md
@@ -0,0 +1,230 @@
+---
+id: router
+title: "Router service"
+sidebar_label: "Router"
+---
+
+
+
+The Router service distributes queries between different Broker services. By default, the Router routes queries based on preconfigured [data retention rules](../operations/rule-configuration.md). For example, if one month of recent data is loaded into a `hot` cluster, queries that fall within the recent month can be routed to a dedicated set of Brokers. Queries outside this range are routed to another set of Brokers. This setup provides query isolation such that queries for more important data are not impacted by queries for less important data.
+
+For query routing purposes, you should only ever need the Router service if you have a Druid cluster well into the terabyte range.
+
+In addition to query routing, the Router also runs the [web console](../operations/web-console.md), a UI for loading data, managing datasources and tasks, and viewing server status and segment information.
+
+## Configuration
+
+For Apache Druid Router service configuration, see [Router configuration](../configuration/index.md#router).
+
+For basic tuning guidance for the Router service, see [Basic cluster tuning](../operations/basic-cluster-tuning.md#router).
+
+## HTTP endpoints
+
+For a list of API endpoints supported by the Router, see [Legacy metadata API reference](../api-reference/legacy-metadata-api.md#datasource-information).
+
+## Running
+
+```
+org.apache.druid.cli.Main server router
+```
+
+## Router as management proxy
+
+You can configure the Router to forward requests to the active Coordinator or Overlord service. This may be useful for
+setting up a highly available cluster in situations where the HTTP redirect mechanism of the inactive to active
+Coordinator or Overlord service does not function correctly, such as when servers are behind a load balancer or the hostname used in the redirect is only resolvable internally.
+
+### Enable the management proxy
+
+To enable the management proxy, set the following in the Router's `runtime.properties`:
+
+```
+druid.router.managementProxy.enabled=true
+```
+
+### Management proxy routing
+
+The management proxy supports implicit and explicit routes. Implicit routes are those where the destination can be
+determined from the original request path based on Druid API path conventions. For the Coordinator the convention is
+`/druid/coordinator/*` and for the Overlord the convention is `/druid/indexer/*`. These are convenient because they mean
+that using the management proxy does not require modifying the API request other than issuing the request to the Router
+instead of the Coordinator or Overlord. Most Druid API requests can be routed implicitly.
+
+Explicit routes are those where the request to the Router contains a path prefix indicating which service the request
+should be routed to. For the Coordinator this prefix is `/proxy/coordinator` and for the Overlord it is `/proxy/overlord`.
+This is required for API calls with an ambiguous destination. For example, the `/status` API is present on all Druid
+services, so explicit routing needs to be used to indicate the proxy destination.
+
+This is summarized in the table below:
+
+|Request Route|Destination|Rewritten Route|Example|
+|-------------|-----------|---------------|-------|
+|`/druid/coordinator/*`|Coordinator|`/druid/coordinator/*`|`router:8888/druid/coordinator/v1/datasources` -> `coordinator:8081/druid/coordinator/v1/datasources`|
+|`/druid/indexer/*`|Overlord|`/druid/indexer/*`|`router:8888/druid/indexer/v1/task` -> `overlord:8090/druid/indexer/v1/task`|
+|`/proxy/coordinator/*`|Coordinator|`/*`|`router:8888/proxy/coordinator/status` -> `coordinator:8081/status`|
+|`/proxy/overlord/*`|Overlord|`/*`|`router:8888/proxy/overlord/druid/indexer/v1/isLeader` -> `overlord:8090/druid/indexer/v1/isLeader`|
+
+## Router strategies
+
+The Router has a configurable list of strategies to determine which Brokers to route queries to. The order of the strategies is important because the Broker is selected immediately after the strategy condition is satisfied.
+
+### timeBoundary
+
+```json
+{
+ "type":"timeBoundary"
+}
+```
+
+Including this strategy means all `timeBoundary` queries are always routed to the highest priority Broker.
+
+### priority
+
+```json
+{
+ "type":"priority",
+ "minPriority":0,
+ "maxPriority":1
+}
+```
+
+Queries with a priority set to less than `minPriority` are routed to the lowest priority Broker. Queries with priority set to greater than `maxPriority` are routed to the highest priority Broker. By default, `minPriority` is 0 and `maxPriority` is 1. Using these default values, if a query with priority 0 (the default query priority is 0) is sent, the query skips the priority selection logic.
+
+### manual
+
+This strategy reads the parameter `brokerService` from the query context and routes the query to that Broker service. If no valid `brokerService` is specified in the query context, the `defaultManualBrokerService` field is used to determine the target Broker service, provided that its value is valid and non-null. A value is considered valid if it is present in `druid.router.tierToBrokerMap`.
+This strategy can route both native and SQL queries.
+
+The following example strategy routes queries to the Broker `druid:broker-hot` if no valid `brokerService` is found in the query context.
+
+```json
+{
+ "type": "manual",
+ "defaultManualBrokerService": "druid:broker-hot"
+}
+```
+
+### JavaScript
+
+The JavaScript strategy allows you to define arbitrary routing rules using a JavaScript function. The function takes the configuration and the query to be executed, and returns the tier it should be routed to, or null for the default tier.
+
+The following example function sends queries containing more than three aggregators to the lowest priority Broker.
+
+```json
+{
+ "type" : "javascript",
+ "function" : "function (config, query) { if (query.getAggregatorSpecs && query.getAggregatorSpecs().size() >= 3) { var size = config.getTierToBrokerMap().values().size(); if (size > 0) { return config.getTierToBrokerMap().values().toArray()[size-1] } else { return config.getDefaultBrokerServiceName() } } else { return null } }"
+}
+```
+
+:::info
+ JavaScript-based functionality is disabled by default. Please refer to the Druid [JavaScript programming guide](../development/javascript.md) for guidelines about using Druid's JavaScript functionality, including instructions on how to enable it.
+:::
+
+## Routing of SQL queries using strategies
+
+To enable routing of SQL queries using strategies, set `druid.router.sql.enable` to `true`. The Broker service for a
+given SQL query is resolved using only the provided Router strategies. If not resolved using any of the strategies, the
+Router uses the `defaultBrokerServiceName`. This behavior is slightly different from native queries where the Router
+first tries to resolve the Broker service using strategies, then load rules and finally using the `defaultBrokerServiceName`
+if still not resolved. When `druid.router.sql.enable` is set to `false` (default value), the Router uses the
+`defaultBrokerServiceName`.
+
+Setting `druid.router.sql.enable` does not affect either Avatica JDBC requests or native queries.
+Druid always routes native queries using the strategies and load rules as documented.
+Druid always routes Avatica JDBC requests based on connection ID.
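+
+A minimal Router `runtime.properties` sketch that turns this on:
+
+```properties
+# Resolve the Broker for SQL queries using the configured Router strategies.
+druid.router.sql.enable=true
+```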
+
+## Avatica query balancing
+
+All Avatica JDBC requests with a given connection ID must be routed to the same Broker, since Druid Brokers do not share connection state with each other.
+
+To accomplish this, Druid provides two built-in balancers that use rendezvous hashing and consistent hashing of a request's connection ID respectively to assign requests to Brokers.
+
+Note that when multiple Routers are used, all Routers should have identical balancer configuration to ensure that they make the same routing decisions.
+
+### Rendezvous hash balancer
+
+This balancer uses [Rendezvous Hashing](https://en.wikipedia.org/wiki/Rendezvous_hashing) on an Avatica request's connection ID to assign the request to a Broker.
+
+To use this balancer, specify the following property:
+
+```
+druid.router.avatica.balancer.type=rendezvousHash
+```
+
+If no `druid.router.avatica.balancer` property is set, the Router defaults to using the rendezvous hash balancer.
+
+### Consistent hash balancer
+
+This balancer uses [Consistent Hashing](https://en.wikipedia.org/wiki/Consistent_hashing) on an Avatica request's connection ID to assign the request to a Broker.
+
+To use this balancer, specify the following property:
+
+```
+druid.router.avatica.balancer.type=consistentHash
+```
+
+This is a non-default implementation that is provided for experimentation purposes. The consistent hasher has longer setup times on initialization and when the set of Brokers changes, but has a faster Broker assignment time than the rendezvous hasher when tested with 5 Brokers. Benchmarks for both implementations have been provided in `ConsistentHasherBenchmark` and `RendezvousHasherBenchmark`. The consistent hasher also requires locking, while the rendezvous hasher does not.
+
+## Example production configuration
+
+In this example, we have two tiers in our production cluster: `hot` and `_default_tier`. Queries for the `hot` tier are routed through the `broker-hot` set of Brokers, and queries for the `_default_tier` are routed through the `broker-cold` set of Brokers. If any exceptions or network problems occur, queries are routed to the `broker-cold` set of brokers. In our example, we are running with a c3.2xlarge EC2 instance. We assume a `common.runtime.properties` already exists.
+
+JVM settings:
+
+```
+-server
+-Xmx13g
+-Xms13g
+-XX:NewSize=256m
+-XX:MaxNewSize=256m
+-XX:+UseConcMarkSweepGC
+-XX:+PrintGCDetails
+-XX:+PrintGCTimeStamps
+-XX:+UseLargePages
+-XX:+HeapDumpOnOutOfMemoryError
+-XX:HeapDumpPath=/mnt/galaxy/deploy/current/
+-Duser.timezone=UTC
+-Dfile.encoding=UTF-8
+-Djava.io.tmpdir=/mnt/tmp
+
+-Dcom.sun.management.jmxremote.port=17071
+-Dcom.sun.management.jmxremote.authenticate=false
+-Dcom.sun.management.jmxremote.ssl=false
+```
+
+Runtime.properties:
+
+```
+druid.host=#{IP_ADDR}:8080
+druid.plaintextPort=8080
+druid.service=druid/router
+
+druid.router.defaultBrokerServiceName=druid:broker-cold
+druid.router.coordinatorServiceName=druid:coordinator
+druid.router.tierToBrokerMap={"hot":"druid:broker-hot","_default_tier":"druid:broker-cold"}
+druid.router.http.numConnections=50
+druid.router.http.readTimeout=PT5M
+
+# Number of threads used by the Router proxy http client
+druid.router.http.numMaxThreads=100
+
+druid.server.http.numThreads=100
+```
diff --git a/docs/35.0.0/design/segments.md b/docs/35.0.0/design/segments.md
new file mode 100644
index 0000000000..6d2d9b5bad
--- /dev/null
+++ b/docs/35.0.0/design/segments.md
@@ -0,0 +1,217 @@
+---
+id: segments
+title: "Segments"
+---
+
+
+
+
+Apache Druid stores its data and indexes in *segment files* partitioned by time. Druid creates a segment for each segment interval that contains data. If an interval is empty—that is, containing no rows—no segment exists for that time interval. Druid may create multiple segments for the same interval if you ingest data for that period via different ingestion jobs. [Compaction](../data-management/compaction.md) is the Druid process that attempts to combine these segments into a single segment per interval for optimal performance.
+
+The time interval is configurable in the `segmentGranularity` parameter of the [`granularitySpec`](../ingestion/ingestion-spec.md#granularityspec).
+
+For Druid to operate well under heavy query load, it is important for the segment
+file size to be within the recommended range of 300-700 MB. If your
+segment files are larger than this range, consider changing the granularity of
+the segment time interval, partitioning your data, or adjusting the
+`targetRowsPerSegment` in your `partitionsSpec`.
+A good starting point for this parameter is 5 million rows.
+See the Sharding section below and the "Partitioning specification" section of
+the [Batch ingestion](../ingestion/hadoop.md#partitionsspec) documentation
+for more guidance.
+
+## Segment file structure
+
+Segment files are *columnar*: the data for each column is laid out in
+separate data structures. By storing each column separately, Druid decreases query latency by scanning only those columns actually needed for a query. There are three basic column types: timestamp, dimensions, and metrics.
+
+
+
+Timestamp and metrics type columns are arrays of integer or floating point values compressed with
+[LZ4](https://github.com/lz4/lz4-java). Once a query identifies which rows to select, it decompresses the relevant blocks, pulls out the needed rows, and applies the
+desired aggregation operator. If a query doesn’t require a column, Druid skips over that column's data.
+
+Dimension columns are different because they support filter and
+group-by operations, so each dimension requires the following
+three data structures:
+
+- __Dictionary__: Maps values (which are always treated as strings) to integer IDs, allowing compact representation of the list and bitmap values.
+- __List__: The column’s values, encoded using the dictionary. Required for GroupBy and TopN queries. Because only these query types need the list, queries that solely aggregate metrics based on filters can run without accessing it.
+- __Bitmap__: One bitmap for each distinct value in the column, to indicate which rows contain that value. Bitmaps allow for quick filtering operations because they are convenient for quickly applying AND and OR operators. Also known as inverted indexes.
+
+To get a better sense of these data structures, consider the "Page" column from the example data above, represented by the following data structures:
+
+```
+1: Dictionary
+ {
+ "Justin Bieber": 0,
+ "Ke$ha": 1
+ }
+
+2: List of column data
+ [0,
+ 0,
+ 1,
+ 1]
+
+3: Bitmaps
+ value="Justin Bieber": [1,1,0,0]
+ value="Ke$ha": [0,0,1,1]
+```
+
+Note that the bitmap is different from the dictionary and list data structures: the dictionary and list grow linearly with the size of the data, but the size of the bitmap section is the product of data size and column cardinality. That is, there is one bitmap per separate column value. Columns with the same value share the same bitmap.
+
+For each row in the list of column data, there is only a single bitmap that has a non-zero entry. This means that high cardinality columns have extremely sparse, and therefore highly compressible, bitmaps. Druid exploits this using compression algorithms that are specially suited for bitmaps, such as [Roaring bitmap compression](https://github.com/RoaringBitmap/RoaringBitmap).
+
+## Handling null values
+
+If any row of a string column contains a null value, the null is always stored as dictionary ID 0, the first position in the value dictionary, with an associated entry in the bitmap value indexes that is used to filter null values. Numeric columns also store a null value bitmap index marking the null-valued rows; it is used for null checks in aggregations and for filters that match null values.
+
+## Segments with different schemas
+
+Druid segments for the same datasource may have different schemas. If a string column (dimension) exists in one segment but not another, queries that involve both segments still work. In default mode, queries for the segment without the dimension behave as if the dimension contains only blank values. In SQL-compatible mode, queries for the segment without the dimension behave as if the dimension contains only null values. Similarly, if one segment has a numeric column (metric) but another does not, queries on the segment without the metric generally operate as expected. Aggregations over the missing metric operate as if the metric doesn't exist.
+
+## Column format
+
+Each column is stored as two parts:
+
+- A Jackson-serialized `ColumnDescriptor`.
+- The binary data for the column.
+
+The `ColumnDescriptor` is a Jackson-serialized instance of the internal Druid `ColumnDescriptor` class. It allows the use of Jackson's polymorphic deserialization to add new and interesting methods of serialization with minimal impact to the code. It consists of some metadata about the column (for example: type, whether it's multi-value) and a list of serialization/deserialization logic that can deserialize the rest of the binary.
+
+### Multi-value columns
+
+A multi-value column allows a single row to contain multiple strings for a column. You can think of it as an array of strings. If a datasource uses multi-value columns, then the data structures within the segment files look a bit different. Let's imagine that in the example above, the second row is tagged with both the `Ke$ha` *and* `Justin Bieber` topics, as follows:
+
+```
+1: Dictionary
+ {
+ "Justin Bieber": 0,
+ "Ke$ha": 1
+ }
+
+2: List of column data
+ [0,
+ [0,1], <--Row value in a multi-value column can contain an array of values
+ 1,
+ 1]
+
+3: Bitmaps
+ value="Justin Bieber": [1,1,0,0]
+ value="Ke$ha": [0,1,1,1]
+ ^
+ |
+ |
+ Multi-value column contains multiple non-zero entries
+```
+
+Note the changes to the second row in the list of column data and the `Ke$ha`
+bitmap. If a row has more than one value for a column, its entry in
+the list is an array of values. Additionally, a row with *n* values in the list has *n* non-zero valued entries in bitmaps.
+
+## Compression
+
+Druid uses LZ4 by default to compress blocks of values for string, long, float, and double columns. Druid uses Roaring to compress bitmaps for string columns and numeric null values. We recommend that you use these defaults unless you've experimented with your data and query patterns suggest that non-default options will perform better in your specific case.
+
+Druid also supports Concise bitmap compression. For string column bitmaps, the differences between using Roaring and Concise are most pronounced for high cardinality columns. In this case, Roaring is substantially faster on filters that match many values, but in some cases Concise can have a lower footprint due to the overhead of the Roaring format (but is still slower when many values are matched). You configure compression at the segment level, not for individual columns. See [IndexSpec](../ingestion/ingestion-spec.md#indexspec) for more details.
+
+## Segment identification
+
+Segment identifiers typically contain the segment datasource, interval start time (in ISO 8601 format), interval end time (in ISO 8601 format), and version information. If data is additionally sharded beyond a time range, the segment identifier also contains a partition number:
+
+`datasource_intervalStart_intervalEnd_version_partitionNum`
+
+### Segment ID examples
+
+The increasing partition numbers in the following segments indicate that multiple segments exist for the same interval:
+
+```
+foo_2015-01-01/2015-01-02_v1_0
+foo_2015-01-01/2015-01-02_v1_1
+foo_2015-01-01/2015-01-02_v1_2
+```
+
+If you reindex the data with a new schema, Druid allocates a new version ID to the newly created segments:
+
+```
+foo_2015-01-01/2015-01-02_v2_0
+foo_2015-01-01/2015-01-02_v2_1
+foo_2015-01-01/2015-01-02_v2_2
+```
+
+## Sharding
+
+Multiple segments can exist for a single time interval and datasource. These segments form a `block` for an interval. Depending on the type of `shardSpec` used to shard the data, Druid queries may only complete if a `block` is complete. For example, if a block consists of the following three segments:
+
+```
+sampleData_2011-01-01T02:00:00.000Z_2011-01-01T03:00:00.000Z_v1_0
+sampleData_2011-01-01T02:00:00.000Z_2011-01-01T03:00:00.000Z_v1_1
+sampleData_2011-01-01T02:00:00.000Z_2011-01-01T03:00:00.000Z_v1_2
+```
+
+All three segments must load before a query for the interval `2011-01-01T02:00:00.000Z/2011-01-01T03:00:00.000Z` can complete.
+
+Linear shard specs are an exception to this rule. Linear shard specs do not enforce "completeness" so queries can complete even if shards are not completely loaded.
+
+For example, if a real-time ingestion creates three segments that were sharded with linear shard spec, and only two of the segments are loaded, queries return results for those two segments.
+
+## Segment components
+
+A segment contains several files:
+
+* `version.bin`
+
+ 4 bytes representing the current segment version as an integer. For example, for v9 segments the version is 0x0, 0x0, 0x0, 0x9.
+
+* `meta.smoosh`
+
+ A file containing metadata (filenames and offsets) about the contents of the other `smoosh` files.
+
+* `XXXXX.smoosh`
+
+ Smoosh (`.smoosh`) files contain concatenated binary data. This file consolidation reduces the number of file descriptors that must be open when accessing data. The files are 2 GB or less in size to remain within the limit of a memory-mapped `ByteBuffer` in Java.
+ Smoosh files contain the following:
+ - Individual files for each column in the data, including one for the `__time` column that refers to the timestamp of the segment.
+ - An `index.drd` file that contains additional segment metadata.
+
+In the codebase, segments have an internal format version. The current segment format version is `v9`.
+
+## Implications of updating segments
+
+Druid uses versioning to manage updates to create a form of multi-version concurrency control (MVCC). These MVCC versions are distinct from the segment format version discussed above.
+
+Note that updates that span multiple segment intervals are only atomic within each interval. They are not atomic across the entire update. For example, if you have the following segments:
+
+```
+foo_2015-01-01/2015-01-02_v1_0
+foo_2015-01-02/2015-01-03_v1_1
+foo_2015-01-03/2015-01-04_v1_2
+```
+
+Suppose you then overwrite this data with `v2` segments covering the same intervals. The `v2` segments are loaded into the cluster as soon as they are built and replace the `v1` segments for the period of time the segments overlap. Before the `v2` segments are completely loaded, the cluster may contain a mixture of `v1` and `v2` segments:
+
+```
+foo_2015-01-01/2015-01-02_v1_0
+foo_2015-01-02/2015-01-03_v2_1
+foo_2015-01-03/2015-01-04_v1_2
+```
+
+In this case, queries may hit a mixture of `v1` and `v2` segments.
diff --git a/docs/35.0.0/design/storage.md b/docs/35.0.0/design/storage.md
new file mode 100644
index 0000000000..365819639e
--- /dev/null
+++ b/docs/35.0.0/design/storage.md
@@ -0,0 +1,140 @@
+---
+id: storage
+title: "Storage overview"
+sidebar_label: "Storage"
+---
+
+
+
+
+Druid stores data in datasources, which are similar to tables in a traditional RDBMS. Each datasource is partitioned by time and, optionally, further partitioned by other attributes. Each time range is called a chunk (for example, a single day, if your datasource is partitioned by day). Within a chunk, data is partitioned into one or more [segments](../design/segments.md). Each segment is a single file, typically comprising up to a few million rows of data. Since segments are organized into time chunks, it's sometimes helpful to think of segments as living on a timeline like the following:
+
+
+
+A datasource may have anywhere from just a few segments, up to hundreds of thousands and even millions of segments. Each segment is created by a Middle Manager as mutable and uncommitted. Data is queryable as soon as it is added to an uncommitted segment. The segment building process accelerates later queries by producing a data file that is compact and indexed:
+
+- Conversion to columnar format
+- Indexing with bitmap indexes
+- Compression
+ - Dictionary encoding with id storage minimization for String columns
+ - Bitmap compression for bitmap indexes
+ - Type-aware compression for all columns
+
+Periodically, segments are committed and published to [deep storage](deep-storage.md), become immutable, and move from Middle Managers to the Historical services. An entry about the segment is also written to the [metadata store](metadata-storage.md). This entry is a self-describing bit of metadata about the segment, including things like the schema of the segment, its size, and its location on deep storage. These entries tell the Coordinator what data is available on the cluster.
+
+For details on the segment file format, see [segment files](segments.md).
+
+For details on modeling your data in Druid, see [schema design](../ingestion/schema-design.md).
+
+## Indexing and handoff
+
+Indexing is the mechanism by which new segments are created, and handoff is the mechanism by which they are published and served by Historical services.
+
+On the indexing side:
+
+1. An indexing task starts running and building a new segment. It must determine the identifier of the segment before it starts building it. For a task that is appending (like a Kafka task, or an index task in append mode) this is done by calling an "allocate" API on the Overlord to potentially add a new partition to an existing set of segments. For
+a task that is overwriting (like a Hadoop task, or an index task not in append mode) this is done by locking an interval and creating a new version number and new set of segments.
+2. If the indexing task is a realtime task (like a Kafka task) then the segment is immediately queryable at this point. It's available, but unpublished.
+3. When the indexing task has finished reading data for the segment, it pushes it to deep storage and then publishes it by writing a record into the metadata store.
+4. If the indexing task is a realtime task, then to ensure data is continuously available for queries, it waits for a Historical service to load the segment. If the indexing task is not a realtime task, it exits immediately.
+
+On the Coordinator / Historical side:
+
+1. The Coordinator polls the metadata store periodically (by default, every 1 minute) for newly published segments.
+2. When the Coordinator finds a segment that is published and used, but unavailable, it chooses a Historical service to load that segment and instructs that Historical to do so.
+3. The Historical loads the segment and begins serving it.
+4. At this point, if the indexing task was waiting for handoff, it will exit.
+
+## Segment identifiers
+
+Segments all have a four-part identifier with the following components:
+
+- Datasource name.
+- Time interval for the time chunk containing the segment; this corresponds to the `segmentGranularity` specified at ingestion time. Uses the same format as [query granularity](../querying/granularities.md).
+- Version number (generally an ISO8601 timestamp corresponding to when the segment set was first started).
+- Partition number (an integer, unique within a datasource+interval+version; may not necessarily be contiguous).
+
+For example, this is the identifier for a segment in datasource `clarity-cloud0`, time chunk
+`2018-05-21T16:00:00.000Z/2018-05-21T17:00:00.000Z`, version `2018-05-21T15:56:09.909Z`, and partition number 1:
+
+```
+clarity-cloud0_2018-05-21T16:00:00.000Z_2018-05-21T17:00:00.000Z_2018-05-21T15:56:09.909Z_1
+```
+
+Segments with partition number 0 (the first partition in a chunk) omit the partition number, like the following example, which is a segment in the same time chunk as the previous one, but with partition number 0 instead of 1:
+
+```
+clarity-cloud0_2018-05-21T16:00:00.000Z_2018-05-21T17:00:00.000Z_2018-05-21T15:56:09.909Z
+```
+
+## Segment versioning
+
+The version number provides a form of [multi-version concurrency control](https://en.wikipedia.org/wiki/Multiversion_concurrency_control) (MVCC) to support batch-mode overwriting. If all you ever do is append data, then there will be just a single version for each time chunk. But when you overwrite data, Druid will seamlessly switch from querying the old version to instead query the new, updated versions. Specifically, a new set of segments is created with the same datasource, same time interval, but a higher version number. This is a signal to the rest of the Druid system that the older version should be removed from the cluster, and the new version should replace it.
+
+The switch appears to happen instantaneously to a user, because Druid handles this by first loading the new data (but not allowing it to be queried), and then, as soon as the new data is all loaded, switching all new queries to use those new segments. Then it drops the old segments a few minutes later.
+
+## Segment lifecycle
+
+Each segment has a lifecycle that involves the following three major areas:
+
+1. **Metadata store:** Segment metadata (a small JSON payload generally no more than a few KB) is stored in the [metadata store](metadata-storage.md) once a segment is done being constructed. The act of inserting a record for a segment into the metadata store is called publishing. These metadata records have a boolean flag named `used`, which controls whether the segment is intended to be queryable or not. Segments created by realtime tasks will be
+available before they are published, since they are only published when the segment is complete and will not accept any additional rows of data.
+2. **Deep storage:** Segment data files are pushed to deep storage once a segment is done being constructed. This happens immediately before publishing metadata to the metadata store.
+3. **Availability for querying:** Segments are available for querying from some Druid data server, such as a realtime task or a Historical service, or directly from deep storage.
+
+You can inspect the state of currently active segments using the Druid SQL
+[`sys.segments` table](../querying/sql-metadata-tables.md#segments-table). It includes the following flags:
+
+- `is_published`: True if segment metadata has been published to the metadata store and `used` is true.
+- `is_available`: True if the segment is currently available for querying, either on a realtime task or Historical service.
+- `is_realtime`: True if the segment is only available on realtime tasks. For datasources that use realtime ingestion, this will generally start off `true` and then become `false` as the segment is published and handed off.
+- `is_overshadowed`: True if the segment is published (with `used` set to true) and is fully overshadowed by some other published segments. Generally this is a transient state, and segments in this state will soon have their `used` flag automatically set to false.
+
+## Availability and consistency
+
+Druid has an architectural separation between ingestion and querying, as described above in
+[Indexing and handoff](#indexing-and-handoff). This means that when understanding Druid's availability and consistency properties, we must look at each function separately.
+
+On the ingestion side, Druid's primary [ingestion methods](../ingestion/index.md#ingestion-methods) are all pull-based and offer transactional guarantees. This means that you are guaranteed that ingestion using these methods will publish in an all-or-nothing manner:
+
+- Supervised "seekable-stream" ingestion methods like [Kafka](../ingestion/kafka-ingestion.md) and [Kinesis](../ingestion/kinesis-ingestion.md). With these methods, Druid commits stream offsets to its [metadata store](metadata-storage.md) alongside segment metadata, in the same transaction. Note that ingestion of data that has not yet been published can be rolled back if ingestion tasks fail. In this case, partially-ingested data is
+discarded, and Druid will resume ingestion from the last committed set of stream offsets. This ensures exactly-once publishing behavior.
+- [Hadoop-based batch ingestion](../ingestion/hadoop.md). Each task publishes all segment metadata in a single transaction.
+- [Native batch ingestion](../ingestion/native-batch.md). In parallel mode, the supervisor task publishes all segment metadata in a single transaction after the subtasks are finished. In simple (single-task) mode, the single task publishes all segment metadata in a single transaction after it is complete.
+
+Additionally, some ingestion methods offer an _idempotency_ guarantee. This means that repeated executions of the same ingestion will not cause duplicate data to be ingested:
+
+- Supervised "seekable-stream" ingestion methods like [Kafka](../ingestion/kafka-ingestion.md) and [Kinesis](../ingestion/kinesis-ingestion.md) are idempotent due to the fact that stream offsets and segment metadata are stored together and updated in lock-step.
+- [Hadoop-based batch ingestion](../ingestion/hadoop.md) is idempotent unless one of your input sources is the same Druid datasource that you are ingesting into. In this case, running the same task twice is non-idempotent, because you are adding to existing data instead of overwriting it.
+- [Native batch ingestion](../ingestion/native-batch.md) is idempotent unless
+[`appendToExisting`](../ingestion/native-batch.md) is true, or one of your input sources is the same Druid datasource that you are ingesting into. In either of these two cases, running the same task twice is non-idempotent, because you are adding to existing data instead of overwriting it.
+
+On the query side, the Druid Broker is responsible for ensuring that a consistent set of segments is involved in a given query. It selects the appropriate set of segment versions to use when the query starts based on what is currently available. This is supported by atomic replacement, a feature that ensures that from a user's perspective, queries flip instantaneously from an older version of data to a newer set of data, with no consistency or performance impact.
+This is used for Hadoop-based batch ingestion, native batch ingestion when `appendToExisting` is false, and compaction.
+
+Note that atomic replacement happens for each time chunk individually. If a batch ingestion task or compaction involves multiple time chunks, then each time chunk will undergo atomic replacement soon after the task finishes, but the replacements will not all happen simultaneously.
+
+Typically, atomic replacement in Druid is based on a core set concept that works in conjunction with segment versions.
+When a time chunk is overwritten, a new core set of segments is created with a higher version number. The core set must all be available before the Broker will use them instead of the older set. There can also only be one core set per version per time chunk. Druid will also only use a single version at a time per time chunk. Together, these properties provide Druid's atomic replacement guarantees.
+
+Druid also supports an experimental segment locking mode that is activated by setting
+[`forceTimeChunkLock`](../ingestion/tasks.md#context-parameters) to false in the context of an ingestion task. In this case, Druid creates an atomic update group using the existing version for the time chunk, instead of creating a new core set with a new version number. There can be multiple atomic update groups with the same version number per time chunk. Each one replaces a specific set of earlier segments in the same time chunk and with the same version number. Druid will query the latest one that is fully available. This is a more powerful version of the core set concept, because it enables atomically replacing a subset of data for a time chunk, as well as doing atomic replacement and appending simultaneously.
+
+If segments become unavailable due to multiple Historicals going offline simultaneously (beyond your replication factor), then Druid queries will include only the segments that are still available. In the background, Druid will reload these unavailable segments on other Historicals as quickly as possible, at which point they will be included in queries again.
diff --git a/docs/35.0.0/design/zookeeper.md b/docs/35.0.0/design/zookeeper.md
new file mode 100644
index 0000000000..a01f12ed74
--- /dev/null
+++ b/docs/35.0.0/design/zookeeper.md
@@ -0,0 +1,76 @@
+---
+id: zookeeper
+title: "ZooKeeper"
+---
+
+
+
+
+Apache Druid uses [Apache ZooKeeper](http://zookeeper.apache.org/) (ZK) for management of current cluster state.
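+
+As a minimal sketch (hosts are placeholders; `druid.zk.service.host` is the standard connection property), every Druid service points at the same ZooKeeper ensemble through its common runtime properties:
+
+```properties
+# Comma-separated ZooKeeper ensemble.
+druid.zk.service.host=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
+# Base znode under which Druid creates its ZooKeeper paths.
+druid.zk.paths.base=/druid
+```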
+
+## Minimum ZooKeeper versions
+
+Apache Druid supports ZooKeeper versions 3.5.x and above.
+
+:::info
+ Note: Starting with Apache Druid 0.22.0, support for ZooKeeper 3.4.x has been removed.
+ Starting with Apache Druid 31.0.0, support for ZooKeeper-based segment loading has been removed.
+:::
+
+## ZooKeeper Operations
+
+The operations that happen over ZK are:
+
+1. [Coordinator](../design/coordinator.md) leader election
+2. Segment "publishing" protocol from [Historical](../design/historical.md)
+3. [Overlord](../design/overlord.md) leader election
+4. [Overlord](../design/overlord.md) and [Middle Manager](../design/middlemanager.md) task management
+
+## Coordinator Leader Election
+
+We use the Curator [LeaderLatch](https://curator.apache.org/curator-recipes/leader-latch.html) recipe to perform leader election at path
+
+```
+${druid.zk.paths.coordinatorPath}/_COORDINATOR
+```
+
+## Segment "publishing" protocol from Historical and Realtime
+
+The `announcementsPath` and `liveSegmentsPath` are used for this.
+
+All [Historical](../design/historical.md) processes publish themselves on the `announcementsPath`. Specifically, they create an ephemeral znode at
+
+```
+${druid.zk.paths.announcementsPath}/${druid.host}
+```
+
+This signifies that they exist. They also subsequently create a permanent znode at
+
+```
+${druid.zk.paths.liveSegmentsPath}/${druid.host}
+```
+
+As they load segments, they attach ephemeral znodes that look like
+
+```
+${druid.zk.paths.liveSegmentsPath}/${druid.host}/_segment_identifier_
+```
+
+Processes like the [Coordinator](../design/coordinator.md) and [Broker](../design/broker.md) can then watch these paths to see which processes are currently serving which segments.
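+
+For example, with the default ZooKeeper path configuration (where `druid.zk.paths.base` is `/druid`), these paths typically resolve to `/druid/announcements` and `/druid/segments`, and a Historical running on `historical-host:8083` might create znodes such as the following. The host name and segment identifier here are illustrative placeholders.
+
+```
+/druid/announcements/historical-host:8083
+/druid/segments/historical-host:8083
+/druid/segments/historical-host:8083/wikipedia_2015-09-12T00:00:00.000Z_2015-09-13T00:00:00.000Z_2023-01-01T00:00:00.000Z
+```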
diff --git a/docs/35.0.0/development/build.md b/docs/35.0.0/development/build.md
new file mode 100644
index 0000000000..3bfaca192d
--- /dev/null
+++ b/docs/35.0.0/development/build.md
@@ -0,0 +1,110 @@
+---
+id: build
+title: "Build from source"
+---
+
+
+
+
+You can build Apache Druid directly from source. Use the version of this page
+that matches the version you want to build.
+For building the latest code in master, follow the latest version of this page
+[here](https://github.com/apache/druid/blob/master/docs/development/build.md):
+make sure it has `/master/` in the URL.
+
+## Prerequisites
+
+### Installing Java and Maven
+
+- See our [Java documentation](../operations/java.md) for information about obtaining a supported JDK
+- [Maven version 3.x](http://maven.apache.org/download.cgi)
+
+### Other Dependencies
+
+- Distribution builds require Python 3.x and the `pyyaml` module.
+- Integration tests require `pyyaml` version 5.1 or later.
+
+## Downloading the Source Code
+
+```bash
+git clone git@github.com:apache/druid.git
+cd druid
+```
+
+## Building from Source
+
+The basic command to build Druid from source is:
+
+```bash
+mvn clean install
+```
+
+This will run static analysis, unit tests, compile classes, and package the projects into JARs. It will _not_ generate the source or binary distribution tarball. Note that this build may take some time to complete.
+
+In addition to the basic stages, you may also want to add the following profiles and properties:
+
+- **-Pdist** - Distribution profile: Generates the binary distribution tarball by pulling in core extensions and dependencies and packaging the files as `distribution/target/apache-druid-x.x.x-bin.tar.gz`
+- **-Papache-release** - Apache release profile: Generates GPG signature and checksums, and builds the source distribution tarball as `distribution/target/apache-druid-x.x.x-src.tar.gz`
+- **-Prat** - Apache Rat profile: Runs the Apache Rat license audit tool
+- **-DskipTests** - Skips unit tests (which reduces build time)
+- **-Dweb.console.skip=true** - Skip front end project
+
+Putting these together, if you wish to build the source and binary distributions with signatures and checksums, audit licenses, and skip the unit tests, you would run:
+
+```bash
+mvn clean install -Papache-release,dist,rat -DskipTests
+```
+
+### Building for Development
+
+For development, use only the dist profile and skip the Apache release and Apache rat profiles.
+
+```bash
+mvn clean install -Pdist -DskipTests
+```
+
+If you want to speed up the build even more, you can enable parallel building with the `-T1C` option and skip some static analysis checks.
+
+```bash
+mvn clean install -Pdist -T1C -DskipTests -Dforbiddenapis.skip=true -Dcheckstyle.skip=true -Dpmd.skip=true -Dmaven.javadoc.skip=true -Denforcer.skip=true
+```
+
+You can expect to find the distribution tarball under the `distribution/target` directory.
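+
+For example, after a build with the `dist` profile, you might locate and unpack the tarball as follows. The exact file name depends on the Druid version you built, and the extraction directory is just an example.
+
+```bash
+# List the generated binary distribution tarball
+ls distribution/target/apache-druid-*-bin.tar.gz
+
+# Extract it to a working directory of your choice
+tar -xzf distribution/target/apache-druid-*-bin.tar.gz -C /tmp
+```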
+
+## Potential issues
+
+### Missing `pyyaml`
+
+You are building Druid from source following the instructions on this page, but the build fails with an error like:
+```
+[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.6.0:exec (generate-binary-license) on project distribution: Command execution failed.: Process exited with an error: 1 (Exit value: 1) -> [Help 1]
+```
+
+Resolution: Make sure you have Python 3.x installed as well as the `pyyaml` module:
+
+```bash
+pip install pyyaml
+```
+
+On some systems, ensure you use the Python 3.x version of `pip`:
+
+```bash
+pip3 install pyyaml
+```
diff --git a/docs/35.0.0/development/docs-contribute.md b/docs/35.0.0/development/docs-contribute.md
new file mode 100644
index 0000000000..270713e158
--- /dev/null
+++ b/docs/35.0.0/development/docs-contribute.md
@@ -0,0 +1,227 @@
+---
+id: contribute-to-docs
+title: "Contribute to Druid docs"
+---
+
+
+
+Apache Druid is a [community-led project](https://druid.apache.org/community/). We are delighted to receive contributions to the docs ranging from minor fixes to big new features.
+
+Druid docs contributors:
+
+* Improve existing content
+* Create new content
+
+## Getting started
+
+Druid docs contributors can open an issue about documentation, or contribute a change with a pull request (PR).
+
+The open source Druid docs are located here:
+https://druid.apache.org/docs/latest/design/index.html
+
+If you need to update a Druid doc, locate and update the doc in the Druid repo following the instructions below.
+
+## Druid repo branches
+
+The Druid team works on the `master` branch and then branches for a release, such as `26.0.0`.
+
+See [`CONTRIBUTING.md`](https://github.com/apache/druid/blob/master/CONTRIBUTING.md) for instructions on contributing to Apache Druid.
+
+## Before you begin
+
+Before you can contribute to the Druid docs for the first time, you must complete the following steps:
+
+1. Fork the [Druid repo](https://github.com/apache/druid). Your fork will be the `origin` remote.
+2. Clone your fork:
+
+ ```bash
+ git clone git@github.com:GITHUB_USERNAME/druid.git
+ ```
+
+ Replace `GITHUB_USERNAME` with your GitHub username.
+3. In the directory where you cloned your fork, set up `apache/druid` as your remote `upstream` repo:
+
+ ```bash
+ git remote add upstream https://github.com/apache/druid.git
+ ```
+
+4. Confirm that your fork shows up as the origin repo and `apache/druid` shows up as the upstream repo:
+
+ ```bash
+ git remote -v
+ ```
+
+5. Verify that you have your email configured for GitHub:
+
+ ```bash
+ git config user.email
+ ```
+
+ If you need to set your email, see the [GitHub instructions](https://docs.github.com/en/github-ae@latest/account-and-profile/setting-up-and-managing-your-github-user-account/managing-email-preferences/setting-your-commit-email-address#setting-your-commit-email-address-in-git).
+
+6. Install Docusaurus so that you can build the site locally. Run either `npm install` or `yarn install` in the `website` directory.
+
+## Contributing
+
+Before you contribute, make sure your local branch of `master` and the upstream Apache branch are up-to-date and in sync. This can help you avoid merge conflicts. Run the following commands on your fork's `master` branch:
+
+```bash
+git fetch origin
+git fetch upstream
+```
+
+Then run either one of the following commands:
+
+```bash
+git rebase upstream/master
+# or
+git merge upstream/master
+```
+
+Now you're up to date, and you can make your changes.
+
+1. Create your working branch:
+
+ ```bash
+ git checkout -b MY-BRANCH
+ ```
+
+ Provide a name for your feature branch in `MY-BRANCH`.
+
+2. Find the file that you want to make changes to. All the source files for the docs are written in Markdown and located in the `docs` directory. The URL for the page includes the subdirectory the source file is in. For example, the SQL-based ingestion tutorial found at `https://druid.apache.org/docs/latest/tutorials/tutorial-msq-extern.html` is in the `tutorials` subdirectory.
+
+ If you're adding a page, create a new Markdown file in the appropriate subdirectory. Then, copy the front matter and Apache license from an existing file. Update the `title` and `id` fields. Don't forget to add it to `website/sidebars.json` so that your new page shows up in the navigation.
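+
+   For example, the front matter of a new page might look like the following. The `id` and `title` values here are hypothetical placeholders; use values that match your page.
+
+   ```markdown
+   ---
+   id: my-new-page
+   title: "My new page"
+   ---
+   ```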
+
+3. Test changes locally by building the site and navigating to your changes. In the `website` directory, run `npm run start`. By default, this starts the site on `localhost:3000`. If port `3000` is already in use, it'll increment the port number from there.
+
+4. Use the following commands to run the link and spellcheckers locally:
+
+ ```bash
+ cd website
+ # You only need to install once
+ npm install
+ npm run build
+
+ npm run spellcheck
+ npm run link-lint
+ ```
+
+   This step can save you time during the review process since these checks run faster locally than the GitHub Actions versions and warn you of issues before you create a PR.
+
+5. Push your changes to your fork:
+
+ ```bash
+ git push --set-upstream origin MY-BRANCH
+ ```
+
+6. Go to the Druid repo. GitHub should recognize that you have a new branch in your fork. Create a pull request from your Druid fork and branch to the `master` branch in the Apache Druid repo.
+
+The pull request template is extensive. You may not need all the information there, so feel free to delete unneeded sections as you fill it out. Once you create the pull request, GitHub automatically labels the issue so that reviewers can take a look.
+
+The docs go through a review process similar to the code where community members will offer feedback. Once the review process is complete and your changes are merged, they'll be available on the live site when the site gets republished.
+
+## Style guide
+
+Consistent style, formatting, and tone make documentation easier to consume.
+For the majority of style considerations, the Apache Druid documentation follows the [Google Developer Documentation Style Guide](https://developers.google.com/style).
+The style guide should serve as a point of reference to enable contributors and reviewers to maintain documentation quality.
+
+### Notable style exceptions
+
+In some cases, Google Style might make the Druid docs more difficult to read and understand. This section highlights those exceptions.
+
+#### SQL keyword syntax
+
+For SQL keywords and functions, use all caps, but do not use code font.
+
+:::tip
+
+**Correct**
+
+The UNNEST clause unnests array values.
+
+**Incorrect**
+
+The \`UNNEST\` clause unnests array values.
+:::
+
+#### Optional parameters and arguments
+
+For optional parameters and arguments, enclose the optional parameter and leading command in brackets.
+
+:::tip
+
+**Correct**
+
+HUMAN_READABLE_BINARY_BYTE_FORMAT(value[, precision])
+
+**Incorrect**
+
+HUMAN_READABLE_BINARY_BYTE_FORMAT(value, \[precision])
+:::
+
+#### Markdown table format
+
+When editing or adding tables, do not include extra characters to "prettify" the table format within the Markdown source.
+Some code editors may format tables by default.
+See the developer [style guide](https://github.com/apache/druid/blob/master/dev/style-conventions.md) for more information.
+
+:::tip
+
+**Correct**
+
+```markdown
+| Column 1 | Column 2 | Column 3 |
+| --- | --- | --- |
+| value 1 | val 2 | a-very-long-value 3 |
+```
+
+**Incorrect**
+
+```markdown
+| Column 1 | Column 2 | Column 3 |
+| -------- | -------- | ------------------- |
+| value 1 | val 2 | a-very-long-value 3 |
+```
+
+:::
+
+### Style checklist
+
+Before publishing new content or updating an existing topic, you can audit your documentation using the following checklist to make sure your contributions align with existing documentation:
+
+* Use descriptive link text. If a link downloads a file, make sure to indicate this action.
+* Use present tense where possible.
+* Avoid negative constructions when possible. In other words, try to tell people what they should do instead of what they shouldn't.
+* Use clear and direct language.
+* Use descriptive headings and titles.
+* Avoid using a present participle or gerund as the first word in a heading or title. A shortcut for this is to not start with a word that ends in `-ing`. For example, don't use "Configuring Druid." Use "Configure Druid."
+* Use sentence case in document titles and headings.
+* Don’t use images of text or code samples.
+* Use SVG over PNG for images if you can.
+* Provide alt text or an equivalent text explanation with each image.
+* Use the appropriate text-formatting. For example, make sure code snippets and property names are in code font and UI elements are bold. Generally, you should avoid using bold or italics to emphasize certain words unless there's a good reason.
+* Put conditional clauses before instructions. In the following example, "to drop a segment" is the conditional clause: to drop a segment, do the following.
+* Avoid gender-specific pronouns; use "they" instead.
+* Use the second person singular: "you" instead of "we."
+* When American spelling is different from Commonwealth/"British" spelling, use the American spelling.
+* Don’t use terms considered disrespectful. Refer to a list like Google’s [Word list](https://developers.google.com/style/word-list) for guidance and alternatives.
+* Use straight quotation marks and straight apostrophes instead of the curly versions.
+* Introduce a list, a table, or a procedure with an introductory sentence that prepares the reader for what they're about to read.
diff --git a/docs/35.0.0/development/experimental.md b/docs/35.0.0/development/experimental.md
new file mode 100644
index 0000000000..96a9b5085e
--- /dev/null
+++ b/docs/35.0.0/development/experimental.md
@@ -0,0 +1,37 @@
+---
+id: experimental
+title: "Experimental features"
+---
+
+
+
+
+Features often start out in "experimental" status, which indicates that they are still evolving.
+This can mean any of the following things:
+
+1. The feature's API may change even in minor releases or patch releases.
+2. The feature may have known "missing" pieces that will be added later.
+3. The feature may or may not have received full battle-testing in production environments.
+
+All experimental features are optional.
+
+Note that not all of these points apply to every experimental feature. Some have been battle-tested in terms of
+implementation, but are still marked experimental due to an evolving API. Please check the documentation for each
+feature for full details.
diff --git a/docs/35.0.0/development/extensions-contrib/aliyun-oss-extensions.md b/docs/35.0.0/development/extensions-contrib/aliyun-oss-extensions.md
new file mode 100644
index 0000000000..ab0573bdc4
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/aliyun-oss-extensions.md
@@ -0,0 +1,236 @@
+---
+id: aliyun-oss
+title: "Aliyun OSS"
+---
+
+
+
+[Alibaba Cloud](https://www.aliyun.com) is a major cloud infrastructure provider. It offers its own object storage solution, known as [Object Storage Service (OSS)](https://www.aliyun.com/product/oss).
+This document describes how to use OSS as Druid deep storage.
+
+## Installation
+
+Use the [pull-deps](../../operations/pull-deps.md) tool shipped with Druid to install the `aliyun-oss-extensions` extension on Middle Manager and Historical nodes, as described in [Community extensions](../../configuration/extensions.md#community-extensions).
+
+```bash
+java -classpath "{YOUR_DRUID_DIR}/lib/*" org.apache.druid.cli.Main tools pull-deps -c org.apache.druid.extensions.contrib:aliyun-oss-extensions:{YOUR_DRUID_VERSION}
+```
+
+## Enabling
+
+After installation, add the `aliyun-oss-extensions` extension to `druid.extensions.loadList` in `common.runtime.properties`, and then restart the Middle Manager and Historical nodes.
+
+## Configuration
+
+First, add the following OSS configurations to `common.runtime.properties`:
+
+|Property|Description|Required|
+|--------|---------------|-----------|
+|`druid.oss.accessKey`|The `AccessKey ID` of the account to be used to access the OSS bucket|yes|
+|`druid.oss.secretKey`|The `AccessKey Secret` of the account to be used to access the OSS bucket| yes|
+|`druid.oss.endpoint`|The endpoint URL of your OSS storage. If your Druid cluster is also hosted in the same region on Alibaba Cloud as the region of your OSS bucket, it's recommended to use the internal network endpoint url, so that any inbound and outbound traffic to the OSS bucket is free of charge. | yes|
+
+To use OSS as deep storage, add the following configurations:
+
+|Property|Description|Required|
+|--------|---------------|-----------|
+|`druid.storage.type`| Global deep storage provider. Must be set to `oss` to make use of this extension. |yes|
+|`druid.storage.oss.bucket`|Storage bucket name.| yes |
+|`druid.storage.oss.prefix`| Folder where segments will be published to. `druid/segments` is recommended. | No |
+
+If you use OSS as deep storage for segment files, it's also recommended to save indexing task logs in OSS.
+To do this, add the following configurations:
+
+|Property|Description|Required|
+|--------|---------------|-----------|
+|`druid.indexer.logs.type`| Global deep storage provider. Must be set to `oss` to make use of this extension. | yes |
+|`druid.indexer.logs.oss.bucket`|The bucket used to keep logs. It could be the same as `druid.storage.oss.bucket`| yes |
+|`druid.indexer.logs.oss.prefix`|Folder where log files will be published to. `druid/logs` is recommended. | no |
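+
+For example, a `common.runtime.properties` snippet that combines the settings above might look like the following. The bucket name, credentials, and endpoint are placeholders; substitute the values for your environment.
+
+```
+druid.oss.accessKey=YOUR_ACCESS_KEY_ID
+druid.oss.secretKey=YOUR_ACCESS_KEY_SECRET
+druid.oss.endpoint=YOUR_OSS_ENDPOINT
+
+druid.storage.type=oss
+druid.storage.oss.bucket=your-druid-bucket
+druid.storage.oss.prefix=druid/segments
+
+druid.indexer.logs.type=oss
+druid.indexer.logs.oss.bucket=your-druid-bucket
+druid.indexer.logs.oss.prefix=druid/logs
+```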
+
+
+## Reading data from OSS
+
+Currently, the web console does not support ingestion from OSS, but you can ingest from OSS by submitting an ingestion task that uses the OSS input source configuration.
+
+The following sections describe the configuration of the OSS input source.
+
+### OSS Input Source
+
+|property|description|Required|
+|--------|-----------|-------|
+|type|This should be `oss`.|yes|
+|uris|JSON array of URIs where OSS objects to be ingested are located. For example, `oss://{your_bucket}/{source_file_path}`|`uris` or `prefixes` or `objects` must be set|
+|prefixes|JSON array of URI prefixes for the locations of OSS objects to be ingested. Empty objects starting with one of the given prefixes will be skipped.|`uris` or `prefixes` or `objects` must be set|
+|objects|JSON array of [OSS Objects](#oss-object) to be ingested. |`uris` or `prefixes` or `objects` must be set|
+|properties|[Properties Object](#properties-object) for overriding the default OSS configuration. See below for more information.|no (defaults will be used if not given)
+
+#### OSS Object
+
+|Property|Description|Default|Required|
+|--------|-----------|-------|---------|
+|bucket|Name of the OSS bucket|None|yes|
+|path|The path where data is located.|None|yes|
+
+#### Properties Object
+
+|Property|Description|Default|Required|
+|--------|-----------|-------|---------|
+|accessKey|The [Password Provider](../../operations/password-provider.md) or plain text string of this OSS InputSource's access key|None|yes|
+|secretKey|The [Password Provider](../../operations/password-provider.md) or plain text string of this OSS InputSource's secret key|None|yes|
+|endpoint|The endpoint of this OSS InputSource|None|no|
+
+### Reading from a file
+
+Suppose the file `rollup-data.json`, which you can find under Druid's `quickstart/tutorial` directory, has been uploaded to a folder named `druid` in the OSS bucket that your Druid cluster is configured to use.
+In this case, you can use the `uris` property of the OSS input source to read it, as shown:
+
+```json
+{
+ "type" : "index_parallel",
+ "spec" : {
+ "dataSchema" : {
+ "dataSource" : "rollup-tutorial-from-oss",
+ "timestampSpec": {
+ "column": "timestamp",
+ "format": "iso"
+ },
+ "dimensionsSpec" : {
+ "dimensions" : [
+ "srcIP",
+ "dstIP"
+ ]
+ },
+ "metricsSpec" : [
+ { "type" : "count", "name" : "count" },
+ { "type" : "longSum", "name" : "packets", "fieldName" : "packets" },
+ { "type" : "longSum", "name" : "bytes", "fieldName" : "bytes" }
+ ],
+ "granularitySpec" : {
+ "type" : "uniform",
+ "segmentGranularity" : "week",
+ "queryGranularity" : "minute",
+ "intervals" : ["2018-01-01/2018-01-03"],
+ "rollup" : true
+ }
+ },
+ "ioConfig" : {
+ "type" : "index_parallel",
+ "inputSource" : {
+ "type" : "oss",
+ "uris" : [
+ "oss://{YOUR_BUCKET_NAME}/druid/rollup-data.json"
+ ]
+ },
+ "inputFormat" : {
+ "type" : "json"
+ },
+ "appendToExisting" : false
+ },
+ "tuningConfig" : {
+ "type" : "index_parallel",
+ "maxRowsPerSegment" : 5000000,
+ "maxRowsInMemory" : 25000
+ }
+ }
+}
+```
+
+Post the above ingestion task spec to `http://{YOUR_ROUTER_IP}:8888/druid/indexer/v1/task` to have the indexing service create an ingestion task.
+
+### Reading files in folders
+
+To read all files in the same folder, use the `prefixes` property to specify the folder where Druid can find input files, instead of specifying file URIs one by one.
+
+```json
+...
+ "ioConfig" : {
+ "type" : "index_parallel",
+ "inputSource" : {
+ "type" : "oss",
+ "prefixes" : [
+ "oss://{YOUR_BUCKET_NAME}/2020", "oss://{YOUR_BUCKET_NAME}/2021"
+ ]
+ },
+ "inputFormat" : {
+ "type" : "json"
+ },
+ "appendToExisting" : false
+ }
+...
+```
+
+The spec above tells the ingestion task to read all files under the `2020` and `2021` folders.
+
+### Reading from other buckets
+
+To read files in buckets other than the bucket Druid is configured to use, specify the `objects` property of the OSS input source in the task, as shown below:
+
+```json
+...
+ "ioConfig" : {
+ "type" : "index_parallel",
+ "inputSource" : {
+ "type" : "oss",
+ "objects" : [
+ {"bucket": "YOUR_BUCKET_NAME", "path": "druid/rollup-data.json"}
+ ]
+ },
+ "inputFormat" : {
+ "type" : "json"
+ },
+ "appendToExisting" : false
+ }
+...
+```
+
+### Reading with customized accessKey
+
+If the default `druid.oss.accessKey` cannot access a bucket, use the `properties` field to supply different credentials, as shown below:
+
+```json
+...
+ "ioConfig" : {
+ "type" : "index_parallel",
+ "inputSource" : {
+ "type" : "oss",
+ "objects" : [
+ {"bucket": "YOUR_BUCKET_NAME", "path": "druid/rollup-data.json"}
+ ],
+ "properties": {
+ "endpoint": "YOUR_ENDPOINT_OF_BUCKET",
+ "accessKey": "YOUR_ACCESS_KEY",
+ "secretKey": "YOUR_SECRET_KEY"
+ }
+ },
+ "inputFormat" : {
+ "type" : "json"
+ },
+ "appendToExisting" : false
+ }
+...
+```
+
+You can use the `properties` field together with any of the `uris`, `objects`, or `prefixes` properties described above.
+
+
+## Troubleshooting
+
+When using OSS as deep storage or reading from OSS, most problems that users encounter are related to OSS permissions.
+Refer to the official [OSS permission troubleshooting document](https://www.alibabacloud.com/help/doc-detail/42777.htm) to find a solution.
diff --git a/docs/35.0.0/development/extensions-contrib/ambari-metrics-emitter.md b/docs/35.0.0/development/extensions-contrib/ambari-metrics-emitter.md
new file mode 100644
index 0000000000..ee82ca6d78
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/ambari-metrics-emitter.md
@@ -0,0 +1,98 @@
+---
+id: ambari-metrics-emitter
+title: "Ambari Metrics Emitter"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `ambari-metrics-emitter` in the extensions load list.
+
+## Introduction
+
+This extension emits Druid metrics to an ambari-metrics carbon server. Events are sent after being pickled (batched); the batch size is configurable.
+
+## Configuration
+
+All the configuration parameters for ambari-metrics emitter are under `druid.emitter.ambari-metrics`.
+
+|property|description|required?|default|
+|--------|-----------|---------|-------|
+|`druid.emitter.ambari-metrics.hostname`|The hostname of the ambari-metrics server.|yes|none|
+|`druid.emitter.ambari-metrics.port`|The port of the ambari-metrics server.|yes|none|
+|`druid.emitter.ambari-metrics.protocol`|The protocol used to send metrics to ambari metrics collector. One of http/https|no|http|
+|`druid.emitter.ambari-metrics.trustStorePath`|Path to trustStore to be used for https|no|none|
+|`druid.emitter.ambari-metrics.trustStoreType`|trustStore type to be used for https|no|none|
+|`druid.emitter.ambari-metrics.trustStorePassword`|trustStore password to be used for https|no|none|
+|`druid.emitter.ambari-metrics.batchSize`|Number of events to send as one batch.|no|100|
+|`druid.emitter.ambari-metrics.eventConverter`| Filter and converter of Druid events to ambari-metrics timeline events (see the next section). |yes|none|
+|`druid.emitter.ambari-metrics.flushPeriod` | Queue flushing period in milliseconds. |no|1 minute|
+|`druid.emitter.ambari-metrics.maxQueueSize`| Maximum size of the queue used to buffer events. |no|`MAX_INT`|
+|`druid.emitter.ambari-metrics.alertEmitters`| List of emitters where alerts will be forwarded to. |no| empty list (no forwarding)|
+|`druid.emitter.ambari-metrics.emitWaitTime` | Wait time in milliseconds to try to send the event; if the event cannot be sent within this time, the emitter drops it. |no|0|
+|`druid.emitter.ambari-metrics.waitForEventTime` | Time in milliseconds to wait, if necessary, for an event to become available. |no|1000 (1 sec)|
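+
+For example, a minimal emitter configuration in your runtime properties might look like the following sketch. It assumes the emitter is enabled with `druid.emitter=ambari-metrics`; the host name and port are placeholders for your Ambari Metrics Collector.
+
+```
+druid.emitter=ambari-metrics
+druid.emitter.ambari-metrics.hostname=ambari-collector.example.com
+druid.emitter.ambari-metrics.port=6188
+druid.emitter.ambari-metrics.eventConverter={"type":"all", "namespacePrefix": "druid.test", "appName":"druid"}
+```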
+
+### Druid to Ambari Metrics Timeline Event Converter
+
+The Ambari Metrics Timeline Event Converter defines a mapping from a Druid metric name plus dimensions to a timeline event metric name.
+The ambari-metrics metric path is organized using the following schema:
+`<namespacePrefix>.[<druid service name>].[<druid hostname>].<druid metrics dimensions>.<druid metrics name>`
+Properly naming the metrics is critical to avoid conflicts, confusing data, and potentially wrong interpretation later on.
+
+Example `druid.historical.hist-host1:8080.MyDataSourceName.GroupBy.query/time`:
+
+ * `druid` -> namespace prefix
+ * `historical` -> service name
+ * `hist-host1:8080` -> druid hostname
+ * `MyDataSourceName` -> dimension value
+ * `GroupBy` -> dimension value
+ * `query/time` -> metric name
+
+We have two different implementation of event converter:
+
+#### Send-All converter
+
+The first implementation, called `all`, sends all the Druid service metrics events.
+The path will be in the form `<namespacePrefix>.[<druid service name>].[<druid hostname>].<druid metrics dimensions>.<druid metrics name>`.
+The user has control of `<namespacePrefix>.[<druid service name>].[<druid hostname>]`.
+
+```json
+
+druid.emitter.ambari-metrics.eventConverter={"type":"all", "namespacePrefix": "druid.test", "appName":"druid"}
+
+```
+
+#### White-list based converter
+
+The second implementation, called `whiteList`, sends only the white-listed metrics and dimensions.
+As with the `all` converter, the user has control of `<namespacePrefix>.[<druid service name>].[<druid hostname>]`.
+The white-list based converter comes with a default white list map located under resources in `./src/main/resources/defaultWhiteListMap.json`.
+
+You can override the default white list map by supplying a property called `mapPath`.
+This property is a String containing the path of the file containing the **white list map JSON object**.
+For example, the following converter reads the map from the file `/pathPrefix/fileName.json`.
+
+```json
+
+druid.emitter.ambari-metrics.eventConverter={"type":"whiteList", "namespacePrefix": "druid.test", "ignoreHostname":true, "appName":"druid", "mapPath":"/pathPrefix/fileName.json"}
+
+```
+
+**Druid emits a huge number of metrics, so we highly recommend using the `whiteList` converter.**
diff --git a/docs/35.0.0/development/extensions-contrib/cassandra.md b/docs/35.0.0/development/extensions-contrib/cassandra.md
new file mode 100644
index 0000000000..916bacb917
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/cassandra.md
@@ -0,0 +1,30 @@
+---
+id: cassandra
+title: "Apache Cassandra"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `druid-cassandra-storage` in the extensions load list.
+
+[Apache Cassandra](http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra) can also
+be leveraged for deep storage. This requires some additional Druid configuration as well as setting up the necessary
+schema within a Cassandra keyspace.
diff --git a/docs/35.0.0/development/extensions-contrib/cloudfiles.md b/docs/35.0.0/development/extensions-contrib/cloudfiles.md
new file mode 100644
index 0000000000..d4e7592ee7
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/cloudfiles.md
@@ -0,0 +1,42 @@
+---
+id: cloudfiles
+title: "Rackspace Cloud Files"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `druid-cloudfiles-extensions` in the extensions load list.
+
+## Deep Storage
+
+[Rackspace Cloud Files](http://www.rackspace.com/cloud/files/) is another option for deep storage. This requires some additional Druid configuration.
+
+|Property|Possible Values|Description|Default|
+|--------|---------------|-----------|-------|
+|`druid.storage.type`|cloudfiles||Must be set.|
+|`druid.storage.region`||Rackspace Cloud Files region.|Must be set.|
+|`druid.storage.container`||Rackspace Cloud Files container name.|Must be set.|
+|`druid.storage.basePath`||Rackspace Cloud Files base path to use in the container.|Must be set.|
+|`druid.storage.operationMaxRetries`||Number of tries before cancel a Rackspace operation.|10|
+|`druid.cloudfiles.userName`||Rackspace Cloud username|Must be set.|
+|`druid.cloudfiles.apiKey`||Rackspace Cloud API key.|Must be set.|
+|`druid.cloudfiles.provider`|rackspace-cloudfiles-us,rackspace-cloudfiles-uk|Name of the provider depending on the region.|Must be set.|
+|`druid.cloudfiles.useServiceNet`|true,false|Whether to use the internal service net.|true|
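+
+For example, a deep storage configuration using these properties might look like the following sketch in your runtime properties. The region, container, base path, and credentials are placeholders.
+
+```
+druid.storage.type=cloudfiles
+druid.storage.region=YOUR_REGION
+druid.storage.container=druid-segments
+druid.storage.basePath=druid/segments
+druid.cloudfiles.userName=YOUR_USERNAME
+druid.cloudfiles.apiKey=YOUR_API_KEY
+druid.cloudfiles.provider=rackspace-cloudfiles-us
+druid.cloudfiles.useServiceNet=true
+```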
diff --git a/docs/35.0.0/development/extensions-contrib/compressed-big-decimal.md b/docs/35.0.0/development/extensions-contrib/compressed-big-decimal.md
new file mode 100644
index 0000000000..28a52185fd
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/compressed-big-decimal.md
@@ -0,0 +1,280 @@
+---
+id: compressed-big-decimal
+title: "Compressed Big Decimal"
+---
+
+
+
+## Overview
+**Compressed Big Decimal** is an extension that provides a mutable big decimal type, which can accumulate values without losing precision or reallocating memory. It supports absolute-precision arithmetic on large numbers in applications that require a high level of accuracy, such as financial applications and currency-based transactions, and helps avoid rounding issues that could otherwise lose potentially large amounts of money.
+
+Accumulation requires that the two numbers have the same scale, but does not require that they are of the same size. If the value being accumulated has a larger underlying array than this value (the result), then the higher order bits are dropped, similar to what happens when adding a long to an int and storing the result in an int. A compressed big decimal holds its data in an embedded array.
+
+Compressed big decimal is a complex type based on Java's BigDecimal and supports the same functionality. Because Java's BigDecimal is immutable, accumulating with it would create many intermediate objects and significant garbage collection overhead; compressed big decimal is mutable so that the value in the accumulator can be updated in place.
+
+#### Main enhancements provided by this extension:
+1. Functionality: a mutable big decimal type with greater precision
+2. Accuracy: a greater level of accuracy in decimal arithmetic
+
+## Operations
+To use this extension, [include](../../configuration/extensions.md#loading-extensions) `druid-compressed-bigdecimal` in the extensions load list.
+
+## Configuration
+There are currently no configuration properties specific to Compressed Big Decimal.
+
+## Limitations
+* Compressed Big Decimal does not produce a correct result when the value being accumulated has a larger underlying array than the result value; in that case the higher order bits are dropped, similar to what happens when adding a long to an int and storing the result in an int.
+
+
+### Ingestion Spec:
+* Most properties in the ingestion spec are derived from the [Ingestion Spec](../../ingestion/index.md) and [Data Formats](../../ingestion/data-formats.md) documentation.
+
+
+|property|description|required?|
+|--------|-----------|---------|
+|metricsSpec|Metrics specification. When specifying a metric, set its `type` to one of the compressed big decimal aggregators, such as `compressedBigDecimalSum`.|Yes|
+
+### Query spec:
+* Most properties in the query spec are derived from the [groupBy query](../../querying/groupbyquery.md) and [timeseries](../../querying/timeseriesquery.md) documentation; see the documentation for these query types.
+
+|property|description|required?|
+|--------|-----------|---------|
+|queryType|This String should always be either "groupBy" OR "timeseries"; this is the first thing Druid looks at to figure out how to interpret the query.|yes|
+|dataSource|A String or Object defining the data source to query, very similar to a table in a relational database. See [DataSource](../../querying/datasource.md) for more information.|yes|
+|dimensions|A JSON list of [DimensionSpec](../../querying/dimensionspecs.md) (Notice that property is optional)|no|
+|limitSpec|See [LimitSpec](../../querying/limitspec.md)|no|
+|having|See [Having](../../querying/having.md)|no|
+|granularity|A period granularity; See [Period Granularities](../../querying/granularities.md#period-granularities)|yes|
+|filter|See [Filters](../../querying/filters.md)|no|
+|aggregations|Aggregations form the input to the post-aggregations; see [Aggregations](../../querying/aggregations.md). The aggregations must specify `type`, `scale`, and `size` as follows for the compressed big decimal type: ```"aggregations": [{"type": "compressedBigDecimalSum","name": "..","fieldName": "..","scale": [Numeric],"size": [Numeric]}]```. Refer to the query examples in the Examples section. |Yes|
+|postAggregations|Supports only aggregations as input; See [Post Aggregations](../../querying/post-aggregations.md)|no|
+|intervals|A JSON Object representing ISO-8601 Intervals. This defines the time ranges to run the query over.|yes|
+|context|An additional JSON Object which can be used to specify certain flags.|no|
+
+## Examples
+
+Consider the following data, where the columns are `Date`, `Item`, and `SaleAmount`:
+
+```
+20201208,ItemA,0.0
+20201208,ItemB,10.000000000
+20201208,ItemA,-1.000000000
+20201208,ItemC,9999999999.000000000
+20201208,ItemB,5000000000.000000005
+20201208,ItemA,2.0
+20201208,ItemD,0.0
+```
+
+IngestionSpec syntax:
+
+```json
+{
+ "type": "index_parallel",
+ "spec": {
+ "dataSchema": {
+ "dataSource": "invoices",
+ "timestampSpec": {
+ "column": "timestamp",
+ "format": "yyyyMMdd"
+ },
+ "dimensionsSpec": {
+ "dimensions": [{
+ "type": "string",
+ "name": "itemName"
+ }]
+ },
+ "metricsSpec": [{
+ "name": "saleAmount",
+ "type": "compressedBigDecimalSum",
+ "fieldName": "saleAmount"
+ }],
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ },
+ "granularitySpec": {
+ "type": "uniform",
+ "rollup": false,
+ "segmentGranularity": "DAY",
+ "queryGranularity": "none",
+ "intervals": ["2020-12-08/2020-12-09"]
+ }
+ },
+ "ioConfig": {
+ "type": "index_parallel",
+ "inputSource": {
+ "type": "local",
+ "baseDir": "/home/user/sales/data/staging/invoice-data",
+ "filter": "invoice-001.20201208.txt"
+ },
+ "inputFormat": {
+ "type": "tsv",
+ "delimiter": ",",
+ "skipHeaderRows": 0,
+ "columns": [
+ "timestamp",
+ "itemName",
+ "saleAmount"
+ ]
+ }
+ },
+ "tuningConfig": {
+ "type": "index_parallel"
+ }
+ }
+}
+```
+
+SQL-based ingestion sample query:
+```sql
+
+REPLACE INTO "bigdecimal" OVERWRITE ALL
+WITH "ext" AS (
+ SELECT *
+ FROM TABLE(
+ EXTERN(
+      '{"type":"local","baseDir":"/home/user/sales/data/staging/invoice-data","filter":"invoice-001.20201208.txt"}',
+ '{"type":"csv","findColumnsFromHeader":false,"columns":["timestamp","itemName","saleAmount"]}',
+ '[{"name":"timestamp","type":"string"},{"name":"itemName","type":"string"},{"name":"saleAmount","type":"double"}]'
+ )
+ )
+)
+SELECT
+ TIME_PARSE(TRIM("timestamp")) AS "__time",
+ "itemName",
+ BIG_SUM("saleAmount") as amount
+FROM "ext"
+GROUP BY TIME_PARSE(TRIM("timestamp")), "itemName"
+PARTITIONED BY DAY
+```
+
+
+### Group By Query example
+
+Calculate total sales using a groupBy query over all data.
+
+Query syntax:
+
+```json
+{
+ "queryType": "groupBy",
+ "dataSource": "invoices",
+ "granularity": "ALL",
+ "dimensions": [
+ ],
+ "aggregations": [
+ {
+ "type": "compressedBigDecimalSum",
+ "name": "saleAmount",
+ "fieldName": "saleAmount",
+ "scale": 9,
+ "size": 3
+
+ }
+ ],
+ "intervals": [
+    "2020-12-08T00:00:00.000Z/P1D"
+ ]
+}
+```
+
+Result:
+
+```json
+[ {
+ "version" : "v1",
+ "timestamp" : "2020-12-08T00:00:00.000Z",
+ "event" : {
+    "saleAmount" : 15000000010.000000005
+ }
+} ]
+```
+
+Had you used *doubleSum* instead of *compressedBigDecimalSum*, the result would be:
+
+```json
+[ {
+ "timestamp" : "2020-12-08T00:00:00.000Z",
+ "result" : {
+    "saleAmount" : 1.500000001E10
+ }
+} ]
+```
+As shown above, precision is lost, which could lead to a loss of money.
+
+### TimeSeries Query Example
+
+Query syntax:
+
+```json
+{
+ "queryType": "timeseries",
+ "dataSource": "invoices",
+ "granularity": "ALL",
+ "aggregations": [
+ {
+ "type": "compressedBigDecimalSum",
+ "name": "revenue",
+      "fieldName": "saleAmount",
+ "scale": 9,
+ "size": 3
+ }
+ ],
+ "filter": {
+ "type": "not",
+ "field": {
+ "type": "selector",
+ "dimension": "itemName",
+ "value": "ItemD"
+ }
+ },
+ "intervals": [
+ "2020-12-08T00:00:00.000Z/P1D"
+ ]
+}
+```
+
+Result:
+
+```json
+[ {
+ "timestamp" : "2020-12-08T00:00:00.000Z",
+ "result" : {
+ "revenue" : 15000000010.000000005
+ }
+} ]
+```
+
+### Supported Query Functions
+
+Native aggregation functions:
+
+ * `compressedBigDecimalSum`
+ * `compressedBigDecimalMin`
+ * `compressedBigDecimalMax`
+
+SQL aggregation functions:
+
+ * `big_sum()`
+ * `big_min()`
+ * `big_max()`
+
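+The following is a minimal sketch of using the SQL aggregation functions at query time. The table name `sales_raw` and its columns are hypothetical; it assumes a datasource whose `saleAmount` column holds raw numeric values.
+
+```sql
+-- Hypothetical datasource and column names
+SELECT
+  "itemName",
+  BIG_SUM("saleAmount") AS "totalSaleAmount"
+FROM "sales_raw"
+GROUP BY "itemName"
+```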
diff --git a/docs/35.0.0/development/extensions-contrib/ddsketch-quantiles.md b/docs/35.0.0/development/extensions-contrib/ddsketch-quantiles.md
new file mode 100644
index 0000000000..bd1a1e1dab
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/ddsketch-quantiles.md
@@ -0,0 +1,139 @@
+---
+id: ddsketch-quantiles
+title: "DDSketches for Approximate Quantiles module"
+---
+
+
+
+
+This module provides aggregators for approximate quantile queries using the [DDSketch](https://github.com/datadog/sketches-java) library. The DDSketch library provides a fast and fully mergeable quantile sketch with relative-error guarantees. For example, if the true quantile is 100, a sketch with a relative error of 1% guarantees a quantile value between 99 and 101. This is important and highly valuable behavior for long-tailed distributions. The best use case for these sketches is accurately describing the upper quantiles of long-tailed distributions such as network latencies.
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `druid-ddsketch` in the extensions load list:
+
+```
+druid.extensions.loadList=["druid-ddsketch", ...]
+```
+
+### Aggregator
+
+The result of the aggregation is a DDSketch that is the union of all sketches, either built from raw data or read from segments. The single number returned represents the total number of included data points. The default `ddSketch` aggregator type uses the collapsingLowestDense strategy for storing and merging sketches. This means that, in favor of keeping the highest values represented at the highest accuracy, the sketch collapses and merges the lower, smaller values. Collapsed bins lose their accuracy guarantees. The default number of bins is 1000. Sketches can only be merged when they use the same `relativeError` value.
+
+The `ddSketch` aggregator operates over raw data and precomputed sketches.
+
+```json
+{
+ "type" : "ddSketch",
+  "name" : <output name>,
+  "fieldName" : <input field name>,
+  "relativeError" : <double between 0 and 1>,
+  "numBins": <int>
+}
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "ddSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|fieldName|A String for the name of the input field (can contain sketches or raw numeric values).|yes|
+|relativeError|Describes the precision in which to store the sketch. Must be a number between 0 and 1.|no, defaults to 0.01 (1% error)|
+|numBins|Total number of bins the sketch is allowed to use to describe the distribution. This has a direct impact on max memory used. The more total bins available, the larger the range of accurate quantiles. With relative accuracy of 2%, only 275 bins are required to cover values between 1 millisecond and 1 minute. 800 bins are required to cover values between 1 nanosecond and 1 day.|no, defaults to 1000|
+
+
+### Post Aggregators
+
+To compute approximate quantiles, use `quantilesFromDDSketch` to query for a set of quantiles or `quantileFromDDSketch` to query for a single quantile. Call these post-aggregators on the sketches created by the `ddSketch` aggregators.
+
+
+#### quantilesFromDDSketch
+
+Use `quantilesFromDDSketch` to fetch multiple quantiles.
+
+```json
+{
+ "type" : "quantilesFromDDSketch",
+  "name" : <output name>,
+  "field" : <reference to a ddSketch aggregator>,
+  "fractions" : <array of doubles between 0 and 1>
+}
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "quantilesFromDDSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A computed ddSketch.|yes|
+|fractions|Array of doubles from 0 to 1 of the quantiles to compute|yes|
+
+#### quantileFromDDSketch
+
+Use `quantileFromDDSketch` to fetch a single quantile.
+
+```json
+{
+ "type" : "quantileFromDDSketch",
+  "name" : <output name>,
+  "field" : <reference to a ddSketch aggregator>,
+  "fraction" : <double between 0 and 1>
+}
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Must be "quantileFromDDSketch" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A computed ddSketch.|yes|
+|fraction|A double from 0 to 1 of the quantile to compute|yes|
+
+
+### Example
+
+As an example of a query with sketches pre-aggregated at ingestion time, one could set up the following aggregator at ingest:
+
+```json
+{
+ "type": "ddSketch",
+ "name": "sketch",
+ "fieldName": "value",
+ "relativeError": 0.01,
+  "numBins": 1000
+}
+```
+
+Compute quantiles from the pre-aggregated sketches using the following aggregator and post-aggregator.
+
+```json
+{
+ "aggregations": [{
+ "type": "ddSketch",
+ "name": "sketch",
+    "fieldName": "sketch"
+ }],
+ "postAggregations": [
+ {
+ "type": "quantilesFromDDSketch",
+ "name": "quantiles",
+ "fractions": [0.5, 0.75, 0.9, 0.99],
+ "field": {
+ "type": "fieldAccess",
+ "fieldName": "sketch"
+ }
+ }]
+}
+```
diff --git a/docs/35.0.0/development/extensions-contrib/delta-lake.md b/docs/35.0.0/development/extensions-contrib/delta-lake.md
new file mode 100644
index 0000000000..88f3a2c77f
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/delta-lake.md
@@ -0,0 +1,54 @@
+---
+id: delta-lake
+title: "Delta Lake extension"
+---
+
+
+
+
+Delta Lake is an open source storage framework that enables building a
+Lakehouse architecture with various compute engines. [DeltaLakeInputSource](../../ingestion/input-sources.md#delta-lake-input-source) lets
+you ingest data stored in a Delta Lake table into Apache Druid. To use the Delta Lake extension, add `druid-deltalake-extensions` to the list of loaded extensions.
+See [Loading extensions](../../configuration/extensions.md#loading-extensions) for more information.
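+
+For example, the relevant line in `common.runtime.properties` might look like the following sketch; any other extensions you load would also appear in this list.
+
+```
+druid.extensions.loadList=["druid-deltalake-extensions"]
+```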
+
+The Delta input source reads the configured Delta Lake table and extracts the underlying Delta files in the table's latest snapshot
+based on an optional Delta filter. These Delta Lake files are versioned Parquet files.
+
+## Version support
+
+The Delta Lake extension uses the Delta Kernel introduced in Delta Lake 3.0.0, which is compatible with Apache Spark 3.5.x.
+Older versions are unsupported, so consider upgrading to Delta Lake 3.0.x or higher to use this extension.
+
+## Downloading Delta Lake extension
+
+To download `druid-deltalake-extensions`, run the following command after replacing `<VERSION>` with the desired
+Druid version:
+
+```shell
+java \
+ -cp "lib/*" \
+ -Ddruid.extensions.directory="extensions" \
+ -Ddruid.extensions.hadoopDependenciesDir="hadoop-dependencies" \
+ org.apache.druid.cli.Main tools pull-deps \
+ --no-default-hadoop \
+  -c "org.apache.druid.extensions.contrib:druid-deltalake-extensions:<VERSION>"
+```
+
+See [Loading community extensions](../../configuration/extensions.md#loading-community-extensions) for more information.
\ No newline at end of file
diff --git a/docs/35.0.0/development/extensions-contrib/distinctcount.md b/docs/35.0.0/development/extensions-contrib/distinctcount.md
new file mode 100644
index 0000000000..38f8e5efba
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/distinctcount.md
@@ -0,0 +1,99 @@
+---
+id: distinctcount
+title: "DistinctCount Aggregator"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `druid-distinctcount` in the extensions load list.
+
+Additionally, follow these steps:
+
+1. First, use a single-dimension, hash-based partition spec to partition data by a single dimension, for example `visitor_id`. This ensures that all rows with a particular value for that dimension go into the same segment; otherwise the count might be too high. See the example partition spec after this list.
+2. Second, use `distinctCount` to calculate the distinct count, and make sure `queryGranularity` is divided exactly by `segmentGranularity`, or else the result will be wrong.
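+
+The following is a minimal sketch of such a partition spec inside the `tuningConfig` of a parallel batch ingestion task. The dimension name and row target are placeholders for your own values.
+
+```json
+"tuningConfig": {
+  "type": "index_parallel",
+  "forceGuaranteedRollup": true,
+  "partitionsSpec": {
+    "type": "hashed",
+    "partitionDimensions": ["visitor_id"],
+    "targetRowsPerSegment": 5000000
+  }
+}
+```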
+
+There are some limitations. When used with groupBy, the number of groupBy keys should not exceed `maxIntermediateRows` in any segment; if it does, the result will be wrong. When used with topN, `numValuesPerPass` should not be too big; if it is, distinctCount uses a lot of memory and might cause the JVM to run out of memory.
+
+Example:
+
+## Timeseries query
+
+```json
+{
+ "queryType": "timeseries",
+ "dataSource": "sample_datasource",
+ "granularity": "day",
+ "aggregations": [
+ {
+ "type": "distinctCount",
+ "name": "uv",
+ "fieldName": "visitor_id"
+ }
+ ],
+ "intervals": [
+    "2016-03-01T00:00:00.000/2016-03-20T00:00:00.000"
+ ]
+}
+```
+
+## TopN query
+
+```json
+{
+ "queryType": "topN",
+ "dataSource": "sample_datasource",
+ "dimension": "sample_dim",
+ "threshold": 5,
+ "metric": "uv",
+ "granularity": "all",
+ "aggregations": [
+ {
+ "type": "distinctCount",
+ "name": "uv",
+ "fieldName": "visitor_id"
+ }
+ ],
+ "intervals": [
+ "2016-03-06T00:00:00/2016-03-06T23:59:59"
+ ]
+}
+```
+
+## GroupBy query
+
+```json
+{
+ "queryType": "groupBy",
+ "dataSource": "sample_datasource",
+ "dimensions": ["sample_dim"],
+ "granularity": "all",
+ "aggregations": [
+ {
+ "type": "distinctCount",
+ "name": "uv",
+ "fieldName": "visitor_id"
+ }
+ ],
+ "intervals": [
+ "2016-03-06T00:00:00/2016-03-06T23:59:59"
+ ]
+}
+```
diff --git a/docs/35.0.0/development/extensions-contrib/druid-exact-count-bitmap.md b/docs/35.0.0/development/extensions-contrib/druid-exact-count-bitmap.md
new file mode 100644
index 0000000000..b39ed38dd6
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/druid-exact-count-bitmap.md
@@ -0,0 +1,452 @@
+---
+id: druid-exact-count-bitmap
+title: "Exact Count Bitmap"
+---
+
+
+
+This extension provides exact cardinality counting functionality for LONG type columns using [Roaring Bitmaps](https://roaringbitmap.org/). Unlike approximate cardinality aggregators like HyperLogLog, this aggregator provides precise distinct counts.
+
+## Installation
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `druid-exact-count-bitmap` in the extensions load list.
+
+## Comparison with Similar Aggregations
+
+The [Distinct Count Aggregator](https://druid.apache.org/docs/latest/development/extensions-contrib/distinctcount/) works in a similar way to the Exact Count Aggregator. Hence, it is important to understand the difference between the behavior of these two aggregators.
+
+| Exact Count | Distinct Count |
+| -- | -- |
+| No prerequisites needed (e.g. configuring hash partition, segment granularity) | Prerequisites needed to perform aggregation |
+| Works on 64-bit number columns only (BIGINT) | Works on dimension columns (Including Strings, Complex Types, etc) |
+
+## How it Works
+
+The extension uses `Roaring64NavigableMap` as its underlying data structure to efficiently store and compute exact cardinality of 64-bit integers. It provides two types of aggregators that serve different purposes:
+
+### Build Aggregator (Bitmap64ExactCountBuild)
+
+The BUILD aggregator is used when you want to compute cardinality directly from raw LONG values:
+
+- Used during ingestion or when querying raw data
+- Must be used on columns of type LONG.
+
+Example:
+
+```json
+{
+ "type": "Bitmap64ExactCountBuild",
+ "name": "unique_values",
+ "fieldName": "id"
+}
+```
+
+### Merge Aggregator (Bitmap64ExactCountMerge)
+
+The MERGE aggregator is used when working with pre-computed bitmaps:
+
+- Used for querying pre-aggregated data (columns that were previously aggregated using BUILD)
+- Combines multiple bitmaps using bitwise operations.
+- Must be used on columns that are aggregated using BUILD, or by a previous MERGE.
+- `Bitmap64ExactCountMerge` aggregator is recommended for use in `timeseries` type queries, though it also works for `topN` and `groupBy` queries.
+
+Example:
+
+```json
+{
+ "type": "Bitmap64ExactCountMerge",
+ "name": "total_unique_values",
+ "fieldName": "unique_values" // Must be a pre-computed bitmap
+}
+```
+
+### Typical Workflow
+
+1. During ingestion, use BUILD to create the initial bitmap:
+ ```json
+ {
+ "type": "index",
+ "spec": {
+ "dataSchema": {
+ "metricsSpec": [
+ {
+ "type": "Bitmap64ExactCountBuild",
+ "name": "unique_users",
+ "fieldName": "user_id"
+ }
+ ]
+ }
+ }
+ }
+ ```
+
+2. When querying the aggregated data, use MERGE to combine bitmaps:
+ ```json
+ {
+ "queryType": "timeseries",
+ "aggregations": [
+ {
+ "type": "Bitmap64ExactCountMerge",
+ "name": "total_unique_users",
+ "fieldName": "unique_users"
+ }
+ ]
+ }
+ ```
+
+## Usage
+
+### SQL Query
+
+You can use the `BITMAP64_EXACT_COUNT` function in SQL queries:
+
+```sql
+SELECT BITMAP64_EXACT_COUNT(column_name)
+FROM datasource
+WHERE ...
+GROUP BY ...
+```
+
+### Post-Aggregator
+
+You can also use the post-aggregator for further processing:
+
+```json
+{
+ "type": "bitmap64ExactCount",
+ "name": "",
+ "fieldName": ""
+}
+```
+
+## Considerations
+
+- **Memory Usage**: While Roaring Bitmaps are efficient, storing exact unique values will generally consume more memory than approximate algorithms like HyperLogLog.
+- **Input Type**: This aggregator only works with LONG (64-bit integer) columns. String or other data types must be converted to longs before using this aggregator.
+- **Build vs Merge**: Always use BUILD for raw numeric data and MERGE for pre-aggregated data. Using BUILD on pre-aggregated data or MERGE on raw data will not work correctly.
+
+## Example Use Cases
+
+1. **User Analytics**: Count unique users over time
+
+```sql
+-- First ingest with BUILD aggregator
+-- Then query with:
+SELECT
+ TIME_FLOOR(__time, 'PT1H') AS hour,
+ BITMAP64_EXACT_COUNT(unique_users) as distinct_users
+FROM user_metrics
+GROUP BY 1
+```
+
+2. **High-Precision Metrics**: When exact counts are required
+
+```json
+{
+ "type": "groupBy",
+ "dimensions": [
+ "country"
+ ],
+ "aggregations": [
+ {
+ "type": "Bitmap64ExactCountMerge",
+ "name": "exact_user_count",
+ "fieldName": "unique_users"
+ }
+ ]
+}
+```
+
+## Walkthrough Using Wikipedia datasource
+
+### Batch Ingestion Task Spec
+
+```json
+{
+ "type": "index",
+ "spec": {
+ "dataSchema": {
+ "dataSource": "wikipedia_metrics",
+ "timestampSpec": {
+ "column": "__time",
+ "format": "auto"
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ "channel",
+ "namespace",
+ "page",
+ "user",
+ "cityName",
+ "countryName",
+ "regionName",
+ "isRobot",
+ "isUnpatrolled",
+ "isNew",
+ "isAnonymous"
+ ]
+ },
+ "metricsSpec": [
+ {
+ "type": "Bitmap64ExactCountBuild",
+ "name": "unique_added_values",
+ "fieldName": "added"
+ },
+ {
+ "type": "Bitmap64ExactCountBuild",
+ "name": "unique_delta_values",
+ "fieldName": "delta"
+ },
+ {
+ "type": "Bitmap64ExactCountBuild",
+ "name": "unique_comment_lengths",
+ "fieldName": "commentLength"
+ },
+ {
+ "name": "count",
+ "type": "count"
+ },
+ {
+ "name": "sum_added",
+ "type": "longSum",
+ "fieldName": "added"
+ },
+ {
+ "name": "sum_delta",
+ "type": "longSum",
+ "fieldName": "delta"
+ }
+ ],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "DAY",
+ "queryGranularity": "HOUR",
+ "rollup": true,
+ "intervals": [
+ "2016-06-27/2016-06-28"
+ ]
+ }
+ },
+ "ioConfig": {
+ "type": "index",
+ "inputSource": {
+ "type": "druid",
+ "dataSource": "wikipedia",
+ "interval": "2016-06-27/2016-06-28"
+ },
+ "inputFormat": {
+ "type": "tsv",
+ "findColumnsFromHeader": true
+ }
+ },
+ "tuningConfig": {
+ "type": "index",
+ "maxRowsPerSegment": 5000000,
+ "maxRowsInMemory": 25000
+ }
+ }
+}
+```
+
+### Query from datasource with raw bytes
+
+```json
+{
+ "queryType": "timeseries",
+ "dataSource": {
+ "type": "table",
+ "name": "wikipedia_metrics"
+ },
+ "intervals": {
+ "type": "intervals",
+ "intervals": [
+ "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
+ ]
+ },
+ "granularity": {
+ "type": "all"
+ },
+ "aggregations": [
+ {
+ "type": "Bitmap64ExactCountBuild",
+ "name": "a0",
+ "fieldName": "unique_added_values"
+ }
+ ]
+}
+```
+
+### Query from datasource with pre-aggregated bitmap
+
+```json
+{
+ "queryType": "timeseries",
+ "dataSource": {
+ "type": "table",
+ "name": "wikipedia_metrics"
+ },
+ "intervals": {
+ "type": "intervals",
+ "intervals": [
+ "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
+ ]
+ },
+ "granularity": {
+ "type": "all"
+ },
+ "aggregations": [
+ {
+ "type": "Bitmap64ExactCountMerge",
+ "name": "a0",
+ "fieldName": "unique_added_values"
+ }
+ ]
+}
+```
+
+## Other Examples
+
+### Kafka ingestion task spec
+
+```json
+{
+ "type": "kafka",
+ "spec": {
+ "dataSchema": {
+ "dataSource": "ticker_event_bitmap64_exact_count_rollup",
+ "timestampSpec": {
+ "column": "timestamp",
+ "format": "millis",
+ "missingValue": null
+ },
+ "dimensionsSpec": {
+ "dimensions": [
+ {
+ "type": "string",
+ "name": "key"
+ }
+ ],
+ "dimensionExclusions": []
+ },
+ "metricsSpec": [
+ {
+ "type": "Bitmap64ExactCountBuild",
+ "name": "count",
+ "fieldName": "value"
+ }
+ ],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": "HOUR",
+ "rollup": true,
+ "intervals": null
+ },
+ "transformSpec": {
+ "filter": null,
+ "transforms": []
+ }
+ },
+ "ioConfig": {
+ "topic": "ticker_event",
+ "inputFormat": {
+ "type": "json",
+ "flattenSpec": {
+ "useFieldDiscovery": true,
+ "fields": []
+ },
+ "featureSpec": {}
+ },
+ "replicas": 1,
+ "taskCount": 1,
+ "taskDuration": "PT3600S",
+ "consumerProperties": {
+ "bootstrap.servers": "localhost:9092"
+ },
+ "pollTimeout": 100,
+ "startDelay": "PT5S",
+ "period": "PT30S",
+ "useEarliestOffset": false,
+ "completionTimeout": "PT1800S",
+ "lateMessageRejectionPeriod": null,
+ "earlyMessageRejectionPeriod": null,
+ "lateMessageRejectionStartDateTime": null,
+ "stream": "ticker_event",
+ "useEarliestSequenceNumber": false,
+ "type": "kafka"
+ }
+ }
+}
+```
+
+### Query with Post-aggregator:
+
+```json
+{
+ "queryType": "timeseries",
+ "dataSource": {
+ "type": "table",
+ "name": "ticker_event_bitmap64_exact_count_rollup"
+ },
+ "intervals": {
+ "type": "intervals",
+ "intervals": [
+ "2020-09-13T06:35:35.000Z/146140482-04-24T15:36:27.903Z"
+ ]
+ },
+ "descending": false,
+ "virtualColumns": [],
+ "filter": null,
+ "granularity": {
+ "type": "all"
+ },
+ "aggregations": [
+ {
+ "type": "count",
+ "name": "cnt"
+ },
+ {
+ "type": "Bitmap64ExactCountMerge",
+ "name": "a0",
+ "fieldName": "count"
+ }
+ ],
+ "postAggregations": [
+ {
+ "type": "arithmetic",
+ "fn": "/",
+ "fields": [
+ {
+ "type": "bitmap64ExactCount",
+ "name": "a0",
+ "fieldName": "a0"
+ },
+ {
+ "type": "fieldAccess",
+ "name": "cnt",
+ "fieldName": "cnt"
+ }
+ ],
+ "name": "rollup_rate"
+ }
+ ],
+ "limit": 2147483647
+}
+```
diff --git a/docs/35.0.0/development/extensions-contrib/druid-ranger-security.md b/docs/35.0.0/development/extensions-contrib/druid-ranger-security.md
new file mode 100644
index 0000000000..502358f801
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/druid-ranger-security.md
@@ -0,0 +1,130 @@
+---
+id: druid-ranger-security
+title: "Apache Ranger Security"
+---
+
+
+
+This Apache Druid extension adds an Authorizer which implements access control for Druid, backed by [Apache Ranger](https://ranger.apache.org/). Please see [Authentication and Authorization](../../operations/auth.md) for more information on the basic facilities this extension provides.
+
+Make sure to [include](../../configuration/extensions.md#loading-extensions) `druid-ranger-security` in the extensions load list.
+
+
+## Configuration
+
+Support for Apache Ranger authorization consists of three elements:
+* configuring the extension in Apache Druid
+* configuring the connection to Apache Ranger
+* providing the service definition for Druid to Apache Ranger
+
+### Enabling the extension
+Ensure that you have a valid authenticator chain and escalator set in your `common.runtime.properties`. For every authenticator you wish to use the authorizer for, set `druid.auth.authenticator.<authenticatorName>.authorizerName` to the name you will give the authorizer, for example `ranger`.
+
+Then add the following, amending it to your needs (for example, if you need to use multiple authorizers):
+
+```
+druid.auth.authorizers=["ranger"]
+druid.auth.authorizer.ranger.type=ranger
+```
+
+The following is an example that showcases using `druid-basic-security` for authentication and `druid-ranger-security` for authorization.
+
+```
+druid.auth.authenticatorChain=["basic"]
+druid.auth.authenticator.basic.type=basic
+druid.auth.authenticator.basic.initialAdminPassword=password1
+druid.auth.authenticator.basic.initialInternalClientPassword=password2
+druid.auth.authenticator.basic.credentialsValidator.type=metadata
+druid.auth.authenticator.basic.skipOnFailure=false
+druid.auth.authenticator.basic.enableCacheNotifications=true
+druid.auth.authenticator.basic.authorizerName=ranger
+
+druid.auth.authorizers=["ranger"]
+druid.auth.authorizer.ranger.type=ranger
+
+# Escalator
+druid.escalator.type=basic
+druid.escalator.internalClientUsername=druid_system
+druid.escalator.internalClientPassword=password2
+druid.escalator.authorizerName=ranger
+```
+
+:::info
+ Contrary to the documentation of `druid-basic-auth`, Ranger does not automatically provision a highly privileged system user; you will need to do this yourself. In the case of `druid-basic-auth` this system user is named `druid_system`, and for the escalator it is configurable, as shown above. Make sure to take note of these user names and configure `READ` access to `state:STATE` and to `config:security` in your Ranger policies, otherwise system services will not work properly.
+:::
+
+#### Properties to configure the extension in Apache Druid
+|Property|Description|Default|required|
+|--------|-----------|-------|--------|
+|`druid.auth.ranger.keytab`|Defines the keytab to be used while authenticating against Apache Ranger to obtain policies and provide auditing|null|No|
+|`druid.auth.ranger.principal`|Defines the principal to be used while authenticating against Apache Ranger to obtain policies and provide auditing|null|No|
+|`druid.auth.ranger.use_ugi`|Determines if groups that the authenticated user belongs to should be obtained from Hadoop's `UserGroupInformation`|null|No|
+
+### Configuring the connection to Apache Ranger
+
+The Apache Ranger authorization extension reads several configuration files. Discussing their full contents is beyond the scope of this document; create them according to your needs. At a minimum you need a `ranger-druid-security.xml` file on the classpath (e.g. in `_common`). For auditing, the configuration is in `ranger-druid-audit.xml`.
+
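+For reference only, a minimal `ranger-druid-security.xml` is a Hadoop-style configuration file. The property names below assume the standard Ranger plugin naming scheme (`ranger.plugin.druid.*`) and a local Ranger admin server; adjust the service name and URL to your installation and consult the Apache Ranger documentation for the full set of options:
+
+```xml
+<?xml version="1.0"?>
+<configuration>
+  <!-- Name of the Druid service created in the Ranger admin UI -->
+  <property>
+    <name>ranger.plugin.druid.service.name</name>
+    <value>druid</value>
+  </property>
+  <!-- URL of the Ranger admin server that serves the policies -->
+  <property>
+    <name>ranger.plugin.druid.policy.rest.url</name>
+    <value>http://localhost:6080</value>
+  </property>
+</configuration>
+```
+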
+### Adding the service definition for Apache Druid to Apache Ranger
+
+At the time of writing of this document Apache Ranger (2.0) does not include an out of the box service and service definition for Druid. You can add the service definition to Apache Ranger by entering the following command:
+
+`curl -u <admin_user>:<password> -d "@ranger-servicedef-druid.json" -X POST -H "Accept: application/json" -H "Content-Type: application/json" http://localhost:6080/service/public/v2/api/servicedef/`
+
+You should get back `json` describing the service definition you just added. You can now go to the web interface of Apache Ranger which should now include a widget for "Druid". Click the plus sign and create the new service. Ensure your service name is equal to what you configured in `ranger-druid-security.xml`.
+
+#### Configuring Apache Ranger policies
+
+When installing a new Druid service in Apache Ranger for the first time, Ranger will provision the policies to allow the administrative user `read/write` access to all properties and data sources. You might want to limit this. Do not forget to add the correct policies for the `druid_system` user and the `internalClientUserName` of the escalator.
+
+:::info
+ Loading new data sources requires `write` access to the `datasource` prior to the loading itself. So if you want to create a datasource `wikipedia` you are required to have an `allow` policy inside Apache Ranger before trying to load the spec.
+:::
+
+## Usage
+
+### HTTP methods
+
+For information on what HTTP methods are supported for a particular request endpoint, please refer to the [API documentation](../../api-reference/api-reference.md).
+
+GET requires READ permission, while POST and DELETE require WRITE permission.
+
+### SQL Permissions
+
+Queries on Druid datasources require DATASOURCE READ permissions for the specified datasource.
+
+Queries on the [INFORMATION_SCHEMA tables](../../querying/sql-metadata-tables.md#information-schema) will return information about datasources that the caller has DATASOURCE READ access to. Other datasources will be omitted.
+
+Queries on the [system schema tables](../../querying/sql-metadata-tables.md#system-schema) require the following permissions:
+- `segments`: Segments will be filtered based on DATASOURCE READ permissions.
+- `servers`: The user requires STATE READ permissions.
+- `server_segments`: The user requires STATE READ permissions and segments will be filtered based on DATASOURCE READ permissions.
+- `tasks`: Tasks will be filtered based on DATASOURCE READ permissions.
+
+
+### Debugging
+
+If you face difficulty grasping why access is denied to certain elements, and the `audit` section in Apache Ranger does not give you any detail, you can enable debug logging for `org.apache.druid.security.ranger`. To do so add the following in your `log4j2.xml`:
+
+```xml
+<!-- Assumes an appender named "Console" is defined elsewhere in your log4j2.xml -->
+<Logger name="org.apache.druid.security.ranger" level="debug" additivity="false">
+  <AppenderRef ref="Console"/>
+</Logger>
+```
diff --git a/docs/35.0.0/development/extensions-contrib/gce-extensions.md b/docs/35.0.0/development/extensions-contrib/gce-extensions.md
new file mode 100644
index 0000000000..2a8da66154
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/gce-extensions.md
@@ -0,0 +1,103 @@
+---
+id: gce-extensions
+title: "GCE Extensions"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `gce-extensions` in the extensions load list.
+
+At the moment, this extension only enables Druid to autoscale instances in GCE.
+
+The extension manages the instances to be scaled up and down through the use of the [Managed Instance Groups](https://cloud.google.com/compute/docs/instance-groups/creating-groups-of-managed-instances#resize_managed_group)
+of GCE (MIG from now on). This choice has been made to ease the configuration of the machines and simplify their
+management.
+
+For this reason, in order to use this extension, the user must have created
+1. An instance template with the right machine type and image to be used to run the Middle Manager
+2. A MIG that has been configured to use the instance template created in the point above
+
+Moreover, in order to be able to rescale the machines in the MIG, the Overlord must run with a service account
+guaranteeing the following two scopes from the [Compute Engine API](https://developers.google.com/identity/protocols/googlescopes#computev1)
+- `https://www.googleapis.com/auth/cloud-platform`
+- `https://www.googleapis.com/auth/compute`
+
+## Overlord Dynamic Configuration
+
+The Overlord can dynamically change worker behavior.
+
+The JSON object can be submitted to the Overlord via a POST request at:
+
+```
+http://<OVERLORD_IP>:<OVERLORD_PORT>/druid/indexer/v1/worker
+```
+
+Optional Header Parameters for auditing the config change can also be specified.
+
+|Header Param Name| Description | Default |
+|----------|-------------|---------|
+|`X-Druid-Author`| author making the config change|""|
+|`X-Druid-Comment`| comment describing the change being done|""|
+
+A sample worker config spec is shown below:
+
+```json
+{
+ "autoScaler": {
+ "envConfig" : {
+ "numInstances" : 1,
+ "projectId" : "super-project",
+ "zoneName" : "us-central-1",
+ "managedInstanceGroupName" : "druid-middlemanagers"
+ },
+ "maxNumWorkers" : 4,
+ "minNumWorkers" : 2,
+ "type" : "gce"
+ }
+}
+```
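+
+For example, assuming the spec above is saved as `gce-worker-config.json`, it could be submitted together with the optional audit headers roughly as follows (the Overlord address is a placeholder):
+
+```
+curl -X POST \
+  -H 'Content-Type: application/json' \
+  -H 'X-Druid-Author: admin' \
+  -H 'X-Druid-Comment: enable GCE autoscaling' \
+  -d @gce-worker-config.json \
+  http://<OVERLORD_IP>:<OVERLORD_PORT>/druid/indexer/v1/worker
+```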
+
+The configuration of the autoscaler is quite simple and it is made of two levels only.
+
+The external level specifies the `type` (always `gce` in this case) and two numeric values,
+`maxNumWorkers` and `minNumWorkers`, which define the boundaries within which the
+number of instances must stay at any time.
+
+The internal level is the `envConfig` and it is used to specify
+
+- `numInstances`: how many workers are spawned at each request to provision more
+workers. It is safe to leave this at `1`
+- `projectId`: the name of the project in which the MIG resides
+- `zoneName`: the zone of the world in which the MIG is located
+- `managedInstanceGroupName`: the MIG containing the instances to be created or
+removed
+
+Please refer to the Overlord Dynamic Configuration section in the main [documentation](../../configuration/index.md)
+for parameters other than the ones specified here, such as `selectStrategy` etc.
+
+## Known limitations
+
+- The module internally uses the [ListManagedInstances](https://cloud.google.com/compute/docs/reference/rest/v1/instanceGroupManagers/listManagedInstances)
+ call from the API and, while the documentation of the API states that the call can be paged through using the
+ `pageToken` argument, the responses to such a call do not provide any `nextPageToken` to set that parameter. This means
+ that the extension can operate safely with a maximum of 500 Middle Manager instances at any time (the maximum number
+ of instances to be returned for each call).
+
\ No newline at end of file
diff --git a/docs/35.0.0/development/extensions-contrib/graphite.md b/docs/35.0.0/development/extensions-contrib/graphite.md
new file mode 100644
index 0000000000..a6e04e9b00
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/graphite.md
@@ -0,0 +1,117 @@
+---
+id: graphite
+title: "Graphite Emitter"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `graphite-emitter` in the extensions load list.
+
+## Introduction
+
+This extension emits druid metrics to a graphite carbon server.
+Metrics can be sent by using [plaintext](http://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-plaintext-protocol) or [pickle](http://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-pickle-protocol) protocol.
+The pickle protocol is more efficient and supports sending batches of metrics in one request (the plaintext protocol sends only one metric per request); the batch size is configurable.
+
+## Configuration
+
+All the configuration parameters for graphite emitter are under `druid.emitter.graphite`.
+
+|property|description|required?|default|
+|--------|-----------|---------|-------|
+|`druid.emitter.graphite.hostname`|The hostname of the graphite server.|yes|none|
+|`druid.emitter.graphite.port`|The port of the graphite server.|yes|none|
+|`druid.emitter.graphite.batchSize`|Number of events to send as one batch (only for pickle protocol)|no|100|
+|`druid.emitter.graphite.protocol`|Graphite protocol; available protocols: pickle, plaintext.|no|pickle|
+|`druid.emitter.graphite.eventConverter`| Filter and converter of druid events to graphite event (please see next section).|yes|none|
+|`druid.emitter.graphite.flushPeriod` | Queue flushing period in milliseconds. |no|1 minute|
+|`druid.emitter.graphite.maxQueueSize`| Maximum size of the queue used to buffer events. |no|`MAX_INT`|
+|`druid.emitter.graphite.alertEmitters`| List of emitters where alerts will be forwarded to. This is a JSON list of emitter names, e.g. `["logging", "http"]`|no| empty list (no forwarding)|
+|`druid.emitter.graphite.requestLogEmitters`| List of emitters where request logs (i.e., query logging events sent to emitters when `druid.request.logging.type` is set to `emitter`) will be forwarded to. This is a JSON list of emitter names, e.g. `["logging", "http"]`|no| empty list (no forwarding)|
+|`druid.emitter.graphite.emitWaitTime` | Wait time in milliseconds to try to send the event; if it cannot be sent within this time, the emitter throws the event away. |no|0|
+|`druid.emitter.graphite.waitForEventTime` | Waiting time in milliseconds, if necessary, for an event to become available. |no|1000 (1 sec)|
+
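+For illustration, a runtime configuration along these lines would emit metrics over the pickle protocol; the hostname is a placeholder and the extension is assumed to be selected through the standard `druid.emitter` property:
+
+```
+# graphite-emitter must already be in druid.extensions.loadList
+druid.emitter=graphite
+druid.emitter.graphite.hostname=graphite.example.com
+druid.emitter.graphite.port=2004
+druid.emitter.graphite.protocol=pickle
+druid.emitter.graphite.batchSize=100
+druid.emitter.graphite.eventConverter={"type":"all", "namespacePrefix": "druid", "ignoreHostname":false, "ignoreServiceName":false}
+```
+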
+### Supported event types
+
+The graphite emitter only emits service metric events to graphite (See [Druid Metrics](../../operations/metrics.md) for a list of metrics).
+
+Alerts and request logs are not sent to graphite. These event types are not well represented in Graphite, which is more suited for timeseries views on numeric metrics, vs. storing non-numeric log events.
+
+Instead, alerts and request logs are optionally forwarded to other emitter implementations, specified by `druid.emitter.graphite.alertEmitters` and `druid.emitter.graphite.requestLogEmitters` respectively.
+
+### Druid to Graphite Event Converter
+
+The Graphite Event Converter defines a mapping from a Druid metric name plus dimensions to a Graphite metric path.
+The Graphite metric path is organized using the following schema:
+`<namespacePrefix>.[<druid service name>].[<druid hostname>].[<dimension values>].<druid metric name>`
+Properly naming the metrics is critical to avoid conflicts, confusing data and potentially wrong interpretation later on.
+
+Example `druid.historical.hist-host1_yahoo_com:8080.MyDataSourceName.GroupBy.query/time`:
+
+ * `druid` -> namespace prefix
+ * `historical` -> service name
+ * `hist-host1.yahoo.com:8080` -> druid hostname
+ * `MyDataSourceName` -> dimension value
+ * `GroupBy` -> dimension value
+ * `query/time` -> metric name
+
+We have two different implementation of event converter:
+
+#### Send-All converter
+
+The first implementation, called `all`, will send all the Druid service metric events.
+The path will be in the form `<namespacePrefix>.[<druid service name>].[<druid hostname>].[<dimension values>].<druid metric name>`.
+The user has control of `<namespacePrefix>`, `[<druid service name>]`, and `[<druid hostname>]`.
+
+You can omit the hostname by setting `ignoreHostname=true`
+`druid.SERVICE_NAME.dataSourceName.queryType.query/time`
+
+You can omit the service name by setting `ignoreServiceName=true`
+`druid.HOSTNAME.dataSourceName.queryType.query/time`
+
+Elements in metric name by default are separated by "/", so graphite will create all metrics on one level. If you want to have metrics in the tree structure, you have to set `replaceSlashWithDot=true`
+Original: `druid.HOSTNAME.dataSourceName.queryType.query/time`
+Changed: `druid.HOSTNAME.dataSourceName.queryType.query.time`
+
+
+```json
+
+druid.emitter.graphite.eventConverter={"type":"all", "namespacePrefix": "druid.test", "ignoreHostname":true, "ignoreServiceName":true}
+
+```
+
+#### White-list based converter
+
+The second implementation, called `whiteList`, will send only the white-listed metrics and dimensions.
+As with the `all` converter, the user has control of `<namespacePrefix>`, `[<druid service name>]`, and `[<druid hostname>]`.
+The white-list based converter comes with a default white list map located under resources in `./src/main/resources/defaultWhiteListMap.json`.
+
+You can override the default white list map by supplying a property called `mapPath`.
+This property is a String containing the path to the file containing the **white list map JSON object**.
+For example, the following converter will read the map from the file `/pathPrefix/fileName.json`:
+
+```json
+
+druid.emitter.graphite.eventConverter={"type":"whiteList", "namespacePrefix": "druid.test", "ignoreHostname":true, "ignoreServiceName":true, "mapPath":"/pathPrefix/fileName.json"}
+
+```
+
+**Druid emits a huge number of metrics, so we highly recommend using the `whiteList` converter.**
diff --git a/docs/35.0.0/development/extensions-contrib/iceberg.md b/docs/35.0.0/development/extensions-contrib/iceberg.md
new file mode 100644
index 0000000000..e2a5a06cb9
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/iceberg.md
@@ -0,0 +1,149 @@
+---
+id: iceberg
+title: "Iceberg extension"
+---
+
+
+
+
+
+## Iceberg Ingest extension
+
+Apache Iceberg is an open table format for huge analytic datasets. [IcebergInputSource](../../ingestion/input-sources.md#iceberg-input-source) lets you ingest data stored in the Iceberg table format into Apache Druid. To use the Iceberg extension, add `druid-iceberg-extensions` to the list of loaded extensions. See [Loading extensions](../../configuration/extensions.md#loading-extensions) for more information.
+
+Iceberg manages most of its metadata in metadata files in the object storage. However, it is still dependent on a metastore to manage a certain amount of metadata.
+Iceberg refers to these metastores as catalogs. The Iceberg extension lets you connect to the following Iceberg catalog types:
+
+* Glue catalog
+* REST-based catalog
+* Hive metastore catalog
+* Local catalog
+
+For a given catalog, the Iceberg input source reads the table name from the catalog, applies the filters, and extracts all the underlying live data files up to the latest snapshot.
+The data files can be in Parquet, ORC, or Avro formats. The data files typically reside in a warehouse location, which can be in HDFS, S3, or the local filesystem.
+The `druid-iceberg-extensions` extension relies on the existing input source connectors in Druid to read the data files from the warehouse. Therefore, the Iceberg input source can be considered an intermediate input source, which provides the file paths for other input source implementations.
+
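+As a rough sketch of how the pieces fit together, an `ioConfig` might wrap the Iceberg input source around a Hive catalog and an HDFS warehouse as below. The table name, namespace, and URIs are placeholders; check the exact field names against the [Iceberg input source](../../ingestion/input-sources.md#iceberg-input-source) reference.
+
+```json
+"ioConfig": {
+  "type": "index_parallel",
+  "inputSource": {
+    "type": "iceberg",
+    "tableName": "sales_events",
+    "namespace": "analytics",
+    "icebergCatalog": {
+      "type": "hive",
+      "warehousePath": "hdfs://namenode:8020/warehouse",
+      "catalogUri": "thrift://hive-metastore:9083",
+      "catalogProperties": {}
+    },
+    "warehouseSource": {
+      "type": "hdfs"
+    }
+  },
+  "inputFormat": {
+    "type": "parquet"
+  }
+}
+```
+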
+## Hive metastore catalog
+
+For Druid to seamlessly talk to the Hive metastore, ensure that the Hive configuration files such as `hive-site.xml` and `core-site.xml` are available in the Druid classpath for peon processes.
+You can also specify Hive properties under the `catalogProperties` object in the ingestion spec.
+
+The `druid-iceberg-extensions` extension presently only supports HDFS, S3 and local warehouse directories.
+
+### Read from HDFS warehouse
+
+To read from an HDFS warehouse, load the `druid-hdfs-storage` extension. Druid extracts data file paths from the Hive metastore catalog and uses the [HDFS input source](../../ingestion/input-sources.md#hdfs-input-source) to ingest these files.
+The `warehouseSource` type in the ingestion spec should be `hdfs`.
+
+For authenticating with Kerberized clusters, include `principal` and `keytab` properties in the `catalogProperties` object:
+
+```json
+"catalogProperties": {
+ "principal": "krb_principal",
+ "keytab": "/path/to/keytab"
+}
+```
+Only Kerberos-based authentication is currently supported.
+
+### Read from S3 warehouse
+
+To read from an S3 warehouse, load the `druid-s3-extensions` extension. Druid extracts the data file paths from the Hive metastore catalog and uses `S3InputSource` to ingest these files.
+Set the `type` property of the `warehouseSource` object to `s3` in the ingestion spec. If the S3 endpoint for the warehouse is different from the endpoint configured as the deep storage, include the following properties in the `warehouseSource` object to define the S3 endpoint settings:
+
+```json
+"warehouseSource": {
+ "type": "s3",
+ "endpointConfig": {
+ "url": "S3_ENDPOINT_URL",
+ "signingRegion": "us-east-1"
+ },
+ "clientConfig": {
+ "protocol": "http",
+ "disableChunkedEncoding": true,
+ "enablePathStyleAccess": true,
+ "forceGlobalBucketAccessEnabled": false
+ },
+ "properties": {
+ "accessKeyId": {
+ "type": "default",
+ "password": ""
+ }
+ }
+}
+```
+
+This extension uses the [Hadoop AWS module](https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/) to connect to S3 and retrieve the metadata and data file paths.
+The following properties are required in the `catalogProperties`:
+
+```json
+"catalogProperties": {
+ "fs.s3a.access.key" : "S3_ACCESS_KEY",
+ "fs.s3a.secret.key" : "S3_SECRET_KEY",
+ "fs.s3a.endpoint" : "S3_API_ENDPOINT"
+}
+```
+Since the Hadoop AWS connector uses the `s3a` filesystem client, specify the warehouse path with the `s3a://` protocol instead of `s3://`.
+
+## Local catalog
+
+The local catalog type can be used for catalogs configured on the local filesystem. Set the `icebergCatalog` type to `local`. You can use this catalog for demos or localized tests. It is not recommended for production use cases.
+The `warehouseSource` is set to `local` because this catalog only supports reading from a local filesystem.
+
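+For instance, a local catalog configuration might look roughly like the following; the warehouse path is a placeholder:
+
+```json
+"icebergCatalog": {
+  "type": "local",
+  "warehousePath": "/tmp/iceberg_warehouse",
+  "catalogProperties": {}
+},
+"warehouseSource": {
+  "type": "local"
+}
+```
+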
+## REST catalog
+
+To connect to an Iceberg REST Catalog server, configure the `icebergCatalog` type as `rest`. The Iceberg REST Open API spec gives catalogs greater control over the implementation and in most cases, the `warehousePath` does not have to be provided by the client.
+Security credentials may be provided in the `catalogProperties` object.
+
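+A minimal sketch, assuming the REST catalog accepts the server URI through a `catalogUri` property (verify the exact property names against the input source reference):
+
+```json
+"icebergCatalog": {
+  "type": "rest",
+  "catalogUri": "http://iceberg-rest-catalog:8181",
+  "catalogProperties": {}
+}
+```
+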
+## Glue catalog
+
+Configure the `icebergCatalog` type as `glue`. The `warehousePath` and catalog properties must be provided in the `catalogProperties` object.
+Refer to the [Iceberg Glue Catalog documentation](https://iceberg.apache.org/docs/1.6.0/aws/#glue-catalog) for the properties to set.
+
+
+## Downloading Iceberg extension
+
+To download `druid-iceberg-extensions`, run the following command after replacing `<VERSION>` with the desired
+Druid version:
+
+```shell
+java \
+ -cp "lib/*" \
+ -Ddruid.extensions.directory="extensions" \
+ -Ddruid.extensions.hadoopDependenciesDir="hadoop-dependencies" \
+ org.apache.druid.cli.Main tools pull-deps \
+ --no-default-hadoop \
+ -c "org.apache.druid.extensions.contrib:druid-iceberg-extensions:"
+```
+
+See [Loading community extensions](../../configuration/extensions.md#loading-community-extensions) for more information.
+
+## Known limitations
+
+This section lists the known limitations that apply to the Iceberg extension.
+
+- This extension does not fully utilize the Iceberg features such as snapshotting or schema evolution.
+- The Iceberg input source reads every single live file on the Iceberg table up to the latest snapshot, which makes the table scan less performant. It is recommended to use Iceberg filters on partition columns in the ingestion spec in order to limit the number of data files being retrieved. Since Druid doesn't store the last ingested Iceberg snapshot ID, it cannot identify the files created between that snapshot and the latest snapshot on Iceberg.
+- It does not handle Iceberg [schema evolution](https://iceberg.apache.org/docs/latest/evolution/) yet. In cases where an existing Iceberg table column is deleted and recreated with the same name, ingesting this table into Druid may bring the data for this column before it was deleted.
+- The Hive catalog has not been tested on Hadoop 2.x.x and is not guaranteed to work with Hadoop 2.
\ No newline at end of file
diff --git a/docs/35.0.0/development/extensions-contrib/influx.md b/docs/35.0.0/development/extensions-contrib/influx.md
new file mode 100644
index 0000000000..eec9fb555e
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/influx.md
@@ -0,0 +1,67 @@
+---
+id: influx
+title: "InfluxDB Line Protocol Parser"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `druid-influx-extensions` in the extensions load list.
+
+This extension enables Druid to parse the [InfluxDB Line Protocol](https://docs.influxdata.com/influxdb/v1.5/write_protocols/line_protocol_tutorial/), a popular text-based timeseries metric serialization format.
+
+## Line Protocol
+
+A typical line looks like this:
+
+```cpu,application=dbhost=prdb123,region=us-east-1 usage_idle=99.24,usage_user=0.55 1520722030000000000```
+
+which contains four parts:
+
+ - measurement: A string indicating the name of the measurement represented (e.g. cpu, network, web_requests)
+ - tags: zero or more key-value pairs (i.e. dimensions)
+ - measurements: one or more key-value pairs; values can be numeric, boolean, or string
+ - timestamp: nanoseconds since Unix epoch (the parser truncates it to milliseconds)
+
+The parser extracts these fields into a map, giving the measurement the key `measurement` and the timestamp the key `_ts`. The tag and measurement keys are copied verbatim, so users should take care to avoid name collisions. It is up to the ingestion spec to decide which fields should be treated as dimensions and which should be treated as metrics (typically tags correspond to dimensions and measurements correspond to metrics).
+
+The parser is configured like so:
+
+```json
+"parser": {
+ "type": "string",
+ "parseSpec": {
+ "format": "influx",
+ "timestampSpec": {
+ "column": "__ts",
+ "format": "millis"
+ },
+ "dimensionsSpec": {
+ "dimensionExclusions": [
+ "__ts"
+ ]
+ },
+ "whitelistMeasurements": [
+ "cpu"
+ ]
+ }
+}
+```
+
+The `whitelistMeasurements` field is an optional list of strings. If present, measurements that do not match one of the strings in the list will be ignored.
diff --git a/docs/35.0.0/development/extensions-contrib/influxdb-emitter.md b/docs/35.0.0/development/extensions-contrib/influxdb-emitter.md
new file mode 100644
index 0000000000..1086a5121e
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/influxdb-emitter.md
@@ -0,0 +1,78 @@
+---
+id: influxdb-emitter
+title: "InfluxDB Emitter"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `druid-influxdb-emitter` in the extensions load list.
+
+## Introduction
+
+This extension emits druid metrics to [InfluxDB](https://www.influxdata.com/time-series-platform/influxdb/) over HTTP. Currently this emitter only emits service metric events to InfluxDB (See [Druid metrics](../../operations/metrics.md) for a list of metrics).
+When a metric event is fired it is added to a queue of events. After a configurable amount of time, the events on the queue are transformed to InfluxDB's line protocol
+and POSTed to the InfluxDB HTTP API. The entire queue is flushed at this point. The queue is also flushed as the emitter is shutdown.
+
+Note that authentication and authorization must be [enabled](https://docs.influxdata.com/influxdb/v1.7/administration/authentication_and_authorization/) on the InfluxDB server.
+
+## Configuration
+
+All the configuration parameters for the influxdb emitter are under `druid.emitter.influxdb`.
+
+|Property|Description|Required?|Default|
+|--------|-----------|---------|-------|
+|`druid.emitter.influxdb.hostname`|The hostname of the InfluxDB server.|Yes|N/A|
+|`druid.emitter.influxdb.port`|The port of the InfluxDB server.|No|8086|
+|`druid.emitter.influxdb.protocol`|The protocol used to send metrics to InfluxDB. One of http/https|No|http|
+|`druid.emitter.influxdb.trustStorePath`|The path to the trustStore to be used for https|No|none|
+|`druid.emitter.influxdb.trustStoreType`|The trustStore type to be used for https|No|`jks`|
+|`druid.emitter.influxdb.trustStorePassword`|The trustStore password to be used for https|No|none|
+|`druid.emitter.influxdb.databaseName`|The name of the database in InfluxDB.|Yes|N/A|
+|`druid.emitter.influxdb.maxQueueSize`|The size of the queue that holds events.|No|Integer.MAX_VALUE(=2^31-1)|
+|`druid.emitter.influxdb.flushPeriod`|How often (in milliseconds) the events queue is parsed into Line Protocol and POSTed to InfluxDB.|No|60000|
+|`druid.emitter.influxdb.flushDelay`|How long (in milliseconds) the scheduled method will wait until it first runs.|No|60000|
+|`druid.emitter.influxdb.influxdbUserName`|The username for authenticating with the InfluxDB database.|Yes|N/A|
+|`druid.emitter.influxdb.influxdbPassword`|The password of the database authorized user|Yes|N/A|
+|`druid.emitter.influxdb.dimensionWhitelist`|A whitelist of metric dimensions to include as tags|No|`["dataSource","type","numMetrics","numDimensions","threshold","dimension","taskType","taskStatus","tier"]`|
+
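+For illustration, a configuration along these lines would flush metrics to an InfluxDB database every minute; the host, database, and credentials are placeholders:
+
+```
+# druid-influxdb-emitter must already be in druid.extensions.loadList
+druid.emitter=influxdb
+druid.emitter.influxdb.hostname=influxdb.example.com
+druid.emitter.influxdb.port=8086
+druid.emitter.influxdb.databaseName=druid_metrics
+druid.emitter.influxdb.influxdbUserName=druid
+druid.emitter.influxdb.influxdbPassword=<password>
+druid.emitter.influxdb.flushPeriod=60000
+```
+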
+## InfluxDB Line Protocol
+
+An example of how this emitter parses a Druid metric event into InfluxDB's [line protocol](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_reference/) is given here:
+
+The syntax of the line protocol is:
+
+`<measurement>[,<tag_key>=<tag_value>[,<tag_key>=<tag_value>]] <field_key>=<field_value>[,<field_key>=<field_value>] [<timestamp>]`
+
+where timestamp is in nanoseconds since epoch.
+
+A typical service metric event as recorded by Druid's logging emitter is: `Event [{"feed":"metrics","timestamp":"2017-10-31T09:09:06.857Z","service":"druid/historical","host":"historical001:8083","version":"0.11.0-SNAPSHOT","metric":"query/cache/total/hits","value":34787256}]`.
+
+This event is parsed into line protocol according to these rules:
+
+* The measurement becomes druid_query since query is the first part of the metric.
+* The tags are service=druid/historical, hostname=historical001, metric=druid_cache_total. (The metric tag is the middle part of the druid metric separated with _ and preceded by druid_. Another example would be if an event has metric=query/time then there is no middle part and hence no metric tag)
+* The field is druid_hits since this is the last part of the metric.
+
+This gives the following String which can be POSTed to InfluxDB: `"druid_query,service=druid/historical,hostname=historical001,metric=druid_cache_total druid_hits=34787256 1509440946857000000"`
+
+The InfluxDB emitter has a white list of dimensions
+which will be added as a tag to the line protocol string if the metric has a dimension from the white list.
+The value of the dimension is sanitized such that every occurrence of a dot or whitespace is replaced with a `_`.
diff --git a/docs/35.0.0/development/extensions-contrib/kafka-emitter.md b/docs/35.0.0/development/extensions-contrib/kafka-emitter.md
new file mode 100644
index 0000000000..772c0ff405
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/kafka-emitter.md
@@ -0,0 +1,66 @@
+---
+id: kafka-emitter
+title: "Kafka Emitter"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `kafka-emitter` in the extensions load list.
+
+## Introduction
+
+This extension emits Druid metrics to [Apache Kafka](https://kafka.apache.org) directly with JSON format.
+Kafka has both a rich ecosystem and a readily available consumer API.
+So, if you already use Kafka, it's easy to integrate various tools or UIs
+to monitor the status of your Druid cluster with this extension.
+
+## Configuration
+
+All the configuration parameters for the Kafka emitter are under `druid.emitter.kafka`.
+
+| Property | Description | Required | Default |
+|----------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------|-----------|-----------------------|
+| `druid.emitter.kafka.bootstrap.servers` | Comma-separated Kafka broker. (`[hostname:port],[hostname:port]...`) | yes | none |
+| `druid.emitter.kafka.event.types` | Comma-separated event types. Supported types are `alerts`, `metrics`, `requests`, and `segment_metadata`. | no | `["metrics", "alerts"]` |
+| `druid.emitter.kafka.metric.topic` | Kafka topic name for emitter's target to emit service metrics. If `event.types` contains `metrics`, this field cannot be empty. | no | none |
+| `druid.emitter.kafka.alert.topic` | Kafka topic name for emitter's target to emit alerts. If `event.types` contains `alerts`, this field cannot be empty. | no | none |
+| `druid.emitter.kafka.request.topic` | Kafka topic name for emitter's target to emit request logs. If `event.types` contains `requests`, this field cannot be empty. | no | none |
+| `druid.emitter.kafka.segmentMetadata.topic` | Kafka topic name for emitter's target to emit segment metadata. If `event.types` contains `segment_metadata`, this field cannot be empty. | no | none |
+| `druid.emitter.kafka.producer.config` | JSON configuration to set additional properties to Kafka producer. | no | none |
+| `druid.emitter.kafka.clusterName` | Optional value to specify the name of your Druid cluster. It can help make groups in your monitoring environment. | no | none |
+| `druid.emitter.kafka.extra.dimensions` | Optional JSON configuration to specify a map of extra string dimensions for the events emitted. These can help make groups in your monitoring environment. | no | none |
+| `druid.emitter.kafka.producer.hiddenProperties` | JSON configuration to specify sensitive Kafka producer properties such as username and password. This property accepts a [DynamicConfigProvider](../../operations/dynamic-config-provider.md) implementation. | no | none |
+| `druid.emitter.kafka.producer.shutdownTimeout` | Duration in milliseconds the Kafka producer waits for pending requests to finish before shutting down. | no | Long.MAX_VALUE |
+
+### Example
+
+```
+druid.emitter.kafka.bootstrap.servers=hostname1:9092,hostname2:9092
+druid.emitter.kafka.event.types=["metrics", "alerts", "requests", "segment_metadata"]
+druid.emitter.kafka.metric.topic=druid-metric
+druid.emitter.kafka.alert.topic=druid-alert
+druid.emitter.kafka.request.topic=druid-request-logs
+druid.emitter.kafka.segmentMetadata.topic=druid-segment-metadata
+druid.emitter.kafka.producer.config={"max.block.ms":10000}
+druid.emitter.kafka.extra.dimensions={"region":"us-east-1","environment":"preProd"}
+druid.emitter.kafka.producer.hiddenProperties={"config":{"sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\\"KV...NI\\" password=\\"gA3...n6a/\\";"}}
+```
+
diff --git a/docs/35.0.0/development/extensions-contrib/materialized-view.md b/docs/35.0.0/development/extensions-contrib/materialized-view.md
new file mode 100644
index 0000000000..a493c3c417
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/materialized-view.md
@@ -0,0 +1,136 @@
+---
+id: materialized-view
+title: "Materialized View"
+---
+
+
+
+
+To use this Apache Druid feature, make sure to load `materialized-view-selection` and `materialized-view-maintenance`. In addition, this feature currently requires a Hadoop cluster.
+
+This feature enables Druid to greatly improve query performance, especially when the query dataSource has a very large number of dimensions but the query only requires a few of them. This feature includes two parts: `materialized-view-maintenance` and `materialized-view-selection`.
+
+## Materialized-view-maintenance
+In materialized-view-maintenance, the dataSources that users ingest are called "base-dataSources". For each base-dataSource, you can submit `derivativeDataSource` supervisors to create and maintain other dataSources, which are called "derived-dataSources". The dimensions and metrics of derived-dataSources are subsets of the base-dataSource's.
+The `derivativeDataSource` supervisor is used to keep the timeline of derived-dataSource consistent with base-dataSource. Each `derivativeDataSource` supervisor is responsible for one derived-dataSource.
+
+A sample derivativeDataSource supervisor spec is shown below:
+
+```json
+ {
+ "type": "derivativeDataSource",
+ "baseDataSource": "wikiticker",
+ "dimensionsSpec": {
+ "dimensions": [
+ "isUnpatrolled",
+ "metroCode",
+ "namespace",
+ "page",
+ "regionIsoCode",
+ "regionName",
+ "user"
+ ]
+ },
+ "metricsSpec": [
+ {
+ "name": "count",
+ "type": "count"
+ },
+ {
+ "name": "added",
+ "type": "longSum",
+ "fieldName": "added"
+ }
+ ],
+ "tuningConfig": {
+ "type": "hadoop"
+ }
+ }
+```
+
+**Supervisor Configuration**
+
+|Field|Description|Required|
+|--------|-----------|---------|
+|Type |The supervisor type. This should always be `derivativeDataSource`.|yes|
+|baseDataSource |The name of base dataSource. This dataSource data should be already stored inside Druid, and the dataSource will be used as input data.|yes|
+|dimensionsSpec |Specifies the dimensions of the data. These dimensions must be the subset of baseDataSource's dimensions.|yes|
+|metricsSpec |A list of aggregators. These metrics must be the subset of baseDataSource's metrics. See [aggregations](../../querying/aggregations.md).|yes|
+|tuningConfig |TuningConfig must be HadoopTuningConfig. See [Hadoop tuning config](../../ingestion/hadoop.md#tuningconfig).|yes|
+|dataSource |The name of this derived dataSource. |no(default=baseDataSource-hashCode of supervisor)|
+|hadoopDependencyCoordinates |A JSON array of Hadoop dependency coordinates that Druid will use, this property will override the default Hadoop coordinates. Once specified, Druid will look for those Hadoop dependencies from the location specified by druid.extensions.hadoopDependenciesDir |no|
+|classpathPrefix |Classpath that will be prepended for the Peon process. |no|
+|context |See below. |no|
+
+**Context**
+
+|Field|Description|Required|
+|--------|-----------|---------|
+|maxTaskCount |The max number of tasks the supervisor can submit simultaneously. |no(default=1)|
+
+## Materialized-view-selection
+
+In materialized-view-selection, we implement a new query type `view`. When we request a view query, Druid will try its best to optimize the query based on query dataSource and intervals.
+
+A sample view query spec is shown below:
+
+```json
+ {
+ "queryType": "view",
+ "query": {
+ "queryType": "groupBy",
+ "dataSource": "wikiticker",
+ "granularity": "all",
+ "dimensions": [
+ "user"
+ ],
+ "limitSpec": {
+ "type": "default",
+ "limit": 1,
+ "columns": [
+ {
+ "dimension": "added",
+ "direction": "descending",
+ "dimensionOrder": "numeric"
+ }
+ ]
+ },
+ "aggregations": [
+ {
+ "type": "longSum",
+ "name": "added",
+ "fieldName": "added"
+ }
+ ],
+ "intervals": [
+ "2015-09-12/2015-09-13"
+ ]
+ }
+ }
+```
+
+There are 2 parts in a view query:
+
+|Field|Description|Required|
+|--------|-----------|---------|
+|queryType |The query type. This should always be `view`.|yes|
+|query |The real query of this `view` query. The real query must be [groupBy](../../querying/groupbyquery.md), [topN](../../querying/topnquery.md), or [timeseries](../../querying/timeseriesquery.md) type.|yes|
+
+**Note that Materialized View is currently designated as experimental. Please make sure the clocks of all processes are synchronized and increase monotonically. Otherwise, some unexpected errors may appear in query results.**
diff --git a/docs/35.0.0/development/extensions-contrib/momentsketch-quantiles.md b/docs/35.0.0/development/extensions-contrib/momentsketch-quantiles.md
new file mode 100644
index 0000000000..eaad48f69c
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/momentsketch-quantiles.md
@@ -0,0 +1,121 @@
+---
+id: momentsketch-quantiles
+title: "Moment Sketches for Approximate Quantiles module"
+---
+
+
+
+
+This module provides aggregators for approximate quantile queries using the [momentsketch](https://github.com/stanford-futuredata/momentsketch) library.
+The momentsketch provides coarse quantile estimates with less space and aggregation time overheads than traditional sketches, approaching the performance of counts and sums by reconstructing distributions from computed statistics.
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `druid-momentsketch` in the extensions load list.
+
+### Aggregator
+
+The result of the aggregation is a momentsketch that is the union of all sketches either built from raw data or read from the segments.
+
+The `momentSketch` aggregator operates over raw data while the `momentSketchMerge` aggregator should be used when aggregating precomputed sketches.
+
+```json
+{
+ "type" : ,
+ "name" : ,
+ "fieldName" : ,
+ "k" : ,
+ "compress" :
+ }
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Type of aggregator desired. Either "momentSketch" or "momentSketchMerge" |yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|fieldName|A String for the name of the input field (can contain sketches or raw numeric values).|yes|
+|k|Parameter that determines the accuracy and size of the sketch. Higher k means higher accuracy but more space to store sketches. Usable range is generally [3,15] |no, defaults to 13.|
+|compress|Flag for whether the aggregator compresses numeric values using arcsinh. Can improve robustness to skewed and long-tailed distributions, but reduces accuracy slightly on more uniform distributions.| no, defaults to true
+
+### Post Aggregators
+
+Users can query for a set of quantiles using the `momentSketchSolveQuantiles` post-aggregator on the sketches created by the `momentSketch` or `momentSketchMerge` aggregators.
+
+```json
+{
+ "type" : "momentSketchSolveQuantiles",
+ "name" : ,
+ "field" : ,
+ "fractions" :
+}
+```
+
+Users can also query for the min/max of a distribution:
+
+```json
+{
+ "type" : "momentSketchMin" | "momentSketchMax",
+ "name" : ,
+ "field" : ,
+}
+```
+
+### Example
+As an example of a query with sketches pre-aggregated at ingestion time, one could set up the following aggregator at ingest:
+
+```json
+{
+ "type": "momentSketch",
+ "name": "sketch",
+ "fieldName": "value",
+ "k": 10,
+ "compress": true,
+}
+```
+
+and make queries using the following aggregator + post-aggregator:
+
+```json
+{
+ "aggregations": [{
+ "type": "momentSketchMerge",
+ "name": "sketch",
+ "fieldName": "sketch",
+ "k": 10,
+ "compress": true
+ }],
+ "postAggregations": [
+ {
+ "type": "momentSketchSolveQuantiles",
+ "name": "quantiles",
+ "fractions": [0.1, 0.5, 0.9],
+ "field": {
+ "type": "fieldAccess",
+ "fieldName": "sketch"
+ }
+ },
+ {
+ "type": "momentSketchMin",
+ "name": "min",
+ "field": {
+ "type": "fieldAccess",
+ "fieldName": "sketch"
+ }
+ }]
+}
+```
\ No newline at end of file
diff --git a/docs/35.0.0/development/extensions-contrib/moving-average-query.md b/docs/35.0.0/development/extensions-contrib/moving-average-query.md
new file mode 100644
index 0000000000..0fcc9c4f5b
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/moving-average-query.md
@@ -0,0 +1,364 @@
+---
+id: moving-average-query
+title: "Moving Average Query"
+---
+
+
+
+
+## Overview
+**Moving Average Query** is an extension which provides support for [Moving Average](https://en.wikipedia.org/wiki/Moving_average) and other Aggregate [Window Functions](https://en.wikibooks.org/wiki/Structured_Query_Language/Window_functions) in Druid queries.
+
+These Aggregate Window Functions consume standard Druid Aggregators and output additional windowed aggregates called [Averagers](#averagers).
+
+#### High level algorithm
+
+Moving Average encapsulates the [groupBy query](../../querying/groupbyquery.md) (Or [timeseries](../../querying/timeseriesquery.md) in case of no dimensions) in order to rely on the maturity of these query types.
+
+It runs the query in two main phases:
+
+1. Runs an inner [groupBy](../../querying/groupbyquery.md) or [timeseries](../../querying/timeseriesquery.md) query to compute Aggregators (i.e. daily count of events).
+2. Passes over aggregated results in Broker, in order to compute Averagers (i.e. moving 7 day average of the daily count).
+
+#### Main enhancements provided by this extension:
+1. Functionality: Extending druid query functionality (i.e. initial introduction of Window Functions).
+2. Performance: Improving performance of such moving aggregations by eliminating multiple segment scans.
+
+#### Further reading
+[Moving Average](https://en.wikipedia.org/wiki/Moving_average)
+
+[Window Functions](https://en.wikibooks.org/wiki/Structured_Query_Language/Window_functions)
+
+[Analytic Functions](https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts)
+
+
+## Operations
+
+### Installation
+Use [pull-deps](../../operations/pull-deps.md) tool shipped with Druid to install this [extension](../../configuration/extensions.md#community-extensions) on all Druid broker and router nodes.
+
+```bash
+java -classpath "/lib/*" org.apache.druid.cli.Main tools pull-deps -c org.apache.druid.extensions.contrib:druid-moving-average-query:{VERSION}
+```
+
+### Enabling
+After installation, to enable this extension, just add `druid-moving-average-query` to `druid.extensions.loadList` in broker and routers' `runtime.properties` file and then restart broker and router nodes.
+
+For example:
+
+```bash
+druid.extensions.loadList=["druid-moving-average-query"]
+```
+
+## Configuration
+There are currently no configuration properties specific to Moving Average.
+
+## Limitations
+* movingAverage is missing support for the following groupBy properties: `subtotalsSpec`, `virtualColumns`.
+* movingAverage is missing support for the following timeseries properties: `descending`.
+* movingAverage averagers consider empty buckets and null aggregation values as 0 unless otherwise noted.
+
+## Query spec
+* Most properties in the query spec derived from [groupBy query](../../querying/groupbyquery.md) / [timeseries](../../querying/timeseriesquery.md), see documentation for these query types.
+
+|property|description|required?|
+|--------|-----------|---------|
+|queryType|This String should always be "movingAverage"; this is the first thing Druid looks at to figure out how to interpret the query.|yes|
+|dataSource|A String or Object defining the data source to query, very similar to a table in a relational database. See [DataSource](../../querying/datasource.md) for more information.|yes|
+|dimensions|A JSON list of [DimensionSpec](../../querying/dimensionspecs.md) (Notice that property is optional)|no|
+|limitSpec|See [LimitSpec](../../querying/limitspec.md)|no|
+|having|See [Having](../../querying/having.md)|no|
+|granularity|A period granularity; See [Period Granularities](../../querying/granularities.md#period-granularities)|yes|
+|filter|See [Filters](../../querying/filters.md)|no|
+|aggregations|Aggregations forms the input to Averagers; See [Aggregations](../../querying/aggregations.md)|yes|
+|postAggregations|Supports only aggregations as input; See [Post Aggregations](../../querying/post-aggregations.md)|no|
+|intervals|A JSON Object representing ISO-8601 Intervals. This defines the time ranges to run the query over.|yes|
+|context|An additional JSON Object which can be used to specify certain flags.|no|
+|averagers|Defines the moving average function; See [Averagers](#averagers)|yes|
+|postAveragers|Support input of both averagers and aggregations; Syntax is identical to postAggregations (See [Post Aggregations](../../querying/post-aggregations.md))|no|
+
+## Averagers
+
+Averagers are used to define the Moving-Average function. Averagers are not limited to an average - they can also provide other types of window functions such as MAX()/MIN().
+
+### Properties
+
+These are properties which are common to all Averagers:
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|Averager type; See [Averager types](#averager-types)|yes|
+|name|Averager name|yes|
+|fieldName|Input name (An aggregation name)|yes|
+|buckets|Number of lookback buckets (time periods), including current one. Must be >0|yes|
+|cycleSize|Cycle size; Used to calculate day-of-week option; See [Cycle size (Day of Week)](#cycle-size-day-of-week)|no, defaults to 1|
+
+
+### Averager types:
+
+* [Standard averagers](#standard-averagers):
+ * doubleMean
+ * doubleMeanNoNulls
+ * doubleSum
+ * doubleMax
+ * doubleMin
+ * longMean
+ * longMeanNoNulls
+ * longSum
+ * longMax
+ * longMin
+
+#### Standard averagers
+
+These averagers offer four functions:
+
+* Mean (Average)
+* MeanNoNulls (Ignores empty buckets).
+* Sum
+* Max
+* Min
+
+**Ignoring nulls**:
+Using a MeanNoNulls averager is useful when the interval starts at the dataset beginning time.
+In that case, the first records will ignore missing buckets and the average won't be artificially low.
+However, this also means that empty days in a sparse dataset will also be ignored.
+
+Example of usage:
+
+```json
+{ "type" : "doubleMean", "name" : , "fieldName": }
+```
+
+### Cycle size (Day of Week)
+This optional parameter is used to calculate over a single bucket within each cycle instead of all buckets.
+A prime example would be weekly buckets, resulting in a Day of Week calculation. (Other examples: Month of year, Hour of day).
+
+I.e. when using these parameters:
+
+* *granularity*: period=P1D (daily)
+* *buckets*: 28
+* *cycleSize*: 7
+
+Within each output record, the averager will compute the result over the following buckets: current (#0), #7, #14, #21.
+Whereas without specifying cycleSize it would have computed over all 28 buckets.
+
+## Examples
+
+All examples are based on the Wikipedia dataset provided in the Druid [tutorials](../../tutorials/index.md).
+
+### Basic example
+
+Calculating a 7-buckets moving average for Wikipedia edit deltas.
+
+Query syntax:
+
+```json
+{
+ "queryType": "movingAverage",
+ "dataSource": "wikipedia",
+ "granularity": {
+ "type": "period",
+ "period": "PT30M"
+ },
+ "intervals": [
+ "2015-09-12T00:00:00Z/2015-09-13T00:00:00Z"
+ ],
+ "aggregations": [
+ {
+ "name": "delta30Min",
+ "fieldName": "delta",
+ "type": "longSum"
+ }
+ ],
+ "averagers": [
+ {
+ "name": "trailing30MinChanges",
+ "fieldName": "delta30Min",
+ "type": "longMean",
+ "buckets": 7
+ }
+ ]
+}
+```
+
+Result:
+
+```json
+[ {
+ "version" : "v1",
+ "timestamp" : "2015-09-12T00:30:00.000Z",
+ "event" : {
+ "delta30Min" : 30490,
+ "trailing30MinChanges" : 4355.714285714285
+ }
+ }, {
+ "version" : "v1",
+ "timestamp" : "2015-09-12T01:00:00.000Z",
+ "event" : {
+ "delta30Min" : 96526,
+ "trailing30MinChanges" : 18145.14285714286
+ }
+ }, {
+...
+...
+...
+}, {
+ "version" : "v1",
+ "timestamp" : "2015-09-12T23:00:00.000Z",
+ "event" : {
+ "delta30Min" : 119100,
+ "trailing30MinChanges" : 198697.2857142857
+ }
+}, {
+ "version" : "v1",
+ "timestamp" : "2015-09-12T23:30:00.000Z",
+ "event" : {
+ "delta30Min" : 177882,
+ "trailing30MinChanges" : 193890.0
+ }
+} ]
+```
+
+### Post averager example
+
+Calculating a 7-buckets moving average for Wikipedia edit deltas, plus a ratio between the current period and the moving average.
+
+Query syntax:
+
+```json
+{
+ "queryType": "movingAverage",
+ "dataSource": "wikipedia",
+ "granularity": {
+ "type": "period",
+ "period": "PT30M"
+ },
+ "intervals": [
+ "2015-09-12T22:00:00Z/2015-09-13T00:00:00Z"
+ ],
+ "aggregations": [
+ {
+ "name": "delta30Min",
+ "fieldName": "delta",
+ "type": "longSum"
+ }
+ ],
+ "averagers": [
+ {
+ "name": "trailing30MinChanges",
+ "fieldName": "delta30Min",
+ "type": "longMean",
+ "buckets": 7
+ }
+ ],
+ "postAveragers" : [
+ {
+ "name": "ratioTrailing30MinChanges",
+ "type": "arithmetic",
+ "fn": "/",
+ "fields": [
+ {
+ "type": "fieldAccess",
+ "fieldName": "delta30Min"
+ },
+ {
+ "type": "fieldAccess",
+ "fieldName": "trailing30MinChanges"
+ }
+ ]
+ }
+ ]
+}
+```
+
+Result:
+
+```json
+[ {
+ "version" : "v1",
+ "timestamp" : "2015-09-12T22:00:00.000Z",
+ "event" : {
+ "delta30Min" : 144269,
+ "trailing30MinChanges" : 204088.14285714287,
+ "ratioTrailing30MinChanges" : 0.7068955500319539
+ }
+}, {
+ "version" : "v1",
+ "timestamp" : "2015-09-12T22:30:00.000Z",
+ "event" : {
+ "delta30Min" : 242860,
+ "trailing30MinChanges" : 214031.57142857142,
+ "ratioTrailing30MinChanges" : 1.134692411867141
+ }
+}, {
+ "version" : "v1",
+ "timestamp" : "2015-09-12T23:00:00.000Z",
+ "event" : {
+ "delta30Min" : 119100,
+ "trailing30MinChanges" : 198697.2857142857,
+ "ratioTrailing30MinChanges" : 0.5994042624782422
+ }
+}, {
+ "version" : "v1",
+ "timestamp" : "2015-09-12T23:30:00.000Z",
+ "event" : {
+ "delta30Min" : 177882,
+ "trailing30MinChanges" : 193890.0,
+ "ratioTrailing30MinChanges" : 0.9174377224199288
+ }
+} ]
+```
+
+
+### Cycle size example
+
+Calculating the average of the first 10-minute bucket of each hour, over the last 3 hours:
+
+Query syntax:
+
+```json
+{
+ "queryType": "movingAverage",
+ "dataSource": "wikipedia",
+ "granularity": {
+ "type": "period",
+ "period": "PT10M"
+ },
+ "intervals": [
+ "2015-09-12T00:00:00Z/2015-09-13T00:00:00Z"
+ ],
+ "aggregations": [
+ {
+ "name": "delta10Min",
+ "fieldName": "delta",
+ "type": "doubleSum"
+ }
+ ],
+ "averagers": [
+ {
+ "name": "trailing10MinPerHourChanges",
+ "fieldName": "delta10Min",
+ "type": "doubleMeanNoNulls",
+ "buckets": 18,
+ "cycleSize": 6
+ }
+ ]
+}
+```
diff --git a/docs/35.0.0/development/extensions-contrib/opentsdb-emitter.md b/docs/35.0.0/development/extensions-contrib/opentsdb-emitter.md
new file mode 100644
index 0000000000..e13cd5b55f
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/opentsdb-emitter.md
@@ -0,0 +1,62 @@
+---
+id: opentsdb-emitter
+title: "OpenTSDB Emitter"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `opentsdb-emitter` in the extensions load list.
+
+## Introduction
+
+This extension emits druid metrics to [OpenTSDB](https://github.com/OpenTSDB/opentsdb) over HTTP (Using `Jersey client`). And this emitter only emits service metric events to OpenTSDB (See [Druid metrics](../../operations/metrics.md) for a list of metrics).
+
+## Configuration
+
+All the configuration parameters for the OpenTSDB emitter are under `druid.emitter.opentsdb`.
+
+|property|description|required?|default|
+|--------|-----------|---------|-------|
+|`druid.emitter.opentsdb.host`|The host of the OpenTSDB server.|yes|none|
+|`druid.emitter.opentsdb.port`|The port of the OpenTSDB server.|yes|none|
+|`druid.emitter.opentsdb.connectionTimeout`|`Jersey client` connection timeout(in milliseconds).|no|2000|
+|`druid.emitter.opentsdb.readTimeout`|`Jersey client` read timeout(in milliseconds).|no|2000|
+|`druid.emitter.opentsdb.flushThreshold`|Queue flushing threshold.(Events will be sent as one batch)|no|100|
+|`druid.emitter.opentsdb.maxQueueSize`|Maximum size of the queue used to buffer events.|no|1000|
+|`druid.emitter.opentsdb.consumeDelay`|Queue consuming delay(in milliseconds). Actually, we use `ScheduledExecutorService` to schedule consuming events, so this `consumeDelay` means the delay between the termination of one execution and the commencement of the next. If your druid processes produce metric events fast, then you should decrease this `consumeDelay` or increase the `maxQueueSize`.|no|10000|
+|`druid.emitter.opentsdb.metricMapPath`|JSON file defining the desired metrics and dimensions for every Druid metric|no|./src/main/resources/defaultMetrics.json|
+|`druid.emitter.opentsdb.namespacePrefix`|Optional (string) prefix for metric names, for example the default metric name `query.count` with a namespacePrefix set to `druid` would be emitted as `druid.query.count` |no|null|
+
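+For illustration, a minimal configuration might look like this; the host and prefix are placeholders:
+
+```
+# opentsdb-emitter must already be in druid.extensions.loadList
+druid.emitter=opentsdb
+druid.emitter.opentsdb.host=opentsdb.example.com
+druid.emitter.opentsdb.port=4242
+druid.emitter.opentsdb.flushThreshold=100
+druid.emitter.opentsdb.maxQueueSize=1000
+druid.emitter.opentsdb.namespacePrefix=druid
+```
+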
+### Druid to OpenTSDB Event Converter
+
+The OpenTSDB emitter will send only the desired metrics and dimensions, which are defined in a JSON file.
+If the user does not specify their own JSON file, a default file is used. All metrics are expected to be configured in the JSON file. Metrics which are not configured will be logged.
+Desired metrics and dimensions are organized using the following schema: `<druid metric name> : [ <dimension list> ]`
+e.g.
+
+```json
+"query/time": [
+ "dataSource",
+ "type"
+]
+```
+
+For most use-cases, the default configuration is sufficient.
diff --git a/docs/35.0.0/development/extensions-contrib/prometheus.md b/docs/35.0.0/development/extensions-contrib/prometheus.md
new file mode 100644
index 0000000000..d5660e2e54
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/prometheus.md
@@ -0,0 +1,117 @@
+---
+id: prometheus
+title: "Prometheus Emitter"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `prometheus-emitter` in the extensions load list.
+
+## Introduction
+
+This extension exposes [Druid metrics](https://druid.apache.org/docs/latest/operations/metrics.html) for collection by a Prometheus server (https://prometheus.io/).
+
+The emitter is enabled by setting `druid.emitter=prometheus` (see [enabling metrics](https://druid.apache.org/docs/latest/configuration/index.html#enabling-metrics)) or by including `prometheus` in the composing emitter list.
+
+## Configuration
+
+All the configuration parameters for the Prometheus emitter are under `druid.emitter.prometheus`.
+
+| property | description | required? | default |
+|-----------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|--------------------------------------|
+| `druid.emitter.prometheus.strategy` | The strategy to expose prometheus metrics. Should be one of `exporter` and `pushgateway`. Default strategy `exporter` would expose metrics for scraping purpose. Peon tasks (short-lived jobs) should use `pushgateway` strategy. | yes | exporter |
+| `druid.emitter.prometheus.port` | The port on which to expose the prometheus HTTPServer. Required if using `exporter` strategy. | no | none |
+| `druid.emitter.prometheus.namespace` | Optional metric namespace. Must match the regex `[a-zA-Z_:][a-zA-Z0-9_:]*` | no | druid |
+| `druid.emitter.prometheus.dimensionMapPath` | JSON file defining the Prometheus metric type, desired dimensions, conversionFactor, histogram buckets and help text for every Druid metric. | no | Default mapping provided. See below. |
+| `druid.emitter.prometheus.addHostAsLabel` | Flag to include the hostname as a prometheus label. | no | false |
+| `druid.emitter.prometheus.addServiceAsLabel` | Flag to include the druid service name (e.g. `druid/broker`, `druid/coordinator`, etc.) as a prometheus label. | no | false |
+| `druid.emitter.prometheus.pushGatewayAddress` | Pushgateway address. Required if using `pushgateway` strategy. | no | none |
+| `druid.emitter.prometheus.flushPeriod` | When using the `pushgateway` strategy metrics are emitted every `flushPeriod` seconds. When using the `exporter` strategy this configures the metric TTL such that if the metric value is not updated within `flushPeriod` seconds then it will stop being emitted. Note that unique label combinations per metric are currently not subject to TTL expiration. It is recommended to set this to at least 3 * `scrape_interval`. | Required if `pushgateway` strategy is used, optional otherwise. | 15 seconds for `pushgateway` strategy. None for `exporter` strategy. |
+| `druid.emitter.prometheus.extraLabels` | JSON key-value pairs for additional labels on all metrics. Keys (label names) must match the regex `[a-zA-Z_:][a-zA-Z0-9_:]*`. Example: `{"cluster_name": "druid_cluster1", "env": "staging"}`. | no | none |
+| `druid.emitter.prometheus.deletePushGatewayMetricsOnShutdown` | Flag to delete metrics from the Pushgateway on task shutdown. Works only if the `pushgateway` strategy is used. This feature allows deletion of stale metrics from batch tasks. Otherwise, the Pushgateway stores these stale metrics indefinitely, since it has [no time-to-live mechanism](https://github.com/prometheus/pushgateway/issues/117), using memory to hold data that Prometheus has already scraped. | no | false |
+| `druid.emitter.prometheus.waitForShutdownDelay` | Time in milliseconds to wait for peon tasks to delete metrics from the Pushgateway on shutdown (e.g. 60_000). Applicable only when the `pushgateway` strategy is used and `deletePushGatewayMetricsOnShutdown` is set to true. There is no guarantee that a peon task will delete metrics from the gateway if the configured delay is more than the [Peon's `druid.indexer.task.gracefulShutdownTimeout`](https://druid.apache.org/docs/latest/configuration/#additional-peon-configuration) value. For best results, set this value to 1.2 times the configured Prometheus `scrape_interval` of the Pushgateway to ensure that Druid scrapes the metrics before cleanup. | no | none |
+
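+For example, a minimal runtime configuration using the default `exporter` strategy might look like the following sketch. The port value is illustrative; choose any open port on the process, and see the note on colocated processes below.
+
+```properties
+druid.emitter=prometheus
+druid.emitter.prometheus.strategy=exporter
+druid.emitter.prometheus.port=19091
+druid.emitter.prometheus.namespace=druid
+druid.emitter.prometheus.addServiceAsLabel=true
+```
+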
+### Ports for colocated Druid processes
+
+In certain instances, Druid processes may be colocated on the same host. For example, the Broker and Router may share the same server. Other colocated processes include the Historical and Middle Manager or the Coordinator and Overlord. When you have colocated processes, specify `druid.emitter.prometheus.port` separately for each process on each host. For example, even if the Broker and Router share the same host, the Broker runtime properties and the Router runtime properties each need to list `druid.emitter.prometheus.port`, and the port value for both must be different.
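+
+For example, hypothetical runtime properties for a colocated Broker and Router might look like the following sketch; the port numbers are illustrative:
+
+```properties
+# Broker runtime.properties
+druid.emitter.prometheus.port=19091
+
+# Router runtime.properties
+druid.emitter.prometheus.port=19092
+```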
+
+### Override properties for Peon Tasks
+
+Peon tasks are created dynamically by middle managers and have dynamic host and port addresses. Since the `exporter` strategy allows Prometheus to read only from a fixed address, it cannot be used for peon tasks.
+These tasks therefore need to be configured with the `pushgateway` strategy so they push metrics from Druid to the Prometheus Pushgateway.
+
+If this emitter is configured to use the `exporter` strategy globally, some of the above configurations need to be overridden in the middle manager runtime properties so that spawned peon tasks can still use the `pushgateway` strategy.
+
+```
+#
+# Override global prometheus emitter configuration for peon tasks to use `pushgateway` strategy.
+# Other configurations can also be overridden by adding `druid.indexer.fork.property.` prefix to above configuration properties.
+#
+druid.indexer.fork.property.druid.emitter.prometheus.strategy=pushgateway
+druid.indexer.fork.property.druid.emitter.prometheus.pushGatewayAddress=http://<pushgateway_address>:<port>
+```
+
+### Metric names
+
+All metric names and labels are reformatted to match Prometheus standards.
+- For names: all characters which are not alphanumeric, underscores, or colons (matching `[^a-zA-Z_:][^a-zA-Z0-9_:]*`) are replaced with `_`
+- For labels: all characters which are not alphanumeric or underscores (matching `[^a-zA-Z0-9_][^a-zA-Z0-9_]*`) are replaced with `_`
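+
+For example, under these rules a Druid metric such as `query/time` becomes `query_time`, and with the default `druid` namespace it is exposed as `druid_query_time` (timers are exposed as Prometheus histograms, so the client library adds the usual `_bucket`, `_sum`, and `_count` series). The following is an illustrative sketch of the renaming, not an exhaustive list:
+
+```
+query/time            -> druid_query_time
+segment/scan/pending  -> druid_segment_scan_pending
+jvm/gc/cpu            -> druid_jvm_gc_cpu
+```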
+
+### Metric mapping
+
+Each metric to be collected by Prometheus must specify a type, one of `[timer, counter, gauge]`. The Prometheus emitter expects this mapping to
+be provided as a JSON file. Additionally, this mapping specifies which dimensions should be included for each metric. Prometheus expects
+histogram timers to use Seconds as the base unit. Timers which do not use seconds as a base unit can use the `conversionFactor` to set
+the base time unit. Histogram timers also support custom bucket configurations through the `histogramBuckets` parameter. If no custom buckets are provided, the following default buckets are used: `[0.1, 0.25, 0.5, 0.75, 1.0, 2.5, 5.0, 7.5, 10.0, 30.0, 60.0, 120.0, 300.0]`. If the user does not specify their own JSON file, a default mapping is used. All
+metrics are expected to be mapped. Metrics which are not mapped will not be tracked.
+
+Prometheus metric path is organized using the following schema:
+
+```json
+<druid metric name> : {
+  "dimensions" : <dimension list>,
+  "type" : <timer|counter|gauge>,
+  "conversionFactor": <conversion factor>,
+  "histogramBuckets": <histogram buckets>,
+  "help" : <help text>
+}
+```
+
+For example:
+```json
+"query/time" : {
+ "dimensions" : ["dataSource", "type"],
+ "type" : "timer",
+ "conversionFactor": 1000.0,
+ "histogramBuckets": [0.1, 0.25, 0.5, 0.75, 1.0, 2.5, 5.0, 7.5, 10.0, 30.0, 60.0, 120.0, 300.0],
+ "help": "Seconds taken to complete a query."
+}
+```
+
+For metrics which are emitted from multiple services with different dimensions, the metric name is prefixed with
+the service name. For example:
+
+```json
+"druid/coordinator-segment/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
+"druid/historical-segment/count" : { "dimensions" : ["dataSource", "tier", "priority"], "type" : "gauge" }
+```
+
+For most use cases, the default mapping is sufficient.
diff --git a/docs/35.0.0/development/extensions-contrib/rabbit-stream-ingestion.md b/docs/35.0.0/development/extensions-contrib/rabbit-stream-ingestion.md
new file mode 100644
index 0000000000..9c0e395180
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/rabbit-stream-ingestion.md
@@ -0,0 +1,238 @@
+---
+id: rabbit-super-stream-injestion
+title: "RabbitMQ superstream ingestion"
+sidebar_label: "Rabbitmq superstream"
+---
+
+
+
+The rabbit stream indexing service allows you to configure *supervisors* on the Overlord to manage the creation and lifetime of [RabbitMQ](https://www.rabbitmq.com/) indexing tasks.
+These indexing tasks read events from a rabbit super-stream. The supervisor oversees the state of the indexing tasks to:
+
+ - coordinate handoffs
+ - manage failures
+ - ensure that Druid maintains scalability and replication requirements
+
+ To use the rabbit stream indexing service, load the `druid-rabbit-indexing-service` community druid extension.
+ See [Loading community extensions](../../configuration/extensions.md#loading-community-extensions) for more information.
+
+## Submitting a supervisor spec
+
+To use the rabbit stream indexing service, load the `druid-rabbit-indexing-service` extension on both the Overlord and the Middle Managers. Druid starts a supervisor for a dataSource when you submit a supervisor spec. Submit your supervisor spec to the following endpoint:
+
+
+`http://<OVERLORD_IP>:<OVERLORD_PORT>/druid/indexer/v1/supervisor`
+
+For example:
+
+```
+curl -X POST -H 'Content-Type: application/json' -d @supervisor-spec.json http://localhost:8090/druid/indexer/v1/supervisor
+```
+
+Where the file `supervisor-spec.json` contains a rabbit supervisor spec:
+
+```json
+{
+ "type": "rabbit",
+ "spec": {
+ "dataSchema": {
+ "dataSource": "metrics-rabbit",
+ "timestampSpec": {
+ "column": "timestamp",
+ "format": "auto"
+ },
+ "dimensionsSpec": {
+ "dimensions": [],
+ "dimensionExclusions": [
+ "timestamp",
+ "value"
+ ]
+ },
+ "metricsSpec": [
+ {
+ "name": "count",
+ "type": "count"
+ },
+ {
+ "name": "value_sum",
+ "fieldName": "value",
+ "type": "doubleSum"
+ },
+ {
+ "name": "value_min",
+ "fieldName": "value",
+ "type": "doubleMin"
+ },
+ {
+ "name": "value_max",
+ "fieldName": "value",
+ "type": "doubleMax"
+ }
+ ],
+ "granularitySpec": {
+ "type": "uniform",
+ "segmentGranularity": "HOUR",
+ "queryGranularity": "NONE"
+ }
+ },
+ "ioConfig": {
+ "stream": "metrics",
+ "inputFormat": {
+ "type": "json"
+ },
+ "uri": "rabbitmq-stream://localhost:5552",
+ "taskCount": 1,
+ "replicas": 1,
+ "taskDuration": "PT1H"
+ },
+ "tuningConfig": {
+ "type": "rabbit",
+ "maxRowsPerSegment": 5000000
+ }
+ }
+}
+```
+
+## Supervisor spec
+
+|Field|Description|Required|
+|--------|-----------|---------|
+|`type`|The supervisor type; this should always be `rabbit`.|yes|
+|`spec`|Container object for the supervisor configuration.|yes|
+|`dataSchema`|The schema that will be used by the rabbit indexing task during ingestion. See [`dataSchema`](../../ingestion/ingestion-spec.md#dataschema).|yes|
+|`ioConfig`|An [`ioConfig`](#ioconfig) object for configuring rabbit super stream connection and I/O-related settings for the supervisor and indexing task.|yes|
+|`tuningConfig`|A [`tuningConfig`](#tuningconfig) object for configuring performance-related settings for the supervisor and indexing tasks.|no|
+
+### `ioConfig`
+
+|Field|Type|Description|Required|
+|-----|----|-----------|--------|
+|`stream`|String|The RabbitMQ super stream to read.|yes|
+|`inputFormat`|Object|The input format to specify how to parse input data. See [`inputFormat`](../../ingestion/data-formats.md#input-format) for details.|yes|
+|`uri`|String|The URI to connect to RabbitMQ with. |yes |
+|`replicas`|Integer|The number of replica sets, where 1 means a single set of tasks (no replication). Replica tasks will always be assigned to different workers to provide resiliency against process failure.|no (default == 1)|
+|`taskCount`|Integer|The maximum number of *reading* tasks in a *replica set*. This means that the maximum number of reading tasks will be `taskCount * replicas` and the total number of tasks (*reading* + *publishing*) will be higher than this. |no (default == 1)|
+|`taskDuration`|ISO8601 Period|The length of time before tasks stop reading and begin publishing their segment.|no (default == PT1H)|
+|`startDelay`|ISO8601 Period|The period to wait before the supervisor starts managing tasks.|no (default == PT5S)|
+|`period`|ISO8601 Period|How often the supervisor will execute its management logic. Note that the supervisor will also run in response to certain events (such as tasks succeeding, failing, and reaching their taskDuration) so this value specifies the maximum time between iterations.|no (default == PT30S)|
+|`useEarliestSequenceNumber`|Boolean|If a supervisor is managing a dataSource for the first time, it will obtain a set of starting sequence numbers from RabbitMQ. This flag determines whether it retrieves the earliest or latest sequence numbers in the stream. Under normal circumstances, subsequent tasks will start from where the previous segments ended so this flag will only be used on first run.|no (default == false)|
+|`completionTimeout`|ISO8601 Period|The length of time to wait before declaring a publishing task as failed and terminating it. If this is set too low, your tasks may never publish. The publishing clock for a task begins roughly after `taskDuration` elapses.|no (default == PT6H)|
+|`lateMessageRejectionPeriod`|ISO8601 Period|Configure tasks to reject messages with timestamps earlier than this period before the task was created; for example if this is set to `PT1H` and the supervisor creates a task at *2016-01-01T12:00Z*, messages with timestamps earlier than *2016-01-01T11:00Z* will be dropped. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments (e.g. a realtime and a nightly batch ingestion pipeline).|no (default == none)|
+|`earlyMessageRejectionPeriod`|ISO8601 Period|Configure tasks to reject messages with timestamps later than this period after the task reached its taskDuration; for example if this is set to `PT1H`, the taskDuration is set to `PT1H` and the supervisor creates a task at *2016-01-01T12:00Z*. Messages with timestamps later than *2016-01-01T14:00Z* will be dropped. **Note:** Tasks sometimes run past their task duration, for example, in cases of supervisor failover. Setting `earlyMessageRejectionPeriod` too low may cause messages to be dropped unexpectedly whenever a task runs past its originally configured task duration.|no (default == none)|
+|`consumerProperties`|Object|A dynamic map of consumer properties, such as connection credentials supplied through a dynamic configuration provider. See [RabbitMQ authentication](#rabbitmq-authentication) below for an example.|no (default == none)|
+
+
+
+### `tuningConfig`
+
+The `tuningConfig` is optional. If no `tuningConfig` is specified, default parameters are used.
+
+|Field|Type|Description|Required|
+|-----|----|-----------|--------|
+|`type`| String|The indexing task type, this should always be `rabbit`.|yes|
+|`maxRowsInMemory`|Integer|The number of rows to aggregate before persisting. This number is the post-aggregation rows, so it is not equivalent to the number of input events, but the number of aggregated rows that those events result in. This is used to manage the required JVM heap size. Maximum heap memory usage for indexing scales with `maxRowsInMemory * (2 + maxPendingPersists)`.|no (default == 100000)|
+|`maxBytesInMemory`|Long| The number of bytes to aggregate in heap memory before persisting. This is based on a rough estimate of memory usage and not actual usage. Normally, this is computed internally and user does not need to set it. The maximum heap memory usage for indexing is `maxBytesInMemory * (2 + maxPendingPersists)`.|no (default == One-sixth of max JVM memory)|
+|`maxRowsPerSegment`|Integer|The number of rows to aggregate into a segment; this number is post-aggregation rows. Handoff will happen either if `maxRowsPerSegment` or `maxTotalRows` is hit or every `intermediateHandoffPeriod`, whichever happens earlier.|no (default == 5000000)|
+|`maxTotalRows`|Long|The number of rows to aggregate across all segments; this number is post-aggregation rows. Handoff will happen either if `maxRowsPerSegment` or `maxTotalRows` is hit or every `intermediateHandoffPeriod`, whichever happens earlier.|no (default == unlimited)|
+|`intermediatePersistPeriod`|ISO8601 Period|The period that determines the rate at which intermediate persists occur.|no (default == PT10M)|
+|`maxPendingPersists`|Integer|Maximum number of persists that can be pending but not started. If this limit would be exceeded by a new intermediate persist, ingestion will block until the currently-running persist finishes. Maximum heap memory usage for indexing scales with `maxRowsInMemory * (2 + maxPendingPersists)`.|no (default == 0, meaning one persist can be running concurrently with ingestion, and none can be queued up)|
+|`indexSpec`|Object|Tune how data is indexed. See [IndexSpec](#indexspec) for more information.|no|
+|`indexSpecForIntermediatePersists`|Object|Defines segment storage format options to be used at indexing time for intermediate persisted temporary segments. This can be used to disable dimension/metric compression on intermediate segments to reduce memory required for final merging. However, disabling compression on intermediate segments might increase page cache use while they are used before getting merged into final segment published, see [IndexSpec](#indexspec) for possible values.| no (default = same as `indexSpec`)|
+|`reportParseExceptions`|Boolean|If true, exceptions encountered during parsing will be thrown and will halt ingestion; if false, unparseable rows and fields will be skipped.|no (default == false)|
+|`handoffConditionTimeout`|Long| Milliseconds to wait for segment handoff. It must be >= 0, where 0 means to wait forever.| no (default == 0)|
+|`resetOffsetAutomatically`|Boolean|Controls behavior when Druid needs to read RabbitMQ messages that are no longer available. Not supported. |no (default == false)|
+|`skipSequenceNumberAvailabilityCheck`|Boolean|Whether to enable checking if the current sequence number is still available in a particular RabbitMQ stream. If set to false, the indexing task will attempt to reset the current sequence number (or not), depending on the value of `resetOffsetAutomatically`.|no (default == false)|
+|`workerThreads`|Integer|The number of threads that the supervisor uses to handle requests/responses for worker tasks, along with any other internal asynchronous operation.|no (default == min(10, taskCount))|
+|`chatRetries`|Integer|The number of times HTTP requests to indexing tasks will be retried before considering tasks unresponsive.| no (default == 8)|
+|`httpTimeout`|ISO8601 Period|How long to wait for a HTTP response from an indexing task.|no (default == PT10S)|
+|`shutdownTimeout`|ISO8601 Period|How long to wait for the supervisor to attempt a graceful shutdown of tasks before exiting.|no (default == PT80S)|
+|`recordBufferSize`|Integer|Size of the buffer (number of events) used between the RabbitMQ consumers and the main ingestion thread.|no (default == 100 MB or an estimated 10% of available heap, whichever is smaller)|
+|`recordBufferOfferTimeout`|Integer|Length of time in milliseconds to wait for space to become available in the buffer before timing out.|no (default == 5000)|
+|`segmentWriteOutMediumFactory`|Object|Segment write-out medium to use when creating segments. See below for more information.|no (not specified by default, the value from `druid.peon.defaultSegmentWriteOutMediumFactory.type` is used)|
+|`intermediateHandoffPeriod`|ISO8601 Period|How often the tasks should hand off segments. Handoff will happen either if `maxRowsPerSegment` or `maxTotalRows` is hit or every `intermediateHandoffPeriod`, whichever happens earlier.| no (default == P2147483647D)|
+|`logParseExceptions`|Boolean|If true, log an error message when a parsing exception occurs, containing information about the row where the error occurred.|no, default == false|
+|`maxParseExceptions`|Integer|The maximum number of parse exceptions that can occur before the task halts ingestion and fails. Overridden if `reportParseExceptions` is set.|no, unlimited default|
+|`maxSavedParseExceptions`|Integer|When a parse exception occurs, Druid can keep track of the most recent parse exceptions. `maxSavedParseExceptions` limits how many exception instances Druid saves. These saved exceptions are made available after the task finishes in the [task completion report](../../ingestion/tasks.md#task-reports). Overridden if `reportParseExceptions` is set.|no, default == 0|
+|`maxRecordsPerPoll`|Integer|The maximum number of records/events to be fetched from buffer per poll. The actual maximum will be `Max(maxRecordsPerPoll, Max(bufferSize, 1))`|no, default = 100|
+|`repartitionTransitionDuration`|ISO8601 Period|When shards are split or merged, the supervisor will recompute shard -> task group mappings, and signal any running tasks created under the old mappings to stop early at (current time + `repartitionTransitionDuration`). Stopping the tasks early allows Druid to begin reading from the new shards more quickly. The repartition transition wait time controlled by this property gives the stream additional time to write records to the new shards after the split/merge, which helps avoid the issues with empty shard handling described at https://github.com/apache/druid/issues/7600.|no, (default == PT2M)|
+|`offsetFetchPeriod`|ISO8601 Period|How often the supervisor queries RabbitMQ and the indexing tasks to fetch current offsets and calculate lag. If the user-specified value is below the minimum value (`PT5S`), the supervisor ignores the value and uses the minimum value instead.|no (default == PT30S, min == PT5S)|
+
+
+#### IndexSpec
+
+|Field|Type|Description|Required|
+|-----|----|-----------|--------|
+|bitmap|Object|Compression format for bitmap indexes. Should be a JSON object. See [Bitmap types](#bitmap-types) below for options.|no (defaults to Roaring)|
+|dimensionCompression|String|Compression format for dimension columns. Choose from `LZ4`, `LZF`, or `uncompressed`.|no (default == `LZ4`)|
+|metricCompression|String|Compression format for primitive type metric columns. Choose from `LZ4`, `LZF`, `uncompressed`, or `none`.|no (default == `LZ4`)|
+|longEncoding|String|Encoding format for metric and dimension columns with type long. Choose from `auto` or `longs`. `auto` encodes the values using sequence number or lookup table depending on column cardinality, and store them with variable size. `longs` stores the value as is with 8 bytes each.|no (default == `longs`)|
+
+##### Bitmap types
+
+For Roaring bitmaps:
+
+|Field|Type|Description|Required|
+|-----|----|-----------|--------|
+|`type`|String|Must be `roaring`.|yes|
+
+For Concise bitmaps:
+
+|Field|Type|Description|Required|
+|-----|----|-----------|--------|
+|`type`|String|Must be `concise`.|yes|
+
+#### SegmentWriteOutMediumFactory
+
+|Field|Type|Description|Required|
+|-----|----|-----------|--------|
+|`type`|String|See [Additional Peon configuration: SegmentWriteOutMediumFactory](../../configuration/index.md#segmentwriteoutmediumfactory) for explanation and available options.|yes|
+
+
+
+## Operations
+
+This section describes how some supervisor APIs work in the Rabbit Stream Indexing Service.
+For all supervisor APIs, check [Supervisor APIs](../../api-reference/supervisor-api.md).
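+
+For example, once the supervisor from the spec above is running, you can check its state through the standard supervisor API. The supervisor ID below (`metrics-rabbit`) is the `dataSource` from the example spec and is only illustrative:
+
+```
+curl "http://localhost:8090/druid/indexer/v1/supervisor/metrics-rabbit/status"
+```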
+
+### RabbitMQ authentication
+
+To authenticate with RabbitMQ securely, you must provide a username and password, as well as configure
+a certificate if you aren't using a standard certificate provider.
+
+To configure these, use a dynamic configuration provider in the `consumerProperties` of the `ioConfig`:
+```
+ "ioConfig": {
+ "type": "rabbit",
+ "stream": "api-audit",
+ "uri": "rabbitmq-stream://localhost:5552",
+ "taskCount": 1,
+ "replicas": 1,
+ "taskDuration": "PT1H",
+ "consumerProperties": {
+ "druid.dynamic.config.provider" : {
+ "type": "environment",
+ "variables": {
+ "username": "RABBIT_USERNAME",
+ "password": "RABBIT_PASSWORD"
+ }
+ }
+ }
+ },
+```
\ No newline at end of file
diff --git a/docs/35.0.0/development/extensions-contrib/redis-cache.md b/docs/35.0.0/development/extensions-contrib/redis-cache.md
new file mode 100644
index 0000000000..63e0b9e509
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/redis-cache.md
@@ -0,0 +1,111 @@
+---
+id: redis-cache
+title: "Druid Redis Cache"
+---
+
+
+
+A cache implementation for Druid based on [Redis](https://github.com/redis/redis).
+
+The following sections provide guidance and the configuration options supported by this module.
+
+## Installation
+
+Use the [pull-deps](../../operations/pull-deps.md) tool shipped with Druid to install this [extension](../../configuration/extensions.md#community-extensions) on broker, historical, and middle manager nodes.
+
+```bash
+java -classpath "druid_dir/lib/*" org.apache.druid.cli.Main tools pull-deps -c org.apache.druid.extensions.contrib:druid-redis-cache:{VERSION}
+```
+
+## Enabling
+
+To enable this extension after installation:
+
+1. [Include](../../configuration/extensions.md#loading-extensions) the `druid-redis-cache` extension.
+2. To enable the cache on broker nodes, follow the [broker caching docs](../../configuration/index.md#broker-caching) to set the related properties.
+3. To enable the cache on historical nodes, follow the [historical caching docs](../../configuration/index.md#historical-caching) to set the related properties.
+4. To enable the cache on middle manager nodes, follow the [peon caching docs](../../configuration/index.md#peon-caching) to set the related properties.
+5. Set `druid.cache.type` to `redis`.
+6. Add the Redis connection properties described in the [Configuration](#configuration) section, as shown in the sketch after this list.
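+
+For example, a hypothetical broker runtime configuration using a standalone Redis server might look like the following sketch; the host, port, and caching flags are illustrative and should match your deployment:
+
+```properties
+# Enable query result caching on the broker
+druid.broker.cache.useCache=true
+druid.broker.cache.populateCache=true
+
+# Back the cache with Redis
+druid.cache.type=redis
+druid.cache.host=127.0.0.1
+druid.cache.port=6379
+```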
+
+## Configuration
+
+### Cluster mode
+
+To use a Redis cluster, the following properties must be set.
+
+Note: some Redis cloud service providers expose a Redis cluster through a Redis proxy. For these clusters, follow the [Standalone mode](#standalone-mode) configuration below.
+
+| Properties |Description|Default|Required|
+|--------------------|-----------|-------|--------|
+|`druid.cache.cluster.nodes`| Redis nodes in a cluster, as a comma-separated string. See the example below. | None | yes |
+|`druid.cache.cluster.maxRedirection`| Max retry count | 5 | no |
+
+#### Example
+
+```properties
+# a typical redis cluster with 6 nodes
+druid.cache.cluster.nodes=127.0.0.1:7001,127.0.0.1:7002,127.0.0.1:7003,127.0.0.1:7004,127.0.0.1:7005,127.0.0.1:7006
+```
+
+### Standalone mode
+
+To use a standalone Redis server, the following properties must be set.
+
+| Properties |Description|Default|Required|
+|--------------------|-----------|-------|--------|
+|`druid.cache.host`|Redis server host|None|yes|
+|`druid.cache.port`|Redis server port|None|yes|
+|`druid.cache.database`|Redis database index|0|no|
+
+Note: if both `druid.cache.cluster.nodes` and `druid.cache.host` are provided, cluster mode is preferred.
+
+### Shared Properties
+
+In addition to the properties above, there are extra properties that you can customize to meet different needs.
+
+| Properties |Description|Default|Required|
+|--------------------|-----------|-------|--------|
+|`druid.cache.password`| Password to access redis server/cluster | None |no|
+|`druid.cache.expiration`|Expiration for cache entries | P1D |no|
+|`druid.cache.timeout`|Timeout for connecting to Redis and reading entries from Redis|PT2S|no|
+|`druid.cache.maxTotalConnections`|Max total connections to Redis|8|no|
+|`druid.cache.maxIdleConnections`|Max idle connections to Redis|8|no|
+|`druid.cache.minIdleConnections`|Min idle connections to Redis|0|no|
+
+For the `druid.cache.expiration` and `druid.cache.timeout` properties, values can be in ISO 8601 `Period` format or a number of milliseconds.
+
+```properties
+# Period format (recommended)
+# cache expires after 1 hour
+druid.cache.expiration=PT1H
+
+# or in number(milliseconds) format
+# 1 hour = 3_600_000 milliseconds
+druid.cache.expiration=3600000
+```
+
+## Metrics
+
+In addition to the normal cache metrics, the Redis cache implementation also reports the following metrics, in both `total` and `delta` forms.
+
+|Metric|Description|Normal value|
+|------|-----------|------------|
+|`query/cache/redis/*/requests`|Count of requests to the Redis cache.|Each request to Redis increases the count by 1.|
diff --git a/docs/35.0.0/development/extensions-contrib/spectator-histogram.md b/docs/35.0.0/development/extensions-contrib/spectator-histogram.md
new file mode 100644
index 0000000000..e6d12517e5
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/spectator-histogram.md
@@ -0,0 +1,457 @@
+---
+id: spectator-histogram
+title: "Spectator Histogram module"
+---
+
+
+
+## Summary
+This module provides Apache Druid approximate histogram aggregators and percentile
+post-aggregators based on Spectator fixed-bucket histograms.
+
+Consider SpectatorHistogram to compute percentile approximations. This extension has a reduced storage footprint compared to the [DataSketches extension](../extensions-core/datasketches-extension.md), which results in smaller segment sizes, faster loading from deep storage, and lower memory usage. This extension provides fast and accurate queries on large datasets at low storage cost.
+
+This aggregator only applies when your raw data contains positive long integer values. Do not use this aggregator if you have negative values in your data.
+
+Consider a Druid instance in which the example Wikipedia dataset is loaded three times:
+* `wikipedia` contains the dataset ingested as is, without rollup
+* `wikipedia_spectator` contains the dataset with a single extra metric column of type `spectatorHistogram` for the `added` column
+* `wikipedia_datasketch` contains the dataset with a single extra metric column of type `quantilesDoublesSketch` for the `added` column
+
+Spectator histograms average just 6 extra bytes per row, while the `quantilesDoublesSketch`
+adds 48 bytes per row. This represents an eightfold reduction in additional storage size for spectator histograms.
+
+
+
+As rollup improves, so do the size savings. For example, when you ingest the Wikipedia dataset
+with day-grain query granularity and remove all dimensions except `countryName`,
+this results in a segment that has just 106 rows. The base segment has 87 bytes per row.
+Compare the following bytes per row for SpectatorHistogram versus DataSketches:
+* An additional `spectatorHistogram` column adds 27 bytes per row on average.
+* An additional `quantilesDoublesSketch` column adds 255 bytes per row.
+
+SpectatorHistogram reduces the additional storage size by 9.4 times in this example.
+Storage gains will differ per dataset depending on the variance and rollup of the data.
+
+## Background
+[Spectator](https://netflix.github.io/atlas-docs/spectator/) is a simple library
+for instrumenting code to record dimensional time series data.
+It was built, primarily, to work with [Atlas](https://netflix.github.io/atlas-docs/).
+Atlas was developed by Netflix to manage dimensional time series data for near
+real-time operational insight.
+
+With the [Atlas-Druid](https://github.com/Netflix-Skunkworks/iep-apps/tree/main/atlas-druid)
+service, it's possible to use the power of Atlas queries, backed by Druid as a
+data store to benefit from high-dimensionality and high-cardinality data.
+
+SpectatorHistogram is designed for efficient parallel aggregations while still
+allowing for filtering and grouping by dimensions.
+It provides similar functionality to the built-in DataSketches `quantilesDoublesSketch` aggregator, but is
+opinionated to maintain higher absolute accuracy at smaller values.
+Larger values have lower absolute accuracy; however, relative accuracy is maintained across the range.
+See [Bucket boundaries](#histogram-bucket-boundaries) for more information.
+The SpectatorHistogram is optimized for typical measurements from cloud services and web apps,
+such as page load time, transferred bytes, response time, and request latency.
+
+Through some trade-offs SpectatorHistogram provides a significantly more compact
+representation with the same aggregation performance and accuracy as
+DataSketches Quantiles Sketch. Note that results depend on the dataset.
+Also see the [limitations](#limitations) of this extension.
+
+## Limitations
+* Supports positive long integer values within the range of [0, 2^53). Negatives are
+coerced to 0.
+* Does not support decimals.
+* Does not support Druid SQL queries, only native queries.
+* Does not support vectorized queries.
+* Generates 276 fixed buckets with increasing bucket widths. In practice, the observed error of computed percentiles ranges from 0.1% to 3%, exclusive. See [Bucket boundaries](#histogram-bucket-boundaries) for the full list of bucket boundaries.
+
+:::tip
+If these limitations don't work for your use case, then use [DataSketches](../extensions-core/datasketches-extension.md) instead.
+:::
+
+## Functionality
+The SpectatorHistogram aggregator can generate histograms from raw numeric
+values as well as aggregating or combining pre-aggregated histograms generated using
+the SpectatorHistogram aggregator itself.
+While you can generate histograms on the fly at query time, it is generally more
+performant to generate histograms during ingestion and then combine them at
+query time. This is especially true where rollup is enabled. It may be misleading or
+incorrect to generate histograms from already rolled-up summed data.
+
+The module provides postAggregators, `percentileSpectatorHistogram` (singular) and
+`percentilesSpectatorHistogram` (plural), to compute approximate
+percentiles from histograms generated by the SpectatorHistogram aggregator.
+Again, these postAggregators can be used to compute percentiles from raw numeric
+values via the SpectatorHistogram aggregator or from pre-aggregated histograms.
+
+> If you're only using the aggregator to compute percentiles from raw numeric values,
+then you can use the built-in quantilesDoublesSketch aggregator instead. The performance
+and accuracy are comparable. However, the DataSketches aggregator supports negative values,
+and you don't need to download an additional extension.
+
+An aggregated SpectatorHistogram can also be queried using a `longSum` or `doubleSum`
+aggregator to retrieve the population of the histogram. This is effectively the count
+of the number of values that were aggregated into the histogram. This flexibility can
+avoid the need to maintain a separate metric for the count of values.
+
+For high-frequency measurements, you may need to pre-aggregate data at the client prior
+to sending into Druid. For example, if you're measuring individual image render times
+on an image-heavy website, you may want to aggregate the render times for a page-view
+into a single histogram prior to sending to Druid in real-time. This can reduce the
+amount of data that needs to be sent from the client over the wire.
+
+SpectatorHistogram supports ingesting pre-aggregated histograms in real-time and batch.
+They can be sent as a JSON map, keyed by the spectator bucket ID and the value is the
+count of values. This is the same format as the serialized JSON representation of the
+histogram. The keys need not be ordered or contiguous. For example:
+
+```json
+{ "4": 8, "5": 15, "6": 37, "7": 9, "8": 3, "10": 1, "13": 1 }
+```
+
+## Loading the extension
+To use SpectatorHistogram, make sure you [include](../../configuration/extensions.md#loading-extensions) the extension in your config file:
+
+```
+druid.extensions.loadList=["druid-spectator-histogram"]
+```
+
+## Aggregators
+
+The result of the aggregation is a histogram that is built by ingesting numeric values from
+the raw data, or from combining pre-aggregated histograms. The result is represented in
+JSON format where the keys are the bucket index and the values are the count of entries
+in that bucket.
+
+The buckets are defined as per the Spectator [PercentileBuckets](https://github.com/Netflix/spectator/blob/main/spectator-api/src/main/java/com/netflix/spectator/api/histogram/PercentileBuckets.java) specification.
+See [Histogram bucket boundaries](#histogram-bucket-boundaries) for the full list of bucket boundaries.
+```js
+ // The set of buckets is generated by using powers of 4 and incrementing by one-third of the
+ // previous power of 4 in between as long as the value is less than the next power of 4 minus
+ // the delta.
+ //
+ // Base: 1, 2, 3
+ //
+ // 4 (4^1), delta = 1 (~1/3 of 4)
+ // 5, 6, 7, ..., 14,
+ //
+ // 16 (4^2), delta = 5 (~1/3 of 16)
+ // 21, 26, 31, ..., 56,
+ //
+ // 64 (4^3), delta = 21 (~1/3 of 64)
+ // ...
+```
+
+There are multiple aggregator types included, all of which are based on the same
+underlying implementation. If you use the Atlas-Druid service, the different types
+signal the service on how to handle the resulting data from a query.
+
+* `spectatorHistogramTimer` signals that the histogram represents
+a collection of timer values. It is recommended to normalize timer values to nanoseconds
+at, or prior to, ingestion. If queried via the Atlas-Druid service, it will
+normalize timers to second resolution at query time as a more natural unit of time
+for human consumption.
+* `spectatorHistogram` and `spectatorHistogramDistribution` are generic histograms that
+can be used to represent any measured value without units. No normalization is
+required or performed.
+
+### `spectatorHistogram` aggregator
+Alias: `spectatorHistogramDistribution`, `spectatorHistogramTimer`
+
+To aggregate at query time:
+```
+{
+  "type" : "spectatorHistogram",
+  "name" : <output_name>,
+  "fieldName" : <input_field_name>
+}
+```
+
+| Property | Description | Required? |
+|-----------|--------------------------------------------------------------------------------------------------------------|-----------|
+| type | This String must be one of "spectatorHistogram", "spectatorHistogramTimer", "spectatorHistogramDistribution" | yes |
+| name | A String for the output (result) name of the aggregation. | yes |
+| fieldName | A String for the name of the input field containing raw numeric values or pre-aggregated histograms. | yes |
+
+### `longSum`, `doubleSum` and `floatSum` aggregators
+To get the population size (count of events contributing to the histogram):
+```
+{
+  "type" : "longSum",
+  "name" : <output_name>,
+  "fieldName" : <input_field_name>
+}
+```
+
+| Property | Description | Required? |
+|-----------|--------------------------------------------------------------------------------|-----------|
+| type | Must be "longSum", "doubleSum", or "floatSum". | yes |
+| name | A String for the output (result) name of the aggregation. | yes |
+| fieldName | A String for the name of the input field containing pre-aggregated histograms. | yes |
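+
+For example, the following sketch returns the number of raw values aggregated into the `hist_added` histogram column defined in the example ingestion spec later in this topic; the output name `count_added` is illustrative:
+
+```json
+{
+  "type": "longSum",
+  "name": "count_added",
+  "fieldName": "hist_added"
+}
+```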
+
+## Post Aggregators
+
+### Percentile (singular)
+This returns a single percentile calculation based on the distribution of the values in the aggregated histogram.
+
+```
+{
+  "type": "percentileSpectatorHistogram",
+  "name": <output_name>,
+  "field": {
+    "type": "fieldAccess",
+    "fieldName": <name_of_aggregated_histogram>
+  },
+  "percentile": <decimal_percentile>
+}
+```
+
+| Property | Description | Required? |
+|------------|-------------------------------------------------------------|-----------|
+| type | This String should always be "percentileSpectatorHistogram" | yes |
+| name | A String for the output (result) name of the calculation. | yes |
+| field | A field reference pointing to the aggregated histogram. | yes |
+| percentile | A single decimal percentile between 0.0 and 100.0 | yes |
+
+### Percentiles (multiple)
+This returns an array of percentiles corresponding to those requested.
+
+```
+{
+  "type": "percentilesSpectatorHistogram",
+  "name": <output_name>,
+  "field": {
+    "type": "fieldAccess",
+    "fieldName": <name_of_aggregated_histogram>
+  },
+  "percentiles": [25, 50, 75, 99.5]
+}
+```
+
+> It's more efficient to request multiple percentiles in a single query
+than to request individual percentiles in separate queries. This array-based
+helper is provided for convenience and has a marginal performance benefit over
+using the singular percentile post-aggregator multiple times within a query.
+The more expensive part of the query is the aggregation of the histogram.
+The post-aggregation calculations all happen on the same aggregated histogram.
+
+The results contain arrays matching the length and order of the requested
+array of percentiles.
+
+```
+"percentilesAdded": [
+ 0.5504911679884643, // 25th percentile
+ 4.013975155279504, // 50th percentile
+ 78.89518317503394, // 75th percentile
+ 8580.024999999994 // 99.5th percentile
+]
+```
+
+| Property | Description | Required? |
+|-------------|--------------------------------------------------------------|-----------|
+| type | This String should always be "percentilesSpectatorHistogram" | yes |
+| name | A String for the output (result) name of the calculation. | yes |
+| field | A field reference pointing to the aggregated histogram. | yes |
+| percentiles | Non-empty array of decimal percentiles between 0.0 and 100.0 | yes |
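+
+As a sketch, the following post-aggregation computes several percentiles from the `histogram_added` aggregation used in the example query below; the output name matches the sample results shown above and is only illustrative:
+
+```json
+{
+  "type": "percentilesSpectatorHistogram",
+  "name": "percentilesAdded",
+  "field": {
+    "type": "fieldAccess",
+    "fieldName": "histogram_added"
+  },
+  "percentiles": [25, 50, 75, 99.5]
+}
+```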
+
+## Examples
+
+### Example Ingestion Spec
+Example of ingesting the sample Wikipedia dataset with a histogram metric column:
+```json
+{
+ "type": "index_parallel",
+ "spec": {
+ "ioConfig": {
+ "type": "index_parallel",
+ "inputSource": {
+ "type": "http",
+ "uris": ["https://druid.apache.org/data/wikipedia.json.gz"]
+ },
+ "inputFormat": { "type": "json" }
+ },
+ "dataSchema": {
+ "granularitySpec": {
+ "segmentGranularity": "day",
+ "queryGranularity": "minute",
+ "rollup": true
+ },
+ "dataSource": "wikipedia",
+ "timestampSpec": { "column": "timestamp", "format": "iso" },
+ "dimensionsSpec": {
+ "dimensions": [
+ "isRobot",
+ "channel",
+ "flags",
+ "isUnpatrolled",
+ "page",
+ "diffUrl",
+ "comment",
+ "isNew",
+ "isMinor",
+ "isAnonymous",
+ "user",
+ "namespace",
+ "cityName",
+ "countryName",
+ "regionIsoCode",
+ "metroCode",
+ "countryIsoCode",
+ "regionName"
+ ]
+ },
+ "metricsSpec": [
+ { "name": "count", "type": "count" },
+ { "name": "sum_added", "type": "longSum", "fieldName": "added" },
+ {
+ "name": "hist_added",
+ "type": "spectatorHistogram",
+ "fieldName": "added"
+ }
+ ]
+ },
+ "tuningConfig": {
+ "type": "index_parallel",
+ "partitionsSpec": { "type": "hashed" },
+ "forceGuaranteedRollup": true
+ }
+ }
+}
+```
+
+### Example Query
+Example query using the sample Wikipedia dataset:
+```json
+{
+ "queryType": "timeseries",
+ "dataSource": {
+ "type": "table",
+ "name": "wikipedia"
+ },
+ "intervals": {
+ "type": "intervals",
+ "intervals": [
+ "0000-01-01/9999-12-31"
+ ]
+ },
+ "granularity": {
+ "type": "all"
+ },
+ "aggregations": [
+ {
+ "type": "spectatorHistogram",
+ "name": "histogram_added",
+ "fieldName": "added"
+ }
+ ],
+ "postAggregations": [
+ {
+ "type": "percentileSpectatorHistogram",
+ "name": "medianAdded",
+ "field": {
+ "type": "fieldAccess",
+ "fieldName": "histogram_added"
+ },
+ "percentile": "50.0"
+ }
+ ]
+}
+```
+Results in
+```json
+[
+ {
+ "result": {
+ "histogram_added": {
+ "0": 11096, "1": 632, "2": 297, "3": 187, "4": 322, "5": 161,
+ "6": 174, "7": 127, "8": 125, "9": 162, "10": 123, "11": 106,
+ "12": 95, "13": 104, "14": 95, "15": 588, "16": 540, "17": 690,
+ "18": 719, "19": 478, "20": 288, "21": 250, "22": 219, "23": 224,
+ "24": 737, "25": 424, "26": 343, "27": 266, "28": 232, "29": 217,
+ "30": 171, "31": 164, "32": 161, "33": 530, "34": 339, "35": 236,
+ "36": 181, "37": 152, "38": 113, "39": 128, "40": 80, "41": 75,
+ "42": 289, "43": 145, "44": 138, "45": 83, "46": 45, "47": 46,
+ "48": 64, "49": 65, "50": 71, "51": 421, "52": 525, "53": 59,
+ "54": 31, "55": 35, "56": 8, "57": 10, "58": 5, "59": 4, "60": 11,
+ "61": 10, "62": 5, "63": 2, "64": 2, "65": 1, "67": 1, "68": 1,
+ "69": 1, "70": 1, "71": 1, "78": 2
+ },
+ "medianAdded": 4.013975155279504
+ },
+ "timestamp": "2016-06-27T00:00:00.000Z"
+ }
+]
+```
+
+## Histogram bucket boundaries
+The following array lists the upper bounds of each bucket index. There are 276 buckets in total.
+The first bucket index is 0 and the last bucket index is 275.
+The bucket widths increase as the bucket index increases. This leads to a greater absolute error for larger values, but maintains a relative error of rough percentage across the number range.
+For example, the maximum error at value 10 is zero since the bucket width is 1 (the difference of `11-10`). For a value of 16,000,000,000, the bucket width is 1,431,655,768 (from `17179869184-15748213416`). This gives an error of up to ~8.9%, from `1,431,655,768/16,000,000,000*100`. In practice, the observed error of computed percentiles is in the range of (0.1%, 3%).
+```json
+[
+ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 21, 26, 31, 36, 41, 46,
+ 51, 56, 64, 85, 106, 127, 148, 169, 190, 211, 232, 256, 341, 426, 511, 596,
+ 681, 766, 851, 936, 1024, 1365, 1706, 2047, 2388, 2729, 3070, 3411, 3752,
+ 4096, 5461, 6826, 8191, 9556, 10921, 12286, 13651, 15016, 16384, 21845,
+ 27306, 32767, 38228, 43689, 49150, 54611, 60072, 65536, 87381, 109226,
+ 131071, 152916, 174761, 196606, 218451, 240296, 262144, 349525, 436906,
+ 524287, 611668, 699049, 786430, 873811, 961192, 1048576, 1398101, 1747626,
+ 2097151, 2446676, 2796201, 3145726, 3495251, 3844776, 4194304, 5592405,
+ 6990506, 8388607, 9786708, 11184809, 12582910, 13981011, 15379112, 16777216,
+ 22369621, 27962026, 33554431, 39146836, 44739241, 50331646, 55924051,
+ 61516456, 67108864, 89478485, 111848106, 134217727, 156587348, 178956969,
+ 201326590, 223696211, 246065832, 268435456, 357913941, 447392426, 536870911,
+ 626349396, 715827881, 805306366, 894784851, 984263336, 1073741824, 1431655765,
+ 1789569706, 2147483647, 2505397588, 2863311529, 3221225470, 3579139411,
+ 3937053352, 4294967296, 5726623061, 7158278826, 8589934591, 10021590356,
+ 11453246121, 12884901886, 14316557651, 15748213416, 17179869184, 22906492245,
+ 28633115306, 34359738367, 40086361428, 45812984489, 51539607550, 57266230611,
+ 62992853672, 68719476736, 91625968981, 114532461226, 137438953471,
+ 160345445716, 183251937961, 206158430206, 229064922451, 251971414696,
+ 274877906944, 366503875925, 458129844906, 549755813887, 641381782868,
+ 733007751849, 824633720830, 916259689811, 1007885658792, 1099511627776,
+ 1466015503701, 1832519379626, 2199023255551, 2565527131476, 2932031007401,
+ 3298534883326, 3665038759251, 4031542635176, 4398046511104, 5864062014805,
+ 7330077518506, 8796093022207, 10262108525908, 11728124029609, 13194139533310,
+ 14660155037011, 16126170540712, 17592186044416, 23456248059221,
+ 29320310074026, 35184372088831, 41048434103636, 46912496118441,
+ 52776558133246, 58640620148051, 64504682162856, 70368744177664,
+ 93824992236885, 117281240296106, 140737488355327, 164193736414548,
+ 187649984473769, 211106232532990, 234562480592211, 258018728651432,
+ 281474976710656, 375299968947541, 469124961184426, 562949953421311,
+ 656774945658196, 750599937895081, 844424930131966, 938249922368851,
+ 1032074914605736, 1125899906842624, 1501199875790165, 1876499844737706,
+ 2251799813685247, 2627099782632788, 3002399751580329, 3377699720527870,
+ 3752999689475411, 4128299658422952, 4503599627370496, 6004799503160661,
+ 7505999378950826, 9007199254740991, 10508399130531156, 12009599006321321,
+ 13510798882111486, 15011998757901651, 16513198633691816, 18014398509481984,
+ 24019198012642645, 30023997515803306, 36028797018963967, 42033596522124628,
+ 48038396025285289, 54043195528445950, 60047995031606611, 66052794534767272,
+ 72057594037927936, 96076792050570581, 120095990063213226, 144115188075855871,
+ 168134386088498516, 192153584101141161, 216172782113783806, 240191980126426451,
+ 264211178139069096, 288230376151711744, 384307168202282325, 480383960252852906,
+ 576460752303423487, 672537544353994068, 768614336404564649, 864691128455135230,
+ 960767920505705811, 1056844712556276392, 1152921504606846976, 1537228672809129301,
+ 1921535841011411626, 2305843009213693951, 2690150177415976276, 3074457345618258601,
+ 3458764513820540926, 3843071682022823251, 4227378850225105576, 9223372036854775807
+]
+```
diff --git a/docs/35.0.0/development/extensions-contrib/sqlserver.md b/docs/35.0.0/development/extensions-contrib/sqlserver.md
new file mode 100644
index 0000000000..0f2e8de24e
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/sqlserver.md
@@ -0,0 +1,56 @@
+---
+id: sqlserver
+title: "Microsoft SQLServer"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `sqlserver-metadata-storage` in the extensions load list.
+
+## Setting up SQLServer
+
+1. Install Microsoft SQLServer
+
+2. Create a druid database and user
+
+ Create the druid user
+ - Microsoft SQL Server Management Studio - Security - Logins - New Login...
+ - Create a druid user, enter `diurd` when prompted for the password.
+
+ Create a druid database owned by the user we just created
+ - Databases - New Database
+ - Database Name: druid, Owner: druid
+
+3. Add the Microsoft JDBC library to the Druid classpath
+ - To ensure the `com.microsoft.sqlserver.jdbc.SQLServerDriver` class is loaded, you will have to add the appropriate Microsoft JDBC library (sqljdbc*.jar) to the Druid classpath.
+ - For instance, if all jar files in your `druid/lib` directory are automatically added to your Druid classpath, then manually download the Microsoft JDBC drivers from https://www.microsoft.com/en-ca/download/details.aspx?id=11774 and drop the jar into your `druid/lib` directory.
+
+4. Configure your Druid metadata storage extension:
+
+ Add the following parameters to your Druid configuration, replacing `<host>`
+ with the location (host name and port) of the database.
+
+ ```properties
+ druid.metadata.storage.type=sqlserver
+ druid.metadata.storage.connector.connectURI=jdbc:sqlserver://<host>;databaseName=druid
+ druid.metadata.storage.connector.user=druid
+ druid.metadata.storage.connector.password=diurd
+ ```
diff --git a/docs/35.0.0/development/extensions-contrib/statsd.md b/docs/35.0.0/development/extensions-contrib/statsd.md
new file mode 100644
index 0000000000..3e5713f586
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/statsd.md
@@ -0,0 +1,75 @@
+---
+id: statsd
+title: "StatsD Emitter"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `statsd-emitter` in the extensions load list.
+
+## Introduction
+
+This extension emits Druid metrics to a StatsD server ([StatsD](https://github.com/etsy/statsd), [statsite](https://github.com/armon/statsite)).
+
+## Configuration
+
+All the configuration parameters for the StatsD emitter are under `druid.emitter.statsd`.
+
+|property|description|required?|default|
+|--------|-----------|---------|-------|
+|`druid.emitter.statsd.hostname`|The hostname of the StatsD server.|yes|none|
+|`druid.emitter.statsd.port`|The port of the StatsD server.|yes|none|
+|`druid.emitter.statsd.prefix`|Optional metric name prefix.|no|""|
+|`druid.emitter.statsd.separator`|Metric name separator|no|.|
+|`druid.emitter.statsd.includeHost`|Flag to include the hostname as part of the metric name.|no|false|
+|`druid.emitter.statsd.dimensionMapPath`|JSON file defining the StatsD type, and desired dimensions for every Druid metric|no|Default mapping provided. See below.|
+|`druid.emitter.statsd.blankHolder`|The character used to replace blanks, since StatsD does not support blank characters in metric paths.|no|"-"|
+|`druid.emitter.statsd.queueSize`|Maximum number of unprocessed messages in the message queue.|no|Default value of StatsD Client(4096)|
+|`druid.emitter.statsd.poolSize`|Network packet buffer pool size.|no|Default value of StatsD Client(512)|
+|`druid.emitter.statsd.processorWorkers`|The number of processor worker threads assembling buffers for submission.|no|Default value of StatsD Client(1)|
+|`druid.emitter.statsd.senderWorkers`| The number of sender worker threads submitting buffers to the socket.|no|Default value of StatsD Client(1)|
+|`druid.emitter.statsd.dogstatsd`|Flag to enable [DogStatsD](https://docs.datadoghq.com/developers/dogstatsd/) support. Causes dimensions to be included as tags, not as a part of the metric name. `convertRange` fields will be ignored.|no|false|
+|`druid.emitter.statsd.dogstatsdConstantTags`|If `druid.emitter.statsd.dogstatsd` is true, the tags in the JSON list of strings will be sent with every event.|no|[]|
+|`druid.emitter.statsd.dogstatsdServiceAsTag`|If `druid.emitter.statsd.dogstatsd` and `druid.emitter.statsd.dogstatsdServiceAsTag` are true, druid service (e.g. `druid/broker`, `druid/coordinator`, etc) is reported as a tag (e.g. `druid_service:druid/broker`) instead of being included in metric name (e.g. `druid.broker.query.time`) and `druid` is used as metric prefix (e.g. `druid.query.time`).|no|false|
+|`druid.emitter.statsd.dogstatsdEvents`|If `druid.emitter.statsd.dogstatsd` and `druid.emitter.statsd.dogstatsdEvents` are true, [Alert events](../../operations/alerts.md) are reported to DogStatsD.|no|false|
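+
+For example, a hypothetical runtime configuration that sends metrics to a local StatsD daemon might look like the following sketch; the hostname, port, and prefix values are illustrative:
+
+```properties
+druid.emitter=statsd
+druid.emitter.statsd.hostname=127.0.0.1
+druid.emitter.statsd.port=8125
+druid.emitter.statsd.prefix=druid
+```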
+
+### Druid to StatsD Event Converter
+
+Each metric sent to StatsD must specify a type, one of `[timer, counter, gauge]`. The StatsD emitter expects this mapping to
+be provided as a JSON file. Additionally, this mapping specifies which dimensions should be included for each metric.
+StatsD expects metric values to be integers. Druid emits some metrics with values in the range 0 to 1. To accommodate these metrics, they are converted
+into the range 0 to 100. This conversion can be enabled by setting the optional "convertRange" field to true in the JSON mapping file.
+If the user does not specify their own JSON file, a default mapping is used. All
+metrics are expected to be mapped. Metrics which are not mapped will log an error.
+The StatsD metric path is organized using the following schema:
+`<druid metric name> : { "dimensions" : <dimension list>, "type" : <timer|counter|gauge>, "convertRange" : true/false}`
+e.g.
+`"query/time" : { "dimensions" : ["dataSource", "type"], "type" : "timer"}`
+
+For metrics which are emitted from multiple services with different dimensions, the metric name is prefixed with
+the service name.
+e.g.
+`"druid/coordinator-segment/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
+ "druid/historical-segment/count" : { "dimensions" : ["dataSource", "tier", "priority"], "type" : "gauge" }`
+
+For most use-cases, the default mapping is sufficient.
diff --git a/docs/35.0.0/development/extensions-contrib/tdigestsketch-quantiles.md b/docs/35.0.0/development/extensions-contrib/tdigestsketch-quantiles.md
new file mode 100644
index 0000000000..101368445d
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/tdigestsketch-quantiles.md
@@ -0,0 +1,175 @@
+---
+id: tdigestsketch-quantiles
+title: "T-Digest Quantiles Sketch module"
+---
+
+
+
+
+This module provides Apache Druid approximate sketch aggregators based on T-Digest.
+T-Digest (https://github.com/tdunning/t-digest) is a popular data structure for accurate on-line accumulation of
+rank-based statistics such as quantiles and trimmed means.
+The data structure is also designed for parallel programming use cases, such as distributed aggregations or MapReduce jobs, by making it easy and efficient to combine two intermediate t-digests.
+
+The tDigestSketch aggregator can generate sketches from raw numeric values as well as
+aggregate and combine pre-generated T-Digest sketches created by the tDigestSketch aggregator itself.
+While you can generate sketches on the fly at query time, it is generally more performant
+to generate sketches during ingestion and then combine them at query time.
+The module also provides a postAggregator, quantilesFromTDigestSketch, that computes approximate
+quantiles from T-Digest sketches generated by the tDigestSketch aggregator.
+
+To use this aggregator, make sure you [include](../../configuration/extensions.md#loading-extensions) the extension in your config file:
+
+```
+druid.extensions.loadList=["druid-tdigestsketch"]
+```
+
+### Aggregator
+
+The result of the aggregation is a T-Digest sketch that is built ingesting numeric values from the raw data or from
+combining pre-generated T-Digest sketches.
+
+```json
+{
+  "type" : "tDigestSketch",
+  "name" : <output_name>,
+  "fieldName" : <metric_name>,
+  "compression": <parameter that controls sketch size and accuracy>
+}
+```
+
+Example:
+
+```json
+{
+ "type": "tDigestSketch",
+ "name": "sketch",
+ "fieldName": "session_duration",
+ "compression": 200
+}
+```
+
+```json
+{
+ "type": "tDigestSketch",
+ "name": "combined_sketch",
+ "fieldName": ,
+ "compression": 200
+}
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|This String should always be "tDigestSketch"|yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|fieldName|A String for the name of the input field containing raw numeric values or pre-generated T-Digest sketches.|yes|
+|compression|Parameter that determines the accuracy and size of the sketch. Higher compression means higher accuracy but more space to store sketches.|no, defaults to 100|
+
+
+### Post Aggregators
+
+#### Quantiles
+
+This returns an array of quantiles corresponding to a given array of fractions.
+
+```json
+{
+  "type" : "quantilesFromTDigestSketch",
+  "name": <output_name>,
+  "field" : <reference to the aggregated T-Digest sketch, e.g. a fieldAccess post aggregator>,
+  "fractions" : <array of fractions between 0 and 1>
+}
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|This String should always be "quantilesFromTDigestSketch"|yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A field reference pointing to the field aggregated/combined T-Digest sketch.|yes|
+|fractions|Non-empty array of fractions between 0 and 1|yes|
+
+Example:
+
+```json
+{
+ "queryType": "groupBy",
+ "dataSource": "test_datasource",
+ "granularity": "ALL",
+ "dimensions": [],
+ "aggregations": [{
+ "type": "tDigestSketch",
+ "name": "merged_sketch",
+ "fieldName": "ingested_sketch",
+ "compression": 200
+ }],
+ "postAggregations": [{
+ "type": "quantilesFromTDigestSketch",
+ "name": "quantiles",
+ "fractions": [0, 0.5, 1],
+ "field": {
+ "type": "fieldAccess",
+ "fieldName": "merged_sketch"
+ }
+ }],
+ "intervals": ["2016-01-01T00:00:00.000Z/2016-01-31T00:00:00.000Z"]
+}
+```
+
+The quantileFromTDigestSketch post aggregator is similar to quantilesFromTDigestSketch, except it takes a single fraction and computes a single quantile.
+
+```json
+{
+  "type" : "quantileFromTDigestSketch",
+  "name": <output_name>,
+  "field" : <reference to the aggregated T-Digest sketch, e.g. a fieldAccess post aggregator>,
+  "fraction" : <decimal value between 0 and 1>
+}
+```
+
+|property|description|required?|
+|--------|-----------|---------|
+|type|This String should always be "quantileFromTDigestSketch"|yes|
+|name|A String for the output (result) name of the calculation.|yes|
+|field|A field reference pointing to the field aggregated/combined T-Digest sketch.|yes|
+|fraction|Decimal value between 0 and 1|yes|
+
+### SQL functions
+
+Once you load the T-Digest extension, you can use the following SQL functions.
+
+#### TDIGEST_GENERATE_SKETCH
+
+Builds a T-Digest sketch on values produced by an expression.
+Compression parameter (default value 100) determines the accuracy and size of the sketch.
+Higher compression provides higher accuracy but requires more storage space.
+
+* **Syntax**: `TDIGEST_GENERATE_SKETCH(expr, [compression])`
+* **Default**: Empty Base64-encoded T-Digest sketch string
+* **Function type**: [Aggregation](../../querying/sql-aggregations.md)
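+
+As a sketch, the following Druid SQL query builds a T-Digest sketch over a numeric column; the table and column names (`test_datasource`, `session_duration`) reuse names from the examples above and are illustrative:
+
+```sql
+SELECT
+  TDIGEST_GENERATE_SKETCH("session_duration", 200) AS session_duration_sketch
+FROM "test_datasource"
+```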
+
+#### TDIGEST_QUANTILE
+
+Builds a T-Digest sketch on values produced by an expression and returns the value for the quantile.
+Compression parameter (default value 100) determines the accuracy and size of the sketch.
+Higher compression provides higher accuracy but requires more storage space.
+
+* **Syntax**: `TDIGEST_QUANTILE(expr, quantileFraction, [compression])`
+* **Default**: `Double.NaN`
+* **Function type**: [Aggregation](../../querying/sql-aggregations.md)
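+
+For example, the following hypothetical query computes the approximate 95th percentile of `session_duration` per day; all names are illustrative:
+
+```sql
+SELECT
+  TIME_FLOOR(__time, 'P1D') AS "day",
+  TDIGEST_QUANTILE("session_duration", 0.95, 200) AS p95_session_duration
+FROM "test_datasource"
+GROUP BY 1
+```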
diff --git a/docs/35.0.0/development/extensions-contrib/thrift.md b/docs/35.0.0/development/extensions-contrib/thrift.md
new file mode 100644
index 0000000000..3148982709
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/thrift.md
@@ -0,0 +1,87 @@
+---
+id: thrift
+title: "Thrift"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `druid-thrift-extensions` in the extensions load list.
+
+This extension enables Druid to ingest Thrift compact data online (`ByteBuffer`) and offline (SequenceFile of type `<Writable, BytesWritable>` or LzoThriftBlock file).
+
+If you want to use another version of Thrift, change the dependency in the pom and compile the extension yourself.
+
+## LZO Support
+
+If you plan to read LZO-compressed Thrift files, you will need to download version 0.4.19 of the [hadoop-lzo JAR](https://mvnrepository.com/artifact/com.hadoop.gplcompression/hadoop-lzo/0.4.19) and place it in your `extensions/druid-thrift-extensions` directory.
+
+## Thrift Parser
+
+
+| Field | Type | Description | Required |
+| ----------- | ----------- | ---------------------------------------- | -------- |
+| type | String | This should say `thrift` | yes |
+| parseSpec | JSON Object | Specifies the timestamp and dimensions of the data. Should be a JSON parseSpec. | yes |
+| thriftJar | String | Path of the Thrift JAR. If not provided, Druid tries to find the Thrift class on the classpath. For batch ingestion, upload the Thrift JAR to HDFS first and configure `jobProperties` with `"tmpjars":"/path/to/your/thrift.jar"`. | no |
+| thriftClass | String | Fully qualified classname of the Thrift object. | yes |
+
+- Batch ingestion example - `inputFormat` and `tmpjars` should be set.
+
+This example is for batch ingestion using the HadoopDruidIndexer. The `inputFormat` of `inputSpec` in `ioConfig` can be either `org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat` or `com.twitter.elephantbird.mapreduce.input.LzoThriftBlockInputFormat`. Note that when `LzoThriftBlockInputFormat` is used, the Thrift class must be provided twice.
+
+```json
+{
+ "type": "index_hadoop",
+ "spec": {
+ "dataSchema": {
+ "dataSource": "book",
+ "parser": {
+ "type": "thrift",
+ "jarPath": "book.jar",
+ "thriftClass": "org.apache.druid.data.input.thrift.Book",
+ "protocol": "compact",
+ "parseSpec": {
+ "format": "json",
+ ...
+ }
+ },
+ "metricsSpec": [],
+ "granularitySpec": {}
+ },
+ "ioConfig": {
+ "type": "hadoop",
+ "inputSpec": {
+ "type": "static",
+ "inputFormat": "org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat",
+ // "inputFormat": "com.twitter.elephantbird.mapreduce.input.LzoThriftBlockInputFormat",
+ "paths": "/user/to/some/book.seq"
+ }
+ },
+ "tuningConfig": {
+ "type": "hadoop",
+ "jobProperties": {
+ "tmpjars":"/user/h_user_profile/du00/druid/test/book.jar",
+ // "elephantbird.class.for.MultiInputFormat" : "${YOUR_THRIFT_CLASS_NAME}"
+ }
+ }
+ }
+}
+```
diff --git a/docs/35.0.0/development/extensions-contrib/time-min-max.md b/docs/35.0.0/development/extensions-contrib/time-min-max.md
new file mode 100644
index 0000000000..f83667baea
--- /dev/null
+++ b/docs/35.0.0/development/extensions-contrib/time-min-max.md
@@ -0,0 +1,104 @@
+---
+id: time-min-max
+title: "Timestamp Min/Max aggregators"
+---
+
+
+
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `druid-time-min-max` in the extensions load list.
+
+These aggregators enable more precise calculation of the min and max time of events than the `__time` column, whose granularity is coarse (truncated to the query granularity).
+To use this feature, a "timeMin" or "timeMax" aggregator must be included at indexing time.
+They can apply to any column that can be converted to a timestamp, including Long, DateTime, Timestamp, and String types.
+
+For example, consider a data set that consists of a timestamp, a dimension, and a metric value, like the following:
+
+```
+2015-07-28T01:00:00.000Z A 1
+2015-07-28T02:00:00.000Z A 1
+2015-07-28T03:00:00.000Z A 1
+2015-07-28T04:00:00.000Z B 1
+2015-07-28T05:00:00.000Z A 1
+2015-07-28T06:00:00.000Z B 1
+2015-07-29T01:00:00.000Z C 1
+2015-07-29T02:00:00.000Z C 1
+2015-07-29T03:00:00.000Z A 1
+2015-07-29T04:00:00.000Z A 1
+```
+
+At ingestion time, the timeMin and timeMax aggregators can be included like any other aggregators:
+
+```json
+{
+ "type": "timeMin",
+ "name": "tmin",
+ "fieldName": ""
+}
+```
+
+```json
+{
+ "type": "timeMax",
+ "name": "tmax",
+ "fieldName": ""
+}
+```
+
+`name` is the output name of the aggregator and can be any string. `fieldName` is typically the column specified in the timestamp spec, but it can be any column that can be converted to a timestamp.
+
+To query for results, the same "timeMin" and "timeMax" aggregators are used.
+
+```json
+{
+ "queryType": "groupBy",
+ "dataSource": "timeMinMax",
+ "granularity": "DAY",
+ "dimensions": ["product"],
+ "aggregations": [
+ {
+ "type": "count",
+ "name": "count"
+ },
+ {
+ "type": "timeMin",
+ "name": "",
+ "fieldName": "tmin"
+ },
+ {
+ "type": "timeMax",
+ "name": "",
+ "fieldName": "tmax"
+ }
+ ],
+ "intervals": [
+ "2010-01-01T00:00:00.000Z/2020-01-01T00:00:00.000Z"
+ ]
+}
+```
+
+The result then contains the min and max timestamps, which are finer than the query granularity:
+
+```
+2015-07-28T00:00:00.000Z A 4 2015-07-28T01:00:00.000Z 2015-07-28T05:00:00.000Z
+2015-07-28T00:00:00.000Z B 2 2015-07-28T04:00:00.000Z 2015-07-28T06:00:00.000Z
+2015-07-29T00:00:00.000Z A 2 2015-07-29T03:00:00.000Z 2015-07-29T04:00:00.000Z
+2015-07-29T00:00:00.000Z C 2 2015-07-29T01:00:00.000Z 2015-07-29T02:00:00.000Z
+```
diff --git a/docs/35.0.0/development/extensions-core/approximate-histograms.md b/docs/35.0.0/development/extensions-core/approximate-histograms.md
new file mode 100644
index 0000000000..240d87a5a0
--- /dev/null
+++ b/docs/35.0.0/development/extensions-core/approximate-histograms.md
@@ -0,0 +1,315 @@
+---
+id: approximate-histograms
+title: "Approximate Histogram aggregators"
+---
+
+
+
+
+:::caution
+ The Approximate Histogram aggregator is deprecated. Use [DataSketches Quantiles](../extensions-core/datasketches-quantiles.md) instead as it provides a superior distribution-independent algorithm with formal error guarantees.
+:::
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `druid-histogram` in the extensions load list.
+
+The `druid-histogram` extension provides an approximate histogram aggregator and a fixed buckets histogram aggregator.
+
+
+
+## Approximate Histogram aggregator
+
+
+This aggregator is based on
+[http://jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf](http://jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf)
+to compute approximate histograms, with the following modifications:
+
+- some tradeoffs in accuracy were made in the interest of speed (see below)
+- the sketch maintains the exact original data as long as the number of
+ distinct data points is fewer than the resolution (number of centroids),
+ increasing accuracy when there are few data points, or when dealing with
+ discrete data points. You can find some of the details in [this post](https://metamarkets.com/2013/histograms/).
+
+Here are a few things to note before using approximate histograms:
+
+- As indicated in the original paper, there are no formal error bounds on the
+ approximation. In practice, the approximation gets worse if the distribution
+ is skewed.
+- The algorithm is order-dependent, so results can vary for the same query, due
+ to variations in the order in which results are merged.
+- In general, the algorithm only works well if the incoming data is randomly
+ distributed (i.e. if data points end up sorted in a column, approximation
+ will be horrible)
+- We traded accuracy for aggregation speed, taking some shortcuts when adding
+ histograms together, which can lead to pathological cases if your data is
+ ordered in some way, or if your distribution has long tails. It should be
+ cheaper to increase the resolution of the sketch to get the accuracy you need.
+
+That being said, those sketches can be useful to get a first order approximation
+when averages are not good enough. Assuming most rows in your segment store
+fewer data points than the resolution of histogram, you should be able to use
+them for monitoring purposes and detect meaningful variations with a few
+hundred centroids. To get good accuracy readings on 95th percentiles with
+millions of rows of data, you may want to use several thousand centroids,
+especially with long tails, since that's where the approximation will be worse.
+
+### Creating approximate histogram sketches at ingestion time
+
+To use this feature, an "approxHistogram" or "approxHistogramFold" aggregator must be included at
+indexing time. The ingestion aggregator can only apply to numeric values. If you use "approxHistogram"
+then any input rows missing the value will be considered to have a value of 0, while with "approxHistogramFold"
+such rows will be ignored.
+
+To query for results, an "approxHistogramFold" aggregator must be included in the
+query.
+
+```json
+{
+ "type" : "approxHistogram or approxHistogramFold (at ingestion time), approxHistogramFold (at query time)",
+ "name" : ,
+ "fieldName" : ,
+ "resolution" : ,
+ "numBuckets" : ,
+ "lowerLimit" : ,
+ "upperLimit" :
+}
+```
+
+|Property |Description |Default |
+|-------------------------|------------------------------|----------------------------------|
+|`resolution` |Number of centroids (data points) to store. The higher the resolution, the more accurate results are, but the slower the computation will be.|50|
+|`numBuckets` |Number of output buckets for the resulting histogram. Bucket intervals are dynamic, based on the range of the underlying data. Use a post-aggregator to have finer control over the bucketing scheme|7|
+|`lowerLimit`/`upperLimit`|Restrict the approximation to the given range. The values outside this range will be aggregated into two centroids. Counts of values outside this range are still maintained. |-INF/+INF|
+|`finalizeAsBase64Binary` |If true, the finalized aggregator value will be a Base64-encoded byte array containing the serialized form of the histogram. If false, the finalized aggregator value will be a JSON representation of the histogram.|false|
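+
+As a concrete sketch of the above, the following ingestion-time aggregator builds an approximate histogram over a hypothetical `latencyMs` column; the names and parameter values are illustrative:
+
+```json
+{
+  "type": "approxHistogram",
+  "name": "latency_histogram",
+  "fieldName": "latencyMs",
+  "resolution": 200,
+  "numBuckets": 10
+}
+```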
+
+## Fixed Buckets Histogram
+
+The fixed buckets histogram aggregator builds a histogram on a numeric column, with evenly-sized buckets across a specified value range. Values outside of the range are handled based on a user-specified outlier handling mode.
+
+This histogram supports the min/max/quantiles post-aggregators but does not support the bucketing post-aggregators.
+
+### When to use
+
+The accuracy/usefulness of the fixed buckets histogram is extremely data-dependent; it is provided to support special use cases where the user has a great deal of prior information about the data being aggregated and knows that a fixed buckets implementation is suitable.
+
+For general histogram and quantile use cases, the [DataSketches Quantiles Sketch](../extensions-core/datasketches-quantiles.md) extension is recommended.
+
+### Properties
+
+
+|Property |Description |Default |
+|-------------------------|------------------------------|----------------------------------|
+|`type`|Type of the aggregator. Must be `fixedBucketsHistogram`.|No default, must be specified|
+|`name`|Column name for the aggregator.|No default, must be specified|
+|`fieldName`|Column name of the input to the aggregator.|No default, must be specified|
+|`lowerLimit`|Lower limit of the histogram. |No default, must be specified|
+|`upperLimit`|Upper limit of the histogram. |No default, must be specified|
+|`numBuckets`|Number of buckets for the histogram. The range [lowerLimit, upperLimit] will be divided into `numBuckets` intervals of equal size.|10|
+|`outlierHandlingMode`|Specifies how values outside of [lowerLimit, upperLimit] will be handled. Supported modes are "ignore", "overflow", and "clip". See [outlier handling modes](#outlier-handling-modes) for more details.|No default, must be specified|
+|`finalizeAsBase64Binary`|If true, the finalized aggregator value will be a Base64-encoded byte array containing the [serialized form](#serialization-formats) of the histogram. If false, the finalized aggregator value will be a JSON representation of the histogram.|false|
+
+An example aggregator spec is shown below:
+
+```json
+{
+ "type" : "fixedBucketsHistogram",
+ "name" : ,
+ "fieldName" : ,
+ "numBuckets" : ,
+ "lowerLimit" : ,
+ "upperLimit" : ,
+ "outlierHandlingMode":
+}
+```
+
+### Outlier handling modes
+
+The outlier handling mode specifies what should be done with values outside of the histogram's range. There are three supported modes:
+
+- `ignore`: Throw away outlier values.
+- `overflow`: A count of outlier values will be tracked by the histogram, available in the `lowerOutlierCount` and `upperOutlierCount` fields.
+- `clip`: Outlier values will be clipped to the `lowerLimit` or the `upperLimit` and included in the histogram.
+
+If you don't care about outliers, `ignore` is the cheapest option performance-wise. There is currently no difference in storage size among the modes.
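+
+For example, a fixed buckets histogram that tracks outliers for a hypothetical `latencyMs` column might be specified as follows; the name, limits, and bucket count are illustrative:
+
+```json
+{
+  "type": "fixedBucketsHistogram",
+  "name": "latency_fixed_histogram",
+  "fieldName": "latencyMs",
+  "numBuckets": 20,
+  "lowerLimit": 0.0,
+  "upperLimit": 2000.0,
+  "outlierHandlingMode": "overflow"
+}
+```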
+
+### Output fields
+
+The histogram aggregator's output object has the following fields:
+
+- `lowerLimit`: Lower limit of the histogram
+- `upperLimit`: Upper limit of the histogram
+- `numBuckets`: Number of histogram buckets
+- `outlierHandlingMode`: Outlier handling mode
+- `count`: Total number of values contained in the histogram, excluding outliers
+- `lowerOutlierCount`: Count of outlier values below `lowerLimit`. Only used if the outlier mode is `overflow`.
+- `upperOutlierCount`: Count of outlier values above `upperLimit`. Only used if the outlier mode is `overflow`.
+- `missingValueCount`: Count of null values seen by the histogram.
+- `max`: Max value seen by the histogram. This does not include outlier values.
+- `min`: Min value seen by the histogram. This does not include outlier values.
+- `histogram`: An array of longs with size `numBuckets`, containing the bucket counts
+
+### Ingesting existing histograms
+
+It is also possible to ingest existing fixed buckets histograms. The input must be a Base64 string encoding a byte array that contains a serialized histogram object. Both "full" and "sparse" formats can be used. Please see [Serialization formats](#serialization-formats) below for details.
+
+### Serialization formats
+
+#### Full serialization format
+
+This format includes the full histogram bucket count array in the serialization format.
+
+```
+byte: serialization version, must be 0x01
+byte: encoding mode, 0x01 for full
+double: lowerLimit
+double: upperLimit
+int: numBuckets
+byte: outlier handling mode (0x00 for `ignore`, 0x01 for `overflow`, and 0x02 for `clip`)
+long: count, total number of values contained in the histogram, excluding outliers
+long: lowerOutlierCount
+long: upperOutlierCount
+long: missingValueCount
+double: max
+double: min
+array of longs: bucket counts for the histogram
+```
+
+#### Sparse serialization format
+
+This format represents the histogram bucket counts as (bucketNum, count) pairs. This serialization format is used when less than half of the histogram's buckets have values.
+
+```
+byte: serialization version, must be 0x01
+byte: encoding mode, 0x02 for sparse
+double: lowerLimit
+double: upperLimit
+int: numBuckets
+byte: outlier handling mode (0x00 for `ignore`, 0x01 for `overflow`, and 0x02 for `clip`)
+long: count, total number of values contained in the histogram, excluding outliers
+long: lowerOutlierCount
+long: upperOutlierCount
+long: missingValueCount
+double: max
+double: min
+int: number of following (bucketNum, count) pairs
+sequence of (int, long) pairs:
+ int: bucket number
+ count: bucket count
+```
+
+### Combining histograms with different bucketing schemes
+
+It is possible to combine two histograms with different bucketing schemes (lowerLimit, upperLimit, numBuckets) together.
+
+The bucketing scheme of the "left hand" histogram will be preserved (i.e., when running a query, the bucketing schemes specified in the query's histogram aggregators will be preserved).
+
+When merging, we assume that values are evenly distributed within the buckets of the "right hand" histogram.
+
+When the right-hand histogram contains outliers (when using `overflow` mode), we assume that all of the outliers counted in the right-hand histogram will be outliers in the left-hand histogram as well.
+
+For performance and accuracy reasons, we recommend avoiding aggregation of histograms with different bucketing schemes if possible.
+
+### Null handling
+
+Druid tracks null values in the `missingValueCount` field of the histogram.
+
+## Histogram post-aggregators
+
+Post-aggregators are used to transform opaque approximate histogram sketches
+into bucketed histogram representations, as well as to compute various
+distribution metrics such as quantiles, min, and max.
+
+### Equal buckets post-aggregator
+
+Computes a visual representation of the approximate histogram with a given number of equal-sized bins.
+Bucket intervals are based on the range of the underlying data. This aggregator is not supported for the fixed buckets histogram.
+
+```json
+{
+ "type": "equalBuckets",
+ "name": "",
+ "fieldName": "",
+ "numBuckets":
+}
+```
+
+### Buckets post-aggregator
+
+Computes a visual representation given an initial breakpoint, offset, and a bucket size.
+
+Bucket size determines the width of the binning interval.
+
+Offset determines the value on which those interval bins align.
+
+This aggregator is not supported for the fixed buckets histogram.
+
+```json
+{
+ "type": "buckets",
+ "name": "",
+ "fieldName": "",
+ "bucketSize": ,
+ "offset":
+}
+```
+
+### Custom buckets post-aggregator
+
+Computes a visual representation of the approximate histogram with bins laid out according to the given breaks.
+
+This aggregator is not supported for the fixed buckets histogram.
+
+```json
+{ "type" : "customBuckets", "name" : , "fieldName" : ,
+ "breaks" : [ , , ... ] }
+```
+
+### min post-aggregator
+
+Returns the minimum value of the underlying approximate or fixed buckets histogram aggregator
+
+```json
+{ "type" : "min", "name" : , "fieldName" : }
+```
+
+### max post-aggregator
+
+Returns the maximum value of the underlying approximate or fixed buckets histogram aggregator
+
+```json
+{ "type" : "max", "name" : , "fieldName" : }
+```
+
+### quantile post-aggregator
+
+Computes a single quantile based on the underlying approximate or fixed buckets histogram aggregator
+
+```json
+{ "type" : "quantile", "name" : , "fieldName" : ,
+ "probability" : }
+```
+
+### quantiles post-aggregator
+
+Computes an array of quantiles based on the underlying approximate or fixed buckets histogram aggregator
+
+```json
+{ "type" : "quantiles", "name" : , "fieldName" : ,
+ "probabilities" : [ , , ... ] }
+```
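+
+To tie the pieces together, here is a minimal sketch of a timeseries query that folds an ingested approximate histogram column and computes the 95th percentile with the quantile post-aggregator; the datasource and column names are illustrative:
+
+```json
+{
+  "queryType": "timeseries",
+  "dataSource": "request_metrics",
+  "granularity": "day",
+  "aggregations": [
+    {
+      "type": "approxHistogramFold",
+      "name": "latency_histogram",
+      "fieldName": "latency_histogram",
+      "resolution": 200
+    }
+  ],
+  "postAggregations": [
+    {
+      "type": "quantile",
+      "name": "latency_p95",
+      "fieldName": "latency_histogram",
+      "probability": 0.95
+    }
+  ],
+  "intervals": ["2015-09-12T00:00:00.000/2015-09-13T00:00:00.000"]
+}
+```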
diff --git a/docs/35.0.0/development/extensions-core/avro.md b/docs/35.0.0/development/extensions-core/avro.md
new file mode 100644
index 0000000000..7db7530b07
--- /dev/null
+++ b/docs/35.0.0/development/extensions-core/avro.md
@@ -0,0 +1,64 @@
+---
+id: avro
+title: "Apache Avro"
+---
+
+
+
+This Apache Druid extension enables Druid to ingest and parse the Apache Avro data format as follows:
+- [Avro stream input format](../../ingestion/data-formats.md#avro-stream) for Kafka and Kinesis.
+- [Avro OCF input format](../../ingestion/data-formats.md#avro-ocf) for native batch ingestion.
+- [Avro Hadoop Parser](../../ingestion/data-formats.md#avro-hadoop-parser).
+
+The [Avro Stream Parser](../../ingestion/data-formats.md#avro-stream-parser) is deprecated.
+
+## Load the Avro extension
+
+To use the Avro extension, add the `druid-avro-extensions` to the list of loaded extensions. See [Loading extensions](../../configuration/extensions.md#loading-extensions) for more information.
+
+## Avro types
+
+Druid supports most Avro types natively. This section describes some exceptions.
+
+### Unions
+Druid has two modes for supporting `union` types.
+
+The default mode treats unions as a single value regardless of the type of data populating the union.
+
+If you want to operate on individual members of a union, set `extractUnionsByType` on the Avro parser. This configuration expands union values into nested objects according to the following rules:
+- Primitive types and unnamed complex types are keyed by their type name, such as `int` and `string`.
+- Complex named types are keyed by their names; this includes `record`, `fixed`, and `enum`.
+- The Avro null type is elided as its value can only ever be null.
+
+This is safe because an Avro union can only contain a single member of each unnamed type, and duplicates of the same named type are not allowed. For example, only a single array is allowed, but multiple records (or other named types) are allowed as long as each has a unique name.
+
+You can then access the members of the union with a [flattenSpec](../../ingestion/data-formats.md#flattenspec) like you would for other nested types.
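+
+For instance, assuming a union field named `some_union` that can hold either an `int` or a record named `MemberRecord`, a flattenSpec along these lines could extract each member into its own column (the field and record names here are hypothetical):
+
+```json
+{
+  "useFieldDiscovery": true,
+  "fields": [
+    { "type": "path", "name": "some_union_int", "expr": "$.some_union.int" },
+    { "type": "path", "name": "member_record_id", "expr": "$.some_union.MemberRecord.id" }
+  ]
+}
+```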
+
+### Binary types
+The extension returns `bytes` and `fixed` Avro types as base64 encoded strings by default. To decode these types as UTF-8 strings, enable the `binaryAsString` option on the Avro parser.
+
+### Enums
+The extension returns `enum` types as `string` of the enum symbol.
+
+### Complex types
+You can ingest `record` and `map` types representing nested data with a [flattenSpec](../../ingestion/data-formats.md#flattenspec) on the parser.
+
+### Logical types
+Druid does not currently support Avro logical types. It ignores them and handles fields according to the underlying primitive type.
diff --git a/docs/35.0.0/development/extensions-core/azure.md b/docs/35.0.0/development/extensions-core/azure.md
new file mode 100644
index 0000000000..d6310e32cf
--- /dev/null
+++ b/docs/35.0.0/development/extensions-core/azure.md
@@ -0,0 +1,96 @@
+---
+id: azure
+title: "Microsoft Azure"
+---
+
+
+
+## Azure extension
+
+This extension allows you to do the following:
+
+* [Ingest data](#ingest-data-from-azure) from objects stored in Azure Blob Storage.
+* [Write segments](#store-segments-in-azure) to Azure Blob Storage for deep storage.
+* [Persist task logs](#persist-task-logs-in-azure) to Azure Blob Storage for long-term storage.
+
+:::info
+
+To use this Apache Druid extension, [include](../../configuration/extensions.md#loading-extensions) `druid-azure-extensions` in the extensions load list.
+
+:::
+
+### Ingest data from Azure
+
+Ingest data using either [MSQ](../../multi-stage-query/index.md) or a native batch [parallel task](../../ingestion/native-batch.md) with an [Azure input source](../../ingestion/input-sources.md#azure-input-source) (`azureStorage`) to read objects directly from Azure Blob Storage.
+
+### Store segments in Azure
+
+:::info
+
+To use Azure for deep storage, set `druid.storage.type=azure`.
+
+:::
+
+#### Configure location
+
+Configure where to store segments using the following properties:
+
+| Property | Description | Default |
+|---|---|---|
+| `druid.azure.account` | The Azure Storage account name. | Must be set. |
+| `druid.azure.container` | The Azure Storage container name. | Must be set. |
+| `druid.azure.prefix` | A prefix string that will be prepended to the blob names for the segments published. | "" |
+| `druid.azure.maxTries` | Number of tries before canceling an Azure operation. | 3 |
+| `druid.azure.protocol` | The protocol to use to connect to the Azure Storage account. Either `http` or `https`. | `https` |
+| `druid.azure.storageAccountEndpointSuffix` | The Storage account endpoint to use. Override the default value to connect to [Azure Government](https://learn.microsoft.com/en-us/azure/azure-government/documentation-government-get-started-connect-to-storage#getting-started-with-storage-api) or storage accounts with [Azure DNS zone endpoints](https://learn.microsoft.com/en-us/azure/storage/common/storage-account-overview#azure-dns-zone-endpoints-preview). Do _not_ include the storage account name prefix in this config value. Examples: `ABCD1234.blob.storage.azure.net`, `blob.core.usgovcloudapi.net`. | `blob.core.windows.net` |
+
+#### Configure authentication
+
+Authenticate access to Azure Blob Storage using one of the following methods:
+
+* [SAS token](https://learn.microsoft.com/en-us/azure/storage/common/storage-sas-overview)
+* [Shared Key](https://learn.microsoft.com/en-us/rest/api/storageservices/authorize-with-shared-key)
+* Default Azure credentials chain ([`DefaultAzureCredential`](https://learn.microsoft.com/en-us/java/api/overview/azure/identity-readme#defaultazurecredential)).
+
+Configure authentication using the following properties:
+
+| Property | Description | Default |
+|---|---|---|
+| `druid.azure.sharedAccessStorageToken` | The SAS (Shared Storage Access) token. | |
+| `druid.azure.key` | The Shared Key. | |
+| `druid.azure.useAzureCredentialsChain` | If `true`, use `DefaultAzureCredential` for authentication. | `false` |
+| `druid.azure.managedIdentityClientId` | To use managed identity authentication in the `DefaultAzureCredential`, set `useAzureCredentialsChain` to `true` and provide the client ID here. | |
+
+### Persist task logs in Azure
+
+:::info
+
+To persist task logs in Azure Blob Storage, set `druid.indexer.logs.type=azure`.
+
+:::
+
+Druid stores task logs using the storage account and authentication method configured for storing segments. Use the following configuration to set up where to store the task logs:
+
+| Property | Description | Default |
+|---|---|---|
+| `druid.indexer.logs.container` | The Azure Blob Store container to write logs to. | Must be set. |
+| `druid.indexer.logs.prefix` | The path to prepend to logs. | Must be set. |
+
+For general options regarding task retention, see [Log retention policy](../../configuration/index.md#log-retention-policy).
diff --git a/docs/35.0.0/development/extensions-core/bloom-filter.md b/docs/35.0.0/development/extensions-core/bloom-filter.md
new file mode 100644
index 0000000000..c0167e446d
--- /dev/null
+++ b/docs/35.0.0/development/extensions-core/bloom-filter.md
@@ -0,0 +1,175 @@
+---
+id: bloom-filter
+title: "Bloom Filter"
+---
+
+
+
+
+To use the Apache Druid® Bloom filter extension, include `druid-bloom-filter` in the extensions load list. See [Loading extensions](../../configuration/extensions.md#loading-extensions) for more information.
+
+This extension adds the ability to construct Bloom filters from query results and to filter query results by testing
+against a Bloom filter. A Bloom filter is a probabilistic data structure to check for set membership. A Bloom
+filter is a good candidate to use when an explicit filter is impossible, such as filtering a query
+against a set of millions of values.
+
+Following are some characteristics of Bloom filters:
+
+- Bloom filters are significantly more space efficient than HashSets.
+- Because they are probabilistic, false positive results are possible with Bloom filters. For example, the `test()` function might return `true` for an element that is not within the filter.
+- False negatives are not possible. If an element is present, `test()` always returns `true`.
+- The false positive probability of this implementation is fixed at 5%. Increasing the number of entries that the filter can hold can decrease this false positive rate in exchange for overall size.
+- Bloom filters are sensitive to the number of inserted elements. You must specify the expected number of entries at creation time. If the number of insertions exceeds the specified number of entries, the false positive probability increases accordingly.
+
+This extension is based on `org.apache.hive.common.util.BloomKFilter` from `hive-storage-api`. Internally,
+this implementation uses Murmur3 as the hash algorithm.
+
+The following Java example shows how to construct a BloomKFilter externally:
+
+```java
+BloomKFilter bloomFilter = new BloomKFilter(1500);
+bloomFilter.addString("value 1");
+bloomFilter.addString("value 2");
+bloomFilter.addString("value 3");
+ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
+BloomKFilter.serialize(byteArrayOutputStream, bloomFilter);
+String base64Serialized = Base64.encodeBase64String(byteArrayOutputStream.toByteArray());
+```
+
+You can then use the Base64 encoded string in JSON-based or SQL-based queries in Druid.
+
+## Filter queries with a Bloom filter
+
+### JSON specification
+
+```json
+{
+ "type" : "bloom",
+ "dimension" : ,
+ "bloomKFilter" : ,
+ "extractionFn" :
+}
+```
+
+|Property|Description|Required|
+|--------|-----------|--------|
+|`type`|Filter type. Set to `bloom`.|Yes|
+|`dimension`|Dimension to filter over.|Yes|
+|`bloomKFilter`|Base64 encoded binary representation of `org.apache.hive.common.util.BloomKFilter`.|Yes|
+|`extractionFn`|[Extraction function](../../querying/dimensionspecs.md#extraction-functions) to apply to the dimension values.|No|
+
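+As a sketch of how this filter plugs into a native query, the following timeseries query filters on a `user` dimension; the Base64 value is truncated and illustrative:
+
+```json
+{
+  "queryType": "timeseries",
+  "dataSource": "wikiticker",
+  "intervals": [ "2015-09-12T00:00:00.000/2015-09-13T00:00:00.000" ],
+  "granularity": "day",
+  "filter": {
+    "type": "bloom",
+    "dimension": "user",
+    "bloomKFilter": "BAAAJhAAAA..."
+  },
+  "aggregations": [
+    { "type": "count", "name": "count" }
+  ]
+}
+```
+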
+### Serialized format for BloomKFilter
+
+Serialized BloomKFilter format:
+
+- 1 byte for the number of hash functions.
+- 1 big-endian integer for the number of longs in the bitset.
+- Big-endian longs in the BloomKFilter bitset.
+
+`org.apache.hive.common.util.BloomKFilter` provides a method to serialize Bloom filters to `outputStream`.
+
+### Filter SQL queries
+
+You can use Bloom filters in SQL `WHERE` clauses with the `bloom_filter_test` operator:
+
+```sql
+SELECT COUNT(*) FROM druid.foo WHERE bloom_filter_test(<expr>, '<serialized_bytes_base64>')
+```
+
+### Expression and virtual column support
+
+The Bloom filter extension also adds a Bloom filter [Druid expression](../../querying/math-expr.md) which shares syntax
+with the SQL operator.
+
+```sql
+bloom_filter_test(<expr>, '<serialized_bytes_base64>')
+```
+
+## Bloom filter query aggregator
+
+You can create an input for a `BloomKFilter` from a Druid query with the `bloom` aggregator. Make sure to set a reasonable value for the `maxNumEntries` parameter to specify the maximum number of distinct entries that the Bloom filter can represent without increasing the false positive rate. Try performing a query using
+one of the unique count sketches to calculate the value for this parameter to build a Bloom filter appropriate for the query.
+
+### JSON specification
+
+```json
+{
+ "type": "bloom",
+ "name": ,
+ "maxNumEntries":
+ "field":
+ }
+```
+
+|Property|Description|Required|
+|--------|-----------|--------|
+|`type`|Aggregator type. Set to `bloom`.|Yes|
+|`name`|Output field name.|Yes|
+|`field`|[DimensionSpec](../../querying/dimensionspecs.md) to add to `org.apache.hive.common.util.BloomKFilter`.|Yes|
+|`maxNumEntries`|Maximum number of distinct values supported by `org.apache.hive.common.util.BloomKFilter`. Defaults to `1500`.|No|
+
+### Example
+
+The following example shows a timeseries query object with a `bloom` aggregator:
+
+```json
+{
+ "queryType": "timeseries",
+ "dataSource": "wikiticker",
+ "intervals": [ "2015-09-12T00:00:00.000/2015-09-13T00:00:00.000" ],
+ "granularity": "day",
+ "aggregations": [
+ {
+ "type": "bloom",
+ "name": "userBloom",
+ "maxNumEntries": 100000,
+ "field": {
+ "type":"default",
+ "dimension":"user",
+ "outputType": "STRING"
+ }
+ }
+ ]
+}
+```
+
+Example response:
+
+```json
+[
+ {
+ "timestamp":"2015-09-12T00:00:00.000Z",
+ "result":{"userBloom":"BAAAJhAAAA..."}
+ }
+]
+```
+
+We recommend ordering by an alternative aggregation method instead of ordering results by a Bloom filter aggregator.
+Ordering results by a Bloom filter aggregator can be resource-intensive because Druid performs an expensive linear scan of the filter to approximate the count of items added to the set by counting the number of set bits.
+
+### SQL Bloom filter aggregator
+
+You can compute Bloom filters in SQL expressions with the BLOOM_FILTER aggregator. For example:
+
+```sql
+SELECT BLOOM_FILTER(<expression>, <maxNumEntries>) FROM druid.foo WHERE dim2 = 'abc'
+```
+
+Druid serializes Bloom filter results in a SQL response into a Base64 string. You can use the resulting string in subsequent queries as a filter.
diff --git a/docs/35.0.0/development/extensions-core/catalog.md b/docs/35.0.0/development/extensions-core/catalog.md
new file mode 100644
index 0000000000..37e56941e0
--- /dev/null
+++ b/docs/35.0.0/development/extensions-core/catalog.md
@@ -0,0 +1,456 @@
+---
+id: catalog
+title: Catalog
+---
+
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+Consider this an [EXPERIMENTAL](../experimental.md) feature, mostly because it has not yet been tested on a wide variety of long-running Druid clusters.
+
+This extension allows users to configure, update, retrieve, and manage metadata stored in Druid's catalog. At present, only metadata about tables is stored in the catalog. This extension only supports MSQ based ingestion.
+
+## Configuration
+
+To use this extension please make sure to [include](../../configuration/extensions.md#loading-extensions) `druid-catalog` in the extensions load list.
+
+# Catalog Metadata
+
+## Tables
+
+A user may define a table with a set of column names and respective data types, along with other properties. When
+ingesting data into a table defined in the catalog, the DML query is validated against the table's definition. This
+allows the user to omit table properties that are already found in the definition, making queries more concise and
+simpler to write. It also ensures that the type of data written into a defined column is consistent with that
+column's definition, minimizing errors where unexpected data is written into a particular column of the table.
+
+### API Objects
+
+#### TableSpec
+
+A tableSpec defines a table.
+
+| Property | Type | Description | Required | Default |
+|--------------|---------------------------------|---------------------------------------------------------------------------|----------|---------|
+| `type` | String | the type of table. The only value supported at this time is `datasource` | yes | null |
+| `properties` | Map<String, Object> | the table's defined properties. see [table properties](#table-properties) | no | null |
+| `columns` | List<[ColumnSpec](#columnspec)> | the table's defined columns | no | null |
+
+#### Table Properties
+
+| PropertyKeyName | PropertyValueType | Description | Required | Default |
+|----------------------|-------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|---------|
+| `segmentGranularity` | String | determines how time-based partitioning is done. See [Partitioning by time](../../multi-stage-query/concepts.md#partitioning-by-time). Can specify any of the values as permitted for [PARTITIONED BY](../../multi-stage-query/reference.md#partitioned-by). This property value may be overridden at query time, by specifying the PARTITIONED BY clause. | no | null |
+| `sealed` | boolean | require all columns in the table schema to be fully declared before data is ingested. Setting this to true will cause failure when DML queries attempt to add undefined columns to the table. | no | false |
+
+#### ColumnSpec
+
+| Property | Type | Description | Required | Default |
+|--------------|---------------------|------------------------------------------------------------------------------------------------------------------------|----------|---------|
+| `name` | String | The name of the column | yes | null |
+| `dataType` | String | The type of the column. Can be any column data type that is available to Druid. Depends on what extensions are loaded. | no | null |
+| `properties` | Map<String, Object> | The column's defined properties. No properties are defined at this time. | no | null |
+
+### APIs
+
+#### Create or update a table
+
+Update or create a new table containing the given table specification.
+
+##### URL
+
+`POST` `/druid/coordinator/v1/catalog/schemas/{schema}/tables/{name}`
+
+##### Request body
+
+The request object for this request is a [TableSpec](#tablespec)
+
+##### Query parameters
+
+The endpoint supports a set of optional query parameters to enforce optimistic locking, and to specify that a request
+is meant to update a table rather than create a new one. In the default case, with no query parameters set, this request
+will return an error if a table of the same name already exists in the schema specified.
+
+| Parameter | Type | Description |
+|-------------|---------|-------------------------------------------------------------------------------------------------------------------------------|
+| `version` | Long | the expected version of an existing table. The version must match. If not (or if the table does not exist), returns an error. |
+| `overwrite` | boolean | if true, then overwrites any existing table. Otherwise, the operation fails if the table already exists. |
+
+##### Responses
+
+
+
+
+
+*Successfully submitted table spec. Returns an object that includes the version of the table created or updated:*
+
+```json
+{
+ "version": 12345687
+}
+```
+
+
+
+
+*Error thrown due to bad request. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error."
+}
+```
+
+
+
+
+*Error thrown due to unexpected conditions. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error."
+}
+```
+
+
+
+
+##### Sample request
+
+The following example shows how to create a sealed table with several defined columns and a defined segment granularity of `"P1D"`:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/catalog/schemas/druid/tables/test_table" \
+-X 'POST' \
+--header 'Content-Type: application/json' \
+--data '{
+ "type": "datasource",
+ "columns": [
+ {
+ "name": "__time",
+ "dataType": "long"
+ },
+ {
+ "name": "double_col",
+ "dataType": "double"
+ },
+ {
+ "name": "float_col",
+ "dataType": "float"
+ },
+ {
+ "name": "long_col",
+ "dataType": "long"
+ },
+ {
+ "name": "string_col",
+ "dataType": "string"
+ }
+ ],
+ "properties": {
+ "segmentGranularity": "P1D",
+ "sealed": true
+ }
+}'
+```
+
+##### Sample response
+
+```json
+{
+ "version": 1730965026295
+}
+```
+
+#### Retrieve a table
+
+Retrieve a table
+
+##### URL
+
+`GET` `/druid/coordinator/v1/catalog/schemas/{schema}/tables/{name}`
+
+##### Responses
+
+
+
+
+
+*Successfully retrieved corresponding table's [TableSpec](#tablespec)*
+
+
+
+
+*Error thrown due to bad request. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error."
+}
+```
+
+
+
+*Error thrown due to unexpected conditions. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error."
+}
+```
+
+
+
+
+##### Sample request
+
+The following example shows how to retrieve a table named `test_table` in schema `druid`:
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/catalog/schemas/druid/tables/test_table"
+```
+
+##### Sample response
+
+
+ View the response
+
+```json
+{
+ "id": {
+ "schema": "druid",
+ "name": "test_table"
+ },
+ "creationTime": 1730965026295,
+ "updateTime": 1730965026295,
+ "state": "ACTIVE",
+ "spec": {
+ "type": "datasource",
+ "properties": {
+ "segmentGranularity": "P1D",
+ "sealed": true
+ },
+ "columns": [
+ {
+ "name": "__time",
+ "dataType": "long"
+ },
+ {
+ "name": "double_col",
+ "dataType": "double"
+ },
+ {
+ "name": "float_col",
+ "dataType": "float"
+ },
+ {
+ "name": "long_col",
+ "dataType": "long"
+ },
+ {
+ "name": "string_col",
+ "dataType": "string"
+ }
+ ]
+ }
+}
+```
+
+
+#### Delete a table
+
+Delete a table
+
+##### URL
+
+`DELETE` `/druid/coordinator/v1/catalog/schemas/{schema}/tables/{name}`
+
+##### Responses
+
+
+
+
+
+*No response body*
+
+
+
+
+*Error thrown due to bad request. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error."
+}
+```
+
+
+
+*Error thrown due to unexpected conditions. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error."
+}
+```
+
+
+
+
+##### Sample request
+
+The following example shows how to delete a table named `test_table` in schema `druid`:
+
+```shell
+curl -X 'DELETE' "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/catalog/schemas/druid/tables/test_table"
+```
+
+##### Sample response
+
+No response body
+
+#### Retrieve list of schema names
+
+Retrieve the list of schema names.
+
+##### URL
+
+`GET` `/druid/coordinator/v1/catalog/schemas`
+
+##### Responses
+
+
+
+
+
+*Successfully retrieved list of schema names*
+
+
+
+
+*Error thrown due to bad request. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error."
+}
+```
+
+
+
+*Error thrown due to unexpected conditions. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error."
+}
+```
+
+
+
+
+##### Sample request
+
+The following example shows how to retrieve the list of schema names.
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/catalog/schemas"
+```
+
+##### Sample response
+
+```json
+[
+ "INFORMATION_SCHEMA",
+ "druid",
+ "ext",
+ "lookups",
+ "sys",
+ "view"
+]
+```
+
+#### Retrieve list of table names in schema
+
+Retrieve a list of table names in the schema.
+
+##### URL
+
+`GET` `/druid/coordinator/v1/catalog/schemas/{schema}/tables`
+
+##### Responses
+
+
+
+
+
+*Successfully retrieved list of table names belonging to schema*
+
+
+
+
+*Error thrown due to bad request. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error."
+}
+```
+
+
+
+*Error thrown due to unexpected conditions. Returns a JSON object detailing the error with the following format:*
+
+```json
+{
+ "error": "A well-defined error code.",
+ "errorMessage": "A message with additional details about the error."
+}
+```
+
+
+
+
+##### Sample request
+
+The following example shows how to retrieve the names of all tables in the `druid` schema.
+
+```shell
+curl "http://ROUTER_IP:ROUTER_PORT/druid/coordinator/v1/catalog/schemas/druid/tables"
+```
+
+##### Sample response
+
+```json
+[
+ "test_table"
+]
+```
\ No newline at end of file
diff --git a/docs/35.0.0/development/extensions-core/datasketches-extension.md b/docs/35.0.0/development/extensions-core/datasketches-extension.md
new file mode 100644
index 0000000000..00c955dc98
--- /dev/null
+++ b/docs/35.0.0/development/extensions-core/datasketches-extension.md
@@ -0,0 +1,40 @@
+---
+id: datasketches-extension
+title: "DataSketches extension"
+---
+
+
+
+
+Apache Druid aggregators based on the [Apache DataSketches](https://datasketches.apache.org/) library. Sketches are data structures implementing approximate streaming mergeable algorithms. Sketches can be ingested from outside of Druid or built from raw data at ingestion time. Sketches can be stored in Druid segments as additive metrics.
+
+To use the datasketches aggregators, make sure you [include](../../configuration/extensions.md#loading-extensions) the extension in your config file:
+
+```
+druid.extensions.loadList=["druid-datasketches"]
+```
+
+The following modules are available:
+
+* [Theta sketch](datasketches-theta.md) - approximate distinct counting with set operations (union, intersection and set difference).
+* [Tuple sketch](datasketches-tuple.md) - extension of Theta sketch to support values associated with distinct keys (arrays of numeric values in this specialized implementation).
+* [Quantiles sketch](datasketches-quantiles.md) - approximate distribution of comparable values to obtain ranks, quantiles and histograms. This is a specialized implementation for numeric values.
+* [KLL Quantiles sketch](datasketches-kll.md) - approximate distribution of comparable values to obtain ranks, quantiles and histograms. This is a specialized implementation for numeric values. This is a more advanced algorithm compared to the classic quantiles above, sketches are more compact for the same accuracy, or more accurate for the same size.
+* [HLL sketch](datasketches-hll.md) - approximate distinct counting using very compact HLL sketch.
diff --git a/docs/35.0.0/development/extensions-core/datasketches-hll.md b/docs/35.0.0/development/extensions-core/datasketches-hll.md
new file mode 100644
index 0000000000..4e2b369e5e
--- /dev/null
+++ b/docs/35.0.0/development/extensions-core/datasketches-hll.md
@@ -0,0 +1,155 @@
+---
+id: datasketches-hll
+title: "DataSketches HLL Sketch module"
+---
+
+
+
+
+This module provides Apache Druid aggregators for distinct counting based on HLL sketch from [Apache DataSketches](https://datasketches.apache.org/) library. At ingestion time, this aggregator creates the HLL sketch objects to store in Druid segments. By default, Druid reads and merges sketches at query time. The default result is
+the estimate of the number of distinct values presented to the sketch. You can also use post aggregators to produce a union of sketch columns in the same row.
+You can use the HLL sketch aggregator on any column to estimate its cardinality.
+
+To use this aggregator, make sure you [include](../../configuration/extensions.md#loading-extensions) the extension in your config file:
+
+```
+druid.extensions.loadList=["druid-datasketches"]
+```
+
+For additional sketch types supported in Druid, see [DataSketches extension](datasketches-extension.md).
+
+## Aggregators
+
+|Property|Description|Required?|
+|--------|-----------|---------|
+|`type`|Either [`HLLSketchBuild`](#hllsketchbuild-aggregator) or [`HLLSketchMerge`](#hllsketchmerge-aggregator).|yes|
+|`name`|String representing the output column to store sketch values.|yes|
+|`fieldName`|The name of the input field.|yes|
+|`lgK`|log2 of K, which is the number of buckets in the sketch; this parameter controls the size and the accuracy. Must be between 4 and 21 inclusive.|no, defaults to `12`|
+|`tgtHllType`|The type of the target HLL sketch. Must be `HLL_4`, `HLL_6` or `HLL_8` |no, defaults to `HLL_4`|
+|`round`|Round off values to whole numbers. Only affects query-time behavior and is ignored at ingestion-time.|no, defaults to `false`|
+|`shouldFinalize`|Return the final double type representing the estimate rather than the intermediate sketch type itself. In addition to controlling the finalization of this aggregator, you can control whether all aggregators are finalized with the query context parameters [`finalize`](../../querying/query-context-reference.md) and [`sqlFinalizeOuterSketches`](../../querying/sql-query-context.md).|no, defaults to `true`|
+
+:::info
+ The default `lgK` value has proven to be sufficient for most use cases; expect only very negligible improvements in accuracy with `lgK` values over `16` in normal circumstances.
+:::
+
+### HLLSketchBuild aggregator
+
+```
+{
+ "type": "HLLSketchBuild",
+ "name": ,
+ "fieldName": ,
+ "lgK": ,
+ "tgtHllType": ,
+ "round":
+ }
+```
+
+The `HLLSketchBuild` aggregator builds an HLL sketch object from the specified input column. When used during ingestion, Druid stores pre-generated HLL sketch objects in the datasource instead of the raw data from the input column.
+When applied at query time on an existing dimension, you can use the resulting column as an intermediate dimension by the [post-aggregators](#post-aggregators).
+
+:::info
+ It is very common to use `HLLSketchBuild` in combination with [rollup](../../ingestion/rollup.md) to create a [metric](../../ingestion/ingestion-spec.md#metricsspec) on high-cardinality columns. In this example, a metric called `userid_hll` is included in the `metricsSpec`. This will perform a HLL sketch on the `userid` field at ingestion time, allowing for highly-performant approximate `COUNT DISTINCT` query operations and improving roll-up ratios when `userid` is then left out of the `dimensionsSpec`.
+
+ ```
+ "metricsSpec": [
+ {
+ "type": "HLLSketchBuild",
+ "name": "userid_hll",
+ "fieldName": "userid",
+ "lgK": 12,
+ "tgtHllType": "HLL_4"
+ }
+ ]
+ ```
+
+:::
+
+### HLLSketchMerge aggregator
+
+```
+{
+ "type": "HLLSketchMerge",
+ "name": ,
+ "fieldName": ,
+ "lgK": ,
+ "tgtHllType": ,
+ "round":
+}
+```
+
+You can use the `HLLSketchMerge` aggregator to ingest pre-generated sketches from an input dataset. For example, you can set up a batch processing job to generate the sketches before sending the data to Druid. You must serialize the sketches in the input dataset to Base64-encoded bytes. Then, specify `HLLSketchMerge` for the input column in the native ingestion `metricsSpec`.
+
+## Post aggregators
+
+### Estimate
+
+Returns the distinct count estimate as a double.
+
+```
+{
+ "type": "HLLSketchEstimate",
+ "name": ,
+ "field": ,
+ "round":
+}
+```
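+
+As an illustration of how the merge aggregator and the estimate post aggregator fit together, the following timeseries query returns a daily distinct count from a pre-built `userid_hll` column; the datasource and column names are illustrative:
+
+```json
+{
+  "queryType": "timeseries",
+  "dataSource": "site_traffic",
+  "granularity": "day",
+  "aggregations": [
+    {
+      "type": "HLLSketchMerge",
+      "name": "userid_hll",
+      "fieldName": "userid_hll",
+      "lgK": 12
+    }
+  ],
+  "postAggregations": [
+    {
+      "type": "HLLSketchEstimate",
+      "name": "unique_users",
+      "field": { "type": "fieldAccess", "fieldName": "userid_hll" }
+    }
+  ],
+  "intervals": ["2015-09-12/2015-09-13"]
+}
+```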
+
+### Estimate with bounds
+
+Returns a distinct count estimate and error bounds from an HLL sketch.
+The result will be an array containing three double values: estimate, lower bound and upper bound.
+The bounds are provided at a given number of standard deviations (optional, defaults to 1).
+This must be an integer value of 1, 2 or 3 corresponding to approximately 68.3%, 95.4% and 99.7% confidence intervals.
+
+```
+{
+ "type": "HLLSketchEstimateWithBounds",
+ "name": ,
+ "field": ,
+ "numStdDev":
+}
+```
+
+### Union
+
+```
+{
+ "type": "HLLSketchUnion",
+ "name": ,
+ "fields": ,
+ "lgK": ,
+ "tgtHllType":
+}
+```
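+
+For example, a union across two hypothetical sketch columns could be expressed as follows; the output and column names are illustrative:
+
+```json
+{
+  "type": "HLLSketchUnion",
+  "name": "users_and_devices_union",
+  "fields": [
+    { "type": "fieldAccess", "fieldName": "userid_hll" },
+    { "type": "fieldAccess", "fieldName": "deviceid_hll" }
+  ],
+  "lgK": 12,
+  "tgtHllType": "HLL_4"
+}
+```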
+
+### Sketch to string
+
+Human-readable sketch summary for debugging.
+
+```
+{
+ "type": "HLLSketchToString",
+ "name": ,
+ "field":
+}
+```
diff --git a/docs/35.0.0/development/extensions-core/datasketches-kll.md b/docs/35.0.0/development/extensions-core/datasketches-kll.md
new file mode 100644
index 0000000000..b8e372dc94
--- /dev/null
+++ b/docs/35.0.0/development/extensions-core/datasketches-kll.md
@@ -0,0 +1,140 @@
+---
+id: datasketches-kll
+title: "DataSketches KLL Sketch module"
+---
+
+
+
+
+This module provides Apache Druid aggregators based on numeric quantiles KllFloatsSketch and KllDoublesSketch from [Apache DataSketches](https://datasketches.apache.org/) library. KLL quantiles sketch is a mergeable streaming algorithm to estimate the distribution of values, and approximately answer queries about the rank of a value, probability mass function of the distribution (PMF) or histogram, cumulative distribution function (CDF), and quantiles (median, min, max, 95th percentile and such). See [Quantiles Sketch Overview](https://datasketches.apache.org/docs/Quantiles/QuantilesSketchOverview.html). This document applies to both KllFloatsSketch and KllDoublesSketch. Only one of them will be used in the examples.
+
+There are three major modes of operation:
+
+1. Ingesting sketches built outside of Druid (say, with Pig or Hive)
+2. Building sketches from raw data during ingestion
+3. Building sketches from raw data at query time
+
+To use this aggregator, make sure you [include](../../configuration/extensions.md#loading-extensions) the extension in your config file:
+
+```
+druid.extensions.loadList=["druid-datasketches"]
+```
+
+For additional sketch types supported in Druid, see [DataSketches extension](datasketches-extension.md).
+
+## Aggregator
+
+The result of the aggregation is a KllFloatsSketch or KllDoublesSketch that is the union of all sketches either built from raw data or read from the segments.
+
+```json
+{
+ "type" : "KllDoublesSketch",
+ "name" : ,
+ "fieldName" : ,
+ "k":
+ }
+```
+
+|Property|Description|Required?|
+|--------|-----------|---------|
+|`type`|Either "KllFloatsSketch" or "KllDoublesSketch"|yes|
+|`name`|A String for the output (result) name of the calculation.|yes|
+|`fieldName`|String for the name of the input field, which may contain sketches or raw numeric values.|yes|
+|`k`|Parameter that determines the accuracy and size of the sketch. Higher k means higher accuracy but more space to store sketches. Must be from 8 to 65535. See [KLL Sketch Accuracy and Size](https://datasketches.apache.org/docs/KLL/KLLAccuracyAndSize.html).|no, defaults to 200|
+|`maxStreamLength`|This parameter defines the number of items that can be presented to each sketch before it may need to move from off-heap to on-heap memory. This is relevant to query types that use off-heap memory, including [TopN](../../querying/topnquery.md) and [GroupBy](../../querying/groupbyquery.md). Ideally, should be set high enough such that most sketches can stay off-heap.|no, defaults to 1000000000|
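+
+For example, an ingestion-time aggregator that builds a KllDoublesSketch from a hypothetical `response_time` column might look like this:
+
+```json
+{
+  "type": "KllDoublesSketch",
+  "name": "response_time_sketch",
+  "fieldName": "response_time",
+  "k": 200
+}
+```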
+
+## Post aggregators
+
+### Quantile
+
+This returns an approximation to the value that would be preceded by a given fraction of a hypothetical sorted version of the input stream.
+
+```json
+{
+ "type" : "KllDoublesSketchToQuantile",
+ "name": ,
+ "field" : ,
+ "fraction" :