
Chatbot API

POST /query

Query chatbot

Send a message to the chatbot to get a response.

Request Body

question (string, required)

Question to ask the chatbot.

Example: "Can you summarize this doc?"

chatbotId (string, required)

Identifier of the chatbot to query.

Example: "3be6453c-03eb-4357-ae5a-984a0e574a54"
Format: uuid

key (string, required)

API key for the chatbot.

Example: "811f9ad2-307c-46b0-9842-af000ea2e0a0"
Format: string

Status code    Description
200            Success
401            Unauthorized
500            Internal server error

curl -X POST "https://denser.ai/api/query" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Can you summarize this doc?",
    "chatbotId": "3be6453c-03eb-4357-ae5a-984a0e574a54",
    "key": "811f9ad2-307c-46b0-9842-af000ea2e0a0"
  }'
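
For teams integrating from application code rather than the shell, the sketch below shows an equivalent call in TypeScript using the standard fetch API (Node 18+ or a browser). The ChatbotQueryRequest interface and queryChatbot function are illustrative names, not part of the Denser API.

// Minimal sketch of calling POST /query with fetch; assumes Node 18+ or a browser.
// The type and function names here are illustrative, not part of the Denser API.

interface ChatbotQueryRequest {
  question: string;  // Question to ask the chatbot
  chatbotId: string; // Chatbot identifier (UUID)
  key: string;       // API key for the chatbot
}

async function queryChatbot(request: ChatbotQueryRequest): Promise<string> {
  const res = await fetch("https://denser.ai/api/query", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(request),
  });

  // 401 and 500 surface as errors; see the status code table above.
  if (!res.ok) {
    throw new Error(`Query failed with HTTP ${res.status}`);
  }

  const data = await res.json();
  return data.answer; // The chatbot's answer text
}

// Usage with the same values as the curl example above.
queryChatbot({
  question: "Can you summarize this doc?",
  chatbotId: "3be6453c-03eb-4357-ae5a-984a0e574a54",
  key: "811f9ad2-307c-46b0-9842-af000ea2e0a0",
}).then(console.log).catch(console.error);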

A successful response returns the chatbot's answer along with the passages that were retrieved to support it.

{
	"statusCode": "200",
	"answer": "The document provides detailed information about using doc values in Elasticsearch to improve performance by storing data on disk alongside regular index data. Doc values are prepared at indexing time, cached in memory by the kernel, and offer advantages such as smoother performance degradation, better memory management, and faster loading. However, they also come with disadvantages like increased index size and slightly slower indexing and requests that use field data. The document explains how to configure doc values in the mapping for a specific field and discusses the benefits of using doc values over in-memory field data. Additionally, it covers topics such as distributed scoring, circuit breaker limits, and different execution modes for filters. Overall, the document emphasizes the importance of efficient memory management and performance optimization techniques in Elasticsearch.",
	"passages": [
		{
			"source": "/home/ubuntu/efs/denser_output/exp_jotyy4_811f9ad2-307c-46b0-9842-af000ea2e0a0_doc_passages/raw_files/Elasticsearch in Action.pdf",
			"text": "bad performance caused by field data evicti ons? This is where doc values come in.\n Doc values  take the data that needs to be lo aded into memory and instead prepare\nit when the document is indexed, storing it  on disk alongside the regular index data.\nThis means that when field data would norm ally be used and read out of memory, the\ndata can be read from disk instead. This provides a number of advantages:\n■Performance degrades smoothly —Unlike default field data , which needs to live in\nthe JVM heap all at once, doc values are read  from the disk, like the rest of the\nindex. If the OS can’t fit everything in its RAM  caches, more disk seeks will be\nneeded, but there are no expensive loads and evictions, no risk of OutOfMemory-\nError s, and no circuit-breaking exceptio ns because the circuit breaker pre-\nvented the field data cache from using too much memory.\nLicensed to Thomas Snead <[email protected]>\n177 Field data detour\n■Better memory management —When used, doc values are cached in memory by the\nkernel, avoiding the cost of garbage collection associated with heap usage.\n■Faster loading —With doc values, the uninverted structure is calculated at index\ntime, so even when you run the first query, Elasticsearch doesn’t have to unin-\nvert on the fly. This makes the initial requests faster, because the uninverting\nprocess has already been performed.\nAs with everything in this chapter, there’ s no such thing as free lunch. Doc values\ncome with disadvantages, too:\n■Bigger index size —Storing all doc values on disk inflates the index size.\n■Slightly slower indexing — T h e  n e e d  t o  c a l c u l a t e  d o c  v a l u e s  a t  i n d e x  t i m e  s l o w s\ndown the process of indexing.\n■Slightly slows requests that use field data —Disk is also slower than memory, so some\nrequests that would usually use an alre ady-loaded field data cache in memory\nwill be slightly slower when reading do c values from disk. This includes sorting,\nfacets, and aggregations.",
			"title": "Elasticsearch in Action",
			"id": 224,
			"score": 4.330338,
			"score_rerank": -3.566751480102539
		},
		{
			"source": "/home/ubuntu/efs/denser_output/exp_jotyy4_811f9ad2-307c-46b0-9842-af000ea2e0a0_doc_passages/raw_files/Elasticsearch in Action.pdf",
			"text": "requests that would usually use an alre ady-loaded field data cache in memory\nwill be slightly slower when reading do c values from disk. This includes sorting,\nfacets, and aggregations.\n■Works only on non-analyzed fields —As of version 1.4, doc values don’t support\nanalyzed fields. If you want  to build a word cloud of the words in event titles,\nfor example, you can’t take advantage of doc values. Doc values can be used for\nnumeric, date, Boolean, binary, and ge o-point fields, though, and work well\nfor large datasets on non-analyzed data, such as the timestamp  field of log\nmessages that are indexed into Elasticsearch.\nThe good news is that you can mix and match fields that use doc values with those that\nuse the in-memory field data cache, so althou gh you may want to use doc values for the\ntimestamp  field in your events, you can still keep the event’s title  field in memory.\n How are doc values used? Because they’re written out at indexing time, configur-\ning doc values has to happen in the mapping for a particular field. If you have a string\nfield that’s not analyzed and you’d like to us e field values on it, you can configure the\nmapping when creating an index, as shown in the next listing.\ncurl -XPOST 'localhost:9200/myindex' -d'\n{\n  \"mappings\": {    \"document\": {\n      \"properties\": {\n        \"title\": {          \"type\": \"string\",\n          \"index\": \"not_analyzed\",\n          \"doc_values\": true                     }\n      }\n    }  }\n}'Listing 6.21 Using doc-values  in the mapping for the title  field\nConfiguring the title \nfield to use doc_values \nfor its field data\nLicensed to Thomas Snead <[email protected]>\n178 CHAPTER  6Searching with relevancy\nOnce the mapping has been configured, in dexing and searching will work as normal\nwithout any additional changes.\n6.10 Summary\nYou now have a better understanding of how scoring works inside Elasticsearch as well\nas how documents interact with the field data cache, so let’s review what this chapter\nwas about:",
			"title": "Elasticsearch in Action",
			"id": 225,
			"score": 4.543381,
			"score_rerank": -3.6511802673339844
		},
		{
			"source": "/home/ubuntu/efs/denser_output/exp_jotyy4_811f9ad2-307c-46b0-9842-af000ea2e0a0_doc_passages/raw_files/Elasticsearch in Action.pdf",
			"text": "ning a different query will make the aggreg ation run through a different set of docu-\nments. Either way, you get 10 such results because size  defaults to 10. As you saw in\nchapters 2 and 4, you can change size  from either the URI or the JSON  payload of\nyour query.\nField data and aggregations\nWhen you run a regular search, it goes fast because of the nature of the inverted\nindex: you have a limited number of terms to  look for, and Elasticsearch will identify\ndocuments containing those terms and retu rn the results. An aggregation, on the\nother hand, has to work with the terms of  each document matching the query. It\nneeds a mapping between document IDs and terms—opposite of the inverted index,\nwhich maps terms to documents.\nBy default, Elasticsearch un-inverts the inverted index into field data , as we explained\nin chapter 6, section 6.10. The more terms it has to deal with, the more memory the\nfield data will use. That’s why you have to make sure you give Elasticsearch a large\nenough heap, especially when you’re doing a ggregations on large numbers of docu-\nments or if you’re analyzing fields and y ou have more than one term per document.\nFor not_analyzed  fields, you can use doc values to  have this un-inverted data struc-\nture built at index time and stored on di sk. More details about field data and doc val-\nues can be found in chapter 6, section 6.10.Aggregation results \nbegin here.\nAggregation name, \nas specified\nEach unique term is \nan item in the bucket.\nFor each term, you \nsee how many times it appeared.\nLicensed to Thomas Snead <[email protected]>\n184 CHAPTER  7Exploring your data with aggregations\n7.1.2 Aggregations run on query results\nComputing metrics over the whole dataset is  just one of the possible use cases for\naggregations. Often you want to compute metrics in the context of a query. For",
			"title": "Elasticsearch in Action",
			"id": 232,
			"score": 3.3440769,
			"score_rerank": -4.416218280792236
		},
		{
			"source": "/home/ubuntu/efs/denser_output/exp_jotyy4_811f9ad2-307c-46b0-9842-af000ea2e0a0_doc_passages/raw_files/Elasticsearch in Action.pdf",
			"text": "example, you can return groups where the organizer isn’t a member, so you can ask\nthem why they don’t partic ipate to their own groups:\n% curl 'localhost:9200/get-together/group/_search?pretty' -d '{\n  \"query\": {\n    \"filtered\": {      \"filter\": {\n        \"script\": {\n          \"script\": \"return \ndoc.organizer.values.intersect(doc.members.values).isEmpty()\",\n        }\n      }\n    }\n  }}'\nThere’s one caveat for using doc['organizer']  instead of _source['organizer']  or\nthe _fields  equivalent: you’ll access the terms, not the original field of the docu-\nment. If an organizer is 'Lee' , and the field is analyzed with the default analyzer,\nyou’ll get 'Lee'  from _source  and 'lee'  from doc. There are tradeoffs everywhere,\nbut we assume you’ve gotten used to  them at this point in the chapter.\n Next, we’ll take a deeper look at how distributed searches work and how you can\nuse search types to find a good balance between having accurate scores and low-\nlatency searches.\n10.4.3 Trading network trips for less data and better distributed scoring\nBack in chapter 2, you saw how when you hit an Elasticsearch node with a search\nrequest, that node distributes the request to  all the shards that are involved and aggre-\ngates the individual shard replies into one final reply to return to the application.\nLicensed to Thomas Snead <[email protected]>\n334 CHAPTER  10 Improving performance\n Let’s take a deeper look at how this works. The naïve approach would be to get N\ndocuments from all shards involved (where N is the value of size ), sort them on the\nnode that received the HTTP  request (let’s call it the coor dinating node), pick the top\nN documents, and return them to the applic ation. Let’s say that you send a request\nwith the default size of 10 to an index with  the default number of 5 shards. This means\nthat the coordinating node will fetch 10 whole documents from each shard, sort\nthem, and return only the top 10 from thos e 50 documents. But what if there were 10",
			"title": "Elasticsearch in Action",
			"id": 410,
			"score": 3.4381883,
			"score_rerank": -4.530974388122559
		},
		{
			"source": "/home/ubuntu/efs/denser_output/exp_jotyy4_811f9ad2-307c-46b0-9842-af000ea2e0a0_doc_passages/raw_files/Elasticsearch in Action.pdf",
			"text": "NOTE An alternative to the in-memory fiel d data is to use doc values, which\nare calculated at index time and stored on  disk with the rest of your index. As\nwe pointed out in chapter 6, doc valu es work for numeric and not-analyzed\nstring fields. In Elasticsearch 2.0, doc va lues will be used by default for those\nfields because holding field data in the JVM heap is usually not worth the per-\nformance increase.\nA terms  filter can have lots of terms, and a range  filter with a wide range will (under\nthe hood) match lots of numbers (and numb ers are also terms). Normal execution of\nthose filters will try to match every term separately and return the set of unique docu-ments, as illustrated in figure 10.6.\nAs you can imagine, filtering on many terms could get expensive because there would\nbe many lists to intersect. When the number of terms is large, it can be faster to take\nthe actual field values one by one and see if  the terms match instead of looking in the\nindex, as illustrated in figure 10.7.\nThese field values would be loaded in the field data cache by setting \nexecution  to\nfielddata  in the terms  or range  filters. For exam ple, the following range  filter will\nget events that happened in 2013 and will be executed on field data:\n      \"filter\": {\n        \"range\": {          \"date\": {apples\norangespearsbananas1,432,32,4\n[1,4] + [2,4] = [1,2,4]Filter: [apples, bananas]\nFigure 10.6 By default, the \nterms  filter is checking which \ndocuments match each term, \nand it intersects the lists.\napples\npears, bananaspears, orangesapples, bananas1234\n[1,2,4]Filter: [apples, bananas]\nFigure 10.7 Field data execution means iterating through documents \nbut no list intersections.\nLicensed to Thomas Snead <[email protected]>\n318 CHAPTER  10 Improving performance\n            \"gte\": \"2013-01-01T00:00\",\n            \"lt\": \"2014-01-01T00:00\"\n          },          \"execution\": \"fielddata\"\n        }\n      }",
			"title": "Elasticsearch in Action",
			"id": 389,
			"score": 3.587071,
			"score_rerank": -4.811100959777832
		},
		{
			"source": "/home/ubuntu/efs/denser_output/exp_jotyy4_811f9ad2-307c-46b0-9842-af000ea2e0a0_doc_passages/raw_files/Elasticsearch in Action.pdf",
			"text": "■The second line contains metadata. You’ d put the document in there under the\ndoc field. When you’re percolating existing documents, the metadata JSON\nwould be empty.\n■Finally, the body of the request is sent to the _mpercolate  endpoint. As with the\nbulk API, this endpoint can contain the in dex and the type name, which can\nlater be omitted from the body.\nGETTING  ONLY THE NUMBER  OF MATCHING  QUERIES\nBesides the percolate  action, the multi percolate API supports a count  action, which\nwill return the same reply as before with  the total number of matching queries for\neach document, but you won’t get the matches  array:\necho '{\" count\" : {\"index\" : \"blog\", \"type\" : \"posts\"}}\n{\"doc\": {\"title\": \"New Elasticsearch Release \"}}\n{\"count\" : {\"index\" : \"blog\", \"type\" : \"posts\"}}\n{\"doc\": {\"title\": \"New Elasticsearch Book \"}}\n' > percolate_requests\ncurl 'localhost:9200/_mpercolate?pretty' --data-binary @percolate_requestsYou can use the\nbulk API to\nregister queries,\njust as you’ve\nused the index\nAPI so far.\nMulti percolate\nwill return\nmatches\nfor each\npercolated\ndocument.\nKnowing which tag corresponds to which\npost, you can index posts with tags, too.\nLicensed to Thomas Snead <[email protected]>\n428 APPENDIX  ETurning search upside down with the percolator\nUsing count  doesn’t make sense for the tagging use case, because you need to know\nwhich queries match, but this might not be the case everywhere. Let’s say you have an\nonline shop and you want to add a new item . If you collect user queries and register\nthem for percolation, you can percolate ne w items against those queries to predict\nhow many users will find them while searching.\n In the get-together site example, you co uld get an idea of how many attendees to\nexpect for an event before submitting it—a ssuming you can get each user’s availability\nand register time ranges as queries.\n You can, of course, get counts for individual percolations, not just multi percola-\ntions. Add /count  to the _percolate  endpoint:",
			"title": "Elasticsearch in Action",
			"id": 516,
			"score": 3.8653405,
			"score_rerank": -5.211065769195557
		},
		{
			"source": "/home/ubuntu/efs/denser_output/exp_jotyy4_811f9ad2-307c-46b0-9842-af000ea2e0a0_doc_passages/raw_files/Elasticsearch in Action.pdf",
			"text": "Another benefit of this approach is that  the circuit breaker limit can be dynami-\ncally adjusted while the node is running, wh ereas the size of the cache must be set in\nthe configuration file and requires restarti ng the node to change. The circuit breaker\nis configured by default to limit the field data size to 60% of the JVM’s heap size. You\ncan configure this by sending a request like this:\ncurl -XPUT 'localhost:9200/_cluster/settings'\n{\n  \"transient\": {    \"indices.breaker.fielddata.limit\": \"350mb\"\n  }\n}\nAgain, this setting supports either an absolute value like 350mb  or a percentage such as\n45%. Once you’ve set this, you can see the limit and how much memory is currently\ntracked by the breaker with the Nodes Stats API, which we’ll talk about in chapter 11.\nNOTE As of version 1.4, there is also a request circuit breaker, which helps\nyou make sure that other in-memory da ta structures generated by a request\ndon’t cause an OutOfMemoryError  by limiting them to a default of 40%.\nThere’s also a parent circuit breaker, wh ich makes sure that the field data and\nthe request breakers together don’t exceed 70% of the heap. Both limitscan be updated via the Cluster Update Settings \nAPI through indices.breaker\n.request.limit  and indices.breaker.total.limit , respectively.\nBYPASSING  MEMORY  AND USING  THE DISK WITH DOC VALUES\nS o  f a r  y o u ’ v e  s e e n  t h a t  y o u  s h o u l d  u s e  c i r c u i t  b r e a k e r s  t o  m a k e  s u r e  o u t s t a n d i n g\nrequests don’t crash your nodes, and if you fall consistently short of field data space,\nyou should either increase your JVM heap size to use more RAM  or limit the field\ndata size and live with bad performance. But what if you’re consistently short on\nfield data space, don’t have enough RAM  to increase the JVM heap, and can’t live with\nbad performance caused by field data evicti ons? This is where doc values come in.\n Doc values  take the data that needs to be lo aded into memory and instead prepare",
			"title": "Elasticsearch in Action",
			"id": 223,
			"score": 3.7281864,
			"score_rerank": -5.412522315979004
		},
		{
			"source": "/home/ubuntu/efs/denser_output/exp_jotyy4_811f9ad2-307c-46b0-9842-af000ea2e0a0_doc_passages/raw_files/Elasticsearch in Action.pdf",
			"text": "The terms  filter has other execution modes, too. If the default execution mode (called\nplain ) builds a bitset to cache the overall result, you can set it to bool in order to\nhave a bitset for each te rm instead. This is useful when you have different terms  fil-\nters, which have lots of terms in common.\nAlso, there are and/or execution modes that perform a similar process, except the\nindividual term filters are wrapped in an and/or filter instead of a bool filter.\nUsually, the and/or approach is slower than bool because it doesn’t take advantage\nof bitsets. and/or might be faster if the first term filters match only a few docu-\nments, which makes subsequent filters extremely fast.\nLicensed to Thomas Snead <[email protected]>\n319 Making the best use of caches\nNOTE By setting search_type  to count  in the URI parameters, you tell Elas-\nticsearch that you’re not interested in the query results, only in their number.\nWe’ll look at count  and other search types later in this section. In Elastic-\nsearch 2.0, setting size  to 0 will also work and search_type=count  will be\ndeprecated.7\n7https:/ /github.com/elastic/elasticsearch/pull/9296Search request\nFilter: tag=elasticsearch\nAggregate result with other shards\nReturn reply\nIn shard\nquery cache?Yes\nYes\nIn filter\ncache?No: run filter\non segments\nNo: run filter\non docs\ndoc\nShardSegment\nNode\nFigure 10.8 The shard query cache is more high-level than the filter cache.\nLicensed to Thomas Snead <[email protected]>\n320 CHAPTER  10 Improving performance\nThe shard query cache entries differ from one request to another, so they apply only\nto a narrow set of requests. If you’re se arching for a different term or running a\nslightly different aggregation, it will be a cache miss. Also, when a refresh occurs and\nthe shard’s contents change, all shard query cache entries are invalidated. Otherwise,new matching documents could have been added to the index, and you’d get out-\ndated results from the cache.",
			"title": "Elasticsearch in Action",
			"id": 391,
			"score": 2.5281825,
			"score_rerank": -5.5273847579956055
		},
		{
			"source": "/home/ubuntu/efs/denser_output/exp_jotyy4_811f9ad2-307c-46b0-9842-af000ea2e0a0_doc_passages/raw_files/Elasticsearch in Action.pdf",
			"text": "Licensed to Thomas Snead <[email protected]>\n335 Other performance tradeoffs\nas documents get bigger because it will tr ansfer much less data over the network.\nquery_and_fetch  is only faster when you hit one sh ard—that’s why it’s used implicitly\nwhen you search a single shard, when you use routing, or when you only get the\ncounts (we’ll discuss this later). Right now you can specify query_and_fetch  explicitly,\nbut in version 2.0 it will only be used internally for these specific use cases.11\nDISTRIBUTED  SCORING\nBy default, scores are calculated per shard,  which can lead to inaccuracies. For exam-\nple, if you search for a term, one of the factors is the document frequency (DF), which\nshows how many times the term you search for appears in all documents. Those “all\ndocuments” are by default “all do cuments in this shard.” If the DF of a term is signifi-\ncantly different between shards, scoring migh t not reflect reality. You can see this in\nfigure 10.15, where doc 2 gets a higher score than doc 1, even though doc 1 has more\noccurrences of “elasticsearch,” because th ere are fewer documents with that term in\nits shard.\n You can imagine that with a high enough number of documents, DF values would\nnaturally balance across shards, and the defa ult behavior would work just fine. But if\nscore accuracy is a priority or if DF is unbalanced for your use case (for example, if you’re\nusing custom routing), you’ll need a different approach.\n That approach could be to change the search type from query_then_fetch  to\ndfs_query_then_fetch . The dfs part will tell the coordinati ng node to make an extra\ncall to the shards in order to gather docu ment frequencies of the searched terms. The\naggregated frequencies will be used to calcul ate the score, as you can see in figure 10.16,\nranking your doc 1 and doc 2 correctly.\n11https:/ /github.com/elastic/elasticsearch/issues/9606query: elasticsearch\nshard 1: DF for \"elasticsearch\" = 10doc1:",
			"title": "Elasticsearch in Action",
			"id": 412,
			"score": 4.3053937,
			"score_rerank": -5.541457176208496
		},
		{
			"source": "/home/ubuntu/efs/denser_output/exp_jotyy4_811f9ad2-307c-46b0-9842-af000ea2e0a0_doc_passages/raw_files/Elasticsearch in Action.pdf",
			"text": "Terms for Name:\n“late”, “night”, “with”, “elasticsearch”Indexing process Search process\nAnalyzerdoc1\nName: Late Night with ElasticsearchSearch for term\n“late”doc2\nName: latenightMatches:\ndoc1\nFigure 3.2 After the default analyzer breaks string s into terms, subsequent searches match those terms.\nLicensed to Thomas Snead <[email protected]>\n61 Core types for defining your own fields in documents\n Setting index  to not_analyzed  does the opposite: the analysis process is skipped,\nand the entire string is indexed as one te rm. Use this option when you want exact\nmatches, such as when you search for tags. You probably want only “big data” to show\nup as a result when you search for “big data,” not “data.” Also, you’ll need this for mostaggregations, which count terms. If you want  to get the most frequent tags, you proba-\nbly want “big data” to be counted as a si ngle term, not “big” and “data” separately.\nWe’ll explore aggregations in chapter 7.\n If you set \nindex  to no, indexing is skipped and no terms are produced, so you\nwon’t be able to search on that particular field. When you don’t need to search on a\nfield, this option saves space and decreases the time it takes to index and search. For\nexample, you might store reviews for even ts. Although storing and showing those\nreviews is valuable, searching through them might not be. In this case, disable index-\ning for that field, making the indexi ng process faster and saving space.\nNext, let’s look at how you can index number s. Elasticsearch provides many core types\nthat can help you deal with numbers, so we ’ll refer to them collectively as numeric.\n3.2.2 Numeric\nNumeric types  can be numbers with or without a floating point. If you don’t need dec-\nimals, you can choose among byte , short , integer , and long ; if you do need them,\nyour choices are float  and double . These types correspond to Java’s primitive data\ntypes, and choosing among them influences  the size of your index and the range of",
			"title": "Elasticsearch in Action",
			"id": 86,
			"score": 0.69288677,
			"score_rerank": -5.618374824523926
		}
	]
}
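
Besides answer, the response carries a passages array with the source, text, title, and ranking scores of each retrieved chunk; in the example above, the passages are ordered by descending rerank score. Below is a small sketch of consuming these fields. The Passage and QueryResponse types are illustrative, inferred from the example response rather than an official schema.

// Sketch of reading the response body shown above.
// Types are inferred from the example fields, not from an official schema.

interface Passage {
  source: string;       // Path of the source document on the indexing host
  text: string;         // Retrieved passage text
  title: string;        // Title of the source document
  id: number;           // Passage identifier
  score: number;        // First-stage retrieval score
  score_rerank: number; // Reranker score (less negative ranks higher in the example)
}

interface QueryResponse {
  statusCode: string;
  answer: string;
  passages: Passage[];
}

// List the documents the answer drew from, keeping only the top passages.
function summarizeSources(response: QueryResponse, topN = 3): string[] {
  return response.passages
    .slice(0, topN)
    .map((p) => `${p.title} (passage ${p.id}, rerank ${p.score_rerank.toFixed(2)})`);
}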
