Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multi-tenant-capable, full-text search engine with an HTTP web interface and schemaless JSON documents. Elasticsearch is developed in Java and distributed as open source software under the Apache license. Official clients are available in Java, .NET (C#), PHP, Python, Apache Groovy, Ruby, and many other languages.
After nearly three years, Elasticsearch 8 has been released with the following new features.
7.x REST API Compatibility
8.0 introduces some significant changes to the Elasticsearch REST APIs. While it’s important to update your application to accommodate these changes, finding and updating every API call after an upgrade can be painful and error-prone. To make this process easier, Elasticsearch has added support for 7.x compatibility headers in the REST API. These optional headers let you make 7.x-compatible requests to 8.0 clusters and receive 7.x-compatible responses.
While it is still officially recommended that you update your application to use native 8.0 requests and responses, the 7.x compatibility headers let you make these changes safely over a longer period of time.
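As a minimal sketch of what such a request could look like, the Python snippet below builds (but does not send) a request that carries the compatibility media type in its Accept header and, when a body is present, in its Content-Type header. The helper name build_compat_request, the host, and the index name are illustrative assumptions, not part of the release.

```python
import json
import urllib.request

# Media type that asks an 8.x cluster to accept and return 7.x-style payloads.
COMPAT_7 = "application/vnd.elasticsearch+json;compatible-with=7"


def build_compat_request(base_url, path, body=None, method="GET"):
    """Build (without sending) a request that carries the 7.x compatibility headers.

    Hypothetical helper for illustration; base_url and path are placeholders.
    """
    headers = {"Accept": COMPAT_7}
    data = None
    if body is not None:
        # Requests with a body also declare the compatible media type as Content-Type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = COMPAT_7
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Example: a search request against a placeholder index on a local cluster.
request = build_compat_request(
    "https://localhost:9200",
    "/my-index/_search",
    body={"query": {"match_all": {}}},
    method="POST",
)
print(request.full_url, request.header_items())
```

Keeping the header in one helper makes it easy to remove later, once every call has been migrated to native 8.0 requests.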
Security features are enabled and configured by default
Running Elasticsearch without security exposes your cluster to any user who can send requests to Elasticsearch. In previous versions, you had to explicitly enable Elasticsearch’s security features such as authentication, authorization, and network encryption (TLS). Starting with Elasticsearch 8.0, security features are enabled and configured by default when starting Elasticsearch for the first time.
At startup, Elasticsearch 8.0 generates enrollment tokens that you can use to connect Kibana instances or enroll additional nodes in a secure Elasticsearch cluster without having to generate security certificates or update YAML configuration files. Simply use the generated enrollment token when starting a new node or Kibana instance, and the Elastic Stack will handle all the security configuration for you.
If you install Elasticsearch from an archive on an aarch64 platform such as Linux ARM or macOS on M1, the elastic user password and the Kibana enrollment token are not automatically generated when the node is first started. The elastic password needs to be generated with the bin/elasticsearch-reset-password utility after the node starts.
bin/elasticsearch-reset-password -u elastic
Then, use the bin/elasticsearch-create-enrollment-token tool to create an enrollment token for Kibana.
bin/elasticsearch-create-enrollment-token -s kibana
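Once the elastic password exists, a client has to authenticate over TLS on every request. The sketch below shows one way this might look with only the Python standard library. The function names, the password value, and the CA certificate path are assumptions for illustration; adjust them to your installation.

```python
import base64
import ssl
import urllib.request


def basic_auth_header(username, password):
    """Return an HTTP Basic Authorization header for the given credentials."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def fetch_cluster_info(base_url, password, ca_cert_path):
    """Call the cluster root endpoint as the elastic user over verified TLS.

    ca_cert_path is assumed to point at the CA certificate that the node
    generated for its HTTP layer; the exact location depends on the install.
    """
    context = ssl.create_default_context(cafile=ca_cert_path)
    request = urllib.request.Request(
        base_url, headers=basic_auth_header("elastic", password)
    )
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return response.read().decode("utf-8")


# Usage against a running node (placeholder values, not executed here):
# print(fetch_cluster_info("https://localhost:9200", "<password>", "config/certs/http_ca.crt"))
```

Verifying the certificate against the cluster's CA, rather than disabling verification, keeps the default TLS setup meaningful.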
Better protection for system indexes
System indexes store configuration and internal data for Elastic features. In general, system indexes are reserved for internal use by these features only. Direct access to or changes to a system index, while possible, can cause instability and other problems.
Several changes have been made in Elasticsearch 8.0 to protect system indexes from direct access. To access a system index, a user's role must now have the allow_restricted_indices permission set to true. The superuser role also no longer gives write access to system indexes. As a result, the built-in elastic superuser cannot change system indexes by default.
Going forward, developers should use Kibana or the associated Elasticsearch APIs to manage the data for a feature, rather than accessing the system index directly. If you do access a system index directly, Elasticsearch returns a warning in the header of the API response and in the deprecation log.
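For the rare case where direct access is unavoidable, the permission lives in the index privileges of a role definition. The following sketch builds such a role body in Python; the helper name and the index pattern are hypothetical, and the body is meant for the role management API (for example a PUT to /_security/role/ followed by a role name of your choice).

```python
import json


def restricted_index_role(index_patterns, privileges):
    """Return a role body whose index privileges also cover restricted (system) indexes.

    Hypothetical helper; index_patterns and privileges are supplied by the caller.
    """
    return {
        "indices": [
            {
                "names": list(index_patterns),
                "privileges": list(privileges),
                # Without this flag, the privileges above do not apply to system indexes.
                "allow_restricted_indices": True,
            }
        ]
    }


# Read-only access to a placeholder system index pattern.
body = restricted_index_role([".example-system-index*"], ["read"])
print(json.dumps(body, indent=2))
```

Granting only read here is deliberate: it keeps the role as narrow as possible while still making the restricted index visible.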
New KNN Search API
A technical preview of the KNN search API is available in Elasticsearch 8.0. Using dense_vector fields, a k-nearest neighbor (KNN) search finds the k vectors nearest to a query vector, as measured by a similarity metric. KNN is commonly used to support relevance ranking in recommendation engines and in algorithms based on natural language processing (NLP).
Previously, Elasticsearch only supported exact KNN searches, using script_score queries with vector functions. While this approach guaranteed accurate results, it often resulted in slow searches and did not scale well to large datasets. In exchange for slower indexing and imperfect accuracy, the new KNN search API lets you run approximate KNN searches on larger datasets at much higher speed.
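To make the contrast concrete, the sketch below builds three request bodies in Python: an index mapping with an indexed dense_vector field, an approximate search body for the _knn_search endpoint, and an exact search body that uses script_score with the cosineSimilarity vector function. The helper names, the field name, and the vector values are illustrative assumptions, and because the API is a technical preview its request shape may change.

```python
def dense_vector_mapping(field, dims, similarity="cosine"):
    """Mapping body with an indexed dense_vector field (needed for approximate KNN)."""
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": similarity,
                }
            }
        }
    }


def knn_search_body(field, query_vector, k, num_candidates):
    """Body for an approximate search, e.g. GET <index>/_knn_search."""
    if k > num_candidates:
        raise ValueError("k must not exceed num_candidates")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        }
    }


def exact_knn_body(field, query_vector, k):
    """Body for an exact (brute-force) search via script_score, e.g. GET <index>/_search."""
    return {
        "size": k,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # Adding 1.0 keeps the score non-negative.
                    "source": f"cosineSimilarity(params.query_vector, '{field}') + 1.0",
                    "params": {"query_vector": list(query_vector)},
                },
            }
        },
    }


approximate = knn_search_body("embedding", [0.1, 0.2, 0.3], k=5, num_candidates=50)
exact = exact_knn_body("embedding", [0.1, 0.2, 0.3], k=5)
```

A larger num_candidates generally trades speed for accuracy, which is the same trade-off the paragraph above describes between the approximate and exact approaches.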
Save storage space for match_only_text and text fields
This release updates the inverted index, an internal data structure, to use a more space-efficient encoding. This change benefits match_only_text fields as well as text fields. In benchmark tests using application logs, the change reduced the index size of the message field (mapped as match_only_text) by 14.4% and reduced the overall disk footprint by 3.5%.
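For reference, a log index along the lines of the benchmark might be mapped as in the sketch below. The helper name and the @timestamp field are assumptions for illustration; only the message field mapped as match_only_text comes from the description above.

```python
import json


def log_index_mapping():
    """Index creation body that maps the message field as match_only_text."""
    return {
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                # match_only_text trades scoring and positional queries for a smaller index.
                "message": {"type": "match_only_text"},
            }
        }
    }


# Body for an index creation request such as PUT /<index-name>.
print(json.dumps(log_index_mapping(), indent=2))
```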
Speed up indexing of geo_shape and range fields
The new version optimizes the indexing speed of multi-dimensional points, the internal data structure used for geo_shape and range fields. Lucene-level benchmarking shows a 10-15% improvement in indexing speed for these field types. Elasticsearch indexes and data streams consisting primarily of these fields may show significant improvements in indexing speed.
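As a small illustration of the field types involved, the sketch below defines a mapping with one geo_shape field and two range fields, plus a matching sample document. All field names and values are made up for the example.

```python
def geo_range_mapping():
    """Mapping body with a geo_shape field and two range fields (hypothetical names)."""
    return {
        "mappings": {
            "properties": {
                "area": {"type": "geo_shape"},
                "valid": {"type": "date_range"},
                "ports": {"type": "integer_range"},
            }
        }
    }


def sample_document():
    """A document matching the mapping above."""
    return {
        # GeoJSON polygon: [longitude, latitude] pairs, first and last point identical.
        "area": {
            "type": "Polygon",
            "coordinates": [
                [[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0]]
            ],
        },
        "valid": {"gte": "2022-01-01", "lt": "2023-01-01"},
        "ports": {"gte": 1024, "lte": 2048},
    }
```

Documents like this are the ones that exercise the multi-dimensional point structures, so indexes dominated by them are where the speedup would be most visible.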
PyTorch Models Support Natural Language Processing (NLP)
It is now possible to upload PyTorch models trained outside of Elasticsearch and use them for inference. Third-party model support brings modern natural language processing (NLP) and search use cases to the Elastic Stack.
- Remove the adjacency matrix setting #46327 (issues: #46257, #46324)
- Remove the MovingAverage pipeline aggregation #39328
- Remove deprecated
- Remove deprecated date histogram intervals #75000
- Remove the include_relocations setting #47717 (issues: #46079, #47443)
- Clean up versioned deprecations in Analysis #41560 (issue: #41164)
- Remove pre-configured delimited_payload_filter #43686 (issues: #41560, #43684)
- Always add file and native realms unless explicitly disabled #69096 (issue: #50892)
- Do not set the NameID format in Policy by default #44090 (issue: #40353)
- Make the order setting mandatory for realm configuration #51195 (issue: #37614)
- Remove connection timeout #60873 (issue: #60872)
- Remove support for delaying state recovery pending master #53845 (issue: #51806)
- Remove synced flush #50882 (issues: #50776, #50835)
- Remove the cluster.remote.connect setting #54175 (issue: #53924)
- Force merge should reject requests with only_expunge_deletes and max_num_segments set #44761 (issue: #43102)
- Remove per-type indexing statistics #47203 (issue: #41059)
- Remove translog retention settings #51697 (issue: #50775)
- Remove deprecated local parameter for _cat/indices #64868 (issue: #62198)
- Remove deprecated local parameter for _cat/shards #64867 (issue: #62197)
true#79275 (issues: #76147, #79210)
- Change the prefer_v2_templates parameter default value to true #55489 (issues: #53101, #55411)
- Remove the deprecated _upgrade API #64732 (issue: #21337)
- Remove the include_type_name parameter from the REST layer
- Remove the template field from index templates #49460 (issue: #21009)
- Remove the nodes/0 folder prefix from the data path
- Remove node.max_local_storage_nodes #42428 (issue: #42426)
- Remove Joda dependency #79007
- Remove camel case named date/time formats #60044
- Remove SysV initialization support #51716
- Removed support for
- Require Java 17 to run Elasticsearch #79873
See more details at: https://www.elastic.co/cn/blog/whats-new-elastic-8-0-0