After three years, Elastic 8 is officially released

Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multi-tenant-capable, full-text search engine with an HTTP web interface and schemaless JSON documents. Elasticsearch is developed in Java. Releases up to 7.10 were distributed under the Apache License 2.0; since 7.11, including 8.0, it is dual-licensed under the Server Side Public License (SSPL) and the Elastic License. Official clients are available for Java, .NET (C#), PHP, Python, Apache Groovy, Ruby, and many other languages.


After nearly three years, Elasticsearch 8 has been released with the following new features.

7.x REST API Compatibility

8.0 introduces some significant changes to the Elasticsearch REST APIs. While it’s important to update your application to accommodate these changes, finding and updating every API call after an upgrade can be painful and error-prone. To make this process easier, Elasticsearch has added support for 7.x compatibility headers in the REST API. These optional HTTP headers let you send 7.x-compatible requests to 8.0 clusters and receive 7.x-compatible responses.

Elastic still recommends that you update your application to use native 8.0 requests and responses, but the 7.x compatibility headers let you make these changes safely over a longer period of time.
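As a minimal sketch of what such a request looks like, the snippet below builds (but does not send) an HTTP request that carries the compatibility media type in both the `Accept` and `Content-Type` headers. The cluster URL and index name are placeholders chosen for illustration, and the helper names are not part of any Elastic client.

```python
import json
import urllib.request

# Media type that requests 7.x-compatible behaviour from an 8.x cluster.
COMPAT_MEDIA_TYPE = "application/vnd.elasticsearch+json;compatible-with=7"


def compat_headers():
    """Headers asking the cluster to accept and return 7.x-style bodies."""
    return {"Accept": COMPAT_MEDIA_TYPE, "Content-Type": COMPAT_MEDIA_TYPE}


def build_compat_request(url, body=None):
    """Build (but do not send) a request carrying the compatibility headers."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(url, data=data, headers=compat_headers())


# Hypothetical cluster URL and index name, for illustration only.
request = build_compat_request(
    "https://localhost:9200/my-index/_search",
    {"query": {"match_all": {}}},
)
```

Requests sent without these headers are treated as native 8.0 requests, so an application can be migrated one call site at a time.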

Security features are enabled and configured by default

Running Elasticsearch without security exposes your cluster to any user who can send requests to Elasticsearch. In previous versions, you had to explicitly enable Elasticsearch’s security features such as authentication, authorization, and network encryption (TLS). Starting with Elasticsearch 8.0, security features are enabled and configured by default when starting Elasticsearch for the first time.

At startup, Elasticsearch 8.0 generates enrollment tokens that you can use to connect a Kibana instance or enroll additional nodes in a secured Elasticsearch cluster, without having to generate security certificates or edit YAML configuration files. Simply supply the generated enrollment token when starting a new node or Kibana instance, and the Elastic Stack handles the security configuration for you.
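One practical consequence is that clients must now use HTTPS and authenticate. The following sketch shows one way to do that with only the Python standard library. It assumes the auto-generated CA certificate is at `config/certs/http_ca.crt` (the location used by archive installs; adjust for your setup), and the URL and credentials are placeholders.

```python
import base64
import ssl
import urllib.request


def basic_auth_header(username, password):
    """Return the HTTP Basic Authorization header value for the credentials."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def open_secure(url, username, password, ca_cert_path):
    """Send a GET request over TLS, trusting the cluster's own CA certificate."""
    context = ssl.create_default_context(cafile=ca_cert_path)
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(username, password)}
    )
    return urllib.request.urlopen(request, context=context)


# Example call (not executed here; it needs a running cluster):
# open_secure("https://localhost:9200", "elastic", "<password>",
#             "config/certs/http_ca.crt")
```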

Known issues:

If you install Elasticsearch from an archive on an aarch64 platform such as Linux ARM or macOS M1, the elastic user password and the Kibana enrollment token are not generated automatically when the node first starts. After the node starts, generate the elastic password with the bin/elasticsearch-reset-password utility:

bin/elasticsearch-reset-password -u elastic

Then use the bin/elasticsearch-create-enrollment-token tool to create an enrollment token for Kibana:

bin/elasticsearch-create-enrollment-token -s kibana

Better protection for system indexes

System indices store configuration and internal data for Elastic features. In general, they are reserved for internal use by those features only. Accessing or changing a system index directly is possible, but it can cause instability and other problems.

Elasticsearch 8.0 makes several changes to protect system indices from direct access. To access a system index, a user's role must now grant index privileges with the allow_restricted_indices permission set to true.

The superuser role also no longer grants write access to system indices. As a result, the built-in elastic superuser cannot change system indices by default.

Going forward, developers should use Kibana or the associated Elasticsearch APIs to manage a feature's data, rather than accessing system indices directly. If you do access a system index directly, Elasticsearch returns a warning in the header of the API response and in the deprecation log.
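For the rare cases where direct access is unavoidable, the allow_restricted_indices flag mentioned above sits inside a role's index privileges. The sketch below only builds the JSON body for a role; the `.kibana*` pattern and the helper name are illustrative assumptions, and the body would be sent to the create-or-update roles endpoint (`PUT /_security/role/<role_name>`).

```python
import json


def restricted_read_role(index_patterns):
    """Build a role body granting read access that also covers system indices."""
    return {
        "indices": [
            {
                "names": list(index_patterns),
                "privileges": ["read"],
                # Without this flag, the privileges do not apply to system indices.
                "allow_restricted_indices": True,
            }
        ]
    }


# Hypothetical role for inspecting Kibana's system indices.
role_body = restricted_read_role([".kibana*"])
print(json.dumps(role_body, indent=2))
```

Granting only `read` here is deliberate: it keeps the role as narrow as possible, in line with the advice to manage feature data through Kibana or the dedicated APIs.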

New KNN Search API

A technical preview of the KNN search API is available in Elasticsearch 8.0. Given a dense_vector field, a k-nearest neighbor (KNN) search finds the k vectors nearest to a query vector, as measured by a similarity metric. KNN is commonly used to support relevance ranking in recommendation engines and in natural language processing (NLP) based algorithms.
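To make the definition concrete, here is a small brute-force illustration of exact KNN in plain Python, using Euclidean (L2) distance as the similarity metric. It is a conceptual sketch only and is not how Elasticsearch implements the search; the document IDs and vectors are made up.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(query, vectors, k):
    """Return the IDs of the k vectors closest to the query, nearest first."""
    ranked = sorted(vectors.items(), key=lambda item: l2_distance(query, item[1]))
    return [doc_id for doc_id, _ in ranked[:k]]


# Made-up two-dimensional vectors keyed by document ID.
vectors = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]}
print(exact_knn([0.9, 0.9], vectors, k=2))  # → ['b', 'a']
```

Because every stored vector is compared with the query, the cost grows linearly with the number of documents.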

Previously, Elasticsearch only supported exact KNN searches, using script_score queries with vector functions. While this approach guaranteed accurate results, it often resulted in slow searches and did not scale well on large datasets. In exchange for slower indexing and imperfect accuracy, the new KNN search API lets you run approximate KNN searches on larger datasets at a much faster speed.
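The sketch below shows roughly what the two pieces of an approximate search look like: a mapping that indexes a dense_vector field for KNN, and a request body for the `_knn_search` endpoint. The field name, dimensions, and vector values are illustrative assumptions, and the `num_candidates` check is a sanity check added for this example. Because the API is a technical preview, the exact parameters may change.

```python
import json


def dense_vector_mapping(field, dims, similarity="l2_norm"):
    """Mapping for a dense_vector field indexed for approximate KNN search."""
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": similarity,
                }
            }
        }
    }


def knn_search_body(field, query_vector, k, num_candidates):
    """Request body for GET <index>/_knn_search."""
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            # Candidates examined per shard: higher is more accurate but slower.
            "num_candidates": num_candidates,
        }
    }


# Hypothetical field name and vector values.
mapping = dense_vector_mapping("image_vector", dims=3)
search = knn_search_body("image_vector", [0.3, 0.1, 1.2], k=10, num_candidates=100)
print(json.dumps(search, indent=2))
```

The `num_candidates` value is where the speed-versus-accuracy trade-off described above becomes tunable.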

Save storage space for `keyword`, `match_only_text` and `text` fields

This release updates the inverted index, an internal data structure, to use a more space-efficient encoding. The change benefits keyword, match_only_text, and text fields. In benchmark tests using application logs, it reduced the index size of the message field (mapped as match_only_text) by 14.4% and the overall disk footprint by 3.5%.
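For reference, a log index of the kind used in such a benchmark might be mapped roughly as follows. This is an assumed, simplified mapping for illustration; the field names other than message are hypothetical and do not come from the benchmark itself.

```python
import json


def log_index_mapping():
    """A simplified mapping for application logs (illustrative field names)."""
    return {
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                # Full-text log line, mapped as match_only_text as in the benchmark.
                "message": {"type": "match_only_text"},
                # Exact-value field; keyword fields also benefit from the new encoding.
                "host": {"type": "keyword"},
            }
        }
    }


print(json.dumps(log_index_mapping(), indent=2))
```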

Speed up indexing of `geo_point`, `geo_shape` and range fields

The new version speeds up the indexing of multi-dimensional points, the internal data structure used for geo_point, geo_shape, and range fields. Lucene-level benchmarks show a 10-15% improvement in indexing speed for these field types. Elasticsearch indices and data streams that consist primarily of these fields may see noticeable improvements in indexing speed.

PyTorch model support for natural language processing (NLP)

It is now possible to upload PyTorch models trained outside of Elasticsearch and use them for inference. Third-party model support brings modern natural language processing (NLP) and search use cases to the Elastic Stack.

Other changes

Aggregations

Remove the adjacency matrix setting #46327 (issues: #46257, #46324)
Remove MovingAverage pipeline aggregation #39328
Remove deprecated _time and _term sorting #39450
Remove deprecated date histogram interval #75000

Allocation

Remove the include_relocations setting #47717 (issues: #46079, #47443)

Analysis

Clean up versioned deprecations in Analysis #41560 (issue: #41164)
Remove pre-configured delimited_payload_filter #43686 (issues: #41560, #43684)

Authentication

Always add file and native realms unless explicitly disabled #69096 (issue: #50892)
Do not set a NameID format in Policy by default #44090 (issue: #40353)
Make the order setting mandatory for realm configuration #51195 (issue: #37614)

Cluster Coordination

Remove join timeout #60873 (issue: #60872)
Remove support for delaying state recovery pending master #53845 (issue: #51806)

Distributed

Remove synced flush #50882 (issues: #50776, #50835)
Remove the cluster.remote.connect setting #54175 (issue: #53924)

Engine

Force merge should reject requests with both only_expunge_deletes and max_num_segments set #44761 (issue: #43102)
Remove per-type indexing stats #47203 (issue: #41059)
Remove translog retention settings #51697 (issue: #50775)

Features/CAT APIs

Remove deprecated local parameter for _cat/indices #64868 (issue: #62198)
Remove deprecated local parameter for _cat/shards #64867 (issue: #62197)

Features/ILM+SLM

Default cluster.routing.allocation.enforce_default_tier_preference to true #79275 (issues: #76147, #79210)

Features/Indices APIs

Change the prefer_v2_templates parameter to default to true #55489 (issues: #53101, #55411)
Remove the deprecated _upgrade API #64732 (issue: #21337)
Remove parameter include_type_name from REST layer
Remove the template field from index templates #49460 (issue: #21009)

Infra/Core

Remove the nodes/0 folder prefix from the data path
Remove bootstrap.system_call_filter setting #72848
Remove node.max_local_storage_nodes #42428 (issue: #42426)
Remove Joda dependency #79007
Remove camel case from named date/time formats #60044
……

Packaging

Remove SysV init support #51716
Remove support for JAVA_HOME #69149
Require Java 17 to run Elasticsearch #79873

……

See more details at: https://www.elastic.co/cn/blog/whats-new-elastic-8-0-0
