Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java. It was originally released as open source software under the Apache License; releases from 7.11 onward, including 8.0, are offered under the Elastic License and the Server Side Public License (SSPL). Official clients are available in Java, .NET (C#), PHP, Python, Apache Groovy, Ruby, and many other languages.

After nearly three years, Elasticsearch 8 has been released with the following new features.

7.x REST API Compatibility

8.0 introduces some significant changes to the Elasticsearch REST APIs. While it’s important to update your application to accommodate these changes, finding and updating every API call after an upgrade can be painful and error-prone. To make this process easier, Elasticsearch has added support for 7.x compatibility headers in the REST API. These optional headers let you make 7.x-compatible requests to 8.0 clusters and receive 7.x-compatible responses.

The official recommendation is still to update your application to use native 8.0 requests and responses, but the 7.x compatibility headers let you make these changes safely over a longer period of time.
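As a rough illustration of the compatibility headers, the sketch below builds (but does not send) an HTTP request that asks an 8.0 cluster for 7.x behavior. The media type string is the one Elasticsearch defines for REST API compatibility; the cluster URL, index name, and the helper `build_compat_request` are assumptions made up for this example.

```python
import json
import urllib.request

# Media type that requests 7.x-compatible behavior from an 8.0 cluster.
COMPAT_7 = "application/vnd.elasticsearch+json; compatible-with=7"


def build_compat_request(base_url, path, body=None):
    """Build (but do not send) a request carrying the 7.x compatibility headers.

    Hypothetical helper for illustration only.
    """
    data = json.dumps(body).encode("utf-8") if body is not None else None
    # Accept controls the response format; Content-Type describes the request body.
    headers = {"Accept": COMPAT_7}
    if data is not None:
        headers["Content-Type"] = COMPAT_7
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(base_url + path, data=data, headers=headers, method=method)


# Assumed cluster address and index name.
req = build_compat_request(
    "https://localhost:9200",
    "/my-index/_search",
    {"query": {"match_all": {}}},
)
```

Keeping the header values in one place like this makes it easy to drop them later, once the application has been migrated to native 8.0 requests.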

Security features are enabled and configured by default

Running Elasticsearch without security exposes your cluster to any user who can send requests to Elasticsearch. In previous versions, you had to explicitly enable Elasticsearch’s security features such as authentication, authorization, and network encryption (TLS). Starting with Elasticsearch 8.0, security features are enabled and configured by default when starting Elasticsearch for the first time.

At startup, Elasticsearch 8.0 generates enrollment tokens that you can use to connect a Kibana instance or enroll additional nodes in a secured Elasticsearch cluster without having to generate security certificates or update YAML configuration files. Simply use the generated enrollment token when starting a new node or Kibana instance, and the Elastic Stack handles the security configuration for you.
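Because authentication is now on by default, clients that previously sent anonymous requests need to supply credentials. The following is a minimal sketch of attaching HTTP Basic credentials to a request; the URL, the placeholder password, and the helper name `build_authenticated_request` are assumptions for illustration. Actually sending the request would also require trusting the cluster's auto-generated CA certificate, whose location depends on your installation.

```python
import base64
import urllib.request


def build_authenticated_request(base_url, username, password):
    """Build (but do not send) a GET request with an HTTP Basic Authorization header.

    Hypothetical helper for illustration only.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        base_url + "/",
        headers={"Authorization": f"Basic {token}"},
        method="GET",
    )


# "<password>" is a placeholder for the password of the elastic user.
request = build_authenticated_request("https://localhost:9200", "elastic", "<password>")
```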

Known issues:

  • If you install Elasticsearch from an archive on an aarch64 platform such as Linux ARM or macOS M1, the elastic user password and the Kibana enrollment token are not generated automatically when the node first starts. After the node starts, generate the elastic password with the bin/elasticsearch-reset-password tool.

    bin/elasticsearch-reset-password -u elastic

  • Then, use the bin/elasticsearch-create-enrollment-token tool to create an enrollment token for Kibana.

    bin/elasticsearch-create-enrollment-token -s kibana


Better protection for system indices

System indices store configuration and internal data for Elastic features. In general, system indices are reserved for internal use by those features only. Direct access to or changes to a system index, while possible, can cause instability and other problems.

Several changes have been made in Elasticsearch 8.0 to protect system indices from direct access. To access system indices, a user’s role must now set allow_restricted_indices to true in its index privileges.

The superuser role also no longer grants write access to system indices. As a result, the built-in elastic superuser cannot change system indices by default.

Going forward, developers should use Kibana or the associated Elasticsearch APIs to manage a feature’s data rather than accessing system indices directly. If you access a system index directly, Elasticsearch returns a warning in the header of the API response and in the deprecation log.
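To make the allow_restricted_indices requirement concrete, here is a sketch of a role definition body of the kind accepted by the create-role security API (PUT /_security/role/<name>). The index pattern, the role name in the comment, and the helper `restricted_read_role` are hypothetical; only the field names in the body follow the documented role format.

```python
import json


def restricted_read_role(index_patterns):
    """Return a role body granting read access to the given index patterns.

    Hypothetical helper for illustration only.
    """
    return {
        "indices": [
            {
                "names": list(index_patterns),
                "privileges": ["read"],
                # Without this flag, the privileges above do not apply
                # to restricted (system) indices.
                "allow_restricted_indices": True,
            }
        ]
    }


# Assumed usage: body for PUT /_security/role/system_index_reader
role_body = restricted_read_role([".my-feature-*"])
print(json.dumps(role_body, indent=2))
```

Granting only read here is deliberate: the text above recommends managing feature data through Kibana or the feature's own APIs rather than writing to system indices.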

New KNN Search API

A technical preview of the KNN search API is available in Elasticsearch 8.0. Using dense_vector fields, a k-nearest neighbor (KNN) search finds the k vectors nearest to a query vector, as measured by a similarity metric. KNN is commonly used to support relevance ranking in recommendation engines and in algorithms based on natural language processing (NLP).
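The idea behind a KNN search can be shown with a small brute-force version in plain Python. This is only a conceptual sketch, not how Elasticsearch implements it: it uses Euclidean (L2) distance as the similarity metric, and the document IDs and vectors are made-up sample data.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(query, vectors, k):
    """Return the IDs of the k vectors closest to query, nearest first.

    Compares the query against every stored vector, so the cost grows
    linearly with the number of documents.
    """
    ranked = sorted(vectors.items(), key=lambda item: l2_distance(query, item[1]))
    return [doc_id for doc_id, _ in ranked[:k]]


# Made-up sample data: document ID -> 2-dimensional vector.
docs = {
    "a": [0.0, 0.0],
    "b": [1.0, 1.0],
    "c": [5.0, 5.0],
    "d": [0.9, 1.2],
}
nearest = exact_knn([1.0, 1.0], docs, k=2)
print(nearest)  # ['b', 'd']
```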

Previously, Elasticsearch only supported exact KNN searches, using script_score queries with vector functions. While this approach guaranteed accurate results, it often resulted in slow searches and did not scale well on large datasets. In exchange for slower indexing and imperfect accuracy, the new KNN search API lets you run approximate KNN searches on larger datasets at a much faster speed.
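The two approaches described above differ mainly in the request that is sent. The sketch below builds both request bodies as Python dictionaries: one for the technical-preview _knn_search endpoint and one for an exact search with script_score. The field name, vector dimensions, and helper functions are assumptions for illustration, and because the API is a technical preview its shape may change.

```python
import json

FIELD = "image_vector"  # hypothetical dense_vector field name

# Mapping for approximate search: the vector field is indexed
# with a similarity metric chosen up front.
mapping = {
    "mappings": {
        "properties": {
            FIELD: {
                "type": "dense_vector",
                "dims": 3,
                "index": True,
                "similarity": "cosine",
            }
        }
    }
}


def approximate_knn_body(query_vector, k, num_candidates):
    """Body for GET /<index>/_knn_search (approximate, technical preview)."""
    return {
        "knn": {
            "field": FIELD,
            "query_vector": query_vector,
            "k": k,
            # Candidates considered per shard; higher is more accurate but slower.
            "num_candidates": num_candidates,
        }
    }


def exact_knn_body(query_vector, k):
    """Body for GET /<index>/_search using script_score (exact, scores every match)."""
    return {
        "size": k,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # Adding 1.0 keeps the score non-negative.
                    "source": f"cosineSimilarity(params.query_vector, '{FIELD}') + 1.0",
                    "params": {"query_vector": query_vector},
                },
            }
        },
    }


approx = approximate_knn_body([0.1, 0.2, 0.3], k=10, num_candidates=100)
exact = exact_knn_body([0.1, 0.2, 0.3], k=10)
print(json.dumps(approx, indent=2))
```

The num_candidates value is where the accuracy-for-speed trade-off mentioned above becomes a tunable parameter.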

Save storage space for keyword, match_only_text and text fields

This release updates the inverted index, an internal data structure, to use a more space-efficient encoding. The change benefits keyword, match_only_text, and text fields. In benchmark tests using application logs, it reduced the index size of the message field (mapped as match_only_text) by 14.4% and the overall disk footprint by 3.5%.

Speed up indexing of geo_point, geo_shape and range fields

The new version speeds up indexing of multi-dimensional points, the internal data structure used for geo_point, geo_shape, and range fields. Lucene-level benchmarks show a 10-15% improvement in indexing speed for these field types. Elasticsearch indices and data streams that consist primarily of these fields may see significant improvements in indexing speed.

PyTorch model support for natural language processing (NLP)

It is now possible to upload PyTorch models trained outside of Elasticsearch and use them for inference. Third-party model support brings modern natural language processing (NLP) and search use cases to the Elastic Stack.

Other changes

Aggregations

  • Remove the adjacency matrix setting #46327 (issues: #46257, #46324)
  • Remove MovingAverage pipeline aggregation #39328
  • Remove deprecated _time and _term sorting #39450
  • Remove deprecated date histogram intervals #75000

Allocation

  • Remove the include_relocations setting #47717 (issues: #46079, #47443)

Analysis

  • Clean up versioned deprecations in Analysis #41560 (issue: #41164)
  • Remove pre-configured delimited_payload_filter #43686 (issues: #41560, #43684)

Authentication

  • Always add file and native realms unless explicitly disabled #69096 (issue: #50892)
  • Do not set a NameID format in Policy by default #44090 (issue: #40353)
  • Make the order setting mandatory for realm configuration #51195 (issue: #37614)

Cluster Coordination

  • Remove join timeout #60873 (issue: #60872)
  • Remove support for delaying state recovery pending master #53845 (issue: #51806)

Distributed

  • Remove synced flush #50882 (issues: #50776, #50835)
  • Remove the cluster.remote.connect setting #54175 (issue: #53924)

Engine

  • Force merge should reject requests with both only_expunge_deletes and max_num_segments set #44761 (issue: #43102)
  • Remove per-type indexing statistics #47203 (issue: #41059)
  • Remove translog retention settings #51697 (issue: #50775)

Features/CAT APIs

  • Remove deprecated local parameter for _cat/indices #64868 (issue: #62198)
  • Remove deprecated local parameter for _cat/shards #64867 (issue: #62197)

Features/ILM+SLM

  • Default cluster.routing.allocation.enforce_default_tier_preference to true #79275 (issues: #76147, #79210)

Features/Indices APIs

  • Set prefer_v2_templates parameter default value to true #55489 (issues: #53101, #55411)
  • Remove the deprecated _upgrade API #64732 (issue: #21337)
  • Remove parameter include_type_name from REST layer
  • Remove the template field from index templates #49460 (issue: #21009)

Infra/Core

  • Remove the nodes/0 folder prefix from the data path
  • Remove bootstrap.system_call_filter setting #72848
  • Remove node.max_local_storage_nodes #42428 (issue: #42426)
  • Remove Joda dependency #79007
  • Remove camel case named date/time formats #60044
  • ……

Packaging

  • Remove SysV init support #51716
  • Remove support for JAVA_HOME #69149
  • Require Java 17 to run Elasticsearch #79873

……

See more details at: https://www.elastic.co/cn/blog/whats-new-elastic-8-0-0