Elasticsearch for Apache Hadoop 2.3.0 and 2.2.1 released
Joining the release train this week, Elasticsearch for Apache Hadoop 2.3.0 and 2.2.1 are now out, containing compatibility improvements and bug fixes. Users are recommended to upgrade as soon as possible to take advantage of these.
As always, the artifacts are available from the downloads page or through Maven.
Important fixes
HDFS repository compatibility with Elasticsearch 2.3.0
For those who missed it, Elasticsearch 5.0.0 alpha1 was released a few days back and, among its bundle of features, ships with the repository hdfs plugin out of the box. As such, pending any unforeseen events, ES-Hadoop 2.3 will be the last release cycle containing the HDFS repository plugin.
Optimized network transfer for fixed routing
When using a fixed or predefined routing, the connector optimizes the network request to hit only the target shards, for both reads and writes.
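To see why a fixed routing narrows the request, recall that Elasticsearch picks the shard for a document from a hash of its routing value modulo the number of primary shards. The Python sketch below illustrates only that idea. The names `shard_for_routing` and `target_shards` are hypothetical, and `zlib.crc32` is a stand-in hash chosen to keep the example deterministic, so this is neither the hash Elasticsearch uses nor the connector's actual code.

```python
import zlib


def shard_for_routing(routing: str, num_primary_shards: int) -> int:
    """Map a routing value to a shard number (illustrative only).

    Assumption: crc32 stands in for the real hash function, so the shard
    numbers produced here will not match a real cluster.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


def target_shards(routing_values, num_primary_shards: int):
    """Return the sorted set of shards that the given routing values map to."""
    return sorted({shard_for_routing(r, num_primary_shards) for r in routing_values})


num_shards = 5
all_shards = list(range(num_shards))              # no routing: every shard is a candidate
routed = target_shards(["user-42"], num_shards)   # fixed routing: a single shard

print(f"without routing: {len(all_shards)} shards to contact")
print(f"with fixed routing: {len(routed)} shard(s) to contact")
```

With a single routing value the set of target shards collapses to one, which is what lets a client skip the remaining shards entirely.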
Improved indexing of Spark RDDs
The check for empty Spark RDDs has been tweaked to avoid triggering loading of the RDD content, especially important when using disk persistence or no caching.
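The general idea behind such a check can be shown without Spark. The Python sketch below is only an illustration under stated assumptions: `is_empty` and `expensive_source` are hypothetical names, a plain generator stands in for an RDD, and none of this is the connector's actual implementation. It asks a lazy source for at most one element instead of materializing everything.

```python
from itertools import islice

loaded = []  # records how many items the source actually had to produce


def expensive_source(n: int):
    """Hypothetical lazy data source; each yielded item simulates a costly load."""
    for i in range(n):
        loaded.append(i)
        yield {"id": i}


def is_empty(records) -> bool:
    """Return True if the iterable yields nothing, pulling at most one item."""
    return not list(islice(iter(records), 1))


print(is_empty(expensive_source(1_000_000)))  # False
print(len(loaded))                            # 1
```

Counting every element to test for emptiness would have forced all one million loads; peeking at the first element costs a single one.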
Better detection of shards overlap
The algorithm for checking overlapping shards has been improved (thanks to a user contribution) to use significantly less memory, thus increasing the number of indices it can work on.
Last 2.2.x release
Alongside 2.3, ES-Hadoop 2.2.1 is released as the last planned maintenance release in the 2.2.x line. It contains a series of backported bug fixes for those with conservative upgrade paths. However, even if you are on ES 1.x, upgrading to ES-Hadoop 2.3 is highly recommended.
Feedback
Looking forward to hearing your feedback on ES-Hadoop 2.3! You can find us on GitHub, Twitter (@elastic) or the forums. IRC works too.