xx February 2022, Apache Solr™ 9.0 available

The Solr PMC is pleased to announce the release of Apache Solr 9.0.

Solr is the popular, blazing fast, open source search platform from the Apache Solr project. Its major features include powerful full-text search, hit highlighting, faceted search and analytics, rich document parsing, geospatial search, extensive REST APIs as well as parallel SQL. Solr is enterprise grade, secure and highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

The release is available for immediate download at:

https://solr.apache.org/downloads.html

Please read CHANGES.txt for a detailed list of changes:

https://solr.apache.org/docs/9_0_0/changes/Changes.html

Solr 9.0 Release Highlights

DRAFT - please help

Major new features

  • Highlight 1
  • Highlight 2
  • Highlight 3
  • (Max 5? - see more in refguide)

System requirements

  • Java 11
  • ???

Compatibility breaks and removed functionality

  • Highlight 1
  • Highlight 2
  • Highlight 3
  • (Max 5?)


A summary of important changes is published in the Solr Reference Guide at https://solr.apache.org/solr/guide/9_0/solr-upgrade-notes.html. For the most exhaustive list, see the full release notes at https://solr.apache.org/docs/9_0_0/changes/Changes.html or by viewing the CHANGES.txt file accompanying the distribution.


--- END ANNOUNCEMENT ----


The text below is copied from https://nightlies.apache.org/solr/draft-guides/solr-reference-guide-main/major-changes-in-solr-9.html as inspiration.

  • Move to Java 11 as minimum Java version.

(raw; not yet edited)

  • SOLR-13671: Allow 'var' keyword in Java sources

  • SOLR-12055 introduces async logging by default. There’s a small window where log messages may be lost in the event of a hard crash. If this is unacceptable, switch back to synchronous logging; see the comments in the log4j2 configuration files (log4j2.xml by default).

  • SOLR-13323: The unused package org.apache.solr.internal.csv.writer and associated classes/tests that were easily confused with but not used by org.apache.solr.response.CSVWriter (or any other code) have been removed

  • SOLR-13854, SOLR-13858: SolrMetricProducer / SolrInfoBean APIs have changed and third-party components that implement these APIs need to be updated.

  • SOLR-14344: Remove Deprecated HttpSolrClient.RemoteSolrException and HttpSolrClient.RemoteExcecutionException. All the usages are replaced by BaseHttpSolrClient.RemoteSolrException and BaseHttpSolrClient.RemoteExcecutionException.

  • SOLR-15409: Zookeeper client libraries upgraded to 3.7.0, which may not be compatible with your existing server installations

  • SOLR-15809: Get rid of blacklist/whitelist terminology. JWTAuthPlugin parameter algWhitelist is now algAllowlist. The old parameter will still work in 9.x. Environment variables SOLR_IP_WHITELIST and SOLR_IP_BLACKLIST are no longer supported and have been replaced by SOLR_IP_ALLOWLIST and SOLR_IP_DENYLIST.
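
As an illustration of the SOLR-15809 environment variable rename, the following minimal Python sketch scans an environment for the two legacy variable names and reports their replacements. The variable names come from the note above; the helper name find_legacy_env_vars is hypothetical and not part of Solr.

```python
import os

# Old-to-new names taken from the SOLR-15809 note above.
RENAMED_ENV_VARS = {
    "SOLR_IP_WHITELIST": "SOLR_IP_ALLOWLIST",
    "SOLR_IP_BLACKLIST": "SOLR_IP_DENYLIST",
}


def find_legacy_env_vars(env):
    """Return {old_name: new_name} for each legacy variable present in env.

    Hypothetical helper for illustration; Solr does not ship this function.
    """
    return {old: new for old, new in RENAMED_ENV_VARS.items() if old in env}


if __name__ == "__main__":
    for old, new in find_legacy_env_vars(os.environ).items():
        print(f"{old} is no longer supported; set {new} instead")
```

Running the script in the shell environment that starts Solr is one way to spot leftover settings before upgrading.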

New Features & Enhancements

  • Replica placement plugins

  • Rate limiting and task management

  • Certificate Auth Plugin

  • SQL Query interface in UI

Configuration and Default Parameter Changes

  • SOLR-7530: TermsComponent’s JSON response format was changed so that "terms" property carries per field arrays by default regardless of distrib, terms.list, terms.ttf parameters. This affects JSON based response format but not others

  • SOLR-14036: Implicit /terms handler now returns terms across all shards in SolrCloud instead of only the local core. Users/apps may be assuming the old behavior. A request can be modified via the standard distrib=false param to only use the local core receiving the request.

  • SOLR-13783: In situations where a NamedList must be output as plain text, commas between key-value pairs will now be followed by a space (e.g., {shape=square, color=yellow} rather than {shape=square,color=yellow}) for consistency with other java.util.Map implementations based on AbstractMap.

  • SOLR-11725: JSON aggregations now use the corrected sample formula to compute standard deviation and variance. The computation of stdDev and variance in JSON aggregations is now the same as in StatsComponent.

  • SOLR-14012: unique and hll aggregations always return a long value, whether Solr runs standalone or as SolrCloud.

  • SOLR-11775: Return a long value for facet count in the JSON Facet module irrespective of the number of shards.

  • SOLR-15276: The V2 API call to look up async request status now uses the RESTful style "/cluster/command-status/1000" instead of "/cluster/command-status?requestid=1000".

  • SOLR-14972: The default port of prometheus exporter has changed from 9983 to 8989, so you may need to adjust your configuration after upgrade.

  • SOLR-15471: The language identification "whitelist" configuration is now an "allowlist" to better convey the meaning of the property

  • SOLR-12891: MacroExpander will no longer expand URL parameters inside of the 'expr' parameter (used by streaming expressions). Additionally, users are advised to use the 'InjectionDefense' class when constructing streaming expressions that include user supplied data to avoid risks similar to SQL injection. The legacy behavior of expanding the 'expr' parameter can be reinstated with -DStreamingExpressionMacros=true passed to the JVM at startup.

  • SOLR-13324: URLClassifyProcessor#getCanonicalUrl now throws MalformedURLException rather than hiding it. Although the present code is unlikely to produce such an exception, it may be possible in future changes or in subclasses. Currently this change should only affect compatibility of custom code overriding this method.

  • SOLR-14510: The writeStartDocumentList in TextResponseWriter now receives an extra boolean parameter representing the "exactness" of the numFound value (exact vs approximation). Any custom response writer extending TextResponseWriter will need to implement this abstract method now (instead of the previous method with the same name but without the new boolean parameter).
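
To make the SOLR-11725 item in the list above concrete, here is a small Python sketch of the difference between a population variance (divide by n) and a corrected sample variance (divide by n - 1). It assumes that "corrected sample formula" refers to the n - 1 divisor; it uses Python's standard statistics module and made-up values, not Solr code.

```python
import statistics

# Made-up sample data for illustration only.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population formula: sum of squared deviations divided by n.
population_variance = statistics.pvariance(values)

# Corrected sample formula: sum of squared deviations divided by n - 1.
sample_variance = statistics.variance(values)
sample_stddev = statistics.stdev(values)

if __name__ == "__main__":
    print("population variance:", population_variance)
    print("sample variance:", sample_variance)
    print("sample stddev:", sample_stddev)
```

For small result sets the two formulas differ noticeably, so applications comparing stored 8.x aggregation results with 9.0 output may see changed values.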

solr.xml maxBooleanClauses now enforced recursively

Lucene 9.0 has additional safety checks over previous versions that impact how the solr.xml global maxBooleanClauses option is enforced.

In previous versions of Solr, this option was a hard limit on the number of clauses in any BooleanQuery object, but it was only enforced for the direct clauses. Starting with Solr 9, this global limit is now also enforced against the total number of clauses in a nested query structure.

Users who upgrade from prior versions of Solr may find that some requests involving complex internal query structures (Example: long query strings using edismax with many qf and pf fields that include query time synonym expansion) which worked in the past now hit this limit and fail.

Users in this situation are advised to consider the complexity of their queries/configuration, and increase the value of maxBooleanClauses if warranted.
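
The difference between counting direct clauses and counting clauses across a nested structure can be sketched with a toy model in Python. The classes and functions below are hypothetical stand-ins, not Lucene or Solr APIs, and the assumption that the total is a count of leaf clauses is a simplification for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Term:
    """Toy leaf clause (stand-in for a single term query)."""
    text: str


@dataclass
class BooleanQuery:
    """Toy boolean query holding leaf terms and/or nested boolean queries."""
    clauses: List[Union["BooleanQuery", Term]] = field(default_factory=list)


def direct_clause_count(query):
    """Count only the immediate children (the pre-9.0 style of check)."""
    return len(query.clauses)


def total_clause_count(node):
    """Count leaf clauses across the whole nested structure."""
    if isinstance(node, Term):
        return 1
    return sum(total_clause_count(child) for child in node.clauses)


if __name__ == "__main__":
    # Three nested groups of four terms each.
    groups = [BooleanQuery([Term(f"t{g}_{i}") for i in range(4)]) for g in range(3)]
    query = BooleanQuery(groups)
    limit = 10  # illustrative limit, not a Solr default
    print(direct_clause_count(query) <= limit)  # passes a direct-only check
    print(total_clause_count(query) <= limit)   # fails a recursive check
```

In this model a query with only three direct clauses still exceeds a limit of 10 once nested clauses are included, which is the kind of request that may start failing after an upgrade.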

Log4J configuration & Solr MDC values

MDC values that Solr sets for use by logging calls (such as the collection name, shard name, replica name, etc.) have been modified to now be "bare" values, without the special single character prefixes that were included in past versions. For example, in 8.x, log messages for a collection named "gettingstarted" would have an MDC value with a key collection mapped to a value of c:gettingstarted; in 9.x the value will simply be gettingstarted.

Solr’s default log4j2.xml configuration file has been modified to prepend these same prefixes to MDC values when included in Log messages as part of the <PatternLayout/>. Users who have custom logging configurations that wish to ensure Solr 9.x logs are consistently formatted after upgrading will need to make similar changes to their logging configuration files. See SOLR-15630 for more details.
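
For tooling that reads MDC values from both 8.x and 9.x nodes, the change can be sketched in Python as follows. Only the c: prefix for the collection key is taken from the text above; the helper names are hypothetical and the sketch assumes every 8.x prefix is a single character followed by a colon.

```python
def bare_mdc_value(value_8x):
    """Strip a single-character 'x:' prefix from an 8.x style MDC value.

    Hypothetical helper: values without such a prefix are returned unchanged.
    """
    if len(value_8x) > 2 and value_8x[1] == ":":
        return value_8x[2:]
    return value_8x


def with_prefix(prefix, bare_value):
    """Rebuild the 8.x style value, as a log pattern could do in 9.x."""
    return f"{prefix}:{bare_value}"


if __name__ == "__main__":
    print(bare_mdc_value("c:gettingstarted"))
    print(with_prefix("c", "gettingstarted"))
```

Note that a bare 9.x value whose second character happens to be a colon would be misread by this simple check, so real tooling should key off the Solr version instead of guessing from the value.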

base_url removed from stored state

If you’re able to upgrade SolrJ to 8.8.x for all of your client applications, then you can set -Dsolr.storeBaseUrl=false (introduced in Solr 8.8.1) to better align the stored state in Zookeeper with future versions of Solr; as of Solr 9.x, the base_url will no longer be persisted in stored state. However, if you are not able to upgrade SolrJ to 8.8.x for all client applications, then you should set -Dsolr.storeBaseUrl=true so that Solr will continue to store the base_url in Zookeeper. For background, see: SOLR-12182 and SOLR-15145.

Support for the solr.storeBaseUrl system property will be removed in Solr 10.x and base_url will no longer be stored.

  • Solr’s distributed tracing no longer incorporates a special samplePercentage SolrCloud cluster property. Instead, consult the documentation for the tracing system you use on how to sample the traces. Consequently, if you use a Tracer at all, you will always have traces and thus trace IDs in logs. What percentage of them get reported to a tracing server is up to you.

  • JaegerTracerConfigurator no longer recognizes any configuration in solr.xml. It is now completely configured via System properties and/or Environment variables as documented by Jaeger.

Schema Changes

  • LegacyBM25SimilarityFactory has been removed.

  • SOLR-13593, SOLR-13690, SOLR-13691: Analyzer components can now be looked up by their SPI names in field type configuration.

Authentication & Security Changes

  • The property blockUnknown in the BasicAuthPlugin and the JWTAuthPlugin now defaults to true. This change is backward incompatible. If you need the pre-9.0 default behavior, you need to explicitly set blockUnknown:false in security.json.

  • The allow-list defining allowed URLs for the shards parameter is no longer part of the shardHandler configuration. It is defined by the allowUrls top-level property of the solr.xml file. For more information, see the "Format of solr.allowUrls" documentation.

  • SOLR-13985: Solr’s Jetty now binds to localhost network interface by default for better out of the box security. Administrators that need Solr exposed more broadly can change the SOLR_JETTY_HOST property in their Solr include (solr.in.sh/solr.in.cmd) file.

  • SOLR-14147: Solr now runs with the java security manager enabled by default. Administrators that need to run Solr with Hadoop will need to disable this feature by setting SOLR_SECURITY_MANAGER_ENABLED=false in the environment or in one of the Solr init scripts. Other features in Solr could also break. (Robert Muir, marcussorealheis)

  • SOLR-14118: Solr’s embedded ZooKeeper only binds to localhost by default. This embedded ZooKeeper should not be used in production. If you rely upon the previous behavior, then you can change the clientPortAddress in solr/server/solr/zoo.cfg.
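
Regarding the blockUnknown item at the top of the list above, the following Python sketch shows one way to restore the pre-9.0 behavior in a security.json document held as a dictionary. It assumes that blockUnknown sits under the top-level authentication key of security.json; the function name is hypothetical and the sample configuration is deliberately incomplete.

```python
import json


def with_block_unknown_disabled(security):
    """Return a copy of a security.json dict with blockUnknown set to false.

    Assumes blockUnknown belongs under the "authentication" section.
    """
    updated = json.loads(json.dumps(security))  # deep copy via JSON round trip
    updated.setdefault("authentication", {})["blockUnknown"] = False
    return updated


if __name__ == "__main__":
    # Incomplete sample configuration, for illustration only.
    current = {"authentication": {"class": "solr.BasicAuthPlugin"}}
    print(json.dumps(with_block_unknown_disabled(current), indent=2))
```

Disabling blockUnknown allows unauthenticated requests through to the authorization layer, so this is best treated as a transitional setting.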

Contrib Changes

  • SOLR-14067: StatelessScriptUpdateProcessorFactory moved to the contrib/scripting package instead of shipping as part of Solr, due to security concerns. It has been renamed to ScriptUpdateProcessorFactory for simplicity.

  • SOLR-15121: XSLTResponseWriter moved to contrib/scripting package instead of shipping as part of Solr, due to security concerns.

  • SOLR-14926: contrib/clustering is back and has been rewritten.

  • SOLR-14912: Cleaned up solr-extraction contrib to produce solr-extraction-* jar (instead of solr-cell-*). (Dawid Weiss)

Deprecations & Removed Features

The following features have been permanently removed from Solr:

  • SOLR-14656: Autoscaling framework removed. This includes:

    • Autoscaling, policy, triggers etc.

    • withCollection handling (SOLR-14964)

    • UTILIZENODE command

    • Sim framework

    • Suggestions tab in UI

    • Reference guide pages for autoscaling

    • autoAddReplicas feature


  • SOLR-14783: Data Import Handler (DIH) has been removed from Solr. The community package is available at: https://github.com/rohitbemax/dataimporthandler

  • SOLR-14792: VelocityResponseWriter has been removed from Solr. This encompasses all previously included /browse and wt=velocity examples. This feature has been migrated to an installable package at https://github.com/erikhatcher/solr-velocity

  • SOLR-13817: Legacy SolrCache implementations (LRUCache, LFUCache, FastLRUCache) have been removed. Users have to modify their existing configurations to use CaffeineCache instead. (ab)

  • CDCR

  • Storing indexes and backups in HDFS

  • Solr’s blob store

    • SOLR-14654: plugins cannot be loaded using "runtimeLib=true" option. Use the package manager to use and load plugins


  • Metrics History

  • SOLR-15470: The binary distribution no longer contains test-framework jars.

  • SOLR-15203: Remove the deprecated jwkUrl in favour of jwksUrl when configuring JWT authentication.

  • SOLR-12847: maxShardsPerNode parameter has been removed because it was broken and inconsistent with other replica placement strategies. Other relevant placement strategies should be used instead, such as autoscaling policy or rules-based placement.

  • SOLR-14092: Deprecated BlockJoinFacetComponent and BlockJoinDocSetFacetComponent are removed. Users are encouraged to migrate to uniqueBlock() in JSON Facet API. (Mikhail Khludnev)

  • SOLR-13596: Deprecated GroupingSpecification methods are removed.

  • SOLR-11266: default Content-Type override for JSONResponseWriter from _default configSet is removed. Example has been provided in sample_techproducts_configs to override content-type.

  • min_rf deprecated in 7.x

------


