This tutorial introduces Apache Solr: what it is, why you might use it, its main features, its history, and where it is used.
What is Apache Solr?
Apache Solr is an open-source search server from the Apache Software Foundation. It provides features such as full-text search, near-real-time indexing, hit highlighting, and rich document handling, and it is used by many organizations across the globe. Solr is written in Java and has an active community that regularly releases new versions. For indexing and full-text search, Solr relies on the Lucene search library. Solr exposes its APIs over HTTP, with JSON and XML as common request and response formats, and has a plugin architecture for advanced customization.
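To make the HTTP and JSON interface more concrete, here is a minimal sketch of how a client might build a query URL and read a JSON response. The host `localhost:8983`, the collection name `articles`, the field `title`, and the sample response text are assumptions made for illustration; `q`, `rows`, and `wt` are standard Solr query parameters, and no running server is needed to try the code.

```python
import json
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, rows=10):
    """Build a URL for Solr's /select request handler."""
    # q = the query, rows = number of results, wt = response format.
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url}/solr/{collection}/select?{urlencode(params)}"


def extract_docs(response_text):
    """Return (number of matches, matching documents) from a JSON response."""
    body = json.loads(response_text)
    return body["response"]["numFound"], body["response"]["docs"]


# Hypothetical host and collection, used only for illustration.
url = build_select_url("http://localhost:8983", "articles", "title:solr", rows=5)
print(url)

# A hand-written sample in the shape of a Solr JSON response.
sample = (
    '{"responseHeader": {"status": 0}, '
    '"response": {"numFound": 1, "start": 0, '
    '"docs": [{"id": "1", "title": "Intro to Solr"}]}}'
)
print(extract_docs(sample))
```

Against a real server, the URL could be fetched with any HTTP client and the response body passed to `extract_docs`.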
Features of Solr
The following are some of the important features of Apache Solr.
- Full-text search.
- Faceted search.
- Dynamic clustering.
- Geospatial search.
- Hit highlighting.
- Near-real-time indexing.
- Rich document handling.
- Support for Structured Query Language (SQL).
- A REST-style API.
- Response formats including JSON, XML, PHP, Python, Ruby, a custom Java binary format, Velocity, and XSLT.
- A GUI-based admin web interface.
- Distributed search.
- Caching of queries, filters, and documents for faster responses.
- Auto-suggestions based on the search request.
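Several of these features are switched on per request through query parameters. The sketch below builds a query string that asks for faceting and hit highlighting together. The field names `category` and `description` and the search term are hypothetical; `facet`, `facet.field`, `hl`, and `hl.fl` are standard Solr parameters.

```python
from urllib.parse import urlencode


def build_feature_params(query, facet_field, highlight_field):
    """Build a query string that enables faceting and hit highlighting."""
    params = [
        ("q", query),
        ("facet", "true"),             # turn on faceted search
        ("facet.field", facet_field),  # count matches per value of this field
        ("hl", "true"),                # turn on hit highlighting
        ("hl.fl", highlight_field),    # field to generate highlighted snippets from
        ("wt", "json"),                # ask for a JSON response
    ]
    return urlencode(params)


# Hypothetical field names, used only for illustration.
query_string = build_feature_params("laptop", "category", "description")
print(query_string)
```

The resulting string would be appended to a collection's `/select` URL.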
Choosing Solr over an RDBMS as a search server has benefits, but both have their pros and cons. An RDBMS is limited to wildcard-based text matching in SQL, which is slow, whereas Solr searches an inverted index, which is much faster.
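The idea behind an inverted index can be shown with a toy example. This is a simplified sketch, not how Lucene is implemented: it maps each term to the set of documents that contain it, so a lookup touches only the matching terms instead of scanning every row as a wildcard match would. The sample documents are made up for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the first term's documents, then intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Made-up documents, used only for illustration.
docs = {
    1: "Solr is a search server",
    2: "An RDBMS stores rows in tables",
    3: "Solr builds on the Lucene search library",
}
index = build_inverted_index(docs)
print(sorted(search(index, "solr search")))  # [1, 3]
```

A real index also handles tokenization, stemming, and relevance ranking, which this sketch leaves out.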
The following table compares Solr and an RDBMS across several parameters.
|Parameter|Solr|RDBMS|
|---|---|---|
|Text Search|Fast and sophisticated|Minimal and slow|
|Features|Few, targeted to text search|Many|
|Administration Tools|Minimal open-source projects|Many open-source & commercial|
|Monitoring Tools|Weak|Very strong|
|Scaling Tools|Automated, medium scale|Large scale|
|Schema Flexibility|Must in general rebuild|Changes immediately visible|
|Indexing Speed|Slow|Faster and adjustable|
|Query Speed|Text search is fast & predictable|Very dependent on design & use case|
|Row Addition/Extraction Speed|Slow|Fast|
|Partial Record Modification|No|Yes|
|Time to Visibility After Addition|Slow|Immediate|
|Access to Internal Data Structures|High|None|
|Technical Knowledge Required|Java (minimal), web server deployment, IT|SQL, DB-specific factors, IT|
History of Solr
Let us look at the year-by-year evolution of Apache Solr.
2004: Yonik Seeley created Solr at CNET Networks to improve the search capability of the company website.
2006: CNET Networks donated the Solr source code to the Apache Software Foundation.
2007: Solr graduated from incubation and became a top-level project.
2008: Solr 1.3 was released with major enhancements such as distributed search.
2009: Solr 1.4 was released with major enhancements such as document processing for PDF, Word, and HTML, database integration, and additional plug-ins.
2010: The Lucene and Solr projects were merged.
2011: Solr 3.1 was released.
2012: Solr 4.0 was released.
2015: Solr 5.0 was released.
2016: Solr 6.0 was released.
2017: Solr 7.0 was released.
2019: Solr 8.0 was released.
2020: Bloomberg donated the Solr Operator to the Apache Lucene/Solr project.
2021: Solr became a top-level Apache project (TLP), separate from Lucene.
The following figure shows the high-level overview of Apache Solr history.
Solr Use Cases
Apache Solr plays an important role in many industries, and the following figure shows some of them. Its users include large companies such as LinkedIn, Instagram, Netflix, AT&T Interactive, Sears, eBay, and many more.