Apache Solr Introduction

This tutorial introduces Apache Solr: what it is, its main features, the reasons to use it, its history, and its common use cases.

What is Apache Solr?

Apache Solr is an open-source search server from the Apache Software Foundation. It provides features such as full-text search, near-real-time indexing, hit highlighting, and handling of various document types, and it is used by many organizations across the globe. Solr is written in Java and has an active community that regularly releases new versions. For indexing and full-text search, Solr relies on the Lucene search library. Solr exposes HTTP APIs that return JSON or XML, and it has a plugin architecture for advanced customization.
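As a rough illustration of the HTTP API, the sketch below builds a URL for Solr's /select search handler using the standard q, rows, and wt query parameters. The host, port, and the collection name "techproducts" are assumptions made for this example, not values taken from this tutorial, and the helper build_select_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumed values for illustration only: a local Solr instance on its
# default port and a collection named "techproducts".
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "techproducts"


def build_select_url(query, rows=10, fmt="json"):
    """Build a URL for Solr's /select search handler (hypothetical helper)."""
    params = {
        "q": query,    # the search query, for example "name:ipod"
        "rows": rows,  # maximum number of documents to return
        "wt": fmt,     # response writer: "json" or "xml"
    }
    return f"{SOLR_BASE}/{COLLECTION}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_select_url("name:ipod"))
    print(build_select_url("name:ipod", fmt="xml"))
```

Opening such a URL in a browser or with any HTTP client should return the matching documents in the requested format, provided a Solr instance with that collection is actually running.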

Features of Solr

The following are some of the important features of Apache Solr.

  • Full-text (textual) search.
  • Faceted search.
  • Dynamic clustering.
  • Geospatial (geo) search.
  • Hit highlighting.
  • Near-real-time indexing.
  • Handling of rich documents.
  • Support for Structured Query Language (SQL).
  • A REST-like API.
  • Response writers for JSON, XML, PHP, Python, Ruby, custom Java binary, Velocity, and XSLT.
  • A web-based admin GUI.
  • Replication.
  • Distributed search.
  • Caching of queries, filters, and documents for faster responses.
  • Auto-suggestion of queries similar to the search request.
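Several of the features listed above are switched on per request through query parameters. The following is a minimal sketch, assuming a collection that has fields named "cat" and "name" (hypothetical field names chosen for illustration); facet, facet.field, hl, and hl.fl are standard Solr query parameters, while build_feature_params is a made-up helper.

```python
from urllib.parse import urlencode


def build_feature_params(query, facet_field, highlight_field):
    """Return Solr query parameters enabling faceting and highlighting."""
    return [
        ("q", query),
        ("facet", "true"),             # turn on faceted search
        ("facet.field", facet_field),  # count matches per value of this field
        ("hl", "true"),                # turn on hit highlighting
        ("hl.fl", highlight_field),    # field in which to highlight matches
        ("wt", "json"),                # ask for a JSON response
    ]


if __name__ == "__main__":
    # "cat" and "name" are assumed field names, not taken from the tutorial.
    print(urlencode(build_feature_params("ipod", "cat", "name")))
```

The resulting query string can be appended to a /select request so that the response includes facet counts and highlighted snippets alongside the matching documents.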

Why Solr?

Choosing Solr over an RDBMS as a search server brings benefits, although each has its pros and cons. On one hand, SQL in an RDBMS is limited to wildcard-based text search; on the other hand, Solr searches an inverted index, which is much faster for text queries.
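To see why an inverted index helps, consider the toy sketch below. It is not how Lucene is implemented; it only illustrates the idea that a term can be looked up directly in a map from terms to document ids, whereas a wildcard-style match has to examine every row. All names and sample documents here are invented for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase terms (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def scan(docs, word):
    """Check every document in turn, as an unindexed wildcard match would."""
    return {doc_id for doc_id, text in docs.items() if word.lower() in text.lower()}


if __name__ == "__main__":
    docs = {
        1: "Apache Solr search server",
        2: "Relational database server",
        3: "Solr uses an inverted index",
    }
    index = build_inverted_index(docs)
    print(search(index, "solr"))         # one lookup in the index
    print(search(index, "solr server"))  # intersection of two lookups
    print(scan(docs, "solr"))            # visits all documents
```

The index lookup cost depends on the number of query terms, while the scan grows with the number of documents. The two are not exactly equivalent, since the scan matches substrings and the index matches whole terms.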

The following table compares Solr and a relational database on different parameters.

Parameter | Lucene/Solr | Relational DB
Text Search | Fast and sophisticated | Minimal and slow
Features | Few, targeted to text search | Many
Deployment Complexity | Medium | Medium
Administration Tools | Minimal open source projects | Many open-source & commercial
Monitoring Tools | Weak | Very strong
Scaling Tools | Automated, medium scale | Large scale
Support Availability | Weak | Strong
Schema Flexibility | Must in general rebuild | Changes immediately visible
Indexing Speed | Slow | Faster and adjustable
Query Speed | Text search is fast & predictable | Very dependent on design & use case
Row Addition/Extraction Speed | Slow | Fast
Partial Record Modification | No | Yes
Time to visibility after addition | Slow | Immediate
Access to internal data structures | High | None
Technical knowledge required | Java (minimal), web server deployment, IT | SQL, DB-specific factors, IT

History of Solr

Let us see the year-by-year evolution of Apache Solr.

2004: Yonik Seeley created Solr at CNET Networks to improve the search capability of the company website.

2006: CNET Networks donated the Solr source code to the Apache Software Foundation.

2007: Solr graduated from incubation and became a top-level project.

2008: Solr 1.3 was released with major enhancements such as distributed search.

2009: Solr 1.4 was released with major enhancements such as document processing for PDF, Word, and HTML, database integration, and some additional plug-ins.

2010: The Lucene and Solr projects were merged.

2011: Solr 3.1 was released.

2012: Solr 4.0 was released.

2015: Solr 5.0 was released.

2016: Solr 6.0 was released.

2017: Solr 7.0 was released.

2019: Solr 8.0 was released.

2020: Bloomberg donated the Solr Operator to the Apache Lucene/Solr project.

2021: Solr became a top-level Apache project (TLP), separate from Lucene.

The following figure gives a high-level overview of the history of Apache Solr.

[Figure: Apache Solr history]

Solr Use Cases

Apache Solr plays an important role in many industries, and the following figure shows some of them. Large companies such as LinkedIn, Instagram, Netflix, AT&T Interactive, Sears, and eBay are among its users.

[Figure: Solr use cases]