What is Indexing in Apache Solr?

In Apache Solr, indexing organizes documents systematically so that the document a user needs can be found easily. The index is what makes searching on a user query fast. Indexing collects documents, parses them, and stores the result on a storage medium. Many source formats can be indexed, such as PDF, CSV, XML, and database table data.

Indexing involves three operations: adding data to the index, deleting data from the index, and updating data in the index.
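As a rough illustration of these three operations, the sketch below builds the JSON bodies that Solr's update handler accepts for an add, a delete by ID, and an atomic update. It assumes the core's unique key field is named `id`; the helper names and the sample book fields are hypothetical and only build the payloads, nothing is sent to a server.

```python
import json


def add_command(doc):
    """Wrap a document in a JSON 'add' command."""
    return {"add": {"doc": doc}}


def delete_command(doc_id):
    """Build a JSON 'delete' command for a single document ID."""
    return {"delete": {"id": doc_id}}


def atomic_update_doc(doc_id, field, value):
    """Build an atomic update that sets one field on an existing document.

    Assumption: the unique key field of the core is called 'id'.
    """
    return {"id": doc_id, field: {"set": value}}


# Hypothetical sample document, loosely based on the book data used later.
book = {"id": "0553293354", "BOOK_NAME": "Foundation", "PRICE": 8.99}

print(json.dumps(add_command(book)))
print(json.dumps(delete_command("0553293354")))
# Atomic updates are sent as a list of partial documents.
print(json.dumps([atomic_update_doc("0553293354", "PRICE", 7.49)]))
```

Each printed string is a request body that could be posted to the core's update endpoint.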


This tutorial covers adding data to the Apache Solr index. The three most common ways to do it are:

  • Using the post tool.
  • Using the Apache Solr web interface.
  • Using a client API, for example from Python or Java.

We will go through each way of data addition in the Apache Solr index.
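For the client API route, a minimal sketch using only the Python standard library might look like the following. The base URL, the core name, and the sample document are assumptions for illustration; `build_update_request` and `send` are hypothetical helper names, and `send` is defined but not called here because it needs a running Solr server.

```python
import json
import urllib.request


def build_update_request(base_url, core, docs, commit=True):
    """Build a POST request that sends a list of documents to a core."""
    url = f"{base_url.rstrip('/')}/solr/{core}/update"
    if commit:
        # Ask Solr to commit right away so the documents become searchable.
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response (needs a live server)."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Assumed values: local Solr on the default port and the core used in this tutorial.
req = build_update_request(
    "http://localhost:8983",
    "Solr_sample_core",
    [{"id": "demo-1", "title": "hello"}],
)
print(req.get_method(), req.full_url)
```

Separating request construction from sending keeps the URL and body easy to inspect before anything touches the index.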


Document addition using Apache Solr Post Command

The Apache Solr post command adds documents in formats such as XML, CSV, and JSON to the index. It is run from the bin directory of the Apache Solr installation.

Let us see the Apache Solr Post command with the below example.

We have a book.csv file that holds the details of each book, such as its name, price, author, and availability. The file is saved in the Solr bin directory (~/hadoop/solr-8.8.1/bin in our setup).

BOOK_ID, CAT, BOOK_NAME, PRICE, BOOK_AVAILABILITY, BOOK_AUTHOR
0453573403, book, A Game of Thrones, 8.99, true, George R.R. Martin
0353579908, book, A Clash of Kings, 6.99, true, George R.R. Martin
064357342X, book, A Storm of Swords, 9.99, true, George R.R. Martin
0553293354, book, Foundation, 8.99, true, Isaac Asimov
0812521390, book, The Black Company, 7.99, false, Glen Cook
0814564706, book, Ender's Game, 8.99, true, Orson Scott Card
0443458532, book, Jhereg, 9.95, false, Steven Brust
0383459300, book, Nine Princes In Amber, 7.99, true, Roger Zelazny
0805875481, book, The Book of Three, 8.99, true, Lloyd Alexander
080567749X, book, The Black Cauldron, 7.99, true, Lloyd Alexander


We can create the book.csv file with the nano editor: press CTRL+O to save the file and CTRL+X to exit the editor.

cloudduggu@ubuntu:~/hadoop/solr-8.8.1/bin$ nano book.csv
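Before indexing, it can be useful to check that every row of the file has the same number of columns as the header. The sketch below does this with Python's csv module on a two-row excerpt of the data above. It is only a local sanity check: `read_books` is a hypothetical helper, and `skipinitialspace=True` is a Python-side choice to ignore the space after each comma, not a statement about how Solr parses the file.

```python
import csv
import io

# Excerpt of book.csv, embedded so the example is self-contained.
SAMPLE = """BOOK_ID, CAT, BOOK_NAME, PRICE, BOOK_AVAILABILITY, BOOK_AUTHOR
0453573403, book, A Game of Thrones, 8.99, true, George R.R. Martin
0812521390, book, The Black Company, 7.99, false, Glen Cook
"""


def read_books(text):
    """Parse CSV text and fail early if a row has too few or too many columns."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    rows = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        # DictReader stores extra values under the key None
        # and fills missing values with None.
        if None in row or None in row.values():
            raise ValueError(f"line {line_no}: wrong number of columns")
        rows.append(row)
    return rows


books = read_books(SAMPLE)
for book in books:
    print(book["BOOK_ID"], book["BOOK_NAME"], book["PRICE"])
```

To check the real file, the same function could be called with the text read from book.csv.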


Now let us use the Apache Solr post command to index the book.csv file into the Solr_sample_core core.


Command:

cloudduggu@ubuntu:~/hadoop/solr-8.8.1/bin$ ./post -c Solr_sample_core book.csv

Output:

Once the post command is executed, it prints the file it posted, the number of files indexed, and the time spent.
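If the indexing step needs to be scripted, the same command can be launched from Python. The sketch below only assembles the argument list shown above and prints it; `post_command` and `run_post` are hypothetical helpers, and `run_post` is not called here because it assumes a Solr installation whose bin directory contains the post script.

```python
import shlex
import subprocess


def post_command(core, *files):
    """Build the argument list for: ./post -c <core> <file> [<file> ...]."""
    if not files:
        raise ValueError("at least one file is required")
    return ["./post", "-c", core, *files]


def run_post(bin_dir, core, *files):
    """Run the post script from Solr's bin directory and return its output.

    Assumption: bin_dir is the bin directory of a Solr installation.
    """
    result = subprocess.run(
        post_command(core, *files),
        cwd=bin_dir,
        check=True,  # raise CalledProcessError if the script fails
        capture_output=True,
        text=True,
    )
    return result.stdout


cmd = post_command("Solr_sample_core", "book.csv")
print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.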

We can verify the indexing in the Apache Solr web interface at http://localhost:8983/. Select the core (Solr_sample_core) and click Query. Leave every field at its default and click the Execute Query button. Once the query has run, the indexed data is shown, in JSON format by default.

Note: We have used our server IP "http://192.168.216.131:8983/" on which Solr is installed. You can also use your server IP or localhost to open the Solr Web interface.
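The same check can be approximated outside the browser by calling the core's select endpoint with a match-all query. The sketch below builds that URL and reads the hit count from a response. The base URL is an assumption, `select_url` and `count_found` are hypothetical helpers, and `SAMPLE_RESPONSE` is an illustrative stand-in for a real response, not output captured from Solr.

```python
import json
from urllib.parse import urlencode


def select_url(base_url, core, query="*:*", rows=10):
    """Build a select URL; '*:*' matches every document in the core."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/solr/{core}/select?{params}"


def count_found(response_text):
    """Return the number of matching documents reported in a JSON response."""
    return json.loads(response_text)["response"]["numFound"]


# Illustrative response body with the general shape of a Solr JSON reply.
SAMPLE_RESPONSE = (
    '{"responseHeader": {"status": 0}, '
    '"response": {"numFound": 10, "start": 0, "docs": []}}'
)

url = select_url("http://localhost:8983", "Solr_sample_core")
print(url)
print(count_found(SAMPLE_RESPONSE))
```

With a running server, the text passed to `count_found` would come from fetching the printed URL.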



Document addition using Apache Solr Web Interface

We can also add documents to the Apache Solr index from the Solr web interface.

Let us see this in the following example.

Open http://localhost:8983/, choose the core that we have already created (Solr_sample_core), and click Documents. The Documents screen opens.

We have the below JSON records that we want to add in Solr for indexing.

[
  { "Emp_id" : "1001", "Emp_name" : "Deepak", "EMP_age" : 24, "Emp_Designation" : "System Engineer", "Emp_Work_Location" : "Delhi" },
  { "Emp_id" : "1002", "Emp_name" : "Ankit", "EMP_age" : 28, "Emp_Designation" : "Consultant", "Emp_Work_Location" : "Deoria" },
  { "Emp_id" : "1003", "Emp_name" : "Kanheya", "EMP_age" : 38, "Emp_Designation" : "Manager", "Emp_Work_Location" : "Odisha" },
  { "Emp_id" : "1004", "Emp_name" : "Sarvesh", "EMP_age" : 32, "Emp_Designation" : "Developer", "Emp_Work_Location" : "Lucknow" },
  { "Emp_id" : "1005", "Emp_name" : "Thousif", "EMP_age" : 30, "Emp_Designation" : "DBA", "Emp_Work_Location" : "Hyderabad" }
]

Now select JSON as the document type in the Solr web interface and paste the JSON records into the Document(s) box. Leave the other options, such as Commit Within and Overwrite, unchanged, and click the Submit Document button.

After the document is submitted successfully, a success status is shown on the right side.
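The form options have counterparts when documents are sent to the update endpoint directly. The sketch below assumes that Commit Within and Overwrite correspond to the `commitWithin` and `overwrite` request parameters, and uses 1000 milliseconds and true as assumed values; `update_url` is a hypothetical helper and only one of the employee records is included to keep the example short.

```python
import json
from urllib.parse import urlencode


def update_url(base_url, core, commit_within_ms=1000, overwrite=True):
    """Build an update URL carrying the commitWithin and overwrite parameters."""
    params = urlencode(
        {
            "commitWithin": commit_within_ms,
            # Solr expects lowercase true/false rather than Python's True/False.
            "overwrite": str(overwrite).lower(),
        }
    )
    return f"{base_url.rstrip('/')}/solr/{core}/update?{params}"


# One of the employee records from above, wrapped in a list.
employees = [
    {
        "Emp_id": "1001",
        "Emp_name": "Deepak",
        "EMP_age": 24,
        "Emp_Designation": "System Engineer",
        "Emp_Work_Location": "Delhi",
    }
]

target = update_url("http://localhost:8983", "Solr_sample_core")
body = json.dumps(employees)
print(target)
print(body)
```

Serializing the records with `json.dumps` guarantees a well-formed JSON array, which avoids hand-editing mistakes such as trailing commas.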
