HBase Full-Text Index
Scenario
A mapping between HBase tables and Solr indexes, defined in mapping.xml, provides a unified API for operating on HBase and Solr. Indexes are stored in Solr and raw data is stored in HBase, so a query issued on Solr can return the raw data directly.
Queries with distrib=false are not supported.
Prerequisites
Solr and HBase have been installed.
Procedure
- Configure a config set.
- Obtain the initial config set template.
solrctl confset --generate ./confWithHBase -confWithHBase
- Edit the managed-schema file in the generated template.
vi confWithHBase/conf/managed-schema
The uniqueKey in managed-schema must correspond to the row key of the HBase table. Set stored=false for all other fields, so that Solr holds only the indexes and the raw data is stored in HBase.
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="name" type="text_general" indexed="true" stored="false"/>
<field name="sku" type="text_en_splitting_tight" indexed="true" stored="false" omitNorms="true"/>
<uniqueKey>id</uniqueKey>
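The stored=false rule above can be checked mechanically before the config set is created. The following is a minimal sketch, assuming the schema is available as an XML string with a single root element; SchemaStoredCheck and fieldsStoredInSolr are hypothetical names used for illustration and are not part of Solr or HBase. It only inspects the stored attribute written on each field element and ignores fieldType defaults and dynamicField entries.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of Solr or HBase.
final class SchemaStoredCheck {

    // Returns the names of fields, other than the uniqueKey field,
    // whose stored attribute is not explicitly set to "false".
    static List<String> fieldsStoredInSolr(String schemaXml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(schemaXml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();

            // The text of <uniqueKey> names the field that mirrors the HBase row key.
            NodeList keys = root.getElementsByTagName("uniqueKey");
            String uniqueKey = keys.getLength() > 0 ? keys.item(0).getTextContent().trim() : "";

            List<String> offenders = new ArrayList<>();
            NodeList fields = root.getElementsByTagName("field");
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                String name = field.getAttribute("name");
                if (!name.equals(uniqueKey) && !"false".equals(field.getAttribute("stored"))) {
                    offenders.add(name);
                }
            }
            return offenders;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse schema XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample: the fragment is wrapped in a <schema> root so that it parses.
        String schema = "<schema>"
                + "<field name=\"id\" type=\"string\" indexed=\"true\" stored=\"true\"/>"
                + "<field name=\"name\" type=\"text_general\" indexed=\"true\" stored=\"false\"/>"
                + "<uniqueKey>id</uniqueKey>"
                + "</schema>";
        System.out.println("Fields still stored in Solr: " + fieldsStoredInSolr(schema));
    }
}
```

An empty result means every field except the uniqueKey field declares stored=false.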
- Create a config set.
- Configure the mapping.xml file for HBase tables and Solr collections.
- In the mapping.xml file, the index fields must be fields configured with indexed=true in the Solr collection schema. The non-index fields must also be fields defined in the Solr collection schema.
- Each column value in the mapping.xml file must identify a column family and column of the HBase table specified by the table attribute.
- In the mapping.xml file, <mapping table="test_tb"> specifies the name of the HBase table to be mapped to the Solr collection.
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<mapping table="test_tb">
  <index>
    <field name="name" column="I:n"/>
    <field name="alternative_names" column="I:a"/>
    <field name="latitude" column="I:la"/>
    <field name="longitude" column="I:ln"/>
    <field name="countrycode" column="I:x"/>
    <field name="population" column="I:p"/>
    <field name="elevation" column="I:e"/>
    <field name="timezone" column="I:t"/>
    <field name="lastupdate" column="I:las"/>
    <field name="text" column="I:tt"/>
  </index>
  <non-index>
    <field name="non_f1" column="I:n1"/>
    <field name="non_f2" column="I:n2"/>
    <field name="non_f3" column="I:n3"/>
    <field name="non_f4" column="I:n4"/>
  </non-index>
</mapping>
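A quick structural check of mapping.xml can catch malformed column values before the file is used to create a collection. The sketch below is illustrative only: MappingCheck, isValidColumn, and validate are hypothetical names, not part of the HBase or Solr APIs. It assumes the file layout shown above and verifies only that the root element names a table and that every column has the form family:qualifier. It does not contact HBase or Solr, so it cannot confirm that the columns or schema fields actually exist.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of the HBase or Solr APIs.
final class MappingCheck {

    // True when the value has the form family:qualifier with both parts non-empty.
    static boolean isValidColumn(String column) {
        int sep = column.indexOf(':');
        return sep > 0 && sep < column.length() - 1;
    }

    // Parses the content of a mapping.xml file and returns one message per problem found.
    static List<String> validate(String mappingXml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(mappingXml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse mapping XML", e);
        }

        List<String> problems = new ArrayList<>();
        if (!"mapping".equals(root.getTagName()) || root.getAttribute("table").isEmpty()) {
            problems.add("root element must be <mapping> with a non-empty table attribute");
        }

        // Covers the fields under both <index> and <non-index>.
        NodeList fields = root.getElementsByTagName("field");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            String name = field.getAttribute("name");
            String column = field.getAttribute("column");
            if (name.isEmpty()) {
                problems.add("field without a name attribute");
            }
            if (!isValidColumn(column)) {
                problems.add("field '" + name + "' has invalid column '" + column + "'");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Assumed sample with one deliberately malformed column ("n1" lacks a family).
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?>"
                + "<mapping table=\"test_tb\">"
                + "<index><field name=\"name\" column=\"I:n\"/></index>"
                + "<non-index><field name=\"non_f1\" column=\"n1\"/></non-index>"
                + "</mapping>";
        for (String problem : validate(xml)) {
            System.out.println(problem);
        }
    }
}
```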
- Create a collection.
- Use the AdminInterface API to create a LunaAdmin object, which is used to create and delete HBase tables and collections.
- The java.lang.String mappingFileDirPath parameter is the path of the mapping.xml file.
Main APIs:
// Create an HBase table with the given descriptor and split keys.
void createTable(org.apache.hadoop.hbase.HTableDescriptor desc, byte[][] splitKeys)

// Create an HBase table with the given descriptor and split keys, create a Solr collection
// from the create request in the default Solr root path, and add the Solr index to the HBase table.
void createTable(org.apache.hadoop.hbase.HTableDescriptor desc, byte[][] splitKeys, org.apache.solr.client.solrj.request.CollectionAdminRequest.Create createRequest, java.lang.String mappingFileDirPath)

// Add a Solr collection to an HBase table, using the default Solr root path.
void addCollection(org.apache.hadoop.hbase.TableName table, org.apache.solr.client.solrj.request.CollectionAdminRequest.Create createRequest, java.lang.String mappingFileDirPath)

// Delete an HBase table, then delete the Solr collections of the table.
void deleteTable(org.apache.hadoop.hbase.TableName tableName)

// Delete a Solr collection of an HBase table, using the default Solr root path.
void deleteCollection(org.apache.hadoop.hbase.TableName table, java.lang.String collection)

// Delete all Solr collections of an HBase table, using the default Solr root path.
void deleteAllCollections(org.apache.hadoop.hbase.TableName table)

// Check whether a Solr collection exists.
boolean collectionExists(java.lang.String collection)

// Check whether an HBase table exists.
boolean tableExists(org.apache.hadoop.hbase.TableName tableName)

// Get the descriptor of an HBase table.
org.apache.hadoop.hbase.HTableDescriptor getTableDescriptor(org.apache.hadoop.hbase.TableName tableName)
- Create a collection.
Obtain the table handle from LunaAdmin through the AdminInterface API, then call the HBase put API to write data to the HBase table. When data is written to the HBase table, the corresponding index is created in the Solr collection according to the configuration in the mapping.xml file.
// Get the table that handles write/read requests.
org.apache.hadoop.hbase.client.Table getTable(org.apache.hadoop.hbase.TableName table)
- Query data.
- Use the native Solr API to query data on Solr.
- To disable the retrieval of raw data from HBase, add query.hbase=false to the query parameters.
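As an illustration of the query parameters described above, the following sketch builds a Solr select URL and appends query.hbase=false when raw data should not be read from HBase. It is not an official client: SolrQueryUrl and build are hypothetical names, and the base URL and collection name are placeholder assumptions. Only the query.hbase parameter comes from this document, and /select is the standard Solr query handler path.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the Solr or HBase client libraries.
final class SolrQueryUrl {

    // Builds <solrBaseUrl>/<collection>/select?q=<query> and appends
    // query.hbase=false when raw data should not be fetched from HBase.
    static String build(String solrBaseUrl, String collection, String query,
                        boolean fetchRawDataFromHBase) {
        StringBuilder url = new StringBuilder(solrBaseUrl)
                .append('/').append(collection)
                .append("/select?q=")
                .append(URLEncoder.encode(query, StandardCharsets.UTF_8));
        if (!fetchRawDataFromHBase) {
            url.append("&query.hbase=false");
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Assumed values: replace the base URL and collection name with those of your cluster.
        System.out.println(build("http://localhost:8983/solr", "test_collection", "name:hbase", false));
    }
}
```

The resulting URL can be passed to any HTTP client. Because queries with distrib=false are not supported by this feature, the sketch never adds that parameter.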