Modified and Deleted Data Can Still Be Queried by the Scan Command
Question
Why does the following scan command still return data that has been modified or deleted?
scan '<table_name>',{FILTER=>"SingleColumnValueFilter('<column_family>','<column>',=,'binary:<value>')"}
Answer
When you query a table in HBase with this filter, all versions of the queried column values are compared by default, including values that have since been modified or deleted. In addition, if a row is not hit (that is, the column cannot be matched in the row), HBase still returns that row.
If you only need the rows that are hit, compared against the latest value of the column, run the following command:
scan '<table_name>',{FILTER=>"SingleColumnValueFilter('<column_family>','<column>',=,'binary:<value>',true,true)"}
This command filters out the rows that are not hit and compares only the latest version of the column value. As a result, values from before a modification and deleted values are no longer returned.
The constructor of SingleColumnValueFilter has the following signature:
SingleColumnValueFilter(final byte[] family, final byte[] qualifier, final CompareOp compareOp, ByteArrayComparable comparator, final boolean filterIfMissing, final boolean latestVersionOnly)
Parameter description:
- family: indicates the column family of the column you want to query.
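- qualifier: indicates the column qualifier, that is, the column you want to query.
- compareOp: indicates the comparison operator, such as = and >.
- comparator: indicates the comparator that carries the target value to be searched for, such as binary:<value> in the scan commands.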
- filterIfMissing: indicates whether a row is filtered if the column cannot be matched in this row. The default value is false.
- latestVersionOnly: indicates whether only values of the latest version will be queried. The default value is false.
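The effect of the two boolean parameters can be illustrated with a small in-memory model. This is only a sketch of the behavior described above, not HBase code: the class name ScvfFlagsSketch, the method rowPasses, and the sample rows are hypothetical, and the comparison is reduced to plain string equality (the = operator with a binary comparator).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of filterIfMissing and latestVersionOnly; not HBase code. */
class ScvfFlagsSketch {

    /**
     * Decides whether a row passes the filter.
     *
     * @param versions values of the target column in this row, newest first;
     *                 an empty list means the column cannot be matched in the row
     * @param wanted   the target value to compare against
     */
    public static boolean rowPasses(List<String> versions, String wanted,
                                    boolean filterIfMissing, boolean latestVersionOnly) {
        if (versions == null || versions.isEmpty()) {
            // Column is missing: the row is kept unless filterIfMissing is true.
            return !filterIfMissing;
        }
        if (latestVersionOnly) {
            // Only the newest version is compared.
            return versions.get(0).equals(wanted);
        }
        // Any version, including values from before a modification, may match.
        return versions.contains(wanted);
    }

    public static void main(String[] args) {
        Map<String, List<String>> rows = new LinkedHashMap<>();
        rows.put("row1", Arrays.asList("new", "old"));        // modified from "old" to "new"
        rows.put("row2", Arrays.asList("old"));               // current value is "old"
        rows.put("row3", Collections.<String>emptyList());    // column deleted or absent

        // Compare both flags set to false with both flags set to true.
        for (boolean flag : new boolean[] {false, true}) {
            System.out.println("filterIfMissing=" + flag + ", latestVersionOnly=" + flag);
            for (Map.Entry<String, List<String>> e : rows.entrySet()) {
                boolean kept = rowPasses(e.getValue(), "old", flag, flag);
                System.out.println("  " + e.getKey() + " -> " + kept);
            }
        }
    }
}
```

In this model, searching for "old" with both flags set to false keeps all three rows, including the modified row1 and row3, which lacks the column. With both flags set to true, only row2 is kept, which mirrors the difference between the two scan commands.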