Similar to search engines, text data is queried based on terms. Therefore, you must configure options such as word segmentation (tokens) and case sensitivity.
Determine whether to support case sensitivity when querying raw logs. For example, the raw log is internalError.
- After turning off the Case Sensitive switch, the sample log can be queried based on the keyword INTERNALERROR or internalerror.
- After turning on the Case Sensitive switch, the sample log can only be queried based on the keyword internalError.
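The effect of the Case Sensitive switch can be sketched as a comparison that either keeps or ignores letter case. This is only an illustration of the behavior described above, not the service's implementation; the function name `matches` and its parameters are hypothetical.

```python
def matches(log_term, keyword, case_sensitive):
    """Return True if the keyword matches the log term under the given setting.

    Hypothetical helper: it models the Case Sensitive switch only.
    """
    if case_sensitive:
        # Switch on: the keyword must match the term exactly.
        return log_term == keyword
    # Switch off: compare the lowercase forms of both sides.
    return log_term.lower() == keyword.lower()


raw_log = "internalError"
print(matches(raw_log, "INTERNALERROR", case_sensitive=False))
print(matches(raw_log, "internalerror", case_sensitive=True))
```

With the switch off, both INTERNALERROR and internalerror match the sample log; with the switch on, only internalError does.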
You can separate the contents of a raw log into several keywords by using a token.
For example, the raw log is /url/pic/abc.gif.
- If no token is set, the string is considered as an individual word, /url/pic/abc.gif. You can only query this log by using the complete string or a fuzzy match.
- If / is set as the token, the raw log is separated into three words: url, pic, and abc.gif. You can query this log by using any of the three words or a fuzzy match, for example, pi*. You can also use /url/pic/abc.gif to query this log (during the query, /url/pic/abc.gif is separated into three conditions: url, pic, and abc.gif).
- If /. is set as the token, the raw log is separated into four words: url, pic, abc, and gif.
You can broaden the query range by setting appropriate tokens.
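The token behavior for /url/pic/abc.gif can be approximated with a small splitter. This is a minimal sketch under stated assumptions: `tokenize` and `query` are hypothetical names, every character in `delimiters` acts as a token, the keyword is split with the same tokens, and Python's `fnmatch` stands in for fuzzy match (so `?` and `[...]` are also treated as wildcards here, which the service may not do).

```python
import fnmatch


def tokenize(text, delimiters):
    """Split text on any character in delimiters, dropping empty pieces."""
    if not delimiters:
        # No token set: the whole string is one word.
        return [text]
    words, current = [], []
    for ch in text:
        if ch in delimiters:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


def query(log, keyword, delimiters):
    """Return True if every condition in the keyword matches some log word."""
    log_words = tokenize(log, delimiters)
    conditions = tokenize(keyword, delimiters)
    return all(
        any(fnmatch.fnmatchcase(word, cond) for word in log_words)
        for cond in conditions
    )


raw_log = "/url/pic/abc.gif"
print(tokenize(raw_log, ""))    # one word
print(tokenize(raw_log, "/"))   # three words
print(tokenize(raw_log, "/."))  # four words
print(query(raw_log, "pi*", "/"))
```

Adding . to the tokens lets a keyword such as gif match the log, which is how a wider token set broadens the query range.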
Full text index
By default, full text query (index) treats all the fields and keys of a log, except the time field, as text data, so you do not need to specify keys when querying. For example, the following log is composed of four fields (time/status/level/message):
[20180102 12:00:00] 200,error,some thing is error in this field
- time:2018-01-02 12:00:00
- status:200
- level:error
- message:"some thing is error in this field"
After enabling full text index, the following text data is assembled in the “key:value + space” mode.
status:200 level:error message:"some thing is error in this field"
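The "key:value + space" assembly can be reproduced with a few lines of Python. This is a sketch only: `assemble_full_text` is a hypothetical name, and wrapping values that contain spaces in double quotes is an assumption made to match the sample output above.

```python
def assemble_full_text(fields, time_key="time"):
    """Join every field except the time field as 'key:value' separated by spaces."""
    parts = []
    for key, value in fields.items():
        if key == time_key:
            # The time field is excluded from the full text data.
            continue
        text = str(value)
        if " " in text:
            # Assumption: values containing spaces are quoted, as in the sample.
            text = '"' + text + '"'
        parts.append(key + ":" + text)
    return " ".join(parts)


log = {
    "time": "2018-01-02 12:00:00",
    "status": 200,
    "level": "error",
    "message": "some thing is error in this field",
}
print(assemble_full_text(log))
```

Because the keys are part of the assembled text, a full text query can find this log by a value such as error without naming the field it came from.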