How AutoNavi engineers improve processing efficiency
1. Background
First, some definitions of the terms used in this article.
Intelligence: information in the form of text, pictures or video that is used to solve specific problems in the production of, or navigation with, Gaode Map (Amap). In essence, it is knowledge or facts about roads or traffic, conveyed to specific users within a certain space and time.
User feedback: information that users send back about the software they use, through some medium, including intelligence, suggestions and complaints.
2. Problems and solutions
Users can report feedback through the Amap mobile app or the PC client. When reporting, the user selects some options and adds a text description. In the following example of user feedback, the problem source, major type, subtype and road name are options, and the user description is free text, generally short. These are also the main features available to us.
Every user hopes the problem will be solved, and to hear back promptly, after reporting it. However, hundreds of thousands of feedback items arrive every day, which makes timely responses very difficult.
The current overall process first classifies this feedback with rules. Every item related to roads must then be verified manually to find the type and location of the reported problem, so that the road data can be updated in time for navigation.
Handling one feedback item involves intelligence identification, intelligence positioning, intelligence verification and other steps:
1) Intelligence identification: determine the type of the problem and label the intelligence
Analyze the information reported by the user, including problem source, major type, subtype and user description
View the uploaded pictures, including automatic phone screenshots and user photos
2) Intelligence positioning: find the location of the problem, that is, its coordinates
Analyze the validity of the location recorded at the timestamp of the user's feedback
Check the position of the vehicle when the user reported the problem, that is, the ego-vehicle position
Analyze log information such as the planned route and the actual track recorded while the user was using the software
3) Intelligence verification: the intelligence label and coordinates were determined in the two steps above; this step verifies the label (including the road name)
Analyze imagery, big-data heat maps or the base road-network data
View the material uploaded by the user and the collected multimedia pictures
The whole business processing process is shown in the following figure:
The guiding principle throughout is to fully trust that the user's problem exists. If the reported information is not enough to determine the type and location of the problem, the operator uses log information such as the planned route and actual track to try to reach a conclusion in the user's favor.
At present, the main problems in the feedback handling process are: low accuracy of rule-based distribution; a complex manual verification process that demands high skill and runs at low efficiency; and a serious rate of false kills, where valid feedback is wrongly discarded.
To solve these problems, we wanted to introduce machine learning and improve operational capability in a data-driven way. We proceeded in four steps. First, we broke the business down and classified it hierarchically. Second, we replaced rules with algorithms for classifying intelligence. Third, we split the manual verification process, from an engineering point of view, into intelligence identification, intelligence positioning, intelligence verification and other steps, so that each operator needs only a single skill and can work quickly. Finally, we used algorithms to automate the intelligence identification step that this split produced.
3. Machine learning problem solving
3.1 Business sorting and hierarchical process disassembly
Originally, after feedback was classified by rules, operators carried out identification, positioning and verification manually, finally confirming the problem and which of nearly 100 subcategories it belongs to, and from that its upper-level categories and full classification path.
In other words, the whole process was a single step, and that step was quite complex, demanded high manual skill and was inefficient. Moreover, as the saying goes, there are a thousand Hamlets in a thousand people's eyes: personal subjectivity also affects the judgment.
In view of this, we sorted out and broke down the original business process, hoping to solve parts of it with machine learning and process automation and so improve overall handling efficiency.
First, intelligence is classified as valid or invalid so that invalid items can be filtered out. The remaining process is then divided into six levels: business level 1, business level 2, business level 3, intelligence identification, intelligence positioning and intelligence verification.
As shown in the figure above, the first three levels after the breakdown are intelligence classification. Only the last three levels require some manual intervention; the others are fully automated. Layering, automation and dedicated personnel for each step greatly simplify the problem and improve efficiency.
3.2 Business and model adaptation
User feedback contains both options and free-text input. Options, such as the problem source, have default values, and the user must click to select the matching subcategory. Some users do not have the patience to choose carefully, and even patient users may pick the wrong category because they do not know the classification criteria. The user description is typed in by the user; it is the main way users express what they really mean and the most valuable part of the feedback.
User descriptions fall into three cases: no description, a description that is meaningless, and a description that is meaningful. The first two are called invalid descriptions and the last is called a valid description.
According to the business breakdown, the first step of the process is to filter out invalid items. After that, we separate feedback with valid descriptions from feedback with invalid descriptions and set up a processing flow for each.
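The three cases of user description can be sketched as a small pre-filter. The article does not say how a description is judged meaningless, so the rule used here (fewer than `min_chars` letters or CJK characters) is purely a hypothetical placeholder, as are the function names.

```python
import re

# Hypothetical rule: a description counts as meaningful only if it contains
# at least `min_chars` letters or CJK characters. The real criterion is not
# given in the article.
_CONTENT_CHARS = re.compile(r"[A-Za-z\u4e00-\u9fff]")


def describe_kind(description, min_chars=2):
    """Return 'none', 'meaningless' or 'meaningful' for a user description."""
    if description is None or not description.strip():
        return "none"
    if len(_CONTENT_CHARS.findall(description)) < min_chars:
        return "meaningless"
    return "meaningful"


def is_valid_description(description):
    """Only the 'meaningful' case counts as a valid description."""
    return describe_kind(description) == "meaningful"
```

In practice the "meaningless" check would more likely be a trained classifier or a curated rule set; the point of the sketch is only the three-way split and the valid/invalid grouping.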
1) Feedback with a valid description is classified level by level. Level 1 has three categories: data, product and forwarding. Product and forwarding are processed automatically. At level 2, the data category is divided into road and topic, where topic covers non-road items such as traffic restrictions, walking guidance and cycling.
2) Feedback with an invalid description is classified in the same way and follows the same process, but with a different sample set and model, and without an algorithm step at the end: it is handled manually or by rules.
3) Finally, breaking the business down layer by layer according to actual needs yields the structure that maps business levels to models.
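The routing described in the list above can be sketched as a single function. The category names come from the text; the function name and the returned stage names are hypothetical, and the sketch stops at level 2 because the lower levels are not spelled out.

```python
# Level-1 categories that the text says are processed automatically.
AUTO_LEVEL1 = {"product", "forwarding"}
LEVEL2_OF_DATA = {"road", "topic"}


def final_stage(level1, level2=None, valid_description=True):
    """Return how one feedback item is finally handled (stage names are made up)."""
    if level1 in AUTO_LEVEL1:
        return "automatic"
    if level1 != "data":
        raise ValueError(f"unknown level-1 category: {level1!r}")
    if level2 not in LEVEL2_OF_DATA:
        raise ValueError(f"unknown level-2 category: {level2!r}")
    if valid_description:
        # Valid descriptions end with an algorithmic identification step.
        return "algorithm_identification"
    # Invalid descriptions have no algorithm step at the end.
    return "manual_or_rules"
```

For example, `final_stage("data", "road")` continues to the algorithmic step, while the same item with `valid_description=False` falls back to manual or rule-based handling.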
From this analysis, both intelligence classification and intelligence identification are multi-class text classification problems. We handle each according to the characteristics of its data:
Intelligence classification: although the categories differ at each level, the model architecture can be reused with only minor changes. In addition, historical data that has already gone through the full manual process (intelligence identification, intelligence positioning, intelligence verification and so on) carries the final result as a ground-truth classification label, so the sample set is relatively easy to obtain.
Intelligence identification: its label is an intermediate result produced before intelligence verification, so it can only be labeled manually. Labeling effort had to be found while keeping online production running normally, and resources were very limited. So we first trained the model on the intelligence classification data set as the basis for fine-tuning. Intelligence identification could then be put into use once enough manually labeled samples had accumulated.
3.3 Model selection
First, the unstructured user description must be expressed as a vector, that is, in a vector space model. The traditional approach uses a discrete one-hot style representation in which each word is represented by its tf-idf value and the dimension equals the dictionary size. With a large corpus, however, this representation suffers from data sparsity and dimension explosion.
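A minimal sketch of why this representation becomes sparse: every vector has one slot per dictionary word, but a short description fills only a few of them. The three sample descriptions, the whitespace tokenization and the `log(N/df) + 1` idf variant are assumptions made for illustration (Chinese text would need word segmentation first).

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors); each vector has one slot per vocabulary word."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({w for tokens in tokenized for w in tokens})
    index = {w: i for i, w in enumerate(vocab)}
    # Document frequency: number of documents that contain each word.
    df = Counter(w for tokens in tokenized for w in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        vec = [0.0] * len(vocab)
        for word, count in Counter(tokens).items():
            tf = count / len(tokens)
            # The +1 keeps words that occur in every document from getting zero weight.
            idf = math.log(n_docs / df[word]) + 1.0
            vec[index[word]] = tf * idf
        vectors.append(vec)
    return vocab, vectors


docs = ["road closed ahead", "wrong speed limit", "road name wrong"]
vocab, vectors = tfidf_vectors(docs)
# Share of entries that are exactly zero across all vectors.
zero_share = sum(v == 0.0 for vec in vectors for v in vec) / (len(vocab) * len(vectors))
```

Even in this toy case more than half of the entries are zero; with a real dictionary of tens of thousands of words and descriptions of a few words each, almost every entry is.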
To avoid these problems and to better capture relationships between words, such as semantic similarity and word-order adjacency, we use word embeddings, specifically the word2vec model proposed by Mikolov. The model uses the context of each word to map its meaning into a fixed-dimension vector space, where similarity between vectors reflects semantic similarity between texts. In essence, it can be regarded as an abstract representation of context features.
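Similarity in the embedding space is usually measured with cosine similarity. The sketch below uses made-up 4-dimensional vectors rather than real word2vec output, so the words and numbers are illustrative only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up embeddings, for illustration only.
embeddings = {
    "congestion": [0.9, 0.1, 0.0, 0.2],
    "jam": [0.8, 0.2, 0.1, 0.3],
    "bridge": [0.0, 0.9, 0.7, 0.1],
}
sim_close = cosine_similarity(embeddings["congestion"], embeddings["jam"])
sim_far = cosine_similarity(embeddings["congestion"], embeddings["bridge"])
```

With trained embeddings, words used in similar contexts end up with a high cosine similarity, which is what lets the downstream model treat differently worded descriptions of the same problem alike.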
Second, and most important, is model selection. Compared with traditional statistical learning methods and their complex feature engineering, deep learning is more popular. The model most commonly used in NLP is the recurrent neural network (RNN), which passes its state around a loop within the network. Compared with a feedforward network, it accepts a wider range of sequential inputs and expresses context better. However, training an RNN runs into vanishing or exploding gradients, a problem that the long short-term memory network (LSTM) handles well.
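The gradient problem can be seen in the simplest possible recurrence. This sketch assumes a scalar linear recurrence h_t = w * h_(t-1) + x_t, which is a simplification of a real RNN: the gradient of the last state with respect to the first is then w multiplied by itself once per time step.

```python
def gradient_through_time(w, steps):
    """d h_T / d h_0 for the scalar linear recurrence h_t = w * h_(t-1) + x_t."""
    grad = 1.0
    for _ in range(steps):
        grad *= w  # one chain-rule factor per time step
    return grad


vanishing = gradient_through_time(0.9, 100)  # shrinks toward zero
exploding = gradient_through_time(1.1, 100)  # grows without bound
```

A factor slightly below 1 makes the gradient negligible after 100 steps, and a factor slightly above 1 makes it enormous. The LSTM's gated, additive cell-state update is designed to keep this product from collapsing so quickly.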
3.4 Model architecture
The word vectors of each feedback item are fed into the LSTM, and the output of the last LSTM unit is taken as the text feature. This feature is merged with the user's option selections to form the model input, which passes through a fully connected layer and then a softmax output layer for classification. The resulting real numbers between 0 and 1 are the basis for classification. The multi-class network architecture is shown below:
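The forward pass of this architecture can be sketched in plain Python. This is not the production model: the weights are random, and the layer sizes, the tanh activation in the fully connected layer and all names are assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def lstm_last_hidden(sequence, w_in, w_rec, bias, hidden):
    """Run one LSTM layer over the sequence and return the final hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in sequence:
        z = [a + b + k for a, b, k in zip(matvec(w_in, x), matvec(w_rec, h), bias)]
        i = [sigmoid(v) for v in z[:hidden]]                # input gate
        f = [sigmoid(v) for v in z[hidden:2 * hidden]]      # forget gate
        o = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]  # output gate
        g = [math.tanh(v) for v in z[3 * hidden:]]          # candidate cell state
        c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(word_vectors, option_features, params):
    text_feature = lstm_last_hidden(word_vectors, params["w_in"], params["w_rec"],
                                    params["bias"], params["hidden"])
    merged = text_feature + option_features  # concatenate text and option features
    dense = [math.tanh(v + b)
             for v, b in zip(matvec(params["w_dense"], merged), params["b_dense"])]
    logits = [v + b for v, b in zip(matvec(params["w_out"], dense), params["b_out"])]
    return softmax(logits)


rng = random.Random(0)
EMBED, HIDDEN, OPTIONS, DENSE, CLASSES = 8, 6, 4, 5, 3  # assumed sizes
params = {
    "hidden": HIDDEN,
    "w_in": random_matrix(4 * HIDDEN, EMBED, rng),
    "w_rec": random_matrix(4 * HIDDEN, HIDDEN, rng),
    "bias": [0.0] * (4 * HIDDEN),
    "w_dense": random_matrix(DENSE, HIDDEN + OPTIONS, rng),
    "b_dense": [0.0] * DENSE,
    "w_out": random_matrix(CLASSES, DENSE, rng),
    "b_out": [0.0] * CLASSES,
}
words = [[rng.uniform(-1, 1) for _ in range(EMBED)] for _ in range(5)]  # 5 word vectors
options = [1.0, 0.0, 0.0, 1.0]  # encoded option selections (hypothetical encoding)
probs = classify(words, options, params)
```

`probs` holds one value between 0 and 1 per class and sums to 1; the predicted class is the one with the largest value. A real implementation would use a deep learning framework and learn the weights by backpropagation.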
4. Summary of actual combat experience
After clarifying the business logic, settling the problem-solving steps, confirming sample labeling and scheduling, and getting the first version of the model running, we relaxed, feeling that the problem was more than half solved. All that remained, we thought, was to tune and optimize the model and wait for samples to accumulate, after which the trained model could easily go live.
In practice we faced more problems and difficulties than expected, such as insufficient training data, poor single-model performance and imperfect hyper-parameter settings. The long and difficult process of optimization and iteration had only just begun.
4.1 Fine-tuning
After the model was selected, the first problem for intelligence identification was a severe shortage of samples. We used fine-tuning: taking an already trained network, modifying it slightly and then training it further, which improved the model. As manually labeled samples gradually increased, this gave an improvement of about 3 percentage points across data sets of different sizes.
4.2 Parameter adjustment
Tuning a model's parameters is like cultivating inner strength or refining a golden elixir: a long practice whose result is not guaranteed to be good. We ran nearly 30 groups of tuning experiments and came away with the following hard-won lessons:
1) Initialization is a must. We chose SVD initialization.
2) Dropout is also a must. It effectively prevents over-fitting and also acts somewhat like an ensemble. For LSTM, dropout should be placed before the LSTM layer; for a bidirectional LSTM this is essential, otherwise the model over-fits straight away.
3) For the optimization algorithm, we tried Adam, RMSprop, SGD, AdaDelta and others. RMSprop and Adam performed about the same, but since Adam can be regarded as RMSprop combined with momentum, we chose Adam in the end.
4) Batch size is generally tuned starting from around 128, but bigger is not necessarily better. For each data set, be sure to try a batch size of 64 as well; you may be surprised.
5) Finally, remember to shuffle the data as much as possible.
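Points 4 and 5 can be combined in a small batching helper. This is a generic sketch, not the project's data pipeline; the function name and the 300-sample data set are made up.

```python
import random


def shuffled_batches(samples, batch_size, seed=None):
    """Yield mini-batches from a freshly shuffled copy of `samples`."""
    order = list(samples)
    random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


samples = list(range(300))
for batch_size in (64, 128):  # the two batch sizes mentioned in point 4
    batches = list(shuffled_batches(samples, batch_size, seed=0))
    print(batch_size, [len(b) for b in batches])
```

Calling the helper once per epoch with a different seed (or with `seed=None`) reshuffles the data each time, so the model does not see the samples in the same order twice.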
4.3 Ensemble
To address the insufficient accuracy of a single model, we used an ensemble. After many groups of experiments, we selected the five best models trained under different parameter settings and combined them by voting. Overall accuracy was 1.5 percentage points higher than that of the best single model.
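Voting over five models can be sketched as follows. The article does not say how ties are broken, so the tie rule here (the tied label that appears first in the list of votes wins) is an assumption, as are the function names.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most voted label; on a tie, the tied label that appears first wins."""
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def ensemble_predict(per_model_predictions):
    """per_model_predictions[m][i] is model m's label for sample i."""
    return [majority_vote(list(votes)) for votes in zip(*per_model_predictions)]
```

An alternative design averages the models' softmax outputs instead of counting labels, which keeps more information from each model; the text describes voting, so that is what the sketch shows.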
We also tried changing the model itself, for example using a bidirectional LSTM and different padding methods. The comparison showed little difference for intelligence identification. Our analysis attributed this to the fact that each user describes problems differently and the distribution differences are not pronounced.
4.4 Confidence differentiation
Once structural optimization and parameter tuning of the multi-class intelligence identification model hit a bottleneck, the model's final performance was still some way short of what automation requires. The reasons: the features are incomplete, some features can only be extracted with limited accuracy, the categories are unbalanced, and individual categories have few samples.
To get the algorithm into production anyway, we tried to separate high-confidence from low-confidence results within each category. We explored two methods, a confidence model and per-category thresholds, and finally chose per-category thresholds as the simpler and more efficient of the two.
The confidence model takes the label output of the classification model as input. The samples of each label are split again into a training set and a validation set. The confidence model is trained on these, and only its high-confidence results are applied.
In the confidence model experiments, we tried three approaches: Binary, Weighted Crossentropy and Ensemble. The formula of Weighted Crossentropy is:
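A commonly used form of weighted cross-entropy is given here as an assumption; the exact variant and the weights used in these experiments are not stated. For one sample with one-hot label y, predicted class probabilities p and per-class weights w_c over C classes, with the binary case written out on the second line:

```latex
L = -\sum_{c=1}^{C} w_c \, y_c \, \log p_c
\qquad\text{and, for } C = 2:\qquad
L = -\bigl[\, w_1 \, y \log p + w_0 \, (1 - y) \log (1 - p) \,\bigr]
```

Setting every weight to 1 recovers ordinary cross-entropy; giving a class a larger weight makes errors on that class cost more, which is one way to compensate for unbalanced categories.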
In the experiments, the Binary approach brought no obvious improvement. Ensemble achieved a high recall at 95% confidence but did not reach the 98% confidence level.
When the intelligence classification model went into production, it made high-confidence judgments by setting different softmax thresholds for different categories, that is, per-category thresholds. We applied a similar method to intelligence identification, and the results exceeded those of the earlier high-confidence model, so we chose this method in the end. It greatly improves operators' efficiency. To reduce operational complexity further, we also provide top-N recommendations for the low-confidence part, saving as much operating time as possible.
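Per-category thresholds with a top-N fallback can be sketched as below. The label names, threshold values, default threshold and result format are all hypothetical; in practice each threshold would be chosen on a validation set to meet the precision required for that category.

```python
def decide(probabilities, thresholds, top_n=3, default_threshold=0.9):
    """Split one prediction into an automatic label or a top-N shortlist.

    probabilities: dict mapping label -> softmax score
    thresholds:    dict mapping label -> minimum score for automation
    """
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    best_label, best_score = ranked[0]
    if best_score >= thresholds.get(best_label, default_threshold):
        return {"mode": "auto", "label": best_label}
    # Low confidence: hand the operator the N most likely labels.
    return {"mode": "manual", "candidates": [label for label, _ in ranked[:top_n]]}


thresholds = {"road_closed": 0.80, "wrong_name": 0.95}  # made-up values
high = decide({"road_closed": 0.85, "wrong_name": 0.10, "other": 0.05}, thresholds)
low = decide({"road_closed": 0.06, "wrong_name": 0.90, "other": 0.04}, thresholds, top_n=2)
```

The same score can be automated for one category and sent to an operator for another: 0.85 clears the threshold for `road_closed`, while 0.90 does not clear the stricter one for `wrong_name`, so the operator gets a two-item shortlist instead.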
5. Algorithm effect and application results
5.1 Intelligence classification
Algorithm effect: measured against actual application requirements, the intelligence classification algorithm achieves an accuracy above 96% for the product class and a recall of up to 99% for the data class.
Application results: together with other strategies, the overall automation rate increased significantly. After rule optimization, the number of operators in actual use fell significantly and the unit operation cost dropped by 4/5, removing the bottleneck in back-end processing of user feedback.
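For reference, per-class figures like the ones quoted above are typically computed as follows. The sketch assumes that "accuracy" for a class means precision (the share of items predicted as that class that truly belong to it); the article does not define the metric, and the sample labels are made up.

```python
def precision_recall(y_true, y_pred, label):
    """Precision and recall of `label`, given parallel lists of true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


y_true = ["data", "data", "product", "product", "forwarding"]
y_pred = ["data", "product", "product", "product", "data"]
product_precision, product_recall = precision_recall(y_true, y_pred, "product")
data_precision, data_recall = precision_recall(y_true, y_pred, "data")
```

High precision matters most for classes that are processed automatically, since a wrong automatic decision is never reviewed; high recall matters for the data class, since a missed data item means a road problem that never reaches an operator.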
5.2 Intelligence identification
Algorithm effect: with the high-confidence part automated and the low-confidence part labeled manually, the intelligence identification algorithm achieves an accuracy above 96% on valid descriptions.
Application results: after the intelligence label classification model was connected to the platform, handling high- and low-confidence labels differently improved operator efficiency by more than 30%.
6. Summary and outlook
Through this project, we have developed a methodology for solving complex business problems effectively and gained practical experience in combining NLP algorithms closely with the business. These methods and lessons have since been applied in other projects and continue to be refined. While continuing to improve user satisfaction, we aim to handle problems as efficiently and automatically as possible and to perfect every detail of the product; that is our motivation and our lasting goal.
Let me explain the proper nouns in this article first.
Intelligence: It is a kind of information such as text, picture or video, which is used to solve specific problems in the production or navigation of Gaud map. Essentially, it refers to the knowledge or facts related to road or traffic, which is notified to specific users through certain space and time.
User feedback: refers to the user providing some feedback information on the software used, including intelligence, suggestions and complaints, with the help of certain media.
2. Problems and solutions
The way of user feedback can be reported through the Amap terminal and PC terminal of the mobile phone. When reporting, select some options and text descriptions to report the problem. The following is an example of user feedback, where the problem source, major type, subtype and road name are options, and the user description is filled in, which is generally short text. These are also the main features we can use.
Each user hopes to solve the problem and receive feedback in a timely manner after reporting the problem. However, the number of users' feedback is hundreds of thousands every day. It is very difficult to achieve the goal of timely feedback.
For these user feedback information, the current overall process is to first use rules to classify, in which each feedback related to the road must be manually verified to find the type of problem reported by the user and the location of the problem, and update the road data in time for navigation.
A specific feedback operation needs to go through intelligence identification, intelligence positioning, intelligence verification and other links:
1) Intelligence identification is mainly used to determine the type of problem and label the information
Analyze the information reported by users, including problem sources, major types, subtypes, user descriptions, etc
View uploaded picture data, including automatic screenshots of mobile phones and user photos
2) Intelligence positioning is mainly to find the location information of the problem, namely the positioning coordinates
Analyze the validity of the location of the time stamp of the user feedback problem
Check the position of the vehicle when the user reports the problem, that is, the self-driving position
Analyze log information such as user's planning and actual track in the process of using the software
3) Intelligence verification: the information label and location coordinates are determined through the above two steps, and the information label (including the road name) needs to be verified in this step
Analyze image and big data thermal map or basic data of road network
View the data uploaded by the user and the multimedia pictures collected
The whole business processing process is shown in the following figure:
The principle in the whole process of handling user feedback is to fully believe that the user's problem exists. If the information reported by the user is not enough to judge the type and location of the problem, the user will try to draw a conclusion that is biased towards the user through the log information such as user planning and actual track.
At present, the main problems in the whole user feedback problem processing process are: low accuracy rate of rule distribution, complex manual verification process, high skill requirements and low efficiency, and serious false killing.
In order to solve the above problems, we hope to introduce the method of machine learning to improve the operation ability in a data-driven way. In the process of exploring the specific realization of the goal, we first disassemble and classify the business hierarchically, then use algorithms to replace rules to classify the information, and then the engineering disassembly manual verification process is divided into intelligence identification, intelligence positioning, intelligence verification and other steps to realize the rapid operation of individual skills, and finally use algorithms to automate the intelligence identification steps after the engineering disassembly.
3. Machine learning problem solving
3.1 Business sorting and hierarchical process disassembly
After the original user feedback problem is classified by rules, the artificial intelligence identification, location and verification are carried out, and finally the problem and which of the nearly 100 sub-classification items it belongs to are confirmed, and then the corresponding relationship between the upper level classification and the whole level is determined.
It can be seen from this that the whole problem processing process has only one step, and the processing process is quite complex, requiring high manual skills and low efficiency. Moreover, there are a thousand Hamlets in the eyes of a thousand people, and personal subjectivity will also affect the judgment of the problem.
In view of this situation, we sorted out and disassembled the original business processes, hoping to solve some of them by machine learning and process automation, and improve the efficiency of overall problem handling.
First, the classification of effective intelligence and invalid intelligence is to eliminate the invalidity, and then the whole process is divided into six levels, including business level 1, business level 2, business level 3, intelligence identification, intelligence positioning and intelligence verification.
As shown in the figure above, the first three levels after disassembly are intelligence classification. Only the last three levels require partial manual intervention, and other levels are directly automated. In this way, problems are greatly simplified and efficiency is improved through such methods as hierarchy, automation and dedicated personnel.
3.2 Business and model adaptation
We can see that there are both options and input items in the user feedback. The options, such as the source of the problem, have default values. You need to click and select the corresponding breakdown items. The user may not have the patience to select carefully. The patient user may not be able to select the correct classification because he does not know the specific classification criteria. User description is the content that needs to be manually entered by users, the main way for users to express their true intentions, and the most valuable content of user feedback.
User descriptions are generally divided into three situations: no description, description but meaningless, description and meaningful. The first two are called invalid descriptions, and the latter is called valid descriptions.
According to the results of business disassembly, the first step of the business process is to remove invalidity. After that, we will distinguish the user feedback of valid and invalid descriptions, and establish corresponding processes for processing.
1) The user feedback of effective description is classified level by level. The first level is divided into three categories: data, product, and forwarding. The product and forwarding are directly processed automatically. The data category is divided into road and topic in the second level. The topic refers to non-road traffic restriction, step guidance, cycling, etc.
2) The user feedback of invalid description is classified in the same way and goes through the same process, but the sample set and model are different, and there is no algorithm processing step at the end. It is directly handled manually or by rules.
3) Finally, according to the actual business needs, the structure of business and model adaptation is formed after layer by layer disassembly.
From the above analysis, it can be seen that intelligence classification and intelligence recognition are multi-classification text classification problems. We carry out corresponding operations according to different data characteristics:
Although the classification of intelligence is different at each level, the model architecture can be reused, and only minor changes need to be made. Moreover, there are historical data sets that have been manually verified before (including intelligence identification, intelligence positioning, intelligence verification and other processes) and have the final result as the classification label as the true value, and the sample set is relatively easy to obtain.
The classification label of intelligence identification is the intermediate result before the intelligence verification, which can only be labeled manually. On the premise of ensuring the normal production on the line, human resources should be allocated to label as much as possible, and resources are very limited. So we first do Finetuning on the intelligence classification data set to train the model. Then the application of intelligence recognition can be carried out after the number of manually labeled samples has accumulated to a certain level.
3.3 Model selection
First, the unstructured text user description is expressed as a vector form, namely a vector space model. The traditional approach is to directly use discrete feature one-pot representation, that is, use tf-idf value to represent words, and the dimension is dictionary size. However, when the number of statistical samples in this representation is large, data sparsity and dimension explosion will occur.
In order to avoid similar problems and better reflect the relationship between words, such as semantic similarity and word order adjacency, we use the word embedding method, that is, the word2vec model proposed by Mikolov. This model can map the semantics of words into a fixed vector space through the context structure information of words, and its similarity in the vector space can represent the semantic similarity of text, In essence, it can be regarded as an abstract representation of context characteristics.
Secondly, the most important is model selection. Compared with the complex feature engineering steps of traditional statistical learning methods, the deep learning method is more popular. The most commonly used in NLP is the cyclic neural network RNN, which circulates the state in its own network. Compared with the feedforward neural network, it can accept a wider range of time series structure inputs and better express the context information, However, in the process of training, there will be problems such as gradient disappearance or gradient explosion, and the long and short term memory network LSTM can solve this problem very well.
3.4 Model architecture
Take the word vector result of each user's feedback information as the input of LSTM, and then take the result of the last unit of LSTM as the text feature, merge with the other user's choice question as the model input, and then use softmax as the output layer for classification after passing through the full connection layer. The real number between 0 and 1 obtained is the basis of classification. The multi-category network architecture is shown below:
4. Summary of actual combat experience
After clarifying the business logic, determining the problem-solving steps, confirming the sample labeling and scheduling, and running through the first version of the model, we feel relieved that the problem should have been solved more than half, and the rest is to do model adjustment and optimization, and wait for sample accumulation, so that the model can be easily launched after training.
However, the actual situation is faced with more problems and difficulties than expected, such as insufficient training data, poor effect of single model, and imperfect setting of hyper-parameters. The long and difficult optimization and iteration process has just begun.
4.1 Fine-tuning
After selecting the model, the first problem faced by intelligence recognition is that the sample size is seriously insufficient. We use the Fine-tuning method to slightly modify the trained model on the network before training to improve the effect of the model. With the gradual increase of manually labeled samples, we can achieve an improvement of about 3 percentage points in different size of data sets.
4.2 Parameter adjustment
The parameter adjustment of the model is a process of cultivating internal skills and refining the golden pill. In fact, the effect is not necessarily good. We have conducted nearly 30 groups of parameter adjustment experiments, and obtained the following valuable experience full of blood and tears:
1) Initialization is a must. We choose SVD initialization
2) Dropout must also be used to effectively prevent over-fitting, as well as the function of Ensemble. For LSTM, the drop out position should be placed before LSTM, especially for bi-directional LSTM, which must be done, otherwise it will be over-fitted directly.
3) As for the selection of optimization algorithm, we tried Adam, RMSprop, SGD, AdaDelta, etc. In fact, the effect of RMSprop and Adam is not much different, but based on Adam, it can be considered as the combination of RMSprop and Momentum, and finally chose Adam.
4) The batch size is generally adjusted from around 128, but the bigger the better. For different data sets, you must also try the case where the batch size is 64. You may be surprised.
5) The last thing to remember is to shuffle the data as much as possible.
4.3 Ensemble
To solve the problem of insufficient accuracy of a single model, we adopted the Ensemble method. After many groups of experiments, we finally selected five of the best models trained at different parameter settings to do Ensemble by voting, and the overall accuracy rate was 1.5 percentage points higher than the single optimal model.
In addition, in order to optimize the effect of the model, we also tried to adjust the model, such as two-way LSTM and different Padding methods. After comparison, it was found that there was little difference in intelligence recognition. After analysis, it was caused by the different ways each user described the problem and the inconspicuous distribution difference.
4.4 Confidence differentiation
Once structural optimization and parameter tuning of the intelligence recognition multi-class model had hit a bottleneck, the model's final performance still fell short of what automation requires. The reasons are that the features are incomplete, the accuracy of some engineered features is limited, the categories are imbalanced, and individual categories have few samples.
To get the algorithm into production, we tried to distinguish confidence levels within each category. We explored two methods, a confidence model and per-category thresholds, and finally chose the simple and efficient per-category threshold method.
The confidence model takes the labels output by the classification model as its input. The sample set of each label is re-split into a training set and a validation set. After training, the resulting confidence model is applied and its high-confidence results are used.
For the confidence model, we experimented with three approaches: Binary, Weighted Crossentropy and Ensemble. The formula of Weighted Crossentropy is:
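The formula itself is not reproduced in this text. For orientation only, a commonly used form of weighted cross-entropy for a binary confidence decision is shown below. The symbols are assumptions, not the authors' notation: N is the number of samples, y_i is 1 when the classifier's label for sample i is correct and 0 otherwise, p_i is the predicted probability that it is correct, and w_+ and w_- are the class weights.

```latex
L = -\frac{1}{N} \sum_{i=1}^{N} \Big[ w_{+}\, y_i \log p_i + w_{-}\, (1 - y_i) \log (1 - p_i) \Big]
```

Setting w_- larger than w_+ penalizes wrong answers that are trusted more heavily than correct answers that are doubted, which is the usual motivation when high-confidence precision matters more than recall.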
The experiments showed that the Binary approach brought no obvious improvement. The Ensemble approach achieved a high recall rate at 95% confidence, but it did not reach the 98% confidence level.
When the intelligence classification model was put into production, it made high-confidence judgments by setting a different softmax threshold for each category, that is, by setting thresholds per category. We applied the same method to intelligence recognition, and the results exceeded those of the confidence model built earlier, so we finally chose this method. It greatly improves the efficiency of operators. To further reduce the complexity of their work, we also provide top-N recommendations for the low-confidence part, saving as much operation time as possible.
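The per-category threshold with a top-N fallback can be sketched as follows. This is a minimal illustration, not the production logic: the function name `route`, the category names, the threshold values and the choice to never automate a category that has no configured threshold are all assumptions.

```python
def route(probabilities, thresholds, top_n=3):
    """Decide between automatic labeling and manual review for one sample.

    probabilities: dict mapping category -> softmax probability.
    thresholds:    dict mapping category -> minimum probability required
                   to accept that category automatically.
    Categories without a threshold are never automated (assumed policy).
    """
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    best_label, best_prob = ranked[0]
    if best_prob >= thresholds.get(best_label, float("inf")):
        return "auto", [best_label]
    # Low confidence: hand the operator the top-N candidates instead.
    return "manual", [label for label, _ in ranked[:top_n]]

# Hypothetical categories and thresholds.
thresholds = {"road_closed": 0.90, "speed_limit": 0.98}

confident = {"road_closed": 0.93, "speed_limit": 0.05, "wrong_name": 0.02}
uncertain = {"road_closed": 0.04, "speed_limit": 0.90, "wrong_name": 0.06}

print(route(confident, thresholds))
print(route(uncertain, thresholds, top_n=2))
```

The same probability of 0.90 is accepted for one category and sent to manual review for another, which is the point of tuning the threshold per category rather than globally.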
5. Algorithm effect and application results
5.1 Intelligence classification
Algorithm effect: in line with the actual application requirements, the intelligence classification algorithm achieves an accuracy rate of more than 96% on the product class and a recall rate of up to 99% on the data class.
Application results: Combined with other strategies, the algorithm has significantly increased the overall automation rate. After rule optimization, the production system significantly reduced the number of operators and cut the unit operation cost by 4/5, removing the bottleneck in back-end processing of user feedback.
5.2 Intelligence identification
Algorithm effect: with the strategy of automating the high-confidence part and manually labeling the low-confidence part, the intelligence recognition algorithm achieves an accuracy rate of more than 96% on effective descriptions.
Application results: After the intelligence label classification model was connected to the platform, handling high-confidence and low-confidence labels differently improved the efficiency of operators by more than 30%.
6. Summary and outlook
Through this project, we have developed a methodology for effectively solving complex business problems and gained practical experience in closely combining NLP algorithms with the business. These methods and lessons have already been applied successfully in other projects and continue to be accumulated and refined. While continuously improving user satisfaction, we aim to handle problems as efficiently and automatically as possible and to perfect every detail of the product. This is our driving force and our enduring goal.