This article is Part 2 of a series on interesting tasks in the field of code intelligence. Each entry gives a brief introduction to one task, its history, and its current state, in the hope of giving readers a deeper understanding of code intelligence.
Part 1: Learning about Intelligent Code Completion
The topic of this article is code defect detection: determining whether a piece of code contains bugs. This may sound profound, but it amounts to drawing a conclusion from existing experience and historical data. Isn't that a bit like fortune telling? Note: In this article, defect detection refers only to checking whether bugs exist. Defect location and repair are not included.
We might imagine defect detection looking like this:
…or something like this:
However, defect detection can only tell whether a piece of code contains a bug. It cannot tell where the bug is or what it is. As a result, defect detection on its own is not particularly useful during development or testing. Imagine being told that a piece of code has a bug without being told what the bug is. Isn't that annoying? It gets worse when the prediction itself is uncertain: if the problem can then be located, fine, but if it cannot, the warning is only more frustrating. That leads to another question: "Do you think there are bugs or not?" The only possible answer is, "It's not what you think, it's what I think."
Let's get back to the topic. Defect detection is not aimed at development scenarios. It is more useful during code review (CR): if it reports a high probability of problems in the current file, the reviewers know to pay special attention to that file. Although its practical effect is modest, it remains an interesting small task that has long existed in the field of code intelligence.
The history of defect detection is as long as the history of defect definition. As soon as defects were defined, people began thinking about how to detect or avoid them with tools. For example, the compiler of a compiled language catches some static errors at compile time and refuses to build the program. Code analysis tools and static scanning tools can also find some defects in advance. These all belong to defect identification and location, which we will discuss later.
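To make the idea of static scanning concrete, here is a minimal sketch in Python. It is not taken from any particular tool: the function name `scan_source` and the two rules it checks (a bare `except:` and a mutable default argument) are assumptions chosen only for illustration. It relies solely on the standard `ast` module, and it inspects the code without running it.

```python
import ast


def scan_source(source):
    """Return a sorted list of (line, message) findings without running the code."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        # Comparable to a compiler rejecting the program outright.
        return [(exc.lineno, f"syntax error: {exc.msg}")]

    findings = []
    for node in ast.walk(tree):
        # Illustrative rule 1: a bare `except:` silently swallows every error.
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            findings.append((node.lineno, "bare except hides errors"))
        # Illustrative rule 2: a mutable default argument is shared across calls.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for default in node.args.defaults + node.args.kw_defaults:
                if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                    findings.append(
                        (default.lineno, f"mutable default argument in {node.name}()")
                    )
    return sorted(findings)


if __name__ == "__main__":
    sample = (
        "def add(item, bucket=[]):\n"
        "    try:\n"
        "        bucket.append(item)\n"
        "    except:\n"
        "        pass\n"
        "    return bucket\n"
    )
    for line, message in scan_source(sample):
        print(f"line {line}: {message}")
```

Unlike the learned detection discussed in this article, a rule of this kind reports both the location and the type of the problem, but it only finds the patterns someone has written a rule for.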
A common approach to defect detection is to extract useful features from historical code and train a prediction model on those features, then use the model to predict defects in subsequent code. The unit of training and prediction can be a snippet, a file, or a single commit. Compared with files, predictions on snippets and commits are relatively more effective and make defects easier to locate. A file, by contrast, is large, so a positive prediction for a whole file makes the defect harder to locate.
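The approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of any specific paper or product: the three per-commit features (lines changed, files touched, earlier fixes in the touched files), the tiny `HISTORY` data set, the scaling constants, and the choice of logistic regression are all made up for the example.

```python
import math

# Hypothetical history: (lines_changed, files_touched, past_fixes_in_touched_files)
# paired with a label, 1 if the commit was later linked to a bug fix, else 0.
# Every value here is invented for illustration.
HISTORY = [
    ((12, 1, 0), 0),
    ((8, 1, 0), 0),
    ((30, 2, 1), 0),
    ((25, 1, 0), 0),
    ((450, 9, 6), 1),
    ((380, 7, 4), 1),
    ((510, 12, 5), 1),
    ((290, 6, 7), 1),
]


def scale(features):
    """Bring the raw counts into a similar numeric range (assumed constants)."""
    lines, files, fixes = features
    return (math.log1p(lines) / 7.0, files / 10.0, fixes / 10.0)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=2000, learning_rate=0.5):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0, 0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            x = scale(features)
            predicted = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = predicted - label
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def defect_probability(model, features):
    """Estimate how likely a new commit is to contain a defect."""
    weights, bias = model
    x = scale(features)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


if __name__ == "__main__":
    model = train(HISTORY)
    for commit in [(400, 8, 5), (10, 1, 0)]:
        print(commit, round(defect_probability(model, commit), 3))
```

Note that the model returns only a probability for the whole commit. It says nothing about where the defect is or what it is, which matches the limitation discussed earlier.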
Common features include the following:
The first three are mostly quantitative metrics. They are less relevant to defects, but they are easier to obtain and aggregate. The other two are more relevant to defects but more difficult to obtain.
As mentioned earlier, defect detection on its own does not bring us much. It needs to be used together with defect location and defect repair technologies. In recent years, with the rapid development of deep neural networks, there have been many papers and studies on defect detection and repair based on deep models. Some related products have been released, such as Microsoft's DeepDebug, which claims to repair Python defects automatically. It has not been used in practice, so its effectiveness is still unknown. However, we can guess that, like code completion, it would serve as an assistant, most likely at the Pull Request stage. There is still a long way to go before practical application, but we must firmly believe that technology keeps developing. What if, one day, we have a robot that fixes bugs automatically?
Alibaba F(x) Team - July 28, 2022