How Feature Engineering Automation Brings Major Changes to Machine Learning

From amateur programmer to professional programmer
When I first entered the industry as a programmer, I felt that the most important thing was to turn myself into a professional programmer.

I started programming a lot later than my peers, let alone young people these days. I majored in biology in college and knew almost nothing about computers before then. During military training, out of boredom, my roommate and I went to the school's computer room every day to play. I still remember it vividly: the first time I walked in, someone asked whether I wanted to use Windows or DOS, and I was completely lost. All I remember after that is a room full of people practicing touch typing. By the end of military training I could more or less touch type myself. That is how I developed a strong interest in computers and picked up some understanding of computer hardware.

After my sophomore year, I bought some books and started learning the then-popular "web page Three Musketeers" tools. I learned the basics, such as hand-writing HTML and using PS. After classes and during summer vacation I started building websites for other people (back then, making websites paid really well). After a year or so, it was getting hard to get by on static web pages alone and hard to find an internship, so I started learning ASP, writing simple CRUD code, and building dynamic programs such as message boards and forums. That stage is probably when I really started programming.

After graduation, I joined a government software company in Shenzhen, where a very reliable leader gave me plenty of room. That let me grow well in those years and eventually become a professional programmer.

Generally speaking, amateur or semi-professional programmers mostly work alone or in a very small team, so they usually fall well short in development process and in collaboration tools (such as jira and cvs/svn/git). Professional programmers are far more disciplined in this regard. In addition, systems built by professional programmers usually run for a long time, so maintainability gets special attention, something I came to understand much more deeply after joining Ali. A system that has been running for 10 years is obviously very different from one written for fun.

It is difficult for me to explain this clearly; I only have a vague sense of the concept. As long as the interest is there, I don't think the jump from amateur programmer to professional programmer is too difficult.

Development of programming skills
As a programmer, the most important ability is always programming ability. As far as my own feelings are concerned, I think the growth of programming ability mainly includes the following parts:

1. Primary programming ability: being able to use a language

Programming starts with learning the basics of a programming language. Whatever the language, much of this basic knowledge is shared, such as how to write the first Hello World, if/while/for, variables, and so on. I therefore recommend that when you are just starting to learn a language, you read its own documentation rather than jumping straight into advanced books. When I was learning Java, I went straight to Thinking in Java, Effective Java, and the like, and found them really hard to understand.

Beyond reading documentation, programming is a highly practical activity, so be sure to write a lot of code; that is the only way to become truly proficient. This is why I still think it is important to have candidates write code during an interview: it makes it very easy to judge how fluent someone is at writing code. Many people say that writing code depends heavily on the IDE, which makes writing by hand difficult, but I firmly believe that anyone who has written a lot of code will have no trouble hand-writing a working piece of code that is not too complicated. Even someone like me, who has not written code for more than three years, can still hand-write a not-too-complicated, runnable Java program without a problem. The preceding N years of coding have etched many things deep into my bones.

I think this stage of primary programming ability will not be a problem for most programmers. Studying and practicing hard is the core of this stage.

2. Intermediate programming ability: being able to troubleshoot and avoid problems

Beyond being able to use a programming language proficiently to solve problems, as at the primary level, I think the first thing at the intermediate level is to improve your ability to troubleshoot problems.

Running into problems while writing code is perfectly normal. How effectively and efficiently you can troubleshoot them is usually where programmers feel the biggest gap in programming ability.

People with strong troubleshooting skills easily earn high recognition in the programmer community. The first things to master are basic debugging skills and good debugging tools. In Java, these include jstat, jmap, and jinfo, which come with the JDK, and tools outside the JDK such as mat, gperf, and btrace. To do a good job, you must first sharpen your tools, and troubleshooting is a very typical case: sometimes the difference between two people's troubleshooting ability is simply that one of them knows one more tool.

Beyond debugging skills and tools, the next level of troubleshooting is understanding the underlying principles. A programmer who understands the principles is clearly ahead of others in troubleshooting. I think many people have had the experience of finding the cause of a problem only because a tool happened to work, without knowing why.

I have trained many people at Ali in how to troubleshoot Java problems. In that training I often talk about how to build troubleshooting skills, and the most important thing is practice: write some programs of your own that are likely to go wrong, actively watch how others troubleshoot, and actively take part in troubleshooting yourself. Most people who end up very good at it got there for one reason: as the saying goes, there is no secret, only practiced hands.
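As one way to "write a program that is likely to go wrong", here is a minimal sketch of a deliberate deadlock to practice on. The class and method names are hypothetical, and the two-lock scenario is my own choice of exercise, not one taken from the training described above. It also uses the JDK's `ThreadMXBean` to confirm that the deadlock really happened.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

/** Hypothetical practice program: two threads take the same two locks in opposite order. */
final class DeadlockPractice {

    /** Starts two daemon threads that deadlock on purpose; returns how many deadlocked threads the JVM reports. */
    static int createAndDetect() {
        final Object lockA = new Object();
        final Object lockB = new Object();
        // Ensures each thread already holds its first lock before either asks for the second.
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread t1 = new Thread(() -> lockInOrder(lockA, lockB, bothHoldFirstLock), "practice-1");
        Thread t2 = new Thread(() -> lockInOrder(lockB, lockA, bothHoldFirstLock), "practice-2");
        t1.setDaemon(true); // daemon threads do not keep the JVM alive once main returns
        t2.setDaemon(true);
        t1.start();
        t2.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] deadlocked = threads.findMonitorDeadlockedThreads();
            if (deadlocked != null) {
                return deadlocked.length;
            }
            try {
                Thread.sleep(50); // give both threads time to block on their second lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try {
                latch.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // Never reached: the other thread holds this lock and is waiting for ours.
            }
        }
    }
}
```

With a program like this running, you can attach the usual JDK tools to the process and see what a deadlock looks like from the outside before you meet one in production.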

My own troubleshooting ability improved mainly in 2009 and 2010. During those two years, as a member of the Taobao Fire Brigade (a virtual team that dealt with all kinds of problems and failures), I handled a lot of failures and problems. The fire brigade also included Duolong, Ali's most widely recognized technical guru, from whom I learned a lot of troubleshooting skills. Compared with him, my troubleshooting ability was at the beginner level.

The case that impressed me most was when we investigated an application with high CPU us (user-mode CPU usage). After we had both located a piece of code that caused the high CPU us for certain input parameters, the only way I could think of to continue was to capture the input parameters in the production environment and then debug locally with them to find the cause. Duolong, however, read the code for a while and then handed me an input parameter. I ran the code with it, and sure enough, the CPU us was very high! This happened more than once or twice. So I often tell people that I need the scene of a problem in order to troubleshoot it, whereas Duolong may well spot the problem just by reading the code. That is the essential gap.

In addition to checking for problems, more powerful programmers will be very good at avoiding problems in the process of writing code. The easiest thing to understand is dealing with exceptions when writing code, and this is often where the gap between programmers is large.

When writing the happy-path logic of a piece of code, the gap between programmers is usually not large, even where there is one. The gap in skill becomes very obvious in how they handle the exceptions that can occur along the way. In many cases, the code that handles exceptional cases is longer than the code for the normal logic.

I often say that to see the gap between an excellent programmer and an ordinary one, in many cases you don't need architecture diagrams flying all over the place; a small piece of code is enough.

Here is a small case to give you a feel for it. There was a serious failure one year, and the cause turned out to be this: one of the input parameters was an array, and the values in the array were used as parameters for a database query. A caller passed in a very large array, so a large amount of data was fetched from the database and the process ran out of memory. Many programmers now understand the need for protective checks on input and output parameters, but I really have run into many cases like this.
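The kind of protective check this case calls for might look like the following minimal sketch. The class name, method name, and both limits are hypothetical values chosen for illustration; the code from the actual incident is not shown in this article.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical guard for a method that looks up database rows by a caller-supplied id array. */
final class BatchQueryGuard {
    // Assumed limits; real values should come from measuring the actual system.
    static final int MAX_IDS_PER_CALL = 1000;
    static final int MAX_IDS_PER_QUERY = 200;

    /** Rejects null or oversized input, then splits the ids into bounded batches for querying. */
    static List<List<Long>> validateAndSplit(long[] ids) {
        if (ids == null) {
            throw new IllegalArgumentException("ids must not be null");
        }
        if (ids.length > MAX_IDS_PER_CALL) {
            throw new IllegalArgumentException(
                    "too many ids: " + ids.length + " (limit " + MAX_IDS_PER_CALL + ")");
        }
        List<List<Long>> batches = new ArrayList<>();
        for (int start = 0; start < ids.length; start += MAX_IDS_PER_QUERY) {
            int end = Math.min(start + MAX_IDS_PER_QUERY, ids.length);
            List<Long> batch = new ArrayList<>(end - start);
            for (int i = start; i < end; i++) {
                batch.add(ids[i]);
            }
            batches.add(batch);
        }
        return batches;
    }
}
```

Each returned batch would then drive one bounded query, so neither a single query nor the whole call can pull an unbounded amount of data into memory. Note how the checks already outweigh the "normal" logic, which is just the loop.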

At the intermediate stage, I would recommend that everyone deliberately cultivate their abilities in these two aspects as much as possible, and become an excellent programmer who can write high-quality code and troubleshoot problems effectively.

3. Advanced programming ability: understanding advanced APIs and principles

As far as my own experience is concerned, I only started to learn and master some of Java's more advanced APIs in more detail after writing Java code for many years, and I believe that the same is true for most Java programmers.

I started writing business-system code in Java in 2003, but it was not until I joined Taobao in 2007 that I began to study Java's IO, communication, and concurrency APIs seriously. I had learned and written some such code before, but it only scratched the surface. Of course, this mostly comes down to the work itself: most programmers writing business systems rarely need these APIs, which makes them hard to learn well, yet they are crucial to truly understanding a programming language.

I also covered this in my earlier article on the programmer's growth path. When your work does not provide the scenario, you have to create one yourself in order to learn well. As long as you are interested enough, I don't think this is a big problem; after all, there is plenty of open source out there that can help you create learning opportunities. For example, to learn Java NIO, you can build a small framework on top of NIO and then compare it with Netty to see where it falls short. That is very helpful for real understanding.

While learning advanced APIs and troubleshooting problems, I came to understand more and more how important it is to know how a programming language actually works. So at a later stage I started to learn Java's compilation mechanism, memory management, threading mechanism, and so on. For someone like me without a formal computer science background, these are much harder to learn because the foundation is missing. But after learning these more fundamental things, my programming ability improved qualitatively, as did my ability to pick up other programming languages later. I think the best way to learn these principles is to first read some books on the topic and then go to the source code, which gives you a much firmer grasp. Finally, apply the principles you have learned when you write code and troubleshoot problems, and you will not forget them even after N years.

I don't think there are any shortcuts to programming ability, and I fully agree with the 10,000-hour theory. At the intermediate and advanced stages, it helps a great deal to have someone give you pointers or to work alongside good programmers. I think this is a bit like school: after a certain stage (high school, say), talent becomes the main dividing line. But as in most industries, in most cases things never reach the point where talent is what counts. Just work hard.

Growth of system design capabilities
Except for a few programmers who will enter specialized fields, such as Linux Kernel and JVM, most other programmers will need to grow in system design ability in addition to the growth of programming ability.

Usually, a programmer with good programming ability will start to undertake the work of a module after a certain stage, and then undertake a subsystem, a system, a larger system that spans multiple fields, etc.

In the third year of work, I myself began to undertake the design and implementation of a process engine, a system that was not small, and was also a core part of the project at that time. At that stage, I learned some basic knowledge of system design, such as the need to think clearly about the goals of the entire system, the division and responsibilities of modules, key object design, etc., instead of writing code right away. But at that time, since I was writing the whole system by myself, I didn't have a strong sense of design.

In the years after that I was in charge of several systems, but overall I feel I did not grow much in system design. It was not until my time at Alibaba that I gained more and more experience in it. (See my article "14 mistakes I made in system design" for the many detours I took.)

I once gave a talk inside Ali about my growth in system design ability. It came mainly from three experiences: being responsible for the design of a system in one specialized field -> being responsible for the design of a system spanning several specialized fields -> being responsible for the design of the architectural transformation of Ali's e-commerce system.

The first experience was being in charge of HSF. HSF was built from scratch, mainly as a framework for supporting services. It is a highly specialized system, and from the perspective of Taobao's e-commerce system as a whole it is actually a very small subsystem. Three things from this experience left the deepest impression on me:

1). To design a system in such a specialized field, depth of domain knowledge is very important. When I first designed several of HSF's frameworks, I had not worked out how service consumers/providers should be integrated with the existing frameworks. I also went back and forth several times on the design of load balancing, mainly because I had not mastered the field deeply enough;

2). Being too technology-driven. At one stage of HSF, out of technical idealism, one version put a lot of effort into introducing OSGi and making the framework dynamic. This later proved to be a very, very wrong decision. It taught me that when designing a system you must be clear about the goal, and that the goal must fit the company's stage of development;

3). Sustainability. For a system that will keep running in production for many years, designing it so that it can keep evolving is very important. The most elementary example is that when the HSF protocol was first designed, the protocol header had no version number, which made later upgrades very complicated. The most typical example is that HSF lacked service tracing in its early design; once we realized how important this was, it took several years to implement it fully. Another example is that HSF had no Filter Chain in its early design, which made many extensions and customizations very inconvenient.
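To make the version-number point concrete, here is a minimal sketch of a frame header that carries a version byte from day one. The layout (magic, version, body length), the field sizes, and all names are assumptions made up for illustration; this is not HSF's actual wire format.

```java
import java.nio.ByteBuffer;

/** Hypothetical frame header: magic (2 bytes) | version (1 byte) | body length (4 bytes). */
final class FrameHeader {
    static final short MAGIC = (short) 0xCAFE;   // assumed marker for this protocol
    static final byte CURRENT_VERSION = 2;       // highest version this code understands
    static final int HEADER_SIZE = 7;

    final byte version;
    final int bodyLength;

    FrameHeader(byte version, int bodyLength) {
        this.version = version;
        this.bodyLength = bodyLength;
    }

    /** Serializes the header in big-endian order, ByteBuffer's default. */
    byte[] encode() {
        return ByteBuffer.allocate(HEADER_SIZE)
                .putShort(MAGIC)
                .put(version)
                .putInt(bodyLength)
                .array();
    }

    /** Parses a header and rejects frames whose version this code cannot handle. */
    static FrameHeader decode(byte[] bytes) {
        if (bytes == null || bytes.length < HEADER_SIZE) {
            throw new IllegalArgumentException("header too short");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        if (buf.getShort() != MAGIC) {
            throw new IllegalArgumentException("bad magic");
        }
        byte version = buf.get();
        if (version < 1 || version > CURRENT_VERSION) {
            throw new IllegalArgumentException("unsupported protocol version " + version);
        }
        int bodyLength = buf.getInt();
        if (bodyLength < 0) {
            throw new IllegalArgumentException("negative body length");
        }
        // The body format can now be chosen per version without guessing.
        return new FrameHeader(version, bodyLength);
    }
}
```

Because the receiver reads the version before anything else, an upgraded peer can keep accepting version 1 frames while sending version 2, which is exactly the kind of gradual rollout that becomes painful when the header has no version field.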

The second experience was building T4, an Ali container based on LXC. Unlike HSF, it is really a multi-domain system: it includes a container engine on each machine and a container management system, and the container management system exposes external APIs for managing containers. The development of this system also involved all kinds of mistakes, again mainly because I had not mastered the fields involved. The most important thing I learned during T4 was how to design a system that spans multiple specialized fields, how to divide module responsibilities well, and how to design the interaction logic. The greater significance of this experience for me was that it gave me the confidence to take on larger system architectures.

The third experience was the remote multi-active project for Ali e-commerce. For me, this was truly being the architect of a huge system. Although I had taken part in the major technical transformation of Taobao e-commerce from 2.0 to 3.0 while working on HSF, there is a big difference between taking part and leading. This architectural transformation involved technical teams from many different specialized fields across Alibaba e-commerce. The main things I learned at this stage:

1). Division of subsystem responsibilities. In a technical solution this large, it is easy for responsibilities to overlap or conflict in places, so how the subsystems are divided is very important. As the master architect, you have to decide which team owns what based on each team's responsibilities and on long-term sustainability;

2). The master architect's main responsibility is controlling system risk. A system this large has to be designed jointly by the master architect and architects from many specialized fields. How do you make sure that the most important risks to the system stay under control during execution? This is where I truly came to understand the purpose of the design principles section of a system design document.

I think design principles exist to ensure that every subsystem follows and considers them in its own design; they must not be empty words. For example, in a remote multi-active architecture, the most important thing is controlling data risk, and this needs to be written into the principles: the most basic principle is that the system may become unavailable, but data consistency must still be guaranteed. I have seen many system designs where the design principles are written only for form's sake, or are the same everywhere. Good design principles truly reflect the architect's understanding of the goal (for example, remote multi-active was really just a concept at the beginning; what exactly counts as remote multi-active has to be interpreted, and the technical design has to ensure that the goal is achieved) and the selection principles at the level of the technical solution, and the architect must make sure the principles are accepted and carried through in the detailed designs;

3). Thinking about problems comprehensively. A large-scale architectural transformation such as remote multi-active involves the business level, the various basic technology levels, and the infrastructure level. Decisions about the pace of execution have to weigh staffing, machine cost, infrastructure layout requirements, stability control, and more. This is far more complicated than designing a small system.

To grow in system design ability, I think the most important thing is first to become an expert in one or two technical fields and then to broaden your knowledge. For example, beyond your own code, you should also know how it is deployed, where it is deployed, what the deployment environment looks like, and how it relates to the system as a whole.

In my own case, it was only after joining the infrastructure team that I realized that a decision made in software can sometimes lead to huge infrastructure investment in hardware, network, or data centers, when a few adjustments in the software would have avoided it. Working in both development and operations may be a good way to broaden your knowledge.

The second point is to practice making tradeoffs. This is harder. Making a tradeoff means choosing based on many factors, yet it is also the most important skill for any architect. Look back at the systems you have designed and reflect on what tradeoffs you made. It is best to experience this firsthand, and it also helps to hear experienced architects explain the logic behind their choices, especially if you happen to be facing the same challenge. Simply hearing the final architecture is of limited use in most cases.

I think it is best to grow into a technical leader on top of being an architect. The focus of growth after that is quite different, so I won't cover it in this article and will write a separate one later.

Programmer's Pyramid
I think a programmer's value is reflected in the work they produce. Being known for a piece of work is a great honor. I think the reach of that work determines one's level in the pyramid, and that is how I understand the programmer's pyramid.

Of course, the two abilities above are not enough on their own to create such work. A very important factor is judgment about business and technology trends.

I hope that everyone as a programmer will have the opportunity to create a world-class work and contribute to the development of the technology circle.

Because IT technology still changes very quickly, programming is a profession that particularly demands the ability to learn. I have always believed that you can do well in this profession only if you are truly interested in it and stay motivated; otherwise it is easy to be left behind.

About the Author:
Bi Xuan, who joined Alibaba in 2007, has been mainly engaged in the field of software infrastructure for more than ten years. He has been responsible for Alibaba's service framework, Hbase, Sigma, remote multi-active and other major basic technology products and overall architecture transformation.
