How to become a big data engineer

Big data is a fashionable technical term at present, and it has naturally given rise to occupations related to big data processing, in which people influence the business decisions of enterprises through data mining and analysis.

Abroad, this group of people are called data scientists. The title was first proposed in 2008 by D.J. Patil and Jeff Hammerbacher, who went on to head the data science teams at LinkedIn and Facebook respectively. In the United States, data scientists have now also begun to create value in traditional industries such as telecommunications, retail, finance, manufacturing, logistics, medical care and education.

In China, however, the application of big data has only just begun to sprout, and the talent market is not yet mature. "It's hard to expect a generalist to complete every link in the whole chain. More companies will recruit people who can complement their existing teams, according to the resources they already have and the gaps they need to fill," Wang Yuyao, business analysis and strategy director of LinkedIn China, told CBN Weekly.

What does a data engineer do?

Each company has different requirements for big data work: some emphasize database programming, some stress applied mathematics and statistics, some require experience at consulting companies or investment banks, and some hope to find practical people who understand products and markets. Because of this, many companies give the people who deal with big data new titles and definitions according to their business type and team structure: data mining engineer, big data expert, data researcher, user analysis expert and so on. These are titles that often appear at Chinese companies. We collectively refer to them as "big data engineers".

Because big data work in China is still at an early stage, how much value can be extracted from the data depends largely on the personal ability of the engineers. Experts who have worked in this industry describe a general framework of what employers look for, including computer coding ability and a background in mathematics and statistics. If engineers also have a deeper understanding of specific fields or industries, it helps them judge situations quickly and grasp the key factors.

Although some large companies prefer candidates with a master's degree, Xue, a researcher at Alibaba Group, stressed that education is not the most important factor. Experience in large-scale data processing and the curiosity to hunt for treasure in an ocean of data make a person better suited to this job.

In addition, an excellent big data engineer should have solid logical analysis skills and be able to quickly locate the key attributes and determinants of a business problem. "He needs to know what is relevant, what is important, what kind of data is the most valuable, and how to quickly find the core requirements of each business," said Shen Zhiyong, a data scientist at the United Nations-Baidu Big Data Joint Lab. Learning ability helps big data engineers adapt to different projects quickly and become data experts in a field in a short time. Communication skills help their work go smoothly, because the work of big data engineers mainly takes two forms: driven by the marketing department and driven by the data analysis department. In the former case, engineers need to check development requirements with product managers frequently; in the latter, they need to go to the operations department to understand the data model's actual conversion results.

You can treat these requirements as a direction to work toward in becoming a big data engineer, because in the view of Nicole Yan, managing partner of Wanbao Ruihua, there is a large talent gap here. At present, big data applications in China are mostly concentrated in the Internet field, and more than 56% of enterprises are preparing to carry out big data research. "In the next five years, 94% of companies will need data scientists," Nicole Yan said. She therefore also suggested that those already engaged in data-related work could consider making the transition.

In the words of Xue, a researcher at Alibaba Group, big data engineers are a group of people who "play with data", unlocking the commercial value of data and turning it into productivity. The biggest difference between big data and traditional data is that big data is online, real-time, massive and irregular, so the people who "play with" this data are very important.

Shen Zhiyong believes that if you think of big data as an ever-growing mine, the job of a big data engineer is as follows: "The first step is to locate and extract the data set where the information lies, which is equivalent to prospecting and mining. The second step is to turn it into information that can be judged directly, which is equivalent to smelting. The final step is application: visualizing the data."

Therefore, analyzing history, predicting the future and optimizing choices are the three most important tasks for big data engineers when they "play with data". Through these three lines of work, they help enterprises make better business decisions.

1. Find out the characteristics of past events.

A very important job of big data engineers is to find out the characteristics of past events by analyzing data. For example, Tencent's data team is building a data warehouse, sorting out the huge volume of irregular data from all of the company's network platforms and summarizing it into queryable features that support the data needs of the company's various businesses, including advertising, game development and social networking.

Finding out the characteristics of past events helps enterprises understand consumers better. By analyzing a user's past behavior, we can understand that person and predict what he will do. "You can know what kind of person he is, his age, his hobbies, whether he is a paying Internet user, what kind of games he likes to play and what he likes to do online," Zheng Lifeng, general manager of the Beijing R&D Center of Tencent Cloud Computing Co., Ltd., told CBN Weekly. At the business level, the company can then recommend related services, such as mobile games, to each kind of user, or derive new business models from their different characteristics and needs, such as the movie ticket business on WeChat.

2. Predict what may happen in the future.

By introducing key factors, big data engineers can predict future consumption trends. On Alimama, Alibaba's marketing platform, engineers are trying to help Taobao sellers do business by bringing in meteorological data. "For example, if this summer is not hot, some products that sold well last year probably won't sell this year. Besides air conditioners and electric fans, vests and swimsuits may also be affected. So we establish the relationship between meteorological data and sales data, find the related categories, and warn sellers in advance so they can turn over their inventory," Xue said.
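The idea of relating meteorological data to sales data can be sketched roughly as follows. This is only a minimal illustration, not Alibaba's actual method: it assumes a plain Pearson correlation is enough to flag weather-related categories, the function names (pearson, weather_sensitive_categories) and the 0.8 threshold are hypothetical, and all figures are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear signal
    return cov / sqrt(var_x * var_y)


def weather_sensitive_categories(temps, sales_by_category, threshold=0.8):
    """Return {category: r} for categories whose sales track temperature.

    The threshold is an arbitrary illustrative cut-off, not a recommendation.
    """
    flagged = {}
    for category, sales in sales_by_category.items():
        r = pearson(temps, sales)
        if abs(r) >= threshold:
            flagged[category] = r
    return flagged


# Invented weekly figures, for illustration only.
weekly_avg_temp_c = [24, 27, 30, 33, 35, 31]
weekly_units_sold = {
    "electric fans": [110, 150, 210, 260, 300, 230],
    "swimsuits": [40, 55, 80, 95, 110, 85],
    "phone cases": [70, 66, 73, 69, 71, 68],
}

flagged = weather_sensitive_categories(weekly_avg_temp_c, weekly_units_sold)
for category, r in sorted(flagged.items()):
    print(f"{category}: r = {r:.2f}")
```

With these invented numbers, fans and swimsuits are flagged as temperature-related while phone cases are not. A real system would also need to handle seasonality, promotions and lagged effects, which a simple correlation ignores.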

At Baidu, Shen Zhiyong supported model development for some products of "Baidu Forecast", which tries to serve a wider range of people with big data. World Cup predictions, college entrance examination predictions and scenic spot predictions have been launched so far. Taking Baidu's scenic spot forecast as an example, big data engineers need to collect all the key factors that may affect tourist flow at scenic spots over a period of time, and then grade the future congestion of scenic spots across the country: will a spot be uncrowded, moderately crowded or crowded in the next few days?

3. Find the optimal result.

According to the business nature of different enterprises, big data engineers can achieve different purposes through data analysis.

Taking Tencent as an example, Zheng Lifeng believes that the simplest and most direct illustration of the work of big data engineers is A/B testing, which helps product managers choose between two alternatives, A and B. In the past, decision makers could only judge by experience, but now big data engineers can help marketing departments make the final choice through large-scale, real-time testing. For a social networking product, for example, half of the users are shown interface A while the other half use interface B, and the click-through rate and conversion rate are observed and counted over a period of time.
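To make the comparison concrete, here is a minimal sketch of how the two groups' conversion rates might be compared. It assumes a standard two-proportion z-test, which the article does not name and which is not presented here as Tencent's actual practice; the function name two_proportion_z_test and the user counts are hypothetical.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the conversion-rate difference B - A."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups must contain at least one user")
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0, 1.0  # no variation at all, so nothing to distinguish
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Invented numbers: 10,000 users saw each interface.
z, p_value = two_proportion_z_test(conv_a=500, n_a=10_000,
                                   conv_b=580, n_b=10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("The difference is unlikely to be chance; prefer the better interface.")
else:
    print("No clear winner yet; keep collecting data.")
```

The 0.05 cut-off is a common convention rather than a rule. In practice the sample size and stopping criterion should be fixed before the test starts, because checking results repeatedly inflates the false-positive rate.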

As an e-commerce company, Alibaba hopes to help sellers market more effectively by using big data to target the right audience precisely. "What we hope for is to find a group of people who are more interested in the product than its existing users," Xue said. One Taobao example is a ginseng seller who originally targeted women who had just given birth, but engineers, by mining correlations in the data, found that the marketing conversion rate for pregnant women was higher.

Required abilities

1. A background in mathematics and statistics

At the three major Internet companies we interviewed, Baidu, Alibaba and Tencent (BAT), big data engineers are all expected to hold a master's or doctoral degree in statistics or mathematics. Shen Zhiyong believes that data workers who lack a theoretical background are more likely to enter a "danger zone" of skills: given a pile of numbers, different data models and algorithms will always produce some result, but if you don't know what that result represents, it is not really meaningful and can easily mislead you. "Only with a certain amount of theoretical knowledge can we understand models, reuse models and even innovate on models to solve practical problems," Shen Zhiyong said.

2. Computer coding ability

Practical development ability and large-scale data processing ability are essential for a big data engineer. "Because much of the value of data comes from the process of mining it, you have to do the work yourself to discover the gold," Zheng Lifeng said.

For example, many of the records people generate on social networks are unstructured data. Extracting meaningful information from this disorganized text, sound, images and even video requires big data engineers to dig for it themselves. Even in teams where big data engineers are mainly responsible for business analysis, they should still be familiar with how computers process big data.

3. Knowledge of specific application fields or industries.

In Nicole Yan's view, it is very important that the role of big data engineer stays close to the market, because big data only generates value when it is combined with applications in specific fields. Experience in one or more vertical industries therefore gives candidates industry knowledge that is very helpful for becoming a big data engineer later, and it is a convincing plus when applying for this position.

"He can't just know the data; he also needs a business mind. He should have some understanding of certain industries, such as retail, medicine, games and tourism, ideally ones that match the company's business direction." Xue also drew an analogy on this point. "In the past, we said that some luxury shop assistants were snobbish because they could tell at a glance whether a customer could afford something, but these people are perceptive, and we consider them experts in their industry. Another example is a person who knows the medical industry. When considering the medical insurance business, he will not only draw on the medical records of the People's Hospital but also consider dietary data, and all of this comes from his understanding of the field."

Career development

1. How to become a big data engineer?

Because of the current shortage of big data talent, it is difficult for companies to recruit suitable people, who need to be highly educated and preferably experienced in large-scale data processing. As a result, many companies look for talent internally.

In August 2014, Alibaba held a big data competition: it took data from the Tmall platform, removed sensitive information, put it on its cloud computing platform and handed it to more than 7,000 teams to compete on. The competition was divided into internal and external tracks. "This not only motivates internal employees but also uncovers external talent, allowing big data engineers from various industries to stand out."

Nicole Yan suggested that people who have been engaged in database management, mining and programming for a long time, including traditional quantitative analysts, Hadoop engineers, and any managers who need to make judgments and decisions through data in their work, such as operation managers in some fields, can try this position. Experts in various fields can also become big data engineers as long as they learn to use data.

2. Salary and benefits

As the "giant pandas" of the IT industry, big data engineers enjoy pay and benefits that can be said to be at the top of their field. According to Nicole Yan's observation, 10% of recruitment in China's IT and communications industries is related to big data, and the proportion is still rising. Nicole Yan said, "The era of big data has arrived more suddenly than expected. Development in China is moving aggressively, but talent is very limited, and supply now falls far short of demand." In the United States, the average annual salary of big data engineers is as high as $175,000. It is understood that at the top Internet companies in China, the salary of a big data engineer may be 20% to 30% higher than that of other positions at the same level, and these engineers are highly valued by their employers.

3. Career development path

Because big data talent is scarce, the data departments of most companies have a flat structure, roughly divided into three levels: data analysts, senior researchers and department directors. Large companies may split into different teams by application field, while in small companies one person has to cover several roles. Some Internet companies that place special emphasis on big data strategy set up additional senior positions, such as the chief data officer at Alibaba. "Most people in this position will develop in the direction of research and become important data strategy talent," Nicole Yan said. On the other hand, big data engineers understand the business and its products no less than employees in business departments do, so they can also move to product or marketing departments, or even rise to the company's senior management.