What does the Big Data team do at MegaFon, and how do you join it?



MegaFon is no longer just a telecom company that provides mobile communications. It is a digital company whose products form an ecosystem around the client's life: “Own Card”, “Own Cashback”, “MegaFon.TV”, “MegaFon.Music” and many others. MegaFon's big data analytics department personalizes offers to suit the needs of each client.


Speech by MegaFon big data analyst at the Data Fest conference in spring 2019.

Data scientists at MegaFon work on retaining the subscriber base, which is one of the company's priorities as growth in the telecom services market slows. For example, several years ago a new tariff line, “Turn on”, was developed on the basis of big data. It is built around the real interests of digital users: talking, chatting in instant messengers, listening to music, communicating on social networks, watching videos. Each tariff is named after the interest it covers, and because use of the familiar applications is unlimited, subscribers do not have to count consumed traffic. In building the ecosystem, our task is to make an individual offer to each client. Big Data also solves retail problems. For example, machine learning models show us where to move underperforming stores and where to open new ones. Working with geodata helps us here.

Big data analytics is also used to develop the network infrastructure: by analyzing towers and the traffic they carry, we determine optimal coverage and predict promising locations for construction.

What technologies are used?

We work with data on millions of subscribers and billions of daily records about them. Big Data is not just databases such as Oracle, MySQL or MongoDB. It is a whole stack of software for working with data at that scale. To work with big data, you need to understand how Hadoop works and know the specifics of Spark, Hive and HDFS. Data analysts who come to us have often not used these tools before. In that case, we teach the skills they are missing.

Skills in working with big data are acquired with experience, so MegaFon is interested in talented analysts who are ready to learn all the necessary tools and apply them to the company’s real problems.


BigDataCamp at MegaFon office, 2021

How do MegaFon's Big Data specialists develop models?

MegaFon's Big Data specialists are divided into analysts (data scientists) and engineers. Analysts test hypotheses and build machine learning models. Engineers help analysts assemble data marts, optimize ETL processes, and are responsible for putting models into production.

A model is developed as follows. First, we collect the required data in Hadoop or Oracle. The model is then trained on dedicated servers with large amounts of memory and many CPU cores. We use GPU servers to train neural networks.
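The two steps above (collect the data, then train a model) can be sketched in a few lines. This is only an illustration using the Python standard library: the records, the labels, the feature choice and the nearest-centroid classifier are all hypothetical stand-ins, not MegaFon's actual data or models.

```python
from collections import defaultdict
from math import dist

# Hypothetical raw usage records: (subscriber_id, call_minutes, megabytes)
RAW_RECORDS = [
    ("a", 30, 100), ("a", 50, 140),
    ("b", 5, 900), ("b", 15, 1100),
    ("c", 40, 80),
    ("d", 10, 1000),
]

# Hypothetical training labels, invented for this example
LABELS = {"a": "talker", "b": "streamer", "c": "talker", "d": "streamer"}


def build_feature_table(records):
    """Step 1: aggregate raw records into one feature row per subscriber."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for sub_id, minutes, megabytes in records:
        row = totals[sub_id]
        row[0] += minutes
        row[1] += megabytes
        row[2] += 1
    # Average per record so subscribers with more records stay comparable
    return {s: (m / n, mb / n) for s, (m, mb, n) in totals.items()}


def fit_centroids(features, labels):
    """Step 2: 'train' by computing the mean feature vector of each class."""
    grouped = defaultdict(list)
    for sub_id, vec in features.items():
        grouped[labels[sub_id]].append(vec)
    return {
        label: tuple(sum(col) / len(vecs) for col in zip(*vecs))
        for label, vecs in grouped.items()
    }


def predict(centroids, vec):
    """Assign the class whose centroid is nearest (features are unscaled here)."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))


features = build_feature_table(RAW_RECORDS)
centroids = fit_centroids(features, LABELS)
print(predict(centroids, (45.0, 120.0)))  # talker
print(predict(centroids, (8.0, 950.0)))   # streamer
```

In a real project the aggregation would run in the cluster and the classifier would come from a machine learning library. The sketch only shows how the feature table and the training step fit together.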



The main language for developing models is Python. To process data in Python, you usually need the common libraries pandas, NumPy and scikit-learn. For computations in Hadoop we use PySpark and Hive, and for modeling we use scikit-learn, XGBoost, LightGBM, PyTorch and others. The list depends on the task. Why Python? Its main advantage is how easy it is to take into production. We can build a solution that plugs straight into the overall infrastructure. It does happen that a library we need is not available in Python but exists in another language. For example, R has statistics libraries that Python does not.

What if no one knows Hadoop?

Hadoop skills are desirable, but not a requirement for joining our team. Not every company has the volume of data that MegaFon has, so many candidates have not had the opportunity to work with Hadoop in their previous jobs.

The basic commands for working with a Hadoop cluster are not hard to master, but more complex tasks require a deep understanding of big data algorithms, MapReduce, and query optimization techniques. For example, the Hadoop ecosystem includes Hive, a product originally developed by Facebook. It lets you write SQL-like queries and runs on top of Hadoop. But remember that you are not working with a relational database, even though you are writing SQL. Simple queries are easy to write, but to make them efficient, that is, fast and light on cluster resources, you need to understand how a query is executed as MapReduce jobs and how to optimize it.
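To make the point about efficiency concrete, here is a toy, single-process sketch of the MapReduce model behind a grouped sum, roughly what a query like `SELECT subscriber_id, SUM(megabytes) ... GROUP BY subscriber_id` has to compute. The records, function names and the two "splits" are hypothetical. The sketch only illustrates why aggregating on the mapper side reduces the amount of data moved between phases, and it does not reproduce how Hive actually plans queries.

```python
from collections import defaultdict

# Hypothetical traffic records: (subscriber_id, megabytes)
RECORDS = [("a", 10), ("b", 5), ("a", 20), ("c", 7), ("b", 1), ("a", 3)]


def map_phase(split):
    """Mapper: emit one (key, value) pair per input record."""
    return [(sub_id, mb) for sub_id, mb in split]


def combine(pairs):
    """Combiner: pre-aggregate on the mapper side to shrink the shuffle."""
    partial = defaultdict(int)
    for key, value in pairs:
        partial[key] += value
    return list(partial.items())


def shuffle(all_pairs):
    """Shuffle: group values by key between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in all_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reducer: fold each key's values into a final total."""
    return {key: sum(values) for key, values in grouped.items()}


def run_job(splits, use_combiner):
    """Run the toy job and report how many pairs crossed the shuffle."""
    shuffled_pairs = []
    for split in splits:
        pairs = map_phase(split)
        if use_combiner:
            pairs = combine(pairs)
        shuffled_pairs.extend(pairs)
    return reduce_phase(shuffle(shuffled_pairs)), len(shuffled_pairs)


# Two input splits, as if the file were stored in two separate blocks
splits = [RECORDS[:3], RECORDS[3:]]
totals_plain, moved_plain = run_job(splits, use_combiner=False)
totals_comb, moved_comb = run_job(splits, use_combiner=True)
print(totals_plain == totals_comb)  # True: same answer either way
print(moved_plain, moved_comb)      # 6 5: fewer pairs shuffled with the combiner
```

On six records the saving is trivial, but with billions of daily records the volume of data shuffled across the network is often what separates a fast query from a slow one.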

Internships are an opportunity to develop and gain business experience. Are there internships at MegaFon Big Data?

In our digital world, it seems that any stool already collects data about the person who sits on it, not to mention the Internet of things and the large number of services that we all use.

The need for specialists is growing, and there are plenty of analyses and forecasts of how many will soon be required. Every company that collects any kind of data recognizes that the data can hold value and a lot of insight. This is why data analysts are in such demand now.



We welcome strong specialists, but the market is small and few people are a good fit for us. That is why MegaFon is developing internship programs. We mainly invite senior students and recent graduates with a background in programming and mathematics. There are exceptions: for example, we had a successful experience working with students from geography departments. It is important to us that a student can comfortably combine work with study, keep developing in the company and eventually move into an analyst or engineer position.

How do you recruit for your team?

Our interviews with interns differ from those with experienced professionals. When we look for interns, the recruiter first conducts a short telephone interview, which shows whether our tasks interest the candidate and what level of knowledge and experience they currently have. We want to know whether the candidate can program in Python, knows the main machine learning libraries, has solved study problems involving big data analysis, and has built mathematical models before, and if so, with which algorithms.

Based on the telephone interviews, we select 5-10 candidates, who come to our office together for 2-3 hours to meet the team and solve a technical task.

The task is as close to the telecom sector as possible: candidates have to build a model that classifies our subscribers. We then compare the results and invite the best candidates to a final interview to discuss an individual work schedule, tasks and other conditions.

The internship lasts 3 months. The intern deals with real business problems. Most often, the tasks are already formalized, and the intern has a clear understanding of what needs to be done; if not, they can always turn to their mentor.

In addition to business tasks, our interns regularly take offline and online training. We collaborate with New Pro Lab, Big Data Team, Geek Brains, Data Gym and others, and our specialists have access to Coursera.

As practice shows, three months is enough to understand whether we want to continue working together. If an intern shows good results, we hire them as a junior data scientist and continue to develop them.


Egor, MegaFon big data analyst, at the Data Fest conference in the spring of 2019.

The search for experienced specialists goes as follows:

1. The team leads and the recruiter each review the candidate’s resume or profile.

2. A personal interview with the team lead, with technical and other questions: probability theory, statistics, machine learning, experience with various tools, and the candidate's own expectations.

3. If the interview went well for both parties, we request the candidate’s portfolio (personal projects and code) or ask them to solve our technical task, so that we can look at the code and see how they approach a problem. The technical task is also related to telecom: the candidate has to predict whether a subscriber has several SIM cards. The candidate sets the deadline, but usually it is no more than a week. One of our employees solved the task that same evening and a week later came to work with us. Hi Artem ;)

4. A meeting with the director of big data analytics to discuss tasks and conditions.

Is bureaucracy strong in a large corporation?

Most of our team works at the head office in Moscow, but we also have teams in Nizhny Novgorod and Yekaterinburg. Colleagues from different cities can be involved in the same project; it all depends on the tasks and the employees' skills.

Our department is young and dynamic, and from the very beginning we managed to set up our interaction with other departments properly: we do not need to request data through colleagues. For the most part we take the data from our own databases, Oracle or Hadoop, and build the model.


Work in the MegaFon office

Our work process is arranged as follows. First, the manager discusses the requirements with the customer's representative. As a rule, the goal is to improve a business process using machine learning and data analysis; for example, we can optimize smartphone sales for our retail network. Then the manager, team lead and analyst jointly discuss the timing and stages of development. Agreements are recorded in Jira, and we also maintain Confluence as our internal wiki. Of course, we use GitLab.

This year we introduced code review for all key stages of a data science project and are already seeing results: the quality of many team members' code has improved significantly. Our plans for improving the development process further include adopting DVC (Data Version Control), which will let us version the entire project, including datasets.

Projects last from a few months to six months. The analyst takes part in every stage of the project, from formalizing requirements and defining the model's target event to monitoring the stability of the result in production.

We are very results-oriented; we never start development without a clear understanding of the benefit it can bring to MegaFon. After building a model, we launch test campaigns based on its output. If they succeed, we roll the solution out to millions of MegaFon subscribers. Afterwards we analyze the results not only in terms of model metrics, such as precision or recall in the target segment, but also pay serious attention to business indicators. Our business analysts help us with this.
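As a rough illustration of the two kinds of evaluation described above, the sketch below computes model metrics on labelled outcomes and a simple business indicator for a test campaign. It assumes the model metrics in question are precision and recall, and that the business indicator is the difference in conversion rate between a targeted group and a control group. All numbers and function names are hypothetical.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def conversion_uplift(target_conv, target_size, control_conv, control_size):
    """Difference in conversion rate between the target and control groups."""
    return target_conv / target_size - control_conv / control_size


# Hypothetical outcomes: 1 = subscriber accepted the offer
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
# Hypothetical model decisions: 1 = subscriber was placed in the target segment
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)  # 0.75 0.75

# Hypothetical test campaign: 120 of 1000 targeted subscribers converted,
# against 80 of 1000 in a randomly chosen control group
uplift = conversion_uplift(120, 1000, 80, 1000)
print(round(uplift, 4))
```

A model can score well on precision and recall and still show no uplift against the control group, which is one reason to track both views.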

Team and development

The biggest advantage of working in this department is the team of really smart people and the interesting tasks. The office, the shopping center inside it, the bonuses and the compensation are good too, of course, but they only come third. MegaFon is a real treasure trove of data for analysts. Not everyone gets to work with data of this kind and volume, where analysis yields insights and decisions that ultimately bring in big money. This is the most interesting part for an analyst. You went to university, came up with a new algorithm, coded it, applied scientific methods, and the algorithm started working and bringing real benefit. That is what stirs the most emotion.

We are numbers people, surrounded by commercial people, and when our insights lead to making money, that's great!

The interview was prepared jointly with the career service “My Circle”
