The Two-Sided Market Network Analysis Based on Transfer Entropy & Labelr
A Preprint
Seung Bin Baik
DeepNatural Inc. [email protected]
January 27, 2021

Abstract
This study examines how early-stage digital platforms in the two-sided market, which have recently become more complex, can produce powerful network effects. I use Transfer Entropy to look for super users who connect people in different networks and thereby achieve higher network effects on such platforms. The study also aims to redefine the decision criteria of product managers by helping them identify users with stronger network effects. As technology develops, the structure of the industry is becoming more difficult to interpret and the complexity of business logic is increasing. This phenomenon is the biggest problem that makes it difficult for start-ups to take on the market. I hope this study will help product managers create new digital economic networks, make prioritized, data-driven decisions, and find users who can be the hub of the network even in small products.

Keywords: Transfer Entropy · Two-sided Market · Complex System Network · Platform · Product Management
In a two-sided market, two groups of agents interact with each other via a common network platform, and the value of participating in the network for agents in one group depends on the number of participants from the other group.[1] In this market, a platform refers to goods and services that bring user groups together.[2] Looking more closely at the two-sided market, we can see that it consists of transaction parties affected by network effects and one or more intermediaries that promote their transactions.[3] These network architectures have become more complex in recent years and are reminiscent of real-world complex networks. Network externality refers to the characteristic of the two-sided market whereby one group's decisions influence the outcome of the other group,[4] and its causal relationships are increasingly difficult to establish. Of course, there was uncertainty before, because real-world drivers also affect the outcome, but increasing complexity is causing more uncertainty. In particular, large platform companies, whose products grew through network effects arising from relationships among agents, have emphasized network effects in homogeneous networks, whereas recent platform businesses are shaped by the influx of completely different groups (e.g., B2B2C). For this reason, it is difficult to find the highest-priority users for gaining a competitive edge in the market and to measure network effects. Especially for crowd-sourcing platforms that do not provide communication functions, it is difficult to fully validate users' network effects and to infer who in the product is affecting external networks. This study uses a Transfer Entropy-based network analysis to address these problems. It aims to define, by explicit criteria, new clusters that can influence different networks, and to infer their connectivity to external networks.
To best demonstrate these problems, we used data from the crowd-sourced data processing platform "Labelr", provided by DeepNatural Inc. Labelr is a domestic service in Korea that provides users with web and mobile applications for data annotation for machine learning. Unlike other products, it has a separate 'Great user' group, selected by internal professional staff on the basis of annotation experience and high inspection rates. For this reason, customers choose Labelr for the high quality and speed of its data annotation. Finally, I aim to define the users (those with high network effects) on whom product managers of still-growing two-sided market platforms should focus, and to provide those managers with new decision criteria.
Labelr, provided by DeepNatural Inc., is a platform that uses crowd-sourcing to supply the labor required to build training data for artificial intelligence. Crowd-sourcing platforms were selected for this study because of their structural feature: two completely different networks interact. For example, there is not much interaction within the user network, but for this group of assistants to be rewarded, an influx of enterprise users (customers) who open projects is required. These enterprise users do not choose a platform on vague and simple requirements such as the number of users or outcomes. As a consequence of this structure, the assistant group's network is not formed inside the product; it forms on other platforms or SNS, disconnected from the product. Enterprise users likewise develop networks among corporate representatives, which makes it difficult for product managers to focus on any single indicator in the platform.

To influence these external networks and the network of enterprise customers, product managers have selected and managed separate groups of good assistants through internal evaluations. These evaluations have begun to create groups that receive very large rewards. However, because this process passes through human hands, it carries a further risk of lost efficiency or of intervention by stakeholders. The data also has advantages: the product is growing rapidly in the artificial intelligence training data market, which has clear market demand, and in this market the invisible hand has been at work as much as government policy. These markets and products therefore have the advantage of being similar to real society.
The data used in the study is half a year of 'Labelr' data, from July 2020 to December 2020. This period was selected because it was a period of growth, commonly referred to as a J-curve. The inflow of B2B enterprise users also increased, enhancing the production activity of the platform. Analyzing data from such a growth period was also expected to be more helpful for the decision-making of product managers who want to grow their products.

The data included credit-related indicators. Credit is cash-like compensation for work on the platform that a user can withdraw at any time.

• Log data for the total amount of withdrawable credits remaining on the platform
• Total amount of credit paid to the user and the date of payment
• Total amount of credit withdrawn by the user and the date of withdrawal
We also used user-related values such as the following for an effective analysis of the network.

• New users and their sign-up dates
• 'Great user' members selected by internal professional staff and their sign-up dates
• 'Super user' members calculated from the hypothesis and their sign-up dates

Finally, we also used data on projects opened in 'Labelr', which can be interpreted as an indicator reflecting actual B2B demand. To select the 'Super user' group, we hypothesized that user networks in the platform follow a power law, so that agents serving as hubs of the network have the most impact.[5] Because 'Labelr' has no features such as recommendations or relationship making through which users can be connected into a network, these hubs cannot be observed directly; the 'Super user' group is instead calculated under the following conditions.

1. Users with recent work logs and above-average retention days
2. Users who satisfy the condition above and have an above-average workload in their clusters
3. Users, among the cluster satisfying the first condition, whose first one or more tasks achieved an average or higher pass rate in expert examination of the submitted work results

We use Transfer Entropy in this study to find out how the selected 'Super user' group affects other drivers in the platform and to verify whether 'Super user' members have more influence and stronger network effects than the 'Great user' members selected by internal professional staff. Transfer Entropy is an indicator that quantifies the effect between two time series.[6] It measures the direction of information flow as well as the absolute amount of information exchanged between the two systems. First, we obtain the slope of each element's variation to understand the shape of the change in the effect of the independent value (J) on the dependent value (I), where both values are factors computed from the Labelr platform.

Formula 1 below expresses the relationship between the previous samples of I and the previous samples of J.

$$T_{J \to I} = \sum p\left(i_{t+1}, i_t^{(k)}, j_t^{(l)}\right) \log \frac{p\left(i_{t+1} \mid i_t^{(k)}, j_t^{(l)}\right)}{p\left(i_{t+1} \mid i_t^{(k)}\right)} \qquad (1)$$

Finally, through Transfer Entropy, I want to check the adequacy of the drivers, and of their effects, that product managers should focus on in an early-stage two-sided market. I also want to see whether a better smile graph can be drawn with the Power User Curve, which allows us to see how network effects work.

The results of the transfer entropy analysis of six months of time series data for the key drivers of Labelr are shown in Table 1 below. The values in the table are then converted to the final relative value (F) and listed in order in Table 2. Each transfer entropy value is multiplied by 10,000, expressed in arbitrary units (A.U.), and given to the third decimal place.

Table 1: Labelr Transfer Entropy Result (A.U.). Rows are the source driver (J) and columns are the affected driver (I).

| From \ To       | User   | Great User | Super User | Credit | Withdraw | Remained Credit | Project |
|-----------------|--------|------------|------------|--------|----------|-----------------|---------|
| User            | -      | 72.156     | 0.001      | 40.853 | 0        | 42.557          | 132.956 |
| Great User      | 0      | -          | 0          | 29.303 | 25.447   | 30.544          | 95.355  |
| Super User      | 32.578 | 52.132     | -          | 29.499 | 25.618   | 30.749          | 95.994  |
| Credit          | 0      | 0          | 0.001      | -      | 0        | 0               | 142.585 |
| Withdraw        | 5.84   | 0          | 0          | 5.288  | -        | 0               | 17.213  |
| Remained Credit | 0      | 0          | 0          | 1.516  | 1.317    | -               | 4.934   |
| Project         | 0      | 0          | 0          | 0      | 0        | 0               | -       |

For relative evaluation, the raw Transfer Entropy values (T) are difficult to compare, so each value is divided by the maximum value and rescaled to a score out of 100 as follows.
$$F_{J \to I} = \frac{T_{J \to I}}{\mathrm{Max}(T)} \times 100 \qquad (2)$$

Table 2: Labelr Impact Factor Ranking

| Relationship                | F     |
|-----------------------------|-------|
| Credit to Project           | 100   |
| Super User to Project       | 67.32 |
| User to Great User          | 50.61 |
| User to Remained Credit     | 29.85 |
| Super User to User          | 22.85 |
| Great User to Remained Credit | 21.42 |
| Great User to Credit        | 20.55 |
| Great User to Withdraw      | 17.85 |
| Withdraw to User            | 4.1   |
| Remained Credit to Project  | 3.46  |
| Remained Credit to Withdraw | 0.92  |

As a result, the payment of credit has the highest impact on projects, at 0.01425847 (142.585 A.U. after the 10,000× scaling), and this maximum is used as the base (Max) for converting the other values. It was followed by 'user', 'super user', and 'great user'. The 'Great user' group, which in-house experts select by evaluating users, also affected the increase in the number of automatically calculated best members ('Super user'). On the other hand, the impact of the 'Great user' group on other drivers was relatively low; in particular, its effect on the increase in total users ranked 34th among the 42 networks.

The important question in this study is what to focus on to increase the number of users and the number of projects opened, so we need to look at the relationships between drivers more logically. Accordingly, we looked at the factors with the most impact on user growth and those with the most impact on project growth. Naturally, for users, whose individual impact is low but whose barriers to participation are also low, the experience of withdrawal and of credit receipt (5.84 A.U. and 0 A.U.) might be expected to be the biggest factor. However, the 'Super user' group had the most powerful impact on the increase in users, at 32.578 A.U. Project growth was most affected by the payment of credits, at 142.585 A.U. As is generally said of the two-sided market, the next largest factor was the increase in users, at 132.956 A.U., which had a powerful impact on the opening of projects. Overall, Project ranked 1st through 4th as the affected driver, which could be the basis for a typical solution to the chicken-and-egg problem: lowering prices on two-sided market platforms, or generating more inflows by first targeting users on the subsidy side.
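As an illustration of how formulas (1) and (2) could be computed, the following is a minimal sketch rather than the procedure actually used in this study. The text does not state the history lengths k and l, the discretization, or the logarithm base, so the sketch assumes k = l = 1, a three-symbol encoding by the sign of the change (one possible reading of "slope"), and base-2 logarithms. All function names are hypothetical.

```python
from collections import Counter
from math import log2


def slope_symbols(series):
    """Encode a numeric series as -1/0/+1 by the sign of each step change.

    Assumption: this is only one possible reading of the 'slope' step.
    """
    return [(b > a) - (b < a) for a, b in zip(series, series[1:])]


def transfer_entropy(source, target):
    """Plug-in estimate of T_{J->I} (formula 1) with assumed k = l = 1, in bits.

    source is the J series, target is the I series; both must be
    equal-length sequences of discrete symbols.
    """
    if len(source) != len(target) or len(target) < 2:
        raise ValueError("need two equal-length series with at least 2 samples")
    n = len(target) - 1
    triples = Counter()    # counts of (i_{t+1}, i_t, j_t)
    pairs_ij = Counter()   # counts of (i_t, j_t)
    pairs_ii = Counter()   # counts of (i_{t+1}, i_t)
    singles = Counter()    # counts of i_t
    for t in range(n):
        i_next, i_now, j_now = target[t + 1], target[t], source[t]
        triples[(i_next, i_now, j_now)] += 1
        pairs_ij[(i_now, j_now)] += 1
        pairs_ii[(i_next, i_now)] += 1
        singles[i_now] += 1
    te = 0.0
    for (i_next, i_now, j_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / pairs_ij[(i_now, j_now)]
        p_given_self = pairs_ii[(i_next, i_now)] / singles[i_now]
        te += p_joint * log2(p_given_both / p_given_self)
    return te


def impact_scores(te_values):
    """Rescale T to F = T / Max(T) * 100 (formula 2), rounded to 2 decimals.

    te_values maps (source, target) name pairs to transfer entropy values.
    """
    t_max = max(te_values.values())
    if t_max <= 0:
        raise ValueError("maximum transfer entropy must be positive")
    return {pair: round(100 * t / t_max, 2) for pair, t in te_values.items()}


# Three entries taken from Table 1 (A.U.), rescaled as in Table 2.
scores = impact_scores({
    ("Credit", "Project"): 142.585,
    ("Super User", "Project"): 95.994,
    ("User", "Great User"): 72.156,
})
print(scores)  # scores of 100.0, 67.32 and 50.61, matching Table 2
```

A plug-in estimate of this kind is biased upward on short series, so with roughly six months of daily data a coarse symbol alphabet such as the one assumed here keeps the number of estimated probabilities small.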
The Power User Curve is a method of measuring network effects. The stronger the network effect, the more the MAU user graph forms a smile.[7] The Power User Curve should be based on users' logins, but Labelr did not yet have a login logging system, so the curve was drawn from the history of participation in projects. As a result, the degree of the smile may differ somewhat, but no significant difference is expected with respect to the original purpose of the Power User Curve, which is to observe the network effect.

Figure 1: Power User Graph. Panels: (a) User, (b) GreatUser, (c) SuperUser.

As the Power User Curve shows, the 'Super user' group consists of users with powerful network effects, and this metric is useful in most areas as the One Metric That Matters (OMTM) for digital platform services. The Power User Curve also shows that 'Super user' members are the most likely to become hubs in the platform's complex network.
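A Power User Curve of this kind can be sketched as a histogram of users by the number of distinct days on which they were active in the period. The example below is an illustration only: it assumes the project participation history is available as (user_id, day) pairs, which is a hypothetical format, and the function name is likewise hypothetical.

```python
from collections import Counter


def power_user_curve(events, days_in_period=30):
    """Share of users by number of distinct active days in the period.

    events: iterable of (user_id, day) pairs, where 'day' is any hashable
    day label (assumed input format). Index d - 1 of the returned list is
    the share of users who were active on exactly d days.
    """
    active_days = {}
    for user_id, day in events:
        active_days.setdefault(user_id, set()).add(day)
    total = len(active_days)
    if total == 0:
        return [0.0] * days_in_period
    # Cap at days_in_period so unexpected extra day labels are not dropped.
    counts = Counter(min(len(days), days_in_period)
                     for days in active_days.values())
    return [counts.get(d, 0) / total for d in range(1, days_in_period + 1)]


# Toy example with a 3-day period: repeated activity on one day counts once.
events = [("a", 1), ("a", 1), ("a", 2), ("b", 1), ("c", 1), ("c", 2), ("c", 3)]
curve = power_user_curve(events, days_in_period=3)
```

A smile corresponds to a curve that is high at the left end, dips in the middle, and rises again at the right end, where the most active users sit. Computing the curve separately for each user group gives the three panels compared in Figure 1.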
It was difficult to verify whether the 'Super user' group selected in this study could be the hub of a power-law network. This is because the data contain no criteria for interaction and no interacting user clusters; users interacted outside the product. However, even though the product relies on different networks and its impacts are therefore indirect, the result is significant in that the 'Super user' group has an impact of similar size to the 'Great user' group, which can only be selected through internal testing. There remain the limitations that it is hard to test the power-law hypothesis directly across different networks and hard to check whether 'Super user' members are in fact hubs. Furthermore, it is difficult to affirm that the approach applies as above to all platforms, because there are not enough cases from other platforms.

To overcome these limitations, I will analyze data from other products on the same basis to develop this work further. I hope this study helps product managers make decisions that create new digital economic networks on their platforms.
I dedicate this paper to my family and colleagues, who helped me conduct the study and write the paper. I also thank DeepNatural Inc. (Team Labelr) for providing the log data required for this study.