Can Smartphone Co-locations Detect Friendship? It Depends How You Model It
Momin M. Malik, Afsaneh Doryab, Michael Merrill, Jürgen Pfeffer, Anind K. Dey
MOMIN M. MALIK∗, Harvard University, USA
AFSANEH DORYAB, University of Virginia, USA
MICHAEL MERRILL, University of Washington, USA
JÜRGEN PFEFFER, Technical University of Munich, Germany
ANIND K. DEY, University of Washington, USA

We present a study to detect friendship, its strength, and its change from smartphone location data collected among members of a fraternity. We extract a rich set of co-location features and build classifiers that detect friendship and close friendship at 30% above a random baseline. We design cross-validation schemes to test our model performance in specific application settings, finding it robust to seeing new dyads and to temporal variance.

CCS Concepts: • Human-centered computing → Social network analysis; Smartphones; Ubiquitous and mobile computing design and evaluation methods; Empirical studies in ubiquitous and mobile computing; • Computing methodologies → Feature selection; • Applied computing → Sociology; Psychology; • Networks → Location based services.

Additional Key Words and Phrases: Friendship detection, proximity, co-location, machine learning
ACM Reference Format:
Momin M. Malik, Afsaneh Doryab, Michael Merrill, Jürgen Pfeffer, and Anind K. Dey. 2020. Can Smartphone Co-locations Detect Friendship? It Depends How You Model It. 1, 1 (September 2020), 24 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
∗This is the corresponding author.

Authors' addresses: Momin M. Malik, Harvard University, MA, USA, [email protected]; Afsaneh Doryab, University of Virginia, School of Engineering and Applied Science, VA, USA, [email protected]; Michael Merrill, University of Washington, WA, USA, [email protected]; Jürgen Pfeffer, Technical University of Munich, Bavarian School of Public Policy, Germany, [email protected]; Anind K. Dey, University of Washington, Information School, WA, USA, [email protected].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

© 2020 Association for Computing Machinery.
XXXX-XXXX/2020/9-ART $15.00
https://doi.org/10.1145/nnnnnnn.nnnnnnn

Friendship is a "voluntary, personal relationship typically providing intimacy and assistance", associated with characteristics of trust, loyalty, and self-disclosure [40]. It is one of the most important aspects of human existence, lending meaning to life, and providing for material, cognitive, and social-emotional needs in ways that lead to greater health and well-being [40].

Understanding the friendship relationships between people can be helpful for creating technology that serves people better. If an individual's friendships are known, these can be leveraged for applications supporting help-seeking behavior such as requests for recommendations or for favors, or for automatically establishing trust between users' devices. Friendships may also be leveraged for carrying out social interventions around diet and exercise [71] or for preventing disease transmission [69], such as in mobile applications that facilitate friends adopting, and holding each other accountable to, healthy behaviors. Conversely, it may be of interest to create applications that make recommendations about friendships in order to help bring people together, for example at conferences [19, 20] or for information dissemination in workplaces [63]. Longitudinal information about changes in friendships could help detect the onset of isolation and help design interventions to strengthen friendships.

But in order to incorporate information about one's friendship network in personal informatics and mobile applications, we need ways of detecting friendship. One easy way to get 'ground truth' is to rely on ties from online social networking platforms; but such ties are not necessarily good proxies for the underlying construct of friendship [27, 77, 108]. Survey instruments have been the standard network data collection method in social network analysis for decades, but involve a high burden for users that makes them impractical as a basis for mobile applications.

Previous social science work has established a strong link between individuals being physically close and being friends [40, 42, 61]. There is a two-way causal connection, in that people who spend time together are more likely to become friends [42], and people who are friends spend time together [40]. This suggests that there should be a robust signal of friendship in measurements of physical proximity that can then be leveraged for analysis, services, or interventions.

Our work is the first effort to detect friendships using feature extraction from smartphone location data.
Previous works either descriptively, rather than predictively, linked location data and friendship [39], looked at ties on location-based online social network services [28], or used mobile phone call and SMS logs [106, 107]. We believe that detecting friendship from mobile phone co-location data is a realistic approach for future mobile applications and interventions that seek to leverage friendship for other tasks.

This paper presents the results of a 3-month study of a cohort of 53 participants, with final analysis performed on 9 weeks of data from 48 participants. We combine mobile phone sensor data collection with established social network survey instruments, and use rich feature extraction from co-location data to see how well such data can be used to detect friendships, close friendships, and changes in friendship.

Our contributions are as follows:
• We present, to our knowledge, the first pairwise feature extraction from smartphone location data, and show that a classifier built with the extracted features performs 30% above random (Matthews correlation coefficient). This can serve as a baseline for all future work.
• We design a novel evaluation method (using temporal block assignment cross-validation and what we call dyadic assignment cross-validation) to mimic different realistic application settings in order to more rigorously test our classifier's generalizability to these settings, and use it to show that our approach is robust to seeing new pairings of individuals, and to variability in co-location patterns over time.

Below, we first review background work in social network analysis and in the mobile and pervasive computing literature around existing approaches to extracting interactions and/or friendships from mobile sensor data. We describe our study design and our feature extraction process.
With extracted features, we build a predictive model and use feature importance as a step towards characterizing the aspects of co-location most useful for detecting friendship. Detecting friendship is a link prediction problem [67], although we use the term 'detection' to emphasize that our predictions are of concurrent, not future, values. We avoid 'infer' as we do not employ standard errors.
Friendship is an example of a relational phenomenon. Relational phenomena are two-unit, or dyadic, relations [12]. Instead of the n individuals of a dataset being the observations, we have as observations the (n choose 2) undirected (symmetric) relations (e.g., co-location), or the 2 × (n choose 2) directed (potentially asymmetric) relations (e.g., self-reported friendships), between individuals.

Research throughout the 20th century has provided examples of friendship and other ties having explanatory and predictive power, such as in explaining why girls ran away from a school for delinquent teenage girls and predicting future runaways [78], explaining the breakup of a monastery [92] or the split of a karate club into two separate clubs [110], or using structures of informal networks to predict the success or failure of institutional reorganization [59]. Insights from these approaches have proved robust, as they have been successfully integrated into search engines, recommender systems, and the structure of social media platforms.

We represent a collection of friendship ties between n people as an n × n adjacency matrix A, where A_ij = 1 if i nominates j as a friend and A_ij = 0 otherwise, with A_ii = 0 by convention. A is not necessarily symmetric, as friendship ties collected from sources like surveys can and do yield cases of A_ij ≠ A_ji. As an (asymmetric) adjacency matrix defines a (directed) graph, which in this case is a friendship network, we refer to the n people as nodes. We refer to an (unordered) pair of individuals (i, j) as a dyad, and a value where A_ij = 1 as an edge, and also interchangeably as a tie or link.

While A_ij represents a 'dependency' between units i and j, for example A_ij = 1 representing a friendship nomination from i to j, such ties are themselves dependent [95]. For example, while friendship ties are not necessarily mutual or reciprocated, we still have A_ij = A_ji more often than we would expect at random.
In symbolic terms, P(A_ij) ̸⊥⊥ P(A_ji). Other such dependencies include transitivity, i.e., friend-of-a-friend connections, P(A_ij) ̸⊥⊥ P({A_ik, A_kj : k ≠ i, j}), and preferential attachment [86], P(A_ij) ̸⊥⊥ P({A_kj : k ≠ j}).

Not accounting for such dependencies can, in explanatory models, lead to too-small standard errors and omitted variable bias [33–35]. In predictive models [14, 94], dyadic dependencies impact the validity of cross-validation estimates of model performance [29] in ways similar to the impacts of other types of dependencies [48]. For example, if A_ij = A_ji, then for co-location features X_ij, if the pair (A_ij, X_ij) is in the training set and (A_ji, X_ij) is in the test set, we would be training and testing on the exact same row of data! In any link prediction task, having a cross-validation scheme that does not reflect an application setting (for example, using features related to network degree or to the number of mutual friends when we may not have the complete network) can potentially give a misleading picture of what true out-of-sample performance would be.

To avoid these problems, we first avoid using features that would not be available in an application setting (network features, lagged values of the class label, etc.). We also employ three different cross-validation schemes, described below. There are existing methods for doing cross-validation with network data [17, 29], but their validity depends on strong assumptions, and even then they do not control for all known dependencies; instead, by employing different cross-validation schemes, we get a better picture of how our classifier would work in different use cases.
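The row-duplication problem and the dyadic-assignment remedy can be sketched in a few lines. This is a minimal illustration of ours with made-up dyad indices and a round-robin fold assignment, not the paper's actual code:

```python
import numpy as np

# One row per ordered pair (i, j) of n nodes, as in the dyadic data above.
n = 10
pairs = [(i, j) for i in range(n) for j in range(n) if i != j]

# Key both orderings of a dyad, (i, j) and (j, i), to the same group id, so
# that a dyad's rows can never be split between training and test sets.
keys = [tuple(sorted(p)) for p in pairs]
dyad_id = {k: m for m, k in enumerate(sorted(set(keys)))}
groups = np.array([dyad_id[k] for k in keys])

# Assign each unordered dyad to one of 5 folds (round-robin here for
# illustration); both orderings inherit the dyad's fold.
n_folds = 5
fold_of_dyad = {d: d % n_folds for d in set(groups)}
folds = np.array([fold_of_dyad[g] for g in groups])

# Check: no dyad appears in both a held-out fold and its training complement.
no_leakage = all(
    set(groups[folds == f]).isdisjoint(groups[folds != f])
    for f in range(n_folds)
)
print(no_leakage)
```

Because every group id maps to exactly one fold, (A_ij, X_ij) and (A_ji, X_ij) always land on the same side of the split, removing that particular source of leakage.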
The connection between friendship and proximity has long been a topic of study in social science. In a foundational work of social psychology carried out in 1946, the 'Westgate study', Festinger et al. [42] carried out a field experiment around soldiers returning from WWII and attending graduate school at MIT on the GI Bill. They and their families were randomly placed into the units of the relatively isolated, newly built Westgate housing complex. The study authors were able to quantify the extent to which people living close by, or passing one another on the way to their residences, were more likely to become friends than with others with whom they did not have opportunities for interaction (although the study, by looking only at men, may have neglected an important causal process in the role of women [18]). Then, in 1954 and 1955, the 'Newcomb-Nordlie fraternity' study [81, 82] recruited two waves of 17 male students who did not know each other, gave them free 'fraternity-style' housing, and studied how their personal characteristics, political positions, and interests affected their eventual friendship formation. These two studies established that proximity plays an important role in friendship, but also that proximity is not sufficient, and that other characteristics matter.

Later work [61] further quantified the relationship between distance and friendship, finding "the inverse square of the distance separating two persons" to be a good fit to measures of social impact. A retreat for all incoming sociology majors at the University of Groningen provided another opportunity for studying the emergence of friendship, with van Duijn et al.
[103] finding that friendships developed due to one of four main effects: physical proximity, visible similarity, invisible similarity, and network opportunity.

There are also a number of studies using data from online social networks or other online platforms to study the connection between friendship and geography, although many of these are at the global scale [4, 21, 65, 68, 88] and do not have resolution at the scales at which interactions can occur. Furthermore, users of location-based social networks, or those who share locations on general online social networks like Twitter and Facebook, are a fraction of total users and form an unrepresentative sample [73] of the general smartphone-using population. Other work has taken call logs and used calls as the ties to study with respect to geography [83, 105], although call logs and SMS have a surprisingly poor relationship with friendship as measured by self-report [106, 107].

A landmark series of papers by Bernard, Killworth, and Sailer [9–11, 52, 53] showed that people are generally bad at recalling objective patterns of interaction. However, subsequent work [44, 58] argued that psychological perceptions of the network, rather than objectively measurable ties of interaction, are causal for individuals' behavior: subjective data may in some cases be more valuable for predicting and explaining, and thus the psychological perceptions captured by survey data may be more valuable for certain tasks than objective measurements.

This is related to adams' [sic] [1] critique of the model in Eagle et al. [39] predicting friendship from co-location; adams notes that there are 'close strangers and distant friends', both of which a method based on co-location would misclassify. In response, Eagle et al. [36] argue that some causal network processes might happen unconsciously, and sensor measurements might be able to detect these.
The correlation of friendship and proximity can make it difficult to sort out which may be causal for a given process [26]. But in the future, if we can decorrelate friendship and proximity by controlling for one or the other, it could help us understand whether social influence or environmental factors are more likely to be the causal factor. Furthermore, for many mobile use cases, even imperfect friendship detection will be relevant, making the task of friendship detection from co-location a worthwhile pursuit.
Since the first work with the 'sociometer' (later, 'sociometric badge') in 2002 [22], research groups have been using sensors to collect data about social interactions. Studies often used sensor 'nodes' or other custom devices [3, 43, 45, 49, 60, 85] and, with the exception of RFID tags [5], studies have largely used mobile devices [30, 37, 54, 55, 63, 66, 89, 91, 93, 99, 100] because of their wide range of existing sensors, and because their use results in lower participant burden than when participants are required to wear or carry an additional device. The data used is either co-location data (from GPS, self-reported check-ins, or mutual detection of fixed sensors or WiFi hotspots) or proximity data (through Bluetooth or RFID). There are also studies that use video to detect interactions [16, 51, 57, 104], and sociometric badges [22] or headsets [72] that gather audio data allowing for the detection of conversations between specific individuals; but, in our work, we focus on the co-location and proximity sensing made possible with mobile phones.

These existing studies largely fall into three categories. Most common are studies that describe infrastructure, technical details, and study design, followed sometimes by descriptive modeling of network characteristics and some bivariate relationships [2, 3, 5–7, 15, 23–25, 43, 45, 50, 54, 64, 66, 76, 89, 93, 99, 100].
This has been important work that has contributed to present-day mobile sensing tools – tools that are capable of going beyond exploration and overall descriptives into measuring specific processes, and to applications designed on top of those processes, like network-based health interventions [69].

The second category comprises works that try to build systems beyond the specific sensing platform. They may build or lay the groundwork for recommender systems [19, 20, 46, 63], or present models or algorithms for mining information about interactions from sensor data [30, 31, 37, 87].

The third category comprises works that employ statistical models or techniques to draw conclusions or make predictions using sensor data. Stehlé et al. [98] look at the connection between 'spatial behavior', measured by RFID badges, and gender similarity. The 'Friends and Family' or 'SocialfMRI' study and dataset [2] has been used to look at the connections between interaction and financial status [84] and between interaction and sleep and mood [79]. Madan et al. [70] used sensor measurements to look at the relationship between social interactions and changes in political opinion. The 'Reality Mining' dataset [37] has similarly been used to look at obesity and exercise in the presence of contact between people [71]. Another approach is that of Eagle & Pentland [38], which presents a spectral clustering system for extracting daily patterns from time series. Staiano et al. [97] use ego networks (induced subgraphs of single nodes and all their respective neighbors) in call logs, Bluetooth-based proximity networks, and surveys to predict Big-5 personality traits.

In this third category, and most similar to our work, are two papers based on the Reality Mining dataset [32, 39]. Eagle and Pentland [39] were the first to use mobile phone proximity data to infer a network of self-reported friendship ties.
They first calculated a 'probability of proximity' score over the range of a week as an average frequency of proximity over nine months of data, which they showed was systematically different for each of reciprocated self-reported ties, (A_ij, A_ji) = (1, 1), non-reciprocated ties, (A_ij, A_ji) ∈ {(0, 1), (1, 0)}, and no ties, (A_ij, A_ji) = (0, 0). But their model did not use cross-validation, and their findings were based on aggregating over nine months of data to model a friendship self-report from the first month, which does not match a mobile use case, which would involve detecting friendships only from recently gathered batches of location data.

We explored replicating their approach; while we also found that, when aggregated over the entire time period of data collection, there was a major difference between mutual friendship ties and both non-ties and non-reciprocated ties (fig. 3), this pattern proved ineffective for building a classifier because aggregation like this, over time, dyads, and splits of training and test sets, obscures the variance that poses challenges to good test performance.
Dong et al. [32] also modeled the co-evolution of behavior and social relationships from mobile phone sensor data. They outlined a model that predicted self-reported friendships from sensor data (and other survey data), but also did not use cross-validation, and only reported one performance metric: that the binomial model explained 22% of overall variance, of which 6% was due to sensor data. This is presumably from a pseudo-R-squared metric, but as the specific metric is not given and no cross-validated performance was reported, it is difficult to compare results.

We now describe the questions we seek to answer, and the study and analysis we conducted to answer them.
The goal of our study is to understand the feasibility of inferring social relationships (friendship in particular) from (only) passive smartphone data. We are especially interested in the following questions:

(1) How well can we detect friendships from co-location features? In other words, if all we know about two people in a social system is their location patterns, how accurately can we say if they are friends?
(2) If we know that friendships exist, how well can we detect whether these friendships are close friendships?
(3) How accurately can we detect whether a friendship is likely to change? Will co-location patterns provide information about the creation or dissolution of friendships?

To answer these questions, we carried out a 3-month study among members of a fraternity, using smartphone data to try to capture interactions and relationships as they formed and evolved during that period. The following section describes the study setup and data collection process.
We recruited members of an undergraduate fraternity at a research university in the northeastern United States. The fraternity had 60 members at the start of the study, with an additional 21 prospective members going through the 'pledging' process during the study, of whom 19 completed the process. Of this cohort of 79 men, we recruited 66 participants, of whom 53 ultimately participated in sensor data collection, and of whom 48 responded to at least one survey wave. Having this sort of well-defined boundary specification [62] let us ask each study participant about their friendships with each member of the fraternity, giving negative examples that are explicit, unlike open-ended solicitations of friendships (such as from 'name generator' instruments) in which individuals are only implicitly not friends by not being mentioned.

The fraternity was relatively loose-knit; about 20 fraternity members lived in a fraternity house, with the rest living elsewhere and required to be at the fraternity only one day a week (for a fraternity chapter-wide meeting). Participants were compensated $20 a week for having the passive and automated sensor data collection software, AWARE [41], installed on their smartphones, with additional $5 incentives for each survey wave they completed.
Our task was to use mobile phone sensor data relating to location and proximity in a model that could recover self-reported friendship ties, and changes in such ties. Consequently, we collected survey data about friendships in three waves, and used AWARE to collect Wifi, Bluetooth, and location data from mobile phones.
During the study, participants were asked to fill out a survey asking about their social connections, based on existing instruments [56, 102]. There was a public listing of fraternity members, and consequently we were able to ask about respondents' ties to all fraternity members (i.e., ask about ties to everybody in the specified boundary), not just those participating in the study. While we were not able to relate friendships with non-participant fraternity members to sensor data, since non-participation meant we did not have sensor data, it does give a sense of the importance of non-participants in the social system.

The surveys were collected three times over 9 weeks: shortly after the beginning of the study, then four weeks after, and lastly at the end of the study five weeks later (we made the second period longer, as one of these five weeks was spring break, when many study participants were away from campus). Participants were asked about five different quantities: their recollections about who they interacted with frequently; who they considered to be a friend; who they considered to be a close friend; who they went to for advice on personal matters; and who they went to for advice on professional/academic matters. The correlations between these collected networks, with each other and over time, are given below in figure (4). Friendship can change at shorter intervals than six weeks; but since friendship is an internal and subjective psychological construct, currently the only way of getting data on friendship is surveys, whose high respondent burden makes it infeasible to collect at more frequent intervals.
We equipped each participant with the AWARE mobile phone framework [41] on their iOS devices (≈90% of participants) or Android devices (the remaining ≈10%).

The completeness of the survey data is shown in figure (1a). The response rate dropped in each survey round; compared to survey 1, survey 2 had a response rate of 59%, and survey 3 had a response rate of 51%. In total, there were 48 participants providing network data, 34 of whom responded to at least 2 surveys, giving us longitudinal network data (the minimum requirement for detecting changes in friendship), including 20 participants who responded to all 3 surveys.

The completeness of the sensor data is shown in figure (1b). Some logistical problems prevented all participants from starting smartphone data collection on the first day, and some participants discontinued the use of the app because of technical issues (battery life, sporadic interference with certain external Bluetooth devices, etc.).

For example, if non-participants were all seldom nominated by respondents (corresponding to low indegree in the collected networks), it would mean that study non-participation is related to being unimportant in the social system, which would be encouraging, although this did not turn out to be the case.
Fig. 1. (a. Left) Looking at longitudinal completeness, 14 people completed survey 1 only, and none completed surveys 2 or 3 only. 9 people completed surveys 1 and 2, 5 people completed surveys 1 and 3, none completed 2 and 3 only, and 20 people completed all three waves. This is shown in the vertical bars at the top. This comes out to 48 respondents for survey 1, 29 respondents for survey 2, and 25 respondents for survey 3, shown in the solid horizontal bars on the right side. (b. Right) We show the time periods (between Feb 08 and Apr 24) in which sensor data was collected for people who answered one survey (dotted lines), two surveys (dashed lines), or all three surveys (solid lines). The times of the three surveys are marked with vertical lines.

Fig. 2. A plot of the survival function (the empirical complementary cumulative distribution) for time not covered by sensor data. 40% of the study time across all participants is covered by sensor readings less frequent than 10 minutes, meaning that 60% of the data has no gaps in coverage longer than 10 minutes.
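The survival function in Fig. 2 is the empirical complementary CDF of the times between consecutive sensor samples. A minimal sketch of how such a curve is computed, using made-up gap lengths rather than the study's data:

```python
import numpy as np

# Hypothetical gaps (in minutes) between consecutive sensor samples.
gaps = np.array([1, 2, 2, 5, 10, 10, 30, 60, 480])

def survival(t):
    # Empirical complementary CDF: P(time between consecutive samples >= t),
    # i.e., the fraction of gaps at least t minutes long.
    return np.mean(gaps >= t)

print(survival(10))  # fraction of study gaps of 10 minutes or more
```

Evaluating `survival` over a grid of t values and plotting it gives a curve of the same form as Fig. 2.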
There were two sources of missing values in the calculated features: either artifacts of no observations fulfilling a certain criterion (e.g., no co-locations within 50m on mornings), or else actual missing data (one or both mobile devices were not providing a certain sensor's data during a given period, e.g., the mornings of a given week). For the former (artifacts), we replaced missing values with appropriate substitutes, such as 0s or the maximum possible value. For logarithmic features, some of which could be less than 1, we replaced −∞ with zeros. For inverse-squared features, we replaced ∞ with a value, 200, slightly larger than the largest observed inverse-squared value.
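As an illustration of this artifact handling (with assumed feature values, not the study's data), the sentinel replacements can be applied directly to the feature vectors:

```python
import numpy as np

# Hypothetical raw values; 0 durations produce -inf under log, and 0
# distances produce +inf under inverse-squaring. NaN marks true missing data.
log_feat = np.log(np.array([0.0, 0.5, 2.0, np.nan]))       # -inf, ..., nan
inv_sq = 1.0 / np.array([0.0, 0.1, 1.0, np.nan]) ** 2      # +inf, ..., nan

# Artifacts: -inf in log features becomes 0; +inf in inverse-squared
# features becomes 200, just above the largest observed finite value.
log_feat[np.isneginf(log_feat)] = 0.0
inv_sq[np.isposinf(inv_sq)] = 200.0

# True missing data (NaN) is deliberately left in place for the classifier.
print(log_feat, inv_sq)
```

Note that NaNs survive both replacements, matching the choice described below of keeping genuinely missing cells in the feature matrix.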
Fig. 3. The median weekly pairwise distances between reciprocated (mutual) friendships, A_ij = A_ji = 1, non-reciprocated friendships, A_ij = 1 ≠ A_ji = 0 or A_ij = 0 ≠ A_ji = 1, and non-friendships, A_ij = A_ji = 0, for times when pairs are within the area of the university, aggregated over the entire period of data (i.e., no training/test split). This is analogous to the approach of [39], and this figure reproduces their figure 2 (except with median distance, rather than mean frequency of proximity). While it appears there is a strong pattern, it is the result of an aggregation that obscures the variance between weeks and in data splitting, such that this seeming pattern proved ineffective as a basis of classification in testing.

Based on the distribution of lengths of time where data was missing (fig. 2), we would have needed to interpolate over intervals of up to eight hours to have any real impact on the proportion of time that has missing data. Consequently, we chose not to use partial interpolation, and kept cells of missing values in the feature matrix. This necessitated using classifiers that can handle missing values among the features, like the R recursive partitioning (decision tree) implementation rpart [101], which has procedures for handling missing values when constructing decision trees, and other packages built on top of rpart.

We did test our assumptions about the importance of maintaining missing values by trying different variations: last-value-carried-forward interpolation on the time series prior to feature extraction, as well as mean, median, and mode interpolation on the matrix of extracted features, but none of these improved results.

Spring break may be extremely informative: for example, if two people are proximate to each other but far from everybody else, they may be more likely to be friends. However, spring break is systematically different from every other week, such that if we train on spring break, we have no meaningful test set. Thus, we removed spring break from the data set. This is also why the two periods have an unequal number of weeks, with 4 weeks between survey waves 1 and 2 and 5 weeks between survey waves 2 and 3; spring break fell between survey waves 2 and 3, such that removing it leaves 4 weeks in each period.
We asked participants about five different types of ties in each of the three surveys: following previous social science literature, we asked about advice-seeking relationships (both personal advice-seeking and academic/professional advice-seeking), in addition to asking about friendships and, for each reported friendship, asking if it was also a close friendship. For comparison with work on recall [9–11, 52, 53] and memorability of social interactions [61], we also asked about frequency of interaction.

The similarities between these collected networks, both for the same network across the three waves and between the different networks, are given in figure (4). The similarity metric used is the Jaccard index, a common method for comparing networks (as it looks only at ties shared across the two networks, not shared non-ties), potentially of overlapping but unequal sets of nodes, which in our case happens because of non-response. For two networks N_A and N_B, with n_{A∩B} overlapping nodes and adjacency matrices A and B restricted to these nodes, the Jaccard index is the number of ties present in both networks divided by the number of ties present in either: J(A, B) = |{(i, j) : A_ij = B_ij = 1}| / |{(i, j) : A_ij = 1 or B_ij = 1}|.

As we can see, there is a much higher correlation between self-reported frequent interaction and friendship than there is between friendship and close friendship. There is also a high correlation between close friendship and the two types of advice ties; while we did not use advice ties in the current analysis, this similarity gives insight into what types of relationships the prompt about 'close friendships' elicits (like the prompt about friendship, we explicitly did not define what we mean by 'close', letting participants interpret the term).

Looking at the changes in the networks from survey to survey, we see that close friendships and both types of advice-seeking relationships are much less variable over time than are friendships or self-reports of frequent interaction.
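The Jaccard computation on two adjacency matrices restricted to their common nodes is straightforward; a sketch with small hypothetical matrices:

```python
import numpy as np

# Two directed networks over the same 3 overlapping nodes (made-up ties).
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [0, 0, 0]])
B = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 0, 0]])

# Jaccard index: ties present in both networks over ties present in either;
# shared non-ties (0s in both) are deliberately ignored.
both = np.logical_and(A == 1, B == 1).sum()
either = np.logical_or(A == 1, B == 1).sum()
jaccard = both / either
print(jaccard)
```

Here two ties appear in both networks and four appear in at least one, so the index is 0.5; applying this to each pair of the 15 collected networks yields the matrix of similarities in figure (4).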
Fig. 4. The similarity between networks (5 types of ties, each collected 3 times), measured via the Jaccard index. The self-reported frequent interaction and friendship networks are more similar than the other networks, and both also exhibit more variation across the three waves.
Collected Bluetooth data turned out to be unusable. Both Android and iOS no longer make the 12 hex digit Bluetooth MAC addresses of detected devices available to app developers. Instead, detected devices are recorded in terms of a 32 hex digit universally unique identifier (UUID), which is assigned by the detecting device uniquely to each detected device and used to recognize those detected devices in the future.
Following the findings of Wiese et al. [106, 107] that call logs and SMS do not necessarily help detect degree of friendship, we elected to restrict our attention to co-location only. Another reason to avoid call logs and SMS is that such communications metadata are already seen as sensitive and intrusive, even if they were to turn out not to help us predict our target of interest. Lastly, there was a substantive reason not to use communications data: we were informed that the fraternity largely used a group chat application for communicating with one another, such that we expected call logs and SMS not to capture any informative aspects of communications.
One candidate for characterizing proximity is when two devices detect or connect to the same Wifi device. Here, Wifi hotspot MAC addresses are unique (unlike the hotspot name/label, which for example with ‘eduroam’ is shared not only across multiple hotspots in the same university, but across multiple cities around the world!), and mutual detection of a hotspot picks up when two devices are proximate. Out of 830 potential pairs, 406 pairs of mobile devices detected at least one Wifi hotspot in common (although not necessarily at the same time).

As mentioned above, we also conducted Wifi fingerprinting in the fraternity house, collecting all Wifi devices detected in each room along with the received signal strength indication (RSSI) of the respective signals. In order to perform Wifi fingerprinting (matching the set of hotspots detected by a mobile phone in a given scan, with their respective RSSIs, to previously collected profiles from specific rooms), we needed multiple detected hotspots per scan (every 10 minutes). However, we found that only about 6.7% of scans for Wifi hotspots recorded more than one detected hotspot; in the frat house as well, we could tell when a device was connected to one of the frat house’s Wifi hotspots, but not which other hotspots were detected in order to determine a specific room. Thus, we only use as the basis for features whether at least one Wifi hotspot was detected in common in the scan of a specific 10 minute interval from two devices, ignoring the tiny fraction of scans that include multiple detected hotspots, and also ignoring RSSI. The frat house has 5 main Wifi devices for about 30 rooms over 3 floors. Based on the size of rooms in the fraternity house and the relative coverage of its Wifi devices, we estimate that, at least within the fraternity house, our Wifi localization approach is accurate to somewhat less than its general 32m radius, perhaps 20m or so; however, we do not have similar measurements for the rest of campus.
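The Wifi co-detection feature can be sketched as follows (illustrative code, not the study pipeline; it assumes each device's scans are stored as a mapping from 10-minute interval index to the set of hotspot MAC addresses detected in that scan):

```python
# Sketch: a pair is "co-detecting" in a 10-minute interval if both devices'
# scans in that interval share at least one hotspot MAC address.

def codetection_series(scans_i, scans_j, intervals):
    """Binary series: 1 if devices i and j detected a common hotspot at t."""
    series = []
    for t in intervals:
        common = scans_i.get(t, set()) & scans_j.get(t, set())
        series.append(1 if common else 0)
    return series

scans_a = {0: {"aa:bb"}, 1: {"aa:bb", "cc:dd"}, 2: set()}
scans_b = {0: {"ee:ff"}, 1: {"cc:dd"}, 2: {"aa:bb"}}
out = codetection_series(scans_a, scans_b, [0, 1, 2])  # [0, 1, 0]
```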
Since we have, from previous theory, that co-location is causally related to friendship through interaction, we would ideally want to extract features from pairwise distance measurements that will be effective as a proxy for interaction. However, in the Google Fused Location plugin that AWARE uses to collect location data, we used the PRIORITY_LOW_POWER option, which prioritizes low power usage, as previous testing with AWARE had shown battery drain was a major cause of participant dropout. This low power option does not actively use GPS, instead using a combination of cell phone towers and detected Wifi hotspots with known geolocations, and is advertised as being accurate to within about 10km. In the iOS client, the accuracy setting corresponding to low power use was to set the desiredAccuracy option to 1km, with a threshold for recording new movements of 1000m. In practice, the reported accuracy was usually much better, with a significant portion of readings reporting an accuracy of within 10m. https://developer.apple.com/library/content/documentation/UserExperience/Conceptual/LocationAwarenessPG/CoreLocation/CoreLocation.html
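Though the paper does not give its exact distance computation, the pairwise distance between two devices' simultaneous fixes would typically use a great-circle formula such as the haversine; a sketch, with illustrative coordinates:

```python
import math

# Sketch: great-circle (haversine) distance in meters between two lat/lon fixes;
# one plausible way to turn a pair's location streams into pairwise distances.

def haversine_m(lat1, lon1, lat2, lon2):
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# A 0.001-degree difference in latitude is roughly 111 m
d = haversine_m(40.4433, -79.9436, 40.4443, -79.9436)
```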
Fig. 5. A survival function (a. left), plotted in log-x scale, shows pairwise distances over time. Based on the ‘elbow’ around 2000m (approximately the size of the university and surrounding area), marked with a vertical dotted line, we only found clusters for pairwise distances below 2000m. Below 2000m, we clustered distances (again weighted by the time spent at that distance). The fitted clusters are shown on top of a kernel density estimate (b. right) that gives a detail of the head of the distribution. The cluster breaks are at 207m, 422m, 626m, 822m, 1001m, 1178m, 1373m, 1570m, 1776m, and then our cutoff of 2000m. These are also listed in table (1).

Additionally, when calculating the continuous-valued time series of pairwise distances, we also generated binary time series for whether the locations of both members of the pair fell within a geobox around the university’s campus, and a geobox around the fraternity house.

As a way of reducing the continuous-valued time series, we sought to pick several choice thresholds that might characterize geographic similarity in a simple way. First, we plot an empirical complementary cumulative distribution function (i.e., a survival function) in log-x scale (fig. 5a) to see the overall distribution. There is an ‘elbow’ around 2000m, which is about the size of the university and surrounding area. Then, within 2000m, we use 1-dimensional clustering [105], weighted by time and using 10 clusters, and used the boundaries of the fitted clusters as thresholds. These thresholds are shown as the boundaries of regions of gray over a kernel density estimate of the distribution over the first 2000m (fig. 5b).

After data processing, we have the following:
• pairwise distances
• 10 binary time series of whether both members of a given pair were within a given threshold of each other
• a binary time series of whether both members of a pair were within the geobox around campus
• a binary time series of whether both members of a pair were within the geobox around the fraternity house
• a binary time series of whether the pair detected at least one Wifi hotspot in common
• a binary time series of whether the pair detected a Wifi hotspot in the fraternity house in common
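The time-weighted 1-dimensional clustering used above to derive the thresholds can be approximated with a simple weighted Lloyd's iteration (a rough stdlib stand-in for the cited 1-D clustering method [105]; the data here are illustrative):

```python
import random

# Sketch: weighted 1-D k-means on scalar distances below the 2000 m cutoff.
# The sorted cluster boundaries (midpoints between adjacent centers, plus the
# cutoff) serve as co-location thresholds.

def weighted_kmeans_1d(xs, ws, k, iters=50, seed=0):
    rng = random.Random(seed)
    centers = sorted(rng.sample(xs, k))
    for _ in range(iters):
        sums, wsum = [0.0] * k, [0.0] * k
        for x, w in zip(xs, ws):
            c = min(range(k), key=lambda j: abs(x - centers[j]))
            sums[c] += w * x
            wsum[c] += w
        # empty clusters keep their old center
        centers = sorted(s / t if t else c0
                         for s, t, c0 in zip(sums, wsum, centers))
    return centers

def thresholds(centers, cutoff):
    mids = [(a + b) / 2 for a, b in zip(centers, centers[1:])]
    return mids + [cutoff]

xs = [50, 60, 400, 420, 900, 950]  # pairwise distances (m), toy data
ws = [3, 1, 2, 2, 1, 1]            # time spent at each distance
cuts = thresholds(weighted_kmeans_1d(xs, ws, k=3), cutoff=2000)
```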
Distribution to summarize (⊗):
• Pair is within 207m (binary)
• Pair is within 422m (binary)
• Pair is within 626m (binary)
• Pair is within 822m (binary)
• Pair is within 1001m (binary)
• Pair is within 1178m (binary)
• Pair is within 1373m (binary)
• Pair is within 1570m (binary)
• Pair is within 1776m (binary)
• Pair is within 2000m (binary)
• Pair is within geobox around campus (binary)
• Pair is within geobox around fraternity house (binary)
• Wifi device detected in common (binary)
• Wifi device in fraternity house detected in common (binary)
• Pairwise distances (continuous-valued)

Statistic (⊗). For the binary time series, over the series itself:
• Count of times within threshold
• Mean of count
• Standard deviation of count
• Standard deviation squared of count
• Count(1 − Count)
Over spans (distribution of lengths of consecutive 1s) and, separately, over gaps (distribution of lengths of consecutive 0s):
• Mean of lengths
• Median of lengths
• Standard deviation of lengths
• Minimum length
• Maximum length
• Mean of logarithm of lengths
• Median of logarithm of lengths
• Standard deviation of logarithm of lengths
• Minimum of logarithm of lengths
• Maximum of logarithm of lengths
For the continuous-valued pairwise distances:
• Mean
• Median
• Standard deviation
• Mean of logarithm
• Median of logarithm
• Standard deviation of logarithm
• Mean of inverse-squared distance
• Median of inverse-squared distance
• Standard deviation of inverse-squared distance

Timeframe (⊗):
• 4-week period
• Weekdays only within 4-week period
• Weekends only within 4-week period
• Mornings [6am - 12pm) only within 4-week period
• Afternoons [12pm - 6pm) only within 4-week period
• Evenings [6pm - 12am) only within 4-week period
• Nights [12am - 6am) only within 4-week period
Table 1. Extracted features. “⊗” indicates taking all pairwise combinations. The thresholds are irregularly spaced because they are empirically derived from 1-dimensional clustering; see figure (5b) for these clusters.

We include summary statistics of logarithms because these distributions are often heavily right-skewed. Additionally, for the binary time series, we can consider the length of sequences of consecutive 1s (spans of co-location at the given threshold) and of consecutive 0s (gaps between co-location at the given threshold). These are integer-valued but we treat them as continuous, and calculate an additional set of summary statistics accordingly. All of these summary statistics are given in the central column of table (1).

Lastly, each of these feature types is crossed with time periods: weekdays only and weekends only, nights only (12am - 6am), mornings only (6am - 12pm), afternoons only (12pm - 6pm), and evenings only (6pm - 12am). These are shown in the right column of table (1).

In total, there are 9 features for the continuous-valued time series of pairwise distances, and 12 × 25 = 300 features for the binary time series of thresholds and geoboxes; each of these 309 features is calculated over seven timeframes, for 309 × 7 = 2,163 features. The two Wifi co-detection time series add 2 × 25 × 7 = 350 additional features, for a total of 2,513 features. We extracted these over two 4-week periods, corresponding to the 4 weeks between surveys 1 and 2, and the 5 weeks between surveys 2 and 3 with the week of spring break subtracted out.
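The span/gap summaries of table (1) can be sketched for a single binary co-location series (illustrative code; the actual pipeline computes these per pair, per threshold, and per timeframe):

```python
from itertools import groupby
from statistics import mean, median, pstdev

# Sketch: runs of consecutive 1s are spans of co-location; runs of
# consecutive 0s are gaps between co-locations.

def runs(series, value):
    return [len(list(g)) for v, g in groupby(series) if v == value]

def summarize(lengths):
    if not lengths:
        return {}
    return {"mean": mean(lengths), "median": median(lengths),
            "sd": pstdev(lengths), "min": min(lengths), "max": max(lengths)}

series = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
spans = runs(series, 1)   # [2, 1, 3]
gaps = runs(series, 0)    # [3, 1]
span_stats = summarize(spans)
```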
We take on three targets for modeling.
(1) Detecting friendship. This is a standard binary classification task. In this task, we do not make use of survey wave 1.
(2) Detecting friendship strength. Given that two people are friends, can we detect whether or not they have reported that the friendship is a close one? For this, we restrict the data set to instances of friendship ties only, and do binary classification of close friendships. Again, we do not make use of survey wave 1.
(3) Detecting change in friendship. Here, our targets are
• P(A^(t)_ij = A^(t+1)_ij | X^[t,t+1)): No change in friendship (either no friendship, or maintained friendship)
• P(A^(t)_ij ≠ A^(t+1)_ij | X^[t,t+1)): Change in friendship (either tie creation or tie dissolution)
While we ideally would be able to separately model tie creation and dissolution, as they are distinct processes [96], in our data only a small proportion of ties changed in either direction, such that modeling became difficult. We will see below that this modeling target was the most challenging of all, although treating it as a multiclass problem over the direction of change only led to worse performance.

In order to comprehensively evaluate our classifier’s performance at detecting friendship, we use three cross-validation schema. Each corresponds to a different use case, and tests the generalizability of our method to that use case. In each case, dependencies (redundancies in data, latent or unmodeled similarities) between training and test sets can share information across a split in the data, dependencies that would not be present in application settings, thereby inflating test performance compared to real-world performance.

Each schema uses a different rule and use case to assign observations to training and test folds. The rules and use cases of these schema are detailed below.
Unrestricted assignment. This independently assigns each observed A_ij to a fold. It corresponds to a use case where a model is trained on a population (n − k pairs) and then applied back to k pairs from the same population (potentially seeing the same people multiple times, or the same dyad in multiple directions).

Dyadic assignment. This groups all values associated with a pair of individuals (a dyad), that is, (A^(1)_ij, A^(1)_ji, A^(2)_ij, A^(2)_ji, A^(3)_ij, A^(3)_ji), and assigns the entire 6-tuple to a single fold. Some values in the tuple will be missing, causing folds to be of different sizes; but since assignment to a fold does not depend on the number of missing values, sizes will be the same in expectation.

Such assignment controls for reciprocity and temporal autocorrelation. For reciprocity, if A_ij = A_ji, then the label-feature pairs (A_ij, X_ij) and (A_ji, X_ji) are identical and should not be split between training and test. Similarly for temporal autocorrelation: if two people’s friendship and co-location patterns do not change over time, then (A^(t)_ij, X^[t−1,t)_ij) and (A^(t+1)_ij, X^[t,t+1)_ij) would also be very similar and should not be split between training and test.

Cross validation with dyadic assignment corresponds to a use case where we have not previously seen the labeled co-location patterns of a given dyad, whether previously in time or in one direction, to have included it as a training instance.
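Dyad-level fold assignment can be sketched as follows (illustrative code; `dyad_folds` and its observation format are our own naming, not from the paper):

```python
import random

# Sketch: every observation of an unordered pair {i, j} -- both directions,
# all survey waves -- lands in the same cross-validation fold.

def dyad_folds(observations, k, seed=0):
    """observations: list of (i, j, wave); returns a fold index per observation."""
    dyads = sorted({frozenset((i, j)) for i, j, _ in observations}, key=sorted)
    rng = random.Random(seed)
    rng.shuffle(dyads)
    fold_of = {d: idx % k for idx, d in enumerate(dyads)}
    return [fold_of[frozenset((i, j))] for i, j, _ in observations]

obs = [(1, 2, 1), (2, 1, 1), (1, 2, 2), (1, 3, 1)]
folds = dyad_folds(obs, k=2)
# All (1,2) observations share one fold; the dyad (1,3) gets the other
```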
Temporal block assignment. This splits data by whether a class label is from survey 2 or survey 3 (for detecting friendship and strength of friendship), or is the change from survey 1 to 2 or the change from survey 2 to 3 (for detecting change in friendship). In other words, for detecting friendship and strength, we train on (A^(2), X^[1,2)) and test on (A^(3), X^[2,3)), and for detecting change, we train on (A^(1), A^(2), X^[1,2)) and test on (A^(2), A^(3), X^[2,3)).

As a note, here we can only split into 2 folds, as we only have two observation spans between different surveys. Cross validation with temporal block assignment [8, 90] accounts for temporal variation in co-location. If there is a great deal of variability in co-location patterns, then our classifier would have little generalizability over time. In this case, if we train with instances with features from both X^[t−1,t) and X^[t,t+1), it would even out the temporal variation and obscure the lack of generalizability. But if we train only on instances associated with features X^[t−1,t) and then test only on instances associated with features X^[t,t+1), it simulates how well our classifier will do in predicting friendships from future patterns of co-location data.

To summarize classifier performance, we rely on the Matthews correlation coefficient (MCC). This is the same as Pearson’s ϕ, or mean square contingency coefficient, an analog for a pair of binary variables of Pearson’s product-moment correlation coefficient, but was rediscovered by Matthews [74] for use as a classification metric. For the counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), the MCC is

MCC = (TP × TN − FP × FN) / √((TP + FP) × (TP + FN) × (TN + FP) × (TN + FN)).

The MCC has several desirable properties. First, like the F1 score and area under the ROC curve (AUC), it summarizes the performance on both classes in a single number. Unlike AUC and F1, however, it has an interpretable range: 0 for random predictions, −1 for perfect misclassification, and 1 for perfect classification. Most helpfully, it is a good summary of performance in cases of class imbalance [13], which we have here (about a 25:75 split). We include other metrics, but rely on the MCC as the single-number summary of how far we are above a random baseline of MCC = 0. Note that, if we predict the majority class for all instances, the MCC is also zero.
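A sketch of the MCC computation, including the degenerate case of all-majority-class predictions, where a zero denominator is conventionally mapped to 0:

```python
import math

# Sketch: Matthews correlation coefficient from a 2x2 confusion matrix.

def mcc(tp, tn, fp, fn):
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # e.g., classifier predicted only one class
    return (tp * tn - fp * fn) / denom

perfect = mcc(tp=10, tn=30, fp=0, fn=0)    # 1.0
majority = mcc(tp=0, tn=30, fp=0, fn=10)   # 0.0: all-negative predictions
```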
Feature selection can often improve classifier performance, but it is also useful for diagnostic and exploratory analysis. In our case, we are interested in a reduced set of features that can provide similar or better classification results, and that may be less burdensome to extract for use in real-time mobile applications built on friendship detection. To produce a selected set of features, we use Correlation-based Feature Selection (CFS) [47], which selects features that are both correlated with the class label and uncorrelated with one another.

To select the most stable set of features, we run the CFS method on the training set built with what turns out to be our most conservative cross validation scheme, temporal block assignment. We take the half of the data with features extracted from the first four weeks and further divide it into 10 folds. We perform CFS on each fold, then look at the features that were selected in the maximum number of folds, an approach also applied more formally elsewhere [75].
We choose those features that appeared in CFS runs on at least 9 of the 10 folds. These features are then entered into the classification process for friendship detection.
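A toy sketch of CFS with greedy forward search, using Hall's merit k·r_cf / √(k + k(k−1)·r_ff); the data and helper names here are illustrative, not the study's implementation, and the stability rule then simply keeps features chosen in at least 9 of 10 folds:

```python
import math

# Toy sketch of Correlation-based Feature Selection (CFS). Merit of a subset of
# size k: k * r_cf / sqrt(k + k*(k-1)*r_ff), where r_cf is the mean
# |feature-class correlation| and r_ff the mean |feature-feature correlation|.

def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def cfs_greedy(X, y, names):
    r_cy = {f: abs(pearson(X[f], y)) for f in names}
    selected, best_merit = [], 0.0
    while True:
        best = None
        for f in [n for n in names if n not in selected]:
            feats = selected + [f]
            k = len(feats)
            rcf = sum(r_cy[g] for g in feats) / k
            pairs = [(a, b) for i, a in enumerate(feats) for b in feats[i + 1:]]
            rff = (sum(abs(pearson(X[a], X[b])) for a, b in pairs) / len(pairs)
                   if pairs else 0.0)
            merit = k * rcf / math.sqrt(k + k * (k - 1) * rff)
            if merit > best_merit:
                best, best_merit = f, merit
        if best is None:  # no addition improves the merit
            return selected
        selected.append(best)

# Feature "a" tracks the label perfectly; "b" and "c" add little beyond it.
X = {"a": [1, 2, 3, 4, 5], "b": [2, 1, 4, 3, 6], "c": [5, 5, 5, 1, 1]}
y = [1, 2, 3, 4, 5]
picked = cfs_greedy(X, y, ["a", "b", "c"])  # ["a"]
```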
Results for the three cross-validation schemes are given in table (2). In each case, the no information rate corresponds to the proportion of the majority class, 0, and would be the accuracy we would get if we always predicted no tie.

The unrestricted assignment gives better results than either of the other two CV schema, showing that labeling a previously unseen dyad is indeed a more specific and difficult task than what is evaluated by unrestricted assignment, and that there is a significant amount of variation in co-location patterns over time—and that while our classifier performance does drop, it still generalizes across patterns in time.

We use a one-sided binomial test of the accuracy against the No Information Rate (NIR), equal to the frequency of the majority class, and find that both unrestricted and dyadic CV are significant at the usual p < .05 level. Under temporal block CV, the classifier is only significantly better than the NIR at the p < .10 level.
Cross validation                            Unrestricted        Dyadic              Temporal block
Accuracy                                    0.8006              0.7920              0.7913
Accuracy, 95% CI                            (0.7882, 0.8125)    (0.7794, 0.8042)    (0.7726, 0.8091)
(No Information Rate / Majority class)      (0.7740)            (0.7740)            (0.7785)
Binomial test, Accuracy vs. NIR, p-value    p = 1.5e-05         p = 0.0025          p = 0.0901
Precision (Positive predictive value)       0.6918              0.6508              0.6812
Recall/Sensitivity (True positive rate)     0.2122              0.1723              0.1088
Specificity (True negative rate)            0.9724              0.9730              0.9855
F1 score                                    0.3248              0.2724              0.2964
AUC                                         0.7148              0.7039              0.1876
Matthews correlation coefficient            0.3039              0.2562              0.2120

Table 2. Friendship detection, test performance across the three CV schema. The no information rate corresponds to a baseline accuracy given by predicting no ties; in the case of networks, this is 1 minus the density of the network.
We repeat the assessment of the above models, conditioning on the presence of a friendship, and making our detection target whether or not a friendship is reported to be close. In this case, the network of close friendships has a network density of .41, making the no information rate .59. We see a similar pattern of performance, with temporal block CV being the most conservative (18% better than baseline) and unrestricted CV being more optimistic (32% better than baseline).
Detecting loss in friendships could be particularly important for social interventions, such as preventing the onset of isolation. However, the rarity of changes in friendship (only 13% of ties change, either being created or dissolving) complicates modeling.

Our approach to meaningfully detecting changes in friendship proved challenging. AdaBoost failed to predict any positive test cases for any CV schema; a random forest performed better, with a Matthews correlation coefficient of .07 for the unrestricted CV and .03 for the dyadic-based CV
(see table (4)). The classifier output does not pass a statistical test for being significantly better than the No Information Rate. One of the reasons for the poor performance may be the type of features used in the classification. We used the same aggregated features used for friendship detection to detect change. However, change in friendship may be reflected in the feature values, and thus a feature set that contains change values may better capture change in friendship.

Cross validation                            Unrestricted        Dyadic              Temporal block
Accuracy                                    0.6817              0.6670              0.5741
Accuracy, 95% CI                            (0.6511, 0.7112)    (0.6361, 0.6969)    (0.5259, 0.6212)
(No Information Rate / Majority class)      (0.5861)            (0.5861)            (0.5185)
Binomial test, Accuracy vs. NIR, p-value    p = 7.6e-10         p = 1.8e-07         p = 0.0117
Precision (Positive predictive value)       0.6904              0.6711              0.7069
Recall/Sensitivity (True positive rate)     0.4188              0.3832              0.1971
Specificity (True negative rate)            0.8674              0.8674              0.9241
F1 score                                    0.5213              0.4879              0.3083
AUC                                         0.6997              0.6695              0.5889
Matthews correlation coefficient            0.3250              0.2906              0.1777

Table 3. Close friendship detection, conditioned on the presence of a friendship, test performance across the three CV schema.
Cross validation                            Unrestricted        Dyadic
Accuracy                                    0.6842              0.8645
Accuracy, 95% CI                            (0.6692, 0.6989)    (0.8532, 0.8752)
(No Information Rate / Majority class)      (0.8710)            (0.8710)
Binomial test, Accuracy vs. NIR, p-value    p = 1               p = 0.8902
Precision (Positive predictive value)       0.1676              0.2093
Recall/Sensitivity (True positive rate)     0.3651              0.0183
Specificity (True negative rate)            0.7315              0.9898
F1 score                                    0.2297              0.0336
AUC                                         0.5483              0.5167
Matthews correlation coefficient            0.0720              0.0256

Table 4. Change detection, random forest test performance. AdaBoost made only negative test classifications, but random forests (performance shown here) did make some positive classifications under unrestricted and dyad-based CV, although under temporal block CV there were again no positive classifications.
While we applied CFS to select features from the training set in all tasks, the features selected were not always consistent across folds and across cross validation schema. So, we focus on the features selected in the case of the most conservative cross validation schema, and the extent to which feature selection improved model performance there.

Applying CFS to only the training data from temporal block assignment and splitting it into 10 folds, we find 19 features that are selected in 9 or 10 of the folds. Using only these features leads to improved test performance under temporal block assignment, shown in table (6), which also includes the test performance with this set of features under each cross validation scheme.

While the test MCC of CV with unrestricted assignment goes down, with this fraction of only 19 features the test MCC of CV with dyadic assignment rises slightly, and the test MCC of CV with temporal block assignment does far better, going from an MCC of .21 to .27. These 19 features, then, seem to be picking up a significant portion of the pattern in co-location data, and a pattern that is more robust to changes over time.

While it is dangerous to substantively interpret the selected features as causal or even as necessarily stable [80, 109], it is a useful exploratory step to see the features that are effective for the detection task. The features are listed in table (5), with the pairwise correlations given in figure (6).

Fig. 6. Correlations between the features selected via CFS on the training set of a temporal block cross-validation scheme. The ordering is from the angular order of eigenvectors.
Feature    Distribution                Summary statistic     Timeframe
1.         Distance                    Mean                  Evening
2.         Distance                    Mean                  Night
3.         Distance                    Median                Weekend
4.         Within city                 Minimum span          Night
5.         Within threshold 3          Log gap               All
6.         Within threshold 2          Median gap            Night
7.         Within threshold 2          Median log gap        Night
8.         Inverse squared distance    S.D.                  Morning
9.         Inverse squared distance    S.D.                  All
10.        Inverse squared distance    S.D.                  Afternoon
11.        Within city                 S.D. log span         Night
12.        Inverse squared distance    Standard deviation    Night
13.        Inverse squared distance    Standard deviation    Evening
14.        Within threshold 2          S.D. log span         Night
15.        Within threshold 2          Max span              Night
16.        Within threshold 2          Count                 Night
17.        Within threshold 2          Max span              Weekend
18.        Within threshold 2          Count                 Morning
19.        Within threshold 2          S.D. span             Weekday

Table 5. The 19 features selected via CFS on the training set from temporal block assignment: what they measure, how they summarize it, and the timeframe in which they summarize it. Ordering is from angular order of eigenvectors on the correlation matrix (fig. 6).

While there are groups of highly linearly correlated features, many of the features are not correlated, giving an independent signal.

There are some patterns that emerge in this well-performing subset of features. Threshold 2 (422m) shows up frequently, as do measures related to variance (standard deviation measures), nighttime, and the distribution of inverse squared distances.
CV assignment method                        Unrestricted        Dyadic              Temporal block
Accuracy                                    0.7975              0.7930              0.7923
Accuracy, 95% CI                            (0.7850, 0.8095)    (0.7804, 0.8051)    (0.7736, 0.8101)
(No Information Rate / Majority class)      (0.7740)            (0.7740)            (0.7785)
Binomial test, Accuracy vs. NIR, p-value    p = 0.0001          p = 0.0016          p = 0.0734
Precision (Positive predictive value)       0.6602              0.6370              0.5799
Recall/Sensitivity (True positive rate)     0.2143              0.1954              0.2269
Specificity (True negative rate)            0.9678              0.9675              0.9532
F1 score                                    0.3236              0.2990              0.3261
AUC                                         0.6837              0.6804              0.6767
Matthews correlation coefficient            0.2921              0.2682              0.2658

Table 6. Friendship detection with CFS feature selection on the temporal block assignment training data.

This generates several hypotheses: first, that Latané et al.’s [61] finding that inverse-squared distance fits well to reports of memorable social interactions may be effective for friendship detection as well. Second, the threshold at 422m seems particularly relevant versus the others: this specific value might not be what is important, but perhaps it captures some relevant radius around the frat house. Otherwise, features associated with where people are co-located at night appear most frequently, which is in contrast to the finding by Eagle et al. [39] that the probability of daytime proximity was what was discriminative for friendships.
In this paper, we have described the collection of subjective, self-reported friendship data alongside objective sensor data within a given boundary specification. We modeled friendship, close friendship, and change in friendship with machine learning, and evaluated the models using three cross-validation schema that account for different real-world use case scenarios, to show the generalizability of our approach. We could detect friendship and close friendship with significantly better performance than baseline in both cases. Our change detection, however, performed poorly with the current aggregated features, suggesting that a different set of features is needed to carry out this task.

We also obtained a set of features through a CFS method on the most conservative training set (one constructed through temporal block assignment). Our test using the selected features showed similar results to the full feature set, suggesting them as potential alternatives that can help build lightweight models, and suggesting that certain measures and timeframes, such as inverse squared distance, standard deviations, and nighttime patterns, are most helpful for detection. In our future work, we will further explore feature selection for a parsimonious set of features applicable to different detection tasks.

Our findings demonstrate the feasibility of detecting friendships from location data, as well as establish the challenge of detecting changes in friendship. This opens possibilities for further investigating the relationship between friendship and co-location, as well as for designing mobile applications that build recommendation systems or interventions based on detected friendships.
REFERENCES
[1] jimi adams. 2010. Distant friends, close strangers? Inferring friendships from behavior. Proceedings of the National Academy of Sciences.
[2] Nadav Aharony, Wei Pan, Cory Ip, Inas Khayal, and Alex Pentland. 2011. Social fMRI: Investigating and shaping social mechanisms in the real world. Pervasive and Mobile Computing 7, 6 (2011), 643–659. https://doi.org/10.1016/j.pmcj.2011.09.004
[3] Constantinos Marios Angelopoulos, Christofoulos Mouskos, and Sotiris Nikoletseas. 2011. Social signal processing: Detecting human interactions using wireless sensor networks. In Proceedings of the 9th ACM International Symposium on Mobility Management and Wireless Access (MobiWac ’11). 171–174. https://doi.org/10.1145/2069131.2069163
[4] Lars Backstrom, Eric Sun, and Cameron Marlow. 2010. Find me if you can: Improving geographical prediction with social and spatial proximity. In Proceedings of the 19th International Conference on World Wide Web (WWW ’10). 61–70. https://doi.org/10.1145/1772690.1772698
[5] Alain Barrat, Ciro Cattuto, Vittoria Colizza, Francesco Gesualdo, Lorenzo Isella, Elisabetta Pandolfi, Jean-François Pinton, Lucilla Ravà, Caterina Rizzo, Mariateresa Romano, Juliette Stehlé, Alberto Eugenio Tozzi, and Wouter van den Broeck. 2013. Empirical temporal networks of face-to-face human interactions. The European Physical Journal Special Topics.
[6] In Revised Selected Papers from the Third International Conference on Electronic Healthcare (eHealth 2010, Vol. 69). 192–195. https://doi.org/10.1007/978-3-642-23635-8_24
[7] Alain Barrat, Ciro Cattuto, Vittoria Colizza, Jean-François Pinton, Wouter Van den Broeck, and Alessandro Vespignani. 2008. High resolution dynamical mapping of social interactions with active RFID. arXiv:0811.4170. https://arxiv.org/abs/0811.4170
[8] Christoph Bergmeir and José M. Benítez. 2012. On the use of cross-validation for time series predictor evaluation. Information Sciences 191 (2012), 192–213. https://doi.org/10.1016/j.ins.2011.12.028
[9] H. Russell Bernard and Peter D. Killworth. 1977. Informant accuracy in social network data II. Human Communication Research 4, 1 (1977), 3–18. https://doi.org/10.1111/j.1468-2958.1977.tb00591.x
[10] H. Russell Bernard, Peter D. Killworth, and Lee Sailer. 1979. Informant accuracy in social network data IV: A comparison of clique-level structure in behavioral and cognitive network data. Social Networks 2, 3 (1979), 191–218. https://doi.org/10.1016/0378-8733(79)90014-5
[11] H. Russell Bernard, Peter D. Killworth, and Lee Sailer. 1982. Informant accuracy in social-network data V: An experimental attempt to predict actual communication from recall data. Social Science Research 11, 1 (1982), 30–66. https://doi.org/10.1016/0049-089X(82)90006-0
[12] Stephen P. Borgatti, Ajay Mehra, Daniel J. Brass, and Giuseppe Labianca. 2009. Network analysis in the social sciences. Science.
[13] Sabri Boughorbel, Fethi Jarray, and Mohammed El-Anbari. 2017. Optimal classifier for imbalanced data using Matthews Correlation Coefficient metric. PLOS ONE 12, 6 (2017), 1–17. https://doi.org/10.1371/journal.pone.0177678
[14] Leo Breiman. 2001. Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science 16, 3 (2001), 199–231. https://doi.org/10.1214/ss/1009213726
[15] Ciro Cattuto, Wouter van den Broeck, Alain Barrat, Vittoria Colizza, Jean-François Pinton, and Alessandro Vespignani. 2010. Dynamics of person-to-person interactions from distributed RFID sensor networks. PLOS ONE 5, 7 (2010), e11596. https://doi.org/10.1371/journal.pone.0011596
[16] Datong Chen, Jie Yang, Robert Malkin, and Howard D. Wactlar. 2007. Detecting social interactions of the elderly in a nursing home environment. ACM Transactions on Multimedia Computing, Communications, and Applications 3, 1 (2007). https://doi.org/10.1145/1198302.1198308
[17] Kehui Chen and Jing Lei. 2017. Network cross-validation for determining the number of communities in network data. J. Amer. Statist. Assoc. (2017), 1–11. https://doi.org/10.1080/01621459.2016.1246365
[18] Frances Cherry. 1995. One man’s social psychology is another woman’s social history. In
The stubborn particulars ofsocial psychology: Essays on the research process . Routledge, London, 68–83.[19] Alvin Chin, Bin Xu, Hao Wang, Lele Chang, Hao Wang, and Lijun Zhu. 2013. Connecting people through physicalproximity and physical resources at a conference.
ACM Transactions on Intelligent System Technologies
4, 3 (2013),50:1–50:21. https://doi.org/10.1145/2483669.2483683[20] Alvin Chin, Bin Xu, Hao Wang, and Xia Wang. 2012. Linking people through physical proximity in a conference. In
Proceedings of the 3rd International Workshop on Modeling Social Media (MSM ’12) . 13–20. https://doi.org/10.1145/2310057.2310061[21] Eunjoon Cho, Seth A. Myers, and Jure Leskovec. 2011. Friendship and mobility: User movement in location-basedsocial networks. In
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and datamining (KDD ’11) . 1082–1090. https://doi.org/10.1145/2020408.2020579[22] Tanzeem Choudhury and Alex Pentland. 2002. The sociometer: A wearable device for understanding human networks.In
Proceedings of the Workshop on Ad hoc Communications and Collaboration in Ubiquitous Computing Environments,Computer Supported Cooperative Work .[23] Tanzeem Choudhury and Alex Pentland. 2003. Modeling face-to-face communication using the sociometer. In
Workshop Proceedings of Ubicomp (Workshop: Supporting Social Interaction and Face-to-face Communication in PublicSpaces) .[24] Tanzeem Choudhury and Alex Pentland. 2003. Sensing and modeling human networks using the sociometer. In
Proceedings of the 7th IEEE International Symposium on Wearable Computers (ISWC ’03) . 216–222. https://doi.org/10., Vol. 1, No. 1, Article . Publication date: September 2020. an Smartphone Co-locations Detect Friendship? It Depends How You Model It. 21
Proceedings of the ICMI-MLMI ’09 Workshop on Multimodal Sensor-Based Systems and MobilePhones for Social Computing (ICMI-MLMI ’09) . Article 1, 1:1–1:4 pages. https://doi.org/10.1145/1641389.1641390[26] Ethan Cohen-Cole and Jason M. Fletcher. 2008. Is obesity contagious? Social networks vs. environmental factors in theobesity epidemic.
Journal of Health Economics
27, 5 (2008), 1382–1387. https://doi.org/10.1016/j.jhealeco.2008.04.005[27] Rense Corten. 2012. Composition and structure of a large online social network in the Netherlands.
PLOS ONE
Proceedings of the 12th ACM International Conference on Ubiquitous Computing(Ubicomp ’10) . 119–128. https://doi.org/10.1145/1864349.1864380[29] Beau Dabbs and Brian Junker. 2016. Comparison of cross-validation methods for stochastic block models.arXiv:1605.03000. arXiv:https://arxiv.org/abs/1612.04717[30] Trinh Minh Tri Do and D. Gatica-Perez. 2011. GroupUs: Smartphone proximity data and human interaction typemining. In
Proceedings of the 15th Annual International Symposium on Wearable Computers (ISWC 2011) . 21–28.https://doi.org/10.1109/ISWC.2011.28[31] Trinh Minh Tri Do and Daniel Gatica-Perez. 2013. Human interaction discovery in smartphone proximity networks.
Personal and Ubiquitous Computing
17, 3 (2013), 413–431. https://doi.org/10.1007/s00779-011-0489-7[32] Wen Dong, Bruno Lepri, and Alex Pentland. 2011. Modeling the co-evolution of behaviors and social relationshipsusing mobile phone data. In
Proceedings of the 10th International Conference on Mobile and Ubiquitous Multimedia(MUM ’11) . 134–143. https://doi.org/10.1145/2107596.2107613[33] Malcolm M. Dow. 2007. Galton’s Problem as multiple network autocorrelation effects: Cultural trait transmission andecological constraint.
Cross-Cultural Research
41, 4 (2007), 336–363. https://doi.org/10.1177/1069397107305452[34] Malcolm M. Dow, Michael L. Burton, and Douglas R. White. 1982. Network autocorrelation: A simulation study of afoundational problem in regression and survey research.
Social Networks
4, 2 (1982), 169–200. https://doi.org/10.1016/0378-8733(82)90031-4[35] Malcolm M. Dow, Michael L. Burton, Douglas R. White, and Karl P. Reitz. 1984. Galton’s Problem as networkautocorrelation.
American Ethnologist
11, 4 (1984), 754–770. https://doi.org/10.1525/ae.1984.11.4.02a00080[36] Nathan Eagle, Aaron Clauset, Alex Pentland, and David Lazer. 2010. Reply to Adams: Multi-dimensional edgeinference.
Proceedings of the National Academy of Sciences
Personal Ubiquitous Computing
10, 4 (03 2006), 255–268. https://doi.org/10.1007/s00779-005-0046-3[38] Nathan Eagle and Alex Pentland. 2009. Eigenbehaviors: identifying structure in routine.
Behavioral Ecology andSociobiology
63, 7 (2009), 1057–1066. https://doi.org/10.1007/s00265-009-0739-0[39] Nathan Eagle, Alex Pentland, and David Lazer. 2009. Inferring friendship network structure by using mobile phone data.
Proceedings of the National Academy of Sciences
Friendship processes . Sage Publications, Inc., Thousand Oaks, CA.[41] Denzil Ferreira, Vassilis Kostakos, and Anind K. Dey. 2015. AWARE: Mobile context instrumentation framework.
Frontiers in ICT
2, 6 (2015), 1–9. https://doi.org/10.3389/fict.2015.00006[42] Leon Festinger, Kurt W. Back, and Stanley Schachter. 1950.
Social pressure in informal groups: A study of human factorsin housing . Stanford University Press, Stanford, CA.[43] Anna Förster, Kamini Garg, Hoang Anh Nguyen, and Silvia Giordano. 2012. On context awareness and social distancein human mobility traces. In
Proceedings of the Third ACM International Workshop on Mobile Opportunistic Networks(MobiOpp ’12) . 5–12. https://doi.org/10.1145/2159576.2159581[44] Linton C. Freeman, A. Kimball Romney, and Sue C. Freeman. 1987. Cognitive structure and informant accuracy.
American Anthropologist
89, 2 (1987), 310–325. https://doi.org/10.1525/aa.1987.89.2.02a00020[45] Adrien Friggeri, Guillaume Chelius, Eric Fleury, Antoine Fraboulet, France Mentré, and Jean-Christophe Lucet. 2011.Reconstructing social interactions using an unreliable wireless sensor network.
Computer Communications
34, 5(2011), 609–618. https://doi.org/10.1016/j.comcom.2010.06.005[46] Raghu K. Ganti, Yu-En Tsai, and Tarek F. Abdelzaher. 2008. SenseWorld: Towards cyber-physical social networks.In
Proceedings of the 2008 International Conference on Information Processing in Sensor Networks (IPSN ’08) . 563–564.https://doi.org/10.1109/IPSN.2008.48[47] Mark A. Hall. 1999.
Correlation-based feature selection for machine learning . Ph.D. Dissertation. Department ofComputer Science, The University of Waikato.[48] Nils Y. Hammerla and Thomas Plötz. 2015. Let’s (not) stick together: Pairwise similarity biases cross-validationin activity recognition. In
Proceedings of the 2015 ACM International Joint Conference on Pervasive and UbiquitousComputing (UbiComp ’15) . 1041–1051. https://doi.org/10.1145/2750858.2807551, Vol. 1, No. 1, Article . Publication date: September 2020. [49] Jeng-Cheng Hsieh, Chih-Ming Chen, and Hsiao-Fang Lin. 2010. Social interaction mining based on wireless sensornetworks for promoting cooperative learning performance in classroom learning environment. In
Proceedings ofthe 6th IEEE International Conference on Wireless, Mobile and Ubiquitous Technologies in Education (WMUTE 2010) .219–221. https://doi.org/10.1109/WMUTE.2010.22[50] Lorenzo Isella, Juliette Stehlé, Alain Barrat, Ciro Cattuto, Jean-François Pinton, and Wouter van den Broeck. 2011.What’s in a crowd? Analysis of face-to-face behavioral networks.
Journal of Theoretical Biology
Computers
2, 2 (2013), 88–131. https://doi.org/10.3390/computers2020088[52] Peter D. Killworth and H. Russell Bernard. 1976. Informant accuracy in social network data.
Human Organization
Social Networks
2, 1 (1979), 19–46. https://doi.org/10.1016/0378-8733(79)90009-1[54] Niko Kiukkonen, Jan Blom, Olivier Dousse, Daniel Gatica-Perez, and Juha Laurila. 2010. Towards rich mobile phonedatasets: Lausanne data collection campaign. In
Proceedings of the 7th ACM International Conference on PervasiveServices (ICPS ’10) .[55] Mikkel Baun Kjærgaard and Petteri Nurmi. 2012. Challenges for social sensing using WiFi signals. In
Proceedings ofthe 1st ACM Workshop on Mobile Systems for Computational Social Science (MCSS ’12) . 17–21. https://doi.org/10.1145/2307863.2307869[56] Andrea Knecht, Tom A. B. Snijders, Chris Baerveldt, Christian E. G. Steglich, and Werner Raub. 2010. Friendshipand delinquency: Selection and influence processes in early adolescence.
Social Development
19, 3 (2010), 494–514.https://doi.org/10.1111/j.1467-9507.2009.00564.x[57] Yu Kong and Yun Fu. 2016. Close human interaction recognition using patch-aware models.
IEEE Transactions onImage Processing
25, 1 (2016), 167–178. https://doi.org/10.1109/TIP.2015.2498410[58] David Krackhardt. 1987. Cognitive social structures.
Social Networks
9, 2 (1987), 109–134. https://doi.org/10.1016/0378-8733(87)90009-8[59] David Krackhardt. 1996. Social networks and the liability of newness for managers. In
Trends in OrganizationalBehavior , Cary L. Cooper and Denise M. Rousseau (Eds.). Vol. 3. John Wiley & Sons, Inc., Chichester, NY, 159–173.[60] Mathew Laibowitz, Jonathan Gips, Ryan Aylward, Alex Pentland, and Joseph A. Paradiso. 2006. A sensor networkfor social dynamics. In
Proceedings of the Fifth International Conference on Information Processing in Sensor Networks(IPSN 2006) . 483–491. https://doi.org/10.1109/IPSN.2006.243937[61] Bibb Latané, James H. Liu, Andrzej Nowak, Michael Bonevento, and Long Zheng. 1995. Distance matters: Physicalspace and social impact.
Personality and Social Psychology Bulletin
21, 8 (1995), 795–805. https://doi.org/10.1177/0146167295218002[62] Edward O. Laumann, Peter V. Marsden, and David Prensky. 1983. The boundary specification problem in networkanalysis. In
Applied network analysis: A methodological introduction , Ron S. Burt and Michael J. Minor (Eds.). Vol. 61.Sage Publications, Beverly Hills, CA, 18–34.[63] Jamie Lawrence, Terry R. Payne, and David De Roure. 2006. Co-presence communities: Using pervasive computingto support weak social networks. In
Proceedings of the 15th IEEE International Workshops on Enabling Technologies:Infrastructure for Collaborative Enterprises (WETICE ’06) . 149–156. https://doi.org/10.1109/WETICE.2006.24[64] Bruno Lepri, Jacopo Staiano, Giulio Rigato, Kyriaki Kalimeri, Ailbhe Finnerty, Fabio Pianesi, Nicu Sebe, and AlexPentland. 2012. The SocioMetric badges corpus: A multilevel behavioral dataset for social behavior in complexorganizations. In
Proceedings of the 2012 ASE/IEEE International Conference on Social Computing and 2012 ASE/IEEEInternational Conference on Privacy, Security, Risk and Trust (SOCIALCOM-PASSAT ’12) . 623–628. https://doi.org/10.1109/SocialCom-PASSAT.2012.71[65] Jure Leskovec and Eric Horvitz. 2014. Geospatial structure of a planetary-scale social network.
IEEE Transactions onComputational Social Systems
1, 3 (2014), 156–163. https://doi.org/10.1109/TCSS.2014.2377789[66] Minshu Li, Haipeng Wang, Bin Guo, and Zhiwen Yu. 2012. Extraction of human social behavior from mobile phonesensing. In
Proceedings of the 8th International Conference on Active Media Technology (AMT 2012, 7669) . 63–72.https://doi.org/10.1007/978-3-642-35236-2_7[67] David Liben-Nowell and Jon Kleinberg. 2007. The link-prediction problem for social networks.
Journal of the AmericanSociety for Information Science and Technology
58, 7 (2007), 1019–1031. https://doi.org/10.1002/asi.v58:7[68] David Liben-Nowell, Jasmine Novak, Ravi Kumar, Prabhakar Raghavan, and Andrew Tomkins. 2005. Geographicrouting in social networks.
Proceedings of the National Academy of Sciences an Smartphone Co-locations Detect Friendship? It Depends How You Model It. 23 [69] Anmol Madan, Manuel Cebrian, David Lazer, and Alex Pentland. 2010. Social sensing for epidemiological behaviorchange. In
Proceedings of the 12th ACM International Conference on Ubiquitous Computing (UbiComp ’10) . 291–300.https://doi.org/10.1145/1864349.1864394[70] Anmol Madan, Katayoun Farrahi, Daniel Gatica-Perez, and Alex Pentland. 2011. Pervasive sensing to model politicalopinions in face-to-face networks. In
Proceedings of the 9th International Conference on Pervasive Computing (Pervasive2011) . 214–231. https://doi.org/10.1007/978-3-642-21726-5_14[71] Anmol Madan, Sai T. Moturu, David Lazer, and Alex Pentland. 2010. Social sensing: Obesity, unhealthy eating andexercise in face-to-face networks. In
Proceedings of Wireless Health 2010 (WH ’10) . 104–110. https://doi.org/10.1145/1921081.1921094[72] Anmol Madan and Alex Pentland. 2006. VibeFones: Socially aware mobile phones. In
Proceedings of the Tenth IEEEInternational Symposium on Wearable Computers (ISWC 2006) . 109–112. https://doi.org/10.1109/ISWC.2006.286352[73] Momin M. Malik, Hemank Lamba, Constantine Nakos, and Jürgen Pfeffer. 2015. Population bias in geotagged tweets.In
Papers from the 2015 ICWSM Workshop on Standards and Practices in Large-Scale Social Media Research (ICWSM-15SPSM)
Biochimica et Biophysica Acta (BBA) - Protein Structure
Journal of the Royal Statistical Society: Series B(Statistical Methodology)
72, 4 (2010), 417–473. https://doi.org/10.1111/j.1467-9868.2010.00740.x[76] Andrew G. Miklas, Kiran K. Gollu, Kelvin K. W. Chan, Stefan Saroiu, Krishna P. Gummadi, and Eyal Lara. 2007.Exploiting social interactions in mobile systems. In
Proceedings of the 9th International Conference on UbiquitousComputing (Ubicomp 2007) . 409–428. https://doi.org/10.1007/978-3-540-74853-3_24[77] Alan Mislove, Massimiliano Marcon, Krishna P. Gummadi, Peter Druschel, and Bobby Bhattacharjee. 2007. Mea-surement and analysis of online social networks. In
Proceedings of the 7th ACM SIGCOMM Conference on InternetMeasurement (IMC ’07) . 29–42. https://doi.org/10.1145/1298306.1298311[78] Jacob L. Moreno. 1934.
Who shall survive? A new approach to the problem of human interrelations . Nervous and MentalDisease Publishing Co., Washington, D.C.[79] Sai T. Moturu, Inas Khayal, Nadav Aharony, Wei Pan, and Alex Pentland. 2011. Using social sensing to understandthe links between sleep, mood, and sociability. In
Proceedings of the 2011 IEEE International Conference on Privacy,Security, Risk and Trust and IEEE International Conference on Social Computing (PASSAT/SocialCom 2011) . 208–214.https://doi.org/10.1109/PASSAT/SocialCom.2011.200[80] Sendhil Mullainathan and Jann Spiess. 2017. Machine Learning: An Applied Econometric Approach.
Journal ofEconomic Perspectives
31, 2 (2017), 87–106. https://doi.org/10.1257/jep.31.2.87[81] Theodore Mead Newcomb. 1961.
The acquaintance process . Holt, Reinhard & Winston, New York, NY.[82] Peter G. Nordlie. 1958.
A longitudinal study of interpersonal attraction in a natural group setting, . Ph.D. Dissertation.University of Michigan.[83] Jukka-Pekka Onnela, Samuel Arbesman, Marta C. González, Albert-László Barabási, and Nicholas A. Christakis. 2011.Geographic constraints on social network groups.
PLOS ONE
6, 4 (2011), 1–7. https://doi.org/10.1371/journal.pone.0016939[84] Wei Pan, Nadav Aharony, and Alex Pentland. 2011. Fortune monitor or fortune teller: Understanding the connectionbetween interaction patterns and financial status. In
Proceedings of the 2011 IEEE International Conference on Privacy,Security, Risk and Trust and IEEE International Conference on Social Computing (PASSAT/SocialCom 2011) . 200–207.https://doi.org/10.1109/PASSAT/SocialCom.2011.163[85] Joseph A. Paradiso, Jonathan Gips, Mathew Laibowitz, Sajid Sadi, David Merrill, Ryan Aylward, Pattie Maes, and AlexPentland. 2010. Identifying and facilitating social interaction with a wearable wireless sensor network.
Personal andUbiquitous Computing
14, 2 (2010), 137–152. https://doi.org/10.1007/s00779-009-0239-2[86] Matjaž Perc. 2014. The Matthew effect in empirical data.
Journal of The Royal Society Interface
11, 98 (2014).https://doi.org/10.1098/rsif.2014.0378[87] Dinh Phung, Brett Adams, and Svetha Venkatesh. 2008. Computable social patterns from sparse sensor data. In
Proceedings of the First International Workshop on Location and the Web (LOCWEB ’08) . 69–72. https://doi.org/10.1145/1367798.1367810[88] Daniele Quercia, Licia Capra, and Jon Crowcroft. 2012. The social world of Twitter: Topics, geography, and emotions.In
Proceedings of the Sixth International AAAI Conference on Weblogs and Social Media (ICWSM-12)
Proceedings of the 17th Annual InternationalConference on Mobile Computing and Networking (MobiCom ’11) . 73–84. https://doi.org/10.1145/2030613.2030623, Vol. 1, No. 1, Article . Publication date: September 2020. [90] Jeff Racine. 2000. Consistent cross-validatory model-selection for dependent data: hv -block cross-validation. Journalof Econometrics
99, 1 (2000), 39–61. https://doi.org/10.1016/S0304-4076(00)00030-0[91] Mika Raento, Antti Oulasvirta, and Nathan Eagle. 2009. Smartphones: An emerging tool for social scientists.
Sociological Methods & Research
37, 3 (2009), 426–454. https://doi.org/10.1177/0049124108330005[92] Samuel Franklin Sampson. 1968.
A novitiate in a period of change: An experimental and case study of social relationships .Ph.D. Dissertation. Cornell University.[93] Vedran Sekara and Sune Lehmann. 2014. The strength of friendship ties in proximity sensor data.
PLOS ONE
9, 7(2014), 1–8. https://doi.org/10.1371/journal.pone.0100915[94] Galit Shmueli. 2010. To explain or to predict?
Statistical Science
25, 3 (2010), 289–310. https://doi.org/10.1214/10-STS330[95] Tom A. B. Snijders. 2011. Statistical models for social networks.
Annual Review of Sociology
37, 1 (2011), 131–153.https://doi.org/10.1146/annurev.soc.012809.102709[96] Tom A. B. Snijders, Gerhard G. van de Bunt, and Christian E. G. Steglich. 2010. Introduction to stochastic actor-basedmodels for network dynamics.
Social Networks
32, 1 (2010), 44–60. https://doi.org/10.1016/j.socnet.2009.02.004[97] Jacopo Staiano, Bruno Lepri, Nadav Aharony, Fabio Pianesi, Nicu Sebe, and Alex Pentland. 2012. Friends don’t lie:Inferring personality traits from social network structure. In
Proceedings of the 2012 ACM Conference on UbiquitousComputing (UbiComp ’12) . 321–330. https://doi.org/10.1145/2370216.2370266[98] Juliette Stehlé, François Charbonnier, Tristan Picard, Ciro Cattuto, and Alain Barrat. 2013. Gender homophilyfrom spatial behavior in a primary school: A sociometric study.
Social Networks
35, 4 (2013), 604–613. https://doi.org/10.1016/j.socnet.2013.08.003[99] Arkadiusz Stopczynski, Piotr Sapiezynski, Alex Pentland, and Sune Lehmann. 2015. Temporal fidelity in dynamicsocial networks.
The European Physical Journal B
88, 10 (2015), 249. https://doi.org/10.1140/epjb/e2015-60549-7[100] Arkadiusz Stopczynski, Vedran Sekara, Piotr Sapiezynski, Andrea Cuttone, Mette My Madsen, Jakob Eg Larsen, andSune Lehmann. 2014. Measuring large-scale social networks with high resolution.
PLOS ONE
9, 4 (04 2014), 1–24.https://doi.org/10.1371/journal.pone.0095978[101] Terry Therneau and Beth Atkinson. 2018. rpart : Recursive partitioning and regression trees . https://CRAN.R-project.org/package=rpart R package version 4.1-13.[102] Gerhard G. van de Bunt, Marijtje A. J. van Duijn, and Tom A. B. Snijders. 1999. Friendship networks through time:An actor-oriented dynamic statistical network model. Computational & Mathematical Organization Theory
5, 2 (1999),167–192. https://doi.org/10.1023/A:1009683123448[103] Marijtje A. J. van Duijn, Evelien P. H. Zeggelink, Mark Huisman, Frans N. Stokman, and Frans W. Wasseur. 2003.Evolution of sociology freshmen into a friendship network.
The Journal of Mathematical Sociology
27, 2-3 (2003),153–191. https://doi.org/10.1080/00222500305889[104] Coert van Gemeren, Ronald Poppe, and Remco C. Veltkamp. 2016. Spatio-temporal detection of fine-grained dyadichuman interactions. In
Proceedings of the 7th International Workshop on Human Behavior Understanding (HBU 2016) .116–133. https://doi.org/10.1007/978-3-319-46843-3_8[105] Dashun Wang, Dino Pedreschi, Chaoming Song, Fosca Giannotti, and Albert-Laszlo Barabasi. 2011. Human mobility,social ties, and link prediction. In
Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discoveryand Data Mining (KDD ’11) . 1100–1108. https://doi.org/10.1145/2020408.2020581[106] Jason Wiese, Jun-Ki Min, Jason I. Hong, and John Zimmerman. 2014.
Assessing call and SMS logs as an indication of tiestrength . Technical Report CMU-HCII-14-101. Human-Computer Interaction Institute, School of Computer Science,Carnegie Mellon University.[107] Jason Wiese, Jun-Ki Min, Jason I. Hong, and John Zimmerman. 2015. “You never call, you never write”: Call and SMSlogs do not always indicate tie strength. In
Proceedings of the 18th ACM Conference on Computer Supported CooperativeWork & Social Computing (CSCW ’15) . 765–774. https://doi.org/10.1145/2675133.2675143[108] Christo Wilson, Alessandra Sala, Krishna P. N. Puttaswamy, and Ben Y. Zhao. 2012. Beyond social graphs: Userinteractions in online social networks and their implications.
ACM Transactions on the Web
Biometrics
73, 1 (2016), 20–30. https://doi.org/10.1111/biom.12554[110] Wayne W. Zachary. 1977. An information flow model for conflict and fission in small groups.