Analyzing course programmes using complex networks
Suzane F. Pinto
Programa de graduação em Engenharia de Computação, Universidade Federal de Ouro Preto, 35931-008, Brasil
Ronan S. Ferreira
Departamento de Ciências Exatas e Aplicadas, Universidade Federal de Ouro Preto, 35931-008, Brasil
Abstract : We analyze the curriculum of the early common years of engineering in our institute using tools of statistical physics of complex networks. Naturally, a course programme is structured in a networked form (temporal dependency and prerequisites). In this approach, each topic within each programme is associated with a node, which in turn is joined by links representing the dependence of a topic for the understanding of another in a different discipline. As a course programme is a time-dependent structure, we propose a simple model to assign links between nodes, taking into account only two ingredients of the teaching-learning process: recursiveness and accumulation of knowledge. Since we already know the programmes, our objective is to verify if the proposed model is able to capture their particularities and to identify implications of different sequencing on student learning in the early years of engineering degrees. Our model can be used as a systematic tool assisting the construction of a more interdisciplinary curriculum, articulating between disciplines of the undergraduate early years in exact sciences.
Keywords : course programme, learning, complex systems.
The theory of complex networks has its inaugural landmark in the late 1990s [1, 2]. Since then, it has permeated several areas of knowledge [3–5], from investigations into social dynamics [6], through the spread and control of epidemics [7–9], to linguistics [10, 11]. In this way, the theory of complex networks has become a paradigm for interdisciplinary research that, in turn, is a challenge for teaching, production and technical-scientific dissemination. However, our curriculum is divided into specific contents, for didactic reasons that take the teaching and learning process into account. Another point is the prerequisites: requirements on acquired skills that ensure continuity and the accumulation of knowledge. Thus, the prerequisites are, at least in the context of interdisciplinarity, a connection mechanism between the various disciplines offered in a curriculum.

On the other hand, teachers' knowledge of a larger fraction of the contents of a curricular grid is also a fundamental mechanism for consolidating the connections between the disciplines. This is because students do not always find it easy to perceive the relevance of the topics of one discipline in another. Perhaps for this reason, questions such as “When will I use this?” or “Why am I learning this?” accompany the student's daily life.

Several studies aim to improve the integration of the contents of the basic cycle [12, 13]. Particularly in engineering undergraduate courses, the failure rates in the disciplines of this stage have turned into a kind of “culture of failure”. This way of thinking leads freshmen to consider repeating subjects such as Differential and Integral Calculus and Physics a natural fact: repetition becomes the rule, and the student who progresses regularly through the course becomes the exception [12–14]. Consequently, retention leads to a high number of dropouts in some courses, implying a large number of idle vacancies that are difficult to fill in the more advanced periods of the courses [15].

In our institution, the Federal University of Ouro Preto (UFOP), the Pro-Ativa program was created to improve the conditions under which undergraduate courses and disciplines are offered, as well as the learning process. It relies on the participation of students, who are selected every year by the program. Our project proposed exposing the selected students to the theory of complex networks. From there, the group was encouraged to map the disciplines of the basic cycle, building a network of themes connecting the various disciplines.

A network is a graph [4] in which we assign a physical meaning to a set of nodes (vertices) connected by edges, obeying some statistical distribution. In our approach, each topic within each syllabus of the disciplines of the basic cycle is associated with a node, and nodes are connected in pairs by edges (links). These links represent, therefore, the dependence of a certain theme in one discipline for the understanding of another in a different discipline. In this way, we can build a network of themes from the connections between the topics of the syllabi. We use the syllabi of the basic cycle curriculum offered at the Institute of Exact and Applied Sciences (ICEA) of UFOP.

Our objectives are then (i) to propose a model that points out the dependencies between the different themes of a curricular syllabus; (ii) to check whether the proposed model is capable of capturing the particularities of each studied grid, a verification that is possible because we know the grids beforehand; and (iii) to identify from the model the influence of the different sequences of physics themes both on the curriculum and on the teaching-learning process.
We can use adjacency matrices to represent graphs. For a graph $G(N, E)$, where $N$ is the number of nodes and $E$ the number of edges, we can write an $N \times N$ matrix $A$, labelling each node with an integer $i = 1, 2, 3, \ldots, N$. The adjacency matrix carries information about the existence of a connection between any two nodes in a graph. For a directed graph, the elements $\{a_{ij}\}$ of the matrix $A$ are defined as follows:

$$a_{ij} = \begin{cases} 1, & \text{if a connection from } i \text{ to } j \text{ exists;} \\ 0, & \text{if a connection from } i \text{ to } j \text{ does not exist.} \end{cases}$$

A directed graph can, for example, represent a social network such as Facebook or Instagram: the people we follow do not always follow us back, as is the case with celebrities and their followers. This can be understood as a connection that carries a direction. So we can represent that connection with an arrow indicating the direction of the link between the celebrity and the unknown fan. On the other hand, in the case of an undirected graph, we can write an $N \times N$ matrix $B$ from its entries $\{b_{ij}\}$, where

$$b_{ij} = \begin{cases} 1, & \text{if a connection between } i \text{ and } j \text{ exists;} \\ 0, & \text{if a connection between } i \text{ and } j \text{ does not exist.} \end{cases}$$

The diagonal elements $\{a_{ii}\}$ (or $\{b_{ii}\}$) indicate whether a node is connected to itself; if so, it has a self-connection. In general, $a_{ij} \neq a_{ji}$, but $b_{ij} = b_{ji}$ always, so $B$ is a symmetric matrix.

The most fundamental of the quantities studied in networks is the degree of a node, given by the number of connections it has. The degree distribution $p(k)$ of a graph is the fraction of nodes that have degree $k$:

$$p(k) = \frac{N_k}{N}, \qquad (1)$$

where $N_k$ is the number of nodes with degree $k$ and $\sum_k N_k = N$. The degree $k_i$ of a node $i$ can be obtained from the adjacency matrix of a graph $G(N, E)$. If $G$ is undirected, then

$$k_i = \sum_{j=1}^{N} b_{ij}. \qquad (2)$$

If it is directed, we have to count the number of edges that enter node $i$ (arrows pointing to $i$), $k_i^{in}$, and the number that leave it, $k_i^{out}$:

$$k_i^{in} = \sum_{j=1}^{N} a_{ji} \qquad (3)$$

and

$$k_i^{out} = \sum_{j=1}^{N} a_{ij}. \qquad (4)$$

The average degree $\langle k \rangle$ of a graph can be obtained from the degree distribution $p(k)$. For example, for an undirected graph, $\langle k \rangle$ can be written as:

$$\langle k \rangle = \frac{1}{N} \sum_i k_i = \sum_k k\, p(k). \qquad (5)$$

By mapping the course syllabi onto an adjacency matrix, we can associate a node with each topic and establish a link representing the dependence on a given topic in a certain discipline for understanding the content of another.
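As a minimal sketch of equations (1) to (5), assuming a small hypothetical four-topic network (the matrix `A` and the function names below are illustrative and are not taken from the original study), the in-degrees, out-degrees, degree distribution and average degree can be computed directly from a directed adjacency matrix:

```python
from collections import Counter


def degrees_directed(a):
    """Return (k_in, k_out) for a directed adjacency matrix `a`,
    where a[i][j] == 1 means there is a link from node i to node j."""
    n = len(a)
    k_in = [sum(a[j][i] for j in range(n)) for i in range(n)]   # eq. (3)
    k_out = [sum(a[i][j] for j in range(n)) for i in range(n)]  # eq. (4)
    return k_in, k_out


def degree_distribution(k):
    """p(k) = N_k / N, eq. (1): fraction of nodes with each degree."""
    n = len(k)
    return {deg: count / n for deg, count in sorted(Counter(k).items())}


def mean_degree(k):
    """Average degree computed as (1/N) * sum_i k_i, eq. (5)."""
    return sum(k) / len(k)


# Hypothetical toy network with links 0 -> 1, 0 -> 2, 1 -> 2 and 2 -> 3.
A = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

k_in, k_out = degrees_directed(A)
p_in = degree_distribution(k_in)
mean_in = mean_degree(k_in)
# Second form of eq. (5): sum over k of k * p(k).
mean_in_from_p = sum(deg * prob for deg, prob in p_in.items())
```

Both forms of equation (5) give the same average for this toy matrix, which is a quick consistency check when the matrix is built by hand from the syllabi.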
In this way, we obtain a directed network, in which the links are drawn as arrows. In addition, our network also has a temporal dependency because, of course, disciplines from later semesters may depend on content presented in earlier semesters. Therefore, our network of themes grows with each period and its links have a direction.

In practice, in order to deal with the time-dependent character of the nodes and build our network of themes, we propose the following model:

(i) Topics in the same period are allowed to be connected in a distributive manner;
(ii) Topics in a period $\wp$ can also connect with the period $\wp - 1$ (with $\wp > 1$), which provides the ingredient of recursion and portrays the idea of the accumulation of knowledge.

Figure 1 illustrates this mapping of syllabus topics in an undirected network. Each topic is named by a number (this correspondence is not shown for brevity).

Figure 1: Illustration of the temporal character: the density of correlations and the corresponding number of nodes increase with the courses offered in the 1st Period (top, left) and, clockwise, in the 2nd, 3rd and 4th Periods of the Electrical Engineering course / UFOP.

The distributions that we found show too many fluctuations, which makes statistical analysis difficult. These fluctuations occur mainly in the tail of the distributions, where $x \gg 1$ or, equivalently, at low frequencies of occurrence of $x$. In order to carry out a statistical treatment for values $p(k) \ll 1$, one strategy is to use complementary cumulative distribution functions (CCDF). In general, a CCDF is such that, in the continuous limit, $f(x) = 1 - \int_{-\infty}^{x} g(x')\,dx'$, where $x$ is a random variable corresponding to the variable, also random, $x'$, truncated in the interval $[-\infty, x]$. For our purposes the discrete case suffices, in which we denote a function of this type by $p_>(k)$:

$$p_>(k) = \sum_{q \geq k} p(q). \qquad (6)$$

At the top of figure 2 the input distributions (a) $p(k_{in})$ and (b) $p_>(k_{in})$ are shown, while at the bottom the output distributions (c) $p(k_{out})$ and (d) $p_>(k_{out})$ are shown for the Electrical Engineering (EE) themes network. In (a), the data refer to the distribution without the CCDF statistical treatment, which is shown in (b). The best fit found, following the proportionality $p(k) \propto \exp(-\alpha k)$, is shown in both (a) and (b) by the continuous curves. The same sequence of information is presented in (c) and (d) for the output distributions. It is interesting, at this point, to make an observation about the process of building the network of themes. Our first investigation simply connected related themes, without any concern for the temporal dependence or for the process of accumulation of knowledge, on which the learning process is essentially based. In this case, what we found was a purely random distribution of points scattered on the xy plane (data not shown).
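The two model rules and the discrete CCDF of equation (6) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, rule (ii) is read as reaching exactly one period back, and the exponent is estimated with a log-linear least-squares fit, since the text does not say which fitting procedure was used.

```python
import math


def link_allowed(period_a, period_b):
    """Model rules (i) and (ii): two topics may be linked when they are
    taught in the same period or in consecutive periods.
    Assumption: the reach of rule (ii) is exactly one period."""
    return abs(period_a - period_b) <= 1


def ccdf(p):
    """Discrete CCDF of eq. (6): p_>(k) = sum of p(q) over q >= k.
    `p` maps each degree k to its probability p(k)."""
    tail = 0.0
    result = {}
    for k in sorted(p, reverse=True):  # accumulate from the largest degree down
        tail += p[k]
        result[k] = tail
    return dict(sorted(result.items()))


def fit_alpha(dist):
    """Estimate alpha in dist(k) ~ exp(-alpha * k) by ordinary least
    squares on the points (k, log dist(k)).
    Assumption: a log-linear fit is only one possible choice."""
    points = [(k, math.log(y)) for k, y in dist.items() if y > 0]
    n = len(points)
    mean_k = sum(k for k, _ in points) / n
    mean_log = sum(v for _, v in points) / n
    sxx = sum((k - mean_k) ** 2 for k, _ in points)
    sxy = sum((k - mean_k) * (v - mean_log) for k, v in points)
    return -sxy / sxx


# Hypothetical degree distribution, used only to exercise the functions.
p_example = {0: 0.25, 1: 0.5, 2: 0.25}
p_tail = ccdf(p_example)

# Synthetic exponential data with a known exponent (alpha = 0.5).
synthetic = {k: math.exp(-0.5 * k) for k in range(10)}
alpha_hat = fit_alpha(synthetic)
```

For an exponential $p(k)$ on an unbounded range of $k$, the sum in equation (6) is a geometric series, so $p_>(k)$ decays with the same $\alpha$ as $p(k)$. This is consistent with fitting the smoother cumulative curve rather than the raw distribution.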
Only with the model proposed in section 3 did we obtain the distributions shown in figure 2.

Figure 2: From the inset graphs (a) and (c) to the respective main graphs (b) and (d), the data underwent a treatment due to fluctuations for $p(k) \ll 1$. Left: degree distributions for incoming links. Right: degree distributions for outgoing links.

Figure 3 shows the input and output distributions for the three UFOP engineering courses analyzed: Electrical Engineering (EE), Computer Engineering (EC) and Production Engineering (EP). The dashed curves are exponential fits and the values of the respective exponents are shown in table 1.

            EE     EP     EC
Nodes       96     87     96
Links      259    293    300
⟨k_in⟩
⟨k_out⟩
α_in
α_out

Table 1: Quantitative characterization of the engineering themes network. Correspondence: EE (Electrical Engineering), EP (Production Engineering) and EC (Computer Engineering).

For comparison, the continuous functions obtained from the exponential fits shown in figure 3 are placed on the same scale in figure 4. Note that, since these are data from the basic cycle of Engineering, we would expect the slopes of all the studied Engineering courses to overlap; however, this only happens for the data of Computer and Production Engineering, EC and EP respectively. The superposition is not perfect due to slight differences in the syllabi of the two courses. The most prominent of these differences lies in the fact that EC has in its PPC the discipline of Modern Physics (or Physics vol. IV), which is not the case for EP.

This suggests that the inclusion of Physics IV in the EP curriculum would be a possible proposal for the PPC of the course, in order to offer students a current view of the subjects covered, implying a better understanding of processes in modern engineering, with emphasis on the processes of physics and materials engineering.

Another interpretation of figure 3 is that the inclusion of Modern Physics in the EP curriculum would not impose an overload on students from the point of view of the accumulation of knowledge, that is, the number of prerequisites that would be required for such a discipline. Otherwise, the deviation between the slopes of the EC and EP curves would be evident, both for the distribution of input connections and for that of output connections, since the number of incoming connections of a node is related to the number of prerequisites for the theme that this node represents, while its number of outgoing connections is associated with the relevance of this theme for the progress of the course.

The most substantial deviation occurs between Electrical Engineering, EE, and the others. In this case, the superposition does not happen because there are also particularities in its pedagogical project. Again, it is the physics sector that presents the main differences. For example, in EE's PPC the disciplines of Electromagnetism (or Physics vol. III) and Thermodynamics (or Physics vol. II) are in the same semester, while they are separated by a period in EC and EP.

Figure 3: $p_>(k_{in})$ distributions (top) and $p_>(k_{out})$ distributions (bottom) for connections between the topics of the syllabi of the three engineering courses studied. From left to right: Electrical Engineering (EE), Computer Engineering (EC) and Production Engineering (EP). The values of the fit exponents for each of the curves are shown in table 1.

Figure 4: Comparison between the continuous functions obtained in figure 3. Since the network of themes covers the basic cycle of the three engineering courses, the curves were expected to overlap. However, this does not occur due to differences in the Pedagogical Curricular Program of the courses. Our model was able to capture these differences using a quantitative approach.

            EE                  EC         EP
℘ = 2       FIS I               FIS I      FIS I
℘ = 3       FIS II & FIS III    FIS II     FIS II
℘ = 4       FIS IV              FIS III    FIS III
℘ = 5       -                   FIS IV     -

Table 2: Correspondence: FIS I (Mechanics), FIS II (Electromagnetism), FIS III (Thermodynamics) and FIS IV (Modern Physics).

Another factor is that the discipline of Modern Physics at EE is in the semester immediately after ($\wp$ and $\wp + 1$) the period in which the discipline of Electromagnetism is scheduled. This is because in its Modern Physics syllabus the first chapters are dedicated to the study of alternating currents, which in turn has a direct relationship with the study of circuits in the discipline of Electromagnetism. In our approach, the nodes that represent the theme of alternating currents, in EE, have higher $k_{out}$ values than the same nodes in the EC network, because in that course Electromagnetism and Modern Physics are separated by a period.

The reverse reasoning applies to the $k_{in}$ values for the nodes that represent the study of circuits. Table 2 outlines the positioning of these disciplines in the curricular grids of the three engineering courses. These factors lead to a higher average connectivity for EE, showing a network richer in prerequisites (the curve with the greatest slope in figure 3 (a)), although with an unbalanced grid if we take into account the relationship between the prerequisites and the relevance of a theme along the course. This is related to the ratio between the coefficients of the input and output connection distributions, where this ratio $r$ tends to 1 if there is a linear correlation. Namely, r = 0. for EE, r = 0. for EP and r = 0. for EC, the latter being the course analyzed here with the most balanced grid from the point of view of the sequence of presentation of the themes of physics teaching.

Within the scope of the Pro-Ativa / UFOP project, which aims to think about learning and teaching in undergraduate courses, we proposed a simple model assuming recursion and accumulation of knowledge as the ingredients of the learning process. To test our model, we studied the curricula of the basic cycle of the three engineering courses offered at ICEA / UFOP. By mapping the correspondences between the topics of each curricular grid onto a network of themes, we were able to obtain mathematical distributions describing each of these grids. Although we only deal with the basic cycle of these courses, the PPC of one of these engineering courses shows differences compared to the others. Our model was able to capture these differences quantitatively by treating the connectivity distributions of each of the curricular grids. Our analysis can be extended to the professional cycle and can also assist in the study of prerequisites in a PPC.
The authors would like to thank the Pro-Ativa Program / UFOP. RSF would like to thank the Auxílio Pesquisador program / UFOP, professors from DECEA / UFOP, UESPI-Piripiri-PI / Teresina-PI, DFIS / UFPI and the National Council of Scientific and Technological Development (CNPq), in the scope of process 424950/2018-9.

References

[1] D.J. Watts and S.H. Strogatz, Nat. , 440 (1998).
[2] A.L. Barabási and R. Albert, Sci. , 509 (1999).
[3] A.L. Barabási, Nat. Phys. , 14 (2011).
[4] M. Newman, Networks: an introduction (Oxford University Press, 2010).
[5] A.L. Barabási, Sci. , 138 (2017).
[6] G. Szabó and G. Bunth, Phys. Rev. E , 012305 (2018).
[7] N. Masuda and P. Holme, Temporal Network Epidemiology (Springer, Singapore, 2017), p. 179.
[8] R. Pastor-Satorras and C. Castellano, J. Stat. Phys. , 1110 (2018).
[9] E. Valdano, M.R. Fiorentin, C. Poletto and V. Colizza, Phys. Rev. Lett. , 068302 (2018).
[10] S. Martincic-Ipsic, D. Margan and A. Mestrovic, Physica A , 117 (2016).
[11] I.G. Torre, B. Luque, L. Lacasa, J. Luque and A. Hernández-Fernández, Sci. Rep. , 43862 (2017).
[12] M.P. Belançon, Rev. Bras. Ens. Fis. , 4 (2017).
[13] C.M.B. Matta, S.M.G. Lebrão and M.G.V. Heleno, Psicol. Esc. Educ. , 583 (2017).
[14] M.F. Barroso and E.B.M. Falcão, in Anais do IX Encontro Nacional de Pesquisa em Ensino de Física, Jaboticatubas, 2004.
[15] A.V.C. Campello and L.N. Lins, in